Commit 14782c81, authored by Srinath Parvathaneni, committed by Kyrylo Tkachov

[ARM][GCC][4/x]: MVE ACLE vector interleaving store intrinsics.

This patch adds support for the MVE ACLE intrinsics vst4q_s8, vst4q_s16, vst4q_s32, vst4q_u8, vst4q_u16, vst4q_u32, vst4q_f16 and vst4q_f32.
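
For readers unfamiliar with interleaving stores, the following is a minimal scalar sketch of what a 4-way interleaving store does for 32-bit elements. It is a portable model only, not the code from arm_mve.h: the type model_int32x4x4_t and the function model_vst4q_s32 are hypothetical names introduced here, and the layout assumed is that lane i of vector j lands at dst[4*i + j].

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for int32x4x4_t: four vectors of four 32-bit lanes. */
typedef struct { int32_t val[4][4]; } model_int32x4x4_t;

/* Scalar model of a 4-way interleaving store (assumed layout):
   lane `lane` of vector `vec` is written to dst[4 * lane + vec],
   so memory holds v0[0], v1[0], v2[0], v3[0], v0[1], v1[1], ...  */
static void model_vst4q_s32(int32_t *dst, const model_int32x4x4_t *v)
{
    assert(dst != NULL && v != NULL);
    for (size_t lane = 0; lane < 4; lane++)
        for (size_t vec = 0; vec < 4; vec++)
            dst[4 * lane + vec] = v->val[vec][lane];
}
```

With this model, a destination buffer of 16 elements receives the first lane of all four vectors, then the second lane of all four, and so on.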

This patch also adds the file arm_mve_builtins.def, in which the builtins for the MVE ACLE intrinsics are defined using builtin qualifiers.
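
The .def file is consumed with the usual "X-macro" technique: the same list of entries is expanded once to produce enumerators and once to produce a data table, so the two stay in step. The sketch below illustrates the idea in a self-contained form. It uses an in-file macro list instead of a real #include of a .def file, and every name in it (DEMO_BUILTINS, demo_datum, demo_lookup and so on) is hypothetical rather than taken from GCC.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the contents of a .def file; entries are hypothetical. */
#define DEMO_BUILTINS(X) \
    X(vst4q, v16qi)      \
    X(vst4q, v4si)

/* First expansion: one enumerator per entry, placed after a BASE marker. */
enum demo_builtin {
    DEMO_BUILTIN_BASE,
#define X(N, M) DEMO_BUILTIN_##N##M,
    DEMO_BUILTINS(X)
#undef X
    DEMO_BUILTIN_MAX
};

#define DEMO_PATTERN_START (DEMO_BUILTIN_BASE + 1)

struct demo_datum { const char *name; const char *mode; };

/* Second expansion: one table row per entry, in the same order. */
static const struct demo_datum demo_data[] = {
#define X(N, M) { #N, #M },
    DEMO_BUILTINS(X)
#undef X
};

/* The table and the enum range must describe the same number of entries. */
static_assert(sizeof demo_data / sizeof demo_data[0]
              == DEMO_BUILTIN_MAX - DEMO_PATTERN_START,
              "table and enum out of step");

/* Map a function code back to its table row, or NULL if out of range. */
static const struct demo_datum *demo_lookup(int fcode)
{
    if (fcode < DEMO_PATTERN_START || fcode >= DEMO_BUILTIN_MAX)
        return NULL;
    return &demo_data[fcode - DEMO_PATTERN_START];
}
```

Subtracting the pattern-start value from the function code, as demo_lookup does, is the same indexing idea the patch uses when it looks up mve_builtin_data[fcode - ARM_BUILTIN_MVE_PATTERN_START].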

Please refer to the M-profile Vector Extension (MVE) intrinsics documentation [1] for more details.
[1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics

2020-03-16  Andre Vieira  <andre.simoesdiasvieira@arm.com>
	    Mihail Ionescu  <mihail.ionescu@arm.com>
	    Srinath Parvathaneni  <srinath.parvathaneni@arm.com>

	* config/arm/arm-builtins.c (CF): Define mve_builtin_data.
	(VAR1): Define.
	(ARM_BUILTIN_MVE_PATTERN_START): Define.
	(arm_init_mve_builtins): Define function.
	(arm_init_builtins): Add TARGET_HAVE_MVE check.
	(arm_expand_builtin_1): Check the range of fcode.
	(arm_expand_mve_builtin): Define function to expand MVE builtins.
	(arm_expand_builtin): Check the range of fcode.
	* config/arm/arm_mve.h (__ARM_FEATURE_MVE): Define MVE floating point
	types.
	(__ARM_MVE_PRESERVE_USER_NAMESPACE): Define to protect user namespace.
	(vst4q_s8): Define macro.
	(vst4q_s16): Likewise.
	(vst4q_s32): Likewise.
	(vst4q_u8): Likewise.
	(vst4q_u16): Likewise.
	(vst4q_u32): Likewise.
	(vst4q_f16): Likewise.
	(vst4q_f32): Likewise.
	(__arm_vst4q_s8): Define inline builtin.
	(__arm_vst4q_s16): Likewise.
	(__arm_vst4q_s32): Likewise.
	(__arm_vst4q_u8): Likewise.
	(__arm_vst4q_u16): Likewise.
	(__arm_vst4q_u32): Likewise.
	(__arm_vst4q_f16): Likewise.
	(__arm_vst4q_f32): Likewise.
	(__ARM_mve_typeid): Define macro with MVE types.
	(__ARM_mve_coerce): Define macro with _Generic feature.
	(vst4q): Define polymorphic variant for different vst4q builtins.
	* config/arm/arm_mve_builtins.def: New file.
	* config/arm/iterators.md (VSTRUCT): Modify to allow XI and OI
	modes in MVE.
	* config/arm/mve.md (MVE_VLD_ST): Define iterator.
	(unspec): Define unspec.
	(mve_vst4q<mode>): Define RTL pattern.
	* config/arm/neon.md (mov<mode>): Modify expand to allow XI and OI
	modes in MVE.
	(neon_mov<mode>): Modify RTL define_insn to allow XI and OI modes
	in MVE.
	(define_split): Allow OI mode split for MVE after reload.
	(define_split): Allow XI mode split for MVE after reload.
	* config/arm/t-arm (arm.o): Add entry for arm_mve_builtins.def.
	(arm-builtins.o): Likewise.
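
The ChangeLog above mentions a polymorphic vst4q variant built on the C11 _Generic feature. As a rough illustration of how such type-based dispatch works, here is a minimal sketch. It is deliberately simpler than the __ARM_mve_typeid and __ARM_mve_coerce machinery in arm_mve.h, and all names in it (demo_store, demo_store_s8 and so on) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-type store routines. Each returns the element width in
   bits so that the routine chosen by the dispatch is observable. */
static int demo_store_s8(int8_t *p)   { assert(p != NULL); *p = 0;    return 8; }
static int demo_store_s16(int16_t *p) { assert(p != NULL); *p = 0;    return 16; }
static int demo_store_f32(float *p)   { assert(p != NULL); *p = 0.0f; return 32; }

/* Polymorphic front end: _Generic selects a routine from the static type
   of the pointer argument at compile time, then the call is made. */
#define demo_store(p)                    \
    _Generic((p),                        \
             int8_t *:  demo_store_s8,   \
             int16_t *: demo_store_s16,  \
             float *:   demo_store_f32)(p)
```

A call such as demo_store(&x) resolves to the routine matching the type of x, which is the same way a single vst4q name can stand in for the typed vst4q_* intrinsics.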

2020-03-16  Andre Vieira  <andre.simoesdiasvieira@arm.com>
	    Mihail Ionescu  <mihail.ionescu@arm.com>
	    Srinath Parvathaneni  <srinath.parvathaneni@arm.com>

	* gcc.target/arm/mve/intrinsics/vst4q_f16.c: New test.
	* gcc.target/arm/mve/intrinsics/vst4q_f32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vst4q_s16.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vst4q_s32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vst4q_s8.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vst4q_u16.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vst4q_u32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vst4q_u8.c: Likewise.
parent 994d4862
2020-03-16 Andre Vieira <andre.simoesdiasvieira@arm.com>
Mihail Ionescu <mihail.ionescu@arm.com>
Srinath Parvathaneni <srinath.parvathaneni@arm.com>
* config/arm/arm-builtins.c (CF): Define mve_builtin_data.
(VAR1): Define.
(ARM_BUILTIN_MVE_PATTERN_START): Define.
(arm_init_mve_builtins): Define function.
(arm_init_builtins): Add TARGET_HAVE_MVE check.
(arm_expand_builtin_1): Check the range of fcode.
(arm_expand_mve_builtin): Define function to expand MVE builtins.
(arm_expand_builtin): Check the range of fcode.
* config/arm/arm_mve.h (__ARM_FEATURE_MVE): Define MVE floating point
types.
(__ARM_MVE_PRESERVE_USER_NAMESPACE): Define to protect user namespace.
(vst4q_s8): Define macro.
(vst4q_s16): Likewise.
(vst4q_s32): Likewise.
(vst4q_u8): Likewise.
(vst4q_u16): Likewise.
(vst4q_u32): Likewise.
(vst4q_f16): Likewise.
(vst4q_f32): Likewise.
(__arm_vst4q_s8): Define inline builtin.
(__arm_vst4q_s16): Likewise.
(__arm_vst4q_s32): Likewise.
(__arm_vst4q_u8): Likewise.
(__arm_vst4q_u16): Likewise.
(__arm_vst4q_u32): Likewise.
(__arm_vst4q_f16): Likewise.
(__arm_vst4q_f32): Likewise.
(__ARM_mve_typeid): Define macro with MVE types.
(__ARM_mve_coerce): Define macro with _Generic feature.
(vst4q): Define polymorphic variant for different vst4q builtins.
* config/arm/arm_mve_builtins.def: New file.
* config/arm/iterators.md (VSTRUCT): Modify to allow XI and OI
modes in MVE.
* config/arm/mve.md (MVE_VLD_ST): Define iterator.
(unspec): Define unspec.
(mve_vst4q<mode>): Define RTL pattern.
* config/arm/neon.md (mov<mode>): Modify expand to allow XI and OI
modes in MVE.
(neon_mov<mode>): Modify RTL define_insn to allow XI and OI modes
in MVE.
(define_split): Allow OI mode split for MVE after reload.
(define_split): Allow XI mode split for MVE after reload.
* config/arm/t-arm (arm.o): Add entry for arm_mve_builtins.def.
(arm-builtins.o): Likewise.
2020-03-17 Christophe Lyon <christophe.lyon@linaro.org>
* c-typeck.c (process_init_element): Handle constructor_type with
......
@@ -432,6 +432,13 @@ static arm_builtin_datum neon_builtin_data[] =
};
#undef CF
#define CF(N,X) CODE_FOR_mve_##N##X
static arm_builtin_datum mve_builtin_data[] =
{
#include "arm_mve_builtins.def"
};
#undef CF
#undef VAR1
#define VAR1(T, N, A) \
{#N, UP (A), CODE_FOR_arm_##N, 0, T##_QUALIFIERS},
@@ -736,6 +743,13 @@ enum arm_builtins
#include "arm_acle_builtins.def"
ARM_BUILTIN_MVE_BASE,
#undef VAR1
#define VAR1(T, N, X) \
ARM_BUILTIN_MVE_##N##X,
#include "arm_mve_builtins.def"
ARM_BUILTIN_MAX
};
@@ -745,6 +759,9 @@ enum arm_builtins
#define ARM_BUILTIN_NEON_PATTERN_START \
(ARM_BUILTIN_NEON_BASE + 1)
#define ARM_BUILTIN_MVE_PATTERN_START \
(ARM_BUILTIN_MVE_BASE + 1)
#define ARM_BUILTIN_ACLE_PATTERN_START \
(ARM_BUILTIN_ACLE_BASE + 1)
@@ -1278,6 +1295,22 @@ arm_init_acle_builtins (void)
}
}
/* Set up all the MVE builtins mentioned in arm_mve_builtins.def file. */
static void
arm_init_mve_builtins (void)
{
volatile unsigned int i, fcode = ARM_BUILTIN_MVE_PATTERN_START;
arm_init_simd_builtin_scalar_types ();
arm_init_simd_builtin_types ();
for (i = 0; i < ARRAY_SIZE (mve_builtin_data); i++, fcode++)
{
arm_builtin_datum *d = &mve_builtin_data[i];
arm_init_builtin (fcode, d, "__builtin_mve");
}
}
/* Set up all the NEON builtins, even builtins for instructions that are not
in the current target ISA to allow the user to compile particular modules
with different target specific options that differ from the command line
@@ -2022,8 +2055,10 @@ arm_init_builtins (void)
= add_builtin_function ("__builtin_arm_lane_check", lane_check_fpr,
ARM_BUILTIN_SIMD_LANE_CHECK, BUILT_IN_MD,
NULL, NULL_TREE);
arm_init_neon_builtins ();
if (TARGET_HAVE_MVE)
arm_init_mve_builtins ();
else
arm_init_neon_builtins ();
arm_init_vfp_builtins ();
arm_init_crypto_builtins ();
}
@@ -2567,10 +2602,14 @@ arm_expand_builtin_1 (int fcode, tree exp, rtx target,
int is_void = 0;
int k;
bool neon = false;
bool mve = false;
if (IN_RANGE (fcode, ARM_BUILTIN_VFP_BASE, ARM_BUILTIN_ACLE_BASE - 1))
neon = true;
if (IN_RANGE (fcode, ARM_BUILTIN_MVE_BASE, ARM_BUILTIN_MAX - 1))
mve = true;
is_void = !!(d->qualifiers[0] & qualifier_void);
num_args += is_void;
@@ -2612,7 +2651,7 @@ arm_expand_builtin_1 (int fcode, tree exp, rtx target,
}
else if (d->qualifiers[qualifiers_k] & qualifier_pointer)
{
if (neon)
if (neon || mve)
args[k] = ARG_BUILTIN_NEON_MEMORY;
else
args[k] = ARG_BUILTIN_MEMORY;
@@ -2662,6 +2701,26 @@ arm_expand_acle_builtin (int fcode, tree exp, rtx target)
return arm_expand_builtin_1 (fcode, exp, target, d);
}
/* Expand an MVE builtin, i.e. those registered only if their respective target
constraints are met. This check happens within arm_expand_builtin. */
static rtx
arm_expand_mve_builtin (int fcode, tree exp, rtx target)
{
if (fcode >= ARM_BUILTIN_MVE_BASE && !TARGET_HAVE_MVE)
{
fatal_error (input_location,
"You must enable MVE instructions"
" to use these intrinsics");
return const0_rtx;
}
arm_builtin_datum *d
= &mve_builtin_data[fcode - ARM_BUILTIN_MVE_PATTERN_START];
return arm_expand_builtin_1 (fcode, exp, target, d);
}
/* Expand a Neon builtin, i.e. those registered only if TARGET_NEON holds.
Most of these are "special" because they don't have symbolic
constants defined per-instruction or per instruction-variant. Instead, the
@@ -2755,6 +2814,8 @@ arm_expand_builtin (tree exp,
/* Don't generate any RTL. */
return const0_rtx;
}
if (fcode >= ARM_BUILTIN_MVE_BASE)
return arm_expand_mve_builtin (fcode, exp, target);
if (fcode >= ARM_BUILTIN_ACLE_BASE)
return arm_expand_acle_builtin (fcode, exp, target);
......
/* MVE builtin definitions for Arm.
Copyright (C) 2019-2020 Free Software Foundation, Inc.
Contributed by Arm.
This file is part of GCC.
GCC is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published
by the Free Software Foundation; either version 3, or (at your
option) any later version.
GCC is distributed in the hope that it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public
License for more details.
You should have received a copy of the GNU General Public License
along with GCC; see the file COPYING3. If not see
<http://www.gnu.org/licenses/>. */
VAR5 (STORE1, vst4q, v16qi, v8hi, v4si, v8hf, v4sf)
@@ -131,7 +131,8 @@
(define_mode_iterator VQXMOV [V16QI V8HI V8HF V8BF V4SI V4SF V2DI TI])
;; Opaque structure types wider than TImode.
(define_mode_iterator VSTRUCT [EI OI CI XI])
(define_mode_iterator VSTRUCT [(EI "!TARGET_HAVE_MVE") OI
(CI "!TARGET_HAVE_MVE") XI])
;; Opaque structure types used in table lookups (except vtbl1/vtbx1).
(define_mode_iterator VTAB [TI EI OI])
......
@@ -17,9 +17,12 @@
;; along with GCC; see the file COPYING3. If not see
;; <http://www.gnu.org/licenses/>.
(define_mode_iterator MVE_types [V16QI V8HI V4SI V2DI TI V8HF V4SF V2DF])
(define_mode_attr V_sz_elem2 [(V16QI "s8") (V8HI "u16") (V4SI "u32")
(V2DI "u64")])
(define_mode_iterator MVE_types [V16QI V8HI V4SI V2DI TI V8HF V4SF V2DF])
(define_mode_iterator MVE_VLD_ST [V16QI V8HI V4SI V8HF V4SF])
(define_c_enum "unspec" [VST4Q])
(define_insn "*mve_mov<mode>"
[(set (match_operand:MVE_types 0 "nonimmediate_operand" "=w,w,r,w,w,r,w,Us")
@@ -83,3 +86,37 @@
}
[(set_attr "length" "4,4")
(set_attr "type" "mve_move,mve_move")])
;;
;; [vst4q])
;;
(define_insn "mve_vst4q<mode>"
[(set (match_operand:XI 0 "neon_struct_operand" "=Um")
(unspec:XI [(match_operand:XI 1 "s_register_operand" "w")
(unspec:MVE_VLD_ST [(const_int 0)] UNSPEC_VSTRUCTDUMMY)]
VST4Q))
]
"TARGET_HAVE_MVE"
{
rtx ops[6];
int regno = REGNO (operands[1]);
ops[0] = gen_rtx_REG (TImode, regno);
ops[1] = gen_rtx_REG (TImode, regno+4);
ops[2] = gen_rtx_REG (TImode, regno+8);
ops[3] = gen_rtx_REG (TImode, regno+12);
rtx reg = operands[0];
while (reg && !REG_P (reg))
reg = XEXP (reg, 0);
gcc_assert (REG_P (reg));
ops[4] = reg;
ops[5] = operands[0];
/* Here in first three instructions data is stored to ops[4]'s location but
in the fourth instruction data is stored to operands[0], this is to
support the writeback. */
output_asm_insn ("vst40.<V_sz_elem>\t{%q0, %q1, %q2, %q3}, [%4]\n\t"
"vst41.<V_sz_elem>\t{%q0, %q1, %q2, %q3}, [%4]\n\t"
"vst42.<V_sz_elem>\t{%q0, %q1, %q2, %q3}, [%4]\n\t"
"vst43.<V_sz_elem>\t{%q0, %q1, %q2, %q3}, %5", ops);
return "";
}
[(set_attr "length" "16")])
@@ -149,7 +149,7 @@
(define_expand "mov<mode>"
[(set (match_operand:VSTRUCT 0 "nonimmediate_operand")
(match_operand:VSTRUCT 1 "general_operand"))]
"TARGET_NEON"
"TARGET_NEON || TARGET_HAVE_MVE"
{
gcc_checking_assert (aligned_operand (operands[0], <MODE>mode));
gcc_checking_assert (aligned_operand (operands[1], <MODE>mode));
@@ -181,7 +181,7 @@
(define_insn "*neon_mov<mode>"
[(set (match_operand:VSTRUCT 0 "nonimmediate_operand" "=w,Ut,w")
(match_operand:VSTRUCT 1 "general_operand" " w,w, Ut"))]
"TARGET_NEON
"(TARGET_NEON || TARGET_HAVE_MVE)
&& (register_operand (operands[0], <MODE>mode)
|| register_operand (operands[1], <MODE>mode))"
{
@@ -217,7 +217,7 @@
(define_split
[(set (match_operand:OI 0 "s_register_operand" "")
(match_operand:OI 1 "s_register_operand" ""))]
"TARGET_NEON && reload_completed"
"(TARGET_NEON || TARGET_HAVE_MVE)&& reload_completed"
[(set (match_dup 0) (match_dup 1))
(set (match_dup 2) (match_dup 3))]
{
@@ -258,7 +258,7 @@
(define_split
[(set (match_operand:XI 0 "s_register_operand" "")
(match_operand:XI 1 "s_register_operand" ""))]
"TARGET_NEON && reload_completed"
"(TARGET_NEON || TARGET_HAVE_MVE) && reload_completed"
[(set (match_dup 0) (match_dup 1))
(set (match_dup 2) (match_dup 3))
(set (match_dup 4) (match_dup 5))
......
@@ -137,7 +137,8 @@ arm.o: $(srcdir)/config/arm/arm.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \
arm-cpu-data.h \
$(srcdir)/config/arm/arm-protos.h \
$(srcdir)/config/arm/arm_neon_builtins.def \
$(srcdir)/config/arm/arm_vfp_builtins.def
$(srcdir)/config/arm/arm_vfp_builtins.def \
$(srcdir)/config/arm/arm_mve_builtins.def
arm-builtins.o: $(srcdir)/config/arm/arm-builtins.c $(CONFIG_H) \
$(SYSTEM_H) coretypes.h $(TM_H) \
@@ -147,6 +148,7 @@ arm-builtins.o: $(srcdir)/config/arm/arm-builtins.c $(CONFIG_H) \
$(srcdir)/config/arm/arm_acle_builtins.def \
$(srcdir)/config/arm/arm_neon_builtins.def \
$(srcdir)/config/arm/arm_vfp_builtins.def \
$(srcdir)/config/arm/arm_mve_builtins.def \
$(srcdir)/config/arm/arm-simd-builtin-types.def
$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \
$(srcdir)/config/arm/arm-builtins.c
......
2020-03-16 Andre Vieira <andre.simoesdiasvieira@arm.com>
Mihail Ionescu <mihail.ionescu@arm.com>
Srinath Parvathaneni <srinath.parvathaneni@arm.com>
* gcc.target/arm/mve/intrinsics/vst4q_f16.c: New test.
* gcc.target/arm/mve/intrinsics/vst4q_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vst4q_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vst4q_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vst4q_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vst4q_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vst4q_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vst4q_u8.c: Likewise.
2020-03-17 Jakub Jelinek <jakub@redhat.com>
PR target/94185
......
/* { dg-do compile } */
/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
/* { dg-add-options arm_v8_1m_mve_fp } */
/* { dg-additional-options "-O2" } */
#include "arm_mve.h"
void
foo (float16_t * addr, float16x8x4_t value)
{
vst4q_f16 (addr, value);
}
/* { dg-final { scan-assembler "vst40.16" } } */
/* { dg-final { scan-assembler "vst41.16" } } */
/* { dg-final { scan-assembler "vst42.16" } } */
/* { dg-final { scan-assembler "vst43.16" } } */
void
foo1 (float16_t * addr, float16x8x4_t value)
{
vst4q (addr, value);
}
/* { dg-final { scan-assembler "vst40.16" } } */
/* { dg-final { scan-assembler "vst41.16" } } */
/* { dg-final { scan-assembler "vst42.16" } } */
/* { dg-final { scan-assembler "vst43.16" } } */
void
foo2 (float16_t * addr, float16x8x4_t value)
{
vst4q_f16 (addr, value);
addr += 32;
vst4q_f16 (addr, value);
}
/* { dg-final { scan-assembler {vst43.16\s\{.*\}, \[.*\]!} } } */
/* { dg-do compile } */
/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
/* { dg-add-options arm_v8_1m_mve_fp } */
/* { dg-additional-options "-O2" } */
#include "arm_mve.h"
void
foo (float32_t * addr, float32x4x4_t value)
{
vst4q_f32 (addr, value);
}
/* { dg-final { scan-assembler "vst40.32" } } */
/* { dg-final { scan-assembler "vst41.32" } } */
/* { dg-final { scan-assembler "vst42.32" } } */
/* { dg-final { scan-assembler "vst43.32" } } */
void
foo1 (float32_t * addr, float32x4x4_t value)
{
vst4q (addr, value);
}
/* { dg-final { scan-assembler "vst40.32" } } */
/* { dg-final { scan-assembler "vst41.32" } } */
/* { dg-final { scan-assembler "vst42.32" } } */
/* { dg-final { scan-assembler "vst43.32" } } */
void
foo2 (float32_t * addr, float32x4x4_t value)
{
vst4q_f32 (addr, value);
addr += 16;
vst4q_f32 (addr, value);
}
/* { dg-final { scan-assembler {vst43.32\s\{.*\}, \[.*\]!} } } */
/* { dg-do compile } */
/* { dg-require-effective-target arm_v8_1m_mve_ok } */
/* { dg-add-options arm_v8_1m_mve } */
/* { dg-additional-options "-O2" } */
#include "arm_mve.h"
void
foo (int16_t * addr, int16x8x4_t value)
{
vst4q_s16 (addr, value);
}
/* { dg-final { scan-assembler "vst40.16" } } */
/* { dg-final { scan-assembler "vst41.16" } } */
/* { dg-final { scan-assembler "vst42.16" } } */
/* { dg-final { scan-assembler "vst43.16" } } */
void
foo1 (int16_t * addr, int16x8x4_t value)
{
vst4q (addr, value);
}
/* { dg-final { scan-assembler "vst40.16" } } */
/* { dg-final { scan-assembler "vst41.16" } } */
/* { dg-final { scan-assembler "vst42.16" } } */
/* { dg-final { scan-assembler "vst43.16" } } */
void
foo2 (int16_t * addr, int16x8x4_t value)
{
vst4q_s16 (addr, value);
addr += 32;
vst4q_s16 (addr, value);
}
/* { dg-final { scan-assembler {vst43.16\s\{.*\}, \[.*\]!} } } */
/* { dg-do compile } */
/* { dg-require-effective-target arm_v8_1m_mve_ok } */
/* { dg-add-options arm_v8_1m_mve } */
/* { dg-additional-options "-O2" } */
#include "arm_mve.h"
void
foo (int32_t * addr, int32x4x4_t value)
{
vst4q_s32 (addr, value);
}
/* { dg-final { scan-assembler "vst40.32" } } */
/* { dg-final { scan-assembler "vst41.32" } } */
/* { dg-final { scan-assembler "vst42.32" } } */
/* { dg-final { scan-assembler "vst43.32" } } */
void
foo1 (int32_t * addr, int32x4x4_t value)
{
vst4q (addr, value);
}
/* { dg-final { scan-assembler "vst40.32" } } */
/* { dg-final { scan-assembler "vst41.32" } } */
/* { dg-final { scan-assembler "vst42.32" } } */
/* { dg-final { scan-assembler "vst43.32" } } */
void
foo2 (int32_t * addr, int32x4x4_t value)
{
vst4q_s32 (addr, value);
addr += 16;
vst4q_s32 (addr, value);
}
/* { dg-final { scan-assembler {vst43.32\s\{.*\}, \[.*\]!} } } */
/* { dg-do compile } */
/* { dg-require-effective-target arm_v8_1m_mve_ok } */
/* { dg-add-options arm_v8_1m_mve } */
/* { dg-additional-options "-O2" } */
#include "arm_mve.h"
void
foo (int8_t * addr, int8x16x4_t value)
{
vst4q_s8 (addr, value);
}
/* { dg-final { scan-assembler "vst40.8" } } */
/* { dg-final { scan-assembler "vst41.8" } } */
/* { dg-final { scan-assembler "vst42.8" } } */
/* { dg-final { scan-assembler "vst43.8" } } */
void
foo1 (int8_t * addr, int8x16x4_t value)
{
vst4q (addr, value);
}
/* { dg-final { scan-assembler "vst40.8" } } */
/* { dg-final { scan-assembler "vst41.8" } } */
/* { dg-final { scan-assembler "vst42.8" } } */
/* { dg-final { scan-assembler "vst43.8" } } */
void
foo2 (int8_t * addr, int8x16x4_t value)
{
vst4q_s8 (addr, value);
addr += 16*4;
vst4q_s8 (addr, value);
}
/* { dg-final { scan-assembler {vst43.8\s\{.*\}, \[.*\]!} } } */
/* { dg-do compile } */
/* { dg-require-effective-target arm_v8_1m_mve_ok } */
/* { dg-add-options arm_v8_1m_mve } */
/* { dg-additional-options "-O2" } */
#include "arm_mve.h"
void
foo (uint16_t * addr, uint16x8x4_t value)
{
vst4q_u16 (addr, value);
}
/* { dg-final { scan-assembler "vst40.16" } } */
/* { dg-final { scan-assembler "vst41.16" } } */
/* { dg-final { scan-assembler "vst42.16" } } */
/* { dg-final { scan-assembler "vst43.16" } } */
void
foo1 (uint16_t * addr, uint16x8x4_t value)
{
vst4q (addr, value);
}
/* { dg-final { scan-assembler "vst40.16" } } */
/* { dg-final { scan-assembler "vst41.16" } } */
/* { dg-final { scan-assembler "vst42.16" } } */
/* { dg-final { scan-assembler "vst43.16" } } */
void
foo2 (uint16_t * addr, uint16x8x4_t value)
{
vst4q_u16 (addr, value);
addr += 32;
vst4q_u16 (addr, value);
}
/* { dg-final { scan-assembler {vst43.16\s\{.*\}, \[.*\]!} } } */
/* { dg-do compile } */
/* { dg-require-effective-target arm_v8_1m_mve_ok } */
/* { dg-add-options arm_v8_1m_mve } */
/* { dg-additional-options "-O2" } */
#include "arm_mve.h"
void
foo (uint32_t * addr, uint32x4x4_t value)
{
vst4q_u32 (addr, value);
}
/* { dg-final { scan-assembler "vst40.32" } } */
/* { dg-final { scan-assembler "vst41.32" } } */
/* { dg-final { scan-assembler "vst42.32" } } */
/* { dg-final { scan-assembler "vst43.32" } } */
void
foo1 (uint32_t * addr, uint32x4x4_t value)
{
vst4q (addr, value);
}
/* { dg-final { scan-assembler "vst40.32" } } */
/* { dg-final { scan-assembler "vst41.32" } } */
/* { dg-final { scan-assembler "vst42.32" } } */
/* { dg-final { scan-assembler "vst43.32" } } */
void
foo2 (uint32_t * addr, uint32x4x4_t value)
{
vst4q_u32 (addr, value);
addr += 16;
vst4q_u32 (addr, value);
}
/* { dg-final { scan-assembler {vst43.32\s\{.*\}, \[.*\]!} } } */
/* { dg-do compile } */
/* { dg-require-effective-target arm_v8_1m_mve_ok } */
/* { dg-add-options arm_v8_1m_mve } */
/* { dg-additional-options "-O2" } */
#include "arm_mve.h"
void
foo (uint8_t * addr, uint8x16x4_t value)
{
vst4q_u8 (addr, value);
}
/* { dg-final { scan-assembler "vst40.8" } } */
/* { dg-final { scan-assembler "vst41.8" } } */
/* { dg-final { scan-assembler "vst42.8" } } */
/* { dg-final { scan-assembler "vst43.8" } } */
void
foo1 (uint8_t * addr, uint8x16x4_t value)
{
vst4q (addr, value);
}
/* { dg-final { scan-assembler "vst40.8" } } */
/* { dg-final { scan-assembler "vst41.8" } } */
/* { dg-final { scan-assembler "vst42.8" } } */
/* { dg-final { scan-assembler "vst43.8" } } */
void
foo2 (uint8_t * addr, uint8x16x4_t value)
{
vst4q_u8 (addr, value);
addr += 16*4;
vst4q_u8 (addr, value);
}
/* { dg-final { scan-assembler {vst43.8\s\{.*\}, \[.*\]!} } } */