Commit a19ba9e1 by Richard Sandiford Committed by Richard Sandiford

[AArch64] Use SVE binary immediate instructions for conditional arithmetic

This patch lets us use the immediate forms of FADD, FSUB, FSUBR,
FMUL, FMAXNM and FMINNM for conditional arithmetic.  (We already
use them for normal unconditional arithmetic.)

2019-08-15  Richard Sandiford  <richard.sandiford@arm.com>
	    Kugan Vivekanandarajah  <kugan.vivekanandarajah@linaro.org>

gcc/
	* config/aarch64/aarch64.c (aarch64_print_vector_float_operand):
	Print 2.0 naturally.
	(aarch64_sve_float_mul_immediate_p): Return true for 2.0.
	* config/aarch64/predicates.md
	(aarch64_sve_float_negated_arith_immediate): New predicate,
	renamed from aarch64_sve_float_arith_with_sub_immediate.
	(aarch64_sve_float_arith_with_sub_immediate): Test for both
	positive and negative constants.
	(aarch64_sve_float_arith_with_sub_operand): Redefine as a register
	or an aarch64_sve_float_arith_with_sub_immediate.
	* config/aarch64/constraints.md (vsN): Use
	aarch64_sve_float_negated_arith_immediate.
	* config/aarch64/iterators.md (SVE_COND_FP_BINARY_I1): New int
	iterator.
	(sve_pred_fp_rhs2_immediate): New int attribute.
	* config/aarch64/aarch64-sve.md
	(cond_<SVE_COND_FP_BINARY:optab><SVE_F:mode>): Use
	sve_pred_fp_rhs1_operand and sve_pred_fp_rhs2_operand.
	(*cond_<SVE_COND_FP_BINARY_I1:optab><SVE_F:mode>_2_const)
	(*cond_<SVE_COND_FP_BINARY_I1:optab><SVE_F:mode>_any_const)
	(*cond_add<SVE_F:mode>_2_const, *cond_add<SVE_F:mode>_any_const)
	(*cond_sub<mode>_3_const, *cond_sub<mode>_any_const): New patterns.

gcc/testsuite/
	* gcc.target/aarch64/sve/cond_fadd_1.c: New test.
	* gcc.target/aarch64/sve/cond_fadd_1_run.c: Likewise.
	* gcc.target/aarch64/sve/cond_fadd_2.c: Likewise.
	* gcc.target/aarch64/sve/cond_fadd_2_run.c: Likewise.
	* gcc.target/aarch64/sve/cond_fadd_3.c: Likewise.
	* gcc.target/aarch64/sve/cond_fadd_3_run.c: Likewise.
	* gcc.target/aarch64/sve/cond_fadd_4.c: Likewise.
	* gcc.target/aarch64/sve/cond_fadd_4_run.c: Likewise.
	* gcc.target/aarch64/sve/cond_fsubr_1.c: Likewise.
	* gcc.target/aarch64/sve/cond_fsubr_1_run.c: Likewise.
	* gcc.target/aarch64/sve/cond_fsubr_2.c: Likewise.
	* gcc.target/aarch64/sve/cond_fsubr_2_run.c: Likewise.
	* gcc.target/aarch64/sve/cond_fsubr_3.c: Likewise.
	* gcc.target/aarch64/sve/cond_fsubr_3_run.c: Likewise.
	* gcc.target/aarch64/sve/cond_fsubr_4.c: Likewise.
	* gcc.target/aarch64/sve/cond_fsubr_4_run.c: Likewise.
	* gcc.target/aarch64/sve/cond_fmaxnm_1.c: Likewise.
	* gcc.target/aarch64/sve/cond_fmaxnm_1_run.c: Likewise.
	* gcc.target/aarch64/sve/cond_fmaxnm_2.c: Likewise.
	* gcc.target/aarch64/sve/cond_fmaxnm_2_run.c: Likewise.
	* gcc.target/aarch64/sve/cond_fmaxnm_3.c: Likewise.
	* gcc.target/aarch64/sve/cond_fmaxnm_3_run.c: Likewise.
	* gcc.target/aarch64/sve/cond_fmaxnm_4.c: Likewise.
	* gcc.target/aarch64/sve/cond_fmaxnm_4_run.c: Likewise.
	* gcc.target/aarch64/sve/cond_fminnm_1.c: Likewise.
	* gcc.target/aarch64/sve/cond_fminnm_1_run.c: Likewise.
	* gcc.target/aarch64/sve/cond_fminnm_2.c: Likewise.
	* gcc.target/aarch64/sve/cond_fminnm_2_run.c: Likewise.
	* gcc.target/aarch64/sve/cond_fminnm_3.c: Likewise.
	* gcc.target/aarch64/sve/cond_fminnm_3_run.c: Likewise.
	* gcc.target/aarch64/sve/cond_fminnm_4.c: Likewise.
	* gcc.target/aarch64/sve/cond_fminnm_4_run.c: Likewise.
	* gcc.target/aarch64/sve/cond_fmul_1.c: Likewise.
	* gcc.target/aarch64/sve/cond_fmul_1_run.c: Likewise.
	* gcc.target/aarch64/sve/cond_fmul_2.c: Likewise.
	* gcc.target/aarch64/sve/cond_fmul_2_run.c: Likewise.
	* gcc.target/aarch64/sve/cond_fmul_3.c: Likewise.
	* gcc.target/aarch64/sve/cond_fmul_3_run.c: Likewise.
	* gcc.target/aarch64/sve/cond_fmul_4.c: Likewise.
	* gcc.target/aarch64/sve/cond_fmul_4_run.c: Likewise.

Co-Authored-By: Kugan Vivekanandarajah <kuganv@linaro.org>

From-SVN: r274508
parent bf30864e
2019-08-15 Richard Sandiford <richard.sandiford@arm.com> 2019-08-15 Richard Sandiford <richard.sandiford@arm.com>
Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org> Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
* config/aarch64/aarch64.c (aarch64_print_vector_float_operand):
Print 2.0 naturally.
(aarch64_sve_float_mul_immediate_p): Return true for 2.0.
* config/aarch64/predicates.md
(aarch64_sve_float_negated_arith_immediate): New predicate,
renamed from aarch64_sve_float_arith_with_sub_immediate.
(aarch64_sve_float_arith_with_sub_immediate): Test for both
positive and negative constants.
(aarch64_sve_float_arith_with_sub_operand): Redefine as a register
or an aarch64_sve_float_arith_with_sub_immediate.
* config/aarch64/constraints.md (vsN): Use
aarch64_sve_float_negated_arith_immediate.
* config/aarch64/iterators.md (SVE_COND_FP_BINARY_I1): New int
iterator.
(sve_pred_fp_rhs2_immediate): New int attribute.
* config/aarch64/aarch64-sve.md
(cond_<SVE_COND_FP_BINARY:optab><SVE_F:mode>): Use
sve_pred_fp_rhs1_operand and sve_pred_fp_rhs2_operand.
(*cond_<SVE_COND_FP_BINARY_I1:optab><SVE_F:mode>_2_const)
(*cond_<SVE_COND_FP_BINARY_I1:optab><SVE_F:mode>_any_const)
(*cond_add<SVE_F:mode>_2_const, *cond_add<SVE_F:mode>_any_const)
(*cond_sub<mode>_3_const, *cond_sub<mode>_any_const): New patterns.
2019-08-15 Richard Sandiford <richard.sandiford@arm.com>
Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
* config/aarch64/aarch64-sve.md (*aarch64_cond_abd<SVE_F:mode>_2) * config/aarch64/aarch64-sve.md (*aarch64_cond_abd<SVE_F:mode>_2)
(*aarch64_cond_abd<SVE_F:mode>_3) (*aarch64_cond_abd<SVE_F:mode>_3)
(*aarch64_cond_abd<SVE_F:mode>_any): New patterns. (*aarch64_cond_abd<SVE_F:mode>_any): New patterns.
......
...@@ -8289,6 +8289,8 @@ aarch64_print_vector_float_operand (FILE *f, rtx x, bool negate) ...@@ -8289,6 +8289,8 @@ aarch64_print_vector_float_operand (FILE *f, rtx x, bool negate)
fixed form in the assembly syntax. */ fixed form in the assembly syntax. */
if (real_equal (&r, &dconst0)) if (real_equal (&r, &dconst0))
asm_fprintf (f, "0.0"); asm_fprintf (f, "0.0");
else if (real_equal (&r, &dconst2))
asm_fprintf (f, "2.0");
else if (real_equal (&r, &dconst1)) else if (real_equal (&r, &dconst1))
asm_fprintf (f, "1.0"); asm_fprintf (f, "1.0");
else if (real_equal (&r, &dconsthalf)) else if (real_equal (&r, &dconsthalf))
...@@ -15205,11 +15207,10 @@ aarch64_sve_float_mul_immediate_p (rtx x) ...@@ -15205,11 +15207,10 @@ aarch64_sve_float_mul_immediate_p (rtx x)
{ {
rtx elt; rtx elt;
/* GCC will never generate a multiply with an immediate of 2, so there is no
point testing for it (even though it is a valid constant). */
return (const_vec_duplicate_p (x, &elt) return (const_vec_duplicate_p (x, &elt)
&& GET_CODE (elt) == CONST_DOUBLE && GET_CODE (elt) == CONST_DOUBLE
&& real_equal (CONST_DOUBLE_REAL_VALUE (elt), &dconsthalf)); && (real_equal (CONST_DOUBLE_REAL_VALUE (elt), &dconsthalf)
|| real_equal (CONST_DOUBLE_REAL_VALUE (elt), &dconst2)));
} }
/* Return true if replicating VAL32 is a valid 2-byte or 4-byte immediate /* Return true if replicating VAL32 is a valid 2-byte or 4-byte immediate
......
...@@ -458,4 +458,4 @@ ...@@ -458,4 +458,4 @@
(define_constraint "vsN" (define_constraint "vsN"
"@internal "@internal
A constraint that matches the negative of vsA" A constraint that matches the negative of vsA"
(match_operand 0 "aarch64_sve_float_arith_with_sub_immediate")) (match_operand 0 "aarch64_sve_float_negated_arith_immediate"))
...@@ -1713,6 +1713,10 @@ ...@@ -1713,6 +1713,10 @@
UNSPEC_COND_FMUL UNSPEC_COND_FMUL
UNSPEC_COND_FSUB]) UNSPEC_COND_FSUB])
(define_int_iterator SVE_COND_FP_BINARY_I1 [UNSPEC_COND_FMAXNM
UNSPEC_COND_FMINNM
UNSPEC_COND_FMUL])
(define_int_iterator SVE_COND_FP_BINARY_REG [UNSPEC_COND_FDIV]) (define_int_iterator SVE_COND_FP_BINARY_REG [UNSPEC_COND_FDIV])
;; Floating-point max/min operations that correspond to optabs, ;; Floating-point max/min operations that correspond to optabs,
...@@ -2108,3 +2112,9 @@ ...@@ -2108,3 +2112,9 @@
(UNSPEC_COND_FMINNM "aarch64_sve_float_maxmin_operand") (UNSPEC_COND_FMINNM "aarch64_sve_float_maxmin_operand")
(UNSPEC_COND_FMUL "aarch64_sve_float_mul_operand") (UNSPEC_COND_FMUL "aarch64_sve_float_mul_operand")
(UNSPEC_COND_FSUB "register_operand")]) (UNSPEC_COND_FSUB "register_operand")])
;; Likewise for immediates only.
(define_int_attr sve_pred_fp_rhs2_immediate
[(UNSPEC_COND_FMAXNM "aarch64_sve_float_maxmin_immediate")
(UNSPEC_COND_FMINNM "aarch64_sve_float_maxmin_immediate")
(UNSPEC_COND_FMUL "aarch64_sve_float_mul_immediate")])
...@@ -663,10 +663,14 @@ ...@@ -663,10 +663,14 @@
(and (match_code "const,const_vector") (and (match_code "const,const_vector")
(match_test "aarch64_sve_float_arith_immediate_p (op, false)"))) (match_test "aarch64_sve_float_arith_immediate_p (op, false)")))
(define_predicate "aarch64_sve_float_arith_with_sub_immediate" (define_predicate "aarch64_sve_float_negated_arith_immediate"
(and (match_code "const,const_vector") (and (match_code "const,const_vector")
(match_test "aarch64_sve_float_arith_immediate_p (op, true)"))) (match_test "aarch64_sve_float_arith_immediate_p (op, true)")))
(define_predicate "aarch64_sve_float_arith_with_sub_immediate"
(ior (match_operand 0 "aarch64_sve_float_arith_immediate")
(match_operand 0 "aarch64_sve_float_negated_arith_immediate")))
(define_predicate "aarch64_sve_float_mul_immediate" (define_predicate "aarch64_sve_float_mul_immediate"
(and (match_code "const,const_vector") (and (match_code "const,const_vector")
(match_test "aarch64_sve_float_mul_immediate_p (op)"))) (match_test "aarch64_sve_float_mul_immediate_p (op)")))
...@@ -730,7 +734,7 @@ ...@@ -730,7 +734,7 @@
(match_operand 0 "aarch64_sve_float_arith_immediate"))) (match_operand 0 "aarch64_sve_float_arith_immediate")))
(define_predicate "aarch64_sve_float_arith_with_sub_operand" (define_predicate "aarch64_sve_float_arith_with_sub_operand"
(ior (match_operand 0 "aarch64_sve_float_arith_operand") (ior (match_operand 0 "register_operand")
(match_operand 0 "aarch64_sve_float_arith_with_sub_immediate"))) (match_operand 0 "aarch64_sve_float_arith_with_sub_immediate")))
(define_predicate "aarch64_sve_float_mul_operand" (define_predicate "aarch64_sve_float_mul_operand"
......
2019-08-15 Richard Sandiford <richard.sandiford@arm.com> 2019-08-15 Richard Sandiford <richard.sandiford@arm.com>
Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org> Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
* gcc.target/aarch64/sve/cond_fadd_1.c: New test.
* gcc.target/aarch64/sve/cond_fadd_1_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fadd_2.c: Likewise.
* gcc.target/aarch64/sve/cond_fadd_2_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fadd_3.c: Likewise.
* gcc.target/aarch64/sve/cond_fadd_3_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fadd_4.c: Likewise.
* gcc.target/aarch64/sve/cond_fadd_4_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fsubr_1.c: Likewise.
* gcc.target/aarch64/sve/cond_fsubr_1_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fsubr_2.c: Likewise.
* gcc.target/aarch64/sve/cond_fsubr_2_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fsubr_3.c: Likewise.
* gcc.target/aarch64/sve/cond_fsubr_3_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fsubr_4.c: Likewise.
* gcc.target/aarch64/sve/cond_fsubr_4_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fmaxnm_1.c: Likewise.
* gcc.target/aarch64/sve/cond_fmaxnm_1_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fmaxnm_2.c: Likewise.
* gcc.target/aarch64/sve/cond_fmaxnm_2_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fmaxnm_3.c: Likewise.
* gcc.target/aarch64/sve/cond_fmaxnm_3_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fmaxnm_4.c: Likewise.
* gcc.target/aarch64/sve/cond_fmaxnm_4_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fminnm_1.c: Likewise.
* gcc.target/aarch64/sve/cond_fminnm_1_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fminnm_2.c: Likewise.
* gcc.target/aarch64/sve/cond_fminnm_2_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fminnm_3.c: Likewise.
* gcc.target/aarch64/sve/cond_fminnm_3_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fminnm_4.c: Likewise.
* gcc.target/aarch64/sve/cond_fminnm_4_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fmul_1.c: Likewise.
* gcc.target/aarch64/sve/cond_fmul_1_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fmul_2.c: Likewise.
* gcc.target/aarch64/sve/cond_fmul_2_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fmul_3.c: Likewise.
* gcc.target/aarch64/sve/cond_fmul_3_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fmul_4.c: Likewise.
* gcc.target/aarch64/sve/cond_fmul_4_run.c: Likewise.
2019-08-15 Richard Sandiford <richard.sandiford@arm.com>
Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
* gcc.target/aarch64/sve/cond_fabd_1.c: New test. * gcc.target/aarch64/sve/cond_fabd_1.c: New test.
* gcc.target/aarch64/sve/cond_fabd_1_run.c: Likewise. * gcc.target/aarch64/sve/cond_fabd_1_run.c: Likewise.
* gcc.target/aarch64/sve/cond_fabd_2.c: Likewise. * gcc.target/aarch64/sve/cond_fabd_2.c: Likewise.
......
/* { dg-do compile } */
/* { dg-options "-O2 -ftree-vectorize" } */
#include <stdint.h>
#define DEF_LOOP(TYPE, PRED_TYPE, NAME, CONST) \
void __attribute__ ((noipa)) \
test_##TYPE##_##NAME (TYPE *__restrict x, \
TYPE *__restrict y, \
PRED_TYPE *__restrict pred, \
int n) \
{ \
for (int i = 0; i < n; ++i) \
x[i] = pred[i] != 1 ? y[i] + (TYPE) CONST : y[i]; \
}
#define TEST_TYPE(T, TYPE, PRED_TYPE) \
T (TYPE, PRED_TYPE, half, 0.5) \
T (TYPE, PRED_TYPE, one, 1.0) \
T (TYPE, PRED_TYPE, two, 2.0) \
T (TYPE, PRED_TYPE, minus_half, -0.5) \
T (TYPE, PRED_TYPE, minus_one, -1.0) \
T (TYPE, PRED_TYPE, minus_two, -2.0)
#define TEST_ALL(T) \
TEST_TYPE (T, _Float16, int16_t) \
TEST_TYPE (T, float, int32_t) \
TEST_TYPE (T, double, int64_t)
TEST_ALL (DEF_LOOP)
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.5\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.5\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #2\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #-2\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #-2\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #-2\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 2 } } */
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 2 } } */
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 2 } } */
/* { dg-final { scan-assembler-not {\tmov\tz} } } */
/* { dg-final { scan-assembler-not {\tmovprfx\t} } } */
/* { dg-final { scan-assembler-not {\tsel\t} } } */
/* { dg-do run { target aarch64_sve_hw } } */
/* { dg-options "-O2 -ftree-vectorize" } */
#include "cond_fadd_1.c"
#define N 99
#define TEST_LOOP(TYPE, PRED_TYPE, NAME, CONST) \
{ \
TYPE x[N], y[N]; \
PRED_TYPE pred[N]; \
for (int i = 0; i < N; ++i) \
{ \
y[i] = i * i; \
pred[i] = i % 3; \
} \
test_##TYPE##_##NAME (x, y, pred, N); \
for (int i = 0; i < N; ++i) \
{ \
TYPE expected = i % 3 != 1 ? y[i] + (TYPE) CONST : y[i]; \
if (x[i] != expected) \
__builtin_abort (); \
asm volatile ("" ::: "memory"); \
} \
}
int
main (void)
{
TEST_ALL (TEST_LOOP)
return 0;
}
/* { dg-do compile } */
/* { dg-options "-O2 -ftree-vectorize" } */
#include <stdint.h>
#define DEF_LOOP(TYPE, NAME, CONST) \
void __attribute__ ((noipa)) \
test_##TYPE##_##NAME (TYPE *__restrict x, \
TYPE *__restrict y, \
TYPE *__restrict z, \
int n) \
{ \
for (int i = 0; i < n; ++i) \
x[i] = y[i] < 8 ? z[i] + (TYPE) CONST : y[i]; \
}
#define TEST_TYPE(T, TYPE) \
T (TYPE, half, 0.5) \
T (TYPE, one, 1.0) \
T (TYPE, two, 2.0) \
T (TYPE, minus_half, -0.5) \
T (TYPE, minus_one, -1.0) \
T (TYPE, minus_two, -2.0)
#define TEST_ALL(T) \
TEST_TYPE (T, float) \
TEST_TYPE (T, double)
TEST_ALL (DEF_LOOP)
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #-2\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #-2\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 2 } } */
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 2 } } */
/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s\n} 6 } } */
/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d\n} 6 } } */
/* { dg-final { scan-assembler-not {\tmov\tz} } } */
/* { dg-final { scan-assembler-not {\tsel\t} } } */
/* { dg-do run { target aarch64_sve_hw } } */
/* { dg-options "-O2 -ftree-vectorize" } */
#include "cond_fadd_2.c"
#define N 99
#define TEST_LOOP(TYPE, NAME, CONST) \
{ \
TYPE x[N], y[N], z[N]; \
for (int i = 0; i < N; ++i) \
{ \
y[i] = i % 13; \
z[i] = i * i; \
} \
test_##TYPE##_##NAME (x, y, z, N); \
for (int i = 0; i < N; ++i) \
{ \
TYPE expected = y[i] < 8 ? z[i] + (TYPE) CONST : y[i]; \
if (x[i] != expected) \
__builtin_abort (); \
asm volatile ("" ::: "memory"); \
} \
}
int
main (void)
{
TEST_ALL (TEST_LOOP)
return 0;
}
/* { dg-do compile } */
/* { dg-options "-O2 -ftree-vectorize" } */
#include <stdint.h>
#define DEF_LOOP(TYPE, PRED_TYPE, NAME, CONST) \
void __attribute__ ((noipa)) \
test_##TYPE##_##NAME (TYPE *__restrict x, \
TYPE *__restrict y, \
PRED_TYPE *__restrict pred, \
int n) \
{ \
for (int i = 0; i < n; ++i) \
x[i] = pred[i] != 1 ? y[i] + (TYPE) CONST : 4; \
}
#define TEST_TYPE(T, TYPE, PRED_TYPE) \
T (TYPE, PRED_TYPE, half, 0.5) \
T (TYPE, PRED_TYPE, one, 1.0) \
T (TYPE, PRED_TYPE, two, 2.0) \
T (TYPE, PRED_TYPE, minus_half, -0.5) \
T (TYPE, PRED_TYPE, minus_one, -1.0) \
T (TYPE, PRED_TYPE, minus_two, -2.0)
#define TEST_ALL(T) \
TEST_TYPE (T, _Float16, int16_t) \
TEST_TYPE (T, float, int32_t) \
TEST_TYPE (T, double, int64_t)
TEST_ALL (DEF_LOOP)
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.5\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.5\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #2\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #-2\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #-2\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #-2\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 2 } } */
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 2 } } */
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 2 } } */
/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.h, p[0-7], z[0-9]+\.h, z[0-9]+\.h\n} 6 } } */
/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.h, p[0-7], z[0-9]+\.h, z[0-9]+\.h\n} 6 } } */
/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.h, p[0-7], z[0-9]+\.h, z[0-9]+\.h\n} 6 } } */
/* { dg-final { scan-assembler-not {\tmovprfx\t} } } */
/* { dg-final { scan-assembler-not {\tmov\tz} } } */
/* { dg-do run { target aarch64_sve_hw } } */
/* { dg-options "-O2 -ftree-vectorize" } */
#include "cond_fadd_3.c"
#define N 99
#define TEST_LOOP(TYPE, PRED_TYPE, NAME, CONST) \
{ \
TYPE x[N], y[N]; \
PRED_TYPE pred[N]; \
for (int i = 0; i < N; ++i) \
{ \
y[i] = i * i; \
pred[i] = i % 3; \
} \
test_##TYPE##_##NAME (x, y, pred, N); \
for (int i = 0; i < N; ++i) \
{ \
TYPE expected = i % 3 != 1 ? y[i] + (TYPE) CONST : 4; \
if (x[i] != expected) \
__builtin_abort (); \
asm volatile ("" ::: "memory"); \
} \
}
int
main (void)
{
TEST_ALL (TEST_LOOP)
return 0;
}
/* { dg-do compile } */
/* { dg-options "-O2 -ftree-vectorize" } */
#include <stdint.h>
#define DEF_LOOP(TYPE, PRED_TYPE, NAME, CONST) \
void __attribute__ ((noipa)) \
test_##TYPE##_##NAME (TYPE *__restrict x, \
TYPE *__restrict y, \
PRED_TYPE *__restrict pred, \
int n) \
{ \
for (int i = 0; i < n; ++i) \
x[i] = pred[i] != 1 ? y[i] + (TYPE) CONST : 0; \
}
#define TEST_TYPE(T, TYPE, PRED_TYPE) \
T (TYPE, PRED_TYPE, half, 0.5) \
T (TYPE, PRED_TYPE, one, 1.0) \
T (TYPE, PRED_TYPE, two, 2.0) \
T (TYPE, PRED_TYPE, minus_half, -0.5) \
T (TYPE, PRED_TYPE, minus_one, -1.0) \
T (TYPE, PRED_TYPE, minus_two, -2.0)
#define TEST_ALL(T) \
TEST_TYPE (T, _Float16, int16_t) \
TEST_TYPE (T, float, int32_t) \
TEST_TYPE (T, double, int64_t)
TEST_ALL (DEF_LOOP)
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.5\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.5\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #2\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #-2\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #-2\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #-2\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 2 } } */
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 2 } } */
/* { dg-final { scan-assembler-times {\tfadd\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 2 } } */
/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.s, p[0-7]/z, z[0-9]+\.s\n} 6 } } */
/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.d, p[0-7]/z, z[0-9]+\.d\n} 6 } } */
/* { dg-final { scan-assembler-not {\tmov\tz} } } */
/* { dg-final { scan-assembler-not {\tsel\t} } } */
/* { dg-do run { target aarch64_sve_hw } } */
/* { dg-options "-O2 -ftree-vectorize" } */
#include "cond_fadd_4.c"
#define N 99
#define TEST_LOOP(TYPE, PRED_TYPE, NAME, CONST) \
{ \
TYPE x[N], y[N]; \
PRED_TYPE pred[N]; \
for (int i = 0; i < N; ++i) \
{ \
y[i] = i * i; \
pred[i] = i % 3; \
} \
test_##TYPE##_##NAME (x, y, pred, N); \
for (int i = 0; i < N; ++i) \
{ \
TYPE expected = i % 3 != 1 ? y[i] + (TYPE) CONST : 0; \
if (x[i] != expected) \
__builtin_abort (); \
asm volatile ("" ::: "memory"); \
} \
}
int
main (void)
{
TEST_ALL (TEST_LOOP)
return 0;
}
/* { dg-do compile } */
/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */
#include <stdint.h>
#ifndef FN
#define FN(X) __builtin_fmax##X
#endif
#define DEF_LOOP(FN, TYPE, PRED_TYPE, NAME, CONST) \
void __attribute__ ((noipa)) \
test_##TYPE##_##NAME (TYPE *__restrict x, \
TYPE *__restrict y, \
PRED_TYPE *__restrict pred, \
int n) \
{ \
for (int i = 0; i < n; ++i) \
x[i] = pred[i] != 1 ? FN (y[i], CONST) : y[i]; \
}
#define TEST_TYPE(T, FN, TYPE, PRED_TYPE) \
T (FN, TYPE, PRED_TYPE, zero, 0) \
T (FN, TYPE, PRED_TYPE, one, 1) \
T (FN, TYPE, PRED_TYPE, two, 2)
#define TEST_ALL(T) \
TEST_TYPE (T, FN (f16), _Float16, int16_t) \
TEST_TYPE (T, FN (f32), float, int32_t) \
TEST_TYPE (T, FN (f64), double, int64_t)
TEST_ALL (DEF_LOOP)
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #2\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */
/* { dg-final { scan-assembler-not {\tmov\tz} } } */
/* { dg-final { scan-assembler-not {\tmovprfx\t} } } */
/* { dg-final { scan-assembler-not {\tsel\t} } } */
/* { dg-do run { target aarch64_sve_hw } } */
/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */
#include "cond_fmaxnm_1.c"
#define N 99
#define TEST_LOOP(FN, TYPE, PRED_TYPE, NAME, CONST) \
{ \
TYPE x[N], y[N]; \
PRED_TYPE pred[N]; \
for (int i = 0; i < N; ++i) \
{ \
y[i] = i * i; \
pred[i] = i % 3; \
} \
test_##TYPE##_##NAME (x, y, pred, N); \
for (int i = 0; i < N; ++i) \
{ \
TYPE expected = i % 3 != 1 ? FN (y[i], CONST) : y[i]; \
if (x[i] != expected) \
__builtin_abort (); \
asm volatile ("" ::: "memory"); \
} \
}
int
main (void)
{
TEST_ALL (TEST_LOOP)
return 0;
}
/* { dg-do compile } */
/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */
#include <stdint.h>
#ifndef FN
#define FN(X) __builtin_fmax##X
#endif
#define DEF_LOOP(FN, TYPE, NAME, CONST) \
void __attribute__ ((noipa)) \
test_##TYPE##_##NAME (TYPE *__restrict x, \
TYPE *__restrict y, \
TYPE *__restrict z, \
int n) \
{ \
for (int i = 0; i < n; ++i) \
x[i] = y[i] < 8 ? FN (z[i], CONST) : y[i]; \
}
#define TEST_TYPE(T, FN, TYPE) \
T (FN, TYPE, zero, 0) \
T (FN, TYPE, one, 1) \
T (FN, TYPE, two, 2)
#define TEST_ALL(T) \
TEST_TYPE (T, FN (f32), float) \
TEST_TYPE (T, FN (f64), double)
TEST_ALL (DEF_LOOP)
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */
/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s\n} 3 } } */
/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d\n} 3 } } */
/* { dg-final { scan-assembler-not {\tmov\tz} } } */
/* { dg-final { scan-assembler-not {\tsel\t} } } */
/* { dg-do run { target aarch64_sve_hw } } */
/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */
#include "cond_fmaxnm_2.c"
#define N 99
#define TEST_LOOP(FN, TYPE, NAME, CONST) \
{ \
TYPE x[N], y[N], z[N]; \
for (int i = 0; i < N; ++i) \
{ \
y[i] = i % 13; \
z[i] = i * i; \
} \
test_##TYPE##_##NAME (x, y, z, N); \
for (int i = 0; i < N; ++i) \
{ \
TYPE expected = y[i] < 8 ? FN (z[i], CONST) : y[i]; \
if (x[i] != expected) \
__builtin_abort (); \
asm volatile ("" ::: "memory"); \
} \
}
int
main (void)
{
TEST_ALL (TEST_LOOP)
return 0;
}
/* { dg-do compile } */
/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */
#include <stdint.h>
#ifndef FN
#define FN(X) __builtin_fmax##X
#endif
#define DEF_LOOP(FN, TYPE, PRED_TYPE, NAME, CONST) \
void __attribute__ ((noipa)) \
test_##TYPE##_##NAME (TYPE *__restrict x, \
TYPE *__restrict y, \
PRED_TYPE *__restrict pred, \
int n) \
{ \
for (int i = 0; i < n; ++i) \
x[i] = pred[i] != 1 ? FN (y[i], CONST) : 4; \
}
#define TEST_TYPE(T, FN, TYPE, PRED_TYPE) \
T (FN, TYPE, PRED_TYPE, zero, 0) \
T (FN, TYPE, PRED_TYPE, one, 1) \
T (FN, TYPE, PRED_TYPE, two, 2)
#define TEST_ALL(T) \
TEST_TYPE (T, FN (f16), _Float16, int16_t) \
TEST_TYPE (T, FN (f32), float, int32_t) \
TEST_TYPE (T, FN (f64), double, int64_t)
TEST_ALL (DEF_LOOP)
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #2\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */
/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.h, p[0-7], z[0-9]+\.h, z[0-9]+\.h\n} 3 } } */
/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.h, p[0-7], z[0-9]+\.h, z[0-9]+\.h\n} 3 } } */
/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.h, p[0-7], z[0-9]+\.h, z[0-9]+\.h\n} 3 } } */
/* { dg-final { scan-assembler-not {\tmovprfx\t} } } */
/* { dg-final { scan-assembler-not {\tmov\tz} } } */
/* { dg-do run { target aarch64_sve_hw } } */
/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */
#include "cond_fmaxnm_3.c"
#define N 99
#define TEST_LOOP(FN, TYPE, PRED_TYPE, NAME, CONST) \
{ \
TYPE x[N], y[N]; \
PRED_TYPE pred[N]; \
for (int i = 0; i < N; ++i) \
{ \
y[i] = i * i; \
pred[i] = i % 3; \
} \
test_##TYPE##_##NAME (x, y, pred, N); \
for (int i = 0; i < N; ++i) \
{ \
TYPE expected = i % 3 != 1 ? FN (y[i], CONST) : 4; \
if (x[i] != expected) \
__builtin_abort (); \
asm volatile ("" ::: "memory"); \
} \
}
int
main (void)
{
TEST_ALL (TEST_LOOP)
return 0;
}
/* { dg-do compile } */
/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */
#include <stdint.h>
#ifndef FN
#define FN(X) __builtin_fmax##X
#endif
#define DEF_LOOP(FN, TYPE, PRED_TYPE, NAME, CONST) \
void __attribute__ ((noipa)) \
test_##TYPE##_##NAME (TYPE *__restrict x, \
TYPE *__restrict y, \
PRED_TYPE *__restrict pred, \
int n) \
{ \
for (int i = 0; i < n; ++i) \
x[i] = pred[i] != 1 ? FN (y[i], CONST) : 0; \
}
#define TEST_TYPE(T, FN, TYPE, PRED_TYPE) \
T (FN, TYPE, PRED_TYPE, zero, 0) \
T (FN, TYPE, PRED_TYPE, one, 1) \
T (FN, TYPE, PRED_TYPE, two, 2)
#define TEST_ALL(T) \
TEST_TYPE (T, FN (f16), _Float16, int16_t) \
TEST_TYPE (T, FN (f32), float, int32_t) \
TEST_TYPE (T, FN (f64), double, int64_t)
TEST_ALL (DEF_LOOP)
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #2\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */
/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.s, p[0-7]/z, z[0-9]+\.s\n} 3 } } */
/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.d, p[0-7]/z, z[0-9]+\.d\n} 3 } } */
/* { dg-final { scan-assembler-not {\tmov\tz} } } */
/* { dg-final { scan-assembler-not {\tsel\t} } } */
/* { dg-do run { target aarch64_sve_hw } } */
/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */
#include "cond_fmaxnm_4.c"
#define N 99
#define TEST_LOOP(FN, TYPE, PRED_TYPE, NAME, CONST) \
{ \
TYPE x[N], y[N]; \
PRED_TYPE pred[N]; \
for (int i = 0; i < N; ++i) \
{ \
y[i] = i * i; \
pred[i] = i % 3; \
} \
test_##TYPE##_##NAME (x, y, pred, N); \
for (int i = 0; i < N; ++i) \
{ \
TYPE expected = i % 3 != 1 ? FN (y[i], CONST) : 0; \
if (x[i] != expected) \
__builtin_abort (); \
asm volatile ("" ::: "memory"); \
} \
}
int
main (void)
{
TEST_ALL (TEST_LOOP)
return 0;
}
/* { dg-do compile } */
/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */
#define FN(X) __builtin_fmin##X
#include "cond_fmaxnm_1.c"
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #2\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */
/* { dg-final { scan-assembler-not {\tmov\tz} } } */
/* { dg-final { scan-assembler-not {\tmovprfx\t} } } */
/* { dg-final { scan-assembler-not {\tsel\t} } } */
/* { dg-do run { target aarch64_sve_hw } } */
/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */
#define FN(X) __builtin_fmin##X
#include "cond_fmaxnm_1_run.c"
/* { dg-do compile } */
/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */
#define FN(X) __builtin_fmin##X
#include "cond_fmaxnm_2.c"
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */
/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s\n} 3 } } */
/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d\n} 3 } } */
/* { dg-final { scan-assembler-not {\tmov\tz} } } */
/* { dg-final { scan-assembler-not {\tsel\t} } } */
/* { dg-do run { target aarch64_sve_hw } } */
/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */
#define FN(X) __builtin_fmin##X
#include "cond_fmaxnm_2_run.c"
/* { dg-do compile } */
/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */
#define FN(X) __builtin_fmin##X
#include "cond_fmaxnm_3.c"
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #2\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */
/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.h, p[0-7], z[0-9]+\.h, z[0-9]+\.h\n} 3 } } */
/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.h, p[0-7], z[0-9]+\.h, z[0-9]+\.h\n} 3 } } */
/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.h, p[0-7], z[0-9]+\.h, z[0-9]+\.h\n} 3 } } */
/* { dg-final { scan-assembler-not {\tmovprfx\t} } } */
/* { dg-final { scan-assembler-not {\tmov\tz} } } */
/* { dg-do run { target aarch64_sve_hw } } */
/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */
#define FN(X) __builtin_fmin##X
#include "cond_fmaxnm_3_run.c"
/* { dg-do compile } */
/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */
#define FN(X) __builtin_fmin##X
#include "cond_fmaxnm_4.c"
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #2\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfminnm\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */
/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.s, p[0-7]/z, z[0-9]+\.s\n} 3 } } */
/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.d, p[0-7]/z, z[0-9]+\.d\n} 3 } } */
/* { dg-final { scan-assembler-not {\tmov\tz} } } */
/* { dg-final { scan-assembler-not {\tsel\t} } } */
/* { dg-do run { target aarch64_sve_hw } } */
/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */
#define FN(X) __builtin_fmin##X
#include "cond_fmaxnm_4_run.c"
/* { dg-do compile } */
/* { dg-options "-O2 -ftree-vectorize" } */
#include <stdint.h>
#define DEF_LOOP(TYPE, PRED_TYPE, NAME, CONST) \
void __attribute__ ((noipa)) \
test_##TYPE##_##NAME (TYPE *__restrict x, \
TYPE *__restrict y, \
PRED_TYPE *__restrict pred, \
int n) \
{ \
for (int i = 0; i < n; ++i) \
x[i] = pred[i] != 1 ? y[i] * (TYPE) CONST : y[i]; \
}
#define TEST_TYPE(T, TYPE, PRED_TYPE) \
T (TYPE, PRED_TYPE, half, 0.5) \
T (TYPE, PRED_TYPE, two, 2.0) \
T (TYPE, PRED_TYPE, four, 4.0)
#define TEST_ALL(T) \
TEST_TYPE (T, _Float16, int16_t) \
TEST_TYPE (T, float, int32_t) \
TEST_TYPE (T, double, int64_t)
TEST_ALL (DEF_LOOP)
/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.5\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #2\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #2\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #2\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #4\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #4\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #4\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */
/* { dg-final { scan-assembler-not {\tmov\tz} } } */
/* { dg-final { scan-assembler-not {\tmovprfx\t} } } */
/* { dg-final { scan-assembler-not {\tsel\t} } } */
/* { dg-do run { target aarch64_sve_hw } } */
/* { dg-options "-O2 -ftree-vectorize" } */
#include "cond_fmul_1.c"
#define N 99
#define TEST_LOOP(TYPE, PRED_TYPE, NAME, CONST) \
{ \
TYPE x[N], y[N]; \
PRED_TYPE pred[N]; \
for (int i = 0; i < N; ++i) \
{ \
y[i] = i * i; \
pred[i] = i % 3; \
} \
test_##TYPE##_##NAME (x, y, pred, N); \
for (int i = 0; i < N; ++i) \
{ \
TYPE expected = i % 3 != 1 ? y[i] * (TYPE) CONST : y[i]; \
if (x[i] != expected) \
__builtin_abort (); \
asm volatile ("" ::: "memory"); \
} \
}
int
main (void)
{
TEST_ALL (TEST_LOOP)
return 0;
}
/* { dg-do compile } */
/* { dg-options "-O2 -ftree-vectorize" } */
#include <stdint.h>
#define DEF_LOOP(TYPE, NAME, CONST) \
void __attribute__ ((noipa)) \
test_##TYPE##_##NAME (TYPE *__restrict x, \
TYPE *__restrict y, \
TYPE *__restrict z, \
int n) \
{ \
for (int i = 0; i < n; ++i) \
x[i] = y[i] < 8 ? z[i] * (TYPE) CONST : y[i]; \
}
#define TEST_TYPE(T, TYPE) \
T (TYPE, half, 0.5) \
T (TYPE, two, 2.0) \
T (TYPE, four, 4.0)
#define TEST_ALL(T) \
TEST_TYPE (T, float) \
TEST_TYPE (T, double)
TEST_ALL (DEF_LOOP)
/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #2\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #2\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #4\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #4\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */
/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s\n} 3 } } */
/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d\n} 3 } } */
/* { dg-final { scan-assembler-not {\tmov\tz} } } */
/* { dg-final { scan-assembler-not {\tsel\t} } } */
/* { dg-do run { target aarch64_sve_hw } } */
/* { dg-options "-O2 -ftree-vectorize" } */
#include "cond_fmul_2.c"
#define N 99
#define TEST_LOOP(TYPE, NAME, CONST) \
{ \
TYPE x[N], y[N], z[N]; \
for (int i = 0; i < N; ++i) \
{ \
y[i] = i % 13; \
z[i] = i * i; \
} \
test_##TYPE##_##NAME (x, y, z, N); \
for (int i = 0; i < N; ++i) \
{ \
TYPE expected = y[i] < 8 ? z[i] * (TYPE) CONST : y[i]; \
if (x[i] != expected) \
__builtin_abort (); \
asm volatile ("" ::: "memory"); \
} \
}
int
main (void)
{
TEST_ALL (TEST_LOOP)
return 0;
}
/* { dg-do compile } */
/* { dg-options "-O2 -ftree-vectorize" } */
#include <stdint.h>
#define DEF_LOOP(TYPE, PRED_TYPE, NAME, CONST) \
void __attribute__ ((noipa)) \
test_##TYPE##_##NAME (TYPE *__restrict x, \
TYPE *__restrict y, \
PRED_TYPE *__restrict pred, \
int n) \
{ \
for (int i = 0; i < n; ++i) \
x[i] = pred[i] != 1 ? y[i] * (TYPE) CONST : 8; \
}
#define TEST_TYPE(T, TYPE, PRED_TYPE) \
T (TYPE, PRED_TYPE, half, 0.5) \
T (TYPE, PRED_TYPE, two, 2.0) \
T (TYPE, PRED_TYPE, four, 4.0)
#define TEST_ALL(T) \
TEST_TYPE (T, _Float16, int16_t) \
TEST_TYPE (T, float, int32_t) \
TEST_TYPE (T, double, int64_t)
TEST_ALL (DEF_LOOP)
/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.5\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #2\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #2\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #2\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #4\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #4\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #4\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */
/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.h, p[0-7], z[0-9]+\.h, z[0-9]+\.h\n} 3 } } */
/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.h, p[0-7], z[0-9]+\.h, z[0-9]+\.h\n} 3 } } */
/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.h, p[0-7], z[0-9]+\.h, z[0-9]+\.h\n} 3 } } */
/* { dg-final { scan-assembler-not {\tmovprfx\t} } } */
/* { dg-final { scan-assembler-not {\tmov\tz} } } */
/* { dg-do run { target aarch64_sve_hw } } */
/* { dg-options "-O2 -ftree-vectorize" } */
#include "cond_fmul_3.c"
#define N 99
#define TEST_LOOP(TYPE, PRED_TYPE, NAME, CONST) \
{ \
TYPE x[N], y[N]; \
PRED_TYPE pred[N]; \
for (int i = 0; i < N; ++i) \
{ \
y[i] = i * i; \
pred[i] = i % 3; \
} \
test_##TYPE##_##NAME (x, y, pred, N); \
for (int i = 0; i < N; ++i) \
{ \
TYPE expected = i % 3 != 1 ? y[i] * (TYPE) CONST : 8; \
if (x[i] != expected) \
__builtin_abort (); \
asm volatile ("" ::: "memory"); \
} \
}
int
main (void)
{
TEST_ALL (TEST_LOOP)
return 0;
}
/* { dg-do compile } */
/* { dg-options "-O2 -ftree-vectorize" } */
#include <stdint.h>
#define DEF_LOOP(TYPE, PRED_TYPE, NAME, CONST) \
void __attribute__ ((noipa)) \
test_##TYPE##_##NAME (TYPE *__restrict x, \
TYPE *__restrict y, \
PRED_TYPE *__restrict pred, \
int n) \
{ \
for (int i = 0; i < n; ++i) \
x[i] = pred[i] != 1 ? y[i] * (TYPE) CONST : 0; \
}
#define TEST_TYPE(T, TYPE, PRED_TYPE) \
T (TYPE, PRED_TYPE, half, 0.5) \
T (TYPE, PRED_TYPE, two, 2.0) \
T (TYPE, PRED_TYPE, four, 4.0)
#define TEST_ALL(T) \
TEST_TYPE (T, _Float16, int16_t) \
TEST_TYPE (T, float, int32_t) \
TEST_TYPE (T, double, int64_t)
TEST_ALL (DEF_LOOP)
/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.5\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #2\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #2\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #2\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #4\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #4\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #4\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */
/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.s, p[0-7]/z, z[0-9]+\.s\n} 3 } } */
/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.d, p[0-7]/z, z[0-9]+\.d\n} 3 } } */
/* { dg-final { scan-assembler-not {\tmov\tz} } } */
/* { dg-final { scan-assembler-not {\tsel\t} } } */
/* { dg-do run { target aarch64_sve_hw } } */
/* { dg-options "-O2 -ftree-vectorize" } */
#include "cond_fmul_4.c"
#define N 99
#define TEST_LOOP(TYPE, PRED_TYPE, NAME, CONST) \
{ \
TYPE x[N], y[N]; \
PRED_TYPE pred[N]; \
for (int i = 0; i < N; ++i) \
{ \
y[i] = i * i; \
pred[i] = i % 3; \
} \
test_##TYPE##_##NAME (x, y, pred, N); \
for (int i = 0; i < N; ++i) \
{ \
TYPE expected = i % 3 != 1 ? y[i] * (TYPE) CONST : 0; \
if (x[i] != expected) \
__builtin_abort (); \
asm volatile ("" ::: "memory"); \
} \
}
int
main (void)
{
TEST_ALL (TEST_LOOP)
return 0;
}
/* { dg-do compile } */
/* { dg-options "-O2 -ftree-vectorize" } */
#include <stdint.h>
#define DEF_LOOP(TYPE, PRED_TYPE, NAME, CONST) \
void __attribute__ ((noipa)) \
test_##TYPE##_##NAME (TYPE *__restrict x, \
TYPE *__restrict y, \
PRED_TYPE *__restrict pred, \
int n) \
{ \
for (int i = 0; i < n; ++i) \
x[i] = pred[i] != 1 ? (TYPE) CONST - y[i] : y[i]; \
}
#define TEST_TYPE(T, TYPE, PRED_TYPE) \
T (TYPE, PRED_TYPE, half, 0.5) \
T (TYPE, PRED_TYPE, one, 1.0) \
T (TYPE, PRED_TYPE, two, 2.0)
#define TEST_ALL(T) \
TEST_TYPE (T, _Float16, int16_t) \
TEST_TYPE (T, float, int32_t) \
TEST_TYPE (T, double, int64_t)
TEST_ALL (DEF_LOOP)
/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.5\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #2\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */
/* { dg-final { scan-assembler-not {\tmov\tz} } } */
/* { dg-final { scan-assembler-not {\tmovprfx\t} } } */
/* { dg-final { scan-assembler-not {\tsel\t} } } */
/* { dg-do run { target aarch64_sve_hw } } */
/* { dg-options "-O2 -ftree-vectorize" } */
#include "cond_fsubr_1.c"
#define N 99
#define TEST_LOOP(TYPE, PRED_TYPE, NAME, CONST) \
{ \
TYPE x[N], y[N]; \
PRED_TYPE pred[N]; \
for (int i = 0; i < N; ++i) \
{ \
y[i] = i * i; \
pred[i] = i % 3; \
} \
test_##TYPE##_##NAME (x, y, pred, N); \
for (int i = 0; i < N; ++i) \
{ \
TYPE expected = i % 3 != 1 ? (TYPE) CONST - y[i] : y[i]; \
if (x[i] != expected) \
__builtin_abort (); \
asm volatile ("" ::: "memory"); \
} \
}
int
main (void)
{
TEST_ALL (TEST_LOOP)
return 0;
}
/* { dg-do compile } */
/* { dg-options "-O2 -ftree-vectorize" } */
#include <stdint.h>
#define DEF_LOOP(TYPE, NAME, CONST) \
void __attribute__ ((noipa)) \
test_##TYPE##_##NAME (TYPE *__restrict x, \
TYPE *__restrict y, \
TYPE *__restrict z, \
int n) \
{ \
for (int i = 0; i < n; ++i) \
x[i] = y[i] < 8 ? (TYPE) CONST - z[i] : y[i]; \
}
#define TEST_TYPE(T, TYPE) \
T (TYPE, half, 0.5) \
T (TYPE, one, 1.0) \
T (TYPE, two, 2.0)
#define TEST_ALL(T) \
TEST_TYPE (T, float) \
TEST_TYPE (T, double)
TEST_ALL (DEF_LOOP)
/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */
/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s\n} 3 } } */
/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d\n} 3 } } */
/* { dg-final { scan-assembler-not {\tmov\tz} } } */
/* { dg-final { scan-assembler-not {\tsel\t} } } */
/* { dg-do run { target aarch64_sve_hw } } */
/* { dg-options "-O2 -ftree-vectorize" } */
#include "cond_fsubr_2.c"
#define N 99
#define TEST_LOOP(TYPE, NAME, CONST) \
{ \
TYPE x[N], y[N], z[N]; \
for (int i = 0; i < N; ++i) \
{ \
y[i] = i % 13; \
z[i] = i * i; \
} \
test_##TYPE##_##NAME (x, y, z, N); \
for (int i = 0; i < N; ++i) \
{ \
TYPE expected = y[i] < 8 ? (TYPE) CONST - z[i] : y[i]; \
if (x[i] != expected) \
__builtin_abort (); \
asm volatile ("" ::: "memory"); \
} \
}
int
main (void)
{
TEST_ALL (TEST_LOOP)
return 0;
}
/* { dg-do compile } */
/* { dg-options "-O2 -ftree-vectorize" } */
#include <stdint.h>
#define DEF_LOOP(TYPE, PRED_TYPE, NAME, CONST) \
void __attribute__ ((noipa)) \
test_##TYPE##_##NAME (TYPE *__restrict x, \
TYPE *__restrict y, \
PRED_TYPE *__restrict pred, \
int n) \
{ \
for (int i = 0; i < n; ++i) \
x[i] = pred[i] != 1 ? (TYPE) CONST - y[i] : 4; \
}
#define TEST_TYPE(T, TYPE, PRED_TYPE) \
T (TYPE, PRED_TYPE, half, 0.5) \
T (TYPE, PRED_TYPE, one, 1.0) \
T (TYPE, PRED_TYPE, two, 2.0)
#define TEST_ALL(T) \
TEST_TYPE (T, _Float16, int16_t) \
TEST_TYPE (T, float, int32_t) \
TEST_TYPE (T, double, int64_t)
TEST_ALL (DEF_LOOP)
/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.5\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #2\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */
/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.h, p[0-7], z[0-9]+\.h, z[0-9]+\.h\n} 3 } } */
/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.h, p[0-7], z[0-9]+\.h, z[0-9]+\.h\n} 3 } } */
/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.h, p[0-7], z[0-9]+\.h, z[0-9]+\.h\n} 3 } } */
/* { dg-final { scan-assembler-not {\tmovprfx\t} } } */
/* { dg-final { scan-assembler-not {\tmov\tz} } } */
/* { dg-do run { target aarch64_sve_hw } } */
/* { dg-options "-O2 -ftree-vectorize" } */
#include "cond_fsubr_3.c"
#define N 99
#define TEST_LOOP(TYPE, PRED_TYPE, NAME, CONST) \
{ \
TYPE x[N], y[N]; \
PRED_TYPE pred[N]; \
for (int i = 0; i < N; ++i) \
{ \
y[i] = i * i; \
pred[i] = i % 3; \
} \
test_##TYPE##_##NAME (x, y, pred, N); \
for (int i = 0; i < N; ++i) \
{ \
TYPE expected = i % 3 != 1 ? (TYPE) CONST - y[i] : 4; \
if (x[i] != expected) \
__builtin_abort (); \
asm volatile ("" ::: "memory"); \
} \
}
int
main (void)
{
TEST_ALL (TEST_LOOP)
return 0;
}
/* { dg-do compile } */
/* { dg-options "-O2 -ftree-vectorize" } */
#include <stdint.h>
#define DEF_LOOP(TYPE, PRED_TYPE, NAME, CONST) \
void __attribute__ ((noipa)) \
test_##TYPE##_##NAME (TYPE *__restrict x, \
TYPE *__restrict y, \
PRED_TYPE *__restrict pred, \
int n) \
{ \
for (int i = 0; i < n; ++i) \
x[i] = pred[i] != 1 ? (TYPE) CONST - y[i] : 0; \
}
#define TEST_TYPE(T, TYPE, PRED_TYPE) \
T (TYPE, PRED_TYPE, half, 0.5) \
T (TYPE, PRED_TYPE, one, 1.0) \
T (TYPE, PRED_TYPE, two, 2.0)
#define TEST_ALL(T) \
TEST_TYPE (T, _Float16, int16_t) \
TEST_TYPE (T, float, int32_t) \
TEST_TYPE (T, double, int64_t)
TEST_ALL (DEF_LOOP)
/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #0\.5\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #0\.5\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #0\.5\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, #1\.0\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.h, #2\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.s, #2\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfmov\tz[0-9]+\.d, #2\.0} 1 } } */
/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.h, p[0-7]/m, z[0-9]+\.h, z[0-9]+\.h\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.s, p[0-7]/m, z[0-9]+\.s, z[0-9]+\.s\n} 1 } } */
/* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.d, p[0-7]/m, z[0-9]+\.d, z[0-9]+\.d\n} 1 } } */
/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.s, p[0-7]/z, z[0-9]+\.s\n} 3 } } */
/* { dg-final { scan-assembler-times {\tmovprfx\tz[0-9]+\.d, p[0-7]/z, z[0-9]+\.d\n} 3 } } */
/* { dg-final { scan-assembler-not {\tmov\tz} } } */
/* { dg-final { scan-assembler-not {\tsel\t} } } */
/* { dg-do run { target aarch64_sve_hw } } */
/* { dg-options "-O2 -ftree-vectorize" } */
#include "cond_fsubr_4.c"
#define N 99
#define TEST_LOOP(TYPE, PRED_TYPE, NAME, CONST) \
{ \
TYPE x[N], y[N]; \
PRED_TYPE pred[N]; \
for (int i = 0; i < N; ++i) \
{ \
y[i] = i * i; \
pred[i] = i % 3; \
} \
test_##TYPE##_##NAME (x, y, pred, N); \
for (int i = 0; i < N; ++i) \
{ \
TYPE expected = i % 3 != 1 ? (TYPE) CONST - y[i] : 0; \
if (x[i] != expected) \
__builtin_abort (); \
asm volatile ("" ::: "memory"); \
} \
}
int
main (void)
{
TEST_ALL (TEST_LOOP)
return 0;
}
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment