Commit 55a9b91b, authored and committed by Matthew Wahab

[PATCH 9/17][ARM] Add NEON FP16 arithmetic instructions.

gcc/
2016-09-23  Matthew Wahab  <matthew.wahab@arm.com>

	* config/arm/iterators.md (VCVTHI): New.
	(NEON_VCMP): Add UNSPEC_VCLT and UNSPEC_VCLE.  Fix a long line.
	(NEON_VAGLTE): New.
	(VFM_LANE_AS): New.
	(VH_CVTTO): New.
	(V_reg): Add HF, V4HF and V8HF.  Fix white-space.
	(V_HALF): Add V4HF.  Fix white-space.
	(V_if_elem): Add HF, V4HF and V8HF.  Fix white-space.
	(V_s_elem): Likewise.
	(V_sz_elem): Fix white-space.
	(V_elem_ch): Likewise.
	(VH_elem_ch): New.
	(scalar_mul_constraint): Add V8HF and V4HF.
	(Is_float_mode): Fix white-space.
	(Is_d_reg): Add V4HF and V8HF.  Fix white-space.
	(q): Add HF.  Fix white-space.
	(float_sup): New.
	(float_SUP): New.
	(cmp_op_unsp): Add UNSPEC_VCALE and UNSPEC_VCALT.
	(neon_vfm_lane_as): New.
	* config/arm/neon.md (add<mode>3_fp16): New.
	(sub<mode>3_fp16): New.
	(mul<mode>3add<mode>_neon): New.
	(fma<VH:mode>4_intrinsic): New.
	(fmsub<VCVTF:mode>4_intrinsic): Fix white-space.
	(fmsub<VH:mode>4_intrinsic): New.
	(<absneg_str><mode>2): New.
	(neon_v<absneg_str><mode>): New.
	(neon_v<fp16_rnd_str><mode>): New.
	(neon_vrsqrte<mode>): New.
	(neon_vpaddv4hf): New.
	(neon_vadd<mode>): New.
	(neon_vsub<mode>): New.
	(neon_vmulf<mode>): New.
	(neon_vfma<VH:mode>): New.
	(neon_vfms<VH:mode>): New.
	(neon_vc<cmp_op><mode>): New.
	(neon_vc<cmp_op><mode>_fp16insn): New.
	(neon_vc<cmp_op_unsp><mode>_fp16insn_unspec): New.
	(neon_vca<cmp_op><mode>): New.
	(neon_vca<cmp_op><mode>_fp16insn): New.
	(neon_vca<cmp_op_unsp><mode>_fp16insn_unspec): New.
	(neon_vc<cmp_op>z<mode>): New.
	(neon_vabd<mode>): New.
	(neon_v<maxmin>f<mode>): New.
	(neon_vp<maxmin>fv4hf): New.
	(neon_<fmaxmin_op><mode>): New.
	(neon_vrecps<mode>): New.
	(neon_vrsqrts<mode>): New.
	(neon_vrecpe<mode>): New (VH variant).
	(neon_vdup_lane<mode>_internal): New.
	(neon_vdup_lane<mode>): New.
	(neon_vcvt<sup><mode>): New (VCVTHI variant).
	(neon_vcvt<sup><mode>): New (VH variant).
	(neon_vcvt<sup>_n<mode>): New (VH variant).
	(neon_vcvt<sup>_n<mode>): New (VCVTHI variant).
	(neon_vcvt<vcvth_op><sup><mode>): New.
	(neon_vmul_lane<mode>): New.
	(neon_vmul_n<mode>): New.
	* config/arm/unspecs.md (UNSPEC_VCALE): New.
	(UNSPEC_VCALT): New.
	(UNSPEC_VFMA_LANE): New.
	(UNSPEC_VFMS_LANE): New.

testsuite/
2016-09-23  Matthew Wahab  <matthew.wahab@arm.com>

	* gcc.target/arm/armv8_2-fp16-arith-1.c: Use arm_v8_2a_fp16_neon
	options.  Add tests for float16x4_t and float16x8_t.

From-SVN: r240415
parent 64c744b9
2016-09-23 Matthew Wahab <matthew.wahab@arm.com>
* config/arm/iterators.md (VCVTHI): New.
(NEON_VCMP): Add UNSPEC_VCLT and UNSPEC_VCLE. Fix a long line.
(NEON_VAGLTE): New.
(VFM_LANE_AS): New.
(VH_CVTTO): New.
(V_reg): Add HF, V4HF and V8HF. Fix white-space.
(V_HALF): Add V4HF. Fix white-space.
(V_if_elem): Add HF, V4HF and V8HF. Fix white-space.
(V_s_elem): Likewise.
(V_sz_elem): Fix white-space.
(V_elem_ch): Likewise.
(VH_elem_ch): New.
(scalar_mul_constraint): Add V8HF and V4HF.
(Is_float_mode): Fix white-space.
(Is_d_reg): Add V4HF and V8HF. Fix white-space.
(q): Add HF. Fix white-space.
(float_sup): New.
(float_SUP): New.
(cmp_op_unsp): Add UNSPEC_VCALE and UNSPEC_VCALT.
(neon_vfm_lane_as): New.
* config/arm/neon.md (add<mode>3_fp16): New.
(sub<mode>3_fp16): New.
(mul<mode>3add<mode>_neon): New.
(fma<VH:mode>4_intrinsic): New.
(fmsub<VCVTF:mode>4_intrinsic): Fix white-space.
(fmsub<VH:mode>4_intrinsic): New.
(<absneg_str><mode>2): New.
(neon_v<absneg_str><mode>): New.
(neon_v<fp16_rnd_str><mode>): New.
(neon_vrsqrte<mode>): New.
(neon_vpaddv4hf): New.
(neon_vadd<mode>): New.
(neon_vsub<mode>): New.
(neon_vmulf<mode>): New.
(neon_vfma<VH:mode>): New.
(neon_vfms<VH:mode>): New.
(neon_vc<cmp_op><mode>): New.
(neon_vc<cmp_op><mode>_fp16insn): New.
(neon_vc<cmp_op_unsp><mode>_fp16insn_unspec): New.
(neon_vca<cmp_op><mode>): New.
(neon_vca<cmp_op><mode>_fp16insn): New.
(neon_vca<cmp_op_unsp><mode>_fp16insn_unspec): New.
(neon_vc<cmp_op>z<mode>): New.
(neon_vabd<mode>): New.
(neon_v<maxmin>f<mode>): New.
(neon_vp<maxmin>fv4hf): New.
(neon_<fmaxmin_op><mode>): New.
(neon_vrecps<mode>): New.
(neon_vrsqrts<mode>): New.
(neon_vrecpe<mode>): New (VH variant).
(neon_vdup_lane<mode>_internal): New.
(neon_vdup_lane<mode>): New.
(neon_vcvt<sup><mode>): New (VCVTHI variant).
(neon_vcvt<sup><mode>): New (VH variant).
(neon_vcvt<sup>_n<mode>): New (VH variant).
(neon_vcvt<sup>_n<mode>): New (VCVTHI variant).
(neon_vcvt<vcvth_op><sup><mode>): New.
(neon_vmul_lane<mode>): New.
(neon_vmul_n<mode>): New.
* config/arm/unspecs.md (UNSPEC_VCALE): New.
(UNSPEC_VCALT): New.
(UNSPEC_VFMA_LANE): New.
(UNSPEC_VFMS_LANE): New.
2016-09-23 Dominik Vogt <vogt@linux.vnet.ibm.com>
* config/s390/s390.md ("*extzv<mode>_zEC12", "*extzv<mode>_z10")
......
......@@ -145,6 +145,9 @@
;; Vector modes for int->float conversions.
(define_mode_iterator VCVTI [V2SI V4SI])
;; Vector modes for int->half conversions.
(define_mode_iterator VCVTHI [V4HI V8HI])
;; Vector modes for doubleword multiply-accumulate, etc. insns.
(define_mode_iterator VMD [V4HI V2SI V2SF])
......@@ -267,10 +270,14 @@
(define_int_iterator VRINT [UNSPEC_VRINTZ UNSPEC_VRINTP UNSPEC_VRINTM
UNSPEC_VRINTR UNSPEC_VRINTX UNSPEC_VRINTA])
(define_int_iterator NEON_VCMP [UNSPEC_VCEQ UNSPEC_VCGT UNSPEC_VCGE UNSPEC_VCLT UNSPEC_VCLE])
(define_int_iterator NEON_VCMP [UNSPEC_VCEQ UNSPEC_VCGT UNSPEC_VCGE
UNSPEC_VCLT UNSPEC_VCLE])
(define_int_iterator NEON_VACMP [UNSPEC_VCAGE UNSPEC_VCAGT])
(define_int_iterator NEON_VAGLTE [UNSPEC_VCAGE UNSPEC_VCAGT
UNSPEC_VCALE UNSPEC_VCALT])
(define_int_iterator VCVT [UNSPEC_VRINTP UNSPEC_VRINTM UNSPEC_VRINTA])
(define_int_iterator NEON_VRINT [UNSPEC_NVRINTP UNSPEC_NVRINTZ UNSPEC_NVRINTM
......@@ -398,6 +405,8 @@
(define_int_iterator VQRDMLH_AS [UNSPEC_VQRDMLAH UNSPEC_VQRDMLSH])
(define_int_iterator VFM_LANE_AS [UNSPEC_VFMA_LANE UNSPEC_VFMS_LANE])
;;----------------------------------------------------------------------------
;; Mode attributes
;;----------------------------------------------------------------------------
......@@ -416,6 +425,10 @@
(define_mode_attr V_cvtto [(V2SI "v2sf") (V2SF "v2si")
(V4SI "v4sf") (V4SF "v4si")])
;; (Opposite) mode to convert to/from for vector-half mode conversions.
(define_mode_attr VH_CVTTO [(V4HI "V4HF") (V4HF "V4HI")
(V8HI "V8HF") (V8HF "V8HI")])
;; Define element mode for each vector mode.
(define_mode_attr V_elem [(V8QI "QI") (V16QI "QI")
(V4HI "HI") (V8HI "HI")
......@@ -459,12 +472,13 @@
;; Register width from element mode
(define_mode_attr V_reg [(V8QI "P") (V16QI "q")
(V4HI "P") (V8HI "q")
(V4HF "P") (V8HF "q")
(V2SI "P") (V4SI "q")
(V2SF "P") (V4SF "q")
(DI "P") (V2DI "q")
(SF "") (DF "P")])
(V4HI "P") (V8HI "q")
(V4HF "P") (V8HF "q")
(V2SI "P") (V4SI "q")
(V2SF "P") (V4SF "q")
(DI "P") (V2DI "q")
(SF "") (DF "P")
(HF "")])
;; Wider modes with the same number of elements.
(define_mode_attr V_widen [(V8QI "V8HI") (V4HI "V4SI") (V2SI "V2DI")])
......@@ -480,7 +494,7 @@
(define_mode_attr V_HALF [(V16QI "V8QI") (V8HI "V4HI")
(V8HF "V4HF") (V4SI "V2SI")
(V4SF "V2SF") (V2DF "DF")
(V2DI "DI")])
(V2DI "DI") (V4HF "HF")])
;; Same, but lower-case.
(define_mode_attr V_half [(V16QI "v8qi") (V8HI "v4hi")
......@@ -529,18 +543,22 @@
;; Get element type from double-width mode, for operations where we
;; don't care about signedness.
(define_mode_attr V_if_elem [(V8QI "i8") (V16QI "i8")
(V4HI "i16") (V8HI "i16")
(V2SI "i32") (V4SI "i32")
(DI "i64") (V2DI "i64")
(V2SF "f32") (V4SF "f32")
(SF "f32") (DF "f64")])
(V4HI "i16") (V8HI "i16")
(V2SI "i32") (V4SI "i32")
(DI "i64") (V2DI "i64")
(V2SF "f32") (V4SF "f32")
(SF "f32") (DF "f64")
(HF "f16") (V4HF "f16")
(V8HF "f16")])
;; Same, but for operations which work on signed values.
(define_mode_attr V_s_elem [(V8QI "s8") (V16QI "s8")
(V4HI "s16") (V8HI "s16")
(V2SI "s32") (V4SI "s32")
(DI "s64") (V2DI "s64")
(V2SF "f32") (V4SF "f32")])
(V4HI "s16") (V8HI "s16")
(V2SI "s32") (V4SI "s32")
(DI "s64") (V2DI "s64")
(V2SF "f32") (V4SF "f32")
(HF "f16") (V4HF "f16")
(V8HF "f16")])
;; Same, but for operations which work on unsigned values.
(define_mode_attr V_u_elem [(V8QI "u8") (V16QI "u8")
......@@ -557,17 +575,22 @@
(V2SF "32") (V4SF "32")])
(define_mode_attr V_sz_elem [(V8QI "8") (V16QI "8")
(V4HI "16") (V8HI "16")
(V2SI "32") (V4SI "32")
(DI "64") (V2DI "64")
(V4HI "16") (V8HI "16")
(V2SI "32") (V4SI "32")
(DI "64") (V2DI "64")
(V4HF "16") (V8HF "16")
(V2SF "32") (V4SF "32")])
(V2SF "32") (V4SF "32")])
(define_mode_attr V_elem_ch [(V8QI "b") (V16QI "b")
(V4HI "h") (V8HI "h")
(V2SI "s") (V4SI "s")
(DI "d") (V2DI "d")
(V2SF "s") (V4SF "s")])
(V4HI "h") (V8HI "h")
(V2SI "s") (V4SI "s")
(DI "d") (V2DI "d")
(V2SF "s") (V4SF "s")
(V2SF "s") (V4SF "s")])
(define_mode_attr VH_elem_ch [(V4HI "s") (V8HI "s")
(V4HF "s") (V8HF "s")
(HF "s")])
;; Element sizes for duplicating ARM registers to all elements of a vector.
(define_mode_attr VD_dup [(V8QI "8") (V4HI "16") (V2SI "32") (V2SF "32")])
......@@ -603,16 +626,17 @@
;; This mode attribute is used to obtain the correct register constraints.
(define_mode_attr scalar_mul_constraint [(V4HI "x") (V2SI "t") (V2SF "t")
(V8HI "x") (V4SI "t") (V4SF "t")])
(V8HI "x") (V4SI "t") (V4SF "t")
(V8HF "x") (V4HF "x")])
;; Predicates used for setting type for neon instructions
(define_mode_attr Is_float_mode [(V8QI "false") (V16QI "false")
(V4HI "false") (V8HI "false")
(V2SI "false") (V4SI "false")
(V4HF "true") (V8HF "true")
(V2SF "true") (V4SF "true")
(DI "false") (V2DI "false")])
(V4HI "false") (V8HI "false")
(V2SI "false") (V4SI "false")
(V4HF "true") (V8HF "true")
(V2SF "true") (V4SF "true")
(DI "false") (V2DI "false")])
(define_mode_attr Scalar_mul_8_16 [(V8QI "true") (V16QI "true")
(V4HI "true") (V8HI "true")
......@@ -621,10 +645,10 @@
(DI "false") (V2DI "false")])
(define_mode_attr Is_d_reg [(V8QI "true") (V16QI "false")
(V4HI "true") (V8HI "false")
(V2SI "true") (V4SI "false")
(V2SF "true") (V4SF "false")
(DI "true") (V2DI "false")
(V4HI "true") (V8HI "false")
(V2SI "true") (V4SI "false")
(V2SF "true") (V4SF "false")
(DI "true") (V2DI "false")
(V4HF "true") (V8HF "false")])
(define_mode_attr V_mode_nunits [(V8QI "8") (V16QI "16")
......@@ -670,12 +694,14 @@
;; Mode attribute used to build the "type" attribute.
(define_mode_attr q [(V8QI "") (V16QI "_q")
(V4HI "") (V8HI "_q")
(V2SI "") (V4SI "_q")
(V4HI "") (V8HI "_q")
(V2SI "") (V4SI "_q")
(V4HF "") (V8HF "_q")
(V2SF "") (V4SF "_q")
(DI "") (V2DI "_q")
(DF "") (V2DF "_q")])
(V2SF "") (V4SF "_q")
(V4HF "") (V8HF "_q")
(DI "") (V2DI "_q")
(DF "") (V2DF "_q")
(HF "")])
(define_mode_attr pf [(V8QI "p") (V16QI "p") (V2SF "f") (V4SF "f")])
......@@ -718,6 +744,10 @@
;; Conversions.
(define_code_attr FCVTI32typename [(unsigned_float "u32") (float "s32")])
(define_code_attr float_sup [(unsigned_float "u") (float "s")])
(define_code_attr float_SUP [(unsigned_float "U") (float "S")])
;;----------------------------------------------------------------------------
;; Int attributes
;;----------------------------------------------------------------------------
......@@ -790,9 +820,10 @@
(UNSPEC_VRNDP "vrintp") (UNSPEC_VRNDX "vrintx")])
(define_int_attr cmp_op_unsp [(UNSPEC_VCEQ "eq") (UNSPEC_VCGT "gt")
(UNSPEC_VCGE "ge") (UNSPEC_VCLE "le")
(UNSPEC_VCLT "lt") (UNSPEC_VCAGE "ge")
(UNSPEC_VCAGT "gt")])
(UNSPEC_VCGE "ge") (UNSPEC_VCLE "le")
(UNSPEC_VCLT "lt") (UNSPEC_VCAGE "ge")
(UNSPEC_VCAGT "gt") (UNSPEC_VCALE "le")
(UNSPEC_VCALT "lt")])
(define_int_attr r [
(UNSPEC_VRHADD_S "r") (UNSPEC_VRHADD_U "r")
......@@ -908,3 +939,7 @@
;; Attributes for VQRDMLAH/VQRDMLSH
(define_int_attr neon_rdma_as [(UNSPEC_VQRDMLAH "a") (UNSPEC_VQRDMLSH "s")])
;; Attributes for VFMA_LANE/ VFMS_LANE
(define_int_attr neon_vfm_lane_as
[(UNSPEC_VFMA_LANE "a") (UNSPEC_VFMS_LANE "s")])
......@@ -191,6 +191,8 @@
UNSPEC_VBSL
UNSPEC_VCAGE
UNSPEC_VCAGT
UNSPEC_VCALE
UNSPEC_VCALT
UNSPEC_VCEQ
UNSPEC_VCGE
UNSPEC_VCGEU
......@@ -258,6 +260,8 @@
UNSPEC_VMLSL_S_LANE
UNSPEC_VMLSL_U_LANE
UNSPEC_VMLSL_LANE
UNSPEC_VFMA_LANE
UNSPEC_VFMS_LANE
UNSPEC_VMOVL_S
UNSPEC_VMOVL_U
UNSPEC_VMOVN
......@@ -387,4 +391,3 @@
UNSPEC_VRNDP
UNSPEC_VRNDX
])
2016-09-23 Matthew Wahab <matthew.wahab@arm.com>
* gcc.target/arm/armv8_2-fp16-arith-1.c: Use arm_v8_2a_fp16_neon
options. Add tests for float16x4_t and float16x8_t.
2016-09-23 Dominik Vogt <vogt@linux.vnet.ibm.com>
* gcc.target/s390/risbg-ll-1.c: Ported risbg tests from llvm.
......
/* { dg-do compile } */
/* { dg-require-effective-target arm_v8_2a_fp16_scalar_ok } */
/* { dg-require-effective-target arm_v8_2a_fp16_neon_ok } */
/* { dg-options "-O2 -ffast-math" } */
/* { dg-add-options arm_v8_2a_fp16_scalar } */
/* { dg-add-options arm_v8_2a_fp16_neon } */
/* Test instructions generated for half-precision arithmetic. */
......@@ -9,6 +9,9 @@ typedef __fp16 float16_t;
typedef __simd64_float16_t float16x4_t;
typedef __simd128_float16_t float16x8_t;
typedef short int16x4_t __attribute__ ((vector_size (8)));
typedef short int int16x8_t __attribute__ ((vector_size (16)));
float16_t
fp16_abs (float16_t a)
{
......@@ -50,15 +53,49 @@ TEST_CMP (greaterthan, >, int, float16_t)
TEST_CMP (lessthanequal, <=, int, float16_t)
TEST_CMP (greaterthanqual, >=, int, float16_t)
/* Vectors of size 4. */
TEST_UNOP (neg, -, float16x4_t)
TEST_BINOP (add, +, float16x4_t)
TEST_BINOP (sub, -, float16x4_t)
TEST_BINOP (mult, *, float16x4_t)
TEST_BINOP (div, /, float16x4_t)
TEST_CMP (equal, ==, int16x4_t, float16x4_t)
TEST_CMP (unequal, !=, int16x4_t, float16x4_t)
TEST_CMP (lessthan, <, int16x4_t, float16x4_t)
TEST_CMP (greaterthan, >, int16x4_t, float16x4_t)
TEST_CMP (lessthanequal, <=, int16x4_t, float16x4_t)
TEST_CMP (greaterthanqual, >=, int16x4_t, float16x4_t)
/* Vectors of size 8. */
TEST_UNOP (neg, -, float16x8_t)
TEST_BINOP (add, +, float16x8_t)
TEST_BINOP (sub, -, float16x8_t)
TEST_BINOP (mult, *, float16x8_t)
TEST_BINOP (div, /, float16x8_t)
TEST_CMP (equal, ==, int16x8_t, float16x8_t)
TEST_CMP (unequal, !=, int16x8_t, float16x8_t)
TEST_CMP (lessthan, <, int16x8_t, float16x8_t)
TEST_CMP (greaterthan, >, int16x8_t, float16x8_t)
TEST_CMP (lessthanequal, <=, int16x8_t, float16x8_t)
TEST_CMP (greaterthanqual, >=, int16x8_t, float16x8_t)
/* { dg-final { scan-assembler-times {vneg\.f16\ts[0-9]+, s[0-9]+} 1 } } */
/* { dg-final { scan-assembler-times {vneg\.f16\td[0-9]+, d[0-9]+} 1 } } */
/* { dg-final { scan-assembler-times {vneg\.f16\tq[0-9]+, q[0-9]+} 1 } } */
/* { dg-final { scan-assembler-times {vabs\.f16\ts[0-9]+, s[0-9]+} 2 } } */
/* { dg-final { scan-assembler-times {vadd\.f16\ts[0-9]+, s[0-9]+, s[0-9]+} 1 } } */
/* { dg-final { scan-assembler-times {vsub\.f16\ts[0-9]+, s[0-9]+, s[0-9]+} 1 } } */
/* { dg-final { scan-assembler-times {vmul\.f16\ts[0-9]+, s[0-9]+, s[0-9]+} 1 } } */
/* { dg-final { scan-assembler-times {vdiv\.f16\ts[0-9]+, s[0-9]+, s[0-9]+} 1 } } */
/* { dg-final { scan-assembler-times {vcmp\.f32\ts[0-9]+, s[0-9]+} 2 } } */
/* { dg-final { scan-assembler-times {vcmpe\.f32\ts[0-9]+, s[0-9]+} 4 } } */
/* { dg-final { scan-assembler-times {vadd\.f16\ts[0-9]+, s[0-9]+, s[0-9]+} 13 } } */
/* { dg-final { scan-assembler-times {vsub\.f16\ts[0-9]+, s[0-9]+, s[0-9]+} 13 } } */
/* { dg-final { scan-assembler-times {vmul\.f16\ts[0-9]+, s[0-9]+, s[0-9]+} 13 } } */
/* { dg-final { scan-assembler-times {vdiv\.f16\ts[0-9]+, s[0-9]+, s[0-9]+} 13 } } */
/* { dg-final { scan-assembler-times {vcmp\.f32\ts[0-9]+, s[0-9]+} 26 } } */
/* { dg-final { scan-assembler-times {vcmpe\.f32\ts[0-9]+, s[0-9]+} 52 } } */
/* { dg-final { scan-assembler-not {vadd\.f32} } } */
/* { dg-final { scan-assembler-not {vsub\.f32} } } */
......
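The definitions of TEST_UNOP, TEST_BINOP and TEST_CMP are elided from the diff above, so the following is only a sketch of how such macros could be written, not the test file's actual code. To stay portable it uses GCC's generic vector extension with float elements (the hypothetical types f32x4_t and i32x4_t) instead of the FP16 NEON types, which need an ARMv8.2-A target.

```c
/* Hypothetical stand-ins for the elided test macros.  The vector types
   below are assumptions chosen for portability; the real test uses
   float16x4_t/float16x8_t with int16x4_t/int16x8_t results.  */
typedef float f32x4_t __attribute__ ((vector_size (16)));
typedef int i32x4_t __attribute__ ((vector_size (16)));

/* Define a function applying a unary operator to one value.  */
#define TEST_UNOP(NAME, OPERATOR, TY)		\
  TY test_##NAME##_##TY (TY a)			\
  {						\
    return OPERATOR (a);			\
  }

/* Define a function applying a binary operator to two values.  */
#define TEST_BINOP(NAME, OPERATOR, TY)		\
  TY test_##NAME##_##TY (TY a, TY b)		\
  {						\
    return a OPERATOR b;			\
  }

/* Define a comparison; for vector operands the result is an integer
   vector whose lanes are all-ones (-1) where the comparison holds and
   0 elsewhere.  */
#define TEST_CMP(NAME, OPERATOR, RTY, TY)	\
  RTY test_##NAME##_##TY (TY a, TY b)		\
  {						\
    return a OPERATOR b;			\
  }

TEST_UNOP (neg, -, f32x4_t)
TEST_BINOP (add, +, f32x4_t)
TEST_CMP (lessthan, <, i32x4_t, f32x4_t)
```

Each macro invocation expands to one small function (for example test_add_f32x4_t), which gives the scan-assembler directives a predictable set of instructions to count.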