- 16 Nov, 2019 (10 commits)
This patch uses distinct values for the FFR and FFRT outputs of aarch64_wrffr, so that a following aarch64_copy_ffr_to_ffrt has an effect. This is needed to avoid regressions with later patches.

The block comment at the head of the file already described the pattern this way, and there was already an unspec for it. Not sure what made me change it...

2019-11-16  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* config/aarch64/aarch64-sve.md (aarch64_wrffr): Wrap the FFRT
	output in UNSPEC_WRFFR.

From-SVN: r278356
Richard Sandiford committed
This patch adds support for scatter stores of partial vectors, where the vector base or offset elements can be wider than the elements being stored. 2019-11-16 Richard Sandiford <richard.sandiford@arm.com> gcc/ * config/aarch64/aarch64-sve.md (scatter_store<SVE_FULL_SD:mode><v_int_equiv>): Extend to... (scatter_store<SVE_24:mode><v_int_container>): ...this. (mask_scatter_store<SVE_FULL_S:mode><v_int_equiv>): Extend to... (mask_scatter_store<SVE_4:mode><v_int_equiv>): ...this. (mask_scatter_store<SVE_FULL_D:mode><v_int_equiv>): Extend to... (mask_scatter_store<SVE_2:mode><v_int_equiv>): ...this. (*mask_scatter_store<mode><v_int_container>_<su>xtw_unpacked): New pattern. (*mask_scatter_store<SVE_FULL_D:mode><v_int_equiv>_sxtw): Extend to... (*mask_scatter_store<SVE_2:mode><v_int_equiv>_sxtw): ...this. (*mask_scatter_store<SVE_FULL_D:mode><v_int_equiv>_uxtw): Extend to... (*mask_scatter_store<SVE_2:mode><v_int_equiv>_uxtw): ...this. gcc/testsuite/ * gcc.target/aarch64/sve/scatter_store_1.c (TEST_LOOP): Start at 0. (TEST_ALL): Add tests for 8-bit and 16-bit elements. * gcc.target/aarch64/sve/scatter_store_2.c: Update accordingly. * gcc.target/aarch64/sve/scatter_store_3.c (TEST_LOOP): Start at 0. (TEST_ALL): Add tests for 8-bit and 16-bit elements. * gcc.target/aarch64/sve/scatter_store_4.c: Update accordingly. * gcc.target/aarch64/sve/scatter_store_5.c (TEST_LOOP): Start at 0. (TEST_ALL): Add tests for 8-bit, 16-bit and 32-bit elements. * gcc.target/aarch64/sve/scatter_store_8.c: New test. * gcc.target/aarch64/sve/scatter_store_9.c: Likewise. From-SVN: r278347
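As a rough illustration (not taken from the patch or its testsuite; the function name and flags are mine), a loop of this shape stores 16-bit elements through 64-bit indices, so the stored data is narrower than the offset vector, which is the case the extended patterns cover:

  #include <stdint.h>

  /* Built with something like -O2 -march=armv8.2-a+sve, this store is a
     candidate for an SVE scatter (ST1H) with an unpacked offset vector.  */
  void
  scatter_store_u16 (uint16_t *restrict dst, const uint16_t *restrict src,
                     const uint64_t *restrict index, int n)
  {
    for (int i = 0; i < n; ++i)
      dst[index[i]] = src[i];
  }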
Richard Sandiford committed
This patch pattern-matches a partial gather load followed by a sign or zero extension into an extending gather load. (The partial gather load is already an extending load; we just don't rely on the upper bits of the elements.) 2019-11-16 Richard Sandiford <richard.sandiford@arm.com> gcc/ * config/aarch64/iterators.md (SVE_2BHSI, SVE_2HSDI, SVE_4BHI) (SVE_4HSI): New mode iterators. (ANY_EXTEND2): New code iterator. * config/aarch64/aarch64-sve.md (@aarch64_gather_load_<ANY_EXTEND:optab><VNx4_WIDE:mode><VNx4_NARROW:mode>): Extend to... (@aarch64_gather_load_<ANY_EXTEND:optab><SVE_4HSI:mode><SVE_4BHI:mode>): ...this, handling extension to partial modes as well as full modes. Describe the extension as a predicated rather than unpredicated extension. (@aarch64_gather_load_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>): Likewise extend to... (@aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode><SVE_2BHSI:mode>): ...this, making the same adjustments. (*aarch64_gather_load_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>_sxtw): Likewise extend to... (*aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode><SVE_2BHSI:mode>_sxtw) ...this, making the same adjustments. (*aarch64_gather_load_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>_uxtw): Likewise extend to... (*aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode><SVE_2BHSI:mode>_uxtw) ...this, making the same adjustments. (*aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode><SVE_2BHSI:mode>_<ANY_EXTEND2:su>xtw_unpacked): New pattern. (*aarch64_ldff1_gather<mode>_sxtw): Canonicalize to a constant extension predicate. (@aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx4_WIDE:mode><VNx4_NARROW:mode>) (@aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>) (*aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>_uxtw): Describe the extension as a predicated rather than unpredicated extension. (*aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>_sxtw): Likewise. Canonicalize to a constant extension predicate. * config/aarch64/aarch64-sve-builtins-base.cc (svld1_gather_extend_impl::expand): Add an extra predicate for the extension. (svldff1_gather_extend_impl::expand): Likewise. gcc/testsuite/ * gcc.target/aarch64/sve/gather_load_extend_1.c: New test. * gcc.target/aarch64/sve/gather_load_extend_2.c: Likewise. * gcc.target/aarch64/sve/gather_load_extend_3.c: Likewise. * gcc.target/aarch64/sve/gather_load_extend_4.c: Likewise. * gcc.target/aarch64/sve/gather_load_extend_5.c: Likewise. * gcc.target/aarch64/sve/gather_load_extend_6.c: Likewise. * gcc.target/aarch64/sve/gather_load_extend_7.c: Likewise. * gcc.target/aarch64/sve/gather_load_extend_8.c: Likewise. * gcc.target/aarch64/sve/gather_load_extend_9.c: Likewise. * gcc.target/aarch64/sve/gather_load_extend_10.c: Likewise. * gcc.target/aarch64/sve/gather_load_extend_11.c: Likewise. * gcc.target/aarch64/sve/gather_load_extend_12.c: Likewise. From-SVN: r278346
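As a hedged sketch (the function name is mine, not the patch's), the kind of source this targets gathers 8-bit elements and then widens them, so the separate extension can be folded into the gather load itself:

  #include <stdint.h>

  /* The 8-bit gather feeding a 32-bit result can become a single
     zero-extending gather load rather than a gather plus an extend.  */
  void
  gather_zext_u8 (uint32_t *restrict dst, const uint8_t *restrict src,
                  const uint32_t *restrict index, int n)
  {
    for (int i = 0; i < n; ++i)
      dst[i] = src[index[i]];
  }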
Richard Sandiford committed
This patch adds support for gather loads of partial vectors, where the vector base or offset elements can be wider than the elements being loaded. 2019-11-16 Richard Sandiford <richard.sandiford@arm.com> gcc/ * config/aarch64/iterators.md (SVE_24, SVE_2, SVE_4): New mode iterators. * config/aarch64/aarch64-sve.md (gather_load<SVE_FULL_SD:mode><v_int_equiv>): Extend to... (gather_load<SVE_24:mode><v_int_container>): ...this. (mask_gather_load<SVE_FULL_S:mode><v_int_equiv>): Extend to... (mask_gather_load<SVE_4:mode><v_int_container>): ...this. (mask_gather_load<SVE_FULL_D:mode><v_int_equiv>): Extend to... (mask_gather_load<SVE_2:mode><v_int_container>): ...this. (*mask_gather_load<SVE_2:mode><v_int_container>_<su>xtw_unpacked): New pattern. (*mask_gather_load<SVE_FULL_D:mode><v_int_equiv>_sxtw): Extend to... (*mask_gather_load<SVE_2:mode><v_int_equiv>_sxtw): ...this. Allow the nominal extension predicate to be different from the load predicate. (*mask_gather_load<SVE_FULL_D:mode><v_int_equiv>_uxtw): Extend to... (*mask_gather_load<SVE_2:mode><v_int_equiv>_uxtw): ...this. gcc/testsuite/ * gcc.target/aarch64/sve/gather_load_1.c (TEST_LOOP): Start at 0. (TEST_ALL): Add tests for 8-bit and 16-bit elements. * gcc.target/aarch64/sve/gather_load_2.c: Update accordingly. * gcc.target/aarch64/sve/gather_load_3.c (TEST_LOOP): Start at 0. (TEST_ALL): Add tests for 8-bit and 16-bit elements. * gcc.target/aarch64/sve/gather_load_4.c: Update accordingly. * gcc.target/aarch64/sve/gather_load_5.c (TEST_LOOP): Start at 0. (TEST_ALL): Add tests for 8-bit, 16-bit and 32-bit elements. * gcc.target/aarch64/sve/gather_load_6.c: Add --param aarch64-sve-compare-costs=0. (TEST_LOOP): Start at 0. * gcc.target/aarch64/sve/gather_load_7.c: Add --param aarch64-sve-compare-costs=0. * gcc.target/aarch64/sve/gather_load_8.c: New test. * gcc.target/aarch64/sve/gather_load_9.c: Likewise. * gcc.target/aarch64/sve/mask_gather_load_6.c: Add --param aarch64-sve-compare-costs=0. From-SVN: r278345
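For illustration only (the function name is mine), this is the shape of loop the change is aimed at: 16-bit elements gathered through 64-bit indices, so the loaded elements are narrower than the offset elements:

  #include <stdint.h>

  void
  gather_u16 (uint16_t *restrict dst, const uint16_t *restrict src,
              const uint64_t *restrict index, int n)
  {
    for (int i = 0; i < n; ++i)
      dst[i] = src[index[i]];   /* candidate for LD1H with unpacked offsets */
  }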
Richard Sandiford committed
This patch adds support for "truncating" to a partial SVE vector from either a full SVE vector or a wider partial vector. This truncation is actually a no-op and so should have zero cost in the vector cost model. 2019-11-16 Richard Sandiford <richard.sandiford@arm.com> gcc/ * config/aarch64/aarch64-sve.md (trunc<SVE_HSDI:mode><SVE_PARTIAL_I:mode>2): New pattern. * config/aarch64/aarch64.c (aarch64_integer_truncation_p): New function. (aarch64_sve_adjust_stmt_cost): Call it. gcc/testsuite/ * gcc.target/aarch64/sve/mask_struct_load_1.c: Add --param aarch64-sve-compare-costs=0. * gcc.target/aarch64/sve/mask_struct_load_2.c: Likewise. * gcc.target/aarch64/sve/mask_struct_load_3.c: Likewise. * gcc.target/aarch64/sve/mask_struct_load_4.c: Likewise. * gcc.target/aarch64/sve/mask_struct_load_5.c: Likewise. * gcc.target/aarch64/sve/pack_1.c: Likewise. * gcc.target/aarch64/sve/truncate_1.c: New test. From-SVN: r278344
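A minimal sketch of the situation being costed (illustrative, not taken from truncate_1.c): only the low 16 bits of each 32-bit element are kept, so the conversion to the partial vector should be free and the store can work directly on the wider containers:

  #include <stdint.h>

  void
  narrow_u32_to_u16 (uint16_t *restrict dst, const uint32_t *restrict src,
                     int n)
  {
    for (int i = 0; i < n; ++i)
      dst[i] = (uint16_t) src[i];   /* the truncation itself is a no-op */
  }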
Richard Sandiford committed
This patch pattern-matches a partial SVE load followed by a sign or zero extension into an extending load. (The partial load is already an extending load; we just don't rely on the upper bits of the elements.) Nothing yet uses the extra LDFF1 and LDNF1 combinations, but it seemed more consistent to provide them, since I needed to update the pattern to use a predicated extension anyway. 2019-11-16 Richard Sandiford <richard.sandiford@arm.com> gcc/ * config/aarch64/aarch64-sve.md (@aarch64_load_<ANY_EXTEND:optab><VNx8_WIDE:mode><VNx8_NARROW:mode>): (@aarch64_load_<ANY_EXTEND:optab><VNx4_WIDE:mode><VNx4_NARROW:mode>) (@aarch64_load_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>): Combine into... (@aarch64_load_<ANY_EXTEND:optab><SVE_HSDI:mode><SVE_PARTIAL_I:mode>): ...this new pattern, handling extension to partial modes as well as full modes. Describe the extension as a predicated rather than unpredicated extension. (@aarch64_ld<fn>f1_<ANY_EXTEND:optab><VNx8_WIDE:mode><VNx8_NARROW:mode>) (@aarch64_ld<fn>f1_<ANY_EXTEND:optab><VNx4_WIDE:mode><VNx4_NARROW:mode>) (@aarch64_ld<fn>f1_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>): Combine into... (@aarch64_ld<fn>f1_<ANY_EXTEND:optab><SVE_HSDI:mode><SVE_PARTIAL_I:mode>): ...this new pattern, handling extension to partial modes as well as full modes. Describe the extension as a predicated rather than unpredicated extension. * config/aarch64/aarch64-sve-builtins.cc (function_expander::use_contiguous_load_insn): Add an extra predicate for extending loads. * config/aarch64/aarch64.c (aarch64_extending_load_p): New function. (aarch64_sve_adjust_stmt_cost): Likewise. (aarch64_add_stmt_cost): Use aarch64_sve_adjust_stmt_cost to adjust the cost of SVE vector stmts. gcc/testsuite/ * gcc.target/aarch64/sve/load_extend_1.c: New test. * gcc.target/aarch64/sve/load_extend_2.c: Likewise. * gcc.target/aarch64/sve/load_extend_3.c: Likewise. * gcc.target/aarch64/sve/load_extend_4.c: Likewise. * gcc.target/aarch64/sve/load_extend_5.c: Likewise. * gcc.target/aarch64/sve/load_extend_6.c: Likewise. * gcc.target/aarch64/sve/load_extend_7.c: Likewise. * gcc.target/aarch64/sve/load_extend_8.c: Likewise. * gcc.target/aarch64/sve/load_extend_9.c: Likewise. * gcc.target/aarch64/sve/load_extend_10.c: Likewise. * gcc.target/aarch64/sve/reduc_4.c: Add --param aarch64-sve-compare-costs=0. From-SVN: r278343
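Roughly the kind of code this matches (illustrative, not one of the new tests): each 8-bit element is loaded and widened to 32 bits, which can now be a single extending load rather than a load followed by a separate extend:

  #include <stdint.h>

  void
  widen_u8_to_u32 (uint32_t *restrict dst, const uint8_t *restrict src, int n)
  {
    for (int i = 0; i < n; ++i)
      dst[i] = src[i];
  }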
Richard Sandiford committed
This patch adds support for extending from partial SVE modes to both full vector modes and wider partial modes. Some tests now need --param aarch64-sve-compare-costs=0 to force the original full-vector code. 2019-11-16 Richard Sandiford <richard.sandiford@arm.com> gcc/ * config/aarch64/iterators.md (SVE_HSDI): New mode iterator. (narrower_mask): Handle VNx4HI, VNx2HI and VNx2SI. * config/aarch64/aarch64-sve.md (<ANY_EXTEND:optab><SVE_PARTIAL_I:mode><SVE_HSDI:mode>2): New pattern. (*<ANY_EXTEND:optab><SVE_PARTIAL_I:mode><SVE_HSDI:mode>2): Likewise. (@aarch64_pred_sxt<SVE_FULL_HSDI:mode><SVE_PARTIAL_I:mode>): Update comment. Avoid new narrower_mask ambiguity. (@aarch64_cond_sxt<SVE_FULL_HSDI:mode><SVE_PARTIAL_I:mode>): Likewise. (*cond_uxt<mode>_2): Update comment. (*cond_uxt<mode>_any): Likewise. gcc/testsuite/ * gcc.target/aarch64/sve/cost_model_1.c: Expect the loop to be vectorized with bytes stored in 32-bit containers. * gcc.target/aarch64/sve/extend_1.c: New test. * gcc.target/aarch64/sve/extend_2.c: New test. * gcc.target/aarch64/sve/extend_3.c: New test. * gcc.target/aarch64/sve/extend_4.c: New test. * gcc.target/aarch64/sve/load_const_offset_3.c: Add --param aarch64-sve-compare-costs=0. * gcc.target/aarch64/sve/mask_struct_store_1.c: Likewise. * gcc.target/aarch64/sve/mask_struct_store_1_run.c: Likewise. * gcc.target/aarch64/sve/mask_struct_store_2.c: Likewise. * gcc.target/aarch64/sve/mask_struct_store_2_run.c: Likewise. * gcc.target/aarch64/sve/unpack_unsigned_1.c: Likewise. * gcc.target/aarch64/sve/unpack_unsigned_1_run.c: Likewise. From-SVN: r278342
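A hedged sketch of the trade-off (the function name and options shown are mine): bytes can now be kept in 32-bit containers and extended there, and adding --param aarch64-sve-compare-costs=0 to the usual SVE options forces the original full-vector strategy instead, as the updated tests do:

  #include <stdint.h>

  void
  accumulate_u8 (uint32_t *restrict dst, const uint8_t *restrict src, int n)
  {
    for (int i = 0; i < n; ++i)
      dst[i] += src[i];   /* 8-bit data widened into 32-bit containers */
  }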
Richard Sandiford committed
This patch adds the bare minimum needed to support autovectorisation of partial SVE vectors, namely moves and integer addition. Later patches add more interesting cases. 2019-11-16 Richard Sandiford <richard.sandiford@arm.com> gcc/ * config/aarch64/aarch64-modes.def: Define partial SVE vector float modes. * config/aarch64/aarch64-protos.h (aarch64_sve_pred_mode): New function. * config/aarch64/aarch64.c (aarch64_classify_vector_mode): Handle the new vector float modes. (aarch64_sve_container_bits): New function. (aarch64_sve_pred_mode): Likewise. (aarch64_get_mask_mode): Use it. (aarch64_sve_element_int_mode): Handle structure modes and partial modes. (aarch64_sve_container_int_mode): New function. (aarch64_vectorize_related_mode): Return SVE modes when given SVE modes. Handle partial modes, taking the preferred number of units from the size of the given mode. (aarch64_hard_regno_mode_ok): Allow partial modes to be stored in registers. (aarch64_expand_sve_ld1rq): Use the mode form of aarch64_sve_pred_mode. (aarch64_expand_sve_const_vector): Handle partial SVE vectors. (aarch64_split_sve_subreg_move): Use the mode form of aarch64_sve_pred_mode. (aarch64_secondary_reload): Handle partial modes in the same way as full big-endian vectors. (aarch64_vector_mode_supported_p): Allow partial SVE vectors. (aarch64_autovectorize_vector_modes): Try unpacked SVE vectors, merging with the Advanced SIMD modes. If two modes have the same size, try the Advanced SIMD mode first. (aarch64_simd_valid_immediate): Use the container rather than the element mode for INDEX constants. (aarch64_simd_vector_alignment): Make the alignment of partial SVE vector modes the same as their minimum size. (aarch64_evpc_sel): Use the mode form of aarch64_sve_pred_mode. * config/aarch64/aarch64-sve.md (mov<SVE_FULL:mode>): Extend to... (mov<SVE_ALL:mode>): ...this. (movmisalign<SVE_FULL:mode>): Extend to... (movmisalign<SVE_ALL:mode>): ...this. (*aarch64_sve_mov<mode>_le): Rename to... (*aarch64_sve_mov<mode>_ldr_str): ...this. (*aarch64_sve_mov<SVE_FULL:mode>_be): Rename and extend to... (*aarch64_sve_mov<SVE_ALL:mode>_no_ldr_str): ...this. Handle partial modes regardless of endianness. (aarch64_sve_reload_be): Rename to... (aarch64_sve_reload_mem): ...this and enable for little-endian. Use aarch64_sve_pred_mode to get the appropriate predicate mode. (@aarch64_pred_mov<SVE_FULL:mode>): Extend to... (@aarch64_pred_mov<SVE_ALL:mode>): ...this. (*aarch64_sve_mov<SVE_FULL:mode>_subreg_be): Extend to... (*aarch64_sve_mov<SVE_ALL:mode>_subreg_be): ...this. (@aarch64_sve_reinterpret<SVE_FULL:mode>): Extend to... (@aarch64_sve_reinterpret<SVE_ALL:mode>): ...this. (*aarch64_sve_reinterpret<SVE_FULL:mode>): Extend to... (*aarch64_sve_reinterpret<SVE_ALL:mode>): ...this. (maskload<SVE_FULL:mode><vpred>): Extend to... (maskload<SVE_ALL:mode><vpred>): ...this. (maskstore<SVE_FULL:mode><vpred>): Extend to... (maskstore<SVE_ALL:mode><vpred>): ...this. (vec_duplicate<SVE_FULL:mode>): Extend to... (vec_duplicate<SVE_ALL:mode>): ...this. (*vec_duplicate<SVE_FULL:mode>_reg): Extend to... (*vec_duplicate<SVE_ALL:mode>_reg): ...this. (sve_ld1r<SVE_FULL:mode>): Extend to... (sve_ld1r<SVE_ALL:mode>): ...this. (vec_series<SVE_FULL_I:mode>): Extend to... (vec_series<SVE_I:mode>): ...this. (*vec_series<SVE_FULL_I:mode>_plus): Extend to... (*vec_series<SVE_I:mode>_plus): ...this. (@aarch64_pred_sxt<SVE_FULL_HSDI:mode><SVE_PARTIAL_I:mode>): Avoid new VPRED ambiguity. (@aarch64_cond_sxt<SVE_FULL_HSDI:mode><SVE_PARTIAL_I:mode>): Likewise. 
(add<SVE_FULL_I:mode>3): Extend to... (add<SVE_I:mode>3): ...this. * config/aarch64/iterators.md (SVE_ALL, SVE_I): New mode iterators. (Vetype, Vesize, VEL, Vel, vwcore): Handle partial SVE vector modes. (VPRED, vpred): Likewise. (Vctype): New iterator. (vw): Remove SVE modes. gcc/testsuite/ * gcc.target/aarch64/sve/mixed_size_1.c: New test. * gcc.target/aarch64/sve/mixed_size_2.c: Likewise. * gcc.target/aarch64/sve/mixed_size_3.c: Likewise. * gcc.target/aarch64/sve/mixed_size_4.c: Likewise. * gcc.target/aarch64/sve/mixed_size_5.c: Likewise. From-SVN: r278341
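A small sketch of the mixed-element-size case this enables (illustrative, in the spirit of the new mixed_size tests rather than copied from them): the 16-bit accesses can use unpacked vectors so that both statements handle the same number of elements per vector iteration:

  #include <stdint.h>

  void
  add_mixed (uint32_t *restrict a, uint16_t *restrict b, int n)
  {
    for (int i = 0; i < n; ++i)
      {
        a[i] += 1;
        b[i] += 2;
      }
  }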
Richard Sandiford committed
Another renaming, this time to make way for partial/unpacked float modes.

2019-11-16  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* config/aarch64/iterators.md (SVE_PARTIAL): Rename to...
	(SVE_PARTIAL_I): ...this.
	* config/aarch64/aarch64-sve.md: Apply the above renaming throughout.

From-SVN: r278339
Richard Sandiford committed
An upcoming patch will make more use of partial/unpacked SVE vectors. We then need a distinction between mode iterators that include partial modes and those that only include "full" modes. This patch prepares for that by adding "FULL" to the names of iterators that only select full modes. There should be no change in behaviour. 2019-11-16 Richard Sandiford <richard.sandiford@arm.com> gcc/ * config/aarch64/iterators.md (SVE_ALL): Rename to... (SVE_FULL): ...this. (SVE_I): Rename to... (SVE_FULL_I): ...this. (SVE_F): Rename to... (SVE_FULL_F): ...this. (SVE_BHSI): Rename to... (SVE_FULL_BHSI): ...this. (SVE_HSD): Rename to... (SVE_FULL_HSD): ...this. (SVE_HSDI): Rename to... (SVE_FULL_HSDI): ...this. (SVE_HSF): Rename to... (SVE_FULL_HSF): ...this. (SVE_SD): Rename to... (SVE_FULL_SD): ...this. (SVE_SDI): Rename to... (SVE_FULL_SDI): ...this. (SVE_SDF): Rename to... (SVE_FULL_SDF): ...this. (SVE_S): Rename to... (SVE_FULL_S): ...this. (SVE_D): Rename to... (SVE_FULL_D): ...this. * config/aarch64/aarch64-sve.md: Apply the above renaming throughout. * config/aarch64/aarch64-sve2.md: Likewise. From-SVN: r278338
Richard Sandiford committed
- 08 Nov, 2019 (1 commit)
The gather and scatter optabs required the vector offset to be the integer equivalent of the vector mode being loaded or stored. This patch generalises them so that the two vectors can have different element sizes, although they still need to have the same number of elements. One consequence of this is that it's possible (if unlikely) for two IFN_GATHER_LOADs to have the same arguments but different return types. E.g. the same scalar base and vector of 32-bit offsets could be used to load 8-bit elements and to load 16-bit elements. From just looking at the arguments, we could wrongly deduce that they're equivalent. I know we saw this happen at one point with IFN_WHILE_ULT, and we dealt with it there by passing a zero of the return type as an extra argument. Doing the same here also makes the load and store functions have the same argument assignment. For now this patch should be a no-op, but later SVE patches take advantage of the new flexibility. 2019-11-08 Richard Sandiford <richard.sandiford@arm.com> gcc/ * optabs.def (gather_load_optab, mask_gather_load_optab) (scatter_store_optab, mask_scatter_store_optab): Turn into conversion optabs, with the offset mode given explicitly. * doc/md.texi: Update accordingly. * config/aarch64/aarch64-sve-builtins-base.cc (svld1_gather_impl::expand): Likewise. (svst1_scatter_impl::expand): Likewise. * internal-fn.c (gather_load_direct, scatter_store_direct): Likewise. (expand_scatter_store_optab_fn): Likewise. (direct_gather_load_optab_supported_p): Likewise. (direct_scatter_store_optab_supported_p): Likewise. (expand_gather_load_optab_fn): Likewise. Expect the mask argument to be argument 4. (internal_fn_mask_index): Return 4 for IFN_MASK_GATHER_LOAD. (internal_gather_scatter_fn_supported_p): Replace the offset sign argument with the offset vector type. Require the two vector types to have the same number of elements but allow their element sizes to be different. Treat the optabs as conversion optabs. * internal-fn.h (internal_gather_scatter_fn_supported_p): Update prototype accordingly. * optabs-query.c (supports_at_least_one_mode_p): Replace with... (supports_vec_convert_optab_p): ...this new function. (supports_vec_gather_load_p): Update accordingly. (supports_vec_scatter_store_p): Likewise. * tree-vectorizer.h (vect_gather_scatter_fn_p): Take a vec_info. Replace the offset sign and bits parameters with a scalar type tree. * tree-vect-data-refs.c (vect_gather_scatter_fn_p): Likewise. Pass back the offset vector type instead of the scalar element type. Allow the offset to be wider than the memory elements. Search for an offset type that the target supports, stopping once we've reached the maximum of the element size and pointer size. Update call to internal_gather_scatter_fn_supported_p. (vect_check_gather_scatter): Update calls accordingly. When testing a new scale before knowing the final offset type, check whether the scale is supported for any signed or unsigned offset type. Check whether the target supports the source and target types of a conversion before deciding whether to look through the conversion. Record the chosen offset_vectype. * tree-vect-patterns.c (vect_get_gather_scatter_offset_type): Delete. (vect_recog_gather_scatter_pattern): Get the scalar offset type directly from the gs_info's offset_vectype instead. Pass a zero of the result type to IFN_GATHER_LOAD and IFN_MASK_GATHER_LOAD. 
* tree-vect-stmts.c (check_load_store_masking): Update call to internal_gather_scatter_fn_supported_p, passing the offset vector type recorded in the gs_info. (vect_truncate_gather_scatter_offset): Update call to vect_check_gather_scatter, leaving it to search for a valid offset vector type. (vect_use_strided_gather_scatters_p): Convert the offset to the element type of the gs_info's offset_vectype. (vect_get_gather_scatter_ops): Get the offset vector type directly from the gs_info. (vect_get_strided_load_store_ops): Likewise. (vectorizable_load): Pass a zero of the result type to IFN_GATHER_LOAD and IFN_MASK_GATHER_LOAD. * config/aarch64/aarch64-sve.md (gather_load<mode>): Rename to... (gather_load<mode><v_int_equiv>): ...this. (mask_gather_load<mode>): Rename to... (mask_gather_load<mode><v_int_equiv>): ...this. (scatter_store<mode>): Rename to... (scatter_store<mode><v_int_equiv>): ...this. (mask_scatter_store<mode>): Rename to... (mask_scatter_store<mode><v_int_equiv>): ...this. From-SVN: r277949
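As a rough illustration of the new flexibility (a no-op here, as noted above, but used by the later SVE patches; the function name is mine): the offset vector holds 32-bit indices while only 8-bit elements are loaded, so the two vectors share an element count but not an element size:

  #include <stdint.h>

  void
  gather_u8_idx32 (uint8_t *restrict dst, const uint8_t *restrict src,
                   const uint32_t *restrict index, int n)
  {
    for (int i = 0; i < n; ++i)
      dst[i] = src[index[i]];
  }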
Richard Sandiford committed
- 29 Oct, 2019 (3 commits)
The AAPCS64 specifies that if a function takes arguments in SVE registers or returns them in SVE registers, it must preserve all of Z8-Z23 and all of P4-P11. (Normal functions only preserve the low 64 bits of Z8-Z15 and clobber all of the predicate registers.) This variation is known informally as the "SVE PCS" and functions that use it are known informally as "SVE functions". The SVE PCS is mutually interoperable with functions that follow the standard AAPCS64 rules and those that use the aarch64_vector_pcs attribute. (Note that it's an error to use the attribute for SVE functions.) One complication -- although it's not really that complicated -- is that SVE registers need to be saved at a VL-dependent offset while other registers need to be saved at a constant offset. The easiest way of handling this seemed to be to group the SVE registers together below the hard frame pointer. In common cases, the frame pointer is then usually an easy-to-compute VL multiple above the stack pointer and a constant amount below the incoming stack pointer. A bigger complication is that, because the base AAPCS64 specifies that only the low 64 bits of V8-V15 are preserved by calls, the associated DWARF frame registers are also treated as 64 bits by the unwinder. The 64 bits must also have the same layout as they would for a base AAPCS64 function, otherwise unwinding won't work correctly. (This is actually a problem for the existing aarch64_vector_pcs support too, but I'll fix that separately.) This falls out naturally for little-endian targets but not for big-endian targets. The easiest way of meeting the requirement for them was to use ST1D and LD1D to save and restore Z8-Z15, which also has the nice property of storing the 64 bits at the start of the slot. However, using ST1D and LD1D requires a spare predicate register, and since all of P0-P7 are either argument registers or call-preserved, we may need to spill P4 in order to save the vector registers, even if P4 wouldn't need to be saved otherwise. Since Z16-Z23 are fully clobbered by base AAPCS64 functions, we don't need to emit frame information for them at all. This avoids having to decide whether the registers should be treated as having 64 bits (as for Z8-Z15), 128 bits (for Advanced SIMD) or the full SVE width. There are two ways of dealing with stack-clash protection when saving SVE registers: (1) If the area between the hard frame pointer and the incoming stack pointer is allocated via a store with writeback (callee_adjust != 0), the SVE save area is allocated separately and becomes the "initial" allocation as far as stack-clash protection goes. In this case the store with writeback acts as a probe at the hard frame pointer position. (2) If the area between the hard frame pointer and the incoming stack pointer is allocated via aarch64_allocate_and_probe_stack_space, the SVE save area is added to this initial allocation, so that the SP ends up pointing at the SVE register saves. It's then necessary to use a temporary base register to save the non-SVE registers. Setting up this temporary register requires a single instruction only and so should be more efficient than doing two allocations and probes. When SVE registers need to be saved, saving them below the frame pointer makes it harder to rely on the LR save as a stack probe, since the LR register's offset won't usually be a compile-time constant. 
The patch copes with that by using the lowest SVE register save as a stack probe too, and thus prevents the save from being shrink-wrapped if stack clash protection is enabled. The changelog describes the low-level details. 2019-10-29 Richard Sandiford <richard.sandiford@arm.com> gcc/ * calls.c (pass_by_reference): Leave the target to decide whether POLY_INT_CST-sized arguments should be passed by value or reference, rather than forcing them to be passed by reference. (must_pass_in_stack_var_size): Likewise. * config/aarch64/aarch64.md (LAST_SAVED_REGNUM): Redefine from V31_REGNUM to P15_REGNUM. * config/aarch64/aarch64-protos.h (aarch64_init_cumulative_args): Take an extra "silent_p" parameter, defaulting to false. (aarch64_sve::svbool_type_p): Declare. (aarch64_sve::nvectors_if_data_type): Likewise. * config/aarch64/aarch64.h (NUM_PR_ARG_REGS): New macro. (aarch64_frame::reg_offset): Turn into poly_int64s. (aarch64_frame::save_regs_size): Likewise. (aarch64_frame::below_hard_fp_saved_regs_size): New field. (aarch64_frame::sve_callee_adjust): Likewise. (aarch64_frame::spare_reg_reg): Likewise. (ARM_PCS_SVE): New arm_pcs value. (CUMULATIVE_ARGS::aapcs_nprn): New field. (CUMULATIVE_ARGS::aapcs_nextnprn): Likewise. (CUMULATIVE_ARGS::silent_p): Likewise. (BITS_PER_SVE_PRED): New macro. * config/aarch64/aarch64.c (handle_aarch64_vector_pcs_attribute): New function. Reject aarch64_vector_pcs attributes on SVE functions. (aarch64_attribute_table): Use the above handler. (aarch64_sve_abi): New function. (aarch64_sve_argument_p): Likewise. (aarch64_returns_value_in_sve_regs_p): Likewise. (aarch64_takes_arguments_in_sve_regs_p): Likewise. (aarch64_fntype_abi): Check for SVE functions and return the SVE PCS descriptor for them. (aarch64_simd_decl_p): Delete. (aarch64_emit_cfi_for_reg_p): New function. (aarch64_reg_save_mode): Remove the fndecl argument and instead use crtl->abi to choose the mode for FP registers. Handle the SVE PCS. (aarch64_hard_regno_call_part_clobbered): Do not treat FP registers as partly clobbered for the SVE PCS. (aarch64_function_ok_for_sibcall): Check whether the two functions use the same ABI, rather than checking specifically for whether they're aarch64_vector_pcs functions. (aarch64_pass_by_reference): Raise an error for attempts to pass SVE arguments when SVE is disabled. Pass SVE arguments by reference if there are not enough free registers left, or if the argument is variadic. (aarch64_function_value): Handle SVE predicates, vectors and tuples. (aarch64_return_in_memory): Do not return SVE predicates, vectors and tuples in memory. (aarch64_layout_arg): Take a function_arg_info rather than individual properties. Handle SVE predicates, vectors and tuples. Raise an error if they are passed to unprototyped functions. (aarch64_function_arg): If the silent_p flag is set, suppress the usual error about using float registers without TARGET_FLOAT. (aarch64_init_cumulative_args): Take a silent_p parameter and store it in the cumulative_args structure. Initialize aapcs_nprn and aapcs_nextnprn. If the silent_p flag is set, suppress the usual error about using float registers without TARGET_FLOAT. If the silent_p flag is not set, also raise an error about using SVE functions when SVE is disabled. (aarch64_function_arg_advance): Update the call to aarch64_layout_arg, and call it for SVE functions too. Update aapcs_nprn similarly to the other register counts. 
(aarch64_layout_frame): If a big-endian function needs to save and restore Z8-Z15, search for a spare predicate that it can use. Store SVE predicates at the bottom of the register save area, followed by SVE vectors, then followed by the normal slots. Keep pointing the hard frame pointer at the base of the normal slots, above the SVE vectors. Update the various frame creation and tear-down strategies for the new layout, initializing the new sve_callee_adjust field. Add an additional layout for frames whose saved registers are all SVE registers. (aarch64_register_saved_on_entry): Cope with poly_int64 reg_offsets. (aarch64_return_address_signing_enabled): Likewise. (aarch64_push_regs, aarch64_pop_regs): Update calls to aarch64_reg_save_mode. (aarch64_adjust_sve_callee_save_base): New function. (aarch64_add_cfa_expression): Move earlier in file. Take the saved register as an rtx rather than a register number and use its mode for the MEM slot. (aarch64_save_callee_saves): Remove the mode argument and instead use aarch64_reg_save_mode to get the mode of each save slot. Add a hard_fp_valid_p parameter. Cope with poly_int64 register offsets. Allow GP offsets to be saved at a VL-based offset from the stack, handling this case using the frame pointer if available or a temporary register otherwise. Use ST1D to save Z8-Z15 for big-endian SVE functions; use normal moves for other SVE saves. Only mark the save as frame-related if aarch64_emit_cfi_for_reg_p returns true. Add explicit CFA notes when not storing via the stack pointer. Do not try to pair SVE saves. (aarch64_restore_callee_saves): Cope with poly_int64 register offsets. Use LD1D to restore Z8-Z15 for big-endian SVE functions; use normal moves for other SVE restores. Only add CFA restore notes if aarch64_emit_cfi_for_reg_p returns true. Do not try to pair SVE restores. (aarch64_get_separate_components): Always keep the first SVE save in the prologue if we need to use it as a stack probe. Don't allow Z8-Z15 saves and loads to be shrink-wrapped for big-endian targets. Likewise the spare predicate register that they need. Update the offset calculation to account for the SVE save area. Use the appropriate range check for SVE LDR and STR instructions. (aarch64_components_for_bb): Cope with poly_int64 reg_offsets. (aarch64_process_components): Likewise. Update the offset calculation to account for the SVE save area. Only mark the save as frame-related if aarch64_emit_cfi_for_reg_p returns true. Do not try to pair SVE saves. (aarch64_allocate_and_probe_stack_space): Cope with poly_int64 reg_offsets. When handling the final allocation, expect the first SVE register save to be part of the initial allocation and for it to act as a probe at SP. Account for the SVE callee save area in the dump information. (aarch64_expand_prologue): Update the frame diagram. Fold the SVE callee allocation into the initial allocation if stack clash protection is enabled. Use new variables to track the offset of the frame chain (and hard frame pointer) from the current stack pointer, and likewise the offset of the bottom of the register save area. Update calls to aarch64_save_callee_saves and aarch64_add_cfa_expression. Apply sve_callee_adjust before saving the FP&SIMD registers. Save the predicate registers. (aarch64_expand_epilogue): Take below_hard_fp_saved_regs_size into account when setting the stack pointer from the frame pointer, and when deciding whether we can inherit the initial adjustment amount from the prologue. 
Restore the predicate registers after the vector registers, then apply sve_callee_adjust, then restore the general registers. (aarch64_secondary_reload): Don't use secondary SVE reloads for VNx16BImode. (aapcs_vfp_sub_candidate): Assert that the type is not an SVE type. (aarch64_short_vector_p): Return false for SVE types. (aarch64_vfp_is_call_or_return_candidate): Initialize *is_ha at the start of the function. Return false for SVE types. (aarch64_asm_output_variant_pcs): Output .variant_pcs for SVE functions too. (TARGET_STRICT_ARGUMENT_NAMING): Redefine to request strict naming. * config/aarch64/aarch64-sve.md (*aarch64_sve_mov<mode>_le): Extend to big-endian targets for bytewise moves. (*aarch64_sve_mov<mode>_be): Exclude the bytewise case. gcc/testsuite/ * gcc.target/aarch64/sve/pcs/aarch64-sve-pcs.exp: New file. * gcc.target/aarch64/sve/pcs/annotate_1.c: New test. * gcc.target/aarch64/sve/pcs/annotate_2.c: Likewise. * gcc.target/aarch64/sve/pcs/annotate_3.c: Likewise. * gcc.target/aarch64/sve/pcs/annotate_4.c: Likewise. * gcc.target/aarch64/sve/pcs/annotate_5.c: Likewise. * gcc.target/aarch64/sve/pcs/annotate_6.c: Likewise. * gcc.target/aarch64/sve/pcs/annotate_7.c: Likewise. * gcc.target/aarch64/sve/pcs/args_1.c: Likewise. * gcc.target/aarch64/sve/pcs/args_10.c: Likewise. * gcc.target/aarch64/sve/pcs/args_11_nosc.c: Likewise. * gcc.target/aarch64/sve/pcs/args_11_sc.c: Likewise. * gcc.target/aarch64/sve/pcs/args_2.c: Likewise. * gcc.target/aarch64/sve/pcs/args_3.c: Likewise. * gcc.target/aarch64/sve/pcs/args_4.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_be_f16.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_be_f32.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_be_f64.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_be_s16.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_be_s32.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_be_s64.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_be_s8.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_be_u16.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_be_u32.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_be_u64.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_be_u8.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_le_f16.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_le_f32.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_le_f64.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_le_s16.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_le_s32.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_le_s64.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_le_s8.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_le_u16.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_le_u32.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_le_u64.c: Likewise. * gcc.target/aarch64/sve/pcs/args_5_le_u8.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_be_f16.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_be_f32.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_be_f64.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_be_s16.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_be_s32.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_be_s64.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_be_s8.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_be_u16.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_be_u32.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_be_u64.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_be_u8.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_le_f16.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_le_f32.c: Likewise. 
* gcc.target/aarch64/sve/pcs/args_6_le_f64.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_le_s16.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_le_s32.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_le_s64.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_le_s8.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_le_u16.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_le_u32.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_le_u64.c: Likewise. * gcc.target/aarch64/sve/pcs/args_6_le_u8.c: Likewise. * gcc.target/aarch64/sve/pcs/args_7.c: Likewise. * gcc.target/aarch64/sve/pcs/args_8.c: Likewise. * gcc.target/aarch64/sve/pcs/args_9.c: Likewise. * gcc.target/aarch64/sve/pcs/nosve_1.c: Likewise. * gcc.target/aarch64/sve/pcs/nosve_2.c: Likewise. * gcc.target/aarch64/sve/pcs/nosve_3.c: Likewise. * gcc.target/aarch64/sve/pcs/nosve_4.c: Likewise. * gcc.target/aarch64/sve/pcs/nosve_5.c: Likewise. * gcc.target/aarch64/sve/pcs/nosve_6.c: Likewise. * gcc.target/aarch64/sve/pcs/nosve_7.c: Likewise. * gcc.target/aarch64/sve/pcs/nosve_8.c: Likewise. * gcc.target/aarch64/sve/pcs/return_1.c: Likewise. * gcc.target/aarch64/sve/pcs/return_1_1024.c: Likewise. * gcc.target/aarch64/sve/pcs/return_1_2048.c: Likewise. * gcc.target/aarch64/sve/pcs/return_1_256.c: Likewise. * gcc.target/aarch64/sve/pcs/return_1_512.c: Likewise. * gcc.target/aarch64/sve/pcs/return_2.c: Likewise. * gcc.target/aarch64/sve/pcs/return_3.c: Likewise. * gcc.target/aarch64/sve/pcs/return_4.c: Likewise. * gcc.target/aarch64/sve/pcs/return_4_1024.c: Likewise. * gcc.target/aarch64/sve/pcs/return_4_2048.c: Likewise. * gcc.target/aarch64/sve/pcs/return_4_256.c: Likewise. * gcc.target/aarch64/sve/pcs/return_4_512.c: Likewise. * gcc.target/aarch64/sve/pcs/return_5.c: Likewise. * gcc.target/aarch64/sve/pcs/return_5_1024.c: Likewise. * gcc.target/aarch64/sve/pcs/return_5_2048.c: Likewise. * gcc.target/aarch64/sve/pcs/return_5_256.c: Likewise. * gcc.target/aarch64/sve/pcs/return_5_512.c: Likewise. * gcc.target/aarch64/sve/pcs/return_6.c: Likewise. * gcc.target/aarch64/sve/pcs/return_6_1024.c: Likewise. * gcc.target/aarch64/sve/pcs/return_6_2048.c: Likewise. * gcc.target/aarch64/sve/pcs/return_6_256.c: Likewise. * gcc.target/aarch64/sve/pcs/return_6_512.c: Likewise. * gcc.target/aarch64/sve/pcs/return_7.c: Likewise. * gcc.target/aarch64/sve/pcs/return_8.c: Likewise. * gcc.target/aarch64/sve/pcs/return_9.c: Likewise. * gcc.target/aarch64/sve/pcs/saves_1_be_nowrap.c: Likewise. * gcc.target/aarch64/sve/pcs/saves_1_be_wrap.c: Likewise. * gcc.target/aarch64/sve/pcs/saves_1_le_nowrap.c: Likewise. * gcc.target/aarch64/sve/pcs/saves_1_le_wrap.c: Likewise. * gcc.target/aarch64/sve/pcs/saves_2_be_nowrap.c: Likewise. * gcc.target/aarch64/sve/pcs/saves_2_be_wrap.c: Likewise. * gcc.target/aarch64/sve/pcs/saves_2_le_nowrap.c: Likewise. * gcc.target/aarch64/sve/pcs/saves_2_le_wrap.c: Likewise. * gcc.target/aarch64/sve/pcs/saves_3.c: Likewise. * gcc.target/aarch64/sve/pcs/saves_4_be.c: Likewise. * gcc.target/aarch64/sve/pcs/saves_4_le.c: Likewise. * gcc.target/aarch64/sve/pcs/saves_5_be.c: Likewise. * gcc.target/aarch64/sve/pcs/saves_5_le.c: Likewise. * gcc.target/aarch64/sve/pcs/stack_clash_1.c: Likewise. * gcc.target/aarch64/sve/pcs/stack_clash_1_256.c: Likewise. * gcc.target/aarch64/sve/pcs/stack_clash_1_512.c: Likewise. * gcc.target/aarch64/sve/pcs/stack_clash_1_1024.c: Likewise. * gcc.target/aarch64/sve/pcs/stack_clash_1_2048.c: Likewise. * gcc.target/aarch64/sve/pcs/stack_clash_2.c: Likewise. * gcc.target/aarch64/sve/pcs/stack_clash_2_256.c: Likewise. 
* gcc.target/aarch64/sve/pcs/stack_clash_2_512.c: Likewise. * gcc.target/aarch64/sve/pcs/stack_clash_2_1024.c: Likewise. * gcc.target/aarch64/sve/pcs/stack_clash_2_2048.c: Likewise. * gcc.target/aarch64/sve/pcs/stack_clash_3.c: Likewise. * gcc.target/aarch64/sve/pcs/unprototyped_1.c: Likewise. * gcc.target/aarch64/sve/pcs/varargs_1.c: Likewise. * gcc.target/aarch64/sve/pcs/varargs_2_f16.c: Likewise. * gcc.target/aarch64/sve/pcs/varargs_2_f32.c: Likewise. * gcc.target/aarch64/sve/pcs/varargs_2_f64.c: Likewise. * gcc.target/aarch64/sve/pcs/varargs_2_s16.c: Likewise. * gcc.target/aarch64/sve/pcs/varargs_2_s32.c: Likewise. * gcc.target/aarch64/sve/pcs/varargs_2_s64.c: Likewise. * gcc.target/aarch64/sve/pcs/varargs_2_s8.c: Likewise. * gcc.target/aarch64/sve/pcs/varargs_2_u16.c: Likewise. * gcc.target/aarch64/sve/pcs/varargs_2_u32.c: Likewise. * gcc.target/aarch64/sve/pcs/varargs_2_u64.c: Likewise. * gcc.target/aarch64/sve/pcs/varargs_2_u8.c: Likewise. * gcc.target/aarch64/sve/pcs/varargs_3_nosc.c: Likewise. * gcc.target/aarch64/sve/pcs/varargs_3_sc.c: Likewise. * gcc.target/aarch64/sve/pcs/vpcs_1.c: Likewise. * g++.target/aarch64/sve/catch_7.C: Likewise. From-SVN: r277564
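A minimal sketch of an "SVE function" in the sense described above (the function name is mine; the types and intrinsic come from arm_sve.h, added in the next entry below): because it takes and returns values in SVE registers, it must preserve Z8-Z23 and P4-P11 across any calls it makes, unlike a base AAPCS64 function:

  #include <arm_sve.h>

  svfloat64_t
  axpy_vec (svbool_t pg, svfloat64_t x, svfloat64_t y, double a)
  {
    return svmla_n_f64_m (pg, y, x, a);   /* y + x * a for active lanes */
  }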
Richard Sandiford committed
This patch adds support for arm_sve.h. I've tried to split all the groundwork out into separate patches, so this is mostly adding new code rather than changing existing code. The C++ frontend seems to handle correct ACLE code without modification, even in length-agnostic mode. The C frontend is close; the only correct construct I know it doesn't handle is initialisation. E.g.: svbool_t pg = svptrue_b8 (); produces: variable-sized object may not be initialized although: svbool_t pg; pg = svptrue_b8 (); works fine. This can be fixed by changing: { /* A complete type is ok if size is fixed. */ - if (TREE_CODE (TYPE_SIZE (TREE_TYPE (decl))) != INTEGER_CST + if (!poly_int_tree_p (TYPE_SIZE (TREE_TYPE (decl))) || C_DECL_VARIABLE_SIZE (decl)) { error ("variable-sized object may not be initialized"); in c/c-decl.c:start_decl. Invalid code is likely to trigger ICEs, so this isn't ready for general use yet. However, it seemed better to apply the patch now and deal with diagnosing invalid code as a follow-up. For one thing, it means that we'll be able to provide testcases for middle-end changes related to SVE vectors, which has been a problem until now. (I already have a series of such patches lined up.) The patch includes some tests, but the main ones need to wait until the PCS support has been applied. 2019-10-29 Richard Sandiford <richard.sandiford@arm.com> Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org> Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> gcc/ * config.gcc (aarch64*-*-*): Add arm_sve.h to extra_headers. Add aarch64-sve-builtins.o, aarch64-sve-builtins-shapes.o and aarch64-sve-builtins-base.o to extra_objs. Add aarch64-sve-builtins.h and aarch64-sve-builtins.cc to target_gtfiles. * config/aarch64/t-aarch64 (aarch64-sve-builtins.o): New rule. (aarch64-sve-builtins-shapes.o): Likewise. (aarch64-sve-builtins-base.o): New rules. * config/aarch64/aarch64-c.c (aarch64_pragma_aarch64): New function. (aarch64_resolve_overloaded_builtin): Likewise. (aarch64_check_builtin_call): Likewise. (aarch64_register_pragmas): Install aarch64_resolve_overloaded_builtin and aarch64_check_builtin_call in targetm. Register the GCC aarch64 pragma. * config/aarch64/aarch64-protos.h (AARCH64_FOR_SVPRFOP): New macro. (aarch64_svprfop): New enum. (AARCH64_BUILTIN_SVE): New aarch64_builtin_class enum value. (aarch64_sve_int_mode, aarch64_sve_data_mode): Declare. (aarch64_fold_sve_cnt_pat, aarch64_output_sve_prefetch): Likewise. (aarch64_output_sve_cnt_pat_immediate): Likewise. (aarch64_output_sve_ptrues, aarch64_sve_ptrue_svpattern_p): Likewise. (aarch64_sve_sqadd_sqsub_immediate_p, aarch64_sve_ldff1_operand_p) (aarch64_sve_ldnf1_operand_p, aarch64_sve_prefetch_operand_p) (aarch64_ptrue_all_mode, aarch64_convert_sve_data_to_pred): Likewise. (aarch64_expand_sve_dupq, aarch64_replace_reg_mode): Likewise. (aarch64_sve::init_builtins, aarch64_sve::handle_arm_sve_h): Likewise. (aarch64_sve::builtin_decl, aarch64_sve::builtin_type_p): Likewise. (aarch64_sve::mangle_builtin_type): Likewise. (aarch64_sve::resolve_overloaded_builtin): Likewise. (aarch64_sve::check_builtin_call, aarch64_sve::gimple_fold_builtin) (aarch64_sve::expand_builtin): Likewise. * config/aarch64/aarch64.c (aarch64_sve_data_mode): Make public. (aarch64_sve_int_mode): Likewise. (aarch64_ptrue_all_mode): New function. (aarch64_convert_sve_data_to_pred): Make public. (svprfop_token): New function. (aarch64_output_sve_prefetch): Likewise. (aarch64_fold_sve_cnt_pat): Likewise. (aarch64_output_sve_cnt_pat_immediate): Likewise. 
(aarch64_sve_move_pred_via_while): Use gen_while with UNSPEC_WHILE_LO instead of gen_while_ult. (aarch64_replace_reg_mode): Make public. (aarch64_init_builtins): Call aarch64_sve::init_builtins. (aarch64_fold_builtin): Handle AARCH64_BUILTIN_SVE. (aarch64_gimple_fold_builtin, aarch64_expand_builtin): Likewise. (aarch64_builtin_decl, aarch64_builtin_reciprocal): Likewise. (aarch64_mangle_type): Call aarch64_sve::mangle_type. (aarch64_sve_sqadd_sqsub_immediate_p): New function. (aarch64_sve_ptrue_svpattern_p): Likewise. (aarch64_sve_pred_valid_immediate): Check aarch64_sve_ptrue_svpattern_p. (aarch64_sve_ldff1_operand_p, aarch64_sve_ldnf1_operand_p) (aarch64_sve_prefetch_operand_p, aarch64_output_sve_ptrues): New functions. * config/aarch64/aarch64.md (UNSPEC_LDNT1_SVE, UNSPEC_STNT1_SVE) (UNSPEC_LDFF1_GATHER, UNSPEC_PTRUE, UNSPEC_WHILE_LE, UNSPEC_WHILE_LS) (UNSPEC_WHILE_LT, UNSPEC_CLASTA, UNSPEC_UPDATE_FFR) (UNSPEC_UPDATE_FFRT, UNSPEC_RDFFR, UNSPEC_WRFFR) (UNSPEC_SVE_LANE_SELECT, UNSPEC_SVE_CNT_PAT, UNSPEC_SVE_PREFETCH) (UNSPEC_SVE_PREFETCH_GATHER, UNSPEC_SVE_COMPACT, UNSPEC_SVE_SPLICE): New unspecs. * config/aarch64/iterators.md (SI_ONLY, DI_ONLY, VNx8HI_ONLY) (VNx2DI_ONLY, SVE_PARTIAL, VNx8_NARROW, VNx8_WIDE, VNx4_NARROW) (VNx4_WIDE, VNx2_NARROW, VNx2_WIDE, PRED_HSD): New mode iterators. (UNSPEC_ADR, UNSPEC_BRKA, UNSPEC_BRKB, UNSPEC_BRKN, UNSPEC_BRKPA) (UNSPEC_BRKPB, UNSPEC_PFIRST, UNSPEC_PNEXT, UNSPEC_CNTP, UNSPEC_SADDV) (UNSPEC_UADDV, UNSPEC_FMLA, UNSPEC_FMLS, UNSPEC_FEXPA, UNSPEC_FTMAD) (UNSPEC_FTSMUL, UNSPEC_FTSSEL, UNSPEC_COND_CMPEQ_WIDE): New unspecs. (UNSPEC_COND_CMPGE_WIDE, UNSPEC_COND_CMPGT_WIDE): Likewise. (UNSPEC_COND_CMPHI_WIDE, UNSPEC_COND_CMPHS_WIDE): Likewise. (UNSPEC_COND_CMPLE_WIDE, UNSPEC_COND_CMPLO_WIDE): Likewise. (UNSPEC_COND_CMPLS_WIDE, UNSPEC_COND_CMPLT_WIDE): Likewise. (UNSPEC_COND_CMPNE_WIDE, UNSPEC_COND_FCADD90, UNSPEC_COND_FCADD270) (UNSPEC_COND_FCMLA, UNSPEC_COND_FCMLA90, UNSPEC_COND_FCMLA180) (UNSPEC_COND_FCMLA270, UNSPEC_COND_FMAX, UNSPEC_COND_FMIN): Likewise. (UNSPEC_COND_FMULX, UNSPEC_COND_FRECPX, UNSPEC_COND_FSCALE): Likewise. (UNSPEC_LASTA, UNSPEC_ASHIFT_WIDE, UNSPEC_ASHIFTRT_WIDE): Likewise. (UNSPEC_LSHIFTRT_WIDE, UNSPEC_LDFF1, UNSPEC_LDNF1): Likewise. (Vesize): Handle partial vector modes. (self_mask, narrower_mask, sve_lane_con, sve_lane_pair_con): New mode attributes. (UBINQOPS, ANY_PLUS, SAT_PLUS, ANY_MINUS, SAT_MINUS): New code iterators. (s, paired_extend, inc_dec): New code attributes. (SVE_INT_ADDV, CLAST, LAST): New int iterators. (SVE_INT_UNARY): Add UNSPEC_RBIT. (SVE_FP_UNARY, SVE_FP_UNARY_INT): New int iterators. (SVE_FP_BINARY, SVE_FP_BINARY_INT): Likewise. (SVE_COND_FP_UNARY): Add UNSPEC_COND_FRECPX. (SVE_COND_FP_BINARY): Add UNSPEC_COND_FMAX, UNSPEC_COND_FMIN and UNSPEC_COND_FMULX. (SVE_COND_FP_BINARY_INT, SVE_COND_FP_ADD): New int iterators. (SVE_COND_FP_SUB, SVE_COND_FP_MUL): Likewise. (SVE_COND_FP_BINARY_I1): Add UNSPEC_COND_FMAX and UNSPEC_COND_FMIN. (SVE_COND_FP_BINARY_REG): Add UNSPEC_COND_FMULX. (SVE_COND_FCADD, SVE_COND_FP_MAXMIN, SVE_COND_FCMLA) (SVE_COND_INT_CMP_WIDE, SVE_FP_TERNARY_LANE, SVE_CFP_TERNARY_LANE) (SVE_WHILE, SVE_SHIFT_WIDE, SVE_LDFF1_LDNF1, SVE_BRK_UNARY) (SVE_BRK_BINARY, SVE_PITER): New int iterators. 
(optab): Handle UNSPEC_SADDV, UNSPEC_UADDV, UNSPEC_FRECPE, UNSPEC_FRECPS, UNSPEC_RSQRTE, UNSPEC_RSQRTS, UNSPEC_RBIT, UNSPEC_SMUL_HIGHPART, UNSPEC_UMUL_HIGHPART, UNSPEC_FMLA, UNSPEC_FMLS, UNSPEC_FCMLA, UNSPEC_FCMLA90, UNSPEC_FCMLA180, UNSPEC_FCMLA270, UNSPEC_FEXPA, UNSPEC_FTSMUL, UNSPEC_FTSSEL, UNSPEC_COND_FCADD90, UNSPEC_COND_FCADD270, UNSPEC_COND_FCMLA, UNSPEC_COND_FCMLA90, UNSPEC_COND_FCMLA180, UNSPEC_COND_FCMLA270, UNSPEC_COND_FMAX, UNSPEC_COND_FMIN, UNSPEC_COND_FMULX, UNSPEC_COND_FRECPX and UNSPEC_COND_FSCALE. (maxmin_uns): Handle UNSPEC_COND_FMAX and UNSPEC_COND_FMIN. (binqops_op, binqops_op_rev, last_op): New int attributes. (su): Handle UNSPEC_SADDV and UNSPEC_UADDV. (fn, ab): New int attributes. (cmp_op): Handle UNSPEC_COND_CMP*_WIDE and UNSPEC_WHILE_*. (while_optab_cmp, brk_op, sve_pred_op): New int attributes. (sve_int_op): Handle UNSPEC_SMUL_HIGHPART, UNSPEC_UMUL_HIGHPART, UNSPEC_ASHIFT_WIDE, UNSPEC_ASHIFTRT_WIDE, UNSPEC_LSHIFTRT_WIDE and UNSPEC_RBIT. (sve_fp_op): Handle UNSPEC_FRECPE, UNSPEC_FRECPS, UNSPEC_RSQRTE, UNSPEC_RSQRTS, UNSPEC_FMLA, UNSPEC_FMLS, UNSPEC_FEXPA, UNSPEC_FTSMUL, UNSPEC_FTSSEL, UNSPEC_COND_FMAX, UNSPEC_COND_FMIN, UNSPEC_COND_FMULX, UNSPEC_COND_FRECPX and UNSPEC_COND_FSCALE. (sve_fp_op_rev): Handle UNSPEC_COND_FMAX, UNSPEC_COND_FMIN and UNSPEC_COND_FMULX. (rot): Handle UNSPEC_COND_FCADD* and UNSPEC_COND_FCMLA*. (brk_reg_con, brk_reg_opno): New int attributes. (sve_pred_fp_rhs1_operand, sve_pred_fp_rhs2_operand): Handle UNSPEC_COND_FMAX, UNSPEC_COND_FMIN and UNSPEC_COND_FMULX. (sve_pred_fp_rhs2_immediate): Handle UNSPEC_COND_FMAX and UNSPEC_COND_FMIN. (max_elem_bits): New int attribute. (min_elem_bits): Handle UNSPEC_RBIT. * config/aarch64/predicates.md (subreg_lowpart_operator): Handle TRUNCATE as well as SUBREG. (ascending_int_parallel, aarch64_simd_reg_or_minus_one) (aarch64_sve_ldff1_operand, aarch64_sve_ldnf1_operand) (aarch64_sve_prefetch_operand, aarch64_sve_ptrue_svpattern_immediate) (aarch64_sve_qadd_immediate, aarch64_sve_qsub_immediate) (aarch64_sve_gather_immediate_b, aarch64_sve_gather_immediate_h) (aarch64_sve_gather_immediate_w, aarch64_sve_gather_immediate_d) (aarch64_sve_sqadd_operand, aarch64_sve_gather_offset_b) (aarch64_sve_gather_offset_h, aarch64_sve_gather_offset_w) (aarch64_sve_gather_offset_d, aarch64_gather_scale_operand_b) (aarch64_gather_scale_operand_h): New predicates. * config/aarch64/constraints.md (UPb, UPd, UPh, UPw, Utf, Utn, vgb) (vgd, vgh, vgw, vsQ, vsS): New constraints. * config/aarch64/aarch64-sve.md: Add a note on the FFR handling. (*aarch64_sve_reinterpret<mode>): Allow any source register instead of requiring an exact match. (*aarch64_sve_ptruevnx16bi_cc, *aarch64_sve_ptrue<mode>_cc) (*aarch64_sve_ptruevnx16bi_ptest, *aarch64_sve_ptrue<mode>_ptest) (aarch64_wrffr, aarch64_update_ffr_for_load, aarch64_copy_ffr_to_ffrt) (aarch64_rdffr, aarch64_rdffr_z, *aarch64_rdffr_z_ptest) (*aarch64_rdffr_ptest, *aarch64_rdffr_z_cc, *aarch64_rdffr_cc) (aarch64_update_ffrt): New patterns. (@aarch64_load_<ANY_EXTEND:optab><VNx8_WIDE:mode><VNx8_NARROW:mode>) (@aarch64_load_<ANY_EXTEND:optab><VNx4_WIDE:mode><VNx4_NARROW:mode>) (@aarch64_load_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>) (@aarch64_ld<fn>f1<mode>): New patterns. (@aarch64_ld<fn>f1_<ANY_EXTEND:optab><VNx8_WIDE:mode><VNx8_NARROW:mode>) (@aarch64_ld<fn>f1_<ANY_EXTEND:optab><VNx4_WIDE:mode><VNx4_NARROW:mode>) (@aarch64_ld<fn>f1_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>) (@aarch64_ldnt1<mode>): New patterns. 
(gather_load<mode>): Use aarch64_sve_gather_offset_<Vesize> for the scalar part of the address. (mask_gather_load<SVE_S:mode>): Use aarch64_sve_gather_offset_w for the scalar part of the addresse and add an alternative for handling nonzero offsets. (mask_gather_load<SVE_D:mode>): Likewise aarch64_sve_gather_offset_d. (*mask_gather_load<mode>_sxtw, *mask_gather_load<mode>_uxtw) (@aarch64_gather_load_<ANY_EXTEND:optab><VNx4_WIDE:mode><VNx4_NARROW:mode>) (@aarch64_gather_load_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>) (*aarch64_gather_load_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>_sxtw) (*aarch64_gather_load_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>_uxtw) (@aarch64_ldff1_gather<SVE_S:mode>, @aarch64_ldff1_gather<SVE_D:mode>) (*aarch64_ldff1_gather<mode>_sxtw, *aarch64_ldff1_gather<mode>_uxtw) (@aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx4_WIDE:mode><VNx4_NARROW:mode>) (@aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>) (*aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>_sxtw) (*aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>_uxtw) (@aarch64_sve_prefetch<mode>): New patterns. (@aarch64_sve_gather_prefetch<SVE_I:mode><VNx4SI_ONLY:mode>) (@aarch64_sve_gather_prefetch<SVE_I:mode><VNx2DI_ONLY:mode>) (*aarch64_sve_gather_prefetch<SVE_I:mode><VNx2DI_ONLY:mode>_sxtw) (*aarch64_sve_gather_prefetch<SVE_I:mode><VNx2DI_ONLY:mode>_uxtw) (@aarch64_store_trunc<VNx8_NARROW:mode><VNx8_WIDE:mode>) (@aarch64_store_trunc<VNx4_NARROW:mode><VNx4_WIDE:mode>) (@aarch64_store_trunc<VNx2_NARROW:mode><VNx2_WIDE:mode>) (@aarch64_stnt1<mode>): New patterns. (scatter_store<mode>): Use aarch64_sve_gather_offset_<Vesize> for the scalar part of the address. (mask_scatter_store<SVE_S:mode>): Use aarch64_sve_gather_offset_w for the scalar part of the addresse and add an alternative for handling nonzero offsets. (mask_scatter_store<SVE_D:mode>): Likewise aarch64_sve_gather_offset_d. (*mask_scatter_store<mode>_sxtw, *mask_scatter_store<mode>_uxtw) (@aarch64_scatter_store_trunc<VNx4_NARROW:mode><VNx4_WIDE:mode>) (@aarch64_scatter_store_trunc<VNx2_NARROW:mode><VNx2_WIDE:mode>) (*aarch64_scatter_store_trunc<VNx2_NARROW:mode><VNx2_WIDE:mode>_sxtw) (*aarch64_scatter_store_trunc<VNx2_NARROW:mode><VNx2_WIDE:mode>_uxtw): New patterns. (vec_duplicate<mode>): Use QI as the mode of the input operand. (extract_last_<mode>): Generalize to... (@extract_<LAST:last_op>_<mode>): ...this. (*<SVE_INT_UNARY:optab><mode>2): Rename to... (@aarch64_pred_<SVE_INT_UNARY:optab><mode>): ...this. (@cond_<SVE_INT_UNARY:optab><mode>): New expander. (@aarch64_pred_sxt<SVE_HSDI:mode><SVE_PARTIAL:mode>): New pattern. (@aarch64_cond_sxt<SVE_HSDI:mode><SVE_PARTIAL:mode>): Likewise. (@aarch64_pred_cnot<mode>, @cond_cnot<mode>): New expanders. (@aarch64_sve_<SVE_FP_UNARY_INT:optab><mode>): New pattern. (@aarch64_sve_<SVE_FP_UNARY:optab><mode>): Likewise. (*<SVE_COND_FP_UNARY:optab><mode>2): Rename to... (@aarch64_pred_<SVE_COND_FP_UNARY:optab><mode>): ...this. (@cond_<SVE_COND_FP_UNARY:optab><mode>): New expander. (*<SVE_INT_BINARY_IMM:optab><mode>3): Rename to... (@aarch64_pred_<SVE_INT_BINARY_IMM:optab><mode>): ...this. (@aarch64_adr<mode>, *aarch64_adr_sxtw): New patterns. (*aarch64_adr_uxtw_unspec): Likewise. (*aarch64_adr_uxtw): Rename to... (*aarch64_adr_uxtw_and): ...this. (@aarch64_adr<mode>_shift): New expander. (*aarch64_adr_shift_sxtw): New pattern. (aarch64_<su>abd<mode>_3): Rename to... (@aarch64_pred_<su>abd<mode>): ...this. 
(<su>abd<mode>_3): Update accordingly. (@aarch64_cond_<su>abd<mode>): New expander. (@aarch64_<SBINQOPS:su_optab><optab><mode>): New pattern. (@aarch64_<UBINQOPS:su_optab><optab><mode>): Likewise. (*<su>mul<mode>3_highpart): Rename to... (@aarch64_pred_<optab><mode>): ...this. (@cond_<MUL_HIGHPART:optab><mode>): New expander. (*cond_<MUL_HIGHPART:optab><mode>_2): New pattern. (*cond_<MUL_HIGHPART:optab><mode>_z): Likewise. (*<SVE_INT_BINARY_SD:optab><mode>3): Rename to... (@aarch64_pred_<SVE_INT_BINARY_SD:optab><mode>): ...this. (cond_<SVE_INT_BINARY_SD:optab><mode>): Add a "@" marker. (@aarch64_bic<mode>, @cond_bic<mode>): New expanders. (*v<ASHIFT:optab><mode>3): Rename to... (@aarch64_pred_<ASHIFT:optab><mode>): ...this. (@aarch64_sve_<SVE_SHIFT_WIDE:sve_int_op><mode>): New pattern. (@cond_<SVE_SHIFT_WIDE:sve_int_op><mode>): New expander. (*cond_<SVE_SHIFT_WIDE:sve_int_op><mode>_m): New pattern. (*cond_<SVE_SHIFT_WIDE:sve_int_op><mode>_z): Likewise. (@cond_asrd<mode>): New expander. (*cond_asrd<mode>_2, *cond_asrd<mode>_z): New patterns. (sdiv_pow2<mode>3): Expand to *cond_asrd<mode>_2. (*sdiv_pow2<mode>3): Delete. (@cond_<SVE_COND_FP_BINARY_INT:optab><mode>): New expander. (*cond_<SVE_COND_FP_BINARY_INT:optab><mode>_2): New pattern. (*cond_<SVE_COND_FP_BINARY_INT:optab><mode>_any): Likewise. (@aarch64_sve_<SVE_FP_BINARY:optab><mode>): New pattern. (@aarch64_sve_<SVE_FP_BINARY_INT:optab><mode>): Likewise. (*<SVE_COND_FP_BINARY_REG:optab><mode>3): Rename to... (@aarch64_pred_<SVE_COND_FP_BINARY_REG:optab><mode>): ...this. (@aarch64_pred_<SVE_COND_FP_BINARY_INT:optab><mode>): New pattern. (cond_<SVE_COND_FP_BINARY:optab><mode>): Add a "@" marker. (*add<SVE_F:mode>3): Rename to... (@aarch64_pred_add<SVE_F:mode>): ...this and add alternatives for SVE_STRICT_GP. (@aarch64_pred_<SVE_COND_FCADD:optab><mode>): New pattern. (@cond_<SVE_COND_FCADD:optab><mode>): New expander. (*cond_<SVE_COND_FCADD:optab><mode>_2): New pattern. (*cond_<SVE_COND_FCADD:optab><mode>_any): Likewise. (*sub<SVE_F:mode>3): Rename to... (@aarch64_pred_sub<SVE_F:mode>): ...this and add alternatives for SVE_STRICT_GP. (@aarch64_pred_abd<SVE_F:mode>): New expander. (*fabd<SVE_F:mode>3): Rename to... (*aarch64_pred_abd<SVE_F:mode>): ...this. (@aarch64_cond_abd<SVE_F:mode>): New expander. (*mul<SVE_F:mode>3): Rename to... (@aarch64_pred_<SVE_F:optab><mode>): ...this and add alternatives for SVE_STRICT_GP. (@aarch64_mul_lane_<SVE_F:mode>): New pattern. (*<SVE_COND_FP_MAXMIN_PUBLIC:optab><mode>3): Rename and generalize to... (@aarch64_pred_<SVE_COND_FP_MAXMIN:optab><mode>): ...this. (*<LOGICAL:optab><PRED_ALL:mode>3_ptest): New pattern. (*<nlogical><PRED_ALL:mode>3): Rename to... (aarch64_pred_<nlogical><PRED_ALL:mode>_z): ...this. (*<nlogical><PRED_ALL:mode>3_cc): New pattern. (*<nlogical><PRED_ALL:mode>3_ptest): Likewise. (*<logical_nn><PRED_ALL:mode>3): Rename to... (aarch64_pred_<logical_nn><mode>_z): ...this. (*<logical_nn><PRED_ALL:mode>3_cc): New pattern. (*<logical_nn><PRED_ALL:mode>3_ptest): Likewise. (*fma<SVE_I:mode>4): Rename to... (@aarch64_pred_fma<SVE_I:mode>): ...this. (*fnma<SVE_I:mode>4): Rename to... (@aarch64_pred_fnma<SVE_I:mode>): ...this. (@aarch64_<sur>dot_prod_lane<vsi2qi>): New pattern. (*<SVE_FP_TERNARY:optab><mode>4): Rename to... (@aarch64_pred_<SVE_FP_TERNARY:optab><mode>): ...this. (cond_<SVE_FP_TERNARY:optab><mode>): Add a "@" marker. (@aarch64_<SVE_FP_TERNARY_LANE:optab>_lane_<mode>): New pattern. (@aarch64_pred_<SVE_COND_FCMLA:optab><mode>): Likewise. 
(@cond_<SVE_COND_FCMLA:optab><mode>): New expander. (*cond_<SVE_COND_FCMLA:optab><mode>_4): New pattern. (*cond_<SVE_COND_FCMLA:optab><mode>_any): Likewise. (@aarch64_<FCMLA:optab>_lane_<mode>): Likewise. (@aarch64_sve_tmad<mode>): Likewise. (vcond_mask_<SVE_ALL:mode><vpred>): Add a "@" marker. (*aarch64_sel_dup<mode>): Rename to... (@aarch64_sel_dup<mode>): ...this. (@aarch64_pred_cmp<cmp_op><SVE_I:mode>_wide): New pattern. (*aarch64_pred_cmp<cmp_op><SVE_I:mode>_wide_cc): Likewise. (*aarch64_pred_cmp<cmp_op><SVE_I:mode>_wide_ptest): Likewise. (@while_ult<GPI:mode><PRED_ALL:mode>): Generalize to... (@while_<while_optab_cmp><GPI:mode><PRED_ALL:mode>): ...this. (*while_ult<GPI:mode><PRED_ALL:mode>_cc): Generalize to... (*while_<while_optab_cmp><GPI:mode><PRED_ALL:mode>_cc): ...this. (*while_<while_optab_cmp><GPI:mode><PRED_ALL:mode>_ptest): New pattern. (*fcm<cmp_op><mode>): Rename to... (@aarch64_pred_fcm<cmp_op><mode>): ...this. Make operand order match @aarch64_pred_cmp<cmp_op><SVE_I:mode>. (*fcmuo<mode>): Rename to... (@aarch64_pred_fcmuo<mode>): ...this. Make operand order match @aarch64_pred_cmp<cmp_op><SVE_I:mode>. (@aarch64_pred_fac<cmp_op><mode>): New expander. (@vcond_mask_<PRED_ALL:mode><mode>): New pattern. (fold_extract_last_<mode>): Generalize to... (@fold_extract_<last_op>_<mode>): ...this. (@aarch64_fold_extract_vector_<last_op>_<mode>): New pattern. (*reduc_plus_scal_<SVE_I:mode>): Replace with... (@aarch64_pred_reduc_<optab>_<mode>): ...this pattern, making the DImode result explicit. (reduc_plus_scal_<mode>): Update accordingly. (*reduc_<optab>_scal_<SVE_I:mode>): Rename to... (@aarch64_pred_reduc_<optab>_<SVE_I:mode>): ...this. (*reduc_<optab>_scal_<SVE_F:mode>): Rename to... (@aarch64_pred_reduc_<optab>_<SVE_F:mode>): ...this. (*aarch64_sve_tbl<mode>): Rename to... (@aarch64_sve_tbl<mode>): ...this. (@aarch64_sve_compact<mode>): New pattern. (*aarch64_sve_dup_lane<mode>): Rename to... (@aarch64_sve_dup_lane<mode>): ...this. (@aarch64_sve_dupq_lane<mode>): New pattern. (@aarch64_sve_splice<mode>): Likewise. (aarch64_sve_<perm_insn><mode>): Rename to... (@aarch64_sve_<perm_insn><mode>): ...this. (*aarch64_sve_ext<mode>): Rename to... (@aarch64_sve_ext<mode>): ...this. (aarch64_sve_<su>unpk<perm_hilo>_<SVE_BHSI:mode>): Add a "@" marker. (*aarch64_sve_<optab>_nontrunc<SVE_F:mode><SVE_HSDI:mode>): Rename to... (@aarch64_sve_<optab>_nontrunc<SVE_F:mode><SVE_HSDI:mode>): ...this. (*aarch64_sve_<optab>_trunc<VNx2DF_ONLY:mode><VNx4SI_ONLY:mode>): Rename to... (@aarch64_sve_<optab>_trunc<VNx2DF_ONLY:mode><VNx4SI_ONLY:mode>): ...this. (@cond_<optab>_nontrunc<SVE_F:mode><SVE_HSDI:mode>): New expander. (@cond_<optab>_trunc<VNx2DF_ONLY:mode><VNx4SI_ONLY:mode>): Likewise. (*cond_<optab>_trunc<VNx2DF_ONLY:mode><VNx4SI_ONLY:mode>): New pattern. (*aarch64_sve_<optab>_nonextend<SVE_HSDI:mode><SVE_F:mode>): Rename to... (@aarch64_sve_<optab>_nonextend<SVE_HSDI:mode><SVE_F:mode>): ...this. (aarch64_sve_<optab>_extend<VNx4SI_ONLY:mode><VNx2DF_ONLY:mode>): Add a "@" marker. (@cond_<optab>_nonextend<SVE_HSDI:mode><SVE_F:mode>): New expander. (@cond_<optab>_extend<VNx4SI_ONLY:mode><VNx2DF_ONLY:mode>): Likewise. (*cond_<optab>_extend<VNx4SI_ONLY:mode><VNx2DF_ONLY:mode>): New pattern. (*aarch64_sve_<optab>_trunc<SVE_SDF:mode><SVE_HSF:mode>): Rename to... (@aarch64_sve_<optab>_trunc<SVE_SDF:mode><SVE_HSF:mode>): ...this. (@cond_<optab>_trunc<SVE_SDF:mode><SVE_HSF:mode>): New expander. (*cond_<optab>_trunc<SVE_SDF:mode><SVE_HSF:mode>): New pattern.
(aarch64_sve_<optab>_nontrunc<SVE_HSF:mode><SVE_SDF:mode>): Add a "@" marker. (@cond_<optab>_nontrunc<SVE_HSF:mode><SVE_SDF:mode>): New expander. (*cond_<optab>_nontrunc<SVE_HSF:mode><SVE_SDF:mode>): New pattern. (aarch64_sve_punpk<perm_hilo>_<mode>): Add a "@" marker. (@aarch64_brk<SVE_BRK_UNARY:brk_op>): New pattern. (*aarch64_brk<SVE_BRK_UNARY:brk_op>_cc): Likewise. (*aarch64_brk<SVE_BRK_UNARY:brk_op>_ptest): Likewise. (@aarch64_brk<SVE_BRK_BINARY:brk_op>): Likewise. (*aarch64_brk<SVE_BRK_BINARY:brk_op>_cc): Likewise. (*aarch64_brk<SVE_BRK_BINARY:brk_op>_ptest): Likewise. (@aarch64_sve_<SVE_PITER:sve_pred_op><mode>): Likewise. (*aarch64_sve_<SVE_PITER:sve_pred_op><mode>_cc): Likewise. (*aarch64_sve_<SVE_PITER:sve_pred_op><mode>_ptest): Likewise. (aarch64_sve_cnt_pat): Likewise. (@aarch64_sve_<ANY_PLUS:inc_dec><DI_ONLY:mode>_pat): Likewise. (*aarch64_sve_incsi_pat): Likewise. (@aarch64_sve_<SAT_PLUS:inc_dec><SI_ONLY:mode>_pat): Likewise. (@aarch64_sve_<ANY_PLUS:inc_dec><VNx2DI_ONLY:mode>_pat): Likewise. (@aarch64_sve_<ANY_PLUS:inc_dec><VNx4SI_ONLY:mode>_pat): Likewise. (@aarch64_sve_<ANY_PLUS:inc_dec><VNx8HI_ONLY:mode>_pat): New expander. (*aarch64_sve_<ANY_PLUS:inc_dec><VNx8HI_ONLY:mode>_pat): New pattern. (@aarch64_sve_<ANY_MINUS:inc_dec><DI_ONLY:mode>_pat): Likewise. (*aarch64_sve_decsi_pat): Likewise. (@aarch64_sve_<SAT_MINUS:inc_dec><SI_ONLY:mode>_pat): Likewise. (@aarch64_sve_<ANY_MINUS:inc_dec><VNx2DI_ONLY:mode>_pat): Likewise. (@aarch64_sve_<ANY_MINUS:inc_dec><VNx4SI_ONLY:mode>_pat): Likewise. (@aarch64_sve_<ANY_MINUS:inc_dec><VNx8HI_ONLY:mode>_pat): New expander. (*aarch64_sve_<ANY_MINUS:inc_dec><VNx8HI_ONLY:mode>_pat): New pattern. (@aarch64_pred_cntp<mode>): Likewise. (@aarch64_sve_<ANY_PLUS:inc_dec><DI_ONLY:mode><PRED_ALL:mode>_cntp): New expander. (*aarch64_sve_<ANY_PLUS:inc_dec><DI_ONLY:mode><PRED_ALL:mode>_cntp) (*aarch64_incsi<PRED_ALL:mode>_cntp): New patterns. (@aarch64_sve_<SAT_PLUS:inc_dec><SI_ONLY:mode><PRED_ALL:mode>_cntp): New expander. (*aarch64_sve_<SAT_PLUS:inc_dec><SI_ONLY:mode><PRED_ALL:mode>_cntp): New pattern. (@aarch64_sve_<ANY_PLUS:inc_dec><VNx2DI_ONLY:mode>_cntp): New expander. (*aarch64_sve_<ANY_PLUS:inc_dec><VNx2DI_ONLY:mode>_cntp): New pattern. (@aarch64_sve_<ANY_PLUS:inc_dec><VNx4SI_ONLY:mode>_cntp): New expander. (*aarch64_sve_<ANY_PLUS:inc_dec><VNx4SI_ONLY:mode>_cntp): New pattern. (@aarch64_sve_<ANY_PLUS:inc_dec><VNx8HI_ONLY:mode>_cntp): New expander. (*aarch64_sve_<ANY_PLUS:inc_dec><VNx8HI_ONLY:mode>_cntp): New pattern. (@aarch64_sve_<ANY_MINUS:inc_dec><DI_ONLY:mode><PRED_ALL:mode>_cntp): New expander. (*aarch64_sve_<ANY_MINUS:inc_dec><DI_ONLY:mode><PRED_ALL:mode>_cntp) (*aarch64_incsi<PRED_ALL:mode>_cntp): New patterns. (@aarch64_sve_<SAT_MINUS:inc_dec><SI_ONLY:mode><PRED_ALL:mode>_cntp): New expander. (*aarch64_sve_<SAT_MINUS:inc_dec><SI_ONLY:mode><PRED_ALL:mode>_cntp): New pattern. (@aarch64_sve_<ANY_MINUS:inc_dec><VNx2DI_ONLY:mode>_cntp): New expander. (*aarch64_sve_<ANY_MINUS:inc_dec><VNx2DI_ONLY:mode>_cntp): New pattern. (@aarch64_sve_<ANY_MINUS:inc_dec><VNx4SI_ONLY:mode>_cntp): New expander. (*aarch64_sve_<ANY_MINUS:inc_dec><VNx4SI_ONLY:mode>_cntp): New pattern. (@aarch64_sve_<ANY_MINUS:inc_dec><VNx8HI_ONLY:mode>_cntp): New expander. (*aarch64_sve_<ANY_MINUS:inc_dec><VNx8HI_ONLY:mode>_cntp): New pattern. * config/aarch64/arm_sve.h: New file. * config/aarch64/aarch64-sve-builtins.h: Likewise. * config/aarch64/aarch64-sve-builtins.cc: Likewise. * config/aarch64/aarch64-sve-builtins.def: Likewise. 
* config/aarch64/aarch64-sve-builtins-base.h: Likewise. * config/aarch64/aarch64-sve-builtins-base.cc: Likewise. * config/aarch64/aarch64-sve-builtins-base.def: Likewise. * config/aarch64/aarch64-sve-builtins-functions.h: Likewise. * config/aarch64/aarch64-sve-builtins-shapes.h: Likewise. * config/aarch64/aarch64-sve-builtins-shapes.cc: Likewise. gcc/testsuite/ * g++.target/aarch64/sve/acle/aarch64-sve-acle.exp: New file. * g++.target/aarch64/sve/acle/general-c++: New test directory. * gcc.target/aarch64/sve/acle/aarch64-sve-acle.exp: New file. * gcc.target/aarch64/sve/acle/general: New test directory. * gcc.target/aarch64/sve/acle/general-c: Likewise. Co-Authored-By: Kugan Vivekanandarajah <kuganv@linaro.org> Co-Authored-By: Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> From-SVN: r277563
Richard Sandiford committed -
This is tested by the main SVE ACLE patches, but since it affects the evpc routines, it seemed worth splitting out. 2019-10-29 Richard Sandiford <richard.sandiford@arm.com> gcc/ * config/aarch64/aarch64-sve.md (@aarch64_sve_rev<PRED_ALL:mode>): New pattern. * config/aarch64/aarch64.c (aarch64_evpc_rev_global): Handle all SVE modes. From-SVN: r277562
Richard Sandiford committed
-
- 30 Sep, 2019 1 commit
-
-
2019-09-30 Yuliang Wang <yuliang.wang@arm.com> gcc/ * config/aarch64/aarch64-sve.md (sdiv_pow2<mode>3): New pattern for ASRD. * config/aarch64/iterators.md (UNSPEC_ASRD): New unspec. * internal-fn.def (IFN_DIV_POW2): New internal function. * optabs.def (sdiv_pow2_optab): New optab. * tree-vect-patterns.c (vect_recog_divmod_pattern): Modify pattern to support new operation. * doc/md.texi (sdiv_pow2@var{m3}): Documentation for the above. * doc/sourcebuild.texi (vect_sdiv_pow2_si): Document new target selector. gcc/testsuite/ * gcc.dg/vect/vect-sdiv-pow2-1.c: New test. * gcc.target/aarch64/sve/asrdiv_1.c: As above. * lib/target-supports.exp (check_effective_target_vect_sdiv_pow2_si): Return true for AArch64 with SVE. From-SVN: r276343
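A minimal C sketch of the kind of loop the new sdiv_pow2/ASRD path is aimed at (the function name and type below are illustrative, not taken from asrdiv_1.c):

    #include <stdint.h>

    /* Signed division by a power of two.  With SVE enabled, the vectoriser
       can now recognise this as IFN_DIV_POW2 and use a single ASRD instead
       of a shift-and-adjust sequence.  */
    void
    div_by_16 (int32_t *restrict dst, const int32_t *restrict src, int n)
    {
      for (int i = 0; i < n; ++i)
        dst[i] = src[i] / 16;
    }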
Yuliang Wang committed
-
- 22 Aug, 2019 1 commit
-
-
2019-08-22 Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> * config/aarch64/aarch64-sve.md (vcond_mask): Add "@". From-SVN: r274817
Prathamesh Kulkarni committed
-
- 15 Aug, 2019 14 commits
-
-
SVE defines an assembly alias: MOV pa.B, pb/Z, pc.B -> AND pa.B, pb/Z, pc.B, pc.B Our and<mode>3 pattern was instead using the functionally-equivalent: AND pa.B, pb/Z, pb.B, pc.B which reuses the governing predicate pb.B as the first source rather than duplicating pc.B. This patch duplicates pc.B instead so that the alias can be seen in disassembly. I wondered about using the alias in the pattern instead, but using AND explicitly seems to fit better with the pattern name and surrounding code. 2019-08-15 Richard Sandiford <richard.sandiford@arm.com> gcc/ * config/aarch64/aarch64-sve.md (and<PRED_ALL:mode>3): Make the operand order match the MOV /Z alias. From-SVN: r274521
Richard Sandiford committed -
The scalar addition patterns allowed all the VL constants that ADDVL and ADDPL allow, but wrote the instructions as INC or DEC if possible (i.e. adding or subtracting a number of elements * [1, 16] when the source and target registers are the same). That works for the cases that the autovectoriser needs, but there are a few constants that INC and DEC can handle but ADDPL and ADDVL can't. E.g. the amount added by: inch x0, all, mul #9 is not a multiple of the number of bytes in an SVE register, and so can't use ADDVL. It represents 36 times the number of bytes in an SVE predicate, putting it outside the range of ADDPL. This patch therefore adds separate alternatives for INC and DEC, tied to a new Uai constraint. It also adds an explicit "scalar" or "vector" to the function names, to avoid a clash with the existing support for vector INC and DEC. 2019-08-15 Richard Sandiford <richard.sandiford@arm.com> gcc/ * config/aarch64/aarch64-protos.h (aarch64_sve_scalar_inc_dec_immediate_p): Declare. (aarch64_sve_inc_dec_immediate_p): Rename to... (aarch64_sve_vector_inc_dec_immediate_p): ...this. (aarch64_output_sve_addvl_addpl): Take a single rtx argument. (aarch64_output_sve_scalar_inc_dec): Declare. (aarch64_output_sve_inc_dec_immediate): Rename to... (aarch64_output_sve_vector_inc_dec): ...this. * config/aarch64/aarch64.c (aarch64_sve_scalar_inc_dec_immediate_p) (aarch64_output_sve_scalar_inc_dec): New functions. (aarch64_output_sve_addvl_addpl): Remove the base and offset arguments. Only handle true ADDVL and ADDPL instructions; don't emit an INC or DEC. (aarch64_sve_inc_dec_immediate_p): Rename to... (aarch64_sve_vector_inc_dec_immediate_p): ...this. (aarch64_output_sve_inc_dec_immediate): Rename to... (aarch64_output_sve_vector_inc_dec): ...this. Update call to aarch64_sve_vector_inc_dec_immediate_p. * config/aarch64/predicates.md (aarch64_sve_scalar_inc_dec_immediate) (aarch64_sve_plus_immediate): New predicates. (aarch64_pluslong_operand): Accept aarch64_sve_plus_immediate rather than aarch64_sve_addvl_addpl_immediate. (aarch64_sve_inc_dec_immediate): Rename to... (aarch64_sve_vector_inc_dec_immediate): ...this. Update call to aarch64_sve_vector_inc_dec_immediate_p. (aarch64_sve_add_operand): Update accordingly. * config/aarch64/constraints.md (Uai): New constraint. (vsi): Update call to aarch64_sve_vector_inc_dec_immediate_p. * config/aarch64/aarch64.md (add<GPI:mode>3): Don't force the second operand into a register if it satisfies aarch64_sve_plus_immediate. (*add<GPI:mode>3_aarch64, *add<GPI:mode>3_poly_1): Add an alternative for Uai. Update calls to aarch64_output_sve_addvl_addpl. * config/aarch64/aarch64-sve.md (add<mode>3): Call aarch64_output_sve_vector_inc_dec instead of aarch64_output_sve_inc_dec_immediate. From-SVN: r274518
Richard Sandiford committed -
The current SVE REV patterns follow the AArch64 scheme, in which UNSPEC_REV<NN> reverses elements within an <NN>-bit granule. E.g. UNSPEC_REV64 on VNx8HI reverses the four 16-bit elements within each 64-bit granule. The native SVE scheme is the other way around: UNSPEC_REV64 is seen as an operation on 64-bit elements, with REVB swapping bytes within the elements, REVH swapping halfwords, and so on. This fits SVE more naturally because the operation can then be predicated per <NN>-bit granule/element. Making the patterns use the Advanced SIMD scheme was more natural when all we cared about were permutes, since we could then use the source and target of the permute in their original modes. However, the ACLE does need patterns that follow the native scheme, treating them as operations on integer elements. This patch defines the patterns that way instead and updates the existing uses to match. This also brings in a couple of helper routines from the ACLE branch. 2019-08-15 Richard Sandiford <richard.sandiford@arm.com> gcc/ * config/aarch64/iterators.md (UNSPEC_REVB, UNSPEC_REVH) (UNSPEC_REVW): New constants. (elem_bits): New mode attribute. (SVE_INT_UNARY): New int iterator. (optab): Handle UNSPEC_REV[BHW]. (sve_int_op): New int attribute. (min_elem_bits): Handle VNx16QI and the predicate modes. * config/aarch64/aarch64-sve.md (*aarch64_sve_rev64<mode>) (*aarch64_sve_rev32<mode>, *aarch64_sve_rev16vnx16qi): Delete. (@aarch64_pred_<SVE_INT_UNARY:optab><SVE_I:mode>): New pattern. * config/aarch64/aarch64.c (aarch64_sve_data_mode): New function. (aarch64_sve_int_mode, aarch64_sve_rev_unspec): Likewise. (aarch64_split_sve_subreg_move): Use UNSPEC_REV[BHW] instead of unspecs based on the total width of the reversed data. (aarch64_evpc_rev_local): Likewise (for SVE only). Use a reinterpret followed by a subreg on big-endian targets. gcc/testsuite/ * gcc.target/aarch64/sve/revb_1.c: Restrict to little-endian targets. Avoid including stdint.h. * gcc.target/aarch64/sve/revh_1.c: Likewise. * gcc.target/aarch64/sve/revw_1.c: Likewise. * gcc.target/aarch64/sve/revb_2.c: New big-endian test. * gcc.target/aarch64/sve/revh_2.c: Likewise. * gcc.target/aarch64/sve/revw_2.c: Likewise. From-SVN: r274517
Richard Sandiford committed -
This patch makes the floating-point conditional FMA patterns provide the same /z alternatives as the integer patterns added by a previous patch. We can handle cases in which individual inputs are allocated to the same register as the output, so we don't need to force all registers to be different. 2019-08-15 Richard Sandiford <richard.sandiford@arm.com> Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org> gcc/ * config/aarch64/aarch64-sve.md (*cond_<SVE_COND_FP_TERNARY:optab><SVE_F:mode>_any): Add /z alternatives in which one of the inputs is in the same register as the output. gcc/testsuite/ * gcc.target/aarch64/sve/cond_mla_5.c: Allow FMAD as well as FMLA and FMSB as well as FMLS. Co-Authored-By: Kugan Vivekanandarajah <kuganv@linaro.org> From-SVN: r274516
Richard Sandiford committed -
We use EXT both to implement vec_extract for large indices and as a permute. In both cases we can use MOVPRFX to handle the case in which the first input and output can't be tied. 2019-08-15 Richard Sandiford <richard.sandiford@arm.com> gcc/ * config/aarch64/aarch64-sve.md (*vec_extract<mode><Vel>_ext) (*aarch64_sve_ext<mode>): Add MOVPRFX alternatives. gcc/testsuite/ * gcc.target/aarch64/sve/ext_2.c: Expect a MOVPRFX. * gcc.target/aarch64/sve/ext_3.c: New test. From-SVN: r274515
Richard Sandiford committed -
The floating-point subtraction patterns don't need to handle subtraction of constants, since those go through the addition patterns instead. There was a missing MOVPRFX alternative for FSUBR though. 2019-08-15 Richard Sandiford <richard.sandiford@arm.com> gcc/ * config/aarch64/aarch64-sve.md (*sub<SVE_F:mode>3): Remove immediate FADD and FSUB alternatives. Add a MOVPRFX alternative for FSUBR. From-SVN: r274514
Richard Sandiford committed -
FABD and some immediate instructions were missing MOVPRFX alternatives. This is tested by the ACLE patches but is really an independent improvement. 2019-08-15 Richard Sandiford <richard.sandiford@arm.com> Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org> gcc/ * config/aarch64/aarch64-sve.md (add<SVE_I:mode>3, sub<SVE_I:mode>3) (<LOGICAL:optab><SVE_I:mode>3, *add<SVE_F:mode>3, *mul<SVE_F:mode>3) (*fabd<SVE_F:mode>3): Add more MOVPRFX alternatives. Co-Authored-By: Kugan Vivekanandarajah <kuganv@linaro.org> From-SVN: r274513
Richard Sandiford committed -
This patch makes us use reversed SVE shifts when the first operand can't be tied to the output but the second can. This is tested more thoroughly by the ACLE patches but is really an independent improvement. 2019-08-15 Richard Sandiford <richard.sandiford@arm.com> Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> gcc/ * config/aarch64/aarch64-sve.md (*v<ASHIFT:optab><SVE_I:mode>3): Add an alternative that uses reversed shifts. gcc/testsuite/ * gcc.target/aarch64/sve/shift_1.c: Accept reversed shifts. Co-Authored-By: Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> From-SVN: r274512
Richard Sandiford committed -
This will be tested by the ACLE patches, but it's really an independent improvement. 2019-08-15 Richard Sandiford <richard.sandiford@arm.com> gcc/ * config/aarch64/aarch64-sve.md (aarch64_<su>abd<mode>_3): Add a commutativity marker. From-SVN: r274510
Richard Sandiford committed -
This patch uses predicated MLA, MLS, MAD and MSB to implement conditional "FMA"s on integers. This also requires providing the unpredicated optabs (fma and fnma) since otherwise tree-ssa-math-opts.c won't try to use the conditional forms. We still want to use shifts and adds in preference to multiplications, so the patch makes the optab expanders check for that. The tests cover floating-point types too, which are already handled, and which were already tested to some extent by gcc.dg/vect. 2019-08-15 Richard Sandiford <richard.sandiford@arm.com> Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org> gcc/ * config/aarch64/aarch64-protos.h (aarch64_prepare_sve_int_fma) (aarch64_prepare_sve_cond_int_fma): Declare. * config/aarch64/aarch64.c (aarch64_convert_mult_to_shift) (aarch64_prepare_sve_int_fma): New functions. (aarch64_prepare_sve_cond_int_fma): Likewise. * config/aarch64/aarch64-sve.md (cond_<SVE_INT_BINARY:optab><SVE_I:mode>): Add a "@" marker. (fma<SVE_I:mode>4, cond_fma<SVE_I:mode>, *cond_fma<SVE_I:mode>_2) (*cond_fma<SVE_I:mode>_4, *cond_fma<SVE_I:mode>_any, fnma<SVE_I:mode>4) (cond_fnma<SVE_I:mode>, *cond_fnma<SVE_I:mode>_2) (*cond_fnma<SVE_I:mode>_4, *cond_fnma<SVE_I:mode>_any): New patterns. (*madd<mode>): Rename to... (*fma<mode>4): ...this. (*msub<mode>): Rename to... (*fnma<mode>4): ...this. gcc/testsuite/ * gcc.target/aarch64/sve/cond_mla_1.c: New test. * gcc.target/aarch64/sve/cond_mla_1_run.c: Likewise. * gcc.target/aarch64/sve/cond_mla_2.c: Likewise. * gcc.target/aarch64/sve/cond_mla_2_run.c: Likewise. * gcc.target/aarch64/sve/cond_mla_3.c: Likewise. * gcc.target/aarch64/sve/cond_mla_3_run.c: Likewise. * gcc.target/aarch64/sve/cond_mla_4.c: Likewise. * gcc.target/aarch64/sve/cond_mla_4_run.c: Likewise. * gcc.target/aarch64/sve/cond_mla_5.c: Likewise. * gcc.target/aarch64/sve/cond_mla_5_run.c: Likewise. * gcc.target/aarch64/sve/cond_mla_6.c: Likewise. * gcc.target/aarch64/sve/cond_mla_6_run.c: Likewise. * gcc.target/aarch64/sve/cond_mla_7.c: Likewise. * gcc.target/aarch64/sve/cond_mla_7_run.c: Likewise. * gcc.target/aarch64/sve/cond_mla_8.c: Likewise. * gcc.target/aarch64/sve/cond_mla_8_run.c: Likewise. Co-Authored-By: Kugan Vivekanandarajah <kuganv@linaro.org> From-SVN: r274509
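A minimal sketch of the source shape involved, assuming a loop like those in the cond_mla tests (the names and types below are invented for illustration):

    #include <stdint.h>

    /* Conditional integer multiply-accumulate: the else value reuses an
       input, so the vectoriser can form IFN_COND_FMA and the backend can
       emit a predicated MLA/MAD.  */
    void
    cond_mla (int32_t *restrict r, const int32_t *restrict a,
              const int32_t *restrict b, const int32_t *restrict c,
              const int32_t *restrict pred, int n)
    {
      for (int i = 0; i < n; ++i)
        r[i] = pred[i] ? a[i] + b[i] * c[i] : a[i];
    }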
Richard Sandiford committed -
This patch lets us use the immediate forms of FADD, FSUB, FSUBR, FMUL, FMAXNM and FMINNM for conditional arithmetic. (We already use them for normal unconditional arithmetic.) 2019-08-15 Richard Sandiford <richard.sandiford@arm.com> Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org> gcc/ * config/aarch64/aarch64.c (aarch64_print_vector_float_operand): Print 2.0 naturally. (aarch64_sve_float_mul_immediate_p): Return true for 2.0. * config/aarch64/predicates.md (aarch64_sve_float_negated_arith_immediate): New predicate, renamed from aarch64_sve_float_arith_with_sub_immediate. (aarch64_sve_float_arith_with_sub_immediate): Test for both positive and negative constants. (aarch64_sve_float_arith_with_sub_operand): Redefine as a register or an aarch64_sve_float_arith_with_sub_immediate. * config/aarch64/constraints.md (vsN): Use aarch64_sve_float_negated_arith_immediate. * config/aarch64/iterators.md (SVE_COND_FP_BINARY_I1): New int iterator. (sve_pred_fp_rhs2_immediate): New int attribute. * config/aarch64/aarch64-sve.md (cond_<SVE_COND_FP_BINARY:optab><SVE_F:mode>): Use sve_pred_fp_rhs1_operand and sve_pred_fp_rhs2_operand. (*cond_<SVE_COND_FP_BINARY_I1:optab><SVE_F:mode>_2_const) (*cond_<SVE_COND_FP_BINARY_I1:optab><SVE_F:mode>_any_const) (*cond_add<SVE_F:mode>_2_const, *cond_add<SVE_F:mode>_any_const) (*cond_sub<mode>_3_const, *cond_sub<mode>_any_const): New patterns. gcc/testsuite/ * gcc.target/aarch64/sve/cond_fadd_1.c: New test. * gcc.target/aarch64/sve/cond_fadd_1_run.c: Likewise. * gcc.target/aarch64/sve/cond_fadd_2.c: Likewise. * gcc.target/aarch64/sve/cond_fadd_2_run.c: Likewise. * gcc.target/aarch64/sve/cond_fadd_3.c: Likewise. * gcc.target/aarch64/sve/cond_fadd_3_run.c: Likewise. * gcc.target/aarch64/sve/cond_fadd_4.c: Likewise. * gcc.target/aarch64/sve/cond_fadd_4_run.c: Likewise. * gcc.target/aarch64/sve/cond_fsubr_1.c: Likewise. * gcc.target/aarch64/sve/cond_fsubr_1_run.c: Likewise. * gcc.target/aarch64/sve/cond_fsubr_2.c: Likewise. * gcc.target/aarch64/sve/cond_fsubr_2_run.c: Likewise. * gcc.target/aarch64/sve/cond_fsubr_3.c: Likewise. * gcc.target/aarch64/sve/cond_fsubr_3_run.c: Likewise. * gcc.target/aarch64/sve/cond_fsubr_4.c: Likewise. * gcc.target/aarch64/sve/cond_fsubr_4_run.c: Likewise. * gcc.target/aarch64/sve/cond_fmaxnm_1.c: Likewise. * gcc.target/aarch64/sve/cond_fmaxnm_1_run.c: Likewise. * gcc.target/aarch64/sve/cond_fmaxnm_2.c: Likewise. * gcc.target/aarch64/sve/cond_fmaxnm_2_run.c: Likewise. * gcc.target/aarch64/sve/cond_fmaxnm_3.c: Likewise. * gcc.target/aarch64/sve/cond_fmaxnm_3_run.c: Likewise. * gcc.target/aarch64/sve/cond_fmaxnm_4.c: Likewise. * gcc.target/aarch64/sve/cond_fmaxnm_4_run.c: Likewise. * gcc.target/aarch64/sve/cond_fminnm_1.c: Likewise. * gcc.target/aarch64/sve/cond_fminnm_1_run.c: Likewise. * gcc.target/aarch64/sve/cond_fminnm_2.c: Likewise. * gcc.target/aarch64/sve/cond_fminnm_2_run.c: Likewise. * gcc.target/aarch64/sve/cond_fminnm_3.c: Likewise. * gcc.target/aarch64/sve/cond_fminnm_3_run.c: Likewise. * gcc.target/aarch64/sve/cond_fminnm_4.c: Likewise. * gcc.target/aarch64/sve/cond_fminnm_4_run.c: Likewise. * gcc.target/aarch64/sve/cond_fmul_1.c: Likewise. * gcc.target/aarch64/sve/cond_fmul_1_run.c: Likewise. * gcc.target/aarch64/sve/cond_fmul_2.c: Likewise. * gcc.target/aarch64/sve/cond_fmul_2_run.c: Likewise. * gcc.target/aarch64/sve/cond_fmul_3.c: Likewise. * gcc.target/aarch64/sve/cond_fmul_3_run.c: Likewise. * gcc.target/aarch64/sve/cond_fmul_4.c: Likewise. * gcc.target/aarch64/sve/cond_fmul_4_run.c: Likewise. 
Co-Authored-By: Kugan Vivekanandarajah <kuganv@linaro.org> From-SVN: r274508
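An illustrative sketch (not one of the committed tests) of conditional FP arithmetic whose second operand is an immediate the instruction can encode:

    /* Conditional FP add of 1.0: with these patterns the predicated FADD
       can take the immediate directly instead of needing it in a register.  */
    void
    cond_fadd_imm (float *restrict r, const float *restrict a,
                   const int *restrict pred, int n)
    {
      for (int i = 0; i < n; ++i)
        r[i] = pred[i] ? a[i] + 1.0f : a[i];
    }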
Richard Sandiford committed -
This patch extends the FABD support so that it handles conditional arithmetic. We're relying on combine for this, since there's no associated IFN_COND_* (yet?). 2019-08-15 Richard Sandiford <richard.sandiford@arm.com> Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org> gcc/ * config/aarch64/aarch64-sve.md (*aarch64_cond_abd<SVE_F:mode>_2) (*aarch64_cond_abd<SVE_F:mode>_3) (*aarch64_cond_abd<SVE_F:mode>_any): New patterns. gcc/testsuite/ * gcc.target/aarch64/sve/cond_fabd_1.c: New test. * gcc.target/aarch64/sve/cond_fabd_1_run.c: Likewise. * gcc.target/aarch64/sve/cond_fabd_2.c: Likewise. * gcc.target/aarch64/sve/cond_fabd_2_run.c: Likewise. * gcc.target/aarch64/sve/cond_fabd_3.c: Likewise. * gcc.target/aarch64/sve/cond_fabd_3_run.c: Likewise. * gcc.target/aarch64/sve/cond_fabd_4.c: Likewise. * gcc.target/aarch64/sve/cond_fabd_4_run.c: Likewise. * gcc.target/aarch64/sve/cond_fabd_5.c: Likewise. * gcc.target/aarch64/sve/cond_fabd_5_run.c: Likewise. Co-Authored-By: Kugan Vivekanandarajah <kuganv@linaro.org> From-SVN: r274507
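Roughly the shape being matched, as a hand-written sketch rather than one of the cond_fabd tests:

    #include <math.h>

    /* Conditional floating-point absolute difference; combine can now fuse
       the subtraction, fabs and select into a predicated FABD.  */
    void
    cond_fabd (double *restrict r, const double *restrict a,
               const double *restrict b, const int *restrict pred, int n)
    {
      for (int i = 0; i < n; ++i)
        r[i] = pred[i] ? fabs (a[i] - b[i]) : a[i];
    }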
Richard Sandiford committed -
This patch extends the [SU]ABD support so that it handles conditional arithmetic. We're relying on combine for this, since there's no associated IFN_COND_* (yet?). 2019-08-15 Richard Sandiford <richard.sandiford@arm.com> Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org> gcc/ * config/aarch64/aarch64-sve.md (*aarch64_cond_<su>abd<mode>_2) (*aarch64_cond_<su>abd<mode>_any): New patterns. gcc/testsuite/ * gcc.target/aarch64/sve/cond_abd_1.c: New test. * gcc.target/aarch64/sve/cond_abd_1_run.c: Likewise. * gcc.target/aarch64/sve/cond_abd_2.c: Likewise. * gcc.target/aarch64/sve/cond_abd_2_run.c: Likewise. * gcc.target/aarch64/sve/cond_abd_3.c: Likewise. * gcc.target/aarch64/sve/cond_abd_3_run.c: Likewise. * gcc.target/aarch64/sve/cond_abd_4.c: Likewise. * gcc.target/aarch64/sve/cond_abd_4_run.c: Likewise. * gcc.target/aarch64/sve/cond_abd_5.c: Likewise. * gcc.target/aarch64/sve/cond_abd_5_run.c: Likewise. Co-Authored-By: Kugan Vivekanandarajah <kuganv@linaro.org> From-SVN: r274506
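The integer analogue, again as an illustrative sketch; the compare-and-subtract idiom below is what the absolute-difference patterns recognise:

    #include <stdint.h>

    /* Conditional unsigned absolute difference, fusable into a
       predicated UABD.  */
    void
    cond_uabd (uint32_t *restrict r, const uint32_t *restrict a,
               const uint32_t *restrict b, const uint32_t *restrict pred,
               int n)
    {
      for (int i = 0; i < n; ++i)
        r[i] = pred[i] ? (a[i] > b[i] ? a[i] - b[i] : b[i] - a[i]) : a[i];
    }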
Richard Sandiford committed -
This patch adds support for IFN_COND shifts left and shifts right. This is mostly mechanical, but since we try to handle conditional operations in the same way as unconditional operations in match.pd, we need to support IFN_COND shifts by scalars as well as vectors. E.g.: IFN_COND_SHL (cond, a, { 1, 1, ... }, fallback) and: IFN_COND_SHL (cond, a, 1, fallback) are the same operation, with: (for shiftrotate (lrotate rrotate lshift rshift) ... /* Prefer vector1 << scalar to vector1 << vector2 if vector2 is uniform. */ (for vec (VECTOR_CST CONSTRUCTOR) (simplify (shiftrotate @0 vec@1) (with { tree tem = uniform_vector_p (@1); } (if (tem) (shiftrotate @0 { tem; })))))) preferring the latter. The patch copes with this by extending create_convert_operand_from to handle scalar-to-vector conversions. 2019-08-15 Richard Sandiford <richard.sandiford@arm.com> Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> gcc/ * internal-fn.def (IFN_COND_SHL, IFN_COND_SHR): New internal functions. * internal-fn.c (FOR_EACH_CODE_MAPPING): Handle shifts. * match.pd (UNCOND_BINARY, COND_BINARY): Likewise. * optabs.def (cond_ashl_optab, cond_ashr_optab, cond_lshr_optab): New optabs. * optabs.h (create_convert_operand_from): Expand comment. * optabs.c (maybe_legitimize_operand): Allow implicit broadcasts when mapping scalar rtxes to vector operands. * config/aarch64/iterators.md (SVE_INT_BINARY): Add ashift, ashiftrt and lshiftrt. (sve_int_op, sve_int_op_rev, sve_pred_int_rhs2_operand): Handle them. * config/aarch64/aarch64-sve.md (*cond_<optab><mode>_2_const) (*cond_<optab><mode>_any_const): New patterns. gcc/testsuite/ * gcc.target/aarch64/sve/cond_shift_1.c: New test. * gcc.target/aarch64/sve/cond_shift_1_run.c: Likewise. * gcc.target/aarch64/sve/cond_shift_2.c: Likewise. * gcc.target/aarch64/sve/cond_shift_2_run.c: Likewise. * gcc.target/aarch64/sve/cond_shift_3.c: Likewise. * gcc.target/aarch64/sve/cond_shift_3_run.c: Likewise. * gcc.target/aarch64/sve/cond_shift_4.c: Likewise. * gcc.target/aarch64/sve/cond_shift_4_run.c: Likewise. * gcc.target/aarch64/sve/cond_shift_5.c: Likewise. * gcc.target/aarch64/sve/cond_shift_5_run.c: Likewise. * gcc.target/aarch64/sve/cond_shift_6.c: Likewise. * gcc.target/aarch64/sve/cond_shift_6_run.c: Likewise. * gcc.target/aarch64/sve/cond_shift_7.c: Likewise. * gcc.target/aarch64/sve/cond_shift_7_run.c: Likewise. * gcc.target/aarch64/sve/cond_shift_8.c: Likewise. * gcc.target/aarch64/sve/cond_shift_8_run.c: Likewise. * gcc.target/aarch64/sve/cond_shift_9.c: Likewise. * gcc.target/aarch64/sve/cond_shift_9_run.c: Likewise. Co-Authored-By: Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> From-SVN: r274505
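A minimal sketch of a conditional shift by a uniform amount (illustrative names; the committed tests are the cond_shift_*.c files):

    #include <stdint.h>

    /* Conditional left shift by a scalar constant: IFN_COND_SHL now accepts
       the scalar shift amount as well as a vector of amounts.  */
    void
    cond_shl (int32_t *restrict r, const int32_t *restrict a,
              const int32_t *restrict pred, int n)
    {
      for (int i = 0; i < n; ++i)
        r[i] = pred[i] ? a[i] << 3 : a[i];
    }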
Richard Sandiford committed
-
- 14 Aug, 2019 10 commits
-
-
This patch uses BIC to pattern-match conditional AND with an inverted third input. It also adds extra tests for AND, ORR and EOR. 2019-08-14 Richard Sandiford <richard.sandiford@arm.com> Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org> gcc/ * config/aarch64/aarch64-sve.md (*cond_bic<mode>_2) (*cond_bic<mode>_any): New patterns. gcc/testsuite/ * gcc.target/aarch64/sve/cond_logical_1.c: New test. * gcc.target/aarch64/sve/cond_logical_1_run.c: Likewise. * gcc.target/aarch64/sve/cond_logical_2.c: Likewise. * gcc.target/aarch64/sve/cond_logical_2_run.c: Likewise. * gcc.target/aarch64/sve/cond_logical_3.c: Likewise. * gcc.target/aarch64/sve/cond_logical_3_run.c: Likewise. * gcc.target/aarch64/sve/cond_logical_4.c: Likewise. * gcc.target/aarch64/sve/cond_logical_4_run.c: Likewise. * gcc.target/aarch64/sve/cond_logical_5.c: Likewise. * gcc.target/aarch64/sve/cond_logical_5_run.c: Likewise. Co-Authored-By: Kugan Vivekanandarajah <kuganv@linaro.org> From-SVN: r274480
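A minimal sketch of the source idiom (not one of the cond_logical tests themselves):

    #include <stdint.h>

    /* Conditional AND with an inverted operand, which the new *cond_bic
       patterns can emit as a predicated BIC.  */
    void
    cond_bic (uint32_t *restrict r, const uint32_t *restrict a,
              const uint32_t *restrict b, const uint32_t *restrict pred,
              int n)
    {
      for (int i = 0; i < n; ++i)
        r[i] = pred[i] ? (a[i] & ~b[i]) : a[i];
    }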
Richard Sandiford committed -
UXTB, UXTH and UXTW are equivalent to predicated ANDs with the constants 0xff, 0xffff and 0xffffffff respectively. This patch uses them in the patterns for IFN_COND_AND. 2019-08-14 Richard Sandiford <richard.sandiford@arm.com> gcc/ * config/aarch64/aarch64.c (aarch64_print_operand): Allow %e to take the equivalent mask, as well as a bit count. * config/aarch64/predicates.md (aarch64_sve_uxtb_immediate) (aarch64_sve_uxth_immediate, aarch64_sve_uxt_immediate) (aarch64_sve_pred_and_operand): New predicates. * config/aarch64/iterators.md (sve_pred_int_rhs2_operand): New code attribute. * config/aarch64/aarch64-sve.md (cond_<SVE_INT_BINARY:optab><SVE_I:mode>): Use it. (*cond_uxt<mode>_2, *cond_uxt<mode>_any): New patterns. gcc/testsuite/ * gcc.target/aarch64/sve/cond_uxt_1.c: New test. * gcc.target/aarch64/sve/cond_uxt_1_run.c: Likewise. * gcc.target/aarch64/sve/cond_uxt_2.c: Likewise. * gcc.target/aarch64/sve/cond_uxt_2_run.c: Likewise. * gcc.target/aarch64/sve/cond_uxt_3.c: Likewise. * gcc.target/aarch64/sve/cond_uxt_3_run.c: Likewise. * gcc.target/aarch64/sve/cond_uxt_4.c: Likewise. * gcc.target/aarch64/sve/cond_uxt_4_run.c: Likewise. From-SVN: r274479
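An illustrative sketch of the idiom that can now map onto a predicated UXTB (UXTH and UXTW are analogous with 0xffff and 0xffffffff):

    #include <stdint.h>

    /* Conditional AND with 0xff, which the new patterns can emit as a
       predicated UXTB.  */
    void
    cond_uxtb (uint32_t *restrict r, const uint32_t *restrict a,
               const uint32_t *restrict pred, int n)
    {
      for (int i = 0; i < n; ++i)
        r[i] = pred[i] ? (a[i] & 0xff) : a[i];
    }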
Richard Sandiford committed -
This patch adds patterns to match conditional conversions between integers and like-sized floats. The patterns are actually more general than that, but the other combinations can only be tested via the ACLE. 2019-08-14 Richard Sandiford <richard.sandiford@arm.com> gcc/ * config/aarch64/aarch64-sve.md (*cond_<SVE_COND_FCVTI:optab>_nontrunc<SVE_F:mode><SVE_HSDI:mode>) (*cond_<SVE_COND_ICVTF:optab>_nonextend<SVE_HSDI:mode><SVE_F:mode>): New patterns. gcc/testsuite/ * gcc.target/aarch64/sve/cond_convert_1.c: New test. * gcc.target/aarch64/sve/cond_convert_1_run.c: Likewise. * gcc.target/aarch64/sve/cond_convert_2.c: Likewise. * gcc.target/aarch64/sve/cond_convert_2_run.c: Likewise. * gcc.target/aarch64/sve/cond_convert_3.c: Likewise. * gcc.target/aarch64/sve/cond_convert_3_run.c: Likewise. * gcc.target/aarch64/sve/cond_convert_4.c: Likewise. * gcc.target/aarch64/sve/cond_convert_4_run.c: Likewise. * gcc.target/aarch64/sve/cond_convert_5.c: Likewise. * gcc.target/aarch64/sve/cond_convert_5_run.c: Likewise. * gcc.target/aarch64/sve/cond_convert_6.c: Likewise. * gcc.target/aarch64/sve/cond_convert_6_run.c: Likewise. From-SVN: r274478
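A sketch of one such conversion, with invented names (the like-sized int-to-float case):

    #include <stdint.h>

    /* Conditional int32 -> float conversion; the conversion itself can now
       be predicated rather than done unconditionally and then selected.  */
    void
    cond_cvtf (float *restrict r, const int32_t *restrict a,
               const float *restrict fallback, const int32_t *restrict pred,
               int n)
    {
      for (int i = 0; i < n; ++i)
        r[i] = pred[i] ? (float) a[i] : fallback[i];
    }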
Richard Sandiford committed -
This patch adds patterns to match conditional unary operations on floating-point modes. At the moment we rely on combine to merge separate arithmetic and vcond_mask operations, and since the latter doesn't accept zero operands, we miss out on the opportunity to use the movprfx /z alternative. (This alternative is tested by the ACLE patches though.) 2019-08-14 Richard Sandiford <richard.sandiford@arm.com> Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org> gcc/ * config/aarch64/aarch64-sve.md (*cond_<SVE_COND_FP_UNARY:optab><SVE_F:mode>_2): New pattern. (*cond_<SVE_COND_FP_UNARY:optab><SVE_F:mode>_any): Likewise. gcc/testsuite/ * gcc.target/aarch64/sve/cond_unary_1.c: Add tests for floating-point types. * gcc.target/aarch64/sve/cond_unary_2.c: Likewise. * gcc.target/aarch64/sve/cond_unary_3.c: Likewise. * gcc.target/aarch64/sve/cond_unary_4.c: Likewise. Co-Authored-By: Kugan Vivekanandarajah <kuganv@linaro.org> From-SVN: r274477
Richard Sandiford committed -
This patch adds patterns to match conditional unary operations on integers. At the moment we rely on combine to merge separate arithmetic and vcond_mask operations, and since the latter doesn't accept zero operands, we miss out on the opportunity to use the movprfx /z alternative. (This alternative is tested by the ACLE patches though.) 2019-08-14 Richard Sandiford <richard.sandiford@arm.com> Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org> gcc/ * config/aarch64/aarch64-sve.md (*cond_<SVE_INT_UNARY:optab><SVE_I:mode>_2): New pattern. (*cond_<SVE_INT_UNARY:optab><SVE_I:mode>_any): Likewise. gcc/testsuite/ * gcc.target/aarch64/sve/cond_unary_1.c: New test. * gcc.target/aarch64/sve/cond_unary_1_run.c: Likewise. * gcc.target/aarch64/sve/cond_unary_2.c: Likewise. * gcc.target/aarch64/sve/cond_unary_2_run.c: Likewise. * gcc.target/aarch64/sve/cond_unary_3.c: Likewise. * gcc.target/aarch64/sve/cond_unary_3_run.c: Likewise. * gcc.target/aarch64/sve/cond_unary_4.c: Likewise. * gcc.target/aarch64/sve/cond_unary_4_run.c: Likewise. Co-Authored-By: Kugan Vivekanandarajah <kuganv@linaro.org> From-SVN: r274476
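For illustration, a conditional integer negation of the shape these patterns match (names invented):

    #include <stdint.h>

    /* Conditional negation: previously the NEG and the select stayed
       separate, missing the predicated form.  */
    void
    cond_neg (int32_t *restrict r, const int32_t *restrict a,
              const int32_t *restrict pred, int n)
    {
      for (int i = 0; i < n; ++i)
        r[i] = pred[i] ? -a[i] : a[i];
    }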
Richard Sandiford committed -
This patch adds support for floating-point absolute comparisons FACLT and FACLE (aliased as FACGT and FACGE with swapped operands). 2019-08-14 Richard Sandiford <richard.sandiford@arm.com> gcc/ * config/aarch64/iterators.md (SVE_COND_FP_ABS_CMP): New iterator. * config/aarch64/aarch64-sve.md (*aarch64_pred_fac<cmp_op><mode>): New pattern. gcc/testsuite/ * gcc.target/aarch64/sve/vcond_21.c: New test. * gcc.target/aarch64/sve/vcond_21_run.c: Likewise. From-SVN: r274443
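A rough sketch of an absolute comparison feeding a select (illustrative only, not vcond_21.c itself):

    #include <math.h>

    /* |a| > |b| can now be a single FACGT rather than two FABSs plus
       an FCMGT.  */
    void
    fac_gt (double *restrict r, const double *restrict a,
            const double *restrict b, int n)
    {
      for (int i = 0; i < n; ++i)
        r[i] = fabs (a[i]) > fabs (b[i]) ? 1.0 : 2.0;
    }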
Richard Sandiford committed -
This patch uses MOV /M to optimise selects between a duplicated scalar variable and a vector. 2019-08-14 Richard Sandiford <richard.sandiford@arm.com> Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org> gcc/ * config/aarch64/aarch64-sve.md (*aarch64_sel_dup<mode>): New pattern. gcc/testsuite/ * g++.target/aarch64/sve/dup_sel_1.C: New test. * g++.target/aarch64/sve/dup_sel_2.C: Likewise. * g++.target/aarch64/sve/dup_sel_3.C: Likewise. * g++.target/aarch64/sve/dup_sel_4.C: Likewise. * g++.target/aarch64/sve/dup_sel_5.C: Likewise. * g++.target/aarch64/sve/dup_sel_6.C: Likewise. Co-Authored-By: Kugan Vivekanandarajah <kuganv@linaro.org> From-SVN: r274442
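A minimal sketch of the select shape involved, assuming a loop-invariant scalar (names invented):

    /* Selecting between a broadcast scalar and a vector; the new pattern
       lets this become a predicated MOV /M of the duplicated scalar.  */
    void
    sel_dup (float *restrict r, const float *restrict a, float scalar,
             const int *restrict pred, int n)
    {
      for (int i = 0; i < n; ++i)
        r[i] = pred[i] ? scalar : a[i];
    }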
Richard Sandiford committed -
This patch extends the SVE UNSPEC_SEL patterns so that they can use: (1) MOV /M of a duplicated integer constant (2) MOV /M of a duplicated floating-point constant bitcast to an integer, accepting the same constants as (1) (3) FMOV /M of a duplicated floating-point constant (4) MOV /Z of a duplicated integer constant (5) MOV /Z of a duplicated floating-point constant bitcast to an integer, accepting the same constants as (4) (6) MOVPRFXed FMOV /M of a duplicated floating-point constant We already handled (4) with a special pattern; the rest are new. 2019-08-14 Richard Sandiford <richard.sandiford@arm.com> Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org> gcc/ * config/aarch64/aarch64.c (aarch64_bit_representation): New function. (aarch64_print_vector_float_operand): Also handle 8-bit floats. (aarch64_print_operand): Add support for %I. (aarch64_sve_dup_immediate_p): Handle scalars as well as vectors. Bitcast floating-point constants to the corresponding integer constant. (aarch64_float_const_representable_p): Handle vectors as well as scalars. (aarch64_expand_sve_vcond): Make sure that the operands are valid for the new vcond_mask_<mode><vpred> expander. * config/aarch64/predicates.md (aarch64_sve_dup_immediate): Also test aarch64_float_const_representable_p. (aarch64_sve_reg_or_dup_imm): New predicate. * config/aarch64/aarch64-sve.md (vec_extract<vpred><Vel>): Use gen_vcond_mask_<mode><vpred> instead of gen_aarch64_sve_dup<mode>_const. (vcond_mask_<mode><vpred>): Turn into a define_expand that accepts aarch64_sve_reg_or_dup_imm and aarch64_simd_reg_or_zero for operands 1 and 2 respectively. Force operand 2 into a register if operand 1 is a register. Fold old define_insn... (aarch64_sve_dup<mode>_const): ...and this define_insn... (*vcond_mask_<mode><vpred>): ...into this new pattern. Handle floating-point constants that can be moved as integers. Add alternatives for MOV /M and FMOV /M. (vcond<mode><v_int_equiv>, vcondu<mode><v_int_equiv>) (vcond<mode><v_fp_equiv>): Accept nonmemory_operand for operands 1 and 2 respectively. * config/aarch64/constraints.md (Ufc): Handle vectors as well as scalars. (vss): New constraint. gcc/testsuite/ * gcc.target/aarch64/sve/vcond_18.c: New test. * gcc.target/aarch64/sve/vcond_18_run.c: Likewise. * gcc.target/aarch64/sve/vcond_19.c: Likewise. * gcc.target/aarch64/sve/vcond_19_run.c: Likewise. * gcc.target/aarch64/sve/vcond_20.c: Likewise. * gcc.target/aarch64/sve/vcond_20_run.c: Likewise. Co-Authored-By: Kugan Vivekanandarajah <kuganv@linaro.org> From-SVN: r274441
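The constant case, again as an illustrative sketch; depending on the constant this can now use MOV /M, FMOV /M or a MOVPRFXed FMOV:

    /* Selecting between a duplicated floating-point constant and a vector.  */
    void
    sel_const (float *restrict r, const float *restrict a,
               const int *restrict pred, int n)
    {
      for (int i = 0; i < n; ++i)
        r[i] = pred[i] ? 1.0f : a[i];
    }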
Richard Sandiford committed -
This patch uses the immediate forms of FMAXNM and FMINNM for unconditional arithmetic. The same rules apply to FMAX and FMIN, but we only generate those via the ACLE. 2019-08-14 Richard Sandiford <richard.sandiford@arm.com> gcc/ * config/aarch64/predicates.md (aarch64_sve_float_maxmin_immediate) (aarch64_sve_float_maxmin_operand): New predicates. * config/aarch64/constraints.md (vsB): New constraint. (vsM): Fix typo. * config/aarch64/iterators.md (sve_pred_fp_rhs2_operand): Use aarch64_sve_float_maxmin_operand for UNSPEC_COND_FMAXNM and UNSPEC_COND_FMINNM. * config/aarch64/aarch64-sve.md (<maxmin_uns><SVE_F:mode>3): Use aarch64_sve_float_maxmin_operand for operand 2. (*<SVE_COND_FP_MAXMIN_PUBLIC:optab><SVE_F:mode>3): Likewise. Add alternatives for the constant forms. gcc/testsuite/ * gcc.target/aarch64/sve/fmaxnm_1.c: New test. * gcc.target/aarch64/sve/fminnm_1.c: Likewise. From-SVN: r274440
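A hedged sketch of the unconditional case, assuming fmaxf is recognised as FMAXNM and that the constant is one the immediate form accepts (such as 0.0 or 1.0):

    #include <math.h>

    /* Clamping against 1.0: the FMAXNM immediate form can now encode the
       constant directly.  */
    void
    clamp_low (float *restrict r, const float *restrict a, int n)
    {
      for (int i = 0; i < n; ++i)
        r[i] = fmaxf (a[i], 1.0f);
    }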
Richard Sandiford committed -
This patch adds support for the immediate forms of SVE SMAX, SMIN, UMAX and UMIN. SMAX and SMIN take the same range as MUL, so the patch basically just moves and generalises the existing MUL patterns. 2019-08-14 Richard Sandiford <richard.sandiford@arm.com> gcc/ * config/aarch64/constraints.md (vsb): New constraint. (vsm): Generalize description. * config/aarch64/iterators.md (SVE_INT_BINARY_IMM): New code iterator. (sve_imm_con): Handle smax, smin, umax and umin. (sve_imm_prefix): New code attribute. * config/aarch64/predicates.md (aarch64_sve_vsb_immediate) (aarch64_sve_vsb_operand): New predicates. (aarch64_sve_mul_immediate): Rename to... (aarch64_sve_vsm_immediate): ...this. (aarch64_sve_mul_operand): Rename to... (aarch64_sve_vsm_operand): ...this. * config/aarch64/aarch64-sve.md (mul<mode>3): Generalize to... (<SVE_INT_BINARY_IMM:optab><SVE_I:mode>3): ...this. (*mul<mode>3, *post_ra_mul<mode>3): Generalize to... (*<SVE_INT_BINARY_IMM:optab><SVE_I:mode>3) (*post_ra_<SVE_INT_BINARY_IMM:optab><SVE_I:mode>3): ...these and add movprfx support for the immediate alternatives. (<su><maxmin><mode>3, *<su><maxmin><mode>3): Delete in favor of the above. (*<SVE_INT_BINARY_SD:optab><SVE_SDI:mode>3): Fix incorrect predicate for operand 3. gcc/testsuite/ * gcc.target/aarch64/sve/smax_1.c: New test. * gcc.target/aarch64/sve/smin_1.c: Likewise. * gcc.target/aarch64/sve/umax_1.c: Likewise. * gcc.target/aarch64/sve/umin_1.c: Likewise. From-SVN: r274439
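And a sketch for the integer case (illustrative only; the constant just needs to fit the same signed immediate range that MUL uses):

    #include <stdint.h>

    /* MAX against a small constant, which can now be a single SMAX with an
       immediate operand.  */
    void
    smax_imm (int32_t *restrict r, const int32_t *restrict a, int n)
    {
      for (int i = 0; i < n; ++i)
        r[i] = a[i] > 50 ? a[i] : 50;
    }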
Richard Sandiford committed
-