1. 16 Nov, 2019 10 commits
    • [AArch64] Robustify aarch64_wrffr · 4ec943d6
      This patch uses distinct values for the FFR and FFRT outputs of
      aarch64_wrffr, so that a following aarch64_copy_ffr_to_ffrt has
      an effect.  This is needed to avoid regressions with later patches.
      
      The block comment at the head of the file already described
      the pattern this way, and there was already an unspec for it.
      Not sure what made me change it...
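
      For context, a rough ACLE-level sketch (not part of this patch; the
      function name is chosen here for illustration) of code whose
      first-faulting load reads and writes the FFR that aarch64_wrffr and
      aarch64_copy_ffr_to_ffrt model:

        #include <arm_sve.h>

        /* Sum whichever leading elements of src can be read without faulting.
           svsetffr, svldff1 and svrdffr all read or write the FFR.  */
        int64_t
        sum_leading (const int32_t *src)
        {
          svsetffr ();                           /* FFR <- all-true           */
          svbool_t pg = svptrue_b32 ();
          svint32_t v = svldff1_s32 (pg, src);   /* may clear trailing bits   */
          svbool_t ok = svrdffr_z (pg);          /* FFR & pg                  */
          return svaddv_s32 (ok, v);
        }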
      
      2019-11-16  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* config/aarch64/aarch64-sve.md (aarch64_wrffr): Wrap the FFRT
      	output in UNSPEC_WRFFR.
      
      From-SVN: r278356
      Richard Sandiford committed
    • [AArch64] Add scatter stores for partial SVE modes · 37a3662f
      This patch adds support for scatter stores of partial vectors,
      where the vector base or offset elements can be wider than the
      elements being stored.
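
      For example (a sketch in the spirit of the new tests, not copied from
      them), a scatter store of 8-bit data indexed by 32-bit offsets keeps the
      stored data in a partial (unpacked) vector:

        #include <stdint.h>

        void
        scatter8 (int8_t *dst, const int8_t *src, const int32_t *index, int n)
        {
          for (int i = 0; i < n; ++i)
            dst[index[i]] = src[i];   /* candidate for an ST1B scatter with
                                         32-bit offsets; the byte data lives
                                         in 32-bit containers                */
        }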
      
      2019-11-16  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* config/aarch64/aarch64-sve.md
      	(scatter_store<SVE_FULL_SD:mode><v_int_equiv>): Extend to...
      	(scatter_store<SVE_24:mode><v_int_container>): ...this.
      	(mask_scatter_store<SVE_FULL_S:mode><v_int_equiv>): Extend to...
      	(mask_scatter_store<SVE_4:mode><v_int_equiv>): ...this.
      	(mask_scatter_store<SVE_FULL_D:mode><v_int_equiv>): Extend to...
      	(mask_scatter_store<SVE_2:mode><v_int_equiv>): ...this.
      	(*mask_scatter_store<mode><v_int_container>_<su>xtw_unpacked): New
      	pattern.
      	(*mask_scatter_store<SVE_FULL_D:mode><v_int_equiv>_sxtw): Extend to...
      	(*mask_scatter_store<SVE_2:mode><v_int_equiv>_sxtw): ...this.
      	(*mask_scatter_store<SVE_FULL_D:mode><v_int_equiv>_uxtw): Extend to...
      	(*mask_scatter_store<SVE_2:mode><v_int_equiv>_uxtw): ...this.
      
      gcc/testsuite/
      	* gcc.target/aarch64/sve/scatter_store_1.c (TEST_LOOP): Start at 0.
      	(TEST_ALL): Add tests for 8-bit and 16-bit elements.
      	* gcc.target/aarch64/sve/scatter_store_2.c: Update accordingly.
      	* gcc.target/aarch64/sve/scatter_store_3.c (TEST_LOOP): Start at 0.
      	(TEST_ALL): Add tests for 8-bit and 16-bit elements.
      	* gcc.target/aarch64/sve/scatter_store_4.c: Update accordingly.
      	* gcc.target/aarch64/sve/scatter_store_5.c (TEST_LOOP): Start at 0.
      	(TEST_ALL): Add tests for 8-bit, 16-bit and 32-bit elements.
      	* gcc.target/aarch64/sve/scatter_store_8.c: New test.
      	* gcc.target/aarch64/sve/scatter_store_9.c: Likewise.
      
      From-SVN: r278347
      Richard Sandiford committed
    • [AArch64] Pattern-match SVE extending gather loads · 87a80d27
      This patch pattern-matches a partial gather load followed by a sign or
      zero extension into an extending gather load.  (The partial gather load
      is already an extending load; we just don't rely on the upper bits of
      the elements.)
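
      For example (an illustrative sketch rather than one of the new tests),
      loading bytes through 32-bit indices and widening the result can now be
      matched as a single extending gather load:

        #include <stdint.h>

        void
        gather_extend (uint32_t *dst, const uint8_t *src,
                       const int32_t *index, int n)
        {
          for (int i = 0; i < n; ++i)
            dst[i] = src[index[i]];   /* gather of bytes zero-extended to
                                         32 bits: candidate for an LD1B
                                         gather into .s elements            */
        }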
      
      2019-11-16  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* config/aarch64/iterators.md (SVE_2BHSI, SVE_2HSDI, SVE_4BHI)
      	(SVE_4HSI): New mode iterators.
      	(ANY_EXTEND2): New code iterator.
      	* config/aarch64/aarch64-sve.md
      	(@aarch64_gather_load_<ANY_EXTEND:optab><VNx4_WIDE:mode><VNx4_NARROW:mode>):
      	Extend to...
      	(@aarch64_gather_load_<ANY_EXTEND:optab><SVE_4HSI:mode><SVE_4BHI:mode>):
      	...this, handling extension to partial modes as well as full modes.
      	Describe the extension as a predicated rather than unpredicated
      	extension.
      	(@aarch64_gather_load_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>):
      	Likewise extend to...
      	(@aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode><SVE_2BHSI:mode>):
      	...this, making the same adjustments.
      	(*aarch64_gather_load_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>_sxtw):
      	Likewise extend to...
      	(*aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode><SVE_2BHSI:mode>_sxtw):
      	...this, making the same adjustments.
      	(*aarch64_gather_load_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>_uxtw):
      	Likewise extend to...
      	(*aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode><SVE_2BHSI:mode>_uxtw):
      	...this, making the same adjustments.
      	(*aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode><SVE_2BHSI:mode>_<ANY_EXTEND2:su>xtw_unpacked):
      	New pattern.
      	(*aarch64_ldff1_gather<mode>_sxtw): Canonicalize to a constant
      	extension predicate.
      	(@aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx4_WIDE:mode><VNx4_NARROW:mode>)
      	(@aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>)
      	(*aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>_uxtw):
      	Describe the extension as a predicated rather than unpredicated
      	extension.
      	(*aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>_sxtw):
      	Likewise.  Canonicalize to a constant extension predicate.
      	* config/aarch64/aarch64-sve-builtins-base.cc
      	(svld1_gather_extend_impl::expand): Add an extra predicate for
      	the extension.
      	(svldff1_gather_extend_impl::expand): Likewise.
      
      gcc/testsuite/
      	* gcc.target/aarch64/sve/gather_load_extend_1.c: New test.
      	* gcc.target/aarch64/sve/gather_load_extend_2.c: Likewise.
      	* gcc.target/aarch64/sve/gather_load_extend_3.c: Likewise.
      	* gcc.target/aarch64/sve/gather_load_extend_4.c: Likewise.
      	* gcc.target/aarch64/sve/gather_load_extend_5.c: Likewise.
      	* gcc.target/aarch64/sve/gather_load_extend_6.c: Likewise.
      	* gcc.target/aarch64/sve/gather_load_extend_7.c: Likewise.
      	* gcc.target/aarch64/sve/gather_load_extend_8.c: Likewise.
      	* gcc.target/aarch64/sve/gather_load_extend_9.c: Likewise.
      	* gcc.target/aarch64/sve/gather_load_extend_10.c: Likewise.
      	* gcc.target/aarch64/sve/gather_load_extend_11.c: Likewise.
      	* gcc.target/aarch64/sve/gather_load_extend_12.c: Likewise.
      
      From-SVN: r278346
      Richard Sandiford committed
    • [AArch64] Add gather loads for partial SVE modes · f8186eea
      This patch adds support for gather loads of partial vectors,
      where the vector base or offset elements can be wider than the
      elements being loaded.
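
      For example (a sketch in the spirit of the new tests), gathering 16-bit
      elements through 64-bit offsets keeps the loaded data in a partial
      vector:

        #include <stdint.h>

        void
        gather16 (int16_t *dst, const int16_t *src,
                  const int64_t *index, int n)
        {
          for (int i = 0; i < n; ++i)
            dst[i] = src[index[i]];   /* 16-bit data gathered with 64-bit
                                         offsets: the data lives in an
                                         unpacked (VNx2HI-style) vector      */
        }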
      
      2019-11-16  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* config/aarch64/iterators.md (SVE_24, SVE_2, SVE_4): New mode
      	iterators.
      	* config/aarch64/aarch64-sve.md
      	(gather_load<SVE_FULL_SD:mode><v_int_equiv>): Extend to...
      	(gather_load<SVE_24:mode><v_int_container>): ...this.
      	(mask_gather_load<SVE_FULL_S:mode><v_int_equiv>): Extend to...
      	(mask_gather_load<SVE_4:mode><v_int_container>): ...this.
      	(mask_gather_load<SVE_FULL_D:mode><v_int_equiv>): Extend to...
      	(mask_gather_load<SVE_2:mode><v_int_container>): ...this.
      	(*mask_gather_load<SVE_2:mode><v_int_container>_<su>xtw_unpacked):
      	New pattern.
      	(*mask_gather_load<SVE_FULL_D:mode><v_int_equiv>_sxtw): Extend to...
      	(*mask_gather_load<SVE_2:mode><v_int_equiv>_sxtw): ...this.
      	Allow the nominal extension predicate to be different from the
      	load predicate.
      	(*mask_gather_load<SVE_FULL_D:mode><v_int_equiv>_uxtw): Extend to...
      	(*mask_gather_load<SVE_2:mode><v_int_equiv>_uxtw): ...this.
      
      gcc/testsuite/
      	* gcc.target/aarch64/sve/gather_load_1.c (TEST_LOOP): Start at 0.
      	(TEST_ALL): Add tests for 8-bit and 16-bit elements.
      	* gcc.target/aarch64/sve/gather_load_2.c: Update accordingly.
      	* gcc.target/aarch64/sve/gather_load_3.c (TEST_LOOP): Start at 0.
      	(TEST_ALL): Add tests for 8-bit and 16-bit elements.
      	* gcc.target/aarch64/sve/gather_load_4.c: Update accordingly.
      	* gcc.target/aarch64/sve/gather_load_5.c (TEST_LOOP): Start at 0.
      	(TEST_ALL): Add tests for 8-bit, 16-bit and 32-bit elements.
      	* gcc.target/aarch64/sve/gather_load_6.c: Add
      	--param aarch64-sve-compare-costs=0.
      	(TEST_LOOP): Start at 0.
      	* gcc.target/aarch64/sve/gather_load_7.c: Add
      	--param aarch64-sve-compare-costs=0.
      	* gcc.target/aarch64/sve/gather_load_8.c: New test.
      	* gcc.target/aarch64/sve/gather_load_9.c: Likewise.
      	* gcc.target/aarch64/sve/mask_gather_load_6.c: Add
      	--param aarch64-sve-compare-costs=0.
      
      From-SVN: r278345
      Richard Sandiford committed
    • [AArch64] Add truncation for partial SVE modes · 2d56600c
      This patch adds support for "truncating" to a partial SVE vector from
      either a full SVE vector or a wider partial vector.  This truncation is
      actually a no-op and so should have zero cost in the vector cost model.
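
      For example (an illustrative sketch; which modes are chosen depends on
      the vectorizer's other decisions), narrowing 32-bit values to bytes held
      in 32-bit containers needs no instruction at all:

        #include <stdint.h>

        void
        narrow (uint8_t *dst, const uint32_t *src, int n)
        {
          for (int i = 0; i < n; ++i)
            dst[i] = (uint8_t) src[i];   /* e.g. VNx4SI -> VNx4QI: the bytes
                                            are already in the low part of
                                            each 32-bit container            */
        }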
      
      2019-11-16  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* config/aarch64/aarch64-sve.md
      	(trunc<SVE_HSDI:mode><SVE_PARTIAL_I:mode>2): New pattern.
      	* config/aarch64/aarch64.c (aarch64_integer_truncation_p): New
      	function.
      	(aarch64_sve_adjust_stmt_cost): Call it.
      
      gcc/testsuite/
      	* gcc.target/aarch64/sve/mask_struct_load_1.c: Add
      	--param aarch64-sve-compare-costs=0.
      	* gcc.target/aarch64/sve/mask_struct_load_2.c: Likewise.
      	* gcc.target/aarch64/sve/mask_struct_load_3.c: Likewise.
      	* gcc.target/aarch64/sve/mask_struct_load_4.c: Likewise.
      	* gcc.target/aarch64/sve/mask_struct_load_5.c: Likewise.
      	* gcc.target/aarch64/sve/pack_1.c: Likewise.
      	* gcc.target/aarch64/sve/truncate_1.c: New test.
      
      From-SVN: r278344
      Richard Sandiford committed
    • [AArch64] Pattern-match SVE extending loads · 217ccab8
      This patch pattern-matches a partial SVE load followed by a sign or zero
      extension into an extending load.  (The partial load is already an
      extending load; we just don't rely on the upper bits of the elements.)
      
      Nothing yet uses the extra LDFF1 and LDNF1 combinations, but it seemed
      more consistent to provide them, since I needed to update the pattern
      to use a predicated extension anyway.
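
      For example (a sketch rather than one of the new tests), a widening loop
      like the following can now load and extend in one instruction:

        #include <stdint.h>

        void
        widen_add (uint32_t *dst, const uint8_t *src, int n)
        {
          for (int i = 0; i < n; ++i)
            dst[i] += src[i];   /* the byte load plus zero extension should
                                   combine into a single extending load
                                   (LD1B into .s elements)                   */
        }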
      
      2019-11-16  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* config/aarch64/aarch64-sve.md
      	(@aarch64_load_<ANY_EXTEND:optab><VNx8_WIDE:mode><VNx8_NARROW:mode>)
      	(@aarch64_load_<ANY_EXTEND:optab><VNx4_WIDE:mode><VNx4_NARROW:mode>)
      	(@aarch64_load_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>):
      	Combine into...
      	(@aarch64_load_<ANY_EXTEND:optab><SVE_HSDI:mode><SVE_PARTIAL_I:mode>):
      	...this new pattern, handling extension to partial modes as well
      	as full modes.  Describe the extension as a predicated rather than
      	unpredicated extension.
      	(@aarch64_ld<fn>f1_<ANY_EXTEND:optab><VNx8_WIDE:mode><VNx8_NARROW:mode>)
      	(@aarch64_ld<fn>f1_<ANY_EXTEND:optab><VNx4_WIDE:mode><VNx4_NARROW:mode>)
      	(@aarch64_ld<fn>f1_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>):
      	Combine into...
      	(@aarch64_ld<fn>f1_<ANY_EXTEND:optab><SVE_HSDI:mode><SVE_PARTIAL_I:mode>):
      	...this new pattern, handling extension to partial modes as well
      	as full modes.  Describe the extension as a predicated rather than
      	unpredicated extension.
      	* config/aarch64/aarch64-sve-builtins.cc
      	(function_expander::use_contiguous_load_insn): Add an extra
      	predicate for extending loads.
      	* config/aarch64/aarch64.c (aarch64_extending_load_p): New function.
      	(aarch64_sve_adjust_stmt_cost): Likewise.
      	(aarch64_add_stmt_cost): Use aarch64_sve_adjust_stmt_cost to adjust
      	the cost of SVE vector stmts.
      
      gcc/testsuite/
      	* gcc.target/aarch64/sve/load_extend_1.c: New test.
      	* gcc.target/aarch64/sve/load_extend_2.c: Likewise.
      	* gcc.target/aarch64/sve/load_extend_3.c: Likewise.
      	* gcc.target/aarch64/sve/load_extend_4.c: Likewise.
      	* gcc.target/aarch64/sve/load_extend_5.c: Likewise.
      	* gcc.target/aarch64/sve/load_extend_6.c: Likewise.
      	* gcc.target/aarch64/sve/load_extend_7.c: Likewise.
      	* gcc.target/aarch64/sve/load_extend_8.c: Likewise.
      	* gcc.target/aarch64/sve/load_extend_9.c: Likewise.
      	* gcc.target/aarch64/sve/load_extend_10.c: Likewise.
      	* gcc.target/aarch64/sve/reduc_4.c: Add
      	--param aarch64-sve-compare-costs=0.
      
      From-SVN: r278343
      Richard Sandiford committed
    • [AArch64] Add sign and zero extension for partial SVE modes · e58703e2
      This patch adds support for extending from partial SVE modes
      to both full vector modes and wider partial modes.
      
      Some tests now need --param aarch64-sve-compare-costs=0 to force
      the original full-vector code.
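
      For example (an illustrative sketch; whether the extension appears in
      vector form depends on the vectorizer's other choices), a loop that
      computes a byte result and stores it to a wider type is one shape in
      which such an extension can arise:

        #include <stdint.h>

        void
        extend_copy (uint16_t *dst, const uint8_t *src1,
                     const uint8_t *src2, int n)
        {
          for (int i = 0; i < n; ++i)
            dst[i] = (uint8_t) (src1[i] & src2[i]);  /* byte result widened to
                                                        halfword containers   */
        }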
      
      2019-11-16  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* config/aarch64/iterators.md (SVE_HSDI): New mode iterator.
      	(narrower_mask): Handle VNx4HI, VNx2HI and VNx2SI.
      	* config/aarch64/aarch64-sve.md
      	(<ANY_EXTEND:optab><SVE_PARTIAL_I:mode><SVE_HSDI:mode>2): New pattern.
      	(*<ANY_EXTEND:optab><SVE_PARTIAL_I:mode><SVE_HSDI:mode>2): Likewise.
      	(@aarch64_pred_sxt<SVE_FULL_HSDI:mode><SVE_PARTIAL_I:mode>): Update
      	comment.  Avoid new narrower_mask ambiguity.
      	(@aarch64_cond_sxt<SVE_FULL_HSDI:mode><SVE_PARTIAL_I:mode>): Likewise.
      	(*cond_uxt<mode>_2): Update comment.
      	(*cond_uxt<mode>_any): Likewise.
      
      gcc/testsuite/
      	* gcc.target/aarch64/sve/cost_model_1.c: Expect the loop to be
      	vectorized with bytes stored in 32-bit containers.
      	* gcc.target/aarch64/sve/extend_1.c: New test.
      	* gcc.target/aarch64/sve/extend_2.c: New test.
      	* gcc.target/aarch64/sve/extend_3.c: New test.
      	* gcc.target/aarch64/sve/extend_4.c: New test.
      	* gcc.target/aarch64/sve/load_const_offset_3.c: Add
      	--param aarch64-sve-compare-costs=0.
      	* gcc.target/aarch64/sve/mask_struct_store_1.c: Likewise.
      	* gcc.target/aarch64/sve/mask_struct_store_1_run.c: Likewise.
      	* gcc.target/aarch64/sve/mask_struct_store_2.c: Likewise.
      	* gcc.target/aarch64/sve/mask_struct_store_2_run.c: Likewise.
      	* gcc.target/aarch64/sve/unpack_unsigned_1.c: Likewise.
      	* gcc.target/aarch64/sve/unpack_unsigned_1_run.c: Likewise.
      
      From-SVN: r278342
      Richard Sandiford committed
    • [AArch64] Add autovec support for partial SVE vectors · cc68f7c2
      This patch adds the bare minimum needed to support autovectorisation of
      partial SVE vectors, namely moves and integer addition.  Later patches
      add more interesting cases.
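
      For example (a sketch in the spirit of the new mixed_size tests), a loop
      that updates 32-bit and 16-bit data side by side can keep the halfwords
      in unpacked vectors so that both arrays advance by the same number of
      elements per iteration:

        #include <stdint.h>

        void
        mixed (int32_t *a, int16_t *b, int n)
        {
          for (int i = 0; i < n; ++i)
            {
              a[i] += 1;   /* full VNx4SI vectors                          */
              b[i] += 1;   /* halfwords held in 32-bit containers
                              (VNx4HI-style partial vectors)               */
            }
        }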
      
      2019-11-16  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* config/aarch64/aarch64-modes.def: Define partial SVE vector
      	float modes.
      	* config/aarch64/aarch64-protos.h (aarch64_sve_pred_mode): New
      	function.
      	* config/aarch64/aarch64.c (aarch64_classify_vector_mode): Handle the
      	new vector float modes.
      	(aarch64_sve_container_bits): New function.
      	(aarch64_sve_pred_mode): Likewise.
      	(aarch64_get_mask_mode): Use it.
      	(aarch64_sve_element_int_mode): Handle structure modes and partial
      	modes.
      	(aarch64_sve_container_int_mode): New function.
      	(aarch64_vectorize_related_mode): Return SVE modes when given
      	SVE modes.  Handle partial modes, taking the preferred number
      	of units from the size of the given mode.
      	(aarch64_hard_regno_mode_ok): Allow partial modes to be stored
      	in registers.
      	(aarch64_expand_sve_ld1rq): Use the mode form of aarch64_sve_pred_mode.
      	(aarch64_expand_sve_const_vector): Handle partial SVE vectors.
      	(aarch64_split_sve_subreg_move): Use the mode form of
      	aarch64_sve_pred_mode.
      	(aarch64_secondary_reload): Handle partial modes in the same way
      	as full big-endian vectors.
      	(aarch64_vector_mode_supported_p): Allow partial SVE vectors.
      	(aarch64_autovectorize_vector_modes): Try unpacked SVE vectors,
      	merging with the Advanced SIMD modes.  If two modes have the
      	same size, try the Advanced SIMD mode first.
      	(aarch64_simd_valid_immediate): Use the container rather than
      	the element mode for INDEX constants.
      	(aarch64_simd_vector_alignment): Make the alignment of partial
      	SVE vector modes the same as their minimum size.
      	(aarch64_evpc_sel): Use the mode form of aarch64_sve_pred_mode.
      	* config/aarch64/aarch64-sve.md (mov<SVE_FULL:mode>): Extend to...
      	(mov<SVE_ALL:mode>): ...this.
      	(movmisalign<SVE_FULL:mode>): Extend to...
      	(movmisalign<SVE_ALL:mode>): ...this.
      	(*aarch64_sve_mov<mode>_le): Rename to...
      	(*aarch64_sve_mov<mode>_ldr_str): ...this.
      	(*aarch64_sve_mov<SVE_FULL:mode>_be): Rename and extend to...
      	(*aarch64_sve_mov<SVE_ALL:mode>_no_ldr_str): ...this.  Handle
      	partial modes regardless of endianness.
      	(aarch64_sve_reload_be): Rename to...
      	(aarch64_sve_reload_mem): ...this and enable for little-endian.
      	Use aarch64_sve_pred_mode to get the appropriate predicate mode.
      	(@aarch64_pred_mov<SVE_FULL:mode>): Extend to...
      	(@aarch64_pred_mov<SVE_ALL:mode>): ...this.
      	(*aarch64_sve_mov<SVE_FULL:mode>_subreg_be): Extend to...
      	(*aarch64_sve_mov<SVE_ALL:mode>_subreg_be): ...this.
      	(@aarch64_sve_reinterpret<SVE_FULL:mode>): Extend to...
      	(@aarch64_sve_reinterpret<SVE_ALL:mode>): ...this.
      	(*aarch64_sve_reinterpret<SVE_FULL:mode>): Extend to...
      	(*aarch64_sve_reinterpret<SVE_ALL:mode>): ...this.
      	(maskload<SVE_FULL:mode><vpred>): Extend to...
      	(maskload<SVE_ALL:mode><vpred>): ...this.
      	(maskstore<SVE_FULL:mode><vpred>): Extend to...
      	(maskstore<SVE_ALL:mode><vpred>): ...this.
      	(vec_duplicate<SVE_FULL:mode>): Extend to...
      	(vec_duplicate<SVE_ALL:mode>): ...this.
      	(*vec_duplicate<SVE_FULL:mode>_reg): Extend to...
      	(*vec_duplicate<SVE_ALL:mode>_reg): ...this.
      	(sve_ld1r<SVE_FULL:mode>): Extend to...
      	(sve_ld1r<SVE_ALL:mode>): ...this.
      	(vec_series<SVE_FULL_I:mode>): Extend to...
      	(vec_series<SVE_I:mode>): ...this.
      	(*vec_series<SVE_FULL_I:mode>_plus): Extend to...
      	(*vec_series<SVE_I:mode>_plus): ...this.
      	(@aarch64_pred_sxt<SVE_FULL_HSDI:mode><SVE_PARTIAL_I:mode>): Avoid
      	new VPRED ambiguity.
      	(@aarch64_cond_sxt<SVE_FULL_HSDI:mode><SVE_PARTIAL_I:mode>): Likewise.
      	(add<SVE_FULL_I:mode>3): Extend to...
      	(add<SVE_I:mode>3): ...this.
      	* config/aarch64/iterators.md (SVE_ALL, SVE_I): New mode iterators.
      	(Vetype, Vesize, VEL, Vel, vwcore): Handle partial SVE vector modes.
      	(VPRED, vpred): Likewise.
      	(Vctype): New iterator.
      	(vw): Remove SVE modes.
      
      gcc/testsuite/
      	* gcc.target/aarch64/sve/mixed_size_1.c: New test.
      	* gcc.target/aarch64/sve/mixed_size_2.c: Likewise.
      	* gcc.target/aarch64/sve/mixed_size_3.c: Likewise.
      	* gcc.target/aarch64/sve/mixed_size_4.c: Likewise.
      	* gcc.target/aarch64/sve/mixed_size_5.c: Likewise.
      
      From-SVN: r278341
      Richard Sandiford committed
    • [AArch64] Replace SVE_PARTIAL with SVE_PARTIAL_I · 6544cb52
      Another renaming, this time to make way for partial/unpacked
      float modes.
      
      2019-11-16  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* config/aarch64/iterators.md (SVE_PARTIAL): Rename to...
      	(SVE_PARTIAL_I): ...this.
      	* config/aarch64/aarch64-sve.md: Apply the above renaming throughout.
      
      From-SVN: r278339
      Richard Sandiford committed
    • [AArch64] Add "FULL" to SVE mode iterator names · f75cdd2c
      An upcoming patch will make more use of partial/unpacked SVE vectors.
      We then need a distinction between mode iterators that include partial
      modes and those that only include "full" modes.  This patch prepares
      for that by adding "FULL" to the names of iterators that only select
      full modes.  There should be no change in behaviour.
      
      2019-11-16  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* config/aarch64/iterators.md (SVE_ALL): Rename to...
      	(SVE_FULL): ...this.
      	(SVE_I): Rename to...
      	(SVE_FULL_I): ...this.
      	(SVE_F): Rename to...
      	(SVE_FULL_F): ...this.
      	(SVE_BHSI): Rename to...
      	(SVE_FULL_BHSI): ...this.
      	(SVE_HSD): Rename to...
      	(SVE_FULL_HSD): ...this.
      	(SVE_HSDI): Rename to...
      	(SVE_FULL_HSDI): ...this.
      	(SVE_HSF): Rename to...
      	(SVE_FULL_HSF): ...this.
      	(SVE_SD): Rename to...
      	(SVE_FULL_SD): ...this.
      	(SVE_SDI): Rename to...
      	(SVE_FULL_SDI): ...this.
      	(SVE_SDF): Rename to...
      	(SVE_FULL_SDF): ...this.
      	(SVE_S): Rename to...
      	(SVE_FULL_S): ...this.
      	(SVE_D): Rename to...
      	(SVE_FULL_D): ...this.
      	* config/aarch64/aarch64-sve.md: Apply the above renaming throughout.
      	* config/aarch64/aarch64-sve2.md: Likewise.
      
      From-SVN: r278338
      Richard Sandiford committed
  2. 08 Nov, 2019 1 commit
    • Generalise gather and scatter optabs · 09eb042a
      The gather and scatter optabs required the vector offset to be
      the integer equivalent of the vector mode being loaded or stored.
      This patch generalises them so that the two vectors can have different
      element sizes, although they still need to have the same number of
      elements.
      
      One consequence of this is that it's possible (if unlikely)
      for two IFN_GATHER_LOADs to have the same arguments but different
      return types.  E.g. the same scalar base and vector of 32-bit offsets
      could be used to load 8-bit elements and to load 16-bit elements.
      From just looking at the arguments, we could wrongly deduce that
      they're equivalent.
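
      A rough source-level illustration of that situation (hypothetical, not
      one of the testcases): both statements below gather from the same base
      with the same 32-bit byte offsets, and only the loaded element size
      differs.

        #include <stdint.h>
        #include <string.h>

        void
        f (uint8_t *out8, uint16_t *out16, const uint8_t *base,
           const int32_t *offset, int n)
        {
          for (int i = 0; i < n; ++i)
            {
              uint16_t half;
              out8[i] = base[offset[i]];                     /* 8-bit gather  */
              memcpy (&half, base + offset[i], sizeof half); /* 16-bit gather,
                                                                same base and
                                                                offsets       */
              out16[i] = half;
            }
        }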
      
      I know we saw this happen at one point with IFN_WHILE_ULT,
      and we dealt with it there by passing a zero of the return type
      as an extra argument.  Doing the same here also makes the load
      and store functions have the same argument assignment.
      
      For now this patch should be a no-op, but later SVE patches take
      advantage of the new flexibility.
      
      2019-11-08  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* optabs.def (gather_load_optab, mask_gather_load_optab)
      	(scatter_store_optab, mask_scatter_store_optab): Turn into
      	conversion optabs, with the offset mode given explicitly.
      	* doc/md.texi: Update accordingly.
      	* config/aarch64/aarch64-sve-builtins-base.cc
      	(svld1_gather_impl::expand): Likewise.
      	(svst1_scatter_impl::expand): Likewise.
      	* internal-fn.c (gather_load_direct, scatter_store_direct): Likewise.
      	(expand_scatter_store_optab_fn): Likewise.
      	(direct_gather_load_optab_supported_p): Likewise.
      	(direct_scatter_store_optab_supported_p): Likewise.
      	(expand_gather_load_optab_fn): Likewise.  Expect the mask argument
      	to be argument 4.
      	(internal_fn_mask_index): Return 4 for IFN_MASK_GATHER_LOAD.
      	(internal_gather_scatter_fn_supported_p): Replace the offset sign
      	argument with the offset vector type.  Require the two vector
      	types to have the same number of elements but allow their element
      	sizes to be different.  Treat the optabs as conversion optabs.
      	* internal-fn.h (internal_gather_scatter_fn_supported_p): Update
      	prototype accordingly.
      	* optabs-query.c (supports_at_least_one_mode_p): Replace with...
      	(supports_vec_convert_optab_p): ...this new function.
      	(supports_vec_gather_load_p): Update accordingly.
      	(supports_vec_scatter_store_p): Likewise.
      	* tree-vectorizer.h (vect_gather_scatter_fn_p): Take a vec_info.
      	Replace the offset sign and bits parameters with a scalar type tree.
      	* tree-vect-data-refs.c (vect_gather_scatter_fn_p): Likewise.
      	Pass back the offset vector type instead of the scalar element type.
      	Allow the offset to be wider than the memory elements.  Search for
      	an offset type that the target supports, stopping once we've
      	reached the maximum of the element size and pointer size.
      	Update call to internal_gather_scatter_fn_supported_p.
      	(vect_check_gather_scatter): Update calls accordingly.
      	When testing a new scale before knowing the final offset type,
      	check whether the scale is supported for any signed or unsigned
      	offset type.  Check whether the target supports the source and
      	target types of a conversion before deciding whether to look
      	through the conversion.  Record the chosen offset_vectype.
      	* tree-vect-patterns.c (vect_get_gather_scatter_offset_type): Delete.
      	(vect_recog_gather_scatter_pattern): Get the scalar offset type
      	directly from the gs_info's offset_vectype instead.  Pass a zero
      	of the result type to IFN_GATHER_LOAD and IFN_MASK_GATHER_LOAD.
      	* tree-vect-stmts.c (check_load_store_masking): Update call to
      	internal_gather_scatter_fn_supported_p, passing the offset vector
      	type recorded in the gs_info.
      	(vect_truncate_gather_scatter_offset): Update call to
      	vect_check_gather_scatter, leaving it to search for a valid
      	offset vector type.
      	(vect_use_strided_gather_scatters_p): Convert the offset to the
      	element type of the gs_info's offset_vectype.
      	(vect_get_gather_scatter_ops): Get the offset vector type directly
      	from the gs_info.
      	(vect_get_strided_load_store_ops): Likewise.
      	(vectorizable_load): Pass a zero of the result type to IFN_GATHER_LOAD
      	and IFN_MASK_GATHER_LOAD.
      	* config/aarch64/aarch64-sve.md (gather_load<mode>): Rename to...
      	(gather_load<mode><v_int_equiv>): ...this.
      	(mask_gather_load<mode>): Rename to...
      	(mask_gather_load<mode><v_int_equiv>): ...this.
      	(scatter_store<mode>): Rename to...
      	(scatter_store<mode><v_int_equiv>): ...this.
      	(mask_scatter_store<mode>): Rename to...
      	(mask_scatter_store<mode><v_int_equiv>): ...this.
      
      From-SVN: r277949
      Richard Sandiford committed
  3. 29 Oct, 2019 3 commits
    • [AArch64] Add support for the SVE PCS · c600df9a
      The AAPCS64 specifies that if a function takes arguments in SVE
      registers or returns them in SVE registers, it must preserve all
      of Z8-Z23 and all of P4-P11.  (Normal functions only preserve the
      low 64 bits of Z8-Z15 and clobber all of the predicate registers.)
      
      This variation is known informally as the "SVE PCS" and functions
      that use it are known informally as "SVE functions".  The SVE PCS
      is mutually interoperable with functions that follow the standard
      AAPCS64 rules and those that use the aarch64_vector_pcs attribute.
      (Note that it's an error to use the attribute for SVE functions.)
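
      For example (an illustrative sketch, not taken from the new tests), a
      function like the following takes and returns SVE vectors and therefore
      uses the SVE PCS automatically; its callers may keep values live in
      Z8-Z23 and P4-P11 across the call:

        #include <arm_sve.h>

        svfloat64_t
        scale_and_add (svfloat64_t x, svfloat64_t y, svfloat64_t z)
        {
          svbool_t pg = svptrue_b64 ();
          return svmla_f64_z (pg, z, x, y);   /* z + x * y */
        }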
      
      One complication -- although it's not really that complicated --
      is that SVE registers need to be saved at a VL-dependent offset while
      other registers need to be saved at a constant offset.  The easiest way
      of handling this seemed to be to group the SVE registers together below
      the hard frame pointer.  In common cases, the frame pointer is then
      usually an easy-to-compute VL multiple above the stack pointer and a
      constant amount below the incoming stack pointer.
      
      A bigger complication is that, because the base AAPCS64 specifies that
      only the low 64 bits of V8-V15 are preserved by calls, the associated
      DWARF frame registers are also treated as 64 bits by the unwinder.
      The 64 bits must also have the same layout as they would for a base
      AAPCS64 function, otherwise unwinding won't work correctly.  (This is
      actually a problem for the existing aarch64_vector_pcs support too,
      but I'll fix that separately.)
      
      This falls out naturally for little-endian targets but not for
      big-endian targets.  The easiest way of meeting the requirement for them
      was to use ST1D and LD1D to save and restore Z8-Z15, which also has the
      nice property of storing the 64 bits at the start of the slot.  However,
      using ST1D and LD1D requires a spare predicate register, and since all
      of P0-P7 are either argument registers or call-preserved, we may need
      to spill P4 in order to save the vector registers, even if P4 wouldn't
      need to be saved otherwise.
      
      Since Z16-Z23 are fully clobbered by base AAPCS64 functions, we don't
      need to emit frame information for them at all.  This avoids having
      to decide whether the registers should be treated as having 64 bits
      (as for Z8-Z15), 128 bits (for Advanced SIMD) or the full SVE width.
      
      There are two ways of dealing with stack-clash protection when
      saving SVE registers:
      
      (1) If the area between the hard frame pointer and the incoming stack
          pointer is allocated via a store with writeback (callee_adjust != 0),
          the SVE save area is allocated separately and becomes the "initial"
          allocation as far as stack-clash protection goes.  In this case
          the store with writeback acts as a probe at the hard frame pointer
          position.
      
      (2) If the area between the hard frame pointer and the incoming stack
          pointer is allocated via aarch64_allocate_and_probe_stack_space,
          the SVE save area is added to this initial allocation, so that the
          SP ends up pointing at the SVE register saves.  It's then necessary
          to use a temporary base register to save the non-SVE registers.
          Setting up this temporary register requires a single instruction
          only and so should be more efficient than doing two allocations
          and probes.
      
      When SVE registers need to be saved, saving them below the frame pointer
      makes it harder to rely on the LR save as a stack probe, since the LR
      register's offset won't usually be a compile-time constant.  The patch
      copes with that by using the lowest SVE register save as a stack probe
      too, and thus prevents the save from being shrink-wrapped if stack clash
      protection is enabled.
      
      The changelog describes the low-level details.
      
      2019-10-29  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* calls.c (pass_by_reference): Leave the target to decide whether
      	POLY_INT_CST-sized arguments should be passed by value or reference,
      	rather than forcing them to be passed by reference.
      	(must_pass_in_stack_var_size): Likewise.
      	* config/aarch64/aarch64.md (LAST_SAVED_REGNUM): Redefine from
      	V31_REGNUM to P15_REGNUM.
      	* config/aarch64/aarch64-protos.h (aarch64_init_cumulative_args):
      	Take an extra "silent_p" parameter, defaulting to false.
      	(aarch64_sve::svbool_type_p): Declare.
      	(aarch64_sve::nvectors_if_data_type): Likewise.
      	* config/aarch64/aarch64.h (NUM_PR_ARG_REGS): New macro.
      	(aarch64_frame::reg_offset): Turn into poly_int64s.
      	(aarch64_frame::save_regs_size): Likewise.
      	(aarch64_frame::below_hard_fp_saved_regs_size): New field.
      	(aarch64_frame::sve_callee_adjust): Likewise.
      	(aarch64_frame::spare_pred_reg): Likewise.
      	(ARM_PCS_SVE): New arm_pcs value.
      	(CUMULATIVE_ARGS::aapcs_nprn): New field.
      	(CUMULATIVE_ARGS::aapcs_nextnprn): Likewise.
      	(CUMULATIVE_ARGS::silent_p): Likewise.
      	(BITS_PER_SVE_PRED): New macro.
      	* config/aarch64/aarch64.c (handle_aarch64_vector_pcs_attribute): New
      	function.  Reject aarch64_vector_pcs attributes on SVE functions.
      	(aarch64_attribute_table): Use the above handler.
      	(aarch64_sve_abi): New function.
      	(aarch64_sve_argument_p): Likewise.
      	(aarch64_returns_value_in_sve_regs_p): Likewise.
      	(aarch64_takes_arguments_in_sve_regs_p): Likewise.
      	(aarch64_fntype_abi): Check for SVE functions and return the SVE PCS
      	descriptor for them.
      	(aarch64_simd_decl_p): Delete.
      	(aarch64_emit_cfi_for_reg_p): New function.
      	(aarch64_reg_save_mode): Remove the fndecl argument and instead use
      	crtl->abi to choose the mode for FP registers.  Handle the SVE PCS.
      	(aarch64_hard_regno_call_part_clobbered): Do not treat FP registers
      	as partly clobbered for the SVE PCS.
      	(aarch64_function_ok_for_sibcall): Check whether the two functions
      	use the same ABI, rather than checking specifically for whether
      	they're aarch64_vector_pcs functions.
      	(aarch64_pass_by_reference): Raise an error for attempts to pass
      	SVE arguments when SVE is disabled.  Pass SVE arguments by reference
      	if there are not enough free registers left, or if the argument is
      	variadic.
      	(aarch64_function_value): Handle SVE predicates, vectors and tuples.
      	(aarch64_return_in_memory): Do not return SVE predicates, vectors and
      	tuples in memory.
      	(aarch64_layout_arg): Take a function_arg_info rather than
      	individual properties.  Handle SVE predicates, vectors and tuples.
      	Raise an error if they are passed to unprototyped functions.
      	(aarch64_function_arg): If the silent_p flag is set, suppress the
      	usual error about using float registers without TARGET_FLOAT.
      	(aarch64_init_cumulative_args): Take a silent_p parameter and store
      	it in the cumulative_args structure.  Initialize aapcs_nprn and
      	aapcs_nextnprn.  If the silent_p flag is set, suppress the usual
      	error about using float registers without TARGET_FLOAT.
      	If the silent_p flag is not set, also raise an error about
      	using SVE functions when SVE is disabled.
      	(aarch64_function_arg_advance): Update the call to aarch64_layout_arg,
      	and call it for SVE functions too.  Update aapcs_nprn similarly
      	to the other register counts.
      	(aarch64_layout_frame): If a big-endian function needs to save
      	and restore Z8-Z15, search for a spare predicate that it can use.
      	Store SVE predicates at the bottom of the register save area,
      	followed by SVE vectors, then followed by the normal slots.
      	Keep pointing the hard frame pointer at the base of the normal slots,
      	above the SVE vectors.  Update the various frame creation and
      	tear-down strategies for the new layout, initializing the new
      	sve_callee_adjust field.  Add an additional layout for frames
      	whose saved registers are all SVE registers.
      	(aarch64_register_saved_on_entry): Cope with poly_int64 reg_offsets.
      	(aarch64_return_address_signing_enabled): Likewise.
      	(aarch64_push_regs, aarch64_pop_regs): Update calls to
      	aarch64_reg_save_mode.
      	(aarch64_adjust_sve_callee_save_base): New function.
      	(aarch64_add_cfa_expression): Move earlier in file.  Take the
      	saved register as an rtx rather than a register number and use
      	its mode for the MEM slot.
      	(aarch64_save_callee_saves): Remove the mode argument and instead
      	use aarch64_reg_save_mode to get the mode of each save slot.
      	Add a hard_fp_valid_p parameter.  Cope with poly_int64 register
      	offsets.  Allow GP offsets to be saved at a VL-based offset from
      	the stack, handling this case using the frame pointer if available
      	or a temporary register otherwise.  Use ST1D to save Z8-Z15 for
      	big-endian SVE functions; use normal moves for other SVE saves.
      	Only mark the save as frame-related if aarch64_emit_cfi_for_reg_p
      	returns true.  Add explicit CFA notes when not storing via the
      	stack pointer.  Do not try to pair SVE saves.
      	(aarch64_restore_callee_saves): Cope with poly_int64 register
      	offsets.  Use LD1D to restore Z8-Z15 for big-endian SVE functions;
      	use normal moves for other SVE restores.  Only add CFA restore notes
      	if aarch64_emit_cfi_for_reg_p returns true.  Do not try to pair
      	SVE restores.
      	(aarch64_get_separate_components): Always keep the first SVE save
      	in the prologue if we need to use it as a stack probe.  Don't allow
      	Z8-Z15 saves and loads to be shrink-wrapped for big-endian targets.
      	Likewise the spare predicate register that they need.  Update the
      	offset calculation to account for the SVE save area.  Use the
      	appropriate range check for SVE LDR and STR instructions.
      	(aarch64_components_for_bb): Cope with poly_int64 reg_offsets.
      	(aarch64_process_components): Likewise.  Update the offset
      	calculation to account for the SVE save area.  Only mark the
      	save as frame-related if aarch64_emit_cfi_for_reg_p returns true.
      	Do not try to pair SVE saves.
      	(aarch64_allocate_and_probe_stack_space): Cope with poly_int64
      	reg_offsets.  When handling the final allocation, expect the
      	first SVE register save to be part of the initial allocation
      	and for it to act as a probe at SP.  Account for the SVE callee
      	save area in the dump information.
      	(aarch64_expand_prologue): Update the frame diagram.  Fold the
      	SVE callee allocation into the initial allocation if stack clash
      	protection is enabled.  Use new variables to track the offset
      	of the frame chain (and hard frame pointer) from the current
      	stack pointer, and likewise the offset of the bottom of the
      	register save area.  Update calls to aarch64_save_callee_saves
      	and aarch64_add_cfa_expression.  Apply sve_callee_adjust before
      	saving the FP&SIMD registers.  Save the predicate registers.
      	(aarch64_expand_epilogue): Take below_hard_fp_saved_regs_size
      	into account when setting the stack pointer from the frame pointer,
      	and when deciding whether we can inherit the initial adjustment
      	amount from the prologue.  Restore the predicate registers after
      	the vector registers, then apply sve_callee_adjust, then restore
      	the general registers.
      	(aarch64_secondary_reload): Don't use secondary SVE reloads
      	for VNx16BImode.
      	(aapcs_vfp_sub_candidate): Assert that the type is not an SVE type.
      	(aarch64_short_vector_p): Return false for SVE types.
      	(aarch64_vfp_is_call_or_return_candidate): Initialize *is_ha
      	at the start of the function.  Return false for SVE types.
      	(aarch64_asm_output_variant_pcs): Output .variant_pcs for SVE
      	functions too.
      	(TARGET_STRICT_ARGUMENT_NAMING): Redefine to request strict naming.
      	* config/aarch64/aarch64-sve.md (*aarch64_sve_mov<mode>_le): Extend
      	to big-endian targets for bytewise moves.
      	(*aarch64_sve_mov<mode>_be): Exclude the bytewise case.
      
      gcc/testsuite/
      	* gcc.target/aarch64/sve/pcs/aarch64-sve-pcs.exp: New file.
      	* gcc.target/aarch64/sve/pcs/annotate_1.c: New test.
      	* gcc.target/aarch64/sve/pcs/annotate_2.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/annotate_3.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/annotate_4.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/annotate_5.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/annotate_6.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/annotate_7.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_1.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_10.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_11_nosc.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_11_sc.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_2.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_3.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_4.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_5_be_f16.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_5_be_f32.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_5_be_f64.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_5_be_s16.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_5_be_s32.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_5_be_s64.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_5_be_s8.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_5_be_u16.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_5_be_u32.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_5_be_u64.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_5_be_u8.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_5_le_f16.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_5_le_f32.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_5_le_f64.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_5_le_s16.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_5_le_s32.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_5_le_s64.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_5_le_s8.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_5_le_u16.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_5_le_u32.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_5_le_u64.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_5_le_u8.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_6_be_f16.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_6_be_f32.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_6_be_f64.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_6_be_s16.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_6_be_s32.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_6_be_s64.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_6_be_s8.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_6_be_u16.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_6_be_u32.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_6_be_u64.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_6_be_u8.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_6_le_f16.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_6_le_f32.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_6_le_f64.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_6_le_s16.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_6_le_s32.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_6_le_s64.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_6_le_s8.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_6_le_u16.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_6_le_u32.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_6_le_u64.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_6_le_u8.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_7.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_8.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_9.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/nosve_1.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/nosve_2.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/nosve_3.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/nosve_4.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/nosve_5.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/nosve_6.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/nosve_7.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/nosve_8.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/return_1.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/return_1_1024.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/return_1_2048.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/return_1_256.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/return_1_512.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/return_2.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/return_3.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/return_4.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/return_4_1024.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/return_4_2048.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/return_4_256.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/return_4_512.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/return_5.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/return_5_1024.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/return_5_2048.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/return_5_256.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/return_5_512.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/return_6.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/return_6_1024.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/return_6_2048.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/return_6_256.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/return_6_512.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/return_7.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/return_8.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/return_9.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/saves_1_be_nowrap.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/saves_1_be_wrap.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/saves_1_le_nowrap.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/saves_1_le_wrap.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/saves_2_be_nowrap.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/saves_2_be_wrap.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/saves_2_le_nowrap.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/saves_2_le_wrap.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/saves_3.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/saves_4_be.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/saves_4_le.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/saves_5_be.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/saves_5_le.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/stack_clash_1.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/stack_clash_1_256.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/stack_clash_1_512.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/stack_clash_1_1024.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/stack_clash_1_2048.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/stack_clash_2.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/stack_clash_2_256.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/stack_clash_2_512.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/stack_clash_2_1024.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/stack_clash_2_2048.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/stack_clash_3.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/unprototyped_1.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/varargs_1.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/varargs_2_f16.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/varargs_2_f32.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/varargs_2_f64.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/varargs_2_s16.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/varargs_2_s32.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/varargs_2_s64.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/varargs_2_s8.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/varargs_2_u16.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/varargs_2_u32.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/varargs_2_u64.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/varargs_2_u8.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/varargs_3_nosc.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/varargs_3_sc.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/vpcs_1.c: Likewise.
      	* g++.target/aarch64/sve/catch_7.C: Likewise.
      
      From-SVN: r277564
      Richard Sandiford committed
    • [AArch64] Add support for arm_sve.h · 624d0f07
      This patch adds support for arm_sve.h.  I've tried to split all the
      groundwork out into separate patches, so this is mostly adding new code
      rather than changing existing code.
      
      The C++ frontend seems to handle correct ACLE code without modification,
      even in length-agnostic mode.  The C frontend is close; the only correct
      construct I know it doesn't handle is initialisation.  E.g.:
      
        svbool_t pg = svptrue_b8 ();
      
      produces:
      
        variable-sized object may not be initialized
      
      although:
      
        svbool_t pg; pg = svptrue_b8 ();
      
      works fine.  This can be fixed by changing:
      
       	  {
       	    /* A complete type is ok if size is fixed.  */
      
      -	    if (TREE_CODE (TYPE_SIZE (TREE_TYPE (decl))) != INTEGER_CST
      +	    if (!poly_int_tree_p (TYPE_SIZE (TREE_TYPE (decl)))
       		|| C_DECL_VARIABLE_SIZE (decl))
       	      {
       		error ("variable-sized object may not be initialized");
      
      in c/c-decl.c:start_decl.
      
      Invalid code is likely to trigger ICEs, so this isn't ready for general
      use yet.  However, it seemed better to apply the patch now and deal with
      diagnosing invalid code as a follow-up.  For one thing, it means that
      we'll be able to provide testcases for middle-end changes related
      to SVE vectors, which has been a problem until now.  (I already have
      a series of such patches lined up.)
      
      The patch includes some tests, but the main ones need to wait until the
      PCS support has been applied.
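
      For reference, a small sketch (not from the patch's testsuite) of the
      kind of length-agnostic ACLE code that the header enables:

        #include <arm_sve.h>

        void
        add_one (float *dst, const float *src, int n)
        {
          for (int i = 0; i < n; i += svcntw ())
            {
              svbool_t pg = svwhilelt_b32_s32 (i, n);
              svfloat32_t v = svld1_f32 (pg, src + i);
              svst1_f32 (pg, dst + i,
                         svadd_f32_z (pg, v, svdup_n_f32 (1.0f)));
            }
        }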
      
      2019-10-29  Richard Sandiford  <richard.sandiford@arm.com>
      	    Kugan Vivekanandarajah  <kugan.vivekanandarajah@linaro.org>
      	    Prathamesh Kulkarni  <prathamesh.kulkarni@linaro.org>
      
      gcc/
      	* config.gcc (aarch64*-*-*): Add arm_sve.h to extra_headers.
      	Add aarch64-sve-builtins.o, aarch64-sve-builtins-shapes.o and
      	aarch64-sve-builtins-base.o to extra_objs.  Add
      	aarch64-sve-builtins.h and aarch64-sve-builtins.cc to target_gtfiles.
      	* config/aarch64/t-aarch64 (aarch64-sve-builtins.o): New rule.
      	(aarch64-sve-builtins-shapes.o): Likewise.
      	(aarch64-sve-builtins-base.o): New rules.
      	* config/aarch64/aarch64-c.c (aarch64_pragma_aarch64): New function.
      	(aarch64_resolve_overloaded_builtin): Likewise.
      	(aarch64_check_builtin_call): Likewise.
      	(aarch64_register_pragmas): Install aarch64_resolve_overloaded_builtin
      	and aarch64_check_builtin_call in targetm.  Register the GCC aarch64
      	pragma.
      	* config/aarch64/aarch64-protos.h (AARCH64_FOR_SVPRFOP): New macro.
      	(aarch64_svprfop): New enum.
      	(AARCH64_BUILTIN_SVE): New aarch64_builtin_class enum value.
      	(aarch64_sve_int_mode, aarch64_sve_data_mode): Declare.
      	(aarch64_fold_sve_cnt_pat, aarch64_output_sve_prefetch): Likewise.
      	(aarch64_output_sve_cnt_pat_immediate): Likewise.
      	(aarch64_output_sve_ptrues, aarch64_sve_ptrue_svpattern_p): Likewise.
      	(aarch64_sve_sqadd_sqsub_immediate_p, aarch64_sve_ldff1_operand_p)
      	(aarch64_sve_ldnf1_operand_p, aarch64_sve_prefetch_operand_p)
      	(aarch64_ptrue_all_mode, aarch64_convert_sve_data_to_pred): Likewise.
      	(aarch64_expand_sve_dupq, aarch64_replace_reg_mode): Likewise.
      	(aarch64_sve::init_builtins, aarch64_sve::handle_arm_sve_h): Likewise.
      	(aarch64_sve::builtin_decl, aarch64_sve::builtin_type_p): Likewise.
      	(aarch64_sve::mangle_builtin_type): Likewise.
      	(aarch64_sve::resolve_overloaded_builtin): Likewise.
      	(aarch64_sve::check_builtin_call, aarch64_sve::gimple_fold_builtin)
      	(aarch64_sve::expand_builtin): Likewise.
      	* config/aarch64/aarch64.c (aarch64_sve_data_mode): Make public.
      	(aarch64_sve_int_mode): Likewise.
      	(aarch64_ptrue_all_mode): New function.
      	(aarch64_convert_sve_data_to_pred): Make public.
      	(svprfop_token): New function.
      	(aarch64_output_sve_prefetch): Likewise.
      	(aarch64_fold_sve_cnt_pat): Likewise.
      	(aarch64_output_sve_cnt_pat_immediate): Likewise.
      	(aarch64_sve_move_pred_via_while): Use gen_while with UNSPEC_WHILE_LO
      	instead of gen_while_ult.
      	(aarch64_replace_reg_mode): Make public.
      	(aarch64_init_builtins): Call aarch64_sve::init_builtins.
      	(aarch64_fold_builtin): Handle AARCH64_BUILTIN_SVE.
      	(aarch64_gimple_fold_builtin, aarch64_expand_builtin): Likewise.
      	(aarch64_builtin_decl, aarch64_builtin_reciprocal): Likewise.
      	(aarch64_mangle_type): Call aarch64_sve::mangle_type.
      	(aarch64_sve_sqadd_sqsub_immediate_p): New function.
      	(aarch64_sve_ptrue_svpattern_p): Likewise.
      	(aarch64_sve_pred_valid_immediate): Check
      	aarch64_sve_ptrue_svpattern_p.
      	(aarch64_sve_ldff1_operand_p, aarch64_sve_ldnf1_operand_p)
      	(aarch64_sve_prefetch_operand_p, aarch64_output_sve_ptrues): New
      	functions.
      	* config/aarch64/aarch64.md (UNSPEC_LDNT1_SVE, UNSPEC_STNT1_SVE)
      	(UNSPEC_LDFF1_GATHER, UNSPEC_PTRUE, UNSPEC_WHILE_LE, UNSPEC_WHILE_LS)
      	(UNSPEC_WHILE_LT, UNSPEC_CLASTA, UNSPEC_UPDATE_FFR)
      	(UNSPEC_UPDATE_FFRT, UNSPEC_RDFFR, UNSPEC_WRFFR)
      	(UNSPEC_SVE_LANE_SELECT, UNSPEC_SVE_CNT_PAT, UNSPEC_SVE_PREFETCH)
      	(UNSPEC_SVE_PREFETCH_GATHER, UNSPEC_SVE_COMPACT, UNSPEC_SVE_SPLICE):
      	New unspecs.
      	* config/aarch64/iterators.md (SI_ONLY, DI_ONLY, VNx8HI_ONLY)
      	(VNx2DI_ONLY, SVE_PARTIAL, VNx8_NARROW, VNx8_WIDE, VNx4_NARROW)
      	(VNx4_WIDE, VNx2_NARROW, VNx2_WIDE, PRED_HSD): New mode iterators.
      	(UNSPEC_ADR, UNSPEC_BRKA, UNSPEC_BRKB, UNSPEC_BRKN, UNSPEC_BRKPA)
      	(UNSPEC_BRKPB, UNSPEC_PFIRST, UNSPEC_PNEXT, UNSPEC_CNTP, UNSPEC_SADDV)
      	(UNSPEC_UADDV, UNSPEC_FMLA, UNSPEC_FMLS, UNSPEC_FEXPA, UNSPEC_FTMAD)
      	(UNSPEC_FTSMUL, UNSPEC_FTSSEL, UNSPEC_COND_CMPEQ_WIDE): New unspecs.
      	(UNSPEC_COND_CMPGE_WIDE, UNSPEC_COND_CMPGT_WIDE): Likewise.
      	(UNSPEC_COND_CMPHI_WIDE, UNSPEC_COND_CMPHS_WIDE): Likewise.
      	(UNSPEC_COND_CMPLE_WIDE, UNSPEC_COND_CMPLO_WIDE): Likewise.
      	(UNSPEC_COND_CMPLS_WIDE, UNSPEC_COND_CMPLT_WIDE): Likewise.
      	(UNSPEC_COND_CMPNE_WIDE, UNSPEC_COND_FCADD90, UNSPEC_COND_FCADD270)
      	(UNSPEC_COND_FCMLA, UNSPEC_COND_FCMLA90, UNSPEC_COND_FCMLA180)
      	(UNSPEC_COND_FCMLA270, UNSPEC_COND_FMAX, UNSPEC_COND_FMIN): Likewise.
      	(UNSPEC_COND_FMULX, UNSPEC_COND_FRECPX, UNSPEC_COND_FSCALE): Likewise.
      	(UNSPEC_LASTA, UNSPEC_ASHIFT_WIDE, UNSPEC_ASHIFTRT_WIDE): Likewise.
      	(UNSPEC_LSHIFTRT_WIDE, UNSPEC_LDFF1, UNSPEC_LDNF1): Likewise.
      	(Vesize): Handle partial vector modes.
      	(self_mask, narrower_mask, sve_lane_con, sve_lane_pair_con): New
      	mode attributes.
      	(UBINQOPS, ANY_PLUS, SAT_PLUS, ANY_MINUS, SAT_MINUS): New code
      	iterators.
      	(s, paired_extend, inc_dec): New code attributes.
      	(SVE_INT_ADDV, CLAST, LAST): New int iterators.
      	(SVE_INT_UNARY): Add UNSPEC_RBIT.
      	(SVE_FP_UNARY, SVE_FP_UNARY_INT): New int iterators.
      	(SVE_FP_BINARY, SVE_FP_BINARY_INT): Likewise.
      	(SVE_COND_FP_UNARY): Add UNSPEC_COND_FRECPX.
      	(SVE_COND_FP_BINARY): Add UNSPEC_COND_FMAX, UNSPEC_COND_FMIN and
      	UNSPEC_COND_FMULX.
      	(SVE_COND_FP_BINARY_INT, SVE_COND_FP_ADD): New int iterators.
      	(SVE_COND_FP_SUB, SVE_COND_FP_MUL): Likewise.
      	(SVE_COND_FP_BINARY_I1): Add UNSPEC_COND_FMAX and UNSPEC_COND_FMIN.
      	(SVE_COND_FP_BINARY_REG): Add UNSPEC_COND_FMULX.
      	(SVE_COND_FCADD, SVE_COND_FP_MAXMIN, SVE_COND_FCMLA)
      	(SVE_COND_INT_CMP_WIDE, SVE_FP_TERNARY_LANE, SVE_CFP_TERNARY_LANE)
      	(SVE_WHILE, SVE_SHIFT_WIDE, SVE_LDFF1_LDNF1, SVE_BRK_UNARY)
      	(SVE_BRK_BINARY, SVE_PITER): New int iterators.
      	(optab): Handle UNSPEC_SADDV, UNSPEC_UADDV, UNSPEC_FRECPE,
      	UNSPEC_FRECPS, UNSPEC_RSQRTE, UNSPEC_RSQRTS, UNSPEC_RBIT,
      	UNSPEC_SMUL_HIGHPART, UNSPEC_UMUL_HIGHPART, UNSPEC_FMLA, UNSPEC_FMLS,
      	UNSPEC_FCMLA, UNSPEC_FCMLA90, UNSPEC_FCMLA180, UNSPEC_FCMLA270,
      	UNSPEC_FEXPA, UNSPEC_FTSMUL, UNSPEC_FTSSEL, UNSPEC_COND_FCADD90,
      	UNSPEC_COND_FCADD270, UNSPEC_COND_FCMLA, UNSPEC_COND_FCMLA90,
      	UNSPEC_COND_FCMLA180, UNSPEC_COND_FCMLA270, UNSPEC_COND_FMAX,
      	UNSPEC_COND_FMIN, UNSPEC_COND_FMULX, UNSPEC_COND_FRECPX and
      	UNSPEC_COND_FSCALE.
      	(maxmin_uns): Handle UNSPEC_COND_FMAX and UNSPEC_COND_FMIN.
      	(binqops_op, binqops_op_rev, last_op): New int attributes.
      	(su): Handle UNSPEC_SADDV and UNSPEC_UADDV.
      	(fn, ab): New int attributes.
      	(cmp_op): Handle UNSPEC_COND_CMP*_WIDE and UNSPEC_WHILE_*.
      	(while_optab_cmp, brk_op, sve_pred_op): New int attributes.
      	(sve_int_op): Handle UNSPEC_SMUL_HIGHPART, UNSPEC_UMUL_HIGHPART,
      	UNSPEC_ASHIFT_WIDE, UNSPEC_ASHIFTRT_WIDE, UNSPEC_LSHIFTRT_WIDE and
      	UNSPEC_RBIT.
      	(sve_fp_op): Handle UNSPEC_FRECPE, UNSPEC_FRECPS, UNSPEC_RSQRTE,
      	UNSPEC_RSQRTS, UNSPEC_FMLA, UNSPEC_FMLS, UNSPEC_FEXPA, UNSPEC_FTSMUL,
      	UNSPEC_FTSSEL, UNSPEC_COND_FMAX, UNSPEC_COND_FMIN, UNSPEC_COND_FMULX,
      	UNSPEC_COND_FRECPX and UNSPEC_COND_FSCALE.
      	(sve_fp_op_rev): Handle UNSPEC_COND_FMAX, UNSPEC_COND_FMIN and
      	UNSPEC_COND_FMULX.
      	(rot): Handle UNSPEC_COND_FCADD* and UNSPEC_COND_FCMLA*.
      	(brk_reg_con, brk_reg_opno): New int attributes.
      	(sve_pred_fp_rhs1_operand, sve_pred_fp_rhs2_operand): Handle
      	UNSPEC_COND_FMAX, UNSPEC_COND_FMIN and UNSPEC_COND_FMULX.
      	(sve_pred_fp_rhs2_immediate): Handle UNSPEC_COND_FMAX and
      	UNSPEC_COND_FMIN.
      	(max_elem_bits): New int attribute.
      	(min_elem_bits): Handle UNSPEC_RBIT.
      	* config/aarch64/predicates.md (subreg_lowpart_operator): Handle
      	TRUNCATE as well as SUBREG.
      	(ascending_int_parallel, aarch64_simd_reg_or_minus_one)
      	(aarch64_sve_ldff1_operand, aarch64_sve_ldnf1_operand)
      	(aarch64_sve_prefetch_operand, aarch64_sve_ptrue_svpattern_immediate)
      	(aarch64_sve_qadd_immediate, aarch64_sve_qsub_immediate)
      	(aarch64_sve_gather_immediate_b, aarch64_sve_gather_immediate_h)
      	(aarch64_sve_gather_immediate_w, aarch64_sve_gather_immediate_d)
      	(aarch64_sve_sqadd_operand, aarch64_sve_gather_offset_b)
      	(aarch64_sve_gather_offset_h, aarch64_sve_gather_offset_w)
      	(aarch64_sve_gather_offset_d, aarch64_gather_scale_operand_b)
      	(aarch64_gather_scale_operand_h): New predicates.
      	* config/aarch64/constraints.md (UPb, UPd, UPh, UPw, Utf, Utn, vgb)
      	(vgd, vgh, vgw, vsQ, vsS): New constraints.
      	* config/aarch64/aarch64-sve.md: Add a note on the FFR handling.
      	(*aarch64_sve_reinterpret<mode>): Allow any source register
      	instead of requiring an exact match.
      	(*aarch64_sve_ptruevnx16bi_cc, *aarch64_sve_ptrue<mode>_cc)
      	(*aarch64_sve_ptruevnx16bi_ptest, *aarch64_sve_ptrue<mode>_ptest)
      	(aarch64_wrffr, aarch64_update_ffr_for_load, aarch64_copy_ffr_to_ffrt)
      	(aarch64_rdffr, aarch64_rdffr_z, *aarch64_rdffr_z_ptest)
      	(*aarch64_rdffr_ptest, *aarch64_rdffr_z_cc, *aarch64_rdffr_cc)
      	(aarch64_update_ffrt): New patterns.
      	(@aarch64_load_<ANY_EXTEND:optab><VNx8_WIDE:mode><VNx8_NARROW:mode>)
      	(@aarch64_load_<ANY_EXTEND:optab><VNx4_WIDE:mode><VNx4_NARROW:mode>)
      	(@aarch64_load_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>)
      	(@aarch64_ld<fn>f1<mode>): New patterns.
      	(@aarch64_ld<fn>f1_<ANY_EXTEND:optab><VNx8_WIDE:mode><VNx8_NARROW:mode>)
      	(@aarch64_ld<fn>f1_<ANY_EXTEND:optab><VNx4_WIDE:mode><VNx4_NARROW:mode>)
      	(@aarch64_ld<fn>f1_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>)
      	(@aarch64_ldnt1<mode>): New patterns.
      	(gather_load<mode>): Use aarch64_sve_gather_offset_<Vesize> for
      	the scalar part of the address.
      	(mask_gather_load<SVE_S:mode>): Use aarch64_sve_gather_offset_w for the
      	scalar part of the address and add an alternative for handling
      	nonzero offsets.
      	(mask_gather_load<SVE_D:mode>): Likewise aarch64_sve_gather_offset_d.
      	(*mask_gather_load<mode>_sxtw, *mask_gather_load<mode>_uxtw)
      	(@aarch64_gather_load_<ANY_EXTEND:optab><VNx4_WIDE:mode><VNx4_NARROW:mode>)
      	(@aarch64_gather_load_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>)
      	(*aarch64_gather_load_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>_sxtw)
      	(*aarch64_gather_load_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>_uxtw)
      	(@aarch64_ldff1_gather<SVE_S:mode>, @aarch64_ldff1_gather<SVE_D:mode>)
      	(*aarch64_ldff1_gather<mode>_sxtw, *aarch64_ldff1_gather<mode>_uxtw)
      	(@aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx4_WIDE:mode><VNx4_NARROW:mode>)
      	(@aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>)
      	(*aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>_sxtw)
      	(*aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>_uxtw)
      	(@aarch64_sve_prefetch<mode>): New patterns.
      	(@aarch64_sve_gather_prefetch<SVE_I:mode><VNx4SI_ONLY:mode>)
      	(@aarch64_sve_gather_prefetch<SVE_I:mode><VNx2DI_ONLY:mode>)
      	(*aarch64_sve_gather_prefetch<SVE_I:mode><VNx2DI_ONLY:mode>_sxtw)
      	(*aarch64_sve_gather_prefetch<SVE_I:mode><VNx2DI_ONLY:mode>_uxtw)
      	(@aarch64_store_trunc<VNx8_NARROW:mode><VNx8_WIDE:mode>)
      	(@aarch64_store_trunc<VNx4_NARROW:mode><VNx4_WIDE:mode>)
      	(@aarch64_store_trunc<VNx2_NARROW:mode><VNx2_WIDE:mode>)
      	(@aarch64_stnt1<mode>): New patterns.
      	(scatter_store<mode>): Use aarch64_sve_gather_offset_<Vesize> for
      	the scalar part of the address.
      	(mask_scatter_store<SVE_S:mode>): Use aarch64_sve_gather_offset_w for
      	the scalar part of the address and add an alternative for handling
      	nonzero offsets.
      	(mask_scatter_store<SVE_D:mode>): Likewise aarch64_sve_gather_offset_d.
      	(*mask_scatter_store<mode>_sxtw, *mask_scatter_store<mode>_uxtw)
      	(@aarch64_scatter_store_trunc<VNx4_NARROW:mode><VNx4_WIDE:mode>)
      	(@aarch64_scatter_store_trunc<VNx2_NARROW:mode><VNx2_WIDE:mode>)
      	(*aarch64_scatter_store_trunc<VNx2_NARROW:mode><VNx2_WIDE:mode>_sxtw)
      	(*aarch64_scatter_store_trunc<VNx2_NARROW:mode><VNx2_WIDE:mode>_uxtw):
      	New patterns.
      	(vec_duplicate<mode>): Use QI as the mode of the input operand.
      	(extract_last_<mode>): Generalize to...
      	(@extract_<LAST:last_op>_<mode>): ...this.
      	(*<SVE_INT_UNARY:optab><mode>2): Rename to...
      	(@aarch64_pred_<SVE_INT_UNARY:optab><mode>): ...this.
      	(@cond_<SVE_INT_UNARY:optab><mode>): New expander.
      	(@aarch64_pred_sxt<SVE_HSDI:mode><SVE_PARTIAL:mode>): New pattern.
      	(@aarch64_cond_sxt<SVE_HSDI:mode><SVE_PARTIAL:mode>): Likewise.
      	(@aarch64_pred_cnot<mode>, @cond_cnot<mode>): New expanders.
      	(@aarch64_sve_<SVE_FP_UNARY_INT:optab><mode>): New pattern.
      	(@aarch64_sve_<SVE_FP_UNARY:optab><mode>): Likewise.
      	(*<SVE_COND_FP_UNARY:optab><mode>2): Rename to...
      	(@aarch64_pred_<SVE_COND_FP_UNARY:optab><mode>): ...this.
      	(@cond_<SVE_COND_FP_UNARY:optab><mode>): New expander.
      	(*<SVE_INT_BINARY_IMM:optab><mode>3): Rename to...
      	(@aarch64_pred_<SVE_INT_BINARY_IMM:optab><mode>): ...this.
      	(@aarch64_adr<mode>, *aarch64_adr_sxtw): New patterns.
      	(*aarch64_adr_uxtw_unspec): Likewise.
      	(*aarch64_adr_uxtw): Rename to...
      	(*aarch64_adr_uxtw_and): ...this.
      	(@aarch64_adr<mode>_shift): New expander.
      	(*aarch64_adr_shift_sxtw): New pattern.
      	(aarch64_<su>abd<mode>_3): Rename to...
      	(@aarch64_pred_<su>abd<mode>): ...this.
      	(<su>abd<mode>_3): Update accordingly.
      	(@aarch64_cond_<su>abd<mode>): New expander.
      	(@aarch64_<SBINQOPS:su_optab><optab><mode>): New pattern.
      	(@aarch64_<UBINQOPS:su_optab><optab><mode>): Likewise.
      	(*<su>mul<mode>3_highpart): Rename to...
      	(@aarch64_pred_<optab><mode>): ...this.
      	(@cond_<MUL_HIGHPART:optab><mode>): New expander.
      	(*cond_<MUL_HIGHPART:optab><mode>_2): New pattern.
      	(*cond_<MUL_HIGHPART:optab><mode>_z): Likewise.
      	(*<SVE_INT_BINARY_SD:optab><mode>3): Rename to...
      	(@aarch64_pred_<SVE_INT_BINARY_SD:optab><mode>): ...this.
      	(cond_<SVE_INT_BINARY_SD:optab><mode>): Add a "@" marker.
      	(@aarch64_bic<mode>, @cond_bic<mode>): New expanders.
      	(*v<ASHIFT:optab><mode>3): Rename to...
      	(@aarch64_pred_<ASHIFT:optab><mode>): ...this.
      	(@aarch64_sve_<SVE_SHIFT_WIDE:sve_int_op><mode>): New pattern.
      	(@cond_<SVE_SHIFT_WIDE:sve_int_op><mode>): New expander.
      	(*cond_<SVE_SHIFT_WIDE:sve_int_op><mode>_m): New pattern.
      	(*cond_<SVE_SHIFT_WIDE:sve_int_op><mode>_z): Likewise.
      	(@cond_asrd<mode>): New expander.
      	(*cond_asrd<mode>_2, *cond_asrd<mode>_z): New patterns.
      	(sdiv_pow2<mode>3): Expand to *cond_asrd<mode>_2.
      	(*sdiv_pow2<mode>3): Delete.
      	(@cond_<SVE_COND_FP_BINARY_INT:optab><mode>): New expander.
      	(*cond_<SVE_COND_FP_BINARY_INT:optab><mode>_2): New pattern.
      	(*cond_<SVE_COND_FP_BINARY_INT:optab><mode>_any): Likewise.
      	(@aarch64_sve_<SVE_FP_BINARY:optab><mode>): New pattern.
      	(@aarch64_sve_<SVE_FP_BINARY_INT:optab><mode>): Likewise.
      	(*<SVE_COND_FP_BINARY_REG:optab><mode>3): Rename to...
      	(@aarch64_pred_<SVE_COND_FP_BINARY_REG:optab><mode>): ...this.
      	(@aarch64_pred_<SVE_COND_FP_BINARY_INT:optab><mode>): New pattern.
      	(cond_<SVE_COND_FP_BINARY:optab><mode>): Add a "@" marker.
      	(*add<SVE_F:mode>3): Rename to...
      	(@aarch64_pred_add<SVE_F:mode>): ...this and add alternatives
      	for SVE_STRICT_GP.
      	(@aarch64_pred_<SVE_COND_FCADD:optab><mode>): New pattern.
      	(@cond_<SVE_COND_FCADD:optab><mode>): New expander.
      	(*cond_<SVE_COND_FCADD:optab><mode>_2): New pattern.
      	(*cond_<SVE_COND_FCADD:optab><mode>_any): Likewise.
      	(*sub<SVE_F:mode>3): Rename to...
      	(@aarch64_pred_sub<SVE_F:mode>): ...this and add alternatives
      	for SVE_STRICT_GP.
      	(@aarch64_pred_abd<SVE_F:mode>): New expander.
      	(*fabd<SVE_F:mode>3): Rename to...
      	(*aarch64_pred_abd<SVE_F:mode>): ...this.
      	(@aarch64_cond_abd<SVE_F:mode>): New expander.
      	(*mul<SVE_F:mode>3): Rename to...
      	(@aarch64_pred_<SVE_F:optab><mode>): ...this and add alternatives
      	for SVE_STRICT_GP.
      	(@aarch64_mul_lane_<SVE_F:mode>): New pattern.
      	(*<SVE_COND_FP_MAXMIN_PUBLIC:optab><mode>3): Rename and generalize
      	to...
      	(@aarch64_pred_<SVE_COND_FP_MAXMIN:optab><mode>): ...this.
      	(*<LOGICAL:optab><PRED_ALL:mode>3_ptest): New pattern.
      	(*<nlogical><PRED_ALL:mode>3): Rename to...
      	(aarch64_pred_<nlogical><PRED_ALL:mode>_z): ...this.
      	(*<nlogical><PRED_ALL:mode>3_cc): New pattern.
      	(*<nlogical><PRED_ALL:mode>3_ptest): Likewise.
      	(*<logical_nn><PRED_ALL:mode>3): Rename to...
      	(aarch64_pred_<logical_nn><mode>_z): ...this.
      	(*<logical_nn><PRED_ALL:mode>3_cc): New pattern.
      	(*<logical_nn><PRED_ALL:mode>3_ptest): Likewise.
      	(*fma<SVE_I:mode>4): Rename to...
      	(@aarch64_pred_fma<SVE_I:mode>): ...this.
      	(*fnma<SVE_I:mode>4): Rename to...
      	(@aarch64_pred_fnma<SVE_I:mode>): ...this.
      	(@aarch64_<sur>dot_prod_lane<vsi2qi>): New pattern.
      	(*<SVE_FP_TERNARY:optab><mode>4): Rename to...
      	(@aarch64_pred_<SVE_FP_TERNARY:optab><mode>): ...this.
      	(cond_<SVE_FP_TERNARY:optab><mode>): Add a "@" marker.
      	(@aarch64_<SVE_FP_TERNARY_LANE:optab>_lane_<mode>): New pattern.
      	(@aarch64_pred_<SVE_COND_FCMLA:optab><mode>): Likewise.
      	(@cond_<SVE_COND_FCMLA:optab><mode>): New expander.
      	(*cond_<SVE_COND_FCMLA:optab><mode>_4): New pattern.
      	(*cond_<SVE_COND_FCMLA:optab><mode>_any): Likewise.
      	(@aarch64_<FCMLA:optab>_lane_<mode>): Likewise.
      	(@aarch64_sve_tmad<mode>): Likewise.
      	(vcond_mask_<SVE_ALL:mode><vpred>): Add a "@" marker.
      	(*aarch64_sel_dup<mode>): Rename to...
      	(@aarch64_sel_dup<mode>): ...this.
      	(@aarch64_pred_cmp<cmp_op><SVE_I:mode>_wide): New pattern.
      	(*aarch64_pred_cmp<cmp_op><SVE_I:mode>_wide_cc): Likewise.
      	(*aarch64_pred_cmp<cmp_op><SVE_I:mode>_wide_ptest): Likewise.
      	(@while_ult<GPI:mode><PRED_ALL:mode>): Generalize to...
      	(@while_<while_optab_cmp><GPI:mode><PRED_ALL:mode>): ...this.
      	(*while_ult<GPI:mode><PRED_ALL:mode>_cc): Generalize to...
      	(*while_<while_optab_cmp><GPI:mode><PRED_ALL:mode>_cc): ...this.
      	(*while_<while_optab_cmp><GPI:mode><PRED_ALL:mode>_ptest): New pattern.
      	(*fcm<cmp_op><mode>): Rename to...
      	(@aarch64_pred_fcm<cmp_op><mode>): ...this.  Make operand order
      	match @aarch64_pred_cmp<cmp_op><SVE_I:mode>.
      	(*fcmuo<mode>): Rename to...
      	(@aarch64_pred_fcmuo<mode>): ...this.  Make operand order
      	match @aarch64_pred_cmp<cmp_op><SVE_I:mode>.
      	(@aarch64_pred_fac<cmp_op><mode>): New expander.
      	(@vcond_mask_<PRED_ALL:mode><mode>): New pattern.
      	(fold_extract_last_<mode>): Generalize to...
      	(@fold_extract_<last_op>_<mode>): ...this.
      	(@aarch64_fold_extract_vector_<last_op>_<mode>): New pattern.
      	(*reduc_plus_scal_<SVE_I:mode>): Replace with...
      	(@aarch64_pred_reduc_<optab>_<mode>): ...this pattern, making the
      	DImode result explicit.
      	(reduc_plus_scal_<mode>): Update accordingly.
      	(*reduc_<optab>_scal_<SVE_I:mode>): Rename to...
      	(@aarch64_pred_reduc_<optab>_<SVE_I:mode>): ...this.
      	(*reduc_<optab>_scal_<SVE_F:mode>): Rename to...
      	(@aarch64_pred_reduc_<optab>_<SVE_F:mode>): ...this.
      	(*aarch64_sve_tbl<mode>): Rename to...
      	(@aarch64_sve_tbl<mode>): ...this.
      	(@aarch64_sve_compact<mode>): New pattern.
      	(*aarch64_sve_dup_lane<mode>): Rename to...
      	(@aarch64_sve_dup_lane<mode>): ...this.
      	(@aarch64_sve_dupq_lane<mode>): New pattern.
      	(@aarch64_sve_splice<mode>): Likewise.
      	(aarch64_sve_<perm_insn><mode>): Rename to...
      	(@aarch64_sve_<perm_insn><mode>): ...this.
      	(*aarch64_sve_ext<mode>): Rename to...
      	(@aarch64_sve_ext<mode>): ...this.
      	(aarch64_sve_<su>unpk<perm_hilo>_<SVE_BHSI:mode>): Add a "@" marker.
      	(*aarch64_sve_<optab>_nontrunc<SVE_F:mode><SVE_HSDI:mode>): Rename
      	to...
      	(@aarch64_sve_<optab>_nontrunc<SVE_F:mode><SVE_HSDI:mode>): ...this.
      	(*aarch64_sve_<optab>_trunc<VNx2DF_ONLY:mode><VNx4SI_ONLY:mode>):
      	Rename to...
      	(@aarch64_sve_<optab>_trunc<VNx2DF_ONLY:mode><VNx4SI_ONLY:mode>):
      	...this.
      	(@cond_<optab>_nontrunc<SVE_F:mode><SVE_HSDI:mode>): New expander.
      	(@cond_<optab>_trunc<VNx2DF_ONLY:mode><VNx4SI_ONLY:mode>): Likewise.
      	(*cond_<optab>_trunc<VNx2DF_ONLY:mode><VNx4SI_ONLY:mode>): New pattern.
      	(*aarch64_sve_<optab>_nonextend<SVE_HSDI:mode><SVE_F:mode>): Rename
      	to...
      	(@aarch64_sve_<optab>_nonextend<SVE_HSDI:mode><SVE_F:mode>): ...this.
      	(aarch64_sve_<optab>_extend<VNx4SI_ONLY:mode><VNx2DF_ONLY:mode>): Add
      	a "@" marker.
      	(@cond_<optab>_nonextend<SVE_HSDI:mode><SVE_F:mode>): New expander.
      	(@cond_<optab>_extend<VNx4SI_ONLY:mode><VNx2DF_ONLY:mode>): Likewise.
      	(*cond_<optab>_extend<VNx4SI_ONLY:mode><VNx2DF_ONLY:mode>): New
      	pattern.
      	(*aarch64_sve_<optab>_trunc<SVE_SDF:mode><SVE_HSF:mode>): Rename to...
      	(@aarch64_sve_<optab>_trunc<SVE_SDF:mode><SVE_HSF:mode>): ...this.
      	(@cond_<optab>_trunc<SVE_SDF:mode><SVE_HSF:mode>): New expander.
      	(*cond_<optab>_trunc<SVE_SDF:mode><SVE_HSF:mode>): New pattern.
      	(aarch64_sve_<optab>_nontrunc<SVE_HSF:mode><SVE_SDF:mode>): Add a
      	"@" marker.
      	(@cond_<optab>_nontrunc<SVE_HSF:mode><SVE_SDF:mode>): New expander.
      	(*cond_<optab>_nontrunc<SVE_HSF:mode><SVE_SDF:mode>): New pattern.
      	(aarch64_sve_punpk<perm_hilo>_<mode>): Add a "@" marker.
      	(@aarch64_brk<SVE_BRK_UNARY:brk_op>): New pattern.
      	(*aarch64_brk<SVE_BRK_UNARY:brk_op>_cc): Likewise.
      	(*aarch64_brk<SVE_BRK_UNARY:brk_op>_ptest): Likewise.
      	(@aarch64_brk<SVE_BRK_BINARY:brk_op>): Likewise.
      	(*aarch64_brk<SVE_BRK_BINARY:brk_op>_cc): Likewise.
      	(*aarch64_brk<SVE_BRK_BINARY:brk_op>_ptest): Likewise.
      	(@aarch64_sve_<SVE_PITER:sve_pred_op><mode>): Likewise.
      	(*aarch64_sve_<SVE_PITER:sve_pred_op><mode>_cc): Likewise.
      	(*aarch64_sve_<SVE_PITER:sve_pred_op><mode>_ptest): Likewise.
      	(aarch64_sve_cnt_pat): Likewise.
      	(@aarch64_sve_<ANY_PLUS:inc_dec><DI_ONLY:mode>_pat): Likewise.
      	(*aarch64_sve_incsi_pat): Likewise.
      	(@aarch64_sve_<SAT_PLUS:inc_dec><SI_ONLY:mode>_pat): Likewise.
      	(@aarch64_sve_<ANY_PLUS:inc_dec><VNx2DI_ONLY:mode>_pat): Likewise.
      	(@aarch64_sve_<ANY_PLUS:inc_dec><VNx4SI_ONLY:mode>_pat): Likewise.
      	(@aarch64_sve_<ANY_PLUS:inc_dec><VNx8HI_ONLY:mode>_pat): New expander.
      	(*aarch64_sve_<ANY_PLUS:inc_dec><VNx8HI_ONLY:mode>_pat): New pattern.
      	(@aarch64_sve_<ANY_MINUS:inc_dec><DI_ONLY:mode>_pat): Likewise.
      	(*aarch64_sve_decsi_pat): Likewise.
      	(@aarch64_sve_<SAT_MINUS:inc_dec><SI_ONLY:mode>_pat): Likewise.
      	(@aarch64_sve_<ANY_MINUS:inc_dec><VNx2DI_ONLY:mode>_pat): Likewise.
      	(@aarch64_sve_<ANY_MINUS:inc_dec><VNx4SI_ONLY:mode>_pat): Likewise.
      	(@aarch64_sve_<ANY_MINUS:inc_dec><VNx8HI_ONLY:mode>_pat): New expander.
      	(*aarch64_sve_<ANY_MINUS:inc_dec><VNx8HI_ONLY:mode>_pat): New pattern.
      	(@aarch64_pred_cntp<mode>): Likewise.
      	(@aarch64_sve_<ANY_PLUS:inc_dec><DI_ONLY:mode><PRED_ALL:mode>_cntp):
      	New expander.
      	(*aarch64_sve_<ANY_PLUS:inc_dec><DI_ONLY:mode><PRED_ALL:mode>_cntp)
      	(*aarch64_incsi<PRED_ALL:mode>_cntp): New patterns.
      	(@aarch64_sve_<SAT_PLUS:inc_dec><SI_ONLY:mode><PRED_ALL:mode>_cntp):
      	New expander.
      	(*aarch64_sve_<SAT_PLUS:inc_dec><SI_ONLY:mode><PRED_ALL:mode>_cntp):
      	New pattern.
      	(@aarch64_sve_<ANY_PLUS:inc_dec><VNx2DI_ONLY:mode>_cntp): New expander.
      	(*aarch64_sve_<ANY_PLUS:inc_dec><VNx2DI_ONLY:mode>_cntp): New pattern.
      	(@aarch64_sve_<ANY_PLUS:inc_dec><VNx4SI_ONLY:mode>_cntp): New expander.
      	(*aarch64_sve_<ANY_PLUS:inc_dec><VNx4SI_ONLY:mode>_cntp): New pattern.
      	(@aarch64_sve_<ANY_PLUS:inc_dec><VNx8HI_ONLY:mode>_cntp): New expander.
      	(*aarch64_sve_<ANY_PLUS:inc_dec><VNx8HI_ONLY:mode>_cntp): New pattern.
      	(@aarch64_sve_<ANY_MINUS:inc_dec><DI_ONLY:mode><PRED_ALL:mode>_cntp):
      	New expander.
      	(*aarch64_sve_<ANY_MINUS:inc_dec><DI_ONLY:mode><PRED_ALL:mode>_cntp)
      	(*aarch64_incsi<PRED_ALL:mode>_cntp): New patterns.
      	(@aarch64_sve_<SAT_MINUS:inc_dec><SI_ONLY:mode><PRED_ALL:mode>_cntp):
      	New expander.
      	(*aarch64_sve_<SAT_MINUS:inc_dec><SI_ONLY:mode><PRED_ALL:mode>_cntp):
      	New pattern.
      	(@aarch64_sve_<ANY_MINUS:inc_dec><VNx2DI_ONLY:mode>_cntp): New
      	expander.
      	(*aarch64_sve_<ANY_MINUS:inc_dec><VNx2DI_ONLY:mode>_cntp): New pattern.
      	(@aarch64_sve_<ANY_MINUS:inc_dec><VNx4SI_ONLY:mode>_cntp): New
      	expander.
      	(*aarch64_sve_<ANY_MINUS:inc_dec><VNx4SI_ONLY:mode>_cntp): New pattern.
      	(@aarch64_sve_<ANY_MINUS:inc_dec><VNx8HI_ONLY:mode>_cntp): New
      	expander.
      	(*aarch64_sve_<ANY_MINUS:inc_dec><VNx8HI_ONLY:mode>_cntp): New pattern.
      	* config/aarch64/arm_sve.h: New file.
      	* config/aarch64/aarch64-sve-builtins.h: Likewise.
      	* config/aarch64/aarch64-sve-builtins.cc: Likewise.
      	* config/aarch64/aarch64-sve-builtins.def: Likewise.
      	* config/aarch64/aarch64-sve-builtins-base.h: Likewise.
      	* config/aarch64/aarch64-sve-builtins-base.cc: Likewise.
      	* config/aarch64/aarch64-sve-builtins-base.def: Likewise.
      	* config/aarch64/aarch64-sve-builtins-functions.h: Likewise.
      	* config/aarch64/aarch64-sve-builtins-shapes.h: Likewise.
      	* config/aarch64/aarch64-sve-builtins-shapes.cc: Likewise.
      
      gcc/testsuite/
      	* g++.target/aarch64/sve/acle/aarch64-sve-acle.exp: New file.
      	* g++.target/aarch64/sve/acle/general-c++: New test directory.
      	* gcc.target/aarch64/sve/acle/aarch64-sve-acle.exp: New file.
      	* gcc.target/aarch64/sve/acle/general: New test directory.
      	* gcc.target/aarch64/sve/acle/general-c: Likewise.
      
      Co-Authored-By: Kugan Vivekanandarajah <kuganv@linaro.org>
      Co-Authored-By: Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org>
      
      From-SVN: r277563
      Richard Sandiford committed
    • [AArch64] Extend SVE reverse permutes to predicates · 28350fd1
      This is tested by the main SVE ACLE patches, but since it affects
      the evpc routines, it seemed worth splitting out.
      
      2019-10-29  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* config/aarch64/aarch64-sve.md (@aarch64_sve_rev<PRED_ALL:mode>):
      	New pattern.
      	* config/aarch64/aarch64.c (aarch64_evpc_rev_global): Handle all
      	SVE modes.
      
      From-SVN: r277562
      Richard Sandiford committed
  4. 30 Sep, 2019 1 commit
    • [AArch64][SVE] Utilize ASRD instruction for division and remainder · c0c2f013
      2019-09-30  Yuliang Wang  <yuliang.wang@arm.com>
      
      gcc/
      	* config/aarch64/aarch64-sve.md (sdiv_pow2<mode>3):
      	New pattern for ASRD.
      	* config/aarch64/iterators.md (UNSPEC_ASRD): New unspec.
      	* internal-fn.def (IFN_DIV_POW2): New internal function.
      	* optabs.def (sdiv_pow2_optab): New optab.
      	* tree-vect-patterns.c (vect_recog_divmod_pattern):
      	Modify pattern to support new operation.
      	* doc/md.texi (sdiv_pow2@var{m}3): Documentation for the above.
      	* doc/sourcebuild.texi (vect_sdiv_pow2_si):
      	Document new target selector.
      
      gcc/testsuite/
      	* gcc.dg/vect/vect-sdiv-pow2-1.c: New test.
      	* gcc.target/aarch64/sve/asrdiv_1.c: As above.
      	* lib/target-supports.exp (check_effective_target_vect_sdiv_pow2_si):
      	Return true for AArch64 with SVE.
      
      From-SVN: r276343
      Yuliang Wang committed
  5. 22 Aug, 2019 1 commit
  6. 15 Aug, 2019 14 commits
    • [AArch64] Tweak operand choice for SVE predicate AND · 2d2388f8
      SVE defines an assembly alias:
      
         MOV pa.B, pb/Z, pc.B  ->  AND pa.B, pb/Z, pc.B, pc.B
      
      Our and<mode>3 pattern was instead using the functionally-equivalent:
      
         AND pa.B, pb/Z, pb.B, pc.B
                         ^^^^
      This patch duplicates pc.B instead so that the alias can be seen
      in disassembly.
      
      I wondered about using the alias in the pattern instead, but using AND
      explicitly seems to fit better with the pattern name and surrounding code.
      
      2019-08-15  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* config/aarch64/aarch64-sve.md (and<PRED_ALL:mode>3): Make the
      	operand order match the MOV /Z alias.
      
      From-SVN: r274521
      Richard Sandiford committed
    • [AArch64] Rework SVE INC/DEC handling · 0fdc30bc
      The scalar addition patterns allowed all the VL constants that
      ADDVL and ADDPL allow, but wrote the instructions as INC or DEC
      if possible (i.e. adding or subtracting a number of elements * [1, 16]
      when the source and target registers are the same).  That works for the
      cases that the autovectoriser needs, but there are a few constants
      that INC and DEC can handle but ADDPL and ADDVL can't.  E.g.:
      
              inch    x0, all, mul #9
      
      is not a multiple of the number of bytes in an SVE register, and so
      can't use ADDVL.  It represents 36 times the number of bytes in an
      SVE predicate, putting it outside the range of ADDPL.
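
      To make the arithmetic concrete, here is the same calculation for a
      256-bit vector; the numbers and the helper are purely illustrative and
      are not part of the patch:

         /* Hypothetical breakdown for VL = 32 bytes (a 256-bit vector).  */
         int
         inch_mul_9_breakdown (void)
         {
           int vl = 32;                /* bytes per SVE vector */
           int pl = vl / 8;            /* bytes per SVE predicate: 4 */
           int delta = 9 * (vl / 2);   /* "inch x0, all, mul #9" adds 9 * 16 = 144 */
           /* 144 is not a multiple of vl (4.5 vectors), so ADDVL cannot encode
              it; it is 36 * pl, but ADDPL's multiplier range is only [-32, 31],
              so only INC/DEC can add it in one instruction.  */
           return delta / pl;          /* 36 */
         }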
      
      This patch therefore adds separate alternatives for INC and DEC,
      tied to a new Uai constraint.  It also adds an explicit "scalar"
      or "vector" to the function names, to avoid a clash with the
      existing support for vector INC and DEC.
      
      2019-08-15  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* config/aarch64/aarch64-protos.h
      	(aarch64_sve_scalar_inc_dec_immediate_p): Declare.
      	(aarch64_sve_inc_dec_immediate_p): Rename to...
      	(aarch64_sve_vector_inc_dec_immediate_p): ...this.
      	(aarch64_output_sve_addvl_addpl): Take a single rtx argument.
      	(aarch64_output_sve_scalar_inc_dec): Declare.
      	(aarch64_output_sve_inc_dec_immediate): Rename to...
      	(aarch64_output_sve_vector_inc_dec): ...this.
      	* config/aarch64/aarch64.c (aarch64_sve_scalar_inc_dec_immediate_p)
      	(aarch64_output_sve_scalar_inc_dec): New functions.
      	(aarch64_output_sve_addvl_addpl): Remove the base and offset
      	arguments.  Only handle true ADDVL and ADDPL instructions;
      	don't emit an INC or DEC.
      	(aarch64_sve_inc_dec_immediate_p): Rename to...
      	(aarch64_sve_vector_inc_dec_immediate_p): ...this.
      	(aarch64_output_sve_inc_dec_immediate): Rename to...
      	(aarch64_output_sve_vector_inc_dec): ...this.  Update call to
      	aarch64_sve_vector_inc_dec_immediate_p.
      	* config/aarch64/predicates.md (aarch64_sve_scalar_inc_dec_immediate)
      	(aarch64_sve_plus_immediate): New predicates.
      	(aarch64_pluslong_operand): Accept aarch64_sve_plus_immediate
      	rather than aarch64_sve_addvl_addpl_immediate.
      	(aarch64_sve_inc_dec_immediate): Rename to...
      	(aarch64_sve_vector_inc_dec_immediate): ...this.  Update call to
      	aarch64_sve_vector_inc_dec_immediate_p.
      	(aarch64_sve_add_operand): Update accordingly.
      	* config/aarch64/constraints.md (Uai): New constraint.
      	(vsi): Update call to aarch64_sve_vector_inc_dec_immediate_p.
      	* config/aarch64/aarch64.md (add<GPI:mode>3): Don't force the second
      	operand into a register if it satisfies aarch64_sve_plus_immediate.
      	(*add<GPI:mode>3_aarch64, *add<GPI:mode>3_poly_1): Add an alternative
      	for Uai.  Update calls to aarch64_output_sve_addvl_addpl.
      	* config/aarch64/aarch64-sve.md (add<mode>3): Call
      	aarch64_output_sve_vector_inc_dec instead of
      	aarch64_output_sve_inc_dec_immediate.
      
      From-SVN: r274518
      Richard Sandiford committed
    • [AArch64] Rework SVE REV[BHW] patterns · d7a09c44
      The current SVE REV patterns follow the AArch64 scheme, in which
      UNSPEC_REV<NN> reverses elements within an <NN>-bit granule.
      E.g. UNSPEC_REV64 on VNx8HI reverses the four 16-bit elements
      within each 64-bit granule.
      
      The native SVE scheme is the other way around: UNSPEC_REV64 is seen
      as an operation on 64-bit elements, with REVB swapping bytes within
      the elements, REVH swapping halfwords, and so on.  This fits SVE more
      naturally because the operation can then be predicated per <NN>-bit
      granule/element.
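
      As a rough illustration (the values and layout here are invented for
      exposition, not taken from the patch), take the four halfwords of one
      64-bit granule of a VNx8HI vector:

         /* One 64-bit granule before and after the reversal.  */
         unsigned short before[4] = { 0x0001, 0x0002, 0x0003, 0x0004 };
         unsigned short after[4]  = { 0x0004, 0x0003, 0x0002, 0x0001 };
         /* Advanced SIMD view: REV64 reverses the 16-bit elements within each
            64-bit granule.  Native SVE view: REVH acts on one 64-bit element
            and swaps the halfwords inside it.  The result is the same, but the
            SVE view lets the operation be predicated per 64-bit element.  */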
      
      Making the patterns use the Advanced SIMD scheme was more natural
      when all we cared about were permutes, since we could then use
      the source and target of the permute in their original modes.
      However, the ACLE does need patterns that follow the native scheme,
      treating them as operations on integer elements.  This patch defines
      the patterns that way instead and updates the existing uses to match.
      
      This also brings in a couple of helper routines from the ACLE branch.
      
      2019-08-15  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* config/aarch64/iterators.md (UNSPEC_REVB, UNSPEC_REVH)
      	(UNSPEC_REVW): New constants.
      	(elem_bits): New mode attribute.
      	(SVE_INT_UNARY): New int iterator.
      	(optab): Handle UNSPEC_REV[BHW].
      	(sve_int_op): New int attribute.
      	(min_elem_bits): Handle VNx16QI and the predicate modes.
      	* config/aarch64/aarch64-sve.md (*aarch64_sve_rev64<mode>)
      	(*aarch64_sve_rev32<mode>, *aarch64_sve_rev16vnx16qi): Delete.
      	(@aarch64_pred_<SVE_INT_UNARY:optab><SVE_I:mode>): New pattern.
      	* config/aarch64/aarch64.c (aarch64_sve_data_mode): New function.
      	(aarch64_sve_int_mode, aarch64_sve_rev_unspec): Likewise.
      	(aarch64_split_sve_subreg_move): Use UNSPEC_REV[BHW] instead of
      	unspecs based on the total width of the reversed data.
      	(aarch64_evpc_rev_local): Likewise (for SVE only).  Use a
      	reinterpret followed by a subreg on big-endian targets.
      
      gcc/testsuite/
      	* gcc.target/aarch64/sve/revb_1.c: Restrict to little-endian targets.
      	Avoid including stdint.h.
      	* gcc.target/aarch64/sve/revh_1.c: Likewise.
      	* gcc.target/aarch64/sve/revw_1.c: Likewise.
      	* gcc.target/aarch64/sve/revb_2.c: New big-endian test.
      	* gcc.target/aarch64/sve/revh_2.c: Likewise.
      	* gcc.target/aarch64/sve/revw_2.c: Likewise.
      
      From-SVN: r274517
      Richard Sandiford committed
    • [AArch64] Add more SVE FMLA and FMAD /z alternatives · 432b29c1
      This patch makes the floating-point conditional FMA patterns provide the
      same /z alternatives as the integer patterns added by a previous patch.
      We can handle cases in which individual inputs are allocated to the same
      register as the output, so we don't need to force all registers to be
      different.
      
      2019-08-15  Richard Sandiford  <richard.sandiford@arm.com>
      	    Kugan Vivekanandarajah  <kugan.vivekanandarajah@linaro.org>
      
      gcc/
      	* config/aarch64/aarch64-sve.md
      	(*cond_<SVE_COND_FP_TERNARY:optab><SVE_F:mode>_any): Add /z
      	alternatives in which one of the inputs is in the same register
      	as the output.
      
      gcc/testsuite/
      	* gcc.target/aarch64/sve/cond_mla_5.c: Allow FMAD as well as FMLA
      	and FMSB as well as FMLS.
      
      Co-Authored-By: Kugan Vivekanandarajah <kuganv@linaro.org>
      
      From-SVN: r274516
      Richard Sandiford committed
    • [AArch64] Add MOVPRFX alternatives for SVE EXT patterns · 06b3ba23
      We use EXT both to implement vec_extract for large indices and as a
      permute.  In both cases we can use MOVPRFX to handle the case in which
      the first input and output can't be tied.
      
      2019-08-15  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* config/aarch64/aarch64-sve.md (*vec_extract<mode><Vel>_ext)
      	(*aarch64_sve_ext<mode>): Add MOVPRFX alternatives.
      
      gcc/testsuite/
      	* gcc.target/aarch64/sve/ext_2.c: Expect a MOVPRFX.
      	* gcc.target/aarch64/sve/ext_3.c: New test.
      
      From-SVN: r274515
      Richard Sandiford committed
    • [AArch64] Remove unneeded FSUB alternatives and add a new one · 2ae21bd1
      The floating-point subtraction patterns don't need to handle
      subtraction of constants, since those go through the addition
      patterns instead.  There was a missing MOVPRFX alternative for
      FSUBR though.
      
      2019-08-15  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* config/aarch64/aarch64-sve.md (*sub<SVE_F:mode>3): Remove immediate
      	FADD and FSUB alternatives.  Add a MOVPRFX alternative for FSUBR.
      
      From-SVN: r274514
      Richard Sandiford committed
    • [AArch64] Add more unpredicated MOVPRFX alternatives · 5e176a61
      FABD and some immediate instructions were missing MOVPRFX alternatives.
      This is tested by the ACLE patches but is really an independent improvement.
      
      2019-08-15  Richard Sandiford  <richard.sandiford@arm.com>
      	    Kugan Vivekanandarajah  <kugan.vivekanandarajah@linaro.org>
      
      gcc/
      	* config/aarch64/aarch64-sve.md (add<SVE_I:mode>3, sub<SVE_I:mode>3)
      	(<LOGICAL:optab><SVE_I:mode>3, *add<SVE_F:mode>3, *mul<SVE_F:mode>3)
      	(*fabd<SVE_F:mode>3): Add more MOVPRFX alternatives.
      
      Co-Authored-By: Kugan Vivekanandarajah <kuganv@linaro.org>
      
      From-SVN: r274513
      Richard Sandiford committed
    • [AArch64] Use SVE reversed shifts in preference to MOVPRFX · 7d1f2401
      This patch makes us use reversed SVE shifts when the first operand
      can't be tied to the output but the second can.  This is tested
      more thoroughly by the ACLE patches but is really an independent
      improvement.
      
      2019-08-15  Richard Sandiford  <richard.sandiford@arm.com>
      	    Prathamesh Kulkarni  <prathamesh.kulkarni@linaro.org>
      
      gcc/
      	* config/aarch64/aarch64-sve.md (*v<ASHIFT:optab><SVE_I:mode>3):
      	Add an alternative that uses reversed shifts.
      
      gcc/testsuite/
      	* gcc.target/aarch64/sve/shift_1.c: Accept reversed shifts.
      
      Co-Authored-By: Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org>
      
      From-SVN: r274512
      Richard Sandiford committed
    • [AArch64] Add a commutativity marker to the SVE [SU]ABD patterns · 9a8d9b3f
      This will be tested by the ACLE patches, but it's really an
      independent improvement.
      
      2019-08-15  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* config/aarch64/aarch64-sve.md (aarch64_<su>abd<mode>_3): Add
      	a commutativity marker.
      
      From-SVN: r274510
      Richard Sandiford committed
    • [AArch64] Use SVE MLA, MLS, MAD and MSB for conditional arithmetic · b6c3aea1
      This patch uses predicated MLA, MLS, MAD and MSB to implement
      conditional "FMA"s on integers.  This also requires providing
      the unpredicated optabs (fma and fnma) since otherwise
      tree-ssa-math-opts.c won't try to use the conditional forms.
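
      As a hypothetical example (the function and array names are invented
      here, not taken from the patch or its tests), a loop of this shape can
      now become a single predicated MLA:

         void
         cond_mla (int *restrict r, int *restrict a, int *restrict b,
                   int *restrict pred, int n)
         {
           for (int i = 0; i < n; ++i)
             /* If-converted to a conditional integer FMA that keeps the old
                value of r[i] where pred[i] is false.  */
             r[i] = pred[i] ? a[i] * b[i] + r[i] : r[i];
         }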
      
      We still want to use shifts and adds in preference to multiplications,
      so the patch makes the optab expanders check for that.
      
      The tests cover floating-point types too, which are already handled,
      and which were already tested to some extent by gcc.dg/vect.
      
      2019-08-15  Richard Sandiford  <richard.sandiford@arm.com>
      	    Kugan Vivekanandarajah  <kugan.vivekanandarajah@linaro.org>
      
      gcc/
      	* config/aarch64/aarch64-protos.h (aarch64_prepare_sve_int_fma)
      	(aarch64_prepare_sve_cond_int_fma): Declare.
      	* config/aarch64/aarch64.c (aarch64_convert_mult_to_shift)
      	(aarch64_prepare_sve_int_fma): New functions.
      	(aarch64_prepare_sve_cond_int_fma): Likewise.
      	* config/aarch64/aarch64-sve.md
      	(cond_<SVE_INT_BINARY:optab><SVE_I:mode>): Add a "@" marker.
      	(fma<SVE_I:mode>4, cond_fma<SVE_I:mode>, *cond_fma<SVE_I:mode>_2)
      	(*cond_fma<SVE_I:mode>_4, *cond_fma<SVE_I:mode>_any, fnma<SVE_I:mode>4)
      	(cond_fnma<SVE_I:mode>, *cond_fnma<SVE_I:mode>_2)
      	(*cond_fnma<SVE_I:mode>_4, *cond_fnma<SVE_I:mode>_any): New patterns.
      	(*madd<mode>): Rename to...
      	(*fma<mode>4): ...this.
      	(*msub<mode>): Rename to...
      	(*fnma<mode>4): ...this.
      
      gcc/testsuite/
      	* gcc.target/aarch64/sve/cond_mla_1.c: New test.
      	* gcc.target/aarch64/sve/cond_mla_1_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_mla_2.c: Likewise.
      	* gcc.target/aarch64/sve/cond_mla_2_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_mla_3.c: Likewise.
      	* gcc.target/aarch64/sve/cond_mla_3_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_mla_4.c: Likewise.
      	* gcc.target/aarch64/sve/cond_mla_4_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_mla_5.c: Likewise.
      	* gcc.target/aarch64/sve/cond_mla_5_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_mla_6.c: Likewise.
      	* gcc.target/aarch64/sve/cond_mla_6_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_mla_7.c: Likewise.
      	* gcc.target/aarch64/sve/cond_mla_7_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_mla_8.c: Likewise.
      	* gcc.target/aarch64/sve/cond_mla_8_run.c: Likewise.
      
      Co-Authored-By: Kugan Vivekanandarajah <kuganv@linaro.org>
      
      From-SVN: r274509
      Richard Sandiford committed
    • [AArch64] Use SVE binary immediate instructions for conditional arithmetic · a19ba9e1
      This patch lets us use the immediate forms of FADD, FSUB, FSUBR,
      FMUL, FMAXNM and FMINNM for conditional arithmetic.  (We already
      use them for normal unconditional arithmetic.)
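
      For instance, in a sketch like the following (illustrative names only),
      the conditional add of 1.0 can now use FADD's immediate form under the
      governing predicate:

         void
         cond_fadd_imm (float *restrict r, float *restrict a,
                        int *restrict pred, int n)
         {
           for (int i = 0; i < n; ++i)
             r[i] = pred[i] ? a[i] + 1.0f : a[i];
         }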
      
      2019-08-15  Richard Sandiford  <richard.sandiford@arm.com>
      	    Kugan Vivekanandarajah  <kugan.vivekanandarajah@linaro.org>
      
      gcc/
      	* config/aarch64/aarch64.c (aarch64_print_vector_float_operand):
      	Print 2.0 naturally.
      	(aarch64_sve_float_mul_immediate_p): Return true for 2.0.
      	* config/aarch64/predicates.md
      	(aarch64_sve_float_negated_arith_immediate): New predicate,
      	renamed from aarch64_sve_float_arith_with_sub_immediate.
      	(aarch64_sve_float_arith_with_sub_immediate): Test for both
      	positive and negative constants.
      	(aarch64_sve_float_arith_with_sub_operand): Redefine as a register
      	or an aarch64_sve_float_arith_with_sub_immediate.
      	* config/aarch64/constraints.md (vsN): Use
      	aarch64_sve_float_negated_arith_immediate.
      	* config/aarch64/iterators.md (SVE_COND_FP_BINARY_I1): New int
      	iterator.
      	(sve_pred_fp_rhs2_immediate): New int attribute.
      	* config/aarch64/aarch64-sve.md
      	(cond_<SVE_COND_FP_BINARY:optab><SVE_F:mode>): Use
      	sve_pred_fp_rhs1_operand and sve_pred_fp_rhs2_operand.
      	(*cond_<SVE_COND_FP_BINARY_I1:optab><SVE_F:mode>_2_const)
      	(*cond_<SVE_COND_FP_BINARY_I1:optab><SVE_F:mode>_any_const)
      	(*cond_add<SVE_F:mode>_2_const, *cond_add<SVE_F:mode>_any_const)
      	(*cond_sub<mode>_3_const, *cond_sub<mode>_any_const): New patterns.
      
      gcc/testsuite/
      	* gcc.target/aarch64/sve/cond_fadd_1.c: New test.
      	* gcc.target/aarch64/sve/cond_fadd_1_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fadd_2.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fadd_2_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fadd_3.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fadd_3_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fadd_4.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fadd_4_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fsubr_1.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fsubr_1_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fsubr_2.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fsubr_2_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fsubr_3.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fsubr_3_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fsubr_4.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fsubr_4_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fmaxnm_1.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fmaxnm_1_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fmaxnm_2.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fmaxnm_2_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fmaxnm_3.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fmaxnm_3_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fmaxnm_4.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fmaxnm_4_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fminnm_1.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fminnm_1_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fminnm_2.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fminnm_2_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fminnm_3.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fminnm_3_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fminnm_4.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fminnm_4_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fmul_1.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fmul_1_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fmul_2.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fmul_2_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fmul_3.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fmul_3_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fmul_4.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fmul_4_run.c: Likewise.
      
      Co-Authored-By: Kugan Vivekanandarajah <kuganv@linaro.org>
      
      From-SVN: r274508
      Richard Sandiford committed
    • [AArch64] Use SVE FABD in conditional arithmetic · bf30864e
      This patch extends the FABD support so that it handles conditional
      arithmetic.  We're relying on combine for this, since there's no
      associated IFN_COND_* (yet?).
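
      A hypothetical loop of the kind combine can now turn into a predicated
      FABD (the names are illustrative, not from the new tests):

         void
         cond_fabd (float *restrict r, float *restrict a, float *restrict b,
                    int *restrict pred, int n)
         {
           for (int i = 0; i < n; ++i)
             r[i] = pred[i] ? __builtin_fabsf (a[i] - b[i]) : a[i];
         }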
      
      2019-08-15  Richard Sandiford  <richard.sandiford@arm.com>
      	    Kugan Vivekanandarajah  <kugan.vivekanandarajah@linaro.org>
      
      gcc/
      	* config/aarch64/aarch64-sve.md (*aarch64_cond_abd<SVE_F:mode>_2)
      	(*aarch64_cond_abd<SVE_F:mode>_3)
      	(*aarch64_cond_abd<SVE_F:mode>_any): New patterns.
      
      gcc/testsuite/
      	* gcc.target/aarch64/sve/cond_fabd_1.c: New test.
      	* gcc.target/aarch64/sve/cond_fabd_1_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fabd_2.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fabd_2_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fabd_3.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fabd_3_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fabd_4.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fabd_4_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fabd_5.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fabd_5_run.c: Likewise.
      
      Co-Authored-By: Kugan Vivekanandarajah <kuganv@linaro.org>
      
      From-SVN: r274507
      Richard Sandiford committed
    • [AArch64] Use SVE [SU]ABD in conditional arithmetic · 9730c5cc
      This patch extends the [SU]ABD support so that it handles
      conditional arithmetic.  We're relying on combine for this,
      since there's no associated IFN_COND_* (yet?).
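
      A sketch of the unsigned case this targets (names invented here for
      illustration):

         void
         cond_uabd (unsigned int *restrict r, unsigned int *restrict a,
                    unsigned int *restrict b, int *restrict pred, int n)
         {
           for (int i = 0; i < n; ++i)
             r[i] = pred[i] ? (a[i] > b[i] ? a[i] - b[i] : b[i] - a[i]) : a[i];
         }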
      
      2019-08-15  Richard Sandiford  <richard.sandiford@arm.com>
      	    Kugan Vivekanandarajah  <kugan.vivekanandarajah@linaro.org>
      
      gcc/
      	* config/aarch64/aarch64-sve.md (*aarch64_cond_<su>abd<mode>_2)
      	(*aarch64_cond_<su>abd<mode>_any): New patterns.
      
      gcc/testsuite/
      	* gcc.target/aarch64/sve/cond_abd_1.c: New test.
      	* gcc.target/aarch64/sve/cond_abd_1_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_abd_2.c: Likewise.
      	* gcc.target/aarch64/sve/cond_abd_2_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_abd_3.c: Likewise.
      	* gcc.target/aarch64/sve/cond_abd_3_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_abd_4.c: Likewise.
      	* gcc.target/aarch64/sve/cond_abd_4_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_abd_5.c: Likewise.
      	* gcc.target/aarch64/sve/cond_abd_5_run.c: Likewise.
      
      Co-Authored-By: Kugan Vivekanandarajah <kuganv@linaro.org>
      
      From-SVN: r274506
      Richard Sandiford committed
    • Add support for conditional shifts · 20103c0e
      This patch adds support for IFN_COND shifts left and shifts right.
      This is mostly mechanical, but since we try to handle conditional
      operations in the same way as unconditional operations in match.pd,
      we need to support IFN_COND shifts by scalars as well as vectors.
      E.g.:
      
         IFN_COND_SHL (cond, a, { 1, 1, ... }, fallback)
      
      and:
      
         IFN_COND_SHL (cond, a, 1, fallback)
      
      are the same operation, with:
      
         (for shiftrotate (lrotate rrotate lshift rshift)
          ...
          /* Prefer vector1 << scalar to vector1 << vector2
             if vector2 is uniform.  */
          (for vec (VECTOR_CST CONSTRUCTOR)
           (simplify
            (shiftrotate @0 vec@1)
            (with { tree tem = uniform_vector_p (@1); }
             (if (tem)
      	(shiftrotate @0 { tem; }))))))
      
      preferring the latter.  The patch copes with this by extending
      create_convert_operand_from to handle scalar-to-vector conversions.
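
      A minimal sketch of the kind of loop this enables (names invented for
      illustration); the shift amount is a uniform scalar, which is exactly
      the case the match.pd rule above prefers:

         void
         cond_shl (int *restrict r, int *restrict a, int *restrict pred, int n)
         {
           for (int i = 0; i < n; ++i)
             r[i] = pred[i] ? a[i] << 3 : a[i];
         }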
      
      2019-08-15  Richard Sandiford  <richard.sandiford@arm.com>
      	    Prathamesh Kulkarni  <prathamesh.kulkarni@linaro.org>
      
      gcc/
      	* internal-fn.def (IFN_COND_SHL, IFN_COND_SHR): New internal functions.
      	* internal-fn.c (FOR_EACH_CODE_MAPPING): Handle shifts.
      	* match.pd (UNCOND_BINARY, COND_BINARY): Likewise.
      	* optabs.def (cond_ashl_optab, cond_ashr_optab, cond_lshr_optab): New
      	optabs.
      	* optabs.h (create_convert_operand_from): Expand comment.
      	* optabs.c (maybe_legitimize_operand): Allow implicit broadcasts
      	when mapping scalar rtxes to vector operands.
      	* config/aarch64/iterators.md (SVE_INT_BINARY): Add ashift,
      	ashiftrt and lshiftrt.
      	(sve_int_op, sve_int_op_rev, sve_pred_int_rhs2_operand): Handle them.
      	* config/aarch64/aarch64-sve.md (*cond_<optab><mode>_2_const)
      	(*cond_<optab><mode>_any_const): New patterns.
      
      gcc/testsuite/
      	* gcc.target/aarch64/sve/cond_shift_1.c: New test.
      	* gcc.target/aarch64/sve/cond_shift_1_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_shift_2.c: Likewise.
      	* gcc.target/aarch64/sve/cond_shift_2_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_shift_3.c: Likewise.
      	* gcc.target/aarch64/sve/cond_shift_3_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_shift_4.c: Likewise.
      	* gcc.target/aarch64/sve/cond_shift_4_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_shift_5.c: Likewise.
      	* gcc.target/aarch64/sve/cond_shift_5_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_shift_6.c: Likewise.
      	* gcc.target/aarch64/sve/cond_shift_6_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_shift_7.c: Likewise.
      	* gcc.target/aarch64/sve/cond_shift_7_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_shift_8.c: Likewise.
      	* gcc.target/aarch64/sve/cond_shift_8_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_shift_9.c: Likewise.
      	* gcc.target/aarch64/sve/cond_shift_9_run.c: Likewise.
      
      Co-Authored-By: Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org>
      
      From-SVN: r274505
      Richard Sandiford committed
  7. 14 Aug, 2019 10 commits
    • [AArch64] Use SVE BIC for conditional arithmetic · 1b187f36
      This patch uses BIC to pattern-match conditional AND with an inverted
      third input.  It also adds extra tests for AND, ORR and EOR.
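
      As a rough sketch (the function and array names are invented, not taken
      from the new tests), this is the kind of loop that can now use a
      predicated BIC:

         void
         cond_bic (int *restrict r, int *restrict a, int *restrict b,
                   int *restrict pred, int n)
         {
           for (int i = 0; i < n; ++i)
             r[i] = pred[i] ? a[i] & ~b[i] : a[i];
         }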
      
      2019-08-14  Richard Sandiford  <richard.sandiford@arm.com>
      	    Kugan Vivekanandarajah  <kugan.vivekanandarajah@linaro.org>
      
      gcc/
      	* config/aarch64/aarch64-sve.md (*cond_bic<mode>_2)
      	(*cond_bic<mode>_any): New patterns.
      
      gcc/testsuite/
      	* gcc.target/aarch64/sve/cond_logical_1.c: New test.
      	* gcc.target/aarch64/sve/cond_logical_1_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_logical_2.c: Likewise.
      	* gcc.target/aarch64/sve/cond_logical_2_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_logical_3.c: Likewise.
      	* gcc.target/aarch64/sve/cond_logical_3_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_logical_4.c: Likewise.
      	* gcc.target/aarch64/sve/cond_logical_4_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_logical_5.c: Likewise.
      	* gcc.target/aarch64/sve/cond_logical_5_run.c: Likewise.
      
      Co-Authored-By: Kugan Vivekanandarajah <kuganv@linaro.org>
      
      From-SVN: r274480
      Richard Sandiford committed
    • [AArch64] Use SVE UXT[BHW] as a form of predicated AND · d113ece6
      UXTB, UXTH and UXTW are equivalent to predicated ANDs with the constants
      0xff, 0xffff and 0xffffffff respectively.  This patch uses them in the
      patterns for IFN_COND_AND.
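
      For example, a hypothetical loop like this maps the conditional AND
      with 0xff onto UXTB (illustrative names only):

         void
         cond_uxtb (unsigned int *restrict r, unsigned int *restrict a,
                    int *restrict pred, int n)
         {
           for (int i = 0; i < n; ++i)
             r[i] = pred[i] ? a[i] & 0xff : a[i];
         }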
      
      2019-08-14  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* config/aarch64/aarch64.c (aarch64_print_operand): Allow %e to
      	take the equivalent mask, as well as a bit count.
      	* config/aarch64/predicates.md (aarch64_sve_uxtb_immediate)
      	(aarch64_sve_uxth_immediate, aarch64_sve_uxt_immediate)
      	(aarch64_sve_pred_and_operand): New predicates.
      	* config/aarch64/iterators.md (sve_pred_int_rhs2_operand): New
      	code attribute.
      	* config/aarch64/aarch64-sve.md
      	(cond_<SVE_INT_BINARY:optab><SVE_I:mode>): Use it.
      	(*cond_uxt<mode>_2, *cond_uxt<mode>_any): New patterns.
      
      gcc/testsuite/
      	* gcc.target/aarch64/sve/cond_uxt_1.c: New test.
      	* gcc.target/aarch64/sve/cond_uxt_1_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_uxt_2.c: Likewise.
      	* gcc.target/aarch64/sve/cond_uxt_2_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_uxt_3.c: Likewise.
      	* gcc.target/aarch64/sve/cond_uxt_3_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_uxt_4.c: Likewise.
      	* gcc.target/aarch64/sve/cond_uxt_4_run.c: Likewise.
      
      From-SVN: r274479
      Richard Sandiford committed
    • [AArch64] Add SVE conditional conversion patterns · c5e16983
      This patch adds patterns to match conditional conversions between
      integers and like-sized floats.  The patterns are actually more
      general than that, but the other combinations can only be tested
      via the ACLE.
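
      A sketch of the kind of conditional conversion involved, here from
      32-bit integers to 32-bit floats (the names are illustrative, not from
      the new tests):

         void
         cond_scvtf (float *restrict r, int *restrict a, float *restrict b,
                     int n)
         {
           for (int i = 0; i < n; ++i)
             r[i] = a[i] > 0 ? (float) a[i] : b[i];
         }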
      
      2019-08-14  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* config/aarch64/aarch64-sve.md
      	(*cond_<SVE_COND_FCVTI:optab>_nontrunc<SVE_F:mode><SVE_HSDI:mode>)
      	(*cond_<SVE_COND_ICVTF:optab>_nonextend<SVE_HSDI:mode><SVE_F:mode>):
      	New patterns.
      
      gcc/testsuite/
      	* gcc.target/aarch64/sve/cond_convert_1.c: New test.
      	* gcc.target/aarch64/sve/cond_convert_1_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_convert_2.c: Likewise.
      	* gcc.target/aarch64/sve/cond_convert_2_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_convert_3.c: Likewise.
      	* gcc.target/aarch64/sve/cond_convert_3_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_convert_4.c: Likewise.
      	* gcc.target/aarch64/sve/cond_convert_4_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_convert_5.c: Likewise.
      	* gcc.target/aarch64/sve/cond_convert_5_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_convert_6.c: Likewise.
      	* gcc.target/aarch64/sve/cond_convert_6_run.c: Likewise.
      
      From-SVN: r274478
      Richard Sandiford committed
    • [AArch64] Add SVE conditional floating-point unary patterns · b21f7d53
      This patch adds patterns to match conditional unary operations
      on floating-point modes.  At the moment we rely on combine to merge
      separate arithmetic and vcond_mask operations, and since the latter
      doesn't accept zero operands, we miss out on the opportunity to use
      the movprfx /z alternative.  (This alternative is tested by the ACLE
      patches though.)
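
      For instance, a hypothetical conditional fabs loop of the kind these
      patterns match (names invented here):

         void
         cond_fabs (float *restrict r, float *restrict a,
                    int *restrict pred, int n)
         {
           for (int i = 0; i < n; ++i)
             r[i] = pred[i] ? __builtin_fabsf (a[i]) : a[i];
         }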
      
      2019-08-14  Richard Sandiford  <richard.sandiford@arm.com>
      	    Kugan Vivekanandarajah  <kugan.vivekanandarajah@linaro.org>
      
      gcc/
      	* config/aarch64/aarch64-sve.md
      	(*cond_<SVE_COND_FP_UNARY:optab><SVE_F:mode>_2): New pattern.
      	(*cond_<SVE_COND_FP_UNARY:optab><SVE_F:mode>_any): Likewise.
      
      gcc/testsuite/
      	* gcc.target/aarch64/sve/cond_unary_1.c: Add tests for
      	floating-point types.
      	* gcc.target/aarch64/sve/cond_unary_2.c: Likewise.
      	* gcc.target/aarch64/sve/cond_unary_3.c: Likewise.
      	* gcc.target/aarch64/sve/cond_unary_4.c: Likewise.
      
      Co-Authored-By: Kugan Vivekanandarajah <kuganv@linaro.org>
      
      From-SVN: r274477
      Richard Sandiford committed
    • [AArch64] Add SVE conditional integer unary patterns · 3c9f4963
      This patch adds patterns to match conditional unary operations
      on integers.  At the moment we rely on combine to merge separate
      arithmetic and vcond_mask operations, and since the latter doesn't
      accept zero operands, we miss out on the opportunity to use the
      movprfx /z alternative.  (This alternative is tested by the ACLE
      patches though.)
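
      A corresponding integer sketch (illustrative only), here using negation:

         void
         cond_neg (int *restrict r, int *restrict a, int *restrict pred, int n)
         {
           for (int i = 0; i < n; ++i)
             r[i] = pred[i] ? -a[i] : a[i];
         }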
      
      2019-08-14  Richard Sandiford  <richard.sandiford@arm.com>
      	    Kugan Vivekanandarajah  <kugan.vivekanandarajah@linaro.org>
      
      gcc/
      	* config/aarch64/aarch64-sve.md
      	(*cond_<SVE_INT_UNARY:optab><SVE_I:mode>_2): New pattern.
      	(*cond_<SVE_INT_UNARY:optab><SVE_I:mode>_any): Likewise.
      
      gcc/testsuite/
      	* gcc.target/aarch64/sve/cond_unary_1.c: New test.
      	* gcc.target/aarch64/sve/cond_unary_1_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_unary_2.c: Likewise.
      	* gcc.target/aarch64/sve/cond_unary_2_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_unary_3.c: Likewise.
      	* gcc.target/aarch64/sve/cond_unary_3_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_unary_4.c: Likewise.
      	* gcc.target/aarch64/sve/cond_unary_4_run.c: Likewise.
      
      Co-Authored-By: Kugan Vivekanandarajah <kuganv@linaro.org>
      
      From-SVN: r274476
      Richard Sandiford committed
    • [AArch64] Add support for SVE absolute comparisons · 42b4e87d
      This patch adds support for floating-point absolute comparisons
      FACLT and FACLE (aliased as FACGT and FACGE with swapped operands).
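
      A hypothetical example of the kind of comparison this matches (names
      and constants invented here):

         void
         facgt_select (float *restrict r, float *restrict a, float *restrict b,
                       int n)
         {
           for (int i = 0; i < n; ++i)
             r[i] = __builtin_fabsf (a[i]) > __builtin_fabsf (b[i])
                    ? 1.0f : 2.0f;
         }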
      
      2019-08-14  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* config/aarch64/iterators.md (SVE_COND_FP_ABS_CMP): New iterator.
      	* config/aarch64/aarch64-sve.md (*aarch64_pred_fac<cmp_op><mode>):
      	New pattern.
      
      gcc/testsuite/
      	* gcc.target/aarch64/sve/vcond_21.c: New test.
      	* gcc.target/aarch64/sve/vcond_21_run.c: Likewise.
      
      From-SVN: r274443
      Richard Sandiford committed
    • [AArch64] Use SVE MOV /M of scalars · 88a37c4d
      This patch uses MOV /M to optimise selects between a duplicated
      scalar variable and a vector.
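
      A small sketch of the select being optimised (illustrative names;
      x is the duplicated scalar):

         void
         sel_dup (float *restrict r, float *restrict a, int *restrict pred,
                  float x, int n)
         {
           for (int i = 0; i < n; ++i)
             r[i] = pred[i] ? x : a[i];
         }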
      
      2019-08-14  Richard Sandiford  <richard.sandiford@arm.com>
      	    Kugan Vivekanandarajah  <kugan.vivekanandarajah@linaro.org>
      
      gcc/
      	* config/aarch64/aarch64-sve.md (*aarch64_sel_dup<mode>): New pattern.
      
      gcc/testsuite/
      	* g++.target/aarch64/sve/dup_sel_1.C: New test.
      	* g++.target/aarch64/sve/dup_sel_2.C: Likewise.
      	* g++.target/aarch64/sve/dup_sel_3.C: Likewise.
      	* g++.target/aarch64/sve/dup_sel_4.C: Likewise.
      	* g++.target/aarch64/sve/dup_sel_5.C: Likewise.
      	* g++.target/aarch64/sve/dup_sel_6.C: Likewise.
      
      Co-Authored-By: Kugan Vivekanandarajah <kuganv@linaro.org>
      
      From-SVN: r274442
      Richard Sandiford committed
    • [AArch64] Make more use of SVE conditional constant moves · d29f7dd5
      This patch extends the SVE UNSPEC_SEL patterns so that they can use:
      
      (1) MOV /M of a duplicated integer constant
      (2) MOV /M of a duplicated floating-point constant bitcast to an integer,
          accepting the same constants as (1)
      (3) FMOV /M of a duplicated floating-point constant
      (4) MOV /Z of a duplicated integer constant
      (5) MOV /Z of a duplicated floating-point constant bitcast to an integer,
          accepting the same constants as (4)
      (6) MOVPRFXed FMOV /M of a duplicated floating-point constant
      
      We already handled (4) with a special pattern; the rest are new.
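
      For example, a hypothetical select against a duplicated 1.0 constant,
      corresponding roughly to case (3) above (names invented here):

         void
         sel_fmov (float *restrict r, float *restrict a, int *restrict pred,
                   int n)
         {
           for (int i = 0; i < n; ++i)
             r[i] = pred[i] ? 1.0f : a[i];
         }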
      
      2019-08-14  Richard Sandiford  <richard.sandiford@arm.com>
      	    Kugan Vivekanandarajah  <kugan.vivekanandarajah@linaro.org>
      
      gcc/
      	* config/aarch64/aarch64.c (aarch64_bit_representation): New function.
      	(aarch64_print_vector_float_operand): Also handle 8-bit floats.
      	(aarch64_print_operand): Add support for %I.
      	(aarch64_sve_dup_immediate_p): Handle scalars as well as vectors.
      	Bitcast floating-point constants to the corresponding integer constant.
      	(aarch64_float_const_representable_p): Handle vectors as well
      	as scalars.
      	(aarch64_expand_sve_vcond): Make sure that the operands are valid
      	for the new vcond_mask_<mode><vpred> expander.
      	* config/aarch64/predicates.md (aarch64_sve_dup_immediate): Also
      	test aarch64_float_const_representable_p.
      	(aarch64_sve_reg_or_dup_imm): New predicate.
      	* config/aarch64/aarch64-sve.md (vec_extract<vpred><Vel>): Use
      	gen_vcond_mask_<mode><vpred> instead of
      	gen_aarch64_sve_dup<mode>_const.
      	(vcond_mask_<mode><vpred>): Turn into a define_expand that
      	accepts aarch64_sve_reg_or_dup_imm and aarch64_simd_reg_or_zero
      	for operands 1 and 2 respectively.  Force operand 2 into a
      	register if operand 1 is a register.  Fold old define_insn...
      	(aarch64_sve_dup<mode>_const): ...and this define_insn...
      	(*vcond_mask_<mode><vpred>): ...into this new pattern.  Handle
      	floating-point constants that can be moved as integers.  Add
      	alternatives for MOV /M and FMOV /M.
      	(vcond<mode><v_int_equiv>, vcondu<mode><v_int_equiv>)
      	(vcond<mode><v_fp_equiv>): Accept nonmemory_operand for operands
      	1 and 2 respectively.
      	* config/aarch64/constraints.md (Ufc): Handle vectors as well
      	as scalars.
      	(vss): New constraint.
      
      gcc/testsuite/
      	* gcc.target/aarch64/sve/vcond_18.c: New test.
      	* gcc.target/aarch64/sve/vcond_18_run.c: Likewise.
      	* gcc.target/aarch64/sve/vcond_19.c: Likewise.
      	* gcc.target/aarch64/sve/vcond_19_run.c: Likewise.
      	* gcc.target/aarch64/sve/vcond_20.c: Likewise.
      	* gcc.target/aarch64/sve/vcond_20_run.c: Likewise.
      
      Co-Authored-By: Kugan Vivekanandarajah <kuganv@linaro.org>
      
      From-SVN: r274441
      Richard Sandiford committed
    • [AArch64] Add support for SVE F{MAX,MIN}NM immediate · 75079ddf
      This patch uses the immediate forms of FMAXNM and FMINNM for
      unconditional arithmetic.
      
      The same rules apply to FMAX and FMIN, but we only generate those
      via the ACLE.
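
      For instance (a sketch with invented names), clamping at zero can now
      use FMAXNM's immediate form:

         void
         fmaxnm_imm (float *restrict r, float *restrict a, int n)
         {
           for (int i = 0; i < n; ++i)
             r[i] = __builtin_fmaxf (a[i], 0.0f);
         }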
      
      2019-08-14  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* config/aarch64/predicates.md (aarch64_sve_float_maxmin_immediate)
      	(aarch64_sve_float_maxmin_operand): New predicates.
      	* config/aarch64/constraints.md (vsB): New constraint.
      	(vsM): Fix typo.
      	* config/aarch64/iterators.md (sve_pred_fp_rhs2_operand): Use
      	aarch64_sve_float_maxmin_operand for UNSPEC_COND_FMAXNM and
      	UNSPEC_COND_FMINNM.
      	* config/aarch64/aarch64-sve.md (<maxmin_uns><SVE_F:mode>3):
      	Use aarch64_sve_float_maxmin_operand for operand 2.
      	(*<SVE_COND_FP_MAXMIN_PUBLIC:optab><SVE_F:mode>3): Likewise.
      	Add alternatives for the constant forms.
      
      gcc/testsuite/
      	* gcc.target/aarch64/sve/fmaxnm_1.c: New test.
      	* gcc.target/aarch64/sve/fminnm_1.c: Likewise.
      
      From-SVN: r274440
      Richard Sandiford committed
    • [AArch64] Add support for SVE [SU]{MAX,MIN} immediate · f8c22a8b
      This patch adds support for the immediate forms of SVE SMAX, SMIN, UMAX
      and UMIN.  SMAX and SMIN take the same range as MUL, so the patch
      basically just moves and generalises the existing MUL patterns.
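
      For example, a hypothetical max-with-constant loop that can now use the
      immediate form of SMAX (names and the constant invented here):

         void
         smax_imm (int *restrict r, int *restrict a, int n)
         {
           for (int i = 0; i < n; ++i)
             r[i] = a[i] > 50 ? a[i] : 50;
         }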
      
      2019-08-14  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* config/aarch64/constraints.md (vsb): New constraint.
      	(vsm): Generalize description.
      	* config/aarch64/iterators.md (SVE_INT_BINARY_IMM): New code
      	iterator.
      	(sve_imm_con): Handle smax, smin, umax and umin.
      	(sve_imm_prefix): New code attribute.
      	* config/aarch64/predicates.md (aarch64_sve_vsb_immediate)
      	(aarch64_sve_vsb_operand): New predicates.
      	(aarch64_sve_mul_immediate): Rename to...
      	(aarch64_sve_vsm_immediate): ...this.
      	(aarch64_sve_mul_operand): Rename to...
      	(aarch64_sve_vsm_operand): ...this.
      	* config/aarch64/aarch64-sve.md (mul<mode>3): Generalize to...
      	(<SVE_INT_BINARY_IMM:optab><SVE_I:mode>3): ...this.
      	(*mul<mode>3, *post_ra_mul<mode>3): Generalize to...
      	(*<SVE_INT_BINARY_IMM:optab><SVE_I:mode>3)
      	(*post_ra_<SVE_INT_BINARY_IMM:optab><SVE_I:mode>3): ...these and
      	add movprfx support for the immediate alternatives.
      	(<su><maxmin><mode>3, *<su><maxmin><mode>3): Delete in favor
      	of the above.
      	(*<SVE_INT_BINARY_SD:optab><SVE_SDI:mode>3): Fix incorrect predicate
      	for operand 3.
      
      gcc/testsuite/
      	* gcc.target/aarch64/sve/smax_1.c: New test.
      	* gcc.target/aarch64/sve/smin_1.c: Likewise.
      	* gcc.target/aarch64/sve/umax_1.c: Likewise.
      	* gcc.target/aarch64/sve/umin_1.c: Likewise.
      
      From-SVN: r274439
      Richard Sandiford committed