1. 16 Nov, 2019 10 commits
    • [AArch64] Robustify aarch64_wrffr · 4ec943d6
      This patch uses distinct values for the FFR and FFRT outputs of
      aarch64_wrffr, so that a following aarch64_copy_ffr_to_ffrt has
      an effect.  This is needed to avoid regressions with later patches.
      
      The block comment at the head of the file already described
      the pattern this way, and there was already an unspec for it.
      Not sure what made me change it...
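
      For context, a rough ACLE-level sketch (not part of this patch; the
      function name is chosen here for illustration) of code whose
      first-faulting load reads and writes the FFR that aarch64_wrffr and
      aarch64_copy_ffr_to_ffrt model:

        #include <arm_sve.h>

        /* Sum whichever leading elements of src can be read without faulting.
           svsetffr, svldff1 and svrdffr all read or write the FFR.  */
        int64_t
        sum_leading (const int32_t *src)
        {
          svsetffr ();                           /* FFR <- all-true           */
          svbool_t pg = svptrue_b32 ();
          svint32_t v = svldff1_s32 (pg, src);   /* may clear trailing bits   */
          svbool_t ok = svrdffr_z (pg);          /* FFR & pg                  */
          return svaddv_s32 (ok, v);
        }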
      
      2019-11-16  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* config/aarch64/aarch64-sve.md (aarch64_wrffr): Wrap the FFRT
      	output in UNSPEC_WRFFR.
      
      From-SVN: r278356
      Richard Sandiford committed
    • [AArch64] Add scatter stores for partial SVE modes · 37a3662f
      This patch adds support for scatter stores of partial vectors,
      where the vector base or offset elements can be wider than the
      elements being stored.
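
      For example (a sketch in the spirit of the new tests, not copied from
      them), a scatter store of 8-bit data indexed by 32-bit offsets keeps the
      stored data in a partial (unpacked) vector:

        #include <stdint.h>

        void
        scatter8 (int8_t *dst, const int8_t *src, const int32_t *index, int n)
        {
          for (int i = 0; i < n; ++i)
            dst[index[i]] = src[i];   /* candidate for an ST1B scatter with
                                         32-bit offsets; the byte data lives
                                         in 32-bit containers                */
        }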
      
      2019-11-16  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* config/aarch64/aarch64-sve.md
      	(scatter_store<SVE_FULL_SD:mode><v_int_equiv>): Extend to...
      	(scatter_store<SVE_24:mode><v_int_container>): ...this.
      	(mask_scatter_store<SVE_FULL_S:mode><v_int_equiv>): Extend to...
      	(mask_scatter_store<SVE_4:mode><v_int_equiv>): ...this.
      	(mask_scatter_store<SVE_FULL_D:mode><v_int_equiv>): Extend to...
      	(mask_scatter_store<SVE_2:mode><v_int_equiv>): ...this.
      	(*mask_scatter_store<mode><v_int_container>_<su>xtw_unpacked): New
      	pattern.
      	(*mask_scatter_store<SVE_FULL_D:mode><v_int_equiv>_sxtw): Extend to...
      	(*mask_scatter_store<SVE_2:mode><v_int_equiv>_sxtw): ...this.
      	(*mask_scatter_store<SVE_FULL_D:mode><v_int_equiv>_uxtw): Extend to...
      	(*mask_scatter_store<SVE_2:mode><v_int_equiv>_uxtw): ...this.
      
      gcc/testsuite/
      	* gcc.target/aarch64/sve/scatter_store_1.c (TEST_LOOP): Start at 0.
      	(TEST_ALL): Add tests for 8-bit and 16-bit elements.
      	* gcc.target/aarch64/sve/scatter_store_2.c: Update accordingly.
      	* gcc.target/aarch64/sve/scatter_store_3.c (TEST_LOOP): Start at 0.
      	(TEST_ALL): Add tests for 8-bit and 16-bit elements.
      	* gcc.target/aarch64/sve/scatter_store_4.c: Update accordingly.
      	* gcc.target/aarch64/sve/scatter_store_5.c (TEST_LOOP): Start at 0.
      	(TEST_ALL): Add tests for 8-bit, 16-bit and 32-bit elements.
      	* gcc.target/aarch64/sve/scatter_store_8.c: New test.
      	* gcc.target/aarch64/sve/scatter_store_9.c: Likewise.
      
      From-SVN: r278347
      Richard Sandiford committed
    • [AArch64] Pattern-match SVE extending gather loads · 87a80d27
      This patch pattern-matches a partial gather load followed by a sign or
      zero extension into an extending gather load.  (The partial gather load
      is already an extending load; we just don't rely on the upper bits of
      the elements.)
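
      For example (an illustrative sketch rather than one of the new tests),
      loading bytes through 32-bit indices and widening the result can now be
      matched as a single extending gather load:

        #include <stdint.h>

        void
        gather_extend (uint32_t *dst, const uint8_t *src,
                       const int32_t *index, int n)
        {
          for (int i = 0; i < n; ++i)
            dst[i] = src[index[i]];   /* gather of bytes zero-extended to
                                         32 bits: candidate for an LD1B
                                         gather into .s elements            */
        }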
      
      2019-11-16  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* config/aarch64/iterators.md (SVE_2BHSI, SVE_2HSDI, SVE_4BHI)
      	(SVE_4HSI): New mode iterators.
      	(ANY_EXTEND2): New code iterator.
      	* config/aarch64/aarch64-sve.md
      	(@aarch64_gather_load_<ANY_EXTEND:optab><VNx4_WIDE:mode><VNx4_NARROW:mode>):
      	Extend to...
      	(@aarch64_gather_load_<ANY_EXTEND:optab><SVE_4HSI:mode><SVE_4BHI:mode>):
      	...this, handling extension to partial modes as well as full modes.
      	Describe the extension as a predicated rather than unpredicated
      	extension.
      	(@aarch64_gather_load_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>):
      	Likewise extend to...
      	(@aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode><SVE_2BHSI:mode>):
      	...this, making the same adjustments.
      	(*aarch64_gather_load_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>_sxtw):
      	Likewise extend to...
      	(*aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode><SVE_2BHSI:mode>_sxtw):
      	...this, making the same adjustments.
      	(*aarch64_gather_load_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>_uxtw):
      	Likewise extend to...
      	(*aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode><SVE_2BHSI:mode>_uxtw):
      	...this, making the same adjustments.
      	(*aarch64_gather_load_<ANY_EXTEND:optab><SVE_2HSDI:mode><SVE_2BHSI:mode>_<ANY_EXTEND2:su>xtw_unpacked):
      	New pattern.
      	(*aarch64_ldff1_gather<mode>_sxtw): Canonicalize to a constant
      	extension predicate.
      	(@aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx4_WIDE:mode><VNx4_NARROW:mode>)
      	(@aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>)
      	(*aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>_uxtw):
      	Describe the extension as a predicated rather than unpredicated
      	extension.
      	(*aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>_sxtw):
      	Likewise.  Canonicalize to a constant extension predicate.
      	* config/aarch64/aarch64-sve-builtins-base.cc
      	(svld1_gather_extend_impl::expand): Add an extra predicate for
      	the extension.
      	(svldff1_gather_extend_impl::expand): Likewise.
      
      gcc/testsuite/
      	* gcc.target/aarch64/sve/gather_load_extend_1.c: New test.
      	* gcc.target/aarch64/sve/gather_load_extend_2.c: Likewise.
      	* gcc.target/aarch64/sve/gather_load_extend_3.c: Likewise.
      	* gcc.target/aarch64/sve/gather_load_extend_4.c: Likewise.
      	* gcc.target/aarch64/sve/gather_load_extend_5.c: Likewise.
      	* gcc.target/aarch64/sve/gather_load_extend_6.c: Likewise.
      	* gcc.target/aarch64/sve/gather_load_extend_7.c: Likewise.
      	* gcc.target/aarch64/sve/gather_load_extend_8.c: Likewise.
      	* gcc.target/aarch64/sve/gather_load_extend_9.c: Likewise.
      	* gcc.target/aarch64/sve/gather_load_extend_10.c: Likewise.
      	* gcc.target/aarch64/sve/gather_load_extend_11.c: Likewise.
      	* gcc.target/aarch64/sve/gather_load_extend_12.c: Likewise.
      
      From-SVN: r278346
      Richard Sandiford committed
    • [AArch64] Add gather loads for partial SVE modes · f8186eea
      This patch adds support for gather loads of partial vectors,
      where the vector base or offset elements can be wider than the
      elements being loaded.
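
      For example (a sketch in the spirit of the new tests), gathering 16-bit
      elements through 64-bit offsets keeps the loaded data in a partial
      vector:

        #include <stdint.h>

        void
        gather16 (int16_t *dst, const int16_t *src,
                  const int64_t *index, int n)
        {
          for (int i = 0; i < n; ++i)
            dst[i] = src[index[i]];   /* 16-bit data gathered with 64-bit
                                         offsets: the data lives in an
                                         unpacked (VNx2HI-style) vector      */
        }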
      
      2019-11-16  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* config/aarch64/iterators.md (SVE_24, SVE_2, SVE_4): New mode
      	iterators.
      	* config/aarch64/aarch64-sve.md
      	(gather_load<SVE_FULL_SD:mode><v_int_equiv>): Extend to...
      	(gather_load<SVE_24:mode><v_int_container>): ...this.
      	(mask_gather_load<SVE_FULL_S:mode><v_int_equiv>): Extend to...
      	(mask_gather_load<SVE_4:mode><v_int_container>): ...this.
      	(mask_gather_load<SVE_FULL_D:mode><v_int_equiv>): Extend to...
      	(mask_gather_load<SVE_2:mode><v_int_container>): ...this.
      	(*mask_gather_load<SVE_2:mode><v_int_container>_<su>xtw_unpacked):
      	New pattern.
      	(*mask_gather_load<SVE_FULL_D:mode><v_int_equiv>_sxtw): Extend to...
      	(*mask_gather_load<SVE_2:mode><v_int_equiv>_sxtw): ...this.
      	Allow the nominal extension predicate to be different from the
      	load predicate.
      	(*mask_gather_load<SVE_FULL_D:mode><v_int_equiv>_uxtw): Extend to...
      	(*mask_gather_load<SVE_2:mode><v_int_equiv>_uxtw): ...this.
      
      gcc/testsuite/
      	* gcc.target/aarch64/sve/gather_load_1.c (TEST_LOOP): Start at 0.
      	(TEST_ALL): Add tests for 8-bit and 16-bit elements.
      	* gcc.target/aarch64/sve/gather_load_2.c: Update accordingly.
      	* gcc.target/aarch64/sve/gather_load_3.c (TEST_LOOP): Start at 0.
      	(TEST_ALL): Add tests for 8-bit and 16-bit elements.
      	* gcc.target/aarch64/sve/gather_load_4.c: Update accordingly.
      	* gcc.target/aarch64/sve/gather_load_5.c (TEST_LOOP): Start at 0.
      	(TEST_ALL): Add tests for 8-bit, 16-bit and 32-bit elements.
      	* gcc.target/aarch64/sve/gather_load_6.c: Add
      	--param aarch64-sve-compare-costs=0.
      	(TEST_LOOP): Start at 0.
      	* gcc.target/aarch64/sve/gather_load_7.c: Add
      	--param aarch64-sve-compare-costs=0.
      	* gcc.target/aarch64/sve/gather_load_8.c: New test.
      	* gcc.target/aarch64/sve/gather_load_9.c: Likewise.
      	* gcc.target/aarch64/sve/mask_gather_load_6.c: Add
      	--param aarch64-sve-compare-costs=0.
      
      From-SVN: r278345
      Richard Sandiford committed
    • [AArch64] Add truncation for partial SVE modes · 2d56600c
      This patch adds support for "truncating" to a partial SVE vector from
      either a full SVE vector or a wider partial vector.  This truncation is
      actually a no-op and so should have zero cost in the vector cost model.
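
      For example (an illustrative sketch; which modes are chosen depends on
      the vectorizer's other decisions), narrowing 32-bit values to bytes held
      in 32-bit containers needs no instruction at all:

        #include <stdint.h>

        void
        narrow (uint8_t *dst, const uint32_t *src, int n)
        {
          for (int i = 0; i < n; ++i)
            dst[i] = (uint8_t) src[i];   /* e.g. VNx4SI -> VNx4QI: the bytes
                                            are already in the low part of
                                            each 32-bit container            */
        }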
      
      2019-11-16  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* config/aarch64/aarch64-sve.md
      	(trunc<SVE_HSDI:mode><SVE_PARTIAL_I:mode>2): New pattern.
      	* config/aarch64/aarch64.c (aarch64_integer_truncation_p): New
      	function.
      	(aarch64_sve_adjust_stmt_cost): Call it.
      
      gcc/testsuite/
      	* gcc.target/aarch64/sve/mask_struct_load_1.c: Add
      	--param aarch64-sve-compare-costs=0.
      	* gcc.target/aarch64/sve/mask_struct_load_2.c: Likewise.
      	* gcc.target/aarch64/sve/mask_struct_load_3.c: Likewise.
      	* gcc.target/aarch64/sve/mask_struct_load_4.c: Likewise.
      	* gcc.target/aarch64/sve/mask_struct_load_5.c: Likewise.
      	* gcc.target/aarch64/sve/pack_1.c: Likewise.
      	* gcc.target/aarch64/sve/truncate_1.c: New test.
      
      From-SVN: r278344
      Richard Sandiford committed
    • [AArch64] Pattern-match SVE extending loads · 217ccab8
      This patch pattern-matches a partial SVE load followed by a sign or zero
      extension into an extending load.  (The partial load is already an
      extending load; we just don't rely on the upper bits of the elements.)
      
      Nothing yet uses the extra LDFF1 and LDNF1 combinations, but it seemed
      more consistent to provide them, since I needed to update the pattern
      to use a predicated extension anyway.
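
      For example (a sketch rather than one of the new tests), a widening loop
      like the following can now load and extend in one instruction:

        #include <stdint.h>

        void
        widen_add (uint32_t *dst, const uint8_t *src, int n)
        {
          for (int i = 0; i < n; ++i)
            dst[i] += src[i];   /* the byte load plus zero extension should
                                   combine into a single extending load
                                   (LD1B into .s elements)                   */
        }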
      
      2019-11-16  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* config/aarch64/aarch64-sve.md
      	(@aarch64_load_<ANY_EXTEND:optab><VNx8_WIDE:mode><VNx8_NARROW:mode>)
      	(@aarch64_load_<ANY_EXTEND:optab><VNx4_WIDE:mode><VNx4_NARROW:mode>)
      	(@aarch64_load_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>):
      	Combine into...
      	(@aarch64_load_<ANY_EXTEND:optab><SVE_HSDI:mode><SVE_PARTIAL_I:mode>):
      	...this new pattern, handling extension to partial modes as well
      	as full modes.  Describe the extension as a predicated rather than
      	unpredicated extension.
      	(@aarch64_ld<fn>f1_<ANY_EXTEND:optab><VNx8_WIDE:mode><VNx8_NARROW:mode>)
      	(@aarch64_ld<fn>f1_<ANY_EXTEND:optab><VNx4_WIDE:mode><VNx4_NARROW:mode>)
      	(@aarch64_ld<fn>f1_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>):
      	Combine into...
      	(@aarch64_ld<fn>f1_<ANY_EXTEND:optab><SVE_HSDI:mode><SVE_PARTIAL_I:mode>):
      	...this new pattern, handling extension to partial modes as well
      	as full modes.  Describe the extension as a predicated rather than
      	unpredicated extension.
      	* config/aarch64/aarch64-sve-builtins.cc
      	(function_expander::use_contiguous_load_insn): Add an extra
      	predicate for extending loads.
      	* config/aarch64/aarch64.c (aarch64_extending_load_p): New function.
      	(aarch64_sve_adjust_stmt_cost): Likewise.
      	(aarch64_add_stmt_cost): Use aarch64_sve_adjust_stmt_cost to adjust
      	the cost of SVE vector stmts.
      
      gcc/testsuite/
      	* gcc.target/aarch64/sve/load_extend_1.c: New test.
      	* gcc.target/aarch64/sve/load_extend_2.c: Likewise.
      	* gcc.target/aarch64/sve/load_extend_3.c: Likewise.
      	* gcc.target/aarch64/sve/load_extend_4.c: Likewise.
      	* gcc.target/aarch64/sve/load_extend_5.c: Likewise.
      	* gcc.target/aarch64/sve/load_extend_6.c: Likewise.
      	* gcc.target/aarch64/sve/load_extend_7.c: Likewise.
      	* gcc.target/aarch64/sve/load_extend_8.c: Likewise.
      	* gcc.target/aarch64/sve/load_extend_9.c: Likewise.
      	* gcc.target/aarch64/sve/load_extend_10.c: Likewise.
      	* gcc.target/aarch64/sve/reduc_4.c: Add
      	--param aarch64-sve-compare-costs=0.
      
      From-SVN: r278343
      Richard Sandiford committed
    • [AArch64] Add sign and zero extension for partial SVE modes · e58703e2
      This patch adds support for extending from partial SVE modes
      to both full vector modes and wider partial modes.
      
      Some tests now need --param aarch64-sve-compare-costs=0 to force
      the original full-vector code.
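
      For example (an illustrative sketch; whether the extension appears in
      vector form depends on the vectorizer's other choices), a loop that
      computes a byte result and stores it to a wider type is one shape in
      which such an extension can arise:

        #include <stdint.h>

        void
        extend_copy (uint16_t *dst, const uint8_t *src1,
                     const uint8_t *src2, int n)
        {
          for (int i = 0; i < n; ++i)
            dst[i] = (uint8_t) (src1[i] & src2[i]);  /* byte result widened to
                                                        halfword containers   */
        }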
      
      2019-11-16  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* config/aarch64/iterators.md (SVE_HSDI): New mode iterator.
      	(narrower_mask): Handle VNx4HI, VNx2HI and VNx2SI.
      	* config/aarch64/aarch64-sve.md
      	(<ANY_EXTEND:optab><SVE_PARTIAL_I:mode><SVE_HSDI:mode>2): New pattern.
      	(*<ANY_EXTEND:optab><SVE_PARTIAL_I:mode><SVE_HSDI:mode>2): Likewise.
      	(@aarch64_pred_sxt<SVE_FULL_HSDI:mode><SVE_PARTIAL_I:mode>): Update
      	comment.  Avoid new narrower_mask ambiguity.
      	(@aarch64_cond_sxt<SVE_FULL_HSDI:mode><SVE_PARTIAL_I:mode>): Likewise.
      	(*cond_uxt<mode>_2): Update comment.
      	(*cond_uxt<mode>_any): Likewise.
      
      gcc/testsuite/
      	* gcc.target/aarch64/sve/cost_model_1.c: Expect the loop to be
      	vectorized with bytes stored in 32-bit containers.
      	* gcc.target/aarch64/sve/extend_1.c: New test.
      	* gcc.target/aarch64/sve/extend_2.c: New test.
      	* gcc.target/aarch64/sve/extend_3.c: New test.
      	* gcc.target/aarch64/sve/extend_4.c: New test.
      	* gcc.target/aarch64/sve/load_const_offset_3.c: Add
      	--param aarch64-sve-compare-costs=0.
      	* gcc.target/aarch64/sve/mask_struct_store_1.c: Likewise.
      	* gcc.target/aarch64/sve/mask_struct_store_1_run.c: Likewise.
      	* gcc.target/aarch64/sve/mask_struct_store_2.c: Likewise.
      	* gcc.target/aarch64/sve/mask_struct_store_2_run.c: Likewise.
      	* gcc.target/aarch64/sve/unpack_unsigned_1.c: Likewise.
      	* gcc.target/aarch64/sve/unpack_unsigned_1_run.c: Likewise.
      
      From-SVN: r278342
      Richard Sandiford committed
    • [AArch64] Add autovec support for partial SVE vectors · cc68f7c2
      This patch adds the bare minimum needed to support autovectorisation of
      partial SVE vectors, namely moves and integer addition.  Later patches
      add more interesting cases.
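
      For example (a sketch in the spirit of the new mixed_size tests), a loop
      that updates 32-bit and 16-bit data side by side can keep the halfwords
      in unpacked vectors so that both arrays advance by the same number of
      elements per iteration:

        #include <stdint.h>

        void
        mixed (int32_t *a, int16_t *b, int n)
        {
          for (int i = 0; i < n; ++i)
            {
              a[i] += 1;   /* full VNx4SI vectors                          */
              b[i] += 1;   /* halfwords held in 32-bit containers
                              (VNx4HI-style partial vectors)               */
            }
        }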
      
      2019-11-16  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* config/aarch64/aarch64-modes.def: Define partial SVE vector
      	float modes.
      	* config/aarch64/aarch64-protos.h (aarch64_sve_pred_mode): New
      	function.
      	* config/aarch64/aarch64.c (aarch64_classify_vector_mode): Handle the
      	new vector float modes.
      	(aarch64_sve_container_bits): New function.
      	(aarch64_sve_pred_mode): Likewise.
      	(aarch64_get_mask_mode): Use it.
      	(aarch64_sve_element_int_mode): Handle structure modes and partial
      	modes.
      	(aarch64_sve_container_int_mode): New function.
      	(aarch64_vectorize_related_mode): Return SVE modes when given
      	SVE modes.  Handle partial modes, taking the preferred number
      	of units from the size of the given mode.
      	(aarch64_hard_regno_mode_ok): Allow partial modes to be stored
      	in registers.
      	(aarch64_expand_sve_ld1rq): Use the mode form of aarch64_sve_pred_mode.
      	(aarch64_expand_sve_const_vector): Handle partial SVE vectors.
      	(aarch64_split_sve_subreg_move): Use the mode form of
      	aarch64_sve_pred_mode.
      	(aarch64_secondary_reload): Handle partial modes in the same way
      	as full big-endian vectors.
      	(aarch64_vector_mode_supported_p): Allow partial SVE vectors.
      	(aarch64_autovectorize_vector_modes): Try unpacked SVE vectors,
      	merging with the Advanced SIMD modes.  If two modes have the
      	same size, try the Advanced SIMD mode first.
      	(aarch64_simd_valid_immediate): Use the container rather than
      	the element mode for INDEX constants.
      	(aarch64_simd_vector_alignment): Make the alignment of partial
      	SVE vector modes the same as their minimum size.
      	(aarch64_evpc_sel): Use the mode form of aarch64_sve_pred_mode.
      	* config/aarch64/aarch64-sve.md (mov<SVE_FULL:mode>): Extend to...
      	(mov<SVE_ALL:mode>): ...this.
      	(movmisalign<SVE_FULL:mode>): Extend to...
      	(movmisalign<SVE_ALL:mode>): ...this.
      	(*aarch64_sve_mov<mode>_le): Rename to...
      	(*aarch64_sve_mov<mode>_ldr_str): ...this.
      	(*aarch64_sve_mov<SVE_FULL:mode>_be): Rename and extend to...
      	(*aarch64_sve_mov<SVE_ALL:mode>_no_ldr_str): ...this.  Handle
      	partial modes regardless of endianness.
      	(aarch64_sve_reload_be): Rename to...
      	(aarch64_sve_reload_mem): ...this and enable for little-endian.
      	Use aarch64_sve_pred_mode to get the appropriate predicate mode.
      	(@aarch64_pred_mov<SVE_FULL:mode>): Extend to...
      	(@aarch64_pred_mov<SVE_ALL:mode>): ...this.
      	(*aarch64_sve_mov<SVE_FULL:mode>_subreg_be): Extend to...
      	(*aarch64_sve_mov<SVE_ALL:mode>_subreg_be): ...this.
      	(@aarch64_sve_reinterpret<SVE_FULL:mode>): Extend to...
      	(@aarch64_sve_reinterpret<SVE_ALL:mode>): ...this.
      	(*aarch64_sve_reinterpret<SVE_FULL:mode>): Extend to...
      	(*aarch64_sve_reinterpret<SVE_ALL:mode>): ...this.
      	(maskload<SVE_FULL:mode><vpred>): Extend to...
      	(maskload<SVE_ALL:mode><vpred>): ...this.
      	(maskstore<SVE_FULL:mode><vpred>): Extend to...
      	(maskstore<SVE_ALL:mode><vpred>): ...this.
      	(vec_duplicate<SVE_FULL:mode>): Extend to...
      	(vec_duplicate<SVE_ALL:mode>): ...this.
      	(*vec_duplicate<SVE_FULL:mode>_reg): Extend to...
      	(*vec_duplicate<SVE_ALL:mode>_reg): ...this.
      	(sve_ld1r<SVE_FULL:mode>): Extend to...
      	(sve_ld1r<SVE_ALL:mode>): ...this.
      	(vec_series<SVE_FULL_I:mode>): Extend to...
      	(vec_series<SVE_I:mode>): ...this.
      	(*vec_series<SVE_FULL_I:mode>_plus): Extend to...
      	(*vec_series<SVE_I:mode>_plus): ...this.
      	(@aarch64_pred_sxt<SVE_FULL_HSDI:mode><SVE_PARTIAL_I:mode>): Avoid
      	new VPRED ambiguity.
      	(@aarch64_cond_sxt<SVE_FULL_HSDI:mode><SVE_PARTIAL_I:mode>): Likewise.
      	(add<SVE_FULL_I:mode>3): Extend to...
      	(add<SVE_I:mode>3): ...this.
      	* config/aarch64/iterators.md (SVE_ALL, SVE_I): New mode iterators.
      	(Vetype, Vesize, VEL, Vel, vwcore): Handle partial SVE vector modes.
      	(VPRED, vpred): Likewise.
      	(Vctype): New iterator.
      	(vw): Remove SVE modes.
      
      gcc/testsuite/
      	* gcc.target/aarch64/sve/mixed_size_1.c: New test.
      	* gcc.target/aarch64/sve/mixed_size_2.c: Likewise.
      	* gcc.target/aarch64/sve/mixed_size_3.c: Likewise.
      	* gcc.target/aarch64/sve/mixed_size_4.c: Likewise.
      	* gcc.target/aarch64/sve/mixed_size_5.c: Likewise.
      
      From-SVN: r278341
      Richard Sandiford committed
    • [AArch64] Replace SVE_PARTIAL with SVE_PARTIAL_I · 6544cb52
      Another renaming, this time to make way for partial/unpacked
      float modes.
      
      2019-11-16  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* config/aarch64/iterators.md (SVE_PARTIAL): Rename to...
      	(SVE_PARTIAL_I): ...this.
      	* config/aarch64/aarch64-sve.md: Apply the above renaming throughout.
      
      From-SVN: r278339
      Richard Sandiford committed
    • [AArch64] Add "FULL" to SVE mode iterator names · f75cdd2c
      An upcoming patch will make more use of partial/unpacked SVE vectors.
      We then need a distinction between mode iterators that include partial
      modes and those that only include "full" modes.  This patch prepares
      for that by adding "FULL" to the names of iterators that only select
      full modes.  There should be no change in behaviour.
      
      2019-11-16  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* config/aarch64/iterators.md (SVE_ALL): Rename to...
      	(SVE_FULL): ...this.
      	(SVE_I): Rename to...
      	(SVE_FULL_I): ...this.
      	(SVE_F): Rename to...
      	(SVE_FULL_F): ...this.
      	(SVE_BHSI): Rename to...
      	(SVE_FULL_BHSI): ...this.
      	(SVE_HSD): Rename to...
      	(SVE_FULL_HSD): ...this.
      	(SVE_HSDI): Rename to...
      	(SVE_FULL_HSDI): ...this.
      	(SVE_HSF): Rename to...
      	(SVE_FULL_HSF): ...this.
      	(SVE_SD): Rename to...
      	(SVE_FULL_SD): ...this.
      	(SVE_SDI): Rename to...
      	(SVE_FULL_SDI): ...this.
      	(SVE_SDF): Rename to...
      	(SVE_FULL_SDF): ...this.
      	(SVE_S): Rename to...
      	(SVE_FULL_S): ...this.
      	(SVE_D): Rename to...
      	(SVE_FULL_D): ...this.
      	* config/aarch64/aarch64-sve.md: Apply the above renaming throughout.
      	* config/aarch64/aarch64-sve2.md: Likewise.
      
      From-SVN: r278338
      Richard Sandiford committed
  2. 08 Nov, 2019 1 commit
    • Generalise gather and scatter optabs · 09eb042a
      The gather and scatter optabs required the vector offset to be
      the integer equivalent of the vector mode being loaded or stored.
      This patch generalises them so that the two vectors can have different
      element sizes, although they still need to have the same number of
      elements.
      
      One consequence of this is that it's possible (if unlikely)
      for two IFN_GATHER_LOADs to have the same arguments but different
      return types.  E.g. the same scalar base and vector of 32-bit offsets
      could be used to load 8-bit elements and to load 16-bit elements.
      From just looking at the arguments, we could wrongly deduce that
      they're equivalent.
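
      A rough source-level illustration of that situation (hypothetical, not
      one of the testcases): both statements below gather from the same base
      with the same 32-bit byte offsets, and only the loaded element size
      differs.

        #include <stdint.h>
        #include <string.h>

        void
        f (uint8_t *out8, uint16_t *out16, const uint8_t *base,
           const int32_t *offset, int n)
        {
          for (int i = 0; i < n; ++i)
            {
              uint16_t half;
              out8[i] = base[offset[i]];                     /* 8-bit gather  */
              memcpy (&half, base + offset[i], sizeof half); /* 16-bit gather,
                                                                same base and
                                                                offsets       */
              out16[i] = half;
            }
        }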
      
      I know we saw this happen at one point with IFN_WHILE_ULT,
      and we dealt with it there by passing a zero of the return type
      as an extra argument.  Doing the same here also makes the load
      and store functions have the same argument assignment.
      
      For now this patch should be a no-op, but later SVE patches take
      advantage of the new flexibility.
      
      2019-11-08  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* optabs.def (gather_load_optab, mask_gather_load_optab)
      	(scatter_store_optab, mask_scatter_store_optab): Turn into
      	conversion optabs, with the offset mode given explicitly.
      	* doc/md.texi: Update accordingly.
      	* config/aarch64/aarch64-sve-builtins-base.cc
      	(svld1_gather_impl::expand): Likewise.
      	(svst1_scatter_impl::expand): Likewise.
      	* internal-fn.c (gather_load_direct, scatter_store_direct): Likewise.
      	(expand_scatter_store_optab_fn): Likewise.
      	(direct_gather_load_optab_supported_p): Likewise.
      	(direct_scatter_store_optab_supported_p): Likewise.
      	(expand_gather_load_optab_fn): Likewise.  Expect the mask argument
      	to be argument 4.
      	(internal_fn_mask_index): Return 4 for IFN_MASK_GATHER_LOAD.
      	(internal_gather_scatter_fn_supported_p): Replace the offset sign
      	argument with the offset vector type.  Require the two vector
      	types to have the same number of elements but allow their element
      	sizes to be different.  Treat the optabs as conversion optabs.
      	* internal-fn.h (internal_gather_scatter_fn_supported_p): Update
      	prototype accordingly.
      	* optabs-query.c (supports_at_least_one_mode_p): Replace with...
      	(supports_vec_convert_optab_p): ...this new function.
      	(supports_vec_gather_load_p): Update accordingly.
      	(supports_vec_scatter_store_p): Likewise.
      	* tree-vectorizer.h (vect_gather_scatter_fn_p): Take a vec_info.
      	Replace the offset sign and bits parameters with a scalar type tree.
      	* tree-vect-data-refs.c (vect_gather_scatter_fn_p): Likewise.
      	Pass back the offset vector type instead of the scalar element type.
      	Allow the offset to be wider than the memory elements.  Search for
      	an offset type that the target supports, stopping once we've
      	reached the maximum of the element size and pointer size.
      	Update call to internal_gather_scatter_fn_supported_p.
      	(vect_check_gather_scatter): Update calls accordingly.
      	When testing a new scale before knowing the final offset type,
      	check whether the scale is supported for any signed or unsigned
      	offset type.  Check whether the target supports the source and
      	target types of a conversion before deciding whether to look
      	through the conversion.  Record the chosen offset_vectype.
      	* tree-vect-patterns.c (vect_get_gather_scatter_offset_type): Delete.
      	(vect_recog_gather_scatter_pattern): Get the scalar offset type
      	directly from the gs_info's offset_vectype instead.  Pass a zero
      	of the result type to IFN_GATHER_LOAD and IFN_MASK_GATHER_LOAD.
      	* tree-vect-stmts.c (check_load_store_masking): Update call to
      	internal_gather_scatter_fn_supported_p, passing the offset vector
      	type recorded in the gs_info.
      	(vect_truncate_gather_scatter_offset): Update call to
      	vect_check_gather_scatter, leaving it to search for a valid
      	offset vector type.
      	(vect_use_strided_gather_scatters_p): Convert the offset to the
      	element type of the gs_info's offset_vectype.
      	(vect_get_gather_scatter_ops): Get the offset vector type directly
      	from the gs_info.
      	(vect_get_strided_load_store_ops): Likewise.
      	(vectorizable_load): Pass a zero of the result type to IFN_GATHER_LOAD
      	and IFN_MASK_GATHER_LOAD.
      	* config/aarch64/aarch64-sve.md (gather_load<mode>): Rename to...
      	(gather_load<mode><v_int_equiv>): ...this.
      	(mask_gather_load<mode>): Rename to...
      	(mask_gather_load<mode><v_int_equiv>): ...this.
      	(scatter_store<mode>): Rename to...
      	(scatter_store<mode><v_int_equiv>): ...this.
      	(mask_scatter_store<mode>): Rename to...
      	(mask_scatter_store<mode><v_int_equiv>): ...this.
      
      From-SVN: r277949
      Richard Sandiford committed
  3. 29 Oct, 2019 3 commits
    • [AArch64] Add support for the SVE PCS · c600df9a
      The AAPCS64 specifies that if a function takes arguments in SVE
      registers or returns them in SVE registers, it must preserve all
      of Z8-Z23 and all of P4-P11.  (Normal functions only preserve the
      low 64 bits of Z8-Z15 and clobber all of the predicate registers.)
      
      This variation is known informally as the "SVE PCS" and functions
      that use it are known informally as "SVE functions".  The SVE PCS
      is mutually interoperable with functions that follow the standard
      AAPCS64 rules and those that use the aarch64_vector_pcs attribute.
      (Note that it's an error to use the attribute for SVE functions.)
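
      For example (an illustrative sketch, not taken from the new tests), a
      function like the following takes and returns SVE vectors and therefore
      uses the SVE PCS automatically; its callers may keep values live in
      Z8-Z23 and P4-P11 across the call:

        #include <arm_sve.h>

        svfloat64_t
        scale_and_add (svfloat64_t x, svfloat64_t y, svfloat64_t z)
        {
          svbool_t pg = svptrue_b64 ();
          return svmla_f64_z (pg, z, x, y);   /* z + x * y */
        }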
      
      One complication -- although it's not really that complicated --
      is that SVE registers need to be saved at a VL-dependent offset while
      other registers need to be saved at a constant offset.  The easiest way
      of handling this seemed to be to group the SVE registers together below
      the hard frame pointer.  In common cases, the frame pointer is then
      usually an easy-to-compute VL multiple above the stack pointer and a
      constant amount below the incoming stack pointer.
      
      A bigger complication is that, because the base AAPCS64 specifies that
      only the low 64 bits of V8-V15 are preserved by calls, the associated
      DWARF frame registers are also treated as 64 bits by the unwinder.
      The 64 bits must also have the same layout as they would for a base
      AAPCS64 function, otherwise unwinding won't work correctly.  (This is
      actually a problem for the existing aarch64_vector_pcs support too,
      but I'll fix that separately.)
      
      This falls out naturally for little-endian targets but not for
      big-endian targets.  The easiest way of meeting the requirement for them
      was to use ST1D and LD1D to save and restore Z8-Z15, which also has the
      nice property of storing the 64 bits at the start of the slot.  However,
      using ST1D and LD1D requires a spare predicate register, and since all
      of P0-P7 are either argument registers or call-preserved, we may need
      to spill P4 in order to save the vector registers, even if P4 wouldn't
      need to be saved otherwise.
      
      Since Z16-Z23 are fully clobbered by base AAPCS64 functions, we don't
      need to emit frame information for them at all.  This avoids having
      to decide whether the registers should be treated as having 64 bits
      (as for Z8-Z15), 128 bits (for Advanced SIMD) or the full SVE width.
      
      There are two ways of dealing with stack-clash protection when
      saving SVE registers:
      
      (1) If the area between the hard frame pointer and the incoming stack
          pointer is allocated via a store with writeback (callee_adjust != 0),
          the SVE save area is allocated separately and becomes the "initial"
          allocation as far as stack-clash protection goes.  In this case
          the store with writeback acts as a probe at the hard frame pointer
          position.
      
      (2) If the area between the hard frame pointer and the incoming stack
          pointer is allocated via aarch64_allocate_and_probe_stack_space,
          the SVE save area is added to this initial allocation, so that the
          SP ends up pointing at the SVE register saves.  It's then necessary
          to use a temporary base register to save the non-SVE registers.
          Setting up this temporary register requires a single instruction
          only and so should be more efficient than doing two allocations
          and probes.
      
      When SVE registers need to be saved, saving them below the frame pointer
      makes it harder to rely on the LR save as a stack probe, since the LR
      register's offset won't usually be a compile-time constant.  The patch
      copes with that by using the lowest SVE register save as a stack probe
      too, and thus prevents the save from being shrink-wrapped if stack clash
      protection is enabled.
      
      The changelog describes the low-level details.
      
      2019-10-29  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* calls.c (pass_by_reference): Leave the target to decide whether
      	POLY_INT_CST-sized arguments should be passed by value or reference,
      	rather than forcing them to be passed by reference.
      	(must_pass_in_stack_var_size): Likewise.
      	* config/aarch64/aarch64.md (LAST_SAVED_REGNUM): Redefine from
      	V31_REGNUM to P15_REGNUM.
      	* config/aarch64/aarch64-protos.h (aarch64_init_cumulative_args):
      	Take an extra "silent_p" parameter, defaulting to false.
      	(aarch64_sve::svbool_type_p): Declare.
      	(aarch64_sve::nvectors_if_data_type): Likewise.
      	* config/aarch64/aarch64.h (NUM_PR_ARG_REGS): New macro.
      	(aarch64_frame::reg_offset): Turn into poly_int64s.
      	(aarch64_frame::save_regs_size): Likewise.
      	(aarch64_frame::below_hard_fp_saved_regs_size): New field.
      	(aarch64_frame::sve_callee_adjust): Likewise.
      	(aarch64_frame::spare_pred_reg): Likewise.
      	(ARM_PCS_SVE): New arm_pcs value.
      	(CUMULATIVE_ARGS::aapcs_nprn): New field.
      	(CUMULATIVE_ARGS::aapcs_nextnprn): Likewise.
      	(CUMULATIVE_ARGS::silent_p): Likewise.
      	(BITS_PER_SVE_PRED): New macro.
      	* config/aarch64/aarch64.c (handle_aarch64_vector_pcs_attribute): New
      	function.  Reject aarch64_vector_pcs attributes on SVE functions.
      	(aarch64_attribute_table): Use the above handler.
      	(aarch64_sve_abi): New function.
      	(aarch64_sve_argument_p): Likewise.
      	(aarch64_returns_value_in_sve_regs_p): Likewise.
      	(aarch64_takes_arguments_in_sve_regs_p): Likewise.
      	(aarch64_fntype_abi): Check for SVE functions and return the SVE PCS
      	descriptor for them.
      	(aarch64_simd_decl_p): Delete.
      	(aarch64_emit_cfi_for_reg_p): New function.
      	(aarch64_reg_save_mode): Remove the fndecl argument and instead use
      	crtl->abi to choose the mode for FP registers.  Handle the SVE PCS.
      	(aarch64_hard_regno_call_part_clobbered): Do not treat FP registers
      	as partly clobbered for the SVE PCS.
      	(aarch64_function_ok_for_sibcall): Check whether the two functions
      	use the same ABI, rather than checking specifically for whether
      	they're aarch64_vector_pcs functions.
      	(aarch64_pass_by_reference): Raise an error for attempts to pass
      	SVE arguments when SVE is disabled.  Pass SVE arguments by reference
      	if there are not enough free registers left, or if the argument is
      	variadic.
      	(aarch64_function_value): Handle SVE predicates, vectors and tuples.
      	(aarch64_return_in_memory): Do not return SVE predicates, vectors and
      	tuples in memory.
      	(aarch64_layout_arg): Take a function_arg_info rather than
      	individual properties.  Handle SVE predicates, vectors and tuples.
      	Raise an error if they are passed to unprototyped functions.
      	(aarch64_function_arg): If the silent_p flag is set, suppress the
      	usual error about using float registers without TARGET_FLOAT.
      	(aarch64_init_cumulative_args): Take a silent_p parameter and store
      	it in the cumulative_args structure.  Initialize aapcs_nprn and
      	aapcs_nextnprn.  If the silent_p flag is set, suppress the usual
      	error about using float registers without TARGET_FLOAT.
      	If the silent_p flag is not set, also raise an error about
      	using SVE functions when SVE is disabled.
      	(aarch64_function_arg_advance): Update the call to aarch64_layout_arg,
      	and call it for SVE functions too.  Update aapcs_nprn similarly
      	to the other register counts.
      	(aarch64_layout_frame): If a big-endian function needs to save
      	and restore Z8-Z15, search for a spare predicate that it can use.
      	Store SVE predicates at the bottom of the register save area,
      	followed by SVE vectors, then followed by the normal slots.
      	Keep pointing the hard frame pointer at the base of the normal slots,
      	above the SVE vectors.  Update the various frame creation and
      	tear-down strategies for the new layout, initializing the new
      	sve_callee_adjust field.  Add an additional layout for frames
      	whose saved registers are all SVE registers.
      	(aarch64_register_saved_on_entry): Cope with poly_int64 reg_offsets.
      	(aarch64_return_address_signing_enabled): Likewise.
      	(aarch64_push_regs, aarch64_pop_regs): Update calls to
      	aarch64_reg_save_mode.
      	(aarch64_adjust_sve_callee_save_base): New function.
      	(aarch64_add_cfa_expression): Move earlier in file.  Take the
      	saved register as an rtx rather than a register number and use
      	its mode for the MEM slot.
      	(aarch64_save_callee_saves): Remove the mode argument and instead
      	use aarch64_reg_save_mode to get the mode of each save slot.
      	Add a hard_fp_valid_p parameter.  Cope with poly_int64 register
      	offsets.  Allow GP offsets to be saved at a VL-based offset from
      	the stack, handling this case using the frame pointer if available
      	or a temporary register otherwise.  Use ST1D to save Z8-Z15 for
      	big-endian SVE functions; use normal moves for other SVE saves.
      	Only mark the save as frame-related if aarch64_emit_cfi_for_reg_p
      	returns true.  Add explicit CFA notes when not storing via the
      	stack pointer.  Do not try to pair SVE saves.
      	(aarch64_restore_callee_saves): Cope with poly_int64 register
      	offsets.  Use LD1D to restore Z8-Z15 for big-endian SVE functions;
      	use normal moves for other SVE restores.  Only add CFA restore notes
      	if aarch64_emit_cfi_for_reg_p returns true.  Do not try to pair
      	SVE restores.
      	(aarch64_get_separate_components): Always keep the first SVE save
      	in the prologue if we need to use it as a stack probe.  Don't allow
      	Z8-Z15 saves and loads to be shrink-wrapped for big-endian targets.
      	Likewise the spare predicate register that they need.  Update the
      	offset calculation to account for the SVE save area.  Use the
      	appropriate range check for SVE LDR and STR instructions.
      	(aarch64_components_for_bb): Cope with poly_int64 reg_offsets.
      	(aarch64_process_components): Likewise.  Update the offset
      	calculation to account for the SVE save area.  Only mark the
      	save as frame-related if aarch64_emit_cfi_for_reg_p returns true.
      	Do not try to pair SVE saves.
      	(aarch64_allocate_and_probe_stack_space): Cope with poly_int64
      	reg_offsets.  When handling the final allocation, expect the
      	first SVE register save to be part of the initial allocation
      	and for it to act as a probe at SP.  Account for the SVE callee
      	save area in the dump information.
      	(aarch64_expand_prologue): Update the frame diagram.  Fold the
      	SVE callee allocation into the initial allocation if stack clash
      	protection is enabled.  Use new variables to track the offset
      	of the frame chain (and hard frame pointer) from the current
      	stack pointer, and likewise the offset of the bottom of the
      	register save area.  Update calls to aarch64_save_callee_saves
      	and aarch64_add_cfa_expression.  Apply sve_callee_adjust before
      	saving the FP&SIMD registers.  Save the predicate registers.
      	(aarch64_expand_epilogue): Take below_hard_fp_saved_regs_size
      	into account when setting the stack pointer from the frame pointer,
      	and when deciding whether we can inherit the initial adjustment
      	amount from the prologue.  Restore the predicate registers after
      	the vector registers, then apply sve_callee_adjust, then restore
      	the general registers.
      	(aarch64_secondary_reload): Don't use secondary SVE reloads
      	for VNx16BImode.
      	(aapcs_vfp_sub_candidate): Assert that the type is not an SVE type.
      	(aarch64_short_vector_p): Return false for SVE types.
      	(aarch64_vfp_is_call_or_return_candidate): Initialize *is_ha
      	at the start of the function.  Return false for SVE types.
      	(aarch64_asm_output_variant_pcs): Output .variant_pcs for SVE
      	functions too.
      	(TARGET_STRICT_ARGUMENT_NAMING): Redefine to request strict naming.
      	* config/aarch64/aarch64-sve.md (*aarch64_sve_mov<mode>_le): Extend
      	to big-endian targets for bytewise moves.
      	(*aarch64_sve_mov<mode>_be): Exclude the bytewise case.
      
      gcc/testsuite/
      	* gcc.target/aarch64/sve/pcs/aarch64-sve-pcs.exp: New file.
      	* gcc.target/aarch64/sve/pcs/annotate_1.c: New test.
      	* gcc.target/aarch64/sve/pcs/annotate_2.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/annotate_3.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/annotate_4.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/annotate_5.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/annotate_6.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/annotate_7.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_1.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_10.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_11_nosc.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_11_sc.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_2.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_3.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_4.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_5_be_f16.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_5_be_f32.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_5_be_f64.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_5_be_s16.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_5_be_s32.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_5_be_s64.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_5_be_s8.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_5_be_u16.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_5_be_u32.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_5_be_u64.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_5_be_u8.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_5_le_f16.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_5_le_f32.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_5_le_f64.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_5_le_s16.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_5_le_s32.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_5_le_s64.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_5_le_s8.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_5_le_u16.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_5_le_u32.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_5_le_u64.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_5_le_u8.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_6_be_f16.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_6_be_f32.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_6_be_f64.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_6_be_s16.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_6_be_s32.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_6_be_s64.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_6_be_s8.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_6_be_u16.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_6_be_u32.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_6_be_u64.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_6_be_u8.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_6_le_f16.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_6_le_f32.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_6_le_f64.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_6_le_s16.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_6_le_s32.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_6_le_s64.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_6_le_s8.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_6_le_u16.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_6_le_u32.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_6_le_u64.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_6_le_u8.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_7.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_8.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/args_9.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/nosve_1.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/nosve_2.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/nosve_3.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/nosve_4.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/nosve_5.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/nosve_6.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/nosve_7.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/nosve_8.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/return_1.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/return_1_1024.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/return_1_2048.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/return_1_256.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/return_1_512.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/return_2.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/return_3.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/return_4.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/return_4_1024.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/return_4_2048.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/return_4_256.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/return_4_512.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/return_5.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/return_5_1024.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/return_5_2048.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/return_5_256.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/return_5_512.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/return_6.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/return_6_1024.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/return_6_2048.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/return_6_256.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/return_6_512.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/return_7.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/return_8.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/return_9.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/saves_1_be_nowrap.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/saves_1_be_wrap.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/saves_1_le_nowrap.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/saves_1_le_wrap.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/saves_2_be_nowrap.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/saves_2_be_wrap.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/saves_2_le_nowrap.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/saves_2_le_wrap.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/saves_3.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/saves_4_be.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/saves_4_le.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/saves_5_be.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/saves_5_le.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/stack_clash_1.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/stack_clash_1_256.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/stack_clash_1_512.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/stack_clash_1_1024.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/stack_clash_1_2048.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/stack_clash_2.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/stack_clash_2_256.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/stack_clash_2_512.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/stack_clash_2_1024.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/stack_clash_2_2048.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/stack_clash_3.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/unprototyped_1.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/varargs_1.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/varargs_2_f16.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/varargs_2_f32.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/varargs_2_f64.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/varargs_2_s16.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/varargs_2_s32.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/varargs_2_s64.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/varargs_2_s8.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/varargs_2_u16.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/varargs_2_u32.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/varargs_2_u64.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/varargs_2_u8.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/varargs_3_nosc.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/varargs_3_sc.c: Likewise.
      	* gcc.target/aarch64/sve/pcs/vpcs_1.c: Likewise.
      	* g++.target/aarch64/sve/catch_7.C: Likewise.
      
      From-SVN: r277564
      Richard Sandiford committed
    • [AArch64] Add support for arm_sve.h · 624d0f07
      This patch adds support for arm_sve.h.  I've tried to split all the
      groundwork out into separate patches, so this is mostly adding new code
      rather than changing existing code.
      
      The C++ frontend seems to handle correct ACLE code without modification,
      even in length-agnostic mode.  The C frontend is close; the only correct
      construct I know it doesn't handle is initialisation.  E.g.:
      
        svbool_t pg = svptrue_b8 ();
      
      produces:
      
        variable-sized object may not be initialized
      
      although:
      
        svbool_t pg; pg = svptrue_b8 ();
      
      works fine.  This can be fixed by changing:
      
       	  {
       	    /* A complete type is ok if size is fixed.  */
      
      -	    if (TREE_CODE (TYPE_SIZE (TREE_TYPE (decl))) != INTEGER_CST
      +	    if (!poly_int_tree_p (TYPE_SIZE (TREE_TYPE (decl)))
       		|| C_DECL_VARIABLE_SIZE (decl))
       	      {
       		error ("variable-sized object may not be initialized");
      
      in c/c-decl.c:start_decl.
      
      Invalid code is likely to trigger ICEs, so this isn't ready for general
      use yet.  However, it seemed better to apply the patch now and deal with
      diagnosing invalid code as a follow-up.  For one thing, it means that
      we'll be able to provide testcases for middle-end changes related
      to SVE vectors, which has been a problem until now.  (I already have
      a series of such patches lined up.)
      
      The patch includes some tests, but the main ones need to wait until the
      PCS support has been applied.
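
      For reference, a small sketch (not from the patch's testsuite) of the
      kind of length-agnostic ACLE code that the header enables:

        #include <arm_sve.h>

        void
        add_one (float *dst, const float *src, int n)
        {
          for (int i = 0; i < n; i += svcntw ())
            {
              svbool_t pg = svwhilelt_b32_s32 (i, n);
              svfloat32_t v = svld1_f32 (pg, src + i);
              svst1_f32 (pg, dst + i,
                         svadd_f32_z (pg, v, svdup_n_f32 (1.0f)));
            }
        }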
      
      2019-10-29  Richard Sandiford  <richard.sandiford@arm.com>
      	    Kugan Vivekanandarajah  <kugan.vivekanandarajah@linaro.org>
      	    Prathamesh Kulkarni  <prathamesh.kulkarni@linaro.org>
      
      gcc/
      	* config.gcc (aarch64*-*-*): Add arm_sve.h to extra_headers.
      	Add aarch64-sve-builtins.o, aarch64-sve-builtins-shapes.o and
      	aarch64-sve-builtins-base.o to extra_objs.  Add
      	aarch64-sve-builtins.h and aarch64-sve-builtins.cc to target_gtfiles.
      	* config/aarch64/t-aarch64 (aarch64-sve-builtins.o): New rule.
      	(aarch64-sve-builtins-shapes.o): Likewise.
      	(aarch64-sve-builtins-base.o): New rules.
      	* config/aarch64/aarch64-c.c (aarch64_pragma_aarch64): New function.
      	(aarch64_resolve_overloaded_builtin): Likewise.
      	(aarch64_check_builtin_call): Likewise.
      	(aarch64_register_pragmas): Install aarch64_resolve_overloaded_builtin
      	and aarch64_check_builtin_call in targetm.  Register the GCC aarch64
      	pragma.
      	* config/aarch64/aarch64-protos.h (AARCH64_FOR_SVPRFOP): New macro.
      	(aarch64_svprfop): New enum.
      	(AARCH64_BUILTIN_SVE): New aarch64_builtin_class enum value.
      	(aarch64_sve_int_mode, aarch64_sve_data_mode): Declare.
      	(aarch64_fold_sve_cnt_pat, aarch64_output_sve_prefetch): Likewise.
      	(aarch64_output_sve_cnt_pat_immediate): Likewise.
      	(aarch64_output_sve_ptrues, aarch64_sve_ptrue_svpattern_p): Likewise.
      	(aarch64_sve_sqadd_sqsub_immediate_p, aarch64_sve_ldff1_operand_p)
      	(aarch64_sve_ldnf1_operand_p, aarch64_sve_prefetch_operand_p)
      	(aarch64_ptrue_all_mode, aarch64_convert_sve_data_to_pred): Likewise.
      	(aarch64_expand_sve_dupq, aarch64_replace_reg_mode): Likewise.
      	(aarch64_sve::init_builtins, aarch64_sve::handle_arm_sve_h): Likewise.
      	(aarch64_sve::builtin_decl, aarch64_sve::builtin_type_p): Likewise.
      	(aarch64_sve::mangle_builtin_type): Likewise.
      	(aarch64_sve::resolve_overloaded_builtin): Likewise.
      	(aarch64_sve::check_builtin_call, aarch64_sve::gimple_fold_builtin)
      	(aarch64_sve::expand_builtin): Likewise.
      	* config/aarch64/aarch64.c (aarch64_sve_data_mode): Make public.
      	(aarch64_sve_int_mode): Likewise.
      	(aarch64_ptrue_all_mode): New function.
      	(aarch64_convert_sve_data_to_pred): Make public.
      	(svprfop_token): New function.
      	(aarch64_output_sve_prefetch): Likewise.
      	(aarch64_fold_sve_cnt_pat): Likewise.
      	(aarch64_output_sve_cnt_pat_immediate): Likewise.
      	(aarch64_sve_move_pred_via_while): Use gen_while with UNSPEC_WHILE_LO
      	instead of gen_while_ult.
      	(aarch64_replace_reg_mode): Make public.
      	(aarch64_init_builtins): Call aarch64_sve::init_builtins.
      	(aarch64_fold_builtin): Handle AARCH64_BUILTIN_SVE.
      	(aarch64_gimple_fold_builtin, aarch64_expand_builtin): Likewise.
      	(aarch64_builtin_decl, aarch64_builtin_reciprocal): Likewise.
      	(aarch64_mangle_type): Call aarch64_sve::mangle_type.
      	(aarch64_sve_sqadd_sqsub_immediate_p): New function.
      	(aarch64_sve_ptrue_svpattern_p): Likewise.
      	(aarch64_sve_pred_valid_immediate): Check
      	aarch64_sve_ptrue_svpattern_p.
      	(aarch64_sve_ldff1_operand_p, aarch64_sve_ldnf1_operand_p)
      	(aarch64_sve_prefetch_operand_p, aarch64_output_sve_ptrues): New
      	functions.
      	* config/aarch64/aarch64.md (UNSPEC_LDNT1_SVE, UNSPEC_STNT1_SVE)
      	(UNSPEC_LDFF1_GATHER, UNSPEC_PTRUE, UNSPEC_WHILE_LE, UNSPEC_WHILE_LS)
      	(UNSPEC_WHILE_LT, UNSPEC_CLASTA, UNSPEC_UPDATE_FFR)
      	(UNSPEC_UPDATE_FFRT, UNSPEC_RDFFR, UNSPEC_WRFFR)
      	(UNSPEC_SVE_LANE_SELECT, UNSPEC_SVE_CNT_PAT, UNSPEC_SVE_PREFETCH)
      	(UNSPEC_SVE_PREFETCH_GATHER, UNSPEC_SVE_COMPACT, UNSPEC_SVE_SPLICE):
      	New unspecs.
      	* config/aarch64/iterators.md (SI_ONLY, DI_ONLY, VNx8HI_ONLY)
      	(VNx2DI_ONLY, SVE_PARTIAL, VNx8_NARROW, VNx8_WIDE, VNx4_NARROW)
      	(VNx4_WIDE, VNx2_NARROW, VNx2_WIDE, PRED_HSD): New mode iterators.
      	(UNSPEC_ADR, UNSPEC_BRKA, UNSPEC_BRKB, UNSPEC_BRKN, UNSPEC_BRKPA)
      	(UNSPEC_BRKPB, UNSPEC_PFIRST, UNSPEC_PNEXT, UNSPEC_CNTP, UNSPEC_SADDV)
      	(UNSPEC_UADDV, UNSPEC_FMLA, UNSPEC_FMLS, UNSPEC_FEXPA, UNSPEC_FTMAD)
      	(UNSPEC_FTSMUL, UNSPEC_FTSSEL, UNSPEC_COND_CMPEQ_WIDE): New unspecs.
      	(UNSPEC_COND_CMPGE_WIDE, UNSPEC_COND_CMPGT_WIDE): Likewise.
      	(UNSPEC_COND_CMPHI_WIDE, UNSPEC_COND_CMPHS_WIDE): Likewise.
      	(UNSPEC_COND_CMPLE_WIDE, UNSPEC_COND_CMPLO_WIDE): Likewise.
      	(UNSPEC_COND_CMPLS_WIDE, UNSPEC_COND_CMPLT_WIDE): Likewise.
      	(UNSPEC_COND_CMPNE_WIDE, UNSPEC_COND_FCADD90, UNSPEC_COND_FCADD270)
      	(UNSPEC_COND_FCMLA, UNSPEC_COND_FCMLA90, UNSPEC_COND_FCMLA180)
      	(UNSPEC_COND_FCMLA270, UNSPEC_COND_FMAX, UNSPEC_COND_FMIN): Likewise.
      	(UNSPEC_COND_FMULX, UNSPEC_COND_FRECPX, UNSPEC_COND_FSCALE): Likewise.
      	(UNSPEC_LASTA, UNSPEC_ASHIFT_WIDE, UNSPEC_ASHIFTRT_WIDE): Likewise.
      	(UNSPEC_LSHIFTRT_WIDE, UNSPEC_LDFF1, UNSPEC_LDNF1): Likewise.
      	(Vesize): Handle partial vector modes.
      	(self_mask, narrower_mask, sve_lane_con, sve_lane_pair_con): New
      	mode attributes.
      	(UBINQOPS, ANY_PLUS, SAT_PLUS, ANY_MINUS, SAT_MINUS): New code
      	iterators.
      	(s, paired_extend, inc_dec): New code attributes.
      	(SVE_INT_ADDV, CLAST, LAST): New int iterators.
      	(SVE_INT_UNARY): Add UNSPEC_RBIT.
      	(SVE_FP_UNARY, SVE_FP_UNARY_INT): New int iterators.
      	(SVE_FP_BINARY, SVE_FP_BINARY_INT): Likewise.
      	(SVE_COND_FP_UNARY): Add UNSPEC_COND_FRECPX.
      	(SVE_COND_FP_BINARY): Add UNSPEC_COND_FMAX, UNSPEC_COND_FMIN and
      	UNSPEC_COND_FMULX.
      	(SVE_COND_FP_BINARY_INT, SVE_COND_FP_ADD): New int iterators.
      	(SVE_COND_FP_SUB, SVE_COND_FP_MUL): Likewise.
      	(SVE_COND_FP_BINARY_I1): Add UNSPEC_COND_FMAX and UNSPEC_COND_FMIN.
      	(SVE_COND_FP_BINARY_REG): Add UNSPEC_COND_FMULX.
      	(SVE_COND_FCADD, SVE_COND_FP_MAXMIN, SVE_COND_FCMLA)
      	(SVE_COND_INT_CMP_WIDE, SVE_FP_TERNARY_LANE, SVE_CFP_TERNARY_LANE)
      	(SVE_WHILE, SVE_SHIFT_WIDE, SVE_LDFF1_LDNF1, SVE_BRK_UNARY)
      	(SVE_BRK_BINARY, SVE_PITER): New int iterators.
      	(optab): Handle UNSPEC_SADDV, UNSPEC_UADDV, UNSPEC_FRECPE,
      	UNSPEC_FRECPS, UNSPEC_RSQRTE, UNSPEC_RSQRTS, UNSPEC_RBIT,
      	UNSPEC_SMUL_HIGHPART, UNSPEC_UMUL_HIGHPART, UNSPEC_FMLA, UNSPEC_FMLS,
      	UNSPEC_FCMLA, UNSPEC_FCMLA90, UNSPEC_FCMLA180, UNSPEC_FCMLA270,
      	UNSPEC_FEXPA, UNSPEC_FTSMUL, UNSPEC_FTSSEL, UNSPEC_COND_FCADD90,
      	UNSPEC_COND_FCADD270, UNSPEC_COND_FCMLA, UNSPEC_COND_FCMLA90,
      	UNSPEC_COND_FCMLA180, UNSPEC_COND_FCMLA270, UNSPEC_COND_FMAX,
      	UNSPEC_COND_FMIN, UNSPEC_COND_FMULX, UNSPEC_COND_FRECPX and
      	UNSPEC_COND_FSCALE.
      	(maxmin_uns): Handle UNSPEC_COND_FMAX and UNSPEC_COND_FMIN.
      	(binqops_op, binqops_op_rev, last_op): New int attributes.
      	(su): Handle UNSPEC_SADDV and UNSPEC_UADDV.
      	(fn, ab): New int attributes.
      	(cmp_op): Handle UNSPEC_COND_CMP*_WIDE and UNSPEC_WHILE_*.
      	(while_optab_cmp, brk_op, sve_pred_op): New int attributes.
      	(sve_int_op): Handle UNSPEC_SMUL_HIGHPART, UNSPEC_UMUL_HIGHPART,
      	UNSPEC_ASHIFT_WIDE, UNSPEC_ASHIFTRT_WIDE, UNSPEC_LSHIFTRT_WIDE and
      	UNSPEC_RBIT.
      	(sve_fp_op): Handle UNSPEC_FRECPE, UNSPEC_FRECPS, UNSPEC_RSQRTE,
      	UNSPEC_RSQRTS, UNSPEC_FMLA, UNSPEC_FMLS, UNSPEC_FEXPA, UNSPEC_FTSMUL,
      	UNSPEC_FTSSEL, UNSPEC_COND_FMAX, UNSPEC_COND_FMIN, UNSPEC_COND_FMULX,
      	UNSPEC_COND_FRECPX and UNSPEC_COND_FSCALE.
      	(sve_fp_op_rev): Handle UNSPEC_COND_FMAX, UNSPEC_COND_FMIN and
      	UNSPEC_COND_FMULX.
      	(rot): Handle UNSPEC_COND_FCADD* and UNSPEC_COND_FCMLA*.
      	(brk_reg_con, brk_reg_opno): New int attributes.
      	(sve_pred_fp_rhs1_operand, sve_pred_fp_rhs2_operand): Handle
      	UNSPEC_COND_FMAX, UNSPEC_COND_FMIN and UNSPEC_COND_FMULX.
      	(sve_pred_fp_rhs2_immediate): Handle UNSPEC_COND_FMAX and
      	UNSPEC_COND_FMIN.
      	(max_elem_bits): New int attribute.
      	(min_elem_bits): Handle UNSPEC_RBIT.
      	* config/aarch64/predicates.md (subreg_lowpart_operator): Handle
      	TRUNCATE as well as SUBREG.
      	(ascending_int_parallel, aarch64_simd_reg_or_minus_one)
      	(aarch64_sve_ldff1_operand, aarch64_sve_ldnf1_operand)
      	(aarch64_sve_prefetch_operand, aarch64_sve_ptrue_svpattern_immediate)
      	(aarch64_sve_qadd_immediate, aarch64_sve_qsub_immediate)
      	(aarch64_sve_gather_immediate_b, aarch64_sve_gather_immediate_h)
      	(aarch64_sve_gather_immediate_w, aarch64_sve_gather_immediate_d)
      	(aarch64_sve_sqadd_operand, aarch64_sve_gather_offset_b)
      	(aarch64_sve_gather_offset_h, aarch64_sve_gather_offset_w)
      	(aarch64_sve_gather_offset_d, aarch64_gather_scale_operand_b)
      	(aarch64_gather_scale_operand_h): New predicates.
      	* config/aarch64/constraints.md (UPb, UPd, UPh, UPw, Utf, Utn, vgb)
      	(vgd, vgh, vgw, vsQ, vsS): New constraints.
      	* config/aarch64/aarch64-sve.md: Add a note on the FFR handling.
      	(*aarch64_sve_reinterpret<mode>): Allow any source register
      	instead of requiring an exact match.
      	(*aarch64_sve_ptruevnx16bi_cc, *aarch64_sve_ptrue<mode>_cc)
      	(*aarch64_sve_ptruevnx16bi_ptest, *aarch64_sve_ptrue<mode>_ptest)
      	(aarch64_wrffr, aarch64_update_ffr_for_load, aarch64_copy_ffr_to_ffrt)
      	(aarch64_rdffr, aarch64_rdffr_z, *aarch64_rdffr_z_ptest)
      	(*aarch64_rdffr_ptest, *aarch64_rdffr_z_cc, *aarch64_rdffr_cc)
      	(aarch64_update_ffrt): New patterns.
      	(@aarch64_load_<ANY_EXTEND:optab><VNx8_WIDE:mode><VNx8_NARROW:mode>)
      	(@aarch64_load_<ANY_EXTEND:optab><VNx4_WIDE:mode><VNx4_NARROW:mode>)
      	(@aarch64_load_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>)
      	(@aarch64_ld<fn>f1<mode>): New patterns.
      	(@aarch64_ld<fn>f1_<ANY_EXTEND:optab><VNx8_WIDE:mode><VNx8_NARROW:mode>)
      	(@aarch64_ld<fn>f1_<ANY_EXTEND:optab><VNx4_WIDE:mode><VNx4_NARROW:mode>)
      	(@aarch64_ld<fn>f1_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>)
      	(@aarch64_ldnt1<mode>): New patterns.
      	(gather_load<mode>): Use aarch64_sve_gather_offset_<Vesize> for
      	the scalar part of the address.
      	(mask_gather_load<SVE_S:mode>): Use aarch64_sve_gather_offset_w for the
      	scalar part of the address and add an alternative for handling
      	nonzero offsets.
      	(mask_gather_load<SVE_D:mode>): Likewise aarch64_sve_gather_offset_d.
      	(*mask_gather_load<mode>_sxtw, *mask_gather_load<mode>_uxtw)
      	(@aarch64_gather_load_<ANY_EXTEND:optab><VNx4_WIDE:mode><VNx4_NARROW:mode>)
      	(@aarch64_gather_load_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>)
      	(*aarch64_gather_load_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>_sxtw)
      	(*aarch64_gather_load_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>_uxtw)
      	(@aarch64_ldff1_gather<SVE_S:mode>, @aarch64_ldff1_gather<SVE_D:mode>)
      	(*aarch64_ldff1_gather<mode>_sxtw, *aarch64_ldff1_gather<mode>_uxtw)
      	(@aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx4_WIDE:mode><VNx4_NARROW:mode>)
      	(@aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>)
      	(*aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>_sxtw)
      	(*aarch64_ldff1_gather_<ANY_EXTEND:optab><VNx2_WIDE:mode><VNx2_NARROW:mode>_uxtw)
      	(@aarch64_sve_prefetch<mode>): New patterns.
      	(@aarch64_sve_gather_prefetch<SVE_I:mode><VNx4SI_ONLY:mode>)
      	(@aarch64_sve_gather_prefetch<SVE_I:mode><VNx2DI_ONLY:mode>)
      	(*aarch64_sve_gather_prefetch<SVE_I:mode><VNx2DI_ONLY:mode>_sxtw)
      	(*aarch64_sve_gather_prefetch<SVE_I:mode><VNx2DI_ONLY:mode>_uxtw)
      	(@aarch64_store_trunc<VNx8_NARROW:mode><VNx8_WIDE:mode>)
      	(@aarch64_store_trunc<VNx4_NARROW:mode><VNx4_WIDE:mode>)
      	(@aarch64_store_trunc<VNx2_NARROW:mode><VNx2_WIDE:mode>)
      	(@aarch64_stnt1<mode>): New patterns.
      	(scatter_store<mode>): Use aarch64_sve_gather_offset_<Vesize> for
      	the scalar part of the address.
      	(mask_scatter_store<SVE_S:mode>): Use aarch64_sve_gather_offset_w for
      	the scalar part of the address and add an alternative for handling
      	nonzero offsets.
      	(mask_scatter_store<SVE_D:mode>): Likewise aarch64_sve_gather_offset_d.
      	(*mask_scatter_store<mode>_sxtw, *mask_scatter_store<mode>_uxtw)
      	(@aarch64_scatter_store_trunc<VNx4_NARROW:mode><VNx4_WIDE:mode>)
      	(@aarch64_scatter_store_trunc<VNx2_NARROW:mode><VNx2_WIDE:mode>)
      	(*aarch64_scatter_store_trunc<VNx2_NARROW:mode><VNx2_WIDE:mode>_sxtw)
      	(*aarch64_scatter_store_trunc<VNx2_NARROW:mode><VNx2_WIDE:mode>_uxtw):
      	New patterns.
      	(vec_duplicate<mode>): Use QI as the mode of the input operand.
      	(extract_last_<mode>): Generalize to...
      	(@extract_<LAST:last_op>_<mode>): ...this.
      	(*<SVE_INT_UNARY:optab><mode>2): Rename to...
      	(@aarch64_pred_<SVE_INT_UNARY:optab><mode>): ...this.
      	(@cond_<SVE_INT_UNARY:optab><mode>): New expander.
      	(@aarch64_pred_sxt<SVE_HSDI:mode><SVE_PARTIAL:mode>): New pattern.
      	(@aarch64_cond_sxt<SVE_HSDI:mode><SVE_PARTIAL:mode>): Likewise.
      	(@aarch64_pred_cnot<mode>, @cond_cnot<mode>): New expanders.
      	(@aarch64_sve_<SVE_FP_UNARY_INT:optab><mode>): New pattern.
      	(@aarch64_sve_<SVE_FP_UNARY:optab><mode>): Likewise.
      	(*<SVE_COND_FP_UNARY:optab><mode>2): Rename to...
      	(@aarch64_pred_<SVE_COND_FP_UNARY:optab><mode>): ...this.
      	(@cond_<SVE_COND_FP_UNARY:optab><mode>): New expander.
      	(*<SVE_INT_BINARY_IMM:optab><mode>3): Rename to...
      	(@aarch64_pred_<SVE_INT_BINARY_IMM:optab><mode>): ...this.
      	(@aarch64_adr<mode>, *aarch64_adr_sxtw): New patterns.
      	(*aarch64_adr_uxtw_unspec): Likewise.
      	(*aarch64_adr_uxtw): Rename to...
      	(*aarch64_adr_uxtw_and): ...this.
      	(@aarch64_adr<mode>_shift): New expander.
      	(*aarch64_adr_shift_sxtw): New pattern.
      	(aarch64_<su>abd<mode>_3): Rename to...
      	(@aarch64_pred_<su>abd<mode>): ...this.
      	(<su>abd<mode>_3): Update accordingly.
      	(@aarch64_cond_<su>abd<mode>): New expander.
      	(@aarch64_<SBINQOPS:su_optab><optab><mode>): New pattern.
      	(@aarch64_<UBINQOPS:su_optab><optab><mode>): Likewise.
      	(*<su>mul<mode>3_highpart): Rename to...
      	(@aarch64_pred_<optab><mode>): ...this.
      	(@cond_<MUL_HIGHPART:optab><mode>): New expander.
      	(*cond_<MUL_HIGHPART:optab><mode>_2): New pattern.
      	(*cond_<MUL_HIGHPART:optab><mode>_z): Likewise.
      	(*<SVE_INT_BINARY_SD:optab><mode>3): Rename to...
      	(@aarch64_pred_<SVE_INT_BINARY_SD:optab><mode>): ...this.
      	(cond_<SVE_INT_BINARY_SD:optab><mode>): Add a "@" marker.
      	(@aarch64_bic<mode>, @cond_bic<mode>): New expanders.
      	(*v<ASHIFT:optab><mode>3): Rename to...
      	(@aarch64_pred_<ASHIFT:optab><mode>): ...this.
      	(@aarch64_sve_<SVE_SHIFT_WIDE:sve_int_op><mode>): New pattern.
      	(@cond_<SVE_SHIFT_WIDE:sve_int_op><mode>): New expander.
      	(*cond_<SVE_SHIFT_WIDE:sve_int_op><mode>_m): New pattern.
      	(*cond_<SVE_SHIFT_WIDE:sve_int_op><mode>_z): Likewise.
      	(@cond_asrd<mode>): New expander.
      	(*cond_asrd<mode>_2, *cond_asrd<mode>_z): New patterns.
      	(sdiv_pow2<mode>3): Expand to *cond_asrd<mode>_2.
      	(*sdiv_pow2<mode>3): Delete.
      	(@cond_<SVE_COND_FP_BINARY_INT:optab><mode>): New expander.
      	(*cond_<SVE_COND_FP_BINARY_INT:optab><mode>_2): New pattern.
      	(*cond_<SVE_COND_FP_BINARY_INT:optab><mode>_any): Likewise.
      	(@aarch64_sve_<SVE_FP_BINARY:optab><mode>): New pattern.
      	(@aarch64_sve_<SVE_FP_BINARY_INT:optab><mode>): Likewise.
      	(*<SVE_COND_FP_BINARY_REG:optab><mode>3): Rename to...
      	(@aarch64_pred_<SVE_COND_FP_BINARY_REG:optab><mode>): ...this.
      	(@aarch64_pred_<SVE_COND_FP_BINARY_INT:optab><mode>): New pattern.
      	(cond_<SVE_COND_FP_BINARY:optab><mode>): Add a "@" marker.
      	(*add<SVE_F:mode>3): Rename to...
      	(@aarch64_pred_add<SVE_F:mode>): ...this and add alternatives
      	for SVE_STRICT_GP.
      	(@aarch64_pred_<SVE_COND_FCADD:optab><mode>): New pattern.
      	(@cond_<SVE_COND_FCADD:optab><mode>): New expander.
      	(*cond_<SVE_COND_FCADD:optab><mode>_2): New pattern.
      	(*cond_<SVE_COND_FCADD:optab><mode>_any): Likewise.
      	(*sub<SVE_F:mode>3): Rename to...
      	(@aarch64_pred_sub<SVE_F:mode>): ...this and add alternatives
      	for SVE_STRICT_GP.
      	(@aarch64_pred_abd<SVE_F:mode>): New expander.
      	(*fabd<SVE_F:mode>3): Rename to...
      	(*aarch64_pred_abd<SVE_F:mode>): ...this.
      	(@aarch64_cond_abd<SVE_F:mode>): New expander.
      	(*mul<SVE_F:mode>3): Rename to...
      	(@aarch64_pred_<SVE_F:optab><mode>): ...this and add alternatives
      	for SVE_STRICT_GP.
      	(@aarch64_mul_lane_<SVE_F:mode>): New pattern.
      	(*<SVE_COND_FP_MAXMIN_PUBLIC:optab><mode>3): Rename and generalize
      	to...
      	(@aarch64_pred_<SVE_COND_FP_MAXMIN:optab><mode>): ...this.
      	(*<LOGICAL:optab><PRED_ALL:mode>3_ptest): New pattern.
      	(*<nlogical><PRED_ALL:mode>3): Rename to...
      	(aarch64_pred_<nlogical><PRED_ALL:mode>_z): ...this.
      	(*<nlogical><PRED_ALL:mode>3_cc): New pattern.
      	(*<nlogical><PRED_ALL:mode>3_ptest): Likewise.
      	(*<logical_nn><PRED_ALL:mode>3): Rename to...
      	(aarch64_pred_<logical_nn><mode>_z): ...this.
      	(*<logical_nn><PRED_ALL:mode>3_cc): New pattern.
      	(*<logical_nn><PRED_ALL:mode>3_ptest): Likewise.
      	(*fma<SVE_I:mode>4): Rename to...
      	(@aarch64_pred_fma<SVE_I:mode>): ...this.
      	(*fnma<SVE_I:mode>4): Rename to...
      	(@aarch64_pred_fnma<SVE_I:mode>): ...this.
      	(@aarch64_<sur>dot_prod_lane<vsi2qi>): New pattern.
      	(*<SVE_FP_TERNARY:optab><mode>4): Rename to...
      	(@aarch64_pred_<SVE_FP_TERNARY:optab><mode>): ...this.
      	(cond_<SVE_FP_TERNARY:optab><mode>): Add a "@" marker.
      	(@aarch64_<SVE_FP_TERNARY_LANE:optab>_lane_<mode>): New pattern.
      	(@aarch64_pred_<SVE_COND_FCMLA:optab><mode>): Likewise.
      	(@cond_<SVE_COND_FCMLA:optab><mode>): New expander.
      	(*cond_<SVE_COND_FCMLA:optab><mode>_4): New pattern.
      	(*cond_<SVE_COND_FCMLA:optab><mode>_any): Likewise.
      	(@aarch64_<FCMLA:optab>_lane_<mode>): Likewise.
      	(@aarch64_sve_tmad<mode>): Likewise.
      	(vcond_mask_<SVE_ALL:mode><vpred>): Add a "@" marker.
      	(*aarch64_sel_dup<mode>): Rename to...
      	(@aarch64_sel_dup<mode>): ...this.
      	(@aarch64_pred_cmp<cmp_op><SVE_I:mode>_wide): New pattern.
      	(*aarch64_pred_cmp<cmp_op><SVE_I:mode>_wide_cc): Likewise.
      	(*aarch64_pred_cmp<cmp_op><SVE_I:mode>_wide_ptest): Likewise.
      	(@while_ult<GPI:mode><PRED_ALL:mode>): Generalize to...
      	(@while_<while_optab_cmp><GPI:mode><PRED_ALL:mode>): ...this.
      	(*while_ult<GPI:mode><PRED_ALL:mode>_cc): Generalize to...
      	(*while_<while_optab_cmp><GPI:mode><PRED_ALL:mode>_cc): ...this.
      	(*while_<while_optab_cmp><GPI:mode><PRED_ALL:mode>_ptest): New pattern.
      	(*fcm<cmp_op><mode>): Rename to...
      	(@aarch64_pred_fcm<cmp_op><mode>): ...this.  Make operand order
      	match @aarch64_pred_cmp<cmp_op><SVE_I:mode>.
      	(*fcmuo<mode>): Rename to...
      	(@aarch64_pred_fcmuo<mode>): ...this.  Make operand order
      	match @aarch64_pred_cmp<cmp_op><SVE_I:mode>.
      	(@aarch64_pred_fac<cmp_op><mode>): New expander.
      	(@vcond_mask_<PRED_ALL:mode><mode>): New pattern.
      	(fold_extract_last_<mode>): Generalize to...
      	(@fold_extract_<last_op>_<mode>): ...this.
      	(@aarch64_fold_extract_vector_<last_op>_<mode>): New pattern.
      	(*reduc_plus_scal_<SVE_I:mode>): Replace with...
      	(@aarch64_pred_reduc_<optab>_<mode>): ...this pattern, making the
      	DImode result explicit.
      	(reduc_plus_scal_<mode>): Update accordingly.
      	(*reduc_<optab>_scal_<SVE_I:mode>): Rename to...
      	(@aarch64_pred_reduc_<optab>_<SVE_I:mode>): ...this.
      	(*reduc_<optab>_scal_<SVE_F:mode>): Rename to...
      	(@aarch64_pred_reduc_<optab>_<SVE_F:mode>): ...this.
      	(*aarch64_sve_tbl<mode>): Rename to...
      	(@aarch64_sve_tbl<mode>): ...this.
      	(@aarch64_sve_compact<mode>): New pattern.
      	(*aarch64_sve_dup_lane<mode>): Rename to...
      	(@aarch64_sve_dup_lane<mode>): ...this.
      	(@aarch64_sve_dupq_lane<mode>): New pattern.
      	(@aarch64_sve_splice<mode>): Likewise.
      	(aarch64_sve_<perm_insn><mode>): Rename to...
      	(@aarch64_sve_<perm_insn><mode>): ...this.
      	(*aarch64_sve_ext<mode>): Rename to...
      	(@aarch64_sve_ext<mode>): ...this.
      	(aarch64_sve_<su>unpk<perm_hilo>_<SVE_BHSI:mode>): Add a "@" marker.
      	(*aarch64_sve_<optab>_nontrunc<SVE_F:mode><SVE_HSDI:mode>): Rename
      	to...
      	(@aarch64_sve_<optab>_nontrunc<SVE_F:mode><SVE_HSDI:mode>): ...this.
      	(*aarch64_sve_<optab>_trunc<VNx2DF_ONLY:mode><VNx4SI_ONLY:mode>):
      	Rename to...
      	(@aarch64_sve_<optab>_trunc<VNx2DF_ONLY:mode><VNx4SI_ONLY:mode>):
      	...this.
      	(@cond_<optab>_nontrunc<SVE_F:mode><SVE_HSDI:mode>): New expander.
      	(@cond_<optab>_trunc<VNx2DF_ONLY:mode><VNx4SI_ONLY:mode>): Likewise.
      	(*cond_<optab>_trunc<VNx2DF_ONLY:mode><VNx4SI_ONLY:mode>): New pattern.
      	(*aarch64_sve_<optab>_nonextend<SVE_HSDI:mode><SVE_F:mode>): Rename
      	to...
      	(@aarch64_sve_<optab>_nonextend<SVE_HSDI:mode><SVE_F:mode>): ...this.
      	(aarch64_sve_<optab>_extend<VNx4SI_ONLY:mode><VNx2DF_ONLY:mode>): Add
      	a "@" marker.
      	(@cond_<optab>_nonextend<SVE_HSDI:mode><SVE_F:mode>): New expander.
      	(@cond_<optab>_extend<VNx4SI_ONLY:mode><VNx2DF_ONLY:mode>): Likewise.
      	(*cond_<optab>_extend<VNx4SI_ONLY:mode><VNx2DF_ONLY:mode>): New
      	pattern.
      	(*aarch64_sve_<optab>_trunc<SVE_SDF:mode><SVE_HSF:mode>): Rename to...
      	(@aarch64_sve_<optab>_trunc<SVE_SDF:mode><SVE_HSF:mode>): ...this.
      	(@cond_<optab>_trunc<SVE_SDF:mode><SVE_HSF:mode>): New expander.
      	(*cond_<optab>_trunc<SVE_SDF:mode><SVE_HSF:mode>): New pattern.
      	(aarch64_sve_<optab>_nontrunc<SVE_HSF:mode><SVE_SDF:mode>): Add a
      	"@" marker.
      	(@cond_<optab>_nontrunc<SVE_HSF:mode><SVE_SDF:mode>): New expander.
      	(*cond_<optab>_nontrunc<SVE_HSF:mode><SVE_SDF:mode>): New pattern.
      	(aarch64_sve_punpk<perm_hilo>_<mode>): Add a "@" marker.
      	(@aarch64_brk<SVE_BRK_UNARY:brk_op>): New pattern.
      	(*aarch64_brk<SVE_BRK_UNARY:brk_op>_cc): Likewise.
      	(*aarch64_brk<SVE_BRK_UNARY:brk_op>_ptest): Likewise.
      	(@aarch64_brk<SVE_BRK_BINARY:brk_op>): Likewise.
      	(*aarch64_brk<SVE_BRK_BINARY:brk_op>_cc): Likewise.
      	(*aarch64_brk<SVE_BRK_BINARY:brk_op>_ptest): Likewise.
      	(@aarch64_sve_<SVE_PITER:sve_pred_op><mode>): Likewise.
      	(*aarch64_sve_<SVE_PITER:sve_pred_op><mode>_cc): Likewise.
      	(*aarch64_sve_<SVE_PITER:sve_pred_op><mode>_ptest): Likewise.
      	(aarch64_sve_cnt_pat): Likewise.
      	(@aarch64_sve_<ANY_PLUS:inc_dec><DI_ONLY:mode>_pat): Likewise.
      	(*aarch64_sve_incsi_pat): Likewise.
      	(@aarch64_sve_<SAT_PLUS:inc_dec><SI_ONLY:mode>_pat): Likewise.
      	(@aarch64_sve_<ANY_PLUS:inc_dec><VNx2DI_ONLY:mode>_pat): Likewise.
      	(@aarch64_sve_<ANY_PLUS:inc_dec><VNx4SI_ONLY:mode>_pat): Likewise.
      	(@aarch64_sve_<ANY_PLUS:inc_dec><VNx8HI_ONLY:mode>_pat): New expander.
      	(*aarch64_sve_<ANY_PLUS:inc_dec><VNx8HI_ONLY:mode>_pat): New pattern.
      	(@aarch64_sve_<ANY_MINUS:inc_dec><DI_ONLY:mode>_pat): Likewise.
      	(*aarch64_sve_decsi_pat): Likewise.
      	(@aarch64_sve_<SAT_MINUS:inc_dec><SI_ONLY:mode>_pat): Likewise.
      	(@aarch64_sve_<ANY_MINUS:inc_dec><VNx2DI_ONLY:mode>_pat): Likewise.
      	(@aarch64_sve_<ANY_MINUS:inc_dec><VNx4SI_ONLY:mode>_pat): Likewise.
      	(@aarch64_sve_<ANY_MINUS:inc_dec><VNx8HI_ONLY:mode>_pat): New expander.
      	(*aarch64_sve_<ANY_MINUS:inc_dec><VNx8HI_ONLY:mode>_pat): New pattern.
      	(@aarch64_pred_cntp<mode>): Likewise.
      	(@aarch64_sve_<ANY_PLUS:inc_dec><DI_ONLY:mode><PRED_ALL:mode>_cntp):
      	New expander.
      	(*aarch64_sve_<ANY_PLUS:inc_dec><DI_ONLY:mode><PRED_ALL:mode>_cntp)
      	(*aarch64_incsi<PRED_ALL:mode>_cntp): New patterns.
      	(@aarch64_sve_<SAT_PLUS:inc_dec><SI_ONLY:mode><PRED_ALL:mode>_cntp):
      	New expander.
      	(*aarch64_sve_<SAT_PLUS:inc_dec><SI_ONLY:mode><PRED_ALL:mode>_cntp):
      	New pattern.
      	(@aarch64_sve_<ANY_PLUS:inc_dec><VNx2DI_ONLY:mode>_cntp): New expander.
      	(*aarch64_sve_<ANY_PLUS:inc_dec><VNx2DI_ONLY:mode>_cntp): New pattern.
      	(@aarch64_sve_<ANY_PLUS:inc_dec><VNx4SI_ONLY:mode>_cntp): New expander.
      	(*aarch64_sve_<ANY_PLUS:inc_dec><VNx4SI_ONLY:mode>_cntp): New pattern.
      	(@aarch64_sve_<ANY_PLUS:inc_dec><VNx8HI_ONLY:mode>_cntp): New expander.
      	(*aarch64_sve_<ANY_PLUS:inc_dec><VNx8HI_ONLY:mode>_cntp): New pattern.
      	(@aarch64_sve_<ANY_MINUS:inc_dec><DI_ONLY:mode><PRED_ALL:mode>_cntp):
      	New expander.
      	(*aarch64_sve_<ANY_MINUS:inc_dec><DI_ONLY:mode><PRED_ALL:mode>_cntp)
      	(*aarch64_incsi<PRED_ALL:mode>_cntp): New patterns.
      	(@aarch64_sve_<SAT_MINUS:inc_dec><SI_ONLY:mode><PRED_ALL:mode>_cntp):
      	New expander.
      	(*aarch64_sve_<SAT_MINUS:inc_dec><SI_ONLY:mode><PRED_ALL:mode>_cntp):
      	New pattern.
      	(@aarch64_sve_<ANY_MINUS:inc_dec><VNx2DI_ONLY:mode>_cntp): New
      	expander.
      	(*aarch64_sve_<ANY_MINUS:inc_dec><VNx2DI_ONLY:mode>_cntp): New pattern.
      	(@aarch64_sve_<ANY_MINUS:inc_dec><VNx4SI_ONLY:mode>_cntp): New
      	expander.
      	(*aarch64_sve_<ANY_MINUS:inc_dec><VNx4SI_ONLY:mode>_cntp): New pattern.
      	(@aarch64_sve_<ANY_MINUS:inc_dec><VNx8HI_ONLY:mode>_cntp): New
      	expander.
      	(*aarch64_sve_<ANY_MINUS:inc_dec><VNx8HI_ONLY:mode>_cntp): New pattern.
      	* config/aarch64/arm_sve.h: New file.
      	* config/aarch64/aarch64-sve-builtins.h: Likewise.
      	* config/aarch64/aarch64-sve-builtins.cc: Likewise.
      	* config/aarch64/aarch64-sve-builtins.def: Likewise.
      	* config/aarch64/aarch64-sve-builtins-base.h: Likewise.
      	* config/aarch64/aarch64-sve-builtins-base.cc: Likewise.
      	* config/aarch64/aarch64-sve-builtins-base.def: Likewise.
      	* config/aarch64/aarch64-sve-builtins-functions.h: Likewise.
      	* config/aarch64/aarch64-sve-builtins-shapes.h: Likewise.
      	* config/aarch64/aarch64-sve-builtins-shapes.cc: Likewise.
      
      gcc/testsuite/
      	* g++.target/aarch64/sve/acle/aarch64-sve-acle.exp: New file.
      	* g++.target/aarch64/sve/acle/general-c++: New test directory.
      	* gcc.target/aarch64/sve/acle/aarch64-sve-acle.exp: New file.
      	* gcc.target/aarch64/sve/acle/general: New test directory.
      	* gcc.target/aarch64/sve/acle/general-c: Likewise.
      
      Co-Authored-By: Kugan Vivekanandarajah <kuganv@linaro.org>
      Co-Authored-By: Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org>
      
      From-SVN: r277563
      Richard Sandiford committed
    • [AArch64] Extend SVE reverse permutes to predicates · 28350fd1
      This is tested by the main SVE ACLE patches, but since it affects
      the evpc routines, it seemed worth splitting out.
      
      2019-10-29  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* config/aarch64/aarch64-sve.md (@aarch64_sve_rev<PRED_ALL:mode>):
      	New pattern.
      	* config/aarch64/aarch64.c (aarch64_evpc_rev_global): Handle all
      	SVE modes.
      
      From-SVN: r277562
      Richard Sandiford committed
  4. 30 Sep, 2019 1 commit
    • [AArch64][SVE] Utilize ASRD instruction for division and remainder · c0c2f013
      2019-09-30  Yuliang Wang  <yuliang.wang@arm.com>
      
      gcc/
      	* config/aarch64/aarch64-sve.md (sdiv_pow2<mode>3):
      	New pattern for ASRD.
      	* config/aarch64/iterators.md (UNSPEC_ASRD): New unspec.
      	* internal-fn.def (IFN_DIV_POW2): New internal function.
      	* optabs.def (sdiv_pow2_optab): New optab.
      	* tree-vect-patterns.c (vect_recog_divmod_pattern):
      	Modify pattern to support new operation.
      	* doc/md.texi (sdiv_pow2@var{m}3): Documentation for the above.
      	* doc/sourcebuild.texi (vect_sdiv_pow2_si):
      	Document new target selector.
      
      gcc/testsuite/
      	* gcc.dg/vect/vect-sdiv-pow2-1.c: New test.
      	* gcc.target/aarch64/sve/asrdiv_1.c: As above.
      	* lib/target-supports.exp (check_effective_target_vect_sdiv_pow2_si):
      	Return true for AArch64 with SVE.
      
      From-SVN: r276343
      Yuliang Wang committed
  5. 22 Aug, 2019 1 commit
  6. 15 Aug, 2019 14 commits
    • [AArch64] Tweak operand choice for SVE predicate AND · 2d2388f8
      SVE defines an assembly alias:
      
         MOV pa.B, pb/Z, pc.B  ->  AND pa.B, pb/Z, pc.B, pc.B
      
      Our and<mode>3 pattern was instead using the functionally-equivalent:
      
         AND pa.B, pb/Z, pb.B, pc.B
                         ^^^^
      This patch duplicates pc.B instead so that the alias can be seen
      in disassembly.
      
      I wondered about using the alias in the pattern instead, but using AND
      explicitly seems to fit better with the pattern name and surrounding code.
      
      2019-08-15  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* config/aarch64/aarch64-sve.md (and<PRED_ALL:mode>3): Make the
      	operand order match the MOV /Z alias.
      
      From-SVN: r274521
      Richard Sandiford committed
    • [AArch64] Rework SVE INC/DEC handling · 0fdc30bc
      The scalar addition patterns allowed all the VL constants that
      ADDVL and ADDPL allow, but wrote the instructions as INC or DEC
      if possible (i.e. adding or subtracting a number of elements * [1, 16]
      when the source and target registers are the same).  That works for the
      cases that the autovectoriser needs, but there are a few constants
      that INC and DEC can handle but ADDPL and ADDVL can't.  E.g.:
      
              inch    x0, all, mul #9
      
      is not a multiple of the number of bytes in an SVE register, and so
      can't use ADDVL.  It represents 36 times the number of bytes in an
      SVE predicate, putting it outside the range of ADDPL.
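
      To make the arithmetic concrete, here is the same calculation for a
      256-bit vector; the numbers and the helper are purely illustrative and
      are not part of the patch:

         /* Hypothetical breakdown for VL = 32 bytes (a 256-bit vector).  */
         int
         inch_mul_9_breakdown (void)
         {
           int vl = 32;                /* bytes per SVE vector */
           int pl = vl / 8;            /* bytes per SVE predicate: 4 */
           int delta = 9 * (vl / 2);   /* "inch x0, all, mul #9" adds 9 * 16 = 144 */
           /* 144 is not a multiple of vl (4.5 vectors), so ADDVL cannot encode
              it; it is 36 * pl, but ADDPL's multiplier range is only [-32, 31],
              so only INC/DEC can add it in one instruction.  */
           return delta / pl;          /* 36 */
         }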
      
      This patch therefore adds separate alternatives for INC and DEC,
      tied to a new Uai constraint.  It also adds an explicit "scalar"
      or "vector" to the function names, to avoid a clash with the
      existing support for vector INC and DEC.
      
      2019-08-15  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* config/aarch64/aarch64-protos.h
      	(aarch64_sve_scalar_inc_dec_immediate_p): Declare.
      	(aarch64_sve_inc_dec_immediate_p): Rename to...
      	(aarch64_sve_vector_inc_dec_immediate_p): ...this.
      	(aarch64_output_sve_addvl_addpl): Take a single rtx argument.
      	(aarch64_output_sve_scalar_inc_dec): Declare.
      	(aarch64_output_sve_inc_dec_immediate): Rename to...
      	(aarch64_output_sve_vector_inc_dec): ...this.
      	* config/aarch64/aarch64.c (aarch64_sve_scalar_inc_dec_immediate_p)
      	(aarch64_output_sve_scalar_inc_dec): New functions.
      	(aarch64_output_sve_addvl_addpl): Remove the base and offset
      	arguments.  Only handle true ADDVL and ADDPL instructions;
      	don't emit an INC or DEC.
      	(aarch64_sve_inc_dec_immediate_p): Rename to...
      	(aarch64_sve_vector_inc_dec_immediate_p): ...this.
      	(aarch64_output_sve_inc_dec_immediate): Rename to...
      	(aarch64_output_sve_vector_inc_dec): ...this.  Update call to
      	aarch64_sve_vector_inc_dec_immediate_p.
      	* config/aarch64/predicates.md (aarch64_sve_scalar_inc_dec_immediate)
      	(aarch64_sve_plus_immediate): New predicates.
      	(aarch64_pluslong_operand): Accept aarch64_sve_plus_immediate
      	rather than aarch64_sve_addvl_addpl_immediate.
      	(aarch64_sve_inc_dec_immediate): Rename to...
      	(aarch64_sve_vector_inc_dec_immediate): ...this.  Update call to
      	aarch64_sve_vector_inc_dec_immediate_p.
      	(aarch64_sve_add_operand): Update accordingly.
      	* config/aarch64/constraints.md (Uai): New constraint.
      	(vsi): Update call to aarch64_sve_vector_inc_dec_immediate_p.
      	* config/aarch64/aarch64.md (add<GPI:mode>3): Don't force the second
      	operand into a register if it satisfies aarch64_sve_plus_immediate.
      	(*add<GPI:mode>3_aarch64, *add<GPI:mode>3_poly_1): Add an alternative
      	for Uai.  Update calls to aarch64_output_sve_addvl_addpl.
      	* config/aarch64/aarch64-sve.md (add<mode>3): Call
      	aarch64_output_sve_vector_inc_dec instead of
      	aarch64_output_sve_inc_dec_immediate.
      
      From-SVN: r274518
      Richard Sandiford committed
    • [AArch64] Rework SVE REV[BHW] patterns · d7a09c44
      The current SVE REV patterns follow the AArch64 scheme, in which
      UNSPEC_REV<NN> reverses elements within an <NN>-bit granule.
      E.g. UNSPEC_REV64 on VNx8HI reverses the four 16-bit elements
      within each 64-bit granule.
      
      The native SVE scheme is the other way around: UNSPEC_REV64 is seen
      as an operation on 64-bit elements, with REVB swapping bytes within
      the elements, REVH swapping halfwords, and so on.  This fits SVE more
      naturally because the operation can then be predicated per <NN>-bit
      granule/element.
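
      As a rough illustration (the values and layout here are invented for
      exposition, not taken from the patch), take the four halfwords of one
      64-bit granule of a VNx8HI vector:

         /* One 64-bit granule before and after the reversal.  */
         unsigned short before[4] = { 0x0001, 0x0002, 0x0003, 0x0004 };
         unsigned short after[4]  = { 0x0004, 0x0003, 0x0002, 0x0001 };
         /* Advanced SIMD view: REV64 reverses the 16-bit elements within each
            64-bit granule.  Native SVE view: REVH acts on one 64-bit element
            and swaps the halfwords inside it.  The result is the same, but the
            SVE view lets the operation be predicated per 64-bit element.  */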
      
      Making the patterns use the Advanced SIMD scheme was more natural
      when all we cared about were permutes, since we could then use
      the source and target of the permute in their original modes.
      However, the ACLE does need patterns that follow the native scheme,
      treating them as operations on integer elements.  This patch defines
      the patterns that way instead and updates the existing uses to match.
      
      This also brings in a couple of helper routines from the ACLE branch.
      
      2019-08-15  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* config/aarch64/iterators.md (UNSPEC_REVB, UNSPEC_REVH)
      	(UNSPEC_REVW): New constants.
      	(elem_bits): New mode attribute.
      	(SVE_INT_UNARY): New int iterator.
      	(optab): Handle UNSPEC_REV[BHW].
      	(sve_int_op): New int attribute.
      	(min_elem_bits): Handle VNx16QI and the predicate modes.
      	* config/aarch64/aarch64-sve.md (*aarch64_sve_rev64<mode>)
      	(*aarch64_sve_rev32<mode>, *aarch64_sve_rev16vnx16qi): Delete.
      	(@aarch64_pred_<SVE_INT_UNARY:optab><SVE_I:mode>): New pattern.
      	* config/aarch64/aarch64.c (aarch64_sve_data_mode): New function.
      	(aarch64_sve_int_mode, aarch64_sve_rev_unspec): Likewise.
      	(aarch64_split_sve_subreg_move): Use UNSPEC_REV[BHW] instead of
      	unspecs based on the total width of the reversed data.
      	(aarch64_evpc_rev_local): Likewise (for SVE only).  Use a
      	reinterpret followed by a subreg on big-endian targets.
      
      gcc/testsuite/
      	* gcc.target/aarch64/sve/revb_1.c: Restrict to little-endian targets.
      	Avoid including stdint.h.
      	* gcc.target/aarch64/sve/revh_1.c: Likewise.
      	* gcc.target/aarch64/sve/revw_1.c: Likewise.
      	* gcc.target/aarch64/sve/revb_2.c: New big-endian test.
      	* gcc.target/aarch64/sve/revh_2.c: Likewise.
      	* gcc.target/aarch64/sve/revw_2.c: Likewise.
      
      From-SVN: r274517
      Richard Sandiford committed
    • [AArch64] Add more SVE FMLA and FMAD /z alternatives · 432b29c1
      This patch makes the floating-point conditional FMA patterns provide the
      same /z alternatives as the integer patterns added by a previous patch.
      We can handle cases in which individual inputs are allocated to the same
      register as the output, so we don't need to force all registers to be
      different.
      
      2019-08-15  Richard Sandiford  <richard.sandiford@arm.com>
      	    Kugan Vivekanandarajah  <kugan.vivekanandarajah@linaro.org>
      
      gcc/
      	* config/aarch64/aarch64-sve.md
      	(*cond_<SVE_COND_FP_TERNARY:optab><SVE_F:mode>_any): Add /z
      	alternatives in which one of the inputs is in the same register
      	as the output.
      
      gcc/testsuite/
      	* gcc.target/aarch64/sve/cond_mla_5.c: Allow FMAD as well as FMLA
      	and FMSB as well as FMLS.
      
      Co-Authored-By: Kugan Vivekanandarajah <kuganv@linaro.org>
      
      From-SVN: r274516
      Richard Sandiford committed
    • [AArch64] Add MOVPRFX alternatives for SVE EXT patterns · 06b3ba23
      We use EXT both to implement vec_extract for large indices and as a
      permute.  In both cases we can use MOVPRFX to handle the case in which
      the first input and output can't be tied.
      
      2019-08-15  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* config/aarch64/aarch64-sve.md (*vec_extract<mode><Vel>_ext)
      	(*aarch64_sve_ext<mode>): Add MOVPRFX alternatives.
      
      gcc/testsuite/
      	* gcc.target/aarch64/sve/ext_2.c: Expect a MOVPRFX.
      	* gcc.target/aarch64/sve/ext_3.c: New test.
      
      From-SVN: r274515
      Richard Sandiford committed
    • [AArch64] Remove unneeded FSUB alternatives and add a new one · 2ae21bd1
      The floating-point subtraction patterns don't need to handle
      subtraction of constants, since those go through the addition
      patterns instead.  There was a missing MOVPRFX alternative for
      FSUBR though.
      
      2019-08-15  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* config/aarch64/aarch64-sve.md (*sub<SVE_F:mode>3): Remove immediate
      	FADD and FSUB alternatives.  Add a MOVPRFX alternative for FSUBR.
      
      From-SVN: r274514
      Richard Sandiford committed
    • [AArch64] Add more unpredicated MOVPRFX alternatives · 5e176a61
      FABD and some immediate instructions were missing MOVPRFX alternatives.
      This is tested by the ACLE patches but is really an independent improvement.
      
      2019-08-15  Richard Sandiford  <richard.sandiford@arm.com>
      	    Kugan Vivekanandarajah  <kugan.vivekanandarajah@linaro.org>
      
      gcc/
      	* config/aarch64/aarch64-sve.md (add<SVE_I:mode>3, sub<SVE_I:mode>3)
      	(<LOGICAL:optab><SVE_I:mode>3, *add<SVE_F:mode>3, *mul<SVE_F:mode>3)
      	(*fabd<SVE_F:mode>3): Add more MOVPRFX alternatives.
      
      Co-Authored-By: Kugan Vivekanandarajah <kuganv@linaro.org>
      
      From-SVN: r274513
      Richard Sandiford committed
    • [AArch64] Use SVE reversed shifts in preference to MOVPRFX · 7d1f2401
      This patch makes us use reversed SVE shifts when the first operand
      can't be tied to the output but the second can.  This is tested
      more thoroughly by the ACLE patches but is really an independent
      improvement.
      
      2019-08-15  Richard Sandiford  <richard.sandiford@arm.com>
      	    Prathamesh Kulkarni  <prathamesh.kulkarni@linaro.org>
      
      gcc/
      	* config/aarch64/aarch64-sve.md (*v<ASHIFT:optab><SVE_I:mode>3):
      	Add an alternative that uses reversed shifts.
      
      gcc/testsuite/
      	* gcc.target/aarch64/sve/shift_1.c: Accept reversed shifts.
      
      Co-Authored-By: Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org>
      
      From-SVN: r274512
      Richard Sandiford committed
    • [AArch64] Add a commutativity marker to the SVE [SU]ABD patterns · 9a8d9b3f
      This will be tested by the ACLE patches, but it's really an
      independent improvement.
      
      2019-08-15  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* config/aarch64/aarch64-sve.md (aarch64_<su>abd<mode>_3): Add
      	a commutativity marker.
      
      From-SVN: r274510
      Richard Sandiford committed
    • [AArch64] Use SVE MLA, MLS, MAD and MSB for conditional arithmetic · b6c3aea1
      This patch uses predicated MLA, MLS, MAD and MSB to implement
      conditional "FMA"s on integers.  This also requires providing
      the unpredicated optabs (fma and fnma) since otherwise
      tree-ssa-math-opts.c won't try to use the conditional forms.
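
      As a hypothetical example (the function and array names are invented
      here, not taken from the patch or its tests), a loop of this shape can
      now become a single predicated MLA:

         void
         cond_mla (int *restrict r, int *restrict a, int *restrict b,
                   int *restrict pred, int n)
         {
           for (int i = 0; i < n; ++i)
             /* If-converted to a conditional integer FMA that keeps the old
                value of r[i] where pred[i] is false.  */
             r[i] = pred[i] ? a[i] * b[i] + r[i] : r[i];
         }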
      
      We still want to use shifts and adds in preference to multiplications,
      so the patch makes the optab expanders check for that.
      
      The tests cover floating-point types too, which are already handled,
      and which were already tested to some extent by gcc.dg/vect.
      
      2019-08-15  Richard Sandiford  <richard.sandiford@arm.com>
      	    Kugan Vivekanandarajah  <kugan.vivekanandarajah@linaro.org>
      
      gcc/
      	* config/aarch64/aarch64-protos.h (aarch64_prepare_sve_int_fma)
      	(aarch64_prepare_sve_cond_int_fma): Declare.
      	* config/aarch64/aarch64.c (aarch64_convert_mult_to_shift)
      	(aarch64_prepare_sve_int_fma): New functions.
      	(aarch64_prepare_sve_cond_int_fma): Likewise.
      	* config/aarch64/aarch64-sve.md
      	(cond_<SVE_INT_BINARY:optab><SVE_I:mode>): Add a "@" marker.
      	(fma<SVE_I:mode>4, cond_fma<SVE_I:mode>, *cond_fma<SVE_I:mode>_2)
      	(*cond_fma<SVE_I:mode>_4, *cond_fma<SVE_I:mode>_any, fnma<SVE_I:mode>4)
      	(cond_fnma<SVE_I:mode>, *cond_fnma<SVE_I:mode>_2)
      	(*cond_fnma<SVE_I:mode>_4, *cond_fnma<SVE_I:mode>_any): New patterns.
      	(*madd<mode>): Rename to...
      	(*fma<mode>4): ...this.
      	(*msub<mode>): Rename to...
      	(*fnma<mode>4): ...this.
      
      gcc/testsuite/
      	* gcc.target/aarch64/sve/cond_mla_1.c: New test.
      	* gcc.target/aarch64/sve/cond_mla_1_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_mla_2.c: Likewise.
      	* gcc.target/aarch64/sve/cond_mla_2_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_mla_3.c: Likewise.
      	* gcc.target/aarch64/sve/cond_mla_3_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_mla_4.c: Likewise.
      	* gcc.target/aarch64/sve/cond_mla_4_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_mla_5.c: Likewise.
      	* gcc.target/aarch64/sve/cond_mla_5_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_mla_6.c: Likewise.
      	* gcc.target/aarch64/sve/cond_mla_6_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_mla_7.c: Likewise.
      	* gcc.target/aarch64/sve/cond_mla_7_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_mla_8.c: Likewise.
      	* gcc.target/aarch64/sve/cond_mla_8_run.c: Likewise.
      
      Co-Authored-By: Kugan Vivekanandarajah <kuganv@linaro.org>
      
      From-SVN: r274509
      Richard Sandiford committed
    • [AArch64] Use SVE binary immediate instructions for conditional arithmetic · a19ba9e1
      This patch lets us use the immediate forms of FADD, FSUB, FSUBR,
      FMUL, FMAXNM and FMINNM for conditional arithmetic.  (We already
      use them for normal unconditional arithmetic.)
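
      For instance, in a sketch like the following (illustrative names only),
      the conditional add of 1.0 can now use FADD's immediate form under the
      governing predicate:

         void
         cond_fadd_imm (float *restrict r, float *restrict a,
                        int *restrict pred, int n)
         {
           for (int i = 0; i < n; ++i)
             r[i] = pred[i] ? a[i] + 1.0f : a[i];
         }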
      
      2019-08-15  Richard Sandiford  <richard.sandiford@arm.com>
      	    Kugan Vivekanandarajah  <kugan.vivekanandarajah@linaro.org>
      
      gcc/
      	* config/aarch64/aarch64.c (aarch64_print_vector_float_operand):
      	Print 2.0 naturally.
      	(aarch64_sve_float_mul_immediate_p): Return true for 2.0.
      	* config/aarch64/predicates.md
      	(aarch64_sve_float_negated_arith_immediate): New predicate,
      	renamed from aarch64_sve_float_arith_with_sub_immediate.
      	(aarch64_sve_float_arith_with_sub_immediate): Test for both
      	positive and negative constants.
      	(aarch64_sve_float_arith_with_sub_operand): Redefine as a register
      	or an aarch64_sve_float_arith_with_sub_immediate.
      	* config/aarch64/constraints.md (vsN): Use
      	aarch64_sve_float_negated_arith_immediate.
      	* config/aarch64/iterators.md (SVE_COND_FP_BINARY_I1): New int
      	iterator.
      	(sve_pred_fp_rhs2_immediate): New int attribute.
      	* config/aarch64/aarch64-sve.md
      	(cond_<SVE_COND_FP_BINARY:optab><SVE_F:mode>): Use
      	sve_pred_fp_rhs1_operand and sve_pred_fp_rhs2_operand.
      	(*cond_<SVE_COND_FP_BINARY_I1:optab><SVE_F:mode>_2_const)
      	(*cond_<SVE_COND_FP_BINARY_I1:optab><SVE_F:mode>_any_const)
      	(*cond_add<SVE_F:mode>_2_const, *cond_add<SVE_F:mode>_any_const)
      	(*cond_sub<mode>_3_const, *cond_sub<mode>_any_const): New patterns.
      
      gcc/testsuite/
      	* gcc.target/aarch64/sve/cond_fadd_1.c: New test.
      	* gcc.target/aarch64/sve/cond_fadd_1_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fadd_2.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fadd_2_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fadd_3.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fadd_3_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fadd_4.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fadd_4_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fsubr_1.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fsubr_1_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fsubr_2.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fsubr_2_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fsubr_3.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fsubr_3_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fsubr_4.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fsubr_4_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fmaxnm_1.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fmaxnm_1_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fmaxnm_2.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fmaxnm_2_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fmaxnm_3.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fmaxnm_3_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fmaxnm_4.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fmaxnm_4_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fminnm_1.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fminnm_1_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fminnm_2.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fminnm_2_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fminnm_3.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fminnm_3_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fminnm_4.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fminnm_4_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fmul_1.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fmul_1_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fmul_2.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fmul_2_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fmul_3.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fmul_3_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fmul_4.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fmul_4_run.c: Likewise.
      
      Co-Authored-By: Kugan Vivekanandarajah <kuganv@linaro.org>
      
      From-SVN: r274508
      Richard Sandiford committed
    • [AArch64] Use SVE FABD in conditional arithmetic · bf30864e
      This patch extends the FABD support so that it handles conditional
      arithmetic.  We're relying on combine for this, since there's no
      associated IFN_COND_* (yet?).
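
      A hypothetical loop of the kind combine can now turn into a predicated
      FABD (the names are illustrative, not from the new tests):

         void
         cond_fabd (float *restrict r, float *restrict a, float *restrict b,
                    int *restrict pred, int n)
         {
           for (int i = 0; i < n; ++i)
             r[i] = pred[i] ? __builtin_fabsf (a[i] - b[i]) : a[i];
         }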
      
      2019-08-15  Richard Sandiford  <richard.sandiford@arm.com>
      	    Kugan Vivekanandarajah  <kugan.vivekanandarajah@linaro.org>
      
      gcc/
      	* config/aarch64/aarch64-sve.md (*aarch64_cond_abd<SVE_F:mode>_2)
      	(*aarch64_cond_abd<SVE_F:mode>_3)
      	(*aarch64_cond_abd<SVE_F:mode>_any): New patterns.
      
      gcc/testsuite/
      	* gcc.target/aarch64/sve/cond_fabd_1.c: New test.
      	* gcc.target/aarch64/sve/cond_fabd_1_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fabd_2.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fabd_2_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fabd_3.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fabd_3_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fabd_4.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fabd_4_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fabd_5.c: Likewise.
      	* gcc.target/aarch64/sve/cond_fabd_5_run.c: Likewise.
      
      Co-Authored-By: Kugan Vivekanandarajah <kuganv@linaro.org>
      
      From-SVN: r274507
      Richard Sandiford committed
    • [AArch64] Use SVE [SU]ABD in conditional arithmetic · 9730c5cc
      This patch extends the [SU]ABD support so that it handles
      conditional arithmetic.  We're relying on combine for this,
      since there's no associated IFN_COND_* (yet?).
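
      A sketch of the unsigned case this targets (names invented here for
      illustration):

         void
         cond_uabd (unsigned int *restrict r, unsigned int *restrict a,
                    unsigned int *restrict b, int *restrict pred, int n)
         {
           for (int i = 0; i < n; ++i)
             r[i] = pred[i] ? (a[i] > b[i] ? a[i] - b[i] : b[i] - a[i]) : a[i];
         }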
      
      2019-08-15  Richard Sandiford  <richard.sandiford@arm.com>
      	    Kugan Vivekanandarajah  <kugan.vivekanandarajah@linaro.org>
      
      gcc/
      	* config/aarch64/aarch64-sve.md (*aarch64_cond_<su>abd<mode>_2)
      	(*aarch64_cond_<su>abd<mode>_any): New patterns.
      
      gcc/testsuite/
      	* gcc.target/aarch64/sve/cond_abd_1.c: New test.
      	* gcc.target/aarch64/sve/cond_abd_1_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_abd_2.c: Likewise.
      	* gcc.target/aarch64/sve/cond_abd_2_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_abd_3.c: Likewise.
      	* gcc.target/aarch64/sve/cond_abd_3_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_abd_4.c: Likewise.
      	* gcc.target/aarch64/sve/cond_abd_4_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_abd_5.c: Likewise.
      	* gcc.target/aarch64/sve/cond_abd_5_run.c: Likewise.
      
      Co-Authored-By: Kugan Vivekanandarajah <kuganv@linaro.org>
      
      From-SVN: r274506
      Richard Sandiford committed
    • Add support for conditional shifts · 20103c0e
      This patch adds support for IFN_COND shifts left and shifts right.
      This is mostly mechanical, but since we try to handle conditional
      operations in the same way as unconditional operations in match.pd,
      we need to support IFN_COND shifts by scalars as well as vectors.
      E.g.:
      
         IFN_COND_SHL (cond, a, { 1, 1, ... }, fallback)
      
      and:
      
         IFN_COND_SHL (cond, a, 1, fallback)
      
      are the same operation, with:
      
         (for shiftrotate (lrotate rrotate lshift rshift)
          ...
          /* Prefer vector1 << scalar to vector1 << vector2
             if vector2 is uniform.  */
          (for vec (VECTOR_CST CONSTRUCTOR)
           (simplify
            (shiftrotate @0 vec@1)
            (with { tree tem = uniform_vector_p (@1); }
             (if (tem)
      	(shiftrotate @0 { tem; }))))))
      
      preferring the latter.  The patch copes with this by extending
      create_convert_operand_from to handle scalar-to-vector conversions.
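
      A minimal sketch of the kind of loop this enables (names invented for
      illustration); the shift amount is a uniform scalar, which is exactly
      the case the match.pd rule above prefers:

         void
         cond_shl (int *restrict r, int *restrict a, int *restrict pred, int n)
         {
           for (int i = 0; i < n; ++i)
             r[i] = pred[i] ? a[i] << 3 : a[i];
         }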
      
      2019-08-15  Richard Sandiford  <richard.sandiford@arm.com>
      	    Prathamesh Kulkarni  <prathamesh.kulkarni@linaro.org>
      
      gcc/
      	* internal-fn.def (IFN_COND_SHL, IFN_COND_SHR): New internal functions.
      	* internal-fn.c (FOR_EACH_CODE_MAPPING): Handle shifts.
      	* match.pd (UNCOND_BINARY, COND_BINARY): Likewise.
      	* optabs.def (cond_ashl_optab, cond_ashr_optab, cond_lshr_optab): New
      	optabs.
      	* optabs.h (create_convert_operand_from): Expand comment.
      	* optabs.c (maybe_legitimize_operand): Allow implicit broadcasts
      	when mapping scalar rtxes to vector operands.
      	* config/aarch64/iterators.md (SVE_INT_BINARY): Add ashift,
      	ashiftrt and lshiftrt.
      	(sve_int_op, sve_int_op_rev, sve_pred_int_rhs2_operand): Handle them.
      	* config/aarch64/aarch64-sve.md (*cond_<optab><mode>_2_const)
      	(*cond_<optab><mode>_any_const): New patterns.
      
      gcc/testsuite/
      	* gcc.target/aarch64/sve/cond_shift_1.c: New test.
      	* gcc.target/aarch64/sve/cond_shift_1_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_shift_2.c: Likewise.
      	* gcc.target/aarch64/sve/cond_shift_2_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_shift_3.c: Likewise.
      	* gcc.target/aarch64/sve/cond_shift_3_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_shift_4.c: Likewise.
      	* gcc.target/aarch64/sve/cond_shift_4_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_shift_5.c: Likewise.
      	* gcc.target/aarch64/sve/cond_shift_5_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_shift_6.c: Likewise.
      	* gcc.target/aarch64/sve/cond_shift_6_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_shift_7.c: Likewise.
      	* gcc.target/aarch64/sve/cond_shift_7_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_shift_8.c: Likewise.
      	* gcc.target/aarch64/sve/cond_shift_8_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_shift_9.c: Likewise.
      	* gcc.target/aarch64/sve/cond_shift_9_run.c: Likewise.
      
      Co-Authored-By: Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org>
      
      From-SVN: r274505
      Richard Sandiford committed
  7. 14 Aug, 2019 10 commits
    • [AArch64] Use SVE BIC for conditional arithmetic · 1b187f36
      This patch uses BIC to pattern-match conditional AND with an inverted
      third input.  It also adds extra tests for AND, ORR and EOR.
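
      As a rough sketch (the function and array names are invented, not taken
      from the new tests), this is the kind of loop that can now use a
      predicated BIC:

         void
         cond_bic (int *restrict r, int *restrict a, int *restrict b,
                   int *restrict pred, int n)
         {
           for (int i = 0; i < n; ++i)
             r[i] = pred[i] ? a[i] & ~b[i] : a[i];
         }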
      
      2019-08-14  Richard Sandiford  <richard.sandiford@arm.com>
      	    Kugan Vivekanandarajah  <kugan.vivekanandarajah@linaro.org>
      
      gcc/
      	* config/aarch64/aarch64-sve.md (*cond_bic<mode>_2)
      	(*cond_bic<mode>_any): New patterns.
      
      gcc/testsuite/
      	* gcc.target/aarch64/sve/cond_logical_1.c: New test.
      	* gcc.target/aarch64/sve/cond_logical_1_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_logical_2.c: Likewise.
      	* gcc.target/aarch64/sve/cond_logical_2_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_logical_3.c: Likewise.
      	* gcc.target/aarch64/sve/cond_logical_3_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_logical_4.c: Likewise.
      	* gcc.target/aarch64/sve/cond_logical_4_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_logical_5.c: Likewise.
      	* gcc.target/aarch64/sve/cond_logical_5_run.c: Likewise.
      
      Co-Authored-By: Kugan Vivekanandarajah <kuganv@linaro.org>
      
      From-SVN: r274480
      Richard Sandiford committed
    • [AArch64] Use SVE UXT[BHW] as a form of predicated AND · d113ece6
      UXTB, UXTH and UXTW are equivalent to predicated ANDs with the constants
      0xff, 0xffff and 0xffffffff respectively.  This patch uses them in the
      patterns for IFN_COND_AND.
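
      For example, a hypothetical loop like this maps the conditional AND
      with 0xff onto UXTB (illustrative names only):

         void
         cond_uxtb (unsigned int *restrict r, unsigned int *restrict a,
                    int *restrict pred, int n)
         {
           for (int i = 0; i < n; ++i)
             r[i] = pred[i] ? a[i] & 0xff : a[i];
         }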
      
      2019-08-14  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* config/aarch64/aarch64.c (aarch64_print_operand): Allow %e to
      	take the equivalent mask, as well as a bit count.
      	* config/aarch64/predicates.md (aarch64_sve_uxtb_immediate)
      	(aarch64_sve_uxth_immediate, aarch64_sve_uxt_immediate)
      	(aarch64_sve_pred_and_operand): New predicates.
      	* config/aarch64/iterators.md (sve_pred_int_rhs2_operand): New
      	code attribute.
      	* config/aarch64/aarch64-sve.md
      	(cond_<SVE_INT_BINARY:optab><SVE_I:mode>): Use it.
      	(*cond_uxt<mode>_2, *cond_uxt<mode>_any): New patterns.
      
      gcc/testsuite/
      	* gcc.target/aarch64/sve/cond_uxt_1.c: New test.
      	* gcc.target/aarch64/sve/cond_uxt_1_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_uxt_2.c: Likewise.
      	* gcc.target/aarch64/sve/cond_uxt_2_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_uxt_3.c: Likewise.
      	* gcc.target/aarch64/sve/cond_uxt_3_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_uxt_4.c: Likewise.
      	* gcc.target/aarch64/sve/cond_uxt_4_run.c: Likewise.
      
      From-SVN: r274479
      Richard Sandiford committed
    • [AArch64] Add SVE conditional conversion patterns · c5e16983
      This patch adds patterns to match conditional conversions between
      integers and like-sized floats.  The patterns are actually more
      general than that, but the other combinations can only be tested
      via the ACLE.
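
      A sketch of the kind of conditional conversion involved, here from
      32-bit integers to 32-bit floats (the names are illustrative, not from
      the new tests):

         void
         cond_scvtf (float *restrict r, int *restrict a, float *restrict b,
                     int n)
         {
           for (int i = 0; i < n; ++i)
             r[i] = a[i] > 0 ? (float) a[i] : b[i];
         }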
      
      2019-08-14  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* config/aarch64/aarch64-sve.md
      	(*cond_<SVE_COND_FCVTI:optab>_nontrunc<SVE_F:mode><SVE_HSDI:mode>)
      	(*cond_<SVE_COND_ICVTF:optab>_nonextend<SVE_HSDI:mode><SVE_F:mode>):
      	New patterns.
      
      gcc/testsuite/
      	* gcc.target/aarch64/sve/cond_convert_1.c: New test.
      	* gcc.target/aarch64/sve/cond_convert_1_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_convert_2.c: Likewise.
      	* gcc.target/aarch64/sve/cond_convert_2_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_convert_3.c: Likewise.
      	* gcc.target/aarch64/sve/cond_convert_3_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_convert_4.c: Likewise.
      	* gcc.target/aarch64/sve/cond_convert_4_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_convert_5.c: Likewise.
      	* gcc.target/aarch64/sve/cond_convert_5_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_convert_6.c: Likewise.
      	* gcc.target/aarch64/sve/cond_convert_6_run.c: Likewise.
      
      From-SVN: r274478
      Richard Sandiford committed
    • [AArch64] Add SVE conditional floating-point unary patterns · b21f7d53
      This patch adds patterns to match conditional unary operations
      on floating-point modes.  At the moment we rely on combine to merge
      separate arithmetic and vcond_mask operations, and since the latter
      doesn't accept zero operands, we miss out on the opportunity to use
      the movprfx /z alternative.  (This alternative is tested by the ACLE
      patches though.)
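
      For instance, a hypothetical conditional fabs loop of the kind these
      patterns match (names invented here):

         void
         cond_fabs (float *restrict r, float *restrict a,
                    int *restrict pred, int n)
         {
           for (int i = 0; i < n; ++i)
             r[i] = pred[i] ? __builtin_fabsf (a[i]) : a[i];
         }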
      
      2019-08-14  Richard Sandiford  <richard.sandiford@arm.com>
      	    Kugan Vivekanandarajah  <kugan.vivekanandarajah@linaro.org>
      
      gcc/
      	* config/aarch64/aarch64-sve.md
      	(*cond_<SVE_COND_FP_UNARY:optab><SVE_F:mode>_2): New pattern.
      	(*cond_<SVE_COND_FP_UNARY:optab><SVE_F:mode>_any): Likewise.
      
      gcc/testsuite/
      	* gcc.target/aarch64/sve/cond_unary_1.c: Add tests for
      	floating-point types.
      	* gcc.target/aarch64/sve/cond_unary_2.c: Likewise.
      	* gcc.target/aarch64/sve/cond_unary_3.c: Likewise.
      	* gcc.target/aarch64/sve/cond_unary_4.c: Likewise.
      
      Co-Authored-By: Kugan Vivekanandarajah <kuganv@linaro.org>
      
      From-SVN: r274477
      Richard Sandiford committed
    • [AArch64] Add SVE conditional integer unary patterns · 3c9f4963
      This patch adds patterns to match conditional unary operations
      on integers.  At the moment we rely on combine to merge separate
      arithmetic and vcond_mask operations, and since the latter doesn't
      accept zero operands, we miss out on the opportunity to use the
      movprfx /z alternative.  (This alternative is tested by the ACLE
      patches though.)
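
      A corresponding integer sketch (illustrative only), here using negation:

         void
         cond_neg (int *restrict r, int *restrict a, int *restrict pred, int n)
         {
           for (int i = 0; i < n; ++i)
             r[i] = pred[i] ? -a[i] : a[i];
         }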
      
      2019-08-14  Richard Sandiford  <richard.sandiford@arm.com>
      	    Kugan Vivekanandarajah  <kugan.vivekanandarajah@linaro.org>
      
      gcc/
      	* config/aarch64/aarch64-sve.md
      	(*cond_<SVE_INT_UNARY:optab><SVE_I:mode>_2): New pattern.
      	(*cond_<SVE_INT_UNARY:optab><SVE_I:mode>_any): Likewise.
      
      gcc/testsuite/
      	* gcc.target/aarch64/sve/cond_unary_1.c: New test.
      	* gcc.target/aarch64/sve/cond_unary_1_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_unary_2.c: Likewise.
      	* gcc.target/aarch64/sve/cond_unary_2_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_unary_3.c: Likewise.
      	* gcc.target/aarch64/sve/cond_unary_3_run.c: Likewise.
      	* gcc.target/aarch64/sve/cond_unary_4.c: Likewise.
      	* gcc.target/aarch64/sve/cond_unary_4_run.c: Likewise.
      
      Co-Authored-By: Kugan Vivekanandarajah <kuganv@linaro.org>
      
      From-SVN: r274476
      Richard Sandiford committed
    • [AArch64] Add support for SVE absolute comparisons · 42b4e87d
      This patch adds support for floating-point absolute comparisons
      FACLT and FACLE (aliased as FACGT and FACGE with swapped operands).
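
      A hypothetical example of the kind of comparison this matches (names
      and constants invented here):

         void
         facgt_select (float *restrict r, float *restrict a, float *restrict b,
                       int n)
         {
           for (int i = 0; i < n; ++i)
             r[i] = __builtin_fabsf (a[i]) > __builtin_fabsf (b[i])
                    ? 1.0f : 2.0f;
         }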
      
      2019-08-14  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* config/aarch64/iterators.md (SVE_COND_FP_ABS_CMP): New iterator.
      	* config/aarch64/aarch64-sve.md (*aarch64_pred_fac<cmp_op><mode>):
      	New pattern.
      
      gcc/testsuite/
      	* gcc.target/aarch64/sve/vcond_21.c: New test.
      	* gcc.target/aarch64/sve/vcond_21_run.c: Likewise.
      
      From-SVN: r274443
      Richard Sandiford committed
    • [AArch64] Use SVE MOV /M of scalars · 88a37c4d
      This patch uses MOV /M to optimise selects between a duplicated
      scalar variable and a vector.
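
      A small sketch of the select being optimised (illustrative names;
      x is the duplicated scalar):

         void
         sel_dup (float *restrict r, float *restrict a, int *restrict pred,
                  float x, int n)
         {
           for (int i = 0; i < n; ++i)
             r[i] = pred[i] ? x : a[i];
         }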
      
      2019-08-14  Richard Sandiford  <richard.sandiford@arm.com>
      	    Kugan Vivekanandarajah  <kugan.vivekanandarajah@linaro.org>
      
      gcc/
      	* config/aarch64/aarch64-sve.md (*aarch64_sel_dup<mode>): New pattern.
      
      gcc/testsuite/
      	* g++.target/aarch64/sve/dup_sel_1.C: New test.
      	* g++.target/aarch64/sve/dup_sel_2.C: Likewise.
      	* g++.target/aarch64/sve/dup_sel_3.C: Likewise.
      	* g++.target/aarch64/sve/dup_sel_4.C: Likewise.
      	* g++.target/aarch64/sve/dup_sel_5.C: Likewise.
      	* g++.target/aarch64/sve/dup_sel_6.C: Likewise.
      
      Co-Authored-By: Kugan Vivekanandarajah <kuganv@linaro.org>
      
      From-SVN: r274442
      Richard Sandiford committed
    • [AArch64] Make more use of SVE conditional constant moves · d29f7dd5
      This patch extends the SVE UNSPEC_SEL patterns so that they can use:
      
      (1) MOV /M of a duplicated integer constant
      (2) MOV /M of a duplicated floating-point constant bitcast to an integer,
          accepting the same constants as (1)
      (3) FMOV /M of a duplicated floating-point constant
      (4) MOV /Z of a duplicated integer constant
      (5) MOV /Z of a duplicated floating-point constant bitcast to an integer,
          accepting the same constants as (4)
      (6) MOVPRFXed FMOV /M of a duplicated floating-point constant
      
      We already handled (4) with a special pattern; the rest are new.
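
      For example, a hypothetical select against a duplicated 1.0 constant,
      corresponding roughly to case (3) above (names invented here):

         void
         sel_fmov (float *restrict r, float *restrict a, int *restrict pred,
                   int n)
         {
           for (int i = 0; i < n; ++i)
             r[i] = pred[i] ? 1.0f : a[i];
         }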
      
      2019-08-14  Richard Sandiford  <richard.sandiford@arm.com>
      	    Kugan Vivekanandarajah  <kugan.vivekanandarajah@linaro.org>
      
      gcc/
      	* config/aarch64/aarch64.c (aarch64_bit_representation): New function.
      	(aarch64_print_vector_float_operand): Also handle 8-bit floats.
      	(aarch64_print_operand): Add support for %I.
      	(aarch64_sve_dup_immediate_p): Handle scalars as well as vectors.
      	Bitcast floating-point constants to the corresponding integer constant.
      	(aarch64_float_const_representable_p): Handle vectors as well
      	as scalars.
      	(aarch64_expand_sve_vcond): Make sure that the operands are valid
      	for the new vcond_mask_<mode><vpred> expander.
      	* config/aarch64/predicates.md (aarch64_sve_dup_immediate): Also
      	test aarch64_float_const_representable_p.
      	(aarch64_sve_reg_or_dup_imm): New predicate.
      	* config/aarch64/aarch64-sve.md (vec_extract<vpred><Vel>): Use
      	gen_vcond_mask_<mode><vpred> instead of
      	gen_aarch64_sve_dup<mode>_const.
      	(vcond_mask_<mode><vpred>): Turn into a define_expand that
      	accepts aarch64_sve_reg_or_dup_imm and aarch64_simd_reg_or_zero
      	for operands 1 and 2 respectively.  Force operand 2 into a
      	register if operand 1 is a register.  Fold old define_insn...
      	(aarch64_sve_dup<mode>_const): ...and this define_insn...
      	(*vcond_mask_<mode><vpred>): ...into this new pattern.  Handle
      	floating-point constants that can be moved as integers.  Add
      	alternatives for MOV /M and FMOV /M.
      	(vcond<mode><v_int_equiv>, vcondu<mode><v_int_equiv>)
      	(vcond<mode><v_fp_equiv>): Accept nonmemory_operand for operands
      	1 and 2 respectively.
      	* config/aarch64/constraints.md (Ufc): Handle vectors as well
      	as scalars.
      	(vss): New constraint.
      
      gcc/testsuite/
      	* gcc.target/aarch64/sve/vcond_18.c: New test.
      	* gcc.target/aarch64/sve/vcond_18_run.c: Likewise.
      	* gcc.target/aarch64/sve/vcond_19.c: Likewise.
      	* gcc.target/aarch64/sve/vcond_19_run.c: Likewise.
      	* gcc.target/aarch64/sve/vcond_20.c: Likewise.
      	* gcc.target/aarch64/sve/vcond_20_run.c: Likewise.
      
      Co-Authored-By: Kugan Vivekanandarajah <kuganv@linaro.org>
      
      From-SVN: r274441
      Richard Sandiford committed
    • [AArch64] Add support for SVE F{MAX,MIN}NM immediate · 75079ddf
      This patch uses the immediate forms of FMAXNM and FMINNM for
      unconditional arithmetic.
      
      The same rules apply to FMAX and FMIN, but we only generate those
      via the ACLE.
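
      For instance (a sketch with invented names), clamping at zero can now
      use FMAXNM's immediate form:

         void
         fmaxnm_imm (float *restrict r, float *restrict a, int n)
         {
           for (int i = 0; i < n; ++i)
             r[i] = __builtin_fmaxf (a[i], 0.0f);
         }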
      
      2019-08-14  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* config/aarch64/predicates.md (aarch64_sve_float_maxmin_immediate)
      	(aarch64_sve_float_maxmin_operand): New predicates.
      	* config/aarch64/constraints.md (vsB): New constraint.
      	(vsM): Fix typo.
      	* config/aarch64/iterators.md (sve_pred_fp_rhs2_operand): Use
      	aarch64_sve_float_maxmin_operand for UNSPEC_COND_FMAXNM and
      	UNSPEC_COND_FMINNM.
      	* config/aarch64/aarch64-sve.md (<maxmin_uns><SVE_F:mode>3):
      	Use aarch64_sve_float_maxmin_operand for operand 2.
      	(*<SVE_COND_FP_MAXMIN_PUBLIC:optab><SVE_F:mode>3): Likewise.
      	Add alternatives for the constant forms.
      
      gcc/testsuite/
      	* gcc.target/aarch64/sve/fmaxnm_1.c: New test.
      	* gcc.target/aarch64/sve/fminnm_1.c: Likewise.
      
      From-SVN: r274440
      Richard Sandiford committed
    • [AArch64] Add support for SVE [SU]{MAX,MIN} immediate · f8c22a8b
      This patch adds support for the immediate forms of SVE SMAX, SMIN, UMAX
      and UMIN.  SMAX and SMIN take the same range as MUL, so the patch
      basically just moves and generalises the existing MUL patterns.
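
      For example, a hypothetical max-with-constant loop that can now use the
      immediate form of SMAX (names and the constant invented here):

         void
         smax_imm (int *restrict r, int *restrict a, int n)
         {
           for (int i = 0; i < n; ++i)
             r[i] = a[i] > 50 ? a[i] : 50;
         }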
      
      2019-08-14  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* config/aarch64/constraints.md (vsb): New constraint.
      	(vsm): Generalize description.
      	* config/aarch64/iterators.md (SVE_INT_BINARY_IMM): New code
      	iterator.
      	(sve_imm_con): Handle smax, smin, umax and umin.
      	(sve_imm_prefix): New code attribute.
      	* config/aarch64/predicates.md (aarch64_sve_vsb_immediate)
      	(aarch64_sve_vsb_operand): New predicates.
      	(aarch64_sve_mul_immediate): Rename to...
      	(aarch64_sve_vsm_immediate): ...this.
      	(aarch64_sve_mul_operand): Rename to...
      	(aarch64_sve_vsm_operand): ...this.
      	* config/aarch64/aarch64-sve.md (mul<mode>3): Generalize to...
      	(<SVE_INT_BINARY_IMM:optab><SVE_I:mode>3): ...this.
      	(*mul<mode>3, *post_ra_mul<mode>3): Generalize to...
      	(*<SVE_INT_BINARY_IMM:optab><SVE_I:mode>3)
      	(*post_ra_<SVE_INT_BINARY_IMM:optab><SVE_I:mode>3): ...these and
      	add movprfx support for the immediate alternatives.
      	(<su><maxmin><mode>3, *<su><maxmin><mode>3): Delete in favor
      	of the above.
      	(*<SVE_INT_BINARY_SD:optab><SVE_SDI:mode>3): Fix incorrect predicate
      	for operand 3.
      
      gcc/testsuite/
      	* gcc.target/aarch64/sve/smax_1.c: New test.
      	* gcc.target/aarch64/sve/smin_1.c: Likewise.
      	* gcc.target/aarch64/sve/umax_1.c: Likewise.
      	* gcc.target/aarch64/sve/umin_1.c: Likewise.
      
      From-SVN: r274439
      Richard Sandiford committed