[AArch64] Improve SVE constant moves · 4aeb1ba7
If there's no SVE instruction to load a given constant directly,
this patch instead tries to use an Advanced SIMD constant move
and then duplicates the constant to fill an SVE vector.  The main
use of this is to support constants in which each byte is in
{ 0, 0xff }.

Also, the patch prefers a simple integer move followed by a
duplicate over a load from memory, like we already do for
Advanced SIMD.  This is a useful option to have and would be
easy to turn off via a tuning parameter if necessary.

The patch also extends the handling of wide LD1Rs to big endian,
whereas previously we punted to a full LD1RQ.

2019-08-13  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* machmode.h (opt_mode::else_mode): New function.
	(opt_mode::else_blk): Use it.
	* config/aarch64/aarch64-protos.h (aarch64_vq_mode): Declare.
	(aarch64_full_sve_mode, aarch64_sve_ld1rq_operand_p): Likewise.
	(aarch64_gen_stepped_int_parallel): Likewise.
	(aarch64_stepped_int_parallel_p): Likewise.
	(aarch64_expand_mov_immediate): Remove the optional
	gen_vec_duplicate argument.
	* config/aarch64/aarch64.c (aarch64_expand_sve_widened_duplicate):
	Delete.
	(aarch64_expand_sve_dupq, aarch64_expand_sve_ld1rq): New functions.
	(aarch64_expand_sve_const_vector): Rewrite to handle more cases.
	(aarch64_expand_mov_immediate): Remove the optional
	gen_vec_duplicate argument.  Use early returns in the
	!CONST_INT_P handling.  Pass all SVE data vectors to
	aarch64_expand_sve_const_vector rather than handling some inline.
	(aarch64_full_sve_mode, aarch64_vq_mode): New functions, split
	out from...
	(aarch64_simd_container_mode): ...here.
	(aarch64_gen_stepped_int_parallel, aarch64_stepped_int_parallel_p)
	(aarch64_sve_ld1rq_operand_p): New functions.
	* config/aarch64/predicates.md (descending_int_parallel)
	(aarch64_sve_ld1rq_operand): New predicates.
	* config/aarch64/constraints.md (UtQ): New constraint.
	* config/aarch64/aarch64.md (UNSPEC_REINTERPRET): New unspec.
	* config/aarch64/aarch64-sve.md (mov<SVE_ALL:mode>): Remove the
	gen_vec_duplicate from call to aarch64_expand_mov_immediate.
	(@aarch64_sve_reinterpret<mode>): New expander.
	(*aarch64_sve_reinterpret<mode>): New pattern.
	(@aarch64_vec_duplicate_vq<mode>_le): New pattern.
	(@aarch64_vec_duplicate_vq<mode>_be): Likewise.
	(*sve_ld1rq<Vesize>): Replace with...
	(@aarch64_sve_ld1rq<mode>): ...this new pattern.

gcc/testsuite/
	* gcc.target/aarch64/sve/init_2.c: Expect ld1rd to be used
	instead of a full vector load.
	* gcc.target/aarch64/sve/init_4.c: Likewise.
	* gcc.target/aarch64/sve/ld1r_2.c: Remove constants that no
	longer need to be loaded from memory.
	* gcc.target/aarch64/sve/slp_2.c: Expect the same output for
	big and little endian.
	* gcc.target/aarch64/sve/slp_3.c: Likewise.  Expect 3 of the
	doubles to be moved via integer registers rather than loaded
	from memory.
	* gcc.target/aarch64/sve/slp_4.c: Likewise but for 4 doubles.
	* gcc.target/aarch64/sve/spill_4.c: Expect 16-bit constants to
	be loaded via an integer register rather than from memory.
	* gcc.target/aarch64/sve/const_1.c: New test.
	* gcc.target/aarch64/sve/const_2.c: Likewise.
	* gcc.target/aarch64/sve/const_3.c: Likewise.

From-SVN: r274375
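
As a rough illustration of the constants this affects, consider the
sketch below.  It is written in the style of the SVE testsuite but is
not one of the new const_1.c/const_2.c/const_3.c tests; the function
names, constants and the instruction sequences described in the
comments are assumptions about the intended code generation, not
verified compiler output.

#include <stdint.h>

/* Every byte of this constant is 0 or 0xff, so it fits the Advanced
   SIMD MOVI encoding that builds a 64-bit immediate byte by byte.
   With this patch the constant can be created with MOVI and then
   duplicated to fill the SVE vector, rather than being loaded from
   the literal pool.  */
void
byte_mask (uint64_t *dst, int n)
{
  for (int i = 0; i < n; ++i)
    dst[i] = 0xff00ff0000ff00ffULL;
}

/* 12500000.0 has no FMOV or MOVI encoding.  The patch prefers
   materialising its bit pattern with an integer MOV/MOVK sequence
   and duplicating that (e.g. DUP Zd.D, Xn) over loading the value
   from memory.  */
void
repeated_double (double *dst, int n)
{
  for (int i = 0; i < n; ++i)
    dst[i] = 12500000.0;
}

Whether an integer move plus duplicate actually beats a load from
memory can vary between cores, which is why the message above notes
that the behaviour would be easy to turn off via a tuning parameter.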