Inline assembler instructions don't have latency info and the scheduler does not attempt to schedule them at all - it does not even honor latencies of asm source operands. As a result, SIMD intrinsics which are implemented using inline assembler perform very poorly, particularly on in-order cores. Add new patterns and intrinsics for widening multiplies, which results in a 63% speedup for the example in the PR, thus fixing the reported regression.

gcc/
	PR target/91598
	* config/aarch64/aarch64-builtins.c (TYPES_TERNOPU_LANE): Add define.
	* config/aarch64/aarch64-simd.md (aarch64_vec_<su>mult_lane<Qlane>):
	Add new insn for widening lane mul.
	(aarch64_vec_<su>mlal_lane<Qlane>): Likewise.
	* config/aarch64/aarch64-simd-builtins.def: Add intrinsics.
	* config/aarch64/arm_neon.h: (vmlal_lane_s16): Expand using
	intrinsics rather than inline asm.
	(vmlal_lane_u16): Likewise.
	(vmlal_lane_s32): Likewise.
	(vmlal_lane_u32): Likewise.
	(vmlal_laneq_s16): Likewise.
	(vmlal_laneq_u16): Likewise.
	(vmlal_laneq_s32): Likewise.
	(vmlal_laneq_u32): Likewise.
	(vmull_lane_s16): Likewise.
	(vmull_lane_u16): Likewise.
	(vmull_lane_s32): Likewise.
	(vmull_lane_u32): Likewise.
	(vmull_laneq_s16): Likewise.
	(vmull_laneq_u16): Likewise.
	(vmull_laneq_s32): Likewise.
	(vmull_laneq_u32): Likewise.
	* config/aarch64/iterators.md (Vcondtype): New iterator for lane mul.
	(Qlane): Likewise.
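
For readers unfamiliar with the intrinsics named above, the following is a minimal sketch of what the widening multiply-accumulate by lane computes, here for `vmlal_lane_s16`. The helper names `mlal_lane_s16_ref` and `mlal_lane1_s16` are hypothetical and not part of GCC; the scalar loop is only a reference model, not the implementation in `arm_neon.h`. On AArch64 the sketch calls the real intrinsic, and elsewhere it falls back to the scalar model so it can be compiled on any host.

```c
#include <assert.h>
#include <stdint.h>
#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/* Scalar reference model (hypothetical helper): each 16-bit element of B
   is multiplied by the single element V[LANE], widened to 32 bits, and
   added to the matching accumulator element.  */
void
mlal_lane_s16_ref (int32_t acc[4], const int16_t b[4],
                   const int16_t v[4], int lane)
{
  assert (lane >= 0 && lane < 4);
  for (int i = 0; i < 4; i++)
    acc[i] += (int32_t) b[i] * (int32_t) v[lane];
}

/* Hypothetical wrapper with the lane fixed to 1, because the intrinsic
   requires a compile-time constant lane index.  */
void
mlal_lane1_s16 (int32_t acc[4], const int16_t b[4], const int16_t v[4])
{
#if defined(__aarch64__)
  /* Real intrinsic: as a compiler builtin rather than inline asm, the
     scheduler can see and place the resulting instruction.  */
  int32x4_t r = vmlal_lane_s16 (vld1q_s32 (acc), vld1_s16 (b),
                                vld1_s16 (v), 1);
  vst1q_s32 (acc, r);
#else
  /* Portable fallback so the sketch builds on non-AArch64 hosts.  */
  mlal_lane_s16_ref (acc, b, v, 1);
#endif
}
```

User code that calls `vmlal_lane_s16` does not need to change with this commit; only the way `arm_neon.h` expands the call differs.
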
Name |
---|
aarch64-arches.def |
aarch64-bti-insert.c |
aarch64-builtins.c |
aarch64-c.c |
aarch64-cores.def |
aarch64-cost-tables.h |
aarch64-d.c |
aarch64-elf-raw.h |
aarch64-elf.h |
aarch64-errata.h |
aarch64-freebsd.h |
aarch64-fusion-pairs.def |
aarch64-ldpstp.md |
aarch64-linux.h |
aarch64-modes.def |
aarch64-netbsd.h |
aarch64-option-extensions.def |
aarch64-opts.h |
aarch64-passes.def |
aarch64-protos.h |
aarch64-simd-builtin-types.def |
aarch64-simd-builtins.def |
aarch64-simd.md |
aarch64-speculation.cc |
aarch64-sve-builtins-base.cc |
aarch64-sve-builtins-base.def |
aarch64-sve-builtins-base.h |
aarch64-sve-builtins-functions.h |
aarch64-sve-builtins-shapes.cc |
aarch64-sve-builtins-shapes.h |
aarch64-sve-builtins-sve2.cc |
aarch64-sve-builtins-sve2.def |
aarch64-sve-builtins-sve2.h |
aarch64-sve-builtins.cc |
aarch64-sve-builtins.def |
aarch64-sve-builtins.h |
aarch64-sve.md |
aarch64-sve2.md |
aarch64-tune.md |
aarch64-tuning-flags.def |
aarch64-vxworks.h |
aarch64.c |
aarch64.h |
aarch64.md |
aarch64.opt |
arm_acle.h |
arm_bf16.h |
arm_fp16.h |
arm_neon.h |
arm_sve.h |
atomics.md |
biarchilp32.h |
biarchlp64.h |
check-sve-md.awk |
constraints.md |
cortex-a57-fma-steering.c |
driver-aarch64.c |
falkor-tag-collision-avoidance.c |
falkor.md |
geniterators.sh |
gentune.sh |
iterators.md |
predicates.md |
rtems.h |
saphira.md |
t-aarch64 |
t-aarch64-freebsd |
t-aarch64-linux |
t-aarch64-netbsd |
t-aarch64-vxworks |
thunderx.md |
thunderx2t99.md |
tsv110.md |
x-aarch64 |