using SVE. Given this input code: int sum_abs (uint8_t *restrict x, uint8_t *restrict y, int n) { int sum = 0; for (int i = 0; i < n; i++) { sum += __builtin_abs (x[i] - y[i]); } return sum; } The resulting SVE code is: 0000000000000000 <sum_abs>: 0: 7100005f cmp w2, #0x0 4: 5400026d b.le 50 <sum_abs+0x50> 8: d2800003 mov x3, #0x0 // #0 c: 93407c42 sxtw x2, w2 10: 2538c002 mov z2.b, #0 14: 25221fe0 whilelo p0.b, xzr, x2 18: 2538c023 mov z3.b, #1 1c: 2518e3e1 ptrue p1.b 20: a4034000 ld1b {z0.b}, p0/z, [x0, x3] 24: a4034021 ld1b {z1.b}, p0/z, [x1, x3] 28: 0430e3e3 incb x3 2c: 0520c021 sel z1.b, p0, z1.b, z0.b 30: 25221c60 whilelo p0.b, x3, x2 34: 040d0420 uabd z0.b, p1/m, z0.b, z1.b 38: 44830402 udot z2.s, z0.b, z3.b 3c: 54ffff21 b.ne 20 <sum_abs+0x20> // b.any 40: 2598e3e0 ptrue p0.s 44: 04812042 uaddv d2, p0, z2.s 48: 1e260040 fmov w0, s2 4c: d65f03c0 ret 50: 1e2703e2 fmov s2, wzr 54: 1e260040 fmov w0, s2 58: d65f03c0 ret Notice how udot is used inside a fully masked loop. gcc/Changelog: 2019-05-07 Alejandro Martinez <alejandro.martinezvicente@arm.com> * config/aarch64/aarch64-sve.md (<su>abd<mode>_3): New define_expand. (aarch64_<su>abd<mode>_3): Likewise. (*aarch64_<su>abd<mode>_3): New define_insn. (<sur>sad<vsi2qi>): New define_expand. * config/aarch64/iterators.md: Added MAX_OPP attribute. * tree-vect-loop.c (use_mask_by_cond_expr_p): Add SAD_EXPR. (build_vect_cond_expr): Likewise. gcc/testsuite/Changelog: 2019-05-07 Alejandro Martinez <alejandro.martinezvicente@arm.com> * gcc.target/aarch64/sve/sad_1.c: New test for sum of absolute differences. From-SVN: r270975
Name |
Last commit
|
Last update |
---|---|---|
.. | ||
aarch64-arches.def | Loading commit data... | |
aarch64-bti-insert.c | Loading commit data... | |
aarch64-builtins.c | Loading commit data... | |
aarch64-c.c | Loading commit data... | |
aarch64-cores.def | Loading commit data... | |
aarch64-cost-tables.h | Loading commit data... | |
aarch64-d.c | Loading commit data... | |
aarch64-elf-raw.h | Loading commit data... | |
aarch64-elf.h | Loading commit data... | |
aarch64-freebsd.h | Loading commit data... | |
aarch64-fusion-pairs.def | Loading commit data... | |
aarch64-ldpstp.md | Loading commit data... | |
aarch64-linux.h | Loading commit data... | |
aarch64-modes.def | Loading commit data... | |
aarch64-option-extensions.def | Loading commit data... | |
aarch64-opts.h | Loading commit data... | |
aarch64-passes.def | Loading commit data... | |
aarch64-protos.h | Loading commit data... | |
aarch64-simd-builtin-types.def | Loading commit data... | |
aarch64-simd-builtins.def | Loading commit data... | |
aarch64-simd.md | Loading commit data... | |
aarch64-speculation.cc | Loading commit data... | |
aarch64-sve.md | Loading commit data... | |
aarch64-tune.md | Loading commit data... | |
aarch64-tuning-flags.def | Loading commit data... | |
aarch64.c | Loading commit data... | |
aarch64.h | Loading commit data... | |
aarch64.md | Loading commit data... | |
aarch64.opt | Loading commit data... | |
arm_acle.h | Loading commit data... | |
arm_fp16.h | Loading commit data... | |
arm_neon.h | Loading commit data... | |
atomics.md | Loading commit data... | |
biarchilp32.h | Loading commit data... | |
biarchlp64.h | Loading commit data... | |
constraints.md | Loading commit data... | |
cortex-a57-fma-steering.c | Loading commit data... | |
driver-aarch64.c | Loading commit data... | |
falkor-tag-collision-avoidance.c | Loading commit data... | |
falkor.md | Loading commit data... | |
geniterators.sh | Loading commit data... | |
gentune.sh | Loading commit data... | |
iterators.md | Loading commit data... | |
predicates.md | Loading commit data... | |
rtems.h | Loading commit data... | |
saphira.md | Loading commit data... | |
t-aarch64 | Loading commit data... | |
t-aarch64-freebsd | Loading commit data... | |
t-aarch64-linux | Loading commit data... | |
thunderx.md | Loading commit data... | |
thunderx2t99.md | Loading commit data... | |
tsv110.md | Loading commit data... | |
x-aarch64 | Loading commit data... |