gcc/tree-parloops.c · d1b9a5724b8a05d2d2d51b3e5d13cc785326c74f · lvzhengyang / riscv-gcc-1

Add support for in-order addition reduction using SVE FADDA · b781a135
This patch adds support for in-order floating-point addition reductions,
which are suitable even in strict IEEE mode.

Previously vect_is_simple_reduction would reject any cases that forbid
reassociation.  The idea is instead to tentatively accept them as
"FOLD_LEFT_REDUCTIONs" and only fail later if there is no support
for them.  Although this patch only handles the particular case of plus
and minus on floating-point types, there's no reason in principle why
we couldn't handle other cases.

The reductions use a new fold_left_plus_optab if available, otherwise
they fall back to elementwise additions or subtractions.
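
As an illustration only (this is not code from the patch), the following
C sketch shows why the order of additions matters.  The names VF,
sum_scalar, sum_fold_left, sum_reassoc and sample are hypothetical, and
VF stands in for an assumed fixed vectorization factor of 4.
sum_fold_left models an in-order reduction: it consumes one vector's
worth of elements per iteration but folds the lanes into the scalar
accumulator from left to right, so it performs the same additions in the
same order as the scalar loop.  sum_reassoc models the usual reassociated
reduction with one partial sum per lane, which can round differently.

```c
#include <stddef.h>

#define VF 4  /* hypothetical vectorization factor, for illustration only */

/* Sample input: 1e8f is exactly representable in float, but adding 1.0f
   to it is lost to rounding (the spacing between floats there is 8).  */
static const float sample[8] = { 1e8f, 1.0f, -1e8f, 1.0f,
                                 1.0f, 1.0f, 1.0f, 1.0f };

/* Scalar reference: strict left-to-right accumulation.  */
static float
sum_scalar (const float *a, size_t n)
{
  float acc = 0.0f;
  for (size_t i = 0; i < n; i++)
    acc += a[i];
  return acc;
}

/* In-order model: one VF-wide chunk per outer iteration, with the lanes
   folded into the single accumulator in lane order.  */
static float
sum_fold_left (const float *a, size_t n)
{
  float acc = 0.0f;
  size_t i = 0;
  for (; i + VF <= n; i += VF)
    for (size_t lane = 0; lane < VF; lane++)
      acc += a[i + lane];
  for (; i < n; i++)  /* scalar epilogue for leftover elements */
    acc += a[i];
  return acc;
}

/* Reassociated model: one partial sum per lane, combined only at the
   end.  This changes the order of the additions.  */
static float
sum_reassoc (const float *a, size_t n)
{
  float part[VF] = { 0.0f, 0.0f, 0.0f, 0.0f };
  size_t i = 0;
  for (; i + VF <= n; i += VF)
    for (size_t lane = 0; lane < VF; lane++)
      part[lane] += a[i + lane];
  float acc = (part[0] + part[1]) + (part[2] + part[3]);
  for (; i < n; i++)  /* scalar epilogue for leftover elements */
    acc += a[i];
  return acc;
}
```

Assuming IEEE single-precision arithmetic without excess precision or
-ffast-math, sum_scalar (sample, 8) and sum_fold_left (sample, 8) both
give 5.0f, whereas sum_reassoc (sample, 8) gives 0.0f.  That difference
is the reason a reassociating reduction is not valid in strict IEEE mode.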

The vect_force_simple_reduction change makes it easier for parloops
to read the type of reduction.

2018-01-13  Richard Sandiford  <richard.sandiford@linaro.org>
	    Alan Hayward  <alan.hayward@arm.com>
	    David Sherwood  <david.sherwood@arm.com>

gcc/
	* optabs.def (fold_left_plus_optab): New optab.
	* doc/md.texi (fold_left_plus_@var{m}): Document.
	* internal-fn.def (IFN_FOLD_LEFT_PLUS): New internal function.
	* internal-fn.c (fold_left_direct): Define.
	(expand_fold_left_optab_fn): Likewise.
	(direct_fold_left_optab_supported_p): Likewise.
	* fold-const-call.c (fold_const_fold_left): New function.
	(fold_const_call): Use it to fold CFN_FOLD_LEFT_PLUS.
	* tree-parloops.c (valid_reduction_p): New function.
	(gather_scalar_reductions): Use it.
	* tree-vectorizer.h (FOLD_LEFT_REDUCTION): New vect_reduction_type.
	(vect_finish_replace_stmt): Declare.
	* tree-vect-loop.c (fold_left_reduction_fn): New function.
	(needs_fold_left_reduction_p): New function, split out from...
	(vect_is_simple_reduction): ...here.  Accept reductions that
	forbid reassociation, but give them type FOLD_LEFT_REDUCTION.
	(vect_force_simple_reduction): Also store the reduction type in
	the assignment's STMT_VINFO_REDUC_TYPE.
	(vect_model_reduction_cost): Handle FOLD_LEFT_REDUCTION.
	(merge_with_identity): New function.
	(vect_expand_fold_left): Likewise.
	(vectorize_fold_left_reduction): Likewise.
	(vectorizable_reduction): Handle FOLD_LEFT_REDUCTION.  Leave the
	scalar phi in place for it.  Check for target support and reject
	cases that would reassociate the operation.  Defer the transform
	phase to vectorize_fold_left_reduction.
	* config/aarch64/aarch64.md (UNSPEC_FADDA): New unspec.
	* config/aarch64/aarch64-sve.md (fold_left_plus_<mode>): New expander.
	(*fold_left_plus_<mode>, *pred_fold_left_plus_<mode>): New insns.

gcc/testsuite/
	* gcc.dg/vect/no-fast-math-vect16.c: Expect the test to pass and
	check for a message about using in-order reductions.
	* gcc.dg/vect/pr79920.c: Expect both loops to be vectorized and
	check for a message about using in-order reductions.
	* gcc.dg/vect/trapv-vect-reduc-4.c: Expect all three loops to be
	vectorized and check for a message about using in-order reductions.
	Expect targets with variable-length vectors to fall back to the
	fixed-length minimum.
	* gcc.dg/vect/vect-reduc-6.c: Expect the loop to be vectorized and
	check for a message about using in-order reductions.
	* gcc.dg/vect/vect-reduc-in-order-1.c: New test.
	* gcc.dg/vect/vect-reduc-in-order-2.c: Likewise.
	* gcc.dg/vect/vect-reduc-in-order-3.c: Likewise.
	* gcc.dg/vect/vect-reduc-in-order-4.c: Likewise.
	* gcc.target/aarch64/sve/reduc_strict_1.c: New test.
	* gcc.target/aarch64/sve/reduc_strict_1_run.c: Likewise.
	* gcc.target/aarch64/sve/reduc_strict_2.c: Likewise.
	* gcc.target/aarch64/sve/reduc_strict_2_run.c: Likewise.
	* gcc.target/aarch64/sve/reduc_strict_3.c: Likewise.
	* gcc.target/aarch64/sve/slp_13.c: Add floating-point types.
	* gfortran.dg/vect/vect-8.f90: Expect 22 loops to be vectorized if
	vect_fold_left_plus.

Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>

From-SVN: r256639
committed Jan 13, 2018