Commit db4a1c18 by Wilco Dijkstra

The existing vector costs stop some beneficial vectorization.

The existing vector costs stop some beneficial vectorization.  This is mostly
due to vector statement cost being set to 3 as well as vector loads having a
higher cost than scalar loads.  This means that even when we vectorize 4x, it
is possible that the cost of a vectorized loop is similar to the scalar
version, and we fail to vectorize.

Using a cost of 3 for a vector operation suggests it is 3 times as
expensive as a scalar operation.  Since most vector operations have
throughput similar to scalar operations, this is not correct.
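
The arithmetic behind this argument can be sketched as follows.  This is a
deliberately simplified illustration, not GCC's actual cost model: the struct
`toy_costs` and the helpers `scalar_cost` and `vector_cost` are hypothetical
names, and the sketch ignores prologue/epilogue, permute and branch costs
that the real vectorizer also takes into account.  The numbers are the
cortexa57 values before and after this change.

```c
/* Hypothetical, simplified cost table: only the per-statement costs that
   matter for this comparison.  Not GCC's struct cpu_vector_cost.  */
struct toy_costs
{
  int scalar_stmt;
  int scalar_load;
  int scalar_store;
  int vec_stmt;
  int vec_load;   /* Aligned and unaligned loads share one value here.  */
  int vec_store;
};

/* Values before this change: vec_stmt 3, vec_load 5.  */
static const struct toy_costs old_costs = { 1, 4, 1, 3, 5, 1 };
/* Values after this change: vec_stmt 2, vec_load 4.  */
static const struct toy_costs new_costs = { 1, 4, 1, 2, 4, 1 };

/* Cost of VF scalar iterations of a loop body with the given number of
   loads, other statements and stores.  */
static int
scalar_cost (const struct toy_costs *c, int loads, int stmts, int stores,
             int vf)
{
  return vf * (loads * c->scalar_load
               + stmts * c->scalar_stmt
               + stores * c->scalar_store);
}

/* Cost of one vector iteration covering the same VF elements.  */
static int
vector_cost (const struct toy_costs *c, int loads, int stmts, int stores)
{
  return loads * c->vec_load
         + stmts * c->vec_stmt
         + stores * c->vec_store;
}
```

For an assumed loop body with 1 load, 6 other statements and 1 store,
vectorized 4x, `scalar_cost` gives 44, while `vector_cost` gives 24 with the
old values and 17 with the new ones.  The overheads left out of this sketch
are what can bring the old vector figure close to the scalar one.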

Using slightly lower values for these heuristics now allows this loop
and many others to be vectorized.  On a proprietary benchmark, the gain
from vectorizing this loop is around 15-30%, which shows vectorizing it is
indeed beneficial.

	* config/aarch64/aarch64.c (cortexa57_vector_cost):
	Change vec_stmt_cost, vec_align_load_cost and vec_unalign_load_cost.

From-SVN: r242383
parent 725bbb80
2016-11-14 Wilco Dijkstra <wdijkstr@arm.com>
* config/aarch64/aarch64.c (cortexa57_vector_cost):
Change vec_stmt_cost, vec_align_load_cost and vec_unalign_load_cost.
2016-11-14 Richard Biener <rguenther@suse.de>
PR tree-optimization/78312
@@ -398,12 +398,12 @@ static const struct cpu_vector_cost cortexa57_vector_cost =
   1, /* scalar_stmt_cost */
   4, /* scalar_load_cost */
   1, /* scalar_store_cost */
-  3, /* vec_stmt_cost */
+  2, /* vec_stmt_cost */
   3, /* vec_permute_cost */
   8, /* vec_to_scalar_cost */
   8, /* scalar_to_vec_cost */
-  5, /* vec_align_load_cost */
-  5, /* vec_unalign_load_cost */
+  4, /* vec_align_load_cost */
+  4, /* vec_unalign_load_cost */
   1, /* vec_unalign_store_cost */
   1, /* vec_store_cost */
   1, /* cond_taken_branch_cost */