Commit 396c2a98 by Kewen Lin

[rs6000] Adjust vectorization cost for scalar COND_EXPR

We found that the vectorization cost modeling of scalar COND_EXPR is a bit off
on rs6000.  One typical case is 548.exchange2_r: -Ofast -mcpu=power9 -mrecip
-fvect-cost-model=unlimited is 1.94% better than -Ofast -mcpu=power9 -mrecip
(where the default is -fvect-cost-model=dynamic).  Scalar COND_EXPR is normally
expanded into compare + branch or compare + isel, and either sequence should be
priced higher than a simple FXU operation.  This patch adds an additional
vectorization cost for scalar COND_EXPR on top of builtin_vectorization_cost.
The additional cost value 2 was chosen over the alternatives because: 1) of the
candidate values 1 to 5 that were tried, 2 measured best on Power9.  2) From a
latency view, compare takes 3 cycles and isel takes 2 on Power9, which is 2.5
times a simple FXU instruction that has cost 1 in the current modeling, so the
value is close.  3) It gives fine SPEC2017 ratios on Power8 as well.
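
The latency argument in point 2 can be written out as a small compile-time check. This is only a sketch of the arithmetic: the cycle counts, the 2.5x ratio and the cost values come from the message above, while every identifier is a hypothetical name chosen for illustration and is not part of rs6000.c.

```cpp
// Figures quoted in the commit message (Power9); names are illustrative only.
constexpr int compare_latency = 3;          // cycles for the compare
constexpr int isel_latency = 2;             // cycles for the isel
constexpr double cond_to_fxu_ratio = 2.5;   // ratio stated in the message

constexpr int simple_fxu_cost = 1;          // cost of a simple FXU op in the model
constexpr int cond_expr_extra_cost = 2;     // extra cost chosen by the patch

// Total latency of the compare + isel sequence.
constexpr int cond_expr_latency() { return compare_latency + isel_latency; }

// Scalar COND_EXPR cost after the adjustment: base cost plus the extra.
constexpr int adjusted_cond_expr_cost()
{
  return simple_fxu_cost + cond_expr_extra_cost;
}

// How far the adjusted cost sits from the latency-derived ratio.
constexpr double adjusted_cost_error()
{
  return adjusted_cond_expr_cost() - cond_to_fxu_ratio * simple_fxu_cost;
}
```

With these numbers the adjusted cost is 3 against a latency-derived target of 2.5, which is the sense in which the chosen value is "close"; the benchmark measurements in points 1 and 3 are what settle the choice.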

gcc/ChangeLog

    * config/rs6000/rs6000.c (adjust_vectorization_cost): New function.
    (rs6000_add_stmt_cost): Call adjust_vectorization_cost and update
    stmt_cost.

From-SVN: r279336
parent a1af2dd9
2019-12-13 Kewen Lin <linkw@gcc.gnu.org>

    * config/rs6000/rs6000.c (adjust_vectorization_cost): New function.
    (rs6000_add_stmt_cost): Call adjust_vectorization_cost and update
    stmt_cost.

2019-12-12 Jakub Jelinek <jakub@redhat.com>

    PR target/92904
@@ -4997,6 +4997,29 @@ rs6000_init_cost (struct loop *loop_info)
  return data;
}

/* Adjust vectorization cost after calling rs6000_builtin_vectorization_cost.
   For some statement, we would like to further fine-grain tweak the cost on
   top of rs6000_builtin_vectorization_cost handling which doesn't have any
   information on statement operation codes etc. One typical case here is
   COND_EXPR, it takes the same cost to simple FXU instruction when evaluating
   for scalar cost, but it should be priced more whatever transformed to either
   compare + branch or compare + isel instructions. */

static unsigned
adjust_vectorization_cost (enum vect_cost_for_stmt kind,
                           struct _stmt_vec_info *stmt_info)
{
  if (kind == scalar_stmt && stmt_info && stmt_info->stmt
      && gimple_code (stmt_info->stmt) == GIMPLE_ASSIGN)
    {
      tree_code subcode = gimple_assign_rhs_code (stmt_info->stmt);
      if (subcode == COND_EXPR)
        return 2;
    }

  return 0;
}

/* Implement targetm.vectorize.add_stmt_cost. */

static unsigned
@@ -5012,6 +5035,7 @@ rs6000_add_stmt_cost (void *data, int count, enum vect_cost_for_stmt kind,
      tree vectype = stmt_info ? stmt_vectype (stmt_info) : NULL_TREE;
      int stmt_cost = rs6000_builtin_vectorization_cost (kind, vectype,
                                                         misalign);
      stmt_cost += adjust_vectorization_cost (kind, stmt_info);
      /* Statements in an inner loop relative to the loop being
         vectorized are weighted more heavily. The value here is
         arbitrary and could potentially be improved with analysis. */
...
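
To see the effect of the new call in isolation, the following is a minimal standalone model of how the adjustment combines with the base cost. It does not use GCC's headers: `StmtKind`, `RhsCode`, `Stmt`, `builtin_cost` and the sample objects are hypothetical stand-ins, and the flat base cost of 1 is an assumption made for illustration, not the real table in rs6000_builtin_vectorization_cost. Only the shape of the check (scalar statement, non-null statement, COND_EXPR on the right-hand side, extra cost 2) follows the patch.

```cpp
// Hypothetical stand-ins for GCC internals; not the real GCC types.
enum class StmtKind { scalar_stmt, vector_stmt };
enum class RhsCode { plus_expr, cond_expr };

struct Stmt
{
  RhsCode rhs_code;  // plays the role of gimple_assign_rhs_code
};

// Stand-in for the base cost lookup, which only sees the statement kind.
// Assumption: a flat cost of 1 for every kind.
constexpr int builtin_cost (StmtKind) { return 1; }

// Mirrors the shape of adjust_vectorization_cost: only a scalar
// statement whose right-hand side is a COND_EXPR gets the extra 2.
constexpr unsigned adjust_cost (StmtKind kind, const Stmt *stmt)
{
  return (kind == StmtKind::scalar_stmt && stmt != nullptr
          && stmt->rhs_code == RhsCode::cond_expr)
         ? 2u : 0u;
}

// Mirrors the patched sequence: base cost first, then the adjustment.
constexpr int stmt_cost (StmtKind kind, const Stmt *stmt)
{
  return builtin_cost (kind) + static_cast<int> (adjust_cost (kind, stmt));
}

// Sample statements for trying the model out.
constexpr Stmt sample_cond = { RhsCode::cond_expr };
constexpr Stmt sample_plus = { RhsCode::plus_expr };
```

In this model `stmt_cost (StmtKind::scalar_stmt, &sample_cond)` is 3, while a scalar addition, a vector COND_EXPR, or a missing statement all stay at the base cost of 1. Raising only the scalar side makes the scalar loop look more expensive relative to its vector form, which pushes the dynamic cost model toward vectorizing loops that contain conditional selects.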