Commit e3969868 by Richard Sandiford Committed by Richard Sandiford

Fix SSA update when vectorisation adds a vdef to a read-only loop

This patch fixes an awkward corner case in which:

(a) we apply if-conversion to a loop;

(b) the original scalar loop doesn't have a vdef, and thus doesn't
    need a virtual phi;

(c) the vectorised main loop does need a vdef and a virtual phi (see below);

(d) we also vectorise the epilogue; and

(e) the vectorised epilogue still needs a scalar epilogue

The specific case in which (c) applies is if a read-only loop is
vectorised using IFN_LOAD_LANES, which uses clobber statements to
mark the lifetime of the temporary array.

The vectoriser relies on the SSA renamer to update virtual operands.
All would probably be well if it postponed this update until after
it had vectorised both the main loop and the epilogue loop.  However,
when vectorising the epilogue, vect_do_peeling does:

  create_lcssa_for_virtual_phi (loop);
  update_ssa (TODO_update_ssa_only_virtuals);

(with "loop" in this case being the to-be-vectorised epilogue loop).
So the vectoriser puts the virtual operand into SSA form for the
vectorised main loop as a separate step, during the early stages
of vectorising the epilogue.

I wasn't sure at first why that update_ssa was there.  It looked
initially like it was related to create_lcssa_for_virtual_phi,
which seemed strange when create_lcssa_for_virtual_phi keeps the
SSA form up-to-date.  But before r241099 it had the following comment,
which AFAICT is still the reason:

  /* We might have a queued need to update virtual SSA form.  As we
     delete the update SSA machinery below after doing a regular
     incremental SSA update during loop copying make sure we don't
     lose that fact.
     ???  Needing to update virtual SSA form by renaming is unfortunate
     but not all of the vectorizer code inserting new loads / stores
     properly assigns virtual operands to those statements.  */

The patch restores that comment since IMO it's helpful.

(a), (d) and (e) mean that we copy the original un-if-converted scalar
loop to act as the scalar epilogue.  The update_ssa above means that this
copying needs to cope with any new virtual SSA names in the main loop.
The code to do that (reasonably) assumed that one of two things was true:

(1) the scalar loop and the vector loops don't have vdefs, and so no
    virtual operand update is needed.  The definition that applies
    on entry to the loops is the same in all cases.

(2) the scalar loop and the vector loops have virtual phis, and so --
    after applying create_lcssa_for_virtual_phi on the to-be-vectorised
    epilogue loop -- the virtual operand update can be handled in the
    same way as for normal SSA names.

But (b) and (c) together mean that the scalar loop and the
still-to-be-vectorised epilogue loop have no virtual phi that (2)
can use.  We'd therefore keep the original vuses when duplicating,
rather than updating them to the definition that applies on exit
from the epilogue loop.  (Since the epilogue is still unvectorised
and has no vdefs, the definition that applies on exit is the same
as the one that applies on entry.)

This patch therefore adds a third case: the scalar loop and
to-be-vectorised epilogue have no virtual defs, but the main loop does.

2019-12-31  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* tree-vect-loop-manip.c (create_lcssa_for_virtual_phi): Return
	the incoming virtual operand definition.
	(vect_do_peeling): When vectorizing an epilogue loop, handle the
	case in which the main loop has a virtual phi and the epilogue
	and scalar loops don't.  Restore an earlier comment about the
	update_ssa call.

gcc/testsuite/
	* gcc.dg/vect/vect-epilogues-2.c: New test.

From-SVN: r279802
parent 657ccd4d
2020-12-31 Richard Sandiford <richard.sandiford@arm.com>
* tree-vect-loop-manip.c (create_lcssa_for_virtual_phi): Return
the incoming virtual operand definition.
(vect_do_peeling): When vectorizing an epilogue loop, handle the
case in which the main loop has a virtual phi and the epilogue
and scalar loops don't. Restore an earlier comment about the
update_ssa call.
2019-12-30 Olivier Hainque <hainque@adacore.com> 2019-12-30 Olivier Hainque <hainque@adacore.com>
* config/vxworks/stdint.h: New file. Include _yvals.h * config/vxworks/stdint.h: New file. Include _yvals.h
2020-12-31 Richard Sandiford <richard.sandiford@arm.com>
* gcc.dg/vect/vect-epilogues-2.c: New test.
2019-12-30 Olivier Hainque <hainque@adacore.com> 2019-12-30 Olivier Hainque <hainque@adacore.com>
* gcc.dg/vect/tree-vect.h: #undef OFFSET. * gcc.dg/vect/tree-vect.h: #undef OFFSET.
......
/* { dg-do compile } */
int
f1 (int *x, int n)
{
int res = 0;
for (int i = 0; i < n; ++i)
res += x[i * 2] == 1 ? 2 : 3;
return res;
}
int
f2 (int *x)
{
int res = 0;
for (int i = 0; i < 0x83; ++i)
res += x[i * 2] == 1 ? 2 : 3;
return res;
}
int
f3 (int *x, int n)
{
int res = 0;
for (int i = 0; i < n; ++i)
res += x[i * 2] == 1 ? 2 : 3;
return res + x[0x100];
}
int
f4 (int *x)
{
int res = 0;
for (int i = 0; i < 0x83; ++i)
res += x[i * 2] == 1 ? 2 : 3;
return res + x[0x100];
}
int
f5 (int *x, int n, int a)
{
int res = 0;
for (int i = 0; i < n; ++i)
res += x[i * 2] == 1 ? 2 : 3;
x[a] += 1;
return res;
}
int
f6 (int *x, int a)
{
int res = 0;
for (int i = 0; i < 0x83; ++i)
res += x[i * 2] == 1 ? 2 : 3;
x[a] += 1;
return res;
}
...@@ -1249,9 +1249,12 @@ slpeel_can_duplicate_loop_p (const class loop *loop, const_edge e) ...@@ -1249,9 +1249,12 @@ slpeel_can_duplicate_loop_p (const class loop *loop, const_edge e)
the *guard[12] routines, which assume loop closed SSA form for all PHIs the *guard[12] routines, which assume loop closed SSA form for all PHIs
(but normally loop closed SSA form doesn't require virtual PHIs to be (but normally loop closed SSA form doesn't require virtual PHIs to be
in the same form). Doing this early simplifies the checking what in the same form). Doing this early simplifies the checking what
uses should be renamed. */ uses should be renamed.
static void If we create a new phi after the loop, return the definition that
applies on entry to the loop, otherwise return null. */
static tree
create_lcssa_for_virtual_phi (class loop *loop) create_lcssa_for_virtual_phi (class loop *loop)
{ {
gphi_iterator gsi; gphi_iterator gsi;
...@@ -1283,10 +1286,12 @@ create_lcssa_for_virtual_phi (class loop *loop) ...@@ -1283,10 +1286,12 @@ create_lcssa_for_virtual_phi (class loop *loop)
&& !flow_bb_inside_loop_p (loop, gimple_bb (stmt))) && !flow_bb_inside_loop_p (loop, gimple_bb (stmt)))
FOR_EACH_IMM_USE_ON_STMT (use_p, imm_iter) FOR_EACH_IMM_USE_ON_STMT (use_p, imm_iter)
SET_USE (use_p, new_vop); SET_USE (use_p, new_vop);
return PHI_ARG_DEF_FROM_EDGE (phi, loop_preheader_edge (loop));
} }
break; break;
} }
return NULL_TREE;
} }
/* Function vect_get_loop_location. /* Function vect_get_loop_location.
...@@ -2483,9 +2488,42 @@ vect_do_peeling (loop_vec_info loop_vinfo, tree niters, tree nitersm1, ...@@ -2483,9 +2488,42 @@ vect_do_peeling (loop_vec_info loop_vinfo, tree niters, tree nitersm1,
class loop *prolog, *epilog = NULL, *loop = LOOP_VINFO_LOOP (loop_vinfo); class loop *prolog, *epilog = NULL, *loop = LOOP_VINFO_LOOP (loop_vinfo);
class loop *first_loop = loop; class loop *first_loop = loop;
bool irred_flag = loop_preheader_edge (loop)->flags & EDGE_IRREDUCIBLE_LOOP; bool irred_flag = loop_preheader_edge (loop)->flags & EDGE_IRREDUCIBLE_LOOP;
create_lcssa_for_virtual_phi (loop);
/* We might have a queued need to update virtual SSA form. As we
delete the update SSA machinery below after doing a regular
incremental SSA update during loop copying make sure we don't
lose that fact.
??? Needing to update virtual SSA form by renaming is unfortunate
but not all of the vectorizer code inserting new loads / stores
properly assigns virtual operands to those statements. */
update_ssa (TODO_update_ssa_only_virtuals); update_ssa (TODO_update_ssa_only_virtuals);
create_lcssa_for_virtual_phi (loop);
/* If we're vectorizing an epilogue loop, the update_ssa above will
have ensured that the virtual operand is in SSA form throughout the
vectorized main loop. Normally it is possible to trace the updated
vector-stmt vdefs back to scalar-stmt vdefs and vector-stmt vuses
back to scalar-stmt vuses, meaning that the effect of the SSA update
remains local to the main loop. However, there are rare cases in
which the vectorized loop has vdefs even when the original scalar
loop didn't. For example, vectorizing a load with IFN_LOAD_LANES
introduces clobbers of the temporary vector array, which in turn
needs new vdefs. If the scalar loop doesn't write to memory, these
new vdefs will be the only ones in the vector loop.
In that case, update_ssa will have added a new virtual phi to the
main loop, which previously didn't need one. Ensure that we (locally)
maintain LCSSA form for the virtual operand, just as we would have
done if the virtual phi had existed from the outset. This makes it
easier to duplicate the scalar epilogue loop below. */
tree vop_to_rename = NULL_TREE;
if (loop_vec_info orig_loop_vinfo = LOOP_VINFO_ORIG_LOOP_INFO (loop_vinfo))
{
class loop *orig_loop = LOOP_VINFO_LOOP (orig_loop_vinfo);
vop_to_rename = create_lcssa_for_virtual_phi (orig_loop);
}
if (MAY_HAVE_DEBUG_BIND_STMTS) if (MAY_HAVE_DEBUG_BIND_STMTS)
{ {
gcc_assert (!adjust_vec.exists ()); gcc_assert (!adjust_vec.exists ());
...@@ -2706,6 +2744,26 @@ vect_do_peeling (loop_vec_info loop_vinfo, tree niters, tree nitersm1, ...@@ -2706,6 +2744,26 @@ vect_do_peeling (loop_vec_info loop_vinfo, tree niters, tree nitersm1,
as the transformations mentioned above make less or no sense when not as the transformations mentioned above make less or no sense when not
vectorizing. */ vectorizing. */
epilog = vect_epilogues ? get_loop_copy (loop) : scalar_loop; epilog = vect_epilogues ? get_loop_copy (loop) : scalar_loop;
if (vop_to_rename)
{
/* Vectorizing the main loop can sometimes introduce a vdef to
a loop that previously didn't have one; see the comment above
the definition of VOP_TO_RENAME for details. The definition
D that holds on E will then be different from the definition
VOP_TO_RENAME that holds during SCALAR_LOOP, so we need to
rename VOP_TO_RENAME to D when copying the loop.
The virtual operand is in LCSSA form for the main loop,
and no stmt between the main loop and E needs a vdef,
so we know that D is provided by a phi rather than by a
vdef on a normal gimple stmt. */
basic_block vdef_bb = e->src;
gphi *vphi;
while (!(vphi = get_virtual_phi (vdef_bb)))
vdef_bb = get_immediate_dominator (CDI_DOMINATORS, vdef_bb);
gcc_assert (vop_to_rename != gimple_phi_result (vphi));
set_current_def (vop_to_rename, gimple_phi_result (vphi));
}
epilog = slpeel_tree_duplicate_loop_to_edge_cfg (loop, epilog, e); epilog = slpeel_tree_duplicate_loop_to_edge_cfg (loop, epilog, e);
if (!epilog) if (!epilog)
{ {
......
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment