Commit dfbddbeb by Richard Sandiford Committed by Richard Sandiford

Handle data dependence relations with different bases

This patch tries to calculate conservatively-correct distance
vectors for two references whose base addresses are not the same.
It sets a new flag DDR_COULD_BE_INDEPENDENT_P if the dependence
isn't guaranteed to occur.

The motivating example is:

  struct s { int x[8]; };
  void
  f (struct s *a, struct s *b)
  {
    for (int i = 0; i < 8; ++i)
      a->x[i] += b->x[i];
  }

in which the "a" and "b" accesses are either independent or have a
dependence distance of 0 (assuming -fstrict-aliasing).  Neither case
prevents vectorisation, so we can vectorise without an alias check.

I'd originally wanted to do the same thing for arrays as well, e.g.:

  void
  f (int a[][8], struct b[][8])
  {
    for (int i = 0; i < 8; ++i)
      a[0][i] += b[0][i];
  }

I think this is valid because C11 6.7.6.2/6 says:

  For two array types to be compatible, both shall have compatible
  element types, and if both size specifiers are present, and are
  integer constant expressions, then both size specifiers shall have
  the same constant value.

So if we access an array through an int (*)[8], it must have type X[8]
or X[], where X is compatible with int.  It doesn't seem possible in
either case for "a[0]" and "b[0]" to overlap when "a != b".

However, as the comment above "if (same_base_p)" explains, GCC is more
forgiving: it supports arbitrary overlap of arrays and allows arrays to
be accessed with different dimensionality.  There are examples of this
in PR50067.  The patch therefore only handles references that end in a
structure field access.

There are two ways of handling these dependences in the vectoriser:
use them to limit VF, or check at runtime as before.  I've gone for
the approach of checking at runtime if we can, to avoid limiting VF
unnecessarily, but falling back to a VF cap when runtime checks aren't
allowed.

The patch tests whether we queued an alias check with a dependence
distance of X and then picked a VF <= X, in which case it's safe to
drop the alias check.  Since vect_prune_runtime_alias_check_list
can be called twice with different VF for the same loop, it's no
longer safe to clear may_alias_ddrs on exit.  Instead we should use
comp_alias_ddrs to check whether versioning is necessary.

2017-08-04  Richard Sandiford  <richard.sandiford@linaro.org>

gcc/
	* tree-data-ref.h (subscript): Add access_fn field.
	(data_dependence_relation): Add could_be_independent_p.
	(SUB_ACCESS_FN, DDR_COULD_BE_INDEPENDENT_P): New macros.
	(same_access_functions): Move to tree-data-ref.c.
	* tree-data-ref.c (ref_contains_union_access_p): New function.
	(access_fn_component_p): Likewise.
	(access_fn_components_comparable_p): Likewise.
	(dr_analyze_indices): Add a reference to access_fn_component_p.
	(dump_data_dependence_relation): Use SUB_ACCESS_FN instead of
	DR_ACCESS_FN.
	(constant_access_functions): Likewise.
	(add_other_self_distances): Likewise.
	(same_access_functions): Likewise.  (Moved from tree-data-ref.h.)
	(initialize_data_dependence_relation): Use XCNEW and remove
	explicit zeroing of DDR_REVERSED_P.  Look for a subsequence
	of access functions that have the same type.  Allow the
	subsequence to end with different bases in some circumstances.
	Record the chosen access functions in SUB_ACCESS_FN.
	(build_classic_dist_vector_1): Replace ddr_a and ddr_b with
	a_index and b_index.  Use SUB_ACCESS_FN instead of DR_ACCESS_FN.
	(subscript_dependence_tester_1): Likewise dra and drb.
	(build_classic_dist_vector): Update calls accordingly.
	(subscript_dependence_tester): Likewise.
	* tree-ssa-loop-prefetch.c (determine_loop_nest_reuse): Check
	DDR_COULD_BE_INDEPENDENT_P.
	* tree-vectorizer.h (LOOP_REQUIRES_VERSIONING_FOR_ALIAS): Test
	comp_alias_ddrs instead of may_alias_ddrs.
	* tree-vect-data-refs.c (vect_analyze_possibly_independent_ddr):
	New function.
	(vect_analyze_data_ref_dependence): Use it if
	DDR_COULD_BE_INDEPENDENT_P, but fall back to using the recorded
	distance vectors if that fails.
	(dependence_distance_ge_vf): New function.
	(vect_prune_runtime_alias_test_list): Use it.  Don't clear
	LOOP_VINFO_MAY_ALIAS_DDRS.

gcc/testsuite/
	* gcc.dg/vect/vect-alias-check-3.c: New test.
	* gcc.dg/vect/vect-alias-check-4.c: Likewise.
	* gcc.dg/vect/vect-alias-check-5.c: Likewise.

From-SVN: r250867
parent 165b2f5f
2017-08-04 Richard Sandiford <richard.sandiford@linaro.org>
* tree-data-ref.h (subscript): Add access_fn field.
(data_dependence_relation): Add could_be_independent_p.
(SUB_ACCESS_FN, DDR_COULD_BE_INDEPENDENT_P): New macros.
(same_access_functions): Move to tree-data-ref.c.
* tree-data-ref.c (ref_contains_union_access_p): New function.
(access_fn_component_p): Likewise.
(access_fn_components_comparable_p): Likewise.
(dr_analyze_indices): Add a reference to access_fn_component_p.
(dump_data_dependence_relation): Use SUB_ACCESS_FN instead of
DR_ACCESS_FN.
(constant_access_functions): Likewise.
(add_other_self_distances): Likewise.
(same_access_functions): Likewise. (Moved from tree-data-ref.h.)
(initialize_data_dependence_relation): Use XCNEW and remove
explicit zeroing of DDR_REVERSED_P. Look for a subsequence
of access functions that have the same type. Allow the
subsequence to end with different bases in some circumstances.
Record the chosen access functions in SUB_ACCESS_FN.
(build_classic_dist_vector_1): Replace ddr_a and ddr_b with
a_index and b_index. Use SUB_ACCESS_FN instead of DR_ACCESS_FN.
(subscript_dependence_tester_1): Likewise dra and drb.
(build_classic_dist_vector): Update calls accordingly.
(subscript_dependence_tester): Likewise.
* tree-ssa-loop-prefetch.c (determine_loop_nest_reuse): Check
DDR_COULD_BE_INDEPENDENT_P.
* tree-vectorizer.h (LOOP_REQUIRES_VERSIONING_FOR_ALIAS): Test
comp_alias_ddrs instead of may_alias_ddrs.
* tree-vect-data-refs.c (vect_analyze_possibly_independent_ddr):
New function.
(vect_analyze_data_ref_dependence): Use it if
DDR_COULD_BE_INDEPENDENT_P, but fall back to using the recorded
distance vectors if that fails.
(dependence_distance_ge_vf): New function.
(vect_prune_runtime_alias_test_list): Use it. Don't clear
LOOP_VINFO_MAY_ALIAS_DDRS.
2017-08-04 Richard Biener <rguenther@suse.de>
PR middle-end/81705
......
2017-08-04 Richard Sandiford <richard.sandiford@linaro.org>
* gcc.dg/vect/vect-alias-check-3.c: New test.
* gcc.dg/vect/vect-alias-check-4.c: Likewise.
* gcc.dg/vect/vect-alias-check-5.c: Likewise.
2017-08-04 Richard Biener <rguenther@suse.de>
PR middle-end/81705
......
/* { dg-do compile } */
/* { dg-require-effective-target vect_int } */
/* { dg-additional-options "--param vect-max-version-for-alias-checks=0 -fopenmp-simd" } */
/* Intended to be larger than any VF. */
#define GAP 128
#define N (GAP * 3)
struct s { int x[N + 1]; };
struct t { struct s x[N + 1]; };
struct u { int x[N + 1]; int y; };
struct v { struct s s; };
void
f1 (struct s *a, struct s *b)
{
for (int i = 0; i < N; ++i)
a->x[i] += b->x[i];
}
void
f2 (struct s *a, struct s *b)
{
for (int i = 0; i < N; ++i)
a[1].x[i] += b[2].x[i];
}
void
f3 (struct s *a, struct s *b)
{
for (int i = 0; i < N; ++i)
a[1].x[i] += b[i].x[i];
}
void
f4 (struct s *a, struct s *b)
{
for (int i = 0; i < N; ++i)
a[i].x[i] += b[i].x[i];
}
void
f5 (struct s *a, struct s *b)
{
for (int i = 0; i < N; ++i)
a->x[i] += b->x[i + 1];
}
void
f6 (struct s *a, struct s *b)
{
for (int i = 0; i < N; ++i)
a[1].x[i] += b[2].x[i + 1];
}
void
f7 (struct s *a, struct s *b)
{
for (int i = 0; i < N; ++i)
a[1].x[i] += b[i].x[i + 1];
}
void
f8 (struct s *a, struct s *b)
{
for (int i = 0; i < N; ++i)
a[i].x[i] += b[i].x[i + 1];
}
void
f9 (struct s *a, struct t *b)
{
for (int i = 0; i < N; ++i)
a->x[i] += b->x[1].x[i];
}
void
f10 (struct s *a, struct t *b)
{
for (int i = 0; i < N; ++i)
a->x[i] += b->x[i].x[i];
}
void
f11 (struct u *a, struct u *b)
{
for (int i = 0; i < N; ++i)
a->x[i] += b->x[i] + b[i].y;
}
void
f12 (struct s *a, struct s *b)
{
for (int i = 0; i < GAP; ++i)
a->x[i + GAP] += b->x[i];
}
void
f13 (struct s *a, struct s *b)
{
for (int i = 0; i < GAP * 2; ++i)
a->x[i + GAP] += b->x[i];
}
void
f14 (struct v *a, struct s *b)
{
for (int i = 0; i < N; ++i)
a->s.x[i] = b->x[i];
}
void
f15 (struct s *a, struct s *b)
{
#pragma omp simd safelen(N)
for (int i = 0; i < N; ++i)
a->x[i + 1] += b->x[i];
}
/* { dg-final { scan-tree-dump-times "LOOP VECTORIZED" 15 "vect" } } */
/* { dg-do compile } */
/* { dg-require-effective-target vect_int } */
/* { dg-additional-options "--param vect-max-version-for-alias-checks=0" } */
#define N 16
struct s1 { int a[N]; };
struct s2 { struct s1 b; int c; };
struct s3 { int d; struct s1 e; };
union u { struct s2 f; struct s3 g; };
/* We allow a and b to overlap arbitrarily. */
void
f1 (int a[][N], int b[][N])
{
for (int i = 0; i < N; ++i)
a[0][i] += b[0][i];
}
void
f2 (union u *a, union u *b)
{
for (int i = 0; i < N; ++i)
a->f.b.a[i] += b->g.e.a[i];
}
void
f3 (struct s1 *a, struct s1 *b)
{
for (int i = 0; i < N - 1; ++i)
a->a[i + 1] += b->a[i];
}
/* { dg-final { scan-tree-dump-not "LOOP VECTORIZED" "vect" } } */
/* { dg-do compile } */
/* { dg-require-effective-target vect_int } */
/* Intended to be larger than any VF. */
#define GAP 128
#define N (GAP * 3)
struct s { int x[N]; };
void
f1 (struct s *a, struct s *b)
{
for (int i = 0; i < GAP * 2; ++i)
a->x[i + GAP] += b->x[i];
}
/* { dg-final { scan-tree-dump-times "consider run-time aliasing" 1 "vect" } } */
/* { dg-final { scan-tree-dump-times "improved number of alias checks from 1 to 0" 1 "vect" } } */
/* { dg-final { scan-tree-dump-times "LOOP VECTORIZED" 1 "vect" } } */
......@@ -260,6 +260,9 @@ struct conflict_function
struct subscript
{
/* The access functions of the two references. */
tree access_fn[2];
/* A description of the iterations for which the elements are
accessed twice. */
conflict_function *conflicting_iterations_in_a;
......@@ -278,6 +281,7 @@ struct subscript
typedef struct subscript *subscript_p;
#define SUB_ACCESS_FN(SUB, I) (SUB)->access_fn[I]
#define SUB_CONFLICTS_IN_A(SUB) (SUB)->conflicting_iterations_in_a
#define SUB_CONFLICTS_IN_B(SUB) (SUB)->conflicting_iterations_in_b
#define SUB_LAST_CONFLICT(SUB) (SUB)->last_conflict
......@@ -333,6 +337,33 @@ struct data_dependence_relation
/* Set to true when the dependence relation is on the same data
access. */
bool self_reference_p;
/* True if the dependence described is conservatively correct rather
than exact, and if it is still possible for the accesses to be
conditionally independent. For example, the a and b references in:
struct s *a, *b;
for (int i = 0; i < n; ++i)
a->f[i] += b->f[i];
conservatively have a distance vector of (0), for the case in which
a == b, but the accesses are independent if a != b. Similarly,
the a and b references in:
struct s *a, *b;
for (int i = 0; i < n; ++i)
a[0].f[i] += b[i].f[i];
conservatively have a distance vector of (0), but they are indepenent
when a != b + i. In contrast, the references in:
struct s *a;
for (int i = 0; i < n; ++i)
a->f[i] += a->f[i];
have the same distance vector of (0), but the accesses can never be
independent. */
bool could_be_independent_p;
};
typedef struct data_dependence_relation *ddr_p;
......@@ -363,6 +394,7 @@ typedef struct data_dependence_relation *ddr_p;
#define DDR_DIST_VECT(DDR, I) \
DDR_DIST_VECTS (DDR)[I]
#define DDR_REVERSED_P(DDR) (DDR)->reversed_p
#define DDR_COULD_BE_INDEPENDENT_P(DDR) (DDR)->could_be_independent_p
bool dr_analyze_innermost (innermost_loop_behavior *, tree, struct loop *);
......@@ -459,22 +491,6 @@ same_data_refs (data_reference_p a, data_reference_p b)
return true;
}
/* Return true when the DDR contains two data references that have the
same access functions. */
static inline bool
same_access_functions (const struct data_dependence_relation *ddr)
{
unsigned i;
for (i = 0; i < DDR_NUM_SUBSCRIPTS (ddr); i++)
if (!eq_evolutions_p (DR_ACCESS_FN (DDR_A (ddr), i),
DR_ACCESS_FN (DDR_B (ddr), i)))
return false;
return true;
}
/* Returns true when all the dependences are computable. */
inline bool
......
......@@ -1668,6 +1668,7 @@ determine_loop_nest_reuse (struct loop *loop, struct mem_ref_group *refs,
refb = (struct mem_ref *) DDR_B (dep)->aux;
if (DDR_ARE_DEPENDENT (dep) == chrec_dont_know
|| DDR_COULD_BE_INDEPENDENT_P (dep)
|| DDR_NUM_DIST_VECTS (dep) == 0)
{
/* If the dependence cannot be analyzed, assume that there might be
......
......@@ -160,6 +160,60 @@ vect_mark_for_runtime_alias_test (ddr_p ddr, loop_vec_info loop_vinfo)
}
/* A subroutine of vect_analyze_data_ref_dependence. Handle
DDR_COULD_BE_INDEPENDENT_P ddr DDR that has a known set of dependence
distances. These distances are conservatively correct but they don't
reflect a guaranteed dependence.
Return true if this function does all the work necessary to avoid
an alias or false if the caller should use the dependence distances
to limit the vectorization factor in the usual way. LOOP_DEPTH is
the depth of the loop described by LOOP_VINFO and the other arguments
are as for vect_analyze_data_ref_dependence. */
static bool
vect_analyze_possibly_independent_ddr (data_dependence_relation *ddr,
loop_vec_info loop_vinfo,
int loop_depth, int *max_vf)
{
struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
lambda_vector dist_v;
unsigned int i;
FOR_EACH_VEC_ELT (DDR_DIST_VECTS (ddr), i, dist_v)
{
int dist = dist_v[loop_depth];
if (dist != 0 && !(dist > 0 && DDR_REVERSED_P (ddr)))
{
/* If the user asserted safelen >= DIST consecutive iterations
can be executed concurrently, assume independence.
??? An alternative would be to add the alias check even
in this case, and vectorize the fallback loop with the
maximum VF set to safelen. However, if the user has
explicitly given a length, it's less likely that that
would be a win. */
if (loop->safelen >= 2 && abs_hwi (dist) <= loop->safelen)
{
if (loop->safelen < *max_vf)
*max_vf = loop->safelen;
LOOP_VINFO_NO_DATA_DEPENDENCIES (loop_vinfo) = false;
continue;
}
/* For dependence distances of 2 or more, we have the option
of limiting VF or checking for an alias at runtime.
Prefer to check at runtime if we can, to avoid limiting
the VF unnecessarily when the bases are in fact independent.
Note that the alias checks will be removed if the VF ends up
being small enough. */
return vect_mark_for_runtime_alias_test (ddr, loop_vinfo);
}
}
return true;
}
/* Function vect_analyze_data_ref_dependence.
Return TRUE if there (might) exist a dependence between a memory-reference
......@@ -305,6 +359,12 @@ vect_analyze_data_ref_dependence (struct data_dependence_relation *ddr,
}
loop_depth = index_in_loop_nest (loop->num, DDR_LOOP_NEST (ddr));
if (DDR_COULD_BE_INDEPENDENT_P (ddr)
&& vect_analyze_possibly_independent_ddr (ddr, loop_vinfo,
loop_depth, max_vf))
return false;
FOR_EACH_VEC_ELT (DDR_DIST_VECTS (ddr), i, dist_v)
{
int dist = dist_v[loop_depth];
......@@ -2878,6 +2938,44 @@ vect_no_alias_p (struct data_reference *a, struct data_reference *b,
return false;
}
/* Return true if the minimum nonzero dependence distance for loop LOOP_DEPTH
in DDR is >= VF. */
static bool
dependence_distance_ge_vf (data_dependence_relation *ddr,
unsigned int loop_depth, unsigned HOST_WIDE_INT vf)
{
if (DDR_ARE_DEPENDENT (ddr) != NULL_TREE
|| DDR_NUM_DIST_VECTS (ddr) == 0)
return false;
/* If the dependence is exact, we should have limited the VF instead. */
gcc_checking_assert (DDR_COULD_BE_INDEPENDENT_P (ddr));
unsigned int i;
lambda_vector dist_v;
FOR_EACH_VEC_ELT (DDR_DIST_VECTS (ddr), i, dist_v)
{
HOST_WIDE_INT dist = dist_v[loop_depth];
if (dist != 0
&& !(dist > 0 && DDR_REVERSED_P (ddr))
&& (unsigned HOST_WIDE_INT) abs_hwi (dist) < vf)
return false;
}
if (dump_enabled_p ())
{
dump_printf_loc (MSG_NOTE, vect_location,
"dependence distance between ");
dump_generic_expr (MSG_NOTE, TDF_SLIM, DR_REF (DDR_A (ddr)));
dump_printf (MSG_NOTE, " and ");
dump_generic_expr (MSG_NOTE, TDF_SLIM, DR_REF (DDR_B (ddr)));
dump_printf (MSG_NOTE, " is >= VF\n");
}
return true;
}
/* Function vect_prune_runtime_alias_test_list.
Prune a list of ddrs to be tested at run-time by versioning for alias.
......@@ -2908,6 +3006,10 @@ vect_prune_runtime_alias_test_list (loop_vec_info loop_vinfo)
comp_alias_ddrs.create (may_alias_ddrs.length ());
unsigned int loop_depth
= index_in_loop_nest (LOOP_VINFO_LOOP (loop_vinfo)->num,
LOOP_VINFO_LOOP_NEST (loop_vinfo));
/* First, we collect all data ref pairs for aliasing checks. */
FOR_EACH_VEC_ELT (may_alias_ddrs, i, ddr)
{
......@@ -2917,6 +3019,11 @@ vect_prune_runtime_alias_test_list (loop_vec_info loop_vinfo)
tree segment_length_a, segment_length_b;
gimple *stmt_a, *stmt_b;
/* Ignore the alias if the VF we chose ended up being no greater
than the dependence distance. */
if (dependence_distance_ge_vf (ddr, loop_depth, vect_factor))
continue;
dr_a = DDR_A (ddr);
stmt_a = DR_STMT (DDR_A (ddr));
dr_group_first_a = GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt_a));
......@@ -2993,10 +3100,6 @@ vect_prune_runtime_alias_test_list (loop_vec_info loop_vinfo)
return false;
}
/* All alias checks have been resolved at compilation time. */
if (!comp_alias_ddrs.length ())
LOOP_VINFO_MAY_ALIAS_DDRS (loop_vinfo).truncate (0);
return true;
}
......
......@@ -358,7 +358,7 @@ typedef struct _loop_vec_info : public vec_info {
#define LOOP_REQUIRES_VERSIONING_FOR_ALIGNMENT(L) \
((L)->may_misalign_stmts.length () > 0)
#define LOOP_REQUIRES_VERSIONING_FOR_ALIAS(L) \
((L)->may_alias_ddrs.length () > 0)
((L)->comp_alias_ddrs.length () > 0)
#define LOOP_REQUIRES_VERSIONING_FOR_NITERS(L) \
(LOOP_VINFO_NITERS_ASSUMPTIONS (L))
#define LOOP_REQUIRES_VERSIONING(L) \
......
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment