Commit 468c2ac0, authored and committed by Dorit Nuzman

tree-data-refs.c (split_constant_offset): Expose.

        * tree-data-refs.c (split_constant_offset): Expose.
        * tree-data-refs.h (split_constant_offset): Add declaration.

        * tree-vectorizer.h (dr_alignment_support): Renamed
        dr_unaligned_software_pipeline to dr_explicit_realign_optimized.
        Added a new value dr_explicit_realign.
        (_stmt_vec_info): Added new fields: dr_base_address, dr_init,
        dr_offset, dr_step, and dr_aligned_to, along with new access
        functions for these fields: STMT_VINFO_DR_BASE_ADDRESS,
        STMT_VINFO_DR_INIT, STMT_VINFO_DR_OFFSET, STMT_VINFO_DR_STEP, and
        STMT_VINFO_DR_ALIGNED_TO.

        * tree-vectorizer.c (vect_supportable_dr_alignment): Add
        documentation.
        In case of outer-loop vectorization with non-fixed misalignment - use
        the dr_explicit_realign scheme instead of the optimized realignment
        scheme.
        (new_stmt_vec_info): Initialize new fields.

        * tree-vect-analyze.c (vect_compute_data_ref_alignment): Handle the
        'nested_in_vect_loop' case. Change verbosity level.
        (vect_analyze_data_ref_access): Handle the 'nested_in_vect_loop' case.
        Don't fail on zero step in the outer-loop for loads.
        (vect_analyze_data_refs): Call split_constant_offset to calculate base,
        offset and init relative to the outer-loop.

        * tree-vect-transform.c (vect_create_data_ref_ptr): Replace the unused
        BSI function argument with a new function argument - at_loop.
        Simplify the condition that determines STEP. Takes additional argument
        INV_P. Support outer-loop vectorization (handle the nested_in_vect_loop
        case), including zero step in the outer-loop. Call
        vect_create_addr_base_for_vector_ref with additional argument.
        (vect_create_addr_base_for_vector_ref): Takes additional argument LOOP.
        Updated function documentation. Handle the 'nested_in_vect_loop' case.
        Fixed and simplified calculation of step.
        (vectorizable_store): Call vect_create_data_ref_ptr with loop instead
        of bsi, and with additional argument. Call bump_vector_ptr with
        additional argument. Fix typos. Handle the 'nested_in_vect_loop' case.
        (vect_setup_realignment): Takes additional arguments INIT_ADDR and
        DR_ALIGNMENT_SUPPORT. Returns another value AT_LOOP. Handle the case
        when the realignment setup needs to take place inside the loop.  Support
        the dr_explicit_realign scheme. Allow generating the optimized
        realignment scheme for outer-loop vectorization. Added documentation.
        (vectorizable_load): Support the dr_explicit_realign scheme. Handle the
        'nested_in_vect_loop' case, including loads that are invariant in the
        outer-loop and the realignment schemes. Handle the case when the
        realignment setup needs to take place inside the loop. Call
        vect_setup_realignment with additional arguments.  Call
        vect_create_data_ref_ptr with additional argument and with loop instead
        of bsi. Fix 80-column overflow. Fix typos. Rename PHI_STMT to PHI.
        (vect_gen_niters_for_prolog_loop): Call
        vect_create_addr_base_for_vector_ref with additional arguments.
        (vect_create_cond_for_align_checks): Likewise.
        (bump_vector_ptr): Updated to support the new dr_explicit_realign
        scheme: takes additional argument bump; argument ptr_incr is now
        optional; updated documentation.
        (vect_init_vector): Takes additional argument (bsi). Use it, if
        available, to insert the vector initialization.
        (get_initial_def_for_induction): Pass additional argument in call to
        vect_init_vector.
        (vect_get_vec_def_for_operand): Likewise.
        (vect_setup_realignment): Likewise.
        (vectorizable_load): Likewise.

From-SVN: r127624
parent d29de1bf
2007-08-19 Dorit Nuzman <dorit@il.ibm.com>
* tree-data-refs.c (split_constant_offset): Expose.
* tree-data-refs.h (split_constant_offset): Add declaration.
* tree-vectorizer.h (dr_alignment_support): Renamed
dr_unaligned_software_pipeline to dr_explicit_realign_optimized.
Added a new value dr_explicit_realign.
(_stmt_vec_info): Added new fields: dr_base_address, dr_init,
dr_offset, dr_step, and dr_aligned_to, along with new access
functions for these fields: STMT_VINFO_DR_BASE_ADDRESS,
STMT_VINFO_DR_INIT, STMT_VINFO_DR_OFFSET, STMT_VINFO_DR_STEP, and
STMT_VINFO_DR_ALIGNED_TO.
* tree-vectorizer.c (vect_supportable_dr_alignment): Add
documentation.
In case of outer-loop vectorization with non-fixed misalignment - use
the dr_explicit_realign scheme instead of the optimized realignment
scheme.
(new_stmt_vec_info): Initialize new fields.
* tree-vect-analyze.c (vect_compute_data_ref_alignment): Handle the
'nested_in_vect_loop' case. Change verbosity level.
(vect_analyze_data_ref_access): Handle the 'nested_in_vect_loop' case.
Don't fail on zero step in the outer-loop for loads.
(vect_analyze_data_refs): Call split_constant_offset to calculate base,
offset and init relative to the outer-loop.
* tree-vect-transform.c (vect_create_data_ref_ptr): Replace the unused
BSI function argument with a new function argument - at_loop.
Simplify the condition that determines STEP. Takes additional argument
INV_P. Support outer-loop vectorization (handle the nested_in_vect_loop
case), including zero step in the outer-loop. Call
vect_create_addr_base_for_vector_ref with additional argument.
(vect_create_addr_base_for_vector_ref): Takes additional argument LOOP.
Updated function documentation. Handle the 'nested_in_vect_loop' case.
Fixed and simplified calculation of step.
(vectorizable_store): Call vect_create_data_ref_ptr with loop instead
of bsi, and with additional argument. Call bump_vector_ptr with
additional argument. Fix typos. Handle the 'nested_in_vect_loop' case.
(vect_setup_realignment): Takes additional arguments INIT_ADDR and
DR_ALIGNMENT_SUPPORT. Returns another value AT_LOOP. Handle the case
when the realignment setup needs to take place inside the loop. Support
the dr_explicit_realign scheme. Allow generating the optimized
realignment scheme for outer-loop vectorization. Added documentation.
(vectorizable_load): Support the dr_explicit_realign scheme. Handle the
'nested_in_vect_loop' case, including loads that are invariant in the
outer-loop and the realignment schemes. Handle the case when the
realignment setup needs to take place inside the loop. Call
vect_setup_realignment with additional arguments. Call
vect_create_data_ref_ptr with additional argument and with loop instead
of bsi. Fix 80-column overflow. Fix typos. Rename PHI_STMT to PHI.
(vect_gen_niters_for_prolog_loop): Call
vect_create_addr_base_for_vector_ref with additional arguments.
(vect_create_cond_for_align_checks): Likewise.
(bump_vector_ptr): Updated to support the new dr_explicit_realign
scheme: takes additional argument bump; argument ptr_incr is now
optional; updated documentation.
(vect_init_vector): Takes additional argument (bsi). Use it, if
available, to insert the vector initialization.
(get_initial_def_for_induction): Pass additional argument in call to
vect_init_vector.
(vect_get_vec_def_for_operand): Likewise.
(vect_setup_realignment): Likewise.
(vectorizable_load): Likewise.
2007-08-19 Dorit Nuzman <dorit@il.ibm.com>
* tree-vectorizer.h (vect_is_simple_reduction): Takes a loop_vec_info
as argument instead of struct loop.
(nested_in_vect_loop_p): New function.
......
2007-08-19 Dorit Nuzman <dorit@il.ibm.com>
* gcc.dg/vect/vect-117.c: Change inner-loop bound to
unknown (so that outer-loop won't get analyzed).
* gcc.dg/vect/vect-outer-1a.c: New test.
* gcc.dg/vect/vect-outer-1b.c: New test.
* gcc.dg/vect/vect-outer-1.c: New test.
* gcc.dg/vect/vect-outer-2a.c: New test.
* gcc.dg/vect/vect-outer-2b.c: New test.
* gcc.dg/vect/vect-outer-2c.c: New test.
* gcc.dg/vect/vect-outer-2.c: New test.
* gcc.dg/vect/vect-outer-3a.c: New test.
* gcc.dg/vect/vect-outer-3b.c: New test.
* gcc.dg/vect/vect-outer-3c.c: New test.
* gcc.dg/vect/vect-outer-3.c: New test.
* gcc.dg/vect/vect-outer-4a.c: New test.
* gcc.dg/vect/vect-outer-4b.c: New test.
* gcc.dg/vect/vect-outer-4c.c: New test.
* gcc.dg/vect/vect-outer-4d.c: New test.
* gcc.dg/vect/vect-outer-4e.c: New test.
* gcc.dg/vect/vect-outer-4f.c: New test.
* gcc.dg/vect/vect-outer-4g.c: New test.
* gcc.dg/vect/no-section-anchors-vect-outer-4h.c: New test.
* gcc.dg/vect/vect-outer-4i.c: New test.
* gcc.dg/vect/vect-outer-4j.c: New test.
* gcc.dg/vect/vect-outer-4k.c: New test.
* gcc.dg/vect/vect-outer-4l.c: New test.
* gcc.dg/vect/vect-outer-4m.c: New test.
* gcc.dg/vect/vect-outer-4.c: New test.
* gcc.dg/vect/vect-outer-5.c: New test.
* gcc.dg/vect/vect-outer-6.c: New test.
* gcc.dg/vect/vect-outer-fir.c: New test.
* gcc.dg/vect/vect-outer-fir-lb.c: New test.
* gcc.dg/vect/costmodel/ppc/costmodel-vect-outer-fir.c: New test.
2007-08-19 Dorit Nuzman <dorit@il.ibm.com>
* gcc.dg/vect/vect.exp: Compile tests with -fno-tree-scev-cprop
and -fno-tree-reassoc.
* gcc.dg/vect/no-tree-scev-cprop-vect-iv-1.c: Moved to...
/* { dg-require-effective-target vect_float } */
#include <stdarg.h>
#include "../../tree-vect.h"
#define N 40
#define M 128
float in[N+M];
float coeff[M];
float out[N];
float fir_out[N];
/* Should be vectorized. Fixed misalignment in the inner-loop. */
/* Currently not vectorized because we get too many BBs in the inner-loop,
because the compiler doesn't realize that the inner-loop executes at
least once (because k<4), and so there's no need to create guard code
to skip the inner-loop in case it doesn't execute. */
void foo (){
int i,j,k;
float diff;
for (i = 0; i < N; i++) {
out[i] = 0;
}
for (k = 0; k < 4; k++) {
for (i = 0; i < N; i++) {
diff = 0;
for (j = k; j < M; j+=4) {
diff += in[j+i]*coeff[j];
}
out[i] += diff;
}
}
}
/* Vectorized. Changing misalignment in the inner-loop. */
void fir (){
int i,j,k;
float diff;
for (i = 0; i < N; i++) {
diff = 0;
for (j = 0; j < M; j++) {
diff += in[j+i]*coeff[j];
}
fir_out[i] = diff;
}
}
int main (void)
{
check_vect ();
int i, j;
float diff;
for (i = 0; i < M; i++)
coeff[i] = i;
for (i = 0; i < N+M; i++)
in[i] = i;
foo ();
fir ();
for (i = 0; i < N; i++) {
if (out[i] != fir_out[i])
abort ();
}
return 0;
}
/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 2 "vect" { xfail *-*-* } } } */
/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" { xfail vect_no_align } } } */
/* { dg-final { cleanup-tree-dump "vect" } } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include "tree-vect.h"
#define N 40
#define M 128
unsigned short a[M][N];
unsigned int out[N];
/* Outer-loop vectorization. */
void
foo (){
int i,j;
unsigned int diff;
for (i = 0; i < N; i++) {
for (j = 0; j < M; j++) {
a[j][i] = 4;
}
out[i]=5;
}
}
int main (void)
{
int i, j;
check_vect ();
foo ();
for (i = 0; i < N; i++) {
for (j = 0; j < M; j++) {
if (a[j][i] != 4)
abort ();
}
if (out[i] != 5)
abort ();
}
return 0;
}
/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" } } */
/* { dg-final { cleanup-tree-dump "vect" } } */
......@@ -20,7 +20,7 @@ static int c[N][N] = {{ 1, 2, 3, 4, 5},
volatile int foo;
-int main1 (int A[N][N])
+int main1 (int A[N][N], int n)
{
int i,j;
......@@ -28,7 +28,7 @@ int main1 (int A[N][N])
/* vectorizable */
for (i = 1; i < N; i++)
{
-for (j = 0; j < N; j++)
+for (j = 0; j < n; j++)
{
A[i][j] = A[i-1][j] + A[i][j];
}
......@@ -42,7 +42,7 @@ int main (void)
int i,j;
foo = 0;
-main1 (a);
+main1 (a, N);
/* check results: */
......
/* { dg-do compile } */
#define N 40
signed short image[N][N] __attribute__ ((__aligned__(16)));
signed short block[N][N] __attribute__ ((__aligned__(16)));
signed short out[N] __attribute__ ((__aligned__(16)));
/* Can't do outer-loop vectorization because of non-consecutive access. */
void
foo (){
int i,j;
int diff;
for (i = 0; i < N; i++) {
diff = 0;
for (j = 0; j < N; j+=8) {
diff += (image[i][j] - block[i][j]);
}
out[i]=diff;
}
}
/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" { xfail *-*-* } } } */
/* { dg-final { scan-tree-dump-times "strided access in outer loop" 1 "vect" } } */
/* { dg-final { cleanup-tree-dump "vect" } } */
/* { dg-do compile } */
#define N 40
signed short image[N][N] __attribute__ ((__aligned__(16)));
signed short block[N][N] __attribute__ ((__aligned__(16)));
/* Can't do outer-loop vectorization because of non-consecutive access.
Currently fails to vectorize because the reduction pattern is not
recognized. */
int
foo (){
int i,j;
int diff = 0;
for (i = 0; i < N; i++) {
for (j = 0; j < N; j+=8) {
diff += (image[i][j] - block[i][j]);
}
}
return diff;
}
/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" { xfail *-*-* } } } */
/* FORNOW */
/* { dg-final { scan-tree-dump-times "strided access in outer loop" 1 "vect" { xfail *-*-* } } } */
/* { dg-final { scan-tree-dump-times "unexpected pattern" 1 "vect" } } */
/* { dg-final { cleanup-tree-dump "vect" } } */
/* { dg-do compile } */
#define N 40
signed short image[N][N];
signed short block[N][N];
signed short out[N];
/* Outer-loop cannot get vectorized because of non-consecutive access. */
void
foo (){
int i,j;
int diff;
for (i = 0; i < N; i++) {
diff = 0;
for (j = 0; j < N; j+=4) {
diff += (image[i][j] - block[i][j]);
}
out[i]=diff;
}
}
/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" { xfail *-*-* } } } */
/* { dg-final { scan-tree-dump-times "strided access in outer loop" 1 "vect" } } */
/* { dg-final { cleanup-tree-dump "vect" } } */
/* { dg-require-effective-target vect_float } */
#include <stdarg.h>
#include "tree-vect.h"
#define N 40
float image[N][N] __attribute__ ((__aligned__(16)));
float out[N];
/* Outer-loop vectorization. */
void
foo (){
int i,j;
for (i = 0; i < N; i++) {
for (j = 0; j < N; j++) {
image[j][i] = j+i;
}
}
}
int main (void)
{
check_vect ();
int i, j;
foo ();
for (i = 0; i < N; i++) {
for (j = 0; j < N; j++) {
if (image[j][i] != j+i)
abort ();
}
}
return 0;
}
/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" } } */
/* { dg-final { cleanup-tree-dump "vect" } } */
/* { dg-require-effective-target vect_float } */
#include <stdarg.h>
#include "tree-vect.h"
#define N 40
float image[N][N][N] __attribute__ ((__aligned__(16)));
void
foo (){
int i,j,k;
for (k=0; k<N; k++) {
for (i = 0; i < N; i++) {
for (j = 0; j < N; j++) {
image[k][j][i] = j+i+k;
}
}
}
}
int main (void)
{
check_vect ();
int i, j, k;
foo ();
for (k=0; k<N; k++) {
for (i = 0; i < N; i++) {
for (j = 0; j < N; j++) {
if (image[k][j][i] != j+i+k)
abort ();
}
}
}
return 0;
}
/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" } } */
/* { dg-final { cleanup-tree-dump "vect" } } */
/* { dg-require-effective-target vect_float } */
#include <stdarg.h>
#include "tree-vect.h"
#define N 40
float image[2*N][N][N] __attribute__ ((__aligned__(16)));
void
foo (){
int i,j,k;
for (k=0; k<N; k++) {
for (i = 0; i < N; i++) {
for (j = 0; j < N; j++) {
image[k+i][j][i] = j+i+k;
}
}
}
}
int main (void)
{
check_vect ();
int i, j, k;
foo ();
for (k=0; k<N; k++) {
for (i = 0; i < N; i++) {
for (j = 0; j < N; j++) {
if (image[k+i][j][i] != j+i+k)
abort ();
}
}
}
return 0;
}
/* { dg-final { scan-tree-dump-times "strided access in outer loop." 1 "vect" } } */
/* { dg-final { cleanup-tree-dump "vect" } } */
/* { dg-require-effective-target vect_float } */
#include <stdarg.h>
#include "tree-vect.h"
#define N 40
float image[2*N][2*N][N] __attribute__ ((__aligned__(16)));
void
foo (){
int i,j,k;
for (k=0; k<N; k++) {
for (i = 0; i < N; i++) {
for (j = 0; j < N; j+=2) {
image[k][j][i] = j+i+k;
}
}
}
}
int main (void)
{
check_vect ();
int i, j, k;
foo ();
for (k=0; k<N; k++) {
for (i = 0; i < N; i++) {
for (j = 0; j < N; j+=2) {
if (image[k][j][i] != j+i+k)
abort ();
}
}
}
return 0;
}
/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" } } */
/* { dg-final { cleanup-tree-dump "vect" } } */
/* { dg-require-effective-target vect_float } */
#include <stdarg.h>
#include "tree-vect.h"
#define N 40
float image[N][N][N+1] __attribute__ ((__aligned__(16)));
void
foo (){
int i,j,k;
for (k=0; k<N; k++) {
for (i = 0; i < N; i++) {
for (j = 0; j < i+1; j++) {
image[k][j][i] = j+i+k;
}
}
}
}
int main (void)
{
check_vect ();
int i, j, k;
foo ();
for (k=0; k<N; k++) {
for (i = 0; i < N; i++) {
for (j = 0; j < i+1; j++) {
if (image[k][j][i] != j+i+k)
abort ();
}
}
}
return 0;
}
/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 0 "vect" } } */
/* { dg-final { cleanup-tree-dump "vect" } } */
/* { dg-require-effective-target vect_float } */
#include <stdarg.h>
#include "tree-vect.h"
#define N 40
float image[N][N] __attribute__ ((__aligned__(16)));
float out[N];
/* Outer-loop vectorization. */
void
foo (){
int i,j;
float diff;
for (i = 0; i < N; i++) {
diff = 0;
for (j = 0; j < N; j++) {
diff += image[j][i];
}
out[i]=diff;
}
}
int main (void)
{
check_vect ();
int i, j;
float diff;
for (i = 0; i < N; i++) {
for (j = 0; j < N; j++) {
image[i][j]=i+j;
}
}
foo ();
for (i = 0; i < N; i++) {
diff = 0;
for (j = 0; j < N; j++) {
diff += image[j][i];
}
if (out[i] != diff)
abort ();
}
return 0;
}
/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" } } */
/* { dg-final { cleanup-tree-dump "vect" } } */
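
As a rough illustration of what vectorizing the outer loop means for the nest in the test above, the sketch below handles VF consecutive values of i together while the inner j loop still runs sequentially. It is a hand-written emulation, not compiler output. VF = 4 and the function names sum_columns_scalar and sum_columns_outer_vec are assumptions made for this example.

```c
#define N 40
#define VF 4  /* assumed vectorization factor: four floats per 16-byte vector */

/* Scalar reference: the same loop nest as foo () in the test above. */
void sum_columns_scalar (float image[N][N], float out[N])
{
  for (int i = 0; i < N; i++) {
    float diff = 0;
    for (int j = 0; j < N; j++)
      diff += image[j][i];
    out[i] = diff;
  }
}

/* Emulated outer-loop vectorization: each "lane" is one outer iteration.
   The inner loop keeps its trip count; every inner iteration now updates
   VF accumulators using VF consecutive elements image[j][i..i+VF-1].  */
void sum_columns_outer_vec (float image[N][N], float out[N])
{
  for (int i = 0; i < N; i += VF) {       /* N is a multiple of VF here */
    float vdiff[VF] = { 0 };              /* vector accumulator */
    for (int j = 0; j < N; j++)
      for (int lane = 0; lane < VF; lane++)
        vdiff[lane] += image[j][i + lane];
    for (int lane = 0; lane < VF; lane++) /* one vector store per group */
      out[i + lane] = vdiff[lane];
  }
}
```

Each lane accumulates in the same j order as the scalar loop, so both functions produce identical results, which is why no reduction reordering is involved when the outer loop is the one being vectorized.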
/* { dg-require-effective-target vect_float } */
#include <stdarg.h>
#include "tree-vect.h"
#define N 40
float image[N][N+1] __attribute__ ((__aligned__(16)));
float out[N];
/* Outer-loop vectorization with misaligned accesses in the inner-loop. */
void
foo (){
int i,j;
float diff;
for (i = 0; i < N; i++) {
diff = 0;
for (j = 0; j < N; j++) {
diff += image[j][i];
}
out[i]=diff;
}
}
int main (void)
{
check_vect ();
int i, j;
float diff;
for (i = 0; i < N; i++) {
for (j = 0; j < N; j++) {
image[i][j]=i+j;
}
}
foo ();
for (i = 0; i < N; i++) {
diff = 0;
for (j = 0; j < N; j++) {
diff += image[j][i];
}
if (out[i] != diff)
abort ();
}
return 0;
}
/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" { xfail vect_no_align } } } */
/* { dg-final { scan-tree-dump-times "step doesn't divide the vector-size" 2 "vect" } } */
/* { dg-final { cleanup-tree-dump "vect" } } */
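
The ChangeLog above distinguishes dr_explicit_realign from dr_explicit_realign_optimized. The sketch below is one way to picture the difference, assuming the usual description of these schemes: the explicit scheme performs two aligned loads for every access and combines them, while the optimized scheme reuses the second aligned load of one iteration as the first of the next, which only works when the misalignment stays fixed across iterations. Indices stand in for addresses, a "vector" is VF floats, and all function names are hypothetical. The optimized variant in this sketch assumes a misaligned start (idx % VF != 0).

```c
#include <string.h>

#define VF 4  /* assumed number of floats per vector */

/* Emulates an aligned vector load: reads the aligned block containing idx. */
static void aligned_load (const float *base, int idx, float v[VF])
{
  int start = idx - idx % VF;             /* floor to a vector boundary */
  for (int l = 0; l < VF; l++)
    v[l] = base[start + l];
}

/* Emulates a realign operation: selects VF consecutive elements, starting
   at position shift, from the concatenation of v1 and v2.  */
static void realign (const float v1[VF], const float v2[VF], int shift,
                     float out[VF])
{
  for (int l = 0; l < VF; l++) {
    int k = shift + l;
    out[l] = k < VF ? v1[k] : v2[k - VF];
  }
}

/* Explicit scheme: both aligned loads and the shift are computed at the
   access itself, so the misalignment may differ from one access to the next.  */
void load_explicit_realign (const float *base, int idx, float out[VF])
{
  float v1[VF], v2[VF];
  aligned_load (base, idx, v1);
  aligned_load (base, idx + VF - 1, v2);
  realign (v1, v2, idx % VF, out);
}

/* Optimized scheme: the first load and the shift are hoisted out of the
   loop, and v2 is carried over as the next v1.  Requires idx % VF != 0.  */
void copy_realign_optimized (const float *base, int idx, int nvec, float *dst)
{
  float v1[VF], v2[VF], out[VF];
  int shift = idx % VF;                   /* loop-invariant misalignment */
  aligned_load (base, idx, v1);           /* done once, before the loop */
  for (int n = 0; n < nvec; n++) {
    aligned_load (base, idx + n * VF + VF - 1, v2);
    realign (v1, v2, shift, out);
    memcpy (dst + n * VF, out, sizeof out);
    memcpy (v1, v2, sizeof v1);           /* reuse instead of reloading */
  }
}
```

In the test above each row of image holds N+1 floats, so the misalignment of image[j][i] presumably changes with j in the inner loop, which is the situation where the ChangeLog says the explicit scheme is used instead of the optimized one.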
/* { dg-require-effective-target vect_float } */
#include <stdarg.h>
#include "tree-vect.h"
#define N 40
float image[N][N] __attribute__ ((__aligned__(16)));
float out[N];
/* Outer-loop vectorization with non-consecutive access. Not vectorized yet. */
void
foo (){
int i,j;
float diff;
for (i = 0; i < N/2; i++) {
diff = 0;
for (j = 0; j < N; j++) {
diff += image[j][2*i];
}
out[i]=diff;
}
}
int main (void)
{
check_vect ();
int i, j;
float diff;
for (i = 0; i < N; i++) {
for (j = 0; j < N; j++) {
image[i][j]=i+j;
}
}
foo ();
for (i = 0; i < N/2; i++) {
diff = 0;
for (j = 0; j < N; j++) {
diff += image[j][2*i];
}
if (out[i] != diff)
abort ();
}
return 0;
}
/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" { xfail *-*-* } } } */
/* { dg-final { scan-tree-dump-times "strided access in outer loop" 2 "vect" } } */
/* { dg-final { cleanup-tree-dump "vect" } } */
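
The test above expects a "strided access in outer loop" message. A small sketch of what that stride is: the byte distance between the addresses touched by two consecutive outer iterations, with the inner index held fixed. The helper names are hypothetical, and this only measures the distance; it does not reproduce the vectorizer's own analysis.

```c
#include <stddef.h>

#define N 40

static float image[N][N];

/* Outer-loop step of image[j][i]: consecutive i values touch adjacent
   elements, so the step equals the element size.  */
ptrdiff_t outer_step_consecutive (int i, int j)
{
  return (char *) &image[j][i + 1] - (char *) &image[j][i];
}

/* Outer-loop step of image[j][2*i]: consecutive i values skip one element,
   so the step is twice the element size (a strided access).  */
ptrdiff_t outer_step_strided (int i, int j)
{
  return (char *) &image[j][2 * (i + 1)] - (char *) &image[j][2 * i];
}
```

With a step larger than the element size, the VF elements needed by one vector no longer sit next to each other in memory, which matches the test comment that this case is not vectorized yet.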
/* { dg-require-effective-target vect_float } */
#include <stdarg.h>
#include "tree-vect.h"
#define N 40
float image[N][N+1] __attribute__ ((__aligned__(16)));
float out[N];
/* Outer-loop vectorization. */
void
foo (){
int i,j;
float diff;
for (i = 0; i < N; i++) {
diff = 0;
for (j = 0; j < N; j+=4) {
diff += image[j][i];
}
out[i]=diff;
}
}
int main (void)
{
check_vect ();
int i, j;
float diff;
for (i = 0; i < N; i++) {
for (j = 0; j < N; j++) {
image[i][j]=i+j;
}
}
foo ();
for (i = 0; i < N; i++) {
diff = 0;
for (j = 0; j < N; j+=4) {
diff += image[j][i];
}
if (out[i] != diff)
abort ();
}
return 0;
}
/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" } } */
/* { dg-final { cleanup-tree-dump "vect" } } */
/* { dg-require-effective-target vect_float } */
#include <stdarg.h>
#include "tree-vect.h"
#define N 40
#define M 128
float in[N+M];
float coeff[M];
float out[N];
/* Outer-loop vectorization. */
void
foo (){
int i,j;
float diff;
for (i = 0; i < N; i++) {
diff = 0;
for (j = 0; j < M; j+=4) {
diff += in[j+i]*coeff[j];
}
out[i]=diff;
}
}
int main (void)
{
check_vect ();
int i, j;
float diff;
for (i = 0; i < M; i++)
coeff[i] = i;
for (i = 0; i < N+M; i++)
in[i] = i;
foo ();
for (i = 0; i < N; i++) {
diff = 0;
for (j = 0; j < M; j+=4) {
diff += in[j+i]*coeff[j];
}
if (out[i] != diff)
abort ();
}
return 0;
}
/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" } } */
/* { dg-final { scan-tree-dump-times "zero step in outer loop." 1 "vect" } } */
/* { dg-final { cleanup-tree-dump "vect" } } */
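
The "zero step in outer loop." message scanned for above relates to coeff[j], whose address does not depend on the outer index i. A hedged sketch of how such a load can be treated when the outer loop is vectorized: it is loaded once per inner iteration and the same value is used in every lane, while in[j+i] supplies VF consecutive elements. VF = 4 and the function names are assumptions for this example; this is an emulation, not the code GCC generates.

```c
#define N 40
#define M 128
#define VF 4  /* assumed vectorization factor */

/* Scalar reference: same loop nest as foo () in the test above. */
void fir4_scalar (const float *in, const float *coeff, float *out)
{
  for (int i = 0; i < N; i++) {
    float diff = 0;
    for (int j = 0; j < M; j += 4)
      diff += in[j + i] * coeff[j];
    out[i] = diff;
  }
}

/* Emulated outer-loop vectorization with an outer-invariant load. */
void fir4_outer_vec (const float *in, const float *coeff, float *out)
{
  for (int i = 0; i < N; i += VF) {       /* N is a multiple of VF here */
    float vdiff[VF] = { 0 };
    for (int j = 0; j < M; j += 4) {
      float c = coeff[j];                 /* zero step w.r.t. i: load once */
      for (int lane = 0; lane < VF; lane++)
        vdiff[lane] += in[j + i + lane] * c;  /* same c in every lane */
    }
    for (int lane = 0; lane < VF; lane++)
      out[i + lane] = vdiff[lane];
  }
}
```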
/* { dg-do compile } */
#define N 40
#define M 128
signed short in[N+M];
signed short coeff[M];
signed short out[N];
/* Outer-loop vectorization.
Currently not vectorized because of multiple-data-types in the inner-loop. */
void
foo (){
int i,j;
int diff;
for (i = 0; i < N; i++) {
diff = 0;
for (j = 0; j < M; j+=8) {
diff += in[j+i]*coeff[j];
}
out[i]=diff;
}
}
/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" { xfail *-*-* } } } */
/* FORNOW. not vectorized until we support 0-stride accesses like coeff[j]. should be:
{ scan-tree-dump-not "multiple types in nested loop." "vect" { xfail *-*-* } } } */
/* { dg-final { scan-tree-dump-times "zero step in outer loop." 1 "vect" } } */
/* { dg-final { cleanup-tree-dump "vect" } } */
/* { dg-do compile } */
#define N 40
#define M 128
signed short in[N+M];
signed short coeff[M];
int out[N];
/* Outer-loop vectorization.
Currently not vectorized because of multiple-data-types in the inner-loop. */
void
foo (){
int i,j;
int diff;
for (i = 0; i < N; i++) {
diff = 0;
for (j = 0; j < M; j+=8) {
diff += in[j+i]*coeff[j];
}
out[i]=diff;
}
}
/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" { xfail *-*-* } } } */
/* FORNOW. not vectorized until we support 0-stride accesses like coeff[j]. should be:
{ scan-tree-dump-not "multiple types in nested loop." "vect" { xfail *-*-* } } } */
/* { dg-final { scan-tree-dump-times "zero step in outer loop." 1 "vect" } } */
/* { dg-final { cleanup-tree-dump "vect" } } */
/* { dg-do compile } */
#define N 40
#define M 128
unsigned short in[N+M];
unsigned short coeff[M];
unsigned int out[N];
/* Outer-loop vectorization. */
void
foo (){
int i,j;
unsigned short diff;
for (i = 0; i < N; i++) {
diff = 0;
for (j = 0; j < M; j+=8) {
diff += in[j+i]*coeff[j];
}
out[i]=diff;
}
}
/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" { target vect_short_mult } } } */
/* { dg-final { scan-tree-dump-times "zero step in outer loop." 1 "vect" } } */
/* { dg-final { cleanup-tree-dump "vect" } } */
/* { dg-require-effective-target vect_float } */
#include <stdarg.h>
#include "tree-vect.h"
#define N 40
#define M 128
float in[N+M];
float out[N];
/* Outer-loop vectorization. */
void
foo (){
int i,j;
float diff;
for (i = 0; i < N; i++) {
diff = 0;
for (j = 0; j < M; j+=4) {
diff += in[j+i];
}
out[i]=diff;
}
}
int main (void)
{
check_vect ();
int i, j;
float diff;
for (i = 0; i < N; i++)
in[i] = i;
foo ();
for (i = 0; i < N; i++) {
diff = 0;
for (j = 0; j < M; j+=4) {
diff += in[j+i];
}
if (out[i] != diff)
abort ();
}
return 0;
}
/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" } } */
/* { dg-final { cleanup-tree-dump "vect" } } */
/* { dg-do compile } */
#define N 40
#define M 128
unsigned int in[N+M];
unsigned short out[N];
/* Outer-loop vectorization. */
void
foo (){
int i,j;
unsigned int diff;
for (i = 0; i < N; i++) {
diff = 0;
for (j = 0; j < M; j+=8) {
diff += in[j+i];
}
out[i]=(unsigned short)diff;
}
return;
}
/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" { xfail *-*-* } } } */
/* { dg-final { cleanup-tree-dump "vect" } } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include "tree-vect.h"
#define N 40
#define M 128
unsigned short in[N+M];
unsigned int out[N];
unsigned char arr[N];
/* Outer-loop vectorization. */
/* Not vectorized due to multiple-types in the inner-loop. */
unsigned int
foo (){
int i,j;
unsigned int diff;
unsigned int s=0;
for (i = 0; i < N; i++) {
arr[i] = 3;
diff = 0;
for (j = 0; j < M; j+=8) {
diff += in[j+i];
}
s+=diff;
}
return s;
}
unsigned int
bar (int i, unsigned int diff, unsigned short *in)
{
int j;
for (j = 0; j < M; j+=8) {
diff += in[j+i];
}
return diff;
}
int main (void)
{
int i, j;
unsigned int diff;
unsigned int s=0,sum=0;
check_vect ();
for (i = 0; i < N+M; i++) {
in[i] = i;
}
sum=foo ();
for (i = 0; i < N; i++) {
arr[i] = 3;
diff = 0;
diff = bar (i, diff, in);
s += diff;
}
if (s != sum)
abort ();
return 0;
}
/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" { xfail *-*-* } } } */
/* { dg-final { scan-tree-dump-times "vect_recog_widen_sum_pattern: not allowed" 1 "vect" } } */
/* { dg-final { cleanup-tree-dump "vect" } } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include "tree-vect.h"
#define N 40
#define M 128
unsigned short in[N+M];
unsigned int out[N];
unsigned char arr[N];
/* Outer-loop vectorization. */
/* Not vectorized due to multiple-types in the inner-loop. */
unsigned int
foo (){
int i,j;
unsigned int diff;
unsigned int s=0;
for (i = 0; i < N; i++) {
arr[i] = 3;
diff = 0;
for (j = 0; j < M; j+=8) {
diff += in[j+i];
}
s+=diff;
}
return s;
}
unsigned int
bar (int i, unsigned int diff, unsigned short *in)
{
int j;
for (j = 0; j < M; j+=8) {
diff += in[j+i];
}
return diff;
}
int main (void)
{
int i, j;
unsigned int diff;
unsigned int s=0,sum=0;
check_vect ();
for (i = 0; i < N+M; i++) {
in[i] = i;
}
sum=foo ();
for (i = 0; i < N; i++) {
arr[i] = 3;
diff = 0;
diff = bar (i, diff, in);
s += diff;
}
if (s != sum)
abort ();
return 0;
}
/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" { xfail *-*-* } } } */
/* { dg-final { scan-tree-dump-times "vect_recog_widen_sum_pattern: not allowed" 1 "vect" } } */
/* { dg-final { cleanup-tree-dump "vect" } } */
/* { dg-do compile } */
#define N 40
#define M 128
unsigned char in[N+M];
unsigned short out[N];
/* Outer-loop vectorization. */
/* Not vectorized due to multiple-types in the inner-loop. */
unsigned short
foo (){
int i,j;
unsigned short diff;
unsigned short s=0;
for (i = 0; i < N; i++) {
diff = 0;
for (j = 0; j < M; j+=8) {
diff += in[j+i];
}
s+=diff;
}
return s;
}
/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" { xfail *-*-* } } } */
/* { dg-final { cleanup-tree-dump "vect" } } */
/* { dg-do compile } */
#define N 40
#define M 128
unsigned char in[N+M];
unsigned short out[N];
/* Outer-loop vectorization. */
/* Not vectorized due to multiple-types in the inner-loop. */
void
foo (){
int i,j;
unsigned short diff;
for (i = 0; i < N; i++) {
diff = 0;
for (j = 0; j < M; j+=8) {
diff += in[j+i];
}
out[i]=diff;
}
}
/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" { xfail *-*-* } } } */
/* { dg-final { cleanup-tree-dump "vect" } } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include "tree-vect.h"
#define N 40
#define M 128
unsigned short in[N+M];
unsigned int out[N];
unsigned char arr[N];
/* Outer-loop vectorization. */
/* Not vectorized due to multiple-types in the inner-loop. */
unsigned int
foo (){
int i,j;
unsigned int diff;
unsigned int s=0;
for (i = 0; i < N; i++) {
arr[i] = 3;
diff = 0;
for (j = 0; j < M; j+=8) {
diff += in[j+i];
}
s+=diff;
}
return s;
}
unsigned int
bar (int i, unsigned int diff, unsigned short *in)
{
int j;
for (j = 0; j < M; j+=8) {
diff += in[j+i];
}
return diff;
}
int main (void)
{
int i, j;
unsigned int diff;
unsigned int s=0,sum=0;
check_vect ();
for (i = 0; i < N+M; i++) {
in[i] = i;
}
sum=foo ();
for (i = 0; i < N; i++) {
arr[i] = 3;
diff = 0;
diff = bar (i, diff, in);
s += diff;
}
if (s != sum)
abort ();
return 0;
}
/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" { xfail *-*-* } } } */
/* { dg-final { scan-tree-dump-times "vect_recog_widen_sum_pattern: not allowed" 1 "vect" } } */
/* { dg-final { cleanup-tree-dump "vect" } } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include "tree-vect.h"
#define N 40
#define M 128
unsigned short in[N+M];
unsigned int out[N];
unsigned char arr[N];
/* Outer-loop vectorization. */
/* Not vectorized due to multiple-types in the inner-loop. */
unsigned int
foo (){
int i,j;
unsigned int diff;
unsigned int s=0;
for (i = 0; i < N; i++) {
arr[i] = 3;
diff = 0;
for (j = 0; j < M; j+=8) {
diff += in[j+i];
}
s+=diff;
}
return s;
}
unsigned int
bar (int i, unsigned int diff, unsigned short *in)
{
int j;
for (j = 0; j < M; j+=8) {
diff += in[j+i];
}
return diff;
}
int main (void)
{
int i, j;
unsigned int diff;
unsigned int s=0,sum=0;
check_vect ();
for (i = 0; i < N+M; i++) {
in[i] = i;
}
sum=foo ();
for (i = 0; i < N; i++) {
arr[i] = 3;
diff = 0;
diff = bar (i, diff, in);
s += diff;
}
if (s != sum)
abort ();
return 0;
}
/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" { xfail *-*-* } } } */
/* { dg-final { scan-tree-dump-times "vect_recog_widen_sum_pattern: not allowed" 1 "vect" } } */
/* { dg-final { cleanup-tree-dump "vect" } } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include "tree-vect.h"
#define N 40
#define M 128
unsigned short in[N+M];
unsigned int out[N];
/* Outer-loop vectorization. */
/* Not vectorized due to multiple-types in the inner-loop. */
unsigned int
foo (){
int i,j;
unsigned int diff;
unsigned int s=0;
for (i = 0; i < N; i++) {
diff = 0;
for (j = 0; j < M; j+=8) {
diff += in[j+i];
}
s+=((unsigned short)diff>>3);
}
return s;
}
int main (void)
{
int i, j;
unsigned int diff;
unsigned int s=0,sum=0;
check_vect ();
for (i = 0; i < N+M; i++) {
in[i] = i;
}
sum=foo ();
for (i = 0; i < N; i++) {
diff = 0;
for (j = 0; j < M; j+=8) {
diff += in[j+i];
}
s += ((unsigned short)diff>>3);
}
if (s != sum)
abort ();
return 0;
}
/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" { xfail *-*-* } } } */
/* { dg-final { cleanup-tree-dump "vect" } } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include <signal.h>
#include "tree-vect.h"
#define N 64
#define MAX 42
extern void abort(void);
int main1 ()
{
float A[N] __attribute__ ((__aligned__(16)));
float B[N] __attribute__ ((__aligned__(16)));
float C[N] __attribute__ ((__aligned__(16)));
float D[N] __attribute__ ((__aligned__(16)));
float s;
int i, j;
for (i = 0; i < N; i++)
{
A[i] = i;
B[i] = i;
C[i] = i;
D[i] = i;
}
/* Outer-loop 1: Vectorizable with respect to dependence distance. */
for (i = 0; i < N-20; i++)
{
s = 0;
for (j=0; j<N; j+=4)
s += C[j];
A[i] = A[i+20] + s;
}
/* check results: */
for (i = 0; i < N-20; i++)
{
s = 0;
for (j=0; j<N; j+=4)
s += C[j];
if (A[i] != D[i+20] + s)
abort ();
}
/* Outer-loop 2: Not vectorizable because of dependence distance. */
for (i = 0; i < 4; i++)
{
s = 0;
for (j=0; j<N; j+=4)
s += C[j];
B[i] = B[i+3] + s;
}
/* check results: */
for (i = 0; i < 4; i++)
{
s = 0;
for (j=0; j<N; j+=4)
s += C[j];
if (B[i] != D[i+3] + s)
abort ();
}
return 0;
}
int main ()
{
check_vect ();
return main1();
}
/* NOTE: We temporarily xfail the following check until versioning for
aliasing is fixed to avoid versioning when the dependence distance
is known. */
/* { dg-final { scan-tree-dump-times "not vectorized: possible dependence between data-refs" 1 "vect" { xfail *-*-* } } } */
/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" } } */
/* { dg-final { scan-tree-dump-times "zero step in outer loop." 1 "vect" } } */
/* { dg-final { cleanup-tree-dump "vect" } } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include <signal.h>
#include "tree-vect.h"
#define N 64
#define MAX 42
float A[N] __attribute__ ((__aligned__(16)));
float B[N] __attribute__ ((__aligned__(16)));
float C[N] __attribute__ ((__aligned__(16)));
float D[N] __attribute__ ((__aligned__(16)));
extern void abort(void);
int main1 ()
{
float s;
int i, j;
for (i = 0; i < 8; i++)
{
s = 0;
for (j=0; j<8; j+=4)
s += C[j];
A[i] = s;
}
return 0;
}
int main ()
{
int i,j;
float s;
check_vect ();
for (i = 0; i < N; i++)
{
A[i] = i;
B[i] = i;
C[i] = i;
D[i] = i;
}
main1();
/* check results: */
for (i = 0; i < 8; i++)
{
s = 0;
for (j=0; j<8; j+=4)
s += C[j];
if (A[i] != s)
abort ();
}
return 0;
}
/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" } } */
/* { dg-final { scan-tree-dump-times "zero step in outer loop." 1 "vect" } } */
/* { dg-final { cleanup-tree-dump "vect" } } */
/* { dg-require-effective-target vect_float } */
#include <stdarg.h>
#include "tree-vect.h"
#define N 40
#define M 64
float in[N+M];
float coeff[M];
float out[N];
float fir_out[N];
/* Should be vectorized. Fixed misalignment in the inner-loop. */
/* Currently not vectorized because the loop-count for the inner-loop
has a maybe_zero component. Will be fixed when we incorporate the
"cond_expr in rhs" patch. */
void foo (){
int i,j,k;
float diff;
for (i = 0; i < N; i++) {
out[i] = 0;
}
for (k = 0; k < 4; k++) {
for (i = 0; i < N; i++) {
diff = 0;
j = k;
do {
diff += in[j+i]*coeff[j];
j+=4;
} while (j < M);
out[i] += diff;
}
}
}
/* Vectorized. Changing misalignment in the inner-loop. */
void fir (){
int i,j,k;
float diff;
for (i = 0; i < N; i++) {
diff = 0;
for (j = 0; j < M; j++) {
diff += in[j+i]*coeff[j];
}
fir_out[i] = diff;
}
}
int main (void)
{
check_vect ();
int i, j;
float diff;
for (i = 0; i < M; i++)
coeff[i] = i;
for (i = 0; i < N+M; i++)
in[i] = i;
foo ();
fir ();
for (i = 0; i < N; i++) {
if (out[i] != fir_out[i])
abort ();
}
return 0;
}
/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 2 "vect" { xfail *-*-* } } } */
/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" { xfail vect_no_align } } } */
/* { dg-final { cleanup-tree-dump "vect" } } */
/* { dg-require-effective-target vect_float } */
#include <stdarg.h>
#include "tree-vect.h"
#define N 40
#define M 128
float in[N+M];
float coeff[M];
float out[N];
float fir_out[N];
/* Should be vectorized. Fixed misalignment in the inner-loop. */
/* Currently not vectorized because we get too many BBs in the inner-loop,
because the compiler doesn't realize that the inner-loop executes at
least once (because k<4), and so there's no need to create guard code
to skip the inner-loop in case it doesn't execute. */
void foo (){
int i,j,k;
float diff;
for (i = 0; i < N; i++) {
out[i] = 0;
}
for (k = 0; k < 4; k++) {
for (i = 0; i < N; i++) {
diff = 0;
for (j = k; j < M; j+=4) {
diff += in[j+i]*coeff[j];
}
out[i] += diff;
}
}
}
/* Vectorized. Changing misalignment in the inner-loop. */
void fir (){
int i,j,k;
float diff;
for (i = 0; i < N; i++) {
diff = 0;
for (j = 0; j < M; j++) {
diff += in[j+i]*coeff[j];
}
fir_out[i] = diff;
}
}
int main (void)
{
check_vect ();
int i, j;
float diff;
for (i = 0; i < M; i++)
coeff[i] = i;
for (i = 0; i < N+M; i++)
in[i] = i;
foo ();
fir ();
for (i = 0; i < N; i++) {
if (out[i] != fir_out[i])
abort ();
}
return 0;
}
/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 2 "vect" { xfail *-*-* } } } */
/* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" { xfail vect_no_align } } } */
/* { dg-final { cleanup-tree-dump "vect" } } */
......@@ -489,7 +489,7 @@ dump_ddrs (FILE *file, VEC (ddr_p, heap) *ddrs)
/* Expresses EXP as VAR + OFF, where OFF is a constant. The type of OFF
will be ssizetype. */
-static void
+void
split_constant_offset (tree exp, tree *var, tree *off)
{
tree type = TREE_TYPE (exp), otype;
......
......@@ -388,4 +388,7 @@ index_in_loop_nest (int var, VEC (loop_p, heap) *loop_nest)
/* In lambda-code.c */
bool lambda_transform_legal_p (lambda_trans_matrix, int, VEC (ddr_p, heap) *);
/* In tree-data-refs.c */
void split_constant_offset (tree , tree *, tree *);
#endif /* GCC_TREE_DATA_REF_H */
......@@ -1345,6 +1345,13 @@ new_stmt_vec_info (tree stmt, loop_vec_info loop_vinfo)
STMT_VINFO_IN_PATTERN_P (res) = false;
STMT_VINFO_RELATED_STMT (res) = NULL;
STMT_VINFO_DATA_REF (res) = NULL;
STMT_VINFO_DR_BASE_ADDRESS (res) = NULL;
STMT_VINFO_DR_OFFSET (res) = NULL;
STMT_VINFO_DR_INIT (res) = NULL;
STMT_VINFO_DR_STEP (res) = NULL;
STMT_VINFO_DR_ALIGNED_TO (res) = NULL;
if (TREE_CODE (stmt) == PHI_NODE && is_loop_header_bb_p (bb_for_stmt (stmt)))
STMT_VINFO_DEF_TYPE (res) = vect_unknown_def_type;
else
......@@ -1655,21 +1662,103 @@ get_vectype_for_scalar_type (tree scalar_type)
enum dr_alignment_support
vect_supportable_dr_alignment (struct data_reference *dr)
{
-tree vectype = STMT_VINFO_VECTYPE (vinfo_for_stmt (DR_STMT (dr)));
+tree stmt = DR_STMT (dr);
+stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
+tree vectype = STMT_VINFO_VECTYPE (stmt_info);
enum machine_mode mode = (int) TYPE_MODE (vectype);
struct loop *vect_loop = LOOP_VINFO_LOOP (STMT_VINFO_LOOP_VINFO (stmt_info));
bool nested_in_vect_loop = nested_in_vect_loop_p (vect_loop, stmt);
bool invariant_in_outerloop = false;
if (aligned_access_p (dr))
return dr_aligned;
if (nested_in_vect_loop)
{
tree outerloop_step = STMT_VINFO_DR_STEP (stmt_info);
invariant_in_outerloop =
(tree_int_cst_compare (outerloop_step, size_zero_node) == 0);
}
/* Possibly unaligned access. */
/* We can choose between using the implicit realignment scheme (generating
a misaligned_move stmt) and the explicit realignment scheme (generating
aligned loads with a REALIGN_LOAD). There are two variants to the explicit
realignment scheme: optimized, and unoptimized.
We can optimize the realignment only if the step between consecutive
vector loads is equal to the vector size. Since the vector memory
accesses advance in steps of VS (Vector Size) in the vectorized loop, it
is guaranteed that the misalignment amount remains the same throughout the
execution of the vectorized loop. Therefore, we can create the
"realignment token" (the permutation mask that is passed to REALIGN_LOAD)
at the loop preheader.
However, in the case of outer-loop vectorization, when vectorizing a
memory access in the inner-loop nested within the LOOP that is now being
vectorized, while it is guaranteed that the misalignment of the
vectorized memory access will remain the same in different outer-loop
iterations, it is *not* guaranteed that it will remain the same throughout
the execution of the inner-loop. This is because the inner-loop advances
with the original scalar step (and not in steps of VS). If the inner-loop
step happens to be a multiple of VS, then the misalignment remains fixed
and we can use the optimized realignment scheme. For example:
for (i=0; i<N; i++)
for (j=0; j<M; j++)
s += a[i+j];
When vectorizing the i-loop in the above example, the step between
consecutive vector loads is 1, and so the misalignment does not remain
fixed across the execution of the inner-loop, and the realignment cannot
be optimized (as illustrated in the following pseudo vectorized loop):
for (i=0; i<N; i+=4)
for (j=0; j<M; j++){
vs += vp[i+j]; // misalignment of &vp[i+j] is {0,1,2,3,0,1,2,3,...}
// when j is {0,1,2,3,4,5,6,7,...} respectively.
// (assuming that we start from an aligned address).
}
We therefore have to use the unoptimized realignment scheme:
for (i=0; i<N; i+=4)
for (j=k; j<M; j+=4)
vs += vp[i+j]; // misalignment of &vp[i+j] is always k (assuming
// that the misalignment of the initial address is
// 0).
The loop can then be vectorized as follows:
for (k=0; k<4; k++){
rt = get_realignment_token (&vp[k]);
for (i=0; i<N; i+=4){
v1 = vp[i+k];
for (j=k; j<M; j+=4){
v2 = vp[i+j+VS-1];
va = REALIGN_LOAD <v1,v2,rt>;
vs += va;
v1 = v2;
}
}
} */
if (DR_IS_READ (dr))
{
-if (optab_handler (vec_realign_load_optab, mode)->insn_code != CODE_FOR_nothing
+if (optab_handler (vec_realign_load_optab, mode)->insn_code !=
+CODE_FOR_nothing
&& (!targetm.vectorize.builtin_mask_for_load
|| targetm.vectorize.builtin_mask_for_load ()))
-return dr_unaligned_software_pipeline;
+{
+if (nested_in_vect_loop
+&& TREE_INT_CST_LOW (DR_STEP (dr)) != UNITS_PER_SIMD_WORD)
+return dr_explicit_realign;
+else
+return dr_explicit_realign_optimized;
+}
-if (optab_handler (movmisalign_optab, mode)->insn_code != CODE_FOR_nothing)
+if (optab_handler (movmisalign_optab, mode)->insn_code !=
+CODE_FOR_nothing)
/* Can't software pipeline the loads, but can at least do them. */
return dr_unaligned_supported;
}
......
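The realignment comment in vect_supportable_dr_alignment rests on one arithmetic fact: whether the misalignment of &vp[i+j] stays the same while the inner-loop runs. The following is a minimal standalone sketch of that check, not GCC code. The helpers misalignment_of and misalignment_is_fixed are hypothetical names, misalignment is counted in elements rather than bytes, and element 0 is assumed to sit on a vector-aligned address.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of GCC): misalignment, in elements, of
   element index IDX relative to a vector of VS_ELEMS elements, assuming
   element 0 is vector-aligned.  */
static size_t
misalignment_of (size_t idx, size_t vs_elems)
{
  assert (vs_elems > 0);
  return idx % vs_elems;
}

/* Hypothetical helper: return 1 if the accesses a[i + j], for
   j = start, start + step, ... while j < m, all share one misalignment,
   and 0 otherwise.  An empty iteration space counts as fixed.  */
static int
misalignment_is_fixed (size_t i, size_t start, size_t step, size_t m,
                       size_t vs_elems)
{
  size_t first, j;

  assert (step > 0);
  if (start >= m)
    return 1;

  first = misalignment_of (i + start, vs_elems);
  for (j = start; j < m; j += step)
    if (misalignment_of (i + j, vs_elems) != first)
      return 0;
  return 1;
}
```

With four elements per vector, misalignment_is_fixed (0, 0, 1, 64, 4) returns 0, matching the {0,1,2,3,0,1,2,3,...} pattern in the comment, while misalignment_is_fixed (0, k, 4, 64, 4) returns 1 for any k below 4, matching the "always k" case in which the realignment token can be computed once outside the inner-loop.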
......@@ -53,7 +53,8 @@ enum operation_type {
enum dr_alignment_support {
dr_unaligned_unsupported,
dr_unaligned_supported,
-dr_unaligned_software_pipeline,
+dr_explicit_realign,
+dr_explicit_realign_optimized,
dr_aligned
};
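To show how the new enum values are selected, here is a minimal standalone model of the decision made in vect_supportable_dr_alignment, as far as it is visible in this diff. It is a sketch, not the GCC implementation. The struct access_model and the function model_supportable_alignment are hypothetical, the boolean fields stand in for the optab and target-hook queries, and the final fall-through to dr_unaligned_unsupported is an assumption, because that part of the function is elided from the hunk.

```c
#include <assert.h>

/* Mirrors the enum as it reads after this commit.  */
enum dr_alignment_support {
  dr_unaligned_unsupported,
  dr_unaligned_supported,
  dr_explicit_realign,
  dr_explicit_realign_optimized,
  dr_aligned
};

/* Hypothetical stand-in for the data-ref and target queries.  */
struct access_model {
  int is_aligned;          /* aligned_access_p (dr) */
  int is_read;             /* DR_IS_READ (dr) */
  int has_realign_load;    /* vec_realign_load available, mask builtin usable */
  int has_movmisalign;     /* movmisalign available */
  int nested_in_vect_loop; /* access sits in the inner-loop of the vectorized loop */
  long step_bytes;         /* models DR_STEP (dr) */
  long vector_bytes;       /* models UNITS_PER_SIMD_WORD */
};

static enum dr_alignment_support
model_supportable_alignment (const struct access_model *a)
{
  assert (a != 0);

  if (a->is_aligned)
    return dr_aligned;

  if (a->is_read)
    {
      if (a->has_realign_load)
        {
          /* In the inner-loop the misalignment stays fixed only when the
             step equals the vector size; otherwise the realignment token
             cannot be hoisted and the unoptimized scheme is chosen.  */
          if (a->nested_in_vect_loop && a->step_bytes != a->vector_bytes)
            return dr_explicit_realign;
          return dr_explicit_realign_optimized;
        }
      if (a->has_movmisalign)
        return dr_unaligned_supported;
    }

  /* Assumed fall-through; this part of the function is not shown in the diff.  */
  return dr_unaligned_unsupported;
}
```

For a misaligned float load in the inner-loop with a 4-byte step and a 16-byte vector, the model returns dr_explicit_realign. With a 16-byte step it returns dr_explicit_realign_optimized.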
......@@ -249,9 +250,18 @@ typedef struct _stmt_vec_info {
data-ref (array/pointer/struct access). A GIMPLE stmt is expected to have
at most one such data-ref. **/
-/* Information about the data-ref (access function, etc). */
+/* Information about the data-ref (access function, etc),
+relative to the inner-most containing loop. */
struct data_reference *data_ref_info;
/* Information about the data-ref relative to this loop
nest (the loop that is being considered for vectorization). */
tree dr_base_address;
tree dr_init;
tree dr_offset;
tree dr_step;
tree dr_aligned_to;
/* Stmt is part of some pattern (computation idiom) */
bool in_pattern_p;
......@@ -310,6 +320,13 @@ typedef struct _stmt_vec_info {
#define STMT_VINFO_VECTYPE(S) (S)->vectype
#define STMT_VINFO_VEC_STMT(S) (S)->vectorized_stmt
#define STMT_VINFO_DATA_REF(S) (S)->data_ref_info
#define STMT_VINFO_DR_BASE_ADDRESS(S) (S)->dr_base_address
#define STMT_VINFO_DR_INIT(S) (S)->dr_init
#define STMT_VINFO_DR_OFFSET(S) (S)->dr_offset
#define STMT_VINFO_DR_STEP(S) (S)->dr_step
#define STMT_VINFO_DR_ALIGNED_TO(S) (S)->dr_aligned_to
#define STMT_VINFO_IN_PATTERN_P(S) (S)->in_pattern_p
#define STMT_VINFO_RELATED_STMT(S) (S)->related_stmt
#define STMT_VINFO_SAME_ALIGN_REFS(S) (S)->same_align_refs
......