Commit 5d593372, authored and committed by Ira Rosen

tree-vectorizer.c (supportable_widening_operation): Support multi-step conversion...

	* tree-vectorizer.c (supportable_widening_operation): Support
	multi-step conversion, return the number of steps in such conversion
	and the required intermediate types.
	(supportable_narrowing_operation): Likewise.
	* tree-vectorizer.h (vect_pow2): New function.
	(supportable_widening_operation): Change argument types.
	(supportable_narrowing_operation): Likewise.
	(vectorizable_type_promotion): Add an argument.
	(vectorizable_type_demotion): Likewise.
	* tree-vect-analyze.c (vect_analyze_operations): Call 
	vectorizable_type_promotion and vectorizable_type_demotion with
	additional argument.
	(vect_get_and_check_slp_defs): Detect patterns.
	(vect_build_slp_tree): Add an argument, don't fail in case of multiple
	types. 
	(vect_analyze_slp_instance): Don't fail in case of multiple types. Call
	vect_build_slp_tree with correct arguments. Calculate unrolling factor
	according to the smallest type in the loop.
	(vect_detect_hybrid_slp_stmts): Include statements from patterns.
	* tree-vect-patterns.c (vect_recog_widen_mult_pattern): Call 
	supportable_widening_operation with correct arguments. 
	* tree-vect-transform.c (vect_get_slp_defs): Allocate output vector
	operands lists according to the number of vector statements in left
	or right node, if it exists.
	(vect_gen_widened_results_half): Remove unused argument.
	(vectorizable_conversion): Call supportable_widening_operation, 
	supportable_narrowing_operation, and vect_gen_widened_results_half
	with correct arguments. 
	(vectorizable_assignment): Change documentation, support multiple
	types in SLP. 
	(vectorizable_operation): Likewise.
	(vect_get_loop_based_defs): New function.
	(vect_create_vectorized_demotion_stmts): Likewise.
	(vectorizable_type_demotion): Support loop-aware SLP and general
	multi-step conversion. Call vect_get_loop_based_defs and
	vect_create_vectorized_demotion_stmts for transformation.
	(vect_create_vectorized_promotion_stmts): New function.
	(vectorizable_type_promotion): Support loop-aware SLP and general
	multi-step conversion. Call vect_create_vectorized_promotion_stmts
	for transformation.	
	(vectorizable_store): Change documentation, support multiple
	types in SLP. 
	(vectorizable_load): Likewise.
	(vect_transform_stmt): Pass SLP_NODE to 
	vectorizable_type_promotion and vectorizable_type_demotion.
	(vect_schedule_slp_instance): Move here the calculation of number
	of vectorized statements for each node from...
	(vect_schedule_slp): ... here.
	(vect_transform_loop): Call vect_schedule_slp without the last
	argument.

From-SVN: r139225
parent 45ea82c1
2008-08-19 Ira Rosen <irar@il.ibm.com>
* tree-vectorizer.c (supportable_widening_operation): Support
multi-step conversion, return the number of steps in such conversion
and the required intermediate types.
(supportable_narrowing_operation): Likewise.
* tree-vectorizer.h (vect_pow2): New function.
(supportable_widening_operation): Change argument types.
(supportable_narrowing_operation): Likewise.
(vectorizable_type_promotion): Add an argument.
(vectorizable_type_demotion): Likewise.
* tree-vect-analyze.c (vect_analyze_operations): Call
vectorizable_type_promotion and vectorizable_type_demotion with
additional argument.
(vect_get_and_check_slp_defs): Detect patterns.
(vect_build_slp_tree): Add an argument, don't fail in case of multiple
types.
(vect_analyze_slp_instance): Don't fail in case of multiple types. Call
vect_build_slp_tree with correct arguments. Calculate unrolling factor
according to the smallest type in the loop.
(vect_detect_hybrid_slp_stmts): Include statements from patterns.
* tree-vect-patterns.c (vect_recog_widen_mult_pattern): Call
supportable_widening_operation with correct arguments.
* tree-vect-transform.c (vect_get_slp_defs): Allocate output vector
operands lists according to the number of vector statements in left
or right node, if it exists.
(vect_gen_widened_results_half): Remove unused argument.
(vectorizable_conversion): Call supportable_widening_operation,
supportable_narrowing_operation, and vect_gen_widened_results_half
with correct arguments.
(vectorizable_assignment): Change documentation, support multiple
types in SLP.
(vectorizable_operation): Likewise.
(vect_get_loop_based_defs): New function.
(vect_create_vectorized_demotion_stmts): Likewise.
(vectorizable_type_demotion): Support loop-aware SLP and general
multi-step conversion. Call vect_get_loop_based_defs and
vect_create_vectorized_demotion_stmts for transformation.
(vect_create_vectorized_promotion_stmts): New function.
(vectorizable_type_promotion): Support loop-aware SLP and general
multi-step conversion. Call vect_create_vectorized_promotion_stmts
for transformation.
(vectorizable_store): Change documentation, support multiple
types in SLP.
(vectorizable_load): Likewise.
(vect_transform_stmt): Pass SLP_NODE to
vectorizable_type_promotion and vectorizable_type_demotion.
(vect_schedule_slp_instance): Move here the calculation of number
of vectorized statements for each node from...
(vect_schedule_slp): ... here.
(vect_transform_loop): Call vect_schedule_slp without the last
argument.
2008-08-19 Dorit Nuzman <dorit@il.ibm.com>
PR bootstrap/37152
......
2008-08-19 Ira Rosen <irar@il.ibm.com>
* gcc.dg/vect/slp-multitypes-1.c: New testcase.
* gcc.dg/vect/slp-multitypes-2.c, gcc.dg/vect/slp-multitypes-3.c,
gcc.dg/vect/slp-multitypes-4.c, gcc.dg/vect/slp-multitypes-5.c,
gcc.dg/vect/slp-multitypes-6.c, gcc.dg/vect/slp-multitypes-7.c,
gcc.dg/vect/slp-multitypes-8.c, gcc.dg/vect/slp-multitypes-9.c,
gcc.dg/vect/slp-multitypes-10.c, gcc.dg/vect/slp-multitypes-11.c,
gcc.dg/vect/slp-multitypes-12.c, gcc.dg/vect/slp-widen-mult-u8.c,
gcc.dg/vect/slp-widen-mult-s16.c, gcc.dg/vect/vect-multitypes-16.c,
gcc.dg/vect/vect-multitypes-17.c: Likewise.
* gcc.dg/vect/slp-9.c: Now vectorizable using SLP.
* gcc.dg/vect/slp-14.c, gcc.dg/vect/slp-5.c: Likewise.
* lib/target-supports.exp (check_effective_target_vect_long_long): New.
2008-08-18 Adam Nemet <anemet@caviumnetworks.com>
* gcc.target/mips/ext-1.c: Add -mgp64 to dg-mips-options.
......
......@@ -15,7 +15,7 @@ main1 (int n)
unsigned short in2[N*16] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63};
unsigned short out2[N*16];
/* Multiple types are not SLPable yet. */
/* Multiple types are now SLPable. */
for (i = 0; i < n; i++)
{
a0 = in[i*8] + 5;
......@@ -110,9 +110,7 @@ int main (void)
return 0;
}
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { vect_strided && vect_int_mult } } } } */
/* { dg-final { scan-tree-dump-times "vectorized 0 loops" 1 "vect" {target { ! { vect_strided && vect_int_mult } } } } } */
/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" { xfail *-*-* } } } */
/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 0 "vect" } } */
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target vect_int_mult } } } */
/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" { target vect_int_mult } } } */
/* { dg-final { cleanup-tree-dump "vect" } } */
......@@ -15,7 +15,7 @@ main1 ()
unsigned short ia[N];
unsigned int ib[N*2];
/* Not SLPable for now: multiple types with SLP of the smaller type. */
/* Multiple types with SLP of the smaller type. */
for (i = 0; i < N; i++)
{
out[i*8] = in[i*8];
......@@ -121,8 +121,7 @@ int main (void)
return 0;
}
/* { dg-final { scan-tree-dump-times "vectorized 3 loops" 1 "vect" { target { vect_strided_wide } } } } */
/* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect" { target { ! { vect_strided_wide } } } } } */
/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" } } */
/* { dg-final { scan-tree-dump-times "vectorized 3 loops" 1 "vect" } } */
/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 3 "vect" } } */
/* { dg-final { cleanup-tree-dump "vect" } } */
......@@ -41,7 +41,7 @@ int main (void)
return 0;
}
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { vect_strided && vect_widen_mult_hi_to_si } } } }*/
/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 0 "vect" } } */
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target vect_widen_mult_hi_to_si } } }*/
/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { target vect_widen_mult_hi_to_si } } } */
/* { dg-final { cleanup-tree-dump "vect" } } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include <stdio.h>
#include "tree-vect.h"
#define N 128
__attribute__ ((noinline)) int
main1 ()
{
int i;
unsigned short sout[N*8];
unsigned int iout[N*8];
for (i = 0; i < N; i++)
{
sout[i*4] = 8;
sout[i*4 + 1] = 18;
sout[i*4 + 2] = 28;
sout[i*4 + 3] = 38;
iout[i*4] = 8;
iout[i*4 + 1] = 18;
iout[i*4 + 2] = 28;
iout[i*4 + 3] = 38;
}
/* check results: */
for (i = 0; i < N; i++)
{
if (sout[i*4] != 8
|| sout[i*4 + 1] != 18
|| sout[i*4 + 2] != 28
|| sout[i*4 + 3] != 38
|| iout[i*4] != 8
|| iout[i*4 + 1] != 18
|| iout[i*4 + 2] != 28
|| iout[i*4 + 3] != 38)
abort ();
}
return 0;
}
int main (void)
{
check_vect ();
main1 ();
return 0;
}
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" } } */
/* { dg-final { cleanup-tree-dump "vect" } } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include <stdio.h>
#include "tree-vect.h"
#define N 8
unsigned int in[N*8] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63};
struct s
{
unsigned char a;
unsigned char b;
};
__attribute__ ((noinline)) int
main1 ()
{
int i;
struct s out[N*4];
for (i = 0; i < N*4; i++)
{
out[i].a = (unsigned char) in[i*2] + 1;
out[i].b = (unsigned char) in[i*2 + 1] + 2;
}
/* check results: */
for (i = 0; i < N*4; i++)
{
if (out[i].a != (unsigned char) in[i*2] + 1
|| out[i].b != (unsigned char) in[i*2 + 1] + 2)
abort ();
}
return 0;
}
int main (void)
{
check_vect ();
main1 ();
return 0;
}
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target vect_pack_trunc } } } */
/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { target vect_pack_trunc } } } */
/* { dg-final { cleanup-tree-dump "vect" } } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include <stdio.h>
#include "tree-vect.h"
#define N 18
struct s
{
int a;
int b;
int c;
};
char in[N*3] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53};
__attribute__ ((noinline)) int
main1 ()
{
int i;
struct s out[N];
for (i = 0; i < N; i++)
{
out[i].a = (int) in[i*3] + 1;
out[i].b = (int) in[i*3 + 1] + 2;
out[i].c = (int) in[i*3 + 2] + 3;
}
/* check results: */
for (i = 0; i < N; i++)
{
if (out[i].a != (int) in[i*3] + 1
|| out[i].b != (int) in[i*3 + 1] + 2
|| out[i].c != (int) in[i*3 + 2] + 3)
abort ();
}
return 0;
}
int main (void)
{
check_vect ();
main1 ();
return 0;
}
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target vect_unpack } } } */
/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { target vect_unpack } } } */
/* { dg-final { cleanup-tree-dump "vect" } } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include <stdio.h>
#include "tree-vect.h"
#define N 128
__attribute__ ((noinline)) int
main1 ()
{
int i;
unsigned short sout[N*8];
unsigned int iout[N*8];
unsigned char cout[N*8];
for (i = 0; i < N; i++)
{
sout[i*4] = 8;
sout[i*4 + 1] = 18;
sout[i*4 + 2] = 28;
sout[i*4 + 3] = 38;
iout[i*4] = 8;
iout[i*4 + 1] = 18;
iout[i*4 + 2] = 28;
iout[i*4 + 3] = 38;
cout[i*4] = 1;
cout[i*4 + 1] = 2;
cout[i*4 + 2] = 3;
cout[i*4 + 3] = 4;
}
/* check results: */
for (i = 0; i < N; i++)
{
if (sout[i*4] != 8
|| sout[i*4 + 1] != 18
|| sout[i*4 + 2] != 28
|| sout[i*4 + 3] != 38
|| iout[i*4] != 8
|| iout[i*4 + 1] != 18
|| iout[i*4 + 2] != 28
|| iout[i*4 + 3] != 38
|| cout[i*4] != 1
|| cout[i*4 + 1] != 2
|| cout[i*4 + 2] != 3
|| cout[i*4 + 3] != 4)
abort ();
}
return 0;
}
int main (void)
{
check_vect ();
main1 ();
return 0;
}
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 3 "vect" } } */
/* { dg-final { cleanup-tree-dump "vect" } } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include <stdio.h>
#include "tree-vect.h"
#define N 128
__attribute__ ((noinline)) int
main1 (unsigned short a0, unsigned short a1, unsigned short a2,
unsigned short a3, unsigned short a4, unsigned short a5,
unsigned short a6, unsigned short a7, unsigned short a8,
unsigned short a9, unsigned short a10, unsigned short a11,
unsigned short a12, unsigned short a13, unsigned short a14,
unsigned short a15, unsigned char b0, unsigned char b1)
{
int i;
unsigned short out[N*16];
unsigned char out2[N*16];
for (i = 0; i < N; i++)
{
out[i*16] = a8;
out[i*16 + 1] = a7;
out[i*16 + 2] = a1;
out[i*16 + 3] = a2;
out[i*16 + 4] = a8;
out[i*16 + 5] = a5;
out[i*16 + 6] = a5;
out[i*16 + 7] = a4;
out[i*16 + 8] = a12;
out[i*16 + 9] = a13;
out[i*16 + 10] = a14;
out[i*16 + 11] = a15;
out[i*16 + 12] = a6;
out[i*16 + 13] = a9;
out[i*16 + 14] = a0;
out[i*16 + 15] = a7;
out2[i*2] = b1;
out2[i*2+1] = b0;
}
/* check results: */
for (i = 0; i < N; i++)
{
if (out[i*16] != a8
|| out[i*16 + 1] != a7
|| out[i*16 + 2] != a1
|| out[i*16 + 3] != a2
|| out[i*16 + 4] != a8
|| out[i*16 + 5] != a5
|| out[i*16 + 6] != a5
|| out[i*16 + 7] != a4
|| out[i*16 + 8] != a12
|| out[i*16 + 9] != a13
|| out[i*16 + 10] != a14
|| out[i*16 + 11] != a15
|| out[i*16 + 12] != a6
|| out[i*16 + 13] != a9
|| out[i*16 + 14] != a0
|| out[i*16 + 15] != a7
|| out2[i*2] != b1
|| out2[i*2 + 1] != b0)
abort ();
}
return 0;
}
int main (void)
{
check_vect ();
main1 (15,14,13,12,11,10,9,8,7,6,5,4,3,2,1,0,20,21);
return 0;
}
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" } } */
/* { dg-final { cleanup-tree-dump "vect" } } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include <stdio.h>
#include "tree-vect.h"
#define N 8
unsigned int in[N*8] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63};
unsigned char in2[N*8] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63};
__attribute__ ((noinline)) int
main1 ()
{
int i;
unsigned int out[N*8];
unsigned char out2[N*8];
for (i = 0; i < N/2; i++)
{
out[i*8] = in[i*8] + 5;
out[i*8 + 1] = in[i*8 + 1] + 6;
out[i*8 + 2] = in[i*8 + 2] + 7;
out[i*8 + 3] = in[i*8 + 3] + 8;
out[i*8 + 4] = in[i*8 + 4] + 9;
out[i*8 + 5] = in[i*8 + 5] + 10;
out[i*8 + 6] = in[i*8 + 6] + 11;
out[i*8 + 7] = in[i*8 + 7] + 12;
out2[i*16] = in2[i*16] + 2;
out2[i*16 + 1] = in2[i*16 + 1] + 3;
out2[i*16 + 2] = in2[i*16 + 2] + 4;
out2[i*16 + 3] = in2[i*16 + 3] + 3;
out2[i*16 + 4] = in2[i*16 + 4] + 2;
out2[i*16 + 5] = in2[i*16 + 5] + 3;
out2[i*16 + 6] = in2[i*16 + 6] + 2;
out2[i*16 + 7] = in2[i*16 + 7] + 4;
out2[i*16 + 8] = in2[i*16 + 8] + 2;
out2[i*16 + 9] = in2[i*16 + 9] + 5;
out2[i*16 + 10] = in2[i*16 + 10] + 2;
out2[i*16 + 11] = in2[i*16 + 11] + 3;
out2[i*16 + 12] = in2[i*16 + 12] + 4;
out2[i*16 + 13] = in2[i*16 + 13] + 4;
out2[i*16 + 14] = in2[i*16 + 14] + 3;
out2[i*16 + 15] = in2[i*16 + 15] + 2;
}
/* check results: */
for (i = 0; i < N/2; i++)
{
if (out[i*8] != in[i*8] + 5
|| out[i*8 + 1] != in[i*8 + 1] + 6
|| out[i*8 + 2] != in[i*8 + 2] + 7
|| out[i*8 + 3] != in[i*8 + 3] + 8
|| out[i*8 + 4] != in[i*8 + 4] + 9
|| out[i*8 + 5] != in[i*8 + 5] + 10
|| out[i*8 + 6] != in[i*8 + 6] + 11
|| out[i*8 + 7] != in[i*8 + 7] + 12
|| out2[i*16] != in2[i*16] + 2
|| out2[i*16 + 1] != in2[i*16 + 1] + 3
|| out2[i*16 + 2] != in2[i*16 + 2] + 4
|| out2[i*16 + 3] != in2[i*16 + 3] + 3
|| out2[i*16 + 4] != in2[i*16 + 4] + 2
|| out2[i*16 + 5] != in2[i*16 + 5] + 3
|| out2[i*16 + 6] != in2[i*16 + 6] + 2
|| out2[i*16 + 7] != in2[i*16 + 7] + 4
|| out2[i*16 + 8] != in2[i*16 + 8] + 2
|| out2[i*16 + 9] != in2[i*16 + 9] + 5
|| out2[i*16 + 10] != in2[i*16 + 10] + 2
|| out2[i*16 + 11] != in2[i*16 + 11] + 3
|| out2[i*16 + 12] != in2[i*16 + 12] + 4
|| out2[i*16 + 13] != in2[i*16 + 13] + 4
|| out2[i*16 + 14] != in2[i*16 + 14] + 3
|| out2[i*16 + 15] != in2[i*16 + 15] + 2)
abort ();
}
return 0;
}
int main (void)
{
check_vect ();
main1 ();
return 0;
}
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" } } */
/* { dg-final { cleanup-tree-dump "vect" } } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include <stdio.h>
#include "tree-vect.h"
#define N 8
short in[N*8] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63};
__attribute__ ((noinline)) int
main1 ()
{
int i;
int out[N*8];
for (i = 0; i < N; i++)
{
out[i*8] = (int) in[i*8] + 1;
out[i*8 + 1] = (int) in[i*8 + 1] + 2;
out[i*8 + 2] = (int) in[i*8 + 2] + 3;
out[i*8 + 3] = (int) in[i*8 + 3] + 4;
out[i*8 + 4] = (int) in[i*8 + 4] + 5;
out[i*8 + 5] = (int) in[i*8 + 5] + 6;
out[i*8 + 6] = (int) in[i*8 + 6] + 7;
out[i*8 + 7] = (int) in[i*8 + 7] + 8;
}
/* check results: */
for (i = 0; i < N; i++)
{
if (out[i*8] != (int) in[i*8] + 1
|| out[i*8 + 1] != (int) in[i*8 + 1] + 2
|| out[i*8 + 2] != (int) in[i*8 + 2] + 3
|| out[i*8 + 3] != (int) in[i*8 + 3] + 4
|| out[i*8 + 4] != (int) in[i*8 + 4] + 5
|| out[i*8 + 5] != (int) in[i*8 + 5] + 6
|| out[i*8 + 6] != (int) in[i*8 + 6] + 7
|| out[i*8 + 7] != (int) in[i*8 + 7] + 8)
abort ();
}
return 0;
}
int main (void)
{
check_vect ();
main1 ();
return 0;
}
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target vect_unpack } } } */
/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { target vect_unpack } } } */
/* { dg-final { cleanup-tree-dump "vect" } } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include <stdio.h>
#include "tree-vect.h"
#define N 8
short in[N*8] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63};
__attribute__ ((noinline)) int
main1 ()
{
int i;
int out[N*8];
for (i = 0; i < N; i++)
{
out[i*8] = (short) in[i*8] + 1;
out[i*8 + 1] = (short) in[i*8 + 1] + 2;
out[i*8 + 2] = (short) in[i*8 + 2] + 3;
out[i*8 + 3] = (short) in[i*8 + 3] + 4;
out[i*8 + 4] = (short) in[i*8 + 4] + 5;
out[i*8 + 5] = (short) in[i*8 + 5] + 6;
out[i*8 + 6] = (short) in[i*8 + 6] + 7;
out[i*8 + 7] = (short) in[i*8 + 7] + 8;
}
/* check results: */
for (i = 0; i < N; i++)
{
if (out[i*8] != (short) in[i*8] + 1
|| out[i*8 + 1] != (short) in[i*8 + 1] + 2
|| out[i*8 + 2] != (short) in[i*8 + 2] + 3
|| out[i*8 + 3] != (short) in[i*8 + 3] + 4
|| out[i*8 + 4] != (short) in[i*8 + 4] + 5
|| out[i*8 + 5] != (short) in[i*8 + 5] + 6
|| out[i*8 + 6] != (short) in[i*8 + 6] + 7
|| out[i*8 + 7] != (short) in[i*8 + 7] + 8)
abort ();
}
return 0;
}
int main (void)
{
check_vect ();
main1 ();
return 0;
}
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target vect_pack_trunc } } } */
/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { target vect_pack_trunc } } } */
/* { dg-final { cleanup-tree-dump "vect" } } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include <stdio.h>
#include "tree-vect.h"
#define N 8
unsigned int in[N*8] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63};
__attribute__ ((noinline)) int
main1 ()
{
int i;
unsigned char out[N*8];
for (i = 0; i < N; i++)
{
out[i*8] = (unsigned char) in[i*8] + 1;
out[i*8 + 1] = (unsigned char) in[i*8 + 1] + 2;
out[i*8 + 2] = (unsigned char) in[i*8 + 2] + 3;
out[i*8 + 3] = (unsigned char) in[i*8 + 3] + 4;
out[i*8 + 4] = (unsigned char) in[i*8 + 4] + 5;
out[i*8 + 5] = (unsigned char) in[i*8 + 5] + 6;
out[i*8 + 6] = (unsigned char) in[i*8 + 6] + 7;
out[i*8 + 7] = (unsigned char) in[i*8 + 7] + 8;
}
/* check results: */
for (i = 0; i < N; i++)
{
if (out[i*8] != (unsigned char) in[i*8] + 1
|| out[i*8 + 1] != (unsigned char) in[i*8 + 1] + 2
|| out[i*8 + 2] != (unsigned char) in[i*8 + 2] + 3
|| out[i*8 + 3] != (unsigned char) in[i*8 + 3] + 4
|| out[i*8 + 4] != (unsigned char) in[i*8 + 4] + 5
|| out[i*8 + 5] != (unsigned char) in[i*8 + 5] + 6
|| out[i*8 + 6] != (unsigned char) in[i*8 + 6] + 7
|| out[i*8 + 7] != (unsigned char) in[i*8 + 7] + 8)
abort ();
}
return 0;
}
int main (void)
{
check_vect ();
main1 ();
return 0;
}
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target vect_pack_trunc } } } */
/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { target vect_pack_trunc } } } */
/* { dg-final { cleanup-tree-dump "vect" } } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include <stdio.h>
#include "tree-vect.h"
#define N 8
char in[N*8] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63};
__attribute__ ((noinline)) int
main1 ()
{
int i;
int out[N*8];
for (i = 0; i < N; i++)
{
out[i*8] = (int) in[i*8] + 1;
out[i*8 + 1] = (int) in[i*8 + 1] + 2;
out[i*8 + 2] = (int) in[i*8 + 2] + 3;
out[i*8 + 3] = (int) in[i*8 + 3] + 4;
out[i*8 + 4] = (int) in[i*8 + 4] + 5;
out[i*8 + 5] = (int) in[i*8 + 5] + 6;
out[i*8 + 6] = (int) in[i*8 + 6] + 7;
out[i*8 + 7] = (int) in[i*8 + 7] + 8;
}
/* check results: */
for (i = 0; i < N; i++)
{
if (out[i*8] != (int) in[i*8] + 1
|| out[i*8 + 1] != (int) in[i*8 + 1] + 2
|| out[i*8 + 2] != (int) in[i*8 + 2] + 3
|| out[i*8 + 3] != (int) in[i*8 + 3] + 4
|| out[i*8 + 4] != (int) in[i*8 + 4] + 5
|| out[i*8 + 5] != (int) in[i*8 + 5] + 6
|| out[i*8 + 6] != (int) in[i*8 + 6] + 7
|| out[i*8 + 7] != (int) in[i*8 + 7] + 8)
abort ();
}
return 0;
}
int main (void)
{
check_vect ();
main1 ();
return 0;
}
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target vect_unpack } } } */
/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { target vect_unpack } } } */
/* { dg-final { cleanup-tree-dump "vect" } } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include <stdio.h>
#include "tree-vect.h"
#define N 8
char in[N*8] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63};
__attribute__ ((noinline)) int
main1 ()
{
int i;
int out[N*8];
for (i = 0; i < N*4; i++)
{
out[i*2] = (int) in[i*2] + 1;
out[i*2 + 1] = (int) in[i*2 + 1] + 2;
}
/* check results: */
for (i = 0; i < N*4; i++)
{
if (out[i*2] != (int) in[i*2] + 1
|| out[i*2 + 1] != (int) in[i*2 + 1] + 2)
abort ();
}
return 0;
}
int main (void)
{
check_vect ();
main1 ();
return 0;
}
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target vect_unpack } } } */
/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { target vect_unpack } } } */
/* { dg-final { cleanup-tree-dump "vect" } } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include <stdio.h>
#include "tree-vect.h"
#define N 8
unsigned int in[N*8] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63};
__attribute__ ((noinline)) int
main1 ()
{
int i;
unsigned char out[N*8];
for (i = 0; i < N*4; i++)
{
out[i*2] = (unsigned char) in[i*2] + 1;
out[i*2 + 1] = (unsigned char) in[i*2 + 1] + 2;
}
/* check results: */
for (i = 0; i < N*4; i++)
{
if (out[i*2] != (unsigned char) in[i*2] + 1
|| out[i*2 + 1] != (unsigned char) in[i*2 + 1] + 2)
abort ();
}
return 0;
}
int main (void)
{
check_vect ();
main1 ();
return 0;
}
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target vect_pack_trunc } } } */
/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { target vect_pack_trunc } } } */
/* { dg-final { cleanup-tree-dump "vect" } } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include "tree-vect.h"
#define N 64
short X[N] __attribute__ ((__aligned__(16)));
short Y[N] __attribute__ ((__aligned__(16)));
int result[N];
/* short->int widening-mult */
__attribute__ ((noinline)) int
foo1(int len) {
int i;
for (i=0; i<len/2; i++) {
result[2*i] = X[2*i] * Y[2*i];
result[2*i+1] = X[2*i+1] * Y[2*i+1];
}
}
int main (void)
{
int i;
check_vect ();
for (i=0; i<N; i++) {
X[i] = i;
Y[i] = 64-i;
}
foo1 (N);
for (i=0; i<N; i++) {
if (result[i] != X[i] * Y[i])
abort ();
}
return 0;
}
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { vect_widen_mult_hi_to_si || vect_unpack } } } } */
/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { target { vect_widen_mult_hi_to_si || vect_unpack } } } } */
/* { dg-final { cleanup-tree-dump "vect" } } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include "tree-vect.h"
#define N 64
unsigned char X[N] __attribute__ ((__aligned__(16)));
unsigned char Y[N] __attribute__ ((__aligned__(16)));
unsigned short result[N];
/* char->short widening-mult */
__attribute__ ((noinline)) int
foo1(int len) {
int i;
for (i=0; i<len/2; i++) {
result[2*i] = X[2*i] * Y[2*i];
result[2*i+1] = X[2*i+1] * Y[2*i+1];
}
}
int main (void)
{
int i;
check_vect ();
for (i=0; i<N; i++) {
X[i] = i;
Y[i] = 64-i;
}
foo1 (N);
for (i=0; i<N; i++) {
if (result[i] != X[i] * Y[i])
abort ();
}
return 0;
}
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { target { vect_widen_mult_qi_to_hi || vect_unpack } } } } */
/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { target { vect_widen_mult_hi_to_si || vect_unpack } } } } */
/* { dg-final { cleanup-tree-dump "vect" } } */
/* { dg-require-effective-target vect_long_long } */
#include <stdarg.h>
#include "tree-vect.h"
#define N 64
char x[N] __attribute__ ((__aligned__(16)));
__attribute__ ((noinline)) int
foo (int len, long long *z) {
int i;
for (i=0; i<len; i++) {
z[i] = x[i];
}
}
int main (void)
{
char i;
long long z[N+4];
check_vect ();
for (i=0; i<N; i++) {
x[i] = i;
}
foo (N,z+2);
for (i=0; i<N; i++) {
if (z[i+2] != x[i])
abort ();
}
return 0;
}
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { target vect_unpack } } } */
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { ! vect_unpack } } } } */
/* { dg-final { cleanup-tree-dump "vect" } } */
/* { dg-require-effective-target vect_long_long } */
#include <stdarg.h>
#include "tree-vect.h"
#define N 64
unsigned char uX[N] __attribute__ ((__aligned__(16)));
unsigned char uresultX[N];
unsigned long long uY[N] __attribute__ ((__aligned__(16)));
unsigned char uresultY[N];
/* Unsigned type demotion (di->qi) */
__attribute__ ((noinline)) int
foo1(int len) {
int i;
for (i=0; i<len; i++) {
uresultX[i] = uX[i];
uresultY[i] = (unsigned char)uY[i];
}
}
int main (void)
{
int i;
check_vect ();
for (i=0; i<N; i++) {
uX[i] = 16-i;
uY[i] = 16-i;
if (i%5 == 0)
uX[i] = 16-i;
}
foo1 (N);
for (i=0; i<N; i++) {
if (uresultX[i] != uX[i])
abort ();
if (uresultY[i] != (unsigned char)uY[i])
abort ();
}
return 0;
}
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target vect_pack_trunc } } } */
/* { dg-final { cleanup-tree-dump "vect" } } */
......@@ -1526,6 +1526,29 @@ proc check_effective_target_vect_double { } {
return $et_vect_double_saved
}
# Return 1 if the target supports hardware vectors of long long, 0 otherwise.
#
# This won't change for different subtargets so cache the result.

proc check_effective_target_vect_long_long { } {
    global et_vect_long_long_saved

    if [info exists et_vect_long_long_saved] {
        verbose "check_effective_target_vect_long_long: using cached result" 2
    } else {
        set et_vect_long_long_saved 0
        if { [istarget i?86-*-*]
             || [istarget x86_64-*-*]
             || [istarget spu-*-*] } {
            set et_vect_long_long_saved 1
        }
    }

    verbose "check_effective_target_vect_long_long: returning $et_vect_long_long_saved" 2
    return $et_vect_long_long_saved
}

# Return 1 if the target plus current options does not support a vector
# max instruction on "int", 0 otherwise.
#
......
......@@ -374,7 +374,8 @@ vect_recog_widen_mult_pattern (gimple last_stmt,
tree dummy;
tree var;
enum tree_code dummy_code;
bool dummy_bool;
int dummy_int;
VEC (tree, heap) *dummy_vec;
if (!is_gimple_assign (last_stmt))
return NULL;
......@@ -415,7 +416,7 @@ vect_recog_widen_mult_pattern (gimple last_stmt,
if (!vectype
|| !supportable_widening_operation (WIDEN_MULT_EXPR, last_stmt, vectype,
&dummy, &dummy, &dummy_code,
&dummy_code, &dummy_bool, &dummy))
&dummy_code, &dummy_int, &dummy_vec))
return NULL;
*type_in = vectype;
......
......@@ -522,6 +522,10 @@ typedef struct _stmt_vec_info {
#define TARG_VEC_STORE_COST 1
#endif
/* The maximum number of intermediate steps required in multi-step type
conversion. */
#define MAX_INTERM_CVT_STEPS 3
/* Avoid GTY(()) on stmt_vec_info. */
typedef void *vec_void_p;
DEF_VEC_P (vec_void_p);
......@@ -602,6 +606,16 @@ stmt_vinfo_set_outside_of_loop_cost (stmt_vec_info stmt_info, slp_tree slp_node,
STMT_VINFO_OUTSIDE_OF_LOOP_COST (stmt_info) = cost;
}
static inline int
vect_pow2 (int x)
{
  int i, res = 1;

  for (i = 0; i < x; i++)
    res *= 2;

  return res;
}

/*-----------------------------------------------------------------*/
/* Info on data references alignment. */
......@@ -671,9 +685,10 @@ extern enum dr_alignment_support vect_supportable_dr_alignment
(struct data_reference *);
extern bool reduction_code_for_scalar_code (enum tree_code, enum tree_code *);
extern bool supportable_widening_operation (enum tree_code, gimple, tree,
tree *, tree *, enum tree_code *, enum tree_code *, bool *, tree *);
tree *, tree *, enum tree_code *, enum tree_code *,
int *, VEC (tree, heap) **);
extern bool supportable_narrowing_operation (enum tree_code, const_gimple,
const_tree, enum tree_code *, bool *, tree *);
tree, enum tree_code *, int *, VEC (tree, heap) **);
/* Creation and deletion of loop and stmt info structs. */
extern loop_vec_info new_loop_vec_info (struct loop *loop);
......@@ -705,9 +720,9 @@ extern bool vectorizable_store (gimple, gimple_stmt_iterator *, gimple *,
extern bool vectorizable_operation (gimple, gimple_stmt_iterator *, gimple *,
slp_tree);
extern bool vectorizable_type_promotion (gimple, gimple_stmt_iterator *,
gimple *);
gimple *, slp_tree);
extern bool vectorizable_type_demotion (gimple, gimple_stmt_iterator *,
gimple *);
gimple *, slp_tree);
extern bool vectorizable_conversion (gimple, gimple_stmt_iterator *, gimple *,
slp_tree);
extern bool vectorizable_assignment (gimple, gimple_stmt_iterator *, gimple *,
......