Commit a70d6342 by Ira Rosen

passes.texi (Tree-SSA passes): Document SLP pass.


2009-05-24  Ira Rosen  <irar@il.ibm.com>

	* doc/passes.texi (Tree-SSA passes): Document SLP pass.
	* tree-pass.h (pass_slp_vectorize): New pass.
	* params.h (SLP_MAX_INSNS_IN_BB): Define.
	* timevar.def (TV_TREE_SLP_VECTORIZATION): Define.
	* tree-vectorizer.c (timevar.h): Include.
	(user_vect_verbosity_level): Declare.
	(vect_location): Fix comment.
	(vect_set_verbosity_level): Update user_vect_verbosity_level
	instead of vect_verbosity_level.
	(vect_set_dump_settings): Add an argument. Ignore user defined
	verbosity if dump flags require higher level of verbosity. Print to
	stderr only for loop vectorization.
	(vectorize_loops): Update call to vect_set_dump_settings.
	(execute_vect_slp): New function.
	(gate_vect_slp): Likewise.
	(struct gimple_opt_pass pass_slp_vectorize): New.
	* tree-vectorizer.h (struct _bb_vec_info): Define along macros to
	access its members.
	(vec_info_for_bb): New function.
	(struct _stmt_vec_info): Add bb_vinfo and a macro for its access.
	(VECTORIZATION_ENABLED): New macro.
	(SLP_ENABLED, SLP_DISABLED): Likewise.
	(vect_is_simple_use): Add bb_vec_info argument.
	(new_stmt_vec_info, vect_analyze_data_ref_dependences,
	vect_analyze_data_refs_alignment, vect_verify_datarefs_alignment,
	vect_analyze_data_ref_accesses, vect_analyze_data_refs,
	vect_schedule_slp, vect_analyze_slp): Likewise.
	(vect_analyze_stmt): Add slp_tree argument.
	(find_bb_location): Declare.
	(vect_slp_analyze_bb, vect_slp_transform_bb): Likewise.
	* tree-vect-loop.c (new_loop_vec_info): Adjust function calls.
	(vect_analyze_loop_operations, vect_analyze_loop,
	get_initial_def_for_induction, vect_create_epilog_for_reduction,
	vect_finalize_reduction, vectorizable_reduction,
	vectorizable_live_operation, vect_transform_loop): Likewise.
	* tree-data-ref.c (dr_analyze_innermost): Update comment,
	skip evolution analysis if analyzing a basic block.
	(dr_analyze_indices): Likewise.
	(initialize_data_dependence_relation): Skip the test whether the
	object is invariant for basic blocks.
	(compute_all_dependences): Skip dependence analysis for data
	references in basic blocks.
	(find_data_references_in_stmt): Don't fail in case of invariant
	access in basic block.
	(find_data_references_in_bb): New function.
	(find_data_references_in_loop): Move code to
	find_data_references_in_bb and add a call to it.
	(compute_data_dependences_for_bb): New function.
	* tree-data-ref.h (compute_data_dependences_for_bb): Declare.
	* tree-vect-data-refs.c (vect_check_interleaving): Adjust to the case
	that STEP is 0.
	(vect_analyze_data_ref_dependence): Check for interleaving in case of
	unknown dependence in basic block and fail in case of dependence in
	basic block.
	(vect_analyze_data_ref_dependences): Add bb_vinfo argument, get data
	dependence instances from either loop or basic block vectorization
	info.
	(vect_compute_data_ref_alignment): Check if it is loop vectorization
	before calling nested_in_vect_loop_p.
	(vect_compute_data_refs_alignment): Add bb_vinfo argument, get data
	dependence instances from either loop or basic block vectorization
	info.
	(vect_verify_datarefs_alignment): Likewise.
	(vect_enhance_data_refs_alignment): Adjust function calls.
	(vect_analyze_data_refs_alignment): Likewise.
	(vect_analyze_group_access): Fix printing. Skip different checks if
	DR_STEP is 0. Keep strided stores either in loop or basic block
	vectorization data structure. Fix indentation.
	(vect_analyze_data_ref_access): Fix comments, allow zero step in
	basic blocks.
	(vect_analyze_data_ref_accesses): Add bb_vinfo argument, get data
	dependence instances from either loop or basic block vectorization
	info.
	(vect_analyze_data_refs): Update comment. Call
	compute_data_dependences_for_bb to analyze basic blocks.
	(vect_create_addr_base_for_vector_ref): Check for outer loop only in
	case of loop vectorization. In case of basic block vectorization use
	data-ref itself as a base.
	(vect_create_data_ref_ptr): In case of basic block vectorization:
	don't advance the pointer, add new statements before the current
	statement.  Adjust function calls.
	(vect_supportable_dr_alignment): Support only aligned accesses in
	basic block vectorization.
	* common.opt (ftree-slp-vectorize): New flag.
	* tree-vect-patterns.c (widened_name_p): Adjust function calls.
	(vect_pattern_recog_1): Likewise.
	* tree-vect-stmts.c (process_use): Likewise.
	(vect_init_vector): Add new statements in the beginning of the basic
	block in case of basic block SLP.
	(vect_get_vec_def_for_operand): Adjust function calls.
	(vect_finish_stmt_generation): Likewise.
	(vectorizable_call): Add assert that it is loop vectorization, adjust
	function calls.
	(vectorizable_conversion, vectorizable_assignment): Likewise.
	(vectorizable_operation): In case of basic block SLP, take
	vectorization factor from statement's type and skip the relevance
	check. Adjust function calls.
	(vectorizable_type_demotion): Add assert that it is loop
	vectorization, adjust function calls.
	(vectorizable_type_promotion): Likewise.
	(vectorizable_store): Check for outer loop only in case of loop
	vectorization. Adjust function calls. For basic blocks, skip the
	relevance check and don't advance pointers.
	(vectorizable_load): Likewise.
	(vectorizable_condition): Add assert that it is loop vectorization,
	adjust function calls.
	(vect_analyze_stmt): Add argument. In case of basic block SLP, check
	that it is not reduction, get vector type, call only supported
	functions, skip loop specific parts.
	(vect_transform_stmt): Check for outer loop only in case of loop
	vectorization.
	(new_stmt_vec_info): Add new argument and initialize bb_vinfo.
	(vect_is_simple_use): Fix comment, add new argument, fix conditions
	for external definition.
	* passes.c (pass_slp_vectorize): New pass.
	* tree-vect-slp.c (find_bb_location): New function.
	(vect_get_and_check_slp_defs): Add argument, adjust function calls,
	check for patterns only in loops.
	(vect_build_slp_tree): Add argument, adjust function calls, fail in
	case of multiple types in basic block SLP.
	(vect_mark_slp_stmts_relevant): New function.
	(vect_supported_load_permutation_p): Fix comment.
	(vect_analyze_slp_instance): Add argument. In case of basic block
	SLP, take vectorization factor from statement's type, check that
	unrolling factor is 1. Adjust function call. Save SLP instance in
	either loop or basic block vectorization structure. Return FALSE
	if SLP failed.
	(vect_analyze_slp): Add argument. Get strided stores groups from
	either loop or basic block vectorization structure. Return FALSE
	if basic block SLP failed.
	(new_bb_vec_info): New function.
	(destroy_bb_vec_info, vect_slp_analyze_node_operations,
	vect_slp_analyze_operations, vect_slp_analyze_bb): Likewise.
	(vect_schedule_slp): Add argument. Get SLP instances from either
	loop or basic block vectorization structure. Set vectorization factor
	to be 1 for basic block SLP.
	(vect_slp_transform_bb): New function.
	* params.def (PARAM_SLP_MAX_INSNS_IN_BB): Define.

From-SVN: r147829
common.opt:

@@ -1330,6 +1330,10 @@ ftree-vectorize
 Common Report Var(flag_tree_vectorize) Optimization
 Enable loop vectorization on trees
 
+ftree-slp-vectorize
+Common Report Var(flag_tree_slp_vectorize) Init(2) Optimization
+Enable basic block vectorization (SLP) on trees
+
 fvect-cost-model
 Common Report Var(flag_vect_cost_model) Optimization
 Enable use of cost model in vectorization
doc/passes.texi:

@@ -438,11 +438,19 @@
 conceptually unrolled by a factor @code{VF} (vectorization factor), which is
 the number of elements operated upon in parallel in each iteration, and the
 @code{VF} copies of each scalar operation are fused to form a vector operation.
 Additional loop transformations such as peeling and versioning may take place
-to align the number of iterations, and to align the memory accesses in the loop.
-The pass is implemented in @file{tree-vectorizer.c} (the main driver and general
-utilities), @file{tree-vect-analyze.c} and @file{tree-vect-transform.c}.
+to align the number of iterations, and to align the memory accesses in the
+loop.
+The pass is implemented in @file{tree-vectorizer.c} (the main driver),
+@file{tree-vect-loop.c} and @file{tree-vect-loop-manip.c} (loop specific parts
+and general loop utilities), @file{tree-vect-slp} (loop-aware SLP
+functionality), @file{tree-vect-stmts.c} and @file{tree-vect-data-refs.c}.
 Analysis of data references is in @file{tree-data-ref.c}.
 
+SLP Vectorization.  This pass performs vectorization of straight-line code.  The
+pass is implemented in @file{tree-vectorizer.c} (the main driver),
+@file{tree-vect-slp.c}, @file{tree-vect-stmts.c} and
+@file{tree-vect-data-refs.c}.
+
 Autoparallelization.  This pass splits the loop iteration space to run
 into several threads.  The pass is implemented in @file{tree-parloops.c}.
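To make the new pass concrete, here is a small illustration (written in the style of the bb-slp tests added below, not code taken from the patch) of the straight-line code that basic block SLP targets; compiled with -O2 -ftree-slp-vectorize -fdump-tree-slp-details, a block like this is the kind the pass can report as "basic block vectorized using SLP" on targets with vector int support.

/* Illustrative only: four isomorphic loads feeding four adjacent stores
   in a single basic block, with no surrounding loop for the loop
   vectorizer to work on.  Array and function names are made up.  */
unsigned int in[4], out[4];

void
bb_copy4 (void)
{
  out[0] = in[0];
  out[1] = in[1];
  out[2] = in[2];
  out[3] = in[3];
}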
params.def:

@@ -765,6 +765,12 @@ DEFPARAM (PARAM_LOOP_INVARIANT_MAX_BBS_IN_LOOP,
	  "max basic blocks number in loop for loop invariant motion",
	  10000, 0, 0)
 
+/* Avoid SLP vectorization of large basic blocks.  */
+DEFPARAM (PARAM_SLP_MAX_INSNS_IN_BB,
+	  "slp-max-insns-in-bb",
+	  "Maximum number of instructions in basic block to be considered for SLP vectorization",
+	  1000, 0, 0)
+
 /*
 Local variables:
 mode:c
params.h:

@@ -170,4 +170,6 @@ typedef enum compiler_param
   PARAM_VALUE (PARAM_SWITCH_CONVERSION_BRANCH_RATIO)
 #define LOOP_INVARIANT_MAX_BBS_IN_LOOP \
   PARAM_VALUE (PARAM_LOOP_INVARIANT_MAX_BBS_IN_LOOP)
+#define SLP_MAX_INSNS_IN_BB \
+  PARAM_VALUE (PARAM_SLP_MAX_INSNS_IN_BB)
 #endif /* ! GCC_PARAMS_H */
passes.c:

@@ -662,6 +662,7 @@ init_optimization_passes (void)
	  NEXT_PASS (pass_dce_loop);
	}
       NEXT_PASS (pass_complete_unroll);
+      NEXT_PASS (pass_slp_vectorize);
       NEXT_PASS (pass_parallelize_loops);
       NEXT_PASS (pass_loop_prefetch);
       NEXT_PASS (pass_iv_optimize);
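The pass itself is declared in tree-pass.h and defined in tree-vectorizer.c (see the ChangeLog entries above for gate_vect_slp, execute_vect_slp, the "slp" dump file and TV_TREE_SLP_VECTORIZATION). A rough sketch of what such a gimple_opt_pass descriptor looks like in this GCC version follows; every field value not named in the ChangeLog is illustrative, not the committed definition.

/* Illustrative sketch only -- see tree-vectorizer.c in the patch for the
   real definition.  The names "slp", gate_vect_slp, execute_vect_slp and
   TV_TREE_SLP_VECTORIZATION come from this commit; the remaining fields
   are typical defaults.  Assumes gate_vect_slp and execute_vect_slp are
   defined earlier in the same file.  */
struct gimple_opt_pass pass_slp_vectorize =
{
 {
  GIMPLE_PASS,
  "slp",				/* name */
  gate_vect_slp,			/* gate */
  execute_vect_slp,			/* execute */
  NULL,					/* sub */
  NULL,					/* next */
  0,					/* static_pass_number */
  TV_TREE_SLP_VECTORIZATION,		/* tv_id */
  PROP_ssa | PROP_cfg,			/* properties_required */
  0,					/* properties_provided */
  0,					/* properties_destroyed */
  0,					/* todo_flags_start */
  TODO_ggc_collect | TODO_verify_ssa
    | TODO_update_ssa | TODO_dump_func	/* todo_flags_finish */
 }
};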
2009-05-24 Ira Rosen <irar@il.ibm.com>
* gcc.dg/vect/bb-slp-1.c: New test.
* gcc.dg/vect/bb-slp-2.c, gcc.dg/vect/bb-slp-3.c,
gcc.dg/vect/bb-slp-4.c, gcc.dg/vect/bb-slp-5.c,
gcc.dg/vect/bb-slp-6.c, gcc.dg/vect/bb-slp-7.c,
gcc.dg/vect/bb-slp-8.c, gcc.dg/vect/bb-slp-9.c,
gcc.dg/vect/bb-slp-10.c, gcc.dg/vect/bb-slp-11.c,
gcc.dg/vect/no-tree-reassoc-bb-slp-12.c, gcc.dg/vect/bb-slp-13.c,
gcc.dg/vect/bb-slp-14.c, gcc.dg/vect/bb-slp-15.c,
gcc.dg/vect/bb-slp-16.c, gcc.dg/vect/bb-slp-17.c,
gcc.dg/vect/bb-slp-18.c, gcc.dg/vect/bb-slp-19.c,
gcc.dg/vect/bb-slp-20.c, gcc.dg/vect/bb-slp-21.c,
gcc.dg/vect/bb-slp-22.c: Likewise.
* gcc.dg/vect/vect.exp: Run basic block SLP tests.
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include <stdio.h>
#include "tree-vect.h"
#define N 32
unsigned int out[N*8];
unsigned int in[N*8] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63, 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63};
__attribute__ ((noinline)) int
main1 (int dummy)
{
int i;
unsigned int *pin = &in[0];
unsigned int *pout = &out[0];
for (i = 0; i < N; i++)
{
*pout++ = *pin++;
*pout++ = *pin++;
*pout++ = *pin++;
*pout++ = *pin++;
*pout++ = *pin++;
*pout++ = *pin++;
*pout++ = *pin++;
*pout++ = *pin++;
/* Avoid loop vectorization. */
if (dummy == 32)
abort ();
}
/* check results: */
for (i = 0; i < N; i++)
{
if (out[i*8] != in[i*8]
|| out[i*8 + 1] != in[i*8 + 1]
|| out[i*8 + 2] != in[i*8 + 2]
|| out[i*8 + 3] != in[i*8 + 3]
|| out[i*8 + 4] != in[i*8 + 4]
|| out[i*8 + 5] != in[i*8 + 5]
|| out[i*8 + 6] != in[i*8 + 6]
|| out[i*8 + 7] != in[i*8 + 7])
abort ();
}
return 0;
}
int main (void)
{
check_vect ();
main1 (33);
return 0;
}
/* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 1 "slp" } } */
/* { dg-final { cleanup-tree-dump "slp" } } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include <stdio.h>
#include "tree-vect.h"
#define N 16
unsigned int out[N];
unsigned int in[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
__attribute__ ((noinline)) int
main1 (unsigned int x, unsigned int y)
{
int i;
unsigned int *pin = &in[0];
unsigned int *pout = &out[2];
unsigned int a0, a1, a2, a3;
/* Misaligned store. */
a0 = *pin++ + 23;
a1 = *pin++ + 142;
a2 = *pin++ + 2;
a3 = *pin++ + 31;
*pout++ = a0 * x;
*pout++ = a1 * y;
*pout++ = a2 * x;
*pout++ = a3 * y;
/* Check results. */
if (out[2] != (in[0] + 23) * x
|| out[3] != (in[1] + 142) * y
|| out[4] != (in[2] + 2) * x
|| out[5] != (in[3] + 31) * y)
abort();
return 0;
}
int main (void)
{
check_vect ();
main1 (2, 3);
return 0;
}
/* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 0 "slp" } } */
/* { dg-final { scan-tree-dump-times "unsupported alignment in basic block." 1 "slp" } } */
/* { dg-final { cleanup-tree-dump "slp" } } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include <stdio.h>
#include "tree-vect.h"
#define N 16
unsigned int out[N];
unsigned int in[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
__attribute__ ((noinline)) int
main1 (unsigned int x, unsigned int y)
{
int i;
unsigned int *pin = &in[0];
unsigned int *pout = &out[0];
short a0, a1, a2, a3;
a0 = *pin++ + 23;
a1 = *pin++ + 142;
a2 = *pin++ + 2;
a3 = *pin++ + 31;
*pout++ = a0 * x;
*pout++ = a1 * y;
*pout++ = a2 * x;
*pout++ = a3 * y;
/* Check results. */
if (out[0] != (in[0] + 23) * x
|| out[1] != (in[1] + 142) * y
|| out[2] != (in[2] + 2) * x
|| out[3] != (in[3] + 31) * y)
abort();
return 0;
}
int main (void)
{
check_vect ();
main1 (2, 3);
return 0;
}
/* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 0 "slp" } } */
/* { dg-final { scan-tree-dump-times "SLP with multiple types" 1 "slp" } } */
/* { dg-final { cleanup-tree-dump "slp" } } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include <stdio.h>
#include "tree-vect.h"
#define N 16
unsigned int out[N];
unsigned int in[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
__attribute__ ((noinline)) int
main1 (unsigned int x, unsigned int y)
{
int i;
unsigned int a0, a1, a2, a3;
a0 = in[0] + 23;
a1 = in[1] + 142;
a2 = in[2] + 2;
a3 = in[3] + 31;
out[0] = a0 * x;
out[1] = a1 * y;
out[2] = a2 * x;
out[3] = a3 * y;
/* Check results. */
if (out[0] != (in[0] + 23) * x
|| out[1] != (in[1] + 142) * y
|| out[2] != (in[2] + 2) * x
|| out[3] != (in[3] + 31) * y)
abort();
return 0;
}
int main (void)
{
check_vect ();
main1 (2, 3);
return 0;
}
/* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 1 "slp" { target vect_int_mult } } } */
/* { dg-final { cleanup-tree-dump "slp" } } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include <stdio.h>
#include "tree-vect.h"
#define N 16
unsigned int out[N];
unsigned int in[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
__attribute__ ((noinline)) int
main1 (unsigned int x, unsigned int y)
{
int i;
unsigned int a0, a1, a2, a3;
/* Not consecutive load with permutation - not supported. */
a0 = in[0] + 23;
a1 = in[1] + 142;
a2 = in[1] + 2;
a3 = in[3] + 31;
out[0] = a0 * x;
out[1] = a1 * y;
out[2] = a2 * x;
out[3] = a3 * y;
/* Check results. */
if (out[0] != (in[0] + 23) * x
|| out[1] != (in[1] + 142) * y
|| out[2] != (in[1] + 2) * x
|| out[3] != (in[3] + 31) * y)
abort();
return 0;
}
int main (void)
{
check_vect ();
main1 (2, 3);
return 0;
}
/* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 0 "slp" } } */
/* { dg-final { cleanup-tree-dump "slp" } } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include <stdio.h>
#include "tree-vect.h"
#define N 16
unsigned int out[N];
unsigned int in[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
__attribute__ ((noinline)) int
main1 (unsigned int x, unsigned int y)
{
int i;
unsigned int a0, a1, a2, a3;
if (x > y)
x = x + y;
else
y = x;
a0 = in[0] + 23;
a1 = in[1] + 142;
a2 = in[2] + 2;
a3 = in[3] + 31;
out[0] = a0 * x;
out[1] = a1 * y;
out[2] = a2 * x;
out[3] = a3 * y;
/* Check results. */
if (out[0] != (in[0] + 23) * x
|| out[1] != (in[1] + 142) * y
|| out[2] != (in[2] + 2) * x
|| out[3] != (in[3] + 31) * y)
abort();
return 0;
}
int main (void)
{
check_vect ();
main1 (2, 3);
return 0;
}
/* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 1 "slp" { target vect_int_mult } } } */
/* { dg-final { cleanup-tree-dump "slp" } } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include <stdio.h>
#include "tree-vect.h"
#define N 32
unsigned int out[N*8];
unsigned int in[N*8] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63, 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63};
unsigned int arr[N] = {0,1,2,3,4,5,6,7};
__attribute__ ((noinline)) int
main1 (int dummy)
{
int i;
unsigned int *pin = &in[0];
unsigned int *pout = &out[0];
unsigned int a = 0;
for (i = 0; i < N; i++)
{
*pout++ = *pin++ + a;
*pout++ = *pin++ + a;
*pout++ = *pin++ + a;
*pout++ = *pin++ + a;
*pout++ = *pin++ + a;
*pout++ = *pin++ + a;
*pout++ = *pin++ + a;
*pout++ = *pin++ + a;
if (arr[i] = i)
a = i;
else
a = 2;
}
a = 0;
/* check results: */
for (i = 0; i < N; i++)
{
if (out[i*8] != in[i*8] + a
|| out[i*8 + 1] != in[i*8 + 1] + a
|| out[i*8 + 2] != in[i*8 + 2] + a
|| out[i*8 + 3] != in[i*8 + 3] + a
|| out[i*8 + 4] != in[i*8 + 4] + a
|| out[i*8 + 5] != in[i*8 + 5] + a
|| out[i*8 + 6] != in[i*8 + 6] + a
|| out[i*8 + 7] != in[i*8 + 7] + a)
abort ();
if (arr[i] = i)
a = i;
else
a = 2;
}
return 0;
}
int main (void)
{
check_vect ();
main1 (33);
return 0;
}
/* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 1 "slp" } } */
/* { dg-final { cleanup-tree-dump "slp" } } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include <stdio.h>
#include "tree-vect.h"
#define N 16
unsigned int b[N];
unsigned int out[N];
unsigned int in[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
__attribute__ ((noinline)) int
main1 (unsigned int x, unsigned int y)
{
int i;
unsigned int a0, a1, a2, a3;
if (x > y)
x = x + y;
else
y = x;
a0 = in[0] + 23;
a1 = in[1] + 142;
a2 = in[2] + 2;
a3 = in[3] + 31;
b[0] = a0;
b[1] = a1;
out[0] = a0 * x;
out[1] = a1 * y;
out[2] = a2 * x;
out[3] = a3 * y;
/* Check results. */
if (out[0] != (in[0] + 23) * x
|| out[1] != (in[1] + 142) * y
|| out[2] != (in[2] + 2) * x
|| out[3] != (in[3] + 31) * y
|| b[0] != in[0] + 23
|| b[1] != in[1] + 142)
abort();
return 0;
}
int main (void)
{
check_vect ();
main1 (2, 3);
return 0;
}
/* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 1 "slp" { target vect_int_mult } } } */
/* { dg-final { cleanup-tree-dump "slp" } } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include <stdio.h>
#include "tree-vect.h"
#define N 16
unsigned int out[N];
unsigned int in[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
__attribute__ ((noinline)) int
main1 (unsigned int x, unsigned int y)
{
int i;
unsigned int a0, a1, a2, a3;
a0 = in[0] + 23;
a1 = in[1] + 142;
a2 = in[2] + 2;
a3 = in[3] + 31;
out[0] = a0 * x;
out[1] = a1 * y;
out[2] = a2 * x;
out[3] = a3 * y;
/* Check results. */
if (out[0] != a0 * x
|| out[1] != a1 * y
|| out[2] != a2 * x
|| out[3] != a3 * y)
abort();
return 0;
}
int main (void)
{
check_vect ();
main1 (2, 3);
return 0;
}
/* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 1 "slp" { target vect_int_mult } } } */
/* { dg-final { cleanup-tree-dump "slp" } } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include <stdio.h>
#include "tree-vect.h"
#define N 16
unsigned short out[N];
unsigned short in[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
__attribute__ ((noinline)) int
main1 ()
{
int i;
unsigned short *pin = &in[0];
unsigned short *pout = &out[0];
/* A group of 9 shorts - unsupported for now. */
*pout++ = *pin++;
*pout++ = *pin++;
*pout++ = *pin++;
*pout++ = *pin++;
*pout++ = *pin++;
*pout++ = *pin++;
*pout++ = *pin++;
*pout++ = *pin++;
*pout++ = *pin++;
/* Check results. */
if (out[0] != in[0]
|| out[1] != in[1]
|| out[2] != in[2]
|| out[3] != in[3]
|| out[4] != in[4]
|| out[5] != in[5]
|| out[6] != in[6]
|| out[7] != in[7]
|| out[8] != in[8])
abort();
return 0;
}
int main (void)
{
check_vect ();
main1 ();
return 0;
}
/* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 1 "slp" { xfail *-*-* } } } */
/* { dg-final { cleanup-tree-dump "slp" } } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include <stdio.h>
#include "tree-vect.h"
#define N 16
unsigned int out[N*8];
unsigned int in[N*8] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63};
__attribute__ ((noinline)) int
main1 (int dummy)
{
int i;
unsigned int *pin = &in[0];
unsigned int *pout = &out[0];
for (i = 0; i < N*2; i++)
{
*pout++ = *pin++;
*pout++ = *pin++;
*pout++ = *pin++;
*pout++ = *pin++;
/* Avoid loop vectorization. */
if (dummy == 32)
abort ();
}
/* check results: */
for (i = 0; i < N; i++)
{
if (out[i*8] != in[i*8]
|| out[i*8 + 1] != in[i*8 + 1]
|| out[i*8 + 2] != in[i*8 + 2]
|| out[i*8 + 3] != in[i*8 + 3]
|| out[i*8 + 4] != in[i*8 + 4]
|| out[i*8 + 5] != in[i*8 + 5]
|| out[i*8 + 6] != in[i*8 + 6]
|| out[i*8 + 7] != in[i*8 + 7])
abort ();
}
return 0;
}
int main (void)
{
check_vect ();
main1 (33);
return 0;
}
/* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 1 "slp" } } */
/* { dg-final { cleanup-tree-dump "slp" } } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include <stdio.h>
#include "tree-vect.h"
#define N 16
int b[N];
unsigned int out[N];
unsigned int in[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
__attribute__ ((noinline)) int
main1 (unsigned int x, unsigned int y)
{
int i;
unsigned int a0, a1, a2, a3;
if (x > y)
x = x + y;
else
y = x;
/* Two SLP instances in the basic block, only one is supported for now,
the second one contains type conversion. */
a0 = in[0] + 23;
a1 = in[1] + 142;
a2 = in[2] + 2;
a3 = in[3] + 31;
b[0] = -a0;
b[1] = -a1;
b[2] = -a2;
b[3] = -a3;
out[0] = a0 * x;
out[1] = a1 * y;
out[2] = a2 * x;
out[3] = a3 * y;
/* Check results. */
if (out[0] != (in[0] + 23) * x
|| out[1] != (in[1] + 142) * y
|| out[2] != (in[2] + 2) * x
|| out[3] != (in[3] + 31) * y
|| b[0] != -(in[0] + 23)
|| b[1] != -(in[1] + 142)
|| b[2] != -(in[2] + 2)
|| b[3] != -(in[3] + 31))
abort();
return 0;
}
int main (void)
{
check_vect ();
main1 (2, 3);
return 0;
}
/* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 1 "slp" { target vect_int_mult } } } */
/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "slp" { target vect_int_mult } } } */
/* { dg-final { cleanup-tree-dump "slp" } } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include <stdio.h>
#include "tree-vect.h"
#define N 16
unsigned int b[N];
unsigned int out[N];
unsigned int in[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
__attribute__ ((noinline)) int
main1 (unsigned int x, unsigned int y)
{
int i;
unsigned int a0, a1, a2, a3;
/* Two SLP instances in one basic block. */
if (x > y)
x = x + y;
else
y = x;
a0 = in[0] + 23;
a1 = in[1] + 142;
a2 = in[2] + 2;
a3 = in[3] + 31;
b[0] = a0;
b[1] = a1;
b[2] = a2;
b[3] = a3;
out[0] = a0 * x;
out[1] = a1 * y;
out[2] = a2 * x;
out[3] = a3 * y;
/* Check results. */
if (out[0] != (in[0] + 23) * x
|| out[1] != (in[1] + 142) * y
|| out[2] != (in[2] + 2) * x
|| out[3] != (in[3] + 31) * y
|| b[0] != (in[0] + 23)
|| b[1] != (in[1] + 142)
|| b[2] != (in[2] + 2)
|| b[3] != (in[3] + 31))
abort();
return 0;
}
int main (void)
{
check_vect ();
main1 (2, 3);
return 0;
}
/* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 1 "slp" } } */
/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "slp" { target { ! {vect_int_mult } } } } } */
/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "slp" { target vect_int_mult } } } */
/* { dg-final { cleanup-tree-dump "slp" } } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include <stdio.h>
#include "tree-vect.h"
#define N 16
unsigned int b[N];
unsigned int out[N];
unsigned int in[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
__attribute__ ((noinline)) int
main1 (unsigned int x, unsigned int y)
{
int i;
unsigned int a0, a1, a2, a3;
a0 = in[0] + 23;
a1 = in[1] + 142;
a2 = in[2] + 2;
a3 = in[3] + 31;
if (x > y)
{
b[0] = a0;
b[1] = a1;
b[2] = a2;
b[3] = a3;
}
else
{
out[0] = a0 * x;
out[1] = a1 * y;
out[2] = a2 * x;
out[3] = a3 * y;
}
/* Check results. */
if ((x <= y
&& (out[0] != (in[0] + 23) * x
|| out[1] != (in[1] + 142) * y
|| out[2] != (in[2] + 2) * x
|| out[3] != (in[3] + 31) * y))
|| (x > y
&& (b[0] != (in[0] + 23)
|| b[1] != (in[1] + 142)
|| b[2] != (in[2] + 2)
|| b[3] != (in[3] + 31))))
abort();
return 0;
}
int main (void)
{
check_vect ();
main1 (2, 3);
return 0;
}
/* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 1 "slp" { target { ! {vect_int_mult } } } } } */
/* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 2 "slp" { target vect_int_mult } } } */
/* { dg-final { cleanup-tree-dump "slp" } } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include <stdio.h>
#include "tree-vect.h"
#define N 16
unsigned int out[N];
unsigned int in[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
__attribute__ ((noinline)) int
main1 ()
{
int i;
unsigned int *pin = &in[0];
unsigned int *pout = &out[0];
*pout++ = *pin++;
*pout++ = *pin++;
*pout++ = *pin++;
*pout++ = *pin++;
/* Check results. */
if (out[0] != in[0]
|| out[1] != in[1]
|| out[2] != in[2]
|| out[3] != in[3])
abort();
return 0;
}
int main (void)
{
check_vect ();
main1 ();
return 0;
}
/* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 1 "slp" } } */
/* { dg-final { cleanup-tree-dump "slp" } } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include <stdio.h>
#include "tree-vect.h"
#define N 16
unsigned short out[N];
unsigned short in[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
__attribute__ ((noinline)) int
main1 ()
{
int i;
unsigned short *pin = &in[0];
unsigned short *pout = &out[0];
*pout++ = *pin++;
*pout++ = *pin++;
*pout++ = *pin++;
*pout++ = *pin++;
/* Check results. */
if (out[0] != in[0]
|| out[1] != in[1]
|| out[2] != in[2]
|| out[3] != in[3])
abort();
return 0;
}
int main (void)
{
check_vect ();
main1 ();
return 0;
}
/* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 0 "slp" } } */
/* { dg-final { cleanup-tree-dump "slp" } } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include <stdio.h>
#include "tree-vect.h"
#define N 16
unsigned short out[N];
unsigned short in[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
__attribute__ ((noinline)) int
main1 ()
{
int i;
unsigned short *pin = &in[0];
unsigned short *pout = &out[0];
*pout++ = *pin++;
*pout++ = *pin++;
*pout++ = *pin++;
*pout++ = *pin++;
*pout++ = *pin++;
*pout++ = *pin++;
*pout++ = *pin++;
*pout++ = *pin++;
/* Check results. */
if (out[0] != in[0]
|| out[1] != in[1]
|| out[2] != in[2]
|| out[3] != in[3]
|| out[4] != in[4]
|| out[5] != in[5]
|| out[6] != in[6]
|| out[7] != in[7])
abort();
return 0;
}
int main (void)
{
check_vect ();
main1 ();
return 0;
}
/* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 1 "slp" } } */
/* { dg-final { cleanup-tree-dump "slp" } } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include <stdio.h>
#include "tree-vect.h"
#define N 16
unsigned int out[N];
unsigned int in[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
__attribute__ ((noinline)) int
main1 (unsigned int x, unsigned int y)
{
int i;
unsigned int *pin = &in[0];
unsigned int *pout = &out[0];
unsigned int a0, a1, a2, a3;
a0 = *pin++ + 23;
a1 = *pin++ + 142;
a2 = *pin++ + 2;
a3 = *pin++ + 31;
*pout++ = a0 * x;
*pout++ = a1 * y;
*pout++ = a2 * x;
*pout++ = a3 * y;
/* Check results. */
if (out[0] != (in[0] + 23) * x
|| out[1] != (in[1] + 142) * y
|| out[2] != (in[2] + 2) * x
|| out[3] != (in[3] + 31) * y)
abort();
return 0;
}
int main (void)
{
check_vect ();
main1 (2, 3);
return 0;
}
/* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 1 "slp" { target vect_int_mult } } } */
/* { dg-final { cleanup-tree-dump "slp" } } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include <stdio.h>
#include "tree-vect.h"
#define N 16
unsigned int out[N];
unsigned int in[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
__attribute__ ((noinline)) int
main1 (unsigned int x, unsigned int y)
{
int i;
unsigned int *pin = &in[0];
unsigned int *pout = &out[0];
unsigned int a0, a1, a2, a3;
/* Non isomorphic. */
a0 = *pin++ + 23;
a1 = *pin++ + 142;
a2 = *pin++ + 2;
a3 = *pin++ * 31;
*pout++ = a0 * x;
*pout++ = a1 * y;
*pout++ = a2 * x;
*pout++ = a3 * y;
/* Check results. */
if (out[0] != (in[0] + 23) * x
|| out[1] != (in[1] + 142) * y
|| out[2] != (in[2] + 2) * x
|| out[3] != (in[3] * 31) * y)
abort();
return 0;
}
int main (void)
{
check_vect ();
main1 (2, 3);
return 0;
}
/* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 0 "slp" } } */
/* { dg-final { cleanup-tree-dump "slp" } } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include <stdio.h>
#include "tree-vect.h"
#define N 16
unsigned int out[N];
unsigned int in[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
__attribute__ ((noinline)) int
main1 (unsigned int x, unsigned int y, unsigned int *pin, unsigned int *pout)
{
int i;
unsigned int a0, a1, a2, a3;
/* pin and pout may alias. */
a0 = *pin++ + 23;
a1 = *pin++ + 142;
a2 = *pin++ + 2;
a3 = *pin++ + 31;
*pout++ = a0 * x;
*pout++ = a1 * y;
*pout++ = a2 * x;
*pout++ = a3 * y;
/* Check results. */
if (out[0] != (in[0] + 23) * x
|| out[1] != (in[1] + 142) * y
|| out[2] != (in[2] + 2) * x
|| out[3] != (in[3] + 31) * y)
abort();
return 0;
}
int main (void)
{
check_vect ();
main1 (2, 3, &in[0], &out[0]);
return 0;
}
/* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 0 "slp" } } */
/* { dg-final { cleanup-tree-dump "slp" } } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include <stdio.h>
#include "tree-vect.h"
#define N 16
unsigned int out[N];
unsigned int in[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
__attribute__ ((noinline)) int
main1 (unsigned int x, unsigned int y)
{
int i;
unsigned int *pin = &in[1];
unsigned int *pout = &out[0];
unsigned int a0, a1, a2, a3;
/* Misaligned load. */
a0 = *pin++ + 23;
a1 = *pin++ + 142;
a2 = *pin++ + 2;
a3 = *pin++ + 31;
*pout++ = a0 * x;
*pout++ = a1 * y;
*pout++ = a2 * x;
*pout++ = a3 * y;
/* Check results. */
if (out[0] != (in[1] + 23) * x
|| out[1] != (in[2] + 142) * y
|| out[2] != (in[3] + 2) * x
|| out[3] != (in[4] + 31) * y)
abort();
return 0;
}
int main (void)
{
check_vect ();
main1 (2, 3);
return 0;
}
/* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 0 "slp" } } */
/* { dg-final { scan-tree-dump-times "unsupported alignment in basic block." 1 "slp" } } */
/* { dg-final { cleanup-tree-dump "slp" } } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include <stdio.h>
#include "tree-vect.h"
#define N 16
unsigned int out[N];
unsigned int in1[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
unsigned int in2[N] = {10,11,12,13,14,15,16,17,18,19,110,111,112,113,114,115};
__attribute__ ((noinline)) int
main1 (unsigned int x, unsigned int y)
{
int i;
unsigned int *pin1 = &in1[0];
unsigned int *pin2 = &in2[0];
unsigned int *pout = &out[0];
unsigned int a0, a1, a2, a3;
a0 = *pin2++ - *pin1++ + 23;
a1 = *pin2++ - *pin1++ + 142;
a2 = *pin2++ - *pin1++ + 2;
a3 = *pin2++ - *pin1++ + 31;
*pout++ = a0 * x;
*pout++ = a1 * y;
*pout++ = a2 * x;
*pout++ = a3 * y;
/* Check results. */
if (out[0] != (in2[0] - in1[0] + 23) * x
|| out[1] != (in2[1] - in1[1] + 142) * y
|| out[2] != (in2[2] - in1[2] + 2) * x
|| out[3] != (in2[3] - in1[3] + 31) * y)
abort();
return 0;
}
int main (void)
{
check_vect ();
main1 (2, 3);
return 0;
}
/* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 1 "slp" { target vect_int_mult } } } */
/* { dg-final { cleanup-tree-dump "slp" } } */
gcc.dg/vect/vect.exp:

@@ -122,6 +122,8 @@ dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/nodump-*.\[cS\]]] \
	"" $DEFAULT_VECTCFLAGS
 
 lappend DEFAULT_VECTCFLAGS "-fdump-tree-vect-details"
 
+set VECT_SLP_CFLAGS $DEFAULT_VECTCFLAGS
+lappend VECT_SLP_CFLAGS "-fdump-tree-slp-details"
 
 # Main loop.
 dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/pr*.\[cS\]]] \

@@ -130,10 +132,14 @@ dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/vect-*.\[cS\]]] \
	"" $DEFAULT_VECTCFLAGS
 dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/slp-*.\[cS\]]] \
	"" $DEFAULT_VECTCFLAGS
+dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/bb-slp*.\[cS\]]] \
+	"" $VECT_SLP_CFLAGS
 
 #### Tests with special options
 global SAVED_DEFAULT_VECTCFLAGS
 set SAVED_DEFAULT_VECTCFLAGS $DEFAULT_VECTCFLAGS
+set SAVED_VECT_SLP_CFLAGS $VECT_SLP_CFLAGS
 
 # --param vect-max-version-for-alias-checks=0 tests
 set DEFAULT_VECTCFLAGS $SAVED_DEFAULT_VECTCFLAGS

@@ -262,6 +268,11 @@ dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/O3-*.\[cS\]]] \
 dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/O1-*.\[cS\]]] \
	"" $O1_VECTCFLAGS
 
+# -fno-tree-reassoc
+set VECT_SLP_CFLAGS $SAVED_VECT_SLP_CFLAGS
+lappend VECT_SLP_CFLAGS "-fno-tree-reassoc"
+dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/no-tree-reassoc-bb-slp-*.\[cS\]]] \
+	"" $VECT_SLP_CFLAGS
+
 # Clean up.
 set dg-do-what-default ${save-dg-do-what-default}
timevar.def:

@@ -121,6 +121,7 @@ DEFTIMEVAR (TV_TREE_LOOP_UNSWITCH    , "tree loop unswitching")
 DEFTIMEVAR (TV_COMPLETE_UNROLL       , "complete unrolling")
 DEFTIMEVAR (TV_TREE_PARALLELIZE_LOOPS, "tree parallelize loops")
 DEFTIMEVAR (TV_TREE_VECTORIZATION    , "tree vectorization")
+DEFTIMEVAR (TV_TREE_SLP_VECTORIZATION, "tree slp vectorization")
 DEFTIMEVAR (TV_GRAPHITE_TRANSFORMS   , "GRAPHITE loop transforms")
 DEFTIMEVAR (TV_TREE_LINEAR_TRANSFORM , "tree loop linear")
 DEFTIMEVAR (TV_TREE_LOOP_DISTRIBUTION, "tree loop distribution")
tree-data-ref.c:

@@ -668,8 +668,9 @@ canonicalize_base_object_address (tree addr)
   return build_fold_addr_expr (TREE_OPERAND (addr, 0));
 }
 
-/* Analyzes the behavior of the memory reference DR in the innermost loop that
-   contains it.  Returns true if analysis succeed or false otherwise.  */
+/* Analyzes the behavior of the memory reference DR in the innermost loop or
+   basic block that contains it.  Returns true if analysis succeed or false
+   otherwise.  */
 
 bool
 dr_analyze_innermost (struct data_reference *dr)

@@ -683,6 +684,7 @@ dr_analyze_innermost (struct data_reference *dr)
   int punsignedp, pvolatilep;
   affine_iv base_iv, offset_iv;
   tree init, dinit, step;
+  bool in_loop = (loop && loop->num);
 
   if (dump_file && (dump_flags & TDF_DETAILS))
     fprintf (dump_file, "analyze_innermost: ");

@@ -699,13 +701,24 @@ dr_analyze_innermost (struct data_reference *dr)
     }
 
   base = build_fold_addr_expr (base);
-  if (!simple_iv (loop, loop_containing_stmt (stmt), base, &base_iv, false))
+  if (in_loop)
     {
-      if (dump_file && (dump_flags & TDF_DETAILS))
-	fprintf (dump_file, "failed: evolution of base is not affine.\n");
-      return false;
+      if (!simple_iv (loop, loop_containing_stmt (stmt), base, &base_iv,
+                      false))
+        {
+          if (dump_file && (dump_flags & TDF_DETAILS))
+            fprintf (dump_file, "failed: evolution of base is not affine.\n");
+          return false;
+        }
+    }
+  else
+    {
+      base_iv.base = base;
+      base_iv.step = ssize_int (0);
+      base_iv.no_overflow = true;
     }
-  if (!poffset)
+
+  if (!poffset || !in_loop)
     {
       offset_iv.base = ssize_int (0);
       offset_iv.step = ssize_int (0);

@@ -752,17 +765,23 @@ dr_analyze_indices (struct data_reference *dr, struct loop *nest)
   struct loop *loop = loop_containing_stmt (stmt);
   VEC (tree, heap) *access_fns = NULL;
   tree ref = unshare_expr (DR_REF (dr)), aref = ref, op;
-  tree base, off, access_fn;
-  basic_block before_loop = block_before_loop (nest);
+  tree base, off, access_fn = NULL_TREE;
+  basic_block before_loop = NULL;
+
+  if (nest)
+    before_loop = block_before_loop (nest);
 
   while (handled_component_p (aref))
     {
       if (TREE_CODE (aref) == ARRAY_REF)
	 {
	   op = TREE_OPERAND (aref, 1);
-	  access_fn = analyze_scalar_evolution (loop, op);
-	  access_fn = instantiate_scev (before_loop, loop, access_fn);
-	  VEC_safe_push (tree, heap, access_fns, access_fn);
+	  if (nest)
+	    {
+	      access_fn = analyze_scalar_evolution (loop, op);
+	      access_fn = instantiate_scev (before_loop, loop, access_fn);
+	      VEC_safe_push (tree, heap, access_fns, access_fn);
+	    }
 
	   TREE_OPERAND (aref, 1) = build_int_cst (TREE_TYPE (op), 0);
	 }

@@ -770,7 +789,7 @@ dr_analyze_indices (struct data_reference *dr, struct loop *nest)
       aref = TREE_OPERAND (aref, 0);
     }
 
-  if (INDIRECT_REF_P (aref))
+  if (nest && INDIRECT_REF_P (aref))
     {
       op = TREE_OPERAND (aref, 0);
       access_fn = analyze_scalar_evolution (loop, op);

@@ -1332,7 +1351,8 @@ initialize_data_dependence_relation (struct data_reference *a,
   /* If the base of the object is not invariant in the loop nest, we cannot
      analyze it.  TODO -- in fact, it would suffice to record that there may
      be arbitrary dependences in the loops where the base object varies.  */
-  if (!object_address_invariant_in_loop_p (VEC_index (loop_p, loop_nest, 0),
+  if (loop_nest
+      && !object_address_invariant_in_loop_p (VEC_index (loop_p, loop_nest, 0),
					       DR_BASE_OBJECT (a)))
     {
       DDR_ARE_DEPENDENT (res) = chrec_dont_know;

@@ -4003,6 +4023,7 @@ compute_all_dependences (VEC (data_reference_p, heap) *datarefs,
       {
	 ddr = initialize_data_dependence_relation (a, b, loop_nest);
	 VEC_safe_push (ddr_p, heap, *dependence_relations, ddr);
+	if (loop_nest)
	   compute_affine_dependence (ddr, VEC_index (loop_p, loop_nest, 0));
       }

@@ -4110,9 +4131,10 @@ find_data_references_in_stmt (struct loop *nest, gimple stmt,
       dr = create_data_ref (nest, *ref->pos, stmt, ref->is_read);
       gcc_assert (dr != NULL);
 
-      /* FIXME -- data dependence analysis does not work correctly for objects with
-	 invariant addresses.  Let us fail here until the problem is fixed.  */
-      if (dr_address_invariant_p (dr))
+      /* FIXME -- data dependence analysis does not work correctly for objects
+	 with invariant addresses in loop nests.  Let us fail here until the
+	 problem is fixed.  */
+      if (dr_address_invariant_p (dr) && nest)
	 {
	   free_data_ref (dr);
	   if (dump_file && (dump_flags & TDF_DETAILS))

@@ -4129,6 +4151,33 @@ find_data_references_in_stmt (struct loop *nest, gimple stmt,
 
 /* Search the data references in LOOP, and record the information into
    DATAREFS.  Returns chrec_dont_know when failing to analyze a
+   difficult case, returns NULL_TREE otherwise.  */
+
+static tree
+find_data_references_in_bb (struct loop *loop, basic_block bb,
+                            VEC (data_reference_p, heap) **datarefs)
+{
+  gimple_stmt_iterator bsi;
+
+  for (bsi = gsi_start_bb (bb); !gsi_end_p (bsi); gsi_next (&bsi))
+    {
+      gimple stmt = gsi_stmt (bsi);
+
+      if (!find_data_references_in_stmt (loop, stmt, datarefs))
+        {
+          struct data_reference *res;
+          res = XCNEW (struct data_reference);
+          VEC_safe_push (data_reference_p, heap, *datarefs, res);
+
+          return chrec_dont_know;
+        }
+    }
+
+  return NULL_TREE;
+}
+
+/* Search the data references in LOOP, and record the information into
+   DATAREFS.  Returns chrec_dont_know when failing to analyze a
    difficult case, returns NULL_TREE otherwise.
 
    TODO: This function should be made smarter so that it can handle address

@@ -4140,7 +4189,6 @@ find_data_references_in_loop (struct loop *loop,
 {
   basic_block bb, *bbs;
   unsigned int i;
-  gimple_stmt_iterator bsi;
 
   bbs = get_loop_body_in_dom_order (loop);

@@ -4148,21 +4196,12 @@ find_data_references_in_loop (struct loop *loop,
     {
       bb = bbs[i];
 
-      for (bsi = gsi_start_bb (bb); !gsi_end_p (bsi); gsi_next (&bsi))
+      if (find_data_references_in_bb (loop, bb, datarefs) == chrec_dont_know)
	 {
-	  gimple stmt = gsi_stmt (bsi);
-	  if (!find_data_references_in_stmt (loop, stmt, datarefs))
-	    {
-	      struct data_reference *res;
-	      res = XCNEW (struct data_reference);
-	      VEC_safe_push (data_reference_p, heap, *datarefs, res);
	   free (bbs);
	   return chrec_dont_know;
	 }
-	}
-    }
   free (bbs);
 
   return NULL_TREE;

@@ -4298,6 +4337,26 @@ compute_data_dependences_for_loop (struct loop *loop,
   return res;
 }
 
+/* Returns true when the data dependences for the basic block BB have been
+   computed, false otherwise.
+   DATAREFS is initialized to all the array elements contained in this basic
+   block, DEPENDENCE_RELATIONS contains the relations between the data
+   references.  Compute read-read and self relations if
+   COMPUTE_SELF_AND_READ_READ_DEPENDENCES is TRUE.  */
+
+bool
+compute_data_dependences_for_bb (basic_block bb,
+                                 bool compute_self_and_read_read_dependences,
+                                 VEC (data_reference_p, heap) **datarefs,
+                                 VEC (ddr_p, heap) **dependence_relations)
+{
+  if (find_data_references_in_bb (NULL, bb, datarefs) == chrec_dont_know)
+    return false;
+
+  compute_all_dependences (*datarefs, dependence_relations, NULL,
+                           compute_self_and_read_read_dependences);
+  return true;
+}
+
 /* Entry point (for testing only).  Analyze all the data references
    and the dependence relations in LOOP.
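For orientation, a hedged usage sketch of the new entry point (an illustrative caller, not code from the patch): the basic block vectorizer is meant to use compute_data_dependences_for_bb much like compute_data_dependences_for_loop is used for loops, collecting the data references and dependence relations of a single block and then inspecting them.

/* Hedged sketch only; the real caller is the new basic block
   vectorization code in tree-vect-slp.c / tree-vect-data-refs.c.
   Assumes the usual tree-data-ref.h helpers (free_data_refs,
   free_dependence_relations) are available.  */
static bool
analyze_bb_dependences (basic_block bb)
{
  VEC (data_reference_p, heap) *datarefs = NULL;
  VEC (ddr_p, heap) *dependence_relations = NULL;
  bool ok;

  /* Second argument false: do not compute self and read-read relations.  */
  ok = compute_data_dependences_for_bb (bb, false, &datarefs,
					&dependence_relations);

  /* ... a vectorizer would now walk DATAREFS and DEPENDENCE_RELATIONS ...  */

  free_dependence_relations (dependence_relations);
  free_data_refs (datarefs);
  return ok;
}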
tree-data-ref.h:

@@ -383,6 +383,9 @@ bool dr_analyze_innermost (struct data_reference *);
 extern bool compute_data_dependences_for_loop (struct loop *, bool,
					       VEC (data_reference_p, heap) **,
					       VEC (ddr_p, heap) **);
+extern bool compute_data_dependences_for_bb (basic_block, bool,
+                                             VEC (data_reference_p, heap) **,
+                                             VEC (ddr_p, heap) **);
 extern tree find_data_references_in_loop (struct loop *,
					  VEC (data_reference_p, heap) **);
 extern void print_direction_vector (FILE *, lambda_vector, int);
tree-pass.h:

@@ -338,6 +338,7 @@ extern struct gimple_opt_pass pass_graphite_transforms;
 extern struct gimple_opt_pass pass_if_conversion;
 extern struct gimple_opt_pass pass_loop_distribution;
 extern struct gimple_opt_pass pass_vectorize;
+extern struct gimple_opt_pass pass_slp_vectorize;
 extern struct gimple_opt_pass pass_complete_unroll;
 extern struct gimple_opt_pass pass_complete_unrolli;
 extern struct gimple_opt_pass pass_parallelize_loops;
...@@ -363,7 +363,7 @@ vect_check_interleaving (struct data_reference *dra, ...@@ -363,7 +363,7 @@ vect_check_interleaving (struct data_reference *dra,
and DRB is accessed before DRA. */ and DRB is accessed before DRA. */
diff_mod_size = (init_a - init_b) % type_size_a; diff_mod_size = (init_a - init_b) % type_size_a;
if ((init_a - init_b) > step) if (step && (init_a - init_b) > step)
return false; return false;
if (diff_mod_size == 0) if (diff_mod_size == 0)
...@@ -385,7 +385,7 @@ vect_check_interleaving (struct data_reference *dra, ...@@ -385,7 +385,7 @@ vect_check_interleaving (struct data_reference *dra,
interleaving, and DRA is accessed before DRB. */ interleaving, and DRA is accessed before DRB. */
diff_mod_size = (init_b - init_a) % type_size_a; diff_mod_size = (init_b - init_a) % type_size_a;
if ((init_b - init_a) > step) if (step && (init_b - init_a) > step)
return false; return false;
if (diff_mod_size == 0) if (diff_mod_size == 0)
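The new "step && ..." guards matter because a data reference analyzed outside of a loop has no evolution, so its step is simply zero. A hypothetical input block that relies on this (not taken from the patch or its testsuite):

  /* Four adjacent stores in straight-line code: each data reference has
     step 0, yet together they form one interleaving group of size 4.  */
  float out[4] __attribute__ ((aligned (16)));

  void
  fill (float a, float b, float c, float d)
  {
    out[0] = a;
    out[1] = b;
    out[2] = c;
    out[3] = d;
  }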
...@@ -479,6 +479,7 @@ vect_mark_for_runtime_alias_test (ddr_p ddr, loop_vec_info loop_vinfo) ...@@ -479,6 +479,7 @@ vect_mark_for_runtime_alias_test (ddr_p ddr, loop_vec_info loop_vinfo)
return true; return true;
} }
/* Function vect_analyze_data_ref_dependence. /* Function vect_analyze_data_ref_dependence.
Return TRUE if there (might) exist a dependence between a memory-reference Return TRUE if there (might) exist a dependence between a memory-reference
...@@ -490,8 +491,8 @@ vect_analyze_data_ref_dependence (struct data_dependence_relation *ddr, ...@@ -490,8 +491,8 @@ vect_analyze_data_ref_dependence (struct data_dependence_relation *ddr,
loop_vec_info loop_vinfo) loop_vec_info loop_vinfo)
{ {
unsigned int i; unsigned int i;
struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo); struct loop *loop = NULL;
int vectorization_factor = LOOP_VINFO_VECT_FACTOR (loop_vinfo); int vectorization_factor = 0;
struct data_reference *dra = DDR_A (ddr); struct data_reference *dra = DDR_A (ddr);
struct data_reference *drb = DDR_B (ddr); struct data_reference *drb = DDR_B (ddr);
stmt_vec_info stmtinfo_a = vinfo_for_stmt (DR_STMT (dra)); stmt_vec_info stmtinfo_a = vinfo_for_stmt (DR_STMT (dra));
...@@ -508,23 +509,68 @@ vect_analyze_data_ref_dependence (struct data_dependence_relation *ddr, ...@@ -508,23 +509,68 @@ vect_analyze_data_ref_dependence (struct data_dependence_relation *ddr,
return false; return false;
} }
if ((DR_IS_READ (dra) && DR_IS_READ (drb)) || dra == drb) if (loop_vinfo)
{
loop = LOOP_VINFO_LOOP (loop_vinfo);
vectorization_factor = LOOP_VINFO_VECT_FACTOR (loop_vinfo);
}
if ((DR_IS_READ (dra) && DR_IS_READ (drb) && loop_vinfo) || dra == drb)
return false; return false;
if (DDR_ARE_DEPENDENT (ddr) == chrec_dont_know) if (DDR_ARE_DEPENDENT (ddr) == chrec_dont_know)
{ {
if (loop_vinfo)
{
if (vect_print_dump_info (REPORT_DR_DETAILS)) if (vect_print_dump_info (REPORT_DR_DETAILS))
{ {
fprintf (vect_dump, fprintf (vect_dump, "versioning for alias required: "
"versioning for alias required: can't determine dependence between "); "can't determine dependence between ");
print_generic_expr (vect_dump, DR_REF (dra), TDF_SLIM); print_generic_expr (vect_dump, DR_REF (dra), TDF_SLIM);
fprintf (vect_dump, " and "); fprintf (vect_dump, " and ");
print_generic_expr (vect_dump, DR_REF (drb), TDF_SLIM); print_generic_expr (vect_dump, DR_REF (drb), TDF_SLIM);
} }
/* Add to list of ddrs that need to be tested at run-time. */ /* Add to list of ddrs that need to be tested at run-time. */
return !vect_mark_for_runtime_alias_test (ddr, loop_vinfo); return !vect_mark_for_runtime_alias_test (ddr, loop_vinfo);
} }
      /* When vectorizing a basic block, an unknown dependence can still mean
	 a strided access.  */
if (vect_check_interleaving (dra, drb))
return false;
if (vect_print_dump_info (REPORT_DR_DETAILS))
{
fprintf (vect_dump, "can't determine dependence between ");
print_generic_expr (vect_dump, DR_REF (dra), TDF_SLIM);
fprintf (vect_dump, " and ");
print_generic_expr (vect_dump, DR_REF (drb), TDF_SLIM);
}
return true;
}
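In other words, an unknown dependence inside a basic block is tolerated only if the two references turn out to belong to the same interleaving chain; otherwise the block is given up on, since run-time alias versioning is only available for loops. An illustrative (hypothetical) input that hits this failure path:

  /* OUT and IN may alias; the loads and stores have different bases, so the
     interleaving check fails, the dependence stays unknown, and the block is
     not vectorized.  Declaring the pointers __restrict__ would let the
     dependence be disproved instead.  */
  void
  copy4 (float *out, float *in)
  {
    out[0] = in[0];
    out[1] = in[1];
    out[2] = in[2];
    out[3] = in[3];
  }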
  /* Versioning for alias is not yet supported for basic block SLP, and
     dependence distance is inapplicable; hence, in case of a known data
     dependence, basic block vectorization is impossible for now.  */
if (!loop_vinfo)
{
if (dra != drb && vect_check_interleaving (dra, drb))
return false;
if (vect_print_dump_info (REPORT_DR_DETAILS))
{
fprintf (vect_dump, "determined dependence between ");
print_generic_expr (vect_dump, DR_REF (dra), TDF_SLIM);
fprintf (vect_dump, " and ");
print_generic_expr (vect_dump, DR_REF (drb), TDF_SLIM);
}
return true;
}
/* Loop-based vectorization and known data dependence. */
if (DDR_NUM_DIST_VECTS (ddr) == 0) if (DDR_NUM_DIST_VECTS (ddr) == 0)
{ {
if (vect_print_dump_info (REPORT_DR_DETAILS)) if (vect_print_dump_info (REPORT_DR_DETAILS))
...@@ -589,8 +635,7 @@ vect_analyze_data_ref_dependence (struct data_dependence_relation *ddr, ...@@ -589,8 +635,7 @@ vect_analyze_data_ref_dependence (struct data_dependence_relation *ddr,
if (vect_print_dump_info (REPORT_UNVECTORIZED_LOCATIONS)) if (vect_print_dump_info (REPORT_UNVECTORIZED_LOCATIONS))
{ {
fprintf (vect_dump, fprintf (vect_dump, "not vectorized, possible dependence "
"not vectorized, possible dependence "
"between data-refs "); "between data-refs ");
print_generic_expr (vect_dump, DR_REF (dra), TDF_SLIM); print_generic_expr (vect_dump, DR_REF (dra), TDF_SLIM);
fprintf (vect_dump, " and "); fprintf (vect_dump, " and ");
...@@ -609,15 +654,21 @@ vect_analyze_data_ref_dependence (struct data_dependence_relation *ddr, ...@@ -609,15 +654,21 @@ vect_analyze_data_ref_dependence (struct data_dependence_relation *ddr,
exist any data dependences between them. */ exist any data dependences between them. */
bool bool
vect_analyze_data_ref_dependences (loop_vec_info loop_vinfo) vect_analyze_data_ref_dependences (loop_vec_info loop_vinfo,
bb_vec_info bb_vinfo)
{ {
unsigned int i; unsigned int i;
VEC (ddr_p, heap) * ddrs = LOOP_VINFO_DDRS (loop_vinfo); VEC (ddr_p, heap) *ddrs = NULL;
struct data_dependence_relation *ddr; struct data_dependence_relation *ddr;
if (vect_print_dump_info (REPORT_DETAILS)) if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "=== vect_analyze_dependences ==="); fprintf (vect_dump, "=== vect_analyze_dependences ===");
if (loop_vinfo)
ddrs = LOOP_VINFO_DDRS (loop_vinfo);
else
ddrs = BB_VINFO_DDRS (bb_vinfo);
for (i = 0; VEC_iterate (ddr_p, ddrs, i, ddr); i++) for (i = 0; VEC_iterate (ddr_p, ddrs, i, ddr); i++)
if (vect_analyze_data_ref_dependence (ddr, loop_vinfo)) if (vect_analyze_data_ref_dependence (ddr, loop_vinfo))
return false; return false;
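All of the analysis routines in this file now follow the same pattern: exactly one of LOOP_VINFO and BB_VINFO is non-NULL and supplies the data-reference and dependence vectors. A hypothetical fragment of a basic-block driver, using only the signatures introduced by this patch:

  /* Sketch only: run the shared data-reference checks with a bb_vec_info.  */
  static bool
  bb_datarefs_ok_p (bb_vec_info bb_vinfo)
  {
    return vect_analyze_data_refs (NULL, bb_vinfo)
	   && vect_analyze_data_ref_dependences (NULL, bb_vinfo)
	   && vect_analyze_data_refs_alignment (NULL, bb_vinfo)
	   && vect_analyze_data_ref_accesses (NULL, bb_vinfo);
  }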
...@@ -644,7 +695,7 @@ vect_compute_data_ref_alignment (struct data_reference *dr) ...@@ -644,7 +695,7 @@ vect_compute_data_ref_alignment (struct data_reference *dr)
gimple stmt = DR_STMT (dr); gimple stmt = DR_STMT (dr);
stmt_vec_info stmt_info = vinfo_for_stmt (stmt); stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info); loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo); struct loop *loop = NULL;
tree ref = DR_REF (dr); tree ref = DR_REF (dr);
tree vectype; tree vectype;
tree base, base_addr; tree base, base_addr;
...@@ -655,6 +706,9 @@ vect_compute_data_ref_alignment (struct data_reference *dr) ...@@ -655,6 +706,9 @@ vect_compute_data_ref_alignment (struct data_reference *dr)
if (vect_print_dump_info (REPORT_DETAILS)) if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "vect_compute_data_ref_alignment:"); fprintf (vect_dump, "vect_compute_data_ref_alignment:");
if (loop_vinfo)
loop = LOOP_VINFO_LOOP (loop_vinfo);
/* Initialize misalignment to unknown. */ /* Initialize misalignment to unknown. */
SET_DR_MISALIGNMENT (dr, -1); SET_DR_MISALIGNMENT (dr, -1);
...@@ -669,7 +723,7 @@ vect_compute_data_ref_alignment (struct data_reference *dr) ...@@ -669,7 +723,7 @@ vect_compute_data_ref_alignment (struct data_reference *dr)
stays the same throughout the execution of the inner-loop, which is why stays the same throughout the execution of the inner-loop, which is why
we have to check that the stride of the dataref in the inner-loop evenly we have to check that the stride of the dataref in the inner-loop evenly
divides by the vector size. */ divides by the vector size. */
if (nested_in_vect_loop_p (loop, stmt)) if (loop && nested_in_vect_loop_p (loop, stmt))
{ {
tree step = DR_STEP (dr); tree step = DR_STEP (dr);
HOST_WIDE_INT dr_step = TREE_INT_CST_LOW (step); HOST_WIDE_INT dr_step = TREE_INT_CST_LOW (step);
...@@ -773,12 +827,18 @@ vect_compute_data_ref_alignment (struct data_reference *dr) ...@@ -773,12 +827,18 @@ vect_compute_data_ref_alignment (struct data_reference *dr)
Return FALSE if a data reference is found that cannot be vectorized. */ Return FALSE if a data reference is found that cannot be vectorized. */
static bool static bool
vect_compute_data_refs_alignment (loop_vec_info loop_vinfo) vect_compute_data_refs_alignment (loop_vec_info loop_vinfo,
bb_vec_info bb_vinfo)
{ {
VEC (data_reference_p, heap) *datarefs = LOOP_VINFO_DATAREFS (loop_vinfo); VEC (data_reference_p, heap) *datarefs;
struct data_reference *dr; struct data_reference *dr;
unsigned int i; unsigned int i;
if (loop_vinfo)
datarefs = LOOP_VINFO_DATAREFS (loop_vinfo);
else
datarefs = BB_VINFO_DATAREFS (bb_vinfo);
for (i = 0; VEC_iterate (data_reference_p, datarefs, i, dr); i++) for (i = 0; VEC_iterate (data_reference_p, datarefs, i, dr); i++)
if (!vect_compute_data_ref_alignment (dr)) if (!vect_compute_data_ref_alignment (dr))
return false; return false;
...@@ -850,14 +910,19 @@ vect_update_misalignment_for_peel (struct data_reference *dr, ...@@ -850,14 +910,19 @@ vect_update_misalignment_for_peel (struct data_reference *dr,
Return TRUE if all data references in the loop can be Return TRUE if all data references in the loop can be
handled with respect to alignment. */ handled with respect to alignment. */
static bool bool
vect_verify_datarefs_alignment (loop_vec_info loop_vinfo) vect_verify_datarefs_alignment (loop_vec_info loop_vinfo, bb_vec_info bb_vinfo)
{ {
VEC (data_reference_p, heap) *datarefs = LOOP_VINFO_DATAREFS (loop_vinfo); VEC (data_reference_p, heap) *datarefs;
struct data_reference *dr; struct data_reference *dr;
enum dr_alignment_support supportable_dr_alignment; enum dr_alignment_support supportable_dr_alignment;
unsigned int i; unsigned int i;
if (loop_vinfo)
datarefs = LOOP_VINFO_DATAREFS (loop_vinfo);
else
datarefs = BB_VINFO_DATAREFS (bb_vinfo);
for (i = 0; VEC_iterate (data_reference_p, datarefs, i, dr); i++) for (i = 0; VEC_iterate (data_reference_p, datarefs, i, dr); i++)
{ {
gimple stmt = DR_STMT (dr); gimple stmt = DR_STMT (dr);
...@@ -1223,7 +1288,7 @@ vect_enhance_data_refs_alignment (loop_vec_info loop_vinfo) ...@@ -1223,7 +1288,7 @@ vect_enhance_data_refs_alignment (loop_vec_info loop_vinfo)
if (vect_print_dump_info (REPORT_DETAILS)) if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "Peeling for alignment will be applied."); fprintf (vect_dump, "Peeling for alignment will be applied.");
stat = vect_verify_datarefs_alignment (loop_vinfo); stat = vect_verify_datarefs_alignment (loop_vinfo, NULL);
gcc_assert (stat); gcc_assert (stat);
return stat; return stat;
} }
...@@ -1331,7 +1396,7 @@ vect_enhance_data_refs_alignment (loop_vec_info loop_vinfo) ...@@ -1331,7 +1396,7 @@ vect_enhance_data_refs_alignment (loop_vec_info loop_vinfo)
/* Peeling and versioning can't be done together at this time. */ /* Peeling and versioning can't be done together at this time. */
gcc_assert (! (do_peeling && do_versioning)); gcc_assert (! (do_peeling && do_versioning));
stat = vect_verify_datarefs_alignment (loop_vinfo); stat = vect_verify_datarefs_alignment (loop_vinfo, NULL);
gcc_assert (stat); gcc_assert (stat);
return stat; return stat;
} }
...@@ -1339,7 +1404,7 @@ vect_enhance_data_refs_alignment (loop_vec_info loop_vinfo) ...@@ -1339,7 +1404,7 @@ vect_enhance_data_refs_alignment (loop_vec_info loop_vinfo)
/* This point is reached if neither peeling nor versioning is being done. */ /* This point is reached if neither peeling nor versioning is being done. */
gcc_assert (! (do_peeling || do_versioning)); gcc_assert (! (do_peeling || do_versioning));
stat = vect_verify_datarefs_alignment (loop_vinfo); stat = vect_verify_datarefs_alignment (loop_vinfo, NULL);
return stat; return stat;
} }
...@@ -1350,12 +1415,13 @@ vect_enhance_data_refs_alignment (loop_vec_info loop_vinfo) ...@@ -1350,12 +1415,13 @@ vect_enhance_data_refs_alignment (loop_vec_info loop_vinfo)
Return FALSE if a data reference is found that cannot be vectorized. */ Return FALSE if a data reference is found that cannot be vectorized. */
bool bool
vect_analyze_data_refs_alignment (loop_vec_info loop_vinfo) vect_analyze_data_refs_alignment (loop_vec_info loop_vinfo,
bb_vec_info bb_vinfo)
{ {
if (vect_print_dump_info (REPORT_DETAILS)) if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "=== vect_analyze_data_refs_alignment ==="); fprintf (vect_dump, "=== vect_analyze_data_refs_alignment ===");
if (!vect_compute_data_refs_alignment (loop_vinfo)) if (!vect_compute_data_refs_alignment (loop_vinfo, bb_vinfo))
{ {
if (vect_print_dump_info (REPORT_UNVECTORIZED_LOCATIONS)) if (vect_print_dump_info (REPORT_UNVECTORIZED_LOCATIONS))
fprintf (vect_dump, fprintf (vect_dump,
...@@ -1381,6 +1447,7 @@ vect_analyze_group_access (struct data_reference *dr) ...@@ -1381,6 +1447,7 @@ vect_analyze_group_access (struct data_reference *dr)
gimple stmt = DR_STMT (dr); gimple stmt = DR_STMT (dr);
stmt_vec_info stmt_info = vinfo_for_stmt (stmt); stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info); loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
bb_vec_info bb_vinfo = STMT_VINFO_BB_VINFO (stmt_info);
HOST_WIDE_INT dr_step = TREE_INT_CST_LOW (step); HOST_WIDE_INT dr_step = TREE_INT_CST_LOW (step);
HOST_WIDE_INT stride; HOST_WIDE_INT stride;
bool slp_impossible = false; bool slp_impossible = false;
...@@ -1406,8 +1473,7 @@ vect_analyze_group_access (struct data_reference *dr) ...@@ -1406,8 +1473,7 @@ vect_analyze_group_access (struct data_reference *dr)
DR_GROUP_SIZE (vinfo_for_stmt (stmt)) = stride; DR_GROUP_SIZE (vinfo_for_stmt (stmt)) = stride;
if (vect_print_dump_info (REPORT_DR_DETAILS)) if (vect_print_dump_info (REPORT_DR_DETAILS))
{ {
fprintf (vect_dump, "Detected single element interleaving %d ", fprintf (vect_dump, "Detected single element interleaving ");
DR_GROUP_SIZE (vinfo_for_stmt (stmt)));
print_generic_expr (vect_dump, DR_REF (dr), TDF_SLIM); print_generic_expr (vect_dump, DR_REF (dr), TDF_SLIM);
fprintf (vect_dump, " step "); fprintf (vect_dump, " step ");
print_generic_expr (vect_dump, step, TDF_SLIM); print_generic_expr (vect_dump, step, TDF_SLIM);
...@@ -1508,8 +1574,8 @@ vect_analyze_group_access (struct data_reference *dr) ...@@ -1508,8 +1574,8 @@ vect_analyze_group_access (struct data_reference *dr)
the type to get COUNT_IN_BYTES. */ the type to get COUNT_IN_BYTES. */
count_in_bytes = type_size * count; count_in_bytes = type_size * count;
/* Check that the size of the interleaving (including gaps) is not greater /* Check that the size of the interleaving (including gaps) is not
than STEP. */ greater than STEP. */
if (dr_step && dr_step < count_in_bytes + gaps * type_size) if (dr_step && dr_step < count_in_bytes + gaps * type_size)
{ {
if (vect_print_dump_info (REPORT_DETAILS)) if (vect_print_dump_info (REPORT_DETAILS))
...@@ -1522,7 +1588,7 @@ vect_analyze_group_access (struct data_reference *dr) ...@@ -1522,7 +1588,7 @@ vect_analyze_group_access (struct data_reference *dr)
/* Check that the size of the interleaving is equal to STEP for stores, /* Check that the size of the interleaving is equal to STEP for stores,
i.e., that there are no gaps. */ i.e., that there are no gaps. */
if (dr_step != count_in_bytes) if (dr_step && dr_step != count_in_bytes)
{ {
if (DR_IS_READ (dr)) if (DR_IS_READ (dr))
{ {
...@@ -1541,7 +1607,7 @@ vect_analyze_group_access (struct data_reference *dr) ...@@ -1541,7 +1607,7 @@ vect_analyze_group_access (struct data_reference *dr)
} }
/* Check that STEP is a multiple of type size. */ /* Check that STEP is a multiple of type size. */
if ((dr_step % type_size) != 0) if (dr_step && (dr_step % type_size) != 0)
{ {
if (vect_print_dump_info (REPORT_DETAILS)) if (vect_print_dump_info (REPORT_DETAILS))
{ {
...@@ -1566,6 +1632,10 @@ vect_analyze_group_access (struct data_reference *dr) ...@@ -1566,6 +1632,10 @@ vect_analyze_group_access (struct data_reference *dr)
if (slp_impossible) if (slp_impossible)
return false; return false;
} }
if (stride == 0)
stride = count;
DR_GROUP_SIZE (vinfo_for_stmt (stmt)) = stride; DR_GROUP_SIZE (vinfo_for_stmt (stmt)) = stride;
if (vect_print_dump_info (REPORT_DETAILS)) if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "Detected interleaving of size %d", (int)stride); fprintf (vect_dump, "Detected interleaving of size %d", (int)stride);
...@@ -1573,7 +1643,14 @@ vect_analyze_group_access (struct data_reference *dr) ...@@ -1573,7 +1643,14 @@ vect_analyze_group_access (struct data_reference *dr)
/* SLP: create an SLP data structure for every interleaving group of /* SLP: create an SLP data structure for every interleaving group of
stores for further analysis in vect_analyse_slp. */ stores for further analysis in vect_analyse_slp. */
if (!DR_IS_READ (dr) && !slp_impossible) if (!DR_IS_READ (dr) && !slp_impossible)
VEC_safe_push (gimple, heap, LOOP_VINFO_STRIDED_STORES (loop_vinfo), stmt); {
if (loop_vinfo)
VEC_safe_push (gimple, heap, LOOP_VINFO_STRIDED_STORES (loop_vinfo),
stmt);
if (bb_vinfo)
VEC_safe_push (gimple, heap, BB_VINFO_STRIDED_STORES (bb_vinfo),
stmt);
}
} }
return true; return true;
...@@ -1592,21 +1669,24 @@ vect_analyze_data_ref_access (struct data_reference *dr) ...@@ -1592,21 +1669,24 @@ vect_analyze_data_ref_access (struct data_reference *dr)
gimple stmt = DR_STMT (dr); gimple stmt = DR_STMT (dr);
stmt_vec_info stmt_info = vinfo_for_stmt (stmt); stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info); loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo); struct loop *loop = NULL;
HOST_WIDE_INT dr_step = TREE_INT_CST_LOW (step); HOST_WIDE_INT dr_step = TREE_INT_CST_LOW (step);
if (!step) if (loop_vinfo)
loop = LOOP_VINFO_LOOP (loop_vinfo);
if (loop_vinfo && !step)
{ {
if (vect_print_dump_info (REPORT_DETAILS)) if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "bad data-ref access"); fprintf (vect_dump, "bad data-ref access in loop");
return false; return false;
} }
/* Don't allow invariant accesses. */ /* Don't allow invariant accesses in loops. */
if (dr_step == 0) if (loop_vinfo && dr_step == 0)
return false; return false;
if (nested_in_vect_loop_p (loop, stmt)) if (loop && nested_in_vect_loop_p (loop, stmt))
{ {
/* Interleaved accesses are not yet supported within outer-loop /* Interleaved accesses are not yet supported within outer-loop
vectorization for references in the inner-loop. */ vectorization for references in the inner-loop. */
...@@ -1635,7 +1715,7 @@ vect_analyze_data_ref_access (struct data_reference *dr) ...@@ -1635,7 +1715,7 @@ vect_analyze_data_ref_access (struct data_reference *dr)
return true; return true;
} }
if (nested_in_vect_loop_p (loop, stmt)) if (loop && nested_in_vect_loop_p (loop, stmt))
{ {
if (vect_print_dump_info (REPORT_ALIGNMENT)) if (vect_print_dump_info (REPORT_ALIGNMENT))
fprintf (vect_dump, "strided access in outer loop."); fprintf (vect_dump, "strided access in outer loop.");
...@@ -1657,15 +1737,20 @@ vect_analyze_data_ref_access (struct data_reference *dr) ...@@ -1657,15 +1737,20 @@ vect_analyze_data_ref_access (struct data_reference *dr)
FORNOW: handle only arrays and pointer accesses. */ FORNOW: handle only arrays and pointer accesses. */
bool bool
vect_analyze_data_ref_accesses (loop_vec_info loop_vinfo) vect_analyze_data_ref_accesses (loop_vec_info loop_vinfo, bb_vec_info bb_vinfo)
{ {
unsigned int i; unsigned int i;
VEC (data_reference_p, heap) *datarefs = LOOP_VINFO_DATAREFS (loop_vinfo); VEC (data_reference_p, heap) *datarefs;
struct data_reference *dr; struct data_reference *dr;
if (vect_print_dump_info (REPORT_DETAILS)) if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "=== vect_analyze_data_ref_accesses ==="); fprintf (vect_dump, "=== vect_analyze_data_ref_accesses ===");
if (loop_vinfo)
datarefs = LOOP_VINFO_DATAREFS (loop_vinfo);
else
datarefs = BB_VINFO_DATAREFS (bb_vinfo);
for (i = 0; VEC_iterate (data_reference_p, datarefs, i, dr); i++) for (i = 0; VEC_iterate (data_reference_p, datarefs, i, dr); i++)
if (!vect_analyze_data_ref_access (dr)) if (!vect_analyze_data_ref_access (dr))
{ {
...@@ -1752,12 +1837,13 @@ vect_prune_runtime_alias_test_list (loop_vec_info loop_vinfo) ...@@ -1752,12 +1837,13 @@ vect_prune_runtime_alias_test_list (loop_vec_info loop_vinfo)
/* Function vect_analyze_data_refs. /* Function vect_analyze_data_refs.
Find all the data references in the loop. Find all the data references in the loop or basic block.
The general structure of the analysis of data refs in the vectorizer is as The general structure of the analysis of data refs in the vectorizer is as
follows: follows:
1- vect_analyze_data_refs(loop): call compute_data_dependences_for_loop to 1- vect_analyze_data_refs(loop/bb): call
find and analyze all data-refs in the loop and their dependences. compute_data_dependences_for_loop/bb to find and analyze all data-refs
in the loop/bb and their dependences.
2- vect_analyze_dependences(): apply dependence testing using ddrs. 2- vect_analyze_dependences(): apply dependence testing using ddrs.
3- vect_analyze_drs_alignment(): check that ref_stmt.alignment is ok. 3- vect_analyze_drs_alignment(): check that ref_stmt.alignment is ok.
4- vect_analyze_drs_access(): check that ref_stmt.step is ok. 4- vect_analyze_drs_access(): check that ref_stmt.step is ok.
...@@ -1765,9 +1851,10 @@ vect_prune_runtime_alias_test_list (loop_vec_info loop_vinfo) ...@@ -1765,9 +1851,10 @@ vect_prune_runtime_alias_test_list (loop_vec_info loop_vinfo)
*/ */
bool bool
vect_analyze_data_refs (loop_vec_info loop_vinfo) vect_analyze_data_refs (loop_vec_info loop_vinfo, bb_vec_info bb_vinfo)
{ {
struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo); struct loop *loop = NULL;
basic_block bb = NULL;
unsigned int i; unsigned int i;
VEC (data_reference_p, heap) *datarefs; VEC (data_reference_p, heap) *datarefs;
struct data_reference *dr; struct data_reference *dr;
...@@ -1776,13 +1863,25 @@ vect_analyze_data_refs (loop_vec_info loop_vinfo) ...@@ -1776,13 +1863,25 @@ vect_analyze_data_refs (loop_vec_info loop_vinfo)
if (vect_print_dump_info (REPORT_DETAILS)) if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "=== vect_analyze_data_refs ===\n"); fprintf (vect_dump, "=== vect_analyze_data_refs ===\n");
if (loop_vinfo)
{
loop = LOOP_VINFO_LOOP (loop_vinfo);
compute_data_dependences_for_loop (loop, true, compute_data_dependences_for_loop (loop, true,
&LOOP_VINFO_DATAREFS (loop_vinfo), &LOOP_VINFO_DATAREFS (loop_vinfo),
&LOOP_VINFO_DDRS (loop_vinfo)); &LOOP_VINFO_DDRS (loop_vinfo));
datarefs = LOOP_VINFO_DATAREFS (loop_vinfo);
}
else
{
bb = BB_VINFO_BB (bb_vinfo);
compute_data_dependences_for_bb (bb, true,
&BB_VINFO_DATAREFS (bb_vinfo),
&BB_VINFO_DDRS (bb_vinfo));
datarefs = BB_VINFO_DATAREFS (bb_vinfo);
}
/* Go through the data-refs, check that the analysis succeeded. Update pointer /* Go through the data-refs, check that the analysis succeeded. Update pointer
from stmt_vec_info struct to DR and vectype. */ from stmt_vec_info struct to DR and vectype. */
datarefs = LOOP_VINFO_DATAREFS (loop_vinfo);
for (i = 0; VEC_iterate (data_reference_p, datarefs, i, dr); i++) for (i = 0; VEC_iterate (data_reference_p, datarefs, i, dr); i++)
{ {
...@@ -1834,7 +1933,7 @@ vect_analyze_data_refs (loop_vec_info loop_vinfo) ...@@ -1834,7 +1933,7 @@ vect_analyze_data_refs (loop_vec_info loop_vinfo)
inner-most enclosing loop). We do that by building a reference to the inner-most enclosing loop). We do that by building a reference to the
first location accessed by the inner-loop, and analyze it relative to first location accessed by the inner-loop, and analyze it relative to
the outer-loop. */ the outer-loop. */
if (nested_in_vect_loop_p (loop, stmt)) if (loop && nested_in_vect_loop_p (loop, stmt))
{ {
tree outer_step, outer_base, outer_init; tree outer_step, outer_base, outer_init;
HOST_WIDE_INT pbitsize, pbitpos; HOST_WIDE_INT pbitsize, pbitpos;
...@@ -2053,7 +2152,6 @@ vect_create_addr_base_for_vector_ref (gimple stmt, ...@@ -2053,7 +2152,6 @@ vect_create_addr_base_for_vector_ref (gimple stmt,
{ {
stmt_vec_info stmt_info = vinfo_for_stmt (stmt); stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
struct data_reference *dr = STMT_VINFO_DATA_REF (stmt_info); struct data_reference *dr = STMT_VINFO_DATA_REF (stmt_info);
struct loop *containing_loop = (gimple_bb (stmt))->loop_father;
tree data_ref_base = unshare_expr (DR_BASE_ADDRESS (dr)); tree data_ref_base = unshare_expr (DR_BASE_ADDRESS (dr));
tree base_name; tree base_name;
tree data_ref_base_var; tree data_ref_base_var;
...@@ -2065,22 +2163,28 @@ vect_create_addr_base_for_vector_ref (gimple stmt, ...@@ -2065,22 +2163,28 @@ vect_create_addr_base_for_vector_ref (gimple stmt,
tree init = unshare_expr (DR_INIT (dr)); tree init = unshare_expr (DR_INIT (dr));
tree vect_ptr_type; tree vect_ptr_type;
tree step = TYPE_SIZE_UNIT (TREE_TYPE (DR_REF (dr))); tree step = TYPE_SIZE_UNIT (TREE_TYPE (DR_REF (dr)));
loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
gcc_assert (loop); if (loop_vinfo && loop && loop != (gimple_bb (stmt))->loop_father)
if (loop != containing_loop)
{ {
loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info); struct loop *outer_loop = LOOP_VINFO_LOOP (loop_vinfo);
struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
gcc_assert (nested_in_vect_loop_p (loop, stmt)); gcc_assert (nested_in_vect_loop_p (outer_loop, stmt));
data_ref_base = unshare_expr (STMT_VINFO_DR_BASE_ADDRESS (stmt_info)); data_ref_base = unshare_expr (STMT_VINFO_DR_BASE_ADDRESS (stmt_info));
base_offset = unshare_expr (STMT_VINFO_DR_OFFSET (stmt_info)); base_offset = unshare_expr (STMT_VINFO_DR_OFFSET (stmt_info));
init = unshare_expr (STMT_VINFO_DR_INIT (stmt_info)); init = unshare_expr (STMT_VINFO_DR_INIT (stmt_info));
} }
/* Create data_ref_base */ if (loop_vinfo)
base_name = build_fold_indirect_ref (data_ref_base); base_name = build_fold_indirect_ref (data_ref_base);
else
{
base_offset = ssize_int (0);
init = ssize_int (0);
base_name = build_fold_indirect_ref (unshare_expr (DR_REF (dr)));
}
data_ref_base_var = create_tmp_var (TREE_TYPE (data_ref_base), "batmp"); data_ref_base_var = create_tmp_var (TREE_TYPE (data_ref_base), "batmp");
add_referenced_var (data_ref_base_var); add_referenced_var (data_ref_base_var);
data_ref_base = force_gimple_operand (data_ref_base, &seq, true, data_ref_base = force_gimple_operand (data_ref_base, &seq, true,
...@@ -2110,15 +2214,24 @@ vect_create_addr_base_for_vector_ref (gimple stmt, ...@@ -2110,15 +2214,24 @@ vect_create_addr_base_for_vector_ref (gimple stmt,
} }
/* base + base_offset */ /* base + base_offset */
if (loop_vinfo)
addr_base = fold_build2 (POINTER_PLUS_EXPR, TREE_TYPE (data_ref_base), addr_base = fold_build2 (POINTER_PLUS_EXPR, TREE_TYPE (data_ref_base),
data_ref_base, base_offset); data_ref_base, base_offset);
else
{
if (TREE_CODE (DR_REF (dr)) == INDIRECT_REF)
addr_base = unshare_expr (TREE_OPERAND (DR_REF (dr), 0));
else
addr_base = build1 (ADDR_EXPR,
build_pointer_type (TREE_TYPE (DR_REF (dr))),
unshare_expr (DR_REF (dr)));
}
vect_ptr_type = build_pointer_type (STMT_VINFO_VECTYPE (stmt_info)); vect_ptr_type = build_pointer_type (STMT_VINFO_VECTYPE (stmt_info));
vec_stmt = fold_convert (vect_ptr_type, addr_base); vec_stmt = fold_convert (vect_ptr_type, addr_base);
addr_expr = vect_get_new_vect_var (vect_ptr_type, vect_pointer_var, addr_expr = vect_get_new_vect_var (vect_ptr_type, vect_pointer_var,
get_name (base_name)); get_name (base_name));
add_referenced_var (addr_expr); add_referenced_var (addr_expr);
vec_stmt = force_gimple_operand (vec_stmt, &seq, false, addr_expr); vec_stmt = force_gimple_operand (vec_stmt, &seq, false, addr_expr);
gimple_seq_add_seq (new_stmt_list, seq); gimple_seq_add_seq (new_stmt_list, seq);
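Since a basic block has no preheader to hoist address computations to, the basic-block path above takes the address straight from the scalar reference (the INDIRECT_REF operand, or an ADDR_EXPR of the reference) with a zero offset and init. Conceptually, for the fill () example earlier, the transform amounts to something like the following, written with GCC's generic vector extension purely for illustration:

  typedef float v4sf __attribute__ ((vector_size (16)));
  float out[4] __attribute__ ((aligned (16)));

  void
  fill_slp (float a, float b, float c, float d)
  {
    v4sf tmp = { a, b, c, d };
    *(v4sf *) out = tmp;	/* vector pointer built directly from &out[0] */
  }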
...@@ -2186,16 +2299,16 @@ vect_create_data_ref_ptr (gimple stmt, struct loop *at_loop, ...@@ -2186,16 +2299,16 @@ vect_create_data_ref_ptr (gimple stmt, struct loop *at_loop,
tree base_name; tree base_name;
stmt_vec_info stmt_info = vinfo_for_stmt (stmt); stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info); loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo); struct loop *loop = NULL;
bool nested_in_vect_loop = nested_in_vect_loop_p (loop, stmt); bool nested_in_vect_loop = false;
struct loop *containing_loop = (gimple_bb (stmt))->loop_father; struct loop *containing_loop = NULL;
tree vectype = STMT_VINFO_VECTYPE (stmt_info); tree vectype = STMT_VINFO_VECTYPE (stmt_info);
tree vect_ptr_type; tree vect_ptr_type;
tree vect_ptr; tree vect_ptr;
tree new_temp; tree new_temp;
gimple vec_stmt; gimple vec_stmt;
gimple_seq new_stmt_list = NULL; gimple_seq new_stmt_list = NULL;
edge pe; edge pe = NULL;
basic_block new_bb; basic_block new_bb;
tree vect_ptr_init; tree vect_ptr_init;
struct data_reference *dr = STMT_VINFO_DATA_REF (stmt_info); struct data_reference *dr = STMT_VINFO_DATA_REF (stmt_info);
...@@ -2205,6 +2318,22 @@ vect_create_data_ref_ptr (gimple stmt, struct loop *at_loop, ...@@ -2205,6 +2318,22 @@ vect_create_data_ref_ptr (gimple stmt, struct loop *at_loop,
tree indx_before_incr, indx_after_incr; tree indx_before_incr, indx_after_incr;
gimple incr; gimple incr;
tree step; tree step;
bb_vec_info bb_vinfo = STMT_VINFO_BB_VINFO (stmt_info);
gimple_stmt_iterator gsi = gsi_for_stmt (stmt);
if (loop_vinfo)
{
loop = LOOP_VINFO_LOOP (loop_vinfo);
nested_in_vect_loop = nested_in_vect_loop_p (loop, stmt);
containing_loop = (gimple_bb (stmt))->loop_father;
pe = loop_preheader_edge (loop);
}
else
{
gcc_assert (bb_vinfo);
only_init = true;
*ptr_incr = NULL;
}
/* Check the step (evolution) of the load in LOOP, and record /* Check the step (evolution) of the load in LOOP, and record
whether it's invariant. */ whether it's invariant. */
...@@ -2305,12 +2434,16 @@ vect_create_data_ref_ptr (gimple stmt, struct loop *at_loop, ...@@ -2305,12 +2434,16 @@ vect_create_data_ref_ptr (gimple stmt, struct loop *at_loop,
new_temp = vect_create_addr_base_for_vector_ref (stmt, &new_stmt_list, new_temp = vect_create_addr_base_for_vector_ref (stmt, &new_stmt_list,
offset, loop); offset, loop);
pe = loop_preheader_edge (loop);
if (new_stmt_list) if (new_stmt_list)
{ {
if (pe)
{
new_bb = gsi_insert_seq_on_edge_immediate (pe, new_stmt_list); new_bb = gsi_insert_seq_on_edge_immediate (pe, new_stmt_list);
gcc_assert (!new_bb); gcc_assert (!new_bb);
} }
else
gsi_insert_seq_before (&gsi, new_stmt_list, GSI_SAME_STMT);
}
*initial_address = new_temp; *initial_address = new_temp;
...@@ -2319,16 +2452,21 @@ vect_create_data_ref_ptr (gimple stmt, struct loop *at_loop, ...@@ -2319,16 +2452,21 @@ vect_create_data_ref_ptr (gimple stmt, struct loop *at_loop,
fold_convert (vect_ptr_type, new_temp)); fold_convert (vect_ptr_type, new_temp));
vect_ptr_init = make_ssa_name (vect_ptr, vec_stmt); vect_ptr_init = make_ssa_name (vect_ptr, vec_stmt);
gimple_assign_set_lhs (vec_stmt, vect_ptr_init); gimple_assign_set_lhs (vec_stmt, vect_ptr_init);
if (pe)
{
new_bb = gsi_insert_on_edge_immediate (pe, vec_stmt); new_bb = gsi_insert_on_edge_immediate (pe, vec_stmt);
gcc_assert (!new_bb); gcc_assert (!new_bb);
}
else
gsi_insert_before (&gsi, vec_stmt, GSI_SAME_STMT);
/** (4) Handle the updating of the vector-pointer inside the loop. /** (4) Handle the updating of the vector-pointer inside the loop.
This is needed when ONLY_INIT is false, and also when AT_LOOP This is needed when ONLY_INIT is false, and also when AT_LOOP
is the inner-loop nested in LOOP (during outer-loop vectorization). is the inner-loop nested in LOOP (during outer-loop vectorization).
**/ **/
if (only_init && at_loop == loop) /* No update in loop is required. */ /* No update in loop is required. */
if (only_init && (!loop_vinfo || at_loop == loop))
{ {
/* Copy the points-to information if it exists. */ /* Copy the points-to information if it exists. */
if (DR_PTR_INFO (dr)) if (DR_PTR_INFO (dr))
...@@ -2351,7 +2489,7 @@ vect_create_data_ref_ptr (gimple stmt, struct loop *at_loop, ...@@ -2351,7 +2489,7 @@ vect_create_data_ref_ptr (gimple stmt, struct loop *at_loop,
vect_ptr, loop, &incr_gsi, insert_after, vect_ptr, loop, &incr_gsi, insert_after,
&indx_before_incr, &indx_after_incr); &indx_before_incr, &indx_after_incr);
incr = gsi_stmt (incr_gsi); incr = gsi_stmt (incr_gsi);
set_vinfo_for_stmt (incr, new_stmt_vec_info (incr, loop_vinfo)); set_vinfo_for_stmt (incr, new_stmt_vec_info (incr, loop_vinfo, NULL));
/* Copy the points-to information if it exists. */ /* Copy the points-to information if it exists. */
if (DR_PTR_INFO (dr)) if (DR_PTR_INFO (dr))
...@@ -2383,7 +2521,7 @@ vect_create_data_ref_ptr (gimple stmt, struct loop *at_loop, ...@@ -2383,7 +2521,7 @@ vect_create_data_ref_ptr (gimple stmt, struct loop *at_loop,
containing_loop, &incr_gsi, insert_after, &indx_before_incr, containing_loop, &incr_gsi, insert_after, &indx_before_incr,
&indx_after_incr); &indx_after_incr);
incr = gsi_stmt (incr_gsi); incr = gsi_stmt (incr_gsi);
set_vinfo_for_stmt (incr, new_stmt_vec_info (incr, loop_vinfo)); set_vinfo_for_stmt (incr, new_stmt_vec_info (incr, loop_vinfo, NULL));
/* Copy the points-to information if it exists. */ /* Copy the points-to information if it exists. */
if (DR_PTR_INFO (dr)) if (DR_PTR_INFO (dr))
...@@ -3237,13 +3375,21 @@ vect_supportable_dr_alignment (struct data_reference *dr) ...@@ -3237,13 +3375,21 @@ vect_supportable_dr_alignment (struct data_reference *dr)
stmt_vec_info stmt_info = vinfo_for_stmt (stmt); stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
tree vectype = STMT_VINFO_VECTYPE (stmt_info); tree vectype = STMT_VINFO_VECTYPE (stmt_info);
enum machine_mode mode = TYPE_MODE (vectype); enum machine_mode mode = TYPE_MODE (vectype);
struct loop *vect_loop = LOOP_VINFO_LOOP (STMT_VINFO_LOOP_VINFO (stmt_info));
bool nested_in_vect_loop = nested_in_vect_loop_p (vect_loop, stmt);
bool invariant_in_outerloop = false; bool invariant_in_outerloop = false;
loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
struct loop *vect_loop = NULL;
bool nested_in_vect_loop = false;
if (aligned_access_p (dr)) if (aligned_access_p (dr))
return dr_aligned; return dr_aligned;
if (!loop_vinfo)
/* FORNOW: Misaligned accesses are supported only in loops. */
return dr_unaligned_unsupported;
vect_loop = LOOP_VINFO_LOOP (loop_vinfo);
nested_in_vect_loop = nested_in_vect_loop_p (vect_loop, stmt);
if (nested_in_vect_loop) if (nested_in_vect_loop)
{ {
tree outerloop_step = STMT_VINFO_DR_STEP (stmt_info); tree outerloop_step = STMT_VINFO_DR_STEP (stmt_info);
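Because there is neither peeling nor versioning to fall back on outside of a loop, only references already known to be aligned survive this check. For illustration (hypothetical):

  /* The group below is well-formed, but it starts at buf[1], four bytes past
     a 16-byte boundary, so aligned_access_p fails and under the FORNOW rule
     vect_supportable_dr_alignment returns dr_unaligned_unsupported; the
     block is left alone.  */
  float buf[8] __attribute__ ((aligned (16)));

  void
  maybe_misaligned (float x, float y, float z, float w)
  {
    buf[1] = x;
    buf[2] = y;
    buf[3] = z;
    buf[4] = w;
  }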
......
...@@ -640,14 +640,14 @@ new_loop_vec_info (struct loop *loop) ...@@ -640,14 +640,14 @@ new_loop_vec_info (struct loop *loop)
{ {
gimple phi = gsi_stmt (si); gimple phi = gsi_stmt (si);
gimple_set_uid (phi, 0); gimple_set_uid (phi, 0);
set_vinfo_for_stmt (phi, new_stmt_vec_info (phi, res)); set_vinfo_for_stmt (phi, new_stmt_vec_info (phi, res, NULL));
} }
for (si = gsi_start_bb (bb); !gsi_end_p (si); gsi_next (&si)) for (si = gsi_start_bb (bb); !gsi_end_p (si); gsi_next (&si))
{ {
gimple stmt = gsi_stmt (si); gimple stmt = gsi_stmt (si);
gimple_set_uid (stmt, 0); gimple_set_uid (stmt, 0);
set_vinfo_for_stmt (stmt, new_stmt_vec_info (stmt, res)); set_vinfo_for_stmt (stmt, new_stmt_vec_info (stmt, res, NULL));
} }
} }
} }
...@@ -1153,7 +1153,7 @@ vect_analyze_loop_operations (loop_vec_info loop_vinfo) ...@@ -1153,7 +1153,7 @@ vect_analyze_loop_operations (loop_vec_info loop_vinfo)
gcc_assert (stmt_info); gcc_assert (stmt_info);
if (!vect_analyze_stmt (stmt, &need_to_vectorize)) if (!vect_analyze_stmt (stmt, &need_to_vectorize, NULL))
return false; return false;
if (STMT_VINFO_RELEVANT_P (stmt_info) && !PURE_SLP_STMT (stmt_info)) if (STMT_VINFO_RELEVANT_P (stmt_info) && !PURE_SLP_STMT (stmt_info))
...@@ -1316,7 +1316,7 @@ vect_analyze_loop (struct loop *loop) ...@@ -1316,7 +1316,7 @@ vect_analyze_loop (struct loop *loop)
FORNOW: Handle only simple, array references, which FORNOW: Handle only simple, array references, which
alignment can be forced, and aligned pointer-references. */ alignment can be forced, and aligned pointer-references. */
ok = vect_analyze_data_refs (loop_vinfo); ok = vect_analyze_data_refs (loop_vinfo, NULL);
if (!ok) if (!ok)
{ {
if (vect_print_dump_info (REPORT_DETAILS)) if (vect_print_dump_info (REPORT_DETAILS))
...@@ -1346,7 +1346,7 @@ vect_analyze_loop (struct loop *loop) ...@@ -1346,7 +1346,7 @@ vect_analyze_loop (struct loop *loop)
/* Analyze the alignment of the data-refs in the loop. /* Analyze the alignment of the data-refs in the loop.
Fail if a data reference is found that cannot be vectorized. */ Fail if a data reference is found that cannot be vectorized. */
ok = vect_analyze_data_refs_alignment (loop_vinfo); ok = vect_analyze_data_refs_alignment (loop_vinfo, NULL);
if (!ok) if (!ok)
{ {
if (vect_print_dump_info (REPORT_DETAILS)) if (vect_print_dump_info (REPORT_DETAILS))
...@@ -1367,7 +1367,7 @@ vect_analyze_loop (struct loop *loop) ...@@ -1367,7 +1367,7 @@ vect_analyze_loop (struct loop *loop)
/* Analyze data dependences between the data-refs in the loop. /* Analyze data dependences between the data-refs in the loop.
FORNOW: fail at the first data dependence that we encounter. */ FORNOW: fail at the first data dependence that we encounter. */
ok = vect_analyze_data_ref_dependences (loop_vinfo); ok = vect_analyze_data_ref_dependences (loop_vinfo, NULL);
if (!ok) if (!ok)
{ {
if (vect_print_dump_info (REPORT_DETAILS)) if (vect_print_dump_info (REPORT_DETAILS))
...@@ -1379,7 +1379,7 @@ vect_analyze_loop (struct loop *loop) ...@@ -1379,7 +1379,7 @@ vect_analyze_loop (struct loop *loop)
/* Analyze the access patterns of the data-refs in the loop (consecutive, /* Analyze the access patterns of the data-refs in the loop (consecutive,
complex, etc.). FORNOW: Only handle consecutive access pattern. */ complex, etc.). FORNOW: Only handle consecutive access pattern. */
ok = vect_analyze_data_ref_accesses (loop_vinfo); ok = vect_analyze_data_ref_accesses (loop_vinfo, NULL);
if (!ok) if (!ok)
{ {
if (vect_print_dump_info (REPORT_DETAILS)) if (vect_print_dump_info (REPORT_DETAILS))
...@@ -1402,7 +1402,7 @@ vect_analyze_loop (struct loop *loop) ...@@ -1402,7 +1402,7 @@ vect_analyze_loop (struct loop *loop)
} }
/* Check the SLP opportunities in the loop, analyze and build SLP trees. */ /* Check the SLP opportunities in the loop, analyze and build SLP trees. */
ok = vect_analyze_slp (loop_vinfo); ok = vect_analyze_slp (loop_vinfo, NULL);
if (ok) if (ok)
{ {
/* Decide which possible SLP instances to SLP. */ /* Decide which possible SLP instances to SLP. */
...@@ -2354,7 +2354,7 @@ get_initial_def_for_induction (gimple iv_phi) ...@@ -2354,7 +2354,7 @@ get_initial_def_for_induction (gimple iv_phi)
add_referenced_var (vec_dest); add_referenced_var (vec_dest);
induction_phi = create_phi_node (vec_dest, iv_loop->header); induction_phi = create_phi_node (vec_dest, iv_loop->header);
set_vinfo_for_stmt (induction_phi, set_vinfo_for_stmt (induction_phi,
new_stmt_vec_info (induction_phi, loop_vinfo)); new_stmt_vec_info (induction_phi, loop_vinfo, NULL));
induc_def = PHI_RESULT (induction_phi); induc_def = PHI_RESULT (induction_phi);
/* Create the iv update inside the loop */ /* Create the iv update inside the loop */
...@@ -2363,7 +2363,8 @@ get_initial_def_for_induction (gimple iv_phi) ...@@ -2363,7 +2363,8 @@ get_initial_def_for_induction (gimple iv_phi)
vec_def = make_ssa_name (vec_dest, new_stmt); vec_def = make_ssa_name (vec_dest, new_stmt);
gimple_assign_set_lhs (new_stmt, vec_def); gimple_assign_set_lhs (new_stmt, vec_def);
gsi_insert_before (&si, new_stmt, GSI_SAME_STMT); gsi_insert_before (&si, new_stmt, GSI_SAME_STMT);
set_vinfo_for_stmt (new_stmt, new_stmt_vec_info (new_stmt, loop_vinfo)); set_vinfo_for_stmt (new_stmt, new_stmt_vec_info (new_stmt, loop_vinfo,
NULL));
/* Set the arguments of the phi node: */ /* Set the arguments of the phi node: */
add_phi_arg (induction_phi, vec_init, pe); add_phi_arg (induction_phi, vec_init, pe);
...@@ -2405,7 +2406,7 @@ get_initial_def_for_induction (gimple iv_phi) ...@@ -2405,7 +2406,7 @@ get_initial_def_for_induction (gimple iv_phi)
gsi_insert_before (&si, new_stmt, GSI_SAME_STMT); gsi_insert_before (&si, new_stmt, GSI_SAME_STMT);
set_vinfo_for_stmt (new_stmt, set_vinfo_for_stmt (new_stmt,
new_stmt_vec_info (new_stmt, loop_vinfo)); new_stmt_vec_info (new_stmt, loop_vinfo, NULL));
STMT_VINFO_RELATED_STMT (prev_stmt_vinfo) = new_stmt; STMT_VINFO_RELATED_STMT (prev_stmt_vinfo) = new_stmt;
prev_stmt_vinfo = vinfo_for_stmt (new_stmt); prev_stmt_vinfo = vinfo_for_stmt (new_stmt);
} }
...@@ -2743,7 +2744,7 @@ vect_create_epilog_for_reduction (tree vect_def, gimple stmt, ...@@ -2743,7 +2744,7 @@ vect_create_epilog_for_reduction (tree vect_def, gimple stmt,
for (j = 0; j < ncopies; j++) for (j = 0; j < ncopies; j++)
{ {
phi = create_phi_node (SSA_NAME_VAR (vect_def), exit_bb); phi = create_phi_node (SSA_NAME_VAR (vect_def), exit_bb);
set_vinfo_for_stmt (phi, new_stmt_vec_info (phi, loop_vinfo)); set_vinfo_for_stmt (phi, new_stmt_vec_info (phi, loop_vinfo, NULL));
if (j == 0) if (j == 0)
new_phi = phi; new_phi = phi;
else else
...@@ -3021,7 +3022,8 @@ vect_finalize_reduction: ...@@ -3021,7 +3022,8 @@ vect_finalize_reduction:
epilog_stmt = adjustment_def ? epilog_stmt : new_phi; epilog_stmt = adjustment_def ? epilog_stmt : new_phi;
STMT_VINFO_VEC_STMT (stmt_vinfo) = epilog_stmt; STMT_VINFO_VEC_STMT (stmt_vinfo) = epilog_stmt;
set_vinfo_for_stmt (epilog_stmt, set_vinfo_for_stmt (epilog_stmt,
new_stmt_vec_info (epilog_stmt, loop_vinfo)); new_stmt_vec_info (epilog_stmt, loop_vinfo,
NULL));
if (adjustment_def) if (adjustment_def)
STMT_VINFO_RELATED_STMT (vinfo_for_stmt (epilog_stmt)) = STMT_VINFO_RELATED_STMT (vinfo_for_stmt (epilog_stmt)) =
STMT_VINFO_RELATED_STMT (vinfo_for_stmt (new_phi)); STMT_VINFO_RELATED_STMT (vinfo_for_stmt (new_phi));
...@@ -3204,7 +3206,7 @@ vectorizable_reduction (gimple stmt, gimple_stmt_iterator *gsi, ...@@ -3204,7 +3206,7 @@ vectorizable_reduction (gimple stmt, gimple_stmt_iterator *gsi,
The last use is the reduction variable. */ The last use is the reduction variable. */
for (i = 0; i < op_type-1; i++) for (i = 0; i < op_type-1; i++)
{ {
is_simple_use = vect_is_simple_use (ops[i], loop_vinfo, &def_stmt, is_simple_use = vect_is_simple_use (ops[i], loop_vinfo, NULL, &def_stmt,
&def, &dt); &def, &dt);
gcc_assert (is_simple_use); gcc_assert (is_simple_use);
if (dt != vect_internal_def if (dt != vect_internal_def
...@@ -3214,8 +3216,8 @@ vectorizable_reduction (gimple stmt, gimple_stmt_iterator *gsi, ...@@ -3214,8 +3216,8 @@ vectorizable_reduction (gimple stmt, gimple_stmt_iterator *gsi,
return false; return false;
} }
is_simple_use = vect_is_simple_use (ops[i], loop_vinfo, &def_stmt, &def, is_simple_use = vect_is_simple_use (ops[i], loop_vinfo, NULL, &def_stmt,
&dt); &def, &dt);
gcc_assert (is_simple_use); gcc_assert (is_simple_use);
gcc_assert (dt == vect_reduction_def); gcc_assert (dt == vect_reduction_def);
gcc_assert (gimple_code (def_stmt) == GIMPLE_PHI); gcc_assert (gimple_code (def_stmt) == GIMPLE_PHI);
...@@ -3394,7 +3396,8 @@ vectorizable_reduction (gimple stmt, gimple_stmt_iterator *gsi, ...@@ -3394,7 +3396,8 @@ vectorizable_reduction (gimple stmt, gimple_stmt_iterator *gsi,
{ {
/* Create the reduction-phi that defines the reduction-operand. */ /* Create the reduction-phi that defines the reduction-operand. */
new_phi = create_phi_node (vec_dest, loop->header); new_phi = create_phi_node (vec_dest, loop->header);
set_vinfo_for_stmt (new_phi, new_stmt_vec_info (new_phi, loop_vinfo)); set_vinfo_for_stmt (new_phi, new_stmt_vec_info (new_phi, loop_vinfo,
NULL));
} }
/* Handle uses. */ /* Handle uses. */
...@@ -3592,7 +3595,8 @@ vectorizable_live_operation (gimple stmt, ...@@ -3592,7 +3595,8 @@ vectorizable_live_operation (gimple stmt,
op = TREE_OPERAND (gimple_op (stmt, 1), i); op = TREE_OPERAND (gimple_op (stmt, 1), i);
else else
op = gimple_op (stmt, i + 1); op = gimple_op (stmt, i + 1);
if (op && !vect_is_simple_use (op, loop_vinfo, &def_stmt, &def, &dt)) if (op
&& !vect_is_simple_use (op, loop_vinfo, NULL, &def_stmt, &def, &dt))
{ {
if (vect_print_dump_info (REPORT_DETAILS)) if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "use not simple."); fprintf (vect_dump, "use not simple.");
...@@ -3766,7 +3770,7 @@ vect_transform_loop (loop_vec_info loop_vinfo) ...@@ -3766,7 +3770,7 @@ vect_transform_loop (loop_vec_info loop_vinfo)
if (vect_print_dump_info (REPORT_DETAILS)) if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "=== scheduling SLP instances ==="); fprintf (vect_dump, "=== scheduling SLP instances ===");
vect_schedule_slp (loop_vinfo); vect_schedule_slp (loop_vinfo, NULL);
} }
/* Hybrid SLP stmts must be vectorized in addition to SLP. */ /* Hybrid SLP stmts must be vectorized in addition to SLP. */
......
...@@ -78,7 +78,7 @@ widened_name_p (tree name, gimple use_stmt, tree *half_type, gimple *def_stmt) ...@@ -78,7 +78,7 @@ widened_name_p (tree name, gimple use_stmt, tree *half_type, gimple *def_stmt)
stmt_vinfo = vinfo_for_stmt (use_stmt); stmt_vinfo = vinfo_for_stmt (use_stmt);
loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_vinfo); loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_vinfo);
if (!vect_is_simple_use (name, loop_vinfo, def_stmt, &def, &dt)) if (!vect_is_simple_use (name, loop_vinfo, NULL, def_stmt, &def, &dt))
return false; return false;
if (dt != vect_internal_def if (dt != vect_internal_def
...@@ -102,7 +102,8 @@ widened_name_p (tree name, gimple use_stmt, tree *half_type, gimple *def_stmt) ...@@ -102,7 +102,8 @@ widened_name_p (tree name, gimple use_stmt, tree *half_type, gimple *def_stmt)
|| (TYPE_PRECISION (type) < (TYPE_PRECISION (*half_type) * 2))) || (TYPE_PRECISION (type) < (TYPE_PRECISION (*half_type) * 2)))
return false; return false;
if (!vect_is_simple_use (oprnd0, loop_vinfo, &dummy_gimple, &dummy, &dt)) if (!vect_is_simple_use (oprnd0, loop_vinfo, NULL, &dummy_gimple, &dummy,
&dt))
return false; return false;
return true; return true;
...@@ -734,7 +735,7 @@ vect_pattern_recog_1 ( ...@@ -734,7 +735,7 @@ vect_pattern_recog_1 (
/* Mark the stmts that are involved in the pattern. */ /* Mark the stmts that are involved in the pattern. */
gsi_insert_before (&si, pattern_stmt, GSI_SAME_STMT); gsi_insert_before (&si, pattern_stmt, GSI_SAME_STMT);
set_vinfo_for_stmt (pattern_stmt, set_vinfo_for_stmt (pattern_stmt,
new_stmt_vec_info (pattern_stmt, loop_vinfo)); new_stmt_vec_info (pattern_stmt, loop_vinfo, NULL));
pattern_stmt_info = vinfo_for_stmt (pattern_stmt); pattern_stmt_info = vinfo_for_stmt (pattern_stmt);
STMT_VINFO_RELATED_STMT (pattern_stmt_info) = stmt; STMT_VINFO_RELATED_STMT (pattern_stmt_info) = stmt;
......
...@@ -38,6 +38,29 @@ along with GCC; see the file COPYING3. If not see ...@@ -38,6 +38,29 @@ along with GCC; see the file COPYING3. If not see
#include "optabs.h" #include "optabs.h"
#include "tree-vectorizer.h" #include "tree-vectorizer.h"
/* Extract the location of the basic block in the source code.
   Return the basic block location if it succeeds, and UNKNOWN_LOC otherwise.  */
LOC
find_bb_location (basic_block bb)
{
gimple stmt = NULL;
gimple_stmt_iterator si;
if (!bb)
return UNKNOWN_LOC;
for (si = gsi_start_bb (bb); !gsi_end_p (si); gsi_next (&si))
{
stmt = gsi_stmt (si);
if (gimple_location (stmt) != UNKNOWN_LOC)
return gimple_location (stmt);
}
return UNKNOWN_LOC;
}
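A likely use, sketched here under the assumption that the basic-block driver sets the global vect_location the same way the loop driver does with find_loop_location:

  /* Hypothetical: point dump messages for BB at the right source line.  */
  static void
  note_bb_not_vectorized (basic_block bb)
  {
    vect_location = find_bb_location (bb);
    if (vect_print_dump_info (REPORT_UNVECTORIZED_LOCATIONS))
      fprintf (vect_dump, "basic block not vectorized");
  }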
/* Recursively free the memory allocated for the SLP tree rooted at NODE. */ /* Recursively free the memory allocated for the SLP tree rooted at NODE. */
static void static void
...@@ -77,8 +100,9 @@ vect_free_slp_instance (slp_instance instance) ...@@ -77,8 +100,9 @@ vect_free_slp_instance (slp_instance instance)
the SLP group (stored in FIRST_STMT_...). */ the SLP group (stored in FIRST_STMT_...). */
static bool static bool
vect_get_and_check_slp_defs (loop_vec_info loop_vinfo, slp_tree slp_node, vect_get_and_check_slp_defs (loop_vec_info loop_vinfo, bb_vec_info bb_vinfo,
gimple stmt, VEC (gimple, heap) **def_stmts0, slp_tree slp_node, gimple stmt,
VEC (gimple, heap) **def_stmts0,
VEC (gimple, heap) **def_stmts1, VEC (gimple, heap) **def_stmts1,
enum vect_def_type *first_stmt_dt0, enum vect_def_type *first_stmt_dt0,
enum vect_def_type *first_stmt_dt1, enum vect_def_type *first_stmt_dt1,
...@@ -96,7 +120,10 @@ vect_get_and_check_slp_defs (loop_vec_info loop_vinfo, slp_tree slp_node, ...@@ -96,7 +120,10 @@ vect_get_and_check_slp_defs (loop_vec_info loop_vinfo, slp_tree slp_node,
stmt_vec_info stmt_info = stmt_vec_info stmt_info =
vinfo_for_stmt (VEC_index (gimple, SLP_TREE_SCALAR_STMTS (slp_node), 0)); vinfo_for_stmt (VEC_index (gimple, SLP_TREE_SCALAR_STMTS (slp_node), 0));
enum gimple_rhs_class rhs_class; enum gimple_rhs_class rhs_class;
struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo); struct loop *loop = NULL;
if (loop_vinfo)
loop = LOOP_VINFO_LOOP (loop_vinfo);
rhs_class = get_gimple_rhs_class (gimple_assign_rhs_code (stmt)); rhs_class = get_gimple_rhs_class (gimple_assign_rhs_code (stmt));
number_of_oprnds = gimple_num_ops (stmt) - 1; /* RHS only */ number_of_oprnds = gimple_num_ops (stmt) - 1; /* RHS only */
...@@ -105,7 +132,8 @@ vect_get_and_check_slp_defs (loop_vec_info loop_vinfo, slp_tree slp_node, ...@@ -105,7 +132,8 @@ vect_get_and_check_slp_defs (loop_vec_info loop_vinfo, slp_tree slp_node,
{ {
oprnd = gimple_op (stmt, i + 1); oprnd = gimple_op (stmt, i + 1);
if (!vect_is_simple_use (oprnd, loop_vinfo, &def_stmt, &def, &dt[i]) if (!vect_is_simple_use (oprnd, loop_vinfo, bb_vinfo, &def_stmt, &def,
&dt[i])
|| (!def_stmt && dt[i] != vect_constant_def)) || (!def_stmt && dt[i] != vect_constant_def))
{ {
if (vect_print_dump_info (REPORT_SLP)) if (vect_print_dump_info (REPORT_SLP))
...@@ -117,10 +145,10 @@ vect_get_and_check_slp_defs (loop_vec_info loop_vinfo, slp_tree slp_node, ...@@ -117,10 +145,10 @@ vect_get_and_check_slp_defs (loop_vec_info loop_vinfo, slp_tree slp_node,
return false; return false;
} }
/* Check if DEF_STMT is a part of a pattern and get the def stmt from /* Check if DEF_STMT is a part of a pattern in LOOP and get the def stmt
the pattern. Check that all the stmts of the node are in the from the pattern. Check that all the stmts of the node are in the
pattern. */ pattern. */
if (def_stmt && gimple_bb (def_stmt) if (loop && def_stmt && gimple_bb (def_stmt)
&& flow_bb_inside_loop_p (loop, gimple_bb (def_stmt)) && flow_bb_inside_loop_p (loop, gimple_bb (def_stmt))
&& vinfo_for_stmt (def_stmt) && vinfo_for_stmt (def_stmt)
&& STMT_VINFO_IN_PATTERN_P (vinfo_for_stmt (def_stmt))) && STMT_VINFO_IN_PATTERN_P (vinfo_for_stmt (def_stmt)))
...@@ -271,12 +299,13 @@ vect_get_and_check_slp_defs (loop_vec_info loop_vinfo, slp_tree slp_node, ...@@ -271,12 +299,13 @@ vect_get_and_check_slp_defs (loop_vec_info loop_vinfo, slp_tree slp_node,
TRUE. */ TRUE. */
static bool static bool
vect_build_slp_tree (loop_vec_info loop_vinfo, slp_tree *node, vect_build_slp_tree (loop_vec_info loop_vinfo, bb_vec_info bb_vinfo,
unsigned int group_size, slp_tree *node, unsigned int group_size,
int *inside_cost, int *outside_cost, int *inside_cost, int *outside_cost,
int ncopies_for_cost, unsigned int *max_nunits, int ncopies_for_cost, unsigned int *max_nunits,
VEC (int, heap) **load_permutation, VEC (int, heap) **load_permutation,
VEC (slp_tree, heap) **loads) VEC (slp_tree, heap) **loads,
unsigned int vectorization_factor)
{ {
VEC (gimple, heap) *def_stmts0 = VEC_alloc (gimple, heap, group_size); VEC (gimple, heap) *def_stmts0 = VEC_alloc (gimple, heap, group_size);
VEC (gimple, heap) *def_stmts1 = VEC_alloc (gimple, heap, group_size); VEC (gimple, heap) *def_stmts1 = VEC_alloc (gimple, heap, group_size);
...@@ -290,7 +319,7 @@ vect_build_slp_tree (loop_vec_info loop_vinfo, slp_tree *node, ...@@ -290,7 +319,7 @@ vect_build_slp_tree (loop_vec_info loop_vinfo, slp_tree *node,
tree lhs; tree lhs;
bool stop_recursion = false, need_same_oprnds = false; bool stop_recursion = false, need_same_oprnds = false;
tree vectype, scalar_type, first_op1 = NULL_TREE; tree vectype, scalar_type, first_op1 = NULL_TREE;
unsigned int vectorization_factor = 0, ncopies; unsigned int ncopies;
optab optab; optab optab;
int icode; int icode;
enum machine_mode optab_op2_mode; enum machine_mode optab_op2_mode;
...@@ -337,12 +366,17 @@ vect_build_slp_tree (loop_vec_info loop_vinfo, slp_tree *node, ...@@ -337,12 +366,17 @@ vect_build_slp_tree (loop_vec_info loop_vinfo, slp_tree *node,
return false; return false;
} }
gcc_assert (LOOP_VINFO_VECT_FACTOR (loop_vinfo));
vectorization_factor = LOOP_VINFO_VECT_FACTOR (loop_vinfo);
ncopies = vectorization_factor / TYPE_VECTOR_SUBPARTS (vectype); ncopies = vectorization_factor / TYPE_VECTOR_SUBPARTS (vectype);
if (ncopies > 1 && vect_print_dump_info (REPORT_SLP)) if (ncopies != 1)
{
if (vect_print_dump_info (REPORT_SLP))
fprintf (vect_dump, "SLP with multiple types "); fprintf (vect_dump, "SLP with multiple types ");
/* FORNOW: multiple types are unsupported in BB SLP. */
if (bb_vinfo)
return false;
}
/* In case of multiple types we need to detect the smallest type. */ /* In case of multiple types we need to detect the smallest type. */
if (*max_nunits < TYPE_VECTOR_SUBPARTS (vectype)) if (*max_nunits < TYPE_VECTOR_SUBPARTS (vectype))
*max_nunits = TYPE_VECTOR_SUBPARTS (vectype); *max_nunits = TYPE_VECTOR_SUBPARTS (vectype);
...@@ -438,8 +472,8 @@ vect_build_slp_tree (loop_vec_info loop_vinfo, slp_tree *node, ...@@ -438,8 +472,8 @@ vect_build_slp_tree (loop_vec_info loop_vinfo, slp_tree *node,
if (REFERENCE_CLASS_P (lhs)) if (REFERENCE_CLASS_P (lhs))
{ {
/* Store. */ /* Store. */
if (!vect_get_and_check_slp_defs (loop_vinfo, *node, stmt, if (!vect_get_and_check_slp_defs (loop_vinfo, bb_vinfo, *node,
&def_stmts0, &def_stmts1, stmt, &def_stmts0, &def_stmts1,
&first_stmt_dt0, &first_stmt_dt0,
&first_stmt_dt1, &first_stmt_dt1,
&first_stmt_def0_type, &first_stmt_def0_type,
...@@ -552,7 +586,7 @@ vect_build_slp_tree (loop_vec_info loop_vinfo, slp_tree *node, ...@@ -552,7 +586,7 @@ vect_build_slp_tree (loop_vec_info loop_vinfo, slp_tree *node,
} }
/* Find the def-stmts. */ /* Find the def-stmts. */
if (!vect_get_and_check_slp_defs (loop_vinfo, *node, stmt, if (!vect_get_and_check_slp_defs (loop_vinfo, bb_vinfo, *node, stmt,
&def_stmts0, &def_stmts1, &def_stmts0, &def_stmts1,
&first_stmt_dt0, &first_stmt_dt1, &first_stmt_dt0, &first_stmt_dt1,
&first_stmt_def0_type, &first_stmt_def0_type,
...@@ -590,9 +624,10 @@ vect_build_slp_tree (loop_vec_info loop_vinfo, slp_tree *node, ...@@ -590,9 +624,10 @@ vect_build_slp_tree (loop_vec_info loop_vinfo, slp_tree *node,
SLP_TREE_RIGHT (left_node) = NULL; SLP_TREE_RIGHT (left_node) = NULL;
SLP_TREE_OUTSIDE_OF_LOOP_COST (left_node) = 0; SLP_TREE_OUTSIDE_OF_LOOP_COST (left_node) = 0;
SLP_TREE_INSIDE_OF_LOOP_COST (left_node) = 0; SLP_TREE_INSIDE_OF_LOOP_COST (left_node) = 0;
if (!vect_build_slp_tree (loop_vinfo, &left_node, group_size, if (!vect_build_slp_tree (loop_vinfo, bb_vinfo, &left_node, group_size,
inside_cost, outside_cost, ncopies_for_cost, inside_cost, outside_cost, ncopies_for_cost,
max_nunits, load_permutation, loads)) max_nunits, load_permutation, loads,
vectorization_factor))
return false; return false;
SLP_TREE_LEFT (*node) = left_node; SLP_TREE_LEFT (*node) = left_node;
...@@ -607,9 +642,10 @@ vect_build_slp_tree (loop_vec_info loop_vinfo, slp_tree *node, ...@@ -607,9 +642,10 @@ vect_build_slp_tree (loop_vec_info loop_vinfo, slp_tree *node,
SLP_TREE_RIGHT (right_node) = NULL; SLP_TREE_RIGHT (right_node) = NULL;
SLP_TREE_OUTSIDE_OF_LOOP_COST (right_node) = 0; SLP_TREE_OUTSIDE_OF_LOOP_COST (right_node) = 0;
SLP_TREE_INSIDE_OF_LOOP_COST (right_node) = 0; SLP_TREE_INSIDE_OF_LOOP_COST (right_node) = 0;
if (!vect_build_slp_tree (loop_vinfo, &right_node, group_size, if (!vect_build_slp_tree (loop_vinfo, bb_vinfo, &right_node, group_size,
inside_cost, outside_cost, ncopies_for_cost, inside_cost, outside_cost, ncopies_for_cost,
max_nunits, load_permutation, loads)) max_nunits, load_permutation, loads,
vectorization_factor))
return false; return false;
SLP_TREE_RIGHT (*node) = right_node; SLP_TREE_RIGHT (*node) = right_node;
...@@ -664,6 +700,31 @@ vect_mark_slp_stmts (slp_tree node, enum slp_vect_type mark, int j) ...@@ -664,6 +700,31 @@ vect_mark_slp_stmts (slp_tree node, enum slp_vect_type mark, int j)
} }
/* Mark the statements of the tree rooted at NODE as relevant (vect_used). */
static void
vect_mark_slp_stmts_relevant (slp_tree node)
{
int i;
gimple stmt;
stmt_vec_info stmt_info;
if (!node)
return;
for (i = 0; VEC_iterate (gimple, SLP_TREE_SCALAR_STMTS (node), i, stmt); i++)
{
stmt_info = vinfo_for_stmt (stmt);
gcc_assert (!STMT_VINFO_RELEVANT (stmt_info)
|| STMT_VINFO_RELEVANT (stmt_info) == vect_used_in_scope);
STMT_VINFO_RELEVANT (stmt_info) = vect_used_in_scope;
}
vect_mark_slp_stmts_relevant (SLP_TREE_LEFT (node));
vect_mark_slp_stmts_relevant (SLP_TREE_RIGHT (node));
}
/* Check if the permutation required by the SLP INSTANCE is supported.
Reorganize the SLP nodes stored in SLP_INSTANCE_LOADS if needed. */
@@ -736,7 +797,7 @@ vect_supported_load_permutation_p (slp_instance slp_instn, int group_size,
int i = 0, j, prev = -1, next, k;
bool supported;
/* FORNOW: permutations are only supported for loop-aware SLP. */
if (!slp_instn)
return false;
@@ -809,7 +870,8 @@ vect_find_first_load_in_slp_instance (slp_instance instance)
Return FALSE if it's impossible to SLP any stmt in the loop. */
static bool
vect_analyze_slp_instance (loop_vec_info loop_vinfo, bb_vec_info bb_vinfo,
gimple stmt)
{
slp_instance new_instance;
slp_tree node = XNEW (struct _slp_tree);
@@ -818,7 +880,6 @@ vect_analyze_slp_instance (loop_vec_info loop_vinfo, gimple stmt)
tree vectype, scalar_type;
gimple next;
unsigned int vectorization_factor = 0, ncopies;
bool slp_impossible = false;
int inside_cost = 0, outside_cost = 0, ncopies_for_cost;
unsigned int max_nunits = 0;
VEC (int, heap) *load_permutation;
@@ -838,9 +899,24 @@ vect_analyze_slp_instance (loop_vec_info loop_vinfo, gimple stmt)
}
nunits = TYPE_VECTOR_SUBPARTS (vectype);
if (loop_vinfo)
vectorization_factor = LOOP_VINFO_VECT_FACTOR (loop_vinfo);
else
/* No multitypes in BB SLP. */
vectorization_factor = nunits;
ncopies = vectorization_factor / nunits;
/* Calculate the unrolling factor. */
unrolling_factor = least_common_multiple (nunits, group_size) / group_size;
if (unrolling_factor != 1 && !loop_vinfo)
{
if (vect_print_dump_info (REPORT_SLP))
fprintf (vect_dump, "Build SLP failed: unrolling required in BB SLP");
return false;
}
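The guard above is why BB SLP is currently limited to groups that already fill whole vectors: UNROLLING_FACTOR is least_common_multiple (nunits, group_size) / group_size, and any value other than 1 would require unrolling, which only exists for loops. A minimal stand-alone sketch of that computation (the gcd/lcm helpers are illustrative assumptions, not GCC code):

#include <stdio.h>

static unsigned gcd (unsigned a, unsigned b)
{
  /* Euclid's algorithm.  */
  while (b) { unsigned t = a % b; a = b; b = t; }
  return a;
}

static unsigned lcm (unsigned a, unsigned b)
{
  return a / gcd (a, b) * b;
}

int main (void)
{
  unsigned nunits = 4;      /* scalars per vector, e.g. 4 x int  */
  unsigned group_size = 6;  /* stmts in the strided store group  */
  unsigned unrolling_factor = lcm (nunits, group_size) / group_size;
  /* lcm (4, 6) = 12, so unrolling_factor = 12 / 6 = 2: the group would
     have to be unrolled twice to fill whole vectors -- acceptable for
     loop SLP, rejected here for single-basic-block SLP.  */
  printf ("unrolling factor = %u\n", unrolling_factor);
  return 0;
}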
/* Create a node (a root of the SLP tree) for the packed strided stores. */
SLP_TREE_SCALAR_STMTS (node) = VEC_alloc (gimple, heap, group_size);
next = stmt;
@@ -858,9 +934,6 @@ vect_analyze_slp_instance (loop_vec_info loop_vinfo, gimple stmt)
SLP_TREE_OUTSIDE_OF_LOOP_COST (node) = 0;
SLP_TREE_INSIDE_OF_LOOP_COST (node) = 0;
/* Calculate the unrolling factor. */
unrolling_factor = least_common_multiple (nunits, group_size) / group_size;
/* Calculate the number of vector stmts to create based on the unrolling
factor (number of vectors is 1 if NUNITS >= GROUP_SIZE, and is
GROUP_SIZE / NUNITS otherwise). */
@@ -870,9 +943,10 @@ vect_analyze_slp_instance (loop_vec_info loop_vinfo, gimple stmt)
loads = VEC_alloc (slp_tree, heap, group_size);
/* Build the tree for the SLP instance. */
if (vect_build_slp_tree (loop_vinfo, bb_vinfo, &node, group_size,
&inside_cost, &outside_cost, ncopies_for_cost,
&max_nunits, &load_permutation, &loads,
vectorization_factor))
{
/* Create a new SLP instance. */
new_instance = XNEW (struct _slp_instance);
@@ -912,8 +986,14 @@ vect_analyze_slp_instance (loop_vec_info loop_vinfo, gimple stmt)
else
VEC_free (int, heap, SLP_INSTANCE_LOAD_PERMUTATION (new_instance));
if (loop_vinfo)
VEC_safe_push (slp_instance, heap,
LOOP_VINFO_SLP_INSTANCES (loop_vinfo),
new_instance);
else
VEC_safe_push (slp_instance, heap, BB_VINFO_SLP_INSTANCES (bb_vinfo),
new_instance);
if (vect_print_dump_info (REPORT_SLP))
vect_print_slp_tree (node);
@@ -926,12 +1006,7 @@ vect_analyze_slp_instance (loop_vec_info loop_vinfo, gimple stmt)
VEC_free (int, heap, load_permutation);
VEC_free (slp_tree, heap, loads);
if (slp_impossible)
return false;
/* SLP failed for this instance, but it is still possible to SLP other stmts
in the loop. */
return true;
}
@@ -939,21 +1014,29 @@ vect_analyze_slp_instance (loop_vec_info loop_vinfo, gimple stmt)
trees of packed scalar stmts if SLP is possible. */
bool
vect_analyze_slp (loop_vec_info loop_vinfo, bb_vec_info bb_vinfo)
{
unsigned int i;
VEC (gimple, heap) *strided_stores;
gimple store;
bool ok = false;
if (vect_print_dump_info (REPORT_SLP))
fprintf (vect_dump, "=== vect_analyze_slp ===");
if (loop_vinfo)
strided_stores = LOOP_VINFO_STRIDED_STORES (loop_vinfo);
else
strided_stores = BB_VINFO_STRIDED_STORES (bb_vinfo);
for (i = 0; VEC_iterate (gimple, strided_stores, i, store); i++)
if (vect_analyze_slp_instance (loop_vinfo, bb_vinfo, store))
ok = true;
if (bb_vinfo && !ok)
{
if (vect_print_dump_info (REPORT_SLP))
fprintf (vect_dump, "Failed to SLP the basic block.");
return false;
}
@@ -1041,6 +1124,255 @@ vect_detect_hybrid_slp (loop_vec_info loop_vinfo)
vect_detect_hybrid_slp_stmts (SLP_INSTANCE_TREE (instance));
}
/* Create and initialize a new bb_vec_info struct for BB, as well as
stmt_vec_info structs for all the stmts in it. */
static bb_vec_info
new_bb_vec_info (basic_block bb)
{
bb_vec_info res = NULL;
gimple_stmt_iterator gsi;
res = (bb_vec_info) xcalloc (1, sizeof (struct _bb_vec_info));
BB_VINFO_BB (res) = bb;
for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
{
gimple stmt = gsi_stmt (gsi);
gimple_set_uid (stmt, 0);
set_vinfo_for_stmt (stmt, new_stmt_vec_info (stmt, NULL, res));
}
BB_VINFO_STRIDED_STORES (res) = VEC_alloc (gimple, heap, 10);
BB_VINFO_SLP_INSTANCES (res) = VEC_alloc (slp_instance, heap, 2);
bb->aux = res;
return res;
}
/* Free BB_VINFO struct, as well as all the stmt_vec_info structs of all the
stmts in the basic block. */
static void
destroy_bb_vec_info (bb_vec_info bb_vinfo)
{
basic_block bb;
gimple_stmt_iterator si;
if (!bb_vinfo)
return;
bb = BB_VINFO_BB (bb_vinfo);
for (si = gsi_start_bb (bb); !gsi_end_p (si); gsi_next (&si))
{
gimple stmt = gsi_stmt (si);
stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
if (stmt_info)
/* Free stmt_vec_info. */
free_stmt_vec_info (stmt);
}
VEC_free (gimple, heap, BB_VINFO_STRIDED_STORES (bb_vinfo));
VEC_free (slp_instance, heap, BB_VINFO_SLP_INSTANCES (bb_vinfo));
free (bb_vinfo);
bb->aux = NULL;
}
/* Analyze statements contained in SLP tree node after recursively analyzing
the subtree. Return TRUE if the operations are supported. */
static bool
vect_slp_analyze_node_operations (bb_vec_info bb_vinfo, slp_tree node)
{
bool dummy;
int i;
gimple stmt;
if (!node)
return true;
if (!vect_slp_analyze_node_operations (bb_vinfo, SLP_TREE_LEFT (node))
|| !vect_slp_analyze_node_operations (bb_vinfo, SLP_TREE_RIGHT (node)))
return false;
for (i = 0; VEC_iterate (gimple, SLP_TREE_SCALAR_STMTS (node), i, stmt); i++)
{
stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
gcc_assert (stmt_info);
gcc_assert (PURE_SLP_STMT (stmt_info));
if (!vect_analyze_stmt (stmt, &dummy, node))
return false;
}
return true;
}
/* Analyze statements in SLP instances of the basic block. Return TRUE if the
operations are supported. */
static bool
vect_slp_analyze_operations (bb_vec_info bb_vinfo)
{
VEC (slp_instance, heap) *slp_instances = BB_VINFO_SLP_INSTANCES (bb_vinfo);
slp_instance instance;
int i;
for (i = 0; VEC_iterate (slp_instance, slp_instances, i, instance); )
{
if (!vect_slp_analyze_node_operations (bb_vinfo,
SLP_INSTANCE_TREE (instance)))
{
vect_free_slp_instance (instance);
VEC_ordered_remove (slp_instance, slp_instances, i);
}
else
i++;
}
if (!VEC_length (slp_instance, slp_instances))
return false;
return true;
}
/* Check if the basic block can be vectorized. */
bb_vec_info
vect_slp_analyze_bb (basic_block bb)
{
bb_vec_info bb_vinfo;
VEC (ddr_p, heap) *ddrs;
VEC (slp_instance, heap) *slp_instances;
slp_instance instance;
int i, insns = 0;
gimple_stmt_iterator gsi;
if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "===vect_slp_analyze_bb===\n");
for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
insns++;
if (insns > PARAM_VALUE (PARAM_SLP_MAX_INSNS_IN_BB))
{
if (vect_print_dump_info (REPORT_UNVECTORIZED_LOCATIONS))
fprintf (vect_dump, "not vectorized: too many instructions in basic "
"block.\n");
return NULL;
}
bb_vinfo = new_bb_vec_info (bb);
if (!bb_vinfo)
return NULL;
if (!vect_analyze_data_refs (NULL, bb_vinfo))
{
if (vect_print_dump_info (REPORT_UNVECTORIZED_LOCATIONS))
fprintf (vect_dump, "not vectorized: unhandled data-ref in basic "
"block.\n");
destroy_bb_vec_info (bb_vinfo);
return NULL;
}
ddrs = BB_VINFO_DDRS (bb_vinfo);
if (!VEC_length (ddr_p, ddrs))
{
if (vect_print_dump_info (REPORT_UNVECTORIZED_LOCATIONS))
fprintf (vect_dump, "not vectorized: not enough data-refs in basic "
"block.\n");
destroy_bb_vec_info (bb_vinfo);
return NULL;
}
if (!vect_analyze_data_refs_alignment (NULL, bb_vinfo))
{
if (vect_print_dump_info (REPORT_UNVECTORIZED_LOCATIONS))
fprintf (vect_dump, "not vectorized: bad data alignment in basic "
"block.\n");
destroy_bb_vec_info (bb_vinfo);
return NULL;
}
if (!vect_analyze_data_ref_dependences (NULL, bb_vinfo))
{
if (vect_print_dump_info (REPORT_UNVECTORIZED_LOCATIONS))
fprintf (vect_dump, "not vectorized: unhandled data dependence in basic"
" block.\n");
destroy_bb_vec_info (bb_vinfo);
return NULL;
}
if (!vect_analyze_data_ref_accesses (NULL, bb_vinfo))
{
if (vect_print_dump_info (REPORT_UNVECTORIZED_LOCATIONS))
fprintf (vect_dump, "not vectorized: unhandled data access in basic "
"block.\n");
destroy_bb_vec_info (bb_vinfo);
return NULL;
}
if (!vect_verify_datarefs_alignment (NULL, bb_vinfo))
{
if (vect_print_dump_info (REPORT_UNVECTORIZED_LOCATIONS))
fprintf (vect_dump, "not vectorized: unsupported alignment in basic "
"block.\n");
destroy_bb_vec_info (bb_vinfo);
return NULL;
}
/* Check the SLP opportunities in the basic block, analyze and build SLP
trees. */
if (!vect_analyze_slp (NULL, bb_vinfo))
{
if (vect_print_dump_info (REPORT_UNVECTORIZED_LOCATIONS))
fprintf (vect_dump, "not vectorized: failed to find SLP opportunities "
"in basic block.\n");
destroy_bb_vec_info (bb_vinfo);
return NULL;
}
slp_instances = BB_VINFO_SLP_INSTANCES (bb_vinfo);
/* Mark all the statements that we want to vectorize as pure SLP and
relevant. */
for (i = 0; VEC_iterate (slp_instance, slp_instances, i, instance); i++)
{
vect_mark_slp_stmts (SLP_INSTANCE_TREE (instance), pure_slp, -1);
vect_mark_slp_stmts_relevant (SLP_INSTANCE_TREE (instance));
}
if (!vect_slp_analyze_operations (bb_vinfo))
{
if (vect_print_dump_info (REPORT_UNVECTORIZED_LOCATIONS))
fprintf (vect_dump, "not vectorized: bad operation in basic block.\n");
destroy_bb_vec_info (bb_vinfo);
return NULL;
}
if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "BB will be vectorized using SLP\n");
return bb_vinfo;
}
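For reference, the kind of straight-line code this analysis is meant to accept is a group of adjacent, isomorphic operations feeding a group of strided stores, with no enclosing loop required. A hypothetical example (assuming the target has a 4 x int vector mode and that the data-ref, dependence and alignment checks above all pass):

/* Four isomorphic adds feeding four adjacent stores in one basic block.  */
void
foo (int *restrict a, int *restrict b, int *restrict c)
{
  a[0] = b[0] + c[0];
  a[1] = b[1] + c[1];
  a[2] = b[2] + c[2];
  a[3] = b[3] + c[3];
}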
/* SLP costs are calculated according to SLP instance unrolling factor (i.e.,
the number of created vector stmts depends on the unrolling factor). However,
the actual number of vector stmts for every SLP node depends on VF which is
@@ -1064,6 +1396,7 @@ vect_update_slp_costs_according_to_vf (loop_vec_info loop_vinfo)
/ SLP_INSTANCE_UNROLLING_FACTOR (instance);
}
/* For constant and loop invariant defs of SLP_NODE this function returns
(vector) defs (VEC_OPRNDS) that will be used in the vectorized stmts.
OP_NUM determines if we gather defs for operand 0 or operand 1 of the scalar
@@ -1284,6 +1617,7 @@ vect_get_slp_defs (slp_tree slp_node, VEC (tree,heap) **vec_oprnds0,
vect_get_constant_vectors (slp_node, vec_oprnds1, 1, number_of_vects);
}
/* Create NCOPIES permutation statements using the mask MASK_BYTES (by
building a vector of type MASK_TYPE from it) and two input vectors placed in
DR_CHAIN at FIRST_VEC_INDX and SECOND_VEC_INDX for the first copy and
@@ -1680,20 +2014,29 @@ vect_schedule_slp_instance (slp_tree node, slp_instance instance,
bool
vect_schedule_slp (loop_vec_info loop_vinfo, bb_vec_info bb_vinfo)
{
VEC (slp_instance, heap) *slp_instances;
slp_instance instance;
unsigned int i, vf;
bool is_store = false;
if (loop_vinfo)
{
slp_instances = LOOP_VINFO_SLP_INSTANCES (loop_vinfo);
vf = LOOP_VINFO_VECT_FACTOR (loop_vinfo);
}
else
{
slp_instances = BB_VINFO_SLP_INSTANCES (bb_vinfo);
vf = 1;
}
for (i = 0; VEC_iterate (slp_instance, slp_instances, i, instance); i++)
{
/* Schedule the tree of INSTANCE. */
is_store = vect_schedule_slp_instance (SLP_INSTANCE_TREE (instance),
instance, vf);
if (vect_print_dump_info (REPORT_VECTORIZED_LOCATIONS)
|| vect_print_dump_info (REPORT_UNVECTORIZED_LOCATIONS))
fprintf (vect_dump, "vectorizing stmts using SLP.");
@@ -1701,3 +2044,51 @@ vect_schedule_slp (loop_vec_info loop_vinfo)
return is_store;
}
/* Vectorize the basic block. */
void
vect_slp_transform_bb (basic_block bb)
{
bb_vec_info bb_vinfo = vec_info_for_bb (bb);
gimple_stmt_iterator si;
gcc_assert (bb_vinfo);
if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "SLPing BB\n");
for (si = gsi_start_bb (bb); !gsi_end_p (si); gsi_next (&si))
{
gimple stmt = gsi_stmt (si);
stmt_vec_info stmt_info;
if (vect_print_dump_info (REPORT_DETAILS))
{
fprintf (vect_dump, "------>SLPing statement: ");
print_gimple_stmt (vect_dump, stmt, 0, TDF_SLIM);
}
stmt_info = vinfo_for_stmt (stmt);
gcc_assert (stmt_info);
/* Schedule all the SLP instances when the first SLP stmt is reached. */
if (STMT_SLP_TYPE (stmt_info))
{
vect_schedule_slp (NULL, bb_vinfo);
break;
}
}
mark_sym_for_renaming (gimple_vop (cfun));
/* The memory tags and pointers in vectorized statements need to
have their SSA forms updated. FIXME, why can't this be delayed
until all the loops have been transformed? */
update_ssa (TODO_update_ssa);
if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "BB VECTORIZED\n");
}
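Tying the two phases together: once vect_slp_analyze_bb has built the SLP instances, this transformation rewrites the statements in place. For the hypothetical store group shown after vect_slp_analyze_bb, the net effect is roughly the following, written here with the GCC vector extension purely as a sketch (alignment handling and SSA bookkeeping omitted):

typedef int v4si __attribute__ ((vector_size (16)));

/* Conceptual result of SLPing the four scalar adds and stores; the pass
   itself emits vector GIMPLE statements, not this C source.  */
void
foo_vectorized (int *restrict a, int *restrict b, int *restrict c)
{
  *(v4si *) a = *(v4si *) b + *(v4si *) c;  /* assumes suitable alignment  */
}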
@@ -248,7 +248,7 @@ process_use (gimple stmt, tree use, loop_vec_info loop_vinfo, bool live_p,
if (!exist_non_indexing_operands_for_use_p (use, stmt))
return true;
if (!vect_is_simple_use (use, loop_vinfo, NULL, &def_stmt, &def, &dt))
{
if (vect_print_dump_info (REPORT_UNVECTORIZED_LOCATIONS))
fprintf (vect_dump, "not vectorized: unsupported use in stmt.");
@@ -813,14 +813,30 @@ vect_init_vector (gimple stmt, tree vector_var, tree vector_type,
else
{
loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_vinfo);
if (loop_vinfo)
{
struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
if (nested_in_vect_loop_p (loop, stmt))
loop = loop->inner;
pe = loop_preheader_edge (loop);
new_bb = gsi_insert_on_edge_immediate (pe, init_stmt);
gcc_assert (!new_bb);
}
else
{
bb_vec_info bb_vinfo = STMT_VINFO_BB_VINFO (stmt_vinfo);
basic_block bb;
gimple_stmt_iterator gsi_bb_start;
gcc_assert (bb_vinfo);
bb = BB_VINFO_BB (bb_vinfo);
gsi_bb_start = gsi_start_bb (bb);
gsi_insert_before (&gsi_bb_start, init_stmt, GSI_SAME_STMT);
}
}
if (vect_print_dump_info (REPORT_DETAILS))
{
@@ -832,6 +848,7 @@ vect_init_vector (gimple stmt, tree vector_var, tree vector_type,
return vec_oprnd;
}
/* Function vect_get_vec_def_for_operand.
OP is an operand in STMT. This function returns a (vector) def that will be
@@ -869,7 +886,8 @@ vect_get_vec_def_for_operand (tree op, gimple stmt, tree *scalar_def)
print_generic_expr (vect_dump, op, TDF_SLIM);
}
is_simple_use = vect_is_simple_use (op, loop_vinfo, NULL, &def_stmt, &def,
&dt);
gcc_assert (is_simple_use);
if (vect_print_dump_info (REPORT_DETAILS))
{
@@ -1122,12 +1140,14 @@ vect_finish_stmt_generation (gimple stmt, gimple vec_stmt,
{
stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
bb_vec_info bb_vinfo = STMT_VINFO_BB_VINFO (stmt_info);
gcc_assert (gimple_code (stmt) != GIMPLE_LABEL);
gsi_insert_before (gsi, vec_stmt, GSI_SAME_STMT);
set_vinfo_for_stmt (vec_stmt, new_stmt_vec_info (vec_stmt, loop_vinfo,
bb_vinfo));
if (vect_print_dump_info (REPORT_DETAILS))
{
@@ -1191,6 +1211,9 @@ vectorizable_call (gimple stmt, gimple_stmt_iterator *gsi, gimple *vec_stmt)
enum { NARROW, NONE, WIDEN } modifier;
size_t i, nargs;
/* FORNOW: unsupported in basic block SLP. */
gcc_assert (loop_vinfo);
if (!STMT_VINFO_RELEVANT_P (stmt_info))
return false;
...@@ -1232,7 +1255,7 @@ vectorizable_call (gimple stmt, gimple_stmt_iterator *gsi, gimple *vec_stmt) ...@@ -1232,7 +1255,7 @@ vectorizable_call (gimple stmt, gimple_stmt_iterator *gsi, gimple *vec_stmt)
} }
rhs_type = TREE_TYPE (op); rhs_type = TREE_TYPE (op);
if (!vect_is_simple_use (op, loop_vinfo, &def_stmt, &def, &dt[i])) if (!vect_is_simple_use (op, loop_vinfo, NULL, &def_stmt, &def, &dt[i]))
{ {
if (vect_print_dump_info (REPORT_DETAILS)) if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "use not simple."); fprintf (vect_dump, "use not simple.");
...@@ -1509,6 +1532,9 @@ vectorizable_conversion (gimple stmt, gimple_stmt_iterator *gsi, ...@@ -1509,6 +1532,9 @@ vectorizable_conversion (gimple stmt, gimple_stmt_iterator *gsi,
/* Is STMT a vectorizable conversion? */ /* Is STMT a vectorizable conversion? */
/* FORNOW: unsupported in basic block SLP. */
gcc_assert (loop_vinfo);
if (!STMT_VINFO_RELEVANT_P (stmt_info)) if (!STMT_VINFO_RELEVANT_P (stmt_info))
return false; return false;
...@@ -1575,7 +1601,7 @@ vectorizable_conversion (gimple stmt, gimple_stmt_iterator *gsi, ...@@ -1575,7 +1601,7 @@ vectorizable_conversion (gimple stmt, gimple_stmt_iterator *gsi,
gcc_assert (ncopies >= 1); gcc_assert (ncopies >= 1);
/* Check the operands of the operation. */ /* Check the operands of the operation. */
if (!vect_is_simple_use (op0, loop_vinfo, &def_stmt, &def, &dt[0])) if (!vect_is_simple_use (op0, loop_vinfo, NULL, &def_stmt, &def, &dt[0]))
{ {
if (vect_print_dump_info (REPORT_DETAILS)) if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "use not simple."); fprintf (vect_dump, "use not simple.");
...@@ -1759,6 +1785,7 @@ vectorizable_assignment (gimple stmt, gimple_stmt_iterator *gsi, ...@@ -1759,6 +1785,7 @@ vectorizable_assignment (gimple stmt, gimple_stmt_iterator *gsi,
int i; int i;
VEC(tree,heap) *vec_oprnds = NULL; VEC(tree,heap) *vec_oprnds = NULL;
tree vop; tree vop;
bb_vec_info bb_vinfo = STMT_VINFO_BB_VINFO (stmt_info);
/* Multiple types in SLP are handled by creating the appropriate number of /* Multiple types in SLP are handled by creating the appropriate number of
vectorized stmts for each SLP node. Hence, NCOPIES is always 1 in vectorized stmts for each SLP node. Hence, NCOPIES is always 1 in
...@@ -1772,7 +1799,7 @@ vectorizable_assignment (gimple stmt, gimple_stmt_iterator *gsi, ...@@ -1772,7 +1799,7 @@ vectorizable_assignment (gimple stmt, gimple_stmt_iterator *gsi,
if (ncopies > 1) if (ncopies > 1)
return false; /* FORNOW */ return false; /* FORNOW */
if (!STMT_VINFO_RELEVANT_P (stmt_info)) if (!STMT_VINFO_RELEVANT_P (stmt_info) && !bb_vinfo)
return false; return false;
if (STMT_VINFO_DEF_TYPE (stmt_info) != vect_internal_def) if (STMT_VINFO_DEF_TYPE (stmt_info) != vect_internal_def)
...@@ -1792,7 +1819,7 @@ vectorizable_assignment (gimple stmt, gimple_stmt_iterator *gsi, ...@@ -1792,7 +1819,7 @@ vectorizable_assignment (gimple stmt, gimple_stmt_iterator *gsi,
else else
return false; return false;
if (!vect_is_simple_use (op, loop_vinfo, &def_stmt, &def, &dt[0])) if (!vect_is_simple_use (op, loop_vinfo, bb_vinfo, &def_stmt, &def, &dt[0]))
{ {
if (vect_print_dump_info (REPORT_DETAILS)) if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "use not simple."); fprintf (vect_dump, "use not simple.");
...@@ -1875,6 +1902,14 @@ vectorizable_operation (gimple stmt, gimple_stmt_iterator *gsi, ...@@ -1875,6 +1902,14 @@ vectorizable_operation (gimple stmt, gimple_stmt_iterator *gsi,
unsigned int k; unsigned int k;
bool shift_p = false; bool shift_p = false;
bool scalar_shift_arg = false; bool scalar_shift_arg = false;
bb_vec_info bb_vinfo = STMT_VINFO_BB_VINFO (stmt_info);
int vf;
if (loop_vinfo)
vf = LOOP_VINFO_VECT_FACTOR (loop_vinfo);
else
/* FORNOW: multiple types are not supported in basic block SLP. */
vf = nunits_in;
/* Multiple types in SLP are handled by creating the appropriate number of /* Multiple types in SLP are handled by creating the appropriate number of
vectorized stmts for each SLP node. Hence, NCOPIES is always 1 in vectorized stmts for each SLP node. Hence, NCOPIES is always 1 in
...@@ -1886,7 +1921,7 @@ vectorizable_operation (gimple stmt, gimple_stmt_iterator *gsi, ...@@ -1886,7 +1921,7 @@ vectorizable_operation (gimple stmt, gimple_stmt_iterator *gsi,
gcc_assert (ncopies >= 1); gcc_assert (ncopies >= 1);
if (!STMT_VINFO_RELEVANT_P (stmt_info)) if (!STMT_VINFO_RELEVANT_P (stmt_info) && !bb_vinfo)
return false; return false;
if (STMT_VINFO_DEF_TYPE (stmt_info) != vect_internal_def) if (STMT_VINFO_DEF_TYPE (stmt_info) != vect_internal_def)
...@@ -1924,7 +1959,7 @@ vectorizable_operation (gimple stmt, gimple_stmt_iterator *gsi, ...@@ -1924,7 +1959,7 @@ vectorizable_operation (gimple stmt, gimple_stmt_iterator *gsi,
} }
op0 = gimple_assign_rhs1 (stmt); op0 = gimple_assign_rhs1 (stmt);
if (!vect_is_simple_use (op0, loop_vinfo, &def_stmt, &def, &dt[0])) if (!vect_is_simple_use (op0, loop_vinfo, bb_vinfo, &def_stmt, &def, &dt[0]))
{ {
if (vect_print_dump_info (REPORT_DETAILS)) if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "use not simple."); fprintf (vect_dump, "use not simple.");
...@@ -1934,7 +1969,8 @@ vectorizable_operation (gimple stmt, gimple_stmt_iterator *gsi, ...@@ -1934,7 +1969,8 @@ vectorizable_operation (gimple stmt, gimple_stmt_iterator *gsi,
if (op_type == binary_op) if (op_type == binary_op)
{ {
op1 = gimple_assign_rhs2 (stmt); op1 = gimple_assign_rhs2 (stmt);
if (!vect_is_simple_use (op1, loop_vinfo, &def_stmt, &def, &dt[1])) if (!vect_is_simple_use (op1, loop_vinfo, bb_vinfo, &def_stmt, &def,
&dt[1]))
{ {
if (vect_print_dump_info (REPORT_DETAILS)) if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "use not simple."); fprintf (vect_dump, "use not simple.");
...@@ -2016,8 +2052,7 @@ vectorizable_operation (gimple stmt, gimple_stmt_iterator *gsi, ...@@ -2016,8 +2052,7 @@ vectorizable_operation (gimple stmt, gimple_stmt_iterator *gsi,
fprintf (vect_dump, "op not supported by target."); fprintf (vect_dump, "op not supported by target.");
/* Check only during analysis. */ /* Check only during analysis. */
if (GET_MODE_SIZE (vec_mode) != UNITS_PER_WORD if (GET_MODE_SIZE (vec_mode) != UNITS_PER_WORD
|| (LOOP_VINFO_VECT_FACTOR (loop_vinfo) || (vf < vect_min_worthwhile_factor (code)
< vect_min_worthwhile_factor (code)
&& !vec_stmt)) && !vec_stmt))
return false; return false;
if (vect_print_dump_info (REPORT_DETAILS)) if (vect_print_dump_info (REPORT_DETAILS))
...@@ -2026,8 +2061,7 @@ vectorizable_operation (gimple stmt, gimple_stmt_iterator *gsi, ...@@ -2026,8 +2061,7 @@ vectorizable_operation (gimple stmt, gimple_stmt_iterator *gsi,
/* Worthwhile without SIMD support? Check only during analysis. */ /* Worthwhile without SIMD support? Check only during analysis. */
if (!VECTOR_MODE_P (TYPE_MODE (vectype)) if (!VECTOR_MODE_P (TYPE_MODE (vectype))
&& LOOP_VINFO_VECT_FACTOR (loop_vinfo) && vf < vect_min_worthwhile_factor (code)
< vect_min_worthwhile_factor (code)
&& !vec_stmt) && !vec_stmt)
{ {
if (vect_print_dump_info (REPORT_DETAILS)) if (vect_print_dump_info (REPORT_DETAILS))
@@ -2333,6 +2367,9 @@ vectorizable_type_demotion (gimple stmt, gimple_stmt_iterator *gsi,
VEC (tree, heap) *vec_dsts = NULL, *interm_types = NULL, *tmp_vec_dsts = NULL;
tree last_oprnd, intermediate_type;
/* FORNOW: not supported by basic block SLP vectorization. */
gcc_assert (loop_vinfo);
if (!STMT_VINFO_RELEVANT_P (stmt_info))
return false;
...@@ -2371,7 +2408,6 @@ vectorizable_type_demotion (gimple stmt, gimple_stmt_iterator *gsi, ...@@ -2371,7 +2408,6 @@ vectorizable_type_demotion (gimple stmt, gimple_stmt_iterator *gsi,
ncopies = 1; ncopies = 1;
else else
ncopies = LOOP_VINFO_VECT_FACTOR (loop_vinfo) / nunits_out; ncopies = LOOP_VINFO_VECT_FACTOR (loop_vinfo) / nunits_out;
gcc_assert (ncopies >= 1); gcc_assert (ncopies >= 1);
if (! ((INTEGRAL_TYPE_P (TREE_TYPE (scalar_dest)) if (! ((INTEGRAL_TYPE_P (TREE_TYPE (scalar_dest))
...@@ -2382,7 +2418,7 @@ vectorizable_type_demotion (gimple stmt, gimple_stmt_iterator *gsi, ...@@ -2382,7 +2418,7 @@ vectorizable_type_demotion (gimple stmt, gimple_stmt_iterator *gsi,
return false; return false;
/* Check the operands of the operation. */ /* Check the operands of the operation. */
if (!vect_is_simple_use (op0, loop_vinfo, &def_stmt, &def, &dt[0])) if (!vect_is_simple_use (op0, loop_vinfo, NULL, &def_stmt, &def, &dt[0]))
{ {
if (vect_print_dump_info (REPORT_DETAILS)) if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "use not simple."); fprintf (vect_dump, "use not simple.");
@@ -2600,6 +2636,9 @@ vectorizable_type_promotion (gimple stmt, gimple_stmt_iterator *gsi,
VEC (tree, heap) *vec_oprnds0 = NULL, *vec_oprnds1 = NULL;
VEC (tree, heap) *vec_dsts = NULL, *interm_types = NULL, *tmp_vec_dsts = NULL;
/* FORNOW: not supported by basic block SLP vectorization. */
gcc_assert (loop_vinfo);
if (!STMT_VINFO_RELEVANT_P (stmt_info))
return false;
...@@ -2650,7 +2689,7 @@ vectorizable_type_promotion (gimple stmt, gimple_stmt_iterator *gsi, ...@@ -2650,7 +2689,7 @@ vectorizable_type_promotion (gimple stmt, gimple_stmt_iterator *gsi,
return false; return false;
/* Check the operands of the operation. */ /* Check the operands of the operation. */
if (!vect_is_simple_use (op0, loop_vinfo, &def_stmt, &def, &dt[0])) if (!vect_is_simple_use (op0, loop_vinfo, NULL, &def_stmt, &def, &dt[0]))
{ {
if (vect_print_dump_info (REPORT_DETAILS)) if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "use not simple."); fprintf (vect_dump, "use not simple.");
...@@ -2661,7 +2700,7 @@ vectorizable_type_promotion (gimple stmt, gimple_stmt_iterator *gsi, ...@@ -2661,7 +2700,7 @@ vectorizable_type_promotion (gimple stmt, gimple_stmt_iterator *gsi,
if (op_type == binary_op) if (op_type == binary_op)
{ {
op1 = gimple_assign_rhs2 (stmt); op1 = gimple_assign_rhs2 (stmt);
if (!vect_is_simple_use (op1, loop_vinfo, &def_stmt, &def, &dt[1])) if (!vect_is_simple_use (op1, loop_vinfo, NULL, &def_stmt, &def, &dt[1]))
{ {
if (vect_print_dump_info (REPORT_DETAILS)) if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "use not simple."); fprintf (vect_dump, "use not simple.");
...@@ -2806,7 +2845,7 @@ vectorizable_store (gimple stmt, gimple_stmt_iterator *gsi, gimple *vec_stmt, ...@@ -2806,7 +2845,7 @@ vectorizable_store (gimple stmt, gimple_stmt_iterator *gsi, gimple *vec_stmt,
struct data_reference *dr = STMT_VINFO_DATA_REF (stmt_info), *first_dr = NULL; struct data_reference *dr = STMT_VINFO_DATA_REF (stmt_info), *first_dr = NULL;
tree vectype = STMT_VINFO_VECTYPE (stmt_info); tree vectype = STMT_VINFO_VECTYPE (stmt_info);
loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info); loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo); struct loop *loop = NULL;
enum machine_mode vec_mode; enum machine_mode vec_mode;
tree dummy; tree dummy;
enum dr_alignment_support alignment_support_scheme; enum dr_alignment_support alignment_support_scheme;
...@@ -2827,6 +2866,10 @@ vectorizable_store (gimple stmt, gimple_stmt_iterator *gsi, gimple *vec_stmt, ...@@ -2827,6 +2866,10 @@ vectorizable_store (gimple stmt, gimple_stmt_iterator *gsi, gimple *vec_stmt,
bool slp = (slp_node != NULL); bool slp = (slp_node != NULL);
stmt_vec_info first_stmt_vinfo; stmt_vec_info first_stmt_vinfo;
unsigned int vec_num; unsigned int vec_num;
bb_vec_info bb_vinfo = STMT_VINFO_BB_VINFO (stmt_info);
if (loop_vinfo)
loop = LOOP_VINFO_LOOP (loop_vinfo);
/* Multiple types in SLP are handled by creating the appropriate number of /* Multiple types in SLP are handled by creating the appropriate number of
vectorized stmts for each SLP node. Hence, NCOPIES is always 1 in vectorized stmts for each SLP node. Hence, NCOPIES is always 1 in
...@@ -2839,14 +2882,14 @@ vectorizable_store (gimple stmt, gimple_stmt_iterator *gsi, gimple *vec_stmt, ...@@ -2839,14 +2882,14 @@ vectorizable_store (gimple stmt, gimple_stmt_iterator *gsi, gimple *vec_stmt,
gcc_assert (ncopies >= 1); gcc_assert (ncopies >= 1);
/* FORNOW. This restriction should be relaxed. */ /* FORNOW. This restriction should be relaxed. */
if (nested_in_vect_loop_p (loop, stmt) && ncopies > 1) if (loop && nested_in_vect_loop_p (loop, stmt) && ncopies > 1)
{ {
if (vect_print_dump_info (REPORT_DETAILS)) if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "multiple types in nested loop."); fprintf (vect_dump, "multiple types in nested loop.");
return false; return false;
} }
if (!STMT_VINFO_RELEVANT_P (stmt_info)) if (!STMT_VINFO_RELEVANT_P (stmt_info) && !bb_vinfo)
return false; return false;
if (STMT_VINFO_DEF_TYPE (stmt_info) != vect_internal_def) if (STMT_VINFO_DEF_TYPE (stmt_info) != vect_internal_def)
...@@ -2865,7 +2908,7 @@ vectorizable_store (gimple stmt, gimple_stmt_iterator *gsi, gimple *vec_stmt, ...@@ -2865,7 +2908,7 @@ vectorizable_store (gimple stmt, gimple_stmt_iterator *gsi, gimple *vec_stmt,
gcc_assert (gimple_assign_single_p (stmt)); gcc_assert (gimple_assign_single_p (stmt));
op = gimple_assign_rhs1 (stmt); op = gimple_assign_rhs1 (stmt);
if (!vect_is_simple_use (op, loop_vinfo, &def_stmt, &def, &dt)) if (!vect_is_simple_use (op, loop_vinfo, bb_vinfo, &def_stmt, &def, &dt))
{ {
if (vect_print_dump_info (REPORT_DETAILS)) if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "use not simple."); fprintf (vect_dump, "use not simple.");
...@@ -2907,7 +2950,8 @@ vectorizable_store (gimple stmt, gimple_stmt_iterator *gsi, gimple *vec_stmt, ...@@ -2907,7 +2950,8 @@ vectorizable_store (gimple stmt, gimple_stmt_iterator *gsi, gimple *vec_stmt,
{ {
gcc_assert (gimple_assign_single_p (next_stmt)); gcc_assert (gimple_assign_single_p (next_stmt));
op = gimple_assign_rhs1 (next_stmt); op = gimple_assign_rhs1 (next_stmt);
if (!vect_is_simple_use (op, loop_vinfo, &def_stmt, &def, &dt)) if (!vect_is_simple_use (op, loop_vinfo, bb_vinfo, &def_stmt,
&def, &dt))
{ {
if (vect_print_dump_info (REPORT_DETAILS)) if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "use not simple."); fprintf (vect_dump, "use not simple.");
...@@ -2935,7 +2979,7 @@ vectorizable_store (gimple stmt, gimple_stmt_iterator *gsi, gimple *vec_stmt, ...@@ -2935,7 +2979,7 @@ vectorizable_store (gimple stmt, gimple_stmt_iterator *gsi, gimple *vec_stmt,
DR_GROUP_STORE_COUNT (vinfo_for_stmt (first_stmt))++; DR_GROUP_STORE_COUNT (vinfo_for_stmt (first_stmt))++;
/* FORNOW */ /* FORNOW */
gcc_assert (!nested_in_vect_loop_p (loop, stmt)); gcc_assert (!loop || !nested_in_vect_loop_p (loop, stmt));
/* We vectorize all the stmts of the interleaving group when we /* We vectorize all the stmts of the interleaving group when we
reach the last stmt in the group. */ reach the last stmt in the group. */
...@@ -3063,7 +3107,7 @@ vectorizable_store (gimple stmt, gimple_stmt_iterator *gsi, gimple *vec_stmt, ...@@ -3063,7 +3107,7 @@ vectorizable_store (gimple stmt, gimple_stmt_iterator *gsi, gimple *vec_stmt,
dataref_ptr = vect_create_data_ref_ptr (first_stmt, NULL, NULL_TREE, dataref_ptr = vect_create_data_ref_ptr (first_stmt, NULL, NULL_TREE,
&dummy, &ptr_incr, false, &dummy, &ptr_incr, false,
&inv_p); &inv_p);
gcc_assert (!inv_p); gcc_assert (bb_vinfo || !inv_p);
} }
else else
{ {
...@@ -3077,7 +3121,8 @@ vectorizable_store (gimple stmt, gimple_stmt_iterator *gsi, gimple *vec_stmt, ...@@ -3077,7 +3121,8 @@ vectorizable_store (gimple stmt, gimple_stmt_iterator *gsi, gimple *vec_stmt,
for (i = 0; i < group_size; i++) for (i = 0; i < group_size; i++)
{ {
op = VEC_index (tree, oprnds, i); op = VEC_index (tree, oprnds, i);
vect_is_simple_use (op, loop_vinfo, &def_stmt, &def, &dt); vect_is_simple_use (op, loop_vinfo, bb_vinfo, &def_stmt, &def,
&dt);
vec_oprnd = vect_get_vec_def_for_stmt_copy (dt, op); vec_oprnd = vect_get_vec_def_for_stmt_copy (dt, op);
VEC_replace(tree, dr_chain, i, vec_oprnd); VEC_replace(tree, dr_chain, i, vec_oprnd);
VEC_replace(tree, oprnds, i, vec_oprnd); VEC_replace(tree, oprnds, i, vec_oprnd);
...@@ -3162,9 +3207,9 @@ vectorizable_load (gimple stmt, gimple_stmt_iterator *gsi, gimple *vec_stmt, ...@@ -3162,9 +3207,9 @@ vectorizable_load (gimple stmt, gimple_stmt_iterator *gsi, gimple *vec_stmt,
stmt_vec_info stmt_info = vinfo_for_stmt (stmt); stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
stmt_vec_info prev_stmt_info; stmt_vec_info prev_stmt_info;
loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info); loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo); struct loop *loop = NULL;
struct loop *containing_loop = (gimple_bb (stmt))->loop_father; struct loop *containing_loop = (gimple_bb (stmt))->loop_father;
bool nested_in_vect_loop = nested_in_vect_loop_p (loop, stmt); bool nested_in_vect_loop = false;
struct data_reference *dr = STMT_VINFO_DATA_REF (stmt_info), *first_dr; struct data_reference *dr = STMT_VINFO_DATA_REF (stmt_info), *first_dr;
tree vectype = STMT_VINFO_VECTYPE (stmt_info); tree vectype = STMT_VINFO_VECTYPE (stmt_info);
tree new_temp; tree new_temp;
...@@ -3192,6 +3237,18 @@ vectorizable_load (gimple stmt, gimple_stmt_iterator *gsi, gimple *vec_stmt, ...@@ -3192,6 +3237,18 @@ vectorizable_load (gimple stmt, gimple_stmt_iterator *gsi, gimple *vec_stmt,
bool slp = (slp_node != NULL); bool slp = (slp_node != NULL);
bool slp_perm = false; bool slp_perm = false;
enum tree_code code; enum tree_code code;
bb_vec_info bb_vinfo = STMT_VINFO_BB_VINFO (stmt_info);
int vf;
if (loop_vinfo)
{
loop = LOOP_VINFO_LOOP (loop_vinfo);
nested_in_vect_loop = nested_in_vect_loop_p (loop, stmt);
vf = LOOP_VINFO_VECT_FACTOR (loop_vinfo);
}
else
/* FORNOW: multiple types are not supported in basic block SLP. */
vf = nunits;
/* Multiple types in SLP are handled by creating the appropriate number of /* Multiple types in SLP are handled by creating the appropriate number of
vectorized stmts for each SLP node. Hence, NCOPIES is always 1 in vectorized stmts for each SLP node. Hence, NCOPIES is always 1 in
...@@ -3211,10 +3268,7 @@ vectorizable_load (gimple stmt, gimple_stmt_iterator *gsi, gimple *vec_stmt, ...@@ -3211,10 +3268,7 @@ vectorizable_load (gimple stmt, gimple_stmt_iterator *gsi, gimple *vec_stmt,
return false; return false;
} }
if (slp && SLP_INSTANCE_LOAD_PERMUTATION (slp_node_instance)) if (!STMT_VINFO_RELEVANT_P (stmt_info) && !bb_vinfo)
slp_perm = true;
if (!STMT_VINFO_RELEVANT_P (stmt_info))
return false; return false;
if (STMT_VINFO_DEF_TYPE (stmt_info) != vect_internal_def) if (STMT_VINFO_DEF_TYPE (stmt_info) != vect_internal_def)
...@@ -3300,6 +3354,8 @@ vectorizable_load (gimple stmt, gimple_stmt_iterator *gsi, gimple *vec_stmt, ...@@ -3300,6 +3354,8 @@ vectorizable_load (gimple stmt, gimple_stmt_iterator *gsi, gimple *vec_stmt,
{ {
strided_load = false; strided_load = false;
vec_num = SLP_TREE_NUMBER_OF_VEC_STMTS (slp_node); vec_num = SLP_TREE_NUMBER_OF_VEC_STMTS (slp_node);
if (SLP_INSTANCE_LOAD_PERMUTATION (slp_node_instance))
slp_perm = true;
} }
else else
vec_num = group_size; vec_num = group_size;
...@@ -3419,7 +3475,7 @@ vectorizable_load (gimple stmt, gimple_stmt_iterator *gsi, gimple *vec_stmt, ...@@ -3419,7 +3475,7 @@ vectorizable_load (gimple stmt, gimple_stmt_iterator *gsi, gimple *vec_stmt,
This can only occur when vectorizing memory accesses in the inner-loop This can only occur when vectorizing memory accesses in the inner-loop
nested within an outer-loop that is being vectorized. */ nested within an outer-loop that is being vectorized. */
if (nested_in_vect_loop_p (loop, stmt) if (loop && nested_in_vect_loop_p (loop, stmt)
&& (TREE_INT_CST_LOW (DR_STEP (dr)) && (TREE_INT_CST_LOW (DR_STEP (dr))
% GET_MODE_SIZE (TYPE_MODE (vectype)) != 0)) % GET_MODE_SIZE (TYPE_MODE (vectype)) != 0))
{ {
...@@ -3551,7 +3607,7 @@ vectorizable_load (gimple stmt, gimple_stmt_iterator *gsi, gimple *vec_stmt, ...@@ -3551,7 +3607,7 @@ vectorizable_load (gimple stmt, gimple_stmt_iterator *gsi, gimple *vec_stmt,
} }
/* 4. Handle invariant-load. */ /* 4. Handle invariant-load. */
if (inv_p) if (inv_p && !bb_vinfo)
{ {
gcc_assert (!strided_load); gcc_assert (!strided_load);
gcc_assert (nested_in_vect_loop_p (loop, stmt)); gcc_assert (nested_in_vect_loop_p (loop, stmt));
...@@ -3598,8 +3654,7 @@ vectorizable_load (gimple stmt, gimple_stmt_iterator *gsi, gimple *vec_stmt, ...@@ -3598,8 +3654,7 @@ vectorizable_load (gimple stmt, gimple_stmt_iterator *gsi, gimple *vec_stmt,
if (slp_perm) if (slp_perm)
{ {
if (!vect_transform_slp_perm_load (stmt, dr_chain, gsi, if (!vect_transform_slp_perm_load (stmt, dr_chain, gsi, vf,
LOOP_VINFO_VECT_FACTOR (loop_vinfo),
slp_node_instance, false)) slp_node_instance, false))
{ {
VEC_free (tree, heap, dr_chain); VEC_free (tree, heap, dr_chain);
...@@ -3659,7 +3714,8 @@ vect_is_simple_cond (tree cond, loop_vec_info loop_vinfo) ...@@ -3659,7 +3714,8 @@ vect_is_simple_cond (tree cond, loop_vec_info loop_vinfo)
if (TREE_CODE (lhs) == SSA_NAME) if (TREE_CODE (lhs) == SSA_NAME)
{ {
gimple lhs_def_stmt = SSA_NAME_DEF_STMT (lhs); gimple lhs_def_stmt = SSA_NAME_DEF_STMT (lhs);
if (!vect_is_simple_use (lhs, loop_vinfo, &lhs_def_stmt, &def, &dt)) if (!vect_is_simple_use (lhs, loop_vinfo, NULL, &lhs_def_stmt, &def,
&dt))
return false; return false;
} }
else if (TREE_CODE (lhs) != INTEGER_CST && TREE_CODE (lhs) != REAL_CST else if (TREE_CODE (lhs) != INTEGER_CST && TREE_CODE (lhs) != REAL_CST
...@@ -3669,7 +3725,8 @@ vect_is_simple_cond (tree cond, loop_vec_info loop_vinfo) ...@@ -3669,7 +3725,8 @@ vect_is_simple_cond (tree cond, loop_vec_info loop_vinfo)
if (TREE_CODE (rhs) == SSA_NAME) if (TREE_CODE (rhs) == SSA_NAME)
{ {
gimple rhs_def_stmt = SSA_NAME_DEF_STMT (rhs); gimple rhs_def_stmt = SSA_NAME_DEF_STMT (rhs);
if (!vect_is_simple_use (rhs, loop_vinfo, &rhs_def_stmt, &def, &dt)) if (!vect_is_simple_use (rhs, loop_vinfo, NULL, &rhs_def_stmt, &def,
&dt))
return false; return false;
} }
else if (TREE_CODE (rhs) != INTEGER_CST && TREE_CODE (rhs) != REAL_CST else if (TREE_CODE (rhs) != INTEGER_CST && TREE_CODE (rhs) != REAL_CST
@@ -3709,6 +3766,9 @@ vectorizable_condition (gimple stmt, gimple_stmt_iterator *gsi,
int ncopies = LOOP_VINFO_VECT_FACTOR (loop_vinfo) / nunits;
enum tree_code code;
/* FORNOW: unsupported in basic block SLP. */
gcc_assert (loop_vinfo);
gcc_assert (ncopies >= 1);
if (ncopies > 1)
return false; /* FORNOW */
...@@ -3757,7 +3817,7 @@ vectorizable_condition (gimple stmt, gimple_stmt_iterator *gsi, ...@@ -3757,7 +3817,7 @@ vectorizable_condition (gimple stmt, gimple_stmt_iterator *gsi,
if (TREE_CODE (then_clause) == SSA_NAME) if (TREE_CODE (then_clause) == SSA_NAME)
{ {
gimple then_def_stmt = SSA_NAME_DEF_STMT (then_clause); gimple then_def_stmt = SSA_NAME_DEF_STMT (then_clause);
if (!vect_is_simple_use (then_clause, loop_vinfo, if (!vect_is_simple_use (then_clause, loop_vinfo, NULL,
&then_def_stmt, &def, &dt)) &then_def_stmt, &def, &dt))
return false; return false;
} }
...@@ -3769,7 +3829,7 @@ vectorizable_condition (gimple stmt, gimple_stmt_iterator *gsi, ...@@ -3769,7 +3829,7 @@ vectorizable_condition (gimple stmt, gimple_stmt_iterator *gsi,
if (TREE_CODE (else_clause) == SSA_NAME) if (TREE_CODE (else_clause) == SSA_NAME)
{ {
gimple else_def_stmt = SSA_NAME_DEF_STMT (else_clause); gimple else_def_stmt = SSA_NAME_DEF_STMT (else_clause);
if (!vect_is_simple_use (else_clause, loop_vinfo, if (!vect_is_simple_use (else_clause, loop_vinfo, NULL,
&else_def_stmt, &def, &dt)) &else_def_stmt, &def, &dt))
return false; return false;
} }
@@ -3819,11 +3879,14 @@ vectorizable_condition (gimple stmt, gimple_stmt_iterator *gsi,
/* Make sure the statement is vectorizable. */
bool
vect_analyze_stmt (gimple stmt, bool *need_to_vectorize, slp_tree node)
{
stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
bb_vec_info bb_vinfo = STMT_VINFO_BB_VINFO (stmt_info);
enum vect_relevant relevance = STMT_VINFO_RELEVANT (stmt_info);
bool ok;
HOST_WIDE_INT dummy;
tree scalar_type, vectype;
if (vect_print_dump_info (REPORT_DETAILS))
{
@@ -3854,9 +3917,9 @@ vect_analyze_stmt (gimple stmt, bool *need_to_vectorize)
break;
case vect_reduction_def:
gcc_assert (!bb_vinfo && (relevance == vect_used_in_outer
|| relevance == vect_used_in_outer_by_reduction
|| relevance == vect_unused_in_scope));
break;
case vect_induction_def:
@@ -3867,6 +3930,37 @@ vect_analyze_stmt (gimple stmt, bool *need_to_vectorize)
gcc_unreachable ();
}
if (bb_vinfo)
{
gcc_assert (PURE_SLP_STMT (stmt_info));
scalar_type = vect_get_smallest_scalar_type (stmt, &dummy, &dummy);
if (vect_print_dump_info (REPORT_DETAILS))
{
fprintf (vect_dump, "get vectype for scalar type: ");
print_generic_expr (vect_dump, scalar_type, TDF_SLIM);
}
vectype = get_vectype_for_scalar_type (scalar_type);
if (!vectype)
{
if (vect_print_dump_info (REPORT_DETAILS))
{
fprintf (vect_dump, "not SLPed: unsupported data-type ");
print_generic_expr (vect_dump, scalar_type, TDF_SLIM);
}
return false;
}
if (vect_print_dump_info (REPORT_DETAILS))
{
fprintf (vect_dump, "vectype: ");
print_generic_expr (vect_dump, vectype, TDF_SLIM);
}
STMT_VINFO_VECTYPE (stmt_info) = vectype;
}
if (STMT_VINFO_RELEVANT_P (stmt_info))
{
gcc_assert (!VECTOR_MODE_P (TYPE_MODE (gimple_expr_type (stmt))));
@@ -3875,8 +3969,9 @@ vect_analyze_stmt (gimple stmt, bool *need_to_vectorize)
}
ok = true;
if (!bb_vinfo
&& (STMT_VINFO_RELEVANT_P (stmt_info)
|| STMT_VINFO_DEF_TYPE (stmt_info) == vect_reduction_def))
ok = (vectorizable_type_promotion (stmt, NULL, NULL, NULL)
|| vectorizable_type_demotion (stmt, NULL, NULL, NULL)
|| vectorizable_conversion (stmt, NULL, NULL, NULL)
@@ -3887,6 +3982,14 @@ vect_analyze_stmt (gimple stmt, bool *need_to_vectorize)
|| vectorizable_store (stmt, NULL, NULL, NULL)
|| vectorizable_condition (stmt, NULL, NULL)
|| vectorizable_reduction (stmt, NULL, NULL));
else
{
if (bb_vinfo)
ok = (vectorizable_operation (stmt, NULL, NULL, node)
|| vectorizable_assignment (stmt, NULL, NULL, node)
|| vectorizable_load (stmt, NULL, NULL, node, NULL)
|| vectorizable_store (stmt, NULL, NULL, node));
}
if (!ok)
{
@@ -3900,6 +4003,9 @@ vect_analyze_stmt (gimple stmt, bool *need_to_vectorize)
return false;
}
if (bb_vinfo)
return true;
/* Stmts that are (also) "live" (i.e. - that are used out of the loop)
need extra handling, except for vectorizable reductions. */
if (STMT_VINFO_LIVE_P (stmt_info)
@@ -3957,8 +4063,6 @@ vect_transform_stmt (gimple stmt, gimple_stmt_iterator *gsi,
stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
gimple orig_stmt_in_pattern;
bool done;
loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
switch (STMT_VINFO_TYPE (stmt_info))
{
@@ -4045,12 +4149,16 @@ vect_transform_stmt (gimple stmt, gimple_stmt_iterator *gsi,
/* Handle inner-loop stmts whose DEF is used in the loop-nest that
is being vectorized, but outside the immediately enclosing loop. */
if (vec_stmt
&& STMT_VINFO_LOOP_VINFO (stmt_info)
&& nested_in_vect_loop_p (LOOP_VINFO_LOOP (
STMT_VINFO_LOOP_VINFO (stmt_info)), stmt)
&& STMT_VINFO_TYPE (stmt_info) != reduc_vec_info_type
&& (STMT_VINFO_RELEVANT (stmt_info) == vect_used_in_outer
|| STMT_VINFO_RELEVANT (stmt_info) ==
vect_used_in_outer_by_reduction))
{
struct loop *innerloop = LOOP_VINFO_LOOP (
STMT_VINFO_LOOP_VINFO (stmt_info))->inner;
imm_use_iterator imm_iter;
use_operand_p use_p;
tree scalar_dest;
@@ -4137,7 +4245,8 @@ vect_remove_stores (gimple first_stmt)
Create and initialize a new stmt_vec_info struct for STMT. */
stmt_vec_info
new_stmt_vec_info (gimple stmt, loop_vec_info loop_vinfo,
bb_vec_info bb_vinfo)
{
stmt_vec_info res;
res = (stmt_vec_info) xcalloc (1, sizeof (struct _stmt_vec_info));
@@ -4145,6 +4254,7 @@ new_stmt_vec_info (gimple stmt, loop_vec_info loop_vinfo)
STMT_VINFO_TYPE (res) = undef_vec_info_type;
STMT_VINFO_STMT (res) = stmt;
STMT_VINFO_LOOP_VINFO (res) = loop_vinfo;
STMT_VINFO_BB_VINFO (res) = bb_vinfo;
STMT_VINFO_RELEVANT (res) = vect_unused_in_scope;
STMT_VINFO_LIVE_P (res) = false;
STMT_VINFO_VECTYPE (res) = NULL;
@@ -4267,23 +4377,30 @@ get_vectype_for_scalar_type (tree scalar_type)
/* Function vect_is_simple_use.
Input:
LOOP_VINFO - the vect info of the loop that is being vectorized.
BB_VINFO - the vect info of the basic block that is being vectorized.
OPERAND - operand of a stmt in the loop or bb.
DEF - the defining stmt in case OPERAND is an SSA_NAME.
Returns whether a stmt with OPERAND can be vectorized.
For loops, supportable operands are constants, loop invariants, and operands
that are defined by the current iteration of the loop. Unsupportable
operands are those that are defined by a previous iteration of the loop (as
is the case in reduction/induction computations).
For basic blocks, supportable operands are constants and bb invariants.
For now, operands defined outside the basic block are not supported. */
bool
vect_is_simple_use (tree operand, loop_vec_info loop_vinfo,
bb_vec_info bb_vinfo, gimple *def_stmt,
tree *def, enum vect_def_type *dt)
{
basic_block bb;
stmt_vec_info stmt_vinfo;
struct loop *loop = NULL;
if (loop_vinfo)
loop = LOOP_VINFO_LOOP (loop_vinfo);
*def_stmt = NULL;
*def = NULL_TREE;
@@ -4299,6 +4416,7 @@ vect_is_simple_use (tree operand, loop_vec_info loop_vinfo, gimple *def_stmt,
*dt = vect_constant_def;
return true;
}
if (is_gimple_min_invariant (operand))
{
*def = operand;
...@@ -4312,6 +4430,7 @@ vect_is_simple_use (tree operand, loop_vec_info loop_vinfo, gimple *def_stmt, ...@@ -4312,6 +4430,7 @@ vect_is_simple_use (tree operand, loop_vec_info loop_vinfo, gimple *def_stmt,
fprintf (vect_dump, "non-associatable copy."); fprintf (vect_dump, "non-associatable copy.");
operand = TREE_OPERAND (operand, 0); operand = TREE_OPERAND (operand, 0);
} }
if (TREE_CODE (operand) != SSA_NAME) if (TREE_CODE (operand) != SSA_NAME)
{ {
if (vect_print_dump_info (REPORT_DETAILS)) if (vect_print_dump_info (REPORT_DETAILS))
...@@ -4343,7 +4462,10 @@ vect_is_simple_use (tree operand, loop_vec_info loop_vinfo, gimple *def_stmt, ...@@ -4343,7 +4462,10 @@ vect_is_simple_use (tree operand, loop_vec_info loop_vinfo, gimple *def_stmt,
} }
bb = gimple_bb (*def_stmt); bb = gimple_bb (*def_stmt);
if (!flow_bb_inside_loop_p (loop, bb))
if ((loop && !flow_bb_inside_loop_p (loop, bb))
|| (!loop && bb != BB_VINFO_BB (bb_vinfo))
|| (!loop && gimple_code (*def_stmt) == GIMPLE_PHI))
*dt = vect_external_def; *dt = vect_external_def;
else else
{ {
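At the source level, the distinction above roughly means the following. In the first function every operand of the stores is a constant or a value loaded inside the same straight-line block; in the second, the stores use a value carried out of a loop, i.e. defined outside the block and via a PHI. Whether the first group actually gets vectorized still depends on the target, alignment, and cost checks, so this is only an illustration:

void
likely_slp_candidate (int *restrict a, const int *restrict b)
{
  /* All operands are constants or values computed inside the same block:
     "simple" uses in the sense of the comment above.  */
  a[0] = b[0] * 2;
  a[1] = b[1] * 2;
  a[2] = b[2] * 2;
  a[3] = b[3] * 2;
}

int
external_def_example (int *restrict a, const int *restrict b, int n)
{
  int s = 0;
  for (int i = 0; i < n; i++)   /* s is carried across iterations by a PHI */
    s += b[i];

  /* In the block after the loop, s is defined outside that block (by the
     loop, via a PHI), so vect_is_simple_use classifies it as an external
     def; per the comment above such operands are not yet supported for
     basic-block SLP.  */
  a[0] = s + 1;
  a[1] = s + 2;
  a[2] = s + 3;
  a[3] = s + 4;
  return s;
}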
......
@@ -68,6 +68,7 @@ along with GCC; see the file COPYING3.  If not see
#include "cfglayout.h"
#include "tree-vectorizer.h"
#include "tree-pass.h"
#include "timevar.h"

/* vect_dump will be set to stderr or dump_file if exist.  */
FILE *vect_dump;
@@ -75,8 +76,9 @@ FILE *vect_dump;
/* vect_verbosity_level set to an invalid value
   to mark that it's uninitialized.  */
static enum verbosity_levels vect_verbosity_level = MAX_VERBOSITY_LEVEL;
static enum verbosity_levels user_vect_verbosity_level = MAX_VERBOSITY_LEVEL;

/* Loop or bb location.  */
LOC vect_location;

/* Bitmap of virtual variables to be renamed.  */
@@ -99,9 +101,10 @@ vect_set_verbosity_level (const char *val)
  vl = atoi (val);
  if (vl < MAX_VERBOSITY_LEVEL)
    user_vect_verbosity_level = (enum verbosity_levels) vl;
  else
    user_vect_verbosity_level
      = (enum verbosity_levels) (MAX_VERBOSITY_LEVEL - 1);
}
@@ -115,18 +118,34 @@ vect_set_verbosity_level (const char *val)
   print to stderr, otherwise print to the dump file.  */

static void
vect_set_dump_settings (bool slp)
{
  vect_dump = dump_file;

  /* Check if the verbosity level was defined by the user:  */
  if (user_vect_verbosity_level != MAX_VERBOSITY_LEVEL)
    {
      vect_verbosity_level = user_vect_verbosity_level;
      /* Ignore user defined verbosity if dump flags require higher level of
         verbosity.  */
      if (dump_file)
        {
          if (((dump_flags & TDF_DETAILS)
                && vect_verbosity_level >= REPORT_DETAILS)
              || ((dump_flags & TDF_STATS)
                  && vect_verbosity_level >= REPORT_UNVECTORIZED_LOCATIONS))
            return;
        }
      else
        {
          /* If there is no dump file, print to stderr in case of loop
             vectorization.  */
          if (!slp)
            vect_dump = stderr;
          return;
        }
    }

  /* User didn't specify verbosity level:  */
  if (dump_file && (dump_flags & TDF_DETAILS))
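The new precedence rule is: a verbosity level requested by the user is kept unless the dump flags ask for more detail, and the stderr fallback now applies only to loop vectorization. A compact standalone mock of that decision; the enum values, helper, and omission of the TDF_STATS case are simplifications, not GCC's actual definitions:

#include <stdbool.h>
#include <stdio.h>

/* Illustrative stand-ins for GCC's verbosity levels.  */
enum level { QUIET = 0, VECTORIZED = 1, DETAILS = 6, UNSET = 7 };

/* Mirror the precedence in vect_set_dump_settings: keep the user's level
   unless -fdump-...-details demands more; with no dump file, fall back to
   stderr only for loop vectorization (slp == false).  */
static FILE *
choose_stream (bool slp, FILE *dump_file, bool tdf_details,
               enum level user_level, enum level *out_level)
{
  FILE *stream = dump_file;
  *out_level = QUIET;

  if (user_level != UNSET)
    {
      *out_level = user_level;
      if (!dump_file)
        {
          if (!slp)
            stream = stderr;      /* only the loop vectorizer reports to stderr */
          return stream;
        }
      if (!tdf_details || user_level >= DETAILS)
        return stream;            /* the user's level already suffices */
    }

  if (dump_file && tdf_details)
    *out_level = DETAILS;         /* otherwise derive the level from dump flags */
  return stream;
}

int
main (void)
{
  enum level lvl;
  FILE *s = choose_stream (/*slp=*/true, NULL, false, VECTORIZED, &lvl);
  printf ("SLP run, no dump file: stream=%s, level=%d\n",
          s ? "stderr" : "none", (int) lvl);
  return 0;
}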
@@ -185,7 +204,7 @@ vectorize_loops (void)
    return 0;

  /* Fix the verbosity level if not defined explicitly by the user.  */
  vect_set_dump_settings (false);

  /* Allocate the bitmap that records which virtual variables
     need to be renamed.  */
@@ -245,6 +264,68 @@ vectorize_loops (void)
}
/* Entry point to basic block SLP phase. */
static unsigned int
execute_vect_slp (void)
{
basic_block bb;
/* Fix the verbosity level if not defined explicitly by the user. */
vect_set_dump_settings (true);
init_stmt_vec_info_vec ();
FOR_EACH_BB (bb)
{
vect_location = find_bb_location (bb);
if (vect_slp_analyze_bb (bb))
{
vect_slp_transform_bb (bb);
if (vect_print_dump_info (REPORT_VECTORIZED_LOCATIONS))
fprintf (vect_dump, "basic block vectorized using SLP\n");
}
}
free_stmt_vec_info_vec ();
return 0;
}
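The driver follows a plain analyze-then-transform shape: record a location for diagnostics, analyze each basic block, and rewrite only the blocks whose analysis succeeded. A self-contained skeleton of that shape; the block type and the two callbacks are mock placeholders, not GCC's:

#include <stdbool.h>
#include <stdio.h>

/* Mock basic block: a handful of statements identified by an index.  */
struct block { int index; int n_stmts; };

/* Placeholder analysis: pretend blocks with at least four statements hold
   an SLP opportunity.  A real implementation builds SLP trees here.  */
static bool
analyze_block (const struct block *bb)
{
  return bb->n_stmts >= 4;
}

static void
transform_block (struct block *bb)
{
  printf ("basic block %d vectorized using SLP\n", bb->index);
}

/* Same shape as execute_vect_slp: visit every block, analyze it, and only
   transform the ones where analysis succeeded.  */
static void
run_slp_over_blocks (struct block *blocks, int n)
{
  for (int i = 0; i < n; i++)
    if (analyze_block (&blocks[i]))
      transform_block (&blocks[i]);
}

int
main (void)
{
  struct block blocks[] = { { 0, 2 }, { 1, 4 }, { 2, 8 } };
  run_slp_over_blocks (blocks, 3);
  return 0;
}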
static bool
gate_vect_slp (void)
{
/* Apply SLP either if the vectorizer is on and the user didn't specify
whether to run SLP or not, or if the SLP flag was set by the user. */
return ((flag_tree_vectorize != 0 && flag_tree_slp_vectorize != 0)
|| flag_tree_slp_vectorize == 1);
}
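The gate reads tersely; the intent, per its comment, is that SLP runs when -ftree-vectorize is on and the user said nothing about SLP, or whenever -ftree-slp-vectorize is given explicitly, and it is suppressed by -fno-tree-slp-vectorize. A small standalone check of that truth table, assuming the flag defaults to a sentinel value distinct from 0 and 1 (as the comparison with 1 suggests):

#include <assert.h>
#include <stdbool.h>

/* Assumed encoding of flag_tree_slp_vectorize: 0 = -fno-tree-slp-vectorize,
   1 = -ftree-slp-vectorize, SLP_DEFAULT = the user said nothing.  */
enum slp_flag { SLP_OFF = 0, SLP_ON = 1, SLP_DEFAULT = 2 };

/* Same condition as gate_vect_slp.  */
static bool
slp_pass_enabled (bool tree_vectorize, enum slp_flag slp)
{
  return (tree_vectorize && slp != SLP_OFF) || slp == SLP_ON;
}

int
main (void)
{
  assert (slp_pass_enabled (true,  SLP_DEFAULT));   /* -ftree-vectorize alone */
  assert (slp_pass_enabled (false, SLP_ON));        /* explicit SLP request */
  assert (!slp_pass_enabled (true,  SLP_OFF));      /* -fno-tree-slp-vectorize wins */
  assert (!slp_pass_enabled (false, SLP_DEFAULT));  /* neither flag given */
  return 0;
}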
struct gimple_opt_pass pass_slp_vectorize =
{
{
GIMPLE_PASS,
"slp", /* name */
gate_vect_slp, /* gate */
execute_vect_slp, /* execute */
NULL, /* sub */
NULL, /* next */
0, /* static_pass_number */
TV_TREE_SLP_VECTORIZATION, /* tv_id */
PROP_ssa | PROP_cfg, /* properties_required */
0, /* properties_provided */
0, /* properties_destroyed */
0, /* todo_flags_start */
TODO_ggc_collect
| TODO_verify_ssa
| TODO_dump_func
| TODO_update_ssa
| TODO_verify_stmts /* todo_flags_finish */
}
};
/* Increase alignment of global arrays to improve vectorization potential.
   TODO:
   - Consider also structs that have an array field.
......
@@ -281,6 +281,36 @@ nested_in_vect_loop_p (struct loop *loop, gimple stmt)
          && (loop->inner == (gimple_bb (stmt))->loop_father));
}
typedef struct _bb_vec_info {
basic_block bb;
/* All interleaving chains of stores in the basic block, represented by the
first stmt in the chain. */
VEC(gimple, heap) *strided_stores;
/* All SLP instances in the basic block. This is a subset of the set of
STRIDED_STORES of the basic block. */
VEC(slp_instance, heap) *slp_instances;
/* All data references in the basic block. */
VEC (data_reference_p, heap) *datarefs;
/* All data dependences in the basic block. */
VEC (ddr_p, heap) *ddrs;
} *bb_vec_info;
#define BB_VINFO_BB(B) (B)->bb
#define BB_VINFO_STRIDED_STORES(B) (B)->strided_stores
#define BB_VINFO_SLP_INSTANCES(B) (B)->slp_instances
#define BB_VINFO_DATAREFS(B) (B)->datarefs
#define BB_VINFO_DDRS(B) (B)->ddrs
static inline bb_vec_info
vec_info_for_bb (basic_block bb)
{
return (bb_vec_info) bb->aux;
}
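Note that vec_info_for_bb simply reads the analysis record back from the block's aux pointer, so the code that creates a bb_vec_info is presumably responsible for stashing it there. A small self-contained sketch of that aux-pointer pattern with mock types; GCC's basic_block and the real constructor are not reproduced here:

#include <stdio.h>
#include <stdlib.h>

/* Mock per-block analysis record, loosely modeled on _bb_vec_info.  */
struct bb_info
{
  int n_strided_stores;   /* stand-in for the strided_stores vector */
  int n_slp_instances;    /* stand-in for the slp_instances vector */
};

/* Mock basic block carrying an opaque aux pointer, like GCC's basic_block.  */
struct block
{
  int index;
  void *aux;
};

/* Analogue of vec_info_for_bb: recover the record from bb->aux.  */
static struct bb_info *
info_for_block (struct block *bb)
{
  return (struct bb_info *) bb->aux;
}

static void
attach_info (struct block *bb)
{
  struct bb_info *info = calloc (1, sizeof *info);
  bb->aux = info;   /* the creator stashes the pointer for later lookup */
}

int
main (void)
{
  struct block bb = { 7, NULL };
  attach_info (&bb);
  info_for_block (&bb)->n_slp_instances = 1;
  printf ("block %d has %d SLP instance(s)\n", bb.index,
          info_for_block (&bb)->n_slp_instances);
  free (bb.aux);
  return 0;
}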
/*-----------------------------------------------------------------*/
/* Info on vectorized defs.                                        */
/*-----------------------------------------------------------------*/
@@ -437,12 +467,16 @@ typedef struct _stmt_vec_info {
  /* Whether the stmt is SLPed, loop-based vectorized, or both.  */
  enum slp_vect_type slp_type;
/* The bb_vec_info with respect to which STMT is vectorized. */
bb_vec_info bb_vinfo;
} *stmt_vec_info;

/* Access Functions.  */
#define STMT_VINFO_TYPE(S) (S)->type
#define STMT_VINFO_STMT(S) (S)->stmt
#define STMT_VINFO_LOOP_VINFO(S) (S)->loop_vinfo
#define STMT_VINFO_BB_VINFO(S) (S)->bb_vinfo
#define STMT_VINFO_RELEVANT(S) (S)->relevant
#define STMT_VINFO_LIVE_P(S) (S)->live
#define STMT_VINFO_VECTYPE(S) (S)->vectype
@@ -714,8 +748,8 @@ extern bool vect_can_advance_ivs_p (loop_vec_info);
/* In tree-vect-stmts.c.  */
extern tree get_vectype_for_scalar_type (tree);
extern bool vect_is_simple_use (tree, loop_vec_info, bb_vec_info, gimple *,
                                tree *, enum vect_def_type *);
extern bool supportable_widening_operation (enum tree_code, gimple, tree,
                                            tree *, tree *, enum tree_code *,
                                            enum tree_code *, int *,
@@ -723,7 +757,8 @@ extern bool supportable_widening_operation (enum tree_code, gimple, tree,
extern bool supportable_narrowing_operation (enum tree_code, const_gimple,
                                             tree, enum tree_code *, int *,
                                             VEC (tree, heap) **);
extern stmt_vec_info new_stmt_vec_info (gimple stmt, loop_vec_info,
                                        bb_vec_info);
extern void free_stmt_vec_info (gimple stmt);
extern tree vectorizable_function (gimple, tree, tree);
extern void vect_model_simple_cost (stmt_vec_info, int, enum vect_def_type *,
@@ -742,7 +777,7 @@ extern tree vect_get_vec_def_for_stmt_copy (enum vect_def_type, tree);
extern bool vect_transform_stmt (gimple, gimple_stmt_iterator *,
                                 bool *, slp_tree, slp_instance);
extern void vect_remove_stores (gimple);
extern bool vect_analyze_stmt (gimple, bool *, slp_tree);

/* In tree-vect-data-refs.c.  */
extern bool vect_can_force_dr_alignment_p (const_tree, unsigned int);
@@ -750,12 +785,13 @@ extern enum dr_alignment_support vect_supportable_dr_alignment
                                 (struct data_reference *);
extern tree vect_get_smallest_scalar_type (gimple, HOST_WIDE_INT *,
                                           HOST_WIDE_INT *);
extern bool vect_analyze_data_ref_dependences (loop_vec_info, bb_vec_info);
extern bool vect_enhance_data_refs_alignment (loop_vec_info);
extern bool vect_analyze_data_refs_alignment (loop_vec_info, bb_vec_info);
extern bool vect_verify_datarefs_alignment (loop_vec_info, bb_vec_info);
extern bool vect_analyze_data_ref_accesses (loop_vec_info, bb_vec_info);
extern bool vect_prune_runtime_alias_test_list (loop_vec_info);
extern bool vect_analyze_data_refs (loop_vec_info, bb_vec_info);
extern tree vect_create_data_ref_ptr (gimple, struct loop *, tree, tree *,
                                      gimple *, bool, bool *);
extern tree bump_vector_ptr (tree, gimple, gimple_stmt_iterator *, gimple, tree);
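Most of the analysis entry points above now take both a loop_vec_info and a bb_vec_info; the pattern is that one of the two is non-NULL and the routine pulls its data references or dependences from whichever context it received. A mock sketch of that calling convention with invented context types, not GCC's:

#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Invented stand-ins for loop_vec_info / bb_vec_info, each owning a set of
   data references (reduced to a count here).  */
struct loop_info { int n_datarefs; };
struct bb_info   { int n_datarefs; };

/* Pattern behind prototypes such as
   vect_analyze_data_refs (loop_vec_info, bb_vec_info): exactly one context
   is non-NULL, and the worklist comes from that context.  */
static bool
analyze_datarefs (struct loop_info *loop_vinfo, struct bb_info *bb_vinfo)
{
  assert ((loop_vinfo != NULL) ^ (bb_vinfo != NULL));
  int n = loop_vinfo ? loop_vinfo->n_datarefs : bb_vinfo->n_datarefs;

  /* A real pass would inspect each data reference here; the mock just
     reports whether there is anything to analyze.  */
  return n > 0;
}

int
main (void)
{
  struct bb_info bb = { 4 };
  return analyze_datarefs (NULL, &bb) ? 0 : 1;
}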
@@ -799,13 +835,16 @@ extern void vect_free_slp_instance (slp_instance);
extern bool vect_transform_slp_perm_load (gimple, VEC (tree, heap) *,
                                          gimple_stmt_iterator *, int,
                                          slp_instance, bool);
extern bool vect_schedule_slp (loop_vec_info, bb_vec_info);
extern void vect_update_slp_costs_according_to_vf (loop_vec_info);
extern bool vect_analyze_slp (loop_vec_info, bb_vec_info);
extern void vect_make_slp_decision (loop_vec_info);
extern void vect_detect_hybrid_slp (loop_vec_info);
extern void vect_get_slp_defs (slp_tree, VEC (tree,heap) **,
                               VEC (tree,heap) **);
extern LOC find_bb_location (basic_block);
extern bb_vec_info vect_slp_analyze_bb (basic_block);
extern void vect_slp_transform_bb (basic_block);
/* In tree-vect-patterns.c.  */
/* Pattern recognition functions.
......