Commit a70d6342 by Ira Rosen Committed by Ira Rosen

passes.texi (Tree-SSA passes): Document SLP pass.


	* doc/passes.texi (Tree-SSA passes): Document SLP pass.
	* tree-pass.h (pass_slp_vectorize): New pass.
	* params.h (SLP_MAX_INSNS_IN_BB): Define.
	* timevar.def (TV_TREE_SLP_VECTORIZATION): Define.
	* tree-vectorizer.c (timevar.h): Include.
	(user_vect_verbosity_level): Declare.
	(vect_location): Fix comment.
	(vect_set_verbosity_level): Update user_vect_verbosity_level
	instead of vect_verbosity_level.
	(vect_set_dump_settings): Add an argument. Ignore user defined
	verbosity if dump flags require higher level of verbosity. Print to
	stderr only for loop vectorization.
	(vectorize_loops): Update call to vect_set_dump_settings.
	(execute_vect_slp): New function.
	(gate_vect_slp): Likewise.
	(struct gimple_opt_pass pass_slp_vectorize): New.
	* tree-vectorizer.h (struct _bb_vec_info): Define along macros to
	access its members.
	(vec_info_for_bb): New function.
	(struct _stmt_vec_info): Add bb_vinfo and a macro for its access.
	(VECTORIZATION_ENABLED): New macro.
	(SLP_ENABLED, SLP_DISABLED): Likewise.
	(vect_is_simple_use): Add bb_vec_info argument.
	(new_stmt_vec_info, vect_analyze_data_ref_dependences,
	vect_analyze_data_refs_alignment, vect_verify_datarefs_alignment,
	vect_analyze_data_ref_accesses, vect_analyze_data_refs,
	vect_schedule_slp, vect_analyze_slp): Likewise.
	(vect_analyze_stmt): Add slp_tree argument.
	(find_bb_location): Declare.
	(vect_slp_analyze_bb, vect_slp_transform_bb): Likewise.
	* tree-vect-loop.c (new_loop_vec_info): Adjust function calls.
	(vect_analyze_loop_operations, vect_analyze_loop,
	get_initial_def_for_induction, vect_create_epilog_for_reduction,
	vect_finalize_reduction, vectorizable_reduction,
	vectorizable_live_operation, vect_transform_loop): Likewise.
	* tree-data-ref.c (dr_analyze_innermost): Update comment,
	skip evolution analysis if analyzing a basic block.
	(dr_analyze_indices): Likewise.
	(initialize_data_dependence_relation): Skip the test whether the
	object is invariant for basic blocks.
	(compute_all_dependences): Skip dependence analysis for data
	references in basic blocks.
	(find_data_references_in_stmt): Don't fail in case of invariant
	access in basic block.
	(find_data_references_in_bb): New function.
	(find_data_references_in_loop): Move code to
	find_data_references_in_bb and add a call to it.
	(compute_data_dependences_for_bb): New function.
	* tree-data-ref.h (compute_data_dependences_for_bb): Declare.
	* tree-vect-data-refs.c (vect_check_interleaving): Adjust to the case
	that STEP is 0.
	(vect_analyze_data_ref_dependence): Check for interleaving in case of
	unknown dependence in basic block and fail in case of dependence in
	basic block.
	(vect_analyze_data_ref_dependences): Add bb_vinfo argument, get data
	dependence instances from either loop or basic block vectorization
	info.
	(vect_compute_data_ref_alignment): Check if it is loop vectorization
	before calling nested_in_vect_loop_p.
	(vect_compute_data_refs_alignment): Add bb_vinfo argument, get data
	dependence instances from either loop or basic block vectorization
	info.
	(vect_verify_datarefs_alignment): Likewise.
	(vect_enhance_data_refs_alignment): Adjust function calls.
	(vect_analyze_data_refs_alignment): Likewise.
	(vect_analyze_group_access): Fix printing. Skip different checks if
	DR_STEP is 0. Keep strided stores either in loop or basic block
	vectorization data structure. Fix indentation.
	(vect_analyze_data_ref_access): Fix comments, allow zero step in
	basic blocks.
	(vect_analyze_data_ref_accesses): Add bb_vinfo argument, get data
	dependence instances from either loop or basic block vectorization
	info.
	(vect_analyze_data_refs): Update comment. Call
	compute_data_dependences_for_bb to analyze basic blocks.
	(vect_create_addr_base_for_vector_ref): Check for outer loop only in
	case of loop vectorization. In case of basic block vectorization use
	data-ref itself as a base.
	(vect_create_data_ref_ptr): In case of basic block vectorization:
	don't advance the pointer, add new statements before the current
	statement.  Adjust function calls.
	(vect_supportable_dr_alignment): Support only aligned accesses in
	basic block vectorization.
	* common.opt (ftree-slp-vectorize): New flag.
	* tree-vect-patterns.c (widened_name_p): Adjust function calls.
	(vect_pattern_recog_1): Likewise.
	* tree-vect-stmts.c (process_use): Likewise.
	(vect_init_vector): Add new statements in the beginning of the basic
	block in case of basic block SLP.
	(vect_get_vec_def_for_operand): Adjust function calls.
	(vect_finish_stmt_generation): Likewise.
	(vectorizable_call): Add assert that it is loop vectorization, adjust
	function calls.
	(vectorizable_conversion, vectorizable_assignment): Likewise.
	(vectorizable_operation): In case of basic block SLP, take
	vectorization factor from statement's type and skip the relevance
	check. Adjust function calls.
	(vectorizable_type_demotion): Add assert that it is loop
	vectorization, adjust function calls.
	(vectorizable_type_promotion): Likewise.
	(vectorizable_store): Check for outer loop only in case of loop
	vectorization. Adjust function calls. For basic blocks, skip the
	relevance check and don't advance pointers.
	(vectorizable_load): Likewise.
	(vectorizable_condition): Add assert that it is loop vectorization,
	adjust function calls.
	(vect_analyze_stmt): Add argument. In case of basic block SLP, check
	that it is not reduction, get vector type, call only supported
	functions, skip loop specific parts.
	(vect_transform_stmt): Check for outer loop only in case of loop
	vectorization.
	(new_stmt_vec_info): Add new argument and initialize bb_vinfo.
	(vect_is_simple_use): Fix comment, add new argument, fix conditions
	for external definition.
	* passes.c (pass_slp_vectorize): New pass.
	* tree-vect-slp.c (find_bb_location): New function.
	(vect_get_and_check_slp_defs): Add argument, adjust function calls,
	check for patterns only in loops.
	(vect_build_slp_tree): Add argument, adjust function calls, fail in
	case of multiple types in basic block SLP.
	(vect_mark_slp_stmts_relevant): New function.
	(vect_supported_load_permutation_p): Fix comment.
	(vect_analyze_slp_instance): Add argument. In case of basic block
	SLP, take vectorization factor from statement's type, check that
	unrolling factor is 1. Adjust function call. Save SLP instance in
	either loop or basic block vectorization structure. Return FALSE,
	if SLP failed.
	(vect_analyze_slp): Add argument. Get strided stores groups from
	either loop or basic block vectorization structure. Return FALSE
	if basic block SLP failed.
	(new_bb_vec_info): New function.
	(destroy_bb_vec_info, vect_slp_analyze_node_operations,
	vect_slp_analyze_operations, vect_slp_analyze_bb): Likewise.
	(vect_schedule_slp): Add argument. Get SLP instances from either
	loop or basic block vectorization structure. Set vectorization factor
	to be 1 for basic block SLP.
	(vect_slp_transform_bb): New function.
	* params.def (PARAM_SLP_MAX_INSNS_IN_BB): Define.

From-SVN: r147829
parent ffa52e11
2009-05-24 Ira Rosen <irar@il.ibm.com>
* doc/passes.texi (Tree-SSA passes): Document SLP pass.
* tree-pass.h (pass_slp_vectorize): New pass.
* params.h (SLP_MAX_INSNS_IN_BB): Define.
* timevar.def (TV_TREE_SLP_VECTORIZATION): Define.
* tree-vectorizer.c (timevar.h): Include.
(user_vect_verbosity_level): Declare.
(vect_location): Fix comment.
(vect_set_verbosity_level): Update user_vect_verbosity_level
instead of vect_verbosity_level.
(vect_set_dump_settings): Add an argument. Ignore user defined
verbosity if dump flags require higher level of verbosity. Print to
stderr only for loop vectorization.
(vectorize_loops): Update call to vect_set_dump_settings.
(execute_vect_slp): New function.
(gate_vect_slp): Likewise.
(struct gimple_opt_pass pass_slp_vectorize): New.
* tree-vectorizer.h (struct _bb_vec_info): Define along macros to
access its members.
(vec_info_for_bb): New function.
(struct _stmt_vec_info): Add bb_vinfo and a macro for its access.
(VECTORIZATION_ENABLED): New macro.
(SLP_ENABLED, SLP_DISABLED): Likewise.
(vect_is_simple_use): Add bb_vec_info argument.
(new_stmt_vec_info, vect_analyze_data_ref_dependences,
vect_analyze_data_refs_alignment, vect_verify_datarefs_alignment,
vect_analyze_data_ref_accesses, vect_analyze_data_refs,
vect_schedule_slp, vect_analyze_slp): Likewise.
(vect_analyze_stmt): Add slp_tree argument.
(find_bb_location): Declare.
(vect_slp_analyze_bb, vect_slp_transform_bb): Likewise.
* tree-vect-loop.c (new_loop_vec_info): Adjust function calls.
(vect_analyze_loop_operations, vect_analyze_loop,
get_initial_def_for_induction, vect_create_epilog_for_reduction,
vect_finalize_reduction, vectorizable_reduction,
vectorizable_live_operation, vect_transform_loop): Likewise.
* tree-data-ref.c (dr_analyze_innermost): Update comment,
skip evolution analysis if analyzing a basic block.
(dr_analyze_indices): Likewise.
(initialize_data_dependence_relation): Skip the test whether the
object is invariant for basic blocks.
(compute_all_dependences): Skip dependence analysis for data
references in basic blocks.
(find_data_references_in_stmt): Don't fail in case of invariant
access in basic block.
(find_data_references_in_bb): New function.
(find_data_references_in_loop): Move code to
find_data_references_in_bb and add a call to it.
(compute_data_dependences_for_bb): New function.
* tree-data-ref.h (compute_data_dependences_for_bb): Declare.
* tree-vect-data-refs.c (vect_check_interleaving): Adjust to the case
that STEP is 0.
(vect_analyze_data_ref_dependence): Check for interleaving in case of
unknown dependence in basic block and fail in case of dependence in
basic block.
(vect_analyze_data_ref_dependences): Add bb_vinfo argument, get data
dependence instances from either loop or basic block vectorization
info.
(vect_compute_data_ref_alignment): Check if it is loop vectorization
before calling nested_in_vect_loop_p.
(vect_compute_data_refs_alignment): Add bb_vinfo argument, get data
dependence instances from either loop or basic block vectorization
info.
(vect_verify_datarefs_alignment): Likewise.
(vect_enhance_data_refs_alignment): Adjust function calls.
(vect_analyze_data_refs_alignment): Likewise.
(vect_analyze_group_access): Fix printing. Skip different checks if
DR_STEP is 0. Keep strided stores either in loop or basic block
vectorization data structure. Fix indentation.
(vect_analyze_data_ref_access): Fix comments, allow zero step in
basic blocks.
(vect_analyze_data_ref_accesses): Add bb_vinfo argument, get data
dependence instances from either loop or basic block vectorization
info.
(vect_analyze_data_refs): Update comment. Call
compute_data_dependences_for_bb to analyze basic blocks.
(vect_create_addr_base_for_vector_ref): Check for outer loop only in
case of loop vectorization. In case of basic block vectorization use
data-ref itself as a base.
(vect_create_data_ref_ptr): In case of basic block vectorization:
don't advance the pointer, add new statements before the current
statement. Adjust function calls.
(vect_supportable_dr_alignment): Support only aligned accesses in
basic block vectorization.
* common.opt (ftree-slp-vectorize): New flag.
* tree-vect-patterns.c (widened_name_p): Adjust function calls.
(vect_pattern_recog_1): Likewise.
* tree-vect-stmts.c (process_use): Likewise.
(vect_init_vector): Add new statements in the beginning of the basic
block in case of basic block SLP.
(vect_get_vec_def_for_operand): Adjust function calls.
(vect_finish_stmt_generation): Likewise.
(vectorizable_call): Add assert that it is loop vectorization, adjust
function calls.
(vectorizable_conversion, vectorizable_assignment): Likewise.
(vectorizable_operation): In case of basic block SLP, take
vectorization factor from statement's type and skip the relevance
check. Adjust function calls.
(vectorizable_type_demotion): Add assert that it is loop
vectorization, adjust function calls.
(vectorizable_type_promotion): Likewise.
(vectorizable_store): Check for outer loop only in case of loop
vectorization. Adjust function calls. For basic blocks, skip the
relevance check and don't advance pointers.
(vectorizable_load): Likewise.
(vectorizable_condition): Add assert that it is loop vectorization,
adjust function calls.
(vect_analyze_stmt): Add argument. In case of basic block SLP, check
that it is not reduction, get vector type, call only supported
functions, skip loop specific parts.
(vect_transform_stmt): Check for outer loop only in case of loop
vectorization.
(new_stmt_vec_info): Add new argument and initialize bb_vinfo.
(vect_is_simple_use): Fix comment, add new argument, fix conditions
for external definition.
* passes.c (pass_slp_vectorize): New pass.
* tree-vect-slp.c (find_bb_location): New function.
(vect_get_and_check_slp_defs): Add argument, adjust function calls,
check for patterns only in loops.
(vect_build_slp_tree): Add argument, adjust function calls, fail in
case of multiple types in basic block SLP.
(vect_mark_slp_stmts_relevant): New function.
(vect_supported_load_permutation_p): Fix comment.
(vect_analyze_slp_instance): Add argument. In case of basic block
SLP, take vectorization factor from statement's type, check that
unrolling factor is 1. Adjust function call. Save SLP instance in
either loop or basic block vectorization structure. Return FALSE,
if SLP failed.
(vect_analyze_slp): Add argument. Get strided stores groups from
either loop or basic block vectorization structure. Return FALSE
if basic block SLP failed.
(new_bb_vec_info): New function.
(destroy_bb_vec_info, vect_slp_analyze_node_operations,
vect_slp_analyze_operations, vect_slp_analyze_bb): Likewise.
(vect_schedule_slp): Add argument. Get SLP instances from either
loop or basic block vectorization structure. Set vectorization factor
to be 1 for basic block SLP.
(vect_slp_transform_bb): New function.
* params.def (PARAM_SLP_MAX_INSNS_IN_BB): Define.
2009-05-23 Mark Mitchell <mark@codesourcery.com>
* final.c (shorten_branches): Do not align labels for jump tables.
......
@@ -1330,6 +1330,10 @@ ftree-vectorize
Common Report Var(flag_tree_vectorize) Optimization
Enable loop vectorization on trees
ftree-slp-vectorize
Common Report Var(flag_tree_slp_vectorize) Init(2) Optimization
Enable basic block vectorization (SLP) on trees
fvect-cost-model
Common Report Var(flag_vect_cost_model) Optimization
Enable use of cost model in vectorization
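The Init(2) default on flag_tree_slp_vectorize suggests a tri-state option: 0 when the user disables it, 1 when the user enables it, and 2 when nothing was specified. The sketch below shows how a gate such as gate_vect_slp could use that. It assumes (the excerpt does not show the gate body) that SLP runs when explicitly requested, or when it was left unspecified and loop vectorization is on. The function name slp_gate_sketch and its parameters are hypothetical.

```c
#include <assert.h>

/* Hypothetical model of a tri-state option gate; not the GCC source.
   slp_flag: 0 = disabled by the user, 1 = enabled by the user,
             2 = not specified (the Init(2) default).  */
int
slp_gate_sketch (int vectorize_flag, int slp_flag)
{
  assert (slp_flag >= 0 && slp_flag <= 2);

  /* An explicit request wins regardless of the loop vectorizer flag.  */
  if (slp_flag == 1)
    return 1;

  /* Otherwise run only when loop vectorization is on and SLP was not
     explicitly disabled.  */
  return vectorize_flag != 0 && slp_flag != 0;
}
```

Under this assumption, slp_gate_sketch (1, 2) is true (vectorizer on, SLP unspecified), while slp_gate_sketch (1, 0) is false (SLP explicitly disabled).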
......
@@ -438,11 +438,19 @@ conceptually unrolled by a factor @code{VF} (vectorization factor), which is
the number of elements operated upon in parallel in each iteration, and the
@code{VF} copies of each scalar operation are fused to form a vector operation.
Additional loop transformations such as peeling and versioning may take place
to align the number of iterations, and to align the memory accesses in the loop.
The pass is implemented in @file{tree-vectorizer.c} (the main driver and general
utilities), @file{tree-vect-analyze.c} and @file{tree-vect-transform.c}.
to align the number of iterations, and to align the memory accesses in the
loop.
The pass is implemented in @file{tree-vectorizer.c} (the main driver),
@file{tree-vect-loop.c} and @file{tree-vect-loop-manip.c} (loop specific parts
and general loop utilities), @file{tree-vect-slp} (loop-aware SLP
functionality), @file{tree-vect-stmts.c} and @file{tree-vect-data-refs.c}.
Analysis of data references is in @file{tree-data-ref.c}.
SLP Vectorization. This pass performs vectorization of straight-line code. The
pass is implemented in @file{tree-vectorizer.c} (the main driver),
@file{tree-vect-slp.c}, @file{tree-vect-stmts.c} and
@file{tree-vect-data-refs.c}.
Autoparallelization. This pass splits the loop iteration space to run
into several threads. The pass is implemented in @file{tree-parloops.c}.
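The passes.texi text in the hunk above describes SLP as vectorization of straight-line code. As a rough illustration of the idea, the sketch below pairs four isomorphic scalar statements with a hand-written packed equivalent, using the GCC vector_size extension. The names add_scalar, add_packed and v4ui are hypothetical, a 16-byte vector of four unsigned ints is assumed, and the code the pass actually emits is not shown in this commit page.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Four 32-bit lanes in one 16-byte vector (GCC vector extension).  */
typedef unsigned int v4ui __attribute__ ((vector_size (16)));

/* Scalar form: four isomorphic statements on adjacent elements.  */
void
add_scalar (unsigned int *out, const unsigned int *in, unsigned int k)
{
  out[0] = in[0] + k;
  out[1] = in[1] + k;
  out[2] = in[2] + k;
  out[3] = in[3] + k;
}

/* Packed form: one vector load, one vector add, one vector store.  */
void
add_packed (unsigned int *out, const unsigned int *in, unsigned int k)
{
  v4ui v, vk = { k, k, k, k };

  assert (out != NULL && in != NULL);
  memcpy (&v, in, sizeof v);	/* Load without assuming alignment.  */
  v = v + vk;
  memcpy (out, &v, sizeof v);	/* Store all four lanes at once.  */
}
```

Both functions write the same four results; the packed form only makes explicit the grouping that the pass looks for inside a basic block.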
......
@@ -765,6 +765,12 @@ DEFPARAM (PARAM_LOOP_INVARIANT_MAX_BBS_IN_LOOP,
"max basic blocks number in loop for loop invariant motion",
10000, 0, 0)
/* Avoid SLP vectorization of large basic blocks. */
DEFPARAM (PARAM_SLP_MAX_INSNS_IN_BB,
"slp-max-insns-in-bb",
"Maximum number of instructions in basic block to be considered for SLP vectorization",
1000, 0, 0)
/*
Local variables:
mode:c
......
@@ -170,4 +170,6 @@ typedef enum compiler_param
PARAM_VALUE (PARAM_SWITCH_CONVERSION_BRANCH_RATIO)
#define LOOP_INVARIANT_MAX_BBS_IN_LOOP \
PARAM_VALUE (PARAM_LOOP_INVARIANT_MAX_BBS_IN_LOOP)
#define SLP_MAX_INSNS_IN_BB \
PARAM_VALUE (PARAM_SLP_MAX_INSNS_IN_BB)
#endif /* ! GCC_PARAMS_H */
@@ -662,6 +662,7 @@ init_optimization_passes (void)
NEXT_PASS (pass_dce_loop);
}
NEXT_PASS (pass_complete_unroll);
NEXT_PASS (pass_slp_vectorize);
NEXT_PASS (pass_parallelize_loops);
NEXT_PASS (pass_loop_prefetch);
NEXT_PASS (pass_iv_optimize);
......
2009-05-24 Ira Rosen <irar@il.ibm.com>
* gcc.dg/vect/bb-slp-1.c: New test.
* gcc.dg/vect/bb-slp-2.c, gcc.dg/vect/bb-slp-3.c,
gcc.dg/vect/bb-slp-4.c, gcc.dg/vect/bb-slp-5.c,
gcc.dg/vect/bb-slp-6.c, gcc.dg/vect/bb-slp-7.c,
gcc.dg/vect/bb-slp-8.c, gcc.dg/vect/bb-slp-9.c,
gcc.dg/vect/bb-slp-10.c, gcc.dg/vect/bb-slp-11.c,
gcc.dg/vect/no-tree-reassoc-bb-slp-12.c, gcc.dg/vect/bb-slp-13.c,
gcc.dg/vect/bb-slp-14.c, gcc.dg/vect/bb-slp-15.c,
gcc.dg/vect/bb-slp-16.c, gcc.dg/vect/bb-slp-17.c,
gcc.dg/vect/bb-slp-18.c, gcc.dg/vect/bb-slp-19.c,
gcc.dg/vect/bb-slp-20.c, gcc.dg/vect/bb-slp-21.c,
gcc.dg/vect/bb-slp-22.c: Likewise.
* gcc.dg/vect/vect.exp: Run basic block SLP tests.
2009-05-23 Mark Mitchell <mark@codesourcery.com>
Maxim Kuvyrkov <maxim@codesourcery.com>
......
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include <stdio.h>
#include "tree-vect.h"
#define N 32
unsigned int out[N*8];
unsigned int in[N*8] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63, 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63};
__attribute__ ((noinline)) int
main1 (int dummy)
{
int i;
unsigned int *pin = &in[0];
unsigned int *pout = &out[0];
for (i = 0; i < N; i++)
{
*pout++ = *pin++;
*pout++ = *pin++;
*pout++ = *pin++;
*pout++ = *pin++;
*pout++ = *pin++;
*pout++ = *pin++;
*pout++ = *pin++;
*pout++ = *pin++;
/* Avoid loop vectorization. */
if (dummy == 32)
abort ();
}
/* check results: */
for (i = 0; i < N; i++)
{
if (out[i*8] != in[i*8]
|| out[i*8 + 1] != in[i*8 + 1]
|| out[i*8 + 2] != in[i*8 + 2]
|| out[i*8 + 3] != in[i*8 + 3]
|| out[i*8 + 4] != in[i*8 + 4]
|| out[i*8 + 5] != in[i*8 + 5]
|| out[i*8 + 6] != in[i*8 + 6]
|| out[i*8 + 7] != in[i*8 + 7])
abort ();
}
return 0;
}
int main (void)
{
check_vect ();
main1 (33);
return 0;
}
/* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 1 "slp" } } */
/* { dg-final { cleanup-tree-dump "slp" } } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include <stdio.h>
#include "tree-vect.h"
#define N 16
unsigned int out[N];
unsigned int in[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
__attribute__ ((noinline)) int
main1 (unsigned int x, unsigned int y)
{
int i;
unsigned int *pin = &in[0];
unsigned int *pout = &out[2];
unsigned int a0, a1, a2, a3;
/* Misaligned store. */
a0 = *pin++ + 23;
a1 = *pin++ + 142;
a2 = *pin++ + 2;
a3 = *pin++ + 31;
*pout++ = a0 * x;
*pout++ = a1 * y;
*pout++ = a2 * x;
*pout++ = a3 * y;
/* Check results. */
if (out[2] != (in[0] + 23) * x
|| out[3] != (in[1] + 142) * y
|| out[4] != (in[2] + 2) * x
|| out[5] != (in[3] + 31) * y)
abort();
return 0;
}
int main (void)
{
check_vect ();
main1 (2, 3);
return 0;
}
/* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 0 "slp" } } */
/* { dg-final { scan-tree-dump-times "unsupported alignment in basic block." 1 "slp" } } */
/* { dg-final { cleanup-tree-dump "slp" } } */
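The store group in the test above starts at &out[2], and the test expects the dump message "unsupported alignment in basic block." The sketch below works through the arithmetic. It assumes a 4-byte unsigned int, a 16-byte vector, and an out array whose base is vector-aligned; the test states none of these explicitly. misalignment_bytes is a hypothetical helper, not a GCC function.

```c
#include <assert.h>
#include <stddef.h>

/* Byte misalignment of element INDEX of an array whose base is aligned
   to VEC_BYTES, where each element occupies ELEM_BYTES.  */
size_t
misalignment_bytes (size_t index, size_t elem_bytes, size_t vec_bytes)
{
  assert (vec_bytes != 0);
  return (index * elem_bytes) % vec_bytes;
}
```

Under those assumptions, misalignment_bytes (2, 4, 16) is 8, so the group does not start on a vector boundary, whereas a group starting at index 0 has misalignment 0.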
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include <stdio.h>
#include "tree-vect.h"
#define N 16
unsigned int out[N];
unsigned int in[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
__attribute__ ((noinline)) int
main1 (unsigned int x, unsigned int y)
{
int i;
unsigned int *pin = &in[0];
unsigned int *pout = &out[0];
short a0, a1, a2, a3;
a0 = *pin++ + 23;
a1 = *pin++ + 142;
a2 = *pin++ + 2;
a3 = *pin++ + 31;
*pout++ = a0 * x;
*pout++ = a1 * y;
*pout++ = a2 * x;
*pout++ = a3 * y;
/* Check results. */
if (out[0] != (in[0] + 23) * x
|| out[1] != (in[1] + 142) * y
|| out[2] != (in[2] + 2) * x
|| out[3] != (in[3] + 31) * y)
abort();
return 0;
}
int main (void)
{
check_vect ();
main1 (2, 3);
return 0;
}
/* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 0 "slp" } } */
/* { dg-final { scan-tree-dump-times "SLP with multiple types" 1 "slp" } } */
/* { dg-final { cleanup-tree-dump "slp" } } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include <stdio.h>
#include "tree-vect.h"
#define N 16
unsigned int out[N];
unsigned int in[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
__attribute__ ((noinline)) int
main1 (unsigned int x, unsigned int y)
{
int i;
unsigned int a0, a1, a2, a3;
a0 = in[0] + 23;
a1 = in[1] + 142;
a2 = in[2] + 2;
a3 = in[3] + 31;
out[0] = a0 * x;
out[1] = a1 * y;
out[2] = a2 * x;
out[3] = a3 * y;
/* Check results. */
if (out[0] != (in[0] + 23) * x
|| out[1] != (in[1] + 142) * y
|| out[2] != (in[2] + 2) * x
|| out[3] != (in[3] + 31) * y)
abort();
return 0;
}
int main (void)
{
check_vect ();
main1 (2, 3);
return 0;
}
/* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 1 "slp" { target vect_int_mult } } } */
/* { dg-final { cleanup-tree-dump "slp" } } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include <stdio.h>
#include "tree-vect.h"
#define N 16
unsigned int out[N];
unsigned int in[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
__attribute__ ((noinline)) int
main1 (unsigned int x, unsigned int y)
{
int i;
unsigned int a0, a1, a2, a3;
/* Not consecutive load with permutation - not supported. */
a0 = in[0] + 23;
a1 = in[1] + 142;
a2 = in[1] + 2;
a3 = in[3] + 31;
out[0] = a0 * x;
out[1] = a1 * y;
out[2] = a2 * x;
out[3] = a3 * y;
/* Check results. */
if (out[0] != (in[0] + 23) * x
|| out[1] != (in[1] + 142) * y
|| out[2] != (in[1] + 2) * x
|| out[3] != (in[3] + 31) * y)
abort();
return 0;
}
int main (void)
{
check_vect ();
main1 (2, 3);
return 0;
}
/* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 0 "slp" } } */
/* { dg-final { cleanup-tree-dump "slp" } } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include <stdio.h>
#include "tree-vect.h"
#define N 16
unsigned int out[N];
unsigned int in[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
__attribute__ ((noinline)) int
main1 (unsigned int x, unsigned int y)
{
int i;
unsigned int a0, a1, a2, a3;
if (x > y)
x = x + y;
else
y = x;
a0 = in[0] + 23;
a1 = in[1] + 142;
a2 = in[2] + 2;
a3 = in[3] + 31;
out[0] = a0 * x;
out[1] = a1 * y;
out[2] = a2 * x;
out[3] = a3 * y;
/* Check results. */
if (out[0] != (in[0] + 23) * x
|| out[1] != (in[1] + 142) * y
|| out[2] != (in[2] + 2) * x
|| out[3] != (in[3] + 31) * y)
abort();
return 0;
}
int main (void)
{
check_vect ();
main1 (2, 3);
return 0;
}
/* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 1 "slp" { target vect_int_mult } } } */
/* { dg-final { cleanup-tree-dump "slp" } } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include <stdio.h>
#include "tree-vect.h"
#define N 32
unsigned int out[N*8];
unsigned int in[N*8] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63, 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63};
unsigned int arr[N] = {0,1,2,3,4,5,6,7};
__attribute__ ((noinline)) int
main1 (int dummy)
{
int i;
unsigned int *pin = &in[0];
unsigned int *pout = &out[0];
unsigned int a = 0;
for (i = 0; i < N; i++)
{
*pout++ = *pin++ + a;
*pout++ = *pin++ + a;
*pout++ = *pin++ + a;
*pout++ = *pin++ + a;
*pout++ = *pin++ + a;
*pout++ = *pin++ + a;
*pout++ = *pin++ + a;
*pout++ = *pin++ + a;
if (arr[i] = i)
a = i;
else
a = 2;
}
a = 0;
/* check results: */
for (i = 0; i < N; i++)
{
if (out[i*8] != in[i*8] + a
|| out[i*8 + 1] != in[i*8 + 1] + a
|| out[i*8 + 2] != in[i*8 + 2] + a
|| out[i*8 + 3] != in[i*8 + 3] + a
|| out[i*8 + 4] != in[i*8 + 4] + a
|| out[i*8 + 5] != in[i*8 + 5] + a
|| out[i*8 + 6] != in[i*8 + 6] + a
|| out[i*8 + 7] != in[i*8 + 7] + a)
abort ();
if (arr[i] = i)
a = i;
else
a = 2;
}
return 0;
}
int main (void)
{
check_vect ();
main1 (33);
return 0;
}
/* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 1 "slp" } } */
/* { dg-final { cleanup-tree-dump "slp" } } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include <stdio.h>
#include "tree-vect.h"
#define N 16
unsigned int b[N];
unsigned int out[N];
unsigned int in[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
__attribute__ ((noinline)) int
main1 (unsigned int x, unsigned int y)
{
int i;
unsigned int a0, a1, a2, a3;
if (x > y)
x = x + y;
else
y = x;
a0 = in[0] + 23;
a1 = in[1] + 142;
a2 = in[2] + 2;
a3 = in[3] + 31;
b[0] = a0;
b[1] = a1;
out[0] = a0 * x;
out[1] = a1 * y;
out[2] = a2 * x;
out[3] = a3 * y;
/* Check results. */
if (out[0] != (in[0] + 23) * x
|| out[1] != (in[1] + 142) * y
|| out[2] != (in[2] + 2) * x
|| out[3] != (in[3] + 31) * y
|| b[0] != in[0] + 23
|| b[1] != in[1] + 142)
abort();
return 0;
}
int main (void)
{
check_vect ();
main1 (2, 3);
return 0;
}
/* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 1 "slp" { target vect_int_mult } } } */
/* { dg-final { cleanup-tree-dump "slp" } } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include <stdio.h>
#include "tree-vect.h"
#define N 16
unsigned int out[N];
unsigned int in[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
__attribute__ ((noinline)) int
main1 (unsigned int x, unsigned int y)
{
int i;
unsigned int a0, a1, a2, a3;
a0 = in[0] + 23;
a1 = in[1] + 142;
a2 = in[2] + 2;
a3 = in[3] + 31;
out[0] = a0 * x;
out[1] = a1 * y;
out[2] = a2 * x;
out[3] = a3 * y;
/* Check results. */
if (out[0] != a0 * x
|| out[1] != a1 * y
|| out[2] != a2 * x
|| out[3] != a3 * y)
abort();
return 0;
}
int main (void)
{
check_vect ();
main1 (2, 3);
return 0;
}
/* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 1 "slp" { target vect_int_mult } } } */
/* { dg-final { cleanup-tree-dump "slp" } } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include <stdio.h>
#include "tree-vect.h"
#define N 16
unsigned short out[N];
unsigned short in[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
__attribute__ ((noinline)) int
main1 ()
{
int i;
unsigned short *pin = &in[0];
unsigned short *pout = &out[0];
/* A group of 9 shorts - unsupported for now. */
*pout++ = *pin++;
*pout++ = *pin++;
*pout++ = *pin++;
*pout++ = *pin++;
*pout++ = *pin++;
*pout++ = *pin++;
*pout++ = *pin++;
*pout++ = *pin++;
*pout++ = *pin++;
/* Check results. */
if (out[0] != in[0]
|| out[1] != in[1]
|| out[2] != in[2]
|| out[3] != in[3]
|| out[4] != in[4]
|| out[5] != in[5]
|| out[6] != in[6]
|| out[7] != in[7]
|| out[8] != in[8])
abort();
return 0;
}
int main (void)
{
check_vect ();
main1 ();
return 0;
}
/* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 1 "slp" { xfail *-*-* } } } */
/* { dg-final { cleanup-tree-dump "slp" } } */
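The ChangeLog notes that basic block SLP checks that the unrolling factor is 1, and the test above marks a group of 9 shorts as unsupported. The sketch below works this through, assuming the unrolling factor is lcm (nunits, group_size) / group_size, where nunits is the number of vector lanes. That formula is an assumption and is not shown in this excerpt. gcd_u and slp_unrolling_factor_sketch are hypothetical names.

```c
#include <assert.h>

/* Greatest common divisor, Euclid's algorithm.  */
static unsigned
gcd_u (unsigned a, unsigned b)
{
  while (b != 0)
    {
      unsigned t = a % b;
      a = b;
      b = t;
    }
  return a;
}

/* Number of copies of a group of GROUP_SIZE scalars needed to fill a
   whole number of NUNITS-lane vectors.  This equals
   lcm (nunits, group_size) / group_size, i.e. nunits / gcd.  */
unsigned
slp_unrolling_factor_sketch (unsigned nunits, unsigned group_size)
{
  assert (nunits != 0 && group_size != 0);
  return nunits / gcd_u (nunits, group_size);
}
```

With 8 lanes for short in a 16-byte vector, a group of 9 gives a factor of 8, which a single basic block cannot provide. A group of 8 unsigned ints with 4 lanes gives 1, consistent with the tests that do expect vectorization.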
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include <stdio.h>
#include "tree-vect.h"
#define N 16
unsigned int out[N*8];
unsigned int in[N*8] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63};
__attribute__ ((noinline)) int
main1 (int dummy)
{
int i;
unsigned int *pin = &in[0];
unsigned int *pout = &out[0];
for (i = 0; i < N*2; i++)
{
*pout++ = *pin++;
*pout++ = *pin++;
*pout++ = *pin++;
*pout++ = *pin++;
/* Avoid loop vectorization. */
if (dummy == 32)
abort ();
}
/* check results: */
for (i = 0; i < N; i++)
{
if (out[i*8] != in[i*8]
|| out[i*8 + 1] != in[i*8 + 1]
|| out[i*8 + 2] != in[i*8 + 2]
|| out[i*8 + 3] != in[i*8 + 3]
|| out[i*8 + 4] != in[i*8 + 4]
|| out[i*8 + 5] != in[i*8 + 5]
|| out[i*8 + 6] != in[i*8 + 6]
|| out[i*8 + 7] != in[i*8 + 7])
abort ();
}
return 0;
}
int main (void)
{
check_vect ();
main1 (33);
return 0;
}
/* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 1 "slp" } } */
/* { dg-final { cleanup-tree-dump "slp" } } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include <stdio.h>
#include "tree-vect.h"
#define N 16
int b[N];
unsigned int out[N];
unsigned int in[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
__attribute__ ((noinline)) int
main1 (unsigned int x, unsigned int y)
{
int i;
unsigned int a0, a1, a2, a3;
if (x > y)
x = x + y;
else
y = x;
/* Two SLP instances in the basic block, only one is supported for now,
the second one contains type conversion. */
a0 = in[0] + 23;
a1 = in[1] + 142;
a2 = in[2] + 2;
a3 = in[3] + 31;
b[0] = -a0;
b[1] = -a1;
b[2] = -a2;
b[3] = -a3;
out[0] = a0 * x;
out[1] = a1 * y;
out[2] = a2 * x;
out[3] = a3 * y;
/* Check results. */
if (out[0] != (in[0] + 23) * x
|| out[1] != (in[1] + 142) * y
|| out[2] != (in[2] + 2) * x
|| out[3] != (in[3] + 31) * y
|| b[0] != -(in[0] + 23)
|| b[1] != -(in[1] + 142)
|| b[2] != -(in[2] + 2)
|| b[3] != -(in[3] + 31))
abort();
return 0;
}
int main (void)
{
check_vect ();
main1 (2, 3);
return 0;
}
/* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 1 "slp" { target vect_int_mult } } } */
/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "slp" { target vect_int_mult } } } */
/* { dg-final { cleanup-tree-dump "slp" } } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include <stdio.h>
#include "tree-vect.h"
#define N 16
unsigned int b[N];
unsigned int out[N];
unsigned int in[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
__attribute__ ((noinline)) int
main1 (unsigned int x, unsigned int y)
{
int i;
unsigned int a0, a1, a2, a3;
/* Two SLP instances in one basic block. */
if (x > y)
x = x + y;
else
y = x;
a0 = in[0] + 23;
a1 = in[1] + 142;
a2 = in[2] + 2;
a3 = in[3] + 31;
b[0] = a0;
b[1] = a1;
b[2] = a2;
b[3] = a3;
out[0] = a0 * x;
out[1] = a1 * y;
out[2] = a2 * x;
out[3] = a3 * y;
/* Check results. */
if (out[0] != (in[0] + 23) * x
|| out[1] != (in[1] + 142) * y
|| out[2] != (in[2] + 2) * x
|| out[3] != (in[3] + 31) * y
|| b[0] != (in[0] + 23)
|| b[1] != (in[1] + 142)
|| b[2] != (in[2] + 2)
|| b[3] != (in[3] + 31))
abort();
return 0;
}
int main (void)
{
check_vect ();
main1 (2, 3);
return 0;
}
/* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 1 "slp" } } */
/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "slp" { target { ! {vect_int_mult } } } } } */
/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "slp" { target vect_int_mult } } } */
/* { dg-final { cleanup-tree-dump "slp" } } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include <stdio.h>
#include "tree-vect.h"
#define N 16
unsigned int b[N];
unsigned int out[N];
unsigned int in[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
__attribute__ ((noinline)) int
main1 (unsigned int x, unsigned int y)
{
int i;
unsigned int a0, a1, a2, a3;
a0 = in[0] + 23;
a1 = in[1] + 142;
a2 = in[2] + 2;
a3 = in[3] + 31;
if (x > y)
{
b[0] = a0;
b[1] = a1;
b[2] = a2;
b[3] = a3;
}
else
{
out[0] = a0 * x;
out[1] = a1 * y;
out[2] = a2 * x;
out[3] = a3 * y;
}
/* Check results. */
if ((x <= y
&& (out[0] != (in[0] + 23) * x
|| out[1] != (in[1] + 142) * y
|| out[2] != (in[2] + 2) * x
|| out[3] != (in[3] + 31) * y))
|| (x > y
&& (b[0] != (in[0] + 23)
|| b[1] != (in[1] + 142)
|| b[2] != (in[2] + 2)
|| b[3] != (in[3] + 31))))
abort();
return 0;
}
int main (void)
{
check_vect ();
main1 (2, 3);
return 0;
}
/* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 1 "slp" { target { ! {vect_int_mult } } } } } */
/* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 2 "slp" { target vect_int_mult } } } */
/* { dg-final { cleanup-tree-dump "slp" } } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include <stdio.h>
#include "tree-vect.h"
#define N 16
unsigned int out[N];
unsigned int in[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
__attribute__ ((noinline)) int
main1 ()
{
int i;
unsigned int *pin = &in[0];
unsigned int *pout = &out[0];
*pout++ = *pin++;
*pout++ = *pin++;
*pout++ = *pin++;
*pout++ = *pin++;
/* Check results. */
if (out[0] != in[0]
|| out[1] != in[1]
|| out[2] != in[2]
|| out[3] != in[3])
abort();
return 0;
}
int main (void)
{
check_vect ();
main1 ();
return 0;
}
/* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 1 "slp" } } */
/* { dg-final { cleanup-tree-dump "slp" } } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include <stdio.h>
#include "tree-vect.h"
#define N 16
unsigned short out[N];
unsigned short in[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
__attribute__ ((noinline)) int
main1 ()
{
int i;
unsigned short *pin = &in[0];
unsigned short *pout = &out[0];
*pout++ = *pin++;
*pout++ = *pin++;
*pout++ = *pin++;
*pout++ = *pin++;
/* Check results. */
if (out[0] != in[0]
|| out[1] != in[1]
|| out[2] != in[2]
|| out[3] != in[3])
abort();
return 0;
}
int main (void)
{
check_vect ();
main1 ();
return 0;
}
/* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 0 "slp" } } */
/* { dg-final { cleanup-tree-dump "slp" } } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include <stdio.h>
#include "tree-vect.h"
#define N 16
unsigned short out[N];
unsigned short in[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
__attribute__ ((noinline)) int
main1 ()
{
int i;
unsigned short *pin = &in[0];
unsigned short *pout = &out[0];
*pout++ = *pin++;
*pout++ = *pin++;
*pout++ = *pin++;
*pout++ = *pin++;
*pout++ = *pin++;
*pout++ = *pin++;
*pout++ = *pin++;
*pout++ = *pin++;
/* Check results. */
if (out[0] != in[0]
|| out[1] != in[1]
|| out[2] != in[2]
|| out[3] != in[3]
|| out[4] != in[4]
|| out[5] != in[5]
|| out[6] != in[6]
|| out[7] != in[7])
abort();
return 0;
}
int main (void)
{
check_vect ();
main1 ();
return 0;
}
/* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 1 "slp" } } */
/* { dg-final { cleanup-tree-dump "slp" } } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include <stdio.h>
#include "tree-vect.h"
#define N 16
unsigned int out[N];
unsigned int in[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
__attribute__ ((noinline)) int
main1 (unsigned int x, unsigned int y)
{
int i;
unsigned int *pin = &in[0];
unsigned int *pout = &out[0];
unsigned int a0, a1, a2, a3;
a0 = *pin++ + 23;
a1 = *pin++ + 142;
a2 = *pin++ + 2;
a3 = *pin++ + 31;
*pout++ = a0 * x;
*pout++ = a1 * y;
*pout++ = a2 * x;
*pout++ = a3 * y;
/* Check results. */
if (out[0] != (in[0] + 23) * x
|| out[1] != (in[1] + 142) * y
|| out[2] != (in[2] + 2) * x
|| out[3] != (in[3] + 31) * y)
abort();
return 0;
}
int main (void)
{
check_vect ();
main1 (2, 3);
return 0;
}
/* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 1 "slp" { target vect_int_mult } } } */
/* { dg-final { cleanup-tree-dump "slp" } } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include <stdio.h>
#include "tree-vect.h"
#define N 16
unsigned int out[N];
unsigned int in[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
__attribute__ ((noinline)) int
main1 (unsigned int x, unsigned int y)
{
int i;
unsigned int *pin = &in[0];
unsigned int *pout = &out[0];
unsigned int a0, a1, a2, a3;
/* Non isomorphic. */
a0 = *pin++ + 23;
a1 = *pin++ + 142;
a2 = *pin++ + 2;
a3 = *pin++ * 31;
*pout++ = a0 * x;
*pout++ = a1 * y;
*pout++ = a2 * x;
*pout++ = a3 * y;
/* Check results. */
if (out[0] != (in[0] + 23) * x
|| out[1] != (in[1] + 142) * y
|| out[2] != (in[2] + 2) * x
|| out[3] != (in[3] * 31) * y)
abort();
return 0;
}
int main (void)
{
check_vect ();
main1 (2, 3);
return 0;
}
/* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 0 "slp" } } */
/* { dg-final { cleanup-tree-dump "slp" } } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include <stdio.h>
#include "tree-vect.h"
#define N 16
unsigned int out[N];
unsigned int in[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
__attribute__ ((noinline)) int
main1 (unsigned int x, unsigned int y, unsigned int *pin, unsigned int *pout)
{
int i;
unsigned int a0, a1, a2, a3;
/* pin and pout may alias. */
a0 = *pin++ + 23;
a1 = *pin++ + 142;
a2 = *pin++ + 2;
a3 = *pin++ + 31;
*pout++ = a0 * x;
*pout++ = a1 * y;
*pout++ = a2 * x;
*pout++ = a3 * y;
/* Check results. */
if (out[0] != (in[0] + 23) * x
|| out[1] != (in[1] + 142) * y
|| out[2] != (in[2] + 2) * x
|| out[3] != (in[3] + 31) * y)
abort();
return 0;
}
int main (void)
{
check_vect ();
main1 (2, 3, &in[0], &out[0]);
return 0;
}
/* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 0 "slp" } } */
/* { dg-final { cleanup-tree-dump "slp" } } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include <stdio.h>
#include "tree-vect.h"
#define N 16
unsigned int out[N];
unsigned int in[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
__attribute__ ((noinline)) int
main1 (unsigned int x, unsigned int y)
{
int i;
unsigned int *pin = &in[1];
unsigned int *pout = &out[0];
unsigned int a0, a1, a2, a3;
/* Misaligned load. */
a0 = *pin++ + 23;
a1 = *pin++ + 142;
a2 = *pin++ + 2;
a3 = *pin++ + 31;
*pout++ = a0 * x;
*pout++ = a1 * y;
*pout++ = a2 * x;
*pout++ = a3 * y;
/* Check results. */
if (out[0] != (in[1] + 23) * x
|| out[1] != (in[2] + 142) * y
|| out[2] != (in[3] + 2) * x
|| out[3] != (in[4] + 31) * y)
abort();
return 0;
}
int main (void)
{
check_vect ();
main1 (2, 3);
return 0;
}
/* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 0 "slp" } } */
/* { dg-final { scan-tree-dump-times "unsupported alignment in basic block." 1 "slp" } } */
/* { dg-final { cleanup-tree-dump "slp" } } */
/* { dg-require-effective-target vect_int } */
#include <stdarg.h>
#include <stdio.h>
#include "tree-vect.h"
#define N 16
unsigned int out[N];
unsigned int in1[N] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
unsigned int in2[N] = {10,11,12,13,14,15,16,17,18,19,110,111,112,113,114,115};
__attribute__ ((noinline)) int
main1 (unsigned int x, unsigned int y)
{
int i;
unsigned int *pin1 = &in1[0];
unsigned int *pin2 = &in2[0];
unsigned int *pout = &out[0];
unsigned int a0, a1, a2, a3;
a0 = *pin2++ - *pin1++ + 23;
a1 = *pin2++ - *pin1++ + 142;
a2 = *pin2++ - *pin1++ + 2;
a3 = *pin2++ - *pin1++ + 31;
*pout++ = a0 * x;
*pout++ = a1 * y;
*pout++ = a2 * x;
*pout++ = a3 * y;
/* Check results. */
if (out[0] != (in2[0] - in1[0] + 23) * x
|| out[1] != (in2[1] - in1[1] + 142) * y
|| out[2] != (in2[2] - in1[2] + 2) * x
|| out[3] != (in2[3] - in1[3] + 31) * y)
abort();
return 0;
}
int main (void)
{
check_vect ();
main1 (2, 3);
return 0;
}
/* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 1 "slp" { target vect_int_mult } } } */
/* { dg-final { cleanup-tree-dump "slp" } } */
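Note on the bb-slp tests above: each one builds a group of four isomorphic statements feeding four adjacent stores, which is the shape the basic-block SLP pass looks for. The sketch below is only an illustration of that idea, not the code the pass emits. It uses the GCC vector_size extension, and the names v4u, scalar_bb, vector_bb and results_match are hypothetical. It also assumes a 32-bit unsigned int.

```c
#include <assert.h>
#include <string.h>

/* Four unsigned ints in one 16-byte vector (GCC/Clang vector extension).  */
typedef unsigned int v4u __attribute__ ((vector_size (16)));

/* Scalar form: the four isomorphic statements used by the tests.  */
void
scalar_bb (const unsigned int *in, unsigned int *out,
           unsigned int x, unsigned int y)
{
  out[0] = (in[0] + 23) * x;
  out[1] = (in[1] + 142) * y;
  out[2] = (in[2] + 2) * x;
  out[3] = (in[3] + 31) * y;
}

/* Hand-written vector form: one load, one add, one multiply, one store.
   The per-lane constants and the {x, y, x, y} operand are packed into
   vectors, mirroring how the scalar operands line up lane by lane.  */
void
vector_bb (const unsigned int *in, unsigned int *out,
           unsigned int x, unsigned int y)
{
  const v4u addend = { 23, 142, 2, 31 };
  const v4u factor = { x, y, x, y };
  v4u v;

  /* Assumption of this sketch: unsigned int is 32 bits wide.  */
  assert (sizeof (v4u) == 4 * sizeof (unsigned int));

  memcpy (&v, in, sizeof v);      /* unaligned-safe vector load */
  v = (v + addend) * factor;      /* lane-wise add and multiply */
  memcpy (out, &v, sizeof v);     /* vector store */
}

/* Returns 1 when both forms produce the same four results.  */
int
results_match (unsigned int x, unsigned int y)
{
  const unsigned int in[4] = { 0, 1, 2, 3 };
  unsigned int a[4], b[4];

  scalar_bb (in, a, x, y);
  vector_bb (in, b, x, y);
  return memcmp (a, b, sizeof a) == 0;
}
```

The memcpy calls sidestep the alignment question that the "Misaligned load" test exercises. A real vector load would need the address to be suitably aligned or the target to support unaligned accesses.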
......@@ -122,6 +122,8 @@ dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/nodump-*.\[cS\]]] \
"" $DEFAULT_VECTCFLAGS
lappend DEFAULT_VECTCFLAGS "-fdump-tree-vect-details"
set VECT_SLP_CFLAGS $DEFAULT_VECTCFLAGS
lappend VECT_SLP_CFLAGS "-fdump-tree-slp-details"
# Main loop.
dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/pr*.\[cS\]]] \
......@@ -130,10 +132,14 @@ dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/vect-*.\[cS\]]] \
"" $DEFAULT_VECTCFLAGS
dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/slp-*.\[cS\]]] \
"" $DEFAULT_VECTCFLAGS
dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/bb-slp*.\[cS\]]] \
"" $VECT_SLP_CFLAGS
#### Tests with special options
global SAVED_DEFAULT_VECTCFLAGS
set SAVED_DEFAULT_VECTCFLAGS $DEFAULT_VECTCFLAGS
set SAVED_VECT_SLP_CFLAGS $VECT_SLP_CFLAGS
# --param vect-max-version-for-alias-checks=0 tests
set DEFAULT_VECTCFLAGS $SAVED_DEFAULT_VECTCFLAGS
......@@ -262,6 +268,11 @@ dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/O3-*.\[cS\]]] \
dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/O1-*.\[cS\]]] \
"" $O1_VECTCFLAGS
# -fno-tree-reassoc
set VECT_SLP_CFLAGS $SAVED_VECT_SLP_CFLAGS
lappend VECT_SLP_CFLAGS "-fno-tree-reassoc"
dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/no-tree-reassoc-bb-slp-*.\[cS\]]] \
"" $VECT_SLP_CFLAGS
# Clean up.
set dg-do-what-default ${save-dg-do-what-default}
......
......@@ -121,6 +121,7 @@ DEFTIMEVAR (TV_TREE_LOOP_UNSWITCH , "tree loop unswitching")
DEFTIMEVAR (TV_COMPLETE_UNROLL , "complete unrolling")
DEFTIMEVAR (TV_TREE_PARALLELIZE_LOOPS, "tree parallelize loops")
DEFTIMEVAR (TV_TREE_VECTORIZATION , "tree vectorization")
DEFTIMEVAR (TV_TREE_SLP_VECTORIZATION, "tree slp vectorization")
DEFTIMEVAR (TV_GRAPHITE_TRANSFORMS , "GRAPHITE loop transforms")
DEFTIMEVAR (TV_TREE_LINEAR_TRANSFORM , "tree loop linear")
DEFTIMEVAR (TV_TREE_LOOP_DISTRIBUTION, "tree loop distribution")
......
......@@ -668,8 +668,9 @@ canonicalize_base_object_address (tree addr)
return build_fold_addr_expr (TREE_OPERAND (addr, 0));
}
/* Analyzes the behavior of the memory reference DR in the innermost loop that
contains it. Returns true if analysis succeed or false otherwise. */
/* Analyzes the behavior of the memory reference DR in the innermost loop or
basic block that contains it. Returns true if analysis succeed or false
otherwise. */
bool
dr_analyze_innermost (struct data_reference *dr)
......@@ -683,6 +684,7 @@ dr_analyze_innermost (struct data_reference *dr)
int punsignedp, pvolatilep;
affine_iv base_iv, offset_iv;
tree init, dinit, step;
bool in_loop = (loop && loop->num);
if (dump_file && (dump_flags & TDF_DETAILS))
fprintf (dump_file, "analyze_innermost: ");
......@@ -699,13 +701,24 @@ dr_analyze_innermost (struct data_reference *dr)
}
base = build_fold_addr_expr (base);
if (!simple_iv (loop, loop_containing_stmt (stmt), base, &base_iv, false))
if (in_loop)
{
if (dump_file && (dump_flags & TDF_DETAILS))
fprintf (dump_file, "failed: evolution of base is not affine.\n");
return false;
if (!simple_iv (loop, loop_containing_stmt (stmt), base, &base_iv,
false))
{
if (dump_file && (dump_flags & TDF_DETAILS))
fprintf (dump_file, "failed: evolution of base is not affine.\n");
return false;
}
}
else
{
base_iv.base = base;
base_iv.step = ssize_int (0);
base_iv.no_overflow = true;
}
if (!poffset)
if (!poffset || !in_loop)
{
offset_iv.base = ssize_int (0);
offset_iv.step = ssize_int (0);
......@@ -752,17 +765,23 @@ dr_analyze_indices (struct data_reference *dr, struct loop *nest)
struct loop *loop = loop_containing_stmt (stmt);
VEC (tree, heap) *access_fns = NULL;
tree ref = unshare_expr (DR_REF (dr)), aref = ref, op;
tree base, off, access_fn;
basic_block before_loop = block_before_loop (nest);
tree base, off, access_fn = NULL_TREE;
basic_block before_loop = NULL;
if (nest)
before_loop = block_before_loop (nest);
while (handled_component_p (aref))
{
if (TREE_CODE (aref) == ARRAY_REF)
{
op = TREE_OPERAND (aref, 1);
access_fn = analyze_scalar_evolution (loop, op);
access_fn = instantiate_scev (before_loop, loop, access_fn);
VEC_safe_push (tree, heap, access_fns, access_fn);
if (nest)
{
access_fn = analyze_scalar_evolution (loop, op);
access_fn = instantiate_scev (before_loop, loop, access_fn);
VEC_safe_push (tree, heap, access_fns, access_fn);
}
TREE_OPERAND (aref, 1) = build_int_cst (TREE_TYPE (op), 0);
}
......@@ -770,7 +789,7 @@ dr_analyze_indices (struct data_reference *dr, struct loop *nest)
aref = TREE_OPERAND (aref, 0);
}
if (INDIRECT_REF_P (aref))
if (nest && INDIRECT_REF_P (aref))
{
op = TREE_OPERAND (aref, 0);
access_fn = analyze_scalar_evolution (loop, op);
......@@ -1332,8 +1351,9 @@ initialize_data_dependence_relation (struct data_reference *a,
/* If the base of the object is not invariant in the loop nest, we cannot
analyze it. TODO -- in fact, it would suffice to record that there may
be arbitrary dependences in the loops where the base object varies. */
if (!object_address_invariant_in_loop_p (VEC_index (loop_p, loop_nest, 0),
DR_BASE_OBJECT (a)))
if (loop_nest
&& !object_address_invariant_in_loop_p (VEC_index (loop_p, loop_nest, 0),
DR_BASE_OBJECT (a)))
{
DDR_ARE_DEPENDENT (res) = chrec_dont_know;
return res;
......@@ -4003,7 +4023,8 @@ compute_all_dependences (VEC (data_reference_p, heap) *datarefs,
{
ddr = initialize_data_dependence_relation (a, b, loop_nest);
VEC_safe_push (ddr_p, heap, *dependence_relations, ddr);
compute_affine_dependence (ddr, VEC_index (loop_p, loop_nest, 0));
if (loop_nest)
compute_affine_dependence (ddr, VEC_index (loop_p, loop_nest, 0));
}
if (compute_self_and_rr)
......@@ -4110,9 +4131,10 @@ find_data_references_in_stmt (struct loop *nest, gimple stmt,
dr = create_data_ref (nest, *ref->pos, stmt, ref->is_read);
gcc_assert (dr != NULL);
/* FIXME -- data dependence analysis does not work correctly for objects with
invariant addresses. Let us fail here until the problem is fixed. */
if (dr_address_invariant_p (dr))
/* FIXME -- data dependence analysis does not work correctly for objects
with invariant addresses in loop nests. Let us fail here until the
problem is fixed. */
if (dr_address_invariant_p (dr) && nest)
{
free_data_ref (dr);
if (dump_file && (dump_flags & TDF_DETAILS))
......@@ -4129,6 +4151,33 @@ find_data_references_in_stmt (struct loop *nest, gimple stmt,
/* Search the data references in LOOP, and record the information into
DATAREFS. Returns chrec_dont_know when failing to analyze a
difficult case, returns NULL_TREE otherwise. */
static tree
find_data_references_in_bb (struct loop *loop, basic_block bb,
VEC (data_reference_p, heap) **datarefs)
{
gimple_stmt_iterator bsi;
for (bsi = gsi_start_bb (bb); !gsi_end_p (bsi); gsi_next (&bsi))
{
gimple stmt = gsi_stmt (bsi);
if (!find_data_references_in_stmt (loop, stmt, datarefs))
{
struct data_reference *res;
res = XCNEW (struct data_reference);
VEC_safe_push (data_reference_p, heap, *datarefs, res);
return chrec_dont_know;
}
}
return NULL_TREE;
}
/* Search the data references in LOOP, and record the information into
DATAREFS. Returns chrec_dont_know when failing to analyze a
difficult case, returns NULL_TREE otherwise.
TODO: This function should be made smarter so that it can handle address
......@@ -4140,7 +4189,6 @@ find_data_references_in_loop (struct loop *loop,
{
basic_block bb, *bbs;
unsigned int i;
gimple_stmt_iterator bsi;
bbs = get_loop_body_in_dom_order (loop);
......@@ -4148,20 +4196,11 @@ find_data_references_in_loop (struct loop *loop,
{
bb = bbs[i];
for (bsi = gsi_start_bb (bb); !gsi_end_p (bsi); gsi_next (&bsi))
{
gimple stmt = gsi_stmt (bsi);
if (!find_data_references_in_stmt (loop, stmt, datarefs))
{
struct data_reference *res;
res = XCNEW (struct data_reference);
VEC_safe_push (data_reference_p, heap, *datarefs, res);
free (bbs);
return chrec_dont_know;
}
}
if (find_data_references_in_bb (loop, bb, datarefs) == chrec_dont_know)
{
free (bbs);
return chrec_dont_know;
}
}
free (bbs);
......@@ -4298,6 +4337,26 @@ compute_data_dependences_for_loop (struct loop *loop,
return res;
}
/* Returns true when the data dependences for the basic block BB have been
computed, false otherwise.
DATAREFS is initialized to all the array elements contained in this basic
block, DEPENDENCE_RELATIONS contains the relations between the data
references. Compute read-read and self relations if
COMPUTE_SELF_AND_READ_READ_DEPENDENCES is TRUE. */
bool
compute_data_dependences_for_bb (basic_block bb,
bool compute_self_and_read_read_dependences,
VEC (data_reference_p, heap) **datarefs,
VEC (ddr_p, heap) **dependence_relations)
{
if (find_data_references_in_bb (NULL, bb, datarefs) == chrec_dont_know)
return false;
compute_all_dependences (*datarefs, dependence_relations, NULL,
compute_self_and_read_read_dependences);
return true;
}
/* Entry point (for testing only). Analyze all the data references
and the dependence relations in LOOP.
......
......@@ -383,6 +383,9 @@ bool dr_analyze_innermost (struct data_reference *);
extern bool compute_data_dependences_for_loop (struct loop *, bool,
VEC (data_reference_p, heap) **,
VEC (ddr_p, heap) **);
extern bool compute_data_dependences_for_bb (basic_block, bool,
VEC (data_reference_p, heap) **,
VEC (ddr_p, heap) **);
extern tree find_data_references_in_loop (struct loop *,
VEC (data_reference_p, heap) **);
extern void print_direction_vector (FILE *, lambda_vector, int);
......
......@@ -338,6 +338,7 @@ extern struct gimple_opt_pass pass_graphite_transforms;
extern struct gimple_opt_pass pass_if_conversion;
extern struct gimple_opt_pass pass_loop_distribution;
extern struct gimple_opt_pass pass_vectorize;
extern struct gimple_opt_pass pass_slp_vectorize;
extern struct gimple_opt_pass pass_complete_unroll;
extern struct gimple_opt_pass pass_complete_unrolli;
extern struct gimple_opt_pass pass_parallelize_loops;
......
......@@ -640,14 +640,14 @@ new_loop_vec_info (struct loop *loop)
{
gimple phi = gsi_stmt (si);
gimple_set_uid (phi, 0);
set_vinfo_for_stmt (phi, new_stmt_vec_info (phi, res));
set_vinfo_for_stmt (phi, new_stmt_vec_info (phi, res, NULL));
}
for (si = gsi_start_bb (bb); !gsi_end_p (si); gsi_next (&si))
{
gimple stmt = gsi_stmt (si);
gimple_set_uid (stmt, 0);
set_vinfo_for_stmt (stmt, new_stmt_vec_info (stmt, res));
set_vinfo_for_stmt (stmt, new_stmt_vec_info (stmt, res, NULL));
}
}
}
......@@ -1153,7 +1153,7 @@ vect_analyze_loop_operations (loop_vec_info loop_vinfo)
gcc_assert (stmt_info);
if (!vect_analyze_stmt (stmt, &need_to_vectorize))
if (!vect_analyze_stmt (stmt, &need_to_vectorize, NULL))
return false;
if (STMT_VINFO_RELEVANT_P (stmt_info) && !PURE_SLP_STMT (stmt_info))
......@@ -1316,7 +1316,7 @@ vect_analyze_loop (struct loop *loop)
FORNOW: Handle only simple, array references, which
alignment can be forced, and aligned pointer-references. */
ok = vect_analyze_data_refs (loop_vinfo);
ok = vect_analyze_data_refs (loop_vinfo, NULL);
if (!ok)
{
if (vect_print_dump_info (REPORT_DETAILS))
......@@ -1346,7 +1346,7 @@ vect_analyze_loop (struct loop *loop)
/* Analyze the alignment of the data-refs in the loop.
Fail if a data reference is found that cannot be vectorized. */
ok = vect_analyze_data_refs_alignment (loop_vinfo);
ok = vect_analyze_data_refs_alignment (loop_vinfo, NULL);
if (!ok)
{
if (vect_print_dump_info (REPORT_DETAILS))
......@@ -1367,7 +1367,7 @@ vect_analyze_loop (struct loop *loop)
/* Analyze data dependences between the data-refs in the loop.
FORNOW: fail at the first data dependence that we encounter. */
ok = vect_analyze_data_ref_dependences (loop_vinfo);
ok = vect_analyze_data_ref_dependences (loop_vinfo, NULL);
if (!ok)
{
if (vect_print_dump_info (REPORT_DETAILS))
......@@ -1379,7 +1379,7 @@ vect_analyze_loop (struct loop *loop)
/* Analyze the access patterns of the data-refs in the loop (consecutive,
complex, etc.). FORNOW: Only handle consecutive access pattern. */
ok = vect_analyze_data_ref_accesses (loop_vinfo);
ok = vect_analyze_data_ref_accesses (loop_vinfo, NULL);
if (!ok)
{
if (vect_print_dump_info (REPORT_DETAILS))
......@@ -1402,7 +1402,7 @@ vect_analyze_loop (struct loop *loop)
}
/* Check the SLP opportunities in the loop, analyze and build SLP trees. */
ok = vect_analyze_slp (loop_vinfo);
ok = vect_analyze_slp (loop_vinfo, NULL);
if (ok)
{
/* Decide which possible SLP instances to SLP. */
......@@ -2354,7 +2354,7 @@ get_initial_def_for_induction (gimple iv_phi)
add_referenced_var (vec_dest);
induction_phi = create_phi_node (vec_dest, iv_loop->header);
set_vinfo_for_stmt (induction_phi,
new_stmt_vec_info (induction_phi, loop_vinfo));
new_stmt_vec_info (induction_phi, loop_vinfo, NULL));
induc_def = PHI_RESULT (induction_phi);
/* Create the iv update inside the loop */
......@@ -2363,7 +2363,8 @@ get_initial_def_for_induction (gimple iv_phi)
vec_def = make_ssa_name (vec_dest, new_stmt);
gimple_assign_set_lhs (new_stmt, vec_def);
gsi_insert_before (&si, new_stmt, GSI_SAME_STMT);
set_vinfo_for_stmt (new_stmt, new_stmt_vec_info (new_stmt, loop_vinfo));
set_vinfo_for_stmt (new_stmt, new_stmt_vec_info (new_stmt, loop_vinfo,
NULL));
/* Set the arguments of the phi node: */
add_phi_arg (induction_phi, vec_init, pe);
......@@ -2405,7 +2406,7 @@ get_initial_def_for_induction (gimple iv_phi)
gsi_insert_before (&si, new_stmt, GSI_SAME_STMT);
set_vinfo_for_stmt (new_stmt,
new_stmt_vec_info (new_stmt, loop_vinfo));
new_stmt_vec_info (new_stmt, loop_vinfo, NULL));
STMT_VINFO_RELATED_STMT (prev_stmt_vinfo) = new_stmt;
prev_stmt_vinfo = vinfo_for_stmt (new_stmt);
}
......@@ -2743,7 +2744,7 @@ vect_create_epilog_for_reduction (tree vect_def, gimple stmt,
for (j = 0; j < ncopies; j++)
{
phi = create_phi_node (SSA_NAME_VAR (vect_def), exit_bb);
set_vinfo_for_stmt (phi, new_stmt_vec_info (phi, loop_vinfo));
set_vinfo_for_stmt (phi, new_stmt_vec_info (phi, loop_vinfo, NULL));
if (j == 0)
new_phi = phi;
else
......@@ -3021,7 +3022,8 @@ vect_finalize_reduction:
epilog_stmt = adjustment_def ? epilog_stmt : new_phi;
STMT_VINFO_VEC_STMT (stmt_vinfo) = epilog_stmt;
set_vinfo_for_stmt (epilog_stmt,
new_stmt_vec_info (epilog_stmt, loop_vinfo));
new_stmt_vec_info (epilog_stmt, loop_vinfo,
NULL));
if (adjustment_def)
STMT_VINFO_RELATED_STMT (vinfo_for_stmt (epilog_stmt)) =
STMT_VINFO_RELATED_STMT (vinfo_for_stmt (new_phi));
......@@ -3204,7 +3206,7 @@ vectorizable_reduction (gimple stmt, gimple_stmt_iterator *gsi,
The last use is the reduction variable. */
for (i = 0; i < op_type-1; i++)
{
is_simple_use = vect_is_simple_use (ops[i], loop_vinfo, &def_stmt,
is_simple_use = vect_is_simple_use (ops[i], loop_vinfo, NULL, &def_stmt,
&def, &dt);
gcc_assert (is_simple_use);
if (dt != vect_internal_def
......@@ -3214,8 +3216,8 @@ vectorizable_reduction (gimple stmt, gimple_stmt_iterator *gsi,
return false;
}
is_simple_use = vect_is_simple_use (ops[i], loop_vinfo, &def_stmt, &def,
&dt);
is_simple_use = vect_is_simple_use (ops[i], loop_vinfo, NULL, &def_stmt,
&def, &dt);
gcc_assert (is_simple_use);
gcc_assert (dt == vect_reduction_def);
gcc_assert (gimple_code (def_stmt) == GIMPLE_PHI);
......@@ -3394,7 +3396,8 @@ vectorizable_reduction (gimple stmt, gimple_stmt_iterator *gsi,
{
/* Create the reduction-phi that defines the reduction-operand. */
new_phi = create_phi_node (vec_dest, loop->header);
set_vinfo_for_stmt (new_phi, new_stmt_vec_info (new_phi, loop_vinfo));
set_vinfo_for_stmt (new_phi, new_stmt_vec_info (new_phi, loop_vinfo,
NULL));
}
/* Handle uses. */
......@@ -3592,7 +3595,8 @@ vectorizable_live_operation (gimple stmt,
op = TREE_OPERAND (gimple_op (stmt, 1), i);
else
op = gimple_op (stmt, i + 1);
if (op && !vect_is_simple_use (op, loop_vinfo, &def_stmt, &def, &dt))
if (op
&& !vect_is_simple_use (op, loop_vinfo, NULL, &def_stmt, &def, &dt))
{
if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "use not simple.");
......@@ -3766,7 +3770,7 @@ vect_transform_loop (loop_vec_info loop_vinfo)
if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "=== scheduling SLP instances ===");
vect_schedule_slp (loop_vinfo);
vect_schedule_slp (loop_vinfo, NULL);
}
/* Hybrid SLP stmts must be vectorized in addition to SLP. */
......
......@@ -78,7 +78,7 @@ widened_name_p (tree name, gimple use_stmt, tree *half_type, gimple *def_stmt)
stmt_vinfo = vinfo_for_stmt (use_stmt);
loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_vinfo);
if (!vect_is_simple_use (name, loop_vinfo, def_stmt, &def, &dt))
if (!vect_is_simple_use (name, loop_vinfo, NULL, def_stmt, &def, &dt))
return false;
if (dt != vect_internal_def
......@@ -102,7 +102,8 @@ widened_name_p (tree name, gimple use_stmt, tree *half_type, gimple *def_stmt)
|| (TYPE_PRECISION (type) < (TYPE_PRECISION (*half_type) * 2)))
return false;
if (!vect_is_simple_use (oprnd0, loop_vinfo, &dummy_gimple, &dummy, &dt))
if (!vect_is_simple_use (oprnd0, loop_vinfo, NULL, &dummy_gimple, &dummy,
&dt))
return false;
return true;
......@@ -734,7 +735,7 @@ vect_pattern_recog_1 (
/* Mark the stmts that are involved in the pattern. */
gsi_insert_before (&si, pattern_stmt, GSI_SAME_STMT);
set_vinfo_for_stmt (pattern_stmt,
new_stmt_vec_info (pattern_stmt, loop_vinfo));
new_stmt_vec_info (pattern_stmt, loop_vinfo, NULL));
pattern_stmt_info = vinfo_for_stmt (pattern_stmt);
STMT_VINFO_RELATED_STMT (pattern_stmt_info) = stmt;
......
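Note on the call-site changes above: every loop-vectorizer caller now passes NULL for the new bb_vec_info argument, which suggests that the shared helpers accept either a loop context or a basic-block context. A minimal sketch of that calling convention follows. The struct and function names are hypothetical, and the "exactly one context is non-NULL" rule is an assumption of this sketch rather than something these hunks state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for loop_vec_info and bb_vec_info.  */
struct loop_ctx { int n_datarefs; };
struct bb_ctx   { int n_datarefs; };

/* Shared helper taking both contexts; callers pass NULL for the one
   that does not apply, as the loop callers in the diff do.  */
int
count_datarefs (const struct loop_ctx *loop_ctx, const struct bb_ctx *bb_ctx)
{
  /* Assumption of this sketch: exactly one context is supplied.  */
  assert ((loop_ctx != NULL) != (bb_ctx != NULL));

  if (loop_ctx)
    return loop_ctx->n_datarefs;   /* loop vectorization path */
  return bb_ctx->n_datarefs;       /* basic-block SLP path */
}
```

A loop caller would write count_datarefs (&loop, NULL), and a basic-block caller count_datarefs (NULL, &bb).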
......@@ -68,6 +68,7 @@ along with GCC; see the file COPYING3. If not see
#include "cfglayout.h"
#include "tree-vectorizer.h"
#include "tree-pass.h"
#include "timevar.h"
/* vect_dump will be set to stderr or dump_file if exist. */
FILE *vect_dump;
......@@ -75,8 +76,9 @@ FILE *vect_dump;
/* vect_verbosity_level set to an invalid value
to mark that it's uninitialized. */
static enum verbosity_levels vect_verbosity_level = MAX_VERBOSITY_LEVEL;
static enum verbosity_levels user_vect_verbosity_level = MAX_VERBOSITY_LEVEL;
/* Loop location. */
/* Loop or bb location. */
LOC vect_location;
/* Bitmap of virtual variables to be renamed. */
......@@ -99,9 +101,10 @@ vect_set_verbosity_level (const char *val)
vl = atoi (val);
if (vl < MAX_VERBOSITY_LEVEL)
vect_verbosity_level = (enum verbosity_levels) vl;
user_vect_verbosity_level = (enum verbosity_levels) vl;
else
vect_verbosity_level = (enum verbosity_levels) (MAX_VERBOSITY_LEVEL - 1);
user_vect_verbosity_level
= (enum verbosity_levels) (MAX_VERBOSITY_LEVEL - 1);
}
......@@ -115,17 +118,33 @@ vect_set_verbosity_level (const char *val)
print to stderr, otherwise print to the dump file. */
static void
vect_set_dump_settings (void)
vect_set_dump_settings (bool slp)
{
vect_dump = dump_file;
/* Check if the verbosity level was defined by the user: */
if (vect_verbosity_level != MAX_VERBOSITY_LEVEL)
if (user_vect_verbosity_level != MAX_VERBOSITY_LEVEL)
{
/* If there is no dump file, print to stderr. */
if (!dump_file)
vect_dump = stderr;
return;
vect_verbosity_level = user_vect_verbosity_level;
/* Ignore user defined verbosity if dump flags require higher level of
verbosity. */
if (dump_file)
{
if (((dump_flags & TDF_DETAILS)
&& vect_verbosity_level >= REPORT_DETAILS)
|| ((dump_flags & TDF_STATS)
&& vect_verbosity_level >= REPORT_UNVECTORIZED_LOCATIONS))
return;
}
else
{
/* If there is no dump file, print to stderr in case of loop
vectorization. */
if (!slp)
vect_dump = stderr;
return;
}
}
/* User didn't specify verbosity level: */
......@@ -185,7 +204,7 @@ vectorize_loops (void)
return 0;
/* Fix the verbosity level if not defined explicitly by the user. */
vect_set_dump_settings ();
vect_set_dump_settings (false);
/* Allocate the bitmap that records which virtual variables
need to be renamed. */
......@@ -245,6 +264,68 @@ vectorize_loops (void)
}
/* Entry point to basic block SLP phase. */
static unsigned int
execute_vect_slp (void)
{
basic_block bb;
/* Fix the verbosity level if not defined explicitly by the user. */
vect_set_dump_settings (true);
init_stmt_vec_info_vec ();
FOR_EACH_BB (bb)
{
vect_location = find_bb_location (bb);
if (vect_slp_analyze_bb (bb))
{
vect_slp_transform_bb (bb);
if (vect_print_dump_info (REPORT_VECTORIZED_LOCATIONS))
fprintf (vect_dump, "basic block vectorized using SLP\n");
}
}
free_stmt_vec_info_vec ();
return 0;
}
static bool
gate_vect_slp (void)
{
/* Apply SLP either if the vectorizer is on and the user didn't specify
whether to run SLP or not, or if the SLP flag was set by the user. */
return ((flag_tree_vectorize != 0 && flag_tree_slp_vectorize != 0)
|| flag_tree_slp_vectorize == 1);
}
struct gimple_opt_pass pass_slp_vectorize =
{
{
GIMPLE_PASS,
"slp", /* name */
gate_vect_slp, /* gate */
execute_vect_slp, /* execute */
NULL, /* sub */
NULL, /* next */
0, /* static_pass_number */
TV_TREE_SLP_VECTORIZATION, /* tv_id */
PROP_ssa | PROP_cfg, /* properties_required */
0, /* properties_provided */
0, /* properties_destroyed */
0, /* todo_flags_start */
TODO_ggc_collect
| TODO_verify_ssa
| TODO_dump_func
| TODO_update_ssa
| TODO_verify_stmts /* todo_flags_finish */
}
};
/* Increase alignment of global arrays to improve vectorization potential.
TODO:
- Consider also structs that have an array field.
......
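Note on gate_vect_slp above: the gate treats the SLP flag as a three-way setting (explicitly off, explicitly on, not specified). The sketch below reproduces the same boolean expression outside the compiler so the resulting truth table can be checked. The names slp_gate, check_gate_table and the FLAG_* constants are hypothetical, and the value 2 for "not specified" is an assumption, since the hunk only compares against 0 and 1.

```c
#include <assert.h>
#include <stdbool.h>

/* FLAG_UNSET is an assumed encoding for "the user gave no SLP option".  */
enum { FLAG_OFF = 0, FLAG_ON = 1, FLAG_UNSET = 2 };

/* Same expression as gate_vect_slp, with the two flags as parameters.  */
bool
slp_gate (int vectorize, int slp)
{
  return (vectorize != 0 && slp != 0) || slp == 1;
}

/* Walks the cases described in the gate's comment.  */
void
check_gate_table (void)
{
  assert (slp_gate (FLAG_ON, FLAG_UNSET));    /* vectorizer on, SLP unspecified */
  assert (!slp_gate (FLAG_OFF, FLAG_UNSET));  /* vectorizer off, SLP unspecified */
  assert (slp_gate (FLAG_OFF, FLAG_ON));      /* SLP requested explicitly */
  assert (slp_gate (FLAG_ON, FLAG_ON));
  assert (!slp_gate (FLAG_ON, FLAG_OFF));     /* SLP disabled explicitly */
  assert (!slp_gate (FLAG_OFF, FLAG_OFF));
}
```

Under these assumptions, an explicit SLP setting always wins, and the vectorizer flag only matters when the SLP flag was left unspecified.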
......@@ -281,6 +281,36 @@ nested_in_vect_loop_p (struct loop *loop, gimple stmt)
&& (loop->inner == (gimple_bb (stmt))->loop_father));
}
typedef struct _bb_vec_info {
basic_block bb;
/* All interleaving chains of stores in the basic block, represented by the
first stmt in the chain. */
VEC(gimple, heap) *strided_stores;
/* All SLP instances in the basic block. This is a subset of the set of
STRIDED_STORES of the basic block. */
VEC(slp_instance, heap) *slp_instances;
/* All data references in the basic block. */
VEC (data_reference_p, heap) *datarefs;
/* All data dependences in the basic block. */
VEC (ddr_p, heap) *ddrs;
} *bb_vec_info;
#define BB_VINFO_BB(B) (B)->bb
#define BB_VINFO_STRIDED_STORES(B) (B)->strided_stores
#define BB_VINFO_SLP_INSTANCES(B) (B)->slp_instances
#define BB_VINFO_DATAREFS(B) (B)->datarefs
#define BB_VINFO_DDRS(B) (B)->ddrs
static inline bb_vec_info
vec_info_for_bb (basic_block bb)
{
return (bb_vec_info) bb->aux;
}
/*-----------------------------------------------------------------*/
/* Info on vectorized defs. */
/*-----------------------------------------------------------------*/
......@@ -437,12 +467,16 @@ typedef struct _stmt_vec_info {
/* Whether the stmt is SLPed, loop-based vectorized, or both. */
enum slp_vect_type slp_type;
/* The bb_vec_info with respect to which STMT is vectorized. */
bb_vec_info bb_vinfo;
} *stmt_vec_info;
/* Access Functions. */
#define STMT_VINFO_TYPE(S) (S)->type
#define STMT_VINFO_STMT(S) (S)->stmt
#define STMT_VINFO_LOOP_VINFO(S) (S)->loop_vinfo
#define STMT_VINFO_BB_VINFO(S) (S)->bb_vinfo
#define STMT_VINFO_RELEVANT(S) (S)->relevant
#define STMT_VINFO_LIVE_P(S) (S)->live
#define STMT_VINFO_VECTYPE(S) (S)->vectype
......@@ -707,15 +741,15 @@ extern void slpeel_make_loop_iterate_ntimes (struct loop *, tree);
extern bool slpeel_can_duplicate_loop_p (const struct loop *, const_edge);
extern void vect_loop_versioning (loop_vec_info, bool, tree *, gimple_seq *);
extern void vect_do_peeling_for_loop_bound (loop_vec_info, tree *,
tree, gimple_seq);
tree, gimple_seq);
extern void vect_do_peeling_for_alignment (loop_vec_info);
extern LOC find_loop_location (struct loop *);
extern bool vect_can_advance_ivs_p (loop_vec_info);
/* In tree-vect-stmts.c. */
extern tree get_vectype_for_scalar_type (tree);
extern bool vect_is_simple_use (tree, loop_vec_info, gimple *, tree *,
enum vect_def_type *);
extern bool vect_is_simple_use (tree, loop_vec_info, bb_vec_info, gimple *,
tree *, enum vect_def_type *);
extern bool supportable_widening_operation (enum tree_code, gimple, tree,
tree *, tree *, enum tree_code *,
enum tree_code *, int *,
......@@ -723,7 +757,8 @@ extern bool supportable_widening_operation (enum tree_code, gimple, tree,
extern bool supportable_narrowing_operation (enum tree_code, const_gimple,
tree, enum tree_code *, int *,
VEC (tree, heap) **);
extern stmt_vec_info new_stmt_vec_info (gimple stmt, loop_vec_info);
extern stmt_vec_info new_stmt_vec_info (gimple stmt, loop_vec_info,
bb_vec_info);
extern void free_stmt_vec_info (gimple stmt);
extern tree vectorizable_function (gimple, tree, tree);
extern void vect_model_simple_cost (stmt_vec_info, int, enum vect_def_type *,
......@@ -742,7 +777,7 @@ extern tree vect_get_vec_def_for_stmt_copy (enum vect_def_type, tree);
extern bool vect_transform_stmt (gimple, gimple_stmt_iterator *,
bool *, slp_tree, slp_instance);
extern void vect_remove_stores (gimple);
extern bool vect_analyze_stmt (gimple, bool *);
extern bool vect_analyze_stmt (gimple, bool *, slp_tree);
/* In tree-vect-data-refs.c. */
extern bool vect_can_force_dr_alignment_p (const_tree, unsigned int);
......@@ -750,14 +785,15 @@ extern enum dr_alignment_support vect_supportable_dr_alignment
(struct data_reference *);
extern tree vect_get_smallest_scalar_type (gimple, HOST_WIDE_INT *,
HOST_WIDE_INT *);
extern bool vect_analyze_data_ref_dependences (loop_vec_info);
extern bool vect_analyze_data_ref_dependences (loop_vec_info, bb_vec_info);
extern bool vect_enhance_data_refs_alignment (loop_vec_info);
extern bool vect_analyze_data_refs_alignment (loop_vec_info);
extern bool vect_analyze_data_ref_accesses (loop_vec_info);
extern bool vect_analyze_data_refs_alignment (loop_vec_info, bb_vec_info);
extern bool vect_verify_datarefs_alignment (loop_vec_info, bb_vec_info);
extern bool vect_analyze_data_ref_accesses (loop_vec_info, bb_vec_info);
extern bool vect_prune_runtime_alias_test_list (loop_vec_info);
extern bool vect_analyze_data_refs (loop_vec_info);
extern bool vect_analyze_data_refs (loop_vec_info, bb_vec_info);
extern tree vect_create_data_ref_ptr (gimple, struct loop *, tree, tree *,
gimple *, bool, bool *);
gimple *, bool, bool *);
extern tree bump_vector_ptr (tree, gimple, gimple_stmt_iterator *, gimple, tree);
extern tree vect_create_destination_var (tree, tree);
extern bool vect_strided_store_supported (tree);
......@@ -799,13 +835,16 @@ extern void vect_free_slp_instance (slp_instance);
extern bool vect_transform_slp_perm_load (gimple, VEC (tree, heap) *,
gimple_stmt_iterator *, int,
slp_instance, bool);
extern bool vect_schedule_slp (loop_vec_info);
extern bool vect_schedule_slp (loop_vec_info, bb_vec_info);
extern void vect_update_slp_costs_according_to_vf (loop_vec_info);
extern bool vect_analyze_slp (loop_vec_info);
extern bool vect_analyze_slp (loop_vec_info, bb_vec_info);
extern void vect_make_slp_decision (loop_vec_info);
extern void vect_detect_hybrid_slp (loop_vec_info);
extern void vect_get_slp_defs (slp_tree, VEC (tree,heap) **,
VEC (tree,heap) **);
extern LOC find_bb_location (basic_block);
extern bb_vec_info vect_slp_analyze_bb (basic_block);
extern void vect_slp_transform_bb (basic_block);
/* In tree-vect-patterns.c. */
/* Pattern recognition functions.
......