Commit c2d7ab2a by Richard Sandiford

md.texi (vec_load_lanes, [...]): Document.

gcc/
	* doc/md.texi (vec_load_lanes, vec_store_lanes): Document.
	* optabs.h (COI_vec_load_lanes, COI_vec_store_lanes): New
	convert_optab_index values.
	(vec_load_lanes_optab, vec_store_lanes_optab): New convert optabs.
	* genopinit.c (optabs): Initialize the new optabs.
	* internal-fn.def (LOAD_LANES, STORE_LANES): New internal functions.
	* internal-fn.c (get_multi_vector_move, expand_LOAD_LANES)
	(expand_STORE_LANES): New functions.
	* tree.h (build_array_type_nelts): Declare.
	* tree.c (build_array_type_nelts): New function.
	* tree-vectorizer.h (vect_model_store_cost): Add a bool argument.
	(vect_model_load_cost): Likewise.
	(vect_store_lanes_supported, vect_load_lanes_supported)
	(vect_record_strided_load_vectors): Declare.
	* tree-vect-data-refs.c (vect_lanes_optab_supported_p)
	(vect_store_lanes_supported, vect_load_lanes_supported): New functions.
	(vect_transform_strided_load): Split out statement recording into...
	(vect_record_strided_load_vectors): ...this new function.
	* tree-vect-stmts.c (create_vector_array, read_vector_array)
	(write_vector_array, create_array_ref): New functions.
	(vect_model_store_cost): Add store_lanes_p argument.
	(vect_model_load_cost): Add load_lanes_p argument.
	(vectorizable_store): Try to use store-lanes functions for
	interleaved stores.
	(vectorizable_load): Likewise load-lanes and loads.
	* tree-vect-slp.c (vect_get_and_check_slp_defs)
	(vect_build_slp_tree): Update calls to vect_model_store_cost
	and vect_model_load_cost.

From-SVN: r172760
parent 1da0876c
2011-04-20 Richard Sandiford <richard.sandiford@linaro.org>
* tree-vect-stmts.c (vectorizable_store): Only chain one related
statement per copy.
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -3846,6 +3846,48 @@ into consecutive memory locations. Operand 0 is the first of the
consecutive memory locations, operand 1 is the first register, and
operand 2 is a constant: the number of consecutive registers.
@cindex @code{vec_load_lanes@var{m}@var{n}} instruction pattern
@item @samp{vec_load_lanes@var{m}@var{n}}
Perform an interleaved load of several vectors from memory operand 1
into register operand 0. Both operands have mode @var{m}. The register
operand is viewed as holding consecutive vectors of mode @var{n},
while the memory operand is a flat array that contains the same number
of elements. The operation is equivalent to:
@smallexample
int c = GET_MODE_SIZE (@var{m}) / GET_MODE_SIZE (@var{n});
for (j = 0; j < GET_MODE_NUNITS (@var{n}); j++)
for (i = 0; i < c; i++)
operand0[i][j] = operand1[j * c + i];
@end smallexample
For example, @samp{vec_load_lanestiv4hi} loads 8 16-bit values
from memory into a register of mode @samp{TI}@. The register
contains two consecutive vectors of mode @samp{V4HI}@.
This pattern can only be used if:
@smallexample
TARGET_ARRAY_MODE_SUPPORTED_P (@var{n}, @var{c})
@end smallexample
is true. GCC assumes that, if a target supports this kind of
instruction for some mode @var{n}, it also supports unaligned
loads for vectors of mode @var{n}.
@cindex @code{vec_store_lanes@var{m}@var{n}} instruction pattern
@item @samp{vec_store_lanes@var{m}@var{n}}
Equivalent to @samp{vec_load_lanes@var{m}@var{n}}, with the memory
and register operands reversed. That is, the instruction is
equivalent to:
@smallexample
int c = GET_MODE_SIZE (@var{m}) / GET_MODE_SIZE (@var{n});
for (j = 0; j < GET_MODE_NUNITS (@var{n}); j++)
for (i = 0; i < c; i++)
operand0[j * c + i] = operand1[i][j];
@end smallexample
for a memory operand 0 and register operand 1.
@cindex @code{vec_set@var{m}} instruction pattern
@item @samp{vec_set@var{m}}
Set given field in the vector value. Operand 0 is the vector to modify,
--- a/gcc/genopinit.c
+++ b/gcc/genopinit.c
@@ -74,6 +74,8 @@ static const char * const optabs[] =
"set_convert_optab_handler (fractuns_optab, $B, $A, CODE_FOR_$(fractuns$Q$a$I$b2$))",
"set_convert_optab_handler (satfract_optab, $B, $A, CODE_FOR_$(satfract$a$Q$b2$))",
"set_convert_optab_handler (satfractuns_optab, $B, $A, CODE_FOR_$(satfractuns$I$a$Q$b2$))",
"set_convert_optab_handler (vec_load_lanes_optab, $A, $B, CODE_FOR_$(vec_load_lanes$a$b$))",
"set_convert_optab_handler (vec_store_lanes_optab, $A, $B, CODE_FOR_$(vec_store_lanes$a$b$))",
"set_optab_handler (add_optab, $A, CODE_FOR_$(add$P$a3$))",
"set_optab_handler (addv_optab, $A, CODE_FOR_$(add$F$a3$)),\n\
set_optab_handler (add_optab, $A, CODE_FOR_$(add$F$a3$))",
--- a/gcc/internal-fn.c
+++ b/gcc/internal-fn.c
@@ -42,6 +42,73 @@ const int internal_fn_flags_array[] = {
0
};
/* ARRAY_TYPE is an array of vector modes. Return the associated insn
for load-lanes-style optab OPTAB. The insn must exist. */
static enum insn_code
get_multi_vector_move (tree array_type, convert_optab optab)
{
enum insn_code icode;
enum machine_mode imode;
enum machine_mode vmode;
gcc_assert (TREE_CODE (array_type) == ARRAY_TYPE);
imode = TYPE_MODE (array_type);
vmode = TYPE_MODE (TREE_TYPE (array_type));
icode = convert_optab_handler (optab, imode, vmode);
gcc_assert (icode != CODE_FOR_nothing);
return icode;
}
/* Expand LOAD_LANES call STMT. */
static void
expand_LOAD_LANES (gimple stmt)
{
struct expand_operand ops[2];
tree type, lhs, rhs;
rtx target, mem;
lhs = gimple_call_lhs (stmt);
rhs = gimple_call_arg (stmt, 0);
type = TREE_TYPE (lhs);
target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
mem = expand_normal (rhs);
gcc_assert (MEM_P (mem));
PUT_MODE (mem, TYPE_MODE (type));
create_output_operand (&ops[0], target, TYPE_MODE (type));
create_fixed_operand (&ops[1], mem);
expand_insn (get_multi_vector_move (type, vec_load_lanes_optab), 2, ops);
}
/* Expand STORE_LANES call STMT. */
static void
expand_STORE_LANES (gimple stmt)
{
struct expand_operand ops[2];
tree type, lhs, rhs;
rtx target, reg;
lhs = gimple_call_lhs (stmt);
rhs = gimple_call_arg (stmt, 0);
type = TREE_TYPE (rhs);
target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
reg = expand_normal (rhs);
gcc_assert (MEM_P (target));
PUT_MODE (target, TYPE_MODE (type));
create_fixed_operand (&ops[0], target);
create_input_operand (&ops[1], reg, TYPE_MODE (type));
expand_insn (get_multi_vector_move (type, vec_store_lanes_optab), 2, ops);
}
/* Routines to expand each internal function, indexed by function number.
Each routine has the prototype:
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -37,3 +37,6 @@ along with GCC; see the file COPYING3. If not see
void expand_NAME (gimple stmt)
where STMT is the statement that performs the call. */
DEF_INTERNAL_FN (LOAD_LANES, ECF_CONST | ECF_LEAF)
DEF_INTERNAL_FN (STORE_LANES, ECF_CONST | ECF_LEAF)
--- a/gcc/optabs.h
+++ b/gcc/optabs.h
@@ -578,6 +578,9 @@ enum convert_optab_index
COI_satfract,
COI_satfractuns,
COI_vec_load_lanes,
COI_vec_store_lanes,
COI_MAX
};
@@ -598,6 +601,8 @@ enum convert_optab_index
#define fractuns_optab (&convert_optab_table[COI_fractuns])
#define satfract_optab (&convert_optab_table[COI_satfract])
#define satfractuns_optab (&convert_optab_table[COI_satfractuns])
#define vec_load_lanes_optab (&convert_optab_table[COI_vec_load_lanes])
#define vec_store_lanes_optab (&convert_optab_table[COI_vec_store_lanes])
/* Contains the optab used for each rtx code. */
extern optab code_to_optab[NUM_RTX_CODE + 1];
--- a/gcc/tree-vect-data-refs.c
+++ b/gcc/tree-vect-data-refs.c
@@ -43,6 +43,45 @@ along with GCC; see the file COPYING3. If not see
#include "expr.h"
#include "optabs.h"
/* Return true if load- or store-lanes optab OPTAB is implemented for
COUNT vectors of type VECTYPE. NAME is the name of OPTAB. */
static bool
vect_lanes_optab_supported_p (const char *name, convert_optab optab,
tree vectype, unsigned HOST_WIDE_INT count)
{
enum machine_mode mode, array_mode;
bool limit_p;
mode = TYPE_MODE (vectype);
limit_p = !targetm.array_mode_supported_p (mode, count);
array_mode = mode_for_size (count * GET_MODE_BITSIZE (mode),
MODE_INT, limit_p);
if (array_mode == BLKmode)
{
if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "no array mode for %s[" HOST_WIDE_INT_PRINT_DEC "]",
GET_MODE_NAME (mode), count);
return false;
}
if (convert_optab_handler (optab, array_mode, mode) == CODE_FOR_nothing)
{
if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "cannot use %s<%s><%s>",
name, GET_MODE_NAME (array_mode), GET_MODE_NAME (mode));
return false;
}
if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "can use %s<%s><%s>",
name, GET_MODE_NAME (array_mode), GET_MODE_NAME (mode));
return true;
}
/* Return the smallest scalar part of STMT.
This is used to determine the vectype of the stmt. We generally set the
vectype according to the type of the result (lhs). For stmts whose
@@ -3376,6 +3415,18 @@ vect_strided_store_supported (tree vectype, unsigned HOST_WIDE_INT count)
}
/* Return TRUE if vec_store_lanes is available for COUNT vectors of
type VECTYPE. */
bool
vect_store_lanes_supported (tree vectype, unsigned HOST_WIDE_INT count)
{
return vect_lanes_optab_supported_p ("vec_store_lanes",
vec_store_lanes_optab,
vectype, count);
}
/* Function vect_permute_store_chain.
Given a chain of interleaved stores in DR_CHAIN of LENGTH that must be
@@ -3830,6 +3881,16 @@ vect_strided_load_supported (tree vectype, unsigned HOST_WIDE_INT count)
return true;
}
/* Return TRUE if vec_load_lanes is available for COUNT vectors of
type VECTYPE. */
bool
vect_load_lanes_supported (tree vectype, unsigned HOST_WIDE_INT count)
{
return vect_lanes_optab_supported_p ("vec_load_lanes",
vec_load_lanes_optab,
vectype, count);
}
/* Function vect_permute_load_chain.
@@ -3977,19 +4038,28 @@ void
vect_transform_strided_load (gimple stmt, VEC(tree,heap) *dr_chain, int size,
gimple_stmt_iterator *gsi)
{
stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
gimple first_stmt = DR_GROUP_FIRST_DR (stmt_info);
gimple next_stmt, new_stmt;
VEC(tree,heap) *result_chain = NULL;
unsigned int i, gap_count;
tree tmp_data_ref;
/* DR_CHAIN contains input data-refs that are a part of the interleaving.
RESULT_CHAIN is the output of vect_permute_load_chain, it contains permuted
vectors, that are ready for vector computation. */
result_chain = VEC_alloc (tree, heap, size);
/* Permute. */
vect_permute_load_chain (dr_chain, size, stmt, gsi, &result_chain);
vect_record_strided_load_vectors (stmt, result_chain);
VEC_free (tree, heap, result_chain);
}
/* RESULT_CHAIN contains the output of a group of strided loads that were
generated as part of the vectorization of STMT. Assign the statement
for each vector to the associated scalar statement. */
void
vect_record_strided_load_vectors (gimple stmt, VEC(tree,heap) *result_chain)
{
gimple first_stmt = DR_GROUP_FIRST_DR (vinfo_for_stmt (stmt));
gimple next_stmt, new_stmt;
unsigned int i, gap_count;
tree tmp_data_ref;
/* Put a permuted data-ref in the VECTORIZED_STMT field.
Since we scan the chain starting from its first node, their order
@@ -4051,8 +4121,6 @@ vect_transform_strided_load (gimple stmt, VEC(tree,heap) *dr_chain, int size,
break;
}
}
VEC_free (tree, heap, result_chain);
}
/* Function vect_force_dr_alignment_p.
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -215,7 +215,8 @@ vect_get_and_check_slp_defs (loop_vec_info loop_vinfo, bb_vec_info bb_vinfo,
vect_model_simple_cost (stmt_info, ncopies_for_cost, dt, slp_node);
else
/* Store. */
vect_model_store_cost (stmt_info, ncopies_for_cost, dt[0], slp_node);
vect_model_store_cost (stmt_info, ncopies_for_cost, false,
dt[0], slp_node);
}
else
@@ -579,7 +580,7 @@ vect_build_slp_tree (loop_vec_info loop_vinfo, bb_vec_info bb_vinfo,
/* Analyze costs (for the first stmt in the group). */
vect_model_load_cost (vinfo_for_stmt (stmt),
ncopies_for_cost, *node);
ncopies_for_cost, false, *node);
}
/* Store the place of this load in the interleaving chain. In
--- a/gcc/tree-vectorizer.h
+++ b/gcc/tree-vectorizer.h
@@ -788,9 +788,9 @@ extern void free_stmt_vec_info (gimple stmt);
extern tree vectorizable_function (gimple, tree, tree);
extern void vect_model_simple_cost (stmt_vec_info, int, enum vect_def_type *,
slp_tree);
extern void vect_model_store_cost (stmt_vec_info, int, enum vect_def_type,
slp_tree);
extern void vect_model_load_cost (stmt_vec_info, int, slp_tree);
extern void vect_model_store_cost (stmt_vec_info, int, bool,
enum vect_def_type, slp_tree);
extern void vect_model_load_cost (stmt_vec_info, int, bool, slp_tree);
extern void vect_finish_stmt_generation (gimple, gimple,
gimple_stmt_iterator *);
extern bool vect_mark_stmts_to_be_vectorized (loop_vec_info);
@@ -829,7 +829,9 @@ extern tree vect_create_data_ref_ptr (gimple, tree, struct loop *, tree,
extern tree bump_vector_ptr (tree, gimple, gimple_stmt_iterator *, gimple, tree);
extern tree vect_create_destination_var (tree, tree);
extern bool vect_strided_store_supported (tree, unsigned HOST_WIDE_INT);
extern bool vect_store_lanes_supported (tree, unsigned HOST_WIDE_INT);
extern bool vect_strided_load_supported (tree, unsigned HOST_WIDE_INT);
extern bool vect_load_lanes_supported (tree, unsigned HOST_WIDE_INT);
extern void vect_permute_store_chain (VEC(tree,heap) *,unsigned int, gimple,
gimple_stmt_iterator *, VEC(tree,heap) **);
extern tree vect_setup_realignment (gimple, gimple_stmt_iterator *, tree *,
@@ -837,6 +839,7 @@ extern tree vect_setup_realignment (gimple, gimple_stmt_iterator *, tree *,
struct loop **);
extern void vect_transform_strided_load (gimple, VEC(tree,heap) *, int,
gimple_stmt_iterator *);
extern void vect_record_strided_load_vectors (gimple, VEC(tree,heap) *);
extern int vect_get_place_in_interleaving_chain (gimple, gimple);
extern tree vect_get_new_vect_var (tree, enum vect_var_kind, const char *);
extern tree vect_create_addr_base_for_vector_ref (gimple, gimple_seq *,
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -7340,6 +7340,15 @@ build_nonshared_array_type (tree elt_type, tree index_type)
return build_array_type_1 (elt_type, index_type, false);
}
/* Return a representation of ELT_TYPE[NELTS], using indices of type
sizetype. */
tree
build_array_type_nelts (tree elt_type, unsigned HOST_WIDE_INT nelts)
{
return build_array_type (elt_type, build_index_type (size_int (nelts - 1)));
}
/* Recursively examines the array elements of TYPE, until a non-array
element type is found. */
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -4247,6 +4247,7 @@ extern tree build_type_no_quals (tree);
extern tree build_index_type (tree);
extern tree build_array_type (tree, tree);
extern tree build_nonshared_array_type (tree, tree);
extern tree build_array_type_nelts (tree, unsigned HOST_WIDE_INT);
extern tree build_function_type (tree, tree);
extern tree build_function_type_list (tree, ...);
extern tree build_function_type_skip_args (tree, bitmap);