Commit 85a7c926 authored by Bill Schmidt, committed by William Schmidt

altivec.md (altivec_lvx_<mode>): Remove.

2016-04-27  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	* config/rs6000/altivec.md (altivec_lvx_<mode>): Remove.
	(altivec_lvx_<mode>_internal): Document.
	(altivec_lvx_<mode>_2op): New define_insn.
	(altivec_lvx_<mode>_1op): Likewise.
	(altivec_lvx_<mode>_2op_si): Likewise.
	(altivec_lvx_<mode>_1op_si): Likewise.
	(altivec_stvx_<mode>): Remove.
	(altivec_stvx_<mode>_internal): Document.
	(altivec_stvx_<mode>_2op): New define_insn.
	(altivec_stvx_<mode>_1op): Likewise.
	(altivec_stvx_<mode>_2op_si): Likewise.
	(altivec_stvx_<mode>_1op_si): Likewise.
	* config/rs6000/rs6000-c.c (altivec_resolve_overloaded_builtin):
	Expand vec_ld and vec_st during parsing.
	* config/rs6000/rs6000.c (altivec_expand_lvx_be): Commentary
	changes.
	(altivec_expand_stvx_be): Likewise.
	(altivec_expand_lv_builtin): Expand lvx built-ins to expose the
	address-masking behavior in RTL.
	(altivec_expand_stv_builtin): Expand stvx built-ins to expose the
	address-masking behavior in RTL.
	(altivec_expand_builtin): Change builtin code arguments for calls
	to altivec_expand_stv_builtin and altivec_expand_lv_builtin.
	(insn_is_swappable_p): Avoid incorrect swap optimization in the
	presence of lvx/stvx patterns.
	(alignment_with_canonical_addr): New function.
	(alignment_mask): Likewise.
	(find_alignment_op): Likewise.
	(recombine_lvx_pattern): Likewise.
	(recombine_stvx_pattern): Likewise.
	(recombine_lvx_stvx_patterns): Likewise.
	(rs6000_analyze_swaps): Perform a pre-pass to recognize lvx and
	stvx patterns from expand.
	* config/rs6000/vector.md (vector_altivec_load_<mode>): Use new
	expansions.
	(vector_altivec_store_<mode>): Likewise.

From-SVN: r235533
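Background for this commit: the rs6000-c.c change rewrites vec_ld and vec_st with a constant displacement into ordinary masked-address vector accesses while the source is still being parsed, so later passes see plain loads and stores instead of UNSPECs. A rough, hand-written C sketch of the intended source-level equivalence follows; the uintptr_t casts and function names are illustrative only (the real expansion works on trees via POINTER_PLUS_EXPR and BIT_AND_EXPR), and it requires -maltivec on a PowerPC target.

#include <altivec.h>
#include <stdint.h>

/* Sketch only: vec_ld (off, p) behaves like a load through the 16-byte
   aligned address ((uintptr_t) p + off) & -16, and vec_st (v, off, p)
   like a store through the same masked address.  The new path applies
   only when the offset folds to an integer constant.  */

vector int
load_equiv (const vector int *p, int off)
{
  return *(const vector int *) (((uintptr_t) p + off) & -16);
}

void
store_equiv (vector int v, vector int *p, int off)
{
  *(vector int *) (((uintptr_t) p + off) & -16) = v;
}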
gcc/config/rs6000/altivec.md
@@ -2514,20 +2514,9 @@
   "lvxl %0,%y1"
   [(set_attr "type" "vecload")])

-(define_expand "altivec_lvx_<mode>"
-  [(parallel
-    [(set (match_operand:VM2 0 "register_operand" "=v")
-          (match_operand:VM2 1 "memory_operand" "Z"))
-     (unspec [(const_int 0)] UNSPEC_LVX)])]
-  "TARGET_ALTIVEC"
-{
-  if (!BYTES_BIG_ENDIAN && VECTOR_ELT_ORDER_BIG)
-    {
-      altivec_expand_lvx_be (operands[0], operands[1], <MODE>mode, UNSPEC_LVX);
-      DONE;
-    }
-})
-
+; This version of lvx is used only in cases where we need to force an lvx
+; over any other load, and we don't care about losing CSE opportunities.
+; Its primary use is for prologue register saves.
 (define_insn "altivec_lvx_<mode>_internal"
   [(parallel
     [(set (match_operand:VM2 0 "register_operand" "=v")
@@ -2537,20 +2526,45 @@
   "lvx %0,%y1"
   [(set_attr "type" "vecload")])

-(define_expand "altivec_stvx_<mode>"
-  [(parallel
-    [(set (match_operand:VM2 0 "memory_operand" "=Z")
-          (match_operand:VM2 1 "register_operand" "v"))
-     (unspec [(const_int 0)] UNSPEC_STVX)])]
-  "TARGET_ALTIVEC"
-{
-  if (!BYTES_BIG_ENDIAN && VECTOR_ELT_ORDER_BIG)
-    {
-      altivec_expand_stvx_be (operands[0], operands[1], <MODE>mode, UNSPEC_STVX);
-      DONE;
-    }
-})
-
+; The next two patterns embody what lvx should usually look like.
+(define_insn "altivec_lvx_<mode>_2op"
+  [(set (match_operand:VM2 0 "register_operand" "=v")
+        (mem:VM2 (and:DI (plus:DI (match_operand:DI 1 "register_operand" "b")
+                                  (match_operand:DI 2 "register_operand" "r"))
+                         (const_int -16))))]
+  "TARGET_ALTIVEC && TARGET_64BIT"
+  "lvx %0,%1,%2"
+  [(set_attr "type" "vecload")])
+
+(define_insn "altivec_lvx_<mode>_1op"
+  [(set (match_operand:VM2 0 "register_operand" "=v")
+        (mem:VM2 (and:DI (match_operand:DI 1 "register_operand" "r")
+                         (const_int -16))))]
+  "TARGET_ALTIVEC && TARGET_64BIT"
+  "lvx %0,0,%1"
+  [(set_attr "type" "vecload")])
+
+; 32-bit versions of the above.
+(define_insn "altivec_lvx_<mode>_2op_si"
+  [(set (match_operand:VM2 0 "register_operand" "=v")
+        (mem:VM2 (and:SI (plus:SI (match_operand:SI 1 "register_operand" "b")
+                                  (match_operand:SI 2 "register_operand" "r"))
+                         (const_int -16))))]
+  "TARGET_ALTIVEC && TARGET_32BIT"
+  "lvx %0,%1,%2"
+  [(set_attr "type" "vecload")])
+
+(define_insn "altivec_lvx_<mode>_1op_si"
+  [(set (match_operand:VM2 0 "register_operand" "=v")
+        (mem:VM2 (and:SI (match_operand:SI 1 "register_operand" "r")
+                         (const_int -16))))]
+  "TARGET_ALTIVEC && TARGET_32BIT"
+  "lvx %0,0,%1"
+  [(set_attr "type" "vecload")])
+
+; This version of stvx is used only in cases where we need to force an stvx
+; over any other store, and we don't care about losing CSE opportunities.
+; Its primary use is for epilogue register restores.
 (define_insn "altivec_stvx_<mode>_internal"
   [(parallel
     [(set (match_operand:VM2 0 "memory_operand" "=Z")
@@ -2560,6 +2574,42 @@
   "stvx %1,%y0"
   [(set_attr "type" "vecstore")])

+; The next two patterns embody what stvx should usually look like.
+(define_insn "altivec_stvx_<mode>_2op"
+  [(set (mem:VM2 (and:DI (plus:DI (match_operand:DI 1 "register_operand" "b")
+                                  (match_operand:DI 2 "register_operand" "r"))
+                         (const_int -16)))
+        (match_operand:VM2 0 "register_operand" "v"))]
+  "TARGET_ALTIVEC && TARGET_64BIT"
+  "stvx %0,%1,%2"
+  [(set_attr "type" "vecstore")])
+
+(define_insn "altivec_stvx_<mode>_1op"
+  [(set (mem:VM2 (and:DI (match_operand:DI 1 "register_operand" "r")
+                         (const_int -16)))
+        (match_operand:VM2 0 "register_operand" "v"))]
+  "TARGET_ALTIVEC && TARGET_64BIT"
+  "stvx %0,0,%1"
+  [(set_attr "type" "vecstore")])
+
+; 32-bit versions of the above.
+(define_insn "altivec_stvx_<mode>_2op_si"
+  [(set (mem:VM2 (and:SI (plus:SI (match_operand:SI 1 "register_operand" "b")
+                                  (match_operand:SI 2 "register_operand" "r"))
+                         (const_int -16)))
+        (match_operand:VM2 0 "register_operand" "v"))]
+  "TARGET_ALTIVEC && TARGET_32BIT"
+  "stvx %0,%1,%2"
+  [(set_attr "type" "vecstore")])
+
+(define_insn "altivec_stvx_<mode>_1op_si"
+  [(set (mem:VM2 (and:SI (match_operand:SI 1 "register_operand" "r")
+                         (const_int -16)))
+        (match_operand:VM2 0 "register_operand" "v"))]
+  "TARGET_ALTIVEC && TARGET_32BIT"
+  "stvx %0,0,%1"
+  [(set_attr "type" "vecstore")])
+
 (define_expand "altivec_stvxl_<mode>"
   [(parallel
     [(set (match_operand:VM2 0 "memory_operand" "=Z")
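The new _2op/_1op patterns above match plain (mem (and ... -16)) RTL rather than an UNSPEC, so ordinary intrinsic code can be recognized directly and the swap-optimization pre-pass described in the ChangeLog can recombine it. A small, hedged example of the kind of source expected to map onto these patterns (compiled with -maltivec; whether a bare lvx/stvx or another vector load/store is emitted, and whether the 2op or 1op form is used, depends on the target flags and the address the compiler builds):

#include <altivec.h>

void
copy_elt (const vector int *src, vector int *dst, int i)
{
  vector int v = vec_ld (0, src + i);   /* expected to match an lvx pattern  */
  vec_st (v, 0, dst + i);               /* expected to match an stvx pattern  */
}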
gcc/config/rs6000/rs6000-c.c
@@ -4800,6 +4800,130 @@ assignment for unaligned loads and stores");
       return stmt;
     }

+  /* Expand vec_ld into an expression that masks the address and
+     performs the load.  We need to expand this early to allow
+     the best aliasing, as by the time we get into RTL we no longer
+     are able to honor __restrict__, for example.  We may want to
+     consider this for all memory access built-ins.
+
+     When -maltivec=be is specified, simply punt to existing
+     built-in processing.  */
+  if (fcode == ALTIVEC_BUILTIN_VEC_LD
+      && (BYTES_BIG_ENDIAN || !VECTOR_ELT_ORDER_BIG))
+    {
+      tree arg0 = (*arglist)[0];
+      tree arg1 = (*arglist)[1];
+
+      /* Strip qualifiers like "const" from the pointer arg.  */
+      tree arg1_type = TREE_TYPE (arg1);
+      tree inner_type = TREE_TYPE (arg1_type);
+      if (TYPE_QUALS (TREE_TYPE (arg1_type)) != 0)
+        {
+          arg1_type = build_pointer_type (build_qualified_type (inner_type,
+                                                                0));
+          arg1 = fold_convert (arg1_type, arg1);
+        }
+
+      /* Construct the masked address.  Let existing error handling take
+         over if we don't have a constant offset.  */
+      arg0 = fold (arg0);
+
+      if (TREE_CODE (arg0) == INTEGER_CST)
+        {
+          if (!ptrofftype_p (TREE_TYPE (arg0)))
+            arg0 = build1 (NOP_EXPR, sizetype, arg0);
+
+          tree arg1_type = TREE_TYPE (arg1);
+          tree addr = fold_build2_loc (loc, POINTER_PLUS_EXPR, arg1_type,
+                                       arg1, arg0);
+          tree aligned = fold_build2_loc (loc, BIT_AND_EXPR, arg1_type, addr,
+                                          build_int_cst (arg1_type, -16));
+
+          /* Find the built-in to get the return type so we can convert
+             the result properly (or fall back to default handling if the
+             arguments aren't compatible).  */
+          for (desc = altivec_overloaded_builtins;
+               desc->code && desc->code != fcode; desc++)
+            continue;
+
+          for (; desc->code == fcode; desc++)
+            if (rs6000_builtin_type_compatible (TREE_TYPE (arg0), desc->op1)
+                && (rs6000_builtin_type_compatible (TREE_TYPE (arg1),
+                                                    desc->op2)))
+              {
+                tree ret_type = rs6000_builtin_type (desc->ret_type);
+                if (TYPE_MODE (ret_type) == V2DImode)
+                  /* Type-based aliasing analysis thinks vector long
+                     and vector long long are different and will put them
+                     in distinct alias classes.  Force our return type
+                     to be a may-alias type to avoid this.  */
+                  ret_type
+                    = build_pointer_type_for_mode (ret_type, Pmode,
+                                                   true/*can_alias_all*/);
+                else
+                  ret_type = build_pointer_type (ret_type);
+                aligned = build1 (NOP_EXPR, ret_type, aligned);
+                tree ret_val = build_indirect_ref (loc, aligned, RO_NULL);
+                return ret_val;
+              }
+        }
+    }
+
+  /* Similarly for stvx.  */
+  if (fcode == ALTIVEC_BUILTIN_VEC_ST
+      && (BYTES_BIG_ENDIAN || !VECTOR_ELT_ORDER_BIG))
+    {
+      tree arg0 = (*arglist)[0];
+      tree arg1 = (*arglist)[1];
+      tree arg2 = (*arglist)[2];
+
+      /* Construct the masked address.  Let existing error handling take
+         over if we don't have a constant offset.  */
+      arg1 = fold (arg1);
+
+      if (TREE_CODE (arg1) == INTEGER_CST)
+        {
+          if (!ptrofftype_p (TREE_TYPE (arg1)))
+            arg1 = build1 (NOP_EXPR, sizetype, arg1);
+
+          tree arg2_type = TREE_TYPE (arg2);
+          tree addr = fold_build2_loc (loc, POINTER_PLUS_EXPR, arg2_type,
+                                       arg2, arg1);
+          tree aligned = fold_build2_loc (loc, BIT_AND_EXPR, arg2_type, addr,
+                                          build_int_cst (arg2_type, -16));
+
+          /* Find the built-in to make sure a compatible one exists; if not
+             we fall back to default handling to get the error message.  */
+          for (desc = altivec_overloaded_builtins;
+               desc->code && desc->code != fcode; desc++)
+            continue;
+
+          for (; desc->code == fcode; desc++)
+            if (rs6000_builtin_type_compatible (TREE_TYPE (arg0), desc->op1)
+                && rs6000_builtin_type_compatible (TREE_TYPE (arg1), desc->op2)
+                && rs6000_builtin_type_compatible (TREE_TYPE (arg2),
+                                                   desc->op3))
+              {
+                tree arg0_type = TREE_TYPE (arg0);
+                if (TYPE_MODE (arg0_type) == V2DImode)
+                  /* Type-based aliasing analysis thinks vector long
+                     and vector long long are different and will put them
+                     in distinct alias classes.  Force our address type
+                     to be a may-alias type to avoid this.  */
+                  arg0_type
+                    = build_pointer_type_for_mode (arg0_type, Pmode,
+                                                   true/*can_alias_all*/);
+                else
+                  arg0_type = build_pointer_type (arg0_type);
+                aligned = build1 (NOP_EXPR, arg0_type, aligned);
+                tree stg = build_indirect_ref (loc, aligned, RO_NULL);
+                tree retval = build2 (MODIFY_EXPR, TREE_TYPE (stg), stg,
+                                      convert (TREE_TYPE (stg), arg0));
+                return retval;
+              }
+        }
+    }
+
   for (n = 0;
        !VOID_TYPE_P (TREE_VALUE (fnargs)) && n < nargs;
        fnargs = TREE_CHAIN (fnargs), n++)
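Both expansions above special-case V2DImode because type-based alias analysis places "vector long" and "vector long long" in distinct alias sets, so the dereference created for the built-in is given a may-alias pointer type. The same issue, shown with scalar types as a hedged analogy (example mine, not from the patch; the may_alias attribute is a GCC extension):

/* Under strict aliasing, a read through "long long *" may be assumed not
   to alias an object written through "long *", even when the two types
   have identical size and layout.  A may_alias typedef removes that
   assumption, which is what build_pointer_type_for_mode (..., true)
   accomplishes for the V2DImode case above.  */
typedef long long ll_may_alias __attribute__ ((may_alias));

long long
read_after_write (long *p)
{
  *p = 42;                        /* store through long *  */
  return *(ll_may_alias *) p;     /* load stays ordered after the store  */
}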
gcc/config/rs6000/vector.md
@@ -167,7 +167,14 @@
   if (VECTOR_MEM_VSX_P (<MODE>mode))
     {
       operands[1] = rs6000_address_for_altivec (operands[1]);
-      emit_insn (gen_altivec_lvx_<mode> (operands[0], operands[1]));
+      rtx and_op = XEXP (operands[1], 0);
+      gcc_assert (GET_CODE (and_op) == AND);
+      rtx addr = XEXP (and_op, 0);
+      if (GET_CODE (addr) == PLUS)
+        emit_insn (gen_altivec_lvx_<mode>_2op (operands[0], XEXP (addr, 0),
+                                               XEXP (addr, 1)));
+      else
+        emit_insn (gen_altivec_lvx_<mode>_1op (operands[0], operands[1]));
       DONE;
     }
 }")
@@ -183,7 +190,14 @@
   if (VECTOR_MEM_VSX_P (<MODE>mode))
     {
       operands[0] = rs6000_address_for_altivec (operands[0]);
-      emit_insn (gen_altivec_stvx_<mode> (operands[0], operands[1]));
+      rtx and_op = XEXP (operands[0], 0);
+      gcc_assert (GET_CODE (and_op) == AND);
+      rtx addr = XEXP (and_op, 0);
+      if (GET_CODE (addr) == PLUS)
+        emit_insn (gen_altivec_stvx_<mode>_2op (operands[1], XEXP (addr, 0),
+                                                XEXP (addr, 1)));
+      else
+        emit_insn (gen_altivec_stvx_<mode>_1op (operands[1], operands[0]));
       DONE;
     }
 }")