Commit 5e04b3b6, authored and committed by Richard Henderson

Implement vec_perm broadcast, and tidy lots of patterns to help.

From-SVN: r154836
parent 9fda11a2
2009-11-30 Richard Henderson <rth@redhat.com>
* config/i386/i386.c (ix86_vec_interleave_v2df_operator_ok): New.
(bdesc_special_args): Update insn codes.
(avx_vpermilp_parallel): Correct range check.
(ix86_rtx_costs): Handle vector permutation rtx codes.
(struct expand_vec_perm_d): Move earlier.
(get_mode_wider_vector): New.
(expand_vec_perm_broadcast_1): New.
(ix86_expand_vector_init_duplicate): Use it. Tidy AVX modes.
(expand_vec_perm_broadcast): New.
(ix86_expand_vec_perm_builtin_1): Use it.
* config/i386/i386-protos.h: Update.
* config/i386/predicates.md (avx_vbroadcast_operand): New.
* config/i386/sse.md (AVX256MODE24P): New.
(ssescalarmodesuffix2s): New.
(avxhalfvecmode, avxscalarmode): Fill out to all modes.
(avxmodesuffixf2c): Add V8SI, V4DI.
(vec_dupv4sf): New expander.
(*vec_dupv4sf_avx): Add vbroadcastss alternative.
(*vec_set<mode>_0_avx, *vec_set<mode>_0_sse4_1): Macro-ize for
V4SF and V4SI. Move C alternatives to front. Add insertps and
pinsrd alternatives.
(*vec_set<mode>_0_sse2): Split out from ...
(vec_set<mode>_0): Macro-ize for V4SF and V4SI.
(vec_interleave_highv2df, vec_interleave_lowv2df): Require register
destination; use ix86_vec_interleave_v2df_operator_ok, instead of
ix86_fixup_binary_operands.
(*avx_interleave_highv2df, avx_interleave_lowv2df): Add movddup.
(*sse3_interleave_highv2df, sse3_interleave_lowv2df): New.
(*avx_movddup, *sse3_movddup): Remove. New splitter from
vec_select form to vec_duplicate form.
(*sse2_interleave_highv2df, sse2_interleave_lowv2df): Use
ix86_vec_interleave_v2df_operator_ok.
(avx_movddup256, avx_unpcklpd256): Change to expanders, merge into ...
(*avx_unpcklpd256): ... here.
(*vec_dupv4si_avx): New.
(*vec_dupv2di_avx): Add movddup alternative.
(*vec_dupv2di_sse3): New.
(vec_dup<AVX256MODE24P>): Replace avx_vbroadcasts<AVXMODEF4P> and
avx_vbroadcastss256; represent with vec_duplicate instead of
nested vec_concat operations.
(avx_vbroadcastf128_<mode>): Rename from
avx_vbroadcastf128_p<avxmodesuffixf2c>256.
(*avx_vperm_broadcast_v4sf): New.
(*avx_vperm_broadcast_<AVX256MODEF2P>): New.
2009-11-30 Martin Jambor <mjambor@suse.cz>
PR middle-end/42196
@@ -86,6 +86,7 @@ extern void ix86_expand_binary_operator (enum rtx_code,
enum machine_mode, rtx[]);
extern int ix86_binary_operator_ok (enum rtx_code, enum machine_mode, rtx[]);
extern bool ix86_lea_for_add_ok (enum rtx_code, rtx, rtx[]);
extern bool ix86_vec_interleave_v2df_operator_ok (rtx operands[3], bool high);
extern bool ix86_dep_by_shift_count (const_rtx set_insn, const_rtx use_insn);
extern bool ix86_agi_dependent (rtx set_insn, rtx use_insn);
extern void ix86_expand_unary_operator (enum rtx_code, enum machine_mode,
@@ -1241,3 +1241,20 @@
(define_predicate "avx_vperm2f128_v4df_operand"
(and (match_code "parallel")
(match_test "avx_vperm2f128_parallel (op, V4DFmode)")))
;; Return 1 if OP is a parallel for a vbroadcast permute.
(define_predicate "avx_vbroadcast_operand"
(and (match_code "parallel")
(match_code "const_int" "a"))
{
rtx elt = XVECEXP (op, 0, 0);
int i, nelt = XVECLEN (op, 0);
/* Don't bother checking there are the right number of operands,
merely that they're all identical. */
for (i = 1; i < nelt; ++i)
if (XVECEXP (op, 0, i) != elt)
return false;
return true;
})
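
The check performed by avx_vbroadcast_operand can be illustrated outside the compiler. The following is a minimal sketch in plain C, assuming the permutation selector is an ordinary array of element indices rather than an RTL parallel. The helper names is_broadcast_selector and apply_broadcast are hypothetical and do not appear in GCC. The predicate above compares rtx pointers, which relies on const_int rtxes being shared; the sketch compares integer values instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of GCC: return true when every index
   in SEL[0..NELT-1] equals the first one, i.e. the selector describes
   a broadcast of a single source element.  */
static bool
is_broadcast_selector (const int *sel, size_t nelt)
{
  size_t i;

  for (i = 1; i < nelt; ++i)
    if (sel[i] != sel[0])
      return false;
  return true;
}

/* Hypothetical helper: apply a broadcast selector by copying
   SRC[SEL[0]] into every element of DST.  */
static void
apply_broadcast (float *dst, const float *src, const int *sel, size_t nelt)
{
  size_t i;

  assert (nelt > 0 && is_broadcast_selector (sel, nelt));
  assert (sel[0] >= 0 && (size_t) sel[0] < nelt);
  for (i = 0; i < nelt; ++i)
    dst[i] = src[sel[0]];
}
```

Like the predicate, the sketch does not verify that the selector has the right number of elements for a given vector mode; it only tests that all indices are identical.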