Commit 00347934 by Richard Sandiford

[12/n] PR85694: Rework detection of widened operations

This patch adds helper functions for detecting widened operations and
generalises the existing code to handle more cases.

One of the main changes is to recognise multi-stage type conversions,
which are possible even in the original IR and can also occur as a
result of earlier pattern matching (especially after the main
over-widening patch).  E.g. for:

  unsigned int res = 0;
  for (__INTPTR_TYPE__ i = 0; i < N; ++i)
    {
      int av = a[i];
      int bv = b[i];
      short diff = av - bv;
      unsigned short abs = diff < 0 ? -diff : diff;
      res += abs;
    }

we have:

  _9 = _7 - _8;
  diff_20 = (short int) _9;
  _10 = (int) diff_20;
  _11 = ABS_EXPR <_10>;

where the first cast establishes the sign of the promotion done
by the second cast.

vect_recog_sad_pattern didn't handle this kind of intermediate promotion
between the MINUS_EXPR and the ABS_EXPR.  Sign extensions and casts from
unsigned to signed are both OK there.  Unsigned promotions aren't, and
need to be rejected, but should have been folded away earlier anyway.

Also, the dot_prod and widen_sum patterns both required the promotions
to be from one signedness to the same signedness, rather than say signed
char to unsigned int.  That shouldn't be necessary, since it's only the
sign of the input to the promotion that matters.  Nothing requires the
narrow and wide types in a DOT_PROD_EXPR or WIDEN_SUM_EXPR to have the
same sign (and IMO that's a good thing).
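For example, with this change a reduction of the following form (an
illustrative loop in the style of the new vect-reduc-dot-*.c tests,
not a verbatim copy of one) can use DOT_PROD_EXPR even though the
narrow inputs are signed char while the sum is unsigned int, since
both promotions extend from the same signed input type:

  unsigned int res = 0;
  for (__INTPTR_TYPE__ i = 0; i < N; ++i)
    {
      int av = a[i];            /* a is a signed char array */
      int bv = b[i];            /* b is a signed char array */
      signed short mult = av * bv;
      res += mult;
    }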

Fixing these issues removes an XFAIL in gcc.dg/vect/vect-widen-mult-sum.c.

vect_widened_op_tree is a bit more general than the current patch needs,
since it copes with a tree of operations rather than a single statement.
This is used by the later average-detection patch.
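For instance, given a loop that computes a rounding average (a sketch
of the case that the later patch targets; this patch only provides the
infrastructure):

  for (int i = 0; i < n; ++i)
    c[i] = (a[i] + b[i] + 1) >> 1;   /* a, b, c unsigned char arrays */

the additions form a two-node tree of PLUS_EXPRs on values widened
from unsigned char, and vect_widened_op_tree returns the three leaf
operands (a[i], b[i] and the constant 1) together with unsigned char
as their common type.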

The patch also uses a common routine to handle both the WIDEN_MULT_EXPR
and WIDEN_LSHIFT_EXPR patterns.  I hope this could be reused for other
similar operations in future.
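For instance, the shift pattern matches loops of the following shape
(an illustrative sketch, not one of the new tests):

  for (int i = 0; i < n; ++i)
    res[i] = a[i] << 10;   /* a is a short array, res an int array */

where the usual promotion of a[i] to int, combined with a shift by a
constant no greater than half the result's precision, forms a
WIDEN_LSHIFT_EXPR candidate that is detected by the same helper as
WIDEN_MULT_EXPR.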

Also, the patch means we recognise the index calculations in
vect-mult-const-pattern*.c as widening multiplications, whereas the
scan tests were expecting them to be recognised as mult patterns
instead.  The patch makes the tests check specifically for the
multiplication we care about.
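For example, vect-mult-const-pattern-1.c now matches the constant
itself instead of counting every occurrence of the pattern:

  /* { dg-final { scan-tree-dump {vect_recog_mult_pattern: detected:[^\n]* \* 123} "vect" { target aarch64*-*-* xfail aarch64_sve } } } */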

2018-06-30  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* tree-vect-patterns.c (append_pattern_def_seq): Take an optional
	vector type.  If given, install it in the new statement's
	STMT_VINFO_VECTYPE.
	(vect_element_precision): New function.
	(vect_unpromoted_value): New struct.
	(vect_unpromoted_value::vect_unpromoted_value): New function.
	(vect_unpromoted_value::set_op): Likewise.
	(vect_look_through_possible_promotion): Likewise.
	(vect_joust_widened_integer, vect_joust_widened_type): Likewise.
	(vect_widened_op_tree, vect_convert_input): Likewise.
	(vect_convert_inputs, vect_convert_output): Likewise.
	(vect_recog_dot_prod_pattern): Use vect_look_through_possible_promotion
	to handle the optional cast of the multiplication result and
	vect_widened_op_tree to detect the widened multiplication itself.
	Do not require the input and output of promotion casts to have
	the same sign, but base the signedness of the operation on the
	input rather than the result.  If the pattern includes two
	promotions, check that those promotions have the same sign.
	Do not restrict the MULT_EXPR handling to a double-width result;
	handle quadruple-width results and wider.  Use vect_convert_inputs
	to convert the inputs to the common type.
	(vect_recog_sad_pattern):  Use vect_look_through_possible_promotion
	to handle the optional cast of the ABS result.  Also allow a sign
	change or a sign extension between the ABS and MINUS.
	Use vect_widened_op_tree to detect the widened subtraction and use
	vect_convert_inputs to convert the inputs to the common type.
	(vect_handle_widen_op_by_const): Delete.
	(vect_recog_widen_op_pattern): New function.
	(vect_recog_widen_mult_pattern): Use it.
	(vect_recog_widen_shift_pattern): Likewise.
	(vect_recog_widen_sum_pattern): Use
	vect_look_through_possible_promotion to handle the promoted
	PLUS_EXPR operand.

gcc/testsuite/
	* gcc.dg/vect/vect-widen-mult-sum.c: Remove xfail.
	* gcc.dg/vect/no-scevccp-outer-6.c: Don't match widened multiplications
	by 4 in the computation of a[i].
	* gcc.dg/vect/vect-mult-const-pattern-1.c: Test specifically for the
	main multiplication constant.
	* gcc.dg/vect/vect-mult-const-pattern-2.c: Likewise.
	* gcc.dg/vect/vect-widen-mult-const-s16.c: Likewise.
	* gcc.dg/vect/vect-widen-mult-const-u16.c: Likewise.  Expect the
	pattern to cast the result to int.
	* gcc.dg/vect/vect-reduc-dot-1.c: New test.
	* gcc.dg/vect/vect-reduc-dot-2.c: Likewise.
	* gcc.dg/vect/vect-reduc-dot-3.c: Likewise.
	* gcc.dg/vect/vect-reduc-dot-4.c: Likewise.
	* gcc.dg/vect/vect-reduc-dot-5.c: Likewise.
	* gcc.dg/vect/vect-reduc-dot-6.c: Likewise.
	* gcc.dg/vect/vect-reduc-dot-7.c: Likewise.
	* gcc.dg/vect/vect-reduc-dot-8.c: Likewise.
	* gcc.dg/vect/vect-reduc-sad-1.c: Likewise.
	* gcc.dg/vect/vect-reduc-sad-2.c: Likewise.
	* gcc.dg/vect/vect-reduc-sad-3.c: Likewise.
	* gcc.dg/vect/vect-reduc-sad-4.c: Likewise.
	* gcc.dg/vect/vect-reduc-sad-5.c: Likewise.
	* gcc.dg/vect/vect-reduc-sad-6.c: Likewise.
	* gcc.dg/vect/vect-reduc-sad-7.c: Likewise.
	* gcc.dg/vect/vect-reduc-sad-8.c: Likewise.
	* gcc.dg/vect/vect-widen-mult-1.c: Likewise.
	* gcc.dg/vect/vect-widen-mult-2.c: Likewise.
	* gcc.dg/vect/vect-widen-mult-3.c: Likewise.
	* gcc.dg/vect/vect-widen-mult-4.c: Likewise.

From-SVN: r262276

gcc/testsuite/gcc.dg/vect/no-scevccp-outer-6.c:

@@ -52,4 +52,5 @@ int main (void)
 }
 
 /* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED." 1 "vect" { xfail { unaligned_stack || { vect_no_align && { ! vect_hw_misalign } } } } } } */
-/* { dg-final { scan-tree-dump-times "vect_recog_widen_mult_pattern: detected" 1 "vect" { xfail *-*-* } } } */
+/* Exclude widening of i * 4 in the calculation of a[i].  */
+/* { dg-final { scan-tree-dump-times {vect_recog_widen_mult_pattern: detected: [^\n]* \* ([^4]|4[^;])} 1 "vect" { xfail *-*-* } } } */

gcc/testsuite/gcc.dg/vect/vect-mult-const-pattern-1.c:

@@ -37,5 +37,5 @@ main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "vect_recog_mult_pattern: detected" 2 "vect" { target aarch64*-*-* xfail aarch64_sve } } } */
+/* { dg-final { scan-tree-dump {vect_recog_mult_pattern: detected:[^\n]* \* 123} "vect" { target aarch64*-*-* xfail aarch64_sve } } } */
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target aarch64*-*-* } } } */

gcc/testsuite/gcc.dg/vect/vect-mult-const-pattern-2.c:

@@ -36,5 +36,5 @@ main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "vect_recog_mult_pattern: detected" 2 "vect" { target aarch64*-*-* xfail aarch64_sve } } } */
+/* { dg-final { scan-tree-dump {vect_recog_mult_pattern: detected:[^\n]* \* -19594} "vect" { target aarch64*-*-* xfail aarch64_sve } } } */
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target aarch64*-*-* } } } */

gcc/testsuite/gcc.dg/vect/vect-reduc-dot-1.c (new file):

/* { dg-require-effective-target vect_int } */

#include "tree-vect.h"

#define N 50

#ifndef SIGNEDNESS_1
#define SIGNEDNESS_1 unsigned
#define SIGNEDNESS_2 unsigned
#define SIGNEDNESS_3 unsigned
#endif

SIGNEDNESS_1 int __attribute__ ((noipa))
f (SIGNEDNESS_1 int res, SIGNEDNESS_3 char *restrict a,
   SIGNEDNESS_3 char *restrict b)
{
  for (__INTPTR_TYPE__ i = 0; i < N; ++i)
    {
      int av = a[i];
      int bv = b[i];
      SIGNEDNESS_2 short mult = av * bv;
      res += mult;
    }
  return res;
}

#define BASE ((SIGNEDNESS_3 int) -1 < 0 ? -126 : 4)
#define OFFSET 20

int
main (void)
{
  check_vect ();

  SIGNEDNESS_3 char a[N], b[N];
  int expected = 0x12345;
  for (int i = 0; i < N; ++i)
    {
      a[i] = BASE + i * 5;
      b[i] = BASE + OFFSET + i * 4;
      asm volatile ("" ::: "memory");
      expected += (SIGNEDNESS_2 short) (a[i] * b[i]);
    }
  if (f (0x12345, a, b) != expected)
    __builtin_abort ();
}

/* { dg-final { scan-tree-dump "vect_recog_dot_prod_pattern: detected" "vect" } } */
/* { dg-final { scan-tree-dump-times "vectorized 1 loop" 1 "vect" { target vect_sdot_qi } } } */

gcc/testsuite/gcc.dg/vect/vect-reduc-dot-2.c (new file):

/* { dg-require-effective-target vect_int } */
#define SIGNEDNESS_1 unsigned
#define SIGNEDNESS_2 unsigned
#define SIGNEDNESS_3 signed
#include "vect-reduc-dot-1.c"
/* { dg-final { scan-tree-dump-not "vect_recog_dot_prod_pattern: detected" "vect" } } */

gcc/testsuite/gcc.dg/vect/vect-reduc-dot-3.c (new file):

/* { dg-require-effective-target vect_int } */
#define SIGNEDNESS_1 unsigned
#define SIGNEDNESS_2 signed
#define SIGNEDNESS_3 unsigned
#include "vect-reduc-dot-1.c"
/* { dg-final { scan-tree-dump-not "vect_recog_dot_prod_pattern: detected" "vect" } } */

gcc/testsuite/gcc.dg/vect/vect-reduc-dot-4.c (new file):

/* { dg-require-effective-target vect_int } */
#define SIGNEDNESS_1 unsigned
#define SIGNEDNESS_2 signed
#define SIGNEDNESS_3 signed
#include "vect-reduc-dot-1.c"
/* { dg-final { scan-tree-dump "vect_recog_dot_prod_pattern: detected" "vect" } } */

gcc/testsuite/gcc.dg/vect/vect-reduc-dot-5.c (new file):

/* { dg-require-effective-target vect_int } */
#define SIGNEDNESS_1 signed
#define SIGNEDNESS_2 unsigned
#define SIGNEDNESS_3 unsigned
#include "vect-reduc-dot-1.c"
/* { dg-final { scan-tree-dump "vect_recog_dot_prod_pattern: detected" "vect" } } */

gcc/testsuite/gcc.dg/vect/vect-reduc-dot-6.c (new file):

/* { dg-require-effective-target vect_int } */
#define SIGNEDNESS_1 signed
#define SIGNEDNESS_2 unsigned
#define SIGNEDNESS_3 signed
#include "vect-reduc-dot-1.c"
/* { dg-final { scan-tree-dump-not "vect_recog_dot_prod_pattern: detected" "vect" } } */

gcc/testsuite/gcc.dg/vect/vect-reduc-dot-7.c (new file):

/* { dg-require-effective-target vect_int } */
#define SIGNEDNESS_1 signed
#define SIGNEDNESS_2 signed
#define SIGNEDNESS_3 unsigned
#include "vect-reduc-dot-1.c"
/* { dg-final { scan-tree-dump-not "vect_recog_dot_prod_pattern: detected" "vect" } } */

gcc/testsuite/gcc.dg/vect/vect-reduc-dot-8.c (new file):

/* { dg-require-effective-target vect_int } */
#define SIGNEDNESS_1 signed
#define SIGNEDNESS_2 signed
#define SIGNEDNESS_3 signed
#include "vect-reduc-dot-1.c"
/* { dg-final { scan-tree-dump "vect_recog_dot_prod_pattern: detected" "vect" } } */

gcc/testsuite/gcc.dg/vect/vect-reduc-sad-1.c (new file):

/* { dg-require-effective-target vect_int } */

#include "tree-vect.h"

#define N 50

#ifndef SIGNEDNESS_1
#define SIGNEDNESS_1 unsigned
#define SIGNEDNESS_2 unsigned
#define SIGNEDNESS_3 unsigned
#endif

SIGNEDNESS_1 int __attribute__ ((noipa))
f (SIGNEDNESS_1 int res, SIGNEDNESS_3 char *restrict a,
   SIGNEDNESS_3 char *restrict b)
{
  for (__INTPTR_TYPE__ i = 0; i < N; ++i)
    {
      int av = a[i];
      int bv = b[i];
      short diff = av - bv;
      SIGNEDNESS_2 short abs = diff < 0 ? -diff : diff;
      res += abs;
    }
  return res;
}

#define BASE1 ((SIGNEDNESS_3 int) -1 < 0 ? -126 : 4)
#define BASE2 ((SIGNEDNESS_3 int) -1 < 0 ? 116 : 220)

int
main (void)
{
  check_vect ();

  SIGNEDNESS_3 char a[N], b[N];
  int expected = 0x12345;
  for (int i = 0; i < N; ++i)
    {
      a[i] = BASE1 + i * 5;
      b[i] = BASE2 - i * 4;
      asm volatile ("" ::: "memory");
      int diff = a[i] - b[i];
      expected += (SIGNEDNESS_2 short) (diff < 0 ? -diff : diff);
    }
  if (f (0x12345, a, b) != expected)
    __builtin_abort ();
}

/* { dg-final { scan-tree-dump "vect_recog_sad_pattern: detected" "vect" } } */
/* { dg-final { scan-tree-dump-times "vectorized 1 loop" 1 "vect" { target vect_sdot_qi } } } */

gcc/testsuite/gcc.dg/vect/vect-reduc-sad-2.c (new file):

/* { dg-require-effective-target vect_int } */
#define SIGNEDNESS_1 unsigned
#define SIGNEDNESS_2 unsigned
#define SIGNEDNESS_3 signed
#include "vect-reduc-sad-1.c"
/* { dg-final { scan-tree-dump "vect_recog_sad_pattern: detected" "vect" } } */

gcc/testsuite/gcc.dg/vect/vect-reduc-sad-3.c (new file):

/* { dg-require-effective-target vect_int } */
#define SIGNEDNESS_1 unsigned
#define SIGNEDNESS_2 signed
#define SIGNEDNESS_3 unsigned
#include "vect-reduc-sad-1.c"
/* { dg-final { scan-tree-dump "vect_recog_sad_pattern: detected" "vect" } } */

gcc/testsuite/gcc.dg/vect/vect-reduc-sad-4.c (new file):

/* { dg-require-effective-target vect_int } */
#define SIGNEDNESS_1 unsigned
#define SIGNEDNESS_2 signed
#define SIGNEDNESS_3 signed
#include "vect-reduc-sad-1.c"
/* { dg-final { scan-tree-dump "vect_recog_sad_pattern: detected" "vect" } } */

gcc/testsuite/gcc.dg/vect/vect-reduc-sad-5.c (new file):

/* { dg-require-effective-target vect_int } */
#define SIGNEDNESS_1 signed
#define SIGNEDNESS_2 unsigned
#define SIGNEDNESS_3 unsigned
#include "vect-reduc-sad-1.c"
/* { dg-final { scan-tree-dump "vect_recog_sad_pattern: detected" "vect" } } */

gcc/testsuite/gcc.dg/vect/vect-reduc-sad-6.c (new file):

/* { dg-require-effective-target vect_int } */
#define SIGNEDNESS_1 signed
#define SIGNEDNESS_2 unsigned
#define SIGNEDNESS_3 signed
#include "vect-reduc-sad-1.c"
/* { dg-final { scan-tree-dump "vect_recog_sad_pattern: detected" "vect" } } */

gcc/testsuite/gcc.dg/vect/vect-reduc-sad-7.c (new file):

/* { dg-require-effective-target vect_int } */
#define SIGNEDNESS_1 signed
#define SIGNEDNESS_2 signed
#define SIGNEDNESS_3 unsigned
#include "vect-reduc-sad-1.c"
/* { dg-final { scan-tree-dump "vect_recog_sad_pattern: detected" "vect" } } */

gcc/testsuite/gcc.dg/vect/vect-reduc-sad-8.c (new file):

/* { dg-require-effective-target vect_int } */
#define SIGNEDNESS_1 signed
#define SIGNEDNESS_2 signed
#define SIGNEDNESS_3 signed
#include "vect-reduc-sad-1.c"
/* { dg-final { scan-tree-dump "vect_recog_sad_pattern: detected" "vect" } } */

gcc/testsuite/gcc.dg/vect/vect-widen-mult-1.c (new file):

/* { dg-require-effective-target vect_int } */

#include "tree-vect.h"

#define N 50

#ifndef SIGNEDNESS_1
#define SIGNEDNESS_1 unsigned
#define SIGNEDNESS_2 signed
#endif

void __attribute__ ((noipa))
f (SIGNEDNESS_1 short *restrict a, SIGNEDNESS_2 char *restrict b,
   SIGNEDNESS_2 char *restrict c)
{
  for (__INTPTR_TYPE__ i = 0; i < N; ++i)
    {
      int bv = (SIGNEDNESS_2 char) b[i];
      int cv = (SIGNEDNESS_2 char) c[i];
      a[i] = bv * cv;
    }
}

#define BASE ((SIGNEDNESS_2 int) -1 < 0 ? -126 : 4)
#define OFFSET 20

int
main (void)
{
  check_vect ();

  SIGNEDNESS_1 short a[N];
  SIGNEDNESS_2 char b[N], c[N];
  for (int i = 0; i < N; ++i)
    {
      b[i] = BASE + i * 5;
      c[i] = BASE + OFFSET + i * 4;
      asm volatile ("" ::: "memory");
    }
  f (a, b, c);
  for (int i = 0; i < N; ++i)
    if (a[i] != (SIGNEDNESS_1 short) ((BASE + i * 5)
                                      * (BASE + OFFSET + i * 4)))
      __builtin_abort ();
}

/* { dg-final { scan-tree-dump "vect_recog_widen_mult_pattern: detected" "vect" } } */
/* { dg-final { scan-tree-dump-times "vectorized 1 loop" 1 "vect" { target vect_widen_mult_qi_to_hi } } } */

gcc/testsuite/gcc.dg/vect/vect-widen-mult-2.c (new file):

/* { dg-require-effective-target vect_int } */
#define SIGNEDNESS_1 unsigned
#define SIGNEDNESS_2 unsigned
#include "vect-widen-mult-1.c"
/* { dg-final { scan-tree-dump "vect_recog_widen_mult_pattern: detected" "vect" } } */
/* { dg-final { scan-tree-dump-times "vectorized 1 loop" 1 "vect" { target vect_widen_mult_qi_to_hi } } } */

gcc/testsuite/gcc.dg/vect/vect-widen-mult-3.c (new file):

/* { dg-require-effective-target vect_int } */
#define SIGNEDNESS_1 signed
#define SIGNEDNESS_2 unsigned
#include "vect-widen-mult-1.c"
/* { dg-final { scan-tree-dump "vect_recog_widen_mult_pattern: detected" "vect" } } */
/* { dg-final { scan-tree-dump-times "vectorized 1 loop" 1 "vect" { target vect_widen_mult_qi_to_hi } } } */

gcc/testsuite/gcc.dg/vect/vect-widen-mult-4.c (new file):

/* { dg-require-effective-target vect_int } */
#define SIGNEDNESS_1 signed
#define SIGNEDNESS_2 signed
#include "vect-widen-mult-1.c"
/* { dg-final { scan-tree-dump "vect_recog_widen_mult_pattern: detected" "vect" } } */
/* { dg-final { scan-tree-dump-times "vectorized 1 loop" 1 "vect" { target vect_widen_mult_qi_to_hi } } } */

gcc/testsuite/gcc.dg/vect/vect-widen-mult-const-s16.c:

@@ -56,6 +56,6 @@ int main (void)
 }
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { target vect_widen_mult_hi_to_si } } } */
-/* { dg-final { scan-tree-dump-times "vect_recog_widen_mult_pattern: detected" 2 "vect" { target vect_widen_mult_hi_to_si_pattern } } } */
-/* { dg-final { scan-tree-dump-times "widen_mult pattern recognized" 2 "vect" { target vect_widen_mult_hi_to_si_pattern } } } */
+/* { dg-final { scan-tree-dump-times {vect_recog_widen_mult_pattern: detected:[^\n]* 2333} 2 "vect" { target vect_widen_mult_hi_to_si_pattern } } } */
+/* { dg-final { scan-tree-dump-times {widen_mult pattern recognized:[^\n]* 2333} 2 "vect" { target vect_widen_mult_hi_to_si_pattern } } } */

gcc/testsuite/gcc.dg/vect/vect-widen-mult-const-u16.c:

@@ -73,5 +73,5 @@ int main (void)
 }
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 3 "vect" { target vect_widen_mult_hi_to_si } } } */
-/* { dg-final { scan-tree-dump-times "vect_recog_widen_mult_pattern: detected" 2 "vect" { target vect_widen_mult_hi_to_si_pattern } } } */
-/* { dg-final { scan-tree-dump-times "widen_mult pattern recognized" 2 "vect" { target vect_widen_mult_hi_to_si_pattern } } } */
+/* { dg-final { scan-tree-dump-times {vect_recog_widen_mult_pattern: detected:[^\n]* 2333} 2 "vect" { target vect_widen_mult_hi_to_si_pattern } } } */
+/* { dg-final { scan-tree-dump-times {widen_mult pattern recognized:[^\n]* = \(int\)} 2 "vect" { target vect_widen_mult_hi_to_si_pattern } } } */

gcc/testsuite/gcc.dg/vect/vect-widen-mult-sum.c:

@@ -43,4 +43,4 @@ int main (void)
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target vect_widen_mult_hi_to_si } } } */
-/* { dg-final { scan-tree-dump-times "vect_recog_widen_mult_pattern: detected" 1 "vect" { xfail *-*-* } } } */
+/* { dg-final { scan-tree-dump "vect_recog_widen_mult_pattern: detected" "vect" } } */

gcc/tree-vect-patterns.c:

@@ -96,11 +96,24 @@ vect_set_pattern_stmt (gimple *pattern_stmt, stmt_vec_info orig_stmt_info,
   vect_init_pattern_stmt (pattern_stmt, orig_stmt_info, vectype);
 }
 
+/* Add NEW_STMT to STMT_INFO's pattern definition statements.  If VECTYPE
+   is nonnull, record that NEW_STMT's vector type is VECTYPE, which might
+   be different from the vector type of the final pattern statement.  */
+
 static inline void
-append_pattern_def_seq (stmt_vec_info stmt_info, gimple *stmt)
+append_pattern_def_seq (stmt_vec_info stmt_info, gimple *new_stmt,
+                        tree vectype = NULL_TREE)
 {
+  vec_info *vinfo = stmt_info->vinfo;
+  if (vectype)
+    {
+      gcc_assert (!vinfo_for_stmt (new_stmt));
+      stmt_vec_info new_stmt_info = new_stmt_vec_info (new_stmt, vinfo);
+      set_vinfo_for_stmt (new_stmt, new_stmt_info);
+      STMT_VINFO_VECTYPE (new_stmt_info) = vectype;
+    }
   gimple_seq_add_stmt_without_update (&STMT_VINFO_PATTERN_DEF_SEQ (stmt_info),
-                                      stmt);
+                                      new_stmt);
 }
 
 static inline void
@@ -187,6 +200,15 @@ vect_single_imm_use (gimple *def_stmt)
   return use_stmt;
 }
 
+/* Round bit precision PRECISION up to a full element.  */
+
+static unsigned int
+vect_element_precision (unsigned int precision)
+{
+  precision = 1 << ceil_log2 (precision);
+  return MAX (precision, BITS_PER_UNIT);
+}
+
 /* If OP is defined by a statement that's being considered for vectorization,
    return information about that statement, otherwise return NULL.  */
@@ -253,6 +275,322 @@ type_conversion_p (tree name, gimple *use_stmt, bool check_sign,
   return true;
 }
 
+/* Holds information about an input operand after some sign changes
+   and type promotions have been peeled away.  */
+struct vect_unpromoted_value {
+  vect_unpromoted_value ();
+
+  void set_op (tree, vect_def_type, stmt_vec_info = NULL);
+
+  /* The value obtained after peeling away zero or more casts.  */
+  tree op;
+
+  /* The type of OP.  */
+  tree type;
+
+  /* The definition type of OP.  */
+  vect_def_type dt;
+
+  /* If OP is the result of peeling at least one cast, and if the cast
+     of OP itself is a vectorizable statement, CASTER identifies that
+     statement, otherwise it is null.  */
+  stmt_vec_info caster;
+};
+
+inline vect_unpromoted_value::vect_unpromoted_value ()
+  : op (NULL_TREE),
+    type (NULL_TREE),
+    dt (vect_uninitialized_def),
+    caster (NULL)
+{
+}
+
+/* Set the operand to OP_IN, its definition type to DT_IN, and the
+   statement that casts it to CASTER_IN.  */
+
+inline void
+vect_unpromoted_value::set_op (tree op_in, vect_def_type dt_in,
+                               stmt_vec_info caster_in)
+{
+  op = op_in;
+  type = TREE_TYPE (op);
+  dt = dt_in;
+  caster = caster_in;
+}
+
+/* If OP is a vectorizable SSA name, strip a sequence of integer conversions
+   to reach some vectorizable inner operand OP', continuing as long as it
+   is possible to convert OP' back to OP using a possible sign change
+   followed by a possible promotion P.  Return this OP', or null if OP is
+   not a vectorizable SSA name.  If there is a promotion P, describe its
+   input in UNPROM, otherwise describe OP' in UNPROM.
+
+   A successful return means that it is possible to go from OP' to OP
+   via UNPROM.  The cast from OP' to UNPROM is at most a sign change,
+   whereas the cast from UNPROM to OP might be a promotion, a sign
+   change, or a nop.
+
+   E.g. say we have:
+
+       signed short *ptr = ...;
+       signed short C = *ptr;
+       unsigned short B = (unsigned short) C;    // sign change
+       signed int A = (signed int) B;            // unsigned promotion
+       ...possible other uses of A...
+       unsigned int OP = (unsigned int) A;       // sign change
+
+   In this case it's possible to go directly from C to OP using:
+
+       OP = (unsigned int) (unsigned short) C;
+            +------------+ +--------------+
+              promotion      sign change
+
+   so OP' would be C.  The input to the promotion is B, so UNPROM
+   would describe B.  */
+
+static tree
+vect_look_through_possible_promotion (vec_info *vinfo, tree op,
+                                      vect_unpromoted_value *unprom)
+{
+  tree res = NULL_TREE;
+  tree op_type = TREE_TYPE (op);
+  unsigned int orig_precision = TYPE_PRECISION (op_type);
+  stmt_vec_info caster = NULL;
+  while (TREE_CODE (op) == SSA_NAME && INTEGRAL_TYPE_P (op_type))
+    {
+      /* See whether OP is simple enough to vectorize.  */
+      gimple *def_stmt;
+      vect_def_type dt;
+      if (!vect_is_simple_use (op, vinfo, &dt, &def_stmt))
+        break;
+
+      /* If OP is the input of a demotion, skip over it to see whether
+         OP is itself the result of a promotion.  If so, the combined
+         effect of the promotion and the demotion might fit the required
+         pattern, otherwise neither operation fits.
+
+         This copes with cases such as the result of an arithmetic
+         operation being truncated before being stored, and where that
+         arithmetic operation has been recognized as an over-widened one.  */
+      if (TYPE_PRECISION (op_type) <= orig_precision)
+        {
+          /* Use OP as the UNPROM described above if we haven't yet
+             found a promotion, or if using the new input preserves the
+             sign of the previous promotion.  */
+          if (!res
+              || TYPE_PRECISION (unprom->type) == orig_precision
+              || TYPE_SIGN (unprom->type) == TYPE_SIGN (op_type))
+            unprom->set_op (op, dt, caster);
+          /* Stop if we've already seen a promotion and if this
+             conversion does more than change the sign.  */
+          else if (TYPE_PRECISION (op_type)
+                   != TYPE_PRECISION (unprom->type))
+            break;
+
+          /* The sequence now extends to OP.  */
+          res = op;
+        }
+
+      /* See whether OP is defined by a cast.  Record it as CASTER if
+         the cast is potentially vectorizable.  */
+      if (!def_stmt)
+        break;
+      if (dt == vect_internal_def)
+        caster = vinfo_for_stmt (def_stmt);
+      else
+        caster = NULL;
+      gassign *assign = dyn_cast <gassign *> (def_stmt);
+      if (!assign || !CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (def_stmt)))
+        break;
+
+      /* Continue with the input to the cast.  */
+      op = gimple_assign_rhs1 (def_stmt);
+      op_type = TREE_TYPE (op);
+    }
+  return res;
+}
+
+/* OP is an integer operand to an operation that returns TYPE, and we
+   want to treat the operation as a widening one.  So far we can treat
+   it as widening from *COMMON_TYPE.
+
+   Return true if OP is suitable for such a widening operation,
+   either widening from *COMMON_TYPE or from some supertype of it.
+   Update *COMMON_TYPE to the supertype in the latter case.
+
+   SHIFT_P is true if OP is a shift amount.  */
+
+static bool
+vect_joust_widened_integer (tree type, bool shift_p, tree op,
+                            tree *common_type)
+{
+  /* Calculate the minimum precision required by OP, without changing
+     the sign of either operand.  */
+  unsigned int precision;
+  if (shift_p)
+    {
+      if (!wi::leu_p (wi::to_widest (op), TYPE_PRECISION (type) / 2))
+        return false;
+      precision = TREE_INT_CST_LOW (op);
+    }
+  else
+    {
+      precision = wi::min_precision (wi::to_widest (op),
+                                     TYPE_SIGN (*common_type));
+      if (precision * 2 > TYPE_PRECISION (type))
+        return false;
+    }
+
+  /* If OP requires a wider type, switch to that type.  The checks
+     above ensure that this is still narrower than the result.  */
+  precision = vect_element_precision (precision);
+  if (TYPE_PRECISION (*common_type) < precision)
+    *common_type = build_nonstandard_integer_type
+      (precision, TYPE_UNSIGNED (*common_type));
+  return true;
+}
+
+/* Return true if the common supertype of NEW_TYPE and *COMMON_TYPE
+   is narrower than TYPE, storing the supertype in *COMMON_TYPE if so.  */
+
+static bool
+vect_joust_widened_type (tree type, tree new_type, tree *common_type)
+{
+  if (types_compatible_p (*common_type, new_type))
+    return true;
+
+  /* See if *COMMON_TYPE can hold all values of NEW_TYPE.  */
+  if ((TYPE_PRECISION (new_type) < TYPE_PRECISION (*common_type))
+      && (TYPE_UNSIGNED (new_type) || !TYPE_UNSIGNED (*common_type)))
+    return true;
+
+  /* See if NEW_TYPE can hold all values of *COMMON_TYPE.  */
+  if (TYPE_PRECISION (*common_type) < TYPE_PRECISION (new_type)
+      && (TYPE_UNSIGNED (*common_type) || !TYPE_UNSIGNED (new_type)))
+    {
+      *common_type = new_type;
+      return true;
+    }
+
+  /* We have mismatched signs, with the signed type being
+     no wider than the unsigned type.  In this case we need
+     a wider signed type.  */
+  unsigned int precision = MAX (TYPE_PRECISION (*common_type),
+                                TYPE_PRECISION (new_type));
+  precision *= 2;
+  if (precision * 2 > TYPE_PRECISION (type))
+    return false;
+
+  *common_type = build_nonstandard_integer_type (precision, false);
+  return true;
+}
+
+/* Check whether STMT_INFO can be viewed as a tree of integer operations
+   in which each node either performs CODE or WIDENED_CODE, and where
+   each leaf operand is narrower than the result of STMT_INFO.  MAX_NOPS
+   specifies the maximum number of leaf operands.  SHIFT_P says whether
+   CODE and WIDENED_CODE are some sort of shift.
+
+   If STMT_INFO is such a tree, return the number of leaf operands
+   and describe them in UNPROM[0] onwards.  Also set *COMMON_TYPE
+   to a type that (a) is narrower than the result of STMT_INFO and
+   (b) can hold all leaf operand values.
+
+   Return 0 if STMT_INFO isn't such a tree, or if no such COMMON_TYPE
+   exists.  */
+
+static unsigned int
+vect_widened_op_tree (stmt_vec_info stmt_info, tree_code code,
+                      tree_code widened_code, bool shift_p,
+                      unsigned int max_nops,
+                      vect_unpromoted_value *unprom, tree *common_type)
+{
+  /* Check for an integer operation with the right code.  */
+  gassign *assign = dyn_cast <gassign *> (stmt_info->stmt);
+  if (!assign)
+    return 0;
+
+  tree_code rhs_code = gimple_assign_rhs_code (assign);
+  if (rhs_code != code && rhs_code != widened_code)
+    return 0;
+
+  tree type = gimple_expr_type (assign);
+  if (!INTEGRAL_TYPE_P (type))
+    return 0;
+
+  /* Assume that both operands will be leaf operands.  */
+  max_nops -= 2;
+
+  /* Check the operands.  */
+  unsigned int next_op = 0;
+  for (unsigned int i = 0; i < 2; ++i)
+    {
+      vect_unpromoted_value *this_unprom = &unprom[next_op];
+      unsigned int nops = 1;
+      tree op = gimple_op (assign, i + 1);
+      if (i == 1 && TREE_CODE (op) == INTEGER_CST)
+        {
+          /* We already have a common type from earlier operands.
+             Update it to account for OP.  */
+          this_unprom->set_op (op, vect_constant_def);
+          if (!vect_joust_widened_integer (type, shift_p, op, common_type))
+            return 0;
+        }
+      else
+        {
+          /* Only allow shifts by constants.  */
+          if (shift_p && i == 1)
+            return 0;
+
+          if (!vect_look_through_possible_promotion (stmt_info->vinfo, op,
+                                                     this_unprom))
+            return 0;
+
+          if (TYPE_PRECISION (this_unprom->type) == TYPE_PRECISION (type))
+            {
+              /* The operand isn't widened.  If STMT_INFO has the code
+                 for an unwidened operation, recursively check whether
+                 this operand is a node of the tree.  */
+              if (rhs_code != code
+                  || max_nops == 0
+                  || this_unprom->dt != vect_internal_def)
+                return 0;
+
+              /* Give back the leaf slot allocated above now that we're
+                 not treating this as a leaf operand.  */
+              max_nops += 1;
+
+              /* Recursively process the definition of the operand.  */
+              stmt_vec_info def_stmt_info
+                = vinfo_for_stmt (SSA_NAME_DEF_STMT (this_unprom->op));
+              nops = vect_widened_op_tree (def_stmt_info, code, widened_code,
+                                           shift_p, max_nops, this_unprom,
+                                           common_type);
+              if (nops == 0)
+                return 0;
+
+              max_nops -= nops;
+            }
+          else
+            {
+              /* Make sure that the operand is narrower than the result.  */
+              if (TYPE_PRECISION (this_unprom->type) * 2
+                  > TYPE_PRECISION (type))
+                return 0;
+
+              /* Update COMMON_TYPE for the new operand.  */
+              if (i == 0)
+                *common_type = this_unprom->type;
+              else if (!vect_joust_widened_type (type, this_unprom->type,
+                                                 common_type))
+                return 0;
+            }
+        }
+      next_op += nops;
+    }
+  return next_op;
+}
+
 /* Helper to return a new temporary for pattern of TYPE for STMT.  If STMT
    is NULL, the caller must set SSA_NAME_DEF_STMT for the returned SSA var. */
@@ -262,6 +600,81 @@ vect_recog_temp_ssa_var (tree type, gimple *stmt)
   return make_temp_ssa_name (type, stmt, "patt");
 }
 
+/* Convert UNPROM to TYPE and return the result, adding new statements
+   to STMT_INFO's pattern definition statements if no better way is
+   available.  VECTYPE is the vector form of TYPE.  */
+
+static tree
+vect_convert_input (stmt_vec_info stmt_info, tree type,
+                    vect_unpromoted_value *unprom, tree vectype)
+{
+  /* Check for a no-op conversion.  */
+  if (types_compatible_p (type, TREE_TYPE (unprom->op)))
+    return unprom->op;
+
+  /* Allow the caller to create constant vect_unpromoted_values.  */
+  if (TREE_CODE (unprom->op) == INTEGER_CST)
+    return wide_int_to_tree (type, wi::to_widest (unprom->op));
+
+  /* See if we can reuse an existing result.  */
+  if (unprom->caster)
+    {
+      tree lhs = gimple_get_lhs (unprom->caster->stmt);
+      if (types_compatible_p (TREE_TYPE (lhs), type))
+        return lhs;
+    }
+
+  /* We need a new conversion statement.  */
+  tree new_op = vect_recog_temp_ssa_var (type, NULL);
+  gassign *new_stmt = gimple_build_assign (new_op, NOP_EXPR, unprom->op);
+
+  /* As a (common) last resort, add the statement to the pattern itself.  */
+  append_pattern_def_seq (stmt_info, new_stmt, vectype);
+  return new_op;
+}
+
+/* Invoke vect_convert_input for N elements of UNPROM and store the
+   result in the corresponding elements of RESULT.  */
+
+static void
+vect_convert_inputs (stmt_vec_info stmt_info, unsigned int n,
+                     tree *result, tree type, vect_unpromoted_value *unprom,
+                     tree vectype)
+{
+  for (unsigned int i = 0; i < n; ++i)
+    {
+      unsigned int j;
+      for (j = 0; j < i; ++j)
+        if (unprom[j].op == unprom[i].op)
+          break;
+      if (j < i)
+        result[i] = result[j];
+      else
+        result[i] = vect_convert_input (stmt_info, type, &unprom[i], vectype);
+    }
+}
+
+/* The caller has created a (possibly empty) sequence of pattern definition
+   statements followed by a single statement PATTERN_STMT.  Cast the result
+   of this final statement to TYPE.  If a new statement is needed, add
+   PATTERN_STMT to the end of STMT_INFO's pattern definition statements
+   and return the new statement, otherwise return PATTERN_STMT as-is.
+
+   VECITYPE is the vector form of PATTERN_STMT's result type.  */
+
+static gimple *
+vect_convert_output (stmt_vec_info stmt_info, tree type, gimple *pattern_stmt,
+                     tree vecitype)
+{
+  tree lhs = gimple_get_lhs (pattern_stmt);
+  if (!types_compatible_p (type, TREE_TYPE (lhs)))
+    {
+      append_pattern_def_seq (stmt_info, pattern_stmt, vecitype);
+      tree cast_var = vect_recog_temp_ssa_var (type, NULL);
+      pattern_stmt = gimple_build_assign (cast_var, NOP_EXPR, lhs);
+    }
+  return pattern_stmt;
+}
+
 /* Return true if STMT_VINFO describes a reduction for which reassociation
    is allowed.  If STMT_INFO is part of a group, assume that it's part of
    a reduction chain and optimistically assume that all statements
@@ -351,16 +764,13 @@ vect_reassociating_reduction_p (stmt_vec_info stmt_info, tree_code code,
 static gimple *
 vect_recog_dot_prod_pattern (vec<gimple *> *stmts, tree *type_out)
 {
-  gimple *stmt, *last_stmt = (*stmts)[0];
+  gimple *last_stmt = (*stmts)[0];
   tree oprnd0, oprnd1;
-  tree oprnd00, oprnd01;
   stmt_vec_info stmt_vinfo = vinfo_for_stmt (last_stmt);
   vec_info *vinfo = stmt_vinfo->vinfo;
   tree type, half_type;
   gimple *pattern_stmt;
-  tree prod_type;
   tree var;
-  bool promotion;
 
   /* Look for the following pattern
           DX = (TYPE1) X;
@@ -391,79 +801,50 @@ vect_recog_dot_prod_pattern (vec<gimple *> *stmts, tree *type_out)
     return NULL;
 
   type = gimple_expr_type (last_stmt);
-  stmt = last_stmt;
 
-  gimple *def_stmt;
-  if (type_conversion_p (oprnd0, stmt, true, &half_type, &def_stmt,
-                         &promotion)
-      && promotion)
-    {
-      stmt = def_stmt;
-      oprnd0 = gimple_assign_rhs1 (stmt);
-    }
-  else
-    half_type = type;
+  vect_unpromoted_value unprom_mult;
+  oprnd0 = vect_look_through_possible_promotion (vinfo, oprnd0, &unprom_mult);
 
   /* So far so good.  Since last_stmt was detected as a (summation) reduction,
      we know that oprnd1 is the reduction variable (defined by a loop-header
      phi), and oprnd0 is an ssa-name defined by a stmt in the loop body.
      Left to check that oprnd0 is defined by a (widen_)mult_expr  */
-  if (TREE_CODE (oprnd0) != SSA_NAME)
+  if (!oprnd0)
     return NULL;
 
-  prod_type = half_type;
   stmt_vec_info mult_vinfo = vect_get_internal_def (vinfo, oprnd0);
   if (!mult_vinfo)
     return NULL;
 
   /* FORNOW.  Can continue analyzing the def-use chain when this stmt in a phi
      inside the loop (in case we are analyzing an outer-loop).  */
-  gassign *mult = dyn_cast <gassign *> (mult_vinfo->stmt);
-  if (!mult)
+  vect_unpromoted_value unprom0[2];
+  if (!vect_widened_op_tree (mult_vinfo, MULT_EXPR, WIDEN_MULT_EXPR,
+                             false, 2, unprom0, &half_type))
     return NULL;
-  if (gimple_assign_rhs_code (mult) == WIDEN_MULT_EXPR)
-    {
-      /* Has been detected as a widening multiplication?  */
-      oprnd00 = gimple_assign_rhs1 (mult);
-      oprnd01 = gimple_assign_rhs2 (mult);
-    }
-  else
-    {
-      tree half_type0, half_type1;
-      gimple *def_stmt;
-      tree oprnd0, oprnd1;
-
-      if (gimple_assign_rhs_code (mult) != MULT_EXPR)
-        return NULL;
-
-      oprnd0 = gimple_assign_rhs1 (mult);
-      oprnd1 = gimple_assign_rhs2 (mult);
-
-      if (!type_conversion_p (oprnd0, mult, true, &half_type0, &def_stmt,
-                              &promotion)
-          || !promotion)
-        return NULL;
-      oprnd00 = gimple_assign_rhs1 (def_stmt);
-
-      if (!type_conversion_p (oprnd1, mult, true, &half_type1, &def_stmt,
-                              &promotion)
-          || !promotion)
-        return NULL;
-      oprnd01 = gimple_assign_rhs1 (def_stmt);
-
-      if (!types_compatible_p (half_type0, half_type1))
-        return NULL;
-      if (TYPE_PRECISION (prod_type) != TYPE_PRECISION (half_type0) * 2)
-        return NULL;
-    }
+
+  /* If there are two widening operations, make sure they agree on
+     the sign of the extension.  */
+  if (TYPE_PRECISION (unprom_mult.type) != TYPE_PRECISION (type)
+      && TYPE_SIGN (unprom_mult.type) != TYPE_SIGN (half_type))
+    return NULL;
 
   vect_pattern_detected ("vect_recog_dot_prod_pattern", last_stmt);
 
-  half_type = TREE_TYPE (oprnd00);
+  tree half_vectype;
   if (!vect_supportable_direct_optab_p (type, DOT_PROD_EXPR, half_type,
-                                        type_out))
+                                        type_out, &half_vectype))
     return NULL;
 
+  /* Get the inputs in the appropriate types.  */
+  STMT_VINFO_PATTERN_DEF_SEQ (stmt_vinfo) = NULL;
+  tree mult_oprnd[2];
+  vect_convert_inputs (stmt_vinfo, 2, mult_oprnd, half_type,
+                       unprom0, half_vectype);
+
   var = vect_recog_temp_ssa_var (type, NULL);
   pattern_stmt = gimple_build_assign (var, DOT_PROD_EXPR,
-                                      oprnd00, oprnd01, oprnd1);
+                                      mult_oprnd[0], mult_oprnd[1], oprnd1);
 
   return pattern_stmt;
 }
@@ -510,11 +891,9 @@ static gimple *
 vect_recog_sad_pattern (vec<gimple *> *stmts, tree *type_out)
 {
   gimple *last_stmt = (*stmts)[0];
-  tree sad_oprnd0, sad_oprnd1;
   stmt_vec_info stmt_vinfo = vinfo_for_stmt (last_stmt);
   vec_info *vinfo = stmt_vinfo->vinfo;
   tree half_type;
-  bool promotion;
 
   /* Look for the following pattern
           DX = (TYPE1) X;
@@ -550,24 +929,23 @@ vect_recog_sad_pattern (vec<gimple *> *stmts, tree *type_out)
   tree sum_type = gimple_expr_type (last_stmt);
 
-  /* The type conversion could be promotion, demotion,
-     or just signed -> unsigned.  */
-  gimple *def_stmt;
-  if (type_conversion_p (plus_oprnd0, last_stmt, false,
-                         &half_type, &def_stmt, &promotion))
-    plus_oprnd0 = gimple_assign_rhs1 (def_stmt);
-  else
-    half_type = sum_type;
+  /* Any non-truncating sequence of conversions is OK here, since
+     with a successful match, the result of the ABS(U) is known to fit
+     within the nonnegative range of the result type.  (It cannot be the
+     negative of the minimum signed value due to the range of the widening
+     MINUS_EXPR.)  */
+  vect_unpromoted_value unprom_abs;
+  plus_oprnd0 = vect_look_through_possible_promotion (vinfo, plus_oprnd0,
+                                                      &unprom_abs);
 
   /* So far so good.  Since last_stmt was detected as a (summation) reduction,
      we know that plus_oprnd1 is the reduction variable (defined by a loop-header
     phi), and plus_oprnd0 is an ssa-name defined by a stmt in the loop body.
     Then check that plus_oprnd0 is defined by an abs_expr.  */
 
-  if (TREE_CODE (plus_oprnd0) != SSA_NAME)
+  if (!plus_oprnd0)
    return NULL;
 
-  tree abs_type = half_type;
   stmt_vec_info abs_stmt_vinfo = vect_get_internal_def (vinfo, plus_oprnd0);
   if (!abs_stmt_vinfo)
     return NULL;
@@ -581,334 +959,114 @@ vect_recog_sad_pattern (vec<gimple *> *stmts, tree *type_out)
     return NULL;
 
   tree abs_oprnd = gimple_assign_rhs1 (abs_stmt);
+  tree abs_type = TREE_TYPE (abs_oprnd);
   if (TYPE_UNSIGNED (abs_type))
     return NULL;
 
-  /* We then detect if the operand of abs_expr is defined by a minus_expr.  */
-
-  if (TREE_CODE (abs_oprnd) != SSA_NAME)
+  /* Peel off conversions from the ABS input.  This can involve sign
+     changes (e.g. from an unsigned subtraction to a signed ABS input)
+     or signed promotion, but it can't include unsigned promotion.
+     (Note that ABS of an unsigned promotion should have been folded
+     away before now anyway.)  */
+  vect_unpromoted_value unprom_diff;
+  abs_oprnd = vect_look_through_possible_promotion (vinfo, abs_oprnd,
+                                                    &unprom_diff);
+  if (!abs_oprnd)
+    return NULL;
+  if (TYPE_PRECISION (unprom_diff.type) != TYPE_PRECISION (abs_type)
+      && TYPE_UNSIGNED (unprom_diff.type))
     return NULL;
 
+  /* We then detect if the operand of abs_expr is defined by a minus_expr.  */
   stmt_vec_info diff_stmt_vinfo = vect_get_internal_def (vinfo, abs_oprnd);
   if (!diff_stmt_vinfo)
     return NULL;
 
   /* FORNOW.  Can continue analyzing the def-use chain when this stmt in a phi
      inside the loop (in case we are analyzing an outer-loop).  */
-  gassign *diff_stmt = dyn_cast <gassign *> (diff_stmt_vinfo->stmt);
-  if (!diff_stmt || gimple_assign_rhs_code (diff_stmt) != MINUS_EXPR)
-    return NULL;
-
-  tree half_type0, half_type1;
-  gimple *def_stmt;
-
-  tree minus_oprnd0 = gimple_assign_rhs1 (diff_stmt);
-  tree minus_oprnd1 = gimple_assign_rhs2 (diff_stmt);
-
-  if (!type_conversion_p (minus_oprnd0, diff_stmt, false,
-                          &half_type0, &def_stmt, &promotion)
-      || !promotion)
-    return NULL;
-  sad_oprnd0 = gimple_assign_rhs1 (def_stmt);
-
-  if (!type_conversion_p (minus_oprnd1, diff_stmt, false,
-                          &half_type1, &def_stmt, &promotion)
-      || !promotion)
-    return NULL;
-  sad_oprnd1 = gimple_assign_rhs1 (def_stmt);
-
-  if (!types_compatible_p (half_type0, half_type1))
-    return NULL;
-  if (TYPE_PRECISION (abs_type) < TYPE_PRECISION (half_type0) * 2
-      || TYPE_PRECISION (sum_type) < TYPE_PRECISION (half_type0) * 2)
+  vect_unpromoted_value unprom[2];
+  if (!vect_widened_op_tree (diff_stmt_vinfo, MINUS_EXPR, MINUS_EXPR,
+                             false, 2, unprom, &half_type))
     return NULL;
 
   vect_pattern_detected ("vect_recog_sad_pattern", last_stmt);
 
-  if (!vect_supportable_direct_optab_p (sum_type, SAD_EXPR,
-                                        TREE_TYPE (sad_oprnd0), type_out))
+  tree half_vectype;
+  if (!vect_supportable_direct_optab_p (sum_type, SAD_EXPR, half_type,
+                                        type_out, &half_vectype))
     return NULL;
 
+  /* Get the inputs to the SAD_EXPR in the appropriate types.  */
+  STMT_VINFO_PATTERN_DEF_SEQ (stmt_vinfo) = NULL;
+  tree sad_oprnd[2];
+  vect_convert_inputs (stmt_vinfo, 2, sad_oprnd, half_type,
+                       unprom, half_vectype);
+
   tree var = vect_recog_temp_ssa_var (sum_type, NULL);
-  gimple *pattern_stmt = gimple_build_assign (var, SAD_EXPR, sad_oprnd0,
-                                              sad_oprnd1, plus_oprnd1);
+  gimple *pattern_stmt = gimple_build_assign (var, SAD_EXPR, sad_oprnd[0],
+                                              sad_oprnd[1], plus_oprnd1);
 
   return pattern_stmt;
 }
 
-/* Handle widening operation by a constant.  At the moment we support MULT_EXPR
-   and LSHIFT_EXPR.
-
-   For MULT_EXPR we check that CONST_OPRND fits HALF_TYPE, and for LSHIFT_EXPR
-   we check that CONST_OPRND is less or equal to the size of HALF_TYPE.
-
-   Otherwise, if the type of the result (TYPE) is at least 4 times bigger than
-   HALF_TYPE, and there is an intermediate type (2 times smaller than TYPE)
-   that satisfies the above restrictions, we can perform a widening opeartion
-   from the intermediate type to TYPE and replace a_T = (TYPE) a_t;
-   with a_it = (interm_type) a_t;  Store such operation in *WSTMT.  */
-
-static bool
-vect_handle_widen_op_by_const (gimple *stmt, enum tree_code code,
-                               tree const_oprnd, tree *oprnd,
-                               gimple **wstmt, tree type,
-                               tree *half_type, gimple *def_stmt)
-{
-  tree new_type, new_oprnd;
-
-  if (code != MULT_EXPR && code != LSHIFT_EXPR)
-    return false;
-
-  if (((code == MULT_EXPR && int_fits_type_p (const_oprnd, *half_type))
-        || (code == LSHIFT_EXPR
-            && compare_tree_int (const_oprnd, TYPE_PRECISION (*half_type))
-               != 1))
-      && TYPE_PRECISION (type) == (TYPE_PRECISION (*half_type) * 2))
-    {
-      /* CONST_OPRND is a constant of HALF_TYPE.  */
-      *oprnd = gimple_assign_rhs1 (def_stmt);
-      return true;
-    }
-
-  if (TYPE_PRECISION (type) < (TYPE_PRECISION (*half_type) * 4))
-    return false;
-
-  if (!vect_same_loop_or_bb_p (stmt, def_stmt))
-    return false;
-
-  /* TYPE is 4 times bigger than HALF_TYPE, try widening operation for
-     a type 2 times bigger than HALF_TYPE.  */
-  new_type = build_nonstandard_integer_type (TYPE_PRECISION (type) / 2,
-                                             TYPE_UNSIGNED (type));
-  if ((code == MULT_EXPR && !int_fits_type_p (const_oprnd, new_type))
-      || (code == LSHIFT_EXPR
-          && compare_tree_int (const_oprnd, TYPE_PRECISION (new_type)) == 1))
-    return false;
-
-  /* Use NEW_TYPE for widening operation and create a_T = (NEW_TYPE) a_t;  */
-  *oprnd = gimple_assign_rhs1 (def_stmt);
-  new_oprnd = make_ssa_name (new_type);
-  *wstmt = gimple_build_assign (new_oprnd, NOP_EXPR, *oprnd);
-  *oprnd = new_oprnd;
-  *half_type = new_type;
-
-  return true;
-}
-
-/* Function vect_recog_widen_mult_pattern
-
-   Try to find the following pattern:
-
-     type1 a_t;
-     type2 b_t;
-     TYPE a_T, b_T, prod_T;
-
-     S1  a_t = ;
-     S2  b_t = ;
-     S3  a_T = (TYPE) a_t;
-     S4  b_T = (TYPE) b_t;
-     S5  prod_T = a_T * b_T;
-
-   where type 'TYPE' is at least double the size of type 'type1' and 'type2'.
-
-   Also detect unsigned cases:
-
-     unsigned type1 a_t;
-     unsigned type2 b_t;
-     unsigned TYPE u_prod_T;
-     TYPE a_T, b_T, prod_T;
-
-     S1  a_t = ;
-     S2  b_t = ;
-     S3  a_T = (TYPE) a_t;
-     S4  b_T = (TYPE) b_t;
-     S5  prod_T = a_T * b_T;
-     S6  u_prod_T = (unsigned TYPE) prod_T;
-
-   and multiplication by constants:
-
-     type a_t;
-     TYPE a_T, prod_T;
-
-     S1  a_t = ;
-     S3  a_T = (TYPE) a_t;
-     S5  prod_T = a_T * CONST;
-
-   A special case of multiplication by constants is when 'TYPE' is 4 times
-   bigger than 'type', but CONST fits an intermediate type 2 times smaller
-   than 'TYPE'.  In that case we create an additional pattern stmt for S3
-   to create a variable of the intermediate type, and perform widen-mult
-   on the intermediate type as well:
-
-     type a_t;
-     interm_type a_it;
-     TYPE a_T, prod_T, prod_T';
-
-     S1  a_t = ;
-     S3  a_T = (TYPE) a_t;
-           '--> a_it = (interm_type) a_t;
-     S5  prod_T = a_T * CONST;
-           '--> prod_T' = a_it w* CONST;
-
-   Input/Output:
-
-   * STMTS: Contains a stmt from which the pattern search begins.  In the
-   example, when this function is called with S5, the pattern {S3,S4,S5,(S6)}
-   is detected.  In case of unsigned widen-mult, the original stmt (S5) is
-   replaced with S6 in STMTS.  In case of multiplication by a constant
-   of an intermediate type (the last case above), STMTS also contains S3
-   (inserted before S5).
-
-   Output:
-
-   * TYPE_OUT: The type of the output of this pattern.
-
-   * Return value: A new stmt that will be used to replace the sequence of
-   stmts that constitute the pattern.  In this case it will be:
-        WIDEN_MULT <a_t, b_t>
-   If the result of WIDEN_MULT needs to be converted to a larger type, the
-   returned stmt will be this type conversion stmt.  */
+/* Recognize an operation that performs ORIG_CODE on widened inputs,
+   so that it can be treated as though it had the form:
+
+      A_TYPE a;
+      B_TYPE b;
+      HALF_TYPE a_cast = (HALF_TYPE) a;  // possible no-op
+      HALF_TYPE b_cast = (HALF_TYPE) b;  // possible no-op
+    | RES_TYPE a_extend = (RES_TYPE) a_cast;  // promotion from HALF_TYPE
+    | RES_TYPE b_extend = (RES_TYPE) b_cast;  // promotion from HALF_TYPE
+    | RES_TYPE res = a_extend ORIG_CODE b_extend;
+
+   Try to replace the pattern with:
+
+      A_TYPE a;
+      B_TYPE b;
+      HALF_TYPE a_cast = (HALF_TYPE) a;  // possible no-op
+      HALF_TYPE b_cast = (HALF_TYPE) b;  // possible no-op
+    | EXT_TYPE ext = a_cast WIDE_CODE b_cast;
+    | RES_TYPE res = (EXT_TYPE) ext;  // possible no-op
+
+   where EXT_TYPE is wider than HALF_TYPE but has the same signedness.
+
+   SHIFT_P is true if ORIG_CODE and WIDE_CODE are shifts.  NAME is the
+   name of the pattern being matched, for dump purposes.  */
 
 static gimple *
-vect_recog_widen_mult_pattern (vec<gimple *> *stmts, tree *type_out)
+vect_recog_widen_op_pattern (vec<gimple *> *stmts, tree *type_out,
+                             tree_code orig_code, tree_code wide_code,
+                             bool shift_p, const char *name)
 {
   gimple *last_stmt = stmts->pop ();
-  gimple *def_stmt0, *def_stmt1;
-  tree oprnd0, oprnd1;
-  tree type, half_type0, half_type1;
-  gimple *new_stmt = NULL, *pattern_stmt = NULL;
-  tree vectype, vecitype;
-  tree var;
-  enum tree_code dummy_code;
-  int dummy_int;
-  vec<tree> dummy_vec;
-  bool op1_ok;
-  bool promotion;
-
-  if (!is_gimple_assign (last_stmt))
-    return NULL;
-
-  type = gimple_expr_type (last_stmt);
-
-  /* Starting from LAST_STMT, follow the defs of its uses in search
-     of the above pattern.  */
-
-  if (gimple_assign_rhs_code (last_stmt) != MULT_EXPR)
+  stmt_vec_info last_stmt_info = vinfo_for_stmt (last_stmt);
+
+  vect_unpromoted_value unprom[2];
+  tree half_type;
+  if (!vect_widened_op_tree (last_stmt_info, orig_code, orig_code,
+                             shift_p, 2, unprom, &half_type))
     return NULL;
 
-  oprnd0 = gimple_assign_rhs1 (last_stmt);
-  oprnd1 = gimple_assign_rhs2 (last_stmt);
-
-  /* Check argument 0.  */
-  if (!type_conversion_p (oprnd0, last_stmt, false, &half_type0, &def_stmt0,
-                          &promotion)
-      || !promotion)
-    return NULL;
-
-  /* Check argument 1.  */
-  op1_ok = type_conversion_p (oprnd1, last_stmt, false, &half_type1,
-                              &def_stmt1, &promotion);
-
-  if (op1_ok && promotion)
-    {
-      oprnd0 = gimple_assign_rhs1 (def_stmt0);
-      oprnd1 = gimple_assign_rhs1 (def_stmt1);
-    }
-  else
-    {
-      if (TREE_CODE (oprnd1) == INTEGER_CST
-          && TREE_CODE (half_type0) == INTEGER_TYPE
-          && vect_handle_widen_op_by_const (last_stmt, MULT_EXPR, oprnd1,
-                                            &oprnd0, &new_stmt, type,
-                                            &half_type0, def_stmt0))
-        {
-          half_type1 = half_type0;
-          oprnd1 = fold_convert (half_type1, oprnd1);
-        }
-      else
-        return NULL;
-    }
-
-  /* If the two arguments have different sizes, convert the one with
-     the smaller type into the larger type.  */
-  if (TYPE_PRECISION (half_type0) != TYPE_PRECISION (half_type1))
-    {
-      /* If we already used up the single-stmt slot give up.  */
-      if (new_stmt)
-        return NULL;
-
-      tree* oprnd = NULL;
-      gimple *def_stmt = NULL;
-
-      if (TYPE_PRECISION (half_type0) < TYPE_PRECISION (half_type1))
-        {
-          def_stmt = def_stmt0;
-          half_type0 = half_type1;
-          oprnd = &oprnd0;
-        }
-      else
-        {
-          def_stmt = def_stmt1;
-          half_type1 = half_type0;
-          oprnd = &oprnd1;
-        }
-
-      tree old_oprnd = gimple_assign_rhs1 (def_stmt);
-      tree new_oprnd = make_ssa_name (half_type0);
-      new_stmt = gimple_build_assign (new_oprnd, NOP_EXPR, old_oprnd);
-      *oprnd = new_oprnd;
-    }
-
-  /* Handle unsigned case.  Look for
-     S6  u_prod_T = (unsigned TYPE) prod_T;
-     Use unsigned TYPE as the type for WIDEN_MULT_EXPR.  */
-  if (TYPE_UNSIGNED (type) != TYPE_UNSIGNED (half_type0))
-    {
-      gimple *use_stmt;
-      tree use_lhs;
-      tree use_type;
-
-      if (TYPE_UNSIGNED (type) == TYPE_UNSIGNED (half_type1))
-        return NULL;
-
-      use_stmt = vect_single_imm_use (last_stmt);
-      if (!use_stmt || !is_gimple_assign (use_stmt)
-          || !CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (use_stmt)))
-        return NULL;
-
-      use_lhs = gimple_assign_lhs (use_stmt);
-      use_type = TREE_TYPE (use_lhs);
-      if (!INTEGRAL_TYPE_P (use_type)
-          || (TYPE_UNSIGNED (type) == TYPE_UNSIGNED (use_type))
-          || (TYPE_PRECISION (type) != TYPE_PRECISION (use_type)))
-        return NULL;
-
-      type = use_type;
-      last_stmt = use_stmt;
-    }
+  /* Pattern detected.  */
+  vect_pattern_detected (name, last_stmt);
 
-  if (!types_compatible_p (half_type0, half_type1))
-    return NULL;
-
-  /* If TYPE is more than twice larger than HALF_TYPE, we use WIDEN_MULT
-     to get an intermediate result of type ITYPE.  In this case we need
-     to build a statement to convert this intermediate result to type TYPE.  */
+  tree type = gimple_expr_type (last_stmt);
   tree itype = type;
-  if (TYPE_PRECISION (type) > TYPE_PRECISION (half_type0) * 2)
-    itype = build_nonstandard_integer_type
-              (GET_MODE_BITSIZE (SCALAR_TYPE_MODE (half_type0)) * 2,
-               TYPE_UNSIGNED (type));
-
-  /* Pattern detected.  */
-  vect_pattern_detected ("vect_recog_widen_mult_pattern", last_stmt);
+  if (TYPE_PRECISION (type) != TYPE_PRECISION (half_type) * 2
+      || TYPE_UNSIGNED (type) != TYPE_UNSIGNED (half_type))
+    itype = build_nonstandard_integer_type (TYPE_PRECISION (half_type) * 2,
+                                            TYPE_UNSIGNED (half_type));
 
   /* Check target support  */
-  vectype = get_vectype_for_scalar_type (half_type0);
-  vecitype = get_vectype_for_scalar_type (itype);
+  tree vectype = get_vectype_for_scalar_type (half_type);
+  tree vecitype = get_vectype_for_scalar_type (itype);
+  enum tree_code dummy_code;
+  int dummy_int;
+  auto_vec<tree> dummy_vec;
   if (!vectype
       || !vecitype
-      || !supportable_widening_operation (WIDEN_MULT_EXPR, last_stmt,
+      || !supportable_widening_operation (wide_code, last_stmt,
                                           vecitype, vectype,
                                           &dummy_code, &dummy_code,
                                           &dummy_int, &dummy_vec))
...@@ -918,43 +1076,28 @@ vect_recog_widen_mult_pattern (vec<gimple *> *stmts, tree *type_out) ...@@ -918,43 +1076,28 @@ vect_recog_widen_mult_pattern (vec<gimple *> *stmts, tree *type_out)
if (!*type_out) if (!*type_out)
return NULL; return NULL;
-  /* Pattern supported.  Create a stmt to be used to replace the pattern: */
-  var = vect_recog_temp_ssa_var (itype, NULL);
-  pattern_stmt = gimple_build_assign (var, WIDEN_MULT_EXPR, oprnd0, oprnd1);
-  stmt_vec_info stmt_vinfo = vinfo_for_stmt (last_stmt);
-  STMT_VINFO_PATTERN_DEF_SEQ (stmt_vinfo) = NULL;
-
-  /* If the original two operands have different sizes, we may need to convert
-     the smaller one into the larger type.  If this is the case, at this point
-     the new stmt is already built.  */
-  if (new_stmt)
-    {
-      append_pattern_def_seq (stmt_vinfo, new_stmt);
-      stmt_vec_info new_stmt_info
-	= new_stmt_vec_info (new_stmt, stmt_vinfo->vinfo);
-      set_vinfo_for_stmt (new_stmt, new_stmt_info);
-      STMT_VINFO_VECTYPE (new_stmt_info) = vectype;
-    }
-
-  /* If ITYPE is not TYPE, we need to build a type conversion stmt to convert
-     the result of the widen-mult operation into type TYPE.  */
-  if (itype != type)
-    {
-      append_pattern_def_seq (stmt_vinfo, pattern_stmt);
-      stmt_vec_info pattern_stmt_info
-	= new_stmt_vec_info (pattern_stmt, stmt_vinfo->vinfo);
-      set_vinfo_for_stmt (pattern_stmt, pattern_stmt_info);
-      STMT_VINFO_VECTYPE (pattern_stmt_info) = vecitype;
-      pattern_stmt = gimple_build_assign (vect_recog_temp_ssa_var (type, NULL),
-					  NOP_EXPR,
-					  gimple_assign_lhs (pattern_stmt));
-    }
+  STMT_VINFO_PATTERN_DEF_SEQ (last_stmt_info) = NULL;
+  tree oprnd[2];
+  vect_convert_inputs (last_stmt_info, 2, oprnd, half_type, unprom, vectype);
+
+  tree var = vect_recog_temp_ssa_var (itype, NULL);
+  gimple *pattern_stmt = gimple_build_assign (var, wide_code,
+					      oprnd[0], oprnd[1]);

   stmts->safe_push (last_stmt);
-  return pattern_stmt;
+  return vect_convert_output (last_stmt_info, type, pattern_stmt, vecitype);
 }
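
To make the new flow concrete, here is a constructed example (not taken
from the patch itself): multiplying two unsigned char values into an int
result gives half_type = unsigned char, so ITYPE becomes a 16-bit unsigned
type and vect_convert_output adds the final conversion:

  unsigned char a, b;
  int c;
  c = (int) a * (int) b;
    '--> tmp = a w* b;     (WIDEN_MULT_EXPR on the 16-bit unsigned itype)
    '--> c = (int) tmp;    (added by vect_convert_output)
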
+/* Try to detect multiplication on widened inputs, converting MULT_EXPR
+   to WIDEN_MULT_EXPR.  See vect_recog_widen_op_pattern for details.  */
+
+static gimple *
+vect_recog_widen_mult_pattern (vec<gimple *> *stmts, tree *type_out)
+{
+  return vect_recog_widen_op_pattern (stmts, type_out, MULT_EXPR,
+				      WIDEN_MULT_EXPR, false,
+				      "vect_recog_widen_mult_pattern");
+}
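
For reference, a loop of the following shape (a constructed example, not
one of the testsuite changes) is what this wrapper now hands to the common
routine:

  short a[N], b[N];
  int c[N];
  for (int i = 0; i < N; ++i)
    c[i] = (int) a[i] * (int) b[i];
      '--> c[i] = a[i] w* b[i];
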
 /* Function vect_recog_pow_pattern
@@ -1147,13 +1290,13 @@ vect_recog_pow_pattern (vec<gimple *> *stmts, tree *type_out)
 static gimple *
 vect_recog_widen_sum_pattern (vec<gimple *> *stmts, tree *type_out)
 {
-  gimple *stmt, *last_stmt = (*stmts)[0];
+  gimple *last_stmt = (*stmts)[0];
   tree oprnd0, oprnd1;
   stmt_vec_info stmt_vinfo = vinfo_for_stmt (last_stmt);
-  tree type, half_type;
+  vec_info *vinfo = stmt_vinfo->vinfo;
+  tree type;
   gimple *pattern_stmt;
   tree var;
-  bool promotion;

   /* Look for the following pattern
	  DX = (TYPE) X;
@@ -1177,21 +1320,19 @@ vect_recog_widen_sum_pattern (vec<gimple *> *stmts, tree *type_out)
      Left to check that oprnd0 is defined by a cast from type 'type' to type
      'TYPE'.  */

-  if (!type_conversion_p (oprnd0, last_stmt, true, &half_type, &stmt,
-			  &promotion)
-      || !promotion)
+  vect_unpromoted_value unprom0;
+  if (!vect_look_through_possible_promotion (vinfo, oprnd0, &unprom0)
+      || TYPE_PRECISION (unprom0.type) * 2 > TYPE_PRECISION (type))
     return NULL;
-  oprnd0 = gimple_assign_rhs1 (stmt);

   vect_pattern_detected ("vect_recog_widen_sum_pattern", last_stmt);

-  if (!vect_supportable_direct_optab_p (type, WIDEN_SUM_EXPR, half_type,
+  if (!vect_supportable_direct_optab_p (type, WIDEN_SUM_EXPR, unprom0.type,
					type_out))
     return NULL;

   var = vect_recog_temp_ssa_var (type, NULL);
-  pattern_stmt = gimple_build_assign (var, WIDEN_SUM_EXPR, oprnd0, oprnd1);
+  pattern_stmt = gimple_build_assign (var, WIDEN_SUM_EXPR, unprom0.op, oprnd1);

   return pattern_stmt;
 }
@@ -1556,159 +1697,15 @@ vect_recog_over_widening_pattern (vec<gimple *> *stmts, tree *type_out)
   return pattern_stmt;
 }

-/* Detect widening shift pattern:
-
-   type a_t;
-   TYPE a_T, res_T;
-
-   S1 a_t = ;
-   S2 a_T = (TYPE) a_t;
-   S3 res_T = a_T << CONST;
-
-  where type 'TYPE' is at least double the size of type 'type'.
-
-  Also detect cases where the shift result is immediately converted
-  to another type 'result_type' that is no larger in size than 'TYPE'.
-  In those cases we perform a widen-shift that directly results in
-  'result_type', to avoid a possible over-widening situation:
-
-  type a_t;
-  TYPE a_T, res_T;
-  result_type res_result;
-
-  S1 a_t = ;
-  S2 a_T = (TYPE) a_t;
-  S3 res_T = a_T << CONST;
-  S4 res_result = (result_type) res_T;
-      '--> res_result' = a_t w<< CONST;
-
-  And a case when 'TYPE' is 4 times bigger than 'type'.  In that case we
-  create an additional pattern stmt for S2 to create a variable of an
-  intermediate type, and perform widen-shift on the intermediate type:
-
-  type a_t;
-  interm_type a_it;
-  TYPE a_T, res_T, res_T';
-
-  S1 a_t = ;
-  S2 a_T = (TYPE) a_t;
-      '--> a_it = (interm_type) a_t;
-  S3 res_T = a_T << CONST;
-      '--> res_T' = a_it <<* CONST;
-
-  Input/Output:
-
-  * STMTS: Contains a stmt from which the pattern search begins.
-    In case of unsigned widen-shift, the original stmt (S3) is replaced with S4
-    in STMTS.  When an intermediate type is used and a pattern statement is
-    created for S2, we also put S2 here (before S3).
-
-  Output:
-
-  * TYPE_OUT: The type of the output of this pattern.
-
-  * Return value: A new stmt that will be used to replace the sequence of
-    stmts that constitute the pattern.  In this case it will be:
-    WIDEN_LSHIFT_EXPR <a_t, CONST>.  */
+/* Try to detect a shift left of a widened input, converting LSHIFT_EXPR
+   to WIDEN_LSHIFT_EXPR.  See vect_recog_widen_op_pattern for details.  */

 static gimple *
 vect_recog_widen_shift_pattern (vec<gimple *> *stmts, tree *type_out)
 {
-  gimple *last_stmt = stmts->pop ();
-  gimple *def_stmt0;
-  tree oprnd0, oprnd1;
+  return vect_recog_widen_op_pattern (stmts, type_out, LSHIFT_EXPR,
+				      WIDEN_LSHIFT_EXPR, true,
+				      "vect_widen_shift_pattern");
-  tree type, half_type0;
-  gimple *pattern_stmt;
-  tree vectype, vectype_out = NULL_TREE;
-  tree var;
-  enum tree_code dummy_code;
-  int dummy_int;
-  vec<tree> dummy_vec;
-  gimple *use_stmt;
-  bool promotion;
-
-  if (!is_gimple_assign (last_stmt) || !vinfo_for_stmt (last_stmt))
-    return NULL;
-
-  if (gimple_assign_rhs_code (last_stmt) != LSHIFT_EXPR)
-    return NULL;
-
-  oprnd0 = gimple_assign_rhs1 (last_stmt);
-  oprnd1 = gimple_assign_rhs2 (last_stmt);
-  if (TREE_CODE (oprnd0) != SSA_NAME || TREE_CODE (oprnd1) != INTEGER_CST)
-    return NULL;
-
-  /* Check operand 0: it has to be defined by a type promotion.  */
-  if (!type_conversion_p (oprnd0, last_stmt, false, &half_type0, &def_stmt0,
-			  &promotion)
-      || !promotion)
-    return NULL;
-
-  /* Check operand 1: has to be positive.  We check that it fits the type
-     in vect_handle_widen_op_by_const ().  */
-  if (tree_int_cst_compare (oprnd1, size_zero_node) <= 0)
-    return NULL;
-
-  oprnd0 = gimple_assign_rhs1 (def_stmt0);
-  type = gimple_expr_type (last_stmt);
-
-  /* Check for subsequent conversion to another type.  */
-  use_stmt = vect_single_imm_use (last_stmt);
-  if (use_stmt && is_gimple_assign (use_stmt)
-      && CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (use_stmt))
-      && !STMT_VINFO_IN_PATTERN_P (vinfo_for_stmt (use_stmt)))
-    {
-      tree use_lhs = gimple_assign_lhs (use_stmt);
-      tree use_type = TREE_TYPE (use_lhs);
-
-      if (INTEGRAL_TYPE_P (use_type)
-	  && TYPE_PRECISION (use_type) <= TYPE_PRECISION (type))
-	{
-	  last_stmt = use_stmt;
-	  type = use_type;
-	}
-    }
-
-  /* Check if this is a widening operation.  */
-  gimple *wstmt = NULL;
-  if (!vect_handle_widen_op_by_const (last_stmt, LSHIFT_EXPR, oprnd1,
-				      &oprnd0, &wstmt,
-				      type, &half_type0, def_stmt0))
-    return NULL;
-
-  /* Pattern detected.  */
-  vect_pattern_detected ("vect_recog_widen_shift_pattern", last_stmt);
-
-  /* Check target support.  */
-  vectype = get_vectype_for_scalar_type (half_type0);
-  vectype_out = get_vectype_for_scalar_type (type);
-
-  if (!vectype
-      || !vectype_out
-      || !supportable_widening_operation (WIDEN_LSHIFT_EXPR, last_stmt,
					  vectype_out, vectype,
					  &dummy_code, &dummy_code,
					  &dummy_int, &dummy_vec))
-    return NULL;
-
-  *type_out = vectype_out;
-
-  /* Pattern supported.  Create a stmt to be used to replace the pattern.  */
-  var = vect_recog_temp_ssa_var (type, NULL);
-  pattern_stmt
-    = gimple_build_assign (var, WIDEN_LSHIFT_EXPR, oprnd0, oprnd1);
-
-  if (wstmt)
-    {
-      stmt_vec_info stmt_vinfo = vinfo_for_stmt (last_stmt);
-      new_pattern_def_seq (stmt_vinfo, wstmt);
-      stmt_vec_info new_stmt_info
-	= new_stmt_vec_info (wstmt, stmt_vinfo->vinfo);
-      set_vinfo_for_stmt (wstmt, new_stmt_info);
-      STMT_VINFO_VECTYPE (new_stmt_info) = vectype;
-    }
-
-  stmts->safe_push (last_stmt);
-  return pattern_stmt;
 }
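
A constructed example (not from the patch) of a statement the wrapper now
matches:

  unsigned char a[N];
  unsigned short r[N];
  for (int i = 0; i < N; ++i)
    r[i] = (unsigned short) a[i] << 3;
      '--> r[i] = a[i] w<< 3;
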
 /* Detect a rotate pattern that wouldn't otherwise be vectorized: