Commit ba3c3dc0 by Richard Sandiford

Fix simplify_shift_const_1 handling of vector shifts

simplify_shift_const_1 handles both shifts of scalars by scalars
and shifts of vectors by scalars.  For vectors this means that
each element is shifted by the same amount.
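
As a rough illustration of the vector case, here is a sketch in plain C
rather than GCC code. The `v4si` struct and `v4si_shift_left` are
hypothetical stand-ins for a four-element vector of 32-bit integers.
Every element is shifted by the same scalar count, so the count only
makes sense if it is smaller than the element width:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a vector of four 32-bit integers
   (roughly what GCC would call V4SImode).  */
typedef struct { uint32_t elt[4]; } v4si;

/* Shift every element left by the same COUNT.  COUNT must be smaller
   than the element width (32), not the full vector width (128).  */
static v4si
v4si_shift_left (v4si x, unsigned int count)
{
  assert (count < 32);
  for (int i = 0; i < 4; i++)
    x.elt[i] <<= count;
  return x;
}
```

Shifting `{1, 2, 3, 0x80000000}` left by 4 gives `{16, 32, 48, 0}`:
bits shifted out of one element are dropped and never carry into the
neighbouring element.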

However:

(a) the two cases weren't always distinguished, so we'd try
        things for vectors that only made sense for scalars.

(b) a lot of the range and bitcount checks were based on the
        bitsize or precision of the full shifted operand, rather
        than the mode of each element.

Fixing (b) accidentally exposed more optimisation opportunities,
although that wasn't the point of the patch.
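
The effect of (b) can be sketched outside GCC as follows. The
`mode_info` struct and the helper names are assumptions made for this
example and are not GCC's `machine_mode` API. The only point is that a
count can be in range for the whole operand while being out of range
for a single element.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical description of a mode: total width in bits and number
   of elements (1 for a scalar).  Not GCC's real machine_mode.  */
struct mode_info { unsigned int total_bits; unsigned int nunits; };

/* Width in bits of one element of the mode.  */
static unsigned int
unit_bits (struct mode_info m)
{
  assert (m.nunits != 0);
  return m.total_bits / m.nunits;
}

/* Check based on the width of the whole operand (the old behaviour
   described in (b)).  */
static bool
count_valid_full (struct mode_info m, int count)
{
  return count >= 0 && count < (int) m.total_bits;
}

/* Check based on the width of one element (the behaviour the patch
   aims for).  */
static bool
count_valid_unit (struct mode_info m, int count)
{
  return count >= 0 && count < (int) unit_bits (m);
}

/* Convert a rotate-right count into the equivalent rotate-left count,
   again using the element width.  */
static int
rotatert_to_rotate (struct mode_info m, int count)
{
  return (int) unit_bits (m) - count;
}
```

For a 128-bit vector of four 32-bit elements, a count of 40 passes
`count_valid_full` but fails `count_valid_unit`. For a scalar the two
checks agree, because the element is the whole operand.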

gcc/
2016-11-15  Richard Sandiford  <richard.sandiford@arm.com>
	    Alan Hayward  <alan.hayward@arm.com>
	    David Sherwood  <david.sherwood@arm.com>

	* combine.c (simplify_shift_const_1): Use the number of bits
	in the inner mode to determine the range of the shift.
	When handling shifts of vectors, skip any rules that apply
	only to scalars.

Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>

From-SVN: r242442
parent 89e64bc0
@@ -2,6 +2,15 @@
 	    Alan Hayward  <alan.hayward@arm.com>
 	    David Sherwood  <david.sherwood@arm.com>

+	* combine.c (simplify_shift_const_1): Use the number of bits
+	in the inner mode to determine the range of the shift.
+	When handling shifts of vectors, skip any rules that apply
+	only to scalars.
+
+2016-11-15  Richard Sandiford  <richard.sandiford@arm.com>
+	    Alan Hayward  <alan.hayward@arm.com>
+	    David Sherwood  <david.sherwood@arm.com>
+
 	* rtlanal.c (num_sign_bit_copies1): Calculate bitwidth after
 	handling VOIDmode.
@@ -10228,12 +10228,12 @@ simplify_shift_const_1 (enum rtx_code code, machine_mode result_mode,
      want to do this inside the loop as it makes it more difficult to
      combine shifts.  */
   if (SHIFT_COUNT_TRUNCATED)
-    orig_count &= GET_MODE_BITSIZE (mode) - 1;
+    orig_count &= GET_MODE_UNIT_BITSIZE (mode) - 1;

   /* If we were given an invalid count, don't do anything except exactly
      what was requested.  */

-  if (orig_count < 0 || orig_count >= (int) GET_MODE_PRECISION (mode))
+  if (orig_count < 0 || orig_count >= (int) GET_MODE_UNIT_PRECISION (mode))
     return NULL_RTX;

   count = orig_count;
@@ -10250,16 +10250,14 @@ simplify_shift_const_1 (enum rtx_code code, machine_mode result_mode,
       /* Convert ROTATERT to ROTATE.  */
       if (code == ROTATERT)
         {
-          unsigned int bitsize = GET_MODE_PRECISION (result_mode);
+          unsigned int bitsize = GET_MODE_UNIT_PRECISION (result_mode);
           code = ROTATE;
-          if (VECTOR_MODE_P (result_mode))
-            count = bitsize / GET_MODE_NUNITS (result_mode) - count;
-          else
           count = bitsize - count;
         }

       shift_mode = try_widen_shift_mode (code, varop, count, result_mode,
                                          mode, outer_op, outer_const);
+      machine_mode shift_unit_mode = GET_MODE_INNER (shift_mode);

       /* Handle cases where the count is greater than the size of the mode
          minus 1.  For ASHIFT, use the size minus one as the count (this can
@@ -10271,12 +10269,12 @@ simplify_shift_const_1 (enum rtx_code code, machine_mode result_mode,
          multiple operations, each of which are defined, we know what the
          result is supposed to be.  */

-      if (count > (GET_MODE_PRECISION (shift_mode) - 1))
+      if (count > (GET_MODE_PRECISION (shift_unit_mode) - 1))
         {
           if (code == ASHIFTRT)
-            count = GET_MODE_PRECISION (shift_mode) - 1;
+            count = GET_MODE_PRECISION (shift_unit_mode) - 1;
           else if (code == ROTATE || code == ROTATERT)
-            count %= GET_MODE_PRECISION (shift_mode);
+            count %= GET_MODE_PRECISION (shift_unit_mode);
           else
             {
               /* We can't simply return zero because there may be an
@@ -10292,11 +10290,13 @@ simplify_shift_const_1 (enum rtx_code code, machine_mode result_mode,
       if (complement_p)
         break;

+      if (shift_mode == shift_unit_mode)
+        {
           /* An arithmetic right shift of a quantity known to be -1 or 0
              is a no-op.  */
           if (code == ASHIFTRT
-          && (num_sign_bit_copies (varop, shift_mode)
-              == GET_MODE_PRECISION (shift_mode)))
+              && (num_sign_bit_copies (varop, shift_unit_mode)
+                  == GET_MODE_PRECISION (shift_unit_mode)))
             {
               count = 0;
               break;
@@ -10304,32 +10304,35 @@ simplify_shift_const_1 (enum rtx_code code, machine_mode result_mode,
           /* If we are doing an arithmetic right shift and discarding all but
              the sign bit copies, this is equivalent to doing a shift by the
-         bitsize minus one.  Convert it into that shift because it will often
-         allow other simplifications.  */
+             bitsize minus one.  Convert it into that shift because it will
+             often allow other simplifications.  */

           if (code == ASHIFTRT
-          && (count + num_sign_bit_copies (varop, shift_mode)
-              >= GET_MODE_PRECISION (shift_mode)))
-        count = GET_MODE_PRECISION (shift_mode) - 1;
+              && (count + num_sign_bit_copies (varop, shift_unit_mode)
+                  >= GET_MODE_PRECISION (shift_unit_mode)))
+            count = GET_MODE_PRECISION (shift_unit_mode) - 1;

           /* We simplify the tests below and elsewhere by converting
              ASHIFTRT to LSHIFTRT if we know the sign bit is clear.
              `make_compound_operation' will convert it to an ASHIFTRT for
              those machines (such as VAX) that don't have an LSHIFTRT.  */
           if (code == ASHIFTRT
-          && val_signbit_known_clear_p (shift_mode,
-                                       nonzero_bits (varop, shift_mode)))
+              && HWI_COMPUTABLE_MODE_P (shift_unit_mode)
+              && val_signbit_known_clear_p (shift_unit_mode,
+                                           nonzero_bits (varop,
+                                                         shift_unit_mode)))
             code = LSHIFTRT;

           if (((code == LSHIFTRT
-            && HWI_COMPUTABLE_MODE_P (shift_mode)
-            && !(nonzero_bits (varop, shift_mode) >> count))
+                && HWI_COMPUTABLE_MODE_P (shift_unit_mode)
+                && !(nonzero_bits (varop, shift_unit_mode) >> count))
                || (code == ASHIFT
-               && HWI_COMPUTABLE_MODE_P (shift_mode)
-               && !((nonzero_bits (varop, shift_mode) << count)
-                    & GET_MODE_MASK (shift_mode))))
+                   && HWI_COMPUTABLE_MODE_P (shift_unit_mode)
+                   && !((nonzero_bits (varop, shift_unit_mode) << count)
+                        & GET_MODE_MASK (shift_unit_mode))))
               && !side_effects_p (varop))
             varop = const0_rtx;
+        }

       switch (GET_CODE (varop))
         {
@@ -10346,6 +10349,10 @@ simplify_shift_const_1 (enum rtx_code code, machine_mode result_mode,
           break;

         case MEM:
+          /* The following rules apply only to scalars.  */
+          if (shift_mode != shift_unit_mode)
+            break;
+
           /* If we have (xshiftrt (mem ...) C) and C is MODE_WIDTH
              minus the width of a smaller mode, we can do this with a
              SIGN_EXTEND or ZERO_EXTEND from the narrower memory location.  */
@@ -10368,6 +10375,10 @@ simplify_shift_const_1 (enum rtx_code code, machine_mode result_mode,
           break;

         case SUBREG:
+          /* The following rules apply only to scalars.  */
+          if (shift_mode != shift_unit_mode)
+            break;
+
           /* If VAROP is a SUBREG, strip it as long as the inner operand has
              the same number of words as what we've seen so far.  Then store
              the widest mode in MODE.  */
@@ -10424,9 +10435,9 @@ simplify_shift_const_1 (enum rtx_code code, machine_mode result_mode,
              interpreted as the sign bit in a narrower mode, so, if
              the result is narrower, don't discard the shift.  */
           if (code == LSHIFTRT
-              && count == (GET_MODE_BITSIZE (result_mode) - 1)
-              && (GET_MODE_BITSIZE (result_mode)
-                  >= GET_MODE_BITSIZE (GET_MODE (varop))))
+              && count == (GET_MODE_UNIT_BITSIZE (result_mode) - 1)
+              && (GET_MODE_UNIT_BITSIZE (result_mode)
+                  >= GET_MODE_UNIT_BITSIZE (GET_MODE (varop))))
             {
               varop = XEXP (varop, 0);
               continue;
@@ -10437,14 +10448,17 @@ simplify_shift_const_1 (enum rtx_code code, machine_mode result_mode,
         case LSHIFTRT:
         case ASHIFT:
         case ROTATE:
+          /* The following rules apply only to scalars.  */
+          if (shift_mode != shift_unit_mode)
+            break;
+
           /* Here we have two nested shifts.  The result is usually the
              AND of a new shift with a mask.  We compute the result below.  */
           if (CONST_INT_P (XEXP (varop, 1))
               && INTVAL (XEXP (varop, 1)) >= 0
               && INTVAL (XEXP (varop, 1)) < GET_MODE_PRECISION (GET_MODE (varop))
               && HWI_COMPUTABLE_MODE_P (result_mode)
-              && HWI_COMPUTABLE_MODE_P (mode)
-              && !VECTOR_MODE_P (result_mode))
+              && HWI_COMPUTABLE_MODE_P (mode))
             {
               enum rtx_code first_code = GET_CODE (varop);
               unsigned int first_count = INTVAL (XEXP (varop, 1));
@@ -10610,7 +10624,8 @@ simplify_shift_const_1 (enum rtx_code code, machine_mode result_mode,
           break;

         case NOT:
-          if (VECTOR_MODE_P (mode))
+          /* The following rules apply only to scalars.  */
+          if (shift_mode != shift_unit_mode)
             break;

           /* Make this fit the case below.  */
@@ -10620,6 +10635,10 @@ simplify_shift_const_1 (enum rtx_code code, machine_mode result_mode,
         case IOR:
         case AND:
         case XOR:
+          /* The following rules apply only to scalars.  */
+          if (shift_mode != shift_unit_mode)
+            break;
+
           /* If we have (xshiftrt (ior (plus X (const_int -1)) X) C)
              with C the size of VAROP - 1 and the shift is logical if
              STORE_FLAG_VALUE is 1 and arithmetic if STORE_FLAG_VALUE is -1,
@@ -10696,6 +10715,10 @@ simplify_shift_const_1 (enum rtx_code code, machine_mode result_mode,
           break;

         case EQ:
+          /* The following rules apply only to scalars.  */
+          if (shift_mode != shift_unit_mode)
+            break;
+
           /* Convert (lshiftrt (eq FOO 0) C) to (xor FOO 1) if STORE_FLAG_VALUE
              says that the sign bit can be tested, FOO has mode MODE, C is
              GET_MODE_PRECISION (MODE) - 1, and FOO has only its low-order bit
@@ -10717,6 +10740,10 @@ simplify_shift_const_1 (enum rtx_code code, machine_mode result_mode,
           break;

         case NEG:
+          /* The following rules apply only to scalars.  */
+          if (shift_mode != shift_unit_mode)
+            break;
+
           /* (lshiftrt (neg A) C) where A is either 0 or 1 and C is one less
              than the number of bits in the mode is equivalent to A.  */
           if (code == LSHIFTRT
@@ -10740,6 +10767,10 @@ simplify_shift_const_1 (enum rtx_code code, machine_mode result_mode,
           break;

         case PLUS:
+          /* The following rules apply only to scalars.  */
+          if (shift_mode != shift_unit_mode)
+            break;
+
           /* (lshiftrt (plus A -1) C) where A is either 0 or 1 and C
              is one less than the number of bits in the mode is
              equivalent to (xor A 1).  */
@@ -10821,6 +10852,10 @@ simplify_shift_const_1 (enum rtx_code code, machine_mode result_mode,
           break;

         case MINUS:
+          /* The following rules apply only to scalars.  */
+          if (shift_mode != shift_unit_mode)
+            break;
+
           /* If we have (xshiftrt (minus (ashiftrt X C)) X) C)
              with C the size of VAROP - 1 and the shift is logical if
              STORE_FLAG_VALUE is 1 and arithmetic if STORE_FLAG_VALUE is -1,
@@ -10854,8 +10889,8 @@ simplify_shift_const_1 (enum rtx_code code, machine_mode result_mode,
               && GET_CODE (XEXP (varop, 0)) == LSHIFTRT
               && CONST_INT_P (XEXP (XEXP (varop, 0), 1))
               && (INTVAL (XEXP (XEXP (varop, 0), 1))
-                  >= (GET_MODE_PRECISION (GET_MODE (XEXP (varop, 0)))
-                      - GET_MODE_PRECISION (GET_MODE (varop)))))
+                  >= (GET_MODE_UNIT_PRECISION (GET_MODE (XEXP (varop, 0)))
+                      - GET_MODE_UNIT_PRECISION (GET_MODE (varop)))))
             {
               rtx varop_inner = XEXP (varop, 0);
......
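
One of the rules now guarded by the scalar-only check relies on the
fact that an arithmetic right shift which discards everything except
copies of the sign bit gives the same result as a shift by the width
minus one. The following standalone C sketch illustrates that identity
for 32-bit scalars. `sign_bit_copies32`, `ashiftrt32` and
`simplify_ashiftrt_count` are hypothetical helpers written for this
example; they are not the `num_sign_bit_copies` routine used in
combine.c.

```c
#include <assert.h>
#include <stdint.h>

/* Number of leading bits of X that equal the sign bit, counting the
   sign bit itself.  Hypothetical helper for illustration only.  */
static unsigned int
sign_bit_copies32 (int32_t x)
{
  uint32_t u = (uint32_t) x;
  uint32_t sign = u >> 31;
  unsigned int n = 1;
  while (n < 32 && ((u >> (31 - n)) & 1u) == sign)
    n++;
  return n;
}

/* Arithmetic right shift that avoids right-shifting a negative value:
   for negative X, shift the (non-negative) complement and complement
   the result again.  */
static int32_t
ashiftrt32 (int32_t x, unsigned int count)
{
  assert (count < 32);
  return x < 0 ? ~(~x >> count) : x >> count;
}

/* Model of the rule: if the shift discards everything except sign bit
   copies, use a count of 31 instead.  Otherwise keep COUNT.  */
static unsigned int
simplify_ashiftrt_count (int32_t x, unsigned int count)
{
  if (count + sign_bit_copies32 (x) >= 32)
    return 31;
  return count;
}
```

For `x = -5` the top 29 bits are copies of the sign bit, so a shift by
3 is replaced by a shift by 31 and both give -1, while a shift by 2 is
left unchanged.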