Commit f7d884d4, authored and committed by Sudakshina Das

[PR81647][AARCH64] Fix handling of Unordered Comparisons in aarch64-simd.md

This patch fixes the inconsistent behavior observed at -O3 for the unordered
comparisons. According to the online docs
(https://gcc.gnu.org/onlinedocs/gcc-7.2.0/gccint/Unary-and-Binary-Expressions.html),
none of the following should raise an FP exception:
- UNGE_EXPR
- UNGT_EXPR
- UNLE_EXPR
- UNLT_EXPR
- UNEQ_EXPR
Also ORDERED_EXPR and UNORDERED_EXPR should only return zero or one.
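
For reference, the meaning of the five UN* codes above can be sketched in
scalar C with the ISO C quiet comparison macros from <math.h>. The helper
names (un_ge and so on) are made up for illustration and are not GCC
identifiers. Each one is the negation of the opposite quiet comparison, so it
is true when either operand is NaN and it returns only zero or one.

```c
#include <math.h>

/* Hypothetical helpers, not GCC identifiers.  Each returns 0 or 1 and is
   true when the operands are unordered (at least one of them is NaN).  */
static int un_ge (double a, double b) { return !isless (a, b); }
static int un_gt (double a, double b) { return !islessequal (a, b); }
static int un_le (double a, double b) { return !isgreater (a, b); }
static int un_lt (double a, double b) { return !isgreaterequal (a, b); }
static int un_eq (double a, double b) { return !islessgreater (a, b); }
```

The same relationship read the other way round is presumably why the new test
calls __builtin_isgreater and friends yet scans the SSA dump for the unordered
operators: isgreater (x, y) is the negation of x UNLE y.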

The aarch64-simd.md handling of these was generating exception-raising
instructions such as fcmgt. This patch changes the emitted instructions so
that they no longer raise exceptions. We first check each operand for NaNs
and force any elements containing NaN to zero before using them in the
compare.

Example: UN<cc> (a, b) -> UNORDERED (a, b)
			  | (cm<cc> (isnan (a) ? 0.0 : a, isnan (b) ? 0.0 : b))
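
The example above can be modelled for a single lane in scalar C. This is only
a sketch of the idea, not the code the expander emits: bits_of, double_of and
ungt_lane are illustrative names, a lane is modelled as a 64-bit mask that is
either all ones or zero, and UNGT stands in for UN<cc>.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Illustrative helpers: reinterpret a double as its bit pattern and back.  */
static uint64_t
bits_of (double d)
{
  uint64_t u;
  memcpy (&u, &d, sizeof u);
  return u;
}

static double
double_of (uint64_t u)
{
  double d;
  memcpy (&d, &u, sizeof d);
  return d;
}

/* One-lane model of UNGT (a, b): all ones if a > b or either is NaN.  */
static uint64_t
ungt_lane (double a, double b)
{
  uint64_t a_ok = (a == a) ? UINT64_MAX : 0;	/* cmeq (a, a) */
  uint64_t b_ok = (b == b) ? UINT64_MAX : 0;	/* cmeq (b, b) */
  uint64_t ordered = a_ok & b_ok;

  /* ANDing with the mask turns a NaN lane into +0.0 and leaves any other
     value unchanged, so the relational compare below never sees a NaN.  */
  double a0 = double_of (bits_of (a) & a_ok);
  double b0 = double_of (bits_of (b) & b_ok);

  uint64_t gt = (a0 > b0) ? UINT64_MAX : 0;	/* cmgt on the cleaned inputs */
  return ~ordered | gt;				/* UNORDERED (a, b) | cmgt */
}
```

Forcing a NaN lane to zero is safe because the compare result for that lane
is irrelevant: the lane is already all ones from the UNORDERED term.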


ORDERED_EXPR is now handled as (cmeq (a, a) & cmeq (b, b)), UNORDERED_EXPR
as ~ORDERED_EXPR, and UNEQ as (~ORDERED_EXPR | cmeq (a, b)).
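
A scalar C sketch of these three identities, again with one lane modelled as
a 64-bit all-ones or zero mask. The function names are illustrative only; the
point is that equality compares alone are enough, and fcmeq, unlike fcmgt and
fcmge, does not raise an exception for quiet NaNs.

```c
#include <math.h>
#include <stdint.h>

/* Illustrative one-lane model of a vector equality compare.  */
static uint64_t
cmeq (double a, double b)
{
  return (a == b) ? UINT64_MAX : 0;
}

/* ORDERED: neither operand is NaN, since NaN never compares equal to itself.  */
static uint64_t
ordered_lane (double a, double b)
{
  return cmeq (a, a) & cmeq (b, b);
}

/* UNORDERED: the bitwise inverse of ORDERED.  */
static uint64_t
unordered_lane (double a, double b)
{
  return ~ordered_lane (a, b);
}

/* UNEQ: unordered, or ordered and equal.  */
static uint64_t
uneq_lane (double a, double b)
{
  return ~ordered_lane (a, b) | cmeq (a, b);
}
```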

ChangeLog Entries:

*** gcc/ChangeLog ***

2018-03-19  Sudakshina Das  <sudi.das@arm.com>

	PR target/81647
	* config/aarch64/aarch64-simd.md (vec_cmp<mode><v_int_equiv>): Modify
	instructions for UNLT, UNLE, UNGT, UNGE, UNEQ, UNORDERED and ORDERED.

*** gcc/testsuite/ChangeLog ***

2018-03-19  Sudakshina Das  <sudi.das@arm.com>

	PR target/81647
	* gcc.target/aarch64/pr81647.c: New.

From-SVN: r258653
parent a84677b8
+2018-03-19  Sudakshina Das  <sudi.das@arm.com>
+
+	PR target/81647
+	* config/aarch64/aarch64-simd.md (vec_cmp<mode><v_int_equiv>): Modify
+	instructions for UNLT, UNLE, UNGT, UNGE, UNEQ, UNORDERED and ORDERED.
+
 2018-03-19  Jim Wilson  <jimw@sifive.com>
 
 	PR bootstrap/84856
......
@@ -2730,10 +2730,10 @@
           break;
         }
       /* Fall through.  */
-    case UNGE:
+    case UNLT:
       std::swap (operands[2], operands[3]);
       /* Fall through.  */
-    case UNLE:
+    case UNGT:
     case GT:
       comparison = gen_aarch64_cmgt<mode>;
       break;
@@ -2744,10 +2744,10 @@
           break;
         }
       /* Fall through.  */
-    case UNGT:
+    case UNLE:
       std::swap (operands[2], operands[3]);
       /* Fall through.  */
-    case UNLT:
+    case UNGE:
     case GE:
       comparison = gen_aarch64_cmge<mode>;
       break;
@@ -2770,21 +2770,41 @@
     case UNGT:
     case UNLE:
     case UNLT:
-    case NE:
-      /* FCM returns false for lanes which are unordered, so if we use
-         the inverse of the comparison we actually want to emit, then
-         invert the result, we will end up with the correct result.
-         Note that a NE NaN and NaN NE b are true for all a, b.
-
-         Our transformations are:
-         a UNGE b -> !(b GT a)
-         a UNGT b -> !(b GE a)
-         a UNLE b -> !(a GT b)
-         a UNLT b -> !(a GE b)
-         a   NE b -> !(a EQ b)  */
-      gcc_assert (comparison != NULL);
-      emit_insn (comparison (operands[0], operands[2], operands[3]));
-      emit_insn (gen_one_cmpl<v_int_equiv>2 (operands[0], operands[0]));
+      {
+        /* All of the above must not raise any FP exceptions.  Thus we first
+           check each operand for NaNs and force any elements containing NaN to
+           zero before using them in the compare.
+           Example: UN<cc> (a, b) -> UNORDERED (a, b) |
+                                     (cm<cc> (isnan (a) ? 0.0 : a,
+                                              isnan (b) ? 0.0 : b))
+           We use the following transformations for doing the comparisions:
+           a UNGE b -> a GE b
+           a UNGT b -> a GT b
+           a UNLE b -> b GE a
+           a UNLT b -> b GT a.  */
+
+        rtx tmp0 = gen_reg_rtx (<V_INT_EQUIV>mode);
+        rtx tmp1 = gen_reg_rtx (<V_INT_EQUIV>mode);
+        rtx tmp2 = gen_reg_rtx (<V_INT_EQUIV>mode);
+        emit_insn (gen_aarch64_cmeq<mode> (tmp0, operands[2], operands[2]));
+        emit_insn (gen_aarch64_cmeq<mode> (tmp1, operands[3], operands[3]));
+        emit_insn (gen_and<v_int_equiv>3 (tmp2, tmp0, tmp1));
+        emit_insn (gen_and<v_int_equiv>3 (tmp0, tmp0,
+                                          lowpart_subreg (<V_INT_EQUIV>mode,
+                                                          operands[2],
+                                                          <MODE>mode)));
+        emit_insn (gen_and<v_int_equiv>3 (tmp1, tmp1,
+                                          lowpart_subreg (<V_INT_EQUIV>mode,
+                                                          operands[3],
+                                                          <MODE>mode)));
+        gcc_assert (comparison != NULL);
+        emit_insn (comparison (operands[0],
+                               lowpart_subreg (<MODE>mode,
+                                               tmp0, <V_INT_EQUIV>mode),
+                               lowpart_subreg (<MODE>mode,
+                                               tmp1, <V_INT_EQUIV>mode)));
+        emit_insn (gen_orn<v_int_equiv>3 (operands[0], tmp2, operands[0]));
+      }
       break;
 
     case LT:
@@ -2792,25 +2812,19 @@
     case GT:
     case GE:
     case EQ:
+    case NE:
       /* The easy case.  Here we emit one of FCMGE, FCMGT or FCMEQ.
          As a LT b <=> b GE a && a LE b <=> b GT a.  Our transformations are:
          a GE b -> a GE b
          a GT b -> a GT b
          a LE b -> b GE a
          a LT b -> b GT a
-         a EQ b -> a EQ b  */
+         a EQ b -> a EQ b
+         a NE b -> ~(a EQ b)  */
       gcc_assert (comparison != NULL);
       emit_insn (comparison (operands[0], operands[2], operands[3]));
-      break;
-
-    case UNEQ:
-      /* We first check (a > b ||  b > a) which is !UNEQ, inverting
-         this result will then give us (a == b || a UNORDERED b).  */
-      emit_insn (gen_aarch64_cmgt<mode> (operands[0],
-                                         operands[2], operands[3]));
-      emit_insn (gen_aarch64_cmgt<mode> (tmp, operands[3], operands[2]));
-      emit_insn (gen_ior<v_int_equiv>3 (operands[0], operands[0], tmp));
-      emit_insn (gen_one_cmpl<v_int_equiv>2 (operands[0], operands[0]));
+      if (code == NE)
+        emit_insn (gen_one_cmpl<v_int_equiv>2 (operands[0], operands[0]));
       break;
 
     case LTGT:
@@ -2822,21 +2836,22 @@
       emit_insn (gen_ior<v_int_equiv>3 (operands[0], operands[0], tmp));
       break;
 
-    case UNORDERED:
-      /* Operands are ORDERED iff (a > b || b >= a), so we can compute
-         UNORDERED as !ORDERED.  */
-      emit_insn (gen_aarch64_cmgt<mode> (tmp, operands[2], operands[3]));
-      emit_insn (gen_aarch64_cmge<mode> (operands[0],
-                                         operands[3], operands[2]));
-      emit_insn (gen_ior<v_int_equiv>3 (operands[0], operands[0], tmp));
-      emit_insn (gen_one_cmpl<v_int_equiv>2 (operands[0], operands[0]));
-      break;
-
     case ORDERED:
-      emit_insn (gen_aarch64_cmgt<mode> (tmp, operands[2], operands[3]));
-      emit_insn (gen_aarch64_cmge<mode> (operands[0],
-                                         operands[3], operands[2]));
-      emit_insn (gen_ior<v_int_equiv>3 (operands[0], operands[0], tmp));
+    case UNORDERED:
+    case UNEQ:
+      /* cmeq (a, a) & cmeq (b, b).  */
+      emit_insn (gen_aarch64_cmeq<mode> (operands[0],
+                                         operands[2], operands[2]));
+      emit_insn (gen_aarch64_cmeq<mode> (tmp, operands[3], operands[3]));
+      emit_insn (gen_and<v_int_equiv>3 (operands[0], operands[0], tmp));
+
+      if (code == UNORDERED)
+        emit_insn (gen_one_cmpl<v_int_equiv>2 (operands[0], operands[0]));
+      else if (code == UNEQ)
+        {
+          emit_insn (gen_aarch64_cmeq<mode> (tmp, operands[2], operands[3]));
+          emit_insn (gen_orn<v_int_equiv>3 (operands[0], operands[0], tmp));
+        }
       break;
 
     default:
......
+2018-03-19  Sudakshina Das  <sudi.das@arm.com>
+
+	PR target/81647
+	* gcc.target/aarch64/pr81647.c: New.
+
 2018-03-19  Richard Biener  <rguenther@suse.de>
 
 	PR tree-optimization/84933
......
/* { dg-do run } */
/* { dg-options "-O3 -fdump-tree-ssa" } */

#include <fenv.h>

double x[28], y[28];
int res[28];

int
main (void)
{
  int i;
  for (i = 0; i < 28; ++i)
    {
      x[i] = __builtin_nan ("");
      y[i] = i;
    }
  __asm__ volatile ("" ::: "memory");
  feclearexcept (FE_ALL_EXCEPT);
  for (i = 0; i < 4; ++i)
    res[i] = __builtin_isgreater (x[i], y[i]);
  for (i = 4; i < 8; ++i)
    res[i] = __builtin_isgreaterequal (x[i], y[i]);
  for (i = 8; i < 12; ++i)
    res[i] = __builtin_isless (x[i], y[i]);
  for (i = 12; i < 16; ++i)
    res[i] = __builtin_islessequal (x[i], y[i]);
  for (i = 16; i < 20; ++i)
    res[i] = __builtin_islessgreater (x[i], y[i]);
  for (i = 20; i < 24; ++i)
    res[i] = __builtin_isunordered (x[i], y[i]);
  for (i = 24; i < 28; ++i)
    res[i] = !(__builtin_isunordered (x[i], y[i]));
  __asm__ volatile ("" ::: "memory");
  return fetestexcept (FE_ALL_EXCEPT) != 0;
}

/* { dg-final { scan-tree-dump " u> " "ssa" } } */
/* { dg-final { scan-tree-dump " u>= " "ssa" } } */
/* { dg-final { scan-tree-dump " u< " "ssa" } } */
/* { dg-final { scan-tree-dump " u<= " "ssa" } } */
/* { dg-final { scan-tree-dump " u== " "ssa" } } */
/* { dg-final { scan-tree-dump " unord " "ssa" } } */
/* { dg-final { scan-tree-dump " ord " "ssa" } } */