Commit e0b059b1 by Wilco Dijkstra

This patch fixes an exponential issue in ccmp.c.

When deciding which ccmp expansion to use, the tree nodes gs0 and gs1 are
each fully expanded twice.  If they contain further CCMP opportunities, their
subtrees are also expanded twice, so for complex trees the expansion takes
exponential time and memory.  As a workaround for GCC 6, compute the cost of
the first expansion early and only try the alternative expansion if that cost
is below a threshold.  This rarely affects real code, e.g. SPECINT2006 has
identical code size.
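
The effect of the workaround can be illustrated with a toy model. This is only a sketch and not the code in ccmp.c: the names toy_expand and count_expansions are hypothetical, every node is assumed to combine two operands of equal depth, and a leaf compare is assumed to cost one unit. A threshold of UINT_MAX stands in for the old behaviour of always trying both orders.

```c
#include <assert.h>
#include <limits.h>

/* Toy model, not GCC code: a node of depth d combines two operands of
   depth d - 1, and a leaf compare costs one unit.  All names here are
   hypothetical.  */
static unsigned long calls;

/* Expand one node.  The first order expands both operands once.  The
   alternative order expands both again, but only if the first cost is
   below THRESHOLD (UINT_MAX models "always try both").  */
static unsigned
toy_expand (int depth, unsigned threshold)
{
  assert (depth >= 0);
  calls++;
  if (depth == 0)
    return 1;

  unsigned cost1 = toy_expand (depth - 1, threshold)
                   + toy_expand (depth - 1, threshold);
  if (cost1 >= threshold)
    return cost1;   /* Too complex: skip the alternative order.  */

  unsigned cost2 = toy_expand (depth - 1, threshold)
                   + toy_expand (depth - 1, threshold);
  return cost1 < cost2 ? cost1 : cost2;
}

/* Return how many node expansions a tree of the given depth triggers.  */
static unsigned long
count_expansions (int depth, unsigned threshold)
{
  calls = 0;
  toy_expand (depth, threshold);
  return calls;
}
```

In this model, count_expansions (10, UINT_MAX) performs 1398101 expansions, while count_expansions (10, 25) performs 21887. Below the threshold each level still multiplies the work by four; above it, each level only doubles the work, which matches the growth of the tree itself.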

2016-02-04  Wilco Dijkstra  <wdijkstr@arm.com>

    gcc/
	PR target/69619
	* ccmp.c (expand_ccmp_expr_1): Avoid evaluating gs0/gs1
	twice when complex.

    gcc/testsuite/
	PR target/69619
	* gcc.dg/pr69619.c: Add new test.

From-SVN: r233145
parent 56f3bb38
2016-02-04 Wilco Dijkstra <wdijkstr@arm.com>
PR target/69619
* ccmp.c (expand_ccmp_expr_1): Avoid evaluating gs0/gs1
twice when complex.
2016-02-04 Mike Frysinger <vapier@gentoo.org>
* doc/invoke.texi: Delete -mno-fma4.
......
......
@@ -170,7 +170,7 @@ expand_ccmp_expr_1 (gimple *g, rtx *prep_seq, rtx *gen_seq)
       int unsignedp0, unsignedp1;
       rtx_code rcode0, rcode1;
       int speed_p = optimize_insn_for_speed_p ();
-      rtx tmp2, ret = NULL_RTX, ret2 = NULL_RTX;
+      rtx tmp2 = NULL_RTX, ret = NULL_RTX, ret2 = NULL_RTX;
       unsigned cost1 = MAX_COST;
       unsigned cost2 = MAX_COST;
......
@@ -183,19 +183,25 @@ expand_ccmp_expr_1 (gimple *g, rtx *prep_seq, rtx *gen_seq)
                                     gimple_assign_rhs1 (gs0),
                                     gimple_assign_rhs2 (gs0));
-      tmp2 = targetm.gen_ccmp_first (&prep_seq_2, &gen_seq_2, rcode1,
-                                     gimple_assign_rhs1 (gs1),
-                                     gimple_assign_rhs2 (gs1));
-      if (!tmp && !tmp2)
-        return NULL_RTX;
       if (tmp != NULL)
         {
           ret = expand_ccmp_next (gs1, code, tmp, &prep_seq_1, &gen_seq_1);
           cost1 = seq_cost (safe_as_a <rtx_insn *> (prep_seq_1), speed_p);
           cost1 += seq_cost (safe_as_a <rtx_insn *> (gen_seq_1), speed_p);
         }
+      /* FIXME: Temporary workaround for PR69619.
+         Avoid exponential compile time due to expanding gs0 and gs1 twice.
+         If gs0 and gs1 are complex, the cost will be high, so avoid
+         reevaluation if above an arbitrary threshold.  */
+      if (tmp == NULL || cost1 < COSTS_N_INSNS (25))
+        tmp2 = targetm.gen_ccmp_first (&prep_seq_2, &gen_seq_2, rcode1,
+                                       gimple_assign_rhs1 (gs1),
+                                       gimple_assign_rhs2 (gs1));
+      if (!tmp && !tmp2)
+        return NULL_RTX;
       if (tmp2 != NULL)
         {
           ret2 = expand_ccmp_next (gs0, code, tmp2, &prep_seq_2,
......
2016-02-04 Wilco Dijkstra <wdijkstr@arm.com>
PR target/69619
* gcc.dg/pr69619.c: Add new test.
2016-02-04 Richard Sandiford <richard.sandiford@arm.com>
PR rtl-optimization/69577
......
/* { dg-do compile } */
/* { dg-options "-O3" } */
int a, b, c, d;
int e[100];
void
fn1 ()
{
  int *f = &d;
  c = 6;
  for (; c; c--)
    {
      b = 0;
      for (; b <= 5; b++)
        {
          short g = e[(b + 2) * 9 + c];
          *f = *f == a && e[(b + 2) * 9 + c];
        }
    }
}