Commit 37ccb0ba by Steven Bosscher Committed by Richard Biener

re PR tree-optimization/23286 (Missed code hoisting optimization)

2016-07-12  Steven Bosscher  <steven@gcc.gnu.org>
	Richard Biener  <rguenther@suse.de>

	PR tree-optimization/23286
	PR tree-optimization/70159
	* doc/invoke.texi: Document -fcode-hoisting.
	* common.opt (fcode-hoisting): New flag.
	* opts.c (default_options_table): Enable -fcode-hoisting at -O2+.
	* tree-ssa-pre.c (pre_stats): Add hoist_insert.
	(do_regular_insertion): Rename to ...
	(do_pre_regular_insertion): ... this and amend general comments
	on insertion strategy.
	(do_partial_partial_insertion): Rename to ...
	(do_pre_partial_partial_insertion): ... this.
	(do_hoist_insertion): New function.
	(insert_aux): Take flags on whether to do PRE and/or hoist insertion
	and call do_hoist_insertion properly.
	(insert): Adjust.
	(pass_pre::gate): Enable also if -fcode-hoisting is enabled.
	(pass_pre::execute): Register hoist_insert stats.

	* gcc.dg/tree-ssa/ssa-pre-11.c: Disable code hoisting.
	* gcc.dg/tree-ssa/ssa-pre-27.c: Likewise.
	* gcc.dg/tree-ssa/ssa-pre-28.c: Likewise.
	* gcc.dg/tree-ssa/ssa-pre-2.c: Likewise.
	* gcc.dg/tree-ssa/pr35286.c: Likewise.
	* gcc.dg/tree-ssa/pr35287.c: Likewise.
	* gcc.dg/hoist-register-pressure-1.c: Likewise.
	* gcc.dg/hoist-register-pressure-2.c: Likewise.
	* gcc.dg/hoist-register-pressure-3.c: Likewise.
	* gcc.dg/pr51879-12.c: Likewise.
	* gcc.dg/strlenopt-9.c: Likewise.
	* gcc.dg/tree-ssa/pr47392.c: Likewise.
	* gcc.dg/tree-ssa/pr68619-4.c: Likewise.
	* gcc.dg/tree-ssa/split-path-5.c: Likewise.
	* gcc.dg/tree-ssa/slsr-35.c: Likewise.
	* gcc.dg/tree-ssa/slsr-36.c: Likewise.
	* gcc.dg/tree-ssa/loadpre3.c: Adjust so hoisting doesn't apply.
	* gcc.dg/tree-ssa/pr43491.c: Scan optimized dump for desired result.
	* gcc.dg/tree-ssa/ssa-pre-31.c: Adjust expected outcome for hoisting.
	* gcc.dg/tree-ssa/ssa-hoist-1.c: New testcase.
	* gcc.dg/tree-ssa/ssa-hoist-2.c: New testcase.
	* gcc.dg/tree-ssa/ssa-hoist-3.c: New testcase.
	* gcc.dg/tree-ssa/ssa-hoist-4.c: New testcase.
	* gcc.dg/tree-ssa/ssa-hoist-5.c: New testcase.
	* gcc.dg/tree-ssa/ssa-hoist-6.c: New testcase.
	* gfortran.dg/pr43984.f90: Adjust expected outcome.

Co-Authored-By: Richard Biener <rguenther@suse.de>

From-SVN: r238242
parent 1de3c940
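
As an illustration of what the new transformation aims for, here is a minimal hand-written sketch based on the function in the new ssa-hoist-1.c testcase. The shift is evaluated on both arms of the branch, so it can be computed once before the branch. The names crc_step_original, crc_step_hoisted and hoisting_preserves_result are hypothetical and do not appear in the commit; the second function only approximates at source level what the pass does on GIMPLE and is not the compiler's actual output.

```c
/* Source shape as in ssa-hoist-1.c: "a <<= 1" appears on both arms.  */
static unsigned short crc_step_original (unsigned short a)
{
  if (a & 0x8000)
    a <<= 1, a = a ^ 0x1021;
  else
    a <<= 1;
  return a;
}

/* Hand-hoisted equivalent (illustration only): the shift is evaluated
   once before the branch and only the XOR stays conditional.  */
static unsigned short crc_step_hoisted (unsigned short a)
{
  unsigned short shifted = (unsigned short) (a << 1);
  if (a & 0x8000)
    shifted ^= 0x1021;
  return shifted;
}

/* Returns 1 if both versions agree for every 16-bit input, else 0.  */
static int hoisting_preserves_result (void)
{
  for (unsigned int v = 0; v <= 0xFFFFu; ++v)
    if (crc_step_original ((unsigned short) v)
	!= crc_step_hoisted ((unsigned short) v))
      return 0;
  return 1;
}
```

Calling hoisting_preserves_result () from a small driver checks that the two forms agree on all inputs. The testcase itself verifies the transformation differently, by scanning the pre dump for a single " << 1;".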
2016-07-12 Steven Bosscher <steven@gcc.gnu.org>
Richard Biener <rguenther@suse.de>
PR tree-optimization/23286
PR tree-optimization/70159
* doc/invoke.texi: Document -fcode-hoisting.
* common.opt (fcode-hoisting): New flag.
* opts.c (default_options_table): Enable -fcode-hoisting at -O2+.
* tree-ssa-pre.c (pre_stats): Add hoist_insert.
(do_regular_insertion): Rename to ...
(do_pre_regular_insertion): ... this and amend general comments
on insertion strategy.
(do_partial_partial_insertion): Rename to ...
(do_pre_partial_partial_insertion): ... this.
(do_hoist_insertion): New function.
(insert_aux): Take flags on whether to do PRE and/or hoist insertion
and call do_hoist_insertion properly.
(insert): Adjust.
(pass_pre::gate): Enable also if -fcode-hoisting is enabled.
(pass_pre::execute): Register hoist_insert stats.
2016-07-12 Jakub Jelinek <jakub@redhat.com>
PR middle-end/71716
......
......@@ -1038,6 +1038,10 @@ fchecking=
Common Joined RejectNegative UInteger Var(flag_checking)
Perform internal consistency checkings.
fcode-hoisting
Common Report Var(flag_code_hoisting) Optimization
Enable code hoisting.
fcombine-stack-adjustments
Common Report Var(flag_combine_stack_adjustments) Optimization
Looks for opportunities to reduce stack adjustments and stack references.
......
......@@ -404,7 +404,7 @@ Objective-C and Objective-C++ Dialects}.
-fstrict-overflow -fthread-jumps -ftracer -ftree-bit-ccp @gol
-ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol
-ftree-coalesce-vars -ftree-copy-prop -ftree-dce -ftree-dominator-opts @gol
-ftree-dse -ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol
-ftree-dse -ftree-forwprop -ftree-fre -fcode-hoisting -ftree-loop-if-convert @gol
-ftree-loop-if-convert-stores -ftree-loop-im @gol
-ftree-phiprop -ftree-loop-distribution -ftree-loop-distribute-patterns @gol
-ftree-loop-ivcanon -ftree-loop-linear -ftree-loop-optimize @gol
......@@ -6380,6 +6380,7 @@ also turns on the following optimization flags:
-fstrict-aliasing -fstrict-overflow @gol
-ftree-builtin-call-dce @gol
-ftree-switch-conversion -ftree-tail-merge @gol
-fcode-hoisting @gol
-ftree-pre @gol
-ftree-vrp @gol
-fipa-ra}
......@@ -7273,6 +7274,14 @@ and the @option{large-stack-frame-growth} parameter to 400.
Perform reassociation on trees. This flag is enabled by default
at @option{-O} and higher.
@item -fcode-hoisting
@opindex fcode-hoisting
Perform code hoisting. Code hoisting tries to move the
evaluation of expressions executed on all paths to the function exit
as early as possible. This is especially useful as a code size
optimization, but it often helps for code speed as well.
This flag is enabled by default at @option{-O2} and higher.
@item -ftree-pre
@opindex ftree-pre
Perform partial redundancy elimination (PRE) on trees. This flag is
......@@ -12230,8 +12239,8 @@ Dump each function after STORE-CCP@. The file name is made by appending
@item pre
@opindex fdump-tree-pre
Dump trees after partial redundancy elimination. The file name is made
by appending @file{.pre} to the source file name.
Dump trees after partial redundancy elimination and/or code hoisting.
The file name is made by appending @file{.pre} to the source file name.
@item fre
@opindex fdump-tree-fre
......
......@@ -500,6 +500,7 @@ static const struct default_options default_options_table[] =
REORDER_BLOCKS_ALGORITHM_STC },
{ OPT_LEVELS_2_PLUS, OPT_freorder_functions, NULL, 1 },
{ OPT_LEVELS_2_PLUS, OPT_ftree_vrp, NULL, 1 },
{ OPT_LEVELS_2_PLUS, OPT_fcode_hoisting, NULL, 1 },
{ OPT_LEVELS_2_PLUS, OPT_ftree_pre, NULL, 1 },
{ OPT_LEVELS_2_PLUS, OPT_ftree_switch_conversion, NULL, 1 },
{ OPT_LEVELS_2_PLUS, OPT_fipa_cp, NULL, 1 },
......
2016-07-12 Steven Bosscher <steven@gcc.gnu.org>
Richard Biener <rguenther@suse.de>
PR tree-optimization/23286
PR tree-optimization/70159
* gcc.dg/tree-ssa/ssa-pre-11.c: Disable code hoisting.
* gcc.dg/tree-ssa/ssa-pre-27.c: Likewise.
* gcc.dg/tree-ssa/ssa-pre-28.c: Likewise.
* gcc.dg/tree-ssa/ssa-pre-2.c: Likewise.
* gcc.dg/tree-ssa/pr35286.c: Likewise.
* gcc.dg/tree-ssa/pr35287.c: Likewise.
* gcc.dg/hoist-register-pressure-1.c: Likewise.
* gcc.dg/hoist-register-pressure-2.c: Likewise.
* gcc.dg/hoist-register-pressure-3.c: Likewise.
* gcc.dg/pr51879-12.c: Likewise.
* gcc.dg/strlenopt-9.c: Likewise.
* gcc.dg/tree-ssa/pr47392.c: Likewise.
* gcc.dg/tree-ssa/pr68619-4.c: Likewise.
* gcc.dg/tree-ssa/split-path-5.c: Likewise.
* gcc.dg/tree-ssa/slsr-35.c: Likewise.
* gcc.dg/tree-ssa/slsr-36.c: Likewise.
* gcc.dg/tree-ssa/loadpre3.c: Adjust so hoisting doesn't apply.
* gcc.dg/tree-ssa/pr43491.c: Scan optimized dump for desired result.
* gcc.dg/tree-ssa/ssa-pre-31.c: Adjust expected outcome for hoisting.
* gcc.dg/tree-ssa/ssa-hoist-1.c: New testcase.
* gcc.dg/tree-ssa/ssa-hoist-2.c: New testcase.
* gcc.dg/tree-ssa/ssa-hoist-3.c: New testcase.
* gcc.dg/tree-ssa/ssa-hoist-4.c: New testcase.
* gcc.dg/tree-ssa/ssa-hoist-5.c: New testcase.
* gcc.dg/tree-ssa/ssa-hoist-6.c: New testcase.
* gfortran.dg/pr43984.f90: Adjust expected outcome.
2016-07-12 Richard Biener <rguenther@suse.de>
PR rtl-optimization/68961
......
/* { dg-options "-Os -fdump-rtl-hoist" } */
/* { dg-options "-Os -fdump-rtl-hoist -fno-code-hoisting" } */
/* The rtl hoist pass requires that the expression to be hoisted can
be assigned without clobbering cc. For a PLUS rtx on S/390 this
requires a load address instruction which is fine on 64 bit but
......
/* { dg-options "-Os -fdump-rtl-hoist" } */
/* { dg-options "-Os -fdump-rtl-hoist -fno-code-hoisting" } */
/* The rtl hoist pass requires that the expression to be hoisted can
be assigned without clobbering cc. For a PLUS rtx on S/390 this
requires a load address instruction which is fine on 64 bit but
......
/* { dg-options "-Os -fdump-rtl-hoist" } */
/* { dg-options "-Os -fdump-rtl-hoist -fno-code-hoisting" } */
/* The rtl hoist pass requires that the expression to be hoisted can
be assigned without clobbering cc. For a PLUS rtx on S/390 this
requires a load address instruction which is fine on 64 bit but
......
/* { dg-do compile } */
/* { dg-options "-O2 -ftree-tail-merge -fdump-tree-pre" } */
/* { dg-options "-O2 -ftree-tail-merge -fdump-tree-pre -fno-code-hoisting" } */
__attribute__((pure)) int bar (int);
__attribute__((pure)) int bar2 (int);
......
/* { dg-do run } */
/* { dg-options "-O2 -fdump-tree-strlen -fdump-tree-optimized" } */
/* { dg-options "-O2 -fno-code-hoisting -fdump-tree-strlen -fdump-tree-optimized" } */
#include "strlenopt.h"
......
/* { dg-do compile } */
/* { dg-options "-O2 -fdump-tree-pre-stats" } */
extern void spoil (void);
int foo(int **a,int argc)
{
int b;
......@@ -11,7 +14,8 @@ int foo(int **a,int argc)
}
else
{
/* Spoil *a and *(*a) to avoid hoisting it before the "if (...)". */
spoil ();
}
/* Should be able to eliminate one of the *(*a)'s along the if path
by pushing it into the else path. We will also eliminate
......
/* { dg-do compile } */
/* { dg-options "-O2 -fdump-tree-pre-stats" } */
/* { dg-options "-O2 -fno-code-hoisting -fdump-tree-pre-stats" } */
int g2;
struct A {
int a; int b;
......
/* { dg-do compile } */
/* { dg-options "-O2 -fdump-tree-pre-stats" } */
/* { dg-options "-O2 -fno-code-hoisting -fdump-tree-pre-stats" } */
int *gp;
int foo(int p)
{
......
/* { dg-do compile } */
/* { dg-options "-O2 -fdump-tree-pre-stats" } */
/* { dg-options "-O2 -fdump-tree-optimized" } */
#define REGISTER register
......@@ -35,7 +35,11 @@ long foo(long data, long v)
u = i;
return v * t * u;
}
/* We should not eliminate global register variable when it is the RHS of
a single assignment. */
/* { dg-final { scan-tree-dump-times "Eliminated: 2" 1 "pre" { target { arm*-*-* i?86-*-* mips*-*-* x86_64-*-* } } } } */
/* { dg-final { scan-tree-dump-times "Eliminated: 3" 1 "pre" { target { ! { arm*-*-* i?86-*-* mips*-*-* x86_64-*-* } } } } } */
a single assignment. So the number of loads from data_0 has to match
that of the number of adds (we hoist data_0 + data_3 above the
if (data) and eliminate the useless copy). */
/* { dg-final { scan-tree-dump-times "= data_0;" 1 "optimized" { target { arm*-*-* i?86-*-* mips*-*-* x86_64-*-* } } } } */
/* { dg-final { scan-tree-dump-times " \\+ " 1 "optimized" { target { ! { arm*-*-* i?86-*-* mips*-*-* x86_64-*-* } } } } } */
/* { dg-do run } */
/* { dg-options "-O2 -fdump-tree-pre-stats" } */
/* { dg-options "-O2 -fno-code-hoisting -fdump-tree-pre-stats" } */
struct A
{
......
/* { dg-do compile } */
/* { dg-options "-O2 -fdump-tree-optimized -w" } */
/* { dg-options "-O2 -fno-code-hoisting -fdump-tree-optimized -w" } */
typedef struct rtx_def *rtx;
enum rtx_code
......
......@@ -3,7 +3,7 @@
phi has an argument which is a parameter. */
/* { dg-do compile } */
/* { dg-options "-O3 -fdump-tree-optimized" } */
/* { dg-options "-O3 -fno-code-hoisting -fdump-tree-optimized" } */
int
f (int c, int i)
......
......@@ -3,7 +3,7 @@
phi has an argument which is a parameter. */
/* { dg-do compile } */
/* { dg-options "-O3 -fdump-tree-optimized" } */
/* { dg-options "-O3 -fno-code-hoisting -fdump-tree-optimized" } */
int
f (int s, int c, int i)
......
/* { dg-do compile } */
/* { dg-options "-O2 -fdump-tree-pre" } */
unsigned short f(unsigned short a)
{
if (a & 0x8000)
a <<= 1, a = a ^ 0x1021;
else
a <<= 1;
return a;
}
/* We should hoist and CSE the shift. */
/* { dg-final { scan-tree-dump-times " << 1;" 1 "pre" } } */
/* { dg-do compile } */
/* { dg-options "-O2 -fdump-tree-pre" } */
int f(int i)
{
if (i < 0)
return i/10+ i;
return i/10 + i;
}
/* Hoisting of i/10 + i should make the code straight-line
with one division. */
/* { dg-final { scan-tree-dump-times "goto" 0 "pre" } } */
/* { dg-final { scan-tree-dump-times " / 10;" 1 "pre" } } */
/* { dg-do compile } */
/* { dg-options "-O2 -fdump-tree-pre-stats" } */
int test (int a, int b, int c, int g)
{
int d, e;
if (a)
d = b * c;
else
d = b - c;
e = b * c + g;
return d + e;
}
/* We should hoist and CSE only the multiplication. */
/* { dg-final { scan-tree-dump-times " \\* " 1 "pre" } } */
/* { dg-final { scan-tree-dump "Insertions: 1" "pre" } } */
/* { dg-do compile } */
/* { dg-options "-O2 -fdump-tree-optimized" } */
/* From PR21485. */
long
NumSift (long *array, int b, unsigned long k)
{
if (b)
if (array[k] < array[k + 1L])
++k;
return array[k];
}
/* There should be only two loads left. And the final value in the
if (b) arm should be if-converted:
tem1 = array[k];
if (b)
tem1 = MAX (array[k+1], tem1)
return tem1; */
/* { dg-final { scan-tree-dump-times "= \\*" 2 "optimized" } } */
/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "optimized" } } */
/* { dg-final { scan-tree-dump-times "= PHI" 1 "optimized" } } */
/* { dg-do compile } */
/* { dg-options "-O2 -fdump-tree-pre-stats" } */
int a[1024];
int b[1024], c[1024];
void foo ()
{
for (int j = 0; j < 1024; ++j)
{
for (int i = 0; i < 1024; ++i)
a[i] = j;
b[j] = c[j];
}
}
/* We should not hoist/PRE the outer loop IV increment or the load
from c across the inner loop. */
/* { dg-final { scan-tree-dump-not "HOIST inserted" "pre" } } */
/* { dg-do compile } */
/* { dg-options "-O2 -fdump-tree-pre-stats" } */
/* { dg-options "-O2 -fno-code-hoisting -fdump-tree-pre-stats" } */
double cos (double);
double f(double a)
{
......
/* { dg-do compile } */
/* { dg-options "-O2 -fdump-tree-pre-stats" } */
/* { dg-options "-O2 -fno-code-hoisting -fdump-tree-pre-stats" } */
int motion_test1(int data, int data_0, int data_3, int v)
{
int i;
......
/* { dg-do compile } */
/* { dg-options "-O2 -fdump-tree-pre" } */
/* { dg-options "-O2 -fdump-tree-pre -fno-code-hoisting" } */
int foo (int i, int j, int b)
{
......
/* PR37997 */
/* { dg-do compile } */
/* { dg-options "-O2 -fdump-tree-pre-details" } */
/* { dg-options "-O2 -fdump-tree-pre-details -fno-code-hoisting" } */
int foo (int i, int b, int result)
{
......@@ -16,5 +16,7 @@ int foo (int i, int b, int result)
/* We should insert i + 1 into the if (b) path as well as the simplified
i + 1 & -2 expression. And do replacement with two PHI temps. */
/* With hoisting enabled we'd hoist i + 1 to before the if, retaining
only one PHI node. */
/* { dg-final { scan-tree-dump-times "with prephitmp" 2 "pre" } } */
......@@ -43,4 +43,4 @@ int foo (S1 *root, int N)
return 0;
}
/* { dg-final { scan-tree-dump-times "key" 4 "pre" } } */
/* { dg-final { scan-tree-dump-times "key" 3 "pre" } } */
......@@ -50,6 +50,6 @@ end subroutine
end
! There should be three loads from iyz.data, not four.
! There should be two loads from iyz.data, not four.
! { dg-final { scan-tree-dump-times "= iyz.data" 3 "pre" } }
! { dg-final { scan-tree-dump-times "= iyz.data" 2 "pre" } }