Commit f663d9ad, authored and committed by Kyrylo Tkachov

GIMPLE store merging pass

2016-10-28  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

	PR middle-end/22141
	* Makefile.in (OBJS): Add gimple-ssa-store-merging.o.
	* common.opt (fstore-merging): New Optimization option.
	* opts.c (default_options_table): Add entry for
	OPT_ftree_store_merging.
	* fold-const.h (can_native_encode_type_p): Declare prototype.
	* fold-const.c (can_native_encode_type_p): Define.
	* params.def (PARAM_STORE_MERGING_ALLOW_UNALIGNED): Define.
	(PARAM_MAX_STORES_TO_MERGE): Likewise.
	* timevar.def (TV_GIMPLE_STORE_MERGING): New timevar.
	* passes.def: Insert pass_tree_store_merging.
	* tree-pass.h (make_pass_store_merging): Declare extern
	prototype.
	* gimple-ssa-store-merging.c: New file.
	* doc/invoke.texi (Optimization Options): Document
	-fstore-merging.
	(--param documentation): Document store-merging-allow-unaligned
	and max-stores-to-merge.

2016-10-28  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
            Jakub Jelinek  <jakub@redhat.com>
            Andrew Pinski  <pinskia@gmail.com>

	PR middle-end/22141
	PR rtl-optimization/23684
	* gcc.c-torture/execute/pr22141-1.c: New test.
	* gcc.c-torture/execute/pr22141-2.c: Likewise.
	* gcc.target/aarch64/ldp_stp_1.c: Adjust for -fstore-merging.
	* gcc.target/aarch64/ldp_stp_4.c: Likewise.
	* gcc.dg/store_merging_1.c: New test.
	* gcc.dg/store_merging_2.c: Likewise.
	* gcc.dg/store_merging_3.c: Likewise.
	* gcc.dg/store_merging_4.c: Likewise.
	* gcc.dg/store_merging_5.c: Likewise.
	* gcc.dg/store_merging_6.c: Likewise.
	* gcc.dg/store_merging_7.c: Likewise.
	* gcc.target/i386/pr22141.c: Likewise.
	* gcc.target/i386/pr34012.c: Add -fno-store-merging to dg-options.
	* g++.dg/init/new17.C: Likewise.



Co-Authored-By: Andrew Pinski <pinskia@gmail.com>
Co-Authored-By: Jakub Jelinek <jakub@redhat.com>

From-SVN: r241649
parent 1f5700e9
2016-10-28 Kyrylo Tkachov <kyrylo.tkachov@arm.com>
PR middle-end/22141
* Makefile.in (OBJS): Add gimple-ssa-store-merging.o.
* common.opt (fstore-merging): New Optimization option.
* opts.c (default_options_table): Add entry for
OPT_ftree_store_merging.
* fold-const.h (can_native_encode_type_p): Declare prototype.
* fold-const.c (can_native_encode_type_p): Define.
* params.def (PARAM_STORE_MERGING_ALLOW_UNALIGNED): Define.
(PARAM_MAX_STORES_TO_MERGE): Likewise.
* timevar.def (TV_GIMPLE_STORE_MERGING): New timevar.
* passes.def: Insert pass_tree_store_merging.
* tree-pass.h (make_pass_store_merging): Declare extern
prototype.
* gimple-ssa-store-merging.c: New file.
* doc/invoke.texi (Optimization Options): Document
-fstore-merging.
(--param documentation): Document store-merging-allow-unaligned
and max-stores-to-merge.
2016-10-28 Will Schmidt <will_schmidt@vnet.ibm.com>
PR middle-end/72747
@@ -1296,6 +1296,7 @@ OBJS = \
 	gimple-ssa-isolate-paths.o \
 	gimple-ssa-nonnull-compare.o \
 	gimple-ssa-split-paths.o \
+	gimple-ssa-store-merging.o \
 	gimple-ssa-strength-reduction.o \
 	gimple-ssa-sprintf.o \
 	gimple-ssa-warn-alloca.o \
......
@@ -1463,6 +1463,10 @@ fstrict-volatile-bitfields
 Common Report Var(flag_strict_volatile_bitfields) Init(-1) Optimization
 Force bitfield accesses to match their type width.
 
+fstore-merging
+Common Report Var(flag_store_merging) Optimization
+Merge adjacent stores.
+
 fguess-branch-probability
 Common Report Var(flag_guess_branch_prob) Optimization
 Enable guessing of branch probabilities.
......
@@ -405,7 +405,7 @@ Objective-C and Objective-C++ Dialects}.
 -fsingle-precision-constant -fsplit-ivs-in-unroller -fsplit-loops@gol
 -fsplit-paths @gol
 -fsplit-wide-types -fssa-backprop -fssa-phiopt @gol
--fstdarg-opt -fstrict-aliasing @gol
+-fstdarg-opt -fstore-merging -fstrict-aliasing @gol
 -fstrict-overflow -fthread-jumps -ftracer -ftree-bit-ccp @gol
 -ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol
 -ftree-coalesce-vars -ftree-copy-prop -ftree-dce -ftree-dominator-opts @gol
@@ -416,8 +416,8 @@ Objective-C and Objective-C++ Dialects}.
 -ftree-loop-vectorize @gol
 -ftree-parallelize-loops=@var{n} -ftree-pre -ftree-partial-pre -ftree-pta @gol
 -ftree-reassoc -ftree-sink -ftree-slsr -ftree-sra @gol
--ftree-switch-conversion -ftree-tail-merge -ftree-ter @gol
--ftree-vectorize -ftree-vrp -funconstrained-commons @gol
+-ftree-switch-conversion -ftree-tail-merge @gol
+-ftree-ter -ftree-vectorize -ftree-vrp -funconstrained-commons @gol
 -funit-at-a-time -funroll-all-loops -funroll-loops @gol
 -funsafe-math-optimizations -funswitch-loops @gol
 -fipa-ra -fvariable-expansion-in-unroller -fvect-cost-model -fvpt @gol
@@ -6753,6 +6753,7 @@ compilation time.
 -fsplit-wide-types @gol
 -fssa-backprop @gol
 -fssa-phiopt @gol
+-fstore-merging @gol
 -ftree-bit-ccp @gol
 -ftree-ccp @gol
 -ftree-ch @gol
@@ -8095,6 +8096,13 @@ Perform scalar replacement of aggregates. This pass replaces structure
 references with scalars to prevent committing structures to memory too
 early. This flag is enabled by default at @option{-O} and higher.
 
+@item -fstore-merging
+@opindex fstore-merging
+Perform merging of narrow stores to consecutive memory addresses. This pass
+merges contiguous stores of immediate values narrower than a word into fewer
+wider stores to reduce the number of instructions. This is enabled by default
+at @option{-O} and higher.
+
 @item -ftree-ter
 @opindex ftree-ter
 Perform temporary expression replacement during the SSA->normal phase. Single
@@ -9573,6 +9581,14 @@ avoid quadratic behavior in tree tail merging. The default value is 10.
 The maximum amount of iterations of the pass over the function. This is used to
 limit compilation time in tree tail merging. The default value is 2.
 
+@item store-merging-allow-unaligned
+Allow the store merging pass to introduce unaligned stores if it is legal to
+do so. The default value is 1.
+
+@item max-stores-to-merge
+The maximum number of stores to attempt to merge into wider stores in the store
+merging pass. The minimum value is 2 and the default is 64.
+
 @item max-unrolled-insns
 The maximum number of instructions that a loop may have to be unrolled.
 If a loop is unrolled, this parameter also determines how many times
......
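To make the documented transformation concrete, here is a minimal sketch in C of what merging four adjacent narrow constant stores into one wider store amounts to. The struct and function names are hypothetical, and the `memcpy` of a pre-encoded 32-bit word only stands in for the single wide store the pass may emit; it is an illustration of the effect, not the pass's implementation.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical struct: four adjacent one-byte fields, no padding.  */
struct quad
{
  unsigned char a, b, c, d;
};

/* Four narrow constant stores, as they appear in the source.  */
void
store_narrow (struct quad *p)
{
  p->a = 1;
  p->b = 2;
  p->c = 3;
  p->d = 4;
}

/* The same four bytes written with one 4-byte store.  The word is
   built from a byte array so the sketch does not depend on the
   endianness of the host.  */
void
store_wide (struct quad *p)
{
  static const unsigned char bytes[4] = { 1, 2, 3, 4 };
  uint32_t word;
  memcpy (&word, bytes, sizeof word);  /* encode the constants */
  memcpy (p, &word, sizeof word);      /* one wide store */
}

/* Returns 1 if both versions leave identical bytes in memory.  */
int
stores_agree (void)
{
  struct quad x, y;
  store_narrow (&x);
  store_wide (&y);
  return memcmp (&x, &y, sizeof x) == 0;
}
```

The new tests in this commit compile with `-O2 -fdump-tree-store-merging` and scan the dump for "Merging successful", which is one way to check whether a function like `store_narrow` was actually merged on a given target.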
@@ -7516,6 +7516,26 @@ can_native_interpret_type_p (tree type)
     }
 }
 
+/* Return true iff a constant of type TYPE is accepted by
+   native_encode_expr.  */
+
+bool
+can_native_encode_type_p (tree type)
+{
+  switch (TREE_CODE (type))
+    {
+    case INTEGER_TYPE:
+    case REAL_TYPE:
+    case FIXED_POINT_TYPE:
+    case COMPLEX_TYPE:
+    case VECTOR_TYPE:
+    case POINTER_TYPE:
+      return true;
+    default:
+      return false;
+    }
+}
+
 /* Fold a VIEW_CONVERT_EXPR of a constant expression EXPR to type
    TYPE at compile-time. If we're unable to perform the conversion
    return NULL_TREE. */
......
@@ -27,6 +27,7 @@ extern int folding_initializer;
 /* Convert between trees and native memory representation. */
 extern int native_encode_expr (const_tree, unsigned char *, int, int off = -1);
 extern tree native_interpret_expr (tree, const unsigned char *, int);
+extern bool can_native_encode_type_p (tree);
 
 /* Fold constants as much as possible in an expression.
    Returns the simplified expression.
......
@@ -521,6 +521,7 @@ static const struct default_options default_options_table[] =
     { OPT_LEVELS_2_PLUS, OPT_fisolate_erroneous_paths_dereference, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fipa_ra, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_flra_remat, NULL, 1 },
+    { OPT_LEVELS_2_PLUS, OPT_fstore_merging, NULL, 1 },
 
     /* -O3 optimizations. */
     { OPT_LEVELS_3_PLUS, OPT_ftree_loop_distribute_patterns, NULL, 1 },
......
@@ -1094,6 +1094,18 @@ DEFPARAM (PARAM_MAX_TAIL_MERGE_COMPARISONS,
 	  "Maximum amount of similar bbs to compare a bb with.",
 	  10, 0, 0)
 
+DEFPARAM (PARAM_STORE_MERGING_ALLOW_UNALIGNED,
+	  "store-merging-allow-unaligned",
+	  "Allow the store merging pass to introduce unaligned stores "
+	  "if it is legal to do so",
+	  1, 0, 1)
+
+DEFPARAM (PARAM_MAX_STORES_TO_MERGE,
+	  "max-stores-to-merge",
+	  "Maximum number of constant stores to merge in the"
+	  "store merging pass",
+	  64, 2, 0)
+
 DEFPARAM (PARAM_MAX_TAIL_MERGE_ITERATIONS,
 	  "max-tail-merge-iterations",
 	  "Maximum amount of iterations of the pass over a function.",
......
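The help strings in these `DEFPARAM` entries are split across source lines and rely on C's concatenation of adjacent string literals, which inserts nothing between the pieces. The `max-stores-to-merge` description above ends one piece with `the` and starts the next with `store`, so as written it reads "in thestore merging pass". The sketch below shows the effect in isolation; the variable and function names are hypothetical.

```c
#include <string.h>

/* The two pieces exactly as they appear in the hunk: no space at the
   end of the first literal or the start of the second.  */
static const char joined_as_written[] =
  "Maximum number of constant stores to merge in the"
  "store merging pass";

/* The same text with the separating space written explicitly.  */
static const char joined_with_space[] =
  "Maximum number of constant stores to merge in the "
  "store merging pass";

/* Returns 1 if S contains the run-together word "thestore".  */
int
runs_together (const char *s)
{
  return strstr (s, "thestore") != NULL;
}
```

Calling `runs_together (joined_as_written)` yields 1 and `runs_together (joined_with_space)` yields 0, which is why the first `DEFPARAM` in the hunk ends its first piece with a trailing space.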
@@ -332,6 +332,7 @@ along with GCC; see the file COPYING3. If not see
       NEXT_PASS (pass_phiopt);
       NEXT_PASS (pass_fold_builtins);
       NEXT_PASS (pass_optimize_widening_mul);
+      NEXT_PASS (pass_store_merging);
       NEXT_PASS (pass_tail_calls);
       /* If DCE is not run before checking for uninitialized uses,
 	 we may get false warnings (e.g., testsuite/gcc.dg/uninit-5.c).
......
2016-10-28 Kyrylo Tkachov <kyrylo.tkachov@arm.com>
Jakub Jelinek <jakub@redhat.com>
Andrew Pinski <pinskia@gmail.com>
PR middle-end/22141
PR rtl-optimization/23684
* gcc.c-torture/execute/pr22141-1.c: New test.
* gcc.c-torture/execute/pr22141-2.c: Likewise.
* gcc.target/aarch64/ldp_stp_1.c: Adjust for -fstore-merging.
* gcc.target/aarch64/ldp_stp_4.c: Likewise.
* gcc.dg/store_merging_1.c: New test.
* gcc.dg/store_merging_2.c: Likewise.
* gcc.dg/store_merging_3.c: Likewise.
* gcc.dg/store_merging_4.c: Likewise.
* gcc.dg/store_merging_5.c: Likewise.
* gcc.dg/store_merging_6.c: Likewise.
* gcc.dg/store_merging_7.c: Likewise.
* gcc.target/i386/pr22141.c: Likewise.
* gcc.target/i386/pr34012.c: Add -fno-store-merging to dg-options.
* g++.dg/init/new17.C: Likewise.
2016-10-26 Will Schmidt <will_schmidt@vnet.ibm.com>
PR middle-end/72747
......
 // { dg-do compile }
-// { dg-options "-O2 -fstrict-aliasing -fdump-tree-optimized" }
+// { dg-options "-O2 -fstrict-aliasing -fno-store-merging -fdump-tree-optimized" }
 // Test that placement new does not introduce an unnecessary memory
 // barrier.
......
/* PR middle-end/22141 */
extern void abort (void);
struct S
{
struct T
{
char a;
char b;
char c;
char d;
} t;
} u;
struct U
{
struct S s[4];
};
void __attribute__((noinline))
c1 (struct T *p)
{
if (p->a != 1 || p->b != 2 || p->c != 3 || p->d != 4)
abort ();
__builtin_memset (p, 0xaa, sizeof (*p));
}
void __attribute__((noinline))
c2 (struct S *p)
{
c1 (&p->t);
}
void __attribute__((noinline))
c3 (struct U *p)
{
c2 (&p->s[2]);
}
void __attribute__((noinline))
f1 (void)
{
u = (struct S) { { 1, 2, 3, 4 } };
}
void __attribute__((noinline))
f2 (void)
{
u.t.a = 1;
u.t.b = 2;
u.t.c = 3;
u.t.d = 4;
}
void __attribute__((noinline))
f3 (void)
{
u.t.d = 4;
u.t.b = 2;
u.t.a = 1;
u.t.c = 3;
}
void __attribute__((noinline))
f4 (void)
{
struct S v;
v.t.a = 1;
v.t.b = 2;
v.t.c = 3;
v.t.d = 4;
c2 (&v);
}
void __attribute__((noinline))
f5 (struct S *p)
{
p->t.a = 1;
p->t.c = 3;
p->t.d = 4;
p->t.b = 2;
}
void __attribute__((noinline))
f6 (void)
{
struct U v;
v.s[2].t.a = 1;
v.s[2].t.b = 2;
v.s[2].t.c = 3;
v.s[2].t.d = 4;
c3 (&v);
}
void __attribute__((noinline))
f7 (struct U *p)
{
p->s[2].t.a = 1;
p->s[2].t.c = 3;
p->s[2].t.d = 4;
p->s[2].t.b = 2;
}
int
main (void)
{
struct U w;
f1 ();
c2 (&u);
f2 ();
c1 (&u.t);
f3 ();
c2 (&u);
f4 ();
f5 (&u);
c2 (&u);
f6 ();
f7 (&w);
c3 (&w);
return 0;
}
/* PR middle-end/22141 */
extern void abort (void);
struct S
{
struct T
{
char a;
char b;
char c;
char d;
} t;
} u __attribute__((aligned));
struct U
{
struct S s[4];
};
void __attribute__((noinline))
c1 (struct T *p)
{
if (p->a != 1 || p->b != 2 || p->c != 3 || p->d != 4)
abort ();
__builtin_memset (p, 0xaa, sizeof (*p));
}
void __attribute__((noinline))
c2 (struct S *p)
{
c1 (&p->t);
}
void __attribute__((noinline))
c3 (struct U *p)
{
c2 (&p->s[2]);
}
void __attribute__((noinline))
f1 (void)
{
u = (struct S) { { 1, 2, 3, 4 } };
}
void __attribute__((noinline))
f2 (void)
{
u.t.a = 1;
u.t.b = 2;
u.t.c = 3;
u.t.d = 4;
}
void __attribute__((noinline))
f3 (void)
{
u.t.d = 4;
u.t.b = 2;
u.t.a = 1;
u.t.c = 3;
}
void __attribute__((noinline))
f4 (void)
{
struct S v __attribute__((aligned));
v.t.a = 1;
v.t.b = 2;
v.t.c = 3;
v.t.d = 4;
c2 (&v);
}
void __attribute__((noinline))
f5 (struct S *p)
{
p->t.a = 1;
p->t.c = 3;
p->t.d = 4;
p->t.b = 2;
}
void __attribute__((noinline))
f6 (void)
{
struct U v __attribute__((aligned));
v.s[2].t.a = 1;
v.s[2].t.b = 2;
v.s[2].t.c = 3;
v.s[2].t.d = 4;
c3 (&v);
}
void __attribute__((noinline))
f7 (struct U *p)
{
p->s[2].t.a = 1;
p->s[2].t.c = 3;
p->s[2].t.d = 4;
p->s[2].t.b = 2;
}
int
main (void)
{
struct U w __attribute__((aligned));
f1 ();
c2 (&u);
f2 ();
c1 (&u.t);
f3 ();
c2 (&u);
f4 ();
f5 (&u);
c2 (&u);
f6 ();
f7 (&w);
c3 (&w);
return 0;
}
/* { dg-do compile } */
/* { dg-require-effective-target non_strict_align } */
/* { dg-options "-O2 -fdump-tree-store-merging" } */
struct bar {
int a;
char b;
char c;
char d;
char e;
char f;
char g;
};
void
foo1 (struct bar *p)
{
p->b = 0;
p->a = 0;
p->c = 0;
p->d = 0;
p->e = 0;
}
void
foo2 (struct bar *p)
{
p->b = 0;
p->a = 0;
p->c = 1;
p->d = 0;
p->e = 0;
}
/* { dg-final { scan-tree-dump-times "Merging successful" 2 "store-merging" } } */
/* { dg-do run } */
/* { dg-require-effective-target non_strict_align } */
/* { dg-options "-O2 -fdump-tree-store-merging" } */
struct bar
{
int a;
unsigned char b;
unsigned char c;
short d;
unsigned char e;
unsigned char f;
unsigned char g;
};
__attribute__ ((noinline)) void
foozero (struct bar *p)
{
p->b = 0;
p->a = 0;
p->c = 0;
p->d = 0;
p->e = 0;
p->f = 0;
p->g = 0;
}
__attribute__ ((noinline)) void
foo1 (struct bar *p)
{
p->b = 1;
p->a = 2;
p->c = 3;
p->d = 4;
p->e = 5;
p->f = 0;
p->g = 0xff;
}
__attribute__ ((noinline)) void
foo2 (struct bar *p, struct bar *p2)
{
p->b = 0xff;
p2->b = 0xa;
p->a = 0xfffff;
p2->c = 0xc;
p->c = 0xff;
p2->d = 0xbf;
p->d = 0xfff;
}
int
main (void)
{
struct bar b1, b2;
foozero (&b1);
foozero (&b2);
foo1 (&b1);
if (b1.b != 1 || b1.a != 2 || b1.c != 3 || b1.d != 4 || b1.e != 5
|| b1.f != 0 || b1.g != 0xff)
__builtin_abort ();
foozero (&b1);
/* Make sure writes to aliasing struct pointers preserve the
correct order. */
foo2 (&b1, &b1);
if (b1.b != 0xa || b1.a != 0xfffff || b1.c != 0xff || b1.d != 0xfff)
__builtin_abort ();
foozero (&b1);
foo2 (&b1, &b2);
if (b1.a != 0xfffff || b1.b != 0xff || b1.c != 0xff || b1.d != 0xfff
|| b2.b != 0xa || b2.c != 0xc || b2.d != 0xbf)
__builtin_abort ();
return 0;
}
/* { dg-final { scan-tree-dump-times "Merging successful" 2 "store-merging" } } */
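The aliasing check in the test above depends on the order of interleaved stores through two pointers. As a reduced sketch of why a merging pass cannot reorder such stores freely, consider the following, where the struct and function names are hypothetical and only two byte fields are kept: when both pointers refer to the same object, the later store to each field must win.

```c
/* Hypothetical reduced struct with two adjacent byte fields.  */
struct pair
{
  unsigned char b;
  unsigned char c;
};

/* Stores through P and P2 are interleaved, as in foo2 above.  */
void
interleaved_stores (struct pair *p, struct pair *p2)
{
  p->b = 0xff;
  p2->b = 0x0a;
  p2->c = 0x0c;
  p->c = 0xff;
}

/* Both pointers name the same object.  The result is packed as
   (b << 8) | c so it can be compared with a single value.  */
unsigned
result_when_aliased (void)
{
  struct pair x = { 0, 0 };
  interleaved_stores (&x, &x);
  return ((unsigned) x.b << 8) | x.c;
}

/* The pointers name distinct objects; only X is inspected.  */
unsigned
result_when_distinct (void)
{
  struct pair x = { 0, 0 }, y = { 0, 0 };
  interleaved_stores (&x, &y);
  return ((unsigned) x.b << 8) | x.c;
}
```

In the aliased case `b` ends up as 0x0a and `c` as 0xff, matching the pattern the test checks after `foo2 (&b1, &b1)`. Grouping all stores through `p` ahead of those through `p2` would change `b`, so the original order has to be respected unless the pointers are known not to alias.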
/* { dg-do compile } */
/* { dg-require-effective-target non_strict_align } */
/* { dg-options "-O2 -fdump-tree-store-merging-details" } */
/* Make sure stores to volatile addresses don't get combined with
other accesses. */
struct bar
{
int a;
char b;
char c;
volatile short d;
char e;
char f;
char g;
};
void
foozero (struct bar *p)
{
p->b = 0xa;
p->a = 0xb;
p->c = 0xc;
p->d = 0;
p->e = 0xd;
p->f = 0xe;
p->g = 0xf;
}
/* { dg-final { scan-tree-dump "Volatile access terminates all chains" "store-merging" } } */
/* { dg-final { scan-tree-dump-times "=\{v\} 0;" 1 "store-merging" } } */
/* { dg-do compile } */
/* { dg-require-effective-target non_strict_align } */
/* { dg-options "-O2 -fdump-tree-store-merging" } */
/* Check that we can merge interleaving stores that are guaranteed
to be non-aliasing. */
struct bar
{
int a;
char b;
char c;
short d;
char e;
char f;
char g;
};
void
foozero (struct bar *restrict p, struct bar *restrict p2)
{
p->b = 0xff;
p2->b = 0xa;
p->a = 0xfffff;
p2->a = 0xab;
p2->c = 0xc;
p->c = 0xff;
p2->d = 0xbf;
p->d = 0xfff;
}
/* { dg-final { scan-tree-dump-times "Merging successful" 2 "store-merging" } } */
/* { dg-do compile } */
/* { dg-require-effective-target non_strict_align } */
/* { dg-options "-O2 -fdump-tree-store-merging" } */
/* Make sure that non-aliasing non-constant interspersed stores do not
stop chains. */
struct bar {
int a;
char b;
char c;
char d;
char e;
char g;
};
void
foo1 (struct bar *p, char tmp)
{
p->a = 0;
p->b = 0;
p->g = tmp;
p->c = 0;
p->d = 0;
p->e = 0;
}
/* { dg-final { scan-tree-dump-times "Merging successful" 1 "store-merging" } } */
/* { dg-final { scan-tree-dump-times "MEM\\\[.*\\\]" 1 "store-merging" } } */
/* { dg-do run } */
/* { dg-require-effective-target non_strict_align } */
/* { dg-options "-O2 -fdump-tree-store-merging" } */
/* Check that we can widen accesses to bitfields. */
struct bar {
int a : 3;
unsigned char b : 4;
unsigned char c : 1;
char d;
char e;
char f;
char g;
};
__attribute__ ((noinline)) void
foozero (struct bar *p)
{
p->b = 0;
p->a = 0;
p->c = 0;
p->d = 0;
p->e = 0;
p->f = 0;
p->g = 0;
}
__attribute__ ((noinline)) void
foo1 (struct bar *p)
{
p->b = 3;
p->a = 2;
p->c = 1;
p->d = 4;
p->e = 5;
}
int
main (void)
{
struct bar p;
foozero (&p);
foo1 (&p);
if (p.a != 2 || p.b != 3 || p.c != 1 || p.d != 4 || p.e != 5
|| p.f != 0 || p.g != 0)
__builtin_abort ();
return 0;
}
/* { dg-final { scan-tree-dump-times "Merging successful" 2 "store-merging" } } */
/* { dg-do compile } */
/* { dg-require-effective-target non_strict_align } */
/* { dg-options "-O2 -fdump-tree-store-merging" } */
/* Check that we can merge consecutive array members through the pointer.
PR rtl-optimization/23684. */
void
foo (char *input)
{
input = __builtin_assume_aligned (input, 8);
input[0] = 'H';
input[1] = 'e';
input[2] = 'l';
input[3] = 'l';
input[4] = 'o';
input[5] = ' ';
input[6] = 'w';
input[7] = 'o';
input[8] = 'r';
input[9] = 'l';
input[10] = 'd';
input[11] = '\0';
}
/* { dg-final { scan-tree-dump-times "Merging successful" 1 "store-merging" } } */
@@ -3,22 +3,22 @@
 int arr[4][4];
 
 void
-foo ()
+foo (int x, int y)
 {
-  arr[0][1] = 1;
-  arr[1][0] = -1;
-  arr[2][0] = 1;
-  arr[1][1] = -1;
-  arr[0][2] = 1;
-  arr[0][3] = -1;
-  arr[1][2] = 1;
-  arr[2][1] = -1;
-  arr[3][0] = 1;
-  arr[3][1] = -1;
-  arr[2][2] = 1;
-  arr[1][3] = -1;
-  arr[2][3] = 1;
-  arr[3][2] = -1;
+  arr[0][1] = x;
+  arr[1][0] = y;
+  arr[2][0] = x;
+  arr[1][1] = y;
+  arr[0][2] = x;
+  arr[0][3] = y;
+  arr[1][2] = x;
+  arr[2][1] = y;
+  arr[3][0] = x;
+  arr[3][1] = y;
+  arr[2][2] = x;
+  arr[1][3] = y;
+  arr[2][3] = x;
+  arr[3][2] = y;
 }
 
 /* { dg-final { scan-assembler-times "stp\tw\[0-9\]+, w\[0-9\]" 7 } } */
@@ -3,22 +3,22 @@
 float arr[4][4];
 
 void
-foo ()
+foo (float x, float y)
 {
-  arr[0][1] = 1;
-  arr[1][0] = -1;
-  arr[2][0] = 1;
-  arr[1][1] = -1;
-  arr[0][2] = 1;
-  arr[0][3] = -1;
-  arr[1][2] = 1;
-  arr[2][1] = -1;
-  arr[3][0] = 1;
-  arr[3][1] = -1;
-  arr[2][2] = 1;
-  arr[1][3] = -1;
-  arr[2][3] = 1;
-  arr[3][2] = -1;
+  arr[0][1] = x;
+  arr[1][0] = y;
+  arr[2][0] = x;
+  arr[1][1] = y;
+  arr[0][2] = x;
+  arr[0][3] = y;
+  arr[1][2] = x;
+  arr[2][1] = y;
+  arr[3][0] = x;
+  arr[3][1] = y;
+  arr[2][2] = x;
+  arr[1][3] = y;
+  arr[2][3] = x;
+  arr[3][2] = y;
 }
 
 /* { dg-final { scan-assembler-times "stp\ts\[0-9\]+, s\[0-9\]" 7 } } */
/* PR middle-end/22141 */
/* { dg-do compile } */
/* { dg-options "-Os" } */
extern void abort (void);
struct S
{
struct T
{
char a;
char b;
char c;
char d;
} t;
} u;
struct U
{
struct S s[4];
};
void __attribute__((noinline))
c1 (struct T *p)
{
if (p->a != 1 || p->b != 2 || p->c != 3 || p->d != 4)
abort ();
__builtin_memset (p, 0xaa, sizeof (*p));
}
void __attribute__((noinline))
c2 (struct S *p)
{
c1 (&p->t);
}
void __attribute__((noinline))
c3 (struct U *p)
{
c2 (&p->s[2]);
}
void __attribute__((noinline))
f1 (void)
{
u = (struct S) { { 1, 2, 3, 4 } };
}
void __attribute__((noinline))
f2 (void)
{
u.t.a = 1;
u.t.b = 2;
u.t.c = 3;
u.t.d = 4;
}
void __attribute__((noinline))
f3 (void)
{
u.t.d = 4;
u.t.b = 2;
u.t.a = 1;
u.t.c = 3;
}
void __attribute__((noinline))
f4 (void)
{
struct S v;
v.t.a = 1;
v.t.b = 2;
v.t.c = 3;
v.t.d = 4;
c2 (&v);
}
void __attribute__((noinline))
f5 (struct S *p)
{
p->t.a = 1;
p->t.c = 3;
p->t.d = 4;
p->t.b = 2;
}
void __attribute__((noinline))
f6 (void)
{
struct U v;
v.s[2].t.a = 1;
v.s[2].t.b = 2;
v.s[2].t.c = 3;
v.s[2].t.d = 4;
c3 (&v);
}
void __attribute__((noinline))
f7 (struct U *p)
{
p->s[2].t.a = 1;
p->s[2].t.c = 3;
p->s[2].t.d = 4;
p->s[2].t.b = 2;
}
int
main (void)
{
struct U w;
f1 ();
c2 (&u);
f2 ();
c1 (&u.t);
f3 ();
c2 (&u);
f4 ();
f5 (&u);
c2 (&u);
f6 ();
f7 (&w);
c3 (&w);
return 0;
}
/* { dg-final { scan-assembler-times "67305985\|4030201" 7 } } */
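The `scan-assembler-times` pattern above matches either the decimal or the hexadecimal spelling of a single 32-bit immediate. As a sketch of where those digits come from, assuming a little-endian target such as x86: the bytes 1, 2, 3, 4 stored at increasing addresses form the word 0x04030201, which is 67305985 in decimal. The helper name below is hypothetical.

```c
#include <stdint.h>

/* Assemble the 32-bit value a little-endian machine would read back
   from four consecutive bytes, with B0 at the lowest address.  */
uint32_t
le_word (uint8_t b0, uint8_t b1, uint8_t b2, uint8_t b3)
{
  return (uint32_t) b0
	 | ((uint32_t) b1 << 8)
	 | ((uint32_t) b2 << 16)
	 | ((uint32_t) b3 << 24);
}
```

With this helper, `le_word (1, 2, 3, 4)` equals both `0x04030201` and `67305985`. Written out: 4·16777216 + 3·65536 + 2·256 + 1 = 67305985.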
 /* PR rtl-optimization/34012 */
 /* { dg-do compile } */
 /* { dg-require-effective-target lp64 } */
-/* { dg-options "-O2" } */
+/* { dg-options "-O2 -fno-store-merging" } */
 void bar (long int *);
 void
......
@@ -282,6 +282,7 @@ DEFTIMEVAR (TV_TREE_UNINIT , "uninit var analysis")
 DEFTIMEVAR (TV_PLUGIN_INIT , "plugin initialization")
 DEFTIMEVAR (TV_PLUGIN_RUN , "plugin execution")
 DEFTIMEVAR (TV_GIMPLE_SLSR , "straight-line strength reduction")
+DEFTIMEVAR (TV_GIMPLE_STORE_MERGING , "store merging")
 DEFTIMEVAR (TV_VTABLE_VERIFICATION , "vtable verification")
 DEFTIMEVAR (TV_TREE_UBSAN , "tree ubsan")
 DEFTIMEVAR (TV_INITIALIZE_RTL , "initialize rtl")
......
@@ -426,6 +426,7 @@ extern gimple_opt_pass *make_pass_late_warn_uninitialized (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_cse_reciprocals (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_cse_sincos (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_optimize_bswap (gcc::context *ctxt);
+extern gimple_opt_pass *make_pass_store_merging (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_optimize_widening_mul (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_warn_function_return (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_warn_function_noreturn (gcc::context *ctxt);
......