Commit 8277ddf9 by Richard Sandiford

Make ivopts handle calls to internal functions

ivopts previously treated pointer arguments to internal functions
like IFN_MASK_LOAD and IFN_MASK_STORE as normal gimple values.
This patch makes it treat them as addresses instead.  This makes
a significant difference to the code quality for SVE loops,
since we can then use loads and stores with scaled indices.
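
As a rough source-level analogy only (this is not the ivopts implementation and is not taken from the patch), the two treatments resemble the two copy loops below. All function names here are hypothetical. In the first loop each pointer is its own induction variable. In the second a single index is shared and every access is a base plus a scaled index, which is the shape that an SVE `[xN, xM, lsl 2]` address can encode directly.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Analogy for treating the pointer as a plain value: every pointer is
   its own induction variable and is bumped on each iteration.  */
void
copy_pointer_bump (int *out, const int *in, size_t n)
{
  const int *end = in + n;
  while (in != end)
    *out++ = *in++;
}

/* Analogy for treating the pointer as an address: one shared index,
   with each access expressed as base + i * sizeof (int).  */
void
copy_scaled_index (int *out, const int *in, size_t n)
{
  for (size_t i = 0; i < n; ++i)
    out[i] = in[i];
}

/* Hypothetical self-check: both forms must produce the same copy.  */
void
check_copies (void)
{
  enum { N = 123 };
  int in[N], a[N], b[N];
  for (int i = 0; i < N; ++i)
    in[i] = (i * 4) ^ i;
  copy_pointer_bump (a, in, N);
  copy_scaled_index (b, in, N);
  assert (memcmp (a, in, sizeof a) == 0);
  assert (memcmp (a, b, sizeof a) == 0);
}
```

Both loops compute the same result. Which form the compiler ends up emitting is a cost decision made by ivopts, and this patch lets that decision take the addressing modes of masked loads and stores into account.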

2018-01-13  Richard Sandiford  <richard.sandiford@linaro.org>
	    Alan Hayward  <alan.hayward@arm.com>
	    David Sherwood  <david.sherwood@arm.com>

gcc/
	* tree-ssa-loop-ivopts.c (USE_ADDRESS): Split into...
	(USE_REF_ADDRESS, USE_PTR_ADDRESS): ...these new use types.
	(dump_groups): Update accordingly.
	(iv_use::mem_type): New member variable.
	(address_p): New function.
	(record_use): Add a mem_type argument and initialize the new
	mem_type field.
	(record_group_use): Add a mem_type argument.  Use address_p.
	Remove obsolete null checks of base_object.  Update call to record_use.
	(find_interesting_uses_op): Update call to record_group_use.
	(find_interesting_uses_cond): Likewise.
	(find_interesting_uses_address): Likewise.
	(get_mem_type_for_internal_fn): New function.
	(find_address_like_use): Likewise.
	(find_interesting_uses_stmt): Try find_address_like_use before
	calling find_interesting_uses_op.
	(addr_offset_valid_p): Use the iv mem_type field as the type
	of the addressed memory.
	(add_autoinc_candidates): Likewise.
	(get_address_cost): Likewise.
	(split_small_address_groups_p): Use address_p.
	(split_address_groups): Likewise.
	(add_iv_candidate_for_use): Likewise.
	(autoinc_possible_for_pair): Likewise.
	(rewrite_groups): Likewise.
	(get_use_type): Check for USE_REF_ADDRESS instead of USE_ADDRESS.
	(determine_group_iv_cost): Update after split of USE_ADDRESS.
	(get_alias_ptr_type_for_ptr_address): New function.
	(rewrite_use_address): Rewrite address uses in calls that were
	identified by find_address_like_use.
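
The entry above describes the USE_ADDRESS split only by name. A minimal sketch of what the use types and the address_p predicate could look like is given below, written in plain C for illustration. USE_REF_ADDRESS, USE_PTR_ADDRESS and address_p are named in the ChangeLog. The remaining enumerators, the comments, the function body, and check_address_p are assumptions and not a copy of tree-ssa-loop-ivopts.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Sketch of the use classification after the split.  The first and
   last enumerators are assumed; only the two address kinds are named
   in the ChangeLog.  */
enum use_type
{
  USE_NONLINEAR_EXPR, /* Assumed: use in a general expression.  */
  USE_REF_ADDRESS,    /* Address of an explicit memory reference.  */
  USE_PTR_ADDRESS,    /* Pointer argument to a call that acts as an
                         address, e.g. a masked load or store.  */
  USE_COMPARE         /* Assumed: use in a comparison.  */
};

/* Assumed body: true for either kind of address use, so that callers
   which previously tested for USE_ADDRESS can cover both.  */
bool
address_p (enum use_type type)
{
  return type == USE_REF_ADDRESS || type == USE_PTR_ADDRESS;
}

/* Hypothetical self-check of the predicate.  */
void
check_address_p (void)
{
  assert (address_p (USE_REF_ADDRESS));
  assert (address_p (USE_PTR_ADDRESS));
  assert (!address_p (USE_NONLINEAR_EXPR));
  assert (!address_p (USE_COMPARE));
}
```

A single predicate of this kind would explain why most of the ChangeLog lines read "Use address_p": code that applies to any address use calls the predicate, while code that needs an actual memory reference, such as get_use_type, checks for USE_REF_ADDRESS specifically.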

gcc/testsuite/
	* gcc.dg/tree-ssa/scev-9.c: Expected REFERENCE ADDRESS
	instead of just ADDRESS.
	* gcc.dg/tree-ssa/scev-10.c: Likewise.
	* gcc.dg/tree-ssa/scev-11.c: Likewise.
	* gcc.dg/tree-ssa/scev-12.c: Likewise.
	* gcc.target/aarch64/sve/index_offset_1.c: New test.
	* gcc.target/aarch64/sve/index_offset_1_run.c: Likewise.
	* gcc.target/aarch64/sve/loop_add_2.c: Likewise.
	* gcc.target/aarch64/sve/loop_add_3.c: Likewise.
	* gcc.target/aarch64/sve/while_1.c: Check for indexed addressing modes.
	* gcc.target/aarch64/sve/while_2.c: Likewise.
	* gcc.target/aarch64/sve/while_3.c: Likewise.
	* gcc.target/aarch64/sve/while_4.c: Likewise.

Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>

From-SVN: r256628
parent 65dd1346
@@ -2,6 +2,41 @@
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
* tree-ssa-loop-ivopts.c (USE_ADDRESS): Split into...
(USE_REF_ADDRESS, USE_PTR_ADDRESS): ...these new use types.
(dump_groups): Update accordingly.
(iv_use::mem_type): New member variable.
(address_p): New function.
(record_use): Add a mem_type argument and initialize the new
mem_type field.
(record_group_use): Add a mem_type argument. Use address_p.
Remove obsolete null checks of base_object. Update call to record_use.
(find_interesting_uses_op): Update call to record_group_use.
(find_interesting_uses_cond): Likewise.
(find_interesting_uses_address): Likewise.
(get_mem_type_for_internal_fn): New function.
(find_address_like_use): Likewise.
(find_interesting_uses_stmt): Try find_address_like_use before
calling find_interesting_uses_op.
(addr_offset_valid_p): Use the iv mem_type field as the type
of the addressed memory.
(add_autoinc_candidates): Likewise.
(get_address_cost): Likewise.
(split_small_address_groups_p): Use address_p.
(split_address_groups): Likewise.
(add_iv_candidate_for_use): Likewise.
(autoinc_possible_for_pair): Likewise.
(rewrite_groups): Likewise.
(get_use_type): Check for USE_REF_ADDRESS instead of USE_ADDRESS.
(determine_group_iv_cost): Update after split of USE_ADDRESS.
(get_alias_ptr_type_for_ptr_address): New function.
(rewrite_use_address): Rewrite address uses in calls that were
identified by find_address_like_use.
2018-01-13 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
* expr.c (expand_expr_addr_expr_1): Handle ADDR_EXPRs of
TARGET_MEM_REFs.
* gimple-expr.h (is_gimple_addressable: Likewise.
@@ -2,6 +2,24 @@
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
* gcc.dg/tree-ssa/scev-9.c: Expected REFERENCE ADDRESS
instead of just ADDRESS.
* gcc.dg/tree-ssa/scev-10.c: Likewise.
* gcc.dg/tree-ssa/scev-11.c: Likewise.
* gcc.dg/tree-ssa/scev-12.c: Likewise.
* gcc.target/aarch64/sve/index_offset_1.c: New test.
* gcc.target/aarch64/sve/index_offset_1_run.c: Likewise.
* gcc.target/aarch64/sve/loop_add_2.c: Likewise.
* gcc.target/aarch64/sve/loop_add_3.c: Likewise.
* gcc.target/aarch64/sve/while_1.c: Check for indexed addressing modes.
* gcc.target/aarch64/sve/while_2.c: Likewise.
* gcc.target/aarch64/sve/while_3.c: Likewise.
* gcc.target/aarch64/sve/while_4.c: Likewise.
2018-01-13 Richard Sandiford <richard.sandiford@linaro.org>
Alan Hayward <alan.hayward@arm.com>
David Sherwood <david.sherwood@arm.com>
* gcc.dg/vect/pr60482.c: Remove XFAIL for variable-length vectors.
* gcc.target/aarch64/sve/reduc_1.c: Expect the loop operations
to be predicated.
@@ -18,5 +18,5 @@ foo (signed char s, signed char l)
 }
 /* Address of array reference is scev. */
-/* { dg-final { scan-tree-dump-times " Type:\\tADDRESS\n Use \[0-9\].\[0-9\]:" 1 "ivopts" } } */
+/* { dg-final { scan-tree-dump-times " Type:\\tREFERENCE ADDRESS\n Use \[0-9\].\[0-9\]:" 1 "ivopts" } } */
@@ -23,4 +23,4 @@ foo (int n)
 }
 /* Address of array reference to b is scev. */
-/* { dg-final { scan-tree-dump-times " Type:\\tADDRESS\n Use \[0-9\].\[0-9\]:" 2 "ivopts" } } */
+/* { dg-final { scan-tree-dump-times " Type:\\tREFERENCE ADDRESS\n Use \[0-9\].\[0-9\]:" 2 "ivopts" } } */
@@ -24,4 +24,4 @@ foo (int x, int n)
 }
 /* Address of array reference to b is not scev. */
-/* { dg-final { scan-tree-dump-times " Type:\\tADDRESS\n Use \[0-9\].\[0-9\]:" 1 "ivopts" } } */
+/* { dg-final { scan-tree-dump-times " Type:\\tREFERENCE ADDRESS\n Use \[0-9\].\[0-9\]:" 1 "ivopts" } } */
@@ -18,5 +18,5 @@ foo (unsigned char s, unsigned char l)
 }
 /* Address of array reference is scev. */
-/* { dg-final { scan-tree-dump-times " Type:\\tADDRESS\n Use \[0-9\].\[0-9\]:" 1 "ivopts" } } */
+/* { dg-final { scan-tree-dump-times " Type:\\tREFERENCE ADDRESS\n Use \[0-9\].\[0-9\]:" 1 "ivopts" } } */
/* { dg-do compile } */
/* { dg-options "-O2 -ftree-vectorize -msve-vector-bits=256" } */
#define SIZE (15 * 8 + 3)
#define DEF_INDEX_OFFSET(SIGNED, TYPE, ITERTYPE) \
void __attribute__ ((noinline, noclone)) \
set_##SIGNED##_##TYPE##_##ITERTYPE (SIGNED TYPE *restrict out, \
                                    SIGNED TYPE *restrict in) \
{ \
  SIGNED ITERTYPE i; \
  for (i = 0; i < SIZE; i++) \
    { \
      out[i] = in[i]; \
    } \
} \
void __attribute__ ((noinline, noclone)) \
set_##SIGNED##_##TYPE##_##ITERTYPE##_var (SIGNED TYPE *restrict out, \
                                          SIGNED TYPE *restrict in, \
                                          SIGNED ITERTYPE n) \
{ \
  SIGNED ITERTYPE i; \
  for (i = 0; i < n; i++) \
    { \
      out[i] = in[i]; \
    } \
}

#define TEST_TYPE(T, SIGNED, TYPE) \
  T (SIGNED, TYPE, char) \
  T (SIGNED, TYPE, short) \
  T (SIGNED, TYPE, int) \
  T (SIGNED, TYPE, long)

#define TEST_ALL(T) \
  TEST_TYPE (T, signed, long) \
  TEST_TYPE (T, unsigned, long) \
  TEST_TYPE (T, signed, int) \
  TEST_TYPE (T, unsigned, int) \
  TEST_TYPE (T, signed, short) \
  TEST_TYPE (T, unsigned, short) \
  TEST_TYPE (T, signed, char) \
  TEST_TYPE (T, unsigned, char)

TEST_ALL (DEF_INDEX_OFFSET)

/* { dg-final { scan-assembler-times "ld1d\\tz\[0-9\]+.d, p\[0-9\]+/z, \\\[x\[0-9\]+, x\[0-9\]+, lsl 3\\\]" 16 } } */
/* { dg-final { scan-assembler-times "st1d\\tz\[0-9\]+.d, p\[0-9\]+, \\\[x\[0-9\]+, x\[0-9\]+, lsl 3\\\]" 16 } } */
/* { dg-final { scan-assembler-times "ld1w\\tz\[0-9\]+.s, p\[0-9\]+/z, \\\[x\[0-9\]+, x\[0-9\]+, lsl 2\\\]" 16 } } */
/* { dg-final { scan-assembler-times "st1w\\tz\[0-9\]+.s, p\[0-9\]+, \\\[x\[0-9\]+, x\[0-9\]+, lsl 2\\\]" 16 } } */
/* { dg-final { scan-assembler-times "ld1h\\tz\[0-9\]+.h, p\[0-9\]+/z, \\\[x\[0-9\]+, x\[0-9\]+, lsl 1\\\]" 16 } } */
/* { dg-final { scan-assembler-times "st1h\\tz\[0-9\]+.h, p\[0-9\]+, \\\[x\[0-9\]+, x\[0-9\]+, lsl 1\\\]" 16 } } */
/* { dg-final { scan-assembler-times "ld1b\\tz\[0-9\]+.b, p\[0-9\]+/z, \\\[x\[0-9\]+, x\[0-9\]+\\\]" 16 } } */
/* { dg-final { scan-assembler-times "st1b\\tz\[0-9\]+.b, p\[0-9\]+, \\\[x\[0-9\]+, x\[0-9\]+\\\]" 16 } } */
/* { dg-do run { target aarch64_sve_hw } } */
/* { dg-options "-O2 -ftree-vectorize" } */
/* { dg-options "-O2 -ftree-vectorize -msve-vector-bits=256" { target aarch64_sve256_hw } } */
#include "index_offset_1.c"
#define TEST_INDEX_OFFSET(SIGNED, TYPE, ITERTYPE) \
{ \
  SIGNED TYPE out[SIZE + 1]; \
  SIGNED TYPE in1[SIZE + 1]; \
  SIGNED TYPE in2[SIZE + 1]; \
  for (int i = 0; i < SIZE + 1; ++i) \
    { \
      in1[i] = (i * 4) ^ i; \
      in2[i] = (i * 2) ^ i; \
      asm volatile ("" ::: "memory"); \
    } \
  out[SIZE] = 42; \
  set_##SIGNED##_##TYPE##_##ITERTYPE (out, in1); \
  if (0 != __builtin_memcmp (out, in1, SIZE * sizeof (TYPE))) \
    __builtin_abort (); \
  set_##SIGNED##_##TYPE##_##ITERTYPE##_var (out, in2, SIZE); \
  if (0 != __builtin_memcmp (out, in2, SIZE * sizeof (TYPE))) \
    __builtin_abort (); \
  if (out[SIZE] != 42) \
    __builtin_abort (); \
}

int __attribute__ ((optimize (1)))
main (void)
{
  TEST_ALL (TEST_INDEX_OFFSET);
  return 0;
}
/* { dg-do compile } */
/* { dg-options "-std=c99 -O3" } */
void
foo (int *__restrict a, int *__restrict b)
{
  for (int i = 0; i < 512; ++i)
    a[i] += b[i];
}

/* { dg-final { scan-assembler-times {\tld1w\tz[0-9]+.s, p[0-7]+/z, \[x[0-9]+, x[0-9]+, lsl 2\]\n} 2 } } */
/* { dg-final { scan-assembler-times {\tst1w\tz[0-9]+.s, p[0-7]+, \[x[0-9]+, x[0-9]+, lsl 2\]\n} 1 } } */
/* { dg-do compile } */
/* { dg-options "-std=c99 -O3" } */
void
f (int *__restrict a,
   int *__restrict b,
   int *__restrict c,
   int *__restrict d,
   int *__restrict e,
   int *__restrict f,
   int *__restrict g,
   int *__restrict h,
   int count)
{
  for (int i = 0; i < count; ++i)
    a[i] = b[i] + c[i] + d[i] + e[i] + f[i] + g[i] + h[i];
}

/* { dg-final { scan-assembler-times {\tld1w\tz[0-9]+.s, p[0-7]+/z, \[x[0-9]+, x[0-9]+, lsl 2\]\n} 7 } } */
/* { dg-final { scan-assembler-times {\tst1w\tz[0-9]+.s, p[0-7]+, \[x[0-9]+, x[0-9]+, lsl 2\]\n} 1 } } */
@@ -34,3 +34,11 @@ TEST_ALL (ADD_LOOP)
/* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.s, x[0-9]+,} 3 } } */
/* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.d, xzr,} 3 } } */
/* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.d, x[0-9]+,} 3 } } */
/* { dg-final { scan-assembler-times {\tld1b\tz[0-9]+\.b, p[0-7]/z, \[x0, x[0-9]+\]\n} 2 } } */
/* { dg-final { scan-assembler-times {\tst1b\tz[0-9]+\.b, p[0-7], \[x0, x[0-9]+\]\n} 2 } } */
/* { dg-final { scan-assembler-times {\tld1h\tz[0-9]+\.h, p[0-7]/z, \[x0, x[0-9]+, lsl 1\]\n} 2 } } */
/* { dg-final { scan-assembler-times {\tst1h\tz[0-9]+\.h, p[0-7], \[x0, x[0-9]+, lsl 1\]\n} 2 } } */
/* { dg-final { scan-assembler-times {\tld1w\tz[0-9]+\.s, p[0-7]/z, \[x0, x[0-9]+, lsl 2\]\n} 3 } } */
/* { dg-final { scan-assembler-times {\tst1w\tz[0-9]+\.s, p[0-7], \[x0, x[0-9]+, lsl 2\]\n} 3 } } */
/* { dg-final { scan-assembler-times {\tld1d\tz[0-9]+\.d, p[0-7]/z, \[x0, x[0-9]+, lsl 3\]\n} 3 } } */
/* { dg-final { scan-assembler-times {\tst1d\tz[0-9]+\.d, p[0-7], \[x0, x[0-9]+, lsl 3\]\n} 3 } } */
@@ -34,3 +34,11 @@ TEST_ALL (ADD_LOOP)
/* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.s, x[0-9]+,} 3 } } */
/* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.d, xzr,} 3 } } */
/* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.d, x[0-9]+,} 3 } } */
/* { dg-final { scan-assembler-times {\tld1b\tz[0-9]+\.b, p[0-7]/z, \[x0, x[0-9]+\]\n} 2 } } */
/* { dg-final { scan-assembler-times {\tst1b\tz[0-9]+\.b, p[0-7], \[x0, x[0-9]+\]\n} 2 } } */
/* { dg-final { scan-assembler-times {\tld1h\tz[0-9]+\.h, p[0-7]/z, \[x0, x[0-9]+, lsl 1\]\n} 2 } } */
/* { dg-final { scan-assembler-times {\tst1h\tz[0-9]+\.h, p[0-7], \[x0, x[0-9]+, lsl 1\]\n} 2 } } */
/* { dg-final { scan-assembler-times {\tld1w\tz[0-9]+\.s, p[0-7]/z, \[x0, x[0-9]+, lsl 2\]\n} 3 } } */
/* { dg-final { scan-assembler-times {\tst1w\tz[0-9]+\.s, p[0-7], \[x0, x[0-9]+, lsl 2\]\n} 3 } } */
/* { dg-final { scan-assembler-times {\tld1d\tz[0-9]+\.d, p[0-7]/z, \[x0, x[0-9]+, lsl 3\]\n} 3 } } */
/* { dg-final { scan-assembler-times {\tst1d\tz[0-9]+\.d, p[0-7], \[x0, x[0-9]+, lsl 3\]\n} 3 } } */
@@ -34,3 +34,11 @@ TEST_ALL (ADD_LOOP)
/* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.s, x[0-9]+,} 3 } } */
/* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.d, xzr,} 3 } } */
/* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.d, x[0-9]+,} 3 } } */
/* { dg-final { scan-assembler-times {\tld1b\tz[0-9]+\.b, p[0-7]/z, \[x0, x[0-9]+\]\n} 2 } } */
/* { dg-final { scan-assembler-times {\tst1b\tz[0-9]+\.b, p[0-7], \[x0, x[0-9]+\]\n} 2 } } */
/* { dg-final { scan-assembler-times {\tld1h\tz[0-9]+\.h, p[0-7]/z, \[x0, x[0-9]+, lsl 1\]\n} 2 } } */
/* { dg-final { scan-assembler-times {\tst1h\tz[0-9]+\.h, p[0-7], \[x0, x[0-9]+, lsl 1\]\n} 2 } } */
/* { dg-final { scan-assembler-times {\tld1w\tz[0-9]+\.s, p[0-7]/z, \[x0, x[0-9]+, lsl 2\]\n} 3 } } */
/* { dg-final { scan-assembler-times {\tst1w\tz[0-9]+\.s, p[0-7], \[x0, x[0-9]+, lsl 2\]\n} 3 } } */
/* { dg-final { scan-assembler-times {\tld1d\tz[0-9]+\.d, p[0-7]/z, \[x0, x[0-9]+, lsl 3\]\n} 3 } } */
/* { dg-final { scan-assembler-times {\tst1d\tz[0-9]+\.d, p[0-7], \[x0, x[0-9]+, lsl 3\]\n} 3 } } */
@@ -35,3 +35,11 @@ TEST_ALL (ADD_LOOP)
/* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.s, x[0-9]+,} 3 } } */
/* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.d, xzr,} 3 } } */
/* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.d, x[0-9]+,} 3 } } */
/* { dg-final { scan-assembler-times {\tld1b\tz[0-9]+\.b, p[0-7]/z, \[x0, x[0-9]+\]\n} 2 } } */
/* { dg-final { scan-assembler-times {\tst1b\tz[0-9]+\.b, p[0-7], \[x0, x[0-9]+\]\n} 2 } } */
/* { dg-final { scan-assembler-times {\tld1h\tz[0-9]+\.h, p[0-7]/z, \[x0, x[0-9]+, lsl 1\]\n} 2 } } */
/* { dg-final { scan-assembler-times {\tst1h\tz[0-9]+\.h, p[0-7], \[x0, x[0-9]+, lsl 1\]\n} 2 } } */
/* { dg-final { scan-assembler-times {\tld1w\tz[0-9]+\.s, p[0-7]/z, \[x0, x[0-9]+, lsl 2\]\n} 3 } } */
/* { dg-final { scan-assembler-times {\tst1w\tz[0-9]+\.s, p[0-7], \[x0, x[0-9]+, lsl 2\]\n} 3 } } */
/* { dg-final { scan-assembler-times {\tld1d\tz[0-9]+\.d, p[0-7]/z, \[x0, x[0-9]+, lsl 3\]\n} 3 } } */
/* { dg-final { scan-assembler-times {\tst1d\tz[0-9]+\.d, p[0-7], \[x0, x[0-9]+, lsl 3\]\n} 3 } } */