Commit cd1bef27 by Jeff Law Committed by Tamar Christina

Updated stack-clash implementation supporting 64k probes.

This patch implements stack clash mitigation for AArch64.
On AArch64 we expect both the probing interval and the guard size to be 64KB,
and we enforce that they are always equal.

We also probe up by 1024 bytes in the general case when a probe is required.

AArch64 has the following probing conditions:

 1a) Any initial adjustment less than 63KB requires no probing.  An ABI-defined
     safe buffer of 1KB is used and a page size of 64KB is assumed.

  b) Any final adjustment residual requires a probe at SP + 1KB.
     We know this to be safe since you would have done at least one page worth
     of allocations already to get to that point.

  c) Any final adjustment whose remainder (of the total allocation amount) is
     larger than 1KB - LR offset requires a probe at SP.


  The safe buffer mentioned in 1a is maintained by the storing of FP/LR.
  In the case of -fomit-frame-pointer we can still count on LR being stored
  if the function makes a call, even if it's a tail call.  The AArch64 frame
  layout code guarantees this and tests have been added to check against
  this particular case.

 2) Any allocation larger than one page size is done in increments of the page
    size and probed up by 1KB, leaving the residuals.

 3a) Any residual for initial adjustment that is less than guard-size - 1KB
     requires no probing.  Essentially this is a sliding window.  The probing
     range determines the ABI safe buffer, and the amount to be probed up.

Incrementally allocating less than the probing thresholds, e.g. in recursive
functions, will not be an issue, as the storing of LR counts as a probe.
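
As an illustration of rules 1a, 2 and 3a, the following is a minimal sketch of
how many probes would execute at run time for a constant initial adjustment.
It is a simplified model, not the code in aarch64.c: the helper name
model_initial_probes is hypothetical, the guard size is fixed at 64KB, and the
final-adjustment rules (1b, 1c) are not modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Values taken from the commit message: 64KB guard size (equal to the
   probing interval) and a 1KB caller guard (STACK_CLASH_CALLER_GUARD).  */
#define GUARD_SIZE   (UINT64_C (64) * 1024)
#define CALLER_GUARD UINT64_C (1024)

/* Hypothetical helper (not part of GCC): number of probes executed at
   run time for a constant initial adjustment of SIZE bytes.  */
static uint64_t
model_initial_probes (uint64_t size)
{
  assert (GUARD_SIZE > CALLER_GUARD);
  const uint64_t threshold = GUARD_SIZE - CALLER_GUARD; /* 63KB */

  /* Rule 1a: anything below the threshold needs no probe.  */
  if (size < threshold)
    return 0;

  /* Rule 2: whole pages are allocated and probed one page at a time.  */
  uint64_t pages = size / GUARD_SIZE;
  uint64_t residual = size % GUARD_SIZE;

  /* Rule 3a: the residual is probed only if it reaches the threshold.  */
  return pages + (residual >= threshold ? 1 : 0);
}
```

Under this model a 127KB adjustment yields two probes (one full page plus a
63KB residual), while a 65KB adjustment yields one.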


                            +-------------------+                                    
                            |  ABI SAFE REGION  |                                    
                  +------------------------------                                    
                  |         |                   |                                    
                  |         |                   |                                    
                  |         |                   |                                    
                  |         |                   |                                    
                  |         |                   |                                    
                  |         |                   |                                    
 maximum amount   |         |                   |                                    
 not needing a    |         |                   |                                    
 probe            |         |                   |                                    
                  |         |                   |                                    
                  |         |                   |                                    
                  |         |                   |                                    
                  |         |                   |        Probe offset when           
                  |         ---------------------------- probe is required           
                  |         |                   |                                    
                  +-------- +-------------------+ --------  Point of first probe     
                            |  ABI SAFE REGION  |                                    
                            ---------------------                                    
                            |                   |                                    
                            |                   |                                    
                            |                   |                                         

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.
Target was tested with stack clash on and off by default.

The GLIBC testsuite was also run with stack clash on by default, with no new
regressions.


Co-Authored-By: Richard Sandiford <richard.sandiford@linaro.org>
Co-Authored-By: Tamar Christina <tamar.christina@arm.com>

From-SVN: r264747
parent 041bfa6f
2018-10-01 Jeff Law <law@redhat.com>
Richard Sandiford <richard.sandiford@linaro.org>
Tamar Christina <tamar.christina@arm.com>
PR target/86486
* config/aarch64/aarch64.md
(probe_stack_range): Add k (SP) constraint.
* config/aarch64/aarch64.h (STACK_CLASH_CALLER_GUARD,
STACK_CLASH_MAX_UNROLL_PAGES): New.
* config/aarch64/aarch64.c (aarch64_output_probe_stack_range): Emit
stack probes for stack clash.
(aarch64_allocate_and_probe_stack_space): New.
(aarch64_expand_prologue): Use it.
(aarch64_expand_epilogue): Likewise and update IP regs re-use criteria.
(aarch64_sub_sp): Add emit_move_imm optional param.
2018-10-01 MCC CS <deswurstes@users.noreply.github.com>
PR tree-optimization/87261
......@@ -84,6 +84,14 @@
#define LONG_DOUBLE_TYPE_SIZE 128
/* This value is the amount of bytes a caller is allowed to drop the stack
before probing has to be done for stack clash protection. */
#define STACK_CLASH_CALLER_GUARD 1024
/* This value controls how many pages we manually unroll the loop for when
generating stack clash probes. */
#define STACK_CLASH_MAX_UNROLL_PAGES 4
/* The architecture reserves all bits of the address for hardware use,
so the vbit must go into the delta field of pointers to member
functions. This is the same config as that in the AArch32
......
......@@ -6503,7 +6503,7 @@
)
(define_insn "probe_stack_range"
-[(set (match_operand:DI 0 "register_operand" "=r")
+[(set (match_operand:DI 0 "register_operand" "=rk")
(unspec_volatile:DI [(match_operand:DI 1 "register_operand" "0")
(match_operand:DI 2 "register_operand" "r")]
UNSPECV_PROBE_STACK_RANGE))]
......
2018-10-01 Jeff Law <law@redhat.com>
Richard Sandiford <richard.sandiford@linaro.org>
Tamar Christina <tamar.christina@arm.com>
PR target/86486
* gcc.target/aarch64/stack-check-12.c: New.
* gcc.target/aarch64/stack-check-13.c: New.
* gcc.target/aarch64/stack-check-cfa-1.c: New.
* gcc.target/aarch64/stack-check-cfa-2.c: New.
* gcc.target/aarch64/stack-check-prologue-1.c: New.
* gcc.target/aarch64/stack-check-prologue-10.c: New.
* gcc.target/aarch64/stack-check-prologue-11.c: New.
* gcc.target/aarch64/stack-check-prologue-12.c: New.
* gcc.target/aarch64/stack-check-prologue-13.c: New.
* gcc.target/aarch64/stack-check-prologue-14.c: New.
* gcc.target/aarch64/stack-check-prologue-15.c: New.
* gcc.target/aarch64/stack-check-prologue-2.c: New.
* gcc.target/aarch64/stack-check-prologue-3.c: New.
* gcc.target/aarch64/stack-check-prologue-4.c: New.
* gcc.target/aarch64/stack-check-prologue-5.c: New.
* gcc.target/aarch64/stack-check-prologue-6.c: New.
* gcc.target/aarch64/stack-check-prologue-7.c: New.
* gcc.target/aarch64/stack-check-prologue-8.c: New.
* gcc.target/aarch64/stack-check-prologue-9.c: New.
* gcc.target/aarch64/stack-check-prologue.h: New.
* lib/target-supports.exp
(check_effective_target_supports_stack_clash_protection): Add AArch64.
2018-10-01 Tamar Christina <tamar.christina@arm.com>
* lib/target-supports.exp (check_cached_effective_target_indexed): New.
......
/* { dg-do compile } */
/* { dg-options "-O2 -fstack-clash-protection --param stack-clash-protection-guard-size=16 -fno-asynchronous-unwind-tables -fno-unwind-tables" } */
/* { dg-require-effective-target supports_stack_clash_protection } */
extern void arf (unsigned long int *, unsigned long int *);
void
frob ()
{
unsigned long int num[10000];
unsigned long int den[10000];
arf (den, num);
}
/* This verifies that the scheduler did not break the dependencies
by adjusting the offsets within the probe and that the scheduler
did not reorder around the stack probes. */
/* { dg-final { scan-assembler-times {sub\tsp, sp, #65536\n\tstr\txzr, \[sp, 1024\]} 2 } } */
/* There is some residual allocation, but we don't care about that. Only that it's not probed. */
/* { dg-final { scan-assembler-times {str\txzr, } 2 } } */
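
The expected count of two probes in the test above follows from simple
arithmetic, sketched here. The guard-size parameter is a log2 value, so 16
means 64KB. The helper names are hypothetical, unsigned long int is assumed to
be 8 bytes (as on LP64 AArch64), and the few bytes of register save area in
the real frame are ignored.

```c
#include <assert.h>
#include <stdint.h>

/* --param stack-clash-protection-guard-size=N gives the guard size as
   2 raised to N bytes.  Hypothetical helper, not part of GCC.  */
static uint64_t
guard_bytes (unsigned log2_size)
{
  assert (log2_size < 64);
  return UINT64_C (1) << log2_size;
}

/* Bytes of local arrays in frob (): two arrays of 10000 elements,
   assuming 8-byte unsigned long int.  */
static uint64_t
frob_local_bytes (void)
{
  return 2 * 10000 * UINT64_C (8);
}

/* Number of whole guard-sized pages in the local arrays.  */
static uint64_t
frob_full_pages (void)
{
  return frob_local_bytes () / guard_bytes (16);
}

/* Bytes left over after the whole pages.  */
static uint64_t
frob_residual (void)
{
  return frob_local_bytes () % guard_bytes (16);
}
```

With these assumptions the arrays occupy 160000 bytes: two full 64KB pages,
each probed at SP + 1024, plus a 28928-byte residual that is below the 63KB
threshold and so is left unprobed.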
/* { dg-do compile } */
/* { dg-options "-O2 -fstack-clash-protection --param stack-clash-protection-guard-size=16 -fno-asynchronous-unwind-tables -fno-unwind-tables" } */
/* { dg-require-effective-target supports_stack_clash_protection } */
#define ARG32(X) X,X,X,X,X,X,X,X,X,X,X,X,X,X,X,X,X,X,X,X,X,X,X,X,X,X,X,X,X,X,X,X
#define ARG192(X) ARG32(X),ARG32(X),ARG32(X),ARG32(X),ARG32(X),ARG32(X)
void out1(ARG192(__int128));
int t1(int);
int t3(int x)
{
if (x < 1000)
return t1 (x) + 1;
out1 (ARG192(1));
return 0;
}
/* This test creates a large (> 1k) outgoing argument area that needs
to be probed. We don't test the exact size of the space or the
exact offset to make the test a little less sensitive to trivial
output changes. */
/* { dg-final { scan-assembler-times "sub\\tsp, sp, #....\\n\\tstr\\txzr, \\\[sp" 1 } } */
/* { dg-do compile } */
/* { dg-options "-O2 -fstack-clash-protection --param stack-clash-protection-guard-size=16 -funwind-tables" } */
/* { dg-require-effective-target supports_stack_clash_protection } */
#define SIZE 128*1024
#include "stack-check-prologue.h"
/* { dg-final { scan-assembler-times {\.cfi_def_cfa_offset 65536} 1 } } */
/* { dg-final { scan-assembler-times {\.cfi_def_cfa_offset 131072} 1 } } */
/* { dg-final { scan-assembler-times {\.cfi_def_cfa_offset 0} 1 } } */
/* Checks that the CFA notes are correct for every sp adjustment. */
/* { dg-do compile } */
/* { dg-options "-O2 -fstack-clash-protection --param stack-clash-protection-guard-size=16 -funwind-tables" } */
/* { dg-require-effective-target supports_stack_clash_protection } */
#define SIZE 1280*1024 + 512
#include "stack-check-prologue.h"
/* { dg-final { scan-assembler-times {\.cfi_def_cfa [0-9]+, 1310720} 1 } } */
/* { dg-final { scan-assembler-times {\.cfi_def_cfa_offset 1311232} 1 } } */
/* { dg-final { scan-assembler-times {\.cfi_def_cfa_offset 1310720} 1 } } */
/* { dg-final { scan-assembler-times {\.cfi_def_cfa_offset 0} 1 } } */
/* Checks that the CFA notes are correct for every sp adjustment. */
/* { dg-do compile } */
/* { dg-options "-O2 -fstack-clash-protection --param stack-clash-protection-guard-size=16" } */
/* { dg-require-effective-target supports_stack_clash_protection } */
#define SIZE 128
#include "stack-check-prologue.h"
/* { dg-final { scan-assembler-times {str\s+xzr,} 0 } } */
/* SIZE is smaller than guard-size - 1Kb so no probe expected. */
/* { dg-do compile } */
/* { dg-options "-O2 -fstack-clash-protection --param stack-clash-protection-guard-size=16" } */
/* { dg-require-effective-target supports_stack_clash_protection } */
#define SIZE (6 * 64 * 1024) + (1 * 63 * 1024) + 512
#include "stack-check-prologue.h"
/* { dg-final { scan-assembler-times {str\s+xzr, \[sp, 1024\]} 2 } } */
/* SIZE is more than 4x guard-size and remainder larger than guard-size - 1Kb,
1 probe expected in a loop and 1 residual probe. */
/* { dg-do compile } */
/* { dg-options "-O2 -fstack-clash-protection --param stack-clash-protection-guard-size=16" } */
/* { dg-require-effective-target supports_stack_clash_protection } */
#define SIZE (6 * 64 * 1024) + (1 * 32 * 1024)
#include "stack-check-prologue.h"
/* { dg-final { scan-assembler-times {str\s+xzr, \[sp, 1024\]} 1 } } */
/* SIZE is more than 4x guard-size and remainder smaller than guard-size - 1Kb,
1 probe expected in a loop and no residual probe. */
/* { dg-do compile } */
/* { dg-options "-O2 -fstack-clash-protection --param stack-clash-protection-guard-size=16 -fomit-frame-pointer -momit-leaf-frame-pointer" } */
/* { dg-require-effective-target supports_stack_clash_protection } */
void
f (void)
{
volatile int x[16384 + 1000];
x[0] = 0;
}
/* { dg-final { scan-assembler-times {str\s+xzr, \[sp, 1024\]} 1 } } */
/* SIZE is more than 1 guard-size, but only one 64KB page is used, expect only 1
probe. Leaf function and omitting leaf pointers. */
/* { dg-do compile } */
/* { dg-options "-O2 -fstack-clash-protection --param stack-clash-protection-guard-size=16 -fomit-frame-pointer -momit-leaf-frame-pointer" } */
/* { dg-require-effective-target supports_stack_clash_protection } */
void h (void) __attribute__ ((noreturn));
void
f (void)
{
volatile int x[16384 + 1000];
x[30]=0;
h ();
}
/* { dg-final { scan-assembler-times {str\s+xzr, \[sp, 1024\]} 1 } } */
/* { dg-final { scan-assembler-times {str\s+x30, \[sp\]} 1 } } */
/* SIZE is more than 1 guard-size, but only one 64KB page is used, expect only 1
probe. Leaf function and omitting leaf pointers, tail call to noreturn which
may only omit an epilogue and not a prologue. Checking for LR saving. */
\ No newline at end of file
/* { dg-do compile } */
/* { dg-options "-O2 -fstack-clash-protection --param stack-clash-protection-guard-size=16 -fomit-frame-pointer -momit-leaf-frame-pointer" } */
/* { dg-require-effective-target supports_stack_clash_protection } */
void h (void) __attribute__ ((noreturn));
void
f (void)
{
volatile int x[16384 + 1000];
if (x[0])
h ();
x[345] = 1;
h ();
}
/* { dg-final { scan-assembler-times {str\s+xzr, \[sp, 1024\]} 1 } } */
/* { dg-final { scan-assembler-times {str\s+x30, \[sp\]} 1 } } */
/* SIZE is more than 1 guard-size, two 64k pages used, expect only 1 explicit
probe at 1024 and one implicit probe due to LR being saved. Leaf function
and omitting leaf pointers, tail call to noreturn which may only omit an
epilogue and not a prologue and control flow in between. Checking for
LR saving. */
\ No newline at end of file
/* { dg-do compile } */
/* { dg-options "-O2 -fstack-clash-protection --param stack-clash-protection-guard-size=16 -fomit-frame-pointer -momit-leaf-frame-pointer" } */
/* { dg-require-effective-target supports_stack_clash_protection } */
void g (volatile int *x) ;
void h (void) __attribute__ ((noreturn));
void
f (void)
{
volatile int x[16384 + 1000];
g (x);
h ();
}
/* { dg-final { scan-assembler-times {str\s+xzr, \[sp, 1024\]} 1 } } */
/* { dg-final { scan-assembler-times {str\s+x30, \[sp\]} 1 } } */
/* SIZE is more than 1 guard-size, two 64k pages used, expect only 1 explicit
probe at 1024 and one implicit probe due to LR being saved. Leaf function
and omitting leaf pointers, normal function call followed by a tail call to
noreturn which may only omit an epilogue and not a prologue and control flow
in between. Checking for LR saving. */
/* { dg-do compile } */
/* { dg-options "-O2 -fstack-clash-protection --param stack-clash-protection-guard-size=16" } */
/* { dg-require-effective-target supports_stack_clash_protection } */
#define SIZE 2 * 1024
#include "stack-check-prologue.h"
/* { dg-final { scan-assembler-times {str\s+xzr,} 0 } } */
/* SIZE is smaller than guard-size - 1Kb so no probe expected. */
/* { dg-do compile } */
/* { dg-options "-O2 -fstack-clash-protection --param stack-clash-protection-guard-size=16" } */
/* { dg-require-effective-target supports_stack_clash_protection } */
#define SIZE 63 * 1024
#include "stack-check-prologue.h"
/* { dg-final { scan-assembler-times {str\s+xzr,} 1 } } */
/* SIZE is exactly guard-size - 1Kb, boundary condition so 1 probe expected.
*/
/* { dg-do compile } */
/* { dg-options "-O2 -fstack-clash-protection --param stack-clash-protection-guard-size=16" } */
/* { dg-require-effective-target supports_stack_clash_protection } */
#define SIZE 63 * 1024 + 512
#include "stack-check-prologue.h"
/* { dg-final { scan-assembler-times {str\s+xzr, \[sp, 1024\]} 1 } } */
/* SIZE is more than guard-size - 1Kb and remainder is less than 1kB,
1 probe expected. */
/* { dg-do compile } */
/* { dg-options "-O2 -fstack-clash-protection --param stack-clash-protection-guard-size=16" } */
/* { dg-require-effective-target supports_stack_clash_protection } */
#define SIZE 64 * 1024
#include "stack-check-prologue.h"
/* { dg-final { scan-assembler-times {str\s+xzr, \[sp, 1024\]} 1 } } */
/* SIZE is more than guard-size - 1Kb and remainder is zero,
1 probe expected, boundary condition. */
/* { dg-do compile } */
/* { dg-options "-O2 -fstack-clash-protection --param stack-clash-protection-guard-size=16" } */
/* { dg-require-effective-target supports_stack_clash_protection } */
#define SIZE 65 * 1024
#include "stack-check-prologue.h"
/* { dg-final { scan-assembler-times {str\s+xzr, \[sp, 1024\]} 1 } } */
/* SIZE is more than guard-size - 1Kb and remainder is equal to 1kB,
1 probe expected. */
/* { dg-do compile } */
/* { dg-options "-O2 -fstack-clash-protection --param stack-clash-protection-guard-size=16" } */
/* { dg-require-effective-target supports_stack_clash_protection } */
#define SIZE 127 * 1024
#include "stack-check-prologue.h"
/* { dg-final { scan-assembler-times {str\s+xzr, \[sp, 1024\]} 2 } } */
/* SIZE is more than 1x guard-size and remainder equal to guard-size - 1Kb,
2 probes expected, unrolled, no loop. */
/* { dg-do compile } */
/* { dg-options "-O2 -fstack-clash-protection --param stack-clash-protection-guard-size=16" } */
/* { dg-require-effective-target supports_stack_clash_protection } */
#define SIZE 128 * 1024
#include "stack-check-prologue.h"
/* { dg-final { scan-assembler-times {str\s+xzr, \[sp, 1024\]} 2 } } */
/* SIZE is exactly 2x guard-size and no remainder, 2 probes expected,
unrolled, no loop. */
/* { dg-do compile } */
/* { dg-options "-O2 -fstack-clash-protection --param stack-clash-protection-guard-size=16" } */
/* { dg-require-effective-target supports_stack_clash_protection } */
#define SIZE 6 * 64 * 1024
#include "stack-check-prologue.h"
/* { dg-final { scan-assembler-times {str\s+xzr, \[sp, 1024\]} 1 } } */
/* SIZE is more than 4x guard-size and no remainder, 1 probe expected in a loop
and no residual probe. */
int f_test (int x)
{
char arr[SIZE];
return arr[x];
}
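
The scan-assembler-times counts in the stack-check-prologue tests above count
probe instructions in the emitted assembly, not probes executed. A minimal
sketch of that static count follows. It is a simplified model rather than the
GCC implementation: model_probe_insns is a hypothetical name, the frame is
taken to be exactly SIZE bytes, and it is an assumption that allocations of up
to and including STACK_CLASH_MAX_UNROLL_PAGES pages are unrolled while larger
ones use a loop containing a single probe.

```c
#include <assert.h>
#include <stdint.h>

#define GUARD_SIZE       (UINT64_C (1) << 16) /* guard-size=16, i.e. 64KB */
#define CALLER_GUARD     UINT64_C (1024)      /* STACK_CLASH_CALLER_GUARD */
#define MAX_UNROLL_PAGES UINT64_C (4)         /* STACK_CLASH_MAX_UNROLL_PAGES */

/* Hypothetical helper (not part of GCC): number of probe instructions
   expected in the assembly for a frame of SIZE bytes.  */
static unsigned
model_probe_insns (uint64_t size)
{
  assert (GUARD_SIZE > CALLER_GUARD);
  const uint64_t threshold = GUARD_SIZE - CALLER_GUARD; /* 63KB */

  /* Below the threshold nothing is probed.  */
  if (size < threshold)
    return 0;

  uint64_t pages = size / GUARD_SIZE;
  uint64_t residual = size % GUARD_SIZE;

  /* Assumed boundary: a few pages are unrolled (one probe instruction
     per page); more pages use a loop with one probe instruction.  */
  unsigned insns = pages > MAX_UNROLL_PAGES ? 1u : (unsigned) pages;

  /* A residual that reaches the threshold gets its own probe.  */
  if (residual >= threshold)
    insns++;

  return insns;
}
```

For example, this model gives 2 for SIZE 128 * 1024 (unrolled), 1 for
6 * 64 * 1024 (loop), and 2 for (6 * 64 * 1024) + (1 * 63 * 1024) + 512 (loop
plus residual probe), matching the counts the tests scan for.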
......@@ -8385,14 +8385,9 @@ proc check_effective_target_autoincdec { } {
#
proc check_effective_target_supports_stack_clash_protection { } {
-# Temporary until the target bits are fully ACK'd.
-# if { [istarget aarch*-*-*] } {
-# return 1
-# }
if { [istarget x86_64-*-*] || [istarget i?86-*-*]
|| [istarget powerpc*-*-*] || [istarget rs6000*-*-*]
-|| [istarget s390*-*-*] } {
+|| [istarget aarch64*-**] || [istarget s390*-*-*] } {
return 1
}
return 0
......