Commit 19e5389d by Alan Modra

[RS6000] PR94145, make PLT loads volatile

The PLT is volatile.  On PowerPC it is a bss-style section which the
dynamic loader initialises to point at resolver stubs (called glink on
PowerPC64) to support lazy resolution of function addresses.  The
first call to a given function goes via the dynamic loader symbol
resolver, which updates the PLT entry for that function and calls the
function.  The second call, if there is one and we don't have a
multi-threaded race, will use the updated PLT entry and thus avoid
the relatively slow symbol resolver path.
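
As a rough illustration of the lazy resolution described above, the
following C sketch models a single PLT entry as a function pointer.
Every name here (plt_foo, resolver_stub, real_foo, call_foo) is
hypothetical; the real entry is set up and patched by the dynamic
loader, not by user code.

```c
#include <assert.h>

/* Hypothetical model of one lazily bound PLT entry.  None of these
   names exist in the dynamic loader or in gcc.  */
typedef int (*fn_t) (int);

static int resolver_calls;	/* how often the slow path ran */

static int real_foo (int x) { return x + 1; }

static int resolver_stub (int x);

/* The simulated PLT entry starts out pointing at the resolver stub.  */
static fn_t plt_foo = resolver_stub;

static int
resolver_stub (int x)
{
  /* In this model the stub is only reachable while unresolved.  */
  assert (plt_foo == resolver_stub);
  resolver_calls++;
  plt_foo = real_foo;		/* patch the entry ...  */
  return real_foo (x);		/* ... and complete the call */
}

/* A call through the PLT reloads the entry every time.  */
static int
call_foo (int x)
{
  return plt_foo (x);
}
```

In this model the first call_foo goes through resolver_stub and every
later call reaches real_foo directly.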

Calls via the PLT are like calls via a function pointer, except that,
unlike the PLT, no initialised function pointer is volatile.  All
initialised function pointers are resolved at program startup to point
at the function or are left as NULL.  There is no support for lazy
resolution of any user-visible function pointer.

So why does any of this matter to gcc?  Well, normally the PLT call
mechanism happens entirely behind gcc's back, but since we implemented
inline PLT calls (effectively putting the PLT code stub that loads the
PLT entry inline and letting that code sequence be scheduled), the load of
the PLT entry is visible to gcc.  That load then is subject to gcc
optimization, for example in

/* -S -mcpu=future -mpcrel -mlongcall -O2.  */
int foo (int);
void bar (void)
{
  while (foo (0))
    foo (99);
}

we see the PLT load for foo being hoisted out of the loop and stashed
in a call-saved register.  If that happens to be the first call to
foo, then the stashed value is that for the resolver stub, and every
call to foo in the loop will then go via the slow resolver path.  Not
a good idea.  Also, if foo turns out to be a local function and the
linker replaces the PLT calls with direct calls to foo then gcc has
just wasted a call-saved register.
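
The cost of the hoisting can be sketched in plain C.  This is only an
analogy under stated assumptions: the patch itself works on RTL via
unspec_volatile, not on C-level volatile, and all names below
(plt_foo, resolver_stub, resolver_calls_hoisted,
resolver_calls_reloaded) are made up for the illustration.  The
sketch counts how often the slow resolver path runs when the entry is
loaded once before the loop versus reloaded on every call.

```c
#include <assert.h>

/* Hypothetical model; not gcc or dynamic loader code.  */
typedef int (*fn_t) (int);

static int resolver_calls;	/* slow-path counter */
static fn_t plt_foo;		/* simulated PLT entry */

static int real_foo (int x) { return x; }

static int
resolver_stub (int x)
{
  resolver_calls++;
  plt_foo = real_foo;		/* patch the entry for later loads */
  return real_foo (x);
}

static void
reset (void)
{
  plt_foo = resolver_stub;	/* unresolved, as before a first call */
  resolver_calls = 0;
}

/* What the hoisting amounts to: one load, cached across the loop.  */
int
resolver_calls_hoisted (int n)
{
  assert (n >= 0);
  reset ();
  fn_t cached = plt_foo;
  for (int i = 0; i < n; i++)
    cached (i);
  return resolver_calls;
}

/* Reload the entry through a volatile lvalue on every iteration.  */
int
resolver_calls_reloaded (int n)
{
  assert (n >= 0);
  reset ();
  for (int i = 0; i < n; i++)
    {
      fn_t f = *(fn_t volatile *) &plt_foo;
      f (i);
    }
  return resolver_calls;
}
```

With five iterations the cached variant takes the resolver path five
times, while the reloading variant takes it once.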

This patch teaches gcc that the PLT loads are volatile.  The change
doesn't affect other loads of function pointers and thus has no effect
on normal indirect function calls.  Note that because the
"optimization" this patch prevents can only occur over function calls,
the only place gcc can stash PLT loads is in call-saved registers or
in other memory.  I'm reasonably confident that this change will be
neutral or positive for the "ld -z now" case where the PLT is not
volatile, in code where there is any register pressure.  Even if gcc
could be taught to recognise cases where the PLT is resolved, you'd
need to discount use of registers to cache PLT loads by some factor
involving the chance that those calls would be converted to direct
calls.

	PR target/94145
	* config/rs6000/rs6000.c (rs6000_longcall_ref): Use unspec_volatile
	for PLT16_LO and PLT_PCREL.
	* config/rs6000/rs6000.md (UNSPEC_PLT16_LO, UNSPEC_PLT_PCREL): Remove.
	(UNSPECV_PLT16_LO, UNSPECV_PLT_PCREL): Define.
	(pltseq_plt16_lo_, pltseq_plt_pcrel): Use unspec_volatile.
parent 54de5afb
+2020-03-27  Alan Modra  <amodra@gmail.com>
+	PR target/94145
+	* config/rs6000/rs6000.c (rs6000_longcall_ref): Use unspec_volatile
+	for PLT16_LO and PLT_PCREL.
+	* config/rs6000/rs6000.md (UNSPEC_PLT16_LO, UNSPEC_PLT_PCREL): Remove.
+	(UNSPECV_PLT16_LO, UNSPECV_PLT_PCREL): Define.
+	(pltseq_plt16_lo_, pltseq_plt_pcrel): Use unspec_volatile.
 2020-03-27  Martin Sebor  <msebor@redhat.com>
 	PR c++/94098
......
@@ -19283,8 +19283,9 @@ rs6000_longcall_ref (rtx call_ref, rtx arg)
       if (rs6000_pcrel_p (cfun))
         {
           rtx reg = gen_rtx_REG (Pmode, regno);
-          rtx u = gen_rtx_UNSPEC (Pmode, gen_rtvec (3, base, call_ref, arg),
-                                  UNSPEC_PLT_PCREL);
+          rtx u = gen_rtx_UNSPEC_VOLATILE (Pmode,
+                                           gen_rtvec (3, base, call_ref, arg),
+                                           UNSPECV_PLT_PCREL);
           emit_insn (gen_rtx_SET (reg, u));
           return reg;
         }
@@ -19303,8 +19304,9 @@ rs6000_longcall_ref (rtx call_ref, rtx arg)
       rtx reg = gen_rtx_REG (Pmode, regno);
       rtx hi = gen_rtx_UNSPEC (Pmode, gen_rtvec (3, base, call_ref, arg),
                                UNSPEC_PLT16_HA);
-      rtx lo = gen_rtx_UNSPEC (Pmode, gen_rtvec (3, reg, call_ref, arg),
-                               UNSPEC_PLT16_LO);
+      rtx lo = gen_rtx_UNSPEC_VOLATILE (Pmode,
+                                        gen_rtvec (3, reg, call_ref, arg),
+                                        UNSPECV_PLT16_LO);
       emit_insn (gen_rtx_SET (reg, hi));
       emit_insn (gen_rtx_SET (reg, lo));
       return reg;
......
@@ -148,8 +148,6 @@
    UNSPEC_SI_FROM_SF
    UNSPEC_PLTSEQ
    UNSPEC_PLT16_HA
-   UNSPEC_PLT16_LO
-   UNSPEC_PLT_PCREL
   ])

 ;;
@@ -178,6 +176,8 @@
    UNSPECV_MTFSB1              ; Set FPSCR Field bit to 1
    UNSPECV_SPLIT_STACK_RETURN  ; A camouflaged return
    UNSPECV_SPEC_BARRIER        ; Speculation barrier
+   UNSPECV_PLT16_LO
+   UNSPECV_PLT_PCREL
   ])

 ; The three different kinds of epilogue.
@@ -10359,10 +10359,10 @@

 (define_insn "*pltseq_plt16_lo_<mode>"
   [(set (match_operand:P 0 "gpc_reg_operand" "=r")
-        (unspec:P [(match_operand:P 1 "gpc_reg_operand" "b")
+        (unspec_volatile:P [(match_operand:P 1 "gpc_reg_operand" "b")
                    (match_operand:P 2 "symbol_ref_operand" "s")
                    (match_operand:P 3 "" "")]
-                  UNSPEC_PLT16_LO))]
+                  UNSPECV_PLT16_LO))]
   "TARGET_PLTSEQ"
 {
   return rs6000_pltseq_template (operands, RS6000_PLTSEQ_PLT16_LO);
@@ -10382,10 +10382,10 @@

 (define_insn "*pltseq_plt_pcrel<mode>"
   [(set (match_operand:P 0 "gpc_reg_operand" "=r")
-        (unspec:P [(match_operand:P 1 "" "")
+        (unspec_volatile:P [(match_operand:P 1 "" "")
                    (match_operand:P 2 "symbol_ref_operand" "s")
                    (match_operand:P 3 "" "")]
-                  UNSPEC_PLT_PCREL))]
+                  UNSPECV_PLT_PCREL))]
   "HAVE_AS_PLTSEQ && TARGET_ELF
    && rs6000_pcrel_p (cfun)"
 {
......