Commit 2eb2847e, authored and committed by Wilco Dijkstra

[AArch64] Fix aarch64_ira_change_pseudo_allocno_class

A recent commit removing '*' from the md files caused a large regression in
h264ref.  It turns out aarch64_ira_change_pseudo_allocno_class is no longer
effective after the SVE changes, and the combination results in the regression.
This patch fixes it by explicitly checking for a subset of GENERAL_REGS and
FP_REGS.  Add a missing ? to aarch64_get_lane to fix a failure in the testsuite.

    gcc/
	* config/aarch64/aarch64.c (aarch64_ira_change_pseudo_allocno_class):
	Check for subset of GENERAL_REGS and FP_REGS.
	* config/aarch64/aarch64-simd.md (aarch64_get_lane): Increase cost of
	r=w alternative.

From-SVN: r260951
parent 30522cdb
+2018-05-30 Wilco Dijkstra <wdijkstr@arm.com>
+* config/aarch64/aarch64.c (aarch64_ira_change_pseudo_allocno_class):
+Check for subset of GENERAL_REGS and FP_REGS.
+* config/aarch64/aarch64-simd.md (aarch64_get_lane): Increase cost of
+r=w alternative.
2018-05-30 Richard Sandiford <richard.sandiford@linaro.org>
* alias.c (adjust_offset_for_component_ref): Use poly_int_tree_p
......
......@@ -3022,7 +3022,7 @@
;; is guaranteed so upper bits should be considered undefined.
;; RTL uses GCC vector extension indices throughout so flip only for assembly.
(define_insn "aarch64_get_lane<mode>"
-[(set (match_operand:<VEL> 0 "aarch64_simd_nonimmediate_operand" "=r, w, Utv")
+[(set (match_operand:<VEL> 0 "aarch64_simd_nonimmediate_operand" "=?r, w, Utv")
(vec_select:<VEL>
(match_operand:VALL_F16 1 "register_operand" "w, w, w")
(parallel [(match_operand:SI 2 "immediate_operand" "i, i, i")])))]
......
......@@ -1087,16 +1087,17 @@ aarch64_err_no_fpadvsimd (machine_mode mode, const char *msg)
}
/* Implement TARGET_IRA_CHANGE_PSEUDO_ALLOCNO_CLASS.
-The register allocator chooses ALL_REGS if FP_REGS and GENERAL_REGS have
-the same cost even if ALL_REGS has a much larger cost. ALL_REGS is also
-used if the cost of both FP_REGS and GENERAL_REGS is lower than the memory
-cost (in this case the best class is the lowest cost one). Using ALL_REGS
-irrespectively of its cost results in bad allocations with many redundant
-int<->FP moves which are expensive on various cores.
-To avoid this we don't allow ALL_REGS as the allocno class, but force a
-decision between FP_REGS and GENERAL_REGS. We use the allocno class if it
-isn't ALL_REGS. Similarly, use the best class if it isn't ALL_REGS.
-Otherwise set the allocno class depending on the mode.
+The register allocator chooses POINTER_AND_FP_REGS if FP_REGS and
+GENERAL_REGS have the same cost - even if POINTER_AND_FP_REGS has a much
+higher cost. POINTER_AND_FP_REGS is also used if the cost of both FP_REGS
+and GENERAL_REGS is lower than the memory cost (in this case the best class
+is the lowest cost one). Using POINTER_AND_FP_REGS irrespectively of its
+cost results in bad allocations with many redundant int<->FP moves which
+are expensive on various cores.
+To avoid this we don't allow POINTER_AND_FP_REGS as the allocno class, but
+force a decision between FP_REGS and GENERAL_REGS. We use the allocno class
+if it isn't POINTER_AND_FP_REGS. Similarly, use the best class if it isn't
+POINTER_AND_FP_REGS. Otherwise set the allocno class depending on the mode.
The result of this is that it is no longer inefficient to have a higher
memory move cost than the register move cost.
*/
......@@ -1107,10 +1108,12 @@ aarch64_ira_change_pseudo_allocno_class (int regno, reg_class_t allocno_class,
{
machine_mode mode;
-if (allocno_class != ALL_REGS)
+if (reg_class_subset_p (allocno_class, GENERAL_REGS)
+|| reg_class_subset_p (allocno_class, FP_REGS))
return allocno_class;
-if (best_class != ALL_REGS)
+if (reg_class_subset_p (best_class, GENERAL_REGS)
+|| reg_class_subset_p (best_class, FP_REGS))
return best_class;
mode = PSEUDO_REGNO_MODE (regno);
......
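
To illustrate why the patch replaces the inequality test with subset tests, here is a minimal toy model in C. It is only a sketch: the toy_ names, the bitmask encoding of register classes, and the two-value mode enum are assumptions made for illustration and are not GCC's data structures. The mode-based fallback is modelled only on the comment's statement that the class is otherwise set depending on the mode, since the rest of the function is elided in the diff above. The model also assumes, as the updated comment suggests, that the allocator can hand the hook POINTER_AND_FP_REGS, a mixed class that is not equal to ALL_REGS.

```c
#include <stdbool.h>

/* Toy register classes encoded as bitmasks over hypothetical register
   groups.  An illustrative simplification, not GCC's representation.  */
typedef unsigned int toy_class;

enum
{
  TOY_GENERAL_REGS = 1u << 0,                  /* integer registers */
  TOY_FP_LO_REGS = 1u << 1,                    /* a proper subset of FP_REGS */
  TOY_FP_REGS = TOY_FP_LO_REGS | (1u << 2),    /* all FP/SIMD registers */
  TOY_POINTER_AND_FP_REGS = TOY_GENERAL_REGS | TOY_FP_REGS, /* mixed class */
  TOY_ALL_REGS = TOY_POINTER_AND_FP_REGS | (1u << 3) /* plus other registers */
};

typedef enum { TOY_INT_MODE, TOY_FLOAT_MODE } toy_mode;

/* True if every register group in A is also in B.  */
static bool
toy_subset_p (toy_class a, toy_class b)
{
  return (a & ~b) == 0;
}

/* Assumed fallback: choose a class from the mode alone.  */
static toy_class
toy_class_for_mode (toy_mode mode)
{
  return mode == TOY_FLOAT_MODE ? TOY_FP_REGS : TOY_GENERAL_REGS;
}

/* Pre-patch shape: any class other than ALL_REGS is accepted, so a
   mixed class such as POINTER_AND_FP_REGS slips through.  */
static toy_class
toy_old_hook (toy_class allocno_class, toy_class best_class, toy_mode mode)
{
  if (allocno_class != TOY_ALL_REGS)
    return allocno_class;
  if (best_class != TOY_ALL_REGS)
    return best_class;
  return toy_class_for_mode (mode);
}

/* Post-patch shape: a class is accepted only if it lies entirely
   within GENERAL_REGS or entirely within FP_REGS.  */
static toy_class
toy_new_hook (toy_class allocno_class, toy_class best_class, toy_mode mode)
{
  if (toy_subset_p (allocno_class, TOY_GENERAL_REGS)
      || toy_subset_p (allocno_class, TOY_FP_REGS))
    return allocno_class;
  if (toy_subset_p (best_class, TOY_GENERAL_REGS)
      || toy_subset_p (best_class, TOY_FP_REGS))
    return best_class;
  return toy_class_for_mode (mode);
}
```

In this model, passing TOY_POINTER_AND_FP_REGS for both classes with an integer mode makes toy_old_hook return the mixed class unchanged, while toy_new_hook falls through to the mode and returns TOY_GENERAL_REGS. Proper subclasses such as TOY_FP_LO_REGS still pass straight through toy_new_hook, which an equality test against a single class would not allow.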