Commit 2eb2847e, authored and committed by Wilco Dijkstra

[AArch64] Fix aarch64_ira_change_pseudo_allocno_class

A recent commit removing '*' from the md files caused a large regression in
h264ref.  It turns out aarch64_ira_change_pseudo_allocno_class is no longer
effective after the SVE changes, and the combination results in the regression.
This patch fixes it by explicitly checking for a subset of GENERAL_REGS and
FP_REGS.  Add a missing ? to aarch64_get_lane to fix a failure in the testsuite.

    gcc/
	* config/aarch64/aarch64.c (aarch64_ira_change_pseudo_allocno_class):
	Check for subset of GENERAL_REGS and FP_REGS.
	* config/aarch64/aarch64-simd.md (aarch64_get_lane): Increase cost of
	r=w alternative.

From-SVN: r260951
parent 30522cdb
2018-05-30 Wilco Dijkstra <wdijkstr@arm.com>
* config/aarch64/aarch64.c (aarch64_ira_change_pseudo_allocno_class):
Check for subset of GENERAL_REGS and FP_REGS.
* config/aarch64/aarch64-simd.md (aarch64_get_lane): Increase cost of
r=w alternative.
2018-05-30 Richard Sandiford <richard.sandiford@linaro.org>
* alias.c (adjust_offset_for_component_ref): Use poly_int_tree_p
......
@@ -3022,7 +3022,7 @@
 ;; is guaranteed so upper bits should be considered undefined.
 ;; RTL uses GCC vector extension indices throughout so flip only for assembly.
 (define_insn "aarch64_get_lane<mode>"
-  [(set (match_operand:<VEL> 0 "aarch64_simd_nonimmediate_operand" "=r, w, Utv")
+  [(set (match_operand:<VEL> 0 "aarch64_simd_nonimmediate_operand" "=?r, w, Utv")
 	(vec_select:<VEL>
 	  (match_operand:VALL_F16 1 "register_operand" "w, w, w")
 	  (parallel [(match_operand:SI 2 "immediate_operand" "i, i, i")])))]
......
@@ -1087,16 +1087,17 @@ aarch64_err_no_fpadvsimd (machine_mode mode, const char *msg)
 }
 /* Implement TARGET_IRA_CHANGE_PSEUDO_ALLOCNO_CLASS.
-   The register allocator chooses ALL_REGS if FP_REGS and GENERAL_REGS have
-   the same cost even if ALL_REGS has a much larger cost. ALL_REGS is also
-   used if the cost of both FP_REGS and GENERAL_REGS is lower than the memory
-   cost (in this case the best class is the lowest cost one). Using ALL_REGS
-   irrespectively of its cost results in bad allocations with many redundant
-   int<->FP moves which are expensive on various cores.
-   To avoid this we don't allow ALL_REGS as the allocno class, but force a
-   decision between FP_REGS and GENERAL_REGS. We use the allocno class if it
-   isn't ALL_REGS. Similarly, use the best class if it isn't ALL_REGS.
-   Otherwise set the allocno class depending on the mode.
+   The register allocator chooses POINTER_AND_FP_REGS if FP_REGS and
+   GENERAL_REGS have the same cost - even if POINTER_AND_FP_REGS has a much
+   higher cost. POINTER_AND_FP_REGS is also used if the cost of both FP_REGS
+   and GENERAL_REGS is lower than the memory cost (in this case the best class
+   is the lowest cost one). Using POINTER_AND_FP_REGS irrespectively of its
+   cost results in bad allocations with many redundant int<->FP moves which
+   are expensive on various cores.
+   To avoid this we don't allow POINTER_AND_FP_REGS as the allocno class, but
+   force a decision between FP_REGS and GENERAL_REGS. We use the allocno class
+   if it isn't POINTER_AND_FP_REGS. Similarly, use the best class if it isn't
+   POINTER_AND_FP_REGS. Otherwise set the allocno class depending on the mode.
    The result of this is that it is no longer inefficient to have a higher
    memory move cost than the register move cost.
 */
@@ -1107,10 +1108,12 @@ aarch64_ira_change_pseudo_allocno_class (int regno, reg_class_t allocno_class,
 {
   machine_mode mode;
-  if (allocno_class != ALL_REGS)
+  if (reg_class_subset_p (allocno_class, GENERAL_REGS)
+      || reg_class_subset_p (allocno_class, FP_REGS))
     return allocno_class;
-  if (best_class != ALL_REGS)
+  if (reg_class_subset_p (best_class, GENERAL_REGS)
+      || reg_class_subset_p (best_class, FP_REGS))
     return best_class;
   mode = PSEUDO_REGNO_MODE (regno);
......
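The difference between the old `!= ALL_REGS` test and the new subset test can be illustrated with a small toy model. The sketch below is not GCC code. The `toy_` names, the four register groups, and the bitmask encoding of classes are assumptions made purely for illustration, and the fallback rule (FP class for floating-point or vector modes, general class otherwise) is likewise an assumption, since the rest of the function is elided in the diff above. It only aims to show why a class such as POINTER_AND_FP_REGS, which is presumably what the allocator proposes once ALL_REGS also covers SVE predicate registers, slips through an inequality test but is caught by a subset test.

```c
#include <stdbool.h>

/* Hypothetical register groups; each toy class is a bitmask over them.  */
enum toy_group
{
  TOY_GRP_GENERAL = 1 << 0, /* general-purpose registers */
  TOY_GRP_SP = 1 << 1,      /* stack pointer */
  TOY_GRP_FP = 1 << 2,      /* FP/SIMD registers */
  TOY_GRP_PRED = 1 << 3     /* predicate registers (assumed SVE addition) */
};

/* Hypothetical classes, loosely mirroring the names used in the patch.  */
enum toy_class
{
  TOY_GENERAL_REGS = TOY_GRP_GENERAL,
  TOY_FP_REGS = TOY_GRP_FP,
  TOY_POINTER_AND_FP_REGS = TOY_GRP_GENERAL | TOY_GRP_SP | TOY_GRP_FP,
  TOY_ALL_REGS = TOY_GRP_GENERAL | TOY_GRP_SP | TOY_GRP_FP | TOY_GRP_PRED
};

/* True if every register group in A is also in B.  */
static bool
toy_subset_p (unsigned a, unsigned b)
{
  return (a & ~b) == 0;
}

/* Old logic: only the exact class ALL_REGS is rejected.  */
static unsigned
toy_old_check (unsigned allocno_class, unsigned best_class, bool fp_mode)
{
  if (allocno_class != TOY_ALL_REGS)
    return allocno_class;
  if (best_class != TOY_ALL_REGS)
    return best_class;
  /* Assumed fallback: decide by mode.  */
  return fp_mode ? TOY_FP_REGS : TOY_GENERAL_REGS;
}

/* New logic: accept a class only if it lies wholly inside the general
   registers or wholly inside the FP registers.  */
static unsigned
toy_new_check (unsigned allocno_class, unsigned best_class, bool fp_mode)
{
  if (toy_subset_p (allocno_class, TOY_GENERAL_REGS)
      || toy_subset_p (allocno_class, TOY_FP_REGS))
    return allocno_class;
  if (toy_subset_p (best_class, TOY_GENERAL_REGS)
      || toy_subset_p (best_class, TOY_FP_REGS))
    return best_class;
  /* Assumed fallback: decide by mode.  */
  return fp_mode ? TOY_FP_REGS : TOY_GENERAL_REGS;
}
```

In this model, calling `toy_old_check` with `TOY_POINTER_AND_FP_REGS` for both classes returns the mixed class unchanged, whereas `toy_new_check` with the same arguments forces a choice between `TOY_FP_REGS` and `TOY_GENERAL_REGS`. The subset test also keeps working if further mixed classes are introduced later, which an equality test against one named class does not.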