Commit 3b0cb1a5, authored and committed by Aaron Sawdey

rs6000-string.c (expand_block_move): Allow the use of unaligned VSX load/store on P8/P9.

2018-01-02  Aaron Sawdey  <acsawdey@linux.vnet.ibm.com>

        * config/rs6000/rs6000-string.c (expand_block_move): Allow the use of
        unaligned VSX load/store on P8/P9.
        (expand_block_clear): Allow the use of unaligned VSX
        load/store on P8/P9.

From-SVN: r256112
parent 6012c652
2018-01-02 Aaron Sawdey <acsawdey@linux.vnet.ibm.com>
* config/rs6000/rs6000-string.c (expand_block_move): Allow the use of
unaligned VSX load/store on P8/P9.
(expand_block_clear): Allow the use of unaligned VSX
load/store on P8/P9.
2018-01-02 Bill Schmidt <wschmidt@linux.vnet.ibm.com>
* config/rs6000/rs6000-p8swap.c (swap_feeds_both_load_and_store):
@@ -73,7 +73,7 @@ expand_block_clear (rtx operands[])
      When optimize_size, avoid any significant code bloat; calling
      memset is about 4 instructions, so allow for one instruction to
      load zero and three to do clearing.  */
-  if (TARGET_ALTIVEC && align >= 128)
+  if (TARGET_ALTIVEC && (align >= 128 || TARGET_EFFICIENT_UNALIGNED_VSX))
     clear_step = 16;
   else if (TARGET_POWERPC64 && (align >= 64 || !STRICT_ALIGNMENT))
     clear_step = 8;
@@ -90,7 +90,7 @@ expand_block_clear (rtx operands[])
       machine_mode mode = BLKmode;
       rtx dest;
-      if (bytes >= 16 && TARGET_ALTIVEC && align >= 128)
+      if (bytes >= 16 && TARGET_ALTIVEC && (align >= 128 || TARGET_EFFICIENT_UNALIGNED_VSX))
 	{
 	  clear_bytes = 16;
 	  mode = V4SImode;
@@ -1260,7 +1260,7 @@ expand_block_move (rtx operands[])
       /* Altivec first, since it will be faster than a string move
 	 when it applies, and usually not significantly larger.  */
-      if (TARGET_ALTIVEC && bytes >= 16 && align >= 128)
+      if (TARGET_ALTIVEC && bytes >= 16 && (TARGET_EFFICIENT_UNALIGNED_VSX || align >= 128))
 	{
 	  move_bytes = 16;
 	  mode = V4SImode;
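To see the effect of the changed condition in isolation, here is a minimal standalone sketch of the step-size selection in expand_block_clear, before and after the patch. It is a model, not GCC code: `struct target_flags`, `choose_clear_step`, and the `patched` switch are hypothetical names invented for illustration, the booleans merely stand in for the target macros in the diff, and the fallback value of 4 is an assumption because the hunk ends before the final else branch. `align_bits` is taken to be the alignment in bits, so 128 corresponds to 16-byte alignment.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the target macros seen in the diff.
   This struct is illustrative only and is not part of GCC.  */
struct target_flags
{
  bool altivec;                 /* models TARGET_ALTIVEC */
  bool powerpc64;               /* models TARGET_POWERPC64 */
  bool strict_alignment;        /* models STRICT_ALIGNMENT */
  bool efficient_unaligned_vsx; /* models TARGET_EFFICIENT_UNALIGNED_VSX */
};

/* Return the number of bytes cleared per step.
   align_bits is the known alignment in bits (assumption).
   patched selects the condition after (true) or before (false) the change.  */
static unsigned
choose_clear_step (const struct target_flags *t, unsigned align_bits,
                   bool patched)
{
  /* Only the patched condition consults the unaligned-VSX flag.  */
  bool unaligned_vsx_ok = patched && t->efficient_unaligned_vsx;

  if (t->altivec && (align_bits >= 128 || unaligned_vsx_ok))
    return 16;
  else if (t->powerpc64 && (align_bits >= 64 || !t->strict_alignment))
    return 8;
  else
    return 4; /* Assumed fallback; not shown in the diff.  */
}
```

Under these assumptions, a 64-bit target with efficient unaligned VSX and a block known only to be 32-bit aligned moves from 8-byte steps to 16-byte steps once the patched condition is used, while a target without that flag is unaffected.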