constraints.md (wH constraint): Add new constraints for allowing 32-bit integers…

constraints.md (wH constraint): Add new constraints for allowing 32-bit integers (and eventually 8/16-bit...

[gcc]
2016-10-27  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* config/rs6000/constraints.md (wH constraint): Add new
	constraints for allowing 32-bit integers (and eventually 8/16-bit
	integers) into the vector registers.
	(wI constraint): Likewise.
	(wJ constraint): Likewise.
	(wK constraint): Likewise.
	* config/rs6000/rs6000-cpus.def (ISA_2_7_MASKS_SERVER): Add
	-mvsx-small-integer as a default option for ISA 2.07
	(i.e. power8).
	(POWERPC_MASKS): Likewise.
	* config/rs6000/rs6000.opt (-mvsx-small-integer): Add new debug
	switch to turn off small integer support in vector registers.
	* config/rs6000/rs6000.c (rs6000_hard_regno_mode_ok): Eliminate
	test for -mupper-regs-di, since it is already done with the
	reg_add[mode].scalar_in_vsx_p.  Add support for the switch
	-mvsx-small-integer.
	(rs6000_debug_reg_global): Add support for wH, wI, wJ, and wK
	constraints.
	(rs6000_setup_reg_addr_masks): Likewise.
	(rs6000_init_hard_regno_mode_ok): Likewise.
	(rs6000_option_override_internal): Add consistency checks for
	-mvsx-small-integer.
	(rs6000_secondary_reload_simple_move): SImode is a simple move if
	-mvsx-small-integer.
	(rs6000_secondary_reload): Use std::swap.
	(rs6000_preferred_reload_class): Don't prefer FLOAT_REGS over
	VSX_REGS for small integers in vector registers, since there is no
	D-FORM address mode for such types.
	(rs6000_register_move_cost): Use FIRST_FPR_REGNO instead of 32.
	(rs6000_opt_masks): Add -mvsx-small-integer.
	* config/rs6000/vsx.md (VSINT_84): Add SImode for small integer
	support.
	(VSX_EXTRACT_I2): Clone VSX_EXTRACT_I, but drop V4SI since SImode
	extracts can be done on ISA 2.07.
	(vsx_extract_<mode>): Add support for small integers in vsx
	registers.
	(vsx_extract_<mode>_p9): Use 'v' instead of VSX_EX, since we no
	longer support V4SImode in this pattern.
	(vsx_extract_si): New insn to support extraction of SImode in ISA
	2.07 using either xxextractuw or vspltw.
	(vsx_extract_<mode>_p8): Use 'v' instead of VSX_EX, since we no
	longer support V4SImode in this pattern.
	* config/rs6000/rs6000.h (enum rs6000_reg_class_enum): Add wH, wI,
	wJ, and wK constraints.
	* config/rs6000/rs6000.md (f32_sv): Use correct instruction for
	storing SDmode with VSX instructions.
	(zero_extendsi<mode>2): Reorder pattern, so RLDICL comes after the
	GPR load and before the FPR and VSX loads.  Remove ??, ! from the
	constraints.  Add MFVSRWZ and XXEXTRACTUW instructions to support
	small integers in vector registers.
	(extendsi<mode>2): Reorder pattern, so EXTSW comes after the GPR
	load and before the FPR and VSX loads.  Remove ??, ! from the
	constraints.  Add VEXTSW2D support for small integers in vector
	registers.
	(lfiwax): Remove ! constraint.  Add VEXTSW2D support for small
	integers in vector registers.
	(floatsi<mode>2_lfiwax): If -mvsx-small-integer issue a normal
	move instead of using an UNSPEC.
	(lfiwzx): Remove ! constraint.  Add XXEXTRACTUW support for small
	integers in vector registers.
	(floatunssi<mode>2_lfiwzx): If -mvsx-small-integer issue a normal
	move instead of using an UNSPEC.
	(movsi_internal1): Add support for -mvsx-small-integer.  Align
	columns so that it is more readable.
	(SImode splitter for ISA 3.0 constants): Add splitter for
	-128..127 constants that can easily be constructed on ISA 3.0.
	* doc/md.texi (PowerPC Constraints): Document wH, wI, wJ, and wK
	constraints.

[gcc/testsuite]
2016-10-27  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* gcc.target/powerpc/vsx-simode.c: New test.
	* gcc.target/powerpc/vsx-simode2.c: Likewise.
	* gcc.target/powerpc/vsx-simode3.c: Likewise.

From-SVN: r241631

Commit 787c7a65 authored Oct 27, 2016 by Michael Meissner Committed by Michael Meissner Oct 27, 2016
13 changed files
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
+2016-10-27  Michael Meissner  <meissner@linux.vnet.ibm.com>
+	* config/rs6000/constraints.md (wH constraint): Add new
+	constraints for allowing 32-bit integers (and eventually 8/16-bit
+	integers) into the vector registers.
+	(wI constraint): Likewise.
+	(wJ constraint): Likewise.
+	(wK constraint): Likewise.
+	* config/rs6000/rs6000-cpus.def (ISA_2_7_MASKS_SERVER): Add
+	-mvsx-small-integer as a default option for ISA 2.07
+	(i.e. power8).
+	(POWERPC_MASKS): Likewise.
+	* config/rs6000/rs6000.opt (-mvsx-small-integer): Add new debug
+	switch to turn off small integer support in vector registers.
+	* config/rs6000/rs6000.c (rs6000_hard_regno_mode_ok): Eliminate
+	test for -mupper-regs-di, since it is already done with the
+	reg_add[mode].scalar_in_vsx_p.  Add support for the switch
+	-mvsx-small-integer.
+	(rs6000_debug_reg_global): Add support for wH, wI, wJ, and wK
+	constraints.
+	(rs6000_setup_reg_addr_masks): Likewise.
+	(rs6000_init_hard_regno_mode_ok): Likewise.
+	(rs6000_option_override_internal): Add consistency checks for
+	-mvsx-small-integer.
+	(rs6000_secondary_reload_simple_move): SImode is a simple move if
+	-mvsx-small-integer.
+	(rs6000_secondary_reload): Use std::swap.
+	(rs6000_preferred_reload_class): Don't prefer FLOAT_REGS over
+	VSX_REGS for small integers in vector registers, since there is no
+	D-FORM address mode for such types.
+	(rs6000_register_move_cost): Use FIRST_FPR_REGNO instead of 32.
+	(rs6000_opt_masks): Add -mvsx-small-integer.
+	* config/rs6000/vsx.md (VSINT_84): Add SImode for small integer
+	support.
+	(VSX_EXTRACT_I2): Clone VSX_EXTRACT_I, but drop V4SI since SImode
+	extracts can be done on ISA 2.07.
+	(vsx_extract_<mode>): Add support for small integers in vsx
+	registers.
+	(vsx_extract_<mode>_p9): Use 'v' instead of VSX_EX, since we no
+	longer support V4SImode in this pattern.
+	(vsx_extract_si): New insn to support extraction of SImode in ISA
+	2.07 using either xxextractuw or vspltw.
+	(vsx_extract_<mode>_p8): Use 'v' instead of VSX_EX, since we no
+	longer support V4SImode in this pattern.
+	* config/rs6000/rs6000.h (enum rs6000_reg_class_enum): Add wH, wI,
+	wJ, and wK constraints.
+	* config/rs6000/rs6000.md (f32_sv): Use correct instruction for
+	storing SDmode with VSX instructions.
+	(zero_extendsi<mode>2): Reorder pattern, so RLDICL comes after the
+	GPR load and before the FPR and VSX loads.  Remove ??, ! from the
+	constraints.  Add MFVSRWZ and XXEXTRACTUW instructions to support
+	small integers in vector registers.
+	(extendsi<mode>2): Reorder pattern, so EXTSW comes after the GPR
+	load and before the FPR and VSX loads.  Remove ??, ! from the
+	constraints.  Add VEXTSW2D support for small integers in vector
+	registers.
+	(lfiwax): Remove ! constraint.  Add VEXTSW2D support for small
+	integers in vector registers.
+	(floatsi<mode>2_lfiwax): If -mvsx-small-integer issue a normal
+	move instead of using an UNSPEC.
+	(lfiwzx): Remove ! constraint.  Add XXEXTRACTUW support for small
+	integers in vector registers.
+	(floatunssi<mode>2_lfiwzx): If -mvsx-small-integer issue a normal
+	move instead of using an UNSPEC.
+	(movsi_internal1): Add support for -mvsx-small-integer.  Align
+	columns so that it is more readable.
+	(SImode splitter for ISA 3.0 constants): Add splitter for
+	-128..127 constants that can easily be constructed on ISA 3.0.
+	* doc/md.texi (PowerPC Constraints): Document wH, wI, wJ, and wK
+	constraints.
 2016-10-27  Jakub Jelinek  <jakub@redhat.com>
 	PR middle-end/78025
--- a/gcc/config/rs6000/constraints.md
+++ b/gcc/config/rs6000/constraints.md
@@ -159,6 +159,18 @@
  "Memory operand suitable for TOC fusion memory references"
  (match_operand 0 "toc_fusion_mem_wrapped"))
+(define_register_constraint "wH" "rs6000_constraints[RS6000_CONSTRAINT_wH]"
+  "Altivec register to hold 32-bit integers or NO_REGS.")
+(define_register_constraint "wI" "rs6000_constraints[RS6000_CONSTRAINT_wI]"
+  "FPR register to hold 32-bit integers or NO_REGS.")
+(define_register_constraint "wJ" "rs6000_constraints[RS6000_CONSTRAINT_wJ]"
+  "FPR register to hold 8/16-bit integers or NO_REGS.")
+(define_register_constraint "wK" "rs6000_constraints[RS6000_CONSTRAINT_wK]"
+  "Altivec register to hold 8/16-bit integers or NO_REGS.")
 (define_constraint "wL"
  "Int constant that is the element number mfvsrld accesses in a vector."
  (and (match_code "const_int")

--- a/gcc/config/rs6000/rs6000-cpus.def
+++ b/gcc/config/rs6000/rs6000-cpus.def
@@ -58,7 +58,8 @@
 				 | OPTION_MASK_HTM			\
 				 | OPTION_MASK_QUAD_MEMORY		\
  				 | OPTION_MASK_QUAD_MEMORY_ATOMIC	\
-				 | OPTION_MASK_UPPER_REGS_SF)
+				 | OPTION_MASK_UPPER_REGS_SF		\
+				 | OPTION_MASK_VSX_SMALL_INTEGER)
 /* Add ISEL back into ISA 3.0, since it is supposed to be a win.  Do not add
   P9_MINMAX until the hardware that supports it is available.  Do not add
@@ -138,6 +139,7 @@
 				 | OPTION_MASK_UPPER_REGS_DF		\
 				 | OPTION_MASK_UPPER_REGS_SF		\
 				 | OPTION_MASK_VSX			\
+				 | OPTION_MASK_VSX_SMALL_INTEGER	\
 				 | OPTION_MASK_VSX_TIMODE)
 #endif
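
The two hunks above only OR one more bit into existing option masks. As a rough illustration of the effect, the sketch below models an ISA default mask as a bit set from which the negated form of the switch clears the new bit again. The bit positions and the `model_*` names are invented for this example; GCC derives the real `OPTION_MASK_*` values from `rs6000.opt`, and its actual option processing is more involved than this.

```c
#include <stdint.h>

/* Hypothetical bit assignments, chosen only for this sketch.  */
#define MODEL_MASK_UPPER_REGS_SF      (UINT64_C (1) << 0)
#define MODEL_MASK_VSX_SMALL_INTEGER  (UINT64_C (1) << 1)

/* Model of an ISA default mask with the new bit OR-ed in.  */
#define MODEL_ISA_2_7_MASKS \
  (MODEL_MASK_UPPER_REGS_SF | MODEL_MASK_VSX_SMALL_INTEGER)

/* Return the modelled flag set for a power8-like target.  A nonzero
   NO_SMALL_INTEGER stands for the user turning the feature off.  */
uint64_t
model_flags_for_power8 (int no_small_integer)
{
  uint64_t flags = MODEL_ISA_2_7_MASKS;
  if (no_small_integer)
    flags &= ~MODEL_MASK_VSX_SMALL_INTEGER;
  return flags;
}

/* Nonzero if the modelled small-integer bit is set in FLAGS.  */
int
model_small_integer_enabled (uint64_t flags)
{
  return (flags & MODEL_MASK_VSX_SMALL_INTEGER) != 0;
}
```

In this model the feature is on by default for the power8-like mask and the other default bits are untouched when it is switched off.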

--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -1602,6 +1602,10 @@ enum r6000_reg_class_enum {
  RS6000_CONSTRAINT_wx,		/* FPR register for STFIWX */
  RS6000_CONSTRAINT_wy,		/* VSX register for SF */
  RS6000_CONSTRAINT_wz,		/* FPR register for LFIWZX */
+  RS6000_CONSTRAINT_wH,		/* Altivec register for 32-bit integers.  */
+  RS6000_CONSTRAINT_wI,		/* VSX register for 32-bit integers.  */
+  RS6000_CONSTRAINT_wJ,		/* VSX register for 8/16-bit integers.  */
+  RS6000_CONSTRAINT_wK,		/* Altivec register for 16/32-bit integers.  */
  RS6000_CONSTRAINT_MAX
 };

--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -458,7 +458,7 @@
 (define_mode_attr f32_sm2 [(SF "wY")		   (SD "wn")])
 (define_mode_attr f32_si  [(SF "stfs%U0%X0 %1,%0") (SD "stfiwx %1,%y0")])
 (define_mode_attr f32_si2 [(SF "stxssp %1,%0")     (SD "stfiwx %1,%y0")])
-(define_mode_attr f32_sv  [(SF "stxsspx %x1,%y0")  (SD "stxsiwzx %x1,%y0")])
+(define_mode_attr f32_sv  [(SF "stxsspx %x1,%y0")  (SD "stxsiwx %x1,%y0")])
 ; Definitions for 32-bit fpr direct move
 ; At present, the decimal modes are not allowed in the traditional altivec
@@ -837,16 +837,18 @@
 (define_insn "zero_extendsi<mode>2"
-  [(set (match_operand:EXTSI 0 "gpc_reg_operand" "=r,r,??wj,!wz,!wu")
+  [(set (match_operand:EXTSI 0 "gpc_reg_operand" "=r,r,wz,wu,wj,r,wJwK")
-	(zero_extend:EXTSI (match_operand:SI 1 "reg_or_mem_operand" "m,r,r,Z,Z")))]
+	(zero_extend:EXTSI (match_operand:SI 1 "reg_or_mem_operand" "m,r,Z,Z,r,wIwH,wJwK")))]
  ""
  "@
   lwz%U1%X1 %0,%1
   rldicl %0,%1,0,32
-   mtvsrwz %x0,%1
   lfiwzx %0,%y1
-   lxsiwzx %x0,%y1"
+   lxsiwzx %x0,%y1
-  [(set_attr "type" "load,shift,mffgpr,fpload,fpload")])
+   mtvsrwz %x0,%1
+   mfvsrwz %0,%x1
+   xxextractuw %x0,%x1,1"
+  [(set_attr "type" "load,shift,fpload,fpload,mffgpr,mftgpr,vecexts")])
 (define_insn_and_split "*zero_extendsi<mode>2_dot"
  [(set (match_operand:CC 2 "cc_reg_operand" "=x,?y")
@@ -1005,16 +1007,17 @@
 (define_insn "extendsi<mode>2"
-  [(set (match_operand:EXTSI 0 "gpc_reg_operand" "=r,r,??wj,!wl,!wu")
+  [(set (match_operand:EXTSI 0 "gpc_reg_operand" "=r,r,wl,wu,wj,wK")
-	(sign_extend:EXTSI (match_operand:SI 1 "lwa_operand" "Y,r,r,Z,Z")))]
+	(sign_extend:EXTSI (match_operand:SI 1 "lwa_operand" "Y,r,Z,Z,r,wK")))]
  ""
  "@
   lwa%U1%X1 %0,%1
   extsw %0,%1
-   mtvsrwa %x0,%1
   lfiwax %0,%y1
-   lxsiwax %x0,%y1"
+   lxsiwax %x0,%y1
-  [(set_attr "type" "load,exts,mffgpr,fpload,fpload")
+   mtvsrwa %x0,%1
+   vextsw2d %0,%1"
+  [(set_attr "type" "load,exts,fpload,fpload,mffgpr,vecexts")
   (set_attr "sign_extend" "yes")])
 (define_insn_and_split "*extendsi<mode>2_dot"
@@ -4947,15 +4950,16 @@
 ; We don't define lfiwax/lfiwzx with the normal definition, because we
 ; don't want to support putting SImode in FPR registers.
 (define_insn "lfiwax"
-  [(set (match_operand:DI 0 "gpc_reg_operand" "=d,wj,!wj")
+  [(set (match_operand:DI 0 "gpc_reg_operand" "=d,wj,wj,wK")
-	(unspec:DI [(match_operand:SI 1 "reg_or_indexed_operand" "Z,Z,r")]
+	(unspec:DI [(match_operand:SI 1 "reg_or_indexed_operand" "Z,Z,r,wK")]
 		   UNSPEC_LFIWAX))]
  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && TARGET_LFIWAX"
  "@
   lfiwax %0,%y1
   lxsiwax %x0,%y1
-   mtvsrwa %x0,%1"
+   mtvsrwa %x0,%1
-  [(set_attr "type" "fpload,fpload,mffgpr")])
+   vextsw2d %0,%1"
+  [(set_attr "type" "fpload,fpload,mffgpr,vecexts")])
 ; This split must be run before register allocation because it allocates the
 ; memory slot that is needed to move values to/from the FPR.  We don't allocate
@@ -5019,7 +5023,10 @@
  operands[1] = rs6000_address_for_fpconvert (operands[1]);
  if (GET_CODE (operands[2]) == SCRATCH)
    operands[2] = gen_reg_rtx (DImode);
-  emit_insn (gen_lfiwax (operands[2], operands[1]));
+  if (TARGET_VSX_SMALL_INTEGER)
+    emit_insn (gen_extendsidi2 (operands[2], operands[1]));
+  else
+    emit_insn (gen_lfiwax (operands[2], operands[1]));
  emit_insn (gen_floatdi<mode>2 (operands[0], operands[2]));
  DONE;
 }"
@@ -5027,15 +5034,16 @@
   (set_attr "type" "fpload")])
 (define_insn "lfiwzx"
-  [(set (match_operand:DI 0 "gpc_reg_operand" "=d,wj,!wj")
+  [(set (match_operand:DI 0 "gpc_reg_operand" "=d,wj,wj,wJwK")
-	(unspec:DI [(match_operand:SI 1 "reg_or_indexed_operand" "Z,Z,r")]
+	(unspec:DI [(match_operand:SI 1 "reg_or_indexed_operand" "Z,Z,r,wJwK")]
 		   UNSPEC_LFIWZX))]
  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && TARGET_LFIWZX"
  "@
   lfiwzx %0,%y1
   lxsiwzx %x0,%y1
-   mtvsrwz %x0,%1"
+   mtvsrwz %x0,%1
-  [(set_attr "type" "fpload,fpload,mftgpr")])
+   xxextractuw %x0,%x1,1"
+  [(set_attr "type" "fpload,fpload,mftgpr,vecexts")])
 (define_insn_and_split "floatunssi<mode>2_lfiwzx"
  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Fv>")
@@ -5094,7 +5102,10 @@
  operands[1] = rs6000_address_for_fpconvert (operands[1]);
  if (GET_CODE (operands[2]) == SCRATCH)
    operands[2] = gen_reg_rtx (DImode);
-  emit_insn (gen_lfiwzx (operands[2], operands[1]));
+  if (TARGET_VSX_SMALL_INTEGER)
+    emit_insn (gen_zero_extendsidi2 (operands[2], operands[1]));
+  else
+    emit_insn (gen_lfiwzx (operands[2], operands[1]));
  emit_insn (gen_floatdi<mode>2 (operands[0], operands[2]));
  DONE;
 }"
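
When `TARGET_VSX_SMALL_INTEGER` is set, the `floatsi<mode>2_lfiwax` and `floatunssi<mode>2_lfiwzx` splitters emit an ordinary sign or zero extension to `DImode` followed by `floatdi<mode>2`, rather than the `lfiwax`/`lfiwzx` UNSPEC. The following is a minimal C model of that two-step value computation only. The `model_*` function names are made up for this sketch, and it says nothing about which instructions the compiler ends up selecting.

```c
#include <stdint.h>

/* Step 1 (signed): widen a 32-bit value to 64 bits, keeping its sign.  */
int64_t
model_extendsidi2 (int32_t x)
{
  return (int64_t) x;
}

/* Step 1 (unsigned): widen a 32-bit value with the upper 32 bits zero.  */
uint64_t
model_zero_extendsidi2 (uint32_t x)
{
  return (uint64_t) x;
}

/* Step 2: convert the widened 64-bit integer to double.  */
double
model_floatsidf (int32_t x)
{
  return (double) model_extendsidi2 (x);
}

/* The zero-extended value always fits in a signed 64-bit integer, so a
   signed 64-bit conversion gives the right unsigned result.  */
double
model_floatunssidf (uint32_t x)
{
  return (double) (int64_t) model_zero_extendsidi2 (x);
}
```
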
@@ -6518,25 +6529,66 @@
  [(set_attr "type" "load")
   (set_attr "length" "4")])
+;;		MR           LA           LWZ          LFIWZX       LXSIWZX
+;;		STW          STFIWX       STXSIWX      LI           LIS
+;;		#            XXLOR        XXSPLTIB 0   XXSPLTIB -1  VSPLTISW
+;;		XXLXOR 0     XXLORC -1    P9 const     MTVSRWZ      MFVSRWZ
+;;		MF%1         MT%0         MT%0         NOP
 (define_insn "*movsi_internal1"
-  [(set (match_operand:SI 0 "rs6000_nonimmediate_operand" "=r,r,r,m,r,r,r,r,*c*l,*h,*h")
+  [(set (match_operand:SI 0 "rs6000_nonimmediate_operand"
-	(match_operand:SI 1 "input_operand" "r,U,m,r,I,L,n,*h,r,r,0"))]
+		"=r,         r,           r,           ?*wI,        ?*wH,
+		 m,          ?Z,          ?Z,          r,           r,
+		 r,          ?*wIwH,      ?*wJwK,      ?*wK,        ?*wJwK,
+		 ?*wJwK,     ?*wH,        ?*wK,        ?*wIwH,      ?r,
+		 r,          *c*l,        *h,          *h")
+	(match_operand:SI 1 "input_operand"
+		"r,          U,           m,           Z,           Z,
+		 r,          wI,          wH,          I,           L,
+		 n,          wIwH,        O,           wM,          wB,
+		 O,          wM,          wS,          r,           wIwH,
+		 *h,         r,           r,           0"))]
  "!TARGET_SINGLE_FPU &&
   (gpc_reg_operand (operands[0], SImode) || gpc_reg_operand (operands[1], SImode))"
  "@
   mr %0,%1
   la %0,%a1
   lwz%U1%X1 %0,%1
+   lfiwzx %0,%y1
+   lxsiwzx %x0,%y1
   stw%U0%X0 %1,%0
+   stfiwx %1,%y0
+   stxsiwx %x1,%y0
   li %0,%1
   lis %0,%v1
   #
+   xxlor %x0,%x1,%x1
+   xxspltib %x0,0
+   xxspltib %x0,255
+   vspltisw %0,%1
+   xxlxor %x0,%x0,%x0
+   xxlorc %x0,%x0,%x0
+   #
+   mtvsrwz %x0,%1
+   mfvsrwz %0,%x1
   mf%1 %0
   mt%0 %1
   mt%0 %1
   nop"
-  [(set_attr "type" "*,*,load,store,*,*,*,mfjmpr,mtjmpr,*,*")
+  [(set_attr "type"
-   (set_attr "length" "4,4,4,4,4,4,8,4,4,4,4")])
+		"*,          *,           load,        fpload,      fpload,
+		 store,      fpstore,     fpstore,     *,           *,
+		 *,          veclogical,  vecsimple,   vecsimple,   vecsimple,
+		 veclogical, veclogical,  vecsimple,   mffgpr,      mftgpr,
+		 *,           *,           *,           *")
+   (set_attr "length"
+		"4,          4,           4,           4,           4,
+		 4,          4,           4,           4,           4,
+		 8,          4,           4,           4,           4,
+		 4,          4,           8,           4,           4,
+		 4,          4,           4,           4")])
 (define_insn "*movsi_internal1_single"
  [(set (match_operand:SI 0 "rs6000_nonimmediate_operand" "=r,r,r,m,r,r,r,r,*c*l,*h,*h,m,*f")
@@ -6581,6 +6633,23 @@
    FAIL;
 }")
+;; Split loading -128..127 to use XXSPLITB and VEXTSW2D
+(define_split
+  [(set (match_operand:DI 0 "altivec_register_operand" "")
+	(match_operand:DI 1 "xxspltib_constant_split" ""))]
+  "TARGET_VSX_SMALL_INTEGER && TARGET_P9_VECTOR && reload_completed"
+  [(const_int 0)]
+{
+  rtx op0 = operands[0];
+  rtx op1 = operands[1];
+  int r = REGNO (op0);
+  rtx op0_v16qi = gen_rtx_REG (V16QImode, r);
+  emit_insn (gen_xxspltib_v16qi (op0_v16qi, op1));
+  emit_insn (gen_vsx_sign_extend_qi_si (operands[0], op0_v16qi));
+  DONE;
+})
 (define_insn "*mov<mode>_internal2"
  [(set (match_operand:CC 2 "cc_reg_operand" "=y,x,?y")
 	(compare:CC (match_operand:P 1 "gpc_reg_operand" "0,r,r")
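
The new `define_split` in this file builds a constant in the range -128..127 by first splatting it as a byte (`gen_xxspltib_v16qi`) and then sign-extending (`gen_vsx_sign_extend_qi_si`). The sketch below models just the arithmetic of those two steps for a single 32-bit lane. The `model_*` helpers are hypothetical and are not GCC functions.

```c
#include <assert.h>
#include <stdint.h>

/* Model of a byte splat: all 16 byte lanes receive VALUE.  */
static void
model_splat_byte (int8_t value, int8_t lanes[16])
{
  for (int i = 0; i < 16; i++)
    lanes[i] = value;
}

/* Model of sign-extending one byte lane to a 32-bit word.  */
static int32_t
model_sign_extend_qi_si (int8_t byte)
{
  return (int32_t) byte;
}

/* Build a small constant the way the split does: splat, then extend.
   Every byte lane holds the same value after the splat, so the lane
   picked here does not matter.  */
int32_t
model_load_small_constant (int value)
{
  int8_t lanes[16];
  assert (value >= -128 && value <= 127);
  model_splat_byte ((int8_t) value, lanes);
  return model_sign_extend_qi_si (lanes[0]);
}
```

The range check mirrors the -128..127 limit stated in the ChangeLog: anything outside it cannot be represented by a single signed byte splat.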

--- a/gcc/config/rs6000/rs6000.opt
+++ b/gcc/config/rs6000/rs6000.opt
@@ -664,3 +664,7 @@ Enable using IEEE 128-bit floating point instructions.
 mfloat128-convert
 Target Undocumented Mask(FLOAT128_CVT) Var(rs6000_isa_flags)
 Enable default conversions between __float128 & long double.
+mvsx-small-integer
+Target Report Mask(VSX_SMALL_INTEGER) Var(rs6000_isa_flags)
+Enable small integers to be in VSX registers.
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -263,11 +263,14 @@
 			    (V2DI	"wi")])
 ;; Iterators for loading constants with xxspltib
-(define_mode_iterator VSINT_84  [V4SI V2DI DI])
+(define_mode_iterator VSINT_84  [V4SI V2DI DI SI])
 (define_mode_iterator VSINT_842 [V8HI V4SI V2DI])
-;; Iterator for ISA 3.0 vector extract/insert of integer vectors
+;; Iterator for ISA 3.0 vector extract/insert of small integer vectors.
-(define_mode_iterator VSX_EXTRACT_I [V16QI V8HI V4SI])
+;; VSX_EXTRACT_I2 doesn't include V4SImode because SI extracts can be
+;; done on ISA 2.07 and not just ISA 3.0.
+(define_mode_iterator VSX_EXTRACT_I  [V16QI V8HI V4SI])
+(define_mode_iterator VSX_EXTRACT_I2 [V16QI V8HI])
 (define_mode_attr VSX_EXTRACT_WIDTH [(V16QI "b")
 		  		     (V8HI "h")
@@ -2496,7 +2499,9 @@
 	      (clobber (match_dup 3))])]
  "VECTOR_MEM_VSX_P (<MODE>mode) && TARGET_DIRECT_MOVE_64BIT"
 {
-  operands[3] = gen_rtx_SCRATCH ((TARGET_VEXTRACTUB) ? DImode : <MODE>mode);
+  machine_mode smode = ((<MODE>mode != V4SImode && TARGET_VEXTRACTUB)
+			? DImode : <MODE>mode);
+  operands[3] = gen_rtx_SCRATCH (smode);
 })
 ;; Under ISA 3.0, we can use the byte/half-word/word integer stores if we are
@@ -2505,9 +2510,9 @@
 (define_insn_and_split  "*vsx_extract_<mode>_p9"
  [(set (match_operand:<VS_scalar> 0 "nonimmediate_operand" "=r,Z")
 	(vec_select:<VS_scalar>
-	 (match_operand:VSX_EXTRACT_I 1 "gpc_reg_operand" "<VSX_EX>,<VSX_EX>")
+	 (match_operand:VSX_EXTRACT_I2 1 "gpc_reg_operand" "v,v")
 	 (parallel [(match_operand:QI 2 "<VSX_EXTRACT_PREDICATE>" "n,n")])))
-   (clobber (match_scratch:DI 3 "=<VSX_EX>,<VSX_EX>"))]
+   (clobber (match_scratch:DI 3 "=v,v"))]
  "VECTOR_MEM_VSX_P (<MODE>mode) && TARGET_VEXTRACTUB"
  "#"
  "&& (reload_completed || MEM_P (operands[0]))"
@@ -2536,8 +2541,6 @@
 	emit_insn (gen_p9_stxsibx (dest, di_tmp));
      else if (<MODE>mode == V8HImode)
 	emit_insn (gen_p9_stxsihx (dest, di_tmp));
-      else if (<MODE>mode == V4SImode)
-	emit_insn (gen_stfiwx (dest, di_tmp));
      else
 	gcc_unreachable ();
    }
@@ -2570,12 +2573,70 @@
 }
  [(set_attr "type" "vecsimple")])
+(define_insn_and_split  "*vsx_extract_si"
+  [(set (match_operand:SI 0 "nonimmediate_operand" "=r,Z,Z,wJwK")
+	(vec_select:SI
+	 (match_operand:V4SI 1 "gpc_reg_operand" "v,wJwK,v,v")
+	 (parallel [(match_operand:QI 2 "const_0_to_3_operand" "n,n,n,n")])))
+   (clobber (match_scratch:V4SI 3 "=v,wJwK,v,v"))]
+  "VECTOR_MEM_VSX_P (V4SImode) && TARGET_DIRECT_MOVE_64BIT"
+  "#"
+  "&& reload_completed"
+  [(const_int 0)]
+{
+  rtx dest = operands[0];
+  rtx src = operands[1];
+  rtx element = operands[2];
+  rtx vec_tmp = operands[3];
+  int value;
+  if (!VECTOR_ELT_ORDER_BIG)
+    element = GEN_INT (GET_MODE_NUNITS (V4SImode) - 1 - INTVAL (element));
+  /* If the value is in the correct position, we can avoid doing the VSPLT<x>
+     instruction.  */
+  value = INTVAL (element);
+  if (value != 1)
+    {
+      if (TARGET_VEXTRACTUB)
+	{
+	  rtx di_tmp = gen_rtx_REG (DImode, REGNO (vec_tmp));
+	  emit_insn (gen_vsx_extract_v4si_di (di_tmp,src, element));
+	}
+      else
+	emit_insn (gen_altivec_vspltw_direct (vec_tmp, src, element));
+    }
+  else
+    vec_tmp = src;
+  if (MEM_P (operands[0]))
+    {
+      if (can_create_pseudo_p ())
+	dest = rs6000_address_for_fpconvert (dest);
+      if (TARGET_VSX_SMALL_INTEGER)
+	emit_move_insn (dest, gen_rtx_REG (SImode, REGNO (vec_tmp)));
+      else
+	emit_insn (gen_stfiwx (dest, gen_rtx_REG (DImode, REGNO (vec_tmp))));
+    }
+  else if (TARGET_VSX_SMALL_INTEGER)
+    emit_move_insn (dest, gen_rtx_REG (SImode, REGNO (vec_tmp)));
+  else
+    emit_move_insn (gen_rtx_REG (DImode, REGNO (dest)),
+		    gen_rtx_REG (DImode, REGNO (vec_tmp)));
+  DONE;
+}
+  [(set_attr "type" "mftgpr,fpstore,fpstore,vecsimple")
+   (set_attr "length" "8")])
 (define_insn_and_split  "*vsx_extract_<mode>_p8"
  [(set (match_operand:<VS_scalar> 0 "nonimmediate_operand" "=r")
 	(vec_select:<VS_scalar>
-	 (match_operand:VSX_EXTRACT_I 1 "gpc_reg_operand" "v")
+	 (match_operand:VSX_EXTRACT_I2 1 "gpc_reg_operand" "v")
 	 (parallel [(match_operand:QI 2 "<VSX_EXTRACT_PREDICATE>" "n")])))
-   (clobber (match_scratch:VSX_EXTRACT_I 3 "=v"))]
+   (clobber (match_scratch:VSX_EXTRACT_I2 3 "=v"))]
  "VECTOR_MEM_VSX_P (<MODE>mode) && TARGET_DIRECT_MOVE_64BIT"
  "#"
  "&& reload_completed"
@@ -2607,13 +2668,6 @@
      else
 	vec_tmp = src;
    }
-  else if (<MODE>mode == V4SImode)
-    {
-      if (value != 1)
-	emit_insn (gen_altivec_vspltw_direct (vec_tmp, src, element));
-      else
-	vec_tmp = src;
-    }
  else
    gcc_unreachable ();
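
The `*vsx_extract_si` split first renumbers the element when `VECTOR_ELT_ORDER_BIG` is false and then skips the splat or extract step if the element is already number 1. A small C model of that index logic follows, assuming a four-element `V4SI` vector. The `model_*` names are hypothetical, and the flag argument stands in for `VECTOR_ELT_ORDER_BIG`.

```c
/* Number of 32-bit elements in a V4SI vector.  */
#define MODEL_V4SI_NUNITS 4

/* Map a source-level element number to the numbering used by the split.
   With big-endian element order the number is unchanged, otherwise it
   is mirrored.  */
int
model_be_element (int element, int elt_order_big)
{
  return elt_order_big ? element : MODEL_V4SI_NUNITS - 1 - element;
}

/* Nonzero when the split has to move the element first.  Element 1 is
   already in the position the later move or store expects.  */
int
model_needs_splat (int element, int elt_order_big)
{
  return model_be_element (element, elt_order_big) != 1;
}
```

Under this model, element 2 in little-endian order and element 1 in big-endian order are the cases that need no extra instruction.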

--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -3125,6 +3125,18 @@ Memory operand suitable for power9 fusion load/stores.
 @item wG
 Memory operand suitable for TOC fusion memory references.
+@item wH
+Altivec register if @option{-mvsx-small-integer}.
+@item wI
+Floating point register if @option{-mvsx-small-integer}.
+@item wJ
+FP register if @option{-mvsx-small-integer} and @option{-mpower9-vector}.
+@item wK
+Altivec register if @option{-mvsx-small-integer} and @option{-mpower9-vector}.
 @item wL
 Int constant that is the element number that the MFVSRLD instruction.
 targets.
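
Going by the documentation text above, each new constraint names a register class only when the relevant options are enabled, and `NO_REGS` otherwise. The sketch below restates that table in C. The enum names and the function `model_constraint_class` are invented for illustration. The real assignment happens in `rs6000.c`, which this page does not show, so treat the logic as a reading of the `md.texi` entries rather than a copy of the implementation.

```c
/* Illustrative stand-ins for register classes; values are arbitrary.  */
enum model_reg_class
{
  MODEL_NO_REGS,
  MODEL_FLOAT_REGS,
  MODEL_ALTIVEC_REGS
};

/* The four constraints documented in this hunk.  */
enum model_constraint
{
  MODEL_wH,
  MODEL_wI,
  MODEL_wJ,
  MODEL_wK
};

/* Return the class a constraint would name for the given option state.
   SMALL_INTEGER models -mvsx-small-integer, P9_VECTOR models
   -mpower9-vector.  */
enum model_reg_class
model_constraint_class (enum model_constraint c, int small_integer,
			int p9_vector)
{
  if (!small_integer)
    return MODEL_NO_REGS;

  switch (c)
    {
    case MODEL_wH:
      return MODEL_ALTIVEC_REGS;
    case MODEL_wI:
      return MODEL_FLOAT_REGS;
    case MODEL_wJ:
      return p9_vector ? MODEL_FLOAT_REGS : MODEL_NO_REGS;
    case MODEL_wK:
      return p9_vector ? MODEL_ALTIVEC_REGS : MODEL_NO_REGS;
    }
  return MODEL_NO_REGS;
}
```
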

--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
+2016-10-27  Michael Meissner  <meissner@linux.vnet.ibm.com>
+	* gcc.target/powerpc/vsx-simode.c: New test.
+	* gcc.target/powerpc/vsx-simode2.c: Likewise.
+	* gcc.target/powerpc/vsx-simode3.c: Likewise.
 2016-10-27  Jakub Jelinek  <jakub@redhat.com>
 	PR fortran/78026

--- a/gcc/testsuite/gcc.target/powerpc/vsx-simode.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsx-simode.c
+/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power8" } } */
+/* { dg-options "-mcpu=power8 -O2 -mvsx-small-integer" } */
+double load_asm_d_constraint (int *p)
+{
+  double ret;
+  __asm__ ("xxlor %x0,%x1,%x1\t# load d constraint" : "=d" (ret) : "d" (*p));
+  return ret;
+}
+void store_asm_d_constraint (int *p, double x)
+{
+  int i;
+  __asm__ ("xxlor %x0,%x1,%x1\t# store d constraint" : "=d" (i) : "d" (x));
+  *p = i;
+}
+/* { dg-final { scan-assembler "lfiwzx" } } */
+/* { dg-final { scan-assembler "stfiwx" } } */
--- a/gcc/testsuite/gcc.target/powerpc/vsx-simode2.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsx-simode2.c
+/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power8" } } */
+/* { dg-options "-mcpu=power8 -O2 -mvsx-small-integer" } */
+unsigned int foo (unsigned int u)
+{
+  unsigned int ret;
+  __asm__ ("xxlor %x0,%x1,%x1\t# v, v constraints" : "=v" (ret) : "v" (u));
+  return ret;
+}
+/* { dg-final { scan-assembler "mtvsrwz" } } */
+/* { dg-final { scan-assembler "mfvsrwz" } } */
--- a/gcc/testsuite/gcc.target/powerpc/vsx-simode3.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsx-simode3.c
+/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power8" } } */
+/* { dg-options "-mcpu=power8 -O2 -mvsx-small-integer" } */
+double load_asm_v_constraint (int *p)
+{
+  double ret;
+  __asm__ ("xxlor %x0,%x1,%x1\t# load v constraint" : "=d" (ret) : "v" (*p));
+  return ret;
+}
+void store_asm_v_constraint (int *p, double x)
+{
+  int i;
+  __asm__ ("xxlor %x0,%x1,%x1\t# store v constraint" : "=v" (i) : "d" (x));
+  *p = i;
+}
+/* { dg-final { scan-assembler "lxsiwzx" } } */
+/* { dg-final { scan-assembler "stxsiwx" } } */