predicates.md (indexed_address, [...]): New predicates.

[gcc] 2013-03-20 Pat Haugen <pthaugen@us.ibm.com> * config/rs6000/predicates.md (indexed_address, update_address_mem update_indexed_address_mem): New predicates. * config/rs6000/vsx.md (vsx_extract_<mode>_zero): Set correct "type" attribute for load/store instructions. * config/rs6000/dfp.md (movsd_store): Likewise. (movsd_load): Likewise. * config/rs6000/rs6000.md (zero_extend<mode>di2_internal1): Likewise. (unnamed HI->DI extend define_insn): Likewise. (unnamed SI->DI extend define_insn): Likewise. (unnamed QI->SI extend define_insn): Likewise. (unnamed QI->HI extend define_insn): Likewise. (unnamed HI->SI extend define_insn): Likewise. (unnamed HI->SI extend define_insn): Likewise. (extendsfdf2_fpr): Likewise. (movsi_internal1): Likewise. (movsi_internal1_single): Likewise. (movhi_internal): Likewise. (movqi_internal): Likewise. (movcc_internal1): Correct mnemonic for stw insn. Set correct "type" attribute for load/store instructions. (mov<mode>_hardfloat): Set correct "type" attribute for load/store instructions. (mov<mode>_softfloat): Likewise. (mov<mode>_hardfloat32): Likewise. (mov<mode>_hardfloat64): Likewise. (mov<mode>_softfloat64): Likewise. (movdi_internal32): Likewise. (movdi_internal64): Likewise. (probe_stack_<mode>): Likewise. 2013-03-20 Michael Meissner <meissner@linux.vnet.ibm.com> * config/rs6000/vector.md (VEC_R): Add 32-bit integer, binary floating point, and decimal floating point to reload iterator. * config/rs6000/constraints.md (wl constraint): New constraints to return FLOAT_REGS if certain options are used to reduce the number of separate patterns that exist in the file. (wx constraint): Likewise. (wz constraint): Likewise. * config/rs6000/rs6000.c (rs6000_debug_reg_global): If -mdebug=reg, print wg, wl, wx, and wz constraints. (rs6000_init_hard_regno_mode_ok): Initialize new constraints. Initialize the reload functions for 64-bit binary/decimal floating point types. (reg_offset_addressing_ok_p): If we are on a power7 or later, use LFIWZX and STFIWX to load/store 32-bit decimal types, and don't create the buffer on the stack to overcome not having a 32-bit load and store. (rs6000_emit_move): Likewise. (rs6000_secondary_memory_needed_rtx): Likewise. (rs6000_alloc_sdmode_stack_slot): Likewise. (rs6000_preferred_reload_class): On VSX, we can create SFmode 0.0f via xxlxor, just like DFmode 0.0. * config/rs6000/rs6000.h (TARGET_NO_SDMODE_STACK): New macro, define as 1 if we are running on a power7 or newer. (enum r6000_reg_class_enum): Add new constraints. * config/rs6000/dfp.md (movsd): Delete, combine with binary floating point moves in rs6000.md. Combine power6x (mfpgpr) moves with other moves by using conditional constraits (wg). Use LFIWZX and STFIWX for loading SDmode on power7. Use xxlxor to create 0.0f. (movsd splitter): Likewise. (movsd_hardfloat): Likewise. (movsd_softfloat): Likewise. * config/rs6000/rs6000.md (FMOVE32): New iterators to combine binary and decimal floating point moves. (fmove_ok): New attributes to combine binary and decimal floating point moves, and to combine power6x (mfpgpr) moves along normal floating moves. (real_value_to_target): Likewise. (f32_lr): Likewise. (f32_lm): Likewise. (f32_li): Likewise. (f32_sr): Likewise. (f32_sm): Likewise. (f32_si): Likewise. (movsf): Combine binary and decimal floating point moves. Combine power6x (mfpgpr) moves with other moves by using conditional constraits (wg). Use LFIWZX and STFIWX for loading SDmode on power7. (mov<mode> for SFmode/SDmode); Likewise. (SFmode/SDmode splitters): Likewise. (movsf_hardfloat): Likewise. (mov<mode>_hardfloat for SFmode/SDmode): Likewise. (movsf_softfloat): Likewise. (mov<mode>_softfloat for SFmode/SDmode): Likewise. * doc/md.texi (PowerPC and IBM RS6000 constraints): Document wl, wx and wz constraints. * config/rs6000/constraints.md (wg constraint): New constraint to return FLOAT_REGS if -mmfpgpr (power6x) was used. * config/rs6000/rs6000.h (enum r6000_reg_class_enum): Add wg constraint. * config/rs6000/rs6000.c (rs6000_debug_reg_global): If -mdebug=reg, print wg, wl, wx, and wz constraints. (rs6000_init_hard_regno_mode_ok): Initialize new constraints. Initialize the reload functions for 64-bit binary/decimal floating point types. (reg_offset_addressing_ok_p): If we are on a power7 or later, use LFIWZX and STFIWX to load/store 32-bit decimal types, and don't create the buffer on the stack to overcome not having a 32-bit load and store. (rs6000_emit_move): Likewise. (rs6000_secondary_memory_needed_rtx): Likewise. (rs6000_alloc_sdmode_stack_slot): Likewise. (rs6000_preferred_reload_class): On VSX, we can create SFmode 0.0f via xxlxor, just like DFmode 0.0. * config/rs6000/dfp.md (movdd): Delete, combine with binary floating point moves in rs6000.md. Combine power6x (mfpgpr) moves with other moves by using conditional constraits (wg). Use LFIWZX and STFIWX for loading SDmode on power7. (movdd splitters): Likewise. (movdd_hardfloat32): Likewise. (movdd_softfloat32): Likewise. (movdd_hardfloat64_mfpgpr): Likewise. (movdd_hardfloat64): Likewise. (movdd_softfloat64): Likewise. * config/rs6000/rs6000.md (FMOVE64): New iterators to combine 64-bit binary and decimal floating point moves. (FMOVE64X): Likewise. (movdf): Combine 64-bit binary and decimal floating point moves. Combine power6x (mfpgpr) moves with other moves by using conditional constraits (wg). (mov<mode> for DFmode/DDmode): Likewise. (DFmode/DDmode splitters): Likewise. (movdf_hardfloat32): Likewise. (mov<mode>_hardfloat32 for DFmode/DDmode): Likewise. (movdf_softfloat32): Likewise. (movdf_hardfloat64_mfpgpr): Likewise. (movdf_hardfloat64): Likewise. (mov<mode>_hardfloat64 for DFmode/DDmode): Likewise. (movdf_softfloat64): Likewise. (mov<mode>_softfloat64 for DFmode/DDmode): Likewise. (reload_<mode>_load): Move to later in the file so they aren't in the middle of the floating point move insns. (reload_<mode>_store): Likewise. * doc/md.texi (PowerPC and IBM RS6000 constraints): Document wg constraint. * config/rs6000/rs6000.c (rs6000_debug_reg_global): Print out wg constraint if -mdebug=reg. (rs6000_initi_hard_regno_mode_ok): Enable wg constraint if -mfpgpr. Enable using dd reload support if needed. * config/rs6000/dfp.md (movtd): Delete, combine with 128-bit binary and decimal floating point moves in rs6000.md. (movtd_internal): Likewise. * config/rs6000/rs6000.md (FMOVE128): Combine 128-bit binary and decimal floating point moves. (movtf): Likewise. (movtf_internal): Likewise. (mov<mode>_internal, TDmode/TFmode): Likewise. (movtf_softfloat): Likewise. (mov<mode>_softfloat, TDmode/TFmode): Likewise. * config/rs6000/rs6000.md (movdi_mfpgpr): Delete, combine with movdi_internal64, using wg constraint for move direct operations. (movdi_internal64): Likewise. * config/rs6000/rs6000.c (rs6000_debug_reg_global): Print MODES_TIEABLE_P for selected modes. Print the numerical value of the various virtual registers. Use GPR/FPR first/last values, instead of hard coding the register numbers. Print which modes have reload functions registered. (rs6000_option_override_internal): If -mdebug=reg, trace the options settings before/after setting cpu, target and subtarget settings. (rs6000_secondary_reload_trace): Improve the RTL dump for -mdebug=addr and for secondary reload failures in rs6000_secondary_reload_inner. (rs6000_secondary_reload_fail): Likewise. (rs6000_secondary_reload_inner): Likewise. * config/rs6000/rs6000.md (FIRST_GPR_REGNO): Add convenience macros for first/last GPR and FPR registers. (LAST_GPR_REGNO): Likewise. (FIRST_FPR_REGNO): Likewise. (LAST_FPR_REGNO): Likewise. * config/rs6000/vector.md (mul<mode>3): Use the combined macro VECTOR_UNIT_ALTIVEC_OR_VSX_P instead of separate calls to VECTOR_UNIT_ALTIVEC_P and VECTOR_UNIT_VSX_P. (vcond<mode><mode>): Likewise. (vcondu<mode><mode>): Likewise. (vector_gtu<mode>): Likewise. (vector_gte<mode>): Likewise. (xor<mode>3): Don't allow logical operations on TImode in 32-bit to prevent the compiler from converting DImode operations to TImode. (ior<mode>3): Likewise. (and<mode>3): Likewise. (one_cmpl<mode>2): Likewise. (nor<mode>3): Likewise. (andc<mode>3): Likewise. * config/rs6000/constraints.md (wt constraint): New constraint that returns VSX_REGS if TImode is allowed in VSX registers. * config/rs6000/predicates.md (easy_fp_constant): 0.0f is an easy constant under VSX. * config/rs6000/rs6000-modes.def (PTImode): Define, PTImode is similar to TImode, but it is restricted to being in the GPRs. * config/rs6000/rs6000.opt (-mvsx-timode): New switch to allow TImode to occupy a single VSX register. * config/rs6000/rs6000-cpus.def (ISA_2_6_MASKS_SERVER): Default to -mvsx-timode for power7/power8. (power7 cpu): Likewise. (power8 cpu): Likewise. * config/rs6000/rs6000.c (rs6000_hard_regno_nregs_internal): Make sure that TFmode/TDmode take up two registers if they are ever allowed in the upper VSX registers. (rs6000_hard_regno_mode_ok): If -mvsx-timode, allow TImode in VSX registers. (rs6000_init_hard_regno_mode_ok): Likewise. (rs6000_debug_reg_global): Add debugging for PTImode and wt constraint. Print if LRA is turned on. (rs6000_option_override_internal): Give an error if -mvsx-timode and VSX is not enabled. (invalid_e500_subreg): Handle PTImode, restricting it to GPRs. If -mvsx-timode, restrict TImode to reg+reg addressing, and PTImode to reg+offset addressing. Use PTImode when checking offset addresses for validity. (reg_offset_addressing_ok_p): Likewise. (rs6000_legitimate_offset_address_p): Likewise. (rs6000_legitimize_address): Likewise. (rs6000_legitimize_reload_address): Likewise. (rs6000_legitimate_address_p): Likewise. (rs6000_eliminate_indexed_memrefs): Likewise. (rs6000_emit_move): Likewise. (rs6000_secondary_reload): Likewise. (rs6000_secondary_reload_inner): Handle PTImode. Allow 64-bit reloads to fpr registers to continue to use reg+offset addressing, but 64-bit reloads to altivec registers need reg+reg addressing. Drop test for PRE_MODIFY, since VSX loads/stores no longer support it. Treat LO_SUM like a PLUS operation. (rs6000_secondary_reload_class): If type is 64-bit, prefer to use FLOAT_REGS instead of VSX_RGS to allow use of reg+offset addressing. (rs6000_cannot_change_mode_class): Do not allow TImode in VSX registers to share a register with a smaller sized type, since VSX puts scalars in the upper 64-bits. (print_operand): Add support for PTImode. (rs6000_register_move_cost): Use VECTOR_MEM_VSX_P instead of VECTOR_UNIT_VSX_P to catch types that can be loaded in VSX registers, but don't have arithmetic support. (rs6000_memory_move_cost): Add test for VSX. (rs6000_opt_masks): Add -mvsx-timode. * config/rs6000/vsx.md (VSm): Change to use 64-bit aligned moves for TImode. (VSs): Likewise. (VSr): Use wt constraint for TImode. (VSv): Drop TImode support. (vsx_movti): Delete, replace with versions for 32-bit and 64-bit. (vsx_movti_64bit): Likewise. (vsx_movti_32bit): Likewise. (vec_store_<mode>): Use VSX iterator instead of vector iterator. (vsx_and<mode>3): Delete use of '?' constraint on inputs, just put one '?' on the appropriate output constraint. Do not allow TImode logical operations on 32-bit systems. (vsx_ior<mode>3): Likewise. (vsx_xor<mode>3): Likewise. (vsx_one_cmpl<mode>2): Likewise. (vsx_nor<mode>3): Likewise. (vsx_andc<mode>3): Likewise. (vsx_concat_<mode>): Likewise. (vsx_xxpermdi_<mode>): Fix thinko for non V2DF/V2DI modes. * config/rs6000/rs6000.h (MASK_VSX_TIMODE): Map from OPTION_MASK_VSX_TIMODE. (enum rs6000_reg_class_enum): Add RS6000_CONSTRAINT_wt. (STACK_SAVEAREA_MODE): Use PTImode instead of TImode. * config/rs6000/rs6000.md (INT mode attribute): Add PTImode. (TI2 iterator): New iterator for TImode, PTImode. (wd mode attribute): Add values for vector types. (movti_string): Replace TI move operations with operations for TImode and PTImode. Add support for TImode being allowed in VSX registers. (mov<mode>_string, TImode/PTImode): Likewise. (movti_ppc64): Likewise. (mov<mode>_ppc64, TImode/PTImode): Likewise. (TI mode splitters): Likewise. * doc/md.texi (PowerPC and IBM RS6000 constraints): Document wt constraint. [gcc/testsuite] 2013-03-20 Michael Meissner <meissner@linux.vnet.ibm.com> * gcc.target/powerpc/mmfpgpr.c: New test. * gcc.target/powerpc/sd-vsx.c: Likewise. * gcc.target/powerpc/sd-pwr6.c: Likewise. * gcc.target/powerpc/vsx-float0.c: Likewise. From-SVN: r196831

predicates.md (indexed_address, [...]): New predicates.
[gcc] 2013-03-20 Pat Haugen <pthaugen@us.ibm.com> * config/rs6000/predicates.md (indexed_address, update_address_mem update_indexed_address_mem): New predicates. * config/rs6000/vsx.md (vsx_extract_<mode>_zero): Set correct "type" attribute for load/store instructions. * config/rs6000/dfp.md (movsd_store): Likewise. (movsd_load): Likewise. * config/rs6000/rs6000.md (zero_extend<mode>di2_internal1): Likewise. (unnamed HI->DI extend define_insn): Likewise. (unnamed SI->DI extend define_insn): Likewise. (unnamed QI->SI extend define_insn): Likewise. (unnamed QI->HI extend define_insn): Likewise. (unnamed HI->SI extend define_insn): Likewise. (unnamed HI->SI extend define_insn): Likewise. (extendsfdf2_fpr): Likewise. (movsi_internal1): Likewise. (movsi_internal1_single): Likewise. (movhi_internal): Likewise. (movqi_internal): Likewise. (movcc_internal1): Correct mnemonic for stw insn. Set correct "type" attribute for load/store instructions. (mov<mode>_hardfloat): Set correct "type" attribute for load/store instructions. (mov<mode>_softfloat): Likewise. (mov<mode>_hardfloat32): Likewise. (mov<mode>_hardfloat64): Likewise. (mov<mode>_softfloat64): Likewise. (movdi_internal32): Likewise. (movdi_internal64): Likewise. (probe_stack_<mode>): Likewise. 2013-03-20 Michael Meissner <meissner@linux.vnet.ibm.com> * config/rs6000/vector.md (VEC_R): Add 32-bit integer, binary floating point, and decimal floating point to reload iterator. * config/rs6000/constraints.md (wl constraint): New constraints to return FLOAT_REGS if certain options are used to reduce the number of separate patterns that exist in the file. (wx constraint): Likewise. (wz constraint): Likewise. * config/rs6000/rs6000.c (rs6000_debug_reg_global): If -mdebug=reg, print wg, wl, wx, and wz constraints. (rs6000_init_hard_regno_mode_ok): Initialize new constraints. Initialize the reload functions for 64-bit binary/decimal floating point types. (reg_offset_addressing_ok_p): If we are on a power7 or later, use LFIWZX and STFIWX to load/store 32-bit decimal types, and don't create the buffer on the stack to overcome not having a 32-bit load and store. (rs6000_emit_move): Likewise. (rs6000_secondary_memory_needed_rtx): Likewise. (rs6000_alloc_sdmode_stack_slot): Likewise. (rs6000_preferred_reload_class): On VSX, we can create SFmode 0.0f via xxlxor, just like DFmode 0.0. * config/rs6000/rs6000.h (TARGET_NO_SDMODE_STACK): New macro, define as 1 if we are running on a power7 or newer. (enum r6000_reg_class_enum): Add new constraints. * config/rs6000/dfp.md (movsd): Delete, combine with binary floating point moves in rs6000.md. Combine power6x (mfpgpr) moves with other moves by using conditional constraits (wg). Use LFIWZX and STFIWX for loading SDmode on power7. Use xxlxor to create 0.0f. (movsd splitter): Likewise. (movsd_hardfloat): Likewise. (movsd_softfloat): Likewise. * config/rs6000/rs6000.md (FMOVE32): New iterators to combine binary and decimal floating point moves. (fmove_ok): New attributes to combine binary and decimal floating point moves, and to combine power6x (mfpgpr) moves along normal floating moves. (real_value_to_target): Likewise. (f32_lr): Likewise. (f32_lm): Likewise. (f32_li): Likewise. (f32_sr): Likewise. (f32_sm): Likewise. (f32_si): Likewise. (movsf): Combine binary and decimal floating point moves. Combine power6x (mfpgpr) moves with other moves by using conditional constraits (wg). Use LFIWZX and STFIWX for loading SDmode on power7. (mov<mode> for SFmode/SDmode); Likewise. (SFmode/SDmode splitters): Likewise. (movsf_hardfloat): Likewise. (mov<mode>_hardfloat for SFmode/SDmode): Likewise. (movsf_softfloat): Likewise. (mov<mode>_softfloat for SFmode/SDmode): Likewise. * doc/md.texi (PowerPC and IBM RS6000 constraints): Document wl, wx and wz constraints. * config/rs6000/constraints.md (wg constraint): New constraint to return FLOAT_REGS if -mmfpgpr (power6x) was used. * config/rs6000/rs6000.h (enum r6000_reg_class_enum): Add wg constraint. * config/rs6000/rs6000.c (rs6000_debug_reg_global): If -mdebug=reg, print wg, wl, wx, and wz constraints. (rs6000_init_hard_regno_mode_ok): Initialize new constraints. Initialize the reload functions for 64-bit binary/decimal floating point types. (reg_offset_addressing_ok_p): If we are on a power7 or later, use LFIWZX and STFIWX to load/store 32-bit decimal types, and don't create the buffer on the stack to overcome not having a 32-bit load and store. (rs6000_emit_move): Likewise. (rs6000_secondary_memory_needed_rtx): Likewise. (rs6000_alloc_sdmode_stack_slot): Likewise. (rs6000_preferred_reload_class): On VSX, we can create SFmode 0.0f via xxlxor, just like DFmode 0.0. * config/rs6000/dfp.md (movdd): Delete, combine with binary floating point moves in rs6000.md. Combine power6x (mfpgpr) moves with other moves by using conditional constraits (wg). Use LFIWZX and STFIWX for loading SDmode on power7. (movdd splitters): Likewise. (movdd_hardfloat32): Likewise. (movdd_softfloat32): Likewise. (movdd_hardfloat64_mfpgpr): Likewise. (movdd_hardfloat64): Likewise. (movdd_softfloat64): Likewise. * config/rs6000/rs6000.md (FMOVE64): New iterators to combine 64-bit binary and decimal floating point moves. (FMOVE64X): Likewise. (movdf): Combine 64-bit binary and decimal floating point moves. Combine power6x (mfpgpr) moves with other moves by using conditional constraits (wg). (mov<mode> for DFmode/DDmode): Likewise. (DFmode/DDmode splitters): Likewise. (movdf_hardfloat32): Likewise. (mov<mode>_hardfloat32 for DFmode/DDmode): Likewise. (movdf_softfloat32): Likewise. (movdf_hardfloat64_mfpgpr): Likewise. (movdf_hardfloat64): Likewise. (mov<mode>_hardfloat64 for DFmode/DDmode): Likewise. (movdf_softfloat64): Likewise. (mov<mode>_softfloat64 for DFmode/DDmode): Likewise. (reload_<mode>_load): Move to later in the file so they aren't in the middle of the floating point move insns. (reload_<mode>_store): Likewise. * doc/md.texi (PowerPC and IBM RS6000 constraints): Document wg constraint. * config/rs6000/rs6000.c (rs6000_debug_reg_global): Print out wg constraint if -mdebug=reg. (rs6000_initi_hard_regno_mode_ok): Enable wg constraint if -mfpgpr. Enable using dd reload support if needed. * config/rs6000/dfp.md (movtd): Delete, combine with 128-bit binary and decimal floating point moves in rs6000.md. (movtd_internal): Likewise. * config/rs6000/rs6000.md (FMOVE128): Combine 128-bit binary and decimal floating point moves. (movtf): Likewise. (movtf_internal): Likewise. (mov<mode>_internal, TDmode/TFmode): Likewise. (movtf_softfloat): Likewise. (mov<mode>_softfloat, TDmode/TFmode): Likewise. * config/rs6000/rs6000.md (movdi_mfpgpr): Delete, combine with movdi_internal64, using wg constraint for move direct operations. (movdi_internal64): Likewise. * config/rs6000/rs6000.c (rs6000_debug_reg_global): Print MODES_TIEABLE_P for selected modes. Print the numerical value of the various virtual registers. Use GPR/FPR first/last values, instead of hard coding the register numbers. Print which modes have reload functions registered. (rs6000_option_override_internal): If -mdebug=reg, trace the options settings before/after setting cpu, target and subtarget settings. (rs6000_secondary_reload_trace): Improve the RTL dump for -mdebug=addr and for secondary reload failures in rs6000_secondary_reload_inner. (rs6000_secondary_reload_fail): Likewise. (rs6000_secondary_reload_inner): Likewise. * config/rs6000/rs6000.md (FIRST_GPR_REGNO): Add convenience macros for first/last GPR and FPR registers. (LAST_GPR_REGNO): Likewise. (FIRST_FPR_REGNO): Likewise. (LAST_FPR_REGNO): Likewise. * config/rs6000/vector.md (mul<mode>3): Use the combined macro VECTOR_UNIT_ALTIVEC_OR_VSX_P instead of separate calls to VECTOR_UNIT_ALTIVEC_P and VECTOR_UNIT_VSX_P. (vcond<mode><mode>): Likewise. (vcondu<mode><mode>): Likewise. (vector_gtu<mode>): Likewise. (vector_gte<mode>): Likewise. (xor<mode>3): Don't allow logical operations on TImode in 32-bit to prevent the compiler from converting DImode operations to TImode. (ior<mode>3): Likewise. (and<mode>3): Likewise. (one_cmpl<mode>2): Likewise. (nor<mode>3): Likewise. (andc<mode>3): Likewise. * config/rs6000/constraints.md (wt constraint): New constraint that returns VSX_REGS if TImode is allowed in VSX registers. * config/rs6000/predicates.md (easy_fp_constant): 0.0f is an easy constant under VSX. * config/rs6000/rs6000-modes.def (PTImode): Define, PTImode is similar to TImode, but it is restricted to being in the GPRs. * config/rs6000/rs6000.opt (-mvsx-timode): New switch to allow TImode to occupy a single VSX register. * config/rs6000/rs6000-cpus.def (ISA_2_6_MASKS_SERVER): Default to -mvsx-timode for power7/power8. (power7 cpu): Likewise. (power8 cpu): Likewise. * config/rs6000/rs6000.c (rs6000_hard_regno_nregs_internal): Make sure that TFmode/TDmode take up two registers if they are ever allowed in the upper VSX registers. (rs6000_hard_regno_mode_ok): If -mvsx-timode, allow TImode in VSX registers. (rs6000_init_hard_regno_mode_ok): Likewise. (rs6000_debug_reg_global): Add debugging for PTImode and wt constraint. Print if LRA is turned on. (rs6000_option_override_internal): Give an error if -mvsx-timode and VSX is not enabled. (invalid_e500_subreg): Handle PTImode, restricting it to GPRs. If -mvsx-timode, restrict TImode to reg+reg addressing, and PTImode to reg+offset addressing. Use PTImode when checking offset addresses for validity. (reg_offset_addressing_ok_p): Likewise. (rs6000_legitimate_offset_address_p): Likewise. (rs6000_legitimize_address): Likewise. (rs6000_legitimize_reload_address): Likewise. (rs6000_legitimate_address_p): Likewise. (rs6000_eliminate_indexed_memrefs): Likewise. (rs6000_emit_move): Likewise. (rs6000_secondary_reload): Likewise. (rs6000_secondary_reload_inner): Handle PTImode. Allow 64-bit reloads to fpr registers to continue to use reg+offset addressing, but 64-bit reloads to altivec registers need reg+reg addressing. Drop test for PRE_MODIFY, since VSX loads/stores no longer support it. Treat LO_SUM like a PLUS operation. (rs6000_secondary_reload_class): If type is 64-bit, prefer to use FLOAT_REGS instead of VSX_RGS to allow use of reg+offset addressing. (rs6000_cannot_change_mode_class): Do not allow TImode in VSX registers to share a register with a smaller sized type, since VSX puts scalars in the upper 64-bits. (print_operand): Add support for PTImode. (rs6000_register_move_cost): Use VECTOR_MEM_VSX_P instead of VECTOR_UNIT_VSX_P to catch types that can be loaded in VSX registers, but don't have arithmetic support. (rs6000_memory_move_cost): Add test for VSX. (rs6000_opt_masks): Add -mvsx-timode. * config/rs6000/vsx.md (VSm): Change to use 64-bit aligned moves for TImode. (VSs): Likewise. (VSr): Use wt constraint for TImode. (VSv): Drop TImode support. (vsx_movti): Delete, replace with versions for 32-bit and 64-bit. (vsx_movti_64bit): Likewise. (vsx_movti_32bit): Likewise. (vec_store_<mode>): Use VSX iterator instead of vector iterator. (vsx_and<mode>3): Delete use of '?' constraint on inputs, just put one '?' on the appropriate output constraint. Do not allow TImode logical operations on 32-bit systems. (vsx_ior<mode>3): Likewise. (vsx_xor<mode>3): Likewise. (vsx_one_cmpl<mode>2): Likewise. (vsx_nor<mode>3): Likewise. (vsx_andc<mode>3): Likewise. (vsx_concat_<mode>): Likewise. (vsx_xxpermdi_<mode>): Fix thinko for non V2DF/V2DI modes. * config/rs6000/rs6000.h (MASK_VSX_TIMODE): Map from OPTION_MASK_VSX_TIMODE. (enum rs6000_reg_class_enum): Add RS6000_CONSTRAINT_wt. (STACK_SAVEAREA_MODE): Use PTImode instead of TImode. * config/rs6000/rs6000.md (INT mode attribute): Add PTImode. (TI2 iterator): New iterator for TImode, PTImode. (wd mode attribute): Add values for vector types. (movti_string): Replace TI move operations with operations for TImode and PTImode. Add support for TImode being allowed in VSX registers. (mov<mode>_string, TImode/PTImode): Likewise. (movti_ppc64): Likewise. (mov<mode>_ppc64, TImode/PTImode): Likewise. (TI mode splitters): Likewise. * doc/md.texi (PowerPC and IBM RS6000 constraints): Document wt constraint. [gcc/testsuite] 2013-03-20 Michael Meissner <meissner@linux.vnet.ibm.com> * gcc.target/powerpc/mmfpgpr.c: New test. * gcc.target/powerpc/sd-vsx.c: Likewise. * gcc.target/powerpc/sd-pwr6.c: Likewise. * gcc.target/powerpc/vsx-float0.c: Likewise. From-SVN: r196831
c6d5ff83 · Michael Meissner · 1fc5eced · c6d5ff83 · c6d5ff83 · c6d5ff83
Commit c6d5ff83 authored Mar 20, 2013 by Michael Meissner
18 changed files
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
--- a/gcc/config/rs6000/constraints.md
+++ b/gcc/config/rs6000/constraints.md
@@ -64,10 +64,27 @@
 (define_register_constraint "ws" "rs6000_constraints[RS6000_CONSTRAINT_ws]"
  "@internal")

+;; TImode in VSX registers
+(define_register_constraint "wt" "rs6000_constraints[RS6000_CONSTRAINT_wt]"
+  "@internal")
+
 ;; any VSX register
 (define_register_constraint "wa" "rs6000_constraints[RS6000_CONSTRAINT_wa]"
  "@internal")

+;; Register constraints to simplify move patterns
+(define_register_constraint "wg" "rs6000_constraints[RS6000_CONSTRAINT_wg]"
+  "Floating point register if -mmfpgpr is used, or NO_REGS.")
+
+(define_register_constraint "wl" "rs6000_constraints[RS6000_CONSTRAINT_wl]"
+  "Floating point register if the LFIWAX instruction is enabled or NO_REGS.")
+
+(define_register_constraint "wx" "rs6000_constraints[RS6000_CONSTRAINT_wx]"
+  "Floating point register if the STFIWX instruction is enabled or NO_REGS.")
+
+(define_register_constraint "wz" "rs6000_constraints[RS6000_CONSTRAINT_wz]"
+  "Floating point register if the LFIWZX instruction is enabled or NO_REGS.")
+
 ;; Altivec style load/store that ignores the bottom bits of the address
 (define_memory_constraint "wZ"
  "Indexed or indirect memory operand, ignoring the bottom 4 bits"

--- a/gcc/config/rs6000/dfp.md
+++ b/gcc/config/rs6000/dfp.md
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -329,6 +329,11 @@
      && mode != DImode)
    return 1;

+  /* The constant 0.0 is easy under VSX.  */
+  if ((mode == SFmode || mode == DFmode || mode == SDmode || mode == DDmode)
+      && VECTOR_UNIT_VSX_P (DFmode) && op == CONST0_RTX (mode))
+    return 1;
+
  if (DECIMAL_FLOAT_MODE_P (mode))
    return 0;

@@ -552,6 +557,28 @@
 			&& REG_P (XEXP (op, 1)))")
       (match_operand 0 "address_operand")))

+;; Return 1 if the operand is an index-form address.
+(define_special_predicate "indexed_address"
+  (match_test "(GET_CODE (op) == PLUS
+		&& REG_P (XEXP (op, 0))
+		&& REG_P (XEXP (op, 1)))"))
+
+;; Return 1 if the operand is a MEM with an update-form address. This may
+;; also include update-indexed form.
+(define_special_predicate "update_address_mem"
+  (match_test "(MEM_P (op)
+		&& (GET_CODE (XEXP (op, 0)) == PRE_INC
+		    || GET_CODE (XEXP (op, 0)) == PRE_DEC
+		    || GET_CODE (XEXP (op, 0)) == PRE_MODIFY))"))
+
+;; Return 1 if the operand is a MEM with an update-indexed-form address. Note
+;; that PRE_INC/PRE_DEC will always be non-indexed (i.e. non X-form) since the
+;; increment is based on the mode size and will therefor always be a const.
+(define_special_predicate "update_indexed_address_mem"
+  (match_test "(MEM_P (op)
+		&& GET_CODE (XEXP (op, 0)) == PRE_MODIFY
+		&& indexed_address (XEXP (XEXP (op, 0), 1), mode))"))
+
 ;; Used for the destination of the fix_truncdfsi2 expander.
 ;; If stfiwx will be used, the result goes to memory; otherwise,
 ;; we're going to emit a store and a load of a subreg, so the dest is a

--- a/gcc/config/rs6000/rs6000-cpus.def
+++ b/gcc/config/rs6000/rs6000-cpus.def
@@ -42,7 +42,8 @@
 #define ISA_2_6_MASKS_SERVER	(ISA_2_5_MASKS_SERVER			\
 				 | OPTION_MASK_POPCNTD			\
 				 | OPTION_MASK_ALTIVEC			\
-				 | OPTION_MASK_VSX)
+				 | OPTION_MASK_VSX			\
+				 | OPTION_MASK_VSX_TIMODE)

 #define POWERPC_7400_MASK	(OPTION_MASK_PPC_GFXOPT | OPTION_MASK_ALTIVEC)

@@ -76,7 +77,8 @@
 				 | OPTION_MASK_RECIP_PRECISION		\
 				 | OPTION_MASK_SOFT_FLOAT		\
 				 | OPTION_MASK_STRICT_ALIGN_OPTIONAL	\
-				 | OPTION_MASK_VSX)
+				 | OPTION_MASK_VSX			\
+				 | OPTION_MASK_VSX_TIMODE)

 #endif

@@ -165,11 +167,11 @@ RS6000_CPU ("power6x", PROCESSOR_POWER6, MASK_POWERPC64 | MASK_PPC_GPOPT
 RS6000_CPU ("power7", PROCESSOR_POWER7,   /* Don't add MASK_ISEL by default */
 	    POWERPC_7400_MASK | MASK_POWERPC64 | MASK_PPC_GPOPT | MASK_MFCRF
 	    | MASK_POPCNTB | MASK_FPRND | MASK_CMPB | MASK_DFP | MASK_POPCNTD
-	    | MASK_VSX | MASK_RECIP_PRECISION)
+	    | MASK_VSX | MASK_RECIP_PRECISION | MASK_VSX_TIMODE)
 RS6000_CPU ("power8", PROCESSOR_POWER7,   /* Don't add MASK_ISEL by default */
 	    POWERPC_7400_MASK | MASK_POWERPC64 | MASK_PPC_GPOPT | MASK_MFCRF
 	    | MASK_POPCNTB | MASK_FPRND | MASK_CMPB | MASK_DFP | MASK_POPCNTD
-	    | MASK_VSX | MASK_RECIP_PRECISION)
+	    | MASK_VSX | MASK_RECIP_PRECISION | MASK_VSX_TIMODE)
 RS6000_CPU ("powerpc", PROCESSOR_POWERPC, 0)
 RS6000_CPU ("powerpc64", PROCESSOR_POWERPC64, MASK_PPC_GFXOPT | MASK_POWERPC64)
 RS6000_CPU ("rs64", PROCESSOR_RS64A, MASK_PPC_GFXOPT | MASK_POWERPC64)
--- a/gcc/config/rs6000/rs6000-modes.def
+++ b/gcc/config/rs6000/rs6000-modes.def
@@ -41,3 +41,6 @@ VECTOR_MODE (INT, DI, 1);
 VECTOR_MODES (FLOAT, 8);      /*             V4HF V2SF */
 VECTOR_MODES (FLOAT, 16);     /*       V8HF  V4SF V2DF */
 VECTOR_MODES (FLOAT, 32);     /*       V16HF V8SF V4DF */
+
+/* Replacement for TImode that only is allowed in GPRs.  */
+PARTIAL_INT_MODE (TI);
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -479,6 +479,11 @@ extern int rs6000_vector_align[];
 #define TARGET_FCTIDUZ	TARGET_POPCNTD
 #define TARGET_FCTIWUZ	TARGET_POPCNTD

+/* Power7 has both 32-bit load and store integer for the FPRs, so we don't need
+   to allocate the SDmode stack slot to get the value into the proper location
+   in the register.  */
+#define TARGET_NO_SDMODE_STACK	(TARGET_LFIWZX && TARGET_STFIWX && TARGET_DFP)
+
 /* In switching from using target_flags to using rs6000_isa_flags, the options
   machinery creates OPTION_MASK_<xxx> instead of MASK_<xxx>.  For now map
   OPTION_MASK_<xxx> back into MASK_<xxx>.  */
@@ -505,6 +510,7 @@ extern int rs6000_vector_align[];
 #define MASK_STRING			OPTION_MASK_STRING
 #define MASK_UPDATE			OPTION_MASK_UPDATE
 #define MASK_VSX			OPTION_MASK_VSX
+#define MASK_VSX_TIMODE			OPTION_MASK_VSX_TIMODE

 #ifndef IN_LIBGCC2
 #define MASK_POWERPC64			OPTION_MASK_POWERPC64
@@ -1325,8 +1331,13 @@ enum r6000_reg_class_enum {
  RS6000_CONSTRAINT_v,		/* Altivec registers */
  RS6000_CONSTRAINT_wa,		/* Any VSX register */
  RS6000_CONSTRAINT_wd,		/* VSX register for V2DF */
+  RS6000_CONSTRAINT_wg,		/* FPR register for -mmfpgpr */
  RS6000_CONSTRAINT_wf,		/* VSX register for V4SF */
+  RS6000_CONSTRAINT_wl,		/* FPR register for LFIWAX */
  RS6000_CONSTRAINT_ws,		/* VSX register for DF */
+  RS6000_CONSTRAINT_wt,		/* VSX register for TImode */
+  RS6000_CONSTRAINT_wx,		/* FPR register for STFIWX */
+  RS6000_CONSTRAINT_wz,		/* FPR register for LFIWZX */
  RS6000_CONSTRAINT_MAX
 };

@@ -1511,7 +1522,7 @@ extern enum reg_class rs6000_constraints[RS6000_CONSTRAINT_MAX];
   NONLOCAL needs twice Pmode to maintain both backchain and SP.  */
 #define STACK_SAVEAREA_MODE(LEVEL)	\
  (LEVEL == SAVE_FUNCTION ? VOIDmode	\
-  : LEVEL == SAVE_NONLOCAL ? (TARGET_32BIT ? DImode : TImode) : Pmode)
+  : LEVEL == SAVE_NONLOCAL ? (TARGET_32BIT ? DImode : PTImode) : Pmode)

 /* Minimum and maximum general purpose registers used to hold arguments.  */
 #define GP_ARG_MIN_REG 3

--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
--- a/gcc/config/rs6000/rs6000.opt
+++ b/gcc/config/rs6000/rs6000.opt
@@ -514,3 +514,7 @@ Use/do not use r11 to hold the static link in calls to functions via pointers.
 msave-toc-indirect
 Target Report Var(TARGET_SAVE_TOC_INDIRECT) Save
 Control whether we save the TOC in the prologue for indirect calls or generate the save inline
+
+mvsx-timode
+Target Undocumented Mask(VSX_TIMODE) Var(rs6000_isa_flags)
+; Allow/disallow TImode in VSX registers
--- a/gcc/config/rs6000/vector.md
+++ b/gcc/config/rs6000/vector.md
@@ -54,7 +54,7 @@
 (define_mode_iterator VEC_64 [V2DI V2DF])

 ;; Vector reload iterator
-(define_mode_iterator VEC_R [V16QI V8HI V4SI V2DI V4SF V2DF DF TI])
+(define_mode_iterator VEC_R [V16QI V8HI V4SI V2DI V4SF V2DF SF SD SI DF DD DI TI])

 ;; Base type from vector mode
 (define_mode_attr VEC_base [(V16QI "QI")
@@ -249,7 +249,7 @@
  [(set (match_operand:VEC_F 0 "vfloat_operand" "")
 	(mult:VEC_F (match_operand:VEC_F 1 "vfloat_operand" "")
 		    (match_operand:VEC_F 2 "vfloat_operand" "")))]
-  "VECTOR_UNIT_VSX_P (<MODE>mode) || VECTOR_UNIT_ALTIVEC_P (<MODE>mode)"
+  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
 {
  if (<MODE>mode == V4SFmode && VECTOR_UNIT_ALTIVEC_P (<MODE>mode))
    {
@@ -395,7 +395,7 @@
 			  (match_operand:VEC_I 5 "vint_operand" "")])
 	 (match_operand:VEC_I 1 "vint_operand" "")
 	 (match_operand:VEC_I 2 "vint_operand" "")))]
-  "VECTOR_UNIT_ALTIVEC_P (<MODE>mode)"
+  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
  "
 {
  if (rs6000_emit_vector_cond_expr (operands[0], operands[1], operands[2],
@@ -451,7 +451,7 @@
 			  (match_operand:VEC_I 5 "vint_operand" "")])
 	 (match_operand:VEC_I 1 "vint_operand" "")
 	 (match_operand:VEC_I 2 "vint_operand" "")))]
-  "VECTOR_UNIT_ALTIVEC_P (<MODE>mode)"
+  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
  "
 {
  if (rs6000_emit_vector_cond_expr (operands[0], operands[1], operands[2],
@@ -505,14 +505,14 @@
  [(set (match_operand:VEC_I 0 "vint_operand" "")
 	(gtu:VEC_I (match_operand:VEC_I 1 "vint_operand" "")
 		   (match_operand:VEC_I 2 "vint_operand" "")))]
-  "VECTOR_UNIT_ALTIVEC_P (<MODE>mode)"
+  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
  "")

 (define_expand "vector_geu<mode>"
  [(set (match_operand:VEC_I 0 "vint_operand" "")
 	(geu:VEC_I (match_operand:VEC_I 1 "vint_operand" "")
 		   (match_operand:VEC_I 2 "vint_operand" "")))]
-  "VECTOR_UNIT_ALTIVEC_P (<MODE>mode)"
+  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
  "")

 (define_insn_and_split "*vector_uneq<mode>"
@@ -709,45 +709,55 @@


 ;; Vector logical instructions
+;; Do not support TImode logical instructions on 32-bit at present, because the
+;; compiler will see that we have a TImode and when it wanted DImode, and
+;; convert the DImode to TImode, store it on the stack, and load it in a VSX
+;; register.
 (define_expand "xor<mode>3"
  [(set (match_operand:VEC_L 0 "vlogical_operand" "")
        (xor:VEC_L (match_operand:VEC_L 1 "vlogical_operand" "")
 		   (match_operand:VEC_L 2 "vlogical_operand" "")))]
-  "VECTOR_MEM_ALTIVEC_OR_VSX_P (<MODE>mode)"
+  "VECTOR_MEM_ALTIVEC_OR_VSX_P (<MODE>mode)
+   && (<MODE>mode != TImode || TARGET_POWERPC64)"
  "")

 (define_expand "ior<mode>3"
  [(set (match_operand:VEC_L 0 "vlogical_operand" "")
        (ior:VEC_L (match_operand:VEC_L 1 "vlogical_operand" "")
 		   (match_operand:VEC_L 2 "vlogical_operand" "")))]
-  "VECTOR_MEM_ALTIVEC_OR_VSX_P (<MODE>mode)"
+  "VECTOR_MEM_ALTIVEC_OR_VSX_P (<MODE>mode)
+   && (<MODE>mode != TImode || TARGET_POWERPC64)"
  "")

 (define_expand "and<mode>3"
  [(set (match_operand:VEC_L 0 "vlogical_operand" "")
        (and:VEC_L (match_operand:VEC_L 1 "vlogical_operand" "")
 		   (match_operand:VEC_L 2 "vlogical_operand" "")))]
-  "VECTOR_MEM_ALTIVEC_OR_VSX_P (<MODE>mode)"
+  "VECTOR_MEM_ALTIVEC_OR_VSX_P (<MODE>mode)
+   && (<MODE>mode != TImode || TARGET_POWERPC64)"
  "")

 (define_expand "one_cmpl<mode>2"
  [(set (match_operand:VEC_L 0 "vlogical_operand" "")
        (not:VEC_L (match_operand:VEC_L 1 "vlogical_operand" "")))]
-  "VECTOR_MEM_ALTIVEC_OR_VSX_P (<MODE>mode)"
+  "VECTOR_MEM_ALTIVEC_OR_VSX_P (<MODE>mode)
+   && (<MODE>mode != TImode || TARGET_POWERPC64)"
  "")

 (define_expand "nor<mode>3"
  [(set (match_operand:VEC_L 0 "vlogical_operand" "")
        (not:VEC_L (ior:VEC_L (match_operand:VEC_L 1 "vlogical_operand" "")
 			      (match_operand:VEC_L 2 "vlogical_operand" ""))))]
-  "VECTOR_MEM_ALTIVEC_OR_VSX_P (<MODE>mode)"
+  "VECTOR_MEM_ALTIVEC_OR_VSX_P (<MODE>mode)
+   && (<MODE>mode != TImode || TARGET_POWERPC64)"
  "")

 (define_expand "andc<mode>3"
  [(set (match_operand:VEC_L 0 "vlogical_operand" "")
        (and:VEC_L (not:VEC_L (match_operand:VEC_L 2 "vlogical_operand" ""))
 		   (match_operand:VEC_L 1 "vlogical_operand" "")))]
-  "VECTOR_MEM_ALTIVEC_OR_VSX_P (<MODE>mode)"
+  "VECTOR_MEM_ALTIVEC_OR_VSX_P (<MODE>mode)
+   && (<MODE>mode != TImode || TARGET_POWERPC64)"
  "")

 ;; Same size conversions

--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -48,7 +48,7 @@
 			(V2DF  "vd2")
 			(V2DI  "vd2")
 			(DF    "d")
-			(TI    "vw4")])
+			(TI    "vd2")])

 ;; Map into the appropriate suffix based on the type
 (define_mode_attr VSs	[(V16QI "sp")
@@ -59,7 +59,7 @@
 			 (V2DI  "dp")
 			 (DF    "dp")
 			 (SF	"sp")
-			 (TI    "sp")])
+			 (TI    "dp")])

 ;; Map the register class used
 (define_mode_attr VSr	[(V16QI "v")
@@ -70,7 +70,7 @@
 			 (V2DF  "wd")
 			 (DF    "ws")
 			 (SF	"d")
-			 (TI    "wd")])
+			 (TI    "wt")])

 ;; Map the register class used for float<->int conversions
 (define_mode_attr VSr2	[(V2DF  "wd")
@@ -115,7 +115,6 @@
 			 (V4SF  "v")
 			 (V2DI  "v")
 			 (V2DF  "v")
-			 (TI    "v")
 			 (DF    "s")])

 ;; Appropriate type for add ops (and other simple FP ops)
@@ -268,12 +267,13 @@
 }
  [(set_attr "type" "vecstore,vecload,vecsimple,vecstore,vecload,vecsimple,*,*,*,vecsimple,vecsimple,*,vecstore,vecload")])

-;; Unlike other VSX moves, allow the GPRs, since a normal use of TImode is for
-;; unions.  However for plain data movement, slightly favor the vector loads
-(define_insn "*vsx_movti"
-  [(set (match_operand:TI 0 "nonimmediate_operand" "=Z,wa,wa,?Y,?r,?r,wa,v,v,wZ")
-	(match_operand:TI 1 "input_operand" "wa,Z,wa,r,Y,r,j,W,wZ,v"))]
-  "VECTOR_MEM_VSX_P (TImode)
+;; Unlike other VSX moves, allow the GPRs even for reloading, since a normal
+;; use of TImode is for unions.  However for plain data movement, slightly
+;; favor the vector loads
+(define_insn "*vsx_movti_64bit"
+  [(set (match_operand:TI 0 "nonimmediate_operand" "=Z,wa,wa,wa,v, v,wZ,?Y,?r,?r")
+	(match_operand:TI 1 "input_operand"        "wa, Z,wa, j,W,wZ, v, r, Y, r"))]
+  "TARGET_POWERPC64 && VECTOR_MEM_VSX_P (TImode)
   && (register_operand (operands[0], TImode) 
       || register_operand (operands[1], TImode))"
 {
@@ -289,27 +289,87 @@
      return "xxlor %x0,%x1,%x1";

    case 3:
+      return "xxlxor %x0,%x0,%x0";
+
    case 4:
+      return output_vec_const_move (operands);
+
    case 5:
-      return "#";
+      return "stvx %1,%y0";

    case 6:
-      return "xxlxor %x0,%x0,%x0";
+      return "lvx %0,%y1";

    case 7:
+    case 8:
+    case 9:
+      return "#";
+
+    default:
+      gcc_unreachable ();
+    }
+}
+  [(set_attr "type" "vecstore,vecload,vecsimple,vecsimple,vecsimple,vecstore,vecload,*,*,*")
+   (set_attr "length" "     4,      4,        4,       4,         8,       4,      4,8,8,8")])
+
+(define_insn "*vsx_movti_32bit"
+  [(set (match_operand:TI 0 "nonimmediate_operand" "=Z,wa,wa,wa,v, v,wZ,Q,Y,????r,????r,????r,r")
+	(match_operand:TI 1 "input_operand"        "wa, Z,wa, j,W,wZ, v,r,r,    Q,    Y,    r,n"))]
+  "! TARGET_POWERPC64 && VECTOR_MEM_VSX_P (TImode)
+   && (register_operand (operands[0], TImode)
+       || register_operand (operands[1], TImode))"
+{
+  switch (which_alternative)
+    {
+    case 0:
+      return "stxvd2x %x1,%y0";
+
+    case 1:
+      return "lxvd2x %x0,%y1";
+
+    case 2:
+      return "xxlor %x0,%x1,%x1";
+
+    case 3:
+      return "xxlxor %x0,%x0,%x0";
+
+    case 4:
      return output_vec_const_move (operands);

-    case 8:
+    case 5:
      return "stvx %1,%y0";

-    case 9:
+    case 6:
      return "lvx %0,%y1";

+    case 7:
+      if (TARGET_STRING)
+        return \"stswi %1,%P0,16\";
+
+    case 8:
+      return \"#\";
+
+    case 9:
+      /* If the address is not used in the output, we can use lsi.  Otherwise,
+	 fall through to generating four loads.  */
+      if (TARGET_STRING
+          && ! reg_overlap_mentioned_p (operands[0], operands[1]))
+	return \"lswi %0,%P1,16\";
+      /* ... fall through ...  */
+
+    case 10:
+    case 11:
+    case 12:
+      return \"#\";
    default:
      gcc_unreachable ();
    }
 }
-  [(set_attr "type" "vecstore,vecload,vecsimple,*,*,*,vecsimple,*,vecstore,vecload")])
+  [(set_attr "type" "vecstore,vecload,vecsimple,vecsimple,vecsimple,vecstore,vecload,store_ux,store_ux,load_ux,load_ux, *, *")
+   (set_attr "length" "     4,      4,        4,       4,         8,       4,      4,      16,      16,     16,     16,16,16")
+   (set (attr "cell_micro") (if_then_else (match_test "TARGET_STRING")
+   			                  (const_string "always")
+					  (const_string "conditional")))])

 ;; Explicit  load/store expanders for the builtin functions
 (define_expand "vsx_load_<mode>"
@@ -319,8 +379,8 @@
  "")

 (define_expand "vsx_store_<mode>"
-  [(set (match_operand:VEC_M 0 "memory_operand" "")
-	(match_operand:VEC_M 1 "vsx_register_operand" ""))]
+  [(set (match_operand:VSX_M 0 "memory_operand" "")
+	(match_operand:VSX_M 1 "vsx_register_operand" ""))]
  "VECTOR_MEM_VSX_P (<MODE>mode)"
  "")

@@ -1026,38 +1086,46 @@
   (set_attr "fp_type" "<VSfptype_simple>")])


-;; Logical and permute operations
+;; Logical operations
+;; Do not support TImode logical instructions on 32-bit at present, because the
+;; compiler will see that we have a TImode and when it wanted DImode, and
+;; convert the DImode to TImode, store it on the stack, and load it in a VSX
+;; register.
 (define_insn "*vsx_and<mode>3"
  [(set (match_operand:VSX_L 0 "vsx_register_operand" "=<VSr>,?wa")
        (and:VSX_L
-	 (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,?wa")
-	 (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,?wa")))]
-  "VECTOR_MEM_VSX_P (<MODE>mode)"
+	 (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,wa")
+	 (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,wa")))]
+  "VECTOR_MEM_VSX_P (<MODE>mode)
+   && (<MODE>mode != TImode || TARGET_POWERPC64)"
  "xxland %x0,%x1,%x2"
  [(set_attr "type" "vecsimple")])

 (define_insn "*vsx_ior<mode>3"
  [(set (match_operand:VSX_L 0 "vsx_register_operand" "=<VSr>,?wa")
-        (ior:VSX_L (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,?wa")
-		   (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,?wa")))]
-  "VECTOR_MEM_VSX_P (<MODE>mode)"
+        (ior:VSX_L (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,wa")
+		   (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,wa")))]
+  "VECTOR_MEM_VSX_P (<MODE>mode)
+   && (<MODE>mode != TImode || TARGET_POWERPC64)"
  "xxlor %x0,%x1,%x2"
  [(set_attr "type" "vecsimple")])

 (define_insn "*vsx_xor<mode>3"
  [(set (match_operand:VSX_L 0 "vsx_register_operand" "=<VSr>,?wa")
        (xor:VSX_L
-	 (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,?wa")
-	 (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,?wa")))]
-  "VECTOR_MEM_VSX_P (<MODE>mode)"
+	 (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,wa")
+	 (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,wa")))]
+  "VECTOR_MEM_VSX_P (<MODE>mode)
+   && (<MODE>mode != TImode || TARGET_POWERPC64)"
  "xxlxor %x0,%x1,%x2"
  [(set_attr "type" "vecsimple")])

 (define_insn "*vsx_one_cmpl<mode>2"
  [(set (match_operand:VSX_L 0 "vsx_register_operand" "=<VSr>,?wa")
        (not:VSX_L
-	 (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,?wa")))]
-  "VECTOR_MEM_VSX_P (<MODE>mode)"
+	 (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,wa")))]
+  "VECTOR_MEM_VSX_P (<MODE>mode)
+   && (<MODE>mode != TImode || TARGET_POWERPC64)"
  "xxlnor %x0,%x1,%x1"
  [(set_attr "type" "vecsimple")])
  
@@ -1067,7 +1135,8 @@
 	 (ior:VSX_L
 	  (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,?wa")
 	  (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,?wa"))))]
-  "VECTOR_MEM_VSX_P (<MODE>mode)"
+  "VECTOR_MEM_VSX_P (<MODE>mode)
+   && (<MODE>mode != TImode || TARGET_POWERPC64)"
  "xxlnor %x0,%x1,%x2"
  [(set_attr "type" "vecsimple")])

@@ -1077,7 +1146,8 @@
 	 (not:VSX_L
 	  (match_operand:VSX_L 2 "vsx_register_operand" "<VSr>,?wa"))
 	 (match_operand:VSX_L 1 "vsx_register_operand" "<VSr>,?wa")))]
-  "VECTOR_MEM_VSX_P (<MODE>mode)"
+  "VECTOR_MEM_VSX_P (<MODE>mode)
+   && (<MODE>mode != TImode || TARGET_POWERPC64)"
  "xxlandc %x0,%x1,%x2"
  [(set_attr "type" "vecsimple")])

@@ -1086,11 +1156,10 @@

 ;; Build a V2DF/V2DI vector from two scalars
 (define_insn "vsx_concat_<mode>"
-  [(set (match_operand:VSX_D 0 "vsx_register_operand" "=wd,?wa")
-	(unspec:VSX_D
-	 [(match_operand:<VS_scalar> 1 "vsx_register_operand" "ws,wa")
-	  (match_operand:<VS_scalar> 2 "vsx_register_operand" "ws,wa")]
-	 UNSPEC_VSX_CONCAT))]
+  [(set (match_operand:VSX_D 0 "vsx_register_operand" "=<VSr>,?wa")
+	(vec_concat:VSX_D
+	 (match_operand:<VS_scalar> 1 "vsx_register_operand" "ws,wa")
+	 (match_operand:<VS_scalar> 2 "vsx_register_operand" "ws,wa")))]
  "VECTOR_MEM_VSX_P (<MODE>mode)"
  "xxpermdi %x0,%x1,%x2,0"
  [(set_attr "type" "vecperm")])
@@ -1148,7 +1217,11 @@
 	 (parallel [(const_int 0)])))]
  "VECTOR_MEM_VSX_P (<MODE>mode) && WORDS_BIG_ENDIAN"
  "lxsd%U1x %x0,%y1"
-  [(set_attr "type" "fpload")
+  [(set (attr "type")
+      (if_then_else
+	(match_test "update_indexed_address_mem (operands[1], VOIDmode)")
+	(const_string "fpload_ux")
+	(const_string "fpload")))
   (set_attr "length" "4")])  

 ;; Extract a SF element from V4SF
@@ -1212,8 +1285,8 @@
      if (<MODE>mode != V2DImode)
 	{
 	  target = gen_lowpart (V2DImode, target);
-	  op0 = gen_lowpart (V2DImode, target);
-	  op1 = gen_lowpart (V2DImode, target);
+	  op0 = gen_lowpart (V2DImode, op0);
+	  op1 = gen_lowpart (V2DImode, op1);
 	}
    }
  emit_insn (gen (target, op0, op1, perm0, perm1));

--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -2075,9 +2075,24 @@ VSX vector register to hold vector double data
 @item wf
 VSX vector register to hold vector float data

+@item wg
+If @option{-mmfpgpr} was used, a floating point register
+
+@item wl
+If the LFIWAX instruction is enabled, a floating point register
+
 @item ws
 VSX vector register to hold scalar float data

+@item wt
+VSX vector register to hold 128 bit integer
+
+@item wx
+If the STFIWX instruction is enabled, a floating point register
+
+@item wz
+If the LFIWZX instruction is enabled, a floating point register
+
 @item wa
 Any VSX register


--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
+2013-03-20  Michael Meissner  <meissner@linux.vnet.ibm.com>
+
+	* gcc.target/powerpc/mmfpgpr.c: New test.
+	* gcc.target/powerpc/sd-vsx.c: Likewise.
+	* gcc.target/powerpc/sd-pwr6.c: Likewise.
+	* gcc.target/powerpc/vsx-float0.c: Likewise.
+
 2013-03-20  Marc Glisse  <marc.glisse@inria.fr>

 	PR tree-optimization/56355

--- a/gcc/testsuite/gcc.target/powerpc/mmfpgpr.c
+++ b/gcc/testsuite/gcc.target/powerpc/mmfpgpr.c
+/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-O2 -mcpu=power6x -mmfpgpr" } */
+/* { dg-final { scan-assembler "mffgpr" } } */
+/* { dg-final { scan-assembler "mftgpr" } } */
+
+/* Test that we generate the instructions to move between the GPR and FPR
+   registers under power6x.  */
+
+extern long return_long (void);
+extern double return_double (void);
+
+double return_double2 (void)
+{
+  return (double) return_long ();
+}
+
+long return_long2 (void)
+{
+  return (long) return_double ();
+}
--- a/gcc/testsuite/gcc.target/powerpc/sd-pwr6.c
+++ b/gcc/testsuite/gcc.target/powerpc/sd-pwr6.c
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-O2 -mcpu=power6 -mhard-dfp" } */
+/* { dg-final { scan-assembler-not   "lfiwzx"   } } */
+/* { dg-final { scan-assembler-times "lfd"    2 } } */
+/* { dg-final { scan-assembler-times "dctdp"  2 } } */
+/* { dg-final { scan-assembler-times "dadd"   1 } } */
+/* { dg-final { scan-assembler-times "drsp"   1 } } */
+
+/* Test that for power6 we need to use a bounce buffer on the stack to load
+   SDmode variables because the power6 does not have a way to directly load
+   32-bit values from memory.  */
+_Decimal32 a;
+
+void inc_dec32 (void)
+{
+  a += (_Decimal32) 1.0;
+}
--- a/gcc/testsuite/gcc.target/powerpc/sd-vsx.c
+++ b/gcc/testsuite/gcc.target/powerpc/sd-vsx.c
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-O2 -mcpu=power7 -mhard-dfp" } */
+/* { dg-final { scan-assembler-times "lfiwzx" 2 } } */
+/* { dg-final { scan-assembler-times "stfiwx" 1 } } */
+/* { dg-final { scan-assembler-not   "lfd"      } } */
+/* { dg-final { scan-assembler-not   "stfd"     } } */
+/* { dg-final { scan-assembler-times "dctdp"  2 } } */
+/* { dg-final { scan-assembler-times "dadd"   1 } } */
+/* { dg-final { scan-assembler-times "drsp"   1 } } */
+
+/* Test that power7 can directly load/store SDmode variables without using a
+   bounce buffer.  */
+_Decimal32 a;
+
+void inc_dec32 (void)
+{
+  a += (_Decimal32) 1.0;
+}
--- a/gcc/testsuite/gcc.target/powerpc/vsx-float0.c
+++ b/gcc/testsuite/gcc.target/powerpc/vsx-float0.c
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-O2 -mcpu=power7" } */
+/* { dg-final { scan-assembler "xxlxor" } } */
+
+/* Test that we generate xxlor to clear a SFmode register.  */
+
+float sum (float *p, unsigned long n)
+{
+  float sum = 0.0f;	/* generate xxlxor instead of load */
+  while (n-- > 0)
+    sum += *p++;
+
+  return sum;
+}