Commit b6837b94 by Joey Ye Committed by H.J. Lu

Atom pipeline model, tuning and insn selection.

2009-04-06  Joey Ye  <joey.ye@intel.com>
	    Xuepeng Guo <xuepeng.guo@intel.com>
	    H.J. Lu  <hongjiu.lu@intel.com>

	Atom pipeline model, tuning and insn selection.
	* config.gcc (atom): Add atom config options and target.

	* config/i386/atom.md: New.

	* config/i386/i386.c (atom_cost): New cost.
	(m_ATOM): New macro flag.
	(initial_ix86_tune_features): Set m_ATOM.
	(x86_accumulate_outgoing_args): Likewise.
	(x86_arch_always_fancy_math_387): Likewise.
	(processor_target): Add Atom cost.
	(cpu_names): Add Atom cpu name.
	(override_options): Set Atom ISA.
	(ix86_issue_rate): New case PROCESSOR_ATOM.
	(ix86_adjust_cost): Likewise.

	* config/i386/i386.h (TARGET_ATOM): New target macro.
	(ix86_tune_indices): Add X86_TUNE_OPT_AGU.
	(TARGET_OPT_AGU): New target option.
	(target_cpu_default): Add TARGET_CPU_DEFAULT_atom.
	(processor_type): Add PROCESSOR_ATOM.

	* config/i386/i386.md (cpu): Add new value "atom".
	(use_carry, movu): New attr.
	(atom.md): Include atom.md.
	(adddi3_carry_rex64): Set attr "use_carry".
	(addqi3_carry): Likewise.
	(addhi3_carry): Likewise.
	(addsi3_carry): Likewise.
	(*addsi3_carry_zext): Likewise.
	(subdi3_carry_rex64): Likewise.
	(subqi3_carry): Likewise.
	(subhi3_carry): Likewise.
	(subsi3_carry): Likewise.
	(x86_movdicc_0_m1_rex64): Likewise.
	(*x86_movdicc_0_m1_se): Likewise.
	(x86_movsicc_0_m1): Likewise.
	(*x86_movsicc_0_m1_se): Likewise.
	(*adddi_1_rex64): Emit add insn as much as possible.
	(*addsi_1): Likewise.
	(return_internal): Set atom_unit.
	(return_internal_long): Likewise.
	(return_pop_internal): Likewise.
	(*rcpsf2_sse): Set atom_sse_attr attr.
	(*sqrt<mode>2_sse): Likewise.
	(*prefetch_sse): Likewise.

	* config/i386/i386-c.c (ix86_target_macros_internal): New case
	PROCESSOR_ATOM.
	(ix86_target_macros_internal): Likewise.

	* config/i386/sse.md (cpu): Set attr "atom_sse_attr".
	(*prefetch_sse_rex): Likewise.
	(sse_rcpv4sf2): Likewise.
	(sse_vmrcpv4sf2): Likewise.
	(sse_sqrtv4sf2): Likewise.
	(<sse>_vmsqrt<mode>2): Likewise.
	(sse_ldmxcsr): Likewise.
	(sse_stmxcsr): Likewise.
	(*sse_sfence): Likewise.
	(sse2_clflush): Likewise.
	(*sse2_mfence): Likewise.
	(*sse2_lfence): Likewise.
	(avx_movup<avxmodesuffixf2c><avxmodesuffix>): Set attr "movu".
	(<sse>_movup<ssemodesuffixf2c>): Likewise.
	(avx_movdqu<avxmodesuffix>): Likewise.
	(avx_lddqu<avxmodesuffix>): Likewise.
	(sse2_movntv2di): Change attr "type" to "ssemov".
	(sse2_movntsi): Likewise.
	(rsqrtv8sf2): Change attr "type" to "sseadd".
	(sse3_addsubv2df3): Set attr "atom_unit".
	(sse3_h<plusminus_insn>v4sf3): Likewise.
	(*sse2_pmaddwd): Likewise.
	(*vec_extractv2di_1_rex64): Likewise.
	(*vec_extractv2di_1_avx): Likewise.
	(sse2_psadbw): Likewise.
	(ssse3_phaddwv8hi3): Likewise.
	(ssse3_phaddwv4hi3): Likewise.
	(ssse3_phadddv4si3): Likewise.
	(ssse3_phadddv2si3): Likewise.
	(ssse3_phaddswv8hi3): Likewise.
	(ssse3_phaddswv4hi3): Likewise.
	(ssse3_phsubwv8hi3): Likewise.
	(ssse3_phsubwv4hi3): Likewise.
	(ssse3_phsubdv4si3): Likewise.
	(ssse3_phsubdv2si3): Likewise.
	(ssse3_phsubswv8hi3): Likewise.
	(ssse3_phsubswv4hi3): Likewise.
	(ssse3_pmaddubsw128): Likewise.
	(sse3_pmaddubsw): Likewise.
	(ssse3_palignrti): Likewise.
	(ssse3_palignrdi): Likewise.

Co-Authored-By: H.J. Lu <hongjiu.lu@intel.com>
Co-Authored-By: Xuepeng Guo <xuepeng.guo@intel.com>

From-SVN: r145624
parent 6d63ea75
2009-04-06 Gerald Pfeifer <gerald@pfeifer.com>
* doc/install.texi (Specific): Fix two cross-references to MinGW.
......
@@ -1074,7 +1074,7 @@ i[34567]86-*-linux* | i[34567]86-*-kfreebsd*-gnu | i[34567]86-*-knetbsd*-gnu | i
 tmake_file="${tmake_file} i386/t-linux64"
 need_64bit_hwint=yes
 case X"${with_cpu}" in
-Xgeneric|Xcore2|Xnocona|Xx86-64|Xamdfam10|Xbarcelona|Xk8|Xopteron|Xathlon64|Xathlon-fx)
+Xgeneric|Xatom|Xcore2|Xnocona|Xx86-64|Xamdfam10|Xbarcelona|Xk8|Xopteron|Xathlon64|Xathlon-fx)
 ;;
 X)
 if test x$with_cpu_64 = x; then
@@ -1083,7 +1083,7 @@ i[34567]86-*-linux* | i[34567]86-*-kfreebsd*-gnu | i[34567]86-*-knetbsd*-gnu | i
 ;;
 *)
 echo "Unsupported CPU used in --with-cpu=$with_cpu, supported values:" 1>&2
-echo "generic core2 nocona x86-64 amdfam10 barcelona k8 opteron athlon64 athlon-fx" 1>&2
+echo "generic atom core2 nocona x86-64 amdfam10 barcelona k8 opteron athlon64 athlon-fx" 1>&2
 exit 1
 ;;
 esac
@@ -1189,7 +1189,7 @@ i[34567]86-*-solaris2*)
 need_64bit_hwint=yes
 use_gcc_stdint=wrap
 case X"${with_cpu}" in
-Xgeneric|Xcore2|Xnocona|Xx86-64|Xamdfam10|Xbarcelona|Xk8|Xopteron|Xathlon64|Xathlon-fx)
+Xgeneric|Xatom|Xcore2|Xnocona|Xx86-64|Xamdfam10|Xbarcelona|Xk8|Xopteron|Xathlon64|Xathlon-fx)
 ;;
 X)
 if test x$with_cpu_64 = x; then
@@ -1198,7 +1198,7 @@ i[34567]86-*-solaris2*)
 ;;
 *)
 echo "Unsupported CPU used in --with-cpu=$with_cpu, supported values:" 1>&2
-echo "generic core2 nocona x86-64 amdfam10 barcelona k8 opteron athlon64 athlon-fx" 1>&2
+echo "generic atom core2 nocona x86-64 amdfam10 barcelona k8 opteron athlon64 athlon-fx" 1>&2
 exit 1
 ;;
 esac
@@ -2801,7 +2801,7 @@ case "${target}" in
 esac
 # OK
 ;;
-"" | amdfam10 | barcelona | k8 | opteron | athlon64 | athlon-fx | nocona | core2 | generic)
+"" | amdfam10 | barcelona | k8 | opteron | athlon64 | athlon-fx | nocona | core2 | atom | generic)
 # OK
 ;;
 *)
......
;; Atom Scheduling
;; Copyright (C) 2009 Free Software Foundation, Inc.
;;
;; This file is part of GCC.
;;
;; GCC is free software; you can redistribute it and/or modify
;; it under the terms of the GNU General Public License as published by
;; the Free Software Foundation; either version 3, or (at your option)
;; any later version.
;;
;; GCC is distributed in the hope that it will be useful,
;; but WITHOUT ANY WARRANTY; without even the implied warranty of
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
;; GNU General Public License for more details.
;;
;; You should have received a copy of the GNU General Public License
;; along with GCC; see the file COPYING3. If not see
;; <http://www.gnu.org/licenses/>.
;;
;; Atom is an in-order core with two integer pipelines.
(define_attr "atom_unit" "sishuf,simul,jeu,complex,other"
(const_string "other"))
(define_attr "atom_sse_attr" "rcp,movdup,lfence,fence,prefetch,sqrt,mxcsr,other"
(const_string "other"))
(define_automaton "atom")
;; Atom has two ports: port 0 and port 1 connecting to all execution units
(define_cpu_unit "atom-port-0,atom-port-1" "atom")
;; EU: Execution Unit
;; Atom EUs are connected by port 0 or port 1.
(define_cpu_unit "atom-eu-0, atom-eu-1,
atom-imul-1, atom-imul-2, atom-imul-3, atom-imul-4"
"atom")
;; Some EUs have duplicated copies and can be accessed via either
;; port 0 or port 1.
;; (define_reservation "atom-port-either" "(atom-port-0 | atom-port-1)")
;;; Some instructions are dual-pipe and need both ports.
;;; Complex multi-op macro-instructions need both ports and all EUs.
(define_reservation "atom-port-dual" "(atom-port-0 + atom-port-1)")
(define_reservation "atom-all-eu" "(atom-eu-0 + atom-eu-1 +
atom-imul-1 + atom-imul-2 + atom-imul-3 +
atom-imul-4)")
;;; Most simple instructions have 1 cycle latency. Some of them
;;; issue in port 0, some in port 1 and some in either port.
(define_reservation "atom-simple-0" "(atom-port-0 + atom-eu-0)")
(define_reservation "atom-simple-1" "(atom-port-1 + atom-eu-1)")
(define_reservation "atom-simple-either" "(atom-simple-0 | atom-simple-1)")
;;; Some insns issue in port 0 with 3 cycles latency and 1 cycle throughput
(define_reservation "atom-eu-0-3-1" "(atom-port-0 + atom-eu-0, nothing*2)")
;;; fmul insn can have 4 or 5 cycles latency
(define_reservation "atom-fmul-5c" "(atom-port-0 + atom-eu-0), nothing*4")
(define_reservation "atom-fmul-4c" "(atom-port-0 + atom-eu-0), nothing*3")
;;; fadd can have 5 cycles latency, depending on the instruction form
(define_reservation "atom-fadd-5c" "(atom-port-1 + atom-eu-1), nothing*5")
;;; imul insn has 5 cycles latency
(define_reservation "atom-imul-32"
"atom-imul-1, atom-imul-2, atom-imul-3, atom-imul-4,
atom-port-0")
;;; imul instruction excludes other non-FP instructions.
(exclusion_set "atom-eu-0, atom-eu-1"
"atom-imul-1, atom-imul-2, atom-imul-3, atom-imul-4")
;;; Dual-execution instructions can have 1, 2, 4 or 5 cycles latency,
;;; depending on the instruction form.
(define_reservation "atom-dual-1c" "(atom-port-dual + atom-eu-0 + atom-eu-1)")
(define_reservation "atom-dual-2c"
"(atom-port-dual + atom-eu-0 + atom-eu-1, nothing)")
(define_reservation "atom-dual-5c"
"(atom-port-dual + atom-eu-0 + atom-eu-1, nothing*4)")
;;; Complex macro-instructions have varying latencies and use both ports.
(define_reservation "atom-complex" "(atom-port-dual + atom-all-eu)")
(define_insn_reservation "atom_other" 9
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "other")
(eq_attr "atom_unit" "!jeu")))
"atom-complex, atom-all-eu*8")
;; return has type "other" with atom_unit "jeu"
(define_insn_reservation "atom_other_2" 1
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "other")
(eq_attr "atom_unit" "jeu")))
"atom-dual-1c")
(define_insn_reservation "atom_multi" 9
(and (eq_attr "cpu" "atom")
(eq_attr "type" "multi"))
"atom-complex, atom-all-eu*8")
;; Normal alu insns without carry
(define_insn_reservation "atom_alu" 1
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "alu")
(and (eq_attr "memory" "none")
(eq_attr "use_carry" "0"))))
"atom-simple-either")
;; Normal alu insns without carry, with a memory operand
(define_insn_reservation "atom_alu_mem" 1
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "alu")
(and (eq_attr "memory" "!none")
(eq_attr "use_carry" "0"))))
"atom-simple-either")
;; Alu insn consuming CF, such as adc/sbb
(define_insn_reservation "atom_alu_carry" 1
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "alu")
(and (eq_attr "memory" "none")
(eq_attr "use_carry" "1"))))
"atom-simple-either")
;; Alu insn consuming CF, such as adc/sbb, with a memory operand
(define_insn_reservation "atom_alu_carry_mem" 1
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "alu")
(and (eq_attr "memory" "!none")
(eq_attr "use_carry" "1"))))
"atom-simple-either")
(define_insn_reservation "atom_alu1" 1
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "alu1")
(eq_attr "memory" "none")))
"atom-simple-either")
(define_insn_reservation "atom_alu1_mem" 1
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "alu1")
(eq_attr "memory" "!none")))
"atom-simple-either")
(define_insn_reservation "atom_negnot" 1
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "negnot")
(eq_attr "memory" "none")))
"atom-simple-either")
(define_insn_reservation "atom_negnot_mem" 1
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "negnot")
(eq_attr "memory" "!none")))
"atom-simple-either")
(define_insn_reservation "atom_imov" 1
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "imov")
(eq_attr "memory" "none")))
"atom-simple-either")
(define_insn_reservation "atom_imov_mem" 1
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "imov")
(eq_attr "memory" "!none")))
"atom-simple-either")
;; 16<-16, 32<-32
(define_insn_reservation "atom_imovx" 1
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "imovx")
(and (eq_attr "memory" "none")
(ior (and (match_operand:HI 0 "register_operand")
(match_operand:HI 1 "general_operand"))
(and (match_operand:SI 0 "register_operand")
(match_operand:SI 1 "general_operand"))))))
"atom-simple-either")
;; 16<-16, 32<-32, mem
(define_insn_reservation "atom_imovx_mem" 1
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "imovx")
(and (eq_attr "memory" "!none")
(ior (and (match_operand:HI 0 "register_operand")
(match_operand:HI 1 "general_operand"))
(and (match_operand:SI 0 "register_operand")
(match_operand:SI 1 "general_operand"))))))
"atom-simple-either")
;; 32<-16, 32<-8, 64<-16, 64<-8, 64<-32, 8<-8
(define_insn_reservation "atom_imovx_2" 1
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "imovx")
(and (eq_attr "memory" "none")
(ior (match_operand:QI 0 "register_operand")
(ior (and (match_operand:SI 0 "register_operand")
(not (match_operand:SI 1 "general_operand")))
(match_operand:DI 0 "register_operand"))))))
"atom-simple-0")
;; 32<-16, 32<-8, 64<-16, 64<-8, 64<-32, 8<-8, mem
(define_insn_reservation "atom_imovx_2_mem" 1
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "imovx")
(and (eq_attr "memory" "!none")
(ior (match_operand:QI 0 "register_operand")
(ior (and (match_operand:SI 0 "register_operand")
(not (match_operand:SI 1 "general_operand")))
(match_operand:DI 0 "register_operand"))))))
"atom-simple-0")
;; 16<-8
(define_insn_reservation "atom_imovx_3" 3
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "imovx")
(and (match_operand:HI 0 "register_operand")
(match_operand:QI 1 "general_operand"))))
"atom-complex, atom-all-eu*2")
(define_insn_reservation "atom_lea" 1
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "lea")
(eq_attr "mode" "!HI")))
"atom-simple-either")
;; lea with a 16-bit address is a complex insn
(define_insn_reservation "atom_lea_2" 2
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "lea")
(eq_attr "mode" "HI")))
"atom-complex, atom-all-eu")
(define_insn_reservation "atom_incdec" 1
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "incdec")
(eq_attr "memory" "none")))
"atom-simple-either")
(define_insn_reservation "atom_incdec_mem" 1
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "incdec")
(eq_attr "memory" "!none")))
"atom-simple-either")
;; Simple shift instructions use the SHIFT EU, no memory operand
(define_insn_reservation "atom_ishift" 1
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "ishift")
(and (eq_attr "memory" "none") (eq_attr "prefix_0f" "0"))))
"atom-simple-0")
;; Simple shift instructions use the SHIFT EU, memory operand
(define_insn_reservation "atom_ishift_mem" 1
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "ishift")
(and (eq_attr "memory" "!none") (eq_attr "prefix_0f" "0"))))
"atom-simple-0")
;; DF shift (prefixed with 0f) is a complex insn with a latency of 7 cycles
(define_insn_reservation "atom_ishift_3" 7
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "ishift")
(eq_attr "prefix_0f" "1")))
"atom-complex, atom-all-eu*6")
(define_insn_reservation "atom_ishift1" 1
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "ishift1")
(eq_attr "memory" "none")))
"atom-simple-0")
(define_insn_reservation "atom_ishift1_mem" 1
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "ishift1")
(eq_attr "memory" "!none")))
"atom-simple-0")
(define_insn_reservation "atom_rotate" 1
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "rotate")
(eq_attr "memory" "none")))
"atom-simple-0")
(define_insn_reservation "atom_rotate_mem" 1
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "rotate")
(eq_attr "memory" "!none")))
"atom-simple-0")
(define_insn_reservation "atom_rotate1" 1
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "rotate1")
(eq_attr "memory" "none")))
"atom-simple-0")
(define_insn_reservation "atom_rotate1_mem" 1
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "rotate1")
(eq_attr "memory" "!none")))
"atom-simple-0")
(define_insn_reservation "atom_imul" 5
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "imul")
(and (eq_attr "memory" "none") (eq_attr "mode" "SI"))))
"atom-imul-32")
(define_insn_reservation "atom_imul_mem" 5
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "imul")
(and (eq_attr "memory" "!none") (eq_attr "mode" "SI"))))
"atom-imul-32")
;; latency set to 10 as common 64x64 imul
(define_insn_reservation "atom_imul_3" 10
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "imul")
(eq_attr "mode" "!SI")))
"atom-complex, atom-all-eu*9")
(define_insn_reservation "atom_idiv" 65
(and (eq_attr "cpu" "atom")
(eq_attr "type" "idiv"))
"atom-complex, atom-all-eu*32, nothing*32")
(define_insn_reservation "atom_icmp" 1
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "icmp")
(eq_attr "memory" "none")))
"atom-simple-either")
(define_insn_reservation "atom_icmp_mem" 1
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "icmp")
(eq_attr "memory" "!none")))
"atom-simple-either")
(define_insn_reservation "atom_test" 1
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "test")
(eq_attr "memory" "none")))
"atom-simple-either")
(define_insn_reservation "atom_test_mem" 1
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "test")
(eq_attr "memory" "!none")))
"atom-simple-either")
(define_insn_reservation "atom_ibr" 1
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "ibr")
(eq_attr "memory" "!load")))
"atom-simple-1")
;; complex if the jump target is loaded from memory
(define_insn_reservation "atom_ibr_2" 2
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "ibr")
(eq_attr "memory" "load")))
"atom-complex, atom-all-eu")
(define_insn_reservation "atom_setcc" 1
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "setcc")
(eq_attr "memory" "!store")))
"atom-simple-either")
;; 2 cycles complex if target is in memory
(define_insn_reservation "atom_setcc_2" 2
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "setcc")
(eq_attr "memory" "store")))
"atom-complex, atom-all-eu")
(define_insn_reservation "atom_icmov" 1
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "icmov")
(eq_attr "memory" "none")))
"atom-simple-either")
(define_insn_reservation "atom_icmov_mem" 1
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "icmov")
(eq_attr "memory" "!none")))
"atom-simple-either")
;; UCODE if segreg, ignored
(define_insn_reservation "atom_push" 2
(and (eq_attr "cpu" "atom")
(eq_attr "type" "push"))
"atom-dual-2c")
;; pop r64 is 1 cycle. UCODE if segreg, ignored
(define_insn_reservation "atom_pop" 1
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "pop")
(eq_attr "mode" "DI")))
"atom-dual-1c")
;; pop non-r64 is 2 cycles. UCODE if segreg, ignored
(define_insn_reservation "atom_pop_2" 2
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "pop")
(eq_attr "mode" "!DI")))
"atom-dual-2c")
;; UCODE if segreg, ignored
(define_insn_reservation "atom_call" 1
(and (eq_attr "cpu" "atom")
(eq_attr "type" "call"))
"atom-dual-1c")
(define_insn_reservation "atom_callv" 1
(and (eq_attr "cpu" "atom")
(eq_attr "type" "callv"))
"atom-dual-1c")
(define_insn_reservation "atom_leave" 3
(and (eq_attr "cpu" "atom")
(eq_attr "type" "leave"))
"atom-complex, atom-all-eu*2")
(define_insn_reservation "atom_str" 3
(and (eq_attr "cpu" "atom")
(eq_attr "type" "str"))
"atom-complex, atom-all-eu*2")
(define_insn_reservation "atom_sselog" 1
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "sselog")
(eq_attr "memory" "none")))
"atom-simple-either")
(define_insn_reservation "atom_sselog_mem" 1
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "sselog")
(eq_attr "memory" "!none")))
"atom-simple-either")
(define_insn_reservation "atom_sselog1" 1
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "sselog1")
(eq_attr "memory" "none")))
"atom-simple-0")
(define_insn_reservation "atom_sselog1_mem" 1
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "sselog1")
(eq_attr "memory" "!none")))
"atom-simple-0")
;; not pmadd, not psad
(define_insn_reservation "atom_sseiadd" 1
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "sseiadd")
(and (not (match_operand:V2DI 0 "register_operand"))
(and (eq_attr "atom_unit" "!simul")
(eq_attr "atom_unit" "!complex")))))
"atom-simple-either")
;; pmadd, psad with 64-bit operands
(define_insn_reservation "atom_sseiadd_2" 4
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "sseiadd")
(and (not (match_operand:V2DI 0 "register_operand"))
(and (eq_attr "atom_unit" "simul" )
(eq_attr "mode" "DI")))))
"atom-fmul-4c")
;; pmadd, psad with 128-bit operands
(define_insn_reservation "atom_sseiadd_3" 5
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "sseiadd")
(and (not (match_operand:V2DI 0 "register_operand"))
(and (eq_attr "atom_unit" "simul" )
(eq_attr "mode" "TI")))))
"atom-fmul-5c")
;; if paddq(64 bit op), phadd/phsub
(define_insn_reservation "atom_sseiadd_4" 6
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "sseiadd")
(ior (match_operand:V2DI 0 "register_operand")
(eq_attr "atom_unit" "complex"))))
"atom-complex, atom-all-eu*5")
;; if immediate op.
(define_insn_reservation "atom_sseishft" 1
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "sseishft")
(and (eq_attr "atom_unit" "!sishuf")
(match_operand 2 "immediate_operand"))))
"atom-simple-either")
;; if palignr or psrldq
(define_insn_reservation "atom_sseishft_2" 1
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "sseishft")
(and (eq_attr "atom_unit" "sishuf")
(match_operand 2 "immediate_operand"))))
"atom-simple-0")
;; if reg/mem op
(define_insn_reservation "atom_sseishft_3" 2
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "sseishft")
(not (match_operand 2 "immediate_operand"))))
"atom-complex, atom-all-eu")
(define_insn_reservation "atom_sseimul" 1
(and (eq_attr "cpu" "atom")
(eq_attr "type" "sseimul"))
"atom-simple-0")
;; rcpss or rsqrtss
(define_insn_reservation "atom_sse" 4
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "sse")
(and (eq_attr "atom_sse_attr" "rcp") (eq_attr "mode" "SF"))))
"atom-fmul-4c")
;; movshdup, movsldup. Suggest to type sseishft
(define_insn_reservation "atom_sse_2" 1
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "sse")
(eq_attr "atom_sse_attr" "movdup")))
"atom-simple-0")
;; lfence
(define_insn_reservation "atom_sse_3" 1
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "sse")
(eq_attr "atom_sse_attr" "lfence")))
"atom-simple-either")
;; sfence,clflush,mfence, prefetch
(define_insn_reservation "atom_sse_4" 1
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "sse")
(ior (eq_attr "atom_sse_attr" "fence")
(eq_attr "atom_sse_attr" "prefetch"))))
"atom-simple-0")
;; rcpps, rsqrtps, sqrt, ldmxcsr
(define_insn_reservation "atom_sse_5" 7
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "sse")
(ior (ior (eq_attr "atom_sse_attr" "sqrt")
(eq_attr "atom_sse_attr" "mxcsr"))
(and (eq_attr "atom_sse_attr" "rcp")
(eq_attr "mode" "V4SF")))))
"atom-complex, atom-all-eu*6")
;; xmm->xmm
(define_insn_reservation "atom_ssemov" 1
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "ssemov")
(and (match_operand 0 "register_operand" "xy") (match_operand 1 "register_operand" "xy"))))
"atom-simple-either")
;; reg->xmm
(define_insn_reservation "atom_ssemov_2" 1
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "ssemov")
(and (match_operand 0 "register_operand" "xy") (match_operand 1 "register_operand" "r"))))
"atom-simple-0")
;; xmm->reg
(define_insn_reservation "atom_ssemov_3" 3
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "ssemov")
(and (match_operand 0 "register_operand" "r") (match_operand 1 "register_operand" "xy"))))
"atom-eu-0-3-1")
;; mov mem
(define_insn_reservation "atom_ssemov_4" 1
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "ssemov")
(and (eq_attr "movu" "0") (eq_attr "memory" "!none"))))
"atom-simple-0")
;; movu mem
(define_insn_reservation "atom_ssemov_5" 2
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "ssemov")
(ior (eq_attr "movu" "1") (eq_attr "memory" "!none"))))
"atom-complex, atom-all-eu")
;; no memory simple
(define_insn_reservation "atom_sseadd" 5
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "sseadd")
(and (eq_attr "memory" "none")
(and (eq_attr "mode" "!V2DF")
(eq_attr "atom_unit" "!complex")))))
"atom-fadd-5c")
;; memory simple
(define_insn_reservation "atom_sseadd_mem" 5
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "sseadd")
(and (eq_attr "memory" "!none")
(and (eq_attr "mode" "!V2DF")
(eq_attr "atom_unit" "!complex")))))
"atom-dual-5c")
;; maxps, minps, *pd, hadd, hsub
(define_insn_reservation "atom_sseadd_3" 8
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "sseadd")
(ior (eq_attr "mode" "V2DF") (eq_attr "atom_unit" "complex"))))
"atom-complex, atom-all-eu*7")
;; Except dppd/dpps
(define_insn_reservation "atom_ssemul" 5
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "ssemul")
(eq_attr "mode" "!SF")))
"atom-fmul-5c")
;; Except dppd/dpps, 4 cycle if mulss
(define_insn_reservation "atom_ssemul_2" 4
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "ssemul")
(eq_attr "mode" "SF")))
"atom-fmul-4c")
(define_insn_reservation "atom_ssecmp" 1
(and (eq_attr "cpu" "atom")
(eq_attr "type" "ssecmp"))
"atom-simple-either")
(define_insn_reservation "atom_ssecomi" 10
(and (eq_attr "cpu" "atom")
(eq_attr "type" "ssecomi"))
"atom-complex, atom-all-eu*9")
;; no memory and cvtpi2ps, cvtps2pi, cvttps2pi
(define_insn_reservation "atom_ssecvt" 5
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "ssecvt")
(ior (and (match_operand:V2SI 0 "register_operand")
(match_operand:V4SF 1 "register_operand"))
(and (match_operand:V4SF 0 "register_operand")
(match_operand:V2SI 1 "register_operand")))))
"atom-fadd-5c")
;; memory and cvtpi2ps, cvtps2pi, cvttps2pi
(define_insn_reservation "atom_ssecvt_2" 5
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "ssecvt")
(ior (and (match_operand:V2SI 0 "register_operand")
(match_operand:V4SF 1 "memory_operand"))
(and (match_operand:V4SF 0 "register_operand")
(match_operand:V2SI 1 "memory_operand")))))
"atom-dual-5c")
;; otherwise. 7 cycles average for cvtss2sd
(define_insn_reservation "atom_ssecvt_3" 7
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "ssecvt")
(not (ior (and (match_operand:V2SI 0 "register_operand")
(match_operand:V4SF 1 "nonimmediate_operand"))
(and (match_operand:V4SF 0 "register_operand")
(match_operand:V2SI 1 "nonimmediate_operand"))))))
"atom-complex, atom-all-eu*6")
;; memory and cvtsi2sd
(define_insn_reservation "atom_sseicvt" 5
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "sseicvt")
(and (match_operand:V2DF 0 "register_operand")
(match_operand:SI 1 "memory_operand"))))
"atom-dual-5c")
;; otherwise. 8 cycles average for cvtsd2si
(define_insn_reservation "atom_sseicvt_2" 8
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "sseicvt")
(not (and (match_operand:V2DF 0 "register_operand")
(match_operand:SI 1 "memory_operand")))))
"atom-complex, atom-all-eu*7")
(define_insn_reservation "atom_ssediv" 62
(and (eq_attr "cpu" "atom")
(eq_attr "type" "ssediv"))
"atom-complex, atom-all-eu*12, nothing*49")
;; simple for fmov
(define_insn_reservation "atom_fmov" 1
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "fmov")
(eq_attr "memory" "none")))
"atom-simple-either")
;; simple for fmov
(define_insn_reservation "atom_fmov_mem" 1
(and (eq_attr "cpu" "atom")
(and (eq_attr "type" "fmov")
(eq_attr "memory" "!none")))
"atom-simple-either")
;; Define bypass here
;; There will be no stall from lea to non-mem EX insns
(define_bypass 0 "atom_lea"
"atom_alu_carry,
atom_alu,atom_alu1,atom_negnot,atom_imov,atom_imovx,
atom_incdec, atom_setcc, atom_icmov, atom_pop")
(define_bypass 0 "atom_lea"
"atom_alu_mem, atom_alu_carry_mem, atom_alu1_mem,
atom_imovx_mem, atom_imovx_2_mem,
atom_imov_mem, atom_icmov_mem, atom_fmov_mem"
"!ix86_agi_dependent")
;; There will be a 3-cycle stall from EX insns to AGEN insns (LEA)
(define_bypass 4 "atom_alu_carry,
atom_alu,atom_alu1,atom_negnot,atom_imov,atom_imovx,
atom_incdec,atom_ishift,atom_ishift1,atom_rotate,
atom_rotate1, atom_setcc, atom_icmov, atom_pop,
atom_alu_mem, atom_alu_carry_mem, atom_alu1_mem,
atom_imovx_mem, atom_imovx_2_mem,
atom_imov_mem, atom_icmov_mem, atom_fmov_mem"
"atom_lea")
;; There will be a 3-cycle stall from EX insns to insns that need address calculation
(define_bypass 4 "atom_alu_carry,
atom_alu,atom_alu1,atom_negnot,atom_imov,atom_imovx,
atom_incdec,atom_ishift,atom_ishift1,atom_rotate,
atom_rotate1, atom_setcc, atom_icmov, atom_pop,
atom_imovx_mem, atom_imovx_2_mem,
atom_alu_mem, atom_alu_carry_mem, atom_alu1_mem,
atom_imov_mem, atom_icmov_mem, atom_fmov_mem"
"atom_alu_mem, atom_alu_carry_mem, atom_alu1_mem,
atom_negnot_mem, atom_imov_mem, atom_incdec_mem,
atom_imovx_mem, atom_imovx_2_mem,
atom_imul_mem, atom_icmp_mem,
atom_test_mem, atom_icmov_mem, atom_sselog_mem,
atom_sselog1_mem, atom_fmov_mem, atom_sseadd_mem,
atom_ishift_mem, atom_ishift1_mem,
atom_rotate_mem, atom_rotate1_mem"
"ix86_agi_dependent")
;; Stall from imul to lea is 8 cycles.
(define_bypass 9 "atom_imul, atom_imul_mem" "atom_lea")
;; Stall from imul to memory address is 8 cycles.
(define_bypass 9 "atom_imul, atom_imul_mem"
"atom_alu_mem, atom_alu_carry_mem, atom_alu1_mem,
atom_negnot_mem, atom_imov_mem, atom_incdec_mem,
atom_ishift_mem, atom_ishift1_mem, atom_rotate_mem,
atom_rotate1_mem, atom_imul_mem, atom_icmp_mem,
atom_test_mem, atom_icmov_mem, atom_sselog_mem,
atom_sselog1_mem, atom_fmov_mem, atom_sseadd_mem"
"ix86_agi_dependent")
;; There will be 0 cycle stall from cmp/test to jcc
;; There will be 1 cycle stall from flag producer to cmov and adc/sbb
(define_bypass 2 "atom_icmp, atom_test, atom_alu, atom_alu_carry,
atom_alu1, atom_negnot, atom_incdec, atom_ishift,
atom_ishift1, atom_rotate, atom_rotate1"
"atom_icmov, atom_alu_carry")
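The comments above quote stall cycles, while define_bypass takes a latency. In the three non-zero cases the latency written in the pattern is the quoted stall plus one (3 -> 4, 8 -> 9, 1 -> 2). A minimal C sketch of that relationship follows. The insn classes and the lookup helper are hypothetical simplifications for illustration, not part of the patch; the real model lists individual reservations and also consults ix86_agi_dependent.

```c
/* Hypothetical, simplified instruction classes (not names from atom.md). */
enum insn_class {
  CLASS_EX,             /* plain integer ALU-style producers */
  CLASS_IMUL,           /* integer multiply */
  CLASS_FLAG_PRODUCER,  /* cmp/test/alu setting flags */
  CLASS_LEA,            /* lea, executed on the address generation unit */
  CLASS_ADDR_USER,      /* insns that use the result in a memory address */
  CLASS_FLAG_CONSUMER   /* cmov, adc/sbb */
};

struct bypass {
  enum insn_class producer;
  enum insn_class consumer;
  int latency;          /* first operand of define_bypass */
  int stall;            /* stall quoted in the accompanying comment */
};

/* Values copied from the define_bypass patterns above. */
static const struct bypass atom_bypasses[] = {
  { CLASS_EX,            CLASS_LEA,           4, 3 },
  { CLASS_EX,            CLASS_ADDR_USER,     4, 3 },
  { CLASS_IMUL,          CLASS_LEA,           9, 8 },
  { CLASS_IMUL,          CLASS_ADDR_USER,     9, 8 },
  { CLASS_FLAG_PRODUCER, CLASS_FLAG_CONSUMER, 2, 1 },
};

/* Return the bypass latency for a producer/consumer pair, or -1 when the
   table has no entry for that pair.  */
int
bypass_latency (enum insn_class producer, enum insn_class consumer)
{
  unsigned i;
  for (i = 0; i < sizeof atom_bypasses / sizeof atom_bypasses[0]; i++)
    if (atom_bypasses[i].producer == producer
        && atom_bypasses[i].consumer == consumer)
      return atom_bypasses[i].latency;
  return -1;
}
```

For example, bypass_latency (CLASS_IMUL, CLASS_LEA) yields 9, matching the imul-to-lea pattern.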
@@ -119,6 +119,10 @@ ix86_target_macros_internal (int isa_flag,
 def_or_undef (parse_in, "__core2");
 def_or_undef (parse_in, "__core2__");
 break;
+case PROCESSOR_ATOM:
+def_or_undef (parse_in, "__atom");
+def_or_undef (parse_in, "__atom__");
+break;
 /* use PROCESSOR_max to not set/unset the arch macro. */
 case PROCESSOR_max:
 break;
@@ -187,6 +191,9 @@ ix86_target_macros_internal (int isa_flag,
 case PROCESSOR_CORE2:
 def_or_undef (parse_in, "__tune_core2__");
 break;
+case PROCESSOR_ATOM:
+def_or_undef (parse_in, "__tune_atom__");
+break;
 case PROCESSOR_GENERIC32:
 case PROCESSOR_GENERIC64:
 break;
......
@@ -1036,6 +1036,79 @@ struct processor_costs core2_cost = {
 1, /* cond_not_taken_branch_cost. */
 };
+static const
+struct processor_costs atom_cost = {
+COSTS_N_INSNS (1), /* cost of an add instruction */
+COSTS_N_INSNS (1) + 1, /* cost of a lea instruction */
+COSTS_N_INSNS (1), /* variable shift costs */
+COSTS_N_INSNS (1), /* constant shift costs */
+{COSTS_N_INSNS (3), /* cost of starting multiply for QI */
+COSTS_N_INSNS (4), /* HI */
+COSTS_N_INSNS (3), /* SI */
+COSTS_N_INSNS (4), /* DI */
+COSTS_N_INSNS (2)}, /* other */
+0, /* cost of multiply per each bit set */
+{COSTS_N_INSNS (18), /* cost of a divide/mod for QI */
+COSTS_N_INSNS (26), /* HI */
+COSTS_N_INSNS (42), /* SI */
+COSTS_N_INSNS (74), /* DI */
+COSTS_N_INSNS (74)}, /* other */
+COSTS_N_INSNS (1), /* cost of movsx */
+COSTS_N_INSNS (1), /* cost of movzx */
+8, /* "large" insn */
+17, /* MOVE_RATIO */
+2, /* cost for loading QImode using movzbl */
+{4, 4, 4}, /* cost of loading integer registers
+in QImode, HImode and SImode.
+Relative to reg-reg move (2). */
+{4, 4, 4}, /* cost of storing integer registers */
+4, /* cost of reg,reg fld/fst */
+{12, 12, 12}, /* cost of loading fp registers
+in SFmode, DFmode and XFmode */
+{6, 6, 8}, /* cost of storing fp registers
+in SFmode, DFmode and XFmode */
+2, /* cost of moving MMX register */
+{8, 8}, /* cost of loading MMX registers
+in SImode and DImode */
+{8, 8}, /* cost of storing MMX registers
+in SImode and DImode */
+2, /* cost of moving SSE register */
+{8, 8, 8}, /* cost of loading SSE registers
+in SImode, DImode and TImode */
+{8, 8, 8}, /* cost of storing SSE registers
+in SImode, DImode and TImode */
+5, /* MMX or SSE register to integer */
+32, /* size of l1 cache. */
+256, /* size of l2 cache. */
+64, /* size of prefetch block */
+6, /* number of parallel prefetches */
+3, /* Branch cost */
+COSTS_N_INSNS (8), /* cost of FADD and FSUB insns. */
+COSTS_N_INSNS (8), /* cost of FMUL instruction. */
+COSTS_N_INSNS (20), /* cost of FDIV instruction. */
+COSTS_N_INSNS (8), /* cost of FABS instruction. */
+COSTS_N_INSNS (8), /* cost of FCHS instruction. */
+COSTS_N_INSNS (40), /* cost of FSQRT instruction. */
+{{libcall, {{11, loop}, {-1, rep_prefix_4_byte}}},
+{libcall, {{32, loop}, {64, rep_prefix_4_byte},
+{8192, rep_prefix_8_byte}, {-1, libcall}}}},
+{{libcall, {{8, loop}, {15, unrolled_loop},
+{2048, rep_prefix_4_byte}, {-1, libcall}}},
+{libcall, {{24, loop}, {32, unrolled_loop},
+{8192, rep_prefix_8_byte}, {-1, libcall}}}},
+1, /* scalar_stmt_cost. */
+1, /* scalar load_cost. */
+1, /* scalar_store_cost. */
+1, /* vec_stmt_cost. */
+1, /* vec_to_scalar_cost. */
+1, /* scalar_to_vec_cost. */
+1, /* vec_align_load_cost. */
+2, /* vec_unalign_load_cost. */
+1, /* vec_store_cost. */
+3, /* cond_taken_branch_cost. */
+1, /* cond_not_taken_branch_cost. */
+};
 /* Generic64 should produce code tuned for Nocona and K8. */
 static const
 struct processor_costs generic64_cost = {
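In atom_cost the lea entry is written as COSTS_N_INSNS (1) + 1, so lea is costed slightly above add. A small sketch of the arithmetic follows. It assumes COSTS_N_INSNS (N) expands to N * 4, as defined in GCC's rtl.h (redefined locally here so the block stands alone); the helper names are hypothetical.

```c
/* Local stand-in for GCC's macro; assumed to match rtl.h.  */
#define COSTS_N_INSNS(N) ((N) * 4)

/* Cost of an add as listed in atom_cost.  */
int
atom_add_cost (void)
{
  return COSTS_N_INSNS (1);
}

/* Cost of a lea as listed in atom_cost: one insn plus one extra unit,
   so lea compares as marginally more expensive than a single add.  */
int
atom_lea_cost (void)
{
  return COSTS_N_INSNS (1) + 1;
}
```

Under that assumption add costs 4 units and lea costs 5, which is more than one add but well below two.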
@@ -1194,6 +1267,7 @@ const struct processor_costs *ix86_cost = &pentium_cost;
 #define m_PENT4 (1<<PROCESSOR_PENTIUM4)
 #define m_NOCONA (1<<PROCESSOR_NOCONA)
 #define m_CORE2 (1<<PROCESSOR_CORE2)
+#define m_ATOM (1<<PROCESSOR_ATOM)
 #define m_GEODE (1<<PROCESSOR_GEODE)
 #define m_K6 (1<<PROCESSOR_K6)
@@ -1231,10 +1305,11 @@ static unsigned int initial_ix86_tune_features[X86_TUNE_LAST] = {
 m_486 | m_PENT,
 /* X86_TUNE_UNROLL_STRLEN */
-m_486 | m_PENT | m_PPRO | m_AMD_MULTIPLE | m_K6 | m_CORE2 | m_GENERIC,
+m_486 | m_PENT | m_ATOM | m_PPRO | m_AMD_MULTIPLE | m_K6
+| m_CORE2 | m_GENERIC,
 /* X86_TUNE_DEEP_BRANCH_PREDICTION */
-m_PPRO | m_K6_GEODE | m_AMD_MULTIPLE | m_PENT4 | m_GENERIC,
+m_ATOM | m_PPRO | m_K6_GEODE | m_AMD_MULTIPLE | m_PENT4 | m_GENERIC,
 /* X86_TUNE_BRANCH_PREDICTION_HINTS: Branch hints were put in P4 based
 on simulation result. But after P4 was made, no performance benefit
@@ -1246,12 +1321,12 @@ static unsigned int initial_ix86_tune_features[X86_TUNE_LAST] = {
 ~m_386,
 /* X86_TUNE_USE_SAHF */
-m_PPRO | m_K6_GEODE | m_K8 | m_AMDFAM10 | m_PENT4
+m_ATOM | m_PPRO | m_K6_GEODE | m_K8 | m_AMDFAM10 | m_PENT4
 | m_NOCONA | m_CORE2 | m_GENERIC,
 /* X86_TUNE_MOVX: Enable to zero extend integer registers to avoid
 partial dependencies. */
-m_AMD_MULTIPLE | m_PPRO | m_PENT4 | m_NOCONA
+m_AMD_MULTIPLE | m_ATOM | m_PPRO | m_PENT4 | m_NOCONA
 | m_CORE2 | m_GENERIC | m_GEODE /* m_386 | m_K6 */,
 /* X86_TUNE_PARTIAL_REG_STALL: We probably ought to watch for partial
@@ -1271,13 +1346,13 @@ static unsigned int initial_ix86_tune_features[X86_TUNE_LAST] = {
 m_386 | m_486 | m_K6_GEODE,
 /* X86_TUNE_USE_SIMODE_FIOP */
-~(m_PPRO | m_AMD_MULTIPLE | m_PENT | m_CORE2 | m_GENERIC),
+~(m_PPRO | m_AMD_MULTIPLE | m_PENT | m_ATOM | m_CORE2 | m_GENERIC),
 /* X86_TUNE_USE_MOV0 */
 m_K6,
 /* X86_TUNE_USE_CLTD */
-~(m_PENT | m_K6 | m_CORE2 | m_GENERIC),
+~(m_PENT | m_ATOM | m_K6 | m_CORE2 | m_GENERIC),
 /* X86_TUNE_USE_XCHGB: Use xchgb %rh,%rl instead of rolw/rorw $8,rx. */
 m_PENT4,
@@ -1292,8 +1367,8 @@ static unsigned int initial_ix86_tune_features[X86_TUNE_LAST] = {
 ~(m_PENT | m_PPRO),
 /* X86_TUNE_PROMOTE_QIMODE */
-m_K6_GEODE | m_PENT | m_386 | m_486 | m_AMD_MULTIPLE | m_CORE2
-| m_GENERIC /* | m_PENT4 ? */,
+m_K6_GEODE | m_PENT | m_ATOM | m_386 | m_486 | m_AMD_MULTIPLE
+| m_CORE2 | m_GENERIC /* | m_PENT4 ? */,
 /* X86_TUNE_FAST_PREFIX */
 ~(m_PENT | m_486 | m_386),
@@ -1317,26 +1392,28 @@ static unsigned int initial_ix86_tune_features[X86_TUNE_LAST] = {
 m_PPRO,
 /* X86_TUNE_ADD_ESP_4: Enable if add/sub is preferred over 1/2 push/pop. */
-m_AMD_MULTIPLE | m_K6_GEODE | m_PENT4 | m_NOCONA | m_CORE2 | m_GENERIC,
+m_ATOM | m_AMD_MULTIPLE | m_K6_GEODE | m_PENT4 | m_NOCONA
+| m_CORE2 | m_GENERIC,
 /* X86_TUNE_ADD_ESP_8 */
-m_AMD_MULTIPLE | m_PPRO | m_K6_GEODE | m_386
+m_AMD_MULTIPLE | m_ATOM | m_PPRO | m_K6_GEODE | m_386
 | m_486 | m_PENT4 | m_NOCONA | m_CORE2 | m_GENERIC,
 /* X86_TUNE_SUB_ESP_4 */
-m_AMD_MULTIPLE | m_PPRO | m_PENT4 | m_NOCONA | m_CORE2 | m_GENERIC,
+m_AMD_MULTIPLE | m_ATOM | m_PPRO | m_PENT4 | m_NOCONA | m_CORE2
+| m_GENERIC,
 /* X86_TUNE_SUB_ESP_8 */
-m_AMD_MULTIPLE | m_PPRO | m_386 | m_486
+m_AMD_MULTIPLE | m_ATOM | m_PPRO | m_386 | m_486
 | m_PENT4 | m_NOCONA | m_CORE2 | m_GENERIC,
 /* X86_TUNE_INTEGER_DFMODE_MOVES: Enable if integer moves are preferred
 for DFmode copies */
-~(m_AMD_MULTIPLE | m_PENT4 | m_NOCONA | m_PPRO | m_CORE2
+~(m_AMD_MULTIPLE | m_ATOM | m_PENT4 | m_NOCONA | m_PPRO | m_CORE2
 | m_GENERIC | m_GEODE),
 /* X86_TUNE_PARTIAL_REG_DEPENDENCY */
-m_AMD_MULTIPLE | m_PENT4 | m_NOCONA | m_CORE2 | m_GENERIC,
+m_AMD_MULTIPLE | m_ATOM | m_PENT4 | m_NOCONA | m_CORE2 | m_GENERIC,
 /* X86_TUNE_SSE_PARTIAL_REG_DEPENDENCY: In the Generic model we have a
 conflict here in between PPro/Pentium4 based chips that thread 128bit
@@ -1347,7 +1424,8 @@ static unsigned int initial_ix86_tune_features[X86_TUNE_LAST] = {
 shows that disabling this option on P4 brings over 20% SPECfp regression,
 while enabling it on K8 brings roughly 2.4% regression that can be partly
 masked by careful scheduling of moves. */
-m_PENT4 | m_NOCONA | m_PPRO | m_CORE2 | m_GENERIC | m_AMDFAM10,
+m_ATOM | m_PENT4 | m_NOCONA | m_PPRO | m_CORE2 | m_GENERIC
+| m_AMDFAM10,
 /* X86_TUNE_SSE_UNALIGNED_MOVE_OPTIMAL */
 m_AMDFAM10,
@@ -1365,13 +1443,13 @@ static unsigned int initial_ix86_tune_features[X86_TUNE_LAST] = {
 m_PPRO | m_PENT4 | m_NOCONA,
 /* X86_TUNE_MEMORY_MISMATCH_STALL */
-m_AMD_MULTIPLE | m_PENT4 | m_NOCONA | m_CORE2 | m_GENERIC,
+m_AMD_MULTIPLE | m_ATOM | m_PENT4 | m_NOCONA | m_CORE2 | m_GENERIC,
 /* X86_TUNE_PROLOGUE_USING_MOVE */
-m_ATHLON_K8 | m_PPRO | m_CORE2 | m_GENERIC,
+m_ATHLON_K8 | m_ATOM | m_PPRO | m_CORE2 | m_GENERIC,
 /* X86_TUNE_EPILOGUE_USING_MOVE */
-m_ATHLON_K8 | m_PPRO | m_CORE2 | m_GENERIC,
+m_ATHLON_K8 | m_ATOM | m_PPRO | m_CORE2 | m_GENERIC,
 /* X86_TUNE_SHIFT1 */
 ~m_486,
@@ -1380,29 +1458,32 @@ static unsigned int initial_ix86_tune_features[X86_TUNE_LAST] = {
 m_AMD_MULTIPLE,
 /* X86_TUNE_INTER_UNIT_MOVES */
-~(m_AMD_MULTIPLE | m_GENERIC),
+~(m_AMD_MULTIPLE | m_ATOM | m_GENERIC),
 /* X86_TUNE_INTER_UNIT_CONVERSIONS */
 ~(m_AMDFAM10),
 /* X86_TUNE_FOUR_JUMP_LIMIT: Some CPU cores are not able to predict more
 than 4 branch instructions in the 16 byte window. */
-m_PPRO | m_AMD_MULTIPLE | m_PENT4 | m_NOCONA | m_CORE2 | m_GENERIC,
+m_ATOM | m_PPRO | m_AMD_MULTIPLE | m_PENT4 | m_NOCONA | m_CORE2
+| m_GENERIC,
 /* X86_TUNE_SCHEDULE */
-m_PPRO | m_AMD_MULTIPLE | m_K6_GEODE | m_PENT | m_CORE2 | m_GENERIC,
+m_PPRO | m_AMD_MULTIPLE | m_K6_GEODE | m_PENT | m_ATOM | m_CORE2
+| m_GENERIC,
 /* X86_TUNE_USE_BT */
-m_AMD_MULTIPLE | m_CORE2 | m_GENERIC,
+m_AMD_MULTIPLE | m_ATOM | m_CORE2 | m_GENERIC,
 /* X86_TUNE_USE_INCDEC */
-~(m_PENT4 | m_NOCONA | m_GENERIC),
+~(m_PENT4 | m_NOCONA | m_GENERIC | m_ATOM),
 /* X86_TUNE_PAD_RETURNS */
 m_AMD_MULTIPLE | m_CORE2 | m_GENERIC,
 /* X86_TUNE_EXT_80387_CONSTANTS */
-m_K6_GEODE | m_ATHLON_K8 | m_PENT4 | m_NOCONA | m_PPRO | m_CORE2 | m_GENERIC,
+m_K6_GEODE | m_ATHLON_K8 | m_ATOM | m_PENT4 | m_NOCONA | m_PPRO
+| m_CORE2 | m_GENERIC,
 /* X86_TUNE_SHORTEN_X87_SSE */
 ~m_K8,
@@ -1447,6 +1528,10 @@ static unsigned int initial_ix86_tune_features[X86_TUNE_LAST] = {
 with a subsequent conditional jump instruction into a single
 compare-and-branch uop. */
 m_CORE2,
+/* X86_TUNE_OPT_AGU: Optimize for Address Generation Unit. This flag
+will impact LEA instruction selection. */
+m_ATOM,
 };
 /* Feature tests against the various architecture variations. */
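Each entry of initial_ix86_tune_features is a bitmask of processors, built from macros such as m_ATOM = 1 << PROCESSOR_ATOM. The sketch below shows how such a table can be reduced to per-feature booleans for one selected processor. It assumes the compiler tests the bit 1 << ix86_tune against every entry, which this diff does not show; the reduced enums, table and function are hypothetical stand-ins.

```c
/* Hypothetical, reduced processor list (the real enum is much longer).  */
enum processor { PROC_CORE2, PROC_ATOM, PROC_MAX };

#define M_CORE2 (1u << PROC_CORE2)
#define M_ATOM  (1u << PROC_ATOM)

/* Hypothetical, reduced feature list.  */
enum tune_index {
  TUNE_FUSE_CMP_AND_BRANCH,
  TUNE_OPT_AGU,
  TUNE_USE_INCDEC,
  TUNE_LAST
};

/* One processor mask per feature, in the style of the table above.
   The USE_INCDEC entry is simplified to exclude only Atom.  */
static const unsigned int initial_tune[TUNE_LAST] = {
  M_CORE2,   /* TUNE_FUSE_CMP_AND_BRANCH */
  M_ATOM,    /* TUNE_OPT_AGU */
  ~M_ATOM    /* TUNE_USE_INCDEC */
};

/* Set out[i] to 1 when feature i is enabled for the processor TUNE.  */
void
resolve_tune_features (enum processor tune, unsigned char out[TUNE_LAST])
{
  unsigned int mask = 1u << tune;
  int i;
  for (i = 0; i < TUNE_LAST; i++)
    out[i] = (initial_tune[i] & mask) != 0;
}
```

With PROC_ATOM selected, only TUNE_OPT_AGU is set in this reduced table; with PROC_CORE2, the compare-and-branch fusion and inc/dec entries are set instead.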
@@ -1472,10 +1557,11 @@ static unsigned int initial_ix86_arch_features[X86_ARCH_LAST] = {
 };
 static const unsigned int x86_accumulate_outgoing_args
-= m_AMD_MULTIPLE | m_PENT4 | m_NOCONA | m_PPRO | m_CORE2 | m_GENERIC;
+= m_AMD_MULTIPLE | m_ATOM | m_PENT4 | m_NOCONA | m_PPRO | m_CORE2
+| m_GENERIC;
 static const unsigned int x86_arch_always_fancy_math_387
-= m_PENT | m_PPRO | m_AMD_MULTIPLE | m_PENT4
+= m_PENT | m_ATOM | m_PPRO | m_AMD_MULTIPLE | m_PENT4
 | m_NOCONA | m_CORE2 | m_GENERIC;
 static enum stringop_alg stringop_alg = no_stringop;
@@ -1958,7 +2044,8 @@ static const struct ptt processor_target_table[PROCESSOR_max] =
 {&core2_cost, 16, 10, 16, 10, 16},
 {&generic32_cost, 16, 7, 16, 7, 16},
 {&generic64_cost, 16, 10, 16, 10, 16},
-{&amdfam10_cost, 32, 24, 32, 7, 32}
+{&amdfam10_cost, 32, 24, 32, 7, 32},
+{&atom_cost, 16, 7, 16, 7, 16}
 };
 static const char *const cpu_names[TARGET_CPU_DEFAULT_max] =
@@ -1976,6 +2063,7 @@ static const char *const cpu_names[TARGET_CPU_DEFAULT_max] =
 "prescott",
 "nocona",
 "core2",
+"atom",
 "geode",
 "k6",
 "k6-2",
@@ -2534,6 +2622,9 @@ override_options (bool main_args_p)
 {"core2", PROCESSOR_CORE2, CPU_CORE2,
 PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3
 | PTA_SSSE3 | PTA_CX16},
+{"atom", PROCESSOR_ATOM, CPU_ATOM,
+PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3
+| PTA_SSSE3 | PTA_CX16},
 {"geode", PROCESSOR_GEODE, CPU_GEODE,
 PTA_MMX | PTA_3DNOW | PTA_3DNOW_A |PTA_PREFETCH_SSE},
 {"k6", PROCESSOR_K6, CPU_K6, PTA_MMX},
@@ -19026,6 +19117,7 @@ ix86_issue_rate (void)
 switch (ix86_tune)
 {
 case PROCESSOR_PENTIUM:
+case PROCESSOR_ATOM:
 case PROCESSOR_K6:
 return 2;
@@ -19226,6 +19318,7 @@ ix86_adjust_cost (rtx insn, rtx link, rtx dep_insn, int cost)
 case PROCESSOR_ATHLON:
 case PROCESSOR_K8:
 case PROCESSOR_AMDFAM10:
+case PROCESSOR_ATOM:
 case PROCESSOR_GENERIC32:
 case PROCESSOR_GENERIC64:
 memory = get_attr_memory (insn);
......
@@ -231,6 +231,7 @@ extern const struct processor_costs ix86_size_cost;
 #define TARGET_GENERIC64 (ix86_tune == PROCESSOR_GENERIC64)
 #define TARGET_GENERIC (TARGET_GENERIC32 || TARGET_GENERIC64)
 #define TARGET_AMDFAM10 (ix86_tune == PROCESSOR_AMDFAM10)
+#define TARGET_ATOM (ix86_tune == PROCESSOR_ATOM)
 /* Feature tests against the various tunings. */
 enum ix86_tune_indices {
@@ -295,6 +296,7 @@ enum ix86_tune_indices {
 X86_TUNE_USE_VECTOR_FP_CONVERTS,
 X86_TUNE_USE_VECTOR_CONVERTS,
 X86_TUNE_FUSE_CMP_AND_BRANCH,
+X86_TUNE_OPT_AGU,
 X86_TUNE_LAST
 };
@@ -382,6 +384,7 @@ extern unsigned char ix86_tune_features[X86_TUNE_LAST];
 ix86_tune_features[X86_TUNE_USE_VECTOR_CONVERTS]
 #define TARGET_FUSE_CMP_AND_BRANCH \
 ix86_tune_features[X86_TUNE_FUSE_CMP_AND_BRANCH]
+#define TARGET_OPT_AGU ix86_tune_features[X86_TUNE_OPT_AGU]
 /* Feature tests against the various architecture variations. */
 enum ix86_arch_indices {
@@ -567,6 +570,7 @@ enum target_cpu_default
 TARGET_CPU_DEFAULT_prescott,
 TARGET_CPU_DEFAULT_nocona,
 TARGET_CPU_DEFAULT_core2,
+TARGET_CPU_DEFAULT_atom,
 TARGET_CPU_DEFAULT_geode,
 TARGET_CPU_DEFAULT_k6,
@@ -2272,6 +2276,7 @@ enum processor_type
 PROCESSOR_GENERIC32,
 PROCESSOR_GENERIC64,
 PROCESSOR_AMDFAM10,
+PROCESSOR_ATOM,
 PROCESSOR_max
 };
......
@@ -316,7 +316,7 @@
 ;; Processor type.
-(define_attr "cpu" "none,pentium,pentiumpro,geode,k6,athlon,k8,core2,
+(define_attr "cpu" "none,pentium,pentiumpro,geode,k6,athlon,k8,core2,atom,
 generic64,amdfam10"
 (const (symbol_ref "ix86_schedule")))
@@ -612,6 +612,12 @@
 (define_attr "i387_cw" "trunc,floor,ceil,mask_pm,uninitialized,any"
 (const_string "any"))
+;; Define attribute to classify add/sub insns that consumes carry flag (CF)
+(define_attr "use_carry" "0,1" (const_string "0"))
+;; Define attribute to indicate unaligned ssemov insns
+(define_attr "movu" "0,1" (const_string "0"))
 ;; Describe a user's asm statement.
 (define_asm_attributes
 [(set_attr "length" "128")
@@ -727,6 +733,7 @@
 (include "k6.md")
 (include "athlon.md")
 (include "geode.md")
+(include "atom.md")
 ;; Operand and operator predicates and constraints
@@ -5816,6 +5823,7 @@
 "TARGET_64BIT && ix86_binary_operator_ok (PLUS, DImode, operands)"
 "adc{q}\t{%2, %0|%0, %2}"
 [(set_attr "type" "alu")
+(set_attr "use_carry" "1")
 (set_attr "pent_pair" "pu")
 (set_attr "mode" "DI")])
@@ -5890,6 +5898,7 @@
 "ix86_binary_operator_ok (PLUS, QImode, operands)"
 "adc{b}\t{%2, %0|%0, %2}"
 [(set_attr "type" "alu")
+(set_attr "use_carry" "1")
 (set_attr "pent_pair" "pu")
 (set_attr "mode" "QI")])
@@ -5902,6 +5911,7 @@
 "ix86_binary_operator_ok (PLUS, HImode, operands)"
 "adc{w}\t{%2, %0|%0, %2}"
 [(set_attr "type" "alu")
+(set_attr "use_carry" "1")
 (set_attr "pent_pair" "pu")
 (set_attr "mode" "HI")])
@@ -5914,6 +5924,7 @@
 "ix86_binary_operator_ok (PLUS, SImode, operands)"
 "adc{l}\t{%2, %0|%0, %2}"
 [(set_attr "type" "alu")
+(set_attr "use_carry" "1")
 (set_attr "pent_pair" "pu")
 (set_attr "mode" "SI")])
@@ -5927,6 +5938,7 @@
 "TARGET_64BIT && ix86_binary_operator_ok (PLUS, SImode, operands)"
 "adc{l}\t{%2, %k0|%k0, %2}"
 [(set_attr "type" "alu")
+(set_attr "use_carry" "1")
 (set_attr "pent_pair" "pu")
 (set_attr "mode" "SI")])
@@ -6156,9 +6168,9 @@
 (set_attr "mode" "SI")])
 (define_insn "*adddi_1_rex64"
-[(set (match_operand:DI 0 "nonimmediate_operand" "=r,rm,r")
-(plus:DI (match_operand:DI 1 "nonimmediate_operand" "%0,0,r")
-(match_operand:DI 2 "x86_64_general_operand" "rme,re,le")))
+[(set (match_operand:DI 0 "nonimmediate_operand" "=r,rm,r,r")
+(plus:DI (match_operand:DI 1 "nonimmediate_operand" "%0,0,r,r")
+(match_operand:DI 2 "x86_64_general_operand" "rme,re,0,le")))
 (clobber (reg:CC FLAGS_REG))]
 "TARGET_64BIT && ix86_binary_operator_ok (PLUS, DImode, operands)"
 {
@@ -6179,6 +6191,10 @@
 }
 default:
+/* Use add as much as possible to replace lea for AGU optimization. */
+if (which_alternative == 2 && TARGET_OPT_AGU)
+return "add{q}\t{%1, %0|%0, %1}";
 gcc_assert (rtx_equal_p (operands[0], operands[1]));
 /* Make things pretty and `subl $4,%eax' rather than `addl $-4, %eax'.
@@ -6197,8 +6213,11 @@
 }
 }
 [(set (attr "type")
-(cond [(eq_attr "alternative" "2")
+(cond [(and (eq_attr "alternative" "2")
+(eq (symbol_ref "TARGET_OPT_AGU") (const_int 0)))
 (const_string "lea")
+(eq_attr "alternative" "3")
+(const_string "lea")
 ; Current assemblers are broken and do not allow @GOTOFF in
 ; ought but a memory context.
 (match_operand:DI 2 "pic_symbolic_operand" "")
@@ -6215,8 +6234,7 @@
 (plus:DI (match_operand:DI 1 "register_operand" "")
 (match_operand:DI 2 "x86_64_nonmemory_operand" "")))
 (clobber (reg:CC FLAGS_REG))]
-"TARGET_64BIT && reload_completed
-&& true_regnum (operands[0]) != true_regnum (operands[1])"
+"TARGET_64BIT && reload_completed"
 [(set (match_dup 0)
 (plus:DI (match_dup 1)
 (match_dup 2)))]
@@ -6420,9 +6438,9 @@
 (define_insn "*addsi_1"
-[(set (match_operand:SI 0 "nonimmediate_operand" "=r,rm,r")
-(plus:SI (match_operand:SI 1 "nonimmediate_operand" "%0,0,r")
-(match_operand:SI 2 "general_operand" "g,ri,li")))
+[(set (match_operand:SI 0 "nonimmediate_operand" "=r,rm,r,r")
+(plus:SI (match_operand:SI 1 "nonimmediate_operand" "%0,0,r,r")
+(match_operand:SI 2 "general_operand" "g,ri,0,li")))
 (clobber (reg:CC FLAGS_REG))]
 "ix86_binary_operator_ok (PLUS, SImode, operands)"
 {
@@ -6443,6 +6461,10 @@
 }
 default:
+/* Use add as much as possible to replace lea for AGU optimization. */
+if (which_alternative == 2 && TARGET_OPT_AGU)
+return "add{l}\t{%1, %0|%0, %1}";
 gcc_assert (rtx_equal_p (operands[0], operands[1]));
 /* Make things pretty and `subl $4,%eax' rather than `addl $-4, %eax'.
@@ -6459,7 +6481,10 @@
 }
 }
 [(set (attr "type")
-(cond [(eq_attr "alternative" "2")
+(cond [(and (eq_attr "alternative" "2")
+(eq (symbol_ref "TARGET_OPT_AGU") (const_int 0)))
+(const_string "lea")
+(eq_attr "alternative" "3")
 (const_string "lea")
 ; Current assemblers are broken and do not allow @GOTOFF in
 ; ought but a memory context.
@@ -6477,8 +6502,7 @@
 (plus (match_operand 1 "register_operand" "")
 (match_operand 2 "nonmemory_operand" "")))
 (clobber (reg:CC FLAGS_REG))]
-"reload_completed
-&& true_regnum (operands[0]) != true_regnum (operands[1])"
+"reload_completed"
 [(const_int 0)]
 {
 rtx pat;
@@ -7580,6 +7604,7 @@
 "TARGET_64BIT && ix86_binary_operator_ok (MINUS, DImode, operands)"
 "sbb{q}\t{%2, %0|%0, %2}"
 [(set_attr "type" "alu")
+(set_attr "use_carry" "1")
 (set_attr "pent_pair" "pu")
 (set_attr "mode" "DI")])
@@ -7628,6 +7653,7 @@
 "ix86_binary_operator_ok (MINUS, QImode, operands)"
 "sbb{b}\t{%2, %0|%0, %2}"
 [(set_attr "type" "alu")
+(set_attr "use_carry" "1")
 (set_attr "pent_pair" "pu")
 (set_attr "mode" "QI")])
@@ -7640,6 +7666,7 @@
 "ix86_binary_operator_ok (MINUS, HImode, operands)"
 "sbb{w}\t{%2, %0|%0, %2}"
 [(set_attr "type" "alu")
+(set_attr "use_carry" "1")
 (set_attr "pent_pair" "pu")
 (set_attr "mode" "HI")])
@@ -7652,6 +7679,7 @@
 "ix86_binary_operator_ok (MINUS, SImode, operands)"
 "sbb{l}\t{%2, %0|%0, %2}"
 [(set_attr "type" "alu")
+(set_attr "use_carry" "1")
 (set_attr "pent_pair" "pu")
 (set_attr "mode" "SI")])
@@ -15275,6 +15303,7 @@
 "reload_completed"
 "ret"
 [(set_attr "length" "1")
+(set_attr "atom_unit" "jeu")
 (set_attr "length_immediate" "0")
 (set_attr "modrm" "0")])
@@ -15287,6 +15316,7 @@
 "reload_completed"
 "rep\;ret"
 [(set_attr "length" "1")
+(set_attr "atom_unit" "jeu")
 (set_attr "length_immediate" "0")
 (set_attr "prefix_rep" "1")
 (set_attr "modrm" "0")])
@@ -15297,6 +15327,7 @@
 "reload_completed"
 "ret\t%0"
 [(set_attr "length" "3")
+(set_attr "atom_unit" "jeu")
 (set_attr "length_immediate" "2")
 (set_attr "modrm" "0")])
@@ -16418,6 +16449,7 @@
 "TARGET_SSE_MATH"
 "%vrcpss\t{%1, %d0|%d0, %1}"
 [(set_attr "type" "sse")
+(set_attr "atom_sse_attr" "rcp")
 (set_attr "prefix" "maybe_vex")
 (set_attr "mode" "SF")])
@@ -16777,6 +16809,7 @@
 "TARGET_SSE_MATH"
 "%vrsqrtss\t{%1, %d0|%d0, %1}"
 [(set_attr "type" "sse")
+(set_attr "atom_sse_attr" "rcp")
 (set_attr "prefix" "maybe_vex")
 (set_attr "mode" "SF")])
@@ -16797,6 +16830,7 @@
 "SSE_FLOAT_MODE_P (<MODE>mode) && TARGET_SSE_MATH"
 "%vsqrts<ssemodefsuffix>\t{%1, %d0|%d0, %1}"
 [(set_attr "type" "sse")
+(set_attr "atom_sse_attr" "sqrt")
 (set_attr "prefix" "maybe_vex")
 (set_attr "mode" "<MODE>")
 (set_attr "athlon_decode" "*")
@@ -19850,6 +19884,7 @@
 ; Since we don't have the proper number of operands for an alu insn,
 ; fill in all the blanks.
 [(set_attr "type" "alu")
+(set_attr "use_carry" "1")
 (set_attr "pent_pair" "pu")
 (set_attr "memory" "none")
 (set_attr "imm_disp" "false")
@@ -19865,6 +19900,7 @@
 ""
 "sbb{q}\t%0, %0"
 [(set_attr "type" "alu")
+(set_attr "use_carry" "1")
 (set_attr "pent_pair" "pu")
 (set_attr "memory" "none")
 (set_attr "imm_disp" "false")
@@ -19908,6 +19944,7 @@
 ; Since we don't have the proper number of operands for an alu insn,
 ; fill in all the blanks.
 [(set_attr "type" "alu")
+(set_attr "use_carry" "1")
 (set_attr "pent_pair" "pu")
 (set_attr "memory" "none")
 (set_attr "imm_disp" "false")
@@ -19923,6 +19960,7 @@
 ""
 "sbb{l}\t%0, %0"
 [(set_attr "type" "alu")
+(set_attr "use_carry" "1")
 (set_attr "pent_pair" "pu")
 (set_attr "memory" "none")
 (set_attr "imm_disp" "false")
@@ -20255,7 +20293,8 @@
 }
 }
 [(set (attr "type")
-(cond [(eq_attr "alternative" "0")
+(cond [(and (eq_attr "alternative" "0")
+(eq (symbol_ref "TARGET_OPT_AGU") (const_int 0)))
 (const_string "alu")
 (match_operand:SI 2 "const0_operand" "")
 (const_string "imov")
@@ -20298,7 +20337,8 @@
 }
 }
 [(set (attr "type")
-(cond [(eq_attr "alternative" "0")
+(cond [(and (eq_attr "alternative" "0")
+(eq (symbol_ref "TARGET_OPT_AGU") (const_int 0)))
 (const_string "alu")
 (match_operand:DI 2 "const0_operand" "")
 (const_string "imov")
@@ -21790,6 +21830,7 @@
 return patterns[locality];
 }
 [(set_attr "type" "sse")
+(set_attr "atom_sse_attr" "prefetch")
 (set_attr "memory" "none")])
(define_insn "*prefetch_sse_rex" (define_insn "*prefetch_sse_rex"
...@@ -21808,6 +21849,7 @@ ...@@ -21808,6 +21849,7 @@
return patterns[locality]; return patterns[locality];
} }
[(set_attr "type" "sse") [(set_attr "type" "sse")
(set_attr "atom_sse_attr" "prefetch")
(set_attr "memory" "none")]) (set_attr "memory" "none")])
(define_insn "*prefetch_3dnow" (define_insn "*prefetch_3dnow"
......
...@@ -338,6 +338,7 @@ ...@@ -338,6 +338,7 @@
&& !(MEM_P (operands[0]) && MEM_P (operands[1]))" && !(MEM_P (operands[0]) && MEM_P (operands[1]))"
"vmovup<avxmodesuffixf2c>\t{%1, %0|%0, %1}" "vmovup<avxmodesuffixf2c>\t{%1, %0|%0, %1}"
[(set_attr "type" "ssemov") [(set_attr "type" "ssemov")
(set_attr "movu" "1")
(set_attr "prefix" "vex") (set_attr "prefix" "vex")
(set_attr "mode" "<MODE>")]) (set_attr "mode" "<MODE>")])
...@@ -363,6 +364,7 @@ ...@@ -363,6 +364,7 @@
&& !(MEM_P (operands[0]) && MEM_P (operands[1]))" && !(MEM_P (operands[0]) && MEM_P (operands[1]))"
"movup<ssemodesuffixf2c>\t{%1, %0|%0, %1}" "movup<ssemodesuffixf2c>\t{%1, %0|%0, %1}"
[(set_attr "type" "ssemov") [(set_attr "type" "ssemov")
(set_attr "movu" "1")
(set_attr "mode" "<MODE>")]) (set_attr "mode" "<MODE>")])
(define_insn "avx_movdqu<avxmodesuffix>" (define_insn "avx_movdqu<avxmodesuffix>"
...@@ -373,6 +375,7 @@ ...@@ -373,6 +375,7 @@
"TARGET_AVX && !(MEM_P (operands[0]) && MEM_P (operands[1]))" "TARGET_AVX && !(MEM_P (operands[0]) && MEM_P (operands[1]))"
"vmovdqu\t{%1, %0|%0, %1}" "vmovdqu\t{%1, %0|%0, %1}"
[(set_attr "type" "ssemov") [(set_attr "type" "ssemov")
(set_attr "movu" "1")
(set_attr "prefix" "vex") (set_attr "prefix" "vex")
(set_attr "mode" "<avxvecmode>")]) (set_attr "mode" "<avxvecmode>")])
...@@ -383,6 +386,7 @@ ...@@ -383,6 +386,7 @@
"TARGET_SSE2 && !(MEM_P (operands[0]) && MEM_P (operands[1]))" "TARGET_SSE2 && !(MEM_P (operands[0]) && MEM_P (operands[1]))"
"movdqu\t{%1, %0|%0, %1}" "movdqu\t{%1, %0|%0, %1}"
[(set_attr "type" "ssemov") [(set_attr "type" "ssemov")
(set_attr "movu" "1")
(set_attr "prefix_data16" "1") (set_attr "prefix_data16" "1")
(set_attr "mode" "TI")]) (set_attr "mode" "TI")])
...@@ -424,7 +428,7 @@ ...@@ -424,7 +428,7 @@
UNSPEC_MOVNT))] UNSPEC_MOVNT))]
"TARGET_SSE2" "TARGET_SSE2"
"movntdq\t{%1, %0|%0, %1}" "movntdq\t{%1, %0|%0, %1}"
[(set_attr "type" "ssecvt") [(set_attr "type" "ssemov")
(set_attr "prefix_data16" "1") (set_attr "prefix_data16" "1")
(set_attr "mode" "TI")]) (set_attr "mode" "TI")])
...@@ -434,7 +438,7 @@ ...@@ -434,7 +438,7 @@
UNSPEC_MOVNT))] UNSPEC_MOVNT))]
"TARGET_SSE2" "TARGET_SSE2"
"movnti\t{%1, %0|%0, %1}" "movnti\t{%1, %0|%0, %1}"
[(set_attr "type" "ssecvt") [(set_attr "type" "ssemov")
(set_attr "mode" "V2DF")]) (set_attr "mode" "V2DF")])
(define_insn "avx_lddqu<avxmodesuffix>" (define_insn "avx_lddqu<avxmodesuffix>"
...@@ -445,6 +449,7 @@ ...@@ -445,6 +449,7 @@
"TARGET_AVX" "TARGET_AVX"
"vlddqu\t{%1, %0|%0, %1}" "vlddqu\t{%1, %0|%0, %1}"
[(set_attr "type" "ssecvt") [(set_attr "type" "ssecvt")
(set_attr "movu" "1")
(set_attr "prefix" "vex") (set_attr "prefix" "vex")
(set_attr "mode" "<avxvecmode>")]) (set_attr "mode" "<avxvecmode>")])
...@@ -454,7 +459,8 @@ ...@@ -454,7 +459,8 @@
UNSPEC_LDDQU))] UNSPEC_LDDQU))]
"TARGET_SSE3" "TARGET_SSE3"
"lddqu\t{%1, %0|%0, %1}" "lddqu\t{%1, %0|%0, %1}"
[(set_attr "type" "ssecvt") [(set_attr "type" "ssemov")
(set_attr "movu" "1")
(set_attr "prefix_rep" "1") (set_attr "prefix_rep" "1")
(set_attr "mode" "TI")]) (set_attr "mode" "TI")])
...@@ -761,6 +767,7 @@ ...@@ -761,6 +767,7 @@
"TARGET_SSE" "TARGET_SSE"
"%vrcpps\t{%1, %0|%0, %1}" "%vrcpps\t{%1, %0|%0, %1}"
[(set_attr "type" "sse") [(set_attr "type" "sse")
(set_attr "atom_sse_attr" "rcp")
(set_attr "prefix" "maybe_vex") (set_attr "prefix" "maybe_vex")
(set_attr "mode" "V4SF")]) (set_attr "mode" "V4SF")])
...@@ -787,6 +794,7 @@ ...@@ -787,6 +794,7 @@
"TARGET_SSE" "TARGET_SSE"
"rcpss\t{%1, %0|%0, %1}" "rcpss\t{%1, %0|%0, %1}"
[(set_attr "type" "sse") [(set_attr "type" "sse")
(set_attr "atom_sse_attr" "rcp")
(set_attr "mode" "SF")]) (set_attr "mode" "SF")])
(define_expand "sqrtv8sf2" (define_expand "sqrtv8sf2"
...@@ -832,6 +840,7 @@ ...@@ -832,6 +840,7 @@
"TARGET_SSE" "TARGET_SSE"
"%vsqrtps\t{%1, %0|%0, %1}" "%vsqrtps\t{%1, %0|%0, %1}"
[(set_attr "type" "sse") [(set_attr "type" "sse")
(set_attr "atom_sse_attr" "sqrt")
(set_attr "prefix" "maybe_vex") (set_attr "prefix" "maybe_vex")
(set_attr "mode" "V4SF")]) (set_attr "mode" "V4SF")])
...@@ -876,6 +885,7 @@ ...@@ -876,6 +885,7 @@
"SSE_VEC_FLOAT_MODE_P (<MODE>mode)" "SSE_VEC_FLOAT_MODE_P (<MODE>mode)"
"sqrts<ssemodesuffixf2c>\t{%1, %0|%0, %1}" "sqrts<ssemodesuffixf2c>\t{%1, %0|%0, %1}"
[(set_attr "type" "sse") [(set_attr "type" "sse")
(set_attr "atom_sse_attr" "sqrt")
(set_attr "mode" "<ssescalarmode>")]) (set_attr "mode" "<ssescalarmode>")])
(define_expand "rsqrtv8sf2" (define_expand "rsqrtv8sf2"
...@@ -1039,7 +1049,7 @@ ...@@ -1039,7 +1049,7 @@
(const_int 1)))] (const_int 1)))]
"SSE_VEC_FLOAT_MODE_P (<MODE>mode)" "SSE_VEC_FLOAT_MODE_P (<MODE>mode)"
"<maxminfprefix>s<ssemodesuffixf2c>\t{%2, %0|%0, %2}" "<maxminfprefix>s<ssemodesuffixf2c>\t{%2, %0|%0, %2}"
[(set_attr "type" "sse") [(set_attr "type" "sseadd")
(set_attr "mode" "<ssescalarmode>")]) (set_attr "mode" "<ssescalarmode>")])
;; These versions of the min/max patterns implement exactly the operations ;; These versions of the min/max patterns implement exactly the operations
...@@ -1175,6 +1185,7 @@ ...@@ -1175,6 +1185,7 @@
"TARGET_SSE3" "TARGET_SSE3"
"addsubpd\t{%2, %0|%0, %2}" "addsubpd\t{%2, %0|%0, %2}"
[(set_attr "type" "sseadd") [(set_attr "type" "sseadd")
(set_attr "atom_unit" "complex")
(set_attr "mode" "V2DF")]) (set_attr "mode" "V2DF")])
(define_insn "avx_h<plusminus_insn>v4df3" (define_insn "avx_h<plusminus_insn>v4df3"
...@@ -1298,6 +1309,7 @@ ...@@ -1298,6 +1309,7 @@
"TARGET_SSE3" "TARGET_SSE3"
"h<plusminus_mnemonic>ps\t{%2, %0|%0, %2}" "h<plusminus_mnemonic>ps\t{%2, %0|%0, %2}"
[(set_attr "type" "sseadd") [(set_attr "type" "sseadd")
(set_attr "atom_unit" "complex")
(set_attr "prefix_rep" "1") (set_attr "prefix_rep" "1")
(set_attr "mode" "V4SF")]) (set_attr "mode" "V4SF")])
...@@ -5066,6 +5078,7 @@ ...@@ -5066,6 +5078,7 @@
"TARGET_SSE2 && ix86_binary_operator_ok (MULT, V8HImode, operands)" "TARGET_SSE2 && ix86_binary_operator_ok (MULT, V8HImode, operands)"
"pmaddwd\t{%2, %0|%0, %2}" "pmaddwd\t{%2, %0|%0, %2}"
[(set_attr "type" "sseiadd") [(set_attr "type" "sseiadd")
(set_attr "atom_unit" "simul")
(set_attr "prefix_data16" "1") (set_attr "prefix_data16" "1")
(set_attr "mode" "TI")]) (set_attr "mode" "TI")])
...@@ -7025,6 +7038,7 @@ ...@@ -7025,6 +7038,7 @@
movq\t{%H1, %0|%0, %H1} movq\t{%H1, %0|%0, %H1}
mov{q}\t{%H1, %0|%0, %H1}" mov{q}\t{%H1, %0|%0, %H1}"
[(set_attr "type" "ssemov,sseishft,ssemov,imov") [(set_attr "type" "ssemov,sseishft,ssemov,imov")
(set_attr "atom_unit" "*,sishuf,*,*")
(set_attr "memory" "*,none,*,*") (set_attr "memory" "*,none,*,*")
(set_attr "mode" "V2SF,TI,TI,DI")]) (set_attr "mode" "V2SF,TI,TI,DI")])
...@@ -7057,6 +7071,7 @@ ...@@ -7057,6 +7071,7 @@
psrldq\t{$8, %0|%0, 8} psrldq\t{$8, %0|%0, 8}
movq\t{%H1, %0|%0, %H1}" movq\t{%H1, %0|%0, %H1}"
[(set_attr "type" "ssemov,sseishft,ssemov") [(set_attr "type" "ssemov,sseishft,ssemov")
(set_attr "atom_unit" "*,sishuf,*")
(set_attr "memory" "*,none,*") (set_attr "memory" "*,none,*")
(set_attr "mode" "V2SF,TI,TI")]) (set_attr "mode" "V2SF,TI,TI")])
...@@ -7614,6 +7629,7 @@ ...@@ -7614,6 +7629,7 @@
"TARGET_SSE2" "TARGET_SSE2"
"psadbw\t{%2, %0|%0, %2}" "psadbw\t{%2, %0|%0, %2}"
[(set_attr "type" "sseiadd") [(set_attr "type" "sseiadd")
(set_attr "atom_unit" "simul")
(set_attr "prefix_data16" "1") (set_attr "prefix_data16" "1")
(set_attr "mode" "TI")]) (set_attr "mode" "TI")])
...@@ -7635,7 +7651,7 @@ ...@@ -7635,7 +7651,7 @@
UNSPEC_MOVMSK))] UNSPEC_MOVMSK))]
"SSE_VEC_FLOAT_MODE_P (<MODE>mode)" "SSE_VEC_FLOAT_MODE_P (<MODE>mode)"
"%vmovmskp<ssemodesuffixf2c>\t{%1, %0|%0, %1}" "%vmovmskp<ssemodesuffixf2c>\t{%1, %0|%0, %1}"
[(set_attr "type" "ssecvt") [(set_attr "type" "ssemov")
(set_attr "prefix" "maybe_vex") (set_attr "prefix" "maybe_vex")
(set_attr "mode" "<MODE>")]) (set_attr "mode" "<MODE>")])
...@@ -7645,7 +7661,7 @@ ...@@ -7645,7 +7661,7 @@
UNSPEC_MOVMSK))] UNSPEC_MOVMSK))]
"TARGET_SSE2" "TARGET_SSE2"
"%vpmovmskb\t{%1, %0|%0, %1}" "%vpmovmskb\t{%1, %0|%0, %1}"
[(set_attr "type" "ssecvt") [(set_attr "type" "ssemov")
(set_attr "prefix_data16" "1") (set_attr "prefix_data16" "1")
(set_attr "prefix" "maybe_vex") (set_attr "prefix" "maybe_vex")
(set_attr "mode" "SI")]) (set_attr "mode" "SI")])
...@@ -7668,7 +7684,7 @@ ...@@ -7668,7 +7684,7 @@
"TARGET_SSE2 && !TARGET_64BIT" "TARGET_SSE2 && !TARGET_64BIT"
;; @@@ check ordering of operands in intel/nonintel syntax ;; @@@ check ordering of operands in intel/nonintel syntax
"%vmaskmovdqu\t{%2, %1|%1, %2}" "%vmaskmovdqu\t{%2, %1|%1, %2}"
[(set_attr "type" "ssecvt") [(set_attr "type" "ssemov")
(set_attr "prefix_data16" "1") (set_attr "prefix_data16" "1")
(set_attr "prefix" "maybe_vex") (set_attr "prefix" "maybe_vex")
(set_attr "mode" "TI")]) (set_attr "mode" "TI")])
...@@ -7682,7 +7698,7 @@ ...@@ -7682,7 +7698,7 @@
"TARGET_SSE2 && TARGET_64BIT" "TARGET_SSE2 && TARGET_64BIT"
;; @@@ check ordering of operands in intel/nonintel syntax ;; @@@ check ordering of operands in intel/nonintel syntax
"%vmaskmovdqu\t{%2, %1|%1, %2}" "%vmaskmovdqu\t{%2, %1|%1, %2}"
[(set_attr "type" "ssecvt") [(set_attr "type" "ssemov")
(set_attr "prefix_data16" "1") (set_attr "prefix_data16" "1")
(set_attr "prefix" "maybe_vex") (set_attr "prefix" "maybe_vex")
(set_attr "mode" "TI")]) (set_attr "mode" "TI")])
...@@ -7693,6 +7709,7 @@ ...@@ -7693,6 +7709,7 @@
"TARGET_SSE" "TARGET_SSE"
"%vldmxcsr\t%0" "%vldmxcsr\t%0"
[(set_attr "type" "sse") [(set_attr "type" "sse")
(set_attr "atom_sse_attr" "mxcsr")
(set_attr "prefix" "maybe_vex") (set_attr "prefix" "maybe_vex")
(set_attr "memory" "load")]) (set_attr "memory" "load")])
...@@ -7702,6 +7719,7 @@ ...@@ -7702,6 +7719,7 @@
"TARGET_SSE" "TARGET_SSE"
"%vstmxcsr\t%0" "%vstmxcsr\t%0"
[(set_attr "type" "sse") [(set_attr "type" "sse")
(set_attr "atom_sse_attr" "mxcsr")
(set_attr "prefix" "maybe_vex") (set_attr "prefix" "maybe_vex")
(set_attr "memory" "store")]) (set_attr "memory" "store")])
...@@ -7720,6 +7738,7 @@ ...@@ -7720,6 +7738,7 @@
"TARGET_SSE || TARGET_3DNOW_A" "TARGET_SSE || TARGET_3DNOW_A"
"sfence" "sfence"
[(set_attr "type" "sse") [(set_attr "type" "sse")
(set_attr "atom_sse_attr" "fence")
(set_attr "memory" "unknown")]) (set_attr "memory" "unknown")])
(define_insn "sse2_clflush" (define_insn "sse2_clflush"
...@@ -7728,6 +7747,7 @@ ...@@ -7728,6 +7747,7 @@
"TARGET_SSE2" "TARGET_SSE2"
"clflush\t%a0" "clflush\t%a0"
[(set_attr "type" "sse") [(set_attr "type" "sse")
(set_attr "atom_sse_attr" "fence")
(set_attr "memory" "unknown")]) (set_attr "memory" "unknown")])
(define_expand "sse2_mfence" (define_expand "sse2_mfence"
...@@ -7745,6 +7765,7 @@ ...@@ -7745,6 +7765,7 @@
"TARGET_64BIT || TARGET_SSE2" "TARGET_64BIT || TARGET_SSE2"
"mfence" "mfence"
[(set_attr "type" "sse") [(set_attr "type" "sse")
(set_attr "atom_sse_attr" "fence")
(set_attr "memory" "unknown")]) (set_attr "memory" "unknown")])
(define_expand "sse2_lfence" (define_expand "sse2_lfence"
...@@ -7762,6 +7783,7 @@ ...@@ -7762,6 +7783,7 @@
"TARGET_SSE2" "TARGET_SSE2"
"lfence" "lfence"
[(set_attr "type" "sse") [(set_attr "type" "sse")
(set_attr "atom_sse_attr" "lfence")
(set_attr "memory" "unknown")]) (set_attr "memory" "unknown")])
(define_insn "sse3_mwait" (define_insn "sse3_mwait"
...@@ -7885,6 +7907,7 @@ ...@@ -7885,6 +7907,7 @@
"TARGET_SSSE3" "TARGET_SSSE3"
"phaddw\t{%2, %0|%0, %2}" "phaddw\t{%2, %0|%0, %2}"
[(set_attr "type" "sseiadd") [(set_attr "type" "sseiadd")
(set_attr "atom_unit" "complex")
(set_attr "prefix_data16" "1") (set_attr "prefix_data16" "1")
(set_attr "prefix_extra" "1") (set_attr "prefix_extra" "1")
(set_attr "mode" "TI")]) (set_attr "mode" "TI")])
...@@ -7913,6 +7936,7 @@ ...@@ -7913,6 +7936,7 @@
"TARGET_SSSE3" "TARGET_SSSE3"
"phaddw\t{%2, %0|%0, %2}" "phaddw\t{%2, %0|%0, %2}"
[(set_attr "type" "sseiadd") [(set_attr "type" "sseiadd")
(set_attr "atom_unit" "complex")
(set_attr "prefix_extra" "1") (set_attr "prefix_extra" "1")
(set_attr "mode" "DI")]) (set_attr "mode" "DI")])
...@@ -7967,6 +7991,7 @@ ...@@ -7967,6 +7991,7 @@
"TARGET_SSSE3" "TARGET_SSSE3"
"phaddd\t{%2, %0|%0, %2}" "phaddd\t{%2, %0|%0, %2}"
[(set_attr "type" "sseiadd") [(set_attr "type" "sseiadd")
(set_attr "atom_unit" "complex")
(set_attr "prefix_data16" "1") (set_attr "prefix_data16" "1")
(set_attr "prefix_extra" "1") (set_attr "prefix_extra" "1")
(set_attr "mode" "TI")]) (set_attr "mode" "TI")])
...@@ -7987,6 +8012,7 @@ ...@@ -7987,6 +8012,7 @@
"TARGET_SSSE3" "TARGET_SSSE3"
"phaddd\t{%2, %0|%0, %2}" "phaddd\t{%2, %0|%0, %2}"
[(set_attr "type" "sseiadd") [(set_attr "type" "sseiadd")
(set_attr "atom_unit" "complex")
(set_attr "prefix_extra" "1") (set_attr "prefix_extra" "1")
(set_attr "mode" "DI")]) (set_attr "mode" "DI")])
...@@ -8073,6 +8099,7 @@ ...@@ -8073,6 +8099,7 @@
"TARGET_SSSE3" "TARGET_SSSE3"
"phaddsw\t{%2, %0|%0, %2}" "phaddsw\t{%2, %0|%0, %2}"
[(set_attr "type" "sseiadd") [(set_attr "type" "sseiadd")
(set_attr "atom_unit" "complex")
(set_attr "prefix_data16" "1") (set_attr "prefix_data16" "1")
(set_attr "prefix_extra" "1") (set_attr "prefix_extra" "1")
(set_attr "mode" "TI")]) (set_attr "mode" "TI")])
...@@ -8101,6 +8128,7 @@ ...@@ -8101,6 +8128,7 @@
"TARGET_SSSE3" "TARGET_SSSE3"
"phaddsw\t{%2, %0|%0, %2}" "phaddsw\t{%2, %0|%0, %2}"
[(set_attr "type" "sseiadd") [(set_attr "type" "sseiadd")
(set_attr "atom_unit" "complex")
(set_attr "prefix_extra" "1") (set_attr "prefix_extra" "1")
(set_attr "mode" "DI")]) (set_attr "mode" "DI")])
...@@ -8187,6 +8215,7 @@ ...@@ -8187,6 +8215,7 @@
"TARGET_SSSE3" "TARGET_SSSE3"
"phsubw\t{%2, %0|%0, %2}" "phsubw\t{%2, %0|%0, %2}"
[(set_attr "type" "sseiadd") [(set_attr "type" "sseiadd")
(set_attr "atom_unit" "complex")
(set_attr "prefix_data16" "1") (set_attr "prefix_data16" "1")
(set_attr "prefix_extra" "1") (set_attr "prefix_extra" "1")
(set_attr "mode" "TI")]) (set_attr "mode" "TI")])
...@@ -8215,6 +8244,7 @@ ...@@ -8215,6 +8244,7 @@
"TARGET_SSSE3" "TARGET_SSSE3"
"phsubw\t{%2, %0|%0, %2}" "phsubw\t{%2, %0|%0, %2}"
[(set_attr "type" "sseiadd") [(set_attr "type" "sseiadd")
(set_attr "atom_unit" "complex")
(set_attr "prefix_extra" "1") (set_attr "prefix_extra" "1")
(set_attr "mode" "DI")]) (set_attr "mode" "DI")])
...@@ -8269,6 +8299,7 @@ ...@@ -8269,6 +8299,7 @@
"TARGET_SSSE3" "TARGET_SSSE3"
"phsubd\t{%2, %0|%0, %2}" "phsubd\t{%2, %0|%0, %2}"
[(set_attr "type" "sseiadd") [(set_attr "type" "sseiadd")
(set_attr "atom_unit" "complex")
(set_attr "prefix_data16" "1") (set_attr "prefix_data16" "1")
(set_attr "prefix_extra" "1") (set_attr "prefix_extra" "1")
(set_attr "mode" "TI")]) (set_attr "mode" "TI")])
...@@ -8289,6 +8320,7 @@ ...@@ -8289,6 +8320,7 @@
"TARGET_SSSE3" "TARGET_SSSE3"
"phsubd\t{%2, %0|%0, %2}" "phsubd\t{%2, %0|%0, %2}"
[(set_attr "type" "sseiadd") [(set_attr "type" "sseiadd")
(set_attr "atom_unit" "complex")
(set_attr "prefix_extra" "1") (set_attr "prefix_extra" "1")
(set_attr "mode" "DI")]) (set_attr "mode" "DI")])
...@@ -8375,6 +8407,7 @@ ...@@ -8375,6 +8407,7 @@
"TARGET_SSSE3" "TARGET_SSSE3"
"phsubsw\t{%2, %0|%0, %2}" "phsubsw\t{%2, %0|%0, %2}"
[(set_attr "type" "sseiadd") [(set_attr "type" "sseiadd")
(set_attr "atom_unit" "complex")
(set_attr "prefix_data16" "1") (set_attr "prefix_data16" "1")
(set_attr "prefix_extra" "1") (set_attr "prefix_extra" "1")
(set_attr "mode" "TI")]) (set_attr "mode" "TI")])
...@@ -8403,6 +8436,7 @@ ...@@ -8403,6 +8436,7 @@
"TARGET_SSSE3" "TARGET_SSSE3"
"phsubsw\t{%2, %0|%0, %2}" "phsubsw\t{%2, %0|%0, %2}"
[(set_attr "type" "sseiadd") [(set_attr "type" "sseiadd")
(set_attr "atom_unit" "complex")
(set_attr "prefix_extra" "1") (set_attr "prefix_extra" "1")
(set_attr "mode" "DI")]) (set_attr "mode" "DI")])
...@@ -8509,6 +8543,7 @@ ...@@ -8509,6 +8543,7 @@
"TARGET_SSSE3" "TARGET_SSSE3"
"pmaddubsw\t{%2, %0|%0, %2}" "pmaddubsw\t{%2, %0|%0, %2}"
[(set_attr "type" "sseiadd") [(set_attr "type" "sseiadd")
(set_attr "atom_unit" "simul")
(set_attr "prefix_data16" "1") (set_attr "prefix_data16" "1")
(set_attr "prefix_extra" "1") (set_attr "prefix_extra" "1")
(set_attr "mode" "TI")]) (set_attr "mode" "TI")])
...@@ -8547,6 +8582,7 @@ ...@@ -8547,6 +8582,7 @@
"TARGET_SSSE3" "TARGET_SSSE3"
"pmaddubsw\t{%2, %0|%0, %2}" "pmaddubsw\t{%2, %0|%0, %2}"
[(set_attr "type" "sseiadd") [(set_attr "type" "sseiadd")
(set_attr "atom_unit" "simul")
(set_attr "prefix_extra" "1") (set_attr "prefix_extra" "1")
(set_attr "mode" "DI")]) (set_attr "mode" "DI")])
...@@ -8754,6 +8790,7 @@ ...@@ -8754,6 +8790,7 @@
return "palignr\t{%3, %2, %0|%0, %2, %3}"; return "palignr\t{%3, %2, %0|%0, %2, %3}";
} }
[(set_attr "type" "sseishft") [(set_attr "type" "sseishft")
(set_attr "atom_unit" "sishuf")
(set_attr "prefix_data16" "1") (set_attr "prefix_data16" "1")
(set_attr "prefix_extra" "1") (set_attr "prefix_extra" "1")
(set_attr "mode" "TI")]) (set_attr "mode" "TI")])
...@@ -8770,6 +8807,7 @@ ...@@ -8770,6 +8807,7 @@
return "palignr\t{%3, %2, %0|%0, %2, %3}"; return "palignr\t{%3, %2, %0|%0, %2, %3}";
} }
[(set_attr "type" "sseishft") [(set_attr "type" "sseishft")
(set_attr "atom_unit" "sishuf")
(set_attr "prefix_extra" "1") (set_attr "prefix_extra" "1")
(set_attr "mode" "DI")]) (set_attr "mode" "DI")])
...@@ -8956,7 +8994,7 @@ ...@@ -8956,7 +8994,7 @@
UNSPEC_MOVNTDQA))] UNSPEC_MOVNTDQA))]
"TARGET_SSE4_1" "TARGET_SSE4_1"
"%vmovntdqa\t{%1, %0|%0, %1}" "%vmovntdqa\t{%1, %0|%0, %1}"
[(set_attr "type" "ssecvt") [(set_attr "type" "ssemov")
(set_attr "prefix_extra" "1") (set_attr "prefix_extra" "1")
(set_attr "prefix" "maybe_vex") (set_attr "prefix" "maybe_vex")
(set_attr "mode" "TI")]) (set_attr "mode" "TI")])
......