[NDS32] Add hard float support.

gcc/
	* config.gcc (nds32*-*-*): Add v2j v3f v3s checking.
	(nds32*-*-*): Add float and fpu_config into supported_defaults.
	* common/config/nds32/nds32-common.c (TARGET_DEFAULT_TARGET_FLAGS):
	Include TARGET_DEFAULT_FPU_ISA and TARGET_DEFAULT_FPU_FMA.
	* config/nds32/constants.md (unspec_element): Add UNSPEC_COPYSIGN,
	UNSPEC_FCPYNSD, UNSPEC_FCPYNSS, UNSPEC_FCPYSD and UNSPEC_FCPYSS.
	* config/nds32/constraints.md: New constraints and checking for hard
	float configuration.
	* config/nds32/iterators.md: New mode iterator and attribute for hard
	float configuration.
	* config/nds32/nds32-doubleword.md: Use hard float alternatives and
	patterns.
	* config/nds32/nds32-fpu.md: New file.
	* config/nds32/nds32-md-auxiliary.c: New functions and checkings to
	deal with hard float code generation.
	* config/nds32/nds32-opts.h (nds32_arch_type): Add ARCH_V3F and
	ARCH_V3S.
	(abi_type, float_reg_number): New enum type.
	* config/nds32/nds32-predicates.c: New predicates for hard float.
	* config/nds32/nds32-protos.h: Declare functions for hard float.
	* config/nds32/nds32.c: Implementation for hard float configuration.
	* config/nds32/nds32.h: Definitions for hard float configuration.
	* config/nds32/nds32.md: Include hard float machine description and
	modify patterns for hard float configuration.
	* config/nds32/nds32.opt: New options for hard float configuration.
	* config/nds32/predicates.md: New predicates for hard float
	configuration.

Co-Authored-By: Chung-Ju Wu <jasonwucj@gmail.com>

From-SVN: r259161

Commit e2286268, authored Apr 06, 2018 by Monk Chiang; committed by Chung-Ju Wu, Apr 06, 2018
17 changed files
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
+2018-04-06  Monk Chiang  <sh.chiang04@gmail.com>
+	    Chung-Ju Wu  <jasonwucj@gmail.com>
+	* config.gcc (nds32*-*-*): Add v2j v3f v3s checking.
+	(nds32*-*-*): Add float and fpu_config into supported_defaults.
+	* common/config/nds32/nds32-common.c (TARGET_DEFAULT_TARGET_FLAGS):
+	Include TARGET_DEFAULT_FPU_ISA and TARGET_DEFAULT_FPU_FMA.
+	* config/nds32/constants.md (unspec_element): Add UNSPEC_COPYSIGN,
+	UNSPEC_FCPYNSD, UNSPEC_FCPYNSS, UNSPEC_FCPYSD and UNSPEC_FCPYSS.
+	* config/nds32/constraints.md: New constraints and checking for hard
+	float configuration.
+	* config/nds32/iterators.md: New mode iterator and attribute for hard
+	float configuration.
+	* config/nds32/nds32-doubleword.md: Use hard float alternatives and
+	patterns.
+	* config/nds32/nds32-fpu.md: New file.
+	* config/nds32/nds32-md-auxiliary.c: New functions and checkings to
+	deal with hard float code generation.
+	* config/nds32/nds32-opts.h (nds32_arch_type): Add ARCH_V3F and
+	ARCH_V3S.
+	(abi_type, float_reg_number): New enum type.
+	* config/nds32/nds32-predicates.c: New predicates for hard float.
+	* config/nds32/nds32-protos.h: Declare functions for hard float.
+	* config/nds32/nds32.c: Implementation for hard float configuration.
+	* config/nds32/nds32.h: Definitions for hard float configuration.
+	* config/nds32/nds32.md: Include hard float machine description and
+	modify patterns for hard float configuration.
+	* config/nds32/nds32.opt: New options for hard float configuration.
+	* config/nds32/predicates.md: New predicates for hard float
+	configuration.
 2018-04-06  Kuan-Lin Chen  <kuanlinchentw@gmail.com>
 	* common/config/nds32/nds32-common.c

--- a/gcc/common/config/nds32/nds32-common.c
+++ b/gcc/common/config/nds32/nds32-common.c
@@ -107,6 +107,8 @@ static const struct default_options nds32_option_optimization_table[] =
 #undef TARGET_DEFAULT_TARGET_FLAGS
 #define TARGET_DEFAULT_TARGET_FLAGS		\
  (TARGET_CPU_DEFAULT				\
+   | TARGET_DEFAULT_FPU_ISA			\
+   | TARGET_DEFAULT_FPU_FMA			\
   | MASK_16_BIT				\
   | MASK_EXT_PERF				\
   | MASK_EXT_PERF2				\

--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -4278,15 +4278,26 @@ case "${target}" in
 		;;
 	nds32*-*-*)
-		supported_defaults="arch cpu nds32_lib"
+		supported_defaults="arch cpu nds32_lib float fpu_config"
 		# process --with-arch
 		case "${with_arch}" in
-		"" | v2 | v3 | v3m)
+		"" | v3 )
+			tm_defines="${tm_defines} TARGET_ARCH_DEFAULT=0"
+			;;
+		v2 | v2j | v3m)
 			# OK
+			tm_defines="${tm_defines} TARGET_ARCH_DEFAULT=0"
+			;;
+		v3f)
+			tm_defines="${tm_defines} TARGET_ARCH_DEFAULT=1"
+			;;
+		v3s)
+			tm_defines="${tm_defines} TARGET_ARCH_DEFAULT=2"
 			;;
 		*)
-			echo "Cannot accept --with-arch=$with_arch, available values are: v2 v3 v3m" 1>&2
+			echo "Cannot accept --with-arch=$with_arch, available values are: v2 v2j v3 v3m v3f v3s" 1>&2
 			exit 1
 			;;
 		esac
@@ -4321,8 +4332,31 @@ case "${target}" in
 			exit 1
 			;;
 		esac
-		;;
+		# process --with-float
+		case "${with_float}" in
+		"" | soft | hard)
+			# OK
+			;;
+		*)
+			echo "Cannot accept --with-float=$with_float, available values are: soft hard" 1>&2
+			exit 1
+			;;
+		esac
+		# process --with-config-fpu
+		case "${with_config_fpu}" in
+		"" | 0 | 1 | 2 | 3)
+			# OK
+			;;
+		*)
+			echo "Cannot accept --with-config-fpu=$with_config_fpu, available values are: 0 1 2 3" 1>&2
+			exit 1
+			;;
+		esac
+		;;
 	nios2*-*-*)
 		supported_defaults="arch"
 			case "$with_arch" in
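
The --with-arch handling in the hunk above amounts to a small lookup from the configure argument to a TARGET_ARCH_DEFAULT value. A minimal C sketch of that mapping follows; it is an illustration only, not code from this commit, and the function name arch_default_for and the -1 "rejected" return value are assumptions.

```c
#include <string.h>

/* Hypothetical helper mirroring the config.gcc case statement:
   returns the TARGET_ARCH_DEFAULT value selected for a --with-arch
   argument, or -1 where configure would print an error and exit.  */
int
arch_default_for (const char *with_arch)
{
  /* "" (option not given), v3, v2, v2j and v3m all select 0.  */
  static const char *const plain[] = { "", "v3", "v2", "v2j", "v3m" };
  size_t i;

  for (i = 0; i < sizeof plain / sizeof plain[0]; i++)
    if (strcmp (with_arch, plain[i]) == 0)
      return 0;

  if (strcmp (with_arch, "v3f") == 0)
    return 1;
  if (strcmp (with_arch, "v3s") == 0)
    return 2;

  /* Anything else is rejected by configure.  */
  return -1;
}
```

Only v3f and v3s change the default; every other accepted value keeps TARGET_ARCH_DEFAULT at 0.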

--- a/gcc/config/nds32/constants.md
+++ b/gcc/config/nds32/constants.md
@@ -32,6 +32,11 @@
 ;; The unpec operation index.
 (define_c_enum "unspec_element" [
+  UNSPEC_COPYSIGN
+  UNSPEC_FCPYNSD
+  UNSPEC_FCPYNSS
+  UNSPEC_FCPYSD
+  UNSPEC_FCPYSS
  UNSPEC_FFB
  UNSPEC_FFMISM
  UNSPEC_FLMISM

--- a/gcc/config/nds32/constraints.md
+++ b/gcc/config/nds32/constraints.md
@@ -53,6 +53,10 @@
 (define_register_constraint "x" "FRAME_POINTER_REG"
  "Frame pointer register $fp")
+(define_register_constraint "f"
+  "(TARGET_FPU_SINGLE || TARGET_FPU_DOUBLE) ? FP_REGS : NO_REGS"
+ "The Floating point registers $fs0 ~ $fs31")
 (define_constraint "Iv00"
  "Constant value 0"
  (and (match_code "const_int")
@@ -108,6 +112,11 @@
  (and (match_code "const_int")
       (match_test "ival < (1 << 4) && ival >= -(1 << 4)")))
+(define_constraint "Cs05"
+  "Signed immediate 5-bit value"
+  (and (match_code "const_double")
+       (match_test "nds32_const_double_range_ok_p (op, SFmode, -(1 << 4), (1 << 4))")))
 (define_constraint "Iu05"
  "Unsigned immediate 5-bit value"
  (and (match_code "const_int")
@@ -246,12 +255,21 @@
  (and (match_code "const_int")
       (match_test "ival < (1 << 19) && ival >= -(1 << 19)")))
+(define_constraint "Cs20"
+  "Signed immediate 20-bit value"
+  (and (match_code "const_double")
+       (match_test "nds32_const_double_range_ok_p (op, SFmode, -(1 << 19), (1 << 19))")))
 (define_constraint "Ihig"
  "The immediate value that can be simply set high 20-bit"
  (and (match_code "const_int")
       (match_test "(ival != 0) && ((ival & 0xfff) == 0)")))
+(define_constraint "Chig"
+  "The immediate value that can be simply set high 20-bit"
+  (and (match_code "high")
+       (match_test "GET_CODE (XEXP (op, 0)) == CONST_DOUBLE")))
 (define_constraint "Izeb"
  "The immediate value 0xff"
  (and (match_code "const_int")
@@ -296,25 +314,39 @@
  "Memory constraint for 45 format"
  (and (match_code "mem")
       (match_test "(nds32_mem_format (op) == ADDRESS_REG)
-		    && (GET_MODE (op) == SImode)")))
+		    && ((GET_MODE (op) == SImode)
+		       || (GET_MODE (op) == SFmode))")))
 (define_memory_constraint "Ufe"
  "Memory constraint for fe format"
  (and (match_code "mem")
       (match_test "nds32_mem_format (op) == ADDRESS_R8_IMM7U
-		    && (GET_MODE (op) == SImode)")))
+		    && (GET_MODE (op) == SImode
+			|| GET_MODE (op) == SFmode)")))
 (define_memory_constraint "U37"
  "Memory constraint for 37 format"
  (and (match_code "mem")
       (match_test "(nds32_mem_format (op) == ADDRESS_SP_IMM7U
 		    || nds32_mem_format (op) == ADDRESS_FP_IMM7U)
-		    && (GET_MODE (op) == SImode)")))
+		    && (GET_MODE (op) == SImode
+			|| GET_MODE (op) == SFmode)")))
 (define_memory_constraint "Umw"
  "Memory constraint for lwm/smw"
  (and (match_code "mem")
       (match_test "nds32_valid_smw_lwm_base_p (op)")))
+(define_memory_constraint "Da"
+  "Memory constraint for non-offset loads/stores"
+  (and (match_code "mem")
+       (match_test "REG_P (XEXP (op, 0))
+		    || (GET_CODE (XEXP (op, 0)) == POST_INC)")))
+(define_memory_constraint "Q"
+  "Memory constraint for no symbol_ref and const"
+  (and (match_code "mem")
+       (match_test "(TARGET_FPU_SINGLE || TARGET_FPU_DOUBLE)
+		     && nds32_float_mem_operand_p (op)")))
 ;; ------------------------------------------------------------------------
--- a/gcc/config/nds32/iterators.md
+++ b/gcc/config/nds32/iterators.md
@@ -45,11 +45,15 @@
 (define_mode_iterator VSQIHIDI [V4QI V2HI QI HI DI])
 (define_mode_iterator VQIHIDI [V4QI V2HI DI])
+;; A list of the modes that are up to double-word long.
+(define_mode_iterator ANYF [(SF "TARGET_FPU_SINGLE")
+			    (DF "TARGET_FPU_DOUBLE")])
 ;;----------------------------------------------------------------------------
 ;; Mode attributes.
 ;;----------------------------------------------------------------------------
-(define_mode_attr size [(QI "b") (HI "h") (SI "w")])
+(define_mode_attr size [(QI "b") (HI "h") (SI "w") (SF "s") (DF "d")])
 (define_mode_attr byte [(QI "1") (HI "2") (SI "4") (V4QI "4") (V2HI "4")])

--- a/gcc/config/nds32/nds32-doubleword.md
+++ b/gcc/config/nds32/nds32-doubleword.md
@@ -46,145 +46,77 @@
 (define_insn "move_<mode>"
-  [(set (match_operand:DIDF 0 "nonimmediate_operand" "=r, r, r, m")
+  [(set (match_operand:DIDF 0 "nonimmediate_operand" "=r, r,  r, r, Da, m, f, Q, f, *r, *f")
-	(match_operand:DIDF 1 "general_operand"      " r, i, m, r"))]
+	(match_operand:DIDF 1 "general_operand"      " r, i, Da, m,  r, r, Q, f, f, *f, *r"))]
  "register_operand(operands[0], <MODE>mode)
   || register_operand(operands[1], <MODE>mode)"
 {
-  rtx addr;
-  rtx otherops[5];
  switch (which_alternative)
    {
    case 0:
      return "movd44\t%0, %1";
    case 1:
      /* reg <- const_int, we ask gcc to split instruction.  */
      return "#";
    case 2:
-      /* Refer to nds32_legitimate_address_p() in nds32.c,
+      /* The memory format is (mem (reg)),
-         we only allow "reg", "symbol_ref", "const", and "reg + const_int"
+	 we can generate 'lmw.bi' instruction.  */
-         as address rtx for DImode/DFmode memory access.  */
+      return nds32_output_double (operands, true);
-      addr = XEXP (operands[1], 0);
-      otherops[0] = gen_rtx_REG (SImode, REGNO (operands[0]));
-      otherops[1] = gen_rtx_REG (SImode, REGNO (operands[0]) + 1);
-      otherops[2] = addr;
-      if (REG_P (addr))
-	{
-	  /* (reg) <- (mem (reg)) */
-	  output_asm_insn ("lmw.bi\t%0, [%2], %1, 0", otherops);
-	}
-      else if (GET_CODE (addr) == PLUS)
-	{
-	  /* (reg) <- (mem (plus (reg) (const_int))) */
-	  rtx op0 = XEXP (addr, 0);
-	  rtx op1 = XEXP (addr, 1);
-	  if (REG_P (op0))
-	    {
-	      otherops[2] = op0;
-	      otherops[3] = op1;
-	      otherops[4] = gen_int_mode (INTVAL (op1) + 4, SImode);
-	    }
-	  else
-	    {
-	      otherops[2] = op1;
-	      otherops[3] = op0;
-	      otherops[4] = gen_int_mode (INTVAL (op0) + 4, SImode);
-	    }
-	  /* To avoid base overwrite when REGNO(%0) == REGNO(%2).  */
-	  if (REGNO (otherops[0]) != REGNO (otherops[2]))
-	    {
-	      output_asm_insn ("lwi\t%0, [%2 + (%3)]", otherops);
-	      output_asm_insn ("lwi\t%1, [%2 + (%4)]", otherops);
-	    }
-	  else
-	    {
-	      output_asm_insn ("lwi\t%1, [%2 + (%4)]", otherops);
-	      output_asm_insn ("lwi\t%0,[ %2 + (%3)]", otherops);
-	    }
-	}
-      else
-	{
-	  /* (reg) <- (mem (symbol_ref ...))
-	     (reg) <- (mem (const ...)) */
-	  output_asm_insn ("lwi.gp\t%0, [ + %2]", otherops);
-	  output_asm_insn ("lwi.gp\t%1, [ + %2 + 4]", otherops);
-	}
-      /* We have already used output_asm_insn() by ourself,
-         so return an empty string.  */
-      return "";
    case 3:
-      /* Refer to nds32_legitimate_address_p() in nds32.c,
+      /* There is no 64-bit load instruction,
-         we only allow "reg", "symbol_ref", "const", and "reg + const_int"
+	 so we split this pattern into two SImode patterns.  */
-         as address rtx for DImode/DFmode memory access.  */
+      return "#";
-      addr = XEXP (operands[0], 0);
+    case 4:
+      /* The memory format is (mem (reg)),
-      otherops[0] = gen_rtx_REG (SImode, REGNO (operands[1]));
+	 we can generate 'smw.bi' instruction.  */
-      otherops[1] = gen_rtx_REG (SImode, REGNO (operands[1]) + 1);
+      return nds32_output_double (operands, false);
-      otherops[2] = addr;
+    case 5:
+      /* There is no 64-bit store instruction,
-      if (REG_P (addr))
+	 so we split this pattern into two SImode patterns.  */
-	{
+      return "#";
-	  /* (mem (reg)) <- (reg) */
+    case 6:
-	  output_asm_insn ("smw.bi\t%0, [%2], %1, 0", otherops);
+      return nds32_output_float_load (operands);
-	}
+    case 7:
-      else if (GET_CODE (addr) == PLUS)
+      return nds32_output_float_store (operands);
-	{
+    case 8:
-	  /* (mem (plus (reg) (const_int))) <- (reg) */
+      return "fcpysd\t%0, %1, %1";
-	  rtx op0 = XEXP (addr, 0);
+    case 9:
-	  rtx op1 = XEXP (addr, 1);
+      return "fmfdr\t%0, %1";
+    case 10:
-	  if (REG_P (op0))
+      return "fmtdr\t%1, %0";
-	    {
-	      otherops[2] = op0;
-	      otherops[3] = op1;
-	      otherops[4] = gen_int_mode (INTVAL (op1) + 4, SImode);
-	    }
-	  else
-	    {
-	      otherops[2] = op1;
-	      otherops[3] = op0;
-	      otherops[4] = gen_int_mode (INTVAL (op0) + 4, SImode);
-	    }
-	  /* To avoid base overwrite when REGNO(%0) == REGNO(%2).  */
-	  if (REGNO (otherops[0]) != REGNO (otherops[2]))
-	    {
-	      output_asm_insn ("swi\t%0, [%2 + (%3)]", otherops);
-	      output_asm_insn ("swi\t%1, [%2 + (%4)]", otherops);
-	    }
-	  else
-	    {
-	      output_asm_insn ("swi\t%1, [%2 + (%4)]", otherops);
-	      output_asm_insn ("swi\t%0, [%2 + (%3)]", otherops);
-	    }
-	}
-      else
-	{
-	  /* (mem (symbol_ref ...)) <- (reg)
-	     (mem (const ...))      <- (reg) */
-	  output_asm_insn ("swi.gp\t%0, [ + %2]", otherops);
-	  output_asm_insn ("swi.gp\t%1, [ + %2 + 4]", otherops);
-	}
-      /* We have already used output_asm_insn() by ourself,
-         so return an empty string.  */
-      return "";
    default:
      gcc_unreachable ();
    }
 }
-  [(set_attr "type"   "alu,alu,alu,alu")
+  [(set_attr "type"    "alu,alu,load,load,store,store,fload,fstore,fcpy,fmfdr,fmtdr")
-   (set_attr "length" "  4, 16,  8,  8")])
+   (set_attr_alternative "length"
+     [
+       ;; Alternative 0
+       (if_then_else (match_test "!TARGET_16_BIT")
+		     (const_int 4)
+		     (const_int 2))
+       ;; Alternative 1
+       (const_int 16)
+       ;; Alternative 2
+       (const_int 4)
+       ;; Alternative 3
+       (const_int 8)
+       ;; Alternative 4
+       (const_int 4)
+       ;; Alternative 5
+       (const_int 8)
+       ;; Alternative 6
+       (const_int 4)
+       ;; Alternative 7
+       (const_int 4)
+       ;; Alternative 8
+       (const_int 4)
+       ;; Alternative 9
+       (const_int 4)
+       ;; Alternative 10
+       (const_int 4)
+     ])
+   (set_attr "feature" " v1, v1,  v1,  v1,   v1,   v1,    fpu,    fpu,    fpu,    fpu,    fpu")])
 (define_split
  [(set (match_operand:DIDF 0 "register_operand"     "")
@@ -208,7 +140,12 @@
  /* Actually we would like to create move behavior by ourself.
     So that movsi expander could have chance to split large constant.  */
  emit_move_insn (operands[2], operands[3]);
-  emit_move_insn (operands[4], operands[5]);
+  unsigned HOST_WIDE_INT mask = GET_MODE_MASK (SImode);
+  if ((UINTVAL (operands[3]) & mask) == (UINTVAL (operands[5]) & mask))
+    emit_move_insn (operands[4], operands[2]);
+  else
+    emit_move_insn (operands[4], operands[5]);
  DONE;
 })
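
The hunk above avoids materializing the same 32-bit constant twice: when both SImode halves of the double-word constant are equal under the SImode mask, the second move copies the register written by the first move. A minimal C sketch of that decision follows, assuming a plain 64-bit integer stands in for the constant; the names split_plan and plan_const_split are hypothetical, and which half is moved first is not modeled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the split: a 64-bit immediate is moved as
   two 32-bit pieces.  */
struct split_plan
{
  uint32_t piece_a;    /* one 32-bit half of the constant */
  uint32_t piece_b;    /* the other 32-bit half */
  bool reuse_first;    /* true: second move copies the first register */
};

struct split_plan
plan_const_split (uint64_t value)
{
  struct split_plan p;

  p.piece_a = (uint32_t) (value & UINT64_C (0xffffffff));
  p.piece_b = (uint32_t) (value >> 32);
  /* Same comparison as the patch: equal halves mean the second
     move can be a register copy instead of a new constant load.  */
  p.reuse_first = (p.piece_a == p.piece_b);
  return p;
}
```

For example, 0x1234567812345678 yields two equal halves, so only one constant needs to be built.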
@@ -218,7 +155,9 @@
  [(set (match_operand:DIDF 0 "register_operand" "")
 	(match_operand:DIDF 1 "register_operand" ""))]
  "reload_completed
-   && (TARGET_ISA_V2 || !TARGET_16_BIT)"
+   && (TARGET_ISA_V2 || !TARGET_16_BIT)
+   && NDS32_IS_GPR_REGNUM (REGNO (operands[0]))
+   && NDS32_IS_GPR_REGNUM (REGNO (operands[1]))"
  [(set (match_dup 0) (match_dup 1))
   (set (match_dup 2) (match_dup 3))]
 {
@@ -240,6 +179,28 @@
    }
 })
+(define_split
+  [(set (match_operand:DIDF 0 "nds32_general_register_operand" "")
+	(match_operand:DIDF 1 "memory_operand" ""))]
+  "reload_completed
+   && nds32_split_double_word_load_store_p (operands, true)"
+  [(set (match_dup 2) (match_dup 3))
+   (set (match_dup 4) (match_dup 5))]
+{
+  nds32_spilt_doubleword (operands, true);
+})
+(define_split
+  [(set (match_operand:DIDF 0  "memory_operand" "")
+	(match_operand:DIDF 1  "nds32_general_register_operand" ""))]
+  "reload_completed
+   && nds32_split_double_word_load_store_p (operands, false)"
+  [(set (match_dup 2) (match_dup 3))
+   (set (match_dup 4) (match_dup 5))]
+{
+  nds32_spilt_doubleword (operands, false);
+})
 ;; -------------------------------------------------------------
 ;; Boolean DImode instructions.
 ;; -------------------------------------------------------------

--- a/gcc/config/nds32/nds32-fpu.md
+++ b/gcc/config/nds32/nds32-fpu.md
--- a/gcc/config/nds32/nds32-md-auxiliary.c
+++ b/gcc/config/nds32/nds32-md-auxiliary.c
--- a/gcc/config/nds32/nds32-opts.h
+++ b/gcc/config/nds32/nds32-opts.h
@@ -29,7 +29,9 @@ enum nds32_arch_type
 {
  ARCH_V2,
  ARCH_V3,
-  ARCH_V3M
+  ARCH_V3M,
+  ARCH_V3F,
+  ARCH_V3S
 };
 /* The code model defines the address generation strategy.  */
@@ -46,4 +48,24 @@ enum nds32_cpu_type
  CPU_N9
 };
+/* Which ABI to use.  */
+enum abi_type
+{
+  NDS32_ABI_V2,
+  NDS32_ABI_V2_FP_PLUS
+};
+/* The various FPU number of registers.  */
+enum float_reg_number
+{
+  NDS32_CONFIG_FPU_0,
+  NDS32_CONFIG_FPU_1,
+  NDS32_CONFIG_FPU_2,
+  NDS32_CONFIG_FPU_3,
+  NDS32_CONFIG_FPU_4,
+  NDS32_CONFIG_FPU_5,
+  NDS32_CONFIG_FPU_6,
+  NDS32_CONFIG_FPU_7
+};
 #endif
--- a/gcc/config/nds32/nds32-predicates.c
+++ b/gcc/config/nds32/nds32-predicates.c
@@ -448,4 +448,71 @@ nds32_symbol_load_store_p (rtx_insn *insn)
  return false;
 }
+/* Valid memory operand for floating-point loads and stores.  */
+bool
+nds32_float_mem_operand_p (rtx op)
+{
+  machine_mode mode = GET_MODE (op);
+  rtx addr = XEXP (op, 0);
+  /* [symbol] and [const] memory addresses are not supported.  */
+  if (GET_CODE (addr) == SYMBOL_REF
+      || GET_CODE (addr) == CONST
+      || GET_CODE (addr) == LO_SUM)
+    return false;
+  if (GET_CODE (addr) == PLUS)
+    {
+      if (GET_CODE (XEXP (addr, 0)) == SYMBOL_REF)
+	return false;
+      /* Restrict const range: (imm12s << 2) */
+      if (GET_CODE (XEXP (addr, 1)) == CONST_INT)
+	{
+	  if ((mode == SImode || mode == SFmode)
+	      && NDS32_SINGLE_WORD_ALIGN_P (INTVAL (XEXP (addr, 1)))
+	      && !satisfies_constraint_Is14 ( XEXP(addr, 1)))
+	    return false;
+	  if ((mode == DImode || mode == DFmode)
+	      && NDS32_DOUBLE_WORD_ALIGN_P (INTVAL (XEXP (addr, 1)))
+	      && !satisfies_constraint_Is14 (XEXP (addr, 1)))
+	    return false;
+	}
+    }
+  return true;
+}
+int
+nds32_cond_move_p (rtx cmp_rtx)
+{
+  machine_mode cmp0_mode = GET_MODE (XEXP (cmp_rtx, 0));
+  machine_mode cmp1_mode = GET_MODE (XEXP (cmp_rtx, 1));
+  enum rtx_code cond = GET_CODE (cmp_rtx);
+  if ((cmp0_mode == DFmode || cmp0_mode == SFmode)
+      && (cmp1_mode == DFmode || cmp1_mode == SFmode)
+      && (cond == ORDERED || cond == UNORDERED))
+    return true;
+  return false;
+}
+bool
+nds32_const_double_range_ok_p (rtx op, machine_mode mode,
+			       HOST_WIDE_INT lower, HOST_WIDE_INT upper)
+{
+  if (GET_CODE (op) != CONST_DOUBLE
+      || GET_MODE (op) != mode)
+    return false;
+  const REAL_VALUE_TYPE *rv;
+  long val;
+  rv = CONST_DOUBLE_REAL_VALUE (op);
+  REAL_VALUE_TO_TARGET_SINGLE (*rv, val);
+  return val >= lower && val < upper;
+}
 /* ------------------------------------------------------------------------ */
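
nds32_const_double_range_ok_p, used by the Cs05 and Cs20 constraints, range-checks the target bit image of a single-precision constant rather than its numeric value. A minimal C sketch of that idea follows. It assumes the host float is IEEE-754 binary32 and treats the image as a signed 32-bit integer, which may differ from what REAL_VALUE_TO_TARGET_SINGLE yields for constants with the sign bit set; the function name float_image_in_range is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the range test: reinterpret the float's
   bits as an integer and check lower <= image < upper.
   Assumes the host float format is IEEE-754 binary32.  */
bool
float_image_in_range (float f, long lower, long upper)
{
  int32_t image;

  /* memcpy is the portable way to read the bit pattern.  */
  memcpy (&image, &f, sizeof image);
  return image >= lower && image < upper;
}
```

Under these assumptions 0.0f (image 0) fits the 5-bit range, while 1.0f (image 0x3f800000) fits neither the 5-bit nor the 20-bit range.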
--- a/gcc/config/nds32/nds32-protos.h
+++ b/gcc/config/nds32/nds32-protos.h
@@ -58,6 +58,13 @@ extern void nds32_expand_prologue (void);
 extern void nds32_expand_epilogue (bool);
 extern void nds32_expand_prologue_v3push (void);
 extern void nds32_expand_epilogue_v3pop (bool);
+extern void nds32_emit_push_fpr_callee_saved (int);
+extern void nds32_emit_pop_fpr_callee_saved (int);
+extern void nds32_emit_v3pop_fpr_callee_saved (int);
+/* Controlling Debugging Information Format.  */
+extern unsigned int nds32_dbx_register_number (unsigned int);
 /* ------------------------------------------------------------------------ */
@@ -101,6 +108,9 @@ extern int nds32_can_use_btgl_p (int);
 extern int nds32_can_use_bitci_p (int);
+extern bool nds32_const_double_range_ok_p (rtx, machine_mode,
+					   HOST_WIDE_INT, HOST_WIDE_INT);
 /* Auxiliary function for 'Computing the Length of an Insn'.  */
 extern int nds32_adjust_insn_length (rtx_insn *, int);
@@ -120,19 +130,30 @@ extern const char *nds32_output_casesi (rtx *);
 extern enum nds32_expand_result_type nds32_expand_cbranch (rtx *);
 extern enum nds32_expand_result_type nds32_expand_cstore (rtx *);
+extern void nds32_expand_float_cbranch (rtx *);
+extern void nds32_expand_float_cstore (rtx *);
 /* Auxiliary functions for conditional move generation.  */
 extern enum nds32_expand_result_type nds32_expand_movcc (rtx *);
+extern void nds32_expand_float_movcc (rtx *);
 /* Auxiliary functions to identify long-call symbol.  */
 extern bool nds32_long_call_p (rtx);
+/* Auxiliary functions to identify conditional move comparison operand.  */
+extern int nds32_cond_move_p (rtx);
 /* Auxiliary functions to identify 16 bit addresing mode.  */
 extern enum nds32_16bit_address_type nds32_mem_format (rtx);
+/* Auxiliary functions to identify floating-point addresing mode.  */
+extern bool nds32_float_mem_operand_p (rtx);
 /* Auxiliary functions to output assembly code.  */
 extern const char *nds32_output_16bit_store (rtx *, int);
@@ -140,8 +161,11 @@ extern const char *nds32_output_16bit_load (rtx *, int);
 extern const char *nds32_output_32bit_store (rtx *, int);
 extern const char *nds32_output_32bit_load (rtx *, int);
 extern const char *nds32_output_32bit_load_s (rtx *, int);
+extern const char *nds32_output_float_load(rtx *);
+extern const char *nds32_output_float_store(rtx *);
 extern const char *nds32_output_smw_single_word (rtx *);
 extern const char *nds32_output_lmw_single_word (rtx *);
+extern const char *nds32_output_double (rtx *, bool);
 extern const char *nds32_output_cbranchsi4_equality_zero (rtx_insn *, rtx *);
 extern const char *nds32_output_cbranchsi4_equality_reg (rtx_insn *, rtx *);
 extern const char *nds32_output_cbranchsi4_equality_reg_or_const_int (rtx_insn *,
@@ -154,6 +178,10 @@ extern const char *nds32_output_cbranchsi4_greater_less_zero (rtx_insn *, rtx *)
 extern const char *nds32_output_stack_push (rtx);
 extern const char *nds32_output_stack_pop (rtx);
+/* Auxiliary functions to split double word RTX pattern.  */
+extern void nds32_spilt_doubleword (rtx *, bool);
 /* Auxiliary functions to split large constant RTX pattern.  */
 extern void nds32_expand_constant (machine_mode,
@@ -190,6 +218,8 @@ extern int nds32_address_cost_impl (rtx, machine_mode, addr_space_t, bool);
 /* Auxiliary functions for pre-define marco.  */
 extern void nds32_cpu_cpp_builtins(struct cpp_reader *);
+extern bool nds32_split_double_word_load_store_p (rtx *,bool);
 /* Functions for create nds32 specific optimization pass.  */
 extern rtl_opt_pass *make_pass_nds32_relax_opt (gcc::context *);

--- a/gcc/config/nds32/nds32.c
+++ b/gcc/config/nds32/nds32.c
--- a/gcc/config/nds32/nds32.h
+++ b/gcc/config/nds32/nds32.h
--- a/gcc/config/nds32/nds32.md
+++ b/gcc/config/nds32/nds32.md
@@ -46,13 +46,17 @@
 ;; Include DImode/DFmode operations.
 (include "nds32-doubleword.md")
+;; Include floating-point patterns.
+(include "nds32-fpu.md")
 ;; Include peephole patterns.
 (include "nds32-peephole2.md")
 ;; Insn type, it is used to default other attribute values.
 (define_attr "type"
-  "unknown,load,store,load_multiple,store_multiple,alu,alu_shift,mul,mac,div,branch,call,misc"
+  "unknown,load,store,load_multiple,store_multiple,alu,alu_shift,mul,mac,div,branch,call,misc,\
+   falu,fmuls,fmuld,fmacs,fmacd,fdivs,fdivd,fsqrts,fsqrtd,fcmp,fabs,fcpy,fcmov,fmfsr,fmfdr,fmtsr,fmtdr,fload,fstore"
  (const_string "unknown"))
 ;; Insn sub-type
@@ -77,7 +81,7 @@
 ;; pe2 : Performance Extension Version 2 Instructions
 ;; se  : String Extension instructions
 (define_attr "feature"
-  "v1,v2,v3m,v3,pe1,pe2,se"
+  "v1,v2,v3m,v3,pe1,pe2,se,fpu"
  (const_string "v1"))
 ;; Enabled, which is used to enable/disable insn alternatives.
@@ -108,6 +112,9 @@
 						    (const_string "no"))
 	   (eq_attr "feature" "se")   (if_then_else (match_test "TARGET_EXT_STRING")
 						    (const_string "yes")
+						    (const_string "no"))
+	   (eq_attr "feature" "fpu")  (if_then_else (match_test "TARGET_FPU_SINGLE || TARGET_FPU_DOUBLE")
+						    (const_string "yes")
 						    (const_string "no"))]
 	   (const_string "yes"))))
@@ -193,8 +200,8 @@
 })
 (define_insn "*mov<mode>"
-  [(set (match_operand:QIHISI 0 "nonimmediate_operand" "=r, r, U45, U33, U37, U45, m,   l,   l,   l,   d,   d, r,    d,    r,    r,    r")
+  [(set (match_operand:QIHISI 0 "nonimmediate_operand" "=r, r,U45,U33,U37,U45, m,  l,  l,  l,  d,  d, r,   d,    r,    r,    r, *f, *f,  r, *f,  Q")
-	(match_operand:QIHISI 1 "nds32_move_operand"   " r, r,   l,   l,   l,   d, r, U45, U33, U37, U45, Ufe, m, Ip05, Is05, Is20, Ihig"))]
+	(match_operand:QIHISI 1 "nds32_move_operand"   " r, r,  l,  l,  l,  d, r,U45,U33,U37,U45,Ufe, m,Ip05, Is05, Is20, Ihig, *f,  r, *f,  Q, *f"))]
  "register_operand(operands[0], <MODE>mode)
   || register_operand(operands[1], <MODE>mode)"
 {
@@ -227,12 +234,26 @@
      return "movi\t%0, %1";
    case 16:
      return "sethi\t%0, hi20(%1)";
+    case 17:
+      if (TARGET_FPU_SINGLE)
+	return "fcpyss\t%0, %1, %1";
+      else
+	return "#";
+    case 18:
+      return "fmtsr\t%1, %0";
+    case 19:
+      return "fmfsr\t%0, %1";
+    case 20:
+      return nds32_output_float_load (operands);
+    case 21:
+      return nds32_output_float_store (operands);
    default:
      gcc_unreachable ();
    }
 }
-  [(set_attr "type"   "alu,alu,store,store,store,store,store,load,load,load,load,load,load,alu,alu,alu,alu")
+  [(set_attr "type"    "alu,alu,store,store,store,store,store,load,load,load,load,load,load,alu,alu,alu,alu,fcpy,fmtsr,fmfsr,fload,fstore")
-   (set_attr "length" "  2,  4,    2,    2,    2,    2,    4,   2,   2,   2,   2,   2,   4,  2,  2,  4,  4")])
+   (set_attr "length"  "  2,  4,    2,    2,    2,    2,    4,   2,   2,   2,   2,   2,   4,  2,  2,  4,  4,   4,    4,    4,    4,     4")
+   (set_attr "feature" " v1, v1,   v1,   v1,   v1,   v1,   v1,  v1,  v1,  v1,  v1, v3m,  v1, v1, v1, v1, v1, fpu,  fpu,  fpu,  fpu,   fpu")])
 ;; We use nds32_symbolic_operand to limit that only CONST/SYMBOL_REF/LABEL_REF
@@ -804,6 +825,87 @@
   (set_attr "length"  "  2,  4")
   (set_attr "feature" "v3m, v1")])
+(define_expand "negsf2"
+  [(set (match_operand:SF 0 "register_operand" "")
+	(neg:SF (match_operand:SF 1 "register_operand" "")))]
+  ""
+{
+  if (!TARGET_FPU_SINGLE && !TARGET_EXT_PERF)
+    {
+      rtx new_dst = simplify_gen_subreg (SImode, operands[0], SFmode, 0);
+      rtx new_src = simplify_gen_subreg (SImode, operands[1], SFmode, 0);
+      emit_insn (gen_xorsi3 (new_dst,
+			     new_src,
+			     gen_int_mode (0x80000000, SImode)));
+      DONE;
+    }
+})
+(define_expand "negdf2"
+  [(set (match_operand:DF 0 "register_operand" "")
+	(neg:DF (match_operand:DF 1 "register_operand" "")))]
+  ""
+{
+})
+(define_insn_and_split "soft_negdf2"
+  [(set (match_operand:DF 0 "register_operand" "")
+	(neg:DF (match_operand:DF 1 "register_operand" "")))]
+  "!TARGET_FPU_DOUBLE"
+  "#"
+  "!TARGET_FPU_DOUBLE"
+  [(const_int 1)]
+{
+    rtx src = operands[1];
+    rtx dst = operands[0];
+    rtx ori_dst = operands[0];
+    bool need_extra_move_for_dst_p = false;
+    /* FPU register can't change mode to SI directly, so we need create a
+       tmp register to handle it, and FPU register can't do `xor` or btgl.  */
+    if (HARD_REGISTER_P (src)
+	&& TEST_HARD_REG_BIT (reg_class_contents[FP_REGS], REGNO (src)))
+      {
+	rtx tmp = gen_reg_rtx (DFmode);
+	emit_move_insn (tmp, src);
+	src = tmp;
+      }
+    if (HARD_REGISTER_P (dst)
+	&& TEST_HARD_REG_BIT (reg_class_contents[FP_REGS], REGNO (dst)))
+      {
+	need_extra_move_for_dst_p = true;
+	rtx tmp = gen_reg_rtx (DFmode);
+	dst = tmp;
+      }
+    rtx dst_high_part = simplify_gen_subreg (
+			  SImode, dst,
+			  DFmode, subreg_highpart_offset (SImode, DFmode));
+    rtx dst_low_part = simplify_gen_subreg (
+			  SImode, dst,
+			  DFmode, subreg_lowpart_offset (SImode, DFmode));
+    rtx src_high_part = simplify_gen_subreg (
+			  SImode, src,
+			  DFmode, subreg_highpart_offset (SImode, DFmode));
+    rtx src_low_part = simplify_gen_subreg (
+			  SImode, src,
+			  DFmode, subreg_lowpart_offset (SImode, DFmode));
+    emit_insn (gen_xorsi3 (dst_high_part,
+			   src_high_part,
+			   gen_int_mode (0x80000000, SImode)));
+    emit_move_insn (dst_low_part, src_low_part);
+    if (need_extra_move_for_dst_p)
+      emit_move_insn (ori_dst, dst);
+    DONE;
+})
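
Both soft negation paths above flip only the IEEE sign bit: negsf2 XORs the whole 32-bit word with 0x80000000, and soft_negdf2 XORs the high SImode half and copies the low half unchanged. A minimal C sketch of the same bit manipulation follows, assuming IEEE-754 binary32 and binary64 host formats; the function names soft_negf and soft_negd are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Negate a float by toggling bit 31, as the negsf2 fallback does.  */
float
soft_negf (float x)
{
  uint32_t bits;

  memcpy (&bits, &x, sizeof bits);
  bits ^= UINT32_C (0x80000000);
  memcpy (&x, &bits, sizeof x);
  return x;
}

/* Negate a double by toggling the sign bit in the high 32-bit half
   and passing the low half through untouched.  */
double
soft_negd (double x)
{
  uint64_t bits;
  uint32_t high, low;

  memcpy (&bits, &x, sizeof bits);
  high = (uint32_t) (bits >> 32) ^ UINT32_C (0x80000000);
  low = (uint32_t) bits;
  bits = ((uint64_t) high << 32) | low;
  memcpy (&x, &bits, sizeof x);
  return x;
}
```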
 ;; ----------------------------------------------------------------------------
 ;; 'ONE_COMPLIMENT' operation
 ;; ----------------------------------------------------------------------------

--- a/gcc/config/nds32/nds32.opt
+++ b/gcc/config/nds32/nds32.opt
@@ -32,6 +32,31 @@ EL
 Target RejectNegative Alias(mlittle-endian)
 Generate code in little-endian mode.
+; ---------------------------------------------------------------
+mabi=
+Target RejectNegative Joined Enum(abi_type) Var(nds32_abi) Init(TARGET_DEFAULT_ABI)
+Specify which ABI type to generate code for: 2, 2fp+.
+Enum
+Name(abi_type) Type(enum abi_type)
+Known ABIs (for use with the -mabi= option):
+EnumValue
+Enum(abi_type) String(2) Value(NDS32_ABI_V2)
+EnumValue
+Enum(abi_type) String(2fp+) Value(NDS32_ABI_V2_FP_PLUS)
+mfloat-abi=soft
+Target RejectNegative Alias(mabi=, 2)
+Use the soft floating-point ABI; this is an alias for -mabi=2.
+mfloat-abi=hard
+Target RejectNegative Alias(mabi=, 2fp+)
+Use the hard floating-point ABI; this is an alias for -mabi=2fp+.
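
The option records above make -mfloat-abi=soft an alias for -mabi=2 and -mfloat-abi=hard an alias for -mabi=2fp+. A minimal C sketch of the resulting mapping follows; it is an illustration rather than how the GCC option machinery works internally, and the enum abi_kind and the function abi_from_option are hypothetical names.

```c
#include <string.h>

/* Hypothetical mirror of the two ABIs added by this commit.  */
enum abi_kind
{
  ABI_KIND_V2,          /* corresponds to -mabi=2 */
  ABI_KIND_V2_FP_PLUS,  /* corresponds to -mabi=2fp+ */
  ABI_KIND_UNKNOWN      /* not one of the four spellings */
};

/* Resolve either spelling of the option to the ABI it selects.  */
enum abi_kind
abi_from_option (const char *opt)
{
  if (strcmp (opt, "-mabi=2") == 0
      || strcmp (opt, "-mfloat-abi=soft") == 0)
    return ABI_KIND_V2;
  if (strcmp (opt, "-mabi=2fp+") == 0
      || strcmp (opt, "-mfloat-abi=hard") == 0)
    return ABI_KIND_V2_FP_PLUS;
  return ABI_KIND_UNKNOWN;
}
```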
 ; ---------------------------------------------------------------
 mreduced-regs
@@ -110,6 +135,12 @@ Enum(nds32_arch_type) String(v3) Value(ARCH_V3)
 EnumValue
 Enum(nds32_arch_type) String(v3m) Value(ARCH_V3M)
+EnumValue
+Enum(nds32_arch_type) String(v3f) Value(ARCH_V3F)
+EnumValue
+Enum(nds32_arch_type) String(v3s) Value(ARCH_V3S)
 mcmodel=
 Target RejectNegative Joined Enum(nds32_cmodel_type) Var(nds32_cmodel_option) Init(CMODEL_LARGE)
 Specify the address generation strategy for code model.
@@ -138,6 +169,38 @@ Known cpu types (for use with the -mcpu= option):
 EnumValue
 Enum(nds32_cpu_type) String(n9) Value(CPU_N9)
+mconfig-fpu=
+Target RejectNegative Joined Enum(float_reg_number) Var(nds32_fp_regnum) Init(TARGET_CONFIG_FPU_DEFAULT)
+Specify an FPU configuration value from 0 to 7; 0-3 follow the FPU specification, and 4-7 correspond to 0-3.
+Enum
+Name(float_reg_number) Type(enum float_reg_number)
+Known floating-point number of registers (for use with the -mconfig-fpu= option):
+EnumValue
+Enum(float_reg_number) String(0) Value(NDS32_CONFIG_FPU_0)
+EnumValue
+Enum(float_reg_number) String(1) Value(NDS32_CONFIG_FPU_1)
+EnumValue
+Enum(float_reg_number) String(2) Value(NDS32_CONFIG_FPU_2)
+EnumValue
+Enum(float_reg_number) String(3) Value(NDS32_CONFIG_FPU_3)
+EnumValue
+Enum(float_reg_number) String(4) Value(NDS32_CONFIG_FPU_4)
+EnumValue
+Enum(float_reg_number) String(5) Value(NDS32_CONFIG_FPU_5)
+EnumValue
+Enum(float_reg_number) String(6) Value(NDS32_CONFIG_FPU_6)
+EnumValue
+Enum(float_reg_number) String(7) Value(NDS32_CONFIG_FPU_7)
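
The help text for -mconfig-fpu= says that values 4-7 correspond to 0-3. A minimal C sketch of one reading of that statement follows. It assumes the correspondence is in order (4 to 0, 5 to 1, 6 to 2, 7 to 3), which this diff does not show; the function name fpu_config_base and the -1 error return are hypothetical.

```c
/* Hypothetical helper: fold an -mconfig-fpu= value onto the base
   configuration 0..3.  Assumes 4..7 map to 0..3 in order, which is
   an interpretation of the option's help text, not something shown
   in this diff.  Returns -1 for values outside 0..7.  */
int
fpu_config_base (int config)
{
  if (config < 0 || config > 7)
    return -1;
  return config & 3;
}
```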
 mctor-dtor
 Target Report
 Enable constructor/destructor feature.
@@ -145,3 +208,15 @@ Enable constructor/destructor feature.
 mrelax
 Target Report
 Guide linker to relax instructions.
+mext-fpu-fma
+Target Report Mask(EXT_FPU_FMA)
+Generate floating-point multiply-accumulation instructions.
+mext-fpu-sp
+Target Report Mask(FPU_SINGLE)
+Generate single-precision floating-point instructions.
+mext-fpu-dp
+Target Report Mask(FPU_DOUBLE)
+Generate double-precision floating-point instructions.
--- a/gcc/config/nds32/predicates.md
+++ b/gcc/config/nds32/predicates.md
@@ -24,12 +24,21 @@
 (define_predicate "nds32_greater_less_comparison_operator"
  (match_code "gt,ge,lt,le"))
+(define_predicate "nds32_float_comparison_operator"
+  (match_code "eq,ne,le,lt,ge,gt,ordered,unordered,ungt,unge,unlt,unle"))
 (define_predicate "nds32_movecc_comparison_operator"
  (match_code "eq,ne,le,leu,ge,geu"))
 (define_special_predicate "nds32_logical_binary_operator"
  (match_code "and,ior,xor"))
+(define_special_predicate "nds32_conditional_call_comparison_operator"
+  (match_code "lt,ge"))
+(define_special_predicate "nds32_have_33_inst_operator"
+  (match_code "mult,and,ior,xor"))
 (define_predicate "nds32_symbolic_operand"
  (match_code "const,symbol_ref,label_ref"))
@@ -122,6 +131,18 @@
  (and (match_code "mem")
       (match_test "nds32_valid_smw_lwm_base_p (op)")))
+(define_predicate "float_even_register_operand"
+  (and (match_code "reg")
+       (and (match_test "REGNO (op) >= NDS32_FIRST_FPR_REGNUM")
+	    (match_test "REGNO (op) <= NDS32_LAST_FPR_REGNUM")
+	    (match_test "(REGNO (op) & 1) == 0"))))
+(define_predicate "float_odd_register_operand"
+  (and (match_code "reg")
+       (and (match_test "REGNO (op) >= NDS32_FIRST_FPR_REGNUM")
+	    (match_test "REGNO (op) <= NDS32_LAST_FPR_REGNUM")
+	    (match_test "(REGNO (op) & 1) != 0"))))
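
The two predicates above accept a register only if its number lies in the FPR range and has the requested parity; note that parity is tested on the absolute register number, not on the offset from the first FPR. A minimal C sketch of the same tests follows. The bounds are passed in as parameters because the values of NDS32_FIRST_FPR_REGNUM and NDS32_LAST_FPR_REGNUM are not shown in this diff; the function names are hypothetical.

```c
#include <stdbool.h>

/* True when REGNO lies within [first_fpr, last_fpr].  */
static bool
in_fpr_range (unsigned regno, unsigned first_fpr, unsigned last_fpr)
{
  return regno >= first_fpr && regno <= last_fpr;
}

/* Mirrors float_even_register_operand: an FPR with an even
   absolute register number.  */
bool
fpr_is_even (unsigned regno, unsigned first_fpr, unsigned last_fpr)
{
  return in_fpr_range (regno, first_fpr, last_fpr) && (regno & 1) == 0;
}

/* Mirrors float_odd_register_operand: an FPR with an odd
   absolute register number.  */
bool
fpr_is_odd (unsigned regno, unsigned first_fpr, unsigned last_fpr)
{
  return in_fpr_range (regno, first_fpr, last_fpr) && (regno & 1) != 0;
}
```

The bounds used in any call are placeholders chosen by the caller, not the port's actual register numbers.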
 (define_special_predicate "nds32_load_multiple_operation"
  (match_code "parallel")
 {