Commit 43e9d192 by Ian Bolton, committed by Marcus Shawcroft

AArch64 [3/10]

2012-10-23  Ian Bolton  <ian.bolton@arm.com>
	    James Greenhalgh  <james.greenhalgh@arm.com>
	    Jim MacArthur  <jim.macarthur@arm.com>
	    Chris Schlumberger-Socha <chris.schlumberger-socha@arm.com>
	    Marcus Shawcroft  <marcus.shawcroft@arm.com>
	    Nigel Stephens  <nigel.stephens@arm.com>
	    Ramana Radhakrishnan  <ramana.radhakrishnan@arm.com>
	    Richard Earnshaw  <rearnsha@arm.com>
	    Sofiane Naci  <sofiane.naci@arm.com>
	    Stephen Thomas  <stephen.thomas@arm.com>
	    Tejas Belagod  <tejas.belagod@arm.com>
	    Yufeng Zhang  <yufeng.zhang@arm.com>

	* common/config/aarch64/aarch64-common.c: New file.
	* config/aarch64/aarch64-arches.def: New file.
	* config/aarch64/aarch64-builtins.c: New file.
	* config/aarch64/aarch64-cores.def: New file.
	* config/aarch64/aarch64-elf-raw.h: New file.
	* config/aarch64/aarch64-elf.h: New file.
	* config/aarch64/aarch64-generic.md: New file.
	* config/aarch64/aarch64-linux.h: New file.
	* config/aarch64/aarch64-modes.def: New file.
	* config/aarch64/aarch64-option-extensions.def: New file.
	* config/aarch64/aarch64-opts.h: New file.
	* config/aarch64/aarch64-protos.h: New file.
	* config/aarch64/aarch64-simd.md: New file.
	* config/aarch64/aarch64-tune.md: New file.
	* config/aarch64/aarch64.c: New file.
	* config/aarch64/aarch64.h: New file.
	* config/aarch64/aarch64.md: New file.
	* config/aarch64/aarch64.opt: New file.
	* config/aarch64/arm_neon.h: New file.
	* config/aarch64/constraints.md: New file.
	* config/aarch64/gentune.sh: New file.
	* config/aarch64/iterators.md: New file.
	* config/aarch64/large.md: New file.
	* config/aarch64/predicates.md: New file.
	* config/aarch64/small.md: New file.
	* config/aarch64/sync.md: New file.
	* config/aarch64/t-aarch64-linux: New file.
	* config/aarch64/t-aarch64: New file.


Co-Authored-By: Chris Schlumberger-Socha <chris.schlumberger-socha@arm.com>
Co-Authored-By: James Greenhalgh <james.greenhalgh@arm.com>
Co-Authored-By: Jim MacArthur <jim.macarthur@arm.com>
Co-Authored-By: Marcus Shawcroft <marcus.shawcroft@arm.com>
Co-Authored-By: Nigel Stephens <nigel.stephens@arm.com>
Co-Authored-By: Ramana Radhakrishnan <ramana.radhakrishnan@arm.com>
Co-Authored-By: Richard Earnshaw <rearnsha@arm.com>
Co-Authored-By: Sofiane Naci <sofiane.naci@arm.com>
Co-Authored-By: Stephen Thomas <stephen.thomas@arm.com>
Co-Authored-By: Tejas Belagod <tejas.belagod@arm.com>
Co-Authored-By: Yufeng Zhang <yufeng.zhang@arm.com>

From-SVN: r192723
/* Common hooks for AArch64.
Copyright (C) 2012 Free Software Foundation, Inc.
Contributed by ARM Ltd.
This file is part of GCC.
GCC is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published
by the Free Software Foundation; either version 3, or (at your
option) any later version.
GCC is distributed in the hope that it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public
License for more details.
You should have received a copy of the GNU General Public License
along with GCC; see the file COPYING3. If not see
<http://www.gnu.org/licenses/>. */
#include "config.h"
#include "system.h"
#include "coretypes.h"
#include "tm.h"
#include "tm_p.h"
#include "common/common-target.h"
#include "common/common-target-def.h"
#include "opts.h"
#include "flags.h"
#ifdef TARGET_BIG_ENDIAN_DEFAULT
#undef TARGET_DEFAULT_TARGET_FLAGS
#define TARGET_DEFAULT_TARGET_FLAGS (MASK_BIG_END)
#endif
#undef TARGET_HANDLE_OPTION
#define TARGET_HANDLE_OPTION aarch64_handle_option
#undef TARGET_OPTION_OPTIMIZATION_TABLE
#define TARGET_OPTION_OPTIMIZATION_TABLE aarch_option_optimization_table
/* Set default optimization options.  */
static const struct default_options aarch_option_optimization_table[] =
  {
    /* Enable section anchors by default at -O1 or higher.  */
    { OPT_LEVELS_1_PLUS, OPT_fsection_anchors, NULL, 1 },
    { OPT_LEVELS_NONE, 0, NULL, 0 }
  };
/* Implement TARGET_HANDLE_OPTION.
   This function handles the target specific options for CPU/target selection.

   march wins over mcpu, so when march is defined, mcpu takes the same value,
   otherwise march remains undefined.  If march and mcpu are used together,
   the rightmost option wins.  mtune can be used with either march or
   mcpu.  */

static bool
aarch64_handle_option (struct gcc_options *opts,
		       struct gcc_options *opts_set ATTRIBUTE_UNUSED,
		       const struct cl_decoded_option *decoded,
		       location_t loc ATTRIBUTE_UNUSED)
{
  size_t code = decoded->opt_index;
  const char *arg = decoded->arg;

  switch (code)
    {
    case OPT_march_:
      opts->x_aarch64_arch_string = arg;
      opts->x_aarch64_cpu_string = arg;
      return true;

    case OPT_mcpu_:
      opts->x_aarch64_cpu_string = arg;
      opts->x_aarch64_arch_string = NULL;
      return true;

    case OPT_mtune_:
      opts->x_aarch64_tune_string = arg;
      return true;

    default:
      return true;
    }
}
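The precedence rules described in the comment above can be sketched as a standalone program. The `opt_state` struct and handler names below are illustrative stand-ins for the `x_aarch64_*_string` fields and option cases, not part of GCC:

```c
#include <stddef.h>

/* Illustrative stand-ins for the x_aarch64_*_string fields of
   gcc_options; these names are not part of GCC.  */
struct opt_state
{
  const char *arch;	/* -march=  */
  const char *cpu;	/* -mcpu=   */
  const char *tune;	/* -mtune=  */
};

/* -march sets both the arch and cpu strings ...  */
static void
handle_march (struct opt_state *s, const char *arg)
{
  s->arch = arg;
  s->cpu = arg;
}

/* ... while -mcpu sets cpu and clears arch, so the rightmost of
   -march/-mcpu on the command line wins.  */
static void
handle_mcpu (struct opt_state *s, const char *arg)
{
  s->cpu = arg;
  s->arch = NULL;
}

/* -mtune is independent of -march and -mcpu.  */
static void
handle_mtune (struct opt_state *s, const char *arg)
{
  s->tune = arg;
}
```

For example, processing `-march=armv8-a -mcpu=generic` left to right ends with arch cleared and cpu set, matching the `OPT_mcpu_` case above.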
struct gcc_targetm_common targetm_common = TARGETM_COMMON_INITIALIZER;
/* Copyright (C) 2011, 2012 Free Software Foundation, Inc.
Contributed by ARM Ltd.
This file is part of GCC.
GCC is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published
by the Free Software Foundation; either version 3, or (at your
option) any later version.
GCC is distributed in the hope that it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public
License for more details.
You should have received a copy of the GNU General Public License
along with GCC; see the file COPYING3. If not see
<http://www.gnu.org/licenses/>. */
/* Before using #include to read this file, define a macro:
AARCH64_ARCH(NAME, CORE, ARCH, FLAGS)
The NAME is the name of the architecture, represented as a string
constant. The CORE is the identifier for a core representative of
this architecture. ARCH is the architecture revision. FLAGS are
the flags implied by the architecture. */
AARCH64_ARCH("armv8-a", generic, 8, AARCH64_FL_FOR_ARCH8)
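The .def file above is consumed with the X-macro pattern the comment describes: the includer defines AARCH64_ARCH, includes the file, then undefines the macro. A self-contained sketch of that pattern; the single armv8-a entry is inlined here rather than #include'd, and AARCH64_FL_FOR_ARCH8 is stubbed (the real flag set lives elsewhere in the port):

```c
#define AARCH64_FL_FOR_ARCH8 1u	/* Stub for this sketch only.  */

struct arch_entry
{
  const char *name;	/* Architecture name string.  */
  int revision;		/* Architecture revision.  */
  unsigned int flags;	/* Implied feature flags.  */
};

/* Expand each AARCH64_ARCH entry into a table initializer; the CORE
   argument is simply ignored by this particular expansion.  */
#define AARCH64_ARCH(NAME, CORE, ARCH, FLAGS) { NAME, ARCH, FLAGS },
static const struct arch_entry arch_table[] =
{
  AARCH64_ARCH ("armv8-a", generic, 8, AARCH64_FL_FOR_ARCH8)
};
#undef AARCH64_ARCH
```

Different includers can define AARCH64_ARCH differently (e.g. to build a name table or a flags table) and re-include the same .def file.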
/* Builtins' description for AArch64 SIMD architecture.
Copyright (C) 2011, 2012 Free Software Foundation, Inc.
Contributed by ARM Ltd.
This file is part of GCC.
GCC is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 3, or (at your option)
any later version.
GCC is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
General Public License for more details.
You should have received a copy of the GNU General Public License
along with GCC; see the file COPYING3. If not see
<http://www.gnu.org/licenses/>. */
#include "config.h"
#include "system.h"
#include "coretypes.h"
#include "tm.h"
#include "rtl.h"
#include "tree.h"
#include "expr.h"
#include "tm_p.h"
#include "recog.h"
#include "langhooks.h"
#include "diagnostic-core.h"
#include "optabs.h"
enum aarch64_simd_builtin_type_bits
{
T_V8QI = 0x0001,
T_V4HI = 0x0002,
T_V2SI = 0x0004,
T_V2SF = 0x0008,
T_DI = 0x0010,
T_DF = 0x0020,
T_V16QI = 0x0040,
T_V8HI = 0x0080,
T_V4SI = 0x0100,
T_V4SF = 0x0200,
T_V2DI = 0x0400,
T_V2DF = 0x0800,
T_TI = 0x1000,
T_EI = 0x2000,
T_OI = 0x4000,
T_XI = 0x8000,
T_SI = 0x10000,
T_HI = 0x20000,
T_QI = 0x40000
};
#define v8qi_UP T_V8QI
#define v4hi_UP T_V4HI
#define v2si_UP T_V2SI
#define v2sf_UP T_V2SF
#define di_UP T_DI
#define df_UP T_DF
#define v16qi_UP T_V16QI
#define v8hi_UP T_V8HI
#define v4si_UP T_V4SI
#define v4sf_UP T_V4SF
#define v2di_UP T_V2DI
#define v2df_UP T_V2DF
#define ti_UP T_TI
#define ei_UP T_EI
#define oi_UP T_OI
#define xi_UP T_XI
#define si_UP T_SI
#define hi_UP T_HI
#define qi_UP T_QI
#define UP(X) X##_UP
#define T_MAX 19
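The T_* values above are single-bit masks, so the set of "key" modes a builtin supports is the OR of its variants' bits (built via the `UP` macro), and membership is a plain mask test. A minimal sketch using a subset of the enum with the same values:

```c
/* Subset of aarch64_simd_builtin_type_bits, with the same values.  */
enum
{
  T_V8QI = 0x0001,
  T_V4HI = 0x0002,
  T_V2SI = 0x0004
};

/* Non-zero if the variant whose single bit is T_BIT is present in the
   combined mask BITS.  */
static int
variant_available (unsigned int bits, unsigned int t_bit)
{
  return (bits & t_bit) != 0;
}
```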
typedef enum
{
AARCH64_SIMD_BINOP,
AARCH64_SIMD_TERNOP,
AARCH64_SIMD_QUADOP,
AARCH64_SIMD_UNOP,
AARCH64_SIMD_GETLANE,
AARCH64_SIMD_SETLANE,
AARCH64_SIMD_CREATE,
AARCH64_SIMD_DUP,
AARCH64_SIMD_DUPLANE,
AARCH64_SIMD_COMBINE,
AARCH64_SIMD_SPLIT,
AARCH64_SIMD_LANEMUL,
AARCH64_SIMD_LANEMULL,
AARCH64_SIMD_LANEMULH,
AARCH64_SIMD_LANEMAC,
AARCH64_SIMD_SCALARMUL,
AARCH64_SIMD_SCALARMULL,
AARCH64_SIMD_SCALARMULH,
AARCH64_SIMD_SCALARMAC,
AARCH64_SIMD_CONVERT,
AARCH64_SIMD_FIXCONV,
AARCH64_SIMD_SELECT,
AARCH64_SIMD_RESULTPAIR,
AARCH64_SIMD_REINTERP,
AARCH64_SIMD_VTBL,
AARCH64_SIMD_VTBX,
AARCH64_SIMD_LOAD1,
AARCH64_SIMD_LOAD1LANE,
AARCH64_SIMD_STORE1,
AARCH64_SIMD_STORE1LANE,
AARCH64_SIMD_LOADSTRUCT,
AARCH64_SIMD_LOADSTRUCTLANE,
AARCH64_SIMD_STORESTRUCT,
AARCH64_SIMD_STORESTRUCTLANE,
AARCH64_SIMD_LOGICBINOP,
AARCH64_SIMD_SHIFTINSERT,
AARCH64_SIMD_SHIFTIMM,
AARCH64_SIMD_SHIFTACC
} aarch64_simd_itype;
typedef struct
{
  const char *name;
  const aarch64_simd_itype itype;
  const int bits;
  const enum insn_code codes[T_MAX];
  const unsigned int num_vars;
  unsigned int base_fcode;
} aarch64_simd_builtin_datum;
#define CF(N, X) CODE_FOR_aarch64_##N##X
#define VAR1(T, N, A) \
#N, AARCH64_SIMD_##T, UP (A), { CF (N, A) }, 1, 0
#define VAR2(T, N, A, B) \
#N, AARCH64_SIMD_##T, UP (A) | UP (B), { CF (N, A), CF (N, B) }, 2, 0
#define VAR3(T, N, A, B, C) \
#N, AARCH64_SIMD_##T, UP (A) | UP (B) | UP (C), \
{ CF (N, A), CF (N, B), CF (N, C) }, 3, 0
#define VAR4(T, N, A, B, C, D) \
#N, AARCH64_SIMD_##T, UP (A) | UP (B) | UP (C) | UP (D), \
{ CF (N, A), CF (N, B), CF (N, C), CF (N, D) }, 4, 0
#define VAR5(T, N, A, B, C, D, E) \
#N, AARCH64_SIMD_##T, UP (A) | UP (B) | UP (C) | UP (D) | UP (E), \
{ CF (N, A), CF (N, B), CF (N, C), CF (N, D), CF (N, E) }, 5, 0
#define VAR6(T, N, A, B, C, D, E, F) \
#N, AARCH64_SIMD_##T, UP (A) | UP (B) | UP (C) | UP (D) | UP (E) | UP (F), \
{ CF (N, A), CF (N, B), CF (N, C), CF (N, D), CF (N, E), CF (N, F) }, 6, 0
#define VAR7(T, N, A, B, C, D, E, F, G) \
#N, AARCH64_SIMD_##T, UP (A) | UP (B) | UP (C) | UP (D) \
| UP (E) | UP (F) | UP (G), \
{ CF (N, A), CF (N, B), CF (N, C), CF (N, D), CF (N, E), CF (N, F), \
CF (N, G) }, 7, 0
#define VAR8(T, N, A, B, C, D, E, F, G, H) \
#N, AARCH64_SIMD_##T, UP (A) | UP (B) | UP (C) | UP (D) \
| UP (E) | UP (F) | UP (G) \
| UP (H), \
{ CF (N, A), CF (N, B), CF (N, C), CF (N, D), CF (N, E), CF (N, F), \
CF (N, G), CF (N, H) }, 8, 0
#define VAR9(T, N, A, B, C, D, E, F, G, H, I) \
#N, AARCH64_SIMD_##T, UP (A) | UP (B) | UP (C) | UP (D) \
| UP (E) | UP (F) | UP (G) \
| UP (H) | UP (I), \
{ CF (N, A), CF (N, B), CF (N, C), CF (N, D), CF (N, E), CF (N, F), \
CF (N, G), CF (N, H), CF (N, I) }, 9, 0
#define VAR10(T, N, A, B, C, D, E, F, G, H, I, J) \
#N, AARCH64_SIMD_##T, UP (A) | UP (B) | UP (C) | UP (D) \
| UP (E) | UP (F) | UP (G) \
| UP (H) | UP (I) | UP (J), \
{ CF (N, A), CF (N, B), CF (N, C), CF (N, D), CF (N, E), CF (N, F), \
CF (N, G), CF (N, H), CF (N, I), CF (N, J) }, 10, 0
#define VAR11(T, N, A, B, C, D, E, F, G, H, I, J, K) \
#N, AARCH64_SIMD_##T, UP (A) | UP (B) | UP (C) | UP (D) \
| UP (E) | UP (F) | UP (G) \
| UP (H) | UP (I) | UP (J) | UP (K), \
{ CF (N, A), CF (N, B), CF (N, C), CF (N, D), CF (N, E), CF (N, F), \
CF (N, G), CF (N, H), CF (N, I), CF (N, J), CF (N, K) }, 11, 0
#define VAR12(T, N, A, B, C, D, E, F, G, H, I, J, K, L) \
#N, AARCH64_SIMD_##T, UP (A) | UP (B) | UP (C) | UP (D) \
| UP (E) | UP (F) | UP (G) \
| UP (H) | UP (I) | UP (J) | UP (K) | UP (L), \
{ CF (N, A), CF (N, B), CF (N, C), CF (N, D), CF (N, E), CF (N, F), \
CF (N, G), CF (N, H), CF (N, I), CF (N, J), CF (N, K), CF (N, L) }, 12, 0
/* The mode entries in the following table correspond to the "key" type of the
   instruction variant, i.e. equivalent to that which would be specified after
   the assembler mnemonic, which usually refers to the last vector operand.
   (Signed, unsigned and polynomial types are not distinguished; for a given
   element size they all map onto the same mode.)  The modes listed per
   instruction should be the same as those defined for that instruction's
   pattern in aarch64-simd.md.

   WARNING: Variants should be listed in the same increasing order as
   aarch64_simd_builtin_type_bits.  */
static aarch64_simd_builtin_datum aarch64_simd_builtin_data[] = {
{VAR6 (CREATE, create, v8qi, v4hi, v2si, v2sf, di, df)},
{VAR6 (GETLANE, get_lane_signed,
v8qi, v4hi, v2si, v16qi, v8hi, v4si)},
{VAR7 (GETLANE, get_lane_unsigned,
v8qi, v4hi, v2si, v16qi, v8hi, v4si, v2di)},
{VAR4 (GETLANE, get_lane, v2sf, di, v4sf, v2df)},
{VAR6 (GETLANE, get_dregoi, v8qi, v4hi, v2si, v2sf, di, df)},
{VAR6 (GETLANE, get_qregoi, v16qi, v8hi, v4si, v4sf, v2di, v2df)},
{VAR6 (GETLANE, get_dregci, v8qi, v4hi, v2si, v2sf, di, df)},
{VAR6 (GETLANE, get_qregci, v16qi, v8hi, v4si, v4sf, v2di, v2df)},
{VAR6 (GETLANE, get_dregxi, v8qi, v4hi, v2si, v2sf, di, df)},
{VAR6 (GETLANE, get_qregxi, v16qi, v8hi, v4si, v4sf, v2di, v2df)},
{VAR6 (SETLANE, set_qregoi, v16qi, v8hi, v4si, v4sf, v2di, v2df)},
{VAR6 (SETLANE, set_qregci, v16qi, v8hi, v4si, v4sf, v2di, v2df)},
{VAR6 (SETLANE, set_qregxi, v16qi, v8hi, v4si, v4sf, v2di, v2df)},
{VAR5 (REINTERP, reinterpretv8qi, v8qi, v4hi, v2si, v2sf, di)},
{VAR5 (REINTERP, reinterpretv4hi, v8qi, v4hi, v2si, v2sf, di)},
{VAR5 (REINTERP, reinterpretv2si, v8qi, v4hi, v2si, v2sf, di)},
{VAR5 (REINTERP, reinterpretv2sf, v8qi, v4hi, v2si, v2sf, di)},
{VAR5 (REINTERP, reinterpretdi, v8qi, v4hi, v2si, v2sf, di)},
{VAR6 (REINTERP, reinterpretv16qi, v16qi, v8hi, v4si, v4sf, v2di, v2df)},
{VAR6 (REINTERP, reinterpretv8hi, v16qi, v8hi, v4si, v4sf, v2di, v2df)},
{VAR6 (REINTERP, reinterpretv4si, v16qi, v8hi, v4si, v4sf, v2di, v2df)},
{VAR6 (REINTERP, reinterpretv4sf, v16qi, v8hi, v4si, v4sf, v2di, v2df)},
{VAR6 (REINTERP, reinterpretv2di, v16qi, v8hi, v4si, v4sf, v2di, v2df)},
{VAR6 (COMBINE, combine, v8qi, v4hi, v2si, v2sf, di, df)},
{VAR3 (BINOP, saddl, v8qi, v4hi, v2si)},
{VAR3 (BINOP, uaddl, v8qi, v4hi, v2si)},
{VAR3 (BINOP, saddl2, v16qi, v8hi, v4si)},
{VAR3 (BINOP, uaddl2, v16qi, v8hi, v4si)},
{VAR3 (BINOP, saddw, v8qi, v4hi, v2si)},
{VAR3 (BINOP, uaddw, v8qi, v4hi, v2si)},
{VAR3 (BINOP, saddw2, v16qi, v8hi, v4si)},
{VAR3 (BINOP, uaddw2, v16qi, v8hi, v4si)},
{VAR6 (BINOP, shadd, v8qi, v4hi, v2si, v16qi, v8hi, v4si)},
{VAR6 (BINOP, uhadd, v8qi, v4hi, v2si, v16qi, v8hi, v4si)},
{VAR6 (BINOP, srhadd, v8qi, v4hi, v2si, v16qi, v8hi, v4si)},
{VAR6 (BINOP, urhadd, v8qi, v4hi, v2si, v16qi, v8hi, v4si)},
{VAR3 (BINOP, addhn, v8hi, v4si, v2di)},
{VAR3 (BINOP, raddhn, v8hi, v4si, v2di)},
{VAR3 (TERNOP, addhn2, v8hi, v4si, v2di)},
{VAR3 (TERNOP, raddhn2, v8hi, v4si, v2di)},
{VAR3 (BINOP, ssubl, v8qi, v4hi, v2si)},
{VAR3 (BINOP, usubl, v8qi, v4hi, v2si)},
{VAR3 (BINOP, ssubl2, v16qi, v8hi, v4si) },
{VAR3 (BINOP, usubl2, v16qi, v8hi, v4si) },
{VAR3 (BINOP, ssubw, v8qi, v4hi, v2si) },
{VAR3 (BINOP, usubw, v8qi, v4hi, v2si) },
{VAR3 (BINOP, ssubw2, v16qi, v8hi, v4si) },
{VAR3 (BINOP, usubw2, v16qi, v8hi, v4si) },
{VAR11 (BINOP, sqadd, v8qi, v4hi, v2si, di, v16qi, v8hi, v4si, v2di,
si, hi, qi)},
{VAR11 (BINOP, uqadd, v8qi, v4hi, v2si, di, v16qi, v8hi, v4si, v2di,
si, hi, qi)},
{VAR11 (BINOP, sqsub, v8qi, v4hi, v2si, di, v16qi, v8hi, v4si, v2di,
si, hi, qi)},
{VAR11 (BINOP, uqsub, v8qi, v4hi, v2si, di, v16qi, v8hi, v4si, v2di,
si, hi, qi)},
{VAR11 (BINOP, suqadd, v8qi, v4hi, v2si, di, v16qi, v8hi, v4si, v2di,
si, hi, qi)},
{VAR11 (BINOP, usqadd, v8qi, v4hi, v2si, di, v16qi, v8hi, v4si, v2di,
si, hi, qi)},
{VAR6 (UNOP, sqmovun, di, v8hi, v4si, v2di, si, hi)},
{VAR6 (UNOP, sqmovn, di, v8hi, v4si, v2di, si, hi)},
{VAR6 (UNOP, uqmovn, di, v8hi, v4si, v2di, si, hi)},
{VAR10 (UNOP, sqabs, v8qi, v4hi, v2si, v16qi, v8hi, v4si, v2di, si, hi, qi)},
{VAR10 (UNOP, sqneg, v8qi, v4hi, v2si, v16qi, v8hi, v4si, v2di, si, hi, qi)},
{VAR2 (BINOP, pmul, v8qi, v16qi)},
{VAR4 (TERNOP, sqdmlal, v4hi, v2si, si, hi)},
{VAR4 (QUADOP, sqdmlal_lane, v4hi, v2si, si, hi) },
{VAR2 (QUADOP, sqdmlal_laneq, v4hi, v2si) },
{VAR2 (TERNOP, sqdmlal_n, v4hi, v2si) },
{VAR2 (TERNOP, sqdmlal2, v8hi, v4si)},
{VAR2 (QUADOP, sqdmlal2_lane, v8hi, v4si) },
{VAR2 (QUADOP, sqdmlal2_laneq, v8hi, v4si) },
{VAR2 (TERNOP, sqdmlal2_n, v8hi, v4si) },
{VAR4 (TERNOP, sqdmlsl, v4hi, v2si, si, hi)},
{VAR4 (QUADOP, sqdmlsl_lane, v4hi, v2si, si, hi) },
{VAR2 (QUADOP, sqdmlsl_laneq, v4hi, v2si) },
{VAR2 (TERNOP, sqdmlsl_n, v4hi, v2si) },
{VAR2 (TERNOP, sqdmlsl2, v8hi, v4si)},
{VAR2 (QUADOP, sqdmlsl2_lane, v8hi, v4si) },
{VAR2 (QUADOP, sqdmlsl2_laneq, v8hi, v4si) },
{VAR2 (TERNOP, sqdmlsl2_n, v8hi, v4si) },
{VAR4 (BINOP, sqdmull, v4hi, v2si, si, hi)},
{VAR4 (TERNOP, sqdmull_lane, v4hi, v2si, si, hi) },
{VAR2 (TERNOP, sqdmull_laneq, v4hi, v2si) },
{VAR2 (BINOP, sqdmull_n, v4hi, v2si) },
{VAR2 (BINOP, sqdmull2, v8hi, v4si) },
{VAR2 (TERNOP, sqdmull2_lane, v8hi, v4si) },
{VAR2 (TERNOP, sqdmull2_laneq, v8hi, v4si) },
{VAR2 (BINOP, sqdmull2_n, v8hi, v4si) },
{VAR6 (BINOP, sqdmulh, v4hi, v2si, v8hi, v4si, si, hi)},
{VAR6 (BINOP, sqrdmulh, v4hi, v2si, v8hi, v4si, si, hi)},
{VAR8 (BINOP, sshl, v8qi, v4hi, v2si, di, v16qi, v8hi, v4si, v2di) },
{VAR3 (SHIFTIMM, sshll_n, v8qi, v4hi, v2si) },
{VAR3 (SHIFTIMM, ushll_n, v8qi, v4hi, v2si) },
{VAR3 (SHIFTIMM, sshll2_n, v16qi, v8hi, v4si) },
{VAR3 (SHIFTIMM, ushll2_n, v16qi, v8hi, v4si) },
{VAR8 (BINOP, ushl, v8qi, v4hi, v2si, di, v16qi, v8hi, v4si, v2di) },
{VAR8 (BINOP, sshl_n, v8qi, v4hi, v2si, di, v16qi, v8hi, v4si, v2di) },
{VAR8 (BINOP, ushl_n, v8qi, v4hi, v2si, di, v16qi, v8hi, v4si, v2di) },
{VAR11 (BINOP, sqshl, v8qi, v4hi, v2si, di, v16qi, v8hi, v4si, v2di,
si, hi, qi) },
{VAR11 (BINOP, uqshl, v8qi, v4hi, v2si, di, v16qi, v8hi, v4si, v2di,
si, hi, qi) },
{VAR8 (BINOP, srshl, v8qi, v4hi, v2si, di, v16qi, v8hi, v4si, v2di) },
{VAR8 (BINOP, urshl, v8qi, v4hi, v2si, di, v16qi, v8hi, v4si, v2di) },
{VAR11 (BINOP, sqrshl, v8qi, v4hi, v2si, di, v16qi, v8hi, v4si, v2di,
si, hi, qi) },
{VAR11 (BINOP, uqrshl, v8qi, v4hi, v2si, di, v16qi, v8hi, v4si, v2di,
si, hi, qi) },
{VAR8 (SHIFTIMM, sshr_n, v8qi, v4hi, v2si, di, v16qi, v8hi, v4si, v2di) },
{VAR8 (SHIFTIMM, ushr_n, v8qi, v4hi, v2si, di, v16qi, v8hi, v4si, v2di) },
{VAR8 (SHIFTIMM, srshr_n, v8qi, v4hi, v2si, di, v16qi, v8hi, v4si, v2di) },
{VAR8 (SHIFTIMM, urshr_n, v8qi, v4hi, v2si, di, v16qi, v8hi, v4si, v2di) },
{VAR8 (SHIFTACC, ssra_n, v8qi, v4hi, v2si, di, v16qi, v8hi, v4si, v2di) },
{VAR8 (SHIFTACC, usra_n, v8qi, v4hi, v2si, di, v16qi, v8hi, v4si, v2di) },
{VAR8 (SHIFTACC, srsra_n, v8qi, v4hi, v2si, di, v16qi, v8hi, v4si, v2di) },
{VAR8 (SHIFTACC, ursra_n, v8qi, v4hi, v2si, di, v16qi, v8hi, v4si, v2di) },
{VAR8 (SHIFTINSERT, ssri_n, v8qi, v4hi, v2si, di, v16qi, v8hi, v4si, v2di) },
{VAR8 (SHIFTINSERT, usri_n, v8qi, v4hi, v2si, di, v16qi, v8hi, v4si, v2di) },
{VAR8 (SHIFTINSERT, ssli_n, v8qi, v4hi, v2si, di, v16qi, v8hi, v4si, v2di) },
{VAR8 (SHIFTINSERT, usli_n, v8qi, v4hi, v2si, di, v16qi, v8hi, v4si, v2di) },
{VAR11 (SHIFTIMM, sqshlu_n, v8qi, v4hi, v2si, di, v16qi, v8hi, v4si, v2di,
si, hi, qi) },
{VAR11 (SHIFTIMM, sqshl_n, v8qi, v4hi, v2si, di, v16qi, v8hi, v4si, v2di,
si, hi, qi) },
{VAR11 (SHIFTIMM, uqshl_n, v8qi, v4hi, v2si, di, v16qi, v8hi, v4si, v2di,
si, hi, qi) },
{ VAR6 (SHIFTIMM, sqshrun_n, di, v8hi, v4si, v2di, si, hi) },
{ VAR6 (SHIFTIMM, sqrshrun_n, di, v8hi, v4si, v2di, si, hi) },
{ VAR6 (SHIFTIMM, sqshrn_n, di, v8hi, v4si, v2di, si, hi) },
{ VAR6 (SHIFTIMM, uqshrn_n, di, v8hi, v4si, v2di, si, hi) },
{ VAR6 (SHIFTIMM, sqrshrn_n, di, v8hi, v4si, v2di, si, hi) },
{ VAR6 (SHIFTIMM, uqrshrn_n, di, v8hi, v4si, v2di, si, hi) },
{ VAR8 (BINOP, cmeq, v8qi, v4hi, v2si, di, v16qi, v8hi, v4si, v2di) },
{ VAR8 (BINOP, cmge, v8qi, v4hi, v2si, di, v16qi, v8hi, v4si, v2di) },
{ VAR8 (BINOP, cmgt, v8qi, v4hi, v2si, di, v16qi, v8hi, v4si, v2di) },
{ VAR8 (BINOP, cmle, v8qi, v4hi, v2si, di, v16qi, v8hi, v4si, v2di) },
{ VAR8 (BINOP, cmlt, v8qi, v4hi, v2si, di, v16qi, v8hi, v4si, v2di) },
{ VAR8 (BINOP, cmhs, v8qi, v4hi, v2si, di, v16qi, v8hi, v4si, v2di) },
{ VAR8 (BINOP, cmhi, v8qi, v4hi, v2si, di, v16qi, v8hi, v4si, v2di) },
{ VAR8 (BINOP, cmtst, v8qi, v4hi, v2si, di, v16qi, v8hi, v4si, v2di) },
{ VAR6 (TERNOP, sqdmulh_lane, v4hi, v2si, v8hi, v4si, si, hi) },
{ VAR6 (TERNOP, sqrdmulh_lane, v4hi, v2si, v8hi, v4si, si, hi) },
{ VAR3 (BINOP, addp, v8qi, v4hi, v2si) },
{ VAR1 (UNOP, addp, di) },
{ VAR11 (BINOP, dup_lane, v8qi, v4hi, v2si, di, v16qi, v8hi, v4si, v2di,
si, hi, qi) },
{ VAR3 (BINOP, fmax, v2sf, v4sf, v2df) },
{ VAR3 (BINOP, fmin, v2sf, v4sf, v2df) },
{ VAR6 (BINOP, smax, v8qi, v4hi, v2si, v16qi, v8hi, v4si) },
{ VAR6 (BINOP, smin, v8qi, v4hi, v2si, v16qi, v8hi, v4si) },
{ VAR6 (BINOP, umax, v8qi, v4hi, v2si, v16qi, v8hi, v4si) },
{ VAR6 (BINOP, umin, v8qi, v4hi, v2si, v16qi, v8hi, v4si) },
{ VAR3 (UNOP, sqrt, v2sf, v4sf, v2df) },
{VAR12 (LOADSTRUCT, ld2,
v8qi, v4hi, v2si, v2sf, di, df, v16qi, v8hi, v4si, v4sf, v2di, v2df)},
{VAR12 (LOADSTRUCT, ld3,
v8qi, v4hi, v2si, v2sf, di, df, v16qi, v8hi, v4si, v4sf, v2di, v2df)},
{VAR12 (LOADSTRUCT, ld4,
v8qi, v4hi, v2si, v2sf, di, df, v16qi, v8hi, v4si, v4sf, v2di, v2df)},
{VAR12 (STORESTRUCT, st2,
v8qi, v4hi, v2si, v2sf, di, df, v16qi, v8hi, v4si, v4sf, v2di, v2df)},
{VAR12 (STORESTRUCT, st3,
v8qi, v4hi, v2si, v2sf, di, df, v16qi, v8hi, v4si, v4sf, v2di, v2df)},
{VAR12 (STORESTRUCT, st4,
v8qi, v4hi, v2si, v2sf, di, df, v16qi, v8hi, v4si, v4sf, v2di, v2df)},
};
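One consequence of the scheme behind the table above is that each entry's num_vars must equal the number of bits set in its type mask: a VAR6 entry ORs six distinct single-bit T_* values, and so on. A sketch of that consistency check in plain C (not an actual GCC helper):

```c
/* Count the bits set in the type mask of a builtin table entry.  */
static unsigned int
count_type_bits (unsigned int bits)
{
  unsigned int n = 0;
  while (bits)
    {
      n += bits & 1u;
      bits >>= 1;
    }
  return n;
}
```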
#undef CF
#undef VAR1
#undef VAR2
#undef VAR3
#undef VAR4
#undef VAR5
#undef VAR6
#undef VAR7
#undef VAR8
#undef VAR9
#undef VAR10
#undef VAR11
#undef VAR12
#define NUM_DREG_TYPES 6
#define NUM_QREG_TYPES 6
void
init_aarch64_simd_builtins (void)
{
unsigned int i, fcode = AARCH64_SIMD_BUILTIN_BASE;
/* Scalar type nodes. */
tree aarch64_simd_intQI_type_node;
tree aarch64_simd_intHI_type_node;
tree aarch64_simd_polyQI_type_node;
tree aarch64_simd_polyHI_type_node;
tree aarch64_simd_intSI_type_node;
tree aarch64_simd_intDI_type_node;
tree aarch64_simd_float_type_node;
tree aarch64_simd_double_type_node;
/* Pointer to scalar type nodes. */
tree intQI_pointer_node;
tree intHI_pointer_node;
tree intSI_pointer_node;
tree intDI_pointer_node;
tree float_pointer_node;
tree double_pointer_node;
/* Const scalar type nodes. */
tree const_intQI_node;
tree const_intHI_node;
tree const_intSI_node;
tree const_intDI_node;
tree const_float_node;
tree const_double_node;
/* Pointer to const scalar type nodes. */
tree const_intQI_pointer_node;
tree const_intHI_pointer_node;
tree const_intSI_pointer_node;
tree const_intDI_pointer_node;
tree const_float_pointer_node;
tree const_double_pointer_node;
/* Vector type nodes. */
tree V8QI_type_node;
tree V4HI_type_node;
tree V2SI_type_node;
tree V2SF_type_node;
tree V16QI_type_node;
tree V8HI_type_node;
tree V4SI_type_node;
tree V4SF_type_node;
tree V2DI_type_node;
tree V2DF_type_node;
/* Scalar unsigned type nodes. */
tree intUQI_type_node;
tree intUHI_type_node;
tree intUSI_type_node;
tree intUDI_type_node;
/* Opaque integer types for structures of vectors. */
tree intEI_type_node;
tree intOI_type_node;
tree intCI_type_node;
tree intXI_type_node;
/* Pointer to vector type nodes. */
tree V8QI_pointer_node;
tree V4HI_pointer_node;
tree V2SI_pointer_node;
tree V2SF_pointer_node;
tree V16QI_pointer_node;
tree V8HI_pointer_node;
tree V4SI_pointer_node;
tree V4SF_pointer_node;
tree V2DI_pointer_node;
tree V2DF_pointer_node;
/* Operations which return results as pairs. */
tree void_ftype_pv8qi_v8qi_v8qi;
tree void_ftype_pv4hi_v4hi_v4hi;
tree void_ftype_pv2si_v2si_v2si;
tree void_ftype_pv2sf_v2sf_v2sf;
tree void_ftype_pdi_di_di;
tree void_ftype_pv16qi_v16qi_v16qi;
tree void_ftype_pv8hi_v8hi_v8hi;
tree void_ftype_pv4si_v4si_v4si;
tree void_ftype_pv4sf_v4sf_v4sf;
tree void_ftype_pv2di_v2di_v2di;
tree void_ftype_pv2df_v2df_v2df;
tree reinterp_ftype_dreg[NUM_DREG_TYPES][NUM_DREG_TYPES];
tree reinterp_ftype_qreg[NUM_QREG_TYPES][NUM_QREG_TYPES];
tree dreg_types[NUM_DREG_TYPES], qreg_types[NUM_QREG_TYPES];
/* Create distinguished type nodes for AARCH64_SIMD vector element types,
and pointers to values of such types, so we can detect them later. */
aarch64_simd_intQI_type_node =
make_signed_type (GET_MODE_PRECISION (QImode));
aarch64_simd_intHI_type_node =
make_signed_type (GET_MODE_PRECISION (HImode));
aarch64_simd_polyQI_type_node =
make_signed_type (GET_MODE_PRECISION (QImode));
aarch64_simd_polyHI_type_node =
make_signed_type (GET_MODE_PRECISION (HImode));
aarch64_simd_intSI_type_node =
make_signed_type (GET_MODE_PRECISION (SImode));
aarch64_simd_intDI_type_node =
make_signed_type (GET_MODE_PRECISION (DImode));
aarch64_simd_float_type_node = make_node (REAL_TYPE);
aarch64_simd_double_type_node = make_node (REAL_TYPE);
TYPE_PRECISION (aarch64_simd_float_type_node) = FLOAT_TYPE_SIZE;
TYPE_PRECISION (aarch64_simd_double_type_node) = DOUBLE_TYPE_SIZE;
layout_type (aarch64_simd_float_type_node);
layout_type (aarch64_simd_double_type_node);
/* Define typedefs which exactly correspond to the modes we are basing vector
types on. If you change these names you'll need to change
the table used by aarch64_mangle_type too. */
(*lang_hooks.types.register_builtin_type) (aarch64_simd_intQI_type_node,
"__builtin_aarch64_simd_qi");
(*lang_hooks.types.register_builtin_type) (aarch64_simd_intHI_type_node,
"__builtin_aarch64_simd_hi");
(*lang_hooks.types.register_builtin_type) (aarch64_simd_intSI_type_node,
"__builtin_aarch64_simd_si");
(*lang_hooks.types.register_builtin_type) (aarch64_simd_float_type_node,
"__builtin_aarch64_simd_sf");
(*lang_hooks.types.register_builtin_type) (aarch64_simd_intDI_type_node,
"__builtin_aarch64_simd_di");
(*lang_hooks.types.register_builtin_type) (aarch64_simd_double_type_node,
"__builtin_aarch64_simd_df");
(*lang_hooks.types.register_builtin_type) (aarch64_simd_polyQI_type_node,
"__builtin_aarch64_simd_poly8");
(*lang_hooks.types.register_builtin_type) (aarch64_simd_polyHI_type_node,
"__builtin_aarch64_simd_poly16");
intQI_pointer_node = build_pointer_type (aarch64_simd_intQI_type_node);
intHI_pointer_node = build_pointer_type (aarch64_simd_intHI_type_node);
intSI_pointer_node = build_pointer_type (aarch64_simd_intSI_type_node);
intDI_pointer_node = build_pointer_type (aarch64_simd_intDI_type_node);
float_pointer_node = build_pointer_type (aarch64_simd_float_type_node);
double_pointer_node = build_pointer_type (aarch64_simd_double_type_node);
/* Next create constant-qualified versions of the above types. */
const_intQI_node = build_qualified_type (aarch64_simd_intQI_type_node,
TYPE_QUAL_CONST);
const_intHI_node = build_qualified_type (aarch64_simd_intHI_type_node,
TYPE_QUAL_CONST);
const_intSI_node = build_qualified_type (aarch64_simd_intSI_type_node,
TYPE_QUAL_CONST);
const_intDI_node = build_qualified_type (aarch64_simd_intDI_type_node,
TYPE_QUAL_CONST);
const_float_node = build_qualified_type (aarch64_simd_float_type_node,
TYPE_QUAL_CONST);
const_double_node = build_qualified_type (aarch64_simd_double_type_node,
TYPE_QUAL_CONST);
const_intQI_pointer_node = build_pointer_type (const_intQI_node);
const_intHI_pointer_node = build_pointer_type (const_intHI_node);
const_intSI_pointer_node = build_pointer_type (const_intSI_node);
const_intDI_pointer_node = build_pointer_type (const_intDI_node);
const_float_pointer_node = build_pointer_type (const_float_node);
const_double_pointer_node = build_pointer_type (const_double_node);
/* Now create vector types based on our AARCH64 SIMD element types. */
/* 64-bit vectors. */
V8QI_type_node =
build_vector_type_for_mode (aarch64_simd_intQI_type_node, V8QImode);
V4HI_type_node =
build_vector_type_for_mode (aarch64_simd_intHI_type_node, V4HImode);
V2SI_type_node =
build_vector_type_for_mode (aarch64_simd_intSI_type_node, V2SImode);
V2SF_type_node =
build_vector_type_for_mode (aarch64_simd_float_type_node, V2SFmode);
/* 128-bit vectors. */
V16QI_type_node =
build_vector_type_for_mode (aarch64_simd_intQI_type_node, V16QImode);
V8HI_type_node =
build_vector_type_for_mode (aarch64_simd_intHI_type_node, V8HImode);
V4SI_type_node =
build_vector_type_for_mode (aarch64_simd_intSI_type_node, V4SImode);
V4SF_type_node =
build_vector_type_for_mode (aarch64_simd_float_type_node, V4SFmode);
V2DI_type_node =
build_vector_type_for_mode (aarch64_simd_intDI_type_node, V2DImode);
V2DF_type_node =
build_vector_type_for_mode (aarch64_simd_double_type_node, V2DFmode);
/* Unsigned integer types for various mode sizes. */
intUQI_type_node = make_unsigned_type (GET_MODE_PRECISION (QImode));
intUHI_type_node = make_unsigned_type (GET_MODE_PRECISION (HImode));
intUSI_type_node = make_unsigned_type (GET_MODE_PRECISION (SImode));
intUDI_type_node = make_unsigned_type (GET_MODE_PRECISION (DImode));
(*lang_hooks.types.register_builtin_type) (intUQI_type_node,
"__builtin_aarch64_simd_uqi");
(*lang_hooks.types.register_builtin_type) (intUHI_type_node,
"__builtin_aarch64_simd_uhi");
(*lang_hooks.types.register_builtin_type) (intUSI_type_node,
"__builtin_aarch64_simd_usi");
(*lang_hooks.types.register_builtin_type) (intUDI_type_node,
"__builtin_aarch64_simd_udi");
/* Opaque integer types for structures of vectors. */
intEI_type_node = make_signed_type (GET_MODE_PRECISION (EImode));
intOI_type_node = make_signed_type (GET_MODE_PRECISION (OImode));
intCI_type_node = make_signed_type (GET_MODE_PRECISION (CImode));
intXI_type_node = make_signed_type (GET_MODE_PRECISION (XImode));
(*lang_hooks.types.register_builtin_type) (intTI_type_node,
"__builtin_aarch64_simd_ti");
(*lang_hooks.types.register_builtin_type) (intEI_type_node,
"__builtin_aarch64_simd_ei");
(*lang_hooks.types.register_builtin_type) (intOI_type_node,
"__builtin_aarch64_simd_oi");
(*lang_hooks.types.register_builtin_type) (intCI_type_node,
"__builtin_aarch64_simd_ci");
(*lang_hooks.types.register_builtin_type) (intXI_type_node,
"__builtin_aarch64_simd_xi");
/* Pointers to vector types. */
V8QI_pointer_node = build_pointer_type (V8QI_type_node);
V4HI_pointer_node = build_pointer_type (V4HI_type_node);
V2SI_pointer_node = build_pointer_type (V2SI_type_node);
V2SF_pointer_node = build_pointer_type (V2SF_type_node);
V16QI_pointer_node = build_pointer_type (V16QI_type_node);
V8HI_pointer_node = build_pointer_type (V8HI_type_node);
V4SI_pointer_node = build_pointer_type (V4SI_type_node);
V4SF_pointer_node = build_pointer_type (V4SF_type_node);
V2DI_pointer_node = build_pointer_type (V2DI_type_node);
V2DF_pointer_node = build_pointer_type (V2DF_type_node);
/* Operations which return results as pairs. */
void_ftype_pv8qi_v8qi_v8qi =
build_function_type_list (void_type_node, V8QI_pointer_node,
V8QI_type_node, V8QI_type_node, NULL);
void_ftype_pv4hi_v4hi_v4hi =
build_function_type_list (void_type_node, V4HI_pointer_node,
V4HI_type_node, V4HI_type_node, NULL);
void_ftype_pv2si_v2si_v2si =
build_function_type_list (void_type_node, V2SI_pointer_node,
V2SI_type_node, V2SI_type_node, NULL);
void_ftype_pv2sf_v2sf_v2sf =
build_function_type_list (void_type_node, V2SF_pointer_node,
V2SF_type_node, V2SF_type_node, NULL);
void_ftype_pdi_di_di =
build_function_type_list (void_type_node, intDI_pointer_node,
aarch64_simd_intDI_type_node,
aarch64_simd_intDI_type_node, NULL);
void_ftype_pv16qi_v16qi_v16qi =
build_function_type_list (void_type_node, V16QI_pointer_node,
V16QI_type_node, V16QI_type_node, NULL);
void_ftype_pv8hi_v8hi_v8hi =
build_function_type_list (void_type_node, V8HI_pointer_node,
V8HI_type_node, V8HI_type_node, NULL);
void_ftype_pv4si_v4si_v4si =
build_function_type_list (void_type_node, V4SI_pointer_node,
V4SI_type_node, V4SI_type_node, NULL);
void_ftype_pv4sf_v4sf_v4sf =
build_function_type_list (void_type_node, V4SF_pointer_node,
V4SF_type_node, V4SF_type_node, NULL);
void_ftype_pv2di_v2di_v2di =
build_function_type_list (void_type_node, V2DI_pointer_node,
V2DI_type_node, V2DI_type_node, NULL);
void_ftype_pv2df_v2df_v2df =
build_function_type_list (void_type_node, V2DF_pointer_node,
V2DF_type_node, V2DF_type_node, NULL);
dreg_types[0] = V8QI_type_node;
dreg_types[1] = V4HI_type_node;
dreg_types[2] = V2SI_type_node;
dreg_types[3] = V2SF_type_node;
dreg_types[4] = aarch64_simd_intDI_type_node;
dreg_types[5] = aarch64_simd_double_type_node;
qreg_types[0] = V16QI_type_node;
qreg_types[1] = V8HI_type_node;
qreg_types[2] = V4SI_type_node;
qreg_types[3] = V4SF_type_node;
qreg_types[4] = V2DI_type_node;
qreg_types[5] = V2DF_type_node;
/* If NUM_DREG_TYPES != NUM_QREG_TYPES, we will need separate nested loops
for qreg and dreg reinterp inits. */
for (i = 0; i < NUM_DREG_TYPES; i++)
{
int j;
for (j = 0; j < NUM_DREG_TYPES; j++)
{
reinterp_ftype_dreg[i][j]
= build_function_type_list (dreg_types[i], dreg_types[j], NULL);
reinterp_ftype_qreg[i][j]
= build_function_type_list (qreg_types[i], qreg_types[j], NULL);
}
}
for (i = 0; i < ARRAY_SIZE (aarch64_simd_builtin_data); i++)
{
aarch64_simd_builtin_datum *d = &aarch64_simd_builtin_data[i];
unsigned int j, codeidx = 0;
d->base_fcode = fcode;
for (j = 0; j < T_MAX; j++)
{
const char *const modenames[] = {
"v8qi", "v4hi", "v2si", "v2sf", "di", "df",
"v16qi", "v8hi", "v4si", "v4sf", "v2di", "v2df",
"ti", "ei", "oi", "xi", "si", "hi", "qi"
};
char namebuf[60];
tree ftype = NULL;
enum insn_code icode;
int is_load = 0;
int is_store = 0;
/* Skip if particular mode not supported. */
if ((d->bits & (1 << j)) == 0)
continue;
icode = d->codes[codeidx++];
switch (d->itype)
{
case AARCH64_SIMD_LOAD1:
case AARCH64_SIMD_LOAD1LANE:
case AARCH64_SIMD_LOADSTRUCTLANE:
case AARCH64_SIMD_LOADSTRUCT:
is_load = 1;
/* Fall through. */
case AARCH64_SIMD_STORE1:
case AARCH64_SIMD_STORE1LANE:
case AARCH64_SIMD_STORESTRUCTLANE:
case AARCH64_SIMD_STORESTRUCT:
if (!is_load)
is_store = 1;
/* Fall through. */
case AARCH64_SIMD_UNOP:
case AARCH64_SIMD_BINOP:
case AARCH64_SIMD_LOGICBINOP:
case AARCH64_SIMD_SHIFTINSERT:
case AARCH64_SIMD_TERNOP:
case AARCH64_SIMD_QUADOP:
case AARCH64_SIMD_GETLANE:
case AARCH64_SIMD_SETLANE:
case AARCH64_SIMD_CREATE:
case AARCH64_SIMD_DUP:
case AARCH64_SIMD_DUPLANE:
case AARCH64_SIMD_SHIFTIMM:
case AARCH64_SIMD_SHIFTACC:
case AARCH64_SIMD_COMBINE:
case AARCH64_SIMD_SPLIT:
case AARCH64_SIMD_CONVERT:
case AARCH64_SIMD_FIXCONV:
case AARCH64_SIMD_LANEMUL:
case AARCH64_SIMD_LANEMULL:
case AARCH64_SIMD_LANEMULH:
case AARCH64_SIMD_LANEMAC:
case AARCH64_SIMD_SCALARMUL:
case AARCH64_SIMD_SCALARMULL:
case AARCH64_SIMD_SCALARMULH:
case AARCH64_SIMD_SCALARMAC:
case AARCH64_SIMD_SELECT:
case AARCH64_SIMD_VTBL:
case AARCH64_SIMD_VTBX:
{
int k;
tree return_type = void_type_node, args = void_list_node;
/* Build a function type directly from the insn_data for this
builtin. The build_function_type() function takes care of
removing duplicates for us. */
for (k = insn_data[icode].n_operands - 1; k >= 0; k--)
{
tree eltype;
/* Skip an internal operand for vget_{low, high}. */
if (k == 2 && d->itype == AARCH64_SIMD_SPLIT)
continue;
if (is_load && k == 1)
{
/* AdvSIMD load patterns always have the memory operand
(a DImode pointer) in the operand 1 position. We
want a const pointer to the element type in that
position. */
gcc_assert (insn_data[icode].operand[k].mode ==
DImode);
switch (1 << j)
{
case T_V8QI:
case T_V16QI:
eltype = const_intQI_pointer_node;
break;
case T_V4HI:
case T_V8HI:
eltype = const_intHI_pointer_node;
break;
case T_V2SI:
case T_V4SI:
eltype = const_intSI_pointer_node;
break;
case T_V2SF:
case T_V4SF:
eltype = const_float_pointer_node;
break;
case T_DI:
case T_V2DI:
eltype = const_intDI_pointer_node;
break;
case T_DF:
case T_V2DF:
eltype = const_double_pointer_node;
break;
default:
gcc_unreachable ();
}
}
else if (is_store && k == 0)
{
/* Similarly, AdvSIMD store patterns use operand 0 as
the memory location to store to (a DImode pointer).
Use a pointer to the element type of the store in
that position. */
gcc_assert (insn_data[icode].operand[k].mode ==
DImode);
switch (1 << j)
{
case T_V8QI:
case T_V16QI:
eltype = intQI_pointer_node;
break;
case T_V4HI:
case T_V8HI:
eltype = intHI_pointer_node;
break;
case T_V2SI:
case T_V4SI:
eltype = intSI_pointer_node;
break;
case T_V2SF:
case T_V4SF:
eltype = float_pointer_node;
break;
case T_DI:
case T_V2DI:
eltype = intDI_pointer_node;
break;
case T_DF:
case T_V2DF:
eltype = double_pointer_node;
break;
default:
gcc_unreachable ();
}
}
else
{
switch (insn_data[icode].operand[k].mode)
{
case VOIDmode:
eltype = void_type_node;
break;
/* Scalars. */
case QImode:
eltype = aarch64_simd_intQI_type_node;
break;
case HImode:
eltype = aarch64_simd_intHI_type_node;
break;
case SImode:
eltype = aarch64_simd_intSI_type_node;
break;
case SFmode:
eltype = aarch64_simd_float_type_node;
break;
case DFmode:
eltype = aarch64_simd_double_type_node;
break;
case DImode:
eltype = aarch64_simd_intDI_type_node;
break;
case TImode:
eltype = intTI_type_node;
break;
case EImode:
eltype = intEI_type_node;
break;
case OImode:
eltype = intOI_type_node;
break;
case CImode:
eltype = intCI_type_node;
break;
case XImode:
eltype = intXI_type_node;
break;
/* 64-bit vectors. */
case V8QImode:
eltype = V8QI_type_node;
break;
case V4HImode:
eltype = V4HI_type_node;
break;
case V2SImode:
eltype = V2SI_type_node;
break;
case V2SFmode:
eltype = V2SF_type_node;
break;
/* 128-bit vectors. */
case V16QImode:
eltype = V16QI_type_node;
break;
case V8HImode:
eltype = V8HI_type_node;
break;
case V4SImode:
eltype = V4SI_type_node;
break;
case V4SFmode:
eltype = V4SF_type_node;
break;
case V2DImode:
eltype = V2DI_type_node;
break;
case V2DFmode:
eltype = V2DF_type_node;
break;
default:
gcc_unreachable ();
}
}
if (k == 0 && !is_store)
return_type = eltype;
else
args = tree_cons (NULL_TREE, eltype, args);
}
ftype = build_function_type (return_type, args);
}
break;
case AARCH64_SIMD_RESULTPAIR:
{
switch (insn_data[icode].operand[1].mode)
{
case V8QImode:
ftype = void_ftype_pv8qi_v8qi_v8qi;
break;
case V4HImode:
ftype = void_ftype_pv4hi_v4hi_v4hi;
break;
case V2SImode:
ftype = void_ftype_pv2si_v2si_v2si;
break;
case V2SFmode:
ftype = void_ftype_pv2sf_v2sf_v2sf;
break;
case DImode:
ftype = void_ftype_pdi_di_di;
break;
case V16QImode:
ftype = void_ftype_pv16qi_v16qi_v16qi;
break;
case V8HImode:
ftype = void_ftype_pv8hi_v8hi_v8hi;
break;
case V4SImode:
ftype = void_ftype_pv4si_v4si_v4si;
break;
case V4SFmode:
ftype = void_ftype_pv4sf_v4sf_v4sf;
break;
case V2DImode:
ftype = void_ftype_pv2di_v2di_v2di;
break;
case V2DFmode:
ftype = void_ftype_pv2df_v2df_v2df;
break;
default:
gcc_unreachable ();
}
}
break;
case AARCH64_SIMD_REINTERP:
{
/* We iterate over 6 doubleword types, then 6 quadword
types. */
int rhs_d = j % NUM_DREG_TYPES;
int rhs_q = (j - NUM_DREG_TYPES) % NUM_QREG_TYPES;
switch (insn_data[icode].operand[0].mode)
{
case V8QImode:
ftype = reinterp_ftype_dreg[0][rhs_d];
break;
case V4HImode:
ftype = reinterp_ftype_dreg[1][rhs_d];
break;
case V2SImode:
ftype = reinterp_ftype_dreg[2][rhs_d];
break;
case V2SFmode:
ftype = reinterp_ftype_dreg[3][rhs_d];
break;
case DImode:
ftype = reinterp_ftype_dreg[4][rhs_d];
break;
case DFmode:
ftype = reinterp_ftype_dreg[5][rhs_d];
break;
case V16QImode:
ftype = reinterp_ftype_qreg[0][rhs_q];
break;
case V8HImode:
ftype = reinterp_ftype_qreg[1][rhs_q];
break;
case V4SImode:
ftype = reinterp_ftype_qreg[2][rhs_q];
break;
case V4SFmode:
ftype = reinterp_ftype_qreg[3][rhs_q];
break;
case V2DImode:
ftype = reinterp_ftype_qreg[4][rhs_q];
break;
case V2DFmode:
ftype = reinterp_ftype_qreg[5][rhs_q];
break;
default:
gcc_unreachable ();
}
}
break;
default:
gcc_unreachable ();
}
gcc_assert (ftype != NULL);
snprintf (namebuf, sizeof (namebuf), "__builtin_aarch64_%s%s",
d->name, modenames[j]);
add_builtin_function (namebuf, ftype, fcode++, BUILT_IN_MD, NULL,
NULL_TREE);
}
}
}
static int
aarch64_simd_builtin_compare (const void *a, const void *b)
{
const aarch64_simd_builtin_datum *const key =
(const aarch64_simd_builtin_datum *) a;
const aarch64_simd_builtin_datum *const memb =
(const aarch64_simd_builtin_datum *) b;
unsigned int soughtcode = key->base_fcode;
if (soughtcode >= memb->base_fcode
&& soughtcode < memb->base_fcode + memb->num_vars)
return 0;
else if (soughtcode < memb->base_fcode)
return -1;
else
return 1;
}
static enum insn_code
locate_simd_builtin_icode (int fcode, aarch64_simd_itype * itype)
{
aarch64_simd_builtin_datum key
= { NULL, (aarch64_simd_itype) 0, 0, {CODE_FOR_nothing}, 0, 0};
aarch64_simd_builtin_datum *found;
int idx;
key.base_fcode = fcode;
found = (aarch64_simd_builtin_datum *)
bsearch (&key, &aarch64_simd_builtin_data[0],
ARRAY_SIZE (aarch64_simd_builtin_data),
sizeof (aarch64_simd_builtin_data[0]),
aarch64_simd_builtin_compare);
gcc_assert (found);
idx = fcode - (int) found->base_fcode;
gcc_assert (idx >= 0 && idx < T_MAX && idx < (int) found->num_vars);
if (itype)
*itype = found->itype;
return found->codes[idx];
}
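The comparison function above treats each table entry as a *range* of function codes (`base_fcode` through `base_fcode + num_vars - 1`), which lets a single `bsearch` map any code back to its owning entry. A minimal standalone sketch of the same range-keyed bsearch pattern, using a hypothetical `datum` struct rather than the GCC one:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical table entry: a contiguous range of codes.  */
typedef struct
{
  unsigned int base;  /* First code in the range.  */
  unsigned int nvars; /* Number of codes in the range.  */
} datum;

/* Return 0 when the sought code (carried in KEY->base) falls inside
   MEMB's range; otherwise order by code so bsearch can narrow in.  */
static int
range_compare (const void *a, const void *b)
{
  const datum *key = (const datum *) a;
  const datum *memb = (const datum *) b;
  if (key->base >= memb->base && key->base < memb->base + memb->nvars)
    return 0;
  return key->base < memb->base ? -1 : 1;
}

/* The table must be sorted by BASE with non-overlapping ranges.  */
static const datum table[] = { {0, 3}, {3, 2}, {5, 4} };

static const datum *
locate (unsigned int code)
{
  datum key = { code, 0 };
  return (const datum *) bsearch (&key, table,
                                  sizeof table / sizeof table[0],
                                  sizeof table[0], range_compare);
}
```

As in `locate_simd_builtin_icode`, the caller then recovers the per-variant index as `code - found->base`.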
typedef enum
{
SIMD_ARG_COPY_TO_REG,
SIMD_ARG_CONSTANT,
SIMD_ARG_STOP
} builtin_simd_arg;
#define SIMD_MAX_BUILTIN_ARGS 5
static rtx
aarch64_simd_expand_args (rtx target, int icode, int have_retval,
tree exp, ...)
{
va_list ap;
rtx pat;
tree arg[SIMD_MAX_BUILTIN_ARGS];
rtx op[SIMD_MAX_BUILTIN_ARGS];
enum machine_mode tmode = insn_data[icode].operand[0].mode;
enum machine_mode mode[SIMD_MAX_BUILTIN_ARGS];
int argc = 0;
if (have_retval
&& (!target
|| GET_MODE (target) != tmode
|| !(*insn_data[icode].operand[0].predicate) (target, tmode)))
target = gen_reg_rtx (tmode);
va_start (ap, exp);
for (;;)
{
builtin_simd_arg thisarg = (builtin_simd_arg) va_arg (ap, int);
if (thisarg == SIMD_ARG_STOP)
break;
else
{
arg[argc] = CALL_EXPR_ARG (exp, argc);
op[argc] = expand_normal (arg[argc]);
mode[argc] = insn_data[icode].operand[argc + have_retval].mode;
switch (thisarg)
{
case SIMD_ARG_COPY_TO_REG:
/*gcc_assert (GET_MODE (op[argc]) == mode[argc]); */
if (!(*insn_data[icode].operand[argc + have_retval].predicate)
(op[argc], mode[argc]))
op[argc] = copy_to_mode_reg (mode[argc], op[argc]);
break;
case SIMD_ARG_CONSTANT:
if (!(*insn_data[icode].operand[argc + have_retval].predicate)
(op[argc], mode[argc]))
error_at (EXPR_LOCATION (exp), "incompatible type for argument %d, "
"expected %<const int%>", argc + 1);
break;
case SIMD_ARG_STOP:
gcc_unreachable ();
}
argc++;
}
}
va_end (ap);
if (have_retval)
switch (argc)
{
case 1:
pat = GEN_FCN (icode) (target, op[0]);
break;
case 2:
pat = GEN_FCN (icode) (target, op[0], op[1]);
break;
case 3:
pat = GEN_FCN (icode) (target, op[0], op[1], op[2]);
break;
case 4:
pat = GEN_FCN (icode) (target, op[0], op[1], op[2], op[3]);
break;
case 5:
pat = GEN_FCN (icode) (target, op[0], op[1], op[2], op[3], op[4]);
break;
default:
gcc_unreachable ();
}
else
switch (argc)
{
case 1:
pat = GEN_FCN (icode) (op[0]);
break;
case 2:
pat = GEN_FCN (icode) (op[0], op[1]);
break;
case 3:
pat = GEN_FCN (icode) (op[0], op[1], op[2]);
break;
case 4:
pat = GEN_FCN (icode) (op[0], op[1], op[2], op[3]);
break;
case 5:
pat = GEN_FCN (icode) (op[0], op[1], op[2], op[3], op[4]);
break;
default:
gcc_unreachable ();
}
if (!pat)
return 0;
emit_insn (pat);
return target;
}
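`aarch64_simd_expand_args` walks its variable argument list until it meets the `SIMD_ARG_STOP` sentinel, reading each tag back with `va_arg (ap, int)` because enums are promoted to `int` through `...`. The sentinel-terminated varargs pattern in isolation, with hypothetical tag names:

```c
#include <assert.h>
#include <stdarg.h>

typedef enum
{
  ARG_REG,      /* Operand must be copied into a register.  */
  ARG_CONSTANT, /* Operand must satisfy a constant predicate.  */
  ARG_STOP      /* Sentinel: end of the tag list.  */
} arg_tag;

/* Count the tags preceding the ARG_STOP sentinel.  Enum arguments are
   promoted to int through "...", so read them back as int.  */
static int
count_args (int first, ...)
{
  va_list ap;
  int n = 0;
  arg_tag t = (arg_tag) first;
  va_start (ap, first);
  while (t != ARG_STOP)
    {
      n++;
      t = (arg_tag) va_arg (ap, int);
    }
  va_end (ap);
  return n;
}
```

Forgetting the trailing sentinel makes `va_arg` read past the supplied arguments, which is why every caller in `aarch64_simd_expand_builtin` ends its list with `SIMD_ARG_STOP`.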
/* Expand an AArch64 AdvSIMD builtin (intrinsic). */
rtx
aarch64_simd_expand_builtin (int fcode, tree exp, rtx target)
{
aarch64_simd_itype itype;
enum insn_code icode = locate_simd_builtin_icode (fcode, &itype);
switch (itype)
{
case AARCH64_SIMD_UNOP:
return aarch64_simd_expand_args (target, icode, 1, exp,
SIMD_ARG_COPY_TO_REG,
SIMD_ARG_STOP);
case AARCH64_SIMD_BINOP:
{
rtx arg2 = expand_normal (CALL_EXPR_ARG (exp, 1));
/* Handle constants only if the predicate allows it. */
bool op1_const_int_p =
(CONST_INT_P (arg2)
&& (*insn_data[icode].operand[2].predicate)
(arg2, insn_data[icode].operand[2].mode));
return aarch64_simd_expand_args
(target, icode, 1, exp,
SIMD_ARG_COPY_TO_REG,
op1_const_int_p ? SIMD_ARG_CONSTANT : SIMD_ARG_COPY_TO_REG,
SIMD_ARG_STOP);
}
case AARCH64_SIMD_TERNOP:
return aarch64_simd_expand_args (target, icode, 1, exp,
SIMD_ARG_COPY_TO_REG,
SIMD_ARG_COPY_TO_REG,
SIMD_ARG_COPY_TO_REG,
SIMD_ARG_STOP);
case AARCH64_SIMD_QUADOP:
return aarch64_simd_expand_args (target, icode, 1, exp,
SIMD_ARG_COPY_TO_REG,
SIMD_ARG_COPY_TO_REG,
SIMD_ARG_COPY_TO_REG,
SIMD_ARG_COPY_TO_REG,
SIMD_ARG_STOP);
case AARCH64_SIMD_LOAD1:
case AARCH64_SIMD_LOADSTRUCT:
return aarch64_simd_expand_args (target, icode, 1, exp,
SIMD_ARG_COPY_TO_REG, SIMD_ARG_STOP);
case AARCH64_SIMD_STORESTRUCT:
return aarch64_simd_expand_args (target, icode, 0, exp,
SIMD_ARG_COPY_TO_REG,
SIMD_ARG_COPY_TO_REG, SIMD_ARG_STOP);
case AARCH64_SIMD_REINTERP:
return aarch64_simd_expand_args (target, icode, 1, exp,
SIMD_ARG_COPY_TO_REG, SIMD_ARG_STOP);
case AARCH64_SIMD_CREATE:
return aarch64_simd_expand_args (target, icode, 1, exp,
SIMD_ARG_COPY_TO_REG, SIMD_ARG_STOP);
case AARCH64_SIMD_COMBINE:
return aarch64_simd_expand_args (target, icode, 1, exp,
SIMD_ARG_COPY_TO_REG,
SIMD_ARG_COPY_TO_REG, SIMD_ARG_STOP);
case AARCH64_SIMD_GETLANE:
return aarch64_simd_expand_args (target, icode, 1, exp,
SIMD_ARG_COPY_TO_REG,
SIMD_ARG_CONSTANT,
SIMD_ARG_STOP);
case AARCH64_SIMD_SETLANE:
return aarch64_simd_expand_args (target, icode, 1, exp,
SIMD_ARG_COPY_TO_REG,
SIMD_ARG_COPY_TO_REG,
SIMD_ARG_CONSTANT,
SIMD_ARG_STOP);
case AARCH64_SIMD_SHIFTIMM:
return aarch64_simd_expand_args (target, icode, 1, exp,
SIMD_ARG_COPY_TO_REG,
SIMD_ARG_CONSTANT,
SIMD_ARG_STOP);
case AARCH64_SIMD_SHIFTACC:
case AARCH64_SIMD_SHIFTINSERT:
return aarch64_simd_expand_args (target, icode, 1, exp,
SIMD_ARG_COPY_TO_REG,
SIMD_ARG_COPY_TO_REG,
SIMD_ARG_CONSTANT,
SIMD_ARG_STOP);
default:
gcc_unreachable ();
}
}
/* Copyright (C) 2011, 2012 Free Software Foundation, Inc.
Contributed by ARM Ltd.
This file is part of GCC.
GCC is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 3, or (at your option)
any later version.
GCC is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
General Public License for more details.
You should have received a copy of the GNU General Public License
along with GCC; see the file COPYING3. If not see
<http://www.gnu.org/licenses/>. */
/* This is a list of cores that implement AArch64.
Before using #include to read this file, define a macro:
AARCH64_CORE(CORE_NAME, CORE_IDENT, ARCH, FLAGS, COSTS)
The CORE_NAME is the name of the core, represented as a string constant.
The CORE_IDENT is the name of the core, represented as an identifier.
ARCH is the architecture revision implemented by the chip.
FLAGS are the bitwise-or of the traits that apply to that core.
This need not include flags implied by the architecture.
COSTS is the name of the rtx_costs routine to use. */
/* V8 Architecture Processors.
This list currently contains example CPUs that implement AArch64, and
therefore serves as a template for adding more CPUs in the future. */
AARCH64_CORE("example-1", large, 8, AARCH64_FL_FPSIMD, generic)
AARCH64_CORE("example-2", small, 8, AARCH64_FL_FPSIMD, generic)
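The .def file above is consumed with the X-macro idiom: each includer defines `AARCH64_CORE` to extract whichever columns it needs, includes the file, then undefines the macro (the `enum aarch64_processor` in aarch64-opts.h does exactly this). A self-contained sketch of the idiom, with a hypothetical two-entry table written inline instead of in a separate file:

```c
#include <assert.h>

/* Stand-in for the .def file: each X(NAME, IDENT, ARCH) is one core.  */
#define CORE_TABLE \
  X ("example-1", large, 8) \
  X ("example-2", small, 8)

/* First expansion: an enum of core identifiers.  */
enum core_id
{
#define X(NAME, IDENT, ARCH) IDENT,
  CORE_TABLE
#undef X
  core_none
};

/* Second expansion of the same table: a parallel array of names,
   guaranteed to stay in sync with the enum above.  */
static const char *const core_names[] =
{
#define X(NAME, IDENT, ARCH) NAME,
  CORE_TABLE
#undef X
};
```

Adding a core then means editing one line of the table; every expansion site picks it up automatically.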
/* Machine description for AArch64 architecture.
Copyright (C) 2009, 2010, 2011, 2012 Free Software Foundation, Inc.
Contributed by ARM Ltd.
This file is part of GCC.
GCC is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 3, or (at your option)
any later version.
GCC is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
General Public License for more details.
You should have received a copy of the GNU General Public License
along with GCC; see the file COPYING3. If not see
<http://www.gnu.org/licenses/>. */
/* Support for bare-metal builds. */
#ifndef GCC_AARCH64_ELF_RAW_H
#define GCC_AARCH64_ELF_RAW_H
#define STARTFILE_SPEC " crti%O%s crtbegin%O%s crt0%O%s"
#define ENDFILE_SPEC " crtend%O%s crtn%O%s"
#ifndef LINK_SPEC
#define LINK_SPEC "%{mbig-endian:-EB} %{mlittle-endian:-EL} -X"
#endif
#endif /* GCC_AARCH64_ELF_RAW_H */
/* Machine description for AArch64 architecture.
Copyright (C) 2009, 2010, 2011, 2012 Free Software Foundation, Inc.
Contributed by ARM Ltd.
This file is part of GCC.
GCC is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 3, or (at your option)
any later version.
GCC is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
General Public License for more details.
You should have received a copy of the GNU General Public License
along with GCC; see the file COPYING3. If not see
<http://www.gnu.org/licenses/>. */
#ifndef GCC_AARCH64_ELF_H
#define GCC_AARCH64_ELF_H
#define ASM_OUTPUT_LABELREF(FILE, NAME) \
aarch64_asm_output_labelref (FILE, NAME)
#define ASM_OUTPUT_DEF(FILE, NAME1, NAME2) \
do \
{ \
assemble_name (FILE, NAME1); \
fputs (" = ", FILE); \
assemble_name (FILE, NAME2); \
fputc ('\n', FILE); \
} while (0)
#define TEXT_SECTION_ASM_OP "\t.text"
#define DATA_SECTION_ASM_OP "\t.data"
#define BSS_SECTION_ASM_OP "\t.bss"
#define CTORS_SECTION_ASM_OP "\t.section\t.init_array,\"aw\",%init_array"
#define DTORS_SECTION_ASM_OP "\t.section\t.fini_array,\"aw\",%fini_array"
#undef INIT_SECTION_ASM_OP
#undef FINI_SECTION_ASM_OP
#define INIT_ARRAY_SECTION_ASM_OP CTORS_SECTION_ASM_OP
#define FINI_ARRAY_SECTION_ASM_OP DTORS_SECTION_ASM_OP
/* Since we use .init_array/.fini_array we don't need the markers at
the start and end of the ctors/dtors arrays. */
#define CTOR_LIST_BEGIN asm (CTORS_SECTION_ASM_OP)
#define CTOR_LIST_END /* empty */
#define DTOR_LIST_BEGIN asm (DTORS_SECTION_ASM_OP)
#define DTOR_LIST_END /* empty */
#undef TARGET_ASM_CONSTRUCTOR
#define TARGET_ASM_CONSTRUCTOR aarch64_elf_asm_constructor
#undef TARGET_ASM_DESTRUCTOR
#define TARGET_ASM_DESTRUCTOR aarch64_elf_asm_destructor
#ifdef HAVE_GAS_MAX_SKIP_P2ALIGN
/* Support for -falign-* switches. Use .p2align to ensure that code
sections are padded with NOP instructions, rather than zeros. */
#define ASM_OUTPUT_MAX_SKIP_ALIGN(FILE, LOG, MAX_SKIP) \
do \
{ \
if ((LOG) != 0) \
{ \
if ((MAX_SKIP) == 0) \
fprintf ((FILE), "\t.p2align %d\n", (int) (LOG)); \
else \
fprintf ((FILE), "\t.p2align %d,,%d\n", \
(int) (LOG), (int) (MAX_SKIP)); \
} \
} while (0)
#endif /* HAVE_GAS_MAX_SKIP_P2ALIGN */
#define JUMP_TABLES_IN_TEXT_SECTION 0
#define ASM_OUTPUT_ADDR_DIFF_ELT(STREAM, BODY, VALUE, REL) \
do { \
switch (GET_MODE (BODY)) \
{ \
case QImode: \
asm_fprintf (STREAM, "\t.byte\t(%LL%d - %LLrtx%d) / 4\n", \
VALUE, REL); \
break; \
case HImode: \
asm_fprintf (STREAM, "\t.2byte\t(%LL%d - %LLrtx%d) / 4\n", \
VALUE, REL); \
break; \
case SImode: \
case DImode: /* See comment in aarch64_output_casesi. */ \
asm_fprintf (STREAM, "\t.word\t(%LL%d - %LLrtx%d) / 4\n", \
VALUE, REL); \
break; \
default: \
gcc_unreachable (); \
} \
} while (0)
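The macro above emits each jump-table entry as the distance from the table base to the target label divided by 4, the fixed AArch64 instruction size; scaling lets a one-byte entry reach roughly +/-512 bytes. A toy encode/decode pair for that scaled-offset scheme (hypothetical helper names, not GCC code):

```c
#include <assert.h>

/* Encode a branch target as a scaled byte offset from BASE, as the
   .byte case of the macro does: (target - base) / 4.  */
static signed char
encode_entry (long target, long base)
{
  return (signed char) ((target - base) / 4);
}

/* Decode: base + 4 * entry recovers the target address.  */
static long
decode_entry (long base, signed char entry)
{
  return base + 4L * entry;
}
```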
#define ASM_OUTPUT_ALIGN(STREAM, POWER) \
fprintf(STREAM, "\t.align\t%d\n", (int)POWER)
#define ASM_COMMENT_START "//"
#define REGISTER_PREFIX ""
#define LOCAL_LABEL_PREFIX "."
#define USER_LABEL_PREFIX ""
#define GLOBAL_ASM_OP "\t.global\t"
#ifndef ASM_SPEC
#define ASM_SPEC "\
%{mbig-endian:-EB} \
%{mlittle-endian:-EL} \
%{mcpu=*:-mcpu=%*} \
%{march=*:-march=%*}"
#endif
#undef TYPE_OPERAND_FMT
#define TYPE_OPERAND_FMT "%%%s"
#undef TARGET_ASM_NAMED_SECTION
#define TARGET_ASM_NAMED_SECTION aarch64_elf_asm_named_section
/* Stabs debug not required. */
#undef DBX_DEBUGGING_INFO
#endif /* GCC_AARCH64_ELF_H */
;; Machine description for AArch64 architecture.
;; Copyright (C) 2009, 2010, 2011, 2012 Free Software Foundation, Inc.
;; Contributed by ARM Ltd.
;;
;; This file is part of GCC.
;;
;; GCC is free software; you can redistribute it and/or modify it
;; under the terms of the GNU General Public License as published by
;; the Free Software Foundation; either version 3, or (at your option)
;; any later version.
;;
;; GCC is distributed in the hope that it will be useful, but
;; WITHOUT ANY WARRANTY; without even the implied warranty of
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
;; General Public License for more details.
;;
;; You should have received a copy of the GNU General Public License
;; along with GCC; see the file COPYING3. If not see
;; <http://www.gnu.org/licenses/>.
;; Generic scheduler
(define_automaton "aarch64")
(define_cpu_unit "core" "aarch64")
(define_attr "is_load" "yes,no"
(if_then_else (eq_attr "v8type" "fpsimd_load,fpsimd_load2,load1,load2")
(const_string "yes")
(const_string "no")))
(define_insn_reservation "load" 2
(eq_attr "is_load" "yes")
"core")
(define_insn_reservation "nonload" 1
(eq_attr "is_load" "no")
"core")
/* Machine description for AArch64 architecture.
Copyright (C) 2009, 2010, 2011, 2012 Free Software Foundation, Inc.
Contributed by ARM Ltd.
This file is part of GCC.
GCC is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 3, or (at your option)
any later version.
GCC is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
General Public License for more details.
You should have received a copy of the GNU General Public License
along with GCC; see the file COPYING3. If not see
<http://www.gnu.org/licenses/>. */
#ifndef GCC_AARCH64_LINUX_H
#define GCC_AARCH64_LINUX_H
#define GLIBC_DYNAMIC_LINKER "/lib/ld-linux-aarch64.so.1"
#define LINUX_TARGET_LINK_SPEC "%{h*} \
%{static:-Bstatic} \
%{shared:-shared} \
%{symbolic:-Bsymbolic} \
%{rdynamic:-export-dynamic} \
-dynamic-linker " GNU_USER_DYNAMIC_LINKER " \
-X \
%{mbig-endian:-EB} %{mlittle-endian:-EL}"
#define LINK_SPEC LINUX_TARGET_LINK_SPEC
#define TARGET_OS_CPP_BUILTINS() \
do \
{ \
GNU_USER_TARGET_OS_CPP_BUILTINS(); \
} \
while (0)
#endif /* GCC_AARCH64_LINUX_H */
/* Machine description for AArch64 architecture.
Copyright (C) 2009, 2010, 2011, 2012 Free Software Foundation, Inc.
Contributed by ARM Ltd.
This file is part of GCC.
GCC is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 3, or (at your option)
any later version.
GCC is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
General Public License for more details.
You should have received a copy of the GNU General Public License
along with GCC; see the file COPYING3. If not see
<http://www.gnu.org/licenses/>. */
CC_MODE (CCFP);
CC_MODE (CCFPE);
CC_MODE (CC_SWP);
CC_MODE (CC_ZESWP); /* zero-extend LHS (but swap to make it RHS). */
CC_MODE (CC_SESWP); /* sign-extend LHS (but swap to make it RHS). */
CC_MODE (CC_NZ); /* Only N and Z bits of condition flags are valid. */
/* Vector modes. */
VECTOR_MODES (INT, 8); /* V8QI V4HI V2SI. */
VECTOR_MODES (INT, 16); /* V16QI V8HI V4SI V2DI. */
VECTOR_MODES (FLOAT, 8); /* V2SF. */
VECTOR_MODES (FLOAT, 16); /* V4SF V2DF. */
/* Oct Int: 256-bit integer mode needed for 32-byte vector arguments. */
INT_MODE (OI, 32);
/* Opaque integer modes for 3, 6 or 8 Neon double registers (2 is
TImode). */
INT_MODE (EI, 24);
INT_MODE (CI, 48);
INT_MODE (XI, 64);
/* Vector modes for register lists. */
VECTOR_MODES (INT, 32); /* V32QI V16HI V8SI V4DI. */
VECTOR_MODES (FLOAT, 32); /* V8SF V4DF. */
VECTOR_MODES (INT, 48); /* V48QI V24HI V12SI V6DI. */
VECTOR_MODES (FLOAT, 48); /* V12SF V6DF. */
VECTOR_MODES (INT, 64); /* V64QI V32HI V16SI V8DI. */
VECTOR_MODES (FLOAT, 64); /* V16SF V8DF. */
/* Quad float: 128-bit floating mode for long doubles. */
FLOAT_MODE (TF, 16, ieee_quad_format);
/* Copyright (C) 2012 Free Software Foundation, Inc.
Contributed by ARM Ltd.
This file is part of GCC.
GCC is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 3, or (at your option)
any later version.
GCC is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
General Public License for more details.
You should have received a copy of the GNU General Public License
along with GCC; see the file COPYING3. If not see
<http://www.gnu.org/licenses/>. */
/* This is a list of ISA extensions in AArch64.
Before using #include to read this file, define a macro:
AARCH64_OPT_EXTENSION(EXT_NAME, FLAGS_ON, FLAGS_OFF)
EXT_NAME is the name of the extension, represented as a string constant.
FLAGS_ON are the bitwise-or of the features that the extension adds.
FLAGS_OFF are the bitwise-or of the features that the extension removes. */
/* V8 Architecture Extensions.
This list currently contains example extensions for CPUs that implement
AArch64, and therefore serves as a template for adding more CPUs in the
future. */
AARCH64_OPT_EXTENSION("fp", AARCH64_FL_FP, AARCH64_FL_FPSIMD | AARCH64_FL_CRYPTO)
AARCH64_OPT_EXTENSION("simd", AARCH64_FL_FPSIMD, AARCH64_FL_SIMD | AARCH64_FL_CRYPTO)
AARCH64_OPT_EXTENSION("crypto", AARCH64_FL_CRYPTO | AARCH64_FL_FPSIMD, AARCH64_FL_CRYPTO)
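Each row carries two masks: the features to set when the extension is enabled (`+ext`) and the features to clear when it is explicitly disabled (`+noext`). That asymmetry encodes dependencies — enabling "crypto" pulls in fp+simd, while disabling "fp" must also drop everything built on it. A toy version of the on/off bookkeeping, with hypothetical flag bits mirroring the AARCH64_FL_* style:

```c
#include <assert.h>

#define FL_FP     (1u << 0)
#define FL_SIMD   (1u << 1)
#define FL_CRYPTO (1u << 2)
#define FL_FPSIMD (FL_FP | FL_SIMD)

typedef struct
{
  const char *name;
  unsigned int flags_on;  /* Set when "+name" is given.  */
  unsigned int flags_off; /* Cleared when "+noname" is given.  */
} extension;

static const extension extensions[] =
{
  /* Enabling crypto pulls in fp+simd; disabling clears only crypto.  */
  { "crypto", FL_CRYPTO | FL_FPSIMD, FL_CRYPTO },
  /* Disabling fp also drops simd and crypto, which depend on it.  */
  { "fp", FL_FP, FL_FPSIMD | FL_CRYPTO },
};

/* Apply extension EXT to ISA_FLAGS; ENABLE selects +ext vs +noext.  */
static unsigned int
apply_extension (unsigned int isa_flags, const extension *ext, int enable)
{
  return enable ? (isa_flags | ext->flags_on)
                : (isa_flags & ~ext->flags_off);
}
```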
/* Copyright (C) 2011, 2012 Free Software Foundation, Inc.
Contributed by ARM Ltd.
This file is part of GCC.
GCC is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published
by the Free Software Foundation; either version 3, or (at your
option) any later version.
GCC is distributed in the hope that it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public
License for more details.
You should have received a copy of the GNU General Public License
along with GCC; see the file COPYING3. If not see
<http://www.gnu.org/licenses/>. */
/* Definitions for option handling for AArch64. */
#ifndef GCC_AARCH64_OPTS_H
#define GCC_AARCH64_OPTS_H
/* The various cores that implement AArch64. */
enum aarch64_processor
{
#define AARCH64_CORE(NAME, IDENT, ARCH, FLAGS, COSTS) \
IDENT,
#include "aarch64-cores.def"
#undef AARCH64_CORE
/* Used to indicate that no processor has been specified. */
generic,
/* Used to mark the end of the processor table. */
aarch64_none
};
/* TLS types. */
enum aarch64_tls_type {
TLS_TRADITIONAL,
TLS_DESCRIPTORS
};
/* The code model defines the address generation strategy.
Most have a PIC and non-PIC variant. */
enum aarch64_code_model {
/* Static code and data fit within a 1MB region.
Not fully implemented, mostly treated as SMALL. */
AARCH64_CMODEL_TINY,
/* Static code, data and GOT/PLT fit within a 1MB region.
Not fully implemented, mostly treated as SMALL_PIC. */
AARCH64_CMODEL_TINY_PIC,
/* Static code and data fit within a 4GB region.
The default non-PIC code model. */
AARCH64_CMODEL_SMALL,
/* Static code, data and GOT/PLT fit within a 4GB region.
The default PIC code model. */
AARCH64_CMODEL_SMALL_PIC,
/* No assumptions about addresses of code and data.
The PIC variant is not yet implemented. */
AARCH64_CMODEL_LARGE
};
#endif
/* Machine description for AArch64 architecture.
Copyright (C) 2009, 2010, 2011, 2012 Free Software Foundation, Inc.
Contributed by ARM Ltd.
This file is part of GCC.
GCC is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 3, or (at your option)
any later version.
GCC is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
General Public License for more details.
You should have received a copy of the GNU General Public License
along with GCC; see the file COPYING3. If not see
<http://www.gnu.org/licenses/>. */
#ifndef GCC_AARCH64_PROTOS_H
#define GCC_AARCH64_PROTOS_H
/* This generator struct and enum is used to wrap a function pointer
to a function that generates an RTX fragment but takes either 3 or
4 operands.
The omn flavour, wraps a function that generates a synchronization
instruction from 3 operands: old value, memory and new value.
The omrn flavour, wraps a function that generates a synchronization
instruction from 4 operands: old value, memory, required value and
new value. */
enum aarch64_sync_generator_tag
{
aarch64_sync_generator_omn,
aarch64_sync_generator_omrn
};
/* Wrapper to pass around a polymorphic pointer to a sync instruction
generator together with its tag. */
struct aarch64_sync_generator
{
enum aarch64_sync_generator_tag op;
union
{
rtx (*omn) (rtx, rtx, rtx);
rtx (*omrn) (rtx, rtx, rtx, rtx);
} u;
};
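A caller of this wrapper would dispatch on the tag to invoke the function pointer with the right arity. A small sketch of the tagged-union dispatch pattern, with arithmetic stand-ins for the rtx generators:

```c
#include <assert.h>

/* The tag records which union member is valid.  */
enum gen_tag { GEN_3OP, GEN_4OP };

struct generator
{
  enum gen_tag tag;
  union
  {
    int (*g3) (int, int, int);
    int (*g4) (int, int, int, int);
  } u;
};

static int add3 (int a, int b, int c) { return a + b + c; }
static int add4 (int a, int b, int c, int d) { return a + b + c + d; }

static const struct generator gen3 = { GEN_3OP, { .g3 = add3 } };
static const struct generator gen4 = { GEN_4OP, { .g4 = add4 } };

/* Dispatch on the tag; the fourth operand is unused by 3-op forms.  */
static int
run (const struct generator *g, int a, int b, int c, int d)
{
  switch (g->tag)
    {
    case GEN_3OP:
      return g->u.g3 (a, b, c);
    case GEN_4OP:
      return g->u.g4 (a, b, c, d);
    }
  return 0;
}
```

Reading the wrong union member for a given tag is undefined behavior, which is why the tag and pointer travel together in one struct.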
/*
SYMBOL_CONTEXT_ADR
The symbol is used in a load-address operation.
SYMBOL_CONTEXT_MEM
The symbol is used as the address in a MEM.
*/
enum aarch64_symbol_context
{
SYMBOL_CONTEXT_MEM,
SYMBOL_CONTEXT_ADR
};
/* SYMBOL_SMALL_ABSOLUTE: Generate symbol accesses through
high and lo relocs that calculate the base address using a PC
relative reloc.
So to get the address of foo, we generate
adrp x0, foo
add x0, x0, :lo12:foo
To load or store something to foo, we could use the corresponding
load store variants that generate an
ldr x0, [x0,:lo12:foo]
or
str x1, [x0, :lo12:foo]
This corresponds to the small code model of the compiler.
SYMBOL_SMALL_GOT: Similar to the one above but this
gives us the GOT entry of the symbol being referred to:
Thus calculating the GOT entry for foo is done using the
following sequence of instructions. The ADRP instruction
gets us to the page containing the GOT entry of the symbol
and the got_lo12 gets us the actual offset in it.
adrp x0, :got:foo
ldr x0, [x0, :gotoff_lo12:foo]
This corresponds to the small PIC model of the compiler.
SYMBOL_SMALL_TLSGD
SYMBOL_SMALL_TLSDESC
SYMBOL_SMALL_GOTTPREL
SYMBOL_SMALL_TPREL
Each of these represents a thread-local symbol, and corresponds to the
thread local storage relocation operator for the symbol being referred to.
SYMBOL_FORCE_TO_MEM : Global variables are addressed using
constant pool. All variable addresses are spilled into constant
pools. The constant pools themselves are addressed using PC
relative accesses. This only works for the large code model.
*/
enum aarch64_symbol_type
{
SYMBOL_SMALL_ABSOLUTE,
SYMBOL_SMALL_GOT,
SYMBOL_SMALL_TLSGD,
SYMBOL_SMALL_TLSDESC,
SYMBOL_SMALL_GOTTPREL,
SYMBOL_SMALL_TPREL,
SYMBOL_FORCE_TO_MEM
};
/* A set of tuning parameters contains references to size and time
cost models and vectors for address cost calculations, register
move costs and memory move costs. */
/* Extra costs for specific insns. Only records the cost above a
single insn. */
struct cpu_rtx_cost_table
{
const int memory_load;
const int memory_store;
const int register_shift;
const int int_divide;
const int float_divide;
const int double_divide;
const int int_multiply;
const int int_multiply_extend;
const int int_multiply_add;
const int int_multiply_extend_add;
const int float_multiply;
const int double_multiply;
};
/* Additional cost for addresses. */
struct cpu_addrcost_table
{
const int pre_modify;
const int post_modify;
const int register_offset;
const int register_extend;
const int imm_offset;
};
/* Additional costs for register copies. Cost is for one register. */
struct cpu_regmove_cost
{
const int GP2GP;
const int GP2FP;
const int FP2GP;
const int FP2FP;
};
struct tune_params
{
const struct cpu_rtx_cost_table *const insn_extra_cost;
const struct cpu_addrcost_table *const addr_cost;
const struct cpu_regmove_cost *const regmove_cost;
const int memmov_cost;
};
HOST_WIDE_INT aarch64_initial_elimination_offset (unsigned, unsigned);
bool aarch64_bitmask_imm (HOST_WIDE_INT val, enum machine_mode);
bool aarch64_const_double_zero_rtx_p (rtx);
bool aarch64_constant_address_p (rtx);
bool aarch64_function_arg_regno_p (unsigned);
bool aarch64_gen_movmemqi (rtx *);
bool aarch64_is_extend_from_extract (enum machine_mode, rtx, rtx);
bool aarch64_is_long_call_p (rtx);
bool aarch64_label_mentioned_p (rtx);
bool aarch64_legitimate_pic_operand_p (rtx);
bool aarch64_move_imm (HOST_WIDE_INT, enum machine_mode);
bool aarch64_pad_arg_upward (enum machine_mode, const_tree);
bool aarch64_pad_reg_upward (enum machine_mode, const_tree, bool);
bool aarch64_regno_ok_for_base_p (int, bool);
bool aarch64_regno_ok_for_index_p (int, bool);
bool aarch64_simd_imm_scalar_p (rtx x, enum machine_mode mode);
bool aarch64_simd_imm_zero_p (rtx, enum machine_mode);
bool aarch64_simd_shift_imm_p (rtx, enum machine_mode, bool);
bool aarch64_symbolic_address_p (rtx);
bool aarch64_symbolic_constant_p (rtx, enum aarch64_symbol_context,
enum aarch64_symbol_type *);
bool aarch64_uimm12_shift (HOST_WIDE_INT);
const char *aarch64_output_casesi (rtx *);
const char *aarch64_output_sync_insn (rtx, rtx *);
const char *aarch64_output_sync_lock_release (rtx, rtx);
enum aarch64_symbol_type aarch64_classify_symbol (rtx,
enum aarch64_symbol_context);
enum aarch64_symbol_type aarch64_classify_tls_symbol (rtx);
enum reg_class aarch64_regno_regclass (unsigned);
int aarch64_asm_preferred_eh_data_format (int, int);
int aarch64_hard_regno_mode_ok (unsigned, enum machine_mode);
int aarch64_hard_regno_nregs (unsigned, enum machine_mode);
int aarch64_simd_attr_length_move (rtx);
int aarch64_simd_immediate_valid_for_move (rtx, enum machine_mode, rtx *,
int *, unsigned char *, int *,
int *);
int aarch64_uxt_size (int, HOST_WIDE_INT);
rtx aarch64_final_eh_return_addr (void);
rtx aarch64_legitimize_reload_address (rtx *, enum machine_mode, int, int, int);
const char *aarch64_output_move_struct (rtx *operands);
rtx aarch64_return_addr (int, rtx);
rtx aarch64_simd_gen_const_vector_dup (enum machine_mode, int);
bool aarch64_simd_mem_operand_p (rtx);
rtx aarch64_simd_vect_par_cnst_half (enum machine_mode, bool);
rtx aarch64_tls_get_addr (void);
unsigned aarch64_dbx_register_number (unsigned);
unsigned aarch64_trampoline_size (void);
unsigned aarch64_sync_loop_insns (rtx, rtx *);
void aarch64_asm_output_labelref (FILE *, const char *);
void aarch64_elf_asm_named_section (const char *, unsigned, tree);
void aarch64_expand_epilogue (bool);
void aarch64_expand_mov_immediate (rtx, rtx);
void aarch64_expand_prologue (void);
void aarch64_expand_sync (enum machine_mode, struct aarch64_sync_generator *,
rtx, rtx, rtx, rtx);
void aarch64_function_profiler (FILE *, int);
void aarch64_init_cumulative_args (CUMULATIVE_ARGS *, const_tree, rtx,
const_tree, unsigned);
void aarch64_init_expanders (void);
void aarch64_print_operand (FILE *, rtx, char);
void aarch64_print_operand_address (FILE *, rtx);
/* Initialize builtins for SIMD intrinsics. */
void init_aarch64_simd_builtins (void);
void aarch64_simd_const_bounds (rtx, HOST_WIDE_INT, HOST_WIDE_INT);
void aarch64_simd_disambiguate_copy (rtx *, rtx *, rtx *, unsigned int);
/* Emit code to place an AdvSIMD pair result in memory locations (with equal
registers). */
void aarch64_simd_emit_pair_result_insn (enum machine_mode,
rtx (*intfn) (rtx, rtx, rtx), rtx,
rtx);
/* Expand builtins for SIMD intrinsics. */
rtx aarch64_simd_expand_builtin (int, tree, rtx);
void aarch64_simd_lane_bounds (rtx, HOST_WIDE_INT, HOST_WIDE_INT);
/* Emit code for reinterprets. */
void aarch64_simd_reinterpret (rtx, rtx);
void aarch64_split_128bit_move (rtx, rtx);
bool aarch64_split_128bit_move_p (rtx, rtx);
#if defined (RTX_CODE)
bool aarch64_legitimate_address_p (enum machine_mode, rtx, RTX_CODE, bool);
enum machine_mode aarch64_select_cc_mode (RTX_CODE, rtx, rtx);
rtx aarch64_gen_compare_reg (RTX_CODE, rtx, rtx);
#endif /* RTX_CODE */
#endif /* GCC_AARCH64_PROTOS_H */
;; -*- buffer-read-only: t -*-
;; Generated automatically by gentune.sh from aarch64-cores.def
(define_attr "tune"
"large,small"
(const (symbol_ref "((enum attr_tune) aarch64_tune)")))
/* Machine description for AArch64 architecture.
Copyright (C) 2009, 2010, 2011, 2012 Free Software Foundation, Inc.
Contributed by ARM Ltd.
This file is part of GCC.
GCC is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 3, or (at your option)
any later version.
GCC is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
General Public License for more details.
You should have received a copy of the GNU General Public License
along with GCC; see the file COPYING3. If not see
<http://www.gnu.org/licenses/>. */
#ifndef GCC_AARCH64_H
#define GCC_AARCH64_H
/* Target CPU builtins. */
#define TARGET_CPU_CPP_BUILTINS() \
do \
{ \
builtin_define ("__aarch64__"); \
if (TARGET_BIG_END) \
builtin_define ("__AARCH64EB__"); \
else \
builtin_define ("__AARCH64EL__"); \
\
switch (aarch64_cmodel) \
{ \
case AARCH64_CMODEL_TINY: \
case AARCH64_CMODEL_TINY_PIC: \
builtin_define ("__AARCH64_CMODEL_TINY__"); \
break; \
case AARCH64_CMODEL_SMALL: \
case AARCH64_CMODEL_SMALL_PIC: \
builtin_define ("__AARCH64_CMODEL_SMALL__");\
break; \
case AARCH64_CMODEL_LARGE: \
builtin_define ("__AARCH64_CMODEL_LARGE__"); \
break; \
default: \
break; \
} \
\
} while (0)
/* Target machine storage layout. */
#define PROMOTE_MODE(MODE, UNSIGNEDP, TYPE) \
if (GET_MODE_CLASS (MODE) == MODE_INT \
&& GET_MODE_SIZE (MODE) < 4) \
{ \
if (MODE == QImode || MODE == HImode) \
{ \
MODE = SImode; \
} \
}
/* Bits are always numbered from the LSBit. */
#define BITS_BIG_ENDIAN 0
/* Big/little-endian flavour. */
#define BYTES_BIG_ENDIAN (TARGET_BIG_END != 0)
#define WORDS_BIG_ENDIAN (BYTES_BIG_ENDIAN)
/* AdvSIMD is supported in the default configuration, unless disabled by
-mgeneral-regs-only. */
#define TARGET_SIMD !TARGET_GENERAL_REGS_ONLY
#define TARGET_FLOAT !TARGET_GENERAL_REGS_ONLY
#define UNITS_PER_WORD 8
#define UNITS_PER_VREG 16
#define PARM_BOUNDARY 64
#define STACK_BOUNDARY 128
#define FUNCTION_BOUNDARY 32
#define EMPTY_FIELD_BOUNDARY 32
#define BIGGEST_ALIGNMENT 128
#define SHORT_TYPE_SIZE 16
#define INT_TYPE_SIZE 32
#define LONG_TYPE_SIZE 64 /* XXX This should be an option */
#define LONG_LONG_TYPE_SIZE 64
#define FLOAT_TYPE_SIZE 32
#define DOUBLE_TYPE_SIZE 64
#define LONG_DOUBLE_TYPE_SIZE 128
/* The architecture reserves all bits of the address for hardware use,
so the vbit must go into the delta field of pointers to member
functions. This is the same config as that in the AArch32
port. */
#define TARGET_PTRMEMFUNC_VBIT_LOCATION ptrmemfunc_vbit_in_delta
/* Make strings word-aligned so that strcpy from constants will be
faster. */
#define CONSTANT_ALIGNMENT(EXP, ALIGN) \
((TREE_CODE (EXP) == STRING_CST \
&& !optimize_size \
&& (ALIGN) < BITS_PER_WORD) \
? BITS_PER_WORD : ALIGN)
#define DATA_ALIGNMENT(EXP, ALIGN) \
((((ALIGN) < BITS_PER_WORD) \
&& (TREE_CODE (EXP) == ARRAY_TYPE \
|| TREE_CODE (EXP) == UNION_TYPE \
|| TREE_CODE (EXP) == RECORD_TYPE)) \
? BITS_PER_WORD : (ALIGN))
#define LOCAL_ALIGNMENT(EXP, ALIGN) DATA_ALIGNMENT(EXP, ALIGN)
#define STRUCTURE_SIZE_BOUNDARY 8
/* Defined by the ABI */
#define WCHAR_TYPE "unsigned int"
#define WCHAR_TYPE_SIZE 32
/* Using long long breaks -ansi and -std=c90, so these will need to be
made conditional for an LLP64 ABI. */
#define SIZE_TYPE "long unsigned int"
#define PTRDIFF_TYPE "long int"
#define PCC_BITFIELD_TYPE_MATTERS 1
/* Instruction tuning/selection flags. */
/* Bit values used to identify processor capabilities. */
#define AARCH64_FL_SIMD (1 << 0) /* Has SIMD instructions. */
#define AARCH64_FL_FP (1 << 1) /* Has FP. */
#define AARCH64_FL_CRYPTO (1 << 2) /* Has crypto. */
#define AARCH64_FL_SLOWMUL (1 << 3) /* A slow multiply core. */
/* Has FP and SIMD. */
#define AARCH64_FL_FPSIMD (AARCH64_FL_FP | AARCH64_FL_SIMD)
/* Has FP without SIMD. */
#define AARCH64_FL_FPQ16 (AARCH64_FL_FP & ~AARCH64_FL_SIMD)
/* Architecture flags that affect instruction selection. */
#define AARCH64_FL_FOR_ARCH8 (AARCH64_FL_FPSIMD)
/* Macros to test ISA flags. */
extern unsigned long aarch64_isa_flags;
#define AARCH64_ISA_CRYPTO (aarch64_isa_flags & AARCH64_FL_CRYPTO)
#define AARCH64_ISA_FP (aarch64_isa_flags & AARCH64_FL_FP)
#define AARCH64_ISA_SIMD (aarch64_isa_flags & AARCH64_FL_SIMD)
/* Macros to test tuning flags. */
extern unsigned long aarch64_tune_flags;
#define AARCH64_TUNE_SLOWMUL (aarch64_tune_flags & AARCH64_FL_SLOWMUL)
/* Standard register usage. */
/* 31 64-bit general purpose registers R0-R30:
R30 LR (link register)
R29 FP (frame pointer)
R19-R28 Callee-saved registers
R18 The platform register; use as temporary register.
R17 IP1 The second intra-procedure-call temporary register
(can be used by call veneers and PLT code); otherwise use
as a temporary register
R16 IP0 The first intra-procedure-call temporary register (can
be used by call veneers and PLT code); otherwise use as a
temporary register
R9-R15 Temporary registers
R8 Structure value parameter / temporary register
R0-R7 Parameter/result registers
SP stack pointer, encoded as X/R31 where permitted.
ZR zero register, encoded as X/R31 elsewhere
32 x 128-bit floating-point/vector registers
V16-V31 Caller-saved (temporary) registers
V8-V15 Callee-saved registers
V0-V7 Parameter/result registers
The vector register V0 holds scalar B0, H0, S0 and D0 in its least
significant bits. Unlike AArch32, S1 is not packed into D0,
etc. */
/* Note that we don't mark X30 as a call-clobbered register. The idea is
that it's really the call instructions themselves which clobber X30.
We don't care what the called function does with it afterwards.
This approach makes it easier to implement sibcalls. Unlike normal
calls, sibcalls don't clobber X30, so the register reaches the
called function intact. EPILOGUE_USES says that X30 is useful
to the called function. */
#define FIXED_REGISTERS \
{ \
0, 0, 0, 0, 0, 0, 0, 0, /* R0 - R7 */ \
0, 0, 0, 0, 0, 0, 0, 0, /* R8 - R15 */ \
0, 0, 0, 0, 0, 0, 0, 0, /* R16 - R23 */ \
0, 0, 0, 0, 0, 1, 0, 1, /* R24 - R30, SP */ \
0, 0, 0, 0, 0, 0, 0, 0, /* V0 - V7 */ \
0, 0, 0, 0, 0, 0, 0, 0, /* V8 - V15 */ \
0, 0, 0, 0, 0, 0, 0, 0, /* V16 - V23 */ \
0, 0, 0, 0, 0, 0, 0, 0, /* V24 - V31 */ \
1, 1, 1, /* SFP, AP, CC */ \
}
#define CALL_USED_REGISTERS \
{ \
1, 1, 1, 1, 1, 1, 1, 1, /* R0 - R7 */ \
1, 1, 1, 1, 1, 1, 1, 1, /* R8 - R15 */ \
1, 1, 1, 0, 0, 0, 0, 0, /* R16 - R23 */ \
0, 0, 0, 0, 0, 1, 0, 1, /* R24 - R30, SP */ \
1, 1, 1, 1, 1, 1, 1, 1, /* V0 - V7 */ \
0, 0, 0, 0, 0, 0, 0, 0, /* V8 - V15 */ \
1, 1, 1, 1, 1, 1, 1, 1, /* V16 - V23 */ \
1, 1, 1, 1, 1, 1, 1, 1, /* V24 - V31 */ \
1, 1, 1, /* SFP, AP, CC */ \
}
#define REGISTER_NAMES \
{ \
"x0", "x1", "x2", "x3", "x4", "x5", "x6", "x7", \
"x8", "x9", "x10", "x11", "x12", "x13", "x14", "x15", \
"x16", "x17", "x18", "x19", "x20", "x21", "x22", "x23", \
"x24", "x25", "x26", "x27", "x28", "x29", "x30", "sp", \
"v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", \
"v8", "v9", "v10", "v11", "v12", "v13", "v14", "v15", \
"v16", "v17", "v18", "v19", "v20", "v21", "v22", "v23", \
"v24", "v25", "v26", "v27", "v28", "v29", "v30", "v31", \
"sfp", "ap", "cc", \
}
/* Generate the register aliases for core register N */
#define R_ALIASES(N) {"r" # N, R0_REGNUM + (N)}, \
{"w" # N, R0_REGNUM + (N)}
#define V_ALIASES(N) {"q" # N, V0_REGNUM + (N)}, \
{"d" # N, V0_REGNUM + (N)}, \
{"s" # N, V0_REGNUM + (N)}, \
{"h" # N, V0_REGNUM + (N)}, \
{"b" # N, V0_REGNUM + (N)}
/* Provide aliases for all of the ISA defined register name forms.
These aliases are convenient for use in the clobber lists of inline
asm statements. */
#define ADDITIONAL_REGISTER_NAMES \
{ R_ALIASES(0), R_ALIASES(1), R_ALIASES(2), R_ALIASES(3), \
R_ALIASES(4), R_ALIASES(5), R_ALIASES(6), R_ALIASES(7), \
R_ALIASES(8), R_ALIASES(9), R_ALIASES(10), R_ALIASES(11), \
R_ALIASES(12), R_ALIASES(13), R_ALIASES(14), R_ALIASES(15), \
R_ALIASES(16), R_ALIASES(17), R_ALIASES(18), R_ALIASES(19), \
R_ALIASES(20), R_ALIASES(21), R_ALIASES(22), R_ALIASES(23), \
R_ALIASES(24), R_ALIASES(25), R_ALIASES(26), R_ALIASES(27), \
R_ALIASES(28), R_ALIASES(29), R_ALIASES(30), /* 31 omitted */ \
V_ALIASES(0), V_ALIASES(1), V_ALIASES(2), V_ALIASES(3), \
V_ALIASES(4), V_ALIASES(5), V_ALIASES(6), V_ALIASES(7), \
V_ALIASES(8), V_ALIASES(9), V_ALIASES(10), V_ALIASES(11), \
V_ALIASES(12), V_ALIASES(13), V_ALIASES(14), V_ALIASES(15), \
V_ALIASES(16), V_ALIASES(17), V_ALIASES(18), V_ALIASES(19), \
V_ALIASES(20), V_ALIASES(21), V_ALIASES(22), V_ALIASES(23), \
V_ALIASES(24), V_ALIASES(25), V_ALIASES(26), V_ALIASES(27), \
V_ALIASES(28), V_ALIASES(29), V_ALIASES(30), V_ALIASES(31) \
}
/* Say that the epilogue uses the return address register. Note that
in the case of sibcalls, the values "used by the epilogue" are
considered live at the start of the called function. */
#define EPILOGUE_USES(REGNO) \
((REGNO) == LR_REGNUM)
/* EXIT_IGNORE_STACK should be nonzero if, when returning from a function,
the stack pointer does not matter. The value is tested only in
functions that have frame pointers. */
#define EXIT_IGNORE_STACK 1
#define STATIC_CHAIN_REGNUM R18_REGNUM
#define HARD_FRAME_POINTER_REGNUM R29_REGNUM
#define FRAME_POINTER_REGNUM SFP_REGNUM
#define STACK_POINTER_REGNUM SP_REGNUM
#define ARG_POINTER_REGNUM AP_REGNUM
#define FIRST_PSEUDO_REGISTER 67
/* The number of (integer) argument registers available. */
#define NUM_ARG_REGS 8
#define NUM_FP_ARG_REGS 8
/* A Homogeneous Floating-Point or Short-Vector Aggregate may have at most
four members. */
#define HA_MAX_NUM_FLDS 4
/* External dwarf register number scheme. These number are used to
identify registers in dwarf debug information, the values are
defined by the AArch64 ABI. The numbering scheme is independent of
GCC's internal register numbering scheme. */
#define AARCH64_DWARF_R0 0
/* The number of R registers; note 31, not 32. */
#define AARCH64_DWARF_NUMBER_R 31
#define AARCH64_DWARF_SP 31
#define AARCH64_DWARF_V0 64
/* The number of V registers. */
#define AARCH64_DWARF_NUMBER_V 32
/* For signal frames we need to use an alternative return column. This
value must not correspond to a hard register and must be out of the
range of DWARF_FRAME_REGNUM(). */
#define DWARF_ALT_FRAME_RETURN_COLUMN \
(AARCH64_DWARF_V0 + AARCH64_DWARF_NUMBER_V)
/* We add 1 extra frame register for use as the
DWARF_ALT_FRAME_RETURN_COLUMN. */
#define DWARF_FRAME_REGISTERS (DWARF_ALT_FRAME_RETURN_COLUMN + 1)
#define DBX_REGISTER_NUMBER(REGNO) aarch64_dbx_register_number (REGNO)
/* Provide a definition of DWARF_FRAME_REGNUM here so that fallback unwinders
   can use DWARF_ALT_FRAME_RETURN_COLUMN defined above. This is just the same
   as the default definition in dwarf2out.c. */
#undef DWARF_FRAME_REGNUM
#define DWARF_FRAME_REGNUM(REGNO) DBX_REGISTER_NUMBER (REGNO)
#define DWARF_FRAME_RETURN_COLUMN DWARF_FRAME_REGNUM (LR_REGNUM)
#define HARD_REGNO_NREGS(REGNO, MODE) aarch64_hard_regno_nregs (REGNO, MODE)
#define HARD_REGNO_MODE_OK(REGNO, MODE) aarch64_hard_regno_mode_ok (REGNO, MODE)
#define MODES_TIEABLE_P(MODE1, MODE2) \
(GET_MODE_CLASS (MODE1) == GET_MODE_CLASS (MODE2))
#define DWARF2_UNWIND_INFO 1
/* Use R0 through R3 to pass exception handling information. */
#define EH_RETURN_DATA_REGNO(N) \
((N) < 4 ? ((unsigned int) R0_REGNUM + (N)) : INVALID_REGNUM)
/* Select a format to encode pointers in exception handling data. */
#define ASM_PREFERRED_EH_DATA_FORMAT(CODE, GLOBAL) \
aarch64_asm_preferred_eh_data_format ((CODE), (GLOBAL))
/* The register that holds the stack adjustment when returning from
   exception handlers. */
#define AARCH64_EH_STACKADJ_REGNUM (R0_REGNUM + 4)
#define EH_RETURN_STACKADJ_RTX gen_rtx_REG (Pmode, AARCH64_EH_STACKADJ_REGNUM)
/* Don't use __builtin_setjmp until we've defined it. */
#undef DONT_USE_BUILTIN_SETJMP
#define DONT_USE_BUILTIN_SETJMP 1
/* Register in which the structure value is to be returned. */
#define AARCH64_STRUCT_VALUE_REGNUM R8_REGNUM
/* Non-zero if REGNO is part of the Core register set.
The rather unusual way of expressing this check is to avoid
warnings when building the compiler when R0_REGNUM is 0 and REGNO
is unsigned. */
#define GP_REGNUM_P(REGNO) \
(((unsigned) (REGNO - R0_REGNUM)) <= (R30_REGNUM - R0_REGNUM))
#define FP_REGNUM_P(REGNO) \
(((unsigned) (REGNO - V0_REGNUM)) <= (V31_REGNUM - V0_REGNUM))
#define FP_LO_REGNUM_P(REGNO) \
(((unsigned) (REGNO - V0_REGNUM)) <= (V15_REGNUM - V0_REGNUM))
/* Register and constant classes. */
enum reg_class
{
NO_REGS,
CORE_REGS,
GENERAL_REGS,
STACK_REG,
POINTER_REGS,
FP_LO_REGS,
FP_REGS,
ALL_REGS,
LIM_REG_CLASSES /* Last */
};
#define N_REG_CLASSES ((int) LIM_REG_CLASSES)
#define REG_CLASS_NAMES \
{ \
"NO_REGS", \
"CORE_REGS", \
"GENERAL_REGS", \
"STACK_REG", \
"POINTER_REGS", \
"FP_LO_REGS", \
"FP_REGS", \
"ALL_REGS" \
}
#define REG_CLASS_CONTENTS \
{ \
{ 0x00000000, 0x00000000, 0x00000000 }, /* NO_REGS */ \
{ 0x7fffffff, 0x00000000, 0x00000003 }, /* CORE_REGS */ \
{ 0x7fffffff, 0x00000000, 0x00000003 }, /* GENERAL_REGS */ \
{ 0x80000000, 0x00000000, 0x00000000 }, /* STACK_REG */ \
{ 0xffffffff, 0x00000000, 0x00000003 }, /* POINTER_REGS */ \
{ 0x00000000, 0x0000ffff, 0x00000000 }, /* FP_LO_REGS */ \
{ 0x00000000, 0xffffffff, 0x00000000 }, /* FP_REGS */ \
{ 0xffffffff, 0xffffffff, 0x00000007 } /* ALL_REGS */ \
}
#define REGNO_REG_CLASS(REGNO) aarch64_regno_regclass (REGNO)
#define INDEX_REG_CLASS CORE_REGS
#define BASE_REG_CLASS POINTER_REGS
/* Register pairs used to eliminate unneeded registers that point into
   the stack frame. */
#define ELIMINABLE_REGS \
{ \
{ ARG_POINTER_REGNUM, STACK_POINTER_REGNUM }, \
{ ARG_POINTER_REGNUM, HARD_FRAME_POINTER_REGNUM }, \
{ FRAME_POINTER_REGNUM, STACK_POINTER_REGNUM }, \
{ FRAME_POINTER_REGNUM, HARD_FRAME_POINTER_REGNUM }, \
}
#define INITIAL_ELIMINATION_OFFSET(FROM, TO, OFFSET) \
(OFFSET) = aarch64_initial_elimination_offset (FROM, TO)
/* CPU/ARCH option handling. */
#include "config/aarch64/aarch64-opts.h"
enum target_cpus
{
#define AARCH64_CORE(NAME, IDENT, ARCH, FLAGS, COSTS) \
TARGET_CPU_##IDENT,
#include "aarch64-cores.def"
#undef AARCH64_CORE
TARGET_CPU_generic
};
/* If there is no CPU defined at configure, use "generic" as default. */
#ifndef TARGET_CPU_DEFAULT
#define TARGET_CPU_DEFAULT \
(TARGET_CPU_generic | (AARCH64_CPU_DEFAULT_FLAGS << 6))
#endif
/* The processor for which instructions should be scheduled. */
extern enum aarch64_processor aarch64_tune;
/* RTL generation support. */
#define INIT_EXPANDERS aarch64_init_expanders ()
/* Stack layout; function entry, exit and calling. */
#define STACK_GROWS_DOWNWARD 1
#define FRAME_GROWS_DOWNWARD 0
#define STARTING_FRAME_OFFSET 0
#define ACCUMULATE_OUTGOING_ARGS 1
#define FIRST_PARM_OFFSET(FNDECL) 0
/* Fix for VFP */
#define LIBCALL_VALUE(MODE) \
gen_rtx_REG (MODE, FLOAT_MODE_P (MODE) ? V0_REGNUM : R0_REGNUM)
#define DEFAULT_PCC_STRUCT_RETURN 0
#define AARCH64_ROUND_UP(X, ALIGNMENT) \
(((X) + ((ALIGNMENT) - 1)) & ~((ALIGNMENT) - 1))
#define AARCH64_ROUND_DOWN(X, ALIGNMENT) \
((X) & ~((ALIGNMENT) - 1))
#ifdef HOST_WIDE_INT
struct GTY (()) aarch64_frame
{
HOST_WIDE_INT reg_offset[FIRST_PSEUDO_REGISTER];
HOST_WIDE_INT saved_regs_size;
/* Padding if needed after all the callee-saved registers have
   been saved. */
HOST_WIDE_INT padding0;
HOST_WIDE_INT hardfp_offset; /* HARD_FRAME_POINTER_REGNUM */
HOST_WIDE_INT fp_lr_offset; /* Space needed for saving fp and/or lr */
bool laid_out;
};
typedef struct GTY (()) machine_function
{
struct aarch64_frame frame;
/* The number of extra stack bytes taken up by register varargs.
This area is allocated by the callee at the very top of the frame. */
HOST_WIDE_INT saved_varargs_size;
} machine_function;
#endif
/* Which ABI to use. */
enum arm_abi_type
{
ARM_ABI_AAPCS64
};
enum arm_pcs
{
ARM_PCS_AAPCS64, /* Base standard AAPCS for 64 bit. */
ARM_PCS_UNKNOWN
};
extern enum arm_abi_type arm_abi;
extern enum arm_pcs arm_pcs_variant;
#ifndef ARM_DEFAULT_ABI
#define ARM_DEFAULT_ABI ARM_ABI_AAPCS64
#endif
#ifndef ARM_DEFAULT_PCS
#define ARM_DEFAULT_PCS ARM_PCS_AAPCS64
#endif
/* We can't use enum machine_mode inside a generator file because it
hasn't been created yet; we shouldn't be using any code that
needs the real definition though, so this ought to be safe. */
#ifdef GENERATOR_FILE
#define MACHMODE int
#else
#include "insn-modes.h"
#define MACHMODE enum machine_mode
#endif
/* AAPCS related state tracking. */
typedef struct
{
enum arm_pcs pcs_variant;
int aapcs_arg_processed; /* No need to lay out this argument again. */
int aapcs_ncrn; /* Next Core register number. */
int aapcs_nextncrn; /* Next next core register number. */
int aapcs_nvrn; /* Next Vector register number. */
int aapcs_nextnvrn; /* Next Next Vector register number. */
rtx aapcs_reg; /* Register assigned to this argument. This
is NULL_RTX if this parameter goes on
the stack. */
MACHMODE aapcs_vfp_rmode;
int aapcs_stack_words; /* If the argument is passed on the stack, this
is the number of words needed, after rounding
up. Only meaningful when
aapcs_reg == NULL_RTX. */
int aapcs_stack_size; /* The total size (in 8-byte words) of the
stack arg area so far. */
} CUMULATIVE_ARGS;
#define FUNCTION_ARG_PADDING(MODE, TYPE) \
(aarch64_pad_arg_upward (MODE, TYPE) ? upward : downward)
#define BLOCK_REG_PADDING(MODE, TYPE, FIRST) \
(aarch64_pad_reg_upward (MODE, TYPE, FIRST) ? upward : downward)
#define PAD_VARARGS_DOWN 0
#define INIT_CUMULATIVE_ARGS(CUM, FNTYPE, LIBNAME, FNDECL, N_NAMED_ARGS) \
aarch64_init_cumulative_args (&(CUM), FNTYPE, LIBNAME, FNDECL, N_NAMED_ARGS)
#define FUNCTION_ARG_REGNO_P(REGNO) \
aarch64_function_arg_regno_p(REGNO)
/* ISA Features. */
/* Addressing modes, etc. */
#define HAVE_POST_INCREMENT 1
#define HAVE_PRE_INCREMENT 1
#define HAVE_POST_DECREMENT 1
#define HAVE_PRE_DECREMENT 1
#define HAVE_POST_MODIFY_DISP 1
#define HAVE_PRE_MODIFY_DISP 1
#define MAX_REGS_PER_ADDRESS 2
#define CONSTANT_ADDRESS_P(X) aarch64_constant_address_p(X)
/* Try a machine-dependent way of reloading an illegitimate address
operand. If we find one, push the reload and jump to WIN. This
macro is used in only one place: `find_reloads_address' in reload.c. */
#define LEGITIMIZE_RELOAD_ADDRESS(X, MODE, OPNUM, TYPE, IND_L, WIN) \
do { \
rtx new_x = aarch64_legitimize_reload_address (&(X), MODE, OPNUM, TYPE, \
IND_L); \
if (new_x) \
{ \
X = new_x; \
goto WIN; \
} \
} while (0)
#define REGNO_OK_FOR_BASE_P(REGNO) \
aarch64_regno_ok_for_base_p (REGNO, true)
#define REGNO_OK_FOR_INDEX_P(REGNO) \
aarch64_regno_ok_for_index_p (REGNO, true)
#define LEGITIMATE_PIC_OPERAND_P(X) \
aarch64_legitimate_pic_operand_p (X)
#define CASE_VECTOR_MODE Pmode
#define DEFAULT_SIGNED_CHAR 0
/* An integer expression for the size in bits of the largest integer machine
mode that should actually be used. We allow pairs of registers. */
#define MAX_FIXED_MODE_SIZE GET_MODE_BITSIZE (TImode)
/* Maximum bytes moved by a single instruction (load/store pair). */
#define MOVE_MAX (UNITS_PER_WORD * 2)
/* The base cost overhead of a memcpy call, for MOVE_RATIO and friends. */
#define AARCH64_CALL_RATIO 8
/* When optimizing for size, give a better estimate of the length of a memcpy
call, but use the default otherwise. But move_by_pieces_ninsns() counts
memory-to-memory moves, and we'll have to generate a load & store for each,
so halve the value to take that into account. */
#define MOVE_RATIO(speed) \
(((speed) ? 15 : AARCH64_CALL_RATIO) / 2)
/* For CLEAR_RATIO, when optimizing for size, give a better estimate
of the length of a memset call, but use the default otherwise. */
#define CLEAR_RATIO(speed) \
((speed) ? 15 : AARCH64_CALL_RATIO)
/* SET_RATIO is similar to CLEAR_RATIO, but for a non-zero constant, so when
optimizing for size adjust the ratio to account for the overhead of loading
the constant. */
#define SET_RATIO(speed) \
((speed) ? 15 : AARCH64_CALL_RATIO - 2)
/* STORE_BY_PIECES_P can be used when copying a constant string, but
in that case each 64-bit chunk takes 5 insns instead of 2 (LDR/STR).
For now we always fail this and let the move_by_pieces code copy
the string from read-only memory. */
#define STORE_BY_PIECES_P(SIZE, ALIGN) 0
/* Disable auto-increment in move_by_pieces et al. Use of auto-increment is
rarely a good idea in straight-line code since it adds an extra address
dependency between each instruction. Better to use incrementing offsets. */
#define USE_LOAD_POST_INCREMENT(MODE) 0
#define USE_LOAD_POST_DECREMENT(MODE) 0
#define USE_LOAD_PRE_INCREMENT(MODE) 0
#define USE_LOAD_PRE_DECREMENT(MODE) 0
#define USE_STORE_POST_INCREMENT(MODE) 0
#define USE_STORE_POST_DECREMENT(MODE) 0
#define USE_STORE_PRE_INCREMENT(MODE) 0
#define USE_STORE_PRE_DECREMENT(MODE) 0
/* ?? #define WORD_REGISTER_OPERATIONS */
/* Define if loading from memory in MODE, an integral mode narrower than
BITS_PER_WORD will either zero-extend or sign-extend. The value of this
macro should be the code that says which one of the two operations is
implicitly done, or UNKNOWN if none. */
#define LOAD_EXTEND_OP(MODE) ZERO_EXTEND
/* Define this macro to be non-zero if instructions will fail to work
if given data not on the nominal alignment. */
#define STRICT_ALIGNMENT TARGET_STRICT_ALIGN
/* Define this macro to be non-zero if accessing less than a word of
memory is no faster than accessing a word of memory, i.e., if such
accesses require more than one instruction or if there is no
difference in cost.
Although there's no difference in instruction count or cycles,
on AArch64 we don't want to widen a sub-word access to a 64-bit access
if we don't have to, for power-saving reasons. */
#define SLOW_BYTE_ACCESS 0
#define TRULY_NOOP_TRUNCATION(OUTPREC, INPREC) 1
#define NO_FUNCTION_CSE 1
#define Pmode DImode
#define FUNCTION_MODE Pmode
#define SELECT_CC_MODE(OP, X, Y) aarch64_select_cc_mode (OP, X, Y)
#define REVERSE_CONDITION(CODE, MODE) \
(((MODE) == CCFPmode || (MODE) == CCFPEmode) \
? reverse_condition_maybe_unordered (CODE) \
: reverse_condition (CODE))
#define CLZ_DEFINED_VALUE_AT_ZERO(MODE, VALUE) \
((VALUE) = ((MODE) == SImode ? 32 : 64), 2)
#define CTZ_DEFINED_VALUE_AT_ZERO(MODE, VALUE) \
((VALUE) = ((MODE) == SImode ? 32 : 64), 2)
#define INCOMING_RETURN_ADDR_RTX gen_rtx_REG (Pmode, LR_REGNUM)
#define RETURN_ADDR_RTX aarch64_return_addr
#define TRAMPOLINE_SIZE aarch64_trampoline_size ()
/* Trampolines contain dwords, so must be dword aligned. */
#define TRAMPOLINE_ALIGNMENT 64
/* Put trampolines in the text section so that mapping symbols work
correctly. */
#define TRAMPOLINE_SECTION text_section
/* Costs, etc. */
#define MEMORY_MOVE_COST(M, CLASS, IN) \
(GET_MODE_SIZE (M) < 8 ? 8 : GET_MODE_SIZE (M))
/* To start with. */
#define BRANCH_COST(SPEED_P, PREDICTABLE_P) 2
/* Assembly output. */
/* For now we'll make all jump tables pc-relative. */
#define CASE_VECTOR_PC_RELATIVE 1
#define CASE_VECTOR_SHORTEN_MODE(min, max, body) \
((min < -0x1fff0 || max > 0x1fff0) ? SImode \
: (min < -0x1f0 || max > 0x1f0) ? HImode \
: QImode)
/* Jump table alignment is explicit in ASM_OUTPUT_CASE_LABEL. */
#define ADDR_VEC_ALIGN(JUMPTABLE) 0
#define PRINT_OPERAND(STREAM, X, CODE) aarch64_print_operand (STREAM, X, CODE)
#define PRINT_OPERAND_ADDRESS(STREAM, X) \
aarch64_print_operand_address (STREAM, X)
#define FUNCTION_PROFILER(STREAM, LABELNO) \
aarch64_function_profiler (STREAM, LABELNO)
/* For some reason, the Linux headers think they know how to define
these macros. They don't!!! */
#undef ASM_APP_ON
#undef ASM_APP_OFF
#define ASM_APP_ON "\t" ASM_COMMENT_START " Start of user assembly\n"
#define ASM_APP_OFF "\t" ASM_COMMENT_START " End of user assembly\n"
#define ASM_FPRINTF_EXTENSIONS(FILE, ARGS, P) \
case '@': \
fputs (ASM_COMMENT_START, FILE); \
break; \
\
case 'r': \
fputs (REGISTER_PREFIX, FILE); \
fputs (reg_names[va_arg (ARGS, int)], FILE); \
break;
#define CONSTANT_POOL_BEFORE_FUNCTION 0
/* This definition should be relocated to aarch64-elf-raw.h. This macro
   should be undefined in aarch64-linux.h and a clear_cache pattern
   implemented to emit either the call to __aarch64_sync_cache_range()
   directly or preferably the appropriate syscall or cache clear
   instructions inline. */
#define CLEAR_INSN_CACHE(beg, end) \
extern void __aarch64_sync_cache_range (void *, void *); \
__aarch64_sync_cache_range (beg, end)
/* This should be integrated with the equivalent in the 32 bit
world. */
enum aarch64_builtins
{
AARCH64_BUILTIN_MIN,
AARCH64_BUILTIN_THREAD_POINTER,
AARCH64_SIMD_BUILTIN_BASE
};
/* VFP registers may only be accessed in the mode in which they
   were set. */
#define CANNOT_CHANGE_MODE_CLASS(FROM, TO, CLASS) \
(GET_MODE_SIZE (FROM) != GET_MODE_SIZE (TO) \
? reg_classes_intersect_p (FP_REGS, (CLASS)) \
: 0)
#define SHIFT_COUNT_TRUNCATED !TARGET_SIMD
/* Callee only saves lower 64-bits of a 128-bit register. Tell the
compiler the callee clobbers the top 64-bits when restoring the
bottom 64-bits. */
#define HARD_REGNO_CALL_PART_CLOBBERED(REGNO, MODE) \
(FP_REGNUM_P (REGNO) && GET_MODE_SIZE (MODE) > 8)
/* Check TLS Descriptors mechanism is selected. */
#define TARGET_TLS_DESC (aarch64_tls_dialect == TLS_DESCRIPTORS)
extern enum aarch64_code_model aarch64_cmodel;
/* When using the tiny addressing model, conditional and unconditional branches
   can span the whole of the available address space (1MB). */
#define HAS_LONG_COND_BRANCH \
(aarch64_cmodel == AARCH64_CMODEL_TINY \
|| aarch64_cmodel == AARCH64_CMODEL_TINY_PIC)
#define HAS_LONG_UNCOND_BRANCH \
(aarch64_cmodel == AARCH64_CMODEL_TINY \
|| aarch64_cmodel == AARCH64_CMODEL_TINY_PIC)
/* Modes valid for AdvSIMD Q registers. */
#define AARCH64_VALID_SIMD_QREG_MODE(MODE) \
((MODE) == V4SImode || (MODE) == V8HImode || (MODE) == V16QImode \
|| (MODE) == V4SFmode || (MODE) == V2DImode || (MODE) == V2DFmode)
#endif /* GCC_AARCH64_H */
;; Machine description for AArch64 architecture.
;; Copyright (C) 2009, 2010, 2011, 2012 Free Software Foundation, Inc.
;; Contributed by ARM Ltd.
;;
;; This file is part of GCC.
;;
;; GCC is free software; you can redistribute it and/or modify it
;; under the terms of the GNU General Public License as published by
;; the Free Software Foundation; either version 3, or (at your option)
;; any later version.
;;
;; GCC is distributed in the hope that it will be useful, but
;; WITHOUT ANY WARRANTY; without even the implied warranty of
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
;; General Public License for more details.
;;
;; You should have received a copy of the GNU General Public License
;; along with GCC; see the file COPYING3. If not see
;; <http://www.gnu.org/licenses/>.
;; Register numbers
(define_constants
[
(R0_REGNUM 0)
(R1_REGNUM 1)
(R2_REGNUM 2)
(R3_REGNUM 3)
(R4_REGNUM 4)
(R5_REGNUM 5)
(R6_REGNUM 6)
(R7_REGNUM 7)
(R8_REGNUM 8)
(R9_REGNUM 9)
(R10_REGNUM 10)
(R11_REGNUM 11)
(R12_REGNUM 12)
(R13_REGNUM 13)
(R14_REGNUM 14)
(R15_REGNUM 15)
(R16_REGNUM 16)
(IP0_REGNUM 16)
(R17_REGNUM 17)
(IP1_REGNUM 17)
(R18_REGNUM 18)
(R19_REGNUM 19)
(R20_REGNUM 20)
(R21_REGNUM 21)
(R22_REGNUM 22)
(R23_REGNUM 23)
(R24_REGNUM 24)
(R25_REGNUM 25)
(R26_REGNUM 26)
(R27_REGNUM 27)
(R28_REGNUM 28)
(R29_REGNUM 29)
(R30_REGNUM 30)
(LR_REGNUM 30)
(SP_REGNUM 31)
(V0_REGNUM 32)
(V15_REGNUM 47)
(V31_REGNUM 63)
(SFP_REGNUM 64)
(AP_REGNUM 65)
(CC_REGNUM 66)
]
)
(define_c_enum "unspec" [
UNSPEC_CASESI
UNSPEC_CLS
UNSPEC_FRINTA
UNSPEC_FRINTI
UNSPEC_FRINTM
UNSPEC_FRINTP
UNSPEC_FRINTX
UNSPEC_FRINTZ
UNSPEC_GOTSMALLPIC
UNSPEC_GOTSMALLTLS
UNSPEC_LD2
UNSPEC_LD3
UNSPEC_LD4
UNSPEC_MB
UNSPEC_NOP
UNSPEC_PRLG_STK
UNSPEC_RBIT
UNSPEC_ST2
UNSPEC_ST3
UNSPEC_ST4
UNSPEC_TLS
UNSPEC_TLSDESC
UNSPEC_VSTRUCTDUMMY
])
(define_c_enum "unspecv" [
UNSPECV_EH_RETURN ; Represent EH_RETURN
]
)
;; If further include files are added, the definition of MD_INCLUDES
;; must be updated.
(include "constraints.md")
(include "predicates.md")
(include "iterators.md")
;; -------------------------------------------------------------------
;; Synchronization Builtins
;; -------------------------------------------------------------------
;; The following sync_* attributes are applied to synchronization
;; instruction patterns to control the way in which the
;; synchronization loop is expanded.
;; All instruction patterns that call aarch64_output_sync_insn ()
;; should define these attributes. Refer to the comment above
;; aarch64.c:aarch64_output_sync_loop () for more detail on the use of
;; these attributes.
;; Attribute specifies the operand number which contains the
;; result of a synchronization operation. The result is the old value
;; loaded from SYNC_MEMORY.
(define_attr "sync_result" "none,0,1,2,3,4,5" (const_string "none"))
;; Attribute specifies the operand number which contains the memory
;; address to which the synchronization operation is being applied.
(define_attr "sync_memory" "none,0,1,2,3,4,5" (const_string "none"))
;; Attribute specifies the operand number which contains the required
;; old value expected in the memory location. This attribute may be
;; none if no required value test should be performed in the expanded
;; code.
(define_attr "sync_required_value" "none,0,1,2,3,4,5" (const_string "none"))
;; Attribute specifies the operand number of the new value to be stored
;; into the memory location identified by the sync_memory attribute.
(define_attr "sync_new_value" "none,0,1,2,3,4,5" (const_string "none"))
;; Attribute specifies the operand number of a temporary register
;; which can be clobbered by the synchronization instruction sequence.
;; The register provided by SYNC_T1 may be the same as SYNC_RESULT, in
;; which case the result value will be clobbered and not available
;; after the synchronization loop exits.
(define_attr "sync_t1" "none,0,1,2,3,4,5" (const_string "none"))
;; Attribute specifies the operand number of a temporary register
;; which can be clobbered by the synchronization instruction sequence.
;; This register is used to collect the result of a store exclusive
;; instruction.
(define_attr "sync_t2" "none,0,1,2,3,4,5" (const_string "none"))
;; Attribute that specifies whether or not the emitted synchronization
;; loop must contain a release barrier.
(define_attr "sync_release_barrier" "yes,no" (const_string "yes"))
;; Attribute that specifies the operation that the synchronization
;; loop should apply to the old and new values to generate the value
;; written back to memory.
(define_attr "sync_op" "none,add,sub,ior,xor,and,nand"
(const_string "none"))
;; -------------------------------------------------------------------
;; Instruction types and attributes
;; -------------------------------------------------------------------
;; Main data types used by the instructions
(define_attr "mode" "unknown,none,QI,HI,SI,DI,TI,SF,DF,TF"
(const_string "unknown"))
(define_attr "mode2" "unknown,none,QI,HI,SI,DI,TI,SF,DF,TF"
(const_string "unknown"))
; The "v8type" attribute is used for fine-grained classification of
; AArch64 instructions. This table briefly explains the meaning of each type.
; adc add/subtract with carry.
; adcs add/subtract with carry (setting condition flags).
; adr calculate address.
; alu simple alu instruction (no memory or fp regs access).
; alu_ext simple alu instruction (sign/zero-extended register).
; alu_shift simple alu instruction, with a source operand shifted by a constant.
; alus simple alu instruction (setting condition flags).
; alus_ext simple alu instruction (sign/zero-extended register, setting condition flags).
; alus_shift simple alu instruction, with a source operand shifted by a constant (setting condition flags).
; bfm bitfield move operation.
; branch branch.
; call subroutine call.
; ccmp conditional compare.
; clz count leading zeros/sign bits.
; csel conditional select.
; dmb data memory barrier.
; extend sign/zero-extend (specialised bitfield move).
; extr extract register-sized bitfield encoding.
; fpsimd_load load single floating point / simd scalar register from memory.
; fpsimd_load2 load pair of floating point / simd scalar registers from memory.
; fpsimd_store store single floating point / simd scalar register to memory.
; fpsimd_store2 store pair of floating point / simd scalar registers to memory.
; fadd floating point add/sub.
; fccmp floating point conditional compare.
; fcmp floating point comparison.
; fconst floating point load immediate.
; fcsel floating point conditional select.
; fcvt floating point convert (float to float).
; fcvtf2i floating point convert (float to integer).
; fcvti2f floating point convert (integer to float).
; fdiv floating point division operation.
; ffarith floating point abs, neg or cpy.
; fmadd floating point multiply-add/sub.
; fminmax floating point min/max.
; fmov floating point move (float to float).
; fmovf2i floating point move (float to integer).
; fmovi2f floating point move (integer to float).
; fmul floating point multiply.
; frint floating point round to integral.
; fsqrt floating point square root.
; load_acq load-acquire.
; load load single general register from memory.
; load2 load pair of general registers from memory.
; logic logical operation (register).
; logic_imm and/or/xor operation (immediate).
; logic_shift logical operation with shift.
; logics logical operation (register, setting condition flags).
; logics_imm and/or/xor operation (immediate, setting condition flags).
; logics_shift logical operation with shift (setting condition flags).
; madd integer multiply-add/sub.
; maddl widening integer multiply-add/sub.
; misc miscellaneous - any type that doesn't fit into the rest.
; move integer move operation.
; move2 double integer move operation.
; movk move 16-bit immediate with keep.
; movz move 16-bit immediate with zero/one.
; mrs system/special register move.
; mulh 64x64 to 128-bit multiply (high part).
; mull widening multiply.
; mult integer multiply instruction.
; prefetch memory prefetch.
; rbit reverse bits.
; rev reverse bytes.
; sdiv integer division operation (signed).
; shift variable shift operation.
; shift_imm immediate shift operation (specialised bitfield move).
; store_rel store-release.
; store store single general register to memory.
; store2 store pair of general registers to memory.
; udiv integer division operation (unsigned).
(define_attr "v8type"
"adc,\
adcs,\
adr,\
alu,\
alu_ext,\
alu_shift,\
alus,\
alus_ext,\
alus_shift,\
bfm,\
branch,\
call,\
ccmp,\
clz,\
csel,\
dmb,\
div,\
div64,\
extend,\
extr,\
fpsimd_load,\
fpsimd_load2,\
fpsimd_store2,\
fpsimd_store,\
fadd,\
fccmp,\
fcvt,\
fcvtf2i,\
fcvti2f,\
fcmp,\
fconst,\
fcsel,\
fdiv,\
ffarith,\
fmadd,\
fminmax,\
fmov,\
fmovf2i,\
fmovi2f,\
fmul,\
frint,\
fsqrt,\
load_acq,\
load1,\
load2,\
logic,\
logic_imm,\
logic_shift,\
logics,\
logics_imm,\
logics_shift,\
madd,\
maddl,\
misc,\
move,\
move2,\
movk,\
movz,\
mrs,\
mulh,\
mull,\
mult,\
prefetch,\
rbit,\
rev,\
sdiv,\
shift,\
shift_imm,\
store_rel,\
store1,\
store2,\
udiv"
(const_string "alu"))
; The "type" attribute is used by the AArch32 backend. Below is a mapping
; from "v8type" to "type".
(define_attr "type"
"alu,alu_shift,block,branch,call,f_2_r,f_cvt,f_flag,f_loads,
f_loadd,f_stored,f_stores,faddd,fadds,fcmpd,fcmps,fconstd,fconsts,
fcpys,fdivd,fdivs,ffarithd,ffariths,fmacd,fmacs,fmuld,fmuls,load_byte,
load1,load2,mult,r_2_f,store1,store2"
(cond [
(eq_attr "v8type" "alu_shift,alus_shift,logic_shift,logics_shift") (const_string "alu_shift")
(eq_attr "v8type" "branch") (const_string "branch")
(eq_attr "v8type" "call") (const_string "call")
(eq_attr "v8type" "fmovf2i") (const_string "f_2_r")
(eq_attr "v8type" "fcvt,fcvtf2i,fcvti2f") (const_string "f_cvt")
(and (eq_attr "v8type" "fpsimd_load") (eq_attr "mode" "SF")) (const_string "f_loads")
(and (eq_attr "v8type" "fpsimd_load") (eq_attr "mode" "DF")) (const_string "f_loadd")
(and (eq_attr "v8type" "fpsimd_store") (eq_attr "mode" "SF")) (const_string "f_stores")
(and (eq_attr "v8type" "fpsimd_store") (eq_attr "mode" "DF")) (const_string "f_stored")
(and (eq_attr "v8type" "fadd,fminmax") (eq_attr "mode" "DF")) (const_string "faddd")
(and (eq_attr "v8type" "fadd,fminmax") (eq_attr "mode" "SF")) (const_string "fadds")
(and (eq_attr "v8type" "fcmp,fccmp") (eq_attr "mode" "DF")) (const_string "fcmpd")
(and (eq_attr "v8type" "fcmp,fccmp") (eq_attr "mode" "SF")) (const_string "fcmps")
(and (eq_attr "v8type" "fconst") (eq_attr "mode" "DF")) (const_string "fconstd")
(and (eq_attr "v8type" "fconst") (eq_attr "mode" "SF")) (const_string "fconsts")
(and (eq_attr "v8type" "fdiv,fsqrt") (eq_attr "mode" "DF")) (const_string "fdivd")
(and (eq_attr "v8type" "fdiv,fsqrt") (eq_attr "mode" "SF")) (const_string "fdivs")
(and (eq_attr "v8type" "ffarith") (eq_attr "mode" "DF")) (const_string "ffarithd")
(and (eq_attr "v8type" "ffarith") (eq_attr "mode" "SF")) (const_string "ffariths")
(and (eq_attr "v8type" "fmadd") (eq_attr "mode" "DF")) (const_string "fmacd")
(and (eq_attr "v8type" "fmadd") (eq_attr "mode" "SF")) (const_string "fmacs")
(and (eq_attr "v8type" "fmul") (eq_attr "mode" "DF")) (const_string "fmuld")
(and (eq_attr "v8type" "fmul") (eq_attr "mode" "SF")) (const_string "fmuls")
(and (eq_attr "v8type" "load1") (eq_attr "mode" "QI,HI")) (const_string "load_byte")
(and (eq_attr "v8type" "load1") (eq_attr "mode" "SI,DI,TI")) (const_string "load1")
(eq_attr "v8type" "load2") (const_string "load2")
(and (eq_attr "v8type" "mulh,mult,mull,madd,sdiv,udiv") (eq_attr "mode" "SI")) (const_string "mult")
(eq_attr "v8type" "fmovi2f") (const_string "r_2_f")
(eq_attr "v8type" "store1") (const_string "store1")
(eq_attr "v8type" "store2") (const_string "store2")
]
(const_string "alu")))
;; Attribute that specifies whether or not the instruction touches fp
;; registers.
(define_attr "fp" "no,yes" (const_string "no"))
;; Attribute that specifies whether or not the instruction touches simd
;; registers.
(define_attr "simd" "no,yes" (const_string "no"))
(define_attr "length" ""
(cond [(not (eq_attr "sync_memory" "none"))
(symbol_ref "aarch64_sync_loop_insns (insn, operands) * 4")
] (const_int 4)))
;; Attribute that controls whether an alternative is enabled or not.
;; Currently it is only used to disable alternatives which touch fp or simd
;; registers when -mgeneral-regs-only is specified.
(define_attr "enabled" "no,yes"
(cond [(ior
(and (eq_attr "fp" "yes")
(eq (symbol_ref "TARGET_FLOAT") (const_int 0)))
(and (eq_attr "simd" "yes")
(eq (symbol_ref "TARGET_SIMD") (const_int 0))))
(const_string "no")
] (const_string "yes")))
;; -------------------------------------------------------------------
;; Pipeline descriptions and scheduling
;; -------------------------------------------------------------------
;; Processor types.
(include "aarch64-tune.md")
;; Scheduling
(include "aarch64-generic.md")
(include "large.md")
(include "small.md")
;; -------------------------------------------------------------------
;; Jumps and other miscellaneous insns
;; -------------------------------------------------------------------
(define_insn "indirect_jump"
[(set (pc) (match_operand:DI 0 "register_operand" "r"))]
""
"br\\t%0"
[(set_attr "v8type" "branch")]
)
(define_insn "jump"
[(set (pc) (label_ref (match_operand 0 "" "")))]
""
"b\\t%l0"
[(set_attr "v8type" "branch")]
)
(define_expand "cbranch<mode>4"
[(set (pc) (if_then_else (match_operator 0 "aarch64_comparison_operator"
[(match_operand:GPI 1 "register_operand" "")
(match_operand:GPI 2 "aarch64_plus_operand" "")])
(label_ref (match_operand 3 "" ""))
(pc)))]
""
"
operands[1] = aarch64_gen_compare_reg (GET_CODE (operands[0]), operands[1],
operands[2]);
operands[2] = const0_rtx;
"
)
(define_expand "cbranch<mode>4"
[(set (pc) (if_then_else (match_operator 0 "aarch64_comparison_operator"
[(match_operand:GPF 1 "register_operand" "")
(match_operand:GPF 2 "aarch64_reg_or_zero" "")])
(label_ref (match_operand 3 "" ""))
(pc)))]
""
"
operands[1] = aarch64_gen_compare_reg (GET_CODE (operands[0]), operands[1],
operands[2]);
operands[2] = const0_rtx;
"
)
(define_insn "*condjump"
[(set (pc) (if_then_else (match_operator 0 "aarch64_comparison_operator"
[(match_operand 1 "cc_register" "") (const_int 0)])
(label_ref (match_operand 2 "" ""))
(pc)))]
""
"b%m0\\t%l2"
[(set_attr "v8type" "branch")]
)
(define_expand "casesi"
[(match_operand:SI 0 "register_operand" "") ; Index
(match_operand:SI 1 "const_int_operand" "") ; Lower bound
(match_operand:SI 2 "const_int_operand" "") ; Total range
(match_operand:DI 3 "" "") ; Table label
(match_operand:DI 4 "" "")] ; Out of range label
""
{
if (operands[1] != const0_rtx)
{
rtx reg = gen_reg_rtx (SImode);
/* Canonical RTL says that if you have:
(minus (X) (CONST))
then this should be emitted as:
(plus (X) (-CONST))
The use of trunc_int_for_mode ensures that the resulting
constant can be represented in SImode, this is important
for the corner case where operand[1] is INT_MIN. */
operands[1] = GEN_INT (trunc_int_for_mode (-INTVAL (operands[1]), SImode));
if (!(*insn_data[CODE_FOR_addsi3].operand[2].predicate)
(operands[1], SImode))
operands[1] = force_reg (SImode, operands[1]);
emit_insn (gen_addsi3 (reg, operands[0], operands[1]));
operands[0] = reg;
}
if (!aarch64_plus_operand (operands[2], SImode))
operands[2] = force_reg (SImode, operands[2]);
emit_jump_insn (gen_cbranchsi4 (gen_rtx_GTU (SImode, const0_rtx,
const0_rtx),
operands[0], operands[2], operands[4]));
operands[2] = force_reg (DImode, gen_rtx_LABEL_REF (VOIDmode, operands[3]));
emit_jump_insn (gen_casesi_dispatch (operands[2], operands[0],
operands[3]));
DONE;
}
)
(define_insn "casesi_dispatch"
[(parallel
[(set (pc)
(mem:DI (unspec [(match_operand:DI 0 "register_operand" "r")
(match_operand:SI 1 "register_operand" "r")]
UNSPEC_CASESI)))
(clobber (reg:CC CC_REGNUM))
(clobber (match_scratch:DI 3 "=r"))
(clobber (match_scratch:DI 4 "=r"))
(use (label_ref (match_operand 2 "" "")))])]
""
"*
return aarch64_output_casesi (operands);
"
[(set_attr "length" "16")
(set_attr "v8type" "branch")]
)
(define_insn "nop"
[(unspec[(const_int 0)] UNSPEC_NOP)]
""
"nop"
[(set_attr "v8type" "misc")]
)
(define_expand "prologue"
[(clobber (const_int 0))]
""
"
aarch64_expand_prologue ();
DONE;
"
)
(define_expand "epilogue"
[(clobber (const_int 0))]
""
"
aarch64_expand_epilogue (false);
DONE;
"
)
(define_expand "sibcall_epilogue"
[(clobber (const_int 0))]
""
"
aarch64_expand_epilogue (true);
DONE;
"
)
(define_insn "*do_return"
[(return)]
""
"ret"
[(set_attr "v8type" "branch")]
)
(define_insn "eh_return"
[(unspec_volatile [(match_operand:DI 0 "register_operand" "r")]
UNSPECV_EH_RETURN)]
""
"#"
[(set_attr "v8type" "branch")]
)
(define_split
[(unspec_volatile [(match_operand:DI 0 "register_operand" "")]
UNSPECV_EH_RETURN)]
"reload_completed"
[(set (match_dup 1) (match_dup 0))]
{
operands[1] = aarch64_final_eh_return_addr ();
}
)
(define_insn "*cb<optab><mode>1"
[(set (pc) (if_then_else (EQL (match_operand:GPI 0 "register_operand" "r")
(const_int 0))
(label_ref (match_operand 1 "" ""))
(pc)))]
""
"<cbz>\\t%<w>0, %l1"
[(set_attr "v8type" "branch")]
)
(define_insn "*tb<optab><mode>1"
[(set (pc) (if_then_else
(EQL (zero_extract:DI (match_operand:GPI 0 "register_operand" "r")
(const_int 1)
(match_operand 1 "const_int_operand" "n"))
(const_int 0))
(label_ref (match_operand 2 "" ""))
(pc)))
(clobber (match_scratch:DI 3 "=r"))]
""
"*
if (get_attr_length (insn) == 8)
return \"ubfx\\t%<w>3, %<w>0, %1, #1\;<cbz>\\t%<w>3, %l2\";
return \"<tbz>\\t%<w>0, %1, %l2\";
"
[(set_attr "v8type" "branch")
(set_attr "mode" "<MODE>")
(set (attr "length")
(if_then_else (and (ge (minus (match_dup 2) (pc)) (const_int -32768))
(lt (minus (match_dup 2) (pc)) (const_int 32764)))
(const_int 4)
(const_int 8)))]
)
(define_insn "*cb<optab><mode>1"
[(set (pc) (if_then_else (LTGE (match_operand:ALLI 0 "register_operand" "r")
(const_int 0))
(label_ref (match_operand 1 "" ""))
(pc)))
(clobber (match_scratch:DI 2 "=r"))]
""
"*
if (get_attr_length (insn) == 8)
return \"ubfx\\t%<w>2, %<w>0, <sizem1>, #1\;<cbz>\\t%<w>2, %l1\";
return \"<tbz>\\t%<w>0, <sizem1>, %l1\";
"
[(set_attr "v8type" "branch")
(set_attr "mode" "<MODE>")
(set (attr "length")
(if_then_else (and (ge (minus (match_dup 1) (pc)) (const_int -32768))
(lt (minus (match_dup 1) (pc)) (const_int 32764)))
(const_int 4)
(const_int 8)))]
)
;; -------------------------------------------------------------------
;; Subroutine calls and sibcalls
;; -------------------------------------------------------------------
(define_expand "call"
[(parallel [(call (match_operand 0 "memory_operand" "")
(match_operand 1 "general_operand" ""))
(use (match_operand 2 "" ""))
(clobber (reg:DI LR_REGNUM))])]
""
"
{
rtx callee;
/* In an untyped call, we can get NULL for operand 2. */
if (operands[2] == NULL)
operands[2] = const0_rtx;
/* Decide if we should generate indirect calls by loading the
64-bit address of the callee into a register before performing
the branch-and-link. */
callee = XEXP (operands[0], 0);
if (GET_CODE (callee) == SYMBOL_REF
? aarch64_is_long_call_p (callee)
: !REG_P (callee))
XEXP (operands[0], 0) = force_reg (Pmode, callee);
}"
)
(define_insn "*call_reg"
[(call (mem:DI (match_operand:DI 0 "register_operand" "r"))
(match_operand 1 "" ""))
(use (match_operand 2 "" ""))
(clobber (reg:DI LR_REGNUM))]
""
"blr\\t%0"
[(set_attr "v8type" "call")]
)
(define_insn "*call_symbol"
[(call (mem:DI (match_operand:DI 0 "" ""))
(match_operand 1 "" ""))
(use (match_operand 2 "" ""))
(clobber (reg:DI LR_REGNUM))]
"GET_CODE (operands[0]) == SYMBOL_REF
&& !aarch64_is_long_call_p (operands[0])"
"bl\\t%a0"
[(set_attr "v8type" "call")]
)
(define_expand "call_value"
[(parallel [(set (match_operand 0 "" "")
(call (match_operand 1 "memory_operand" "")
(match_operand 2 "general_operand" "")))
(use (match_operand 3 "" ""))
(clobber (reg:DI LR_REGNUM))])]
""
"
{
rtx callee;
/* In an untyped call, we can get NULL for operand 3. */
if (operands[3] == NULL)
operands[3] = const0_rtx;
/* Decide if we should generate indirect calls by loading the
64-bit address of the callee into a register before performing
the branch-and-link. */
callee = XEXP (operands[1], 0);
if (GET_CODE (callee) == SYMBOL_REF
? aarch64_is_long_call_p (callee)
: !REG_P (callee))
XEXP (operands[1], 0) = force_reg (Pmode, callee);
}"
)
(define_insn "*call_value_reg"
[(set (match_operand 0 "" "")
(call (mem:DI (match_operand:DI 1 "register_operand" "r"))
(match_operand 2 "" "")))
(use (match_operand 3 "" ""))
(clobber (reg:DI LR_REGNUM))]
""
"blr\\t%1"
[(set_attr "v8type" "call")]
)
(define_insn "*call_value_symbol"
[(set (match_operand 0 "" "")
(call (mem:DI (match_operand:DI 1 "" ""))
(match_operand 2 "" "")))
(use (match_operand 3 "" ""))
(clobber (reg:DI LR_REGNUM))]
"GET_CODE (operands[1]) == SYMBOL_REF
&& !aarch64_is_long_call_p (operands[1])"
"bl\\t%a1"
[(set_attr "v8type" "call")]
)
(define_expand "sibcall"
[(parallel [(call (match_operand 0 "memory_operand" "")
(match_operand 1 "general_operand" ""))
(return)
(use (match_operand 2 "" ""))])]
""
{
if (operands[2] == NULL_RTX)
operands[2] = const0_rtx;
}
)
(define_expand "sibcall_value"
[(parallel [(set (match_operand 0 "" "")
(call (match_operand 1 "memory_operand" "")
(match_operand 2 "general_operand" "")))
(return)
(use (match_operand 3 "" ""))])]
""
{
if (operands[3] == NULL_RTX)
operands[3] = const0_rtx;
}
)
(define_insn "*sibcall_insn"
[(call (mem:DI (match_operand:DI 0 "" "X"))
(match_operand 1 "" ""))
(return)
(use (match_operand 2 "" ""))]
"GET_CODE (operands[0]) == SYMBOL_REF"
"b\\t%a0"
[(set_attr "v8type" "branch")]
)
(define_insn "*sibcall_value_insn"
[(set (match_operand 0 "" "")
(call (mem:DI (match_operand 1 "" "X"))
(match_operand 2 "" "")))
(return)
(use (match_operand 3 "" ""))]
"GET_CODE (operands[1]) == SYMBOL_REF"
"b\\t%a1"
[(set_attr "v8type" "branch")]
)
;; Call subroutine returning any type.
(define_expand "untyped_call"
[(parallel [(call (match_operand 0 "")
(const_int 0))
(match_operand 1 "")
(match_operand 2 "")])]
""
{
int i;
emit_call_insn (GEN_CALL (operands[0], const0_rtx, NULL, const0_rtx));
for (i = 0; i < XVECLEN (operands[2], 0); i++)
{
rtx set = XVECEXP (operands[2], 0, i);
emit_move_insn (SET_DEST (set), SET_SRC (set));
}
/* The optimizer does not know that the call sets the function value
registers we stored in the result block. We avoid problems by
claiming that all hard registers are used and clobbered at this
point. */
emit_insn (gen_blockage ());
DONE;
})
;; -------------------------------------------------------------------
;; Moves
;; -------------------------------------------------------------------
(define_expand "mov<mode>"
[(set (match_operand:SHORT 0 "nonimmediate_operand" "")
(match_operand:SHORT 1 "general_operand" ""))]
""
"
if (GET_CODE (operands[0]) == MEM && operands[1] != const0_rtx)
operands[1] = force_reg (<MODE>mode, operands[1]);
"
)
(define_insn "*mov<mode>_aarch64"
[(set (match_operand:SHORT 0 "nonimmediate_operand" "=r,r,r,m, r,*w")
(match_operand:SHORT 1 "general_operand" " r,M,m,rZ,*w,r"))]
"(register_operand (operands[0], <MODE>mode)
|| aarch64_reg_or_zero (operands[1], <MODE>mode))"
"@
mov\\t%w0, %w1
mov\\t%w0, %1
ldr<size>\\t%w0, %1
str<size>\\t%w1, %0
umov\\t%w0, %1.<v>[0]
dup\\t%0.<Vallxd>, %w1"
[(set_attr "v8type" "move,alu,load1,store1,*,*")
(set_attr "simd_type" "*,*,*,*,simd_movgp,simd_dupgp")
(set_attr "mode" "<MODE>")
(set_attr "simd_mode" "<MODE>")]
)
(define_expand "mov<mode>"
[(set (match_operand:GPI 0 "nonimmediate_operand" "")
(match_operand:GPI 1 "general_operand" ""))]
""
"
if (GET_CODE (operands[0]) == MEM && operands[1] != const0_rtx)
operands[1] = force_reg (<MODE>mode, operands[1]);
if (CONSTANT_P (operands[1]))
{
aarch64_expand_mov_immediate (operands[0], operands[1]);
DONE;
}
"
)
(define_insn "*movsi_aarch64"
[(set (match_operand:SI 0 "nonimmediate_operand" "=r,r,r,m, *w, r,*w")
(match_operand:SI 1 "aarch64_mov_operand" " r,M,m,rZ,rZ,*w,*w"))]
"(register_operand (operands[0], SImode)
|| aarch64_reg_or_zero (operands[1], SImode))"
"@
mov\\t%w0, %w1
mov\\t%w0, %1
ldr\\t%w0, %1
str\\t%w1, %0
fmov\\t%s0, %w1
fmov\\t%w0, %s1
fmov\\t%s0, %s1"
[(set_attr "v8type" "move,alu,load1,store1,fmov,fmov,fmov")
(set_attr "mode" "SI")
(set_attr "fp" "*,*,*,*,yes,yes,yes")]
)
(define_insn "*movdi_aarch64"
[(set (match_operand:DI 0 "nonimmediate_operand" "=r,k,r,r,r,m, r, r, *w, r,*w,w")
(match_operand:DI 1 "aarch64_mov_operand" " r,r,k,N,m,rZ,Usa,Ush,rZ,*w,*w,Dd"))]
"(register_operand (operands[0], DImode)
|| aarch64_reg_or_zero (operands[1], DImode))"
"@
mov\\t%x0, %x1
mov\\t%0, %x1
mov\\t%x0, %1
mov\\t%x0, %1
ldr\\t%x0, %1
str\\t%x1, %0
adr\\t%x0, %a1
adrp\\t%x0, %A1
fmov\\t%d0, %x1
fmov\\t%x0, %d1
fmov\\t%d0, %d1
movi\\t%d0, %1"
[(set_attr "v8type" "move,move,move,alu,load1,store1,adr,adr,fmov,fmov,fmov,fmov")
(set_attr "mode" "DI")
(set_attr "fp" "*,*,*,*,*,*,*,*,yes,yes,yes,yes")]
)
(define_insn "insv_imm<mode>"
[(set (zero_extract:GPI (match_operand:GPI 0 "register_operand" "+r")
(const_int 16)
(match_operand 1 "const_int_operand" "n"))
(match_operand 2 "const_int_operand" "n"))]
"INTVAL (operands[1]) < GET_MODE_BITSIZE (<MODE>mode)
&& INTVAL (operands[1]) % 16 == 0
&& INTVAL (operands[2]) <= 0xffff"
"movk\\t%<w>0, %2, lsl %1"
[(set_attr "v8type" "movk")
(set_attr "mode" "<MODE>")]
)
(define_expand "movti"
[(set (match_operand:TI 0 "nonimmediate_operand" "")
(match_operand:TI 1 "general_operand" ""))]
""
"
if (GET_CODE (operands[0]) == MEM && operands[1] != const0_rtx)
operands[1] = force_reg (TImode, operands[1]);
"
)
(define_insn "*movti_aarch64"
[(set (match_operand:TI 0
"nonimmediate_operand" "=r, *w,r ,*w,r ,Ump,Ump,*w,m")
(match_operand:TI 1
"aarch64_movti_operand" " rn,r ,*w,*w,Ump,r ,Z , m,*w"))]
"(register_operand (operands[0], TImode)
|| aarch64_reg_or_zero (operands[1], TImode))"
"@
#
#
#
orr\\t%0.16b, %1.16b, %1.16b
ldp\\t%0, %H0, %1
stp\\t%1, %H1, %0
stp\\txzr, xzr, %0
ldr\\t%q0, %1
str\\t%q1, %0"
[(set_attr "v8type" "move2,fmovi2f,fmovf2i,*, \
load2,store2,store2,fpsimd_load,fpsimd_store")
(set_attr "simd_type" "*,*,*,simd_move,*,*,*,*,*")
(set_attr "mode" "DI,DI,DI,TI,DI,DI,DI,TI,TI")
(set_attr "length" "8,8,8,4,4,4,4,4,4")
(set_attr "fp" "*,*,*,*,*,*,*,yes,yes")
(set_attr "simd" "*,*,*,yes,*,*,*,*,*")])
;; Split a TImode register-register or register-immediate move into
;; its component DImode pieces, taking care to handle overlapping
;; source and dest registers.
(define_split
[(set (match_operand:TI 0 "register_operand" "")
(match_operand:TI 1 "aarch64_reg_or_imm" ""))]
"reload_completed && aarch64_split_128bit_move_p (operands[0], operands[1])"
[(const_int 0)]
{
aarch64_split_128bit_move (operands[0], operands[1]);
DONE;
})
(define_expand "mov<mode>"
[(set (match_operand:GPF 0 "nonimmediate_operand" "")
(match_operand:GPF 1 "general_operand" ""))]
""
"
if (!TARGET_FLOAT)
{
sorry (\"%qs and floating point code\", \"-mgeneral-regs-only\");
FAIL;
}
if (GET_CODE (operands[0]) == MEM)
operands[1] = force_reg (<MODE>mode, operands[1]);
"
)
(define_insn "*movsf_aarch64"
[(set (match_operand:SF 0 "nonimmediate_operand" "= w,?r,w,w,m,r,m ,r")
(match_operand:SF 1 "general_operand" "?rY, w,w,m,w,m,rY,r"))]
"TARGET_FLOAT && (register_operand (operands[0], SFmode)
|| register_operand (operands[1], SFmode))"
"@
fmov\\t%s0, %w1
fmov\\t%w0, %s1
fmov\\t%s0, %s1
ldr\\t%s0, %1
str\\t%s1, %0
ldr\\t%w0, %1
str\\t%w1, %0
mov\\t%w0, %w1"
[(set_attr "v8type" "fmovi2f,fmovf2i,fmov,fpsimd_load,fpsimd_store,fpsimd_load,fpsimd_store,fmov")
(set_attr "mode" "SF")]
)
(define_insn "*movdf_aarch64"
[(set (match_operand:DF 0 "nonimmediate_operand" "= w,?r,w,w,m,r,m ,r")
(match_operand:DF 1 "general_operand" "?rY, w,w,m,w,m,rY,r"))]
"TARGET_FLOAT && (register_operand (operands[0], DFmode)
|| register_operand (operands[1], DFmode))"
"@
fmov\\t%d0, %x1
fmov\\t%x0, %d1
fmov\\t%d0, %d1
ldr\\t%d0, %1
str\\t%d1, %0
ldr\\t%x0, %1
str\\t%x1, %0
mov\\t%x0, %x1"
[(set_attr "v8type" "fmovi2f,fmovf2i,fmov,fpsimd_load,fpsimd_store,fpsimd_load,fpsimd_store,move")
(set_attr "mode" "DF")]
)
(define_expand "movtf"
[(set (match_operand:TF 0 "nonimmediate_operand" "")
(match_operand:TF 1 "general_operand" ""))]
""
"
if (!TARGET_FLOAT)
{
sorry (\"%qs and floating point code\", \"-mgeneral-regs-only\");
FAIL;
}
if (GET_CODE (operands[0]) == MEM)
operands[1] = force_reg (TFmode, operands[1]);
"
)
(define_insn "*movtf_aarch64"
[(set (match_operand:TF 0
"nonimmediate_operand" "=w,?&r,w ,?r,w,?w,w,m,?r ,Ump")
(match_operand:TF 1
"general_operand" " w,?r, ?r,w ,Y,Y ,m,w,Ump,?rY"))]
"TARGET_FLOAT && (register_operand (operands[0], TFmode)
|| register_operand (operands[1], TFmode))"
"@
orr\\t%0.16b, %1.16b, %1.16b
mov\\t%0, %1\;mov\\t%H0, %H1
fmov\\t%d0, %Q1\;fmov\\t%0.d[1], %R1
fmov\\t%Q0, %d1\;fmov\\t%R0, %1.d[1]
movi\\t%0.2d, #0
fmov\\t%s0, wzr
ldr\\t%q0, %1
str\\t%q1, %0
ldp\\t%0, %H0, %1
stp\\t%1, %H1, %0"
[(set_attr "v8type" "logic,move2,fmovi2f,fmovf2i,fconst,fconst,fpsimd_load,fpsimd_store,fpsimd_load2,fpsimd_store2")
(set_attr "mode" "DF,DF,DF,DF,DF,DF,TF,TF,DF,DF")
(set_attr "length" "4,8,8,8,4,4,4,4,4,4")
(set_attr "fp" "*,*,yes,yes,*,yes,yes,yes,*,*")
(set_attr "simd" "yes,*,*,*,yes,*,*,*,*,*")]
)
;; Operands 1 and 3 are tied together by the final condition; so we allow
;; fairly lax checking on the second memory operation.
(define_insn "load_pair<mode>"
[(set (match_operand:GPI 0 "register_operand" "=r")
(match_operand:GPI 1 "aarch64_mem_pair_operand" "Ump"))
(set (match_operand:GPI 2 "register_operand" "=r")
(match_operand:GPI 3 "memory_operand" "m"))]
"rtx_equal_p (XEXP (operands[3], 0),
plus_constant (Pmode,
XEXP (operands[1], 0),
GET_MODE_SIZE (<MODE>mode)))"
"ldp\\t%<w>0, %<w>2, %1"
[(set_attr "v8type" "load2")
(set_attr "mode" "<MODE>")]
)
;; Operands 0 and 2 are tied together by the final condition; so we allow
;; fairly lax checking on the second memory operation.
(define_insn "store_pair<mode>"
[(set (match_operand:GPI 0 "aarch64_mem_pair_operand" "=Ump")
(match_operand:GPI 1 "register_operand" "r"))
(set (match_operand:GPI 2 "memory_operand" "=m")
(match_operand:GPI 3 "register_operand" "r"))]
"rtx_equal_p (XEXP (operands[2], 0),
plus_constant (Pmode,
XEXP (operands[0], 0),
GET_MODE_SIZE (<MODE>mode)))"
"stp\\t%<w>1, %<w>3, %0"
[(set_attr "v8type" "store2")
(set_attr "mode" "<MODE>")]
)
;; Operands 1 and 3 are tied together by the final condition; so we allow
;; fairly lax checking on the second memory operation.
(define_insn "load_pair<mode>"
[(set (match_operand:GPF 0 "register_operand" "=w")
(match_operand:GPF 1 "aarch64_mem_pair_operand" "Ump"))
(set (match_operand:GPF 2 "register_operand" "=w")
(match_operand:GPF 3 "memory_operand" "m"))]
"rtx_equal_p (XEXP (operands[3], 0),
plus_constant (Pmode,
XEXP (operands[1], 0),
GET_MODE_SIZE (<MODE>mode)))"
"ldp\\t%<w>0, %<w>2, %1"
[(set_attr "v8type" "fpsimd_load2")
(set_attr "mode" "<MODE>")]
)
;; Operands 0 and 2 are tied together by the final condition; so we allow
;; fairly lax checking on the second memory operation.
(define_insn "store_pair<mode>"
[(set (match_operand:GPF 0 "aarch64_mem_pair_operand" "=Ump")
(match_operand:GPF 1 "register_operand" "w"))
(set (match_operand:GPF 2 "memory_operand" "=m")
(match_operand:GPF 3 "register_operand" "w"))]
"rtx_equal_p (XEXP (operands[2], 0),
plus_constant (Pmode,
XEXP (operands[0], 0),
GET_MODE_SIZE (<MODE>mode)))"
"stp\\t%<w>1, %<w>3, %0"
[(set_attr "v8type" "fpsimd_store2")
(set_attr "mode" "<MODE>")]
)
;; Load pair with writeback. This is primarily used in function epilogues
;; when restoring [fp,lr].
(define_insn "loadwb_pair<GPI:mode>_<PTR:mode>"
[(parallel
[(set (match_operand:PTR 0 "register_operand" "=k")
(plus:PTR (match_operand:PTR 1 "register_operand" "0")
(match_operand:PTR 4 "const_int_operand" "n")))
(set (match_operand:GPI 2 "register_operand" "=r")
(mem:GPI (plus:PTR (match_dup 1)
(match_dup 4))))
(set (match_operand:GPI 3 "register_operand" "=r")
(mem:GPI (plus:PTR (match_dup 1)
(match_operand:PTR 5 "const_int_operand" "n"))))])]
"INTVAL (operands[5]) == INTVAL (operands[4]) + GET_MODE_SIZE (<GPI:MODE>mode)"
"ldp\\t%<w>2, %<w>3, [%1], %4"
[(set_attr "v8type" "load2")
(set_attr "mode" "<GPI:MODE>")]
)
;; Store pair with writeback. This is primarily used in function prologues
;; when saving [fp,lr].
(define_insn "storewb_pair<GPI:mode>_<PTR:mode>"
[(parallel
[(set (match_operand:PTR 0 "register_operand" "=&k")
(plus:PTR (match_operand:PTR 1 "register_operand" "0")
(match_operand:PTR 4 "const_int_operand" "n")))
(set (mem:GPI (plus:PTR (match_dup 0)
(match_dup 4)))
(match_operand:GPI 2 "register_operand" "r"))
(set (mem:GPI (plus:PTR (match_dup 0)
(match_operand:PTR 5 "const_int_operand" "n")))
(match_operand:GPI 3 "register_operand" "r"))])]
"INTVAL (operands[5]) == INTVAL (operands[4]) + GET_MODE_SIZE (<GPI:MODE>mode)"
"stp\\t%<w>2, %<w>3, [%0, %4]!"
[(set_attr "v8type" "store2")
(set_attr "mode" "<GPI:MODE>")]
)
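;; For illustration only (not part of the port): a typical prologue/epilogue
;; using the two writeback patterns above might look like
;;
;;   stp  x29, x30, [sp, -32]!   // storewb_pair: save fp/lr, pre-decrement sp
;;   ...
;;   ldp  x29, x30, [sp], 32     // loadwb_pair: restore fp/lr, post-increment sp
;;
;; where the frame size (32 here) is an arbitrary example and depends on the
;; function's stack layout.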
;; -------------------------------------------------------------------
;; Sign/Zero extension
;; -------------------------------------------------------------------
(define_expand "<optab>sidi2"
[(set (match_operand:DI 0 "register_operand")
(ANY_EXTEND:DI (match_operand:SI 1 "nonimmediate_operand")))]
""
)
(define_insn "*extendsidi2_aarch64"
[(set (match_operand:DI 0 "register_operand" "=r,r")
(sign_extend:DI (match_operand:SI 1 "nonimmediate_operand" "r,m")))]
""
"@
sxtw\t%0, %w1
ldrsw\t%0, %1"
[(set_attr "v8type" "extend,load1")
(set_attr "mode" "DI")]
)
(define_insn "*zero_extendsidi2_aarch64"
[(set (match_operand:DI 0 "register_operand" "=r,r")
(zero_extend:DI (match_operand:SI 1 "nonimmediate_operand" "r,m")))]
""
"@
uxtw\t%0, %w1
ldr\t%w0, %1"
[(set_attr "v8type" "extend,load1")
(set_attr "mode" "DI")]
)
(define_expand "<ANY_EXTEND:optab><SHORT:mode><GPI:mode>2"
[(set (match_operand:GPI 0 "register_operand")
(ANY_EXTEND:GPI (match_operand:SHORT 1 "nonimmediate_operand")))]
""
)
(define_insn "*extend<SHORT:mode><GPI:mode>2_aarch64"
[(set (match_operand:GPI 0 "register_operand" "=r,r")
(sign_extend:GPI (match_operand:SHORT 1 "nonimmediate_operand" "r,m")))]
""
"@
sxt<SHORT:size>\t%<GPI:w>0, %w1
ldrs<SHORT:size>\t%<GPI:w>0, %1"
[(set_attr "v8type" "extend,load1")
(set_attr "mode" "<GPI:MODE>")]
)
(define_insn "*zero_extend<SHORT:mode><GPI:mode>2_aarch64"
[(set (match_operand:GPI 0 "register_operand" "=r,r")
(zero_extend:GPI (match_operand:SHORT 1 "nonimmediate_operand" "r,m")))]
""
"@
uxt<SHORT:size>\t%<GPI:w>0, %w1
ldr<SHORT:size>\t%w0, %1"
[(set_attr "v8type" "extend,load1")
(set_attr "mode" "<GPI:MODE>")]
)
(define_expand "<optab>qihi2"
[(set (match_operand:HI 0 "register_operand")
(ANY_EXTEND:HI (match_operand:QI 1 "nonimmediate_operand")))]
""
)
(define_insn "*<optab>qihi2_aarch64"
[(set (match_operand:HI 0 "register_operand" "=r,r")
(ANY_EXTEND:HI (match_operand:QI 1 "nonimmediate_operand" "r,m")))]
""
"@
<su>xtb\t%w0, %w1
<ldrxt>b\t%w0, %1"
[(set_attr "v8type" "extend,load1")
(set_attr "mode" "HI")]
)
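;; For illustration: each extension pattern above has a register alternative
;; and a memory alternative that folds the extension into the load, e.g. for
;; sign-extending a halfword to SImode:
;;
;;   sxth  w0, w1        // register source
;;   ldrsh w0, [x1]      // memory source: extending load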
;; -------------------------------------------------------------------
;; Simple arithmetic
;; -------------------------------------------------------------------
(define_expand "add<mode>3"
[(set
(match_operand:GPI 0 "register_operand" "")
(plus:GPI (match_operand:GPI 1 "register_operand" "")
(match_operand:GPI 2 "aarch64_pluslong_operand" "")))]
""
"
if (! aarch64_plus_operand (operands[2], VOIDmode))
{
rtx subtarget = ((optimize && can_create_pseudo_p ())
? gen_reg_rtx (<MODE>mode) : operands[0]);
HOST_WIDE_INT imm = INTVAL (operands[2]);
if (imm < 0)
imm = -(-imm & ~0xfff);
else
imm &= ~0xfff;
emit_insn (gen_add<mode>3 (subtarget, operands[1], GEN_INT (imm)));
operands[1] = subtarget;
operands[2] = GEN_INT (INTVAL (operands[2]) - imm);
}
"
)
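;; As a sketch of the expander above: an addend that does not fit the 12-bit
;; immediate field, such as 4100, is split into a 4096-aligned part and a
;; remainder so that each resulting add satisfies aarch64_plus_operand:
;;
;;   add  x0, x1, 4096
;;   add  x0, x0, 4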
(define_insn "*addsi3_aarch64"
[(set
(match_operand:SI 0 "register_operand" "=rk,rk,rk")
(plus:SI
(match_operand:SI 1 "register_operand" "%rk,rk,rk")
(match_operand:SI 2 "aarch64_plus_operand" "I,r,J")))]
""
"@
add\\t%w0, %w1, %2
add\\t%w0, %w1, %w2
sub\\t%w0, %w1, #%n2"
[(set_attr "v8type" "alu")
(set_attr "mode" "SI")]
)
(define_insn "*adddi3_aarch64"
[(set
(match_operand:DI 0 "register_operand" "=rk,rk,rk,!w")
(plus:DI
(match_operand:DI 1 "register_operand" "%rk,rk,rk,!w")
(match_operand:DI 2 "aarch64_plus_operand" "I,r,J,!w")))]
""
"@
add\\t%x0, %x1, %2
add\\t%x0, %x1, %x2
sub\\t%x0, %x1, #%n2
add\\t%d0, %d1, %d2"
[(set_attr "v8type" "alu")
(set_attr "mode" "DI")
(set_attr "simd" "*,*,*,yes")]
)
(define_insn "*add<mode>3_compare0"
[(set (reg:CC_NZ CC_REGNUM)
(compare:CC_NZ
(plus:GPI (match_operand:GPI 1 "register_operand" "%r,r")
(match_operand:GPI 2 "aarch64_plus_operand" "rI,J"))
(const_int 0)))
(set (match_operand:GPI 0 "register_operand" "=r,r")
(plus:GPI (match_dup 1) (match_dup 2)))]
""
"@
adds\\t%<w>0, %<w>1, %<w>2
subs\\t%<w>0, %<w>1, #%n2"
[(set_attr "v8type" "alus")
(set_attr "mode" "<MODE>")]
)
(define_insn "*add<mode>3nr_compare0"
[(set (reg:CC_NZ CC_REGNUM)
(compare:CC_NZ
(plus:GPI (match_operand:GPI 0 "register_operand" "%r,r")
(match_operand:GPI 1 "aarch64_plus_operand" "rI,J"))
(const_int 0)))]
""
"@
cmn\\t%<w>0, %<w>1
cmp\\t%<w>0, #%n1"
[(set_attr "v8type" "alus")
(set_attr "mode" "<MODE>")]
)
(define_insn "*add_<shift>_<mode>"
[(set (match_operand:GPI 0 "register_operand" "=rk")
(plus:GPI (ASHIFT:GPI (match_operand:GPI 1 "register_operand" "r")
(match_operand:QI 2 "aarch64_shift_imm_<mode>" "n"))
(match_operand:GPI 3 "register_operand" "r")))]
""
"add\\t%<w>0, %<w>3, %<w>1, <shift> %2"
[(set_attr "v8type" "alu_shift")
(set_attr "mode" "<MODE>")]
)
(define_insn "*add_mul_imm_<mode>"
[(set (match_operand:GPI 0 "register_operand" "=rk")
(plus:GPI (mult:GPI (match_operand:GPI 1 "register_operand" "r")
(match_operand:QI 2 "aarch64_pwr_2_<mode>" "n"))
(match_operand:GPI 3 "register_operand" "r")))]
""
"add\\t%<w>0, %<w>3, %<w>1, lsl %p2"
[(set_attr "v8type" "alu_shift")
(set_attr "mode" "<MODE>")]
)
(define_insn "*add_<optab><ALLX:mode>_<GPI:mode>"
[(set (match_operand:GPI 0 "register_operand" "=rk")
(plus:GPI (ANY_EXTEND:GPI (match_operand:ALLX 1 "register_operand" "r"))
(match_operand:GPI 2 "register_operand" "r")))]
""
"add\\t%<GPI:w>0, %<GPI:w>2, %<GPI:w>1, <su>xt<ALLX:size>"
[(set_attr "v8type" "alu_ext")
(set_attr "mode" "<GPI:MODE>")]
)
(define_insn "*add_<optab><ALLX:mode>_shft_<GPI:mode>"
[(set (match_operand:GPI 0 "register_operand" "=rk")
(plus:GPI (ashift:GPI (ANY_EXTEND:GPI
(match_operand:ALLX 1 "register_operand" "r"))
(match_operand 2 "aarch64_imm3" "Ui3"))
(match_operand:GPI 3 "register_operand" "r")))]
""
"add\\t%<GPI:w>0, %<GPI:w>3, %<GPI:w>1, <su>xt<ALLX:size> %2"
[(set_attr "v8type" "alu_ext")
(set_attr "mode" "<GPI:MODE>")]
)
(define_insn "*add_<optab><ALLX:mode>_mult_<GPI:mode>"
[(set (match_operand:GPI 0 "register_operand" "=rk")
(plus:GPI (mult:GPI (ANY_EXTEND:GPI
(match_operand:ALLX 1 "register_operand" "r"))
(match_operand 2 "aarch64_pwr_imm3" "Up3"))
(match_operand:GPI 3 "register_operand" "r")))]
""
"add\\t%<GPI:w>0, %<GPI:w>3, %<GPI:w>1, <su>xt<ALLX:size> %p2"
[(set_attr "v8type" "alu_ext")
(set_attr "mode" "<GPI:MODE>")]
)
(define_insn "*add_<optab><mode>_multp2"
[(set (match_operand:GPI 0 "register_operand" "=rk")
(plus:GPI (ANY_EXTRACT:GPI
(mult:GPI (match_operand:GPI 1 "register_operand" "r")
(match_operand 2 "aarch64_pwr_imm3" "Up3"))
(match_operand 3 "const_int_operand" "n")
(const_int 0))
(match_operand:GPI 4 "register_operand" "r")))]
"aarch64_is_extend_from_extract (<MODE>mode, operands[2], operands[3])"
"add\\t%<w>0, %<w>4, %<w>1, <su>xt%e3 %p2"
[(set_attr "v8type" "alu_ext")
(set_attr "mode" "<MODE>")]
)
(define_insn "*add<mode>3_carryin"
[(set
(match_operand:GPI 0 "register_operand" "=r")
(plus:GPI (geu:GPI (reg:CC CC_REGNUM) (const_int 0))
(plus:GPI
(match_operand:GPI 1 "register_operand" "r")
(match_operand:GPI 2 "register_operand" "r"))))]
""
"adc\\t%<w>0, %<w>1, %<w>2"
[(set_attr "v8type" "adc")
(set_attr "mode" "<MODE>")]
)
(define_insn "*add<mode>3_carryin_alt1"
[(set
(match_operand:GPI 0 "register_operand" "=r")
(plus:GPI (plus:GPI
(match_operand:GPI 1 "register_operand" "r")
(match_operand:GPI 2 "register_operand" "r"))
(geu:GPI (reg:CC CC_REGNUM) (const_int 0))))]
""
"adc\\t%<w>0, %<w>1, %<w>2"
[(set_attr "v8type" "adc")
(set_attr "mode" "<MODE>")]
)
(define_insn "*add<mode>3_carryin_alt2"
[(set
(match_operand:GPI 0 "register_operand" "=r")
(plus:GPI (plus:GPI
(geu:GPI (reg:CC CC_REGNUM) (const_int 0))
(match_operand:GPI 1 "register_operand" "r"))
(match_operand:GPI 2 "register_operand" "r")))]
""
"adc\\t%<w>0, %<w>1, %<w>2"
[(set_attr "v8type" "adc")
(set_attr "mode" "<MODE>")]
)
(define_insn "*add<mode>3_carryin_alt3"
[(set
(match_operand:GPI 0 "register_operand" "=r")
(plus:GPI (plus:GPI
(geu:GPI (reg:CC CC_REGNUM) (const_int 0))
(match_operand:GPI 2 "register_operand" "r"))
(match_operand:GPI 1 "register_operand" "r")))]
""
"adc\\t%<w>0, %<w>1, %<w>2"
[(set_attr "v8type" "adc")
(set_attr "mode" "<MODE>")]
)
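;; For illustration: the carry-in patterns above allow a double-word addition
;; to expand to an adds/adc sequence, e.g. adding two 128-bit values held in
;; register pairs (register numbers are arbitrary):
;;
;;   adds x0, x2, x4     // low halves, sets the carry flag
;;   adc  x1, x3, x5     // high halves plus carry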
(define_insn "*add_uxt<mode>_multp2"
[(set (match_operand:GPI 0 "register_operand" "=rk")
(plus:GPI (and:GPI
(mult:GPI (match_operand:GPI 1 "register_operand" "r")
(match_operand 2 "aarch64_pwr_imm3" "Up3"))
(match_operand 3 "const_int_operand" "n"))
(match_operand:GPI 4 "register_operand" "r")))]
"aarch64_uxt_size (exact_log2 (INTVAL (operands[2])), INTVAL (operands[3])) != 0"
"*
operands[3] = GEN_INT (aarch64_uxt_size (exact_log2 (INTVAL (operands[2])),
INTVAL (operands[3])));
return \"add\t%<w>0, %<w>4, %<w>1, uxt%e3 %p2\";"
[(set_attr "v8type" "alu_ext")
(set_attr "mode" "<MODE>")]
)
(define_insn "subsi3"
[(set (match_operand:SI 0 "register_operand" "=rk")
(minus:SI (match_operand:SI 1 "register_operand" "r")
(match_operand:SI 2 "register_operand" "r")))]
""
"sub\\t%w0, %w1, %w2"
[(set_attr "v8type" "alu")
(set_attr "mode" "SI")]
)
(define_insn "subdi3"
[(set (match_operand:DI 0 "register_operand" "=rk,!w")
(minus:DI (match_operand:DI 1 "register_operand" "r,!w")
(match_operand:DI 2 "register_operand" "r,!w")))]
""
"@
sub\\t%x0, %x1, %x2
sub\\t%d0, %d1, %d2"
[(set_attr "v8type" "alu")
(set_attr "mode" "DI")
(set_attr "simd" "*,yes")]
)
(define_insn "*sub<mode>3_compare0"
[(set (reg:CC_NZ CC_REGNUM)
(compare:CC_NZ (minus:GPI (match_operand:GPI 1 "register_operand" "r")
(match_operand:GPI 2 "register_operand" "r"))
(const_int 0)))
(set (match_operand:GPI 0 "register_operand" "=r")
(minus:GPI (match_dup 1) (match_dup 2)))]
""
"subs\\t%<w>0, %<w>1, %<w>2"
[(set_attr "v8type" "alus")
(set_attr "mode" "<MODE>")]
)
(define_insn "*sub_<shift>_<mode>"
[(set (match_operand:GPI 0 "register_operand" "=rk")
(minus:GPI (match_operand:GPI 3 "register_operand" "r")
(ASHIFT:GPI
(match_operand:GPI 1 "register_operand" "r")
(match_operand:QI 2 "aarch64_shift_imm_<mode>" "n"))))]
""
"sub\\t%<w>0, %<w>3, %<w>1, <shift> %2"
[(set_attr "v8type" "alu_shift")
(set_attr "mode" "<MODE>")]
)
(define_insn "*sub_mul_imm_<mode>"
[(set (match_operand:GPI 0 "register_operand" "=rk")
(minus:GPI (match_operand:GPI 3 "register_operand" "r")
(mult:GPI
(match_operand:GPI 1 "register_operand" "r")
(match_operand:QI 2 "aarch64_pwr_2_<mode>" "n"))))]
""
"sub\\t%<w>0, %<w>3, %<w>1, lsl %p2"
[(set_attr "v8type" "alu_shift")
(set_attr "mode" "<MODE>")]
)
(define_insn "*sub_<optab><ALLX:mode>_<GPI:mode>"
[(set (match_operand:GPI 0 "register_operand" "=rk")
(minus:GPI (match_operand:GPI 1 "register_operand" "r")
(ANY_EXTEND:GPI
(match_operand:ALLX 2 "register_operand" "r"))))]
""
"sub\\t%<GPI:w>0, %<GPI:w>1, %<GPI:w>2, <su>xt<ALLX:size>"
[(set_attr "v8type" "alu_ext")
(set_attr "mode" "<GPI:MODE>")]
)
(define_insn "*sub_<optab><ALLX:mode>_shft_<GPI:mode>"
[(set (match_operand:GPI 0 "register_operand" "=rk")
(minus:GPI (match_operand:GPI 1 "register_operand" "r")
(ashift:GPI (ANY_EXTEND:GPI
(match_operand:ALLX 2 "register_operand" "r"))
(match_operand 3 "aarch64_imm3" "Ui3"))))]
""
"sub\\t%<GPI:w>0, %<GPI:w>1, %<GPI:w>2, <su>xt<ALLX:size> %3"
[(set_attr "v8type" "alu_ext")
(set_attr "mode" "<GPI:MODE>")]
)
(define_insn "*sub_<optab><mode>_multp2"
[(set (match_operand:GPI 0 "register_operand" "=rk")
(minus:GPI (match_operand:GPI 4 "register_operand" "r")
(ANY_EXTRACT:GPI
(mult:GPI (match_operand:GPI 1 "register_operand" "r")
(match_operand 2 "aarch64_pwr_imm3" "Up3"))
(match_operand 3 "const_int_operand" "n")
(const_int 0))))]
"aarch64_is_extend_from_extract (<MODE>mode, operands[2], operands[3])"
"sub\\t%<w>0, %<w>4, %<w>1, <su>xt%e3 %p2"
[(set_attr "v8type" "alu_ext")
(set_attr "mode" "<MODE>")]
)
(define_insn "*sub_uxt<mode>_multp2"
[(set (match_operand:GPI 0 "register_operand" "=rk")
(minus:GPI (match_operand:GPI 4 "register_operand" "r")
(and:GPI
(mult:GPI (match_operand:GPI 1 "register_operand" "r")
(match_operand 2 "aarch64_pwr_imm3" "Up3"))
(match_operand 3 "const_int_operand" "n"))))]
"aarch64_uxt_size (exact_log2 (INTVAL (operands[2])),INTVAL (operands[3])) != 0"
"*
operands[3] = GEN_INT (aarch64_uxt_size (exact_log2 (INTVAL (operands[2])),
INTVAL (operands[3])));
return \"sub\t%<w>0, %<w>4, %<w>1, uxt%e3 %p2\";"
[(set_attr "v8type" "alu_ext")
(set_attr "mode" "<MODE>")]
)
(define_insn "neg<mode>2"
[(set (match_operand:GPI 0 "register_operand" "=r")
(neg:GPI (match_operand:GPI 1 "register_operand" "r")))]
""
"neg\\t%<w>0, %<w>1"
[(set_attr "v8type" "alu")
(set_attr "mode" "<MODE>")]
)
(define_insn "*neg<mode>2_compare0"
[(set (reg:CC_NZ CC_REGNUM)
(compare:CC_NZ (neg:GPI (match_operand:GPI 1 "register_operand" "r"))
(const_int 0)))
(set (match_operand:GPI 0 "register_operand" "=r")
(neg:GPI (match_dup 1)))]
""
"negs\\t%<w>0, %<w>1"
[(set_attr "v8type" "alus")
(set_attr "mode" "<MODE>")]
)
(define_insn "*neg_<shift>_<mode>2"
[(set (match_operand:GPI 0 "register_operand" "=r")
(neg:GPI (ASHIFT:GPI
(match_operand:GPI 1 "register_operand" "r")
(match_operand:QI 2 "aarch64_shift_imm_<mode>" "n"))))]
""
"neg\\t%<w>0, %<w>1, <shift> %2"
[(set_attr "v8type" "alu_shift")
(set_attr "mode" "<MODE>")]
)
(define_insn "*neg_mul_imm_<mode>2"
[(set (match_operand:GPI 0 "register_operand" "=r")
(neg:GPI (mult:GPI
(match_operand:GPI 1 "register_operand" "r")
(match_operand:QI 2 "aarch64_pwr_2_<mode>" "n"))))]
""
"neg\\t%<w>0, %<w>1, lsl %p2"
[(set_attr "v8type" "alu_shift")
(set_attr "mode" "<MODE>")]
)
(define_insn "mul<mode>3"
[(set (match_operand:GPI 0 "register_operand" "=r")
(mult:GPI (match_operand:GPI 1 "register_operand" "r")
(match_operand:GPI 2 "register_operand" "r")))]
""
"mul\\t%<w>0, %<w>1, %<w>2"
[(set_attr "v8type" "mult")
(set_attr "mode" "<MODE>")]
)
(define_insn "*madd<mode>"
[(set (match_operand:GPI 0 "register_operand" "=r")
(plus:GPI (mult:GPI (match_operand:GPI 1 "register_operand" "r")
(match_operand:GPI 2 "register_operand" "r"))
(match_operand:GPI 3 "register_operand" "r")))]
""
"madd\\t%<w>0, %<w>1, %<w>2, %<w>3"
[(set_attr "v8type" "madd")
(set_attr "mode" "<MODE>")]
)
(define_insn "*msub<mode>"
[(set (match_operand:GPI 0 "register_operand" "=r")
(minus:GPI (match_operand:GPI 3 "register_operand" "r")
(mult:GPI (match_operand:GPI 1 "register_operand" "r")
(match_operand:GPI 2 "register_operand" "r"))))]
""
"msub\\t%<w>0, %<w>1, %<w>2, %<w>3"
[(set_attr "v8type" "madd")
(set_attr "mode" "<MODE>")]
)
(define_insn "*mul<mode>_neg"
[(set (match_operand:GPI 0 "register_operand" "=r")
(mult:GPI (neg:GPI (match_operand:GPI 1 "register_operand" "r"))
(match_operand:GPI 2 "register_operand" "r")))]
""
"mneg\\t%<w>0, %<w>1, %<w>2"
[(set_attr "v8type" "mult")
(set_attr "mode" "<MODE>")]
)
(define_insn "<su_optab>mulsidi3"
[(set (match_operand:DI 0 "register_operand" "=r")
(mult:DI (ANY_EXTEND:DI (match_operand:SI 1 "register_operand" "r"))
(ANY_EXTEND:DI (match_operand:SI 2 "register_operand" "r"))))]
""
"<su>mull\\t%0, %w1, %w2"
[(set_attr "v8type" "mull")
(set_attr "mode" "DI")]
)
(define_insn "<su_optab>maddsidi4"
[(set (match_operand:DI 0 "register_operand" "=r")
(plus:DI (mult:DI
(ANY_EXTEND:DI (match_operand:SI 1 "register_operand" "r"))
(ANY_EXTEND:DI (match_operand:SI 2 "register_operand" "r")))
(match_operand:DI 3 "register_operand" "r")))]
""
"<su>maddl\\t%0, %w1, %w2, %3"
[(set_attr "v8type" "maddl")
(set_attr "mode" "DI")]
)
(define_insn "<su_optab>msubsidi4"
[(set (match_operand:DI 0 "register_operand" "=r")
(minus:DI
(match_operand:DI 3 "register_operand" "r")
(mult:DI (ANY_EXTEND:DI (match_operand:SI 1 "register_operand" "r"))
(ANY_EXTEND:DI
(match_operand:SI 2 "register_operand" "r")))))]
""
"<su>msubl\\t%0, %w1, %w2, %3"
[(set_attr "v8type" "maddl")
(set_attr "mode" "DI")]
)
(define_insn "*<su_optab>mulsidi_neg"
[(set (match_operand:DI 0 "register_operand" "=r")
(mult:DI (neg:DI
(ANY_EXTEND:DI (match_operand:SI 1 "register_operand" "r")))
(ANY_EXTEND:DI (match_operand:SI 2 "register_operand" "r"))))]
""
"<su>mnegl\\t%0, %w1, %w2"
[(set_attr "v8type" "mull")
(set_attr "mode" "DI")]
)
(define_insn "<su>muldi3_highpart"
[(set (match_operand:DI 0 "register_operand" "=r")
(truncate:DI
(lshiftrt:TI
(mult:TI
(ANY_EXTEND:TI (match_operand:DI 1 "register_operand" "r"))
(ANY_EXTEND:TI (match_operand:DI 2 "register_operand" "r")))
(const_int 64))))]
""
"<su>mulh\\t%0, %1, %2"
[(set_attr "v8type" "mulh")
(set_attr "mode" "DI")]
)
(define_insn "<su_optab>div<mode>3"
[(set (match_operand:GPI 0 "register_operand" "=r")
(ANY_DIV:GPI (match_operand:GPI 1 "register_operand" "r")
(match_operand:GPI 2 "register_operand" "r")))]
""
"<su>div\\t%<w>0, %<w>1, %<w>2"
[(set_attr "v8type" "<su>div")
(set_attr "mode" "<MODE>")]
)
;; -------------------------------------------------------------------
;; Comparison insns
;; -------------------------------------------------------------------
(define_insn "*cmp<mode>"
[(set (reg:CC CC_REGNUM)
(compare:CC (match_operand:GPI 0 "register_operand" "r,r")
(match_operand:GPI 1 "aarch64_plus_operand" "rI,J")))]
""
"@
cmp\\t%<w>0, %<w>1
cmn\\t%<w>0, #%n1"
[(set_attr "v8type" "alus")
(set_attr "mode" "<MODE>")]
)
(define_insn "*cmp<mode>"
[(set (reg:CCFP CC_REGNUM)
(compare:CCFP (match_operand:GPF 0 "register_operand" "w,w")
(match_operand:GPF 1 "aarch64_fp_compare_operand" "Y,w")))]
"TARGET_FLOAT"
"@
fcmp\\t%<s>0, #0.0
fcmp\\t%<s>0, %<s>1"
[(set_attr "v8type" "fcmp")
(set_attr "mode" "<MODE>")]
)
(define_insn "*cmpe<mode>"
[(set (reg:CCFPE CC_REGNUM)
(compare:CCFPE (match_operand:GPF 0 "register_operand" "w,w")
(match_operand:GPF 1 "aarch64_fp_compare_operand" "Y,w")))]
"TARGET_FLOAT"
"@
fcmpe\\t%<s>0, #0.0
fcmpe\\t%<s>0, %<s>1"
[(set_attr "v8type" "fcmp")
(set_attr "mode" "<MODE>")]
)
(define_insn "*cmp_swp_<shift>_reg<mode>"
[(set (reg:CC_SWP CC_REGNUM)
(compare:CC_SWP (ASHIFT:GPI
(match_operand:GPI 0 "register_operand" "r")
(match_operand:QI 1 "aarch64_shift_imm_<mode>" "n"))
(match_operand:GPI 2 "aarch64_reg_or_zero" "rZ")))]
""
"cmp\\t%<w>2, %<w>0, <shift> %1"
[(set_attr "v8type" "alus_shift")
(set_attr "mode" "<MODE>")]
)
(define_insn "*cmp_swp_<optab><ALLX:mode>_reg<GPI:mode>"
[(set (reg:CC_SWP CC_REGNUM)
(compare:CC_SWP (ANY_EXTEND:GPI
(match_operand:ALLX 0 "register_operand" "r"))
(match_operand:GPI 1 "register_operand" "r")))]
""
"cmp\\t%<GPI:w>1, %<GPI:w>0, <su>xt<ALLX:size>"
[(set_attr "v8type" "alus_ext")
(set_attr "mode" "<GPI:MODE>")]
)
;; -------------------------------------------------------------------
;; Store-flag and conditional select insns
;; -------------------------------------------------------------------
(define_expand "cstore<mode>4"
[(set (match_operand:SI 0 "register_operand" "")
(match_operator:SI 1 "aarch64_comparison_operator"
[(match_operand:GPI 2 "register_operand" "")
(match_operand:GPI 3 "aarch64_plus_operand" "")]))]
""
"
operands[2] = aarch64_gen_compare_reg (GET_CODE (operands[1]), operands[2],
operands[3]);
operands[3] = const0_rtx;
"
)
(define_expand "cstore<mode>4"
[(set (match_operand:SI 0 "register_operand" "")
(match_operator:SI 1 "aarch64_comparison_operator"
[(match_operand:GPF 2 "register_operand" "")
(match_operand:GPF 3 "register_operand" "")]))]
""
"
operands[2] = aarch64_gen_compare_reg (GET_CODE (operands[1]), operands[2],
operands[3]);
operands[3] = const0_rtx;
"
)
(define_insn "*cstore<mode>_insn"
[(set (match_operand:ALLI 0 "register_operand" "=r")
(match_operator:ALLI 1 "aarch64_comparison_operator"
[(match_operand 2 "cc_register" "") (const_int 0)]))]
""
"cset\\t%<w>0, %m1"
[(set_attr "v8type" "csel")
(set_attr "mode" "<MODE>")]
)
(define_insn "*cstore<mode>_neg"
[(set (match_operand:ALLI 0 "register_operand" "=r")
(neg:ALLI (match_operator:ALLI 1 "aarch64_comparison_operator"
[(match_operand 2 "cc_register" "") (const_int 0)])))]
""
"csetm\\t%<w>0, %m1"
[(set_attr "v8type" "csel")
(set_attr "mode" "<MODE>")]
)
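;; For illustration: these patterns materialize a comparison result directly
;; into a register, e.g. for "x != 0" in SImode:
;;
;;   cmp   w1, 0
;;   cset  w0, ne        // w0 = 1 if ne, else 0
;;
;; csetm produces 0 / -1 instead of 0 / 1, matching the neg:ALLI form above.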
(define_expand "cmov<mode>6"
[(set (match_operand:GPI 0 "register_operand" "")
(if_then_else:GPI
(match_operator 1 "aarch64_comparison_operator"
[(match_operand:GPI 2 "register_operand" "")
(match_operand:GPI 3 "aarch64_plus_operand" "")])
(match_operand:GPI 4 "register_operand" "")
(match_operand:GPI 5 "register_operand" "")))]
""
"
operands[2] = aarch64_gen_compare_reg (GET_CODE (operands[1]), operands[2],
operands[3]);
operands[3] = const0_rtx;
"
)
(define_expand "cmov<mode>6"
[(set (match_operand:GPF 0 "register_operand" "")
(if_then_else:GPF
(match_operator 1 "aarch64_comparison_operator"
[(match_operand:GPF 2 "register_operand" "")
(match_operand:GPF 3 "register_operand" "")])
(match_operand:GPF 4 "register_operand" "")
(match_operand:GPF 5 "register_operand" "")))]
""
"
operands[2] = aarch64_gen_compare_reg (GET_CODE (operands[1]), operands[2],
operands[3]);
operands[3] = const0_rtx;
"
)
(define_insn "*cmov<mode>_insn"
[(set (match_operand:ALLI 0 "register_operand" "=r,r,r,r")
(if_then_else:ALLI
(match_operator 1 "aarch64_comparison_operator"
[(match_operand 2 "cc_register" "") (const_int 0)])
(match_operand:ALLI 3 "aarch64_reg_zero_or_m1" "rZ,rZ,UsM,UsM")
(match_operand:ALLI 4 "aarch64_reg_zero_or_m1" "rZ,UsM,rZ,UsM")))]
""
;; The final alternative should be unreachable, but is included for
;; completeness.
"@
csel\\t%<w>0, %<w>3, %<w>4, %m1
csinv\\t%<w>0, %<w>3, <w>zr, %m1
csinv\\t%<w>0, %<w>4, <w>zr, %M1
mov\\t%<w>0, -1"
[(set_attr "v8type" "csel")
(set_attr "mode" "<MODE>")]
)
(define_insn "*cmov<mode>_insn"
[(set (match_operand:GPF 0 "register_operand" "=w")
(if_then_else:GPF
(match_operator 1 "aarch64_comparison_operator"
[(match_operand 2 "cc_register" "") (const_int 0)])
(match_operand:GPF 3 "register_operand" "w")
(match_operand:GPF 4 "register_operand" "w")))]
"TARGET_FLOAT"
"fcsel\\t%<s>0, %<s>3, %<s>4, %m1"
[(set_attr "v8type" "fcsel")
(set_attr "mode" "<MODE>")]
)
(define_expand "mov<mode>cc"
[(set (match_operand:ALLI 0 "register_operand" "")
(if_then_else:ALLI (match_operand 1 "aarch64_comparison_operator" "")
(match_operand:ALLI 2 "register_operand" "")
(match_operand:ALLI 3 "register_operand" "")))]
""
{
rtx ccreg;
enum rtx_code code = GET_CODE (operands[1]);
if (code == UNEQ || code == LTGT)
FAIL;
ccreg = aarch64_gen_compare_reg (code, XEXP (operands[1], 0),
XEXP (operands[1], 1));
operands[1] = gen_rtx_fmt_ee (code, VOIDmode, ccreg, const0_rtx);
}
)
(define_expand "mov<GPF:mode><GPI:mode>cc"
[(set (match_operand:GPI 0 "register_operand" "")
(if_then_else:GPI (match_operand 1 "aarch64_comparison_operator" "")
(match_operand:GPF 2 "register_operand" "")
(match_operand:GPF 3 "register_operand" "")))]
""
{
rtx ccreg;
enum rtx_code code = GET_CODE (operands[1]);
if (code == UNEQ || code == LTGT)
FAIL;
ccreg = aarch64_gen_compare_reg (code, XEXP (operands[1], 0),
XEXP (operands[1], 1));
operands[1] = gen_rtx_fmt_ee (code, VOIDmode, ccreg, const0_rtx);
}
)
(define_insn "*csinc2<mode>_insn"
[(set (match_operand:GPI 0 "register_operand" "=r")
(plus:GPI (match_operator:GPI 2 "aarch64_comparison_operator"
[(match_operand:CC 3 "cc_register" "") (const_int 0)])
(match_operand:GPI 1 "register_operand" "r")))]
""
"csinc\\t%<w>0, %<w>1, %<w>1, %M2"
[(set_attr "v8type" "csel")
(set_attr "mode" "<MODE>")])
(define_insn "csinc3<mode>_insn"
[(set (match_operand:GPI 0 "register_operand" "=r")
(if_then_else:GPI
(match_operator:GPI 1 "aarch64_comparison_operator"
[(match_operand:CC 2 "cc_register" "") (const_int 0)])
(plus:GPI (match_operand:GPI 3 "register_operand" "r")
(const_int 1))
(match_operand:GPI 4 "aarch64_reg_or_zero" "rZ")))]
""
"csinc\\t%<w>0, %<w>4, %<w>3, %M1"
[(set_attr "v8type" "csel")
(set_attr "mode" "<MODE>")]
)
(define_insn "*csinv3<mode>_insn"
[(set (match_operand:GPI 0 "register_operand" "=r")
(if_then_else:GPI
(match_operator:GPI 1 "aarch64_comparison_operator"
[(match_operand:CC 2 "cc_register" "") (const_int 0)])
(not:GPI (match_operand:GPI 3 "register_operand" "r"))
(match_operand:GPI 4 "aarch64_reg_or_zero" "rZ")))]
""
"csinv\\t%<w>0, %<w>4, %<w>3, %M1"
[(set_attr "v8type" "csel")
(set_attr "mode" "<MODE>")])
(define_insn "*csneg3<mode>_insn"
[(set (match_operand:GPI 0 "register_operand" "=r")
(if_then_else:GPI
(match_operator:GPI 1 "aarch64_comparison_operator"
[(match_operand:CC 2 "cc_register" "") (const_int 0)])
(neg:GPI (match_operand:GPI 3 "register_operand" "r"))
(match_operand:GPI 4 "aarch64_reg_or_zero" "rZ")))]
""
"csneg\\t%<w>0, %<w>4, %<w>3, %M1"
[(set_attr "v8type" "csel")
(set_attr "mode" "<MODE>")])
;; -------------------------------------------------------------------
;; Logical operations
;; -------------------------------------------------------------------
(define_insn "<optab><mode>3"
[(set (match_operand:GPI 0 "register_operand" "=r,rk")
(LOGICAL:GPI (match_operand:GPI 1 "register_operand" "%r,r")
(match_operand:GPI 2 "aarch64_logical_operand" "r,<lconst>")))]
""
"<logical>\\t%<w>0, %<w>1, %<w>2"
[(set_attr "v8type" "logic,logic_imm")
(set_attr "mode" "<MODE>")])
(define_insn "*<LOGICAL:optab>_<SHIFT:optab><mode>3"
[(set (match_operand:GPI 0 "register_operand" "=r")
(LOGICAL:GPI (SHIFT:GPI
(match_operand:GPI 1 "register_operand" "r")
(match_operand:QI 2 "aarch64_shift_imm_<mode>" "n"))
(match_operand:GPI 3 "register_operand" "r")))]
""
"<LOGICAL:logical>\\t%<w>0, %<w>3, %<w>1, <SHIFT:shift> %2"
[(set_attr "v8type" "logic_shift")
(set_attr "mode" "<MODE>")])
(define_insn "one_cmpl<mode>2"
[(set (match_operand:GPI 0 "register_operand" "=r")
(not:GPI (match_operand:GPI 1 "register_operand" "r")))]
""
"mvn\\t%<w>0, %<w>1"
[(set_attr "v8type" "logic")
(set_attr "mode" "<MODE>")])
(define_insn "*one_cmpl_<optab><mode>2"
[(set (match_operand:GPI 0 "register_operand" "=r")
(not:GPI (SHIFT:GPI (match_operand:GPI 1 "register_operand" "r")
(match_operand:QI 2 "aarch64_shift_imm_<mode>" "n"))))]
""
"mvn\\t%<w>0, %<w>1, <shift> %2"
[(set_attr "v8type" "logic_shift")
(set_attr "mode" "<MODE>")])
(define_insn "*<LOGICAL:optab>_one_cmpl<mode>3"
[(set (match_operand:GPI 0 "register_operand" "=r")
(LOGICAL:GPI (not:GPI
(match_operand:GPI 1 "register_operand" "r"))
(match_operand:GPI 2 "register_operand" "r")))]
""
"<LOGICAL:nlogical>\\t%<w>0, %<w>2, %<w>1"
[(set_attr "v8type" "logic")
(set_attr "mode" "<MODE>")])
(define_insn "*<LOGICAL:optab>_one_cmpl_<SHIFT:optab><mode>3"
[(set (match_operand:GPI 0 "register_operand" "=r")
(LOGICAL:GPI (not:GPI
(SHIFT:GPI
(match_operand:GPI 1 "register_operand" "r")
(match_operand:QI 2 "aarch64_shift_imm_<mode>" "n")))
(match_operand:GPI 3 "register_operand" "r")))]
""
"<LOGICAL:nlogical>\\t%<w>0, %<w>3, %<w>1, <SHIFT:shift> %2"
[(set_attr "v8type" "logic_shift")
(set_attr "mode" "<MODE>")])
(define_insn "clz<mode>2"
[(set (match_operand:GPI 0 "register_operand" "=r")
(clz:GPI (match_operand:GPI 1 "register_operand" "r")))]
""
"clz\\t%<w>0, %<w>1"
[(set_attr "v8type" "clz")
(set_attr "mode" "<MODE>")])
(define_expand "ffs<mode>2"
[(match_operand:GPI 0 "register_operand")
(match_operand:GPI 1 "register_operand")]
""
{
rtx ccreg = aarch64_gen_compare_reg (EQ, operands[1], const0_rtx);
rtx x = gen_rtx_NE (VOIDmode, ccreg, const0_rtx);
emit_insn (gen_rbit<mode>2 (operands[0], operands[1]));
emit_insn (gen_clz<mode>2 (operands[0], operands[0]));
emit_insn (gen_csinc3<mode>_insn (operands[0], x, ccreg, operands[0], const0_rtx));
DONE;
}
)
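;; For illustration: ffs(x) is (x == 0) ? 0 : ctz(x) + 1, so for SImode the
;; expansion above would typically emit a sequence along these lines:
;;
;;   cmp   w1, 0             // set flags for the x == 0 test
;;   rbit  w0, w1
;;   clz   w0, w0            // clz(rbit(x)) == ctz(x)
;;   csinc w0, wzr, w0, eq   // 0 if x == 0, else ctz(x) + 1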
(define_insn "clrsb<mode>2"
[(set (match_operand:GPI 0 "register_operand" "=r")
(unspec:GPI [(match_operand:GPI 1 "register_operand" "r")] UNSPEC_CLS))]
""
"cls\\t%<w>0, %<w>1"
[(set_attr "v8type" "clz")
(set_attr "mode" "<MODE>")])
(define_insn "rbit<mode>2"
[(set (match_operand:GPI 0 "register_operand" "=r")
(unspec:GPI [(match_operand:GPI 1 "register_operand" "r")] UNSPEC_RBIT))]
""
"rbit\\t%<w>0, %<w>1"
[(set_attr "v8type" "rbit")
(set_attr "mode" "<MODE>")])
(define_expand "ctz<mode>2"
[(match_operand:GPI 0 "register_operand")
(match_operand:GPI 1 "register_operand")]
""
{
emit_insn (gen_rbit<mode>2 (operands[0], operands[1]));
emit_insn (gen_clz<mode>2 (operands[0], operands[0]));
DONE;
}
)
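;; For illustration: A64 has no count-trailing-zeros instruction, so the
;; expansion above uses the identity ctz(x) == clz(bit_reverse(x)):
;;
;;   rbit  w0, w1
;;   clz   w0, w0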
(define_insn "*and<mode>3nr_compare0"
[(set (reg:CC CC_REGNUM)
(compare:CC
(and:GPI (match_operand:GPI 0 "register_operand" "%r,r")
(match_operand:GPI 1 "aarch64_logical_operand" "r,<lconst>"))
(const_int 0)))]
""
"tst\\t%<w>0, %<w>1"
[(set_attr "v8type" "logics")
(set_attr "mode" "<MODE>")])
(define_insn "*and_<SHIFT:optab><mode>3nr_compare0"
[(set (reg:CC CC_REGNUM)
(compare:CC
(and:GPI (SHIFT:GPI
(match_operand:GPI 0 "register_operand" "r")
(match_operand:QI 1 "aarch64_shift_imm_<mode>" "n"))
(match_operand:GPI 2 "register_operand" "r"))
(const_int 0)))]
""
"tst\\t%<w>2, %<w>0, <SHIFT:shift> %1"
[(set_attr "v8type" "logics_shift")
(set_attr "mode" "<MODE>")])
;; -------------------------------------------------------------------
;; Shifts
;; -------------------------------------------------------------------
(define_expand "<optab><mode>3"
[(set (match_operand:GPI 0 "register_operand")
(ASHIFT:GPI (match_operand:GPI 1 "register_operand")
(match_operand:QI 2 "nonmemory_operand")))]
""
{
if (CONST_INT_P (operands[2]))
{
operands[2] = GEN_INT (INTVAL (operands[2])
& (GET_MODE_BITSIZE (<MODE>mode) - 1));
if (operands[2] == const0_rtx)
{
emit_insn (gen_mov<mode> (operands[0], operands[1]));
DONE;
}
}
}
)
(define_expand "ashl<mode>3"
[(set (match_operand:SHORT 0 "register_operand")
(ashift:SHORT (match_operand:SHORT 1 "register_operand")
(match_operand:QI 2 "nonmemory_operand")))]
""
{
if (CONST_INT_P (operands[2]))
{
operands[2] = GEN_INT (INTVAL (operands[2])
& (GET_MODE_BITSIZE (<MODE>mode) - 1));
if (operands[2] == const0_rtx)
{
emit_insn (gen_mov<mode> (operands[0], operands[1]));
DONE;
}
}
}
)
(define_expand "rotr<mode>3"
[(set (match_operand:GPI 0 "register_operand")
(rotatert:GPI (match_operand:GPI 1 "register_operand")
(match_operand:QI 2 "nonmemory_operand")))]
""
{
if (CONST_INT_P (operands[2]))
{
operands[2] = GEN_INT (INTVAL (operands[2])
& (GET_MODE_BITSIZE (<MODE>mode) - 1));
if (operands[2] == const0_rtx)
{
emit_insn (gen_mov<mode> (operands[0], operands[1]));
DONE;
}
}
}
)
(define_expand "rotl<mode>3"
[(set (match_operand:GPI 0 "register_operand")
(rotatert:GPI (match_operand:GPI 1 "register_operand")
(match_operand:QI 2 "nonmemory_operand")))]
""
{
/* (SZ - cnt) % SZ == -cnt % SZ.  */
if (CONST_INT_P (operands[2]))
{
operands[2] = GEN_INT ((-INTVAL (operands[2]))
& (GET_MODE_BITSIZE (<MODE>mode) - 1));
if (operands[2] == const0_rtx)
{
emit_insn (gen_mov<mode> (operands[0], operands[1]));
DONE;
}
}
else
operands[2] = expand_simple_unop (QImode, NEG, operands[2],
NULL_RTX, 1);
}
)
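;; For illustration: only rotate-right exists in the ISA (ror), so the
;; expander above rewrites a left rotate as a right rotate by the negated
;; count.  For a variable count this might emit:
;;
;;   neg  w2, w2
;;   ror  w0, w1, w2     // ror uses the count modulo the register width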
(define_insn "*<optab><mode>3_insn"
[(set (match_operand:GPI 0 "register_operand" "=r")
(SHIFT:GPI
(match_operand:GPI 1 "register_operand" "r")
(match_operand:QI 2 "aarch64_reg_or_shift_imm_<mode>" "rUs<cmode>")))]
""
"<shift>\\t%<w>0, %<w>1, %<w>2"
[(set_attr "v8type" "shift")
(set_attr "mode" "<MODE>")]
)
(define_insn "*ashl<mode>3_insn"
[(set (match_operand:SHORT 0 "register_operand" "=r")
(ashift:SHORT (match_operand:SHORT 1 "register_operand" "r")
(match_operand:QI 2 "aarch64_reg_or_shift_imm_si" "rUss")))]
""
"lsl\\t%<w>0, %<w>1, %<w>2"
[(set_attr "v8type" "shift")
(set_attr "mode" "<MODE>")]
)
(define_insn "*<optab><mode>3_insn"
[(set (match_operand:SHORT 0 "register_operand" "=r")
(ASHIFT:SHORT (match_operand:SHORT 1 "register_operand" "r")
(match_operand 2 "const_int_operand" "n")))]
"UINTVAL (operands[2]) < GET_MODE_BITSIZE (<MODE>mode)"
{
operands[3] = GEN_INT (<sizen> - UINTVAL (operands[2]));
return "<bfshift>\t%w0, %w1, %2, %3";
}
[(set_attr "v8type" "bfm")
(set_attr "mode" "<MODE>")]
)
(define_insn "*<ANY_EXTEND:optab><GPI:mode>_ashl<SHORT:mode>"
[(set (match_operand:GPI 0 "register_operand" "=r")
(ANY_EXTEND:GPI
(ashift:SHORT (match_operand:SHORT 1 "register_operand" "r")
(match_operand 2 "const_int_operand" "n"))))]
"UINTVAL (operands[2]) < GET_MODE_BITSIZE (<SHORT:MODE>mode)"
{
operands[3] = GEN_INT (<SHORT:sizen> - UINTVAL (operands[2]));
return "<su>bfiz\t%<GPI:w>0, %<GPI:w>1, %2, %3";
}
[(set_attr "v8type" "bfm")
(set_attr "mode" "<GPI:MODE>")]
)
(define_insn "*zero_extend<GPI:mode>_lshr<SHORT:mode>"
[(set (match_operand:GPI 0 "register_operand" "=r")
(zero_extend:GPI
(lshiftrt:SHORT (match_operand:SHORT 1 "register_operand" "r")
(match_operand 2 "const_int_operand" "n"))))]
"UINTVAL (operands[2]) < GET_MODE_BITSIZE (<SHORT:MODE>mode)"
{
operands[3] = GEN_INT (<SHORT:sizen> - UINTVAL (operands[2]));
return "ubfx\t%<GPI:w>0, %<GPI:w>1, %2, %3";
}
[(set_attr "v8type" "bfm")
(set_attr "mode" "<GPI:MODE>")]
)
(define_insn "*extend<GPI:mode>_ashr<SHORT:mode>"
[(set (match_operand:GPI 0 "register_operand" "=r")
(sign_extend:GPI
(ashiftrt:SHORT (match_operand:SHORT 1 "register_operand" "r")
(match_operand 2 "const_int_operand" "n"))))]
"UINTVAL (operands[2]) < GET_MODE_BITSIZE (<SHORT:MODE>mode)"
{
operands[3] = GEN_INT (<SHORT:sizen> - UINTVAL (operands[2]));
return "sbfx\\t%<GPI:w>0, %<GPI:w>1, %2, %3";
}
[(set_attr "v8type" "bfm")
(set_attr "mode" "<GPI:MODE>")]
)
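The three patterns above collapse a short-mode shift followed by an extend into a single bitfield operation, computing operands[3] = size - shift as the field width. A small Python model of the UBFX/SBFX semantics these templates rely on (a sketch for illustration, not GCC code):

```python
def ubfx(src, lsb, width):
    """Unsigned bitfield extract: bits [lsb, lsb + width) of src,
    zero-extended into the destination."""
    return (src >> lsb) & ((1 << width) - 1)

def sbfx(src, lsb, width):
    """Signed bitfield extract: the same field, sign-extended from
    bit width - 1."""
    field = ubfx(src, lsb, width)
    sign = 1 << (width - 1)
    return (field ^ sign) - sign
```

For example, zero_extend of an HImode lshiftrt by n is ubfx with lsb = n and width = 16 - n.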
;; -------------------------------------------------------------------
;; Bitfields
;; -------------------------------------------------------------------
(define_expand "<optab>"
[(set (match_operand:DI 0 "register_operand" "=r")
(ANY_EXTRACT:DI (match_operand:DI 1 "register_operand" "r")
(match_operand 2 "const_int_operand" "n")
(match_operand 3 "const_int_operand" "n")))]
""
""
)
(define_insn "*<optab><mode>"
[(set (match_operand:GPI 0 "register_operand" "=r")
(ANY_EXTRACT:GPI (match_operand:GPI 1 "register_operand" "r")
(match_operand 2 "const_int_operand" "n")
(match_operand 3 "const_int_operand" "n")))]
""
"<su>bfx\\t%<w>0, %<w>1, %3, %2"
[(set_attr "v8type" "bfm")
(set_attr "mode" "<MODE>")]
)
(define_insn "*<optab><ALLX:mode>_shft_<GPI:mode>"
[(set (match_operand:GPI 0 "register_operand" "=r")
(ashift:GPI (ANY_EXTEND:GPI
(match_operand:ALLX 1 "register_operand" "r"))
(match_operand 2 "const_int_operand" "n")))]
"UINTVAL (operands[2]) < <GPI:sizen>"
{
operands[3] = (<ALLX:sizen> <= (<GPI:sizen> - UINTVAL (operands[2])))
? GEN_INT (<ALLX:sizen>)
: GEN_INT (<GPI:sizen> - UINTVAL (operands[2]));
return "<su>bfiz\t%<GPI:w>0, %<GPI:w>1, %2, %3";
}
[(set_attr "v8type" "bfm")
(set_attr "mode" "<GPI:MODE>")]
)
;; XXX We should match (any_extend (ashift)) here, like (and (ashift)) below
(define_insn "*andim_ashift<mode>_bfiz"
[(set (match_operand:GPI 0 "register_operand" "=r")
(and:GPI (ashift:GPI (match_operand:GPI 1 "register_operand" "r")
(match_operand 2 "const_int_operand" "n"))
(match_operand 3 "const_int_operand" "n")))]
"exact_log2 ((INTVAL (operands[3]) >> INTVAL (operands[2])) + 1) >= 0
&& (INTVAL (operands[3]) & ((1 << INTVAL (operands[2])) - 1)) == 0"
"ubfiz\\t%<w>0, %<w>1, %2, %P3"
[(set_attr "v8type" "bfm")
(set_attr "mode" "<MODE>")]
)
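The insn condition above accepts only AND masks that form a contiguous run of ones starting at the shift amount, which is exactly the shape UBFIZ can produce. A Python sketch of the condition and the resulting equivalence (the field width printed by %P is assumed here to be the popcount of the shifted-down mask):

```python
def is_ubfiz(mask, shift):
    """Model the pattern's condition: mask >> shift must be 2**w - 1
    (so adding 1 yields a power of two) and the low 'shift' bits of
    mask must be clear."""
    top = (mask >> shift) + 1
    low_clear = (mask & ((1 << shift) - 1)) == 0
    return top != 0 and top & (top - 1) == 0 and low_clear

def ubfiz(src, lsb, width):
    """Bitfield insert in zero: the low 'width' bits of src, placed
    at bit position lsb."""
    return (src & ((1 << width) - 1)) << lsb

def width_of(mask, shift):
    # Field width: set-bit count of mask >> shift (assumed to be what
    # the %P modifier prints).
    return bin(mask >> shift).count("1")
```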
(define_insn "bswap<mode>2"
[(set (match_operand:GPI 0 "register_operand" "=r")
(bswap:GPI (match_operand:GPI 1 "register_operand" "r")))]
""
"rev\\t%<w>0, %<w>1"
[(set_attr "v8type" "rev")
(set_attr "mode" "<MODE>")]
)
;; -------------------------------------------------------------------
;; Floating-point intrinsics
;; -------------------------------------------------------------------
;; trunc - nothrow
(define_insn "btrunc<mode>2"
[(set (match_operand:GPF 0 "register_operand" "=w")
(unspec:GPF [(match_operand:GPF 1 "register_operand" "w")]
UNSPEC_FRINTZ))]
"TARGET_FLOAT"
"frintz\\t%<s>0, %<s>1"
[(set_attr "v8type" "frint")
(set_attr "mode" "<MODE>")]
)
(define_insn "*lbtrunc<su_optab><GPF:mode><GPI:mode>2"
[(set (match_operand:GPI 0 "register_operand" "=r")
(FIXUORS:GPI (unspec:GPF [(match_operand:GPF 1 "register_operand" "w")]
UNSPEC_FRINTZ)))]
"TARGET_FLOAT"
"fcvtz<su>\\t%<GPI:w>0, %<GPF:s>1"
[(set_attr "v8type" "fcvtf2i")
(set_attr "mode" "<GPF:MODE>")
(set_attr "mode2" "<GPI:MODE>")]
)
;; ceil - nothrow
(define_insn "ceil<mode>2"
[(set (match_operand:GPF 0 "register_operand" "=w")
(unspec:GPF [(match_operand:GPF 1 "register_operand" "w")]
UNSPEC_FRINTP))]
"TARGET_FLOAT"
"frintp\\t%<s>0, %<s>1"
[(set_attr "v8type" "frint")
(set_attr "mode" "<MODE>")]
)
(define_insn "lceil<su_optab><GPF:mode><GPI:mode>2"
[(set (match_operand:GPI 0 "register_operand" "=r")
(FIXUORS:GPI (unspec:GPF [(match_operand:GPF 1 "register_operand" "w")]
UNSPEC_FRINTP)))]
"TARGET_FLOAT"
"fcvtp<su>\\t%<GPI:w>0, %<GPF:s>1"
[(set_attr "v8type" "fcvtf2i")
(set_attr "mode" "<GPF:MODE>")
(set_attr "mode2" "<GPI:MODE>")]
)
;; floor - nothrow
(define_insn "floor<mode>2"
[(set (match_operand:GPF 0 "register_operand" "=w")
(unspec:GPF [(match_operand:GPF 1 "register_operand" "w")]
UNSPEC_FRINTM))]
"TARGET_FLOAT"
"frintm\\t%<s>0, %<s>1"
[(set_attr "v8type" "frint")
(set_attr "mode" "<MODE>")]
)
(define_insn "lfloor<su_optab><GPF:mode><GPI:mode>2"
[(set (match_operand:GPI 0 "register_operand" "=r")
(FIXUORS:GPI (unspec:GPF [(match_operand:GPF 1 "register_operand" "w")]
UNSPEC_FRINTM)))]
"TARGET_FLOAT"
"fcvtm<su>\\t%<GPI:w>0, %<GPF:s>1"
[(set_attr "v8type" "fcvtf2i")
(set_attr "mode" "<GPF:MODE>")
(set_attr "mode2" "<GPI:MODE>")]
)
;; nearbyint - nothrow
(define_insn "nearbyint<mode>2"
[(set (match_operand:GPF 0 "register_operand" "=w")
(unspec:GPF [(match_operand:GPF 1 "register_operand" "w")]
UNSPEC_FRINTI))]
"TARGET_FLOAT"
"frinti\\t%<s>0, %<s>1"
[(set_attr "v8type" "frint")
(set_attr "mode" "<MODE>")]
)
;; rint
(define_insn "rint<mode>2"
[(set (match_operand:GPF 0 "register_operand" "=w")
(unspec:GPF [(match_operand:GPF 1 "register_operand" "w")]
UNSPEC_FRINTX))]
"TARGET_FLOAT"
"frintx\\t%<s>0, %<s>1"
[(set_attr "v8type" "frint")
(set_attr "mode" "<MODE>")]
)
;; round - nothrow
(define_insn "round<mode>2"
[(set (match_operand:GPF 0 "register_operand" "=w")
(unspec:GPF [(match_operand:GPF 1 "register_operand" "w")]
UNSPEC_FRINTA))]
"TARGET_FLOAT"
"frinta\\t%<s>0, %<s>1"
[(set_attr "v8type" "frint")
(set_attr "mode" "<MODE>")]
)
(define_insn "lround<su_optab><GPF:mode><GPI:mode>2"
[(set (match_operand:GPI 0 "register_operand" "=r")
(FIXUORS:GPI (unspec:GPF [(match_operand:GPF 1 "register_operand" "w")]
UNSPEC_FRINTA)))]
"TARGET_FLOAT"
"fcvta<su>\\t%<GPI:w>0, %<GPF:s>1"
[(set_attr "v8type" "fcvtf2i")
(set_attr "mode" "<GPF:MODE>")
(set_attr "mode2" "<GPI:MODE>")]
)
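The frint patterns above map the C rounding functions onto single instructions: FRINTZ rounds toward zero (trunc), FRINTP toward +infinity (ceil), FRINTM toward -infinity (floor), and FRINTA to nearest with ties away from zero (round). A Python sketch of the four behaviours (frinta here is a simplified model and ignores edge cases near the limits of float precision):

```python
import math

def frinta(x):
    """Round to nearest, ties away from zero -- the semantics C round()
    requires, matching FRINTA."""
    return math.floor(x + 0.5) if x >= 0 else math.ceil(x - 0.5)

samples = [2.5, -2.5, 1.4, -1.6]
trunc_ = [math.trunc(v) for v in samples]   # FRINTZ
ceil_  = [math.ceil(v)  for v in samples]   # FRINTP
floor_ = [math.floor(v) for v in samples]   # FRINTM
away_  = [frinta(v)     for v in samples]   # FRINTA
```

Note how the four modes only disagree on the ties and on the sign of the fractional part.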
;; fma - nothrow
(define_insn "fma<mode>4"
[(set (match_operand:GPF 0 "register_operand" "=w")
(fma:GPF (match_operand:GPF 1 "register_operand" "w")
(match_operand:GPF 2 "register_operand" "w")
(match_operand:GPF 3 "register_operand" "w")))]
"TARGET_FLOAT"
"fmadd\\t%<s>0, %<s>1, %<s>2, %<s>3"
[(set_attr "v8type" "fmadd")
(set_attr "mode" "<MODE>")]
)
(define_insn "fnma<mode>4"
[(set (match_operand:GPF 0 "register_operand" "=w")
(fma:GPF (neg:GPF (match_operand:GPF 1 "register_operand" "w"))
(match_operand:GPF 2 "register_operand" "w")
(match_operand:GPF 3 "register_operand" "w")))]
"TARGET_FLOAT"
"fmsub\\t%<s>0, %<s>1, %<s>2, %<s>3"
[(set_attr "v8type" "fmadd")
(set_attr "mode" "<MODE>")]
)
(define_insn "fms<mode>4"
[(set (match_operand:GPF 0 "register_operand" "=w")
(fma:GPF (match_operand:GPF 1 "register_operand" "w")
(match_operand:GPF 2 "register_operand" "w")
(neg:GPF (match_operand:GPF 3 "register_operand" "w"))))]
"TARGET_FLOAT"
"fnmsub\\t%<s>0, %<s>1, %<s>2, %<s>3"
[(set_attr "v8type" "fmadd")
(set_attr "mode" "<MODE>")]
)
(define_insn "fnms<mode>4"
[(set (match_operand:GPF 0 "register_operand" "=w")
(fma:GPF (neg:GPF (match_operand:GPF 1 "register_operand" "w"))
(match_operand:GPF 2 "register_operand" "w")
(neg:GPF (match_operand:GPF 3 "register_operand" "w"))))]
"TARGET_FLOAT"
"fnmadd\\t%<s>0, %<s>1, %<s>2, %<s>3"
[(set_attr "v8type" "fmadd")
(set_attr "mode" "<MODE>")]
)
;; If signed zeros are ignored, -(a * b + c) = -a * b - c.
(define_insn "*fnmadd<mode>4"
[(set (match_operand:GPF 0 "register_operand" "=w")
(neg:GPF (fma:GPF (match_operand:GPF 1 "register_operand" "w")
(match_operand:GPF 2 "register_operand" "w")
(match_operand:GPF 3 "register_operand" "w"))))]
"!HONOR_SIGNED_ZEROS (<MODE>mode) && TARGET_FLOAT"
"fnmadd\\t%<s>0, %<s>1, %<s>2, %<s>3"
[(set_attr "v8type" "fmadd")
(set_attr "mode" "<MODE>")]
)
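The fused multiply-add patterns above pair GCC's canonical fma forms with the AArch64 mnemonics crosswise: fma(a,b,c) is FMADD, fma(-a,b,c) is FMSUB, fma(a,b,-c) is FNMSUB, and fma(-a,b,-c) is FNMADD; the final pattern folds neg(fma(a,b,c)) into FNMADD as well, which is only valid when signed zeros may be ignored. A Python sketch of the value each mnemonic computes (real-number semantics; the single rounding of the fused forms is not modelled):

```python
def fmadd(a, b, c):   # fma<mode>4
    return  a * b + c

def fmsub(a, b, c):   # fnma<mode>4: fma(-a, b, c)
    return -a * b + c

def fnmsub(a, b, c):  # fms<mode>4: fma(a, b, -c)
    return  a * b - c

def fnmadd(a, b, c):  # fnms<mode>4: fma(-a, b, -c), and also
    return -a * b - c # -(a*b + c) when signed zeros are ignored
```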
;; -------------------------------------------------------------------
;; Floating-point conversions
;; -------------------------------------------------------------------
(define_insn "extendsfdf2"
[(set (match_operand:DF 0 "register_operand" "=w")
(float_extend:DF (match_operand:SF 1 "register_operand" "w")))]
"TARGET_FLOAT"
"fcvt\\t%d0, %s1"
[(set_attr "v8type" "fcvt")
(set_attr "mode" "DF")
(set_attr "mode2" "SF")]
)
(define_insn "truncdfsf2"
[(set (match_operand:SF 0 "register_operand" "=w")
(float_truncate:SF (match_operand:DF 1 "register_operand" "w")))]
"TARGET_FLOAT"
"fcvt\\t%s0, %d1"
[(set_attr "v8type" "fcvt")
(set_attr "mode" "SF")
(set_attr "mode2" "DF")]
)
(define_insn "fix_trunc<GPF:mode><GPI:mode>2"
[(set (match_operand:GPI 0 "register_operand" "=r")
(fix:GPI (match_operand:GPF 1 "register_operand" "w")))]
"TARGET_FLOAT"
"fcvtzs\\t%<GPI:w>0, %<GPF:s>1"
[(set_attr "v8type" "fcvtf2i")
(set_attr "mode" "<GPF:MODE>")
(set_attr "mode2" "<GPI:MODE>")]
)
(define_insn "fixuns_trunc<GPF:mode><GPI:mode>2"
[(set (match_operand:GPI 0 "register_operand" "=r")
(unsigned_fix:GPI (match_operand:GPF 1 "register_operand" "w")))]
"TARGET_FLOAT"
"fcvtzu\\t%<GPI:w>0, %<GPF:s>1"
[(set_attr "v8type" "fcvtf2i")
(set_attr "mode" "<GPF:MODE>")
(set_attr "mode2" "<GPI:MODE>")]
)
(define_insn "float<GPI:mode><GPF:mode>2"
[(set (match_operand:GPF 0 "register_operand" "=w")
(float:GPF (match_operand:GPI 1 "register_operand" "r")))]
"TARGET_FLOAT"
"scvtf\\t%<GPF:s>0, %<GPI:w>1"
[(set_attr "v8type" "fcvti2f")
(set_attr "mode" "<GPF:MODE>")
(set_attr "mode2" "<GPI:MODE>")]
)
(define_insn "floatuns<GPI:mode><GPF:mode>2"
[(set (match_operand:GPF 0 "register_operand" "=w")
(unsigned_float:GPF (match_operand:GPI 1 "register_operand" "r")))]
"TARGET_FLOAT"
"ucvtf\\t%<GPF:s>0, %<GPI:w>1"
[(set_attr "v8type" "fcvt")
(set_attr "mode" "<GPF:MODE>")
(set_attr "mode2" "<GPI:MODE>")]
)
;; -------------------------------------------------------------------
;; Floating-point arithmetic
;; -------------------------------------------------------------------
(define_insn "add<mode>3"
[(set (match_operand:GPF 0 "register_operand" "=w")
(plus:GPF
(match_operand:GPF 1 "register_operand" "w")
(match_operand:GPF 2 "register_operand" "w")))]
"TARGET_FLOAT"
"fadd\\t%<s>0, %<s>1, %<s>2"
[(set_attr "v8type" "fadd")
(set_attr "mode" "<MODE>")]
)
(define_insn "sub<mode>3"
[(set (match_operand:GPF 0 "register_operand" "=w")
(minus:GPF
(match_operand:GPF 1 "register_operand" "w")
(match_operand:GPF 2 "register_operand" "w")))]
"TARGET_FLOAT"
"fsub\\t%<s>0, %<s>1, %<s>2"
[(set_attr "v8type" "fadd")
(set_attr "mode" "<MODE>")]
)
(define_insn "mul<mode>3"
[(set (match_operand:GPF 0 "register_operand" "=w")
(mult:GPF
(match_operand:GPF 1 "register_operand" "w")
(match_operand:GPF 2 "register_operand" "w")))]
"TARGET_FLOAT"
"fmul\\t%<s>0, %<s>1, %<s>2"
[(set_attr "v8type" "fmul")
(set_attr "mode" "<MODE>")]
)
(define_insn "*fnmul<mode>3"
[(set (match_operand:GPF 0 "register_operand" "=w")
(mult:GPF
(neg:GPF (match_operand:GPF 1 "register_operand" "w"))
(match_operand:GPF 2 "register_operand" "w")))]
"TARGET_FLOAT"
"fnmul\\t%<s>0, %<s>1, %<s>2"
[(set_attr "v8type" "fmul")
(set_attr "mode" "<MODE>")]
)
(define_insn "div<mode>3"
[(set (match_operand:GPF 0 "register_operand" "=w")
(div:GPF
(match_operand:GPF 1 "register_operand" "w")
(match_operand:GPF 2 "register_operand" "w")))]
"TARGET_FLOAT"
"fdiv\\t%<s>0, %<s>1, %<s>2"
[(set_attr "v8type" "fdiv")
(set_attr "mode" "<MODE>")]
)
(define_insn "neg<mode>2"
[(set (match_operand:GPF 0 "register_operand" "=w")
(neg:GPF (match_operand:GPF 1 "register_operand" "w")))]
"TARGET_FLOAT"
"fneg\\t%<s>0, %<s>1"
[(set_attr "v8type" "ffarith")
(set_attr "mode" "<MODE>")]
)
(define_insn "sqrt<mode>2"
[(set (match_operand:GPF 0 "register_operand" "=w")
(sqrt:GPF (match_operand:GPF 1 "register_operand" "w")))]
"TARGET_FLOAT"
"fsqrt\\t%<s>0, %<s>1"
[(set_attr "v8type" "fsqrt")
(set_attr "mode" "<MODE>")]
)
(define_insn "abs<mode>2"
[(set (match_operand:GPF 0 "register_operand" "=w")
(abs:GPF (match_operand:GPF 1 "register_operand" "w")))]
"TARGET_FLOAT"
"fabs\\t%<s>0, %<s>1"
[(set_attr "v8type" "ffarith")
(set_attr "mode" "<MODE>")]
)
;; Given that smax/smin do not specify the result when either input is NaN,
;; we could use either FMAXNM or FMAX for smax, and either FMINNM or FMIN
;; for smin.
(define_insn "smax<mode>3"
[(set (match_operand:GPF 0 "register_operand" "=w")
(smax:GPF (match_operand:GPF 1 "register_operand" "w")
(match_operand:GPF 2 "register_operand" "w")))]
"TARGET_FLOAT"
"fmaxnm\\t%<s>0, %<s>1, %<s>2"
[(set_attr "v8type" "fminmax")
(set_attr "mode" "<MODE>")]
)
(define_insn "smin<mode>3"
[(set (match_operand:GPF 0 "register_operand" "=w")
(smin:GPF (match_operand:GPF 1 "register_operand" "w")
(match_operand:GPF 2 "register_operand" "w")))]
"TARGET_FLOAT"
"fminnm\\t%<s>0, %<s>1, %<s>2"
[(set_attr "v8type" "fminmax")
(set_attr "mode" "<MODE>")]
)
;; -------------------------------------------------------------------
;; Reload support
;; -------------------------------------------------------------------
;; Reload SP+imm where imm cannot be handled by a single ADD instruction.
;; Must load imm into a scratch register and copy SP to the dest reg before
;; adding, since SP cannot be used as a source register in an ADD
;; instruction.
(define_expand "reload_sp_immediate"
[(parallel [(set (match_operand:DI 0 "register_operand" "=r")
(match_operand:DI 1 "" ""))
(clobber (match_operand:TI 2 "register_operand" "=&r"))])]
""
{
  rtx sp = XEXP (operands[1], 0);
  rtx val = XEXP (operands[1], 1);
  unsigned regno = REGNO (operands[2]);
  rtx scratch;

  gcc_assert (GET_CODE (operands[1]) == PLUS);
  gcc_assert (sp == stack_pointer_rtx);
  gcc_assert (CONST_INT_P (val));

  /* It is possible that one of the registers we got for operands[2]
     might coincide with that of operands[0] (which is why we made
     it TImode).  Pick the other one to use as our scratch.  */
  if (regno == REGNO (operands[0]))
    regno++;
  scratch = gen_rtx_REG (DImode, regno);

  emit_move_insn (scratch, val);
  emit_move_insn (operands[0], sp);
  emit_insn (gen_adddi3 (operands[0], operands[0], scratch));
  DONE;
}
)
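The scratch-selection logic in the expander above relies on the TImode clobber spanning two consecutive DImode registers, so at least one of the pair must differ from the destination. A trivial Python sketch of that choice:

```python
def pick_scratch(pair_lo, dest):
    """operands[2] occupies DI registers (pair_lo, pair_lo + 1);
    choose whichever one is not the destination register."""
    return pair_lo + 1 if pair_lo == dest else pair_lo
```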
(define_expand "aarch64_reload_mov<mode>"
[(set (match_operand:TX 0 "register_operand" "=w")
(match_operand:TX 1 "register_operand" "w"))
(clobber (match_operand:DI 2 "register_operand" "=&r"))
]
""
{
  rtx op0 = simplify_gen_subreg (TImode, operands[0], <MODE>mode, 0);
  rtx op1 = simplify_gen_subreg (TImode, operands[1], <MODE>mode, 0);

  /* The gen_* helpers only build the insns; they must be emitted.  */
  emit_insn (gen_aarch64_movtilow_tilow (op0, op1));
  emit_insn (gen_aarch64_movdi_tihigh (operands[2], op1));
  emit_insn (gen_aarch64_movtihigh_di (op0, operands[2]));
  DONE;
}
)
;; The following secondary reload helper patterns are invoked after or
;; during reload, as we don't want these patterns to start being used
;; by the combiner.
(define_insn "aarch64_movdi_tilow"
[(set (match_operand:DI 0 "register_operand" "=r")
(truncate:DI (match_operand:TI 1 "register_operand" "w")))]
"reload_completed || reload_in_progress"
"fmov\\t%x0, %d1"
[(set_attr "v8type" "fmovf2i")
(set_attr "mode" "DI")
(set_attr "length" "4")
])
(define_insn "aarch64_movdi_tihigh"
[(set (match_operand:DI 0 "register_operand" "=r")
(truncate:DI
(lshiftrt:TI (match_operand:TI 1 "register_operand" "w")
(const_int 64))))]
"reload_completed || reload_in_progress"
"fmov\\t%x0, %1.d[1]"
[(set_attr "v8type" "fmovf2i")
(set_attr "mode" "DI")
(set_attr "length" "4")
])
(define_insn "aarch64_movtihigh_di"
[(set (zero_extract:TI (match_operand:TI 0 "register_operand" "+w")
(const_int 64) (const_int 64))
(zero_extend:TI (match_operand:DI 1 "register_operand" "r")))]
"reload_completed || reload_in_progress"
"fmov\\t%0.d[1], %x1"
[(set_attr "v8type" "fmovi2f")
(set_attr "mode" "DI")
(set_attr "length" "4")
])
(define_insn "aarch64_movtilow_di"
[(set (match_operand:TI 0 "register_operand" "=w")
(zero_extend:TI (match_operand:DI 1 "register_operand" "r")))]
"reload_completed || reload_in_progress"
"fmov\\t%d0, %x1"
[(set_attr "v8type" "fmovi2f")
(set_attr "mode" "DI")
(set_attr "length" "4")
])
(define_insn "aarch64_movtilow_tilow"
[(set (match_operand:TI 0 "register_operand" "=w")
(zero_extend:TI
(truncate:DI (match_operand:TI 1 "register_operand" "w"))))]
"reload_completed || reload_in_progress"
"fmov\\t%d0, %d1"
[(set_attr "v8type" "fmovi2f")
(set_attr "mode" "DI")
(set_attr "length" "4")
])
;; The operands of the high and lo_sum used for the ADRP and ADD
;; instructions deliberately have no modes.  This allows high and
;; lo_sum to be used with the labels defining the jump tables in the
;; rodata section.
(define_insn "add_losym"
[(set (match_operand:DI 0 "register_operand" "=r")
(lo_sum:DI (match_operand:DI 1 "register_operand" "r")
(match_operand 2 "aarch64_valid_symref" "S")))]
""
"add\\t%0, %1, :lo12:%a2"
[(set_attr "v8type" "alu")
(set_attr "mode" "DI")]
)
(define_insn "ldr_got_small"
[(set (match_operand:DI 0 "register_operand" "=r")
(unspec:DI [(mem:DI (lo_sum:DI
(match_operand:DI 1 "register_operand" "r")
(match_operand:DI 2 "aarch64_valid_symref" "S")))]
UNSPEC_GOTSMALLPIC))]
""
"ldr\\t%0, [%1, #:got_lo12:%a2]"
[(set_attr "v8type" "load1")
(set_attr "mode" "DI")]
)
(define_insn "aarch64_load_tp_hard"
[(set (match_operand:DI 0 "register_operand" "=r")
(unspec:DI [(const_int 0)] UNSPEC_TLS))]
""
"mrs\\t%0, tpidr_el0"
[(set_attr "v8type" "mrs")
(set_attr "mode" "DI")]
)
;; The TLS ABI specifically requires that the compiler does not schedule
;; instructions in the TLS stubs, in order to enable linker relaxation.
;; Therefore we treat the stubs as an atomic sequence.
(define_expand "tlsgd_small"
[(parallel [(set (match_operand 0 "register_operand" "")
(call (mem:DI (match_dup 2)) (const_int 1)))
(unspec:DI [(match_operand:DI 1 "aarch64_valid_symref" "")] UNSPEC_GOTSMALLTLS)
(clobber (reg:DI LR_REGNUM))])]
""
{
operands[2] = aarch64_tls_get_addr ();
})
(define_insn "*tlsgd_small"
[(set (match_operand 0 "register_operand" "")
(call (mem:DI (match_operand:DI 2 "" "")) (const_int 1)))
(unspec:DI [(match_operand:DI 1 "aarch64_valid_symref" "S")] UNSPEC_GOTSMALLTLS)
(clobber (reg:DI LR_REGNUM))
]
""
"adrp\\tx0, %A1\;add\\tx0, x0, %L1\;bl\\t%2\;nop"
[(set_attr "v8type" "call")
(set_attr "length" "16")])
(define_insn "tlsie_small"
[(set (match_operand:DI 0 "register_operand" "=r")
(unspec:DI [(match_operand:DI 1 "aarch64_tls_ie_symref" "S")]
UNSPEC_GOTSMALLTLS))]
""
"adrp\\t%0, %A1\;ldr\\t%0, [%0, #%L1]"
[(set_attr "v8type" "load1")
(set_attr "mode" "DI")
(set_attr "length" "8")]
)
(define_insn "tlsle_small"
[(set (match_operand:DI 0 "register_operand" "=r")
(unspec:DI [(match_operand:DI 1 "register_operand" "r")
(match_operand:DI 2 "aarch64_tls_le_symref" "S")]
UNSPEC_GOTSMALLTLS))]
""
"add\\t%0, %1, #%G2\;add\\t%0, %0, #%L2"
[(set_attr "v8type" "alu")
(set_attr "mode" "DI")
(set_attr "length" "8")]
)
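tlsle_small materializes a local-exec TLS offset with two 12-bit immediate ADDs. Assuming %G prints the offset's bits 23:12 as a shifted immediate and %L its low 12 bits (the modifier semantics are an assumption here, not stated in this file), the split looks like:

```python
def split_tls_offset(off):
    """Split a 24-bit TLS offset into the two add immediates
    (hypothetical %G / %L behaviour; the real modifiers live in the
    port's print_operand)."""
    assert 0 <= off < (1 << 24)
    hi = off & 0xFFF000   # added first, as a 12-bit immediate shifted by 12
    lo = off & 0x000FFF   # added second, as a plain 12-bit immediate
    return hi, lo
```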
(define_insn "tlsdesc_small"
[(set (reg:DI R0_REGNUM)
(unspec:DI [(match_operand:DI 0 "aarch64_valid_symref" "S")]
UNSPEC_TLSDESC))
(clobber (reg:DI LR_REGNUM))
(clobber (match_scratch:DI 1 "=r"))]
"TARGET_TLS_DESC"
"adrp\\tx0, %A0\;ldr\\t%1, [x0, #%L0]\;add\\tx0, x0, %L0\;.tlsdesccall\\t%0\;blr\\t%1"
[(set_attr "v8type" "call")
(set_attr "length" "16")])
(define_insn "stack_tie"
[(set (mem:BLK (scratch))
(unspec:BLK [(match_operand:DI 0 "register_operand" "rk")
(match_operand:DI 1 "register_operand" "rk")]
UNSPEC_PRLG_STK))]
""
""
[(set_attr "length" "0")]
)
;; AdvSIMD Stuff
(include "aarch64-simd.md")
;; Synchronization Builtins
(include "sync.md")
; Machine description for AArch64 architecture.
; Copyright (C) 2009, 2010, 2011, 2012 Free Software Foundation, Inc.
; Contributed by ARM Ltd.
;
; This file is part of GCC.
;
; GCC is free software; you can redistribute it and/or modify it
; under the terms of the GNU General Public License as published by
; the Free Software Foundation; either version 3, or (at your option)
; any later version.
;
; GCC is distributed in the hope that it will be useful, but
; WITHOUT ANY WARRANTY; without even the implied warranty of
; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
; General Public License for more details.
;
; You should have received a copy of the GNU General Public License
; along with GCC; see the file COPYING3. If not see
; <http://www.gnu.org/licenses/>.
HeaderInclude
config/aarch64/aarch64-opts.h
; The TLS dialect names to use with -mtls-dialect.
Enum
Name(tls_type) Type(enum aarch64_tls_type)
The possible TLS dialects:
EnumValue
Enum(tls_type) String(trad) Value(TLS_TRADITIONAL)
EnumValue
Enum(tls_type) String(desc) Value(TLS_DESCRIPTORS)
; The code model option names for -mcmodel.
Enum
Name(cmodel) Type(enum aarch64_code_model)
The code model option names for -mcmodel:
EnumValue
Enum(cmodel) String(tiny) Value(AARCH64_CMODEL_TINY)
EnumValue
Enum(cmodel) String(small) Value(AARCH64_CMODEL_SMALL)
EnumValue
Enum(cmodel) String(large) Value(AARCH64_CMODEL_LARGE)
; The cpu/arch option names to use in cpu/arch selection.
Variable
const char *aarch64_arch_string
Variable
const char *aarch64_cpu_string
Variable
const char *aarch64_tune_string
mbig-endian
Target Report RejectNegative Mask(BIG_END)
Assume target CPU is configured as big endian
mgeneral-regs-only
Target Report RejectNegative Mask(GENERAL_REGS_ONLY)
Generate code which uses only the general registers
mlittle-endian
Target Report RejectNegative InverseMask(BIG_END)
Assume target CPU is configured as little endian
mcmodel=
Target RejectNegative Joined Enum(cmodel) Var(aarch64_cmodel_var) Init(AARCH64_CMODEL_SMALL)
Specify the code model
mstrict-align
Target Report RejectNegative Mask(STRICT_ALIGN)
Don't assume that unaligned accesses are handled by the system
momit-leaf-frame-pointer
Target Report Save Var(flag_omit_leaf_frame_pointer) Init(1)
Omit the frame pointer in leaf functions
mtls-dialect=
Target RejectNegative Joined Enum(tls_type) Var(aarch64_tls_dialect) Init(TLS_DESCRIPTORS)
Specify TLS dialect
march=
Target RejectNegative Joined Var(aarch64_arch_string)
-march=ARCH Use features of architecture ARCH
mcpu=
Target RejectNegative Joined Var(aarch64_cpu_string)
-mcpu=CPU Use features of and optimize for CPU
mtune=
Target RejectNegative Joined Var(aarch64_tune_string)
-mtune=CPU Optimize for CPU
;; Machine description for AArch64 architecture.
;; Copyright (C) 2009, 2010, 2011, 2012 Free Software Foundation, Inc.
;; Contributed by ARM Ltd.
;;
;; This file is part of GCC.
;;
;; GCC is free software; you can redistribute it and/or modify it
;; under the terms of the GNU General Public License as published by
;; the Free Software Foundation; either version 3, or (at your option)
;; any later version.
;;
;; GCC is distributed in the hope that it will be useful, but
;; WITHOUT ANY WARRANTY; without even the implied warranty of
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
;; General Public License for more details.
;;
;; You should have received a copy of the GNU General Public License
;; along with GCC; see the file COPYING3. If not see
;; <http://www.gnu.org/licenses/>.
(define_register_constraint "k" "STACK_REG"
"@internal The stack register.")
(define_register_constraint "w" "FP_REGS"
"Floating point and SIMD vector registers.")
(define_register_constraint "x" "FP_LO_REGS"
"Floating point and SIMD vector registers V0 - V15.")
(define_constraint "I"
"A constant that can be used with an ADD operation."
(and (match_code "const_int")
(match_test "aarch64_uimm12_shift (ival)")))
(define_constraint "J"
"A constant that can be used with a SUB operation (once negated)."
(and (match_code "const_int")
(match_test "aarch64_uimm12_shift (-ival)")))
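Constraints I and J above both reduce to aarch64_uimm12_shift, a 12-bit unsigned immediate optionally shifted left by 12; J simply tests the negated constant so that x - c can be emitted as SUB with an immediate. A Python model of that predicate (the body of aarch64_uimm12_shift is an assumption here, since the function is defined elsewhere in the port):

```python
def uimm12_shift(val):
    """Sketch of aarch64_uimm12_shift: a 12-bit unsigned immediate,
    optionally shifted left by 12 bits."""
    return val == (val & 0xFFF) or val == (val & 0xFFF000)

def const_ok_for_add(val):
    # Constraint I: usable directly by an ADD immediate.
    return uimm12_shift(val)

def const_ok_for_sub(val):
    # Constraint J: usable by SUB once negated.
    return uimm12_shift(-val)
```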
;; We can't use the mode of a CONST_INT to determine the context in
;; which it is being used, so we must have a separate constraint for
;; each context.
(define_constraint "K"
"A constant that can be used with a 32-bit logical operation."
(and (match_code "const_int")
(match_test "aarch64_bitmask_imm (ival, SImode)")))
(define_constraint "L"
"A constant that can be used with a 64-bit logical operation."
(and (match_code "const_int")
(match_test "aarch64_bitmask_imm (ival, DImode)")))
(define_constraint "M"
"A constant that can be used with a 32-bit MOV immediate operation."
(and (match_code "const_int")
(match_test "aarch64_move_imm (ival, SImode)")))
(define_constraint "N"
"A constant that can be used with a 64-bit MOV immediate operation."
(and (match_code "const_int")
(match_test "aarch64_move_imm (ival, DImode)")))
(define_constraint "S"
"A constraint that matches an absolute symbolic address."
(and (match_code "const,symbol_ref,label_ref")
(match_test "aarch64_symbolic_address_p (op)")))
(define_constraint "Y"
"Floating point constant zero."
(and (match_code "const_double")
(match_test "aarch64_const_double_zero_rtx_p (op)")))
(define_constraint "Z"
"Integer constant zero."
(match_test "op == const0_rtx"))
(define_constraint "Usa"
"A constraint that matches an absolute symbolic address."
(and (match_code "const,symbol_ref")
(match_test "aarch64_symbolic_address_p (op)")))
(define_constraint "Ush"
"A constraint that matches an absolute symbolic address high part."
(and (match_code "high")
(match_test "aarch64_valid_symref (XEXP (op, 0), GET_MODE (XEXP (op, 0)))")))
(define_constraint "Uss"
"@internal
A constraint that matches an immediate shift constant in SImode."
(and (match_code "const_int")
(match_test "(unsigned HOST_WIDE_INT) ival < 32")))
(define_constraint "Usd"
"@internal
A constraint that matches an immediate shift constant in DImode."
(and (match_code "const_int")
(match_test "(unsigned HOST_WIDE_INT) ival < 64")))
(define_constraint "UsM"
"@internal
A constraint that matches the immediate constant -1."
(match_test "op == constm1_rtx"))
(define_constraint "Ui3"
"@internal
A constraint that matches the integers 0...4."
(and (match_code "const_int")
(match_test "(unsigned HOST_WIDE_INT) ival <= 4")))
(define_constraint "Up3"
"@internal
A constraint that matches the integers 2^(0...4)."
(and (match_code "const_int")
(match_test "(unsigned) exact_log2 (ival) <= 4")))
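Up3 accepts exactly the powers of two 2^0 through 2^4: the unsigned cast makes exact_log2's -1 result (for non-powers of two) wrap to a huge value and fail the comparison. A Python sketch, with exact_log2 modelled after GCC's helper:

```python
def exact_log2(v):
    """Return log2(v) if v is a positive power of two, else -1
    (models GCC's exact_log2)."""
    if v > 0 and v & (v - 1) == 0:
        return v.bit_length() - 1
    return -1

# The unsigned cast in the constraint turns -1 into a huge value,
# so 0 <= exact_log2(i) <= 4 is the equivalent signed check.
up3 = [i for i in range(1, 40) if 0 <= exact_log2(i) <= 4]
```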
(define_memory_constraint "Q"
"A memory address which uses a single base register with no offset."
(and (match_code "mem")
(match_test "REG_P (XEXP (op, 0))")))
(define_memory_constraint "Ump"
"@internal
A memory address suitable for a load/store pair operation."
(and (match_code "mem")
(match_test "aarch64_legitimate_address_p (GET_MODE (op), XEXP (op, 0),
PARALLEL, 1)")))
(define_memory_constraint "Utv"
"@internal
An address valid for loading/storing opaque structure
types wider than TImode."
(and (match_code "mem")
(match_test "aarch64_simd_mem_operand_p (op)")))
(define_constraint "Dn"
"@internal
A constraint that matches vector of immediates."
(and (match_code "const_vector")
(match_test "aarch64_simd_immediate_valid_for_move (op, GET_MODE (op),
NULL, NULL, NULL,
NULL, NULL) != 0")))
(define_constraint "Dl"
"@internal
A constraint that matches vector of immediates for left shifts."
(and (match_code "const_vector")
(match_test "aarch64_simd_shift_imm_p (op, GET_MODE (op),
true)")))
(define_constraint "Dr"
"@internal
A constraint that matches vector of immediates for right shifts."
(and (match_code "const_vector")
(match_test "aarch64_simd_shift_imm_p (op, GET_MODE (op),
false)")))
(define_constraint "Dz"
"@internal
A constraint that matches vector of immediate zero."
(and (match_code "const_vector")
(match_test "aarch64_simd_imm_zero_p (op, GET_MODE (op))")))
(define_constraint "Dd"
"@internal
A constraint that matches an immediate operand valid for AdvSIMD scalar."
(and (match_code "const_int")
(match_test "aarch64_simd_imm_scalar_p (op, GET_MODE (op))")))
#!/bin/sh
#
# Copyright (C) 2011, 2012 Free Software Foundation, Inc.
# Contributed by ARM Ltd.
#
# This file is part of GCC.
#
# GCC is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 3, or (at your option)
# any later version.
#
# GCC is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with GCC; see the file COPYING3. If not see
# <http://www.gnu.org/licenses/>.
# Generate aarch64-tune.md, a file containing the tune attribute from the list of
# CPUs in aarch64-cores.def
echo ";; -*- buffer-read-only: t -*-"
echo ";; Generated automatically by gentune.sh from aarch64-cores.def"
allcores=`awk -F'[(, ]+' '/^AARCH64_CORE/ { cores = cores$3"," } END { print cores } ' $1`
echo "(define_attr \"tune\""
echo " \"$allcores\"" | sed -e 's/,"$/"/'
echo " (const (symbol_ref \"((enum attr_tune) aarch64_tune)\")))"
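The awk/sed pipeline above pulls the third AARCH64_CORE field from each line and joins the core names with commas. A Python re-implementation of the extraction step (a sketch for illustration; the AARCH64_CORE argument layout in the toy input is assumed):

```python
import re

def gen_tune(cores_def):
    """Collect the core identifiers (third AARCH64_CORE argument) and
    join them with commas, mirroring the awk field extraction."""
    cores = []
    for line in cores_def.splitlines():
        if line.startswith("AARCH64_CORE"):
            # awk -F'[(, ]+' splits on runs of '(', ',' and space;
            # $3 is index 2 after the split.
            fields = re.split(r"[(, ]+", line)
            cores.append(fields[2])
    return ",".join(cores)

# Toy input; the real aarch64-cores.def carries more arguments.
sample = ('AARCH64_CORE("cortex-a53", cortexa53, 8A, ...)\n'
          'AARCH64_CORE("cortex-a57", cortexa57, 8A, ...)')
```

The joined string becomes the value list of the generated "tune" attribute.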
;; Machine description for AArch64 architecture.
;; Copyright (C) 2009, 2010, 2011, 2012 Free Software Foundation, Inc.
;; Contributed by ARM Ltd.
;;
;; This file is part of GCC.
;;
;; GCC is free software; you can redistribute it and/or modify it
;; under the terms of the GNU General Public License as published by
;; the Free Software Foundation; either version 3, or (at your option)
;; any later version.
;;
;; GCC is distributed in the hope that it will be useful, but
;; WITHOUT ANY WARRANTY; without even the implied warranty of
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
;; General Public License for more details.
;;
;; You should have received a copy of the GNU General Public License
;; along with GCC; see the file COPYING3. If not see
;; <http://www.gnu.org/licenses/>.
;; -------------------------------------------------------------------
;; Mode Iterators
;; -------------------------------------------------------------------
;; Iterator for General Purpose Integer registers (32- and 64-bit modes)
(define_mode_iterator GPI [SI DI])
;; Iterator for QI and HI modes
(define_mode_iterator SHORT [QI HI])
;; Iterator for all integer modes (up to 64-bit)
(define_mode_iterator ALLI [QI HI SI DI])
;; Iterator for scalar modes (up to 64-bit)
(define_mode_iterator SDQ_I [QI HI SI DI])
;; Iterator for all integer modes that can be extended (up to 64-bit)
(define_mode_iterator ALLX [QI HI SI])
;; Iterator for General Purpose Floating-point registers (32- and 64-bit modes)
(define_mode_iterator GPF [SF DF])
;; Integer vector modes.
(define_mode_iterator VDQ [V8QI V16QI V4HI V8HI V2SI V4SI V2DI])
;; Integer vector modes.
(define_mode_iterator VDQ_I [V8QI V16QI V4HI V8HI V2SI V4SI V2DI])
;; vector and scalar, 64 & 128-bit container, all integer modes
(define_mode_iterator VSDQ_I [V8QI V16QI V4HI V8HI V2SI V4SI V2DI QI HI SI DI])
;; vector and scalar, 64 & 128-bit container: all vector integer modes;
;; 64-bit scalar integer mode
(define_mode_iterator VSDQ_I_DI [V8QI V16QI V4HI V8HI V2SI V4SI V2DI DI])
;; Double vector modes.
(define_mode_iterator VD [V8QI V4HI V2SI V2SF])
;; vector, 64-bit container, all integer modes
(define_mode_iterator VD_BHSI [V8QI V4HI V2SI])
;; 128 and 64-bit container; 8, 16, 32-bit vector integer modes
(define_mode_iterator VDQ_BHSI [V8QI V16QI V4HI V8HI V2SI V4SI])
;; Quad vector modes.
(define_mode_iterator VQ [V16QI V8HI V4SI V2DI V4SF V2DF])
;; All vector modes, except double.
(define_mode_iterator VQ_S [V8QI V16QI V4HI V8HI V2SI V4SI])
;; Vector and scalar, 64 & 128-bit container: all vector integer modes;
;; 8, 16, 32-bit scalar integer modes
(define_mode_iterator VSDQ_I_BHSI [V8QI V16QI V4HI V8HI V2SI V4SI V2DI QI HI SI])
;; Vector modes for moves.
(define_mode_iterator VDQM [V8QI V16QI V4HI V8HI V2SI V4SI])
;; This mode iterator allows :PTR to be used for patterns that operate on
;; pointer-sized quantities. Exactly one of the two alternatives will match.
(define_mode_iterator PTR [(SI "Pmode == SImode") (DI "Pmode == DImode")])
;; Vector Float modes.
(define_mode_iterator VDQF [V2SF V4SF V2DF])
;; Vector Float modes with 2 elements.
(define_mode_iterator V2F [V2SF V2DF])
;; All modes.
(define_mode_iterator VALL [V8QI V16QI V4HI V8HI V2SI V4SI V2DI V2SF V4SF V2DF])
;; Vector modes for Integer reduction across lanes.
(define_mode_iterator VDQV [V8QI V16QI V4HI V8HI V4SI])
;; All double integer narrow-able modes.
(define_mode_iterator VDN [V4HI V2SI DI])
;; All quad integer narrow-able modes.
(define_mode_iterator VQN [V8HI V4SI V2DI])
;; All double integer widen-able modes.
(define_mode_iterator VDW [V8QI V4HI V2SI])
;; Vector and scalar 128-bit container: narrowable 16, 32, 64-bit integer modes
(define_mode_iterator VSQN_HSDI [V8HI V4SI V2DI HI SI DI])
;; All quad integer widen-able modes.
(define_mode_iterator VQW [V16QI V8HI V4SI])
;; Double vector modes for combines.
(define_mode_iterator VDC [V8QI V4HI V2SI V2SF DI DF])
;; Double vector integer modes for combines.
(define_mode_iterator VDIC [V8QI V4HI V2SI])
;; Double vector modes.
(define_mode_iterator VD_RE [V8QI V4HI V2SI DI DF V2SF])
;; Vector modes except double int.
(define_mode_iterator VDQIF [V8QI V16QI V4HI V8HI V2SI V4SI V2SF V4SF V2DF])
;; Vector modes for H and S types.
(define_mode_iterator VDQHS [V4HI V8HI V2SI V4SI])
;; Vector and scalar integer modes for H and S
(define_mode_iterator VSDQ_HSI [V4HI V8HI V2SI V4SI HI SI])
;; Vector and scalar 64-bit container: 16, 32-bit integer modes
(define_mode_iterator VSD_HSI [V4HI V2SI HI SI])
;; Vector 64-bit container: 16, 32-bit integer modes
(define_mode_iterator VD_HSI [V4HI V2SI])
;; Scalar 64-bit container: 16, 32-bit integer modes
(define_mode_iterator SD_HSI [HI SI])
;; Vector 128-bit container: 16, 32-bit integer modes
(define_mode_iterator VQ_HSI [V8HI V4SI])
;; All byte modes.
(define_mode_iterator VB [V8QI V16QI])
(define_mode_iterator TX [TI TF])
;; Opaque structure modes.
(define_mode_iterator VSTRUCT [OI CI XI])
;; Double scalar modes
(define_mode_iterator DX [DI DF])
;; ------------------------------------------------------------------
;; Unspec enumerations for Advanced SIMD.  These could well go into
;; aarch64.md but for their use in the int iterators defined here.
;; ------------------------------------------------------------------
(define_c_enum "unspec"
[
UNSPEC_ASHIFT_SIGNED ; Used in aarch64-simd.md.
UNSPEC_ASHIFT_UNSIGNED ; Used in aarch64-simd.md.
UNSPEC_FMAXV ; Used in aarch64-simd.md.
UNSPEC_FMINV ; Used in aarch64-simd.md.
UNSPEC_FADDV ; Used in aarch64-simd.md.
UNSPEC_ADDV ; Used in aarch64-simd.md.
UNSPEC_SMAXV ; Used in aarch64-simd.md.
UNSPEC_SMINV ; Used in aarch64-simd.md.
UNSPEC_UMAXV ; Used in aarch64-simd.md.
UNSPEC_UMINV ; Used in aarch64-simd.md.
UNSPEC_SHADD ; Used in aarch64-simd.md.
UNSPEC_UHADD ; Used in aarch64-simd.md.
UNSPEC_SRHADD ; Used in aarch64-simd.md.
UNSPEC_URHADD ; Used in aarch64-simd.md.
UNSPEC_SHSUB ; Used in aarch64-simd.md.
UNSPEC_UHSUB ; Used in aarch64-simd.md.
UNSPEC_SRHSUB ; Used in aarch64-simd.md.
UNSPEC_URHSUB ; Used in aarch64-simd.md.
UNSPEC_ADDHN ; Used in aarch64-simd.md.
UNSPEC_RADDHN ; Used in aarch64-simd.md.
UNSPEC_SUBHN ; Used in aarch64-simd.md.
UNSPEC_RSUBHN ; Used in aarch64-simd.md.
UNSPEC_ADDHN2 ; Used in aarch64-simd.md.
UNSPEC_RADDHN2 ; Used in aarch64-simd.md.
UNSPEC_SUBHN2 ; Used in aarch64-simd.md.
UNSPEC_RSUBHN2 ; Used in aarch64-simd.md.
UNSPEC_SQDMULH ; Used in aarch64-simd.md.
UNSPEC_SQRDMULH ; Used in aarch64-simd.md.
UNSPEC_PMUL ; Used in aarch64-simd.md.
UNSPEC_USQADD ; Used in aarch64-simd.md.
UNSPEC_SUQADD ; Used in aarch64-simd.md.
UNSPEC_SQXTUN ; Used in aarch64-simd.md.
UNSPEC_SQXTN ; Used in aarch64-simd.md.
UNSPEC_UQXTN ; Used in aarch64-simd.md.
UNSPEC_SSRA ; Used in aarch64-simd.md.
UNSPEC_USRA ; Used in aarch64-simd.md.
UNSPEC_SRSRA ; Used in aarch64-simd.md.
UNSPEC_URSRA ; Used in aarch64-simd.md.
UNSPEC_SRSHR ; Used in aarch64-simd.md.
UNSPEC_URSHR ; Used in aarch64-simd.md.
UNSPEC_SQSHLU ; Used in aarch64-simd.md.
UNSPEC_SQSHL ; Used in aarch64-simd.md.
UNSPEC_UQSHL ; Used in aarch64-simd.md.
UNSPEC_SQSHRUN ; Used in aarch64-simd.md.
UNSPEC_SQRSHRUN ; Used in aarch64-simd.md.
UNSPEC_SQSHRN ; Used in aarch64-simd.md.
UNSPEC_UQSHRN ; Used in aarch64-simd.md.
UNSPEC_SQRSHRN ; Used in aarch64-simd.md.
UNSPEC_UQRSHRN ; Used in aarch64-simd.md.
UNSPEC_SSHL ; Used in aarch64-simd.md.
UNSPEC_USHL ; Used in aarch64-simd.md.
UNSPEC_SRSHL ; Used in aarch64-simd.md.
UNSPEC_URSHL ; Used in aarch64-simd.md.
UNSPEC_SQRSHL ; Used in aarch64-simd.md.
UNSPEC_UQRSHL ; Used in aarch64-simd.md.
UNSPEC_CMEQ ; Used in aarch64-simd.md.
UNSPEC_CMLE ; Used in aarch64-simd.md.
UNSPEC_CMLT ; Used in aarch64-simd.md.
UNSPEC_CMGE ; Used in aarch64-simd.md.
UNSPEC_CMGT ; Used in aarch64-simd.md.
UNSPEC_CMHS ; Used in aarch64-simd.md.
UNSPEC_CMHI ; Used in aarch64-simd.md.
UNSPEC_SSLI ; Used in aarch64-simd.md.
UNSPEC_USLI ; Used in aarch64-simd.md.
UNSPEC_SSRI ; Used in aarch64-simd.md.
UNSPEC_USRI ; Used in aarch64-simd.md.
UNSPEC_SSHLL ; Used in aarch64-simd.md.
UNSPEC_USHLL ; Used in aarch64-simd.md.
UNSPEC_ADDP ; Used in aarch64-simd.md.
UNSPEC_CMTST ; Used in aarch64-simd.md.
UNSPEC_FMAX ; Used in aarch64-simd.md.
UNSPEC_FMIN ; Used in aarch64-simd.md.
])
;; -------------------------------------------------------------------
;; Mode attributes
;; -------------------------------------------------------------------
;; In GPI templates, a string like "%<w>0" will expand to "%w0" in the
;; 32-bit version and "%x0" in the 64-bit version.
(define_mode_attr w [(QI "w") (HI "w") (SI "w") (DI "x") (SF "s") (DF "d")])
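;; For example, a pattern iterated over a 32/64-bit integer mode iterator
;; (such as a GPI [SI DI] iterator, assumed here for illustration) can use
;; the <w> attribute in its output template.  A hypothetical sketch:
;;   (define_insn "*example_neg<mode>2"
;;     [(set (match_operand:GPI 0 "register_operand" "=r")
;;           (neg:GPI (match_operand:GPI 1 "register_operand" "r")))]
;;     ""
;;     "neg\\t%<w>0, %<w>1")
;; The SImode instance then prints its operands with "w" register names
;; and the DImode instance with "x" register names.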
;; For scalar usage of vector/FP registers
(define_mode_attr v [(QI "b") (HI "h") (SI "s") (DI "d")
(V8QI "") (V16QI "")
(V4HI "") (V8HI "")
(V2SI "") (V4SI "")
(V2DI "") (V2SF "")
(V4SF "") (V2DF "")])
;; For scalar usage of vector/FP registers, narrowing
(define_mode_attr vn2 [(QI "") (HI "b") (SI "h") (DI "s")
(V8QI "") (V16QI "")
(V4HI "") (V8HI "")
(V2SI "") (V4SI "")
(V2DI "") (V2SF "")
(V4SF "") (V2DF "")])
;; For scalar usage of vector/FP registers, widening
(define_mode_attr vw2 [(DI "") (QI "h") (HI "s") (SI "d")
(V8QI "") (V16QI "")
(V4HI "") (V8HI "")
(V2SI "") (V4SI "")
(V2DI "") (V2SF "")
(V4SF "") (V2DF "")])
;; Map a floating point mode to the appropriate register name prefix
(define_mode_attr s [(SF "s") (DF "d")])
;; Give the length suffix letter for a sign- or zero-extension.
(define_mode_attr size [(QI "b") (HI "h") (SI "w")])
;; Give the number of bits in the mode
(define_mode_attr sizen [(QI "8") (HI "16") (SI "32") (DI "64")])
;; Give the ordinal of the MSB in the mode
(define_mode_attr sizem1 [(QI "#7") (HI "#15") (SI "#31") (DI "#63")])
;; Attribute to describe constants acceptable in logical operations
(define_mode_attr lconst [(SI "K") (DI "L")])
;; Map a mode to a specific constraint character.
(define_mode_attr cmode [(QI "q") (HI "h") (SI "s") (DI "d")])
(define_mode_attr Vtype [(V8QI "8b") (V16QI "16b")
(V4HI "4h") (V8HI "8h")
(V2SI "2s") (V4SI "4s")
(DI "1d") (DF "1d")
(V2DI "2d") (V2SF "2s")
(V4SF "4s") (V2DF "2d")])
(define_mode_attr Vmtype [(V8QI ".8b") (V16QI ".16b")
(V4HI ".4h") (V8HI ".8h")
(V2SI ".2s") (V4SI ".4s")
(V2DI ".2d") (V2SF ".2s")
(V4SF ".4s") (V2DF ".2d")
(DI "") (SI "")
(HI "") (QI "")
(TI "")])
;; Register suffix narrowed modes for VQN.
(define_mode_attr Vmntype [(V8HI ".8b") (V4SI ".4h")
(V2DI ".2s")
(DI "") (SI "")
(HI "")])
;; Mode-to-individual element type mapping.
(define_mode_attr Vetype [(V8QI "b") (V16QI "b")
(V4HI "h") (V8HI "h")
(V2SI "s") (V4SI "s")
(V2DI "d") (V2SF "s")
(V4SF "s") (V2DF "d")
(QI "b") (HI "h")
(SI "s") (DI "d")])
;; Mode-to-bitwise operation type mapping.
(define_mode_attr Vbtype [(V8QI "8b") (V16QI "16b")
(V4HI "8b") (V8HI "16b")
(V2SI "8b") (V4SI "16b")
(V2DI "16b") (V2SF "8b")
(V4SF "16b") (V2DF "16b")])
;; Define element mode for each vector mode.
(define_mode_attr VEL [(V8QI "QI") (V16QI "QI")
(V4HI "HI") (V8HI "HI")
(V2SI "SI") (V4SI "SI")
(DI "DI") (V2DI "DI")
(V2SF "SF") (V4SF "SF")
(V2DF "DF")
(SI "SI") (HI "HI")
(QI "QI")])
;; Define container mode for lane selection.
(define_mode_attr VCON [(V8QI "V16QI") (V16QI "V16QI")
(V4HI "V8HI") (V8HI "V8HI")
(V2SI "V4SI") (V4SI "V4SI")
(DI "V2DI") (V2DI "V2DI")
(V2SF "V2SF") (V4SF "V4SF")
(V2DF "V2DF") (SI "V4SI")
(HI "V8HI") (QI "V16QI")])
;; Half modes of all vector modes.
(define_mode_attr VHALF [(V8QI "V4QI") (V16QI "V8QI")
(V4HI "V2HI") (V8HI "V4HI")
(V2SI "SI") (V4SI "V2SI")
(V2DI "DI") (V2SF "SF")
(V4SF "V2SF") (V2DF "DF")])
;; Double modes of vector modes.
(define_mode_attr VDBL [(V8QI "V16QI") (V4HI "V8HI")
(V2SI "V4SI") (V2SF "V4SF")
(SI "V2SI") (DI "V2DI")
(DF "V2DF")])
;; Double modes of vector modes (lower case).
(define_mode_attr Vdbl [(V8QI "v16qi") (V4HI "v8hi")
(V2SI "v4si") (V2SF "v4sf")
(SI "v2si") (DI "v2di")])
;; Narrowed modes for VDN.
(define_mode_attr VNARROWD [(V4HI "V8QI") (V2SI "V4HI")
(DI "V2SI")])
;; Narrowed double-modes for VQN (Used for XTN).
(define_mode_attr VNARROWQ [(V8HI "V8QI") (V4SI "V4HI")
(V2DI "V2SI")
(DI "SI") (SI "HI")
(HI "QI")])
;; Narrowed quad-modes for VQN (Used for XTN2).
(define_mode_attr VNARROWQ2 [(V8HI "V16QI") (V4SI "V8HI")
(V2DI "V4SI")])
;; Register suffix narrowed modes for VQN.
(define_mode_attr Vntype [(V8HI "8b") (V4SI "4h")
(V2DI "2s")])
;; Register suffix narrowed quad-modes for VQN (used for XTN2).
(define_mode_attr V2ntype [(V8HI "16b") (V4SI "8h")
(V2DI "4s")])
;; Widened modes of vector modes.
(define_mode_attr VWIDE [(V8QI "V8HI") (V4HI "V4SI")
(V2SI "V2DI") (V16QI "V8HI")
(V8HI "V4SI") (V4SI "V2DI")
(HI "SI") (SI "DI")]
)
;; Widened mode register suffixes for VDW/VQW.
(define_mode_attr Vwtype [(V8QI "8h") (V4HI "4s")
(V2SI "2d") (V16QI "8h")
(V8HI "4s") (V4SI "2d")])
;; Widened mode register suffixes for VDW/VQW.
(define_mode_attr Vmwtype [(V8QI ".8h") (V4HI ".4s")
(V2SI ".2d") (V16QI ".8h")
(V8HI ".4s") (V4SI ".2d")
(SI "") (HI "")])
;; Lower part register suffixes for VQW.
(define_mode_attr Vhalftype [(V16QI "8b") (V8HI "4h")
(V4SI "2s")])
;; Define corresponding core/FP element mode for each vector mode.
(define_mode_attr vw [(V8QI "w") (V16QI "w")
(V4HI "w") (V8HI "w")
(V2SI "w") (V4SI "w")
(DI "x") (V2DI "x")
(V2SF "s") (V4SF "s")
(V2DF "d")])
;; Double vector types for ALLX.
(define_mode_attr Vallxd [(QI "8b") (HI "4h") (SI "2s")])
;; Mode of result of comparison operations.
(define_mode_attr V_cmp_result [(V8QI "V8QI") (V16QI "V16QI")
(V4HI "V4HI") (V8HI "V8HI")
(V2SI "V2SI") (V4SI "V4SI")
(V2SF "V2SI") (V4SF "V4SI")
(DI "DI") (V2DI "V2DI")])
;; Vm for lane instructions is restricted to FP_LO_REGS.
(define_mode_attr vwx [(V4HI "x") (V8HI "x") (HI "x")
(V2SI "w") (V4SI "w") (SI "w")])
(define_mode_attr Vendreg [(OI "T") (CI "U") (XI "V")])
(define_mode_attr nregs [(OI "2") (CI "3") (XI "4")])
(define_mode_attr VRL2 [(V8QI "V32QI") (V4HI "V16HI")
(V2SI "V8SI") (V2SF "V8SF")
(DI "V4DI") (DF "V4DF")
(V16QI "V32QI") (V8HI "V16HI")
(V4SI "V8SI") (V4SF "V8SF")
(V2DI "V4DI") (V2DF "V4DF")])
(define_mode_attr VRL3 [(V8QI "V48QI") (V4HI "V24HI")
(V2SI "V12SI") (V2SF "V12SF")
(DI "V6DI") (DF "V6DF")
(V16QI "V48QI") (V8HI "V24HI")
(V4SI "V12SI") (V4SF "V12SF")
(V2DI "V6DI") (V2DF "V6DF")])
(define_mode_attr VRL4 [(V8QI "V64QI") (V4HI "V32HI")
(V2SI "V16SI") (V2SF "V16SF")
(DI "V8DI") (DF "V8DF")
(V16QI "V64QI") (V8HI "V32HI")
(V4SI "V16SI") (V4SF "V16SF")
(V2DI "V8DI") (V2DF "V8DF")])
(define_mode_attr VSTRUCT_DREG [(OI "TI") (CI "EI") (XI "OI")])
;; -------------------------------------------------------------------
;; Code Iterators
;; -------------------------------------------------------------------
;; This code iterator allows the various shifts supported on the core
(define_code_iterator SHIFT [ashift ashiftrt lshiftrt rotatert])
;; This code iterator allows the shifts supported in arithmetic instructions
(define_code_iterator ASHIFT [ashift ashiftrt lshiftrt])
;; Code iterator for logical operations
(define_code_iterator LOGICAL [and ior xor])
;; Code iterator for sign/zero extension
(define_code_iterator ANY_EXTEND [sign_extend zero_extend])
;; All division operations (signed/unsigned)
(define_code_iterator ANY_DIV [div udiv])
;; Code iterator for sign/zero extraction
(define_code_iterator ANY_EXTRACT [sign_extract zero_extract])
;; Code iterator for equality comparisons
(define_code_iterator EQL [eq ne])
;; Code iterator for less-than and greater/equal-to
(define_code_iterator LTGE [lt ge])
;; Iterator for __sync_<op> operations where the operation can be
;; represented directly in RTL.  This is all of the sync operations bar
;; nand.
(define_code_iterator syncop [plus minus ior xor and])
;; Iterator for integer conversions
(define_code_iterator FIXUORS [fix unsigned_fix])
;; Code iterator for variants of vector max and min.
(define_code_iterator MAXMIN [smax smin umax umin])
;; Code iterator for addition and subtraction.
(define_code_iterator ADDSUB [plus minus])
;; Code iterator for variants of vector saturating binary ops.
(define_code_iterator BINQOPS [ss_plus us_plus ss_minus us_minus])
;; Code iterator for variants of vector saturating unary ops.
(define_code_iterator UNQOPS [ss_neg ss_abs])
;; Code iterator for signed variants of vector saturating binary ops.
(define_code_iterator SBINQOPS [ss_plus ss_minus])
;; -------------------------------------------------------------------
;; Code Attributes
;; -------------------------------------------------------------------
;; Map rtl objects to optab names
(define_code_attr optab [(ashift "ashl")
(ashiftrt "ashr")
(lshiftrt "lshr")
(rotatert "rotr")
(sign_extend "extend")
(zero_extend "zero_extend")
(sign_extract "extv")
(zero_extract "extzv")
(and "and")
(ior "ior")
(xor "xor")
(not "one_cmpl")
(neg "neg")
(plus "add")
(minus "sub")
(ss_plus "qadd")
(us_plus "qadd")
(ss_minus "qsub")
(us_minus "qsub")
(ss_neg "qneg")
(ss_abs "qabs")
(eq "eq")
(ne "ne")
(lt "lt")
(ge "ge")])
;; Optab prefix for signed/unsigned variants of operations
(define_code_attr su_optab [(sign_extend "") (zero_extend "u")
(div "") (udiv "u")
(fix "") (unsigned_fix "u")
(ss_plus "s") (us_plus "u")
(ss_minus "s") (us_minus "u")])
;; Similar for the instruction mnemonics
(define_code_attr shift [(ashift "lsl") (ashiftrt "asr")
(lshiftrt "lsr") (rotatert "ror")])
;; Map shift operators onto underlying bit-field instructions
(define_code_attr bfshift [(ashift "ubfiz") (ashiftrt "sbfx")
(lshiftrt "ubfx") (rotatert "extr")])
;; Logical operator instruction mnemonics
(define_code_attr logical [(and "and") (ior "orr") (xor "eor")])
;; Similar, but when not(op)
(define_code_attr nlogical [(and "bic") (ior "orn") (xor "eon")])
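;; Together with the LOGICAL code iterator above, these attributes let a
;; single template generate the AND/ORR/EOR variants (and, via nlogical,
;; the BIC/ORN/EON forms).  A hypothetical sketch, assuming a GPI [SI DI]
;; iterator and the <w> mode attribute:
;;   (define_insn "*example_<optab><mode>3"
;;     [(set (match_operand:GPI 0 "register_operand" "=r")
;;           (LOGICAL:GPI (match_operand:GPI 1 "register_operand" "r")
;;                        (match_operand:GPI 2 "register_operand" "r")))]
;;     ""
;;     "<logical>\\t%<w>0, %<w>1, %<w>2")
;; Each iterated instance substitutes both the optab name (and/ior/xor)
;; and the mnemonic (and/orr/eor) from the one definition.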
;; Sign- or zero-extending load
(define_code_attr ldrxt [(sign_extend "ldrs") (zero_extend "ldr")])
;; Sign- or zero-extending data-op
(define_code_attr su [(sign_extend "s") (zero_extend "u")
(sign_extract "s") (zero_extract "u")
(fix "s") (unsigned_fix "u")
(div "s") (udiv "u")])
;; Emit cbz/cbnz depending on comparison type.
(define_code_attr cbz [(eq "cbz") (ne "cbnz") (lt "cbnz") (ge "cbz")])
;; Emit tbz/tbnz depending on comparison type.
(define_code_attr tbz [(eq "tbz") (ne "tbnz") (lt "tbnz") (ge "tbz")])
;; Max/min attributes.
(define_code_attr maxmin [(smax "smax")
(smin "smin")
(umax "umax")
(umin "umin")])
;; MLA/MLS attributes.
(define_code_attr as [(ss_plus "a") (ss_minus "s")])
;; -------------------------------------------------------------------
;; Int Iterators.
;; -------------------------------------------------------------------
(define_int_iterator MAXMINV [UNSPEC_UMAXV UNSPEC_UMINV
UNSPEC_SMAXV UNSPEC_SMINV])
(define_int_iterator FMAXMINV [UNSPEC_FMAXV UNSPEC_FMINV])
(define_int_iterator HADDSUB [UNSPEC_SHADD UNSPEC_UHADD
UNSPEC_SRHADD UNSPEC_URHADD
UNSPEC_SHSUB UNSPEC_UHSUB
UNSPEC_SRHSUB UNSPEC_URHSUB])
(define_int_iterator ADDSUBHN [UNSPEC_ADDHN UNSPEC_RADDHN
UNSPEC_SUBHN UNSPEC_RSUBHN])
(define_int_iterator ADDSUBHN2 [UNSPEC_ADDHN2 UNSPEC_RADDHN2
UNSPEC_SUBHN2 UNSPEC_RSUBHN2])
(define_int_iterator FMAXMIN [UNSPEC_FMAX UNSPEC_FMIN])
(define_int_iterator VQDMULH [UNSPEC_SQDMULH UNSPEC_SQRDMULH])
(define_int_iterator USSUQADD [UNSPEC_SUQADD UNSPEC_USQADD])
(define_int_iterator SUQMOVN [UNSPEC_SQXTN UNSPEC_UQXTN])
(define_int_iterator VSHL [UNSPEC_SSHL UNSPEC_USHL
UNSPEC_SRSHL UNSPEC_URSHL])
(define_int_iterator VSHLL [UNSPEC_SSHLL UNSPEC_USHLL])
(define_int_iterator VQSHL [UNSPEC_SQSHL UNSPEC_UQSHL
UNSPEC_SQRSHL UNSPEC_UQRSHL])
(define_int_iterator VSRA [UNSPEC_SSRA UNSPEC_USRA
UNSPEC_SRSRA UNSPEC_URSRA])
(define_int_iterator VSLRI [UNSPEC_SSLI UNSPEC_USLI
UNSPEC_SSRI UNSPEC_USRI])
(define_int_iterator VRSHR_N [UNSPEC_SRSHR UNSPEC_URSHR])
(define_int_iterator VQSHL_N [UNSPEC_SQSHLU UNSPEC_SQSHL UNSPEC_UQSHL])
(define_int_iterator VQSHRN_N [UNSPEC_SQSHRUN UNSPEC_SQRSHRUN
UNSPEC_SQSHRN UNSPEC_UQSHRN
UNSPEC_SQRSHRN UNSPEC_UQRSHRN])
(define_int_iterator VCMP_S [UNSPEC_CMEQ UNSPEC_CMGE UNSPEC_CMGT
UNSPEC_CMLE UNSPEC_CMLT])
(define_int_iterator VCMP_U [UNSPEC_CMHS UNSPEC_CMHI UNSPEC_CMTST])
;; -------------------------------------------------------------------
;; Int Iterators Attributes.
;; -------------------------------------------------------------------
(define_int_attr maxminv [(UNSPEC_UMAXV "umax")
(UNSPEC_UMINV "umin")
(UNSPEC_SMAXV "smax")
(UNSPEC_SMINV "smin")])
(define_int_attr fmaxminv [(UNSPEC_FMAXV "max")
(UNSPEC_FMINV "min")])
(define_int_attr fmaxmin [(UNSPEC_FMAX "fmax")
(UNSPEC_FMIN "fmin")])
(define_int_attr sur [(UNSPEC_SHADD "s") (UNSPEC_UHADD "u")
(UNSPEC_SRHADD "sr") (UNSPEC_URHADD "ur")
(UNSPEC_SHSUB "s") (UNSPEC_UHSUB "u")
(UNSPEC_SRHSUB "sr") (UNSPEC_URHSUB "ur")
(UNSPEC_ADDHN "") (UNSPEC_RADDHN "r")
(UNSPEC_SUBHN "") (UNSPEC_RSUBHN "r")
(UNSPEC_ADDHN2 "") (UNSPEC_RADDHN2 "r")
(UNSPEC_SUBHN2 "") (UNSPEC_RSUBHN2 "r")
(UNSPEC_SQXTN "s") (UNSPEC_UQXTN "u")
(UNSPEC_USQADD "us") (UNSPEC_SUQADD "su")
(UNSPEC_SSLI "s") (UNSPEC_USLI "u")
(UNSPEC_SSRI "s") (UNSPEC_USRI "u")
(UNSPEC_USRA "u") (UNSPEC_SSRA "s")
(UNSPEC_URSRA "ur") (UNSPEC_SRSRA "sr")
(UNSPEC_URSHR "ur") (UNSPEC_SRSHR "sr")
(UNSPEC_SQSHLU "s") (UNSPEC_SQSHL "s")
(UNSPEC_UQSHL "u")
(UNSPEC_SQSHRUN "s") (UNSPEC_SQRSHRUN "s")
(UNSPEC_SQSHRN "s") (UNSPEC_UQSHRN "u")
(UNSPEC_SQRSHRN "s") (UNSPEC_UQRSHRN "u")
(UNSPEC_USHL "u") (UNSPEC_SSHL "s")
(UNSPEC_USHLL "u") (UNSPEC_SSHLL "s")
(UNSPEC_URSHL "ur") (UNSPEC_SRSHL "sr")
(UNSPEC_UQRSHL "u") (UNSPEC_SQRSHL "s")
])
(define_int_attr r [(UNSPEC_SQDMULH "") (UNSPEC_SQRDMULH "r")
(UNSPEC_SQSHRUN "") (UNSPEC_SQRSHRUN "r")
(UNSPEC_SQSHRN "") (UNSPEC_UQSHRN "")
(UNSPEC_SQRSHRN "r") (UNSPEC_UQRSHRN "r")
(UNSPEC_SQSHL "") (UNSPEC_UQSHL "")
(UNSPEC_SQRSHL "r") (UNSPEC_UQRSHL "r")
])
(define_int_attr lr [(UNSPEC_SSLI "l") (UNSPEC_USLI "l")
(UNSPEC_SSRI "r") (UNSPEC_USRI "r")])
(define_int_attr u [(UNSPEC_SQSHLU "u") (UNSPEC_SQSHL "") (UNSPEC_UQSHL "")
(UNSPEC_SQSHRUN "u") (UNSPEC_SQRSHRUN "u")
(UNSPEC_SQSHRN "") (UNSPEC_UQSHRN "")
(UNSPEC_SQRSHRN "") (UNSPEC_UQRSHRN "")])
(define_int_attr addsub [(UNSPEC_SHADD "add")
(UNSPEC_UHADD "add")
(UNSPEC_SRHADD "add")
(UNSPEC_URHADD "add")
(UNSPEC_SHSUB "sub")
(UNSPEC_UHSUB "sub")
(UNSPEC_SRHSUB "sub")
(UNSPEC_URHSUB "sub")
(UNSPEC_ADDHN "add")
(UNSPEC_SUBHN "sub")
(UNSPEC_RADDHN "add")
(UNSPEC_RSUBHN "sub")
(UNSPEC_ADDHN2 "add")
(UNSPEC_SUBHN2 "sub")
(UNSPEC_RADDHN2 "add")
(UNSPEC_RSUBHN2 "sub")])
(define_int_attr cmp [(UNSPEC_CMGE "ge") (UNSPEC_CMGT "gt")
(UNSPEC_CMLE "le") (UNSPEC_CMLT "lt")
(UNSPEC_CMEQ "eq")
(UNSPEC_CMHS "hs") (UNSPEC_CMHI "hi")
(UNSPEC_CMTST "tst")])
(define_int_attr offsetlr [(UNSPEC_SSLI "1") (UNSPEC_USLI "1")
(UNSPEC_SSRI "0") (UNSPEC_USRI "0")])
;; Copyright (C) 2012 Free Software Foundation, Inc.
;;
;; Contributed by ARM Ltd.
;;
;; This file is part of GCC.
;;
;; GCC is free software; you can redistribute it and/or modify it
;; under the terms of the GNU General Public License as published by
;; the Free Software Foundation; either version 3, or (at your option)
;; any later version.
;;
;; GCC is distributed in the hope that it will be useful, but
;; WITHOUT ANY WARRANTY; without even the implied warranty of
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
;; General Public License for more details.
;;
;; You should have received a copy of the GNU General Public License
;; along with GCC; see the file COPYING3. If not see
;; <http://www.gnu.org/licenses/>.
;; In the absence of any ARMv8-A implementations, two example pipeline
;; descriptions derived from ARM's most recent ARMv7-A cores (Cortex-A7
;; and Cortex-A15) are included as a temporary measure.
;; Example pipeline description for a 'large' core implementing AArch64.
;;-------------------------------------------------------
;; General Description
;;-------------------------------------------------------
(define_automaton "large_cpu")
;; The core is modelled as a triple issue pipeline that has
;; the following dispatch units.
;; 1. Two pipelines for simple integer operations: int1, int2
;; 2. Two pipelines for SIMD and FP data-processing operations: fpsimd1, fpsimd2
;; 3. One pipeline for branch operations: br
;; 4. One pipeline for integer multiply and divide operations: multdiv
;; 5. Two pipelines for load and store operations: ls1, ls2
;;
;; We can issue into three pipelines per cycle.
;;
;; We assume that, where we have unit pairs, xxx1 is always filled before xxx2.
;;-------------------------------------------------------
;; CPU Units and Reservations
;;-------------------------------------------------------
;; The three issue units
(define_cpu_unit "large_cpu_unit_i1, large_cpu_unit_i2, large_cpu_unit_i3" "large_cpu")
(define_reservation "large_cpu_resv_i1"
"(large_cpu_unit_i1 | large_cpu_unit_i2 | large_cpu_unit_i3)")
(define_reservation "large_cpu_resv_i2"
"((large_cpu_unit_i1 + large_cpu_unit_i2) | (large_cpu_unit_i2 + large_cpu_unit_i3))")
(define_reservation "large_cpu_resv_i3"
"(large_cpu_unit_i1 + large_cpu_unit_i2 + large_cpu_unit_i3)")
(final_presence_set "large_cpu_unit_i2" "large_cpu_unit_i1")
(final_presence_set "large_cpu_unit_i3" "large_cpu_unit_i2")
;; The main dispatch units
(define_cpu_unit "large_cpu_unit_int1, large_cpu_unit_int2" "large_cpu")
(define_cpu_unit "large_cpu_unit_fpsimd1, large_cpu_unit_fpsimd2" "large_cpu")
(define_cpu_unit "large_cpu_unit_ls1, large_cpu_unit_ls2" "large_cpu")
(define_cpu_unit "large_cpu_unit_br" "large_cpu")
(define_cpu_unit "large_cpu_unit_multdiv" "large_cpu")
(define_reservation "large_cpu_resv_ls" "(large_cpu_unit_ls1 | large_cpu_unit_ls2)")
;; The extended load-store pipeline
(define_cpu_unit "large_cpu_unit_load, large_cpu_unit_store" "large_cpu")
;; The extended ALU pipeline
(define_cpu_unit "large_cpu_unit_int1_alu, large_cpu_unit_int2_alu" "large_cpu")
(define_cpu_unit "large_cpu_unit_int1_shf, large_cpu_unit_int2_shf" "large_cpu")
(define_cpu_unit "large_cpu_unit_int1_sat, large_cpu_unit_int2_sat" "large_cpu")
;;-------------------------------------------------------
;; Simple ALU Instructions
;;-------------------------------------------------------
;; Simple ALU operations without shift
(define_insn_reservation "large_cpu_alu" 2
(and (eq_attr "tune" "large") (eq_attr "v8type" "adc,alu,alu_ext"))
"large_cpu_resv_i1, \
(large_cpu_unit_int1, large_cpu_unit_int1_alu) |\
(large_cpu_unit_int2, large_cpu_unit_int2_alu)")
(define_insn_reservation "large_cpu_logic" 2
(and (eq_attr "tune" "large") (eq_attr "v8type" "logic,logic_imm"))
"large_cpu_resv_i1, \
(large_cpu_unit_int1, large_cpu_unit_int1_alu) |\
(large_cpu_unit_int2, large_cpu_unit_int2_alu)")
(define_insn_reservation "large_cpu_shift" 2
(and (eq_attr "tune" "large") (eq_attr "v8type" "shift,shift_imm"))
"large_cpu_resv_i1, \
(large_cpu_unit_int1, large_cpu_unit_int1_shf) |\
(large_cpu_unit_int2, large_cpu_unit_int2_shf)")
;; Simple ALU operations with immediate shift
(define_insn_reservation "large_cpu_alu_shift" 3
(and (eq_attr "tune" "large") (eq_attr "v8type" "alu_shift"))
"large_cpu_resv_i1, \
(large_cpu_unit_int1,
large_cpu_unit_int1 + large_cpu_unit_int1_shf, large_cpu_unit_int1_alu) | \
(large_cpu_unit_int2,
large_cpu_unit_int2 + large_cpu_unit_int2_shf, large_cpu_unit_int2_alu)")
(define_insn_reservation "large_cpu_logic_shift" 3
(and (eq_attr "tune" "large") (eq_attr "v8type" "logic_shift"))
"large_cpu_resv_i1, \
(large_cpu_unit_int1, large_cpu_unit_int1_alu) |\
(large_cpu_unit_int2, large_cpu_unit_int2_alu)")
;;-------------------------------------------------------
;; Multiplication/Division
;;-------------------------------------------------------
;; Simple multiplication
(define_insn_reservation "large_cpu_mult_single" 3
(and (eq_attr "tune" "large")
(and (eq_attr "v8type" "mult,madd") (eq_attr "mode" "SI")))
"large_cpu_resv_i1, large_cpu_unit_multdiv")
(define_insn_reservation "large_cpu_mult_double" 4
(and (eq_attr "tune" "large")
(and (eq_attr "v8type" "mult,madd") (eq_attr "mode" "DI")))
"large_cpu_resv_i1, large_cpu_unit_multdiv")
;; 64-bit multiplication
(define_insn_reservation "large_cpu_mull" 4
(and (eq_attr "tune" "large") (eq_attr "v8type" "mull,mulh,maddl"))
"large_cpu_resv_i1, large_cpu_unit_multdiv * 2")
;; Division
(define_insn_reservation "large_cpu_udiv_single" 9
(and (eq_attr "tune" "large")
(and (eq_attr "v8type" "udiv") (eq_attr "mode" "SI")))
"large_cpu_resv_i1, large_cpu_unit_multdiv")
(define_insn_reservation "large_cpu_udiv_double" 18
(and (eq_attr "tune" "large")
(and (eq_attr "v8type" "udiv") (eq_attr "mode" "DI")))
"large_cpu_resv_i1, large_cpu_unit_multdiv")
(define_insn_reservation "large_cpu_sdiv_single" 10
(and (eq_attr "tune" "large")
(and (eq_attr "v8type" "sdiv") (eq_attr "mode" "SI")))
"large_cpu_resv_i1, large_cpu_unit_multdiv")
(define_insn_reservation "large_cpu_sdiv_double" 20
(and (eq_attr "tune" "large")
(and (eq_attr "v8type" "sdiv") (eq_attr "mode" "DI")))
"large_cpu_resv_i1, large_cpu_unit_multdiv")
;;-------------------------------------------------------
;; Branches
;;-------------------------------------------------------
;; Branches take one issue slot.
;; No latency as there is no result
(define_insn_reservation "large_cpu_branch" 0
(and (eq_attr "tune" "large") (eq_attr "v8type" "branch"))
"large_cpu_resv_i1, large_cpu_unit_br")
;; Calls take up all issue slots, and form a block in the
;; pipeline.  The result, however, is available the next cycle.
;; Addition of new units requires this to be updated.
(define_insn_reservation "large_cpu_call" 1
(and (eq_attr "tune" "large") (eq_attr "v8type" "call"))
"large_cpu_resv_i3 | large_cpu_resv_i2, \
large_cpu_unit_int1 + large_cpu_unit_int2 + large_cpu_unit_br + \
large_cpu_unit_multdiv + large_cpu_unit_fpsimd1 + large_cpu_unit_fpsimd2 + \
large_cpu_unit_ls1 + large_cpu_unit_ls2,\
large_cpu_unit_int1_alu + large_cpu_unit_int1_shf + large_cpu_unit_int1_sat + \
large_cpu_unit_int2_alu + large_cpu_unit_int2_shf + \
large_cpu_unit_int2_sat + large_cpu_unit_load + large_cpu_unit_store")
;;-------------------------------------------------------
;; Load/Store Instructions
;;-------------------------------------------------------
;; Loads of up to two words.
(define_insn_reservation "large_cpu_load1" 4
(and (eq_attr "tune" "large") (eq_attr "v8type" "load_acq,load1,load2"))
"large_cpu_resv_i1, large_cpu_resv_ls, large_cpu_unit_load, nothing")
;; Stores of up to two words.
(define_insn_reservation "large_cpu_store1" 0
(and (eq_attr "tune" "large") (eq_attr "v8type" "store_rel,store1,store2"))
"large_cpu_resv_i1, large_cpu_resv_ls, large_cpu_unit_store")
;;-------------------------------------------------------
;; Floating-point arithmetic.
;;-------------------------------------------------------
(define_insn_reservation "large_cpu_fpalu" 4
(and (eq_attr "tune" "large")
(eq_attr "v8type" "ffarith,fadd,fccmp,fcvt,fcmp"))
"large_cpu_resv_i1 + large_cpu_unit_fpsimd1")
(define_insn_reservation "large_cpu_fconst" 3
(and (eq_attr "tune" "large")
(eq_attr "v8type" "fconst"))
"large_cpu_resv_i1 + large_cpu_unit_fpsimd1")
(define_insn_reservation "large_cpu_fpmuls" 4
(and (eq_attr "tune" "large")
(and (eq_attr "v8type" "fmul,fmadd") (eq_attr "mode" "SF")))
"large_cpu_resv_i1 + large_cpu_unit_fpsimd1")
(define_insn_reservation "large_cpu_fpmuld" 7
(and (eq_attr "tune" "large")
(and (eq_attr "v8type" "fmul,fmadd") (eq_attr "mode" "DF")))
"large_cpu_resv_i1 + large_cpu_unit_fpsimd1, large_cpu_unit_fpsimd1 * 2,\
large_cpu_resv_i1 + large_cpu_unit_fpsimd1")
;;-------------------------------------------------------
;; Floating-point Division
;;-------------------------------------------------------
;; Single-precision divide takes 14 cycles to complete, and this
;; includes the time taken for the special instruction used to collect the
;; result to travel down the multiply pipeline.
(define_insn_reservation "large_cpu_fdivs" 14
(and (eq_attr "tune" "large")
(and (eq_attr "v8type" "fdiv,fsqrt") (eq_attr "mode" "SF")))
"large_cpu_resv_i1, large_cpu_unit_fpsimd1 * 13")
(define_insn_reservation "large_cpu_fdivd" 29
(and (eq_attr "tune" "large")
(and (eq_attr "v8type" "fdiv,fsqrt") (eq_attr "mode" "DF")))
"large_cpu_resv_i1, large_cpu_unit_fpsimd1 * 28")
;;-------------------------------------------------------
;; Floating-point Transfers
;;-------------------------------------------------------
(define_insn_reservation "large_cpu_i2f" 4
(and (eq_attr "tune" "large")
(eq_attr "v8type" "fmovi2f"))
"large_cpu_resv_i1")
(define_insn_reservation "large_cpu_f2i" 2
(and (eq_attr "tune" "large")
(eq_attr "v8type" "fmovf2i"))
"large_cpu_resv_i1")
;;-------------------------------------------------------
;; Floating-point Load/Store
;;-------------------------------------------------------
(define_insn_reservation "large_cpu_floads" 4
(and (eq_attr "tune" "large")
(and (eq_attr "v8type" "fpsimd_load,fpsimd_load2") (eq_attr "mode" "SF")))
"large_cpu_resv_i1")
(define_insn_reservation "large_cpu_floadd" 5
(and (eq_attr "tune" "large")
(and (eq_attr "v8type" "fpsimd_load,fpsimd_load2") (eq_attr "mode" "DF")))
"large_cpu_resv_i1 + large_cpu_unit_br, large_cpu_resv_i1")
(define_insn_reservation "large_cpu_fstores" 0
(and (eq_attr "tune" "large")
(and (eq_attr "v8type" "fpsimd_store,fpsimd_store2") (eq_attr "mode" "SF")))
"large_cpu_resv_i1")
(define_insn_reservation "large_cpu_fstored" 0
(and (eq_attr "tune" "large")
(and (eq_attr "v8type" "fpsimd_store,fpsimd_store2") (eq_attr "mode" "DF")))
"large_cpu_resv_i1 + large_cpu_unit_br, large_cpu_resv_i1")
;;-------------------------------------------------------
;; Bypasses
;;-------------------------------------------------------
(define_bypass 1 "large_cpu_alu, large_cpu_logic, large_cpu_shift"
"large_cpu_alu, large_cpu_alu_shift, large_cpu_logic, large_cpu_logic_shift, large_cpu_shift")
(define_bypass 2 "large_cpu_alu_shift, large_cpu_logic_shift"
"large_cpu_alu, large_cpu_alu_shift, large_cpu_logic, large_cpu_logic_shift, large_cpu_shift")
(define_bypass 1 "large_cpu_alu, large_cpu_logic, large_cpu_shift" "large_cpu_load1")
(define_bypass 2 "large_cpu_alu_shift, large_cpu_logic_shift" "large_cpu_load1")
(define_bypass 2 "large_cpu_floads"
"large_cpu_fpalu, large_cpu_fpmuld,\
large_cpu_fdivs, large_cpu_fdivd,\
large_cpu_f2i")
(define_bypass 3 "large_cpu_floadd"
"large_cpu_fpalu, large_cpu_fpmuld,\
large_cpu_fdivs, large_cpu_fdivd,\
large_cpu_f2i")
;; Machine description for AArch64 architecture.
;; Copyright (C) 2009, 2010, 2011, 2012 Free Software Foundation, Inc.
;; Contributed by ARM Ltd.
;;
;; This file is part of GCC.
;;
;; GCC is free software; you can redistribute it and/or modify it
;; under the terms of the GNU General Public License as published by
;; the Free Software Foundation; either version 3, or (at your option)
;; any later version.
;;
;; GCC is distributed in the hope that it will be useful, but
;; WITHOUT ANY WARRANTY; without even the implied warranty of
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
;; General Public License for more details.
;;
;; You should have received a copy of the GNU General Public License
;; along with GCC; see the file COPYING3. If not see
;; <http://www.gnu.org/licenses/>.
(define_special_predicate "cc_register"
(and (match_code "reg")
(and (match_test "REGNO (op) == CC_REGNUM")
(ior (match_test "mode == GET_MODE (op)")
(match_test "mode == VOIDmode
&& GET_MODE_CLASS (GET_MODE (op)) == MODE_CC"))))
)
(define_predicate "aarch64_reg_or_zero"
(and (match_code "reg,subreg,const_int")
(ior (match_operand 0 "register_operand")
(match_test "op == const0_rtx"))))
(define_predicate "aarch64_reg_zero_or_m1"
(and (match_code "reg,subreg,const_int")
(ior (match_operand 0 "register_operand")
(ior (match_test "op == const0_rtx")
(match_test "op == constm1_rtx")))))
(define_predicate "aarch64_fp_compare_operand"
(ior (match_operand 0 "register_operand")
(and (match_code "const_double")
(match_test "aarch64_const_double_zero_rtx_p (op)"))))
(define_predicate "aarch64_plus_immediate"
(and (match_code "const_int")
(ior (match_test "aarch64_uimm12_shift (INTVAL (op))")
(match_test "aarch64_uimm12_shift (-INTVAL (op))"))))
(define_predicate "aarch64_plus_operand"
(ior (match_operand 0 "register_operand")
(match_operand 0 "aarch64_plus_immediate")))
(define_predicate "aarch64_pluslong_immediate"
(and (match_code "const_int")
(match_test "(INTVAL (op) < 0xffffff && INTVAL (op) > -0xffffff)")))
(define_predicate "aarch64_pluslong_operand"
(ior (match_operand 0 "register_operand")
(match_operand 0 "aarch64_pluslong_immediate")))
(define_predicate "aarch64_logical_immediate"
(and (match_code "const_int")
(match_test "aarch64_bitmask_imm (INTVAL (op), mode)")))
(define_predicate "aarch64_logical_operand"
(ior (match_operand 0 "register_operand")
(match_operand 0 "aarch64_logical_immediate")))
(define_predicate "aarch64_shift_imm_si"
(and (match_code "const_int")
(match_test "(unsigned HOST_WIDE_INT) INTVAL (op) < 32")))
(define_predicate "aarch64_shift_imm_di"
(and (match_code "const_int")
(match_test "(unsigned HOST_WIDE_INT) INTVAL (op) < 64")))
(define_predicate "aarch64_reg_or_shift_imm_si"
(ior (match_operand 0 "register_operand")
(match_operand 0 "aarch64_shift_imm_si")))
(define_predicate "aarch64_reg_or_shift_imm_di"
(ior (match_operand 0 "register_operand")
(match_operand 0 "aarch64_shift_imm_di")))
;; The imm3 field is a 3-bit field that only accepts immediates in the
;; range 0..4.
(define_predicate "aarch64_imm3"
(and (match_code "const_int")
(match_test "(unsigned HOST_WIDE_INT) INTVAL (op) <= 4")))
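;; Illustrative note (not upstream comment text): although the field is
;; 3 bits wide, only shift amounts 0..4 are architecturally valid.  This
;; corresponds to the extended-register forms of ADD and SUB, e.g.
;;   add  x0, x1, w2, sxtw #3
;; where the optional left-shift applied to the extended register must
;; lie in the range 0..4.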
(define_predicate "aarch64_pwr_imm3"
(and (match_code "const_int")
(match_test "INTVAL (op) != 0
&& (unsigned) exact_log2 (INTVAL (op)) <= 4")))
(define_predicate "aarch64_pwr_2_si"
(and (match_code "const_int")
(match_test "INTVAL (op) != 0
&& (unsigned) exact_log2 (INTVAL (op)) < 32")))
(define_predicate "aarch64_pwr_2_di"
(and (match_code "const_int")
(match_test "INTVAL (op) != 0
&& (unsigned) exact_log2 (INTVAL (op)) < 64")))
(define_predicate "aarch64_mem_pair_operand"
(and (match_code "mem")
(match_test "aarch64_legitimate_address_p (mode, XEXP (op, 0), PARALLEL,
0)")))
(define_predicate "aarch64_const_address"
(and (match_code "symbol_ref")
(match_test "mode == DImode && CONSTANT_ADDRESS_P (op)")))
(define_predicate "aarch64_valid_symref"
(match_code "const, symbol_ref, label_ref")
{
enum aarch64_symbol_type symbol_type;
return (aarch64_symbolic_constant_p (op, SYMBOL_CONTEXT_ADR, &symbol_type)
&& symbol_type != SYMBOL_FORCE_TO_MEM);
})
(define_predicate "aarch64_tls_ie_symref"
(match_code "const, symbol_ref, label_ref")
{
switch (GET_CODE (op))
{
case CONST:
op = XEXP (op, 0);
if (GET_CODE (op) != PLUS
|| GET_CODE (XEXP (op, 0)) != SYMBOL_REF
|| GET_CODE (XEXP (op, 1)) != CONST_INT)
return false;
op = XEXP (op, 0);
/* Fall through.  */
case SYMBOL_REF:
return SYMBOL_REF_TLS_MODEL (op) == TLS_MODEL_INITIAL_EXEC;
default:
gcc_unreachable ();
}
})
(define_predicate "aarch64_tls_le_symref"
(match_code "const, symbol_ref, label_ref")
{
switch (GET_CODE (op))
{
case CONST:
op = XEXP (op, 0);
if (GET_CODE (op) != PLUS
|| GET_CODE (XEXP (op, 0)) != SYMBOL_REF
|| GET_CODE (XEXP (op, 1)) != CONST_INT)
return false;
op = XEXP (op, 0);
/* Fall through.  */
case SYMBOL_REF:
return SYMBOL_REF_TLS_MODEL (op) == TLS_MODEL_LOCAL_EXEC;
default:
gcc_unreachable ();
}
})
(define_predicate "aarch64_mov_operand"
(and (match_code "reg,subreg,mem,const_int,symbol_ref,high")
(ior (match_operand 0 "register_operand")
(ior (match_operand 0 "memory_operand")
(ior (match_test "GET_CODE (op) == HIGH
&& aarch64_valid_symref (XEXP (op, 0),
GET_MODE (XEXP (op, 0)))")
(ior (match_test "CONST_INT_P (op)
&& aarch64_move_imm (INTVAL (op), mode)")
(match_test "aarch64_const_address (op, mode)")))))))
(define_predicate "aarch64_movti_operand"
(and (match_code "reg,subreg,mem,const_int")
(ior (match_operand 0 "register_operand")
(ior (match_operand 0 "memory_operand")
(match_operand 0 "const_int_operand")))))
(define_predicate "aarch64_reg_or_imm"
(and (match_code "reg,subreg,const_int")
(ior (match_operand 0 "register_operand")
(match_operand 0 "const_int_operand"))))
;; True for integer comparisons and for FP comparisons other than LTGT or UNEQ.
(define_special_predicate "aarch64_comparison_operator"
(match_code "eq,ne,le,lt,ge,gt,geu,gtu,leu,ltu,unordered,ordered,unlt,unle,unge,ungt"))
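;; Illustrative note (an assumption, not upstream comment text): LTGT and
;; UNEQ appear to be excluded because neither maps onto a single AArch64
;; condition code after FCMP (an unordered result sets flags such that NE
;; and EQ alone cannot distinguish these cases), so they would have to be
;; split into two comparisons rather than matched by this predicate.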
;; True if the operand is a memory reference suitable for a load/store exclusive.
(define_predicate "aarch64_sync_memory_operand"
(and (match_operand 0 "memory_operand")
(match_code "reg" "0")))
;; Predicates for parallel expanders based on mode.
(define_special_predicate "vect_par_cnst_hi_half"
(match_code "parallel")
{
HOST_WIDE_INT count = XVECLEN (op, 0);
int nunits = GET_MODE_NUNITS (mode);
int i;
if (count < 1
|| count != nunits / 2)
return false;
if (!VECTOR_MODE_P (mode))
return false;
for (i = 0; i < count; i++)
{
rtx elt = XVECEXP (op, 0, i);
int val;
if (GET_CODE (elt) != CONST_INT)
return false;
val = INTVAL (elt);
if (val != (nunits / 2) + i)
return false;
}
return true;
})
(define_special_predicate "vect_par_cnst_lo_half"
(match_code "parallel")
{
HOST_WIDE_INT count = XVECLEN (op, 0);
int nunits = GET_MODE_NUNITS (mode);
int i;
if (count < 1
|| count != nunits / 2)
return false;
if (!VECTOR_MODE_P (mode))
return false;
for (i = 0; i < count; i++)
{
rtx elt = XVECEXP (op, 0, i);
int val;
if (GET_CODE (elt) != CONST_INT)
return false;
val = INTVAL (elt);
if (val != i)
return false;
}
return true;
})
(define_special_predicate "aarch64_simd_lshift_imm"
(match_code "const_vector")
{
return aarch64_simd_shift_imm_p (op, mode, true);
})
(define_special_predicate "aarch64_simd_rshift_imm"
(match_code "const_vector")
{
return aarch64_simd_shift_imm_p (op, mode, false);
})
(define_predicate "aarch64_simd_reg_or_zero"
(and (match_code "reg,subreg,const_int,const_vector")
(ior (match_operand 0 "register_operand")
(ior (match_test "op == const0_rtx")
(match_test "aarch64_simd_imm_zero_p (op, mode)")))))
(define_predicate "aarch64_simd_struct_operand"
(and (match_code "mem")
(match_test "TARGET_SIMD && aarch64_simd_mem_operand_p (op)")))
;; Like general_operand but allow only valid SIMD addressing modes.
(define_predicate "aarch64_simd_general_operand"
(and (match_operand 0 "general_operand")
(match_test "!MEM_P (op)
|| GET_CODE (XEXP (op, 0)) == POST_INC
|| GET_CODE (XEXP (op, 0)) == REG")))
;; Like nonimmediate_operand but allow only valid SIMD addressing modes.
(define_predicate "aarch64_simd_nonimmediate_operand"
(and (match_operand 0 "nonimmediate_operand")
(match_test "!MEM_P (op)
|| GET_CODE (XEXP (op, 0)) == POST_INC
|| GET_CODE (XEXP (op, 0)) == REG")))
(define_special_predicate "aarch64_simd_imm_zero"
(match_code "const_vector")
{
return aarch64_simd_imm_zero_p (op, mode);
})
;; Copyright (C) 2012 Free Software Foundation, Inc.
;;
;; Contributed by ARM Ltd.
;;
;; This file is part of GCC.
;;
;; GCC is free software; you can redistribute it and/or modify it
;; under the terms of the GNU General Public License as published by
;; the Free Software Foundation; either version 3, or (at your option)
;; any later version.
;;
;; GCC is distributed in the hope that it will be useful, but
;; WITHOUT ANY WARRANTY; without even the implied warranty of
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
;; General Public License for more details.
;;
;; You should have received a copy of the GNU General Public License
;; along with GCC; see the file COPYING3. If not see
;; <http://www.gnu.org/licenses/>.
;; In the absence of any ARMv8-A implementations, two example pipeline
;; descriptions derived from ARM's most recent ARMv7-A cores (Cortex-A7
;; and Cortex-A15) are included instead.  This is a temporary measure.
;; Example pipeline description for an example 'small' core
;; implementing AArch64
;;-------------------------------------------------------
;; General Description
;;-------------------------------------------------------
(define_automaton "small_cpu")
;; The core is modelled as a single issue pipeline with the following
;; dispatch units.
;; 1. One pipeline for simple instructions.
;; 2. One pipeline for branch instructions.
;;
;; There are five pipeline stages.
;; The decode/issue stages operate the same for all instructions.
;; Instructions always advance one stage per cycle in order.
;; Only branch instructions may dual-issue with other instructions, except
;; when those instructions take multiple cycles to issue.
;;-------------------------------------------------------
;; CPU Units and Reservations
;;-------------------------------------------------------
(define_cpu_unit "small_cpu_unit_i" "small_cpu")
(define_cpu_unit "small_cpu_unit_br" "small_cpu")
;; Pseudo-unit for blocking the multiply pipeline when a double-precision
;; multiply is in progress.
(define_cpu_unit "small_cpu_unit_fpmul_pipe" "small_cpu")
;; The floating-point add pipeline, used to model the usage
;; of the add pipeline by fp alu instructions.
(define_cpu_unit "small_cpu_unit_fpadd_pipe" "small_cpu")
;; Floating-point division pipeline (long latency, out-of-order completion).
(define_cpu_unit "small_cpu_unit_fpdiv" "small_cpu")
;;-------------------------------------------------------
;; Simple ALU Instructions
;;-------------------------------------------------------
;; Simple ALU operations without shift
(define_insn_reservation "small_cpu_alu" 2
(and (eq_attr "tune" "small")
(eq_attr "v8type" "adc,alu,alu_ext"))
"small_cpu_unit_i")
(define_insn_reservation "small_cpu_logic" 2
(and (eq_attr "tune" "small")
(eq_attr "v8type" "logic,logic_imm"))
"small_cpu_unit_i")
(define_insn_reservation "small_cpu_shift" 2
(and (eq_attr "tune" "small")
(eq_attr "v8type" "shift,shift_imm"))
"small_cpu_unit_i")
;; Simple ALU operations with immediate shift
(define_insn_reservation "small_cpu_alu_shift" 2
(and (eq_attr "tune" "small")
(eq_attr "v8type" "alu_shift"))
"small_cpu_unit_i")
(define_insn_reservation "small_cpu_logic_shift" 2
(and (eq_attr "tune" "small")
(eq_attr "v8type" "logic_shift"))
"small_cpu_unit_i")
;;-------------------------------------------------------
;; Multiplication/Division
;;-------------------------------------------------------
;; Simple multiplication
(define_insn_reservation "small_cpu_mult_single" 2
(and (eq_attr "tune" "small")
(and (eq_attr "v8type" "mult,madd") (eq_attr "mode" "SI")))
"small_cpu_unit_i")
(define_insn_reservation "small_cpu_mult_double" 3
(and (eq_attr "tune" "small")
(and (eq_attr "v8type" "mult,madd") (eq_attr "mode" "DI")))
"small_cpu_unit_i")
;; 64-bit multiplication
(define_insn_reservation "small_cpu_mull" 3
(and (eq_attr "tune" "small") (eq_attr "v8type" "mull,mulh,maddl"))
"small_cpu_unit_i * 2")
;; Division
(define_insn_reservation "small_cpu_udiv_single" 5
(and (eq_attr "tune" "small")
(and (eq_attr "v8type" "udiv") (eq_attr "mode" "SI")))
"small_cpu_unit_i")
(define_insn_reservation "small_cpu_udiv_double" 10
(and (eq_attr "tune" "small")
(and (eq_attr "v8type" "udiv") (eq_attr "mode" "DI")))
"small_cpu_unit_i")
(define_insn_reservation "small_cpu_sdiv_single" 6
(and (eq_attr "tune" "small")
(and (eq_attr "v8type" "sdiv") (eq_attr "mode" "SI")))
"small_cpu_unit_i")
(define_insn_reservation "small_cpu_sdiv_double" 12
(and (eq_attr "tune" "small")
(and (eq_attr "v8type" "sdiv") (eq_attr "mode" "DI")))
"small_cpu_unit_i")
;;-------------------------------------------------------
;; Load/Store Instructions
;;-------------------------------------------------------
(define_insn_reservation "small_cpu_load1" 2
(and (eq_attr "tune" "small")
(eq_attr "v8type" "load_acq,load1"))
"small_cpu_unit_i")
(define_insn_reservation "small_cpu_store1" 0
(and (eq_attr "tune" "small")
(eq_attr "v8type" "store_rel,store1"))
"small_cpu_unit_i")
(define_insn_reservation "small_cpu_load2" 3
(and (eq_attr "tune" "small")
(eq_attr "v8type" "load2"))
"small_cpu_unit_i + small_cpu_unit_br, small_cpu_unit_i")
(define_insn_reservation "small_cpu_store2" 0
(and (eq_attr "tune" "small")
(eq_attr "v8type" "store2"))
"small_cpu_unit_i + small_cpu_unit_br, small_cpu_unit_i")
;;-------------------------------------------------------
;; Branches
;;-------------------------------------------------------
;; Direct branches are the only instructions that can dual-issue.
;; The latency here represents when the branch actually takes place.
(define_insn_reservation "small_cpu_unit_br" 3
(and (eq_attr "tune" "small")
(eq_attr "v8type" "branch,call"))
"small_cpu_unit_br")
;;-------------------------------------------------------
;; Floating-point arithmetic.
;;-------------------------------------------------------
(define_insn_reservation "small_cpu_fpalu" 4
(and (eq_attr "tune" "small")
(eq_attr "v8type" "ffarith,fadd,fccmp,fcvt,fcmp"))
"small_cpu_unit_i + small_cpu_unit_fpadd_pipe")
(define_insn_reservation "small_cpu_fconst" 3
(and (eq_attr "tune" "small")
(eq_attr "v8type" "fconst"))
"small_cpu_unit_i + small_cpu_unit_fpadd_pipe")
(define_insn_reservation "small_cpu_fpmuls" 4
(and (eq_attr "tune" "small")
(and (eq_attr "v8type" "fmul") (eq_attr "mode" "SF")))
"small_cpu_unit_i + small_cpu_unit_fpmul_pipe")
(define_insn_reservation "small_cpu_fpmuld" 7
(and (eq_attr "tune" "small")
(and (eq_attr "v8type" "fmul") (eq_attr "mode" "DF")))
"small_cpu_unit_i + small_cpu_unit_fpmul_pipe, small_cpu_unit_fpmul_pipe * 2,\
small_cpu_unit_i + small_cpu_unit_fpmul_pipe")
;;-------------------------------------------------------
;; Floating-point Division
;;-------------------------------------------------------
;; Single-precision divide takes 14 cycles to complete, and this
;; includes the time taken for the special instruction used to collect the
;; result to travel down the multiply pipeline.
(define_insn_reservation "small_cpu_fdivs" 14
(and (eq_attr "tune" "small")
(and (eq_attr "v8type" "fdiv,fsqrt") (eq_attr "mode" "SF")))
"small_cpu_unit_i, small_cpu_unit_fpdiv * 13")
(define_insn_reservation "small_cpu_fdivd" 29
(and (eq_attr "tune" "small")
(and (eq_attr "v8type" "fdiv,fsqrt") (eq_attr "mode" "DF")))
"small_cpu_unit_i, small_cpu_unit_fpdiv * 28")
;;-------------------------------------------------------
;; Floating-point Transfers
;;-------------------------------------------------------
(define_insn_reservation "small_cpu_i2f" 4
(and (eq_attr "tune" "small")
(eq_attr "v8type" "fmovi2f"))
"small_cpu_unit_i")
(define_insn_reservation "small_cpu_f2i" 2
(and (eq_attr "tune" "small")
(eq_attr "v8type" "fmovf2i"))
"small_cpu_unit_i")
;;-------------------------------------------------------
;; Floating-point Load/Store
;;-------------------------------------------------------
(define_insn_reservation "small_cpu_floads" 4
(and (eq_attr "tune" "small")
(and (eq_attr "v8type" "fpsimd_load") (eq_attr "mode" "SF")))
"small_cpu_unit_i")
(define_insn_reservation "small_cpu_floadd" 5
(and (eq_attr "tune" "small")
(and (eq_attr "v8type" "fpsimd_load") (eq_attr "mode" "DF")))
"small_cpu_unit_i + small_cpu_unit_br, small_cpu_unit_i")
(define_insn_reservation "small_cpu_fstores" 0
(and (eq_attr "tune" "small")
(and (eq_attr "v8type" "fpsimd_store") (eq_attr "mode" "SF")))
"small_cpu_unit_i")
(define_insn_reservation "small_cpu_fstored" 0
(and (eq_attr "tune" "small")
(and (eq_attr "v8type" "fpsimd_store") (eq_attr "mode" "DF")))
"small_cpu_unit_i + small_cpu_unit_br, small_cpu_unit_i")
;;-------------------------------------------------------
;; Bypasses
;;-------------------------------------------------------
;; Forwarding path for unshifted operands.
(define_bypass 1 "small_cpu_alu, small_cpu_alu_shift"
"small_cpu_alu, small_cpu_alu_shift, small_cpu_logic, small_cpu_logic_shift, small_cpu_shift")
(define_bypass 1 "small_cpu_logic, small_cpu_logic_shift"
"small_cpu_alu, small_cpu_alu_shift, small_cpu_logic, small_cpu_logic_shift, small_cpu_shift")
(define_bypass 1 "small_cpu_shift"
"small_cpu_alu, small_cpu_alu_shift, small_cpu_logic, small_cpu_logic_shift, small_cpu_shift")
;; Load-to-use for floating-point values has a penalty of one cycle.
(define_bypass 2 "small_cpu_floads"
"small_cpu_fpalu, small_cpu_fpmuld,\
small_cpu_fdivs, small_cpu_fdivd,\
small_cpu_f2i")
(define_bypass 3 "small_cpu_floadd"
"small_cpu_fpalu, small_cpu_fpmuld,\
small_cpu_fdivs, small_cpu_fdivd,\
small_cpu_f2i")
;; Machine description for AArch64 processor synchronization primitives.
;; Copyright (C) 2009, 2010, 2011, 2012 Free Software Foundation, Inc.
;; Contributed by ARM Ltd.
;;
;; This file is part of GCC.
;;
;; GCC is free software; you can redistribute it and/or modify it
;; under the terms of the GNU General Public License as published by
;; the Free Software Foundation; either version 3, or (at your option)
;; any later version.
;;
;; GCC is distributed in the hope that it will be useful, but
;; WITHOUT ANY WARRANTY; without even the implied warranty of
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
;; General Public License for more details.
;;
;; You should have received a copy of the GNU General Public License
;; along with GCC; see the file COPYING3. If not see
;; <http://www.gnu.org/licenses/>.
(define_c_enum "unspecv"
[
UNSPECV_SYNC_COMPARE_AND_SWAP ; Represent a sync_compare_and_swap.
UNSPECV_SYNC_LOCK ; Represent a sync_lock_test_and_set.
UNSPECV_SYNC_LOCK_RELEASE ; Represent a sync_lock_release.
UNSPECV_SYNC_OP ; Represent a sync_<op>
UNSPECV_SYNC_NEW_OP ; Represent a sync_new_<op>
UNSPECV_SYNC_OLD_OP ; Represent a sync_old_<op>
])
(define_expand "sync_compare_and_swap<mode>"
[(set (match_operand:ALLI 0 "register_operand")
(unspec_volatile:ALLI [(match_operand:ALLI 1 "memory_operand")
(match_operand:ALLI 2 "register_operand")
(match_operand:ALLI 3 "register_operand")]
UNSPECV_SYNC_COMPARE_AND_SWAP))]
""
{
struct aarch64_sync_generator generator;
generator.op = aarch64_sync_generator_omrn;
generator.u.omrn = gen_aarch64_sync_compare_and_swap<mode>;
aarch64_expand_sync (<MODE>mode, &generator, operands[0], operands[1],
operands[2], operands[3]);
DONE;
})
(define_expand "sync_lock_test_and_set<mode>"
[(match_operand:ALLI 0 "register_operand")
(match_operand:ALLI 1 "memory_operand")
(match_operand:ALLI 2 "register_operand")]
""
{
struct aarch64_sync_generator generator;
generator.op = aarch64_sync_generator_omn;
generator.u.omn = gen_aarch64_sync_lock_test_and_set<mode>;
aarch64_expand_sync (<MODE>mode, &generator, operands[0], operands[1],
NULL, operands[2]);
DONE;
})
(define_expand "sync_<optab><mode>"
[(match_operand:ALLI 0 "memory_operand")
(match_operand:ALLI 1 "register_operand")
(syncop:ALLI (match_dup 0) (match_dup 1))]
""
{
struct aarch64_sync_generator generator;
generator.op = aarch64_sync_generator_omn;
generator.u.omn = gen_aarch64_sync_new_<optab><mode>;
aarch64_expand_sync (<MODE>mode, &generator, NULL, operands[0], NULL,
operands[1]);
DONE;
})
(define_expand "sync_nand<mode>"
[(match_operand:ALLI 0 "memory_operand")
(match_operand:ALLI 1 "register_operand")
(not:ALLI (and:ALLI (match_dup 0) (match_dup 1)))]
""
{
struct aarch64_sync_generator generator;
generator.op = aarch64_sync_generator_omn;
generator.u.omn = gen_aarch64_sync_new_nand<mode>;
aarch64_expand_sync (<MODE>mode, &generator, NULL, operands[0], NULL,
operands[1]);
DONE;
})
(define_expand "sync_new_<optab><mode>"
[(match_operand:ALLI 0 "register_operand")
(match_operand:ALLI 1 "memory_operand")
(match_operand:ALLI 2 "register_operand")
(syncop:ALLI (match_dup 1) (match_dup 2))]
""
{
struct aarch64_sync_generator generator;
generator.op = aarch64_sync_generator_omn;
generator.u.omn = gen_aarch64_sync_new_<optab><mode>;
aarch64_expand_sync (<MODE>mode, &generator, operands[0], operands[1],
NULL, operands[2]);
DONE;
})
(define_expand "sync_new_nand<mode>"
[(match_operand:ALLI 0 "register_operand")
(match_operand:ALLI 1 "memory_operand")
(match_operand:ALLI 2 "register_operand")
(not:ALLI (and:ALLI (match_dup 1) (match_dup 2)))]
""
{
struct aarch64_sync_generator generator;
generator.op = aarch64_sync_generator_omn;
generator.u.omn = gen_aarch64_sync_new_nand<mode>;
aarch64_expand_sync (<MODE>mode, &generator, operands[0], operands[1],
NULL, operands[2]);
DONE;
})
(define_expand "sync_old_<optab><mode>"
[(match_operand:ALLI 0 "register_operand")
(match_operand:ALLI 1 "memory_operand")
(match_operand:ALLI 2 "register_operand")
(syncop:ALLI (match_dup 1) (match_dup 2))]
""
{
struct aarch64_sync_generator generator;
generator.op = aarch64_sync_generator_omn;
generator.u.omn = gen_aarch64_sync_old_<optab><mode>;
aarch64_expand_sync (<MODE>mode, &generator, operands[0], operands[1],
NULL, operands[2]);
DONE;
})
(define_expand "sync_old_nand<mode>"
[(match_operand:ALLI 0 "register_operand")
(match_operand:ALLI 1 "memory_operand")
(match_operand:ALLI 2 "register_operand")
(not:ALLI (and:ALLI (match_dup 1) (match_dup 2)))]
""
{
struct aarch64_sync_generator generator;
generator.op = aarch64_sync_generator_omn;
generator.u.omn = gen_aarch64_sync_old_nand<mode>;
aarch64_expand_sync (<MODE>mode, &generator, operands[0], operands[1],
NULL, operands[2]);
DONE;
})
(define_expand "memory_barrier"
[(set (match_dup 0) (unspec:BLK [(match_dup 0)] UNSPEC_MB))]
""
{
operands[0] = gen_rtx_MEM (BLKmode, gen_rtx_SCRATCH (Pmode));
MEM_VOLATILE_P (operands[0]) = 1;
})
(define_insn "aarch64_sync_compare_and_swap<mode>"
[(set (match_operand:GPI 0 "register_operand" "=&r")
(unspec_volatile:GPI
[(match_operand:GPI 1 "aarch64_sync_memory_operand" "+Q")
(match_operand:GPI 2 "register_operand" "r")
(match_operand:GPI 3 "register_operand" "r")]
UNSPECV_SYNC_COMPARE_AND_SWAP))
(set (match_dup 1) (unspec_volatile:GPI [(match_dup 2)]
UNSPECV_SYNC_COMPARE_AND_SWAP))
(clobber:GPI (match_scratch:GPI 4 "=&r"))
(set (reg:CC CC_REGNUM) (unspec_volatile:CC [(match_dup 1)]
UNSPECV_SYNC_COMPARE_AND_SWAP))
]
""
{
return aarch64_output_sync_insn (insn, operands);
}
[(set_attr "sync_result" "0")
(set_attr "sync_memory" "1")
(set_attr "sync_required_value" "2")
(set_attr "sync_new_value" "3")
(set_attr "sync_t1" "0")
(set_attr "sync_t2" "4")
])
(define_insn "aarch64_sync_compare_and_swap<mode>"
[(set (match_operand:SI 0 "register_operand" "=&r")
(zero_extend:SI
(unspec_volatile:SHORT
[(match_operand:SHORT 1 "aarch64_sync_memory_operand" "+Q")
(match_operand:SI 2 "register_operand" "r")
(match_operand:SI 3 "register_operand" "r")]
UNSPECV_SYNC_COMPARE_AND_SWAP)))
(set (match_dup 1) (unspec_volatile:SHORT [(match_dup 2)]
UNSPECV_SYNC_COMPARE_AND_SWAP))
(clobber:SI (match_scratch:SI 4 "=&r"))
(set (reg:CC CC_REGNUM) (unspec_volatile:CC [(match_dup 1)]
UNSPECV_SYNC_COMPARE_AND_SWAP))
]
""
{
return aarch64_output_sync_insn (insn, operands);
}
[(set_attr "sync_result" "0")
(set_attr "sync_memory" "1")
(set_attr "sync_required_value" "2")
(set_attr "sync_new_value" "3")
(set_attr "sync_t1" "0")
(set_attr "sync_t2" "4")
])
(define_insn "aarch64_sync_lock_test_and_set<mode>"
[(set (match_operand:GPI 0 "register_operand" "=&r")
(match_operand:GPI 1 "aarch64_sync_memory_operand" "+Q"))
(set (match_dup 1)
(unspec_volatile:GPI [(match_operand:GPI 2 "register_operand" "r")]
UNSPECV_SYNC_LOCK))
(clobber (reg:CC CC_REGNUM))
(clobber (match_scratch:GPI 3 "=&r"))]
""
{
return aarch64_output_sync_insn (insn, operands);
}
[(set_attr "sync_release_barrier" "no")
(set_attr "sync_result" "0")
(set_attr "sync_memory" "1")
(set_attr "sync_new_value" "2")
(set_attr "sync_t1" "0")
(set_attr "sync_t2" "3")
])
(define_insn "aarch64_sync_lock_test_and_set<mode>"
[(set (match_operand:SI 0 "register_operand" "=&r")
(zero_extend:SI (match_operand:SHORT 1
"aarch64_sync_memory_operand" "+Q")))
(set (match_dup 1)
(unspec_volatile:SHORT [(match_operand:SI 2 "register_operand" "r")]
UNSPECV_SYNC_LOCK))
(clobber (reg:CC CC_REGNUM))
(clobber (match_scratch:SI 3 "=&r"))]
""
{
return aarch64_output_sync_insn (insn, operands);
}
[(set_attr "sync_release_barrier" "no")
(set_attr "sync_result" "0")
(set_attr "sync_memory" "1")
(set_attr "sync_new_value" "2")
(set_attr "sync_t1" "0")
(set_attr "sync_t2" "3")
])
(define_insn "aarch64_sync_new_<optab><mode>"
[(set (match_operand:GPI 0 "register_operand" "=&r")
(unspec_volatile:GPI
[(syncop:GPI
(match_operand:GPI 1 "aarch64_sync_memory_operand" "+Q")
(match_operand:GPI 2 "register_operand" "r"))]
UNSPECV_SYNC_NEW_OP))
(set (match_dup 1)
(unspec_volatile:GPI [(match_dup 1) (match_dup 2)]
UNSPECV_SYNC_NEW_OP))
(clobber (reg:CC CC_REGNUM))
(clobber (match_scratch:GPI 3 "=&r"))]
""
{
return aarch64_output_sync_insn (insn, operands);
}
[(set_attr "sync_result" "0")
(set_attr "sync_memory" "1")
(set_attr "sync_new_value" "2")
(set_attr "sync_t1" "0")
(set_attr "sync_t2" "3")
(set_attr "sync_op" "<optab>")
])
(define_insn "aarch64_sync_new_nand<mode>"
[(set (match_operand:GPI 0 "register_operand" "=&r")
(unspec_volatile:GPI
[(not:GPI (and:GPI
(match_operand:GPI 1 "aarch64_sync_memory_operand" "+Q")
(match_operand:GPI 2 "register_operand" "r")))]
UNSPECV_SYNC_NEW_OP))
(set (match_dup 1)
(unspec_volatile:GPI [(match_dup 1) (match_dup 2)]
UNSPECV_SYNC_NEW_OP))
(clobber (reg:CC CC_REGNUM))
(clobber (match_scratch:GPI 3 "=&r"))]
""
{
return aarch64_output_sync_insn (insn, operands);
}
[(set_attr "sync_result" "0")
(set_attr "sync_memory" "1")
(set_attr "sync_new_value" "2")
(set_attr "sync_t1" "0")
(set_attr "sync_t2" "3")
(set_attr "sync_op" "nand")
])
(define_insn "aarch64_sync_new_<optab><mode>"
[(set (match_operand:SI 0 "register_operand" "=&r")
(unspec_volatile:SI
[(syncop:SI
(zero_extend:SI
(match_operand:SHORT 1 "aarch64_sync_memory_operand" "+Q"))
(match_operand:SI 2 "register_operand" "r"))]
UNSPECV_SYNC_NEW_OP))
(set (match_dup 1)
(unspec_volatile:SHORT [(match_dup 1) (match_dup 2)]
UNSPECV_SYNC_NEW_OP))
(clobber (reg:CC CC_REGNUM))
(clobber (match_scratch:SI 3 "=&r"))]
""
{
return aarch64_output_sync_insn (insn, operands);
}
[(set_attr "sync_result" "0")
(set_attr "sync_memory" "1")
(set_attr "sync_new_value" "2")
(set_attr "sync_t1" "0")
(set_attr "sync_t2" "3")
(set_attr "sync_op" "<optab>")
])
(define_insn "aarch64_sync_new_nand<mode>"
[(set (match_operand:SI 0 "register_operand" "=&r")
(unspec_volatile:SI
[(not:SI
(and:SI
(zero_extend:SI
(match_operand:SHORT 1 "aarch64_sync_memory_operand" "+Q"))
(match_operand:SI 2 "register_operand" "r")))
] UNSPECV_SYNC_NEW_OP))
(set (match_dup 1)
(unspec_volatile:SHORT [(match_dup 1) (match_dup 2)]
UNSPECV_SYNC_NEW_OP))
(clobber (reg:CC CC_REGNUM))
(clobber (match_scratch:SI 3 "=&r"))]
""
{
return aarch64_output_sync_insn (insn, operands);
}
[(set_attr "sync_result" "0")
(set_attr "sync_memory" "1")
(set_attr "sync_new_value" "2")
(set_attr "sync_t1" "0")
(set_attr "sync_t2" "3")
(set_attr "sync_op" "nand")
])
(define_insn "aarch64_sync_old_<optab><mode>"
[(set (match_operand:GPI 0 "register_operand" "=&r")
(unspec_volatile:GPI
[(syncop:GPI
(match_operand:GPI 1 "aarch64_sync_memory_operand" "+Q")
(match_operand:GPI 2 "register_operand" "r"))]
UNSPECV_SYNC_OLD_OP))
(set (match_dup 1)
(unspec_volatile:GPI [(match_dup 1) (match_dup 2)]
UNSPECV_SYNC_OLD_OP))
(clobber (reg:CC CC_REGNUM))
(clobber (match_scratch:GPI 3 "=&r"))
(clobber (match_scratch:GPI 4 "=&r"))]
""
{
return aarch64_output_sync_insn (insn, operands);
}
[(set_attr "sync_result" "0")
(set_attr "sync_memory" "1")
(set_attr "sync_new_value" "2")
(set_attr "sync_t1" "3")
(set_attr "sync_t2" "4")
(set_attr "sync_op" "<optab>")
])
(define_insn "aarch64_sync_old_nand<mode>"
[(set (match_operand:GPI 0 "register_operand" "=&r")
(unspec_volatile:GPI
[(not:GPI (and:GPI
(match_operand:GPI 1 "aarch64_sync_memory_operand" "+Q")
(match_operand:GPI 2 "register_operand" "r")))]
UNSPECV_SYNC_OLD_OP))
(set (match_dup 1)
(unspec_volatile:GPI [(match_dup 1) (match_dup 2)]
UNSPECV_SYNC_OLD_OP))
(clobber (reg:CC CC_REGNUM))
(clobber (match_scratch:GPI 3 "=&r"))
(clobber (match_scratch:GPI 4 "=&r"))]
""
{
return aarch64_output_sync_insn (insn, operands);
}
[(set_attr "sync_result" "0")
(set_attr "sync_memory" "1")
(set_attr "sync_new_value" "2")
(set_attr "sync_t1" "3")
(set_attr "sync_t2" "4")
(set_attr "sync_op" "nand")
])
(define_insn "aarch64_sync_old_<optab><mode>"
[(set (match_operand:SI 0 "register_operand" "=&r")
(unspec_volatile:SI
[(syncop:SI
(zero_extend:SI
(match_operand:SHORT 1 "aarch64_sync_memory_operand" "+Q"))
(match_operand:SI 2 "register_operand" "r"))]
UNSPECV_SYNC_OLD_OP))
(set (match_dup 1)
(unspec_volatile:SHORT [(match_dup 1) (match_dup 2)]
UNSPECV_SYNC_OLD_OP))
(clobber (reg:CC CC_REGNUM))
(clobber (match_scratch:SI 3 "=&r"))
(clobber (match_scratch:SI 4 "=&r"))]
""
{
return aarch64_output_sync_insn (insn, operands);
}
[(set_attr "sync_result" "0")
(set_attr "sync_memory" "1")
(set_attr "sync_new_value" "2")
(set_attr "sync_t1" "3")
(set_attr "sync_t2" "4")
(set_attr "sync_op" "<optab>")
])
(define_insn "aarch64_sync_old_nand<mode>"
[(set (match_operand:SI 0 "register_operand" "=&r")
(unspec_volatile:SI
[(not:SI
(and:SI
(zero_extend:SI
(match_operand:SHORT 1 "aarch64_sync_memory_operand" "+Q"))
(match_operand:SI 2 "register_operand" "r")))]
UNSPECV_SYNC_OLD_OP))
(set (match_dup 1)
(unspec_volatile:SHORT [(match_dup 1) (match_dup 2)]
UNSPECV_SYNC_OLD_OP))
(clobber (reg:CC CC_REGNUM))
(clobber (match_scratch:SI 3 "=&r"))
(clobber (match_scratch:SI 4 "=&r"))]
""
{
return aarch64_output_sync_insn (insn, operands);
}
[(set_attr "sync_result" "0")
(set_attr "sync_memory" "1")
(set_attr "sync_new_value" "2")
(set_attr "sync_t1" "3")
(set_attr "sync_t2" "4")
(set_attr "sync_op" "nand")
])
(define_insn "*memory_barrier"
[(set (match_operand:BLK 0 "" "")
(unspec:BLK [(match_dup 0)] UNSPEC_MB))]
""
"dmb\\tish"
)
(define_insn "sync_lock_release<mode>"
[(set (match_operand:ALLI 0 "memory_operand" "+Q")
(unspec_volatile:ALLI [(match_operand:ALLI 1 "register_operand" "r")]
UNSPECV_SYNC_LOCK_RELEASE))]
""
{
return aarch64_output_sync_lock_release (operands[1], operands[0]);
})
# Machine description for AArch64 architecture.
# Copyright (C) 2009, 2010, 2011, 2012 Free Software Foundation, Inc.
# Contributed by ARM Ltd.
#
# This file is part of GCC.
#
# GCC is free software; you can redistribute it and/or modify it
# under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 3, or (at your option)
# any later version.
#
# GCC is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with GCC; see the file COPYING3. If not see
# <http://www.gnu.org/licenses/>.
$(srcdir)/config/aarch64/aarch64-tune.md: $(srcdir)/config/aarch64/gentune.sh \
$(srcdir)/config/aarch64/aarch64-cores.def
$(SHELL) $(srcdir)/config/aarch64/gentune.sh \
$(srcdir)/config/aarch64/aarch64-cores.def > \
$(srcdir)/config/aarch64/aarch64-tune.md
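# Illustrative note (an assumption about the generated output, not text
# captured from a build): gentune.sh scrapes the core names out of
# aarch64-cores.def and emits a single "tune" attribute into
# aarch64-tune.md, along the lines of:
#   (define_attr "tune" "large,small"
#     (const (symbol_ref "((enum attr_tune) aarch64_tune)")))
# which the pipeline descriptions then test via (eq_attr "tune" ...).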
aarch64-builtins.o: $(srcdir)/config/aarch64/aarch64-builtins.c $(CONFIG_H) \
$(SYSTEM_H) coretypes.h $(TM_H) \
$(RTL_H) $(TREE_H) expr.h $(TM_P_H) $(RECOG_H) langhooks.h \
$(DIAGNOSTIC_CORE_H) $(OPTABS_H)
$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \
$(srcdir)/config/aarch64/aarch64-builtins.c
# Machine description for AArch64 architecture.
# Copyright (C) 2009, 2010, 2011, 2012 Free Software Foundation, Inc.
# Contributed by ARM Ltd.
#
# This file is part of GCC.
#
# GCC is free software; you can redistribute it and/or modify it
# under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 3, or (at your option)
# any later version.
#
# GCC is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with GCC; see the file COPYING3. If not see
# <http://www.gnu.org/licenses/>.
LIB1ASMSRC = aarch64/lib1funcs.asm
LIB1ASMFUNCS = _aarch64_sync_cache_range