Let the target choose a vectorisation alignment

The vectoriser aligned vectors to TYPE_ALIGN unconditionally, although
there was also a hard-coded assumption that this was equal to the type
size.  This was inconvenient for SVE for two reasons:

- When compiling for a specific power-of-2 SVE vector length, we might
  want to align to a full vector.  However, the TYPE_ALIGN is governed
  by the ABI alignment, which is 128 bits regardless of size.

- For vector-length-agnostic code it doesn't usually make sense to align,
  since the runtime vector length might not be a power of two.  Even for
  power of two sizes, there's no guarantee that aligning to the previous
  16 bytes will be an improvement.

This patch therefore adds a target hook to control the preferred
vectoriser (as opposed to ABI) alignment.

2017-09-22  Richard Sandiford  <richard.sandiford@linaro.org>
	    Alan Hayward  <alan.hayward@arm.com>
	    David Sherwood  <david.sherwood@arm.com>

gcc/
	* target.def (preferred_vector_alignment): New hook.
	* doc/tm.texi.in (TARGET_VECTORIZE_PREFERRED_VECTOR_ALIGNMENT): New
	hook.
	* doc/tm.texi: Regenerate.
	* targhooks.h (default_preferred_vector_alignment): Declare.
	* targhooks.c (default_preferred_vector_alignment): New function.
	* tree-vectorizer.h (dataref_aux): Add a target_alignment field.
	Expand commentary.
	(DR_TARGET_ALIGNMENT): New macro.
	(aligned_access_p): Update commentary.
	(vect_known_alignment_in_bytes): New function.
	* tree-vect-data-refs.c (vect_calculate_required_alignment): New
	function.
	(vect_compute_data_ref_alignment): Set DR_TARGET_ALIGNMENT.
	Calculate the misalignment based on the target alignment rather than
	the vector size.
	(vect_update_misalignment_for_peel): Use DR_TARGET_ALIGNMENT
	rather than TYPE_ALIGN / BITS_PER_UNIT to update the misalignment.
	(vect_enhance_data_refs_alignment): Mask the byte misalignment with
	the target alignment, rather than masking the element misalignment
	with the number of elements in a vector.  Also use the target
	alignment when calculating the maximum number of peels.
	(vect_find_same_alignment_drs): Use vect_calculate_required_alignment
	instead of TYPE_ALIGN_UNIT.
	(vect_duplicate_ssa_name_ptr_info): Remove stmt_info parameter.
	Measure DR_MISALIGNMENT relative to DR_TARGET_ALIGNMENT.
	(vect_create_addr_base_for_vector_ref): Update call accordingly.
	(vect_create_data_ref_ptr): Likewise.
	(vect_setup_realignment): Realign by ANDing with
	-DR_TARGET_MISALIGNMENT.
	* tree-vect-loop-manip.c (vect_gen_prolog_loop_niters): Calculate
	the number of peels based on DR_TARGET_ALIGNMENT.
	* tree-vect-stmts.c (get_group_load_store_type): Compare the gap
	with the guaranteed alignment boundary when deciding whether
	overrun is OK.
	(vectorizable_mask_load_store): Interpret DR_MISALIGNMENT
	relative to DR_TARGET_ALIGNMENT instead of TYPE_ALIGN_UNIT.
	(ensure_base_align): Remove stmt_info parameter.  Get the
	target base alignment from DR_TARGET_ALIGNMENT.
	(vectorizable_store): Update call accordingly.  Interpret
	DR_MISALIGNMENT relative to DR_TARGET_ALIGNMENT instead of
	TYPE_ALIGN_UNIT.
	(vectorizable_load): Likewise.

gcc/testsuite/
	* gcc.dg/vect/vect-outer-3a.c: Adjust dump scan for new wording
	of alignment message.
	* gcc.dg/vect/vect-outer-3a-big-array.c: Likewise.

Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>

From-SVN: r253101

Commit f702e7d4, authored and committed by Richard Sandiford 7 years ago
13 changed files
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -2,6 +2,54 @@
 	    Alan Hayward  <alan.hayward@arm.com>
 	    David Sherwood  <david.sherwood@arm.com>

+	* target.def (preferred_vector_alignment): New hook.
+	* doc/tm.texi.in (TARGET_VECTORIZE_PREFERRED_VECTOR_ALIGNMENT): New
+	hook.
+	* doc/tm.texi: Regenerate.
+	* targhooks.h (default_preferred_vector_alignment): Declare.
+	* targhooks.c (default_preferred_vector_alignment): New function.
+	* tree-vectorizer.h (dataref_aux): Add a target_alignment field.
+	Expand commentary.
+	(DR_TARGET_ALIGNMENT): New macro.
+	(aligned_access_p): Update commentary.
+	(vect_known_alignment_in_bytes): New function.
+	* tree-vect-data-refs.c (vect_calculate_required_alignment): New
+	function.
+	(vect_compute_data_ref_alignment): Set DR_TARGET_ALIGNMENT.
+	Calculate the misalignment based on the target alignment rather than
+	the vector size.
+	(vect_update_misalignment_for_peel): Use DR_TARGET_ALIGMENT
+	rather than TYPE_ALIGN / BITS_PER_UNIT to update the misalignment.
+	(vect_enhance_data_refs_alignment): Mask the byte misalignment with
+	the target alignment, rather than masking the element misalignment
+	with the number of elements in a vector.  Also use the target
+	alignment when calculating the maximum number of peels.
+	(vect_find_same_alignment_drs): Use vect_calculate_required_alignment
+	instead of TYPE_ALIGN_UNIT.
+	(vect_duplicate_ssa_name_ptr_info): Remove stmt_info parameter.
+	Measure DR_MISALIGNMENT relative to DR_TARGET_ALIGNMENT.
+	(vect_create_addr_base_for_vector_ref): Update call accordingly.
+	(vect_create_data_ref_ptr): Likewise.
+	(vect_setup_realignment): Realign by ANDing with
+	-DR_TARGET_MISALIGNMENT.
+	* tree-vect-loop-manip.c (vect_gen_prolog_loop_niters): Calculate
+	the number of peels based on DR_TARGET_ALIGNMENT.
+	* tree-vect-stmts.c (get_group_load_store_type): Compare the gap
+	with the guaranteed alignment boundary when deciding whether
+	overrun is OK.
+	(vectorizable_mask_load_store): Interpret DR_MISALIGNMENT
+	relative to DR_TARGET_ALIGNMENT instead of TYPE_ALIGN_UNIT.
+	(ensure_base_align): Remove stmt_info parameter.  Get the
+	target base alignment from DR_TARGET_ALIGNMENT.
+	(vectorizable_store): Update call accordingly.   Interpret
+	DR_MISALIGNMENT relative to DR_TARGET_ALIGNMENT instead of
+	TYPE_ALIGN_UNIT.
+	(vectorizable_load): Likewise.
+
+2017-09-22  Richard Sandiford  <richard.sandiford@linaro.org>
+	    Alan Hayward  <alan.hayward@arm.com>
+	    David Sherwood  <david.sherwood@arm.com>
+
 	* tree-vectorizer.h (vect_get_scalar_dr_size): New function.
 	* tree-vect-data-refs.c (vect_update_misalignment_for_peel): Use it.
 	(vect_enhance_data_refs_alignment): Likewise.
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -5758,6 +5758,18 @@ For vector memory operations the cost may depend on type (@var{vectype}) and
 misalignment value (@var{misalign}).
 @end deftypefn

+@deftypefn {Target Hook} HOST_WIDE_INT TARGET_VECTORIZE_PREFERRED_VECTOR_ALIGNMENT (const_tree @var{type})
+This hook returns the preferred alignment in bits for accesses to
+vectors of type @var{type} in vectorized code.  This might be less than
+or greater than the ABI-defined value returned by
+@code{TARGET_VECTOR_ALIGNMENT}.  It can be equal to the alignment of
+a single element, in which case the vectorizer will not try to optimize
+for alignment.
+
+The default hook returns @code{TYPE_ALIGN (@var{type})}, which is
+correct for most targets.
+@end deftypefn
+
 @deftypefn {Target Hook} bool TARGET_VECTORIZE_VECTOR_ALIGNMENT_REACHABLE (const_tree @var{type}, bool @var{is_packed})
 Return true if vector alignment is reachable (by peeling N iterations) for the given scalar type @var{type}.  @var{is_packed} is false if the scalar access using @var{type} is known to be naturally aligned.
 @end deftypefn

--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -4088,6 +4088,8 @@ address;  but often a machine-dependent strategy can generate better code.

 @hook TARGET_VECTORIZE_BUILTIN_VECTORIZATION_COST

+@hook TARGET_VECTORIZE_PREFERRED_VECTOR_ALIGNMENT
+
 @hook TARGET_VECTORIZE_VECTOR_ALIGNMENT_REACHABLE

 @hook TARGET_VECTORIZE_VEC_PERM_CONST_OK

--- a/gcc/target.def
+++ b/gcc/target.def
@@ -1820,6 +1820,20 @@ misalignment value (@var{misalign}).",
 int, (enum vect_cost_for_stmt type_of_cost, tree vectype, int misalign),
 default_builtin_vectorization_cost)

+DEFHOOK
+(preferred_vector_alignment,
+ "This hook returns the preferred alignment in bits for accesses to\n\
+vectors of type @var{type} in vectorized code.  This might be less than\n\
+or greater than the ABI-defined value returned by\n\
+@code{TARGET_VECTOR_ALIGNMENT}.  It can be equal to the alignment of\n\
+a single element, in which case the vectorizer will not try to optimize\n\
+for alignment.\n\
+\n\
+The default hook returns @code{TYPE_ALIGN (@var{type})}, which is\n\
+correct for most targets.",
+ HOST_WIDE_INT, (const_tree type),
+ default_preferred_vector_alignment)
+
 /* Return true if vector alignment is reachable (by peeling N
   iterations) for the given scalar type.  */
 DEFHOOK

--- a/gcc/targhooks.c
+++ b/gcc/targhooks.c
@@ -1175,6 +1175,15 @@ default_vector_alignment (const_tree type)
  return align;
 }

+/* The default implementation of
+   TARGET_VECTORIZE_PREFERRED_VECTOR_ALIGNMENT.  */
+
+HOST_WIDE_INT
+default_preferred_vector_alignment (const_tree type)
+{
+  return TYPE_ALIGN (type);
+}
+
 /* By default assume vectors of element TYPE require a multiple of the natural
   alignment of TYPE.  TYPE is naturally aligned if IS_PACKED is false.  */
 bool
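
As a rough illustration of the policy described in the commit message, the sketch below models the new hook outside GCC. Everything in it is hypothetical: `struct vec_type_model` and the two `model_*` functions are stand-ins invented for this example, whereas a real hook receives a `const_tree` and queries it with GCC's own macros. The SVE-like policy is only one plausible choice under the stated assumptions, not the AArch64 port's actual code.

```c
/* Hypothetical stand-in for the properties a target would read from a
   vector type.  GCC's real hook takes a const_tree instead.  */
struct vec_type_model
{
  unsigned int abi_align_bits;   /* models TYPE_ALIGN (type) */
  unsigned int elem_align_bits;  /* alignment of a single element */
  unsigned int vector_size_bits; /* 0 if not known at compile time */
};

/* Models the default hook: prefer the ABI alignment.  */
static unsigned int
model_default_preferred_alignment (const struct vec_type_model *t)
{
  return t->abi_align_bits;
}

/* Models an SVE-like policy (an assumption made for this sketch):
   align to a full vector when the length is a known power of two,
   otherwise fall back to element alignment, which tells the
   vectorizer not to optimize for alignment at all.  */
static unsigned int
model_sve_like_preferred_alignment (const struct vec_type_model *t)
{
  unsigned int size = t->vector_size_bits;
  if (size != 0 && (size & (size - 1)) == 0)
    return size;
  return t->elem_align_bits;
}
```

With a 256-bit vector of 32-bit elements and a 128-bit ABI alignment, the default model returns 128 while the SVE-like model returns 256. With an unknown length the SVE-like model returns 32, the element alignment.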

--- a/gcc/targhooks.h
+++ b/gcc/targhooks.h
@@ -95,6 +95,7 @@ extern tree default_builtin_reciprocal (tree);

 extern HOST_WIDE_INT default_vector_alignment (const_tree);

+extern HOST_WIDE_INT default_preferred_vector_alignment (const_tree);
 extern bool default_builtin_vector_alignment_reachable (const_tree, bool);
 extern bool
 default_builtin_support_vector_misalignment (machine_mode mode,

--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
+2017-09-22  Richard Sandiford  <richard.sandiford@linaro.org>
+	    Alan Hayward  <alan.hayward@arm.com>
+	    David Sherwood  <david.sherwood@arm.com>
+
+	* gcc.dg/vect/vect-outer-3a.c: Adjust dump scan for new wording
+	of alignment message.
+	* gcc.dg/vect/vect-outer-3a-big-array.c: Likewise.
+
 2017-09-22  Martin Sebor  <msebor@redhat.com>

 	PR c/81854

--- a/gcc/testsuite/gcc.dg/vect/vect-outer-3a-big-array.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-outer-3a-big-array.c
@@ -49,4 +49,4 @@ int main (void)
 }

 /* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" { xfail { vect_no_align && { ! vect_hw_misalign } } } } } */
-/* { dg-final { scan-tree-dump-times "step doesn't divide the vector-size" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-times "step doesn't divide the vector alignment" 1 "vect" } } */
--- a/gcc/testsuite/gcc.dg/vect/vect-outer-3a.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-outer-3a.c
@@ -49,4 +49,4 @@ int main (void)
 }

 /* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" { xfail { vect_no_align && { ! vect_hw_misalign } } } } } */
-/* { dg-final { scan-tree-dump-times "step doesn't divide the vector-size" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-times "step doesn't divide the vector alignment" 1 "vect" } } */
--- a/gcc/tree-vect-data-refs.c
+++ b/gcc/tree-vect-data-refs.c
--- a/gcc/tree-vect-loop-manip.c
+++ b/gcc/tree-vect-loop-manip.c
@@ -956,8 +956,7 @@ vect_gen_prolog_loop_niters (loop_vec_info loop_vinfo,
  gimple *dr_stmt = DR_STMT (dr);
  stmt_vec_info stmt_info = vinfo_for_stmt (dr_stmt);
  tree vectype = STMT_VINFO_VECTYPE (stmt_info);
-  int vectype_align = TYPE_ALIGN (vectype) / BITS_PER_UNIT;
-  int nelements = TYPE_VECTOR_SUBPARTS (vectype);
+  unsigned int target_align = DR_TARGET_ALIGNMENT (dr);

  if (LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo) > 0)
    {
@@ -978,32 +977,36 @@ vect_gen_prolog_loop_niters (loop_vec_info loop_vinfo,
      tree start_addr = vect_create_addr_base_for_vector_ref (dr_stmt,
 							      &stmts, offset);
      tree type = unsigned_type_for (TREE_TYPE (start_addr));
-      tree vectype_align_minus_1 = build_int_cst (type, vectype_align - 1);
-      HOST_WIDE_INT elem_size =
-                int_cst_value (TYPE_SIZE_UNIT (TREE_TYPE (vectype)));
+      tree target_align_minus_1 = build_int_cst (type, target_align - 1);
+      HOST_WIDE_INT elem_size
+	= int_cst_value (TYPE_SIZE_UNIT (TREE_TYPE (vectype)));
      tree elem_size_log = build_int_cst (type, exact_log2 (elem_size));
-      tree nelements_minus_1 = build_int_cst (type, nelements - 1);
-      tree nelements_tree = build_int_cst (type, nelements);
-      tree byte_misalign;
-      tree elem_misalign;
-
-      /* Create:  byte_misalign = addr & (vectype_align - 1)  */
-      byte_misalign =
-	fold_build2 (BIT_AND_EXPR, type, fold_convert (type, start_addr),
-		     vectype_align_minus_1);
-
-      /* Create:  elem_misalign = byte_misalign / element_size  */
-      elem_misalign =
-	fold_build2 (RSHIFT_EXPR, type, byte_misalign, elem_size_log);
-
-      /* Create:  (niters_type) (nelements - elem_misalign)&(nelements - 1)  */
+      HOST_WIDE_INT align_in_elems = target_align / elem_size;
+      tree align_in_elems_minus_1 = build_int_cst (type, align_in_elems - 1);
+      tree align_in_elems_tree = build_int_cst (type, align_in_elems);
+      tree misalign_in_bytes;
+      tree misalign_in_elems;
+
+      /* Create:  misalign_in_bytes = addr & (target_align - 1).  */
+      misalign_in_bytes
+	= fold_build2 (BIT_AND_EXPR, type, fold_convert (type, start_addr),
+		       target_align_minus_1);
+
+      /* Create:  misalign_in_elems = misalign_in_bytes / element_size.  */
+      misalign_in_elems
+	= fold_build2 (RSHIFT_EXPR, type, misalign_in_bytes, elem_size_log);
+
+      /* Create:  (niters_type) ((align_in_elems - misalign_in_elems)
+				 & (align_in_elems - 1)).  */
      if (negative)
-	iters = fold_build2 (MINUS_EXPR, type, elem_misalign, nelements_tree);
+	iters = fold_build2 (MINUS_EXPR, type, misalign_in_elems,
+			     align_in_elems_tree);
      else
-	iters = fold_build2 (MINUS_EXPR, type, nelements_tree, elem_misalign);
-      iters = fold_build2 (BIT_AND_EXPR, type, iters, nelements_minus_1);
+	iters = fold_build2 (MINUS_EXPR, type, align_in_elems_tree,
+			     misalign_in_elems);
+      iters = fold_build2 (BIT_AND_EXPR, type, iters, align_in_elems_minus_1);
      iters = fold_convert (niters_type, iters);
-      *bound = nelements - 1;
+      *bound = align_in_elems - 1;
    }

  if (dump_enabled_p ())
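
The arithmetic that this hunk builds as trees can be checked on plain integers. The sketch below is a standalone model, not GCC code: `model_prolog_peel_count` is a hypothetical name, it covers only the positive-step branch, and it assumes that both the target alignment and the element size are powers of two with the element size no larger than the alignment.

```c
#include <stdint.h>

/* Number of scalar iterations to peel so that ADDR reaches a
   TARGET_ALIGN boundary.  TARGET_ALIGN and ELEM_SIZE are in bytes.
   Assumes both are powers of two and ELEM_SIZE <= TARGET_ALIGN.  */
static unsigned int
model_prolog_peel_count (uintptr_t addr, unsigned int target_align,
			 unsigned int elem_size)
{
  unsigned int align_in_elems = target_align / elem_size;

  /* misalign_in_bytes = addr & (target_align - 1).  */
  unsigned int misalign_in_bytes
    = (unsigned int) (addr & (uintptr_t) (target_align - 1));

  /* misalign_in_elems = misalign_in_bytes / element_size.  */
  unsigned int misalign_in_elems = misalign_in_bytes / elem_size;

  /* (align_in_elems - misalign_in_elems) & (align_in_elems - 1).  */
  return (align_in_elems - misalign_in_elems) & (align_in_elems - 1);
}
```

For 4-byte elements at address 0x1008, a 16-byte target alignment gives 2 peeled iterations and a 32-byte target alignment gives 6. When the target alignment equals the element size, `align_in_elems` is 1 and the result is always 0, so nothing is peeled.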

--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -1737,6 +1737,7 @@ get_group_load_store_type (gimple *stmt, tree vectype, bool slp,
  loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
  struct loop *loop = loop_vinfo ? LOOP_VINFO_LOOP (loop_vinfo) : NULL;
  gimple *first_stmt = GROUP_FIRST_ELEMENT (stmt_info);
+  data_reference *first_dr = STMT_VINFO_DATA_REF (vinfo_for_stmt (first_stmt));
  unsigned int group_size = GROUP_SIZE (vinfo_for_stmt (first_stmt));
  bool single_element_p = (stmt == first_stmt
 			   && !GROUP_NEXT_ELEMENT (stmt_info));
@@ -1780,10 +1781,13 @@ get_group_load_store_type (gimple *stmt, tree vectype, bool slp,
 			       " non-consecutive accesses\n");
 	      return false;
 	    }
-	  /* If the access is aligned an overrun is fine.  */
+	  /* An overrun is fine if the trailing elements are smaller
+	     than the alignment boundary B.  Every vector access will
+	     be a multiple of B and so we are guaranteed to access a
+	     non-gap element in the same B-sized block.  */
 	  if (overrun_p
-	      && aligned_access_p
-		   (STMT_VINFO_DATA_REF (vinfo_for_stmt (first_stmt))))
+	      && gap < (vect_known_alignment_in_bytes (first_dr)
+			/ vect_get_scalar_dr_size (first_dr)))
 	    overrun_p = false;
 	  if (overrun_p && !can_overrun_p)
 	    {
@@ -1804,14 +1808,15 @@ get_group_load_store_type (gimple *stmt, tree vectype, bool slp,
      /* If there is a gap at the end of the group then these optimizations
 	 would access excess elements in the last iteration.  */
      bool would_overrun_p = (gap != 0);
-      /* If the access is aligned an overrun is fine, but only if the
-         overrun is not inside an unused vector (if the gap is as large
-	 or larger than a vector).  */
+      /* An overrun is fine if the trailing elements are smaller than the
+	 alignment boundary B.  Every vector access will be a multiple of B
+	 and so we are guaranteed to access a non-gap element in the
+	 same B-sized block.  */
      if (would_overrun_p
-	  && gap < nunits
-	  && aligned_access_p
-		(STMT_VINFO_DATA_REF (vinfo_for_stmt (first_stmt))))
+	  && gap < (vect_known_alignment_in_bytes (first_dr)
+		    / vect_get_scalar_dr_size (first_dr)))
 	would_overrun_p = false;
+
      if (!STMT_VINFO_STRIDED_P (stmt_info)
 	  && (can_overrun_p || !would_overrun_p)
 	  && compare_step_with_zero (stmt) > 0)
@@ -2351,7 +2356,7 @@ vectorizable_mask_load_store (gimple *stmt, gimple_stmt_iterator *gsi,
 					     TYPE_SIZE_UNIT (vectype));
 	    }

-	  align = TYPE_ALIGN_UNIT (vectype);
+	  align = DR_TARGET_ALIGNMENT (dr);
 	  if (aligned_access_p (dr))
 	    misalign = 0;
 	  else if (DR_MISALIGNMENT (dr) == -1)
@@ -2404,7 +2409,7 @@ vectorizable_mask_load_store (gimple *stmt, gimple_stmt_iterator *gsi,
 					     TYPE_SIZE_UNIT (vectype));
 	    }

-	  align = TYPE_ALIGN_UNIT (vectype);
+	  align = DR_TARGET_ALIGNMENT (dr);
 	  if (aligned_access_p (dr))
 	    misalign = 0;
 	  else if (DR_MISALIGNMENT (dr) == -1)
@@ -5553,25 +5558,25 @@ vectorizable_operation (gimple *stmt, gimple_stmt_iterator *gsi,
  return true;
 }

-/* A helper function to ensure data reference DR's base alignment
-   for STMT_INFO.  */
+/* A helper function to ensure data reference DR's base alignment.  */

 static void
-ensure_base_align (stmt_vec_info stmt_info, struct data_reference *dr)
+ensure_base_align (struct data_reference *dr)
 {
  if (!dr->aux)
    return;

  if (DR_VECT_AUX (dr)->base_misaligned)
    {
-      tree vectype = STMT_VINFO_VECTYPE (stmt_info);
      tree base_decl = DR_VECT_AUX (dr)->base_decl;

+      unsigned int align_base_to = DR_TARGET_ALIGNMENT (dr) * BITS_PER_UNIT;
+
      if (decl_in_symtab_p (base_decl))
-	symtab_node::get (base_decl)->increase_alignment (TYPE_ALIGN (vectype));
+	symtab_node::get (base_decl)->increase_alignment (align_base_to);
      else
 	{
-          SET_DECL_ALIGN (base_decl, TYPE_ALIGN (vectype));
+	  SET_DECL_ALIGN (base_decl, align_base_to);
          DECL_USER_ALIGN (base_decl) = 1;
 	}
      DR_VECT_AUX (dr)->base_misaligned = false;
@@ -5775,7 +5780,7 @@ vectorizable_store (gimple *stmt, gimple_stmt_iterator *gsi, gimple **vec_stmt,

  /* Transform.  */

-  ensure_base_align (stmt_info, dr);
+  ensure_base_align (dr);

  if (memory_access_type == VMAT_GATHER_SCATTER)
    {
@@ -6417,7 +6422,7 @@ vectorizable_store (gimple *stmt, gimple_stmt_iterator *gsi, gimple **vec_stmt,
 				      dataref_offset
 				      ? dataref_offset
 				      : build_int_cst (ref_type, 0));
-	      align = TYPE_ALIGN_UNIT (vectype);
+	      align = DR_TARGET_ALIGNMENT (first_dr);
 	      if (aligned_access_p (first_dr))
 		misalign = 0;
 	      else if (DR_MISALIGNMENT (first_dr) == -1)
@@ -6813,7 +6818,7 @@ vectorizable_load (gimple *stmt, gimple_stmt_iterator *gsi, gimple **vec_stmt,

  /* Transform.  */

-  ensure_base_align (stmt_info, dr);
+  ensure_base_align (dr);

  if (memory_access_type == VMAT_GATHER_SCATTER)
    {
@@ -7512,7 +7517,7 @@ vectorizable_load (gimple *stmt, gimple_stmt_iterator *gsi, gimple **vec_stmt,
 				     dataref_offset
 				     ? dataref_offset
 				     : build_int_cst (ref_type, 0));
-		    align = TYPE_ALIGN_UNIT (vectype);
+		    align = DR_TARGET_ALIGNMENT (dr);
 		    if (alignment_support_scheme == dr_aligned)
 		      {
 			gcc_assert (aligned_access_p (first_dr));
@@ -7555,11 +7560,12 @@ vectorizable_load (gimple *stmt, gimple_stmt_iterator *gsi, gimple **vec_stmt,
 		      ptr = copy_ssa_name (dataref_ptr);
 		    else
 		      ptr = make_ssa_name (TREE_TYPE (dataref_ptr));
+		    unsigned int align = DR_TARGET_ALIGNMENT (first_dr);
 		    new_stmt = gimple_build_assign
 				 (ptr, BIT_AND_EXPR, dataref_ptr,
 				  build_int_cst
 				  (TREE_TYPE (dataref_ptr),
-				   -(HOST_WIDE_INT)TYPE_ALIGN_UNIT (vectype)));
+				   -(HOST_WIDE_INT) align));
 		    vect_finish_stmt_generation (stmt, new_stmt, gsi);
 		    data_ref
 		      = build2 (MEM_REF, vectype, ptr,
@@ -7581,8 +7587,7 @@ vectorizable_load (gimple *stmt, gimple_stmt_iterator *gsi, gimple **vec_stmt,
 		    new_stmt = gimple_build_assign
 				 (NULL_TREE, BIT_AND_EXPR, ptr,
 				  build_int_cst
-				  (TREE_TYPE (ptr),
-				   -(HOST_WIDE_INT)TYPE_ALIGN_UNIT (vectype)));
+				  (TREE_TYPE (ptr), -(HOST_WIDE_INT) align));
 		    ptr = copy_ssa_name (ptr, new_stmt);
 		    gimple_assign_set_lhs (new_stmt, ptr);
 		    vect_finish_stmt_generation (stmt, new_stmt, gsi);
@@ -7592,20 +7597,22 @@ vectorizable_load (gimple *stmt, gimple_stmt_iterator *gsi, gimple **vec_stmt,
 		    break;
 		  }
 		case dr_explicit_realign_optimized:
-		  if (TREE_CODE (dataref_ptr) == SSA_NAME)
-		    new_temp = copy_ssa_name (dataref_ptr);
-		  else
-		    new_temp = make_ssa_name (TREE_TYPE (dataref_ptr));
-		  new_stmt = gimple_build_assign
-			       (new_temp, BIT_AND_EXPR, dataref_ptr,
-				build_int_cst
-				  (TREE_TYPE (dataref_ptr),
-				   -(HOST_WIDE_INT)TYPE_ALIGN_UNIT (vectype)));
-		  vect_finish_stmt_generation (stmt, new_stmt, gsi);
-		  data_ref
-		    = build2 (MEM_REF, vectype, new_temp,
-			      build_int_cst (ref_type, 0));
-		  break;
+		  {
+		    if (TREE_CODE (dataref_ptr) == SSA_NAME)
+		      new_temp = copy_ssa_name (dataref_ptr);
+		    else
+		      new_temp = make_ssa_name (TREE_TYPE (dataref_ptr));
+		    unsigned int align = DR_TARGET_ALIGNMENT (first_dr);
+		    new_stmt = gimple_build_assign
+		      (new_temp, BIT_AND_EXPR, dataref_ptr,
+		       build_int_cst (TREE_TYPE (dataref_ptr),
+				     -(HOST_WIDE_INT) align));
+		    vect_finish_stmt_generation (stmt, new_stmt, gsi);
+		    data_ref
+		      = build2 (MEM_REF, vectype, new_temp,
+				build_int_cst (ref_type, 0));
+		    break;
+		  }
 		default:
 		  gcc_unreachable ();
 		}
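
The realignment cases above all mask the pointer with the negated alignment. A minimal standalone model of that step follows. `model_realign_down` is a hypothetical helper written for this illustration, and it assumes the alignment is a power of two given in bytes.

```c
#include <stdint.h>

/* Round ADDR down to a multiple of ALIGN by ANDing with -ALIGN,
   mirroring the BIT_AND_EXPR with -(HOST_WIDE_INT) align in the diff.
   Assumes ALIGN is a power of two.  */
static uintptr_t
model_realign_down (uintptr_t addr, unsigned int align)
{
  return addr & (uintptr_t) -(intptr_t) align;
}
```

For address 0x1234 this yields 0x1230 with a 16-byte alignment and 0x1220 with a 32-byte alignment, which is why the mask has to come from the target alignment rather than from `TYPE_ALIGN_UNIT (vectype)` once the two can differ.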

--- a/gcc/tree-vectorizer.h
+++ b/gcc/tree-vectorizer.h
@@ -790,7 +790,11 @@ STMT_VINFO_BB_VINFO (stmt_vec_info stmt_vinfo)
 #define STMT_SLP_TYPE(S)                   (S)->slp_type

 struct dataref_aux {
+  /* The misalignment in bytes of the reference, or -1 if not known.  */
  int misalignment;
+  /* The byte alignment that we'd ideally like the reference to have,
+     and the value that misalignment is measured against.  */
+  int target_alignment;
  /* If true the alignment of base_decl needs to be increased.  */
  bool base_misaligned;
  tree base_decl;
@@ -1037,7 +1041,11 @@ dr_misalignment (struct data_reference *dr)
 #define SET_DR_MISALIGNMENT(DR, VAL) set_dr_misalignment (DR, VAL)
 #define DR_MISALIGNMENT_UNKNOWN (-1)

-/* Return TRUE if the data access is aligned, and FALSE otherwise.  */
+/* Only defined once DR_MISALIGNMENT is defined.  */
+#define DR_TARGET_ALIGNMENT(DR) DR_VECT_AUX (DR)->target_alignment
+
+/* Return true if data access DR is aligned to its target alignment
+   (which may be less than a full vector).  */

 static inline bool
 aligned_access_p (struct data_reference *data_ref_info)
@@ -1054,6 +1062,19 @@ known_alignment_for_access_p (struct data_reference *data_ref_info)
  return (DR_MISALIGNMENT (data_ref_info) != DR_MISALIGNMENT_UNKNOWN);
 }

+/* Return the minimum alignment in bytes that the vectorized version
+   of DR is guaranteed to have.  */
+
+static inline unsigned int
+vect_known_alignment_in_bytes (struct data_reference *dr)
+{
+  if (DR_MISALIGNMENT (dr) == DR_MISALIGNMENT_UNKNOWN)
+    return TYPE_ALIGN_UNIT (TREE_TYPE (DR_REF (dr)));
+  if (DR_MISALIGNMENT (dr) == 0)
+    return DR_TARGET_ALIGNMENT (dr);
+  return DR_MISALIGNMENT (dr) & -DR_MISALIGNMENT (dr);
+}
+
 /* Return the behavior of DR with respect to the vectorization context
   (which for outer loop vectorization might not be the behavior recorded
   in DR itself).  */
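
To see how `vect_known_alignment_in_bytes` feeds the overrun test in `get_group_load_store_type`, the following standalone model may help. The struct and both function names are hypothetical, and the element alignment and size fields stand in for `TYPE_ALIGN_UNIT (TREE_TYPE (DR_REF (dr)))` and `vect_get_scalar_dr_size`. It is a sketch of the logic shown in the diff, not a copy of GCC's data structures.

```c
#define MODEL_MISALIGNMENT_UNKNOWN (-1)

/* Hypothetical, simplified view of a data reference.  */
struct dr_model
{
  int misalignment;              /* bytes, or MODEL_MISALIGNMENT_UNKNOWN */
  int target_alignment;          /* bytes, power of two */
  unsigned int elem_align_bytes; /* alignment of the scalar type */
  unsigned int elem_size_bytes;  /* size of the scalar type */
};

/* Minimum alignment in bytes that the access is guaranteed to have.  */
static unsigned int
model_known_alignment_in_bytes (const struct dr_model *dr)
{
  if (dr->misalignment == MODEL_MISALIGNMENT_UNKNOWN)
    return dr->elem_align_bytes;
  if (dr->misalignment == 0)
    return (unsigned int) dr->target_alignment;
  /* Lowest set bit of the misalignment.  */
  return (unsigned int) (dr->misalignment & -dr->misalignment);
}

/* Nonzero if a trailing gap of GAP elements may be overrun safely,
   i.e. the gap is smaller than the guaranteed alignment boundary
   measured in elements.  */
static int
model_overrun_ok_p (const struct dr_model *dr, unsigned int gap)
{
  return gap < (model_known_alignment_in_bytes (dr)
		/ dr->elem_size_bytes);
}
```

With 4-byte elements and a 16-byte target alignment, an aligned access tolerates a gap of up to 3 elements. A misalignment of 8 bytes guarantees only 8-byte alignment, so the limit drops to 1 element, and a misalignment of 12 bytes or an unknown misalignment tolerates no gap.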