1. 03 Jan, 2018 30 commits
    • poly_int: REGMODE_NATURAL_SIZE · fad2288b
      This patch makes target-independent code that uses REGMODE_NATURAL_SIZE
      treat it as a poly_int rather than a constant.
      
      2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>
      	    Alan Hayward  <alan.hayward@arm.com>
      	    David Sherwood  <david.sherwood@arm.com>
      
      gcc/
      	* combine.c (can_change_dest_mode): Handle polynomial
      	REGMODE_NATURAL_SIZE.
      	* expmed.c (store_bit_field_1): Likewise.
      	* expr.c (store_constructor): Likewise.
      	* emit-rtl.c (validate_subreg): Operate on polynomial mode sizes
      	and polynomial REGMODE_NATURAL_SIZE.
      	(gen_lowpart_common): Likewise.
      	* reginfo.c (record_subregs_of_mode): Likewise.
      	* rtlanal.c (read_modify_subreg_p): Likewise.
      
      Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
      Co-Authored-By: David Sherwood <david.sherwood@arm.com>
      
      From-SVN: r256149
      Richard Sandiford committed
    • poly_int: expand_vector_ubsan_overflow · 07626e49
      This patch makes expand_vector_ubsan_overflow cope with a polynomial
      number of elements.
      
      2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>
      	    Alan Hayward  <alan.hayward@arm.com>
      	    David Sherwood  <david.sherwood@arm.com>
      
      gcc/
      	* internal-fn.c (expand_vector_ubsan_overflow): Handle polynomial
      	numbers of elements.
      
      Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
      Co-Authored-By: David Sherwood <david.sherwood@arm.com>
      
      From-SVN: r256148
      Richard Sandiford committed
    • poly_int: folding BIT_FIELD_REFs on vectors · d34457c1
      This patch makes the:
      
        (BIT_FIELD_REF CONSTRUCTOR@0 @1 @2)
      
      folder cope with polynomial numbers of elements.
      
      2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>
      	    Alan Hayward  <alan.hayward@arm.com>
      	    David Sherwood  <david.sherwood@arm.com>
      
      gcc/
      	* match.pd: Cope with polynomial numbers of vector elements.
      
      Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
      Co-Authored-By: David Sherwood <david.sherwood@arm.com>
      
      From-SVN: r256147
      Richard Sandiford committed
    • poly_int: fold_indirect_ref_1 · fece509b
      This patch makes fold_indirect_ref_1 handle polynomial offsets in
      a POINTER_PLUS_EXPR.  The specific reason for doing this now is
      to handle:
      
       		  (tree_to_uhwi (part_width) / BITS_PER_UNIT
       		   * TYPE_VECTOR_SUBPARTS (op00type));
      
      when TYPE_VECTOR_SUBPARTS becomes a poly_int.
      
      2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>
      	    Alan Hayward  <alan.hayward@arm.com>
      	    David Sherwood  <david.sherwood@arm.com>
      
      gcc/
      	* fold-const.c (fold_indirect_ref_1): Handle polynomial offsets
      	in a POINTER_PLUS_EXPR.
      
      Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
      Co-Authored-By: David Sherwood <david.sherwood@arm.com>
      
      From-SVN: r256146
      Richard Sandiford committed
    • poly_int: omp-simd-clone.c · d8f860ef
      This patch adds a wrapper around TYPE_VECTOR_SUBPARTS for omp-simd-clone.c.
      Supporting SIMD clones for variable-length vectors is post-GCC 8 work.
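
      As a rough illustration of what such a wrapper might look like, the
      sketch below models a poly_int-style element count as c0 + c1 * x.
      The toy_poly type, its members and toy_simd_clone_subparts are
      hypothetical stand-ins invented for this example; they are not GCC's
      poly_int API, and the real function may differ.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a poly_int-style value: c0 + c1 * x, where x is
// a nonnegative quantity that is only known at run time.
struct toy_poly
{
  uint64_t c0;
  uint64_t c1;

  // True if the value does not depend on x.
  bool is_constant () const { return c1 == 0; }

  // Return the value as a plain integer; only valid for constant values.
  uint64_t to_constant () const
  {
    assert (is_constant ());
    return c0;
  }
};

// Toy wrapper: code that only deals with fixed-length vectors asks for the
// element count through one helper, so the constant-length assumption is
// checked in a single place.
uint64_t
toy_simd_clone_subparts (toy_poly nunits)
{
  return nunits.to_constant ();
}
```

      Routing every use through one helper keeps the fixed-length assumption
      easy to find when variable-length support is added later.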
      
      2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>
      	    Alan Hayward  <alan.hayward@arm.com>
      	    David Sherwood  <david.sherwood@arm.com>
      
      gcc/
      	* omp-simd-clone.c (simd_clone_subparts): New function.
      	(simd_clone_init_simd_arrays): Use it instead of TYPE_VECTOR_SUBPARTS.
      	(ipa_simd_modify_function_body): Likewise.
      
      Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
      Co-Authored-By: David Sherwood <david.sherwood@arm.com>
      
      From-SVN: r256145
      Richard Sandiford committed
    • poly_int: brig vector elements · e112bba2
      This patch adds a brig-specific wrapper around TYPE_VECTOR_SUBPARTS,
      since presumably it will never need to support variable vector lengths.
      
      2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>
      	    Alan Hayward  <alan.hayward@arm.com>
      	    David Sherwood  <david.sherwood@arm.com>
      
      gcc/brig/
      	* brigfrontend/brig-util.h (gccbrig_type_vector_subparts): New
      	function.
      	* brigfrontend/brig-basic-inst-handler.cc
      	(brig_basic_inst_handler::build_shuffle): Use it instead of
      	TYPE_VECTOR_SUBPARTS.
      	(brig_basic_inst_handler::build_unpack): Likewise.
      	(brig_basic_inst_handler::build_pack): Likewise.
      	(brig_basic_inst_handler::build_unpack_lo_or_hi): Likewise.
      	(brig_basic_inst_handler::operator ()): Likewise.
      	(brig_basic_inst_handler::build_lower_element_broadcast): Likewise.
      	* brigfrontend/brig-code-entry-handler.cc
      	(brig_code_entry_handler::get_tree_cst_for_hsa_operand): Likewise.
      	(brig_code_entry_handler::get_comparison_result_type): Likewise.
      	(brig_code_entry_handler::expand_or_call_builtin): Likewise.
      
      Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
      Co-Authored-By: David Sherwood <david.sherwood@arm.com>
      
      From-SVN: r256144
      Richard Sandiford committed
    • poly_int: tree-vect-generic.c · 22afc2b3
      This patch makes tree-vect-generic.c cope with variable-length vectors.
      Decomposition is only supported for constant-length vectors, since we
      should never generate unsupported variable-length operations.
      
      2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>
      	    Alan Hayward  <alan.hayward@arm.com>
      	    David Sherwood  <david.sherwood@arm.com>
      
      gcc/
      	* tree-vect-generic.c (nunits_for_known_piecewise_op): New function.
      	(expand_vector_piecewise): Use it instead of TYPE_VECTOR_SUBPARTS.
      	(expand_vector_addition, add_rshift, expand_vector_divmod): Likewise.
      	(expand_vector_condition, vector_element): Likewise.
      	(subparts_gt): New function.
      	(get_compute_type): Use subparts_gt.
      	(count_type_subparts): Delete.
      	(expand_vector_operations_1): Use subparts_gt instead of
      	count_type_subparts.
      
      Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
      Co-Authored-By: David Sherwood <david.sherwood@arm.com>
      
      From-SVN: r256143
      Richard Sandiford committed
    • poly_int: vect_no_alias_p · b064d4f9
      This patch replaces the two-state vect_no_alias_p with a three-state
      vect_compile_time_alias that handles polynomial segment lengths.
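
      To show why a third state is needed once segment lengths can be
      polynomial, here is a minimal sketch under stated assumptions.  The
      toy_poly type, the helper names and the 0 / 1 / -1 encoding are all
      hypothetical choices made for this example, and the real function
      works on GCC's own data structures rather than on these types.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a polynomial value c0 + c1 * x with x >= 0.
struct toy_poly
{
  int64_t c0;
  int64_t c1;
};

// True if a <= b for every x >= 0.
bool
toy_known_le (toy_poly a, toy_poly b)
{
  return a.c0 <= b.c0 && a.c1 <= b.c1;
}

// True if a < b for every x >= 0.
bool
toy_known_lt (toy_poly a, toy_poly b)
{
  return a.c0 < b.c0 && a.c1 <= b.c1;
}

// Compare the segments [start_a, start_a + len_a) and
// [start_b, start_b + len_b).  Returns 0 if they are known not to overlap,
// 1 if they are known to overlap, and -1 if the answer depends on x.
int
toy_compile_time_alias (int64_t start_a, toy_poly len_a,
                        int64_t start_b, toy_poly len_b)
{
  // The sketch assumes both lengths are positive for every x >= 0.
  assert (len_a.c0 > 0 && len_a.c1 >= 0);
  assert (len_b.c0 > 0 && len_b.c1 >= 0);

  toy_poly sa = {start_a, 0};
  toy_poly sb = {start_b, 0};
  toy_poly end_a = {start_a + len_a.c0, len_a.c1};
  toy_poly end_b = {start_b + len_b.c0, len_b.c1};

  if (toy_known_le (end_a, sb) || toy_known_le (end_b, sa))
    return 0;
  if (toy_known_lt (sa, end_b) && toy_known_lt (sb, end_a))
    return 1;
  return -1;
}
```

      With constant lengths the -1 case never arises in this model; it only
      appears when a length grows with x and the gap between the segments
      is finite.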
      
      2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>
      	    Alan Hayward  <alan.hayward@arm.com>
      	    David Sherwood  <david.sherwood@arm.com>
      
      gcc/
      	* tree-vect-data-refs.c (vect_no_alias_p): Replace with...
      	(vect_compile_time_alias): ...this new function.  Do the calculation
      	on poly_ints rather than trees.
      	(vect_prune_runtime_alias_test_list): Update call accordingly.
      
      Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
      Co-Authored-By: David Sherwood <david.sherwood@arm.com>
      
      From-SVN: r256142
      Richard Sandiford committed
    • poly_int: two-operation SLP · dad55d70
      This patch makes two-operation SLP handle but reject variable-length
      vectors.  Adding support for this is a post-GCC8 thing.
      
      2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>
      	    Alan Hayward  <alan.hayward@arm.com>
      	    David Sherwood  <david.sherwood@arm.com>
      
      gcc/
      	* tree-vect-slp.c (vect_build_slp_tree_1): Handle polynomial
      	numbers of units.
      	(vect_schedule_slp_instance): Likewise.
      
      Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
      Co-Authored-By: David Sherwood <david.sherwood@arm.com>
      
      From-SVN: r256141
      Richard Sandiford committed
    • poly_int: vect_get_constant_vectors · a23644f2
      For now, vect_get_constant_vectors can only cope with constant-length
      vectors, although a patch after the main SVE submission relaxes this.
      This patch adds an appropriate guard for variable-length vectors.
      The TYPE_VECTOR_SUBPARTS use in vect_get_constant_vectors will then
      have a to_constant call when TYPE_VECTOR_SUBPARTS becomes a poly_int.
      
      2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>
      	    Alan Hayward  <alan.hayward@arm.com>
      	    David Sherwood  <david.sherwood@arm.com>
      
      gcc/
      	* tree-vect-slp.c (vect_get_and_check_slp_defs): Reject
      	constant and extern definitions for variable-length vectors.
      	(vect_get_constant_vectors): Note that the number of units
      	is known to be constant.
      
      Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
      Co-Authored-By: David Sherwood <david.sherwood@arm.com>
      
      From-SVN: r256140
      Richard Sandiford committed
    • poly_int: vectorizable_conversion · 062d5ccc
      This patch makes vectorizable_conversion cope with variable-length
      vectors.  We already require the number of elements in one vector
      to be a multiple of the number of elements in the other vector,
      so the patch uses that to choose between widening and narrowing.
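
      The choice described above can be sketched roughly as follows.  This
      is a toy model: toy_poly, toy_multiple_p and the toy_modifier names are
      hypothetical, and the multiple test is a simplified version that
      assumes a nonzero constant term in the divisor.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for an element count of the form c0 + c1 * x.
struct toy_poly
{
  uint64_t c0;
  uint64_t c1;
};

// True if a is a constant multiple of b for every x.  Assumes b.c0 != 0.
bool
toy_multiple_p (toy_poly a, toy_poly b)
{
  assert (b.c0 != 0);
  if (a.c0 % b.c0 != 0)
    return false;
  uint64_t factor = a.c0 / b.c0;
  return a.c1 == factor * b.c1;
}

enum toy_modifier { TOY_NONE, TOY_NARROW, TOY_WIDEN };

// Pick the conversion kind from the element counts alone.  More output
// elements per vector means narrower elements, hence a narrowing
// conversion; fewer output elements means a widening one.
toy_modifier
toy_choose_modifier (toy_poly nunits_in, toy_poly nunits_out)
{
  if (nunits_in.c0 == nunits_out.c0 && nunits_in.c1 == nunits_out.c1)
    return TOY_NONE;
  if (toy_multiple_p (nunits_out, nunits_in))
    return TOY_NARROW;
  // One count is required to be a multiple of the other.
  assert (toy_multiple_p (nunits_in, nunits_out));
  return TOY_WIDEN;
}
```

      Because the decision uses only the "is a multiple of" relation, it
      works without knowing the runtime value of x.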
      
      2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>
      	    Alan Hayward  <alan.hayward@arm.com>
      	    David Sherwood  <david.sherwood@arm.com>
      
      gcc/
      	* tree-vect-stmts.c (vectorizable_conversion): Treat the number
      	of units as polynomial.  Choose between WIDE and NARROW based
      	on multiple_p.
      
      Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
      Co-Authored-By: David Sherwood <david.sherwood@arm.com>
      
      From-SVN: r256139
      Richard Sandiford committed
    • poly_int: vectorizable_simd_clone_call · cf1b2ba4
      This patch makes vectorizable_simd_clone_call cope with variable-length
      vectors.  For now we don't support SIMD clones for variable-length
      vectors; this will be post GCC 8 material.
      
      2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>
      	    Alan Hayward  <alan.hayward@arm.com>
      	    David Sherwood  <david.sherwood@arm.com>
      
      gcc/
      	* tree-vect-stmts.c (simd_clone_subparts): New function.
      	(vectorizable_simd_clone_call): Use it instead of TYPE_VECTOR_SUBPARTS.
      
      Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
      Co-Authored-By: David Sherwood <david.sherwood@arm.com>
      
      From-SVN: r256138
      Richard Sandiford committed
    • poly_int: vectorizable_call · c7bda0f4
      This patch makes vectorizable_call handle variable-length vectors.
      The only substantial change is to use build_index_vector for
      IFN_GOMP_SIMD_LANE; this makes no functional difference for
      fixed-length vectors.
      
      2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>
      	    Alan Hayward  <alan.hayward@arm.com>
      	    David Sherwood  <david.sherwood@arm.com>
      
      gcc/
      	* tree-vect-stmts.c (vectorizable_call): Treat the number of
      	vectors as polynomial.  Use build_index_vector for
      	IFN_GOMP_SIMD_LANE.
      
      Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
      Co-Authored-By: David Sherwood <david.sherwood@arm.com>
      
      From-SVN: r256137
      Richard Sandiford committed
    • poly_int: vectorizable_load/store · 4d694b27
      This patch makes vectorizable_load and vectorizable_store cope with
      variable-length vectors.  The reverse and permute cases will be
      excluded by the code that checks the permutation mask (although a
      patch after the main SVE submission adds support for the reversed
      case).  Here we also need to exclude VMAT_ELEMENTWISE and
      VMAT_STRIDED_SLP, which split the operation up into a constant
      number of constant-sized operations.  We also don't try to extend
      the current widening gather/scatter support to variable-length
      vectors, since SVE uses a different approach.
      
      2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>
      	    Alan Hayward  <alan.hayward@arm.com>
      	    David Sherwood  <david.sherwood@arm.com>
      
      gcc/
      	* tree-vect-stmts.c (get_load_store_type): Treat the number of
      	units as polynomial.  Reject VMAT_ELEMENTWISE and VMAT_STRIDED_SLP
      	for variable-length vectors.
      	(vectorizable_mask_load_store): Treat the number of units as
      	polynomial, asserting that it is constant if the condition has
      	already been enforced.
      	(vectorizable_store, vectorizable_load): Likewise.
      
      Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
      Co-Authored-By: David Sherwood <david.sherwood@arm.com>
      
      From-SVN: r256136
      Richard Sandiford committed
    • poly_int: vectorizable_live_operation · fa780794
      This patch makes vectorizable_live_operation cope with variable-length
      vectors.  For now we just handle cases in which we can tell at compile
      time which vector contains the final result.
      
      2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>
      	    Alan Hayward  <alan.hayward@arm.com>
      	    David Sherwood  <david.sherwood@arm.com>
      
      gcc/
      	* tree-vect-loop.c (vectorizable_live_operation): Treat the number
      	of units as polynomial.  Punt if we can't tell at compile time
      	which vector contains the final result.
      
      Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
      Co-Authored-By: David Sherwood <david.sherwood@arm.com>
      
      From-SVN: r256135
      Richard Sandiford committed
    • poly_int: vectorizable_induction · 9fb9293a
      This patch makes vectorizable_induction cope with variable-length
      vectors.  For now we punt on SLP inductions, but patches after
      the main SVE submission add support for those too.
      
      2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>
      	    Alan Hayward  <alan.hayward@arm.com>
      	    David Sherwood  <david.sherwood@arm.com>
      
      gcc/
      	* tree-vect-loop.c (vectorizable_induction): Treat the number
      	of units as polynomial.  Punt on SLP inductions.  Use an integer
      	VEC_SERIES_EXPR for variable-length integer reductions.  Use a
      	cast of such a series for variable-length floating-point
      	reductions.
      
      Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
      Co-Authored-By: David Sherwood <david.sherwood@arm.com>
      
      From-SVN: r256134
      Richard Sandiford committed
    • poly_int: vectorizable_reduction · e54dd6d3
      This patch makes vectorizable_reduction cope with variable-length vectors.
      We can handle the simple case of an inner loop reduction for which
      the target has native support for the epilogue operation.  For now we
      punt on other cases, but patches after the main SVE submission allow
      SLP and double reductions too.
      
      2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>
      	    Alan Hayward  <alan.hayward@arm.com>
      	    David Sherwood  <david.sherwood@arm.com>
      
      gcc/
      	* tree.h (build_index_vector): Declare.
      	* tree.c (build_index_vector): New function.
      	* tree-vect-loop.c (get_initial_defs_for_reduction): Treat the number
      	of units as polynomial, forcibly converting it to a constant if
      	vectorizable_reduction has already enforced the condition.
      	(vect_create_epilog_for_reduction): Likewise.  Use build_index_vector
      	to create a {1,2,3,...} vector.
      	(vectorizable_reduction): Treat the number of units as polynomial.
      	Choose vectype_in based on the largest scalar element size rather
      	than the smallest number of units.  Enforce the restrictions
      	relied on above.
      
      Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
      Co-Authored-By: David Sherwood <david.sherwood@arm.com>
      
      From-SVN: r256133
      Richard Sandiford committed
    • poly_int: vector_alignment_reachable_p · 9031b367
      This patch makes vector_alignment_reachable_p cope with variable-length
      vectors.
      
      2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>
      	    Alan Hayward  <alan.hayward@arm.com>
      	    David Sherwood  <david.sherwood@arm.com>
      
      gcc/
      	* tree-vect-data-refs.c (vector_alignment_reachable_p): Treat the
      	number of units as polynomial.
      
      Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
      Co-Authored-By: David Sherwood <david.sherwood@arm.com>
      
      From-SVN: r256132
      Richard Sandiford committed
    • poly_int: current_vector_size and TARGET_AUTOVECTORIZE_VECTOR_SIZES · 86e36728
      This patch changes the type of current_vector_size to poly_uint64.
      It also changes TARGET_AUTOVECTORIZE_VECTOR_SIZES so that it fills
      in a vector of possible sizes (as poly_uint64s) instead of returning
      a bitmask.  The documentation claimed that the hook didn't need to
      include the default vector size (returned by preferred_simd_mode),
      but that wasn't consistent with the omp-low.c usage.
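
      A small sketch of the interface change, using hypothetical toy_ names
      rather than the real hook signatures: a bitmask can only describe
      constant power-of-two sizes, whereas a list can also carry a size
      that depends on a runtime quantity.  The sizes used and their order
      are arbitrary choices for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a polynomial size in bytes: c0 + c1 * x.
struct toy_poly
{
  uint64_t c0;
  uint64_t c1;
};

typedef std::vector<toy_poly> toy_vector_sizes;

// Old style: each set bit is one supported vector size in bytes.
unsigned int
toy_sizes_as_bitmask ()
{
  return 32 | 16;
}

// New style: the hook fills in a caller-provided list of sizes.  The first
// entry here is a variable-length size that no bitmask could express.
void
toy_sizes_as_list (toy_vector_sizes *sizes)
{
  assert (sizes != nullptr);
  sizes->push_back (toy_poly{16, 16});
  sizes->push_back (toy_poly{16, 0});
}

// Expand an old-style bitmask into the equivalent list, largest size first.
toy_vector_sizes
toy_bitmask_to_list (unsigned int mask)
{
  toy_vector_sizes sizes;
  for (unsigned int bit = 1u << 31; bit != 0; bit >>= 1)
    if (mask & bit)
      sizes.push_back (toy_poly{bit, 0});
  return sizes;
}
```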
      
      2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>
      	    Alan Hayward  <alan.hayward@arm.com>
      	    David Sherwood  <david.sherwood@arm.com>
      
      gcc/
      	* target.h (vector_sizes, auto_vector_sizes): New typedefs.
      	* target.def (autovectorize_vector_sizes): Return the vector sizes
      	by pointer, using vector_sizes rather than a bitmask.
      	* targhooks.h (default_autovectorize_vector_sizes): Update accordingly.
      	* targhooks.c (default_autovectorize_vector_sizes): Likewise.
      	* config/aarch64/aarch64.c (aarch64_autovectorize_vector_sizes):
      	Likewise.
      	* config/arc/arc.c (arc_autovectorize_vector_sizes): Likewise.
      	* config/arm/arm.c (arm_autovectorize_vector_sizes): Likewise.
      	* config/i386/i386.c (ix86_autovectorize_vector_sizes): Likewise.
      	* config/mips/mips.c (mips_autovectorize_vector_sizes): Likewise.
      	* omp-general.c (omp_max_vf): Likewise.
      	* omp-low.c (omp_clause_aligned_alignment): Likewise.
      	* optabs-query.c (can_vec_mask_load_store_p): Likewise.
      	* tree-vect-loop.c (vect_analyze_loop): Likewise.
      	* tree-vect-slp.c (vect_slp_bb): Likewise.
      	* doc/tm.texi: Regenerate.
      	* tree-vectorizer.h (current_vector_size): Change from an unsigned int
      	to a poly_uint64.
      	* tree-vect-stmts.c (get_vectype_for_scalar_type_and_size): Take
      	the vector size as a poly_uint64 rather than an unsigned int.
      	(current_vector_size): Change from an unsigned int to a poly_uint64.
      	(get_vectype_for_scalar_type): Update accordingly.
      	* tree.h (build_truth_vector_type): Take the size and number of
      	units as a poly_uint64 rather than an unsigned int.
      	(build_vector_type): Add a temporary overload that takes
      	the number of units as a poly_uint64 rather than an unsigned int.
      	* tree.c (make_vector_type): Likewise.
      	(build_truth_vector_type): Take the number of units as a poly_uint64
      	rather than an unsigned int.
      
      Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
      Co-Authored-By: David Sherwood <david.sherwood@arm.com>
      
      From-SVN: r256131
      Richard Sandiford committed
    • poly_int: get_mask_mode · 87133c45
      This patch makes TARGET_GET_MASK_MODE take polynomial nunits and
      vector_size arguments.  The gcc_assert in default_get_mask_mode
      is now handled by the exact_div call in vector_element_size.
      
      2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>
      	    Alan Hayward  <alan.hayward@arm.com>
      	    David Sherwood  <david.sherwood@arm.com>
      
      gcc/
      	* target.def (get_mask_mode): Take the number of units and length
      	as poly_uint64s rather than unsigned ints.
      	* targhooks.h (default_get_mask_mode): Update accordingly.
      	* targhooks.c (default_get_mask_mode): Likewise.
      	* config/i386/i386.c (ix86_get_mask_mode): Likewise.
      	* doc/tm.texi: Regenerate.
      
      Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
      Co-Authored-By: David Sherwood <david.sherwood@arm.com>
      
      From-SVN: r256130
      Richard Sandiford committed
    • poly_int: omp_max_vf · 9d2f08ab
      This patch makes omp_max_vf return a polynomial vectorization factor.
      We then need to be able to stash a polynomial value in
      OMP_CLAUSE_SAFELEN_EXPR too:
      
         /* If max_vf is non-zero, then we can use only a vectorization factor
            up to the max_vf we chose.  So stick it into the safelen clause.  */
      
      For now the cfgloop safelen is still constant though.
      
      2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>
      	    Alan Hayward  <alan.hayward@arm.com>
      	    David Sherwood  <david.sherwood@arm.com>
      
      gcc/
      	* omp-general.h (omp_max_vf): Return a poly_uint64 instead of an int.
      	* omp-general.c (omp_max_vf): Likewise.
      	* omp-expand.c (omp_adjust_chunk_size): Update call to omp_max_vf.
      	(expand_omp_simd): Handle polynomial safelen.
      	* omp-low.c (omplow_simd_context): Add a default constructor.
      	(omplow_simd_context::max_vf): Change from int to poly_uint64.
      	(lower_rec_simd_input_clauses): Update accordingly.
      	(lower_rec_input_clauses): Likewise.
      
      Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
      Co-Authored-By: David Sherwood <david.sherwood@arm.com>
      
      From-SVN: r256129
      Richard Sandiford committed
    • poly_int: vect_nunits_for_cost · c5126ce8
      This patch adds a function for getting the number of elements in
      a vector for cost purposes, which is always constant.  It makes
      it possible for a later patch to change GET_MODE_NUNITS and
      TYPE_VECTOR_SUBPARTS to a poly_int.
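
      One way to picture a constant element count "for cost purposes" is to
      evaluate the polynomial at a guessed value of its runtime variable.
      The sketch below does that; toy_poly, TOY_LIKELY_X and both function
      names are hypothetical, and the guess of 1 is an arbitrary assumption
      for the example rather than anything a real target uses.

```cpp
#include <cstdint>

// Hypothetical stand-in for an element count of the form c0 + c1 * x.
struct toy_poly
{
  uint64_t c0;
  uint64_t c1;
};

// Assumed guess for the runtime variable x, used only for cost estimates.
const uint64_t TOY_LIKELY_X = 1;

// Evaluate the polynomial at a particular guess for x.
uint64_t
toy_estimated_value (toy_poly value, uint64_t likely_x)
{
  return value.c0 + value.c1 * likely_x;
}

// Return a plain integer element count for cost calculations.  For a
// constant count (c1 == 0) this is exact; otherwise it is an estimate.
uint64_t
toy_nunits_for_cost (toy_poly nunits)
{
  return toy_estimated_value (nunits, TOY_LIKELY_X);
}
```

      Keeping the estimate behind one function means cost code can stay in
      ordinary integer arithmetic even when the true count is not constant.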
      
      2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>
      	    Alan Hayward  <alan.hayward@arm.com>
      	    David Sherwood  <david.sherwood@arm.com>
      
      gcc/
      	* tree-vectorizer.h (vect_nunits_for_cost): New function.
      	* tree-vect-loop.c (vect_model_reduction_cost): Use it.
      	* tree-vect-slp.c (vect_analyze_slp_cost_1): Likewise.
      	(vect_analyze_slp_cost): Likewise.
      	* tree-vect-stmts.c (vect_model_store_cost): Likewise.
      	(vect_model_load_cost): Likewise.
      
      Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
      Co-Authored-By: David Sherwood <david.sherwood@arm.com>
      
      From-SVN: r256128
      Richard Sandiford committed
    • poly_int: SLP max_units · 4b6068ea
      This patch makes tree-vect-slp.c track the maximum number of vector
      units as a poly_uint64 rather than an unsigned int.
      
      2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>
      	    Alan Hayward  <alan.hayward@arm.com>
      	    David Sherwood  <david.sherwood@arm.com>
      
      gcc/
      	* tree-vect-slp.c (vect_record_max_nunits, vect_build_slp_tree_1)
      	(vect_build_slp_tree_2, vect_build_slp_tree): Change max_nunits
      	from an unsigned int * to a poly_uint64_pod *.
      	(calculate_unrolling_factor): New function.
      	(vect_analyze_slp_instance): Use it.  Track polynomial max_nunits.
      
      Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
      Co-Authored-By: David Sherwood <david.sherwood@arm.com>
      
      From-SVN: r256127
      Richard Sandiford committed
    • poly_int: vectoriser vf and uf · d9f21f6a
      This patch changes the type of the vectorisation factor and SLP
      unrolling factor to poly_uint64.  This in turn required some knock-on
      changes in signedness elsewhere.
      
      Cost decisions are generally based on estimated_poly_value,
      which for VF is wrapped up as vect_vf_for_cost.
      
      The patch doesn't on its own enable variable-length vectorisation.
      It just makes the minimum changes necessary for the code to build
      with the new VF and UF types.  Later patches also make the
      vectoriser cope with variable TYPE_VECTOR_SUBPARTS and variable
      GET_MODE_NUNITS, at which point the code really does handle
      variable-length vectors.
      
      The patch also changes MAX_VECTORIZATION_FACTOR to INT_MAX,
      to avoid hard-coding a particular architectural limit.
      
      The patch includes a new test because a development version of the patch
      accidentally used file print routines instead of dump_*, which would
      fail with -fopt-info.
      
      2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>
      	    Alan Hayward  <alan.hayward@arm.com>
      	    David Sherwood  <david.sherwood@arm.com>
      
      gcc/
      	* tree-vectorizer.h (_slp_instance::unrolling_factor): Change
      	from an unsigned int to a poly_uint64.
      	(_loop_vec_info::slp_unrolling_factor): Likewise.
      	(_loop_vec_info::vectorization_factor): Change from an int
      	to a poly_uint64.
      	(MAX_VECTORIZATION_FACTOR): Bump from 64 to INT_MAX.
      	(vect_get_num_vectors): New function.
      	(vect_update_max_nunits, vect_vf_for_cost): Likewise.
      	(vect_get_num_copies): Use vect_get_num_vectors.
      	(vect_analyze_data_ref_dependences): Change max_vf from an int *
      	to an unsigned int *.
      	(vect_analyze_data_refs): Change min_vf from an int * to a
      	poly_uint64 *.
      	(vect_transform_slp_perm_load): Take the vf as a poly_uint64 rather
      	than an unsigned HOST_WIDE_INT.
      	* tree-vect-data-refs.c (vect_analyze_possibly_independent_ddr)
      	(vect_analyze_data_ref_dependence): Change max_vf from an int *
      	to an unsigned int *.
      	(vect_analyze_data_ref_dependences): Likewise.
      	(vect_compute_data_ref_alignment): Handle polynomial vf.
      	(vect_enhance_data_refs_alignment): Likewise.
      	(vect_prune_runtime_alias_test_list): Likewise.
      	(vect_shift_permute_load_chain): Likewise.
      	(vect_supportable_dr_alignment): Likewise.
      	(dependence_distance_ge_vf): Take the vectorization factor as a
      	poly_uint64 rather than an unsigned HOST_WIDE_INT.
      	(vect_analyze_data_refs): Change min_vf from an int * to a
      	poly_uint64 *.
      	* tree-vect-loop-manip.c (vect_gen_scalar_loop_niters): Take
      	vfm1 as a poly_uint64 rather than an int.  Make the same change
      	for the returned bound_scalar.
      	(vect_gen_vector_loop_niters): Handle polynomial vf.
      	(vect_do_peeling): Likewise.  Update call to
      	vect_gen_scalar_loop_niters and handle polynomial bound_scalars.
      	(vect_gen_vector_loop_niters_mult_vf): Assert that the vf must
      	be constant.
      	* tree-vect-loop.c (vect_determine_vectorization_factor)
      	(vect_update_vf_for_slp, vect_analyze_loop_2): Handle polynomial vf.
      	(vect_get_known_peeling_cost): Likewise.
      	(vect_estimate_min_profitable_iters, vectorizable_reduction): Likewise.
      	(vect_worthwhile_without_simd_p, vectorizable_induction): Likewise.
      	(vect_transform_loop): Likewise.  Use the lowest possible VF when
      	updating the upper bounds of the loop.
      	(vect_min_worthwhile_factor): Make static.  Return an unsigned int
      	rather than an int.
      	* tree-vect-slp.c (vect_attempt_slp_rearrange_stmts): Cope with
      	polynomial unroll factors.
      	(vect_analyze_slp_cost_1, vect_analyze_slp_instance): Likewise.
      	(vect_make_slp_decision): Likewise.
      	(vect_supported_load_permutation_p): Likewise, and polynomial
      	vf too.
      	(vect_analyze_slp_cost): Handle polynomial vf.
      	(vect_slp_analyze_node_operations): Likewise.
      	(vect_slp_analyze_bb_1): Likewise.
      	(vect_transform_slp_perm_load): Take the vf as a poly_uint64 rather
      	than an unsigned HOST_WIDE_INT.
      	* tree-vect-stmts.c (vectorizable_simd_clone_call, vectorizable_store)
      	(vectorizable_load): Handle polynomial vf.
      	* tree-vectorizer.c (simduid_to_vf::vf): Change from an int to
      	a poly_uint64.
      	(adjust_simduid_builtins, shrink_simd_arrays): Update accordingly.
      
      gcc/testsuite/
      	* gcc.dg/vect-opt-info-1.c: New test.
      
      Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
      Co-Authored-By: David Sherwood <david.sherwood@arm.com>
      
      From-SVN: r256126
      Richard Sandiford committed
    • match.pd handling of three-constant bitops · fba05d9e
      match.pd tries to reassociate two bit operations if both of them have
      constant operands.  However, with the polynomial integers added later,
      there's no guarantee that a bit operation on two integers can be folded
      at compile time.  This means that the pattern can trigger for operations
      on three constants, and as things stood could endlessly oscillate
      between the two associations.
      
      This patch keeps the existing pattern for the normal case of a
      non-constant first operand.  When all three operands are constant it
      tries to find a pair of constants that do fold.  If none do, it keeps
      the original expression as-was.
      
      2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>
      	    Alan Hayward  <alan.hayward@arm.com>
      	    David Sherwood  <david.sherwood@arm.com>
      
      gcc/
      	* match.pd: Handle bit operations involving three constants
      	and try to fold one pair.
      
      Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
      Co-Authored-By: David Sherwood <david.sherwood@arm.com>
      
      From-SVN: r256125
      Richard Sandiford committed
    • Add an alternative vector loop iv mechanism · 0f26839a
      Normally we adjust the vector loop so that it iterates:
      
         (original number of scalar iterations - number of peels) / VF
      
      times, enforcing this using an IV that starts at zero and increments
      by one each iteration.  However, dividing by VF would be expensive
      for variable VF, so this patch adds an alternative in which the IV
      increments by VF each iteration instead.  We then need to take care
      to handle possible overflow in the IV.
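
      The two counting schemes can be compared with a small model.  This is
      a sketch only: the function names are hypothetical, niters stands for
      the scalar iteration count after peeling, and an 8-bit IV is used so
      that the overflow hazard is easy to see.  The exit test shown is one
      possible way to avoid wrapping and is not claimed to be the one the
      patch uses.

```cpp
#include <cassert>
#include <cstdint>

// Scheme 1: divide by VF up front, then count from zero in steps of one.
unsigned int
toy_iterations_by_one (uint8_t niters, uint8_t vf)
{
  assert (vf > 0);
  uint8_t limit = niters / vf;
  unsigned int count = 0;
  for (uint8_t iv = 0; iv < limit; iv++)
    count++;
  return count;
}

// Scheme 2: no division; the IV advances by VF each iteration.  A test such
// as "iv < niters" could wrap (with niters = 255 and vf = 4, iv would go
// from 252 to 0), so the exit test checks the remaining work instead.
// Whenever the test passes, iv + vf <= niters, so the increment cannot wrap.
unsigned int
toy_iterations_by_vf (uint8_t niters, uint8_t vf)
{
  assert (vf > 0);
  unsigned int count = 0;
  for (uint8_t iv = 0; niters - iv >= vf; iv += vf)
    count++;
  return count;
}
```

      Both functions run the vector body the same number of times; the
      second trades the up-front division for a more careful exit test.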
      
      The new mechanism isn't used yet; a later patch replaces the
      "if (1)" with a check for variable VF.
      
      2018-01-03  Richard Sandiford  <richard.sandiford@linaro.org>
      
      gcc/
      	* tree-vect-loop-manip.c: Include gimple-fold.h.
      	(slpeel_make_loop_iterate_ntimes): Add step, final_iv and
      	niters_maybe_zero parameters.  Handle other cases besides a step of 1.
      	(vect_gen_vector_loop_niters): Add a step_vector_ptr parameter.
      	Add a path that uses a step of VF instead of 1, but disable it
      	for now.
      	(vect_do_peeling): Add step_vector, niters_vector_mult_vf_var
      	and niters_no_overflow parameters.  Update calls to
      	slpeel_make_loop_iterate_ntimes and vect_gen_vector_loop_niters.
      	Create a new SSA name if the latter chooses to use a step other
      	than zero, and return it via niters_vector_mult_vf_var.
      	* tree-vect-loop.c (vect_transform_loop): Update calls to
      	vect_do_peeling, vect_gen_vector_loop_niters and
      	slpeel_make_loop_iterate_ntimes.
      	* tree-vectorizer.h (slpeel_make_loop_iterate_ntimes, vect_do_peeling)
      	(vect_gen_vector_loop_niters): Update declarations after above changes.
      
      From-SVN: r256124
      Richard Sandiford committed
    • config.guess: Import latest version. · ef7d7cf5
      	* config.guess: Import latest version.
      	* config.sub: Likewise.
      
      From-SVN: r256122
      Ben Elliston committed
    • rs6000.md (floor<mode>2): Add support for IEEE 128-bit round to integer instructions. · 2d71e7b8
      [gcc]
      2018-01-02  Michael Meissner  <meissner@linux.vnet.ibm.com>
      
      	* config/rs6000/rs6000.md (floor<mode>2): Add support for IEEE
      	128-bit round to integer instructions.
      	(ceil<mode>2): Likewise.
      	(btrunc<mode>2): Likewise.
      	(round<mode>2): Likewise.
      
      [gcc/testsuite]
      2018-01-02  Michael Meissner  <meissner@linux.vnet.ibm.com>
      
      	* gcc.target/powerpc/float128-hw2.c: Add tests for ceilf128,
      	floorf128, truncf128, and roundf128.
      	* gcc.target/powerpc/float128-hw5.c: New tests for _Float128
      	optimizations added in match.pd.
      	* gcc.target/powerpc/float128-hw6.c: Likewise.
      	* gcc.target/powerpc/float128-hw7.c: Likewise.
      	* gcc.target/powerpc/float128-hw8.c: Likewise.
      	* gcc.target/powerpc/float128-hw9.c: Likewise.
      	* gcc.target/powerpc/float128-hw10.c: Likewise.
      	* gcc.target/powerpc/float128-hw11.c: Likewise.
      
      From-SVN: r256118
      Michael Meissner committed
    • Daily bump. · 50d75500
      From-SVN: r256116
      GCC Administrator committed
  2. 02 Jan, 2018 10 commits
    • rs6000-string.c (expand_block_move): Allow the use of unaligned VSX load/store on P8/P9. · 3b0cb1a5
      2018-01-02  Aaron Sawdey  <acsawdey@linux.vnet.ibm.com>
      
      	* config/rs6000/rs6000-string.c (expand_block_move): Allow the use of
      	unaligned VSX load/store on P8/P9.
      	(expand_block_clear): Allow the use of unaligned VSX
      	load/store on P8/P9.
      
      From-SVN: r256112
      Aaron Sawdey committed
    • rs6000-p8swap.c (swap_feeds_both_load_and_store): New function. · 6012c652
      2018-01-02  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>
      
      	* config/rs6000/rs6000-p8swap.c (swap_feeds_both_load_and_store):
      	New function.
      	(rs6000_analyze_swaps): Mark a web unoptimizable if it contains a
      	swap associated with both a load and a store.
      
      From-SVN: r256111
      Bill Schmidt committed
    • RISC-V: Fix for icache flush issue on multicore processors. · f1bdc63a
      	gcc/
      	* config/riscv/linux.h (ICACHE_FLUSH_FUNC): New.
      	* config/riscv/riscv.md (clear_cache): Use it.
      
      From-SVN: r256109
      Andrew Waterman committed
    • * web.c: Remove out-of-date comment. · a7e92aff
      From-SVN: r256106
      Artyom Skrobov committed
    • Fix REG_ARGS_SIZE handling when pushing TLS addresses · 2bc6986d
      The new assert in add_args_size_note triggered for gcc.dg/tls/opt-3.c
      and others on m68k.  This looks like a pre-existing bug: if we pushed
      a value that needs a call to something like __tls_get_addr, we ended
      up with two different REG_ARGS_SIZE notes on the same instruction.
      
      It seems to be OK for emit_single_push_insn to push something that
      needs a call to __tls_get_addr:
      
            /* We have to allow non-call_pop patterns for the case
      	 of emit_single_push_insn of a TLS address.  */
            if (GET_CODE (pat) != PARALLEL)
      	return 0;
      
      so I think the bug is in the way this is handled rather than the fact
      that it occurs at all.
      
      If we're pushing a value X that needs a call C to calculate, we'll
      add REG_ARGS_SIZE notes to the pushes and pops for C as part of the
      call sequence.  Then emit_single_push_insn calls fixup_args_size_notes
      on the whole push sequence (the calculation of X, including C,
      and the push of X itself).  This is where the double notes came from.
      But emit_single_push_insn_1 adjusted stack_pointer_delta *before* the
      push, so the notes added for C were relative to the situation after
      the future push of X rather than before it.
      
      Presumably this didn't matter in practice because the note added
      second tended to trump the note added first.  But code is allowed to
      walk REG_NOTES without having to disregard secondary notes.
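
      The ordering problem described above can be pictured with a toy
      model.  This is only a sketch: the struct, the function and the
      three-instruction sequence are hypothetical and much simpler than
      the real expr.c code.  Each note is taken to record the number of
      argument bytes outstanding after its instruction.

```c
#include <assert.h>

/* One toy instruction plus the argument-size note attached to it.  */
struct toy_insn {
  const char *what;
  long args_size;   /* argument bytes outstanding after this insn */
};

/* Emit "push C's argument; call C and pop; push X" into OUT.
   DELTA models stack_pointer_delta.  If ADJUST_BEFORE is nonzero,
   DELTA is bumped for X before C's instructions are emitted (the old
   ordering); otherwise it is bumped only when X itself is pushed.  */
static void toy_push_with_call(struct toy_insn out[3], long *delta,
                               long x_size, long c_arg_size,
                               int adjust_before)
{
  long before = *delta;
  if (adjust_before)
    *delta += x_size;

  /* The notes for C are derived from the current value of DELTA.  */
  out[0].what = "push argument of C";
  out[0].args_size = *delta + c_arg_size;
  out[1].what = "call C and pop its argument";
  out[1].args_size = *delta;

  if (!adjust_before)
    *delta += x_size;
  out[2].what = "push X";
  out[2].args_size = before + x_size;
  assert(*delta == before + x_size);
}
```

      With the old ordering the notes for C in this model are too large by
      the size of X; with the adjustment moved to the push of X they
      describe the stack as it really is while C runs.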
      
      2018-01-02  Richard Sandiford  <richard.sandiford@linaro.org>
      
      gcc/
      	* expr.c (fixup_args_size_notes): Check that any existing
      	REG_ARGS_SIZE notes are correct, and don't try to re-add them.
      	(emit_single_push_insn_1): Move stack_pointer_delta adjustment to...
      	(emit_single_push_insn): ...here.
      
      From-SVN: r256105
      Richard Sandiford committed
    • Make CONST_VECTOR_ELT handle implicitly-encoded elements · cd5ff7bc
      This patch makes CONST_VECTOR_ELT handle implicitly-encoded elements,
      in a similar way to VECTOR_CST_ELT.
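
      A simplified model of reading an element that is not stored
      explicitly might look like the sketch below.  It assumes the
      interleaved-pattern encoding documented for VECTOR_CST (npatterns
      patterns, each with 1, 2 or 3 encoded elements) and uses plain
      integers; the struct and function names are hypothetical and this
      is not the const_vector_elt implementation.

```c
#include <assert.h>

/* Toy encoding: ENCODED holds npatterns * nelts_per_pattern values,
   with encoded element I belonging to pattern I % npatterns.  */
struct toy_vector_encoding {
  unsigned npatterns;
  unsigned nelts_per_pattern;   /* 1 = duplicate, 2 = repeat last, 3 = stepped */
  const long *encoded;
};

/* Return element I of the full vector described by V.  */
static long toy_vector_elt(const struct toy_vector_encoding *v, unsigned i)
{
  assert(v->npatterns > 0);
  assert(v->nelts_per_pattern >= 1 && v->nelts_per_pattern <= 3);

  unsigned encoded_nelts = v->npatterns * v->nelts_per_pattern;
  if (i < encoded_nelts)
    return v->encoded[i];               /* explicitly encoded */

  unsigned pattern = i % v->npatterns;  /* which pattern I belongs to */
  unsigned count = i / v->npatterns;    /* position within that pattern */
  unsigned last_i = encoded_nelts - v->npatterns + pattern;
  long last = v->encoded[last_i];
  if (v->nelts_per_pattern < 3)
    return last;                        /* the final value repeats */

  /* Stepped pattern: extend the series defined by the last two
     encoded values of this pattern.  */
  long step = last - v->encoded[last_i - v->npatterns];
  return last + (long) (count - 2) * step;
}
```

      For example, the encoded values { 0, 1, 2 } with one pattern of
      three elements describe the series 0, 1, 2, 3, ... for any vector
      length.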
      
      2018-01-02  Richard Sandiford  <richard.sandiford@linaro.org>
      
      gcc/
      	* rtl.h (CONST_VECTOR_ELT): Redefine to const_vector_elt.
      	(const_vector_encoded_nelts): New function.
      	(CONST_VECTOR_NUNITS): Redefine to use GET_MODE_NUNITS.
      	(const_vector_int_elt, const_vector_elt): Declare.
      	* emit-rtl.c (const_vector_int_elt_1): New function.
      	(const_vector_elt): Likewise.
      	* simplify-rtx.c (simplify_immed_subreg): Avoid taking the address
      	of CONST_VECTOR_ELT.
      
      From-SVN: r256104
      Richard Sandiford committed
    • Make more use of rtx_vector_builder · 3d8ca53d
      This patch makes various bits of CONST_VECTOR-building code use
      rtx_vector_builder, operating directly on a specific encoding.
      
      2018-01-02  Richard Sandiford  <richard.sandiford@linaro.org>
      
      gcc/
      	* expr.c: Include rtx-vector-builder.h.
      	(const_vector_mask_from_tree): Use rtx_vector_builder and operate
      	directly on the tree encoding.
      	(const_vector_from_tree): Likewise.
      	* optabs.c: Include rtx-vector-builder.h.
      	(expand_vec_perm_var): Use rtx_vector_builder and create a repeating
      	sequence of "u" values.
      	* vec-perm-indices.c: Include rtx-vector-builder.h.
      	(vec_perm_indices_to_rtx): Use rtx_vector_builder and operate
      	directly on the vec_perm_indices encoding.
      
      From-SVN: r256103
      Richard Sandiford committed
    • New CONST_VECTOR layout · 3877c560
      This patch makes CONST_VECTOR use the same encoding as VECTOR_CST.
      
      One problem that occurs in RTL but not at the tree level is that a fair
      amount of code uses XVEC and XVECEXP directly on CONST_VECTORs (which is
      valid, just with looser checking).  This is complicated by the fact that
      vectors are also represented as PARALLELs in some target interfaces,
      so using XVECEXP is a good polymorphic way of handling both forms.
      
      Rather than try to untangle all that, the best approach seemed to be to
      continue to encode every element in a fixed-length vector.  That way only
      target-independent and AArch64 code need to be precise about using
      CONST_VECTOR_ELT over XVECEXP.
      
      After this change it is no longer valid to modify CONST_VECTORs in-place.
      This needed some fix-up in the powerpc backends.
      
      2018-01-02  Richard Sandiford  <richard.sandiford@linaro.org>
      
      gcc/
      	* doc/rtl.texi (const_vector): Describe new encoding scheme.
      	* Makefile.in (OBJS): Add rtx-vector-builder.o.
      	* rtx-vector-builder.h: New file.
      	* rtx-vector-builder.c: Likewise.
      	* rtl.h (rtx_def::u2): Add a const_vector field.
      	(CONST_VECTOR_NPATTERNS): New macro.
      	(CONST_VECTOR_NELTS_PER_PATTERN): Likewise.
      	(CONST_VECTOR_DUPLICATE_P): Likewise.
      	(CONST_VECTOR_STEPPED_P): Likewise.
      	(CONST_VECTOR_ENCODED_ELT): Likewise.
      	(const_vec_duplicate_p): Check for a duplicated vector encoding.
      	(unwrap_const_vec_duplicate): Likewise.
      	(const_vec_series_p): Check for a non-duplicated vector encoding.
      	Say that the function only returns true for integer vectors.
      	* emit-rtl.c: Include rtx-vector-builder.h.
      	(gen_const_vec_duplicate_1): Delete.
      	(gen_const_vector): Call gen_const_vec_duplicate instead of
      	gen_const_vec_duplicate_1.
      	(const_vec_series_p_1): Operate directly on the CONST_VECTOR encoding.
      	(gen_const_vec_duplicate): Use rtx_vector_builder.
      	(gen_const_vec_series): Likewise.
      	(gen_rtx_CONST_VECTOR): Likewise.
      	* config/powerpcspe/powerpcspe.c: Include rtx-vector-builder.h.
      	(swap_const_vector_halves): Take an rtx pointer rather than rtx.
      	Build a new vector rather than modifying a CONST_VECTOR in-place.
      	(handle_special_swappables): Update call accordingly.
      	* config/rs6000/rs6000-p8swap.c: Include rtx-vector-builder.h.
      	(swap_const_vector_halves): Take an rtx pointer rather than rtx.
      	Build a new vector rather than modifying a CONST_VECTOR in-place.
      	(handle_special_swappables): Update call accordingly.
      
      From-SVN: r256102
      Richard Sandiford committed
    • Use CONST_VECTOR_ELT instead of XVECEXP · 8eff75e0
      This patch replaces target-independent uses of XVECEXP with uses
      of CONST_VECTOR_ELT.  This kind of replacement isn't necessary
      for code specific to targets other than AArch64.
      
      2018-01-02  Richard Sandiford  <richard.sandiford@linaro.org>
      
      gcc/
      	* simplify-rtx.c (simplify_const_binary_operation): Use
      	CONST_VECTOR_ELT instead of XVECEXP.
      
      From-SVN: r256101
      Richard Sandiford committed
    • Use ssizetype selectors for autovectorised VEC_PERM_EXPRs · b00cb3bf
      The previous patches mean that there's no reason that constant
      VEC_PERM_EXPRs need to have the same shape as the data inputs.
      This patch makes the autovectoriser use ssizetype elements instead,
      so that indices don't get truncated for large or variable-length
      vectors.
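
      The truncation concern can be shown with a small standalone sketch
      (the helper names are hypothetical and the selector is modelled as
      an unsigned index, assuming a two-input permute whose indices range
      over both input vectors):

```c
#include <assert.h>
#include <stdint.h>

/* Highest index a two-input permute selector may need to hold.  */
static uint64_t max_selector_index(uint64_t nelts)
{
  assert(nelts > 0);
  return 2 * nelts - 1;
}

/* Whether INDEX is representable in an unsigned element of BITS bits.  */
static int index_fits_in_bits(uint64_t index, unsigned bits)
{
  assert(bits >= 1 && bits <= 64);
  if (bits == 64)
    return 1;
  return index < ((uint64_t) 1 << bits);
}

/* What happens to an index stored in a selector element that is only
   as wide as an 8-bit data element.  */
static uint8_t truncate_to_byte(uint64_t index)
{
  return (uint8_t) index;
}
```

      A vector of 256 byte elements needs indices up to 511, which do not
      fit in 8 bits, whereas a wider selector element type holds them
      without loss.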
      
      2018-01-02  Richard Sandiford  <richard.sandiford@linaro.org>
      
      gcc/
      	* tree-cfg.c (verify_gimple_assign_ternary): Allow the size of
      	the selector elements to be different from the data elements
      	if the selector is a VECTOR_CST.
      	* tree-vect-stmts.c (vect_gen_perm_mask_any): Use a vector of
      	ssizetype for the selector.
      
      From-SVN: r256100
      Richard Sandiford committed