1. 21 Oct, 2019 6 commits
    • Pass a vec_info to get_vectype_for_scalar_type · 7ed54790
      2019-10-21  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* tree-vectorizer.h (get_vectype_for_scalar_type): Take a vec_info.
      	* tree-vect-stmts.c (get_vectype_for_scalar_type): Likewise.
      	(vect_prologue_cost_for_slp_op): Update call accordingly.
      	(vect_get_vec_def_for_operand, vect_get_gather_scatter_ops)
      	(vect_get_strided_load_store_ops, vectorizable_simd_clone_call)
      	(vect_supportable_shift, vect_is_simple_cond, vectorizable_comparison)
      	(get_mask_type_for_scalar_type): Likewise.
      	(vect_get_vector_types_for_stmt): Likewise.
      	* tree-vect-data-refs.c (vect_analyze_data_refs): Likewise.
      	* tree-vect-loop.c (vect_determine_vectorization_factor): Likewise.
      	(get_initial_def_for_reduction, build_vect_cond_expr): Likewise.
      	* tree-vect-patterns.c (vect_supportable_direct_optab_p): Likewise.
      	(vect_split_statement, vect_convert_input): Likewise.
      	(vect_recog_widen_op_pattern, vect_recog_pow_pattern): Likewise.
      	(vect_recog_over_widening_pattern, vect_recog_mulhs_pattern): Likewise.
      	(vect_recog_average_pattern, vect_recog_cast_forwprop_pattern)
      	(vect_recog_rotate_pattern, vect_recog_vector_vector_shift_pattern)
      	(vect_synth_mult_by_constant, vect_recog_mult_pattern): Likewise.
      	(vect_recog_divmod_pattern, vect_recog_mixed_size_cond_pattern)
      	(check_bool_pattern, adjust_bool_pattern_cast, adjust_bool_pattern)
      	(search_type_for_mask_1, vect_recog_bool_pattern): Likewise.
      	(vect_recog_mask_conversion_pattern): Likewise.
      	(vect_add_conversion_to_pattern): Likewise.
      	(vect_recog_gather_scatter_pattern): Likewise.
      	* tree-vect-slp.c (vect_build_slp_tree_2): Likewise.
      	(vect_analyze_slp_instance, vect_get_constant_vectors): Likewise.
      
      From-SVN: r277227
      Richard Sandiford committed
    • Pass a vec_info to get_mask_type_for_scalar_type · 1bd5196c
      2019-10-21  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* tree-vectorizer.h (get_mask_type_for_scalar_type): Take a vec_info.
      	* tree-vect-stmts.c (get_mask_type_for_scalar_type): Likewise.
      	(vect_check_load_store_mask): Update call accordingly.
      	(vect_get_mask_type_for_stmt): Likewise.
      	* tree-vect-patterns.c (check_bool_pattern): Likewise.
      	(search_type_for_mask_1, vect_recog_mask_conversion_pattern): Likewise.
      	(vect_convert_mask_for_vectype): Likewise.
      
      From-SVN: r277226
      Richard Sandiford committed
    • Pass a vec_info to vect_supportable_direct_optab_p · dcab2a0d
      2019-10-21  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* tree-vect-patterns.c (vect_supportable_direct_optab_p): Take
      	a vec_info.
      	(vect_recog_dot_prod_pattern): Update call accordingly.
      	(vect_recog_sad_pattern, vect_recog_pow_pattern): Likewise.
      	(vect_recog_widen_sum_pattern): Likewise.
      
      From-SVN: r277225
      Richard Sandiford committed
    • Pass a vec_info to vect_supportable_shift · a5c3185a
      2019-10-21  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* tree-vectorizer.h (vect_supportable_shift): Take a vec_info.
      	* tree-vect-stmts.c (vect_supportable_shift): Likewise.
      	* tree-vect-patterns.c (vect_synth_mult_by_constant): Update call
      	accordingly.
      
      From-SVN: r277224
      Richard Sandiford committed
    • Avoid setting current_vector_size in get_vec_alignment_for_array_type · da157e2e
      The increase_alignment pass was using get_vectype_for_scalar_type
      to get the preferred vector type for each array element type.
      This has the effect of carrying over the vector size chosen by
      the first successful call to all subsequent calls, whereas it seems
      more natural to treat each array type independently and pick the
      "best" vector type for each element type.
      
      2019-10-21  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* tree-vectorizer.c (get_vec_alignment_for_array_type): Use
      	get_vectype_for_scalar_type_and_size instead of
      	get_vectype_for_scalar_type.
      
      From-SVN: r277223
      Richard Sandiford committed
    • Daily bump. · 5bf2f162
      From-SVN: r277221
      GCC Administrator committed
  2. 20 Oct, 2019 7 commits
    • common.opt (-fcommon): Fix description. · df73e971
      2019-10-20  Bernd Edlinger  <bernd.edlinger@hotmail.de>
      
              * common.opt (-fcommon): Fix description.
      
      From-SVN: r277217
      Bernd Edlinger committed
    • i386-protos.h (ix86_pre_reload_split): Declare. · 51085ca5
      	* config/i386/i386-protos.h (ix86_pre_reload_split): Declare.
      	* config/i386/i386.c (ix86_pre_reload_split): New function.
      	* config/i386/i386.md (*fix_trunc<mode>_i387_1, *add<mode>3_eq,
      	*add<mode>3_ne, *add<mode>3_eq_0, *add<mode>3_ne_0, *add<mode>3_eq,
      	*add<mode>3_ne, *add<mode>3_eq_1, *add<mode>3_eq_0, *add<mode>3_ne_0,
      	*anddi3_doubleword, *andndi3_doubleword, *<code>di3_doubleword,
      	*one_cmpldi2_doubleword, *ashl<dwi>3_doubleword_mask,
      	*ashl<dwi>3_doubleword_mask_1, *ashl<mode>3_mask, *ashl<mode>3_mask_1,
      	*<shift_insn><mode>3_mask, *<shift_insn><mode>3_mask_1,
      	*<shift_insn><dwi>3_doubleword_mask,
      	*<shift_insn><dwi>3_doubleword_mask_1, *<rotate_insn><mode>3_mask,
      	*<rotate_insn><mode>3_mask_1, *<btsc><mode>_mask, *<btsc><mode>_mask_1,
      	*btr<mode>_mask, *btr<mode>_mask_1, *jcc_bt<mode>, *jcc_bt<mode>_1,
      	*jcc_bt<mode>_mask, *popcounthi2_1, frndintxf2_<rounding>,
      	*fist<mode>2_<rounding>_1, *<code><mode>3_1, *<code>di3_doubleword):
      	Use ix86_pre_reload_split instead of can_create_pseudo_p in condition.
      	* config/i386/sse.md (*sse4_1_<code>v8qiv8hi2<mask_name>_2,
      	*avx2_<code>v8qiv8si2<mask_name>_2,
      	*sse4_1_<code>v4qiv4si2<mask_name>_2,
      	*sse4_1_<code>v4hiv4si2<mask_name>_2,
      	*avx512f_<code>v8qiv8di2<mask_name>_2,
      	*avx2_<code>v4qiv4di2<mask_name>_2, *avx2_<code>v4hiv4di2<mask_name>_2,
      	*sse4_1_<code>v2hiv2di2<mask_name>_2,
      	*sse4_1_<code>v2siv2di2<mask_name>_2, sse4_2_pcmpestr,
      	sse4_2_pcmpistr): Likewise.
      
      From-SVN: r277216
      Jakub Jelinek committed
    • install.texi (Configuration, [...]): hboehm.info now defaults to https. · efbf0f1e
      	* doc/install.texi (Configuration, --enable-objc-gc): hboehm.info
      	now defaults to https.
      
      From-SVN: r277215
      Gerald Pfeifer committed
    • tree-ssa-alias.c (nonoverlapping_refs_since_match_p): Do not skip non-zero array accesses. · f373041c
      
      	* tree-ssa-alias.c (nonoverlapping_refs_since_match_p): Do not
      	skip non-zero array accesses.
      
      	* gcc.c-torture/execute/alias-access-path-2.c: New testcase.
      	* gcc.dg/tree-ssa/alias-access-path-11.c: xfail.
      
      From-SVN: r277214
      Jan Hubicka committed
    • Move code out of vect_slp_analyze_bb_1 · 1d778697
      After the previous patch, it seems more natural to apply the
      PARAM_SLP_MAX_INSNS_IN_BB threshold as soon as we know what
      the region is, rather than delaying it to vect_slp_analyze_bb_1.
      (But rather than carve out the biggest region possible and then
      reject it, wouldn't it be better to stop when the region gets
      too big, to at least give us a chance of vectorising something?)
      
      It also seems more natural for vect_slp_bb_region to create the
      bb_vec_info itself rather than (a) having to pass bits of data down
      for the initialisation and (b) forcing vect_slp_analyze_bb_1 to free it
      on every failure return.
      
      2019-10-20  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* tree-vect-slp.c (vect_slp_analyze_bb_1): Take a bb_vec_info
      	and return a boolean success value.  Move the allocation and
      	initialization of the bb_vec_info to...
      	(vect_slp_bb_region): ...here.  Update call accordingly.
      	(vect_slp_bb): Apply PARAM_SLP_MAX_INSNS_IN_BB here rather
      	than in vect_slp_analyze_bb_1.
      
      From-SVN: r277211
      Richard Sandiford committed
    • Avoid recomputing data references in BB SLP · fa0c8df7
      If the first attempt at applying BB SLP to a region fails, the main loop
      in vect_slp_bb recomputes the region's bounds and datarefs for the next
      vector size.  AFAICT this isn't needed any more; we should be able
      to reuse the datarefs from the first attempt instead.
      
      2019-10-20  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* tree-vect-slp.c (vect_slp_analyze_bb_1): Call save_datarefs
      	when processing the given datarefs for the first time and
      	check_datarefs subsequently.
      	(vect_slp_bb_region): New function, split out of...
      	(vect_slp_bb): ...here.  Don't recompute the region bounds and
      	dataref sets when retrying with a different vector size.
      
      From-SVN: r277210
      Richard Sandiford committed
    • Daily bump. · b4edf5c5
      From-SVN: r277209
      GCC Administrator committed
  3. 19 Oct, 2019 7 commits
    • nodiscard-reason-only-one.C: In dg-error or dg-warning remove (?n) uses and… · 0fcd8629
      nodiscard-reason-only-one.C: In dg-error or dg-warning remove (?n) uses and replace .* with \[^\n\r]*.
      
      	* g++.dg/cpp2a/nodiscard-reason-only-one.C: In dg-error or dg-warning
      	remove (?n) uses and replace .* with \[^\n\r]*.
      	* g++.dg/cpp2a/nodiscard-reason.C: Likewise.
      	* g++.dg/cpp2a/nodiscard-once.C: Likewise.
      	* g++.dg/cpp2a/nodiscard-reason-nonstring.C: Likewise.
      
      From-SVN: r277205
      Jakub Jelinek committed
    • re PR fortran/91926 (assumed rank optional) · b3fbf95e
      2019-10-19  Paul Thomas  <pault@gcc.gnu.org>
      
      	PR fortran/91926
      	* runtime/ISO_Fortran_binding.c (cfi_desc_to_gfc_desc): Revert
      	the change made on 2019-10-05.
      
      From-SVN: r277204
      Paul Thomas committed
    • re PR target/92140 (clang vs gcc optimizing with adc/sbb) · 15643a0d
      	PR target/92140
      	* config/i386/predicates.md (int_nonimmediate_operand): New special
      	predicate.
      	* config/i386/i386.md (*add<mode>3_eq, *add<mode>3_ne,
      	*add<mode>3_eq_0, *add<mode>3_ne_0, *sub<mode>3_eq, *sub<mode>3_ne,
      	*sub<mode>3_eq_1, *sub<mode>3_eq_0, *sub<mode>3_ne_0): New
      	define_insn_and_split patterns.
      
      	* gcc.target/i386/pr92140.c: New test.
      	* gcc.c-torture/execute/pr92140.c: New test.
      
      Co-Authored-By: Uros Bizjak <ubizjak@gmail.com>
      
      From-SVN: r277203
      Jakub Jelinek committed
    • [Darwin, testsuite] Fix Wnonnull on Darwin. · 2366bf60
      Darwin does not mark entries in string.h with nonnull attributes
      so the test fails.  Since the purpose of the test is to check that
      the warnings are issued for an inlined function, not that the target
      headers are marked up, we can provide marked up headers for Darwin.
      
      gcc/testsuite/ChangeLog:
      
      2019-10-19  Iain Sandoe  <iain@sandoe.co.uk>
      
      	* gcc.dg/Wnonnull.c: Add attributed function declarations for
      	memcpy and strlen for Darwin.
      
      From-SVN: r277202
      Iain Sandoe committed
    • [PPC] Delete out of date comment. · dc7e9feb
      Removes a comment that's no longer relevant.
      
      gcc/ChangeLog:
      
      2019-10-19  Iain Sandoe  <iain@sandoe.co.uk>
      
      	* config/rs6000/rs6000.md: Delete out-of-date comment about
      	special-casing integer loads.
      
      From-SVN: r277201
      Iain Sandoe committed
    • Implement C++20 P1301 [[nodiscard("should have a reason")]]. · 8ad0c477
      2019-10-17  JeanHeyd Meneide  <phdofthehouse@gmail.com>
      
      gcc/
              * escaped_string.h (escaped_string): New header.
              * tree.c (escaped_string): Remove escaped_string class.
      
      gcc/c-family
              * c-lex.c (c_common_has_attribute): Update nodiscard value.
      
      gcc/cp/
              * tree.c (handle_nodiscard_attribute): Added C++2a nodiscard
      	string message.
              (std_attribute_table): Increase nodiscard argument handling
      	max_length from 0 to 1.
              * parser.c (cp_parser_check_std_attribute): Add requirement
      	that nodiscard only be seen once in attribute-list.
              (cp_parser_std_attribute): Check that empty parenthesis lists are
              not specified for attributes that have max_length > 0 (e.g.
      	[[attr()]]).
              * cvt.c (maybe_warn_nodiscard): Add nodiscard message to
      	output, if applicable.
      	(convert_to_void): Allow constructors to be nodiscard-able (P1771).
      
      gcc/testsuite/g++.dg/cpp0x
              * gen-attrs-67.C: Test new error message for empty-parenthesis-list.
      
      gcc/testsuite/g++.dg/cpp2a
              * nodiscard-construct.C: New test.
              * nodiscard-once.C: New test.
              * nodiscard-reason-nonstring.C: New test.
              * nodiscard-reason-only-one.C: New test.
              * nodiscard-reason.C: New test.
      
      Reviewed-by: Jason Merrill <jason@redhat.com>
      
      From-SVN: r277200
      JeanHeyd Meneide committed
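      The attribute form this commit implements can be exercised with a short
      snippet (illustrative only, not from the patch; the function and message
      are invented, and a GCC with this change needs -std=c++2a to accept it):

```cpp
#include <cassert>

// C++2a [[nodiscard("reason")]] as added by this commit: discarding the
// return value should produce a warning that quotes the string argument.
[[nodiscard("the acquired handle must be checked")]]
inline int acquire_handle()
{
  return 42;
}

inline int use_handle()
{
  int h = acquire_handle();  // value is used, so no warning here
  return h;
}
```

      Calling `acquire_handle();` as a bare statement is what would trigger
      the new diagnostic with its reason string.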
    • Daily bump. · 9299523c
      From-SVN: r277199
      GCC Administrator committed
  4. 18 Oct, 2019 20 commits
    • PR tree-optimization/92157 - incorrect strcmp() == 0 result for unknown strings · 9c233ad0
      gcc/testsuite/ChangeLog:
      
      	PR tree-optimization/92157
      	* gcc.dg/strlenopt-69.c: Disable test failing due to PR 92155.
      	* gcc.dg/strlenopt-87.c: New test.
      
      gcc/ChangeLog:
      
      	PR tree-optimization/92157
      	* tree-ssa-strlen.c (handle_builtin_string_cmp): Be prepared for
      	compute_string_length to return a negative result.
      
      From-SVN: r277194
      Martin Sebor committed
    • [arm] Fix testsuite nit when compiling for thumb2 · f8b9b1ed
      In thumb2 we now generate a NEGS instruction rather than RSBS, so this
      test needs updating.
      
      	* gcc.target/arm/negdi-3.c: Update expected output to allow NEGS.
      
      From-SVN: r277192
      Richard Earnshaw committed
    • [arm] Improvements to negvsi4 and negvdi4. · a7c3ebae
      The generic expansion code for negv does not try the subv patterns,
      but instead emits a sub and a compare separately.  Fortunately, the
      patterns can make use of the new subv operations, so just call those.
      We can also rewrite this using an iterator to simplify things further.
      Finally, we can now make negvdi4 work on Thumb2 as well as Arm.
      
      	* config/arm/arm.md (negv<SIDI:mode>3): New expansion rule.
      	(negvsi3, negvdi3): Delete.
      	(negdi2_compare): Delete.
      
      From-SVN: r277191
      Richard Earnshaw committed
    • [arm] Early expansion of subvdi4 · ead32773
      This patch adds early expansion of subvdi4.  The expansion sequence
      is broadly based on the expansion of usubvdi4.
      
      	* config/arm/arm.md (subvdi4): Decompose calculation into 32-bit
      	operations.
      	(subdi3_compare1): Delete pattern.
      	(subvsi3_borrow): New insn pattern.
      	(subvsi3_borrow_imm): Likewise.
      
      From-SVN: r277190
      Richard Earnshaw committed
    • [arm] Improve constant handling for subvsi4. · 238273fe
      This patch addresses constant handling in subvsi4.  Either operand may
      be a constant.  If the second input (operand[2]) is a constant, then
      we can canonicalize this into an addition form, providing we take care
      of the INT_MIN case.  In that case the negation has to handle the fact
      that -INT_MIN is still INT_MIN and we need to ensure that a subtract
      operation is performed rather than an addition.  The remaining cases
      are largely duals of the usubvsi4 expansion.
      
      This patch also fixes a technical correctness bug in the old
      expansion, where we did not really describe the test for overflow in
      the RTL.  We seem to have got away with that, however...
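      The INT_MIN hazard described above can be sketched in C++ (an
      illustrative model, not the md pattern; the helper names are invented).
      Rewriting x - c as x + (-c) goes wrong exactly when c is INT_MIN,
      because negating INT_MIN wraps back to INT_MIN:

```cpp
#include <cstdint>
#include <cassert>

// Canonical form: signed subtract with overflow check.
inline bool sub_overflows(int32_t x, int32_t c)
{
  int32_t r;
  return __builtin_sub_overflow(x, c, &r);
}

// Naive canonicalization x - c => x + (-c).  The negation is done in
// unsigned arithmetic so it is well defined; for c == INT32_MIN it
// wraps back to INT32_MIN, so the addition form checks the wrong thing.
inline bool add_neg_overflows(int32_t x, int32_t c)
{
  int32_t r;
  return __builtin_add_overflow(x, (int32_t)(0u - (uint32_t)c), &r);
}
```

      For any other constant the two forms agree, which is why the expander
      only needs a special-case pattern for the INT_MIN value.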
      
      	* config/arm/arm.md (subv<mode>4): Delete.
      	(subvdi4): New expander pattern.
      	(subvsi4): Likewise.  Handle some immediate values.
      	(subvsi3_intmin): New insn pattern.
      	(subvsi3): Likewise.
      	(subvsi3_imm1): Likewise.
      	* config/arm/arm.c (select_cc_mode): Also allow minus for CC_V
      	idioms.
      
      From-SVN: r277189
      Richard Earnshaw committed
    • [arm] Early expansion of usubvdi4. · eff5ce0a
      This patch adds early expansion of usubvdi4, allowing us to handle some
      constants in place, which previously we were unable to do.
      
      	* config/arm/arm.md (usubvdi4): Allow registers or integers for
      	incoming operands.  Early split the calculation into SImode
      	operations.
      	(usubvsi3_borrow): New insn pattern.
      	(usubvsi3_borrow_imm): Likewise.
      
      From-SVN: r277188
      Richard Earnshaw committed
    • [arm] Improve constant handling for usubvsi4. · a79048f6
      This patch improves the expansion of usubvsi4 by allowing suitable
      constants to be passed directly.  Unlike normal subtraction, either
      operand may be a constant (and indeed I have seen cases where both can
      be with LTO enabled).  One interesting testcase that improves as a
      result of this is:
      
      unsigned f6 (unsigned a)
      {
        unsigned x;
        return __builtin_sub_overflow (5U, a, &x) ? 0 : x;
      }
      
      Which previously compiled to:
      
      	rsbs	r3, r0, #5
      	cmp	r0, #5
      	movls	r0, r3
      	movhi	r0, #0
      
      but now generates the optimal sequence:
      
      	rsbs	r0, r0, #5
      	movcc	r0, #0
      
      	* config/arm/arm.md (usubv<mode>4): Delete expansion.
      	(usubvsi4): New pattern.  Allow some immediate values for inputs.
      	(usubvdi4): New pattern.
      
      From-SVN: r277187
      Richard Earnshaw committed
    • [arm] Early split addvdi4 · fa62df0e
      This patch adds early splitting for addvdi4; it's very similar to the
      uaddvdi4 splitter, but the details are just different enough in
      places, especially for the patterns that match the splitting, where we
      have to compare against the non-widened version to detect if overflow
      occurred.
      
      I've also added a testcase to the testsuite for a couple of constants
      that caught me out during the development of this patch.  They're
      probably arm-specific values, but the test is generic enough that I've
      included it for all targets.
      
      [gcc]
      	* config/arm/arm.c (arm_select_cc_mode): Allow either the first
      	or second operand of the PLUS inside a DImode equality test to be
      	sign-extend when selecting CC_Vmode.
      	* config/arm/arm.md (addvdi4): Early-split the operation into SImode
      	instructions.
      	(addsi3_cin_vout_reg, addsi3_cin_vout_imm, addsi3_cin_vout_0): New
      	expand patterns.
      	(addsi3_cin_vout_reg_insn, addsi3_cin_vout_imm_insn): New patterns.
      	(addsi3_cin_vout_0): Likewise.
      	(adddi3_compareV): Delete.
      
      [gcc/testsuite]
      	* gcc.dg/builtin-arith-overflow-3.c: New test.
      
      From-SVN: r277186
      Richard Earnshaw committed
    • [arm] Allow the summation result of signed add-with-overflow to be discarded. · db962d0a
      This patch matches the signed add-with-overflow patterns when the
      summation itself is dropped.  In this case we can use CMN (or CMP with
      some immediates).  There are a small number of constants in thumb2
      where this can result in less dense code (as we lack 16-bit CMN with
      immediate patterns).  To handle this we use peepholes to try these
      alternatives when either a scratch is available (0 <= i <= 7) or the
      original register is dead (0 <= i <= 255).  We don't use a scratch in
      the pattern as if those conditions are not satisfied then the 32-bit
      form is preferable to forcing a reload.
      
      	* config/arm/arm.md (addsi3_compareV_reg_nosum): New insn.
      	(addsi3_compareV_imm_nosum): New insn.  Also add peephole2 patterns
      	to transform this back into the summation version when that leads
      	to smaller code.
      
      From-SVN: r277185
      Richard Earnshaw committed
    • [arm] Improve code generation for addvsi4. · dbba8a17
      Similar to the improvements for uaddvsi4, this patch improves the code
      generation for addvsi4 to handle immediates and to add alternatives
      that better target thumb2.  To do this we separate out the expansion
      of uaddvsi4 from that of uaddvdi4 and then add an additional pattern
      to handle constants.  Also, while doing this I've fixed the incorrect
      usage of NE instead of COMPARE in the generated RTL.
      
      	* config/arm/arm.md (addv<mode>4): Delete.
      	(addvsi4): New pattern.  Handle immediate values that the architecture
      	supports.
      	(addvdi4): New pattern.
      	(addsi3_compareV): Rename to ...
      	(addsi3_compareV_reg): ... this.  Add constraints for thumb2 variants
      	and use COMPARE rather than NE.
      	(addsi3_compareV_imm): New pattern.
      	* config/arm/arm.c (arm_select_cc_mode): Return CC_Vmode for
      	a signed-overflow check.
      
      From-SVN: r277184
      Richard Earnshaw committed
    • [arm] Early expansion of uaddvdi4. · deb254e0
      This code borrows heavily from the uaddvti4 expansion for aarch64 since
      the principles are similar.  Firstly, if one of the low words of
      the expansion is 0, we can simply copy the other low word to the
      destination and use uaddvsi4 for the upper word.  If that doesn't work
      we have to handle three possible cases for the upper word (the lower
      word is simply an add-with-carry operation as for adddi3): zero in the
      upper word, some other constant and a register (each has a different
      canonicalization).  We use CC_ADCmode (a new CC mode variant) to
      describe the cases as the introduction of the carry means we can
      no-longer use the normal overflow trick of comparing the sum against
      one of the operands.
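      The split described above can be sketched in C++ (a simplified model of
      what the expander emits, not the expander itself; names are invented):

```cpp
#include <cstdint>
#include <cassert>

// 64-bit unsigned add-with-overflow decomposed into 32-bit pieces:
// a plain add producing a carry for the low word, an add-with-carry
// for the high word, and the overflow flag taken from the final carry.
inline bool uaddv64(uint64_t a, uint64_t b, uint64_t *sum)
{
  uint32_t alo = (uint32_t)a, ahi = (uint32_t)(a >> 32);
  uint32_t blo = (uint32_t)b, bhi = (uint32_t)(b >> 32);

  uint32_t lo = alo + blo;
  uint32_t cin = lo < alo;             // carry out of the low word
  uint32_t s1 = ahi + bhi;
  uint32_t c1 = s1 < ahi;              // carry from the plain high add
  uint32_t hi = s1 + cin;
  uint32_t c2 = hi < s1;               // carry from adding the carry-in
  *sum = ((uint64_t)hi << 32) | lo;
  return c1 | c2;                      // at most one of c1/c2 can be set
}

// Cross-check against the builtin on a few values.
inline bool uaddv64_matches(uint64_t a, uint64_t b)
{
  uint64_t s, r;
  bool o1 = uaddv64(a, b, &s);
  bool o2 = __builtin_add_overflow(a, b, &r);
  return o1 == o2 && s == r;
}
```

      The final carry cannot be recovered by comparing the sum against an
      operand once a carry-in is involved, which is why the commit introduces
      CC_ADCmode rather than reusing the usual overflow idiom.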
      
      	* config/arm/arm-modes.def (CC_ADC): New CC mode.
      	* config/arm/arm.c (arm_select_cc_mode): Detect selection of
      	CC_ADCmode.
      	(maybe_get_arm_condition_code): Handle CC_ADCmode.
      	* config/arm/arm.md (uaddvdi4): Early expansion of unsigned addition
      	with overflow.
      	(addsi3_cin_cout_reg, addsi3_cin_cout_imm, addsi3_cin_cout_0): New
      	expand patterns.
      	(addsi3_cin_cout_reg_insn, addsi3_cin_cout_0_insn): New insn patterns
      	(addsi3_cin_cout_imm_insn): Likewise.
      	(adddi3_compareC): Delete insn.
      	* config/arm/predicates.md (arm_carry_operation): Handle CC_ADCmode.
      
      From-SVN: r277183
      Richard Earnshaw committed
    • [arm] Handle immediate values in uaddvsi4 · ed6588f2
      The uaddv patterns in the arm back-end do not currently handle immediates
      during expansion.  This patch adds this support for uaddvsi4.  It's really
      a stepping-stone towards early expansion of uaddvdi4, but it is complete and
      a useful change in its own right.
      
      Whilst making this change I also observed that we really had two patterns
      that did exactly the same thing, but with slightly different properties;
      consequently I've cleaned up all of the add-and-compare patterns to bring
      some consistency.
      
      	* config/arm/arm.md (adddi3): Call gen_addsi3_compare_op1.
      	* (uaddv<mode>4): Delete expansion pattern.
      	(uaddvsi4): New pattern.
      	(uaddvdi4): Likewise.
      	(addsi3_compareC): Delete pattern, change callers to use
      	addsi3_compare_op1.
      	(addsi3_compare_op1): No-longer anonymous.  Clean up constraints to
      	reduce the number of alternatives and re-work type attribute handling.
      	(addsi3_compare_op2): Clean up constraints to reduce the number of
      	alternatives and re-work type attribute handling.
      	(compare_addsi2_op0): Likewise.
      	(compare_addsi2_op1): Likewise.
      
      From-SVN: r277182
      Richard Earnshaw committed
    • [arm] Cleanup dead code - old support for DImode comparisons · f9f6247d
      Now that all the major patterns for DImode have been converted to
      early expansion, we can safely clean up some dead code for the old way
      of handling DImode.
      
      	* config/arm/arm-modes.def (CC_NCV, CC_CZ): Delete CC modes.
      	* config/arm/arm.c (arm_select_cc_mode): Remove old selection code
      	for DImode operands.
      	(arm_gen_dicompare_reg): Remove unreachable expansion code.
      	(maybe_get_arm_condition_code): Remove support for CC_CZmode and
      	CC_NCVmode.
      	* config/arm/arm.md (arm_cmpdi_insn): Delete.
      	(arm_cmpdi_unsigned): Delete.
      
      From-SVN: r277181
      Richard Earnshaw committed
    • [arm] Handle some constant comparisons using rsbs+rscs · af74bfee
      In a small number of cases it is preferable to handle comparisons with
      constants using the sequence
      
      	RSBS	tmp, Xlo, constlo
      	RSCS	tmp, Xhi, consthi
      
      which allows us to handle a small number of LE/GT/LEU/GEU cases when
      changing the code to use LT/GE/LTU/GEU would make the constant more
      expensive.  Sadly, we cannot do this on Thumb, since we need RSC, so we
      now always use the incremented constant in that case since normally that
      still works out cheaper than forcing the entire constant into a register.
      
      Further investigation has also shown that the canonicalization of a
      reverse subtract and compare is valid for signed as well as unsigned values,
      so we relax the restriction on selecting CC_RSBmode to allow all types
      of compare.
      
      	* config/arm/arm.c (arm_const_double_prefer_rsbs_rsc): New function.
      	(arm_canonicalize_comparison): For GT/LE/GTU/GEU, use the constant
      	unchanged only if that will be cheaper.
      	(arm_select_cc_mode): Recognize a swapped comparison that will
      	be regenerated using RSBS or RSCS.  Relax restriction on selecting
      	CC_RSBmode.
      	(arm_gen_dicompare_reg): Handle LE/GT/LEU/GEU comparisons against
      	a constant.
      	(arm_gen_compare_reg): Handle compare (CONST, X) when the mode
      	is CC_RSBmode.
      	(maybe_get_arm_condition_code): CC_RSBmode now returns the same codes
      	as CCmode.
      	* config/arm/arm.md (rsb_imm_compare_scratch): New pattern.
      	(rscsi3_<CC_EXTEND>out_scratch): New pattern.
      
      From-SVN: r277180
      Richard Earnshaw committed
    • [arm] early split most DImode comparison operations. · 8b8ab8f4
      This patch does most of the work for early splitting the DImode
      comparisons.  We now handle EQ, NE, LT, GE, LTU and GEU during early
      expansion, in addition to EQ and NE, for which the expansion has now
      been reworked to use a standard conditional-compare pattern already in
      the back-end.
      
      To handle this we introduce two new condition flag modes that are used
      when comparing the upper words of decomposed DImode values: one for
      signed, and one for unsigned comparisons.  CC_Bmode (B for Borrow) is
      essentially the inverse of CC_Cmode and is used when the carry flag is
      set by a subtraction of unsigned values.
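      The borrow chain that CC_Bmode models can be sketched in C++ (an
      illustrative model of the decomposed comparison, not the actual RTL;
      the function name is invented):

```cpp
#include <cstdint>
#include <cassert>

// Unsigned 64-bit "less than" via two 32-bit subtracts: the borrow out
// of the low-word subtract feeds the high-word subtract, and the final
// borrow is the LTU result -- the flag pattern CC_Bmode describes.
inline bool ltu64(uint64_t a, uint64_t b)
{
  uint32_t alo = (uint32_t)a, ahi = (uint32_t)(a >> 32);
  uint32_t blo = (uint32_t)b, bhi = (uint32_t)(b >> 32);

  uint32_t bin = alo < blo;        // borrow out of the low subtract
  uint32_t d1 = ahi - bhi;
  bool b1 = ahi < bhi;             // borrow from the plain high subtract
  bool b2 = d1 < bin;              // borrow from subtracting the borrow-in
  return b1 || b2;
}
```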
      
      	* config/arm/arm-modes.def (CC_NV, CC_B): New CC modes.
      	* config/arm/arm.c (arm_select_cc_mode): Recognize constructs that
      	need these modes.
      	(arm_gen_dicompare_reg): New code to early expand the sub-operations
      	of EQ, NE, LT, GE, LTU and GEU.
      	* config/arm/iterators.md (CC_EXTEND): New code attribute.
      	* config/arm/predicates.md (arm_adcimm_operand): New predicate.
      	* config/arm/arm.md (cmpsi3_carryin_<CC_EXTEND>out): New pattern.
      	(cmpsi3_imm_carryin_<CC_EXTEND>out): Likewise.
      	(cmpsi3_0_carryin_<CC_EXTEND>out): Likewise.
      
      From-SVN: r277179
      Richard Earnshaw committed
    • [arm] Improve handling of DImode comparisons against constants. · 22060d0e
      In almost all cases it is better to handle inequality comparisons against constants
      by transforming comparisons of the form (reg <GE/LT/GEU/LTU> const) into
      (reg <GT/LE/GTU/LEU> (const+1)).  However, there are many cases that we could
      handle but currently failed to do so because we forced the constant into a
      register too early in the pattern expansion.  To permit this to be done we need
      to defer forcing the constant into a register until after we've had the chance
      to do the transform - in some cases that may even mean that we no-longer need
      to force the constant into a register at all.  For example, on Arm, the case:
      
      _Bool f8 (unsigned long long a) { return a > 0xffffffff; }
      
      previously compiled to
      
              mov     r3, #0
              cmp     r1, r3
              mvn     r2, #0
              cmpeq   r0, r2
              movhi   r0, #1
              movls   r0, #0
              bx      lr
      
      But now compiles to
      
              cmp     r1, #1
              cmpeq   r0, #0
              movcs   r0, #1
              movcc   r0, #0
              bx      lr
      
      Which although not yet completely optimal, is certainly better than
      previously.
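      The identities behind the transform can be sketched in C++ (a hedged
      illustration; the commit applies them inside arm_canonicalize_comparison,
      which also weighs the cost of the adjusted constant):

```cpp
#include <cstdint>
#include <cassert>

// For unsigned x:  x >= c  <=>  x > c - 1   (valid for c != 0)
//                  x >  c  <=>  x >= c + 1  (valid for c != ~0ULL)
// The f8 example above is the second identity with c = 0xffffffff.
inline bool geu_as_gtu(uint64_t x, uint64_t c)  // requires c != 0
{
  return x > c - 1;
}

inline bool gtu_as_geu(uint64_t x, uint64_t c)  // requires c != ~0ULL
{
  return x >= c + 1;
}
```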
      
      	* config/arm/arm.md (cbranchdi4): Accept reg_or_int_operand for
      	operand 2.
      	(cstoredi4): Similarly, but for operand 3.
      	* config/arm/arm.c (arm_canonicalize_comparison): Allow canonicalization
      	of unsigned compares with a constant on Arm.  Prefer using const+1 and
      	adjusting the comparison over swapping the operands whenever the
      	original constant was not valid.
      	(arm_gen_dicompare_reg): If Y is not a valid operand, force it to a
      	register here.
      	(arm_validize_comparison): Do not force invalid DImode operands to
      	registers here.
      
      From-SVN: r277178
      Richard Earnshaw committed
    • [arm] Early split simple DImode equality comparisons · 5899656b
      This is the first step of early splitting all the DImode comparison
      operations.  We start by factoring the DImode handling out of
      arm_gen_compare_reg into its own function.
      
      Simple DImode equality comparisons (such as equality with zero, or
      equality with a constant that is zero in one of the two word values
      that it comprises) can be done using a single subtract followed by an
      ORRS instruction.  This avoids the need for conditional execution.
      
      For example, (r0 != 5) can be written as
      
      	SUB	Rt, R0, #5
      	ORRS	Rt, Rt, R1
      
      The ORRS is now expanded using an SImode pattern that already exists
      in the MD file and this gives the register allocator more freedom to
      select registers (consecutive pairs are no-longer required).
      Furthermore, we can then delete the arm_cmpdi_zero pattern as it is
      no-longer required.  We use SUB for the value adjustment as this has a
      generally more flexible range of immediates than XOR and what's more
      has the opportunity to be relaxed in thumb2 to a 16-bit SUBS
      instruction.
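      In C++ terms the sequence looks like this (illustrative only; the md
      pattern works on the flag result of ORRS rather than a boolean, and the
      function name is invented):

```cpp
#include <cstdint>
#include <cassert>

// (x != c) for a 64-bit x and a constant whose high word is zero:
//   SUB  Rt, R0, #c      ; subtract from the low word
//   ORRS Rt, Rt, R1      ; OR in the high word and set flags
// The combined result is nonzero exactly when x != c.
inline bool ne64_const(uint64_t x, uint32_t c)
{
  uint32_t lo = (uint32_t)x, hi = (uint32_t)(x >> 32);
  return ((lo - c) | hi) != 0;
}
```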
      
      	* config/arm/arm.c (arm_select_cc_mode): For DImode equality tests
      	return CC_Zmode if comparing against a constant where one word is
      	zero.
      	(arm_gen_compare_reg): Split DImode handling to ...
      	(arm_gen_dicompare_reg): ... here.  Handle equality comparisons
      	against simple constants.
      	* config/arm/arm.md (arm_cmpdi_zero): Delete pattern.
      
      From-SVN: r277177
      Richard Earnshaw committed
    • [arm] Add alternative canonicalizations for subtract-with-carry + shift · 0b478cdd
      This patch adds a couple of alternative canonicalizations to allow
      combine to match a subtract-with-carry operation when one of the operands
      is shifted first.  The most common case of this is when combining a
      sign-extend of one operand with a long-long value during subtraction.
      The RSC variant is only enabled for Arm, the SBC variant for any 32-bit
      compilation.
      
      	* config/arm/arm.md (subsi3_carryin_shift_alt): New pattern.
      	(rsbsi3_carryin_shift_alt): Likewise.
      
      From-SVN: r277176
      Richard Earnshaw committed
    • [arm] Implement negscc using SBC when appropriate. · f6ff841b
      When the carry flag is appropriately set by a comparison, negscc
      patterns can expand into a simple SBC of a register with itself.  This
      means we can convert two conditional instructions into a single
      non-conditional instruction.  Furthermore, in Thumb2 we can avoid the
      need for an IT instruction as well.  This patch also fixes the remaining
      testcase that we initially XFAILed in the first patch of this series.
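      The trick, modelled in C++ (an illustrative sketch; on the real target
      the carry comes from the preceding compare instruction, and the function
      name is invented):

```cpp
#include <cstdint>
#include <cassert>

// After CMP a, b the Arm carry flag holds (a >= b); SBC Rd, Rn, Rn then
// computes Rn - Rn - (1 - C) = C - 1, i.e. 0 when a >= b and -1 when
// a < b: a branchless -(a < b) with no conditional instructions.
inline int32_t neg_ltu(uint32_t a, uint32_t b)
{
  uint32_t carry = a >= b;        // carry flag after the compare
  return (int32_t)(carry - 1u);   // SBC of a register with itself
}
```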
      
      gcc:
      	* config/arm/arm.md (negscc_borrow): New pattern.
      	(mov_negscc): Don't split if the insn would match negscc_borrow.
      	* config/arm/thumb2.md (thumb2_mov_negscc): Likewise.
      	(thumb2_mov_negscc_strict_it): Likewise.
      
      testsuite:
      	* gcc.target/arm/negdi-3.c: Remove XFAIL markers.
      
      From-SVN: r277175
      Richard Earnshaw committed
    • [arm] Reduce cost of insns that are simple reg-reg moves. · 24d28a87
      Consider this sequence during combine:
      
      Trying 18, 7 -> 22:
         18: r118:SI=r122:SI
            REG_DEAD r122:SI
          7: r114:SI=0x1-r118:SI-ltu(cc:CC_RSB,0)
            REG_DEAD r118:SI
            REG_DEAD cc:CC_RSB
         22: r1:SI=r114:SI
            REG_DEAD r114:SI
      Failed to match this instruction:
      (set (reg:SI 1 r1 [+4 ])
          (minus:SI (geu:SI (reg:CC_RSB 100 cc)
                  (const_int 0 [0]))
              (reg:SI 122)))
      Successfully matched this instruction:
      (set (reg:SI 114)
          (geu:SI (reg:CC_RSB 100 cc)
              (const_int 0 [0])))
      Successfully matched this instruction:
      (set (reg:SI 1 r1 [+4 ])
          (minus:SI (reg:SI 114)
              (reg:SI 122)))
      allowing combination of insns 18, 7 and 22
      original costs 4 + 4 + 4 = 12
      replacement costs 8 + 4 = 12
      
      The costs are all correct, but we really don't want this combination
      to take place.  The original costs contain an insn that is a simple
      move of one pseudo register to another and it is extremely likely that
      register allocation will eliminate this insn entirely.  On the other
      hand, the resulting sequence really does expand into a sequence that
      costs 12 (ie 3 insns).
      
      We don't want to prevent combine from eliminating such moves, as this
      can expose more combine opportunities, but we shouldn't rate them as
      profitable in themselves.  We can do this by adjusting the costs
      slightly so that the benefit of eliminating such a simple insn is
      reduced.
      
      We only do this before register allocation; after allocation we give
      such insns their full cost.
      
      	* config/arm/arm.c (arm_insn_cost): New function.
      	(TARGET_INSN_COST): Override default definition.
      
      From-SVN: r277174
      Richard Earnshaw committed