- 21 Oct, 2019 6 commits
2019-10-21 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vectorizer.h (get_vectype_for_scalar_type): Take a vec_info. * tree-vect-stmts.c (get_vectype_for_scalar_type): Likewise. (vect_prologue_cost_for_slp_op): Update call accordingly. (vect_get_vec_def_for_operand, vect_get_gather_scatter_ops) (vect_get_strided_load_store_ops, vectorizable_simd_clone_call) (vect_supportable_shift, vect_is_simple_cond, vectorizable_comparison) (get_mask_type_for_scalar_type): Likewise. (vect_get_vector_types_for_stmt): Likewise. * tree-vect-data-refs.c (vect_analyze_data_refs): Likewise. * tree-vect-loop.c (vect_determine_vectorization_factor): Likewise. (get_initial_def_for_reduction, build_vect_cond_expr): Likewise. * tree-vect-patterns.c (vect_supportable_direct_optab_p): Likewise. (vect_split_statement, vect_convert_input): Likewise. (vect_recog_widen_op_pattern, vect_recog_pow_pattern): Likewise. (vect_recog_over_widening_pattern, vect_recog_mulhs_pattern): Likewise. (vect_recog_average_pattern, vect_recog_cast_forwprop_pattern) (vect_recog_rotate_pattern, vect_recog_vector_vector_shift_pattern) (vect_synth_mult_by_constant, vect_recog_mult_pattern): Likewise. (vect_recog_divmod_pattern, vect_recog_mixed_size_cond_pattern) (check_bool_pattern, adjust_bool_pattern_cast, adjust_bool_pattern) (search_type_for_mask_1, vect_recog_bool_pattern): Likewise. (vect_recog_mask_conversion_pattern): Likewise. (vect_add_conversion_to_pattern): Likewise. (vect_recog_gather_scatter_pattern): Likewise. * tree-vect-slp.c (vect_build_slp_tree_2): Likewise. (vect_analyze_slp_instance, vect_get_constant_vectors): Likewise. From-SVN: r277227
Richard Sandiford committed -
2019-10-21 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vectorizer.h (get_mask_type_for_scalar_type): Take a vec_info. * tree-vect-stmts.c (get_mask_type_for_scalar_type): Likewise. (vect_check_load_store_mask): Update call accordingly. (vect_get_mask_type_for_stmt): Likewise. * tree-vect-patterns.c (check_bool_pattern): Likewise. (search_type_for_mask_1, vect_recog_mask_conversion_pattern): Likewise. (vect_convert_mask_for_vectype): Likewise. From-SVN: r277226
Richard Sandiford committed -
2019-10-21 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vect-patterns.c (vect_supportable_direct_optab_p): Take a vec_info. (vect_recog_dot_prod_pattern): Update call accordingly. (vect_recog_sad_pattern, vect_recog_pow_pattern): Likewise. (vect_recog_widen_sum_pattern): Likewise. From-SVN: r277225
Richard Sandiford committed -
2019-10-21 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vectorizer.h (vect_supportable_shift): Take a vec_info. * tree-vect-stmts.c (vect_supportable_shift): Likewise. * tree-vect-patterns.c (vect_synth_mult_by_constant): Update call accordingly. From-SVN: r277224
Richard Sandiford committed -
The increase_alignment pass was using get_vectype_for_scalar_type to get the preferred vector type for each array element type. This has the effect of carrying over the vector size chosen by the first successful call to all subsequent calls, whereas it seems more natural to treat each array type independently and pick the "best" vector type for each element type. 2019-10-21 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vectorizer.c (get_vec_alignment_for_array_type): Use get_vectype_for_scalar_type_and_size instead of get_vectype_for_scalar_type. From-SVN: r277223
Richard Sandiford committed -
From-SVN: r277221
GCC Administrator committed
-
- 20 Oct, 2019 7 commits
2019-10-20 Bernd Edlinger <bernd.edlinger@hotmail.de> * common.opt (-fcommon): Fix description. From-SVN: r277217
Bernd Edlinger committed -
* config/i386/i386-protos.h (ix86_pre_reload_split): Declare. * config/i386/i386.c (ix86_pre_reload_split): New function. * config/i386/i386.md (*fix_trunc<mode>_i387_1, *add<mode>3_eq, *add<mode>3_ne, *add<mode>3_eq_0, *add<mode>3_ne_0, *add<mode>3_eq, *add<mode>3_ne, *add<mode>3_eq_1, *add<mode>3_eq_0, *add<mode>3_ne_0, *anddi3_doubleword, *andndi3_doubleword, *<code>di3_doubleword, *one_cmpldi2_doubleword, *ashl<dwi>3_doubleword_mask, *ashl<dwi>3_doubleword_mask_1, *ashl<mode>3_mask, *ashl<mode>3_mask_1, *<shift_insn><mode>3_mask, *<shift_insn><mode>3_mask_1, *<shift_insn><dwi>3_doubleword_mask, *<shift_insn><dwi>3_doubleword_mask_1, *<rotate_insn><mode>3_mask, *<rotate_insn><mode>3_mask_1, *<btsc><mode>_mask, *<btsc><mode>_mask_1, *btr<mode>_mask, *btr<mode>_mask_1, *jcc_bt<mode>, *jcc_bt<mode>_1, *jcc_bt<mode>_mask, *popcounthi2_1, frndintxf2_<rounding>, *fist<mode>2_<rounding>_1, *<code><mode>3_1, *<code>di3_doubleword): Use ix86_pre_reload_split instead of can_create_pseudo_p in condition. * config/i386/sse.md (*sse4_1_<code>v8qiv8hi2<mask_name>_2, *avx2_<code>v8qiv8si2<mask_name>_2, *sse4_1_<code>v4qiv4si2<mask_name>_2, *sse4_1_<code>v4hiv4si2<mask_name>_2, *avx512f_<code>v8qiv8di2<mask_name>_2, *avx2_<code>v4qiv4di2<mask_name>_2, *avx2_<code>v4hiv4di2<mask_name>_2, *sse4_1_<code>v2hiv2di2<mask_name>_2, *sse4_1_<code>v2siv2di2<mask_name>_2, sse4_2_pcmpestr, sse4_2_pcmpistr): Likewise. From-SVN: r277216
Jakub Jelinek committed -
* doc/install.texi (Configuration, --enable-objc-gc): hboehm.info now defaults to https. From-SVN: r277215
Gerald Pfeifer committed -
* tree-ssa-alias.c (nonoverlapping_refs_since_match_p): Do not skip non-zero array accesses. * gcc.c-torture/execute/alias-access-path-2.c: New testcase. * gcc.dg/tree-ssa/alias-access-path-11.c: xfail. From-SVN: r277214
Jan Hubicka committed -
After the previous patch, it seems more natural to apply the PARAM_SLP_MAX_INSNS_IN_BB threshold as soon as we know what the region is, rather than delaying it to vect_slp_analyze_bb_1. (But rather than carve out the biggest region possible and then reject it, wouldn't it be better to stop when the region gets too big, to at least give us a chance of vectorising something?) It also seems more natural for vect_slp_bb_region to create the bb_vec_info itself rather than (a) having to pass bits of data down for the initialisation and (b) forcing vect_slp_analyze_bb_1 to free on every failure return. 2019-10-20 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vect-slp.c (vect_slp_analyze_bb_1): Take a bb_vec_info and return a boolean success value. Move the allocation and initialization of the bb_vec_info to... (vect_slp_bb_region): ...here. Update call accordingly. (vect_slp_bb): Apply PARAM_SLP_MAX_INSNS_IN_BB here rather than in vect_slp_analyze_bb_1. From-SVN: r277211
Richard Sandiford committed -
If the first attempt at applying BB SLP to a region fails, the main loop in vect_slp_bb recomputes the region's bounds and datarefs for the next vector size. AFAICT this isn't needed any more; we should be able to reuse the datarefs from the first attempt instead. 2019-10-20 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vect-slp.c (vect_slp_analyze_bb_1): Call save_datarefs when processing the given datarefs for the first time and check_datarefs subsequently. (vect_slp_bb_region): New function, split out of... (vect_slp_bb): ...here. Don't recompute the region bounds and dataref sets when retrying with a different vector size. From-SVN: r277210
Richard Sandiford committed -
From-SVN: r277209
GCC Administrator committed
-
- 19 Oct, 2019 7 commits
nodiscard-reason-only-one.C: In dg-error or dg-warning remove (?n) uses and replace .* with \[^\n\r]*. * g++.dg/cpp2a/nodiscard-reason-only-one.C: In dg-error or dg-warning remove (?n) uses and replace .* with \[^\n\r]*. * g++.dg/cpp2a/nodiscard-reason.C: Likewise. * g++.dg/cpp2a/nodiscard-once.C: Likewise. * g++.dg/cpp2a/nodiscard-reason-nonstring.C: Likewise. From-SVN: r277205
Jakub Jelinek committed -
2019-10-19 Paul Thomas <pault@gcc.gnu.org> PR fortran/91926 * runtime/ISO_Fortran_binding.c (cfi_desc_to_gfc_desc): Revert the change made on 2019-10-05. From-SVN: r277204
Paul Thomas committed -
PR target/92140 * config/i386/predicates.md (int_nonimmediate_operand): New special predicate. * config/i386/i386.md (*add<mode>3_eq, *add<mode>3_ne, *add<mode>3_eq_0, *add<mode>3_ne_0, *sub<mode>3_eq, *sub<mode>3_ne, *sub<mode>3_eq_1, *sub<mode>3_eq_0, *sub<mode>3_ne_0): New define_insn_and_split patterns. * gcc.target/i386/pr92140.c: New test. * gcc.c-torture/execute/pr92140.c: New test. Co-Authored-By: Uros Bizjak <ubizjak@gmail.com> From-SVN: r277203
Jakub Jelinek committed -
Darwin does not mark entries in string.h with nonnull attributes so the test fails. Since the purpose of the test is to check that the warnings are issued for an inlined function, not that the target headers are marked up, we can provide marked up headers for Darwin. gcc/testsuite/ChangeLog: 2019-10-19 Iain Sandoe <iain@sandoe.co.uk> * gcc.dg/Wnonnull.c: Add attributed function declarations for memcpy and strlen for Darwin. From-SVN: r277202
Iain Sandoe committed -
Removes a comment that's no longer relevant. gcc/ChangeLog: 2019-10-19 Iain Sandoe <iain@sandoe.co.uk> * config/rs6000/rs6000.md: Delete out-of-date comment about special-casing integer loads. From-SVN: r277201
Iain Sandoe committed -
2019-10-17 JeanHeyd Meneide <phdofthehouse@gmail.com> gcc/ * escaped_string.h (escaped_string): New header. * tree.c (escaped_string): Remove escaped_string class. gcc/c-family * c-lex.c (c_common_has_attribute): Update nodiscard value. gcc/cp/ * tree.c (handle_nodiscard_attribute) Added C++2a nodiscard string message. (std_attribute_table) Increase nodiscard argument handling max_length from 0 to 1. * parser.c (cp_parser_check_std_attribute): Add requirement that nodiscard only be seen once in attribute-list. (cp_parser_std_attribute): Check that empty parenthesis lists are not specified for attributes that have max_length > 0 (e.g. [[attr()]]). * cvt.c (maybe_warn_nodiscard): Add nodiscard message to output, if applicable. (convert_to_void): Allow constructors to be nodiscard-able (P1771). gcc/testsuite/g++.dg/cpp0x * gen-attrs-67.C: Test new error message for empty-parenthesis-list. gcc/testsuite/g++.dg/cpp2a * nodiscard-construct.C: New test. * nodiscard-once.C: New test. * nodiscard-reason-nonstring.C: New test. * nodiscard-reason-only-one.C: New test. * nodiscard-reason.C: New test. Reviewed-by: Jason Merrill <jason@redhat.com> From-SVN: r277200
JeanHeyd Meneide committed -
From-SVN: r277199
GCC Administrator committed
-
- 18 Oct, 2019 20 commits
gcc/testsuite/ChangeLog: PR tree-optimization/92157 * gcc.dg/strlenopt-69.c: Disable test failing due to PR 92155. * gcc.dg/strlenopt-87.c: New test. gcc/ChangeLog: PR tree-optimization/92157 * tree-ssa-strlen.c (handle_builtin_string_cmp): Be prepared for compute_string_length to return a negative result. From-SVN: r277194
Martin Sebor committed -
In thumb2 we now generate a NEGS instruction rather than RSBS, so this test needs updating. * gcc.target/arm/negdi-3.c: Update expected output to allow NEGS. From-SVN: r277192
Richard Earnshaw committed -
The generic expansion code for negv does not try the subv patterns, but instead emits a sub and a compare separately. Fortunately, the patterns can make use of the new subv operations, so just call those. We can also rewrite this using an iterator to simplify things further. Finally, we can now make negvdi4 work on Thumb2 as well as Arm. * config/arm/arm.md (negv<SIDI:mode>3): New expansion rule. (negvsi3, negvdi3): Delete. (negdi2_compare): Delete. From-SVN: r277191
Richard Earnshaw committed -
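As a rough illustration of the negv change above (r277191): a negation with an overflow check can be modelled as a checked subtraction from zero, which is the idea behind routing negv through the subv operations. The names `negv_via_subv` and `negv_reference` are hypothetical, and the sketch only models the semantics in C; it is not the arm.md expansion itself.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: negate x with overflow detection by computing 0 - x.
   Returns true when the negation overflows. */
bool negv_via_subv(int x, int *res)
{
    return __builtin_sub_overflow(0, x, res);
}

/* Reference: in two's complement, only INT_MIN has no representable negation. */
bool negv_reference(int x, int *res)
{
    if (x == INT_MIN)
        return true;   /* overflow; *res is left untouched */
    *res = -x;
    return false;
}
```

Both functions should agree on the overflow flag for every input, and on the result whenever no overflow is reported.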
This patch adds early expansion of subvdi4. The expansion sequence is broadly based on the expansion of usubvdi4. * config/arm/arm.md (subvdi4): Decompose calculation into 32-bit operations. (subdi3_compare1): Delete pattern. (subvsi3_borrow): New insn pattern. (subvsi3_borrow_imm): Likewise. From-SVN: r277190
Richard Earnshaw committed -
This patch addresses constant handling in subvsi4. Either operand may be a constant. If the second input (operand[2]) is a constant, then we can canonicalize this into an addition form, providing we take care of the INT_MIN case. In that case the negation has to handle the fact that -INT_MIN is still INT_MIN and we need to ensure that a subtract operation is performed rather than an addition. The remaining cases are largely duals of the usubvsi4 expansion. This patch also fixes a technical correctness bug in the old expansion, where we did not really describe the test for overflow in the RTL. We seem to have got away with that, however... * config/arm/arm.md (subv<mode>4): Delete. (subvdi4): New expander pattern. (subvsi4): Likewise. Handle some immediate values. (subvsi3_intmin): New insn pattern. (subvsi3): Likewise. (subvsi3_imm1): Likewise. * config/arm/arm.c (select_cc_mode): Also allow minus for CC_V idioms. From-SVN: r277189
Richard Earnshaw committed -
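The INT_MIN caveat in the subvsi4 change above (r277189) can be sketched in C. The function name `subv_const` is hypothetical, and this models only the arithmetic reasoning (subtracting a constant as adding its negation, except when the constant is INT_MIN), not the RTL patterns added by the patch.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical model of "x - c with signed overflow check", where c plays
   the role of the constant second operand. Returns true on overflow. */
bool subv_const(int x, int c, int *res)
{
    if (c == INT_MIN)
        /* -INT_MIN is not representable, so a real subtraction is needed. */
        return __builtin_sub_overflow(x, c, res);

    /* For every other c, x - c and x + (-c) have the same exact value,
       hence the same result and the same overflow condition. */
    return __builtin_add_overflow(x, -c, res);
}
```

For example, `subv_const(-1, INT_MIN, &r)` takes the subtract path and yields INT_MAX without overflow, whereas naively adding the "negated" constant would have reported one.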
This patch adds early expansion of usubvdi4, allowing us to handle some constants in place, which previously we were unable to do. * config/arm/arm.md (usubvdi4): Allow registers or integers for incoming operands. Early split the calculation into SImode operations. (usubvsi3_borrow): New insn pattern. (usubvsi3_borrow_imm): Likewise. From-SVN: r277188
Richard Earnshaw committed -
This patch improves the expansion of usubvsi4 by allowing suitable constants to be passed directly. Unlike normal subtraction, either operand may be a constant (and indeed I have seen cases where both can be with LTO enabled). One interesting testcase that improves as a result of this is:

    unsigned f6 (unsigned a)
    {
      unsigned x;
      return __builtin_sub_overflow (5U, a, &x) ? 0 : x;
    }

Which previously compiled to:

    rsbs r3, r0, #5
    cmp r0, #5
    movls r0, r3
    movhi r0, #0

but now generates the optimal sequence:

    rsbs r0, r0, #5
    movcc r0, #0

* config/arm/arm.md (usubv<mode>4): Delete expansion. (usubvsi4): New pattern. Allow some immediate values for inputs. (usubvdi4): New pattern. From-SVN: r277187
Richard Earnshaw committed -
This patch adds early splitting for addvdi4; it's very similar to the uaddvdi4 splitter, but the details are just different enough in places, especially for the patterns that match the splitting, where we have to compare against the non-widened version to detect if overflow occurred. I've also added a testcase to the testsuite for a couple of constants that caught me out during the development of this patch. They're probably arm-specific values, but the test is generic enough that I've included it for all targets. [gcc] * config/arm/arm.c (arm_select_cc_mode): Allow either the first or second operand of the PLUS inside a DImode equality test to be sign-extend when selecting CC_Vmode. * config/arm/arm.md (addvdi4): Early-split the operation into SImode instructions. (addsi3_cin_vout_reg, addsi3_cin_vout_imm, addsi3_cin_vout_0): New expand patterns. (addsi3_cin_vout_reg_insn, addsi3_cin_vout_imm_insn): New patterns. (addsi3_cin_vout_0): Likewise. (adddi3_compareV): Delete. [gcc/testsuite] * gcc.dg/builtin-arith-overflow-3.c: New test. From-SVN: r277186
Richard Earnshaw committed -
This patch matches the signed add-with-overflow patterns when the summation itself is dropped. In this case we can use CMN (or CMP with some immediates). There are a small number of constants in thumb2 where this can result in less dense code (as we lack 16-bit CMN with immediate patterns). To handle this we use peepholes to try these alternatives when either a scratch is available (0 <= i <= 7) or the original register is dead (0 <= i <= 255). We don't use a scratch in the pattern as if those conditions are not satisfied then the 32-bit form is preferable to forcing a reload. * config/arm/arm.md (addsi3_compareV_reg_nosum): New insn. (addsi3_compareV_imm_nosum): New insn. Also add peephole2 patterns to transform this back into the summation version when that leads to smaller code. From-SVN: r277185
Richard Earnshaw committed -
Similar to the improvements for uaddvsi4, this patch improves the code generation for addvsi4 to handle immediates and to add alternatives that better target thumb2. To do this we separate out the expansion of uaddvsi4 from that of uaddvdi4 and then add an additional pattern to handle constants. Also, while doing this I've fixed the incorrect usage of NE instead of COMPARE in the generated RTL. * config/arm/arm.md (addv<mode>4): Delete. (addvsi4): New pattern. Handle immediate values that the architecture supports. (addvdi4): New pattern. (addsi3_compareV): Rename to ... (addsi3_compareV_reg): ... this. Add constraints for thumb2 variants and use COMPARE rather than NE. (addsi3_compareV_imm): New pattern. * config/arm/arm.c (arm_select_cc_mode): Return CC_Vmode for a signed-overflow check. From-SVN: r277184
Richard Earnshaw committed -
This code borrows strongly on the uaddvti4 expansion for aarch64 since the principles are similar. Firstly, if one of the low words of the expansion is 0, we can simply copy the other low word to the destination and use uaddvsi4 for the upper word. If that doesn't work we have to handle three possible cases for the upper word (the lower word is simply an add-with-carry operation as for adddi3): zero in the upper word, some other constant and a register (each has a different canonicalization). We use CC_ADCmode (a new CC mode variant) to describe the cases as the introduction of the carry means we can no longer use the normal overflow trick of comparing the sum against one of the operands. * config/arm/arm-modes.def (CC_ADC): New CC mode. * config/arm/arm.c (arm_select_cc_mode): Detect selection of CC_ADCmode. (maybe_get_arm_condition_code): Handle CC_ADCmode. * config/arm/arm.md (uaddvdi4): Early expansion of unsigned addition with overflow. (addsi3_cin_cout_reg, addsi3_cin_cout_imm, addsi3_cin_cout_0): New expand patterns. (addsi3_cin_cout_reg_insn, addsi3_cin_cout_0_insn): New insn patterns (addsi3_cin_cout_imm_insn): Likewise. (adddi3_compareC): Delete insn. * config/arm/predicates.md (arm_carry_operation): Handle CC_ADCmode. From-SVN: r277183
Richard Earnshaw committed -
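To make the carry reasoning in the uaddvdi4 change above (r277183) concrete, here is a minimal C model of a 64-bit unsigned add-with-overflow built from 32-bit halves. The name `uaddv_di` is hypothetical, and the widened comparison on the upper word is only an assumption about how to express the check in C; it is not the RTL that CC_ADCmode describes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: add two 64-bit values using 32-bit operations.
   Returns true when the 64-bit unsigned addition overflows. */
bool uaddv_di(uint64_t a, uint64_t b, uint64_t *sum)
{
    uint32_t alo = (uint32_t)a, ahi = (uint32_t)(a >> 32);
    uint32_t blo = (uint32_t)b, bhi = (uint32_t)(b >> 32);

    /* Low word (ADDS): the usual trick works, since the sum wraps
       exactly when it ends up smaller than an operand. */
    uint32_t lo = alo + blo;
    uint32_t carry = lo < alo;

    /* High word (ADCS): with a carry-in, "sum < operand" is not enough
       (ahi + 0xffffffff + 1 wraps back to ahi), so check in a wider type. */
    uint64_t wide = (uint64_t)ahi + bhi + carry;
    uint32_t hi = (uint32_t)wide;

    *sum = ((uint64_t)hi << 32) | lo;
    return (wide >> 32) != 0;
}
```

The case `a = 0x12345678ffffffff`, `b = 0xffffffff00000001` shows the problem: the upper word of the sum equals `ahi`, so comparing the sum against an operand would miss the overflow.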
The uaddv patterns in the arm back-end do not currently handle immediates during expansion. This patch adds this support for uaddvsi4. It's really a stepping-stone towards early expansion of uaddvdi4, but it is complete and a useful change in its own right. Whilst making this change I also observed that we really had two patterns that did exactly the same thing, but with slightly different properties; consequently I've cleaned up all of the add-and-compare patterns to bring some consistency. * config/arm/arm.md (adddi3): Call gen_addsi3_compare_op1. * (uaddv<mode>4): Delete expansion pattern. (uaddvsi4): New pattern. (uaddvdi4): Likewise. (addsi3_compareC): Delete pattern, change callers to use addsi3_compare_op1. (addsi3_compare_op1): No longer anonymous. Clean up constraints to reduce the number of alternatives and re-work type attribute handling. (addsi3_compare_op2): Clean up constraints to reduce the number of alternatives and re-work type attribute handling. (compare_addsi2_op0): Likewise. (compare_addsi2_op1): Likewise. From-SVN: r277182
Richard Earnshaw committed -
Now that all the major patterns for DImode have been converted to early expansion, we can safely clean up some dead code for the old way of handling DImode. * config/arm/arm-modes.def (CC_NCV, CC_CZ): Delete CC modes. * config/arm/arm.c (arm_select_cc_mode): Remove old selection code for DImode operands. (arm_gen_dicompare_reg): Remove unreachable expansion code. (maybe_get_arm_condition_code): Remove support for CC_CZmode and CC_NCVmode. * config/arm/arm.md (arm_cmpdi_insn): Delete. (arm_cmpdi_unsigned): Delete. From-SVN: r277181
Richard Earnshaw committed -
In a small number of cases it is preferable to handle comparisons with constants using the sequence

    RSBS tmp, Xlo, constlo
    RSCS tmp, Xhi, consthi

which allows us to handle a small number of LE/GT/LEU/GTU cases when changing the code to use LT/GE/LTU/GEU would make the constant more expensive. Sadly, we cannot do this on Thumb, since we need RSC, so we now always use the incremented constant in that case since normally that still works out cheaper than forcing the entire constant into a register. Further investigation has also shown that the canonicalization of a reverse subtract and compare is valid for signed as well as unsigned values, so we relax the restriction on selecting CC_RSBmode to allow all types of compare.

* config/arm/arm.c (arm_const_double_prefer_rsbs_rsc): New function. (arm_canonicalize_comparison): For GT/LE/GTU/GEU, use the constant unchanged only if that will be cheaper. (arm_select_cc_mode): Recognize a swapped comparison that will be regenerated using RSBS or RSCS. Relax restriction on selecting CC_RSBmode. (arm_gen_dicompare_reg): Handle LE/GT/LEU/GEU comparisons against a constant. (arm_gen_compare_reg): Handle compare (CONST, X) when the mode is CC_RSBmode. (maybe_get_arm_condition_code): CC_RSBmode now returns the same codes as CCmode. * config/arm/arm.md (rsb_imm_compare_scratch): New pattern. (rscsi3_<CC_EXTEND>out_scratch): New pattern. From-SVN: r277180
Richard Earnshaw committed -
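The reverse-subtract sequence in the change above (r277180) can be modelled in C for the unsigned case: computing constant minus X across two words leaves no final borrow exactly when X <= constant, so the constant itself is used rather than constant+1. The name `leu_via_rsbs_rscs` is hypothetical and the sketch only tracks the borrow, not the actual flag modes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of an unsigned 64-bit "x <= c" test done as c - x
   on 32-bit halves, keeping only the borrow information. */
bool leu_via_rsbs_rscs(uint64_t x, uint64_t c)
{
    uint32_t xlo = (uint32_t)x, xhi = (uint32_t)(x >> 32);
    uint32_t clo = (uint32_t)c, chi = (uint32_t)(c >> 32);

    /* RSBS tmp, Xlo, constlo: constlo - Xlo borrows when constlo < Xlo. */
    uint32_t borrow = clo < xlo;

    /* RSCS tmp, Xhi, consthi: consthi - Xhi - borrow; evaluated in 64 bits
       so the final borrow is visible as a plain comparison. */
    bool final_borrow = (uint64_t)chi < (uint64_t)xhi + borrow;

    /* No final borrow means c - x did not go negative, i.e. x <= c. */
    return !final_borrow;
}
```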
This patch does most of the work for early splitting the DImode comparisons. We now handle LT, GE, LTU and GEU during early expansion, in addition to EQ and NE, for which the expansion has now been reworked to use a standard conditional-compare pattern already in the back-end. To handle this we introduce two new condition flag modes that are used when comparing the upper words of decomposed DImode values: one for signed, and one for unsigned comparisons. CC_Bmode (B for Borrow) is essentially the inverse of CC_Cmode and is used when the carry flag is set by a subtraction of unsigned values. * config/arm/arm-modes.def (CC_NV, CC_B): New CC modes. * config/arm/arm.c (arm_select_cc_mode): Recognize constructs that need these modes. (arm_gen_dicompare_reg): New code to early expand the sub-operations of EQ, NE, LT, GE, LTU and GEU. * config/arm/iterators.md (CC_EXTEND): New code attribute. * config/arm/predicates.md (arm_adcimm_operand): New predicate. * config/arm/arm.md (cmpsi3_carryin_<CC_EXTEND>out): New pattern. (cmpsi3_imm_carryin_<CC_EXTEND>out): Likewise. (cmpsi3_0_carryin_<CC_EXTEND>out): Likewise. From-SVN: r277179
Richard Earnshaw committed -
In almost all cases it is better to handle inequalities against constants by transforming comparisons of the form (reg <GT/LE/GTU/LEU> const) into (reg <GE/LT/GEU/LTU> (const+1)). However, there are many cases that we could handle but currently failed to do so because we forced the constant into a register too early in the pattern expansion. To permit this to be done we need to defer forcing the constant into a register until after we've had the chance to do the transform - in some cases that may even mean that we no longer need to force the constant into a register at all. For example, on Arm, the case:

    _Bool f8 (unsigned long long a) { return a > 0xffffffff; }

previously compiled to

    mov r3, #0
    cmp r1, r3
    mvn r2, #0
    cmpeq r0, r2
    movhi r0, #1
    movls r0, #0
    bx lr

But now compiles to

    cmp r1, #1
    cmpeq r0, #0
    movcs r0, #1
    movcc r0, #0
    bx lr

Which although not yet completely optimal, is certainly better than previously.

* config/arm/arm.md (cbranchdi4): Accept reg_or_int_operand for operand 2. (cstoredi4): Similarly, but for operand 3. * config/arm/arm.c (arm_canonicalize_comparison): Allow canonicalization of unsigned compares with a constant on Arm. Prefer using const+1 and adjusting the comparison over swapping the operands whenever the original constant was not valid. (arm_gen_dicompare_reg): If Y is not a valid operand, force it to a register here. (arm_validize_comparison): Do not force invalid DImode operands to registers here. From-SVN: r277178
Richard Earnshaw committed -
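The const+1 canonicalization discussed in the change above (r277178) rests on a simple identity, sketched here in C. The function names are hypothetical; the precondition that the constant is not the maximum value of its type is an assumption the caller must guarantee, since const+1 would otherwise wrap (unsigned) or overflow (signed).

```c
#include <stdbool.h>
#include <stdint.h>

/* x > c rewritten as x >= c + 1 (unsigned). Requires c != UINT64_MAX. */
bool gtu_as_geu(uint64_t x, uint64_t c)
{
    return x >= c + 1;
}

/* x <= c rewritten as x < c + 1 (signed). Requires c != INT64_MAX. */
bool le_as_lt(int64_t x, int64_t c)
{
    return x < c + 1;
}

/* The f8 case from the commit message: a > 0xffffffff becomes
   a >= 0x100000000, which only needs the upper word to be >= 1. */
bool f8_canonical(unsigned long long a)
{
    uint32_t hi = (uint32_t)(a >> 32);
    return hi >= 1u;
}
```

In the f8 case the incremented constant has a zero low word and a small high word, which is why it fits directly into the compare instructions.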
This is the first step of early splitting all the DImode comparison operations. We start by factoring the DImode handling out of arm_gen_compare_reg into its own function. Simple DImode equality comparisons (such as equality with zero, or equality with a constant that is zero in one of the two word values that it comprises) can be done using a single subtract followed by an ORRS instruction. This avoids the need for conditional execution. For example, (r0 != 5) can be written as

    SUB Rt, R0, #5
    ORRS Rt, Rt, R1

The ORRS is now expanded using an SImode pattern that already exists in the MD file and this gives the register allocator more freedom to select registers (consecutive pairs are no longer required). Furthermore, we can then delete the arm_cmpdi_zero pattern as it is no longer required. We use SUB for the value adjustment as this has a generally more flexible range of immediates than XOR and what's more has the opportunity to be relaxed in thumb2 to a 16-bit SUBS instruction.

* config/arm/arm.c (arm_select_cc_mode): For DImode equality tests return CC_Zmode if comparing against a constant where one word is zero. (arm_gen_compare_reg): Split DImode handling to ... (arm_gen_dicompare_reg): ... here. Handle equality comparisons against simple constants. * config/arm/arm.md (arm_cmpdi_zero): Delete pattern. From-SVN: r277177
Richard Earnshaw committed -
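The SUB/ORRS idiom from the change above (r277177) can be checked with a small C model: a 64-bit value equals 5 exactly when its low word minus 5, ORed with its high word, is zero. The name `di_ne_5` is hypothetical and the sketch stands in for the Z flag with an ordinary comparison.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a 64-bit "x != 5" test using two 32-bit operations. */
bool di_ne_5(uint64_t x)
{
    uint32_t lo = (uint32_t)x;          /* R0: low word  */
    uint32_t hi = (uint32_t)(x >> 32);  /* R1: high word */

    uint32_t t = lo - 5u;  /* SUB  Rt, R0, #5: zero only if lo == 5    */
    t |= hi;               /* ORRS Rt, Rt, R1: zero only if hi == 0 too */

    return t != 0;         /* Z flag clear means "not equal" */
}
```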
This patch adds a couple of alternative canonicalizations to allow combine to match a subtract-with-carry operation when one of the operands is shifted first. The most common case of this is when combining a sign-extend of one operand with a long-long value during subtraction. The RSC variant is only enabled for Arm, the SBC variant for any 32-bit compilation. * config/arm/arm.md (subsi3_carryin_shift_alt): New pattern. (rsbsi3_carryin_shift_alt): Likewise. From-SVN: r277176
Richard Earnshaw committed -
When the carry flag is appropriately set by a comparison, negscc patterns can expand into a simple SBC of a register with itself. This means we can convert two conditional instructions into a single non-conditional instruction. Furthermore, in Thumb2 we can avoid the need for an IT instruction as well. This patch also fixes the remaining testcase that we initially XFAILed in the first patch of this series. gcc: * config/arm/arm.md (negscc_borrow): New pattern. (mov_negscc): Don't split if the insn would match negscc_borrow. * config/arm/thumb2.md (thumb2_mov_negscc): Likewise. (thumb2_mov_negscc_strict_it): Likewise. testsuite: * gcc.target/arm/negdi-3.c: Remove XFAIL markers. From-SVN: r277175
Richard Earnshaw committed -
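The SBC trick in the negscc change above (r277175) can be modelled in C: subtracting a register from itself with a borrow-in produces 0 or all-ones, which is the negated result of the preceding unsigned comparison. The name `neg_ltu` is hypothetical, and the borrow is represented as a plain 0/1 value rather than the inverted carry flag that Arm uses.

```c
#include <stdint.h>

/* Hypothetical model of -(a < b) for unsigned operands, computed the way
   "CMP a, b" followed by "SBC r, r, r" would compute it. */
uint32_t neg_ltu(uint32_t a, uint32_t b)
{
    uint32_t borrow = a < b;  /* CMP a, b borrows exactly when a < b   */
    uint32_t r = a;           /* the register's contents do not matter */
    return r - r - borrow;    /* SBC r, r, r: 0 or 0xffffffff          */
}
```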
Consider this sequence during combine:

    Trying 18, 7 -> 22:
       18: r118:SI=r122:SI
          REG_DEAD r122:SI
        7: r114:SI=0x1-r118:SI-ltu(cc:CC_RSB,0)
          REG_DEAD r118:SI
          REG_DEAD cc:CC_RSB
       22: r1:SI=r114:SI
          REG_DEAD r114:SI
    Failed to match this instruction:
    (set (reg:SI 1 r1 [+4 ])
        (minus:SI (geu:SI (reg:CC_RSB 100 cc)
                (const_int 0 [0]))
            (reg:SI 122)))
    Successfully matched this instruction:
    (set (reg:SI 114)
        (geu:SI (reg:CC_RSB 100 cc)
            (const_int 0 [0])))
    Successfully matched this instruction:
    (set (reg:SI 1 r1 [+4 ])
        (minus:SI (reg:SI 114)
            (reg:SI 122)))
    allowing combination of insns 18, 7 and 22
    original costs 4 + 4 + 4 = 12
    replacement costs 8 + 4 = 12

The costs are all correct, but we really don't want this combination to take place. The original costs contain an insn that is a simple move of one pseudo register to another and it is extremely likely that register allocation will eliminate this insn entirely. On the other hand, the resulting sequence really does expand into a sequence that costs 12 (i.e. 3 insns). We don't want to prevent combine from eliminating such moves, as this can expose more combine opportunities, but we shouldn't rate them as profitable in themselves. We can do this by adjusting the costs slightly so that the benefit of eliminating such a simple insn is reduced. We only do this before register allocation; after allocation we give such insns their full cost.

* config/arm/arm.c (arm_insn_cost): New function. (TARGET_INSN_COST): Override default definition. From-SVN: r277174
Richard Earnshaw committed