Commits · 7ed54790da87bbb4a134020a9fb8bd1b72fd0acb · lvzhengyang / riscv-gcc-1

21 Oct, 2019 6 commits

Pass a vec_info to get_vectype_for_scalar_type · 7ed54790

2019-10-21  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* tree-vectorizer.h (get_vectype_for_scalar_type): Take a vec_info.
	* tree-vect-stmts.c (get_vectype_for_scalar_type): Likewise.
	(vect_prologue_cost_for_slp_op): Update call accordingly.
	(vect_get_vec_def_for_operand, vect_get_gather_scatter_ops)
	(vect_get_strided_load_store_ops, vectorizable_simd_clone_call)
	(vect_supportable_shift, vect_is_simple_cond, vectorizable_comparison)
	(get_mask_type_for_scalar_type): Likewise.
	(vect_get_vector_types_for_stmt): Likewise.
	* tree-vect-data-refs.c (vect_analyze_data_refs): Likewise.
	* tree-vect-loop.c (vect_determine_vectorization_factor): Likewise.
	(get_initial_def_for_reduction, build_vect_cond_expr): Likewise.
	* tree-vect-patterns.c (vect_supportable_direct_optab_p): Likewise.
	(vect_split_statement, vect_convert_input): Likewise.
	(vect_recog_widen_op_pattern, vect_recog_pow_pattern): Likewise.
	(vect_recog_over_widening_pattern, vect_recog_mulhs_pattern): Likewise.
	(vect_recog_average_pattern, vect_recog_cast_forwprop_pattern)
	(vect_recog_rotate_pattern, vect_recog_vector_vector_shift_pattern)
	(vect_synth_mult_by_constant, vect_recog_mult_pattern): Likewise.
	(vect_recog_divmod_pattern, vect_recog_mixed_size_cond_pattern)
	(check_bool_pattern, adjust_bool_pattern_cast, adjust_bool_pattern)
	(search_type_for_mask_1, vect_recog_bool_pattern): Likewise.
	(vect_recog_mask_conversion_pattern): Likewise.
	(vect_add_conversion_to_pattern): Likewise.
	(vect_recog_gather_scatter_pattern): Likewise.
	* tree-vect-slp.c (vect_build_slp_tree_2): Likewise.
	(vect_analyze_slp_instance, vect_get_constant_vectors): Likewise.

From-SVN: r277227

committed Oct 21, 2019


Pass a vec_info to get_mask_type_for_scalar_type · 1bd5196c

2019-10-21  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* tree-vectorizer.h (get_mask_type_for_scalar_type): Take a vec_info.
	* tree-vect-stmts.c (get_mask_type_for_scalar_type): Likewise.
	(vect_check_load_store_mask): Update call accordingly.
	(vect_get_mask_type_for_stmt): Likewise.
	* tree-vect-patterns.c (check_bool_pattern): Likewise.
	(search_type_for_mask_1, vect_recog_mask_conversion_pattern): Likewise.
	(vect_convert_mask_for_vectype): Likewise.

From-SVN: r277226

committed Oct 21, 2019


Pass a vec_info to vect_supportable_direct_optab_p · dcab2a0d

2019-10-21  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* tree-vect-patterns.c (vect_supportable_direct_optab_p): Take
	a vec_info.
	(vect_recog_dot_prod_pattern): Update call accordingly.
	(vect_recog_sad_pattern, vect_recog_pow_pattern): Likewise.
	(vect_recog_widen_sum_pattern): Likewise.

From-SVN: r277225

committed Oct 21, 2019


Pass a vec_info to vect_supportable_shift · a5c3185a

2019-10-21  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* tree-vectorizer.h (vect_supportable_shift): Take a vec_info.
	* tree-vect-stmts.c (vect_supportable_shift): Likewise.
	* tree-vect-patterns.c (vect_synth_mult_by_constant): Update call
	accordingly.

From-SVN: r277224

committed Oct 21, 2019


Avoid setting current_vector_size in get_vec_alignment_for_array_type · da157e2e

The increase_alignment pass was using get_vectype_for_scalar_type
to get the preferred vector type for each array element type.
This has the effect of carrying over the vector size chosen by
the first successful call to all subsequent calls, whereas it seems
more natural to treat each array type independently and pick the
"best" vector type for each element type.

2019-10-21  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* tree-vectorizer.c (get_vec_alignment_for_array_type): Use
	get_vectype_for_scalar_type_and_size instead of
	get_vectype_for_scalar_type.

From-SVN: r277223

committed Oct 21, 2019


Daily bump. · 5bf2f162

From-SVN: r277221

GCC Administrator committed Oct 21, 2019

20 Oct, 2019 7 commits

common.opt (-fcommon): Fix description. · df73e971

2019-10-20  Bernd Edlinger  <bernd.edlinger@hotmail.de>

        * common.opt (-fcommon): Fix description.

From-SVN: r277217

committed Oct 20, 2019


i386-protos.h (ix86_pre_reload_split): Declare. · 51085ca5

	* config/i386/i386-protos.h (ix86_pre_reload_split): Declare.
	* config/i386/i386.c (ix86_pre_reload_split): New function.
	* config/i386/i386.md (*fix_trunc<mode>_i387_1, *add<mode>3_eq,
	*add<mode>3_ne, *add<mode>3_eq_0, *add<mode>3_ne_0, *add<mode>3_eq,
	*add<mode>3_ne, *add<mode>3_eq_1, *add<mode>3_eq_0, *add<mode>3_ne_0,
	*anddi3_doubleword, *andndi3_doubleword, *<code>di3_doubleword,
	*one_cmpldi2_doubleword, *ashl<dwi>3_doubleword_mask,
	*ashl<dwi>3_doubleword_mask_1, *ashl<mode>3_mask, *ashl<mode>3_mask_1,
	*<shift_insn><mode>3_mask, *<shift_insn><mode>3_mask_1,
	*<shift_insn><dwi>3_doubleword_mask,
	*<shift_insn><dwi>3_doubleword_mask_1, *<rotate_insn><mode>3_mask,
	*<rotate_insn><mode>3_mask_1, *<btsc><mode>_mask, *<btsc><mode>_mask_1,
	*btr<mode>_mask, *btr<mode>_mask_1, *jcc_bt<mode>, *jcc_bt<mode>_1,
	*jcc_bt<mode>_mask, *popcounthi2_1, frndintxf2_<rounding>,
	*fist<mode>2_<rounding>_1, *<code><mode>3_1, *<code>di3_doubleword):
	Use ix86_pre_reload_split instead of can_create_pseudo_p in condition.
	* config/i386/sse.md (*sse4_1_<code>v8qiv8hi2<mask_name>_2,
	*avx2_<code>v8qiv8si2<mask_name>_2,
	*sse4_1_<code>v4qiv4si2<mask_name>_2,
	*sse4_1_<code>v4hiv4si2<mask_name>_2,
	*avx512f_<code>v8qiv8di2<mask_name>_2,
	*avx2_<code>v4qiv4di2<mask_name>_2, *avx2_<code>v4hiv4di2<mask_name>_2,
	*sse4_1_<code>v2hiv2di2<mask_name>_2,
	*sse4_1_<code>v2siv2di2<mask_name>_2, sse4_2_pcmpestr,
	sse4_2_pcmpistr): Likewise.

From-SVN: r277216

committed Oct 20, 2019


install.texi (Configuration, [...]): hboehm.info now defaults to https. · efbf0f1e

	* doc/install.texi (Configuration, --enable-objc-gc): hboehm.info
	now defaults to https.

From-SVN: r277215

Gerald Pfeifer committed Oct 20, 2019

tree-ssa-alias.c (nonoverlapping_refs_since_match_p): Do not skip non-zero array accesses. · f373041c


	* tree-ssa-alias.c (nonoverlapping_refs_since_match_p): Do not
	skip non-zero array accesses.

	* gcc.c-torture/execute/alias-access-path-2.c: New testcase.
	* gcc.dg/tree-ssa/alias-access-path-11.c: xfail.

From-SVN: r277214

committed Oct 20, 2019


Move code out of vect_slp_analyze_bb_1 · 1d778697

After the previous patch, it seems more natural to apply the
PARAM_SLP_MAX_INSNS_IN_BB threshold as soon as we know what
the region is, rather than delaying it to vect_slp_analyze_bb_1.
(But rather than carve out the biggest region possible and then
reject it, wouldn't it be better to stop when the region gets
too big, to at least give us a chance of vectorising something?)

It also seems more natural for vect_slp_bb_region to create the
bb_vec_info itself rather than (a) having to pass bits of data down
for the initialisation and (b) forcing vect_slp_analyze_bb_1 to free
on every failure return.

2019-10-20  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* tree-vect-slp.c (vect_slp_analyze_bb_1): Take a bb_vec_info
	and return a boolean success value.  Move the allocation and
	initialization of the bb_vec_info to...
	(vect_slp_bb_region): ...here.  Update call accordingly.
	(vect_slp_bb): Apply PARAM_SLP_MAX_INSNS_IN_BB here rather
	than in vect_slp_analyze_bb_1.

From-SVN: r277211

committed Oct 20, 2019


Avoid recomputing data references in BB SLP · fa0c8df7

If the first attempt at applying BB SLP to a region fails, the main loop
in vect_slp_bb recomputes the region's bounds and datarefs for the next
vector size.  AFAICT this isn't needed any more; we should be able
to reuse the datarefs from the first attempt instead.

2019-10-20  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* tree-vect-slp.c (vect_slp_analyze_bb_1): Call save_datarefs
	when processing the given datarefs for the first time and
	check_datarefs subsequently.
	(vect_slp_bb_region): New function, split out of...
	(vect_slp_bb): ...here.  Don't recompute the region bounds and
	dataref sets when retrying with a different vector size.

From-SVN: r277210

committed Oct 20, 2019


Daily bump. · b4edf5c5

From-SVN: r277209

GCC Administrator committed Oct 20, 2019

19 Oct, 2019 7 commits

nodiscard-reason-only-one.C: In dg-error or dg-warning remove (?n) uses and… · 0fcd8629

nodiscard-reason-only-one.C: In dg-error or dg-warning remove (?n) uses and replace .* with \[^\n\r]*.

	* g++.dg/cpp2a/nodiscard-reason-only-one.C: In dg-error or dg-warning
	remove (?n) uses and replace .* with \[^\n\r]*.
	* g++.dg/cpp2a/nodiscard-reason.C: Likewise.
	* g++.dg/cpp2a/nodiscard-once.C: Likewise.
	* g++.dg/cpp2a/nodiscard-reason-nonstring.C: Likewise.

From-SVN: r277205

committed Oct 20, 2019


re PR fortran/91926 (assumed rank optional) · b3fbf95e

2019-10-19  Paul Thomas  <pault@gcc.gnu.org>

	PR fortran/91926
	* runtime/ISO_Fortran_binding.c (cfi_desc_to_gfc_desc): Revert
	the change made on 2019-10-05.

From-SVN: r277204

committed Oct 19, 2019


re PR target/92140 (clang vs gcc optimizing with adc/sbb) · 15643a0d

	PR target/92140
	* config/i386/predicates.md (int_nonimmediate_operand): New special
	predicate.
	* config/i386/i386.md (*add<mode>3_eq, *add<mode>3_ne,
	*add<mode>3_eq_0, *add<mode>3_ne_0, *sub<mode>3_eq, *sub<mode>3_ne,
	*sub<mode>3_eq_1, *sub<mode>3_eq_0, *sub<mode>3_ne_0): New
	define_insn_and_split patterns.

	* gcc.target/i386/pr92140.c: New test.
	* gcc.c-torture/execute/pr92140.c: New test.

Co-Authored-By: Uros Bizjak <ubizjak@gmail.com>

From-SVN: r277203

committed Oct 19, 2019


[Darwin, testsuite] Fix Wnonnull on Darwin. · 2366bf60

Darwin does not mark entries in string.h with nonnull attributes
so the test fails.  Since the purpose of the test is to check that
the warnings are issued for an inlined function, not that the target
headers are marked up, we can provide marked up headers for Darwin.

gcc/testsuite/ChangeLog:

2019-10-19  Iain Sandoe  <iain@sandoe.co.uk>

	* gcc.dg/Wnonnull.c: Add attributed function declarations for
	memcpy and strlen for Darwin.

From-SVN: r277202

committed Oct 19, 2019


[PPC] Delete out of date comment. · dc7e9feb

Removes a comment that's no longer relevant.

gcc/ChangeLog:

2019-10-19  Iain Sandoe  <iain@sandoe.co.uk>

	* config/rs6000/rs6000.md: Delete out-of-date comment about
	special-casing integer loads.

From-SVN: r277201

committed Oct 19, 2019


Implement C++20 P1301 [[nodiscard("should have a reason")]]. · 8ad0c477

2019-10-17  JeanHeyd Meneide  <phdofthehouse@gmail.com>

gcc/
        * escaped_string.h (escaped_string): New header.
        * tree.c (escaped_string): Remove escaped_string class.

gcc/c-family
        * c-lex.c (c_common_has_attribute): Update nodiscard value.

gcc/cp/
        * tree.c (handle_nodiscard_attribute) Added C++2a nodiscard
	string message.
        (std_attribute_table) Increase nodiscard argument handling
	max_length from 0 to 1.
        * parser.c (cp_parser_check_std_attribute): Add requirement
	that nodiscard only be seen once in attribute-list.
        (cp_parser_std_attribute): Check that empty parenthesis lists are
        not specified for attributes that have max_length > 0 (e.g.
	[[attr()]]).
        * cvt.c (maybe_warn_nodiscard): Add nodiscard message to
	output, if applicable.
	(convert_to_void): Allow constructors to be nodiscard-able (P1771).

gcc/testsuite/g++.dg/cpp0x
        * gen-attrs-67.C: Test new error message for empty-parenthesis-list.

gcc/testsuite/g++.dg/cpp2a
        * nodiscard-construct.C: New test.
        * nodiscard-once.C: New test.
        * nodiscard-reason-nonstring.C: New test.
        * nodiscard-reason-only-one.C: New test.
        * nodiscard-reason.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>

From-SVN: r277200

committed Oct 19, 2019


Daily bump. · 9299523c

From-SVN: r277199

GCC Administrator committed Oct 19, 2019

18 Oct, 2019 20 commits

PR tree-optimization/92157 - incorrect strcmp() == 0 result for unknown strings · 9c233ad0

gcc/testsuite/ChangeLog:

	PR tree-optimization/92157
	* gcc.dg/strlenopt-69.c: Disable test failing due to PR 92155.
	* gcc.dg/strlenopt-87.c: New test.

gcc/ChangeLog:

	PR tree-optimization/92157
	* tree-ssa-strlen.c (handle_builtin_string_cmp): Be prepared for
	compute_string_length to return a negative result.

From-SVN: r277194

committed Oct 18, 2019


[arm] Fix testsuite nit when compiling for thumb2 · f8b9b1ed

In thumb2 we now generate a NEGS instruction rather than RSBS, so this
test needs updating.

	* gcc.target/arm/negdi-3.c: Update expected output to allow NEGS.

From-SVN: r277192

committed Oct 18, 2019


[arm] Improvements to negvsi4 and negvdi4. · a7c3ebae

The generic expansion code for negv does not try the subv patterns,
but instead emits a sub and a compare separately.  Fortunately, the
patterns can make use of the new subv operations, so just call those.
We can also rewrite this using an iterator to simplify things further.
Finally, we can now make negvdi4 work on Thumb2 as well as Arm.
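
As a rough illustration of why negv can lean on the subv operations: a
negation with overflow check is the subtraction 0 - x with overflow check.
The sketch below uses the GCC/Clang overflow builtin; the helper name
neg_overflows is made up for this example and does not appear in the patch.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: negation-with-overflow written as the
   subtract-with-overflow 0 - x.  */
bool neg_overflows(int x)
{
    int result;
    return __builtin_sub_overflow(0, x, &result);
}
```

The only input for which this reports overflow is INT_MIN, since every
other int has a representable negation.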

	* config/arm/arm.md (negv<SIDI:mode>3): New expansion rule.
	(negvsi3, negvdi3): Delete.
	(negdi2_compare): Delete.

From-SVN: r277191

committed Oct 18, 2019


[arm] Early expansion of subvdi4 · ead32773

This patch adds early expansion of subvdi4.  The expansion sequence
is broadly based on the expansion of usubvdi4.

	* config/arm/arm.md (subvdi4): Decompose calculation into 32-bit
	operations.
	(subdi3_compare1): Delete pattern.
	(subvsi3_borrow): New insn pattern.
	(subvsi3_borrow_imm): Likewise.

From-SVN: r277190

committed Oct 18, 2019


[arm] Improve constant handling for subvsi4. · 238273fe

This patch addresses constant handling in subvsi4.  Either operand may
be a constant.  If the second input (operand[2]) is a constant, then
we can canonicalize this into an addition form, providing we take care
of the INT_MIN case.  In that case the negation has to handle the fact
that -INT_MIN is still INT_MIN and we need to ensure that a subtract
operation is performed rather than an addition.  The remaining cases
are largely duals of the usubvsi4 expansion.
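
The INT_MIN corner case can be sketched in C as follows.  This is only a
model of the idea, not the RTL expansion: the function names are
hypothetical, and the "naive" variant relies on GCC's wrapping conversion
from unsigned to int to imitate a negation that wraps.

```c
#include <limits.h>
#include <stdbool.h>

/* Reference behaviour: signed a - c with overflow detection.  */
bool sub_overflows(int a, int c)
{
    int r;
    return __builtin_sub_overflow(a, c, &r);
}

/* Model of the canonicalization: rewrite a - c as a + (-c), but keep a
   real subtraction when c is INT_MIN, because -INT_MIN is not
   representable.  */
bool sub_overflows_canonical(int a, int c)
{
    int r;
    if (c == INT_MIN)
        return __builtin_sub_overflow(a, c, &r);
    return __builtin_add_overflow(a, -c, &r);
}

/* What goes wrong without the special case: the negated constant wraps
   back to INT_MIN, so the addition reports overflow for inputs where
   the subtraction does not.  */
bool sub_overflows_naive(int a, int c)
{
    int r;
    int negated = (int)(0u - (unsigned)c);  /* wraps for INT_MIN on GCC */
    return __builtin_add_overflow(a, negated, &r);
}
```

For example, -1 - INT_MIN is INT_MAX and does not overflow, but the naive
form computes -1 + INT_MIN, which does.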

This patch also fixes a technical correctness bug in the old
expansion, where we did not really describe the test for overflow in
the RTL.  We seem to have got away with that, however...

	* config/arm/arm.md (subv<mode>4): Delete.
	(subvdi4): New expander pattern.
	(subvsi4): Likewise.  Handle some immediate values.
	(subvsi3_intmin): New insn pattern.
	(subvsi3): Likewise.
	(subvsi3_imm1): Likewise.
	* config/arm/arm.c (select_cc_mode): Also allow minus for CC_V
	idioms.

From-SVN: r277189

committed Oct 18, 2019


[arm] Early expansion of usubvdi4. · eff5ce0a

This patch adds early expansion of usubvdi4, allowing us to handle some
constants in place, which previously we were unable to do.

	* config/arm/arm.md (usubvdi4): Allow registers or integers for
	incoming operands.  Early split the calculation into SImode
	operations.
	(usubvsi3_borrow): New insn pattern.
	(usubvsi3_borrow_imm): Likewise.

From-SVN: r277188

committed Oct 18, 2019


[arm] Improve constant handling for usubvsi4. · a79048f6

This patch improves the expansion of usubvsi4 by allowing suitable
constants to be passed directly.  Unlike normal subtraction, either
operand may be a constant (and indeed I have seen cases where both can
be with LTO enabled).  One interesting testcase that improves as a
result of this is:

unsigned f6 (unsigned a)
{
  unsigned x;
  return __builtin_sub_overflow (5U, a, &x) ? 0 : x;
}

Which previously compiled to:

	rsbs	r3, r0, #5
	cmp	r0, #5
	movls	r0, r3
	movhi	r0, #0

but now generates the optimal sequence:

	rsbs	r0, r0, #5
	movcc	r0, #0

	* config/arm/arm.md (usubv<mode>4): Delete expansion.
	(usubvsi4): New pattern.  Allow some immediate values for inputs.
	(usubvdi4): New pattern.

From-SVN: r277187

committed Oct 18, 2019


[arm] Early split addvdi4 · fa62df0e

This patch adds early splitting for addvdi4; it's very similar to the
uaddvdi4 splitter, but the details are just different enough in
places, especially for the patterns that match the splitting, where we
have to compare against the non-widened version to detect if overflow
occurred.
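
A minimal C model of the idea, assuming GCC's behaviour for right shifts of
negative values and for out-of-range integer conversions.  The struct and
function names are invented for this sketch; the real work is done by the
RTL patterns listed below.

```c
#include <stdbool.h>
#include <stdint.h>

struct add64_result {
    int64_t sum;
    bool overflow;
};

/* Signed 64-bit add-with-overflow built from 32-bit steps.  */
struct add64_result addv64_by_words(int64_t a, int64_t b)
{
    uint32_t alo = (uint32_t)a, blo = (uint32_t)b;
    int32_t ahi = (int32_t)(a >> 32), bhi = (int32_t)(b >> 32);

    /* Low word: unsigned add that produces a carry.  */
    uint32_t lo = alo + blo;
    int carry = lo < alo;

    /* High word: widened sum versus the truncated (non-widened) result.
       They differ exactly when the signed addition overflowed.  */
    int64_t wide = (int64_t)ahi + bhi + carry;
    int32_t hi = (int32_t)wide;

    struct add64_result r;
    r.sum = (int64_t)(((uint64_t)(uint32_t)hi << 32) | lo);
    r.overflow = (wide != hi);
    return r;
}
```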

I've also added a testcase to the testsuite for a couple of constants
that caught me out during the development of this patch.  They're
probably arm-specific values, but the test is generic enough that I've
included it for all targets.

[gcc]
	* config/arm/arm.c (arm_select_cc_mode): Allow either the first
	or second operand of the PLUS inside a DImode equality test to be
	sign-extend when selecting CC_Vmode.
	* config/arm/arm.md (addvdi4): Early-split the operation into SImode
	instructions.
	(addsi3_cin_vout_reg, addsi3_cin_vout_imm, addsi3_cin_vout_0): New
	expand patterns.
	(addsi3_cin_vout_reg_insn, addsi3_cin_vout_imm_insn): New patterns.
	(addsi3_cin_vout_0): Likewise.
	(adddi3_compareV): Delete.

[gcc/testsuite]
	* gcc.dg/builtin-arith-overflow-3.c: New test.

From-SVN: r277186

committed Oct 18, 2019


[arm] Allow the summation result of signed add-with-overflow to be discarded. · db962d0a

This patch matches the signed add-with-overflow patterns when the
summation itself is dropped.  In this case we can use CMN (or CMP with
some immediates).  There are a small number of constants in thumb2
where this can result in less dense code (as we lack 16-bit CMN with
immediate patterns).  To handle this we use peepholes to try these
alternatives when either a scratch is available (0 <= i <= 7) or the
original register is dead (0 <= i <= 255).  We don't use a scratch in
the pattern as if those conditions are not satisfied then the 32-bit
form is preferable to forcing a reload.
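
The kind of source that leaves the summation unused looks roughly like the
following.  The function name is hypothetical; the point is only that the
caller wants the overflow flag and never reads the sum.

```c
#include <stdbool.h>

/* Only the overflow flag is used; the sum itself is dead, so a
   flag-setting compare is enough in principle.  */
bool add_would_overflow(int a, int b)
{
    int unused_sum;
    return __builtin_add_overflow(a, b, &unused_sum);
}
```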

	* config/arm/arm.md (addsi3_compareV_reg_nosum): New insn.
	(addsi3_compareV_imm_nosum): New insn.  Also add peephole2 patterns
	to transform this back into the summation version when that leads
	to smaller code.

From-SVN: r277185

committed Oct 18, 2019


[arm] Improve code generation for addvsi4. · dbba8a17

Similar to the improvements for uaddvsi4, this patch improves the code
generation for addvsi4 to handle immediates and to add alternatives
that better target thumb2.  To do this we separate out the expansion
of addvsi4 from that of addvdi4 and then add an additional pattern
to handle constants.  Also, while doing this I've fixed the incorrect
usage of NE instead of COMPARE in the generated RTL.

	* config/arm/arm.md (addv<mode>4): Delete.
	(addvsi4): New pattern.  Handle immediate values that the architecture
	supports.
	(addvdi4): New pattern.
	(addsi3_compareV): Rename to ...
	(addsi3_compareV_reg): ... this.  Add constraints for thumb2 variants
	and use COMPARE rather than NE.
	(addsi3_compareV_imm): New pattern.
	* config/arm/arm.c (arm_select_cc_mode): Return CC_Vmode for
	a signed-overflow check.

From-SVN: r277184

committed Oct 18, 2019


[arm] Early expansion of uaddvdi4. · deb254e0

This code borrows heavily from the uaddvti4 expansion for aarch64 since
the principles are similar.  Firstly, if one of the low words of
the expansion is 0, we can simply copy the other low word to the
destination and use uaddvsi4 for the upper word.  If that doesn't work
we have to handle three possible cases for the upper word (the lower
word is simply an add-with-carry operation as for adddi3): zero in the
upper word, some other constant and a register (each has a different
canonicalization).  We use CC_ADCmode (a new CC mode variant) to
describe the cases as the introduction of the carry means we can
no longer use the normal overflow trick of comparing the sum against
one of the operands.
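
The carry problem can be seen in a small C model.  This is a sketch under
the assumption that a 64-bit intermediate stands in for the carry flag; the
names are made up for the example and are not from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

struct uadd64_result {
    uint64_t sum;
    bool overflow;
};

/* Unsigned 64-bit add-with-overflow built from two 32-bit steps.  */
struct uadd64_result uaddv64_by_words(uint64_t a, uint64_t b)
{
    uint32_t alo = (uint32_t)a, ahi = (uint32_t)(a >> 32);
    uint32_t blo = (uint32_t)b, bhi = (uint32_t)(b >> 32);

    /* Low word: with no carry-in, "sum < operand" detects the carry-out.  */
    uint32_t lo = alo + blo;
    uint32_t carry = lo < alo;

    /* High word: add with carry-in.  Computed in 64 bits so that the
       carry-out can be read directly from bit 32.  */
    uint64_t wide = (uint64_t)ahi + bhi + carry;

    struct uadd64_result r;
    r.sum = (wide << 32) | lo;
    r.overflow = (wide >> 32) != 0;
    return r;
}

/* The usual trick applied to the high word; it misses some carries.  */
bool naive_high_carry(uint32_t ahi, uint32_t bhi, uint32_t carry_in)
{
    uint32_t hi = ahi + bhi + carry_in;
    return hi < ahi;
}
```

With ahi = 5, bhi = 0xffffffff and a carry-in of 1, the high-word sum wraps
all the way round to 5, so the naive comparison reports no carry even though
one occurred.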

	* config/arm/arm-modes.def (CC_ADC): New CC mode.
	* config/arm/arm.c (arm_select_cc_mode): Detect selection of
	CC_ADCmode.
	(maybe_get_arm_condition_code): Handle CC_ADCmode.
	* config/arm/arm.md (uaddvdi4): Early expansion of unsigned addition
	with overflow.
	(addsi3_cin_cout_reg, addsi3_cin_cout_imm, addsi3_cin_cout_0): New
	expand patterns.
	(addsi3_cin_cout_reg_insn, addsi3_cin_cout_0_insn): New insn patterns.
	(addsi3_cin_cout_imm_insn): Likewise.
	(adddi3_compareC): Delete insn.
	* config/arm/predicates.md (arm_carry_operation): Handle CC_ADCmode.

From-SVN: r277183

committed Oct 18, 2019


[arm] Handle immediate values in uaddvsi4 · ed6588f2

The uaddv patterns in the arm back-end do not currently handle immediates
during expansion.  This patch adds this support for uaddvsi4.  It's really
a stepping-stone towards early expansion of uaddvdi4, but it is a complete
and useful change in its own right.

Whilst making this change I also observed that we really had two patterns
that did exactly the same thing, but with slightly different properties;
consequently I've cleaned up all of the add-and-compare patterns to bring
some consistency.

	* config/arm/arm.md (adddi3): Call gen_addsi3_compare_op1.
	(uaddv<mode>4): Delete expansion pattern.
	(uaddvsi4): New pattern.
	(uaddvdi4): Likewise.
	(addsi3_compareC): Delete pattern, change callers to use
	addsi3_compare_op1.
	(addsi3_compare_op1): No-longer anonymous.  Clean up constraints to
	reduce the number of alternatives and re-work type attribute handling.
	(addsi3_compare_op2): Clean up constraints to reduce the number of
	alternatives and re-work type attribute handling.
	(compare_addsi2_op0): Likewise.
	(compare_addsi2_op1): Likewise.

From-SVN: r277182

committed Oct 18, 2019


[arm] Cleanup dead code - old support for DImode comparisons · f9f6247d

Now that all the major patterns for DImode have been converted to
early expansion, we can safely clean up some dead code for the old way
of handling DImode.

	* config/arm/arm-modes.def (CC_NCV, CC_CZ): Delete CC modes.
	* config/arm/arm.c (arm_select_cc_mode): Remove old selection code
	for DImode operands.
	(arm_gen_dicompare_reg): Remove unreachable expansion code.
	(maybe_get_arm_condition_code): Remove support for CC_CZmode and
	CC_NCVmode.
	* config/arm/arm.md (arm_cmpdi_insn): Delete.
	(arm_cmpdi_unsigned): Delete.

From-SVN: r277181

committed Oct 18, 2019


[arm] Handle some constant comparisons using rsbs+rscs · af74bfee

In a small number of cases it is preferable to handle comparisons with
constants using the sequence

	RSBS	tmp, Xlo, constlo
	RSCS	tmp, Xhi, consthi

which allows us to handle a small number of LE/GT/LEU/GTU cases when
changing the code to use LT/GE/LTU/GEU would make the constant more
expensive.  Sadly, we cannot do this on Thumb, since we need RSC, so we
now always use the incremented constant in that case since normally that
still works out cheaper than forcing the entire constant into a register.

Further investigation has also shown that the canonicalization of a
reverse subtract and compare is valid for signed as well as unsigned values,
so we relax the restriction on selecting CC_RSBmode to allow all types
of compare.
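
A C model of the unsigned case, as a sketch only: X <= const holds exactly
when the reverse subtraction const - X produces no borrow out of the upper
word.  The function name is hypothetical, and a 64-bit intermediate stands
in for the flags.

```c
#include <stdbool.h>
#include <stdint.h>

/* Unsigned x <= c computed as "c - x does not borrow", word by word.  */
bool leu64_by_reverse_subtract(uint64_t x, uint64_t c)
{
    uint32_t xlo = (uint32_t)x, xhi = (uint32_t)(x >> 32);
    uint32_t clo = (uint32_t)c, chi = (uint32_t)(c >> 32);

    /* RSBS tmp, Xlo, constlo: low-word reverse subtract, note the borrow.  */
    uint32_t borrow = clo < xlo;

    /* RSCS tmp, Xhi, consthi: high-word reverse subtract with borrow-in.
       A negative result shows up as set bits above bit 31.  */
    uint64_t wide = (uint64_t)chi - xhi - borrow;
    bool borrow_out = (wide >> 32) != 0;

    return !borrow_out;
}
```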

	* config/arm/arm.c (arm_const_double_prefer_rsbs_rsc): New function.
	(arm_canonicalize_comparison): For GT/LE/GTU/LEU, use the constant
	unchanged only if that will be cheaper.
	(arm_select_cc_mode): Recognize a swapped comparison that will
	be regenerated using RSBS or RSCS.  Relax restriction on selecting
	CC_RSBmode.
	(arm_gen_dicompare_reg): Handle LE/GT/LEU/GTU comparisons against
	a constant.
	(arm_gen_compare_reg): Handle compare (CONST, X) when the mode
	is CC_RSBmode.
	(maybe_get_arm_condition_code): CC_RSBmode now returns the same codes
	as CCmode.
	* config/arm/arm.md (rsb_imm_compare_scratch): New pattern.
	(rscsi3_<CC_EXTEND>out_scratch): New pattern.

From-SVN: r277180

committed Oct 18, 2019


[arm] early split most DImode comparison operations. · 8b8ab8f4

This patch does most of the work for early splitting the DImode
comparisons.  We now handle LT, GE, LTU and GEU during early
expansion, in addition to EQ and NE, for which the expansion has now
been reworked to use a standard conditional-compare pattern already in
the back-end.

To handle this we introduce two new condition flag modes that are used
when comparing the upper words of decomposed DImode values: one for
signed, and one for unsigned comparisons.  CC_Bmode (B for Borrow) is
essentially the inverse of CC_Cmode and is used when the carry flag is
set by a subtraction of unsigned values.
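
The decomposition can be modelled in C along these lines.  This is a sketch
with invented function names, using wide intermediates in place of the
condition flags and assuming GCC's arithmetic right shift for negative
values.

```c
#include <stdbool.h>
#include <stdint.h>

/* Unsigned a < b: subtract the low words, then the high words with
   borrow-in; a borrow out of the high word means a < b.  */
bool ltu64_by_words(uint64_t a, uint64_t b)
{
    uint32_t alo = (uint32_t)a, ahi = (uint32_t)(a >> 32);
    uint32_t blo = (uint32_t)b, bhi = (uint32_t)(b >> 32);

    uint32_t borrow = alo < blo;
    uint64_t wide = (uint64_t)ahi - bhi - borrow;
    return (wide >> 32) != 0;
}

/* Signed a < b: the low words still compare as unsigned; the high words
   are subtracted as signed values with borrow-in, and the sign of the
   exact result decides (what N != V expresses in flag terms).  */
bool lt64_by_words(int64_t a, int64_t b)
{
    uint32_t alo = (uint32_t)a, blo = (uint32_t)b;
    int32_t ahi = (int32_t)(a >> 32), bhi = (int32_t)(b >> 32);

    int borrow = alo < blo;
    int64_t wide = (int64_t)ahi - bhi - borrow;
    return wide < 0;
}
```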

	* config/arm/arm-modes.def (CC_NV, CC_B): New CC modes.
	* config/arm/arm.c (arm_select_cc_mode): Recognize constructs that
	need these modes.
	(arm_gen_dicompare_reg): New code to early expand the sub-operations
	of EQ, NE, LT, GE, LTU and GEU.
	* config/arm/iterators.md (CC_EXTEND): New code attribute.
	* config/arm/predicates.md (arm_adcimm_operand): New predicate.
	* config/arm/arm.md (cmpsi3_carryin_<CC_EXTEND>out): New pattern.
	(cmpsi3_imm_carryin_<CC_EXTEND>out): Likewise.
	(cmpsi3_0_carryin_<CC_EXTEND>out): Likewise.

From-SVN: r277179

committed Oct 18, 2019


[arm] Improve handling of DImode comparisons against constants. · 22060d0e

In almost all cases it is better to handle inequality comparisons against
constants by transforming comparisons of the form (reg <GT/LE/GTU/LEU> const)
into (reg <GE/LT/GEU/LTU> (const+1)).  However, there are many cases that we
could handle but currently failed to do so because we forced the constant into
a register too early in the pattern expansion.  To permit this to be done we
need to defer forcing the constant into a register until after we've had the
chance to do the transform - in some cases that may even mean that we no longer
need to force the constant into a register at all.  For example, on Arm, the case:

_Bool f8 (unsigned long long a) { return a > 0xffffffff; }

previously compiled to

        mov     r3, #0
        cmp     r1, r3
        mvn     r2, #0
        cmpeq   r0, r2
        movhi   r0, #1
        movls   r0, #0
        bx      lr

But now compiles to

        cmp     r1, #1
        cmpeq   r0, #0
        movcs   r0, #1
        movcc   r0, #0
        bx      lr

Which although not yet completely optimal, is certainly better than
previously.
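
In C terms the f8 example above works out as follows.  This is an
illustrative sketch: f8_adjusted and geu64_by_words are hypothetical names,
and the word-wise helper only mirrors the shape of the CMP/CMPEQ sequence.

```c
#include <stdbool.h>
#include <stdint.h>

/* The original test.  */
bool f8_original(uint64_t a) { return a > 0xffffffffULL; }

/* a > c is the same as a >= c + 1 as long as c + 1 does not wrap.  Here
   the adjusted constant has a zero low word, so both halves are cheap
   immediates.  */
bool f8_adjusted(uint64_t a) { return a >= 0x100000000ULL; }

/* Word-wise unsigned a >= c: compare the high words and, only if they
   are equal, let the low words decide.  */
bool geu64_by_words(uint64_t a, uint64_t c)
{
    uint32_t alo = (uint32_t)a, ahi = (uint32_t)(a >> 32);
    uint32_t clo = (uint32_t)c, chi = (uint32_t)(c >> 32);

    if (ahi != chi)
        return ahi > chi;
    return alo >= clo;
}
```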

	* config/arm/arm.md (cbranchdi4): Accept reg_or_int_operand for
	operand 2.
	(cstoredi4): Similarly, but for operand 3.
	* config/arm/arm.c (arm_canonicalize_comparison): Allow canonicalization
	of unsigned compares with a constant on Arm.  Prefer using const+1 and
	adjusting the comparison over swapping the operands whenever the
	original constant was not valid.
	(arm_gen_dicompare_reg): If Y is not a valid operand, force it to a
	register here.
	(arm_validize_comparison): Do not force invalid DImode operands to
	registers here.

From-SVN: r277178

committed Oct 18, 2019


[arm] Early split simple DImode equality comparisons · 5899656b

This is the first step of early splitting all the DImode comparison
operations.  We start by factoring the DImode handling out of
arm_gen_compare_reg into its own function.

Simple DImode equality comparisons (such as equality with zero, or
equality with a constant that is zero in one of the two word values
that it comprises) can be done using a single subtract followed by an
ORRS instruction.  This avoids the need for conditional execution.

For example, (r0 != 5) can be written as

	SUB	Rt, R0, #5
	ORRS	Rt, Rt, R1

The ORRS is now expanded using an SImode pattern that already exists
in the MD file and this gives the register allocator more freedom to
select registers (consecutive pairs are no longer required).
Furthermore, we can then delete the arm_cmpdi_zero pattern as it is
no longer required.  We use SUB for the value adjustment as this has a
generally more flexible range of immediates than XOR and what's more
has the opportunity to be relaxed in thumb2 to a 16-bit SUBS
instruction.
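
The SUB + ORRS idea corresponds to the following C sketch for a constant
whose upper word is zero.  The function name is made up for illustration;
lo and hi play the roles of R0 and R1 in the example above.

```c
#include <stdbool.h>
#include <stdint.h>

/* 64-bit "x != k" where k fits in the low word: the value equals k
   exactly when (lo - k) and hi are both zero, so OR-ing them together
   gives a single value to test against zero.  */
bool ne64_small_const(uint64_t x, uint32_t k)
{
    uint32_t lo = (uint32_t)x, hi = (uint32_t)(x >> 32);

    uint32_t t = lo - k;   /* SUB  Rt, R0, #k  */
    t |= hi;               /* ORRS Rt, Rt, R1  */
    return t != 0;
}
```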

	* config/arm/arm.c (arm_select_cc_mode): For DImode equality tests
	return CC_Zmode if comparing against a constant where one word is
	zero.
	(arm_gen_compare_reg): Split DImode handling to ...
	(arm_gen_dicompare_reg): ... here.  Handle equality comparisons
	against simple constants.
	* config/arm/arm.md (arm_cmpdi_zero): Delete pattern.

From-SVN: r277177

committed Oct 18, 2019


[arm] Add alternative canonicalizations for subtract-with-carry + shift · 0b478cdd

This patch adds a couple of alternative canonicalizations to allow
combine to match a subtract-with-carry operation when one of the operands
is shifted first.  The most common case of this is when combining a
sign-extend of one operand with a long-long value during subtraction.
The RSC variant is only enabled for Arm, the SBC variant for any 32-bit
compilation.

	* config/arm/arm.md (subsi3_carryin_shift_alt): New pattern.
	(rsbsi3_carryin_shift_alt): Likewise.

From-SVN: r277176

committed Oct 18, 2019


[arm] Implement negscc using SBC when appropriate. · f6ff841b

When the carry flag is appropriately set by a comparison, negscc
patterns can expand into a simple SBC of a register with itself.  This
means we can convert two conditional instructions into a single
non-conditional instruction.  Furthermore, in Thumb2 we can avoid the
need for an IT instruction as well.  This patch also fixes the remaining
testcase that we initially XFAILed in the first patch of this series.
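
The arithmetic behind the SBC trick can be written out in C.  This is a
sketch with a hypothetical function name; it models the unsigned
"less than" case, where the comparison leaves a borrow behind.

```c
#include <stdint.h>

/* -(a < b) for unsigned operands: subtracting a register from itself
   with borrow-in gives 0 - borrow, i.e. 0 or all ones, with no
   conditional instructions.  */
uint32_t neg_ltu(uint32_t a, uint32_t b)
{
    uint32_t borrow = a < b;   /* CMP a, b leaves the borrow in the flags */
    uint32_t r = a;            /* any register value will do */
    return r - r - borrow;     /* SBC r, r, r */
}
```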

gcc:
	* config/arm/arm.md (negscc_borrow): New pattern.
	(mov_negscc): Don't split if the insn would match negscc_borrow.
	* config/arm/thumb2.md (thumb2_mov_negscc): Likewise.
	(thumb2_mov_negscc_strict_it): Likewise.

testsuite:
	* gcc.target/arm/negdi-3.c: Remove XFAIL markers.

From-SVN: r277175

committed Oct 18, 2019


[arm] Reduce cost of insns that are simple reg-reg moves. · 24d28a87

Consider this sequence during combine:

Trying 18, 7 -> 22:
   18: r118:SI=r122:SI
      REG_DEAD r122:SI
    7: r114:SI=0x1-r118:SI-ltu(cc:CC_RSB,0)
      REG_DEAD r118:SI
      REG_DEAD cc:CC_RSB
   22: r1:SI=r114:SI
      REG_DEAD r114:SI
Failed to match this instruction:
(set (reg:SI 1 r1 [+4 ])
    (minus:SI (geu:SI (reg:CC_RSB 100 cc)
            (const_int 0 [0]))
        (reg:SI 122)))
Successfully matched this instruction:
(set (reg:SI 114)
    (geu:SI (reg:CC_RSB 100 cc)
        (const_int 0 [0])))
Successfully matched this instruction:
(set (reg:SI 1 r1 [+4 ])
    (minus:SI (reg:SI 114)
        (reg:SI 122)))
allowing combination of insns 18, 7 and 22
original costs 4 + 4 + 4 = 12
replacement costs 8 + 4 = 12

The costs are all correct, but we really don't want this combination
to take place.  The original costs contain an insn that is a simple
move of one pseudo register to another and it is extremely likely that
register allocation will eliminate this insn entirely.  On the other
hand, the resulting sequence really does expand into a sequence that
costs 12 (i.e. 3 insns).

We don't want to prevent combine from eliminating such moves, as this
can expose more combine opportunities, but we shouldn't rate them as
profitable in themselves.  We can do this by adjusting the costs
slightly so that the benefit of eliminating such a simple insn is
reduced.

We only do this before register allocation; after allocation we give
such insns their full cost.
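
A toy model of the effect on the example above.  Everything here is an
assumption made for illustration: the type and function names are invented,
and the reduced cost of 2 is a placeholder rather than a value taken from
arm_insn_cost.

```c
#include <stdbool.h>

/* Assumed reduced cost for a simple reg-reg move before register
   allocation (placeholder value).  */
enum { TOY_REG_MOVE_COST_BEFORE_RA = 2 };

struct toy_insn {
    bool is_simple_reg_move;  /* pseudo-to-pseudo copy, likely removed by RA */
    int base_cost;
};

int toy_insn_cost(struct toy_insn insn, bool after_reg_alloc)
{
    if (insn.is_simple_reg_move && !after_reg_alloc)
        return TOY_REG_MOVE_COST_BEFORE_RA;
    return insn.base_cost;
}

/* Cost of the original sequence 18, 7, 22 from the dump, treating
   insn 18 as the simple move.  */
int toy_original_cost(bool after_reg_alloc)
{
    struct toy_insn insns[3] = { { true, 4 }, { false, 4 }, { false, 4 } };
    int total = 0;
    for (int i = 0; i < 3; i++)
        total += toy_insn_cost(insns[i], after_reg_alloc);
    return total;
}

/* A replacement is accepted only if it is no more expensive.  */
bool toy_combination_allowed(int original_cost, int replacement_cost)
{
    return replacement_cost <= original_cost;
}
```

Under these assumptions the original sequence costs 10 before register
allocation, so the replacement costing 8 + 4 = 12 is no longer accepted,
whereas with full costs (12 against 12) it would be.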

	* config/arm/arm.c (arm_insn_cost): New function.
	(TARGET_INSN_COST): Override default definition.

From-SVN: r277174

committed Oct 18, 2019
