- 18 Oct, 2019 20 commits
The cost routine for Arm and Thumb2 was not recognising the idioms that describe addition with carry. This results in the instructions appearing more expensive than they really are, which can occasionally lead to poor choices by combine. Recognising all the possible variants is a little trickier than normal because the expressions can become complex enough that there is no single canonical form. * config/arm/arm.c (strip_carry_operation): New function. (arm_rtx_costs_internal, case PLUS): Handle addition with carry-in for SImode. From-SVN: r277172
Richard Earnshaw committed -
An earlier patch introduced arm_borrow_operation; this one introduces the carry variant, which is the same except that the logic of the carry-setting is inverted. Having done this we can now match more cases where the carry flag is propagated from comparisons with different modes without having to define even more patterns. A few small changes to the expand patterns are required to directly create the carry representation. The iterator LTUGEU is no longer needed and is removed, as is the code attribute 'cnb'. Finally, we fix a long-standing bug which was probably inert before: in Thumb2 a shift with ADC can only be by an immediate amount; register-specified shifts are not permitted. * config/arm/predicates.md (arm_carry_operation): New special predicate. * config/arm/iterators.md (LTUGEU): Delete iterator. (cnb): Delete code attribute. (optab): Delete ltu and geu elements. * config/arm/arm.md (addsi3_carryin): Renamed from addsi3_carryin_<optab>. Remove iterator and use arm_carry_operand. (add0si3_carryin): Similarly, but from add0si3_carryin_<optab>. (addsi3_carryin_alt2): Similarly, but from addsi3_carryin_alt2_<optab>. (addsi3_carryin_clobercc): Similarly. (addsi3_carryin_shift): Similarly. Do not allow register shifts in Thumb2 state. From-SVN: r277171
Richard Earnshaw committed -
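The inverted carry logic mentioned in r277171 reflects the Arm flag convention: after a subtraction the C flag is set when no borrow occurred, so borrow is the logical inverse of carry. The sketch below models that convention in plain C++ for a 64-bit subtraction built from two 32-bit steps. The names `Words`, `subs`, `sbc` and `sub_di` are hypothetical and only loosely mirror the SUBS and SBC instructions; the real predicates are RTL matchers and are not reproduced here.

```cpp
#include <cstdint>

// A 64-bit value held as two 32-bit words (illustrative type).
struct Words { std::uint32_t lo, hi; };

// Result of a flag-setting 32-bit operation.
struct FlaggedResult { std::uint32_t value; bool carry; };

// SUBS-like step: C is set when NO borrow occurred, i.e. a >= b (unsigned).
FlaggedResult subs(std::uint32_t a, std::uint32_t b) {
    return {a - b, a >= b};
}

// SBC-like step: subtracts an extra 1 when C is clear (a borrow is pending).
std::uint32_t sbc(std::uint32_t a, std::uint32_t b, bool carry_in) {
    return a - b - (carry_in ? 0u : 1u);
}

// 64-bit subtraction as a low-word SUBS followed by a high-word SBC.
Words sub_di(Words a, Words b) {
    FlaggedResult low = subs(a.lo, b.lo);
    return {low.value, sbc(a.hi, b.hi, low.carry)};
}

std::uint64_t to_u64(Words w) {
    return (static_cast<std::uint64_t>(w.hi) << 32) | w.lo;
}

Words from_u64(std::uint64_t v) {
    return {static_cast<std::uint32_t>(v), static_cast<std::uint32_t>(v >> 32)};
}
```

Calling `sub_di(from_u64(x), from_u64(y))` and converting back with `to_u64` should agree with ordinary wrapping `x - y` on `std::uint64_t`.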
Now that we early split DImode subtracts, the patterns to emit the original and to match zero-extend with subtraction or negation are no longer useful. * config/arm/arm.md (arm_subdi3): Delete insn. (zextendsidi_negsi, negdi_extendsidi): Delete insn_and_split. From-SVN: r277170
Richard Earnshaw committed -
This patch adds early splitting of subdi3 so that the individual operations can be seen by the optimizers, particularly combine. This should allow us to do at least as good a job as previously, but with far fewer patterns in the machine description. This is just the initial patch to add the early splitting. The cleanups will follow later. A special trick is used to handle the 'reverse subtract and compare' where a register is subtracted from a constant. The natural comparison (COMPARE (const) (reg)) is not canonical in this case and combine will never correctly generate it (it tries to swap the order of the operands). To handle this we write the comparison as (COMPARE (NOT (reg)) (~const)), which has the same result for EQ, NE, LTU, LEU, GTU and GEU, which are all the cases we are really interested in here. Finally, we delete the negdi2 pattern. The generic expanders will use our new subdi3 expander if this pattern is missing and that can handle the negate case just fine. * config/arm/arm-modes.def (CC_RSB): New CC mode. * config/arm/predicates.md (arm_borrow_operation): Handle CC_RSBmode. * config/arm/arm.c (arm_select_cc_mode): Detect when we should return CC_RSBmode. (maybe_get_arm_condition_code): Handle CC_RSBmode. * config/arm/arm.md (subsi3_carryin): Make this pattern available to expand. (subdi3): Rewrite to early-expand the sub-operations. (rsb_im_compare): New pattern. (negdi2): Delete. (negdi2_insn): Delete. (arm_negsi2): Correct type attribute to alu_imm. (negsi2_0compare): New insn pattern. (negsi2_carryin): New insn pattern. From-SVN: r277169
Richard Earnshaw committed -
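The comparison rewrite in r277169 works because bitwise NOT on an unsigned 32-bit value maps x to 0xFFFFFFFF - x, which reverses unsigned order; comparing ~reg against ~const therefore yields the same equality and unsigned-ordering results as comparing const against reg. A minimal sketch that checks this in plain C++ follows. All names in it (`UnsignedFlags`, `natural_compare`, `rewritten_compare`) are hypothetical and do not come from the GCC sources.

```cpp
#include <cstdint>

// The six conditions the commit message lists, evaluated for lhs versus rhs.
struct UnsignedFlags {
    bool eq, ne, ltu, leu, gtu, geu;
    bool operator==(const UnsignedFlags& o) const {
        return eq == o.eq && ne == o.ne && ltu == o.ltu &&
               leu == o.leu && gtu == o.gtu && geu == o.geu;
    }
};

UnsignedFlags compare_u32(std::uint32_t lhs, std::uint32_t rhs) {
    return {lhs == rhs, lhs != rhs, lhs < rhs, lhs <= rhs, lhs > rhs, lhs >= rhs};
}

// Natural form: (COMPARE (const) (reg)).
UnsignedFlags natural_compare(std::uint32_t constant, std::uint32_t reg) {
    return compare_u32(constant, reg);
}

// Rewritten form: (COMPARE (NOT (reg)) (~const)).
UnsignedFlags rewritten_compare(std::uint32_t constant, std::uint32_t reg) {
    return compare_u32(~reg, ~constant);
}
```

Signed conditions are deliberately absent from the sketch, since the commit message only claims equivalence for EQ, NE and the unsigned orderings.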
addsi3_carryin_alt2 has a more strict constraint than the predicate when adding a constant. This leads to sub-optimal code in some circumstances. * config/arm/arm.md (addsi3_carryin_alt2): Use arm_not_operand for operand 2. From-SVN: r277168
Richard Earnshaw committed -
The add-with-carry operation which involves a shift doesn't match at present because it isn't matching the canonical form generated by combine. Fixing this is simply a matter of re-ordering the operands. * config/arm/arm.md (addsi3_carryin_shift_<optab>): Reorder operands to match canonical form. From-SVN: r277167
Richard Earnshaw committed -
This patch changes the insn patterns for zero- and sign-extend into define_expands that generate the appropriate word operations immediately. * config/arm/arm.md (zero_extend<mode>di2): Convert to define_expand. (extend<mode>di2): Likewise. From-SVN: r277166
Richard Earnshaw committed -
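To make "the appropriate word operations" in r277166 concrete, here is a small sketch of what extending a 32-bit value to 64 bits looks like when the result is held as two 32-bit words. It covers only the SImode case, and the names `WordPair`, `zero_extend_si_to_di`, `sign_extend_si_to_di` and `join` are illustrative assumptions, not identifiers from the Arm back end.

```cpp
#include <cstdint>

// A 64-bit value held as two 32-bit words (illustrative type).
struct WordPair { std::uint32_t lo, hi; };

// Zero extension: the low word is the input, the high word is zero.
WordPair zero_extend_si_to_di(std::uint32_t x) {
    return {x, 0u};
}

// Sign extension: the high word replicates the sign bit of the input.
WordPair sign_extend_si_to_di(std::int32_t x) {
    std::uint32_t lo = static_cast<std::uint32_t>(x);
    std::uint32_t hi = (x < 0) ? 0xFFFFFFFFu : 0u;
    return {lo, hi};
}

// Reassemble the pair so the result can be compared with native 64-bit math.
std::uint64_t join(WordPair w) {
    return (static_cast<std::uint64_t>(w.hi) << 32) | w.lo;
}
```

Once the two words are separate values, later passes can treat them independently, which is the benefit the surrounding commits rely on.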
This patch causes the expansion of adddi3 to split the operation immediately for Arm and Thumb-2. This is desirable as it frees up the register allocator to pick whatever combination of registers suits best and reduces the number of auxiliary patterns that we need in the back-end. Three of the testcases that we disabled earlier are already fixed by this patch. Finally, we add a new pattern to match the canonicalization of add-with-carry when using an immediate of zero. gcc: * config/arm/arm-protos.h (arm_decompose_di_binop): New prototype. * config/arm/arm.c (arm_decompose_di_binop): New function. * config/arm/arm.md (adddi3): Also accept any const_int for op2. If not generating Thumb-1 code, decompose the operation into 32-bit pieces. * add0si_carryin_<optab>: New pattern. testsuite: * gcc.target/arm/pr53447-1.c: Remove XFAIL. * gcc.target/arm/pr53447-3.c: Remove XFAIL. * gcc.target/arm/pr53447-4.c: Remove XFAIL. From-SVN: r277165
Richard Earnshaw committed -
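Decomposing a 64-bit addition into 32-bit pieces, as r277165 describes for adddi3, amounts to adding the low words, detecting the carry out, and feeding that carry into the addition of the high words. The following is a minimal sketch of that arithmetic in plain C++; it is not the code of arm_decompose_di_binop, and `DiWords`, `add_di`, `pack` and `unpack` are hypothetical names.

```cpp
#include <cstdint>

// A 64-bit value held as two 32-bit words (illustrative type).
struct DiWords { std::uint32_t lo, hi; };

DiWords add_di(DiWords a, DiWords b) {
    // Low half: a 32-bit add; the sum wrapped around iff it is below an input.
    std::uint32_t lo = a.lo + b.lo;
    std::uint32_t carry = (lo < a.lo) ? 1u : 0u;
    // High half: a 32-bit add that consumes the carry from the low half.
    std::uint32_t hi = a.hi + b.hi + carry;
    return {lo, hi};
}

std::uint64_t pack(DiWords w) {
    return (static_cast<std::uint64_t>(w.hi) << 32) | w.lo;
}

DiWords unpack(std::uint64_t v) {
    return {static_cast<std::uint32_t>(v), static_cast<std::uint32_t>(v >> 32)};
}
```

When `b.hi` is zero the high-half step reduces to adding only the carry to `a.hi`, which is the kind of add-with-carry of an immediate zero that the commit's new pattern is said to match.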
The first step towards early splitting of addition and subtraction at DImode is to rip out the old patterns that are designed to propagate DImode through the RTL optimization passes and then do late splitting. This patch does cause some code size regressions, but it should still execute correctly. We will progressively add back the optimizations we had here in later patches. A small number of tests in the Arm-specific testsuite do fail as a result of this patch, but that's to be expected, since the optimizations they are looking for have just been removed. I've kept the tests, but XFAILed them for now. One small technical change is also done in this patch as part of the cleanup: the uaddv<mode>4 expander is changed to use LTU as the branch comparison. This eliminates the need for CC_Cmode to recognize somewhat bogus equality constraints. gcc: * arm.md (adddi3): Only accept register operands. (arm_adddi3): Convert to simple insn with no split. Do not accept constants. (adddi_sesidi_di): Delete pattern. (adddi_zesidi_di): Likewise. (uaddv<mode>4): Use LTU as condition for branch. (adddi3_compareV): Convert to simple insn with no split. (addsi3_compareV_upper): Delete pattern. (adddi3_compareC): Convert to simple insn with no split. Correct flags setting expression. (addsi3_compareC_upper): Delete pattern. (addsi3_compareC): Correct flags setting expression. (subdi3_compare1): Convert to simple insn with no split. (subsi3_carryin_compare): Delete pattern. (arm_subdi3): Convert to simple insn with no split. (subdi_zesidi): Delete pattern. (subdi_di_sesidi): Delete pattern. (subdi_zesidi_di): Delete pattern. (subdi_sesidi_di): Delete pattern. (subdi_zesidi_zesidi): Delete pattern. (negvdi3): Use s_register_operand. (negdi2_compare): Convert to simple insn with no split. (negdi2_insn): Likewise. (negsi2_carryin_compare): Delete pattern. (negdi_zero_extendsidi): Delete pattern. (arm_cmpdi_insn): Convert to simple insn with no split. (negdi2): Don't call gen_negdi2_neon. 
* config/arm/neon.md (adddi3_neon): Delete pattern. (subdi3_neon): Delete pattern. (negdi2_neon): Delete pattern. (splits for negdi2_neon): Delete splits. testsuite: * gcc.target/arm/negdi-3.c: Add XFAILS. * gcc.target/arm/pr53447-1.c: Likewise. * gcc.target/arm/pr53447-3.c: Likewise. * gcc.target/arm/pr53447-4.c: Likewise. From-SVN: r277164
Richard Earnshaw committed -
2019-10-18 Steven G. Kargl <kargl@gcc.gnu.org> PR fortran/69455 * trans-decl.c (generate_local_decl): Avoid misconstructed intrinsic modules in a BLOCK construct. 2019-10-18 Steven G. Kargl <kargl@gcc.gnu.org> PR fortran/69455 * gfortran.dg/pr69455_1.f90: New test. * gfortran.dg/pr69455_2.f90: Ditto. From-SVN: r277158
Steven G. Kargl committed -
PR middle-end/92153 * ggc-page.c (release_pages): Read g->alloc_size before free rather than after it. From-SVN: r277157
Jakub Jelinek committed -
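The PR middle-end/92153 fix is an instance of a general rule: copy any field you still need out of an object before freeing it. The sketch below shows that ordering on a simple linked list. `PageGroup`, `make_group` and `release_groups` are hypothetical stand-ins, not the actual ggc-page.c data structures or functions.

```cpp
#include <cstddef>
#include <cstdlib>

// Illustrative stand-in for a heap-allocated group descriptor.
struct PageGroup {
    PageGroup* next;
    std::size_t alloc_size;
};

// Allocate one descriptor and link it in front of `next`.
PageGroup* make_group(std::size_t alloc_size, PageGroup* next) {
    PageGroup* g = static_cast<PageGroup*>(std::malloc(sizeof(PageGroup)));
    if (g == nullptr) std::abort();
    g->next = next;
    g->alloc_size = alloc_size;
    return g;
}

// Free every descriptor and return the total size that was released.
std::size_t release_groups(PageGroup* head) {
    std::size_t released = 0;
    while (head != nullptr) {
        PageGroup* next = head->next;
        std::size_t len = head->alloc_size;  // read BEFORE free
        std::free(head);                     // head must not be touched after this
        released += len;
        head = next;
    }
    return released;
}
```

Reading `head->alloc_size` after `std::free(head)` would be a use-after-free, even if it often appears to work.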
This patch maps multilibs using -march=armv7-r+vfpv3-d16-fp16 and -march=armv7-r+vfpv3-d16-fp16+idiv to v7+fp. This patch also adds a new multilib for armv7-r+fp.sp and maps -march=armv7-r+fp.sp+idiv, -march=armv7-r+vfpv3xd-fp16 and -march=armv7-r+vfpv3xd-fp16+idiv to it. This patch also makes it so that the generated multilib header file is regenerated if changes have been made to either t-multilib, t-aprofile or t-rmprofile when doing incremental builds. gcc/ChangeLog: 2019-10-18 Andre Vieira <andre.simoesdiasvieira@arm.com> * config/arm/t-multilib: Add rule to regenerate multilib header file with any change to t-multilib, t-aprofile and t-rmprofile. Also add new multilib variants and new mappings. gcc/testsuite/ChangeLog: 2019-10-18 Andre Vieira <andre.simoesdiasvieira@arm.com> * gcc.target/arm/multilib.exp: Add extra tests. From-SVN: r277156
Andre Vieira committed -
https://gcc.gnu.org/ml/gcc-patches/2019-10/msg01354.html I noticed that we use a bitfield flag to note types with names for linkage purposes: typedef struct {} foo; but, we can infer this by comparing TYPE_STUB_DECL and TYPE_DECL of the main variant. It's only checked in two places -- the C++ parser and the objective C++ encoder. * cp-tree.h (struct lang_type): Remove was_anonymous. (TYPE_WAS_UNNAMED): Implement by checking TYPE_DECL & TYPE_STUB_DECL. * decl.c (name_unnamed_type): Don't set TYPE_WAS_UNNAMED. From-SVN: r277155
Nathan Sidwell committed -
gcc/fortran/ PR fortran/91586 * class.c (gfc_find_derived_vtab): Return NULL instead of deref'ing NULL pointer. gcc/testsuite/ PR fortran/91586 * gfortran.dg/class_71.f90: New. From-SVN: r277153
Tobias Burnus committed -
OS X 10.15 adds aligned_alloc but it has the same restriction as the AIX version, namely that alignments smaller than sizeof(void*) are not supported. PR libstdc++/92143 * libsupc++/new_opa.cc (operator new) [__APPLE__]: Increase alignment to at least sizeof(void*). From-SVN: r277151
Jonathan Wakely committed -
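The restriction described in PR libstdc++/92143 can be worked around by raising the requested alignment to at least `sizeof(void*)` before calling `aligned_alloc`. Below is a minimal sketch of that idea, assuming a C++17 toolchain that provides `std::aligned_alloc`. The function name `aligned_new_sketch` is hypothetical, this is not the code in new_opa.cc, and the size rounding is an extra precaution added here for implementations that require the size to be a multiple of the alignment.

```cpp
#include <cstddef>
#include <cstdlib>

// Allocate `size` bytes aligned to `align`, which must be a power of two.
// Returns nullptr on failure; release the memory with std::free.
void* aligned_new_sketch(std::size_t size, std::size_t align) {
    // Some aligned_alloc implementations reject alignments below sizeof(void*).
    if (align < sizeof(void*)) {
        align = sizeof(void*);
    }
    // Round the size up to a multiple of the (possibly raised) alignment.
    std::size_t rounded = (size + align - 1) & ~(align - 1);
    return std::aligned_alloc(align, rounded);
}
```

Raising the alignment is always safe for the caller, because a block aligned to `sizeof(void*)` is also aligned to any smaller power of two.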
* include/bits/range_cmp.h (ranges::less::operator()): Inline the logic from std::less::operator() to remove the dependency on it. From-SVN: r277150
Jonathan Wakely committed -
PR target/86040 * config/avr/avr.c (avr_out_lpm): Do not shortcut-return. From-SVN: r277143
Georg-Johann Lay committed -
gcc/testsuite/ Fix some fallout for small targets. PR testsuite/52641 * gcc.c-torture/execute/20190820-1.c: Add dg-require-effective-target int32plus. * gcc.c-torture/execute/pr85331.c: Add dg-require-effective-target double64plus. * gcc.dg/pow-sqrt-1.c: Same. * gcc.dg/pow-sqrt-2.c: Same. * gcc.dg/pow-sqrt-3.c: Same. * gcc.c-torture/execute/20190901-1.c: Same. * gcc.c-torture/execute/user-printf.c [avr]: Skip. * gcc.c-torture/execute/fprintf-2.c [avr]: Skip. * gcc.c-torture/execute/printf-2.c [avr]: Skip. * gcc.dg/Wlarger-than3.c [avr]: Skip. * gcc.c-torture/execute/ieee/20041213-1.c (sqrt) [avr,double=float]: Provide custom prototype. * gcc.dg/pr36017.c: Same. * gcc.c-torture/execute/pr90025.c: Use 32-bit int. * gcc.dg/complex-7.c: Add dg-require-effective-target double64. * gcc.dg/loop-versioning-1.c: Add dg-require-effective-target size32plus. * gcc.dg/loop-versioning-2.c: Same. From-SVN: r277142
Georg-Johann Lay committed -
2019-10-18 Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> Richard Sandiford <richard.sandiford@arm.com> PR target/86753 * tree-vectorizer.h (scalar_cond_masked_key): New struct, and define hashmap traits for it. (loop_vec_info::scalar_cond_masked_set): New member. (vect_record_loop_mask): Adjust prototype. * tree-vectorizer.c (scalar_cond_masked_key::get_cond_ops_from_tree): Implement method. * tree-vect-loop.c (vectorizable_reduction): Pass NULL as last arg to vect_record_loop_mask. (vectorizable_live_operation): Likewise. (vect_record_loop_mask): New param scalar_mask. Add entry cond, loop_mask to scalar_cond_masked_set if scalar_mask is non-NULL. * tree-vect-stmts.c (check_load_store_masking): New param scalar_mask. Pass it as last arg to vect_record_loop_mask. (vectorizable_call): Pass scalar_mask as last arg to vect_record_loop_mask. (vectorizable_store): Likewise. (vectorizable_load): Likewise. (vectorizable_condition): Check if another part of vectorized code applies loop_mask to condition or to its inverse, and if yes, apply loop_mask to result of vector comparison. testsuite/ * gcc.target/aarch64/sve/cond_cnot_2.c: Remove XFAIL from { scan-assembler-not {\tsel\t}. * gcc.target/aarch64/sve/cond_convert_1.c: Adjust to make only one load conditional. * gcc.target/aarch64/sve/cond_convert_4.c: Likewise. * gcc.target/aarch64/sve/cond_unary_2.c: Likewise. * gcc.target/aarch64/sve/vcond_4.c: Remove XFAILs. * gcc.target/aarch64/sve/vcond_5.c: Likewise. Co-Authored-By: Richard Sandiford <richard.sandiford@arm.com> From-SVN: r277141
Prathamesh Kulkarni committed -
From-SVN: r277140
GCC Administrator committed
- 17 Oct, 2019 20 commits
* config/pa/pa.c (pa_output_indirect_call): Fix typos in last change. From-SVN: r277135
John David Anglin committed -
PR tree-optimization/92056 * tree-ssa-strlen.c (determine_min_objsize): Call init_object_sizes before calling compute_builtin_object_size. * gcc.dg/tree-ssa/pr92056.c: New test. From-SVN: r277134
Jakub Jelinek committed -
/cp 2019-10-17 Paolo Carlini <paolo.carlini@oracle.com> * decl.c (grokfndecl): Remove redundant use of in_system_header_at. (compute_array_index_type_loc): Likewise. (grokdeclarator): Likewise. * error.c (cp_printer): Likewise. * lambda.c (add_default_capture): Likewise. * parser.c (cp_parser_primary_expression): Likewise. (cp_parser_selection_statement): Likewise. (cp_parser_toplevel_declaration): Likewise. (cp_parser_enumerator_list): Likewise. (cp_parser_using_declaration): Likewise. (cp_parser_member_declaration): Likewise. (cp_parser_exception_specification_opt): Likewise. (cp_parser_std_attribute_spec): Likewise. * pt.c (do_decl_instantiation): Likewise. (do_type_instantiation): Likewise. * typeck.c (cp_build_unary_op): Likewise. * decl.c (check_tag_decl): Pass to in_system_header_at the same location used for the permerror. (grokdeclarator): Likewise. * decl.c (check_tag_decl): Use locations[ds_typedef] in error_at. /testsuite 2019-10-17 Paolo Carlini <paolo.carlini@oracle.com> * g++.old-deja/g++.other/decl9.C: Check locations too. From-SVN: r277133
Paolo Carlini committed -
The current Darwin load/store lo_sum patterns have neither predicate nor constraint. This means that most parts of the backend, which rely on recog() to validate the rtx, can produce invalid combinations/selections. For 32bit cases this isn't a problem since we can load/store to unaligned addresses using D-mode insns. Conversely, for 64bit instructions that use DS mode, this can manifest as assembler errors (for an assembler that checks the LO14 relocations), or as crashes caused by wrong offsets (or worse, wrong content for the two LSBs). What we want to check for "Y" on Darwin is: - that the alignment of the symbol's target is sufficient for DS mode - that the offset is suitable for DS mode. (while looking through the Mach-O PIC unspecs). So, the patch removes the Darwin-specific lo_sum patterns (we begin using the movdi_internal64 patterns). We also need to extend the handling of the mem_operand_gpr constraint to allow looking through Mach-O PIC UNSPECs in the lo_sum cases. gcc/ChangeLog: 2019-10-17 Iain Sandoe <iain@sandoe.co.uk> PR target/65342 * config/rs6000/darwin.md (movdi_low, movsi_low_st): Delete. (movdi_low_st): Delete. * config/rs6000/rs6000.c (darwin_rs6000_legitimate_lo_sum_const_p): New. (mem_operand_gpr): Validate Mach-O LO_SUM cases separately. * config/rs6000/rs6000.md (movsi_low): Delete. From-SVN: r277130
Iain Sandoe committed -
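The two checks listed for r277130 can be pictured as simple arithmetic conditions: a DS-form displacement has its two least significant bits forced to zero, so both the symbol's alignment and the added offset need to be multiples of 4. The sketch below models only that condition. The function `lo_sum_ok_for_ds_form` and its parameters are hypothetical; it is not the logic of darwin_rs6000_legitimate_lo_sum_const_p and it ignores the Mach-O PIC UNSPEC handling entirely.

```cpp
#include <cstdint>

// Returns true when symbol + offset keeps the two low address bits clear,
// assuming the symbol's target is aligned to `symbol_align_bytes`.
bool lo_sum_ok_for_ds_form(std::uint64_t symbol_align_bytes, std::int64_t offset) {
    // The symbol itself must be at least 4-byte aligned.
    bool symbol_aligned = symbol_align_bytes != 0 && symbol_align_bytes % 4 == 0;
    // The offset must not disturb the two least significant bits.
    bool offset_aligned = offset % 4 == 0;
    return symbol_aligned && offset_aligned;
}
```

A D-form access has no such restriction on the low bits, which matches the commit's remark that the 32-bit cases were unaffected.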
* .gitattributes: Avoid {} in filename pattern. Brace-expansion is a bash feature, not part of glob(7). From-SVN: r277129
Jason Merrill committed -
* cp-gimplify.c (cp_gimplify_expr): Use get_initialized_tmp_var. The comment for get_formal_tmp_var says that it shouldn't be used for expressions whose value might change between initialization and use, and in this case we're creating a temporary precisely because the value might change, so we should use get_initialized_tmp_var instead. I also noticed that many callers of get_initialized_tmp_var pass NULL for post_p, so it seems appropriate to make it a default argument. gcc/ * gimplify.h (get_initialized_tmp_var): Add default argument to post_p. * gimplify.c (gimplify_self_mod_expr, gimplify_omp_atomic): Remove NULL post_p argument. * targhooks (std_gimplify_va_arg_expr): Likewise. From-SVN: r277128
Jason Merrill committed -
2019-10-17 Richard Biener <rguenther@suse.de> * tree-vectorizer.h (_stmt_vec_info::cond_reduc_code): Remove. (STMT_VINFO_VEC_COND_REDUC_CODE): Likewise. * tree-vectorizer.c (vec_info::new_stmt_vec_info): Do not initialize STMT_VINFO_VEC_COND_REDUC_CODE. * tree-vect-loop.c (vect_is_simple_reduction): Set STMT_VINFO_REDUC_CODE. (vectorizable_reduction): Remove dead and redundant code, use STMT_VINFO_REDUC_CODE instead of STMT_VINFO_VEC_COND_REDUC_CODE. From-SVN: r277126
Richard Biener committed -
This won't do anything by default, because __cplusplus is set to 201402L when Doxygen runs. If/when that changes, these headers should be processed. * doc/doxygen/user.cfg.in (INPUT): Add new C++17 and C++20 headers. From-SVN: r277121
Jonathan Wakely committed -
Define std::identity, std::ranges::equal_to, std::ranges::not_equal_to, std::ranges::greater, std::ranges::less, std::ranges::greater_equal and std::ranges::less_equal. * include/Makefile.am: Add new header. * include/Makefile.in: Regenerate. * include/bits/range_cmp.h: New header for C++20 function objects. * include/std/functional: Include new header. * testsuite/20_util/function_objects/identity/1.cc: New test. * testsuite/20_util/function_objects/range.cmp/equal_to.cc: New test. * testsuite/20_util/function_objects/range.cmp/greater.cc: New test. * testsuite/20_util/function_objects/range.cmp/greater_equal.cc: New test. * testsuite/20_util/function_objects/range.cmp/less.cc: New test. * testsuite/20_util/function_objects/range.cmp/less_equal.cc: New test. * testsuite/20_util/function_objects/range.cmp/not_equal_to.cc: New test. From-SVN: r277120
Jonathan Wakely committed -
* config/avr/avr.c (avr_option_override): Remove set of PARAM_ALLOW_STORE_DATA_RACES. * common/config/avr/avr-common.c (avr_option_optimization_table) [OPT_LEVELS_ALL]: Turn on -fallow-store-data-races. From-SVN: r277115
Georg-Johann Lay committed -
i386.h has #define CLEAR_RATIO(speed) ((speed) ? MIN (6, ix86_cost->move_ratio) : 2) It is impossible to have CLEAR_RATIO > 6. This patch adds clear_ratio to processor_costs, sets it to the minimum of 6 and move_ratio in all cost models and defines CLEAR_RATIO with clear_ratio. * config/i386/i386.h (processor_costs): Add clear_ratio. (CLEAR_RATIO): Remove MIN and use ix86_cost->clear_ratio. * config/i386/x86-tune-costs.h: Set clear_ratio to the minimum of 6 and move_ratio in all cost models. From-SVN: r277114
H.J. Lu committed -
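The effect of r277114 can be sketched by comparing the old and new definitions of the ratio side by side. In the sketch below, `CostsSketch` is a hypothetical two-field stand-in for `processor_costs`, and the functions model the macro before and after the change; none of these names exist in i386.h.

```cpp
#include <algorithm>

// Hypothetical reduced cost table with only the two fields of interest.
struct CostsSketch {
    int move_ratio;
    int clear_ratio;
};

// Build an entry the way the commit says every cost model is set up:
// clear_ratio is the minimum of 6 and move_ratio.
constexpr CostsSketch make_costs(int move_ratio) {
    return {move_ratio, std::min(6, move_ratio)};
}

// Old behaviour: the cap of 6 is hard-wired into the macro.
constexpr int clear_ratio_old(bool speed, const CostsSketch& c) {
    return speed ? std::min(6, c.move_ratio) : 2;
}

// New behaviour: the value comes from the cost table.
constexpr int clear_ratio_new(bool speed, const CostsSketch& c) {
    return speed ? c.clear_ratio : 2;
}
```

With entries built by `make_costs` both functions agree, so the change is behaviour-preserving; the difference is that a cost table could now carry a `clear_ratio` above 6, which the old macro made impossible.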
The container requirements say that for move assignment "All existing elements of [the target] are either move assigned or destroyed". Some of our containers currently use __make_move_if_noexcept which makes the move depend on whether the element type is nothrow move constructible. This is incorrect, because the standard says we must move assign, not move or copy depending on the move constructor. Use make_move_iterator instead so that we move unconditionally. This ensures existing elements won't be copy assigned. PR libstdc++/92124 * include/bits/forward_list.h (_M_move_assign(forward_list&&, false_type)): Do not use __make_move_if_noexcept, instead move unconditionally. * include/bits/stl_deque.h (_M_move_assign2(deque&&, false_type)): Likewise. * include/bits/stl_list.h (_M_move_assign(list&&, false_type)): Likewise. * include/bits/stl_vector.h (_M_move_assign(vector&&, false_type)): Likewise. * testsuite/23_containers/vector/92124.cc: New test. From-SVN: r277113
Jonathan Wakely committed -
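The distinction drawn in PR libstdc++/92124 shows up with an element type whose move constructor may throw: a helper that consults the move constructor would fall back to copying, while `std::make_move_iterator` move-assigns regardless. The following sketch demonstrates the unconditional behaviour with a counting type. `Tracked` and `move_assign_elements` are hypothetical names, and the function is a simplified stand-in, not libstdc++'s `_M_move_assign`.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Element type that records how it was assigned to.
struct Tracked {
    int move_assigned = 0;
    int copy_assigned = 0;
    Tracked() = default;
    Tracked(const Tracked&) = default;
    Tracked(Tracked&&) noexcept(false) {}  // potentially-throwing move constructor
    Tracked& operator=(const Tracked&) { ++copy_assigned; return *this; }
    Tracked& operator=(Tracked&&) { ++move_assigned; return *this; }
};

// Assign src's elements over dst's existing elements, always by move.
void move_assign_elements(std::vector<Tracked>& dst, std::vector<Tracked>& src) {
    assert(dst.size() >= src.size());
    std::copy(std::make_move_iterator(src.begin()),
              std::make_move_iterator(src.end()),
              dst.begin());
}
```

After `move_assign_elements(dst, src)` every overwritten element of `dst` reports one move assignment and no copy assignment, even though `Tracked` is not nothrow move constructible.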
2019-10-17 Richard Biener <rguenther@suse.de> * tree-vect-loop.c (check_reduction_path): Compute reduction operation here. (vect_is_simple_reduction): Remove special-case of single-stmt reduction path detection. From-SVN: r277112
Richard Biener committed -
According to GAS, the Marvell PJ4 CPU has a VFPv3-D16 floating point unit, but GCC's CPU configuration tables omit this, meaning that -mfpu=auto will not correctly select the FPU. This patch fixes this by adding the +fp option to the architecture specification for this device. * config/arm/arm-cpus.in (marvel-pj4): Add +fp to the architecture. From-SVN: r277111
Richard Earnshaw committed -
2019-10-17 Yuliang Wang <yuliang.wang@arm.com> gcc/ * config/aarch64/aarch64-sve2.md (aarch64_sve2_eor3<mode>) (aarch64_sve2_nor<mode>, aarch64_sve2_nand<mode>) (aarch64_sve2_bsl<mode>, aarch64_sve2_nbsl<mode>) (aarch64_sve2_bsl1n<mode>, aarch64_sve2_bsl2n<mode>): New combine patterns. * config/aarch64/iterators.md (BSL_DUP): New int iterator for the above. (bsl_1st, bsl_2nd, bsl_dup, bsl_mov): Attributes for the above. gcc/testsuite/ * gcc.target/aarch64/sve2/eor3_1.c: New test. * gcc.target/aarch64/sve2/nlogic_1.c: As above. * gcc.target/aarch64/sve2/nlogic_2.c: As above. * gcc.target/aarch64/sve2/bitsel_1.c: As above. * gcc.target/aarch64/sve2/bitsel_2.c: As above. * gcc.target/aarch64/sve2/bitsel_3.c: As above. * gcc.target/aarch64/sve2/bitsel_4.c: As above. From-SVN: r277110
Yuliang Wang committed -
From-SVN: r277108
Aldy Hernandez committed -
PR tree-optimization/92131 * tree-vrp.c (value_range_base::dump): Display +INF for both pointers and integers when appropriate. From-SVN: r277107
Aldy Hernandez committed -
gcc/ChangeLog: 2019-10-17 Andre Vieira <andre.simoesdiasvieira@arm.com> * tree-vect-loop.c (vect_analyze_loop_2): Use same condition to decide when to use versioning threshold. From-SVN: r277105
Andre Vieira committed -
gcc/ChangeLog: 2019-10-17 Andre Vieira <andre.simoesdiasvieira@arm.com> * tree-vect-loop.c (determine_peel_for_niter): New function containing outlined code from ... (vect_analyze_loop_2): ... here. From-SVN: r277103
Andre Vieira committed -
https://gcc.gnu.org/ml/gcc-patches/2019-10/msg01283.html * decl.c (builtin_function_1): Merge into ... (cxx_builtin_function): ... here. Nadger the decl before maybe copying it. Set the context. (cxx_builtin_function_ext_scope): Push to top level, then call cxx_builtin_function. From-SVN: r277102
Nathan Sidwell committed