- 19 Nov, 2019 28 commits
-
-
The C++ committee continues to discuss how best to avoid breaking existing code with the new rules for reversed operators. A recent suggestion was to base the tie-breaker on the parameter types of the candidates, which made a lot of sense to me, so this patch implements that. This patch also mentions that a candidate was reversed or rewritten when printing the list of candidates, and warns about a comparison that becomes recursive under the new rules. There is no flag for this warning; people can silence it by swapping the operands. * call.c (same_fn_or_template): Change to cand_parms_match. (joust): Adjust. (print_z_candidate): Mark rewritten/reversed candidates. (build_new_op_1): Warn about recursive call with reversed arguments. From-SVN: r278465
Jason Merrill committed -
* config/rs6000/rs6000.c (move_to_end_of_ready): New, factored out from common code. (power6_sched_reorder2): Factored out from rs6000_sched_reorder2, call new function. (power9_sched_reorder2): Call new function. (rs6000_sched_reorder2): Likewise. From-SVN: r278463
Pat Haugen committed -
From-SVN: r278461
Richard Sandiford committed -
* ipa-fnsummary.c (estimate_edge_size_and_time): Drop parameter PROB. (estimate_calls_size_and_time): Update. From-SVN: r278460
Jan Hubicka committed -
* ipa-inline.c (inlining_speedup): New function. (edge_badness): Use it. From-SVN: r278459
Jan Hubicka committed -
This patch tightens the instruction definitions to make sure that MSA branch instructions cannot be put into delay slots and have their delay slots eligible for being filled. Also, MSA *div*3 patterns use MSA branches for zero checks but are not marked as being multi instruction and thus could be put into delay slots. This patch fixes that. gcc/ChangeLog: 2019-11-19 Zoran Jovanovic <zoran.jovanovic@mips.com> Dragan Mladjenovic <dmladjenovic@wavecomp.com> * config/mips/mips-msa.md (msa_<msabr>_<msafmt_f>, msa_<msabr>_v_<msafmt_f>): Mark as not having "likely" version. * config/mips/mips.md (insn_count): The simd_div instruction with TARGET_CHECK_ZERO_DIV consists of 3 instructions. (can_delay): Exclude simd_branch. (defile_delay *): Add simd_branch instructions. They have one regular delay slot. gcc/testsuite/ChangeLog: 2019-11-19 Dragan Mladjenovic <dmladjenovic@wavecomp.com> * gcc.target/mips/msa-ds.c: New test. From-SVN: r278458
Dragan Mladjenovic committed -
To restore powerpc bootstrap. 2019-11-19 Richard Sandiford <richard.sandiford@arm.com> gcc/ Revert: 2019-11-18 Richard Sandiford <richard.sandiford@arm.com> * cse.c (cse_insn): Delete no-op register moves too. * simplify-rtx.c (comparison_to_mask): Handle unsigned comparisons. Take a second comparison to control the value for NE. (mask_to_comparison): Handle unsigned comparisons. (simplify_logical_relational_operation): Likewise. Update call to comparison_to_mask. Handle AND if !HONOR_NANs. (simplify_binary_operation_1): Call the above for AND too. gcc/testsuite/ Revert: 2019-11-18 Richard Sandiford <richard.sandiford@arm.com> * gcc.target/aarch64/sve/acle/asm/ptest_pmore.c: New test. From-SVN: r278455
Richard Sandiford committed -
PR79262 has been fixed for almost all AArch64 cpus, however the example is still vectorized in a few cases, resulting in lower performance. Adjust the vector cost slightly so that so that -mcpu=cortex-a53 now has identical performance as -mcpu=cortex-a57 on libquantum. gcc/ PR target/79262 * config/aarch64/aarch64.c (generic_vector_cost): Adjust vec_to_scalar_cost. From-SVN: r278452
Wilco Dijkstra committed -
PR c++/89913 gcc/cp/ * pt.c (get_underlying_template): Exit loop if the original type of the alias is null. gcc/testsuite/ * g++.dg/cpp2a/pr89913.C: New test. From-SVN: r278451
Andrew Sutton committed -
PR c++/92078 gcc/cp/ * pt.c (maybe_new_partial_specialization): Apply access to newly created partial specializations. Update comment style. gcc/testsuite/ * g++.dg/cpp2a/concepts-pr92078.C: New. * g++.dg/cpp2a/concepts-requires18.C: Update diagnostics. From-SVN: r278450
Andrew Sutton committed -
gcc/cp/ * pt.c (tsubst_copy_and_build): Perform the first substitution without diagnostics and a second only if tsubst_requries_expr returns an error. From-SVN: r278449
Andrew Sutton committed -
2019-11-19 Martin Liska <mliska@suse.cz> * toplev.c (general_init): Move the call... (toplev::main): ... here as we need init_options_struct being called. From-SVN: r278448
Martin Liska committed -
By default Armv7-A tunes for Cortex-A8. This is an ancient core today and the settings are no longer useful for newer cores. So switch to Cortex-A53 tuning since it works well across a wide range of modern cores. On SPECINT2006 the performance gain is 0.7% compared to Cortex-A8 tuning, and codesize reduces by 0.2%. gcc/ * config/arm/arm-cpus.in (armv7): Set tune to Cortex-A53. (armv7-a): Likewise. (armv7ve): Likewise. From-SVN: r278447
Wilco Dijkstra committed -
2019-11-19 Andrew Stubbs <ams@codesourcery.com> gcc/testsuite/ * gcc.dg/tree-ssa/loop-1.c: Change amdgcn assembler scan. From-SVN: r278446
Andrew Stubbs committed -
2019-11-19 Richard Biener <rguenther@suse.de> PR tree-optimization/92581 * tree-vect-loop.c (vect_create_epilog_for_reduction): For condition reduction chains gather all conditions involved for computing the index reduction vector. * gcc.dg/vect/vect-cond-reduc-5.c: New testcase. From-SVN: r278445
Richard Biener committed -
2019-11-19 Dennis Zhang <dennis.zhang@arm.com> * config/aarch64/aarch64-builtins.c (enum aarch64_builtins): Add AARCH64_MEMTAG_BUILTIN_START, AARCH64_MEMTAG_BUILTIN_IRG, AARCH64_MEMTAG_BUILTIN_GMI, AARCH64_MEMTAG_BUILTIN_SUBP, AARCH64_MEMTAG_BUILTIN_INC_TAG, AARCH64_MEMTAG_BUILTIN_SET_TAG, AARCH64_MEMTAG_BUILTIN_GET_TAG, and AARCH64_MEMTAG_BUILTIN_END. (aarch64_init_memtag_builtins): New. (AARCH64_INIT_MEMTAG_BUILTINS_DECL): New macro. (aarch64_general_init_builtins): Call aarch64_init_memtag_builtins. (aarch64_expand_builtin_memtag): New. (aarch64_general_expand_builtin): Call aarch64_expand_builtin_memtag. (AARCH64_BUILTIN_SUBCODE): New macro. (aarch64_resolve_overloaded_memtag): New. (aarch64_resolve_overloaded_builtin_general): New. Call aarch64_resolve_overloaded_memtag to handle overloaded MTE builtins. * config/aarch64/aarch64-c.c (aarch64_update_cpp_builtins): Define __ARM_FEATURE_MEMORY_TAGGING when enabled. (aarch64_resolve_overloaded_builtin): Call aarch64_resolve_overloaded_builtin_general. * config/aarch64/aarch64-protos.h (aarch64_resolve_overloaded_builtin_general): New declaration. * config/aarch64/aarch64.h (AARCH64_ISA_MEMTAG): New macro. (TARGET_MEMTAG): Likewise. * config/aarch64/aarch64.md (UNSPEC_GEN_TAG): New unspec. (UNSPEC_GEN_TAG_RND, and UNSPEC_TAG_SPACE): Likewise. (irg, gmi, subp, addg, ldg, stg): New instructions. * config/aarch64/arm_acle.h (__arm_mte_create_random_tag): New macro. (__arm_mte_exclude_tag, __arm_mte_ptrdiff): Likewise. (__arm_mte_increment_tag, __arm_mte_set_tag): Likewise. (__arm_mte_get_tag): Likewise. * config/aarch64/predicates.md (aarch64_memtag_tag_offset): New. (aarch64_granule16_uimm6, aarch64_granule16_simm9): New. * config/arm/types.md (memtag): New. * doc/invoke.texi (-memtag): Update description. 2019-11-19 Dennis Zhang <dennis.zhang@arm.com> * gcc.target/aarch64/acle/memtag_1.c: New test. * gcc.target/aarch64/acle/memtag_2.c: New test. * gcc.target/aarch64/acle/memtag_3.c: New test. From-SVN: r278444
Dennis Zhang committed -
Thumb1 cannot support asm-flags currently, because we don't expose the flags register to the compiler. Disable the support for that case. Adjust the asm-flag-6 test for aarch64 ilp32 correctness. gcc/ * config/arm/arm-c.c (arm_cpu_builtins): Use def_or_undef_macro to define __GCC_ASM_FLAG_OUTPUTS__. * config/arm/arm.c (thumb1_md_asm_adjust): New function. (arm_option_params_internal): Swap out targetm.md_asm_adjust depending on TARGET_THUMB1. * doc/extend.texi (FlagOutputOperands): Document thumb1 restriction. gcc/testsuite/ * testsuite/gcc.target/arm/asm-flag-3.c: Skip for thumb1. * testsuite/gcc.target/arm/asm-flag-5.c: Likewise. * testsuite/gcc.target/arm/asm-flag-6.c: Likewise. * testsuite/gcc.target/arm/asm-flag-4.c: New test. * testsuite/gcc.target/aarch64/asm-flag-6.c: Use %w for asm inputs to cmp instruction for ILP32. From-SVN: r278443
Richard Henderson committed -
This code is invalid and rejected by other compilers (see PR 92576). * include/bits/regex.h (ranges::__detail::__enable_view_impl): Fix declaration. * include/bits/stl_multiset.h (ranges::__detail::__enable_view_impl): Likewise. * include/bits/stl_set.h (ranges::__detail::__enable_view_impl): Likewise. * include/bits/unordered_set.h (ranges::__detail::__enable_view_impl): Likewise. * include/debug/multiset.h (ranges::__detail::__enable_view_impl): Likewise. * include/debug/set.h (ranges::__detail::__enable_view_impl): Likewise. * include/debug/unordered_set (ranges::__detail::__enable_view_impl): Likewise. From-SVN: r278440
Jonathan Wakely committed -
PR target/92549 * config/i386/i386.md (peephole2 for *swap<mode>): New peephole2. * gcc.target/i386/pr92549.c: New test. From-SVN: r278439
Jakub Jelinek committed -
re PR middle-end/91450 (__builtin_mul_overflow(A,B,R) wrong code if product < 0, *R is unsigned, and !(A&B)) PR middle-end/91450 * internal-fn.c (expand_mul_overflow): For s1 * s2 -> ur, if one operand is negative and one non-negative, compare the non-negative one against 0 rather than comparing s1 & s2 against 0. Otherwise, don't compare (s1 & s2) == 0, but compare separately both s1 == 0 and s2 == 0, unless one of them is known to be negative. Remove tem2 variable, use tem where tem2 has been used before. * gcc.c-torture/execute/pr91450-1.c: New test. * gcc.c-torture/execute/pr91450-2.c: New test. From-SVN: r278437
Jakub Jelinek committed -
From-SVN: r278434
Eric Botcazou committed -
re PR c++/92504 (ICE on gcc-9 -fopenmp: internal compiler error: tree check: expected tree that contains 'decl common' structure, have 'baselink' in get_inner_reference, at expr.c:7238) PR c++/92504 * semantics.c (handle_omp_for_class_iterator): Don't call cp_fully_fold on cond. * g++.dg/gomp/pr92504.C: New test. From-SVN: r278433
Jakub Jelinek committed -
PR tree-optimization/92557 * omp-low.c (omp_clause_aligned_alignment): Punt if TYPE_MODE is not vmode rather than asserting it always is. * gcc.dg/gomp/pr92557.c: New test. From-SVN: r278432
Jakub Jelinek committed -
2019-11-19 Richard Biener <rguenther@suse.de> PR tree-optimization/92554 * tree-vect-loop.c (vect_create_epilog_for_reduction): Look for the actual condition stmt and deal with sign-changes. * gcc.dg/vect/pr92554.c: New testcase. From-SVN: r278431
Richard Biener committed -
2019-09-19 Richard Biener <rguenther@suse.de> PR tree-optimization/92555 * tree-vect-loop.c (vect_update_vf_for_slp): Also scan PHIs for non-SLP stmts. * gcc.dg/vect/pr92555.c: New testcase. From-SVN: r278430
Richard Biener committed -
2019-11-19 Martin Liska <mliska@suse.cz> PR bootstrap/92540 * config/riscv/riscv.c (riscv_address_insns): Initialize addr in order to remove boostrap -Wmaybe-uninitialized error. From-SVN: r278429
Martin Liska committed -
Certain bad uses of C2x standard attributes (that is, attributes inside [[]] with only a name but no namespace specified) are constraint violations, and so should be diagnosed with a pedwarn (or error) where GCC currently uses a warning. This patch implements this in some cases (not yet for attributes used on types, nor for some bad uses of fallthrough attributes). Specifically, this applies to unknown standard attributes (taking care not to pedwarn for nodiscard, which is known but not implemented for C), and to all currently implemented standard attributes in attribute declarations (including when mixed with fallthrough) and on statements. Bootstrapped with no regressions on x86_64-pc-linux-gnu. gcc/c: * c-decl.c (c_warn_unused_attributes): Use pedwarn not warning for standard attributes. * c-parser.c (c_parser_std_attribute): Take argument for_tm. Use pedwarn for unknown standard attributes and return error_mark_node for them. gcc/c-family: * c-common.c (attribute_fallthrough_p): In C, use pedwarn not warning for standard attributes mixed with fallthrough attributes. gcc/testsuite: * gcc.dg/c2x-attr-fallthrough-5.c, gcc.dg/c2x-attr-syntax-5.c: New tests. * gcc.dg/c2x-attr-deprecated-2.c, gcc.dg/c2x-attr-deprecated-4.c, gcc.dg/c2x-attr-fallthrough-2.c, gcc.dg/c2x-attr-maybe_unused-2.c, gcc.dg/c2x-attr-maybe_unused-4.c: Expect errors in place of some warnings. From-SVN: r278428
Joseph Myers committed -
From-SVN: r278427
GCC Administrator committed
-
- 18 Nov, 2019 12 commits
-
-
/cp 2019-11-18 Paolo Carlini <paolo.carlini@oracle.com> * typeck.c (cp_build_addr_expr_1): Use cp_expr_loc_or_input_loc in three places. (cxx_sizeof_expr): Use it in one additional place. (cxx_alignof_expr): Likewise. (lvalue_or_else): Likewise. /testsuite 2019-11-18 Paolo Carlini <paolo.carlini@oracle.com> * g++.dg/cpp0x/addressof2.C: Test locations too. * g++.dg/cpp0x/rv-lvalue-req.C: Likewise. * g++.dg/expr/crash2.C: Likewise. * g++.dg/expr/lval1.C: Likewise. * g++.dg/expr/unary2.C: Likewise. * g++.dg/ext/lvaddr.C: Likewise. * g++.dg/ext/lvalue1.C: Likewise. * g++.dg/tree-ssa/pr20280.C: Likewise. * g++.dg/warn/Wplacement-new-size.C: Likewise. * g++.old-deja/g++.brendan/alignof.C: Likewise. * g++.old-deja/g++.brendan/sizeof2.C: Likewise. * g++.old-deja/g++.law/temps1.C: Likewise. From-SVN: r278424
Paolo Carlini committed -
gcc/ChangeLog: PR tree-optimization/92493 * gimple-ssa-sprintf.c (get_origin_and_offset): Remove spurious assignment. gcc/testsuite/ChangeLog: PR tree-optimization/92493 * gcc.dg/pr92493.c: New test. From-SVN: r278423
Martin Sebor committed -
This patch refactors tree-loop-distribution.c for thread safety without use of C11 __thread feature. All global variables were moved to `class loop_distribution` which is initialized at ::execute time. From-SVN: r278421
Giuliano Belinassi committed -
PR ipa/92508 * ipa-inline.c (inline_small_functions): Add new edges after reseting caches. * ipa-inline-analysis.c (do_estimate_edge_time): Fix sanity check. From-SVN: r278419
Jan Hubicka committed -
This patch adds more tests of C2x attributes, where I found cases that were handled correctly by my patches but missing from the original tests. Tests are added for -std=c11 -pedantic handling of C2x attribute syntax and corresponding -Wc11-c2x-compat handling; for struct [[deprecated]]; and for the [[__fallthrough__]] spelling of [[fallthrough]] in the case of valid fallthrough attributes. Tested for x86_64-pc-linux-gnu. * gcc.dg/c11-attr-syntax-1.c, gcc.dg/c11-attr-syntax-2.c, gcc.dg/c11-attr-syntax-3.c, gcc.dg/c2x-attr-syntax-4.c: New tests. * gcc.dg/c2x-attr-deprecated-1.c: Also test struct [[deprecated]]. * gcc.dg/c2x-attr-fallthrough-1.c: Also test [[__fallthrough__]]. From-SVN: r278418
Joseph Myers committed -
When fixing c++/91889 (r276251) I was assuming that we couldn't have a ck_qual under a ck_ref_bind, and I was introducing it in the patch and so this + if (next_conversion (convs)->kind == ck_qual) + { + gcc_assert (same_type_p (TREE_TYPE (expr), + next_conversion (convs)->type)); + /* Strip the cast created by the ck_qual; cp_build_addr_expr + below expects an lvalue. */ + STRIP_NOPS (expr); + } in convert_like_real was supposed to handle it. But that assumption was wrong as this test shows; here we have "(int *)f" where f is of type long int, and we're converting it to "const int *const &", so we have both ck_ref_bind and ck_qual. That means that the new STRIP_NOPS strips an expression it shouldn't have, and that then breaks when creating a TARGET_EXPR. So we want to limit the stripping to the new case only. This I do by checking need_temporary_p, which will be 0 in the new case. Yes, we can set need_temporary_p when binding a reference directly, but then we won't have a qualification conversion. It is possible to have a bit-field, convert it to a pointer, and then convert that pointer to a more-qualified pointer, but in that case we're not dealing with an lvalue, so gl_kind is 0, so we won't enter this block in reference_binding: 1747 if ((related_p || compatible_p) && gl_kind) * call.c (convert_like_real) <case ck_ref_bind>: Check need_temporary_p. * g++.dg/cpp0x/ref-bind7.C: New test. From-SVN: r278416
Marek Polacek committed -
2019-11-18 Martin Jambor <mjambor@suse.cz> PR ipa/92528 * g++.dg/ipa/pr92528.C: New test. From-SVN: r278415
Martin Jambor committed -
This patch adds optabs that check whether a read followed by a write or a write followed by a read can be divided into interleaved byte accesses without changing the dependencies between the bytes. This is one of the uses of the SVE2 WHILERW and WHILEWR instructions. (The instructions can also be used to limit the VF at runtime, but that's future work.) 2019-11-18 Richard Sandiford <richard.sandiford@arm.com> gcc/ * doc/sourcebuild.texi (vect_check_ptrs): Document. * optabs.def (check_raw_ptrs_optab, check_war_ptrs_optab): New optabs. * doc/md.texi: Document them. * internal-fn.def (IFN_CHECK_RAW_PTRS, IFN_CHECK_WAR_PTRS): New internal functions. * internal-fn.h (internal_check_ptrs_fn_supported_p): Declare. * internal-fn.c (check_ptrs_direct): New macro. (expand_check_ptrs_optab_fn): Likewise. (direct_check_ptrs_optab_supported_p): Likewise. (internal_check_ptrs_fn_supported_p): New fuction. * tree-data-ref.c: Include internal-fn.h. (create_ifn_alias_checks): New function. (create_intersect_range_checks): Use it. * config/aarch64/iterators.md (SVE2_WHILE_PTR): New int iterator. (optab, cmp_op): Handle it. (raw_war, unspec): New int attributes. * config/aarch64/aarch64.md (UNSPEC_WHILERW, UNSPEC_WHILE_WR): New constants. * config/aarch64/predicates.md (aarch64_bytes_per_sve_vector_operand): New predicate. * config/aarch64/aarch64-sve2.md (check_<raw_war>_ptrs<mode>): New expander. (@aarch64_sve2_while<cmp_op><GPI:mode><PRED_ALL:mode>_ptest): New pattern. gcc/testsuite/ * lib/target-supports.exp (check_effective_target_vect_check_ptrs): New procedure. * gcc.dg/vect/vect-alias-check-14.c: Expect IFN_CHECK_WAR to be used, if available. * gcc.dg/vect/vect-alias-check-15.c: Likewise. * gcc.dg/vect/vect-alias-check-16.c: Likewise IFN_CHECK_RAW. * gcc.target/aarch64/sve2/whilerw_1.c: New test. * gcc.target/aarch64/sve2/whilewr_1.c: Likewise. * gcc.target/aarch64/sve2/whilewr_2.c: Likewise. From-SVN: r278414
Richard Sandiford committed -
Empty vector constructors are equivalent to zero vectors. If we handle that case directly, we can support it for variable-length vectors and can hopefully make things more efficient for fixed-length vectors. This is needed by a later C++ patch. 2019-11-18 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree.c (build_vector_from_ctor): Directly return a zero vector for empty constructors. From-SVN: r278413
Richard Sandiford committed -
SVE has two composite conditions: pmore == at least one bit set && last bit clear plast == no bits set || last bit set So in general we generate them from: A: CC = test bits B: reg1 = first condition C: CC = test bits D: reg2 = second condition E: result = (reg1 op reg2) where op is || or && To fold all this into a single test, we need to be able to remove the redundant C (the cse.c patch) and then fold B, D and E down to a single condition (the simplify-rtx.c patch). The underlying conditions are unsigned, so the simplify-rtx.c part needs to support both unsigned comparisons and AND. However, to avoid opening the can of worms that is ANDing FP comparisons for unordered inputs, I've restricted the new AND handling to cases in which NaNs can be ignored. I think this is still a strict extension of what we have now, it just doesn't go as far as it could. Going further would need an entirely different set of testcases so I think would make more sense as separate work. 2019-11-18 Richard Sandiford <richard.sandiford@arm.com> gcc/ * cse.c (cse_insn): Delete no-op register moves too. * simplify-rtx.c (comparison_to_mask): Handle unsigned comparisons. Take a second comparison to control the value for NE. (mask_to_comparison): Handle unsigned comparisons. (simplify_logical_relational_operation): Likewise. Update call to comparison_to_mask. Handle AND if !HONOR_NANs. (simplify_binary_operation_1): Call the above for AND too. gcc/testsuite/ * gcc.target/aarch64/sve/acle/asm/ptest_pmore.c: New test. From-SVN: r278411
Richard Sandiford committed -
This patch handles VIEW_CONVERT_EXPRs of variable-length VECTOR_CSTs by adding tree-level versions of native_decode_vector_rtx and simplify_const_vector_subreg. It uses the same code for fixed-length vectors, both to get more coverage and because operating directly on the compressed encoding should be more efficient for longer vectors with a regular pattern. The structure and comments are very similar between the tree and rtx routines. 2019-11-18 Richard Sandiford <richard.sandiford@arm.com> gcc/ * fold-const.c (native_encode_vector): Turn into a wrapper function, splitting the main code out into... (native_encode_vector_part): ...this new function. (native_decode_vector_tree): New function. (fold_view_convert_vector_encoding): Likewise. (fold_view_convert_expr): Use it for converting VECTOR_CSTs to VECTOR_TYPEs. gcc/testsuite/ * gcc.target/aarch64/sve/acle/general/temporaries_1.c: New test. From-SVN: r278410
Richard Sandiford committed -
For: void f1 (int *x, int *y) { for (int i = 0; i < 32; ++i) x[i] += y[i]; } we checked at runtime whether one vector at x would overlap one vector at y. But in cases like this, the vector code would handle x <= y just fine, since any write to address A still happens after any read from address A. The only problem is if x is ahead of y by less than a vector. The same is true for two writes: void f2 (int *x, int *y) { for (int i = 0; i < 32; ++i) { x[i] = i; y[i] = 2; } } if y <= x then a vector write at y after a vector write at x would have the same net effect as the original scalar writes. This patch optimises the alias checks for these two cases. E.g., before the patch, f1 used: add x2, x0, 15 sub x2, x2, x1 cmp x2, 30 bls .L2 whereas after the patch it uses: add x2, x1, 4 sub x2, x0, x2 cmp x2, 8 bls .L2 Read-after-write cases like: int f3 (int *x, int *y) { int res = 0; for (int i = 0; i < 32; ++i) { x[i] = i; res += y[i]; } return res; } can cope with x == y, but otherwise don't allow overlap in either direction. Since checking for x == y at runtime would require extra code, we're probably better off sticking with the current overlap test. An overlap test is also needed if the scalar or vector accesses covered by the alias check are mixed together, rather than all statements for the second access following all statements for the first access. The new code for gcc.target/aarch64/sve/var_strict_[135].c is slightly better than before. 2019-11-18 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-data-ref.c (create_intersect_range_checks_index): If the alias pair describes simple WAW and WAR dependencies, just check whether the first B access overlaps later A accesses. (create_waw_or_war_checks): New function that performs the same optimization on addresses. (create_intersect_range_checks): Call it. gcc/testsuite/ * gcc.dg/vect/vect-alias-check-8.c: Expect WAR/WAW checks to be used. * gcc.dg/vect/vect-alias-check-14.c: Likewise. * gcc.dg/vect/vect-alias-check-15.c: Likewise. * gcc.dg/vect/vect-alias-check-18.c: Likewise. * gcc.dg/vect/vect-alias-check-19.c: Likewise. * gcc.target/aarch64/sve/var_stride_1.c: Update expected sequence. * gcc.target/aarch64/sve/var_stride_2.c: Likewise. * gcc.target/aarch64/sve/var_stride_3.c: Likewise. * gcc.target/aarch64/sve/var_stride_5.c: Likewise. From-SVN: r278409
Richard Sandiford committed
-