- 06 Feb, 2020 15 commits
-
-
Implement standard approach by emitting "#" for insns that have to be split. * config/i386/i386.md (*pushtf): Emit "#" instead of calling gcc_unreachable in insn output. (*pushxf): Ditto. (*pushdf): Ditto. (*pushsf_rex64): Ditto for alternatives other than 1. (*pushsf): Ditto for alternatives other than 1.
Uros Bizjak committed -
PR93570 reports that the documentation shows __builtin_mtfsf to return a double, but this is incorrect. The return signature should be void. 2020-02-06 Bill Schmidt <wschmidt@linux.ibm.com> PR target/93570 * doc/extend.texi (Basic PowerPC Built-in Functions): Correct prototype for __builtin_mtfsf.
Bill Schmidt committed -
PR gcov-profile/91971 PR gcov-profile/93466 * coverage.c (coverage_init): Revert mangling of path into filename. It can lead to huge filename length. Creation of subfolders seems more natural.
Martin Liska committed -
* include/bits/stl_iterator.h (__detail::__common_iter_ptr): Fix PR number in comment. Fix indentation.
Jonathan Wakely committed -
* include/bits/stl_algobase.h (__iter_swap, __iter_swap<true>): Remove redundant _GLIBCXX20_CONSTEXPR.
Jonathan Wakely committed -
* gcc.target/arm/multilib.exp (multilib_config): Pass flags to …_target_compile as (additional_flags=) option and not as source filename to make it work with remote execution. * lib/target-supports.exp (check_runtime, check_gc_sections_available, check_effective_target_gas, check_effective_target_gld): Likewise.
Tobias Burnus committed -
The __iter_swap class template and explicit specialization are only declared (and used) for C++03 so _GLIBCXX20_CONSTEXPR does nothing here. * include/bits/stl_algobase.h (__iter_swap, __iter_swap<true>): Remove redundant _GLIBCXX20_CONSTEXPR.
Jonathan Wakely committed -
This was sent and approved on gcc-patches as "[GCC][BUG][Aarch64][ARM] (PR93300) Fix ICE due to BFmode placement in GET_MODES_WIDER chain". The observed error came about because BFmode was placed between HFmode and SFmode in the GET_MODES_WIDER chain, resulting in convert_mode_scalar attempting to gen a libfunc for a HFmode -> BFmode conversion. This patch registers NULL for all libfuncs in BFmode, which stops the middle-end from attempting to generate them. gcc/ChangeLog: 2020-02-06 Stam Markianos-Wright <stam.markianos-wright@arm.com> PR target/93300 * config/arm/arm.c (arm_block_arith_comp_libfuncs_for_mode): New. (arm_init_libfuncs): Add BFmode support to block spurious BF libfuncs. Use arm_block_arith_comp_libfuncs_for_mode for HFmode.
Stam Markianos-Wright committed -
The following testcase shows that for _mm256_set*_m128i and similar intrinsics, we sometimes generate bad code. All 4 routines are expressing the same thing, a 128-bit vector zero padded to 256-bit vector, but only the 3rd one actually emits the desired vmovdqa %xmm0, %xmm0 insn, the others vpxor %xmm1, %xmm1, %xmm1; vinserti128 $0x1, %xmm1, %ymm0, %ymm0 The problem is that the cast builtins use UNSPEC_CAST which is after reload simplified using a splitter, but during combine it prevents optimizations. We do have avx_vec_concat* patterns that generate efficient code, both for this low part + zero concatenation special case and for other cases too, so the following define_insn_and_split just recognizes avx_vec_concat made of a low half of a cast and some other reg. 2020-02-06 Jakub Jelinek <jakub@redhat.com> PR target/93594 * config/i386/predicates.md (avx_identity_operand): New predicate. * config/i386/sse.md (*avx_vec_concat<mode>_1): New define_insn_and_split. * gcc.target/i386/avx2-pr93594.c: New test.
Jakub Jelinek committed -
openmp: Fix handling of non-addressable shared scalars in parallel nested inside of target [PR93515] As the following testcase shows, we need to consider even target to be a construct that forces not to use copy in/out for shared on parallel inside of the target. E.g. for parallel nested inside another parallel or host teams, we already avoid copy in/out and we need to treat target the same. 2020-02-06 Jakub Jelinek <jakub@redhat.com> PR libgomp/93515 * omp-low.c (use_pointer_for_field): For nested constructs, also look for map clauses on target construct. (scan_omp_1_stmt) <case GIMPLE_OMP_TARGET>: Bump temporarily taskreg_nesting_level. * testsuite/libgomp.c-c++-common/pr93515.c: New test.
Jakub Jelinek committed -
If we call omp_add_variable, a following omp_notice_variable call will already find the variable on that construct and will not go through the outer constructs; the following patch fixes that. Note, this still doesn't follow OpenMP 5.0 semantics on target combined with other constructs with reduction/lastprivate/linear clauses, will handle that for GCC11. 2020-02-06 Jakub Jelinek <jakub@redhat.com> PR libgomp/93515 * gimplify.c (gimplify_scan_omp_clauses) <do_notice>: If adding shared clause, call omp_notice_variable on outer context if any.
Jakub Jelinek committed -
The ARM Exception Handling ABI requires personality functions in phase1 to initialize barrier_cache before returning _URC_HANDLER_FOUND, and we don't. Although our own ARM personality function does not use barrier_cache at all, other languages' ARM personality functions, during phase2, are allowed and expected to test barrier_cache.sp to check whether the handler frame was reached, which implies that the personality function is in charge of the frame, and the remaining fields of barrier_cache hold whatever values it put there in phase1. Since we did not set barrier_cache.sp, an earlier exception, already handled by a non-Ada handler and then released, may have its storage reused for a new exception that phase1 matches to an Ada frame, but if that leaves barrier_cache.sp alone, the phase2 personality function that handled the earlier exception, upon reaching the frame that handled the earlier exception, may believe the information in barrier_cache applies to the current exception. The C++ personality function, for example, would take the information in the barrier_cache and end up activating the handler that handled the earlier exception: try { throw 1; } catch (int i) { std::cout << "caught " << i << " by c++" << std::endl; } raise_ada_exception (); // might loop back to the handler above for gcc/ada/ChangeLog * raise-gcc.c (personality_body) [__ARM_EABI_UNWINDER__]: Initialize barrier_cache.sp when ending phase1.
Alexandre Oliva committed -
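The stale-cache hazard described in this commit message can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyBarrierCache`, `ToyException`, the two `phase1_*` helpers, `phase2_thinks_handler_frame` and the stack pointer values are all hypothetical, and the real unwinder structures and personality routines are considerably richer.

```cpp
#include <cstdint>

// Toy stand-in for the per-exception barrier cache; only the sp field is
// modelled. All names here are hypothetical, not the real unwinder types.
struct ToyBarrierCache {
    std::uintptr_t sp = 0;
};

struct ToyException {
    ToyBarrierCache barrier_cache;
};

// Phase 1 as described before the fix: a handler is found, but the cache is
// left untouched, so whatever an earlier exception stored there survives.
inline void phase1_leaving_sp_alone(ToyException&, std::uintptr_t /*handler_sp*/) {}

// Phase 1 with the fix: record the handler frame before reporting success.
inline void phase1_setting_sp(ToyException& exc, std::uintptr_t handler_sp) {
    exc.barrier_cache.sp = handler_sp;
}

// Phase 2 check of the kind another language's personality function may make:
// "is this the frame whose handler I selected in phase 1?"
inline bool phase2_thinks_handler_frame(const ToyException& exc,
                                        std::uintptr_t frame_sp) {
    return exc.barrier_cache.sp == frame_sp;
}

// Replays the scenario: storage last used by an exception caught in a foreign
// frame is reused for a new exception whose handler lives in another frame.
inline bool foreign_frame_misfires(bool phase1_sets_sp) {
    const std::uintptr_t foreign_frame_sp = 0x1000;  // illustrative value
    const std::uintptr_t new_handler_sp = 0x2000;    // illustrative value

    ToyException storage;
    storage.barrier_cache.sp = foreign_frame_sp;  // left by the earlier exception

    if (phase1_sets_sp)
        phase1_setting_sp(storage, new_handler_sp);
    else
        phase1_leaving_sp_alone(storage, new_handler_sp);

    // Phase 2 passes through the foreign frame on its way to the real handler.
    return phase2_thinks_handler_frame(storage, foreign_frame_sp);
}
```

In this model, `foreign_frame_misfires(false)` reports that the foreign frame wrongly claims the new exception, while `foreign_frame_misfires(true)` does not, which is the effect the one-line fix is aiming for.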
We should be able to assume that a template instantiation or other COMDAT has non-zero address even if MAKE_DECL_ONE_ONLY for the target sets DECL_WEAK and we haven't yet decided to emit a definition in this translation unit. PR c++/92003 * symtab.c (symtab_node::nonzero_address): A DECL_COMDAT decl has non-zero address even if weak and not yet defined.
Jason Merrill committed -
GCC Administrator committed
-
Martin Sebor committed
-
- 05 Feb, 2020 25 commits
-
-
gcc/ChangeLog: PR tree-optimization/92765 * gimple-fold.c (get_range_strlen_tree): Handle MEM_REF and PARM_DECL. * tree-ssa-strlen.c (compute_string_length): Remove. (determine_min_objsize): Remove. (get_len_or_size): Add an argument. Call get_range_strlen_dynamic. Avoid using type size as the upper bound on string length. (handle_builtin_string_cmp): Add an argument. Adjust. (strlen_check_and_optimize_call): Pass additional argument to handle_builtin_string_cmp. gcc/testsuite/ChangeLog: PR tree-optimization/92765 * g++.dg/tree-ssa/strlenopt-1.C: New test. * g++.dg/tree-ssa/strlenopt-2.C: New test. * gcc.dg/Warray-bounds-58.c: New test. * gcc.dg/Wrestrict-20.c: Avoid a valid -Wformat-overflow. * gcc.dg/Wstring-compare.c: Xfail a test. * gcc.dg/strcmpopt_2.c: Disable tests. * gcc.dg/strcmpopt_4.c: Adjust tests. * gcc.dg/strcmpopt_10.c: New test. * gcc.dg/strcmpopt_11.c: New test. * gcc.dg/strlenopt-69.c: Disable tests. * gcc.dg/strlenopt-92.c: New test. * gcc.dg/strlenopt-93.c: New test. * gcc.dg/strlenopt.h: Declare calloc. * gcc.dg/tree-ssa/pr92056.c: Xfail tests until pr93518 is resolved. * gcc.dg/tree-ssa/builtin-sprintf-warn-23.c: Correct test (pr93517).
Martin Sebor committed -
In unevaluated context, we only substitute a single PARM_DECL, not the entire chain, but the handling of an empty pack expansion was missing that check. PR c++/93140 * pt.c (tsubst_decl) [PARM_DECL]: Check cp_unevaluated_operand in handling of TREE_CHAIN for empty pack.
Jason Merrill committed -
Now that we have post epilogue_completed split point for all optimization levels, we can simplify post epilogue_completed splitters considerably. If corresponding define_peephole2 pattern fails to allocate a temporary register (or if peephole2 pass isn't run at all), we can now always split invalid RTX after epilogue_completed is set. Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}. * config/i386/i386.md (*pushdi2_rex64 peephole2): Remove. (*pushdi2_rex64 peephole2): Unconditionally split after epilogue_completed. (*ashl<mode>3_doubleword): Ditto. (*<shift_insn><mode>3_doubleword): Ditto.
Uros Bizjak committed -
Marek Polacek committed
-
Marek Polacek committed
-
In C++ we weren't calling mark_exp_read on the __builtin_convertvector first argument. I guess it could misbehave even with lambda implicit captures. Fixed by calling decay_conversion on the argument, we use the argument as rvalue so we want the standard lvalue to rvalue conversions, but as the argument must be a vector type, e.g. integral promotions aren't really needed. 2020-02-05 Jakub Jelinek <jakub@redhat.com> PR c++/93557 * semantics.c (cp_build_vec_convert): Call decay_conversion on arg prior to passing it to c_build_vec_convert. * c-c++-common/Wunused-var-17.c: New test.
Jakub Jelinek committed -
2020-02-05 Michael Meissner <meissner@linux.ibm.com> PR target/93568 * config/rs6000/rs6000.c (get_vector_offset): Fix
Michael Meissner committed -
Since reshape_init_array_1 can now reuse a single constructor for an array of non-aggregate type, we might run into a scenario where we reuse a constructor with TREE_SIDE_EFFECTS. This broke this test because we have something like { { expr } } and we try to reshape it, so we recurse on the inner CONSTRUCTOR, reuse an existing CONSTRUCTOR with TREE_SIDE_EFFECTS, and then ICE on the discrepancy because the outermost CONSTRUCTOR doesn't have TREE_SIDE_EFFECTS. In this case EXPR was a call to an operator function so TREE_SIDE_EFFECTS should be set. Naturally one would want to fix this by calling recompute_constructor_flags in an appropriate place so that the flags on the CONSTRUCTORs match. The appropriate place would be at the end of reshape_init, but this breaks initlist109.C: there we are dealing with { { TARGET_EXPR <{}> } } where the outermost { } is TREE_CONSTANT but the inner { } is not, so recompute_constructor_flags would clear the constant flag in the outermost { }. Seems reasonable but it upsets check_initializer which then complains about "non-constant in-class initialization invalid for static member". TARGET_EXPRs are always created with TREE_SIDE_EFFECTS on, but that is mutually exclusive with TREE_CONSTANT. So we're in a bind. Fixed by not reusing a CONSTRUCTOR that has TREE_SIDE_EFFECTS; in the grand scheme of things it isn't measurable: it only affects ~3 tests in the testsuite. PR c++/93559 - ICE with CONSTRUCTOR flags verification. * decl.c (reshape_init_array_1): Don't reuse a CONSTRUCTOR with TREE_SIDE_EFFECTS. * g++.dg/cpp0x/initlist119.C: New test. * g++.dg/cpp0x/initlist120.C: New test.
Marek Polacek committed -
In the testcase, since there's no declaration of T, ref_view(T) declares a non-static data member T of type ref_view, the same type as its enclosing class. Then when we try to do C++20 aggregate class template argument deduction we recursively try to adjust the braced-init-list to match the template class definition until we run out of stack. Fixed by rejecting the template data member. PR c++/92593 * decl.c (grokdeclarator): Reject field of current class type even in a template.
Jason Merrill committed -
2020-02-05 Andrew Stubbs <ams@codesourcery.com> gcc/ * config/gcn/t-gcn-hsa (MULTILIB_OPTIONS): Use / not space.
Andrew Stubbs committed -
* gcc.target/hppa/shadd-3.c: Disable delay slot filling and adjust expected shadd insn count appropriately.
Jeff Law committed -
* testsuite/lib/libgomp.exp (check_effective_target_offload_target_nvptx): Pass flags as 'options' and not as 'source' argument to libgomp_target_compile.
Tobias Burnus committed -
The G++ bug has been fixed for a couple of months so we can remove these workarounds that define alias templates in terms of constrained class templates. We can just apply constraints directly to alias templates as specified in the C++20 working draft. * include/bits/iterator_concepts.h (iter_reference_t) (iter_rvalue_reference_t, iter_common_reference_t, indirect_result_t): Remove workarounds for PR c++/67704. * testsuite/24_iterators/aliases.cc: New test.
Jonathan Wakely committed -
The analyzer recognizes __analyzer_dump_exploded_nodes as a "magic" function for use in DejaGnu tests: at the end of the pass, it issues a warning at each such call, dumping the count of exploded nodes seen at the call, which can be checked in test cases via dg-warning directives, along with the IDs of the enodes (which is helpful when debugging). My intent was to give a way of testing the results of the state-merging code. The state-merging code can generate duplicate exploded nodes at a point when state merging occurs, taking a pair of enodes from the worklist that share a program_point and sufficiently similar state. For these cases it generates a merged state, and adds edges from those enodes to the merged-state enode (potentially a new or a pre-existing enode); the input enodes don't have process_node called on them. This means that at a CFG join point there can be an unpredictable number of enodes that we don't care about, where the precise number depends on the details of the state-merger code, immediately followed by a more predictable number that we do care about. I've been papering over this in the analyzer DejaGnu tests somewhat by adding pairs of __analyzer_dump_exploded_nodes calls at CFG join points, where the output at the first call is somewhat arbitrary, and the second has the number we care about; the first number tends to change "at random" as I tweak the state merging code, in ways that aren't interesting, but require the tests to be updated. See e.g. gcc.dg/analyzer/paths-6.c which had: __analyzer_dump_exploded_nodes (0); /* { dg-warning "2 exploded nodes" } */ // FIXME: the above can vary between 2 and 3 exploded nodes __analyzer_dump_exploded_nodes (0); /* { dg-warning "1 exploded node" } */ This patch remedies this situation by tracking which enodes are processed, and which are merely "merger" enodes. 
It updates the output for __analyzer_dump_exploded_nodes so that count of enodes only includes the *processed* enodes, and that the IDs are split into "processed" and "merger" enodes. The patch simplifies the testsuite by eliminating the redundant calls described above; the example above becomes: __analyzer_dump_exploded_nodes (0); /* { dg-warning "1 processed enode" } */ where the output in question is now: warning: 1 processed enode: [EN: 94] merger(s): [EN: 93] The patch also adds various checks on the status of enodes, to ensure e.g. that each enode is processed at most once. gcc/analyzer/ChangeLog: * engine.cc (exploded_node::dump_dot): Show merger enodes. (worklist::add_node): Assert that the node's m_status is STATUS_WORKLIST. (exploded_graph::process_worklist): Likewise for nodes from the worklist. Set status of merged nodes to STATUS_MERGER. (exploded_graph::process_node): Set status of node to STATUS_PROCESSED. (exploded_graph::dump_exploded_nodes): Rework handling of "__analyzer_dump_exploded_nodes", splitting enodes by status into "processed" and "merger", showing the count of just the processed enodes at the call, rather than the count of all enodes. * exploded-graph.h (exploded_node::status): New enum. (exploded_node::exploded_node): Initialize m_status to STATUS_WORKLIST. (exploded_node::get_status): New getter. (exploded_node::set_status): New setter. (exploded_node::m_status): New field. gcc/ChangeLog: * doc/analyzer.texi (Special Functions for Debugging the Analyzer): Update description of __analyzer_dump_exploded_nodes. gcc/testsuite/ChangeLog: * gcc.dg/analyzer/data-model-1.c: Update for changed output to __analyzer_dump_exploded_nodes, dropping redundant call at merger. * gcc.dg/analyzer/data-model-7.c: Likewise. * gcc.dg/analyzer/loop-2.c: Update for changed output format. * gcc.dg/analyzer/loop-2a.c: Likewise. * gcc.dg/analyzer/loop-4.c: Likewise. * gcc.dg/analyzer/loop.c: Likewise. 
* gcc.dg/analyzer/malloc-paths-10.c: Likewise; drop redundant call at merger. * gcc.dg/analyzer/malloc-vs-local-1a.c: Likewise. * gcc.dg/analyzer/malloc-vs-local-1b.c: Likewise. * gcc.dg/analyzer/malloc-vs-local-2.c: Likewise. * gcc.dg/analyzer/malloc-vs-local-3.c: Likewise. * gcc.dg/analyzer/paths-1.c: Likewise. * gcc.dg/analyzer/paths-1a.c: Likewise. * gcc.dg/analyzer/paths-2.c: Likewise. * gcc.dg/analyzer/paths-3.c: Likewise. * gcc.dg/analyzer/paths-4.c: Update for changed output format. * gcc.dg/analyzer/paths-5.c: Likewise. * gcc.dg/analyzer/paths-6.c: Likewise; drop redundant calls at merger. * gcc.dg/analyzer/paths-7.c: Likewise. * gcc.dg/analyzer/torture/conditionals-2.c: Update for changed output format. * gcc.dg/analyzer/zlib-1.c: Likewise; drop redundant calls. * gcc.dg/analyzer/zlib-5.c: Update for changed output format.
David Malcolm committed -
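The processed/merger bookkeeping described in this commit can be sketched in a few lines. This is a simplified model, not the analyzer's code: `ToyEnode`, `Status`, `set_final_status`, `DumpSummary`, `summarize` and `sample_summary` are hypothetical names, and only the idea of splitting enodes by status and counting just the processed ones is taken from the commit message.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified mirror of the three states named in the ChangeLog.
enum class Status { Worklist, Processed, Merger };

struct ToyEnode {
    int id;
    Status status;
};

// An enode may leave the worklist only once, either by being processed or by
// being folded into a merged-state enode.
inline void set_final_status(ToyEnode& node, Status s) {
    assert(node.status == Status::Worklist);
    assert(s != Status::Worklist);
    node.status = s;
}

struct DumpSummary {
    std::vector<int> processed_ids;
    std::vector<int> merger_ids;
};

// Split the enodes seen at one call site by status; only the number of
// processed IDs would feed the count checked by a dg-warning directive.
inline DumpSummary summarize(const std::vector<ToyEnode>& nodes) {
    DumpSummary out;
    for (const ToyEnode& n : nodes) {
        if (n.status == Status::Processed)
            out.processed_ids.push_back(n.id);
        else if (n.status == Status::Merger)
            out.merger_ids.push_back(n.id);
    }
    return out;
}

// Rebuilds the situation from the sample output in the commit message:
// EN 93 was merged away and EN 94 was processed.
inline DumpSummary sample_summary() {
    std::vector<ToyEnode> nodes = {{93, Status::Worklist}, {94, Status::Worklist}};
    set_final_status(nodes[0], Status::Merger);
    set_final_status(nodes[1], Status::Processed);
    return summarize(nodes);
}
```

Because merger enodes never contribute to the count, a test that expects one processed enode stays stable even if the state-merging code starts producing a different number of intermediate enodes.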
As mentioned in the PR, the CLOBBERs in vzeroupper are added there even for registers that aren't ever live in the function before and break the prologue/epilogue expansion with ms ABI (normal ABIs are fine, as they consider all [xyz]mm registers call clobbered, but the ms ABI considers xmm0-15 call used but the bits above low 128 ones call clobbered). The following patch fixes it by not adding the clobbers during vzeroupper pass (before pro_and_epilogue), but adding them for -fipa-ra purposes only during the final output. Perhaps we could add some CLOBBERs early (say for df_regs_ever_live_p regs that aren't live in the live_regs bitmap, or depending on the ABI either add all of them immediately, or for ms ABI add CLOBBERs for xmm0-xmm5 if they don't have a SET) and add the rest later. And the addition could be perhaps done at other spots, e.g. in an epilogue_completed guarded splitter. 2020-02-05 Jakub Jelinek <jakub@redhat.com> PR target/92190 * config/i386/i386-features.c (ix86_add_reg_usage_to_vzeroupper): Only include sets and not clobbers in the vzeroupper pattern. * config/i386/sse.md (*avx_vzeroupper): Require in insn condition that the parallel has 17 (64-bit) or 9 (32-bit) elts. (*avx_vzeroupper_1): New define_insn_and_split. * gcc.target/i386/pr92190.c: New test.
Jakub Jelinek committed -
The problem is that x86 is the only target that defines HAVE_ATTR_length and doesn't schedule any splitting pass at -O0 after pro_and_epilogue. So, either we go back to handling this during vzeroupper output (unconditionally, rather than flag_ipa_ra guarded), or we need to tweak the split* passes for x86. Seems there are 5 split passes, split1 is run unconditionally before reload, split2 is run for optimize > 0 or STACK_REGS (x86) after ra but before epilogue_completed, split3 is run before regstack for STACK_REGS and optimize and -fno-schedule-insns2, split4 is run before sched2 if sched2 is run and split5 is run before shorten_branches if HAVE_ATTR_length and not STACK_REGS. 2020-02-05 Jakub Jelinek <jakub@redhat.com> PR target/92190 * recog.c (pass_split_after_reload::gate): For STACK_REGS targets, don't run when !optimize. (pass_split_before_regstack::gate): For STACK_REGS targets, run even when !optimize.
Jakub Jelinek committed -
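One way to read the gate changes in the commit above is as a pair of boolean predicates. The sketch below is a model built only from the prose description and the ChangeLog wording; `ToyConfig`, the four gate functions and `x86_at_O0` are hypothetical names, and the real gate functions in recog.c consult GCC's own macros and flags rather than a struct like this.

```cpp
// Hypothetical inputs; in GCC these come from target macros and option flags.
struct ToyConfig {
    bool stack_regs;       // target defines STACK_REGS (x86 does)
    int optimize;          // -O level
    bool schedule_insns2;  // the second scheduling pass will run
};

// split2: after register allocation, before epilogue_completed.
// Before the change it is described as running for optimize > 0 or STACK_REGS.
inline bool split_after_reload_old(const ToyConfig& c) {
    return c.optimize > 0 || c.stack_regs;
}

// After the change, STACK_REGS targets no longer run it when not optimizing.
inline bool split_after_reload_new(const ToyConfig& c) {
    return c.optimize > 0;
}

// split3: before regstack. Before the change it is described as running for
// STACK_REGS targets when optimizing with -fno-schedule-insns2.
inline bool split_before_regstack_old(const ToyConfig& c) {
    return c.stack_regs && c.optimize > 0 && !c.schedule_insns2;
}

// After the change, STACK_REGS targets also run it when not optimizing.
inline bool split_before_regstack_new(const ToyConfig& c) {
    return c.stack_regs && (c.optimize == 0 || !c.schedule_insns2);
}

// The configuration the commit is concerned with: x86 at -O0.
inline ToyConfig x86_at_O0() {
    return ToyConfig{true, 0, false};
}
```

Under this reading, x86 at -O0 moves its single post-reload split from split2 (before the prologue and epilogue exist) to split3 (after them), which is what lets the vzeroupper splitter guarded by epilogue_completed fire.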
We're now consistently building SLP operations with only scalar defs from scalars, which means the testcase no longer tests multiplication vectorization. The following smuggles in a constant making the vector variant profitable for SLP build. 2020-02-05 Richard Biener <rguenther@suse.de> PR testsuite/92177 * gcc.dg/vect/bb-slp-22.c: Adjust.
Richard Biener committed -
This adds guards to genmatch generated code before accessing call expression or stmt arguments that might be out of bounds when the user provided bogus prototypes for what we consider builtins. 2020-02-05 Richard Biener <rguenther@suse.de> PR middle-end/90648 * genmatch.c (dt_node::gen_kids_1): Emit number of argument checks before matching calls. * gcc.dg/pr90648.c: New testcase.
Richard Biener committed -
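The kind of guard described in the genmatch commit above can be sketched as follows. This is an illustration of the idea only: `ToyCall`, `has_nargs` and `matches_pow_square` are hypothetical, and the code genmatch actually emits operates on GCC's own call statements and expressions rather than on a struct like this.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, heavily simplified call representation.
struct ToyCall {
    std::string callee;
    std::vector<long> args;
};

// Guard in the spirit of the emitted checks: refuse to match unless the call
// really has the number of arguments the pattern expects.
inline bool has_nargs(const ToyCall& call, std::size_t expected) {
    return call.args.size() == expected;
}

// Toy pattern "pow (x, 2)". A user-supplied bogus prototype could yield a
// call to a builtin-named function with no arguments at all, so the argument
// count is checked before any argument is indexed.
inline bool matches_pow_square(const ToyCall& call) {
    if (call.callee != "pow")
        return false;
    if (!has_nargs(call, 2))
        return false;
    return call.args[1] == 2;
}
```

Without the `has_nargs` check, the final line would read past the end of an empty argument list, which is the out-of-bounds access the commit guards against.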
Makes some parameters const in libiberty's hashtab library. include/ChangeLog: * hashtab.h (htab_remove_elt): Make a parameter const. (htab_remove_elt_with_hash): Likewise. libiberty/ChangeLog: * hashtab.c (htab_remove_elt): Make a parameter const. (htab_remove_elt_with_hash): Likewise.
Andrew Burgess committed -
gcc/cp * coroutines.cc (maybe_promote_captured_temps): Increase the index number for temporary variables' name.
Bin Cheng committed -
2020-02-05 Jakub Jelinek <jakub@redhat.com> * tree-ssa-alias.c (aliasing_matching_component_refs_p): Fix up function comment typo.
Jakub Jelinek committed -
The testcases ICE because when processing the declare simd inbranch, we don't create the i == 0 clone as it already exists, which means clone_info->nargs is not adjusted, but we then rely on it being adjusted when trying other clones. 2020-02-05 Jakub Jelinek <jakub@redhat.com> PR middle-end/93555 * omp-simd-clone.c (expand_simd_clones): If simd_clone_mangle or simd_clone_create failed when i == 0, adjust clone->nargs by clone->inbranch. * c-c++-common/gomp/pr93555-1.c: New test. * c-c++-common/gomp/pr93555-2.c: New test. * gfortran.dg/gomp/pr93555.f90: New test.
Jakub Jelinek committed -
PR lto/93489 * lto-dump.c (dump_list_functions): Do not load body for aliases. (dump_body): Likewise here.
Martin Liska committed -
PR c++/92717 * doc/invoke.texi: Document that one should not combine ASLR and -fpch.
Martin Liska committed -
These changes are needed for some of the tests in the constrained algorithm patch, because they use move_iterator with an uncopyable output_iterator. The other changes described in the paper are already applied, it seems. libstdc++-v3/ChangeLog: * include/bits/stl_iterator.h (move_iterator::move_iterator): Move __i when initializing _M_current. (move_iterator::base): Split into two overloads differing in ref-qualifiers as in P1207R4 for C++20.
Patrick Palka committed
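The two changes to move_iterator listed above (moving the wrapped iterator in, and splitting base() by ref-qualifier) can be sketched with a small adaptor. This is not the libstdc++ code: `MoveOnlyIter`, `MoveAdaptor` and `first_value_via_rvalue_base` are hypothetical, and the sketch relies on lazy instantiation of member functions where the real C++20 overload would be constrained.

```cpp
#include <utility>

// Hypothetical move-only iterator stand-in (not a real library type).
struct MoveOnlyIter {
    int* p;
    explicit MoveOnlyIter(int* q) : p(q) {}
    MoveOnlyIter(MoveOnlyIter&&) = default;
    MoveOnlyIter& operator=(MoveOnlyIter&&) = default;
    MoveOnlyIter(const MoveOnlyIter&) = delete;
    MoveOnlyIter& operator=(const MoveOnlyIter&) = delete;
};

// Simplified adaptor, loosely modelled on the commit description.
template <typename Iter>
class MoveAdaptor {
    Iter current_;

public:
    // Move the argument into the member instead of copying it, so that
    // uncopyable iterators can be wrapped at all.
    explicit MoveAdaptor(Iter i) : current_(std::move(i)) {}

    // Called on an lvalue adaptor: hands out a copy. The body is only
    // instantiated if this overload is actually used, so a move-only Iter
    // is fine as long as nobody calls it.
    Iter base() const & { return current_; }

    // Called on an rvalue adaptor: hands out the iterator by moving it.
    Iter base() && { return std::move(current_); }
};

// Wraps a move-only iterator and then extracts it again via the rvalue
// overload of base().
inline int first_value_via_rvalue_base() {
    int data[] = {7, 8, 9};
    MoveAdaptor<MoveOnlyIter> it{MoveOnlyIter{data}};
    MoveOnlyIter inner = std::move(it).base();
    return *inner.p;
}
```

With a single `base() const` overload returning by value, the last step would need a copy and would not compile for `MoveOnlyIter`; the `&&` overload is what makes uncopyable iterators usable here.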
-