- 20 Apr, 2020 12 commits
-
-
This patch fixes a large lmbench performance regression with 128-bit SVE, compiled in length-agnostic mode.

vect_better_loop_vinfo_p (new in GCC 10) tries to estimate whether a new loop_vinfo is cheaper than a previous one, with an in-built preference for the old one. For variable VF it prefers the old loop_vinfo if it is cheaper for at least one VF. However, we have no idea how likely that VF is in practice.

Another extreme would be to do what most of the rest of the vectoriser does, and rely solely on the constant estimated VF. But as noted in the comment, this means that a one-unit cost difference would be enough to pick the new loop_vinfo, despite the target generally preferring the old loop_vinfo where possible. The cost model just isn't accurate enough for that to produce good results as things stand: there might not be any practical benefit to the new loop_vinfo at the estimated VF, and it would be significantly worse for higher VFs.

The patch instead goes for a hacky compromise: make sure that the new loop_vinfo is also no worse than the old loop_vinfo at double the estimated VF. For all but trivial loops, this ensures that the new loop_vinfo is only chosen if it is better than the old one by a non-trivial amount at the estimated VF. It also avoids putting too much faith in the VF estimate.

I realise this isn't great, but it's supposed to be a conservative fix suitable for stage 4. The only affected testcases are the ones for pr89007-*.c, where Advanced SIMD is indeed preferred for 128-bit SVE and is no worse for 256-bit SVE.

Part of the problem here is that if the new loop_vinfo is better, we discard the old one and never consider using it even as an epilogue loop. This means that if we choose Advanced SIMD over SVE, we're much more likely to have left-over scalar elements.

Another is that the estimate provided by estimated_poly_value might have different probabilities attached. E.g. when tuning for a particular core, the estimate is probably accurate, but when tuning for generic code, the estimate is more of a guess. Relying solely on the estimate is probably correct for the former but not for the latter.

Hopefully those are things that we could tackle in GCC 11.

2020-04-20 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vect-loop.c (vect_better_loop_vinfo_p): If old_loop_vinfo has a variable VF, prefer new_loop_vinfo if it is cheaper for the estimated VF and is no worse at double the estimated VF. gcc/testsuite/ * gcc.target/aarch64/sve/cost_model_8.c: New test. * gcc.target/aarch64/sve/cost_model_9.c: Likewise. * gcc.target/aarch64/sve/pr89007-1.c: Add -msve-vector-bits=512. * gcc.target/aarch64/sve/pr89007-2.c: Likewise.
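The rule described in this commit (cheaper at the estimated VF and no worse at double the estimated VF) can be illustrated with a toy cost model. This is only a sketch under stated assumptions: `Candidate`, `cost_per_element` and `prefer_new` are hypothetical names, and the real vect_better_loop_vinfo_p works on far richer cost data than a single per-iteration number.

```cpp
#include <cassert>

// Hypothetical, heavily simplified stand-in for the costs of one loop_vinfo.
struct Candidate {
  double body_cost;      // cost of one vector loop iteration
  double elems_at_unit;  // elements processed per iteration at VF scale 1
  bool scales;           // true if the VF grows with the vector length (e.g. SVE)
};

// Cost per scalar element when the hardware vector length is `scale` times
// the minimum one.  Fixed-length candidates ignore the scale.
double cost_per_element(const Candidate &c, double scale) {
  assert(scale > 0.0);
  double elems = c.scales ? c.elems_at_unit * scale : c.elems_at_unit;
  return c.body_cost / elems;
}

// Prefer the new candidate only if it is strictly cheaper at the estimated
// scale AND no worse at double the estimated scale.
bool prefer_new(const Candidate &old_c, const Candidate &new_c,
                double estimated_scale) {
  bool cheaper_at_estimate =
      cost_per_element(new_c, estimated_scale) <
      cost_per_element(old_c, estimated_scale);
  bool no_worse_at_double =
      cost_per_element(new_c, 2.0 * estimated_scale) <=
      cost_per_element(old_c, 2.0 * estimated_scale);
  return cheaper_at_estimate && no_worse_at_double;
}
```

In this model a fixed-length candidate that is only slightly cheaper at the estimated VF loses to a length-agnostic one, because the latter overtakes it once the vector length doubles; it wins only when its advantage is large enough to survive that doubling.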
Richard Sandiford committed -
This testcase triggered an ICE in rtx_vector_builder::step because we were trying to use a stepped representation for floating-point constants. The underlying problem was that the arguments to rtx_vector_builder were the wrong way around, meaning that some variations were likely to be incorrectly encoded for integers (but probably as a silent failure). Also, aarch64_sve_expand_vector_init_handle_trailing_constants tries to extend the trailing constant elements to a full vector by following the "natural" pattern of the original vector, which should generally lead to nicer constants. However, for the testcase, we'd then end up picking a variable for some elements. Fixed by stubbing out all variable elements with zeros. That fix involved testing valid_for_const_vector_p. For consistency, the patch uses the same test when finding trailing constants, instead of the previous aarch64_legitimate_constant_p. 2020-04-20 Richard Sandiford <richard.sandiford@arm.com> gcc/ PR target/94668 * config/aarch64/aarch64.c (aarch64_sve_expand_vector_init): Fix order of arguments to rtx_vector_builder. (aarch64_sve_expand_vector_init_handle_trailing_constants): Likewise. When extending the trailing constants to a full vector, replace any variables with zeros. gcc/testsuite/ PR target/94668 * gcc.target/aarch64/sve/pr94668.c: New test.
Richard Sandiford committed -
If extra_tool_flags starts with a dash, an error like 'ERROR: verbose: illegal argument: -march=native -O2 -std=c++17' is printed. This is easily fixed by inserting a double dash before the variable. 2020-04-20 Matthias Kretz <kretz@kde.org> * testsuite/lib/libstdc++.exp: Avoid illegal argument to verbose.
Matthias Kretz committed -
We treat tpl-tpl-parms as types. They're not; bound-tpl-tpl-parms are. We can get away with them being type-like. Unfortunately we give the original level==orig_level case a canonical type, but the reduced cases of level<orig_level get structural equality. This patch gives them structural type always. * pt.c (canonical_type_parameter): Assert not a tpl-tpl-parm. (process_template_parm): tpl-tpl-parms are structural. (rewrite_template_parm): Propagate structuralness.
Nathan Sidwell committed -
We were not comparing expression pack expansions correctly. We could consider distinct expansions equal and create two apparently equal specializations that would sometimes collide. cp_tree_operand_length says a pack has 1 operand (for mangling), whereas it actually has 3, only two of which are significant for equality. We must special case that in cp_tree_equal. That new code matches the hasher and the type_pack_expansion case in structural_comp_types. * tree.c (cp_tree_equal): [TEMPLATE_ID_EXPR, default] Refactor. [EXPR_PACK_EXPANSION]: Add.
Nathan Sidwell committed -
One of the problems hit by pr94454 was that the argument hasher was not skipping nodes that template_args_equal would. Fixed by replacing the STRIP_NOPS invocation by a bespoke loop. We also confuse the canonical type machinery by treating tpl-tpl-parms as types. They're not; bound-tpl-tpl-parms are. We can get away with them being type-like. Unfortunately we give the original level==orig_level case a canonical type, but the reduced cases of level<orig_level get structural equality. That breaks the hasher because we'll use TYPE_HASH (CANONICAL_TYPE ()) when we can. There's a note in tsubst[TEMPLATE_TEMPLATE_PARM] about why the reduced ones cannot have a canonical type. (I didn't feel like questioning that assertion at this point.) * pt.c (iterative_hash_template_arg): Strip nodes as template_args_equal does. [ARGUMENT_PACK_SELECT, TREE_VEC, CONSTRUCTOR]: Refactor. [node_class:TEMPLATE_TEMPLATE_PARM]: Hash by level & index. [node_class:default]: Refactor.
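The first problem above is an instance of a general invariant: a hash function must ignore exactly what the matching equality predicate ignores, or equal keys can land in different buckets. A minimal sketch of that invariant follows; `Node`, `strip_wrappers`, `args_equal` and `hash_arg` are hypothetical names for illustration and do not correspond to GCC's tree nodes or to iterative_hash_template_arg.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

// Hypothetical node: either a leaf carrying a value, or a wrapper around
// another node that equality is supposed to look through.
struct Node {
  bool is_wrapper;
  int value;          // meaningful for leaves only
  const Node *inner;  // meaningful for wrappers only
};

// Shared stripping loop, used by BOTH the equality test and the hasher.
const Node *strip_wrappers(const Node *n) {
  assert(n != nullptr);
  while (n->is_wrapper)
    n = n->inner;
  return n;
}

// Equality looks through wrappers.
bool args_equal(const Node *a, const Node *b) {
  return strip_wrappers(a)->value == strip_wrappers(b)->value;
}

// The hasher strips the same wrappers, so equal arguments hash equally.
std::size_t hash_arg(const Node *n) {
  return std::hash<int>()(strip_wrappers(n)->value);
}
```

Routing both functions through one stripping helper is a design choice that makes it hard for the two to drift apart again.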
Nathan Sidwell committed -
PR ipa/94582 * tree-inline.c (optimize_inline_calls): Recompute calls_comdat_local flag. * g++.dg/torture/pr94582.C: New test.
Jan Hubicka committed -
Add missing check in gfc_set_array_spec for sum of rank and corank to not exceed GFC_MAX_DIMENSIONS. 2020-04-20 Harald Anlauf <anlauf@gmx.de> PR fortran/93364 * array.c (gfc_set_array_spec): Check for sum of rank and corank not exceeding GFC_MAX_DIMENSIONS. 2020-04-20 Harald Anlauf <anlauf@gmx.de> PR fortran/93364 * gfortran.dg/pr93364.f90: New test.
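The added check amounts to a bound on the combined number of dimensions and codimensions. A rough sketch, assuming a limit of 15 for GFC_MAX_DIMENSIONS (an assumption, not taken from the commit) and using the hypothetical names `kMaxDimensions` and `array_spec_fits`:

```cpp
#include <cassert>

// Assumed value of GFC_MAX_DIMENSIONS; hypothetical constant for illustration.
constexpr int kMaxDimensions = 15;

// True if an array spec with the given rank and corank stays within the limit.
bool array_spec_fits(int rank, int corank) {
  assert(rank >= 0 && corank >= 0);
  return rank + corank <= kMaxDimensions;
}
```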
Harald Anlauf committed -
* symtab.c (symtab_node::dump_references): Add space after one entry. (symtab_node::dump_referring): Likewise.
Martin Liska committed -
2020-04-20 Steve Kargl <kargl@gcc.gnu.org> Thomas Koenig <tkoenig@gcc.gnu.org> PR fortran/91800 * decl.c (variable_decl): Reject Hollerith constants as type initializer. 2020-04-20 Steve Kargl <kargl@gcc.gnu.org> Thomas Koenig <tkoenig@gcc.gnu.org> PR fortran/91800 * gfortran.dg/hollerith_9.f90: New test.
Steve Kargl committed -
Testing on the host does not make sense for 'declare copyout' for a same-scope stack-allocated variable. Once the copyout is done, the variable is gone. Hence, test the variable on the device. This can be revisited after the OpenACC semantics have been fixed; but with that fix, the test PASSes again with devices. PR middle-end/94120 * testsuite/libgomp.oacc-c++/declare-pr94120.C: Fix 'declare copy(out)' test case.
Tobias Burnus committed -
GCC Administrator committed
-
- 19 Apr, 2020 17 commits
-
-
Some more C++20 changes from P1614R2, "The Mothership has Landed". * include/bits/stl_queue.h (queue): Define operator<=> for C++20. * include/bits/stl_stack.h (stack): Likewise. * testsuite/23_containers/queue/cmp_c++20.cc: New test. * testsuite/23_containers/stack/cmp_c++20.cc: New test.
Jonathan Wakely committed -
Some more C++20 changes from P1614R2, "The Mothership has Landed". * include/bits/unordered_map.h (unordered_map, unordered_multimap): Remove redundant operator!= for C++20. * include/bits/unordered_set.h (unordered_set, unordered_multiset): Likewise. * include/debug/unordered_map (unordered_map, unordered_multimap): Likewise. * include/debug/unordered_set (unordered_set, unordered_multiset): Likewise.
Jonathan Wakely committed -
This appears to be a copy&paste error, which cppcheck diagnoses. PR other/94629 * include/debug/formatter.h (_Error_formatter::_Parameter): Fix redundant assignment in constructor.
Jonathan Wakely committed -
AIX does not support DWARF5 sections. -fcompare-debug causes gratuitous testcase failures on AIX. 2020-04-19 David Edelsohn <dje.gcc@gmail.com> * g++.dg/debug/dwarf2/pr85550.C: Skip AIX. * g++.dg/debug/pr94272.C: Skip AIX. * g++.dg/debug/pr94281.C: Skip AIX. * g++.dg/debug/pr94323.C: Skip AIX.
David Edelsohn committed -
std.array.Appender and RefAppender: use .opSlice() instead of data() Previously, Appender.data() was used to extract a slice of the Appender's array. Now use the [] slice operator instead. The same goes for RefAppender. Fixes: PR d/94455 Reviewed-on: https://github.com/dlang/phobos/pull/7450
Iain Buclaw committed -
Fixes hasLength unittest to pass on X32. References: PR d/94609 Reviewed-on: https://github.com/dlang/phobos/pull/7448
Iain Buclaw committed -
Initializes the VectorArrayExp::size field with the correct value. Fixes: PR d/94652 Reviewed-on: https://github.com/dlang/dmd/pull/11046
Iain Buclaw committed -
Initializes ncost before use, which was caught by valgrind. Fixes: PR d/94653 Reviewed-on: https://github.com/dlang/dmd/pull/11045
Iain Buclaw committed -
According to Intel 64 and IA32 Arch SDM, Vol. 3: "Because SIMD floating-point exceptions are precise and occur immediately, the situation does not arise where an x87 FPU instruction, a WAIT/FWAIT instruction, or another SSE/SSE2/SSE3 instruction will catch a pending unmasked SIMD floating-point exception." Remove unneeded assignments to volatile memory. libgcc/ChangeLog: * config/i386/sfp-exceptions.c (__sfp_handle_exceptions) [__SSE_MATH__]: Remove unneeded assignments to volatile memory. libatomic/ChangeLog: * config/x86/fenv.c (__atomic_feraiseexcept) [__SSE_MATH__]: Remove unneeded assignments to volatile memory. libgfortran/ChangeLog: * config/fpu-387.h (local_feraiseexcept) [__SSE_MATH__]: Remove unneeded assignments to volatile memory.
Uros Bizjak committed -
While the coroutines implementation, and most of the coroutines tests, will operate with C++14 or newer, these tests require facilities introduced in C++17. Add the target requirement. gcc/testsuite/ 2020-04-19 Iain Sandoe <iain@sandoe.co.uk> * g++.dg/coroutines/torture/co-await-17-capture-comp-ref.C: Require C++17. * g++.dg/coroutines/torture/co-ret-15-default-return_void.C: Likewise.
Iain Sandoe committed -
Thomas König committed
-
2020-04-19 Thomas Koenig <tkoenig@gcc.gnu.org> PR fortran/94347 * gfortran.dg/char_pointer_init_1.f90: New test.
Thomas König committed -
2020-04-19 Thomas Koenig <tkoenig@gcc.gnu.org> PR fortran/57129 * gfortran.dg/subroutine_as_type.f90: New test.
Thomas König committed -
Returning &gfc_bad_expr when simplifying bounds after a division by zero happened results in the division by zero error actually reaching the user. 2020-04-19 Thomas Koenig <tkoenig@gcc.gnu.org> PR fortran/93500 * resolve.c (resolve_operator): If both operands are NULL, return false. * simplify.c (simplify_bound): If a division by zero was seen during bound simplification, free the corresponding expression and return &gfc_bad_expr. 2020-04-19 Thomas Koenig <tkoenig@gcc.gnu.org> PR fortran/93500 * arith_divide_3.f90: New test.
Thomas König committed -
Similarly to inline asm, :: (or any other number of consecutive colons) can appear in an ObjC @selector argument, and with the introduction of CPP_SCOPE into the C FE, we need to treat CPP_SCOPE as two CPP_COLON tokens. The C++ FE already does it that way. 2020-04-19 Jakub Jelinek <jakub@redhat.com> PR objc/94637 * c-parser.c (c_parser_objc_selector_arg): Handle CPP_SCOPE like two CPP_COLON tokens. * objc.dg/pr94637.m: New test.
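The idea of treating one scope token as two colon tokens can be sketched with a toy lexer. Everything here is hypothetical: `Tok`, `lex` and `selector_colons` are illustrative names, and the real c_parser_objc_selector_arg operates on the C front end's token stream rather than on strings.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token kinds, loosely mirroring CPP_NAME, CPP_COLON, CPP_SCOPE.
enum class Tok { Name, Colon, Scope };

// Toy lexer: "::" is greedily lexed as a single Scope token, as a lexer that
// knows about the scope operator would do.
std::vector<Tok> lex(const std::string &s) {
  std::vector<Tok> out;
  for (std::size_t i = 0; i < s.size();) {
    if (s[i] == ':') {
      if (i + 1 < s.size() && s[i + 1] == ':') {
        out.push_back(Tok::Scope);
        i += 2;
      } else {
        out.push_back(Tok::Colon);
        ++i;
      }
    } else {
      while (i < s.size() && s[i] != ':')
        ++i;
      out.push_back(Tok::Name);
    }
  }
  return out;
}

// Selector-side view: count colons, with each Scope token counting as two.
std::size_t selector_colons(const std::vector<Tok> &toks) {
  std::size_t n = 0;
  for (Tok t : toks) {
    if (t == Tok::Colon)
      n += 1;
    else if (t == Tok::Scope)
      n += 2;
  }
  return n;
}
```

With this handling, a selector written as `a::` yields two colons even though the lexer produced only one punctuation token for them.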
Jakub Jelinek committed -
Patch fixes test failure seen on X32 where a nested struct was passed in registers, rather than via invisible reference. Now, all non-POD structs are passed by invisible reference, not just those with a user-defined copy constructor/destructor. gcc/d/ChangeLog: PR d/94609 * d-codegen.cc (argument_reference_p): Don't check TREE_ADDRESSABLE. (type_passed_as): Build reference type if TREE_ADDRESSABLE. * d-convert.cc (convert_for_argument): Build explicit TARGET_EXPR if needed for arguments passed by invisible reference. * types.cc (TypeVisitor::visit (TypeStruct *)): Mark all structs that are not POD as TREE_ADDRESSABLE.
Iain Buclaw committed -
GCC Administrator committed
-
- 18 Apr, 2020 7 commits
-
-
Iain Buclaw committed
-
The intended purpose of the option is both for targets that don't support phobos yet, and for gdc itself to support bootstrapping itself as a self-hosted D compiler. The libphobos testsuite has been updated to only add libphobos to the search paths if it's being built. A new D2 testsuite directive RUNNABLE_PHOBOS_TEST has also been patched in to disable some runnable tests that have phobos dependencies, which is a temporary measure until upstream DMD fixes or removes these tests entirely. gcc/testsuite/ChangeLog: * lib/gdc-utils.exp (gdc-convert-test): Add dg-skip-if for tests that depend on the phobos standard library. libphobos/ChangeLog: * configure: Regenerate. * configure.ac: Add --with-libphobos-druntime-only option and the conditional ENABLE_LIBDRUNTIME_ONLY. * configure.tgt: Define LIBDRUNTIME_ONLY. * src/Makefile.am: Add phobos sources if not ENABLE_LIBDRUNTIME_ONLY. * src/Makefile.in: Regenerate. * testsuite/testsuite_flags.in: Add phobos path if compiling phobos.
Iain Buclaw committed -
PR debug/94439 * regrename.c (check_new_reg_p): Ignore DEBUG_INSNs when walking the chain. PR debug/94439 * gcc.dg/torture/pr94439.c: New test.
Jeff Law committed -
The current check_effective_target_d_runtime procedure returns false if the target is built without any core runtime library for D being available (--disable-libphobos). This additional procedure is for targets where the core runtime library exists, but without the higher level standard library. gcc/ChangeLog: * doc/sourcebuild.texi (Effective-Target Keywords, Environment attributes): Document d_runtime_has_std_library. gcc/testsuite/ChangeLog: * gdc.dg/link.d: Use d_runtime_has_std_library effective target. * gdc.dg/runnable.d: Move phobos tests to... * gdc.dg/runnable2.d: ...here. New test. * lib/target-supports.exp (check_effective_target_d_runtime_has_std_library): New. libphobos/ChangeLog: * testsuite/libphobos.phobos/phobos.exp: Skip if effective target is not d_runtime_has_std_library. * testsuite/libphobos.phobos_shared/phobos_shared.exp: Likewise.
Iain Buclaw committed -
In the testcase below, during specialization of c<int>::d, we build two identical specializations of the parameter type b<decltype(e)::k> -- one when substituting into c<int>::d's TYPE_ARG_TYPES and another when substituting into c<int>::d's DECL_ARGUMENTS. We don't reuse the first specialization the second time around as a consequence of the fix for PR c++/56247 which made PARM_DECLs always compare different from one another during spec_hasher::equal. As a result, when looking up existing specializations of 'b', spec_hasher::equal considers the template argument decltype(e')::k to be different from decltype(e'')::k, where e' and e'' are the result of two calls to tsubst_copy on the PARM_DECL e. Since the two specializations are considered different due to the mentioned fix, their TYPE_CANONICAL points to themselves even though they are otherwise identical types, and this triggers an ICE in maybe_rebuild_function_decl_type when comparing the TYPE_ARG_TYPES of c<int>::d to its DECL_ARGUMENTS. This patch fixes this issue at the spec_hasher::equal level by ignoring the 'comparing_specializations' flag in cp_tree_equal whenever the DECL_CONTEXTs of the two parameters are identical. This seems to be a sufficient condition to be able to correctly compare PARM_DECLs structurally. (This also subsumes the CONSTRAINT_VAR_P check since constraint variables all have empty, and therefore identical, DECL_CONTEXTs.) gcc/cp/ChangeLog: PR c++/94632 * tree.c (cp_tree_equal) <case PARM_DECL>: Ignore comparing_specializations if the parameters' contexts are identical. gcc/testsuite/ChangeLog: PR c++/94632 * g++.dg/template/canon-type-14.C: New test.
Patrick Palka committed -
When updating an auto return type of an abbreviated function template in splice_late_return_type, we should also propagate PLACEHOLDER_TYPE_CONSTRAINTS (and cv-qualifiers) of the original auto node. gcc/cp/ChangeLog: PR c++/92187 * pt.c (splice_late_return_type): Propagate cv-qualifiers and PLACEHOLDER_TYPE_CONSTRAINTS from the original auto node to the new one. gcc/testsuite/ChangeLog: PR c++/92187 * g++.dg/concepts/abbrev5.C: New test. * g++.dg/concepts/abbrev6.C: New test.
Patrick Palka committed -
GCC Administrator committed
-
- 17 Apr, 2020 4 commits
-
-
Some more C++20 changes from P1614R2, "The Mothership has Landed". * include/std/chrono (duration, time_point): Define operator<=> and remove redundant operator!= for C++20. * testsuite/20_util/duration/comparison_operators/three_way.cc: New test. * testsuite/20_util/time_point/comparison_operators/three_way.cc: New test.
Jonathan Wakely committed -
In C++20 the rebind and const_reference members of std::allocator are gone, so this testsuite utility stopped working, causing ext/pb_ds/regression/priority_queue_rand_debug.cc to FAIL. * testsuite/util/native_type/native_priority_queue.hpp: Use allocator_traits to rebind allocator.
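Rebinding through std::allocator_traits works from C++11 onwards and does not rely on the allocator providing a nested `rebind` member. A minimal sketch, where the alias name `rebind_to` is hypothetical and not taken from native_priority_queue.hpp:

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

// Hypothetical helper alias: the allocator type obtained by rebinding Alloc
// to allocate objects of type U, computed via allocator_traits.
template <typename Alloc, typename U>
using rebind_to =
    typename std::allocator_traits<Alloc>::template rebind_alloc<U>;

// Compile-time check that rebinding std::allocator<int> to char gives
// std::allocator<char>.
static_assert(
    std::is_same<rebind_to<std::allocator<int>, char>,
                 std::allocator<char>>::value,
    "allocator_traits rebinds std::allocator<int> to std::allocator<char>");
```

allocator_traits falls back to rewriting the allocator's first template argument when no `rebind` member exists, which is why this form keeps working with std::allocator in C++20.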
Jonathan Wakely committed -
Some more C++20 changes from P1614R2, "The Mothership has Landed". This implements <=> for sequence containers (and the __normal_iterator and _Pointer_adapter class templates). * include/bits/forward_list.h (forward_list): Define operator<=> and remove redundant comparison operators for C++20. * include/bits/stl_bvector.h (vector<bool, Alloc>): Likewise. * include/bits/stl_deque.h (deque): Likewise. * include/bits/stl_iterator.h (__normal_iterator): Likewise. * include/bits/stl_list.h (list): Likewise. * include/bits/stl_vector.h (vector): Likewise. * include/debug/deque (__gnu_debug::deque): Likewise. * include/debug/forward_list (__gnu_debug::forward_list): Likewise. * include/debug/list (__gnu_debug::list): Likewise. * include/debug/safe_iterator.h (__gnu_debug::_Safe_iterator): Likewise. * include/debug/vector (__gnu_debug::vector): Likewise. * include/ext/pointer.h (__gnu_cxx::_Pointer_adapter): Define operator<=> for C++20. * testsuite/23_containers/deque/operators/cmp_c++20.cc: New test. * testsuite/23_containers/forward_list/cmp_c++20.cc: New test. * testsuite/23_containers/list/cmp_c++20.cc: New test. * testsuite/23_containers/vector/bool/cmp_c++20.cc: New test. * testsuite/23_containers/vector/cmp_c++20.cc: New test.
Jonathan Wakely committed -
This time instead of having a NOP copy insn that we can completely ignore and ultimately remove, we have a NOP set within a multi-set PARALLEL. It triggers the same failure when the source of such a set is a hard register for the same reasons as we've already noted in the BZ and patches-to-date. For prior cases we've been able to mark the insn as a nop set and ignore it for the rest of cse_insn, ultimately removing it. That's not really an option here as there are other sets that we have to preserve. We might be able to fix this instance by splitting the multi-set insn, but I'm not keen to introduce splitting into cse. Furthermore, the target may not be able to split the insn. So I considered this a non-starter. What I finally settled on was to use the existing do_not_record machinery to ignore the nop set within the parallel (and only that set within the parallel). One might argue that we should always ignore a REG_UNUSED set. But I rejected that idea -- we could have cse-able divmod insns where the first had a REG_UNUSED note for a destination, but the second did not. One might also argue that we could have a nop set without a REG_UNUSED in a multi-set parallel and thus we could trigger yet another insert_regs ICE at some point. I tend to think this is a possibility. If we see this happen, we'll have to revisit. PR rtl-optimization/90275 * cse.c (cse_insn): Avoid recording nop sets in multi-set parallels when the destination has a REG_UNUSED note.
Jeff Law committed
-