1. 20 Apr, 2020 12 commits
    • vect: Tweak vect_better_loop_vinfo_p handling of variable VFs · 77aecf92
      This patch fixes a large lmbench performance regression with
      128-bit SVE, compiled in length-agnostic mode.
      
      vect_better_loop_vinfo_p (new in GCC 10) tries to estimate whether
      a new loop_vinfo is cheaper than a previous one, with an in-built
      preference for the old one.  For variable VF it prefers the old
      loop_vinfo if it is cheaper for at least one VF.  However, we have
      no idea how likely that VF is in practice.
      
      Another extreme would be to do what most of the rest of the
      vectoriser does, and rely solely on the constant estimated VF.
      But as noted in the comment, this means that a one-unit cost
      difference would be enough to pick the new loop_vinfo,
      despite the target generally preferring the old loop_vinfo
      where possible.  The cost model just isn't accurate enough
      for that to produce good results as things stand: there might
      not be any practical benefit to the new loop_vinfo at the
      estimated VF, and it would be significantly worse for higher VFs.
      
      The patch instead goes for a hacky compromise: make sure that the new
      loop_vinfo is also no worse than the old loop_vinfo at double the
      estimated VF.  For all but trivial loops, this ensures that the
      new loop_vinfo is only chosen if it is better than the old one
      by a non-trivial amount at the estimated VF.  It also avoids
      putting too much faith in the VF estimate.
      
      I realise this isn't great, but it's supposed to be a conservative fix
      suitable for stage 4.  The only affected testcases are the ones for
      pr89007-*.c, where Advanced SIMD is indeed preferred for 128-bit SVE
      and is no worse for 256-bit SVE.
      
      Part of the problem here is that if the new loop_vinfo is better,
      we discard the old one and never consider using it even as an
      epilogue loop.  This means that if we choose Advanced SIMD over SVE,
      we're much more likely to have left-over scalar elements.
      
      Another is that the estimate provided by estimated_poly_value might have
      different probabilities attached.  E.g. when tuning for a particular core,
      the estimate is probably accurate, but when tuning for generic code,
      the estimate is more of a guess.  Relying solely on the estimate is
      probably correct for the former but not for the latter.
      
      Hopefully those are things that we could tackle in GCC 11.
      
      2020-04-20  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* tree-vect-loop.c (vect_better_loop_vinfo_p): If old_loop_vinfo
      	has a variable VF, prefer new_loop_vinfo if it is cheaper for the
      	estimated VF and is no worse at double the estimated VF.
      
      gcc/testsuite/
      	* gcc.target/aarch64/sve/cost_model_8.c: New test.
      	* gcc.target/aarch64/sve/cost_model_9.c: Likewise.
      	* gcc.target/aarch64/sve/pr89007-1.c: Add -msve-vector-bits=512.
      	* gcc.target/aarch64/sve/pr89007-2.c: Likewise.
      Richard Sandiford committed
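
The compromise described in the commit above can be sketched as follows. This is a simplified model, not GCC's implementation: `Candidate`, `vf_at`, `cost_lhs`, `cost_rhs` and `prefer_new` are hypothetical names, and each candidate is reduced to a single per-iteration cost plus a VF that either scales with the runtime vector length or stays fixed.

```cpp
#include <cassert>

// Hypothetical summary of one vectorisation candidate (not GCC's loop_vinfo).
struct Candidate {
  unsigned inside_cost;     // modelled cost of one vector iteration
  unsigned vf_at_estimate;  // vectorisation factor at the estimated vector length
  bool vf_scales;           // true if the VF grows with the runtime vector length
};

// VF when the runtime vector length is `scale` times the estimate.
unsigned vf_at(const Candidate& c, unsigned scale) {
  return c.vf_scales ? c.vf_at_estimate * scale : c.vf_at_estimate;
}

// Per-element cost of `a` versus `b`, cross-multiplied to avoid division:
// a.inside_cost / vf(a) < b.inside_cost / vf(b)
//   <=>  a.inside_cost * vf(b) < b.inside_cost * vf(a)
unsigned long cost_lhs(const Candidate& a, const Candidate& b, unsigned scale) {
  return static_cast<unsigned long>(a.inside_cost) * vf_at(b, scale);
}
unsigned long cost_rhs(const Candidate& a, const Candidate& b, unsigned scale) {
  return static_cast<unsigned long>(b.inside_cost) * vf_at(a, scale);
}

// Prefer the new candidate only if it is cheaper at the estimated VF
// and no worse at double the estimated VF.
bool prefer_new(const Candidate& old_c, const Candidate& new_c) {
  assert(old_c.vf_at_estimate > 0 && new_c.vf_at_estimate > 0);
  bool cheaper_at_estimate =
      cost_lhs(new_c, old_c, 1) < cost_rhs(new_c, old_c, 1);
  bool no_worse_at_double =
      cost_lhs(new_c, old_c, 2) <= cost_rhs(new_c, old_c, 2);
  return cheaper_at_estimate && no_worse_at_double;
}
```

With an old candidate `{10, 4, true}` (variable VF), a fixed-VF candidate `{9, 4, false}` is one unit cheaper at the estimate but loses at double the estimate, so it is rejected; `{4, 4, false}` is cheap enough to win at both points.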
    • aarch64: Fix vector builds used by SVE vec_init [PR94668] · 5da301cb
      This testcase triggered an ICE in rtx_vector_builder::step because
      we were trying to use a stepped representation for floating-point
      constants.  The underlying problem was that the arguments to
      rtx_vector_builder were the wrong way around, meaning that some
      variations were likely to be incorrectly encoded for integers
      (but probably as a silent failure).
      
      Also, aarch64_sve_expand_vector_init_handle_trailing_constants
      tries to extend the trailing constant elements to a full vector
      by following the "natural" pattern of the original vector, which
      should generally lead to nicer constants.  However, for the testcase,
      we'd then end up picking a variable for some elements.  Fixed by
      stubbing out all variable elements with zeros.
      
      That fix involved testing valid_for_const_vector_p.  For consistency,
      the patch uses the same test when finding trailing constants, instead
      of the previous aarch64_legitimate_constant_p.
      
      2020-04-20  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	PR target/94668
      	* config/aarch64/aarch64.c (aarch64_sve_expand_vector_init): Fix
      	order of arguments to rtx_vector_builder.
      	(aarch64_sve_expand_vector_init_handle_trailing_constants): Likewise.
      	When extending the trailing constants to a full vector, replace any
      	variables with zeros.
      
      gcc/testsuite/
      	PR target/94668
      	* gcc.target/aarch64/sve/pr94668.c: New test.
      Richard Sandiford committed
    • libstdc++: Avoid illegal argument to verbose in dg-test callback · 697b94cf
      If extra_tool_flags starts with a dash, an error like 'ERROR: verbose:
      illegal argument: -march=native -O2 -std=c++17' is printed. This is
      easily fixed by inserting a double dash before the variable.
      
      2020-04-20  Matthias Kretz  <kretz@kde.org>
      
      	* testsuite/lib/libstdc++.exp: Avoid illegal argument to verbose.
      Matthias Kretz committed
    • c++: tpl-tpl-parms are not canonicalizable types [pr94454] · a6f40023
      We treat tpl-tpl-parms as types.  They're not; bound-tpl-tpl-parms
      are.  We can get away with them being type-like.  Unfortunately we
      give the original level==orig_level case a canonical type, but the
      reduced cases of level<orig_level get structural equality.  This patch
      gives them structural type always.
      
      	* pt.c (canonical_type_parameter): Assert not a tpl-tpl-parm.
      	(process_template_parm): tpl-tpl-parms are structural.
      	(rewrite_template_parm): Propagate structuralness.
      Nathan Sidwell committed
    • c++: Expr pack expansion equality [pr94454] · 7fcb9343
      We were not comparing expression pack expansions correctly.  We could
      consider distinct expansions equal and create two apparently equal
      specializations that would sometimes collide.  cp_tree_operand_length
      says a pack has 1 operand (for mangling), whereas it actually has 3,
      only two of which are significant for equality.  We must special-case
      that in cp_tree_equal.  That new code matches the hasher and the
      type_pack_expansion case in structural_comp_types.
      
      	* tree.c (cp_tree_equal): [TEMPLATE_ID_EXPR, default] Refactor.
      	[EXPR_PACK_EXPANSION]: Add.
      Nathan Sidwell committed
    • c++: Template argument hashing [pr94454] · aa576f2a
      One of the problems hit by pr94454 was that the argument hasher was
      not skipping nodes that template_args_equal would.  Fixed by replacing
      the STRIP_NOPS invocation by a bespoke loop.  We also confuse the
      canonical type machinery by treating tpl-tpl-parms as types.  They're
      not; bound-tpl-tpl-parms are.  We can get away with them being
      type-like.  Unfortunately we give the original level==orig_level case
      a canonical type, but the reduced cases of level<orig_level get
      structural equality.  That breaks the hasher because we'll use
      TYPE_HASH (CANONICAL_TYPE ()) when we can. There's a note in
      tsubst[TEMPLATE_TEMPLATE_PARM] about why the reduced ones cannot have
      a canonical type. (I didn't feel like questioning that assertion at
      this point.)
      
      	* pt.c (iterative_hash_template_arg): Strip nodes as
      	template_args_equal does.
      	[ARGUMENT_PACK_SELECT, TREE_VEC, CONSTRUCTOR]: Refactor.
      	[node_class:TEMPLATE_TEMPLATE_PARM]: Hash by level & index.
      	[node_class:default]: Refactor.
      Nathan Sidwell committed
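
The invariant behind the hashing fix above is that a hasher must ignore exactly what the equality predicate ignores, otherwise equal keys can land in different buckets. A minimal sketch, using a hypothetical `Node` type rather than GCC trees; `strip_wrappers`, `args_equal` and `hash_arg` are illustrative names only:

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

// Hypothetical expression node: either a wrapper (a no-op conversion)
// around an operand, or a leaf carrying a value.
struct Node {
  bool is_wrapper;
  const Node* operand;  // used when is_wrapper is true
  int value;            // used for leaves
};

// Strip every wrapper layer.  Both equality and hashing go through
// this one helper, so they cannot disagree about what is skipped.
const Node* strip_wrappers(const Node* n) {
  assert(n != nullptr);
  while (n->is_wrapper)
    n = n->operand;
  return n;
}

// Equality looks through wrappers.
bool args_equal(const Node* a, const Node* b) {
  return strip_wrappers(a)->value == strip_wrappers(b)->value;
}

// The hash looks through the same wrappers, so equal nodes hash alike.
std::size_t hash_arg(const Node* n) {
  return std::hash<int>{}(strip_wrappers(n)->value);
}
```

A leaf and the same leaf wrapped twice compare equal and produce the same hash, which is the property a hash table keyed on such nodes relies on.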
    • Fix ICE on invalid calls_comdat_local flag [pr94582] · 48c82310
      	PR ipa/94582
      	* tree-inline.c (optimize_inline_calls): Recompute calls_comdat_local
      	flag.
      
      	* g++.dg/torture/pr94582.C: New test.
      Jan Hubicka committed
    • PR fortran/93364 - ICE in gfc_set_array_spec, at fortran/array.c:879 · aeb430aa
      Add missing check in gfc_set_array_spec for sum of rank and corank to not
      exceed GFC_MAX_DIMENSIONS.
      
      2020-04-20  Harald Anlauf  <anlauf@gmx.de>
      
      	PR fortran/93364
      	* array.c (gfc_set_array_spec): Check for sum of rank and corank
      	not exceeding GFC_MAX_DIMENSIONS.
      
      2020-04-20  Harald Anlauf  <anlauf@gmx.de>
      
      	PR fortran/93364
      	* gfortran.dg/pr93364.f90: New test.
      Harald Anlauf committed
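
The check added above amounts to a bound on the combined number of dimensions and codimensions. A minimal sketch, assuming a limit of 15 (the Fortran 2008 bound on rank plus corank); the commit does not state the value of GFC_MAX_DIMENSIONS, and `kMaxDimensions` and `array_spec_fits` are hypothetical names:

```cpp
#include <cassert>

// Assumed limit: the commit does not give the value of GFC_MAX_DIMENSIONS.
// 15 is used here because Fortran 2008 caps rank + corank at fifteen.
constexpr int kMaxDimensions = 15;

// True if an array spec with this rank and corank stays within the limit.
bool array_spec_fits(int rank, int corank) {
  assert(rank >= 0 && corank >= 0);
  return rank + corank <= kMaxDimensions;
}
```

Checking the sum matters because rank and corank can each be within the limit while their total is not.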
    • Fix spacing in symtab_node::dump_references. · 9b4d38df
      	* symtab.c (symtab_node::dump_references): Add space after
      	one entry.
      	(symtab_node::dump_referring): Likewise.
      Martin Liska committed
    • PR 91800 - reject Hollerith constants as type initializer. · 38acc41d
      2020-04-20  Steve Kargl  <kargl@gcc.gnu.org>
      	Thomas Koenig  <tkoenig@gcc.gnu.org>
      
      	PR fortran/91800
      	* decl.c (variable_decl): Reject Hollerith constants as type
      	initializer.
      
      2020-04-20  Steve Kargl  <kargl@gcc.gnu.org>
      	Thomas Koenig  <tkoenig@gcc.gnu.org>
      
      	PR fortran/91800
      	* gfortran.dg/hollerith_9.f90: New test.
      Steve Kargl committed
    • Fix declare copyout in libgomp.oacc-c++/declare-pr94120.C · 85d8c05a
      Testing on the host does not make sense for 'declare copyout' for
      a same-scope stack-allocated variable. Once the copyout is done,
      the variable is gone. Hence, test the variable on the device. This
      can be revisited after the OpenACC semantics have been fixed; but with
      this change, the test PASSes again with devices.
      
              PR middle-end/94120
              * testsuite/libgomp.oacc-c++/declare-pr94120.C: Fix 'declare copy(out)'
              test case.
      Tobias Burnus committed
    • Daily bump. · 79b9d18e
      GCC Administrator committed
  2. 19 Apr, 2020 17 commits
  3. 18 Apr, 2020 7 commits
    • libphobos: Add --with-libphobos-druntime-only option. · 261bd78d
      The intended purpose of the option is both for targets that don't
      support phobos yet, and for gdc itself to support bootstrapping itself
      as a self-hosted D compiler.
      
      The libphobos testsuite has been updated to only add libphobos to the
      search paths if it's being built.  A new D2 testsuite directive
      RUNNABLE_PHOBOS_TEST has also been patched in to disable some runnable
      tests that have phobos dependencies; this is a temporary measure
      until upstream DMD fixes or removes these tests entirely.
      
      gcc/testsuite/ChangeLog:
      
      	* lib/gdc-utils.exp (gdc-convert-test): Add dg-skip-if for tests that
      	depend on the phobos standard library.
      
      libphobos/ChangeLog:
      
      	* configure: Regenerate.
      	* configure.ac: Add --with-libphobos-druntime-only option and the
      	conditional ENABLE_LIBDRUNTIME_ONLY.
      	* configure.tgt: Define LIBDRUNTIME_ONLY.
      	* src/Makefile.am: Add phobos sources if not ENABLE_LIBDRUNTIME_ONLY.
      	* src/Makefile.in: Regenerate.
      	* testsuite/testsuite_flags.in: Add phobos path if compiling phobos.
      Iain Buclaw committed
    • Don't let DEBUG_INSNs change register renaming decisions · baf3b9b2
      	PR debug/94439
      	* regrename.c (check_new_reg_p): Ignore DEBUG_INSNs when walking
      	the chain.
      
      	PR debug/94439
      	* gcc.dg/torture/pr94439.c: New test.
      Jeff Law committed
    • testsuite: Disable gdc standard runtime tests if phobos is not built. · b57e1621
      The current check_effective_target_d_runtime procedure returns false if
      the target is built without any core runtime library for D being
      available (--disable-libphobos).  This additional procedure is for
      targets where the core runtime library exists, but without the higher
      level standard library.
      
      gcc/ChangeLog:
      
      	* doc/sourcebuild.texi (Effective-Target Keywords, Environment
      	attributes): Document d_runtime_has_std_library.
      
      gcc/testsuite/ChangeLog:
      
      	* gdc.dg/link.d: Use d_runtime_has_std_library effective target.
      	* gdc.dg/runnable.d: Move phobos tests to...
      	* gdc.dg/runnable2.d: ...here.  New test.
      	* lib/target-supports.exp
      	(check_effective_target_d_runtime_has_std_library): New.
      
      libphobos/ChangeLog:
      
      	* testsuite/libphobos.phobos/phobos.exp: Skip if effective target is
      	not d_runtime_has_std_library.
      	* testsuite/libphobos.phobos_shared/phobos_shared.exp: Likewise.
      Iain Buclaw committed
    • c++: spec_hasher::equal and PARM_DECLs [PR94632] · f83adb68
      In the testcase below, during specialization of c<int>::d, we build two
      identical specializations of the parameter type b<decltype(e)::k> -- one when
      substituting into c<int>::d's TYPE_ARG_TYPES and another when substituting into
      c<int>::d's DECL_ARGUMENTS.
      
      We don't reuse the first specialization the second time around as a consequence
      of the fix for PR c++/56247 which made PARM_DECLs always compare different from
      one another during spec_hasher::equal.  As a result, when looking up existing
      specializations of 'b', spec_hasher::equal considers the template argument
      decltype(e')::k to be different from decltype(e'')::k, where e' and e'' are the
      result of two calls to tsubst_copy on the PARM_DECL e.
      
      Since the two specializations are considered different due to the mentioned fix,
      their TYPE_CANONICAL points to themselves even though they are otherwise
      identical types, and this triggers an ICE in maybe_rebuild_function_decl_type
      when comparing the TYPE_ARG_TYPES of c<int>::d to its DECL_ARGUMENTS.
      
      This patch fixes this issue at the spec_hasher::equal level by ignoring the
      'comparing_specializations' flag in cp_tree_equal whenever the DECL_CONTEXTs of
      the two parameters are identical.  This seems to be a sufficient condition to be
      able to correctly compare PARM_DECLs structurally.  (This also subsumes the
      CONSTRAINT_VAR_P check since constraint variables all have empty, and therefore
      identical, DECL_CONTEXTs.)
      
      gcc/cp/ChangeLog:
      
      	PR c++/94632
      	* tree.c (cp_tree_equal) <case PARM_DECL>: Ignore
      	comparing_specializations if the parameters' contexts are identical.
      
      gcc/testsuite/ChangeLog:
      
      	PR c++/94632
      	* g++.dg/template/canon-type-14.C: New test.
      Patrick Palka committed
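
The rule described in the commit above can be modelled roughly as follows. This is a sketch under stated assumptions, not the code in cp_tree_equal: `ParmDecl`, `structurally_equal` and `parm_decls_equal` are hypothetical, and the structural comparison is reduced to a type identifier and a parameter index.

```cpp
#include <cassert>

// Hypothetical stand-in for a function parameter declaration.
struct ParmDecl {
  const void* context;  // owning function, compared by identity
  int type_id;          // stand-in for the parameter's type
  int index;            // position in the parameter list
};

// Structural comparison: same type and same position.
bool structurally_equal(const ParmDecl& a, const ParmDecl& b) {
  return a.type_id == b.type_id && a.index == b.index;
}

// When comparing specializations, parameters from different contexts
// never match.  Parameters that share a context fall through to the
// structural comparison, so two copies of the same parameter are equal.
bool parm_decls_equal(const ParmDecl& a, const ParmDecl& b,
                      bool comparing_specializations) {
  assert(a.context != nullptr && b.context != nullptr);
  if (comparing_specializations && a.context != b.context)
    return false;
  return structurally_equal(a, b);
}
```

In this model, two substituted copies of a parameter with the same owning function compare equal even in the strict mode, which is what lets the second lookup reuse the first specialization.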
    • c++: Abbreviated function template return type [PR92187] · e43b28ae
      When updating an auto return type of an abbreviated function template in
      splice_late_return_type, we should also propagate PLACEHOLDER_TYPE_CONSTRAINTS
      (and cv-qualifiers) of the original auto node.
      
      gcc/cp/ChangeLog:
      
      	PR c++/92187
      	* pt.c (splice_late_return_type): Propagate cv-qualifiers and
      	PLACEHOLDER_TYPE_CONSTRAINTS from the original auto node to the new one.
      
      gcc/testsuite/ChangeLog:
      
      	PR c++/92187
      	* g++.dg/concepts/abbrev5.C: New test.
      	* g++.dg/concepts/abbrev6.C: New test.
      Patrick Palka committed
    • Daily bump. · c5bac7d1
      GCC Administrator committed
  4. 17 Apr, 2020 4 commits
    • libstdc++: Add comparison operators to <chrono> types · 27c17177
      Some more C++20 changes from P1614R2, "The Mothership has Landed".
      
      	* include/std/chrono (duration, time_point): Define operator<=> and
      	remove redundant operator!= for C++20.
      	* testsuite/20_util/duration/comparison_operators/three_way.cc: New
      	test.
      	* testsuite/20_util/time_point/comparison_operators/three_way.cc: New
      	test.
      Jonathan Wakely committed
    • libstdc++: Fix testsuite utility's use of allocators · c9960294
      In C++20 the rebind and const_reference members of std::allocator are
      gone, so this testsuite utility stopped working, causing
      ext/pb_ds/regression/priority_queue_rand_debug.cc to FAIL.
      
      	* testsuite/util/native_type/native_priority_queue.hpp: Use
      	allocator_traits to rebind allocator.
      Jonathan Wakely committed
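
Rebinding through std::allocator_traits, as the commit above does, works in C++11 and later and does not depend on the allocator's own rebind member. A minimal sketch; the alias name `rebound_alloc` is hypothetical and is not taken from the testsuite utility:

```cpp
#include <memory>
#include <type_traits>

// Rebind an allocator type to a different value type via allocator_traits,
// instead of using the Alloc::rebind<U>::other member directly.
template <typename Alloc, typename U>
using rebound_alloc =
    typename std::allocator_traits<Alloc>::template rebind_alloc<U>;

// std::allocator<int> rebound to long is std::allocator<long>.
static_assert(
    std::is_same<rebound_alloc<std::allocator<int>, long>,
                 std::allocator<long>>::value,
    "allocator_traits rebinds std::allocator<int> to std::allocator<long>");
```

The same alias also works for user-defined allocators that provide no rebind member, since allocator_traits supplies a default for allocators of the form `A<T, Args...>`.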
    • libstdc++: Add comparison operators to sequence containers · bd2420f8
      Some more C++20 changes from P1614R2, "The Mothership has Landed".
      
      This implements <=> for sequence containers (and the __normal_iterator
      and _Pointer_adapter class templates).
      
      	* include/bits/forward_list.h (forward_list): Define operator<=> and
      	remove redundant comparison operators for C++20.
      	* include/bits/stl_bvector.h (vector<bool, Alloc>): Likewise.
      	* include/bits/stl_deque.h (deque): Likewise.
      	* include/bits/stl_iterator.h (__normal_iterator): Likewise.
      	* include/bits/stl_list.h (list): Likewise.
      	* include/bits/stl_vector.h (vector): Likewise.
      	* include/debug/deque (__gnu_debug::deque): Likewise.
      	* include/debug/forward_list (__gnu_debug::forward_list): Likewise.
      	* include/debug/list (__gnu_debug::list): Likewise.
      	* include/debug/safe_iterator.h (__gnu_debug::_Safe_iterator):
      	Likewise.
      	* include/debug/vector (__gnu_debug::vector): Likewise.
      	* include/ext/pointer.h (__gnu_cxx::_Pointer_adapter): Define
      	operator<=> for C++20.
      	* testsuite/23_containers/deque/operators/cmp_c++20.cc: New test.
      	* testsuite/23_containers/forward_list/cmp_c++20.cc: New test.
      	* testsuite/23_containers/list/cmp_c++20.cc: New test.
      	* testsuite/23_containers/vector/bool/cmp_c++20.cc: New test.
      	* testsuite/23_containers/vector/cmp_c++20.cc: New test.
      Jonathan Wakely committed
    • [committed] [PR rtl-optimization/90275] Another 90275 related cse.c fix · 3737ccc4
      This time instead of having a NOP copy insn that we can completely ignore and
      ultimately remove, we have a NOP set within a multi-set PARALLEL.  It triggers
      the same failure when the source of such a set is a hard register, for the same
      reasons as we've already noted in the BZ and patches-to-date.
      
      For prior cases we've been able to mark the insn as a nop set and ignore it for
      the rest of cse_insn, ultimately removing it.  That's not really an option here
      as there are other sets that we have to preserve.
      
      We might be able to fix this instance by splitting the multi-set insn, but I'm
      not keen to introduce splitting into cse.  Furthermore, the target may not be
      able to split the insn.  So I considered this a non-starter.
      
      What I finally settled on was to use the existing do_not_record machinery to
      ignore the nop set within the parallel (and only that set within the parallel).
      
      One might argue that we should always ignore a REG_UNUSED set.  But I rejected
      that idea -- we could have cse-able divmod insns where the first had a
      REG_UNUSED note for a destination, but the second did not.
      
      One might also argue that we could have a nop set without a REG_UNUSED in a
      multi-set parallel and thus we could trigger yet another insert_regs ICE at
      some point.  I tend to think this is a possibility.  If we see this happen,
      we'll have to revisit.
      
      	PR rtl-optimization/90275
      	* cse.c (cse_insn): Avoid recording nop sets in multi-set parallels
      	when the destination has a REG_UNUSED note.
      Jeff Law committed