1. 04 Apr, 2020 4 commits
    • ipa: Fix wrong code with failed propagation to builtin_constant_p [PR93940] · 2523d721
      This patch fixes wrong code on a testcase where the inliner predicts
      builtin_constant_p to be true but we fail to optimize its parameter to a
      constant because FRE is not run and the value is passed by an aggregate.
      
      This patch makes the inline predicates disable aggregate tracking
      when FRE is not going to be run and, similarly, value range tracking when
      VRP is not going to be run.
      
      This is just a partial fix.  Even with it we can arrange for FRE/VRP to fail
      and produce wrong code, unfortunately.
      
      I think for GCC11 I will need to implement the transformation in ipa-inline,
      but this is a bit hard to do: predicates only track that a value will be
      constant and do not track which constant it will be.
      
      Optimizing builtin_constant_p in a conditional is not going to do a good job
      when the value is used later in a place that expects it to be constant.
      This is a pre-existing problem that is not limited to inline tracking. For example,
      FRE may do the transform at one place but not in another due to alias oracle
      walking limits.
      
      So I am not sure what full fix would be :(
      
      gcc/ChangeLog:
      
      2020-04-04  Jan Hubicka  <hubicka@ucw.cz>
      
      	PR ipa/93940
      	* ipa-fnsummary.c (vrp_will_run_p): New function.
      	(fre_will_run_p): New function.
      	(evaluate_properties_for_edge): Use it.
      	* ipa-inline.c (can_inline_edge_by_limits_p): Do not inline
      	!optimize_debug to optimize_debug.
      
      gcc/testsuite/ChangeLog:
      
      2020-04-04  Jan Hubicka  <hubicka@ucw.cz>
      
      	* g++.dg/tree-ssa/pr93940.C: New test.
      Jan Hubicka committed
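To make the PR93940 scenario easier to picture, here is a minimal sketch (not the testcase from the PR) of a value that reaches `__builtin_constant_p` through an aggregate. `Params`, `slow_scale`, `scale` and `demo_scale` are hypothetical names. Both branches compute the same result, so this code is correct whether or not the builtin folds to true; the wrong code described above can arise when the inliner assumes the constant branch while later passes cannot prove the value constant.

```cpp
#include <cassert>

// Hypothetical aggregate used to pass the value by reference.
struct Params {
  int n;
};

// Generic path: works for any runtime value.
int slow_scale(int n) {
  int result = 0;
  for (int i = 0; i < 2; ++i)
    result += n;
  return result;
}

// __builtin_constant_p is a GCC/Clang builtin that evaluates to 1 only when
// the compiler can prove its argument is a compile-time constant.
inline int scale(const Params &p) {
  if (__builtin_constant_p(p.n))
    return p.n * 2;        // path taken when p.n is known constant
  return slow_scale(p.n);  // fallback for unknown values
}

// The value travels through an aggregate before reaching the builtin.
int demo_scale() {
  Params p = {21};
  int r = scale(p);
  assert(r == 42);  // holds on either path
  return r;
}
```

Keeping the two paths observably identical is the usual defensive pattern: the builtin then only selects between implementations and never changes the result.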
    • cselib: Don't consider SP_DERIVED_VALUE_P values as useless [PR94468] · bab8d962
      The following testcase ICEs, because at one point we see the
      SP_DERIVED_VALUE_P VALUE as useless (not PRESERVED_VALUE_P and no locs)
      and so expect it to be discarded as useless.  But, later on we
      are adding some new VALUE that is equivalent to it, and when adding
      the equivalency that that new VALUE is equal to this SP_DERIVED_VALUE_P,
      new_elt_loc_list has code for VALUE canonicalization and reverses addition
      if uid is smaller, and at that point a new loc is added to the
      SP_DERIVED_VALUE_P VALUE and it isn't discarded as useless anymore.
      Now, I think we don't want to discard the SP_DERIVED_VALUE_P values
      even if they have no locs, because they still have the special behaviour
      that they then force other new VALUEs to be canonicalized against them,
      which is what this patch implements.  I've not set PRESERVED_VALUE_P
      on the SP_DERIVED_VALUE_P at the creation time, because whether a VALUE
      is preserved or not is something that affects var-tracking decisions quite a
      lot and we shouldn't set it blindly on other VALUEs.
      
      Or, to avoid the repetitive code, should I introduce
      static bool
      cselib_useless_value_p (cselib_val *v)
      {
        return (v->locs == 0
      	  && !PRESERVED_VALUE_P (v->val_rtx)
      	  && !SP_DERIVED_VALUE_P (v->val_rtx));
      }
      predicate and use it in those 6 spots?
      
      2020-04-04  Jakub Jelinek  <jakub@redhat.com>
      
      	PR rtl-optimization/94468
      	* cselib.c (references_value_p): Formatting fix.
      	(cselib_useless_value_p): New function.
      	(discard_useless_locs, discard_useless_values,
      	cselib_invalidate_regno_val, cselib_invalidate_mem,
      	cselib_record_set): Use it instead of
      	v->locs == 0 && !PRESERVED_VALUE_P (v->val_rtx).
      
      	* g++.dg/opt/pr94468.C: New test.
      Jakub Jelinek committed
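The predicate proposed in the PR94468 message can be modelled outside of GCC. The sketch below is a simplified stand-in: `ToyValue` and its fields are hypothetical, whereas the real `cselib_val` keeps a location list and stores the flags on an rtx.

```cpp
#include <cassert>

// Simplified stand-in for cselib_val (hypothetical fields).
struct ToyValue {
  int num_locs;     // number of known locations
  bool preserved;   // models PRESERVED_VALUE_P
  bool sp_derived;  // models SP_DERIVED_VALUE_P
};

// A value may be discarded only if it has no locations and carries neither
// of the two flags.
bool toy_useless_value_p(const ToyValue &v) {
  return v.num_locs == 0 && !v.preserved && !v.sp_derived;
}

// Count how many values in an array would be discarded.
int count_useless(const ToyValue *vals, int n) {
  assert(n >= 0);
  int count = 0;
  for (int i = 0; i < n; ++i)
    if (toy_useless_value_p(vals[i]))
      ++count;
  return count;
}
```

With this shape, an sp-derived value without locations survives, which matches the behaviour the message argues for.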
    • c++: Fix further protected_set_expr_location related -fcompare-debug issues [PR94441] · aae5d08a
      My recent protected_set_expr_location changes work well when
      that function is called unconditionally, but as the testcase shows, the C++
      FE has a few spots that do:
        if (!EXPR_HAS_LOCATION (stmt))
          protected_set_expr_location (stmt, locus);
      or similar.  Now, if we have for -g0 stmt of some expression that can
      have location and has != UNKNOWN_LOCATION, while -g instead has
      a STATEMENT_LIST containing some DEBUG_BEGIN_STMTs + that expression with
      that location, we don't call protected_set_expr_location in the -g0 case,
      but do call it in the -g case, because on the STATEMENT_LIST
      !EXPR_HAS_LOCATION.
      The following patch introduces a helper function which digs up the single
      expression of a STATEMENT_LIST and uses that expression in the
      EXPR_HAS_LOCATION check (plus changes protected_set_expr_location to
      also use that helper).
      
      Or do we want a further wrapper, perhaps C++ FE only, that would do this
      protected_set_expr_location_if_unset (stmt, locus)?
      
      2020-04-04  Jakub Jelinek  <jakub@redhat.com>
      
      	PR debug/94441
      	* tree-iterator.h (expr_single): Declare.
      	* tree-iterator.c (expr_single): New function.
      	* tree.h (protected_set_expr_location_if_unset): Declare.
      	* tree.c (protected_set_expr_location): Use expr_single.
      	(protected_set_expr_location_if_unset): New function.
      
      	* parser.c (cp_parser_omp_for_loop): Use
      	protected_set_expr_location_if_unset.
      	* cp-gimplify.c (genericize_if_stmt, genericize_cp_loop): Likewise.
      
      	* g++.dg/opt/pr94441.C: New test.
      Jakub Jelinek committed
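The -g0 versus -g mismatch from PR94441 can be reproduced with a toy tree. This is only a sketch: `Node`, `Kind`, `toy_expr_single` and `toy_set_location_if_unset` are hypothetical stand-ins for the GCC tree machinery, and location 0 plays the role of UNKNOWN_LOCATION.

```cpp
#include <cassert>
#include <vector>

// Toy stand-ins for tree nodes; none of these names are GCC's.
enum class Kind { Expr, DebugBegin, StatementList };

const int kUnknownLocation = 0;

struct Node {
  Kind kind;
  int location;              // kUnknownLocation when unset
  std::vector<Node *> kids;  // used only by StatementList
};

// If n is a statement list holding exactly one non-debug statement, return
// that statement; for other lists return nullptr; otherwise return n itself.
Node *toy_expr_single(Node *n) {
  if (n == nullptr || n->kind != Kind::StatementList)
    return n;
  Node *found = nullptr;
  for (Node *kid : n->kids) {
    if (kid->kind == Kind::DebugBegin)
      continue;  // debug markers do not count
    if (found != nullptr)
      return nullptr;  // more than one real statement
    found = kid;
  }
  return toy_expr_single(found);
}

// Set the location only when the underlying expression has none yet.
void toy_set_location_if_unset(Node *n, int locus) {
  Node *expr = toy_expr_single(n);
  if (expr != nullptr && expr->kind == Kind::Expr
      && expr->location == kUnknownLocation)
    expr->location = locus;
}

// Build the -g0 shape (bare expression) or the -g shape (list with a debug
// marker plus the expression) and report the expression's final location.
int location_after_update(bool with_debug_markers, int initial, int locus) {
  assert(locus != kUnknownLocation);
  Node expr = {Kind::Expr, initial, {}};
  Node marker = {Kind::DebugBegin, kUnknownLocation, {}};
  Node list = {Kind::StatementList, kUnknownLocation, {&marker, &expr}};
  toy_set_location_if_unset(with_debug_markers ? &list : &expr, locus);
  return expr.location;
}
```

Because the check looks through the list, both shapes end with the same location, which is the property -fcompare-debug relies on.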
    • Daily bump. · 78e27649
      GCC Administrator committed
  2. 03 Apr, 2020 21 commits
    • Fix stdarg-3 regression on xstormy16 port · 7f26e60c
      	PR rtl-optimization/92264
      	* config/stormy16/stormy16.c (xstormy16_preferred_reload_class): Handle
      	reloading of auto-increment addressing modes.
      Jeff Law committed
    • openmp: Fix ICE on #pragma omp parallel master in template [PR94477] · 0c809f72
      The following testcase ICEs, because for parallel combined with some
      other construct we initialize the omp_parallel_combined_clauses pointer
      and expect the construct combined with it to clear it after it no longer
      needs it, but OMP_MASTER didn't do that.
      
      2020-04-04  Jakub Jelinek  <jakub@redhat.com>
      
      	PR c++/94477
      	* pt.c (tsubst_expr) <case OMP_MASTER>: Clear
      	omp_parallel_combined_clauses.
      
      	* g++.dg/gomp/pr94477.C: New test.
      Jakub Jelinek committed
    • libgcc: avoid mmap/munmap hooks in split-stack code on GNU/Linux · 710d54ed
      	* generic-morestack.c: On GNU/Linux use __mmap/__munmap rather
      	than mmap/munmap, to avoid hooks.
      Ian Lance Taylor committed
    • x86: Mark scratch operand in ssse3_pshufbv8qi3 as earlyclobber · bbcdf9bb
      commit 16ed2601
      Author: H.J. Lu <hongjiu.lu@intel.com>
      Date:   Wed May 15 15:26:19 2019 +0000
      
          i386: Emulate MMX pshufb with SSE version
      
      has
      
      +(define_insn_and_split "ssse3_pshufbv8qi3"
      +  [(set (match_operand:V8QI 0 "register_operand" "=y,x,Yv")
      +  (unspec:V8QI [(match_operand:V8QI 1 "register_operand" "0,0,Yv")
      +           (match_operand:V8QI 2 "register_mmxmem_operand" "ym,x,Yv")]
      +          UNSPEC_PSHUFB))
      +   (clobber (match_scratch:V4SI 3 "=X,x,Yv"))]
                                             ^^^  There is no earlyclobber.
      +  "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSSE3"
      +  "@
      +   pshufb\t{%2, %0|%0, %2}
      +   #
      +   #"
      +  "TARGET_MMX_WITH_SSE && reload_completed"
      +  [(set (match_dup 3) (match_dup 5))
      +   (set (match_dup 3)
      +  (and:V4SI (match_dup 3) (match_dup 2)))
      +   (set (match_dup 0)
      +  (unspec:V16QI [(match_dup 1) (match_dup 4)] UNSPEC_PSHUFB))]
      
      If input register operand 2 is dead after this insn, RA may choose it
      as scratch operand.  Since it isn't marked as earlyclobber, operand 2
      becomes unused after split and then it gets optimized out.  Marking the
      scratch operand as earlyclobber fixes the issue.
      
      gcc/
      
      	PR target/94467
      	* config/i386/sse.md (ssse3_pshufbv8qi3): Mark scratch operand
      	as earlyclobber.
      
      gcc/testsuite/
      
      	PR target/94467
      	* gcc.target/i386/pr94467-1.c: New test.
      	* gcc.target/i386/pr94467-2.c: Likewise.
      H.J. Lu committed
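Why the missing earlyclobber matters can be shown with a toy register file. This is a sketch, not the real split: `toy_split`, the mask constant and the final XOR are assumptions that only mirror the shape "write scratch, combine with operand 2, then use operand 1".

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Toy register file; indices play the role of hard registers.
using Regs = std::array<unsigned, 4>;

// Mirrors the order of the split sequence: the scratch is written first and
// only afterwards combined with operand 2.
unsigned toy_split(Regs regs, std::size_t op1, std::size_t op2,
                   std::size_t scratch) {
  assert(op1 != scratch);
  regs[scratch] = 0x0fu;             // load a constant into the scratch
  regs[scratch] &= regs[op2];        // and:V4SI scratch, operand 2
  return regs[op1] ^ regs[scratch];  // stand-in for the final shuffle
}
```

If the allocator gives the scratch the same register as operand 2, the first write destroys operand 2 before it is read: with registers `{0x100, 0x35, 0, 0}`, a separate scratch yields `0x105` while an aliased one yields `0x10f`. An earlyclobber constraint rules out that allocation.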
    • Fix va-arg-22.c at -O1 on m32r. · b949f8e2
      	PR rtl-optimization/92264
      	* config/m32r/m32r.c (m32r_output_block_move): Properly account for
      	post-increment addressing of source operands as well as residuals
      	when computing any adjustments to the input pointer.
      Jeff Law committed
    • i386: Fix up handling of OPTION_MASK_ISA_MMX builtins [PR94461] · a13d6ec8
      In https://gcc.gnu.org/ml/gcc-patches/2017-10/msg00576.html the builtin
      handling was changed so that OPTION_MASK_ISA_MMX | OPTION_MASK_ISA_SSE
      etc. in i386-builtin.def means we require both mmx and sse, not just one of
      those, and later on for other option combinations very similar rule has
      been clarified, with a few exceptions that ix86_expand_builtin lists
      (SSE | 3DNOW_A, SSE4_2 | CRC32 and FMA | FMA4 are one or the other).
      The above mentioned patch also added OPTION_MASK_ISA_MMX to a few insns
      that in the ISA documents are documented e.g. only requiring SSE2 or SSSE3
      etc. CPUID, but because those builtins take or return V2SI or similar
      MMX-ish arguments, we can't really support those builtins in functions that
      have MMX disabled.
      Now, during the TARGET_MMX_WITH_SSE changes,
      https://gcc.gnu.org/ml/gcc-patches/2019-02/msg01479.html
      and
      https://gcc.gnu.org/ml/gcc-patches/2019-05/msg01084.html
      actually changed this; it added | OPTION_MASK_ISA_SSE2 to builtins
      that were formerly OPTION_MASK_ISA_MMX only, but didn't touch the builtins
      that were already using OPTION_MASK_ISA_SSE2 | OPTION_MASK_ISA_MMX
      for something different (both options must be enabled).
      This causes e.g. ICE on the following testcase, because the builtins are
      now enabled even with just -mmmx -mno-sse2, even when they (those changed in
      2017) require SSE2.
      The following patch instead reverts the above two 2019-ish changes (except
      for header/testsuite changes), and instead treats OPTION_MASK_ISA_MMX
      requirement in bdesc/.isa specially, as being satisfied by either
      TARGET_MMX (no changes really needed for that), or by TARGET_MMX_WITH_SSE.
      This achieves what the two 2019-ish patches want to do, that the
      OPTION_MASK_ISA_MMX only builtins are enabled not just with -mmmx, but also
      with -m64 -msse2, and for the other builtins that require MMX and something
      else will either require -mmmx and that some other ISA, or -m64 -msse2 and
      that other ISA, but -mmmx will not enable builtins that need something more
      than OPTION_MASK_ISA_MMX only.
      The i386-builtins.c changes that aren't reversion of the two patches try to
      make sure that in .isa we still record OPTION_MASK_ISA_MMX for builtins that
      have that requirement, so that it is in the end only ix86_expand_builtin
      that decides if the builtin is ok or not and the rest of code just decides
      if it is the right time to declare the builtin already or if it should be
      deferred.
      
      2020-04-03  Jakub Jelinek  <jakub@redhat.com>
      
      	PR target/94461
      	* config/i386/i386-expand.c (ix86_expand_builtin): If
      	TARGET_MMX_WITH_SSE without TARGET_MMX and bisa contains
      	OPTION_MASK_ISA_MMX, clear OPTION_MASK_ISA_MMX and set
      	OPTION_MASK_ISA_SSE2 in bisa.  Revert 2019-05-17 and 2019-05-15
      	changes.
      	* config/i386/i386-builtins.c (def_builtin): If mask includes
      	OPTION_MASK_ISA_MMX and TARGET_MMX_WITH_SSE, consider it satisfied.
      	(ix86_add_new_builtins): For TARGET_64BIT, consider
      	OPTION_MASK_ISA_SSE2 enabled in isa as satisfying OPTION_MASK_ISA_MMX
      	requirement.
      	(ix86_init_tm_builtins): If TARGET_MMX_WITH_SSE consider
      	OPTION_MASK_ISA_MMX as satisfied.
      	(bdesc_tm): Revert 2019-05-15 changes.
      	(ix86_init_mmx_sse_builtins): Likewise.
      	* config/i386/i386-builtin.def: Likewise.
      
      	* gcc.target/i386/pr94461.c: New test.
      Jakub Jelinek committed
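The rule described for PR94461 can be sketched as a small mask check. The bit values and the `Target` struct are hypothetical (the real OPTION_MASK_ISA_* constants differ), and the sketch assumes, following the message, that TARGET_MMX_WITH_SSE means 64-bit with SSE2 enabled.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical bit values, for illustration only.
constexpr std::uint32_t ISA_MMX = 1u << 0;
constexpr std::uint32_t ISA_SSE2 = 1u << 1;
constexpr std::uint32_t ISA_SSSE3 = 1u << 2;

struct Target {
  std::uint32_t enabled;  // ISAs switched on for the current function
  bool is_64bit;
};

// Assumed meaning of TARGET_MMX_WITH_SSE: 64-bit mode with SSE2.
bool mmx_with_sse(const Target &t) {
  return t.is_64bit && (t.enabled & ISA_SSE2) != 0;
}

// All ISAs in bisa must be enabled, except that an MMX requirement may be
// satisfied by MMX-with-SSE, in which case it turns into an SSE2 requirement.
bool builtin_ok(std::uint32_t bisa, const Target &t) {
  assert(bisa != 0);
  if ((bisa & ISA_MMX) != 0 && (t.enabled & ISA_MMX) == 0 && mmx_with_sse(t)) {
    bisa &= ~ISA_MMX;
    bisa |= ISA_SSE2;
  }
  return (bisa & t.enabled) == bisa;
}
```

Under this model -mmmx -mno-sse2 no longer enables a builtin that needs both MMX and SSE2, while -m64 -msse2 still enables MMX-only builtins.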
    • c++: alias template and parameter packs (PR91966). · bcafd874
      In this testcase, when we do a pack expansion of count_better_mins<nums>,
      nums appears both in the definition of count_better_mins and as its template
      argument.  The intent is that we get an expansion over pairs of elements of
      the pack, i.e. less<2,2>, less<2,7>, less<7,2>, ....  But if we substitute
      into the definition of count_better_mins when parsing the template, we end
      up with sum<less<nums,nums>...>, which never gives us less<2,7>.  We could
      deal with this by somehow marking up the use of 'nums' as an argument for
      'num', but it's simpler to mark the alias as complex, so we need to
      instantiate it later with all its arguments rather than replace it early
      with its expansion.
      
      gcc/cp/ChangeLog
      2020-04-03  Jason Merrill  <jason@redhat.com>
      
      	PR c++/91966
      	* pt.c (complex_pack_expansion_r): New.
      	(complex_alias_template_p): Use it.
      Jason Merrill committed
    • i386: Fix vph{add,subs?}[wd] 256-bit AVX2 RTL patterns [PR94460] · b8020a5a
      The following testcase is miscompiled, because the AVX2 patterns don't
      describe correctly what the insn does.  E.g. vphaddd with %ymm* operands
      (the second pattern) instruction as per:
      https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=_mm256_hadd_epi32&expand=2941
      does { a0+a1, a2+a3, b0+b1, b2+b3, a4+a5, a6+a7, b4+b5, b6+b7 }
      but our RTL pattern did
           { a0+a1, a2+a3, a4+a5, a6+a7, b0+b1, b2+b3, b4+b5, b6+b7 }
      where the first and last 64 bits are the same and two middle 64 bits
      swapped.
      https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=_mm256_hadd_epi16&expand=2939
      similarly, insn does:
           { a0+a1, a2+a3, a4+a5, a6+a7, b0+b1, b2+b3, b4+b5, b6+b7,
             a8+a9, a10+a11, a12+a13, a14+a15, b8+b9, b10+b11, b12+b13, b14+b15 }
      but RTL pattern did
           { a0+a1, a2+a3, a4+a5, a6+a7, a8+a9, a10+a11, a12+a13, a14+a15,
             b0+b1, b2+b3, b4+b5, b6+b7, b8+b9, b10+b11, b12+b13, b14+b15 }
      again, first and last 64 bits are the same and the two middle 64 bits
      swapped.
      
      2020-04-03  Jakub Jelinek  <jakub@redhat.com>
      
      	PR target/94460
      	* config/i386/sse.md (avx2_ph<plusminus_mnemonic>wv16hi3,
      	avx2_ph<plusminus_mnemonic>dv8si3): Fix up RTL pattern to do
      	second half of first lane from first lane of second operand and
      	first half of second lane from second lane of first operand.
      
      	* gcc.target/i386/avx2-pr94460.c: New test.
      Jakub Jelinek committed
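The element order quoted for the 256-bit vphaddd in PR94460 can be written as a scalar model. This sketch uses plain arrays rather than intrinsics; `V8`, `toy_hadd_epi32` and `demo_hadd` are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using V8 = std::array<int, 8>;

// Scalar model of the 256-bit horizontal add on 32-bit elements: each
// 128-bit lane takes two pair sums from a followed by two pair sums from b.
V8 toy_hadd_epi32(const V8 &a, const V8 &b) {
  V8 r{};
  for (std::size_t lane = 0; lane < 2; ++lane) {
    const std::size_t base = lane * 4;
    r[base + 0] = a[base + 0] + a[base + 1];
    r[base + 1] = a[base + 2] + a[base + 3];
    r[base + 2] = b[base + 0] + b[base + 1];
    r[base + 3] = b[base + 2] + b[base + 3];
  }
  return r;
}

// Element 2 comes from b's low lane, not from a's high lane as the old RTL
// pattern claimed.
void demo_hadd() {
  const V8 a = {{0, 1, 2, 3, 4, 5, 6, 7}};
  const V8 b = {{10, 11, 12, 13, 14, 15, 16, 17}};
  const V8 r = toy_hadd_epi32(a, b);
  assert(r[2] == b[0] + b[1]);
  assert(r[4] == a[4] + a[5]);
}
```

The per-lane loop makes the source of the miscompilation visible: the instruction works on each 128-bit lane independently, so the two middle 64-bit chunks are not where a whole-vector reading would put them.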
    • c++: Add test for PR c++/93211 · 51ecad3c
      The fix for PR c++/90711 also fixed this PR.
      
      gcc/testsuite/ChangeLog:
      
      	PR c++/93211
      	PR c++/90711
      	* g++.dg/template/koenig11.C: New test.
      Patrick Palka committed
    • arm: MVE: Fix unintended change to tests · a87cd913
      When committing my last patch I accidentally removed -mfpu=auto from the following tests. This puts it back.
      
      testsuite/ChangeLog:
      2020-04-03  Andre Vieira  <andre.simoesdiasvieira@arm.com>
      
      	* gcc.target/arm/mve/intrinsics/mve_vector_float.c: Put -mfpu=auto back.
      	* gcc.target/arm/mve/intrinsics/mve_vector_float1.c: Likewise.
      	* gcc.target/arm/mve/intrinsics/mve_vector_float2.c: Likewise.
      	* gcc.target/arm/mve/intrinsics/mve_vector_int.c: Likewise.
      	* gcc.target/arm/mve/intrinsics/mve_vector_int1.c: Likewise.
      	* gcc.target/arm/mve/intrinsics/mve_vector_int2.c: Likewise.
      	* gcc.target/arm/mve/intrinsics/mve_vector_uint.c: Likewise.
      	* gcc.target/arm/mve/intrinsics/mve_vector_uint1.c: Likewise.
      	* gcc.target/arm/mve/intrinsics/mve_vector_uint2.c: Likewise.
      
      Testing Done:
      @IP: I assert this is almost no risk.
      
      Reviewed at http://pdtlreviewboard.cambridge.arm.com/r/12880/
      Andre Simoes Dias Vieira committed
    • arm: Do not process rest of MVE header file after unsupported error · 3b6e79ae
      This patch makes sure the rest of the header file is not parsed if MVE is not
      supported.  The user should not be including this file if MVE is not supported,
      nevertheless making sure it doesn't parse the rest of the header file will
      save the user from a huge error output that would be rather useless.
      
      gcc/ChangeLog:
      2020-04-03  Andre Vieira  <andre.simoesdiasvieira@arm.com>
      
              * config/arm/arm_mve.h: Condition the header file on __ARM_FEATURE_MVE.
      Andre Simoes Dias Vieira committed
    • AArch64: Fix options canonicalization for assembler · 53161358
      It is currently impossible to use fp16 on any architecture higher than Armv8.3-a
      due to a bug in options canonicalization.  This bug results in the fp16 flag not
      being emitted in the assembly when it should have been.
      
      This is caused by a complicated architectural requirement at Armv8.4-a.  On
      Armv8.2-a and Armv8.3-a fp16fml is an optional extension and turning it on turns
      on both fp and fp16.  However starting with Armv8.4-a fp16fml is mandatory if
      fp16 is available, otherwise it's optional.
      
      In short this means that to enable fp16fml the smallest option that needs to
      be passed to the assembler is Armv8.4-a+fp16.
      
      The fix in this patch takes into account that an option may be on by default in
      an architecture, but that not all the bits required to use it are on by default
      in an architecture.  In such cases the difference between the two is still
      emitted to the assembler.
      
      gcc/ChangeLog:
      
      	PR target/94396
      	* common/config/aarch64/aarch64-common.c
      	(aarch64_get_extension_string_for_isa_flags): Handle default flags.
      
      gcc/testsuite/ChangeLog:
      
      	PR target/94396
      	* gcc.target/aarch64/options_set_11.c: New test.
      	* gcc.target/aarch64/options_set_12.c: New test.
      	* gcc.target/aarch64/options_set_13.c: New test.
      	* gcc.target/aarch64/options_set_14.c: New test.
      	* gcc.target/aarch64/options_set_15.c: New test.
      	* gcc.target/aarch64/options_set_16.c: New test.
      	* gcc.target/aarch64/options_set_17.c: New test.
      	* gcc.target/aarch64/options_set_18.c: New test.
      	* gcc.target/aarch64/options_set_19.c: New test.
      	* gcc.target/aarch64/options_set_20.c: New test.
      	* gcc.target/aarch64/options_set_21.c: New test.
      	* gcc.target/aarch64/options_set_22.c: New test.
      	* gcc.target/aarch64/options_set_23.c: New test.
      	* gcc.target/aarch64/options_set_24.c: New test.
      	* gcc.target/aarch64/options_set_25.c: New test.
      	* gcc.target/aarch64/options_set_26.c: New test.
      Tamar Christina committed
    • middle-end/94465 - handle released SSA names in array_ref_low_bound · ef663105
      array_ref_low_bound is used in dumping ARRAY_REFs which in turn
      is called when basic blocks are deleted.  cleanup_control_flow_pre
      consciously decides to remove unreachable basic-blocks in arbitrary
      order so the following makes array_ref_low_bound forgiving in the
      case the SSA name with the index definition has been released
      already.
      
      2020-04-03  Richard Biener  <rguenther@suse.de>
      
      	PR middle-end/94465
      	* tree.c (array_ref_low_bound): Deal with released SSA names
      	in index position.
      Richard Biener committed
    • Improve svn-rev to search for pattern at line beginning. · fa4aab7f
      	* gcc-git-customization.sh: Search for the pattern
      	at line beginning only.
      Martin Liska committed
    • amdgcn: Support unordered floating-point comparison operators · 1dff18a1
      2020-04-03  Kwok Cheung Yeung  <kcy@codesourcery.com>
      
      	gcc/
      	* config/gcn/gcn.c (print_operand): Handle unordered comparison
      	operators.
      	* config/gcn/predicates.md (gcn_fp_compare_operator): Add unordered
      	comparison operators.
      Kwok Cheung Yeung committed
    • libstdc++: Fix std::to_address for debug iterators (PR 93960) · 24fe8c8e
      It should be valid to use std::to_address on a past-the-end iterator,
      but the debug mode iterators do a check for dereferenceable in their
      operator->(). That check is generally useful, so rather than remove it
      this changes std::__to_address to identify a debug mode iterator and
      use base().operator->() to skip the check.
      
      	PR libstdc++/93960
      	* include/bits/ptr_traits.h (__to_address): Add special case for debug
      	iterators, to avoid dereferenceable check.
      	* testsuite/20_util/to_address/1_neg.cc: Adjust dg-error line number.
      	* testsuite/20_util/to_address/debug.cc: New test.
      Jonathan Wakely committed
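The idea behind the PR 93960 fix can be sketched with a toy checked iterator. Everything here is hypothetical (`CheckedIter`, `toy_to_address`); the real change is in `std::__to_address` and the libstdc++ debug iterators, and it calls `base().operator->()` rather than returning a raw pointer directly.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical checked iterator over an int buffer, loosely modelled on a
// debug-mode iterator: operator-> refuses a past-the-end position.
class CheckedIter {
 public:
  CheckedIter(int *pos, int *end) : pos_(pos), end_(end) {
    assert(pos <= end);
  }
  int *operator->() const {
    if (pos_ == end_)
      throw std::logic_error("iterator is not dereferenceable");
    return pos_;
  }
  int *base() const { return pos_; }  // unchecked underlying position

 private:
  int *pos_;
  int *end_;
};

// Raw pointers are already addresses.
inline int *toy_to_address(int *p) { return p; }

// For the checked iterator, go through base() so that the dereferenceable
// check in operator-> is skipped.
inline int *toy_to_address(const CheckedIter &it) {
  return toy_to_address(it.base());
}
```

Taking the address of a past-the-end position then works, while `operator->` keeps its check for ordinary use.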
    • Revert "[nvptx, libgomp] Update pr85381-{2,4}.c test-cases" [PR89713, PR94392] · 2b1e849b
      In response to PR94392 commit 75efe9cb
      "c/94392 - only enable -ffinite-loops for C++", this reverts PR89713
      commit 00908992, as apparently now again
      "empty oacc loops are" no longer "removed before expand".
      
      	libgomp/
      	PR tree-optimization/89713
      	PR c/94392
      	* testsuite/libgomp.oacc-c-c++-common/pr85381-2.c: Again expect
      	'bar.sync'.
      	* testsuite/libgomp.oacc-c-c++-common/pr85381-4.c: Likewise.
      Thomas Schwinge committed
    • Fix PR94443 with gsi_insert_seq_before [PR94443] · 4441eced
      This patch is to fix the stupid mistake by using
      gsi_insert_seq_before instead of gsi_insert_before.
      
      BTW, the regression testing on one x86_64 machine from CFarm is
      unable to reveal it (I guess due to native arch sandybridge?), so I
      specified additional option -march=znver2 and verified the coverage.
      
      Bootstrapped/regtested on powerpc64le-linux-gnu (P9) and
      x86_64-pc-linux-gnu, also verified the fail cases in related PRs.
      
      2020-04-03  Kewen Lin  <linkw@gcc.gnu.org>
      
      gcc/
          PR tree-optimization/94443
          * tree-vect-loop.c (vectorizable_live_operation): Use
          gsi_insert_seq_before to replace gsi_insert_before.
      
      gcc/testsuite/
          PR tree-optimization/94443
          * gcc.dg/vect/pr94443.c: New test.
      Kewen Lin committed
    • ICF: compare type attributes for gimple_call_fntypes. · 55a73802
      	PR ipa/94445
      	* ipa-icf-gimple.c (func_checker::compare_gimple_call):
      	  Compare type attributes for gimple_call_fntypes.
      Martin Liska committed
    • S/390 zTPF: Handle skip trace addresses when unwinding · b749b5ec
      Check for and handle new skip trace addresses when unwinding on zTPF.
      
      libgcc/ChangeLog:
      
      2020-04-03  Jim Johnston  <jjohnst@us.ibm.com>
      
      	* config/s390/tpf-unwind.h (MIN_PATRANGE, MAX_PATRANGE)
      	(TPFRA_OFFSET): Macros removed.
      	(CP_CNF, cinfc_fast, CINFC_CMRESET, CINTFC_CMCENBKST)
      	(CINTFC_CMCENBKED, ICST_CRET, ICST_SRET, LOWCORE_PAGE3_ADDR)
      	(PG3_SKIPPING_OFFSET): New macros.
      	(__isPATrange): Use cinfc_fast for the check.
      	(__isSkipResetAddr): New function.
      	(s390_fallback_frame_state): Check for skip trace addresses. Use
      	either ICST_CRET or ICST_SRET to calculate return address
      	location.
      	(__tpf_eh_return): Handle skip trace addresses.
      Jim Johnston committed
    • Daily bump. · 535ce76a
      GCC Administrator committed
  3. 02 Apr, 2020 15 commits
    • Fix some comment typos in alias.c. · 63f56527
      2020-04-02  Sandra Loosemore  <sandra@codesourcery.com>
      
      	* alias.c (get_alias_set): Fix comment typos.
      Sandra Loosemore committed
    • Fix check_effective_target_sigsetjmp for glibc targets. · a950bb6e
      2020-04-02  Sandra Loosemore  <sandra@codesourcery.com>
      
      	gcc/testsuite/
      	* lib/target-supports.exp (check_effective_target_sigsetjmp): Test
      	for __sigsetjmp as well as sigsetjmp.
      Sandra Loosemore committed
    • Fix fortran/85982 ICE in resolve_component. · 0cd74f35
      2020-04-01  Fritz Reese  <foreese@gcc.gnu.org>
      
      	PR fortran/85982
      	* fortran/decl.c (match_attr_spec): Lump COMP_STRUCTURE/COMP_MAP into
      	attribute checking used by TYPE.
      
      2020-04-01  Fritz Reese  <foreese@gcc.gnu.org>
      
      	PR fortran/85982
      	* gfortran.dg/dec_structure_28.f90: New test.
      Fritz Reese committed
    • [Fortran] Resolve formal args before checking DTIO · 3ab216a4
              * gfortran.h (gfc_resolve_formal_arglist): Add prototype.
              * interface.c (check_dtio_interface1): Call it.
              * resolve.c (gfc_resolve_formal_arglist): Renamed from
              resolve_formal_arglist, removed static.
              (find_arglists, resolve_types): Update calls.
      
              * gfortran.dg/dtio_35.f90: New.
      Tobias Burnus committed
    • Prevent IPA-SRA from creating calls to local comdats (PR 92676) · b90061c6
      Since r278669 (fix for PR ipa/91956), IPA-SRA makes sure that the clone
      it creates is put into the same same_comdat as the original cgraph_node,
      so that it can call private comdats (such as the ipa-split bits of a
      comdat that is private).
      
      However, that means that if there is a non-comdat caller of a public
      comdat that is modified by IPA-SRA, it now finds itself calling a
      private comdat, which the call graph verifier does not like (and for a
      reason: in theory it can disappear, and since it is private it would not
      be available from other CUs).
      
      The patch fixes this by performing the fix for PR 91956 only when the
      node in question actually calls a local comdat and when it does, also
      making sure that no callers come from a different same_comdat (disabling
      IPA-SRA if both conditions are true), so that it plays by the rules in
      both modes, does not violate the private comdat calling rule and at the
      same time does not disable the transformation unnecessarily.
      
      The patch also fixes up the calls_comdat_local of callers of the
      modified node, despite that not triggering any known issues.
      
      2020-04-02  Martin Jambor  <mjambor@suse.cz>
      
      	PR ipa/92676
      	* ipa-sra.c (struct caller_issues): New fields candidate and
      	call_from_outside_comdat.
      	(check_for_caller_issues): Check for calls from outside of
      	candidate's same_comdat_group.
      	(check_all_callers_for_issues): Set up issues.candidate, check result
      	of the new check.
      	(mark_callers_calls_comdat_local): New function.
      	(process_isra_node_results): Set calls_comdat_local of callers if
      	appropriate.
      Martin Jambor committed
    • c/94392 - only enable -ffinite-loops for C++ · 75efe9cb
      This does away with enabling -ffinite-loops at -O2+ for all languages
      and instead enables it selectively for C++ only.
      
      It also makes -ffinite-loops loop-private at CFG construction time
      fixing correctness issues with inlining.
      
      2020-04-02  Richard Biener  <rguenther@suse.de>
      
      	PR c/94392
      	* c-opts.c (c_common_post_options): Enable -ffinite-loops
      	for -O2 and C++11 or newer.
      
      	* common.opt (ffinite-loops): Initialize to zero.
      	* opts.c (default_options_table): Remove OPT_ffinite_loops
      	entry.
      	* cfgloop.h (loop::finite_p): New member.
      	* cfgloopmanip.c (copy_loop_info): Copy finite_p.
      	* ipa-icf-gimple.c (func_checker::compare_loops): Compare
      	finite_p.
      	* lto-streamer-in.c (input_cfg): Stream finite_p.
      	* lto-streamer-out.c (output_cfg): Likewise.
      	* tree-cfg.c (replace_loop_annotate): Initialize finite_p
      	from flag_finite_loops at CFG build time.
      	* tree-ssa-loop-niter.c (finite_loop_p): Check the loop's
      	finite_p flag instead of flag_finite_loops.
      	* doc/invoke.texi (ffinite-loops): Adjust documentation of
      	default setting.
      
      	* gcc.dg/torture/pr94392.c: New testcase.
      Richard Biener committed
    • debug/94450 - remove DW_TAG_imported_unit generated in LTRANS units · 54af9576
      This removes the DW_TAG_imported_unit we generate for each referenced
      early debug unit in LTRANS units.  They do more harm than good, and
      the semantics can be read in a way that makes them even wrong.
      
      2020-04-02  Richard Biener  <rguenther@suse.de>
      
      	PR debug/94450
      	* dwarf2out.c (dwarf2out_early_finish): Remove code emitting
      	DW_TAG_imported_unit.
      Richard Biener committed
    • doc: RISC-V: Update binutils requirement to 2.30 · 879bc686
      Complement commit bfe78b08 ("RISC-V: Using fmv.x.w/fmv.w.x rather
      than fmv.x.s/fmv.s.x") and document a binutils 2.30 requirement in the
      installation manual, matching the addition of fmv.x.w/fmv.w.x mnemonics
      to GAS.
      
      	gcc/
      	* doc/install.texi (Specific) <riscv32-*-elf, riscv32-*-linux>
      	<riscv64-*-elf, riscv64-*-linux>: Update binutils requirement to
      	2.30.
      Maciej W. Rozycki committed
    • Fix PR94401 by considering reverse overrun · 81ce375d
      The commit r10-7415 brings scalar type consideration
      to eliminate epilogue peeling for gaps, but it exposed
      one problem that the current handling doesn't consider
      the memory access type VMAT_CONTIGUOUS_REVERSE, for
      which the overrun happens on low address side.  This
      patch is to make the code take care of it by updating
      the offset and construction element order accordingly.
      
      Bootstrapped/regtested on powerpc64le-linux-gnu P8
      and aarch64-linux-gnu.
      
      2020-04-02  Kewen Lin  <linkw@gcc.gnu.org>
      
      gcc/ChangeLog
      
          PR tree-optimization/94401
          * tree-vect-loop.c (vectorizable_load): Handle VMAT_CONTIGUOUS_REVERSE
          access type when loading halves of vector to avoid peeling for gaps.
      Kewen Lin committed
    • Fix up -Wliteral-suffix warning on mti-linux.h · 68cbee9b
      I've noticed while trying to reproduce PR92989 the following warning:
      In file included from ./tm.h:42,
                       from ../../gcc/backend.h:28,
                       from ../../gcc/lra-assigns.c:80:
      ../../gcc/config/mips/mti-linux.h:31:5: warning: invalid suffix on literal; C++11 requires a space between literal and string macro [-Wliteral-suffix]
           "/%{mmicromips:micro}mips%{mel|EL:el}-"MIPS_SYSVERSION_SPEC  \
           ^
      This fixes it, string concatenation works just fine even with whitespace
      in between.
      
      2020-04-02  Jakub Jelinek  <jakub@redhat.com>
      
      	* config/mips/mti-linux.h (SYSROOT_SUFFIX_SPEC): Add a space in
      	between a string literal and MIPS_SYSVERSION_SPEC macro.
      Jakub Jelinek committed
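The pattern behind the -Wliteral-suffix warning is easy to reproduce. In this sketch `TOY_SYSVERSION_SPEC`, `toy_suffix` and `toy_suffix_length` are hypothetical; only the spacing rule matters.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

#define TOY_SYSVERSION_SPEC "r2"

// Since C++11, an identifier glued to a string literal is lexed as a
// user-defined literal suffix, so "/mips-"TOY_SYSVERSION_SPEC would not
// expand the macro.  With a space in between, the macro expands and the
// adjacent string literals are concatenated as usual.
static const char toy_suffix[] = "/mips-" TOY_SYSVERSION_SPEC;

std::size_t toy_suffix_length() {
  assert(toy_suffix[sizeof(toy_suffix) - 1] == '\0');
  return std::strlen(toy_suffix);
}
```

The resulting array holds the eight characters of "/mips-r2" plus the terminator.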
    • sra/doc: Document param sra-max-propagations · d4ed2cd1
      I forgot to document the new param in invoke.texi, does the text below
      look OK?
      
      Tested with make info and make pdf.
      
      Thanks,
      
      Martin
      
      2020-04-02  Martin Jambor  <mjambor@suse.cz>
      
      	* doc/invoke.texi (Optimize Options): Document sra-max-propagations.
      Martin Jambor committed
    • params: Decrease -param=max-find-base-term-values= default [PR92264] · 86c92411
      For the PR in question, my proposal would be to also lower
      -param=max-find-base-term-values=
      default from 2000 to 200 after this, at least in the above 4
      bootstraps/regtests there is nothing that would ever result in
      find_base_term returning non-NULL with more than 200 VALUEs being processed.
      
      2020-04-02  Jakub Jelinek  <jakub@redhat.com>
      
      	PR rtl-optimization/92264
      	* params.opt (-param=max-find-base-term-values=): Decrease default
      	from 2000 to 200.
      Jakub Jelinek committed
    • cselib: Reuse VALUEs on sp adjustments [PR92264] · 2c0fa3ec
      As discussed in the PR, if !ACCUMULATE_OUTGOING_ARGS on large functions we
      can have hundreds of thousands of stack pointer adjustments and cselib
      creates a new VALUE after each sp adjustment, which form extremely deep
      VALUE chains, which is very harmful e.g. for find_base_term.
      E.g. if we have
      sp -= 4
      sp -= 4
      sp += 4
      sp += 4
      sp -= 4
      sp += 4
      that means 7 VALUEs, one for the sp at the beginning (val1), then val2 = val1 -
      4, then val3 = val2 - 4, then val4 = val3 + 4, then val5 = val4 + 4, then
      val6 = val5 - 4, then val7 = val6 + 4.
      This patch tweaks cselib, so that it is smarter about sp adjustments.
      When cselib_lookup (stack_pointer_rtx, Pmode, 1, VOIDmode) is called and we know
      nothing about sp yet (this happens at the start of the function, for
      non-var-tracking also after cselib_reset_table and for var-tracking after
      processing fp_setter insn where we forget about former sp values because
      that is now hfp related while everything after it is sp related), we
      look it up normally, but in addition to what we have been doing before
      we mark the VALUE as SP_DERIVED_VALUE_P.  Further lookups of sp + offset
      are then special cased, so that it is canonicalized to that
      SP_DERIVED_VALUE_P VALUE + CONST_INT (if possible).  So, for the above,
      we get val1 with SP_DERIVED_VALUE_P set, then val2 = val1 - 4, val3 = val1 -
      8 (note, no longer val2 - 4!), then we get val2 again, val1 again, val2
      again, val1 again.
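
      The effect of the canonicalization can be illustrated with a toy model.
      This is only a sketch and does not use cselib's real data structures:
      the function names below are hypothetical, and a VALUE is modelled as
      nothing more than a distinct offset from the initial stack pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Toy model only: this is not GCC's cselib API.

// Old behaviour: every sp adjustment creates a fresh VALUE chained to the
// previous one, so the count grows with the number of adjustments.
std::size_t count_values_chained(const std::vector<int> &adjustments)
{
  return 1 + adjustments.size();  // val1 for the initial sp, plus one each
}

// New behaviour: each sp value is canonicalized to (base VALUE + offset),
// so adjustments that arrive at the same offset share one VALUE.
std::size_t count_values_canonical(const std::vector<int> &adjustments)
{
  std::set<long> offsets_seen = {0L};  // offset 0 models the SP_DERIVED_VALUE_P base
  long offset = 0;
  for (int adj : adjustments)
    {
      offset += adj;
      offsets_seen.insert(offset);
    }
  return offsets_seen.size();
}

// The adjustment sequence from the example in the commit message.
std::vector<int> example_adjustments()
{
  return {-4, -4, +4, +4, -4, +4};
}

// Checks the counts stated in the text: 7 chained VALUEs versus
// val1, val1 - 4 and val1 - 8 after canonicalization.
void check_example()
{
  assert(count_values_chained(example_adjustments()) == 7);
  assert(count_values_canonical(example_adjustments()) == 3);
}
```

      In this model the chained count grows linearly with the number of
      adjustments, while the canonical count is bounded by the number of
      distinct stack offsets.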
      For find_base_term statistics where visited_vals.length () > 100, gathered
      during a combined x86_64-linux and i686-linux bootstrap+regtest cycle,
      without the patch I see:
      			find_base_term > 100
      			returning NULL	returning non-NULL
      32-bit compilations	4229178		407
      64-bit compilations	217523		0
      with largest visited_vals.length () when returning non-NULL being 206.
      With the patch the same numbers are:
      32-bit compilations	1249588		135
      64-bit compilations	3510		0
      with largest visited_vals.length () when returning non-NULL being 173.
      This shows significant reduction of the deep VALUE chains.
      On powerpc64{,le}-linux, these stats didn't change at all, we have
      			1008		0
      for all of -m32, -m64 and little-endian -m64, just the
      gcc.dg/pr85180.c and gcc.dg/pr87985.c testcases which are unrelated to sp.
      
      My earlier version of the patch, which contained just the rtl.h and cselib.c
      changes, regressed some tests:
      gcc.dg/guality/{pr36728-{1,3},pr68860-{1,2}}.c
      gcc.target/i386/{pr88416,sse-{13,23,24,25,26}}.c
      The problem with the former tests was worse debug info: with -m32,
      where arg7 was passed in a stack slot, we thought a push later on might have
      invalidated it, when it couldn't.  This is something I've solved with the
      var-tracking.c (vt_initialize) changes.  In those problematic functions, we
      create a cfa_base VALUE (argp) and want to record that at the start of
      the function the argp VALUE is sp + off and also record that current sp
      VALUE is argp's VALUE - off.  The second permanent equivalence didn't make
      it after the patch though, because cselib_add_permanent_equiv will
      cselib_lookup the value of the expression it wants to add as the equivalence
      and if it is the same VALUE as we are calling it on, it doesn't do anything;
      and due to the cselib changes for sp based accesses that is exactly what
      happened.  By reversing the order of the cselib_add_permanent_equiv calls we
      get both equivalences though and thus are able to canonicalize the sp based
      accesses in var-tracking to the cfa_base value + offset.
      The i386 FAILs were all ICEs, where we had pushf instruction pushing flags
      and then pop pseudo reading that value again.  With the cselib changes,
      cselib during RTL DSE is able to see through the sp adjustment and wanted
      to replace_read what was done by pushf, by moving the flags register into a
      pseudo and replacing the memory read in the pop with that pseudo.  That is
      wrong for two reasons: one is that the backend doesn't have an instruction
      to move the flags hard register into some other register, but replace_read
      has been validating just the mem -> pseudo replacement and not the insns
      emitted by copy_to_mode_reg.  And the second issue is that it is obviously
      wrong to replace a stack pop which contains stack post-increment by a copy
      of pseudo into destination.  dse.c has some code to handle RTX_AUTOINC, but
      only uses it when actually removing stores and only when there is REG_INC
      note (stack RTX_AUTOINC does not have those), in check_for_inc_dec* where
      it emits the reg adjustment(s) before the insn that is going to be deleted.
      replace_read doesn't remove the insn, so if it e.g. contained REG_INC note,
      it would be kept there and we might have the RTX_AUTOINC not just in *loc,
      but other spots.
      So, the dse.c changes try to validate the added insns and punt on all
      RTX_AUTOINC in *loc.  Furthermore, it seems that with the cselib.c changes
      on the gfortran.dg/pr87360.f90 and gcc.target/i386/pr88416.c testcases
      check_for_inc_dec{,_1} happily throws stack pointer autoinc on the floor,
      which is also wrong.  While we could perhaps do the for_each_inc_dec
      call regardless of whether we have REG_INC note or not, we aren't prepared
      to handle e.g. REG_ARGS_SIZE distribution and thus could end up with wrong
      unwind info or ICEs during dwarf2cfi.c.  So the patch also punts on those,
      after all, if we'd in theory managed to try to optimize such pushes before,
      we'd create wrong-code.
      
      On x86_64-linux and i686-linux, the patch has some minor debug info coverage
      differences, but it doesn't appear very significant to me.
      https://github.com/pmachata/dwlocstat tool gives (where before is vanilla
      trunk + the rtl.h patch but not {cselib,var-tracking,dse}.c
      --enable-checking=yes,rtl,extra bootstrapped, then {cselib,var-tracking,dse}.c
      hunks applied and make cc1plus, while after is trunk with the whole patch
      applied).
      
      64-bit cc1plus
      before
      cov%	samples	cumul
      0..10	1232756/48%	1232756/48%
      11..20	31089/1%	1263845/49%
      21..30	39172/1%	1303017/51%
      31..40	38853/1%	1341870/52%
      41..50	47473/1%	1389343/54%
      51..60	45171/1%	1434514/56%
      61..70	69393/2%	1503907/59%
      71..80	61988/2%	1565895/61%
      81..90	104528/4%	1670423/65%
      91..100	875402/34%	2545825/100%
      after
      cov%	samples	cumul
      0..10	1233238/48%	1233238/48%
      11..20	31086/1%	1264324/49%
      21..30	39157/1%	1303481/51%
      31..40	38819/1%	1342300/52%
      41..50	47447/1%	1389747/54%
      51..60	45151/1%	1434898/56%
      61..70	69379/2%	1504277/59%
      71..80	61946/2%	1566223/61%
      81..90	104508/4%	1670731/65%
      91..100	875094/34%	2545825/100%
      
      32-bit cc1plus
      before
      cov%	samples	cumul
      0..10	1231221/48%	1231221/48%
      11..20	30992/1%	1262213/49%
      21..30	36422/1%	1298635/51%
      31..40	35793/1%	1334428/52%
      41..50	47102/1%	1381530/54%
      51..60	41201/1%	1422731/56%
      61..70	65467/2%	1488198/58%
      71..80	59560/2%	1547758/61%
      81..90	104076/4%	1651834/65%
      91..100	881879/34%	2533713/100%
      after
      cov%	samples	cumul
      0..10	1230469/48%	1230469/48%
      11..20	30390/1%	1260859/49%
      21..30	36362/1%	1297221/51%
      31..40	36042/1%	1333263/52%
      41..50	47619/1%	1380882/54%
      51..60	41674/1%	1422556/56%
      61..70	65849/2%	1488405/58%
      71..80	59857/2%	1548262/61%
      81..90	104178/4%	1652440/65%
      91..100	881273/34%	2533713/100%
      
      2020-04-02  Jakub Jelinek  <jakub@redhat.com>
      
      	PR rtl-optimization/92264
      	* rtl.h (struct rtx_def): Mention that call bit is used as
      	SP_DERIVED_VALUE_P in cselib.c.
      	* cselib.c (SP_DERIVED_VALUE_P): Define.
      	(PRESERVED_VALUE_P, SP_BASED_VALUE_P): Move definitions earlier.
      	(cselib_hasher::equal): Handle equality between SP_DERIVED_VALUE_P
      	val_rtx and sp based expression where offsets cancel each other.
      	(preserve_constants_and_equivs): Formatting fix.
      	(cselib_reset_table): Add reverse op loc to SP_DERIVED_VALUE_P
      	locs list for cfa_base_preserved_val if needed.  Formatting fix.
      	(autoinc_split): If the to be returned value is a REG, MEM or
      	VALUE which has SP_DERIVED_VALUE_P + CONST_INT as one of its
      	locs, return the SP_DERIVED_VALUE_P VALUE and adjust *off.
      	(rtx_equal_for_cselib_1): Call autoinc_split even if both
      	expressions are PLUS in Pmode with CONST_INT second operands.
      	Handle SP_DERIVED_VALUE_P cases.
      	(cselib_hash_plus_const_int): New function.
      	(cselib_hash_rtx): Use it for PLUS in Pmode with CONST_INT
      	second operand, as well as for PRE_DEC etc. that ought to be
      	hashed the same way.
      	(cselib_subst_to_values): Substitute PLUS with Pmode and
      	CONST_INT operand if the first operand is a VALUE which has
      	SP_DERIVED_VALUE_P + CONST_INT as one of its locs for the
      	SP_DERIVED_VALUE_P + adjusted offset.
      	(cselib_lookup_1): When creating a new VALUE for stack_pointer_rtx,
      	set SP_DERIVED_VALUE_P on it.  Set PRESERVED_VALUE_P when adding
      	SP_DERIVED_VALUE_P PRESERVED_VALUE_P subseted VALUE location.
      	* var-tracking.c (vt_initialize): Call cselib_add_permanent_equiv
      	on the sp value before calling cselib_add_permanent_equiv on the
      	cfa_base value.
      	* dse.c (check_for_inc_dec_1, check_for_inc_dec): Punt on RTX_AUTOINC
      	in the insn without REG_INC note.
      	(replace_read): Punt on RTX_AUTOINC in the *loc being replaced.
      	Punt on invalid insns added by copy_to_mode_reg.  Formatting fixes.
      Jakub Jelinek committed
    • aarch64: Fix ICE due to aarch64_gen_compare_reg_maybe_ze [PR94435] · df562b12
      The following testcase ICEs, because aarch64_gen_compare_reg_maybe_ze emits
      invalid RTL.
      For y_mode [QH]Imode it expects y to be of that mode (or CONST_INT that fits
      into that mode) and x being SImode; for non-CONST_INT y it zero extends y
      into SImode and compares that against x, for CONST_INT y it zero extends y
      into SImode.  The problem is that when the zero extended constant isn't
      usable directly, it forces it into a REG, but with y_mode mode, and then
      compares against y.  That is wrong, because it should force it into a SImode
      REG and compare that way.
      
      2020-04-02  Jakub Jelinek  <jakub@redhat.com>
      
      	PR target/94435
      	* config/aarch64/aarch64.c (aarch64_gen_compare_reg_maybe_ze): For
      	y_mode E_[QH]Imode and y being a CONST_INT, change y_mode to SImode.
      
      	* gcc.target/aarch64/pr94435.c: New test.
      Jakub Jelinek committed
    • aarch64: Fix ICE due to aarch64_gen_compare_reg_maybe_ze [PR94435] · 66e32751
      The following testcase ICEs, because aarch64_gen_compare_reg_maybe_ze emits
      invalid RTL.
      For y_mode [QH]Imode it expects y to be of that mode (or CONST_INT that fits
      into that mode) and x being SImode; for non-CONST_INT y it zero extends y
      into SImode and compares that against x, for CONST_INT y it zero extends y
      into SImode.  The problem is that when the zero extended constant isn't
      usable directly, it forces it into a REG, but with y_mode mode, and then
      compares against y.  That is wrong, because it should force it into a SImode
      REG and compare that way.
      
      2020-04-02  Jakub Jelinek  <jakub@redhat.com>
      
      	PR target/94435
      	* config/aarch64/aarch64.c (aarch64_gen_compare_reg_maybe_ze): For
      	y_mode E_[QH]Imode and y being a CONST_INT, change y_mode to SImode.
      
      	* gcc.target/aarch64/pr94435.c: New test.
      Jakub Jelinek committed