1. 19 Dec, 2018 15 commits
    • re PR tree-optimization/88533 (Higher performance penalty of array-bounds checking for sparse-matrix vector multiply) · 08926e6f
      
      2018-12-19  Richard Biener  <rguenther@suse.de>
      
      	PR tree-optimization/88533
      	Revert
      	2018-04-30  Richard Biener  <rguenther@suse.de>
      
      	PR tree-optimization/28364
      	PR tree-optimization/85275
      	* tree-ssa-loop-ch.c (ch_base::copy_headers): Stop after
      	copying first exit test.
      
      	* gcc.dg/tree-ssa/ssa-dom-thread-7.c: Adjust.
      
      	* tree-ssa-loop-ch.c: Include tree-phinodes.h and
      	ssa-iterators.h.
      	(should_duplicate_loop_header_p): Track whether stmts compute
      	loop invariants or values based on IVs.  Apart from the
      	original loop header only duplicate blocks with exit tests
      	that are based on IVs or invariants.
      
      	* gcc.dg/tree-ssa/copy-headers-6.c: New testcase.
      	* gcc.dg/tree-ssa/copy-headers-7.c: Likewise.
      	* gcc.dg/tree-ssa/ivopt_mult_1.c: Un-XFAIL.
      	* gcc.dg/tree-ssa/ivopt_mult_2.c: Likewise.
      
      From-SVN: r267262
      Richard Biener committed
    • [nvptx] Use MAX, MIN, ROUND_UP macros · 3c55d60f
      Use MAX, MIN, and ROUND_UP macros to simplify code.
      
      Build and reg-tested on x86_64 with nvptx accelerator.
      
      2018-12-19  Tom de Vries  <tdevries@suse.de>
      
      	* config/nvptx/nvptx.c (nvptx_gen_shared_bcast, shared_prop_gen)
      	(nvptx_goacc_expand_accel_var): Use MAX and ROUND_UP.
      	(nvptx_assemble_value, nvptx_output_skip): Use MIN.
      	(nvptx_shared_propagate, nvptx_single, nvptx_expand_shared_addr): Use
      	MAX.
      
      From-SVN: r267261
      Tom de Vries committed
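The kind of simplification this commit describes can be sketched as follows. The macro definitions are stand-ins written for this sketch (GCC's own definitions may be spelled differently), and `grow_open_coded` / `grow_with_macros` are hypothetical helpers, not functions from nvptx.c.

```cpp
#include <cassert>

// Stand-in definitions for illustration only; not copied from GCC.
#undef MAX
#undef MIN
#undef ROUND_UP
#define MAX(X, Y) ((X) > (Y) ? (X) : (Y))
#define MIN(X, Y) ((X) < (Y) ? (X) : (Y))
#define ROUND_UP(X, F) ((((X) + (F) - 1) / (F)) * (F))

// Hypothetical helper, open-coded: pad `needed` up to a multiple of `align`
// (assumed to be a power of two here) and keep the larger of the two sizes.
unsigned grow_open_coded(unsigned current, unsigned needed, unsigned align) {
  unsigned padded = (needed + align - 1) & ~(align - 1);
  if (current < padded)
    current = padded;
  return current;
}

// The same computation expressed with the macros.
unsigned grow_with_macros(unsigned current, unsigned needed, unsigned align) {
  return MAX(current, ROUND_UP(needed, align));
}
```

Both helpers return the same value for power-of-two alignments; the macro version states the intent in one expression.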
    • [nvptx] Make nvptx state propagation function names more generic · a0b3b5c4
      Rename state propagation functions to avoid worker/vector terminology.
      
      Build and reg-tested on x86_64 with nvptx accelerator.
      
      2018-12-19  Tom de Vries  <tdevries@suse.de>
      
      	* config/nvptx/nvptx.c (nvptx_gen_vcast): Rename as
      	nvptx_gen_warp_bcast.
      	(nvptx_gen_wcast): Rename to nvptx_gen_shared_bcast, add bool
      	vector argument, and update call to nvptx_gen_shared_bcast.
      	(propagator_fn): Add bool argument.
      	(nvptx_propagate): New bool argument, pass bool argument to fn.
      	(vprop_gen): Rename to warp_prop_gen, update call to
      	nvptx_gen_warp_bcast.
      	(nvptx_vpropagate): Rename to nvptx_warp_propagate, update call to
      	nvptx_propagate.
      	(wprop_gen): Rename to shared_prop_gen, update call to
      	nvptx_gen_shared_bcast.
      	(nvptx_wpropagate): Rename to nvptx_shared_propagate, update call
      	to nvptx_propagate.
      	(nvptx_wsync): Rename to nvptx_cta_sync.
      	(nvptx_single): Update calls to nvptx_gen_warp_bcast,
      	nvptx_gen_shared_bcast and nvptx_cta_sync.
      	(nvptx_process_pars): Likewise.
      	(write_worker_buffer): Rename as write_shared_buffer.
      	(nvptx_file_end): Update calls to write_shared_buffer.
      	(nvptx_expand_worker_addr): Rename as nvptx_expand_shared_addr.
      	(nvptx_expand_builtin): Update call to nvptx_expand_shared_addr.
      	(nvptx_get_worker_red_addr): Rename as nvptx_get_shared_red_addr.
      	(nvptx_goacc_reduction_setup): Update call to
      	nvptx_get_shared_red_addr.
      	(nvptx_goacc_reduction_fini): Likewise.
      	(nvptx_goacc_reduction_teardown): Likewise.
      
      From-SVN: r267260
      Tom de Vries committed
    • [nvptx] Rename worker_bcast variables to oacc_bcast · 1ed57fb8
      Rename worker_bcast variables to oacc_bcast, avoiding worker terminology.
      
      Build and reg-tested on x86_64 with nvptx accelerator.
      
      2018-12-19  Tom de Vries  <tdevries@suse.de>
      
      	* config/nvptx/nvptx.c (worker_bcast_size): Rename as
      	oacc_bcast_size.
      	(worker_bcast_align): Rename as oacc_bcast_align.
      	(worker_bcast_sym): Rename as oacc_bcast_sym.
      	(nvptx_option_override): Update usage of oacc_bcast_*.
      	(struct wcast_data_t): Rename as broadcast_data_t.
      	(nvptx_gen_wcast): Update type of data argument and usage of
      	oacc_bcast_align.
      	(wprop_gen): Update type of data_ and usage of oacc_bcast_align.
      	(nvptx_wpropagate): Update type of data and usage of
      	oacc_bcast_{sym,size}.
      	(nvptx_single): Update type of data and usage of oacc_bcast_size.
      	(nvptx_file_end): Update usage of oacc_bcast_{sym,align,size}.
      
      From-SVN: r267259
      Tom de Vries committed
    • [nvptx] Generalize bar.sync instruction · 1dcf2688
      Allow the logical barrier operand of nvptx_barsync to be a register, and add a
      thread count operand.
      
      Build and reg-tested on x86_64 with nvptx accelerator.
      
      2018-12-19  Tom de Vries  <tdevries@suse.de>
      
      	* config/nvptx/nvptx.md (nvptx_barsync): Add and handle operand.
      	* config/nvptx/nvptx.c (nvptx_wsync): Update call to gen_nvptx_barsync.
      
      From-SVN: r267258
      Tom de Vries committed
    • [nvptx] Only use one logical barrier resource · 22aa0613
      For OpenACC loops, we generate this style of code:
      ...
              @%r41   bra.uni $L5;
              @%r40   bra     $L6;
                      mov.u64 %r32, %ar0;
                      cvta.shared.u64 %r39, __worker_bcast;
                      st.u64  [%r39], %r32;
      $L6:
      $L5:
                      bar.sync        0;
              @%r40   bra     $L4;
                      cvta.shared.u64 %r38, __worker_bcast;
                      ld.u64  %r32, [%r38];
                      ...
      $L4:
                      bar.sync        1;
      ...
      
      The first barrier is there to ensure that no thread reads the broadcast buffer
      before it's written.  The second barrier is there to ensure that no thread
      overwrites the broadcast buffer before all threads have read it (as well as
      implementing the obligatory synchronization after a worker loop).
      
      We've been using the logical barrier resources '0' and '1' for these two
      barriers, but there's no reason why we can't use the same one.
      
      Use logical barrier resource '0' for both barriers, so that the OpenACC
      implementation claims fewer resources.
      
      Build and reg-tested on x86_64 with nvptx accelerator.
      
      2018-12-19  Tom de Vries  <tdevries@suse.de>
      
      	* config/nvptx/nvptx.c (nvptx_single): Always pass false to
      	nvptx_wsync.
      	(nvptx_process_pars): Likewise.
      
      From-SVN: r267257
      Tom de Vries committed
    • [nvptx] Use TARGET_SET_CURRENT_FUNCTION · 43be05f5
      Implement TARGET_SET_CURRENT_FUNCTION for nvptx.  This gives us a place to
      add initialization or reset actions that need to be executed on a per-function
      basis.
      
      Build and reg-tested on x86_64 with nvptx accelerator.
      
      2018-12-19  Tom de Vries  <tdevries@suse.de>
      
      	* config/nvptx/nvptx.c (nvptx_previous_fndecl): Declare.
      	(nvptx_set_current_function): New function.
      	(TARGET_SET_CURRENT_FUNCTION): Define.
      
      From-SVN: r267256
      Tom de Vries committed
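A per-function hook of this shape can be sketched roughly as below. Everything here is an assumption made for illustration: `fn_decl` stands in for the compiler's function declaration type, the early return on an unchanged declaration is a guess at typical hook behaviour, and the counter only exists to make the effect observable.

```cpp
#include <cassert>

// Hypothetical stand-in for a function declaration node.
struct fn_decl {
  const char* name;
};

// Remembers the declaration seen by the previous call (name modelled on
// the ChangeLog's nvptx_previous_fndecl; the real type differs).
static const fn_decl* previous_fndecl = nullptr;

// Counts how often per-function initialization ran (sketch only).
static int per_function_resets = 0;

// Sketch of a set-current-function hook: do nothing when there is no
// function or the function did not change, otherwise run the
// per-function initialization or reset actions.
void set_current_function(const fn_decl* fndecl) {
  if (fndecl == nullptr || fndecl == previous_fndecl)
    return;
  previous_fndecl = fndecl;
  ++per_function_resets;  // placeholder for real reset actions
}
```

Calling the hook twice in a row with the same declaration runs the reset actions only once.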
    • [aarch64] Correct architecture for tsv110. · 5a8d95cc
      HiSilicon's tsv110 CPU core supports some Armv8.4-A features, but does
      not implement some of the features that are mandatory for Armv8.4-A.
      
      2018-12-19  Shaokun Zhang  <zhangshaokun@hisilicon.com>
      
      	* config/aarch64/aarch64-cores.def (tsv110): Fix architecture.  This
      	part is really Armv8.2 with some permitted Armv8.4 extensions.
      
      From-SVN: r267255
      Shaokun Zhang committed
    • re PR target/88541 (VPCLMULQDQ 256-bit inline function unavailable with optimization but without enabled AVX512VL support) · a62fd9dd
      
      	PR target/88541
      	* config/i386/vpclmulqdqintrin.h (_mm256_clmulepi64_epi128): Enable
      	for -mavx -mvpclmulqdq rather than just for -mavx512vl -mvpclmulqdq.
      
      	* gcc.target/i386/avx-vpclmulqdq-1.c: New test.
      
      From-SVN: r267254
      Jakub Jelinek committed
    • re PR c++/87934 (struct with NSDMI of enum makes initialization a non-constant expression) · 35d87f01
      	PR c++/87934
      	* constexpr.c (cxx_eval_constant_expression) <case CONSTRUCTOR>: Do
      	re-process TREE_CONSTANT CONSTRUCTORs if they aren't reduced constant
      	expressions.
      
      	* g++.dg/cpp0x/constexpr-87934.C: New test.
      
      From-SVN: r267253
      Jakub Jelinek committed
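The situation named in the PR title, a struct whose non-static data member initializer (NSDMI) is an enumerator, can be illustrated with a minimal sketch. The names `color`, `pixel` and `default_pixel` are hypothetical and this is not the testcase added by the commit.

```cpp
#include <cassert>

// Hypothetical types for illustration.
enum class color { red, green };

struct pixel {
  color c = color::green;  // NSDMI of enumeration type
  int alpha = 255;         // NSDMI of integer type
};

// Initializing from the NSDMIs is expected to be a constant expression.
constexpr pixel default_pixel{};

static_assert(default_pixel.c == color::green, "NSDMI enum value is constant");
static_assert(default_pixel.alpha == 255, "NSDMI int value is constant");
```

If the initialization were not treated as a constant expression, the `constexpr` declaration and both `static_assert`s would be rejected at compile time.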
    • [PR86153] simplify more overflow tests in VRP · 0d3d674b
      PR 86153 was originally filed when changes to the C++11
      implementation of vector resize(size_type) limited the inlining that
      was required for testsuite/g++.dg/pr83239.C to verify that we did not
      issue an undesired warning.
      
      That was worked around by increasing the limit for inlining, but that
      in turn caused the significantly different C++98 implementation of
      vector resize to be fully inlined as well, and that happened to issue
      the very warnings the test was meant to verify we did NOT issue.
      
      The reason we issued the warnings was that we failed to optimize out
      some parts of _M_fill_insert, used by the C++98 version of vector
      resize, although the call of _M_fill_insert was guarded by a test that
      could never pass: the testcase only calls resize when the vector size
      is >= 3, to decrement the size by two.  The limitation we hit in VRP
      was that the compared values could pass as an overflow test, if the
      vector size was 0 or 1 (we knew it wasn't), but even with dynamic
      ranges we failed to decide that the test result could be determined at
      compile time, even though after the test we introduced ASSERT_EXPRs
      that required a condition known to be false from earlier ones.
      
      I pondered turning ASSERT_EXPRs that show impossible conditions into
      traps, to enable subsequent instructions to be optimized, but I ended
      up finding an earlier spot in which an overflow test that would have
      introduced the impossible ASSERT_EXPR can have its result deduced from
      earlier known ranges and resolved to the other path.
      
      Although such overflow tests could be uniformly simplified to compares
      against a constant, the original code would only perform such
      simplifications when the test could be resolved to an equality test
      against zero.  I've thus avoided introducing compares against other
      constants, and instead added code that will only simplify overflow
      tests that weren't simplified before when the condition can be
      evaluated at compile time.
      
      
      for  gcc/ChangeLog
      
      	PR testsuite/86153
      	PR middle-end/83239
      	* vr-values.c
      	(vr_values::vrp_evaluate_conditional_warnv_with_ops): Extend
      	simplification of overflow tests to cover cases in which we
      	can determine the result of the comparison.
      
      for  gcc/testsuite/ChangeLog
      
      	PR testsuite/86153
      	PR middle-end/83239
      	* gcc.dg/vrp-overflow-1.c: New.
      
      From-SVN: r267252
      Alexandre Oliva committed
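The overflow-test shape discussed above can be illustrated with unsigned arithmetic. This is a sketch under assumptions: the function names are hypothetical, and the exact comparison that VRP saw in the PR is not reproduced here. For an unsigned `size`, `size - 2 > size` holds exactly when the subtraction wraps, that is, when `size` is 0 or 1.

```cpp
#include <cassert>
#include <cstddef>

// Unsigned overflow test: true exactly when size - 2 wraps around,
// which happens only for size == 0 and size == 1.
bool shrink_by_two_wraps(std::size_t size) {
  return size - 2 > size;
}

// Hypothetical guarded caller: inside the branch size >= 3 is known,
// so the overflow test can be decided (false) from the range alone.
bool guarded_shrink_wraps(std::size_t size) {
  if (size >= 3)
    return shrink_by_two_wraps(size);
  return false;
}
```

Under the guard the test can never be true, which is the kind of fact a range-based simplification can use to fold the comparison at compile time.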
    • [PR87012] canonicalize ref type for tmpl arg · de62200f
      When binding an object to a template parameter of reference type, we
      take the address of the object and dereference that address.  The type
      of the address may still carry (template) typedefs, but
      verify_unstripped_args_1 rejects such typedefs other than in the top
      level of template arguments.
      
      Canonicalizing the type we want to convert to right after any
      substitutions or deductions avoids that issue.
      
      
      for  gcc/cp/ChangeLog
      
      	PR c++/87012
      	* pt.c (convert_template_argument): Canonicalize type after
      	tsubst/deduce.
      
      for  gcc/testsuite/ChangeLog
      
      	PR c++/87012
      	* g++.dg/cpp0x/pr87012.C: New.
      
      From-SVN: r267251
      Alexandre Oliva committed
    • [PR c++/88146] do not crash synthesizing inherited ctor(...) · bceca9b3
      This patch started out from the testcase in PR88146, that attempted to
      synthesize an inherited ctor without any args before a varargs
      ellipsis and crashed while at that, because of the unguarded
      dereferencing of the parm type list, that usually contains a
      terminator.  The terminator is not there for varargs functions,
      however, and without any other args, we ended up dereferencing a NULL
      pointer.  Oops.
      
      Guarding accesses to parm would be easy, but not necessary.  In
      do_build_copy_constructor, non-inherited ctors are copy-ctors, that
      always have at least one parm, so parm need not be guarded when we
      know the access will only take place when we're dealing with an
      inherited ctor.  The only other problematic use was in the cvquals
      initializer, a variable only used in a loop over fields, that we
      skipped individually in inherited ctors.  I've guarded the cvquals
      initialization and the entire loop over fields so they only run for
      copy-ctors.
      
      Avoiding the crash from unguarded accesses was easy, but I thought we
      should still produce the sorry message we got in other testcases that
      passed arguments through the ellipsis in inherited ctors.  I put a
      check in, and noticed the inherited ctors were synthesized with the
      location assigned to the class name, although they were initially
      assigned the location of the using declaration.  I decided the latter
      was better, and arranged for the better location to be retained.
      
      Further investigation revealed the lack of a sorry message had to do
      with the call being in a non-evaluated context, in this case, a
      noexcept expression.  The sorry would be correctly reported in other
      contexts, so I rolled back the check I'd added, but retained the
      source location improvement.
      
      I was still concerned about issuing sorry messages while instantiating
      template ctors even in non-evaluated contexts, e.g., if a template
      ctor had a base initializer that used an inherited ctor with enough
      arguments that they'd go through an ellipsis.  I wanted to defer the
      instantiation of such template ctors, but that would have been wrong
      for constexpr template ctors, and already done for non-constexpr ones.
      So, I just consolidated multiple test variants into a single testcase
      that explores and explains various of the possibilities I thought of.
      
      
      for  gcc/cp/ChangeLog
      
      	PR c++/88146
      	* method.c (do_build_copy_constructor): Guard cvquals init and
      	loop over fields to run for non-inherited ctors only.
      	(synthesize_method): Retain location of inherited ctor.
      
      for  gcc/testsuite/ChangeLog
      
      	PR c++/88146
      	* g++.dg/cpp0x/inh-ctor32.C: New.
      
      From-SVN: r267250
      Alexandre Oliva committed
    • auto-profile.c (afdo_indirect_call): Skip generating histogram value if we can't… · 4469188c
      auto-profile.c (afdo_indirect_call): Skip generating histogram value if we can't find cgraph_node for then...
      
      	* auto-profile.c (afdo_indirect_call): Skip generating histogram
      	value if we can't find cgraph_node for the indirected callee.  Save
      	profile_id of the cgraph_node in histogram value's first counter.
      	* value-prof.c (gimple_value_profile_transformations): Don't skip
      	for flag_auto_profile.
      
      From-SVN: r267249
      Bin Cheng committed
    • Daily bump. · 0fb778bc
      From-SVN: r267248
      GCC Administrator committed
  2. 18 Dec, 2018 15 commits
    • re PR rtl-optimization/87759 (ICE in lra_assign, at lra-assigns.c:1624, or ICE: Maximum number of LRA assignment passes is achieved (30), or compile-time hog) · 4a7e3b42
      
      	PR rtl-optimization/87759
      	* gcc.target/i386/pr87759.c: Require int128 effective target.
      
      From-SVN: r267245
      Jakub Jelinek committed
    • re PR rtl-optimization/87759 (ICE in lra_assign, at lra-assigns.c:1624, or ICE: Maximum number of LRA assignment passes is achieved (30), or compile-time hog) · 003cd04c
      
      2018-12-18  Vladimir Makarov  <vmakarov@redhat.com>
      
      	PR rtl-optimization/87759
      	* lra-assigns.c (lra_split_hard_reg_for): Recalculate
      	non_reload_pseudos.
      
      2018-12-18  Vladimir Makarov  <vmakarov@redhat.com>
      
      	PR rtl-optimization/87759
      	* gcc.target/i386/pr87759.c: New.
      
      From-SVN: r267244
      Vladimir Makarov committed
    • re PR tree-optimization/88464 (AVX-512 vectorization of masked scatter failing with "not suitable for scatter store") · dc5b05a0
      
      	PR target/88464
      	* config/i386/i386-builtin-types.def
      	(VOID_FTYPE_PDOUBLE_QI_V8SI_V4DF_INT,
      	VOID_FTYPE_PFLOAT_QI_V4DI_V8SF_INT,
      	VOID_FTYPE_PLONGLONG_QI_V8SI_V4DI_INT,
      	VOID_FTYPE_PINT_QI_V4DI_V8SI_INT,
      	VOID_FTYPE_PDOUBLE_QI_V4SI_V2DF_INT,
      	VOID_FTYPE_PFLOAT_QI_V2DI_V4SF_INT,
      	VOID_FTYPE_PLONGLONG_QI_V4SI_V2DI_INT,
      	VOID_FTYPE_PINT_QI_V2DI_V4SI_INT): New builtin types.
      	* config/i386/i386.c (enum ix86_builtins): Add
      	IX86_BUILTIN_SCATTERALTSIV4DF, IX86_BUILTIN_SCATTERALTDIV8SF,
      	IX86_BUILTIN_SCATTERALTSIV4DI, IX86_BUILTIN_SCATTERALTDIV8SI,
      	IX86_BUILTIN_SCATTERALTSIV2DF, IX86_BUILTIN_SCATTERALTDIV4SF,
      	IX86_BUILTIN_SCATTERALTSIV2DI and IX86_BUILTIN_SCATTERALTDIV4SI.
      	(ix86_init_mmx_sse_builtins): Fix up names of IX86_BUILTIN_GATHERALT*,
      	IX86_BUILTIN_GATHER3ALT* and IX86_BUILTIN_SCATTERALT* builtins to
      	match the IX86_BUILTIN codes.  Build IX86_BUILTIN_SCATTERALTSIV4DF,
      	IX86_BUILTIN_SCATTERALTDIV8SF, IX86_BUILTIN_SCATTERALTSIV4DI,
      	IX86_BUILTIN_SCATTERALTDIV8SI, IX86_BUILTIN_SCATTERALTSIV2DF,
      	IX86_BUILTIN_SCATTERALTDIV4SF, IX86_BUILTIN_SCATTERALTSIV2DI and
      	IX86_BUILTIN_SCATTERALTDIV4SI decls.
      	(ix86_vectorize_builtin_scatter): Expand those new builtins.
      
      	* gcc.target/i386/avx512f-pr88464-5.c: New test.
      	* gcc.target/i386/avx512f-pr88464-6.c: New test.
      	* gcc.target/i386/avx512f-pr88464-7.c: New test.
      	* gcc.target/i386/avx512f-pr88464-8.c: New test.
      	* gcc.target/i386/avx512vl-pr88464-5.c: New test.
      	* gcc.target/i386/avx512vl-pr88464-6.c: New test.
      	* gcc.target/i386/avx512vl-pr88464-7.c: New test.
      	* gcc.target/i386/avx512vl-pr88464-8.c: New test.
      	* gcc.target/i386/avx512vl-pr88464-9.c: New test.
      	* gcc.target/i386/avx512vl-pr88464-10.c: New test.
      	* gcc.target/i386/avx512vl-pr88464-11.c: New test.
      	* gcc.target/i386/avx512vl-pr88464-12.c: New test.
      	* gcc.target/i386/avx512vl-pr88464-13.c: New test.
      	* gcc.target/i386/avx512vl-pr88464-14.c: New test.
      	* gcc.target/i386/avx512vl-pr88464-15.c: New test.
      	* gcc.target/i386/avx512vl-pr88464-16.c: New test.
      
      From-SVN: r267239
      Jakub Jelinek committed
    • LWG 3171: restore stream insertion for filesystem::directory_entry · 4894e316
      	* include/bits/fs_dir.h (operator<<): Overload for directory_entry,
      	as per LWG 3171.
      	* testsuite/27_io/filesystem/directory_entry/lwg3171.cc: New test.
      
      From-SVN: r267238
      Jonathan Wakely committed
    • Fix previous commit to move instead of copying · fb601354
      	* src/filesystem/std-dir.cc (filesystem::_Dir::advance): Move new
      	path instead of copying.
      
      From-SVN: r267237
      Jonathan Wakely committed
    • Micro-optimization to avoid creating temporary path · 8d531548
      Now that path::operator/=(basic_string_view<value_type>) works directly
      from the string argument, instead of constructing a temporary path from
      the string, it's potentially more efficient to do 'path(x) /= s' instead
      of 'x / s'. This changes the only relevant place in the library.
      
      	* src/filesystem/std-dir.cc (filesystem::_Dir::advance): Append
      	string to lvalue to avoid creating temporary path.
      
      From-SVN: r267236
      Jonathan Wakely committed
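The two spellings contrasted in this commit message can be sketched at user level as follows. The helper names are hypothetical, and whether the second form is actually cheaper depends on the library implementation; the sketch only shows that both produce the same path.

```cpp
#include <cassert>
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// 'x / s' form: the string operand is first converted to a path.
fs::path join_with_slash(const fs::path& dir, const std::string& name) {
  return dir / name;
}

// 'path(x) /= s' form: copy the directory, then append the string to it.
fs::path join_with_append(const fs::path& dir, const std::string& name) {
  fs::path result(dir);
  result /= name;
  return result;
}
```

Both helpers are interchangeable from the caller's point of view; the difference is only in which temporaries may be created along the way.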
    • LWG 2936: update path::compare logic and optimize string comparisons · 36313a6b
      The resolution for LWG 2936 defines the comparison more precisely, which
      this patch implements. The patch also defines comparisons with strings
      to work without constructing a temporary path object (so avoids any
      memory allocations).
      
      	* include/bits/fs_path.h (path::compare(const string_type&))
      	(path::compare(const value_type*)): Add noexcept and construct a
      	string view to compare to instead of a path.
      	(path::compare(basic_string_view<value_type>)): Add noexcept. Remove
      	inline definition.
      	* src/filesystem/std-path.cc (path::_Parser): Track last type read
      	from input.
      	(path::_Parser::next()): Return a final empty component when the
      	input ends in a non-root directory separator.
      	(path::_M_append(basic_string_view<value_type>)): Remove special cases
      	for trailing non-root directory separator.
      	(path::_M_concat(basic_string_view<value_type>)): Likewise.
      	(path::compare(const path&)): Implement LWG 2936.
      	(path::compare(basic_string_view<value_type>)): Define in terms of
      	components returned by parser, consistent with LWG 2936.
      	* testsuite/27_io/filesystem/path/compare/lwg2936.cc: New.
      	* testsuite/27_io/filesystem/path/compare/path.cc: Test more cases.
      	* testsuite/27_io/filesystem/path/compare/strings.cc: Likewise.
      
      From-SVN: r267235
      Jonathan Wakely committed
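A small sketch of component-wise comparison as seen through the public `path::compare` interface. The helper `compare_sign` is hypothetical, and the results noted afterwards assume a standard library that iterates a trailing separator as a final empty component, as the ChangeLog above describes.

```cpp
#include <cassert>
#include <filesystem>

namespace fs = std::filesystem;

// Reduce the result of path::compare to -1, 0 or +1 so that results
// can be compared across implementations.
int compare_sign(const fs::path& lhs, const fs::path& rhs) {
  int result = lhs.compare(rhs);
  return (result > 0) - (result < 0);
}
```

With this helper, `"a/b"` and `"a//b"` compare equal because they have the same components, while `"a/b/"` compares greater than `"a/b"` because of its final empty component.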
    • LWG 3040: define starts_with/ends_with as proposed · 49cefcf3
      	* include/std/string_view [__cplusplus > 201703L]
      	(basic_string_view::starts_with(basic_string_view)): Implement
      	proposed resolution of LWG 3040 to avoid redundant length check.
      	(basic_string_view::starts_with(_CharT)): Implement proposed
      	resolution of LWG 3040 to check at most one character.
      	(basic_string_view::ends_with(_CharT)): Likewise.
      
      From-SVN: r267234
      Jonathan Wakely committed
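The behaviour described by these entries can be approximated with free functions that compile as C++17. This is a sketch, not the libstdc++ source: the `sv_` names are hypothetical, and the bodies follow the wording of the entries (compare only a prefix of the right length; inspect at most one character).

```cpp
#include <cassert>
#include <string_view>

// Prefix test: substr clamps to the view's length, so a prefix longer
// than the view simply compares unequal without a separate length check.
constexpr bool sv_starts_with(std::string_view s, std::string_view prefix) noexcept {
  return s.substr(0, prefix.size()) == prefix;
}

// Single-character tests look at no more than one character.
constexpr bool sv_starts_with(std::string_view s, char c) noexcept {
  return !s.empty() && s.front() == c;
}

constexpr bool sv_ends_with(std::string_view s, char c) noexcept {
  return !s.empty() && s.back() == c;
}

static_assert(sv_starts_with("filesystem", "file"), "prefix matches");
static_assert(!sv_starts_with("fs", "file"), "prefix longer than the view");
```

The empty-view checks matter for the character overloads, since calling `front()` or `back()` on an empty view is undefined.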
    • extend.texi (PowerPC Altivec/VSX Built-in Functions): Describe when a typedef… · 34a9bcaf
      extend.texi (PowerPC Altivec/VSX Built-in Functions): Describe when a typedef name can be used as the type specifier for a vector type...
      
      2018-12-18  Bill Schmidt  <wschmidt@linux.ibm.com>
      
      	* doc/extend.texi (PowerPC Altivec/VSX Built-in Functions):
      	Describe when a typedef name can be used as the type specifier for
      	a vector type, and when it cannot.
      
      From-SVN: r267232
      Bill Schmidt committed
    • [testsuite] Enable vect_usad_char effective target for non-SVE aarch64 · 68d459d9
      In GCC 9 the aarch64 port learned how to do V16QImode SAD operations on signed and unsigned chars.
      But I had missed enabling the effective target for that.
      This patch enables that target for non-SVE aarch64.
      Two new tests now PASS on aarch64:
      gcc.dg/vect/slp-reduc-sad.c
      gcc.dg/vect/vect-reduc-sad.c
      
      	* lib/target-supports.exp (check_effective_target_vect_usad_char):
      	Add non-SVE aarch64 to supported list.
      
      From-SVN: r267230
      Kyrylo Tkachov committed
    • msp430.h: Define TARGET_VTABLE_ENTRY_ALIGN. · e7b78f72
      2018-12-18  Jozef Lawrynowicz  <jozef.l@mittosystems.com>
      
      	* config/msp430/msp430.h: Define TARGET_VTABLE_ENTRY_ALIGN.
      
      From-SVN: r267229
      Jozef Lawrynowicz committed
    • re PR target/88513 (FAIL: gcc.target/i386/pr59591-1.c) · 4714942e
      	PR target/88513
      	PR target/88514
      	* optabs.def (vec_pack_sbool_trunc_optab, vec_unpacks_sbool_hi_optab,
      	vec_unpacks_sbool_lo_optab): New optabs.
      	* optabs.c (expand_widen_pattern_expr): Use vec_unpacks_sbool_*_optab
      	and pass additional argument if both input and target have the same
      	scalar mode of VECTOR_BOOLEAN_TYPE_P vectors.
      	* expr.c (expand_expr_real_2) <case VEC_PACK_TRUNC_EXPR>: Handle
      	VECTOR_BOOLEAN_TYPE_P pack where result has the same scalar mode
      	as the operands using vec_pack_sbool_trunc_optab.
      	* tree-vect-stmts.c (supportable_widening_operation): Use
      	vec_unpacks_sbool_{lo,hi}_optab for VECTOR_BOOLEAN_TYPE_P conversions
      	where both wider_vectype and vectype have the same scalar mode.
      	(supportable_narrowing_operation): Similarly use
      	vec_pack_sbool_trunc_optab if narrow_vectype and vectype have the same
      	scalar mode.
      	* config/i386/i386.c (ix86_get_builtin)
      	<case IX86_BUILTIN_GATHER3ALTDIV8SF>: Check for VECTOR_MODE_P
      	rather than non-VOIDmode.
      	* config/i386/sse.md (vec_pack_trunc_qi, vec_pack_trunc_<mode>):
      	Remove useless ()s around "register_operand", formatting fixes.
      	(vec_pack_sbool_trunc_qi, vec_unpacks_sbool_lo_qi,
      	vec_unpacks_sbool_hi_qi): New expanders.
      	* doc/md.texi (vec_pack_sbool_trunc_M, vec_unpacks_sbool_hi_M,
      	vec_unpacks_sbool_lo_M): Document.
      
      	* gcc.target/i386/avx512f-pr88513-1.c: New test.
      	* gcc.target/i386/avx512f-pr88513-2.c: New test.
      	* gcc.target/i386/avx512vl-pr88464-1.c: New test.
      	* gcc.target/i386/avx512vl-pr88464-2.c: New test.
      	* gcc.target/i386/avx512vl-pr88464-3.c: New test.
      	* gcc.target/i386/avx512vl-pr88464-4.c: New test.
      	* gcc.target/i386/avx512vl-pr88513-1.c: New test.
      	* gcc.target/i386/avx512vl-pr88513-2.c: New test.
      	* gcc.target/i386/avx512vl-pr88513-3.c: New test.
      	* gcc.target/i386/avx512vl-pr88513-4.c: New test.
      	* gcc.target/i386/avx512vl-pr88514-1.c: New test.
      	* gcc.target/i386/avx512vl-pr88514-2.c: New test.
      	* gcc.target/i386/avx512vl-pr88514-3.c: New test.
      
      From-SVN: r267228
      Jakub Jelinek committed
    • combine.c (update_rsp_from_reg_equal): Only look for the nonzero bits of src in… · 6a30d8c0
      combine.c (update_rsp_from_reg_equal): Only look for the nonzero bits of src in nonzero_bits_mode if...
      
      2018-12-18  Jozef Lawrynowicz  <jozef.l@mittosystems.com>
      
      	* combine.c (update_rsp_from_reg_equal): Only look for the nonzero bits
      	of src in nonzero_bits_mode if the mode of src is MODE_INT and
      	HWI_COMPUTABLE.
      	(reg_nonzero_bits_for_combine): Add clarification to comment.
      
      From-SVN: r267227
      Jozef Lawrynowicz committed
    • driver-i386.c (host_detect_local_cpu): Detect cascadelake. · 5d54c798
      gcc/ChangeLog
      2018-12-18  Wei Xiao  <wei3.xiao@intel.com>
      
      	* config/i386/driver-i386.c (host_detect_local_cpu): Detect cascadelake.
      	* config/i386/i386.c (fold_builtin_cpu): Handle cascadelake.
      	* doc/extend.texi: Add cascadelake.
      
      gcc/testsuite/ChangeLog
      2018-12-18  Wei Xiao  <wei3.xiao@intel.com>
      
      	* g++.target/i386/mv16.C: Handle new march.
      	* gcc.target/i386/builtin_target.c: Ditto.
      
      libgcc/ChangeLog
      2018-12-18  Wei Xiao  <wei3.xiao@intel.com>
      
      	* config/i386/cpuinfo.c (get_intel_cpu): Handle cascadelake.
      	* config/i386/cpuinfo.h: Add INTEL_COREI7_CASCADELAKE.
      
      From-SVN: r267226
      Wei Xiao committed
    • Daily bump. · f9fd26fe
      From-SVN: r267225
      GCC Administrator committed
  3. 17 Dec, 2018 10 commits
    • PR libstdc++/71044 fix off-by-one errors introduced recently · 2017595d
      The recent changes to append/concat directly from strings (without
      constructing paths) introduced regressions where one of the components
      could be omitted from the iteration sequence in the result.
      
      	PR libstdc++/71044
      	* src/filesystem/std-path.cc (path::_M_append): Fix off-by-one error
      	that caused a component to be lost from the iteration sequence.
      	(path::_M_concat): Likewise.
      	* testsuite/27_io/filesystem/path/append/source.cc: Test appending
      	long strings.
      	* testsuite/27_io/filesystem/path/concat/strings.cc: Test
      	concatenating long strings.
      	* testsuite/27_io/filesystem/path/construct/string_view.cc: Test
      	construction from long string.
      
      From-SVN: r267222
      Jonathan Wakely committed
    • re PR target/87870 (ppc64le generates poor code when loading constants into TImode vars) · 00fd0628
      gcc/
      	PR target/87870
      	* config/rs6000/vsx.md (nW): New mode iterator.
      	(vsx_mov<mode>_64bit): Use it.  Remove redundant GPR 0/-1 alternative.
      	Update length attribute for (<??r>, <nW>)  alternative.
      	(vsx_mov<mode>_32bit): Likewise.
      
      gcc/testsuite/
      	PR target/87870
      	* gcc.target/powerpc/pr87870.c: New test.
      
      From-SVN: r267221
      Peter Bergner committed
    • re PR c++/88410 (internal compiler error: output_operand: invalid expression as operand) · 1e9d6923
      	PR c++/88410
      	* cp-gimplify.c (cp_fold) <case ADDR_EXPR>: For offsetof-like folding,
      	call maybe_constant_value on val to see if it is INTEGER_CST.
      
      	* g++.dg/cpp0x/pr88410.C: New test.
      
      From-SVN: r267220
      Jakub Jelinek committed
    • PR c++/52321 print note for static_cast to/from incomplete type · f4d458f3
      	PR c++/52321
      	* typeck.c (build_static_cast): Print a note when the destination
      	type or the operand is a pointer/reference to incomplete class type.
      
      From-SVN: r267219
      Jonathan Wakely committed
    • [nvptx] Move macro defs to top of nvptx.c · 693ad66b
      Move macro definitions to the top of the file, allowing them to be used
      thereafter.
      
      Build and reg-tested on x86_64 with nvptx accelerator.
      
      2018-12-17  Tom de Vries  <tdevries@suse.de>
      
      	* config/nvptx/nvptx.c (PTX_VECTOR_LENGTH, PTX_WORKER_LENGTH,
      	PTX_DEFAULT_RUNTIME_DIM): Move to the top of the file.
      
      From-SVN: r267216
      Tom de Vries committed
    • [nvptx] Add PTX_WARP_SIZE · 5d17a476
      Add PTX_WARP_SIZE constant and use it in nvptx_simt_vf.  The function
      nvptx_simt_vf is used for OpenMP, and using PTX_WARP_SIZE here decouples the
      OpenMP support from the PTX_VECTOR_LENGTH constant used in OpenACC support.
      
      Build and reg-tested on x86_64 with nvptx accelerator.
      
      2018-12-17  Tom de Vries  <tdevries@suse.de>
      
      	* config/nvptx/nvptx.c (PTX_WARP_SIZE): Define.
      	(nvptx_simt_vf): Return PTX_WARP_SIZE instead of PTX_VECTOR_LENGTH.
      
      From-SVN: r267215
      Tom de Vries committed
    • [nvptx] Fix whitespace in nvptx_single and nvptx_neuter_pars · 7820b298
      Fix whitespace in nvptx_single and nvptx_neuter_pars.
      
      Build and reg-tested on x86_64 with nvptx accelerator.
      
      2018-12-17  Tom de Vries  <tdevries@suse.de>
      
      	* config/nvptx/nvptx.c (nvptx_single): Fix whitespace.
      	(nvptx_neuter_pars): Likewise.
      
      From-SVN: r267214
      Tom de Vries committed
    • [nvptx] Unify C/Fortran routine handling in nvptx_goacc_validate_dims · 207e7fea
      The Fortran front-end has a bug (PR72741) which means that when
      nvptx_goacc_validate_dims is called for a Fortran routine, the dims parameter
      differs from what it would have been had the function been called for
      an equivalent C routine.
      
      Work around this bug by overriding the dims parameter for routines, allowing the
      function to handle routines in Fortran and C the same.
      
      Build and reg-tested on x86_64 with nvptx accelerator.
      
      2018-12-17  Tom de Vries  <tdevries@suse.de>
      
      	* config/nvptx/nvptx.c (nvptx_goacc_validate_dims): Work around Fortran
      	bug PR72741 by overriding dims parameter for routines.
      
      From-SVN: r267213
      Tom de Vries committed
    • [nvptx] Rewrite nvptx_goacc_validate_dims to use predicate vars · ec6c865c
      The function nvptx_goacc_validate_dims has arguments decl and fn_level which
      together describe different situations.
      
      Introduce a predicate var for each situation, and use them, so that
      what the function does in each situation can be understood without having
      to know how the situations are encoded in the args.
      
      Build and reg-tested on x86_64 with nvptx accelerator.
      
      2018-12-17  Tom de Vries  <tdevries@suse.de>
      
      	* config/nvptx/nvptx.c (nvptx_goacc_validate_dims): Rewrite using
      	predicate vars.
      
      From-SVN: r267212
      Tom de Vries committed
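The refactoring idea, decoding the argument pair once into named predicates, can be sketched as below. The encoding used here (which values of `fn_level` mean what, and the predicate names) is invented for illustration and should not be read as the actual contract of nvptx_goacc_validate_dims.

```cpp
#include <cassert>

// Hypothetical predicate set; one flag per situation.
struct situation {
  bool default_dims_p;
  bool min_dims_p;
  bool offload_region_p;
  bool routine_p;
};

// Decode (has_decl, fn_level) once, using an assumed encoding:
//   no decl, fn_level == -1 -> default dimensions query
//   no decl, fn_level == -2 -> minimum dimensions query
//   decl,    fn_level == -1 -> offload region
//   decl,    fn_level >= 0  -> routine
situation classify(bool has_decl, int fn_level) {
  situation s{};  // all predicates start out false
  if (!has_decl) {
    s.default_dims_p = (fn_level == -1);
    s.min_dims_p = (fn_level == -2);
  } else if (fn_level == -1) {
    s.offload_region_p = true;
  } else if (fn_level >= 0) {
    s.routine_p = true;
  }
  return s;
}
```

Later code can then branch on `s.routine_p` or `s.offload_region_p` directly instead of repeating comparisons on the raw arguments.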
    • Add missing ChangeLog entry from last checkin: · c764b12c
      2018-12-17  Steve Ellcey  <sellcey@cavium.com>
      
      	* gcc.target/aarch64/torture/aarch64-torture.exp: New file.
      	* gcc.target/aarch64/torture/simd-abi-1.c: New test.
      	* gcc.target/aarch64/torture/simd-abi-2.c: Ditto.
      	* gcc.target/aarch64/torture/simd-abi-3.c: Ditto.
      	* gcc.target/aarch64/torture/simd-abi-4.c: Ditto.
      	* gcc.target/aarch64/torture/simd-abi-5.c: Ditto.
      	* gcc.target/aarch64/torture/simd-abi-6.c: Ditto.
      	* gcc.target/aarch64/torture/simd-abi-7.c: Ditto.
      
      From-SVN: r267210
      Steve Ellcey committed