1. 02 Aug, 2018 8 commits
    • [gen/AArch64] Generate helpers for substituting iterator values into pattern names · 0016d8d9
      Given a pattern like:
      
        (define_insn "aarch64_frecpe<mode>" ...)
      
      the SVE ACLE implementation wants to generate the pattern for a
      particular (non-constant) mode.  This patch automatically generates
      helpers to do that, specifically:
      
        // Return CODE_FOR_nothing on failure.
        insn_code maybe_code_for_aarch64_frecpe (machine_mode);
      
        // Assert that the code exists.
        insn_code code_for_aarch64_frecpe (machine_mode);
      
        // Return NULL_RTX on failure.
        rtx maybe_gen_aarch64_frecpe (machine_mode, rtx, rtx);
      
        // Assert that generation succeeds.
        rtx gen_aarch64_frecpe (machine_mode, rtx, rtx);
      
      Many patterns don't have sensible names when all <...>s are removed.
      E.g. "<optab><mode>2" would give a base name "2".  The new functions
      therefore require explicit opt-in, which should also help to reduce
      code bloat.
      
      The (arbitrary) opt-in syntax I went for was to prefix the pattern
      name with '@', similarly to the existing '*' marker.
      
      The patch also makes config/aarch64 use the new routines in cases where
      they obviously apply.  This was mostly straightforward, but it seemed
      odd that we defined:
      
         aarch64_reload_movcp<...><P:mode>
      
      but then only used it with DImode, never SImode.  If we should be
      using Pmode instead of DImode, then that's a simple change,
      but should probably be a separate patch.
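
      As a rough illustration of the shape of the generated helpers, the
      sketch below mocks them up outside GCC.  The machine_mode and insn_code
      enums here are hypothetical stand-ins with a made-up set of modes, not
      the real generated tables, and plain assert stands in for whatever
      checking the generated code uses:

```cpp
#include <cassert>

// Hypothetical stand-ins for GCC's machine_mode and insn_code enums.
enum machine_mode { SFmode, DFmode, V2SFmode, TImode };
enum insn_code
{
  CODE_FOR_nothing,
  CODE_FOR_aarch64_frecpesf,
  CODE_FOR_aarch64_frecpedf,
  CODE_FOR_aarch64_frecpev2sf
};

// Return CODE_FOR_nothing if the pattern has no instance for MODE.
insn_code
maybe_code_for_aarch64_frecpe (machine_mode mode)
{
  switch (mode)
    {
    case SFmode: return CODE_FOR_aarch64_frecpesf;
    case DFmode: return CODE_FOR_aarch64_frecpedf;
    case V2SFmode: return CODE_FOR_aarch64_frecpev2sf;
    default: return CODE_FOR_nothing;
    }
}

// Assert that the code exists.
insn_code
code_for_aarch64_frecpe (machine_mode mode)
{
  insn_code code = maybe_code_for_aarch64_frecpe (mode);
  assert (code != CODE_FOR_nothing);
  return code;
}
```

      A caller that knows the mode is supported would use
      code_for_aarch64_frecpe (mode) directly, while a caller that needs to
      probe for support would test the maybe_ variant against
      CODE_FOR_nothing.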
      
      2018-08-02  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* doc/md.texi: Expand the documentation of instruction names
      	to mention port-local uses.  Document '@' in pattern names.
      	* read-md.h (overloaded_instance, overloaded_name): New structs.
      	(mapping): Declare.
      	(md_reader::handle_overloaded_name): New member function.
      	(md_reader::get_overloads): Likewise.
      	(md_reader::m_first_overload): New member variable.
      	(md_reader::m_next_overload_ptr): Likewise.
      	(md_reader::m_overloads_htab): Likewise.
      	* read-md.c (md_reader::md_reader): Initialize m_first_overload,
      	m_next_overload_ptr and m_overloads_htab.
      	* read-rtl.c (iterator_group): Add "type" and "get_c_token" fields.
      	(get_mode_token, get_code_token, get_int_token): New functions.
      	(map_attr_string): Add an optional argument that passes back
      	the associated iterator.
      	(overloaded_name_hash, overloaded_name_eq_p, named_rtx_p)
      	(md_reader::handle_overloaded_name, add_overload_instance): New
      	functions.
      	(apply_iterators): Handle '@' names.  Report an error if '@'
      	is used without iterators.
      	(initialize_iterators): Initialize the new iterator_group fields.
      	* genopinit.c (handle_overloaded_code_for)
      	(handle_overloaded_gen): New functions.
      	(main): Use them to print declarations of maybe_code_for_* and
      	maybe_gen_* functions, and inline definitions of code_for_* and gen_*.
      	* genemit.c (print_overload_arguments, print_overload_test)
      	(handle_overloaded_code_for, handle_overloaded_gen): New functions.
      	(main): Use them to print definitions of maybe_code_for_* and
      	maybe_gen_* functions.
      	* config/aarch64/aarch64.c (aarch64_split_128bit_move): Use
      	gen_aarch64_mov{low,high}_di and gen_aarch64_movdi_{low,high}
      	instead of explicit mode checks.
      	(aarch64_split_simd_combine): Likewise gen_aarch64_simd_combine.
      	(aarch64_split_simd_move): Likewise gen_aarch64_split_simd_mov.
      	(aarch64_emit_load_exclusive): Likewise gen_aarch64_load_exclusive.
      	(aarch64_emit_store_exclusive): Likewise gen_aarch64_store_exclusive.
      	(aarch64_expand_compare_and_swap): Likewise
      	gen_aarch64_compare_and_swap and gen_aarch64_compare_and_swap_lse.
      	(aarch64_gen_atomic_cas): Likewise gen_aarch64_atomic_cas.
      	(aarch64_emit_atomic_swap): Likewise gen_aarch64_atomic_swp.
      	(aarch64_constant_pool_reload_icode): Delete.
      	(aarch64_secondary_reload): Use code_for_aarch64_reload_movcp
      	instead of aarch64_constant_pool_reload_icode.  Use
      	code_for_aarch64_reload_mov instead of explicit mode checks.
      	(rsqrte_type, get_rsqrte_type, rsqrts_type, get_rsqrts_type): Delete.
      	(aarch64_emit_approx_sqrt): Use gen_aarch64_rsqrte instead of
      	get_rsqrte_type and gen_aarch64_rsqrts instead of get_rsqrts_type.
      	(recpe_type, get_recpe_type, recps_type, get_recps_type): Delete.
      	(aarch64_emit_approx_div): Use gen_aarch64_frecpe instead of
      	get_recpe_type and gen_aarch64_frecps instead of get_recps_type.
      	(aarch64_atomic_load_op_code): Delete.
      	(aarch64_emit_atomic_load_op): Likewise.
      	(aarch64_gen_atomic_ldop): Use UNSPECV_ATOMIC_* instead of
      	aarch64_atomic_load_op_code.  Use gen_aarch64_atomic_load
      	instead of aarch64_emit_atomic_load_op.
      	* config/aarch64/aarch64.md (aarch64_reload_movcp<GPF_TF:mode><P:mode>)
      	(aarch64_reload_movcp<VALL:mode><P:mode>, aarch64_reload_mov<mode>)
      	(aarch64_movdi_<mode>low, aarch64_movdi_<mode>high)
      	(aarch64_mov<mode>high_di, aarch64_mov<mode>low_di): Add a '@'
      	character before the pattern name.
      	* config/aarch64/aarch64-simd.md (aarch64_split_simd_mov<mode>)
      	(aarch64_rsqrte<mode>, aarch64_rsqrts<mode>)
      	(aarch64_simd_combine<mode>, aarch64_frecpe<mode>)
      	(aarch64_frecps<mode>): Likewise.
      	* config/aarch64/atomics.md (atomic_compare_and_swap<mode>)
      	(aarch64_compare_and_swap<mode>, aarch64_compare_and_swap<mode>_lse)
      	(aarch64_load_exclusive<mode>, aarch64_store_exclusive<mode>)
      	(aarch64_atomic_swp<mode>, aarch64_atomic_cas<mode>)
      	(aarch64_atomic_load<atomic_ldop><mode>): Likewise.
      
      From-SVN: r263251
      Richard Sandiford committed
    • [AArch64] Add support for 16-bit FMOV immediates · a4518821
      aarch64_float_const_representable_p was still returning false for
      HFmode, so we wouldn't use 16-bit FMOV immediates.  E.g. before the
      patch:
      
          __fp16 foo (void) { return 0x1.1p-3; }
      
      gave:
      
             mov     w0, 12352
             fmov    h0, w0
      
      with -march=armv8.2-a+fp16, whereas now it gives:
      
             fmov    h0, 1.328125e-1
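
      For reference, an AArch64 FMOV immediate can encode values of the form
      +/-(n / 16) * 2^r with 16 <= n <= 31 and -3 <= r <= 4.  The sketch below
      checks that property for a double; fmov_immediate_p is a hypothetical
      name and this is not the code in aarch64_float_const_representable_p:

```cpp
#include <cmath>

// Return true if VALUE is +/-(n / 16) * 2^r for some integer n in [16, 31]
// and some integer r in [-3, 4].  Zero is not encodable this way.
bool
fmov_immediate_p (double value)
{
  if (value == 0.0 || !std::isfinite (value))
    return false;

  int exp;
  // frac is in [0.5, 1) and fabs (value) == frac * 2^exp.
  double frac = std::frexp (std::fabs (value), &exp);

  // Rewrite as (n / 16) * 2^r with n / 16 in [1, 2).
  double n = frac * 32.0;
  int r = exp - 1;

  return n == std::floor (n) && r >= -3 && r <= 4;
}
```

      The constant in the example above, 0x1.1p-3, is 17/16 * 2^-3, so it
      passes this check.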
      
      2018-08-02  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* config/aarch64/aarch64.c (aarch64_float_const_representable_p):
      	Allow HFmode constants if TARGET_FP_F16INST.
      
      gcc/testsuite/
      	* gcc.target/aarch64/f16_mov_immediate_1.c: Expect fmov immediate
      	to be used.
      	* gcc.target/aarch64/f16_mov_immediate_2.c: Likewise.
      	* gcc.target/aarch64/f16_mov_immediate_3.c: Force +nofp16.
      	* gcc.target/aarch64/sve/single_1.c: Expect fmov immediate to be used
      	for .h.
      	* gcc.target/aarch64/sve/single_2.c: Likewise.
      	* gcc.target/aarch64/sve/single_3.c: Likewise.
      	* gcc.target/aarch64/sve/single_4.c: Likewise.
      
      From-SVN: r263250
      Richard Sandiford committed
    • re PR target/86014 ([AArch64] missed LDP optimization) · 363b395b
      gcc/
      2018-08-02  Jackson Woodruff  <jackson.woodruff@arm.com>
      
      	PR target/86014
      	* config/aarch64/aarch64.c (aarch64_operands_adjust_ok_for_ldpstp):
      	No longer check last store for clobber of address register.
      
      
      gcc/testsuite
      2018-08-02  Jackson Woodruff  <jackson.woodruff@arm.com>
      
      	PR target/86014
      	* gcc.target/aarch64/ldp_stp_13.c: New test.
      
      From-SVN: r263249
      Jackson Woodruff committed
    • Fix gcov misleading error (PR gcov-profile/86817). · ca498a11
      2018-08-02  Martin Liska  <mliska@suse.cz>
      
      	PR gcov-profile/86817
      	* gcov.c (process_all_functions): New function.
      	(main): Call it.
      	(process_file): Move functions processing to
      	process_all_functions.
      
      From-SVN: r263248
      Martin Liska committed
    • Cherry-pick compiler-rt revision 338606 (PR sanitizer/86022). · b4f1f01d
      Fix sizeof(struct pthread) in glibc 2.14.
      
      2018-08-02  Martin Liska  <mliska@suse.cz>
      
      	PR sanitizer/86022
      	* sanitizer_common/sanitizer_linux_libcdep.cc (ThreadDescriptorSize):
      	Cherry-pick compiler-rt revision 338606.
      
      From-SVN: r263246
      Martin Liska committed
    • [ARM] Fix PR85434: spilling of stack protector guard's address on ARM · 39e4731c
      In case of high register pressure in PIC mode, the address of the stack
      protector's guard can be spilled on ARM targets as shown in PR85434,
      thus allowing an attacker to control what the canary would be compared
      against. This is also known as CVE-2018-12886. ARM does lack
      stack_protect_set and stack_protect_test insn patterns, but defining
      them does not help, as the address is expanded regularly and the
      patterns only deal with the copy and test of the guard with the canary.
      
      This problem does not occur for x86 targets because the PIC access and
      the test can be done in the same instruction. AArch64 is exempt too
      because its PIC access insn patterns are movs of an UNSPEC, which
      prevents the second access in the epilogue from being CSEd in the
      cse_local pass with the first access in the prologue.
      
      The approach followed here is to create new "combined" set and test
      standard pattern names that take the unexpanded guard and do the set or
      test. This allows the target to use an opaque pattern (e.g. using UNSPEC)
      to hide the individual instructions being generated from the compiler and
      to split the pattern into generic load, compare and branch instructions
      after register allocation, therefore avoiding any spilling. This is
      implemented here for the ARM targets. For targets not implementing these new
      standard pattern names, the existing stack_protect_set and
      stack_protect_test pattern names are used.
      
      To be able to split PIC access after register allocation, the functions
      had to be augmented to force a new PIC register load and to control
      which register it loads into. This is because sharing the PIC register
      between prologue and epilogue could lead to spilling due to CSE again
      which an attacker could use to control what the canary gets compared
      against.
      
      2018-08-02  Thomas Preud'homme  <thomas.preudhomme@linaro.org>
      
          gcc/
          PR target/85434
          * target-insns.def (stack_protect_combined_set): Define new standard
          pattern name.
          (stack_protect_combined_test): Likewise.
          * cfgexpand.c (stack_protect_prologue): Try new
          stack_protect_combined_set pattern first.
          * function.c (stack_protect_epilogue): Try new
          stack_protect_combined_test pattern first.
          * config/arm/arm.c (require_pic_register): Add pic_reg and compute_now
          parameters to control which register to use as PIC register and force
          reloading PIC register respectively.  Insert in the stream of insns if
          possible.
          (legitimize_pic_address): Expose above new parameters in prototype and
          adapt recursive calls accordingly.
          (arm_legitimize_address): Adapt to new legitimize_pic_address
          prototype.
          (thumb_legitimize_address): Likewise.
          (arm_emit_call_insn): Adapt to new require_pic_register prototype.
          * config/arm/arm-protos.h (legitimize_pic_address): Adapt to prototype
          change.
          * config/arm/arm.md (movsi expander): Adapt to legitimize_pic_address
          prototype change.
          (stack_protect_combined_set): New insn_and_split pattern.
          (stack_protect_set): New insn pattern.
          (stack_protect_combined_test): New insn_and_split pattern.
          (stack_protect_test): New insn pattern.
          * config/arm/unspecs.md (UNSPEC_SP_SET): New unspec.
          (UNSPEC_SP_TEST): Likewise.
          * doc/md.texi (stack_protect_combined_set): Document new standard
          pattern name.
          (stack_protect_set): Clarify that the operand for guard's address is
          legal.
          (stack_protect_combined_test): Document new standard pattern name.
          (stack_protect_test): Clarify that the operand for guard's address is
          legal.
      
          gcc/testsuite/
          PR target/85434
          * gcc.target/arm/pr85434.c: New test.
      
      From-SVN: r263245
      Thomas Preud'homme committed
    • dumpfile.c/h: add "const" to dump location ctors · 12c27c75
      gcc/ChangeLog:
      	* dumpfile.c (dump_user_location_t::dump_user_location_t): Add
      	"const" to the "gimple *" and "rtx_insn *" parameters.
      	* dumpfile.h (dump_user_location_t::dump_user_location_t):
      	Likewise.
      	(dump_location_t::dump_location_t): Likewise.
      
      From-SVN: r263244
      David Malcolm committed
    • Daily bump. · fbdd6065
      From-SVN: r263243
      GCC Administrator committed
  2. 01 Aug, 2018 32 commits
    • PR tree-optimization/86650 - -Warray-bounds missing inlining context · 8a45b051
      gcc/c/ChangeLog:
      
      	PR tree-optimization/86650
      	* c-objc-common.c (c_tree_printer): Move usage of EXPR_LOCATION (t)
      	and TREE_BLOCK (t) from within percent_K_format to this callsite.
      
      gcc/c-family/ChangeLog:
      
      	PR tree-optimization/86650
      	* c-family/c-format.c (gcc_tdiag_char_table): Update comment for "%G".
      	(gcc_cdiag_char_table, gcc_cxxdiag_char_table): Same.
       	(init_dynamic_diag_info): Update from "gcall *" to "gimple *".
       	* c-format.h (T89_G): Update to be "gimple *" rather than
       	"gcall *".
      	(local_gcall_ptr_node): Rename...
       	(local_gimple_ptr_node): ...to this.
      
      gcc/cp/ChangeLog:
      
      	PR tree-optimization/86650
      	* error.c (cp_printer): Move usage of EXPR_LOCATION (t) and
      	TREE_BLOCK (t) from within percent_K_format to this callsite.
      
      gcc/ChangeLog:
      
      	PR tree-optimization/86650
      	* gimple-pretty-print.c (percent_G_format): Accept a "gimple *"
      	rather than a "gcall *".  Directly pass the data of interest
       	to percent_K_format, rather than building a temporary CALL_EXPR
       	to hold it.
      	* gimple-fold.c (gimple_fold_builtin_strncpy): Adjust.
      	(gimple_fold_builtin_strncat): Adjust.
      	* gimple-ssa-warn-restrict.h (check_bounds_or_overlap): Replace
      	gcall* argument with gimple*.
      	* gimple-ssa-warn-restrict.c (check_call): Same.
      	(wrestrict_dom_walker::before_dom_children): Same.
      	(builtin_access::builtin_access): Same.
      	(check_bounds_or_overlap): Same.
      	(maybe_diag_overlap): Same.
      	(maybe_diag_offset_bounds): Same.
      	* tree-diagnostic.c (default_tree_printer): Move usage of
      	EXPR_LOCATION (t) and TREE_BLOCK (t) from within percent_K_format
      	to this callsite.
      	* tree-pretty-print.c (percent_K_format): Add argument.
      	* tree-pretty-print.h: Add argument.
      	* tree-ssa-ccp.c (pass_post_ipa_warn::execute): Adjust.
      	* tree-ssa-strlen.c (maybe_diag_stxncpy_trunc): Adjust.
      	(maybe_diag_stxncpy_trunc): Same.
      	(handle_builtin_stxncpy): Same.
      	(handle_builtin_strcat): Same.
      
      gcc/testsuite/ChangeLog:
      
      	PR tree-optimization/86650
      	* gcc.dg/format/gcc_diag-10.c: Adjust.
      
      From-SVN: r263239
      Martin Sebor committed
    • xcoff.c (struct xcoff_line, [...]): Remove. · ca9a1314
      	* xcoff.c (struct xcoff_line, struct xcoff_line_vector): Remove.
      	(struct xcoff_func, struct xcoff_func_vector): New structs.
      	(xcoff_syminfo): Drop leading dot from symbol name.
      	(xcoff_line_compare, xcoff_line_search): Remove.
      	(xcoff_func_compare, xcoff_func_search): New static functions.
      	(xcoff_lookup_pc): Search function table.
      	(xcoff_add_line, xcoff_process_linenos): Remove.
      	(xcoff_initialize_fileline): Build function table.
      
      From-SVN: r263238
      Tony Reix committed
    • [libgomp] Truncate config/nvptx/oacc-parallel.c · 701d080a
      	libgomp/
      	* config/nvptx/oacc-parallel.c: Truncate.
      
      Co-Authored-By: James Norris <jnorris@codesourcery.com>
      Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
      
      From-SVN: r263236
      Cesar Philippidis committed
    • Add -D_GLIBCXX_ASSERTIONS to DEBUG_FLAGS · 9fbd2e55
      Enable assertions in the extra debug library built when
      --enable-libstdcxx-debug is used. Replace some Debug Mode assertions
      in src/c++11/futex.cc with __glibcxx_assert, because the library will
      never be built with Debug Mode.
      
      	* configure: Regenerate.
      	* configure.ac: Add -D_GLIBCXX_ASSERTIONS to default DEBUG_FLAGS.
      	* src/c++11/futex.cc: Use __glibcxx_assert instead of
      	_GLIBCXX_DEBUG_ASSERT.
      
      From-SVN: r263235
      Jonathan Wakely committed
    • Cherry-pick compiler-rt revision 318044 and 319180. · c191b1ab
          [PowerPC][tsan] Update tsan to handle changed memory layouts in newer kernels
          
          In more recent Linux kernels with 47 bit VMAs the layout of virtual memory
          for powerpc64 changed causing the thread sanitizer to not work properly. This
          patch adds support for 47 bit VMA kernels for powerpc64.
          
          Tested on several 4.x and 3.x kernel releases.
      
      Regtested/bootstrapped on ppc64le-linux with kernel 4.14; applying to
      trunk/8.3.
      
      2018-08-01  Marek Polacek  <polacek@redhat.com>
      
      	PR sanitizer/86759
      	* tsan/tsan_platform.h: Cherry-pick compiler-rt revision 318044.
      	* tsan/tsan_platform_linux.cc: Cherry-pick compiler-rt revision
      	319180.
      
      From-SVN: r263229
      Marek Polacek committed
    • [AArch64] Update expected output for sve/var_stride_[24].c · 616fc41c
      After Segher's recent combine change, these tests now use a single
      instruction to do the "and" and "lsl 10".  This is a good thing,
      so the patch updates the expected output accordingly.
      
      2018-08-01  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/testsuite/
      	* gcc.target/aarch64/sve/var_stride_2.c: Update expected form
      	of range check.
      	* gcc.target/aarch64/sve/var_stride_4.c: Likewise.
      
      From-SVN: r263228
      Richard Sandiford committed
    • [AArch64] XFAIL sve/vcond_[45].c tests · f811f141
      See PR 86753 for details.
      
      2018-08-01  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/testsuite/
      	PR target/86753
      	* gcc.target/aarch64/sve/vcond_4.c: XFAIL positive tests.
      	* gcc.target/aarch64/sve/vcond_5.c: Likewise.
      
      From-SVN: r263227
      Richard Sandiford committed
    • Fold pointer range checks with equal spans · a19f98d5
      When checking whether vectorised accesses at A and B are independent,
      the vectoriser falls back to tests of the form:
      
          A + size <= B || B + size <= A
      
      But in the common case that "size" is just the constant size of a vector
      (or a small multiple), it would be more efficient to do:
      
         (size_t) (A + (size - 1) - B) > (size - 1) * 2
      
      This patch adds folds to do that.  E.g. before the patch, the alias
      checks for:
      
        for (int j = 0; j < n; ++j)
          {
            for (int i = 0; i < 16; ++i)
      	a[i] = (b[i] + c[i]) >> 1;
            a += step;
            b += step;
            c += step;
          }
      
      were:
      
      	add     x7, x1, 15
      	add     x5, x0, 15
      	cmp     x0, x7
      	add     x7, x2, 15
      	ccmp    x1, x5, 2, ls
      	cset    w8, hi
      	cmp     x0, x7
      	ccmp    x2, x5, 2, ls
      	cset    w4, hi
      	tst     w8, w4
      
      while after the patch they're:
      
      	add     x0, x0, 15
      	sub     x6, x0, x1
      	sub     x5, x0, x2
      	cmp     x6, 30
      	ccmp    x5, 30, 0, hi
      
      The old scheme needs:
      
      [A] one addition per vector pointer
      [B] two comparisons and one IOR per range check
      
      The new one needs:
      
      [C] less than one addition per vector pointer
      [D] one subtraction and one comparison per range check
      
      The range checks are then ANDed together, with the same number of
      ANDs either way.
      
      With conditional comparisons (such as on AArch64), we're able to remove
      the IOR between comparisons in the old scheme, but then need an explicit
      AND or branch when combining the range checks, as the example above shows.
      With the new scheme we can instead use conditional comparisons for
      the AND chain.
      
      So even with conditional comparisons, the new scheme should in practice
      be a win in almost all cases.  Without conditional comparisons, the new
      scheme removes at least one operation from [A] and one operation per
      range check from [B], so should always give fewer operations overall.
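
      The equivalence of the two forms can be checked with a small
      standalone sketch.  The function names are hypothetical, addresses are
      modelled as plain unsigned integers, and the code assumes that
      "a + size" and "b + size" do not wrap:

```cpp
#include <cstdint>

// Original form: A + size <= B || B + size <= A.
bool
independent_two_compares (std::uintptr_t a, std::uintptr_t b,
			  std::uintptr_t size)
{
  return a + size <= b || b + size <= a;
}

// Folded form: (size_t) (A + (size - 1) - B) > (size - 1) * 2.
// When the ranges overlap, A - B lies in [-(size - 1), size - 1], so the
// biased difference lies in [0, (size - 1) * 2].  Any other difference
// either exceeds that bound or wraps around to a large unsigned value.
bool
independent_one_compare (std::uintptr_t a, std::uintptr_t b,
			 std::uintptr_t size)
{
  return a + (size - 1) - b > (size - 1) * 2;
}
```

      With size == 16 the bound is 30, which matches the "cmp x6, 30" in
      the assembly after the patch.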
      
      2018-07-20  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* match.pd: Optimise pointer range checks.
      
      gcc/testsuite/
      	* gcc.dg/pointer-range-check-1.c: New test.
      	* gcc.dg/pointer-range-check-2.c: Likewise.
      
      From-SVN: r263226
      Richard Sandiford committed
    • Use steady_clock to implement condition_variable::wait_for · 9e68aa3c
      The C++ standard says that std::condition_variable::wait_for should be
      implemented to be equivalent to:
      
        return wait_until(lock, chrono::steady_clock::now() + rel_time);
      
      But the existing implementation uses chrono::system_clock. Now that
      wait_until has potentially-different behaviour for chrono::steady_clock,
      let's at least try to wait using the correct clock.
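
      Expressed outside the library, the intended behaviour looks roughly
      like the sketch below.  steady_deadline and wait_for_steady are
      hypothetical helper names, not libstdc++ functions, and the
      duration_cast is a simplification that may round a very fine-grained
      rel_time down to the clock's tick:

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>

// Convert a relative timeout into an absolute steady_clock deadline.
template<typename Rep, typename Period>
std::chrono::steady_clock::time_point
steady_deadline (const std::chrono::duration<Rep, Period> &rel_time)
{
  using clock = std::chrono::steady_clock;
  return clock::now () + std::chrono::duration_cast<clock::duration> (rel_time);
}

// Wait for at most REL_TIME, measured against steady_clock.
template<typename Rep, typename Period>
std::cv_status
wait_for_steady (std::condition_variable &cv,
		 std::unique_lock<std::mutex> &lock,
		 const std::chrono::duration<Rep, Period> &rel_time)
{
  return cv.wait_until (lock, steady_deadline (rel_time));
}
```

      Because the deadline is computed from steady_clock, adjusting the
      system time while waiting does not move the deadline itself.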
      
      2018-08-01  Mike Crowe  <mac@mcrowe.com>
      
      	* include/std/condition_variable (wait_for): Use steady_clock.
      
      From-SVN: r263225
      Mike Crowe committed
    • Report early wakeup of condition_variable::wait_until as no_timeout · 2f593432
      As currently implemented, condition_variable always ultimately waits
      against std::chrono::system_clock. This clock can be changed in arbitrary
      ways by the user which may result in us waking up too early or too late
      when measured against the caller-supplied clock.
      
      We can't (yet) do much about waking up too late (PR 41861), but
      if we wake up too early we must return cv_status::no_timeout to indicate a
      spurious wakeup rather than incorrectly returning cv_status::timeout.
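
      The decision described above can be sketched as a small standalone
      function.  classify_wakeup is a hypothetical name; "underlying" stands
      for the result of the internal wait against system_clock and "atime"
      for the caller-supplied deadline:

```cpp
#include <chrono>
#include <condition_variable>

// Decide what wait_until should report, given the result of the underlying
// wait and the deadline expressed in the caller-supplied clock.
template<typename Clock, typename Duration>
std::cv_status
classify_wakeup (std::cv_status underlying,
		 const std::chrono::time_point<Clock, Duration> &atime)
{
  // A notification or spurious wakeup is passed through unchanged.
  if (underlying == std::cv_status::no_timeout)
    return std::cv_status::no_timeout;

  // The underlying clock says we timed out, but the caller's clock has not
  // reached the deadline yet: report it as a spurious wakeup instead.
  if (Clock::now () < atime)
    return std::cv_status::no_timeout;

  return std::cv_status::timeout;
}
```

      A caller that loops on a predicate will simply wait again after the
      no_timeout result, which is the desired behaviour for an early wakeup.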
      
      2018-08-01  Mike Crowe  <mac@mcrowe.com>
      
      	* include/std/condition_variable (wait_until): Only report timeout
      	if we really have timed out when measured against the
      	caller-supplied clock.
      	* testsuite/30_threads/condition_variable/members/2.cc: Add test
      	case to confirm above behaviour.
      
      From-SVN: r263224
      Mike Crowe committed
    • Fix PR number · 5534096c
      From-SVN: r263223
      Richard Sandiford committed
    • Fix remove_stmt in vectorizable_simd_clone_call (PR 86758) · 41b6b80e
      vectorizable_simd_clone_call was trying to remove a pattern statement
      instead of the original statement.  Fixes existing tests
      gcc.dg/pr84452.c and gcc.target/i386/pr84309.c on x86.
      
      2018-08-01  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	PR tree-optimization/86748
      	* tree-vect-stmts.c (vectorizable_simd_clone_call): Don't try
      	to remove pattern statements.
      
      From-SVN: r263222
      Richard Sandiford committed
    • [07/11] Use single basic block array in loop_vec_info · beeb6ce8
      _loop_vec_info::_loop_vec_info used get_loop_body to get the
      order of the blocks when creating stmt_vec_infos, but then used
      dfs_enumerate_from to get the order of the blocks that the rest
      of the vectoriser uses.  We should be able to use that order
      for creating stmt_vec_infos too.
      
      2018-08-01  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* tree-vect-loop.c (_loop_vec_info::_loop_vec_info): Use the
      	result of dfs_enumerate_from when constructing stmt_vec_infos,
      	instead of additionally calling get_loop_body.
      
      From-SVN: r263221
      Richard Sandiford committed
    • [06/11] Handle VMAT_INVARIANT separately · 2d4bca81
      Invariant loads were handled as a variation on the code for contiguous
      loads.  We detected whether they were invariant or not as a byproduct of
      creating the vector pointer ivs: vect_create_data_ref_ptr passed back an
      inv_p to say whether the pointer was invariant.
      
      But vectorised invariant loads just keep the original scalar load,
      so this meant that detecting invariant loads had the side-effect of
      creating an unwanted vector pointer iv.  The placement of the code
      also meant that we'd create a vector load and then not use the result.
      In principle this is wrong code, since there's no guarantee that there's
      a vector's worth of accessible data at that address, but we rely on DCE
      to get rid of the load before any harm is done.
      
      E.g., for an invariant load in an inner loop (which seems like the more
      common use case for this code), we'd create:
      
         vectp_a.6_52 = &a + 4;
      
         # vectp_a.5_53 = PHI <vectp_a.5_54(9), vectp_a.6_52(2)>
      
         # vectp_a.5_55 = PHI <vectp_a.5_53(3), vectp_a.5_56(10)>
      
         vect_next_a_11.7_57 = MEM[(int *)vectp_a.5_55];
         next_a_11 = a[_1];
         vect_cst__58 = {next_a_11, next_a_11, next_a_11, next_a_11};
      
         vectp_a.5_56 = vectp_a.5_55 + 4;
      
         vectp_a.5_54 = vectp_a.5_53 + 0;
      
      whereas all we want is:
      
         next_a_11 = a[_1];
         vect_cst__58 = {next_a_11, next_a_11, next_a_11, next_a_11};
      
      This patch moves the handling to its own block and makes
      vect_create_data_ref_ptr assert (when creating a full iv) that the
      address isn't invariant.
      
      The ncopies handling is unfortunate, but a preexisting issue.
      Richi's suggestion of using a vector of vector statements would
      let us reuse one statement for all copies.
      
      2018-08-01  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* tree-vectorizer.h (vect_create_data_ref_ptr): Remove inv_p
      	parameter.
      	* tree-vect-data-refs.c (vect_create_data_ref_ptr): Likewise.
      	When creating an iv, assert that the step is not known to be zero.
      	(vect_setup_realignment): Update call accordingly.
      	* tree-vect-stmts.c (vectorizable_store): Likewise.
      	(vectorizable_load): Likewise.  Handle VMAT_INVARIANT separately.
      
      From-SVN: r263220
      Richard Sandiford committed
    • [05/11] Add a vect_stmt_to_vectorize helper function · 6e6b18e5
      This patch adds a helper that does the opposite of vect_orig_stmt:
      go from the original scalar statement to the statement that should
      actually be vectorised.
      
      The uses in the last two hunks of vectorizable_reduction are because
      reduc_stmt_info (first hunk) and stmt_info (second hunk) are already
      pattern statements if appropriate.
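
      The relationship between the two directions can be sketched with a
      much-simplified mock.  The stmt_info struct and its field names below
      are hypothetical stand-ins, not GCC's stmt_vec_info, and the real
      helpers may differ in detail:

```cpp
// Hypothetical, much-simplified stand-in for a vectoriser statement record.
struct stmt_info
{
  stmt_info *related_stmt = nullptr;  // Link between original and pattern.
  bool is_pattern_stmt = false;       // Created by pattern recognition.
  bool in_pattern = false;            // Original replaced by a pattern.
};

// Go from a potential pattern statement to the original scalar statement.
stmt_info *
orig_stmt (stmt_info *s)
{
  return s->is_pattern_stmt ? s->related_stmt : s;
}

// Go from the original scalar statement to the statement that should
// actually be vectorised.
stmt_info *
stmt_to_vectorize (stmt_info *s)
{
  return s->in_pattern ? s->related_stmt : s;
}
```

      For a statement that is not involved in any pattern, both functions
      return their argument unchanged.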
      
      2018-08-01  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* tree-vectorizer.h (vect_stmt_to_vectorize): New function.
      	* tree-vect-loop.c (vect_update_vf_for_slp): Use it.
      	(vectorizable_reduction): Likewise.
      	* tree-vect-slp.c (vect_analyze_slp_instance): Likewise.
      	(vect_detect_hybrid_slp_stmts): Likewise.
      	* tree-vect-stmts.c (vect_is_simple_use): Likewise.
      
      From-SVN: r263219
      Richard Sandiford committed
    • tree-vrp (zero_nonzero_bits_from_bounds): Rename to... · cd3ca910
      	* tree-vrp.c (zero_nonzero_bits_from_bounds): Rename to...
      	(wide_int_set_zero_nonzero_bits): ...this.
      	(zero_nonzero_bits_from_vr): Rename to...
      	(vrp_set_zero_nonzero_bits): ...this.
      	(extract_range_from_multiplicative_op_1): Abstract wide int
      	code...
      	(wide_int_range_multiplicative_op): ...here.
      	(extract_range_from_binary_expr_1): Extract wide int binary
      	operations into their own functions.
      	(wide_int_range_lshift): New.
      	(wide_int_range_can_optimize_bit_op): New.
      	(wide_int_range_shift_undefined_p): New.
      	(wide_int_range_bit_xor): New.
      	(wide_int_range_bit_ior): New.
      	(wide_int_range_bit_and): New.
      	(wide_int_range_trunc_mod): New.
      	(extract_range_into_wide_ints): New.
      	(vrp_shift_undefined_p): New.
      	(extract_range_from_multiplicative_op): New.
      	(vrp_can_optimize_bit_op): New.
      	* tree-vrp.h (value_range::dump): New.
      	(wide_int_range_multiplicative_op): New.
      	(wide_int_range_lshift): New.
      	(wide_int_range_shift_undefined_p): New.
      	(wide_int_range_bit_xor): New.
      	(wide_int_range_bit_ior): New.
      	(wide_int_range_bit_and): New.
      	(wide_int_range_trunc_mod): New.
      	(zero_nonzero_bits_from_bounds): Rename to...
      	(wide_int_set_zero_nonzero_bits): ...this.
      	(zero_nonzero_bits_from_vr): Rename to...
      	(vrp_set_zero_nonzero_bits): ...this.
      	(range_easy_mask_min_max): Rename to...
      	(wide_int_range_can_optimize_bit_op): ...this.
      
      From-SVN: r263218
      Aldy Hernandez committed
    • [04/11] Add a vect_orig_stmt helper function · 211cd1e2
      This patch just adds a helper function for going from a potential
      pattern statement to the original scalar statement.
      
      2018-08-01  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* tree-vectorizer.h (vect_orig_stmt): New function.
      	* tree-vect-data-refs.c (vect_preserves_scalar_order_p): Use it.
      	* tree-vect-loop.c (vect_model_reduction_cost): Likewise.
      	(vect_create_epilog_for_reduction): Likewise.
      	(vectorizable_live_operation): Likewise.
      	* tree-vect-slp.c (vect_find_last_scalar_stmt_in_slp): Likewise.
      	(vect_detect_hybrid_slp_stmts, vect_schedule_slp): Likewise.
      	* tree-vect-stmts.c (vectorizable_call): Likewise.
      	(vectorizable_simd_clone_call, vect_remove_stores): Likewise.
      
      From-SVN: r263217
      Richard Sandiford committed
    • [03/11] Remove vect_transform_stmt grouped_store argument · b0b45e58
      Nothing now uses the grouped_store value passed back by
      vect_transform_stmt, so we might as well remove it.
      
      2018-08-01  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* tree-vectorizer.h (vect_transform_stmt): Remove grouped_store
      	argument.
      	* tree-vect-stmts.c (vect_transform_stmt): Likewise.
      	* tree-vect-loop.c (vect_transform_loop_stmt): Update call accordingly.
      	(vect_transform_loop): Likewise.
      	* tree-vect-slp.c (vect_schedule_slp_instance): Likewise.
      
      From-SVN: r263216
      Richard Sandiford committed
    • [02/11] Remove vect_schedule_slp return value · 8fe1bd30
      Nothing now uses the vect_schedule_slp return value, so it's not worth
      propagating the value through vect_schedule_slp_instance.
      
      2018-08-01  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* tree-vectorizer.h (vect_schedule_slp): Return void.
      	* tree-vect-slp.c (vect_schedule_slp_instance): Likewise.
      	(vect_schedule_slp): Likewise.
      
      From-SVN: r263215
      Richard Sandiford committed
    • [01/11] Schedule SLP earlier · 99615cf5
      vect_transform_loop used to call vect_schedule_slp lazily when it
      came across the first SLP statement, but it seems easier to do it
      before the main loop.
      
      2018-07-30  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* tree-vect-loop.c (vect_transform_loop_stmt): Remove slp_scheduled
      	argument.
      	(vect_transform_loop): Update calls accordingly.  Schedule SLP
      	instances before the main loop, if any exist.
      
      From-SVN: r263214
      Richard Sandiford committed
    • Fix over-widening handling of COND_EXPRs (PR 86749) · 047fba34
      This PR is a wrong-code bug caused by the over-widening support.
      The minimum input precisions for a COND_EXPR are supposed to apply
      only to the "then" and "else" values, but here we were applying
      them to the operands of a nested COND_EXPR comparison instead.
      
      2018-08-01  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	PR tree-optimization/86749
      	* tree-vect-patterns.c (vect_determine_min_output_precision_1):
      	If the lhs is used in a COND_EXPR, check that it is being used
      	as the "then" or "else" value.
      
      gcc/testsuite/
      	PR tree-optimization/86749
      	* gcc.dg/vect/pr86749.c: New test.
      
      From-SVN: r263213
      Richard Sandiford committed
    • [PATCH] Remove use of 'struct map' from plugin (nvptx) · 094db6be
      	libgomp/
      	* plugin/plugin-nvptx.c (struct map): Removed.
      	(map_init, map_pop): Remove use of struct map. (map_push):
      	Likewise and change argument list.
      	* testsuite/libgomp.oacc-c-c++-common/mapping-1.c: New test.
      
      Co-Authored-By: James Norris <jnorris@codesourcery.com>
      
      From-SVN: r263212
      Cesar Philippidis committed
    • PR libstdc++/60555 std::system_category() should recognise POSIX errno values · 5ecfbf82
      	PR libstdc++/60555
      	* src/c++11/system_error.cc
      	(system_error_category::default_error_condition): New override to
      	check for POSIX errno values.
      	* testsuite/19_diagnostics/error_category/generic_category.cc: New
      	test.
      	* testsuite/19_diagnostics/error_category/system_category.cc: New
      	test.
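
      A minimal sketch of the user-visible effect, assuming a POSIX target
      and a libstdc++ that includes this change.  The helper name
      maps_to_generic is hypothetical and used only for illustration:

```cpp
#include <cerrno>
#include <system_error>

// Return true if std::system_category() maps ERRNUM to the same portable
// condition that std::generic_category() would report for it.
bool maps_to_generic(int errnum)
{
  std::error_code ec(errnum, std::system_category());
  return ec.default_error_condition()
         == std::error_condition(errnum, std::generic_category());
}

// Return true if a system-category ENOENT compares equal to the portable
// std::errc enumerator for the same error.
bool enoent_matches_errc()
{
  std::error_code ec(ENOENT, std::system_category());
  return ec == std::errc::no_such_file_or_directory;
}
```

      With the new override both functions should return true for POSIX
      errno values; on other platforms or library versions the result may
      differ.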
      
      From-SVN: r263210
      Jonathan Wakely committed
    • [nvptx] Define TARGET_HAVE_SPECULATION_SAFE_VALUE · e335138d
      2018-08-01  Tom de Vries  <tdevries@suse.de>
      
      	PR target/86800
      	* config/nvptx/nvptx.c (TARGET_HAVE_SPECULATION_SAFE_VALUE): Define to
      	speculation_safe_value_not_needed.
      
      From-SVN: r263209
      Tom de Vries committed
    • [libgomp, nvptx] Add cuda-lib.def · 8c6310a2
      2018-08-01  Tom de Vries  <tdevries@suse.de>
      
      	* plugin/cuda-lib.def: New file.  Factor out of ...
      	* plugin/plugin-nvptx.c (CUDA_CALLS): ... here.
      	(struct cuda_lib_s, init_cuda_lib): Include cuda-lib.def instead of
      	using CUDA_CALLS.
      
      From-SVN: r263208
      Tom de Vries committed
    • re PR c++/86661 (g++ ICE:tree check: expected tree that contains ‘decl minimal’ structure, have ‘overload’ in note_name_declared_in_class, at cp/class.c:8288) · 5ebbb72c
      
      /cp
      2018-08-01  Paolo Carlini  <paolo.carlini@oracle.com>
      
      	PR c++/86661
      	* class.c (note_name_declared_in_class): Use location_of in permerror
      	instead of DECL_SOURCE_LOCATION (for OVERLOADs).
      
      /testsuite
      2018-08-01  Paolo Carlini  <paolo.carlini@oracle.com>
      
      	PR c++/86661
      	* g++.dg/lookup/name-clash12.C: New.
      
      From-SVN: r263207
      Paolo Carlini committed
    • tree-ssa-sccvn.c (visit_phi): Compare invariant addresses as base and offset. · e4837aa9
      2018-08-01  Richard Biener  <rguenther@suse.de>
      
      	* tree-ssa-sccvn.c (visit_phi): Compare invariant addresses
      	as base and offset.
      
      	* gcc.dg/tree-ssa/ssa-fre-68.c: New testcase.
      
      From-SVN: r263206
      Richard Biener committed
    • poly-int-07_plugin.c (dg-options): Use -O0. · 42c4ccce
      	* gcc.dg/plugin/poly-int-07_plugin.c (dg-options): Use -O0.
      
      From-SVN: r263205
      Uros Bizjak committed
    • pr84512.c: Xfail on alpha*-*-*. · 4bbea044
      	* gcc.dg/tree-ssa/pr84512.c: Xfail on alpha*-*-*.
      
      From-SVN: r263204
      Uros Bizjak committed
    • Improve dumping of value profiling transformations. · 7f87c8da
      2018-08-01  Martin Liska  <mliska@suse.cz>
      
      	* value-prof.c (gimple_divmod_fixed_value_transform): Unify
      	the format in which a successful transformation is dumped.
      	(gimple_mod_pow2_value_transform): Likewise.
      	(gimple_mod_subtract_transform): Likewise.
      	(gimple_stringops_transform): Likewise.
      2018-08-01  Martin Liska  <mliska@suse.cz>
      
      	* gcc.dg/tree-prof/stringop-1.c: Adjust scanned pattern.
      	* gcc.dg/tree-prof/stringop-2.c: Likewise.
      	* gcc.dg/tree-prof/val-prof-1.c: Likewise.
      	* gcc.dg/tree-prof/val-prof-2.c: Likewise.
      	* gcc.dg/tree-prof/val-prof-3.c: Likewise.
      	* gcc.dg/tree-prof/val-prof-4.c: Likewise.
      	* gcc.dg/tree-prof/val-prof-5.c: Likewise.
      	* gcc.dg/tree-prof/val-prof-7.c: Likewise.
      
      From-SVN: r263203
      Martin Liska committed
    • __gcov_indirect_call_callee can't be null in __gcov_indirect_call_profiler_v2. · fd2e1dcd
      2018-08-01  Martin Liska  <mliska@suse.cz>
      
      	* libgcov-profiler.c (__gcov_indirect_call_profiler_v2): Do not
      	check that __gcov_indirect_call_callee is non-null.
      
      From-SVN: r263202
      Martin Liska committed
    • Add memmove to value profiling. · 181f2e99
      2018-08-01  Martin Liska  <mliska@suse.cz>
      
      	PR value-prof/35543
      	* value-prof.c (interesting_stringop_to_profile_p):
      	Simplify the code and add BUILT_IN_MEMMOVE.
      	(gimple_stringops_transform): Likewise.
      2018-08-01  Martin Liska  <mliska@suse.cz>
      
      	PR value-prof/35543
      	* gcc.dg/tree-prof/val-prof-7.c: Add __builtin_memmove.
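
      For readers unfamiliar with string-operation value profiling, the
      sketch below shows by hand the kind of size specialisation that a
      profile can drive.  It is an illustration only: kCommonSize and
      profiled_memmove are hypothetical names, and the code the compiler
      actually emits may look different:

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical: the size that profiling reported as the most common one.
constexpr std::size_t kCommonSize = 16;

// Hand-written equivalent of specialising a memmove call on its most
// common size.  Both paths perform the same move.
void *profiled_memmove(void *dst, const void *src, std::size_t n)
{
  if (n == kCommonSize)
    // Fast path: the size is a compile-time constant here, so the
    // compiler has the option of expanding the call inline.
    return std::memmove(dst, src, kCommonSize);
  // Fallback: the general call with a run-time size.
  return std::memmove(dst, src, n);
}
```

      As with memmove itself, the source and destination ranges may
      overlap on either path.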
      
      From-SVN: r263201
      Martin Liska committed