Commits · dc192bbdd0442f75e324cc82a98e611b7912e2f9 · lvzhengyang / riscv-gcc-1

26 Feb, 2020 19 commits

coroutines: Amend parameter handling to match n4849. · dc192bbd

In n4849 and preceding versions, [class.copy.elision] (1.3)
appears to confer additional permissions on coroutines to elide
parameter copies.

After considerable discussion on this topic by email and during
the February 2020 WG21 meeting, it has been determined that there
are no additional permissions applicable to coroutine parameter
copy elision.

The content of that clause in the standard is expected to be amended
eventually to clarify this.  Other than this, the handling of
parameter lifetimes is expected to be as per n4849:

 * A copy is made before the promise is constructed
 * If the promise CTOR uses the parms, then it should use the copy
   where appropriate.
 * The param copy lifetimes end after the promise is destroyed
   (during the coroutine frame destruction).
 * Otherwise, C++20 copy elision rules apply.

As an aside: in practice, we expect that copy elision can only occur
when the coroutine body is fully inlined, possibly in conjunction with
heap allocation elision.

The patch:
 * Reorders the copying process to precede the promise CTOR and
   ensures the correct use.
 * Copies all params into the frame regardless of whether the coro
   body uses them (this is a bit unfortunate, and we should figure
   out an amendment for C++23).

gcc/cp/ChangeLog:

2020-02-26  Iain Sandoe  <iain@sandoe.co.uk>

	* class.c (classtype_has_non_deleted_copy_ctor): New.
	* coroutines.cc (struct param_info): Keep track of params
	that are references, and cache the original type and whether
	the DTOR is trivial.
	(build_actor_fn): Handle param copies always, and adjust the
	handling for references.
	(register_param_uses): Only handle uses here.
	(classtype_has_non_deleted_copy_ctor): New.
	(morph_fn_to_coro): Adjust param copy handling to match n4849
	by reordering ahead of the promise CTOR and always making a
	frame copy, even if the param is unused in the coroutine body.
	* cp-tree.h (classtype_has_non_deleted_copy_ctor): New.

gcc/testsuite/ChangeLog:

2020-02-26  Iain Sandoe  <iain@sandoe.co.uk>

	* g++.dg/coroutines/coro1-refs-and-ctors.h: New.
	* g++.dg/coroutines/torture/func-params-07.C: New test.
	* g++.dg/coroutines/torture/func-params-08.C: New test.

committed Feb 26, 2020

libgo: update to final Go1.14 release · c5decc83
```
Reviewed-on: https://go-review.googlesource.com/c/gofrontend/+/221158
```
Ian Lance Taylor committed Feb 26, 2020

rs6000: Fix more testsuite fallout from rs6000_legitimate_address_p() fix. [PR93913] · 051b9873

	PR target/93913
	* gcc.target/powerpc/fold-vec-st-char.c (scan-assembler-times): Allow
	stxv and stxvx instructions as well.
	* gcc.target/powerpc/fold-vec-st-float.c: Likewise.
	* gcc.target/powerpc/fold-vec-st-int.c: Likewise.
	* gcc.target/powerpc/fold-vec-st-short.c: Likewise.

committed Feb 26, 2020

c++: Some improvements to concept diagnostics · 44f6b7fb

This patch improves our concept diagnostics in two ways.  First, it sets a more
precise location for the constraint expressions built in
finish_constraint_binary_op.  As a result, when a disjunction is unsatisfied we
now print e.g.

.../include/bits/range_access.h:467:2: note: neither operand of the disjunction is satisfied
  466 |  requires is_bounded_array_v<remove_reference_t<_Tp>> || __member_end<_Tp>
      |           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  467 |  || __adl_end<_Tp>
      |  ^~~~~~~~~~~~~~~~~

instead of

.../include/bits/range_access.h:467:2: note: neither operand of the disjunction is satisfied
  467 |  || __adl_end<_Tp>
      |  ^~

Second, this patch changes diagnose_atomic_constraint to print unsatisfied
atomic constraint expressions with their template arguments.  So e.g. we now
print

cpp2a/concepts-pr67719.C:9:8: note: the expression ‘(... &&(C<Tx>)()) [with Tx = {int, long int, void}]’ evaluated to ‘false’

instead of

cpp2a/concepts-pr67719.C:9:8: note: the expression ‘(... &&(C<Tx>)())’ evaluated to ‘false’

Tested on x86_64-pc-linux-gnu, and verified that all the diagnostics emitted in
our concept tests are no worse with this patch.

gcc/cp/ChangeLog:

	* constraint.cc (finish_constraint_binary_op): Set expr's location range
	to the range of its operands.
	(satisfy_atom): Pass MAP instead of ARGS to diagnose_atomic_constraint.
	(diagnose_trait_expr): Take the instantiated parameter mapping MAP
	instead of the corresponding template arguments ARGS and adjust body
	accordingly.
	(diagnose_requires_expr): Likewise.
	(diagnose_atomic_constraint): Likewise.  When printing an atomic
	constraint expression, print the instantiated parameter mapping
	alongside it.
	* cxx-pretty-print.cc (cxx_pretty_printer::expression)
	[NONTYPE_ARGUMENT_PACK]: Print braces around a NONTYPE_ARGUMENT_PACK.
	(cxx_pretty_printer::type_id): Handle TYPE_ARGUMENT_PACK.

gcc/testsuite/ChangeLog:

	* g++.dg/concepts/diagnostic2.C: New test.
	* g++.dg/concepts/diagnostic3.C: New test.

committed Feb 26, 2020

c++: Fix value-init crash in template [PR93676] · 38e10026

Since <https://gcc.gnu.org/ml/gcc-patches/2015-02/msg00556.html> we
attempt to value-initialize in build_vec_init even when there's no
initializer but the type has a constexpr default constructor.  But
build_value_init doesn't work in templates, and build_vec_init
creates a lot of garbage that would not be used anyway, so don't
call it in a template.

	PR c++/93676 - value-init crash in template.
	* init.c (build_new_1): Don't call build_vec_init in a template.

	* g++.dg/cpp0x/nsdmi-template19.C: New test.

committed Feb 26, 2020

libstdc++: Fix use of inaccessible private member in split_view (PR93936) · 8ce13842

We are calling _OuterIter::__current from _InnerIter::operator==, but the former
is private within this non-member friend.  Fix this by calling
_OuterIter::operator== instead, which does the right thing here.

libstdc++-v3/ChangeLog:

	PR libstdc++/93936
	* include/std/ranges (split_view::_InnerIter::operator==): Compare
	the operands' _M_i rather than their _M_i.current().
	* testsuite/std/ranges/adaptors/split.cc: Augment test.

committed Feb 26, 2020

libstdc++: P1645R1 constexpr for <numeric> algorithms · fd335985

This adds constexpr to 11 algorithms defined in <numeric> as per P1645R1.
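
As a hedged illustration of what this enables (not taken from the patch or
its tests), the sketch below evaluates std::accumulate at compile time when
the library advertises __cpp_lib_constexpr_numeric, and falls back to a
run-time call otherwise.  The helper name sum_first_four is hypothetical.

```cpp
#include <array>
#include <cassert>
#include <numeric>

// Hypothetical helper, not part of libstdc++: sums {1, 2, 3, 4}.
#if defined(__cpp_lib_constexpr_numeric)
// C++20 library with P1645R1: std::accumulate is usable in constant
// expressions, so the result can be checked with static_assert.
constexpr int sum_first_four()
{
  std::array<int, 4> a{1, 2, 3, 4};
  return std::accumulate(a.begin(), a.end(), 0);
}
static_assert(sum_first_four() == 10, "evaluated at compile time");
#else
// Older standard or library: same computation, at run time only.
inline int sum_first_four()
{
  std::array<int, 4> a{1, 2, 3, 4};
  return std::accumulate(a.begin(), a.end(), 0);
}
#endif
```

Guarding on the feature-test macro keeps the example valid both before and
after the change.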

libstdc++-v3/ChangeLog:

	P1645R1 constexpr for <numeric> algorithms
	* include/bits/stl_numeric.h (iota, accumulate, inner_product,
	partial_sum, adjacent_difference): Make conditionally constexpr for
	C++20.
	* include/std/numeric (__cpp_lib_constexpr_numeric): Define this feature
	test macro.
	(reduce, transform_reduce, exclusive_scan, inclusive_scan,
	transform_exclusive_scan, transform_inclusive_scan): Make conditionally
	constexpr for C++20.
	* include/std/version (__cpp_lib_constexpr_numeric): Define.
	* testsuite/26_numerics/accumulate/constexpr.cc: New test.
	* testsuite/26_numerics/adjacent_difference/constexpr.cc: Likewise.
	* testsuite/26_numerics/exclusive_scan/constexpr.cc: Likewise.
	* testsuite/26_numerics/inclusive_scan/constexpr.cc: Likewise.
	* testsuite/26_numerics/inner_product/constexpr.cc: Likewise.
	* testsuite/26_numerics/iota/constexpr.cc: Likewise.
	* testsuite/26_numerics/partial_sum/constexpr.cc: Likewise.
	* testsuite/26_numerics/reduce/constexpr.cc: Likewise.
	* testsuite/26_numerics/transform_exclusive_scan/constexpr.cc: Likewise.
	* testsuite/26_numerics/transform_inclusive_scan/constexpr.cc: Likewise.
	* testsuite/26_numerics/transform_reduce/constexpr.cc: Likewise.

committed Feb 26, 2020

libstdc++: Two simplifications for lexicographical_compare · 113f0a63

	* include/bits/ranges_algo.h (__lexicographical_compare_fn): Declare
	variables in smaller scope and avoid calling ranges::distance when we
	know they are pointers. Remove statically-unreachable use of
	__builtin_unreachable().
	* include/bits/stl_algobase.h (__lexicographical_compare::__lc):
	Define inline.

committed Feb 26, 2020

libstdc++: Add __maybe_const_t and __maybe_empty_t aliases · 8017d95c

This introduces a couple of convenience alias templates to be used for
some repeated patterns using std::conditional_t.
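
A minimal sketch of the pattern, assuming simplified stand-in names
(maybe_const_t, maybe_empty_t and empty_t are hypothetical; the real aliases
live in include/std/ranges and may differ in spelling and parameter order):

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical placeholder type used when a member is not needed.
struct empty_t { };

// Adds const to T when Const is true, otherwise yields T unchanged.
template<bool Const, typename T>
  using maybe_const_t = std::conditional_t<Const, const T, T>;

// Yields T when Present is true, otherwise the empty placeholder.
template<bool Present, typename T>
  using maybe_empty_t = std::conditional_t<Present, T, empty_t>;

static_assert(std::is_same<maybe_const_t<true, int>, const int>::value, "");
static_assert(std::is_same<maybe_const_t<false, int>, int>::value, "");
static_assert(std::is_same<maybe_empty_t<true, int>, int>::value, "");
static_assert(std::is_same<maybe_empty_t<false, int>, empty_t>::value, "");
```

Each alias replaces a repeated std::conditional_t expression with a short
name that states the intent at the point of use.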

	* include/std/ranges (__detail::__maybe_empty_t): Define new helper
	alias.
	(__detail::__maybe_const_t): Likewise.
	(__adaptor::_RangeAdaptor): Use __maybe_empty_t.
	(transform_view, take_view, take_while_view, elements_view): Use
	__maybe_const_t.
	(join_view, split_view): Use both.

committed Feb 26, 2020

c++: Fix ICE with static_cast when converting from int[] [PR93862] · 4a305fa2

This ICEs since my patch for P0388, which allowed conversions to arrays
of unknown bound, but not the reverse, so these two static_casts are
ill-formed.

[expr.static.cast]/3 says that "cv1 T1" and "cv2 T2" have to be
reference-compatible and the comment in build_static_cast_1 says it too
but then we actually use reference_related_p...  Fixed thus.

2020-02-26  Marek Polacek  <polacek@redhat.com>

	PR c++/93862 - ICE with static_cast when converting from int[].
	* call.c (reference_compatible_p): No longer static.
	* cp-tree.h (reference_compatible_p): Declare.
	* typeck.c (build_static_cast_1): Use reference_compatible_p instead
	of reference_related_p.

	* g++.dg/cpp0x/rv-cast7.C: New test.

committed Feb 26, 2020

c++: Add test for DR 1423, Convertibility of nullptr to bool. · b9934ad8

DR 1423, which supersedes DR 654, says that you can't copy-init
a bool from a std::nullptr_t:

  bool b = nullptr;  // error

Conversely, it works with direct-initialization, which is more
permissive than copy-initialization.

No code changes necessary since we handle it right.
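
The distinction can be sketched as follows.  This is an illustration only,
not the contents of dr1423.C; the function name is hypothetical, and the
static_assert assumes an implementation that follows DR 1423.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// std::is_convertible models copy-initialization, which DR 1423 forbids
// for std::nullptr_t -> bool, so the trait is expected to be false.
static_assert(!std::is_convertible<std::nullptr_t, bool>::value,
              "no implicit conversion from nullptr_t to bool");

// Direct-initialization is still allowed and yields false.
inline bool direct_init_from_nullptr()
{
  bool b(nullptr);
  return b;
}
```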

2020-02-26  Marek Polacek  <polacek@redhat.com>

	DR 1423, Convertibility of nullptr to bool.
	* g++.dg/DRs/dr1423.C: New test.

committed Feb 26, 2020

c++: Fix ICE with constexpr init and [[no_unique_address]] [PR93803] · d6ff2207

Here we crash when constexpr-initializing a class member of empty class
type with [[no_unique_address]].  Without the attribute we would have
a ctor (that initializes bar) of the form

  { .D.2173 = { .x = {} } }

but with the attribute reduced_constant_expression_p gets

  { .x = {} }

That means that "idx != field" is true for the latter and we see that
foo, the base class of bar, is an empty class, so we want to look at
the next initializable field (since empty class fields may not have an
initializer).  But in this case there are no more, therefore accessing
DECL_CHAIN (field) crashes.  Long story short, we need to avoid a crash
on a null field when we're initializing a class that only contains an
empty base class.

While poking into this I discovered c++/93898, but that's a different
problem.

2020-02-26  Marek Polacek  <polacek@redhat.com>

	PR c++/93803 - ICE with constexpr init and [[no_unique_address]].
	* constexpr.c (reduced_constant_expression_p): Don't crash on a null
	field.

	* g++.dg/cpp2a/constexpr-init16.C: New test.
	* g++.dg/cpp2a/constexpr-init17.C: New test.

committed Feb 26, 2020

dump load permutations and refcount per SLP node · 759bd406

This adjusts the dumping in ways that proved useful in debugging.

2020-02-26  Richard Biener  <rguenther@suse.de>

	* tree-vect-slp.c (vect_print_slp_tree): Also dump ref count
	and load permutation.

committed Feb 26, 2020

optabs: Don't use scalar conversions for vectors [PR93843] · b6268016

In this PR we had a conversion between two integer vectors that
both had scalar integer modes.  We then tried to implement the
conversion using the scalar optab for those modes, instead of
doing the conversion elementwise.

I wondered about letting through scalar modes for single-element
vectors, but I don't have any evidence that that's useful/necessary,
so it seemed better to keep things simple.

2020-02-26  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	PR middle-end/93843
	* optabs-tree.c (supportable_convert_operation): Reject types with
	scalar modes.

gcc/testsuite/
	PR middle-end/93843
	* gcc.dg/vect/pr93843-1.c: New test.
	* gcc.dg/vect/pr93843-2.c: Likewise.

committed Feb 26, 2020

analyzer: improvements to logging/dumping · 67fa274c

This patch adds various information to -fdump-analyzer and
-fdump-analyzer-stderr to make it easier to track down
problems with state explosions in the exploded_graph.

It logs the number of unprocessed nodes in the worklist, for
the case where the upper limit on exploded nodes is reached.

It prints:
[a] a bar chart showing the number of exploded nodes by function, and

[b] bar charts for each function showing the number of exploded nodes
    per supernode/BB, and

[c] bar charts for each function showing the number of excess exploded
    nodes per supernode/BB beyond the limit
    (--param=analyzer-max-enodes-per-program-point), where that limit
    was reached

I've found these helpful in finding exactly where we fail to consolidate
state, leading to state explosions and false negatives due to the
thresholds being reached.

The patch also adds a "superedge::dump" member function I found myself
needing.

gcc/ChangeLog:
	* Makefile.in (ANALYZER_OBJS): Add analyzer/bar-chart.o.

gcc/analyzer/ChangeLog:
	* bar-chart.cc: New file.
	* bar-chart.h: New file.
	* engine.cc: Include "analyzer/bar-chart.h".
	(stats::log): Only log the m_num_nodes kinds that are non-zero.
	(stats::dump): Likewise when dumping.
	(stats::get_total_enodes): New.
	(exploded_graph::get_or_create_node): Increment the per-point-data
	m_excess_enodes when hitting the per-program-point limit on
	enodes.
	(exploded_graph::print_bar_charts): New.
	(exploded_graph::log_stats): Log the number of unprocessed enodes
	in the worklist.  Call print_bar_charts.
	(exploded_graph::dump_stats): Print the number of unprocessed
	enodes in the worklist.
	* exploded-graph.h (stats::get_total_enodes): New decl.
	(struct per_program_point_data): Add field m_excess_enodes.
	(exploded_graph::print_bar_charts): New decl.
	* supergraph.cc (superedge::dump): New.
	(superedge::dump): New.
	* supergraph.h (supernode::get_function): New.
	(superedge::dump): New decl.
	(superedge::dump): New decl.

committed Feb 26, 2020

testsuite: Add a -O2 -fgimple testcase next to the -O2 -fno-tree-dse one [PR93820] · ce25177f
```
2020-02-26  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/93820
	* gcc.dg/pr93820-2.c: New test.
```
Jakub Jelinek committed Feb 26, 2020

store-merging: Fix coalesce_immediate_stores [PR93820] · 4d213bf6

The following testcase is miscompiled in 8+.
The problem is that check_no_overlap has a special case for INTEGER_CST
marked stores (i.e. stores of constants), if both all currently merged stores
and the one under consideration for merging with them are marked that way,
it anticipates that other INTEGER_CST marked stores that overlap with those
and precede those (have smaller info->order) could be merged with those and
doesn't punt for them.
In PR86844 and PR87859 fixes I've then added quite large code that is
performed after check_no_overlap and tries to find out if we need and can
merge further INTEGER_CST marked stores, or need to punt.
Unfortunately, that code is there only in the overlapping case code and
the testcase below shows that we really need it even in the adjacent store
case.  After sort_by_bitpos we have:
bitpos	width	order	rhs_code
96	32	3	INTEGER_CST
128	32	1	INTEGER_CST
128	128	2	INTEGER_CST
192	32	0	MEM_REF
Because of the missing PR86844/PR87859-ish code in the adjacent store
case, we merge the adjacent (memory wise) stores 96/32/3 and 128/32/1,
and then we consider the 128-bit store which is in program-order in between
them, but in this case we punt, because the merging would extend the
merged store region from bitpos 96 and 64-bits to bitpos 96 and 160-bits
and that has an overlap with an incompatible store (the MEM_REF one).
The problem is that we can't really punt this way, because the 128-bit
store is in between those two we've merged already, so either we manage
to merge even that one together with the others, or would need to avoid
already merging the 96/32/3 and 128/32/1 stores together.
Now, rather than copying around the PR86844/PR87859 code to the other spot,
we can actually just use the overlapping code, merge_overlapping is really
a superset of merge_into, so that is what the patch does.  If doing
adjacent store merge for rhs_code other than INTEGER_CST, I believe the
current code is already fine, check_no_overlap in that case doesn't make
the exception and will punt if there is some earlier (smaller order)
non-mergeable overlapping store.  There is just one case that could be
problematic, if the merged_store has BIT_INSERT_EXPRs in them and the
new store is a constant store (INTEGER_CST rhs_code), then check_no_overlap
would do the exception and still would allow the special case.  But we
really shouldn't have the special case in that case, so this patch also
changes check_no_overlap to just have a bool whether we should have the
special case or not.
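
The region arithmetic in the example above can be checked with a small
model.  This is a sketch under stated assumptions, not code from
gimple-ssa-store-merging.c: bit_range, merge_ranges and ranges_overlap are
hypothetical names, and a store is reduced to a half-open bit range.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical model of a store as the bit range [start, start + width).
struct bit_range
{
  unsigned long start;
  unsigned long width;
};

inline unsigned long range_end(bit_range r) { return r.start + r.width; }

// Smallest range covering both inputs, as a merged store region would.
inline bit_range merge_ranges(bit_range a, bit_range b)
{
  unsigned long s = std::min(a.start, b.start);
  unsigned long e = std::max(range_end(a), range_end(b));
  return bit_range{s, e - s};
}

// True if the two half-open ranges share at least one bit.
inline bool ranges_overlap(bit_range a, bit_range b)
{
  return a.start < range_end(b) && b.start < range_end(a);
}
```

Merging 96/32 with 128/32 gives bitpos 96 and 64 bits, which does not touch
the 192/32 MEM_REF store.  Adding the 128/128 store extends the region to
bitpos 96 and 160 bits, which does overlap it.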

Note, as I said in the PR, for GCC 11 we could consider performing some kind
of cheap DSE during store merging (perhaps guarded with flag_tree_dse).
Another thing to consider is to treat as problematic only those
non-mergeable stores whose order is not just smaller than last_order, as
currently, but also larger than first_order.  In this testcase, if we
actually ignored the 192/32/0 store (did not merge it with anything at all),
then, because it is not in between the other stores we'd merge, it would be
fine to merge the other 3 stores.  Of course, the testcase can easily be
adjusted by putting the 192/32 store after the 128/32 store, and then this
patch would still be needed.  I think I'd need more time to think this over,
though.

2020-02-26  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/93820
	* gimple-ssa-store-merging.c (check_no_overlap): Change RHS_CODE
	argument to ALL_INTEGER_CST_P boolean.
	(imm_store_chain_info::try_coalesce_bswap): Adjust caller.
	(imm_store_chain_info::coalesce_immediate_stores): Likewise.  Handle
	adjacent INTEGER_CST store into merged_store->only_constants like
	overlapping one.

	* gcc.dg/pr93820.c: New test.

committed Feb 26, 2020

c++: Fix rejects-valid bug in cxx_eval_outermost_constant_expr [PR93905] · 5de338f0

Add a testcase for a bug that was present only on the 8 branch.

2020-02-26  Jakub Jelinek  <jakub@redhat.com>

	PR c++/93905
	* g++.dg/cpp0x/pr93905.C: New test.

committed Feb 26, 2020

Daily bump. · 07a0e380
GCC Administrator committed Feb 26, 2020

25 Feb, 2020 21 commits

typo fix: Fix probablity, becuse, sucessor and destinarion typos [PR93912] · 9c3da8cc

2020-02-25  Jakub Jelinek  <jakub@redhat.com>

	PR other/93912
	* config/sh/sh.c (expand_cbranchdi4): Fix comment typo, probablity
	-> probability.
	* cfghooks.c (verify_flow_info): Likewise.
	* predict.c (combine_predictions_for_bb): Likewise.
	* bb-reorder.c (connect_better_edge_p): Likewise.  Fix comment typo,
	sucessor -> successor.
	(find_traces_1_round): Fix comment typo, destinarion -> destination.
	* omp-expand.c (expand_oacc_for): Fix comment typo, sucessors ->
	successors.
	* tree-ssa-loop-ch.c (should_duplicate_loop_header_p): Fix dump
	message typo, sucessors -> successors.
c/
	* gimple-parser.c (c_parser_gimple_parse_bb_spec_edge_probability):
	Rename last argument from probablity to probability.

committed Feb 25, 2020

Correct an attribute access example. · 68f8763d
```
gcc/ChangeLog:
	* doc/extend.texi (attribute access): Correct an example.
```
Martin Sebor committed Feb 25, 2020

aarch64: Add bfloat16 vldn/vstn intrinsics · e603cd43

This patch adds the load/store bfloat16 intrinsics to the AArch64 back-end.
ACLE documents are at https://developer.arm.com/docs/101028/latest
ISA documents are at https://developer.arm.com/docs/ddi0596/latest

2020-02-25  Mihail Ionescu  <mihail.ionescu@arm.com>

gcc/
	* config/aarch64/aarch64-builtins.c (aarch64_scalar_builtin_types):
	Add simd_bf.
	(aarch64_init_simd_builtin_scalar_types): Register simd_bf.
	(VAR15, VAR16): New.
	* config/aarch64/iterators.md (VALLDIF): Enable for V4BF and V8BF.
	(VD): Enable for V4BF.
	(VDC): Likewise.
	(VQ): Enable for V8BF.
	(VQ2): Likewise.
	(VQ_NO2E): Likewise.
	(VDBL, Vdbl): Add V4BF.
	(V_INT_EQUIV, v_int_equiv): Add V4BF and V8BF.
	* config/aarch64/arm_neon.h (bfloat16x4x2_t): New typedef.
	(bfloat16x8x2_t): Likewise.
	(bfloat16x4x3_t): Likewise.
	(bfloat16x8x3_t): Likewise.
	(bfloat16x4x4_t): Likewise.
	(bfloat16x8x4_t): Likewise.
	(vcombine_bf16): New.
	(vld1_bf16, vld1_bf16_x2): New.
	(vld1_bf16_x3, vld1_bf16_x4): New.
	(vld1q_bf16, vld1q_bf16_x2): New.
	(vld1q_bf16_x3, vld1q_bf16_x4): New.
	(vld1_lane_bf16): New.
	(vld1q_lane_bf16): New.
	(vld1_dup_bf16): New.
	(vld1q_dup_bf16): New.
	(vld2_bf16): New.
	(vld2q_bf16): New.
	(vld2_dup_bf16): New.
	(vld2q_dup_bf16): New.
	(vld3_bf16): New.
	(vld3q_bf16): New.
	(vld3_dup_bf16): New.
	(vld3q_dup_bf16): New.
	(vld4_bf16): New.
	(vld4q_bf16): New.
	(vld4_dup_bf16): New.
	(vld4q_dup_bf16): New.
	(vst1_bf16, vst1_bf16_x2): New.
	(vst1_bf16_x3, vst1_bf16_x4): New.
	(vst1q_bf16, vst1q_bf16_x2): New.
	(vst1q_bf16_x3, vst1q_bf16_x4): New.
	(vst1_lane_bf16): New.
	(vst1q_lane_bf16): New.
	(vst2_bf16): New.
	(vst2q_bf16): New.
	(vst3_bf16): New.
	(vst3q_bf16): New.
	(vst4_bf16): New.
	(vst4q_bf16): New.

gcc/testsuite/
	* gcc.target/aarch64/advsimd-intrinsics/bf16_vstn.c: New test.
	* gcc.target/aarch64/advsimd-intrinsics/bf16_vldn.c: New test.

committed Feb 25, 2020

aarch64: Add bfloat16 vdup and vreinterpret ACLE intrinsics · 8ea6c1b8

This patch adds support for the bf16 duplicate and reinterpret intrinsics.
ACLE documents are at https://developer.arm.com/docs/101028/latest
ISA documents are at https://developer.arm.com/docs/ddi0596/latest

2020-02-25  Mihail Ionescu  <mihail.ionescu@arm.com>

gcc/
	* config/aarch64/iterators.md (VDQF_F16): Add V4BF and V8BF.
	(VALL_F16): Likewise.
	(VALLDI_F16): Likewise.
	(Vtype): Likewise.
	(Vetype): Likewise.
	(vswap_width_name): Likewise.
	(VSWAP_WIDTH): Likewise.
	(Vel): Likewise.
	(VEL): Likewise.
	(q): Likewise.
	* config/aarch64/arm_neon.h (vset_lane_bf16, vsetq_lane_bf16): New.
	(vget_lane_bf16, vgetq_lane_bf16): New.
	(vcreate_bf16): New.
	(vdup_n_bf16, vdupq_n_bf16): New.
	(vdup_lane_bf16, vdup_laneq_bf16): New.
	(vdupq_lane_bf16, vdupq_laneq_bf16): New.
	(vduph_lane_bf16, vduph_laneq_bf16): New.
	(vreinterpret_bf16_u8, vreinterpretq_bf16_u8): New.
	(vreinterpret_bf16_u16, vreinterpretq_bf16_u16): New.
	(vreinterpret_bf16_u32, vreinterpretq_bf16_u32): New.
	(vreinterpret_bf16_u64, vreinterpretq_bf16_u64): New.
	(vreinterpret_bf16_s8, vreinterpretq_bf16_s8): New.
	(vreinterpret_bf16_s16, vreinterpretq_bf16_s16): New.
	(vreinterpret_bf16_s32, vreinterpretq_bf16_s32): New.
	(vreinterpret_bf16_s64, vreinterpretq_bf16_s64): New.
	(vreinterpret_bf16_p8, vreinterpretq_bf16_p8): New.
	(vreinterpret_bf16_p16, vreinterpretq_bf16_p16): New.
	(vreinterpret_bf16_p64, vreinterpretq_bf16_p64): New.
	(vreinterpret_bf16_f16, vreinterpretq_bf16_f16): New.
	(vreinterpret_bf16_f32, vreinterpretq_bf16_f32): New.
	(vreinterpret_bf16_f64, vreinterpretq_bf16_f64): New.
	(vreinterpretq_bf16_p128): New.
	(vreinterpret_s8_bf16, vreinterpretq_s8_bf16): New.
	(vreinterpret_s16_bf16, vreinterpretq_s16_bf16): New.
	(vreinterpret_s32_bf16, vreinterpretq_s32_bf16): New.
	(vreinterpret_s64_bf16, vreinterpretq_s64_bf16): New.
	(vreinterpret_u8_bf16, vreinterpretq_u8_bf16): New.
	(vreinterpret_u16_bf16, vreinterpretq_u16_bf16): New.
	(vreinterpret_u32_bf16, vreinterpretq_u32_bf16): New.
	(vreinterpret_u64_bf16, vreinterpretq_u64_bf16): New.
	(vreinterpret_p8_bf16, vreinterpretq_p8_bf16): New.
	(vreinterpret_p16_bf16, vreinterpretq_p16_bf16): New.
	(vreinterpret_p64_bf16, vreinterpretq_p64_bf16): New.
	(vreinterpret_f32_bf16, vreinterpretq_f32_bf16): New.
	(vreinterpret_f64_bf16, vreinterpretq_f64_bf16): New.
	(vreinterpret_f16_bf16, vreinterpretq_f16_bf16): New.
	(vreinterpretq_p128_bf16): New.

gcc/testsuite/
	* gcc.target/aarch64/advsimd-intrinsics/bf16_dup.c: New test.
	* gcc.target/aarch64/advsimd-intrinsics/bf16_reinterpret.c: New test.

committed Feb 25, 2020

libstdc++: LWG 3397 basic_istream_view::iterator should not provide iterator_category · 76a8c0f6

libstdc++-v3/ChangeLog:

	LWG 3397 basic_istream_view::iterator should not provide
	iterator_category
	* include/std/ranges (basic_istream_view::_Iterator::iterator_category):
	Rename to ...
	(basic_istream_view::_Iterator::iterator_concept): ... this.
	* testsuite/std/ranges/istream_view.cc: Augment test.

committed Feb 25, 2020

libstdc++: LWG 3325 Constrain return type of transformation function for transform_view · ec15da7c

libstdc++-v3/ChangeLog:

	LWG 3325 Constrain return type of transformation function for
	transform_view
	* include/std/ranges (transform_view): Constrain the return type of the
	transformation function as per LWG 3325.
	* testsuite/std/ranges/adaptors/lwg3325_neg.cc: New test.

committed Feb 25, 2020

libstdc++: LWG 3313 join_view::iterator::operator-- is incorrectly constrained · 55c4b3f4

libstdc++-v3/ChangeLog:

	LWG 3313 join_view::_Iterator::operator-- is incorrectly constrained
	* include/std/ranges (join_view::_Iterator::operator--): Require that
	range_reference_t<_Base> models common_range.
	* testsuite/std/ranges/adaptors/lwg3313_neg.cc: New test.

committed Feb 25, 2020

libstdc++: LWG 3301 transform_view::iterator has incorrect iterator_category · 510bd1c1

libstdc++-v3/ChangeLog:

	LWG 3301 transform_view::_Iterator has incorrect iterator_category
	* include/std/ranges (transform_view::_Iterator::_S_iter_cat): Adjust
	determination of iterator_category as per LWG 3301.
	* testsuite/std/ranges/adaptors/transform.cc: Augment test.

committed Feb 25, 2020

libstdc++: LWG 3292 iota_view is under-constrained · 7f0f1083

libstdc++-v3/ChangeLog:

	LWG 3292 iota_view is under-constrained
	* include/std/ranges (iota_view): Require that _Winc models semiregular
	  as per LWG 3292.
	* testsuite/std/ranges/iota/lwg3292_neg.cc: New test.

committed Feb 25, 2020

arm: ACLE intrinsics for bfloat16 dot product · eb7ba6c3

This patch is part of a series adding support for Armv8.6-A features.
It adds intrinsics for brain half-precision floating-point (BF16) dot
instructions with AdvSIMD support.

gcc/ChangeLog:

2020-02-25  Dennis Zhang  <dennis.zhang@arm.com>

	* config/arm/arm_neon.h (vbfdot_f32, vbfdotq_f32): New.
	(vbfdot_lane_f32, vbfdotq_laneq_f32): New.
	(vbfdot_laneq_f32, vbfdotq_lane_f32): New.
	* config/arm/arm_neon_builtins.def (vbfdot): New entry.
	(vbfdot_lanev4bf, vbfdot_lanev8bf): Likewise.
	* config/arm/iterators.md (VSF2BF): New attribute.
	* config/arm/neon.md (neon_vbfdot<VCVTF:mode>): New entry.
	(neon_vbfdot_lanev4bf<VCVTF:mode>): Likewise.
	(neon_vbfdot_lanev8bf<VCVTF:mode>): Likewise.

gcc/testsuite/ChangeLog:

2020-02-25  Dennis Zhang  <dennis.zhang@arm.com>

	* gcc.target/arm/simd/bf16_dot_1.c: New test.
	* gcc.target/arm/simd/bf16_dot_2.c: New test.
	* gcc.target/arm/simd/bf16_dot_3.c: New test.

committed Feb 25, 2020

libstdc++: Remove __memmove wrapper for constexpr algorithms · 490350a1

The mutating sequence algorithms std::copy, std::copy_backward,
std::move and std::move_backward conditionally use __builtin_memmove
for trivially copyable types. However, because memmove isn't usable in
constant expressions the use of __builtin_memmove is wrapped in a
__memmove function which replaces __builtin_memmove with a handwritten
loop when std::is_constant_evaluated() is true.

This means we have a manual loop for non-trivially copyable cases, and a
different manual loop for trivially copyable but constexpr cases. The
latter loop has incorrect semantics for the {copy,move}_backward cases
and so isn't used for them. Until earlier today the latter loop also had
incorrect semantics for the std::move cases, trying to move from const
rvalues.

The approach taken by this patch is to remove the __memmove function
entirely and use the original (and correct) manual loops for the
constexpr cases as well as the non-trivially copyable cases. This was
already done for move_backward and copy_backward, but was incorrectly
turning copy_backward into move_backward, by failing to use the _IsMove
constant to select the right specialization. This patch also fixes that.
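
The run-time half of the dispatch described above can be sketched as
follows.  This is a simplified illustration, not the libstdc++
implementation: copy_sketch, copy_dispatch and copy_loop are hypothetical
names, only pointer ranges are handled, and the constant-evaluation check is
left out (after this patch, constant evaluation takes the element-wise loop).

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <type_traits>

// Element-wise loop: valid for any copy-assignable type.
template<typename T>
T* copy_loop(const T* first, const T* last, T* out)
{
  for (; first != last; ++first, ++out)
    *out = *first;
  return out;
}

// Trivially copyable types: a single memmove of the whole range.
template<typename T>
T* copy_dispatch(const T* first, const T* last, T* out, std::true_type)
{
  const std::size_t n = static_cast<std::size_t>(last - first);
  if (n != 0)
    std::memmove(out, first, n * sizeof(T));
  return out + n;
}

// Everything else: fall back to the element-wise loop.
template<typename T>
T* copy_dispatch(const T* first, const T* last, T* out, std::false_type)
{
  return copy_loop(first, last, out);
}

// Hypothetical entry point selecting the strategy from the element type.
template<typename T>
T* copy_sketch(const T* first, const T* last, T* out)
{
  return copy_dispatch(first, last, out, std::is_trivially_copyable<T>{});
}
```

Keeping one loop for both the constexpr and the non-trivially-copyable
cases means there is only one set of copy or move semantics to get right.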

	* include/bits/ranges_algobase.h (__copy_or_move): Do not use memmove
	during constant evaluation. Call __builtin_memmove directly instead of
	__memmove.
	(__copy_or_move_backward): Likewise.
	* include/bits/stl_algobase.h (__memmove): Remove.
	(__copy_move<M, true, random_access_iterator_tag>::__copy_m)
	(__copy_move_backward<M, true, random_access_iterator_tag>::__copy_m):
	Use __builtin_memmove directly instead of __memmove.
	(__copy_move_a2): Do not use memmove during constant evaluation.
	(__copy_move_backward_a2): Use _IsMove constant to select correct
	__copy_move_backward specialization.
	* testsuite/25_algorithms/copy_backward/constexpr.cc: Check for copies
	being turned into moves during constant evaluation.

committed Feb 25, 2020

Fix ChangeLog date · dfb93d05
Jonathan Wakely committed Feb 25, 2020

[ARM] Fix -mpure-code for v6m · a71f2193

When running the testsuite with -fdisable-rtl-fwprop2 and -mpure-code
for cortex-m0, I noticed that some testcases were failing because we
still generate "ldr rX, .LCY", which is what we want to avoid with
-mpure-code. This is latent since a recent improvement in fwprop
(PR88833).

In this patch I change the thumb1_movsi_insn pattern so that it emits
the desired instruction sequence when arm_disable_literal_pool is set.

To achieve that, I introduce a new required_for_purecode attribute to
enable the corresponding alternative in thumb1_movsi_insn and take the
actual instruction sequence length into account.

gcc/ChangeLog:

2020-02-13  Christophe Lyon  <christophe.lyon@linaro.org>

	* config/arm/arm.md (required_for_purecode): New attribute.
	(enabled): Handle required_for_purecode.
	* config/arm/thumb1.md (thumb1_movsi_insn): Add alternative to
	work with -mpure-code.

committed Feb 25, 2020

combine: Fix find_split_point handling of constant store into ZERO_EXTRACT [PR93908] · 73dc4ae4

git is miscompiled on s390x-linux with -O2 -march=zEC12 -mtune=z13.
I've managed to reduce it into the following testcase.  The problem is that
during combine we see the s->k = -1; bitfield store and change the SET_SRC
from a pseudo into a constant:
(set (zero_extract:DI (mem/j:HI (plus:DI (reg/v/f:DI 60 [ s ])
                (const_int 10 [0xa])) [0 +0 S2 A16])
        (const_int 2 [0x2])
        (const_int 7 [0x7]))
    (const_int -1 [0xffffffffffffffff]))
This on s390x with the above option isn't recognized as a valid instruction,
so find_split_point decides to handle it as IOR or IOR/AND.
src is -1, mask is 3 and pos is 7.
src != mask (this is also incorrect, we want to set all (both) bits in the
bitfield), so we go for IOR/AND, but instead of trying
mem = (mem & ~0x180) | ((-1 << 7) & 0x180)
we actually try
mem = (mem & ~0x180) | (-1 << 7)
and that is further simplified into:
mem = mem | (-1 << 7)
aka
mem = mem | 0xff80
which doesn't set just the 2-bit bitfield, but also many other bitfields
that shouldn't be touched.
We really should do:
mem = mem | 0x180
instead.
The problem is that we assume that no bits other than the low len (2 here)
bits will be set in the SET_SRC, but nothing guarantees that; we should
simply ignore the other bits.

The following patch fixes it by masking src with mask. This way the
src == mask test already does the right thing, and because the code for
or_mask uses gen_int_mode, the result is properly sign-extended if the most
significant bit is set after shifting left by pos.
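
The arithmetic above can be reproduced outside the compiler. The following is a minimal sketch, not the code in combine.c: the function names `store_unmasked` and `store_masked` are hypothetical, and a 16-bit memory word with a 2-bit field at position 7 is assumed to match the example.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; not taken from combine.c.

// Buggy variant: ORs in the shifted source without masking it first,
// so a source of -1 sets every bit from pos upwards.
inline std::uint16_t store_unmasked(std::uint16_t mem, std::int64_t src,
                                    unsigned len, unsigned pos) {
  const std::uint64_t mask = (std::uint64_t{1} << len) - 1;  // 3 for len == 2
  const std::uint64_t field = mask << pos;                   // 0x180 for pos == 7
  return static_cast<std::uint16_t>(
      (mem & ~field) | (static_cast<std::uint64_t>(src) << pos));
}

// Fixed variant: masks the source down to the low len bits before shifting,
// so only the bitfield itself can change.
inline std::uint16_t store_masked(std::uint16_t mem, std::int64_t src,
                                  unsigned len, unsigned pos) {
  const std::uint64_t mask = (std::uint64_t{1} << len) - 1;
  const std::uint64_t field = mask << pos;
  const std::uint64_t bits = static_cast<std::uint64_t>(src) & mask;
  return static_cast<std::uint16_t>((mem & ~field) | (bits << pos));
}
```

With mem = 0, src = -1, len = 2 and pos = 7, the unmasked variant yields 0xff80 while the masked variant yields 0x180, which are the two values discussed in the message.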

2020-02-25  Jakub Jelinek  <jakub@redhat.com>

	PR rtl-optimization/93908
	* combine.c (find_split_point): For store into ZERO_EXTRACT, and src
	with mask.

	* gcc.c-torture/execute/pr93908.c: New test.

committed Feb 25, 2020


libstdc++: Add test accidentally left out of previous commit · 6de946e6
```
	* testsuite/25_algorithms/move_backward/93872.cc: Add test left out of
	previous commit.
```
Jonathan Wakely committed Feb 25, 2020

libstdc++: Fix regression in std::move algorithm (PR 93872) · 5b904f17

The std::move and std::move_backward algorithms dispatch to the
std::__memmove helper when appropriate. That function uses a
pointer-to-const for the source values, preventing them from being
moved. The two callers of that function have the same problem.

Rather than altering __memmove and its callers to work with const or
non-const source pointers, this takes a more conservative approach of
casting away the const at the point where we want to do a move
assignment. This relies on the fact that we only use __memmove when the
type is trivially copyable, so we know the move assignment doesn't alter
the source anyway.
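
A minimal sketch of the idea, assuming a hypothetical element-wise helper `move_range_sketch` and a hypothetical type `MoveOnly`; this is not the libstdc++ `__memmove` implementation. It shows why a pointer-to-const source prevents move assignment, and how casting away const restores it for a trivially copyable type.

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>

// Hypothetical type: trivially copyable, but only move-assignable.
struct MoveOnly {
  int value = 0;
  MoveOnly() = default;
  MoveOnly(const MoveOnly&) = delete;
  MoveOnly& operator=(const MoveOnly&) = delete;
  MoveOnly(MoveOnly&&) = default;
  MoveOnly& operator=(MoveOnly&&) = default;
};
static_assert(std::is_trivially_copyable<MoveOnly>::value,
              "the sketch assumes a trivially copyable element type");

// Hypothetical helper. The source is taken as pointer-to-const, so
// std::move(*src) alone would produce a const rvalue and select the
// (deleted) copy assignment. Casting away const selects move assignment;
// for a trivially copyable type that assignment leaves the source intact.
template <typename T>
void move_range_sketch(T* dst, const T* src, std::size_t n) {
  static_assert(std::is_trivially_copyable<T>::value,
                "only intended for trivially copyable types");
  for (; n > 0; --n, ++src, ++dst)
    *dst = std::move(*const_cast<T*>(src));
}
```

The cast is only well-defined here because the objects behind `src` are not themselves const; the sketch relies on callers passing a range of non-const objects.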

	PR libstdc++/93872
	* include/bits/stl_algobase.h (__memmove): Cast away const before
	doing move assignment.
	* testsuite/25_algorithms/move/93872.cc: New test.
	* testsuite/25_algorithms/move_backward/93872.cc: New test.

committed Feb 25, 2020


Fix link failure with debug info in LTO mode · 2877ad9a

This fixes a regression whereby the program fails to link with debug
info in LTO mode because of an undefined reference to a symbol coming
from the object files containing the early debug info.

	* dwarf2out.c (dwarf2out_size_function): Run in early-DWARF mode.

committed Feb 25, 2020


testcase for last_vuse in FRE · 81ef67c1

This adds a testcase for some basic FRE functionality.

2020-02-25  Richard Biener  <rguenther@suse.de>

	* gcc.dg/tree-ssa/ssa-fre-86.c: New testcase.

committed Feb 25, 2020


doc: minor --enable-checking wording fixes · 8bc6d0a2
```
gcc/ChangeLog:
	* doc/install.texi (--enable-checking): Adjust wording.
```
Roman Zhuykov committed Feb 25, 2020

tree-optimization/93868 copy SLP tree before re-arranging stmts · 81c833b3

This avoids altering possibly shared SLP subtrees when attempting
to get rid of permutations in SLP reductions by copying the SLP
subtree before re-arranging stmts in it.
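
The copy-before-mutate idea can be sketched generically. Everything below is hypothetical (the `TreeNode` type, `copy_subtree`, `rearrange` and `rearranged_copy`) and is not the code in tree-vect-slp.c; in particular this sketch does not try to preserve sharing inside the copy, and "rearranging" is reduced to reversing a vector of integers.

```cpp
#include <algorithm>
#include <memory>
#include <vector>

// Hypothetical stand-in for a tree node whose children may be shared
// between several parents.
struct TreeNode {
  std::vector<int> stmts;
  std::vector<std::shared_ptr<TreeNode>> children;
};

// Deep-copies a subtree so that the copy shares no nodes with the original.
inline std::shared_ptr<TreeNode> copy_subtree(
    const std::shared_ptr<TreeNode>& node) {
  if (!node) return nullptr;
  auto copy = std::make_shared<TreeNode>();
  copy->stmts = node->stmts;
  for (const auto& child : node->children)
    copy->children.push_back(copy_subtree(child));
  return copy;
}

// Placeholder for the rearrangement: reverses the stmts of every node.
inline void rearrange(TreeNode& node) {
  std::reverse(node.stmts.begin(), node.stmts.end());
  for (auto& child : node.children)
    if (child) rearrange(*child);
}

// Rearranges a private copy, leaving possibly shared nodes untouched.
inline std::shared_ptr<TreeNode> rearranged_copy(
    const std::shared_ptr<TreeNode>& shared_root) {
  auto private_root = copy_subtree(shared_root);
  if (private_root) rearrange(*private_root);
  return private_root;
}
```

If `rearrange` were applied to the original root instead, any other parent holding the same child would observe the reordered statements.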

2020-02-25  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/93868
	* tree-vect-slp.c (slp_copy_subtree): New function.
	(vect_attempt_slp_rearrange_stmts): Copy the SLP tree before
	re-arranging stmts in it.

	* gcc.dg/torture/pr93868.c: New testcase.

committed Feb 25, 2020


pass_manager: Fix ICE with -fdump-passes -fdisable-tree-* [PR93874] · 2473c81c

dump_passes pushes a dummy function for which it evaluates the gates
and checks whether the pass is enabled or disabled.
Unfortunately, if any -fdisable-*-*/-fenable-*-* options were seen,
we ICE during is_pass_explicitly_enabled_or_disabled because slot
is non-NULL then and the code will do:
  cgraph_uid = func ? cgraph_node::get (func)->get_uid () : 0;
but the dummy function doesn't have a cgraph node.

So, either we need to create and then remove a cgraph node for the dummy
function, as the following patch does, or function.c would need to export the
in_dummy_function flag (or have some way to query that flag from other TUs)
and we'd need to check it in is_pass_explicitly_enabled_or_disabled.
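
The first option can be illustrated with a small stand-alone model. All names below (`Func`, `Node`, `NodeTable`, `uid_for`, `uid_for_dummy`) are hypothetical and only mimic the shape of the quoted expression; none of this is GCC's cgraph API.

```cpp
#include <unordered_map>

// Hypothetical stand-ins for a function declaration and its call-graph node.
struct Func { int id; };
struct Node { int uid; };

class NodeTable {
  std::unordered_map<const Func*, Node> nodes_;
  int next_uid_ = 1;

 public:
  // Returns nullptr when no node has been created for f.
  Node* get(const Func* f) {
    auto it = nodes_.find(f);
    return it == nodes_.end() ? nullptr : &it->second;
  }
  // Returns the existing node for f, creating one if needed.
  Node* get_create(const Func* f) {
    auto res = nodes_.emplace(f, Node{next_uid_});
    if (res.second) ++next_uid_;
    return &res.first->second;
  }
  void remove(const Func* f) { nodes_.erase(f); }
};

// Same shape as the quoted expression: it assumes get() succeeds whenever
// func is non-null, so a function without a node is a null dereference.
inline int uid_for(NodeTable& table, const Func* func) {
  return func ? table.get(func)->uid : 0;
}

// Create a node for the dummy function, run the query, then remove the
// node again so the table is left as it was.
inline int uid_for_dummy(NodeTable& table, const Func* dummy) {
  table.get_create(dummy);
  const int uid = uid_for(table, dummy);
  table.remove(dummy);
  return uid;
}
```

The alternative described above would instead add a check inside the query itself, at the cost of exposing the dummy-function state to other translation units.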

2020-02-25  Jakub Jelinek  <jakub@redhat.com>

	PR middle-end/93874
	* passes.c (pass_manager::dump_passes): Create a cgraph node for the
	dummy function and remove it at the end.

	* gcc.dg/pr93874.c: New test.

committed Feb 25, 2020
