1. 24 Feb, 2020 8 commits
    • libstdc++: Add default_sentinel support to stream iterators · 120e8734
      Missing pieces of P0896R4 "The One Ranges Proposal" for C++20.
      
      	* include/bits/stream_iterator.h (istream_iterator(default_sentinel_t)):
      	Add constructor.
      	(operator==(istream_iterator, default_sentinel_t)): Add operator.
      	(ostream_iterator::difference_type): Define to ptrdiff_t for C++20.
      	* include/bits/streambuf_iterator.h
      	(istreambuf_iterator(default_sentinel_t)): Add constructor.
      	(operator==(istreambuf_iterator, default_sentinel_t)): Add operator.
      	* testsuite/24_iterators/istream_iterator/cons/sentinel.cc:
      	New test.
      	* testsuite/24_iterators/istream_iterator/sentinel.cc: New test.
      	* testsuite/24_iterators/istreambuf_iterator/cons/sentinel.cc:
      	New test.
      	* testsuite/24_iterators/istreambuf_iterator/sentinel.cc: New test.
      Jonathan Wakely committed
    • PR78353: Fix testcases · e03069be
      Skip the test if armv7-a is not supported at link time. This is the case
      when the toolchain targets an M-profile CPU by default and has no
      A-profile multilib: the link step fails because it tries to mix
      M-profile startup files with the A-profile testcase.
      
      2020-02-24  Christophe Lyon  <christophe.lyon@linaro.org>
      
      	PR lto/78353
      	* gcc.target/arm/pr78353-1.c: Add arm_arch_v7a_multilib effective
      	target.
      	* gcc.target/arm/pr78353-2.c: Likewise.
      Christophe Lyon committed
    • libstdc++: enable_view has false positives (LWG 3326) · 3841739c
      	* include/std/ranges (__deep_const_range, __enable_view_impl): Remove.
      	(ranges::enable_view): Simplify (LWG 3326).
      	* include/bits/range_access.h (ranges::enable_view): Declare.
      	* include/bits/regex.h (__enable_view_impl): Remove partial
      	specialization.
      	* include/bits/stl_multiset.h (__enable_view_impl): Likewise.
      	* include/bits/stl_set.h (__enable_view_impl): Likewise.
      	* include/bits/unordered_set.h (__enable_view_impl): Likewise.
      	* include/debug/multiset.h (__enable_view_impl): Likewise.
      	* include/debug/set.h (__enable_view_impl): Likewise.
      	* include/debug/unordered_set (__enable_view_impl): Likewise.
      	* include/experimental/string_view (ranges::enable_view): Define
      	partial specialization.
      	* include/std/span (ranges::enable_view): Likewise.
      	* include/std/string_view (ranges::enable_view): Likewise.
      	* testsuite/std/ranges/view.cc: Check satisfaction of updated concept.
      Jonathan Wakely committed
    • sccvn: Handle bitfields in push_partial_def [PR93582] · 7f5617b0
      The following patch adds support for bitfields to push_partial_def.
      Previously pd.offset and pd.size were counted in bytes and maxsizei
      in bits; now everything is counted in bits.
      
      Not really sure how much of the further code can be outlined and moved;
      e.g. the full def and partial def code don't have much of anything in
      common (the partial defs case basically has some load bit range and a
      set of store bit ranges that at least partially overlap, and we need to
      handle all the different cases: negative or non-negative pd.offset,
      little vs. big endian, a size so small that we need to preserve the
      original bits on both sides of the byte, a size that fits or is too
      large).  Perhaps the storing of some value into the middle of an
      existing buffer (i.e. what push_partial_def now does in the loop) could
      be shared, but the candidate for sharing would most likely be
      store-merging rather than the other spots in sccvn, and I think it is
      better not to touch store-merging at this stage.
      
      Yes, I've thought about trying to do everything in place, but the code
      is already quite hard to understand and get right, and if we tried to
      do the optimization on the fly it would need more special cases and,
      for gcov coverage, more testcases to cover it.  Most of the time the
      sizes will be small.  Furthermore, for bitfields native_encode_expr
      actually stores the number of bytes in the mode rather than, say, the
      actual bitsize rounded up to bytes, so it wouldn't be just a matter of
      saving/restoring bytes at the start and end; we might need up to 7
      further bytes, e.g. for __int128 bitfields.  Perhaps we could have just
      a fast path for the case where everything is byte aligned (and, for
      integral types, the mode bitsize equals the size too)?
      
      2020-02-24  Jakub Jelinek  <jakub@redhat.com>
      
      	PR tree-optimization/93582
      	* tree-ssa-sccvn.c (vn_walk_cb_data::push_partial_def): Consider
      	pd.offset and pd.size to be counted in bits rather than bytes, add
      	support for maxsizei that is not a multiple of BITS_PER_UNIT and
      	handle bitfield stores and loads.
      	(vn_reference_lookup_3): Don't call ranges_known_overlap_p with
      	uncomparable quantities - bytes vs. bits.  Allow push_partial_def
      	on offsets/sizes that aren't multiple of BITS_PER_UNIT and adjust
      	pd.offset/pd.size to be counted in bits rather than bytes.
      	Formatting fix.  Rename shadowed len variable to buflen.
      
      	* gcc.dg/tree-ssa/pr93582-4.c: New test.
      	* gcc.dg/tree-ssa/pr93582-5.c: New test.
      	* gcc.dg/tree-ssa/pr93582-6.c: New test.
      	* gcc.dg/tree-ssa/pr93582-7.c: New test.
      	* gcc.dg/tree-ssa/pr93582-8.c: New test.
      Jakub Jelinek committed
    • OpenACC tile clause – apply exit/cycle checks (PR 93552) · 2bd8c3ff
              PR fortran/93552
              * match.c (match_exit_cycle): With OpenACC, check the kernels loop
              directive and tile clause as well.
      
              PR fortran/93552
              * gfortran.dg/goacc/tile-4.f90: New.
      Tobias Burnus committed
    • PR47785: Add support for handling Xassembler/Wa options with LTO. · f1a681a1
      2020-02-24  Prathamesh Kulkarni  <prathamesh.kulkarni@linaro.org>
      	    Kugan Vivekandarajah  <kugan.vivekanandarajah@linaro.org>
      
      	PR driver/47785
      	* gcc.c (putenv_COLLECT_AS_OPTIONS): New function.
      	(driver::main): Call putenv_COLLECT_AS_OPTIONS.
      	* opts-common.c (parse_options_from_collect_gcc_options): New function.
      	(prepend_xassembler_to_collect_as_options): Likewise.
      	* opts.h (parse_options_from_collect_gcc_options): Declare prototype.
      	(prepend_xassembler_to_collect_as_options): Likewise.
      	* lto-opts.c (lto_write_options): Stream assembler options
      	in COLLECT_AS_OPTIONS.
      	* lto-wrapper.c (xassembler_options_error): New static variable.
      	(get_options_from_collect_gcc_options): Move parsing options code to
      	parse_options_from_collect_gcc_options and call it.
      	(merge_and_complain): Validate -Xassembler options.
      	(append_compiler_options): Handle OPT_Xassembler.
      	(run_gcc): Append command line -Xassembler options to
      	collect_gcc_options.
      	* doc/invoke.texi: Add documentation about using Xassembler
      	options with LTO.
      
      testsuite/
      	* gcc.target/arm/pr78353-1.c: New test.
      	* gcc.target/arm/pr78353-2.c: Likewise.
      Prathamesh Kulkarni committed
    • RISC-V: Adjust floating point code gen for LTGT compare · 9069e948
       - Using gcc.dg/torture/pr91323.c as the testcase, so no new testcase
         is introduced.

       - We previously used 3 eq compares for an LTGT compare, in order to
         prevent exception flags being set when any input is NaN.

       - According to the latest GCC documentation for LTGT and the
         discussion on pr91323, LTGT should signal on NaNs, like GE/GT/LE/LT.

       - So we now expand (LTGT a b) to ((LT a b) | (GT a b)) to match the
         documentation.

       - Tested rv64gc/rv32gc bare-metal/linux on qemu and
         rv64gc on a HiFive Unleashed board with linux.
      
      ChangeLog
      
      gcc/
      
      Kito Cheng  <kito.cheng@sifive.com>
      
      	* config/riscv/riscv.c (riscv_emit_float_compare): Change the code gen
      	for LTGT.
      	(riscv_rtx_costs): Update cost model for LTGT.
      Kito Cheng committed
    • Daily bump. · c7bfe1aa
      GCC Administrator committed
  2. 23 Feb, 2020 5 commits
  3. 22 Feb, 2020 4 commits
  4. 21 Feb, 2020 23 commits
    • Fix handling of floating-point homogeneous aggregates. · 01af7e0a
      	2020-02-21  John David Anglin  <danglin@gcc.gnu.org>
      
      	* gcc/config/pa/pa.c (pa_function_value): Fix check for word and
      	double-word size when handling aggregate return values.
      	* gcc/config/pa/som.h (ASM_DECLARE_FUNCTION_NAME): Fix to indicate
      	that homogeneous SFmode and DFmode aggregates are passed and returned
      	in general registers.
      John David Anglin committed
    • i18n: Fix translation of --help [PR93759] · 8d1780b5
      The first two hunks make sure we actually translate what has been marked
      for translation, i.e. the cl_options[...].help strings, rather than those
      strings amended in various ways, like:
      _("%s  Same as %s."), help, ...
      or
      "%s  %s", help, _(use_diagnosed_msg)
      
      The exgettext changes attempt to make sure that the cl_options[...].help
      strings are marked as no-c-format, because otherwise if they happen
      to contain a % character, such as the 90% substring, they will be marked
      as c-format, which they aren't.
      
      2020-02-21  Jakub Jelinek  <jakub@redhat.com>
      
      	PR translation/93759
      	* opts.c (print_filtered_help): Translate help before appending
      	messages to it rather than after that.
      
      	* exgettext: For *.opt help texts, use __opt_help_text("...")
      	rather than _("...") in the $emsg file and pass options that
      	say that this implies no-c-format.
      Jakub Jelinek committed
    • lra: Stop registers being incorrectly marked live v2 [PR92989] · d11676de
      This PR is about a case in which the clobbers at the start of
      an EH receiver can lead to registers becoming unnecessarily
      live in predecessor blocks.  My first attempt at fixing this
      made sure that we update the bb liveness info based on the
      real live set:
      
        http://gcc.gnu.org/g:e648e57efca6ce6d751ef8c2038608817b514fb4
      
      But it turns out that the clobbered registers were also added to
      the "gen" set of LRA's private liveness problem, where "gen" in
      this context means "generates a requirement for a live value".
      So the clobbered registers could still end up live via that
      mechanism instead.
      
      This patch therefore reverts the patch above and takes the other
      approach floated in the original patch description: model the full
      clobber by making the registers live and then dead again.
      
      There's no specific need to revert the original patch, since the
      code should no longer be sensitive to the order of the bb liveness
      update and the modelling of the clobber.  But given that there's
      no specific need to keep the original patch either, it seemed better
      to restore the code to the more well-tested order.
      
      Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?
      
      Richard
      
      2020-02-19  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
	PR rtl-optimization/92989
      	* lra-lives.c (process_bb_lives): Restore the original order
      	of the bb liveness update.  Call make_hard_regno_dead for each
      	register clobbered at the start of an EH receiver.
      Richard Sandiford committed
    • Do not propagate self-dependent value (PR ipa/93763) (ChangeLog) · 25f0909a
                  PR ipa/93763
                  * ipa-cp.c (self_recursively_generated_p): Mark self-dependent value as
                  self-recursively generated.
      Jeff Law committed
    • Do not propagate self-dependent value (PR ipa/93763) · 47772af1
              PR ipa/93763
              * ipa-cp.c (self_recursively_generated_p): Mark self-dependent value as
              self-recursively generated.
      Feng Xue committed
    • Darwin: Fix wrong quoting on an error string (PR93860). · 147add96
      The quotes should surround all of the literal content from the pragma
      that has incorrect usage.
      
      2020-02-21  Iain Sandoe  <iain@sandoe.co.uk>
      
      PR target/93860
      * config/darwin-c.c (pop_field_alignment): Adjust quoting of
      error string.
      Iain Sandoe committed
    • PR c++/93753 - ICE on a flexible array followed by a member in an anonymous… · dbfba41e
      PR c++/93753 - ICE on a flexible array followed by a member in an anonymous struct with an initializer
      
      gcc/cp/ChangeLog:
      
	PR c++/93753
      	* class.c (check_flexarrays): Tighten up a test for potential members
      	of anonymous structs or unions.
      
      gcc/testsuite/ChangeLog:
      
	PR c++/93753
      	* g++.dg/ext/flexary36.C: New test.
      	* g++.dg/lto/pr93166_0.C: Make struct with flexarray valid.
      Martin Sebor committed
    • libstdc++: Define <=> for tuple, optional and variant · 9e589880
      Another piece of P1614R2.
      
      	* include/std/optional (operator<=>(optional<T>, optional<U>))
      	(operator<=>(optional<T>, nullopt), operator<=>(optional<T>, U)):
      	Define for C++20.
      	* include/std/tuple (__tuple_cmp): New helper function for <=>.
      	(operator<=>(tuple<T...>, tuple<U>...)): Define for C++20.
      	* include/std/variant (operator<=>(variant<T...>, variant<T...>))
      	(operator<=>(monostate, monostate)): Define for C++20.
      	* testsuite/20_util/optional/relops/three_way.cc: New test.
      	* testsuite/20_util/tuple/comparison_operators/three_way.cc: New test.
      	* testsuite/20_util/variant/89851.cc: Move to ...
      	* testsuite/20_util/variant/relops/89851.cc: ... here.
      	* testsuite/20_util/variant/90008.cc: Move to ...
      	* testsuite/20_util/variant/relops/90008.cc: ... here.
      	* testsuite/20_util/variant/relops/three_way.cc: New test.
      Jonathan Wakely committed
    • [PATCH, GCC/ARM] Add MVE target check to sourcebuild.texi · 131fbdd7
      Follow up to: https://gcc.gnu.org/ml/gcc-patches/2020-02/msg01109.html
      
      Committed as obvious.
      
      gcc/ChangeLog:
      
      2020-02-21  Mihail Ionescu  <mihail.ionescu@arm.com>
      
      	* doc/sourcebuild.texi (arm_v8_1m_mve_ok):
      	Document new target supports option.
      Mihail Ionescu committed
    • arm: ACLE I8MM multiply-accumulate · 436016f4
      This patch adds intrinsics for matrix multiply-accumulate instructions
      including vmmlaq_s32, vmmlaq_u32, and vusmmlaq_s32.
      
      gcc/ChangeLog:
      
      2020-02-21  Dennis Zhang  <dennis.zhang@arm.com>
      
      	* config/arm/arm_neon.h (vmmlaq_s32, vmmlaq_u32, vusmmlaq_s32): New.
      	* config/arm/arm_neon_builtins.def (smmla, ummla, usmmla): New.
      	* config/arm/iterators.md (MATMUL): New iterator.
      	(sup): Add UNSPEC_MATMUL_S, UNSPEC_MATMUL_U, and UNSPEC_MATMUL_US.
      	(mmla_sfx): New attribute.
      	* config/arm/neon.md (neon_<sup>mmlav16qi): New.
      	* config/arm/unspecs.md (UNSPEC_MATMUL_S, UNSPEC_MATMUL_U): New.
      	(UNSPEC_MATMUL_US): New.
      
      gcc/testsuite/ChangeLog:
      
      2020-02-21  Dennis Zhang  <dennis.zhang@arm.com>
      
      	* gcc.target/arm/simd/vmmla_1.c: New test.
      Dennis Zhang committed
    • testsuite: Add -fcommon to gcc.target/i386/pr69052.c · b59506cd
      This testcase is susceptible to memory location details and starts to
      fail with the default of -fno-common.  Use -fcommon to set the expected
      testing conditions.
      
      	* gcc.target/i386/pr69052.c: Require target ia32.
      	(dg-options): Add -fcommon and remove -pie.
      Uros Bizjak committed
    • [PATCH, GCC/ARM] Fix MVE scalar shift tests · bf5582c3
      *** gcc/ChangeLog ***
      
      2020-02-21  Mihail-Calin Ionescu  <mihail.ionescu@arm.com>
      
      	* config/arm/arm.md: Prevent scalar shifts from being
      	used when big endian is enabled.
      
      *** gcc/testsuite/ChangeLog ***
      
      2020-02-21  Mihail-Calin Ionescu  <mihail.ionescu@arm.com>
      
      	* gcc.target/arm/armv8_1m-shift-imm-1.c: Add MVE target checks.
      	* gcc.target/arm/armv8_1m-shift-reg-1.c: Likewise.
      	* lib/target-supports.exp
      	(check_effective_target_arm_v8_1m_mve_ok_nocache): New.
      	(check_effective_target_arm_v8_1m_mve_ok): New.
      	(add_options_for_v8_1m_mve): New.
      Mihail Ionescu committed
    • testsuite: Require vect_multiple_sizes for scan-tree-dump in vect-epilogues.c · b150c838
      Default testsuite flags do not enable V8QI (MMX) vector mode for
      32bit x86 targets.  Require vect_multiple_sizes effective target in
      scan-tree-dump to avoid "LOOP EPILOGUE VECTORIZED" failure.
      
	* gcc.dg/vect/vect-epilogues.c (scan-tree-dump): Require
	vect_multiple_sizes effective target.
      Uros Bizjak committed
    • Adapt libgomp acc_get_property.f90 test · 83d45e1d
      The commit r10-6721-g8d1a1cb1 has changed
      the name of the type that is used for the return value of the Fortran
      acc_get_property function without adapting the test acc_get_property.f90.
      
      2020-02-21  Frederik Harwath  <frederik@codesourcery.com>
      
      	* testsuite/libgomp.oacc-fortran/acc_get_property.f90: Adapt to
      	changes from 2020-02-19, i.e. use integer(c_size_t) instead of
      	integer(acc_device_property) for the type of the return value of
      	acc_get_property.
      Frederik Harwath committed
    • tree-optimization: fix access path oracle on mismatched array refs [PR93586] · 91e50b2a
      nonoverlapping_array_refs_p is not supposed to give meaningful results
      when the bases of ref1 and ref2 are not the same or completely
      disjoint, and here it is called on c[0][j_2][0] and c[0][1], so the
      bases in the sense of this function are "c[0][j_2]" and "c[0]", which
      do partially overlap.  nonoverlapping_array_refs_p however walks pairs
      of array references, and in this case it fails to note that once it has
      walked across the first mismatched pair it is no longer safe to compare
      the rest.
      
      The reason why it continues matching is because it hopes it will
      eventually get pair of COMPONENT_REFs from types of same size and use
      TBAA to conclude that their addresses must be either same or completely
      disjoint.
      
      This patch makes the loop terminate early while popping all the
      remaining pairs so walking can continue.  We could re-synchronize on
      arrays of the same size with TBAA, but this is a bit fishy (because we
      try to support some sort of partial array overlaps) and hard to
      implement (because of zero-sized arrays and VLAs), so I think it is not
      worth the effort.
      
      In addition I noticed that the function is not safe with
      !flag_strict_aliasing and added early exits in the places where we set
      seen_unmatched_ref_p, since later we do not check that flag in:
      
             /* If we skipped array refs on type of different sizes, we can
       	 no longer be sure that there are not partial overlaps.  */
             if (seen_unmatched_ref_p
       	  && !operand_equal_p (TYPE_SIZE (type1), TYPE_SIZE (type2), 0))
       	{
       	  ++alias_stats
       	    .nonoverlapping_refs_since_match_p_may_alias;
      	}
      
	PR tree-optimization/93586
	* tree-ssa-alias.c (nonoverlapping_array_refs_p): Finish the array
	walk after mismatched array refs; do not use type size information
	to recover from unmatched references with !flag_strict_aliasing.
      
      	* gcc.dg/torture/pr93586.c: New testcase.
      Jan Hubicka committed
    • amdgcn: Use correct offset mode for gather/scatter · b5fb73b6
      The scatter/gather pattern names changed for GCC 10, but I hadn't noticed.
      This switches the patterns to the new offset mode scheme.
      
      2020-02-21  Andrew Stubbs  <ams@codesourcery.com>
      
      	gcc/
      	* config/gcn/gcn-valu.md (gather_load<mode>): Rename to ...
      	(gather_load<mode>v64si): ... this and set operand 2 to V64SI.
      	(scatter_store<mode>): Rename to ...
      	(scatter_store<mode>v64si): ... this and set operand 1 to V64SI.
      	(scatter<mode>_exec): Delete. Move contents ...
      	(mask_scatter_store<mode>): ... here, and rename that to ...
      	(mask_gather_load<mode>v64si): ... this. Set operand 2 to V64SI.
      	Remove mode conversion.
      	(mask_gather_load<mode>): Rename to ...
      	(mask_scatter_store<mode>v64si): ... this. Set operand 1 to V64SI.
      	Remove mode conversion.
      	* config/gcn/gcn.c (gcn_expand_scaled_offsets): Remove mode conversion.
      Andrew Stubbs committed
    • sra: Only verify sizes of scalar accesses (PR 93845) · 4d6bf96b
      The testcase is another example - in addition to the recent PR 93516 -
      where the SRA access verifier is confused by the fact that
      get_ref_base_and_extent can return different sizes for the same type,
      depending on whether the reference is a COMPONENT_REF or not.  In the
      previous bug I decided to keep the verifier check for aggregate types,
      even though it is not really important, and instead avoid the easily
      detectable type-within-the-same-type situation.  This testcase is
      however the result of a fairly random-looking type cast and so cannot
      be handled in the same way.
      
      Because the check is not really important for aggregates, this patch
      simply disables it for non-register types.
      
      2020-02-21  Martin Jambor  <mjambor@suse.cz>
      
      	PR tree-optimization/93845
      	* tree-sra.c (verify_sra_access_forest): Only test access size of
      	scalar types.
      
      	testsuite/
      	* g++.dg/tree-ssa/pr93845.C: New test.
      Martin Jambor committed
    • amdgcn: Align VGPR pairs · 3abfd4f3
      Aligning the registers is not needed by the architecture, but doing so
      allows us to remove the requirement for bug-prone early-clobber
      constraints from many split patterns (and avoid adding more in future).
      
      2020-02-21  Andrew Stubbs  <ams@codesourcery.com>
      
      	gcc/
      	* config/gcn/gcn.c (gcn_hard_regno_mode_ok): Align VGPR pairs.
      	* config/gcn/gcn-valu.md (addv64di3): Remove early-clobber.
      	(addv64di3_exec): Likewise.
      	(subv64di3): Likewise.
      	(subv64di3_exec): Likewise.
      	(addv64di3_zext): Likewise.
      	(addv64di3_zext_exec): Likewise.
      	(addv64di3_zext_dup): Likewise.
      	(addv64di3_zext_dup_exec): Likewise.
      	(addv64di3_zext_dup2): Likewise.
      	(addv64di3_zext_dup2_exec): Likewise.
      	(addv64di3_sext_dup2): Likewise.
      	(addv64di3_sext_dup2_exec): Likewise.
      	(<expander>v64di3): Likewise.
      	(<expander>v64di3_exec): Likewise.
      	(*<reduc_op>_dpp_shr_v64di): Likewise.
      	(*plus_carry_dpp_shr_v64di): Likewise.
      	* config/gcn/gcn.md (adddi3): Likewise.
      	(addptrdi3): Likewise.
      	(<expander>di3): Likewise.
      Andrew Stubbs committed
    • amdgcn: fix mode in vec_series · 2291d1fd
      2020-02-21  Andrew Stubbs  <ams@codesourcery.com>
      
      	gcc/
      	* config/gcn/gcn-valu.md (vec_seriesv64di): Use gen_vec_duplicatev64di.
      Andrew Stubbs committed
    • aarch64: Add SVE support for -mlow-precision-sqrt · a0ee8352
      SVE was missing support for -mlow-precision-sqrt, which meant that
      -march=armv8.2-a+sve -mlow-precision-sqrt could cause a performance
      regression compared to -march=armv8.2-a -mlow-precision-sqrt.
      
      2020-02-21  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* config/aarch64/aarch64.c (aarch64_emit_approx_sqrt): Add SVE
      	support.  Use aarch64_emit_mult instead of emitting multiplication
      	instructions directly.
      	* config/aarch64/aarch64-sve.md (sqrt<mode>2, rsqrt<mode>2)
      	(@aarch64_rsqrte<mode>, @aarch64_rsqrts<mode>): New expanders.
      
      gcc/testsuite/
      	* gcc.target/aarch64/sve/rsqrt_1.c: New test.
      	* gcc.target/aarch64/sve/rsqrt_1_run.c: Likewise.
      	* gcc.target/aarch64/sve/sqrt_1.c: Likewise.
      	* gcc.target/aarch64/sve/sqrt_1_run.c: Likewise.
      Richard Sandiford committed
    • aarch64: Add SVE support for -mlow-precision-div · 04f307cb
      SVE was missing support for -mlow-precision-div, which meant that
      -march=armv8.2-a+sve -mlow-precision-div could cause a performance
      regression compared to -march=armv8.2-a -mlow-precision-div.
      
      I ended up doing this much later than originally intended, sorry...
      
      2020-02-21  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* config/aarch64/aarch64.c (aarch64_emit_mult): New function.
      	(aarch64_emit_approx_div): Add SVE support.  Use aarch64_emit_mult
      	instead of emitting multiplication instructions directly.
      	* config/aarch64/iterators.md (SVE_COND_FP_BINARY_OPTAB): New iterator.
      	* config/aarch64/aarch64-sve.md (div<mode>3, @aarch64_frecpe<mode>)
      	(@aarch64_frecps<mode>): New expanders.
      
      gcc/testsuite/
      	* gcc.target/aarch64/sve/recip_1.c: New test.
      	* gcc.target/aarch64/sve/recip_1_run.c: Likewise.
      	* gcc.target/aarch64/sve/recip_2.c: Likewise.
      	* gcc.target/aarch64/sve/recip_2_run.c: Likewise.
      Richard Sandiford committed
    • aarch64: Bump AARCH64_APPROX_MODE to 64 bits · d87778ed
      We now have more than 32 scalar and vector float modes, so the
      32-bit AARCH64_APPROX_MODE would invoke UB for some of them.
      Bumping to a 64-bit mask fixes that... for now.
      
      Ideally we'd have a static assert to trap this, but logically
      it would go at file scope.  I think it would be better to wait
      until the switch to C++11, so that we can use static_assert
      directly.
      
      2020-02-21  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* config/aarch64/aarch64-protos.h (AARCH64_APPROX_MODE): Operate
      	on and produce uint64_ts rather than ints.
      	(AARCH64_APPROX_NONE, AARCH64_APPROX_ALL): Change to uint64_ts.
      	(cpu_approx_modes): Change the fields from unsigned int to uint64_t.
      Richard Sandiford committed
    • aarch64: Avoid creating an unused register · 0df28e68
      The rsqrt path of aarch64_emit_approx_sqrt created a pseudo
      register that it never used.
      
      2020-02-21  Richard Sandiford  <richard.sandiford@arm.com>
      
      gcc/
      	* config/aarch64/aarch64.c (aarch64_emit_approx_sqrt): Don't create
      	an unused xmsk register when handling approximate rsqrt.
      Richard Sandiford committed