Commits · 73def6eadc7a1f3e3465b972b774f26fcf8446bf · lvzhengyang / riscv-gcc-1

04 May, 2018 26 commits

[BRIGFE] Fix handling of NOPs. · 73def6ea
```
From-SVN: r259958
```
Pekka Jääskeläinen committed May 04, 2018

[BRIGFE] phsa-specific optimizations · 080dc243

Add flag -fassume-phsa that is on by default. If -fno-assume-phsa
is given, these optimizations are disabled.

With this flag, gccbrig can generate GENERIC that assumes we are
targeting a phsa-runtime based implementation, which allows us
to expose the work-item context accesses to retrieve WI IDs etc.
which helps optimizers.

The first optimization that takes advantage of this gets rid of
the setworkitemid calls whenever we have non-inlined calls that
use IDs internally.

Other optimizations added in this commit:

- expand absoluteid to similar level of simplicity as workitemid.
At the moment absoluteid is the best indexing ID to end up with
WG vectorization.
- propagate ID variables closer to their uses. This is mainly
to avoid known useless casts, which confuse at least scalar
evolution analysis.
- use signed long long for storing IDs. Unsigned integers have
defined wraparound semantics, which confuse at least scalar
evolution analysis, leading to unvectorizable WI loops.
- also refactor some BRIG function generation helpers to brig_function.
- no point in having the wi-loop as a for-loop. It's really
a do...while and SCEV can analyze it just fine still.
- add consts to ptrs etc. in BRIG builtin defs.
Improves optimization opportunities.
- add qualifiers to generated function parameters.
Const and restrict on the hidden local/private pointers,
the arg buffer and the context pointer help some optimizations.

From-SVN: r259957
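
The remark above about unsigned IDs can be illustrated with a small C++ sketch. This is not gccbrig code: `next_id_unsigned` and `next_id_signed` are hypothetical names, and the 32-bit unsigned width is an assumption chosen only for the example. It shows the language-level difference that the commit message refers to, not the effect on scalar evolution analysis itself.

```cpp
#include <cstdint>

// Hypothetical helpers, not part of gccbrig.

// Unsigned increment: overflow is well defined and wraps to 0, so an
// optimizer must allow for the ID sequence not being monotonic.
constexpr std::uint32_t next_id_unsigned(std::uint32_t id) { return id + 1u; }

// Signed increment: overflow is undefined behavior, so an optimizer may
// assume the sequence keeps increasing.
constexpr long long next_id_signed(long long id) { return id + 1; }

static_assert(next_id_unsigned(0xFFFFFFFFu) == 0u, "unsigned IDs wrap around");
static_assert(next_id_signed(41) == 42, "signed IDs simply increment");
```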

committed May 04, 2018


[BRIGFE] skip multiple forward declarations of the same function · 1e25c5a9
```
From-SVN: r259950
```
Pekka Jääskeläinen committed May 04, 2018

[BRIGFE] do not allow optimizations based on known C builtins · 60a3d46c
```
It can break inputs that have similarly named functions.

From-SVN: r259949
```
Pekka Jääskeläinen committed May 04, 2018

[BRIGFE] allow controlling strict aliasing from cmd line · 77c42d45
```
From-SVN: r259948
```
Pekka Jääskeläinen committed May 04, 2018

cmd/go: on AIX, pass -X64 first when invoking ar · 1c725133
```
    
    Reviewed-on: https://go-review.googlesource.com/111535

From-SVN: r259946
```
Ian Lance Taylor committed May 04, 2018

[BRIGFE] The modulo in ID computation should not be needed. · f986735a

The case where a dim is greater than the grid size doesn't seem
to be mentioned in the specs or tested by the PRM test suite.

From-SVN: r259944

committed May 04, 2018


[BRIGFE] Enable whole program optimizations · 637f3cde

HSA assumes all program scope HSAIL symbols can be queried from
the host runtime API, so they cannot be removed by IPA.

Getting some inlining to happen in the finalized binary required:
* explicitly marking the 'prog' scope functions and the launcher
function "externally_visible" to avoid the inliner removing them
* also setting the host_def ptr to externally visible, otherwise
IPA assumes it's never set
* adding the 'inline' keyword to functions to enable inlining,
otherwise GCC defaults to replaceable functions (one can link
over the previous one) which cannot be inlined
* replacing all calls to declarations with calls to definitions to
enable the inliner to find the definition
* fixing missing hidden argument types in the generated functions.
These were ignored silently until GCC started to be able to
inline calls to such functions.
* not gimplifying before fixing the call targets. Otherwise the
calls get detached and the definitions are not found. The reason
why this happens is not clear, but gimplifying only after call
target decl->def conversion fixes this.

From-SVN: r259943

committed May 04, 2018


[BRIGFE] fix an alloca stack underflow · 1b40975c

We didn't reserve additional space for the alloca frame pointers that
need to be saved in the alloca space.

Fixes libgomp.c++/target-6.C execution test.

From-SVN: r259942

committed May 04, 2018


* uk.po: Update. · 534fe823
```
From-SVN: r259938
```
Joseph Myers committed May 04, 2018

re PR go/85630 (GCC 8.1.0: Filesystem pollution during build: .cache dir in $HOME) · cceec155
```
	PR go/85630
	* Makefile.am (CHECK_ENV): Set GOCACHE.
	(ECHO_ENV): Update for setting of GOCACHE.
	* Makefile.in: Rebuild.

From-SVN: r259937
```
Ian Lance Taylor committed May 04, 2018

vsx-vector-6.h (foo): Add test for vec_max, vec_trunc. · 53481a28

gcc/testsuite/ChangeLog:

2018-05-04 Carl Love  <cel@us.ibm.com>
	* gcc.target/powerpc/vsx-vector-6.h (foo): Add test for vec_max,
	vec_trunc.
	* gcc.target/powerpc/vsx-vector-6-le.c (dg-final): Update xvcmpeqdp,
	xvcmpgtdp, xvcmpgedp counts. Add xxsel counts.
	* gcc.target/powerpc/vsx-vector-6-be.c (dg-final): Update xvcmpgtdp,
	xvcmpgedp counts. Add xxsel counts.

From-SVN: r259936

committed May 04, 2018


libgo: fix for unaligned read in go-unwind.c's read_encoded_value() · 772455c9

    
    Change code to work properly reading unaligned data on architectures
    that don't support unaligned reads. This fixes a regression (broke
    Solaris/sparc) introduced in https://golang.org/cl/90235.
    
    Reviewed-on: https://go-review.googlesource.com/111296

From-SVN: r259935
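
A common way to read a value from a possibly unaligned address is to copy its bytes into a properly aligned local with `memcpy`. The sketch below shows that general technique in C++; it is not the code from go-unwind.c, and `read_u32_unaligned` is a hypothetical helper name introduced for illustration.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper, not the libgo implementation.
// Dereferencing a misaligned uint32_t* can trap on strict-alignment
// targets such as SPARC. Copying the bytes into an aligned local is
// well defined for any address.
std::uint32_t read_u32_unaligned(const unsigned char* p) {
  std::uint32_t value;
  std::memcpy(&value, p, sizeof value);  // byte-wise copy, no alignment requirement on p
  return value;
}
```

Compilers typically lower the `memcpy` to a single load on targets that allow unaligned access, so the portable form rarely costs anything there.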

committed May 04, 2018


libffi PowerPC64 ELFv1 fp arg fixes · 71d372eb

The ELFv1 ABI says: "Single precision floating point values are mapped
to the second word in a single doubleword" and also "Floating point
registers f1 through f13 are used consecutively to pass up to 13
floating point values, one member aggregates passed by value
containing a floating point value, and to pass complex floating point
values".

libffi wasn't expecting float args in the second word, and wasn't
passing one member aggregates in fp registers.  This patch fixes those
problems, making use of the existing ELFv2 homogeneous aggregate
support since a one element fp struct is a special case of a
homogeneous aggregate.

I've also set a flag when returning pointers that might be used one
day.  This is just a tidy since the ppc64 assembly support code
currently doesn't test FLAG_RETURNS_64BITS for integer types.

	* src/powerpc/ffi_linux64.c (discover_homogeneous_aggregate):
	Compile for ELFv1 too, handling single element aggregates.
	(ffi_prep_cif_linux64_core): Call discover_homogeneous_aggregate
	for ELFv1.  Set FLAG_RETURNS_64BITS for FFI_TYPE_POINTER return.
	(ffi_prep_args64): Call discover_homogeneous_aggregate for ELFv1,
	and handle single element structs containing float or double
	as if the element wasn't wrapped in a struct.  Store floats in
	second word of doubleword slot when big-endian.
	(ffi_closure_helper_LINUX64): Similarly.

From-SVN: r259934

committed May 04, 2018


bb-reorder.c (sanitize_hot_paths): Release hot_bbs_to_check. · dd172744

2018-05-04  Richard Biener  <rguenther@suse.de>

	* bb-reorder.c (sanitize_hot_paths): Release hot_bbs_to_check.
	* gimple-ssa-store-merging.c
	(imm_store_chain_info::output_merged_store): Remove redundant create,
	release split_store vector contents on failure.
	* tree-vect-slp.c (vect_schedule_slp_instance): Avoid leaking
	scalar stmt vector on cache hit.

From-SVN: r259932

committed May 04, 2018


rs6000: Remove Xilinx FP · 2c2aa74d

This removes the special Xilinx FP support.  It was deprecated in
GCC 8.

After this patch all of TARGET_{DOUBLE,SINGLE}_FLOAT,
TARGET_{DF,SF}_INSN, and TARGET_{DF,SF}_FPR are replaced by
TARGET_HARD_FLOAT.  Also the fp_type attribute is deleted.


	* common/config/rs6000/rs6000-common.c (rs6000_handle_option): Remove
	Xilinx FP support.
	* config.gcc (powerpc-xilinx-eabi*): Remove.
	* config/rs6000/predicates.md (easy_fp_constant): Remove Xilinx FP
	support.
	(fusion_addis_mem_combo_load): Ditto.
	* config/rs6000/rs6000-c.c (rs6000_target_modify_macros): Remove Xilinx
	FP support.
	(rs6000_cpu_cpp_builtins): Ditto.
	* config/rs6000/rs6000-linux.c
	(rs6000_linux_float_exceptions_rounding_supported_p): Ditto.
	* config/rs6000/rs6000-opts.h (enum fpu_type_t): Delete.
	* config/rs6000/rs6000.c (rs6000_debug_reg_global): Remove Xilinx FP
	support.
	(rs6000_setup_reg_addr_masks): Ditto.
	(rs6000_init_hard_regno_mode_ok): Ditto.
	(rs6000_option_override_internal): Ditto.
	(legitimate_lo_sum_address_p): Ditto.
	(rs6000_legitimize_address): Ditto.
	(rs6000_legitimize_reload_address): Ditto.
	(rs6000_legitimate_address_p): Ditto.
	(abi_v4_pass_in_fpr): Ditto.
	(setup_incoming_varargs): Ditto.
	(rs6000_gimplify_va_arg): Ditto.
	(rs6000_split_multireg_move): Ditto.
	(rs6000_savres_strategy): Ditto.
	(rs6000_emit_prologue_components): Ditto.
	(rs6000_emit_epilogue_components): Ditto.
	(rs6000_emit_prologue): Ditto.
	(rs6000_emit_epilogue): Ditto.
	(rs6000_elf_file_end): Ditto.
	(rs6000_function_value): Ditto.
	(rs6000_libcall_value): Ditto.
	* config/rs6000/rs6000.h: Ditto.
	(TARGET_MINMAX_SF, TARGET_MINMAX_DF): Delete, merge to ...
	(TARGET_MINMAX): ... this.  New.
	(TARGET_SF_FPR, TARGET_DF_FPR, TARGET_SF_INSN, TARGET_DF_INSN): Delete.
	* config/rs6000/rs6000.md: Remove Xilinx FP support.
	(*movsi_internal1_single): Delete.
	* config/rs6000/rs6000.opt (msingle-float, mdouble-float, msimple-fpu,
	mfpu=, mxilinx-fpu): Delete.
	* config/rs6000/singlefp.h: Delete.
	* config/rs6000/sysv4.h: Remove Xilinx FP support.
	* config/rs6000/t-rs6000: Ditto.
	* config/rs6000/t-xilinx: Delete.
	* gcc/config/rs6000/titan.md: Adjust for fp_type removal.
	* gcc/config/rs6000/vsx.md: Remove Xilinx FP support.
	(VStype_simple): Delete.
	(VSfptype_simple, VSfptype_mul, VSfptype_div, VSfptype_sqrt): Delete.
	* config/rs6000/xfpu.h: Delete.
	* config/rs6000/xfpu.md: Delete.
	* config/rs6000/xilinx.h: Delete.
	* config/rs6000/xilinx.opt: Delete.
	* gcc/doc/invoke.texi (RS/6000 and PowerPC Options): Remove
	-msingle-float, -mdouble-float, -msimple-fpu, -mfpu=, and -mxilinx-fpu.

From-SVN: r259929

committed May 04, 2018


PR libstdc++/85642 fix is_nothrow_default_constructible<optional<T>> · d6ed6b07

Add missing noexcept keyword to default constructor of each
_Optional_payload specialization.

	PR libstdc++/85642 fix is_nothrow_default_constructible<optional<T>>
	* include/std/optional (_Optional_payload): Add noexcept to default
	constructor. Re-indent.
	(_Optional_payload<_Tp, true, true, true>): Likewise. Add noexcept to
	constructor for copying disengaged payloads.
	(_Optional_payload<_Tp, true, false, true>): Likewise.
	(_Optional_payload<_Tp, true, true, false>): Likewise.
	(_Optional_payload<_Tp, true, false, false>): Likewise.
	* testsuite/20_util/optional/cons/85642.cc: New.
	* testsuite/20_util/optional/cons/value_neg.cc: Adjust dg-error lines.

From-SVN: r259928
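
The effect of a missing `noexcept` on a member's default constructor can be reproduced with a small C++ sketch. The types below are hypothetical stand-ins, not the libstdc++ `_Optional_payload` specializations; they only show how a defaulted constructor inherits its exception specification from its members.

```cpp
#include <type_traits>

// Hypothetical payload types, not the libstdc++ ones.
struct PayloadWithoutNoexcept {
  PayloadWithoutNoexcept() {}  // user-provided, so it is treated as potentially throwing
  bool engaged = false;
};

struct PayloadWithNoexcept {
  PayloadWithNoexcept() noexcept {}  // the kind of fix the commit describes
  bool engaged = false;
};

// Hypothetical wrapper: its defaulted constructor is noexcept only if the
// payload's default constructor is.
template <typename Payload>
struct Wrapper {
  Wrapper() = default;
  Payload payload;
};

static_assert(
    !std::is_nothrow_default_constructible<Wrapper<PayloadWithoutNoexcept>>::value,
    "a potentially throwing member makes the wrapper potentially throwing");
static_assert(
    std::is_nothrow_default_constructible<Wrapper<PayloadWithNoexcept>>::value,
    "adding noexcept to the member constructor restores the trait");
```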

committed May 04, 2018


[expand] Handle null target in expand_builtin_goacc_parlevel_id_size · 39bc9f83

2018-05-04  Tom de Vries  <tom@codesourcery.com>

	PR libgomp/85639
	* builtins.c (expand_builtin_goacc_parlevel_id_size): Handle null target
	if ignore == 0.

From-SVN: r259927

committed May 04, 2018


re PR ada/85635 (typo in link.c for BSD platforms) · 5759c56d
```
	PR ada/85635
	* link.c (BSD platforms): Add missing backslash.

From-SVN: r259925
```
John Marino committed May 04, 2018

re PR tree-optimization/85627 (ICE in update_phi_components in tree-complex.c) · 7d187fdf

2018-05-04  Richard Biener  <rguenther@suse.de>

	PR middle-end/85627
	* tree-complex.c (update_complex_assignment): We are always in SSA form.
	(expand_complex_div_wide): Likewise.
	(expand_complex_operations_1): Likewise.
	(expand_complex_libcall): Preserve EH info of the original stmt.
	(tree_lower_complex): Handle removed blocks.
	* tree.c (build_common_builtin_nodes): Do not set ECF_NOTHROW
	on complex multiplication and division libcall builtins.

	* g++.dg/torture/pr85627.C: New testcase.

From-SVN: r259923

committed May 04, 2018


re PR lto/85574 (LTO bootstrapped binaries differ) · 9b5713f7

2018-05-04  Richard Biener  <rguenther@suse.de>

	PR middle-end/85574
	* fold-const.c (negate_expr_p): Restrict negation of operand
	zero of a division to when we know that can happen without
	overflow.
	(fold_negate_expr_1): Likewise.

	* gcc.dg/torture/pr85574.c: New testcase.
	* gcc.dg/torture/pr57656.c: Use dg-additional-options.

From-SVN: r259922

committed May 04, 2018


re PR tree-optimization/85466 (Performance is slow when doing 'branchless' conditional style math operations) · 04782385

	PR libstdc++/85466
	* real.h (real_nextafter): Declare.
	* real.c (real_nextafter): New function.
	* fold-const-call.c (fold_const_nextafter): New function.
	(fold_const_call_sss): Call it for CASE_CFN_NEXTAFTER and
	CASE_CFN_NEXTTOWARD.
	(fold_const_call_1): For CASE_CFN_NEXTTOWARD call fold_const_call_sss
	even when arg1_mode is different from arg0_mode.

	* gcc.dg/nextafter-1.c: New test.
	* gcc.dg/nextafter-2.c: New test.
	* gcc.dg/nextafter-3.c: New test.
	* gcc.dg/nextafter-4.c: New test.

From-SVN: r259921

committed May 04, 2018


cmd/go: update mkalldocs.sh · 105073e1

    
    Update mkalldocs.sh from the current master sources, replacing the old
    mkdoc.sh.
    
    Reviewed-on: https://go-review.googlesource.com/111096

From-SVN: r259920

committed May 04, 2018


cmd/go: enable tests of vet tool · 28fc5502

    
    Since gofrontend does have the vet tool now, we can test it.
    
    Reviewed-on: https://go-review.googlesource.com/111095

From-SVN: r259919

committed May 04, 2018


cmd/go: update to match recent changes to gc · 65229328

    
    In https://golang.org/cl/111097 the gc version of cmd/go was updated
    to include some gofrontend-specific changes. The gofrontend code
    already has different versions of those changes; this CL makes the
    gofrontend match the upstream code.
    
    Reviewed-on: https://go-review.googlesource.com/111099

From-SVN: r259918

committed May 04, 2018


Daily bump. · e7902c2c
```
From-SVN: r259917
```
GCC Administrator committed May 04, 2018

03 May, 2018 14 commits

PR c++/85600 - virtual delete failure. · 9cbc7d65
```
	* init.c (build_delete): Always save_expr when deleting.

From-SVN: r259913
```
Jason Merrill committed May 03, 2018

PR libstdc++/82644 define TR1 hypergeometric functions in strict modes · 86f66562

Following a recent change for PR 82644 the non-standard hypergeometric
functions are not defined by <cmath> when __STRICT_ANSI__ is defined
(e.g. for -std=c++17, or -std=c++14 -D__STDCPP_WANT_MATH_SPEC_FUNCS__).
That caused errors in <tr1/cmath> because the using-declarations for
tr1::hyperg et al are invalid in strict modes.

The solution is to define the TR1 hypergeometric functions inline in
<tr1/cmath> if __STRICT_ANSI__ is defined.

	PR libstdc++/82644
	* include/tr1/cmath [__STRICT_ANSI__] (hypergf, hypergl, hyperg): Use
	inline definitions instead of using-declarations.
	[__STRICT_ANSI__] (conf_hypergf, conf_hypergl, conf_hyperg): Likewise.
	* testsuite/tr1/5_numerical_facilities/special_functions/
	07_conf_hyperg/compile_cxx17.cc: New.
	* testsuite/tr1/5_numerical_facilities/special_functions/
	17_hyperg/compile_cxx17.cc: New.

From-SVN: r259912

committed May 03, 2018


[C++ Patch] Kill -ffriend-injection · 6c072e21

https://gcc.gnu.org/ml/gcc-patches/2018-05/msg00175.html

	* doc/extend.texi (Deprecated Features): Remove
	-ffriend-injection.
	(Backwards Compatibility): Likewise.
	* doc/invoke.texi (C++ Language Options): Likewise.
	(C++ Dialect Options): Likewise.

	c-family/
	* c.opt (ffriend-injection): Remove functionality, issue warning.

	cp/
	* decl.c (cxx_init_decl_processing): Remove flag_friend_injection.
	* name-lookup.c (do_pushdecl): Likewise.

	testsuite/
	Remove -ffriend-injection.
	* g++.old-deja/g++.jason/scoping15.C: Delete.
	* g++.old-deja/g++.mike/net43.C: Delete.

From-SVN: r259904

committed May 03, 2018


re PR target/85530 ([X86] _mm512_mullox_epi64 and _mm512_mask_mullox_epi64 not implemented) · 503ac4e0

	PR target/85530
	* config/i386/avx512fintrin.h (_mm512_mullox_epi64,
	_mm512_mask_mullox_epi64): New intrinsics.

	* gcc.target/i386/avx512f-vpmullq-1.c: New test.
	* gcc.target/i386/avx512f-vpmullq-2.c: New test.
	* gcc.target/i386/avx512dq-vpmullq-3.c: New test.
	* gcc.target/i386/avx512dq-vpmullq-4.c: New test.

From-SVN: r259903

committed May 03, 2018


PR libstdc++/84769 qualify call to std::get<0> · 1ee021f2
```
	PR libstdc++/84769
	* include/std/variant (visit): Qualify std::get call.

From-SVN: r259902
```
Jonathan Wakely committed May 03, 2018

PR libstdc++/85632 fix wraparound in filesystem::space · 2e023647

On 32-bit targets any values over 4GB would wrap and produce the wrong
result.

	PR libstdc++/85632 use uintmax_t for arithmetic
	* src/filesystem/ops.cc (experimental::filesystem::space): Perform
	arithmetic in result type.
	* src/filesystem/std-ops.cc (filesystem::space): Likewise.
	* testsuite/27_io/filesystem/operations/space.cc: Check total capacity
	is greater than free space.
	* testsuite/experimental/filesystem/operations/space.cc: New.

From-SVN: r259901
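
The wraparound described in this commit can be sketched in C++ as follows. The function and parameter names are hypothetical, and the two 32-bit inputs stand in for block-count and block-size fields that are only 32 bits wide on some targets; this is not the code from ops.cc.

```cpp
#include <cstdint>

// Hypothetical helpers, assuming 32-bit block count and block size fields.

// Multiplying in the narrow type: the product is reduced modulo 2^32
// before it is widened, so anything over 4GB is lost.
constexpr std::uintmax_t capacity_narrow(std::uint32_t blocks,
                                         std::uint32_t block_size) {
  return blocks * block_size;
}

// Widening one operand first performs the arithmetic in the result type.
constexpr std::uintmax_t capacity_wide(std::uint32_t blocks,
                                       std::uint32_t block_size) {
  return static_cast<std::uintmax_t>(blocks) * block_size;
}

// 2,000,000 blocks of 4096 bytes is about 8.2GB.
static_assert(capacity_wide(2000000u, 4096u) == 8192000000ULL,
              "wide arithmetic keeps the full value");
static_assert(capacity_narrow(2000000u, 4096u) < capacity_wide(2000000u, 4096u),
              "narrow arithmetic wraps");
```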

committed May 03, 2018


compiler: avoid crashing on invalid non-integer array length · d18734b5

    
    Tweak the array type checking code to avoid crashing on array types
    whose length expressions are explicit non-integer types (for example,
    "float64(10)"). If such constructs are seen, issue an "invalid array
    bound" error.
    
    Fixes golang/go#13486.
    
    Reviewed-on: https://go-review.googlesource.com/91975

From-SVN: r259900

committed May 03, 2018


Update .po files. · 4e0c5f94

	* be.po, da.po, de.po, el.po, es.po, fi.po, fr.po, hr.po, id.po,
	ja.po, nl.po, ru.po, sr.po, sv.po, tr.po, uk.po, vi.po, zh_CN.po,
	zh_TW.po: Update.

From-SVN: r259897

committed May 03, 2018


Add tests for std::remove_cvref · adba76a3

	* testsuite/20_util/remove_cvref/requirements/alias_decl.cc: New.
	* testsuite/20_util/remove_cvref/requirements/explicit_instantiation.cc:
	New.
	* testsuite/20_util/remove_cvref/value.cc: New.
	* testsuite/20_util/remove_cvref/value_ext.cc: New.

From-SVN: r259896

committed May 03, 2018


PR libstdc++/84087 add default arguments to basic_string members (LWG 2268) · 852ee53c

This change was a DR against C++11 and so should have been implemented
years ago.

	PR libstdc++/84087 LWG DR 2268 basic_string default arguments
	* include/bits/basic_string.h [_GLIBCXX_USE_CXX11_ABI=1]
	(append(const basic_string&, size_type, size_type)
	(assign(const basic_string&, size_type, size_type)
	(insert(size_type, const basic_string&, size_type, size_type)
	(replace(size_type,size_type,const basic_string&,size_type,size_type)
	(compare(size_type,size_type,const basic_string&,size_type,size_type)):
	Add default arguments (LWG 2268).
	[_GLIBCXX_USE_CXX11_ABI=0]
	(append(const basic_string&, size_type, size_type)
	(assign(const basic_string&, size_type, size_type)
	(insert(size_type, const basic_string&, size_type, size_type)
	(replace(size_type,size_type,const basic_string&,size_type,size_type)
	(compare(size_type,size_type,const basic_string&,size_type,size_type)):
	Likewise.
	* testsuite/21_strings/basic_string/dr2268.cc: New test.

From-SVN: r259895
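
With the default arguments from LWG 2268, the trailing length can be omitted and defaults to `npos`, meaning "to the end of the source string". A minimal C++ sketch of what that allows; `append_tail` is a hypothetical helper, and a standard library that implements LWG 2268 is assumed.

```cpp
#include <string>

// Hypothetical helper: appends everything in `source` from `pos` onward.
// The third argument of append(const basic_string&, size_type, size_type)
// is omitted and defaults to npos per LWG 2268.
std::string append_tail(const std::string& prefix, const std::string& source,
                        std::string::size_type pos) {
  std::string out = prefix;
  out.append(source, pos);
  return out;
}
```

For example, `append_tail("ab", "hello", 3)` yields `"ablo"`.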

committed May 03, 2018


PR libstdc++/84535 constrain std::thread constructor · d49b3426

The standard requires that the std::thread constructor is constrained so
it can't be called with a first argument of type std::thread. The
current implementation only meets that requirement if the constructor is
called with one argument, by using deleted overloads. This uses an
enable_if constraint to enforce the requirement for any number of
arguments.

Also add a static assertion to give a more readable error for invalid
arguments that cannot be invoked. Also simplify _Invoker to reduce the
error cascade for ill-formed instantiations with non-invocable
arguments.

	PR libstdc++/84535
	* include/std/thread (thread::__not_same): New SFINAE helper.
	(thread::thread(_Callable&&, _Args&&...)): Add SFINAE constraint that
	first argument is not a std::thread. Add static assertion to check
	INVOKE expression is valid.
	(thread::thread(thread&), thread::thread(const thread&&)): Remove.
	(thread::_Invoker::_M_invoke, thread::_Invoker::operator()): Use
	__invoke_result for return types and remove exception specifications.
	* testsuite/30_threads/thread/cons/84535.cc: New.

From-SVN: r259893
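
The constraint described in this commit can be sketched with a stand-in class. `Task` and `not_same` below are hypothetical names and this is not the libstdc++ source; the stand-in also runs the callable synchronously rather than starting a thread. The point is only that an `enable_if` condition on the first argument removes the variadic constructor from overload resolution for any number of arguments.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical stand-in for std::thread, for illustration only.
class Task {
  // SFINAE helper: well formed only if T, after decay, is not Task.
  template <typename T>
  using not_same = typename std::enable_if<
      !std::is_same<typename std::decay<T>::type, Task>::value>::type;

 public:
  Task() = default;
  Task(Task&&) = default;
  Task(const Task&) = delete;

  // Constrained for every arity, so Task(other_task, 1, 2) is rejected at
  // overload resolution instead of failing deep inside the constructor body.
  template <typename Callable, typename... Args,
            typename = not_same<Callable>>
  explicit Task(Callable&& f, Args&&... args) {
    std::forward<Callable>(f)(std::forward<Args>(args)...);
  }
};

static_assert(!std::is_constructible<Task, Task&, int>::value,
              "a Task cannot be the first argument");
static_assert(std::is_constructible<Task, void (*)(int), int>::value,
              "ordinary callables are still accepted");
```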

committed May 03, 2018


[testsuite] Add scan-offload-tree-dump · 63f12215

2018-05-03  Tom de Vries  <tom@codesourcery.com>

	PR testsuite/85106
	* lib/scanoffloadtree.exp: New file.

	* testsuite/lib/libgomp-dg.exp (libgomp-dg-test): Add save-temps to
	extra_tool_flags if it contains an -foffload=-fdump-* flag.
	* testsuite/lib/libgomp.exp: Include scanoffloadtree.exp.
	* testsuite/libgomp.oacc-c/vec.c: Use scan-offload-tree-dump.

	* doc/sourcebuild.texi (Commands for use in dg-final, Scan optimization
	dump files): Add offload-tree.

From-SVN: r259892

committed May 03, 2018


re PR tree-optimization/85615 (ICE at -O2 and above on valid code on x86_64-linux-gnu: in dfs_enumerate_from, at cfganal.c:1197) · a378f85c

2018-05-03  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/85615
	* tree-ssa-threadupdate.c (thread_block_1): Only allow exits
	to loops not nested in BBs loop father to avoid creating multi-entry
	loops.

	* gcc.dg/torture/pr85615.c: New testcase.

From-SVN: r259891

committed May 03, 2018


[tree-complex.c] PR tree-optimization/70291: Inline floating-point complex multiplication more aggressively · b7244ccb

We can improve the performance of complex floating-point multiplications by inlining the expansion a bit more aggressively.
We can inline complex x = a * b as:
x = (ar*br - ai*bi) + i(ar*bi + br*ai);
if (isunordered (__real__ x, __imag__ x))
  x = __muldc3 (a, b); //Or __mulsc3 for single-precision

That way, in the common case where no NaNs are produced, we avoid the libgcc call, and we fall back to the
NaN handling in libgcc only if either component of the expansion is NaN.

The implementation is done in expand_complex_multiplication in tree-complex.c and the above expansion
will be done when optimising at -O1 and above and when not optimising for size.
At -O0 and -Os the single call to libgcc will be emitted.

For the code:
__complex double
foo (__complex double a, __complex double b)
{
  return a * b;
}

We will now emit at -O2 for aarch64:
foo:
        fmul    d16, d1, d3
        fmul    d6, d1, d2
        fnmsub  d5, d0, d2, d16
        fmadd   d4, d0, d3, d6
        fcmp    d5, d4
        bvs     .L8
        fmov    d1, d4
        fmov    d0, d5
        ret
.L8:
        stp     x29, x30, [sp, -16]!
        mov     x29, sp
        bl      __muldc3
        ldp     x29, x30, [sp], 16
        ret

Instead of just a branch to __muldc3.

	PR tree-optimization/70291
	* tree-complex.c (expand_complex_libcall): Add type, inplace_p
	arguments.  Change return type to tree.  Emit libcall as a new
	statement rather than replacing existing one when inplace_p is true.
	(expand_complex_multiplication_components): New function.
	(expand_complex_multiplication): Expand floating-point complex
	multiplication using the above.
	(expand_complex_division): Rename inner_type parameter to type.
	Update expand_complex_libcall call-site.
	(expand_complex_operations_1): Update expand_complex_multiplication
	and expand_complex_division call-sites.

	* gcc.dg/complex-6.c: New test.
	* gcc.dg/complex-7.c: Likewise.

From-SVN: r259889
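
The expansion described in this commit message can be written out by hand as a C++ sketch. `mul_fast_path` is a hypothetical name, and the `std::complex` `operator*` in the fallback branch is only a stand-in for the `__muldc3` libcall; this is not what tree-complex.c emits, just the same shape at source level.

```cpp
#include <cmath>
#include <complex>

// Hypothetical helper mirroring the shape of the inlined expansion.
std::complex<double> mul_fast_path(std::complex<double> a,
                                   std::complex<double> b) {
  const double ar = a.real(), ai = a.imag();
  const double br = b.real(), bi = b.imag();

  // Straight-line formula: x = (ar*br - ai*bi) + i(ar*bi + br*ai).
  const double re = ar * br - ai * bi;
  const double im = ar * bi + br * ai;

  // isunordered is true if either component is NaN. Only then take the
  // slow path that deals with the NaN and infinity corner cases.
  if (std::isunordered(re, im))
    return a * b;  // stand-in for the libgcc call

  return {re, im};
}
```

For finite inputs such as (1 + 2i) * (3 + 4i) the fast path returns -5 + 10i without reaching the fallback.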

committed May 03, 2018
