- 04 May, 2018 26 commits
From-SVN: r259958
Pekka Jääskeläinen committed -
Add flag -fassume-phsa that is on by default. If -fno-assume-phsa is given, these optimizations are disabled. With this flag, gccbrig can generate GENERIC that assumes we are targeting a phsa-runtime based implementation, which allows us to expose the work-item context accesses used to retrieve WI IDs etc., which helps the optimizers.
The first optimization that takes advantage of this gets rid of the setworkitemid calls whenever we have non-inlined calls that use IDs internally.
Other optimizations added in this commit:
- expand absoluteid to a similar level of simplicity as workitemid. At the moment absoluteid is the best indexing ID to end up with WG vectorization.
- propagate ID variables closer to their uses. This is mainly to avoid known useless casts, which confuse at least scalar evolution analysis.
- use signed long long for storing IDs. Unsigned integers have defined wraparound semantics, which confuse at least scalar evolution analysis, leading to unvectorizable WI loops.
- also refactor some BRIG function generation helpers to brig_function.
- no point in having the wi-loop as a for-loop. It's really a do...while and SCEV can still analyze it just fine.
- add consts to ptrs etc. in BRIG builtin defs. Improves optimization opportunities.
- add qualifiers to generated function parameters. Const and restrict on the hidden local/private pointers, the arg buffer and the context pointer help some optimizations.
From-SVN: r259957
Pekka Jääskeläinen committed -
From-SVN: r259950
Pekka Jääskeläinen committed -
It can break inputs that have similarly named functions. From-SVN: r259949
Pekka Jääskeläinen committed -
From-SVN: r259948
Pekka Jääskeläinen committed -
Reviewed-on: https://go-review.googlesource.com/111535 From-SVN: r259946
Ian Lance Taylor committed -
The case where a dim is greater than the grid size does not seem to be mentioned in the specs or tested by the PRM test suite. From-SVN: r259944
Pekka Jääskeläinen committed -
HSA assumes all program scope HSAIL symbols can be queried from the host runtime API, and thus they cannot be removed by the IPA. Getting some inlining to happen in the finalized binary required:
* explicitly marking the 'prog' scope functions and the launcher function "externally_visible" to avoid the inliner removing them
* also setting the host_def ptr to externally visible, otherwise IPA assumes it's never set
* adding the 'inline' keyword to functions to enable inlining, otherwise GCC defaults to replaceable functions (one can link over the previous one) which cannot be inlined
* replacing all calls to declarations with calls to definitions to enable the inliner to find the definition
* fixing missing hidden argument types in the generated functions. These were ignored silently until GCC started to be able to inline calls to such functions.
* not gimplifying before fixing the call targets. Otherwise the calls get detached and the definitions are not found. The reason why this happens is not clear, but gimplifying only after the call target decl->def conversion fixes this.
From-SVN: r259943
Pekka Jääskeläinen committed -
We didn't reserve additional space for the alloca frame pointers that need to be saved in the alloca space. Fixes libgomp.c++/target-6.C execution test. From-SVN: r259942
Pekka Jääskeläinen committed -
From-SVN: r259938
Joseph Myers committed -
PR go/85630 * Makefile.am (CHECK_ENV): Set GOCACHE. (ECHO_ENV): Update for setting of GOCACHE. * Makefile.in: Rebuild. From-SVN: r259937
Ian Lance Taylor committed -
gcc/testsuite/ChangeLog: 2018-05-04 Carl Love <cel@us.ibm.com> * gcc.target/powerpc/vsx-vector-6.h (foo): Add test for vec_max, vec_trunc. * gcc.target/powerpc/vsx-vector-6-le.c (dg-final): Update xvcmpeqdp, xvcmpgtdp, xvcmpgedp counts. Add xxsel counts. * gcc.target/powerpc/vsx-vector-6-be.c (dg-final): Update xvcmpgtdp, xvcmpgedp counts. Add xxsel counts. From-SVN: r259936
Carl Love committed -
Change code to work properly when reading unaligned data on architectures that don't support unaligned reads. This fixes a regression (broke Solaris/sparc) introduced in https://golang.org/cl/90235. Reviewed-on: https://go-review.googlesource.com/111296 From-SVN: r259935
Ian Lance Taylor committed -
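The commit above is in Go code and its actual change is not shown here. As a general illustration of the same problem, the following C++ sketch reads a 32-bit value from an address that may be unaligned without dereferencing a misaligned pointer. The helper names `load_u32`, `load_u32_le` and `demo_unaligned_load` are hypothetical and do not come from the commit.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: copy the bytes into an aligned local instead of
// casting the pointer, so strict-alignment targets (e.g. SPARC) do not trap.
std::uint32_t load_u32(const unsigned char* p) {
    std::uint32_t v;
    std::memcpy(&v, p, sizeof v);
    return v;
}

// Hypothetical helper: assemble a little-endian value byte by byte, which is
// independent of both alignment and host endianness.
std::uint32_t load_u32_le(const unsigned char* p) {
    return static_cast<std::uint32_t>(p[0])
         | static_cast<std::uint32_t>(p[1]) << 8
         | static_cast<std::uint32_t>(p[2]) << 16
         | static_cast<std::uint32_t>(p[3]) << 24;
}

// Reads from offset 1 of a buffer, i.e. from an odd address relative to the
// start of the array.
std::uint32_t demo_unaligned_load() {
    const unsigned char buf[5] = {0x00, 0x78, 0x56, 0x34, 0x12};
    return load_u32_le(buf + 1);
}

// Round-trips a value through an offset-1 slot using memcpy on both sides.
bool demo_memcpy_roundtrip() {
    unsigned char buf[8] = {};
    const std::uint32_t value = 0xDEADBEEFu;
    std::memcpy(buf + 1, &value, sizeof value);
    return load_u32(buf + 1) == value;
}
```

Compilers typically turn the `memcpy` into a single load on targets that allow unaligned access, so the portable form usually costs nothing there.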
The ELFv1 ABI says: "Single precision floating point values are mapped to the second word in a single doubleword" and also "Floating point registers f1 through f13 are used consecutively to pass up to 13 floating point values, one member aggregates passed by value containing a floating point value, and to pass complex floating point values". libffi wasn't expecting float args in the second word, and wasn't passing one member aggregates in fp registers. This patch fixes those problems, making use of the existing ELFv2 homogeneous aggregate support since a one element fp struct is a special case of a homogeneous aggregate. I've also set a flag when returning pointers that might be used one day. This is just a tidy since the ppc64 assembly support code currently doesn't test FLAG_RETURNS_64BITS for integer types. * src/powerpc/ffi_linux64.c (discover_homogeneous_aggregate): Compile for ELFv1 too, handling single element aggregates. (ffi_prep_cif_linux64_core): Call discover_homogeneous_aggregate for ELFv1. Set FLAG_RETURNS_64BITS for FFI_TYPE_POINTER return. (ffi_prep_args64): Call discover_homogeneous_aggregate for ELFv1, and handle single element structs containing float or double as if the element wasn't wrapped in a struct. Store floats in second word of doubleword slot when big-endian. (ffi_closure_helper_LINUX64): Similarly. From-SVN: r259934
Alan Modra committed -
2018-05-04 Richard Biener <rguenther@suse.de> * bb-reorder.c (sanitize_hot_paths): Release hot_bbs_to_check. * gimple-ssa-store-merging.c (imm_store_chain_info::output_merged_store): Remove redundant create, release split_store vector contents on failure. * tree-vect-slp.c (vect_schedule_slp_instance): Avoid leaking scalar stmt vector on cache hit. From-SVN: r259932
Richard Biener committed -
This removes the special Xilinx FP support. It was deprecated in GCC 8. After this patch all of TARGET_{DOUBLE,SINGLE}_FLOAT, TARGET_{DF,SF}_INSN, and TARGET_{DF,SF}_FPR are replaced by TARGET_HARD_FLOAT. Also the fp_type attribute is deleted. * common/config/rs6000/rs6000-common.c (rs6000_handle_option): Remove Xilinx FP support. * config.gcc (powerpc-xilinx-eabi*): Remove. * config/rs6000/predicates.md (easy_fp_constant): Remove Xilinx FP support. (fusion_addis_mem_combo_load): Ditto. * config/rs6000/rs6000-c.c (rs6000_target_modify_macros): Remove Xilinx FP support. (rs6000_cpu_cpp_builtins): Ditto. * config/rs6000/rs6000-linux.c (rs6000_linux_float_exceptions_rounding_supported_p): Ditto. * config/rs6000/rs6000-opts.h (enum fpu_type_t): Delete. * config/rs6000/rs6000.c (rs6000_debug_reg_global): Remove Xilinx FP support. (rs6000_setup_reg_addr_masks): Ditto. (rs6000_init_hard_regno_mode_ok): Ditto. (rs6000_option_override_internal): Ditto. (legitimate_lo_sum_address_p): Ditto. (rs6000_legitimize_address): Ditto. (rs6000_legitimize_reload_address): Ditto. (rs6000_legitimate_address_p): Ditto. (abi_v4_pass_in_fpr): Ditto. (setup_incoming_varargs): Ditto. (rs6000_gimplify_va_arg): Ditto. (rs6000_split_multireg_move): Ditto. (rs6000_savres_strategy): Ditto. (rs6000_emit_prologue_components): Ditto. (rs6000_emit_epilogue_components): Ditto. (rs6000_emit_prologue): Ditto. (rs6000_emit_epilogue): Ditto. (rs6000_elf_file_end): Ditto. (rs6000_function_value): Ditto. (rs6000_libcall_value): Ditto. * config/rs6000/rs6000.h: Ditto. (TARGET_MINMAX_SF, TARGET_MINMAX_DF): Delete, merge to ... (TARGET_MINMAX): ... this. New. (TARGET_SF_FPR, TARGET_DF_FPR, TARGET_SF_INSN, TARGET_DF_INSN): Delete. * config/rs6000/rs6000.md: Remove Xilinx FP support. (*movsi_internal1_single): Delete. * config/rs6000/rs6000.opt (msingle-float, mdouble-float, msimple-fpu, mfpu=, mxilinx-fpu): Delete. * config/rs6000/singlefp.h: Delete. * config/rs6000/sysv4.h: Remove Xilinx FP support. 
* config/rs6000/t-rs6000: Ditto. * config/rs6000/t-xilinx: Delete. * gcc/config/rs6000/titan.md: Adjust for fp_type removal. * gcc/config/rs6000/vsx.md: Remove Xilinx FP support. (VStype_simple): Delete. (VSfptype_simple, VSfptype_mul, VSfptype_div, VSfptype_sqrt): Delete. * config/rs6000/xfpu.h: Delete. * config/rs6000/xfpu.md: Delete. * config/rs6000/xilinx.h: Delete. * config/rs6000/xilinx.opt: Delete. * gcc/doc/invoke.texi (RS/6000 and PowerPC Options): Remove -msingle-float, -mdouble-float, -msimple-fpu, -mfpu=, and -mxilinx-fpu. From-SVN: r259929
Segher Boessenkool committed -
Add missing noexcept keyword to default constructor of each _Optional_payload specialization. PR libstdc++/85642 fix is_nothrow_default_constructible<optional<T>> * include/std/optional (_Optional_payload): Add noexcept to default constructor. Re-indent. (_Optional_payload<_Tp, true, true, true>): Likewise. Add noexcept to constructor for copying disengaged payloads. (_Optional_payload<_Tp, true, false, true>): Likewise. (_Optional_payload<_Tp, true, true, false>): Likewise. (_Optional_payload<_Tp, true, false, false>): Likewise. * testsuite/20_util/optional/cons/85642.cc: New. * testsuite/20_util/optional/cons/value_neg.cc: Adjust dg-error lines. From-SVN: r259928
Jonathan Wakely committed -
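The property restored by PR libstdc++/85642 can be checked at compile time. The sketch below assumes a C++17 standard library in which `std::optional`'s default constructor is `noexcept`, as the standard specifies. `MayThrow` and `empty_optional_is_disengaged` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <optional>
#include <type_traits>

// Hypothetical type whose own default constructor is allowed to throw.
struct MayThrow {
    MayThrow() noexcept(false) {}
};

// The element type itself is not nothrow default constructible ...
static_assert(!std::is_nothrow_default_constructible_v<MayThrow>,
              "MayThrow() is declared noexcept(false)");

// ... but an empty optional never constructs a MayThrow, so its default
// constructor must still be noexcept.
static_assert(std::is_nothrow_default_constructible_v<std::optional<MayThrow>>,
              "optional<T>() must be noexcept for every T");

// Runtime counterpart: a default-constructed optional holds no value.
bool empty_optional_is_disengaged() {
    std::optional<MayThrow> o;
    return !o.has_value();
}
```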
2018-05-04 Tom de Vries <tom@codesourcery.com> PR libgomp/85639 * builtins.c (expand_builtin_goacc_parlevel_id_size): Handle null target if ignore == 0. From-SVN: r259927
Tom de Vries committed -
PR ada/85635 * link.c (BSD platforms): Add missing backslash. From-SVN: r259925
John Marino committed -
2018-05-04 Richard Biener <rguenther@suse.de> PR middle-end/85627 * tree-complex.c (update_complex_assignment): We are always in SSA form. (expand_complex_div_wide): Likewise. (expand_complex_operations_1): Likewise. (expand_complex_libcall): Preserve EH info of the original stmt. (tree_lower_complex): Handle removed blocks. * tree.c (build_common_builtin_nodes): Do not set ECF_NOTHROW on complex multiplication and division libcall builtins. * g++.dg/torture/pr85627.C: New testcase. From-SVN: r259923
Richard Biener committed -
2018-05-04 Richard Biener <rguenther@suse.de> PR middle-end/85574 * fold-const.c (negate_expr_p): Restrict negation of operand zero of a division to when we know that can happen without overflow. (fold_negate_expr_1): Likewise. * gcc.dg/torture/pr85574.c: New testcase. * gcc.dg/torture/pr57656.c: Use dg-additional-options. From-SVN: r259922
Richard Biener committed -
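The ChangeLog entry for PR middle-end/85574 restricts moving a negation onto the dividend of a division to cases where that cannot overflow. The sketch below is not GCC's folding code. It only illustrates, with hypothetical helpers `can_negate` and `negated_quotient`, why rewriting `-(x / y)` as `(-x) / y` is unsafe when `x` is the most negative value.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// True when -x is representable, i.e. x is not the most negative value.
constexpr bool can_negate(std::int32_t x) {
    return x != std::numeric_limits<std::int32_t>::min();
}

// Computes -(x / y). The negation is moved onto the dividend only when that
// is known not to overflow; otherwise the quotient is negated instead.
// Precondition: the mathematical result is representable.
std::int32_t negated_quotient(std::int32_t x, std::int32_t y) {
    assert(y != 0);
    assert(can_negate(x) || (y != 1 && y != -1));
    if (can_negate(x))
        return (-x) / y;  // equal to -(x / y) because division truncates toward zero
    return -(x / y);      // x is the minimum value: -x would overflow
}
```

For `x` equal to the minimum 32-bit value and `y == 2`, `(-x) / y` would overflow in the negation, while `-(x / y)` yields 1073741824.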
re PR tree-optimization/85466 (Performance is slow when doing 'branchless' conditional style math operations) PR libstdc++/85466 * real.h (real_nextafter): Declare. * real.c (real_nextafter): New function. * fold-const-call.c (fold_const_nextafter): New function. (fold_const_call_sss): Call it for CASE_CFN_NEXTAFTER and CASE_CFN_NEXTTOWARD. (fold_const_call_1): For CASE_CFN_NEXTTOWARD call fold_const_call_sss even when arg1_mode is different from arg0_mode. * gcc.dg/nextafter-1.c: New test. * gcc.dg/nextafter-2.c: New test. * gcc.dg/nextafter-3.c: New test. * gcc.dg/nextafter-4.c: New test. From-SVN: r259921
Jakub Jelinek committed -
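The entry above adds compile-time folding of `nextafter` and `nexttoward` calls with constant arguments. What the folded values should equal can be illustrated with the ordinary library calls. This sketch assumes IEEE 754 binary64 `double`, and the helper names are hypothetical.

```cpp
#include <cassert>
#include <cfloat>
#include <cmath>

// The next representable double above 1.0 is 1.0 plus one unit in the last
// place, which for binary64 is 1.0 + DBL_EPSILON.
double next_above_one() {
    return std::nextafter(1.0, 2.0);
}

// When both arguments compare equal, nextafter returns the second argument.
double next_from_same() {
    return std::nextafter(1.0, 1.0);
}

// nexttoward takes a long double direction, so the two arguments may have
// different types, as the ChangeLog notes for arg0_mode and arg1_mode.
float next_float_above_one() {
    return std::nexttoward(1.0f, 2.0L);
}
```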
Update mkalldocs.sh from the current master sources, replacing the old mkdoc.sh. Reviewed-on: https://go-review.googlesource.com/111096 From-SVN: r259920
Ian Lance Taylor committed -
Since gofrontend does have the vet tool now, we can test it. Reviewed-on: https://go-review.googlesource.com/111095 From-SVN: r259919
Ian Lance Taylor committed -
In https://golang.org/cl/111097 the gc version of cmd/go was updated to include some gofrontend-specific changes. The gofrontend code already has different versions of those changes; this CL makes the gofrontend match the upstream code. Reviewed-on: https://go-review.googlesource.com/111099 From-SVN: r259918
Ian Lance Taylor committed -
From-SVN: r259917
GCC Administrator committed
-
- 03 May, 2018 14 commits
* init.c (build_delete): Always save_expr when deleting. From-SVN: r259913
Jason Merrill committed -
Following a recent change for PR 82644 the non-standard hypergeometric functions are not defined by <cmath> when __STRICT_ANSI__ is defined (e.g. for -std=c++17, or -std=c++14 -D__STDCPP_WANT_MATH_SPEC_FUNCS__). That caused errors in <tr1/cmath> because the using-declarations for tr1::hyperg et al are invalid in strict modes. The solution is to define the TR1 hypergeometric functions inline in <tr1/cmath> if __STRICT_ANSI__ is defined. PR libstdc++/82644 * include/tr1/cmath [__STRICT_ANSI__] (hypergf, hypergl, hyperg): Use inline definitions instead of using-declarations. [__STRICT_ANSI__] (conf_hypergf, conf_hypergl, conf_hyperg): Likewise. * testsuite/tr1/5_numerical_facilities/special_functions/ 07_conf_hyperg/compile_cxx17.cc: New. * testsuite/tr1/5_numerical_facilities/special_functions/ 17_hyperg/compile_cxx17.cc: New. From-SVN: r259912
Jonathan Wakely committed -
https://gcc.gnu.org/ml/gcc-patches/2018-05/msg00175.html * doc/extend.texi (Deprecated Features): Remove -ffriend-injection. (Backwards Compatibility): Likewise. * doc/invoke.texi (C++ Language Options): Likewise. (C++ Dialect Options): Likewise. c-family/ * c.opt (ffriend-injection): Remove functionality, issue warning. cp/ * decl.c (cxx_init_decl_processing): Remove flag_friend_injection. * name-lookup.c (do_pushdecl): Likewise. testsuite/ Remove -ffriend-injection. * g++.old-deja/g++.jason/scoping15.C: Delete. * g++.old-deja/g++.mike/net43.C: Delete. From-SVN: r259904
Nathan Sidwell committed -
PR target/85530 * config/i386/avx512fintrin.h (_mm512_mullox_epi64, _mm512_mask_mullox_epi64): New intrinsics. * gcc.target/i386/avx512f-vpmullq-1.c: New test. * gcc.target/i386/avx512f-vpmullq-2.c: New test. * gcc.target/i386/avx512dq-vpmullq-3.c: New test. * gcc.target/i386/avx512dq-vpmullq-4.c: New test. From-SVN: r259903
Jakub Jelinek committed -
PR libstdc++/84769 * include/std/variant (visit): Qualify std::get call. From-SVN: r259902
Jonathan Wakely committed -
On 32-bit targets any values over 4GB would wrap and produce the wrong result. PR libstdc++/85632 use uintmax_t for arithmetic * src/filesystem/ops.cc (experimental::filesystem::space): Perform arithmetic in result type. * src/filesystem/std-ops.cc (filesystem::space): Likewise. * testsuite/27_io/filesystem/operations/space.cc: Check total capacity is greater than free space. * testsuite/experimental/filesystem/operations/space.cc: New. From-SVN: r259901
Jonathan Wakely committed -
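The wraparound fixed in PR libstdc++/85632 comes from multiplying a block count by a block size in a type that is too narrow. The sketch below is not the libstdc++ source. It uses explicit 32-bit operands and hypothetical helper names to show the difference between multiplying in the operand type and multiplying in the result type.

```cpp
#include <cassert>
#include <cstdint>

// Multiplies in 32 bits: any product of 4 GiB or more wraps modulo 2^32.
std::uint32_t bytes_narrow(std::uint32_t blocks, std::uint32_t block_size) {
    return blocks * block_size;
}

// Converts one operand to the result type first, so the multiplication is
// carried out in uintmax_t and the full product is kept.
std::uintmax_t bytes_wide(std::uint32_t blocks, std::uint32_t block_size) {
    return static_cast<std::uintmax_t>(blocks) * block_size;
}
```

With 2,000,000 blocks of 4096 bytes, the wide version returns 8,192,000,000, whereas the narrow version cannot represent that value.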
Tweak the array type checking code to avoid crashing on array types whose length expressions are explicit non-integer types (for example, "float64(10)"). If such constructs are seen, issue an "invalid array bound" error. Fixes golang/go#13486. Reviewed-on: https://go-review.googlesource.com/91975 From-SVN: r259900
Ian Lance Taylor committed -
* be.po, da.po, de.po, el.po, es.po, fi.po, fr.po, hr.po, id.po, ja.po, nl.po, ru.po, sr.po, sv.po, tr.po, uk.po, vi.po, zh_CN.po, zh_TW.po: Update. From-SVN: r259897
Joseph Myers committed -
* testsuite/20_util/remove_cvref/requirements/alias_decl.cc: New. * testsuite/20_util/remove_cvref/requirements/explicit_instantiation.cc: New. * testsuite/20_util/remove_cvref/value.cc: New. * testsuite/20_util/remove_cvref/value_ext.cc: New. From-SVN: r259896
Jonathan Wakely committed -
This change was a DR against C++11 and so should have been implemented years ago. PR libstdc++/84087 LWG DR 2268 basic_string default arguments * include/bits/basic_string.h [_GLIBCXX_USE_CXX11_ABI=1] (append(const basic_string&, size_type, size_type) (assign(const basic_string&, size_type, size_type) (insert(size_type, const basic_string&, size_type, size_type) (replace(size_type,size_type,const basic_string&,size_type,size_type) (compare(size_type,size_type,const basic_string&,size_type,size_type)): Add default arguments (LWG 2268). [_GLIBCXX_USE_CXX11_ABI=0] (append(const basic_string&, size_type, size_type) (assign(const basic_string&, size_type, size_type) (insert(size_type, const basic_string&, size_type, size_type) (replace(size_type,size_type,const basic_string&,size_type,size_type) (compare(size_type,size_type,const basic_string&,size_type,size_type)): Likewise. * testsuite/21_strings/basic_string/dr2268.cc: New test. From-SVN: r259895
Jonathan Wakely committed -
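With the LWG 2268 default arguments in place, the final count can be omitted and defaults to `npos`, meaning "to the end of the source string". A short sketch, assuming a standard library that implements the DR; the wrapper functions are hypothetical.

```cpp
#include <cassert>
#include <string>

// Appends src[pos..end) to dest; the trailing count defaults to npos.
std::string append_tail(std::string dest, const std::string& src,
                        std::string::size_type pos) {
    dest.append(src, pos);
    return dest;
}

// Compares a[pos1, pos1+n1) with b[pos2..end); again the last count is
// omitted and defaults to npos.
int compare_tail(const std::string& a, std::string::size_type pos1,
                 std::string::size_type n1, const std::string& b,
                 std::string::size_type pos2) {
    return a.compare(pos1, n1, b, pos2);
}
```

Before the change, libstdc++ required the count to be spelled out, for example `dest.append(src, pos, std::string::npos)`.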
The standard requires that the std::thread constructor is constrained so it can't be called with a first argument of type std::thread. The current implementation only meets that requirement if the constructor is called with one argument, by using deleted overloads. This uses an enable_if constraint to enforce the requirement for any number of arguments. Also add a static assertion to give a more readable error for invalid arguments that cannot be invoked. Also simplify _Invoker to reduce the error cascade for ill-formed instantiations with non-invocable arguments. PR libstdc++/84535 * include/std/thread (thread::__not_same): New SFINAE helper. (thread::thread(_Callable&&, _Args&&...)): Add SFINAE constraint that first argument is not a std::thread. Add static assertion to check INVOKE expression is valid. (thread::thread(thread&), thread::thread(const thread&&)): Remove. (thread::_Invoker::_M_invoke, thread::_Invoker::operator()): Use __invoke_result for return types and remove exception specifications. * testsuite/30_threads/thread/cons/84535.cc: New. From-SVN: r259893
Jonathan Wakely committed -
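The constraint described above is observable through type traits. The sketch below assumes a C++17 standard library that implements the constraint for any number of arguments. No thread is started, and `thread_rejects_thread_as_callable` is a hypothetical helper.

```cpp
#include <cassert>
#include <thread>
#include <type_traits>

// An ordinary callable plus argument is accepted by the variadic constructor.
static_assert(std::is_constructible_v<std::thread, void (*)(int), int>,
              "function pointer plus argument should be accepted");

// A std::thread as the first argument must be rejected even when further
// arguments follow, because the constructor template is constrained away.
static_assert(!std::is_constructible_v<std::thread, std::thread&, int>,
              "std::thread must not be usable as the callable");

// Copying stays unavailable and moving stays available.
static_assert(!std::is_copy_constructible_v<std::thread>, "not copyable");
static_assert(std::is_move_constructible_v<std::thread>, "movable");

// Runtime-visible form of the second check.
constexpr bool thread_rejects_thread_as_callable() {
    return !std::is_constructible_v<std::thread, std::thread&, int>;
}
```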
2018-05-03 Tom de Vries <tom@codesourcery.com> PR testsuite/85106 * lib/scanoffloadtree.exp: New file. * testsuite/lib/libgomp-dg.exp (libgomp-dg-test): Add save-temps to extra_tool_flags if it contains an -foffload=-fdump-* flag. * testsuite/lib/libgomp.exp: Include scanoffloadtree.exp. * testsuite/libgomp.oacc-c/vec.c: Use scan-offload-tree-dump. * doc/sourcebuild.texi (Commands for use in dg-final, Scan optimization dump files): Add offload-tree. From-SVN: r259892
Tom de Vries committed -
re PR tree-optimization/85615 (ICE at -O2 and above on valid code on x86_64-linux-gnu: in dfs_enumerate_from, at cfganal.c:1197) 2018-05-03 Richard Biener <rguenther@suse.de> PR tree-optimization/85615 * tree-ssa-threadupdate.c (thread_block_1): Only allow exits to loops not nested in BBs loop father to avoid creating multi-entry loops. * gcc.dg/torture/pr85615.c: New testcase. From-SVN: r259891
Richard Biener committed -
[tree-complex.c] PR tree-optimization/70291: Inline floating-point complex multiplication more aggressively
We can improve the performance of complex floating-point multiplications by inlining the expansion a bit more aggressively. We can inline complex x = a * b as:
    x = (ar*br - ai*bi) + i(ar*bi + br*ai);
    if (isunordered (__real__ x, __imag__ x))
      x = __muldc3 (a, b); //Or __mulsc3 for single-precision
That way, in the common case where no NaNs are produced, we can avoid the libgcc call and fall back to the NaN handling in libgcc only if either component of the expansion is NaN. The implementation is done in expand_complex_multiplication in tree-complex.c, and the above expansion will be done when optimising at -O1 and greater and when not optimising for size. At -O0 and -Os the single call to libgcc will be emitted.
For the code:
    __complex double foo (__complex double a, __complex double b) { return a * b; }
We will now emit at -O2 for aarch64:
    foo:
            fmul d16, d1, d3
            fmul d6, d1, d2
            fnmsub d5, d0, d2, d16
            fmadd d4, d0, d3, d6
            fcmp d5, d4
            bvs .L8
            fmov d1, d4
            fmov d0, d5
            ret
    .L8:
            stp x29, x30, [sp, -16]!
            mov x29, sp
            bl __muldc3
            ldp x29, x30, [sp], 16
            ret
Instead of just a branch to __muldc3.
PR tree-optimization/70291 * tree-complex.c (expand_complex_libcall): Add type, inplace_p arguments. Change return type to tree. Emit libcall as a new statement rather than replacing existing one when inplace_p is true. (expand_complex_multiplication_components): New function. (expand_complex_multiplication): Expand floating-point complex multiplication using the above. (expand_complex_division): Rename inner_type parameter to type. Update expand_complex_libcall call-site. (expand_complex_operations_1): Update expand_complex_multiplication and expand_complex_division call-sites. * gcc.dg/complex-6.c: New test. * gcc.dg/complex-7.c: Likewise. From-SVN: r259889
Kyrylo Tkachov committed
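The expansion described in the commit message can be mirrored at source level. The sketch below is only an illustration of the fast-path/slow-path structure, not the compiler's implementation. `mul_fast_path` is a hypothetical name, and the fallback uses `std::complex`'s own `operator*` in place of a direct `__muldc3` call.

```cpp
#include <cassert>
#include <cmath>
#include <complex>

// Multiplies two complex numbers with the straightforward formula and falls
// back to the library multiplication only if a NaN appears in the result.
std::complex<double> mul_fast_path(std::complex<double> a,
                                   std::complex<double> b) {
    const double ar = a.real(), ai = a.imag();
    const double br = b.real(), bi = b.imag();

    // Fast path: x = (ar*br - ai*bi) + i(ar*bi + br*ai)
    const double re = ar * br - ai * bi;
    const double im = ar * bi + br * ai;

    // isunordered is true when either component is NaN. Only then is the
    // slower, fully checked multiplication needed.
    if (std::isunordered(re, im))
        return a * b;

    return std::complex<double>(re, im);
}
```

A single `isunordered` test covers both components, which matches the one `fcmp`/`bvs` pair in the aarch64 output quoted in the commit message.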
-