- 08 Jan, 2019 8 commits
-
-
From-SVN: r267683
Pierre-Marie de Rodat committed -
Following the discovery of regressions in GPRbuild, this reverts both r263100 and r264608: 2019-01-08 Justin Squirek <squirek@adacore.com> Revert: 2018-07-31 Justin Squirek <squirek@adacore.com> gcc/ada/ * lib-writ.adb (Write_With_Lines): Modify the generation of dependencies within ali files so that source unit bodies are properly listed even if said bodies are missing. Perform legacy behavior in GNATprove mode. * lib-writ.ads: Modify documentation to reflect current behavior. and: 2018-09-26 Justin Squirek <squirek@adacore.com> gcc/ada/ * lib-writ.adb, lib-writ.ads (Write_With_Lines): Add documentation and an extra conditional check for RCI units so that generated ali files will list the spec only instead of a body when a body is not found. From-SVN: r267680
Justin Squirek committed -
/cp 2019-01-08 Paolo Carlini <paolo.carlini@oracle.com> * decl.c (start_decl): Improve permerror location. /testsuite 2019-01-08 Paolo Carlini <paolo.carlini@oracle.com> * g++.dg/diagnostic/out-of-class-redeclaration.C: New. From-SVN: r267675
Paolo Carlini committed -
2019-01-08 Iain Sandoe <iain@sandoe.co.uk> gcc/testsuite/ * c-c++-common/builtin-has-attribute-3.c: Skip tests requiring symbol alias support. * c-c++-common/builtin-has-attribute-4.c: Likewise. Append match for warning that ‘protected’ attribute is not supported. From-SVN: r267674
Iain Sandoe committed -
2019-01-08 Iain Sandoe <iain@sandoe.co.uk> gcc/testsuite/ * gcc.dg/Wmissing-attributes.c: Require alias support. * gcc.dg/attr-copy-2.c: Likewise. * gcc.dg/attr-copy-5.c: Likewise. From-SVN: r267673
Iain Sandoe committed -
PR c++/88554 * decl.c (finish_function): For -Wreturn-type don't add a return *this; fixit hint if current_class_ref is NULL. Use a single if instead of two nested ones. * g++.dg/warn/Wreturn-type-11.C: New test. Co-Authored-By: Jakub Jelinek <jakub@redhat.com> From-SVN: r267672
Jonathan Wakely committed -
Amongst other changes, r266077 updated value_range_base::dump so that it additionally prints the type. This broke an assertion within the jit testsuite, in jit.dg/test-sum-of-squares.c, which was checking for: ": [-INF, n_" but was now getting: ": signed int [-INF, n_" The test is merely intended as a simple verification that we can read dump files via gcc_jit_context_enable_dump. This patch loosens the requirements on the dump so that it should work with either version of value_range_base::dump. gcc/testsuite/ChangeLog: PR jit/88747 * jit.dg/test-sum-of-squares.c (verify_code): Update expected vrp dump to reflect r266077. From-SVN: r267671
David Malcolm committed -
From-SVN: r267670
GCC Administrator committed
-
- 07 Jan, 2019 32 commits
-
-
re PR c/88701 (Internal compiler error for valid program using compound literal with variably modified type.) PR c/88701 * c-decl.c (build_compound_literal): If not TREE_STATIC, only pushdecl if current_function_decl is non-NULL. * gcc.dg/pr88701.c: New test. From-SVN: r267667
Jakub Jelinek committed -
This patch started off just by adding if_then_else support in write_attr_value to be able to write a saner expression for powerpc tls_gdld_nomark length. Then I noticed bit-rot in functions used to calculate insn_default_length, insn_min_length, and length_unit_log (which are used by the shorten_branches pass). These functions don't handle a const_int length value and return an "unknown" status that isn't used, or in the case of or_attr_value, doesn't need to be used. min_attr_value also attempts to return INT_MAX for the unhandled rtl case, but this can get lost in recursive calls. I fixed that problem by returning INT_MIN instead, and translating that to INT_MAX in the only caller of min_attr_value. PR target/88614 * genattrtab.c (max_attr_value, min_attr_value, or_attr_value): Delete "unknownp" parameter. Adjust callers. Handle CONST_INT, PLUS, MINUS, and MULT. (attr_value_aligned): Renamed from or_attr_value. (min_attr_value): Return INT_MIN for unhandled rtl case.. (min_fn): ..and translate to INT_MAX here. (write_length_unit_log): Modify to cope without "unknown". (write_attr_value): Handle IF_THEN_ELSE. From-SVN: r267666
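The sentinel choice described above can be illustrated with a small standalone sketch. This is not the genattrtab.c code: the `Expr` type and the `min_value` helper are hypothetical stand-ins for rtl and min_attr_value, and only the name `min_fn` is borrowed from the ChangeLog. It shows why INT_MIN survives recursive minimum computations where an INT_MAX "unknown" marker would be lost.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>

// Hypothetical miniature expression tree standing in for rtl.
struct Expr {
    enum Kind { Const, Plus, IfThenElse, Unhandled } kind;
    int value;        // payload for Const
    const Expr* lhs;  // first operand of Plus, "then" arm of IfThenElse
    const Expr* rhs;  // second operand of Plus, "else" arm of IfThenElse
};

// Returns the minimum value of e, or INT_MIN if any part is unhandled.
int min_value(const Expr& e)
{
    switch (e.kind) {
    case Expr::Const:
        return e.value;
    case Expr::Plus: {
        assert(e.lhs && e.rhs);
        const int a = min_value(*e.lhs);
        const int b = min_value(*e.rhs);
        // Propagate the sentinel instead of doing arithmetic on it.
        if (a == INT_MIN || b == INT_MIN)
            return INT_MIN;
        return a + b;
    }
    case Expr::IfThenElse:
        assert(e.lhs && e.rhs);
        // INT_MIN wins every min(), so an unknown arm cannot be masked
        // by a known one. An INT_MAX sentinel would lose here.
        return std::min(min_value(*e.lhs), min_value(*e.rhs));
    case Expr::Unhandled:
        break;
    }
    return INT_MIN;
}

// The single caller translates the sentinel to the conservative INT_MAX.
int min_fn(const Expr& e)
{
    const int v = min_value(e);
    return v == INT_MIN ? INT_MAX : v;
}
```

With an INT_MAX marker, an if_then_else with one known arm of length 8 and one unhandled arm would report 8; with INT_MIN the unknown status reaches the caller intact.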
Alan Modra committed -
Bugs 88720 and 88726 report issues where a function is declared inline in an inner scope, resulting in spurious diagnostics about it being declared but never defined when that scope is left (possibly in some cases also wrongly referring to the function as a nested function). These are regressions that were introduced with the support for C99 inline semantics in 4.3 (they don't appear with 4.2; it's possible some aspects of the bugs might have been introduced later than 4.3). For the case of functions being wrongly referred to as nested, DECL_EXTERNAL was not the right condition for a function being non-nested; TREE_PUBLIC is appropriate for the case of non-nested functions with external linkage, while !b->nested means this is the outermost scope in which the function was declared and so avoids catching the case of a file-scope static being redeclared inline inside a function. For the non-nested, external-linkage case, the code attempts to avoid duplicate diagnostics by diagnosing only when scope != external_scope, but actually scope == external_scope is more appropriate, as it's only when the file and external scopes are popped that the code can actually tell whether a function ended up being defined, and all such functions will appear in the (GCC-internal) external scope. Bootstrapped with no regressions on x86_64-pc-linux-gnu. PR c/88720 PR c/88726 gcc/c: * c-decl.c (pop_scope): Use TREE_PUBLIC and b->nested to determine whether a function is nested, not DECL_EXTERNAL. Diagnose inline functions declared but never defined only for external scope, not for other scopes. gcc/testsuite: * gcc.dg/inline-40.c, gcc.dg/inline-41.c: New tests. From-SVN: r267665
Joseph Myers committed -
From-SVN: r267663
Joseph Myers committed -
/cp 2019-01-07 Paolo Carlini <paolo.carlini@oracle.com> * decl.c (start_decl): Improve two error_at locations. (expand_static_init): Likewise. /testsuite 2019-01-07 Paolo Carlini <paolo.carlini@oracle.com> * g++.dg/diagnostic/constexpr1.C: New. * g++.dg/diagnostic/thread1.C: Likewise. From-SVN: r267662
Paolo Carlini committed -
This is following CL 156038. doscanstackswitch uses the same mechanism of switching goroutines as getTraceback, and so has the same problem as described in issue golang/go#29448. This CL applies the same fix. Reviewed-on: https://go-review.googlesource.com/c/156697 From-SVN: r267661
Ian Lance Taylor committed -
This is the gccgo version of https://golang.org/cl/141822: Only return a pointer p to the new slice's backing array from makeslice. Makeslice callers then construct sliceheader{p, len, cap} explicitly instead of makeslice returning the slice. This change caused the GCC backend to break the runtime/pprof test by merging together the identical functions allocateReflectTransient and allocateTransient2M. This caused the traceback to be other than expected. Fix that by making the functions not identical. This is a step toward updating libgo to the Go1.12beta1 release. Reviewed-on: https://go-review.googlesource.com/c/155937 From-SVN: r267660
Ian Lance Taylor committed -
Currently, when collecting a traceback for another goroutine, getTraceback calls gogo(gp) switching to gp, which will resume in mcall, which will call gtraceback, which will set up gp->m. There is a gap between setting the current running g to gp and setting gp->m. If a profiling signal arrives in between, sigtramp will see a non-nil gp with a nil m, and will seg fault. Fix this by setting up gp->m first. Fixes golang/go#29448. Reviewed-on: https://go-review.googlesource.com/c/156038 From-SVN: r267658
Ian Lance Taylor committed -
2019-01-07 Thomas Koenig <tkoenig@gcc.gnu.org> Harald Anlauf <anlauf@gmx.de> Tobias Burnus <burnus@gcc.gnu.org> PR fortran/45424 * check.c (gfc_check_is_contiguous): New function. * expr.c (gfc_is_not_contiguous): New function. * gfortran.h (gfc_isym_id): Add GFC_ISYM_IS_CONTIGUOUS. Add prototype for gfc_is_not_contiguous. * intrinsic.c (do_ts29113_check): Add GFC_ISYM_IS_CONTIGUOUS. (add_function): Add is_contiguous. * intrinsic.h: Add prototypes for gfc_check_is_contiguous, gfc_simplify_is_contiguous and gfc_resolve_is_contiguous. * intrinsic.texi: Add IS_CONTIGUOUS. * iresolve.c (gfc_resolve_is_contiguous): New function. * simplify.c (gfc_simplify_is_contiguous): New function. * trans-decl.c (gfor_fncecl_is_contiguous0): New variable. (gfc_build_intrinsic_function_decl): Add it. * trans-intrinsic.c (gfc_conv_intrinsic_is_contiguous): New function. (gfc_conv_intrinsic_function): Handle GFC_ISYM_IS_CONTIGUOUS. 2019-01-07 Thomas Koenig <tkoenig@gcc.gnu.org> Harald Anlauf <anlauf@gmx.de> Tobias Burnus <burnus@gcc.gnu.org> PR fortran/45424 * Makefile.am: Add intrinsics/is_contiguous.c. * Makefile.in: Regenerated. * gfortran.map: Add _gfortran_is_contiguous0. * intrinsics/is_contiguous.c: New file. * libgfortran.h: Add prototype for is_contiguous0. 2019-01-07 Thomas Koenig <tkoenig@gcc.gnu.org> Harald Anlauf <anlauf@gmx.de> Tobias Burnus <burnus@gcc.gnu.org> * gfortran.dg/is_contiguous_1.f90: New test. * gfortran.dg/is_contiguous_2.f90: New test. * gfortran.dg/is_contiguous_3.f90: New test. Co-Authored-By: Harald Anlauf <anlauf@gmx.de> Co-Authored-By: Tobias Burnus <burnus@gcc.gnu.org> From-SVN: r267657
Thomas Koenig committed -
* decl.c (cp_complete_array_type): Strip any location wrappers. * g++.dg/init/array50.C: New test. From-SVN: r267656
Marek Polacek committed -
The IFN_GOMP_SIMD_LANE handling in vectorizable_store tries to use MEM_REF offsets to maintain pointer disambiguation info. This patch makes sure that we don't try to do the same optimisation for IFN_MASK_STOREs, which have no similar offset argument. The patch fixes libgomp.c-c++-common/pr66199-*.c for SVE. Previously we had an ncopies==2 store and stored both halves to the same address. 2019-01-07 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vect-stmts.c (vectorizable_store): Don't use the dataref_offset optimization for masked stores. From-SVN: r267654
Richard Sandiford committed -
PR c++/88261 PR c++/69338 PR c++/69696 PR c++/69697 * cp-tree.h (LOOKUP_ALLOW_FLEXARRAY_INIT): New flag value. * typeck2.c (digest_init_r): Raise an error for non-static initialization of a flexible array member. (process_init_constructor, massage_init_elt, process_init_constructor_array, process_init_constructor_record, process_init_constructor_union, process_init_constructor): Add the flags parameter and pass it thru. (store_init_value): Pass LOOKUP_ALLOW_FLEXARRAY_INIT parameter to digest_init_flags for static decls. gcc/testsuite: 2019-01-07 Bernd Edlinger <bernd.edlinger@hotmail.de> PR c++/88261 PR c++/69338 PR c++/69696 PR c++/69697 * gcc.dg/array-6.c: Move from here ... * c-c++-common/array-6.c: ... to here and add some more test coverage. * g++.dg/pr69338.C: New test. * g++.dg/pr69697.C: Likewise. * g++.dg/ext/flexary32.C: Likewise. * g++.dg/ext/flexary3.C: Adjust test. * g++.dg/ext/flexary12.C: Likewise. * g++.dg/ext/flexary13.C: Likewise. * g++.dg/ext/flexary15.C: Likewise. * g++.dg/warn/Wplacement-new-size-1.C: Likewise. * g++.dg/warn/Wplacement-new-size-2.C: Likewise. * g++.dg/warn/Wplacement-new-size-6.C: Likewise. From-SVN: r267653
Bernd Edlinger committed -
The use of "j" in: init = permute_results[number_of_vectors - j - 1]; was out-of-sync with the new flat loop structure. Now that all that reversing is gone, we can just use the result of duplicate_and_interleave directly. The other cases shouldn't be affected by postponing the insertion of ctor_seq, since gimple_build* appends to the seq without clearing it first (unlike some of the gimplify routines). The ICE is already covered by gcc.dg/vect/pr63379.c. 2019-01-07 Richard Sandiford <richard.sandiford@arm.com> gcc/ PR middle-end/88567 * tree-vect-loop.c (get_initial_defs_for_reduction): Pass the output vector directly to duplicate_and_interleave instead of going through a temporary. Postpone insertion of ctor_seq to the end of the loop. From-SVN: r267652
Richard Sandiford committed -
The C++ char_traits and ctype APIs do not disallow null pointer arguments, so we need explicit checks to ensure we don't forward null pointers to memcpy or memmove. PR libstdc++/87787 * include/bits/char_traits.h (char_traits::move): Do not pass null pointers to memmove. * include/bits/locale_facets.h (ctype<char>::widen(const char*, const char*, char*)): Do not pass null pointers to memcpy. (ctype<char>::narrow(const char*, const char*, char, char*)): Likewise. (ctype<char>::do_widen(const char*, const char*, char*)): Likewise. (ctype<char>::do_narrow(const char*, const char*, char, char*)): Likewise. From-SVN: r267651
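The kind of guard described above can be sketched as follows. `checked_move` is a hypothetical free function written for illustration, not the libstdc++ code; the assumption is that a zero length is the only case in which callers may legitimately hand over null pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: like memmove, but tolerates null pointers when
// n == 0 by never forwarding them to the C library.
inline char* checked_move(char* dst, const char* src, std::size_t n)
{
    if (n == 0)
        return dst;  // nothing to copy; dst and src may be null
    assert(dst != nullptr && src != nullptr);
    return static_cast<char*>(std::memmove(dst, src, n));
}
```

The early return matters because memmove and memcpy require valid pointers even for a zero length, and the compiler may assume that any pointer passed to them is non-null.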
Jonathan Wakely committed -
Investigating PR target/86891 revealed a number of issues with the way the AArch64 backend was handling overflow detection patterns. Firstly, expansion for signed and unsigned types is not the same as in one form the overflow is detected via the C flag and in the other it is done via the V flag in the PSR. Secondly, particular care has to be taken when describing overflow of signed types: the comparison has to be performed conceptually on a value that cannot overflow and compared to a value that might have overflowed. It became apparent that some of the patterns were simply unmatchable (they collapse to NEG in the RTL rather than subtracting from zero) and a number of patterns were overly restrictive in terms of the immediate constants that they supported. I've tried to address all of these issues as well. gcc: PR target/86891 * config/aarch64/aarch64.c (aarch64_expand_subvti): New parameter unsigned_p. Handle signed and unsigned overflow correction as required. * config/aarch64/aarch64-protos.h (aarch64_expand_subvti): Update prototype. * config/aarch64/aarch64.md (addv<mode>4): Use aarch64_plus_operand for operand 2. (add<mode>3_compareV_imm): Make this callable for expanding. (subv<GPI:mode>4): Use register_operand for operand 1. Use aarch64_plus_operand for operand 2. (subv<GPI:mode>_insn): New insn pattern. (subv<GPI:mode>_imm): Likewise. (negv<GPI:mode>3): New expand pattern. (negv<GPI:mode>_insn): New insn pattern. (negv<GPI:mode>_cmp_only): Likewise. (cmpv<GPI:mode>_insn): Likewise. (subvti4): Use register_operand for operand 1. Update call to aarch64_expand_subvti. (usubvti4): Likewise. (negvti3): New expand pattern. (negdi_carryout): New insn pattern. (negvdi_carryinV): New insn pattern. (sub<mode3>_compare1_imm): Delete named insn pattern, make anonymous version the named version. (peepholes to convert to sub<mode3>_compare1_imm): Adjust order of operands. (usub<GPI:mode>3_carryinC, usub<GPI:mode>3_carryinC_z1): New insn patterns. 
(usub<GPI:mode>3_carryinC_z2, usub<GPI:mode>3_carryinC): New insn patterns. (sub<mode>3_carryinCV, sub<mode>3_carryinCV_z1_z2): Delete. (sub<mode>3_carryinCV_z1, sub<mode>3_carryinCV_z2): Delete. (sub<mode>3_carryinCV): Delete. (sub<GPI:mode>3_carryinV): New expand pattern. (sub<mode>3_carryinV, sub<mode>3_carryinV_z2): New insn patterns. testsuite: * gcc.target/aarch64/subs_compare_2.c: Make '#' immediate prefix optional in scan pattern. From-SVN: r267650
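The signed/unsigned distinction drawn above is visible at the source level through GCC's overflow builtins. The sketch below is an illustration, not part of the patch: the two wrapper names are hypothetical, and it is an assumption here that code of this shape is what exercises the subv/usubv expanders.

```cpp
#include <cassert>
#include <cstdint>

// Signed: overflow means the exact result does not fit in int64_t
// (on AArch64 this is the condition reported by the V flag).
inline bool sub_overflows_signed(std::int64_t a, std::int64_t b,
                                 std::int64_t* out)
{
    assert(out != nullptr);
    return __builtin_sub_overflow(a, b, out);
}

// Unsigned: "overflow" means a borrow, i.e. b > a
// (on AArch64 this is derived from the C flag).
inline bool sub_overflows_unsigned(std::uint64_t a, std::uint64_t b,
                                   std::uint64_t* out)
{
    assert(out != nullptr);
    return __builtin_sub_overflow(a, b, out);
}
```

The same bit pattern can overflow in one interpretation and not in the other: 0 - 1 is a valid signed result (-1) but an unsigned borrow, which is why the two expansions cannot share one flag test.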
Richard Earnshaw committed -
2019-01-07 Richard Biener <rguenther@suse.de> * tree-ssa-uncprop.c (ssa_equip_hash_traits): Remove in favor of tree_operand_hash. From-SVN: r267649
Richard Biener committed -
* doc/xml/manual/spine.xml: Update copyright years. * doc/xml/manual/status_cxx2017.xml: Adjust note about -lstdc++fs. * doc/xml/manual/using.xml: Remove requirement to link with -lstdc++fs for C++17 filesystem library. * doc/html/*: Regenerate. From-SVN: r267648
Jonathan Wakely committed -
Older versions of newlib do not provide truncate so add a configure check for it, and provide a fallback definition. There were also some missing exports in the linker script, which went unnoticed because there are no tests for some functions. A new link-only test checks that every filesystem operation function is defined by the library. * acinclude.m4 (GLIBCXX_CHECK_FILESYSTEM_DEPS): Check for truncate. * config.h.in: Regenerate. * config/abi/pre/gnu.ver: Order patterns for filesystem operations alphabetically and add missing entries for copy_symlink, hard_link_count, rename, and resize_file. * configure: Regenerate. * src/c++17/fs_ops.cc (resize_file): Remove #if so posix::truncate is used unconditionally. * src/filesystem/ops-common.h (__gnu_posix::truncate) [!_GLIBCXX_HAVE_TRUNCATE]: Provide fallback definition that only supports truncating to zero length. * testsuite/27_io/filesystem/operations/all.cc: New test. * testsuite/27_io/filesystem/operations/resize_file.cc: New test. From-SVN: r267647
Jonathan Wakely committed -
This patch folds certain reductions of X & CST to X[I] & CST[I] if I is the only nonzero element of CST. This includes the motivating case in which CST[I] is -1. We could do the same for REDUC_MAX on unsigned types, but I wasn't sure that that special case was worth it. 2019-01-07 Richard Sandiford <richard.sandiford@arm.com> gcc/ PR tree-optimization/88598 * tree.h (single_nonzero_element): Declare. * tree.c (single_nonzero_element): New function. * match.pd: Fold certain reductions of X & CST to X[I] & CST[I] if I is the only nonzero element of CST. gcc/testsuite/ PR tree-optimization/88598 * gcc.dg/vect/pr88598-1.c: New test. * gcc.dg/vect/pr88598-2.c: Likewise. * gcc.dg/vect/pr88598-3.c: Likewise. * gcc.dg/vect/pr88598-4.c: Likewise. * gcc.dg/vect/pr88598-5.c: Likewise. * gcc.dg/vect/pr88598-6.c: Likewise. From-SVN: r267646
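The identity behind this fold can be checked with a scalar model. The sketch below uses a plus-reduction as one case where the identity clearly holds; which reductions the match.pd pattern actually covers is not stated above. `Vec4`, `only_nonzero_lane` and the two reduction functions are hypothetical names for illustration and do not mirror the tree.c interface.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Vec4 = std::array<std::uint32_t, 4>;

// Unfolded form: reduce (sum) the lanes of x & cst.
inline std::uint32_t reduce_plus_of_and(const Vec4& x, const Vec4& cst)
{
    std::uint32_t sum = 0;
    for (std::size_t i = 0; i < x.size(); ++i)
        sum += x[i] & cst[i];
    return sum;
}

// Hypothetical helper: index of the only nonzero lane of cst, or -1 if
// there is no nonzero lane or more than one.
inline int only_nonzero_lane(const Vec4& cst)
{
    int found = -1;
    for (std::size_t i = 0; i < cst.size(); ++i) {
        if (cst[i] == 0)
            continue;
        if (found != -1)
            return -1;
        found = static_cast<int>(i);
    }
    return found;
}

// Folded form: every other lane contributes zero, so only lane I matters.
inline std::uint32_t folded_reduction(const Vec4& x, const Vec4& cst, int lane)
{
    assert(lane >= 0 && lane < 4);
    const std::size_t i = static_cast<std::size_t>(lane);
    return x[i] & cst[i];
}
```

When CST[I] is all ones (the -1 case mentioned above), the folded form reduces further to a plain element extract X[I].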
Richard Sandiford committed -
The PR has: vect__6.24_42 = vect__5.23_41 * { 0.0, 1.0e+0, 0.0, 0.0 }; which for -fno-signed-zeros -fno-signaling-nans can be simplified to: vect__6.24_42 = vect__5.23_41 & { 0, -1, 0, 0 }; I deliberately didn't handle COMPLEX_CST or CONSTRUCTOR in initializer_each_zero_or_onep since there are no current use cases. The patch also makes (un)signed_type_for handle floating-point types. I tried to audit all callers and the few that handle null returns would be unaffected. 2019-01-07 Richard Sandiford <richard.sandiford@arm.com> gcc/ PR tree-optimization/88598 * tree.h (initializer_each_zero_or_onep): Declare. * tree.c (initializer_each_zero_or_onep): New function. (signed_or_unsigned_type_for): Handle float types too. (unsigned_type_for, signed_type_for): Update comments accordingly. * match.pd: Fold x * { 0 or 1, 0 or 1, ...} to x & { 0 or -1, 0 or -1, ... }. gcc/testsuite/ PR tree-optimization/88598 * gcc.dg/pr88598-1.c: New test. * gcc.dg/pr88598-2.c: Likewise. * gcc.dg/pr88598-3.c: Likewise. * gcc.dg/pr88598-4.c: Likewise. * gcc.dg/pr88598-5.c: Likewise. From-SVN: r267645
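A per-lane scalar model makes the rewrite easier to see. The sketch assumes a 32-bit IEEE float and finite inputs; `and_lane` is a hypothetical helper, not a GCC function.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "sketch assumes a 32-bit float");

// Hypothetical helper: AND the bit pattern of one float lane with an
// integer mask (all ones keeps the lane, zero clears it to +0.0).
inline float and_lane(float x, std::uint32_t mask)
{
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof bits);  // well-defined bit cast
    bits &= mask;
    float out;
    std::memcpy(&out, &bits, sizeof out);
    return out;
}
```

Multiplying by 1.0 leaves the bits unchanged, matching the all-ones mask. Multiplying a negative value by 0.0 yields -0.0, whereas the zero mask always yields +0.0, which is why the fold is only valid when signed zeros need not be honored.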
Richard Sandiford committed -
* doc/install.texi: Replace references to x86_64-unknown-linux-gnu with x86_64-pc-linux-gnu. From-SVN: r267643
Jonathan Wakely committed -
With PTX_MAX_VECTOR_LENGTH set to larger than PTX_WARP_SIZE, routines can be called from offloading regions with vector-size set to larger than warp size. OTOH, vector-partitionable routines assume warp-sized vector length. Detect if we're calling a vector-partitionable routine from an offloading region, and if so, fall back to warp-sized vector length in that region. 2019-01-07 Tom de Vries <tdevries@suse.de> PR target/85486 * config/nvptx/nvptx.c (has_vector_partitionable_routine_calls_p): New function. (nvptx_goacc_validate_dims): Force vl32 if calling vector-partitionable routines. From-SVN: r267640
Tom de Vries committed -
* config/i386/sse.md (vec_extract<mode><ssehalfvecmodelower>): Use V_256_512 iterator instead of V_512 and TARGET_AVX instead of TARGET_AVX512F as condition. From-SVN: r267639
Jakub Jelinek committed -
PR debug/88723 * dwarf2out.c (const_ok_for_output_1): Remove redundant call to const_not_ok_for_debug_p target hook. (mem_loc_descriptor) <case UNSPEC>: Only call const_ok_for_output_1 on UNSPEC and subexpressions thereof if all subexpressions of the UNSPEC are CONSTANT_P. From-SVN: r267638
Jakub Jelinek committed -
PR tree-optimization/88676 * tree-ssa-phiopt.c (two_value_replacement): New function. (tree_ssa_phiopt_worker): Call it. * gcc.dg/tree-ssa/pr88676.c: New test. * gcc.dg/pr88676.c: New test. * gcc.dg/tree-ssa/pr15826.c: Just verify there is no goto, allow &. From-SVN: r267634
Jakub Jelinek committed -
PR sanitizer/88619 * cfgexpand.c (expand_stack_vars): Only align prev_offset to ASAN_MIN_RED_ZONE_SIZE, not to maximum of that and alignb. * c-c++-common/asan/pr88619.c: New test. From-SVN: r267633
Jakub Jelinek committed -
PR c++/85052 * tree-vect-generic.c: Include insn-config.h and recog.h. (expand_vector_piecewise): Add defaulted ret_type argument, if non-NULL, use that in preference to type for the result type. (expand_vector_parallel): Formatting fix. (do_vec_conversion, do_vec_narrowing_conversion, expand_vector_conversion): New functions. (expand_vector_operations_1): Call expand_vector_conversion for VEC_CONVERT ifn calls. * internal-fn.def (VEC_CONVERT): New internal function. * internal-fn.c (expand_VEC_CONVERT): New function. * fold-const-call.c (fold_const_vec_convert): New function. (fold_const_call): Use it for CFN_VEC_CONVERT. * doc/extend.texi (__builtin_convertvector): Document. c-family/ * c-common.h (enum rid): Add RID_BUILTIN_CONVERTVECTOR. (c_build_vec_convert): Declare. * c-common.c (c_build_vec_convert): New function. c/ * c-parser.c (c_parser_postfix_expression): Parse __builtin_convertvector. cp/ * cp-tree.h (cp_build_vec_convert): Declare. * parser.c (cp_parser_postfix_expression): Parse __builtin_convertvector. * constexpr.c: Include fold-const-call.h. (cxx_eval_internal_function): Handle IFN_VEC_CONVERT. (potential_constant_expression_1): Likewise. * semantics.c (cp_build_vec_convert): New function. * pt.c (tsubst_copy_and_build): Handle CALL_EXPR to IFN_VEC_CONVERT. testsuite/ * c-c++-common/builtin-convertvector-1.c: New test. * c-c++-common/torture/builtin-convertvector-1.c: New test. * g++.dg/ext/builtin-convertvector-1.C: New test. * g++.dg/cpp0x/constexpr-builtin4.C: New test. From-SVN: r267632
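A minimal usage sketch of the new builtin, assuming a compiler that provides `__builtin_convertvector` (GCC from this change onward, or Clang) and 4-byte int and float. The typedef and function names are hypothetical.

```cpp
// GCC generic vector types: four lanes of 4 bytes each.
typedef int   v4si __attribute__((vector_size(16)));
typedef float v4sf __attribute__((vector_size(16)));

// Element-wise int -> float conversion; the source and destination
// vector types must have the same number of elements.
inline v4sf to_float(v4si v)
{
    return __builtin_convertvector(v, v4sf);
}

// Element-wise float -> int conversion, truncating toward zero as a
// scalar C cast would.
inline v4si to_int(v4sf v)
{
    return __builtin_convertvector(v, v4si);
}
```

Unlike a cast between vector types, which reinterprets the bits, the builtin converts each element's value.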
Jakub Jelinek committed -
Add support for vector reductions with openacc vector_length larger than warp-size. 2019-01-07 Tom de Vries <tdevries@suse.de> * config/nvptx/nvptx-protos.h (nvptx_output_red_partition): Declare. * config/nvptx/nvptx.c (vector_red_size, vector_red_align, vector_red_partition, vector_red_sym): New global variables. (nvptx_option_override): Initialize vector_red_sym. (nvptx_declare_function_name): Restore red_partition register. (nvptx_file_end): Emit code to declare the vector reduction variables. (nvptx_output_red_partition): New function. (nvptx_expand_shared_addr): Add vector argument. Use it to handle large vector reductions. (enum nvptx_builtins): Add NVPTX_BUILTIN_VECTOR_ADDR. (nvptx_init_builtins): Add VECTOR_ADDR. (nvptx_expand_builtin): Update call to nvptx_expand_shared_addr. Handle nvptx_expand_shared_addr. (nvptx_get_shared_red_addr): Add vector argument and handle large vectors. (nvptx_goacc_reduction_setup): Add offload_attrs argument and handle large vectors. (nvptx_goacc_reduction_init): Likewise. (nvptx_goacc_reduction_fini): Likewise. (nvptx_goacc_reduction_teardown): Likewise. (nvptx_goacc_reduction): Update calls to nvptx_goacc_reduction_{setup, init,fini,teardown}. (nvptx_init_axis_predicate): Initialize vector_red_partition. (nvptx_set_current_function): Init vector_red_partition. * config/nvptx/nvptx.md (UNSPECV_RED_PART): New unspecv. (nvptx_red_partition): New insn. * config/nvptx/nvptx.h (struct machine_function): Add red_partition. From-SVN: r267631
Tom de Vries committed -
When compiling an empty loop:
...
long long v1;
#pragma acc parallel num_gangs (640) num_workers(1) vector_length (128)
#pragma acc loop
for (v1 = 0; v1 < 20; v1 += 2)
  ;
...
the compiler emits two subsequent bar.syncs. This triggers some bug on my quadro m1200 (I'm assuming in the ptxas/JIT compiler) that hangs the testcase. This patch works around the bug by doing an optimization: we detect that this is an empty loop (a forked immediately followed by a joining), and don't emit the barriers. The patch does not include the test-case yet, since vector_length (128) is not yet supported at this point. 2019-01-07 Tom de Vries <tdevries@suse.de> PR target/85381 * config/nvptx/nvptx.c (nvptx_process_pars): Don't emit barriers for empty loops. From-SVN: r267630
Tom de Vries committed -
Add support for a per-worker broadcast buffer and barrier, to be used for openacc vector_length larger than warp-size. 2019-01-07 Tom de Vries <tdevries@suse.de> * config/nvptx/nvptx.c (oacc_bcast_partition): Declare. (nvptx_option_override): Init oacc_bcast_partition. (nvptx_init_oacc_workers): New function. (nvptx_declare_function_name): Call nvptx_init_oacc_workers. (nvptx_needs_shared_bcast): New function. (nvptx_find_par): Generalize to enable vectors to use shared-memory to propagate state. (nvptx_shared_propagate): Initialize vector bcast partition and synchronization state. (nvptx_single): Generalize to enable vectors to use shared-memory to propagate state. (nvptx_process_pars): Likewise. (nvptx_set_current_function): Initialize oacc_broadcast_partition. * config/nvptx/nvptx.h (struct machine_function): Add bcast_partition and sync_bar members. From-SVN: r267629
Tom de Vries committed -
Allow PTX_MAX_VECTOR_LENGTH to be defined as larger than PTX_WARP_SIZE in nvptx_goacc_validate_dims_1. 2019-01-07 Tom de Vries <tdevries@suse.de> * config/nvptx/nvptx.c (nvptx_welformed_vector_length_p) (nvptx_apply_dim_limits): New function. (nvptx_goacc_validate_dims_1): Allow PTX_MAX_VECTOR_LENGTH larger than PTX_WARP_SIZE. From-SVN: r267628
Tom de Vries committed -
Move warnings in nvptx_goacc_validate_dims_1 to as late as possible. This allows us more flexibility in setting the dimensions. 2019-01-07 Tom de Vries <tdevries@suse.de> * config/nvptx/nvptx.c (nvptx_goacc_validate_dims_1): Move warnings to as late as possible. From-SVN: r267627
Tom de Vries committed
-