- 02 Dec, 2019 11 commits
-
-
A non-standard clock may tick more slowly than std::chrono::steady_clock. This means that we risk returning false early when the specified timeout may not have expired. This can be avoided by looping until the timeout time as reported by the non-standard clock has been reached. Unfortunately, we have no way to tell whether the non-standard clock ticks more quickly than std::chrono::steady_clock. If it does then we risk returning later than would be expected, but that is unavoidable and permitted by the standard. 2019-12-02 Mike Crowe <mac@mcrowe.com> PR libstdc++/91906 Fix timed_mutex::try_lock_until on arbitrary clock * include/std/mutex (__timed_mutex_impl::_M_try_lock_until): Loop until the absolute timeout time is reached as measured against the appropriate clock. * testsuite/util/slow_clock.h: New file. Move implementation of slow_clock test class. * testsuite/30_threads/condition_variable/members/2.cc: Include slow_clock from header. * testsuite/30_threads/shared_timed_mutex/try_lock/3.cc: Convert existing test to templated function so that it can be called with both system_clock and steady_clock. * testsuite/30_threads/timed_mutex/try_lock_until/3.cc: Also run test using slow_clock to test above fix. * testsuite/30_threads/recursive_timed_mutex/try_lock_until/3.cc: Likewise. * testsuite/30_threads/recursive_timed_mutex/try_lock_until/4.cc: Add new test that try_lock_until behaves as try_lock if the timeout has already expired or exactly matches the current time. From-SVN: r278902
Mike Crowe committed -
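The looping strategy described in r278902 can be sketched outside the library roughly as follows. This is a minimal illustration only, not the libstdc++ source: `try_lock_until_sketch` and this `slow_clock` are hypothetical names written for the example, and the clock is assumed to tick at one third the rate of steady_clock.

```cpp
#include <chrono>
#include <mutex>

// Hypothetical user-defined clock that advances three times more slowly
// than std::chrono::steady_clock (an assumption made for this sketch).
struct slow_clock
{
    using rep = std::chrono::steady_clock::rep;
    using period = std::chrono::steady_clock::period;
    using duration = std::chrono::steady_clock::duration;
    using time_point = std::chrono::time_point<slow_clock, duration>;
    static constexpr bool is_steady = true;

    static time_point now()
    {
        const auto real = std::chrono::steady_clock::now();
        return time_point{real.time_since_epoch() / 3};
    }
};

// Wait on the mutex using relative timeouts, but keep re-checking the
// caller's own clock so that we never report a timeout before that
// clock says the deadline has passed.
template <typename Clock, typename Duration>
bool try_lock_until_sketch(
    std::timed_mutex& m,
    const std::chrono::time_point<Clock, Duration>& deadline)
{
    auto now = Clock::now();
    do
    {
        // A zero or negative remaining time makes try_lock_for behave
        // like a plain try_lock.
        const auto remaining = deadline - now;
        if (m.try_lock_for(remaining))
            return true;
        now = Clock::now();
    } while (deadline > now);
    return false;
}
```

Calling `try_lock_until_sketch(m, slow_clock::now() + std::chrono::milliseconds(20))` on a mutex held by another thread should return false only once `slow_clock` itself reports that the deadline has been reached.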
The pthread_mutex_clocklock function is available in glibc since the 2.30 release. If this function is available in the C library it can be used to fix PR libstdc++/78237 by supporting steady_clock properly with timed_mutex. This means that code using timed_mutex::try_lock_for or timed_mutex::try_lock_until with steady_clock is no longer subject to timing out early or potentially waiting for much longer if the system clock is warped at an inopportune moment. If pthread_mutex_clocklock is available then steady_clock is deemed to be the "best" clock available which means that it is used for the relative try_lock_for calls and absolute try_lock_until calls using steady_clock and user-defined clocks. Calls explicitly using system_clock (aka high_resolution_clock) continue to use CLOCK_REALTIME via __gthread_mutex_timedlock. If pthread_mutex_clocklock is not available then system_clock is deemed to be the "best" clock available which means that the previous suboptimal behaviour remains. 2019-12-02 Mike Crowe <mac@mcrowe.com> PR libstdc++/78237 Add full steady_clock support to timed_mutex * acinclude.m4 (GLIBCXX_CHECK_PTHREAD_MUTEX_CLOCKLOCK): Define to detect presence of pthread_mutex_clocklock function. * config.h.in: Regenerate. * configure: Regenerate. * configure.ac: Call GLIBCXX_CHECK_PTHREAD_MUTEX_CLOCKLOCK. * include/std/mutex (__timed_mutex_impl): Remove unnecessary __clock_t. (__timed_mutex_impl::_M_try_lock_for): Use best clock to turn relative timeout into absolute timeout. (__timed_mutex_impl::_M_try_lock_until): Keep existing implementation for system_clock. Add new implementation for steady_clock that calls _M_clocklock. Modify overload for user-defined clock to use a relative wait so that it automatically uses the best clock. [_GLIBCXX_USE_PTHREAD_MUTEX_CLOCKLOCK] (timed_mutex::_M_clocklock): New member function. (recursive_timed_mutex::_M_clocklock): Likewise. From-SVN: r278901
Mike Crowe committed -
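The "best clock" idea in r278901 can be illustrated with a small sketch that picks a clock at compile time and uses it to turn a relative timeout into an absolute deadline. The macro `SKETCH_HAVE_MUTEX_CLOCKLOCK` and the names `best_clock` and `absolute_deadline` are hypothetical stand-ins for this example; the real configuration macro and internals live in libstdc++ and are not reproduced here.

```cpp
#include <chrono>
#include <ratio>

// Hypothetical feature macro: when the platform can wait against a
// monotonic clock, prefer steady_clock, otherwise fall back.
#ifdef SKETCH_HAVE_MUTEX_CLOCKLOCK
using best_clock = std::chrono::steady_clock;
#else
using best_clock = std::chrono::system_clock;
#endif

// Convert a relative timeout into an absolute time point on best_clock.
template <typename Rep, typename Period>
best_clock::time_point
absolute_deadline(const std::chrono::duration<Rep, Period>& rel)
{
    auto rt = std::chrono::duration_cast<best_clock::duration>(rel);
    // duration_cast truncates; if the clock is coarser than the
    // requested unit, round up by one tick so we never wait too little.
    if (std::ratio_greater<best_clock::period, Period>::value)
        ++rt;
    return best_clock::now() + rt;
}
```

With such a helper, a relative wait reduces to an absolute wait against whichever clock was selected, which is why a warped system clock only matters in the fallback configuration.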
2019-12-02 Mike Crowe <mac@mcrowe.com> * testsuite/30_threads/recursive_timed_mutex/try_lock_until/3.cc: New test. Ensure that recursive_timed_mutex::try_lock_until actually times out after the specified time when using both system_clock and steady_clock. * testsuite/30_threads/timed_mutex/try_lock_until/3.cc: New test. Likewise but for timed_mutex. * testsuite/30_threads/timed_mutex/try_lock_until/57641.cc: Template test functions and use them to test both steady_clock and system_clock. * testsuite/30_threads/unique_lock/locking/4.cc: Likewise. Wrap call to timed_mutex::try_lock_until in VERIFY macro to check its return value. From-SVN: r278900
Mike Crowe committed -
2019-12-02 Martin Liska <mliska@suse.cz> * ipa-devirt.c (warn_types_mismatch): Use get_odr_name_for_type function. (debug_tree_odr_name): New. * ipa-utils.h (get_odr_name_for_type): New. 2019-12-02 Martin Liska <mliska@suse.cz> * g++.dg/lto/odr-7_0.C: New test. * g++.dg/lto/odr-7_1.C: New test. From-SVN: r278898
Martin Liska committed -
* g++.dg/lto/inline-crossmodule-1_0.C: fix template. From-SVN: r278897
Jan Hubicka committed -
2019-12-02 Richard Biener <rguenther@suse.de> PR tree-optimization/92742 * tree-vect-loop.c (vect_fixup_reduc_chain): Do not touch the def-type but verify it is consistent with the original stmts. * gcc.dg/torture/pr92742.c: New testcase. From-SVN: r278896
Richard Biener committed -
PR tree-optimization/92712 * match.pd ((A * B) +- A -> (B +- 1) * A, A +- (A * B) -> (1 +- B) * A): Allow optimizing signed integers even when we don't know anything about range of A, but do know something about range of B and the simplification won't introduce new UB. * gcc.dg/tree-ssa/pr92712-1.c: New test. * gcc.dg/tree-ssa/pr92712-2.c: New test. * gcc.dg/tree-ssa/pr92712-3.c: New test. * gfortran.dg/loop_versioning_1.f90: Adjust expected number of likely to be innermost dimension messages. * gfortran.dg/loop_versioning_10.f90: Likewise. * gfortran.dg/loop_versioning_6.f90: Likewise. From-SVN: r278894
Jakub Jelinek committed -
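The two simplifications named in the PR tree-optimization/92712 entry are plain algebraic identities, sketched below with illustrative function names that are not taken from GCC. The comments point out why the range of B matters for signed integers; the exact conditions GCC checks are defined in match.pd and are not reproduced here.

```cpp
// (A * B) + A  ->  (B + 1) * A
int mul_add_original(int a, int b) { return a * b + a; }
int mul_add_factored(int a, int b) { return (b + 1) * a; }

// (A * B) - A  ->  (B - 1) * A
int mul_sub_original(int a, int b) { return a * b - a; }
int mul_sub_factored(int a, int b) { return (b - 1) * a; }

// Why a range on B is needed: with a == 0 and b == INT_MAX the original
// form evaluates to 0 without overflow, while the factored form computes
// b + 1 first, which overflows and is undefined behaviour for signed int.
// If B is known never to be INT_MAX (for +) the rewrite cannot introduce
// that particular overflow.
```

For small operands both forms agree, for example `mul_add_original(7, 5)` and `mul_add_factored(7, 5)` both yield 42.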
2019-12-02 Feng Xue <fxue@os.amperecomputing.com> PR ipa/92133 * doc/invoke.texi (ipa-cp-max-recursive-depth): Document new option. (ipa-cp-min-recursive-probability): Likewise. * params.opt (ipa-cp-max-recursive-depth): New. (ipa-cp-min-recursive-probability): Likewise. * ipa-cp.c (ipcp_lattice<valtype>::add_value): Add two new parameters val_p and unlimited. (self_recursively_generated_p): New function. (get_val_across_arith_op): Likewise. (propagate_vals_across_arith_jfunc): Add constant propagation for self-recursive function. (incorporate_penalties): Do not penalize pure self-recursive function. (good_cloning_opportunity_p): Dump node_is_self_scc flag. (propagate_constants_topo): Set node_is_self_scc flag for cgraph node. (get_info_about_necessary_edges): Relax hotness check for edge to self-recursive function. * ipa-prop.h (ipa_node_params): Add new field node_is_self_scc. 2019-12-02 Feng Xue <fxue@os.amperecomputing.com> PR ipa/92133 * gcc.dg/ipa/ipa-clone-2.c: New test. From-SVN: r278893
Feng Xue committed -
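As a rough illustration of the kind of code the PR ipa/92133 change targets, consider a function that calls itself with its argument adjusted by a constant. The function below is a hypothetical example written for this note, not the contents of gcc.dg/ipa/ipa-clone-2.c, and whether GCC actually clones it depends on the new ipa-cp-max-recursive-depth and ipa-cp-min-recursive-probability parameters and on other heuristics.

```cpp
// Hypothetical self-recursive function: each recursive call passes
// i + 1, an arithmetic operation on the incoming parameter. If the
// first call uses a known constant, the values seen by the deeper
// calls are also constants that propagation could in principle derive.
int recur_fn(int i)
{
    if (i > 3)
        return 0;
    return i + recur_fn(i + 1);
}

// Entry point with a constant argument: the chain of argument values
// is 1, 2, 3, 4.
int call_with_constant()
{
    return recur_fn(1);
}
```

The sketch only shows the source shape; it makes no claim about the code GCC generates for it.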
2019-12-01 Sandra Loosemore <sandra@codesourcery.com> Fix bugs relating to flexibly-sized objects in nios2 backend. PR target/92499 gcc/c/ * c-decl.c (flexible_array_type_p): Move to common code. gcc/ * config/nios2/nios2.c (nios2_in_small_data_p): Do not consider objects of flexible types to be small if they have internal linkage or are declared extern. * config/nios2/nios2.h (ASM_OUTPUT_ALIGNED_LOCAL): Replace with... (ASM_OUTPUT_ALIGNED_DECL_LOCAL): ...this. Use targetm.in_small_data_p instead of the size of the object initializer. * tree.c (flexible_array_type_p): Move from C front end, and generalize to handle fields in non-C structures. * tree.h (flexible_array_type_p): Declare. gcc/testsuite/ * gcc.target/nios2/pr92499-1.c: New. * gcc.target/nios2/pr92499-2.c: New. * gcc.target/nios2/pr92499-3.c: New. From-SVN: r278891
Sandra Loosemore committed -
The instruction sequence generated for P9LE is no worse than the one for P8LE: mtvsrdd;xxlnot;stxv vs. not;not;std;std. It can have longer latency, but latency via memory is not so critical, and this does save decode and other resources. It's hard to choose which is best. Update the test case to fix failures. gcc/testsuite/ChangeLog: 2019-12-02 Luo Xiong Hu <luoxhu@linux.ibm.com> PR testsuite/92398 * gcc.target/powerpc/pr72804.c: Split the store function to... * gcc.target/powerpc/pr92398.h: ... this one. New. * gcc.target/powerpc/pr92398.p9+.c: New. * gcc.target/powerpc/pr92398.p9-.c: New. * lib/target-supports.exp (check_effective_target_p8): New. (check_effective_target_p9+): New. From-SVN: r278890
Luo Xiong Hu committed -
From-SVN: r278889
GCC Administrator committed
-
- 01 Dec, 2019 3 commits
-
-
re PR libfortran/90374 (Fortran 2018: Support d0.d, e0.d, es0.d, en0.d, g0.d and ew.d e0 edit descriptors for output) 2019-12-01 Jerry DeLisle <jvdelisle@gcc.gnu.org> PR fortran/90374 * io/format.c (parse_format_list): Add braces to disambiguate conditional. From-SVN: r278886
Jerry DeLisle committed -
* profile-count.h (profile_count::operator<): Use IPA value for comparison. (profile_count::operator>): Likewise. (profile_count::operator<=): Likewise. (profile_count::operator>=): Likewise. * predict.c (maybe_hot_count_p): Do not convert to gcov_type. From-SVN: r278885
Jan Hubicka committed -
From-SVN: r278883
GCC Administrator committed
-
- 30 Nov, 2019 11 commits
-
-
* ipa-inline.c (compute_max_insns): Return int64_t. (inline_small_functions): Simplify. From-SVN: r278880
Jan Hubicka committed -
* tree-cfg.c (execute_fixup_cfg): Update also max_bb_count when scaling happens. From-SVN: r278879
Jan Hubicka committed -
2019-11-30 Jan Hubicka <hubicka@ucw.cz> * cgraph.h (symtab_node): Add symver flag. * cgraphunit.c (process_symver_attribute): New. (process_common_attributes): Use process_symver_attribute. * lto-cgraph.c (lto_output_node): Stream symver. (lto_output_varpool_node): Stream symver. (input_overwrite_node): Stream symver. (input_varpool_node): Stream symver. * output.h (do_assemble_symver): Declare. * symtab.c (symtab_node::dump_base): Dump symver. (symtab_node::verify_base): Verify symver. (symtab_node::resolve_alias): Handle symver. * varasm.c (do_assemble_symver): New function. * varpool.c (varpool_node::assemble_aliases): Use it. * doc/extend.texi (symver attribute): Document. * config/elfos.h (ASM_OUTPUT_SYMVER_DIRECTIVE): New. c-family/ChangeLog: 2019-11-30 Jan Hubicka <hubicka@ucw.cz> * c-attribs.c (handle_symver_attribute): New function. (c_common_attributes): Add symver. From-SVN: r278878
Jan Hubicka committed -
This patch adds a new target hook to check whether there are any target-specific reasons why a type cannot be used in a certain source-language context. It works in a similar way to existing hooks like TARGET_INVALID_CONVERSION and TARGET_INVALID_UNARY_OP. The reason for adding the hook is to report invalid uses of SVE types. Throughout a TU, the SVE vector and predicate types represent values that can be stored in an SVE vector or predicate register. At certain points in the TU we might be able to generate code that assumes the registers have a particular size, but often we can't. In some cases we might even make multiple different assumptions in the same TU (e.g. when implementing an ifunc for multiple vector lengths). But SVE types themselves are the same type throughout. The register size assumptions change how we generate code, but they don't change the definition of the types. This means that the types do not have a fixed size at the C level even when -msve-vector-bits=N is in effect. It also means that the size does not work in the same way as for C VLAs, where the abstract machine evaluates the size at a particular point and then carries that size forward to later code. The SVE ACLE deals with this by making it invalid to use C and C++ constructs that depend on the size or layout of SVE types. The spec refers to the types as "sizeless" types and defines their semantics as edits to the standards. See: https://gcc.gnu.org/ml/gcc-patches/2018-10/msg00868.html for a fuller description and: https://gcc.gnu.org/ml/gcc/2019-11/msg00088.html for a recent update on the status. However, since all current sizeless types are target-specific built-in types, there's no real reason for the frontends to handle them directly. They can just hand off the checks to target code instead. It's then possible for the errors to refer to "SVE types" rather than "sizeless types", which is likely to be more meaningful to users. 
There is a slight overlap between the new tests and the ones for gnu_vector_type_p in r277950, but here the emphasis is on testing sizelessness. 2019-11-30 Richard Sandiford <richard.sandiford@arm.com> gcc/ * target.h (type_context_kind): New enum. (verify_type_context): Declare. * target.def (verify_type_context): New target hook. * doc/tm.texi.in (TARGET_VERIFY_TYPE_CONTEXT): Likewise. * doc/tm.texi: Regenerate. * tree.c (verify_type_context): New function. * config/aarch64/aarch64-protos.h (aarch64_sve::verify_type_context): Declare. * config/aarch64/aarch64-sve-builtins.cc (verify_type_context): New function. * config/aarch64/aarch64.c (aarch64_verify_type_context): Likewise. (TARGET_VERIFY_TYPE_CONTEXT): Define. gcc/c-family/ * c-common.c (pointer_int_sum): Use verify_type_context to check whether the target allows pointer arithmetic for the types involved. (c_sizeof_or_alignof_type, c_alignof_expr): Use verify_type_context to check whether the target allows sizeof and alignof operations for the types involved. gcc/c/ * c-decl.c (start_decl): Allow initialization of variables whose size is a POLY_INT_CST. (finish_decl): Use verify_type_context to check whether the target allows variables with a particular type to have static or thread-local storage duration. Don't raise a second error if such variables do not have a constant size. (grokdeclarator): Use verify_type_context to check whether the target allows fields or array elements to have a particular type. * c-typeck.c (pointer_diff): Use verify_type_context to test whether the target allows pointer difference for the types involved. (build_unary_op): Likewise for pointer increment and decrement. gcc/testsuite/ * gcc.target/aarch64/sve/acle/general-c/sizeless-1.c: New test. * gcc.target/aarch64/sve/acle/general-c/sizeless-2.c: Likewise. From-SVN: r278877
Richard Sandiford committed -
* cgraph.c (cgraph_node::dump): Dump unit_id and merged_extern_inline. * cgraph.h (cgraph_node): Add unit_id and merged_extern_inline. (symbol_table): Add max_unit. (symbol_table::symbol_table): Initialize it. * cgraphclones.c (duplicate_thunk_for_node): Copy unit_id, merged_comdat, merged_extern_inline. (cgraph_node::create_clone): Likewise. (cgraph_node::create_version_clone): Likewise. * ipa-fnsummary.c (dump_ipa_call_summary): Dump info about cross module calls. * ipa-fnsummary.h (cross_module_call_p): New inline function. * ipa-inline-analysis.c (simple_edge_hints): Use it. * ipa-inline.c (inline_small_functions): Likewise. * lto-symtab.c (lto_cgraph_replace_node): Record merged_extern_inline; copy merged_comdat and merged_extern_inline. * lto-cgraph.c (lto_output_node): Stream out merged_comdat, merged_extern_inline and unit_id. (input_overwrite_node): Stream in these. (input_cgraph_1): Set unit_base. * lto-streamer.h (lto_file_decl_data): Add unit_base. * symtab.c (symtab_node::make_decl_local): Record former_comdat. * g++.dg/lto/inline-crossmodule-1.h: New testcase. * g++.dg/lto/inline-crossmodule-1_0.C: New testcase. * g++.dg/lto/inline-crossmodule-1_1.C: New testcase. From-SVN: r278876
Jan Hubicka committed -
2019-11-30 Thomas Koenig <tkoenig@gcc.gnu.org> PR fortran/91783 * dependency.c (gfc_dep_resolver): Do not look at _data component if present. 2019-11-30 Thomas Koenig <tkoenig@gcc.gnu.org> PR fortran/91783 * gfortran.dg/dependency_56.f90: New test. From-SVN: r278873
Thomas Koenig committed -
Fix an issue with the GCC driver and the `-x' option where a warning is issued in an invocation like: $ riscv64-linux-gnu-gcc -print-multi-directory -x c++ riscv64-linux-gnu-gcc: warning: '-x c++' after last input file has no effect lib64/lp64d $ where no inputs were given and hence the use of `-x' is irrelevant. The statement printed is also untrue as the `-x' does not come after the last input file given that none was given. Do not print it then if no inputs were supplied. * gcc.c (process_command): Only warn about an ineffective `-x' option if any input files have actually been supplied. From-SVN: r278872
Maciej W. Rozycki committed -
The `--enable-version-specific-runtime-libs' configuration option is now supported throughout all of our target library subdirectories, so update installation documentation accordingly and also mention that the default for the option is `yes' for libada and `no' for the remaining libraries. gcc/ * doc/install.texi (Options specification): Remove the list of target library subdirectories supporting `--enable-version-specific-runtime-libs'. Document defaults for the option. From-SVN: r278871
Maciej W. Rozycki committed -
* acinclude.m4 (GLIBCXX_ENABLE_FILESYSTEM_TS): Enable by default for mingw targets. * configure: Regenerate. From-SVN: r278870
Jonathan Wakely committed -
This function failed to compile when called with a std::string. Also, constructing a path with a char8_t string did not correctly treat the string as already UTF-8 encoded. * include/bits/fs_path.h (u8path(InputIterator, InputIterator)) (u8path(const Source&)) [_GLIBCXX_FILESYSTEM_IS_WINDOWS]: Simplify conditions. * include/experimental/bits/fs_path.h [_GLIBCXX_FILESYSTEM_IS_WINDOWS] (__u8path(const Source&, char)): Add overloads for std::string and types convertible to std::string. (_Cvt::_S_wconvert): Add a new overload for char8_t strings and use codecvt_utf8_utf16 to do the correct conversion. From-SVN: r278869
Jonathan Wakely committed -
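The call pattern that r278869 says previously failed to compile, u8path invoked with a std::string, looks roughly like the sketch below. The wrapper name `path_from_utf8` is hypothetical; the sketch assumes a C++17 toolchain with <filesystem> and uses only the standard `std::filesystem::u8path` overload taking a source string (a function that C++20 later deprecated).

```cpp
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Build a path from bytes that the caller asserts are UTF-8 encoded.
// On POSIX targets the bytes are used as-is; on Windows the library is
// expected to convert them to the native wide encoding.
fs::path path_from_utf8(const std::string& utf8_bytes)
{
    return fs::u8path(utf8_bytes);
}
```

A caller might write `path_from_utf8("dir/file.txt").filename()` and expect a path equal to `file.txt`.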
From-SVN: r278868
GCC Administrator committed
-
- 29 Nov, 2019 15 commits
-
-
2019-11-29 Vladimir Makarov <vmakarov@redhat.com> PR rtl-optimization/92283 * lra.c (lra): Update reg notes after inheritance sub-pass and before constraint sub-pass. From-SVN: r278865
Vladimir Makarov committed -
2019-11-29 Richard Biener <rguenther@suse.de> PR tree-optimization/91003 * tree-vect-slp.c (vect_mask_constant_operand_p): Pass in the operand number, avoid handling the non-condition operands of COND_EXPRs as comparisons. (vect_get_constant_vectors): Pass down the operand number. (vect_get_slp_defs): Likewise. * gfortran.dg/pr91003.f90: New testcase. From-SVN: r278860
Richard Biener committed -
* include/bits/fs_path.h (path::operator/=): Change template-head to use typename instead of class. * include/experimental/bits/fs_path.h (path::operator/=): Likewise. * include/std/ostream (operator<<): Likewise. From-SVN: r278859
Jonathan Wakely committed -
New tests This patch adds new tests to validate new deleted overloads of wchar_t, char8_t, char16_t, and char32_t for ordinary and wide formatted character and string ostream inserters. Additionally, new tests are added to validate invocations of u8path with sequences of char8_t for both the C++17 and filesystem TS implementations. 2019-11-29 Tom Honermann <tom@honermann.net> New tests * testsuite/27_io/basic_ostream/inserters_character/char/deleted.cc: New test to validate deleted overloads of character and string inserters for narrow ostreams. * testsuite/27_io/basic_ostream/inserters_character/wchar_t/deleted.cc: New test to validate deleted overloads of character and string inserters for wide ostreams. * testsuite/27_io/filesystem/path/factory/u8path-char8_t.cc: New test to validate u8path invocations with sequences of char8_t. * testsuite/experimental/filesystem/path/factory/u8path-char8_t.cc: New test to validate u8path invocations with sequences of char8_t. From-SVN: r278858
Tom Honermann committed -
Updates to existing tests This patch updates existing tests to validate the new value for the __cpp_lib_char8_t feature test macros and to exercise u8path factory function invocations with std::string, std::string_view, and iterator pair arguments. 2019-11-29 Tom Honermann <tom@honermann.net> Updates to existing tests * testsuite/experimental/feat-char8_t.cc: Updated the expected __cpp_lib_char8_t feature test macro value. * testsuite/27_io/filesystem/path/factory/u8path.cc: Added testing of u8path invocation with std::string, std::string_view, and iterators thereof. * testsuite/experimental/filesystem/path/factory/u8path.cc: Added testing of u8path invocation with std::string, std::string_view, and iterators thereof. From-SVN: r278857
Tom Honermann committed -
Update feature test macro, add deleted operators, update u8path This patch increments the __cpp_lib_char8_t feature test macro, adds deleted operator<< overloads for basic_ostream, and modifies u8path to accept sequences of char8_t for both the C++17 implementation of std::filesystem, and the filesystem TS implementation. The implementation mechanism used for u8path differs between the C++17 and filesystem TS implementations. The changes to the former take advantage of C++17 'if constexpr'. The changes to the latter retain C++11 compatibility and rely on tag dispatching. 2019-11-29 Tom Honermann <tom@honermann.net> Update feature test macro, add deleted operators, update u8path * include/bits/c++config: Bumped the value of the __cpp_lib_char8_t feature test macro. * include/bits/fs_path.h (u8path): Modified u8path to accept sequences of char8_t. * include/experimental/bits/fs_path.h (u8path): Modified u8path to accept sequences of char8_t. * include/std/ostream: Added deleted overloads of wchar_t, char8_t, char16_t, and char32_t for ordinary and wide formatted character and string inserters. From-SVN: r278856
Tom Honermann committed -
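The two implementation mechanisms mentioned in r278856, C++17 'if constexpr' and C++11-compatible tag dispatching, can be contrasted with a small generic sketch. The functions below are hypothetical and unrelated to the actual u8path code; they only show how the same compile-time choice on a character type is expressed in each style.

```cpp
#include <string>
#include <type_traits>

// C++17 style: a single function template whose discarded branch is
// never instantiated.
template <typename CharT>
std::string describe_cxx17(const std::basic_string<CharT>&)
{
    if constexpr (std::is_same<CharT, char>::value)
        return "narrow";
    else
        return "other";
}

// C++11 style: overloads selected by a tag type computed from a trait.
inline std::string describe_impl(std::true_type) { return "narrow"; }
inline std::string describe_impl(std::false_type) { return "other"; }

template <typename CharT>
std::string describe_cxx11(const std::basic_string<CharT>&)
{
    return describe_impl(std::is_same<CharT, char>{});
}
```

Both versions return the same result for a given argument; the tag-dispatch form simply trades one function body for two overloads so that it compiles as C++11.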
Decouple constraints for u8path from path constructors This patch moves helper classes and functions for std::filesystem::path out of the class definition to a detail namespace so that they are available to the implementations of std::filesystem::u8path. Prior to this patch, the SFINAE constraints for those implementations were specified via delegation to the overloads of path constructors with a std::locale parameter; it just so happened that those overloads had the same constraints. As of P1423R3, u8path and those overloads no longer have the same constraints, so this dependency must be broken. This patch also updates the experimental implementation of the filesystem TS to add SFINAE constraints to its implementations of u8path. These functions were previously unconstrained and marked with a TODO comment. This patch does not provide any intentional behavioral changes other than the added constraints to the experimental filesystem TS implementation of u8path. Alternatives to this refactoring would have been to make the u8path overloads friends of class path, or to make the helpers public members. Both of those approaches struck me as less desirable than this approach, though this approach does require more code changes and will affect implementation detail portions of mangled names for path constructors and inline member functions (mostly function template specializations). 2019-11-29 Tom Honermann <tom@honermann.net> Decouple constraints for u8path from path constructors * include/bits/fs_path.h: Moved helper utilities out of std::filesystem::path into a detail namespace to make them available for use by u8path. * include/experimental/bits/fs_path.h: Moved helper utilities out of std::experimental::filesystem::v1::path into a detail namespace to make them available for use by u8path. From-SVN: r278855
Tom Honermann committed -
The function maybe_resimplify_conditional_op uses operation_could_trap_p to check if the resulting operation of a simplification can trap. Because of the changes introduced by revision r276659, this results in an ICE due to a violated assertion in operation_could_trap_p if the operation is a COND_EXPR or a VEC_COND_EXPR. The changes have allowed those expressions to trap and whether they do cannot be determined without considering their condition which is not available to operation_could_trap_p. Change maybe_resimplify_conditional_op to inspect the condition of COND_EXPRs and VEC_COND_EXPRs to determine if they can trap. From-SVN: r278853
Frederik Harwath committed -
When dissolving an SLP-only group of accesses, we should only set the gap to group_size - 1 for normal non-strided groups. 2019-11-29 Richard Sandiford <richard.sandiford@arm.com> gcc/ PR tree-optimization/92677 * tree-vect-loop.c (vect_dissolve_slp_only_groups): Set the gap to zero when dissolving a group of strided accesses. gcc/testsuite/ PR tree-optimization/92677 * gcc.dg/vect/pr92677.c: New test. From-SVN: r278852
Richard Sandiford committed -
Now that stmt_vec_info records the choice between vector mask types and normal nonmask types, we can use that information in vect_get_vector_types_for_stmt instead of deferring the choice of vector type till later. vect_get_mask_type_for_stmt used to check whether the boolean inputs to an operation: (a) consistently used mask types or consistently used nonmask types; and (b) agreed on the number of elements. (b) shouldn't be a problem when (a) is met. If the operation consistently uses mask types, tree-vect-patterns.c will have corrected any mismatches in mask precision. (This is because we only use mask types for a small well-known set of operations and tree-vect-patterns.c knows how to handle any that could have different mask precisions.) And if the operation consistently uses normal nonmask types, there's no reason why booleans should need extra vector compatibility checks compared to ordinary integers. So the potential difficulties all seem to come from (a). Now that we've chosen the result type ahead of time, we also have to consider whether the outputs and inputs consistently use mask types. Taking each vectorizable_* routine in turn: - vectorizable_call vect_get_vector_types_for_stmt only handled booleans specially for gassigns, so vect_get_mask_type_for_stmt never had chance to handle calls. I'm not sure we support any calls that operate on booleans, but as things stand, a boolean result would always have a nonmask type. Presumably any vector argument would also need to use nonmask types, unless it corresponds to internal_fn_mask_index (which is already a special case). For safety, I've added a check for mask/nonmask combinations here even though we didn't check this previously. - vectorizable_simd_clone_call Again, vect_get_mask_type_for_stmt never had chance to handle calls. The result of the call will always be a nonmask type and the patch for PR 92710 rejects mask arguments. So all booleans should consistently use nonmask types here. 
- vectorizable_conversion The function already rejects any conversion between booleans in which one type isn't a mask type. - vectorizable_operation This function definitely needs a consistency check, e.g. to handle & and | in which one operand is loaded from memory and the other is a comparison result. Ideally we'd handle this via pattern stmts instead (like we do for the all-mask case), but that's future work. - vectorizable_assignment VECT_SCALAR_BOOLEAN_TYPE_P requires single-bit precision, so the current code already rejects problematic cases. - vectorizable_load Loads always produce nonmask types and there are no relevant inputs to check against. - vectorizable_store vect_check_store_rhs already rejects mask/nonmask combinations via useless_type_conversion_p. - vectorizable_reduction - vectorizable_lc_phi PHIs always have nonmask types. After the change above, attempts to combine the PHI result with a mask type would be rejected by vectorizable_operation. (Again, it would be better to handle this using pattern stmts.) - vectorizable_induction We don't generate inductions for booleans. - vectorizable_shift The function already rejects boolean shifts via type_has_mode_precision_p. - vectorizable_condition The function already rejects mismatches via useless_type_conversion_p. - vectorizable_comparison The function already rejects comparisons between mask and nonmask types. The result is always a mask type. 2019-11-29 Richard Sandiford <richard.sandiford@arm.com> gcc/ PR tree-optimization/92596 * tree-vect-stmts.c (vectorizable_call): Punt on hybrid mask/nonmask operations. (vectorizable_operation): Likewise, instead of relying on vect_get_mask_type_for_stmt to do this. (vect_get_vector_types_for_stmt): Always return a vector type immediately, rather than deferring the choice for boolean results. Use a vector mask type instead of a normal vector if vect_use_mask_type_p. (vect_get_mask_type_for_stmt): Delete. 
* tree-vect-loop.c (vect_determine_vf_for_stmt_1): Remove mask_producers argument and special boolean_type_node handling. (vect_determine_vf_for_stmt): Remove mask_producers argument and update calls to vect_determine_vf_for_stmt_1. Remove doubled call. (vect_determine_vectorization_factor): Update call accordingly. * tree-vect-slp.c (vect_build_slp_tree_1): Remove special boolean_type_node handling. (vect_slp_analyze_node_operations_1): Likewise. gcc/testsuite/ PR tree-optimization/92596 * gcc.dg/vect/bb-slp-pr92596.c: New test. * gcc.dg/vect/bb-slp-43.c: Likewise. From-SVN: r278851
Richard Sandiford committed -
search_type_for_mask uses a worklist to search a chain of boolean operations for a natural vector mask type. This patch instead does that in vect_determine_stmt_precisions, where we also look for overpromoted integer operations. We then only need to compute the precision once and can cache it in the stmt_vec_info. The new function vect_determine_mask_precision is supposed to handle exactly the same cases as search_type_for_mask_1, and in the same way. There's a lot we could improve here, but that's not stage 3 material. I wondered about sharing mask_precision with other fields like operation_precision, but in the end that seemed too dangerous. We have patterns to convert between boolean and non-boolean operations and it would be very easy to get mixed up about which case the fields are describing. 2019-11-29 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vectorizer.h (stmt_vec_info::mask_precision): New field. (vect_use_mask_type_p): New function. * tree-vect-patterns.c (vect_init_pattern_stmt): Copy the mask precision to the pattern statement. (append_pattern_def_seq): Add a scalar_type_for_mask parameter and use it to initialize the new stmt's mask precision. (search_type_for_mask_1): Delete. (search_type_for_mask): Replace with... (integer_type_for_mask): ...this new function. Use the information cached in the stmt_vec_info. (vect_recog_bool_pattern): Update accordingly. (build_mask_conversion): Pass the scalar type associated with the mask type to append_pattern_def_seq. (vect_recog_mask_conversion_pattern): Likewise. Call integer_type_for_mask instead of search_type_for_mask. (vect_convert_mask_for_vectype): Call integer_type_for_mask instead of search_type_for_mask. (possible_vector_mask_operation_p): New function. (vect_determine_mask_precision): Likewise. (vect_determine_stmt_precisions): Call it. From-SVN: r278850
Richard Sandiford committed -
This patch makes vect_get_mask_type_for_stmt and get_mask_type_for_scalar_type take a group size instead of the SLP node, so that later patches can call it before an SLP node has been built. 2019-11-29 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vectorizer.h (get_mask_type_for_scalar_type): Replace the slp_tree parameter with a group size parameter. (vect_get_mask_type_for_stmt): Likewise. * tree-vect-stmts.c (get_mask_type_for_scalar_type): Likewise. (vect_get_mask_type_for_stmt): Likewise. * tree-vect-slp.c (vect_slp_analyze_node_operations_1): Update call accordingly. From-SVN: r278849
Richard Sandiford committed -
vectorizable_operation returned false for codes that are handled by vectorizable_shift, but only after it had already done a lot of work. Checking earlier should be more efficient and avoid polluting the logs with duplicate info. Also, there was no such early-out for comparisons or COND_EXPRs. Fixing that avoids a false scan-tree-dump hit with a later patch. 2019-11-29 Richard Sandiford <richard.sandiford@arm.com> gcc/ * tree-vect-stmts.c (vectorizable_operation): Punt early on codes that are handled elsewhere. From-SVN: r278848
Richard Sandiford committed -
vect_recog_bool_pattern assumed that a comparison between two booleans should always become a comparison of vector mask types (implemented as an XOR_EXPR). But if the booleans in question are generated as data values (e.g. because they're loaded directly from memory), we should treat them like ordinary integers instead, just as we do for boolean logic ops whose operands are loaded from memory. vect_get_mask_type_for_stmt already handled this case: /* We may compare boolean value loaded as vector of integers. Fix mask_type in such case. */ if (mask_type && !VECTOR_BOOLEAN_TYPE_P (mask_type) && gimple_code (stmt) == GIMPLE_ASSIGN && TREE_CODE_CLASS (gimple_assign_rhs_code (stmt)) == tcc_comparison) mask_type = truth_type_for (mask_type); and not handling it here complicated later patches. The initial list of targets for vect_bool_cmp is deliberately conservative. 2019-11-29 Richard Sandiford <richard.sandiford@arm.com> gcc/ * doc/sourcebuild.texi (vect_bool_cmp): Document. * tree-vect-patterns.c (search_type_for_mask_1): If neither operand to a boolean comparison is a natural vector mask, handle both operands like normal integers instead. gcc/testsuite/ * gcc.dg/vect/vect-bool-cmp-2.c: New test. * lib/target-supports.exp (check_effective_target_vect_bool_cmp): New effective target procedure. From-SVN: r278847
Richard Sandiford committed -
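A loop of the shape discussed in r278847, where both operands of a boolean comparison are loaded directly from memory, might look like the following. This is a hypothetical illustration and not the contents of gcc.dg/vect/vect-bool-cmp-2.c; whether a given target vectorizes it is a separate question that the sketch does not address.

```cpp
#include <cstddef>

// Both a[i] and b[i] are data values read from memory rather than the
// results of comparisons, so neither operand is a natural vector mask.
void compare_flags(const bool* a, const bool* b, bool* out, std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i)
        out[i] = (a[i] == b[i]);
}
```

Per the commit message, operands like these are meant to be treated as ordinary integers instead of being forced into vector mask types.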
This fixes two related problems. The iterators for node-based containers use nested typedefs such as std::list<T>::iterator::_Node to denote their node types. As reported in https://bugzilla.redhat.com/show_bug.cgi?id=1053438 those typedefs are not always present in the debug info. That means the pretty printers cannot find them using gdb.lookup_type (via the find_type helper). Instead of looking up the nested typedefs this patch makes the printers look up the actual class templates directly. A related problem (and the original topic of PR 91997) is that GDB fails to find types via gdb.lookup_type when printing a backtrace from a non-C++ function: https://sourceware.org/bugzilla/show_bug.cgi?id=25234 That is also solved by not looking up the nested typedef. PR libstdc++/91997 * python/libstdcxx/v6/printers.py (find_type): Fail more gracefully if we run out of base classes to look at. (lookup_templ_spec, lookup_node_type): New utilities to find node types for node-based containers. (StdListPrinter.children, NodeIteratorPrinter.__init__) (NodeIteratorPrinter.to_string, StdSlistPrinter.children) (StdSlistIteratorPrinter.to_string, StdRbtreeIteratorPrinter.__init__) (StdMapPrinter.children, StdSetPrinter.children) (StdForwardListPrinter.children): Use lookup_node_type instead of find_type. (StdListIteratorPrinter.__init__, StdFwdListIteratorPrinter.__init__): Pass name of node type to NodeIteratorPrinter constructor. (Tr1HashtableIterator.__init__): Rename argument. (StdHashtableIterator.__init__): Likewise. Use lookup_templ_spec instead of find_type. * testsuite/libstdc++-prettyprinters/59161.cc: Remove workaround for _Node typedef not being present in debuginfo. * testsuite/libstdc++-prettyprinters/91997.cc: New test. From-SVN: r278846
Jonathan Wakely committed
-