- 15 Feb, 2020 10 commits
-
-
These subroutines have only a single call site, so it might be best and simplest to eliminate them before we convert the algos into function objects. libstdc++-v3/ChangeLog: * include/bits/ranges_algo.h (ranges::__find_end): Fold into ... (ranges::find_end): ... here. (ranges::__lexicographical_compare): Fold into ... (ranges::lexicographical_compare): ... here. * include/bits/ranges_algobase.h (ranges::__equal): Fold into ... (ranges::equal): ... here.
Patrick Palka committed -
PR c++/68061 * g++.dg/concepts/attrib1.C: New.
Jason Merrill committed -
find_template_parameters needs to find the mention of T in the lambda. Fixing that leaves this as a hard error, which may be surprising but is consistent with lambdas in other SFINAE contexts like template argument deduction. gcc/cp/ChangeLog 2020-02-15 Jason Merrill <jason@redhat.com> PR c++/92556 * pt.c (any_template_parm_r): Look into lambda body.
Jason Merrill committed -
gcc/cp/ChangeLog 2020-02-15 Jason Merrill <jason@redhat.com> PR c++/92583 * pt.c (any_template_parm_r): Remove CONSTRUCTOR handling.
Jason Merrill committed -
PR c++/90764 * g++.dg/cpp1z/class-deduction69.C: New.
Jason Merrill committed -
As the following testcases show (the first one reported, last two found by code inspection), we need to disallow side-effects in simplifications that turn an unconditional expression into a conditional one. From my little understanding of genmatch.c, it is able to automatically disallow side effects if the same operand is used multiple times in the match pattern, maybe if it is used multiple times in the replacement pattern, and if it is used in conditional contexts in the match pattern. Could it be taught to handle this case too? If yes, perhaps just the first hunk could be usable for 8/9 backports (+ the testcases). 2020-02-15 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/93744 * match.pd (((m1 >/</>=/<= m2) * d -> (m1 >/</>=/<= m2) ? d : 0, A - ((A - B) & -(C cmp D)) -> (C cmp D) ? B : A, A + ((B - A) & -(C cmp D)) -> (C cmp D) ? B : A): For GENERIC, make sure @2 in the first and @1 in the other patterns has no side-effects. * gcc.c-torture/execute/pr93744-1.c: New test. * gcc.c-torture/execute/pr93744-2.c: New test. * gcc.c-torture/execute/pr93744-3.c: New test.
Jakub Jelinek committed -
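To illustrate why the PR tree-optimization/93744 patterns must reject operands with side effects, here is a minimal sketch of the first pattern, (m1 < m2) * d -> (m1 < m2) ? d : 0. The counter and function names are hypothetical and are not taken from the testcases; the point is only that the two forms return the same value but evaluate the operand a different number of times.

```cpp
#include <cassert>

// Hypothetical counter used to observe how often the operand is evaluated.
int calls = 0;

int d_with_side_effect() {
    ++calls;
    return 7;
}

// Original form: the multiplication always evaluates its right operand.
int unconditional(int m1, int m2) {
    return (m1 < m2) * d_with_side_effect();
}

// Rewritten form: the operand is evaluated only when the comparison holds.
int conditional(int m1, int m2) {
    return (m1 < m2) ? d_with_side_effect() : 0;
}

// Both forms return 0 for a false comparison, but the call counts differ.
void side_effect_example() {
    calls = 0;
    assert(unconditional(2, 1) == 0 && calls == 1);
    calls = 0;
    assert(conditional(2, 1) == 0 && calls == 0);
}
```

If the simplification were applied to the first function, the side effect would silently disappear whenever the comparison is false.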
Now that this feature has been approved for C++20 we can define the macro to the official value. * include/bits/erase_if.h (__cpp_lib_erase_if): Define to 202002L. * include/std/deque: Likewise. * include/std/forward_list: Likewise. * include/std/list: Likewise. * include/std/string: Likewise. * include/std/vector: Likewise. * include/std/version: Likewise. * testsuite/23_containers/deque/erasure.cc: Test for new value. * testsuite/23_containers/forward_list/erasure.cc: Likewise. * testsuite/23_containers/list/erasure.cc: Likewise. * testsuite/23_containers/map/erasure.cc: Likewise. * testsuite/23_containers/set/erasure.cc: Likewise. * testsuite/23_containers/unordered_map/erasure.cc: Likewise. * testsuite/23_containers/unordered_set/erasure.cc: Likewise. * testsuite/23_containers/vector/erasure.cc: Likewise.
Jonathan Wakely committed -
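User code can test the __cpp_lib_erase_if value mentioned in the commit above before relying on std::erase_if. The following is a sketch only; the helper name drop_even is hypothetical, and the fallback branch uses the classic erase-remove idiom for libraries that do not advertise the official value.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Remove all even values from v (hypothetical helper).
inline void drop_even(std::vector<int>& v) {
    auto is_even = [](int x) { return x % 2 == 0; };
#if defined(__cpp_lib_erase_if) && __cpp_lib_erase_if >= 202002L
    // The library advertises the approved C++20 feature.
    std::erase_if(v, is_even);
#else
    // Fallback: erase-remove idiom, available in every standard mode.
    v.erase(std::remove_if(v.begin(), v.end(), is_even), v.end());
#endif
}

// Either branch leaves only the odd values, in their original order.
void erase_if_example() {
    std::vector<int> v{1, 2, 3, 4, 5};
    drop_even(v);
    assert((v == std::vector<int>{1, 3, 5}));
}
```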
* include/bits/random.h (uniform_random_bit_generator): Require min() and max() to be constant expressions and min() to be less than max(). * testsuite/26_numerics/random/concept.cc: Check additional cases. * testsuite/26_numerics/random/pr60037-neg.cc: Adjust dg-error lineno.
Jonathan Wakely committed -
2020-02-15 David Malcolm <dmalcolm@redhat.com> Bernd Edlinger <bernd.edlinger@hotmail.de> PR 87488 PR other/93168 * config.in (DIAGNOSTICS_URLS_DEFAULT): New define. * configure.ac (--with-diagnostics-urls): New configuration option, based on --with-diagnostics-color. (DIAGNOSTICS_URLS_DEFAULT): New define. * config.h: Regenerate. * configure: Regenerate. * diagnostic.c (diagnostic_urls_init): Handle -1 for DIAGNOSTICS_URLS_DEFAULT from configure-time --with-diagnostics-urls=auto-if-env by querying for a GCC_URLS and TERM_URLS environment variable. * diagnostic-url.h (diagnostic_url_format): New enum type. (diagnostic_urls_enabled_p): rename to... (determine_url_format): ... this, and change return type. * diagnostic-color.c (parse_env_vars_for_urls): New helper function. (auto_enable_urls): Disable URLs on xfce4-terminal, gnome-terminal, the linux console, and mingw. (diagnostic_urls_enabled_p): rename to... (determine_url_format): ... this, and adjust. * pretty-print.h (pretty_printer::show_urls): rename to... (pretty_printer::url_format): ... this, and change to enum. * pretty-print.c (pretty_printer::pretty_printer, pp_begin_url, pp_end_url, test_urls): Adjust. * doc/install.texi (--with-diagnostics-urls): Document the new configuration option. (--with-diagnostics-color): Document the existing interaction with GCC_COLORS better. * doc/invoke.texi (-fdiagnostics-urls): Add GCC_URLS and TERM_URLS vindex reference. Update description of defaults based on the above. (-fdiagnostics-color): Update description of how -fdiagnostics-color interacts with GCC_COLORS.
Bernd Edlinger committed -
gcc/ChangeLog: * doc/extend.texi (attribute alias): Mention type requirement. (attribute weak): Same. (attribute weakref): Correct invalid example.
Martin Sebor committed
-
- 14 Feb, 2020 11 commits
-
-
This fixes a weakness in the way -fdump-ada-spec builds names for anonymous structures in the C/C++ code, resulting in duplicate identifiers under specific circumstances. c-family/ * c-ada-spec.c: Include bitmap.h. (dump_ada_double_name): Rename into... (dump_anonymous_type_name): ...this. Always use the TYPE_UID. (dump_ada_array_type): Adjust to above renaming. Robustify. (dump_nested_types_1): New function copied from... Add dumped_types parameter and pass it down to dump_nested_type. (dump_nested_types): ...this. Remove parent parameter. Just call dump_nested_types_1 on an automatic bitmap. (dump_nested_type): Add dumped_types parameter. <ARRAY_TYPE>: Do not dump it if already present in dumped_types. Adjust recursive calls and adjust to above renaming. (dump_ada_declaration): Adjust call to dump_nested_types. Tidy up and adjust to above renaming. (dump_ada_specs): Initialize and release bitmap obstack.
Eric Botcazou committed -
gcc/po: * be.po, da.po, de.po, el.po, es.po, fi.po, fr.po, hr.po, id.po, ja.po, nl.po, ru.po, sr.po, sv.po, tr.po, uk.po, vi.po, zh_CN.po, zh_TW.po: Update. libcpp/po: * be.po, ca.po, da.po, de.po, el.po, eo.po, es.po, fi.po, fr.po, id.po, ja.po, nl.po, pt_BR.po, ru.po, sr.po, sv.po, tr.po, uk.po, vi.po, zh_CN.po, zh_TW.po: Update.
Joseph Myers committed -
This is an old thinko pertaining to the interaction between TLS sequences and delay slot filling: the compiler knows that it cannot put instructions with TLS relocations into delay slots with the original Sun TLS model, but it tests TARGET_SUN_TLS in this context, which depends only on the assembler. So if the compiler is configured with the GNU assembler and the Solaris linker, then TARGET_GNU_TLS is set instead and the limitation is not enforced. PR target/93704 * config/sparc/sparc.c (eligible_for_call_delay): Test HAVE_GNU_LD in conjunction with TARGET_GNU_TLS in early return.
Eric Botcazou committed -
There's a costly signed 64-bit division in rtx_cost on x86 as well as any other target where UNITS_PER_WORD expands to TARGET_64BIT ? 8 : 4. It's also evident that rtx_cost does redundant work for a SET. Obviously the variable named 'factor' rarely exceeds 1, so in the majority of cases it can be computed with a well-predictable branch rather than a division. This patch makes rtx_cost do the division only in case mode is wider than UNITS_PER_WORD, and also moves a test for a SET up front to avoid redundancy. No functional change. * rtlanal.c (rtx_cost): Handle a SET up front. Avoid division if the mode is not wider than UNITS_PER_WORD.
Alexander Monakov committed -
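The idea behind the rtx_cost change can be sketched as follows. This is not the GCC source: units_per_word and cost_factor are hypothetical stand-ins, and the "always divide" variant is an assumption about what a straightforward version would look like, shown only for comparison.

```cpp
#include <cassert>

// Hypothetical stand-in for UNITS_PER_WORD. When it depends on a runtime
// flag (as with TARGET_64BIT ? 8 : 4), the divisor is not a constant.
int units_per_word(bool target_64bit) { return target_64bit ? 8 : 4; }

// Assumed straightforward variant: always divide, then clamp to at least 1.
int cost_factor_always_divide(int mode_size, bool target_64bit) {
    int factor = mode_size / units_per_word(target_64bit);
    return factor < 1 ? 1 : factor;
}

// Variant described in the commit: divide only when the mode is wider than
// a word; otherwise a well-predicted branch yields 1 directly.
int cost_factor(int mode_size, bool target_64bit) {
    int word = units_per_word(target_64bit);
    return mode_size > word ? mode_size / word : 1;
}

// The two variants agree for every non-negative mode size tried here.
void cost_factor_example() {
    for (int size = 0; size <= 64; ++size) {
        assert(cost_factor(size, true) == cost_factor_always_divide(size, true));
        assert(cost_factor(size, false) == cost_factor_always_divide(size, false));
    }
}
```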
When backporting the PR61414 fix to 8.4, I've noticed that the caching of prec is actually broken, as it would fail to actually store the computed precision into the hash_map's value and so next time we'd think the enum needs 0 bits. 2020-02-14 Jakub Jelinek <jakub@redhat.com> PR c++/61414 * class.c (enum_min_precision): Change prec type from int to int &. * g++.dg/cpp0x/enum39.C: New test.
Jakub Jelinek committed -
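The int versus int & distinction in the PR c++/61414 fix above can be reproduced with an ordinary standard-library map. The sketch below assumes a cache that records whether an entry already existed; the Cache alias, compute_precision and both function names are hypothetical and only model the shape of the bug, not the code in class.c.

```cpp
#include <cassert>
#include <unordered_map>

using Cache = std::unordered_map<int, int>;

// Placeholder for the real computation (hypothetical).
int compute_precision(int id) { return id * 2 + 1; }

// Buggy shape: prec is a copy, so the computed value never reaches the map.
int min_precision_buggy(Cache& cache, int id) {
    bool existed = cache.count(id) != 0;
    int prec = cache[id];          // copy of the slot (zero when newly inserted)
    if (existed)
        return prec;
    prec = compute_precision(id);  // updates only the local copy
    return prec;
}

// Fixed shape: prec refers to the map slot, so the store populates the cache.
int min_precision_fixed(Cache& cache, int id) {
    bool existed = cache.count(id) != 0;
    int &prec = cache[id];
    if (existed)
        return prec;
    prec = compute_precision(id);
    return prec;
}

// The buggy version answers 0 on the second lookup; the fixed one does not.
void cache_example() {
    Cache a, b;
    assert(min_precision_buggy(a, 3) == 7);
    assert(min_precision_buggy(a, 3) == 0);
    assert(min_precision_fixed(b, 3) == 7);
    assert(min_precision_fixed(b, 3) == 7);
}
```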
get_ref_base_and_extent can return different sizes for COMPONENT_REFs and DECLs of the same type, with the latter including (more?) padding. When in the IL there is an assignment between such a COMPONENT_REF and a DECL, SRA will try to propagate the access from the former as a child of the latter, creating an artificial reference that does not match the access's declared size, which triggers a verifier assert. Fixed by teaching the propagation functions about this special situation so that they don't do it. The condition is the same that build_user_friendly_ref_for_offset uses so the artificial reference causing the verifier is guaranteed not to be created. 2020-02-14 Martin Jambor <mjambor@suse.cz> PR tree-optimization/93516 * tree-sra.c (propagate_subaccesses_from_rhs): Do not create access of the same type as the parent. (propagate_subaccesses_from_lhs): Likewise. gcc/testsuite/ * g++.dg/tree-ssa/pr93516.C: New test.
Martin Jambor committed -
liuhongt committed
-
2020-02-14 Hongtao Liu <hongtao.liu@intel.com> gcc/ PR target/93724 * config/i386/avx512vbmi2intrin.h (_mm512_shrdi_epi16, _mm512_mask_shrdi_epi16, _mm512_maskz_shrdi_epi16, _mm512_shrdi_epi32, _mm512_mask_shrdi_epi32, _mm512_maskz_shrdi_epi32, _m512_shrdi_epi64, _m512_mask_shrdi_epi64, _m512_maskz_shrdi_epi64, _mm512_shldi_epi16, _mm512_mask_shldi_epi16, _mm512_maskz_shldi_epi16, _mm512_shldi_epi32, _mm512_mask_shldi_epi32, _mm512_maskz_shldi_epi32, _mm512_shldi_epi64, _mm512_mask_shldi_epi64, _mm512_maskz_shldi_epi64): Fix typo of lacking a closing parenthesis. * config/i386/avx512vbmi2vlintrin.h (_mm256_shrdi_epi16, _mm256_mask_shrdi_epi16, _mm256_maskz_shrdi_epi16, _mm256_shrdi_epi32, _mm256_mask_shrdi_epi32, _mm256_maskz_shrdi_epi32, _m256_shrdi_epi64, _m256_mask_shrdi_epi64, _m256_maskz_shrdi_epi64, _mm256_shldi_epi16, _mm256_mask_shldi_epi16, _mm256_maskz_shldi_epi16, _mm256_shldi_epi32, _mm256_mask_shldi_epi32, _mm256_maskz_shldi_epi32, _mm256_shldi_epi64, _mm256_mask_shldi_epi64, _mm256_maskz_shldi_epi64, _mm_shrdi_epi16, _mm_mask_shrdi_epi16, _mm_maskz_shrdi_epi16, _mm_shrdi_epi32, _mm_mask_shrdi_epi32, _mm_maskz_shrdi_epi32, _mm_shrdi_epi64, _mm_mask_shrdi_epi64, _m_maskz_shrdi_epi64, _mm_shldi_epi16, _mm_mask_shldi_epi16, _mm_maskz_shldi_epi16, _mm_shldi_epi32, _mm_mask_shldi_epi32, _mm_maskz_shldi_epi32, _mm_shldi_epi64, _mm_mask_shldi_epi64, _mm_maskz_shldi_epi64): Ditto. gcc/testsuite/ * gcc.target/i386/avx512vbmi2-vpshld-1.c: New test. * gcc.target/i386/avx512vbmi2-vpshrd-1.c: Ditto. * gcc.target/i386/sse-12.c: Add -mavx512vbmi2. * gcc.target/i386/sse-13.c: Ditto. * gcc.target/i386/sse-14.c: Add -mavx512vbmi2 and tests. * gcc.target/i386/sse-22.c: Ditto.
liuhongt committed -
I've noticed we claim in cxx-status.html that we implement P1042R1, but it seems we don't implement any of the changes from there. The following patch implements just the change that __VA_OPT__ determines whether to expand to nothing or the enclosed tokens no longer based on whether there were any tokens passed to __VA_ARGS__, but whether __VA_ARGS__ expands to any tokens (from testing apparently it has to be non-CPP_PADDING tokens). I'm afraid I'm completely lost about the padding preservation/removal changes that are also in the paper, so haven't touched that part. 2020-02-14 Jakub Jelinek <jakub@redhat.com> Partially implement P1042R1: __VA_OPT__ wording clarifications PR preprocessor/92319 * macro.c (expand_arg): Move declarations before vaopt_state definition. (class vaopt_state): Move enum update_type definition earlier. Remove m_allowed member, add m_arg and m_update members. (vaopt_state::vaopt_state): Change last argument from bool any_args to macro_arg *arg, initialize m_arg and m_update instead of m_allowed. (vaopt_state::update): When bumping m_state from 1 to 2 and m_update is ERROR, determine if __VA_ARGS__ expansion has any non-CPP_PADDING tokens and set m_update to INCLUDE if it has any, DROP otherwise. Return m_update instead of m_allowed ? INCLUDE : DROP in m_state >= 2. (replace_args, create_iso_definition): Adjust last argument to vaopt_state ctor. * c-c++-common/cpp/va-opt-4.c: New test.
Jakub Jelinek committed -
GCC Administrator committed
-
Some system headers can be broken by the machine_name fix performed by GCC during the fixincludes step. According to the comment in fixincludes/fixinc.h:130 : On some platforms, machine_name doesn't work properly and breaks some of the header files. Since everything works properly without it, just wipe the macro list to disable the fix. So we can just skip it to avoid trouble. fixincludes/ * fixinc.in: Skip machine_name fix on powerpc*-*-linux*.
Matheus Castanho committed
-
- 13 Feb, 2020 19 commits
-
-
Before Joseph's changes when compiling libstdc++-v3/libsupc++/fundamental_type_info.cc we were emitting _ZTIPDd, _ZTIPDe, _ZTIPDf, _ZTIPKDd, _ZTIPKDe, _ZTIPKDf, _ZTIDd, _ZTIDe, _ZTIDf symbols even when DFP wasn't usable, but now we don't and thus those 9 symbols @@CXXABI_1.3.4 are gone from libstdc++. While nothing could probably use it (except perhaps dlsym etc.), various tools don't really like symbols disappearing from symbol versioned shared libraries with stable ABI. Adding those in assembly would be possible, but would be a portability nightmare (the PR has something Red Hat uses in libstdc++_nonshared.a, but that can handle only a handful of linux ELF targets we care about). So, instead this patch hacks up the FE, so that it emits those, but in a way that won't make the DFP types available again on targets that don't support them. 2020-02-14 Jakub Jelinek <jakub@redhat.com> PR libstdc++/92906 * cp-tree.h (enum cp_tree_index): Add CPTI_FALLBACK_DFLOAT32_TYPE, CPTI_FALLBACK_DFLOAT64_TYPE and CPTI_FALLBACK_DFLOAT128_TYPE. (fallback_dfloat32_type, fallback_dfloat64_type, fallback_dfloat128_type): Define. * mangle.c (write_builtin_type): Handle fallback_dfloat*_type like dfloat*_type_node. * rtti.c (emit_support_tinfos): Emit DFP typeinfos even when dfp is disabled for compatibility.
Jakub Jelinek committed -
Here reintroducing the same declarations into the global namespace via using-declaration is useless but OK. And a function and a function template with the same parameters do not conflict. gcc/cp/ChangeLog 2020-02-13 Jason Merrill <jason@redhat.com> PR c++/93713 * name-lookup.c (matching_fn_p): A function does not match a template.
Jason Merrill committed -
Since my patch for PR 91476 moved visibility determination sooner, a local static in a vague linkage function now gets TREE_PUBLIC set before retrofit_lang_decl calls set_decl_linkage, which was making decl_linkage think that it has external linkage. It still has no linkage according to the standard. gcc/cp/ChangeLog 2020-02-13 Jason Merrill <jason@redhat.com> PR c++/93643 PR c++/91476 * tree.c (decl_linkage): Always lk_none for locals.
Jason Merrill committed -
This implements all the ranges members defined in [specialized.algorithms]: ranges::uninitialized_default_construct ranges::uninitialized_value_construct ranges::uninitialized_copy ranges::uninitialized_copy_n ranges::uninitialized_move ranges::uninitialized_move_n ranges::uninitialized_fill ranges::uninitialized_fill_n ranges::construct_at ranges::destroy_at ranges::destroy It also implements (hopefully correctly) the "obvious" optimizations for these algos, namely that if the output range has a trivial value type and if the appropriate operation won't throw then we can dispatch to the standard ranges version of the algorithm which will then potentially enable further optimizations. libstdc++-v3/ChangeLog: * include/Makefile.am: Add <bits/ranges_uninitialized.h>. * include/Makefile.in: Regenerate. * include/bits/ranges_uninitialized.h: New header. * include/std/memory: Include it. * testsuite/20_util/specialized_algorithms/destroy/constrained.cc: New test. * .../uninitialized_copy/constrained.cc: New test. * .../uninitialized_default_construct/constrained.cc: New test. * .../uninitialized_fill/constrained.cc: New test. * .../uninitialized_move/constrained.cc: New test. * .../uninitialized_value_construct/constrained.cc: New test.
Patrick Palka committed -
This roughly mirrors the existing split between <bits/stl_algo.h> and <bits/stl_algobase.h>. The ranges [specialized.algorithms] will use this new header to avoid including all of <bits/ranges_algo.h>. libstdc++-v3/ChangeLog: * include/Makefile.am: Add bits/ranges_algobase.h * include/Makefile.in: Regenerate. * bits/ranges_algo.h: Include <bits/ranges_algobase.h> and refactor existing #includes. (__detail::__is_normal_iterator, __detail::is_reverse_iterator, __detail::__is_move_iterator, copy_result, move_result, __equal, equal, copy_result, move_result, move_backward_result, copy_backward_result, __copy_or_move_backward, __copy_or_move, copy, move, copy_backward, move_backward, copy_n_result, copy_n, fill_n, fill): Split out into ... * bits/ranges_algobase.h: ... this new header.
Patrick Palka committed -
The following testcase ICEs, because the PR84305 changes try to evaluate the size earlier. If size has side-effects, that is desirable, and the side-effects will actually be wrapped in a SAVE_EXPR. The problem on this testcase is that there are no side-effects, and c_fully_fold doesn't fold those COMPOUND_EXPRs to constant, and while before gimplification we unshare trees found in the expressions, the unsharing doesn't involve TYPE_SIZE etc. of used types. Gimplification is destructive though, so when we gimplify the two nested COMPOUND_EXPRs and then try to gimplify it the second time for the TYPE_SIZEs, we ICE. Now, we could use unshare_expr in what we push to *expr, SAVE_EXPRs and their operands in there aren't unshared, but I really don't see a point of evaluating expressions that don't have side-effects before, so instead this just pushes there expressions that do have side-effects. 2020-02-13 Jakub Jelinek <jakub@redhat.com> PR c/93576 * c-decl.c (grokdeclarator): If this_size_varies, only push size into *expr if it has side effects. * gcc.dg/pr93576.c: New test.
Jakub Jelinek committed -
vxworks7 headers haven't required fixes, and we've decided to avoid running fixinc on them. The problem with that is that, with a dummy fixinc, mkheaders wipes out include-fixed but then multi_dir subdirs are not created again, so we end up with a limits.h named after each multi_dir, when there are non-default multilibs. Oops. This patch arranges for a dummy fixinc to be created for *-*-vxworks7* targets, and fixes mkheaders so as to create multi_dir subdirs in include-fixed after wiping them out, and to copy limits.h so that it won't take the name that should be of a subdir (unless the multi_dir is limits.h, but that's hopefully never the case ;-) for fixincludes/ChangeLog * mkheaders.in: Re-create subdirs, copy limits.h into subdir. * mkfixinc.sh: Create dummy fixinc for *-*-vxworks7*.
Alexandre Oliva committed -
2020-02-13 Sandra Loosemore <sandra@codesourcery.com> gcc/testsuite/ * g++.dg/cpp0x/constexpr-static13.C: Add -fdelete-null-pointer-checks. * g++.dg/cpp2a/constexpr-new11.C: Likewise. * g++.dg/cpp2a/constexpr-new12.C: Likewise.
Sandra Loosemore committed -
Skip ENDBR32 at the target function entry when initializing trampoline. Tested on Linux/x86-64 CET machine with and without -m32. gcc/ PR target/93656 * config/i386/i386.c (ix86_trampoline_init): Skip ENDBR32 at the target function entry. gcc/testsuite/ PR target/93656 * gcc.target/i386/pr93656.c: New test.
H.J. Lu committed -
For ARC, predicated instructions are not very friendly with size optimizations, leading to increased object size. Disable if-conversion step when optimized for size. gcc/ xxxx-xx-xx Claudiu Zissulescu <claziss@synopsys.com> * common/config/arc/arc-common.c (arc_option_optimization_table): Disable if-conversion step when optimized for size. Signed-off-by: Claudiu Zissulescu <claziss@gmail.com>
Claudiu Zissulescu committed -
This option was used to control the short instruction selection. However, there is no difference in cycles whether or not we use a short instruction, and someone always wants a smaller program. gcc/ xxxx-xx-xx Claudiu Zissulescu <claziss@synopsys.com> * config/arc/arc.c (arc_conditional_register_usage): R0-R3 and R12-R15 are always in ARCOMPACT16_REGS register class. * config/arc/arc.opt (mq-class): Deprecate. * config/arc/constraint.md ("q"): Remove dependency on mq-class option. * doc/invoke.texi (mq-class): Update text. * common/config/arc/arc-common.c (arc_option_optimization_table): Update list. testsuite/ xxxx-xx-xx Claudiu Zissulescu <claziss@synopsys.com> * gcc.target/arc/nps400-1.c: Update test.
Claudiu Zissulescu committed -
TARGET_INSN_COST gives us better control over instruction costs than the classical RTX_COSTS. A simple cost scheme is in place for the time being: when optimizing for size, the cost is given by the instruction length; when optimizing for speed, the cost is 1 for any recognized instruction and 2 for any load/store instruction. The latter can be overridden by using the cost attribute for an instruction. Due to this change, we also need to update a number of instruction patterns with a new predicate to better reflect the costs. gcc/ xxxx-xx-xx Claudiu Zissulescu <claziss@synopsys.com> * config/arc/arc.c (arc_insn_cost): New function. (TARGET_INSN_COST): Define. * config/arc/arc.md (cost): New attribute. (add_n): Use arc_nonmemory_operand. (ashlsi3_insn): Likewise, also update constraints. (ashrsi3_insn): Likewise. (rotrsi3): Likewise. (add_shift): Likewise. * config/arc/predicates.md (arc_nonmemory_operand): New predicate. testsuite/ xxxx-xx-xx Claudiu Zissulescu <claziss@synopsys.com> * gcc.target/arc/or-cnst-size2.c: Update test.
Claudiu Zissulescu committed -
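The cost scheme described in the TARGET_INSN_COST commit above could be modelled roughly as below. This is a sketch under stated assumptions, not arc_insn_cost itself: the Insn struct and its fields are hypothetical, costs are plain integers rather than GCC's cost units, and the explicit cost attribute is assumed to take precedence over both speed defaults.

```cpp
#include <cassert>

// Hypothetical description of an instruction; this is not GCC's rtx_insn.
struct Insn {
    int length_bytes;    // encoded length of the instruction
    bool is_load_store;  // true for memory loads and stores
    int cost_attr;       // explicit "cost" attribute, 0 when unset
};

// Size: cost is the instruction length.
// Speed: 1 per recognized instruction, 2 for loads/stores, unless the
// instruction carries an explicit cost attribute (assumed precedence).
int insn_cost(const Insn& insn, bool optimize_for_speed) {
    if (!optimize_for_speed)
        return insn.length_bytes;
    if (insn.cost_attr > 0)
        return insn.cost_attr;
    return insn.is_load_store ? 2 : 1;
}

void insn_cost_example() {
    Insn add{4, false, 0};
    Insn load{2, true, 0};
    Insn tuned{4, true, 3};
    assert(insn_cost(add, false) == 4);
    assert(insn_cost(add, true) == 1);
    assert(insn_cost(load, true) == 2);
    assert(insn_cost(tuned, true) == 3);
}
```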
gcc/ xxxx-xx-xx Claudiu Zissulescu <claziss@synopsys.com> * config/arc/arc.md (mulsidi_600): Correctly select mlo/mhi registers. (umulsidi_600): Likewise. testsuite/ xxxx-xx-xx Claudiu Zissulescu <claziss@synopsys.com> Petro Karashchenko <petro.karashchenko@ring.com> * estsuite/gcc.target/arc/mul64-1.c: New test.
Claudiu Zissulescu committed -
As mentioned in the PR and as https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=_mask_popcnt_epi also documents, _mm*_popcnt_epi* intrinsics are consistent with all other unary AVX512* intrinsics regarding arguments, i.e. the _mm*_whatever has just single argument (called a in the docs, and __A in the GCC headers), _mm*_mask_whatever has 3 arguments (called src, k, a in the docs and _W, __U, __A in GCC headers) and _mm*_maskz_whatever 2 arguments (called k, a in the docs and __U, __A in GCC headers). Unfortunately, whoever implemented the _mm*_popcnt_epi* intrinsics got it wrong for the _mm*_mask_popcnt_epi* ones, calling the args __A, __U, __B and not passing them in the canonical order to the builtins, making it API incompatible with ICC as well as clang (tested on godbolt's clang 7/8/9/trunk and ICC 19.0.{0,1}, older clang/ICC don't understand those, so it isn't that it used to be broken even in other compilers and got changed afterwards). 2020-02-13 Jakub Jelinek <jakub@redhat.com> PR target/93696 * config/i386/avx512bitalgintrin.h (_mm512_mask_popcnt_epi8, _mm512_mask_popcnt_epi16, _mm256_mask_popcnt_epi8, _mm256_mask_popcnt_epi16, _mm_mask_popcnt_epi8, _mm_mask_popcnt_epi16): Rename __B argument to __A and __A to __W, pass __A to the builtin followed by __W instead of __A followed by __B. * config/i386/avx512vpopcntdqintrin.h (_mm512_mask_popcnt_epi32, _mm512_mask_popcnt_epi64): Likewise. * config/i386/avx512vpopcntdqvlintrin.h (_mm_mask_popcnt_epi32, _mm256_mask_popcnt_epi32, _mm_mask_popcnt_epi64, _mm256_mask_popcnt_epi64): Likewise. * gcc.target/i386/pr93696-1.c: New test. * gcc.target/i386/pr93696-2.c: New test. * gcc.target/i386/avx512bitalg-vpopcntw-1.c (TEST): Fix argument order of _mm*_mask_popcnt_*. * gcc.target/i386/avx512vpopcntdq-vpopcntq-1.c (TEST): Likewise. * gcc.target/i386/avx512vpopcntdq-vpopcntd-1.c (TEST): Likewise. * gcc.target/i386/avx512bitalg-vpopcntb-1.c (TEST): Likewise.
* gcc.target/i386/avx512bitalg-vpopcntb.c (foo): Likewise. * gcc.target/i386/avx512bitalg-vpopcntbvl.c (foo): Likewise. * gcc.target/i386/avx512vpopcntdq-vpopcntd.c (foo): Likewise. * gcc.target/i386/avx512bitalg-vpopcntwvl.c (foo): Likewise. * gcc.target/i386/avx512bitalg-vpopcntw.c (foo): Likewise. * gcc.target/i386/avx512vpopcntdq-vpopcntq.c (foo): Likewise.
Jakub Jelinek committed -
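The canonical (src, k, a) argument order discussed in the PR target/93696 commit above can be modelled without AVX512 hardware. The following scalar sketch uses eight 8-bit lanes; the lane count and the names Lanes, popcount8 and mask_popcnt_model are hypothetical and chosen only to show which argument supplies the pass-through lanes and which one is counted.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using Lanes = std::array<std::uint8_t, 8>;

// Count the set bits of one 8-bit lane.
int popcount8(std::uint8_t x) {
    int n = 0;
    while (x) {
        n += x & 1;
        x >>= 1;
    }
    return n;
}

// Scalar model of a merge-masked unary operation in (src, k, a) order:
// lane i is popcount(a[i]) when bit i of k is set, otherwise src[i].
Lanes mask_popcnt_model(const Lanes& src, std::uint8_t k, const Lanes& a) {
    Lanes r{};
    for (int i = 0; i < 8; ++i)
        r[i] = ((k >> i) & 1) ? static_cast<std::uint8_t>(popcount8(a[i]))
                              : src[i];
    return r;
}

void mask_popcnt_example() {
    Lanes src = {{9, 9, 9, 9, 9, 9, 9, 9}};
    Lanes a = {{0xFF, 0x0F, 0x01, 0x00, 0xF0, 0x03, 0x07, 0x80}};
    Lanes r = mask_popcnt_model(src, 0x05, a);  // lanes 0 and 2 selected
    assert(r[0] == 8 && r[1] == 9 && r[2] == 1 && r[3] == 9);
}
```

Swapping src and a in the call changes which lanes are counted and which are copied, which is the kind of incompatibility the commit describes.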
Frederik Harwath committed
-
An OpenMP "nowait" clause on a target construct currently leads to a call to GOMP_OFFLOAD_async_run in the plugin that is used for offloading at execution time. The nvptx plugin contains only a stub of this function that always produces a fatal error if called. This commit changes the "nowait" implementation to ignore the clause if the executing device's plugin does not implement GOMP_OFFLOAD_async_run. The stub in the nvptx plugin is removed which effectively means that programs containing "nowait" can now be executed with nvptx offloading as if the clause had not been used. This behavior is consistent with the OpenMP specification which says that "[...] execution of the target task *may* be deferred" (emphasis added), cf. OpenMP 5.0, page 172. libgomp/ * plugin/plugin-nvptx.c: Remove GOMP_OFFLOAD_async_run stub. * target.c (gomp_load_plugin_for_device): Make "async_run" loading optional. (gomp_target_task_fn): Assert "devicep->async_run_func". (clear_unsupported_flags): New function to remove unsupported flags (right now only GOMP_TARGET_FLAG_NOWAIT) that can be ignored. (GOMP_target_ext): Apply clear_unsupported_flags to flags. * testsuite/libgomp.c/target-33.c: Remove xfail for offload_target_nvptx. * testsuite/libgomp.c/target-34.c: Likewise.
Frederik Harwath committed -
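The flag filtering described in the libgomp commit above might look roughly like this. The function name clear_unsupported_flags comes from the ChangeLog, but everything else here is an assumption: the Device struct, the flag constants and their values are hypothetical stand-ins and do not reproduce libgomp's definitions.

```cpp
#include <cassert>

// Hypothetical flag values; libgomp's real constants may differ.
constexpr unsigned FLAG_NOWAIT = 1u << 0;
constexpr unsigned FLAG_OTHER = 1u << 1;

// Hypothetical device descriptor: async_run is null when the plugin does
// not provide an asynchronous run entry point.
struct Device {
    void (*async_run)();
};

void dummy_async_run() {}

// Drop flags the device cannot honour; only "nowait" is filtered here.
unsigned clear_unsupported_flags(const Device* dev, unsigned flags) {
    if (dev != nullptr && dev->async_run == nullptr)
        flags &= ~FLAG_NOWAIT;
    return flags;
}

void nowait_example() {
    Device without_async{nullptr};
    Device with_async{dummy_async_run};
    unsigned flags = FLAG_NOWAIT | FLAG_OTHER;
    assert(clear_unsupported_flags(&without_async, flags) == FLAG_OTHER);
    assert(clear_unsupported_flags(&with_async, flags) == flags);
}
```

With the flag cleared, the target region simply runs synchronously, which matches the "may be deferred" wording quoted in the commit message.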
The following patch is a first step towards fixing PR93582. vn_reference_lookup_3 right now punts on anything that isn't byte aligned, so to be able to lookup a constant bitfield store, one needs to use the exact same COMPONENT_REF, otherwise it isn't found. This patch lifts that restriction if the bits to be loaded are covered by a single store of a constant (keeps the restriction so far for the multiple store case, can tweak that incrementally, but I think for bisection etc. it is worth doing it one step at a time). 2020-02-13 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/93582 * fold-const.h (shift_bytes_in_array_left, shift_bytes_in_array_right): Declare. * fold-const.c (shift_bytes_in_array_left, shift_bytes_in_array_right): New function, moved from gimple-ssa-store-merging.c, no longer static. * gimple-ssa-store-merging.c (shift_bytes_in_array): Move to gimple-ssa-store-merging.c and rename to shift_bytes_in_array_left. (shift_bytes_in_array_right): Move to gimple-ssa-store-merging.c. (encode_tree_to_bitpos): Use shift_bytes_in_array_left instead of shift_bytes_in_array. (verify_shift_bytes_in_array): Rename to ... (verify_shift_bytes_in_array_left): ... this. Use shift_bytes_in_array_left instead of shift_bytes_in_array. (store_merging_c_tests): Call verify_shift_bytes_in_array_left instead of verify_shift_bytes_in_array. * tree-ssa-sccvn.c (vn_reference_lookup_3): For native_encode_expr / native_interpret_expr where the store covers all needed bits, punt on PDP-endian, otherwise allow all involved offsets and sizes not to be byte-aligned. * gcc.dg/tree-ssa/pr93582-1.c: New test. * gcc.dg/tree-ssa/pr93582-2.c: New test. * gcc.dg/tree-ssa/pr93582-3.c: New test.
Jakub Jelinek committed -
2020-02-13 Richard Biener <rguenther@suse.de> PR testsuite/93717 * gcc.dg/optimize-bswapsi-2.c: Add BE case.
Richard Biener committed -
As mentioned in the PR, the intrinsics allow counts from 0 to 255, but we actually reject values from 128 to 255. That is because QImode CONST_INTs can be only -128 to 127. Fixed by using const_0_to_255_operand and dropping the modes for the operands with those predicates (the IL actually contains the CONST_INT which has VOIDmode). 2020-02-13 Jakub Jelinek <jakub@redhat.com> PR target/93673 * config/i386/sse.md (k<code><mode>): Drop mode from last operand and use const_0_to_255_operand predicate instead of immediate_operand. (avx512dq_fpclass<mode><mask_scalar_merge_name>, avx512dq_vmfpclass<mode><mask_scalar_merge_name>, vgf2p8affineinvqb_<mode><mask_name>, vgf2p8affineqb_<mode><mask_name>): Drop mode from const_0_to_255_operand predicated operands. * gcc.target/i386/avx512f-pr93673.c: New test. * gcc.target/i386/avx512dq-pr93673.c: New test. * gcc.target/i386/avx512bw-pr93673.c: New test.
Jakub Jelinek committed
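The range mismatch behind PR target/93673 can be shown with two simple predicates. This is only a model of the arithmetic: fits_signed_8bit and in_0_to_255 are hypothetical helpers, not GCC predicates, and they stand in for "a constant valid in an 8-bit signed mode" and "a constant accepted by a 0 to 255 range check" respectively.

```cpp
#include <cassert>

// Model of a constant that must be representable as a sign-extended
// 8-bit value (the -128 to 127 range mentioned in the commit).
bool fits_signed_8bit(long v) { return v >= -128 && v <= 127; }

// Model of a range check in the spirit of const_0_to_255_operand.
bool in_0_to_255(long v) { return v >= 0 && v <= 255; }

// Counts from 128 to 255 pass the range check but not the 8-bit one.
void immediate_range_example() {
    assert(fits_signed_8bit(127) && in_0_to_255(127));
    assert(!fits_signed_8bit(128) && in_0_to_255(128));
    assert(!fits_signed_8bit(255) && in_0_to_255(255));
    assert(!in_0_to_255(256));
}
```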
-