- 23 Oct, 2018 10 commits
2018-10-23 Richard Biener <rguenther@suse.de> PR tree-optimization/87105 PR tree-optimization/87608 * passes.def (pass_all_early_optimizations): Add early phi-opt after dce. * tree-ssa-phiopt.c (value_replacement): Ignore NOPs and predicts in addition to debug stmts. (tree_ssa_phiopt_worker): Add early_p argument, do only min/max and abs replacement early. * tree-cfg.c (gimple_empty_block_p): Likewise. * g++.dg/tree-ssa/phiopt-1.C: New testcase. g++.dg/vect/slp-pr87105.cc: Likewise. * g++.dg/tree-ssa/pr21463.C: Scan phiopt2 because this testcase relies on phiprop run before. * g++.dg/tree-ssa/pr30738.C: Likewise. * g++.dg/tree-ssa/pr57380.C: Likewise. * gcc.dg/tree-ssa/pr84859.c: Likewise. * gcc.dg/tree-ssa/pr45397.c: Scan phiopt2 because phiopt1 is confused by copies in the IL left by EVRP. * gcc.dg/tree-ssa/phi-opt-5.c: Likewise, this time confused by predictors. * gcc.dg/tree-ssa/phi-opt-12.c: Scan phiopt2. * gcc.dg/pr24574.c: Likewise. * g++.dg/tree-ssa/pr86544.C: Scan phiopt4. From-SVN: r265421
Richard Biener committed
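As an illustration (a hypothetical example, not part of the patch), this is the kind of source-level conditional that the min/max replacement in tree-ssa-phiopt.c folds away:

```c
#include <assert.h>

/* A diamond-shaped CFG with a PHI choosing between a and b: phiopt's
   min/max replacement recognizes this shape and rewrites the PHI into
   a single MAX_EXPR, so later passes (including the vectorizer, as in
   PR87105) see straight-line code instead of control flow.  */
static int
my_max (int a, int b)
{
  return a > b ? a : b;
}
```

Compiling such code with -fdump-tree-phiopt should show the transformed statement in the dump file.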
There are a couple of places in config.gcc where the default CPU is still arm6, but that was removed as a supported CPU earlier this year. This patch fixes those entries. The default CPU for configurations that do not explicitly set a default is now arm7tdmi (so it assumes Thumb is available). Given that StrongArm is on the deprecated list, this is a better default than we had previously. For NetBSD the default is StrongArm; this is the only remaining port that uses the old ABI and really still carries support for non-thumb based targets.

PR target/86383
* config.gcc (arm*-*-netbsdelf*): Default to StrongARM if no CPU specified to configure. (arm*-*-*): Use ARM7TDMI as the target CPU if no default provided.

From-SVN: r265420
Richard Earnshaw committed
2018-10-23 Richard Biener <rguenther@suse.de> PR tree-optimization/87700 * tree-ssa-copy.c (set_copy_of_val): Fix change detection logic. * gcc.dg/torture/pr87700.c: New testcase. From-SVN: r265418
Richard Biener committed
PR target/87674 * config/i386/avx512vlintrin.h (_mm_mask_mullo_epi32): Change type of second argument from __mmask16 to __mmask8. * config/i386/avx512vlbwintrin.h (_mm_mask_packus_epi32, _mm_mask_packs_epi32): Likewise. * config/i386/avx512pfintrin.h (_mm512_mask_prefetch_i64scatter_ps): Likewise. (_mm512_mask_prefetch_i64scatter_pd): Likewise. Formatting fix. From-SVN: r265416
Jakub Jelinek committed
2018-10-23 Richard Biener <rguenther@suse.de> * tree-vect-stmts.c (vect_analyze_stmt): Fix typo in comment. From-SVN: r265415
Richard Biener committed
2018-10-23 Richard Biener <rguenther@suse.de> PR tree-optimization/86144 * tree-vect-stmts.c (vect_analyze_stmt): Prefer -mveclibabi over simd attribute. From-SVN: r265414
Richard Biener committed
2018-10-23 Richard Biener <rguenther@suse.de> PR tree-optimization/87693 * tree-ssa-threadedge.c (thread_around_empty_blocks): Handle the case we do not find the taken edge. * gcc.dg/torture/pr87693.c: New testcase. From-SVN: r265413
Richard Biener committed
2018-10-23 Paul Thomas <pault@gcc.gnu.org> PR fortran/85603 * frontend-passes.c (get_len_call): New function to generate a call to intrinsic LEN. (create_var): Use this to make length expressions for variable rhs string lengths. Clean up some white space issues. 2018-10-23 Paul Thomas <pault@gcc.gnu.org> PR fortran/85603 * gfortran.dg/deferred_character_23.f90 : Check reallocation is occurring as it should and a regression caused by version 1 of this patch. From-SVN: r265412
Paul Thomas committed
Introduce a new "types" command to the export data to record the number of types and the size of their export data. It is immediately followed by new "type" commands that can be indexed. Parse all the exported types immediately so that we register them, but parse other type data only as needed. Reviewed-on: https://go-review.googlesource.com/c/143022 From-SVN: r265409
Ian Lance Taylor committed
From-SVN: r265408
GCC Administrator committed
- 22 Oct, 2018 25 commits
* symtab.c (symtab_node::increase_alignment): Correct max alignment check. From-SVN: r265404
Paul Koning committed
2018-10-22 Yury Gribov <tetra2005@gmail.com> gcc/ PR tree-optimization/87633 * match.pd: Do not generate unordered integer comparisons. gcc/testsuite/ PR tree-optimization/87633 * g++.dg/pr87633.C: New test. From-SVN: r265399
Yury Gribov committed
On most targets every function starts with moves from the parameter passing (hard) registers into pseudos. Similarly, after every call there is a move from the return register into a pseudo. These moves usually combine with later instructions (leaving pretty much the same instruction, just with a hard reg instead of a pseudo).

This isn't a good idea. Register allocation can get rid of unnecessary moves just fine, and moving the parameter passing registers into many later instructions tends to prevent good register allocation. This patch disallows combining moves from a hard (non-fixed) register. This also avoids the problem mentioned in PR87600 #c3 (combining hard registers into inline assembler is problematic).

Because the register move can often be combined with other instructions *itself*, for example for setting some condition code, this patch adds extra copies via new pseudos after every copy-from-hard-reg.

On some targets this reduces average code size. On others it increases it a bit, 0.1% or 0.2% or so. (I tested this on all *-linux targets.)

PR rtl-optimization/87600
* combine.c: Add include of expr.h. (cant_combine_insn_p): Do not combine moves from any hard non-fixed register to a pseudo. (make_more_copies): New function, add a copy to a new pseudo after the moves from hard registers into pseudos. (rest_of_handle_combine): Declare rebuild_jump_labels_after_combine later. Call make_more_copies.

From-SVN: r265398
Segher Boessenkool committed
PR testsuite/87694 * g++.dg/concepts/memfun-err.C: Make it a compile test. From-SVN: r265397
Marek Polacek committed
Given a pattern with a number of operands:

  (match_operand 0 "" "=&v")
  (match_operand 1 "" " v0")
  (match_operand 2 "" " v0")
  (match_operand 3 "" " v0")

GCC will currently increment "reject" once, for operand 0, and then decrement it once for each of the other operands, ending with reject == -2 and an assertion failure. If there's a conflict then it might try to decrement reject yet again.

Incidentally, what these patterns are trying to achieve is an allocation in which operand 0 may match one of the other operands, but may not partially overlap any of them. Ideally there'd be a better way to do this. In any case, it will affect any pattern in which multiple operands may (or must) match an early-clobber operand.

The patch only allows a reject-- when one has not already occurred, for that operand.

2018-10-22 Andrew Stubbs <ams@codesourcery.com>

gcc/
* lra-constraints.c (process_alt_operands): New local array, matching_early_clobber. Check matching_early_clobber before decrementing reject, and set matching_early_clobber after.

From-SVN: r265393
Andrew Stubbs committed
As the PR shows, the user can force this to be called on at least some RTL that is not a valid address. Most targets treat this as if the user knows best; let's do the same. PR target/87598 * config/rs6000/rs6000.c (print_operand_address): For unexpected RTL call output_addr_const and hope for the best. From-SVN: r265392
Segher Boessenkool committed
* gimple-ssa-evrp-analyze.c (evrp_range_analyzer::record_ranges_from_incoming_edge): Be smarter about what ranges to use. * tree-vrp.c (add_assert_info): Dump here. (register_edge_assert_for_2): Instead of here at multiple but not all places. * gcc.dg/tree-ssa/evrp12.c: New testcase. * gcc.dg/predict-6.c: Adjust. * gcc.dg/tree-ssa/vrp33.c: Disable EVRP. * gcc.dg/tree-ssa/vrp02.c: Likewise. * gcc.dg/tree-ssa/cunroll-9.c: Likewise. From-SVN: r265391
Richard Biener committed
2018-10-22 Steven Bosscher <steven@gcc.gnu.org> Richard Biener <rguenther@suse.de> * bitmap.h: Update data structure documentation, including a description of bitmap views as either linked-lists or splay trees. (struct bitmap_element_def): Update comments for splay tree bitmaps. (struct bitmap_head_def): Likewise. (bitmap_list_view, bitmap_tree_view): New prototypes. (bitmap_initialize_stat): Initialize a bitmap_head's indx and tree_form fields. (bmp_iter_set_init): Assert the iterated bitmaps are in list form. (bmp_iter_and_init, bmp_iter_and_compl_init): Likewise. * bitmap.c (bitmap_elem_to_freelist): Unregister overhead of a released bitmap element here. (bitmap_element_free): Remove. (bitmap_elt_clear_from): Work on splay tree bitmaps. (bitmap_list_link_element): Renamed from bitmap_element_link. Move this function similar ones such that linked-list bitmap implementation functions are grouped. (bitmap_list_unlink_element): Renamed from bitmap_element_unlink, and moved for grouping. (bitmap_list_insert_element_after): Renamed from bitmap_elt_insert_after, and moved for grouping. (bitmap_list_find_element): New function spliced from bitmap_find_bit. (bitmap_tree_link_left, bitmap_tree_link_right, bitmap_tree_rotate_left, bitmap_tree_rotate_right, bitmap_tree_splay, bitmap_tree_link_element, bitmap_tree_unlink_element, bitmap_tree_find_element): New functions for splay-tree bitmap implementation. (bitmap_element_link, bitmap_element_unlink, bitmap_elt_insert_after): Renamed and moved, see above entries. (bitmap_tree_listify_from): New function to convert part of a splay tree bitmap to a linked-list bitmap. (bitmap_list_view): Convert a splay tree bitmap to linked-list form. (bitmap_tree_view): Convert a linked-list bitmap to splay tree form. (bitmap_find_bit): Remove. (bitmap_clear, bitmap_clear_bit, bitmap_set_bit, bitmap_single_bit_set_p, bitmap_first_set_bit, bitmap_last_set_bit): Handle splay tree bitmaps. 
(bitmap_copy, bitmap_count_bits, bitmap_and, bitmap_and_into, bitmap_elt_copy, bitmap_and_compl, bitmap_and_compl_into, bitmap_compl_and_into, bitmap_elt_ior, bitmap_ior, bitmap_ior_into, bitmap_xor, bitmap_xor_into, bitmap_equal_p, bitmap_intersect_p, bitmap_intersect_compl_p, bitmap_ior_and_compl, bitmap_ior_and_compl_into, bitmap_set_range, bitmap_clear_range, bitmap_hash): Reject trying to act on splay tree bitmaps. Make corresponding changes to use linked-list specific bitmap_element manipulation functions as applicable for efficiency. (bitmap_tree_to_vec): New function. (debug_bitmap_elt_file): New function split out from ... (debug_bitmap_file): ... here. Handle splay tree bitmaps. (bitmap_print): Likewise. PR tree-optimization/63155 * tree-ssa-propagate.c (ssa_prop_init): Use tree-view for the SSA edge worklists. * tree-ssa-coalesce.c (coalesce_ssa_name): Populate used_in_copies in tree-view. From-SVN: r265390
Steven Bosscher committed
Index: gcc/config/rs6000/emmintrin.h
===================================================================
--- gcc/config/rs6000/emmintrin.h	(revision 265318)
+++ gcc/config/rs6000/emmintrin.h	(working copy)
@@ -85,7 +85,7 @@ typedef double __m128d __attribute__ ((__vector_si
 typedef long long __m128i_u __attribute__ ((__vector_size__ (16), __may_alias__, __aligned__ (1)));
 typedef double __m128d_u __attribute__ ((__vector_size__ (16), __may_alias__, __aligned__ (1)));
 
-/* Define two value permute mask */
+/* Define two value permute mask.  */
 #define _MM_SHUFFLE2(x,y) (((x) << 1) | (y))
 
 /* Create a vector with element 0 as F and the rest zero.  */
@@ -201,7 +201,7 @@ _mm_store_pd (double *__P, __m128d __A)
 extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 _mm_storeu_pd (double *__P, __m128d __A)
 {
-  *(__m128d *)__P = __A;
+  *(__m128d_u *)__P = __A;
 }
 
 /* Stores the lower DPFP value.  */
@@ -2175,7 +2175,7 @@ _mm_maskmoveu_si128 (__m128i __A, __m128i __B, cha
 {
   __v2du hibit = { 0x7f7f7f7f7f7f7f7fUL, 0x7f7f7f7f7f7f7f7fUL};
   __v16qu mask, tmp;
-  __m128i *p = (__m128i*)__C;
+  __m128i_u *p = (__m128i_u*)__C;
 
   tmp = (__v16qu)_mm_loadu_si128(p);
   mask = (__v16qu)vec_cmpgt ((__v16qu)__B, (__v16qu)hibit);

Index: gcc/config/rs6000/xmmintrin.h
===================================================================
--- gcc/config/rs6000/xmmintrin.h	(revision 265318)
+++ gcc/config/rs6000/xmmintrin.h	(working copy)
@@ -85,6 +85,10 @@ vector types, and their scalar components.  */
 typedef float __m128 __attribute__ ((__vector_size__ (16), __may_alias__));
 
+/* Unaligned version of the same type.  */
+typedef float __m128_u __attribute__ ((__vector_size__ (16), __may_alias__,
+				       __aligned__ (1)));
+
 /* Internal data types for implementing the intrinsics.  */
 typedef float __v4sf __attribute__ ((__vector_size__ (16)));
 
@@ -172,7 +176,7 @@ _mm_store_ps (float *__P, __m128 __A)
 extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 _mm_storeu_ps (float *__P, __m128 __A)
 {
-  *(__m128 *)__P = __A;
+  *(__m128_u *)__P = __A;
 }
 
 /* Store four SPFP values in reverse order.  The address must be aligned.  */

From-SVN: r265389
William Schmidt committed
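A sketch of the idiom behind these typedefs, using generic GCC vector extensions rather than the rs6000 headers (the type names here are illustrative, not the ones in the patch):

```c
#include <assert.h>

/* A plain 16-byte vector type implies 16-byte alignment, so casting
   an arbitrary pointer to it and storing is undefined behavior.  An
   __aligned__ (1) __may_alias__ variant -- like the __m128_u type the
   patch adds -- makes the unaligned store well defined and lets the
   compiler emit an unaligned store instruction.  */
typedef int v4si   __attribute__ ((vector_size (16)));
typedef int v4si_u __attribute__ ((vector_size (16), aligned (1),
                                   may_alias));

static void
store_unaligned (void *p, v4si v)
{
  *(v4si_u *) p = v;   /* mirrors the fixed _mm_storeu_ps body */
}
```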
2018-10-22 Martin Liska <mliska@suse.cz> PR tree-optimization/87686 Revert 2018-08-29 Martin Liska <mliska@suse.cz> * tree-switch-conversion.c (switch_conversion::expand): Strengthen assumption about gswitch statements. 2018-10-22 Martin Liska <mliska@suse.cz> PR tree-optimization/87686 * g++.dg/tree-ssa/pr87686.C: New test. From-SVN: r265388
Martin Liska committed
2018-10-22 Jakub Jelinek <jakub@redhat.com> * g++.target/i386/i386.exp: Use g++-dg-runtest to iterate properly -std= options. From-SVN: r265387
Jakub Jelinek committed
2018-10-22 Martin Liska <mliska@suse.cz> * ipa-icf.c (sem_item::compare_attributes): Remove. (sem_item::compare_referenced_symbol_properties): Use attribute_list_equal instead. (sem_function::equals_wpa): Likewise. * ipa-icf.h: Remove compare_attributes. From-SVN: r265386
Martin Liska committed
2018-10-22 Richard Biener <rguenther@suse.de> * gcc.dg/graphite/scop-4.c: Avoid out-of-bound access. From-SVN: r265385
Richard Biener committed
utils.c (unchecked_convert): Use local variables for the biased and reverse SSO attributes of both types. * gcc-interface/utils.c (unchecked_convert): Use local variables for the biased and reverse SSO attributes of both types. Further extend the processing of integral types in the presence of reverse SSO to all scalar types. From-SVN: r265381
Eric Botcazou committed
* gcc-interface/trans.c (Pragma_to_gnu) <Pragma_Inspection_Point>: Use a simple memory constraint in all cases. * gcc-interface/lang-specs.h: Bump copyright year. From-SVN: r265378
Eric Botcazou committed
* gnat.dg/warn19.ad[sb]: New test. * gnat.dg/warn19_pkg.ads: New helper. From-SVN: r265377
Eric Botcazou committed
2018-10-22 Richard Biener <rguenther@suse.de> PR middle-end/87682 * mem-stats.h (mem_usage::operator==): Fix pasto. From-SVN: r265376
Richard Biener committed
2018-10-22 Richard Biener <rguenther@suse.de> PR tree-optimization/87640 * tree-vrp.c (set_value_range_with_overflow): Decompose incomplete result. (extract_range_from_binary_expr_1): Adjust. * gcc.dg/torture/pr87640.c: New testcase. From-SVN: r265375
Richard Biener committed
The test is part of the originally posted change (https://gcc.gnu.org/ml/gcc-patches/2018-10/msg01173.html), but was forgotten during svn commit. From-SVN: r265373
Ilya Leoshkevich committed
This long patch only does one simple thing: it adds an explicit function parameter to the predicates stmt_could_throw_p, stmt_can_throw_external and stmt_can_throw_internal. My motivation was the ability to use stmt_can_throw_external in the IPA analysis phase without the need to push cfun. As I have discovered, we were already doing that in cgraph.c, which this patch avoids as well. In the process, I had to add a struct function parameter to stmt_could_throw_p and decided to also change the interface of stmt_can_throw_internal for the sake of some minimal consistency.

In the process I have discovered that calling the method cgraph_node::create_version_clone_with_body (used by ipa-split, ipa-sra, OMP simd and multiple_target) leads to calls of stmt_can_throw_external with NULL cfun. I have worked around this by making stmt_can_throw_external and stmt_could_throw_p gracefully accept NULL and just be pessimistic in that case. The problem with fixing this in a better way is that the struct function for the clone is created after cloning edges, where we attempt to push the not-yet-existing cfun; moving it before would require a bit of surgery in tree-inline.c. A slightly hackish but simpler fix might be to explicitly pass the "old" function to symbol_table::create_edge, because it should be just as good at that moment. In any event, that is a topic for another patch.

I believe that currently we incorrectly use cfun in maybe_clean_eh_stmt_fn and maybe_duplicate_eh_stmt_fn, both in tree-eh.c, so I have fixed these cases too. The bulk of the other changes is just mechanical addition of cfun for all users.

Bootstrapped and tested on x86_64-linux (also with extra NULLing and restoring of cfun to double-check it is not used in a place I missed). OK for trunk?

Thanks,

Martin

2018-10-22 Martin Jambor <mjambor@suse.cz> * tree-eh.h (stmt_could_throw_p): Add function parameter. (stmt_can_throw_external): Likewise. (stmt_can_throw_internal): Likewise.
* tree-eh.c (lower_eh_constructs_2): Pass cfun to stmt_could_throw_p. (lower_eh_constructs_2): Likewise. (stmt_could_throw_p): Add fun parameter, use it instead of cfun. (stmt_can_throw_external): Likewise. (stmt_can_throw_internal): Likewise. (maybe_clean_eh_stmt_fn): Pass cfun to stmt_could_throw_p. (maybe_clean_or_replace_eh_stmt): Pass cfun to stmt_could_throw_p. (maybe_duplicate_eh_stmt_fn): Pass new_fun to stmt_could_throw_p. (maybe_duplicate_eh_stmt): Pass cfun to stmt_could_throw_p. (pass_lower_eh_dispatch::execute): Pass cfun to stmt_can_throw_external. (cleanup_empty_eh): Likewise. (verify_eh_edges): Pass cfun to stmt_could_throw_p. * cgraph.c (cgraph_edge::set_call_stmt): Pass a function to stmt_can_throw_external instead of pushing it to cfun. (symbol_table::create_edge): Likewise. * gimple-fold.c (fold_builtin_atomic_compare_exchange): Pass cfun to stmt_can_throw_internal. * gimple-ssa-evrp.c (evrp_dom_walker::before_dom_children): Pass cfun to stmt_could_throw_p. * gimple-ssa-store-merging.c (handled_load): Pass cfun to stmt_can_throw_internal. (pass_store_merging::execute): Likewise. * gimple-ssa-strength-reduction.c (find_candidates_dom_walker::before_dom_children): Pass cfun to stmt_could_throw_p. * gimplify-me.c (gimple_regimplify_operands): Pass cfun to stmt_can_throw_internal. * ipa-pure-const.c (check_call): Pass cfun to stmt_could_throw_p and to stmt_can_throw_external. (check_stmt): Pass cfun to stmt_could_throw_p. (check_stmt): Pass cfun to stmt_can_throw_external. (pass_nothrow::execute): Likewise. * trans-mem.c (expand_call_tm): Pass cfun to stmt_can_throw_internal. * tree-cfg.c (is_ctrl_altering_stmt): Pass cfun to stmt_can_throw_internal. (verify_gimple_in_cfg): Pass cfun to stmt_could_throw_p. (stmt_can_terminate_bb_p): Pass cfun to stmt_can_throw_external. (gimple_purge_dead_eh_edges): Pass cfun to stmt_can_throw_internal. * tree-complex.c (expand_complex_libcall): Pass cfun to stmt_could_throw_p and to stmt_can_throw_internal. 
(expand_complex_multiplication): Pass cfun to stmt_can_throw_internal. * tree-inline.c (copy_edges_for_bb): Likewise. (maybe_move_debug_stmts_to_successors): Likewise. * tree-outof-ssa.c (ssa_is_replaceable_p): Pass cfun to stmt_could_throw_p. * tree-parloops.c (oacc_entry_exit_ok_1): Likewise. * tree-sra.c (scan_function): Pass cfun to stmt_can_throw_external. * tree-ssa-alias.c (stmt_kills_ref_p): Pass cfun to stmt_can_throw_internal. * tree-ssa-ccp.c (optimize_atomic_bit_test_and): Likewise. * tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Pass cfun to stmt_could_throw_p. (mark_aliased_reaching_defs_necessary_1): Pass cfun to stmt_can_throw_internal. * tree-ssa-forwprop.c (pass_forwprop::execute): Likewise. * tree-ssa-loop-im.c (movement_possibility): Pass cfun to stmt_could_throw_p. * tree-ssa-loop-ivopts.c (find_givs_in_stmt_scev): Likewise. (add_autoinc_candidates): Pass cfun to stmt_can_throw_internal. * tree-ssa-math-opts.c (pass_cse_reciprocals::execute): Likewise. (convert_mult_to_fma_1): Likewise. (convert_to_divmod): Likewise. * tree-ssa-phiprop.c (propagate_with_phi): Likewise. * tree-ssa-pre.c (compute_avail): Pass cfun to stmt_could_throw_p. * tree-ssa-propagate.c (substitute_and_fold_dom_walker::before_dom_children): Likewise. * tree-ssa-reassoc.c (suitable_cond_bb): Likewise. (maybe_optimize_range_tests): Likewise. (linearize_expr_tree): Likewise. (reassociate_bb): Likewise. * tree-ssa-sccvn.c (copy_reference_ops_from_call): Likewise. * tree-ssa-scopedtables.c (hashable_expr_equal_p): Likewise. * tree-ssa-strlen.c (adjust_last_stmt): Likewise. (handle_char_store): Likewise. * tree-vect-data-refs.c (vect_find_stmt_data_reference): Pass cfun to stmt_can_throw_internal. * tree-vect-patterns.c (check_bool_pattern): Pass cfun to stmt_could_throw_p. * tree-vect-stmts.c (vect_finish_stmt_generation_1): Likewise. (vectorizable_call): Pass cfun to stmt_can_throw_internal. (vectorizable_simd_clone_call): Likewise. 
* value-prof.c (gimple_ic): Pass cfun to stmt_could_throw_p. (gimple_stringop_fixed_value): Likewise. From-SVN: r265372
Martin Jambor committed
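The shape of this interface change can be sketched generically (all names below are stand-ins for illustration, not the GCC internals):

```c
#include <assert.h>
#include <stddef.h>

/* Before: the predicate implicitly read a global context (cfun).
   After: the context is an explicit parameter, so IPA code can query
   any function without pushing it first, and a NULL context yields a
   conservative "might throw" answer, as the patch does for
   stmt_can_throw_external and stmt_could_throw_p.  */
struct function_ctx { int has_eh_region; };

static int
could_throw_p (const struct function_ctx *fun)
{
  if (fun == NULL)
    return 1;   /* no context available: be pessimistic */
  return fun->has_eh_region;
}
```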
Improves the code generation by getting rid of redundant LAs, as seen in the following example:

- la %r1,0(%r13)
- lg %r4,0(%r1)
+ lg %r4,0(%r13)

Also allows us to proceed with the merge of movdi_64 and movdi_larl.

Currently LRA decides to spill literal pool references back to the literal pool, because it preliminarily chooses alternatives with CT_MEMORY constraints without calling satisfies_memory_constraint_p (). Later on it notices that the constraint is wrong and fixes it by spilling. The constraint in this case is "b", and the operand is a literal pool reference. There is no reason to reject them. The current behavior was introduced, apparently unintentionally, by https://gcc.gnu.org/ml/gcc-patches/2010-09/msg00812.html

The patch affects a little bit more than mentioned in the subject, because it changes s390_loadrelative_operand_p (), which is called not only for checking the "b" constraint. However, the only caller for which it should really not accept literal pool references is s390_check_qrst_address (), so it was changed to explicitly do so.

gcc/ChangeLog:

2018-10-22 Ilya Leoshkevich <iii@linux.ibm.com>
* config/s390/s390.c (s390_loadrelative_operand_p): Accept literal pool references. (s390_check_qrst_address): Adapt to the new behavior of s390_loadrelative_operand_p ().

gcc/testsuite/ChangeLog:

2018-10-22 Ilya Leoshkevich <iii@linux.ibm.com>
* gcc.target/s390/litpool-int.c: New test.

From-SVN: r265371
Ilya Leoshkevich committed
Many AVX512 vector operations can broadcast from a scalar memory source. This patch enables memory broadcast for INT andnot operations. gcc/ PR target/72782 * config/i386/sse.md (*andnot<mode>3_bcst): New. gcc/testsuite/ PR target/72782 * gcc.target/i386/avx512f-andn-di-zmm-1.c: New test. * gcc.target/i386/avx512f-andn-si-zmm-1.c: Likewise. * gcc.target/i386/avx512f-andn-si-zmm-2.c: Likewise. * gcc.target/i386/avx512f-andn-si-zmm-3.c: Likewise. * gcc.target/i386/avx512f-andn-si-zmm-4.c: Likewise. * gcc.target/i386/avx512f-andn-si-zmm-5.c: Likewise. * gcc.target/i386/avx512vl-andn-si-xmm-1.c: Likewise. * gcc.target/i386/avx512vl-andn-si-ymm-1.c: Likewise. From-SVN: r265370
H.J. Lu committed
Many AVX512 vector operations can broadcast from a scalar memory source. This patch enables memory broadcast for INT logic operations. gcc/ PR target/72782 * config/i386/sse.md (*<code><mode>3_bcst): New. gcc/testsuite/ PR target/72782 * gcc.target/i386/avx512f-and-di-zmm-1.c: New test. * gcc.target/i386/avx512f-and-si-zmm-1.c: Likewise. * gcc.target/i386/avx512f-and-si-zmm-2.c: Likewise. * gcc.target/i386/avx512f-and-si-zmm-3.c: Likewise. * gcc.target/i386/avx512f-and-si-zmm-4.c: Likewise. * gcc.target/i386/avx512f-and-si-zmm-5.c: Likewise. * gcc.target/i386/avx512f-and-si-zmm-6.c: Likewise. * gcc.target/i386/avx512f-or-di-zmm-1.c: Likewise. * gcc.target/i386/avx512f-or-si-zmm-1.c: Likewise. * gcc.target/i386/avx512f-or-si-zmm-2.c: Likewise. * gcc.target/i386/avx512f-or-si-zmm-3.c: Likewise. * gcc.target/i386/avx512f-or-si-zmm-4.c: Likewise. * gcc.target/i386/avx512f-or-si-zmm-5.c: Likewise. * gcc.target/i386/avx512f-or-si-zmm-6.c: Likewise. * gcc.target/i386/avx512f-xor-di-zmm-1.c: Likewise. * gcc.target/i386/avx512f-xor-si-zmm-1.c: Likewise. * gcc.target/i386/avx512f-xor-si-zmm-2.c: Likewise. * gcc.target/i386/avx512f-xor-si-zmm-3.c: Likewise. * gcc.target/i386/avx512f-xor-si-zmm-4.c: Likewise. * gcc.target/i386/avx512f-xor-si-zmm-5.c: Likewise. * gcc.target/i386/avx512f-xor-si-zmm-6.c: Likewise. * gcc.target/i386/avx512vl-and-si-xmm-1.c: Likewise. * gcc.target/i386/avx512vl-and-si-ymm-1.c: Likewise. * gcc.target/i386/avx512vl-or-si-xmm-1.c: Likewise. * gcc.target/i386/avx512vl-or-si-ymm-1.c: Likewise. * gcc.target/i386/avx512vl-xor-si-xmm-1.c: Likewise. * gcc.target/i386/avx512vl-xor-si-ymm-1.c: Likewise. From-SVN: r265369
H.J. Lu committed
Many AVX512 vector operations can broadcast from a scalar memory source. This patch enables memory broadcast for INT add operations. gcc/ PR target/72782 * config/i386/sse.md (avx512bcst): Updated for V4SI, V2DI, V8SI, V4DI, V16SI and V8DI. (*sub<mode>3<mask_name>_bcst): New. (*add<mode>3<mask_name>_bcst): Likewise. gcc/testsuite/ PR target/72782 * gcc.target/i386/avx512f-add-di-zmm-1.c: New test. * gcc.target/i386/avx512f-add-si-zmm-1.c: Likewise. * gcc.target/i386/avx512f-add-si-zmm-2.c: Likewise. * gcc.target/i386/avx512f-add-si-zmm-3.c: Likewise. * gcc.target/i386/avx512f-add-si-zmm-4.c: Likewise. * gcc.target/i386/avx512f-add-si-zmm-5.c: Likewise. * gcc.target/i386/avx512f-add-si-zmm-6.c: Likewise. * gcc.target/i386/avx512f-sub-di-zmm-1.c: Likewise. * gcc.target/i386/avx512f-sub-si-zmm-1.c: Likewise. * gcc.target/i386/avx512f-sub-si-zmm-2.c: Likewise. * gcc.target/i386/avx512f-sub-si-zmm-3.c: Likewise. * gcc.target/i386/avx512f-sub-si-zmm-4.c: Likewise. * gcc.target/i386/avx512f-sub-si-zmm-5.c: Likewise. * gcc.target/i386/avx512vl-add-si-xmm-1.c: Likewise. * gcc.target/i386/avx512vl-add-si-ymm-1.c: Likewise. * gcc.target/i386/avx512vl-sub-si-xmm-1.c: Likewise. * gcc.target/i386/avx512vl-sub-si-ymm-1.c: Likewise. From-SVN: r265368
H.J. Lu committed
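The source shape these memory-broadcast patterns target can be sketched in plain C (a hypothetical example, not a test from the patch):

```c
#include <assert.h>

/* Every element is combined with the same scalar loaded from memory.
   With AVX512 enabled, the vectorizer can fold the broadcast into the
   arithmetic instruction itself (e.g. vpaddd with a {1to16} memory
   operand) instead of emitting a separate vpbroadcastd, which is what
   the new *_bcst patterns allow.  */
static void
add_scalar_to_all (int *a, int n, const int *s)
{
  for (int i = 0; i < n; i++)
    a[i] += *s;
}
```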
From-SVN: r265366
GCC Administrator committed
- 21 Oct, 2018 5 commits
emmintrin.h (_mm_movemask_pd): Replace __vector __m64 with __vector unsigned long long for compatibility. 2018-10-21 Bill Schmidt <wschmidt@linux.ibm.com> Jinsong Ji <jji@us.ibm.com> * config/rs6000/emmintrin.h (_mm_movemask_pd): Replace __vector __m64 with __vector unsigned long long for compatibility. (_mm_movemask_epi8): Likewise. * config/rs6000/xmmintrin.h (_mm_cvtps_pi32): Likewise. (_mm_cvttps_pi32): Likewise. (_mm_cvtpi32_ps): Likewise. (_mm_cvtps_pi16): Likewise. (_mm_loadh_pi): Likewise. (_mm_storeh_pi): Likewise. (_mm_movehl_ps): Likewise. (_mm_movelh_ps): Likewise. (_mm_loadl_pi): Likewise. (_mm_storel_pi): Likewise. (_mm_movemask_ps): Likewise. (_mm_shuffle_pi16): Likewise. From-SVN: r265362
William Schmidt committed
From-SVN: r265360
H.J. Lu committed
Update AVX512 tests to test the newly added FMSUB, FNMADD and FNMSUB builtin functions. PR target/72782 * gcc.target/i386/avx-1.c (__builtin_ia32_vfmsubpd512_mask): New. (__builtin_ia32_vfmsubpd512_maskz): Likewise. (__builtin_ia32_vfmsubps512_mask): Likewise. (__builtin_ia32_vfmsubps512_maskz): Likewise. (__builtin_ia32_vfnmaddpd512_mask3): Likewise. (__builtin_ia32_vfnmaddpd512_maskz): Likewise. (__builtin_ia32_vfnmaddps512_mask3): Likewise. (__builtin_ia32_vfnmaddps512_maskz): Likewise. (__builtin_ia32_vfnmsubpd512_maskz): Likewise. (__builtin_ia32_vfnmsubps512_maskz): Likewise. * testsuite/gcc.target/i386/sse-13.c (__builtin_ia32_vfmsubpd512_mask): Likewise. (__builtin_ia32_vfmsubpd512_maskz): Likewise. (__builtin_ia32_vfmsubps512_mask): Likewise. (__builtin_ia32_vfmsubps512_maskz): Likewise. (__builtin_ia32_vfnmaddpd512_mask3): Likewise. (__builtin_ia32_vfnmaddpd512_maskz): Likewise. (__builtin_ia32_vfnmaddps512_mask3): Likewise. (__builtin_ia32_vfnmaddps512_maskz): Likewise. (__builtin_ia32_vfnmsubpd512_maskz): Likewise. (__builtin_ia32_vfnmsubps512_maskz): Likewise. * testsuite/gcc.target/i386/sse-23.c (__builtin_ia32_vfmsubpd512_mask): Likewise. (__builtin_ia32_vfmsubpd512_maskz): Likewise. (__builtin_ia32_vfmsubps512_mask): Likewise. (__builtin_ia32_vfmsubps512_maskz): Likewise. (__builtin_ia32_vfnmaddpd512_mask3): Likewise. (__builtin_ia32_vfnmaddpd512_maskz): Likewise. (__builtin_ia32_vfnmaddps512_mask3): Likewise. (__builtin_ia32_vfnmaddps512_maskz): Likewise. (__builtin_ia32_vfnmsubpd512_maskz): Likewise. (__builtin_ia32_vfnmsubps512_maskz): Likewise. From-SVN: r265359
H.J. Lu committed
Many AVX512 vector operations can broadcast from a scalar memory source. This patch enables memory broadcast for FNMSUB operations. In order to support AVX512 memory broadcast for FNMSUB, FNMSUB builtin functions are also added, instead of passing the negated value to FMA builtin functions. gcc/ PR target/72782 * config/i386/avx512fintrin.h (_mm512_fnmsub_round_pd): Use __builtin_ia32_vfnmsubpd512_mask. (_mm512_mask_fnmsub_round_pd): Likewise. (_mm512_fnmsub_pd): Likewise. (_mm512_mask_fnmsub_pd): Likewise. (_mm512_maskz_fnmsub_round_pd): Use __builtin_ia32_vfnmsubpd512_maskz. (_mm512_maskz_fnmsub_pd): Likewise. (_mm512_fnmsub_round_ps): Use __builtin_ia32_vfnmsubps512_mask. (_mm512_mask_fnmsub_round_ps): Likewise. (_mm512_fnmsub_ps): Likewise. (_mm512_mask_fnmsub_ps): Likewise. (_mm512_maskz_fnmsub_round_ps): Use __builtin_ia32_vfnmsubps512_maskz. (_mm512_maskz_fnmsub_ps): Likewise. * config/i386/avx512vlintrin.h (_mm256_mask_fnmsub_pd): Use __builtin_ia32_vfnmsubpd256_mask. (_mm256_maskz_fnmsub_pd): Use __builtin_ia32_vfnmsubpd256_maskz. (_mm_mask_fnmsub_pd): Use __builtin_ia32_vfmaddpd128_mask (_mm_maskz_fnmsub_pd): Use __builtin_ia32_vfnmsubpd128_maskz. (_mm256_mask_fnmsub_ps): Use __builtin_ia32_vfnmsubps256_mask. (_mm256_mask_fnmsub_ps): Use __builtin_ia32_vfnmsubps256_mask. (_mm256_maskz_fnmsub_ps): Use __builtin_ia32_vfnmsubps256_maskz. (_mm_mask_fnmsub_ps): Use __builtin_ia32_vfnmsubps128_mask. (_mm_maskz_fnmsub_ps): Use __builtin_ia32_vfnmsubps128_maskz. * config/i386/fmaintrin.h (_mm_fnmsub_pd): Use __builtin_ia32_vfnmsubpd. (_mm256_fnmsub_pd): Use __builtin_ia32_vfnmsubpd256. (_mm_fnmsub_ps): Use __builtin_ia32_vfnmsubps. (_mm256_fnmsub_ps): Use __builtin_ia32_vfnmsubps256. (_mm_fnmsub_sd): Use __builtin_ia32_vfnmsubsd3. (_mm_fnmsub_ss): Use __builtin_ia32_vfnmsubss3. 
* config/i386/i386-builtin.def: Add __builtin_ia32_vfnmsubpd256_mask, __builtin_ia32_vfnmsubpd256_maskz, __builtin_ia32_vfnmsubpd128_mask, __builtin_ia32_vfnmsubpd128_maskz, __builtin_ia32_vfnmsubps256_mask, __builtin_ia32_vfnmsubps256_maskz, __builtin_ia32_vfnmsubps128_mask, __builtin_ia32_vfnmsubps128_maskz, __builtin_ia32_vfnmsubpd512_mask, __builtin_ia32_vfnmsubpd512_maskz, __builtin_ia32_vfnmsubps512_mask, __builtin_ia32_vfnmsubps512_maskz, __builtin_ia32_vfnmsubss3, __builtin_ia32_vfnmsubsd3, __builtin_ia32_vfnmsubps, __builtin_ia32_vfnmsubpd, __builtin_ia32_vfnmsubps256 and __builtin_ia32_vfnmsubpd256. * config/i386/sse.md (fma4i_fnmsub_<mode>): New. (<avx512>_fnmsub_<mode>_maskz<round_expand_name>): Likewise. (*<sd_mask_codefor>fma_fnmsub_<mode><sd_maskz_name>_bcst_1): Likewise. (*<sd_mask_codefor>fma_fnmsub_<mode><sd_maskz_name>_bcst_2): Likewise. (*<sd_mask_codefor>fma_fnmsub_<mode><sd_maskz_name>_bcst_3): Likewise. (fmai_vmfnmsub_<mode><round_name>): Likewise. gcc/testsuite/ PR target/72782 * gcc.target/i386/avx512f-fnmsub-df-zmm-1.c: New test. * gcc.target/i386/avx512f-fnmsub-sf-zmm-1.c: Likewise. * gcc.target/i386/avx512f-fnmsub-sf-zmm-2.c: Likewise. * gcc.target/i386/avx512f-fnmsub-sf-zmm-3.c: Likewise. * gcc.target/i386/avx512f-fnmsub-sf-zmm-4.c: Likewise. * gcc.target/i386/avx512f-fnmsub-sf-zmm-5.c: Likewise. * gcc.target/i386/avx512f-fnmsub-sf-zmm-6.c: Likewise. * gcc.target/i386/avx512f-fnmsub-sf-zmm-7.c: Likewise. * gcc.target/i386/avx512f-fnmsub-sf-zmm-8.c: Likewise. * gcc.target/i386/avx512vl-fnmsub-sf-xmm-1.c: Likewise. * gcc.target/i386/avx512vl-fnmsub-sf-ymm-1.c: Likewise. From-SVN: r265358
H.J. Lu committed
Many AVX512 vector operations can broadcast from a scalar memory source. This patch enables memory broadcast for FNMADD operations. In order to support AVX512 memory broadcast for FNMADD, FNMADD builtin functions are also added, instead of passing the negated value to FMA builtin functions. gcc/ PR target/72782 * config/i386/avx512fintrin.h (_mm512_fnmadd_round_pd): Use __builtin_ia32_vfnmaddpd512_mask. (_mm512_mask_fnmadd_round_pd): Likewise. (_mm512_fnmadd_pd): Likewise. (_mm512_mask_fnmadd_pd): Likewise. (_mm512_maskz_fnmadd_round_pd): Use __builtin_ia32_vfnmaddpd512_maskz. (_mm512_maskz_fnmadd_pd): Likewise. (_mm512_fnmadd_round_ps): Use __builtin_ia32_vfnmaddps512_mask. (_mm512_mask_fnmadd_round_ps): Likewise. (_mm512_fnmadd_ps): Likewise. (_mm512_mask_fnmadd_ps): Likewise. (_mm512_maskz_fnmadd_round_ps): Use __builtin_ia32_vfnmaddps512_maskz. (_mm512_maskz_fnmadd_ps): Likewise. * config/i386/avx512vlintrin.h (_mm256_mask_fnmadd_pd): Use __builtin_ia32_vfnmaddpd256_mask. (_mm256_maskz_fnmadd_pd): Use __builtin_ia32_vfnmaddpd256_maskz. (_mm_mask_fnmadd_pd): Use __builtin_ia32_vfmaddpd128_mask (_mm_maskz_fnmadd_pd): Use __builtin_ia32_vfnmaddpd128_maskz. (_mm256_mask_fnmadd_ps): Use __builtin_ia32_vfnmaddps256_mask. (_mm256_mask_fnmadd_ps): Use __builtin_ia32_vfnmaddps256_mask. (_mm256_maskz_fnmadd_ps): Use __builtin_ia32_vfnmaddps256_maskz. (_mm_mask_fnmadd_ps): Use __builtin_ia32_vfnmaddps128_mask. (_mm_maskz_fnmadd_ps): Use __builtin_ia32_vfnmaddps128_maskz. * config/i386/fmaintrin.h (_mm_fnmadd_pd): Use __builtin_ia32_vfnmaddpd. (_mm256_fnmadd_pd): Use __builtin_ia32_vfnmaddpd256. (_mm_fnmadd_ps): Use __builtin_ia32_vfnmaddps. (_mm256_fnmadd_ps): Use __builtin_ia32_vfnmaddps256. (_mm_fnmadd_sd): Use __builtin_ia32_vfnmaddsd3. (_mm_fnmadd_ss): Use __builtin_ia32_vfnmaddss3. 
* config/i386/i386-builtin.def: Add __builtin_ia32_vfnmaddpd256_mask, __builtin_ia32_vfnmaddpd256_maskz, __builtin_ia32_vfnmaddpd128_mask, __builtin_ia32_vfnmaddpd128_maskz, __builtin_ia32_vfnmaddps256_mask, __builtin_ia32_vfnmaddps256_maskz, __builtin_ia32_vfnmaddps128_mask, __builtin_ia32_vfnmaddps128_maskz, __builtin_ia32_vfnmaddpd512_mask, __builtin_ia32_vfnmaddpd512_maskz, __builtin_ia32_vfnmaddps512_mask, __builtin_ia32_vfnmaddps512_maskz, __builtin_ia32_vfnmaddss3, __builtin_ia32_vfnmaddsd3, __builtin_ia32_vfnmaddps, __builtin_ia32_vfnmaddpd, __builtin_ia32_vfnmaddps256 and __builtin_ia32_vfnmaddpd256. * config/i386/sse.md (fma4i_fnmadd_<mode>): New. (<avx512>_fnmadd_<mode>_maskz<round_expand_name>): Likewise. (*<sd_mask_codefor>fma_fnmadd_<mode><sd_maskz_name>_bcst_1): Likewise. (*<sd_mask_codefor>fma_fnmadd_<mode><sd_maskz_name>_bcst_2): Likewise. (*<sd_mask_codefor>fma_fnmadd_<mode><sd_maskz_name>_bcst_3): Likewise. (fmai_vmfnmadd_<mode><round_name>): Likewise. gcc/testsuite/ PR target/72782 * gcc.target/i386/avx512f-fnmadd-df-zmm-1.c: New test. * gcc.target/i386/avx512f-fnmadd-sf-zmm-1.c: Likewise. * gcc.target/i386/avx512f-fnmadd-sf-zmm-2.c: Likewise. * gcc.target/i386/avx512f-fnmadd-sf-zmm-3.c: Likewise. * gcc.target/i386/avx512f-fnmadd-sf-zmm-4.c: Likewise. * gcc.target/i386/avx512f-fnmadd-sf-zmm-5.c: Likewise. * gcc.target/i386/avx512f-fnmadd-sf-zmm-6.c: Likewise. * gcc.target/i386/avx512f-fnmadd-sf-zmm-7.c: Likewise. * gcc.target/i386/avx512f-fnmadd-sf-zmm-8.c: Likewise. * gcc.target/i386/avx512vl-fnmadd-sf-xmm-1.c: Likewise. * gcc.target/i386/avx512vl-fnmadd-sf-ymm-1.c: Likewise. From-SVN: r265357
H.J. Lu committed
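For reference, the sign conventions of the FNMADD and FNMSUB operations exposed by the new builtins in these two patches can be modeled in scalar C (a sketch only; the real builtins operate on vectors and fuse the multiply-add into a single rounding step, which this model does not):

```c
#include <assert.h>

/* FNMADD computes -(a * b) + c; FNMSUB computes -(a * b) - c.  These
   models show the sign conventions that distinguish them from FMADD
   and FMSUB, which is why dedicated builtins (rather than negating an
   input of the FMA builtins) are needed to keep the memory operand
   eligible for embedded broadcast.  */
static double
fnmadd (double a, double b, double c)
{
  return -(a * b) + c;
}

static double
fnmsub (double a, double b, double c)
{
  return -(a * b) - c;
}
```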