1. 12 Jan, 2018 20 commits
    • Fix integer overflow in stats of trees. · 00e4d22d
      2018-01-12  Martin Liska  <mliska@suse.cz>
      
      	* tree-core.h: Use uint64_t instead of int.
      	* tree.c (tree_node_counts): Likewise.
      	(tree_node_sizes): Likewise.
      	(dump_tree_statistics): Use PRIu64 in printf format.
      
      From-SVN: r256583
      Martin Liska committed
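
The overflow fix above can be illustrated with a minimal sketch. This is not the code from tree.c: the counter arrays, the `record_node` helper and the row format are hypothetical stand-ins. It only shows the general pattern of widening statistics counters to `uint64_t` and printing them with the `PRIu64` macro from `<inttypes.h>`.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical node kinds; GCC's real tree_node_kind enum is larger.  */
enum { KIND_DECL, KIND_TYPE, KIND_EXPR, KIND_COUNT };

static const char *const kind_names[KIND_COUNT] = { "decls", "types", "exprs" };

/* 64-bit counters: a total above INT_MAX no longer wraps around.  */
static uint64_t node_counts[KIND_COUNT];
static uint64_t node_sizes[KIND_COUNT];

/* Record one allocation of SIZE bytes for nodes of kind KIND.  */
static void
record_node (int kind, uint64_t size)
{
  node_counts[kind]++;
  node_sizes[kind] += size;
}

/* Format one statistics row into BUF.  PRIu64 expands to the correct
   printf conversion for uint64_t on the host.  Returns the value
   returned by snprintf.  */
static int
format_row (char *buf, size_t len, int kind)
{
  return snprintf (buf, len, "%-10s %" PRIu64 " %" PRIu64,
                   kind_names[kind], node_counts[kind], node_sizes[kind]);
}
```

With two recorded allocations of 3,000,000,000 bytes each, the size total is 6,000,000,000, which does not fit in a 32-bit `int` but is stored and printed correctly here.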
    • Fix --enable-gather-detailed-mem-stats build. · b27b31dc
      2018-01-12  Martin Liska  <mliska@suse.cz>
      
      	* Makefile.in: As qsort_chk is implemented in vec.c, add
      	vec.o to linkage of gencfn-macros.
      	* tree.c (build_new_poly_int_cst): Add CXX_MEM_STAT_INFO as it's
      	passing the info to record_node_allocation_statistics.
      	(test_vector_cst_patterns): Add CXX_MEM_STAT_INFO to declaration
      	and pass the info.
      	* ggc-common.c (struct ggc_usage): Add operator== and use
      	it in operator< and compare function.
      	* mem-stats.h (struct mem_usage): Likewise.
      	* vec.c (struct vec_usage): Remove operator< and compare
      	function. Can be simply inherited.
      
      From-SVN: r256582
      Martin Liska committed
    • Deferring FMA transformations in tight loops · 4a0d0ed2
      2018-01-12  Martin Jambor  <mjambor@suse.cz>
      
      	PR target/81616
      	* params.def: New parameter PARAM_AVOID_FMA_MAX_BITS.
      	* tree-ssa-math-opts.c: Include domwalk.h.
      	(convert_mult_to_fma_1): New function.
      	(fma_transformation_info): New type.
      	(fma_deferring_state): Likewise.
      	(cancel_fma_deferring): New function.
      	(result_of_phi): Likewise.
      	(last_fma_candidate_feeds_initial_phi): Likewise.
      	(convert_mult_to_fma): Added deferring logic, split actual
      	transformation to convert_mult_to_fma_1.
      	(math_opts_dom_walker): New type.
      	(math_opts_dom_walker::after_dom_children): New method, body moved
      	here from pass_optimize_widening_mul::execute, added deferring logic
      	bits.
      	(pass_optimize_widening_mul::execute): Moved most of code to
      	math_opts_dom_walker::after_dom_children.
      	* config/i386/x86-tune.def (X86_TUNE_AVOID_128FMA_CHAINS): New.
      	* config/i386/i386.c (ix86_option_override_internal): Added
      	maybe_setting of PARAM_AVOID_FMA_MAX_BITS.
      
      From-SVN: r256581
      Martin Jambor committed
    • re PR debug/83157 (gcc.dg/guality/pr41616-1.c fail, inline instances refer to concrete instance as abstract origin) · 80c93fa9
      
      2018-01-12  Richard Biener  <rguenther@suse.de>
      
      	PR debug/83157
      	* dwarf2out.c (gen_variable_die): Do not reset old_die for
      	inline instance vars.
      
      From-SVN: r256580
      Richard Biener committed
    • re PR target/81819 ([RX] internal compiler error: in rx_is_restricted_memory_address, at config/rx/rx.c:311) · ec952125
      
      gcc/
      	PR target/81819
      	* config/rx/rx.c (rx_is_restricted_memory_address):
      	Handle SUBREG case.
      
      From-SVN: r256578
      Oleg Endo committed
    • rs6000: Tune new testcase (PR83629) · eda03189
      It has some problems running on some 64-bit configurations, and the
      bug it is testing for is only on 32-bit; so let's not run it elsewhere.
      
      
      gcc/testsuite/
      	PR target/83629
      	* gcc.target/powerpc/pr83629.c: Require ilp32.
      
      From-SVN: r256577
      Segher Boessenkool committed
    • re PR target/80846 (auto-vectorized AVX2 horizontal sum should narrow to 128b right away, to be more efficient for Ryzen and Intel) · c803b2a9
      
      2018-01-12  Richard Biener  <rguenther@suse.de>
      
      	PR tree-optimization/80846
      	* target.def (split_reduction): New target hook.
      	* targhooks.c (default_split_reduction): New function.
      	* targhooks.h (default_split_reduction): Declare.
      	* tree-vect-loop.c (vect_create_epilog_for_reduction): If the
      	target requests first reduce vectors by combining low and high
      	parts.
      	* tree-vect-stmts.c (vect_gen_perm_mask_any): Adjust.
      	(get_vectype_for_scalar_type_and_size): Export.
      	* tree-vectorizer.h (get_vectype_for_scalar_type_and_size): Declare.
      
      	* doc/tm.texi.in (TARGET_VECTORIZE_SPLIT_REDUCTION): Document.
      	* doc/tm.texi: Regenerate.
      
      	i386/
      	* config/i386/i386.c (ix86_split_reduction): Implement
      	TARGET_VECTORIZE_SPLIT_REDUCTION.
      
      	* gcc.target/i386/pr80846-1.c: New testcase.
      	* gcc.target/i386/pr80846-2.c: Likewise.
      
      From-SVN: r256576
      Richard Biener committed
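
The idea of first reducing a vector by combining its low and high parts can be modelled in scalar C. This is only a sketch of the arithmetic, assuming an 8-element float vector that is narrowed to 4 elements before the remaining reduction steps. The function names are hypothetical, and the code does not reproduce the target hook or the vectorizer changes themselves.

```c
#include <stddef.h>

/* Add the high half of V into the low half, in place.  HALF is the
   number of elements in each half.  This models one "split" step,
   e.g. combining the two 128-bit halves of a 256-bit vector.  */
static void
combine_halves (float *v, size_t half)
{
  for (size_t i = 0; i < half; i++)
    v[i] += v[i + half];
}

/* Sum the 8 floats in IN.  The first iteration narrows 8 elements to
   4 (the split step), the later iterations finish the reduction on
   the narrower vector: 4 -> 2 -> 1.  */
static float
reduce_add_8 (const float *in)
{
  float v[8];

  for (size_t i = 0; i < 8; i++)
    v[i] = in[i];

  for (size_t width = 8; width > 1; width /= 2)
    combine_halves (v, width / 2);

  return v[0];
}
```

Narrowing first means every later step operates on the smaller vector, which is the efficiency argument made in the PR title.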
    • re PR target/83368 (alloca after setjmp breaks PIC base reg) · 46336a0e
      	PR target/83368
      	* config/sparc/sparc.h (PIC_OFFSET_TABLE_REGNUM): Set to INVALID_REGNUM
      	in PIC mode except for TARGET_VXWORKS_RTP.
      	* config/sparc/sparc.c: Include cfgrtl.h.
      	(TARGET_INIT_PIC_REG): Define.
      	(TARGET_USE_PSEUDO_PIC_REG): Likewise.
      	(sparc_pic_register_p): New predicate.
      	(sparc_legitimate_address_p): Use it.
      	(sparc_legitimize_pic_address): Likewise.
      	(sparc_delegitimize_address): Likewise.
      	(sparc_mode_dependent_address_p): Likewise.
      	(gen_load_pcrel_sym): Remove 4th parameter.
      	(load_got_register): Adjust call to above.  Remove obsolete stuff.
      	(sparc_expand_prologue): Do not call load_got_register here.
      	(sparc_flat_expand_prologue): Likewise.
      	(sparc_output_mi_thunk): Set the pic_offset_table_rtx object.
      	(sparc_use_pseudo_pic_reg): New function.
      	(sparc_init_pic_reg): Likewise.
      	* config/sparc/sparc.md (vxworks_load_got): Set the GOT register.
      	(builtin_setjmp_receiver): Enable only for TARGET_VXWORKS_RTP.
      
      From-SVN: r256575
      Eric Botcazou committed
    • Add doc for branch_cost effective target. · 7dbf8707
      2018-01-12  Christophe Lyon  <christophe.lyon@linaro.org>
      
      	gcc/
      	* doc/sourcebuild.texi (Effective-Target Keywords, Other attributes):
      	Add item for branch_cost.
      
      From-SVN: r256574
      Christophe Lyon committed
    • re PR rtl-optimization/83565 (RTL combine pass yields wrong rotate result) · 371ae937
      	PR rtl-optimization/83565
      	* rtlanal.c (nonzero_bits1): On WORD_REGISTER_OPERATIONS machines, do
      	not extend the result to a larger mode for rotate operations.
      	(num_sign_bit_copies1): Likewise.
      
      From-SVN: r256572
      Eric Botcazou committed
    • Add dg-require-effective-target indirect_jumps for g++ · c574147e
      2018-01-12  Tom de Vries  <tom@codesourcery.com>
      
      	* g++.dg/ext/label13.C: Add dg-require-effective-target indirect_jumps.
      	* g++.dg/ext/label13a.C: Same.
      	* g++.dg/ext/label14.C: Same.
      	* g++.dg/ext/label2.C: Same.
      	* g++.dg/ext/label3.C: Same.
      	* g++.dg/torture/pr42462.C: Same.
      	* g++.dg/torture/pr42739.C: Same.
      	* g++.dg/warn/Wunused-label-3.C: Same.
      
      From-SVN: r256571
      Tom de Vries committed
    • Add dg-require-effective-target alloca for c++ test-cases · 41287945
      2018-01-12  Tom de Vries  <tom@codesourcery.com>
      
      	* c-c++-common/dwarf2/vla1.c: Add dg-require-effective-target alloca.
      	* g++.dg/Walloca1.C: Same.
      	* g++.dg/cpp0x/pr70338.C: Same.
      	* g++.dg/cpp1y/lambda-generic-vla1.C: Same.
      	* g++.dg/cpp1y/vla10.C: Same.
      	* g++.dg/cpp1y/vla2.C: Same.
      	* g++.dg/cpp1y/vla6.C: Same.
      	* g++.dg/cpp1y/vla8.C: Same.
      	* g++.dg/debug/debug5.C: Same.
      	* g++.dg/debug/debug6.C: Same.
      	* g++.dg/debug/pr54828.C: Same.
      	* g++.dg/diagnostic/pr70105.C: Same.
      	* g++.dg/eh/cleanup5.C: Same.
      	* g++.dg/eh/spbp.C: Same.
      	* g++.dg/ext/tmplattr9.C: Same.
      	* g++.dg/ext/vla10.C: Same.
      	* g++.dg/ext/vla11.C: Same.
      	* g++.dg/ext/vla12.C: Same.
      	* g++.dg/ext/vla15.C: Same.
      	* g++.dg/ext/vla16.C: Same.
      	* g++.dg/ext/vla17.C: Same.
      	* g++.dg/ext/vla3.C: Same.
      	* g++.dg/ext/vla6.C: Same.
      	* g++.dg/ext/vla7.C: Same.
      	* g++.dg/init/array24.C: Same.
      	* g++.dg/init/new47.C: Same.
      	* g++.dg/init/pr55497.C: Same.
      	* g++.dg/opt/pr78201.C: Same.
      	* g++.dg/template/vla2.C: Same.
      	* g++.dg/torture/Wsizeof-pointer-memaccess1.C: Same.
      	* g++.dg/torture/Wsizeof-pointer-memaccess2.C: Same.
      	* g++.dg/torture/pr62127.C: Same.
      	* g++.dg/torture/pr67055.C: Same.
      	* g++.dg/torture/stackalign/eh-alloca-1.C: Same.
      	* g++.dg/torture/stackalign/eh-inline-2.C: Same.
      	* g++.dg/torture/stackalign/eh-vararg-1.C: Same.
      	* g++.dg/torture/stackalign/eh-vararg-2.C: Same.
      	* g++.dg/warn/Wplacement-new-size-5.C: Same.
      	* g++.dg/warn/Wsizeof-pointer-memaccess-1.C: Same.
      	* g++.dg/warn/Wvla-1.C: Same.
      	* g++.dg/warn/Wvla-3.C: Same.
      	* g++.old-deja/g++.ext/array2.C: Same.
      	* g++.old-deja/g++.ext/constructor.C: Same.
      	* g++.old-deja/g++.law/builtin1.C: Same.
      	* g++.old-deja/g++.other/crash12.C: Same.
      	* g++.old-deja/g++.other/eh3.C: Same.
      	* g++.old-deja/g++.pt/array6.C: Same.
      	* g++.old-deja/g++.pt/dynarray.C: Same.
      
      From-SVN: r256570
      Tom de Vries committed
    • Fix g++.dg/cpp0x/inh-ctor30.C · 01da712b
      	* g++.dg/cpp0x/inh-ctor30.C: Allow for alternate mangled form.
      
      From-SVN: r256569
      Rainer Orth committed
    • Link with correct values-*.o files on Solaris (PR target/40411) · c969e34e
      	gcc/testsuite:
      	PR libfortran/67412
      	* gfortran.dg/execute_command_line_2.f90: Remove dg-xfail-run-if
      	on *-*-solaris2.10.
      
      	libstdc++-v3:
      	PR libstdc++/64054
      	* testsuite/27_io/basic_ostream/inserters_arithmetic/char/hexfloat.cc:
      	Remove dg-xfail-run-if.
      
      	gcc:
      	PR target/40411
      	* config/sol2.h (STARTFILE_ARCH_SPEC): Don't use with -shared or
      	-symbolic.
      	Use values-Xc.o for -pedantic.
      	Link with values-xpg4.o for C90, values-xpg6.o otherwise.
      
      From-SVN: r256568
      Rainer Orth committed
    • Include all x86 targets in branch_cost effective target · a7448bdf
      	* lib/target-supports.exp (check_effective_target_branch_cost):
      	Accept all x86 targets.
      
      From-SVN: r256567
      Rainer Orth committed
    • Initialize type_warnings::dyn_count with a default value (PR ipa/83054). · 53b73588
      2018-01-12  Martin Liska  <mliska@suse.cz>
      
      	PR ipa/83054
      	* ipa-devirt.c (final_warning_record::grow_type_warnings):
      	New function.
      	(possible_polymorphic_call_targets): Use it.
      	(ipa_devirt): Likewise.
      
      2018-01-12  Martin Liska  <mliska@suse.cz>
      
      	PR ipa/83054
      	* g++.dg/warn/pr83054.C: New test.
      
      From-SVN: r256566
      Martin Liska committed
    • Add new verification for profile-count.h. · aae9da9b
      2018-01-12  Martin Liska  <mliska@suse.cz>
      
      	* profile-count.h (enum profile_quality): Use 0 as invalid
      	enum value of profile_quality.
      
      From-SVN: r256565
      Martin Liska committed
    • Add new NDS32 options -mext-perf, -mext-perf2 and -mext-string in the documentation. · b710b08a
      gcc/
      	* doc/invoke.texi (NDS32 Options): Add -mext-perf, -mext-perf2 and
      	-mext-string options.
      
      From-SVN: r256564
      Chung-Ju Wu committed
    • lto-streamer-out.c (DFS::DFS_write_tree_body): Process DECL_DEBUG_EXPR conditional on DECL_HAS_DEBUG_EXPR_P. · c1a7ca7c
      
      2018-01-12  Richard Biener  <rguenther@suse.de>
      
      	* lto-streamer-out.c (DFS::DFS_write_tree_body): Process
      	DECL_DEBUG_EXPR conditional on DECL_HAS_DEBUG_EXPR_P.
      	* tree-streamer-in.c (lto_input_ts_decl_common_tree_pointers):
      	Likewise.
      	* tree-streamer-out.c (write_ts_decl_common_tree_pointers): Likewise.
      
      From-SVN: r256563
      Richard Biener committed
    • Daily bump. · 7b2ce347
      From-SVN: r256561
      GCC Administrator committed
  2. 11 Jan, 2018 20 commits
    • configure.ac (--with-long-double-format): Add support for the configuration option to change the default long double... · 8c7a27d5
      
      2018-01-11  Michael Meissner  <meissner@linux.vnet.ibm.com>
      
      	* configure.ac (--with-long-double-format): Add support for the
      	configuration option to change the default long double format on
      	PowerPC systems.
      	* config.gcc (powerpc*-linux*-*): Likewise.
      	* configure: Regenerate.
      	* config/rs6000/rs6000-c.c (rs6000_cpu_cpp_builtins): If long
      	double is IEEE, define __KC__ and __KF__ to allow floatn.h to be
      	used without modification.
      
      From-SVN: r256558
      Michael Meissner committed
    • rs6000-builtin.def (BU_P7_MISC_X): New #define. · 02a03501
      [gcc]
      
      2018-01-11  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>
      
      	* config/rs6000/rs6000-builtin.def (BU_P7_MISC_X): New #define.
      	(SPEC_BARRIER): New instantiation of BU_P7_MISC_X.
      	* config/rs6000/rs6000.c (rs6000_expand_builtin): Handle
      	MISC_BUILTIN_SPEC_BARRIER.
      	(rs6000_init_builtins): Likewise.
      	* config/rs6000/rs6000.md (UNSPECV_SPEC_BARRIER): New UNSPECV
      	enum value.
      	(speculation_barrier): New define_insn.
      	* doc/extend.texi: Document __builtin_speculation_barrier.
      
      [gcc/testsuite]
      
      2018-01-11  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>
      
      	* gcc.target/powerpc/spec-barr-1.c: New file.
      
      From-SVN: r256557
      Bill Schmidt committed
    • re PR target/83203 (Inefficient int to avx2 vector conversion) · 1ad6e904
      	PR target/83203
      	* config/i386/i386.c (ix86_expand_vector_init_one_nonzero): If one_var
      	is 0, for V{8,16}S[IF] and V[48]D[IF]mode use gen_vec_set<mode>_0.
      	* config/i386/sse.md (VI8_AVX_AVX512F, VI4F_256_512): New mode
      	iterators.
      	(ssescalarmodesuffix): Add 512-bit vectors.  Use "d" or "q" for
      	integral modes instead of "ss" and "sd".
      	(vec_set<mode>_0): New define_insns for 256-bit and 512-bit
      	vectors with 32-bit and 64-bit elements.
      	(vecdupssescalarmodesuffix): New mode attribute.
      	(vec_dup<mode>): Use it.
      
      From-SVN: r256556
      Jakub Jelinek committed
    • i386: Align stack frame if argument is passed on stack · c7a61831
      When a function call is removed, the caller may become a leaf function.
      But if an argument may be passed on the stack, we need to align the stack
      frame when there is no tail call.
      
      Tested on Linux/i686 and Linux/x86-64.
      
      gcc/
      
      	PR target/83330
      	* config/i386/i386.c (ix86_compute_frame_layout): Align stack
      	frame if argument is passed on stack.
      
      gcc/testsuite/
      
      	PR target/83330
      	* gcc.target/i386/pr83330.c: New test.
      
      From-SVN: r256555
      H.J. Lu committed
    • re PR fortran/79383 (USE statement error) · 278e902c
      2018-01-11  Steven G. Kargl <kargl@gcc.gnu.org>
      
      	PR fortran/79383
      	* gfortran.dg/dtio_31.f03: New test.
      	* gfortran.dg/dtio_32.f03: New test.
      
      From-SVN: r256554
      Steven G. Kargl committed
    • re PR go/83794 (misc/cgo/test uses gigabytes of memory) · fbea3c33
      	PR go/83794
          misc/cgo/test: avoid endless loop when we can't parse notes
          
          Reviewed-on: https://go-review.googlesource.com/87416
      
      From-SVN: r256553
      Ian Lance Taylor committed
    • Add some reproducers for issues found developing the location-wrappers patch · c5269263
      gcc/testsuite/ChangeLog:
      	PR c++/43486
      	* g++.dg/wrappers: New subdirectory.
      	* g++.dg/wrappers/README: New file.
      	* g++.dg/wrappers/alloc.C: New test case.
      	* g++.dg/wrappers/cow-istream-string.C: New test case.
      	* g++.dg/wrappers/cp-stdlib.C: New test case.
      	* g++.dg/wrappers/sanitizer_coverage_libcdep_new.C: New test case.
      	* g++.dg/wrappers/wrapper-around-type-pack-expansion.C: New test
      	case.
      
      From-SVN: r256552
      David Malcolm committed
    • re PR target/82682 (FAIL: gcc.target/i386/pr50038.c scan-assembler-times movzbl 2 (found 3 times) since r253958) · e2c0d088
      
      	PR target/82682
      	* ree.c (combine_reaching_defs): Optimize also
      	reg2=exp; reg1=reg2; reg2=any_extend(reg1); into
      	reg2=any_extend(exp); reg1=reg2;, formatting fix.
      
      From-SVN: r256551
      Jakub Jelinek committed
    • PR c++/82728 - wrong -Wunused-but-set-variable · 03943bbd
      	PR c++/82799
      	PR c++/83690
      	* call.c (perform_implicit_conversion_flags): Call mark_rvalue_use.
      	* decl.c (case_conversion): Likewise.
      	* semantics.c (finish_static_assert): Call
      	perform_implicit_conversion_flags.
      
      From-SVN: r256550
      Jason Merrill committed
    • re PR tree-optimization/83189 (internal compiler error: in probability_in, at profile-count.h:1050) · c2893c6e
      	PR middle-end/83189
      	* gimple-ssa-isolate-paths.c (isolate_path): Fix profile update.
      
      From-SVN: r256545
      Jan Hubicka committed
    • re PR middle-end/83718 (ICE: Floating point exception in profile_count::apply_scale) · 0526ed2a
      	PR middle-end/83718
      	* tree-inline.c (copy_cfg_body): Adjust num&den for scaling
      	after they are computed.
      	* g++.dg/torture/pr83718.C: New testcase.
      
      From-SVN: r256544
      Jan Hubicka committed
    • [C++ PATCH] kill unused enum · 2a3af45c
      https://gcc.gnu.org/ml/gcc-patches/2018-01/msg00923.html
      	* method.c (enum mangling_flags): Delete long-dead enum.
      
      From-SVN: r256543
      Nathan Sidwell committed
    • re PR ipa/83178 (g++.dg/ipa/devirt-22.C fail) · 346ac3a8
      	PR ipa/83178
      	* g++.dg/ipa/devirt-22.C: Adjust scan-dump-times count.
      
      From-SVN: r256542
      Martin Jambor committed
    • re PR tree-optimization/83695 (ICE on valid code at -O3: Segmentation fault) · 4e090bcc
      	PR tree-optimization/83695
      	* gimple-loop-linterchange.cc
      	(tree_loop_interchange::interchange_loops): Call scev_reset_htab to
      	reset cached scev information after interchange.
      	(pass_linterchange::execute): Remove call to scev_reset_htab.
      
      	gcc/testsuite
      	PR tree-optimization/83695
      	* gcc.dg/tree-ssa/pr83695.c: New test.
      
      From-SVN: r256541
      Bin Cheng committed
    • [arm][3/3] Implement fp16fml lane intrinsics · eccf4d70
      This patch implements the lane-wise fp16fml intrinsics.
      There are quite a few of them, so I've split them out from
      the other, simpler fp16fml intrinsics.
      
      These ones expose instructions such as
      
      vfmal.f16 Dd, Sn, Sm[<index>]  0 <= index <= 1
      vfmal.f16 Qd, Dn, Dm[<index>]  0 <= index <= 3
      vfmsl.f16 Dd, Sn, Sm[<index>]  0 <= index <= 1
      vfmsl.f16 Qd, Dn, Dm[<index>]  0 <= index <= 3
      
      These instructions extract a single half-precision
      floating-point value from one of the source regs
      and perform a vfmal/vfmsl operation as per the
      normal variant with that value.
      
      The nuance here is that some of the intrinsics want
      to do things like:
      
      float32x2_t vfmlal_laneq_low_u32 (float32x2_t __r, float16x4_t __a, float16x8_t __b, const int __index)
      
      
      where the float16x8_t value of '__b' is held in a Q
      register, so we need to be a bit smart about finding
      the right D or S sub-register and translating the
      lane number to a lane in that sub-register, instead
      of just passing the language-level const-int down to
      the assembly instruction.
      
      That's where most of the complexity of this patch comes from
      but hopefully it's orthogonal enough to make sense.
      
      Bootstrapped and tested on arm-none-linux-gnueabihf as well as
      armeb-none-eabi.
      
      	* config/arm/arm_neon.h (vfmlal_lane_low_u32, vfmlal_lane_high_u32,
      	vfmlalq_laneq_low_u32, vfmlalq_lane_low_u32, vfmlal_laneq_low_u32,
      	vfmlalq_laneq_high_u32, vfmlalq_lane_high_u32, vfmlal_laneq_high_u32,
      	vfmlsl_lane_low_u32, vfmlsl_lane_high_u32, vfmlslq_laneq_low_u32,
      	vfmlslq_lane_low_u32, vfmlsl_laneq_low_u32, vfmlslq_laneq_high_u32,
      	vfmlslq_lane_high_u32, vfmlsl_laneq_high_u32): Define.
      	* config/arm/arm_neon_builtins.def (vfmal_lane_low,
      	vfmal_lane_lowv4hf, vfmal_lane_lowv8hf, vfmal_lane_high,
      	vfmal_lane_highv4hf, vfmal_lane_highv8hf, vfmsl_lane_low,
      	vfmsl_lane_lowv4hf, vfmsl_lane_lowv8hf, vfmsl_lane_high,
      	vfmsl_lane_highv4hf, vfmsl_lane_highv8hf): New sets of builtins.
      	* config/arm/iterators.md (VFMLSEL2, vfmlsel2): New mode attributes.
      	(V_lane_reg): Likewise.
      	* config/arm/neon.md (neon_vfm<vfml_op>l_lane_<vfml_half><VCVTF:mode>):
      	New define_expand.
      	(neon_vfm<vfml_op>l_lane_<vfml_half><vfmlsel2><mode>): Likewise.
      	(vfmal_lane_low<mode>_intrinsic,
      	vfmal_lane_low<vfmlsel2><mode>_intrinsic,
      	vfmal_lane_high<vfmlsel2><mode>_intrinsic,
      	vfmal_lane_high<mode>_intrinsic, vfmsl_lane_low<mode>_intrinsic,
      	vfmsl_lane_low<vfmlsel2><mode>_intrinsic,
      	vfmsl_lane_high<vfmlsel2><mode>_intrinsic,
      	vfmsl_lane_high<mode>_intrinsic): New define_insns.
      
      	* gcc.target/arm/simd/fp16fml_lane_high.c: New test.
      	* gcc.target/arm/simd/fp16fml_lane_low.c: New test.
      
      From-SVN: r256540
      Kyrylo Tkachov committed
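
The lane translation described in the message above can be sketched as plain index arithmetic. This is a hedged illustration, not the code from neon.md: the struct and function names are hypothetical, lanes are assumed to be numbered in little-endian order, and registers are assumed to overlap as q<n> = d<2n> plus d<2n+1> and d<n> = s<2n> plus s<2n+1>. Big-endian lane numbering and any register-class restrictions on the scalar operand are ignored.

```c
/* A sub-register number together with a lane index inside it.  */
struct sub_lane
{
  unsigned reg;    /* number of the D or S sub-register */
  unsigned index;  /* half-precision lane within that sub-register */
};

/* Map lane LANE (0..7) of q<QREG> to an S register and a lane 0..1.
   Each S register holds two half-precision values, and each Q
   register covers four S registers under the assumed overlap.  */
static struct sub_lane
q_lane_to_s (unsigned qreg, unsigned lane)
{
  struct sub_lane r = { qreg * 4 + lane / 2, lane % 2 };
  return r;
}

/* Map lane LANE (0..7) of q<QREG> to a D register and a lane 0..3.
   Each D register holds four half-precision values.  */
static struct sub_lane
q_lane_to_d (unsigned qreg, unsigned lane)
{
  struct sub_lane r = { qreg * 2 + lane / 4, lane % 4 };
  return r;
}

/* Map lane LANE (0..3) of d<DREG> to an S register and a lane 0..1.  */
static struct sub_lane
d_lane_to_s (unsigned dreg, unsigned lane)
{
  struct sub_lane r = { dreg * 2 + lane / 2, lane % 2 };
  return r;
}
```

Under these assumptions, lane 5 of q1 lands in s6 at index 1, which is the kind of operand that a `Sm[<index>]` form with `0 <= index <= 1` can encode.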
    • [arm][2/3] Implement fp16fml extension for ARMv8.4-A · 06e95715
      This patch adds the +fp16fml extension that enables some
      half-precision floating-point Advanced SIMD instructions,
      available through arm_neon.h intrinsics.
      
      This extension is on by default for armv8.4-a
      if fp16 is available, so it can be enabled by -march=armv8.4-a+fp16.
      
      fp16fml is also available for armv8.2-a and armv8.3-a through the
      +fp16fml option that is added for these architectures.
      
      The new instructions that this patch adds support for are:
      vfmal.f16 Dr, Sm, Sn
      vfmal.f16 Qr, Dm, Dn
      vfmsl.f16 Dr, Sm, Sn
      vfmsl.f16 Qr, Dm, Dn
      
      They interpret their input registers as a vector of half-precision
      floating-point values, extend them to single-precision vectors
      and perform a fused multiply-add or subtract of them with the
      destination vector.
      
      This patch exposes these instructions through arm_neon.h intrinsics.
      The set of intrinsics allows us to do stuff such as perform
      the multiply-add/subtract operation on the low or top half of
      float16x4_t and float16x8_t values.  This maps naturally in aarch64
      to the FMLAL and FMLAL2 instructions but on arm we have to use the
      fact that consecutive NEON registers overlap the wider register
      (i.e. d0 is s0 plus s1, q0 is d0 plus d1 etc). This just means
      we have to be careful to use the right subreg operand print code.
      
      New arm-specific builtins are defined to expand to the new patterns.
      I've managed to compress the define_expands using code, mode and int
      iterators but the define_insns don't compress very well without two-tiered
      iterators (iterator attributes expanding to iterators) which we
      don't support.
      
      Bootstrapped and tested on arm-none-linux-gnueabihf and also on
      armeb-none-eabi.
      
      	* config/arm/arm-cpus.in (fp16fml): New feature.
      	(ALL_SIMD): Add fp16fml.
      	(armv8.2-a): Add fp16fml as an option.
      	(armv8.3-a): Likewise.
      	(armv8.4-a): Add fp16fml as part of fp16.
      	* config/arm/arm.h (TARGET_FP16FML): Define.
      	* config/arm/arm-c.c (arm_cpu_builtins): Define __ARM_FEATURE_FP16_FML
      	when appropriate.
      	* config/arm/arm-modes.def (V2HF): Define.
      	* config/arm/arm_neon.h (vfmlal_low_u32, vfmlsl_low_u32,
      	vfmlal_high_u32, vfmlsl_high_u32, vfmlalq_low_u32,
      	vfmlslq_low_u32, vfmlalq_high_u32, vfmlslq_high_u32): Define.
      	* config/arm/arm_neon_builtins.def (vfmal_low, vfmal_high,
      	vfmsl_low, vfmsl_high): New set of builtins.
      	* config/arm/iterators.md (PLUSMINUS): New code iterator.
      	(vfml_op): New code attribute.
      	(VFMLHALVES): New int iterator.
      	(VFML, VFMLSEL): New mode attributes.
      	(V_reg): Define mapping for V2HF.
      	(V_hi, V_lo): New mode attributes.
      	(VF_constraint): Likewise.
      	(vfml_half, vfml_half_selector): New int attributes.
      	* config/arm/neon.md (neon_vfm<vfml_op>l_<vfml_half><mode>): New
      	define_expand.
      	(vfmal_low<mode>_intrinsic, vfmsl_high<mode>_intrinsic,
      	vfmal_high<mode>_intrinsic, vfmsl_low<mode>_intrinsic):
      	New define_insn.
      	* config/arm/t-arm-elf (v8_fps): Add fp16fml.
      	* config/arm/t-multilib (v8_2_a_simd_variants): Add fp16fml.
      	* config/arm/unspecs.md (UNSPEC_VFML_LO, UNSPEC_VFML_HI): New unspecs.
      	* doc/invoke.texi (ARM Options): Document fp16fml.  Update armv8.4-a
      	documentation.
      	* doc/sourcebuild.texi (arm_fp16fml_neon_ok, arm_fp16fml_neon):
      	Document new effective target and option set.
      
      	* gcc.target/arm/multilib.exp: Add combination tests for fp16fml.
      	* gcc.target/arm/simd/fp16fml_high.c: New test.
      	* gcc.target/arm/simd/fp16fml_low.c: Likewise.
      	* lib/target-supports.exp
      	(check_effective_target_arm_fp16fml_neon_ok_nocache,
      	check_effective_target_arm_fp16fml_neon_ok,
      	add_options_for_arm_fp16fml_neon): New procedures.
      
      From-SVN: r256539
      Kyrylo Tkachov committed
    • [arm][1/3] Add -march=armv8.4-a option · 946c6c45
      This patch adds support for the Armv8.4-A architecture [1]
      in the arm backend. This is done through the new
      -march=armv8.4-a option.
      
      With this patch armv8.4-a is recognised as an argument
      and supports the extensions: simd, fp16, crypto, nocrypto,
      nofp with the familiar meaning of these options.
      Note that there is no dotprod option like in
      armv8.2-a and armv8.3-a because Dot Product support is
      mandatory in Armv8.4-A when simd is available, so when using
      +simd (or fp16, which enables +simd), +dotprod is implied.
      
      The various multilib selection makefile fragments are updated
      too and the multilib.exp test gets a few armv8.4-a combination
      tests.
      
      Bootstrapped and tested on arm-none-linux-gnueabihf.
      
      From-SVN: r256537
      Kyrylo Tkachov committed
    • re PR target/81821 ([RX] xchg_mem<mode> uses wrong memory operand size) · 99eeb64c
      gcc/
      	PR target/81821
      	* config/rx/rx.md (BW): New mode attribute.
      	(sync_lock_test_and_setsi): Add mode suffix to insn output.
      
      From-SVN: r256536
      Oleg Endo committed
    • re PR tree-optimization/83435 (ICE in set_value_range, at tree-vrp.c:211) · b0bd3e52
      2018-01-11  Richard Biener  <rguenther@suse.de>
      
      	PR tree-optimization/83435
      	* graphite.c (canonicalize_loop_form): Ignore fake loop exit edges.
      	* graphite-scop-detection.c (scop_detection::get_sese): Likewise.
      	* tree-vrp.c (add_assert_info): Drop TREE_OVERFLOW if they appear.
      
      	* gcc.dg/graphite/pr83435.c: New testcase.
      
      From-SVN: r256535
      Richard Biener committed
    • [AArch64] Add const_offset field to aarch64_address_info · dc640181
      This patch records the integer value of the address offset in
      aarch64_address_info, so that it doesn't need to be re-extracted
      from the rtx.  The SVE port will make more use of this.  The patch
      also uses poly_int64 routines to manipulate the offset, rather than
      just handling CONST_INTs.
      
      2018-01-11  Richard Sandiford  <richard.sandiford@linaro.org>
      	    Alan Hayward  <alan.hayward@arm.com>
      	    David Sherwood  <david.sherwood@arm.com>
      
      gcc/
      	* config/aarch64/aarch64.c (aarch64_address_info): Add a const_offset
      	field.
      	(aarch64_classify_address): Initialize it.  Track polynomial offsets.
      	(aarch64_print_address_internal): Use it to check for a zero offset.
      
      Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
      Co-Authored-By: David Sherwood <david.sherwood@arm.com>
      
      From-SVN: r256534
      Richard Sandiford committed