Commits · 00e4d22dc1e8224e090be6dac970fe583e8f39ac · lvzhengyang / riscv-gcc-1

12 Jan, 2018 20 commits

Fix integer overflow in stats of trees. · 00e4d22d

2018-01-12  Martin Liska  <mliska@suse.cz>

	* tree-core.h: Use uint64_t instead of int.
	* tree.c (tree_node_counts): Likewise.
	(tree_node_sizes): Likewise.
	(dump_tree_statistics): Use PRIu64 in printf format.

From-SVN: r256583

committed Jan 12, 2018
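
A minimal sketch of the idea behind this change, assuming nothing about the real layout of `tree_node_counts` or `tree_node_sizes`: a byte counter that can exceed the range of `int` is kept in a `uint64_t` and printed with `PRIu64`. The names `node_bytes`, `record_alloc` and `format_count` are hypothetical and do not appear in tree.c.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for one statistics slot; not GCC's actual table. */
static uint64_t node_bytes;

/* Accumulate an allocation size into the 64-bit counter. */
static void record_alloc(uint64_t size)
{
    /* Unsigned wrap-around would show up as a smaller value. */
    assert(node_bytes + size >= node_bytes);
    node_bytes += size;
}

/* Format the counter with PRIu64 so the conversion matches uint64_t
   regardless of how wide long or long long is on the host. */
static int format_count(char *buf, size_t len)
{
    return snprintf(buf, len, "%" PRIu64, node_bytes);
}
```

Two allocations of 3000000000 bytes each already exceed what a 32-bit `int` can hold, while the 64-bit counter records 6000000000 without wrapping.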


Fix --enable-gather-detailed-mem-stats build. · b27b31dc

2018-01-12  Martin Liska  <mliska@suse.cz>

	* Makefile.in: As qsort_chk is implemented in vec.c, add
	vec.o to linkage of gencfn-macros.
	* tree.c (build_new_poly_int_cst): Add CXX_MEM_STAT_INFO as it's
	passing the info to record_node_allocation_statistics.
	(test_vector_cst_patterns): Add CXX_MEM_STAT_INFO to declaration
	and pass the info.
	* ggc-common.c (struct ggc_usage): Add operator== and use
	it in operator< and compare function.
	* mem-stats.h (struct mem_usage): Likewise.
	* vec.c (struct vec_usage): Remove operator< and compare
	function. Can be simply inherited.

From-SVN: r256582

committed Jan 12, 2018


Deferring FMA transformations in tight loops · 4a0d0ed2

2018-01-12  Martin Jambor  <mjambor@suse.cz>

	PR target/81616
	* params.def: New parameter PARAM_AVOID_FMA_MAX_BITS.
	* tree-ssa-math-opts.c: Include domwalk.h.
	(convert_mult_to_fma_1): New function.
	(fma_transformation_info): New type.
	(fma_deferring_state): Likewise.
	(cancel_fma_deferring): New function.
	(result_of_phi): Likewise.
	(last_fma_candidate_feeds_initial_phi): Likewise.
	(convert_mult_to_fma): Added deferring logic, split actual
	transformation to convert_mult_to_fma_1.
	(math_opts_dom_walker): New type.
	(math_opts_dom_walker::after_dom_children): New method, body moved
	here from pass_optimize_widening_mul::execute, added deferring logic
	bits.
	(pass_optimize_widening_mul::execute): Moved most of code to
	math_opts_dom_walker::after_dom_children.
	* config/i386/x86-tune.def (X86_TUNE_AVOID_128FMA_CHAINS): New.
	* config/i386/i386.c (ix86_option_override_internal): Added
	maybe_setting of PARAM_AVOID_FMA_MAX_BITS.

From-SVN: r256581

committed Jan 12, 2018


re PR debug/83157 (gcc.dg/guality/pr41616-1.c fail, inline instances refer to… · 80c93fa9

re PR debug/83157 (gcc.dg/guality/pr41616-1.c fail, inline instances refer to concrete instance as abstract origin)

2018-01-12  Richard Biener  <rguenther@suse.de>

	PR debug/83157
	* dwarf2out.c (gen_variable_die): Do not reset old_die for
	inline instance vars.

From-SVN: r256580

committed Jan 12, 2018


re PR target/81819 ([RX] internal compiler error: in… · ec952125

re PR target/81819 ([RX] internal compiler error: in rx_is_restricted_memory_address, at config/rx/rx.c:311)

gcc/
	PR target/81819
	* config/rx/rx.c (rx_is_restricted_memory_address):
	Handle SUBREG case.

From-SVN: r256578

committed Jan 12, 2018


rs6000: Tune new testcase (PR83629) · eda03189

It has some problems running on some 64-bit configurations, and the
bug it is testing for is only on 32-bit; so let's not run it elsewhere.


gcc/testsuite/
	PR target/83629
	* gcc.target/powerpc/pr83629.c: Require ilp32.

From-SVN: r256577

committed Jan 12, 2018


re PR target/80846 (auto-vectorized AVX2 horizontal sum should narrow to 128b… · c803b2a9

re PR target/80846 (auto-vectorized AVX2 horizontal sum should narrow to 128b right away, to be more efficient for Ryzen and Intel)

2018-01-12  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/80846
	* target.def (split_reduction): New target hook.
	* targhooks.c (default_split_reduction): New function.
	* targhooks.h (default_split_reduction): Declare.
	* tree-vect-loop.c (vect_create_epilog_for_reduction): If the
	target requests, first reduce vectors by combining low and high
	parts.
	* tree-vect-stmts.c (vect_gen_perm_mask_any): Adjust.
	(get_vectype_for_scalar_type_and_size): Export.
	* tree-vectorizer.h (get_vectype_for_scalar_type_and_size): Declare.

	* doc/tm.texi.in (TARGET_VECTORIZE_SPLIT_REDUCTION): Document.
	* doc/tm.texi: Regenerate.

	i386/
	* config/i386/i386.c (ix86_split_reduction): Implement
	TARGET_VECTORIZE_SPLIT_REDUCTION.

	* gcc.target/i386/pr80846-1.c: New testcase.
	* gcc.target/i386/pr80846-2.c: Likewise.

From-SVN: r256576

committed Jan 12, 2018
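
A rough scalar model of "first reduce vectors by combining low and high parts", assuming an 8-lane integer vector represented as a plain array. This is not the vectorizer's code; `LANES` and `reduce_split` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define LANES 8

/* Sum all lanes by first folding the high half onto the low half, then
   finishing the reduction on the half-width vector. */
static int reduce_split(const int v[LANES])
{
    int half[LANES / 2];

    assert(v != NULL);

    /* One step from full width to half width: low part + high part. */
    for (size_t i = 0; i < LANES / 2; i++)
        half[i] = v[i] + v[i + LANES / 2];

    /* Remaining halving steps, all performed at half width or narrower. */
    for (size_t width = LANES / 4; width >= 1; width /= 2)
        for (size_t i = 0; i < width; i++)
            half[i] += half[i + width];

    return half[0];
}
```

After the first fold, every remaining step works on the narrower vector, which is the property the PR title asks for on targets where 128-bit operations are cheaper than 256-bit ones.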


re PR target/83368 (alloca after setjmp breaks PIC base reg) · 46336a0e

	PR target/83368
	* config/sparc/sparc.h (PIC_OFFSET_TABLE_REGNUM): Set to INVALID_REGNUM
	in PIC mode except for TARGET_VXWORKS_RTP.
	* config/sparc/sparc.c: Include cfgrtl.h.
	(TARGET_INIT_PIC_REG): Define.
	(TARGET_USE_PSEUDO_PIC_REG): Likewise.
	(sparc_pic_register_p): New predicate.
	(sparc_legitimate_address_p): Use it.
	(sparc_legitimize_pic_address): Likewise.
	(sparc_delegitimize_address): Likewise.
	(sparc_mode_dependent_address_p): Likewise.
	(gen_load_pcrel_sym): Remove 4th parameter.
	(load_got_register): Adjust call to above.  Remove obsolete stuff.
	(sparc_expand_prologue): Do not call load_got_register here.
	(sparc_flat_expand_prologue): Likewise.
	(sparc_output_mi_thunk): Set the pic_offset_table_rtx object.
	(sparc_use_pseudo_pic_reg): New function.
	(sparc_init_pic_reg): Likewise.
	* config/sparc/sparc.md (vxworks_load_got): Set the GOT register.
	(builtin_setjmp_receiver): Enable only for TARGET_VXWORKS_RTP.

From-SVN: r256575

committed Jan 12, 2018


Add doc for branch_cost effective target. · 7dbf8707

2018-01-12  Christophe Lyon  <christophe.lyon@linaro.org>

	gcc/
	* doc/sourcebuild.texi (Effective-Target Keywords, Other attributes):
	Add item for branch_cost.

From-SVN: r256574

committed Jan 12, 2018


re PR rtl-optimization/83565 (RTL combine pass yields wrong rotate result) · 371ae937

	PR rtl-optimization/83565
	* rtlanal.c (nonzero_bits1): On WORD_REGISTER_OPERATIONS machines, do
	not extend the result to a larger mode for rotate operations.
	(num_sign_bit_copies1): Likewise.

From-SVN: r256572

committed Jan 12, 2018
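
A small illustration of why a rotate is special here, written as plain C rather than RTL: rotating a value within a narrow width does not give the same low bits as rotating it within a wider word, so facts derived for the word-wide operation cannot simply be carried over. The helper names are hypothetical, and the widths 8 and 32 are chosen only for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate left within 8 bits. */
static uint8_t rotl8(uint8_t x, unsigned n)
{
    n &= 7;
    return (uint8_t)((x << n) | (x >> ((8 - n) & 7)));
}

/* Rotate left within 32 bits. */
static uint32_t rotl32(uint32_t x, unsigned n)
{
    n &= 31;
    return (x << n) | (x >> ((32 - n) & 31));
}

/* The same input and rotate count give different low bytes depending on
   the width in which the rotate is carried out. */
static void rotate_modes_differ(void)
{
    assert(rotl8(0x81, 1) == 0x03);            /* top bit wraps to bit 0 */
    assert((rotl32(0x81, 1) & 0xff) == 0x02);  /* top bit moves to bit 8 */
}
```

For additions or logical operations the low bits of the wide result match the narrow result; for rotates they do not, which is the case the ChangeLog entry excludes.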


Add dg-require-effective-target indirect_jumps for g++ · c574147e

2018-01-12  Tom de Vries  <tom@codesourcery.com>

	* g++.dg/ext/label13.C: Add dg-require-effective-target indirect_jumps.
	* g++.dg/ext/label13a.C: Same.
	* g++.dg/ext/label14.C: Same.
	* g++.dg/ext/label2.C: Same.
	* g++.dg/ext/label3.C: Same.
	* g++.dg/torture/pr42462.C: Same.
	* g++.dg/torture/pr42739.C: Same.
	* g++.dg/warn/Wunused-label-3.C: Same.

From-SVN: r256571

committed Jan 12, 2018


Add dg-require-effective-target alloca for c++ test-cases · 41287945

2018-01-12  Tom de Vries  <tom@codesourcery.com>

	* c-c++-common/dwarf2/vla1.c: Add dg-require-effective-target alloca.
	* g++.dg/Walloca1.C: Same.
	* g++.dg/cpp0x/pr70338.C: Same.
	* g++.dg/cpp1y/lambda-generic-vla1.C: Same.
	* g++.dg/cpp1y/vla10.C: Same.
	* g++.dg/cpp1y/vla2.C: Same.
	* g++.dg/cpp1y/vla6.C: Same.
	* g++.dg/cpp1y/vla8.C: Same.
	* g++.dg/debug/debug5.C: Same.
	* g++.dg/debug/debug6.C: Same.
	* g++.dg/debug/pr54828.C: Same.
	* g++.dg/diagnostic/pr70105.C: Same.
	* g++.dg/eh/cleanup5.C: Same.
	* g++.dg/eh/spbp.C: Same.
	* g++.dg/ext/tmplattr9.C: Same.
	* g++.dg/ext/vla10.C: Same.
	* g++.dg/ext/vla11.C: Same.
	* g++.dg/ext/vla12.C: Same.
	* g++.dg/ext/vla15.C: Same.
	* g++.dg/ext/vla16.C: Same.
	* g++.dg/ext/vla17.C: Same.
	* g++.dg/ext/vla3.C: Same.
	* g++.dg/ext/vla6.C: Same.
	* g++.dg/ext/vla7.C: Same.
	* g++.dg/init/array24.C: Same.
	* g++.dg/init/new47.C: Same.
	* g++.dg/init/pr55497.C: Same.
	* g++.dg/opt/pr78201.C: Same.
	* g++.dg/template/vla2.C: Same.
	* g++.dg/torture/Wsizeof-pointer-memaccess1.C: Same.
	* g++.dg/torture/Wsizeof-pointer-memaccess2.C: Same.
	* g++.dg/torture/pr62127.C: Same.
	* g++.dg/torture/pr67055.C: Same.
	* g++.dg/torture/stackalign/eh-alloca-1.C: Same.
	* g++.dg/torture/stackalign/eh-inline-2.C: Same.
	* g++.dg/torture/stackalign/eh-vararg-1.C: Same.
	* g++.dg/torture/stackalign/eh-vararg-2.C: Same.
	* g++.dg/warn/Wplacement-new-size-5.C: Same.
	* g++.dg/warn/Wsizeof-pointer-memaccess-1.C: Same.
	* g++.dg/warn/Wvla-1.C: Same.
	* g++.dg/warn/Wvla-3.C: Same.
	* g++.old-deja/g++.ext/array2.C: Same.
	* g++.old-deja/g++.ext/constructor.C: Same.
	* g++.old-deja/g++.law/builtin1.C: Same.
	* g++.old-deja/g++.other/crash12.C: Same.
	* g++.old-deja/g++.other/eh3.C: Same.
	* g++.old-deja/g++.pt/array6.C: Same.
	* g++.old-deja/g++.pt/dynarray.C: Same.

From-SVN: r256570

committed Jan 12, 2018


Fix g++.dg/cpp0x/inh-ctor30.C · 01da712b
```
	* g++.dg/cpp0x/inh-ctor30.C: Allow for alternate mangled form.

From-SVN: r256569
```
Rainer Orth committed Jan 12, 2018

Link with correct values-*.o files on Solaris (PR target/40411) · c969e34e

	gcc/testsuite:
	PR libfortran/67412
	* gfortran.dg/execute_command_line_2.f90: Remove dg-xfail-run-if
	on *-*-solaris2.10.

	libstdc++-v3:
	PR libstdc++/64054
	* testsuite/27_io/basic_ostream/inserters_arithmetic/char/hexfloat.cc:
	Remove dg-xfail-run-if.

	gcc:
	PR target/40411
	* config/sol2.h (STARTFILE_ARCH_SPEC): Don't use with -shared or
	-symbolic.
	Use values-Xc.o for -pedantic.
	Link with values-xpg4.o for C90, values-xpg6.o otherwise.

From-SVN: r256568

committed Jan 12, 2018


Include all x86 targets in branch_cost effective target · a7448bdf
```
	* lib/target-supports.exp (check_effective_target_branch_cost):
	Accept all x86 targets.

From-SVN: r256567
```
Rainer Orth committed Jan 12, 2018

Initialize type_warnings::dyn_count with a default value (PR ipa/83054). · 53b73588

2018-01-12  Martin Liska  <mliska@suse.cz>

	PR ipa/83054
	* ipa-devirt.c (final_warning_record::grow_type_warnings):
	New function.
	(possible_polymorphic_call_targets): Use it.
	(ipa_devirt): Likewise.

2018-01-12  Martin Liska  <mliska@suse.cz>

	PR ipa/83054
	* g++.dg/warn/pr83054.C: New test.

From-SVN: r256566

committed Jan 12, 2018


Add new verification for profile-count.h. · aae9da9b

2018-01-12  Martin Liska  <mliska@suse.cz>

	* profile-count.h (enum profile_quality): Use 0 as invalid
	enum value of profile_quality.

From-SVN: r256565

committed Jan 12, 2018
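
A sketch of why reserving 0 as the invalid enumerator helps verification, under the assumption that the goal is to make zero-filled or never-initialized objects detectable. The enumerator names, `struct count`, `count_make` and `count_verify` are hypothetical and are not taken from profile-count.h.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical quality levels; only the "0 is invalid" idea comes from
   the ChangeLog entry. */
enum quality {
    QUALITY_UNINITIALIZED = 0,  /* what zero-filled memory decodes to */
    QUALITY_GUESSED,
    QUALITY_PRECISE
};

struct count {
    unsigned long value;
    enum quality quality;
};

/* Verification hook: a count is usable only once a real quality is set. */
static int count_verify(const struct count *c)
{
    return c->quality != QUALITY_UNINITIALIZED;
}

/* Constructor that refuses to produce an invalid object. */
static struct count count_make(unsigned long value, enum quality q)
{
    assert(q != QUALITY_UNINITIALIZED);
    struct count c = { value, q };
    return c;
}
```

With this layout, an object cleared with `memset` fails `count_verify`, whereas an object built through `count_make` passes.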


Add new NDS32 options -mext-perf, -mext-perf2 and -mext-string in the documentation. · b710b08a
```
gcc/
	* doc/invoke.texi (NDS32 Options): Add -mext-perf, -mext-perf2 and
	-mext-string options.

From-SVN: r256564
```
Chung-Ju Wu committed Jan 12, 2018

lto-streamer-out.c (DFS::DFS_write_tree_body): Process DECL_DEBUG_EXPR… · c1a7ca7c

lto-streamer-out.c (DFS::DFS_write_tree_body): Process DECL_DEBUG_EXPR conditional on DECL_HAS_DEBUG_EXPR_P.

2018-01-12  Richard Biener  <rguenther@suse.de>

	* lto-streamer-out.c (DFS::DFS_write_tree_body): Process
	DECL_DEBUG_EXPR conditional on DECL_HAS_DEBUG_EXPR_P.
	* tree-streamer-in.c (lto_input_ts_decl_common_tree_pointers):
	Likewise.
	* tree-streamer-out.c (write_ts_decl_common_tree_pointers): Likewise.

From-SVN: r256563

committed Jan 12, 2018


Daily bump. · 7b2ce347
```
From-SVN: r256561
```
GCC Administrator committed Jan 12, 2018

11 Jan, 2018 20 commits

configure.ac (--with-long-double-format): Add support for the configuration… · 8c7a27d5

configure.ac (--with-long-double-format): Add support for the configuration option to change the default long double...

2018-01-11  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* configure.ac (--with-long-double-format): Add support for the
	configuration option to change the default long double format on
	PowerPC systems.
	* config.gcc (powerpc*-linux*-*): Likewise.
	* configure: Regenerate.
	* config/rs6000/rs6000-c.c (rs6000_cpu_cpp_builtins): If long
	double is IEEE, define __KC__ and __KF__ to allow floatn.h to be
	used without modification.

From-SVN: r256558

committed Jan 11, 2018


rs6000-builtin.def (BU_P7_MISC_X): New #define. · 02a03501

[gcc]

2018-01-11  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	* config/rs6000/rs6000-builtin.def (BU_P7_MISC_X): New #define.
	(SPEC_BARRIER): New instantiation of BU_P7_MISC_X.
	* config/rs6000/rs6000.c (rs6000_expand_builtin): Handle
	MISC_BUILTIN_SPEC_BARRIER.
	(rs6000_init_builtins): Likewise.
	* config/rs6000/rs6000.md (UNSPECV_SPEC_BARRIER): New UNSPECV
	enum value.
	(speculation_barrier): New define_insn.
	* doc/extend.texi: Document __builtin_speculation_barrier.

[gcc/testsuite]

2018-01-11  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	* gcc.target/powerpc/spec-barr-1.c: New file.

From-SVN: r256557

committed Jan 11, 2018


re PR target/83203 (Inefficient int to avx2 vector conversion) · 1ad6e904

	PR target/83203
	* config/i386/i386.c (ix86_expand_vector_init_one_nonzero): If one_var
	is 0, for V{8,16}S[IF] and V[48]D[IF]mode use gen_vec_set<mode>_0.
	* config/i386/sse.md (VI8_AVX_AVX512F, VI4F_256_512): New mode
	iterators.
	(ssescalarmodesuffix): Add 512-bit vectors.  Use "d" or "q" for
	integral modes instead of "ss" and "sd".
	(vec_set<mode>_0): New define_insns for 256-bit and 512-bit
	vectors with 32-bit and 64-bit elements.
	(vecdupssescalarmodesuffix): New mode attribute.
	(vec_dup<mode>): Use it.

From-SVN: r256556

committed Jan 11, 2018


i386: Align stack frame if argument is passed on stack · c7a61831

When a function call is removed, the caller may become a leaf function.  But if
an argument may be passed on the stack, we need to align the stack frame when
there is no tail call.

Tested on Linux/i686 and Linux/x86-64.

gcc/

	PR target/83330
	* config/i386/i386.c (ix86_compute_frame_layout): Align stack
	frame if argument is passed on stack.

gcc/testsuite/

	PR target/83330
	* gcc.target/i386/pr83330.c: New test.

From-SVN: r256555

committed Jan 11, 2018


re PR fortran/79383 (USE statement error) · 278e902c

2018-01-11  Steven G. Kargl <kargl@gcc.gnu.org>

	PR fortran/79383
	* gfortran.dg/dtio_31.f03: New test.
	* gfortran.dg/dtio_32.f03: New test.

From-SVN: r256554

committed Jan 11, 2018


re PR go/83794 (misc/cgo/test uses gigabytes of memory) · fbea3c33

	PR go/83794
    misc/cgo/test: avoid endless loop when we can't parse notes
    
    Reviewed-on: https://go-review.googlesource.com/87416

From-SVN: r256553

committed Jan 11, 2018


Add some reproducers for issues found developing the location-wrappers patch · c5269263

gcc/testsuite/ChangeLog:
	PR c++/43486
	* g++.dg/wrappers: New subdirectory.
	* g++.dg/wrappers/README: New file.
	* g++.dg/wrappers/alloc.C: New test case.
	* g++.dg/wrappers/cow-istream-string.C: New test case.
	* g++.dg/wrappers/cp-stdlib.C: New test case.
	* g++.dg/wrappers/sanitizer_coverage_libcdep_new.C: New test case.
	* g++.dg/wrappers/wrapper-around-type-pack-expansion.C: New test
	case.

From-SVN: r256552

committed Jan 11, 2018


re PR target/82682 (FAIL: gcc.target/i386/pr50038.c scan-assembler-times movzbl… · e2c0d088

re PR target/82682 (FAIL: gcc.target/i386/pr50038.c scan-assembler-times movzbl 2 (found 3 times) since r253958)

	PR target/82682
	* ree.c (combine_reaching_defs): Optimize also
	reg2=exp; reg1=reg2; reg2=any_extend(reg1); into
	reg2=any_extend(exp); reg1=reg2;, formatting fix.

From-SVN: r256551

committed Jan 11, 2018


PR c++/82728 - wrong -Wunused-but-set-variable · 03943bbd

	PR c++/82799
	PR c++/83690
	* call.c (perform_implicit_conversion_flags): Call mark_rvalue_use.
	* decl.c (case_conversion): Likewise.
	* semantics.c (finish_static_assert): Call
	perform_implicit_conversion_flags.

From-SVN: r256550

committed Jan 11, 2018


re PR tree-optimization/83189 (internal compiler error: in probability_in, at profile-count.h:1050) · c2893c6e
```
	PR middle-end/83189
	* gimple-ssa-isolate-paths.c (isolate_path): Fix profile update.

From-SVN: r256545
```
Jan Hubicka committed Jan 11, 2018

re PR middle-end/83718 (ICE: Floating point exception in profile_count::apply_scale) · 0526ed2a

	PR middle-end/83718
	* tree-inline.c (copy_cfg_body): Adjust num&den for scaling
	after they are computed.
	* g++.dg/torture/pr83718.C: New testcase.

From-SVN: r256544

committed Jan 11, 2018


[C++ PATCH] kill unused enum · 2a3af45c

https://gcc.gnu.org/ml/gcc-patches/2018-01/msg00923.html
	* method.c (enum mangling_flags): Delete long-dead enum.

From-SVN: r256543

committed Jan 11, 2018


re PR ipa/83178 (g++.dg/ipa/devirt-22.C fail) · 346ac3a8
```
	PR ipa/83178
	* g++.dg/ipa/devirt-22.C: Adjust scan-dump-times count.

From-SVN: r256542
```
Martin Jambor committed Jan 11, 2018

re PR tree-optimization/83695 (ICE on valid code at -O3: Segmentation fault) · 4e090bcc

	PR tree-optimization/83695
	* gimple-loop-linterchange.cc
	(tree_loop_interchange::interchange_loops): Call scev_reset_htab to
	reset cached scev information after interchange.
	(pass_linterchange::execute): Remove call to scev_reset_htab.

	gcc/testsuite
	PR tree-optimization/83695
	* gcc.dg/tree-ssa/pr83695.c: New test.

From-SVN: r256541

committed Jan 11, 2018


[arm][3/3] Implement fp16fml lane intrinsics · eccf4d70

This patch implements the lane-wise fp16fml intrinsics.
There are quite a few of them, so I've split them out from
the other, simpler fp16fml intrinsics.

These ones expose instructions such as

vfmal.f16 Dd, Sn, Sm[<index>]  0 <= index <= 1
vfmal.f16 Qd, Dn, Dm[<index>]  0 <= index <= 3
vfmsl.f16 Dd, Sn, Sm[<index>]  0 <= index <= 1
vfmsl.f16 Qd, Dn, Dm[<index>]  0 <= index <= 3

These instructions extract a single half-precision
floating-point value from one of the source regs
and perform a vfmal/vfmsl operation as per the
normal variant with that value.

The nuance here is that some of the intrinsics want
to do things like:

float32x2_t vfmlal_laneq_low_u32 (float32x2_t __r, float16x4_t __a, float16x8_t __b, const int __index)


where the float16x8_t value of '__b' is held in a Q
register, so we need to be a bit smart about finding
the right D or S sub-register and translating the
lane number to a lane in that sub-register, instead
of just passing the language-level const-int down to
the assembly instruction.

That's where most of the complexity of this patch comes from
but hopefully it's orthogonal enough to make sense.
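
The lane translation described above can be sketched in plain C as follows. This is an illustration of the arithmetic only, not the patch's code: `struct sub_lane` and `split_q_lane` are hypothetical names, and little-endian lane numbering is assumed (big-endian targets such as armeb number lanes differently, which this sketch ignores).

```c
#include <assert.h>

/* Hypothetical result type: which narrower sub-register of the Q register
   holds the requested lane, and the lane index inside that sub-register. */
struct sub_lane {
    unsigned subreg;
    unsigned lane;
};

/* Split a lane index into a Q register (8 half-precision lanes) into a
   sub-register number and a lane within it.  lanes_per_subreg is 2 when
   the instruction takes an S register (index 0..1) and 4 when it takes a
   D register (index 0..3). */
static struct sub_lane split_q_lane(unsigned index, unsigned lanes_per_subreg)
{
    assert(index < 8);
    assert(lanes_per_subreg == 2 || lanes_per_subreg == 4);

    struct sub_lane r = { index / lanes_per_subreg, index % lanes_per_subreg };
    return r;
}
```

For example, lane 5 of a `float16x8_t` lands in the third S sub-register at lane 1 for the `Sm[<index>]` form, and in the second D sub-register at lane 1 for the `Dm[<index>]` form.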

Bootstrapped and tested on arm-none-linux-gnueabihf as well as
armeb-none-eabi.

	* config/arm/arm_neon.h (vfmlal_lane_low_u32, vfmlal_lane_high_u32,
	vfmlalq_laneq_low_u32, vfmlalq_lane_low_u32, vfmlal_laneq_low_u32,
	vfmlalq_laneq_high_u32, vfmlalq_lane_high_u32, vfmlal_laneq_high_u32,
	vfmlsl_lane_low_u32, vfmlsl_lane_high_u32, vfmlslq_laneq_low_u32,
	vfmlslq_lane_low_u32, vfmlsl_laneq_low_u32, vfmlslq_laneq_high_u32,
	vfmlslq_lane_high_u32, vfmlsl_laneq_high_u32): Define.
	* config/arm/arm_neon_builtins.def (vfmal_lane_low,
	vfmal_lane_lowv4hf, vfmal_lane_lowv8hf, vfmal_lane_high,
	vfmal_lane_highv4hf, vfmal_lane_highv8hf, vfmsl_lane_low,
	vfmsl_lane_lowv4hf, vfmsl_lane_lowv8hf, vfmsl_lane_high,
	vfmsl_lane_highv4hf, vfmsl_lane_highv8hf): New sets of builtins.
	* config/arm/iterators.md (VFMLSEL2, vfmlsel2): New mode attributes.
	(V_lane_reg): Likewise.
	* config/arm/neon.md (neon_vfm<vfml_op>l_lane_<vfml_half><VCVTF:mode>):
	New define_expand.
	(neon_vfm<vfml_op>l_lane_<vfml_half><vfmlsel2><mode>): Likewise.
	(vfmal_lane_low<mode>_intrinsic,
	vfmal_lane_low<vfmlsel2><mode>_intrinsic,
	vfmal_lane_high<vfmlsel2><mode>_intrinsic,
	vfmal_lane_high<mode>_intrinsic, vfmsl_lane_low<mode>_intrinsic,
	vfmsl_lane_low<vfmlsel2><mode>_intrinsic,
	vfmsl_lane_high<vfmlsel2><mode>_intrinsic,
	vfmsl_lane_high<mode>_intrinsic): New define_insns.

	* gcc.target/arm/simd/fp16fml_lane_high.c: New test.
	* gcc.target/arm/simd/fp16fml_lane_low.c: New test.

From-SVN: r256540

committed Jan 11, 2018


[arm][2/3] Implement fp16fml extension for ARMv8.4-A · 06e95715

This patch adds the +fp16fml extension that enables some
half-precision floating-point Advanced SIMD instructions,
available through arm_neon.h intrinsics.

This extension is on by default for armv8.4-a
if fp16 is available, so it can be enabled by -march=armv8.4-a+fp16.

fp16fml is also available for armv8.2-a and armv8.3-a through the
+fp16fml option that is added for these architectures.

The new instructions that this patch adds support for are:
vfmal.f16 Dr, Sm, Sn
vfmal.f16 Qr, Dm, Dn
vfmsl.f16 Dr, Sm, Sn
vfmsl.f16 Qr, Dm, Dn

They interpret their input registers as a vector of half-precision
floating-point values, extend them to single-precision vectors
and perform a fused multiply-add or subtract of them with the
destination vector.
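
A scalar model of that behaviour for the D-register form, offered as a sketch rather than as the instruction's definition. `float` stands in for half precision, so the inputs are assumed to be values representable in half precision; under that assumption each product is exact in single precision and a plain multiply followed by an add rounds the same way a fused operation would. The name `fml_model` and its `high` and `subtract` flags are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Accumulate (or subtract) the products of one half of a and b into r.
   a and b model 4 half-precision lanes; r models 2 single-precision
   lanes.  high selects lanes 2..3 instead of lanes 0..1. */
static void fml_model(float r[2], const float a[4], const float b[4],
                      int high, int subtract)
{
    assert(r != NULL && a != NULL && b != NULL);

    int base = high ? 2 : 0;
    for (int i = 0; i < 2; i++) {
        float product = a[base + i] * b[base + i];
        r[i] += subtract ? -product : product;
    }
}
```

With `a = {1, 2, 3, 4}`, `b = {5, 6, 7, 8}` and `r = {1, 1}`, the low-half accumulate form leaves `r = {6, 13}`.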

This patch exposes these instructions through arm_neon.h intrinsics.
The set of intrinsics allows us to do stuff such as perform
the multiply-add/subtract operation on the low or top half of
float16x4_t and float16x8_t values.  This maps naturally in aarch64
to the FMLAL and FMLAL2 instructions but on arm we have to use the
fact that consecutive NEON registers overlap the wider register
(i.e. d0 is s0 plus s1, q0 is d0 plus d1 etc). This just means
we have to be careful to use the right subreg operand print code.

New arm-specific builtins are defined to expand to the new patterns.
I've managed to compress the define_expands using code, mode and int
iterators but the define_insns don't compress very well without two-tiered
iterators (iterator attributes expanding to iterators) which we
don't support.

Bootstrapped and tested on arm-none-linux-gnueabihf and also on
armeb-none-eabi.

	* config/arm/arm-cpus.in (fp16fml): New feature.
	(ALL_SIMD): Add fp16fml.
	(armv8.2-a): Add fp16fml as an option.
	(armv8.3-a): Likewise.
	(armv8.4-a): Add fp16fml as part of fp16.
	* config/arm/arm.h (TARGET_FP16FML): Define.
	* config/arm/arm-c.c (arm_cpu_builtins): Define __ARM_FEATURE_FP16_FML
	when appropriate.
	* config/arm/arm-modes.def (V2HF): Define.
	* config/arm/arm_neon.h (vfmlal_low_u32, vfmlsl_low_u32,
	vfmlal_high_u32, vfmlsl_high_u32, vfmlalq_low_u32,
	vfmlslq_low_u32, vfmlalq_high_u32, vfmlslq_high_u32): Define.
	* config/arm/arm_neon_builtins.def (vfmal_low, vfmal_high,
	vfmsl_low, vfmsl_high): New set of builtins.
	* config/arm/iterators.md (PLUSMINUS): New code iterator.
	(vfml_op): New code attribute.
	(VFMLHALVES): New int iterator.
	(VFML, VFMLSEL): New mode attributes.
	(V_reg): Define mapping for V2HF.
	(V_hi, V_lo): New mode attributes.
	(VF_constraint): Likewise.
	(vfml_half, vfml_half_selector): New int attributes.
	* config/arm/neon.md (neon_vfm<vfml_op>l_<vfml_half><mode>): New
	define_expand.
	(vfmal_low<mode>_intrinsic, vfmsl_high<mode>_intrinsic,
	vfmal_high<mode>_intrinsic, vfmsl_low<mode>_intrinsic):
	New define_insn.
	* config/arm/t-arm-elf (v8_fps): Add fp16fml.
	* config/arm/t-multilib (v8_2_a_simd_variants): Add fp16fml.
	* config/arm/unspecs.md (UNSPEC_VFML_LO, UNSPEC_VFML_HI): New unspecs.
	* doc/invoke.texi (ARM Options): Document fp16fml.  Update armv8.4-a
	documentation.
	* doc/sourcebuild.texi (arm_fp16fml_neon_ok, arm_fp16fml_neon):
	Document new effective target and option set.

	* gcc.target/arm/multilib.exp: Add combination tests for fp16fml.
	* gcc.target/arm/simd/fp16fml_high.c: New test.
	* gcc.target/arm/simd/fp16fml_low.c: Likewise.
	* lib/target-supports.exp
	(check_effective_target_arm_fp16fml_neon_ok_nocache,
	check_effective_target_arm_fp16fml_neon_ok,
	add_options_for_arm_fp16fml_neon): New procedures.

From-SVN: r256539

committed Jan 11, 2018


[arm][1/3] Add -march=armv8.4-a option · 946c6c45

This patch adds support for the Armv8.4-A architecture [1]
in the arm backend. This is done through the new
-march=armv8.4-a option.

With this patch armv8.4-a is recognised as an argument
and supports the extensions: simd, fp16, crypto, nocrypto,
nofp with the familiar meaning of these options.
Note that there is no dotprod option as in armv8.2-a and
armv8.3-a, because Dot Product support is mandatory in
Armv8.4-A when simd is available: using +simd (or fp16,
which enables +simd) implies +dotprod.

The various multilib selection makefile fragments are updated
too and the multilib.exp test gets a few armv8.4-a combination
tests.

Bootstrapped and tested on arm-none-linux-gnueabihf.

From-SVN: r256537

committed Jan 11, 2018


re PR target/81821 ([RX] xchg_mem<mode> uses wrong memory operand size) · 99eeb64c

gcc/
	PR target/81821
	* config/rx/rx.md (BW): New mode attribute.
	(sync_lock_test_and_setsi): Add mode suffix to insn output.

From-SVN: r256536

committed Jan 11, 2018


re PR tree-optimization/83435 (ICE in set_value_range, at tree-vrp.c:211) · b0bd3e52

2018-01-11  Richard Biener  <rguenther@suse.de>

	PR tree-optimization/83435
	* graphite.c (canonicalize_loop_form): Ignore fake loop exit edges.
	* graphite-scop-detection.c (scop_detection::get_sese): Likewise.
	* tree-vrp.c (add_assert_info): Drop TREE_OVERFLOW if they appear.

	* gcc.dg/graphite/pr83435.c: New testcase.

From-SVN: r256535

committed Jan 11, 2018


[AArch64] Add const_offset field to aarch64_address_info · dc640181

This patch records the integer value of the address offset in
aarch64_address_info, so that it doesn't need to be re-extracted
from the rtx.  The SVE port will make more use of this.  The patch
also uses poly_int64 routines to manipulate the offset, rather than
just handling CONST_INTs.

2018-01-11  Richard Sandiford  <richard.sandiford@linaro.org>
	    Alan Hayward  <alan.hayward@arm.com>
	    David Sherwood  <david.sherwood@arm.com>

gcc/
	* config/aarch64/aarch64.c (aarch64_address_info): Add a const_offset
	field.
	(aarch64_classify_address): Initialize it.  Track polynomial offsets.
	(aarch64_print_address_internal): Use it to check for a zero offset.

Co-Authored-By: Alan Hayward <alan.hayward@arm.com>
Co-Authored-By: David Sherwood <david.sherwood@arm.com>

From-SVN: r256534

committed Jan 11, 2018
