- 06 Nov, 2012 11 commits
-
-
libada/ * Makefile.in (osconstool): Fix target. ada/ * gcc-interface/Makefile.in, gcc-interface/Make-lang.in: Remove duplicate rules handled by Make-generated.in. From-SVN: r193209
Arnaud Charlet committed -
* config/i386/i386.c (bdesc_args): Rename CODE_FOR_avx2_umulhrswv16hi3 to CODE_FOR_avx2_pmulhrswv16hi3. * config/i386/predicates.md (const1_operand): Extend for vectors. * config/i386/sse.md (ssse3_avx2): Extend. (ssedoublemode): Ditto. (<sse2_avx2>_uavg<mode>3): Merge avx2_uavgv32qi3, sse2_uavgv16qi3, avx2_uavgv16hi3 and sse2_uavgv8hi3 into one. (*<sse2_avx2>_uavg<mode>3): Merge *avx2_uavgv32qi3, *sse2_uavgv16qi3, *avx2_uavgv16hi3 and *sse2_uavgv8hi3 into one. (PMULHRSW): New. (<ssse3_avx2>_pmulhrsw<mode>3): Merge avx2_umulhrswv16hi3, ssse3_pmulhrswv8hi3 and ssse3_pmulhrswv4hi3 into one. (*avx2_pmulhrswv16hi3): Replace const_vector with const1_operand predicate. (*ssse3_pmulhrswv8hi3): Ditto. (*ssse3_pmulhrswv4hi3): Ditto. From-SVN: r193208
Andrey Turetskiy committed -
From-SVN: r193207
Joern Rennecke committed -
From-SVN: r193206
Joern Rennecke committed -
* config/epiphany/epiphany.c (epiphany_address_cost): Use MODE parameter. From-SVN: r193205
Joern Rennecke committed -
========================
Sriraman Tallam, tmsriram@google.com

Overview of the patch which adds support to specify function versions. This is only enabled for target i386.

Example:

    int foo (); /* Default version */
    int foo () __attribute__ ((target("avx,popcnt"))); /* Specialized for avx and popcnt */
    int foo () __attribute__ ((target("arch=core2,ssse3"))); /* Specialized for core2 and ssse3 */

    int main ()
    {
      int (*p)() = &foo;
      return foo () + (*p)();
    }

    int foo () { return 0; }
    int __attribute__ ((target("avx,popcnt"))) foo () { return 0; }
    int __attribute__ ((target("arch=core2,ssse3"))) foo () { return 0; }

The above example has foo defined 3 times, but all 3 definitions of foo are different versions of the same function. The calls to foo in main, directly and via a pointer, are calls to the multi-versioned function foo, which is dispatched to the right foo at run-time.

Front-end changes:

The front-end changes are calls at appropriate places to target hooks that do the following:

* Determine if two function decls with the same signature are versions.
* Determine the assembler name of a function version.
* Generate the dispatcher function for a set of function versions.
* Compare versions to see if one has a higher priority over the other.

All the implementation happens in the target-specific config/i386/i386.c.

What does the patch do?

* Tracking decls that correspond to function versions of function name, say "foo": When the front-end sees more than one decl for "foo", it calls a target hook to determine if they are versions. To prevent duplicate definition errors with other versions of "foo", the "decls_match" function in cp/decl.c is made to return false when 2 decls are deemed versions by the target. This makes all function versions of "foo" be added to the overload list of "foo".

* Change the assembler names of the function versions. For i386, the target changes the assembler names of the function versions by suffixing the sorted list of args to "target" to the function name of "foo". For example, the assembler name of "void foo () __attribute__ ((target ("sse4")))" will become _Z3foov.sse4. The target hook mangle_decl_assembler_name is used for this.

* Overload resolution: Function "build_over_call" in cp/call.c sees a call to function "foo", which is multi-versioned. The overload resolution happens in function "joust" in "cp/call.c". Here, the call to "foo" has all possible versions of "foo" as candidates. All the candidates of "foo" are stored in the cgraph side data structure. Each version of foo is chained in a doubly-linked list with the default function as the first element. This allows any pass to access all the semantically identical versions. A call to a multi-versioned function will be replaced by a call to a dispatcher function, determined by a target hook, to execute the right function version at run-time.

Optimization to directly call a version when possible: Also, in joust, where overload resolution happens, a multiversioned function resolution is made to return the most specialized version. This is the version that will be checked for dispatching first and is determined by the target. Now, if the caller can inline this function version, then a direct call is made to this function version rather than going through the dispatcher. When a direct call cannot be made, a call to the dispatcher function is created.

* Creating the dispatcher body. The dispatcher body, called the resolver, is made only when there is a call to a multiversioned function dispatcher or the address of a function is taken. This is generated during cgraph_analyze_function. This is done by another target hook.

* Dispatch ordering. The order in which the function versions are checked during dispatch is based on a priority value assigned to the ISA that is catered for. More specialized versions are checked for dispatching first. This is to mitigate the ambiguity that can arise when more than one function version is valid for execution on a particular platform. This is not a perfect solution, and in future the user should be allowed to assign a dispatching priority value to each version.

Function MV in the Intel compiler:

The Intel compiler supports function multiversioning and the syntax is similar to the patch proposed here. Here is an example of how to generate multiple function versions with the Intel compiler.

    /* Create a stub function to specify the various versions of function that
       will be created, using declspec attribute cpu_dispatch. */
    __declspec (cpu_dispatch (core_i7_sse4_2, atom, generic)) void foo () {};

    /* Bodies of each function version. */

    /* Intel Corei7 processor + SSE4.2 version. */
    __declspec (cpu_specific(core_i7_sse4_2)) void foo () { printf ("corei7 + sse4.2"); }

    /* Atom processor. */
    __declspec (cpu_specific(atom)) void foo () { printf ("atom"); }

    /* The generic or the default version. */
    __declspec (cpu_specific(generic)) void foo () { printf ("This is generic"); }

A new function version is generated by defining a new function with the same signature but with a different cpu_specific declspec attribute string. The set of cpu_specific strings that are allowed is the following: "core_2nd_gen_avx" "core_aes_pclmulqdq" "core_i7_sse4_2" "core_2_duo_sse4_1" "core_2_duo_ssse3" "atom" "pentium_4_sse3" "pentium_4" "pentium_m" "pentium_iii" "generic"

Comparison with the GCC MV implementation in this patch:

* Version creation syntax: The implementation in this patch also has a similar syntax to specify function versions. The first stub function is not needed. Here is the code to generate the function versions with this patch:

    /* Intel Corei7 processor + SSE4.2 version. */
    __attribute__ ((target ("arch=corei7, sse4.2"))) void foo () { printf ("corei7 + sse4.2"); }

    /* Atom processor. */
    __attribute__ ((target ("arch=atom"))) void foo () { printf ("atom"); }

    void foo () { }

The target attribute can have one of the following arch names: "amd" "intel" "atom" "core2" "corei7" "nehalem" "westmere" "sandybridge" "amdfam10h" "barcelona" "shanghai" "istanbul" "amdfam15h" "bdver1" "bdver2"

and any number of the following ISA names: "cmov" "mmx" "popcnt" "sse" "sse2" "sse3" "ssse3" "sse4.1" "sse4.2" "avx" "avx2"

* doc/tm.texi.in (TARGET_OPTION_FUNCTION_VERSIONS): New hook description.
* (TARGET_COMPARE_VERSION_PRIORITY): New hook description.
* (TARGET_GET_FUNCTION_VERSIONS_DISPATCHER): New hook description.
* (TARGET_GENERATE_VERSION_DISPATCHER_BODY): New hook description.
* doc/tm.texi: Regenerate.
* target.def (compare_version_priority): New target hook.
* (generate_version_dispatcher_body): New target hook.
* (get_function_versions_dispatcher): New target hook.
* (function_versions): New target hook.
* cgraph.c (cgraph_fnver_htab): New htab.
(cgraph_fn_ver_htab_hash): New function.
(cgraph_fn_ver_htab_eq): New function.
(version_info_node): New pointer.
(insert_new_cgraph_node_version): New function.
(get_cgraph_node_version): New function.
(delete_function_version): New function.
(record_function_versions): New function.
* cgraph.h (cgraph_node): New bitfield dispatcher_function.
(cgraph_function_version_info): New struct.
(get_cgraph_node_version): New function.
(insert_new_cgraph_node_version): New function.
(record_function_versions): New function.
(delete_function_version): New function.
(init_lowered_empty_function): Expose function.
* tree.h (DECL_FUNCTION_VERSIONED): New macro.
(tree_function_decl): New bit-field versioned_function.
* cgraphunit.c (cgraph_analyze_function): Generate body of multiversion function dispatcher.
(cgraph_analyze_functions): Analyze dispatcher function.
(init_lowered_empty_function): Make non-static. New parameter in_ssa.
(assemble_thunk): Add parameter to call to init_lowered_empty_function.
* config/i386/i386.c (add_condition_to_bb): New function.
(get_builtin_code_for_version): New function.
(ix86_compare_version_priority): New function.
(feature_compare): New function.
(dispatch_function_versions): New function.
(ix86_function_versions): New function.
(attr_strcmp): New function.
(ix86_mangle_function_version_assembler_name): New function.
(ix86_mangle_decl_assembler_name): New function.
(make_name): New function.
(make_dispatcher_decl): New function.
(is_function_default_version): New function.
(ix86_get_function_versions_dispatcher): New function.
(make_attribute): New function.
(make_resolver_func): New function.
(ix86_generate_version_dispatcher_body): New function.
(fold_builtin_cpu): Return integer for cpu builtins.
(TARGET_MANGLE_DECL_ASSEMBLER_NAME): New macro.
(TARGET_COMPARE_VERSION_PRIORITY): New macro.
(TARGET_GENERATE_VERSION_DISPATCHER_BODY): New macro.
(TARGET_GET_FUNCTION_VERSIONS_DISPATCHER): New macro.
(TARGET_OPTION_FUNCTION_VERSIONS): New macro.
* class.c (add_method): Change assembler names of function versions.
(mark_versions_used): New static function.
(resolve_address_of_overloaded_function): Create dispatcher decl and return address of dispatcher instead.
* decl.c (decls_match): Make decls unmatched for versioned functions.
(duplicate_decls): Remove ambiguity for versioned functions. Delete versioned function data for merged decls.
* decl2.c (check_classfn): Check attributes of versioned functions for match.
* call.c (get_function_version_dispatcher): New function.
(mark_versions_used): New static function.
(build_over_call): Make calls to multiversioned functions to call the dispatcher.
(joust): For calls to multi-versioned functions, make the most specialized function version win.
* testsuite/g++.dg/mv1.C: New test.
* testsuite/g++.dg/mv2.C: New test.
* testsuite/g++.dg/mv3.C: New test.
* testsuite/g++.dg/mv4.C: New test.
* testsuite/g++.dg/mv5.C: New test.
* testsuite/g++.dg/mv6.C: New test.

From-SVN: r193204
Sriraman Tallam committed -
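The dispatch ordering described in the function multiversioning commit (r193204) can be modeled in portable C++ without any target attributes. The sketch below is only an illustration of "check the most specialized version first, fall back to the default"; the `Version` struct, the `dispatch` function, and the numeric priorities are hypothetical names invented here and are not part of GCC or of the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical model of one function version: a label, a priority
// (higher means more specialized), and a predicate that says whether
// the running platform supports it. None of these names come from GCC.
struct Version {
    std::string name;
    int priority;
    std::function<bool()> supported;
};

// Return the label of the version a dispatcher would pick: versions are
// tried from highest to lowest priority, and the default version is
// used when no specialized version is supported.
std::string dispatch(std::vector<Version> versions,
                     const std::string& default_name) {
    std::stable_sort(versions.begin(), versions.end(),
                     [](const Version& a, const Version& b) {
                         return a.priority > b.priority;
                     });
    for (const Version& v : versions) {
        assert(v.supported && "every version needs a support check");
        if (v.supported()) return v.name;
    }
    return default_name;
}
```

Calling `dispatch` with two supported versions returns the one with the higher priority, which mirrors how the priority value resolves the ambiguity when more than one version is valid on a platform. In the patch itself this decision is made by a target-generated resolver, not by a run-time sort.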
From-SVN: r193203
Joern Rennecke committed -
From-SVN: r193202
Jonathan Wakely committed -
From-SVN: r193201
GCC Administrator committed -
* config/i386/i386.c (print_reg): Replace REX_INT_REG_P with REX_INT_REGNO_P. From-SVN: r193197
H.J. Lu committed -
* include/profile/deque: Constrain InputIterator parameters. * include/profile/forward_list: Likewise. * include/profile/list: Likewise. * include/profile/map.h: Likewise. * include/profile/multimap.h: Likewise. * include/profile/set.h: Likewise. * include/profile/multiset.h: Likewise. * include/profile/vector: Likewise. From-SVN: r193196
Jonathan Wakely committed
-
- 05 Nov, 2012 29 commits
-
-
re PR testsuite/55186 (gcc.dg/const-uniq-1.c fails due to vector expected but not being in the constant pool) PR testsuite/55186 * gcc.dg/const-uniq-1.c (a): Increase length four times. From-SVN: r193194
Hans-Peter Nilsson committed -
* gcc.dg/torture/pr53922.c: Use -Wl,-undefined,dynamic_lookup on darwin. From-SVN: r193193
Jack Howarth committed -
PR tree-optimization/54986 * gimple-fold.c (canonicalize_constructor_val): Strip again all no-op conversions on entry but add them back on exit if needed. From-SVN: r193188
Eric Botcazou committed -
* gcc/final.c (final_scan_insn) [HAVE_cc0]: Handle all comparison codes in non-jump and cmove insn. * gcc/testsuite/gcc.dg/torture/fp-compare.c: New testcase. From-SVN: r193187
Andreas Schwab committed -
* include/profile/forward_list: Update to meet allocator-aware requirements. * include/debug/forward_list: Likewise. * include/debug/vector: Verify allocators are swapped or equal. * include/debug/macros.h (__glibcxx_check_equal_allocs): Define. * include/debug/formatter.h: Add new debug message. * src/c++11/debug.cc: Likewise. * testsuite/23_containers/forward_list/allocator/swap.cc: Do not swap containers with non-propagating, non-equal allocators. * testsuite/23_containers/vector/allocator/swap.cc: Likewise. From-SVN: r193186
Jonathan Wakely committed -
2012-11-05 Benjamin Kosnik <bkoz@redhat.com> Oleg Smolsky <oleg@smolsky.net> PR libstdc++/55028 * config/abi/pre/gnu-versioned-namespace.ver: Add symbols. * testsuite/23_containers/unordered_multimap/insert/55028-debug.cc: New. Co-Authored-By: Oleg Smolsky <oleg@smolsky.net> From-SVN: r193185
Benjamin Kosnik committed -
2012-10-05 François Dumont <fdumont@gcc.gnu.org> * include/ext/throw_allocator.h (__throw_value_base): Add move semantics, not throwing. (__throw_value_limit): Likewise. (__throw_value_random): Likewise. * testsuite/util/exception/safety.h: Add validation of C++11 methods emplace/emplace_front/emplace_back/emplace_hint. * testsuite/util/testsuite_container_traits.h: Signal emplace support on deque, forward_list, list and vector. * testsuite/23_containers/deque/requirements/exception/propagation_consistent.cc: Remove dg-do run fail. From-SVN: r193184
François Dumont committed -
* mode-switching.c (create_pre_exit): Force late switching if __builtin_{apply,return} emitted a load that requires a mode other than MODE_EXIT. Co-Authored-By: Vladimir Yakovlev <vladimir.b.yakovlev@intel.com> From-SVN: r193182
Uros Bizjak committed -
2012-11-05 Paolo Carlini <paolo.carlini@oracle.com> PR libstdc++/55215 * include/bits/random.tcc (mersenne_twister_engine<>::seed(_Sseq&)): Assign state_size to _M_p. * testsuite/26_numerics/random/mersenne_twister_engine/cons/55215.cc: New. * testsuite/26_numerics/random/independent_bits_engine/cons/55215.cc: Likewise. * testsuite/26_numerics/random/shuffle_order_engine/cons/55215.cc: Likewise. * testsuite/26_numerics/random/subtract_with_carry_engine/cons/55215.cc: Likewise. * testsuite/26_numerics/random/discard_block_engine/cons/55215.cc: Likewise. * testsuite/26_numerics/random/linear_congruential_engine/cons/55215.cc: Likewise. From-SVN: r193181
Paolo Carlini committed -
re PR target/55204 (ICE: in extract_insn, at recog.c:2140 (unrecognizable insn) with -O --param loop-invariant-max-bbs-in-loop=0) gcc/ PR target/55204 * config/i386/i386.c (ix86_address_subreg_operand): Remove stack pointer check. (print_reg): Use true_regnum rather than REGNO. (ix86_print_operand_address): Remove SUBREG handling. From-SVN: r193178
Richard Sandiford committed -
* gcc.dg/const-1.c: Update. * gcc.dg/pure-1.c: Update. * tree-ssa-loop-niter.c (finite_loop_p): Revamp to be just wrapper of max_loop_iterations. From-SVN: r193175
Jan Hubicka committed -
2012-11-03 Florian Weimer <fweimer@redhat.com> * libsupc++/vec.cc (compute_size): New. (__cxa_vec_new2, __cxa_vec_new3): Use it. * testsuite/18_support/cxa_vec.cc: New. From-SVN: r193174
Florian Weimer committed -
From-SVN: r193173
Ian Lance Taylor committed -
From-SVN: r193172
Ian Lance Taylor committed -
* reorg.c (fill_simple_delay_slots): Avoid calling optimize_skip with a return instruction. From-SVN: r193171
Joern Rennecke committed -
2012-11-05 Vladimir Makarov <vmakarov@redhat.com> PR rtl-optimization/55151 * lra-constraints.c (process_alt_operands): Permit putting reg value into memory. Increase reject for this case. 2012-11-05 Vladimir Makarov <vmakarov@redhat.com> PR rtl-optimization/55151 * gcc.dg/pr55151.c: New test. From-SVN: r193170
Vladimir Makarov committed -
2012-11-05 Dehao Chen <dehao@google.com> * final.c (reemit_insn_block_notes): Do not change scope if insn location is UNKNOWN_LOCATION. From-SVN: r193169
Dehao Chen committed -
* doc/md.texi (Defining Attributes): Document that we are defining HAVE_ATTR_name macros as 1 for defined attributes, and as 0 for undefined special attributes. * final.c (asm_insn_count, align_fuzz): Always define. (insn_current_reference_address): Likewise. (init_insn_lengths): Use if (HAVE_ATTR_length) instead of #ifdef HAVE_ATTR_length. (get_attr_length_1, shorten_branches, final): Likewise. (final_scan_insn, output_asm_name): Likewise. * genattr.c (gen_attr): Define HAVE_ATTR_name macros for defined attributes as 1. Remove ancient get_attr_alternative compatibility code. For special purpose attributes not provided, define HAVE_ATTR_name as 0. In case no length attribute is given, provide stub definitions for insn_*_length* functions, and also include insn-addr.h. In case no enabled attribute is given, provide stub definition. * genattrtab.c (write_length_unit_log): Always write a definition. * hooks.c (hook_int_rtx_1, hook_int_rtx_unreachable): New functions. * hooks.h (hook_int_rtx_1, hook_int_rtx_unreachable): Declare. * lra-int.h (struct lra_insn_recog_data): Make member alternative_enabled_p unconditional. * lra.c (free_insn_recog_data): Use if (HAVE_ATTR_length) instead of #ifdef HAVE_ATTR_length. (lra_set_insn_recog_data): Likewise. Make initialization of alternative_enabled_p unconditional. (lra_update_insn_recog_data): Use #if instead of #ifdef for HAVE_ATTR_enabled. * recog.c [!HAVE_ATTR_enabled] (get_attr_enabled): Don't define. (extract_insn): Check HAVE_ATTR_enabled. (gate_handle_split_before_regstack): Use #if instead of #if defined for HAVE_ATTR_length. From-SVN: r193168
Joern Rennecke committed -
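The HAVE_ATTR_name change in r193168 replaces `#ifdef` tests with ordinary `if` statements on a macro that is always defined as 0 or 1. A minimal sketch of that pattern follows; the macro value is set by hand here and `insn_length_stub` and `total_length` are hypothetical stand-ins, not the generated GCC functions.

```cpp
#include <cassert>

// Assumption for illustration: a generator always emits this macro,
// as 1 when the attribute exists and as 0 when it does not.
// Here it is defined by hand as 0.
#ifndef HAVE_ATTR_length
#define HAVE_ATTR_length 0
#endif

// Hypothetical stub standing in for a generated length query, so that
// code guarded by the macro still compiles when the attribute is absent.
inline int insn_length_stub(int /*insn*/) { return 4; }

// With "#ifdef HAVE_ATTR_length" the disabled branch is never parsed.
// With a plain "if", both branches are compiled and type-checked, and
// the constant condition lets the compiler drop the dead one.
inline int total_length(int n_insns) {
    int total = 0;
    if (HAVE_ATTR_length) {
        for (int i = 0; i < n_insns; ++i)
            total += insn_length_stub(i);
    }
    return total;
}
```

The trade-off is that every guarded call needs a definition even when the feature is off, which is why the commit also provides stub definitions for the length functions.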
PR debug/54970 PR debug/54971 * gcc.dg/guality/pr54970.c: Use NOP instead of "NOP" in inline-asm. From-SVN: r193162
Jakub Jelinek committed -
* ipa-inline.c (compute_uninlined_call_time, compute_inlined_call_time): New functions. (RELATIVE_TIME_BENEFIT_RANGE): New macro. (relative_time_benefit): Rewrite. (edge_badness): Rewrite path with guessed profile and estimated profile. * ipa-inline.h (INLINE_HINT_declared_inline, INLINE_HINT_cross_module): New hints. (struct inline_summary): Add GROWTH field. * ipa-inline-analysis.c (dump_inline_hints): Update. (reset_inline_summary): Update. (dump_inline_summary): Update. (will_be_nonconstant_predicate): Cleanup to use gimple_store_p and gimple_assign_load_p predicates. (estimate_node_size_and_time): Drop INLINE_HINT_declared_inline hint. (simple_edge_hints): New function. (do_estimate_edge_time): Return time of invocation of callee rather than the time scaled by edge frequency; update hints code. (do_estimate_edge_hints): Update. (do_estimate_growth): Cleanup. From-SVN: r193161
Jan Hubicka committed -
* tree-ssa-loop-niter.c (find_loop_niter): Remove just_once_each_iteration_p. (maybe_lower_iteration_bound): Initialize not_executed_last_iteration to NULL. * tree-ssa-loop-ivcanon.c (canonicalize_loop_induction_variables): Skip just_once_each_iteration_p; record estimated bound when loop has only one likely exit; test just_once_each_iteration_p before IV canon itself. From-SVN: r193159
Jan Hubicka committed -
PR target/55194 * dwarf2out.c (value_format) <case dw_val_class_high_pc>: Handle also DWARF2_ADDR_SIZE 1 and 2. From-SVN: r193158
Jakub Jelinek committed -
* ipa-inline.c (leaf_node_p): Rename to ... (num_calls) ... this one. (want_early_inline_function_p): Allow small growth on non-leafs. From-SVN: r193157
Jan Hubicka committed -
PR testsuite/51128 * gcc.dg/torture/pr55018.c: Skip if -fno-fat-lto-objects was passed. From-SVN: r193156
Uros Bizjak committed -
From-SVN: r193155
Jan Hubicka committed -
From-SVN: r193154
Uros Bizjak committed -
* gcc.dg/tree-ssa/cunroll-9.c: Dump cunrolli details. Fix scan-tree-dump-times directive. From-SVN: r193153
Uros Bizjak committed -
PR debug/54402 * var-tracking.c (fp_setter): Return false if there is REG_CFA_RESTORE hfp note. (vt_initialize): Look for fp_setter in any bb, not just successor of entry bb. From-SVN: r193152
Jakub Jelinek committed -
* config/sh/sh.h (TARGET_CACHE32, TARGET_HARVARD): Delete macro. (TARGET_SUPERSCALAR): Add TARGET_SH2A. (CACHE_LOG): Use TARGET_HARD_SH4 and TARGET_SH5 instead of TARGET_CACHE32. (TRAMPOLINE_ALIGNMENT): Use TARGET_HARD_SH4 and TARGET_SH5 instead of TARGET_HARVARD. * config/sh/sh.c (sh_trampoline_init): Likewise. From-SVN: r193151
Oleg Endo committed
-