Commit 89ac11d8 by Uros Bizjak

re PR target/53399 ("*ffs" pattern generates wrong code with BMI enabled)

	PR target/53399
	* config/i386/i386.md (ffs<mode>2): Generate CCCmode compare
	for TARGET_BMI.
	(ffssi2_no_cmove): Ditto.
	(*ffs<mode>_1): Remove insn pattern.
	(*tzcnt<mode>_1): New insn pattern.
	(*bsf<mode>1): Ditto.

From-SVN: r187722
parent 94ccc95d
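
Background for the PR, as a hedged sketch rather than a description of GCC internals: per the Intel instruction reference, bsf sets ZF when its source operand is zero, whereas tzcnt sets CF when the source is zero and sets ZF when the result is zero. An ffs expansion that tests ZF after tzcnt therefore misreads inputs whose lowest bit is set. The C model below illustrates that difference; the struct and function names (scan_result, model_bsf, model_tzcnt, ffs_*) are hypothetical and exist only for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative model only: destination value and flags of a 32-bit scan. */
struct scan_result {
    uint32_t value;  /* destination register after the instruction */
    bool zf;         /* zero flag */
    bool cf;         /* carry flag */
};

/* Portable trailing-zero count; the caller guarantees x != 0. */
static uint32_t count_trailing_zeros(uint32_t x)
{
    uint32_t n = 0;
    while ((x & 1u) == 0) {
        x >>= 1;
        n++;
    }
    return n;
}

/* bsf: ZF reports a zero source. The destination is undefined in that
   case (modelled as 0) and CF is undefined (modelled as false). */
static struct scan_result model_bsf(uint32_t x)
{
    struct scan_result r = { 0, x == 0, false };
    if (x != 0)
        r.value = count_trailing_zeros(x);
    return r;
}

/* tzcnt: CF reports a zero source (destination = operand size);
   ZF reports a zero result, i.e. bit 0 of the source was set. */
static struct scan_result model_tzcnt(uint32_t x)
{
    struct scan_result r;
    r.value = (x == 0) ? 32u : count_trailing_zeros(x);
    r.cf = (x == 0);
    r.zf = (r.value == 0);
    return r;
}

/* ffs = (input was zero ? -1 : scan result) + 1. */
static int ffs_from_scan(struct scan_result r, bool input_was_zero)
{
    int32_t result = input_was_zero ? -1 : (int32_t) r.value;
    return result + 1;
}

/* bsf with a ZF test: correct. */
static int ffs_bsf(uint32_t x)
{
    struct scan_result r = model_bsf(x);
    return ffs_from_scan(r, r.zf);
}

/* tzcnt with a ZF test: the miscompilation this PR describes. */
static int ffs_tzcnt_zf(uint32_t x)
{
    struct scan_result r = model_tzcnt(x);
    return ffs_from_scan(r, r.zf);
}

/* tzcnt with a CF test: the behaviour the CCCmode compare asks for. */
static int ffs_tzcnt_cf(uint32_t x)
{
    struct scan_result r = model_tzcnt(x);
    return ffs_from_scan(r, r.cf);
}
```

In this model ffs_tzcnt_zf(1) yields 0 instead of 1, while ffs_bsf and ffs_tzcnt_cf agree on every input.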
+2012-05-21 Uros Bizjak <ubizjak@gmail.com>
+PR target/53399
+* config/i386/i386.md (ffs<mode>2): Generate CCCmode compare
+for TARGET_BMI.
+(ffssi2_no_cmove): Ditto.
+(*ffs<mode>_1): Remove insn pattern.
+(*tzcnt<mode>_1): New insn pattern.
+(*bsf<mode>1): Ditto.
 2012-05-21 Richard Guenther <rguenther@suse.de>
 * tree-dfa.c (add_referenced_var): Do not walk DECL_INITIAL for
@@ -70,7 +80,7 @@
 2012-05-21 Razya Ladelsky <razya@il.ibm.com>
 * tree-parloops.c : Add myself to contributors, update
 TODO list, add link to wiki.
 2012-05-21 Alan Modra <amodra@gmail.com>
@@ -126,8 +136,9 @@
 call_val_symref_64bit, call_val_reg_pic, call_val_reg_64bit): Likewise.
 2012-05-20 Razya Ladelsky <razya@il.ibm.com>
-* tree-parloops.c (gen_parallel_loop): Change many_iterations_cond for outer loops.
+* tree-parloops.c (gen_parallel_loop): Change many_iterations_cond
+for outer loops.
 2012-05-18 Jan Hubicka <jh@suse.cz>
@@ -145,7 +156,8 @@
 2012-05-18 Jan Hubicka <jh@suse.cz>
-* cgraphunit.c (handle_alias_pairs): Cleanup; handle all types of aliases.
+* cgraphunit.c (handle_alias_pairs): Cleanup; handle all types of
+aliases.
 2012-05-18 Jan Hubicka <jh@suse.cz>
@@ -205,15 +217,13 @@
 * cgraphunit.c (varpool_finalize_decl): Allow external decls.
 (mark_functions_to_output): Fix sanity check.
-* ipa.c (function_and_variable_visibility): Remove TREE_STATIC
-check.
+* ipa.c (function_and_variable_visibility): Remove TREE_STATIC check.
 2012-05-18 Richard Guenther <rguenther@suse.de>
 * tree-flow.h (mark_symbols_for_renaming): Remove.
 * tree-dfa.c (mark_symbols_for_renaming): Likewise.
-* tree-inline.c (copy_edges_for_bb): Do not mark symbols for
-renaming.
+* tree-inline.c (copy_edges_for_bb): Do not mark symbols for renaming.
 (copy_debug_stmt): Likewise.
 (expand_call_inline): Likewise.
 (declare_return_variable): Mark the return variable for renaming
@@ -248,13 +258,14 @@
 2012-05-17 Jan Hubicka <jh@suse.cz>
-* lto-symtab.c (lto_symtab_resolve_symbols): Preffer decl with constructor
-over decl without.
+* lto-symtab.c (lto_symtab_resolve_symbols): Preffer decl with
+constructor over decl without.
 * cgraph.c (cgraph_remove_node): Clear also body of unanalyzed nodes.
 * cgraph.h (varpool_can_remove_if_no_refs): Handle external correctly.
 * cgraphunit.c (process_function_and_variable_attributes): Finalize
 extrnal decls.
-(mark_functions_to_output): Also accept bodies for functions with clones.
+(mark_functions_to_output): Also accept bodies for functions with
+clones.
 (output_in_order): Skip external vars.
 * lto-cgraph.c (lto_output_node): External functions are never in other
 partition.
@@ -287,8 +298,7 @@
 2012-05-17 Kwok Cheung Yeung <kcy@codesourcery.com>
-* config/m68k/m68k-devices.def: Add 51ag, 51je, 51jf, 51jg, 51mm,
-51qm.
+* config/m68k/m68k-devices.def: Add 51ag, 51je, 51jf, 51jg, 51mm, 51qm.
 * config/m68k/m68k-tables.opt: Regenerated.
 * doc/invoke.texi (M680x0 Options): Document.
@@ -468,7 +478,7 @@
 2012-05-15 Tristan Gingold <gingold@adacore.com>
 * tree-ssa-strlen.c (get_string_length): Convert lhs if needed.
 2012-05-15 Richard Guenther <rguenther@suse.de>
@@ -496,8 +506,7 @@
 2012-05-15 Kenneth Zadeck <zadeck@naturalbridge.com>
-* doc/md.texi (fma): Define to only be applicable for single
-rounding.
+* doc/md.texi (fma): Define to only be applicable for single rounding.
 2012-05-14 Uros Bizjak <ubizjak@gmail.com>
@@ -552,7 +561,7 @@
 * config/avr/avr.c (avr_const_address_lo16): Remove.
 (avr_assemble_integer): Print ".byte lo8(x)",
 ".byte hi8(x)", ".byte hh8(x)" instead of emit an assembler
 .warning if 3-byte address is assembled.
 * doc/extend.texi (AVR Named Address Spaces): Document that
 binutils 2.23 is needed to assemble 3-byte addresses.
......
@@ -12132,26 +12132,31 @@
 (define_expand "ffs<mode>2"
 [(set (match_dup 2) (const_int -1))
-(parallel [(set (reg:CCZ FLAGS_REG)
-(compare:CCZ
-(match_operand:SWI48 1 "nonimmediate_operand")
-(const_int 0)))
+(parallel [(set (match_dup 3) (match_dup 4))
 (set (match_operand:SWI48 0 "register_operand")
-(ctz:SWI48 (match_dup 1)))])
+(ctz:SWI48
+(match_operand:SWI48 1 "nonimmediate_operand")))])
 (set (match_dup 0) (if_then_else:SWI48
-(eq (reg:CCZ FLAGS_REG) (const_int 0))
+(eq (match_dup 3) (const_int 0))
 (match_dup 2)
 (match_dup 0)))
 (parallel [(set (match_dup 0) (plus:SWI48 (match_dup 0) (const_int 1)))
 (clobber (reg:CC FLAGS_REG))])]
 ""
 {
+enum machine_mode flags_mode;
 if (<MODE>mode == SImode && !TARGET_CMOVE)
 {
 emit_insn (gen_ffssi2_no_cmove (operands[0], operands [1]));
 DONE;
 }
+flags_mode = TARGET_BMI ? CCCmode : CCZmode;
 operands[2] = gen_reg_rtx (<MODE>mode);
+operands[3] = gen_rtx_REG (flags_mode, FLAGS_REG);
+operands[4] = gen_rtx_COMPARE (flags_mode, operands[1], const0_rtx);
 })
 (define_insn_and_split "ffssi2_no_cmove"
@@ -12162,11 +12167,10 @@
 "!TARGET_CMOVE"
 "#"
 "&& reload_completed"
-[(parallel [(set (reg:CCZ FLAGS_REG)
-(compare:CCZ (match_dup 1) (const_int 0)))
+[(parallel [(set (match_dup 4) (match_dup 5))
 (set (match_dup 0) (ctz:SI (match_dup 1)))])
 (set (strict_low_part (match_dup 3))
-(eq:QI (reg:CCZ FLAGS_REG) (const_int 0)))
+(eq:QI (match_dup 4) (const_int 0)))
 (parallel [(set (match_dup 2) (neg:SI (match_dup 2)))
 (clobber (reg:CC FLAGS_REG))])
 (parallel [(set (match_dup 0) (ior:SI (match_dup 0) (match_dup 2)))
@@ -12174,37 +12178,38 @@
 (parallel [(set (match_dup 0) (plus:SI (match_dup 0) (const_int 1)))
 (clobber (reg:CC FLAGS_REG))])]
 {
+enum machine_mode flags_mode = TARGET_BMI ? CCCmode : CCZmode;
 operands[3] = gen_lowpart (QImode, operands[2]);
+operands[4] = gen_rtx_REG (flags_mode, FLAGS_REG);
+operands[5] = gen_rtx_COMPARE (flags_mode, operands[1], const0_rtx);
 ix86_expand_clear (operands[2]);
 })
-(define_insn "*ffs<mode>_1"
+(define_insn "*tzcnt<mode>_1"
+[(set (reg:CCC FLAGS_REG)
+(compare:CCC (match_operand:SWI48 1 "nonimmediate_operand" "rm")
+(const_int 0)))
+(set (match_operand:SWI48 0 "register_operand" "=r")
+(ctz:SWI48 (match_dup 1)))]
+"TARGET_BMI"
+"tzcnt{<imodesuffix>}\t{%1, %0|%0, %1}"
+[(set_attr "type" "alu1")
+(set_attr "prefix_0f" "1")
+(set_attr "prefix_rep" "1")
+(set_attr "mode" "<MODE>")])
+(define_insn "*bsf<mode>_1"
 [(set (reg:CCZ FLAGS_REG)
 (compare:CCZ (match_operand:SWI48 1 "nonimmediate_operand" "rm")
 (const_int 0)))
 (set (match_operand:SWI48 0 "register_operand" "=r")
 (ctz:SWI48 (match_dup 1)))]
 ""
-{
-if (TARGET_BMI)
-return "tzcnt{<imodesuffix>}\t{%1, %0|%0, %1}";
-else if (optimize_function_for_size_p (cfun))
-;
-else if (TARGET_GENERIC)
-/* tzcnt expands to rep;bsf and we can use it even if !TARGET_BMI. */
-return "rep; bsf{<imodesuffix>}\t{%1, %0|%0, %1}";
-return "bsf{<imodesuffix>}\t{%1, %0|%0, %1}";
-}
+"bsf{<imodesuffix>}\t{%1, %0|%0, %1}"
 [(set_attr "type" "alu1")
 (set_attr "prefix_0f" "1")
-(set (attr "prefix_rep")
-(if_then_else
-(ior (match_test "TARGET_BMI")
-(and (not (match_test "optimize_function_for_size_p (cfun)"))
-(match_test "TARGET_GENERIC")))
-(const_string "1")
-(const_string "0")))
 (set_attr "mode" "<MODE>")])
 (define_insn "ctz<mode>2"
......
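
The split sequence of ffssi2_no_cmove in the diff above (scan, set the low byte from the flag, negate, OR, add 1) can be read as a branch-free select. The C sketch below mirrors those steps under stated assumptions: the function name ffs_no_cmove is hypothetical, and the flag is modelled as a plain "input was zero" boolean, which corresponds to ZF after bsf or CF after tzcnt. It is an illustration of the arithmetic, not the code GCC emits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical C rendering of the no-cmov ffs sequence. */
static int ffs_no_cmove(uint32_t x)
{
    uint32_t tmp = 0;               /* ix86_expand_clear (operands[2]) */
    bool input_was_zero = (x == 0); /* flag set by the compare in the scan */

    /* Scan result; its value does not matter when x == 0 because the
       OR below overwrites every bit. */
    uint32_t res = 0;
    if (x != 0) {
        while (((x >> res) & 1u) == 0)
            res++;
    }

    /* strict_low_part set: write 0 or 1 into the low byte of tmp. */
    tmp = (tmp & ~0xffu) | (input_was_zero ? 1u : 0u);

    tmp = 0u - tmp;  /* neg: 0 stays 0, 1 becomes 0xffffffff */
    res |= tmp;      /* ior: forces all ones when the input was zero */
    res += 1;        /* plus 1: all ones wraps to 0, otherwise index + 1 */

    return (int) res;
}
```

The negate-and-OR pair is what replaces the conditional move: an all-ones mask turns any scan result into -1, so the final increment yields 0 for a zero input.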