Commit bbe996ec by Uros Bizjak

re PR target/47989 (-mrecip causes 482.sphinx3, 464.h264ref and 481.wrf to miscompare)

	PR target/47989
	* config/i386/i386.h (RECIP_MASK_DEFAULT): New define.
	* config/i386/i386.opt (recip_mask): Initialize with RECIP_MASK_DEFAULT.
	* doc/invoke.texi (ix86 Options, -mrecip): Document that GCC
	implements vectorized single float division and vectorized sqrtf(x)
	with reciprocal sequence with additional Newton-Raphson step with
	-ffast-math.

From-SVN: r180256
parent eb405c46
+2011-10-20 Uros Bizjak <ubizjak@gmail.com>
+	PR target/47989
+	* config/i386/i386.h (RECIP_MASK_DEFAULT): New define.
+	* config/i386/i386.opt (recip_mask): Initialize with RECIP_MASK_DEFAULT.
+	* doc/invoke.texi (ix86 Options, -mrecip): Document that GCC
+	implements vectorized single float division and vectorized sqrtf(x)
+	with reciprocal sequence with additional Newton-Raphson step with
+	-ffast-math.
 2011-10-20 Dodji Seketeli <dodji@redhat.com>
 	PR other/50659
@@ -33,8 +43,7 @@
 2011-10-19 David S. Miller <davem@davemloft.net>
-	* config/sparc/sparc.c (sparc_expand_move): Use
-	can_create_pseudo_p.
+	* config/sparc/sparc.c (sparc_expand_move): Use can_create_pseudo_p.
 	(sparc_emit_set_const32): Likewise.
 	(sparc_emit_set_const64_longway): Likewise.
 	(sparc_emit_set_const64): Likewise.
@@ -279,8 +288,8 @@
 2011-10-19 Jan Hubicka <jh@suse.cz>
-	* cgraphunit.c (handle_alias_pairs): Also handle wekref with destination
-	declared.
+	* cgraphunit.c (handle_alias_pairs): Also handle wekref with
+	destination declared.
 	(output_weakrefs): New function.
 	* varpool.c (varpool_create_variable_alias): Handle external aliases.
@@ -319,7 +328,6 @@
 2011-10-18 Andrew Stubbs <ams@codesourcery.com>
 	PR tree-optimization/50717
 	* tree-ssa-math-opts.c (is_widening_mult_p): Remove the 'type'
 	parameter. Calculate 'type' from stmt.
 	(convert_mult_to_widen): Update call the is_widening_mult_p.
@@ -668,8 +676,7 @@
 2011-10-17 Sergio Durigan Junior <sergiodj@redhat.com>
-	* configure.ac: Display `yes' if the SystemTap header has been
-	found.
+	* configure.ac: Display `yes' if the SystemTap header has been found.
 	* configure: Regenerate.
 2011-10-08 Andi Kleen <ak@linux.intel.com>
@@ -685,8 +692,7 @@
 2011-10-17 Richard Guenther <rguenther@suse.de>
 	PR tree-optimization/50729
-	* tree-vrp.c (extract_range_from_unary_expr_1): Remove
-	redundant test.
+	* tree-vrp.c (extract_range_from_unary_expr_1): Remove redundant test.
 	(simplify_conversion_using_ranges): Properly test the
 	intermediate result.
@@ -709,8 +715,7 @@
 2011-10-15 Tom Tromey <tromey@redhat.com>
 	Dodji Seketeli <dodji@redhat.com>
-	* input.c (ONE_K, ONE_M, SCALE, STAT_LABEL, FORMAT_AMOUNT): New
-	macros.
+	* input.c (ONE_K, ONE_M, SCALE, STAT_LABEL, FORMAT_AMOUNT): New macros.
 	(num_expanded_macros_counter, num_macro_tokens_counter): Declare
 	new counters.
 	(dump_line_table_statistics): Define new function.
@@ -721,8 +726,7 @@
 	Dodji Seketeli <dodji@redhat.com>
 	* doc/cppopts.texi: Document -fdebug-cpp.
-	* doc/invoke.texi: Add -fdebug-cpp to the list of preprocessor
-	options.
+	* doc/invoke.texi: Add -fdebug-cpp to the list of preprocessor options.
 2011-10-15 Tom Tromey <tromey@redhat.com>
 	Dodji Seketeli <dodji@redhat.com>
@@ -759,8 +763,7 @@
 	(LOCATION_COLUMN): New accessor
 	(in_system_header_at): Use linemap_location_in_system_header_p.
 	* diagnostic.c (diagnostic_report_current_module): Adjust to avoid
-	touching the internals of struct line_map. Use the public API.
-	instead.
+	touching the internals of struct line_map. Use the public API instead.
 	(diagnostic_report_diagnostic): Don't use relational operator '<'
 	on virtual locations. Use linemap_location_before_p instead.
 	* input.c (expand_location): Adjust to expand to the tokens'
@@ -1280,9 +1283,8 @@
 2011-10-12 Bernd Schmidt <bernds@codesourcery.com>
 	* function.c (prepare_shrink_wrap, bb_active_p): New function.
-	(thread_prologue_and_epilogue_insns): Use bb_active_p.
-	Call prepare_shrink_wrap, then recompute bb_active_p for the
-	last block.
+	(thread_prologue_and_epilogue_insns): Use bb_active_p. Call
+	prepare_shrink_wrap, then recompute bb_active_p for the last block.
 2011-10-12 Joseph Myers <joseph@codesourcery.com>
@@ -1526,8 +1528,8 @@
 2011-10-10 Georg-Johann Lay <avr@gjlay.de>
-	* config/avr/avr.c (avr_option_override): Set
-	flag_omit_frame_pointer to 0 if frame pointer is needed for unwinding.
+	* config/avr/avr.c (avr_option_override): Set flag_omit_frame_pointer
+	to 0 if frame pointer is needed for unwinding.
 2011-10-10 Uros Bizjak <ubizjak@gmail.com>
@@ -2322,6 +2322,7 @@ extern void debug_dispatch_window (int);
 #define RECIP_MASK_VEC_SQRT 0x08
 #define RECIP_MASK_ALL (RECIP_MASK_DIV | RECIP_MASK_SQRT \
 | RECIP_MASK_VEC_DIV | RECIP_MASK_VEC_SQRT)
+#define RECIP_MASK_DEFAULT (RECIP_MASK_VEC_DIV | RECIP_MASK_VEC_SQRT)
 #define TARGET_RECIP_DIV ((recip_mask & RECIP_MASK_DIV) != 0)
 #define TARGET_RECIP_SQRT ((recip_mask & RECIP_MASK_SQRT) != 0)
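The effect of the new default can be checked in isolation. The sketch below is not GCC source: only `RECIP_MASK_VEC_SQRT` (0x08) and the composition of `RECIP_MASK_ALL` and `RECIP_MASK_DEFAULT` come from the hunk above. The other three bit values are assumed to follow the same one-bit-per-flag pattern, and `recip_enabled` is a hypothetical helper that mirrors the `TARGET_RECIP_*` tests.

```c
#include <assert.h>

/* 0x08 is taken from the hunk; 0x01, 0x02 and 0x04 are ASSUMED values
   that follow the same one-bit-per-flag layout. */
#define RECIP_MASK_DIV      0x01
#define RECIP_MASK_SQRT     0x02
#define RECIP_MASK_VEC_DIV  0x04
#define RECIP_MASK_VEC_SQRT 0x08
#define RECIP_MASK_ALL      (RECIP_MASK_DIV | RECIP_MASK_SQRT \
                             | RECIP_MASK_VEC_DIV | RECIP_MASK_VEC_SQRT)
#define RECIP_MASK_DEFAULT  (RECIP_MASK_VEC_DIV | RECIP_MASK_VEC_SQRT)

/* Hypothetical helper: same test as TARGET_RECIP_DIV and friends,
   but with the mask passed in instead of read from a global. */
int recip_enabled(int mask, int flag)
{
    assert(flag != 0 && (flag & ~RECIP_MASK_ALL) == 0); /* known flag only */
    return (mask & flag) != 0;
}
```

With these assumed values, `recip_enabled(RECIP_MASK_DEFAULT, RECIP_MASK_VEC_DIV)` is nonzero while `recip_enabled(RECIP_MASK_DEFAULT, RECIP_MASK_DIV)` is zero: the default turns on only the two vectorized expansions and leaves the scalar ones off.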
......
@@ -32,7 +32,7 @@ Variable
 HOST_WIDE_INT ix86_isa_flags_explicit
 TargetVariable
-int recip_mask
+int recip_mask = RECIP_MASK_DEFAULT
 Variable
 int recip_mask_explicit
......
@@ -12922,7 +12922,12 @@ Note that while the throughput of the sequence is higher than the throughput
 of the non-reciprocal instruction, the precision of the sequence can be
 decreased by up to 2 ulp (i.e. the inverse of 1.0 equals 0.99999994).
-Note that GCC implements 1.0f/sqrtf(x) in terms of RSQRTSS (or RSQRTPS)
+Note that GCC implements @code{1.0f/sqrtf(@var{x})} in terms of RSQRTSS
+(or RSQRTPS) already with @option{-ffast-math} (or the above option
+combination), and doesn't need @option{-mrecip}.
+Also note that GCC emits the above sequence with additional Newton-Raphson step
+for vectorized single float division and vectorized @code{sqrtf(@var{x})}
 already with @option{-ffast-math} (or the above option combination), and
 doesn't need @option{-mrecip}.
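The "additional Newton-Raphson step" mentioned in the documentation hunk can be illustrated with a small, portable sketch. This is not the instruction sequence GCC emits. `crude_estimate` is a hypothetical stand-in for a low-precision hardware estimate such as RCPSS (it keeps only the top 12 mantissa bits of the exact quotient), and `recip_nr` and `div_nr` are illustrative names. One step of x1 = x0 * (2 - a * x0) roughly squares the relative error of the estimate.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a hardware reciprocal estimate: clear the low
   11 bits of the 23-bit mantissa so only about 12 bits stay accurate. */
float crude_estimate(float exact)
{
    uint32_t bits;
    memcpy(&bits, &exact, sizeof bits);
    bits &= 0xFFFFF800u;
    memcpy(&exact, &bits, sizeof bits);
    return exact;
}

/* Reciprocal of a: crude estimate plus one Newton-Raphson refinement. */
float recip_nr(float a)
{
    assert(a != 0.0f);                 /* 1/0 is outside this sketch */
    float x0 = crude_estimate(1.0f / a);
    return x0 * (2.0f - a * x0);       /* x1 = x0 * (2 - a * x0) */
}

/* Division expressed as multiplication by the refined reciprocal. */
float div_nr(float num, float den)
{
    return num * recip_nr(den);
}
```

For example, `crude_estimate(1.0f / 3.0f) * 3.0f` misses 1.0 by more than 1e-5, while `recip_nr(3.0f) * 3.0f` lands within 1e-6 of it. The refined result can still differ from a true division in the last bits, which is why the documentation ties this expansion to `-ffast-math`.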
......