Commit eab880cf by Uros Bizjak

sse.md (round<mode>2_sfix): New expander.

	* config/i386/sse.md (round<mode>2_sfix): New expander.
	(round<mode>2_vec_pack_sfix): Ditto.
	(<sse4_1>_round<ssemodesuffix>_sfix<avxsizesuffix>): Ditto.
	(<sse4_1>_round<ssemodesuffix>_vec_pack_sfix<avxsizesuffix>): Ditto.
	* config/i386/builtin-types.def (V4SI_FTYPE_V4SF_ROUND,
	V8SI_FTYPE_V8SF_ROUND, V4SI_FTYPE_V2DF_V2DF_ROUND,
	V8SI_FTYPE_V4DF_V4DF_ROUND): New builtin types.
	* config/i386/i386.c (ix86_builtins): Add
	IX86_BUILTIN_{FLOORPD,CEILPD,ROUNDPD_AZ}_VEC_PACK_SFIX{,256} and
	IX86_BUILTIN_{FLOORPS,CEILPS,ROUNDPS_AZ}_SFIX{,256} defines.
	(bdesc_args): Add __builtin_ia32_{floorpd,ceilpd}_vec_pack_sfix{,256},
	__builtin_ia32_roundpd_az_vec_pack_sfix{,256},
	__builtin_ia32_{floorps,ceilps}_sfix{,256} and
	__builtin_ia32_roundps_az_sfix{,256} descriptions.
	(ix86_expand_sse_round_vec_pack_sfix): New.
	(ix86_expand_args_builtin): Handle V4SI_FTYPE_V4SF_ROUND,
	V8SI_FTYPE_V8SF_ROUND, V4SI_FTYPE_V2DF_V2DF_ROUND and
	V8SI_FTYPE_V4DF_V4DF_ROUND types.  Check last argument of
	CODE_FOR_sse4_1_roundpd_vec_pack_sfix, CODE_FOR_sse4_1_roundps_sfix,
	CODE_FOR_avx_roundpd_vec_pack_sfix256 and CODE_FOR_avx_roundps_sfix256.
	(ix86_builtin_vectorized_function): Handle
	BUILT_IN_{I,L,LL}FLOOR{,F}, BUILT_IN_{I,L,LL}CEIL{,F} and
	BUILT_IN_{I,L,LL}ROUND{,F}.

testsuite/ChangeLog:

	* gcc.target/i386/sse4_1-floor-sfix-vec.c: New test.
	* gcc.target/i386/sse4_1-floorf-sfix-vec.c: Ditto.
	* gcc.target/i386/avx-floor-sfix-vec.c: Ditto.
	* gcc.target/i386/avx-floorf-sfix-vec.c: Ditto.
	* gcc.target/i386/sse4_1-ceil-sfix-vec.c: Ditto.
	* gcc.target/i386/sse4_1-ceilf-sfix-vec.c: Ditto.
	* gcc.target/i386/avx-ceil-sfix-vec.c: Ditto.
	* gcc.target/i386/avx-ceilf-sfix-vec.c: Ditto.
	* gcc.target/i386/sse4_1-round-sfix-vec.c: Ditto.
	* gcc.target/i386/sse4_1-roundf-sfix-vec.c: Ditto.
	* gcc.target/i386/avx-round-sfix-vec.c: Ditto.
	* gcc.target/i386/avx-roundf-sfix-vec.c: Ditto.

From-SVN: r181361
parent 2841f85e
2011-11-14 Uros Bizjak <ubizjak@gmail.com>
* config/i386/sse.md (round<mode>2_sfix): New expander.
(round<mode>2_vec_pack_sfix): Ditto.
(<sse4_1>_round<ssemodesuffix>_sfix<avxsizesuffix>): Ditto.
(<sse4_1>_round<ssemodesuffix>_vec_pack_sfix<avxsizesuffix>): Ditto.
* config/i386/builtin-types.def (V4SI_FTYPE_V4SF_ROUND,
V8SI_FTYPE_V8SF_ROUND, V4SI_FTYPE_V2DF_V2DF_ROUND,
V8SI_FTYPE_V4DF_V4DF_ROUND): New builtin types.
* config/i386/i386.c (ix86_builtins): Add
IX86_BUILTIN_{FLOORPD,CEILPD,ROUNDPD_AZ}_VEC_PACK_SFIX{,256} and
IX86_BUILTIN_{FLOORPS,CEILPS,ROUNDPS_AZ}_SFIX{,256} defines.
(bdesc_args): Add __builtin_ia32_{floorpd,ceilpd}_vec_pack_sfix{,256},
__builtin_ia32_roundpd_az_vec_pack_sfix{,256},
__builtin_ia32_{floorps,ceilps}_sfix{,256} and
__builtin_ia32_roundps_az_sfix{,256} descriptions.
(ix86_expand_sse_round_vec_pack_sfix): New.
(ix86_expand_args_builtin): Handle V4SI_FTYPE_V4SF_ROUND,
V8SI_FTYPE_V8SF_ROUND, V4SI_FTYPE_V2DF_V2DF_ROUND and
V8SI_FTYPE_V4DF_V4DF_ROUND types. Check last argument of
CODE_FOR_sse4_1_roundpd_vec_pack_sfix, CODE_FOR_sse4_1_roundps_sfix,
CODE_FOR_avx_roundpd_vec_pack_sfix256 and CODE_FOR_avx_roundps_sfix256.
(ix86_builtin_vectorized_function): Handle
BUILT_IN_{I,L,LL}FLOOR{,F}, BUILT_IN_{I,L,LL}CEIL{,F} and
BUILT_IN_{I,L,LL}ROUND{,F}.
2011-11-14 Jan Hubicka <jh@suse.cz>
PR middle-end/50598
@@ -11,38 +37,38 @@
2011-11-14 Zolotukhin Michael <michael.v.zolotukhin@gmail.com>
Jan Hubicka <jh@suse.cz>
* config/i386/i386.h (processor_costs): Add second dimension to
stringop_algs array.
* config/i386/i386.c (cost models): Initialize second dimension of
stringop_algs arrays.
* config/i386/i386.h (processor_costs): Add second dimension to
stringop_algs array.
* config/i386/i386.c (cost models): Initialize second dimension of
stringop_algs arrays.
(core_cost): New costs based on generic64 costs with updated stringop
values.
(promote_duplicated_reg): Add support for vector modes, add
declaration.
(promote_duplicated_reg_to_size): Likewise.
(promote_duplicated_reg): Add support for vector modes, add
declaration.
(promote_duplicated_reg_to_size): Likewise.
(processor_target): Set core costs for core variants.
(expand_set_or_movmem_via_loop_with_iter): New function.
(expand_set_or_movmem_via_loop): Enable reuse of the same iters in
different loops, produced by this function.
(emit_strset): New function.
(expand_movmem_epilogue): Add epilogue generation for bigger sizes,
use SSE-moves where possible.
(expand_setmem_epilogue): Likewise.
(expand_movmem_prologue): Likewise for prologue.
(expand_setmem_prologue): Likewise.
(expand_constant_movmem_prologue): Likewise.
(expand_constant_setmem_prologue): Likewise.
(decide_alg): Add new argument align_unknown. Fix algorithm of
strategy selection if TARGET_INLINE_ALL_STRINGOPS is set; Skip sse_loop
(decide_alignment): Update desired alignment according to chosen move
mode.
(ix86_expand_movmem): Change unrolled_loop strategy to use SSE-moves.
(ix86_expand_setmem): Likewise.
(ix86_slow_unaligned_access): Implementation of new hook
slow_unaligned_access.
* config/i386/i386.md (strset): Enable half-SSE moves.
* config/i386/sse.md (vec_dupv4si): Add expand for vec_dupv4si.
(vec_dupv2di): Add expand for vec_dupv2di.
(expand_set_or_movmem_via_loop_with_iter): New function.
(expand_set_or_movmem_via_loop): Enable reuse of the same iters in
different loops, produced by this function.
(emit_strset): New function.
(expand_movmem_epilogue): Add epilogue generation for bigger sizes,
use SSE-moves where possible.
(expand_setmem_epilogue): Likewise.
(expand_movmem_prologue): Likewise for prologue.
(expand_setmem_prologue): Likewise.
(expand_constant_movmem_prologue): Likewise.
(expand_constant_setmem_prologue): Likewise.
(decide_alg): Add new argument align_unknown. Fix algorithm of
strategy selection if TARGET_INLINE_ALL_STRINGOPS is set; Skip sse_loop
(decide_alignment): Update desired alignment according to chosen move
mode.
(ix86_expand_movmem): Change unrolled_loop strategy to use SSE-moves.
(ix86_expand_setmem): Likewise.
(ix86_slow_unaligned_access): Implementation of new hook
slow_unaligned_access.
* config/i386/i386.md (strset): Enable half-SSE moves.
* config/i386/sse.md (vec_dupv4si): Add expand for vec_dupv4si.
(vec_dupv2di): Add expand for vec_dupv2di.
2011-11-14 Dimitrios Apostolou <jimis@gmx.net>
@@ -53,8 +79,7 @@
2011-11-14 Kai Tietz <ktietz@redhat.com>
* gcov.c (generate_results): Add missing semicolon and
correct indent.
* gcov.c (generate_results): Add missing semicolon and correct indent.
2011-11-14 Ira Rosen <ira.rosen@linaro.org>
@@ -71,9 +96,8 @@
PR target/50694
* config/sh/sh.h (IS_LITTLE_ENDIAN_OPTION, UNSUPPORTED_SH2A):
New macros.
(DRIVER_SELF_SPECS): Use new macros to filter out
unsupported options taking the default configuration into
account.
(DRIVER_SELF_SPECS): Use new macros to filter out unsupported options
taking the default configuration into account.
2011-11-13 Jonathan Wakely <jwakely.gcc@gmail.com>
@@ -110,7 +134,7 @@
2011-11-12 Richard Henderson <rth@redhat.com>
* config/rs6000/rs6000.md (fix_trunc<SFDF>si2_stfiwx): Use
* config/rs6000/rs6000.md (fix_trunc<SFDF>si2_stfiwx): Use
nonimmediate_operand for the destination.
(fixuns_trunc<SFDF>si2_stfiwx): Likewise.
@@ -465,6 +465,11 @@ DEF_FUNCTION_TYPE_ALIAS (V4DF_FTYPE_V4DF, ROUND)
DEF_FUNCTION_TYPE_ALIAS (V4SF_FTYPE_V4SF, ROUND)
DEF_FUNCTION_TYPE_ALIAS (V8SF_FTYPE_V8SF, ROUND)
DEF_FUNCTION_TYPE_ALIAS (V4SI_FTYPE_V2DF_V2DF, ROUND)
DEF_FUNCTION_TYPE_ALIAS (V8SI_FTYPE_V4DF_V4DF, ROUND)
DEF_FUNCTION_TYPE_ALIAS (V4SI_FTYPE_V4SF, ROUND)
DEF_FUNCTION_TYPE_ALIAS (V8SI_FTYPE_V8SF, ROUND)
DEF_FUNCTION_TYPE_ALIAS (INT_FTYPE_V2DF_V2DF, PTEST)
DEF_FUNCTION_TYPE_ALIAS (INT_FTYPE_V2DI_V2DI, PTEST)
DEF_FUNCTION_TYPE_ALIAS (INT_FTYPE_V4DF_V4DF, PTEST)
......
@@ -9902,6 +9902,45 @@
(set_attr "prefix" "maybe_vex")
(set_attr "mode" "<MODE>")])
(define_expand "<sse4_1>_round<ssemodesuffix>_sfix<avxsizesuffix>"
[(match_operand:<sseintvecmode> 0 "register_operand" "")
(match_operand:VF1 1 "nonimmediate_operand" "")
(match_operand:SI 2 "const_0_to_15_operand" "")]
"TARGET_ROUND"
{
rtx tmp = gen_reg_rtx (<MODE>mode);
emit_insn
(gen_<sse4_1>_round<ssemodesuffix><avxsizesuffix> (tmp, operands[1],
operands[2]));
emit_insn
(gen_fix_trunc<mode><sseintvecmodelower>2 (operands[0], tmp));
DONE;
})
(define_expand "<sse4_1>_round<ssemodesuffix>_vec_pack_sfix<avxsizesuffix>"
[(match_operand:<ssepackfltmode> 0 "register_operand" "")
(match_operand:VF2 1 "nonimmediate_operand" "")
(match_operand:VF2 2 "nonimmediate_operand" "")
(match_operand:SI 3 "const_0_to_15_operand" "")]
"TARGET_ROUND"
{
rtx tmp0, tmp1;
tmp0 = gen_reg_rtx (<MODE>mode);
tmp1 = gen_reg_rtx (<MODE>mode);
emit_insn
(gen_<sse4_1>_round<ssemodesuffix><avxsizesuffix> (tmp0, operands[1],
operands[3]));
emit_insn
(gen_<sse4_1>_round<ssemodesuffix><avxsizesuffix> (tmp1, operands[2],
operands[3]));
emit_insn
(gen_vec_pack_sfix_trunc_<mode> (operands[0], tmp0, tmp1));
DONE;
})
(define_insn "sse4_1_round<ssescalarmodesuffix>"
[(set (match_operand:VF_128 0 "register_operand" "=x,x")
(vec_merge:VF_128
@@ -9957,6 +9996,39 @@
operands[5] = GEN_INT (ROUND_TRUNC);
})
(define_expand "round<mode>2_sfix"
[(match_operand:<sseintvecmode> 0 "register_operand" "")
(match_operand:VF1 1 "nonimmediate_operand" "")]
"TARGET_ROUND && !flag_trapping_math"
{
rtx tmp = gen_reg_rtx (<MODE>mode);
emit_insn (gen_round<mode>2 (tmp, operands[1]));
emit_insn
(gen_fix_trunc<mode><sseintvecmodelower>2 (operands[0], tmp));
DONE;
})
(define_expand "round<mode>2_vec_pack_sfix"
[(match_operand:<ssepackfltmode> 0 "register_operand" "")
(match_operand:VF2 1 "nonimmediate_operand" "")
(match_operand:VF2 2 "nonimmediate_operand" "")]
"TARGET_ROUND && !flag_trapping_math"
{
rtx tmp0, tmp1;
tmp0 = gen_reg_rtx (<MODE>mode);
tmp1 = gen_reg_rtx (<MODE>mode);
emit_insn (gen_round<mode>2 (tmp0, operands[1]));
emit_insn (gen_round<mode>2 (tmp1, operands[2]));
emit_insn
(gen_vec_pack_sfix_trunc_<mode> (operands[0], tmp0, tmp1));
DONE;
})
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;
;; Intel SSE4.2 string/text processing instructions
......
2011-11-14 Uros Bizjak <ubizjak@gmail.com>
* gcc.target/i386/sse4_1-floor-sfix-vec.c: New test.
* gcc.target/i386/sse4_1-floorf-sfix-vec.c: Ditto.
* gcc.target/i386/avx-floor-sfix-vec.c: Ditto.
* gcc.target/i386/avx-floorf-sfix-vec.c: Ditto.
* gcc.target/i386/sse4_1-ceil-sfix-vec.c: Ditto.
* gcc.target/i386/sse4_1-ceilf-sfix-vec.c: Ditto.
* gcc.target/i386/avx-ceil-sfix-vec.c: Ditto.
* gcc.target/i386/avx-ceilf-sfix-vec.c: Ditto.
* gcc.target/i386/sse4_1-round-sfix-vec.c: Ditto.
* gcc.target/i386/sse4_1-roundf-sfix-vec.c: Ditto.
* gcc.target/i386/avx-round-sfix-vec.c: Ditto.
* gcc.target/i386/avx-roundf-sfix-vec.c: Ditto.
2011-11-14 Fabien Chêne <fabien@gcc.gnu.org>
PR c++/6936
@@ -309,8 +324,8 @@
2011-11-09 Janne Blomqvist <jb@gcc.gnu.org>
PR libfortran/50016
* gfortran.dg/inquire_size.f90: Don't flush the unit.
PR libfortran/50016
* gfortran.dg/inquire_size.f90: Don't flush the unit.
2011-11-09 Richard Guenther <rguenther@suse.de>
@@ -495,8 +510,8 @@
2011-11-07 Janne Blomqvist <jb@gcc.gnu.org>
PR libfortran/45723
* gfortran.dg/open_dev_null.f90: Remove testcase.
PR libfortran/45723
* gfortran.dg/open_dev_null.f90: Remove testcase.
2011-11-07 Uros Bizjak <ubizjak@gmail.com>
......
/* { dg-do run } */
/* { dg-options "-O2 -ffast-math -ftree-vectorize -mavx" } */
/* { dg-require-effective-target avx } */
/* { dg-skip-if "no M_PI" { vxworks_kernel } } */
#define CHECK_H "avx-check.h"
#define TEST avx_test
#include "sse4_1-ceil-sfix-vec.c"
/* { dg-do run } */
/* { dg-options "-O2 -ffast-math -ftree-vectorize -mavx" } */
/* { dg-require-effective-target avx } */
/* { dg-skip-if "no M_PI" { vxworks_kernel } } */
#define CHECK_H "avx-check.h"
#define TEST avx_test
#include "sse4_1-ceilf-sfix-vec.c"
/* { dg-do run } */
/* { dg-options "-O2 -ffast-math -ftree-vectorize -mavx" } */
/* { dg-require-effective-target avx } */
/* { dg-skip-if "no M_PI" { vxworks_kernel } } */
#define CHECK_H "avx-check.h"
#define TEST avx_test
#include "sse4_1-floor-sfix-vec.c"
/* { dg-do run } */
/* { dg-options "-O2 -ffast-math -ftree-vectorize -mavx" } */
/* { dg-require-effective-target avx } */
/* { dg-skip-if "no M_PI" { vxworks_kernel } } */
#define CHECK_H "avx-check.h"
#define TEST avx_test
#include "sse4_1-floorf-sfix-vec.c"
/* { dg-do run } */
/* { dg-options "-O2 -ffast-math -ftree-vectorize -mavx" } */
/* { dg-require-effective-target avx } */
/* { dg-skip-if "no M_PI" { vxworks_kernel } } */
#define CHECK_H "avx-check.h"
#define TEST avx_test
#include "sse4_1-round-sfix-vec.c"
/* { dg-do run } */
/* { dg-options "-O2 -ffast-math -ftree-vectorize -mavx" } */
/* { dg-require-effective-target avx } */
/* { dg-skip-if "no M_PI" { vxworks_kernel } } */
#define CHECK_H "avx-check.h"
#define TEST avx_test
#include "sse4_1-roundf-sfix-vec.c"
/* { dg-do run } */
/* { dg-options "-O2 -ffast-math -ftree-vectorize -msse4.1" } */
/* { dg-require-effective-target sse4 } */
/* { dg-skip-if "no M_PI" { vxworks_kernel } } */
#ifndef CHECK_H
#define CHECK_H "sse4_1-check.h"
#endif
#ifndef TEST
#define TEST sse4_1_test
#endif
#include CHECK_H
#include <math.h>
extern double ceil (double);
#define NUM 64
static void
__attribute__((__target__("fpmath=sse")))
init_src (double *src)
{
int i, sign = 1;
double f = rand ();
for (i = 0; i < NUM; i++)
{
src[i] = (i + 1) * f * M_PI * sign;
if (i < (NUM / 2))
{
if ((i % 6) == 0)
f = f * src[i];
}
else if (i == (NUM / 2))
f = rand ();
else if ((i % 6) == 0)
f = 1 / (f * (i + 1) * src[i] * M_PI * sign);
sign = -sign;
}
}
static void
__attribute__((__target__("fpmath=387")))
TEST (void)
{
double a[NUM];
int r[NUM];
int i;
init_src (a);
for (i = 0; i < NUM; i++)
r[i] = (int) ceil (a[i]);
/* check results: */
for (i = 0; i < NUM; i++)
if (r[i] != (int) ceil (a[i]))
abort();
}
/* { dg-do run } */
/* { dg-options "-O2 -ffast-math -ftree-vectorize -msse4.1" } */
/* { dg-require-effective-target sse4 } */
/* { dg-skip-if "no M_PI" { vxworks_kernel } } */
#ifndef CHECK_H
#define CHECK_H "sse4_1-check.h"
#endif
#ifndef TEST
#define TEST sse4_1_test
#endif
#include CHECK_H
#include <math.h>
extern float ceilf (float);
#define NUM 64
static void
__attribute__((__target__("fpmath=sse")))
init_src (float *src)
{
int i, sign = 1;
float f = rand ();
for (i = 0; i < NUM; i++)
{
src[i] = (i + 1) * f * M_PI * sign;
if (i < (NUM / 2))
{
if ((i % 6) == 0)
f = f * src[i];
}
else if (i == (NUM / 2))
f = rand ();
else if ((i % 6) == 0)
f = 1 / (f * (i + 1) * src[i] * M_PI * sign);
sign = -sign;
}
}
static void
__attribute__((__target__("fpmath=387")))
TEST (void)
{
float a[NUM];
int r[NUM];
int i;
init_src (a);
for (i = 0; i < NUM; i++)
r[i] = (int) ceilf (a[i]);
/* check results: */
for (i = 0; i < NUM; i++)
if (r[i] != (int) ceilf (a[i]))
abort();
}
/* { dg-do run } */
/* { dg-options "-O2 -ffast-math -ftree-vectorize -msse4.1" } */
/* { dg-require-effective-target sse4 } */
/* { dg-skip-if "no M_PI" { vxworks_kernel } } */
#ifndef CHECK_H
#define CHECK_H "sse4_1-check.h"
#endif
#ifndef TEST
#define TEST sse4_1_test
#endif
#include CHECK_H
#include <math.h>
extern double floor (double);
#define NUM 64
static void
__attribute__((__target__("fpmath=sse")))
init_src (double *src)
{
int i, sign = 1;
double f = rand ();
for (i = 0; i < NUM; i++)
{
src[i] = (i + 1) * f * M_PI * sign;
if (i < (NUM / 2))
{
if ((i % 6) == 0)
f = f * src[i];
}
else if (i == (NUM / 2))
f = rand ();
else if ((i % 6) == 0)
f = 1 / (f * (i + 1) * src[i] * M_PI * sign);
sign = -sign;
}
}
static void
__attribute__((__target__("fpmath=387")))
TEST (void)
{
double a[NUM];
int r[NUM];
int i;
init_src (a);
for (i = 0; i < NUM; i++)
r[i] = (int) floor (a[i]);
/* check results: */
for (i = 0; i < NUM; i++)
if (r[i] != (int) floor (a[i]))
abort();
}
/* { dg-do run } */
/* { dg-options "-O2 -ffast-math -ftree-vectorize -msse4.1" } */
/* { dg-require-effective-target sse4 } */
/* { dg-skip-if "no M_PI" { vxworks_kernel } } */
#ifndef CHECK_H
#define CHECK_H "sse4_1-check.h"
#endif
#ifndef TEST
#define TEST sse4_1_test
#endif
#include CHECK_H
#include <math.h>
extern float floorf (float);
#define NUM 64
static void
__attribute__((__target__("fpmath=sse")))
init_src (float *src)
{
int i, sign = 1;
float f = rand ();
for (i = 0; i < NUM; i++)
{
src[i] = (i + 1) * f * M_PI * sign;
if (i < (NUM / 2))
{
if ((i % 6) == 0)
f = f * src[i];
}
else if (i == (NUM / 2))
f = rand ();
else if ((i % 6) == 0)
f = 1 / (f * (i + 1) * src[i] * M_PI * sign);
sign = -sign;
}
}
static void
__attribute__((__target__("fpmath=387")))
TEST (void)
{
float a[NUM];
int r[NUM];
int i;
init_src (a);
for (i = 0; i < NUM; i++)
r[i] = (int) floorf (a[i]);
/* check results: */
for (i = 0; i < NUM; i++)
if (r[i] != (int) floorf (a[i]))
abort();
}
/* { dg-do run } */
/* { dg-options "-O2 -ffast-math -ftree-vectorize -msse4.1" } */
/* { dg-require-effective-target sse4 } */
/* { dg-skip-if "no M_PI" { vxworks_kernel } } */
#ifndef CHECK_H
#define CHECK_H "sse4_1-check.h"
#endif
#ifndef TEST
#define TEST sse4_1_test
#endif
#include CHECK_H
#include <math.h>
extern double round (double);
#define NUM 64
static void
__attribute__((__target__("fpmath=sse")))
init_src (double *src)
{
int i, sign = 1;
double f = rand ();
for (i = 0; i < NUM; i++)
{
src[i] = (i + 1) * f * M_PI * sign;
if (i < (NUM / 2))
{
if ((i % 6) == 0)
f = f * src[i];
}
else if (i == (NUM / 2))
f = rand ();
else if ((i % 6) == 0)
f = 1 / (f * (i + 1) * src[i] * M_PI * sign);
sign = -sign;
}
}
static void
__attribute__((__target__("fpmath=387")))
TEST (void)
{
double a[NUM];
int r[NUM];
int i;
init_src (a);
for (i = 0; i < NUM; i++)
r[i] = (int) round (a[i]);
/* check results: */
for (i = 0; i < NUM; i++)
if (r[i] != (int) round (a[i]))
abort();
}
/* { dg-do run } */
/* { dg-options "-O2 -ffast-math -ftree-vectorize -msse4.1" } */
/* { dg-require-effective-target sse4 } */
/* { dg-skip-if "no M_PI" { vxworks_kernel } } */
#ifndef CHECK_H
#define CHECK_H "sse4_1-check.h"
#endif
#ifndef TEST
#define TEST sse4_1_test
#endif
#include CHECK_H
#include <math.h>
extern float roundf (float);
#define NUM 64
static void
__attribute__((__target__("fpmath=sse")))
init_src (float *src)
{
int i, sign = 1;
float f = rand ();
for (i = 0; i < NUM; i++)
{
src[i] = (i + 1) * f * M_PI * sign;
if (i < (NUM / 2))
{
if ((i % 6) == 0)
f = f * src[i];
}
else if (i == (NUM / 2))
f = rand ();
else if ((i % 6) == 0)
f = 1 / (f * (i + 1) * src[i] * M_PI * sign);
sign = -sign;
}
}
static void
__attribute__((__target__("fpmath=387")))
TEST (void)
{
float a[NUM];
int r[NUM];
int i;
init_src (a);
for (i = 0; i < NUM; i++)
r[i] = (int) roundf (a[i]);
/* check results: */
for (i = 0; i < NUM; i++)
if (r[i] != (int) roundf (a[i]))
abort();
}