Commit dc6bcf52, authored and committed by Uros Bizjak

lex.c (search_line_sse42): New main loop using asm flag outputs.

	* lex.c (search_line_sse42) [__GCC_ASM_FLAG_OUTPUTS__]: New main
	loop using asm flag outputs.

From-SVN: r225160
parent 7e6a6f0d
+2015-06-30 Uros Bizjak <ubizjak@gmail.com>
+
+	* lex.c (search_line_sse42) [__GCC_ASM_FLAG_OUTPUTS__]: New main
+	loop using asm flag outputs.
+
 2015-06-08 Marek Polacek <polacek@redhat.com>
 
 	PR c/66415
@@ -450,15 +450,33 @@ search_line_sse42 (const uchar *s, const uchar *end)
       s = (const uchar *)((si + 16) & -16);
     }
 
-  /* Main loop, processing 16 bytes at a time.  By doing the whole loop
-     in inline assembly, we can make proper use of the flags set.  */
-  __asm ( "sub $16, %1\n"
-          " .balign 16\n"
+  /* Main loop, processing 16 bytes at a time.  */
+#ifdef __GCC_ASM_FLAG_OUTPUTS__
+  while (1)
+    {
+      char f;
+
+      /* By using inline assembly instead of the builtin,
+         we can use the result, as well as the flags set.  */
+      __asm ("%vpcmpestri\t$0, %2, %3"
+             : "=c"(index), "=@ccc"(f)
+             : "m"(*s), "x"(search), "a"(4), "d"(16));
+      if (f)
+        break;
+
+      s += 16;
+    }
+#else
+  s -= 16;
+  /* By doing the whole loop in inline assembly,
+     we can make proper use of the flags set.  */
+  __asm ( ".balign 16\n"
           "0: add $16, %1\n"
-          " %vpcmpestri $0, (%1), %2\n"
+          " %vpcmpestri\t$0, (%1), %2\n"
           " jnc 0b"
           : "=&c"(index), "+r"(s)
           : "x"(search), "a"(4), "d"(16));
+#endif
 
  found:
   return s + index;
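
For readers unfamiliar with the "=@ccc" constraint used in the new loop, the following is a minimal sketch of an asm flag output in isolation. It is not part of this commit: add_carries is a hypothetical helper, and ADD stands in for pcmpestri only because it sets the carry flag in an easily checked way. It assumes GCC 6 or later (or a compatible compiler) on x86 and, like the patch, falls back to a portable path when __GCC_ASM_FLAG_OUTPUTS__ is not defined.

```c
/* Hypothetical helper, not part of lex.c: stores a + b in *sum and
   returns nonzero when the unsigned addition carried out.  */
static int
add_carries (unsigned int a, unsigned int b, unsigned int *sum)
{
#if defined (__GCC_ASM_FLAG_OUTPUTS__) \
    && (defined (__x86_64__) || defined (__i386__))
  unsigned int result = a;
  char carry;

  /* "=@ccc" binds the carry flag left by ADD to a C variable, so no
     SETC or conditional jump is needed inside the asm itself.  The
     flags must not also be listed as a "cc" clobber.  */
  __asm ("add %2, %0"
         : "+r" (result), "=@ccc" (carry)
         : "r" (b));
  *sum = result;
  return carry;
#else
  /* Portable fallback: unsigned wraparound implies a carry.  */
  *sum = a + b;
  return *sum < a;
#endif
}
```

Calling add_carries (~0u, 1u, &sum) should report a carry and leave sum at 0, while add_carries (2u, 3u, &sum) should report none and leave sum at 5. In the patch the same mechanism lets the compiler branch directly on the carry flag set by pcmpestri, which is why the loop can now be written in C instead of a single asm block.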