[nvptx] Don't emit barriers for empty loops -- fix

When compiling an empty loop:
...
  long long v1;
  #pragma acc parallel num_gangs (640) num_workers(1) vector_length (128)
  #pragma acc loop
    for (v1 = 0; v1 < 20; v1 += 2)
        ;
...
the compiler emits two consecutive bar.sync instructions.  This triggers a bug
on my Quadro M1200 (I'm assuming in the ptxas/JIT compiler) that hangs the
testcase.

This patch works around the bug with an optimization: we detect that this is
an empty loop (a forked immediately followed by a joining), and don't emit the
barriers.

The patch does not include the testcase yet, since vector_length (128) is not
yet supported at this point.

2019-01-07  Tom de Vries  <tdevries@suse.de>

	PR target/85381
	* config/nvptx/nvptx.c (nvptx_process_pars): Don't emit barriers for
	empty loops.

From-SVN: r267630
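
As a rough illustration of the empty-loop check described above (a forked immediately followed by a joining), here is a minimal sketch over a toy instruction list. It is not the code in nvptx_process_pars: the `insn_kind` enum and the `empty_region_p` and `count_barriers` helpers are hypothetical names invented for this example, and the "two barriers per non-empty region" rule is a simplifying assumption taken from the two bar.syncs mentioned in the description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified instruction kinds.  This is a toy model,
   not GCC's RTL representation.  */
enum insn_kind { INSN_FORKED, INSN_JOINING, INSN_OTHER };

/* Return true if the region opened by the forked marker at index FORKED
   is empty, i.e. the very next instruction is the joining marker.  */
static bool
empty_region_p (const enum insn_kind *insns, size_t n, size_t forked)
{
  assert (insns != NULL && forked < n);
  assert (insns[forked] == INSN_FORKED);
  return forked + 1 < n && insns[forked + 1] == INSN_JOINING;
}

/* Count the barriers a toy emitter would produce.  Assumption: each
   non-empty region gets two barriers, and empty regions get none.  */
static unsigned
count_barriers (const enum insn_kind *insns, size_t n)
{
  unsigned barriers = 0;
  for (size_t i = 0; i < n; i++)
    if (insns[i] == INSN_FORKED && !empty_region_p (insns, n, i))
      barriers += 2;
  return barriers;
}
```

In this model, a sequence of just forked and joining yields no barriers, while a sequence with any instruction between the two markers still yields both.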