When using a compiler build with: ... +#define PTX_DEFAULT_VECTOR_LENGTH PTX_CTA_SIZE ... consider a test-case: ... int main (void) { #pragma acc parallel vector_length (64) #pragma acc loop worker for (unsigned int i = 0; i < 32; i++) #pragma acc loop vector for (unsigned int j = 0; j < 64; j++) ; return 0; } ... If num_workers is 16, either because: - we add a "num_workers (16)" clause on the parallel directive, or - we set "GOMP_OPENACC_DIM=:16:", or - the libgomp plugin chooses 16 num_workers we run into an illegal instruction at runtime, because a bar.sync instruction tries to use a barrier 16. The instruction is illegal, because ptx supports only 16 barriers per CTA, and the valid range is 0..15. The problem is that with a warp-multiple vector length, we use a code generation scheme with a per-worker barrier. And because barrier zero is reserved for per-cta barrier, only the remaining 15 barriers can be used as per-worker barrier, and consequently we can't use num_workers larger than 15. This problem occurs only for vector_length 64. For vector_length 32, we use a different code generation scheme, and for vector_length >= 96, the maximum num_workers is not big enough not to trigger this problem. Also, this problem only occurs for num_workers 16. As explained above, num_workers 15 is safe to use, and 16 is already the maximum num_workers for vector_length 64. This patch fixes the problem in both the compiler (handling "num_workers (16)") and in the libgomp nvptx plugin (with and without "GOMP_OPENACC_DIM=:16:"). 2019-01-11 Tom de Vries <tdevries@suse.de> * config/nvptx/nvptx.c (PTX_CTA_NUM_BARRIERS, PTX_PER_CTA_BARRIER) (PTX_NUM_PER_CTA_BARRIER, PTX_FIRST_PER_WORKER_BARRIER) (PTX_NUM_PER_WORKER_BARRIERS): Define. (nvptx_apply_dim_limits): Prevent vector_length 64 and num_workers 16. * plugin/plugin-nvptx.c (nvptx_exec): Prevent vector_length 64 and num_workers 16. From-SVN: r267838
Name |
Last commit
|
Last update |
---|---|---|
.. | ||
config | Loading commit data... | |
plugin | Loading commit data... | |
testsuite | Loading commit data... | |
ChangeLog | Loading commit data... | |
ChangeLog.graphite | Loading commit data... | |
Makefile.am | Loading commit data... | |
Makefile.in | Loading commit data... | |
acinclude.m4 | Loading commit data... | |
aclocal.m4 | Loading commit data... | |
affinity-fmt.c | Loading commit data... | |
affinity.c | Loading commit data... | |
alloc.c | Loading commit data... | |
atomic.c | Loading commit data... | |
barrier.c | Loading commit data... | |
config.h.in | Loading commit data... | |
configure | Loading commit data... | |
configure.ac | Loading commit data... | |
configure.tgt | Loading commit data... | |
critical.c | Loading commit data... | |
env.c | Loading commit data... | |
error.c | Loading commit data... | |
fortran.c | Loading commit data... | |
hashtab.h | Loading commit data... | |
icv-device.c | Loading commit data... | |
icv.c | Loading commit data... | |
iter.c | Loading commit data... | |
iter_ull.c | Loading commit data... | |
libgomp-plugin.c | Loading commit data... | |
libgomp-plugin.h | Loading commit data... | |
libgomp.h | Loading commit data... | |
libgomp.map | Loading commit data... | |
libgomp.spec.in | Loading commit data... | |
libgomp.texi | Loading commit data... | |
libgomp_f.h.in | Loading commit data... | |
libgomp_g.h | Loading commit data... | |
lock.c | Loading commit data... | |
loop.c | Loading commit data... | |
loop_ull.c | Loading commit data... | |
oacc-async.c | Loading commit data... | |
oacc-cuda.c | Loading commit data... | |
oacc-host.c | Loading commit data... | |
oacc-init.c | Loading commit data... | |
oacc-int.h | Loading commit data... | |
oacc-mem.c | Loading commit data... | |
oacc-parallel.c | Loading commit data... | |
oacc-plugin.c | Loading commit data... | |
oacc-plugin.h | Loading commit data... | |
omp.h.in | Loading commit data... | |
omp_lib.f90.in | Loading commit data... | |
omp_lib.h.in | Loading commit data... | |
openacc.f90 | Loading commit data... | |
openacc.h | Loading commit data... | |
openacc_lib.h | Loading commit data... | |
ordered.c | Loading commit data... | |
parallel.c | Loading commit data... | |
priority_queue.c | Loading commit data... | |
priority_queue.h | Loading commit data... | |
sections.c | Loading commit data... | |
secure_getenv.h | Loading commit data... | |
single.c | Loading commit data... | |
splay-tree.c | Loading commit data... | |
splay-tree.h | Loading commit data... | |
target.c | Loading commit data... | |
task.c | Loading commit data... | |
taskloop.c | Loading commit data... | |
team.c | Loading commit data... | |
teams.c | Loading commit data... | |
work.c | Loading commit data... |