When the workgroup size was not set, LLVM would use too many registers for kernels launched with many threads, which resulted in "invalid ISA" errors. We now set the maximum workgroup size to the maximum threads per block reported by the device API. A possible follow-up is to allow configurations that launch fewer threads at runtime to use more registers.
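
As a rough illustration of the fix described above, the sketch below builds the value that would be attached to a kernel function. It assumes the target is LLVM's AMDGPU backend, whose `amdgpu-flat-work-group-size` function attribute takes a `"min,max"` string; the source does not name the backend or the attribute. The helper name `FlatWorkGroupSizeAttr` is hypothetical and is not taken from this repository.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not from the source): formats the "min,max" value
// for LLVM's "amdgpu-flat-work-group-size" function attribute. The minimum
// is fixed at 1 and the maximum is the device's threads-per-block limit.
std::string FlatWorkGroupSizeAttr(int max_threads_per_block) {
  assert(max_threads_per_block >= 1 && "device must allow at least one thread");
  return "1," + std::to_string(max_threads_per_block);
}

// Assumed usage inside the code generator, given an llvm::Function* fn and
// a limit queried from the device API:
//   fn->addFnAttr("amdgpu-flat-work-group-size", FlatWorkGroupSizeAttr(limit));
```

Bounding the workgroup size tells the compiler how many threads may share the register file, so it can keep per-thread register use within what the largest launch allows.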
Name |
---|
datatype |
literal |
llvm |
opt |
spirv |
stackvm |
build_common.h |
build_module.cc |
codegen.cc |
codegen_aocl.cc |
codegen_c.cc |
codegen_c.h |
codegen_c_host.cc |
codegen_c_host.h |
codegen_cuda.cc |
codegen_cuda.h |
codegen_metal.cc |
codegen_metal.h |
codegen_opencl.cc |
codegen_opencl.h |
codegen_opengl.cc |
codegen_opengl.h |
codegen_source_base.cc |
codegen_source_base.h |
codegen_vhls.cc |
codegen_vhls.h |
intrin_rule.cc |
intrin_rule.h |
intrin_rule_aocl.cc |
intrin_rule_cuda.cc |
intrin_rule_metal.cc |
intrin_rule_opencl.cc |
intrin_rule_opengl.cc |
intrin_rule_vhls.cc |
source_module.cc |