* support CUDA TensorCore sub-byte int data types in auto tensorcore
* add license
* pass cpplint
* fix code review comments
* merge the int4/int1 codegen tutorial into the existing auto tensorcore tutorial
* use master's new API
* disable tuning when CUDA is not enabled
* address CR comment
* do not run the tuning
* fix test failure
* fix cpplint error
* fix bool type reduction bug
* fix an index bug; fix the returned bytes value of int1/int4/uint4
* fix typo

Name |
---|
literal |
codegen_aocl.cc |
codegen_c.cc |
codegen_c.h |
codegen_c_host.cc |
codegen_c_host.h |
codegen_cuda.cc |
codegen_cuda.h |
codegen_metal.cc |
codegen_metal.h |
codegen_opencl.cc |
codegen_opencl.h |
codegen_opengl.cc |
codegen_opengl.h |
codegen_source_base.cc |
codegen_source_base.h |
codegen_vhls.cc |
codegen_vhls.h |
intrin_rule_aocl.cc |
intrin_rule_cuda.cc |
intrin_rule_metal.cc |
intrin_rule_opencl.cc |
intrin_rule_opengl.cc |
intrin_rule_vhls.cc |
source_module.cc |