* add tensor core support
* avoid memory bank conflict
* fix thread sync & better performance
* better performance
* add schedule test for conv2d
* extend into BatchMatMul
* support config fragment shape and layout using intrinsic
* add TensorCore tutorial
* add int support and fix lint
* address comment
* add 32*16*8 TensorCore test
* fix wmma include logic

Name | Last commit | Last update |
---|---|---|
datatype | | |
llvm | | |
opt | | |
spirv | | |
stackvm | | |
build_common.h | | |
build_module.cc | | |
codegen.cc | | |
codegen_aocl.cc | | |
codegen_c.cc | | |
codegen_c.h | | |
codegen_c_host.cc | | |
codegen_c_host.h | | |
codegen_cuda.cc | | |
codegen_cuda.h | | |
codegen_metal.cc | | |
codegen_metal.h | | |
codegen_opencl.cc | | |
codegen_opencl.h | | |
codegen_opengl.cc | | |
codegen_opengl.h | | |
codegen_source_base.cc | | |
codegen_source_base.h | | |
codegen_vhls.cc | | |
codegen_vhls.h | | |
intrin_rule.cc | | |
intrin_rule.h | | |
intrin_rule_aocl.cc | | |
intrin_rule_cuda.cc | | |
intrin_rule_metal.cc | | |
intrin_rule_opencl.cc | | |
intrin_rule_opengl.cc | | |
intrin_rule_vhls.cc | | |
source_module.cc | | |