Name | Last commit | Last update |
---|---|---|
.. | ||
README.txt | ||
opt_conv_cuda.py | ||
opt_conv_tensorcore.py | ||
opt_gemm.py |
* add tensor core support
* avoid memory bank conflict
* fix thread sync & better performance
* better performance
* add schedule test for conv2d
* extend into BatchMatMul
* support config fragment shape and layout using intrinsic
* add TensorCore tutorial
* add int support and fix lint
* address comment
* add 32*16*8 TensorCore test
* fix wmma include logic