* add tensor core support
* avoid memory bank conflict
* fix thread sync & better performance
* better performance
* add schedule test for conv2d
* extend into BatchMatMul
* support config fragment shape and layout using intrinsic
* add TensorCore tutorial
* add int support and fix lint
* address comment
* add 32*16*8 TensorCore test
* fix wmma include logic

Name | Last commit | Last update |
---|---|---|
api | | |
arithmetic | | |
autotvm | | |
codegen | | |
common | | |
contrib | | |
lang | | |
op | | |
pass | | |
relay | | |
runtime | | |
schedule | | |
README.md | | |