TensorCore Support using Intrinsic (#4136)
* add tensor core support
* avoid memory bank conflict
* fix thread sync & better performance
* better performance
* add schedule test for conv2d
* extend into BatchMatMul
* support config fragment shape and layout using intrinsic
* add TensorCore tutorial
* add int support and fix lint
* address comment
* add 32*16*8 TensorCore test
* fix wmma include logic
src/pass/infer_fragment.cc
0 → 100644
tutorials/optimize/opt_conv_tensorcore.py
0 → 100644