* Add Auto TensorCore Unit Test
* Rebase to tvm master branch & Add auto tensor core
* Code Refine
* Add tensor core switch by pragma
* Add pragma in tensor core example code
* Get real tile size to replace hard coded 16
* support more than 2 dimensions (e.g. batchmatmul) for buffer bind scope
* support batch matmul
* Move cuda env check to tensor_core.cc
* Code refine for tensor_core.cc
* Refine comments
* Some refinements of code and comment
* Update TensorCore UT to pass the CPU test
* remove redundant code
* matmul's storage align for different layout
* Add support for different position of type cast
* Add formal tutorial for auto tensorcore codegen
* move tensorcore check up to tutorial code
* code and doc refine
* comment out tune_and_evaluate in tutorial
* fix cpplint error
Name |
---|
arg_binder.cc |
arg_binder.h |
bound_checker.cc |
combine_context_call.cc |
coproc_sync.cc |
detect_device.cc |
hoist_if_then_else.cc |
infer_fragment.cc |
inject_copy_intrin.cc |
inject_double_buffer.cc |
inject_prefetch.cc |
inject_virtual_thread.cc |
inline.cc |
ir_deep_compare.cc |
ir_mutator.cc |
ir_util.cc |
ir_util.h |
ir_visitor.cc |
lift_attr_scope.cc |
loop_partition.cc |
lower_custom_datatypes.cc |
lower_intrin.cc |
lower_thread_allreduce.cc |
lower_tvm_builtin.cc |
lower_warp_memory.cc |
make_api.cc |
narrow_channel_access.cc |
remap_thread_axis.cc |
remove_no_op.cc |
rewrite_unsafe_select.cc |
simple_passes.cc |
split_host_device.cc |
split_pipeline.cc |
ssa.cc |
storage_access.cc |
storage_access.h |
storage_flatten.cc |
storage_rewrite.cc |
storage_sync.cc |
tensor_core.cc |
unroll_loop.cc |
vectorize_loop.cc |
verify_compact_buffer.cc |
verify_gpu_code.cc |
verify_memory.cc |