- 11 Nov, 2019 6 commits
* [TF][Relay][Op] Pass module when infer shape
* Fix lint
* Improve style
* Add test
Wei Chen committed
Previously runtime::Module was implemented using shared_ptr. This PR refactors the codebase to use the Object protocol instead. It opens the door to easier interoperation between Object containers and Module in the future.
Tianqi Chen committed
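For readers unfamiliar with the pattern, here is a minimal, self-contained sketch of what moving from shared_ptr ownership to an Object-protocol-style reference class looks like. All names are simplified stand-ins for illustration, not TVM's actual declarations (the real Object protocol also carries type indices, deleters, and more):

```cpp
#include <atomic>
#include <iostream>
#include <string>
#include <utility>

// Simplified stand-in for TVM's Object base: intrusive reference counting.
class Object {
 public:
  virtual ~Object() = default;
  void IncRef() { ref_count_.fetch_add(1, std::memory_order_relaxed); }
  void DecRef() {
    if (ref_count_.fetch_sub(1, std::memory_order_acq_rel) == 1) delete this;
  }
 private:
  std::atomic<int> ref_count_{0};
};

// Simplified stand-in for ObjectRef: a handle that shares ownership.
class ObjectRef {
 public:
  ObjectRef() = default;
  explicit ObjectRef(Object* data) : data_(data) { if (data_) data_->IncRef(); }
  ObjectRef(const ObjectRef& other) : data_(other.data_) { if (data_) data_->IncRef(); }
  ObjectRef& operator=(const ObjectRef& other) {
    if (this != &other) {
      if (other.data_) other.data_->IncRef();
      if (data_) data_->DecRef();
      data_ = other.data_;
    }
    return *this;
  }
  ~ObjectRef() { if (data_) data_->DecRef(); }
 protected:
  Object* data_{nullptr};
};

// The module implementation now derives from Object instead of being
// owned through std::shared_ptr.
class ModuleNode : public Object {
 public:
  explicit ModuleNode(std::string name) : name_(std::move(name)) {}
  const std::string& name() const { return name_; }
 private:
  std::string name_;
};

// The user-facing handle is a reference class over ModuleNode.
class Module : public ObjectRef {
 public:
  explicit Module(ModuleNode* node) : ObjectRef(node) {}
  ModuleNode* operator->() const { return static_cast<ModuleNode*>(data_); }
};

int main() {
  Module mod(new ModuleNode("llvm"));
  Module copy = mod;  // shares ownership via the intrusive ref count
  std::cout << copy->name() << "\n";
  return 0;
}
```

The practical benefit named in the commit is uniformity: once Module is a reference over an Object, it can flow through the same containers and dispatch machinery as every other object.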
The test case was removed in #4181 for some reason. @tqchen @soiferj @zhiics
Yong Wu committed
* Fix tf reshape
* Fix test
* Fix pylint
* Fix pylint
Yao Wang committed
* Add pass manager tutorial
* Fix some examples
* Retrigger ci
* Update tutorials/dev/relay_pass_infra.py
Co-Authored-By: 雾雨魔理沙 <lolisa@marisa.moe>
* Add ToANormalForm link
Zhi committed
Animesh Jain committed
- 10 Nov, 2019 5 commits
- 09 Nov, 2019 1 commit
* Add Auto TensorCore Unit Test
* Rebase to tvm master branch & add auto tensor core
* Code refine
* Add tensor core switch by pragma
* Add pragma in tensor core example code
* Get real tile size to replace hard-coded 16
* Support more than 2 dimensions (e.g. batch matmul) for buffer bind scope
* Support batch matmul
* Move cuda env check to tensor_core.cc
* Code refine for tensor_core.cc
* Refine comments
* Some refinements of code and comments
* Update TensorCore UT to pass the CPU test
* Remove redundant code
* Matmul's storage align for different layouts
* Add support for different positions of type cast
* Add formal tutorial for auto tensorcore codegen
* Move tensorcore check up to tutorial code
* Code and doc refine
* Comment out tune_and_evaluate in tutorial
* Fix cpplint error
Minmin Sun (孙敏敏) committed
- 08 Nov, 2019 2 commits
Fix the android_rpc compilation failure.
peike committed
* fix_winograd_cuda_kernel_size
* Add unit test
Cody Hao Yu committed
- 07 Nov, 2019 2 commits
Jon Soifer committed
* Batch matmul tuning running, but with errors
* Default x86 schedule as good as before
* Code cleanup
* Remove unused argument
* Improved template documentation
* Silly lint fix
* Removed leftover comment
* Moved cfg declaration to schedule for batch_matmul
* Moved x86 dense cfg declaration to schedule
* Lint fix
* Removed duplicate cfg declaration in dense
* Reverted changes to dense
Josh Fromm committed
- 06 Nov, 2019 4 commits
* Fix winograd
* Move get padding after kernel transform
Cody Hao Yu committed
* [Contrib] Fix error message at callback_get_section_size()
* Trigger notification
Neo Chien committed
* Update TensorUtil.scala
* Update test_vta_insn.py
Liangfu Chen committed
Tianqi Chen committed
- 05 Nov, 2019 2 commits
zhuochen committed
LLVM 8 will crash when loading the bitcodes. This has to be a runtime check, because the file is compiled in even when USE_ROCM is OFF in the configuration, if ROCM is installed in the default location. Fixes: #4087
Thomas Viehmann committed
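As a sketch of what a runtime (rather than configure-time) guard like this can look like: the cutoff version, function name, error handling, and bitcode path below are assumptions for illustration; only the rationale (the file is compiled in regardless of USE_ROCM) comes from the commit.

```cpp
#include <cstdlib>
#include <iostream>

// Stand-in for the LLVM version TVM was built against (TVM defines a
// TVM_LLVM_VERSION macro; defaulting it here is just for this sketch).
#ifndef TVM_LLVM_VERSION
#define TVM_LLVM_VERSION 80  // 80 == LLVM 8.0
#endif

// The ROCm codegen file is compiled in even with USE_ROCM=OFF (when ROCm
// sits in its default location), so the guard must fire at runtime, when
// the ROCm path is actually taken -- not at configure time.
void LoadRocmBitcode(const char* path) {
  if (TVM_LLVM_VERSION < 90) {  // assumed cutoff: LLVM 8 crashes, per the commit
    std::cerr << "Loading ROCm bitcodes crashes LLVM 8; "
                 "please build TVM against a newer LLVM.\n";
    std::exit(1);
  }
  std::cout << "loading bitcode from " << path << "\n";
  // ... real bitcode loading would go here ...
}

int main() {
  LoadRocmBitcode("/opt/rocm/lib/ocml.amdgcn.bc");  // illustrative path
  return 0;
}
```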
- 04 Nov, 2019 4 commits
Tianqi Chen committed
* Add StopGradient. Add batch_dims attr to ignore list for GatherV2
* Trigger CI
Trevor Morris committed
Kim committed
XFPlus committed
- 02 Nov, 2019 2 commits
* [VTA] Performance optimization: remove unnecessary contiguous memory use.
Issue: Uop maintains a cache vector used to copy uop data into contiguous DRAM memory for FPGA/simulator use, but this cache vector does not get cleared after the FPGA/simulator core runs. In the Resnet18 case, if we print the cache size in the UopQueue::ReadBarrier function, we can see it keep increasing, causing useless data copies and unnecessary contiguous DRAM memory allocation.
Analysis: The issue is caused by not clearing the cache_ vector in uop_queue_.Reset().
Solution: Override the BaseQueue Reset function in UopQueue and add cache_-clearing logic.
* Address review comments; remove spacing.
Hua Jiang committed
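To make the fix concrete, here is a minimal sketch of the pattern the commit describes, with simplified stand-in classes (the real VTA BaseQueue/UopQueue carry far more state than this):

```cpp
#include <cstdint>
#include <iostream>
#include <vector>

// Simplified stand-in for a VTA micro-op record.
struct VTAUop {
  uint32_t dst_idx;
  uint32_t src_idx;
};

// Simplified stand-in for VTA's BaseQueue.
class BaseQueue {
 public:
  virtual ~BaseQueue() = default;
  // Reset the queue state between runs.
  virtual void Reset() { dram_buffer_.clear(); }
 protected:
  std::vector<VTAUop> dram_buffer_;
};

// Simplified stand-in for UopQueue: keeps an extra cache of uops that is
// copied into contiguous DRAM before the core runs.
class UopQueue : public BaseQueue {
 public:
  void Push(const VTAUop& uop) { cache_.push_back(uop); }
  void ReadBarrier() {
    // Copy cached uops into the contiguous DRAM buffer.
    dram_buffer_.insert(dram_buffer_.end(), cache_.begin(), cache_.end());
    std::cout << "cache size: " << cache_.size() << "\n";
  }
  // The fix: override Reset so the cache is cleared along with the base
  // queue state; otherwise cache_ grows run after run.
  void Reset() override {
    BaseQueue::Reset();
    cache_.clear();
  }
 private:
  std::vector<VTAUop> cache_;
};

int main() {
  UopQueue q;
  for (int run = 0; run < 3; ++run) {
    q.Push({0, 1});
    q.ReadBarrier();  // with the fix, prints "cache size: 1" on every run
    q.Reset();
  }
  return 0;
}
```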
* Support reshape for dynamic shape in tf converter
* Only allow reshape directly after shape function for symbolic input shape
* Fix lint
Yao Wang committed
- 01 Nov, 2019 7 commits
* [NODE][REFACTOR] Rename IRFunctor->NodeFunctor, use function pointers for dispatching.
Previously we used std::function for the functor dispatching. It introduces additional overhead and problems during DLL destruction (of std::function). This PR changes the std::function to function pointers. This adds some restrictions around set_dispatch that we can work around, but improves general efficiency by removing one level of indirection from the std::function. We also no longer need special macros to register functions to the Functor.
Tianqi Chen committed
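A minimal sketch of the dispatch pattern this describes, with a simplified node hierarchy and illustrative names (TVM's actual NodeFunctor differs in detail): the functor keeps a table of plain function pointers indexed by a per-type index, so a call is one array load plus an indirect call, with no std::function allocation or DLL-teardown hazards.

```cpp
#include <cstdint>
#include <iostream>
#include <string>
#include <vector>

// Simplified node base: each concrete node type owns a unique type index.
struct Node {
  virtual ~Node() = default;
  virtual uint32_t type_index() const = 0;
};

struct AddNode : Node {
  static constexpr uint32_t kTypeIndex = 0;
  uint32_t type_index() const override { return kTypeIndex; }
};

struct MulNode : Node {
  static constexpr uint32_t kTypeIndex = 1;
  uint32_t type_index() const override { return kTypeIndex; }
};

// Functor that dispatches on type index through plain function pointers.
template <typename R>
class NodeFunctor {
 public:
  using FPtr = R (*)(const Node&);  // plain function pointer, not std::function

  NodeFunctor& set_dispatch(uint32_t type_index, FPtr f) {
    if (type_index >= table_.size()) table_.resize(type_index + 1, nullptr);
    table_[type_index] = f;
    return *this;
  }

  R operator()(const Node& n) const {
    FPtr f = table_[n.type_index()];
    // A registered entry is assumed here; real code would check and error out.
    return f(n);
  }

 private:
  std::vector<FPtr> table_;  // one slot per node type index
};

static std::string PrintAdd(const Node&) { return "add"; }
static std::string PrintMul(const Node&) { return "mul"; }

int main() {
  NodeFunctor<std::string> printer;
  printer.set_dispatch(AddNode::kTypeIndex, PrintAdd)
         .set_dispatch(MulNode::kTypeIndex, PrintMul);
  std::cout << printer(AddNode{}) << " " << printer(MulNode{}) << "\n";
  return 0;
}
```

The restriction the message alludes to is that plain function pointers cannot capture state, so each handler must be a free function or a capture-less lambda.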
Jared Roesch committed
Wei Chen committed
* [Relay][Pass] Avoid FoldConstant folding some ops
* Rename
Wuwei Lin committed
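The entry does not say which ops are excluded, so the deny list below is purely hypothetical; it only illustrates the shape of such a check inside a constant-folding pass:

```cpp
#include <iostream>
#include <set>
#include <string>

// Hypothetical deny list: ops a constant-folding pass should leave alone
// even when all of their arguments are constant.
static const std::set<std::string> kSkipFolding = {"dropout", "device_copy"};

bool ShouldFold(const std::string& op_name, bool all_args_constant) {
  if (!all_args_constant) return false;   // nothing to fold
  return kSkipFolding.count(op_name) == 0;
}

int main() {
  std::cout << ShouldFold("add", true) << " "       // 1: fold
            << ShouldFold("dropout", true) << "\n";  // 0: skip
  return 0;
}
```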
Kim committed
Sergei Grechanik committed
Signed-off-by: qinqiuping <autumnqin@126.com>
autumnqin committed
- 31 Oct, 2019 5 commits
Tianqi Chen committed
Tianqi Chen committed
* [CI] Update the ci-gpu to use cuda10
* [CI] Enforce tensorcore gpu for unittest
Tianqi Chen committed
KoolKoffee committed
* [CI] Move gpu docker binary to cuda10
* Fix the gcn tutorial
Tianqi Chen committed