- 12 Nov, 2019 5 commits
-
-
* Add test for the qnn_add operator
  The tests use the fake quant approach, so the tensors remain in float32 until the tf session. The test data has to be passed in uint8 because of how the tflite/tvm comparison works. Abs tolerance up to 1 is allowed for the qnn results. For now input_stats are hardcoded, assuming the tests for the other qnn ops will pass the input data in the same range.
* Separate qnn uint8 test function from the fp32 elemwise tests
  Isolate qnn uint8 elemwise tests
  Remove blank lines
Ina Dobreva committed -
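The tolerance rule in the commit message above (uint8 results may differ by at most 1) can be sketched as follows. This is a minimal illustration in plain Python, not the test code from the commit: the function name `within_abs_tolerance` and the sample values are hypothetical.

```python
def within_abs_tolerance(expected, actual, atol=1):
    """Return True if both sequences have equal length and every pair of
    values differs by at most atol."""
    if len(expected) != len(actual):
        return False
    return all(abs(int(e) - int(a)) <= atol for e, a in zip(expected, actual))


# Hypothetical uint8 outputs from two runtimes (made-up values for illustration).
reference_out = [0, 17, 128, 254, 255]
candidate_out = [1, 17, 127, 255, 255]

# Element-wise differences are 1, 0, 1, 1, 0, so the check passes with atol=1.
print(within_abs_tolerance(reference_out, candidate_out))
```

An absolute tolerance of 1 allows for off-by-one rounding differences between two quantized implementations while still catching larger mismatches.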
Haichen Shen committed
-
* Relay Keras frontend batch_norm op params not handled well
* Add unit test for Relay Frontend Keras batch_norm
Xingyu Zhou committed -
* Fix incorrect call to Unicode Win32
* Removed inet_pton call. Win32 already has it.
jmorrill committed -
Neo Chien committed
-
- 11 Nov, 2019 7 commits
-
-
* Add shape functions
* Fix get_const_tuple
* Fix cpplint
* Fix pylint
* Fix pylint
* rebase and fix
* Check Any for infer type
* Fix expand_dim shape func for zero rank input
* Fix pooling infer type
* Address comment
* Register layout transform attr
Yao Wang committed -
* [TF][Relay][Op] Pass module when infer shape
* Fix lint
* Improve style
* Add test
Wei Chen committed -
Previously runtime::Module was supported using shared_ptr. This PR refactors the codebase to use the Object protocol. It will open doors to allow easier interoperation between Object containers and modules in the future.
Tianqi Chen committed -
the test case was removed in #4181 for some reason @tqchen @soiferj @zhiics
Yong Wu committed -
* Fix tf reshape
* Fix test
* Fix pylint
* Fix pylint
Yao Wang committed -
* Add pass manager tutorial
* fix some examples
* retrigger ci
* Update tutorials/dev/relay_pass_infra.py
  Co-Authored-By: 雾雨魔理沙 <lolisa@marisa.moe>
* Add ToANormalForm link
Zhi committed -
Animesh Jain committed
-
- 10 Nov, 2019 5 commits
- 09 Nov, 2019 1 commit
-
-
* Add Auto TensorCore TensorCore Unit Test
* Rebase to tvm master branch & Add auto tensor core
* Code Refine
* Add tensor core switch by pragma
* Add pragma in tensor core example code
* Get real tile size to replace hard coded 16
* support more than 2 dimensions (e.g. batchmatmul) for buffer bind scope
* support batch matmul
* Move cuda env check to tensor_core.cc
* Code refine for tensor_core.cc
* Refine comments
* Some refinements of code and comment
* Update TensorCore UT to pass the CPU test
* remove redundant code
* matmul's storage align for different layout
* Add support for different position of type cast
* Add formal tutorial for auto tensorcore codegen
* move tensorcore check up to tutorial code
* code and doc refine
* comment out tune_and_evaluate in tutorial
* fix cpplint error
Minmin Sun (孙敏敏) committed
-
- 08 Nov, 2019 2 commits
-
-
fix the problem that android_rpc compilation failed
peike committed -
* fix_winograd_cuda_kernel_size
* add unit test
Cody Hao Yu committed
-
- 07 Nov, 2019 2 commits
-
-
Jon Soifer committed
-
* Batch matmul tuning running but with errors.
* Default x86 schedule as good as before.
* Code Cleanup
* Remove unused argument.
* improved template documentation.
* Silly lint fix
* Removed leftover comment.
* Moved cfg declaration to schedule for batch_matmul
* Moved x86 dense cfg declaration to schedule.
* lint fix
* Removed duplicate cfg declaration in dense.
* Reverted changes to dense.
Josh Fromm committed
-
- 06 Nov, 2019 4 commits
-
-
* fix winograd
* move get padding after kernel transform
Cody Hao Yu committed -
* [Contrib] Fix error message at callback_get_section_size()
* Trigger notification
Neo Chien committed -
* Update TensorUtil.scala
* Update test_vta_insn.py
Liangfu Chen committed -
Tianqi Chen committed
-
- 05 Nov, 2019 2 commits
-
-
zhuochen committed
-
LLVM 8 will crash when loading the bitcodes.
This is a runtime check, as the file will be compiled in even when USE_ROCM OFF is used in the configuration if ROCM is installed in the default location.
Fixes: #4087
Thomas Viehmann committed
-
- 04 Nov, 2019 4 commits
-
-
Tianqi Chen committed
-
* Add StopGradient. Add batch_dims attr to ignore list for GatherV2
* Trigger CI
Trevor Morris committed -
Kim committed
-
XFPlus committed
-
- 02 Nov, 2019 2 commits
-
-
* [VTA] Performance optimization: remove unnecessary contiguous memory use.
  Issue: The uop queue maintains a cache vector to copy uop data into contiguous DRAM memory for FPGA/simulator use, but this cache vector is not cleared after the FPGA/simulator core runs. In the ResNet18 case, printing the cache size in the UopQueue::ReadBarrier function shows that the cache size keeps increasing, which causes useless data copies and unnecessary contiguous DRAM memory allocation.
  Analysis: The issue is caused by not clearing the cache_ vector when calling uop_queue_.Reset().
  Solution: Override the BaseQueue Reset function in UopQueue and add logic to clear cache_.
* Address review comments, remove spacing.
Hua Jiang committed -
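The issue, analysis, and solution in the commit message above can be illustrated with a small model. This is a hedged sketch in Python, not the VTA runtime code (which is C++): the classes `BaseQueue`, `LeakyUopQueue`, and `FixedUopQueue` and the helper `cache_sizes` are hypothetical stand-ins, and a plain list stands in for the contiguous DRAM staging cache.

```python
class BaseQueue:
    """Toy base queue: reset() only empties the queued items."""

    def __init__(self):
        self.items = []

    def push(self, item):
        self.items.append(item)

    def reset(self):
        self.items.clear()


class LeakyUopQueue(BaseQueue):
    """Inherits reset() unchanged, so the staging cache is never emptied."""

    def __init__(self):
        super().__init__()
        self.cache = []  # stand-in for the contiguous staging buffer

    def read_barrier(self):
        # Stage every queued item into the cache and report the cache size.
        self.cache.extend(self.items)
        return len(self.cache)


class FixedUopQueue(LeakyUopQueue):
    """Overrides reset() so the cache is cleared along with the items."""

    def reset(self):
        super().reset()
        self.cache.clear()


def cache_sizes(queue, runs=3, uops_per_run=4):
    """Push uops, hit the barrier, reset; record the cache size per run."""
    sizes = []
    for _ in range(runs):
        for uop in range(uops_per_run):
            queue.push(uop)
        sizes.append(queue.read_barrier())
        queue.reset()
    return sizes


print(cache_sizes(LeakyUopQueue()))  # [4, 8, 12]: cache grows every run
print(cache_sizes(FixedUopQueue()))  # [4, 4, 4]: cache is cleared on reset
```

In this model the inherited reset leaves stale entries in the cache, so each run copies more data than it needs; overriding reset in the subclass keeps the cache size bounded by the work of a single run.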
* Support reshape for dynamic shape in tf converter
* Only allow reshape directly after shape function for symbolic input shape
* Fix lint
Yao Wang committed
-
- 01 Nov, 2019 6 commits
-
-
* [NODE][REFACTOR] Rename IRFunctor->NodeFunctor, use function pointer for dispatching.
  Previously we used std::function for the functor dispatching. It introduces additional overhead and problems during dll destruction (of std::function). This PR changes the std::function to function pointers. This change adds a few restrictions around set_dispatch that we can work around, but it will improve the general efficiency by removing one level of indirection in the std::function. We also no longer need special macros to register functions to the Functor.
Tianqi Chen committed -
Jared Roesch committed
-
Wei Chen committed
-
* [Relay][Pass] Avoid FoldConstant folding some ops
* rename
Wuwei Lin committed -
Kim committed
-
Sergei Grechanik committed
-