- 08 Nov, 2019 2 commits
-
-
Fix android_rpc compilation failure
peike committed -
* fix_winograd_cuda_kernel_size * add unit test
Cody Hao Yu committed
-
- 07 Nov, 2019 2 commits
-
-
Jon Soifer committed
-
* Batch matmul tuning running but with errors. * Default x86 schedule as good as before. * Code Cleanup * Remove unused argument. * improved template documentation. * Silly lint fix * Removed leftover comment. * Moved cfg declaration to schedule for batch_matmul * Moved x86 dense cfg declaration to schedule. * lint fix * Removed duplicate cfg declaration in dense. * Reverted changes to dense.
Josh Fromm committed
-
- 06 Nov, 2019 4 commits
-
-
* fix winograd * move get padding after kernel transform
Cody Hao Yu committed -
* [Contrib] Fix error message at callback_get_section_size() * Trigger notification
Neo Chien committed -
* Update TensorUtil.scala * Update test_vta_insn.py
Liangfu Chen committed -
Tianqi Chen committed
-
- 05 Nov, 2019 2 commits
-
-
zhuochen committed
-
LLVM 8 will crash when loading the bitcodes. This is a runtime check, because the file is compiled in even when USE_ROCM OFF is set in the configuration, as long as ROCM is installed in the default location. Fixes: #4087
Thomas Viehmann committed
-
- 04 Nov, 2019 4 commits
-
-
Tianqi Chen committed
-
* Add StopGradient. Add batch_dims attr to ignore list for GatherV2 * Trigger CI
Trevor Morris committed -
Kim committed
-
XFPlus committed
-
- 02 Nov, 2019 2 commits
-
-
* [VTA] Performance optimization: remove unnecessary contiguous memory use. Issue: The uop queue maintains a cache vector used to copy uop data into contiguous DRAM memory for FPGA/simulator use, but this cache vector is not cleared after the FPGA/simulator core runs. In the ResNet18 case, printing the cache size in the UopQueue::ReadBarrier function shows it growing continuously, which causes useless data copies and unnecessary contiguous DRAM allocation. Analysis: The cache_ vector is not cleared when uop_queue_.Reset() is called. Solution: Override the BaseQueue Reset function in UopQueue and add logic to clear cache_. * Address review comments, remove spacing.
Hua Jiang committed -
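The fix described in the VTA commit above can be pictured with a minimal sketch. The class names `BaseQueueSketch` and `UopQueueSketch`, the `Stage` method, and the element type are hypothetical stand-ins, not the actual VTA runtime code; only the idea (override `Reset` so the staging cache is cleared too) comes from the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the base queue: it only tracks pending entries.
template <typename T>
class BaseQueueSketch {
 public:
  virtual ~BaseQueueSketch() = default;
  // The base Reset knows nothing about caches owned by derived classes.
  virtual void Reset() { pending_.clear(); }
  std::size_t pending_size() const { return pending_.size(); }

 protected:
  std::vector<T> pending_;
};

// Hypothetical stand-in for the uop queue with a contiguous staging cache.
class UopQueueSketch : public BaseQueueSketch<std::uint32_t> {
 public:
  void Push(std::uint32_t uop) { pending_.push_back(uop); }

  // Append pending entries to the cache (models the copy into contiguous memory).
  void Stage() { cache_.insert(cache_.end(), pending_.begin(), pending_.end()); }

  // Override Reset so the cache is cleared together with the pending entries.
  // Without this override, cache_ would keep growing across runs.
  void Reset() override {
    BaseQueueSketch<std::uint32_t>::Reset();
    cache_.clear();
  }

  std::size_t cache_size() const { return cache_.size(); }

 private:
  std::vector<std::uint32_t> cache_;
};
```

In this sketch, calling `Push`, `Stage`, and then `Reset` leaves both the pending list and the cache empty, so the next run starts from a cache of size zero instead of accumulating stale entries.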
* Support reshape for dynamic shape in tf converter * Only allow reshape directly after shape function for symbolic input shape * Fix lint
Yao Wang committed
-
- 01 Nov, 2019 7 commits
-
-
* [NODE][REFACTOR] Rename IRFunctor->NodeFunctor, use function pointer for dispatching. Previously we used std::function for the functor dispatching. It introduces additional overhead and problems during DLL destruction (of std::function). This PR changes the std::function to function pointers. This changes the restrictions around set_dispatch a bit, in ways that we can work around, but improves general efficiency by removing one level of indirection in std::function. We also no longer need special macros to register functions to the Functor.
Tianqi Chen committed -
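As a rough illustration of the dispatching approach the NodeFunctor commit above describes, the sketch below keeps a table of plain function pointers indexed by a per-type index. All names here (`NodeSketch`, `AddNodeSketch`, `MulNodeSketch`, `NameFunctorSketch`, `kTypeIndex`, `MakeNameFunctor`) are hypothetical and chosen for illustration; this is not the TVM implementation or its API.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical node hierarchy; each concrete type carries a fixed index.
struct NodeSketch {
  virtual ~NodeSketch() = default;
  virtual std::size_t type_index() const = 0;
};

struct AddNodeSketch : NodeSketch {
  static constexpr std::size_t kTypeIndex = 0;
  std::size_t type_index() const override { return kTypeIndex; }
};

struct MulNodeSketch : NodeSketch {
  static constexpr std::size_t kTypeIndex = 1;
  std::size_t type_index() const override { return kTypeIndex; }
};

// Functor that dispatches through a table of raw function pointers
// instead of std::function objects.
class NameFunctorSketch {
 public:
  using FPointer = std::string (*)(const NodeSketch&);

  // Register a handler for TNode; grows the table on demand.
  template <typename TNode>
  NameFunctorSketch& set_dispatch(FPointer f) {
    const std::size_t idx = TNode::kTypeIndex;
    if (table_.size() <= idx) table_.resize(idx + 1, nullptr);
    table_[idx] = f;
    return *this;
  }

  bool can_dispatch(const NodeSketch& n) const {
    const std::size_t idx = n.type_index();
    return idx < table_.size() && table_[idx] != nullptr;
  }

  // One table lookup and one indirect call per dispatch.
  std::string operator()(const NodeSketch& n) const {
    assert(can_dispatch(n));
    return table_[n.type_index()](n);
  }

 private:
  std::vector<FPointer> table_;
};

// Captureless lambdas convert implicitly to function pointers.
inline NameFunctorSketch MakeNameFunctor() {
  NameFunctorSketch f;
  f.set_dispatch<AddNodeSketch>([](const NodeSketch&) { return std::string("add"); });
  f.set_dispatch<MulNodeSketch>([](const NodeSketch&) { return std::string("mul"); });
  return f;
}
```

One consequence of this design in C++ is that only captureless lambdas or free functions can be registered, since a capturing lambda does not convert to a function pointer. That may be the kind of restriction the commit message alludes to, though the message itself does not spell it out.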
Jared Roesch committed
-
Wei Chen committed
-
* [Relay][Pass] Avoid FoldConstant folding some ops * rename
Wuwei Lin committed -
Kim committed
-
Sergei Grechanik committed
-
Signed-off-by: qinqiuping <autumnqin@126.com>
autumnqin committed
-
- 31 Oct, 2019 6 commits
-
-
Tianqi Chen committed
-
Tianqi Chen committed
-
* [CI] Update the ci-gpu to use cuda10 * [CI] Enforce tensorcore gpu for unittest
Tianqi Chen committed -
KoolKoffee committed
-
* [CI] Move gpu docker binary to cuda10 * Fix the gcn tutorial
Tianqi Chen committed -
* [Doc] Update ANTLR instruction * Update docs/install/from_source.rst
Wei Chen committed
-
- 30 Oct, 2019 9 commits
-
-
Wei Chen committed
-
* [CI] use llvm9 for the gpu tests * Update Docker script to support new nvidia docker
Tianqi Chen committed -
* Add support for Any op * Support ONNX frontend * Add doc * Add to relay docs * Dummy change to retrigger CI
Jon Soifer committed -
* Added slice v10 * Added constantofshape operation and small refactor. * Finished one_hot implementation. * Reshape working across all bert layers. * Fixed constantofshape and removed code duplication. * onnx model fully ingested. * Working on improving onnx tests. * Changed onnx testing to use onnxruntime instead of caffe2, also formatted. * Add arbitrary output nodes to onnx frontend. * Added v6 tiling for bert squad 8 support. * Small syntax fixes * Reduced code duplication in split opset versions. * Added batch matmul test * Added unstack split testing. * Added onehot test, needs a little cleanup probably. * Replaced deprecated constant fill with constantofshape and updated tests accordingly. * Added tests for new opset version of slice and tile. * lint clean up * Lint fixes * Changed onnx dependency * Went back to caffe2 runtime for CI integration. * Rebase and small typo/syntax changes. * Added hard casting of onehot attributes to int.
Josh Fromm committed -
Tianqi Chen committed
-
Sergei Grechanik committed
-
* [QNN] Improving Dense lowering. * - Moving get_shape method to util - Finalizing the test cases and the code structure for optimized dense computation. * - Fixing cpplint. * - Addressing review comments. * - Renaming the variables correctly. * - Renaming the variables correctly.
shoubhik committed -
Bohan Hou committed
-
* Add Python type functor and tests * Lint roller
Logan Weber committed
-
- 29 Oct, 2019 2 commits