- 16 Sep, 2019 5 commits
-
-
* improve conv2d_transpose x86 performance by reusing conv2d schedule * parallelize across batches to make large-batch conv2d and conv2d_transpose faster * improve doc for autotvm.task.space.FallbackConfigEntity.fallback_with_reference_log * add fallback schedule for schedule_conv2d_transpose_nchw_cuda * fix pylint * fix pylint * unify conv2d_transpose declaration in topi.nn and topi.x86
Yuwei Hu committed -
* Fix graph tuner benchmarking layout transform * Add test
Yao Wang committed -
* [tvm][codegen] Make buffer auto broadcast independent of the order of the input args * fix indent
Zhi committed -
* [TOPI] operator support: logical_and, logical_or, logical_not * [TOPI] operator support: logical_and, logical_or, logical_not * [TOPI] fix test cases for operator support: logical_and, logical_or, logical_not * [TOPI] fix test cases for operator support: logical_not
Neo Chien committed -
* QNNLegalize for conv2d * [QNN] Legalization for Intel x86 QNN Conv2D
Animesh Jain committed
-
- 15 Sep, 2019 3 commits
-
-
* Enable miopen transpose convolution and fp16 support * linter
Peter Yeh committed -
* Add support for SquaredDifference and StopGradient; minor fix in BatchMatMul * Remove stopgradient change * Resolve PR comment * Dummy change to retrigger CI * dummy change to retrigger CI
Jon Soifer committed -
* Refine policies for define_split - Rename policy "all" to "factors" - Add policy "verbose" and "power2" * Refine search space * add doc
Cody Hao Yu committed
-
- 14 Sep, 2019 1 commit
-
-
Junru Shao committed
-
- 13 Sep, 2019 6 commits
-
-
2) Add EQ support in the loop partition and add test for the same 3) Change typo truc to trunc
Umang Yadav committed -
Andrew Tulloch committed
-
Issue: The RPC path was changed from "pynq_rpc" to "vta_rpc", but the related documentation still uses the old information. Solution: Update the RPC path information.
Hua Jiang committed -
Jianyu Huang committed
-
Animesh Jain committed
-
* Fix int8x4 vectorize * Fix gpu shared/local memory accumulate * Add test_shared_memory for int8x4 * Adjust test format * Fix cpplint
noituIover committed
-
- 12 Sep, 2019 4 commits
-
-
* [QNN] Convolution 2D Implementation. Rebasing. Empty commit. Clang-format styling. * Reformatting code. * Fixing lint issues.
shoubhik committed -
* Support cuBLAS BatchMatMul * Add test and check target name
Jon Soifer committed -
This is an alternative implementation of a subset of the TVM runtime API (and graph runtime) that focuses entirely on reducing code size, at the expense of functionality (no tvm.extern(..) calls via PackedFunc, CPU only, etc). It might be worth incrementally expanding the surface area if there's interest. The motivation for this work was to see what the minimal useful subset of the TVM runtime is. This is relevant for heavily code-size-constrained applications, e.g. in embedded/mobile. The current runtime is more like O(100KiB) or so, so this might be compelling for some users. The smaller surface area for auditing might make this relevant for https://github.com/dmlc/tvm/issues/3159, or the use cases I was thinking about in https://github.com/dmlc/tvm/issues/2523#issuecomment-459165815 re: the Rust runtime. The symbols in the tvm::minimalruntime space (i.e. excluding std:: and picojson::) are about 5KiB, so I think there's a bunch of room here (i.e. we could replace picojson:: with [`jsmn`](https://zserge.com/jsmn.html) or something, and we could replace more of the `std::unordered_map` usage, etc. with custom primitives as well, similar to the `DynArray`).
Andrew Tulloch committed -
* Module refactor * Add load module * Add support for idempotent import * Tweak load paths * Move path around * Expose C++ import functions in Python * Fix import * Add doc string * Fix * Fix lint * Fix lint * Fix test failure * Add type solver * Fix lint
Jared Roesch committed
-
- 11 Sep, 2019 4 commits
-
-
Lianmin Zheng committed
-
* support LLVM trunk * guard with USE_LLVM in if condition for c++14 * GREATER_EQUAL -> GREATER * [Arm] parallel batch axis
Yizhi Liu committed -
Zhao Wu committed
-
雾雨魔理沙 committed
-
- 10 Sep, 2019 2 commits
-
-
* [Relay][Frontend][Keras] Fix ReLU in Keras Converter missed the case * [Relay][Frontend][Keras] Add test case for ReLU in Keras Converter missed the case * [Relay][Frontend][Keras] Add test case for ReLU in Keras Converter missed the case
Neo Chien committed -
Pratyush Patel committed
-
- 09 Sep, 2019 4 commits
-
-
* save * save
雾雨魔理沙 committed -
Luis Vega committed
-
* numpy compatible type inference * update * try to fix * fix * try to fix * fix lint * Update nn.h * cast to int32 * try to fix * fix again * retrigger ci
Xingjian Shi committed -
* add more ops * stop vectorization for erf * x * cleanup * fix * add whitelist for vectorizable intrin * add tf converter * fix dense * fix * add missing intrin * fix mxnet frontend * fix nvptx
Haichen Shen committed
-
- 08 Sep, 2019 2 commits
- 07 Sep, 2019 7 commits
-
-
* fix cmake for mac os * rename
Haichen Shen committed -
* support LLVM trunk * guard with USE_LLVM in if condition for c++14 * GREATER_EQUAL -> GREATER
Yizhi Liu committed -
noituIover committed
-
fix lld
Peter Yeh committed -
Haichen Shen committed
-
* [VTA] Support TLPP in function simulator. Issue: Currently the VTA function simulator only executes instructions serially, so the dependency logic of the runtime ISA, which is used for task-level pipeline parallelism (TLPP), cannot be verified by the function simulator. Solution: Make the simulator driver multi-threaded and support TLPP. Benefit: TLPP support in the VTA function simulator makes testing, debugging, and changing VTA logic easier. Replace boost lockfree queue. Add configure control to enable or disable simulator TLPP. Change code style to Google style. Wrap queue read/write and sync logic to simplify function calls. Add some comments. Remove MT logic, change to single-thread mode. Address review comments. Change code style to match Google code style and add comments. Add cmake macro to enable/disable simulator TLPP logic. Submodule update. Correct file name mentioned in comments. * remove USE_VTA_FSIM_TLPP.
Hua Jiang committed -
* update lint * lint fixed * lint updated * lint fixed * lint fixed * lint fixed * updates * add intel graphics as a package * remove print info * depthwise conv2d schedule added for intel graphics * asdf * fix lint * fix lint * fix ci * add channels
Leyuan Wang committed
-
- 06 Sep, 2019 2 commits