- 27 Oct, 2019 2 commits
* Add support for attaching params
* Fix types
* Fix test
Jared Roesch committed
* add checkpoint annotation for checkpointing memory optimization
* add alpha-equivalence checkpoint test and fix gradient type issue
* fix build issues
* ignore checkpoint annotation when checking missing gradients
* refactor, fix checkpoint compute for tuple and add tests
Altan Haan committed
- 25 Oct, 2019 2 commits
* save
* lint
雾雨魔理沙 committed
Animesh Jain committed
- 24 Oct, 2019 4 commits
* add tensor core support
* avoid memory bank conflict
* fix thread sync & better performance
* better performance
* add schedule test for conv2d
* extend into BatchMatMul
* support config fragment shape and layout using intrinsic
* add TensorCore tutorial
* add int support and fix lint
* address comment
* add 32*16*8 TensorCore test
* fix wmma include logic
Siyuan Feng committed
Ina Dobreva committed
Zhi committed
This reverts commit 6f9d028b.
Tianqi Chen committed
- 23 Oct, 2019 1 commit
Bjarke Hammersholt Roune committed
- 22 Oct, 2019 2 commits
* [Relay][Frontend][TF] Fix Size operator
* Uncomment tests
Jon Soifer committed
Cody Hao Yu committed
- 21 Oct, 2019 3 commits
* [bugfix][codegen] fix casting bug in llvm codegen
* update example
* retrigger ci
* check llvm version
Zhi committed
This patch adds a multiply operator for quantized tensors. The details of the quantized multiplication are outlined in the code. This builds on pull request 3927 and includes the changes Animesh mentions in the comments on that request. Change-Id: I555715b53d0266a91d5c03dc3dfe8fc31e7ce4e1
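The arithmetic itself is only referenced here, so as a rough guide, below is a minimal sketch of the usual affine-quantization scheme for an elementwise multiply; the function name and types are hypothetical, and the exact details live in the code the commit refers to.
```
// Illustrative sketch of quantized multiplication under the usual affine
// scheme real = scale * (q - zero_point); not necessarily the exact scheme
// implemented by this patch.
#include <algorithm>
#include <cmath>
#include <cstdint>

int8_t QuantizedMul(int8_t qa, float sa, int32_t za,   // lhs value, scale, zero point
                    int8_t qb, float sb, int32_t zb,   // rhs value, scale, zero point
                    float so, int32_t zo) {            // output scale, zero point
  // Dequantize both operands and multiply in real space.
  float real = sa * static_cast<float>(qa - za) * sb * static_cast<float>(qb - zb);
  // Requantize into the output's scale/zero point and saturate to int8.
  int32_t q = static_cast<int32_t>(std::lround(real / so)) + zo;
  return static_cast<int8_t>(std::clamp<int32_t>(q, -128, 127));
}
```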
ekalda committed
* [REFACTOR][NODE][RUNTIME] Move Node to the new Object protocol.
  This PR removes the original node system and makes Node a subclass of Object. This is a major refactor towards a better unified runtime object system.
  List of changes in the refactor:
  - We now hide the data_ field; use Downcast explicitly to get a sub-class object.
  - Removed the node system FFI in python.
  - Removed the node C API; instead use PackedFunc for list and get attrs.
  - Change relay::Op::set_attr_type_key(attr_key_name) to relay::Op::set_attr_type<AttrType>().
    - This change was necessary because of the new Object registration mechanism.
    - Subsequent changes to the op registrations.
  - The change revealed a few previous problems that are now fixed:
    - Patched up a few missing node type registrations.
    - Now we will raise an error if we encounter an object that is not registered.
  - The original node.h and container.h are kept in the same location.
  - Calling convention: kObjectHandle now equals the old kNodeHandle; kNodeHandle is removed.
  - IRFunctor now dispatches on ObjectRef.
  - Update to the new type checking API: is_type and derived_from are replaced by IsInstance.
  - Removed the .hash member function; instead use C++-convention hasher functors.
* Address review comments
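As a rough illustration of the type-checking and downcast changes listed above, here is a sketch against TVM's relay headers of that era; it is not code taken from the PR, and exact includes and namespaces may differ.
```
// Sketch only: how a type check and downcast look after the refactor.
// Old style (removed): e->is_type<CallNode>() / derived_from<ExprNode>(),
// plus direct access to the node payload field.
#include <tvm/relay/expr.h>
#include <tvm/runtime/object.h>

void Inspect(const tvm::relay::Expr& e) {
  // New style: IsInstance<T>() covers both exact and derived type checks.
  if (e->IsInstance<tvm::relay::CallNode>()) {
    // data_ is hidden now; use an explicit Downcast to the sub-class reference.
    tvm::relay::Call call = tvm::runtime::Downcast<tvm::relay::Call>(e);
    (void)call;  // work with the call node here
  }
}
```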
Tianqi Chen committed
- 20 Oct, 2019 1 commit
We think it will reduce confusion about the meaning. https://discuss.tvm.ai/t/discuss-consider-rename-vm-datatype/4339
Wei Chen committed
- 18 Oct, 2019 3 commits
* Add LiftIfThenElse pass
* Add more comments
* Rename and refactor
* Add description for internal data structure
* Rename a test
* Minor change
* Address comments
* Improve update_for
Yao Wang committed
Animesh Jain committed
* [Relay][Frontend][TF] Add tensor array ops
* rename
* delete test
* Move utility function
* Refactor
* fix tensor array ops
* fix test
* fix rebase
* Fix serializer bug
* Improve tf convert name lookup to use prelude api
* Fix lint
* Fix test
Wei Chen committed
- 17 Oct, 2019 2 commits
* [relay][vm] Separate VM runtime with executable
* Address comments
* move ctx back to vm
* make only vm related fields and methods protected
* integrate serialization/deserialization to executable
* create stream
Zhi committed
* [TOPI][x86] Cascade lake support.
* Jenkins test debug 1.
* Testing cascade lake alone.
Animesh Jain committed
- 16 Oct, 2019 3 commits
* [RUNTIME] Refactor object python FFI to new protocol.
  This is a pre-req to bring the Node system under the object protocol. Most of the code reflects the current code in the Node system.
  - Use new instead of init so subclasses can define their own constructors.
  - Allow registration via name, besides type index.
  - Introduce necessary runtime C API functions.
  - Refactored Tensor and Datatype to directly use the constructor.
* address review comments
Tianqi Chen committed
shoubhik committed
* add and fix gradients
* fix linter issues
Altan Haan committed
- 15 Oct, 2019 4 commits
* Fix infer type of kernel in dense.
* - Moving the check of weight being nullptr up, as it is needed in both branches now.
  - Adding a test case for validating that data dtype and kernel dtypes can be different.
* - Fix the dtype check for weight. If the weight is not present, then we will use the data dtype.
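A minimal sketch of the fallback rule in the last bullet, using a hypothetical helper rather than the actual Relay type relation:
```
// Hypothetical helper: pick the dtype used for the weight check in dense.
// Per the commit message, when the weight is not present the data dtype is used.
#include <optional>
#include <string>

std::string WeightCheckDtype(const std::string& data_dtype,
                             const std::optional<std::string>& weight_dtype) {
  return weight_dtype.has_value() ? *weight_dtype : data_dtype;
}
```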
shoubhik committed
* [Relay][AlterOpLayout] NHWC to NCHWc pad operator.
* Fixing culprit.
* Flaky test 1.
* Flaky test 2.
Animesh Jain committed
* [RUNTIME] Introduce new object protocol.
  This PR introduces a new object protocol to unify the node and object. We also updated the existing runtime::vm code to make use of the new system. The update to the node will be done in a follow-up PR.
  Other changes:
  - Remove object-related code in the json serializer, as that code logic was not complete and we have a separate serializer for the VM; we can revisit later.
* address review comment
* Fix the child slot logic
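The last bullet mentions the child slot logic. The toy model below illustrates the general idea behind such an object protocol, where each type reserves a contiguous index range for its descendants so a subtype check reduces to a range test; it is illustrative only, not TVM's actual implementation.
```
// Toy model of index-range subtype checking (illustrative only).
#include <cassert>
#include <cstdint>

struct TypeInfo {
  uint32_t index;        // type index assigned during registration
  uint32_t child_slots;  // indices reserved for (transitive) children
};

// True if `child` lies inside `parent`'s reserved index range.
bool DerivedFrom(const TypeInfo& child, const TypeInfo& parent) {
  return child.index >= parent.index &&
         child.index <= parent.index + parent.child_slots;
}

int main() {
  TypeInfo object{0, 100};  // root, reserves slots for 100 descendants
  TypeInfo node{1, 50};     // registered as a child of object
  TypeInfo tensor{60, 5};   // another child of object, outside node's range
  assert(DerivedFrom(node, object));
  assert(DerivedFrom(tensor, object));
  assert(!DerivedFrom(tensor, node));
  return 0;
}
```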
Tianqi Chen committed
Animesh Jain committed
- 14 Oct, 2019 2 commits
Animesh Jain committed
Tianqi Chen committed
- 13 Oct, 2019 1 commit
This implementation provides cast to the limited number of dtypes that tflite currently supports for the placeholder op. Add INT64 to the possible dtypes, as it appears to be supported according to the tflite schema.
Ina Dobreva committed
- 11 Oct, 2019 3 commits
* overload half operators for cuda codegen
* add float16 te test_op_level1
* fix test_op_level1.py
* fix lint
* disable fp16 test if gpu does not support
* disable fp16 test if gpu does not support
* bypass float16 test if gpu does not support float16
Xingyu Zhou committed
* [tvm][any] broadcast with values other than 1
* Add test for incompatible runtime values
* Remove hybrid script compact buffer binding
* retrigger ci
Zhi committed
Animesh Jain committed
- 10 Oct, 2019 3 commits
- Adding support for MXNet-flavored dequantization, both default and using MKLDNN. The user can choose between the two at runtime. (#3945)
- Added tests for the new methods added.
shoubhik committed
* move the number of nodes constraint in op fusion up to the dom tree level
* add test case of limiting the max number of ops to be fused
* uncomment other test cases
Yida Wang committed
* [Relay][VM] Fix constant folding issue in VM compiler
  1. allow passing params when compiling a module
  2. enhance profiler robustness
* remove dead code
* fix lint
* add get_params
* fix test
* don't pass params back
* remove get_params
* docs
* move compile function to api
* compile clashes with builtin name
* fix compilation error
* remove dead code
Wei Chen committed
- 09 Oct, 2019 1 commit
The current bounds checking infrastructure inserts checks like:
```
for (i, 0, bounds[n]) {
  if (likely(i < bounds[n])) {
    ...
  }
}
```
into the TVM IR. These checks are currently not removed by the simplification infrastructure, which is a little unclean, as they are trivially true: for a loop var `i` with a given min and extent, we are guaranteed that `i >= min` and `i < min + extent`. Thus, we can insert these constraints into the IR and use them to eliminate trivial bounds checks early on.
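A small self-contained sketch of that reasoning (not the actual TVM simplifier): an upper-bound check on a loop variable is redundant whenever the loop's min + extent already fits under the bound.
```
// Illustrative only: decide whether an upper-bound check on a loop var is
// implied by the loop range itself (min <= i < min + extent).
#include <cassert>

struct LoopRange { int min; int extent; };

bool UpperBoundCheckIsRedundant(const LoopRange& r, int bound) {
  // i can be at most min + extent - 1, so "i < bound" always holds
  // exactly when min + extent <= bound.
  return r.min + r.extent <= bound;
}

int main() {
  LoopRange i{0, 16};                          // for (i, 0, 16)
  assert(UpperBoundCheckIsRedundant(i, 16));   // if (likely(i < 16)) is always true
  assert(!UpperBoundCheckIsRedundant(i, 15));  // this check actually constrains i
  return 0;
}
```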
Andrew Tulloch committed
- 08 Oct, 2019 2 commits
Use fdiv in the tests for deduce_bound.
Umang Yadav committed
* Fix VM invoke with set_params
* add test
* tweak
Haichen Shen committed
- 07 Oct, 2019 1 commit
Logan Weber committed