- 05 Sep, 2019 1 commit
-
-
Animesh Jain committed
-
- 04 Sep, 2019 1 commit
-
-
Rebasing. Empty commit. Clang-format styling.
Animesh Jain committed
-
- 03 Sep, 2019 2 commits
-
-
This reverts commit 224cc243.
Tianqi Chen committed -
As GraphRuntime does not provide control-flow logics, we have to split our model to two parts. While we need to share parameters between them to save memory usage. Solution: 1) add "lazy_init_input" in graph's attributes "attrs": { ... ... "lazy_init_input": [ "list_str", [ "p0" ] ] } 2) allow un-allocated NDArray entry in SetupStorage 3) utilize "set_input_zero_copy" function to set parameters
Yong Sun committed
-
- 01 Sep, 2019 2 commits
-
-
* init shape func in interpreter and vm compiler * Update interpreter * fix * lint * lint * fix * remove hack * update * fix * fix * update * address comments & update for shape_of * fix lint * update * fix hybrid * lint * fix bug & add take shape func * lint * lint * update * fix flaky test * add todo
Haichen Shen committed -
* Added arm_cpu NHWC schedules. * Fixed kernel shape legalization. * Added bitserial ops to relay. * Snapshot and more missing files. * Added dense testing. * Added tests * Added ASF header to new files. * cc lint * Pylint change. * pylint fixes. * Change arm legalize test. * Added assert check to arm legalize. * Added better documentation, fixed some bad style * Reverted arm conv2d nhwc changes.
Josh Fromm committed
-
- 31 Aug, 2019 1 commit
-
-
Animesh Jain committed
-
- 30 Aug, 2019 1 commit
-
-
Animesh Jain committed
-
- 23 Aug, 2019 1 commit
-
-
Tianqi Chen committed
-
- 22 Aug, 2019 3 commits
-
-
transform.h:118:3: warning: 'const' type qualifier on return type has no effect attrs.h:68:3: note: expanded from macro 'TVM_DECLARE_ATTRS' node.h:244:3: note: expanded from macro 'TVM_DECLARE_NODE_TYPE_INFO' transform.h:95:3: warning: extra ';' after member function definition attrs.h:68:62: note: expanded from macro 'TVM_DECLARE_ATTRS'
lixiaoquan committed -
* Add one-hot to Relay * topi implementation * Working * add topi test * Add TF test * Fix check * fix linting issues * fix documentation * Fix documentation * Add support for on_value, off_value, axis, dtype * Add full support for axis * Fix compute and update test_forward * Move on_value and off_value to inputs * Add topi test * Update tests * Update docs * Fix style * re-enable tests * Add one_hot to mxnet converter
Jon Soifer committed -
Josh Fromm committed
-
- 21 Aug, 2019 1 commit
-
-
* [Relay][VM]VM debugger * Report mean/min/max for op duration * Typos * Lint * Lint * Lint * Support build debug VM in CMake * Lint * Enable VM debug in unit test * Disable debug vm test until new docker image is built * Add device sync code * Fix qnn unit test * Disable vm debug by default * Rename files * Rename classes * Fix comment * Fix comment
Wei Chen committed
-
- 16 Aug, 2019 1 commit
-
-
* QNN quantize and dequantize operators. * addressing review comments. * addressing review comments. * Adding new line at the end of the file. * Adhering to styling guidelines. * Adding name to contributors. * Fixing lint issue. * Fixing file name. * Removing unnecessary code.
shoubhik committed
-
- 15 Aug, 2019 1 commit
-
-
* Refactor. * update * update * update * update * update * update
ziheng committed
-
- 13 Aug, 2019 1 commit
-
-
* Added relay and topi mirror_pad operator. * Added mirror_padding to tensorflow frontend. * Added mirrorpad testing in tensorflow frontent. * Added space_to_depth in tf frontend. * Added tests for spacetodepth. * spacetodepth bug fix. * Lint fix * Added mirror pad python attrs. * Pad code formatting. * Syntax improvement * Hopefully last lint fix
Josh Fromm committed
-
- 09 Aug, 2019 1 commit
-
-
雾雨魔理沙 committed
-
- 08 Aug, 2019 1 commit
-
-
* [Relay] [Quantization] WIP - Common files for the qauntization work. * [Relay] [Quantization] WIP - Prototyping requantize op. * Requantize operator implementation. Requantize converts one quantized tensor representation to another quantized representation. The PR has following implementation features - Requantize operator defined in qnn namespace - relay.qnn.requantize - Lowering of the requantize to exisiting Relay operators - Integer fixed point implementation of requantize - Two rounding modes - FE_UPWARDS (round towards infinity) and FE_AWAY_FROM_ZERO (std::round behavior) - Floating point implementation as well, that can act as reference or can be used for devices when FP32 computation is not used. - Unit test cases Relevant Issue - https://github.com/dmlc/tvm/issues/2351 Credit to TFLite and GemmLowp to provide reference implementations. * Typo and lint fixes. * Doc fix. * Uncommenting the lint script (fixing mistake). * Modifying the unit tests. * Moving C++ files into src/relay/qnn * Moving python files to python/tvm/relay/qnn. Some minor fixes. * Moving the attrs.h inside the include directory. * Pushing files that I forgot earlier. Changing util location. * Incorporating comments. API change. Lint fixes. * Modifying the GetFixedPointMultiplierShift API as per comments. * Forgot the dialect change. * Changing rewrite to qnn_lower. * Renaming Quantize to Qnn for clarity. * Remove use_int_domain. * Incorportaing review comments. * Adding API doc for QNN dialect. * Move the qnn_lower pass to transform namespace. * Moving from expr to module. Adding namespace in C++. * Minor sentence rewrites. Added qnn namespace. * Added the API doc. * Chanding default out_dtype to int8. Adding a test with in/out_dtype as uint8. * Style fixes. Better error messages. * Adding documentation. * More documentation fixes. * Adding out dtype check for requantize. * Adding corner case for FP32 to fixed point conversion. * Adding extra line. * Documentation fix. * Adding static inline. * Incorporating jackwish comment. Removed idtype from requantize lowering. * Removing Quantize/Dequantize code. Restricting Requantize to (u)int8/int32. * Style fixes. * Fix the docs. * Move to Legalize API.
Animesh Jain committed
-
- 07 Aug, 2019 1 commit
-
-
* Add LayerNorm op * update * fix * Add mean_std and mean_variance * add std and update doc * add license * x * lint * x * fix * fix doc
Haichen Shen committed
-
- 06 Aug, 2019 2 commits
-
-
* [Relay] Rewrite pass. This pass transforms an expression to other expression. This pass has many usecases * Replace a expr to another expr, if the other expr has faster performance. * For ASICs, we might want to modify the inputs to adapt to the HW support. * Alter op layout can work in conjunction with this pass. The supporting usecase is the Intel i8 x i8 conv. Intel HW supports u8 x i8 conv in HW. Using this pass, we can replace an i8 x i8 conv to a sequence of operators where one of the operators is now u8 x i8 conv. This will also help automatic quantizaion performance. * Better API name. * Removing the conv2d legalization for x86. Will send a separate PR. * Test name changes. * Registering one funtion to register FTVMLegalize. * Better comments.
Animesh Jain committed -
* add build gcn tutorial * add transpose operator for square sparse matrices * remove extra files * change loop tag * comply with lint * comply with lint -- line too long * comply with lint * lint check * lint check * lint check * apply marisa and theirry's reviews
Yulun Yao committed
-
- 05 Aug, 2019 1 commit
-
-
Junru Shao committed
-
- 01 Aug, 2019 4 commits
-
-
* [Relay][VM] Support execution on devices * Reduce Copy calls * Cleanup * Lint * CR comments * Merge test into test_vm.py
Wei Chen committed -
Jian Weng committed
-
The patch adds support for Tensorflow operators log1p and cos Tensorflow log1p is described at https://www.tensorflow.org/api_docs/python/tf/math/log1p Tensorflow cos is described at https://www.tensorflow.org/api_docs/python/tf/math/cos Tensorflow sin is described at https://www.tensorflow.org/api_docs/python/tf/math/sin
alexgl-github committed -
* add fatal lint lint lint do make completeness check an error lint remove fatal * fix test * reset parser file * remove unneeded import * Update python/tvm/relay/adt.py Co-Authored-By: Steven S. Lyubomirsky <slyubomirsky@gmail.com> * Update include/tvm/relay/adt.h Co-Authored-By: Steven S. Lyubomirsky <slyubomirsky@gmail.com> * Eliminate trailing whitespace (my fault)
雾雨魔理沙 committed
-
- 31 Jul, 2019 1 commit
-
-
* relay vm serialization * fix lint * load params, fix stream * lint * fix typo
Zhi committed
-
- 25 Jul, 2019 2 commits
-
-
Lianmin Zheng committed
-
* uTVM interfaces (#14) * some minor interface changes * implemented HostLowLevelDevice * added MicroDeviceAPI * implemented micro_common and added Python interfaces * current status, semi implemented micro session * added micro_common implementation and python interfaces (#18) * added micro_common implementation and python interfaces (#18) * current status, semi implemented * host test working * updated interfaces for MicroSession arguments allocation * make somewhat lint compatible * fix based on comments * added rounding macro * fix minor bug * improvements based on comments * Clean up `binutil.py` and make Python-3-compatible * Change argument allocation design * Address feedback and lint errors * Improve binutil tests * Simplify allocator (per @tqchen's suggestions) * Doc/style fixes * farts * mcgee * rodata section werks (and so does `test_runtime_micro_workspace.py`) * simple graph runtime werk * TEMP * ResNet works, yo * First round of cleanup * More cleanup * runs a dyson over the code * Another pass * Fix `make lint` issues * ready to pr... probably * final * Undo change * Fix rebase resolution * Minor fixes * Undo changes to C codegen tests * Add `obj_path` in `create_micro_lib` * TEMP * Address feedback * Add missing TODO * Partially address feedback * Fix headers * Switch to enum class for `SectionKind` * Add missing ASF header * Fix lint * Fix lint again * Fix lint * Kill lint warnings * Address feedback * Change Python interface to MicroTVM All interaction with the device is now through `Session` objects, which are used through Python's `with` blocks. * Reorder LowLevelDevice interface * Store shared ptr to session in all alloced objects * Move helper functions out of `tvm.micro` * Switch static char arr to vector * Improve general infra and code quality Does not yet address all of tqchen's feedback * Forgot a rename * Fix lint * Add ASF header * Fix lint * Partially address MarisaKirisame's feedback * Lint * Expose `MicroSession` as a node to Python * Revert to using `Session` constructor * Fix compiler error * (Maybe) fix CI error * Debugging * Remove * Quell lint * Switch to stack-based session contexts * Make uTVM less intrusive to host codegen And use SSA for operands of generated ternary operators * Inline UTVMArgs into UTVMTask struct * Remove `HostLowLevelDevice` header * Remove `BaseAddr` class * Address feedback * Add "utvm" prefix to global vars in runtime * Fix lint * Fix CI * Fix `test_binutil.py` * Fix submodules * Remove ResNet tests * Make `test_binutil.py` work with nose * Fix CI * I swear this actually fixes the binutil tests * lint * lint * Add fcompile-compatible cross-compile func * Add docs for uTVM runtime files * Move pointer patching into `MicroSession` * Fix lint * First attempt at unifying cross-compile APIs * Fix lint * Rename `cross_compile` back to `cc` * Address feedback * Remove commented code * Lint * Figure out failing function * Remove debugging code * Change "micro_dev" target to "micro" * Add checks in tests for whether uTVM is enabled * Add TODO for 32-bit support * Rename more "micro_dev" to "micro" * Undo rename We already have `tvm.micro` as a namespace. Can't have it as a method as well. * Fix failing CI Thanks to @tqchen for finding this bug. Emitting ternary operators for `min` and `max` causes concurrency bugs in CUDA, so we're moving the ternary op emissions from `CodeGenC` to `CodeGenCHost`. * Address feedback * Fix lint
Logan Weber committed
-
- 23 Jul, 2019 1 commit
-
-
internally and externally, interested in replacing standard dense layers with block-sparse matrix multiplication layers. The motivations are generally: higher performance (due to reduction in FLOPs, memory bandwidth/cache footprint), enabling larger models (e.g. fitting more layers in a given memory budget). Some public work along these lines: * https://openai.com/blog/block-sparse-gpu-kernels/ * https://openai.com/blog/sparse-transformer/ * https://arxiv.org/abs/1802.08435 * https://arxiv.org/abs/1711.02782 Various groups have been able to successfully train models with reasonable levels of sparsity (90%+) with marginal accuracy changes, which suggests substantial speedups are possible (as this implies a >10x reduction in FLOPs). It is fairly straightforward to realize these theoretical speedups, see e.g. TVM benchmarks for Intel CPUs in https://gist.github.com/ajtulloch/e65f90487bceb8848128e8db582fe902, and CUDA results in https://github.com/openai/blocksparse, etc. * https://github.com/openai/blocksparse (CUDA) * https://software.intel.com/en-us/mkl-developer-reference-c-mkl-bsrmm (MKL BSRM) * https://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.sparse.bsr_matrix.html (SCIPY BSR representation) This is extracted from an internal patch we've been using internally. There are various extensions possible (int8/fp16/bf16, CUDA/other GPU architectures), but this is a reasonable starting point. This needs more thorough unit test coverage however. We follow the conventions established by scipy.sparse.bsr_matrix and other libraries, see the unit tests for details. For folks interested in experimenting with scheduling/AutoTVM etc, https://gist.github.com/ajtulloch/e65f90487bceb8848128e8db582fe902 is a useful starting point.
Andrew Tulloch committed
-
- 19 Jul, 2019 1 commit
-
-
Yong Wu committed
-
- 17 Jul, 2019 1 commit
-
-
* Fix build error * comments
Yinghai Lu committed
-
- 16 Jul, 2019 1 commit
-
-
* tmp * Port vm and object to python * clean up * update vm build module * update * x * tweak * cleanup * update * fix rebase * Rename to VMCompiler * fix
Haichen Shen committed
-
- 14 Jul, 2019 1 commit
-
-
* [TVM] Fix bound inference to avoid allocating too much * [ARITH][BOUND] Pass analyzer to PropBoundToInputs
Sergei Grechanik committed
-
- 13 Jul, 2019 1 commit
-
-
* [ARITH][IR] Introduce FloorDiv/Mod * Address review comments * address review comments, fix div sub rule
Tianqi Chen committed
-
- 11 Jul, 2019 1 commit
-
-
* [INFA][IR] Build and Evolve Low-level IR. Remove dep from HalideIR. * Update include/tvm/node/ir_functor.h Co-Authored-By: Jared Roesch <roeschinc@gmail.com> * Update include/tvm/node/ir_functor.h Co-Authored-By: Jared Roesch <roeschinc@gmail.com>
Tianqi Chen committed
-
- 10 Jul, 2019 3 commits
-
-
* Implement type checking for Any Remove code generation related changes Remove compile changes Remove more Remove unification hack Add some code back that was needed, and clean up test Refactor test cases WIP Implement TypeHint AST Add test case which should fail Remove unification changes, and fix bug with let rec Restore unification for shapes Improve error reporting while debugging All examples type check All examples type check WIP First version that works with hints, needs clean up Remove dead code Tweaks Remove type hint Remove unecessary type hint stuff Remove more type hints Clean up Expose Any expression node Address CR Fix Fix solver Kill unecessary code Fix PyLint Fix Relocate loops Fix license and test Lint again Lint again Fix loops Fix docstring Fix template error Fix compiler issue Fix compile err Remove more runtime changes Restore buffer Fix segfault Fix Fix arange * Address feedback * Fix typo * Fix arange * Fix op level3 * Fix issue with Python wrapper
Jared Roesch committed -
Tianqi Chen committed
-
* First pass at Relay-to-Python converter testing utility * Indicate astor as a dependency * Add astor dep to host as well * Typos and small bugs * Handle ADTs and matching in Python conversion * Remove any dependency on ast.parse * Eliminate unnecessary type var field in Python version of ConstructorValue (already gone on C++ side) * Update constructor value, fix syntax errors * Don't forget keywords arg on Call nodes * Fix some incorrect calls to ast nodes * Fix more calls, a little more cleaning up * Missing cases in attr conversion * Lower op calls instead of running them through interpreter, as in @MarisaKirisame's AoT compiler * We do still need the module * Remove changes to op attrs: Will PR separately * Smoke test and corrections * More tests and fixes * Ensure imports are properly global in generated Python code * Add unit tests for refs * Add unit test for tuple indexing * Add unit test for if expression * Remove astor dependency * Remove astor from meta.yaml too * Fix if test and add basic local function test * Add global function test, refactor earlier tests * Correct 'clause' field in ADT so Python and C++ field names match * More fixes and tests for matching and constructors * Dramatically simplify matching: no need for a thunk * Improve ref writing test * Ensure local recursion works * cleanup * Add test for global recursion * Add test for higher-order calls * Get ops working, add basic tests * Remove accidentally duplicated test * More docstrings to appease pylint * Forgot to fix a test using constructor values * Reduce optimization level in fusion and fix tuple input to operators * Test op with tuple output, fix tuple output code * Add unit test for batch norm * Add a couple more tricky test cases * Correct nat constructor to drop unnecessary field * Fix the op attrs file (accidentally reduced it) * Address review comments * Adapt to new ConstructorValue representation (no more runtime dep on module) * Use pass manager and updated interfaces. Extend module.from_expr to accommodate necessary demands * Use sequential return value * Lift out nested conditionals * Replace triple single quotes with triple double quotes * Use main variable instead of entry_func
Steven S. Lyubomirsky committed
-
- 09 Jul, 2019 1 commit
-
-
* [Relay][VM]Compiling pattern matching * Fix lint * Remove debug code * Move TreeNode definition * merge ifi and selecti, todo: remove them * fix lint * remove ifi and selecti * rename GetTagi to GetTag * fix dltype * fix more dltype * Generalize If and select, and rename to Ifi and Selecti * Fix lint * Rename Ifi to If * Change register default to match value * Remove bad specialization for Move * Stop use Select * Remove Select * TreeNode refactor * Change entry_func name * Remove Cmp due to rebase issue
Wei Chen committed
-