- 13 Aug, 2019 2 commits
Zhi committed
* Added relay and topi mirror_pad operator (op semantics sketched after this entry).
* Added mirror_padding to tensorflow frontend.
* Added mirrorpad testing in tensorflow frontend.
* Added space_to_depth in tf frontend.
* Added tests for spacetodepth.
* spacetodepth bug fix.
* Lint fix
* Added mirror pad python attrs.
* Pad code formatting.
* Syntax improvement
* Hopefully last lint fix
Josh Fromm committed
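For readers unfamiliar with these two ops, here is a minimal NumPy sketch of their semantics (it mirrors the TensorFlow definitions; illustrative only, not this commit's code):

```python
import numpy as np

x = np.arange(1, 5)  # [1, 2, 3, 4]

# MirrorPad with one pad element per side, in its two TF modes:
print(np.pad(x, 1, mode="reflect"))    # [2 1 2 3 4 3] (edge not repeated)
print(np.pad(x, 1, mode="symmetric"))  # [1 1 2 3 4 4] (edge repeated)

# space_to_depth with block_size=2 on an NHWC tensor: each 2x2 spatial
# block is folded into the channel dimension.
nhwc = np.arange(16, dtype="float32").reshape(1, 4, 4, 1)
n, h, w, c = nhwc.shape
b = 2
out = (nhwc.reshape(n, h // b, b, w // b, b, c)
           .transpose(0, 1, 3, 2, 4, 5)
           .reshape(n, h // b, w // b, b * b * c))
assert out.shape == (1, 2, 2, 4)
```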
- 12 Aug, 2019 1 commit
Neo Chien committed
- 11 Aug, 2019 1 commit
* aot * save * save * fix test * remove vta changes * lint
雾雨魔理沙 committed
- 09 Aug, 2019 1 commit
* reproduce error * fix * lint * lint
雾雨魔理沙 committed
- 08 Aug, 2019 1 commit
* [Relay] [Quantization] WIP - Common files for the quantization work.
* [Relay] [Quantization] WIP - Prototyping requantize op.
* Requantize operator implementation (the underlying math is sketched after this entry). Requantize converts one quantized tensor representation to another quantized representation. The PR has the following implementation features:
  - Requantize operator defined in qnn namespace - relay.qnn.requantize
  - Lowering of the requantize to existing Relay operators
  - Integer fixed-point implementation of requantize
  - Two rounding modes - FE_UPWARDS (round towards infinity) and FE_AWAY_FROM_ZERO (std::round behavior)
  - Floating-point implementation as well, which can act as a reference or be used on devices where FP32 computation is acceptable
  - Unit test cases
  Relevant issue - https://github.com/dmlc/tvm/issues/2351
  Credit to TFLite and GemmLowp for providing reference implementations.
* Typo and lint fixes.
* Doc fix.
* Uncommenting the lint script (fixing a mistake).
* Modifying the unit tests.
* Moving C++ files into src/relay/qnn.
* Moving Python files to python/tvm/relay/qnn. Some minor fixes.
* Moving attrs.h inside the include directory.
* Pushing files that I forgot earlier. Changing util location.
* Incorporating comments. API change. Lint fixes.
* Modifying the GetFixedPointMultiplierShift API as per comments.
* Forgot the dialect change.
* Changing rewrite to qnn_lower.
* Renaming Quantize to Qnn for clarity.
* Remove use_int_domain.
* Incorporating review comments.
* Adding API doc for QNN dialect.
* Move the qnn_lower pass to transform namespace.
* Moving from expr to module. Adding namespace in C++.
* Minor sentence rewrites. Added qnn namespace.
* Added the API doc.
* Changing default out_dtype to int8. Adding a test with in/out_dtype as uint8.
* Style fixes. Better error messages.
* Adding documentation.
* More documentation fixes.
* Adding out dtype check for requantize.
* Adding corner case for FP32 to fixed point conversion.
* Adding extra line.
* Documentation fix.
* Adding static inline.
* Incorporating jackwish's comment. Removed idtype from requantize lowering.
* Removing Quantize/Dequantize code. Restricting Requantize to (u)int8/int32.
* Style fixes.
* Fix the docs.
* Move to Legalize API.
Animesh Jain committed
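To make the fixed-point scheme above concrete, here is a simplified NumPy model of requantize (zero points omitted, int8 output assumed, FE_AWAY_FROM_ZERO rounding; a sketch, not the PR's implementation):

```python
import math
import numpy as np

def get_fixed_point_multiplier_shift(ratio):
    # Decompose ratio = significand * 2**shift with significand in [0.5, 1),
    # then store the significand as a 31-bit fixed-point integer.
    significand, shift = math.frexp(ratio)
    return int(round(significand * (1 << 31))), shift

def requantize(q_in, input_scale, output_scale):
    # real value = q_in * input_scale; re-express it on the output scale:
    # q_out ~= q_in * (input_scale / output_scale), in pure integer math.
    multiplier, shift = get_fixed_point_multiplier_shift(input_scale / output_scale)
    acc = q_in.astype(np.int64) * multiplier   # 64-bit accumulator
    total_shift = 31 - shift
    half = np.int64(1) << (total_shift - 1)
    # FE_AWAY_FROM_ZERO (std::round): round |acc| half-up, then restore sign.
    q_out = np.sign(acc) * ((np.abs(acc) + half) >> total_shift)
    return np.clip(q_out, -128, 127).astype(np.int8)

q = np.array([-100, -3, 3, 100], dtype=np.int32)
print(requantize(q, input_scale=0.5, output_scale=1.0))  # [-50 -2 2 50]
```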
- 07 Aug, 2019 1 commit
* Add LayerNorm op (formula sketched after this entry)
* update
* fix
* Add mean_std and mean_variance
* add std and update doc
* add license
* x
* lint
* x
* fix
* fix doc
Haichen Shen committed
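For reference, the layer normalization computation being added, sketched in NumPy (gamma/beta/eps are the conventional parameter names, assumed here rather than taken from the PR):

```python
import numpy as np

def layer_norm(x, gamma, beta, axis=-1, eps=1e-5):
    # Normalize each sample over the feature axis, then apply the
    # learned affine transform.
    mean = x.mean(axis=axis, keepdims=True)
    var = x.var(axis=axis, keepdims=True)
    return (x - mean) / np.sqrt(var + eps) * gamma + beta

x = np.random.randn(2, 8).astype("float32")
y = layer_norm(x, gamma=np.ones(8, "float32"), beta=np.zeros(8, "float32"))
assert np.allclose(y.mean(axis=-1), 0.0, atol=1e-5)
```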
- 06 Aug, 2019 3 commits
* [Relay] Rewrite pass. This pass transforms an expression into another expression. It has many use cases:
  - Replace an expr with another expr that has better performance.
  - For ASICs, we might want to modify the inputs to adapt to the HW support.
  - Alter op layout can work in conjunction with this pass.
  The motivating use case is the Intel i8 x i8 conv. Intel HW supports u8 x i8 conv. Using this pass, we can replace an i8 x i8 conv with a sequence of operators where one of the operators is now u8 x i8 conv. This will also help automatic quantization performance. (A schematic legalize callback is sketched after this entry.)
* Better API name.
* Removing the conv2d legalization for x86. Will send a separate PR.
* Test name changes.
* Registering one function to register FTVMLegalize.
* Better comments.
Animesh Jain committed
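A schematic sketch of what an FTVMLegalize callback looks like (the exact signature and the data shift shown here are assumptions for illustration, not taken from the PR):

```python
from tvm import relay

def conv2d_legalize(attrs, inputs, arg_types):
    """Return a replacement expression, or None to keep the op unchanged."""
    data, kernel = inputs
    if arg_types[0].dtype == "int8":
        # Shift i8 activations into u8 so a hardware u8 x i8 convolution
        # can be used. A real legalization must also compensate for the
        # +128 shift in the conv output and carry the original conv2d
        # attrs over; both are omitted here for brevity.
        shifted = relay.cast(data, "int32")
        shifted = relay.add(shifted, relay.const(128, "int32"))
        shifted = relay.cast(shifted, "uint8")
        return relay.nn.conv2d(shifted, kernel)
    return None
```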
* fix * fix interpreter
Haichen Shen committed
* add build gcn tutorial
* add transpose operator for square sparse matrices
* remove extra files
* change loop tag
* comply with lint
* comply with lint -- line too long
* comply with lint
* lint check
* lint check
* lint check
* apply Marisa and Thierry's reviews
Yulun Yao committed
- 05 Aug, 2019 1 commit
* save lint some lint lint add charrnn save save save remove debug remove debug remove space refactor save rewrite dce
* reset files
* join -> meet
* lint
* address review comment
* wordsmith
雾雨魔理沙 committed
- 03 Aug, 2019 1 commit
* Fix gather_nd in Relay * Add test cases for gather_nd.
Huilin Qu committed
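For context, a NumPy model of the gather_nd semantics being fixed (Relay's indices-first convention; a sketch, not the patch itself):

```python
import numpy as np

def gather_nd(data, indices):
    # Relay convention: indices has shape (M, Y0, ..., Yk); each "column"
    # of M integers picks one element (or slice) along data's first M axes.
    m = indices.shape[0]
    out_shape = indices.shape[1:] + data.shape[m:]
    out = np.empty(out_shape, dtype=data.dtype)
    for y in np.ndindex(*indices.shape[1:]):
        out[y] = data[tuple(indices[(slice(None),) + y])]
    return out

data = np.array([[0, 1], [2, 3]])
idx = np.array([[1, 0], [0, 1]])  # picks data[1, 0] and data[0, 1]
assert (gather_nd(data, idx) == [2, 1]).all()
```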
- 02 Aug, 2019 2 commits
* fix * lint
雾雨魔理沙 committed
* [Relay][Quantization] Support floating-point scale
* [Relay][Quantization] KL-divergence calibration on dataset (idea sketched after this entry)
* Fix unhandled LeftShift case in QuantizeRealize
* Fix lint
* drop QBias
* fix lint
* address comments
* address comments
* Update comments
* address comments
* lint
* kQIdentity = 0
Wuwei Lin committed
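KL-divergence calibration follows the well-known idea of picking the clipping threshold whose quantized histogram best matches the FP32 activation histogram. A heavily simplified NumPy sketch of that idea (not this PR's code; the real pass handles edge cases this skips):

```python
import numpy as np
from scipy.stats import entropy  # entropy(p, q) computes KL(p || q)

def kl_calibrate(samples, num_bins=2040, levels=255):
    # Histogram of FP32 activation magnitudes gathered over the dataset.
    hist, edges = np.histogram(np.abs(samples), bins=num_bins)
    best_t, best_kl = edges[-1], np.inf
    # Only consider thresholds where the bins divide evenly into `levels`
    # quantized slots, to keep the sketch simple.
    for i in range(levels, num_bins + 1, levels):
        p = hist[:i].astype(np.float64)
        p[-1] += hist[i:].sum()  # outliers clip into the last bin
        width = i // levels
        # Simulate int8 resolution: merge `width` bins into one quantized
        # slot, then expand back so p and q are comparable distributions.
        q = np.repeat(hist[:i].reshape(levels, width).sum(axis=1), width)
        kl = entropy(p + 1e-12, q + 1e-12)
        if kl < best_kl:
            best_kl, best_t = kl, edges[i]
    return best_t  # the quantization scale is then roughly best_t / 127

print(kl_calibrate(np.random.randn(100000)))
```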
- 01 Aug, 2019 2 commits
The patch adds support for the Tensorflow operators log1p, cos, and sin.
Tensorflow log1p is described at https://www.tensorflow.org/api_docs/python/tf/math/log1p
Tensorflow cos is described at https://www.tensorflow.org/api_docs/python/tf/math/cos
Tensorflow sin is described at https://www.tensorflow.org/api_docs/python/tf/math/sin
alexgl-github committed
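A minimal TF 1.x graph exercising the three ops; a construction like this is what the frontend can now import (illustrative snippet, not taken from the frontend tests):

```python
import numpy as np
import tensorflow as tf  # TF 1.x style graph, as used by the frontend tests

x = tf.placeholder(tf.float32, shape=(4,), name="x")
y = tf.math.log1p(tf.math.cos(x) + tf.math.sin(x))

with tf.Session() as sess:
    out = sess.run(y, feed_dict={x: np.zeros(4, dtype=np.float32)})
    # log1p(cos(0) + sin(0)) = log(2)
    np.testing.assert_allclose(out, np.log(2.0), rtol=1e-6)
```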
* add fatal lint lint lint do make completeness check an error lint remove fatal
* fix test
* reset parser file
* remove unneeded import
* Update python/tvm/relay/adt.py
  Co-Authored-By: Steven S. Lyubomirsky <slyubomirsky@gmail.com>
* Update include/tvm/relay/adt.h
  Co-Authored-By: Steven S. Lyubomirsky <slyubomirsky@gmail.com>
* Eliminate trailing whitespace (my fault)
雾雨魔理沙 committed
- 31 Jul, 2019 1 commit
* relay vm serialization * fix lint * load params, fix stream * lint * fix typo
Zhi committed
- 25 Jul, 2019 1 commit
Lianmin Zheng committed
- 24 Jul, 2019 2 commits
- 23 Jul, 2019 3 commits
internally and externally, interested in replacing standard dense layers with block-sparse matrix multiplication layers. The motivations are generally:

* higher performance (due to the reduction in FLOPs and memory bandwidth/cache footprint)
* enabling larger models (e.g. fitting more layers in a given memory budget)

Some public work along these lines:

* https://openai.com/blog/block-sparse-gpu-kernels/
* https://openai.com/blog/sparse-transformer/
* https://arxiv.org/abs/1802.08435
* https://arxiv.org/abs/1711.02782

Various groups have been able to successfully train models with reasonable levels of sparsity (90%+) with marginal accuracy changes, which suggests substantial speedups are possible (as this implies a >10x reduction in FLOPs). It is fairly straightforward to realize these theoretical speedups; see e.g. TVM benchmarks for Intel CPUs in https://gist.github.com/ajtulloch/e65f90487bceb8848128e8db582fe902, CUDA results in https://github.com/openai/blocksparse, etc.

* https://github.com/openai/blocksparse (CUDA)
* https://software.intel.com/en-us/mkl-developer-reference-c-mkl-bsrmm (MKL BSRMM)
* https://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.sparse.bsr_matrix.html (SciPy BSR representation)

This is extracted from a patch we've been using internally. There are various extensions possible (int8/fp16/bf16, CUDA/other GPU architectures), but this is a reasonable starting point. It needs more thorough unit test coverage, however. We follow the conventions established by scipy.sparse.bsr_matrix and other libraries; see the unit tests for details.

For folks interested in experimenting with scheduling/AutoTVM etc., https://gist.github.com/ajtulloch/e65f90487bceb8848128e8db582fe902 is a useful starting point.
Andrew Tulloch committed
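For orientation, the BSR (block compressed sparse row) layout these kernels follow, shown via scipy.sparse.bsr_matrix (a small standalone example, not the new TOPI operator itself):

```python
import numpy as np
import scipy.sparse as sp

# A 4x6 weight matrix with 2x2 blocks, where only two blocks are nonzero.
dense = np.zeros((4, 6), dtype="float32")
dense[0:2, 0:2] = 1.0
dense[2:4, 4:6] = 2.0

w = sp.bsr_matrix(dense, blocksize=(2, 2))
# BSR stores just the nonzero blocks plus CSR-style block indices:
#   w.data    -> (num_blocks, 2, 2) array of block values
#   w.indices -> block-column index of each stored block
#   w.indptr  -> offsets into w.indices, one entry per block-row boundary
assert w.data.shape == (2, 2, 2)

x = np.random.rand(6, 8).astype("float32")
np.testing.assert_allclose(w.dot(x), dense.dot(x), rtol=1e-5)
```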
= Motivation

It's useful to expose the tvm::reinterpret functionality to Relay/TOPI users, as this allows them to build (fused) operators leveraging the bitwise reinterpretation of an operand. An example is approximate transcendental functions, which can be implemented similarly to:

```python
def C(x):
    return relay.expr.const(x, "float32")

def approx_exp(x):
    x = relay.minimum(relay.maximum(x, C(-88.0)), C(88.0))
    x = C(127.0) + x * C(1.44269504)
    xf = relay.floor(x)
    i = relay.cast(xf, "int32")
    x = x - xf
    Y = C(0.99992522) + x * (C(0.69583354) + x * (C(0.22606716) + x * C(0.078024523)))
    exponent = relay.left_shift(i, relay.expr.const(23, "int32"))
    exponent = relay.reinterpret(exponent, "float32")
    return exponent * Y

def approx_sigmoid(x):
    # <2.0e-5 absolute error over [-5, 5]
    y = approx_exp(x)
    return y / (y + C(1.0))

def approx_tanh(x):
    # <4.0e-5 absolute error over [-5, 5]
    x = x * C(2.0)
    y = approx_exp(x)
    return (y - C(1.0)) / (y + C(1.0))
```

See unit tests for implementations of these approximate transcendentals.
Andrew Tulloch committed
雾雨魔理沙 committed
- 21 Jul, 2019 1 commit
Tianqi Chen committed
- 19 Jul, 2019 2 commits
- 18 Jul, 2019 1 commit
雾雨魔理沙 committed
- 17 Jul, 2019 3 commits
* [Relay][VM] Fix debug statement * Change debug statement
Wei Chen committed
* Fix build error * comments
Yinghai Lu committed
Haichen Shen committed
- 16 Jul, 2019 1 commit
* tmp
* Port vm and object to python
* clean up
* update vm build module
* update
* x
* tweak
* cleanup
* update
* fix rebase
* Rename to VMCompiler
* fix
Haichen Shen committed
- 12 Jul, 2019 1 commit
* [Relay][Quantization] Fix issue introduced in #3135 * Recover StopFusion * Fix fmultiref * Fix lint
Wuwei Lin committed
- 11 Jul, 2019 1 commit
* [INFA][IR] Build and Evolve Low-level IR. Remove dep from HalideIR.
* Update include/tvm/node/ir_functor.h
  Co-Authored-By: Jared Roesch <roeschinc@gmail.com>
* Update include/tvm/node/ir_functor.h
  Co-Authored-By: Jared Roesch <roeschinc@gmail.com>
Tianqi Chen committed
- 10 Jul, 2019 4 commits
lint; update; address comment; comment out breaking test
雾雨魔理沙 committed
* Implement type checking for Any
  Remove code generation related changes
  Remove compile changes
  Remove more
  Remove unification hack
  Add some code back that was needed, and clean up test
  Refactor test cases
  WIP
  Implement TypeHint AST
  Add test case which should fail
  Remove unification changes, and fix bug with let rec
  Restore unification for shapes
  Improve error reporting while debugging
  All examples type check
  All examples type check
  WIP
  First version that works with hints, needs clean up
  Remove dead code
  Tweaks
  Remove type hint
  Remove unnecessary type hint stuff
  Remove more type hints
  Clean up
  Expose Any expression node
  Address CR
  Fix
  Fix solver
  Kill unnecessary code
  Fix PyLint
  Fix
  Relocate loops
  Fix license and test
  Lint again
  Lint again
  Fix loops
  Fix docstring
  Fix template error
  Fix compiler issue
  Fix compile err
  Remove more runtime changes
  Restore buffer
  Fix segfault
  Fix
  Fix arange
* Address feedback
* Fix typo
* Fix arange
* Fix op level3
* Fix issue with Python wrapper
Jared Roesch committed
Tianqi Chen committed
* First pass at Relay-to-Python converter testing utility
* Indicate astor as a dependency
* Add astor dep to host as well
* Typos and small bugs
* Handle ADTs and matching in Python conversion
* Remove any dependency on ast.parse
* Eliminate unnecessary type var field in Python version of ConstructorValue (already gone on C++ side)
* Update constructor value, fix syntax errors
* Don't forget keywords arg on Call nodes
* Fix some incorrect calls to ast nodes
* Fix more calls, a little more cleaning up
* Missing cases in attr conversion
* Lower op calls instead of running them through interpreter, as in @MarisaKirisame's AoT compiler
* We do still need the module
* Remove changes to op attrs: Will PR separately
* Smoke test and corrections
* More tests and fixes
* Ensure imports are properly global in generated Python code
* Add unit tests for refs
* Add unit test for tuple indexing
* Add unit test for if expression
* Remove astor dependency
* Remove astor from meta.yaml too
* Fix if test and add basic local function test
* Add global function test, refactor earlier tests
* Correct 'clause' field in ADT so Python and C++ field names match
* More fixes and tests for matching and constructors
* Dramatically simplify matching: no need for a thunk
* Improve ref writing test
* Ensure local recursion works
* cleanup
* Add test for global recursion
* Add test for higher-order calls
* Get ops working, add basic tests
* Remove accidentally duplicated test
* More docstrings to appease pylint
* Forgot to fix a test using constructor values
* Reduce optimization level in fusion and fix tuple input to operators
* Test op with tuple output, fix tuple output code
* Add unit test for batch norm
* Add a couple more tricky test cases
* Correct nat constructor to drop unnecessary field
* Fix the op attrs file (accidentally reduced it)
* Address review comments
* Adapt to new ConstructorValue representation (no more runtime dep on module)
* Use pass manager and updated interfaces. Extend module.from_expr to accommodate necessary demands
* Use sequential return value
* Lift out nested conditionals
* Replace triple single quotes with triple double quotes
* Use main variable instead of entry_func
Steven S. Lyubomirsky committed
- 09 Jul, 2019 3 commits
* [Relay][VM] Compiling pattern matching
* Fix lint
* Remove debug code
* Move TreeNode definition
* merge ifi and selecti, todo: remove them
* fix lint
* remove ifi and selecti
* rename GetTagi to GetTag
* fix dltype
* fix more dltype
* Generalize If and Select, and rename to Ifi and Selecti
* Fix lint
* Rename Ifi to If
* Change register default to match value
* Remove bad specialization for Move
* Stop using Select
* Remove Select
* TreeNode refactor
* Change entry_func name
* Remove Cmp due to rebase issue
Wei Chen committed
- Weight dtype can be different from idtype, so the weight tensor is used to set the weight dtype.
- For the conv2d NCHWc operator, the weight can be of any dimension. For int8 computation on Intel, it can be 7D. Relaxing the weight type checking accordingly.
Animesh Jain committed
* init fix rebase lint fix cmake try again fix ci
* add gitignore
* fix format
* do not include .interp and .tokens
雾雨魔理沙 committed