- 01 Sep, 2019 2 commits
-
-
* init shape func in interpreter and vm compiler * Update interpreter * fix * lint * lint * fix * remove hack * update * fix * fix * update * address comments & update for shape_of * fix lint * update * fix hybrid * lint * fix bug & add take shape func * lint * lint * update * fix flaky test * add todo
Haichen Shen committed -
* Added arm_cpu NHWC schedules. * Fixed kernel shape legalization. * Added bitserial ops to relay. * Snapshot and more missing files. * Added dense testing. * Added tests * Added ASF header to new files. * cc lint * Pylint change. * pylint fixes. * Change arm legalize test. * Added assert check to arm legalize. * Added better documentation, fixed some bad style * Reverted arm conv2d nhwc changes.
Josh Fromm committed
-
- 30 Aug, 2019 1 commit
-
-
Animesh Jain committed
-
- 29 Aug, 2019 2 commits
-
-
* [Relay] Conv2d grad * Fix test * Fix first order gradient
Wuwei Lin committed -
* [TensorFlow] Fix limitation that depth_mult can only be 1 for DepthwiseConv2dNative * Improve code readability
lixiaoquan committed
-
- 22 Aug, 2019 2 commits
-
-
* Add one-hot to Relay * topi implementation * Working * add topi test * Add TF test * Fix check * fix linting issues * fix documentation * Fix documentation * Add support for on_value, off_value, axis, dtype * Add full support for axis * Fix compute and update test_forward * Move on_value and off_value to inputs * Add topi test * Update tests * Update docs * Fix style * re-enable tests * Add one_hot to mxnet converter
Jon Soifer committed -
Josh Fromm committed
-
- 15 Aug, 2019 1 commit
-
-
* Refactor. * update * update * update * update * update * update
ziheng committed
-
- 13 Aug, 2019 1 commit
-
-
* Added relay and topi mirror_pad operator. * Added mirror_padding to tensorflow frontend. * Added mirrorpad testing in tensorflow frontend. * Added space_to_depth in tf frontend. * Added tests for spacetodepth. * spacetodepth bug fix. * Lint fix * Added mirror pad python attrs. * Pad code formatting. * Syntax improvement * Hopefully last lint fix
Josh Fromm committed
-
- 12 Aug, 2019 1 commit
-
-
Neo Chien committed
-
- 07 Aug, 2019 1 commit
-
-
* Add LayerNorm op * update * fix * Add mean_std and mean_variance * add std and update doc * add license * x * lint * x * fix * fix doc
Haichen Shen committed
-
- 06 Aug, 2019 1 commit
-
-
* add build gcn tutorial * add transpose operator for square sparse matrices * remove extra files * change loop tag * comply with lint * comply with lint -- line too long * comply with lint * lint check * lint check * lint check * apply marisa and thierry's reviews
Yulun Yao committed
-
- 03 Aug, 2019 1 commit
-
-
* Fix gather_nd in Relay * Add test cases for gather_nd.
Huilin Qu committed
-
- 01 Aug, 2019 1 commit
-
-
The patch adds support for the Tensorflow operators log1p and cos.
Tensorflow log1p is described at https://www.tensorflow.org/api_docs/python/tf/math/log1p
Tensorflow cos is described at https://www.tensorflow.org/api_docs/python/tf/math/cos
Tensorflow sin is described at https://www.tensorflow.org/api_docs/python/tf/math/sin
alexgl-github committed
-
- 25 Jul, 2019 1 commit
-
-
Lianmin Zheng committed
-
- 24 Jul, 2019 1 commit
-
-
Wuwei Lin committed
-
- 23 Jul, 2019 2 commits
-
-
internally and externally, interested in replacing standard dense layers with block-sparse matrix multiplication layers. The motivations are generally higher performance (due to reduction in FLOPs, memory bandwidth/cache footprint) and enabling larger models (e.g. fitting more layers in a given memory budget).

Some public work along these lines:
* https://openai.com/blog/block-sparse-gpu-kernels/
* https://openai.com/blog/sparse-transformer/
* https://arxiv.org/abs/1802.08435
* https://arxiv.org/abs/1711.02782

Various groups have been able to successfully train models with reasonable levels of sparsity (90%+) with marginal accuracy changes, which suggests substantial speedups are possible (as this implies a >10x reduction in FLOPs). It is fairly straightforward to realize these theoretical speedups; see e.g. TVM benchmarks for Intel CPUs in https://gist.github.com/ajtulloch/e65f90487bceb8848128e8db582fe902, and CUDA results in https://github.com/openai/blocksparse, etc.
* https://github.com/openai/blocksparse (CUDA)
* https://software.intel.com/en-us/mkl-developer-reference-c-mkl-bsrmm (MKL BSRM)
* https://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.sparse.bsr_matrix.html (SCIPY BSR representation)

This is extracted from a patch we've been using internally. There are various extensions possible (int8/fp16/bf16, CUDA/other GPU architectures), but this is a reasonable starting point. It needs more thorough unit test coverage, however. We follow the conventions established by scipy.sparse.bsr_matrix and other libraries; see the unit tests for details.

For folks interested in experimenting with scheduling/AutoTVM etc., https://gist.github.com/ajtulloch/e65f90487bceb8848128e8db582fe902 is a useful starting point.
Andrew Tulloch committed -
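The block-sparse commit above says it follows the `scipy.sparse.bsr_matrix` conventions. As a rough illustration of that layout (not the TVM kernel from the patch), the sketch below builds a block-sparse weight matrix with SciPy and computes `x @ W.T` by walking the BSR arrays `data`, `indices`, and `indptr`. The helper names `random_bsr` and `bsr_dense`, the block size, and the density are assumptions chosen for the example.

```python
import numpy as np
import scipy.sparse as sp


def random_bsr(n_out, n_in, block, density, seed=0):
    """Build a random block-sparse weight matrix in SciPy's BSR format (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    br, bc = block
    # Decide which (br x bc) blocks are kept; all other blocks are zeroed.
    mask = (rng.random((n_out // br, n_in // bc)) < density).astype("float32")
    dense = rng.standard_normal((n_out, n_in)).astype("float32")
    dense *= np.kron(mask, np.ones(block, dtype="float32"))
    return sp.bsr_matrix(dense, blocksize=block)


def bsr_dense(x, w):
    """Reference x @ W.T that only touches the stored blocks (hypothetical helper)."""
    br, bc = w.blocksize
    out = np.zeros((x.shape[0], w.shape[0]), dtype=x.dtype)
    for block_row in range(w.shape[0] // br):
        # indptr delimits the stored blocks that belong to this block row.
        for k in range(w.indptr[block_row], w.indptr[block_row + 1]):
            block_col = w.indices[k]
            x_slice = x[:, block_col * bc:(block_col + 1) * bc]
            out[:, block_row * br:(block_row + 1) * br] += x_slice @ w.data[k].T
    return out


x = np.random.default_rng(1).standard_normal((4, 64)).astype("float32")
w = random_bsr(32, 64, block=(4, 4), density=0.1)
y = bsr_dense(x, w)
```

Only the stored blocks contribute to the inner loop, which is where the FLOP reduction mentioned in the commit message comes from; a real schedule would additionally vectorize and tile these block products.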
= Motivation

It's useful to expose the tvm::reinterpret functionality to Relay/TOPI users, as this allows them to build (fused) operators leveraging the bitwise reinterpretation of an operator. An example is approximate transcendental functions, which can be implemented similar to:

```python
from tvm import relay


def C(x):
    return relay.expr.const(x, "float32")


def approx_exp(x):
    # Clamp the input, then rescale so that exp(x) = 2 ** (x * log2(e)).
    x = relay.minimum(relay.maximum(x, C(-88.0)), C(88.0))
    x = C(127.0) + x * C(1.44269504)
    # Split into integer part (biased exponent) and fractional part.
    xf = relay.floor(x)
    i = relay.cast(xf, "int32")
    x = x - xf
    # Cubic polynomial approximation of 2 ** x on [0, 1).
    Y = C(0.99992522) + x * (C(0.69583354) + x * (C(0.22606716) + x * C(0.078024523)))
    # Move the integer part into the float32 exponent bits and reinterpret.
    exponent = relay.left_shift(i, relay.expr.const(23, "int32"))
    exponent = relay.reinterpret(exponent, "float32")
    return exponent * Y


def approx_sigmoid(x):
    # <2.0e-5 absolute error over [-5, 5]
    y = approx_exp(x)
    return y / (y + C(1.0))


def approx_tanh(x):
    # <4.0e-5 absolute error over [-5, 5]
    x = x * C(2.0)
    y = approx_exp(x)
    return (y - C(1.0)) / (y + C(1.0))
```

See unit tests for implementations of these approximate transcendentals.
Andrew Tulloch committed
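To see why the reinterpret step in the commit above matters, the same bit trick can be reproduced in plain NumPy, where `ndarray.view(np.float32)` plays the role of `relay.reinterpret`. This is only a sketch for checking the arithmetic outside TVM; the names `approx_exp_np` and `approx_sigmoid_np` are made up for this example, and the constants are copied from the commit message.

```python
import numpy as np


def approx_exp_np(x):
    """NumPy mirror of the Relay approx_exp above (hypothetical helper, array input)."""
    x = np.clip(np.asarray(x, dtype=np.float32), -88.0, 88.0)
    # exp(x) = 2 ** (x * log2(e)); adding 127 applies the float32 exponent bias.
    x = np.float32(127.0) + x * np.float32(1.44269504)
    xf = np.floor(x)
    i = xf.astype(np.int32)
    frac = x - xf
    # Cubic polynomial approximation of 2 ** frac on [0, 1).
    y = np.float32(0.99992522) + frac * (
        np.float32(0.69583354)
        + frac * (np.float32(0.22606716) + frac * np.float32(0.078024523))
    )
    # Shift the biased exponent into bits 23..30, then reinterpret the bits as float32.
    exponent = np.left_shift(i, np.int32(23)).view(np.float32)
    return exponent * y


def approx_sigmoid_np(x):
    y = approx_exp_np(x)
    return y / (y + np.float32(1.0))


xs = np.linspace(-5.0, 5.0, 1001, dtype=np.float32)
max_rel_err = np.max(np.abs(approx_exp_np(xs) / np.exp(xs) - 1.0))
max_sigmoid_err = np.max(np.abs(approx_sigmoid_np(xs) - 1.0 / (1.0 + np.exp(-xs))))
```

A plain cast would convert the integer value numerically; reinterpreting keeps the bit pattern, so the shifted integer lands directly in the exponent field and yields a power of two without calling `exp` or `pow`.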
-
- 19 Jul, 2019 1 commit
-
-
Yong Wu committed
-
- 11 Jul, 2019 1 commit
-
-
* [INFRA][IR] Build and Evolve Low-level IR. Remove dep from HalideIR. * Update include/tvm/node/ir_functor.h Co-Authored-By: Jared Roesch <roeschinc@gmail.com> * Update include/tvm/node/ir_functor.h Co-Authored-By: Jared Roesch <roeschinc@gmail.com>
Tianqi Chen committed
-
- 10 Jul, 2019 1 commit
-
-
* Implement type checking for Any Remove code generation related changes Remove compile changes Remove more Remove unification hack Add some code back that was needed, and clean up test Refactor test cases WIP Implement TypeHint AST Add test case which should fail Remove unification changes, and fix bug with let rec Restore unification for shapes Improve error reporting while debugging All examples type check All examples type check WIP First version that works with hints, needs clean up Remove dead code Tweaks Remove type hint Remove unnecessary type hint stuff Remove more type hints Clean up Expose Any expression node Address CR Fix Fix solver Kill unnecessary code Fix PyLint Fix Relocate loops Fix license and test Lint again Lint again Fix loops Fix docstring Fix template error Fix compiler issue Fix compile err Remove more runtime changes Restore buffer Fix segfault Fix Fix arange * Address feedback * Fix typo * Fix arange * Fix op level3 * Fix issue with Python wrapper
Jared Roesch committed
-
- 09 Jul, 2019 1 commit
-
-
* Weight dtype can be different from idtype, so the weight tensor is used to set the dtype of the weight. * For the conv2d NCHWc operator, the weight can be of any dimension. For int8 computation on Intel, it can be 7D, so the weight type checking is relaxed.
Animesh Jain committed
-
- 28 Jun, 2019 2 commits
-
-
Thierry Moreau committed
-
* Add sequence_mask use exactly the same arguments as mxnet fix * fix lint * fix lint * add mxnet conversion + relay * update * update doc * fix pylint * fix doc * address comment * try to address comments * try to enable shape check for valid_length * fix * try to fix * fix bug * try to fix * address comment * address comment
Xingjian Shi committed
-
- 09 Jun, 2019 1 commit
-
-
* Improve non_max_suppression for CPU * Improve get_valid_counts * Minor change * Skip some unnecessary computes
Yao Wang committed
-
- 08 Jun, 2019 1 commit
-
-
Ligeng Zhu committed
-
- 04 Jun, 2019 2 commits
-
-
ziheng committed
-
* init impl for topk * Fix cpu for topk * init cuda impl for topk * Add cuda for topk * fix * Add doc * update doc * lint * lint * lint * x * fix warning * [Relay] Add TopK in tf converter * Add frontend converter * fix
Haichen Shen committed
-
- 20 May, 2019 1 commit
-
-
* [Relay][TOPI] operator All * Update tests/python/frontend/tensorflow/test_forward.py Co-Authored-By: yongwww <55wuyong@163.com> * fix comments * change to level 4
Yong Wu committed
-
- 11 May, 2019 1 commit
-
-
* Implement the VM compiler * Fix issues * Fix ASF headers * Fix test issue * Apply typo fixes. * Update src/relay/backend/vm/compiler.cc Co-Authored-By: 雾雨魔理沙 <lolisa@marisa.moe> * Refactor compiler * Fix * Fix * Fix in benchmark * Fix * Address comments
Jared Roesch committed
-
- 09 May, 2019 2 commits
-
-
* Implement the virtual machine Co-Authored-By: wweic <ipondering.weic@gmail.com> * Fix rebase build issues * Reorganize vm.py and fix allocator bug * Remove compiler * Remove tests * Remove backend/vm/vm.cc too * Fix docs * Fix doc * Fix doc * Add vm docs * Remove change to dead_code.cc * Remove Relay logging * Remove reduce * Update include/tvm/runtime/vm.h Co-Authored-By: jroesch <roeschinc@gmail.com> * Reformat * Update include/tvm/runtime/vm.h Co-Authored-By: jroesch <roeschinc@gmail.com> * Address feedback * Update include/tvm/runtime/vm.h Co-Authored-By: jroesch <roeschinc@gmail.com> * Apply suggestions from code review Co-Authored-By: jroesch <roeschinc@gmail.com> * Fix a couple outstanding comments * Last couple comments * Update include/tvm/runtime/vm.h Co-Authored-By: jroesch <roeschinc@gmail.com> * Address code review feedback * Fix final comment * Address comments * Error reporting and example * add Const * Explicitly delete copy assignment operator * Fix rebase * Pass 3rd arg to fusion
Jared Roesch committed -
* Add topi adaptive_pool * Use adaptive_pool to compute global_pool * Add relay adaptive pool2d * Fix lint * Fix typo * Minor change * Change support level to 10 * Add contrib * Remove global pool schedule * Add contrib module * Fix lint * Update doc * Update doc
Yao Wang committed
-
- 01 May, 2019 1 commit
-
-
* Fix PRelu layout in Relay * Fix cpplint * Add PRelu test case
Zhao Wu committed
-
- 29 Apr, 2019 1 commit
-
-
* ssd gluoncv gpu op updated * ssd gluoncv gpu op updated * tutorials and tests modified * tutorials and tests modified * fix lint * fix lint * address comment * multibox bug fixed * space line added * use less threads per block * use less threads per block * less threads per block for get valid count * less threads per block for get valid count * merge with master * Revert "less threads per block for get valid count" This reverts commit 08896cfccc34b0b2a1646d01d01ea4cad73941c4. * Revert "less threads per block for get valid count" This reverts commit 08896cfccc34b0b2a1646d01d01ea4cad73941c4. * typo fixed * elem length made to a variable * fix lint error * fix lint error * lint fixed * bug fixed * bug fixed * lint fixed * error fixed * error fixed * test ci * test ci * separate argsort to be an independent op * separate argsort to be an independent op * fix lint * fix lint * remove unsupported models * typo fixed * argsort added to relay * solve conflicts with master * fix lint * fix lint * test push * Revert "test push" This reverts commit 6db00883fab6cc06bddf564c926bb27c874397d8. * fix lint error * fix more lint * cpu test_sort updated * debug ci * nms fixed * expose argsort to relay frontend * test ci * fix lint * sort register error fixed * fix nnvm * nms type fixed * adaptive pooling added to relay * Revert "adaptive pooling added to relay" This reverts commit 1119f1f2c055753e0cc5611627597749134c5c8c. * fix lint * expose argsort op * fix lint * fix lint * fix lint * sort test updated * sort bug fixed * nnvm error fixed * fix argsort default data type returned to be float instead of int * fix lint * fix lint * test fixed * fix valid count * fix titanx bug * tutorial add both targets * titanx error fixed * try to fix CI old gpu error * try to solve CI GPU error * get_valid_count added * reverse get_valid_count * get valid count optimized * address comments * fix ci error * remove unnecessary block sync * add back one sync * address comments * address more comments * more comments * move sort to be independent algorithm * typo fixed * more typos * comments addressed * doc updated * fix pylint * address final comments * apache license added
Leyuan Wang committed
-
- 26 Apr, 2019 1 commit
-
-
* Quantize dense layers * Add out_dtype argument to dense; Add dense_int8 on CUDA * Add topi unittest of dense int8 * Fix relay * Fix topi integration * Fix quantization * Update dense_rewrite * Trigger CI * Change qconfig quantize_dense to quantize_op * Fix * Remove quantize_op from qconfig
Wuwei Lin committed
-
- 25 Apr, 2019 1 commit
-
-
Hiroyuki Makino committed
-
- 17 Apr, 2019 1 commit
-
-
* Implement nn.bias_add compute in C++ * Address comments * Remove unnecessary check
Yinghai Lu committed
-
- 16 Apr, 2019 1 commit
-
-
Returning false means "retry in the future"; in the case of an error, it should be reported ASAP, not retried.
雾雨魔理沙 committed
-
- 10 Apr, 2019 1 commit
-
-
* Add `set_body_simple` to Registry, refactor a lot of code to use it * Add more types to Relay PackedFuncs * Add Registry::set_body_method to easily make Node methods into PackedFuncs * Add set_body_method, set_body_node_method; start typing api_lang * Add some docs, remove unused script * Fix mysterious linter problem * Touch up api_ir.cc * Fix some issues with TOPI argument counts * Revert changes to topi.cc to avoid problems with optional arguments * A little more cleanup * Type more of the api _ functions * Whitespace * Finalize names and docs for new registry helpers * Update docs
James Gilles committed
-
- 09 Apr, 2019 1 commit
-
-
[Relay] InferCorrectLayout for strided_slice & min_num_branches option in CombineParallelConv2D (#2961) * [Relay] InferCorrectLayout for strided_slice * Add min_num_branches option to CombineParallelConv2D * Return undef if original layout contains split axes
Wuwei Lin committed
-