- 04 Sep, 2019 1 commit
-
-
* Add gradient implementations * Add docstrings to fix lint errors
SWu committed
-
- 01 Sep, 2019 2 commits
-
-
* init shape func in interpreter and vm compiler * Update interpreter * fix * lint * lint * fix * remove hack * update * fix * fix * update * address comments & update for shape_of * fix lint * update * fix hybrid * lint * fix bug & add take shape func * lint * lint * update * fix flaky test * add todo
Haichen Shen committed -
* Added arm_cpu NHWC schedules. * Fixed kernel shape legalization. * Added bitserial ops to relay. * Snapshot and more missing files. * Added dense testing. * Added tests * Added ASF header to new files. * cc lint * Pylint change. * pylint fixes. * Change arm legalize test. * Added assert check to arm legalize. * Added better documentation, fixed some bad style * Reverted arm conv2d nhwc changes.
Josh Fromm committed
-
- 29 Aug, 2019 2 commits
-
-
* [Relay] Conv2d grad * Fix test * Fix first order gradient
Wuwei Lin committed -
* [TensorFlow] Fix limitation that depth_mult can only be 1 for DepthwiseConv2dNative * Improve code readability
lixiaoquan committed
-
- 23 Aug, 2019 1 commit
-
-
Animesh Jain committed
-
- 22 Aug, 2019 2 commits
-
-
* Add one-hot to Relay * topi implementation * Working * add topi test * Add TF test * Fix check * fix linting issues * fix documentation * Fix documentation * Add support for on_value, off_value, axis, dtype * Add full support for axis * Fix compute and update test_forward * Move on_value and off_value to inputs * Add topi test * Update tests * Update docs * Fix style * re-enable tests * Add one_hot to mxnet converter
Jon Soifer committed -
Josh Fromm committed
-
- 21 Aug, 2019 1 commit
-
-
* Support cblas library in dense * start to add support for generic batch_matmul compute * Add x86 override for batch_matmul * Fix linting * reset file * Fix typos * dummy change to re-trigger CI
Jon Soifer committed
-
- 14 Aug, 2019 1 commit
-
-
Animesh Jain committed
-
- 13 Aug, 2019 1 commit
-
-
* Added relay and topi mirror_pad operator. * Added mirror_padding to tensorflow frontend. * Added mirrorpad testing in tensorflow frontend. * Added space_to_depth in tf frontend. * Added tests for spacetodepth. * spacetodepth bug fix. * Lint fix * Added mirror pad python attrs. * Pad code formatting. * Syntax improvement * Hopefully last lint fix
Josh Fromm committed
-
- 09 Aug, 2019 1 commit
-
-
* reproduce error * fix * lint * lint
雾雨魔理沙 committed
-
- 07 Aug, 2019 1 commit
-
-
* Add LayerNorm op * update * fix * Add mean_std and mean_variance * add std and update doc * add license * x * lint * x * fix * fix doc
Haichen Shen committed
-
- 06 Aug, 2019 2 commits
-
-
* [Relay] Rewrite pass. This pass transforms an expression into another expression. This pass has many use cases: * Replace an expr with another expr, if the other expr has faster performance. * For ASICs, we might want to modify the inputs to adapt to the HW support. * Alter op layout can work in conjunction with this pass. The supporting use case is the Intel i8 x i8 conv. Intel HW supports u8 x i8 conv in HW. Using this pass, we can replace an i8 x i8 conv with a sequence of operators where one of the operators is now u8 x i8 conv. This will also help automatic quantization performance. * Better API name. * Removing the conv2d legalization for x86. Will send a separate PR. * Test name changes. * Registering one function to register FTVMLegalize. * Better comments.
Animesh Jain committed -
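The i8 x i8 to u8 x i8 rewrite described in the commit above can be illustrated with plain integer arithmetic. This is only a sketch of one identity that makes such a rewrite possible: shift the int8 data by 128 into the uint8 range, then subtract a correction term. Whether the pass emits exactly this operator sequence is an assumption, and the names `dot_i8_i8` and `dot_via_u8_i8` are hypothetical.

```python
def dot_i8_i8(x, w):
    """Reference: dot product of two int8 vectors."""
    return sum(a * b for a, b in zip(x, w))


def dot_via_u8_i8(x, w):
    """Same result, computed with a u8 x i8 product plus a correction.

    Uses the identity sum(x * w) == sum((x + 128) * w) - 128 * sum(w).
    """
    # Shift int8 values in [-128, 127] into the uint8 range [0, 255].
    x_u8 = [a + 128 for a in x]
    assert all(0 <= a <= 255 for a in x_u8)
    # The u8 x i8 product is what the hardware-friendly operator would compute.
    shifted = sum(a * b for a, b in zip(x_u8, w))
    # Undo the shift: the correction depends only on the weights.
    return shifted - 128 * sum(w)


x = [-128, -1, 0, 127]
w = [3, -7, 5, -128]
direct = dot_i8_i8(x, w)
rewritten = dot_via_u8_i8(x, w)
```

Because the correction term depends only on the weights, it could be computed once ahead of time when the weights are constant.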
* add build gcn tutorial * add transpose operator for square sparse matrices * remove extra files * change loop tag * comply with lint * comply with lint -- line too long * comply with lint * lint check * lint check * lint check * apply Marisa and Thierry's reviews
Yulun Yao committed
-
- 01 Aug, 2019 1 commit
-
-
The patch adds support for Tensorflow operators log1p and cos. * Tensorflow log1p is described at https://www.tensorflow.org/api_docs/python/tf/math/log1p * Tensorflow cos is described at https://www.tensorflow.org/api_docs/python/tf/math/cos * Tensorflow sin is described at https://www.tensorflow.org/api_docs/python/tf/math/sin
alexgl-github committed
-
- 24 Jul, 2019 1 commit
-
-
Wuwei Lin committed
-
- 23 Jul, 2019 2 commits
-
-
internally and externally, interested in replacing standard dense layers with block-sparse matrix multiplication layers. The motivations are generally: higher performance (due to reduction in FLOPs, memory bandwidth/cache footprint), enabling larger models (e.g. fitting more layers in a given memory budget). Some public work along these lines: * https://openai.com/blog/block-sparse-gpu-kernels/ * https://openai.com/blog/sparse-transformer/ * https://arxiv.org/abs/1802.08435 * https://arxiv.org/abs/1711.02782 Various groups have been able to successfully train models with reasonable levels of sparsity (90%+) with marginal accuracy changes, which suggests substantial speedups are possible (as this implies a >10x reduction in FLOPs). It is fairly straightforward to realize these theoretical speedups; see e.g. TVM benchmarks for Intel CPUs in https://gist.github.com/ajtulloch/e65f90487bceb8848128e8db582fe902, and CUDA results in https://github.com/openai/blocksparse, etc. * https://github.com/openai/blocksparse (CUDA) * https://software.intel.com/en-us/mkl-developer-reference-c-mkl-bsrmm (MKL BSRM) * https://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.sparse.bsr_matrix.html (SCIPY BSR representation) This is extracted from a patch we've been using internally. There are various extensions possible (int8/fp16/bf16, CUDA/other GPU architectures), but this is a reasonable starting point. This needs more thorough unit test coverage, however. We follow the conventions established by scipy.sparse.bsr_matrix and other libraries; see the unit tests for details. For folks interested in experimenting with scheduling/AutoTVM etc, https://gist.github.com/ajtulloch/e65f90487bceb8848128e8db582fe902 is a useful starting point.
Andrew Tulloch committed -
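As a rough illustration of the BSR layout mentioned in the commit above, the sketch below multiplies a block-sparse matrix by a dense vector in pure Python. It assumes the same three arrays that scipy.sparse.bsr_matrix exposes (`data`, `indices`, `indptr`). The function name `bsr_matvec` is hypothetical, this is not the TOPI kernel from the patch, and it assumes at least one stored block.

```python
def bsr_matvec(data, indices, indptr, x):
    """Multiply a BSR matrix by a dense vector.

    data[k]    -- the k-th stored block, as a list of rows (R x C)
    indices[k] -- block-column index of block k
    indptr[i]  -- blocks of block-row i are data[indptr[i]:indptr[i + 1]]
    """
    n_block_rows = len(indptr) - 1
    block_r = len(data[0])
    block_c = len(data[0][0])
    y = [0.0] * (n_block_rows * block_r)
    for bi in range(n_block_rows):
        # Only the stored (nonzero) blocks of this block-row are visited.
        for k in range(indptr[bi], indptr[bi + 1]):
            col0 = indices[k] * block_c
            for r in range(block_r):
                acc = 0.0
                for c in range(block_c):
                    acc += data[k][r][c] * x[col0 + c]
                y[bi * block_r + r] += acc
    return y


# A 4x4 matrix with 2x2 blocks, where only the two diagonal blocks are stored:
# [[1, 2, 0, 0],
#  [3, 4, 0, 0],
#  [0, 0, 5, 6],
#  [0, 0, 7, 8]]
data = [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]]]
indices = [0, 1]
indptr = [0, 1, 2]
result = bsr_matvec(data, indices, indptr, [1.0, 1.0, 1.0, 1.0])
```

The work done is proportional to the number of stored blocks, which is where the FLOP reduction at high sparsity comes from.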
= Motivation It's useful to expose the tvm::reinterpret functionality to Relay/TOPI users, as this allows them to build (fused) operators leveraging the bitwise reinterpretation of an operator. An example is approximate transcendental functions, which can be implemented similar to:

```.py
from tvm import relay


def C(x):
    # Wrap a Python scalar as a float32 Relay constant.
    return relay.expr.const(x, "float32")


def approx_exp(x):
    x = relay.minimum(relay.maximum(x, C(-88.0)), C(88.0))
    x = C(127.0) + x * C(1.44269504)
    xf = relay.floor(x)
    i = relay.cast(xf, "int32")
    x = x - xf
    Y = C(0.99992522) + x * (C(0.69583354) + x * (C(0.22606716) + x * C(0.078024523)))
    # Place the integer part in the float32 exponent field, then reinterpret the bits.
    exponent = relay.left_shift(i, relay.expr.const(23, "int32"))
    exponent = relay.reinterpret(exponent, "float32")
    return exponent * Y


def approx_sigmoid(x):
    # <2.0e-5 absolute error over [-5, 5]
    y = approx_exp(x)
    return y / (y + C(1.0))


def approx_tanh(x):
    # <4.0e-5 absolute error over [-5, 5]
    x = x * C(2.0)
    y = approx_exp(x)
    return (y - C(1.0)) / (y + C(1.0))
```

See unit tests for implementations of these approximate transcendentals.
Andrew Tulloch committed
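To see why the bitwise reinterpretation in the commit above works, the same arithmetic can be followed on scalars with only the Python standard library. This is a sketch, not part of the patch: `reinterpret_i32_as_f32` is a hypothetical helper that stands in for `relay.reinterpret`, and the intermediate arithmetic runs in double precision rather than float32, so the error figures will differ slightly from those quoted in the commit.

```python
import math
import struct


def reinterpret_i32_as_f32(bits):
    """Return the float32 whose bit pattern equals the given 32-bit integer."""
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFFFFFF))[0]


def approx_exp(x):
    # Clamp so that the biased exponent below stays inside [0, 253].
    x = min(max(x, -88.0), 88.0)
    # exp(x) == 2 ** (x * log2(e)); adding 127 applies the float32 exponent bias.
    x = 127.0 + x * 1.44269504
    xf = math.floor(x)
    frac = x - xf
    # Cubic polynomial approximating 2 ** frac on [0, 1).
    y = 0.99992522 + frac * (0.69583354 + frac * (0.22606716 + frac * 0.078024523))
    # Shifting the integer part into bits 23..30 yields the float 2 ** (xf - 127).
    scale = reinterpret_i32_as_f32(int(xf) << 23)
    return scale * y


def approx_sigmoid(x):
    y = approx_exp(x)
    return y / (y + 1.0)


one = reinterpret_i32_as_f32(127 << 23)
```

Here `127 << 23` is the bit pattern `0x3F800000`, which is the float32 encoding of 1.0. The shift builds a power of two without calling any exponential routine, and the polynomial supplies the fractional part.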
-
- 19 Jul, 2019 2 commits
- 18 Jul, 2019 1 commit
-
-
雾雨魔理沙 committed
-
- 10 Jul, 2019 1 commit
-
-
* Implement type checking for Any Remove code generation related changes Remove compile changes Remove more Remove unification hack Add some code back that was needed, and clean up test Refactor test cases WIP Implement TypeHint AST Add test case which should fail Remove unification changes, and fix bug with let rec Restore unification for shapes Improve error reporting while debugging All examples type check All examples type check WIP First version that works with hints, needs clean up Remove dead code Tweaks Remove type hint Remove unnecessary type hint stuff Remove more type hints Clean up Expose Any expression node Address CR Fix Fix solver Kill unnecessary code Fix PyLint Fix Relocate loops Fix license and test Lint again Lint again Fix loops Fix docstring Fix template error Fix compiler issue Fix compile err Remove more runtime changes Restore buffer Fix segfault Fix Fix arange * Address feedback * Fix typo * Fix arange * Fix op level3 * Fix issue with Python wrapper
Jared Roesch committed
-
- 09 Jul, 2019 1 commit
-
-
Amy Wang committed
-
- 01 Jul, 2019 1 commit
-
-
* add mac count for conv 2d transpose * add the explanation of missing parameter in docstring * typo * fix pylint
Yida Wang committed
-
- 28 Jun, 2019 3 commits
-
-
Thierry Moreau committed
-
* Add sequence_mask use exactly the same arguments as mxnet fix * fix lint * fix lint * add mxnet conversion + relay * update * update doc * fix pylint * fix doc * address comment * try to address comments * try to enable shape check for valid_length * fix * try to fix * fix bug * try to fix * address comment * address comment
Xingjian Shi committed -
Amy Wang committed
-
- 27 Jun, 2019 1 commit
-
-
* fix relay reduce axis bug * add tests for reduce bug
Altan Haan committed
-
- 11 Jun, 2019 1 commit
-
-
hlu1 committed
-
- 09 Jun, 2019 1 commit
-
-
* Improve non_max_suppression for CPU * Improve get_valid_counts * Minor change * Skip some unnecessary computes
Yao Wang committed
-
- 07 Jun, 2019 1 commit
-
-
Alexander Pivovarov committed
-
- 04 Jun, 2019 1 commit
-
-
* init impl for topk * Fix cpu for topk * init cuda impl for topk * Add cuda for topk * fix * Add doc * update doc * lint * lint * lint * x * fix warning * [Relay] Add TopK in tf converter * Add frontend converter * fix
Haichen Shen committed
-
- 20 May, 2019 1 commit
-
-
* [Relay][TOPI] operator All * Update tests/python/frontend/tensorflow/test_forward.py Co-Authored-By: yongwww <55wuyong@163.com> * fix comments * change to level 4
Yong Wu committed
-
- 17 May, 2019 1 commit
-
-
hlu1 committed
-
- 11 May, 2019 1 commit
-
-
Register all operators' Python attributes in Python so they can be easily accessed from Python code (#3175)
Steven S. Lyubomirsky committed
-
- 09 May, 2019 1 commit
-
-
* Add topi adaptive_pool * Use adaptive_pool to compute global_pool * Add relay adaptive pool2d * Fix lint * Fix typo * Minor change * Change support level to 10 * Add contrib * Remove global pool schedule * Add contrib module * Fix lint * Update doc * Update doc
Yao Wang committed
-
- 29 Apr, 2019 2 commits
-
-
* ssd gluoncv gpu op updated * ssd gluoncv gpu op updated * tutorials and tests modified * tutorials and tests modified * fix lint * fix lint * address comment * multibox bug fixed * space line added * use less threads per block * use less threads per block * less threads per block for get valid count * less threads per block for get valid count * merge with master * Revert "less threads per block for get valid count" This reverts commit 08896cfccc34b0b2a1646d01d01ea4cad73941c4. * Revert "less threads per block for get valid count" This reverts commit 08896cfccc34b0b2a1646d01d01ea4cad73941c4. * typo fixed * elem length made to a variable * fix lint error * fix lint error * lint fixed * bug fixed * bug fixed * lint fixed * error fixed * error fixed * test ci * test ci * separate argsort to be an independent op * separate argsort to be an independent op * fix lint * fix lint * remove unsupported models * typo fixed * argsort added to relay * solve conflicts with master * fix lint * fix lint * test push * Revert "test push" This reverts commit 6db00883fab6cc06bddf564c926bb27c874397d8. * fix lint error * fix more lint * cpu test_sort updated * debug ci * nms fixed * expose argsort to relay frontend * test ci * fix lint * sort register error fixed * fix nnvm * nms type fixed * adaptive pooling added to relay * Revert "adaptive pooling added to relay" This reverts commit 1119f1f2c055753e0cc5611627597749134c5c8c.
* fix lint * expose argsort op * fix lint * fix lint * fix lint * sort test updated * sort bug fixed * nnvm error fixed * fix argsort default data type returned to be float instead of int * fix lint * fix lint * test fixed * fix valid count * fix titanx bug * tutorial add both targets * titanx error fixed * try to fix CI old gpu error * try to solve CI GPU error * get_valid_count added * reverse get_valid_count * get valid count optimized * address comments * fix ci error * remove unnecessary block sync * add back one sync * address comments * address more comments * more comments * move sort to be an independent algorithm * typo fixed * more typos * comments addressed * doc updated * fix pylint * address final comments * apache license added
Leyuan Wang committed -
masahi committed
-
- 27 Apr, 2019 1 commit
-
-
* Fixed issue #3069 by adding in_channels * Registered group_conv2d_nchw as topi compute * Improved by checking tag value * Removed group_conv2d_nchw topi registration * Added test for relay group_conv2d_nchw * Added assertions to forbid small group size * Removed hard-coded oc_block_factor * Added explanatory comments to group_conv2d_nchw_cuda * Updated group_conv2d_nchw_cuda schedule Removed 'direct' CUDA tests * Reverted an accidental change in a conv2d test * Fixed indentation problems * Fixed a mis-commented line * Reverted change in group_conv2d_nchw tag * Removed commented int8 group_conv2d test * Fixed group size assertions in group_conv2d_nchw_cuda
Ruizhe Zhao (Vincent) committed
-