- 10 Oct, 2019 1 commit
-
-
* Add FIFO buffer op to enable explicit computation re-use in convolution * Add a test * Add end-to-end test with 1D convolution * Add a stub in MXNet frontend * Address reviewer comments * Add back stub for MXNet frontend
Philip Hyunsu Cho committed
-
- 02 Oct, 2019 1 commit
-
-
Umang Yadav committed
-
- 22 Sep, 2019 1 commit
-
-
* add expr `isnan` * move to intrinsic * doc & add to topi * fix error from ci
Huang, Guangtai committed
-
- 20 Sep, 2019 1 commit
-
-
MXNet pad is described at: https://mxnet.incubator.apache.org/api/python/symbol/symbol.html#mxnet.symbol.pad
Add support for parameter 'None' in MXNet slice operator. MXNet 'slice' is described at https://mxnet.incubator.apache.org/api/python/symbol/symbol.html#mxnet.symbol.slice
Add support for MXNet cos, sin, arctan.
MXNet 'cos' is described at https://mxnet.incubator.apache.org/api/python/symbol/symbol.html#mxnet.symbol.cos
MXNet 'sin' is described at https://mxnet.incubator.apache.org/api/python/symbol/symbol.html#mxnet.symbol.sin
MXNet 'arctan' is described at https://mxnet.incubator.apache.org/api/python/symbol/symbol.html#mxnet.symbol.arctan
Add support for MXNet 1D Convolution and 1D Deconvolution.
MXNet Convolution is described at: https://mxnet.incubator.apache.org/api/python/symbol/symbol.html#mxnet.symbol.Convolution
MXNet Deconvolution is described at: https://mxnet.incubator.apache.org/api/python/symbol/symbol.html#mxnet.symbol.Deconvolution
Alex Gladkov committed
-
- 19 Sep, 2019 1 commit
-
-
* add proper scheduling for dense on CUDA * add fallback config and fix unit test * fix corner cases * refactoring * fix bias and add testcase * let fusion happen
Cody Hao Yu committed
-
- 16 Sep, 2019 1 commit
-
-
* [TOPI] operator support: logical_and, logical_or, logical_not * [TOPI] fix test cases for operator support: logical_and, logical_or, logical_not * [TOPI] fix test cases for operator support: logical_not
Neo Chien committed
-
- 09 Sep, 2019 1 commit
-
-
* add more ops * stop vectorization for erf * x * cleanup * fix * add whitelist for vectorizable intrin * add tf converter * fix dense * fix * add missing intrin * fix mxnet frontend * fix nvptx
Haichen Shen committed
-
- 08 Sep, 2019 1 commit
-
-
雾雨魔理沙 committed
-
- 01 Sep, 2019 1 commit
-
-
* init shape func in interpreter and vm compiler * Update interpreter * fix * lint * lint * fix * remove hack * update * fix * fix * update * address comments & update for shape_of * fix lint * update * fix hybrid * lint * fix bug & add take shape func * lint * lint * update * fix flaky test * add todo
Haichen Shen committed
-
- 22 Aug, 2019 2 commits
-
-
* Add one-hot to Relay * topi implementation * Working * add topi test * Add TF test * Fix check * fix linting issues * fix documentation * Fix documentation * Add support for on_value, off_value, axis, dtype * Add full support for axis * Fix compute and update test_forward * Move on_value and off_value to inputs * Add topi test * Update tests * Update docs * Fix style * re-enable tests * Add one_hot to mxnet converter
Jon Soifer committed -
Josh Fromm committed
-
- 06 Aug, 2019 1 commit
-
-
* add build gcn tutorial * add transpose operator for square sparse matrices * remove extra files * change loop tag * comply with lint * comply with lint -- line too long * comply with lint * lint check * lint check * lint check * apply marisa and theirry's reviews
Yulun Yao committed
-
- 01 Aug, 2019 1 commit
-
-
The patch adds support for TensorFlow operators log1p and cos.
TensorFlow log1p is described at https://www.tensorflow.org/api_docs/python/tf/math/log1p
TensorFlow cos is described at https://www.tensorflow.org/api_docs/python/tf/math/cos
TensorFlow sin is described at https://www.tensorflow.org/api_docs/python/tf/math/sin
alexgl-github committed
-
- 31 Jul, 2019 1 commit
-
-
* [TOPI][CUDA] schedule for group_conv2d * Fix #flops
Wuwei Lin committed
-
- 30 Jul, 2019 1 commit
-
-
* Fix traverse_inline not inline zero input op properly * Add where to python and set tag to broadcast * Fix inline * test * fix test target * fix
Wuwei Lin committed
-
- 28 Jul, 2019 1 commit
-
-
Balint Cristian committed
-
- 26 Jul, 2019 1 commit
-
-
* [TOPI][CUDA] Schedule for pool_grad * Relay test * Fix fused op * doc * Remove set scope local
Wuwei Lin committed
-
- 25 Jul, 2019 1 commit
-
-
Balint Cristian committed
-
- 24 Jul, 2019 1 commit
-
-
Wuwei Lin committed
-
- 23 Jul, 2019 3 commits
-
-
internally and externally, interested in replacing standard dense layers with block-sparse matrix multiplication layers. The motivations are generally: higher performance (due to reduction in FLOPs, memory bandwidth/cache footprint), enabling larger models (e.g. fitting more layers in a given memory budget).

Some public work along these lines:
* https://openai.com/blog/block-sparse-gpu-kernels/
* https://openai.com/blog/sparse-transformer/
* https://arxiv.org/abs/1802.08435
* https://arxiv.org/abs/1711.02782

Various groups have been able to successfully train models with reasonable levels of sparsity (90%+) with marginal accuracy changes, which suggests substantial speedups are possible (as this implies a >10x reduction in FLOPs).

It is fairly straightforward to realize these theoretical speedups; see e.g. TVM benchmarks for Intel CPUs in https://gist.github.com/ajtulloch/e65f90487bceb8848128e8db582fe902, and CUDA results in https://github.com/openai/blocksparse, etc.
* https://github.com/openai/blocksparse (CUDA)
* https://software.intel.com/en-us/mkl-developer-reference-c-mkl-bsrmm (MKL BSRMM)
* https://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.sparse.bsr_matrix.html (SciPy BSR representation)

This is extracted from a patch we've been using internally. There are various extensions possible (int8/fp16/bf16, CUDA/other GPU architectures), but this is a reasonable starting point. It needs more thorough unit test coverage, however.

We follow the conventions established by scipy.sparse.bsr_matrix and other libraries; see the unit tests for details.

For folks interested in experimenting with scheduling/AutoTVM etc., https://gist.github.com/ajtulloch/e65f90487bceb8848128e8db582fe902 is a useful starting point.
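
For readers unfamiliar with the BSR layout that scipy.sparse.bsr_matrix uses, the following is a minimal pure-Python sketch of a BSR matrix-vector product. It is an illustration only, not code from the patch: the function name `bsr_matvec` and the example matrix are assumptions. The three arrays follow the scipy convention: `data` holds the dense blocks, `indices` holds the block-column index of each block, and `indptr` gives the range of blocks belonging to each block-row.

```python
def bsr_matvec(data, indices, indptr, x, block_shape):
    """Multiply a BSR-format matrix by a dense vector x (illustrative helper).

    data:    list of stored blocks, each a list of R rows with C entries
    indices: block-column index of each stored block
    indptr:  blocks indptr[i]..indptr[i+1]-1 belong to block-row i
    """
    R, C = block_shape
    num_block_rows = len(indptr) - 1
    y = [0.0] * (num_block_rows * R)
    for bi in range(num_block_rows):
        # Only the stored (non-zero) blocks of this block-row are visited.
        for k in range(indptr[bi], indptr[bi + 1]):
            bj = indices[k]
            block = data[k]
            for r in range(R):
                acc = 0.0
                for c in range(C):
                    acc += block[r][c] * x[bj * C + c]
                y[bi * R + r] += acc
    return y


# A 4x4 matrix with 2x2 blocks; only the two diagonal blocks are stored:
# [[1 2 0 0]
#  [3 4 0 0]
#  [0 0 5 6]
#  [0 0 7 8]]
data = [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]]]
indices = [0, 1]
indptr = [0, 1, 2]
x = [1.0, 1.0, 1.0, 1.0]
y = bsr_matvec(data, indices, indptr, x, (2, 2))
print(y)
```

Because the inner loops touch only stored blocks, the work scales with the number of non-zero blocks rather than with the full matrix size, which is where the FLOP reduction discussed above comes from.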
Andrew Tulloch committed -
= Motivation
It's useful to expose the tvm::reinterpret functionality to Relay/TOPI users, as this allows them to build (fused) operators leveraging the bitwise reinterpretation of an operator. An example is approximate transcendental functions, which can be implemented similar to:
```.py
from tvm import relay

def C(x):
    return relay.expr.const(x, "float32")

def approx_exp(x):
    x = relay.minimum(relay.maximum(x, C(-88.0)), C(88.0))
    x = C(127.0) + x * C(1.44269504)
    xf = relay.floor(x)
    i = relay.cast(xf, "int32")
    x = x - xf
    Y = C(0.99992522) + x * (C(0.69583354) + x * (C(0.22606716) + x * C(0.078024523)))
    exponent = relay.left_shift(i, relay.expr.const(23, "int32"))
    exponent = relay.reinterpret(exponent, "float32")
    return exponent * Y

def approx_sigmoid(x):
    # <2.0e-5 absolute error over [-5, 5]
    y = approx_exp(x)
    return y / (y + C(1.0))

def approx_tanh(x):
    # <4.0e-5 absolute error over [-5, 5]
    x = x * C(2.0)
    y = approx_exp(x)
    return (y - C(1.0)) / (y + C(1.0))
```
See unit tests for implementations of these approximate transcendentals.
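
The bit trick behind approx_exp can be tried outside Relay. The sketch below is a scalar, standard-library-only illustration and not part of the commit: the helper `reinterpret_i32_as_f32` is a hypothetical stand-in for relay.reinterpret, and the polynomial is evaluated in Python's double precision rather than float32, so error figures may differ slightly from the Relay version.

```python
import math
import struct


def reinterpret_i32_as_f32(i):
    """Reinterpret the bits of a 32-bit integer as an IEEE-754 float32."""
    return struct.unpack("<f", struct.pack("<i", i))[0]


def approx_exp(x):
    # Clamp so the biased exponent computed below stays within float32 range.
    x = min(max(x, -88.0), 88.0)
    # exp(x) = 2**(x * log2(e)); adding 127 applies the float32 exponent bias.
    x = 127.0 + x * 1.44269504
    xf = math.floor(x)
    i = int(xf)
    frac = x - xf
    # Cubic polynomial approximating 2**frac on [0, 1).
    y = 0.99992522 + frac * (0.69583354 + frac * (0.22606716 + frac * 0.078024523))
    # Shifting into bits 23..30 places i in the exponent field, giving 2**(i - 127).
    exponent = reinterpret_i32_as_f32(i << 23)
    return exponent * y


def approx_sigmoid(x):
    y = approx_exp(x)
    return y / (y + 1.0)


for v in (-2.0, 0.0, 1.0, 3.5):
    print(v, approx_exp(v), math.exp(v))
```

The integer part of the scaled input is written directly into the exponent field of the float, while the polynomial handles the fractional part. That is why a bitwise reinterpret is needed rather than a value-converting cast.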
Andrew Tulloch committed -
Animesh Jain committed
-
- 19 Jul, 2019 1 commit
-
-
Yong Wu committed
-
- 03 Jul, 2019 1 commit
-
-
* Pre-allocate buffer for x86 roi_align * Fix typo
Yao Wang committed
-
- 28 Jun, 2019 1 commit
-
-
* Add sequence_mask use exactly the same arguments as mxnet fix * fix lint * fix lint * add mxnet conversion + relay * update * update doc * fix pylint * fix doc * address comment * try to address comments * try to enable shape check for valid_length * fix * try to fix * fix bug * try to fix * address comment * address comment
Xingjian Shi committed
-
- 14 Jun, 2019 1 commit
-
-
* fix flaky test * fix flaky quantize pass
Haichen Shen committed
-
- 11 Jun, 2019 1 commit
-
-
hlu1 committed
-
- 10 Jun, 2019 1 commit
-
-
* Support x86 dilation conv2d and improve multi-batch conv2d * Fix lint
Yao Wang committed
-
- 09 Jun, 2019 1 commit
-
-
* Improve non_max_suppression for CPU * Improve get_valid_counts * Minor change * Skip some unnecessary computes
Yao Wang committed
-
- 06 Jun, 2019 2 commits
- 05 Jun, 2019 1 commit
-
-
hlu1 committed
-
- 04 Jun, 2019 1 commit
-
-
* init impl for topk * Fix cpu for topk * init cuda impl for topk * Add cuda for topk * fix * Add doc * update doc * lint * lint * lint * x * fix warning * [Relay] Add TopK in tf converter * Add frontend converter * fix
Haichen Shen committed
-
- 28 May, 2019 1 commit
-
-
masahi committed
-
- 22 May, 2019 1 commit
-
-
* Support the 1x1 int8 conv with NHWC layout and weight packing fix linter * fix the memoize issue * fix the failed nhwc test * add the schedule for pack to unbreak other tests * skip avx512 compile * Unify the data_layout and kernel_layout relation * add asf header * fix the comment * retrigger the build/test
llyfacebook committed
-
- 20 May, 2019 2 commits
-
-
* [Relay][TOPI] operator All * Update tests/python/frontend/tensorflow/test_forward.py Co-Authored-By: yongwww <55wuyong@163.com> * fix comments * change to level 4
Yong Wu committed -
Haichen Shen committed
-
- 17 May, 2019 1 commit
-
-
hlu1 committed
-
- 09 May, 2019 1 commit
-
-
* Add topi adaptive_pool * Use adaptive_pool to compute global_pool * Add relay adaptive pool2d * Fix lint * Fix typo * Minor change * Change support level to 10 * Add contrib * Remove global pool schedule * Add contrib module * Fix lint * Update doc * Update doc
Yao Wang committed
-
- 08 May, 2019 1 commit
-
-
* deconv tests * deconv bug fixed for certain cases tests added
Leyuan Wang committed
-