Commits · ba4d081c217a13830ec5e34fb57d6282ada0ea86 · wenyuanbo / tic

22 Sep, 2019 1 commit
- Add operator `isnan` (#3979) · 16d4da4d
```
* add expr `isnan`

* move to intrinsic

* doc & add to topi

* fix error from ci
```
  Huang, Guangtai committed 5 years ago
20 Sep, 2019 1 commit

Add support for MXNet pad operator. (#3739) · 719d6d47

MXNet pad is described at:
https://mxnet.incubator.apache.org/api/python/symbol/symbol.html#mxnet.symbol.pad

Add support for parameter 'None' in MXNet slice operator.

MXNet 'slice' is described at
https://mxnet.incubator.apache.org/api/python/symbol/symbol.html#mxnet.symbol.slice

Add support for MXNet cos, sin, arctan

MXNet 'cos' is described at
https://mxnet.incubator.apache.org/api/python/symbol/symbol.html#mxnet.symbol.cos

MXNet 'sin' is described at
https://mxnet.incubator.apache.org/api/python/symbol/symbol.html#mxnet.symbol.sin

MXNet 'arctan' is described at
https://mxnet.incubator.apache.org/api/python/symbol/symbol.html#mxnet.symbol.arctan

Add support for MXNet 1D Convolution and 1D Deconvolution

MXNet convolution is described at:
https://mxnet.incubator.apache.org/api/python/symbol/symbol.html#mxnet.symbol.Convolution

MXNet Deconvolution is described at:
https://mxnet.incubator.apache.org/api/python/symbol/symbol.html#mxnet.symbol.Deconvolution

committed 5 years ago


19 Sep, 2019 1 commit

[TOPI] Add proper scheduling for dense on CUDA (#3923) · bec08fec

* add proper scheduling for dense on CUDA

* add fallback config and fix unit test

* fix corner cases

* refactoring

* fix bias and add testcase

* let fusion happen

committed 5 years ago


16 Sep, 2019 1 commit

[TOPI] operator support: logical_and, logical_or, logical_not (#3929) · ab1853c2

* [TOPI] operator support: logical_and, logical_or, logical_not

* [TOPI] operator support: logical_and, logical_or, logical_not

* [TOPI] fix test cases for operator support: logical_and, logical_or, logical_not

* [TOPI] fix test cases for operator support: logical_not

committed 5 years ago


09 Sep, 2019 1 commit

[Relay/TOPI][Op] Add erf intrinsic and op (#3702) · 2f5b155a

* add more ops

* stop vectorization for erf

* x

* cleanup

* fix

* add whitelist for vectorizable intrin

* add tf converter

* fix dense

* fix

* add missing intrin

* fix mxnet frontend

* fix nvptx

committed 5 years ago


08 Sep, 2019 1 commit
- change docker install script (#3524) · 184fa484
  雾雨魔理沙 committed 5 years ago
  
01 Sep, 2019 1 commit

[Relay][Any] Add shape func for dynamic shape (#3606) · eef35a57

* init shape func in interpreter and vm compiler

* Update interpreter

* fix

* lint

* lint

* fix

* remove hack

* update

* fix

* fix

* update

* address comments & update for shape_of

* fix lint

* update

* fix hybrid

* lint

* fix bug & add take shape func

* lint

* lint

* update

* fix flaky test

* add todo

committed 5 years ago


22 Aug, 2019 2 commits

[TOPI][Relay][TensorFlow] Add OneHot operator (#3781) · 554df211

* Add one-hot to Relay

* topi implementation

* Working

* add topi test

* Add TF test

* Fix check

* fix linting issues

* fix documentation

* Fix documentation

* Add support for on_value, off_value, axis, dtype

* Add full support for axis

* Fix compute and update test_forward

* Move on_value and off_value to inputs

* Add topi test

* Update tests

* Update docs

* Fix style

* re-enable tests

* Add one_hot to mxnet converter

committed 5 years ago


Changed topi cc resize to python implementation with new features. (#3788) · 7264cb6a
Josh Fromm committed 5 years ago


06 Aug, 2019 1 commit

[Relay] [TOPI] `{relay,topi}.nn.sparse_transpose` for **Square** CSR matrices (#3707) · 3b287c4d

* add build gcn tutorial

* add transpose operator for square sparse matrices

* remove extra files

* change loop tag

* comply with lint

* comply with lint -- line too long

* comply with lint

* lint check

* lint check

* lint check

* apply marisa and thierry's reviews
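
For readers unfamiliar with the operation named in this commit title, here is a minimal pure-Python sketch of transposing a square matrix stored in CSR form. It is not the TOPI implementation; the function name `csr_transpose` and its argument order are hypothetical, and only the standard CSR triple (`data`, `indices`, `indptr`) is assumed.

```python
def csr_transpose(data, indices, indptr, n):
    """Transpose an n x n CSR matrix (hypothetical helper, not the TVM API).

    Returns the (data, indices, indptr) triple of the transposed matrix.
    """
    nnz = len(data)

    # Count the entries in each column; column j of A becomes row j of A^T.
    t_indptr = [0] * (n + 1)
    for col in indices:
        t_indptr[col + 1] += 1
    # Prefix sum turns the counts into row start offsets.
    for i in range(n):
        t_indptr[i + 1] += t_indptr[i]

    t_data = [0] * nnz
    t_indices = [0] * nnz
    next_slot = t_indptr[:-1]  # slicing copies; next free slot per output row

    # Scatter every entry (row, col, value) to position (col, row).
    for row in range(n):
        for k in range(indptr[row], indptr[row + 1]):
            col = indices[k]
            dst = next_slot[col]
            t_data[dst] = data[k]
            t_indices[dst] = row
            next_slot[col] += 1

    return t_data, t_indices, t_indptr


# A = [[1, 0, 2],
#      [0, 0, 3],
#      [4, 5, 0]]
a_data, a_indices, a_indptr = [1, 2, 3, 4, 5], [0, 2, 2, 0, 1], [0, 2, 3, 5]
t_data, t_indices, t_indptr = csr_transpose(a_data, a_indices, a_indptr, 3)
```

Because rows are visited in increasing order, the column indices within each output row come out sorted, so transposing twice returns the original triple.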

committed 5 years ago


01 Aug, 2019 1 commit

Add support for Tensorflow operators log1p, cos, sin (#3614) · d72cdfa6

The patch adds support for the TensorFlow operators log1p, cos, and sin.
TensorFlow log1p is described at https://www.tensorflow.org/api_docs/python/tf/math/log1p
TensorFlow cos is described at https://www.tensorflow.org/api_docs/python/tf/math/cos
TensorFlow sin is described at https://www.tensorflow.org/api_docs/python/tf/math/sin

committed 5 years ago


31 Jul, 2019 1 commit
- [TOPI][CUDA] schedule for group_conv2d (#3663) · 11da1ca3
```
* [TOPI][CUDA] schedule for group_conv2d

* Fix #flops
```
  Wuwei Lin committed 5 years ago
30 Jul, 2019 1 commit

[TOPI] Fix traverse function not inline zero-input op (#3623) · 9d583cf5

* Fix traverse_inline not inline zero input op properly

* Add where to python and set tag to broadcast

* Fix inline

* test

* fix test target

* fix

committed 5 years ago


28 Jul, 2019 1 commit
- Hotfix for issue #3641. (#3644) · 026162ad
  Balint Cristian committed 5 years ago
  
26 Jul, 2019 1 commit
- [TOPI][CUDA] Schedule for pool_grad (#3622) · f1ede9a9
```
* [TOPI][CUDA] Schedule for pool_grad

* Relay test

* Fix fused op

* doc

* Remove set scope local
```
  Wuwei Lin committed 5 years ago
25 Jul, 2019 1 commit
- Add Winograd matrices computation. (#3553) · 97e333ca
  Balint Cristian committed 5 years ago
  
24 Jul, 2019 1 commit
- [TOPI][Relay] max_pool2d & avg_pool2d gradient (#3601) · 5c410037
  Wuwei Lin committed 5 years ago
  
23 Jul, 2019 3 commits

We observe multiple groups across a range of domains (ASR, NMT, LM, etc), (#3566) · d6dcd6c5

internally and externally, interested in replacing standard dense layers with
block-sparse matrix multiplication layers. The motivations are generally: higher
performance (due to reduction in FLOPs, memory bandwidth/cache footprint),
enabling larger models (e.g. fitting more layers in a given memory budget).

Some public work along these lines:

* https://openai.com/blog/block-sparse-gpu-kernels/
* https://openai.com/blog/sparse-transformer/
* https://arxiv.org/abs/1802.08435
* https://arxiv.org/abs/1711.02782

Various groups have been able to successfully train models with reasonable
levels of sparsity (90%+) with marginal accuracy changes, which suggests
substantial speedups are possible (as this implies a >10x reduction in FLOPs).

It is fairly straightforward to realize these theoretical speedups, see e.g. TVM
benchmarks for Intel CPUs in
https://gist.github.com/ajtulloch/e65f90487bceb8848128e8db582fe902, and CUDA
results in https://github.com/openai/blocksparse, etc.

* https://github.com/openai/blocksparse (CUDA)
* https://software.intel.com/en-us/mkl-developer-reference-c-mkl-bsrmm (MKL BSRMM)
* https://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.sparse.bsr_matrix.html (SciPy BSR representation)

This is extracted from a patch we've been using internally. Various extensions
are possible (int8/fp16/bf16, CUDA/other GPU architectures), but this is a
reasonable starting point. It needs more thorough unit test coverage,
however.

We follow the conventions established by scipy.sparse.bsr_matrix and other
libraries, see the unit tests for details.
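
As a rough illustration of that convention, the sketch below stores a weight matrix as the BSR triple (`data`, `indices`, `indptr`) used by scipy.sparse.bsr_matrix and multiplies a dense input by its transpose. It is a pure-Python sketch under stated assumptions, not the kernel added by this patch: the function name `bsr_dense` is hypothetical, and computing `x @ W.T` is assumed here because that is the shape of a dense layer.

```python
def bsr_dense(x, data, indices, indptr, n_out):
    """Compute y = x @ W.T for a BSR-encoded W (hypothetical helper).

    data[k]     : the k-th stored block, a bs_r x bs_c nested list
    indices[k]  : block-column index of the k-th stored block
    indptr[br]  : offset of the first stored block in block-row br
    """
    bs_r = len(data[0])
    bs_c = len(data[0][0])
    y = [[0.0] * n_out for _ in x]
    for m, row in enumerate(x):
        for br in range(len(indptr) - 1):
            # Only the stored (non-zero) blocks of this block-row are visited,
            # which is where the FLOP savings come from.
            for k in range(indptr[br], indptr[br + 1]):
                bc = indices[k]
                for r in range(bs_r):
                    acc = 0.0
                    for c in range(bs_c):
                        acc += data[k][r][c] * row[bc * bs_c + c]
                    y[m][br * bs_r + r] += acc
    return y


# W is 4 x 4 with 2 x 2 blocks; two of the four blocks are stored:
# W = [[0, 0, 1, 2],
#      [0, 0, 3, 4],
#      [5, 6, 0, 0],
#      [7, 8, 0, 0]]
w_data = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
w_indices = [1, 0]
w_indptr = [0, 1, 2]

y = bsr_dense([[1, 2, 3, 4]], w_data, w_indices, w_indptr, n_out=4)
```

With half of the blocks absent, the inner loops do half the multiply-adds of the dense product; at 90% block sparsity the same loop structure does roughly a tenth of them.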

For folks interested in experimenting with scheduling/AutoTVM etc,
https://gist.github.com/ajtulloch/e65f90487bceb8848128e8db582fe902 is a useful
starting point.

committed 5 years ago


{relay,topi}.reinterpret support (#3599) · 2ed31b24

= Motivation

It's useful to expose the tvm::reinterpret functionality to Relay/TOPI users, as
this allows them to build (fused) operators leveraging the bitwise
reinterpretation of an operand. An example is approximate transcendental
functions, which can be implemented along these lines:

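```python
from tvm import relay


def C(x):
    # Wrap a Python scalar as a float32 Relay constant.
    return relay.expr.const(x, "float32")


def approx_exp(x):
    # Clamp so the biased exponent computed below stays in range.
    x = relay.minimum(relay.maximum(x, C(-88.0)), C(88.0))
    # exp(x) = 2^(x * log2(e)); add the float32 exponent bias of 127.
    x = C(127.0) + x * C(1.44269504)
    xf = relay.floor(x)
    i = relay.cast(xf, "int32")
    # Fractional part in [0, 1); approximate 2^frac with a cubic polynomial.
    x = x - xf
    Y = C(0.99992522) + x * (C(0.69583354) + x * (C(0.22606716) + x * C(0.078024523)))
    # Shift the integer part into the exponent bits and reinterpret as float32.
    exponent = relay.left_shift(i, relay.expr.const(23, "int32"))
    exponent = relay.reinterpret(exponent, "float32")
    return exponent * Y


def approx_sigmoid(x):
    # <2.0e-5 absolute error over [-5, 5]
    y = approx_exp(x)
    return y / (y + C(1.0))


def approx_tanh(x):
    # <4.0e-5 absolute error over [-5, 5]
    x = x * C(2.0)
    y = approx_exp(x)
    return (y - C(1.0)) / (y + C(1.0))
```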

See unit tests for implementations of these approximate transcendentals.
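
The same trick can be tried outside TVM. The following is a minimal sketch using only the Python standard library, where `struct` stands in for `relay.reinterpret`; the helper name `reinterpret_int32_as_float32` is hypothetical, and the arithmetic runs in double precision rather than float32, so error figures will differ slightly from the Relay version.

```python
import math
import struct


def reinterpret_int32_as_float32(i):
    # Reuse the 32 bits of a signed integer as an IEEE-754 float32.
    return struct.unpack("<f", struct.pack("<i", i))[0]


def approx_exp(x):
    x = min(max(x, -88.0), 88.0)
    # exp(x) = 2^(x * log2(e)); 127 is the float32 exponent bias.
    x = 127.0 + x * 1.44269504
    xf = math.floor(x)
    frac = x - xf
    # Cubic approximation of 2^frac for frac in [0, 1).
    y = 0.99992522 + frac * (0.69583354 + frac * (0.22606716 + frac * 0.078024523))
    # Placing the integer part in bits 23..30 yields 2^(xf - 127).
    exponent = reinterpret_int32_as_float32(int(xf) << 23)
    return exponent * y


def approx_sigmoid(x):
    y = approx_exp(x)
    return y / (y + 1.0)
```

Shifting by 23 works because a float32 keeps its 8 exponent bits directly above the 23 mantissa bits, so an integer `e` placed there with a zero mantissa decodes as `2^(e - 127)`.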

committed 5 years ago


Checking the correct dtypes for choosing the Intel int8 instructions. (#3516) · 3ada7c0e
Animesh Jain committed 5 years ago


19 Jul, 2019 1 commit
- [TOPI][RELAY] Add op Size (#3094) · 313bc9de
  Yong Wu committed 5 years ago
  
03 Jul, 2019 1 commit
- Pre-allocate buffer for x86 roi_align (#3475) · 287078c3
```
* Pre-allocate buffer for x86 roi_align

* Fix typo
```
  Yao Wang committed 5 years ago
28 Jun, 2019 1 commit

[RELAY] [OP] [MXNet Frontend] Add sequence_mask (#3437) · 8ef22176

* Add sequence_mask

use exactly the same arguments as mxnet

fix

* fix lint

* fix lint

* add mxnet conversion + relay

* update

* update doc

* fix pylint

* fix doc

* address comment

* try to address comments

* try to enable shape check for valid_length

* fix

* try to fix

* fix bug

* try to fix

* address comment

* address comment

committed 5 years ago


14 Jun, 2019 1 commit
- [TEST][FLAKY] Fix flaky test on topk and quantize pass (#3362) · 2b045c56
```
* fix flaky test

* fix flaky quantize pass
```
  Haichen Shen committed 5 years ago
11 Jun, 2019 1 commit
- [Topi] Fast mode in take op (#3325) · 2c41fd2f
  hlu1 committed 5 years ago
  
10 Jun, 2019 1 commit
- Support x86 dilation conv2d and improve multi-batch conv2d (#3308) · d43aab07
```
* Support x86 dilation conv2d and improve multi-batch conv2d

* Fix lint
```
  Yao Wang committed 5 years ago
09 Jun, 2019 1 commit
- Improve non_max_suppression and get_valid_counts for CPU (#3305) · 98a91af9
```
* Improve non_max_suppression for CPU

* Improve get_valid_counts

* Minor change

* Skip some unnecessary computes
```
  Yao Wang committed 5 years ago
06 Jun, 2019 2 commits

Fix x86 depthwise conv2d alter_op_layout (#3264) · d7bc4fdd

* Fix x86 depthwise conv2d alter_op_layout

* Small fix

* Add test case

* Fix test

* Assert kernel layout

* Minor fix

* Add get_shape function

* Minor change

committed 5 years ago


Improve x86 roi align (#3296) · 9164809c
```
* Improve roi_align performance for x86

* Change test
```
Yao Wang committed 5 years ago

05 Jun, 2019 1 commit
- fast tanh (#3255) · 165aa0db
  hlu1 committed 5 years ago
  
04 Jun, 2019 1 commit

[Relay/TOPI][Op] Add TopK operator (#3256) · 072f8cc7

* init impl for topk

* Fix cpu for topk

* init cuda impl for topk

* Add cuda for topk

* fix

* Add doc

* update doc

* lint

* lint

* lint

* x

* fix warning

* [Relay] Add TopK in tf converter

* Add frontend converter

* fix

committed 5 years ago


28 May, 2019 1 commit
- [TOPI] Fix resize nearest with fractional scaling (#3244) · a8275bdb
  masahi committed 5 years ago
  
22 May, 2019 1 commit

Add packing for int8 1x1 convolution and support the int8 group convolution on X86 (#2991) · f7d7fdcd

* Support the 1x1 int8 conv with NHWC layout and weight packing

fix linter

* fix the memoize issue

* fix the failed nhwc test

* add the schedule for pack to unbreak other tests

* skip avx512 compile

* Support the 1x1 int8 conv with NHWC layout and weight packing

fix linter

* fix the memoize issue

* fix the failed nhwc test

* add the schedule for pack to unbreak other tests

* skip avx512 compile

* Unify the data_layout and kernel_layout relation

* add asf header

* fix the comment

* retrigger the build/test

committed 5 years ago


20 May, 2019 2 commits
- [Relay][TOPI] operator All (#3124) · 9fd8e3c5
```
* [Relay][TOPI] operator All

* Update tests/python/frontend/tensorflow/test_forward.py

Co-Authored-By: yongwww <55wuyong@163.com>

* fix comments

* change to level 4
```
  Yong Wu committed 5 years ago
- [BugFix] Fix bug in cast to bool (#3207) · d4fb0a2d
  Haichen Shen committed 5 years ago
  
17 May, 2019 1 commit
- [ARM] Fix concat (#3061) · 78a0f47b
  hlu1 committed 5 years ago
  
09 May, 2019 1 commit

[Relay][Op] Adaptive pooling (#3085) · 147ea3b0

* Add topi adaptive_pool

* Use adaptive_pool to compute global_pool

* Add relay adaptive pool2d

* Fix lint

* Fix typo

* Minor change

* Change support level to 10

* Add contrib

* Remove global pool schedule

* Add contrib module

* Fix lint

* Update doc

* Update doc

committed 5 years ago


08 May, 2019 1 commit
- [Bugfix][TOPI] conv2d_transpose bugfix (#3138) · 472c3146
```
* deconv tests

* deconv bug fixed for certain cases; tests added
```
  Leyuan Wang committed 5 years ago
29 Apr, 2019 1 commit

[Relay][TOPI] GluonCV SSD support on the GPU (#2784) · a706ad16

* ssd gluoncv gpu op updated

* ssd gluoncv gpu op updated

* tutorials and tests modified

* tutorials and tests modified

* fix lint

* fix lint

* address comment

* multibox bug fixed

* space line added

* use less threads per block

* use less threads per block

* less threads per block for get valid count

* less threads per block for get valid count

* merge with master

* Revert "less threads per block for get valid count"

This reverts commit 08896cfccc34b0b2a1646d01d01ea4cad73941c4.

* Revert "less threads per block for get valid count"

This reverts commit 08896cfccc34b0b2a1646d01d01ea4cad73941c4.

* typo fixed

* elem length made to a variable

* fix lint error

* fix lint error

* lint fixed

* bug fixed

* bug fixed

* lint fixed

* error fixed

* error fixed

* test ci

* test ci

* separate argsort to be an independent op

* separate argsort to be an independent op

* fix lint

* fix lint

* remove unsupported models

* typo fixed

* argsort added to relay

* solve conflicts with master

* fix lint

* fix lint

* test push

* Revert "test push"

This reverts commit 6db00883fab6cc06bddf564c926bb27c874397d8.

* fix lint error

* fix more lint

* cpu test_sort updated

* debug ci

* nms fixed

* expose argsort to relay frontend

* test ci

* fix lint

* sort register error fixed

* fix nnvm

* nms type fixed

* adaptive pooling added to relay

* Revert "adaptive pooling added to relay"

This reverts commit 1119f1f2c055753e0cc5611627597749134c5c8c.

* fix lint

* expose argsort op

* fix lint

* fix lint

* fix lint

* sort test updated

* sort bug fixed

* nnvm error fixed

* fix argsort default data type returned to be float instead of int

* fix lint

* fix lint

* test fixed

* fix valid count

* fix titanx bug

* tutorial add both targets

* titanx error fixed

* try to fix CI old gpu error

* try to solve CI GPU error

* get_valid_count added

* reverse get_valid_count

* get valid count optimized

* address comments

* fix ci error

* remove unnecessary block sync

* add back one sync

* address comments

* address more comments

* more comments

* move sort to be independent algorithm

* typo fixed

* more typos

* comments addressed

* doc updated

* fix pylint

* address final comments

* apache license added

committed 5 years ago


28 Apr, 2019 1 commit
- [TOPI] Fix group_conv2d unit test (#3113) · e22b5802
  Wuwei Lin committed 5 years ago
  