Commits · 81d1124055ddbf46ee9c2daa89c03a294548c37e · wenyuanbo / tic

25 Feb, 2020 2 commits

Use opencv resize method for preprocessing of image in darknet (#4883) · 81d11240

* Use opencv resize method for preprocessing of image in darknet

* Use opencv resize method for preprocessing of image in darknet

* Fix pylint issues

committed Feb 24, 2020


[FRONTEND][KERAS]GaussianDropout/Noise parsing support (#4928) · 13cf1da3
```
GaussianDropout & GaussianNoise are active only during training time. This can be skipped during inference.
```
Samuel committed Feb 25, 2020
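
As a rough illustration of why a frontend can skip these layers at inference time, here is a minimal pure-Python sketch. It is not the TVM Keras frontend code: the function names and the `training` flag are hypothetical. It assumes the usual Keras definitions, where GaussianNoise adds zero-mean noise and GaussianDropout multiplies by noise centered at 1, both only during training.

```python
import math
import random

def gaussian_noise(xs, stddev, training=False, rng=random):
    """Additive zero-mean Gaussian noise; identity when not training."""
    if not training:
        return list(xs)
    return [x + rng.gauss(0.0, stddev) for x in xs]

def gaussian_dropout(xs, rate, training=False, rng=random):
    """Multiplicative Gaussian noise centered at 1; identity when not training."""
    if not training:
        return list(xs)
    # Assumed Keras convention: stddev = sqrt(rate / (1 - rate)).
    stddev = math.sqrt(rate / (1.0 - rate))
    return [x * rng.gauss(1.0, stddev) for x in xs]
```

With `training=False` both functions return their input unchanged, so an importer that targets inference can map either layer to a pass-through.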

24 Feb, 2020 1 commit

[Relay][AutoTVM] Relay op strategy (#4644) · 623dd208

* relay op strategy

fix lint

bitpack strategy

bitserial_dense (#6)

* update strategy

* address comments

fix a few topi test

Dense strategy (#5)

* dense

* add bifrost; remove comments

* address comment

Refactor x86 conv2d_NCHWc (#4)

* Refactor x86 conv2d

* Add x86 depthwise_conv2d_NCHWc

* Add back topi x86 conv2d_nchw

* Merge x86 conv2d_nchw and conv2d_NCHWc

* Minor fix for x86 conv2d

fix more strategy

Add x86 conv2d_NCHWc_int8 strategy (#8)

* Add x86 conv2d_NCHWc_int8 strategy

* Remove contrib_conv2d_nchwc_int8

* Fix generic conv2d_NCHWc for int8

* Fix topi arm_cpu conv2d_NCHWc_int8

update x86 conv2d

enable specify relay ops to be tuned for autotvm

add cuda conv2d strategy

add conv2d strategy for rocm

add conv2d strategy for hls

add conv2d strategy for arm cpu

add conv2d strategy for mali

add conv2d strategy for bifrost

add conv2d strategy for intel graphics

clean up and fix lint

remove template keys from autotvm

remove 2 in the func name

address comments

fix

* fix bugs

* lint

* address comments

* add name to op implement

* Modify topi tests (#9)

* Add pooling, reorg, softmax and vision

* Add lrn

* fix topi test

* fix more topi test

* lint

* address comments

* x

* fix more tests & bugs

* Modify more tests (#10)

* Modify tests for bitserial_conv2d, bitserial_dense, bitserial_conv2d_rasp and bnn

* Minor fix

* More minor fix

* fix more test

* try to update vta using strategy

* fix cpptest

* x

* fix rebase err

* Fix two tests (#11)

* change autotvm log format

* lint

* minor fix

* try fix vta test

* fix rebase err

* tweak

* tmp hack for vta pass

* fix tutorial

* fix

* fix more tutorials

* fix vta tutorial

* minor

* address comments

* fix

* address comments

* fix cpptest

* fix docs

* change data structure name and api

* address comments

* lint

* fix rebase err

* updates

* fix winograd test

* fix doc

* rebase

* upgrade tophub version number

* fix bug

* re-enable vta tsim test after tophub is upgraded

* fix vta test to use the correct args so the config can be found in tophub

Co-authored-by: Yao Wang <kevinthesunwy@gmail.com>

committed Feb 24, 2020


21 Feb, 2020 5 commits

[Fix] Fix get_valid_count flaky test for cuda (#4901) · c4c61cb7

* get_valid_count accuracy issue fixed for individual tests but not for all tests running together

* minor fix

* initialize valid_count and PrefixSum buffers

* test updated

* update relay test as well

* update document

* fix lint

* address comment

* fix lint

* correct atomicAdd identifier name

committed Feb 21, 2020


[TEST][FLAKY] topi/tests/python/test_topi_sort.py::test_argsort (#4891) · 8290eaba

* [TEST][FLAKY] topi/tests/python/test_topi_sort.py::test_argsort

* update test function of argsort like topk

* Shuffle index and get data from shuffled index

* Replace the random.uniform with np.arange

committed Feb 21, 2020
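
One plausible reading of the last two bullets is that the test input is built from distinct values in shuffled order, so the expected argsort result is unique. The following plain-Python sketch illustrates that idea only. It is not the actual test code, and `make_unique_data` and `argsort` are hypothetical helpers.

```python
import random

def make_unique_data(n, seed=0):
    """Return the values 0..n-1 in shuffled order, so every value is distinct."""
    data = list(range(n))
    random.Random(seed).shuffle(data)
    return data

def argsort(data, descending=False):
    """Indices that would sort data; unambiguous when all values are distinct."""
    return sorted(range(len(data)), key=data.__getitem__, reverse=descending)

data = make_unique_data(8)
order = argsort(data)
# Gathering through the sorted indices recovers 0..n-1 exactly.
assert [data[i] for i in order] == list(range(8))
```

Because no two inputs are equal, any correct argsort implementation must return the same index order, which removes one source of nondeterminism from the comparison.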


[COMMUNITY] @anijain2305 -> Committer (#4921) · f47c38db
Tianqi Chen committed Feb 21, 2020


Fix tests for tflite unary elemwise operations (#4913) · 0e189f01

* add TFLite version check for 'ceil' and 'cos'
* fix name check of test_op for positive inputs
* add error message for operator not found in the installed fbs schema

committed Feb 21, 2020


[CODEGEN] Support cuda tensorcore subbyte int data type in auto tensorcore (#4546) · f23ac969

* support cuda tensorcore subbyte int data type in auto tensorcore

* add license

* pass cpplint

* fix code review comments

* merge the int4/int1 codegen tutorial into the existing auto tensorcore tutorial

* using master's new API

* disable tuning when cuda is not enabled

* address cr comment

* do not run the tuning

* fix test failure

* fix cpplint error

* fix bool type reduction bug

* 1. fix an index bug 2. fix returned bytes value of int1/int4/uint4

* fix typo

committed Feb 20, 2020


20 Feb, 2020 3 commits
- [DOCS] Fix Sphinx Warnings (RST indent, cross-ref, and image scale) (#4920) · 98e7709f
```
* fix indents

* Fix image scale and cross-ref
```
  Cody Yu committed Feb 20, 2020
- [Relay] Fix an assertion exposed by loop vectorizer (#4916) · efd35e86
```
- Allows uniform conditions for select expressions (the same as halide)
  exposed by the loop vectorizer.

Signed-off-by: Wei Pan <weip@nvidia.com>
```
  wpan11nv committed Feb 20, 2020
- [DOCS] Fix sphinx warnings (#4917) · fd6d7837
```
* Fix Python docstrings

* More fixes

* Fix lint
```
  Cody Yu committed Feb 20, 2020
19 Feb, 2020 3 commits
- [REFACTOR] Polish ffi convention. (#4912) · 18295b27
```
* [REFACTOR] Polish ffi convention.

- Remove the src/api, keep registration local to the c++ function.
- Remove the api_internal as it is no longer needed.

* Update the codebase walk through
```
  Tianqi Chen committed Feb 19, 2020
- [RELAY][FRONTEND][TF] Fix FuseBatchNorm output cast error if need_cast is True (#4894) · fccf2268
  hcyang committed Feb 18, 2020
  
- Fix tvm.target.generic_func runtime detection (#4910) · 406b5f76
  Andrew committed Feb 18, 2020
  
18 Feb, 2020 8 commits
- [DOCS] Update API docs to reflect the status after the refactor. (#4907) · d2ae8c95
  Tianqi Chen committed Feb 18, 2020
  
- [Relay] Expose FunctionGetAttr to Python (#4905) · 41835d17
```
* [Relay] Expose FunctionGetAttr to Python

* add test

Co-authored-by: Jon Soifer <jonso@microsoft.com>
```
  Jon Soifer committed Feb 18, 2020
- [Relay][Frontend][Keras] NHWC import support. (#4899) · 9d646543
```
* Basic test working

* Almost all tests working.

* all tests passing.

* Fixed lint.

* Improved Style.
```
  Josh Fromm committed Feb 18, 2020
- [REFACTOR][PY] Establish tvm.arith (#4904) · d1e1ac49
  Tianqi Chen committed Feb 18, 2020
  
- [CI] Add autodocsumm as dep (#4902) · 38d1dd24
  Tianqi Chen committed Feb 17, 2020
  
- [CI] Update ci docker to add autodocsumm (#4903) · 8310b252
  Tianqi Chen committed Feb 17, 2020
  
- Fixed bugs that occurred when using bitwise operators on floating point type… · 976c08ad
```
Fixed bugs that occurred when using bitwise operators on floating point type expressions, plus a further crash when using the ops <<, >>, %. Added regression tests for both types of bug. (#4892)
```
  pankratz committed Feb 17, 2020
- [REFACTOR][PY] Establish tvm.te and tvm.driver (#4900) · 08338dd5
```
- Move the related files to tvm.te
- Move build_module.py to tvm.driver
```
  Tianqi Chen committed Feb 17, 2020
17 Feb, 2020 5 commits
- [Relay][Pass] Fix bug in re-processing call node in MergeComposite pass (#4879) · 27a02844
```
* Fix bug in re-processing call node

* Add test

* Add to main

* temp changes to work from another machine

* fix rest of tests

* fix test_reuse_call_merge

* fix merge

Co-authored-by: Jon Soifer <jonso@microsoft.com>
```
  Jon Soifer committed Feb 17, 2020
- [DOCS] Introduce how to add hardware backend to FAQ (#4898) · 0b2d11a5
  Tianqi Chen committed Feb 17, 2020
  
- Fast exponent (#4790) · 13140916
  Alex Gladkov committed Feb 17, 2020
  
- Update faq.md (#4893) · a43e326f
```
various minor editorial updates - style, grammar, typos.
```
  Baden Hughes committed Feb 16, 2020
- Fix alpha_equal bug (#4897) · 95de08ba
  Zhi committed Feb 16, 2020
  
16 Feb, 2020 3 commits

[CI] Cleanup logfile before tutorial runs (#4896) · e7be8bf4
Tianqi Chen committed Feb 16, 2020

[Relay] Fix VM compiler for while loop with free vars (#4889) · 529ee1fe
```
* add additional switch to handle nested call node

* Fix VM compiler for while loop with free var
```
masahi committed Feb 15, 2020

[CodeGen][CUDA] Fix issues in cuda codegen (#4876) · d50ba721

- Do not emit __shared__ etc. as part of type for casting

- Fix fp16 reduction kernels with compiler errors:

  "no operator "+" matches these operands, volatile half + volatile half"

  This patch inserts casts to remove volatile type qualifier following
  volatile loads (fp16 only). CUDA fp16 library headers should add
  volatile member functions.

- Update have_fp16 to include compute 6.1 GPUs, which do support fp16,
  although their fp16 throughput is low. Updated tests.

Signed-off-by: Wei Pan <weip@nvidia.com>

committed Feb 15, 2020
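
The `have_fp16` change in the last bullet can be sketched as a simple capability check. This is an illustrative reconstruction, not the code from the patch: the helper `parse_compute_version` and the exact rule are assumptions, based on fp16 arithmetic being available on compute capability 5.3 and on 6.0 and later (so 6.1 is included).

```python
def parse_compute_version(compute_version):
    """Split a string such as "6.1" into (major, minor) integers."""
    major, minor = compute_version.split(".")
    return int(major), int(minor)

def have_fp16(compute_version):
    """Return True if the given compute capability supports fp16 arithmetic."""
    major, minor = parse_compute_version(compute_version)
    # Assumption: 5.3 and every 6.x or later device qualify, including 6.1,
    # even though fp16 throughput on 6.1 is low.
    return (major, minor) == (5, 3) or major >= 6
```

Under this rule a test guarded by `have_fp16("6.1")` would run on compute 6.1 GPUs instead of being skipped.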


15 Feb, 2020 3 commits
- improve antlr import error message (#4888) · 7e9ec735
  masahi committed Feb 15, 2020
  
- [AutoTVM] Support range in index based tuners (#4870) · feda150e
```
* Support range in index based tuners

* Address comments

* Remove __*state__

* trigger CI
```
  Cody Yu committed Feb 14, 2020
- [QNN] Add support for per channel weight scale in dense op (#4880) · a5e54b1d
```
* add test case for per channel dense

* add unit arg in tflite frontend

* update qnn legalize test

* fix output dim index
```
  masahi committed Feb 15, 2020
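
To make "per channel weight scale" concrete, here is a minimal pure-Python sketch of a quantized dense layer whose weights carry one scale per output unit. It is not the QNN implementation: the function name, argument names, and the single shared zero point per tensor are assumptions chosen for illustration.

```python
def dequantized_dense(x_q, w_q, x_scale, w_scales, x_zp=0, w_zp=0):
    """Quantized dense with one weight scale per output unit.

    x_q: [batch][in_dim] ints, w_q: [units][in_dim] ints,
    w_scales: one float per output unit.
    """
    if len(w_scales) != len(w_q):
        raise ValueError("need one weight scale per output unit")
    out = []
    for row in x_q:
        out_row = []
        for unit, w_row in enumerate(w_q):
            # Integer accumulation over the input dimension.
            acc = sum((x - x_zp) * (w - w_zp) for x, w in zip(row, w_row))
            # Each output unit is rescaled with its own weight scale.
            out_row.append(acc * x_scale * w_scales[unit])
        out.append(out_row)
    return out
```

The only difference from the per-tensor case is that `w_scales` is indexed by the output unit rather than being a single scalar.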
14 Feb, 2020 3 commits

[QNN] More doc fix on quantize and convolution (#4874) · 24c53a34
```
* [QNN] Doc fix on quantize and convolution

* update test
```
masahi committed Feb 13, 2020

[TOPI][CUDA] Enable vectorization on fp16 type (#4867) · 7013fc9a

- This allows better utilization of the memory bandwidth

- Note that not all cases are vectorized for fp16 datatype. For
  instance, when the size is not a multiple of 1024, the inner loop
  may be an expression that cannot be vectorized. In this case, a
  small inner loop is still beneficial for latency hiding.

Signed-off-by: Wei Pan <weip@nvidia.com>

committed Feb 13, 2020


[REFACTOR][PY] Establish tvm.tir · b787ffa3

- Move related files into the corresponding location as in C++
- Keep the top-level TVM API backward compatible to make minimum changes in topi

committed Feb 13, 2020


13 Feb, 2020 4 commits
- Update docs/dev/virtual_machine.rst · a6c42b34
```
Co-Authored-By: Wei Chen <ipondering.weic@gmail.com>
```
  Zhi committed Feb 13, 2020
- Update docs/dev/virtual_machine.rst · 243071ad
```
Co-Authored-By: Wei Chen <ipondering.weic@gmail.com>
```
  Zhi committed Feb 13, 2020
- fix vm doc · c8e17dd2
  Zhi Chen committed Feb 13, 2020
  
- Optimize x86 conv3d_ndhwc using data packing approach. (#4866) · 8d945872
```
Add tuneable conv3d_ndhwc schedule
```
  Alex Gladkov committed Feb 12, 2020