Commits · 588523ddb6e938637d96745bcce145375307247f · wenyuanbo / tic

25 Feb, 2020 5 commits

[LLVM] Fix build breaks from StringRef changes (#4923) · 588523dd
```
- llvm::StringRef to std::string conversion is explicit now.

Signed-off-by: Wei Pan <wpan11nv@nvidia.com>
```
wpan11nv committed Feb 25, 2020

[Relay][External Codegen] Support data types for CSourceModuleCodegen args and output (#4934) · a2429c1f

* Support int args and no extra buffers

* Fixes

* remove testing code

* fix style

* more style

* use const args

* style

Co-authored-by: Jon Soifer <jonso@microsoft.com>

committed Feb 24, 2020


[Relay] Add a PyTorch to Relay Parser (#4497) · 87c20bb2

* Add a PyTorch to Relay parser

* Add alexnet, googlenet, mnasnet, shufflenet wip

* Fix lint

* Remove fix for shufflenet

* Lower check

* Pull changes from neo-ai/tvm changes

* Remove commented out section

* Use infer_shape everywhere

* Change back to using trace instead of path in from_pytorch

* Parse state_dict to add param names

* Umbrella single_op under test_forwards

* Remove print and cleanup call

* Check if update to test broke CI

* Retrigger CI

* Add back in updated tests

* Try splitting up tests

* First pass at flexible typing, implemented for ones

* Add int32 for all ops

* Remove print statements

* Fix lint

* Broad except

* Add other tensor types

* Temporarily use old tests

* Retrigger CI

* Lower type names

* Use numpy to convert in dense op

* Fix lint

* Remove print

* Need to cleanup but verify int32 works for add

* Rough tests for different types, a lot of types are not supported on CPU

* Probably doesn't build, need to save work as I have to switch branches (constantly)

* Parse param type

* Remove print stmt in parser

* Clean up some code

* Working on float32 for bn

* Add resnet18 double type

* Fix lint

* Temporarily move PT tests first

* Temporarily add back refactored tests to fix mem issue

* Add more type test and temp remove some tests

* Comment out tests, hopefully CI prints a trace

* Get stack trace

* Remove operator dict key, rename op_name to node_id, remove dead code

* Make relay map a list

* Remove some hacky string stuff

* Move to PyTorch 1.4

* Remove input_type as param

* Remove _get_fill_value, fix full ops

* Remove unused code and combine ops for identity and none

* Remove fn_param

* Clean up main loop

* Remove useless if/else for outputs

* Remove ir_names, only used once

* Remove some string hacking

* Remove string parsing to get output name

* Fix bug with output sizes of nodes

* Use attributeNames in parse ops

* Remove continue and add_op in parse_op

* Do this everywhere, use assert instead of explicitly type casting

* Remove unnecessary swap

* Slight refactor for elemwise input parse

* Use a copy of graph everywhere

* Rename nid_to_node_name

* Refactor parse import prereqs

* Clean up input node kind check

* Clean up conditionals

* Clean up add_op

* Cleanup type for ones and zeros op

* Fix lint

* Add torch install to CI

* Actually use torch

* Try moving import torch to only where it's needed

* Import torch for CI

* Use take op for select

* Temporarily add ignore for jit inline pass for CI

* Use CompleteTensorType, might be a PT 1.2 only thing

* Use different types in elemwise op

* Use float16 ones

* Fix float16 test

* Remove the temp docker changes

* Remove temp test

* Temporarily comment out original tests

* Remove file

* Empty cache after each test

* Add some prints and lower input sizes

* Try using no grad

* Trying to globally set grad off

* Use no grad for torchvision

* Remove xfail tests

* Remove VGG and AlexNet due to some issues

* Combine pooling tests

* Remove extra test file

* Remove single op, remove larger pooling tests

* Remove maxpool3

* Remove debug prints

* Remove inference call and add no_grad in measure latency

* Use standard string start char

* Remove redundant infer_shape in slice

* Convert most to checks to just expr

* Remove extra paren

* More refactor of isinstance

* Add helper for creating typed constants

* Assert instead of return when no matching type

* Remove network variants

* Add no_grad when forward, remove detach, fix lint

* Change isinstance to expr in transpose

* Use opnotimplemented, refactor

* Fix full ops, remove duplicate tests

* Never use shape field unless we know the type

* Remove comma, retrigger CI

* Add paren, retrigger CI

* Use inline if-else for flags

* Throw exception instead of assert

* Remove version check for CI

* Check version when doing inline pass

* Fix lint

* Lower more input sizes

* Add new line, conv2d only accepts weight as expr

* Use tvm.runtime.ndarray

* Remove change to torch version install

* Try no grad for mobilenet

* Fix lint

* Fix lint again

* Revert to last passing

* Delete test files

* Ignore lint

* Revert back

* Comment out mobilenet

* Clean up compare compiled and baseline outputs

* Use IRModule

* Add todos

* Refactor use_bias

* Add todo for fix conv op channels

* Change input to data type

* Remove todo

* Handle channel multiplier > 1

committed Feb 24, 2020


Use opencv resize method for preprocessing of image in darknet (#4883) · 81d11240

* Use opencv resize method for preprocessing of image in darknet

* Use opencv resize method for preprocessing of image in darknet

* Fix pylint issues

committed Feb 24, 2020


[FRONTEND][KERAS]GaussianDropout/Noise parsing support (#4928) · 13cf1da3
```
GaussianDropout & GaussianNoise are active only during training time. This can be skipped during inference.
```
Samuel committed Feb 25, 2020
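
A minimal sketch of why these two layers can be skipped at inference time, written in plain Python rather than against the Keras or Relay APIs. The function names, the `training` flag, and the `rng` parameter are hypothetical and chosen for illustration only; the standard deviation used for the dropout variant follows the Keras documentation for `GaussianDropout`.

```python
import random


def gaussian_noise(values, stddev, training, rng=None):
    """Add zero-mean Gaussian noise while training; pass through otherwise."""
    if not training:
        return list(values)  # inference: identity
    rng = rng or random.Random()
    return [v + rng.gauss(0.0, stddev) for v in values]


def gaussian_dropout(values, rate, training, rng=None):
    """Multiply by 1-centered Gaussian noise while training; pass through otherwise."""
    if not training:
        return list(values)  # inference: identity
    rng = rng or random.Random()
    stddev = (rate / (1.0 - rate)) ** 0.5
    return [v * rng.gauss(1.0, stddev) for v in values]
```

With `training=False` both functions return their input unchanged, which is why a frontend that imports a model for inference can treat them as an identity.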

24 Feb, 2020 1 commit

[Relay][AutoTVM] Relay op strategy (#4644) · 623dd208

* relay op strategy

fix lint

bitpack strategy

bitserial_dense (#6)

* update strategy

* address comments

fix a few topi test

Dense strategy (#5)

* dense

* add bifrost; remove comments

* address comment

Refactor x86 conv2d_NCHWc (#4)

* Refactor x86 conv2d

* Add x86 depthwise_conv2d_NCHWc

* Add back topi x86 conv2d_nchw

* Merge x86 conv2d_nchw and conv2d_NCHWc

* Minor fix for x86 conv2d

fix more strategy

Add x86 conv2d_NCHWc_int8 strategy (#8)

* Add x86 conv2d_NCHWc_int8 strategy

* Remove contrib_conv2d_nchwc_int8

* Fix generic conv2d_NCHWc for int8

* Fix topi arm_cpu conv2d_NCHWc_int8

update x86 conv2d

enable specify relay ops to be tuned for autotvm

add cuda conv2d strategy

add conv2d strategy for rocm

add conv2d strategy for hls

add conv2d strategy for arm cpu

add conv2d strategy for mali

add conv2d strategy for bifrost

add conv2d strategy for intel graphics

clean up and fix lint

remove template keys from autotvm

remove 2 in the func name

address comments

fix

* fix bugs

* lint

* address comments

* add name to op implement

* Modify topi tests (#9)

* Add pooling, reorg, softmax and vision

* Add lrn

* fix topi test

* fix more topi test

* lint

* address comments

* x

* fix more tests & bugs

* Modify more tests (#10)

* Modify tests for bitserial_conv2d, bitserial_dense, bitserial_conv2d_rasp and bnn

* Minor fix

* More minor fix

* fix more test

* try to update vta using strategy

* fix cpptest

* x

* fix rebase err

* Fix two tests (#11)

* change autotvm log format

* lint

* minor fix

* try fix vta test

* fix rebase err

* tweak

* tmp hack for vta pass

* fix tutorial

* fix

* fix more tutorials

* fix vta tutorial

* minor

* address comments

* fix

* address comments

* fix cpptest

* fix docs

* change data structure name and api

* address comments

* lint

* fix rebase err

* updates

* fix winograd test

* fix doc

* rebase

* upgrade tophub version number

* fix bug

* re-enable vta tsim test after tophub is upgraded

* fix vta test to use the correct args so the config can be found in tophub

Co-authored-by: Yao Wang <kevinthesunwy@gmail.com>

committed Feb 24, 2020


21 Feb, 2020 5 commits

[Fix] Fix get_valid_count flaky test for cuda (#4901) · c4c61cb7

* get_valid_count accuracy issue fixed for individual tests but not for all tests running together

* minor fix

* initialize valid_count and PrefixSum buffers

* test updated

* update relay test as well

* update document

* fix lint

* address comment

* fix lint

* correct atomicAdd identifier name

committed Feb 21, 2020


[TEST][FLAKY] topi/tests/python/test_topi_sort.py::test_argsort (#4891) · 8290eaba

* [TEST][FLAKY] topi/tests/python/test_topi_sort.py::test_argsort

* update test function of argsort like topk

* Shuffle index and get data from shuffled index

* Replace the random.uniform with np.arange

committed Feb 21, 2020

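
The commit message above does not say why the random input was replaced. A plausible reading, stated here as an assumption, is that random floating-point data can contain ties, and ties make the expected index order ambiguous. The sketch below shows the shuffled-range idea in plain Python; `make_argsort_input` and `argsort` are hypothetical helpers, not the functions used in the TVM test.

```python
import random


def make_argsort_input(n, seed=0):
    """Return the integers 0..n-1 in shuffled order, so every value is unique."""
    values = list(range(n))
    random.Random(seed).shuffle(values)
    return values


def argsort(values):
    """Reference argsort: indices that would sort `values` in ascending order."""
    return sorted(range(len(values)), key=values.__getitem__)
```

Because every value is distinct, there is exactly one correct index order, so a reference result and a backend result can be compared element by element.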

[COMMUNITY] @anijain2305 -> Committer (#4921) · f47c38db
Tianqi Chen committed Feb 21, 2020


Fix tests for tflite unary elemwise operations (#4913) · 0e189f01

* add TFLite version check for 'ceil' and 'cos'
* fix name check of test_op for positive inputs
* add error message for operator not found in the installed fbs schema

committed Feb 21, 2020


[CODEGEN] Support cuda tensorcore subbyte int data type in auto tensorcore (#4546) · f23ac969

* support cuda tensorcore subbyte int data type in auto tensorcore

* add license

* pass cpplint

* fix code review comments

* merge the int4/int1 codegen tutorial into the existing auto tensorcore tutorial

* using master's new API

* disable tuning when cuda is not enabled

* address cr comment

* do not run the tuning

* fix test failure

* fix cpplint error

* fix bool type reduction bug

* 1. fix an index bug 2. fix returned bytes value of int1/int4/uint4

* fix typo

committed Feb 20, 2020


20 Feb, 2020 3 commits
- [DOCS] Fix Sphinx Warnings (RST indent, cross-ref, and image scale) (#4920) · 98e7709f
```
* fix indents

* Fix image scale and cross-ref
```
  Cody Yu committed Feb 20, 2020
- [Relay] Fix an assertion exposed by loop vectorizer (#4916) · efd35e86
```
- Allows uniform conditions for select expressions (the same as halide)
  exposed by the loop vectorizer.

Signed-off-by: Wei Pan <weip@nvidia.com>
```
  wpan11nv committed Feb 20, 2020
- [DOCS] Fix sphinx warnings (#4917) · fd6d7837
```
* Fix Python docstrings

* More fixes

* Fix lint
```
  Cody Yu committed Feb 20, 2020
19 Feb, 2020 3 commits
- [REFACTOR] Polish ffi convention. (#4912) · 18295b27
```
* [REFACTOR] Polish ffi convention.

- Remove the src/api, keep registration local to the c++ function.
- Remove the api_internal as it is no longer needed.

* Update the codebase walk through
```
  Tianqi Chen committed Feb 19, 2020
- [RELAY][FRONTEND][TF] Fix FuseBatchNorm output cast error if need_cast is True (#4894) · fccf2268
  hcyang committed Feb 18, 2020
  
- Fix tvm.target.generic_func runtime detection (#4910) · 406b5f76
  Andrew committed Feb 18, 2020
  
18 Feb, 2020 8 commits
- [DOCS] Update API docs to reflect the status after the refactor. (#4907) · d2ae8c95
  Tianqi Chen committed Feb 18, 2020
  
- [Relay] Expose FunctionGetAttr to Python (#4905) · 41835d17
```
* [Relay] Expose FunctionGetAttr to Python

* add test

Co-authored-by: Jon Soifer <jonso@microsoft.com>
```
  Jon Soifer committed Feb 18, 2020
- [Relay][Frontend][Keras] NHWC import support. (#4899) · 9d646543
```
* Basic test working

* Almost all tests working.

* all tests passing.

* Fixed lint.

* Improved Style.
```
  Josh Fromm committed Feb 18, 2020
- [REFACTOR][PY] Establish tvm.arith (#4904) · d1e1ac49
  Tianqi Chen committed Feb 18, 2020
  
- [CI] Add autodocsumm as dep (#4902) · 38d1dd24
  Tianqi Chen committed Feb 17, 2020
  
- [CI] Update ci docker to add autodocsumm (#4903) · 8310b252
  Tianqi Chen committed Feb 17, 2020
  
- Fixed bugs that occurred when using bitwise operators on floating point type… · 976c08ad
```
Fixed bugs that occurred when using bitwise operators on floating-point type expressions. Also fixed a crash when using the operators <<, >>, and %. Added regression tests for both types of bug. (#4892)
```
  pankratz committed Feb 17, 2020
- [REFACTOR][PY] Establish tvm.te and tvm.driver (#4900) · 08338dd5
```
- Move the related files to tvm.te
- Move build_module.py to tvm.driver
```
  Tianqi Chen committed Feb 17, 2020
17 Feb, 2020 5 commits
- [Relay][Pass] Fix bug in re-processing call node in MergeComposite pass (#4879) · 27a02844
```
* Fix bug in re-processing call node

* Add test

* Add to main

* temp changes to work from another machine

* fix rest of tests

* fix test_reuse_call_merge

* fix merge

Co-authored-by: Jon Soifer <jonso@microsoft.com>
```
  Jon Soifer committed Feb 17, 2020
- [DOCS] Introduce how to add hardware backend to FAQ (#4898) · 0b2d11a5
  Tianqi Chen committed Feb 17, 2020
  
- Fast exponent (#4790) · 13140916
  Alex Gladkov committed Feb 17, 2020
  
- Update faq.md (#4893) · a43e326f
```
various minor editorial updates - style, grammar, typos.
```
  Baden Hughes committed Feb 16, 2020
- Fix alpha_equal bug (#4897) · 95de08ba
  Zhi committed Feb 16, 2020
  
16 Feb, 2020 3 commits

[CI] Cleanup logfile before tutorial runs (#4896) · e7be8bf4
Tianqi Chen committed Feb 16, 2020

[Relay] Fix VM compiler for while loop with free vars (#4889) · 529ee1fe
```
* add additional switch to handle nested call node

* Fix VM compiler for while loop with free var
```
masahi committed Feb 15, 2020

[CodeGen][CUDA] Fix issues in cuda codegen (#4876) · d50ba721

- Do not emit __shared__ etc. as part of type for casting

- Fix fp16 reduction kernels with compiler errors:

  "no operator "+" matches these operands, volatile half + volatile half"

  This patch inserts casts to remove volatile type qualifier following
  volatile loads (fp16 only). CUDA fp16 library headers should add
  volatile member functions.

- Update have_fp16 to include compute 6.1 GPUs, which do support fp16,
  although their fp16 throughput is low. Updated tests.

Signed-off-by: Wei Pan <weip@nvidia.com>

committed Feb 15, 2020

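
The last point of the commit above, treating compute 6.1 GPUs as fp16-capable, can be illustrated with a small capability check. This is a sketch under stated assumptions and not TVM's `have_fp16` implementation: the function name `supports_fp16` is hypothetical, and the 5.3 threshold comes from the CUDA programming guide, which lists half-precision arithmetic for compute capability 5.3 and above.

```python
def supports_fp16(compute_version):
    """Return True if a compute capability string such as "6.1" has fp16 arithmetic.

    Assumption: fp16 arithmetic is available from compute capability 5.3 upward,
    which includes 6.1 even though its fp16 throughput is low.
    """
    major, minor = (int(part) for part in compute_version.split("."))
    return (major, minor) >= (5, 3)
```

A check written this way reports support, not performance, so a caller that cares about throughput on 6.1 devices would need a separate heuristic.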

15 Feb, 2020 3 commits
- improve antlr import error message (#4888) · 7e9ec735
  masahi committed Feb 15, 2020
  
- [AutoTVM] Support range in index based tuners (#4870) · feda150e
```
* Support range in index based tuners

* Address comments

* Remove __*state__

* trigger CI
```
  Cody Yu committed Feb 14, 2020
- [QNN] Add support for per channel weight scale in dense op (#4880) · a5e54b1d
```
* add test case for per channel dense

* add unit arg in tflite frontend

* update qnn legalize test

* fix output dim index
```
  masahi committed Feb 15, 2020
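
To illustrate what a per-channel weight scale means for a dense op, here is a plain-Python sketch, not the QNN implementation. It assumes symmetric quantization (zero points of 0 for input and weight), and the function and parameter names are hypothetical.

```python
def dense_per_channel(x_q, x_scale, w_q, w_scales):
    """Dequantized dense output with one weight scale per output channel.

    x_q:      quantized input vector (integers)
    x_scale:  single scale for the input
    w_q:      quantized weight matrix, one row per output channel
    w_scales: one scale per output channel (same length as w_q)
    """
    if len(w_q) != len(w_scales):
        raise ValueError("need exactly one scale per output channel")
    out = []
    for row, w_scale in zip(w_q, w_scales):
        acc = sum(x * w for x, w in zip(x_q, row))  # integer accumulation
        out.append(acc * x_scale * w_scale)  # scale differs per channel
    return out
```

With a per-tensor scale, every output would be multiplied by the same `x_scale * w_scale`; here each output channel uses its own product.
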
14 Feb, 2020 3 commits

[QNN] More doc fix on quantize and convolution (#4874) · 24c53a34
```
* [QNN] Doc fix on quantize and convolution

* update test
```
masahi committed Feb 13, 2020

[TOPI][CUDA] Enable vectorization on fp16 type (#4867) · 7013fc9a

- This allows better utilization of the memory bandwidth

- Note that not all cases are vectorized for fp16 datatype. For
  instance, when the size is not a multiple of 1024, the inner loop
  may be an expression that cannot be vectorized. In this case, a
  small inner loop is still beneficial for latency hiding.

Signed-off-by: Wei Pan <weip@nvidia.com>

committed Feb 13, 2020


[REFACTOR][PY] Establish tvm.tir · b787ffa3

- Move related files into the corresponding location as in C++
- Keep the top-level TVM API backward compatible to make minimum changes in topi

committed Feb 13, 2020


13 Feb, 2020 1 commit
- Update docs/dev/virtual_machine.rst · a6c42b34
```
Co-Authored-By: Wei Chen <ipondering.weic@gmail.com>
```
  Zhi committed Feb 13, 2020