Commits · 3e8c7beb9248606fe6767ef455adb2ccb8f9368b · wenyuanbo / tic

06 Apr, 2020 3 commits
- fix to skip node not in graph. (#5238) · 3e8c7beb
```
Skip nodes that are not in the graph, because some networks cannot be hybridized when some variables are unused.
```
  chinakook committed Apr 06, 2020
  3e8c7beb Browse Files
- [CI] Update MxNet to 1.6.0 with MKL (#5240) · 41b8fd1e
  Haichen Shen committed Apr 05, 2020
  
  41b8fd1e Browse Files
- [Runtime][Contrib] Support cudnn softmax (#5214) · 799ff356
  Haichen Shen committed Apr 05, 2020
  
  799ff356 Browse Files
05 Apr, 2020 4 commits

[Relay][Topi][AutoTVM] Winograd support for Conv3D (#5186) · 02eb1833

* Functional conv3d winograd working.

* Formatted python code.

* registered conv3d winograd compute and started adding relay without_weight_transform operator.

* Add topi testing for conv3d winograd.

* Format file.

* small tweak to unrolling to prevent build sticking.

* Refactoring convolution ops in relay.

* Refactored relay convolutions.

* Bug fixes.

* Fixed static bug in convolution.

* Added conv3d alter op layout and related support.

* Bug fixes and testing done.

* Fix a few autotvm bugs.

* Drop silly debug print.

* Removed debug_skip_region.

* Add variant of conv3d_winograd that doesn't transform depth.

* initial infrastructure done for depthless conv.

* Fix no_depth schedule bugs.

* automatic topi switching between depth and depthless winograd.

* Fixed bug in schedule.

* lint fixes.

* Removed indents in convolution.cc

* missed a few indents oops.

* fixed flop count.

* One more small tweak.

* Change kernel pack inner axes order.

* Style changes.

* Comment fixes.

committed Apr 05, 2020

02eb1833 Browse Files

[Fix][VM] Fix copy constructor (#5237) · c76cbd8d
ga committed Apr 05, 2020

c76cbd8d Browse Files

[Relay][ADT]Static Tensor Array (#5103) · b5352ee2

* Add other static tensor array ops

* Add tensor array get data

* Minor refactor

* Fix pylint

* Update docstring

* Make get data more generic

* Improve test

* Improve split test

* Improve get data

* Minor fix

* Further improvement for static shape

* Improve shape parsing

* Unify get_static_name

committed Apr 05, 2020

b5352ee2 Browse Files

[REFACTOR][TIR] Migrate all low-level passes to the Pass Manager. (#5233) · e63e08fe

* [REFACTOR][TIR] Migrate all low-level passes to the Pass Manager.

This PR migrates tvm.lower to return an IRModule of PrimFuncs
instead of LoweredFuncs.

* Remove LoweredFunc.

committed Apr 04, 2020

e63e08fe Browse Files

04 Apr, 2020 3 commits

[ONNX]Pool3d & upsample3d op support (#5135) · fd9ce583

* [ONNX]Pool3d and Upsample3d op updated

* Pool3d and Upsample3d testcase

* Review comments fixed

* Review comments

committed Apr 04, 2020

fd9ce583 Browse Files

Fix intel conv2d auto tune (#5200) · 0cfdecda

* Fix x86 conv2d and depthwise conv2d auto tuning

* Fix depthwise conv2d infer layout

* Use random data instead of empty data for autotvm

* Fix pylint

* Keep empty array for now for autotvm

committed Apr 03, 2020

0cfdecda Browse Files

[TE] Support mixing normal and cross-thread reduction (#5193) · b41f4e55
```
* Support mixing normal and cross-thread reduction

* minor improvements
```
Tang, Shizhi committed Apr 03, 2020
b41f4e55 Browse Files

03 Apr, 2020 8 commits

[REFACTOR][TIR] Migrate most of low-level build to use the Pass Manager. (#5225) · 75e936e1

* [REFACTOR][TIR] Migrate most of low-level build to use the Pass Manager.

- SplitHostDevice
- ThreadSync
- BindDevice
- LowerThreadAllreduce
- Provide a temp fix for printing IRModule with PrimFunc before the formal text printer.

* Address comments, fix tests.

* Fix relay tests

* Explicit move

committed Apr 03, 2020

75e936e1 Browse Files
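
The idea behind the Pass Manager migration above can be pictured with a toy model: each pass is a function from a module to a module, and a manager composes them in order. This is only a sketch under that assumption. The dictionary-based module, `make_pass`, and `sequential` are hypothetical stand-ins, not TVM APIs, and the pass order simply follows the list in the commit message.

```python
from functools import reduce


def make_pass(name):
    """Build a toy pass that records its name on the module it transforms."""
    def run(module):
        # Return a new module instead of mutating the input.
        return {**module, "applied": module["applied"] + [name]}
    return run


def sequential(passes):
    """Compose passes into one module -> module function, applied in order."""
    def run(module):
        return reduce(lambda mod, p: p(mod), passes, module)
    return run


# Pass names are taken from the commit message; the order is illustrative only.
pass_names = ["SplitHostDevice", "ThreadSync", "BindDevice", "LowerThreadAllreduce"]
pipeline = sequential([make_pass(name) for name in pass_names])

# A stand-in for an IRModule: just a dict with a function list and a log.
lowered = pipeline({"functions": ["main"], "applied": []})
```

Because every pass has the same module-to-module shape, a pipeline is itself a pass and can be nested inside another pipeline.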

[PYTHON] Make IntImm more like an integer (#5232) · 9b274cbb
Tianqi Chen committed Apr 03, 2020

9b274cbb Browse Files

[RELAY] Non-recursive Graph Visitor and Rewriter (#4886) · 7de8a539

* First pass at defining a non-recursive Graph Visitor and Rewriter

autoformat

remove a currently empty test until testing is solidified

* Make CalcDep from Dead Code Elimination non-recursive

* Partially working, not passing all tests yet

passes tests when disabling GetExprRefCount, I think I have a bug in visit counting

fix GetExprRefCount

Fix a subtle bug with nested recursive/non-recursive scopes

* Refactor

* improve comments

* respond to review comments on comments

* Fix a problem with default recursion for dataflow nodes

mark DataflowVisitor methods as override

* implement ScopeMutator

* convert forward_rewrite to ScopeMutator, remove DataflowMutator

* rewrite ExprRewriter and convert fast_math to use it

* switch BiasAddSimplifier to ExprRewriter

fix a clang warning

fix cpp lint

fix doc param error

* respond to review comments

* fix a typo in the iterative looping

* add a regression test for GetExprRefCount issue

* Normalize naming

* fix lint

* respond to review comments

committed Apr 03, 2020

7de8a539 Browse Files
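
The motivation for a non-recursive graph visitor, as in the commit above, is that very deep dataflow graphs can exhaust the call stack when traversed recursively. A minimal sketch of the general technique follows: a post-order traversal driven by an explicit stack that visits each shared node once. The `Node` class and `post_order_visit` function are hypothetical and do not mirror the Relay C++ implementation.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Node:
    """A toy expression node; `args` are the nodes it depends on."""
    name: str
    args: list = field(default_factory=list)


def post_order_visit(root, visit):
    """Call `visit` on every node reachable from `root`, arguments first,
    using an explicit stack instead of recursion."""
    visited = set()
    stack = [(root, False)]  # (node, have its arguments been scheduled?)
    while stack:
        node, expanded = stack.pop()
        if id(node) in visited:
            continue  # shared node already handled through another parent
        if expanded:
            visited.add(id(node))
            visit(node)
        else:
            # Revisit this node after all of its arguments.
            stack.append((node, True))
            for arg in reversed(node.args):
                if id(arg) not in visited:
                    stack.append((arg, False))


# Diamond-shaped graph: `x` feeds both `a` and `b`, which feed `c`.
x = Node("x")
a = Node("a", [x])
b = Node("b", [x])
c = Node("c", [a, b])

order = []
post_order_visit(c, lambda n: order.append(n.name))
```

The traversal depth is bounded by the size of the explicit stack on the heap rather than by the interpreter or native call stack, so long operator chains do not overflow.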

[TOPI x86] Adding unroll_kw config option for depthwise conv2d. (#5197) · 6b840fa9
Animesh Jain committed Apr 03, 2020

6b840fa9 Browse Files

[RELAY][FIX] Fix hang in MergeCompilerRegions (#5227) · 54975a3f

For certain network topologies, MCR could hang.
This patch fixes that case.

Change-Id: I3edd8a8a6b452b2b838b777720adea22a3b995b4

committed Apr 03, 2020

54975a3f Browse Files

[KERAS]Upsample3d & ZeroPadding3d op (#5125) · b796c13c

* [KERAS]upsampling3d and zeropadding3d op

* [KERAS]upsampling3d and zeropadding3d test case

* Review comments updated

committed Apr 03, 2020

b796c13c Browse Files

[DOCSTRING]missing function parameters updated (#5228) · 3c2aa1aa
Samuel committed Apr 03, 2020

3c2aa1aa Browse Files

[CodeGen][CUDA] Fix bugs (#5209) · 316ce055

- Support vectorized casts

- It is incorrect to extract elements from int8x4 with

   0x000000ff & (x >> i * 8)

  as this value is of type int in C/C++. If this expression
  is used for sign extensions, the sign bit will be wrong.
  Simply use C style casts instead and sign bits will just work.

Signed-off-by: Wei Pan <weip@nvidia.com>

committed Apr 03, 2020

316ce055 Browse Files
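
The sign-extension problem described in the commit above can be reproduced outside CUDA. The following sketch emulates a packed int8x4 word in Python: the shift-and-mask form always yields a non-negative value, while reinterpreting the selected byte as a signed 8-bit integer (the analogue of a C-style cast) recovers the original lane. The helper names are hypothetical and the lane layout (lane 0 in the low byte) is an assumption.

```python
import struct


def pack_int8x4(values):
    """Pack four signed 8-bit lanes into one 32-bit word, lane 0 in the low byte."""
    return struct.unpack("<I", struct.pack("<4b", *values))[0]


def extract_with_mask(word, i):
    """The pattern the commit calls incorrect: shift, then mask with 0xff."""
    return 0x000000FF & (word >> (i * 8))


def extract_with_cast(word, i):
    """Reinterpret the selected byte as a signed 8-bit value, like a C-style cast."""
    byte = (word >> (i * 8)) & 0xFF
    return struct.unpack("<b", bytes([byte]))[0]


lanes = [1, -2, 127, -128]
word = pack_int8x4(lanes)

masked = [extract_with_mask(word, i) for i in range(4)]  # sign information lost
casted = [extract_with_cast(word, i) for i in range(4)]  # original lanes recovered
```

Here the masked results for the negative lanes come out as 254 and 128 instead of -2 and -128, which is the wrong-sign-bit behaviour the commit message describes.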

02 Apr, 2020 12 commits

[REFACTOR] tvm.hybrid -> te.hybrid (#5223) · 6e1cd825

Rationale: The current hybrid module is more aligned with the te part.
We might consider adding a new variant of hybrid script that supports the unified IR later.
This refactor paves the way for potential later changes.

committed Apr 02, 2020

6e1cd825 Browse Files

[DOCS] Misc docs improvements (#5222) · 62b3195b
```
- Reduce CI docs task log size.
- Update the relation to halide to the latest state.
```
Tianqi Chen committed Apr 02, 2020
62b3195b Browse Files
[PYTORCH]AvgPool3d, MaxPool3d and Squeeze op support (#5220) · db535f45
```
* [PYTORCH]AvgPool3d, MaxPool3d and Squeeze op support

* Testcases added

* review comments
```
Samuel committed Apr 03, 2020
db535f45 Browse Files

[REFACTOR][TIR] Migrate low-level pass functions to Pass Manager (#5213) · 44bffdb3

- Migrate LowerTVMBuiltin
- Migrate InferFragment, LowerThreadAllreduce
- Migrate ThreadSync
- Refactor target::Build to directly take IRModule.
- Remove un-used legacy functions.

committed Apr 02, 2020

44bffdb3 Browse Files

[TIR] Introduce BufferLoad/Store (#5205) · 88d2f34b

Co-authored-by: Siyuan Feng <hzfengsy@sjtu.edu.cn>

This PR introduces BufferLoad/Store to TIR. The new nodes will replace
Provide and Call with Tensor arguments in the subsequent refactors.

committed Apr 02, 2020

88d2f34b Browse Files

[TIR][PASS] dtype rewrite for indexing variables (#5092) · 4e5c5843
Haozheng Fan committed Apr 02, 2020

4e5c5843 Browse Files
[Runtime][Object] expose runtime::String to Python (#5212) · 4195b2e2
```
* expose runtime::String to Python

* retrigger ci
```
Zhi committed Apr 02, 2020
4195b2e2 Browse Files
[REFACTOR][IR] kExternalSymbol -> kGlobalSymbol (#5211) · d2f9af78
```
* expose runtime::String to Python

* kExternalSymbol -> kGlobalSymbol
```
Zhi committed Apr 02, 2020
d2f9af78 Browse Files

[Frontend][Torch] Fix up graph input handling (#5204) · 03cbf78e

* [Frontend][Torch] Simplify operator input handling

* [Frontend][Torch] Allow user supplied input names to override graph inputs

* Fix pylint issues

* Updates from code review feedback

* Fix tutorial to use shape list input

* Disable intermittent test failure in topi vision test

committed Apr 03, 2020

03cbf78e Browse Files

[Debug] Add Dump function for Object type (NFC) (#5207) · 15b1751c
```
Signed-off-by: Wei Pan <weip@nvidia.com>
```
Wei Pan committed Apr 01, 2020
15b1751c Browse Files
[DOCS] Reduce artifacts generated by sphinx gallery (#5208) · 5b857d3c
Tianqi Chen committed Apr 01, 2020

5b857d3c Browse Files

[REFACTOR][TIR] Introduce ExprDeepEqual, Remove IRDeepCompare (#5206) · e60003c2

* [REFACTOR][TIR] Introduce ExprDeepEqual, Remove IRDeepCompare

This PR introduces ExprDeepEqual which reuses the StructuralEqual infra.
We migrated the use cases of ir_pass::Equal to ExprDeepEqual and StructuralEqual.

* Address comments

committed Apr 01, 2020

e60003c2 Browse Files

01 Apr, 2020 8 commits

[RELAY] Fixes to MergeCompilerRegions (#5195) · 04499665

* [RELAY] Fixed issues with MergeCompilerRegions

This PR addresses a few outstanding issues with
the implementation of MergeCompilerRegions. In
particular, it now handles TupleGetItem nodes properly
and other minor bugs related to region merging have
been fixed.

Change-Id: I07783afc56183a6f798a510209f23b0a5f252255

* Fixed issue using pre-merged regions

Change-Id: I0a844ac59bda1089ae0c67cef52f0b0c7ab2cbd7

* Removed some debugging logic

Change-Id: Ib6f2eede6f38bbb270073eb8d4c4dc19f60832c6

* Remove default annotations

Change-Id: I9b7696a51c95871491cbea33c40f92ec327e417f

* Annotate default 'if's

Change-Id: I0098bd1bf6788dd6366810dcefa84f1ebbffaab0

* Clang format

Change-Id: I944365cd3080a97a9261f643a8f1efa5a63cf82b

* Use src/dest in merge

Change-Id: Ie43113492bda8f1ce63eaf9615cb645bb9e2ee86

* Fixed partition test

Change-Id: I46f9e349b1a813a9140f7e4f8a2241687e2df73b

* Removed comments

Change-Id: I309afdd1951d7e796e41d13788aa487707e0ac4c

committed Apr 01, 2020

04499665 Browse Files

[FRONTEND][MXNET] Use leaky by default for LeakyReLU (#5192) · 2f41a396
MORITA Kazutaka committed Apr 01, 2020

2f41a396 Browse Files
[PYTORCH]Dropouts And InstanceNorm support added (#5203) · 302e8ee2
```
* [PYTORCH]Dropouts And InstanceNorm support added

* Review comments fixed
```
Samuel committed Apr 02, 2020
302e8ee2 Browse Files
[BUGFIX]bugfix in tensorflow space_to_batch_nd (#5175) · afb8bf06
```
* [BUGFIX]bugfix in tensorflow space_to_batch_nd

* Test case added
```
Samuel committed Apr 01, 2020
afb8bf06 Browse Files
[DOCS] Use https link (#5183) · b2a32ddf
```
* [DOCS] Use https link

* use http for sphinx
```
Tianqi Chen committed Apr 01, 2020
b2a32ddf Browse Files

[RELAY] Partition graph codestyle fixes (#5202) · e46aa333

* [RELAY] Codestyle fixes for Graph Partitioner
	*ran through clang-format

* *formatting comments

* *further codestyle changes (after clang-format)

committed Apr 02, 2020

e46aa333 Browse Files

[PYTORCH]Activations for pytorch (#5194) · e722301a
```
* [PYTORCH]Activations for pytorch

* Review comments updated
```
Samuel committed Apr 02, 2020
e722301a Browse Files
[REFACTOR][TIR] Migrate Low-level Passes to Pass Manager (#5198) · 2b6d69c6
```
* [TIR][TRANSFORM] Migrate LowerIntrin

* LowerDeviceStorageAccessInfo

* Migrate LowerWarpMemory
```
Tianqi Chen committed Mar 31, 2020
2b6d69c6 Browse Files

31 Mar, 2020 2 commits

[Topi x86] Missing vectorize for depthwise conv2d. (#5196) · 03ff0cd0
Animesh Jain committed Mar 31, 2020

03ff0cd0 Browse Files

[RELAY] Re-wrote the Graph Partitioner to support multiple outputs (#5143) · 14ae3a6e

* [RELAY] Re-wrote the Graph Partitioner to support multiple outputs

Input : A Relay module that has functions with disjoint annotated regions
        using compiler_begin and compiler_end. There could be multiple outputs.

Output : A Relay module with global functions for such disjoint annotated regions
         with calls inserted at the respective location

Dependencies : AnnotatedRegionSet Utility class.

Methodology :
      1) The AnnotatedRegionSet utility class is able to construct a collection of
         nodes that are bound by a given annotation -- here we use compiler_begin
         and compiler_end
      2) Initially, for each function in the module AnnotatedRegionSets are populated.
      3) Then, a Visitor pass traverses the function until it encounters a compiler_end
         node that belongs to a "region".
      4) When the first compiler_end of a given annotated region is found, a function is
         formed and inserted.
         a) if the region has multiple outputs, a Tuple node (capturing all outputs)
            is returned.
      5) Thereafter, if we encounter another output of the same annotated region,
         it is important to note that the function is already formed. Therefore, the pass
         will look up the function and insert a TupleGetItemNode.
          a) We will use the location index of "rets" of each "Region" of AnnotatedRegionSet
             as TupleGetItemNode index.
      6) Therefore, functions will be created for all annotated regions. The name for each
         global function is created using "Region" id and the compiler name.

Change-Id: I1372f02a845b6d3da03b561763e03a378dca263c

* [RELAY] Re-wrote the Graph Partitioner to support multiple outputs

    *removed the expected use-case as we are taking broken-down PR approach
    *code style fixes
    *some trivial one liners

* [RELAY] Re-wrote the Graph Partitioner to support multiple outputs

    *fixed an implicit copy to a move

* [RELAY] Re-wrote the Graph Partitioner to support multiple outputs

    *code style changes for comments
    *renamed test case multiple outputs --> mixed single multiple outputs
        Since the existing test case checks for both single and multiple
        output scenarios
    *added a new test case with conv2d + batch_norm
    *some var name changes in the test

* [RELAY] Re-wrote the Graph Partitioner to support multiple outputs

	*rebased

committed Mar 31, 2020

14ae3a6e Browse Files
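
Steps 4 to 6 of the methodology in the commit above can be sketched with a toy model: the first output seen for a region creates its global function, later outputs reuse it, and each output of a multi-output region is selected by its position in the region's `rets`. Everything here is hypothetical, including `Region`, `ToyPartitioner`, the tuple-based result encoding, and the `compiler_id` naming scheme. It is not the Relay implementation.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """A toy annotated region; `rets` lists its outputs in a fixed order."""
    region_id: int
    compiler: str
    rets: list


class ToyPartitioner:
    def __init__(self):
        # Global function name -> tuple of outputs the function returns.
        self.functions = {}

    def rewrite_output(self, region, output):
        """Return a stand-in for the expression that replaces `output`."""
        # Hypothetical naming scheme built from compiler name and region id.
        name = f"{region.compiler}_{region.region_id}"
        if name not in self.functions:
            # First output of this region: form the function once.
            self.functions[name] = tuple(region.rets)
        if len(region.rets) == 1:
            return ("call", name)
        # Multiple outputs: index into the returned tuple by position in rets.
        return ("tuple_get_item", name, region.rets.index(output))


region = Region(region_id=0, compiler="test_target", rets=["conv_out", "bn_out"])
partitioner = ToyPartitioner()

first = partitioner.rewrite_output(region, "bn_out")     # creates the function
second = partitioner.rewrite_output(region, "conv_out")  # reuses the function
```

Keying the lookup on the region rather than on the individual output is what keeps a multi-output region from being turned into several separate functions.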