Commits · 17cd27da9b31ce9220e5b3bbf51ad56d8f4579a1 · wenyuanbo / tic

27 Apr, 2020 2 commits
- [relay][topi] Add operation relay.nn.dilate() which calls topi.nn.dilate() (#5331) · 639358e8
```
* Add operation relay.nn.dilate() which calls topi.nn.dilate().

* Fix typo

* Set op pattern to injective
```
  notoraptor committed 4 years ago
  639358e8 Browse Directory
- Improve IntervalSet's floormod (#5367) · 37e57548
  yongfeng-nv committed 4 years ago
  
  37e57548 Browse Directory
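
As a rough illustration of what a dilate operation computes (the relay.nn.dilate() commit above), here is a minimal pure-Python sketch for the 1-D case. It assumes dilation inserts a fill value of zero between neighbouring elements; `dilate_1d` and its parameters are hypothetical names for this sketch and are not the relay or topi API.

```python
def dilate_1d(data, stride, fill=0):
    """Insert (stride - 1) copies of `fill` between consecutive elements.

    Hypothetical helper for illustration only; not a TVM function.
    """
    if stride < 1:
        raise ValueError("stride must be >= 1")
    data = list(data)
    if not data:
        return []
    # Output length is (n - 1) * stride + 1; original values land every `stride` steps.
    out = [fill] * ((len(data) - 1) * stride + 1)
    out[::stride] = data
    return out


print(dilate_1d([1, 2, 3], 2))
```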
24 Apr, 2020 1 commit

[RUNTIME][OBJECT] Introduce static slots for common objects. (#5423) · 3cc49719

The _type_child_slots can be used to enable a quick type-checking optimization
by checking whether the type index is within the bound.

This PR enables these static slots:

- Introduce a static assert to avoid the scenario where a developer forgets to
  set _type_child_slots when the field is set for the type's parent.
- Revamp and assign static type indices to common runtime objects.
- Add a DumpTypeTable call that lets developers monitor the current state
  of the type table and offers suggestions for the slots (ideally the number of slots
  equals the number of children so there is no overflow).

committed 4 years ago

3cc49719 Browse Directory
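
The range check described above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the runtime's implementation: `TypeTable`, `register`, `in_reserved_range` and `is_instance` are hypothetical names, and the model assumes a type's descendants are allocated from a contiguous block of indices reserved right after the parent, with a fallback to walking the parent chain when the block overflows.

```python
class TypeTable:
    """Toy type table: each type reserves `child_slots` indices for descendants."""

    def __init__(self):
        self.parent = {}       # type index -> parent index (None for roots)
        self.child_slots = {}  # type index -> number of reserved descendant slots
        self.next_child = {}   # type index -> next unused reserved index
        self.next_index = 0    # next index outside any reserved block

    def register(self, parent=None, child_slots=0):
        need = 1 + child_slots  # the type itself plus its reserved block
        fits = (
            parent is not None
            and self.next_child[parent] + need - 1
            <= parent + self.child_slots[parent]
        )
        if fits:
            index = self.next_child[parent]
            self.next_child[parent] += need
        else:
            # Root type, or the parent's reserved block overflowed.
            index = self.next_index
            self.next_index += need
        self.parent[index] = parent
        self.child_slots[index] = child_slots
        self.next_child[index] = index + 1
        return index

    def in_reserved_range(self, tindex, base):
        # Fast path: a single bounds check on the type index.
        return base <= tindex <= base + self.child_slots[base]

    def is_instance(self, tindex, base):
        if self.in_reserved_range(tindex, base):
            return True
        # Slow path for overflowed children: walk up the parent chain.
        while tindex is not None:
            if tindex == base:
                return True
            tindex = self.parent[tindex]
        return False


table = TypeTable()
obj = table.register(child_slots=2)
a = table.register(parent=obj)
b = table.register(parent=obj)
c = table.register(parent=obj)  # overflows the two reserved slots
other = table.register()
```

In this model, `a` and `b` are recognised by the bounds check alone, while `c` still type-checks correctly but only through the slower parent walk, which is why sizing the slots to the number of children matters.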

23 Apr, 2020 1 commit
- [TIR][REFACTOR] Remove ir_pass in favor of analysis/transform. (#5415) · b8c23d66
```
This PR removes ir_pass(old style pass functions) in favor
of analysis/transform(new style pass manager).
```
  Tianqi Chen committed 4 years ago
  b8c23d66 Browse Directory
22 Apr, 2020 2 commits
- [TIR] Enhance Substitute, python bindings for Substitute/PostOrderVisit/IRTransform. (#5400) · 6cb5b882
```
Substitute now takes a std::function to customize more replacing behaviors.

Co-authored-by: Siyuan Feng <hzfengsy@sjtu.edu.cn>
```
  Tianqi Chen committed 4 years ago
  6cb5b882 Browse Directory
- [Fix] Remove the duplicate PrintIR pass in Relay (#5403) · 56941fb9
  Haichen Shen committed 4 years ago
  
  56941fb9 Browse Directory
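
The idea of a callback-driven Substitute (commit 6cb5b882 above) can be sketched on a toy expression tree. This is only an illustration in Python: the tuple-based expression encoding and the `substitute` function are assumptions made for this sketch, not the TIR data structures or the Python binding added by the PR.

```python
def substitute(expr, vmap):
    """Replace variables in a tuple-encoded expression.

    `vmap` is a callable taking a variable name and returning either a
    replacement expression or None to leave the variable untouched.
    """
    kind = expr[0]
    if kind == "var":
        replacement = vmap(expr[1])
        return expr if replacement is None else replacement
    if kind == "const":
        return expr
    # Any other node: rebuild it with substituted operands.
    return (kind,) + tuple(substitute(arg, vmap) for arg in expr[1:])


expr = ("add", ("var", "x"), ("mul", ("var", "y"), ("const", 2)))
result = substitute(expr, lambda name: ("const", 3) if name == "x" else None)
print(result)
```

Taking a callable rather than a fixed map lets the caller decide per variable, for example replacing only variables that match a naming pattern.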
21 Apr, 2020 3 commits

[PYTHON] Migrate VTA TIR passes to the new pass manager. (#5397) · d3277874
Tianqi Chen committed 4 years ago

d3277874 Browse Directory

[ARITH] Remove the legacy Simplify, migrate to Analyzer. (#5385) · d9cecdf5

The legacy Simplify/CanonicalSimplify are now thin wrappers around the Analyzer.
This PR removes these functions and migrates every place that requires
simplification to enforce Analyzer creation.
The new API would encourage more Analyzer sharing and potentially enable
context-aware analyzer-based simplification.

committed 4 years ago

d9cecdf5 Browse Directory

[REFACTOR][TE] Inline -> te/schedule/operation_inline.h (#5386) · b8efe27f

Rationale: inline is a transformation used in te to
rewrite its internal expressions. It is not a formal IRModule->IRModule transform pass.

Also removed the python test as the test is covered by stage.compute_inline.

committed 4 years ago

b8efe27f Browse Directory

20 Apr, 2020 1 commit

[TIR][REFACTOR] RewriteForTensorCore -> te/schedule (#5379) · 4c0a53dc

* [TIR][REFACTOR] RewriteForTensorCore -> te/schedule

RewriteForTensorCore depends on the schedule information, which makes it differ
from a typical pass (which should get all the information from the input TIR).

As a result, we refactor it as a SchedulePostProc step for now.
We should revisit it later as we introduce more support for tensor core patterns in the TIR.

* Fix VTA to fit the new IR Pattern

committed 4 years ago

4c0a53dc Browse Directory

19 Apr, 2020 1 commit

[TIR][REFACTOR] Remove te::Tensor dependencies from TIR passes. (#5372) · c3511c5e

* [TIR][REFACTOR] Remove te::Tensor dependencies from TIR passes.

te::Tensor is a useful object for tensor expression, but brings an
unnecessary reverse dependency into TIR nodes such as Provide and Realize.

This PR is a first step to remove this dependency. We will use Buffer in all the places
where te::Tensor was used. The rough correspondence is:

- Provide -> BufferStore
- Realize -> BufferRealize
- HalideCall -> BufferLoad.

After this change, we can now use IRModule of PrimFuncs cleanly to represent TIR
at any point of the optimizations. Buffer will serve as the abstraction for the TIR data
models to represent the intermediate storages and their constraints.

We still keep Realize/HalideCall and Provide as TIR nodes for now to keep the change minimal.
Right after ScheduleOps, we call SchedulePostProcToPrimFunc to canonicalize the temporary IR
generated by TE (which contains these nodes) to the TIR.

The TIR optimizations are now mostly migrated to the pass manager.
Followup PRs are needed to migrate the remaining few passes.

* Fix dev tutorial

committed 4 years ago

c3511c5e Browse Directory

18 Apr, 2020 1 commit

[TIR][REFACTOR] Migrate low-level passes in tvm.lower to the Unified IR pass manager. (#5364) · 32648950

- Migrate BoundCheckers and Simplify
- Migrate RewriteUnsafeSelect and RemoveNoOp
- Migrate UnrollLoop and StorageRewrite
- Migrate InjectDoubleBuffer and InjectVirtualThread
- Migrate LoopPartition and Vectorize
- Migrate CoProcSync, LiftAttrScope, InjectCopyIntrin

We still keep ir_pass registrations for now.
Need a separate PR to refactor the parts before the StorageFlatten.

committed 4 years ago

32648950 Browse Directory

17 Apr, 2020 2 commits
- [RELAY][PYTORCH]GroupNorm op support added (#5358) · f49fc366
  Samuel committed 4 years ago
  
  f49fc366 Browse Directory
- [TIR] Make lower_warp_memory support extent(threadIdx.x) < warp_size (#5307) · 4b5f324a
```
* support extent(threadIdx.x) < warp_size in lower_warp_memory

* more docs for lower_warp_memory
```
  Tang, Shizhi committed 4 years ago
  4b5f324a Browse Directory
15 Apr, 2020 3 commits

[TIR] Remove ProducerConsumer and AllocateNode::new_expr (#5333) · e8138f7d

* [TIR] Remove ProducerConsumer and AllocateNode::new_expr

This PR removes two legacy IR parts in TIR that are deprecated.

ProducerConsumer node only serves as a hint markup and may no longer be
informative after extensive transformations in the pass.
If necessary, we can add related info via AttrStmt.

The new_expr field in the AllocateNode is deprecated since it can just be
replaced by a LetStmt.

- Remove dependencies of passes on ProducerConsumer.
- Remove ProducerConsumer from the IR.
- Remove the deprecated fields (new_expr, free_function) from AllocateNode.

* Fix additional testcases

committed 4 years ago

e8138f7d Browse Directory

[Runtime][Relay][Cleanup] Clean up for memory pass to enable heterogeneous… · 9a8ed5b7

[Runtime][Relay][Cleanup] Clean up for memory pass to enable heterogeneous execution support. (#5324)

* Cleanup type pack and unpack for tuples.

* Clean up the memory_pass using common helpers

* Clean up memory.cc

* Refactor pass

* Add doc strings

* Fix CPPlint

* Fix PyLint

* Fix

* Apply suggestions from code review

Co-Authored-By: Zhi <5145158+zhiics@users.noreply.github.com>

* Fix typo

Co-authored-by: Zhi <5145158+zhiics@users.noreply.github.com>

committed 4 years ago

9a8ed5b7 Browse Directory

[RELAY] Remove re-exports of tvm.transform (#5337) · 275e317c
Tianqi Chen committed 4 years ago

275e317c Browse Directory

14 Apr, 2020 1 commit

[TIR] Refactor MakePackedAPI to target dependent stage. (#5326) · f08d5d78

Previously MakePackedAPI was in the target independent stage,
but nevertheless required the device_type information that is
bound at a later target dependent stage.

The previous implementation was due to a limitation of LoweredFunc,
which cannot carry buffer_map info (so the functions had to be lowered right away).
This is no longer the case after the unified IR refactor.

This PR migrates MakePackedAPI to a target dependent stage
and removes the unnecessary BindDevice pass.

committed 4 years ago

f08d5d78 Browse Directory

13 Apr, 2020 1 commit

[RUNTIME][IR] Allow non-nullable ObjectRef, introduce Optional<T>. (#5314) · fc75de9d

* [RUNTIME] Allow non-nullable ObjectRef, introduce Optional<T>.

We use ObjectRef and its sub-classes extensively throughout our codebase.
Each of ObjectRef's sub-classes is nullable, which means it can hold nullptr
as its value.

In some places we need nullptr as an alternative value, but the implicit support
for nullptr in all ObjectRefs burdens the developer
with explicit defined checks in many places of the codebase.

Moreover, it is unclear from the API's intent whether
we want a nullable object or a non-null version (in many cases we want the latter).

Borrowing existing wisdom from languages like Rust, we propose to
introduce a non-nullable ObjectRef and an Optional<T> container that
represents a nullable variant.

To keep backward compatibility, we will start by allowing most ObjectRefs to be nullable.
However, we should start to use Optional<T> as the type in places where
we know nullable is a requirement. Gradually, we will move most of the ObjectRef
to be non-nullable and use Optional<T> in the nullable cases.

Such explicitness in typing can help reduce the potential problems
in our codebase overall.

Changes in this PR:
- Introduce _type_is_nullable attribute to ObjectRef
- Introduce Optional<T>
- Change String to be non-nullable.
- Change the API of function->GetAttr to return Optional<T>

* Address review comments

* Upgrade all compiler flags to c++14

* Update as per review comment

committed 4 years ago

fc75de9d Browse Directory
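
The difference between an implicitly nullable reference and an explicit optional wrapper can be sketched as follows. This is a Python analogy under stated assumptions, not the C++ container from the PR: `Opt`, its methods, and `get_attr` are hypothetical names chosen for this sketch.

```python
class Opt:
    """Toy optional wrapper: callers must unwrap before using the value."""

    def __init__(self, value=None):
        self._value = value

    def defined(self):
        return self._value is not None

    def value(self):
        if self._value is None:
            raise ValueError("Opt is empty")
        return self._value

    def value_or(self, default):
        return default if self._value is None else self._value


def get_attr(attrs, key):
    # The return type itself signals that the attribute may be missing.
    return Opt(attrs.get(key))


attrs = {"global_symbol": "main"}
print(get_attr(attrs, "global_symbol").value())
print(get_attr(attrs, "target").value_or("llvm"))
```

The point of the design is that a missing value shows up in the signature, so a forgotten null check becomes an explicit unwrap the reader can see rather than a silent assumption.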

12 Apr, 2020 2 commits

Remove PrimExpr from String (#5311) · 8c31d0dd
Zhi committed 4 years ago

8c31d0dd Browse Directory

[IR][TRANSFORM] Enable CopyOnWrite for passes. (#5309) · e4b80bda

This PR enables copy-on-write optimizations for passes:
- Enable COW for IRModule in both TIR and relay passes.
- Enable COW for PrimFunc in TIR passes.

More thought is needed on whether/how to enable COW
for relay::Function, because some function passes depend
on the presence of IRModule for context information,
and the std::move of the related function to nullptr
might affect the related behavior.

committed 4 years ago

e4b80bda Browse Directory
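
A minimal sketch of the copy-on-write idea, assuming an explicit reference count on the shared data: a pass mutates in place when it holds the only reference and copies first otherwise. `Body`, `Handle`, `copy_on_write` and `rename_pass` are hypothetical names for this Python model and do not correspond to the IRModule or PrimFunc API.

```python
class Body:
    """Shared payload with an explicit reference count."""

    def __init__(self, funcs):
        self.funcs = dict(funcs)
        self.refs = 0


class Handle:
    """Reference to a Body; several handles may share one body."""

    def __init__(self, body):
        self.body = body
        body.refs += 1

    def share(self):
        return Handle(self.body)

    def copy_on_write(self):
        # Copy only when someone else can still observe the old body.
        if self.body.refs > 1:
            self.body.refs -= 1
            self.body = Body(self.body.funcs)
            self.body.refs += 1
        return self.body


def rename_pass(handle, old, new):
    body = handle.copy_on_write()
    body.funcs[new] = body.funcs.pop(old)
    return handle


mod = Handle(Body({"f": 1}))
first_body = mod.body
rename_pass(mod, "f", "g")   # unique reference: mutated in place

alias = mod.share()
shared_body = mod.body
rename_pass(mod, "g", "k")   # shared: mod gets a private copy first
```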

11 Apr, 2020 1 commit

[RUNTIME] Introduce RValue reference(move) support to TypedPackedFunc (#5271) · b72dd9d9

* [RUNTIME] Introduce RValue reference(move) support to TypedPackedFunc

This PR introduces RValue reference support to the PackedFunc calling convention to address the above issue.
Specifically, when an argument is an r-value reference, we will assign a different type code (`kObjectRValueRefArg`)
and pass `Object**` (the address of the Object pointer) instead through the values array.
The callee can choose to move out this Object pointer and set the original Object pointer on the caller side to nullptr.

We also add experimental move support to the python side (marked as _move to indicate its developmental nature).
This enhancement will enable copy-on-write optimizations throughout the TVM stack.

* Address review comments

* fix compilation

committed 4 years ago

b72dd9d9 Browse Directory
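
The calling convention described above can be modelled in Python by passing a slot instead of the object itself. This is a loose analogy under stated assumptions: a one-element list stands in for `Object**`, the two string tags stand in for type codes, and `pack`, `pack_move` and `take` are hypothetical helpers rather than runtime functions.

```python
OBJECT_ARG = "object"          # ordinary argument: callee receives the object
RVALUE_REF_ARG = "rvalue_ref"  # moved argument: callee receives the caller's slot


def pack(obj):
    return (OBJECT_ARG, obj)


def pack_move(slot):
    # `slot` is a one-element list standing in for the address of the pointer.
    return (RVALUE_REF_ARG, slot)


def take(arg):
    """Callee side: move the object out of the slot when it was passed by move."""
    code, value = arg
    if code == RVALUE_REF_ARG:
        obj = value[0]
        value[0] = None  # the caller's reference is cleared
        return obj
    return value


caller_slot = [{"name": "mod"}]
original = caller_slot[0]
received = take(pack_move(caller_slot))
```

After the call the callee holds the only reference in this model, which is the condition a copy-on-write pass needs in order to mutate in place.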

10 Apr, 2020 5 commits

[Arith] linear system and equation solver (#5171) · e21f2682

* [arith] linear system and equation solver

Co-authored-by: Sergei Grechanik <sergei.grechanik+h@gmail.com>

* avoid constructing analyzer every time

* generate random test cases and address comments

Co-authored-by: Sergei Grechanik <sergei.grechanik@gmail.com>

* rename linear_system to int_constraints

* add comments and use random seed

* message for reporting failure with seed

* add SEqualReduce to IntConstraints; allow variables & ranges to be None

Co-authored-by: Sergei Grechanik <sergei.grechanik+h@gmail.com>
Co-authored-by: Sergei Grechanik <sergei.grechanik@gmail.com>

committed 4 years ago

e21f2682 Browse Directory

[REFACTOR][IR] Move to runtime::String (#5276) · 5da361d3

* Use runtime::String

* move string to tvm namespace

* add const char* constructor

* implicit cast from std::string

committed 4 years ago

5da361d3 Browse Directory

[RUNTIME] Initial implementation of Hexagon runtime support (#5252) · 02d3a59b

* [RUNTIME] Initial implementation of Hexagon runtime support

This is only the TVM runtime. The FastRPC libraries, simulator driver,
etc. will be provided in subsequent commits.

* Fix pylint complaints

* Fix some more pylint complaints

* Add link to the Hexagon SDK website

* Extract VTCM marker into a common variable

* Implement device->device memory copy

* Disable unsigned PDs by default

* Ensure that --hvx_length is present in sim_args if HVX is enabled

* Remove the line about clang from README.md

Apparently things work with libstdc++.

* Mention to set USE_RPC=OFF when building libtvm_runtime.so for Hexagon

* Remember to use codegen_hvx in validate_hvx_length

* Add a line about minimum version of LLVM

committed 4 years ago

02d3a59b Browse Directory

[NODE] General serialization of leaf objects into bytes. (#5299) · 029388f5

This PR refactors the serialization mechanism to support general
serialization of leaf objects into bytes.

The new feature supersedes the original GetGlobalKey feature for singletons.
Added serialization support for runtime::String.

committed 4 years ago

029388f5 Browse Directory

Legalize - Use Non-recursive Rewriter. (#5296) · 7d670b04
```
* Legalize - Use Non-recursive Rewriter.

* Cleanup.
```
Animesh Jain committed 4 years ago
7d670b04 Browse Directory

07 Apr, 2020 3 commits

[RUNTIME] Quick fix PackedFunc String passing (#5266) · 2942278a
Tianqi Chen committed 4 years ago

2942278a Browse Directory

[uTVM][Runtime] Introduce Virtual Memory Allocator to CRT (#5124) · e11a6092

* initial crt_memory and memory leak fix in graph_runtime

Change-Id: I0f79f909a04d1c677aabb80f202f0612c5ce7f2a

* fix memory leak

Change-Id: I37104c09e28112b1974fa2b064c809d0a8d686c3

* clean up

Change-Id: I039b12015a1d56c8f4120867cd5a5292da34f3e3

* implement vrealloc

Change-Id: I35800470bcbfcf96652494f359711cb4c2d34398

* allocate from stack memory for most of the variables

Change-Id: I72071289843fff4031c0df8796868a0b9fbc57ee

* allocate from stack memory for all of the variables

Change-Id: I32dba85ac1660c77f51c2d0d8ab6436ed0c01c74

* lint

Change-Id: If12cd240685d7791fc60bc0cfb66389cdc186b73

* lint

Change-Id: I7c9d90c11b60b8edda2427ebd189ebe535af2100

* facilitate the growth of TVM_CRT_MAX_NDIM

Change-Id: I939fa43027a5c7529c5c7c6bd8d6e6beb91b7581

* extend test coverage of vmalloc

Change-Id: Ie4ff6b64fdfe6810836cf8fd44dace82a20c4581

* lint

Change-Id: Ibf3c06619ef296df5c49f3945cb6428777781d69

* move logging.h to src

* fix an error in macOS

* remove logging.h

* use cflags for gcc

* fix compilation error

committed 4 years ago

e11a6092 Browse Directory

[Relay][OP] Add fast_erf implementation (#5241) · f5b02fdb
```
* add fast erf

* doc

* lint

* fix

* fix indent
```
Haichen Shen committed 4 years ago
f5b02fdb Browse Directory

06 Apr, 2020 1 commit
- [RUNTIME] Enable auto conversion from str to runtime::String in PackedFunc, move… · 5e50f476
```
[RUNTIME] Enable auto conversion from str to runtime::String in PackedFunc, move dtype related handling to data_type.h (#5251)
```
  Tianqi Chen committed 4 years ago
  5e50f476 Browse Directory
05 Apr, 2020 2 commits

[Relay][Topi][AutoTVM] Winograd support for Conv3D (#5186) · 02eb1833

* Functional conv3d winograd working.

* Formatted python code.

* registered conv3d winograd compute and started adding relay without_weight_transform operator.

* Add topi testing for conv3d winograd.

* Format file.

* small tweak to unrolling to prevent build sticking.

* Refactoring convolution ops in relay.

* Refactored relay convolutions.

* Bug fixes.

* Fixed static bug in convolution.

* Added conv3d alter op layout and related support.

* Bug fixes and testing done.

* Fix a few autotvm bugs.

* Drop silly debug print.

* Removed debug_skip_region.

* Add variant of conv3d_winograd that doesn't transform depth.

* initial infrastructure done for depthless conv.

* Fix no_depth schedule bugs.

* automatic topi switching between depth and depthless winograd.

* Fixed bug in schedule.

* lint fixes.

* Removed indents in convolution.cc

* missed a few indents oops.

* fixed flop count.

* One more small tweak.

* Change kernel pack inner axes order.

* Style changes.

* Comment fixes.

committed 4 years ago

02eb1833 Browse Directory

[REFACTOR][TIR] Migrate all low-level passes to the Pass Manager. (#5233) · e63e08fe

* [REFACTOR][TIR] Migrate all low-level passes to the Pass Manager.

This PR migrates the tvm.lower to return IRModule of PrimFuncs
instead of the LoweredFuncs.

* Remove LoweredFunc.

committed 4 years ago

e63e08fe Browse Directory

03 Apr, 2020 2 commits

[REFACTOR][TIR] Migrate most of low-level build to use the Pass Manager. (#5225) · 75e936e1

* [REFACTOR][TIR] Migrate most of low-level build to use the Pass Manager.

- SplitHostDevice
- ThreadSync
- BindDevice
- LowerThreadAllreduce
- Provide a temp fix for printing IRModule with PrimFunc before the formal text printer.

* Address comments, fix tests.

* Fix relay tests

* Explicit move

committed 4 years ago

75e936e1 Browse Directory

[RELAY] Non-recursive Graph Visitor and Rewriter (#4886) · 7de8a539

* First pass at defining a non-recursive Graph Visitor and Rewriter

autoformat

remove a currently empty test until testing is solidified

* Make CalcDep from Dead Code Elimination non-recursive

* Partially working, not passing all tests yet

passes tests when disabling GetExprRefCount, I think I have a bug in visit counting

fix GetExprRefCount

Fix a subtle bug with nested recursive/non-recursive scopes

* Refactor

* improve comments

* respond to review comments on comments

* Fix a problem with default recursion for dataflow nodes

mark DataflowVisitor methods as override

* implement ScopeMutator

* convert forward_rewrite to ScopeMutator, remove DataflowMutator

* rewrite ExprRewriter and convert fast_math to use it

* switch BiasAddSimplifier to ExprRewriter

fix a clang warning

fix cpp lint

fix doc param error

* respond to review comments

* fix a typo in the iterative looping

* add a regression test for GetExprRefCount issue

* Normalize naming

* fix lint

* respond to review comments

committed 4 years ago

7de8a539 Browse Directory
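
The core idea of a non-recursive visitor is to replace the call stack with an explicit stack, so that very deep graphs do not exhaust recursion limits. Below is a generic Python sketch of an iterative post-order traversal over a DAG; `post_order` and the `children` callback are hypothetical names for this sketch and it is not the Relay visitor added by the PR.

```python
def post_order(root, children):
    """Return nodes of a DAG so that every node appears after its children.

    `children` is a callable mapping a node to a list of its child nodes.
    Each node is emitted once even if it is shared by several parents.
    """
    order, visited = [], set()
    stack = [(root, False)]
    while stack:
        node, expanded = stack.pop()
        if expanded:
            # All children have been emitted; now emit the node itself.
            order.append(node)
            continue
        if node in visited:
            continue
        visited.add(node)
        stack.append((node, True))
        for child in reversed(children(node)):
            if child not in visited:
                stack.append((child, False))
    return order


# A chain far deeper than Python's default recursion limit.
DEPTH = 50000
chain = post_order(0, lambda n: [n + 1] if n < DEPTH else [])

# A small DAG where node "c" is shared by "a" and "b".
dag = {"a": ["b", "c"], "b": ["c"], "c": []}
shared = post_order("a", lambda n: dag[n])
```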

02 Apr, 2020 5 commits
- [REFACTOR][TIR] Migrate low-level pass functions to Pass Manager. (#5213) · 44bffdb3
```
- Migrate LowerTVMBuiltin
- Migrate InferFragment, LowerThreadAllreduce
- Migrate ThreadSync
- Refactor target::Build to directly take IRModule.
- Remove un-used legacy functions.
```
  Tianqi Chen committed 4 years ago
  44bffdb3 Browse Directory
- [TIR] Introduce BufferLoad/Store (#5205) · 88d2f34b
```
Co-authored-by: Siyuan Feng <hzfengsy@sjtu.edu.cn>

This PR introduces BufferLoad/Store to TIR. The new nodes will replace
Provide and Call with Tensor arguments in the subsequent refactors.
```
  Tianqi Chen committed 4 years ago
  88d2f34b Browse Directory
- [TIR][PASS] dtype rewrite for indexing variables (#5092) · 4e5c5843
  Haozheng Fan committed 4 years ago
  
  4e5c5843 Browse Directory
- [REFACTOR][IR] kExternalSymbol -> kGlobalSymbol (#5211) · d2f9af78
```
* expose runtime::String to Python

* kExternalSymbol -> kGlobalSymbol
```
  Zhi committed 4 years ago
  d2f9af78 Browse Directory
- [Debug] Add Dump function for Object type (NFC) (#5207) · 15b1751c
```
Signed-off-by: Wei Pan <weip@nvidia.com>
```
  Wei Pan committed 4 years ago
  15b1751c Browse Directory