- 23 Apr, 2020 3 commits
Signed-off-by: Wei Pan <weip@nvidia.com>
Wei Pan committed
* [RUNTIME][CONTRIB] CoreML Runtime * fix lint * fix CI * use xcrun to compile coreml model
MORITA Kazutaka committed
This PR removes ir_pass (old-style pass functions) in favor of analysis/transform (the new-style pass manager).
Tianqi Chen committed
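For readers unfamiliar with the new-style passes this commit switches to, here is a minimal sketch of the module-level API; it assumes tvm.lower returns an IRModule of PrimFuncs on this revision.

```python
import tvm
from tvm import te

# Build a tiny TE workload and lower it to an IRModule of PrimFuncs
# (assumption: tvm.lower returns an IRModule on this revision).
n = te.var("n")
A = te.placeholder((n,), name="A")
B = te.compute((n,), lambda i: A[i] + 1.0, name="B")
s = te.create_schedule(B.op)
mod = tvm.lower(s, [A, B], name="add_one")

# New style: a pass is an object applied to the whole module,
# replacing the removed tvm.tir.ir_pass.* free functions.
mod = tvm.tir.transform.Simplify()(mod)
print(mod)
```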
- 22 Apr, 2020 3 commits
* Customize SI prefix in logging * Include unit test
Andrew Reusch committed
Substitute now takes a std::function to customize replacement behavior. Co-authored-by: Siyuan Feng <hzfengsy@sjtu.edu.cn>
Tianqi Chen committed
Samuel committed
- 21 Apr, 2020 6 commits
Tianqi Chen committed
* Fix oversight in importing tf.compat.v1 as tf. * Actually disable test for lstm in TF2.1. Since the testing framework actually uses pytest, the version check needs to be moved.
Ramana Radhakrishnan committed
* The void return type is not None/nullptr, it's VoidType or TupleType([]).
Andrew Reusch committed
Josh Fromm committed
The legacy Simplify/CanonicalSimplify are now thin wrappers around the Analyzer. This PR removes these functions and migrates every place that requires simplification to create an Analyzer explicitly. The new API encourages more Analyzer sharing and potentially enables context-aware, analyzer-based simplification.
Tianqi Chen committed
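A minimal sketch of the analyzer-centric API described above (method names assumed to match the Python bindings of this period):

```python
import tvm
from tvm import te

# One Analyzer instance is created explicitly and can be shared across many
# simplification calls, replacing the removed Simplify/CanonicalSimplify
# free functions.
analyzer = tvm.arith.Analyzer()
x = te.var("x")
print(analyzer.simplify(x + 0))        # expected: x
print(analyzer.simplify((x + 4) - 4))  # expected: x
```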
Rationale: inline is a transformation used in te to rewrite its internal expressions. It is not a formal IRModule->IRModule transform pass. Also removed the python test as the test is covered by stage.compute_inline.
Tianqi Chen committed
- 20 Apr, 2020 2 commits
- 19 Apr, 2020 3 commits
* [TIR][REFACTOR] Remove te::Tensor dependencies from TIR passes.

te::Tensor is a useful object for tensor expressions, but it brings an unnecessary reverse dependency into TIR nodes such as Provide and Realize. This PR is a first step toward removing this dependency. We will use Buffer in all the places where te::Tensor was used. The rough correspondence is:
- Provide -> BufferStore
- Realize -> BufferRealize
- HalideCall -> BufferLoad

After this change, we can now use an IRModule of PrimFuncs cleanly to represent TIR at any point of the optimizations. Buffer will serve as the abstraction in the TIR data model for representing intermediate storage and its constraints.

We still keep Realize/HalideCall and Provide as TIR nodes for now to keep the change minimal. Right after ScheduleOps, we call SchedulePostProcToPrimFunc to canonicalize the temporary IR generated by TE (which contains these nodes) into TIR. The TIR optimizations are now mostly migrated to the pass manager. Follow-up PRs are needed to migrate the remaining few passes.

* Fix dev tutorial
Tianqi Chen committed
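To make the correspondence above concrete, a hedged sketch of the Buffer-based nodes, assuming BufferLoad/BufferStore are already exposed in the Python API at this revision:

```python
import tvm
from tvm import tir

# Declare a buffer and build the Buffer-based accesses that replace the
# te::Tensor-based Provide/HalideCall nodes.
A = tir.decl_buffer((16,), "float32", name="A")
i = tir.Var("i", "int32")
load = tir.BufferLoad(A, [i])                 # read:  roughly the old HalideCall
store = tir.BufferStore(A, load + 1.0, [i])   # write: roughly the old Provide
print(store)
```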
shoubhik committed
* fix recursion in lower_warp_memory * post-order mutation
Tang, Shizhi committed
- 18 Apr, 2020 2 commits
- Migrate BoundCheckers and Simplify
- Migrate RewriteUnsafeSelect and RemoveNoOp
- Migrate UnrollLoop and StorageRewrite
- Migrate InjectDoubleBuffer and InjectVirtualThread
- Migrate LoopPartition and Vectorize
- Migrate CoProcSync, LiftAttrScope, InjectCopyIntrin

We still keep the ir_pass registrations for now. A separate PR is needed to refactor the parts before StorageFlatten.
Tianqi Chen committed
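A sketch of composing several of the migrated passes with the new pass manager; the argument-free constructors shown here are assumptions about this revision:

```python
import tvm

# The migrated TIR passes are composed with the generic pass infrastructure
# instead of being called as ir_pass free functions.
seq = tvm.transform.Sequential(
    [
        tvm.tir.transform.Simplify(),
        tvm.tir.transform.RewriteUnsafeSelect(),
        tvm.tir.transform.RemoveNoOp(),
        tvm.tir.transform.VectorizeLoop(),
    ]
)
# Applied to an IRModule of PrimFuncs, e.g. one produced by tvm.lower:
# mod = seq(mod)
```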
Zhi committed
- 17 Apr, 2020 4 commits
Samuel committed
* support extent(threadIdx.x) < warp_size in lower_warp_memory * more docs for lower_warp_memory
Tang, Shizhi committed
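A hedged, illustrative schedule for the case the first item enables: binding only 8 threads along threadIdx.x, i.e. extent(threadIdx.x) < warp_size (32). The scheduling details are assumptions, not taken from the commit.

```python
import tvm
from tvm import te

# Cache the input in "warp" scope and bind fewer threads than a warp.
m = 64
A = te.placeholder((m,), name="A")
B = te.compute((m,), lambda i: A[i] + 1.0, name="B")
s = te.create_schedule(B.op)
AA = s.cache_read(A, "warp", [B])
tx = te.thread_axis("threadIdx.x")
xo, xi = s[B].split(B.op.axis[0], factor=8)   # extent(threadIdx.x) = 8 < 32
s[B].bind(xo, te.thread_axis("blockIdx.x"))
s[B].bind(xi, tx)
s[AA].compute_at(s[B], xo)
_, yi = s[AA].split(s[AA].op.axis[0], factor=8)
s[AA].bind(yi, tx)
print(tvm.lower(s, [A, B], simple_mode=True))
```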
* [TOPI-ARM] Do not alter layout if layout is NHWC * Add test.
Animesh Jain committed
Samuel committed
- 16 Apr, 2020 2 commits
* [RELAY][BYOC] Register pattern tables from external codegens

This adds utility functions to support registering and retrieving pattern tables used by MergeComposite for external codegens.

Change-Id: I5be165a321440e48b15ff6aff4970e0c67496aaa

* Updated DNNL tests to use pattern table mechanism
* Removed pattern table standalone test
* Change reg to _op
mbaret committed
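A hedged sketch of how such a pattern table might be registered and retrieved. The utility names follow the description above, but the exact module path, decorator form, and tuple layout are assumptions, and "my_codegen" is a hypothetical target name.

```python
from tvm import relay
from tvm.relay.op.contrib.register import register_pattern_table, get_pattern_table

def make_conv_bias_pattern():
    # A composite pattern expressed as a small Relay expression.
    data = relay.var("data")
    weight = relay.var("weight")
    bias = relay.var("bias")
    conv = relay.nn.conv2d(data, weight)
    return relay.nn.bias_add(conv, bias)

@register_pattern_table("my_codegen")       # hypothetical codegen name
def my_codegen_pattern_table():
    return [("my_codegen.conv2d_bias", make_conv_bias_pattern())]

# MergeComposite can later look up the table by codegen name.
print([name for name, _ in get_pattern_table("my_codegen")])
```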
Samuel committed
- 15 Apr, 2020 6 commits
* Fix duplicate output in partitiongraph
* Add test case
* Fix test_annotated_regions with duplicate compiler_end outputs
* Revert "Fix duplicate output in partitiongraph" This reverts commit e1f8ef3f4ca5b2aaa31ace6fa968bb50e5e4d1fa.
* Prevent duplicate outputs in Tuple in PartitionGraph
* Fix lint
* Add another test case for when regions are merged, and when TupleGetItem was duplicated
* Pull GetFunctionOutput out of branch, improve description of GetFunctionOutput
* Use std::move for GetFunctionOutput. Fix typo with testcase name
* Use tvm.transform.Sequential
Trevor Morris committed
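The last item ("Use tvm.transform.Sequential") refers to chaining the partitioning passes; a hedged sketch of a typical BYOC pipeline around PartitionGraph, with "dnnl" used only as an example target:

```python
import tvm
from tvm import relay

# Typical BYOC partitioning pipeline: annotate ops for the external target,
# merge the annotated regions, then partition them into external functions
# (PartitionGraph is the pass this commit fixes for duplicate outputs).
seq = tvm.transform.Sequential(
    [
        relay.transform.AnnotateTarget("dnnl"),
        relay.transform.MergeCompilerRegions(),
        relay.transform.PartitionGraph(),
    ]
)
# mod = seq(mod)  # apply to the Relay IRModule being offloaded
```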
* [TIR] Remove ProducerConsumer and AllocateNode::new_expr

This PR removes two legacy IR parts in TIR that are deprecated.

The ProducerConsumer node only serves as a hint markup and may no longer be informative after extensive transformations in the pass pipeline. If necessary, we can add related info via AttrStmt. The new_expr field in AllocateNode is deprecated since it can simply be replaced by a LetStmt.

- Remove dependencies of passes on ProducerConsumer.
- Remove ProducerConsumer from the IR.
- Remove the deprecated fields (new_expr, free_function) from AllocateNode.

* Fix additional testcases
Tianqi Chen committed
Tianqi Chen committed
* get_valid_count updated to have correct results * speedup nms * update nms * revert back nms * recover one test for get_valid_count
Leyuan Wang committed
* [PYTORCH]take, topk op support * Ci Failure fix
Samuel committed
Tianqi Chen committed
- 14 Apr, 2020 4 commits
Previously, MakePackedAPI was in the target-independent stage, but it nevertheless requires the device_type information that is bound at a later target-dependent stage. The previous placement was due to a limitation of LoweredFunc, which could not carry buffer_map info (so functions had to be lowered right away). This is no longer the case after the unified IR refactor. This PR migrates MakePackedAPI to a target-dependent stage and removes the unnecessary BindDevice pass.
Tianqi Chen committed
* [RELAY][PYTORCH]isNan, isinf, isfinite, ceil, clamp, round ops * Review comments
Samuel committed
* MXNet swap axis * MXNet swap axis * swap axis review comment * swap axis review comment
Mahesh Ambule committed
* Fix high-low bit bug in __pack_half2 * Fix vector load * Add uint8 support for PrintVecElemLoadExpr and BroadcastNode
LiangLiu committed
- 13 Apr, 2020 4 commits
* Remove duplicated output args * address comment * fix codegen c * improve comment * VisitExprDefault_ * deduce type
Zhi committed
* [RUNTIME] Allow non-nullable ObjectRef, introduce Optional<T>.

We use ObjectRef and its sub-classes extensively throughout our codebase. Each of ObjectRef's sub-classes is nullable, meaning it can hold nullptr as its value. While in some places we need nullptr as an alternative value, the implicit support for nullptr in all ObjectRefs creates an additional burden for developers, who have to explicitly check for defined values in many places of the codebase. Moreover, from the API's point of view it is unclear whether we want a nullable object or a not-null version (in many cases we want the latter).

Borrowing existing wisdom from languages like Rust, we propose to introduce a non-nullable ObjectRef and an Optional<T> container that represents the nullable variant. To keep backward compatibility, we will start by allowing most ObjectRefs to be nullable. However, we should start to use Optional<T> as the type in places where we know nullability is a requirement. Gradually, we will move most ObjectRefs to be non-nullable and use Optional<T> in the nullable cases. Such explicitness in typing can help reduce potential problems in our codebase overall.

Changes in this PR:
- Introduce the _type_is_nullable attribute to ObjectRef
- Introduce Optional<T>
- Change String to be non-nullable
- Change the API of function->GetAttr to return Optional<T>

* Address review comments
* Upgrade all compiler flags to c++14
* Update as per review comment
Tianqi Chen committed
* [PYTORCH]Reduce_ops support added * Review comments updated * typo bug in qnn test
Samuel committed
* use funcs from prelude, pass around convert_map
* get relay input type from user ishape
* handle tuple unpack
* experimenting with static tensor array
* use prelude concat instead of cons + rev
* minor clean up
* fix layer norm conversion bug, unwrap tensor array
* add infer shape on tensor array
* pass around prelude for now
* compile worked but runtime error
* fix tensor array wrapping
* begin list dynamic test
* is_list_dynamic first version
* finish dynamic list test
* a few fix
* use shape_of function if Any is found
* improve size conversion
* working on adding free vars to loop block
* fixed inlined inner loop issue
* clean up free var handling
* add support for tensor array concat
* adding ta concat on last axis
* fix concat, but got runtime error
* disable concat on axis -1 for now
* add lstm tests
* revert unrelated change
* fix stacked bidir test
* minor fix to test
* relax tol a bit, revert dnnl change to avoid conflict
* simplify infer type, use input tensor shape rather than concat shape
* more shape fix
masahi committed
- 12 Apr, 2020 1 commit
* [Intrinsic] Add log1p, ldexp, atan2, hypot, nextafter, copysign * Lint
Junru Shao committed
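A small sketch using the newly added intrinsics from Python, assuming they are exposed under tvm.tir on this revision:

```python
import tvm
from tvm import te, tir

# Float variables to feed the new math intrinsics when building expressions.
x = te.var("x", dtype="float32")
y = te.var("y", dtype="float32")
print(tir.log1p(x))
print(tir.ldexp(x, y))
print(tir.atan2(y, x))
print(tir.hypot(x, y))
print(tir.nextafter(x, y))
print(tir.copysign(x, y))
```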