- 11 Nov, 2019 6 commits
* [TF][Relay][Op] Pass module when infer shape
* Fix lint
* Improve style
* Add test
Wei Chen committed
Previously runtime::Module was implemented using shared_ptr. This PR refactors the codebase to use the Object protocol instead. It opens the door to easier interoperation between Object containers and Module in the future.
Tianqi Chen committed
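For readers unfamiliar with the pattern, here is a minimal, self-contained sketch of what moving from shared_ptr ownership to an Object-protocol-style reference class looks like. All names are simplified stand-ins for illustration, not TVM's actual declarations (the real Object protocol also carries type indices, deleters, and more):

```cpp
#include <atomic>
#include <iostream>
#include <string>
#include <utility>

// Simplified stand-in for TVM's Object base: intrusive reference counting.
class Object {
 public:
  virtual ~Object() = default;
  void IncRef() { ref_count_.fetch_add(1, std::memory_order_relaxed); }
  void DecRef() {
    if (ref_count_.fetch_sub(1, std::memory_order_acq_rel) == 1) delete this;
  }
 private:
  std::atomic<int> ref_count_{0};
};

// Simplified stand-in for ObjectRef: a handle that shares ownership.
class ObjectRef {
 public:
  ObjectRef() = default;
  explicit ObjectRef(Object* data) : data_(data) { if (data_) data_->IncRef(); }
  ObjectRef(const ObjectRef& other) : data_(other.data_) { if (data_) data_->IncRef(); }
  ObjectRef& operator=(const ObjectRef& other) {
    if (this != &other) {
      if (other.data_) other.data_->IncRef();
      if (data_) data_->DecRef();
      data_ = other.data_;
    }
    return *this;
  }
  ~ObjectRef() { if (data_) data_->DecRef(); }
 protected:
  Object* data_{nullptr};
};

// The module implementation now derives from Object instead of being
// owned through std::shared_ptr.
class ModuleNode : public Object {
 public:
  explicit ModuleNode(std::string name) : name_(std::move(name)) {}
  const std::string& name() const { return name_; }
 private:
  std::string name_;
};

// The user-facing handle is a reference class over ModuleNode.
class Module : public ObjectRef {
 public:
  explicit Module(ModuleNode* node) : ObjectRef(node) {}
  ModuleNode* operator->() const { return static_cast<ModuleNode*>(data_); }
};

int main() {
  Module mod(new ModuleNode("llvm"));
  Module copy = mod;  // shares ownership via the intrusive ref count
  std::cout << copy->name() << "\n";
  return 0;
}
```

The practical benefit named in the commit is uniformity: once Module is a reference over an Object, it can flow through the same containers and dispatch machinery as every other object.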
The test case was removed in #4181 for some reason. @tqchen @soiferj @zhiics
Yong Wu committed
* Fix tf reshape
* Fix test
* Fix pylint
* Fix pylint
Yao Wang committed
* Add pass manager tutorial
* Fix some examples
* Retrigger ci
* Update tutorials/dev/relay_pass_infra.py
Co-Authored-By: 雾雨魔理沙 <lolisa@marisa.moe>
* Add ToANormalForm link
Zhi committed
Animesh Jain committed
- 10 Nov, 2019 5 commits
- 09 Nov, 2019 1 commit
* Add Auto TensorCore Unit Test
* Rebase to tvm master branch & add auto tensor core
* Code refine
* Add tensor core switch by pragma
* Add pragma in tensor core example code
* Get real tile size to replace hard-coded 16
* Support more than 2 dimensions (e.g. batch matmul) for buffer bind scope
* Support batch matmul
* Move cuda env check to tensor_core.cc
* Code refine for tensor_core.cc
* Refine comments
* Some refinements of code and comments
* Update TensorCore UT to pass the CPU test
* Remove redundant code
* Matmul's storage align for different layouts
* Add support for different positions of type cast
* Add formal tutorial for auto tensorcore codegen
* Move tensorcore check up to tutorial code
* Code and doc refine
* Comment out tune_and_evaluate in tutorial
* Fix cpplint error
Minmin Sun (孙敏敏) committed
- 08 Nov, 2019 2 commits
Fix the android_rpc compilation failure.
peike committed
* fix_winograd_cuda_kernel_size
* Add unit test
Cody Hao Yu committed
- 07 Nov, 2019 2 commits
Jon Soifer committed
* Batch matmul tuning running, but with errors
* Default x86 schedule as good as before
* Code cleanup
* Remove unused argument
* Improved template documentation
* Silly lint fix
* Removed leftover comment
* Moved cfg declaration to schedule for batch_matmul
* Moved x86 dense cfg declaration to schedule
* Lint fix
* Removed duplicate cfg declaration in dense
* Reverted changes to dense
Josh Fromm committed
- 06 Nov, 2019 4 commits
* Fix winograd
* Move get padding after kernel transform
Cody Hao Yu committed
* [Contrib] Fix error message at callback_get_section_size()
* Trigger notification
Neo Chien committed
* Update TensorUtil.scala
* Update test_vta_insn.py
Liangfu Chen committed
Tianqi Chen committed
- 05 Nov, 2019 2 commits
zhuochen committed
LLVM 8 will crash when loading the bitcodes. This has to be a runtime check, because the file is compiled in even when USE_ROCM is OFF in the configuration, if ROCM is installed in the default location. Fixes: #4087
Thomas Viehmann committed
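As a sketch of what a runtime (rather than configure-time) guard like this can look like: the cutoff version, function name, error handling, and bitcode path below are assumptions for illustration; only the rationale (the file is compiled in regardless of USE_ROCM) comes from the commit.

```cpp
#include <cstdlib>
#include <iostream>

// Stand-in for the LLVM version TVM was built against (TVM defines a
// TVM_LLVM_VERSION macro; defaulting it here is just for this sketch).
#ifndef TVM_LLVM_VERSION
#define TVM_LLVM_VERSION 80  // 80 == LLVM 8.0
#endif

// The ROCm codegen file is compiled in even with USE_ROCM=OFF (when ROCm
// sits in its default location), so the guard must fire at runtime, when
// the ROCm path is actually taken -- not at configure time.
void LoadRocmBitcode(const char* path) {
  if (TVM_LLVM_VERSION < 90) {  // assumed cutoff: LLVM 8 crashes, per the commit
    std::cerr << "Loading ROCm bitcodes crashes LLVM 8; "
                 "please build TVM against a newer LLVM.\n";
    std::exit(1);
  }
  std::cout << "loading bitcode from " << path << "\n";
  // ... real bitcode loading would go here ...
}

int main() {
  LoadRocmBitcode("/opt/rocm/lib/ocml.amdgcn.bc");  // illustrative path
  return 0;
}
```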
- 04 Nov, 2019 4 commits
Tianqi Chen committed
* Add StopGradient. Add batch_dims attr to ignore list for GatherV2
* Trigger CI
Trevor Morris committed
Kim committed
XFPlus committed
- 02 Nov, 2019 2 commits
* [VTA] Performance optimization: remove unnecessary contiguous memory use.
Issue: Uop maintains a cache vector used to copy uop data into contiguous DRAM memory for FPGA/simulator use, but this cache vector does not get cleared after the FPGA/simulator core runs. In the Resnet18 case, if we print the cache size in the UopQueue::ReadBarrier function, we can see it keep increasing, causing useless data copies and unnecessary contiguous DRAM memory allocation.
Analysis: The issue is caused by not clearing the cache_ vector in uop_queue_.Reset().
Solution: Override the BaseQueue Reset function in UopQueue and add cache_-clearing logic.
* Address review comments; remove spacing.
Hua Jiang committed
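To make the fix concrete, here is a minimal sketch of the pattern the commit describes, with simplified stand-in classes (the real VTA BaseQueue/UopQueue carry far more state than this):

```cpp
#include <cstdint>
#include <iostream>
#include <vector>

// Simplified stand-in for a VTA micro-op record.
struct VTAUop {
  uint32_t dst_idx;
  uint32_t src_idx;
};

// Simplified stand-in for VTA's BaseQueue.
class BaseQueue {
 public:
  virtual ~BaseQueue() = default;
  // Reset the queue state between runs.
  virtual void Reset() { dram_buffer_.clear(); }
 protected:
  std::vector<VTAUop> dram_buffer_;
};

// Simplified stand-in for UopQueue: keeps an extra cache of uops that is
// copied into contiguous DRAM before the core runs.
class UopQueue : public BaseQueue {
 public:
  void Push(const VTAUop& uop) { cache_.push_back(uop); }
  void ReadBarrier() {
    // Copy cached uops into the contiguous DRAM buffer.
    dram_buffer_.insert(dram_buffer_.end(), cache_.begin(), cache_.end());
    std::cout << "cache size: " << cache_.size() << "\n";
  }
  // The fix: override Reset so the cache is cleared along with the base
  // queue state; otherwise cache_ grows run after run.
  void Reset() override {
    BaseQueue::Reset();
    cache_.clear();
  }
 private:
  std::vector<VTAUop> cache_;
};

int main() {
  UopQueue q;
  for (int run = 0; run < 3; ++run) {
    q.Push({0, 1});
    q.ReadBarrier();  // with the fix, prints "cache size: 1" on every run
    q.Reset();
  }
  return 0;
}
```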
* Support reshape for dynamic shape in tf converter
* Only allow reshape directly after shape function for symbolic input shape
* Fix lint
Yao Wang committed
- 01 Nov, 2019 7 commits
* [NODE][REFACTOR] Rename IRFunctor->NodeFunctor, use function pointers for dispatching.
Previously we used std::function for the functor dispatching. It introduces additional overhead and problems during DLL destruction (of std::function). This PR changes the std::function to function pointers. This adds some restrictions around set_dispatch that we can work around, but improves general efficiency by removing one level of indirection from the std::function. We also no longer need special macros to register functions to the Functor.
Tianqi Chen committed
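A minimal sketch of the dispatch pattern this describes, with a simplified node hierarchy and illustrative names (TVM's actual NodeFunctor differs in detail): the functor keeps a table of plain function pointers indexed by a per-type index, so a call is one array load plus an indirect call, with no std::function allocation or DLL-teardown hazards.

```cpp
#include <cstdint>
#include <iostream>
#include <string>
#include <vector>

// Simplified node base: each concrete node type owns a unique type index.
struct Node {
  virtual ~Node() = default;
  virtual uint32_t type_index() const = 0;
};

struct AddNode : Node {
  static constexpr uint32_t kTypeIndex = 0;
  uint32_t type_index() const override { return kTypeIndex; }
};

struct MulNode : Node {
  static constexpr uint32_t kTypeIndex = 1;
  uint32_t type_index() const override { return kTypeIndex; }
};

// Functor that dispatches on type index through plain function pointers.
template <typename R>
class NodeFunctor {
 public:
  using FPtr = R (*)(const Node&);  // plain function pointer, not std::function

  NodeFunctor& set_dispatch(uint32_t type_index, FPtr f) {
    if (type_index >= table_.size()) table_.resize(type_index + 1, nullptr);
    table_[type_index] = f;
    return *this;
  }

  R operator()(const Node& n) const {
    FPtr f = table_[n.type_index()];
    // A registered entry is assumed here; real code would check and error out.
    return f(n);
  }

 private:
  std::vector<FPtr> table_;  // one slot per node type index
};

static std::string PrintAdd(const Node&) { return "add"; }
static std::string PrintMul(const Node&) { return "mul"; }

int main() {
  NodeFunctor<std::string> printer;
  printer.set_dispatch(AddNode::kTypeIndex, PrintAdd)
         .set_dispatch(MulNode::kTypeIndex, PrintMul);
  std::cout << printer(AddNode{}) << " " << printer(MulNode{}) << "\n";
  return 0;
}
```

The restriction the message alludes to is that plain function pointers cannot capture state, so each handler must be a free function or a capture-less lambda.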
Jared Roesch committed
Wei Chen committed
* [Relay][Pass] Avoid FoldConstant folding some ops
* Rename
Wuwei Lin committed
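The entry does not say which ops are excluded, so the deny list below is purely hypothetical; it only illustrates the shape of such a check inside a constant-folding pass:

```cpp
#include <iostream>
#include <set>
#include <string>

// Hypothetical deny list: ops a constant-folding pass should leave alone
// even when all of their arguments are constant.
static const std::set<std::string> kSkipFolding = {"dropout", "device_copy"};

bool ShouldFold(const std::string& op_name, bool all_args_constant) {
  if (!all_args_constant) return false;   // nothing to fold
  return kSkipFolding.count(op_name) == 0;
}

int main() {
  std::cout << ShouldFold("add", true) << " "       // 1: fold
            << ShouldFold("dropout", true) << "\n";  // 0: skip
  return 0;
}
```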
Kim committed
Sergei Grechanik committed
Signed-off-by: qinqiuping <autumnqin@126.com>
autumnqin committed
- 31 Oct, 2019 5 commits
Tianqi Chen committed
Tianqi Chen committed
* [CI] Update the ci-gpu to use cuda10
* [CI] Enforce tensorcore gpu for unittest
Tianqi Chen committed
KoolKoffee committed
* [CI] Move gpu docker binary to cuda10
* Fix the gcn tutorial
Tianqi Chen committed