Commits · 3515dccccbcfe814f3e4cec14fa5afc77777e49e · wenyuanbo / tic

31 Jul, 2019 7 commits

[DOC] Update ssd doc to avoid confusion. (#3677) · 3515dccc

* intel graphics conv2d bugs fixed for inception_v3

* intel conv2d api updated, nn input size 4 condition added

* review addressed

* move conv_tags to attributes

* ssd doc updated

* address comment

committed Jul 31, 2019


[Relay][VM] Relay VM serialization (#3647) · 90455121
```
* relay vm serialization

* fix lint

* load params, fix stream

* lint

* fix typo
```
Zhi committed Jul 31, 2019
[TEST] Comptiable with python3.5 (#3675) · 0365e50a
lixiaoquan committed Jul 31, 2019

[TOPI][CUDA] schedule for group_conv2d (#3663) · 11da1ca3
```
* [TOPI][CUDA] schedule for group_conv2d

* Fix #flops
```
Wuwei Lin committed Jul 31, 2019

[VTA] VTA Compilation Script for Intel FPGA (#3494) · 83591aa5

* initial compilation script for chisel-vta;

* replace tabs with spaces;

* compile script for de10-nano;

* remove generated verilog source code;

* remove `altsource_probe`, `debounce`, `edge_detect` ip;

* replace quartus project files with a single tcl script;

* Update install.md

* improved makefile-based compilation script;

* complete makefile-based compilation of chisel-vta for de10-nano;

* install quartus;

* conversion to .rbf file;

* document chisel-vta compilation process for de10-nano;

* rename generated bitstream file;

* download and extract custom ip for de10-nano;

* minor change

* minor change

* fix indentation;

* bug fix;

* improved robustness in makefile;

* clean up;

* add `.sdc .ipx .qsys` allowance in jenkins;

* add ASF header;

* add ASF header;

* remove IntelShell.scala, update vta_hw.tcl, clean up Makefile & soc_system.qsys;

* add ASF header;

* keep sources compact;

* keep sources compact;

* it's not necessary now

* AXI4LiteClient -> AXI3Client for IntelShell

* remove connection to fpga_only_master;

* a few important bug fix: wire reset pin, and set host_r_last to high

* remove intel specific interface definition;

* add NO_DSP option in Makefile;

* AXI4Lite is not used in IntelShell;

* minor fix: disable dsp and use logic instead;

* quartus version change: 18.0 -> 18.1

* remove altera related statement;

* compose compile_design.tcl

* initial tcl script for soc_system generation;

* remove .qsys file;

* remove unused;

* .qsys can be generated by tcl script;

* remove hps_io and shrink size of soc_system;

* integrate into makefile;

* version change: 18.0 -> 18.1

* add sample config file for de10-nano;

* parameterize DEVICE and PROJECT_NAME

* remove extra lines;

* brief description on flashing sd card image for de10-nano

* docs on building additional components

* parameterize DEVICE and DEVICE_FAMILY

* parameterize DEVICE and DEVICE_FAMILY

* parameterize DEVICE and DEVICE_FAMILY

* de10-nano -> de10nano

* minor change

* add comment in code and document in order to address review comments;

committed Jul 31, 2019


Add yolov3-tiny to the tutorial. (#3674) · 5968eef2
Balint Cristian committed Jul 31, 2019

add reviewer - slyubomirsky (#3673) · f0b4c46f
Haichen Shen committed Jul 30, 2019


30 Jul, 2019 10 commits
- [RPC] Terminate worker's childs first. (#3669) · e099a6e1
  Balint Cristian committed Jul 30, 2019
  
- [VTA] Support for batched inference (#3661) · 6c7f0c4d
```
* fix in IR pass to support padding on 6-d tensors

* support for both N>1 and N==1 for padding

* batch size > 1 tuning and base config

* output formatting

* batch conv2d

* print all category results

* revert to single-batch config

* pick record best

* fix conv test

* improving reporting

* address batching bug in fast simulator

* fix
```
  Thierry Moreau committed Jul 30, 2019
- removing deprecated script (#3667) · 9b355fc3
  Thierry Moreau committed Jul 30, 2019
  
- [TOPI] Enable standalone wheel build (#3657) · 4ce93200
```
* Fixed topi bdist_wheel build to include libraries.

* Removed unneeded imports
```
  Josh Fromm committed Jul 30, 2019
- [TOPI] Fix traverse function not inline zero-input op (#3623) · 9d583cf5
```
* Fix traverse_inline not inline zero input op properly

* Add where to python and set tag to broadcast

* Fix inline

* test

* fix test target

* fix
```
  Wuwei Lin committed Jul 30, 2019
- ROCm: Add SaveToFile and LoadFile (#3665) · d4a51751
```
...and add rocm module_save to the tests.
```
  Thomas Viehmann committed Jul 30, 2019
- tvm/contrib/rocm: improve finding of ld.lld (#3664) · 0cecd037
```
This refines the detection of ld.lld matching the neighbouring clang
file. This is particularly helpful on Ubuntu/Debian when either the
default ld.lld is not installed or the versioned one is preferable for
consistency.

@tqchen I think you last touched the clang equivalent in #3590 .
```
  Thomas Viehmann committed Jul 30, 2019
- Print llvm source by default in ROCMModuleNode::GetSource (#3662) · 52b63b9f
  Thomas Viehmann committed Jul 30, 2019
  
- [Relay] Fix typo in ChangeBatch (#3660) · f1d43378
  雾雨魔理沙 committed Jul 29, 2019
  
- [Relay][VTA] Add ChangeBatch pass (#3656) · 16fefd89
```
* init

* lint

* lint
```
  雾雨魔理沙 committed Jul 29, 2019
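Note on #3664 above: the commit body describes detecting an `ld.lld` that matches the neighbouring clang binary, preferring a versioned linker where one exists. A minimal sketch of that idea follows. The function names, the version-suffix pattern, and the fallback order are assumptions for illustration and are not taken from `tvm/contrib/rocm`.

```python
import os
import re


def lld_candidates(clang_path):
    """List ld.lld paths next to a clang binary, versioned name first.

    Hypothetical helper: assumes clang is named like 'clang' or 'clang-8'
    and that a matching linker would be named 'ld.lld' or 'ld.lld-8'.
    """
    directory, name = os.path.split(clang_path)
    candidates = []
    match = re.fullmatch(r"clang(-\d+(?:\.\d+)*)?", name)
    if match and match.group(1):
        # Versioned clang: try the linker with the same suffix first.
        candidates.append(os.path.join(directory, "ld.lld" + match.group(1)))
    # Always fall back to the unversioned linker in the same directory.
    candidates.append(os.path.join(directory, "ld.lld"))
    return candidates


def pick_lld(clang_path, exists=os.path.isfile):
    """Return the first candidate that exists, or None if none do."""
    for candidate in lld_candidates(clang_path):
        if exists(candidate):
            return candidate
    return None
```

The `exists` parameter is injectable so the lookup can be exercised without touching the real filesystem.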
29 Jul, 2019 3 commits

[VTA] [Chisel] make dram offset configurable for uops different than 4-bytes (#3654) · a88b2842
Luis Vega committed Jul 29, 2019

[VTA] [CMake] hotfix tsim rules (#3650) · 6970fc30
Luis Vega committed Jul 29, 2019


[VTA] Refactor to increase platform coverage (Ultra96 etc.) (#3496) · f55609b4

* hardware refactor for increased FPGA coverage, small optimizations

* fix header

* cleaning up parameters that won't be needed for now

* streamlining makefile, and simplifying tcl scripts

* moving parameter derivation into pkg_config.py, keeping tcl scripts lightweight

* refactoring tcl script to avoid global variables

* deriving AXI signals in pkg_config.py

* unifying address map definition for hardware and software drivers

* single channel design for ultra96 to simplify build

* enable alu by default, no mul opcode for now

* hardware fix

* new bitstream; vta version

* avoid error when env variable is not set

* ultra96 cleanup

* further cleaning up tcl script for bitstream generation

* preliminary rpc server support on ultra96

* rpc server tracker scripts

* ultra96 ldflag

* ultra96 support

* ultra96 support

* cleanup line

* cmake support for ultra96

* simplify memory instantiation

* cleaning up IP parameter initialization

* fix queue instantiation

* 2019.1 transition

* fix macro def

* removing bus width from config

* cleanup

* fix

* turning off testing for now

* cleanup ultra96 ps insantiation

* minor refactor

* adding comments

* upgrading to tophub v0.6

* model used in TVM target now refers to a specific version of VTA for better autoTVM scheduling

* revert change due to bug

* rename driver files to be for zynq-type devices

* streamlining address mapping

* unifying register map offset values between driver and hardware generator

* rely on cma library for cache flush/invalidation

* coherence management

* not make buffer packing depend on data types that can be wider than 64bits

* refactor config derivation to minimize free parameters

* fix environment/pkg config interaction

* adding cfg dump property to pkgconfig:

* fix rpc reconfig

* fix spacing

* cleanup

* fix spacing

* long line fix

* fix spacing and lint

* fix line length

* cmake fix

* environment fix

* renaming after pynq since the driver stack relies on the pynq library - see pynq.io

* update doc

* adding parameterization to  name

* space

* removing reg width

* vta RPC

* update doc on how to edit vta_config.json

* fix path

* fix path

committed Jul 28, 2019


28 Jul, 2019 3 commits
- fix comment/doc in TensorLoad (#3646) · bca8ac17
  Luis Vega committed Jul 28, 2019
  
- Hotfix for issue #3641. (#3644) · 026162ad
  Balint Cristian committed Jul 28, 2019
  
- fix case when offset is odd and size is even (#3643) · 9a542e37
  Luis Vega committed Jul 28, 2019
  
27 Jul, 2019 3 commits
- [VTA] [Chisel] fix tensor issue/commit in gemm (#3637) · da40645f
```
* fix tensor issue/commit in gemm

* remove trailing spaces
```
  Luis Vega committed Jul 27, 2019
- [Relay][TF] add BatchMatMul (#3634) · 786c49f3
  Yong Wu committed Jul 27, 2019
  
- Improve the x86 auto-tune tutorial (#3609) · 18d0ad31
  peterjc123 committed Jul 27, 2019
  
26 Jul, 2019 6 commits

Update tensorflow.py (#3632) · fbe42c26
YPBlib committed Jul 26, 2019

Make Google Test usage configurable in CMake files (#3628) · f5464ce2
```
* Add USE_GTEST as a CMake variable

* Add GTest section in installation docs

* Incorporate feedback
```
Logan Weber committed Jul 26, 2019
[TensorFlow] Fix a bug output index is ignored (#3631) · c1376a40
```
Enhance test to cover this case
```
lixiaoquan committed Jul 26, 2019
[TOPI][CUDA] Schedule for pool_grad (#3622) · f1ede9a9
```
* [TOPI][CUDA] Schedule for pool_grad

* Relay test

* Fix fused op

* doc

* Remove set scope local
```
Wuwei Lin committed Jul 26, 2019
[Relay] [Training] Add numerical gradient check. (#3630) · 8e0aaa29
```
* add check_grad

* finish

* what does the fox say?

* lint lint lint lint lint lint lint lint lint
```
雾雨魔理沙 committed Jul 26, 2019
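
Note on #3630 above: a numerical gradient check compares an analytically derived gradient against a finite-difference estimate. The sketch below shows the general technique in plain Python using central differences. It is not the Relay implementation; the helper names, the step size `eps`, and the tolerance `tol` are assumptions chosen for illustration.

```python
def numerical_grad(f, xs, eps=1e-6):
    """Estimate df/dx_i for each input with central differences."""
    grads = []
    for i in range(len(xs)):
        hi = list(xs)
        lo = list(xs)
        hi[i] += eps  # nudge only the i-th input up
        lo[i] -= eps  # and down
        grads.append((f(hi) - f(lo)) / (2 * eps))
    return grads


def grads_agree(f, analytic_grad, xs, eps=1e-6, tol=1e-4):
    """Return True if analytic_grad(xs) matches the numerical estimate.

    The tolerance is relative for large gradients and absolute near zero.
    """
    expected = numerical_grad(f, xs, eps)
    actual = analytic_grad(xs)
    if len(actual) != len(expected):
        return False
    return all(abs(a - e) <= tol * max(1.0, abs(e))
               for a, e in zip(actual, expected))
```

For example, `grads_agree(lambda v: v[0] ** 2, lambda v: [2 * v[0]], [3.0])` checks the derivative of a square against its closed form.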

[VTA] [Chisel] support for different inp/wgt bits, rewrote DotProduct for clarity (#3605) · 87e18a44

* support for different inp/wgt bits, rewrote dot for clarity

* [VTA] [Chisel] support for different inp/wgt bits, rewrote DotProduct for clarity

* [VTA] [Chisel] support for different inp/wgt bits, rewrote DotProduct for clarity

* change back to sim

* fix index

* fix index

* fix indent

* fix indent

* fix indent

* fix trailing spaces

* fix trailing spaces

* change to more descriptive name

* matric->matrix

* fix spacing

* fix spacing & added generic name for dot

* better parameter flow

* spacing

* spacing

* spacing

* update requirement (tested) for dot, spacing

* function call convention

* small edit

committed Jul 25, 2019


25 Jul, 2019 7 commits

[IR] Make iterators compatible with constructors of STL containers (#3624) · 0858c5ad
Lianmin Zheng committed Jul 25, 2019

Add Winograd matrices computation. (#3553) · 97e333ca
Balint Cristian committed Jul 26, 2019


Implementation of uTVM (#3227) · ef909df1

* uTVM interfaces (#14)

* some minor interface changes

* implemented HostLowLevelDevice

* added MicroDeviceAPI

* implemented micro_common and added Python interfaces

* current status, semi implemented micro session

* added micro_common implementation and python interfaces (#18)

* added micro_common implementation and python interfaces (#18)

* current status, semi implemented

* host test working

* updated interfaces for MicroSession arguments allocation

* make somewhat lint compatible

* fix based on comments

* added rounding macro

* fix minor bug

* improvements based on comments

* Clean up `binutil.py` and make Python-3-compatible

* Change argument allocation design

* Address feedback and lint errors

* Improve binutil tests

* Simplify allocator (per @tqchen's suggestions)

* Doc/style fixes

* farts

* mcgee

* rodata section werks

(and so does `test_runtime_micro_workspace.py`)

* simple graph runtime werk

* TEMP

* ResNet works, yo

* First round of cleanup

* More cleanup

* runs a dyson over the code

* Another pass

* Fix `make lint` issues

* ready to pr... probably

* final

* Undo change

* Fix rebase resolution

* Minor fixes

* Undo changes to C codegen tests

* Add `obj_path` in `create_micro_lib`

* TEMP

* Address feedback

* Add missing TODO

* Partially address feedback

* Fix headers

* Switch to enum class for `SectionKind`

* Add missing ASF header

* Fix lint

* Fix lint again

* Fix lint

* Kill lint warnings

* Address feedback

* Change Python interface to MicroTVM

All interaction with the device is now through `Session` objects, which
are used through Python's `with` blocks.

* Reorder LowLevelDevice interface

* Store shared ptr to session in all alloced objects

* Move helper functions out of `tvm.micro`

* Switch static char arr to vector

* Improve general infra and code quality

Does not yet address all of tqchen's feedback

* Forgot a rename

* Fix lint

* Add ASF header

* Fix lint

* Partially address MarisaKirisame's feedback

* Lint

* Expose `MicroSession` as a node to Python

* Revert to using `Session` constructor

* Fix compiler error

* (Maybe) fix CI error

* Debugging

* Remove

* Quell lint

* Switch to stack-based session contexts

* Make uTVM less intrusive to host codegen

And use SSA for operands of generated ternary operators

* Inline UTVMArgs into UTVMTask struct

* Remove `HostLowLevelDevice` header

* Remove `BaseAddr` class

* Address feedback

* Add "utvm" prefix to global vars in runtime

* Fix lint

* Fix CI

* Fix `test_binutil.py`

* Fix submodules

* Remove ResNet tests

* Make `test_binutil.py` work with nose

* Fix CI

* I swear this actually fixes the binutil tests

* lint

* lint

* Add fcompile-compatible cross-compile func

* Add docs for uTVM runtime files

* Move pointer patching into `MicroSession`

* Fix lint

* First attempt at unifying cross-compile APIs

* Fix lint

* Rename `cross_compile` back to `cc`

* Address feedback

* Remove commented code

* Lint

* Figure out failing function

* Remove debugging code

* Change "micro_dev" target to "micro"

* Add checks in tests for whether uTVM is enabled

* Add TODO for 32-bit support

* Rename more "micro_dev" to "micro"

* Undo rename

We already have `tvm.micro` as a namespace.  Can't have it as a method
as well.

* Fix failing CI

Thanks to @tqchen for finding this bug.  Emitting ternary operators for
`min` and `max` causes concurrency bugs in CUDA, so we're moving the
ternary op emissions from `CodeGenC` to `CodeGenCHost`.

* Address feedback

* Fix lint

committed Jul 25, 2019
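
Note on #3227 above: the commit body says device interaction goes through `Session` objects used in Python `with` blocks, and a later bullet switches to stack-based session contexts. The sketch below illustrates that general pattern only. The `Session` class here is a hypothetical stand-in and does not reproduce the `tvm.micro` API, its constructor arguments, or its device handling.

```python
class Session:
    """Hypothetical session whose active instance is tracked on a stack."""

    _stack = []  # innermost active session is the last element

    def __init__(self, name):
        self.name = name

    def __enter__(self):
        # Entering a `with` block makes this session the current one.
        Session._stack.append(self)
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Leaving the block restores whichever session was active before.
        Session._stack.pop()
        return False  # do not swallow exceptions

    @classmethod
    def current(cls):
        """Return the innermost active session."""
        if not cls._stack:
            raise RuntimeError("no active session")
        return cls._stack[-1]


def active_session_name():
    """Example of code that looks up the session implicitly."""
    return Session.current().name
```

Keeping the sessions on a stack lets nested `with` blocks restore the outer session on exit, which a single global variable would not do.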


Add a missing header in cuda_device_api.cc (#3621) · 443d023b
Philip Hyunsu Cho committed Jul 24, 2019

[Relay][Keras] Permute, Softmax support (#3618) · dedcf82f
Yong Wu committed Jul 24, 2019

fix typo (#3611) · e7fb2d4d
Jian Weng committed Jul 24, 2019

[TOPI] Average Pool2D Bug. (#3607) · 3aa2eaed
```
* [TOPI] Average Pool2D Bug.

Issue - https://github.com/dmlc/tvm/issues/3581

* Add uint16 test.
```
Animesh Jain committed Jul 24, 2019

24 Jul, 2019 1 commit
- Remove prints in `generic_op_impl.py` (#3616) · e0df6e12
  Logan Weber committed Jul 24, 2019
  