Commits · b8efe27feb022e44c620ae9df00c38bf6cc5da05 · wenyuanbo / tic

09 Mar, 2020 1 commit

[VTA][Chisel,de10nano] Chisel fixes and de10nano support (#4986) · 5b4cf5df

* [VTA][de10nano] Enable user defined target frequency.

Issue:
The VTA target frequency on the DE10-Nano is hardcoded to 50MHz,
unnecessarily limiting performance.

Solution:
Add a PLL to the FPGA sub-system along with support for the
selection of a user specified frequency at build time. The board
successfully builds and runs at 100MHz.

* Added a PLL in the soc_system.tcl platform designer generator
  script.

* Modified the Makefile to automatically set the target frequency
  from that specified in the pkg_config.py file.

* Modified the Makefile to generate a bitstream with an RBF
  format that enables programming of the FPGA directly from
  the on-board processor. Specifically, the RBF is generated in
  FastParallel32 mode with compression, which corresponds to the
  default MSEL switch setting on the board, i.e. 01010.

* Added a false path override to file set_clocks.sdc to turn off
  unconstrained path warnings on the VTA pulse LED.

* [VTA][TSIM] Add more debug and tracing options.

* Modified Makefile to change default config to DefaultDe10Config.

* Added option in Makefile to produce more detailed tracing
  for extra observability in debugging complex scenarios.

* Added option in Makefile to produce traces in FST format, which
  are two orders of magnitude smaller, although much slower to
  generate.

* Added option in Makefile to build the simulator with GCC address
  sanitizer.

* Modified Makefile to not lint the Scala code by default, avoiding
  unintended wrong indentation. Linting is better performed
  manually, as needed.

* [VTA][de10nano] Enable remote programming of FPGA.

Issue:
The Cyclone V FPGA on board the DE10-Nano can only be programmed
using the JTAG port, which is a limiting option for users.

Solution:
Add support for remote programming of the FPGA by implementing
the FPGA programming manager protocol published in the Cyclone V
user manual.

* Added file de10nano_mgr.h implementing an FPGA manager class
  that supports handling of control and status registers as well
  as a push-button option to program the FPGA. The class can be
  easily extended to include more registers if needed.

* Used an instance of the FPGA manager to implement function
  VTAProgram, which also warns users when incompatible bitstream
  files are used.

* Registered VTAProgram as a global function and modified
  the program_bitstream python class to use it.

* [VTA][de10nano] Enhance de10nano runtime support.

Issue:
The de10nano target has incomplete, non-working support
for runtime reconfiguration, bitstream programming, and
examples of usage.

Solution:
Complete runtime support for the de10nano target.

* Modified VTA.cmake to comment out a default override that set
  VTA_MAX_XFER to 21 bits wide.

* Modified VTA.cmake to add needed de10nano include dirs.

* Modified relevant files to support de10nano the same way as
  other targets for VTA runtime reconfiguration and FPGA
  programming.

* Added test_program_rpc.py as a runtime FPGA programming
  example. Note that, unlike the pynq target, no bitstream
  is downloaded or programmed when the bitstream argument
  is set to None.

* Cosmetic changes to vta config files.

* [VTA][Chisel] LoadUop FSM bug fix.

Issue:
The LoadUop FSM incorrectly advances the address of the next
uop to read from DRAM when the DRAM data valid bit is deasserted
and asserted at the end of a read. This is caused by a mismatch
in the logic of the state and output portions of the FSM.
This is one of two issues that were gating the correct operation
of VTA on the DE10-Nano target.

Solution:
Modify the logic of the output section of the FSM to include
a check on the DRAM read valid bit or fold the output assignment
into the state section.

* Folded the assignment of the next uop address into the state
  section of the FSM.
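
The mismatch described above can be illustrated with a toy cycle-level model. This is only a sketch in Python, not the Chisel code from the PR: the function name, the beat counter, and the trace encoding are all hypothetical. With `guarded=False` the address update ignores the valid bit, as in the described bug; with `guarded=True` the update is folded into the valid-gated state transition.

```python
def final_uop_addr(valid_trace, beats_per_read, stride, guarded):
    """Toy model of a read FSM; returns the next-read address after the trace.

    valid_trace: per-cycle DRAM data valid bits (1 or 0).
    guarded=False: the address advances whenever the FSM sits on the last
                   beat, even if valid is low (models the described bug).
    guarded=True:  the address advances only inside the valid-gated state
                   transition (models the folded fix).
    All names here are hypothetical and chosen for illustration.
    """
    addr = 0
    beat = 0
    for valid in valid_trace:
        last_beat = beat == beats_per_read - 1
        if not guarded and last_beat:
            # Output section without a valid check: fires on stalled cycles too.
            addr += stride
        if valid:
            if last_beat:
                if guarded:
                    # Address update folded into the state transition.
                    addr += stride
                beat = 0
            else:
                beat += 1
    return addr


# A one-cycle stall on the last beat makes the unguarded model skip ahead.
stalled = [1, 1, 0, 1]
print(final_uop_addr(stalled, beats_per_read=3, stride=8, guarded=False))  # 16
print(final_uop_addr(stalled, beats_per_read=3, stride=8, guarded=True))   # 8
```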

* [VTA][Chisel] Dynamically adjust DMA transfer size.

Issue:
In the DE10-Nano target and possibly in others, DMA transfers that
cross the boundaries of memory pages result in incorrect reads and
writes from and to DRAM. When this happens, depending on the
input values, VTA loads and stores exhibit incorrect results for
DMA pulses at the end of a transfer. This is one of two issues that
were gating the DE10-Nano target from functioning correctly, but may
affect other Chisel based targets.

Solution:
Add support for dynamically adjustable DMA transfer sizes in load
and store operations. For a more elegant and modular implementation,
the feature can be enabled at compile time with a static constant
that can be passed as a configuration option.

* Modified the load and store finite state machines to dynamically
  adjust the size of initial and stride DMA transfers. The feature
  is enabled by default by virtue of the static constant
  ADAPTIVE_DMA_XFER_ENABLE.
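
The general idea of sizing transfers so that none crosses a page boundary can be sketched as follows. This is an illustration in Python, not the load/store FSM logic from the PR: the function name, the byte-based units, and the 4096-byte page size are assumptions.

```python
def split_transfer(addr, length, max_burst, page_size=4096):
    """Split [addr, addr + length) into bursts that never cross a page boundary.

    All quantities are in bytes. page_size=4096 and the function itself are
    assumptions for illustration only.
    """
    if length < 0 or max_burst <= 0 or page_size <= 0:
        raise ValueError("length must be >= 0; max_burst and page_size must be > 0")
    bursts = []
    while length > 0:
        # Bytes remaining before the next page boundary.
        room = page_size - (addr % page_size)
        # The first burst is shortened to end at the boundary; later
        # bursts start page-aligned and use the full burst size.
        size = min(length, max_burst, room)
        bursts.append((addr, size))
        addr += size
        length -= size
    return bursts


# A 300-byte transfer starting 96 bytes before a page boundary.
print(split_transfer(4000, 300, max_burst=256))  # [(4000, 96), (4096, 204)]
```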

* [VTA][Chisel] Improve FSIM/TSIM/FPGA xref debug.

Issue:
Cross reference between FSIM, TSIM, and Chisel based FPGA traces
is an invaluable instrument that enables fast analysis on FSIM,
and analysis/debug on TSIM and FPGA, especially for complex flows
like conv2d or full inferences. Currently this cannot be done
easily since a suitable reference is missing. The clock cycle
event counter cannot be used since it is undefined in FSIM and
not reliable between TSIM and FPGA because of different latencies.

Solution:
Introduce a new event counter that preserves program order across
FSIM, TSIM, and FPGA. We propose adding the accumulator write event
counter in the Chisel EventCounter class and a simple instrumentation
in the FSIM runtime code. Note that this technique enabled finding the
Chisel issues reported in the PR, which would otherwise have been
far more difficult.

* Added the acc_wr_count event counter and changed interfaces
  accordingly.
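
As a sketch of how such a counter can be used, the snippet below aligns two traces on the accumulator write count instead of the clock cycle and reports the first mismatch. The trace layout (tuples of counter and value, with an extra cycle field on the TSIM side) and the function name are hypothetical and are not the formats produced by VTA.

```python
def first_divergence(fsim_events, tsim_events):
    """Return (acc_wr_count, fsim_value, tsim_value) at the first mismatch.

    fsim_events: iterable of (acc_wr_count, value) pairs.
    tsim_events: iterable of (cycle, acc_wr_count, value) triples.
    Cycle numbers are ignored on purpose: only acc_wr_count is assumed to
    be comparable across simulators. Returns None if no mismatch is found.
    """
    expected = dict(fsim_events)
    for _cycle, count, value in tsim_events:
        if count in expected and expected[count] != value:
            return count, expected[count], value
    return None


# Hypothetical traces: the third accumulator write differs.
fsim = [(1, 10), (2, 20), (3, 30)]
tsim = [(105, 1, 10), (230, 2, 20), (377, 3, 31)]
print(first_divergence(fsim, tsim))  # (3, 30, 31)
```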

* [VTA][de10nano] Comply with linting rules.

* [VTA] Appease make lint.

* [VTA] Disable pylint import not top level error.

* [VTA][Chisel,de10nano] Linting changes.

* Use CamelCase class names.

* Use C++ style C include header files.

* Add comments to Chisel makefile.

* [VTA][de10nano]

* Reorder C and C++ includes in de10nano_mgr.h.

* Restore lint as default target in Chisel Makefile.

* [VTA][de10nano] Do not use f string in pkg_config.py.

* [VTA][de10nano] Remove overlooked f strings in pkg_config.py.

* [VTA][de10nano] Fixed typo.

* [VTA][TSIM] Check if gcc has align-new.

* [VTA][Chisel] Make adaptive DMA transfer default.

* [VTA][RPC] Renamed VTA_PYNQ_RPC_* to VTA_RPC_*.

Issue:
With more FPGA targets coming online the initial method of
using individual environment variables to specify target IP and port
does not scale well.

Solution:
Use a single VTA_RPC_HOST, VTA_RPC_PORT pair that is changed
every time a different target is used, for instance in a script
used to benchmark all targets.

* Replaced every instance of VTA_PYNQ_RPC_HOST and VTA_PYNQ_RPC_PORT
  with VTA_RPC_HOST and VTA_RPC_PORT, respectively.
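
A minimal sketch of reading the renamed pair from the environment is shown below. The helper name and its error handling are assumptions for illustration; only the two variable names come from this commit.

```python
import os


def rpc_endpoint(env=None):
    """Return (host, port) from VTA_RPC_HOST / VTA_RPC_PORT.

    `env` defaults to os.environ; passing a plain dict makes the helper
    easy to exercise. Hypothetical helper, not part of the VTA code base.
    """
    env = os.environ if env is None else env
    missing = [k for k in ("VTA_RPC_HOST", "VTA_RPC_PORT") if k not in env]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return env["VTA_RPC_HOST"], int(env["VTA_RPC_PORT"])


# Example with an explicit mapping (the address and port are made up).
print(rpc_endpoint({"VTA_RPC_HOST": "10.0.0.5", "VTA_RPC_PORT": "9091"}))
```

A benchmarking script can then switch targets by changing only these two variables between runs.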

* [VTA][Chisel] Comply with new linter.

committed 4 years ago

27 Feb, 2020 1 commit

[REFACTOR][PY][API-CHANGE] Remove legacy python files. (#4943) · 9816efc2

* [REFACTOR][PY][API-CHANGE] Remove legacy python files.

Remove legacy python files.
Use the te namespace for most of the tensor expression primitives.

- tvm.create_schedule -> tvm.te.create_schedule
- tvm.placeholder -> tvm.te.placeholder
- tvm.compute -> tvm.te.compute
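
For code that still uses the old names, the three renames listed above can be applied mechanically. The following is a rough sketch using only the standard library; the helper is hypothetical, covers only the three names listed in this commit message, and does plain text substitution, so its output should be reviewed by hand.

```python
import re

# Old name -> new name, taken from the list in this commit message.
TE_RENAMES = {
    "tvm.create_schedule": "tvm.te.create_schedule",
    "tvm.placeholder": "tvm.te.placeholder",
    "tvm.compute": "tvm.te.compute",
}

# Match an old name only when it is not part of a longer dotted or
# identifier-like name (e.g. `foo.tvm.compute` or `tvm.compute_x`).
_PATTERN = re.compile(
    r"(?<![\w.])(" + "|".join(re.escape(k) for k in TE_RENAMES) + r")\b"
)


def migrate(source):
    """Rewrite the legacy top-level names to their tvm.te equivalents."""
    return _PATTERN.sub(lambda m: TE_RENAMES[m.group(1)], source)


old = "A = tvm.placeholder((n,), name='A')\ns = tvm.create_schedule(B.op)"
print(migrate(old))
```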

* Remove top-level exposures.

committed 4 years ago

24 Feb, 2020 1 commit

[Relay][AutoTVM] Relay op strategy (#4644) · 623dd208

* relay op strategy

fix lint

bitpack strategy

bitserial_dense (#6)

* update strategy

* address comments

fix a few topi test

Dense strategy (#5)

* dense

* add bifrost; remove comments

* address comment

Refactor x86 conv2d_NCHWc (#4)

* Refactor x86 conv2d

* Add x86 depthwise_conv2d_NCHWc

* Add back topi x86 conv2d_nchw

* Merge x86 conv2d_nchw and conv2d_NCHWc

* Minor fix for x86 conv2d

fix more strategy

Add x86 conv2d_NCHWc_int8 strategy (#8)

* Add x86 conv2d_NCHWc_int8 strategy

* Remove contrib_conv2d_nchwc_int8

* Fix generic conv2d_NCHWc for int8

* Fix topi arm_cpu conv2d_NCHWc_int8

update x86 conv2d

enable specifying relay ops to be tuned for autotvm

add cuda conv2d strategy

add conv2d strategy for rocm

add conv2d strategy for hls

add conv2d strategy for arm cpu

add conv2d strategy for mali

add conv2d strategy for bifrost

add conv2d strategy for intel graphics

clean up and fix lint

remove template keys from autotvm

remove 2 in the func name

address comments

fix

* fix bugs

* lint

* address comments

* add name to op implement

* Modify topi tests (#9)

* Add pooling, reorg, softmax and vision

* Add lrn

* fix topi test

* fix more topi test

* lint

* address comments

* x

* fix more tests & bugs

* Modify more tests (#10)

* Modify tests for bitserial_conv2d, bitserial_dense, bitserial_conv2d_rasp and bnn

* Minor fix

* More minor fix

* fix more test

* try to update vta using strategy

* fix cpptest

* x

* fix rebase err

* Fix two tests (#11)

* change autotvm log format

* lint

* minor fix

* try fix vta test

* fix rebase err

* tweak

* tmp hack for vta pass

* fix tutorial

* fix

* fix more tutorials

* fix vta tutorial

* minor

* address comments

* fix

* address comments

* fix cpptest

* fix docs

* change data structure name and api

* address comments

* lint

* fix rebase err

* updates

* fix winograd test

* fix doc

* rebase

* upgrade tophub version number

* fix bug

* re-enable vta tsim test after tophub is upgraded

* fix vta test to use the correct args so the config can be found in tophub

Co-authored-by: Yao Wang <kevinthesunwy@gmail.com>

committed 4 years ago

07 Feb, 2020 1 commit

[REFACTOR][PY][API-Change] Polish tvm.runtime, tvm.runtime.module API update (#4837) · e0122c0e

* [REFACTOR][PY-API] Polish tvm.runtime, tvm.runtime.module API update

This PR updates the tvm.runtime to use the new FFI style.

- Remove top-level tvm.module to avoid confusion between runtime.Module and IRModule
- API changes wrt runtime.Module
  - tvm.module.load -> tvm.runtime.load_module
  - tvm.module.enabled -> tvm.runtime.enabled
  - tvm.module.system_lib -> tvm.runtime.system_lib
- Remove dep on api_internal from runtime.
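
To locate call sites affected by the renames above, a simple scan can report each legacy call with its suggested replacement. This is a sketch with a hypothetical helper name; it does substring matching per line, so it may flag occurrences inside comments or strings.

```python
# Old call -> replacement, taken from the list in this commit message.
RUNTIME_RENAMES = {
    "tvm.module.load": "tvm.runtime.load_module",
    "tvm.module.enabled": "tvm.runtime.enabled",
    "tvm.module.system_lib": "tvm.runtime.system_lib",
}


def find_legacy_calls(source):
    """Return a list of (line_number, old_name, new_name) for legacy calls."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for old, new in RUNTIME_RENAMES.items():
            # Require the opening parenthesis so longer names do not match.
            if old + "(" in line:
                hits.append((lineno, old, new))
    return hits


sample = "import tvm\nm = tvm.module.load('lib.so')\nok = tvm.module.enabled('llvm')\n"
for lineno, old, new in find_legacy_calls(sample):
    print("line %d: %s -> %s" % (lineno, old, new))
```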

* Update module.load in the latest API

committed 4 years ago

15 Jan, 2020 1 commit
- Revert "[Relay][TOPI]Fix meaning of conv2d_transpose output_padding parameter (#4318)" (#4708) · 81e03ee7
```
This reverts commit dcf7fbf1.
```
  Haichen Shen committed 5 years ago
11 Jan, 2020 1 commit

[Relay][TOPI]Fix meaning of conv2d_transpose output_padding parameter (#4318) · dcf7fbf1

* Add output_padding to generic

* Add output_padding to the reference impl

* Add output_padding to arm_cpu

* Add output_padding to the test

* Add output_padding for cuda

* Add output_padding for x86

* Make use of the new output_padding argument in Relay

* Adjust conv2d_transpose Relay test

* Fix lint errors

* Fix the VTA declaration of conv2d_transpose

* support for output padding in conv2d transpose

* some output padding will break IR pass

* Fix new conv2d_transpose test

* Update tophub

* Fix conv1d output_padding too.

* Fix the conv1d_transpose reference function.

* Fix the cuda impl

* fix the topi test for conv1d

* Update the versions in tophub.py

Co-authored-by: Thierry Moreau <tmoreau@octoml.ai>
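
For context, output_padding selects among the several input sizes that a strided convolution maps to the same output size. The sketch below uses the common size formula for a transposed convolution along one spatial dimension, assuming symmetric padding and dilation 1. The function name is hypothetical, and the formula is the standard one, not code taken from this PR.

```python
def conv_transpose_out_size(in_size, kernel, stride, padding, output_padding=0):
    """Output length of a transposed convolution along one dimension.

    Assumes symmetric padding and dilation 1:
        out = (in - 1) * stride - 2 * padding + kernel + output_padding
    """
    return (in_size - 1) * stride - 2 * padding + kernel + output_padding


# With kernel 3, stride 2, padding 1, a forward convolution maps both
# length 7 and length 8 to length 4; output_padding picks which one the
# transposed convolution reconstructs.
print(conv_transpose_out_size(4, kernel=3, stride=2, padding=1))                    # 7
print(conv_transpose_out_size(4, kernel=3, stride=2, padding=1, output_padding=1))  # 8
```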

committed 5 years ago

09 Dec, 2019 1 commit

[VTA] Bringing group convolution support (#4421) · 6ab15806

* group conv operator support for VTA

* autotvm tuning script for group conv2d

* lint fix

* lint fix

* lint fix

* addressing comments

committed 5 years ago

15 Nov, 2019 1 commit
- [VTA] Bug fix for padded load with large inputs (#4293) · 5b1ca85d
```
* bug fix for padded load with large inputs

* Update TensorLoad.scala

* Update test_vta_insn.py
```
  Liangfu Chen committed 5 years ago
06 Nov, 2019 1 commit
- [VTA] Hotfix for padded load test in Chisel VTA (#4264) · 1eca1ad1
```
* Update TensorUtil.scala

* Update test_vta_insn.py
```
  Liangfu Chen committed 5 years ago
05 Sep, 2019 1 commit

[VTA][TOPI] Conv2d transpose (deconvolution) operator support (#3777) · 23c22812

* initial conv2d_transpose

* correct select operator

* cleanup

* fix

* fix correctness check

* conv2d transpose declaration fix

* autotvm conv2d_transpose tuning script

* ir pass fix

* fix tuning script

* deriving params from env, adding bias

* removing bias comp from deconvolution

* lint

* fix

* lint

* lint

* turning off cpu

* lint, ops

* lint

* import fix

* removing hard coded values

* lint

committed 5 years ago

30 Jul, 2019 1 commit

[VTA] Support for batched inference (#3661) · 6c7f0c4d

* fix in IR pass to support padding on 6-d tensors

* support for both N>1 and N==1 for padding

* batch size > 1 tuning and base config

* output formatting

* batch conv2d

* print all category results

* revert to single-batch config

* pick record best

* fix conv test

* improving reporting

* address batching bug in fast simulator

* fix

committed 5 years ago

29 Jul, 2019 1 commit

[VTA] Refactor to increase platform coverage (Ultra96 etc.) (#3496) · f55609b4

* hardware refactor for increased FPGA coverage, small optimizations

* fix header

* cleaning up parameters that won't be needed for now

* streamlining makefile, and simplifying tcl scripts

* moving parameter derivation into pkg_config.py, keeping tcl scripts lightweight

* refactoring tcl script to avoid global variables

* deriving AXI signals in pkg_config.py

* unifying address map definition for hardware and software drivers

* single channel design for ultra96 to simplify build

* enable alu by default, no mul opcode for now

* hardware fix

* new bitstream; vta version

* avoid error when env variable is not set

* ultra96 cleanup

* further cleaning up tcl script for bitstream generation

* preliminary rpc server support on ultra96

* rpc server tracker scripts

* ultra96 ldflag

* ultra96 support

* ultra96 support

* cleanup line

* cmake support for ultra96

* simplify memory instantiation

* cleaning up IP parameter initialization

* fix queue instantiation

* 2019.1 transition

* fix macro def

* removing bus width from config

* cleanup

* fix

* turning off testing for now

* cleanup ultra96 ps instantiation

* minor refactor

* adding comments

* upgrading to tophub v0.6

* model used in TVM target now refers to a specific version of VTA for better autoTVM scheduling

* revert change due to bug

* rename driver files to be for zynq-type devices

* streamlining address mapping

* unifying register map offset values between driver and hardware generator

* rely on cma library for cache flush/invalidation

* coherence management

* not make buffer packing depend on data types that can be wider than 64bits

* refactor config derivation to minimize free parameters

* fix environment/pkg config interaction

* adding cfg dump property to pkgconfig:

* fix rpc reconfig

* fix spacing

* cleanup

* fix spacing

* long line fix

* fix spacing and lint

* fix line length

* cmake fix

* environment fix

* renaming after pynq since the driver stack relies on the pynq library - see pynq.io

* update doc

* adding parameterization to name

* space

* removing reg width

* vta RPC

* update doc on how to edit vta_config.json

* fix path

* fix path

committed 5 years ago

08 Jul, 2019 1 commit

[VTA] TSIM improvements and fixes (#3505) · a31dd162

* add tsim init function

* add sim device

* test wait and resume

* launch simulation thread from DPILoader

* add VTASimDPI module to handle all simulation related stuff

* test tsim init

* move exit to simdpi module

* update vta driver

* add chisel DPI module

* get back simshell

* update vta to support dpi sim

* update unittests

* add tsim to integration-conv2d test

* run resnet on tsim

* remove max-cycles

* match tsim counters with sim counters

* use env in simulator to switch between sim and tsim

* update unittest

* rollback conv2d test

* update resnet

* add stats to matrix multiply

* add stats

* print stats after assert

* update other tests

* add stats to gemm

* add return and remove unused libs

* add missing arg

* return lib

* update comments for linter

* add more comments to VTASimDPI module

* remove trailing spaces

* remove trailing spaces

committed 5 years ago

28 Jun, 2019 1 commit
- [VTA][Relay] Relay Compilation + AutoTVM compatible operator libraries for VTA (#3135) · 3818b2a2
  Thierry Moreau committed 5 years ago
  
13 Jun, 2019 1 commit

[VTA] add support to event counters (#3347) · 7bf2ff23

* add support to event counters in VTA

* fix comment

* fix event-counter interface parameter

* no longer needed

* add sim back

* add docs to event counters

* fix docs

* add more details about event counting

* make dpi-module docs more accurate

committed 5 years ago

05 Jun, 2019 1 commit
- [VTA] [Hardware] Chisel implementation (#3258) · 32f74f31
  Luis Vega committed 5 years ago
  
08 Apr, 2019 1 commit

[HEADER] Add Header to Comply with ASF Release Policy (#2982) · cffb4fba

* [HEADER] ASF header dir=include

* [HEADER] ASF Header dir=src

* [HEADER] ASF Header -dir=python

* [HEADER] ASF header dir=topi

* [HEADER] ASF Header dir=nnvm

* [HEADER] ASF Header -dir=tutorials

* [HEADER] ASF Header dir=tests

* [HEADER] ASF Header -dir=docker

* fix whitespace

* [HEADER] ASF Header -dir=jvm

* [HEADER] ASF Header -dir=web

* [HEADER] ASF Header --dir=apps

* [HEADER] ASF Header --dir=vta

* [HEADER] ASF Header -dir=go

* temp

* [HEADER] ASF Header --dir=rust

* [HEADER] Add ASF Header --dir=cmake

* [HEADER] ASF Header --dir=docs

* [HEADER] Header for Jenkinsfile

* [HEADER] ASF Header to toml and md

* [HEADER] ASF Header to gradle

* Finalize rat cleanup

* Fix permission

* Fix java test

* temporary remove nnvm onnx test

committed 5 years ago

31 Oct, 2018 1 commit
- [TOPI] Add dilation argument to conv2d and depthwise_conv2d (#1970) · 2005f852
  Wuwei Lin committed 6 years ago
  
21 Oct, 2018 1 commit
- [TOPI] Specify non-zero absolute tolerance in tests (#1925) · 39c8bc2a
  Sergey Mironov committed 6 years ago
  
23 Aug, 2018 1 commit
- [AUTOTVM] Simplify TopHub (#1630) · cfafd212
  Lianmin Zheng committed 6 years ago
  
02 Aug, 2018 1 commit
- [AUTOTVM] TOPI integration for ARM CPU (#1487) · 32076df8
  Lianmin Zheng committed 6 years ago
  
13 Jul, 2018 1 commit
- [DOCS] VTA installation guide (#1428) · 6bda4e33
  Thierry Moreau committed 6 years ago
  
12 Jul, 2018 18 commits
- [BUILD][DOCS] Migrate VTA CI, test, build, docs · e531d022
  tqchen committed 6 years ago
  
- [TOPI] Fix the CPU op perf (#56) · bc410130
  Tianqi Chen committed 6 years ago
  
- [TVM] Upgrade TVM Support · 6c62dac3
  Tianqi Chen committed 6 years ago
  
- [DOC, TVM] ResNet tutorial, updated TVM (#51) · 3ae9e155
  Thierry Moreau committed 6 years ago
  
- [DOCKER] Cleanup docker image (#50) · 5739acab
  Tianqi Chen committed 6 years ago
  
- [UTILS, DOC] Use TVM file downloading utility, conv2d tutorial (#48) · 4ba6bd50
  Thierry Moreau committed 6 years ago
  
- [DOC, EXAMPLE] Updated READMEs, tests, etc. (#41) · e2faf792
```
* bug fix for new drivers in new PYNQ image v2.1

* updating instructions for resnet inference

* updated the instructions for starting the RPC server

* deriving host/port from env for unit tests
```
  Thierry Moreau committed 6 years ago
- [BITSTREAM SERVER] Bitstream server integration (#38) · 7f25bf1d
  Thierry Moreau committed 6 years ago
  
- [TOPI] Automated schedule in conv2d TOPI lib, moving to GEMM intrinsic (#35) · a96a4a9b
```
* removing programming out of end to end example for now

* updating TOPI library to use gemm tensor intrinsic

* bug fix, autoschedule in TOPI conv lib

* removing the deprecated GEVM intrinsic

* refactoring, fixed lint test

* fix for integer division bug

* python3 bug fix for non matching types due to float division

* comment
```
  Thierry Moreau committed 6 years ago
- [PYTHON] Enable environment scoping (#33) · 9f0e8ffe
  Tianqi Chen committed 6 years ago
  
- Refactor, refactor code structure, fix pynq rpc (#29) · edac6a8d
  Tianqi Chen committed 6 years ago
  
- [INFRASTRUCTURE] Migrate to json based config. Move gemm test to integration. (#28) · 5c5806ba
```
* Migrate to json based config. Move gemm test to integration.

* temp checkin

* checkin example json
```
  Tianqi Chen committed 6 years ago
- [RUNTIME] Simplify dynamic library and code path. (#27) · 666f32d6
```
* [RUNTIME] Simplify dynamic library and code path.

* reword the readme
```
  Tianqi Chen committed 6 years ago
- [DRIVER] Add simulator, unify testcase to unittest (#25) · 9c44e4b4
  Tianqi Chen committed 6 years ago
  
- [COMPILER] Refactor compiler to enable configuration (#21) · dea167a8
  Tianqi Chen committed 6 years ago
  
- [SCHEDULER, HW] Auto scheduler for conv2d, hardware generation (#20) · f5960ba6
```
* Hardware generation fixes/sweep, auto scheduling for VTA conv2d

* Hardware generation fixes/sweep, auto scheduling for VTA conv2d

* derive hw spec from config file

* up to date hardware spec
```
  Thierry Moreau committed 6 years ago
- [RPC][RUNTIME] Support dynamic reload of runtime API according to config (#19) · 33a309b2
  Tianqi Chen committed 6 years ago
  
- [PYTHON, TVM] Python TVM library, unit tests and end to end example · 96488c11
```
* VTA python library
* Python unit tests
* End to end example with Resnet18
* README instructions
* Bug fixes
```
  Thierry Moreau committed 6 years ago