- 23 Dec, 2019 1 commit
-
-
* [VTA][Chisel] End-to-end Inference with Chisel VTA
* Update TensorAlu.scala
Liangfu Chen committed
-
- 21 Dec, 2019 1 commit
-
-
* [VTA] improved virtual memory mapping
* Update virtual_memory.cc
Liangfu Chen committed
-
- 16 Dec, 2019 1 commit
-
-
Liangfu Chen committed
-
- 11 Dec, 2019 1 commit
-
-
This PR tries to increase TSIM performance by introducing multi-threading support.
Liangfu Chen committed
-
- 09 Dec, 2019 1 commit
-
-
* group conv operator support for VTA
* autotvm tuning script for group conv2d
* lint fix
* lint fix
* lint fix
* addressing comments
Thierry Moreau committed
-
- 28 Nov, 2019 1 commit
-
-
Liangfu Chen committed
-
- 27 Nov, 2019 2 commits
-
-
* disable pipelined adder and enable streamlined gemm execution
* pipeline first layer of adder
* explain difference between pipeadder and adder
* add comment for explaining the hard-coded latency
Liangfu Chen committed -
* relay -> vta fix
* setting optlevel to 3 for quantization to fold batchnorm
Thierry Moreau committed
-
- 26 Nov, 2019 1 commit
-
-
Thierry Moreau committed
-
- 24 Nov, 2019 3 commits
-
-
* [License] move cma_api to 3rdparty. separate BSD 2-clause and 3-clause
* add zlib license for blockingconcurrentqueue.h
Yizhi Liu committed -
* [LINT] Improve the check tool to handle ASF copyright message.
* [LINT] Remove unnecessary copyright message as per ASF requirement.
* Fix codegen hybrid
* [LINT] Broaden license checks to include html, xml
* [LINT] Fix rest of the files
* Fix notice
* [LINT] Improve check file type error message
Tianqi Chen committed -
Yizhi Liu committed
-
- 22 Nov, 2019 1 commit
-
-
tripley committed
-
- 18 Nov, 2019 1 commit
-
-
Tianqi Chen committed
-
- 15 Nov, 2019 1 commit
-
-
* bug fix for padded load with large inputs
* Update TensorLoad.scala
* Update test_vta_insn.py
Liangfu Chen committed
-
- 14 Nov, 2019 1 commit
-
-
jason-song-dev committed
-
- 11 Nov, 2019 1 commit
-
-
Previously runtime::Module was supported using shared_ptr. This PR refactors the codebase to use the Object protocol. It opens the door to easier interoperation between Object containers and modules in the future.
Tianqi Chen committed
-
- 06 Nov, 2019 2 commits
-
-
* Update TensorUtil.scala
* Update test_vta_insn.py
Liangfu Chen committed -
Tianqi Chen committed
-
- 02 Nov, 2019 1 commit
-
-
* [VTA] Performance optimization: remove unnecessary contiguous memory use.
Issue: The uop queue maintains a cache vector used to copy uop data into contiguous DRAM memory for FPGA/simulator use, but this cache vector is not cleared after the FPGA/simulator core runs. In the ResNet18 case, printing the cache size in the UopQueue::ReadBarrier function shows that it keeps increasing, which causes useless data copies and unnecessary contiguous DRAM memory allocation.
Analysis: The issue is caused by not clearing the cache_ vector when uop_queue_.Reset() is called.
Solution: Override the BaseQueue Reset function in UopQueue and add logic to clear cache_.
* address review comments, remove spacing.
Hua Jiang committed
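The fix described in this commit can be illustrated with a minimal C++ sketch. It is not the VTA runtime code: the `Uop` struct, the `pending_` member, `Push`, `CacheSize`, and the `PeakCacheSize` helper are hypothetical names introduced here, and only `BaseQueue`, `UopQueue`, `Reset`, `ReadBarrier`, and `cache_` come from the commit message. The sketch assumes that `ReadBarrier` appends queued uops to `cache_` and that `Reset` is called after each run.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a micro-op record; not the real VTA type.
struct Uop {
  std::uint32_t word;
};

// Simplified base queue: Reset() only drops the queued entries.
class BaseQueue {
 public:
  virtual ~BaseQueue() = default;
  virtual void Reset() { pending_.clear(); }
  void Push(const Uop& op) { pending_.push_back(op); }

 protected:
  std::vector<Uop> pending_;  // hypothetical member name
};

class UopQueue : public BaseQueue {
 public:
  // Stand-in for staging uops into contiguous memory before a core run.
  void ReadBarrier() {
    cache_.insert(cache_.end(), pending_.begin(), pending_.end());
  }

  // The fix from the commit message: clear cache_ as part of Reset().
  // Without this override, cache_ would keep growing across runs.
  void Reset() override {
    cache_.clear();
    BaseQueue::Reset();
  }

  std::size_t CacheSize() const { return cache_.size(); }

 private:
  std::vector<Uop> cache_;
};

// Hypothetical helper: simulates several runs and returns the largest
// cache size observed right after ReadBarrier().
std::size_t PeakCacheSize(int runs, int uops_per_run) {
  assert(runs >= 0 && uops_per_run >= 0);
  UopQueue q;
  std::size_t peak = 0;
  for (int r = 0; r < runs; ++r) {
    for (int i = 0; i < uops_per_run; ++i) {
      q.Push(Uop{static_cast<std::uint32_t>(i)});
    }
    q.ReadBarrier();
    if (q.CacheSize() > peak) peak = q.CacheSize();
    q.Reset();
  }
  return peak;
}
```

With the override in place, the peak cache size in this model stays at the number of uops staged in a single run, however many runs are simulated.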
-
- 27 Oct, 2019 1 commit
-
-
* app init push
* fix on readme
* change name, add bit serial explanation
* rm serialLoadMM, change doc
* syntax change for readme
* add parallel test functionality
* fix readme
* add python doc
* syntax
* init commit
* fix empty line
* fix typo
Benjamin Tu committed
-
- 24 Oct, 2019 1 commit
-
-
* add tensor core support
* avoid memory bank conflict
* fix thread sync & better performance
* better performance
* add schedule test for conv2d
* extend into BatchMatMul
* support config fragment shape and layout using intrinsic
* add TensorCore tutorial
* add int support and fix lint
* address comment
* add 32*16*8 TensorCore test
* fix wmma include logic
Siyuan Feng committed
-
- 10 Oct, 2019 1 commit
-
-
* app init push
* fix on readme
* change name, add bit serial explanation
* rm serialLoadMM, change doc
* syntax change for readme
* add parallel test functionality
* fix readme
* add python doc
* syntax
Benjamin Tu committed
-
- 08 Oct, 2019 1 commit
-
-
if n_trial is larger than config space.
Attila Dusnoki committed
-
- 28 Sep, 2019 1 commit
-
-
Tianqi Chen committed
-
- 13 Sep, 2019 1 commit
-
-
Issue: The RPC path was changed from "pynq_rpc" to "vta_rpc", but the related documentation still uses the old information.
Solution: Update the RPC path information.
Hua Jiang committed
-
- 09 Sep, 2019 1 commit
-
-
Luis Vega committed
-
- 07 Sep, 2019 1 commit
-
-
* [VTA] Support TLPP in function simulator.
Issue: Currently the VTA function simulator only executes instructions serially, so the dependency logic of the runtime ISA, which is used for task-level pipeline parallelism (TLPP), cannot be verified by the function simulator.
Solution: Make the simulator driver multi-threaded and support TLPP.
Benefit: TLPP support in the VTA function simulator makes testing, debugging, and changing VTA logic easier.
replace boost lockfree queue
add configure control for simulator tlpp enable or disable.
change code style into google style.
Wrap queue read/write and sync logic to make function calls simpler.
Add some comments.
Remove MT logic, change into single thread mode.
address review comments.
code style change to match google code style and add comments.
add cmake macro to enable/disable simulator tlpp logic.
submodule update.
correct file name mentioned in comments.
* remove USE_VTA_FSIM_TLPP.
Hua Jiang committed
-
- 05 Sep, 2019 3 commits
-
-
* initial conv2d_transpose
* correct select operator
* cleanup
* fix
* fix correctness check
* conv2d transpose declaration fix
* autotvm conv2d_transpose tuning script
* ir pass fix
* fix tuning script
* deriving params from env, adding bias
* removing bias comp from deconvolution
* lint
* fix
* lint
* lint
* turning off cpu
* lint, ops
* lint
* import fix
* removing hard coded values
* lint
Thierry Moreau committed -
* adding support for graphpack over multiply op
* increasing resnet model coverage
* fix indentation
* lint
* moving recursion limit fix into graphpack pass
* moving recursionlimit to relay init
* pooling on NCHWnc format
* adding more models
* deploy_resnet_on_vta.py
* trailing line
* generalizing to vision models
* merge conflicts
* fix, apply quantization to VTA only
* improving comments
* trimming models that have runtime issues for the moment
* lint
* lint
* lint
Thierry Moreau committed -
* rework;
* `de10-nano` -> `de10nano`;
* fix compilation error;
* bug fix;
* Update install.md
* Update install.md
* Update install.md
* update with current runtime;
* add debug messages;
* bug fix in cma kernel module;
Liangfu Chen committed
-
- 04 Sep, 2019 2 commits
- 03 Sep, 2019 1 commit
-
-
* [VTA] Fix TSIM compile error in Linux (add missing -fPIC flag);
* [VTA] Fix TSIM compile error in Linux (add missing -fPIC flag);
* fix indentation problem;
Liangfu Chen committed
-
- 02 Sep, 2019 1 commit
-
-
Luis Vega committed
-
- 01 Sep, 2019 1 commit
-
-
* [VTA][TSIM] add virtual memory support to tsim example
* fix indentation
* remove USE_TSIM macro and use 32-bit addr instead
Luis Vega committed
-
- 29 Aug, 2019 1 commit
-
-
Issue: RewriteForceSerial is a debug function that forces instructions to run serially instead of in parallel, which helps isolate parallelism problems and compare performance between parallel and serialized execution. However, once it is enabled by setting the debug flag, VTA gets stuck when running on the pynq board.
Analysis: When RewriteForceSerial is enabled, the dependency logic differs from the default one, but the same logic is still used to generate FINISH and other logic, which causes a deadlock.
Solution: Use different dependency settings when RewriteForceSerial is enabled.
Hua Jiang committed
-
- 27 Aug, 2019 1 commit
-
-
Liangfu Chen committed
-
- 26 Aug, 2019 1 commit
-
-
* initial virtual memory;
* initial integration;
* include the header file in cmake;
* implement allocation with virtual to logical address mapping;
* virtual memory for tsim_driver;
* implement the missing memory release function;
* readability improvement;
* readability improvement;
* address review comments;
* improved robustness in virtual memory allocation;
* remove VTA_TSIM_USE_VIRTUAL_MEMORY macro and use virtual memory for tsim by default;
* link tvm against vta library;
* merge with master
* build virtual memory system without linking tvm against vta;
* minor change;
* reuse VTA_PAGE_BYTES;
* using DRAM class from sim_driver as VirtualMemoryManager;
* satisfy linter;
* add comments in code;
* undo changes to Makefile
* undo changes to Makefile
* retrigger ci;
* retrigger ci;
* directly call into VirtualMemoryManager::Global()
Liangfu Chen committed
-
- 18 Aug, 2019 1 commit
-
-
* [VTA][TSIM] parallel hardware compilation with macOS and debug support
* simplify
Liangfu Chen committed
-