- 01 Aug, 2019 1 commit
-
-
Alexander Pivovarov committed
-
- 31 Jul, 2019 7 commits
-
-
* intel graphics conv2d bugs fixed for inception_v3 * intel conv2d api updated, nn input size 4 condition added * review addressed * move conv_tags to attributes * ssd doc updated * address comment
Leyuan Wang committed
-
* relay vm serialization * fix lint * load params, fix stream * lint * fix typo
Zhi committed
-
lixiaoquan committed
-
* [TOPI][CUDA] schedule for group_conv2d * Fix #flops
Wuwei Lin committed
-
* initial compilation script for chisel-vta; * replace tabs with spaces; * compile script for de10-nano; * remove generated verilog source code; * remove `altsource_probe`, `debounce`, `edge_detect` ip; * replace quartus project files with a single tcl script; * Update install.md * improved makefile-based compilation script; * complete makefile-based compilation of chisel-vta for de10-nano; * install quartus; * conversion to .rbf file; * document chisel-vta compilation process for de10-nano; * rename generated bitstream file; * download and extract custom ip for de10-nano; * minor change * minor change * fix indentation; * bug fix; * improved robustness in makefile; * clean up; * add `.sdc .ipx .qsys` allowance in jenkins; * add ASF header; * add ASF header; * remove IntelShell.scala, update vta_hw.tcl, clean up Makefile & soc_system.qsys; * add ASF header; * keep sources compact; * keep sources compact; * it's not necessary now * AXI4LiteClient -> AXI3Client for IntelShell * remove connection to fpga_only_master; * a few important bug fix: wire reset pin, and set host_r_last to high * remove intel specific interface definition; * add NO_DSP option in Makefile; * AXI4Lite is not used in IntelShell; * minor fix: disable dsp and use logic instead; * quartus version change: 18.0 -> 18.1 * remove altera related statement; * compose compile_design.tcl * initial tcl script for soc_system generation; * remove .qsys file; * remove unused; * .qsys can be generated by tcl script; * remove hps_io and shrink size of soc_system; * integrate into makefile; * version change: 18.0 -> 18.1 * add sample config file for de10-nano; * parameterize DEVICE and PROJECT_NAME * remove extra lines; * brief description on flashing sd card image for de10-nano * docs on building additional components * parameterize DEVICE and DEVICE_FAMILY * parameterize DEVICE and DEVICE_FAMILY * parameterize DEVICE and DEVICE_FAMILY * de10-nano -> de10nano * minor change * add comment in code and document in 
order to address review comments;
Liangfu Chen committed
-
Balint Cristian committed
-
Haichen Shen committed
-
- 30 Jul, 2019 10 commits
-
-
Balint Cristian committed
-
* fix in IR pass to support padding on 6-d tensors * support for both N>1 and N==1 for padding * batch size > 1 tuning and base config * output formatting * batch conv2d * print all category results * revert to single-batch config * pick record best * fix conv test * improving reporting * address batching bug in fast simulator * fix
Thierry Moreau committed
-
Thierry Moreau committed
-
* Fixed topi bdist_wheel build to include libraries. * Removed unneeded imports
Josh Fromm committed
-
* Fix traverse_inline not inline zero input op properly * Add where to python and set tag to broadcast * Fix inline * test * fix test target * fix
Wuwei Lin committed
-
...and add rocm module_save to the tests.
Thomas Viehmann committed
-
This refines the detection of ld.lld matching the neighbouring clang file. This is particularly helpful on Ubuntu/Debian when either the default ld.lld is not installed or the versioned one is preferable for consistency. @tqchen I think you last touched the clang equivalent in #3590 .
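The idea of picking the `ld.lld` that sits next to a given clang binary can be sketched as below. This is only an illustration under stated assumptions, not the code from this commit: the helper names `candidate_lld_names` and `find_lld_next_to` are hypothetical, and the version-suffix convention (`clang-8` pairs with `ld.lld-8`) is assumed from the Ubuntu/Debian naming the message mentions.

```python
import os
import re


def candidate_lld_names(clang_path):
    """Return ld.lld file names to try, most specific first (hypothetical helper)."""
    base = os.path.basename(clang_path)
    # Capture a version suffix such as "-8" or "-6.0" in "clang-8" or "clang++-6.0".
    match = re.match(r"^clang(?:\+\+)?(-[\d.]+)?$", base)
    suffix = match.group(1) if match and match.group(1) else ""
    names = []
    if suffix:
        # Prefer the versioned linker that matches the versioned clang.
        names.append("ld.lld" + suffix)
    names.append("ld.lld")
    return names


def find_lld_next_to(clang_path):
    """Look for an executable ld.lld in the same directory as clang_path."""
    directory = os.path.dirname(os.path.abspath(clang_path))
    for name in candidate_lld_names(clang_path):
        path = os.path.join(directory, name)
        if os.path.isfile(path) and os.access(path, os.X_OK):
            return path
    return None  # caller decides how to fall back
```

Trying the versioned name first keeps the compiler and linker consistent when several LLVM versions are installed side by side; the unversioned name is only a fallback.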
Thomas Viehmann committed
-
Thomas Viehmann committed
-
雾雨魔理沙 committed
-
* init * lint * lint
雾雨魔理沙 committed
-
- 29 Jul, 2019 3 commits
-
-
Luis Vega committed
-
Luis Vega committed
-
* hardware refactor for increased FPGA coverage, small optimizations * fix header * cleaning up parameters that won't be needed for now * streamlining makefile, and simplifying tcl scripts * moving parameter derivation into pkg_config.py, keeping tcl scripts lightweight * refactoring tcl script to avoid global variables * deriving AXI signals in pkg_config.py * unifying address map definition for hardware and software drivers * single channel design for ultra96 to simplify build * enable alu by default, no mul opcode for now * hardware fix * new bitstream; vta version * avoid error when env variable is not set * ultra96 cleanup * further cleaning up tcl script for bitstream generation * preliminary rpc server support on ultra96 * rpc server tracker scripts * ultra96 ldflag * ultra96 support * ultra96 support * cleanup line * cmake support for ultra96 * simplify memory instantiation * cleaning up IP parameter initialization * fix queue instantiation * 2019.1 transition * fix macro def * removing bus width from config * cleanup * fix * turning off testing for now * cleanup ultra96 ps insantiation * minor refactor * adding comments * upgrading to tophub v0.6 * model used in TVM target now refers to a specific version of VTA for better autoTVM scheduling * revert change due to bug * rename driver files to be for zynq-type devices * streamlining address mapping * unifying register map offset values between driver and hardware generator * rely on cma library for cache flush/invalidation * coherence management * not make buffer packing depend on data types that can be wider than 64bits * refactor config derivation to minimize free parameters * fix environment/pkg config interaction * adding cfg dump property to pkgconfig: * fix rpc reconfig * fix spacing * cleanup * fix spacing * long line fix * fix spacing and lint * fix line length * cmake fix * environment fix * renaming after pynq since the driver stack relies on the pynq library - see pynq.io * update doc * adding 
parameterization to name * space * removing reg width * vta RPC * update doc on how to edit vta_config.json * fix path * fix path
Thierry Moreau committed
-
- 28 Jul, 2019 3 commits
-
-
Luis Vega committed
-
Balint Cristian committed
-
Luis Vega committed
-
- 27 Jul, 2019 3 commits
-
-
* fix tensor issue/commit in gemm * remove trailing spaces
Luis Vega committed
-
Yong Wu committed
-
peterjc123 committed
-
- 26 Jul, 2019 6 commits
-
-
YPBlib committed
-
* Add USE_GTEST as a CMake variable * Add GTest section in installation docs * Incorporate feedback
Logan Weber committed
-
Enhance test to cover this case
lixiaoquan committed
-
* [TOPI][CUDA] Schedule for pool_grad * Relay test * Fix fused op * doc * Remove set scope local
Wuwei Lin committed
-
* add check_grad * finish * what does the fox say? * lint lint lint lint lint lint lint lint lint
雾雨魔理沙 committed
-
* support for different inp/wgt bits, rewrote dot for clarity * [VTA] [Chisel] support for different inp/wgt bits, rewrote DotProduct for clarity * [VTA] [Chisel] support for different inp/wgt bits, rewrote DotProduct for clarity * change back to sim * fix index * fix index * fix indent * fix indent * fix indent * fix trailing spaces * fix trailing spaces * change to more descriptive name * matric->matrix * fix spacing * fix spacing & added generic name for dot * better parameter flow * spacing * spacing * spacing * update requirement (tested) for dot, spacing * function call convention * small edit
Benjamin Tu committed
-
- 25 Jul, 2019 7 commits
-
-
Lianmin Zheng committed
-
Balint Cristian committed
-
* uTVM interfaces (#14) * some minor interface changes * implemented HostLowLevelDevice * added MicroDeviceAPI * implemented micro_common and added Python interfaces * current status, semi implemented micro session * added micro_common implementation and python interfaces (#18) * added micro_common implementation and python interfaces (#18) * current status, semi implemented * host test working * updated interfaces for MicroSession arguments allocation * make somewhat lint compatible * fix based on comments * added rounding macro * fix minor bug * improvements based on comments * Clean up `binutil.py` and make Python-3-compatible * Change argument allocation design * Address feedback and lint errors * Improve binutil tests * Simplify allocator (per @tqchen's suggestions) * Doc/style fixes * farts * mcgee * rodata section werks (and so does `test_runtime_micro_workspace.py`) * simple graph runtime werk * TEMP * ResNet works, yo * First round of cleanup * More cleanup * runs a dyson over the code * Another pass * Fix `make lint` issues * ready to pr... probably * final * Undo change * Fix rebase resolution * Minor fixes * Undo changes to C codegen tests * Add `obj_path` in `create_micro_lib` * TEMP * Address feedback * Add missing TODO * Partially address feedback * Fix headers * Switch to enum class for `SectionKind` * Add missing ASF header * Fix lint * Fix lint again * Fix lint * Kill lint warnings * Address feedback * Change Python interface to MicroTVM All interaction with the device is now through `Session` objects, which are used through Python's `with` blocks. 
* Reorder LowLevelDevice interface * Store shared ptr to session in all alloced objects * Move helper functions out of `tvm.micro` * Switch static char arr to vector * Improve general infra and code quality Does not yet address all of tqchen's feedback * Forgot a rename * Fix lint * Add ASF header * Fix lint * Partially address MarisaKirisame's feedback * Lint * Expose `MicroSession` as a node to Python * Revert to using `Session` constructor * Fix compiler error * (Maybe) fix CI error * Debugging * Remove * Quell lint * Switch to stack-based session contexts * Make uTVM less intrusive to host codegen And use SSA for operands of generated ternary operators * Inline UTVMArgs into UTVMTask struct * Remove `HostLowLevelDevice` header * Remove `BaseAddr` class * Address feedback * Add "utvm" prefix to global vars in runtime * Fix lint * Fix CI * Fix `test_binutil.py` * Fix submodules * Remove ResNet tests * Make `test_binutil.py` work with nose * Fix CI * I swear this actually fixes the binutil tests * lint * lint * Add fcompile-compatible cross-compile func * Add docs for uTVM runtime files * Move pointer patching into `MicroSession` * Fix lint * First attempt at unifying cross-compile APIs * Fix lint * Rename `cross_compile` back to `cc` * Address feedback * Remove commented code * Lint * Figure out failing function * Remove debugging code * Change "micro_dev" target to "micro" * Add checks in tests for whether uTVM is enabled * Add TODO for 32-bit support * Rename more "micro_dev" to "micro" * Undo rename We already have `tvm.micro` as a namespace. Can't have it as a method as well. * Fix failing CI Thanks to @tqchen for finding this bug. Emitting ternary operators for `min` and `max` causes concurrency bugs in CUDA, so we're moving the ternary op emissions from `CodeGenC` to `CodeGenCHost`. * Address feedback * Fix lint
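The message above mentions `Session` objects used through Python's `with` blocks and a switch to stack-based session contexts. A generic sketch of that pattern follows. It is not the MicroTVM implementation: the `Session` class here is a hypothetical stand-in, and the `name` field and `current()` method are assumptions made for illustration.

```python
class Session:
    """Hypothetical stand-in for a device session; not the TVM class."""

    _stack = []  # active sessions, innermost last

    def __init__(self, name):
        self.name = name

    def __enter__(self):
        # Entering a `with` block makes this session the current one.
        Session._stack.append(self)
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Leaving the block restores whichever session was active before.
        Session._stack.pop()
        return False  # do not swallow exceptions

    @classmethod
    def current(cls):
        """Return the innermost active session."""
        if not cls._stack:
            raise RuntimeError("no active session")
        return cls._stack[-1]


# Nested blocks: the inner session shadows the outer one until it exits.
with Session("outer"):
    with Session("inner"):
        inner_name = Session.current().name
    outer_name = Session.current().name
```

Keeping the active sessions on a stack lets nested `with` blocks restore the previous session on exit without the caller tracking it by hand.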
Logan Weber committed
-
Philip Hyunsu Cho committed
-
Yong Wu committed
-
Jian Weng committed
-
* [TOPI] Average Pool2D Bug. Issue - https://github.com/dmlc/tvm/issues/3581 * Add uint16 test.
Animesh Jain committed
-