- 23 Apr, 2020 1 commit
-
-
MORITA Kazutaka committed
-
- 22 Apr, 2020 1 commit
-
-
Substitute now takes a std::function to customize more replacing behaviors. Co-authored-by: Siyuan Feng <hzfengsy@sjtu.edu.cn> Co-authored-by: Siyuan Feng <hzfengsy@sjtu.edu.cn>
Tianqi Chen committed
-
- 16 Apr, 2020 1 commit
-
-
Samuel committed
-
- 15 Apr, 2020 3 commits
-
-
- Changes most of the relay docs to use autosummary. - Bring relay API docs to the top-level flat view for easier discovery - Removed a few cases of re-exports.
Tianqi Chen committed -
* add pytorch tutorial code and doc stub * add more docs * formatting, more docs * typo fix * try make sphinx happy * add performance section * type and nit fix * format fix
masahi committed -
Tianqi Chen committed
-
- 14 Apr, 2020 1 commit
-
-
* [RELAY][PYTORCH]isNan, isinf, isfinite, ceil, clamp, round ops * Review comments
Samuel committed
-
- 08 Apr, 2020 1 commit
-
-
Luis Vega committed
-
- 05 Apr, 2020 2 commits
-
-
* Functional conv3d winograd working. * Formatted python code. * registered conv3d winograd compute and started adding relay without_weight_transform operator. * Add topi testing for conv3d winograd. * Format file. * small tweak to unrolling to prevent build sticking. * Refactoring convolution ops in relay. * Refactored relay convolutions. * Bug fixes. * Fixed static bug in convolution. * Added conv3d alter op layout and related support. * Bug fixes and testing done. * Fix a few autotvm bugs. * Drop silly debug print. * Removed debug_skip_region. * Add variant of conv3d_winograd that doesn't transform depth. * initial infrastructure done for depthless conv. * Fix no_depth schedule bugs. * automatic topi switching between depth and depthless winograd. * Fixed bug in schedule. * lint fixes. * Removed indents in convolution.cc * missed a few indents oops. * fixed flop count. * One more small tweak. * Change kernel pack inner axes order. * Style changes. * Comment fixes.
Josh Fromm committed -
* [REFACTOR][TIR] Migrate all low-level passes to the Pass Manager. This PR migrates the tvm.lower to return IRModule of PrimFuncs instead of the LoweredFuncs. * Remove LoweredFunc.
Tianqi Chen committed
-
- 02 Apr, 2020 3 commits
-
-
Rationale: The current hybrid module is more aligned with the te part. We might consider add a new varient of hybrid script that support the unified IR later. This refactor paves for the potential later changes.
Tianqi Chen committed -
Tianqi Chen committed
-
* [REFACTOR][TIR] Introduce ExprDeepEqual, Remove IRDeepCompare This PR introduces ExprDeepEqual which reuses the StructuralEqual infra. We migrated the usecases of ir_pass::Equal to ExprDeepEqual and StructuralEqual. * Address comments
Tianqi Chen committed
-
- 01 Apr, 2020 1 commit
-
-
* [DOCS] Use https link * use http for sphinx
Tianqi Chen committed
-
- 31 Mar, 2020 2 commits
-
-
* refactor * path udpate
Thierry Moreau committed -
Andrew Reusch committed
-
- 30 Mar, 2020 2 commits
-
-
* [DOCS] Point docs to the ASF site. We have migrated the main docs to the ASF site, which will be periodically updated using the docs generated by the CI. Points the docs to the ASF version. * [CI] Improve the docs generation script
Tianqi Chen committed -
* [DOCS] Various sphinx related fix. - Use :ref: for reference. - Use :py:class: to refer to API docs. - Update installation guide to also refer to the download page. - Only move html contents in doxygen. * Address review comments * Update wording
Tianqi Chen committed
-
- 29 Mar, 2020 1 commit
-
-
Thierry Moreau committed
-
- 26 Mar, 2020 1 commit
-
-
* [Doc] TVM release process * fix tag * remove things not apply
Yizhi Liu committed
-
- 24 Mar, 2020 1 commit
-
-
* [REFACTOR][TIR] Introduce PrimFuncPass. - Introduce PrimFuncPass - Convert one pass to the unified Pass API. * Address comments * Fix comments
Tianqi Chen committed
-
- 23 Mar, 2020 4 commits
-
-
Tianqi Chen committed
-
* isfinite doc update * isfinit expr * isfinit expr * isfinite schedule reg * isfinite python binding * isfinite python binding * relay register isfinite * isfinite type relation * intrin isfinite * topi isfinite * testcase topi isfinite * tf frontend isfinite * tf frontend isfinite testcase * test case relay isfinite * small fixes * test forward tf isfinite * test cases injective for cuda * remove float16 test case * add support for isinf * remove unwanted import * fix conflict
Mahesh Ambule committed -
* first cut unravel_index * merge fixes * change rates to dilations * unravel_index op relay, topi, mxnet, tf * doc changes * small changes * remove empty unravel and argwhere attrs * remove empty unravel and argwhere attrs
Mahesh Ambule committed -
* [DOCS] Cleanup docs before rebuild * Ask doxygen to generate svg to minimize the file size
Tianqi Chen committed
-
- 22 Mar, 2020 1 commit
-
-
* [DOCS] include a tarball of docs during deployment * [DOCS] Add a short security faq
Tianqi Chen committed
-
- 21 Mar, 2020 1 commit
-
-
* Update relay docs * any -> py:func * make clean
Zhi committed
-
- 20 Mar, 2020 1 commit
-
-
masahi committed
-
- 19 Mar, 2020 1 commit
-
-
* [DOC] Add doc for Relay op strategy * update * address more comments * update * update
Haichen Shen committed
-
- 18 Mar, 2020 1 commit
-
-
Zhi committed
-
- 17 Mar, 2020 2 commits
-
-
* refactor relay python * revert relay/ir/*.py to relay * Address comments * remove direct access to analysis and transform namespace
Zhi committed -
* update docs for dilation 2d * dilation2d compute * dilation2d register * dilation2d rel compute * dilation2d strategy * dilation2d attrs * dilation2d generic schedule * dilation2d tf frontend support * dilation2d tf frontend test case * dilation2d test cases * pylint fixes * add exception for cuda target * Update docstring * Update docstring * change rates to dilations * removed unused param * merge master * Update nn.py * Update nn.py
Mahesh Ambule committed
-
- 13 Mar, 2020 1 commit
-
-
Tianqi Chen committed
-
- 12 Mar, 2020 1 commit
-
-
Thierry Moreau committed
-
- 11 Mar, 2020 1 commit
-
-
* Add relay operation relay.op.tan. * Update tan implementation in TVM. * Update tests. * Add shape function for tan. * Add missing main test to python/frontend/tensorflow/test_forward. * Revert, back to sin/cos. * Revert "Revert, back to sin/cos." This reverts commit 4da5b503b921585ba9d80944b29136142b575c40. * Fix implementation of tan in cuda. Do not support tan for float16. Simplify topi/tests/python/test_topi_math. Add testing for tan with float32 and float64. Finally implement tan as sin/cos in llvm.
notoraptor committed
-
- 10 Mar, 2020 1 commit
-
- 09 Mar, 2020 1 commit
-
-
* [VTA][de10nano] Enable user defined target frequency. Issue: The VTA target frequency on the DE10-Nano is hardcoded to 50MHz unnecessarily limiting performance. Solution: Add a PLL to the FPGA sub-system along with support for the selection of a user specified frequency at build time. The board successfully builds and runs at 100MHz. * Added a PLL in the soc_system.tcl platform designer generator script. * Modified the Makefile to automatically set the target frequency from that specified in the pkg_config.py file. * Modified the Makefile to generate a bitstream with an RBF format that enables programming of the FPGA directly from the on-board processor. Specifically, the RBF is generated in FastParallel32 mode with compression, which corresponds to the default MSEL switch setting on the board, i.e. 01010. * Added a false path override to file set_clocks.sdc to turn off unconstrained path warnings on the VTA pulse LED. * [VTA][TSIM] Add more debug and tracing options. * Modified Makefile to change default config to DafaultDe10Config. * Added option in Makefile to produce more detailed tracing for extra observability in debugging complex scenarios. * Added option in Makefile to produce traces in FST format which are 2 orders of magnitude smaller, although much slower to generate. * Added option in Makefile to build the simulator with GCC address sanitizer. * Modified Makefile to not lint the scala code by default avoiding unintended wrong indentation. Linting should be better performed manually on a per-need basis. * [VTA][de10nano] Enable remote programming of FPGA. Issue: The Cyclone V FPGA on board of the DE10-Nano can only be programmed using the JTAG port, which is a limiting option for users. Solution: Add support for the remote programming of the FPGA implementing the FPGA programming manager protocol published in the Cyclone V user manual. * Added file de10nano_mgr.h implementing an FPGA manager class that supports handling of control and status registers as well as a push-button option to program the FPGA. The class can be easily extended to include more registers if needed. * Used an instance of the FPGA manager to implement function VTAProgram also warning users when incompatible bitstream files are used. * Registered VTAProgram as a global function and modified the program_bitstream python class to use it. * [VTA][de10nano] Enhance de10nano runtime support. Issue: The de10nano target has incomplete, non-working support for runtime reconfiguration, bitstream programming, and examples of usage. Solution: Complete runtime support for the de10nano target. * Modified VTA.cmake to comment out a default override for VTA_MAX_XFER to 21 bit wide. * Modified VTA.cmake to add needed de10nano include dirs. * Modified relevant files to support de10nano same way as other targets for VTA runtime reconfiguration and FPGA programming. * Added test_program_rpc.py example as a runtime FPGA programming example. Note that unlike the pynq target no bitstream is either downloaded or programmed when the bitstream argument is set to None. * Cosmetic changes to vta config files. * [VTA][Chisel] LoadUop FSM bug fix. Issue: The LoadUop FSM incorrectly advances the address of the next uop to read from DRAM when the DRAM data valid bit is deasserted and asserted at the end of a read. This is caused by a mismatch in the logic of the state and output portions of the FSM. This is one of two issues that was gating the correct operation of VTA on the DE10-Nano target. Solution: Modify the logic of the output section of the FSM to include a check on the DRAM read valid bit or fold the output assignemnt into the state section. * Folded the assignemnt of the next uop address in the state section of the FSM. * [VTA][Chisel] Dynamically adjust DMA tranfer size. Issue: In the DE10-Nano target and possibly in others, DMA transfers that cross the boundaries of memory pages result in incorrect reads and writes from and to DRAM. When this happens depending on different input values, VTA loads and stores exhibit incorrect results for DMA pulses at the end of a transfer. This is one of two issues that were gating the DE10-Nano target from functioning correctly, but may affect other Chisel based targets. Solution: Add support for dynamically adjustble DMA transfer sizes in load and store operations. For a more elegant and modular implementation the feature can be enabled at compile time with a static constant that can be passed as a configuration option. * Modified the load and store finite state machines to dynamically adjust the size of initial and stride DMA transfers. The feature is enabled by default by virtue of the static constant ADAPTIVE_DMA_XFER_ENABLE. * [VTA][Chisel] Improve FSIM/TSIM/FPGA xref debug. Issue: Cross reference between FSIM, TSIM, and Chisel based FPGA traces is an invaluable instrument that enables fast analysis on FSIM, and analysis/debug on TSIM and FPGA, especially for complex flows like conv2d or full inferences. Currently this cannot be done easily since a suitable reference is missing. The clock cycle event counter cannot be used since it is undefined in FSIM and not reliable between TSIM and FPGA because of different latencies. Solution: Introduce a new event counter that preserves a program order across FSIM, TSIM, FPGA. We propose adding the accumulator write event counter in the Chisel EventCounter class and a simple instrumentation in the FSIM runtime code. Note that this technique enabled finding the Chisel issues reportes in the PR, which would have been otherwise far more difficult. * Added the acc_wr_count event counter and changed interfaces accordingly. * [VTA][de10nano] Comply with linting rules. * [VTA] Appease make lint. * [VTA] Disable pylint import not top level error. * [VTA][Chisel,de10nano] Linting changes. * Use CamelCase class names. * Use C++ style C include header files. * Add comments to Chisel makefile. * [VTA][de10nano] * Reorder C and C++ includes in de10nano_mgr.h. * Restore lint as default target in Chisel Makefile. * [VTA][de10nano] Do not use f string in pkg_config.py. * [VTA][de10nano] Remove overlooked f strings in pkg_config.py. * [VTA][de10nano] Fixed typo. * [VTA][TSIM] Check if gcc has align-new. * [VTA][Chisel] Make adaptive DMA transfer default. * [VTA][RPC] Renamed VTA_PYNQ_RPC_* to VTA_RPC_*. Issue: With more FPGA targets coming online the initial method of using individual environment variables to specify target IP and port does not scale well. Solution: Use a single VTA_RPC_HOST, VTA_RPC_PORT pair to be changed every time a different target is used. For instance in a script used to benchmark all targets. * Replaced every instance of VTA_PYNQ_RPC_HOST and VTA_PYNQ_RPC_PORT with VTA_RPC_HOST and VTA_RPC_PORT, respectively. * [VTA][Chisel] Comply with new linter.
Pasquale Cocchini committed
-
- 08 Mar, 2020 1 commit
-
-
ANSHUMAN TRIPATHY committed
-
- 06 Mar, 2020 1 commit
-
-
* Add relay operation relay.op.tan. * Update tan implementation in TVM. * Update tests. * Add shape function for tan. * Add missing main test to python/frontend/tensorflow/test_forward. * Revert, back to sin/cos. * Revert "Revert, back to sin/cos." This reverts commit 4da5b503b921585ba9d80944b29136142b575c40. * Fix implementation of tan in cuda. Do not support tan for float16. Simplify topi/tests/python/test_topi_math. Add testing for tan with float32 and float64. Try again to implement tan as sin/cos in llvm.
Yao Wang committed
-
- 28 Feb, 2020 1 commit
-
-
* [DOCS] Fix sphinx precheck * ignore keras warnings * Remove more warnings
Tianqi Chen committed
-