- 22 Apr, 2020 1 commit
-
-
Substitute now takes a std::function, allowing callers to customize more of the replacement behavior. Co-authored-by: Siyuan Feng <hzfengsy@sjtu.edu.cn>
Tianqi Chen committed
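The idea of passing a callback to Substitute can be sketched with a small toy. This is not TVM's API: the nested-tuple expression format and the name `substitute` are assumptions made for illustration only. The caller supplies a function that decides what each variable is replaced with; returning None keeps the variable.

```python
from typing import Callable, Optional, Union

# Toy expression (hypothetical format): a variable name (str),
# an int constant, or a tuple (op, lhs, rhs).
Expr = Union[str, int, tuple]


def substitute(expr: Expr, vmap: Callable[[str], Optional[Expr]]) -> Expr:
    """Replace variables in expr using a callback instead of a fixed map."""
    if isinstance(expr, str):
        replacement = vmap(expr)
        # None means "no replacement for this variable".
        return expr if replacement is None else replacement
    if isinstance(expr, tuple):
        op, lhs, rhs = expr
        return (op, substitute(lhs, vmap), substitute(rhs, vmap))
    return expr  # constants are left unchanged


# A dict-based map is one special case of the callback ...
table = {"x": ("+", "y", 1)}
result_dict = substitute(("*", "x", "z"), table.get)

# ... but a callback can also encode a rule, e.g. rename every variable.
result_rule = substitute(("*", "x", "z"), lambda name: name + "_new")
```

A fixed variable map only covers the first case; the callback form also covers rule-based replacement without building a map up front.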
-
- 21 Apr, 2020 3 commits
-
-
Tianqi Chen committed
-
The legacy Simplify/CanonicalSimplify functions are now thin wrappers around the Analyzer. This PR removes these functions and migrates every place that requires simplification to create an Analyzer explicitly. The new API should encourage more Analyzer sharing and potentially enable context-aware, analyzer-based simplification.
Tianqi Chen committed -
Rationale: inline is a transformation used in te to rewrite its internal expressions. It is not a formal IRModule->IRModule transform pass. Also removed the python test as the test is covered by stage.compute_inline.
Tianqi Chen committed
-
- 20 Apr, 2020 1 commit
-
-
* [TIR][REFACTOR] RewriteForTensorCore -> te/schedule RewriteForTensorCore depends on the schedule information, which makes it differ from a typical pass (which should get all of its information from the input TIR). As a result, we refactor it as a SchedulePostProc step for now. We should revisit it later as we introduce more support for tensor core patterns in the TIR. * Fix VTA to fit the new IR Pattern
Tianqi Chen committed
-
- 19 Apr, 2020 2 commits
-
-
* [TIR][REFACTOR] Remove te::Tensor dependencies from TIR passes. te::Tensor is a useful object for tensor expression, but brings an unnecessary reverse dependency into TIR nodes such as Provide and Realize. This PR is a first step to remove this dependency. We will use Buffer in all the places where te::Tensor was used. The rough correspondence is: - Provide -> BufferStore - Realize -> BufferRealize - HalideCall -> BufferLoad. After this change, we can now use IRModule of PrimFuncs cleanly to represent TIR at any point of the optimizations. Buffer will serve as the abstraction for the TIR data models to represent the intermediate storages and their constraints. We still keep Realize/HalideCall and Provide as TIR nodes for now to keep the change minimal. Right after ScheduleOps, we call SchedulePostProcToPrimFunc to canonicalize the temporary IR generated by TE (which contains these nodes) to the TIR. The TIR optimizations are now mostly migrated to the pass manager. Follow-up PRs are needed to migrate the remaining few passes. * Fix dev tutorial
Tianqi Chen committed -
* fix recursion in lower_warp_memory * post-order mutation
Tang, Shizhi committed
-
- 18 Apr, 2020 1 commit
-
-
- Migrate BoundCheckers and Simplify - Migrate RewriteUnsafeSelect and RemoveNoOp - Migrate UnrollLoop and StorageRewrite - Migrate InjectDoubleBuffer and InjectVirtualThread - Migrate LoopPartition and Vectorize - Migrate CoProcSync, LiftAttrScope, InjectCopyIntrin We still keep the ir_pass registrations for now. A separate PR is needed to refactor the parts before StorageFlatten.
Tianqi Chen committed
-
- 17 Apr, 2020 1 commit
-
-
* support extent(threadIdx.x) < warp_size in lower_warp_memory * more docs for lower_warp_memory
Tang, Shizhi committed
-
- 15 Apr, 2020 2 commits
-
-
* [TIR] Remove ProducerConsumer and AllocateNode::new_expr This PR removes two legacy IR parts in TIR that are deprecated. ProducerConsumer node only serves as a hint markup and may no longer be informative after extensive transformations in the pass. If necessary, we can add related info via AttrStmt. The new_expr field in the AllocateNode is deprecated since it can just be replaced by a LetStmt. - Remove dependencies of passes on ProducerConsumer. - Remove ProducerConsumer from the IR. - Remove the deprecated fields (new_expr, free_function) from AllocateNode. * Fix additional testcases
Tianqi Chen committed -
Tianqi Chen committed
-
- 14 Apr, 2020 1 commit
-
-
Previously MakePackedAPI was in the target-independent stage, but nevertheless required the device_type information that is bound at a later target-dependent stage. The previous implementation was due to the limitation of LoweredFunc, which cannot carry buffer_map info (so the functions had to be lowered right away). This is no longer the case after the unified IR refactor. This PR migrates MakePackedAPI to a target-dependent stage and removes the unnecessary BindDevice pass.
Tianqi Chen committed
-
- 13 Apr, 2020 1 commit
-
-
* [RUNTIME] Allow non-nullable ObjectRef, introduce Optional<T>. We use ObjectRef and its sub-classes extensively throughout our codebase. Each of ObjectRef's sub-classes is nullable, which means it can hold nullptr as its value. While in some places we need nullptr as an alternative value, the implicit support for nullptr in all ObjectRefs creates an additional burden for the developer, who has to explicitly check defined in many places of the codebase. Moreover, it is unclear from the API's intent whether we want a nullable object or a non-null version (in many cases we want the latter). Borrowing existing wisdom from languages like Rust, we propose to introduce non-nullable ObjectRef, and an Optional<T> container that represents a nullable variant. To keep backward compatibility, we will start by allowing most ObjectRefs to be nullable. However, we should start to use Optional<T> as the type in places where we know nullable is a requirement. Gradually, we will move most of the ObjectRefs to be non-nullable and use Optional<T> in the nullable cases. Such explicitness in typing can help reduce the potential problems in our codebase overall. Changes in this PR: - Introduce _type_is_nullable attribute to ObjectRef - Introduce Optional<T> - Change String to be non-nullable. - Change the API of function->GetAttr to return Optional<T> * Address review comments * Upgrade all compiler flags to c++14 * Update as per review comment
Tianqi Chen committed
-
- 12 Apr, 2020 2 commits
-
-
Zhi committed
-
This PR enables copy-on-write (COW) optimizations in passes: - Enable COW for IRModule in both TIR and relay passes. - Enable COW for PrimFunc in TIR passes. More thought is needed on whether/how to enable COW for relay::Function, because some function passes depend on the presence of the IRModule for context information, and the std::move of the related function to nullptr might affect the related behavior.
Tianqi Chen committed
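The copy-on-write idea in the commit above can be illustrated with a toy. This is a sketch of the general technique, not TVM's implementation: the `Node` and `Handle` classes, the explicit `refs` counter, and `rename_pass` are all hypothetical names. A pass mutates a node in place when it holds the only reference, and copies first when the node is shared.

```python
class Node:
    """Toy IR node with an explicit reference count (hypothetical)."""

    def __init__(self, body):
        self.body = body
        self.refs = 0  # number of handles pointing at this node


class Handle:
    """Reference to a Node that tracks sharing explicitly."""

    def __init__(self, node):
        self.node = node
        node.refs += 1

    def share(self):
        """Create a second handle to the same node."""
        return Handle(self.node)

    def copy_on_write(self):
        """Return a node that is safe to mutate in place."""
        if self.node.refs > 1:
            # Shared: detach from the old node and work on a private copy.
            self.node.refs -= 1
            self.node = Node(self.node.body)
            self.node.refs = 1
        return self.node


def rename_pass(handle, new_body):
    """A toy 'pass' that rewrites the body through copy-on-write."""
    handle.copy_on_write().body = new_body
    return handle


# Shared node: the pass must copy, so the alias keeps the old body.
original = Handle(Node("a + b"))
alias = original.share()
shared_before = original.node
rename_pass(original, "b + a")
copied = original.node is not shared_before

# Unique node: the pass can mutate in place without copying.
unique = Handle(Node("x"))
unique_before = unique.node
rename_pass(unique, "y")
in_place = unique.node is unique_before
```

This also hints at why moving a reference matters: if the caller gives up its own handle before invoking the pass, the node is unique and the copy is avoided.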
-
- 10 Apr, 2020 1 commit
-
-
* Use runtime::String * move string to tvm namespace * add const char* constructor * implicit cast from std::string
Zhi committed
-
- 06 Apr, 2020 1 commit
-
-
Tang, Shizhi committed
-
- 05 Apr, 2020 1 commit
-
-
* [REFACTOR][TIR] Migrate all low-level passes to the Pass Manager. This PR migrates tvm.lower to return an IRModule of PrimFuncs instead of LoweredFuncs. * Remove LoweredFunc.
Tianqi Chen committed
-
- 03 Apr, 2020 1 commit
-
-
* [REFACTOR][TIR] Migrate most of low-level build to use the Pass Manager. - SplitHostDevice - ThreadSync - BindDevice - LowerThreadAllreduce - Provide a temp fix for printing IRModule with PrimFunc before the formal text printer. * Address comments, fix tests. * Fix relay tests * Explicit move
Tianqi Chen committed
-
- 02 Apr, 2020 4 commits
-
-
- Migrate LowerTVMBuiltin - Migrate InferFragment, LowerThreadAllreduce - Migrate ThreadSync - Refactor target::Build to directly take IRModule. - Remove unused legacy functions.
Tianqi Chen committed -
Co-authored-by: Siyuan Feng <hzfengsy@sjtu.edu.cn> This PR introduces BufferLoad/Store to TIR. The new nodes will replace Provide and Call with Tensor arguments in the subsequent refactors.
Tianqi Chen committed -
Haozheng Fan committed
-
* [REFACTOR][TIR] Introduce ExprDeepEqual, Remove IRDeepCompare This PR introduces ExprDeepEqual, which reuses the StructuralEqual infra. We migrated the use cases of ir_pass::Equal to ExprDeepEqual and StructuralEqual. * Address comments
Tianqi Chen committed
-
- 01 Apr, 2020 1 commit
-
-
* [TIR][TRANSFORM] Migrate LowerIntrin * LowerDeviceStorageAccessInfo * Migrate LowerWarpMemory
Tianqi Chen committed
-
- 30 Mar, 2020 1 commit
-
-
Zhi committed
-
- 28 Mar, 2020 1 commit
-
-
* [NODE][IR] Introduce StructuralEqual Infra for the Unified IR. This PR introduces a new way to handle structural equality for both TIR and relay nodes in an extensible way. - Each object can now register an optional SEqualReduce function, which describes how to reduce its structural equality to another instance into equality of the children. - Optionally, the object can choose to allow remapping of vars (e.g. function parameters) by calling DefEqual - We implemented a non-recursive structural equality checker that traverses the objects and performs the structural equality checking. This PR also fixes a few potential problems in relay's previous AlphaEqual. - In particular, the new structural equality relation will be commutative. - It can be dangerous to use the same_as relation to quickly check equality, as demonstrated by the following case. (%x, %y) are shared vars between two functions. - function0: fn (%x, %y) { %x + %y } - function1: fn (%y, %x) { %x + %y } The new structural equal is intended to supersede AlphaEqual and AttrsEqual. Follow-up PRs should redirect the existing usages and remove the corresponding implementations. * Update the rule to distinguish between graph nodes and non-graph nodes. * Refactor the test cases to use structural equal. * address comments * Mark more relay::Expr as graph node, fix a testcase issue (a bug that was not caught by the previous alpha equal) * Remove unrelated comment * Fix file comment * Address review comment * Relax condition to fit flaky case
Tianqi Chen committed
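The function0/function1 case from the commit above can be reproduced in a toy model. This is only a sketch of the idea of comparing under a parameter remapping, not TVM's SEqualReduce machinery: `Var`, `expr_equal`, and `func_equal` are hypothetical names, and the remapping here is a simplified positional map rather than a full bijection check.

```python
class Var:
    """A variable compared by identity, like a shared IR var."""

    def __init__(self, name):
        self.name = name


def expr_equal(lhs, rhs, remap):
    """Structural equality of toy expressions under a var remapping."""
    if isinstance(lhs, Var) and isinstance(rhs, Var):
        # A bound var must map to its partner; a free var must be itself.
        return remap.get(lhs, lhs) is rhs
    if isinstance(lhs, tuple) and isinstance(rhs, tuple):
        return len(lhs) == len(rhs) and all(
            expr_equal(a, b, remap) for a, b in zip(lhs, rhs))
    return type(lhs) is type(rhs) and lhs == rhs


def func_equal(f, g):
    """Compare two (params, body) functions, remapping params by position."""
    (f_params, f_body), (g_params, g_body) = f, g
    if len(f_params) != len(g_params):
        return False
    remap = dict(zip(f_params, g_params))
    return expr_equal(f_body, g_body, remap)


# Shared vars and a shared body, as in the commit message.
x, y = Var("x"), Var("y")
body = ("+", x, y)
function0 = ((x, y), body)  # fn (%x, %y) { %x + %y }
function1 = ((y, x), body)  # fn (%y, %x) { %x + %y }

# An identity shortcut on the body sees the very same object ...
identity_shortcut = function0[1] is function1[1]
# ... but under the parameter remapping the bodies differ.
structurally_equal = func_equal(function0, function1)

# A function with fresh vars in the same positions does compare equal.
a, b = Var("a"), Var("b")
function2 = ((a, b), ("+", a, b))
```

Here the identity check reports the bodies as the same object, while the remapping-aware comparison distinguishes first-plus-second from second-plus-first.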
-
- 24 Mar, 2020 1 commit
-
-
* [REFACTOR][TIR] Introduce PrimFuncPass. - Introduce PrimFuncPass - Convert one pass to the unified Pass API. * Address comments * Fix comments
Tianqi Chen committed
-
- 23 Mar, 2020 3 commits
-
-
* relay Node::make to constructor * patternwildcard * Address comments
Zhi committed -
[Bugfix] Fixed bug where shifting by out-of-bounds value results in no compute code being emitted. (#5115) * Fixed bug where shifting by out-of-bounds RHS values causes LLVM to generate no code. Added regression testcase * Updated testcase to be more precise. * Fixed testcase
pankratz committed -
* isfinite doc update * isfinite expr * isfinite expr * isfinite schedule reg * isfinite python binding * isfinite python binding * relay register isfinite * isfinite type relation * intrin isfinite * topi isfinite * testcase topi isfinite * tf frontend isfinite * tf frontend isfinite testcase * test case relay isfinite * small fixes * test forward tf isfinite * test cases injective for cuda * remove float16 test case * add support for isinf * remove unwanted import * fix conflict
Mahesh Ambule committed
-
- 20 Mar, 2020 1 commit
-
-
As part of the unified IR refactor, this PR refactors the target codegen to use an IRModule containing tir::PrimFuncs. In order to break the refactor into several steps without breaking the codebase, we built a conversion pass to convert Array<LoweredFunc> into IRModule. The follow-up refactors will gradually move the passes covered by IRModule up until we cover all the passes. Then we can remove the additional redundant concepts such as LoweredFunc.
Tianqi Chen committed
-
- 14 Mar, 2020 1 commit
-
-
This PR introduces tir::PrimFunc, which will be used as the TIR function container in the unified IR. It also streamlines the function attributes a bit further. - All common attributes are under tvm::attr - TIR-specific attributes are under tvm::tir::attr and come with a tir prefix - Use stl_style for attributes for now
Tianqi Chen committed
-
- 12 Mar, 2020 1 commit
-
-
pankratz committed
-
- 11 Mar, 2020 1 commit
-
-
* Add relay operation relay.op.tan. * Update tan implementation in TVM. * Update tests. * Add shape function for tan. * Add missing main test to python/frontend/tensorflow/test_forward. * Revert, back to sin/cos. * Revert "Revert, back to sin/cos." This reverts commit 4da5b503b921585ba9d80944b29136142b575c40. * Fix implementation of tan in cuda. Do not support tan for float16. Simplify topi/tests/python/test_topi_math. Add testing for tan with float32 and float64. Finally implement tan as sin/cos in llvm.
notoraptor committed
-
- 10 Mar, 2020 1 commit
-
- 06 Mar, 2020 1 commit
-
-
* Add relay operation relay.op.tan. * Update tan implementation in TVM. * Update tests. * Add shape function for tan. * Add missing main test to python/frontend/tensorflow/test_forward. * Revert, back to sin/cos. * Revert "Revert, back to sin/cos." This reverts commit 4da5b503b921585ba9d80944b29136142b575c40. * Fix implementation of tan in cuda. Do not support tan for float16. Simplify topi/tests/python/test_topi_math. Add testing for tan with float32 and float64. Try again to implement tan as sin/cos in llvm.
Yao Wang committed
-
- 21 Feb, 2020 1 commit
-
-
* support cuda tensorcore subbyte int data type in auto tensorcore * add license * pass cpplint * fix code review comments * merge the int4/int1 codegen tutorial into the existing auto tensorcore tutorial * using master's new API * disable tuning when cuda is not enabled * address cr comment * do not run the tuning * fix test failure * fix cpplint error * fix bool type reduction bug * 1. fix an index bug 2. fix returned bytes value of int1/int4/uint4 * fix typo
Orion34C committed
-
- 20 Feb, 2020 1 commit
-
-
- Allows uniform conditions for select expressions (the same as halide) exposed by the loop vectorizer. Signed-off-by: Wei Pan <weip@nvidia.com>
wpan11nv committed
-
- 19 Feb, 2020 1 commit
-
-
* [REFACTOR] Polish ffi convention. - Remove the src/api, keep registration local to the c++ function. - Remove the api_internal as it is no longer needed. * Update the codebase walk through
Tianqi Chen committed
-
- 18 Feb, 2020 1 commit
-
-
Fixed bugs that occurred when using bitwise operators on floating point type expressions. Also fixed a crash when using the ops <<, >>, %. Finally, added regression tests for both types of bug. (#4892)
pankratz committed
-