- 12 Apr, 2020 1 commit
-
-
This PR enables the copy on write optimizations passes: - Enable COW for IRModule both TIR and relay passes. - Enabled COW for PrimFunc in TIR passes. Need more thoughts into whether/how to enable COW for relay::Function, due to some function passes depend on the presence of IRModule for context information, and the std::move of the related function to nullptr might affect the related behavior.
Tianqi Chen committed
-
- 11 Apr, 2020 1 commit
-
-
* [RUNTIME] Introduce RValue reference(move) support to TypedPackedFunc This PR introduces RValue reference support the PackedFunc calling convention to address the above issue. Specifically, when an argument is a r-value reference, we will use a assign a different type code(`kObjectRValueRefArg`), and pass `Object**` (the address to the Object pointer) instead through the values array. The callee can choose to move out this Object pointer and set the original Object pointer from the caller side to be nullptr. We also add an experimental move support to the python side(marked as _move so to indicate the dev nature). This enhancement will enable copy on write optimizations through out the TVM stack. * Address review comments * fix compilation
Tianqi Chen committed
-
- 10 Apr, 2020 2 commits
-
-
* Use runtime::String * move string to tvm namespace * add const char* constructor * implicit cast from std::string
Zhi committed -
* [RUNTIME] Initial implementation of Hexagon runtime support This is only the TVM runtime. The FastRPC libraries, simulator driver, etc. will be provided in subsequent commits. * Fix pylint complaints * Fix some more pylint complaints * Add link to the Hexagon SDK website * Extract VTCM marker into a common variable * Implement device->device memory copy * Disable unsigned PDs by default * Ensure that --hvx_length is present in sim_args if HVX is enabled * Remove the line about clang from README.md Apparently things work with libstdc++. * Mention to set USE_RPC=OFF when building libtvm_runtime.so for Hexagon * Remember to use codegen_hvx in validate_hvx_length * Add a line about minimum version of LLVM
Krzysztof Parzyszek committed
-
- 07 Apr, 2020 2 commits
-
-
Tianqi Chen committed
-
* initial crt_memory and memory leak fix in graph_runtime Change-Id: I0f79f909a04d1c677aabb80f202f0612c5ce7f2a * fix memory leak Change-Id: I37104c09e28112b1974fa2b064c809d0a8d686c3 * clean up Change-Id: I039b12015a1d56c8f4120867cd5a5292da34f3e3 * implement vrealloc Change-Id: I35800470bcbfcf96652494f359711cb4c2d34398 * allocate from stack memory for most of the variables Change-Id: I72071289843fff4031c0df8796868a0b9fbc57ee * allocate from stack memory for all of the variables Change-Id: I32dba85ac1660c77f51c2d0d8ab6436ed0c01c74 * lint Change-Id: If12cd240685d7791fc60bc0cfb66389cdc186b73 * lint Change-Id: I7c9d90c11b60b8edda2427ebd189ebe535af2100 * facilitate the growth of TVM_CRT_MAX_NDIM Change-Id: I939fa43027a5c7529c5c7c6bd8d6e6beb91b7581 * extend test coverage of vmalloc Change-Id: Ie4ff6b64fdfe6810836cf8fd44dace82a20c4581 * lint Change-Id: Ibf3c06619ef296df5c49f3945cb6428777781d69 * move logging.h to src * fix an error in macOS * remove logging.h * use cflags for gcc * fix compilation error
Liangfu Chen committed
-
- 06 Apr, 2020 1 commit
-
-
[RUNTIME] Enable auto conversion from str to runtime::String in PackedFunc, move dtype related handling to data_type.h (#5251)
Tianqi Chen committed
-
- 02 Apr, 2020 1 commit
-
-
* expose runtime::String to Python * kExternalSymbol -> kGlobalSymbol
Zhi committed
-
- 31 Mar, 2020 1 commit
-
-
hlu1 committed
-
- 28 Mar, 2020 2 commits
-
-
* [NODE][IR] Introduce StructuralHash for the Unified IR. This PR introduces a new way to handle structural hash for the unified IR. - Each object can now register an optional SEqualHash function, which describes how to reduce its structural equality to sequence of hash values. - Optionally, the object can choose to allow labeling of vars(e.g. function parameters) by calling DefHash - We implemented a non-recursive structural hasher that maintains its own stack to traverse te IR. This PR also improves the hash value property from the previous relay's hash utility. In particular, the graph node mode hashs a DAG differently from a tree by attaching an unique occurence index to each graph node. In all of the test cases so far, structural_hash is consistent with structural_equal. - if structrual(x, y) then structural_hash(x) == structural_hash(y) - if structural_hash(x) == structural_hash(y) then highly likely structural_equal(x, y) - hash no collison is found in our testcases. Ideally we should work on automatically generating these functions in the future. * Fix cases for EnvFunc and Array dims * fix testcase * Update src/node/structural_hash.cc Co-Authored-By: 雾雨魔理沙 <lolisa@marisa.moe> Co-authored-by: 雾雨魔理沙 <lolisa@marisa.moe>
Tianqi Chen committed -
* [NODE][IR] Introduce StructuralEqual Infra for the Unified IR. This PR introduces a new way to handle structural equality for both TIR and relay nodes in an extensive way. - Each object can now register an optional SEqualReduce function, which describes how to reduce its structural equality to another instance into equality of the children. - Optionally, the object can choose to allow remapping of vars(e.g. function parameters) by calling DefEqual - We implemented a non-recursive structural equality checker that recursively traverses the objects and does the structural equality checking. This PR also fixes a few potential problems in previous relay's AlphaEqual. - In particular, the new structural equality relation will be communicative. - It is can be dangerous to use same_as relation to quickly check equality, demonstrated by the following case. (%x, %y) are shared vars between two functions. - function0: fn (%x, %y) { %x + %y } - function1: fn (%y, %x) { %x + %y } The new structural equal is intented to supersede AlphaEqual and AttrsEqual. Follow-up PRs should be performed to redirect the existing usages, and removes the corresponding implementation. * Update the rule to distinguish between graph node and non-graph nodes. * Refactor the test cases to use structural equal. * address comments * Mark more relay::Expr as graph node, fix a testcase issue(was bug that was not caught by previous alpha equal) * Remove unrelated comment * Fix file comment * Address review comment * Relax condition to fit flaky case
Tianqi Chen committed
-
- 11 Mar, 2020 2 commits
-
-
Wei Chen committed
-
- This patch allows CUDA backend to emit correct code for selects with vector conditions, which may be produced by floordiv op lowering etc.. - This already works for llvm BE, as llvm select instruction supports vector conditions. Signed-off-by: Wei Pan <weip@nvidia.com>
Wei Pan committed
-
- 10 Mar, 2020 2 commits
-
-
* implement kDLCPUPinned * Fix line endings * Fix whitespace for linter * cleanup up allocdataspace method
jmorrill committed -
* Add Nick's changes's squashed * Fix frontend compilation * Re-enable Rust CI * Add changes with conflicted badly * Restructure import_module! macro in order to avoid unstable features * Kill old unstable feature enablement * Refactor common to use new APIs * Move the code to stable * Fix warning Co-authored-by: Nick Hynes <nhynes@oasislabs.com>
Jared Roesch committed
-
- 29 Feb, 2020 1 commit
-
-
* Added CopyFromBytes and CopyToBytes convenience methods. Fixed typos. * Removed unneed argument check * Use TVMArrayCopyFrom/ToBytes methods * Moved CopyFrom/ToBytes to ndarray.cc * CopyToBytes impl was using CopyFromBytes. Fixed * changed inline to TVM_DLL * Used impl from TVMArrayCopyTo/FromBytes into NDArray CopyTo/FromBytes * Move implementation of all CopyFrom/ToBytes into a common impls * make arg const * simplify method impl
jmorrill committed
-
- 27 Feb, 2020 1 commit
-
-
* [Runtime] Fixed TVM_DLL_EXPORT_TYPED_FUNC to work on Windows * fix style Co-authored-by: Jon Soifer <jonso@microsoft.com>
Jon Soifer committed
-
- 26 Feb, 2020 1 commit
-
-
* bump up dev version * update
Haichen Shen committed
-
- 21 Feb, 2020 1 commit
-
-
* support cuda tensorcore subbyte int data type in auto tensorcore * add lisence * pass cpplint * fix code review comments * merge the int4/int1 codegen tutorial into the existing auto tensorcore tutorial * using master's new API * disable tuning when cuda is not enabled * address cr comment * do not run the tuning * fix test failure * fix cpplint error * fix bool type reduction bug * 1. fix a index bug 2. fix returned bytes value of int1/int4/uint4 * fix typo
Orion34C committed
-
- 11 Feb, 2020 1 commit
-
-
Lianmin Zheng committed
-
- 07 Feb, 2020 1 commit
-
-
* [REFACTOR][PY-API] Polish tvm.runtime, tvm.runtime.module API update This PR updates the tvm.runtime to use the new FFI style. - Remove top-level tvm.module to avoid confusion between runtime.Module and IRModule - API changes wrt to runtime.Module - tvm.module.load -> tvm.runtime.load_module - tvm.module.enabled -> tvm.runtime.enabled - tvm.module.system_lib -> tvm.runtime.system_lib - Remove dep on api_internal from runtime. * Update module.load in the latest API
Tianqi Chen committed
-
- 04 Feb, 2020 1 commit
-
-
* [LINT] Fix -Wextra * Fix virtual-dtor
Tianqi Chen committed
-
- 20 Jan, 2020 1 commit
-
-
* [REFACTOR][TYPE] Finish move all types to IR. - Move definition of Ref and TensorType to ir - Move type_functor.h to public header. - Rename RefType -> RelayRefType for clarity. * Add atol
Tianqi Chen committed
-
- 18 Jan, 2020 2 commits
-
-
* unify vm and interpreter objects * move closure back vm * adt/closure back to vm.adt/vm.closure * closure base
Zhi committed -
- Fixes issues to enable fp16 vectorizer. Now correct packing and unpacking CUDA code will be emitted. Enabled more unit tests. - Do not emit code to read the first lane from an undef variable int _3; _3 = _3 & ~(0x000000ff << 0) | ... and emit the following code instead: _3 = (((0x000000ff & (_1 >> 0))+(0x000000ff & (_2 >> 0))) << 0); Note that nvcc 10.2 is forgiving and emits the same code for both cases. A warning appears in test_codegen_cuda.py. Signed-off-by: Wei Pan <weip@nvidia.com>
wpan11nv committed
-
- 17 Jan, 2020 1 commit
-
-
- Remove operator bool from base object ref macro - Raitionale: operator bool can be dangerous for sub-classes that also overloads other operators(e.g. ==). - If bool is still needed, use explicit operator bool. - Use absolute include when necessary - Move type related util to data_type - Isolate stackvm code from compiler
Tianqi Chen committed
-
- 16 Jan, 2020 2 commits
-
-
Tianqi Chen committed
-
This PR introduces more clear naming prefix for C API type codes to avoid conflict with other packages. We also removed TVMArray and TVMType to directly use DLTensor and DLDataType.
Tianqi Chen committed
-
- 11 Jan, 2020 2 commits
-
-
This PR moves a few base types from relay and low-level Expr into the ir sub-folder. These classes will serve as a common type system across the stack. Rationale: - PrimExpr for low-level expressions - RelayExpr for advanced features, including Function definition. - Introduce BaseFunc to host all functions, including future PrimFunc(low-level expr functions, subject to discussion). This is a minimum change we can do to unify the classes into a common hierarchy. The main data structure that are variant specific will still be kept in the sub-namespaces. We only include classes that is needed to allow a common Module class. - BaseFunc - GlobalVar - Type definition part of ADT We will only need the BaseFunc and their checked_type to decide the calling convention across the function variants.
Tianqi Chen committed -
* replace TensorObj and TensorValue with NDArray * NodeBase to Object in Python * rebase
Zhi committed
-
- 09 Jan, 2020 2 commits
-
-
* [REFACTOR][IR] tvm::Expr -> PrimExpr(Primitive Expr) As part of unified IR, we will need to unify relay::Expr and the current tvm::Expr under the same base type. From the techinical point of view. tvm::Expr is a "primitive" expression that only contains POD types and handles and does not do life-cycle management. This PR renames Expr->PrimExpr to clarify that. We will send a subsequent PR to introduce the base expr class. * Remove legacy VarExpr and ExprHash/Equal
Tianqi Chen committed -
* [RUNTIME] Fix windows build after the latest dso module change. Switch to shared_ptr to get around a problem in latest MSVC. * [CI] Add github action for win mac build.
Tianqi Chen committed
-
- 07 Jan, 2020 1 commit
-
-
* [RUNTIME][DSO] Improve TVMBackendPackedCFunc to allow return value. Previously the signature of LibraryModule's PackedFunc does not support return value. This wasn't a limitation for our current usecase but could become one as we start to generate more interesting functions. This feature also start to get interesting as we move towards unified object protocol and start to pass object around. This PR enhances the function signature to allow return values. We also created two macros TVM_DLL_EXPORT_PACKED_FUNC and TVM_DLL_EXPORT_TYPED_FUNC to allow manual creation of functions that can be loaded by a LibraryModule. Examples are added in apps/dso_plugin_module. The change to TVMBackendPackedCFunc is backward compatible, as previous function will simply ignore the return value field. * address review comments
Tianqi Chen committed
-
- 06 Jan, 2020 1 commit
-
-
Previously we support a limited case of function type deduction and in many places we have to supply the type twice during set_body_typed (one in the template parameter, another in the lambda signature). This PR improves the deduce function by enablng automatic function signature deduction. ``` TVM_REGISTER_GLOBAL("sub") .set_body_typed([](int x, int y) -> int { return x - y; }); ``` Unfortunately, because of template conflict, we can not support the original case where both type signature and lambda are supplied through set_body_typed. This PR refactors the existing regsitration to the new style.
Tianqi Chen committed
-
- 04 Jan, 2020 1 commit
-
-
TVM_REGSISTER_API is an alias of TVM_REGISTER_GLOBAL. In the spirit of simplify redirections, this PR removes the original TVM_REGISTER_API macro and directly use TVM_REGISTER_GLOBAL. This type of refactor will also simplify the IDE navigation tools such as FFI navigator to provide better code reading experiences. Move EnvFunc's definition to node.
Tianqi Chen committed
-
- 01 Jan, 2020 1 commit
-
-
Zhi committed
-
- 31 Dec, 2019 1 commit
-
-
* [REFACTOR][OBJECT] Consoldiate NodePtr/Ref/Hash/Equal and macros to Object. Historically, we have classes like NodePtr/Ref/HashEqual. After unified object protocol, these names are just alias of the object counterpart. Moreover, there are helper macros defined over the places for defining these object. This PR consoldiate the terminologies into the corresponding ones in the Object system so we have a clean and consistent API moving forward. * Update include/tvm/attrs.h Co-Authored-By: Wei Chen <ipondering.weic@gmail.com> * fix compilation Co-authored-by: Wei Chen <ipondering.weic@gmail.com>
Tianqi Chen committed
-
- 30 Dec, 2019 1 commit
-
-
* [REFACTOR][RUNTIME] Move NDArray to Object System. Previously NDArray has its own object reference counting mechanism. This PR migrates NDArray to the unified object protocol. The calling convention of NDArray remained intact. That means NDArray still has its own type_code and its handle is still DLTensor compatible. In order to do so, this PR added a few minimum runtime type detection in TVMArgValue and RetValue only when the corresponding type is a base type(ObjectRef) that could also refer to NDArray. This means that even if we return a base reference object ObjectRef which refers to the NDArray. The type_code will still be translated correctly as kNDArrayContainer. If we assign a non-base type(say Expr) that we know is not compatible with NDArray during compile time, no runtime type detection will be performed. This PR also adopts the object protocol for NDArray sub-classing and removed the legacy NDArray subclass protocol. Examples in apps/extension are now updated to reflect that. Making NDArray as an Object brings all the benefits of the object system. For example, we can now use the Array container to store NDArrays. * Address review comments
Tianqi Chen committed
-
- 27 Dec, 2019 1 commit
-
-
Zhao Wu (Chinese Name: 吴钊) committed
-
- 26 Dec, 2019 1 commit
-
-
Zhao Wu committed
-