1. 12 Apr, 2020 1 commit
    • [IR][TRANSFORM] Enable CopyOnWrite for passes. (#5309) · e4b80bda
      This PR enables copy-on-write (COW) optimizations for passes:
      - Enable COW for IRModule in both TIR and relay passes.
      - Enable COW for PrimFunc in TIR passes.
      
      More thought is needed on whether and how to enable COW
      for relay::Function: some function passes depend
      on the presence of the IRModule for context information,
      and moving the related function out via std::move (leaving nullptr behind)
      might affect that behavior.
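      
      As a rough illustration of the copy-on-write idea (not TVM's implementation), the sketch below mutates a container in place only when the caller handed over the sole reference, and clones it otherwise. `Module` and `AddFunc` are hypothetical names, and `std::shared_ptr::use_count` stands in for TVM's object reference count.
      
      ```cpp
      #include <cassert>
      #include <memory>
      #include <utility>
      #include <vector>

      // Hypothetical stand-in for an IR container; not a TVM class.
      struct Module {
        std::vector<int> funcs;
      };

      // Copy-on-write: write in place when `mod` is the only reference,
      // otherwise clone first so other holders keep seeing the old value.
      std::shared_ptr<Module> AddFunc(std::shared_ptr<Module> mod, int f) {
        if (mod.use_count() != 1) {
          mod = std::make_shared<Module>(*mod);  // shared: copy before writing
        }
        mod->funcs.push_back(f);
        return mod;
      }
      ```
      
      A caller that passes its reference with `std::move` takes the in-place path and is left holding a null pointer; a caller that keeps its own copy triggers the clone.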
      Tianqi Chen committed
  2. 11 Apr, 2020 1 commit
    • [RUNTIME] Introduce RValue reference(move) support to TypedPackedFunc (#5271) · b72dd9d9
      * [RUNTIME] Introduce RValue reference(move) support to TypedPackedFunc
      
      This PR introduces RValue reference support to the PackedFunc calling convention to address the above issue.
      Specifically, when an argument is an r-value reference, we assign a different type code (`kObjectRValueRefArg`)
      and pass `Object**` (the address of the Object pointer) through the values array instead.
      The callee can choose to move out this Object pointer and set the caller's original Object pointer to nullptr.
      
      We also add experimental move support on the Python side (marked as _move to indicate its in-development nature).
      This enhancement will enable copy-on-write optimizations throughout the TVM stack.
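      
      The convention described above can be sketched in miniature as follows. This is an assumption-laden toy, not the runtime's code: only the name `kObjectRValueRefArg` comes from the commit message, while `Object`, `Value`, `kObjectHandle` and `TakeArg` are made up for illustration.
      
      ```cpp
      #include <cassert>

      // Hypothetical reference-counted object.
      struct Object {
        int ref_count = 1;
      };

      // Hypothetical type codes; only kObjectRValueRefArg is named in the text.
      enum TypeCode { kObjectHandle, kObjectRValueRefArg };

      // Hypothetical slot of the values array.
      union Value {
        void* v_handle;
      };

      // Callee side: for an rvalue-ref argument the slot holds Object**,
      // so the callee can steal the pointer and null out the caller's copy.
      Object* TakeArg(Value value, int type_code) {
        if (type_code == kObjectRValueRefArg) {
          Object** slot = static_cast<Object**>(value.v_handle);
          Object* moved = *slot;
          *slot = nullptr;  // caller no longer owns the object
          return moved;     // ownership transferred, ref count unchanged
        }
        Object* shared = static_cast<Object*>(value.v_handle);
        ++shared->ref_count;  // ordinary argument: take an extra reference
        return shared;
      }
      ```
      
      Because a moved argument arrives with its reference count untouched, a callee that sees a count of one knows it holds the only reference.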
      
      * Address review comments
      
      * fix compilation
      Tianqi Chen committed
  3. 10 Apr, 2020 2 commits
    • [REFACTOR][IR] Move to runtime::String (#5276) · 5da361d3
      * Use runtime::String
      
      * move string to tvm namespace
      
      * add const char* constructor
      
      * implicit cast from std::string
      Zhi committed
    • [RUNTIME] Initial implementation of Hexagon runtime support (#5252) · 02d3a59b
      * [RUNTIME] Initial implementation of Hexagon runtime support
      
      This is only the TVM runtime. The FastRPC libraries, simulator driver,
      etc. will be provided in subsequent commits.
      
      * Fix pylint complaints
      
      * Fix some more pylint complaints
      
      * Add link to the Hexagon SDK website
      
      * Extract VTCM marker into a common variable
      
      * Implement device->device memory copy
      
      * Disable unsigned PDs by default
      
      * Ensure that --hvx_length is present in sim_args if HVX is enabled
      
      * Remove the line about clang from README.md
      
      Apparently things work with libstdc++.
      
      * Mention to set USE_RPC=OFF when building libtvm_runtime.so for Hexagon
      
      * Remember to use codegen_hvx in validate_hvx_length
      
      * Add a line about minimum version of LLVM
      Krzysztof Parzyszek committed
  4. 07 Apr, 2020 2 commits
    • [uTVM][Runtime] Introduce Virtual Memory Allocator to CRT (#5124) · e11a6092
      * initial crt_memory and memory leak fix in graph_runtime
      
      Change-Id: I0f79f909a04d1c677aabb80f202f0612c5ce7f2a
      
      * fix memory leak
      
      Change-Id: I37104c09e28112b1974fa2b064c809d0a8d686c3
      
      * clean up
      
      Change-Id: I039b12015a1d56c8f4120867cd5a5292da34f3e3
      
      * implement vrealloc
      
      Change-Id: I35800470bcbfcf96652494f359711cb4c2d34398
      
      * allocate from stack memory for most of the variables
      
      Change-Id: I72071289843fff4031c0df8796868a0b9fbc57ee
      
      * allocate from stack memory for all of the variables
      
      Change-Id: I32dba85ac1660c77f51c2d0d8ab6436ed0c01c74
      
      * lint
      
      Change-Id: If12cd240685d7791fc60bc0cfb66389cdc186b73
      
      * lint
      
      Change-Id: I7c9d90c11b60b8edda2427ebd189ebe535af2100
      
      * facilitate the growth of TVM_CRT_MAX_NDIM
      
      Change-Id: I939fa43027a5c7529c5c7c6bd8d6e6beb91b7581
      
      * extend test coverage of vmalloc
      
      Change-Id: Ie4ff6b64fdfe6810836cf8fd44dace82a20c4581
      
      * lint
      
      Change-Id: Ibf3c06619ef296df5c49f3945cb6428777781d69
      
      * move logging.h to src
      
      * fix an error in macOS
      
      * remove logging.h
      
      * use cflags for gcc
      
      * fix compilation error
      Liangfu Chen committed
  5. 06 Apr, 2020 1 commit
  6. 02 Apr, 2020 1 commit
  7. 31 Mar, 2020 1 commit
  8. 28 Mar, 2020 2 commits
    • [NODE][IR] Introduce StructuralHash for the Unified IR. (#5160) · 497d01d3
      * [NODE][IR] Introduce StructuralHash for the Unified IR.
      
      This PR introduces a new way to handle structural hash for the unified IR.
      
      - Each object can now register an optional SEqualHash function, which
        describes how to reduce its structural equality to a sequence of hash values.
      - Optionally, the object can choose to allow labeling of vars (e.g. function parameters)
        by calling DefHash.
      - We implemented a non-recursive structural hasher that maintains its own stack
        to traverse the IR.
      
      This PR also improves the hash value property over relay's previous hash utility.
      In particular, the graph node mode hashes a DAG differently from a tree
      by attaching a unique occurrence index to each graph node.
      
      In all of the test cases so far, structural_hash is consistent with structural_equal:
      - if structural_equal(x, y) then structural_hash(x) == structural_hash(y)
      - if structural_hash(x) == structural_hash(y) then highly likely structural_equal(x, y)
        - no hash collision has been found in our test cases.
      
      Ideally we should work on automatically generating these functions in the future.
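      
      To make the var-labeling idea concrete, here is a minimal sketch on a toy IR. It covers only the labeling step (hashing a bound variable by its definition index instead of its name), not the stack-based traversal or the graph-node mode. `Func`, `Combine` and `StructuralHash` are hypothetical names, not the functions added by this PR.
      
      ```cpp
      #include <cassert>
      #include <cstddef>
      #include <functional>
      #include <string>
      #include <unordered_map>
      #include <vector>

      // Hypothetical toy IR: parameters plus a body given as a flat token list.
      struct Func {
        std::vector<std::string> params;
        std::vector<std::string> body;
      };

      // Simple hash mixing step (boost-style hash_combine).
      inline std::size_t Combine(std::size_t seed, std::size_t v) {
        return seed ^ (v + 0x9e3779b9 + (seed << 6) + (seed >> 2));
      }

      // Bound variables contribute their definition index, free tokens their
      // text, so renaming parameters consistently leaves the hash unchanged.
      std::size_t StructuralHash(const Func& f) {
        std::unordered_map<std::string, std::size_t> label;
        for (const std::string& p : f.params) {
          const std::size_t next = label.size();
          label.emplace(p, next);
        }
        std::size_t h = f.params.size();
        for (const std::string& tok : f.body) {
          auto it = label.find(tok);
          const std::size_t v =
              (it != label.end())
                  ? Combine(1, it->second)
                  : Combine(2, std::hash<std::string>()(tok));
          h = Combine(h, v);
        }
        return h;
      }
      ```
      
      With this scheme `fn (x, y) { x + y }` and `fn (a, b) { a + b }` hash identically, which matches the first consistency property listed above.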
      
      * Fix cases for EnvFunc and Array dims
      
      * fix testcase
      
      * Update src/node/structural_hash.cc
      
      Co-Authored-By: 雾雨魔理沙 <lolisa@marisa.moe>
      
      Co-authored-by: 雾雨魔理沙 <lolisa@marisa.moe>
      Tianqi Chen committed
    • [NODE][IR] Introduce StructuralEqual Infra for the unified IR. (#5154) · 997a14ed
      * [NODE][IR] Introduce StructuralEqual Infra for the Unified IR.
      
      This PR introduces a new way to handle structural equality
      for both TIR and relay nodes in an extensible way.
      
      - Each object can now register an optional SEqualReduce function, which
        describes how to reduce its structural equality to another instance
        into equality of the children.
      - Optionally, the object can choose to allow remapping of vars (e.g. function parameters)
        by calling DefEqual.
      - We implemented a non-recursive structural equality checker that
        traverses the objects and performs the structural equality checking.
      
      This PR also fixes a few potential problems in relay's previous AlphaEqual.
      
      - In particular, the new structural equality relation will be commutative.
      - It can be dangerous to use the same_as relation to quickly check equality,
        as demonstrated by the following case, where (%x, %y) are vars shared between two functions.
      
      - function0: fn (%x, %y) { %x + %y }
      - function1: fn (%y, %x) { %x + %y }
      
      The new structural equal is intended to supersede AlphaEqual and AttrsEqual.
      
      Follow-up PRs should redirect the existing usages and remove
      the corresponding implementations.
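      
      The function0/function1 case can be reproduced with a small sketch of equality under variable remapping. It assumes a toy IR in which a body is a flat token list; `Func` and `StructuralEqual` here are hypothetical stand-ins, not the checker added by this PR.
      
      ```cpp
      #include <cassert>
      #include <cstddef>
      #include <string>
      #include <unordered_map>
      #include <vector>

      // Hypothetical toy IR: parameters plus a body given as a flat token list.
      struct Func {
        std::vector<std::string> params;
        std::vector<std::string> body;
      };

      // Equal if a one-to-one remapping of parameters makes the bodies
      // identical; tokens that are not parameters must match exactly.
      bool StructuralEqual(const Func& lhs, const Func& rhs) {
        if (lhs.params.size() != rhs.params.size() ||
            lhs.body.size() != rhs.body.size()) {
          return false;
        }
        std::unordered_map<std::string, std::string> l2r, r2l;
        for (std::size_t i = 0; i < lhs.params.size(); ++i) {
          // Keep both directions so the mapping stays a bijection.
          if (!l2r.emplace(lhs.params[i], rhs.params[i]).second) return false;
          if (!r2l.emplace(rhs.params[i], lhs.params[i]).second) return false;
        }
        for (std::size_t i = 0; i < lhs.body.size(); ++i) {
          auto it = l2r.find(lhs.body[i]);
          if (it != l2r.end()) {
            if (it->second != rhs.body[i]) return false;
          } else if (r2l.count(rhs.body[i]) || lhs.body[i] != rhs.body[i]) {
            return false;
          }
        }
        return true;
      }
      ```
      
      For function0 and function1 the remapping sends %x to %y and %y to %x, so the bodies compare as `%y + %x` against `%x + %y` and the check fails in both argument orders, even though the two bodies are the same expression when compared by identity.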
      
      * Update the rule to distinguish between graph node and non-graph nodes.
      
      * Refactor the test cases to use structural equal.
      
      * address comments
      
      * Mark more relay::Expr as graph node, fix a testcase issue (a bug that was not caught by the previous alpha equal)
      
      * Remove unrelated comment
      
      * Fix file comment
      
      * Address review comment
      
      * Relax condition to fit flaky case
      Tianqi Chen committed
  9. 11 Mar, 2020 2 commits
  10. 10 Mar, 2020 2 commits
  11. 29 Feb, 2020 1 commit
    • Added CopyFromBytes and CopyToBytes convenience methods to NDArray. Fixed typos. (#4970) · 474c70d7
      * Added CopyFromBytes and CopyToBytes convenience methods.  Fixed typos.
      
      * Removed unneeded argument check
      
      * Use TVMArrayCopyFrom/ToBytes methods
      
      * Moved CopyFrom/ToBytes to ndarray.cc
      
      * CopyToBytes impl was using CopyFromBytes.  Fixed
      
      * changed inline to TVM_DLL
      
      * Used impl from TVMArrayCopyTo/FromBytes into NDArray CopyTo/FromBytes
      
      * Move implementation of all CopyFrom/ToBytes into a common impl
      
      * make arg const
      
      * simplify method impl
      jmorrill committed
  12. 27 Feb, 2020 1 commit
  13. 26 Feb, 2020 1 commit
  14. 21 Feb, 2020 1 commit
    • [CODEGEN] Support cuda tensorcore subbyte int data type in auto tensorcore (#4546) · f23ac969
      * support cuda tensorcore subbyte int data type in auto tensorcore
      
      * add license
      
      * pass cpplint
      
      * fix code review comments
      
      * merge the int4/int1 codegen tutorial into the existing auto tensorcore tutorial
      
      * using master's new API
      
      * disable tuning when cuda is not enabled
      
      * address cr comment
      
      * do not run the tuning
      
      * fix test failure
      
      * fix cpplint error
      
      * fix bool type reduction bug
      
      * 1. fix an index bug 2. fix returned bytes value of int1/int4/uint4
      
      * fix typo
      Orion34C committed
  15. 11 Feb, 2020 1 commit
  16. 07 Feb, 2020 1 commit
    • [REFACTOR][PY][API-Change] Polish tvm.runtime, tvm.runtime.module API update (#4837) · e0122c0e
      * [REFACTOR][PY-API] Polish tvm.runtime, tvm.runtime.module API update
      
      This PR updates the tvm.runtime to use the new FFI style.
      
      - Remove top-level tvm.module to avoid confusion between runtime.Module and IRModule
      - API changes with respect to runtime.Module
        - tvm.module.load -> tvm.runtime.load_module
        - tvm.module.enabled -> tvm.runtime.enabled
        - tvm.module.system_lib -> tvm.runtime.system_lib
      - Remove dep on api_internal from runtime.
      
      * Update module.load in the latest API
      Tianqi Chen committed
  17. 04 Feb, 2020 1 commit
  18. 20 Jan, 2020 1 commit
  19. 18 Jan, 2020 2 commits
    • [runtime][refactor] Unify vm and interpreter objects (#4693) · acbf8851
      * unify vm and interpreter objects
      
      * move closure back vm
      
      * adt/closure back to vm.adt/vm.closure
      
      * closure base
      Zhi committed
    • [CodeGen][CUDA] Improve CUDA vectorizer (#4736) · 2630ffcb
      - Fixes issues to enable fp16 vectorizer. Now correct packing and
        unpacking CUDA code will be emitted. Enabled more unit tests.
      
      - Do not emit code to read the first lane from an undef variable
      
        int _3;
        _3 = _3 & ~(0x000000ff << 0) | ...
      
        and emit the following code instead:
      
        _3 = (((0x000000ff & (_1 >> 0))+(0x000000ff & (_2 >> 0))) << 0);
      
        Note that nvcc 10.2 is forgiving and emits the same code for both cases.
        A warning appears in test_codegen_cuda.py.
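      
      For readers unfamiliar with the packed-lane pattern in the snippet above, the following host-side sketch emulates a lane-wise add of two 4 x 8-bit vectors stored in 32-bit words. It is an illustration only, not the code the generator emits: `AddPackedLanes` is a hypothetical name, and the extra masking of each sum is a choice made here to keep lanes independent.
      
      ```cpp
      #include <cassert>
      #include <cstdint>

      // Adds four 8-bit lanes packed into 32-bit words, lane by lane.
      std::uint32_t AddPackedLanes(std::uint32_t a, std::uint32_t b) {
        // Lane 0 is written with a plain assignment, so `out` is never
        // read before it has a value.
        std::uint32_t out = 0xffu & ((0xffu & (a >> 0)) + (0xffu & (b >> 0)));
        for (int lane = 1; lane < 4; ++lane) {
          const int shift = lane * 8;
          const std::uint32_t sum =
              0xffu & ((0xffu & (a >> shift)) + (0xffu & (b >> shift)));
          // Later lanes clear their byte in `out` and then insert the sum.
          out = (out & ~(0xffu << shift)) | (sum << shift);
        }
        return out;
      }
      ```
      
      Only the lanes after the first need the read-modify-write form, which is why reading the destination for lane 0 is unnecessary.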
      
      Signed-off-by: Wei Pan <weip@nvidia.com>
      wpan11nv committed
  20. 17 Jan, 2020 1 commit
    • [REFACTOR] Polish runtime (#4729) · b171cf1d
      - Remove operator bool from base object ref macro
        - Rationale: operator bool can be dangerous for sub-classes
          that also overload other operators (e.g. ==).
        - If bool is still needed, use explicit operator bool.
      - Use absolute include when necessary
      - Move type related util to data_type
      - Isolate stackvm code from compiler
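      
      The operator bool point above can be illustrated with a small sketch. `Ref` is a hypothetical reference class, not TVM's ObjectRef; the example only shows what marking the conversion `explicit` allows and forbids.
      
      ```cpp
      #include <cassert>
      #include <memory>
      #include <type_traits>

      // Hypothetical reference type holding a shared payload.
      class Ref {
       public:
        Ref() = default;
        explicit Ref(int v) : data_(std::make_shared<int>(v)) {}
        // `explicit` permits `if (ref)` but blocks silent conversions,
        // e.g. two refs being compared as bools by a stray `==`.
        explicit operator bool() const { return data_ != nullptr; }
        bool same_as(const Ref& other) const { return data_ == other.data_; }

       private:
        std::shared_ptr<int> data_;
      };

      // The conversion is not available implicitly.
      static_assert(!std::is_convertible<Ref, bool>::value,
                    "Ref must not convert to bool implicitly");

      // Contextual conversion in `if` still works with the explicit operator.
      int CountDefined(const Ref& a, const Ref& b) {
        int n = 0;
        if (a) ++n;
        if (b) ++n;
        return n;
      }
      ```
      
      With an implicit conversion, an expression such as `a == b` on a class without its own `operator==` would compile and compare two booleans; with the explicit form it is a compile error.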
      Tianqi Chen committed
  21. 16 Jan, 2020 2 commits
  22. 11 Jan, 2020 2 commits
    • [REFACTOR][IR] Initialize Unified IR Expr Data Structure (#4673) · 12e51e6c
      This PR moves a few base types from relay and low-level Expr into the ir sub-folder.
      These classes will serve as a common type system across the stack.
      
      Rationale:
      
      - PrimExpr for low-level expressions
      - RelayExpr for advanced features, including Function definition.
      - Introduce BaseFunc to host all functions, including future PrimFunc(low-level expr functions, subject to discussion).
      
      This is the minimum change we can make to unify the classes into a common hierarchy.
      The main data structures that are variant-specific will still be kept in the sub-namespaces.
      We only include the classes that are needed to allow a common Module class:
      - BaseFunc
      - GlobalVar
      - Type definition part of ADT
      
      We will only need BaseFunc and its checked_type to decide the calling convention
      across the function variants.
      Tianqi Chen committed
    • [REFACTOR] Replace TensorObj and TensorValue with NDArray (#4643) · 86092de0
      * replace TensorObj and TensorValue with NDArray
      
      * NodeBase to Object in Python
      
      * rebase
      Zhi committed
  23. 09 Jan, 2020 2 commits
  24. 07 Jan, 2020 1 commit
    • [RUNTIME][DSO] Improve TVMBackendPackedCFunc to allow return val (#4637) · 77c47748
      * [RUNTIME][DSO] Improve TVMBackendPackedCFunc to allow return value.
      
      Previously the signature of LibraryModule's PackedFunc did not support a return value.
      This wasn't a limitation for our current use case but could become one
      as we start to generate more interesting functions.
      
      This feature also starts to get interesting as we move towards a unified
      object protocol and start to pass objects around.
      This PR enhances the function signature to allow return values.
      
      We also created two macros TVM_DLL_EXPORT_PACKED_FUNC and TVM_DLL_EXPORT_TYPED_FUNC
      to allow manual creation of functions that can be loaded by a LibraryModule.
      
      Examples are added in apps/dso_plugin_module.
      The change to TVMBackendPackedCFunc is backward compatible,
      as previous functions will simply ignore the return value field.
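      
      A packed C function with a return slot might look roughly like the sketch below. All types and names here (`Value`, `Code`, `PackedCFunc`, `AddInts`, `NoReturn`) are simplified, hypothetical stand-ins; the real definitions are in TVM's C runtime headers and may differ.
      
      ```cpp
      #include <cassert>
      #include <cstdint>

      // Hypothetical argument/return slot and type codes.
      union Value {
        std::int64_t v_int64;
        void* v_handle;
      };
      enum Code { kInt, kNull };

      // Assumed shape of the extended signature: the last two parameters
      // carry the return value and its type code.
      typedef int (*PackedCFunc)(Value* args, int* type_codes, int num_args,
                                 Value* out_ret_value, int* out_ret_tcode);

      // Writes args[0] + args[1] into the return slot; nonzero means error.
      int AddInts(Value* args, int* type_codes, int num_args,
                  Value* out_ret_value, int* out_ret_tcode) {
        if (num_args != 2 || type_codes[0] != kInt || type_codes[1] != kInt) {
          return -1;
        }
        out_ret_value->v_int64 = args[0].v_int64 + args[1].v_int64;
        *out_ret_tcode = kInt;
        return 0;
      }

      // A function that never touches the return slot: the caller's
      // pre-initialized "null" result is left in place.
      int NoReturn(Value*, int*, int, Value*, int*) { return 0; }
      ```
      
      In this sketch the caller initializes the return code to `kNull` before the call, so a function that ignores the return fields simply yields no value.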
      
      * address review comments
      Tianqi Chen committed
  25. 06 Jan, 2020 1 commit
    • [REFACTOR] Automatically deduce function type signature in Registry.set_body_typed (#4623) · d5d63a44
      Previously we supported only a limited case of function type deduction, and in many places
      we had to supply the type twice during set_body_typed (once in the template parameter, again in the lambda signature).
      
      This PR improves the deduction by enabling automatic function signature deduction.
      
      ```
      TVM_REGISTER_GLOBAL("sub")
      .set_body_typed([](int x, int y) -> int { return x - y; });
      ```
      
      Unfortunately, because of a template conflict, we cannot support the original case
      where both the type signature and the lambda are supplied through set_body_typed.
      
      This PR refactors the existing registrations to the new style.
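      
      One common way to deduce a lambda's signature, shown here as a standalone sketch rather than as the code in this PR, is to inspect the type of its `operator()`. `function_signature` and `MakeTyped` are illustrative names, and the sketch only handles non-mutable, non-generic lambdas.
      
      ```cpp
      #include <cassert>
      #include <functional>
      #include <type_traits>
      #include <utility>

      // Primary template: look at the callable's operator().
      template <typename T>
      struct function_signature
          : function_signature<decltype(&T::operator())> {};

      // Specialization for const member call operators (ordinary lambdas).
      template <typename C, typename R, typename... Args>
      struct function_signature<R (C::*)(Args...) const> {
        using type = R(Args...);
      };

      // Wraps a lambda without the caller restating its signature.
      template <typename F>
      std::function<typename function_signature<F>::type> MakeTyped(F f) {
        return std::function<typename function_signature<F>::type>(std::move(f));
      }
      ```
      
      Calling `MakeTyped([](int x, int y) -> int { return x - y; })` yields a `std::function<int(int, int)>` with no explicit template argument, mirroring the single-signature registration style shown above.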
      Tianqi Chen committed
  26. 04 Jan, 2020 1 commit
    • [REFACTOR] TVM_REGISTER_API -> TVM_REGISTER_GLOBAL (#4621) · 81523604
      TVM_REGISTER_API is an alias of TVM_REGISTER_GLOBAL.
      In the spirit of simplifying redirections, this PR removes
      the original TVM_REGISTER_API macro and directly uses TVM_REGISTER_GLOBAL.
      
      This type of refactor will also make it easier for IDE navigation tools
      such as FFI navigator to provide a better code reading experience.
      
      Move EnvFunc's definition to node.
      Tianqi Chen committed
  27. 01 Jan, 2020 1 commit
  28. 31 Dec, 2019 1 commit
    • [REFACTOR][OBJECT] Consolidate NodePtr/Ref/Hash/Equal to Object (#4603) · a8c36921
      * [REFACTOR][OBJECT] Consolidate NodePtr/Ref/Hash/Equal and macros to Object.
      
      Historically, we have classes like NodePtr/Ref/HashEqual.
      After the unified object protocol, these names are just aliases of their object counterparts.
      Moreover, there are helper macros defined all over the place for defining these objects.
      
      This PR consolidates the terminology into the corresponding names
      in the Object system so we have a clean and consistent API moving forward.
      
      * Update include/tvm/attrs.h
      
      Co-Authored-By: Wei Chen <ipondering.weic@gmail.com>
      
      * fix compilation
      
      Co-authored-by: Wei Chen <ipondering.weic@gmail.com>
      Tianqi Chen committed
  29. 30 Dec, 2019 1 commit
    • [REFACTOR][RUNTIME] Update NDArray to use the Unified Object System (#4581) · 55bd786f
      * [REFACTOR][RUNTIME] Move NDArray to Object System.
      
      Previously NDArray has its own object reference counting mechanism.
      This PR migrates NDArray to the unified object protocol.
      
      The calling convention of NDArray remained intact.
      That means NDArray still has its own type_code and
      its handle is still DLTensor compatible.
      
      In order to do so, this PR adds minimal runtime type
      detection in TVMArgValue and RetValue, only when the corresponding
      type is a base type (ObjectRef) that could also refer to an NDArray.
      
      This means that even if we return a base reference object ObjectRef
      which refers to an NDArray, the type_code will still be translated
      correctly as kNDArrayContainer.
      If we assign a non-base type (say Expr) that we know at compile time is not compatible
      with NDArray, no runtime type detection will be performed.
      
      This PR also adopts the object protocol for NDArray sub-classing and
      removes the legacy NDArray subclass protocol.
      Examples in apps/extension are now updated to reflect that.
      
      Making NDArray an Object brings all the benefits of the object system.
      For example, we can now use the Array container to store NDArrays.
      
      * Address review comments
      Tianqi Chen committed
  30. 27 Dec, 2019 1 commit
  31. 26 Dec, 2019 1 commit