1. 12 Apr, 2020 2 commits
  2. 11 Apr, 2020 1 commit
    • [LLVM] Fix generation of LLVM intrinsics (#5282) · 403929f9
      * [LLVM] Fix generation of LLVM intrinsics
      
      The type list in the call to llvm::Intrinsic::getDeclaration is not
      the intrinsic's signature, it's the list of overloaded types. Without
      this fix, the updated unit test would cause the following error:
      
      TVMError: LLVM module verification failed with the following errors:
      Intrinsic name not mangled correctly for type arguments! Should be:
      llvm.ctlz.i32
      i32 (i32, i1)* @llvm.ctlz.i32.i1
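      The mangling rule behind that error can be sketched with a purely illustrative helper (mangle_intrinsic is hypothetical, not TVM or LLVM code): LLVM appends one type suffix per overloaded type, so passing the full argument-type list instead of the overload list reproduces the bad name from the error above.

```python
def mangle_intrinsic(name, overload_types):
    """Hypothetical sketch of LLVM's intrinsic name mangling: one suffix
    is appended per overloaded type, not per argument type."""
    return ".".join([name] + overload_types)

# llvm.ctlz is overloaded only on its data operand, so the correct name
# has a single suffix even though the full signature is i32 (i32, i1):
print(mangle_intrinsic("llvm.ctlz", ["i32"]))        # llvm.ctlz.i32
# Passing every argument type yields the mis-mangled name from the error:
print(mangle_intrinsic("llvm.ctlz", ["i32", "i1"]))  # llvm.ctlz.i32.i1
```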
      
      Special handling for llvm.prefetch; signature matching for overloaded intrinsics only
      
      The prefetch intrinsic returns void in LLVM, while it returns i32 in TVM.
      This case needs to be handled specially, because the rule-based intrinsic
      translation would otherwise create an invalid LLVM type.
      
      Do the signature matching only for overloaded intrinsics. It is not needed
      for non-overloaded ones, so skipping it saves a bit of compile time.
      
      * Include intrinsic name in the error message
      
      * Fix number of arguments for llvm.fmuladd and llvm.pow
      Krzysztof Parzyszek committed
  3. 10 Apr, 2020 3 commits
  4. 08 Apr, 2020 1 commit
  5. 07 Apr, 2020 3 commits
  6. 05 Apr, 2020 1 commit
  7. 03 Apr, 2020 2 commits
  8. 02 Apr, 2020 2 commits
  9. 01 Apr, 2020 1 commit
  10. 26 Mar, 2020 1 commit
  11. 24 Mar, 2020 1 commit
  12. 23 Mar, 2020 1 commit
    • [Relay, Topi, TF Frontend] Isfinite operator (#4981) · 9037f4ec
      * isfinite doc update
      
      * isfinite expr
      
      * isfinite expr
      
      * isfinite schedule reg
      
      * isfinite python binding
      
      * isfinite python binding
      
      * relay register isfinite
      
      * isfinite type relation
      
      * intrin isfinite
      
      * topi isfinite
      
      * testcase topi isfinite
      
      * tf frontend isfinite
      
      * tf frontend isfinite testcase
      
      * test case relay isfinite
      
      * small fixes
      
      * test forward tf isfinite
      
      * test cases injective for cuda
      
      * remove float16 test case
      
      * add support for isinf
      
      * remove unwanted import
      
      * fix conflict
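      A minimal reference for the semantics the commits above implement, using numpy's isfinite/isinf (a sketch only, not the TVM code from this PR):

```python
import numpy as np

# isfinite is True for ordinary values and False for +/-inf and nan;
# isinf (support added later in this PR) flags only the infinities.
x = np.array([1.0, np.inf, -np.inf, np.nan, 0.0], dtype="float32")
print(np.isfinite(x).tolist())  # [True, False, False, False, True]
print(np.isinf(x).tolist())     # [False, True, True, False, False]
```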
      Mahesh Ambule committed
  13. 22 Mar, 2020 1 commit
  14. 20 Mar, 2020 1 commit
    • [TIR][TARGET] Refactor Target codegen to use IRModule and PrimFunc. (#5107) · 841725cc
      As part of the unified IR refactor.
      This PR refactors the target codegen to use IRModule containing tir::PrimFuncs.
      
      In order to break the refactor into several steps without breaking the codebase,
      we built a conversion pass to convert Array&lt;LoweredFunc&gt; into IRModule.
      
      The follow-up refactors will gradually move more passes onto IRModule
      until all passes are covered. Then we can remove redundant
      concepts such as LoweredFunc.
      Tianqi Chen committed
  15. 18 Mar, 2020 1 commit
  16. 16 Mar, 2020 1 commit
    • Return empty CSourceModule when no lowered_funcs exists in Relay mod (#4847) · 11ee1a0e
      * Use dummy func when no lowered_funcs exists in Relay mod
      
      * Dummy func -> CSourceModule with empty code str
      
      * Added comments describing the empty CSourceModule
      
      * Always import external modules w/o assertions
      
      * Use CSourceModule as a fallback for LLVMModule
      
      * Changed cond for target == llvm
      
      * Create an empty LLVM module w/o using dummy func
      
      * Avoid using IR str concat to create LLVM module
      
      * Improved comments for codegen.LLVMModuleCreate
      
      * Satisfy the linter for LLVMModuleCreate
      Ruizhe Zhao committed
  17. 11 Mar, 2020 3 commits
  18. 10 Mar, 2020 1 commit
  19. 09 Mar, 2020 1 commit
  20. 06 Mar, 2020 1 commit
    • [topi][relay] add operation tan to TVM (#4938) · d992468d
      * Add relay operation relay.op.tan.
      
      * Update tan implementation in TVM.
      
      * Update tests.
      
      * Add shape function for tan.
      
      * Add missing main test to python/frontend/tensorflow/test_forward.
      
      * Revert, back to sin/cos.
      
      * Revert "Revert, back to sin/cos."
      
      This reverts commit 4da5b503b921585ba9d80944b29136142b575c40.
      
      * Fix implementation of tan in cuda. Do not support tan for float16.
      
      Simplify topi/tests/python/test_topi_math. Add testing for tan with float32 and float64.
      
      Try again to implement tan as sin/cos in llvm.
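      The sin/cos lowering mentioned above relies on a simple identity, which can be checked numerically; this is a numpy sketch, not the TVM/LLVM implementation:

```python
import numpy as np

# Away from the poles of tan, lowering tan(x) as sin(x)/cos(x) matches
# the direct computation to float32 precision.
x = np.linspace(-1.0, 1.0, 9).astype("float32")
lowered = np.sin(x) / np.cos(x)
assert np.allclose(lowered, np.tan(x), rtol=1e-5)
```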
      Yao Wang committed
  21. 25 Feb, 2020 1 commit
  22. 21 Feb, 2020 1 commit
    • [CODEGEN] Support cuda tensorcore subbyte int data type in auto tensorcore (#4546) · f23ac969
      * support cuda tensorcore subbyte int data type in auto tensorcore
      
      * add license
      
      * pass cpplint
      
      * fix code review comments
      
      * merge the int4/int1 codegen tutorial into the existing auto tensorcore tutorial
      
      * using master's new API
      
      * disable tuning when cuda is not enabled
      
      * address cr comment
      
      * do not run the tuning
      
      * fix test failure
      
      * fix cpplint error
      
      * fix bool type reduction bug
      
      * 1. fix an index bug 2. fix returned bytes value of int1/int4/uint4
      
      * fix typo
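      The "returned bytes value of int1/int4/uint4" fix concerns sub-byte storage sizes; one way to picture it is whole-byte rounding (storage_bytes is a hypothetical helper for illustration, not the actual TVM code):

```python
def storage_bytes(bits, lanes=1):
    # Sub-byte types such as int1 and int4 pack several lanes per byte,
    # so total storage rounds up to a whole number of bytes.
    return max(1, (bits * lanes + 7) // 8)

print(storage_bytes(4, 2))   # 1: two int4 lanes share one byte
print(storage_bytes(1, 32))  # 4: thirty-two int1 lanes need four bytes
print(storage_bytes(4, 1))   # 1: a single int4 still occupies a full byte
```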
      Orion34C committed
  23. 18 Feb, 2020 1 commit
  24. 16 Feb, 2020 1 commit
    • [CodeGen][CUDA] Fix issues in cuda codegen (#4876) · d50ba721
      - Do not emit __shared__ etc. as part of type for casting
      
      - Fix fp16 reduction kernels with compiler errors:
      
        "no operator "+" matches these operands: volatile half + volatile half"
      
        This patch inserts casts to remove the volatile type qualifier following
        volatile loads (fp16 only). CUDA fp16 library headers should add
        volatile member functions.
      
      - Update have_fp16 to include compute 6.1 GPUs, which do support fp16,
        although their fp16 throughput is low. Updated tests.
      
      Signed-off-by: Wei Pan <weip@nvidia.com>
      wpan11nv committed
  25. 13 Feb, 2020 1 commit
  26. 11 Feb, 2020 1 commit
  27. 07 Feb, 2020 1 commit
    • [REFACTOR][PY][API-Change] Polish tvm.runtime, tvm.runtime.module API update (#4837) · e0122c0e
      * [REFACTOR][PY-API] Polish tvm.runtime, tvm.runtime.module API update
      
      This PR updates the tvm.runtime to use the new FFI style.
      
      - Remove top-level tvm.module to avoid confusion between runtime.Module and IRModule
      - API changes wrt to runtime.Module
        - tvm.module.load -> tvm.runtime.load_module
        - tvm.module.enabled -> tvm.runtime.enabled
        - tvm.module.system_lib -> tvm.runtime.system_lib
      - Remove dep on api_internal from runtime.
      
      * Update module.load in the latest API
      Tianqi Chen committed
  28. 04 Feb, 2020 1 commit
  29. 21 Jan, 2020 1 commit
    • [REFACTOR] Establish printer in the source folder (#4752) · e4d817d4
      * [REFACTOR] Establish printer in the source folder.
      
      As we move towards the unified IR, we will eventually want to build a unified
      printer for both Relay and TIR.
      
      This PR isolates the printer component into a separate folder in src as a first step.
      
      - Refactored the Doc DSL using Object and cleaned up its APIs.
      - Isolated the metadata into a separate header.
      - Moved the printer into relay_text_printer and added comments about further TODOs.
      
      * Rename NodePrinter -> ReprPrinter to distinguish it from other printers
      Tianqi Chen committed
  30. 19 Jan, 2020 2 commits
    • [REFACTOR][CODEGEN] codegen->target, build_module->driver (#4742) · 33b0831c
      This PR moves the codegen related code into the target folder,
      as they are target specific functionalities.
      
      We also adopt the term "compiler driver", used in common compiler
      infrastructure such as rust, GHC and clang.
      As a result, build_module is moved into the driver folder.
      Tianqi Chen committed
    • [REFACTOR] Establish tir (#4740) · cf59b206
      TIR is the new namespace for low-level IR
      for tensor-level optimizations and loop transformations.
      
      This PR establishes the namespace and files.
      
      - buffer.h, data_layout.h, lowered_func.h -> tir/buffer.h, tir/data_layout.h, tir/lowered_func.h
      - ir.h -> tir/expr.h, tir/stmt.h
      - ir_functor_ext.h -> tir/expr_functor.h, tir/stmt_functor.h
      Tianqi Chen committed