1. 16 Feb, 2020 1 commit
    • [CodeGen][CUDA] Fix issues in cuda codegen (#4876) · d50ba721
      - Do not emit __shared__ etc. as part of type for casting
      
      - Fix fp16 reduction kernels with compiler errors:
      
        "no operator "+" matches these operands, volatile half + volatile half
      
        This patch inserts casts to remove volatile type qualifier following
        volatile loads (fp16 only). CUDA fp16 library headers should add
        volatile member functions.
      
      - Update have_fp16 to include compute 6.1 GPUs, which do support fp16,
        although their fp16 throughput is low. Updated tests.
      
      Signed-off-by: Wei Pan <weip@nvidia.com>
      wpan11nv committed
  2. 04 Feb, 2020 1 commit
  3. 19 Jan, 2020 2 commits
    • [REFACTOR][CODEGEN] codegen->target, build_module->driver (#4742) · 33b0831c
      This PR moves the codegen related code into the target folder,
      as they are target specific functionalities.
      
      We also adopt the term "compiler driver" in common compiler infra
      such as rust, GHC and clang.
      As a result, build_module is moved into the driver folder.
      Tianqi Chen committed
    • [REFACTOR] Establish tir (#4740) · cf59b206
      TIR is the new namespace for low-level IR
      for tensor-level optimizations and loop transformations.
      
      This PR establishes the namespace and files.
      
      - lowered_func.h,buffer.h,data_layout.h -> tir/buffer.h,tir/data_layout.h,tir/lowered_func.h
      - ir.h -> tir/expr.h, tir/stmt.h
      - ir_functor_ext.h -> tir/expr_functor.h, tir/stmt_functor.h
      Tianqi Chen committed
  4. 18 Jan, 2020 1 commit
    • [CodeGen][CUDA] Improve CUDA vectorizer (#4736) · 2630ffcb
      - Fixes issues to enable fp16 vectorizer. Now correct packing and
        unpacking CUDA code will be emitted. Enabled more unit tests.
      
      - Do not emit code to read the first lane from an undef variable
      
        int _3;
        _3 = _3 & ~(0x000000ff << 0) | ...
      
        and emit the following code instead:
      
        _3 = (((0x000000ff & (_1 >> 0))+(0x000000ff & (_2 >> 0))) << 0);
      
        Note that nvcc 10.2 is forgiving and emits the same code for both cases.
        A warning appears in test_codegen_cuda.py.
      
      Signed-off-by: Wei Pan <weip@nvidia.com>
      wpan11nv committed
  5. 17 Jan, 2020 1 commit
  6. 15 Jan, 2020 1 commit
  7. 09 Jan, 2020 1 commit
    • [REFACTOR][IR] tvm::Expr -> PrimExpr(Primitive Expr) (#4669) · d6a23cf5
      * [REFACTOR][IR] tvm::Expr -> PrimExpr(Primitive Expr)
      
      As part of unified IR, we will need to unify relay::Expr
      and the current tvm::Expr under the same base type.
      
      From the techinical point of view. tvm::Expr is a "primitive"
      expression that only contains POD types and handles and does
      not do life-cycle management.
      
      This PR renames Expr->PrimExpr to clarify that.
      We will send a subsequent PR to introduce the base expr class.
      
      * Remove legacy VarExpr and ExprHash/Equal
      Tianqi Chen committed
  8. 08 Jan, 2020 1 commit
    • [REFACTOR][IR] Add Node suffix to low-level IR nodes (#4649) · f4c5f93b
      * [REFACTOR][IR] Variable -> VarNode
      
      * [REFACTOR][IR] Add/Sub/Mul/Div -> AddNode/SubNode etc.
      
      * [REFACTOR][IR] Min/Max/FloorDiv/FloorMod -> MinNode/MaxNode etc.
      
      * [REFACTOR][IR] EQ/NE/LT/LE/GT/GE/Select -> EQNode/NENode etc.
      
      * [REFACTOR][IR] Add Node suffix to Select/Call/Load/Ramp/Shuffle/Let
      
      * [REFACTOR][IR] Add node suffix to IntImm/UIntImm/FloatImm/StringImm
      
      * [REFACTOR][IR] Add Node suffix to Any, AttrStmt, AssertStmt
      
      * [REFACTOR][IR] Add Node suffix to Store/Provide/Allocate/Free
      
      * [REFACTOR][IR] Add Node suffix to ProducerConsumer
      
      * Fix lint
      
      * style updates, test fixes
      Tianqi Chen committed
  9. 22 Dec, 2019 1 commit
    • [REFACTOR][DTYPE] Isolate dtype to runtime (#4560) · 7fa8aab5
      dtype.h -> runtime/data_type.h
      
      Changes:
      - Rename all old reference of tvm::Type to DataType
      - ExprNode.type -> ExprNode.dtype
      - Expr.type() -> Expr.dtype()
      - Change Expr related functions to expr_operator.
        - DataType::min() -> min_value(DataType)
        - DataType::max() -> max_value(DataType)
      - Move type constructor Int, UInt, Float, Handle, Bool into DataType.
        - Int(bits) -> DataType::Int(bits)
        - UInt(bits) -> DataType::UInt(bits)
      Tianqi Chen committed
  10. 24 Nov, 2019 1 commit
  11. 14 Nov, 2019 1 commit
  12. 10 Nov, 2019 1 commit
  13. 31 Oct, 2019 1 commit
  14. 25 Oct, 2019 1 commit
  15. 24 Oct, 2019 1 commit
    • TensorCore Support using Intrinsic (#4136) · 324a9607
      * add tensor core support
      
      * avoid memory bank conflict
      
      * fix thread sync & better performance
      
      * better performance
      
      * add schedule test for conv2d
      
      * extend into BatchMatMul
      
      * support config fragment shape and layout using intrinsic
      
      * add TensorCore tutorial
      
      * add int support and fix lint
      
      * address comment
      
      * add 32*16*8 TensorCore test
      
      * fix wmma include logic
      Siyuan Feng committed
  16. 11 Oct, 2019 1 commit
  17. 13 Sep, 2019 1 commit
  18. 01 Aug, 2019 1 commit
  19. 06 Jul, 2019 1 commit
  20. 17 May, 2019 1 commit
  21. 08 Apr, 2019 1 commit
    • [HEADER] Add Header to Comply with ASF Release Policy (#2982) · cffb4fba
      * [HEADER] ASF header dir=include
      
      * [HEADER] ASF Header dir=src
      
      * [HEADER] ASF Header -dir=python
      
      * [HEADER] ASF header dir=topi
      
      * [HEADER] ASF Header dir=nnvm
      
      * [HEADER] ASF Header -dir=tutorials
      
      * [HEADER] ASF Header dir=tests
      
      * [HEADER] ASF Header -dir=docker
      
      * fix whitespace
      
      * [HEADER] ASF Header -dir=jvm
      
      * [HEADER] ASF Header -dir=web
      
      * [HEADER] ASF Header --dir=apps
      
      * [HEADER] ASF Header --dir=vta
      
      * [HEADER] ASF Header -dir=go
      
      * temp
      
      * [HEADER] ASF Header --dir=rust
      
      * [HEADER] Add ASF Header --dir=cmake
      
      * [HEADER] ASF Header --dir=docs
      
      * [HEADER] Header for Jenkinsfile
      
      * [HEADER] ASF Header to toml and md
      
      * [HEADER] ASF Header to gradle
      
      * Finalize rat cleanup
      
      * Fix permission
      
      * Fix java test
      
      * temporary remove nnvm onnx test
      Tianqi Chen committed
  22. 24 Oct, 2018 1 commit
  23. 07 Oct, 2018 1 commit
  24. 01 Oct, 2018 1 commit
  25. 23 Aug, 2018 1 commit
  26. 09 Aug, 2018 1 commit
  27. 01 Aug, 2018 1 commit
  28. 20 Jul, 2018 1 commit
  29. 11 Jun, 2018 1 commit
  30. 24 Dec, 2017 1 commit
  31. 11 Dec, 2017 2 commits
  32. 30 Nov, 2017 1 commit
  33. 06 Jul, 2017 1 commit
  34. 03 Jul, 2017 1 commit
  35. 02 Jun, 2017 1 commit
  36. 02 May, 2017 1 commit
  37. 22 Apr, 2017 1 commit
  38. 15 Apr, 2017 1 commit