- 16 Feb, 2020 1 commit
-
-
- Do not emit __shared__ etc. as part of type for casting
- Fix fp16 reduction kernels with compiler errors: "no operator "+" matches these operands, volatile half + volatile half". This patch inserts casts to remove the volatile type qualifier following volatile loads (fp16 only). CUDA fp16 library headers should add volatile member functions.
- Update have_fp16 to include compute 6.1 GPUs, which do support fp16, although their fp16 throughput is low. Updated tests.
Signed-off-by: Wei Pan <weip@nvidia.com>
wpan11nv committed
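The have_fp16 update described in this commit can be pictured with a minimal Python sketch. Only the name have_fp16 comes from the commit message; the signature, the string form of the compute version, and the thresholds (5.3, plus every 6.x or later part) are assumptions made for illustration, not the project's actual code.

```python
def have_fp16(compute_version: str) -> bool:
    """Return True if a GPU of this compute capability has fp16 arithmetic.

    Hypothetical sketch: the thresholds below are assumptions for
    illustration, not the table used by the project.
    """
    major, minor = (int(part) for part in compute_version.split("."))
    # Assumption: 5.3 is the only 5.x part with fp16 arithmetic.
    if (major, minor) == (5, 3):
        return True
    # Assumption: every 6.x and later part qualifies, including 6.1,
    # even though its fp16 throughput is low.
    return major >= 6


for version in ("5.2", "5.3", "6.1", "7.0"):
    print(version, have_fp16(version))
```

The check answers whether fp16 code can run at all; it says nothing about whether fp16 is fast on a given part, which is why 6.1 is accepted here.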
-
- 04 Feb, 2020 1 commit
-
-
* [LINT] Fix -Wextra
* Fix virtual-dtor
Tianqi Chen committed
-
- 19 Jan, 2020 2 commits
-
-
This PR moves the codegen-related code into the target folder, as it is target-specific functionality. We also adopt the term "compiler driver", which is common in compiler infrastructure such as Rust, GHC, and Clang. As a result, build_module is moved into the driver folder.
Tianqi Chen committed
-
TIR is the new namespace for low-level IR for tensor-level optimizations and loop transformations. This PR establishes the namespace and files.
- lowered_func.h,buffer.h,data_layout.h -> tir/buffer.h,tir/data_layout.h,tir/lowered_func.h
- ir.h -> tir/expr.h, tir/stmt.h
- ir_functor_ext.h -> tir/expr_functor.h, tir/stmt_functor.h
Tianqi Chen committed
-
- 18 Jan, 2020 1 commit
-
-
- Fixes issues to enable the fp16 vectorizer. Correct packing and unpacking CUDA code is now emitted. Enabled more unit tests.
- Do not emit code to read the first lane from an undef variable:
  int _3; _3 = _3 & ~(0x000000ff << 0) | ...
  Emit the following code instead:
  _3 = (((0x000000ff & (_1 >> 0))+(0x000000ff & (_2 >> 0))) << 0);
  Note that nvcc 10.2 is forgiving and emits the same code for both cases. A warning appears in test_codegen_cuda.py.
Signed-off-by: Wei Pan <weip@nvidia.com>
wpan11nv committed
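The lane-wise pattern quoted in this commit can be sketched in Python for readers who want to see why the first lane must not read the destination. This is an illustrative model only: the function name, the choice of four 8-bit lanes in a 32-bit word, and the explicit masking of each lane sum are assumptions, not the code the generator emits.

```python
LANE_BITS = 8      # assumed lane width, matching the 0x000000ff mask above
LANE_MASK = 0xFF
NUM_LANES = 4      # assumed: four lanes packed into one 32-bit word


def add_packed_lanes(a: int, b: int) -> int:
    """Add two packed words lane by lane, without carries between lanes."""
    result = 0
    for lane in range(NUM_LANES):
        shift = lane * LANE_BITS
        lane_sum = (LANE_MASK & (a >> shift)) + (LANE_MASK & (b >> shift))
        lane_bits = (lane_sum & LANE_MASK) << shift
        if lane == 0:
            # First lane: assign directly, so the (still undefined)
            # destination is never read.
            result = lane_bits
        else:
            # Later lanes: clear the lane in the destination, then merge.
            result = (result & ~(LANE_MASK << shift)) | lane_bits
    return result & 0xFFFFFFFF


print(hex(add_packed_lanes(0x01020304, 0x10203040)))  # prints 0x11223344
```

Assigning the first lane directly is what avoids the read of an uninitialized variable; only the later lanes need the clear-and-merge form.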
-
- 17 Jan, 2020 1 commit
-
-
Move the conversion extensions to the specific class definitions so that we no longer need to include packed_func_ext.
Tianqi Chen committed
-
- 15 Jan, 2020 1 commit
-
-
* [REFACTOR] Move support related code to include/tvm/support
  - tvm/logging.h -> tvm/support/logging.h
  - remove tvm/base.h, move with into tvm/support/with.h
* src/common -> src/support
Tianqi Chen committed
-
- 09 Jan, 2020 1 commit
-
-
* [REFACTOR][IR] tvm::Expr -> PrimExpr(Primitive Expr)
  As part of unified IR, we will need to unify relay::Expr and the current tvm::Expr under the same base type. From the technical point of view, tvm::Expr is a "primitive" expression that only contains POD types and handles and does not do life-cycle management. This PR renames Expr->PrimExpr to clarify that. We will send a subsequent PR to introduce the base expr class.
* Remove legacy VarExpr and ExprHash/Equal
Tianqi Chen committed
-
- 08 Jan, 2020 1 commit
-
-
* [REFACTOR][IR] Variable -> VarNode
* [REFACTOR][IR] Add/Sub/Mul/Div -> AddNode/SubNode etc.
* [REFACTOR][IR] Min/Max/FloorDiv/FloorMod -> MinNode/MaxNode etc.
* [REFACTOR][IR] EQ/NE/LT/LE/GT/GE/Select -> EQNode/NENode etc.
* [REFACTOR][IR] Add Node suffix to Select/Call/Load/Ramp/Shuffle/Let
* [REFACTOR][IR] Add node suffix to IntImm/UIntImm/FloatImm/StringImm
* [REFACTOR][IR] Add Node suffix to Any, AttrStmt, AssertStmt
* [REFACTOR][IR] Add Node suffix to Store/Provide/Allocate/Free
* [REFACTOR][IR] Add Node suffix to ProducerConsumer
* Fix lint
* style updates, test fixes
Tianqi Chen committed
-
- 22 Dec, 2019 1 commit
-
-
dtype.h -> runtime/data_type.h
Changes:
- Rename all old reference of tvm::Type to DataType
- ExprNode.type -> ExprNode.dtype
- Expr.type() -> Expr.dtype()
- Change Expr related functions to expr_operator.
  - DataType::min() -> min_value(DataType)
  - DataType::max() -> max_value(DataType)
- Move type constructor Int, UInt, Float, Handle, Bool into DataType.
  - Int(bits) -> DataType::Int(bits)
  - UInt(bits) -> DataType::UInt(bits)
Tianqi Chen committed
-
- 24 Nov, 2019 1 commit
-
-
* [LINT] Improve the check tool to handle ASF copyright message.
* [LINT] Remove unnecessary copyright message as per ASF requirement.
* Fix codegen hybrid
* [LINT] Broaden license checks to include html, xml
* [LINT] Fix rest of the files
* Fix notice
* [LINT] Improve check file type error message
Tianqi Chen committed
-
- 14 Nov, 2019 1 commit
-
-
* add volatile override back
* [codegen] remove fp16 function override for cuda
Yizhi Liu committed
-
- 10 Nov, 2019 1 commit
-
-
Yizhi Liu committed
-
- 31 Oct, 2019 1 commit
-
-
Tianqi Chen committed
-
- 25 Oct, 2019 1 commit
-
-
Zhi committed
-
- 24 Oct, 2019 1 commit
-
-
* add tensor core support
* avoid memory bank conflict
* fix thread sync & better performance
* better performance
* add schedule test for conv2d
* extend into BatchMatMul
* support config fragment shape and layout using intrinsic
* add TensorCore tutorial
* add int support and fix lint
* address comment
* add 32*16*8 TensorCore test
* fix wmma include logic
Siyuan Feng committed
-
- 11 Oct, 2019 1 commit
-
-
* overload half operators for cuda codegen
* add float16 te test_op_level1
* fix test_op_level1.py
* fix lint
* disable fp16 test if gpu does not support
* disable fp16 test if gpu does not support
* bypass float16 test if gpu does not support float16
Xingyu Zhou committed
-
- 13 Sep, 2019 1 commit
-
-
* Fix int8x4 vectorize
* Fix gpu shared/local memory accumulate
* Add test_shared_memory for int8x4
* Adjust test format
* Fix cpplint
noituIover committed
-
- 01 Aug, 2019 1 commit
-
-
Jian Weng committed
-
- 06 Jul, 2019 1 commit
-
-
Tianqi Chen committed
-
- 17 May, 2019 1 commit
-
-
lixiaoquan committed
-
- 08 Apr, 2019 1 commit
-
-
* [HEADER] ASF header dir=include
* [HEADER] ASF Header dir=src
* [HEADER] ASF Header -dir=python
* [HEADER] ASF header dir=topi
* [HEADER] ASF Header dir=nnvm
* [HEADER] ASF Header -dir=tutorials
* [HEADER] ASF Header dir=tests
* [HEADER] ASF Header -dir=docker
* fix whitespace
* [HEADER] ASF Header -dir=jvm
* [HEADER] ASF Header -dir=web
* [HEADER] ASF Header --dir=apps
* [HEADER] ASF Header --dir=vta
* [HEADER] ASF Header -dir=go
* temp
* [HEADER] ASF Header --dir=rust
* [HEADER] Add ASF Header --dir=cmake
* [HEADER] ASF Header --dir=docs
* [HEADER] Header for Jenkinsfile
* [HEADER] ASF Header to toml and md
* [HEADER] ASF Header to gradle
* Finalize rat cleanup
* Fix permission
* Fix java test
* temporary remove nnvm onnx test
Tianqi Chen committed
-
- 24 Oct, 2018 1 commit
-
-
Wuwei Lin committed
-
- 07 Oct, 2018 1 commit
-
-
Tianqi Chen committed
-
- 01 Oct, 2018 1 commit
-
-
Tianqi Chen committed
-
- 23 Aug, 2018 1 commit
-
-
MORITA Kazutaka committed
-
- 09 Aug, 2018 1 commit
-
-
* Use int for int8x4 due to performance overhead of char4
* Add a comment about using int
* Remove invalid test
Wuwei Lin committed
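Storing an int8x4 vector in a single int, as this commit describes, amounts to packing four signed bytes into one 32-bit word. The Python sketch below models that layout with the standard struct module; the helper names and the little-endian lane order are assumptions for illustration and say nothing about how the generated CUDA code is written.

```python
import struct


def pack_int8x4(lanes):
    """Pack four signed 8-bit values into one 32-bit int (lane 0 lowest)."""
    if len(lanes) != 4:
        raise ValueError("expected exactly four lanes")
    # "<4b": four signed bytes, "<i": one signed 32-bit int, little-endian.
    return struct.unpack("<i", struct.pack("<4b", *lanes))[0]


def unpack_int8x4(word):
    """Recover the four signed 8-bit lanes from a packed 32-bit int."""
    return list(struct.unpack("<4b", struct.pack("<i", word)))


packed = pack_int8x4([1, 2, 3, 4])
print(hex(packed))            # prints 0x4030201
print(unpack_int8x4(packed))  # prints [1, 2, 3, 4]
```

With this layout a whole vector moves as one 32-bit value, and individual lanes are reached with shifts and masks rather than per-component struct fields.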
-
- 01 Aug, 2018 1 commit
-
-
Tatsuya Nishiyama committed
-
- 20 Jul, 2018 1 commit
-
-
Tatsuya Nishiyama committed
-
- 11 Jun, 2018 1 commit
-
-
Tianqi Chen committed
-
- 24 Dec, 2017 1 commit
-
-
* [CODEGEN] update codegen for vector operation
* update comment, fix for metal
Lianmin Zheng committed
-
- 11 Dec, 2017 2 commits
-
-
* Use long long for platforms where long is 32 bits (like windows).
* Make sure scalar chars are signed.
* Re-add NOLINT marker.
abergeron committed
-
* [CODEGEN] add fp16 and fp64 enable pragma for opencl
* fix style
Lianmin Zheng committed
-
- 30 Nov, 2017 1 commit
-
-
* [CUDA] Enable int64
* [PYTHON] Fix rpc tutorial with opencl
* OK
* update
Tianqi Chen committed
-
- 06 Jul, 2017 1 commit
-
-
* [CODEGEN/PASS] add restricted, alignment option
* fix lint
* Fix the alloca
Tianqi Chen committed
-
- 03 Jul, 2017 1 commit
-
-
Tianqi Chen committed
-
- 02 Jun, 2017 1 commit
-
-
Tianqi Chen committed
-
- 02 May, 2017 1 commit
-
-
* [CODEGEN/RUNTIME] Metal support, runtime improvement.
* Fix case when no device is available
Tianqi Chen committed
-
- 22 Apr, 2017 1 commit
-
-
* [LANG/CODEGEN] Intrinsics and Extern Math
* fix lint
Tianqi Chen committed
-
- 15 Apr, 2017 1 commit
-
-
* [PERF] Persistent kernel
* fix doc
Tianqi Chen committed
-