- 09 Nov, 2017 2 commits
- 08 Nov, 2017 3 commits
-
-
* feat(docs) add how_to for tvm install with nnpack support * feat(docs) change python package paragraph * feat(doc) remove unsure sentence * add comments on nnpack usage vs TVM * remove mxnet nnpack tips for nthread change
Erwan BERNARD committed -
* Support vector operations for AMD (llvm IR) * fix whitespace * update comments, docstring
eqy committed -
Leyuan Wang committed
-
- 07 Nov, 2017 1 commit
-
-
Change minimum 32-bit restriction for floating point types to 8-bit. This change enables reduced-precision types that may use vector operations under the hood (cases where #lanes > 1, such as half4).
eqy committed
-
- 06 Nov, 2017 2 commits
- 03 Nov, 2017 2 commits
-
-
Tianqi Chen committed
-
Yuwei Hu committed
-
- 02 Nov, 2017 1 commit
-
-
* enable popcount intrin * fix lint * add test * fix python3
Yuwei Hu committed
-
- 01 Nov, 2017 1 commit
-
-
Cyril Lashkevich committed
-
- 30 Oct, 2017 1 commit
-
-
Leyuan Wang committed
-
- 27 Oct, 2017 1 commit
-
-
Tianqi Chen committed
-
- 26 Oct, 2017 4 commits
-
-
masahi committed
-
* removed fma dispatch * added comments to explain why remove fma * fix lint * use fmuladd intrin for fma dispatch
masahi committed -
* view llvm ir and gcn asm with module.get_source(...) * fix lint
masahi committed -
* [BUFFER] Smarter slice to detect compactness * move simplify of begins early
Tianqi Chen committed
-
- 25 Oct, 2017 1 commit
-
-
Yuwei Hu committed
-
- 24 Oct, 2017 3 commits
-
-
Tianqi Chen committed
-
Tianqi Chen committed
-
Wei Chen committed
-
- 23 Oct, 2017 1 commit
-
-
* update topi/cuda schedules to use target.max_num_threads * allow num_thread to be larger than cuda.max_num_threads * remove get_max_num_threads and make it inline
masahi committed
-
- 22 Oct, 2017 3 commits
-
-
Tianqi Chen committed
-
* add friendly tips when cl and link are not found * fix lint
Hu Shiwen committed -
Wei Chen committed
-
- 20 Oct, 2017 1 commit
-
-
* added math function support * bug fix extern func call in llvm based codegen * lint fix * fix build * moved rocm bitcodes detection to python
masahi committed
-
- 19 Oct, 2017 1 commit
-
-
Use `object.__eq__` (the default object identity comparison) as the default implementation of same_as. This should be OK: since `EqualOp` and `NotEqualOp` are pure Python objects, `object.__eq__` is sufficient.
Wei Chen committed
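A minimal sketch of the idea in the commit above, assuming a hypothetical pure-Python class (`EqualOpSketch` is an illustrative name, not the TVM class). It shows `same_as` delegating to `object.__eq__`. One detail worth knowing: when called directly, `object.__eq__` returns `NotImplemented` rather than `False` for two distinct objects, so this sketch normalizes the result to a bool.

```python
class EqualOpSketch:
    """Hypothetical stand-in for a pure-Python wrapper such as EqualOp."""

    def __init__(self, a, b):
        self.a, self.b = a, b

    def same_as(self, other):
        # object.__eq__ is Python's default identity comparison. Called
        # directly, it returns True for the same object and NotImplemented
        # otherwise, so compare against True to get a plain bool.
        return object.__eq__(self, other) is True


p = EqualOpSketch(1, 2)
q = EqualOpSketch(1, 2)  # same fields, different object

same = p.same_as(p)       # identity holds
different = p.same_as(q)  # equal fields do not count as "same"
```

Because the comparison is by identity, two wrappers built from the same operands are still not `same_as` each other.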
-
- 17 Oct, 2017 2 commits
-
-
* [PYTHON] Improve equal sugar * fix comment
Tianqi Chen committed -
Tianqi Chen committed
-
- 16 Oct, 2017 3 commits
-
-
* [ARITH] More canonical simplify * [DEBUG] Use HalideIR with trace logging
Tianqi Chen committed -
* [FIX] Fix target warning * [FIX] Deduplicate options * Fix * Fix
ziheng committed -
* [CODEGEN] Allow link additional module * fix py3 * add register back
Tianqi Chen committed
-
- 15 Oct, 2017 2 commits
-
-
Tianqi Chen committed
-
* [CODEGEN] Force not inline compute core for better debug * also support llvm4
Tianqi Chen committed
-
- 14 Oct, 2017 3 commits
-
-
* [TVM] Introduce target generic dispatch system * fix target warning
Tianqi Chen committed -
masahi committed
-
* [CODEGEN] Detect broadcast(cast(x)) pattern in FMA * [CODEGEN] Improve * [CODEGEN] Fix
ziheng committed
-
- 13 Oct, 2017 2 commits
-
-
* Add same_as to NodeBase
1. Most classes inherited from NodeBase (Schedule, Stage, etc.) still have the convenience of using '==' for object identity, which is the right behavior for non-Expr classes.
2. Subclasses of ExprOp now create an EQ expression when '==' is used. `__nonzero__` and `__bool__` in EQ and NE are a compromise, since in some cases object identity semantics is still useful, such as in unit tests. For instance, in `assert a == b`, "a == b" creates an EQ expression, and assert then calls `__nonzero__` on the result expression. `Expr.__nonzero__` throws an exception since it prohibits evaluating an IR expression. In a more complex case such as `assert a in b  # b is dict`, it calls `__eq__` on a and all keys of b, then `__bool__` on the result expression. This could not easily be done by same_as.
* Retain __hash__ from NodeBase in Python3
Wei Chen committed -
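The behavior described in the commit above can be sketched in plain Python. This is an illustration under stated assumptions, not the TVM source: `Node`, `Expr`, `Var`, `EQ`, and `NE` here are simplified hypothetical stand-ins for NodeBase, ExprOp subclasses, and the EQ/NE expression nodes.

```python
class Node:
    """Stand-in for NodeBase: '==' keeps Python's default identity semantics."""

    def same_as(self, other):
        # Explicit identity check available on every node type.
        return self is other


class Expr(Node):
    """Stand-in for an ExprOp subclass: '==' builds an expression node."""

    def __eq__(self, other):
        return EQ(self, other)

    def __ne__(self, other):
        return NE(self, other)

    # Defining __eq__ makes Python 3 drop the inherited __hash__,
    # so retain the one from the base class explicitly.
    __hash__ = Node.__hash__

    def __bool__(self):
        # A general IR expression has no Python truth value.
        raise ValueError("cannot evaluate an IR expression as a Python bool")

    __nonzero__ = __bool__  # Python 2 spelling


class Var(Expr):
    def __init__(self, name):
        self.name = name


class EQ(Expr):
    def __init__(self, a, b):
        self.a, self.b = a, b

    def __bool__(self):
        # The compromise: fall back to identity of the two operands.
        return self.a.same_as(self.b)

    __nonzero__ = __bool__


class NE(Expr):
    def __init__(self, a, b):
        self.a, self.b = a, b

    def __bool__(self):
        return not self.a.same_as(self.b)

    __nonzero__ = __bool__


x, y = Var("x"), Var("y")
cond = (x == y)  # an EQ node, not a Python bool
```

With this arrangement, `assert x == x` and `x in {x: 1}` keep working in unit tests, while `bool(x)` on a plain expression still raises.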
* added support for rocm gpu autodetect * changed type casting from old style to static_cast * fixed code to generate gfx specific code object * fixed namespaces
Aditya Atluri committed
-