- 03 Jan, 2018 3 commits
-
-
* rocblas integration * fix include * fix lint
masahi committed -
* modified schedule_dataflow_rewrite.cc to fix the lost-tensor problem * modified schedule_dataflow_rewrite.cc for lint scan * modified schedule_dataflow_rewrite.cc for lint scan * use the tensor's value_index to index the output of the stage op
libing4752 committed -
* [CODEGEN] update codegen for vector operation * update comment, fix for metal * fix some bugs in codegen * use 'restrict' in every argument * fix * fix
Lianmin Zheng committed
-
- 02 Jan, 2018 1 commit
-
-
* add cublas support * integrate cublas to topi dense * add cublas error check * minor fix * fix lint * remove topi import from contrib unittest
masahi committed
-
- 31 Dec, 2017 2 commits
-
-
* [SCHEDULE] enable partition const loop with build flag (#719) * enable partition loop with build flag * add a testcase, and modify LoopPartition related cases * add document for split_const_loop * [IRbuild] Support automatically naming loop variables in IRBuilder (#719) * add idx_num in class * using typical index [i, j, k] first, then i_suffix * keep inputs names * fix lint * improve comment of name * fix lint
xqdan committed -
Tianqi Chen committed
-
- 29 Dec, 2017 3 commits
-
-
* [SCHEDULE] enable partition const loop with build flag (#719) * enable partition loop with build flag * add a testcase, and modify LoopPartition related cases * add document for split_const_loop
xqdan committed -
* use cudnn findalgo to choose the best algo * fix lint
masahi committed -
* When there is no intrin func, use the body for initialization (issue 714). * Refine code per review comments, and add a test case. * Fix lint issues. * Re-organize the tensorize test cases, and add a new case for non-reset mode. * Fix a typo. * Delete the unit test case because it was already merged into test_schedule_tensorize.py.
kun-zh committed
-
- 27 Dec, 2017 3 commits
-
-
* When there is no intrin func, use the body for initialization (issue 714). * Refine code per review comments, and add a test case. * Fix lint issues.
kun-zh committed -
* support dim-0 tensor in topi ops * revert transform * revert
Xingjian Shi committed -
* add target.libs to target str representation * integrate cudnn into topi cuda * append target.libs to target.options
masahi committed
-
- 26 Dec, 2017 2 commits
-
-
* add extern schedule for miopen * fix comment * optionally dispatch to miopen from topi * fix lint * check if current target is None * use generic dispatch for rocm conv2d * fix lint * fix workspace bug * remove blank line * remove blank line * remove blank line
masahi committed -
Tianqi Chen committed
-
- 25 Dec, 2017 1 commit
-
-
* add x86_64 target * add binary dense operator * rebase * improve schedule * remove x86 target * improve schedule
Yuwei Hu committed
-
- 24 Dec, 2017 3 commits
-
-
* first working miopen support * do FindFwdAlgo during build time * fix lint * update doc string * import topi after checking if rocm is enabled * add miopen namespace * fixed descriptor overwrite bug * add use_miopen option * fix lint * better miopen option handling * fix typo * fix options handling
masahi committed -
* [CODEGEN] update codegen for vector operation * update comment, fix for metal
Lianmin Zheng committed -
Tianqi Chen committed
-
- 23 Dec, 2017 3 commits
-
-
* Make the duplicated function name checker work * Fix dependency checking problem for reducer condition (#712); add test * Fix dependency checking problem for reducer condition (#712); add test * Specify R to be computed inline
Cody Hao Yu committed -
Tianqi Chen committed
-
Salem Derisavi committed
-
- 22 Dec, 2017 1 commit
-
-
During tensorize, call Simplify on algorithm and intrinsic definitions before CanonicalSimplify. This will prevent a number of false tensorize mismatches. (#718) Thanks, we can use this solution for now.
Salem Derisavi committed
-
- 19 Dec, 2017 1 commit
-
-
* 1) Removed non-determinism from CanonicalSimplify 2) Added a couple of test cases for CanonicalSimplify * Use IRDeepCompare instead of comparing string representations * Give a warning (instead of a fatal error) when two "ComExprEntry"s are equal
Salem Derisavi committed
-
- 17 Dec, 2017 1 commit
-
-
Andrew Adams committed
-
- 16 Dec, 2017 1 commit
-
-
masahi committed
-
- 15 Dec, 2017 1 commit
-
-
Cody Hao Yu committed
-
- 13 Dec, 2017 2 commits
-
-
* Simplify expressions early on * fixed lint errors
Salem Derisavi committed -
* 1) Refactored some parts of the unrolling code into their own methods so we can reuse unrolling functionality in other parts of the code, e.g., to explicitly unroll loops with a count of 1 when they are programmatically created. 2) Reorder based on top operator before resorting to pointers, which causes non-determinism. * Fixed lint errors
Salem Derisavi committed
-
- 11 Dec, 2017 2 commits
-
-
* Use long long for platforms where long is 32 bits (like Windows). * Make sure scalar chars are signed. * Re-add NOLINT marker.
abergeron committed -
* [CODEGEN] add fp16 and fp64 enable pragma for opencl * fix style
Lianmin Zheng committed
-
- 07 Dec, 2017 1 commit
-
-
Lianmin Zheng committed
-
- 05 Dec, 2017 1 commit
-
-
* Port build_module.py to C++ * Fix lint errors * Fix more lint errors * Fix more lint errors * Fix more lint errors * Fix build error * Implemented style fixes * Fix lint errors * Added function to construct target from string * lower now returns array * Fix lint error * Implemented review changes - style & Target options -> std::vector * Fixed lint, argument alignment and added unit test * Changed test to target LLVM, fixed sign compare warnings * Reverted unit test to CUDA, changed Jenkinsfile to enable GPU for C++ tests * Slight change to Jenkinsfile * Changed build_module test from CUDA to LLVM * Added function var() to construct a Var instance. Changed implementation of LLVMEnabled() * Reverted Jenkinsfile
alex-weaver committed
-
- 04 Dec, 2017 2 commits
-
-
* [CI] Enable llvm in CPU test * fix llvm
Tianqi Chen committed -
* Support rank-0 tensor * fix lint
Tianqi Chen committed
-
- 01 Dec, 2017 1 commit
-
-
* [RANDOM] Init contrib.random library * [RANDOM] Add uniform * [RANDOM] Fix lint * [RANDOM] Add comments and tests * [RANDOM] Fix lint
ziheng committed
-
- 30 Nov, 2017 5 commits
-
-
Salem Derisavi committed
-
* [CUDA] Enable int64 * [PYTHON] Fix rpc tutorial with opencl * OK * update
Tianqi Chen committed -
Yizhi Liu committed
-
Change the parameter 'C' name
solin319 committed -
In unroll_loop.cc the parameter name is "auto_max_depth", but in ir_pass.h the parameter name is "auto_min_depth"
solin319 committed
-