- 12 Mar, 2020 1 commit
* [CUDA] Op strategy changes for Int8 schedules.
* Apply Haichen's suggestions.
* Make 4D output work for task extraction.
* Make x86 work.
* Fix lint.
* Tests, comments, out channel a multiple of 4.
* Topi test.
Co-authored-by: Ubuntu <ubuntu@ip-172-31-38-96.us-west-2.compute.internal>
Animesh Jain committed
- 11 Mar, 2020 2 commits
* Support 3D convolution in the ONNX frontend.
* Add unit tests for conv3d in the ONNX frontend; respond to PR formatting requests; add x86 schedules to the conv3d NCDHW test; fix a docstring format issue; refactor for the changed upstream API.
* First attempt at conv3d autotuning: add a default schedule for conv3d_ncdhw, fill in the AutoTVM integration, add (and fix) a fallback for invalid schedules, and fix the reduction order to get SIMD working correctly.
Matthew Brookhart committed
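For context, a minimal sketch of the 3-D convolution this enables at the Relay level; the shapes and parameter values here are illustrative, not taken from the commit:

```python
from tvm import relay

# NCDHW input: batch=1, 3 channels, depth=16, height/width=32 (illustrative).
data = relay.var("data", shape=(1, 3, 16, 32, 32), dtype="float32")
weight = relay.var("weight", shape=(8, 3, 3, 3, 3), dtype="float32")

# relay.nn.conv3d with the NCDHW layout the commit adds schedules for.
out = relay.nn.conv3d(data, weight, channels=8,
                      kernel_size=(3, 3, 3), padding=(1, 1, 1))
func = relay.Function([data, weight], out)
```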
* Add relay operation relay.op.tan.
* Update the tan implementation in TVM; update tests; add a shape function for tan.
* Add the missing main test to python/frontend/tensorflow/test_forward.
* Revert back to sin/cos.
* Revert "Revert back to sin/cos." This reverts commit 4da5b503b921585ba9d80944b29136142b575c40.
* Fix the implementation of tan in CUDA: do not support tan for float16, simplify topi/tests/python/test_topi_math, add tests for tan with float32 and float64, and finally implement tan as sin/cos in LLVM.
notoraptor committed
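A minimal sketch of the sin/cos lowering the message describes, written as a tensor expression; this is illustrative, not the actual TOPI source:

```python
from tvm import te

# tan(x) expressed as sin(x)/cos(x), per the LLVM fallback described above.
n = te.var("n")
x = te.placeholder((n,), name="x", dtype="float32")
y = te.compute(x.shape, lambda i: te.sin(x[i]) / te.cos(x[i]), name="tan")
s = te.create_schedule(y.op)
```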
- 10 Mar, 2020 1 commit
- 06 Mar, 2020 1 commit
* Add relay operation relay.op.tan.
* Update the tan implementation in TVM; update tests; add a shape function for tan.
* Add the missing main test to python/frontend/tensorflow/test_forward.
* Revert back to sin/cos.
* Revert "Revert back to sin/cos." This reverts commit 4da5b503b921585ba9d80944b29136142b575c40.
* Fix the implementation of tan in CUDA: do not support tan for float16, simplify topi/tests/python/test_topi_math, add tests for tan with float32 and float64, and try again to implement tan as sin/cos in LLVM.
Yao Wang committed
- 01 Mar, 2020 2 commits
* [Relay][FastMath] Relay pass to use fast exp/tanh.
* Add required_pass to the tests.
* FastMath test changes.
Animesh Jain committed
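A hedged sketch of applying the pass; the module contents are illustrative:

```python
import tvm
from tvm import relay

# Build a tiny module containing exp, then rewrite it with the FastMath
# pass so exp/tanh map to their fast approximations.
x = relay.var("x", shape=(8,), dtype="float32")
mod = tvm.IRModule.from_expr(relay.exp(x))
mod = relay.transform.FastMath()(mod)
```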
zhengdi committed
- 27 Feb, 2020 1 commit
* [REFACTOR][PY][API-CHANGE] Remove legacy python files and use the te namespace for most of the tensor expression primitives:
  - tvm.create_schedule -> tvm.te.create_schedule
  - tvm.placeholder -> tvm.te.placeholder
  - tvm.compute -> tvm.te.compute
* Remove top-level exposures.
Tianqi Chen committed
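Illustrative use of the new namespace (a small sketch, not from the commit):

```python
from tvm import te

# The primitives formerly at the top level now live under tvm.te.
A = te.placeholder((1024,), name="A")             # was tvm.placeholder
B = te.compute(A.shape, lambda i: A[i] * 2.0)     # was tvm.compute
s = te.create_schedule(B.op)                      # was tvm.create_schedule
```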
- 25 Feb, 2020 1 commit
* Remove unnecessary splitting in the cached chunk.
Yida Wang committed
- 24 Feb, 2020 1 commit
* Relay op strategy: fix lint; bitpack strategy; bitserial_dense (#6).
* Dense strategy (#5): dense; add bifrost; remove comments; address comments.
* Refactor x86 conv2d_NCHWc (#4): refactor x86 conv2d; add x86 depthwise_conv2d_NCHWc; add back topi x86 conv2d_nchw; merge x86 conv2d_nchw and conv2d_NCHWc; minor fix for x86 conv2d; fix more strategies.
* Add x86 conv2d_NCHWc_int8 strategy (#8): remove contrib_conv2d_nchwc_int8; fix generic conv2d_NCHWc for int8; fix topi arm_cpu conv2d_NCHWc_int8; update x86 conv2d.
* Enable specifying which relay ops to tune in AutoTVM; add conv2d strategies for cuda, rocm, hls, arm cpu, mali, bifrost, and intel graphics; remove template keys from AutoTVM; remove the "2" in the function name; add names to op implementations; clean up, fix lint, fix bugs, and address comments.
* Modify topi tests (#9): add pooling, reorg, softmax and vision; add lrn; fix topi tests; fix more tests and bugs; address comments.
* Modify more tests (#10): modify tests for bitserial_conv2d, bitserial_dense, bitserial_conv2d_rasp and bnn; try to update VTA using strategy; fix cpptest; fix rebase errors.
* Fix two tests (#11): change the AutoTVM log format; fix the VTA test; add a temporary hack for the VTA pass; fix the tutorials (including the VTA tutorial); fix cpptest and docs; change the data structure name and API; fix the winograd test; upgrade the TopHub version number; re-enable the VTA tsim test after TopHub is upgraded; fix the VTA test to use the correct args so the config can be found in TopHub.
Co-authored-by: Yao Wang <kevinthesunwy@gmail.com>
Haichen Shen committed
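A hedged sketch of the registration pattern this commit introduces: each op maps to an OpStrategy holding one or more (compute, schedule) implementations chosen per target. Names follow tvm.relay.op.strategy, but the import paths and details here are a sketch and may differ from the merged code:

```python
import topi
from tvm.relay.op import op as _op
from tvm.relay.op.strategy.generic import (conv2d_strategy,
                                           wrap_compute_conv2d,
                                           wrap_topi_schedule)

# Sketch only: the real registration lives in relay/op/strategy/x86.py
# and dispatches on layout; registering again here would conflict.
@conv2d_strategy.register("cpu")
def conv2d_strategy_cpu(attrs, inputs, out_type, target):
    strategy = _op.OpStrategy()
    strategy.add_implementation(
        wrap_compute_conv2d(topi.x86.conv2d_nchw),
        wrap_topi_schedule(topi.x86.schedule_conv2d_nchw),
        name="conv2d_nchw.x86",
    )
    return strategy
```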
- 21 Feb, 2020 2 commits
* Fix the get_valid_count accuracy issue for individual tests (but not yet for all tests running together).
* Initialize the valid_count and PrefixSum buffers.
* Update the test, and the relay test as well.
* Update documentation; fix lint; address comments.
* Correct the atomicAdd identifier name.
Leyuan Wang committed
* [TEST][FLAKY] topi/tests/python/test_topi_sort.py::test_argsort
* Update the argsort test function to match topk.
* Shuffle the index and get data from the shuffled index.
* Replace random.uniform with np.arange.
Neo Chien committed
- 20 Feb, 2020 1 commit
* Fix Python docstrings * More fixes * Fix lint
Cody Yu committed
- 17 Feb, 2020 1 commit
Alex Gladkov committed
- 16 Feb, 2020 1 commit
- Do not emit __shared__ etc. as part of the type for casting.
- Fix fp16 reduction kernels that failed with compiler errors such as: no operator "+" matches these operands (volatile half + volatile half). This patch inserts casts to remove the volatile type qualifier following volatile loads (fp16 only). The CUDA fp16 library headers should add volatile member functions.
- Update have_fp16 to include compute 6.1 GPUs, which do support fp16, although their fp16 throughput is low. Updated tests.
Signed-off-by: Wei Pan <weip@nvidia.com>
wpan11nv committed
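The have_fp16 change is observable from Python; a small hedged check, assuming the helper in tvm.contrib.nvcc:

```python
from tvm.contrib import nvcc

# After this change, compute capability 6.1 reports fp16 support
# (low throughput, but functional), alongside 5.3 and 6.0.
for cc in ("5.3", "6.0", "6.1", "7.0"):
    print(cc, nvcc.have_fp16(cc))
```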
- 14 Feb, 2020 2 commits
- This allows better utilization of the memory bandwidth.
- Note that not all cases are vectorized for the fp16 datatype. For instance, when the size is not a multiple of 1024, the inner loop may be an expression that cannot be vectorized. In this case, a small inner loop is still beneficial for latency hiding.
Signed-off-by: Wei Pan <weip@nvidia.com>
wpan11nv committed
- Move the related files into the corresponding locations, as in C++.
- Keep the top-level TVM API backward compatible to minimize changes in topi.
tqchen committed
- 13 Feb, 2020 2 commits
Add tuneable conv3d_ndhwc schedule
Alex Gladkov committed
Move the related target modules into tvm.target. API changes:
- tvm.target.current_target -> tvm.target.Target.current
- tvm.datatype -> tvm.target.datatype
tqchen committed
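A short sketch of the renamed query (the target string is illustrative):

```python
import tvm

# The active target is now queried through tvm.target.Target.current
# (formerly tvm.target.current_target).
with tvm.target.create("llvm"):
    target = tvm.target.Target.current()
    print(target)
```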
- 12 Feb, 2020 1 commit
* [REFACTOR][PY][API-CHANGE] Establish tvm.ir and migrate the corresponding relay files into the new folder. API change:
  - relay.Module -> tvm.IRModule
* Update with ADT; migrate transform, module, json_compact, and attrs.
* Move LoweredFunc to stmt temporarily.
* Migrate container.
Tianqi Chen committed
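Illustrative use of the renamed module class (a sketch, not from the commit):

```python
import tvm
from tvm import relay

# Modules are now constructed as tvm.IRModule (formerly relay.Module).
x = relay.var("x", shape=(2, 2), dtype="float32")
mod = tvm.IRModule.from_expr(relay.add(x, x))
print(mod)
```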
- 11 Feb, 2020 1 commit
hlu1 committed
- 10 Feb, 2020 1 commit
Leyuan Wang committed
- 09 Feb, 2020 1 commit
Tianqi Chen committed
- 07 Feb, 2020 2 commits
* [REFACTOR][PY-API] Polish the tvm.runtime and tvm.runtime.module APIs. This PR updates tvm.runtime to use the new FFI style.
  - Remove the top-level tvm.module to avoid confusion between runtime.Module and IRModule.
  - API changes for runtime.Module:
    tvm.module.load -> tvm.runtime.load_module
    tvm.module.enabled -> tvm.runtime.enabled
    tvm.module.system_lib -> tvm.runtime.system_lib
  - Remove the dependency on api_internal from the runtime.
* Update module.load to the latest API.
Tianqi Chen committed
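A hedged sketch of the renamed entry points; "deploy.so" is a hypothetical path to a module previously saved with export_library:

```python
import tvm

assert tvm.runtime.enabled("cpu")            # was tvm.module.enabled
mod = tvm.runtime.load_module("deploy.so")   # was tvm.module.load
```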
Tianqi Chen committed
- 06 Feb, 2020 1 commit
* Add bitwise ops to topi * Add the bitwise ops to relay.
abergeron committed
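For reference, the new element-wise bitwise ops at the Relay level (shapes illustrative):

```python
from tvm import relay

a = relay.var("a", shape=(4,), dtype="int32")
b = relay.var("b", shape=(4,), dtype="int32")

# Compose the new ops: and, or, xor, and not.
expr = relay.bitwise_xor(relay.bitwise_and(a, b),
                         relay.bitwise_not(relay.bitwise_or(a, b)))
```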
- 05 Feb, 2020 3 commits
* Enforce 4-way padding.
* Add a util with get_pad_tuple; delete unnecessary arguments; fix lint; add the container.Array case.
* Fix the cuDNN conv2d asymmetric padding logic.
* Rename get_pad_tuple to get_pad_tuple2d; revert the change to topi/python/topi/nn/conv2d.py.
* Add get_pad_tuple2d for several contrib conv2d ops, then for all conv2d ops.
Xingyu Zhou committed
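A hedged sketch of the padding normalization involved; the import path assumes the topi-era topi.nn.util helper:

```python
from topi.nn.util import get_pad_tuple

# Normalize a 2-element padding spec into explicit 4-way padding for a
# 3x3 kernel; returns (pad_top, pad_left, pad_down, pad_right).
pad_top, pad_left, pad_down, pad_right = get_pad_tuple((1, 2), (3, 3))
print(pad_top, pad_left, pad_down, pad_right)  # expected: 1 2 1 2
```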
* [REFACTOR][PY] Establish tvm.runtime. This PR establishes the tvm.runtime namespace, which contains the core runtime data structures. The top-level APIs are kept intact for now via re-exporting; we will follow up later to clean up some of them.
* Fix ndarray name.
Tianqi Chen committed
* [REFACTOR][PY] tvm._ffi:
  - Remove "from __future__ import absolute_import" in the related files, as it is no longer needed when the code only runs in python3.
  - Remove the reverse dependency of _ctypes/_cython on object_generic.
  - function.py -> packed_func.py
  - Function -> PackedFunc
  - All registry-related logic goes to tvm._ffi.registry.
  - Use absolute references for FFI-related calls: tvm._ffi.register_object, tvm._ffi.register_func, tvm._ffi.get_global_func.
* Move get_global_func to the FFI side.
Tianqi Chen committed
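A small sketch of the absolute-reference style (the function name "example.add_one" is hypothetical):

```python
import tvm
import tvm._ffi

# Registration now goes through tvm._ffi directly.
@tvm._ffi.register_func("example.add_one")
def add_one(x):
    return x + 1

f = tvm._ffi.get_global_func("example.add_one")
assert f(41) == 42
```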
- 04 Feb, 2020 1 commit
* [TOPI][x86] Injective Schedule Improvement. * Add tiling. * Vectorize when there is an axis.
Animesh Jain committed
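A sketch of the general pattern the commit describes, on a made-up elementwise op: split, parallelize the outer loop, vectorize the inner axis when one exists:

```python
from tvm import te

A = te.placeholder((4096,), name="A")
B = te.compute(A.shape, lambda i: A[i] + 1.0, name="B")
s = te.create_schedule(B.op)

# Tile, then parallelize the outer part and vectorize the inner part.
outer, inner = s[B].split(B.op.axis[0], factor=16)
s[B].parallel(outer)
s[B].vectorize(inner)
```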
- 03 Feb, 2020 1 commit
* [TOPI] Support the packed 'NCHWinic' format in the upsample operator. Some hardware accelerators require packed data layouts such as NCHWinic to fit their hardware resources; this adds NCHWinic support to upsample to meet that requirement.
* Address review comments; add an assert for the "else must be NCHWxc" logic.
Hua Jiang committed
- 01 Feb, 2020 1 commit
Alex Gladkov committed
- 31 Jan, 2020 1 commit
Animesh Jain committed
- 24 Jan, 2020 2 commits
* remove cpp upsampling * remove cpp resize
masahi committed
Alex Gladkov committed
- 22 Jan, 2020 2 commits
- Combine pad and dilate.
- Fix for https://discuss.tvm.ai/t/compile-error-for-cuda-target/4164
- Fix for https://github.com/apache/incubator-tvm/pull/4472
Alex Gladkov committed
Alexander Pivovarov committed
- 21 Jan, 2020 1 commit
Bring up namespace te -- Tensor expression language DSL.
Tianqi Chen committed
- 20 Jan, 2020 1 commit
Alex Gladkov committed
- 19 Jan, 2020 1 commit
This PR moves the codegen-related code into the target folder, as it is target-specific functionality. We also adopt the term "compiler driver", as used in common compiler infrastructure such as Rust, GHC, and Clang; as a result, build_module is moved into the driver folder.
Tianqi Chen committed
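For context, the driver entry point in use; a minimal sketch with illustrative shapes:

```python
import tvm
from tvm import te

# tvm.build -- the compiler-driver entry point now housed under the
# driver folder -- lowers a schedule and invokes target codegen.
A = te.placeholder((8,), name="A")
B = te.compute(A.shape, lambda i: A[i] + 1.0, name="B")
s = te.create_schedule(B.op)
mod = tvm.build(s, [A, B], target="llvm")
```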