- 23 Apr, 2020 1 commit
Signed-off-by: Wei Pan <weip@nvidia.com>
Wei Pan committed

- 16 Apr, 2020 1 commit
Samuel committed

- 15 Apr, 2020 1 commit
* get_valid_count updated to have correct results
* speedup nms
* update nms
* revert back nms
* recover one test for get_valid_count
Leyuan Wang committed
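For reference, a minimal NumPy sketch of what get_valid_count computes: count the boxes whose score exceeds the threshold and compact them to the front. The score position is assumed to be index 0 here; the real op takes a configurable score index.

```python
import numpy as np

def get_valid_count(data, score_threshold):
    """Sketch of the op's semantics: per batch, count boxes with
    score > threshold and move them to the front, padding with -1."""
    batch, num_anchors, elem = data.shape
    out = np.full_like(data, -1.0)
    valid_count = np.zeros(batch, dtype="int32")
    for b in range(batch):
        k = 0
        for a in range(num_anchors):
            if data[b, a, 0] > score_threshold:  # score assumed at index 0
                out[b, k] = data[b, a]
                k += 1
        valid_count[b] = k
    return valid_count, out
```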

- 13 Apr, 2020 1 commit
* one weird trick.
* Added schedule knob for different workloads.
* Initial conv3d tensorcore working.
* Added conv3d tensorcore strategy.
* Added layout conversion to tensorcore friendly format for conv2d and conv3d.
* Add target name check.
* Fixed bad names and depthwise check.
* Removed duplicated attribute assignment.
Josh Fromm committed

- 07 Apr, 2020 1 commit
* add fast erf
* doc
* lint
* fix
* fix indent
Haichen Shen committed
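The idea behind a fast erf is to replace the exact intrinsic with a cheap polynomial. A sketch in the classic Abramowitz–Stegun style; these are the standard A&S 7.1.26 coefficients, not necessarily the polynomial this patch uses:

```python
import numpy as np

def fast_erf(x):
    # Abramowitz & Stegun 7.1.26 polynomial approximation (illustrative).
    p = 0.3275911
    a = (0.254829592, -0.284496736, 1.421413741, -1.453152027, 1.061405429)
    sign = np.sign(x)
    x = np.abs(x)
    t = 1.0 / (1.0 + p * x)
    # Horner evaluation of a1*t + a2*t^2 + ... + a5*t^5
    poly = t * (a[0] + t * (a[1] + t * (a[2] + t * (a[3] + t * a[4]))))
    return sign * (1.0 - poly * np.exp(-x * x))
```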

- 05 Apr, 2020 1 commit
* Functional conv3d winograd working.
* Formatted python code.
* registered conv3d winograd compute and started adding relay without_weight_transform operator.
* Add topi testing for conv3d winograd.
* Format file.
* small tweak to unrolling to prevent build sticking.
* Refactoring convolution ops in relay.
* Refactored relay convolutions.
* Bug fixes.
* Fixed static bug in convolution.
* Added conv3d alter op layout and related support.
* Bug fixes and testing done.
* Fix a few autotvm bugs.
* Drop silly debug print.
* Removed debug_skip_region.
* Add variant of conv3d_winograd that doesn't transform depth.
* initial infrastructure done for depthless conv.
* Fix no_depth schedule bugs.
* automatic topi switching between depth and depthless winograd.
* Fixed bug in schedule.
* lint fixes.
* Removed indents in convolution.cc
* missed a few indents oops.
* fixed flop count.
* One more small tweak.
* Change kernel pack inner axes order.
* Style changes.
* Comment fixes.
Josh Fromm committed
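For background, 1-D Winograd F(2,3) produces 2 outputs from a 4-element input tile and a 3-tap kernel using 4 multiplies instead of 6; the conv3d kernels generalize the same transform idea across spatial axes. A NumPy sketch with the standard F(2,3) matrices:

```python
import numpy as np

# Winograd F(2,3) transform matrices
BT = np.array([[1, 0, -1, 0],
               [0, 1,  1, 0],
               [0, -1, 1, 0],
               [0, 1,  0, -1]], dtype=float)
G = np.array([[1,    0,   0],
              [0.5,  0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0,    0,   1]], dtype=float)
AT = np.array([[1, 1,  1,  0],
               [0, 1, -1, -1]], dtype=float)

d = np.array([1.0, 2.0, 3.0, 4.0])   # input tile
g = np.array([0.5, 1.0, -1.0])       # kernel

y = AT @ ((G @ g) * (BT @ d))        # 4 elementwise multiplies
ref = np.convolve(d, g[::-1], mode="valid")  # direct correlation
assert np.allclose(y, ref)
```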

- 02 Apr, 2020 1 commit
* [Frontend][Torch] Simplify operator input handling
* [Frontend][Torch] Allow user supplied input names to override graph inputs
* Fix pylint issues
* Updates from code review feedback
* Fix tutorial to use shape list input
* Disable intermittent test failure in topi vision test
Jeremy Johnson committed
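Roughly how the frontend is driven after this change; the model choice is illustrative, and the name "input0" in the shape list is user-chosen, overriding the traced graph's own input name:

```python
import torch
import torchvision
from tvm import relay

model = torchvision.models.resnet18().eval()
inp = torch.randn(1, 3, 224, 224)
scripted = torch.jit.trace(model, inp)

# (name, shape) pairs; the user-supplied name wins over the graph's name.
shape_list = [("input0", (1, 3, 224, 224))]
mod, params = relay.frontend.from_pytorch(scripted, shape_list)
```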

- 27 Mar, 2020 1 commit
* [TOPI][Tensor Core] Optimization of CNNs on Tensor Core #6004
* update conv2d test
* # pylint: dense_tensorcore.py
* modify
* modify conv2d
* modify the unclear comment, add shape assertion in conv2d compute, combine general gemm intrinsic
* add shape assertion in conv2d compute, combine general gemm intrinsic

Co-authored-by: libaihong <libaihong@inspur.com>
Co-authored-by: libaihong <61525430+libaihong@users.noreply.github.com>
Shawn-Inspur committed

- 26 Mar, 2020 1 commit
* register for fast_exp and fast_tanh
* Add unit test for fast math
* Add unit test for op fast math
* Add unit test for op fast math
* Add unit tests to guard registering topi schedule for Relay fast_exp and fast_tanh
* Fix indent
* Fix the indent
* Add fast_tanh in the test_fastmath of topi tests
Selo1412 committed
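A sketch of what these registrations enable, assuming the FastMath pass rewrites exp/tanh calls into their fast_* approximations (pass name per the relay transform of this era):

```python
import tvm
from tvm import relay

x = relay.var("x", shape=(8,), dtype="float32")
mod = tvm.IRModule.from_expr(relay.Function([x], relay.exp(x)))

# FastMath substitutes fast_exp/fast_tanh; the topi schedule
# registrations above are what make the rewritten ops compilable.
fast_mod = relay.transform.FastMath()(mod)
print(fast_mod)  # body should now call fast_exp
```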

- 23 Mar, 2020 2 commits
* isfinite doc update
* isfinite expr
* isfinite expr
* isfinite schedule reg
* isfinite python binding
* isfinite python binding
* relay register isfinite
* isfinite type relation
* intrin isfinite
* topi isfinite
* testcase topi isfinite
* tf frontend isfinite
* tf frontend isfinite testcase
* test case relay isfinite
* small fixes
* test forward tf isfinite
* test cases injective for cuda
* remove float16 test case
* add support for isinf
* remove unwanted import
* fix conflict
Mahesh Ambule committed
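A minimal sketch of the new isfinite op's semantics, mirroring np.isfinite; the executor pattern follows the relay tests of this era:

```python
import numpy as np
from tvm import relay

x = relay.var("x", shape=(5,), dtype="float32")
func = relay.Function([x], relay.isfinite(x))

data = np.array([1.0, np.inf, -np.inf, np.nan, 0.0], dtype="float32")
res = relay.create_executor(kind="debug").evaluate(func)(data)
# matches np.isfinite(data): [True, False, False, False, True]
```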
* first cut unravel_index
* merge fixes
* change rates to dilations
* unravel_index op relay, topi, mxnet, tf
* doc changes
* small changes
* remove empty unravel and argwhere attrs
* remove empty unravel and argwhere attrs
Mahesh Ambule committed
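The op mirrors np.unravel_index, which maps flat indices back to coordinate tuples for a given shape:

```python
import numpy as np

flat = np.array([22, 41, 37])
# For shape (7, 6): 22 = 3*6 + 4, 41 = 6*6 + 5, 37 = 6*6 + 1
print(np.unravel_index(flat, (7, 6)))  # (array([3, 6, 6]), array([4, 5, 1]))
```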

- 15 Mar, 2020 1 commit
* add stub for nd impl
* refactored indices compute
* refactored divide step
* remove unused variables, add doc
* fix lint
* add relay op def
* add python registration
* refactor topi test
* update relay tests, but test result is weird
* workaround for weird bug
* add relay adaptive pool 3d test
* add topi tests
* update doc for 3d
* typo fix
* fix lint
* add more tests including NDHWC
masahi committed
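For reference, the usual adaptive-pooling window rule per axis (PyTorch-style floor/ceil bounds); the 3-D variant applies this independently on D, H, and W. A sketch:

```python
def adaptive_pool1d_bounds(in_size, out_size):
    # Output element i pools input[start:end); floor/ceil bounds make the
    # windows tile the input evenly whatever the size ratio is.
    starts = [(i * in_size) // out_size for i in range(out_size)]
    ends = [-(-((i + 1) * in_size) // out_size) for i in range(out_size)]
    return starts, ends

print(adaptive_pool1d_bounds(10, 4))  # ([0, 2, 5, 7], [3, 5, 8, 10])
```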

- 12 Mar, 2020 1 commit
* [CUDA] Op strategy changes for Int8 schedules.
* Applying Haichen's suggestions.
* Make 4D output work for task extraction.
* Make x86 work.
* Fix lint.
* Lint fixes.
* Tests, comments, out channel a multiple of 4.
* Topi test.

Co-authored-by: Ubuntu <ubuntu@ip-172-31-38-96.us-west-2.compute.internal>
Animesh Jain committed

- 11 Mar, 2020 2 commits
* Support 3d Convolution with the ONNX frontend
* add unit tests for conv3d in onnx frontend
  respond to PR formatting requests
  add x86 schedules to conv3d ncdhw test
  fix a doc string format issue
  refactor for changed upstream API
* first attempt at conv3d autotuning
  add default schedule for conv3d_ncdhw
  fill in autotvm integration
  add a fallback for invalid schedules
  fix fallback
  fix reduction order to get simd working correctly
Matthew Brookhart committed
* Add relay operation relay.op.tan.
* Update tan implementation in TVM.
* Update tests.
* Add shape function for tan.
* Add missing main test to python/frontend/tensorflow/test_forward.
* Revert, back to sin/cos.
* Revert "Revert, back to sin/cos."
  This reverts commit 4da5b503b921585ba9d80944b29136142b575c40.
* Fix implementation of tan in cuda.
  Do not support tan for float16.
  Simplify topi/tests/python/test_topi_math.
  Add testing for tan with float32 and float64.
  Finally implement tan as sin/cos in llvm.
notoraptor committed
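A sketch of the lowering idea at the TE level, assuming te re-exports the sin/cos intrinsics: where no native tan intrinsic exists, tan(x) is computed as sin(x)/cos(x).

```python
import tvm
from tvm import te

n = te.var("n")
A = te.placeholder((n,), name="A", dtype="float32")
# tan(x) = sin(x) / cos(x), the fallback used on the LLVM backend
B = te.compute(A.shape, lambda i: te.sin(A[i]) / te.cos(A[i]), name="tan")
s = te.create_schedule(B.op)
f = tvm.build(s, [A, B], target="llvm")
```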

- 10 Mar, 2020 1 commit

- 06 Mar, 2020 1 commit
* Add relay operation relay.op.tan.
* Update tan implementation in TVM.
* Update tests.
* Add shape function for tan.
* Add missing main test to python/frontend/tensorflow/test_forward.
* Revert, back to sin/cos.
* Revert "Revert, back to sin/cos."
  This reverts commit 4da5b503b921585ba9d80944b29136142b575c40.
* Fix implementation of tan in cuda.
  Do not support tan for float16.
  Simplify topi/tests/python/test_topi_math.
  Add testing for tan with float32 and float64.
  Try again to implement tan as sin/cos in llvm.
Yao Wang committed

- 27 Feb, 2020 1 commit
* [REFACTOR][PY][API-CHANGE] Remove legacy python files.

  Remove legacy python files. Use the te namespace for most of the tensor
  expression primitives.

  - tvm.create_schedule -> tvm.te.create_schedule
  - tvm.placeholder -> tvm.te.placeholder
  - tvm.compute -> tvm.te.compute

* Remove top-level exposures.
Tianqi Chen committed
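The renames from the commit above in practice:

```python
from tvm import te

n = te.var("n")
A = te.placeholder((n,), name="A")           # was tvm.placeholder
B = te.compute((n,), lambda i: A[i] + 1.0)   # was tvm.compute
s = te.create_schedule(B.op)                 # was tvm.create_schedule
```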

- 24 Feb, 2020 1 commit
* relay op strategy
  fix lint
  bitpack strategy
  bitserial_dense (#6)
* update strategy
* address comments
  fix a few topi tests

  Dense strategy (#5)
* dense
* add bifrost; remove comments
* address comment

  Refactor x86 conv2d_NCHWc (#4)
* Refactor x86 conv2d
* Add x86 depthwise_conv2d_NCHWc
* Add back topi x86 conv2d_nchw
* Merge x86 conv2d_nchw and conv2d_NCHWc
* Minor fix for x86 conv2d
  fix more strategy

  Add x86 conv2d_NCHWc_int8 strategy (#8)
* Add x86 conv2d_NCHWc_int8 strategy
* Remove contrib_conv2d_nchwc_int8
* Fix generic conv2d_NCHWc for int8
* Fix topi arm_cpu conv2d_NCHWc_int8
  update x86 conv2d
  enable specify relay ops to be tuned for autotvm
  add cuda conv2d strategy
  add conv2d strategy for rocm
  add conv2d strategy for hls
  add conv2d strategy for arm cpu
  add conv2d strategy for mali
  add conv2d strategy for bifrost
  add conv2d strategy for intel graphics
  clean up and fix lint
  remove template keys from autotvm
  remove 2 in the func name
  address comments
  fix
* fix bugs
* lint
* address comments
* add name to op implement
* Modify topi tests (#9)
  * Add pooling, reorg, softmax and vision
  * Add lrn
  * fix topi test
  * fix more topi test
  * lint
  * address comments
  * x
  * fix more tests & bugs
* Modify more tests (#10)
  * Modify tests for bitserial_conv2d, bitserial_dense, bitserial_conv2d_rasp and bnn
  * Minor fix
  * More minor fix
  * fix more test
  * try to update vta using strategy
  * fix cpptest
  * x
  * fix rebase err
* Fix two tests (#11)
  * change autotvm log format
  * lint
  * minor fix
  * try fix vta test
  * fix rebase err
  * tweak
  * tmp hack for vta pass
  * fix tutorial
  * fix
  * fix more tutorials
  * fix vta tutorial
  * minor
  * address comments
  * fix
  * address comments
  * fix cpptest
  * fix docs
  * change data structure name and api
  * address comments
  * lint
  * fix rebase err
  * updates
  * fix winograd test
  * fix doc
  * rebase
  * upgrade tophub version number
  * fix bug
  * re-enable vta tsim test after tophub is upgraded
  * fix vta test to use the correct args so the config can be found in tophub

Co-authored-by: Yao Wang <kevinthesunwy@gmail.com>
Haichen Shen committed
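A heavily abridged, hypothetical sketch of the registration pattern this refactor introduces: a generic strategy function dispatches per target and returns an OpStrategy pairing a compute with a schedule. Module paths and helper names mirror tvm.relay.op.strategy but are approximate:

```python
from tvm import topi
from tvm.relay.op import op as _op
from tvm.relay.op.strategy.generic import (
    conv2d_strategy, wrap_compute_conv2d, wrap_topi_schedule)

@conv2d_strategy.register("cpu")
def conv2d_strategy_cpu(attrs, inputs, out_type, target):
    strategy = _op.OpStrategy()
    # Each implementation bundles a topi compute with its schedule;
    # autotvm picks among them per workload.
    strategy.add_implementation(
        wrap_compute_conv2d(topi.x86.conv2d_nchw),
        wrap_topi_schedule(topi.x86.schedule_conv2d_nchw),
        name="conv2d_nchw.x86")
    return strategy
```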

- 21 Feb, 2020 2 commits
* get_valid_count accuracy issue fixed for individual tests but not for all tests running together
* minor fix
* initialize valid_count and PrefixSum buffers
* test updated
* update relay test as well
* update document
* fix lint
* address comment
* fix lint
* correct atomicAdd identifier name
Leyuan Wang committed
* [TEST][FLAKY] topi/tests/python/test_topi_sort.py::test_argsort
* update test function of argsort like topk
* Shuffle index and get data from shuffled index
* Replace the random.uniform with np.arange
Neo Chien committed
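Why the change helps: random.uniform can produce (near-)ties, making the argsort order ambiguous across runs, while a shuffled arange has unique values and a deterministic expected order:

```python
import numpy as np

np.random.seed(0)
data = np.arange(10, dtype="float32")  # unique values, no ties
np.random.shuffle(data)
expected = np.argsort(data)            # stable ground truth for the test
```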

- 17 Feb, 2020 1 commit
Alex Gladkov committed

- 16 Feb, 2020 1 commit
- Do not emit __shared__ etc. as part of type for casting.

- Fix fp16 reduction kernels with compiler errors:

  "no operator "+" matches these operands, volatile half + volatile half"

  This patch inserts casts to remove the volatile type qualifier following
  volatile loads (fp16 only). CUDA fp16 library headers should add volatile
  member functions.

- Update have_fp16 to include compute 6.1 GPUs, which do support fp16,
  although their fp16 throughput is low. Updated tests.

Signed-off-by: Wei Pan <weip@nvidia.com>
wpan11nv committed

- 14 Feb, 2020 1 commit
- This allows better utilization of the memory bandwidth.

- Note that not all cases are vectorized for the fp16 datatype. For
  instance, when the size is not a multiple of 1024, the inner loop may be
  an expression that cannot be vectorized. In this case, a small inner
  loop is still beneficial for latency hiding.

Signed-off-by: Wei Pan <weip@nvidia.com>
wpan11nv committed
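The analogous split-and-vectorize expressed in TE form; a CPU-flavored sketch of the idea, while the actual change targets the CUDA injective schedules:

```python
import tvm
from tvm import te

n = te.var("n")
A = te.placeholder((n,), name="A", dtype="float16")
B = te.compute((n,), lambda i: A[i] + tvm.tir.const(1, "float16"), name="B")
s = te.create_schedule(B.op)

# Split off a small inner loop and vectorize it so the backend can emit
# wide loads/stores instead of scalar fp16 accesses.
xo, xi = s[B].split(B.op.axis[0], factor=8)
s[B].vectorize(xi)
print(tvm.lower(s, [A, B], simple_mode=True))
```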

- 13 Feb, 2020 1 commit
Add tuneable conv3d_ndhwc schedule
Alex Gladkov committed

- 07 Feb, 2020 2 commits
* [REFACTOR][PY-API] Polish tvm.runtime, tvm.runtime.module API update

  This PR updates tvm.runtime to use the new FFI style.

  - Remove top-level tvm.module to avoid confusion between runtime.Module and IRModule
  - API changes wrt runtime.Module:
    - tvm.module.load -> tvm.runtime.load_module
    - tvm.module.enabled -> tvm.runtime.enabled
    - tvm.module.system_lib -> tvm.runtime.system_lib
  - Remove dep on api_internal from runtime.

* Update module.load in the latest API
Tianqi Chen committed
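The renamed entry points from the commit above in use:

```python
import tvm
from tvm import te

n = te.var("n")
A = te.placeholder((n,), name="A")
B = te.compute((n,), lambda i: A[i] * 2.0, name="B")
s = te.create_schedule(B.op)
f = tvm.build(s, [A, B], target="llvm")

f.export_library("mul2.so")
g = tvm.runtime.load_module("mul2.so")  # was tvm.module.load
assert tvm.runtime.enabled("llvm")      # was tvm.module.enabled
```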
Tianqi Chen committed

- 06 Feb, 2020 1 commit
* Add bitwise ops to topi
* Add the bitwise ops to relay.
abergeron committed
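A sketch, assuming the relay ops are exposed as relay.bitwise_and and friends; elementwise semantics match NumPy's &, |, ^, ~ on integer arrays:

```python
from tvm import relay

x = relay.var("x", shape=(4,), dtype="int32")
y = relay.var("y", shape=(4,), dtype="int32")
# likewise bitwise_or / bitwise_xor / bitwise_not
func = relay.Function([x, y], relay.bitwise_and(x, y))
```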

- 05 Feb, 2020 1 commit
* [REFACTOR][PY] Establish tvm.runtime

  This PR establishes the tvm.runtime namespace that contains the core
  runtime data structures. The top-level APIs are kept intact for now via
  re-exporting. We will follow up later to clean up some of the top-level
  APIs.

* Fix ndarray name
Tianqi Chen committed

- 03 Feb, 2020 1 commit
* [TOPI] upsample operator 'NCHWinic' format support.

  Some hardware accelerators require packed-format data such as NCHWinic
  to fit their hardware resources; this adds NCHWinic format support to
  the upsample operator to meet that requirement.

* address review comments, add assert for the 'else must be NCHWxc' logic.
Hua Jiang committed
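For intuition, packing plain NCHW into a blocked NCHW[x]c layout (the simpler cousin of NCHWinic, which additionally blocks the batch axis) is a reshape plus transpose:

```python
import numpy as np

# Pack NCHW into NCHW4c: split the channel axis into (c_outer, c_block=4)
# and move the block to the innermost position.
n, c, h, w, cb = 1, 8, 4, 4, 4
x = np.arange(n * c * h * w, dtype="float32").reshape(n, c, h, w)
packed = x.reshape(n, c // cb, cb, h, w).transpose(0, 1, 3, 4, 2)
print(packed.shape)  # (1, 2, 4, 4, 4) = N, C//4, H, W, 4c
```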

- 31 Jan, 2020 1 commit
Animesh Jain committed

- 24 Jan, 2020 1 commit
Alex Gladkov committed

- 22 Jan, 2020 1 commit
- combine pad and dilate;
- fix for the issue https://discuss.tvm.ai/t/compile-error-for-cuda-target/4164
- fix for the issue https://github.com/apache/incubator-tvm/pull/4472
Alex Gladkov committed

- 20 Jan, 2020 1 commit
Alex Gladkov committed

- 15 Jan, 2020 1 commit
This reverts commit dcf7fbf1.
Haichen Shen committed

- 11 Jan, 2020 3 commits
* added conv1d operators to topi.
* Started to add python testing.
* Added python conv1d implementation for testing.
* Wrote test but need to add cuda schedule :(
* Cuda schedules working for both conv1d layouts.
* All topi tests passing.
* Formatting topi.
* Removed pad_method option as its probably overkill.
* Added relay op definition of conv1d.
* End2end conv1d working with onnx.
* Lint fixes.
* Formatting fixes.
* Rebase fix.
* Switched to array based attributes for consistency across convs.
* Improved onnx parsing and testing for convolutions.
* lint fix
* Tiny tweak.
* Bug fix
* Rebase fix.
* Add group ignore to onnx conv1d frontend.
* Unified MakeConv and fixed documentation.
* improved autopadding
* Addressed feedback and simplified onnx frontend.
* Format fix.
* Basic X86 NCW schedule working.
* Added nwc schedule.
* fixed name
* Added more tests and basic x86 schedules.
* Format fix.
* Added non power of two shape tests.
Josh Fromm committed
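A minimal Relay usage sketch for the new op (shapes illustrative; NCW data with an OIW kernel):

```python
from tvm import relay

x = relay.var("x", shape=(1, 3, 32), dtype="float32")   # NCW
w = relay.var("w", shape=(8, 3, 3), dtype="float32")    # OIW
y = relay.nn.conv1d(x, w, strides=1, padding=1, channels=8, kernel_size=3)
func = relay.Function([x, w], y)  # output shape: (1, 8, 32)
```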
* [TOPI][RELAY][OP] add op crop_and_resize
* fix pylint
* incorporate comments
* fix ci
Yong Wu committed
* Add output_padding to generic
* Add output_padding to the reference impl
* Add output_padding to arm_cpu
* Add output_padding to the test
* Add output_padding for cuda
* Add output_padding for x86
* Make use of the new output_padding argument in Relay
* Adjust conv2d_transpose Relay test
* Fix lint errors
* Fix the VTA declaration of conv2d_transpose
* support for output padding in conv2d transpose
* some output padding will break IR pass
* Fix new conv2d_transpose test
* Update tophub
* Fix conv1d output_padding too.
* Fix the conv1d_transpose reference function.
* Fix the cuda impl
* fix the topi test for conv1d
* Update the versions in tophub.py

Co-authored-by: Thierry Moreau <tmoreau@octoml.ai>
abergeron committed
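What output_padding contributes to the transposed-conv output size (standard formula): it disambiguates which input size a strided conv had downsampled from.

```python
def conv2d_transpose_out(in_size, kernel, stride, pad, output_padding):
    return (in_size - 1) * stride - 2 * pad + kernel + output_padding

# stride 2 maps a 4x4 input to 7x7; output_padding=1 recovers 8x8
print(conv2d_transpose_out(4, 3, 2, 1, 0))  # 7
print(conv2d_transpose_out(4, 3, 2, 1, 1))  # 8
```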

- 10 Jan, 2020 1 commit
* Update topi.cc
  fix topi.nn.global_pool layout="NHWC"
* add topi.nn.global_pool layout=NHWC test
戚海涛 committed
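Usage sketch; the topi import path has moved over time, so treat the module path as approximate:

```python
from tvm import te, topi

data = te.placeholder((1, 7, 7, 64), name="data")  # NHWC
out = topi.nn.global_pool(data, pool_type="avg", layout="NHWC")
# out has shape (1, 1, 1, 64): one pooled value per channel
```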

- 09 Jan, 2020 1 commit
* [REFACTOR][IR] tvm::Expr -> PrimExpr (Primitive Expr)

  As part of the unified IR, we will need to unify relay::Expr and the
  current tvm::Expr under the same base type. From the technical point of
  view, tvm::Expr is a "primitive" expression that only contains POD types
  and handles and does not do life-cycle management. This PR renames
  Expr -> PrimExpr to clarify that. We will send a subsequent PR to
  introduce the base expr class.

* Remove legacy VarExpr and ExprHash/Equal
Tianqi Chen committed