- 25 Sep, 2019 4 commits
-
-
* Added tensorization for AVX2-based GEMM. Summary: Tensorized the same region as AVX512. Produces 16x1 int32 results by issuing two sets of AVX2 instructions, each doing a reduction on an 8x4 int8 kernel with 1x4 data. Test Plan: on an AVX2 machine: python tests/python/contrib/test_gemm_avx2_acc32.py
* Fix lint errors. Removed commented-out code.
Kimish Patel committed -
Tianqi Chen committed
-
add test for GREATER
Ina Dobreva committed -
* Changes to make tensorize work. These changes also fix the previously broken test.
  Summary: Tensorize was breaking for a few reasons.
  1) An assert at src/op/tensorize.cc:234, CHECK(is_one(e.region[j]->extent)), cannot be proven in some cases, e.g.:
     expected shape=[16, 4], given region=[range(min=((ax1.outer*16)/16), ext=(((((ax1.outer*16) + 15)/16) + 1) - ax1.outer)), range(min=((k.outer*4)/4), ext=(((((k.outer*4) + 3)/4) + 1) - k.outer)), range(min=0, ext=16), range(min=0, ext=4)]
     The unprovable one is: ext=(((((ax1.outer*16) + 15)/16) + 1) - ax1.outer). This could be simplified, but it is not, because simplifying the division requires proving ax1.outer > 0, and since it is a var that cannot be proven. The fix is to find all the vars in the expr and replace them with some const value.
  2) Equivalence between the tensorized expr and the one being asked to tensorize. For example, the error would be:
     TVMError: Check failed: Equal(lhs, rhs): Failed to match the compute with TensorIntrin tensor_intrin's declaration
     provided= reduce(combiner=comm_reducer(result=[(x + y)], lhs=[x], rhs=[y], identity_element=[(int16)0]), source=[(int16(data(k))*int16(kernel(((((((((k.outer.outer*64) + (k.outer.inner*2)) + k)/2)*128) + i) - (k.outer.inner*128)) - (k.outer.outer*4096)), ((((k.outer.outer*64) + (k.outer.inner*2)) + k) % 2))))], axis=[iter_var(k, range(min=0, ext=2))], where=(bool)1, value_index=0),
     intrin= reduce(combiner=comm_reducer(result=[(x + y)], lhs=[x], rhs=[y], identity_element=[(int16)0]), source=[(int16(data(k))*int16(kernel(i, k)))], axis=[iter_var(k, range(min=0, ext=2))], where=(bool)1, value_index=0)
     The difference is mainly in the source part:
     source=[(int16(data(k))*int16(kernel(((((((((k.outer.outer*64) + (k.outer.inner*2)) + k)/2)*128) + i) - (k.outer.inner*128)) - (k.outer.outer*4096)), ((((k.outer.outer*64) + (k.outer.inner*2)) + k) % 2))))]
     source=[(int16(data(k))*int16(kernel(i, k)))], axis=[iter_var(k, range(min=0, ext=2))]
     This was not being simplified because compute_intrin_iter_space (the map from iter var to range) did not contain the leaf iter vars.
  3) Here it fails with:
     Check failed: is_one(Simplify(value->shape[i])): Argument b_buffer shape mismatch[16, 4] vs [(((((ax1.outer*16) + 15)/16) + 1) - ax1.outer), (((((k.outer*4) + 3)/4) + 1) - k.outer), 16, 4]
     This happens in buffer binding, which concludes that the expected shape and the bound buffer shape differ. If the expr could be simplified, this would not be the case.
  Test Plan: On a Skylake AVX512 machine: python tests/python/contrib/test_gemm_acc16.py
* Implemented a bounded analyzer which traverses the tree and, for reduce/for statements, binds the bounds in the analyzer. Later this is used to simplify expressions. Inspired by ir_mutator_with_analyzer.
* Addressed comments.
* Added ASF header + define macro for the header file: TVM_ARITHMETIC_IR_VISITOR_WITH_ANALYZER_H_. Some lint fixes as well.
* Relax the assumption that dom_map must always contain all leaf itervars.
* Disable copy constructor and move to raw ptr.
Kimish Patel committed
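The first tensorize failure in the commit above can be illustrated with plain integers. The sketch below assumes that `/` in the printed region expression denotes truncating (C-style) integer division; `truncdiv` and `region_extent` are hypothetical helpers written for this illustration and are not TVM functions. It shows that the extent is always 1 for non-negative `ax1.outer` but not for negative values, which is why a simplifier with no bound on the variable cannot reduce it to 1.

```python
def truncdiv(a: int, b: int) -> int:
    """Integer division that rounds toward zero, as in C/C++."""
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b >= 0) else -q


def region_extent(ax1_outer: int) -> int:
    """Evaluate (((ax1.outer*16 + 15)/16) + 1) - ax1.outer with truncating '/'."""
    return (truncdiv(ax1_outer * 16 + 15, 16) + 1) - ax1_outer


# Extents for non-negative and for negative values of the loop variable.
nonneg = [region_extent(v) for v in range(0, 5)]
neg = [region_extent(v) for v in range(-3, 0)]
print(nonneg)
print(neg)
```

Binding the loop variable to its known range, as the bounded analyzer in the commit does, gives the simplifier the missing non-negativity fact.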
-
- 24 Sep, 2019 6 commits
-
-
* [ARITH] Explicitly state truncdiv/mod in pattern matching. * Fix the dependent cpp test
Tianqi Chen committed -
Ina Dobreva committed
-
* Refactor to create abstract ParallelOpCombiner * First draft of CombineParallelDense * Begin to work on tests * Test * Refactor to move out more common code * Clean up * Fix * Remove statics * fix wording * Start to add combine_parallel_op_batch * Resolve PR comments * Resolve PR comments * dummy change to retrigger CI * Change special case from bias_add to add * Revert special case change * Ignore units check * dummy change to retrigger CI * dummy change to re-trigger CI * Improve docs * Update docs * Update docs
Jon Soifer committed -
Steven S. Lyubomirsky committed
-
* Add Erf to ONNX frontend * dummy change to retrigger CI
Jon Soifer committed -
StandbyMe committed
-
- 23 Sep, 2019 1 commit
-
-
Animesh Jain committed
-
- 22 Sep, 2019 3 commits
-
-
Paddy Horan committed
-
* Qnn Dense layer. * Reformatting code. * Reformatting code and making the test case more readable. * Fixing lint issues. * Fixing test method names to pass the nose related configurations. * Aligning the code for code style.
shoubhik committed -
* add expr `isnan` * move to intrinsic * doc & add to topi * fix error from ci
Huang, Guangtai committed
-
- 21 Sep, 2019 3 commits
- 20 Sep, 2019 5 commits
-
-
* Fix unittest * Fix pylint error: Line 915 too long * Fix the conflicting files * frontend operator support: space_to_batch_nd * add test case for frontend operator support: space_to_batch_nd * add test case for frontend operator support: space_to_batch_nd * frontend operator support: space_to_batch_nd * Fix ValueError: don't know how to convert type <class 'numpy.ndarray'> to node
Neo Chien committed -
* [Relay][Frontend][ONNX] operator support: Tile * Trigger notification
Neo Chien committed -
* [ARITH] Add Lowering rule for FloorDiv/Mod * add comment about constant folding
Tianqi Chen committed -
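As a rough illustration of what lowering FloorDiv/Mod to truncating division involves, here is a minimal sketch on plain Python integers. It is not the rule implemented in the commit, which may well use cheaper forms when operand signs are known; `truncdiv`, `truncmod`, `floordiv_lowered` and `floormod_lowered` are hypothetical names used only here.

```python
def truncdiv(a: int, b: int) -> int:
    """Integer division that rounds toward zero, as in C/C++."""
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b >= 0) else -q


def truncmod(a: int, b: int) -> int:
    """Remainder matching truncdiv; takes the sign of the dividend."""
    return a - truncdiv(a, b) * b


def floordiv_lowered(a: int, b: int) -> int:
    """Floor division expressed with truncdiv/truncmod plus a correction."""
    q, r = truncdiv(a, b), truncmod(a, b)
    # Truncation rounds up when the exact quotient is negative and inexact,
    # i.e. when the remainder is non-zero and its sign differs from b's.
    if r != 0 and (r < 0) != (b < 0):
        q -= 1
    return q


def floormod_lowered(a: int, b: int) -> int:
    """Floor modulo expressed with truncmod plus a correction."""
    r = truncmod(a, b)
    if r != 0 and (r < 0) != (b < 0):
        r += b
    return r
```

Python's own `//` and `%` already have floor semantics, so they serve as a convenient reference when checking the two lowered functions.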
MXNet pad is described at: https://mxnet.incubator.apache.org/api/python/symbol/symbol.html#mxnet.symbol.pad
Add support for parameter 'None' in MXNet slice operator.
MXNet 'slice' is described at: https://mxnet.incubator.apache.org/api/python/symbol/symbol.html#mxnet.symbol.slice
Add support for MXNet cos, sin, arctan.
MXNet 'cos' is described at: https://mxnet.incubator.apache.org/api/python/symbol/symbol.html#mxnet.symbol.cos
MXNet 'sin' is described at: https://mxnet.incubator.apache.org/api/python/symbol/symbol.html#mxnet.symbol.sin
MXNet 'arctan' is described at: https://mxnet.incubator.apache.org/api/python/symbol/symbol.html#mxnet.symbol.arctan
Add support for MXNet 1D Convolution and 1D Deconvolution.
MXNet Convolution is described at: https://mxnet.incubator.apache.org/api/python/symbol/symbol.html#mxnet.symbol.Convolution
MXNet Deconvolution is described at: https://mxnet.incubator.apache.org/api/python/symbol/symbol.html#mxnet.symbol.Deconvolution
Alex Gladkov committed -
Animesh Jain committed
-
- 19 Sep, 2019 5 commits
-
-
* add proper scheduling for dense on CUDA * add fallback config and fix unit test * fix corner cases * refactoring * fix bias and add testcase * let fusion happen
Cody Hao Yu committed -
Meghan Cowan committed
-
adjust pylint output to show file location to make it possible to locate errors
Ina Dobreva committed -
Animesh Jain committed
-
Tianqi Chen committed
-
- 18 Sep, 2019 3 commits
-
-
* [Relay] add shape check for concat * [Relay] add shape check for stack * add test case for shape mismatch * [typo] add the missing assert * fix lint errors. * replace int with size_t. * statically cast param->axis to size_t. * switch to run_infer_type. * fix checking for negative index * add static_cast for param->axis * merge to latest tvm * fix lint error * Fix an error with negative index. * Update transform.h * Update transform.cc
Ligeng Zhu committed -
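The kind of check this commit describes (matching shapes for concat, with negative axis handling) can be sketched in plain Python. This is only an illustration under assumed semantics, not the Relay type relation from transform.cc; `check_concat_shapes` is a hypothetical helper.

```python
def check_concat_shapes(shapes, axis):
    """Validate static input shapes for a concat and return the output shape."""
    if not shapes:
        raise ValueError("concat needs at least one input")
    ndim = len(shapes[0])
    if not -ndim <= axis < ndim:
        raise ValueError(f"axis {axis} out of range for {ndim}-d inputs")
    # Normalize a negative axis so that -1 refers to the last dimension.
    axis = axis + ndim if axis < 0 else axis
    for shape in shapes:
        if len(shape) != ndim:
            raise ValueError("all inputs must have the same rank")
        for d in range(ndim):
            # Every dimension except the concat axis has to agree.
            if d != axis and shape[d] != shapes[0][d]:
                raise ValueError(f"shape mismatch on dimension {d}")
    out = list(shapes[0])
    out[axis] = sum(shape[axis] for shape in shapes)
    return tuple(out)
```

For example, `check_concat_shapes([(2, 3), (2, 5)], -1)` accepts the inputs, while `check_concat_shapes([(2, 3), (4, 3)], 1)` raises because dimension 0 differs.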
Neo Chien committed
-
* Fix upsample layout in keras frontend. * Fixed group conv being used instead of conv when channels=1 * Add new conv2d test to catch bugs when channels=1.
Josh Fromm committed
-
- 17 Sep, 2019 3 commits
-
-
* Adding support to check whether an attribute is present without having to get its value. * Renaming the method to a more appropriate name.
shoubhik committed -
Use a hash map keyed on the descriptor set to avoid bad asymptotic behaviour.
Andrew Tulloch committed -
Junru Shao committed
-
- 16 Sep, 2019 6 commits
-
-
Animesh Jain committed
-
* improve conv2d_transpose x86 performance by reusing conv2d schedule * parallelize across batches to make large-batch conv2d and conv2d_transpose faster * improve doc for autotvm.task.space.FallbackConfigEntity.fallback_with_reference_log * add fallback schedule for schedule_conv2d_transpose_nchw_cuda * fix pylint * fix pylint * unify conv2d_transpose declaration in topi.nn and topi.x86
Yuwei Hu committed -
* Fix graph tuner benchmarking layout transform * Add test
Yao Wang committed -
* [tvm][codegen] Make buffer auto broadcast independent to the order of the input arg * fix indent
Zhi committed -
* [TOPI] operator support: logical_and, logical_or, logical_not * [TOPI] operator support: logical_and, logical_or, logical_not * [TOPI] fix test cases for operator support: logical_and, logical_or, logical_not * [TOPI] fix test cases for operator support: logical_not
Neo Chien committed -
* QNNLegalize for conv2d * [QNN] Legalization for Intel x86 QNN Conv2D
Animesh Jain committed
-
- 15 Sep, 2019 1 commit
-
-
* Enable miopen transpose convolution and fp16 support * linter
Peter Yeh committed
-