- 03 Oct, 2019 1 commit
-
-
* [Relay][Op] Add instance norm op * mend [Relay][Op] Add instance norm op
bindog committed
-
- 02 Oct, 2019 3 commits
-
-
Animesh Jain committed
-
Umang Yadav committed
-
* [TF][Op] Add TF op Where * improve tests * add tests for vm
Wei Chen committed
-
- 01 Oct, 2019 4 commits
-
-
Cody Hao Yu committed
-
Tianqi Chen committed
-
* Add op argwhere * Move shape func to _algorithm.py * Add lint rule * Raise exception if rank is not supported * move argwhere to transform * Add argwhere example * Fix lint * Add 1-d support * cleanup * Add more dtype support * CR comment * Improve error message * Docs * raise exception
Wei Chen committed -
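For readers unfamiliar with the op, here is a minimal pure-Python sketch of the semantics argwhere conventionally has (as in NumPy's `argwhere`): it returns the index of every non-zero element, in row-major order. The helper name `argwhere_nested` and the nested-list input are assumptions made for illustration; this is not the Relay implementation from the commit.

```python
def argwhere_nested(data):
    """Return the indices of all non-zero elements of a nested list.

    Hypothetical helper for illustration only: it mimics the usual
    argwhere semantics on plain Python lists, not the TVM operator.
    """
    result = []

    def visit(node, prefix):
        if isinstance(node, list):
            # Recurse into each sub-list, extending the index prefix.
            for i, child in enumerate(node):
                visit(child, prefix + [i])
        elif node:
            # Scalar leaf: record its index if the value is non-zero/True.
            result.append(prefix)

    visit(data, [])
    return result


if __name__ == "__main__":
    print(argwhere_nested([[1, 0], [0, 3]]))  # 2-d input
    print(argwhere_nested([0, 2, 0, 5]))      # 1-d input
```

Each result row has one entry per input dimension, so the output shape depends on the data, which is presumably why the commit needs a dedicated shape function.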
* [topi] add ARM v8.2 udot (uint8) support * fix test case * fix common conv2d schedule * add back fp32_time in test * fix lint * fix doc, add support for int32_lanes=4, signed int * fix lint * add ic_bn % 4 checker in schedule
Yizhi Liu committed
-
- 30 Sep, 2019 5 commits
-
-
Tianqi Chen committed
-
Animesh Jain committed
-
Animesh Jain committed
-
There are dependencies on dmlc-core in TVM public API headers (e.g. some headers include dmlc/logging.h) so it needs to be installed as part of TVM for TVM headers to be actually usable.
ndl committed -
Tianqi Chen committed
-
- 29 Sep, 2019 3 commits
-
-
* Fix parser * Doc fix * Add module utility functions necessary for prelude * Implement prelude in text format * Remove programmatically constructed prelude defs * Fix 0-arity type conses in pretty printer and test * Make prelude loading backwards-compatible * Fix patterns * Improve some prelude defs * Fix `ImportFromStd` It needs to also follow the "add unchecked, add checked" pattern * Lint roller * Woops * Address feedback * Fix `test_list_constructor` VM test * Fix `test_adt.py` failures
Logan Weber committed -
please see https://stackoverflow.com/a/26949099
egolearner committed -
* [AUTOTVM][DOCS] Add a link to autoTVM tutorial to direct the details of building NN with relay
Neo Chien committed
-
- 28 Sep, 2019 4 commits
-
-
Tianqi Chen committed
-
* [Fix] Add more pad_mode support for onnx converter * robustness fix
bindog committed -
Ina Dobreva committed
-
Add different batch sizes and channel numbers to MXNet Convolution and Deconvolution tests.
Alex Gladkov committed
-
- 27 Sep, 2019 6 commits
-
-
brett koonce committed
-
Paddy Horan committed
-
Tianqi Chen committed
-
* use a more intuitive way to limit the #ops in a group * format
Yida Wang committed -
Tianqi Chen committed
-
So that you can use: `build_mod_.GetFunction("get_lowered_funcs", false);` to get lowered_funcs.
Kimish Patel committed
-
- 26 Sep, 2019 3 commits
-
-
Haozheng Fan committed
-
Animesh Jain committed
-
[TOPI][x86] Introduce schedule_injective_from_existing and unify external schedules for all targets (#3983) * Fix extern schedule for x86 * Register x86::schedule_extern * Fix * Fix * Replace extern.py with extern.h * Introduce new generic function schedule_injective_from_existing * Fix * Fix * Add back to C++ * Fix style * Injective schedule calls local schedule_injective_from_existing * Fix * Remove target arg from schedule_injective_from_existing * Fix docs * Try to fix unit test * Fix test * Fix other tests * Fix bug
Jon Soifer committed
-
- 25 Sep, 2019 11 commits
-
-
* impose a max op limit to op fusion * use cross platform data type
Yida Wang committed -
More schedules are making the conv2d.py file too large, so we'd like to move the spatial pack schedule to a dedicated file before introducing the NHWC schedule. No logic change in this patch.
黎明灰烬 committed -
This reverts commit 23727eb4.
Tianqi Chen committed -
Cody Hao Yu committed
-
* [ARITH] Use explicit div/mod functions instead of operators. * fix pooling case
Tianqi Chen committed -
* Expose llvm.nearbyint intrinsic. This is a faster alternative to rounding. * Added python binding. Added test.
Kimish Patel committed -
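The commit message does not spell out how nearbyint differs from ordinary rounding. As a rough illustration, assuming the default IEEE 754 rounding mode (round to nearest, ties to even), nearbyint resolves ties to the even neighbour, whereas C's `round` resolves them away from zero. Python's built-in `round` also uses ties-to-even, so it stands in for nearbyint below; `round_half_away` and `nearbyint_like` are hypothetical helpers, not TVM or LLVM APIs.

```python
import math


def round_half_away(x):
    """Round to nearest, ties away from zero (the behaviour of C's round()).

    Sketch only: adding 0.5 is not robust for values just below a tie.
    """
    return math.copysign(math.floor(abs(x) + 0.5), x)


def nearbyint_like(x):
    """Round to nearest, ties to even.

    Stand-in for nearbyint under the default rounding mode (an assumption).
    """
    return float(round(x))


if __name__ == "__main__":
    # The two only disagree on exact ties such as 0.5 or 2.5.
    for value in (0.5, 1.5, 2.4, 2.5, -2.5):
        print(value, round_half_away(value), nearbyint_like(value))
```

The two functions agree everywhere except on exact ties, so swapping one for the other is only safe where tie handling does not matter.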
Philipp Krones committed
-
* Added tensorization for avx2 based gemm. Summary: Tensorized the same region as avx512. Produces 16x1 int32 results, using two sets of AVX2 instructions to do the reduction on an 8x4 int8 kernel with 1x4 data. Test Plan: on avx2 machine: python tests/python/contrib/test_gemm_avx2_acc32.py * Fix lint errors. Removed commented out code.
Kimish Patel committed -
Tianqi Chen committed
-
add test for GREATER
Ina Dobreva committed -
* Changes to make tensorize work. These changes also fix the previously broken test.
Summary: Tensorize was breaking for a few reasons.
1) Assert at src/op/tensorize.cc:234: CHECK(is_one(e.region[j]->extent)). In some cases this cannot be proven, e.g.: expected shape=[16, 4], given region=[range(min=((ax1.outer*16)/16), ext=(((((ax1.outer*16) + 15)/16) + 1) - ax1.outer)), range(min=((k.outer*4)/4), ext=(((((k.outer*4) + 3)/4) + 1) - k.outer)), range(min=0, ext=16), range(min=0, ext=4)]. The unprovable one is ext=(((((ax1.outer*16) + 15)/16) + 1) - ax1.outer). This could be simplified, but it is not: to simplify the division, the simplifier must prove ax1.outer > 0, and since ax1.outer is a var it cannot. The fix for this is to find all the vars in the expr and replace them with some const value.
2) Equivalence between the tensorized expr and the one being asked to tensorize. For example, the error would be: TVMError: Check failed: Equal(lhs, rhs): Failed to match the compute with TensorIntrin tensor_intrin's declaration provided= reduce(combiner=comm_reducer(result=[(x + y)], lhs=[x], rhs=[y], identity_element=[(int16)0]), source=[(int16(data(k))*int16(kernel(((((((((k.outer.outer*64) + (k.outer.inner*2)) + k)/2)*128) + i) - (k.outer.inner*128)) - (k.outer.outer*4096)), ((((k.outer.outer*64) + (k.outer.inner*2)) + k) % 2))))], axis=[iter_var(k, range(min=0, ext=2))], where=(bool)1, value_index=0), intrin= reduce(combiner=comm_reducer(result=[(x + y)], lhs=[x], rhs=[y], identity_element=[(int16)0]), source=[(int16(data(k))*int16(kernel(i, k)))], axis=[iter_var(k, range(min=0, ext=2))], where=(bool)1, value_index=0). The difference is mainly in the source part: source=[(int16(data(k))*int16(kernel(((((((((k.outer.outer*64) + (k.outer.inner*2)) + k)/2)*128) + i) - (k.outer.inner*128)) - (k.outer.outer*4096)), ((((k.outer.outer*64) + (k.outer.inner*2)) + k) % 2))))] versus source=[(int16(data(k))*int16(kernel(i, k)))], axis=[iter_var(k, range(min=0, ext=2))]. This was not being simplified because compute_intrin_iter_space (the map from iter var to range) did not contain leaf iter vars.
3) Here it fails with: Check failed: is_one(Simplify(value->shape[i])): Argument b_buffer shape mismatch[16, 4] vs [(((((ax1.outer*16) + 15)/16) + 1) - ax1.outer), (((((k.outer*4) + 3)/4) + 1) - k.outer), 16, 4]. This is in buffer binding, where it thinks the expected and the bound buffer shapes are different, although if we could simplify the expr this would not be the case.
Test Plan: On skylake avx512 machine: python tests/python/contrib/test_gemm_acc16.py
* Implemented a bounded analyzer which traverses the tree and, for reduce/for statements, binds the bound in the analyzer. Later this is used to simplify expressions. Inspired by ir_mutator_with_analyzer.
* Addressed comments.
* Added ASF header + define macro for the header file: TVM_ARITHMETIC_IR_VISITOR_WITH_ANALYZER_H_. Some lint fixes as well.
* Relax the assumption that dom_map must always contain all leaf itervars.
* Disable copy constructor and move to raw ptr.
Kimish Patel committed
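The first failure described in the commit message above hinges on the extent expression being equal to one, which the simplifier could not prove for a symbolic ax1.outer. The sketch below substitutes constant values for the variable and evaluates the expression numerically. It assumes the `/` in the printed expression is C-style truncating division (the message does not say so explicitly); `truncdiv` and `extent` are hypothetical helpers written for this illustration, not TVM APIs.

```python
def truncdiv(a, b):
    """Integer division rounding toward zero, as in C."""
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b >= 0) else -q


def extent(outer, factor=16):
    """Evaluate ((outer*factor + (factor - 1)) / factor + 1) - outer.

    With factor=16 this mirrors the ax1.outer extent from the message;
    with factor=4 it mirrors the k.outer extent.
    """
    return (truncdiv(outer * factor + (factor - 1), factor) + 1) - outer


if __name__ == "__main__":
    # Non-negative loop variables always give an extent of one.
    print(all(extent(v) == 1 for v in range(1000)))
    print(all(extent(v, factor=4) == 1 for v in range(1000)))
    # A negative value breaks the identity under truncating division,
    # which is why the sign of the variable matters to the simplifier.
    print(extent(-1))
```

Substituting a constant sidesteps the sign question, since loop variables such as ax1.outer never take negative values in practice.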
-