- 16 Nov, 2019 5 commits
-
-
* Add qnn conv2d attributes for input_tensor_scale and kernel_tensor_scale. The lowering in the tflite frontend loses the input_tensor_scale and the kernel_tensor_scale by multiplying them together and folding the product into the Requantize operation. This means that graph-partitioning passes, or any other passes that need this information, no longer have it available in the qnn dialect.
* Store input tensor scale and weight tensor scale for Dense as well. As with conv2d, the tflite frontend drops the input tensor scale and the weight tensor scale from the relay op, so store them as separate fields there.
* Fix unintentional tab
* Rename input_tensor_scale to input_scale and kernel_tensor_scale to kernel_scale for conv2d
* input_tensor_scale -> input_scale, weight_tensor_scale -> weight_scale
* Rework dense testcase and use input_scale and kernel_scale
* Be consistent in the use of input_scale and kernel_scale values
* Fix up qnn conv2d tests for input_scale and kernel_scale
* Make pydoc identical between conv2d and dense for weight_tensor
* Fix up conv2d parameters to be in the same order between C++ and Python
* Fix ordering of parameters for dense
* Add input_scale and output_scale to try and satisfy CI gods
* Delete input_scale and kernel_scale when lowering to nn.conv2d, since nn.conv2d does not carry them
* Add input_scale and kernel_scale for qnn.conv2d
Ramana Radhakrishnan committed -
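For illustration, here is a rough sketch of a qnn.conv2d carrying its scales as explicit attributes, as the change above describes. The argument spelling follows present-day TVM (where zero points and scales are Relay expressions), so treat names and forms as assumptions rather than the exact API at this commit.

```python
# Hedged sketch: qnn.conv2d keeps input_scale and kernel_scale as explicit
# attributes instead of folding them into a later Requantize op.
import tvm
from tvm import relay

data = relay.var("data", shape=(1, 3, 224, 224), dtype="uint8")
weight = relay.var("weight", shape=(16, 3, 3, 3), dtype="uint8")

conv = relay.qnn.op.conv2d(
    data,
    weight,
    input_zero_point=relay.const(0, "int32"),
    kernel_zero_point=relay.const(0, "int32"),
    input_scale=relay.const(0.5, "float32"),    # no longer lost during lowering
    kernel_scale=relay.const(0.25, "float32"),  # visible to partitioning passes
    kernel_size=(3, 3),
    channels=16,
    out_dtype="int32",
)
```

With the scales stored on the op itself, a partitioning pass can read them directly instead of reverse-engineering them from a downstream Requantize.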
Animesh Jain committed
-
Peter Yeh committed
-
Cody Hao Yu committed
-
* AutoTVM: select tuning templates when extracting tasks. Makes the procedure of trying new templates easier. Test: tests/python/relay/test_autotvm_task_extraction.py
* Use a dict to match keys for topi ops
* Fix lint issue
* Be more pythonic :)
黎明灰烬 committed
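A hedged sketch of the extraction flow this changes; the `ops` filter is the knob for selecting which templates get extracted. The accepted values for `ops` have varied across releases, so the `relay.op.get(...)` spelling below is an assumption.

```python
# Illustrative only: extract AutoTVM tuning tasks for a chosen subset of ops.
import numpy as np
import tvm
from tvm import autotvm, relay

# A tiny network so extraction has a conv2d to find.
data = relay.var("data", shape=(1, 3, 32, 32))
weight = relay.var("weight", shape=(8, 3, 3, 3))
net = relay.nn.conv2d(data, weight, kernel_size=(3, 3), channels=8)
mod = tvm.IRModule.from_expr(relay.Function([data, weight], net))
params = {"weight": np.zeros((8, 3, 3, 3), dtype="float32")}

tasks = autotvm.task.extract_from_program(
    mod["main"], target="llvm", params=params,
    ops=(relay.op.get("nn.conv2d"),),  # restrict extraction to conv2d templates
)
print(len(tasks))
```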
-
- 15 Nov, 2019 21 commits
-
-
When we did not set the workgroup size, LLVM would use too many registers for kernel launches with many threads, which resulted in "invalid ISA" errors. Here we set the maximum workgroup size to the maximum threads per block reported by the device API. Of course, one might later look into allowing configurations with fewer threads at runtime to use more registers.
Thomas Viehmann committed -
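For context, the per-block thread limit mentioned above is queryable from the device API in Python; a small hedged example (device presence is an assumption):

```python
# The device API exposes the per-block thread bound that the codegen now
# hands to LLVM as the maximum workgroup size.
import tvm

dev = tvm.rocm(0)  # assumes a ROCm-capable GPU
if dev.exist:
    print("max threads per block:", dev.max_threads_per_block)
```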
factors, and the resulting nested loop is broken. This is because we create zero-extent loops that are fixed up afterwards; however, the unroll pass breaks on the zero-extent loop.
Kimish Patel committed -
[Relay][VM][Interpreter] Enable first-class constructors in VM and interpreter via eta expansion (#4218)
* Fix constructor pretty printing
* Make Module::HasDef name consistent with API
* Add VM constructor compilation via eta expansion
* Lint
* Fix CI
* Fix failing test
* Address comment
* Retrigger CI
* Retrigger CI
Logan Weber committed -
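A hedged sketch of the idea behind the change above: eta expansion wraps a bare constructor in an ordinary function, so a constructor used as a value (e.g. passing Cons to map) compiles like any closure. The pass name and flag follow present-day TVM and are an assumption for this exact commit.

```python
# Illustrative: eta-expand constructors so that a bare `Cons` becomes
# fn (x, xs) { Cons(x, xs) } before VM compilation.
from tvm import relay
from tvm.relay.prelude import Prelude

mod = Prelude().mod  # module providing list constructors (Cons/Nil)
mod = relay.transform.InferType()(mod)
mod = relay.transform.EtaExpand(expand_constructor=True)(mod)
```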
* [COMMUNITY] Add DISCLAIMER, KEYS for ASF release * Add file name spec
Tianqi Chen committed -
T.J. Mercier committed
-
Alex Gladkov committed
-
Zhao Wu committed
-
ziyu-guo committed
-
* bug fix for padded load with large inputs * Update TensorLoad.scala * Update test_vta_insn.py
Liangfu Chen committed -
Jian Weng committed
-
Neo Chien committed
-
Wei Chen committed
-
* add gcnArch query * kGcnArch query for cuda is a no-op
Peter Yeh committed -
* [Relay][Frontend][TF] Use _infer_value_simulated in Transpose when axes is not a const * Uncomment tests * Dummy change to retrigger CI
Jon Soifer committed -
* [Contrib] Add MKL DNN * update * update
Haichen Shen committed -
Yizhi Liu committed
-
Zhao Wu committed
-
Philip Hyunsu Cho committed
-
A test for qnn_mul has to be added when the qnn elemwise tests (#4282) get merged.
Ina Dobreva committed -
* [Relay][Pass] Add pass to remove unused functions in relay module
* Add tests
* Fix lint
* Fix visit order
* Add pass argument
* Fix
Wei Chen committed -
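A minimal sketch of the new pass in use, in the current API spelling (the PR's exact invocation may differ):

```python
# Build a module with a reachable "main" and an unreachable helper, then
# drop everything not reachable from the listed entry functions.
import tvm
from tvm import relay

mod = tvm.IRModule()
mod["main"] = relay.Function([], relay.const(1.0))
mod["unused"] = relay.Function([], relay.const(2.0))  # never called

mod = relay.transform.RemoveUnusedFunctions(entry_functions=["main"])(mod)
print(mod.get_global_vars())  # "unused" has been removed
```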
Peter Yeh committed
-
- 14 Nov, 2019 9 commits
-
-
* Fix build * Dummy change to retrigger CI * Dummy change to retrigger CI * Dummy change to retrigger CI
Jon Soifer committed -
Tianqi Chen committed
-
* add volatile override back * [codegen] remove fp16 function override for cuda
Yizhi Liu committed -
Tianqi Chen committed
-
Zhi committed
-
Animesh Jain committed
-
Animesh Jain committed
-
* [DOCKER] Add ONNX runtime dep * Improve ci script
Tianqi Chen committed -
jason-song-dev committed
-
- 13 Nov, 2019 2 commits
-
-
Animesh Jain committed
-
* Support Proposal operator on CPU. * PyLint space issue * PyLint space issue * PyLint singleton-comparison issue
Zhao Wu committed
-
- 12 Nov, 2019 3 commits
-
-
* WIP: run the TF tutorial on TF2
* Remove debugger statement
* Complete the support for TF2.0's `resize`. TF2.0 adds a `half_pixel_centers` attribute to the `resize` function in the image API. This commit completes the hooks in Relay's TF frontend. As of this commit there is no new test yet. Also, this commit addresses solely the `resize` change; other commits address other changes in TF2.0.
* Support TF2.0 in the tutorial by using the compat API. This looks cleaner than trying to detect the TF version.
* Use the TF compat API, so as to support TF2.0. This is a direct change, relying on the compat API provided by the TF team. This code will last as long as the compat API exists, so proper support for TF 1.x and 2.x will require more work at some point in the future.
* Partial support for EXPLICIT padding introduced in TF2.0. Explicit padding is a special case in TF2.0 (see the reference linked below). Some models are serialized with that mode and break TF support in TVM. Support is *partial*, as EXPLICIT falls back to setting padding on the Relay op, which only supports 2 values. At some point, padding may need to be extended to support 4 values, but that is out of scope for this commit. Reference on EXPLICIT padding: https://github.com/tensorflow/tensorflow/commit/ec81825aaf7e848d9f8ddffdf1e0d20aebe9172c#diff-1d1c0bb0a880f85b6164f71dbb2f446e
* Guard the check for the optional TF2.0 attribute
* Do not expect Relay to implement TF-specific attributes. The `half_pixel_centers` attribute is a new feature in TF2.0. Earlier commits of mine mistakenly introduced it into the Relay API. This is probably not what Relay is expected to support, and the semantics of `half_pixel_centers` are unclear (to me, at least) at this point.
* Remove unclear comment. CR: https://github.com/dmlc/tvm/pull/4104#discussion_r338705742. Addresses #4104.
* Changes after review, complying without understanding the rationale for now.
* Fix arguments set mistakenly: an argument was ignored for the wrong operation.
Eric Platon committed -
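A small hedged sketch of the compat pattern these commits adopt: alias the TF1-style namespace once, then write graph-mode code that runs under both major versions. The AttributeError fallback is an assumption for very old TF 1.x builds.

```python
import tensorflow as tf

# Prefer the TF1-compatible namespace under TF2; fall back to plain `tf`
# on TF 1.x builds that predate `compat.v1`.
try:
    tf_compat_v1 = tf.compat.v1
except AttributeError:
    tf_compat_v1 = tf

# Graph-mode code then works unchanged under both versions.
with tf_compat_v1.Graph().as_default():
    msg = tf_compat_v1.constant("hello from the compat API")
    with tf_compat_v1.Session() as sess:
        print(sess.run(msg))
```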
Wei Chen committed
-
* Add test for the qnn_add operator. The tests use the fake-quant approach, so the tensors remain in float32 until the TF session runs. The test data has to be passed in uint8 because of how the tflite/tvm comparison works. An absolute tolerance of up to 1 is allowed for the qnn results. For now, input_stats are hardcoded, assuming the tests for the other qnn ops will pass input data in the same range.
* Separate the qnn uint8 test function from the fp32 elemwise tests. Isolate qnn uint8 elemwise tests. Remove blank lines.
Ina Dobreva committed
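To make the tolerance policy above concrete, a tiny hedged sketch of the comparison (the array values are invented; in the real test they come from TVM and the TFLite reference):

```python
import numpy as np

# Stand-ins for real outputs of the quantized add.
tvm_out = np.array([10, 128, 255], dtype="uint8")
tflite_out = np.array([11, 128, 254], dtype="uint8")

# Cast to int32 so uint8 subtraction cannot wrap, then allow the results
# to differ by at most one quantization step.
np.testing.assert_allclose(tvm_out.astype("int32"),
                           tflite_out.astype("int32"), rtol=0, atol=1)
```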
-