- 16 Nov, 2019 5 commits
-
-
* Add qnn conv2d attributes for input_tensor_scale and kernel_tensor_scale. The lowering in the tflite frontend loses the input_tensor_scale and the kernel_tensor_scale by multiplying them together and folding the product into the Requantize operation. This means that graph-partitioning passes, or any other passes that need this information, no longer have it available in the qnn dialect.
* Store input tensor scale and weight tensor scale for Dense as well. As with conv2d, the tflite frontend drops the input tensor scale and the weight tensor scale from the relay op, so store them as separate fields there.
* Fix unintentional tab
* Rename input_tensor_scale to input_scale and kernel_tensor_scale to kernel_scale for conv2d
* input_tensor_scale -> input_scale, weight_tensor_scale -> weight_scale
* Rework dense testcase and use input_scale and kernel_scale
* Be consistent in the use of input_scale and kernel_scale values
* Fix up qnn conv2d tests for input_scale and kernel_scale
* Make pydoc identical between conv2d and dense for weight_tensor
* Fix up conv2d parameters to be in the same order between C++ and Python
* Fix ordering of parameters for dense
* Add input_scale and output_scale to try and satisfy CI gods
* Delete input_scale and kernel_scale when lowering to nn.conv2d, since nn.conv2d does not carry them
* Add input_scale and kernel_scale for qnn.conv2d
Ramana Radhakrishnan committed -
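For illustration, here is a rough sketch of a qnn.conv2d carrying its scales as explicit attributes, as the change above describes. The argument spelling follows present-day TVM (where zero points and scales are Relay expressions), so treat names and forms as assumptions rather than the exact API at this commit.

```python
# Hedged sketch: qnn.conv2d keeps input_scale and kernel_scale as explicit
# attributes instead of folding them into a later Requantize op.
import tvm
from tvm import relay

data = relay.var("data", shape=(1, 3, 224, 224), dtype="uint8")
weight = relay.var("weight", shape=(16, 3, 3, 3), dtype="uint8")

conv = relay.qnn.op.conv2d(
    data,
    weight,
    input_zero_point=relay.const(0, "int32"),
    kernel_zero_point=relay.const(0, "int32"),
    input_scale=relay.const(0.5, "float32"),    # no longer lost during lowering
    kernel_scale=relay.const(0.25, "float32"),  # visible to partitioning passes
    kernel_size=(3, 3),
    channels=16,
    out_dtype="int32",
)
```

With the scales stored on the op itself, a partitioning pass can read them directly instead of reverse-engineering them from a downstream Requantize.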
Animesh Jain committed
-
Peter Yeh committed
-
Cody Hao Yu committed
-
* AutoTVM: select tuning templates when extracting tasks. Makes the procedure of trying new templates easier. Test: tests/python/relay/test_autotvm_task_extraction.py
* Use a dict to match keys for topi ops
* Fix lint issue
* Be more pythonic :)
黎明灰烬 committed
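A hedged sketch of the extraction flow this changes; the `ops` filter is the knob for selecting which templates get extracted. The accepted values for `ops` have varied across releases, so the `relay.op.get(...)` spelling below is an assumption.

```python
# Illustrative only: extract AutoTVM tuning tasks for a chosen subset of ops.
import numpy as np
import tvm
from tvm import autotvm, relay

# A tiny network so extraction has a conv2d to find.
data = relay.var("data", shape=(1, 3, 32, 32))
weight = relay.var("weight", shape=(8, 3, 3, 3))
net = relay.nn.conv2d(data, weight, kernel_size=(3, 3), channels=8)
mod = tvm.IRModule.from_expr(relay.Function([data, weight], net))
params = {"weight": np.zeros((8, 3, 3, 3), dtype="float32")}

tasks = autotvm.task.extract_from_program(
    mod["main"], target="llvm", params=params,
    ops=(relay.op.get("nn.conv2d"),),  # restrict extraction to conv2d templates
)
print(len(tasks))
```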
-
- 15 Nov, 2019 21 commits
-
-
When we did not set the workgroup size, LLVM would use too many registers for kernel launches with many threads, which resulted in "invalid ISA" errors. Here we set the maximum workgroup size to the maximum threads per block reported by the device API. Of course, one might later look into allowing configurations with fewer threads at runtime to use more registers.
Thomas Viehmann committed -
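For context, the per-block thread limit mentioned above is queryable from the device API in Python; a small hedged example (device presence is an assumption):

```python
# The device API exposes the per-block thread bound that the codegen now
# hands to LLVM as the maximum workgroup size.
import tvm

dev = tvm.rocm(0)  # assumes a ROCm-capable GPU
if dev.exist:
    print("max threads per block:", dev.max_threads_per_block)
```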
factors, and the resulting nested loop is broken. This is because we create zero-extent loops that are fixed up afterwards; however, the unroll pass breaks on the zero-extent loop.
Kimish Patel committed -
[Relay][VM][Interpreter] Enable first-class constructors in VM and interpreter via eta expansion (#4218)
* Fix constructor pretty printing
* Make Module::HasDef name consistent with API
* Add VM constructor compilation via eta expansion
* Lint
* Fix CI
* Fix failing test
* Address comment
* Retrigger CI
* Retrigger CI
Logan Weber committed -
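A hedged sketch of the idea behind the change above: eta expansion wraps a bare constructor in an ordinary function, so a constructor used as a value (e.g. passing Cons to map) compiles like any closure. The pass name and flag follow present-day TVM and are an assumption for this exact commit.

```python
# Illustrative: eta-expand constructors so that a bare `Cons` becomes
# fn (x, xs) { Cons(x, xs) } before VM compilation.
from tvm import relay
from tvm.relay.prelude import Prelude

mod = Prelude().mod  # module providing list constructors (Cons/Nil)
mod = relay.transform.InferType()(mod)
mod = relay.transform.EtaExpand(expand_constructor=True)(mod)
```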
* [COMMUNITY] Add DISCLAIMER, KEYS for ASF release * Add file name spec
Tianqi Chen committed -
T.J. Mercier committed
-
Alex Gladkov committed
-
Zhao Wu committed
-
ziyu-guo committed
-
* bug fix for padded load with large inputs * Update TensorLoad.scala * Update test_vta_insn.py
Liangfu Chen committed -
Jian Weng committed
-
Neo Chien committed
-
Wei Chen committed
-
* add gcnArch query * kGcnArch query for cuda is a no-op
Peter Yeh committed -
* [Relay][Frontend][TF] Use _infer_value_simulated in Transpose when axes is not a const * Uncomment tests * Dummy change to retrigger CI
Jon Soifer committed -
* [Contrib] Add MKL DNN * update * update
Haichen Shen committed -
Yizhi Liu committed
-
Zhao Wu committed
-
Philip Hyunsu Cho committed
-
A test for qnn_mul has to be added when the qnn elemwise tests (#4282) get merged.
Ina Dobreva committed -
* [Relay][Pass] Add pass to remove unused functions in relay module
* Add tests
* Fix lint
* Fix visit order
* Add pass argument
* Fix
Wei Chen committed -
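A minimal sketch of the new pass in use, in the current API spelling (the PR's exact invocation may differ):

```python
# Build a module with a reachable "main" and an unreachable helper, then
# drop everything not reachable from the listed entry functions.
import tvm
from tvm import relay

mod = tvm.IRModule()
mod["main"] = relay.Function([], relay.const(1.0))
mod["unused"] = relay.Function([], relay.const(2.0))  # never called

mod = relay.transform.RemoveUnusedFunctions(entry_functions=["main"])(mod)
print(mod.get_global_vars())  # "unused" has been removed
```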
Peter Yeh committed
-
- 14 Nov, 2019 9 commits
-
-
* Fix build * Dummy change to retrigger CI * Dummy change to retrigger CI * Dummy change to retrigger CI
Jon Soifer committed -
Tianqi Chen committed
-
* add volatile override back * [codegen] remove fp16 function override for cuda
Yizhi Liu committed -
Tianqi Chen committed
-
Zhi committed
-
Animesh Jain committed
-
Animesh Jain committed
-
* [DOCKER] Add ONNX runtime dep * Improve ci script
Tianqi Chen committed -
jason-song-dev committed
-
- 13 Nov, 2019 2 commits
-
-
Animesh Jain committed
-
* Support Proposal operator on CPU. * PyLint space issue * PyLint space issue * PyLint singleton-comparison issue
Zhao Wu committed
-
- 12 Nov, 2019 3 commits
-
-
* WIP: run the TF tutorial on TF2
* Remove debugger statement
* Complete the support for TF2.0's `resize`. TF2.0 adds a `half_pixel_centers` attribute to the `resize` function in the image API. This commit completes the hooks in Relay's TF frontend. As of this commit there is no new test yet. Also, this commit addresses solely the `resize` change; other commits address other changes in TF2.0.
* Support TF2.0 in the tutorial by using the compat API. This looks cleaner than trying to detect the TF version.
* Use the TF compat API, so as to support TF2.0. This is a direct change, relying on the compat API provided by the TF team. This code will last as long as the compat API exists, so proper support for TF 1.x and 2.x will require more work at some point in the future.
* Partial support for EXPLICIT padding introduced in TF2.0. Explicit padding is a special case in TF2.0 (see the reference linked below). Some models are serialized with that mode and break TF support in TVM. Support is *partial*, as EXPLICIT falls back to setting padding on the Relay op, which only supports 2 values. At some point, padding may need to be extended to support 4 values, but that is out of scope for this commit. Reference on EXPLICIT padding: https://github.com/tensorflow/tensorflow/commit/ec81825aaf7e848d9f8ddffdf1e0d20aebe9172c#diff-1d1c0bb0a880f85b6164f71dbb2f446e
* Guard the check for the optional TF2.0 attribute
* Do not expect Relay to implement TF-specific attributes. The `half_pixel_centers` attribute is a new feature in TF2.0. Earlier commits of mine mistakenly introduced it into the Relay API. This is probably not what Relay is expected to support, and the semantics of `half_pixel_centers` are unclear (to me, at least) at this point.
* Remove unclear comment. CR: https://github.com/dmlc/tvm/pull/4104#discussion_r338705742. Addresses #4104.
* Changes after review, complying without understanding the rationale for now.
* Fix arguments set mistakenly: an argument was ignored for the wrong operation.
Eric Platon committed -
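A small hedged sketch of the compat pattern these commits adopt: alias the TF1-style namespace once, then write graph-mode code that runs under both major versions. The AttributeError fallback is an assumption for very old TF 1.x builds.

```python
import tensorflow as tf

# Prefer the TF1-compatible namespace under TF2; fall back to plain `tf`
# on TF 1.x builds that predate `compat.v1`.
try:
    tf_compat_v1 = tf.compat.v1
except AttributeError:
    tf_compat_v1 = tf

# Graph-mode code then works unchanged under both versions.
with tf_compat_v1.Graph().as_default():
    msg = tf_compat_v1.constant("hello from the compat API")
    with tf_compat_v1.Session() as sess:
        print(sess.run(msg))
```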
Wei Chen committed
-
* Add test for the qnn_add operator. The tests use the fake-quant approach, so the tensors remain in float32 until the TF session runs. The test data has to be passed in uint8 because of how the tflite/tvm comparison works. An absolute tolerance of up to 1 is allowed for the qnn results. For now, input_stats are hardcoded, assuming the tests for the other qnn ops will pass input data in the same range.
* Separate the qnn uint8 test function from the fp32 elemwise tests. Isolate qnn uint8 elemwise tests. Remove blank lines.
Ina Dobreva committed
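To make the tolerance policy above concrete, a tiny hedged sketch of the comparison (the array values are invented; in the real test they come from TVM and the TFLite reference):

```python
import numpy as np

# Stand-ins for real outputs of the quantized add.
tvm_out = np.array([10, 128, 255], dtype="uint8")
tflite_out = np.array([11, 128, 254], dtype="uint8")

# Cast to int32 so uint8 subtraction cannot wrap, then allow the results
# to differ by at most one quantization step.
np.testing.assert_allclose(tvm_out.astype("int32"),
                           tflite_out.astype("int32"), rtol=0, atol=1)
```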
-