- 21 Nov, 2019 3 commits
Previously, we relied on later phases to error out (often for using too much shared memory). This enables for ROCm the IR checks that already exist for CUDA and OpenCL.
Thomas Viehmann committed
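For context, a minimal sketch of invoking this kind of IR-level device check from Python, using the 0.6-era API; the limit values below are illustrative assumptions, not the actual ROCm defaults:

```python
import tvm

n = 1024
A = tvm.placeholder((n,), name="A")
B = tvm.compute((n,), lambda i: A[i] + 1.0, name="B")

s = tvm.create_schedule(B.op)
bx, tx = s[B].split(B.op.axis[0], factor=64)
s[B].bind(bx, tvm.thread_axis("blockIdx.x"))
s[B].bind(tx, tvm.thread_axis("threadIdx.x"))

stmt = tvm.lower(s, [A, B], simple_mode=True)
# VerifyGPUCode returns False when the lowered IR exceeds the stated limits.
ok = tvm.ir_pass.VerifyGPUCode(stmt, {
    "max_shared_memory_per_block": 64 * 1024,  # illustrative limits
    "max_threads_per_block": 256,
})
print("IR within device limits:", ok)
```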
Animesh Jain committed
Zhi committed
- 20 Nov, 2019 7 commits
Tianqi Chen committed
* [ThreadPool] Solve thread transitions issue
* Use pthread_atfork to avoid the master thread's CPU affinity being inherited by the child.
* Code format
* Comment on exclude_worker0_
* Set full CPU affinity
* Remove redundant blank line
* CPPLint
* CPPLint namespace
* CPPLint
* Fix the wrong logic of binding the master thread.
Zhao Wu committed
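The commit uses pthread_atfork in C++; the analogous hook in Python (an assumed illustration of the same idea, not the code from the commit) looks like:

```python
import os

def _reset_affinity_in_child():
    # Undo any affinity the parent's worker threads pinned, so the
    # forked child is free to run on every CPU again (Linux-only calls).
    os.sched_setaffinity(0, range(os.cpu_count()))

# Registered handlers run in the child right after fork(), which is
# the same hook point pthread_atfork provides in C.
os.register_at_fork(after_in_child=_reset_affinity_in_child)
```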
Alexander Pivovarov committed
Yizhi Liu committed
masahi committed
Liang ZOU committed
Tianqi Chen committed
- 19 Nov, 2019 7 commits
Yizhi Liu committed
* [Relay][Quantize] Integrate data-aware calibration into quantization
* Update _calibrate.py
* Trigger CI
* Address comments
* Address comments
Wuwei Lin committed
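A hedged sketch of how data-aware calibration is driven from Python after this change; `mod`, `params`, and `batches` are assumed to already exist:

```python
from tvm import relay

# `mod` and `params` are an assumed Relay module and its weights;
# `batches` is an assumed iterable of input arrays for calibration.
def calibrate_dataset():
    for batch in batches:
        yield {"data": batch}

with relay.quantize.qconfig(calibrate_mode="kl_divergence",
                            weight_scale="max"):
    qmod = relay.quantize.quantize(mod, params,
                                   dataset=calibrate_dataset())
```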
* [PERF] parallel reduction in cpu
* fix
* x
* update
* lint
* fix
Haichen Shen committed
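This is not the topi schedule from the commit, but a minimal sketch of one common way to parallelize a CPU reduction in the 0.6-era API, via rfactor plus parallel:

```python
import tvm

n, m = 1024, 1024
A = tvm.placeholder((n, m), name="A")
k = tvm.reduce_axis((0, m), name="k")
B = tvm.compute((n,), lambda i: tvm.sum(A[i, k], axis=k), name="B")

s = tvm.create_schedule(B.op)
# Factor the reduction axis so partial sums can run in parallel;
# the final stage then combines the partial results.
ko, ki = s[B].split(B.op.reduce_axis[0], factor=64)
BF = s.rfactor(B, ko)
s[BF].parallel(BF.op.axis[0])
f = tvm.build(s, [A, B], target="llvm")
```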
* [tutorial] nnvm -> relay
* Use relay workload
* Delete mobilenetv2 option
Yizhi Liu committed
Alexander Pivovarov committed
Animesh Jain committed
* Add rule for clean
* Update clean rule

  It seems the lib/ directory is not created by the Makefile, so don't delete the directory, just its contents.
miheer vaidya committed
- 18 Nov, 2019 6 commits
Yizhi Liu committed
Cody Hao Yu committed
Tianqi Chen committed
* Add tf FloorMod
* Add floor_div/mod into topi and relay
* Add to rst
* Fix test
Yao Wang committed
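A small sketch of the new Relay ops; floor semantics follow Python's %, so the result of floor_mod takes the sign of the divisor, unlike truncation-based mod:

```python
from tvm import relay

x = relay.var("x", shape=(4,), dtype="float32")
y = relay.var("y", shape=(4,), dtype="float32")
# floor_divide(x, y) = floor(x / y)
# floor_mod(x, y)    = x - floor(x / y) * y
d = relay.floor_divide(x, y)
m = relay.floor_mod(x, y)
f = relay.Function([x, y], relay.Tuple([d, m]))
```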
* [Relay][Frontend][Tensorflow] Add conv2d_transpose
* Add a transformation from NHWC to NCHW to be compatible with TVM's conv2d_transpose implementation
* Remove the 'dilations' parameter to be compatible with TF 1.3
optima2005 committed
When getting a CUDA schedule, passing a single tensor seems to work, but after changing the target to "llvm" it triggers an assert. Passing a list, on the other hand, keeps both the cuda and llvm targets happy. See https://discuss.tvm.ai/t/solved-simple-example-error-attributeerror-tensorslice-object-has-no-attribute-op/2245/3
miheer vaidya committed
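A minimal reproduction of the working form, assuming the 0.6-era API:

```python
import tvm

n = 1024
A = tvm.placeholder((n,), name="A")
B = tvm.compute((n,), lambda i: A[i] * 2.0, name="B")
s = tvm.create_schedule(B.op)

# Pass the argument buffers as a list; a bare tensor happened to
# work for the cuda path but trips an assert for "llvm".
f = tvm.build(s, [A, B], target="llvm")
```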
- 16 Nov, 2019 6 commits
Philip Hyunsu Cho committed
* Add qnn conv2d attributes for input_tensor_scale and kernel_tensor_scale.

  The lowering in the tflite frontend loses the input_tensor_scale and the kernel_tensor_scale by multiplying them together and folding the product into the Requantize operation. This means that any graph-partitioning passes, or other passes that need this information, no longer have it available in the qnn dialect.

* Store input tensor scale and weight tensor scale for Dense as well.

  As for conv2d, the tflite frontend drops the input tensor scale and the weight tensor scale from the relay op. Store them as separate fields there.

* Fix unintentional tab
* Rename input_tensor_scale to input_scale and kernel_tensor_scale to kernel_scale for conv2d.
* input_tensor_scale -> input_scale, weight_tensor_scale -> weight_scale
* Rework dense testcase and use input_scale and kernel_scale
* Be consistent in use of input_scale and kernel_scale values
* Fix up qnn conv2d tests for input_scale and kernel_scale
* Make pydoc identical between conv2d and dense for weight_tensor
* Fix up conv2d parameters to be in the same order between C++ and Python
* Fix ordering of parameters for dense.
* Add input_scale and output_scale to try and satisfy CI gods
* Delete input_scale and kernel_scale: nn.conv2d does not contain them, so we need to delete them when lowering to nn.conv2d.
* Add input_scale and kernel_scale for qnn.conv2d
Ramana Radhakrishnan committed
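A hedged sketch of constructing qnn.conv2d with the scales carried on the op after this change; the shapes and values are made up, and the exact keyword set is assumed from the description above:

```python
from tvm import relay

data = relay.var("data", shape=(1, 3, 32, 32), dtype="uint8")
weight = relay.var("weight", shape=(8, 3, 3, 3), dtype="uint8")

# input_scale/kernel_scale now ride along on the op so later passes
# can still see them; they are not used in the convolution itself.
conv = relay.qnn.op.conv2d(
    data, weight,
    input_zero_point=0, kernel_zero_point=0,
    input_scale=0.5, kernel_scale=0.25,
    kernel_size=(3, 3), channels=8)
```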
Animesh Jain committed
Peter Yeh committed
Cody Hao Yu committed
* AutoTVM: selecting tuning templates when extracting task

  Make the procedure of trying new templates easier.
  Test: tests/python/relay/test_autotvm_task_extraction.py

* Use dict to match key for topi ops
* Fix lint issue
* Be more pythonic :)
黎明灰烬 committed
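A hedged usage sketch, assuming the 0.6-era extraction API and an existing `mod`/`params` pair:

```python
from tvm import autotvm, relay

# `mod` and `params` are an assumed Relay module and its weights.
# Restricting `ops` selects which templates/tasks get extracted.
tasks = autotvm.task.extract_from_program(
    mod["main"], target="llvm", params=params,
    ops=(relay.op.nn.conv2d,))
for t in tasks:
    print(t)
```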
- 15 Nov, 2019 11 commits
When we did not set the workgroup size, LLVM would use too many registers for kernel launches with many threads, which resulted in "invalid ISA" errors. Here we set the maximum workgroup size to the maximum threads per block reported by the device API. Of course, one might look into letting configurations that launch fewer threads at runtime use more registers.
Thomas Viehmann committed
…factors, and the resulting nested loop is broken. This is because we create zero-extent loops that are fixed up afterwards; the unroll pass, however, breaks on the zero-extent loop.
Kimish Patel committed
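A minimal sketch of the triggering pattern, in the 0.6-era API: a split whose factor does not divide the extent, followed by unroll of the inner loop:

```python
import tvm

n = 10
A = tvm.placeholder((n,), name="A")
B = tvm.compute((n,), lambda i: A[i] + 1.0, name="B")

s = tvm.create_schedule(B.op)
# 4 does not divide 10, so lowering must guard the tail iteration;
# this is the kind of pattern that produced the zero-extent loops.
xo, xi = s[B].split(B.op.axis[0], factor=4)
s[B].unroll(xi)
print(tvm.lower(s, [A, B], simple_mode=True))
```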
[Relay][VM][Interpreter] Enable first-class constructors in VM and interpreter via eta expansion (#4218)

* Fix constructor pretty printing
* Make Module::HasDef name consistent with API
* Add VM constructor compilation via eta expansion
* Lint
* Fix CI
* Fix failing test
* Address comment
* Retrigger CI
* Retrigger CI
Logan Weber committed
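Eta expansion turns a bare constructor into an ordinary function value; a language-agnostic illustration in Python with a made-up Cons constructor:

```python
# A stand-in ADT constructor of arity 2, e.g. Cons(head, tail).
class Cons:
    def __init__(self, head, tail):
        self.head, self.tail = head, tail

# Eta expansion wraps the bare constructor in a plain function value,
#   Cons  ~>  \head, tail -> Cons(head, tail)
# so higher-order code (e.g. map/fold in the VM) can apply it
# like any other closure.
eta_cons = lambda head, tail: Cons(head, tail)

# Usage: passing the expanded constructor where a function is expected.
cells = list(map(eta_cons, [1, 2], [None, None]))
```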
* [COMMUNITY] Add DISCLAIMER, KEYS for ASF release
* Add file name spec
Tianqi Chen committed
T.J. Mercier committed
Alex Gladkov committed
Zhao Wu committed
ziyu-guo committed
* Bug fix for padded load with large inputs
* Update TensorLoad.scala
* Update test_vta_insn.py
Liangfu Chen committed
Jian Weng committed
Neo Chien committed