Commits · ca35277071f43b55987c820899cccd23105b181f · wenyuanbo / tic

05 Sep, 2019 1 commit
- [QNN] Add - Refactoring to C++ (#3736) · a6bb84a8
  Animesh Jain committed 5 years ago
  
  a6bb84a8 Browse Directory
04 Sep, 2019 1 commit
- [QNN] Convolution 2D Implementation. (#3580) · 0d4870cc
```
Rebasing. Empty commit.

Clang-format styling.
```
  Animesh Jain committed 5 years ago
  0d4870cc Browse Directory
03 Sep, 2019 2 commits

Revert "[Runtime] Allow parameter sharing between modules (#3489)" (#3884) · 6b0359b4
```
This reverts commit 224cc243.
```
Tianqi Chen committed 5 years ago
6b0359b4 Browse Directory

[Runtime] Allow parameter sharing between modules (#3489) · 224cc243

As GraphRuntime does not provide control-flow logics, we have to split
our model to two parts. While we need to share parameters between them
to save memory usage.

Solution:
1) add "lazy_init_input" in graph's attributes
   "attrs": {
     ... ...
     "lazy_init_input": [
       "list_str",
       [
         "p0"
       ]
     ]
    }
2) allow un-allocated NDArray entry in SetupStorage
3) utilize "set_input_zero_copy" function to set parameters

committed 5 years ago

224cc243 Browse Directory

01 Sep, 2019 2 commits

[Relay][Any] Add shape func for dynamic shape (#3606) · eef35a57

* init shape func in interpreter and vm compiler

* Update interpreter

* fix

* lint

* lint

* fix

* remove hack

* update

* fix

* fix

* update

* address comments & update for shape_of

* fix lint

* update

* fix hybrid

* lint

* fix bug & add take shape func

* lint

* lint

* update

* fix flaky test

* add todo

committed 5 years ago

eef35a57 Browse Directory

[Relay] Bitserial ops (#3844) · d08c74ca

* Added arm_cpu NHWC schedules.

* Fixed kernel shape legalization.

* Added bitserial ops to relay.

* Snapshot and more missing files.

* Added dense testing.

* Added tests

* Added ASF header to new files.

* cc lint

* Pylint change.

* pylint fixes.

* Change arm legalize test.

* Added assert check to arm legalize.

* Added better documentation, fixed some bad style

* Reverted arm conv2d nhwc changes.

committed 5 years ago

d08c74ca Browse Directory

31 Aug, 2019 1 commit
- [QNN] Concat - Refactoring to C++ (#3819) · ec7790e3
  Animesh Jain committed 5 years ago
  
  ec7790e3 Browse Directory
30 Aug, 2019 1 commit
- [Relay][QNN] QNNtoRelay & QNNLegalize Pass utility using Relay Legalize API. (#3838) · 671421a8
  Animesh Jain committed 5 years ago
  
  671421a8 Browse Directory
23 Aug, 2019 1 commit
- [CODE] Halide attributions (#3824) · 17f8f96b
  Tianqi Chen committed 5 years ago
  
  17f8f96b Browse Directory
22 Aug, 2019 3 commits

[TVM] Fix warnings (#3817) · ac474603

transform.h:118:3: warning: 'const' type qualifier on return type has no
effect
attrs.h:68:3: note: expanded from macro 'TVM_DECLARE_ATTRS'
node.h:244:3: note: expanded from macro 'TVM_DECLARE_NODE_TYPE_INFO'

transform.h:95:3: warning: extra ';' after member function definition
attrs.h:68:62: note: expanded from macro 'TVM_DECLARE_ATTRS'

committed 5 years ago

ac474603 Browse Directory

[TOPI][Relay][TensorFlow] Add OneHot operator (#3781) · 554df211

* Add one-hot to Relay

* topi implementation

* Working

* add topi test

* Add TF test

* Fix check

* fix linting issues

* fix documentation

* Fix documentation

* Add support for on_value, off_value, axis, dtype

* Add full support for axis

* Fix compute and update test_forward

* Move on_value and off_value to inputs

* Add topi test

* Update tests

* Update docs

* Fix style

* re-enable tests

* Add one_hot to mxnet converter

committed 5 years ago

554df211 Browse Directory

Changed topi cc resize to python implementation with new features. (#3788) · 7264cb6a
Josh Fromm committed 5 years ago

7264cb6a Browse Directory

21 Aug, 2019 1 commit

[Relay][VM]VM Profiler (#3727) · 95f12e31

* [Relay][VM]VM debugger

* Report mean/min/max for op duration

* Typos

* Lint

* Lint

* Lint

* Support build debug VM in CMake

* Lint

* Enable VM debug in unit test

* Disable debug vm test until new docker image is built

* Add device sync code

* Fix qnn unit test

* Disable vm debug by default

* Rename files

* Rename classes

* Fix comment

* Fix comment

committed 5 years ago

95f12e31 Browse Directory

16 Aug, 2019 1 commit

QNN quantize and dequantize operators. (#3745) · d3eb9cb8

* QNN quantize and dequantize operators.

* addressing review comments.

* addressing review comments.

* Adding new line at the end of the file.

* Adhering to styling guidelines.

* Adding name to contributors.

* Fixing lint issue.

* Fixing file name.

* Removing unnecessary code.

committed 5 years ago

d3eb9cb8 Browse Directory

15 Aug, 2019 1 commit
- [QUANTIZE] Refactor quantization codebase and fix model accuracy (#3543) · 7eb1f353
```
* Refactor.

* update

* update

* update

* update

* update

* update
```
  ziheng committed 5 years ago
  7eb1f353 Browse Directory
13 Aug, 2019 1 commit

[Relay] SpaceToDepth and MirrorPad Operators (#3718) · 8bd9d4d5

* Added relay and topi mirror_pad operator.

* Added mirror_padding to tensorflow frontend.

* Added mirrorpad testing in tensorflow frontent.

* Added space_to_depth in tf frontend.

* Added tests for spacetodepth.

* spacetodepth bug fix.

* Lint fix

* Added mirror pad python attrs.

* Pad code formatting.

* Syntax improvement

* Hopefully last lint fix

committed 5 years ago

8bd9d4d5 Browse Directory

09 Aug, 2019 1 commit
- Fix typo in ir_pass.h (#3741) · 45827220
  雾雨魔理沙 committed 5 years ago
  
  45827220 Browse Directory
08 Aug, 2019 1 commit

[QNN] Requantize operator (#3531) · a78adbd5

* [Relay] [Quantization] WIP - Common files for the qauntization work.

* [Relay] [Quantization] WIP - Prototyping requantize op.

* Requantize operator implementation.

Requantize converts one quantized tensor representation to another quantized
representation. The PR has following implementation features

- Requantize operator defined in qnn namespace - relay.qnn.requantize
- Lowering of the requantize to exisiting Relay operators
- Integer fixed point implementation of requantize
    - Two rounding modes - FE_UPWARDS (round towards infinity) and
    FE_AWAY_FROM_ZERO (std::round behavior)
- Floating point implementation as well, that can act as reference or can be
used for devices when FP32 computation is not used.
- Unit test cases

Relevant Issue - https://github.com/dmlc/tvm/issues/2351

Credit to TFLite and GemmLowp to provide reference implementations.

* Typo and lint fixes.

* Doc fix.

* Uncommenting the lint script (fixing mistake).

* Modifying the unit tests.

* Moving C++ files into src/relay/qnn

* Moving python files to python/tvm/relay/qnn. Some minor fixes.

* Moving the attrs.h inside the include directory.

* Pushing files that I forgot earlier. Changing util location.

* Incorporating comments. API change. Lint fixes.

* Modifying the GetFixedPointMultiplierShift API as per comments.

* Forgot the dialect change.

* Changing rewrite to qnn_lower.

* Renaming Quantize to Qnn for clarity.

* Remove use_int_domain.

* Incorportaing review comments.

* Adding API doc for QNN dialect.

* Move the qnn_lower pass to transform namespace.

* Moving from expr to module. Adding namespace in C++.

* Minor sentence rewrites. Added qnn namespace.

* Added the API doc.

* Chanding default out_dtype to int8. Adding a test with in/out_dtype as uint8.

* Style fixes. Better error messages.

* Adding documentation.

* More documentation fixes.

* Adding out dtype check for requantize.

* Adding corner case for FP32 to fixed point conversion.

* Adding extra line.

* Documentation fix.

* Adding static inline.

* Incorporating jackwish comment. Removed idtype from requantize lowering.

* Removing Quantize/Dequantize code. Restricting Requantize to (u)int8/int32.

* Style fixes.

* Fix the docs.

* Move to Legalize API.

committed 5 years ago

a78adbd5 Browse Directory

07 Aug, 2019 1 commit

[Relay/TOPI][Op] Add variance and layer norm op (#3700) · 6b6e3888

* Add LayerNorm op

* update

* fix

* Add mean_std and mean_variance

* add std and update doc

* add license

* x

* lint

* x

* fix

* fix doc

committed 5 years ago

6b6e3888 Browse Directory

06 Aug, 2019 2 commits

[Relay] Legalize pass (#3672) · 79922bd3

* [Relay] Rewrite pass.

This pass transforms an expression to other expression.

This pass has many usecases
 * Replace a expr to another expr, if the other expr has faster performance.
 * For ASICs, we might want to modify the inputs to adapt to the HW support.
 * Alter op layout can work in conjunction with this pass.

The supporting usecase is the Intel i8 x i8 conv. Intel HW supports u8 x i8 conv
in HW. Using this pass, we can replace an i8 x i8 conv to a sequence of
operators where one of the operators is now u8 x i8 conv. This will also help
automatic quantizaion performance.

* Better API name.

* Removing the conv2d legalization for x86. Will send a separate PR.

* Test name changes.

* Registering one funtion to register FTVMLegalize.

* Better comments.

committed 5 years ago

79922bd3 Browse Directory

[Relay] [TOPI] `{relay,topi}.nn.sparse_transpose` for **Square** CSR matrices (#3707) · 3b287c4d

* add build gcn tutorial

* add transpose operator for square sparse matrices

* remove extra files

* change loop tag

* comply with lint

* comply with lint -- line too long

* comply with lint

* lint check

* lint check

* lint check

* apply marisa and theirry's reviews

committed 5 years ago

3b287c4d Browse Directory

05 Aug, 2019 1 commit
- Export tvm::relay::OpRegistry::OpRegistry (#3711) · 5821a8ad
  Junru Shao committed 5 years ago
  
  5821a8ad Browse Directory
01 Aug, 2019 4 commits

[Relay][VM] Support execution on devices (#3678) · 5357f49b

* [Relay][VM] Support execution on devices

* Reduce Copy calls

* Cleanup

* Lint

* CR comments

* Merge test into test_vm.py

committed 5 years ago

5357f49b Browse Directory

Add shuffle support to TVM (#3633) · a279dd0e
Jian Weng committed 5 years ago

a279dd0e Browse Directory

Add support for Tensorflow operators log1p, cos, sin (#3614) · d72cdfa6

The patch adds support for Tensorflow operators log1p and cos
Tensorflow log1p is described at https://www.tensorflow.org/api_docs/python/tf/math/log1p
Tensorflow cos is described at https://www.tensorflow.org/api_docs/python/tf/math/cos
Tensorflow sin is described at https://www.tensorflow.org/api_docs/python/tf/math/sin

committed 5 years ago

d72cdfa6 Browse Directory

[Relay] Strict mode in pattern matching (#3620) · 331585f4

* add fatal

lint

lint

lint

do

make completeness check an error

lint

remove fatal

* fix test

* reset parser file

* remove unneeded import

* Update python/tvm/relay/adt.py

Co-Authored-By: Steven S. Lyubomirsky <slyubomirsky@gmail.com>

* Update include/tvm/relay/adt.h

Co-Authored-By: Steven S. Lyubomirsky <slyubomirsky@gmail.com>

* Eliminate trailing whitespace (my fault)

committed 5 years ago

331585f4 Browse Directory

31 Jul, 2019 1 commit
- [Relay][VM] Relay VM serialization (#3647) · 90455121
```
* relay vm serialization

* fix lint

* load params, fix stream

* lint

* fix typo
```
  Zhi committed 5 years ago
  90455121 Browse Directory
25 Jul, 2019 2 commits

[IR] Make iterators compatible with constructors of STL containers (#3624) · 0858c5ad
Lianmin Zheng committed 5 years ago

0858c5ad Browse Directory

Implementation of uTVM (#3227) · ef909df1

* uTVM interfaces (#14)

* some minor interface changes

* implemented HostLowLevelDevice

* added MicroDeviceAPI

* implemented micro_common and added Python interfaces

* current status, semi implemented micro session

* added micro_common implementation and python interfaces (#18)

* added micro_common implementation and python interfaces (#18)

* current status, semi implemented

* host test working

* updated interfaces for MicroSession arguments allocation

* make somewhat lint compatible

* fix based on comments

* added rounding macro

* fix minor bug

* improvements based on comments

* Clean up `binutil.py` and make Python-3-compatible

* Change argument allocation design

* Address feedback and lint errors

* Improve binutil tests

* Simplify allocator (per @tqchen's suggestions)

* Doc/style fixes

* farts

* mcgee

* rodata section werks

(and so does `test_runtime_micro_workspace.py`)

* simple graph runtime werk

* TEMP

* ResNet works, yo

* First round of cleanup

* More cleanup

* runs a dyson over the code

* Another pass

* Fix `make lint` issues

* ready to pr... probably

* final

* Undo change

* Fix rebase resolution

* Minor fixes

* Undo changes to C codegen tests

* Add `obj_path` in `create_micro_lib`

* TEMP

* Address feedback

* Add missing TODO

* Partially address feedback

* Fix headers

* Switch to enum class for `SectionKind`

* Add missing ASF header

* Fix lint

* Fix lint again

* Fix lint

* Kill lint warnings

* Address feedback

* Change Python interface to MicroTVM

All interaction with the device is now through `Session` objects, which
are used through Python's `with` blocks.

* Reorder LowLevelDevice interface

* Store shared ptr to session in all alloced objects

* Move helper functions out of `tvm.micro`

* Switch static char arr to vector

* Improve general infra and code quality

Does not yet address all of tqchen's feedback

* Forgot a rename

* Fix lint

* Add ASF header

* Fix lint

* Partially address MarisaKirisame's feedback

* Lint

* Expose `MicroSession` as a node to Python

* Revert to using `Session` constructor

* Fix compiler error

* (Maybe) fix CI error

* Debugging

* Remove

* Quell lint

* Switch to stack-based session contexts

* Make uTVM less intrusive to host codegen

And use SSA for operands of generated ternary operators

* Inline UTVMArgs into UTVMTask struct

* Remove `HostLowLevelDevice` header

* Remove `BaseAddr` class

* Address feedback

* Add "utvm" prefix to global vars in runtime

* Fix lint

* Fix CI

* Fix `test_binutil.py`

* Fix submodules

* Remove ResNet tests

* Make `test_binutil.py` work with nose

* Fix CI

* I swear this actually fixes the binutil tests

* lint

* lint

* Add fcompile-compatible cross-compile func

* Add docs for uTVM runtime files

* Move pointer patching into `MicroSession`

* Fix lint

* First attempt at unifying cross-compile APIs

* Fix lint

* Rename `cross_compile` back to `cc`

* Address feedback

* Remove commented code

* Lint

* Figure out failing function

* Remove debugging code

* Change "micro_dev" target to "micro"

* Add checks in tests for whether uTVM is enabled

* Add TODO for 32-bit support

* Rename more "micro_dev" to "micro"

* Undo rename

We already have `tvm.micro` as a namespace.  Can't have it as a method
as well.

* Fix failing CI

Thanks to @tqchen for finding this bug.  Emitting ternary operators for
`min` and `max` causes concurrency bugs in CUDA, so we're moving the
ternary op emissions from `CodeGenC` to `CodeGenCHost`.

* Address feedback

* Fix lint

committed 5 years ago

ef909df1 Browse Directory

23 Jul, 2019 1 commit

We observe multiple groups across a range of domains (ASR, NMT, LM, etc), (#3566) · d6dcd6c5

internally and externally, interested in replacing standard dense layers with
block-sparse matrix multiplication layers. The motivations are generally: higher
performance (due to reduction in FLOPs, memory bandwidth/cache footprint),
enabling larger models (e.g. fitting more layers in a given memory budget).

Some public work along these lines:

* https://openai.com/blog/block-sparse-gpu-kernels/
* https://openai.com/blog/sparse-transformer/
* https://arxiv.org/abs/1802.08435
* https://arxiv.org/abs/1711.02782

Various groups have been able to successfully train models with reasonable
levels of sparsity (90%+) with marginal accuracy changes, which suggests
substantial speedups are possible (as this implies a >10x reduction in FLOPs).

It is fairly straightforward to realize these theoretical speedups, see e.g. TVM
benchmarks for Intel CPUs in
https://gist.github.com/ajtulloch/e65f90487bceb8848128e8db582fe902, and CUDA
results in https://github.com/openai/blocksparse, etc.

* https://github.com/openai/blocksparse (CUDA)
* https://software.intel.com/en-us/mkl-developer-reference-c-mkl-bsrmm (MKL BSRM)
* https://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.sparse.bsr_matrix.html (SCIPY BSR representation)

This is extracted from an internal patch we've been using internally. There are
various extensions possible (int8/fp16/bf16, CUDA/other GPU architectures), but
this is a reasonable starting point. This needs more thorough unit test coverage
however.

We follow the conventions established by scipy.sparse.bsr_matrix and other
libraries, see the unit tests for details.

For folks interested in experimenting with scheduling/AutoTVM etc,
https://gist.github.com/ajtulloch/e65f90487bceb8848128e8db582fe902 is a useful
starting point.

committed 5 years ago

d6dcd6c5 Browse Directory

19 Jul, 2019 1 commit
- [TOPI][RELAY] Add op Size (#3094) · 313bc9de
  Yong Wu committed 5 years ago
  
  313bc9de Browse Directory
17 Jul, 2019 1 commit
- Fix build error (#3552) · 48c1ae45
```
* Fix build error

* comments
```
  Yinghai Lu committed 5 years ago
  48c1ae45 Browse Directory
16 Jul, 2019 1 commit

[Relay][VM] Port VM, VM compiler, and Object into python (#3391) · b6dc7826

* tmp

* Port vm and object to python

* clean up

* update vm build module

* update

* x

* tweak

* cleanup

* update

* fix rebase

* Rename to VMCompiler

* fix

committed 5 years ago

b6dc7826 Browse Directory

14 Jul, 2019 1 commit
- [ARITH][BOUND] Fix bound inference to avoid allocating too much (#3526) · 9fad94cc
```
* [TVM] Fix bound inference to avoid allocating too much

* [ARITH][BOUND] Pass analyzer to PropBoundToInputs
```
  Sergei Grechanik committed 5 years ago
  9fad94cc Browse Directory
13 Jul, 2019 1 commit
- [ARITH][IR] Introduce FloorDiv/Mod (#3479) · 75892d2b
```
* [ARITH][IR] Introduce FloorDiv/Mod

* Address review comments

* address review comments, fix div sub rule
```
  Tianqi Chen committed 5 years ago
  75892d2b Browse Directory
11 Jul, 2019 1 commit

[INFA][IR] Build and Evolve Low-level IR. Remove HalideIR dep. (#3533) · 0218557c

* [INFA][IR] Build and Evolve Low-level IR. Remove dep from HalideIR.


* Update include/tvm/node/ir_functor.h

Co-Authored-By: Jared Roesch <roeschinc@gmail.com>

* Update include/tvm/node/ir_functor.h

Co-Authored-By: Jared Roesch <roeschinc@gmail.com>

committed 5 years ago

0218557c Browse Directory

10 Jul, 2019 3 commits

[Relay][RFC] Implement type checking for Any (#3221) · 3fb84e2b

* Implement type checking for Any

Remove code generation related changes

Remove compile changes

Remove more

Remove unification hack

Add some code back that was needed, and clean up test

Refactor test cases

WIP

Implement TypeHint AST

Add test case which should fail

Remove unification changes, and fix bug with let rec

Restore unification for shapes

Improve error reporting while debugging

All examples type check

All examples type check

WIP

First version that works with hints, needs clean up

Remove dead code

Tweaks

Remove type hint

Remove unecessary type hint stuff

Remove more type hints

Clean up

Expose Any expression node

Address CR

Fix

Fix solver

Kill unecessary code

Fix

PyLint

Fix

Relocate loops

Fix license and test

Lint again

Lint again

Fix loops

Fix docstring

Fix template error

Fix compiler issue

Fix compile err

Remove more runtime changes

Restore buffer

Fix segfault

Fix

Fix arange

* Address feedback

* Fix typo

* Fix arange

* Fix op level3

* Fix issue with Python wrapper

committed 5 years ago

3fb84e2b Browse Directory

[CPP] Refactor remove tvm/tvm.h (#3523) · 025a6c80
Tianqi Chen committed 5 years ago

025a6c80 Browse Directory

[Relay][Testing] Relay-to-Python compilation (#3156) · db841c24

* First pass at Relay-to-Python converter testing utility

* Indicate astor as a dependency

* Add astor dep to host as well

* Typos and small bugs

* Handle ADTs and matching in Python conversion

* Remove any dependency on ast.parse

* Eliminate unnecessary type var field in Python version of ConstructorValue (already gone on C++ side)

* Update constructor value, fix syntax errors

* Don't forget keywords arg on Call nodes

* Fix some incorrect calls to ast nodes

* Fix more calls, a little more cleaning up

* Missing cases in attr conversion

* Lower op calls instead of running them through interpreter, as in @MarisaKirisame's AoT compiler

* We do still need the module

* Remove changes to op attrs: Will PR separately

* Smoke test and corrections

* More tests and fixes

* Ensure imports are properly global in generated Python code

* Add unit tests for refs

* Add unit test for tuple indexing

* Add unit test for if expression

* Remove astor dependency

* Remove astor from meta.yaml too

* Fix if test and add basic local function test

* Add global function test, refactor earlier tests

* Correct 'clause' field in ADT so Python and C++ field names match

* More fixes and tests for matching and constructors

* Dramatically simplify matching: no need for a thunk

* Improve ref writing test

* Ensure local recursion works

* cleanup

* Add test for global recursion

* Add test for higher-order calls

* Get ops working, add basic tests

* Remove accidentally duplicated test

* More docstrings to appease pylint

* Forgot to fix a test using constructor values

* Reduce optimization level in fusion and fix tuple input to operators

* Test op with tuple output, fix tuple output code

* Add unit test for batch norm

* Add a couple more tricky test cases

* Correct nat constructor to drop unnecessary field

* Fix the op attrs file (accidentally reduced it)

* Address review comments

* Adapt to new ConstructorValue representation (no more runtime dep on module)

* Use pass manager and updated interfaces. Extend module.from_expr to accommodate necessary demands

* Use sequential return value

* Lift out nested conditionals

* Replace triple single quotes with triple double quotes

* Use main variable instead of entry_func

committed 5 years ago

db841c24 Browse Directory

09 Jul, 2019 1 commit

[Relay][VM]Compiling pattern matching (#3470) · 93d1c06d

* [Relay][VM]Compiling pattern matching

* Fix lint

* Remove debug code

* Move TreeNode definition

* merge ifi and selecti, todo: remove them

* fix lint

* remove ifi and selecti

* rename GetTagi to GetTag

* fix dltype

* fix more dltype

* Generalize If and select, and rename to Ifi and Selecti

* Fix lint

* Rename Ifi to If

* Change register default to match value

* Remove bad specialization for Move

* Stop use Select

* Remove Select

* TreeNode refactor

* Change entry_func name

* Remove Cmp due to rebase issue

committed 5 years ago

93d1c06d Browse Directory