Commits · 19164063aaecfb55785e6d582593c7e9e5feb9dc · wenyuanbo / tic

27 Oct, 2019 1 commit
- [RUNTIME] Separate runtime related contrib into runtime/contrib (#4207) · dcc6af53
  Tianqi Chen committed 5 years ago
  
24 Oct, 2019 2 commits

[NODE][REFACTOR] Refactor reflection system in node. (#4189) · 78ca6fc8

* [NODE][REFACTOR] Refactor reflection system in node.

- Removed the old Node, Node is now just an alias of runtime::Object
- Introduce ReflectionVTable, a new columnar dispatcher to support reflection
  - This allows us to remove vtable from most node objects
  - The VisitAttrs are registered via TVM_REGISTER_NODE_TYPE,
    they are no longer virtual.
- Consolidated serialization and reflection features into node.

* Explicit type qualification when calling destructor.

* Fix SPIRV, more comments
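
The columnar dispatch idea in this message can be illustrated with a small sketch. It is not TVM's implementation: `ReflectionTable`, `register`, and `visit_attrs` are hypothetical names, and the sketch only assumes that each type gets an integer index and that each reflective operation is stored as one column indexed by it, so the objects themselves need no virtual method.

```python
from typing import Any, Callable, Dict, List, Optional


class ReflectionTable:
    """Toy columnar dispatcher: one list (column) per operation, indexed by type index."""

    def __init__(self) -> None:
        self._type_index: Dict[type, int] = {}
        # Column holding the attribute-visiting function of every registered type.
        self._visit_attrs: List[Optional[Callable[[Any, Callable], None]]] = []

    def register(self, cls: type, visit_attrs: Callable[[Any, Callable], None]) -> type:
        # Allocate a new index on first registration, reuse it afterwards.
        index = self._type_index.setdefault(cls, len(self._type_index))
        if index == len(self._visit_attrs):
            self._visit_attrs.append(None)
        self._visit_attrs[index] = visit_attrs
        return cls

    def visit_attrs(self, obj: Any, visitor: Callable[[str, Any], None]) -> None:
        index = self._type_index.get(type(obj))
        if index is None:
            raise TypeError(f"type {type(obj).__name__} is not registered")
        self._visit_attrs[index](obj, visitor)


class Add:
    """Plain object with no reflection method of its own."""

    def __init__(self, a: int, b: int) -> None:
        self.a = a
        self.b = b


def visit_add(node: Add, visit: Callable[[str, Any], None]) -> None:
    visit("a", node.a)
    visit("b", node.b)


table = ReflectionTable()
table.register(Add, visit_add)

seen: Dict[str, Any] = {}
table.visit_attrs(Add(1, 2), seen.__setitem__)
```

Keeping the visiting function in a table rather than on the class is what lets the object type stay free of a vtable entry in this toy model.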

committed 5 years ago

TensorCore Support using Intrinsic (#4136) · 324a9607

* add tensor core support

* avoid memory bank conflict

* fix thread sync & better performance

* better performance

* add schedule test for conv2d

* extend into BatchMatMul

* support config fragment shape and layout using intrinsic

* add TensorCore tutorial

* add int support and fix lint

* address comment

* add 32*16*8 TensorCore test

* fix wmma include logic

committed 5 years ago

23 Oct, 2019 1 commit

[rpc] use callback func to do send & recv (#4147) · 5408d3a3

* [rpc] use callback func to do send & recv. don't get fd from sock as it is deprecated in java

* fix java build

* fix min/max macro define in windows

* keep the old rpc setup for py

* add doc for CallbackChannel

committed 5 years ago

22 Oct, 2019 1 commit
- [relay][vm] Reuse allocated device memory (#4170) · 5a177070
  Zhi committed 5 years ago
  
21 Oct, 2019 1 commit

[REFACTOR][NODE][RUNTIME] Move Node to the new Object protocol. (#4161) · 7895adb2

* [REFACTOR][NODE][RUNTIME] Move Node to the new Object protocol.

This PR removes the original node system and makes Node a subclass of Object.
This is a major refactor towards a better unified runtime object system.

List of changes in the refactor:

- We now hide data_ field, use Downcast explicitly to get a sub-class object.
- Removed the node system FFI in python.
- Removed the node C API, instead use PackedFunc for list and get attrs.
- Change relay::Op::set_attr_type_key(attr_key_name) to relay::Op::set_attr_type<AttrType>().
  - This change was necessary because of the new Object registration mechanism.
  - Subsequent changes to the op registrations
  - The change revealed a few previous problems that are now fixed.
- Patched up a few missing node type registrations.
  - Now we will raise an error if we register object that is not registered.
- The original node.h and container.h are kept in the same location.
- Calling convention: kObjectHandle now equals the old kNodeHandle, kNodeHandle is removed.
- IRFunctor now dispatches on ObjectRef.
- Update to the new type checking API: is_type, derived_from are replaced by IsInstance.
- Removed .hash member function, instead use C++ convention hasher functors.

* Address review comments

committed 5 years ago

20 Oct, 2019 2 commits
- [Runtime] Enable option to use OpenMP thread pool (#4089) · 97ea31c8
  Haichen Shen committed 5 years ago
  
- [Refactor] Rename Datatype to ADT (#4156) · 32aad56c
```
We think it will reduce the confusion with the meaning.

https://discuss.tvm.ai/t/discuss-consider-rename-vm-datatype/4339
```
  Wei Chen committed 5 years ago
18 Oct, 2019 1 commit

[Relay][Frontend][TF] Add tensor array ops (#3798) · 36a96773

* [Relay][Frontend][TF] Add tensor array ops

* rename

* delete test

* Move utility function

* Refactor

* fix tensor array ops

* fix test

* fix rebase

* Fix serializer bug

* Improve tf convert name lookup to use prelude api

* Fix lint

* Fix test

committed 5 years ago

17 Oct, 2019 1 commit

[relay][vm] Separate VM runtime with executable (#4100) · 4052de6d

* [relay][vm] Separate VM runtime with executable

* Address comments

* move ctx back to vm

* make only vm related fields and methods protected

* integrate serialization/deserialization to executable

* create stream

committed 5 years ago

16 Oct, 2019 1 commit

[RUNTIME] Refactor object python FFI to new protocol. (#4128) · 02c1e117

* [RUNTIME] Refactor object python FFI to new protocol.

This is a pre-req to bring the Node system under object protocol.
Most of the code reflects the current code in the Node system.

- Use new instead of init so subclass can define their own constructors
- Allow register via name, besides type index
- Introduce necessary runtime C API functions
- Refactored Tensor and Datatype to directly use constructor.

* address review comments

committed 5 years ago

15 Oct, 2019 1 commit

[RFC][RUNTIME] Introduce new object protocol. (#4115) · a0bd3786

* [RUNTIME] Introduce new object protocol.

This PR introduces a new object protocol to unify the node and object.
We also updated the existing runtime::vm code to make use of the new system.

Update to the node will be done in a follow up PR.

Other changes:

- Remove object related code in json serializer as that code logic was not complete
  and we have a separate serializer for VM, can revisit later.

* address review comment

* Fix the child slot logic

committed 5 years ago

10 Oct, 2019 1 commit

[Relay][VM] Fix constant folding issue in VM compiler (#4077) · fc2713e5

* [Relay][VM] Fix constant folding issue in VM compiler

1. allow passing params when compiling a module
2. enhance profiler robustness

* remove dead code

* fix lint

* add get_params

* fix test

* don't pass params back

* remove get_params

* docs

* move compile function to api

* compile clashes with builtin name

* fix compilation error

* remove dead code

committed 5 years ago

08 Oct, 2019 1 commit
- [Fix][VM] Fix VM invoke with set_params (#4079) · b5bcdbb0
```
* Fix VM invoke with set_params

* add test

* tweak
```
  Haichen Shen committed 5 years ago
17 Sep, 2019 1 commit
- [Vulkan] Minor optimization for deferred token lookups. (#3960) · 1fe17d14
```
Use a hash map keyed on the descriptor set to avoid bad asymptotic behaviour.
```
  Andrew Tulloch committed 5 years ago
13 Sep, 2019 1 commit
- Vulkan2 Runtime API (#3849) · 2536465c
  Andrew Tulloch committed 5 years ago
  
12 Sep, 2019 1 commit

[RFC] [Contrib] Minimal runtime (~12kb .text on ARMv7/x86) for subset of TVM models (#3567) · 1de52bb0

This is an alternative implementation of a subset of the TVM runtime API (and
graph runtime) that focuses entirely on reducing code size, at the expense of
functionality (no tvm.extern(..) calls via PackedFunc, CPU only, etc). It might
be worth incrementally expanding the surface area if there's interest.

The motivation for this work was seeing what the minimal useful subset of the
TVM runtime is. This is relevant for heavily code-size-constrained
applications, e.g. embedded/mobile. The current runtime is more like O(100KiB)
or so, so this might be compelling for some users.

The smaller surface area for auditing might make this relevant for
https://github.com/dmlc/tvm/issues/3159, or the usecases I was thinking about in
https://github.com/dmlc/tvm/issues/2523#issuecomment-459165815 re: the Rust
runtime.

The symbols in the tvm::minimalruntime space (i.e. excluding std:: and
picojson::) are about 5KiB, so I think there's a bunch of room here (i.e. we
could replace picojson:: with [`jsmn`](https://zserge.com/jsmn.html) or
something, and we could replace more of the `std::unordered_map` usage, etc. with
custom primitives as well, similar to the `DynArray`).

committed 5 years ago

03 Sep, 2019 2 commits

Revert "[Runtime] Allow parameter sharing between modules (#3489)" (#3884) · 6b0359b4
```
This reverts commit 224cc243.
```
Tianqi Chen committed 5 years ago

[Runtime] Allow parameter sharing between modules (#3489) · 224cc243

As GraphRuntime does not provide control-flow logic, we have to split
our model into two parts. We need to share parameters between them
to save memory.

Solution:
1) add "lazy_init_input" in graph's attributes
   "attrs": {
     ... ...
     "lazy_init_input": [
       "list_str",
       [
         "p0"
       ]
     ]
    }
2) allow un-allocated NDArray entry in SetupStorage
3) utilize "set_input_zero_copy" function to set parameters
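
A rough sketch of how steps 1 and 2 could fit together, written in Python for brevity. This is not the GraphRuntime code: `lazy_inputs` and `setup_storage` are hypothetical helpers, and the 4-byte buffers merely stand in for allocated NDArray entries.

```python
import json
from typing import Dict, List, Optional, Set

# Graph attributes in the shape shown in step 1 (other attrs omitted).
GRAPH_JSON = '{"attrs": {"lazy_init_input": ["list_str", ["p0"]]}}'


def lazy_inputs(graph: dict) -> Set[str]:
    """Return the input names marked for lazy initialization, if any."""
    entry = graph.get("attrs", {}).get("lazy_init_input")
    if entry is None:
        return set()
    kind, names = entry
    if kind != "list_str":
        raise ValueError(f"unexpected attribute type: {kind}")
    return set(names)


def setup_storage(input_names: List[str], lazy: Set[str]) -> Dict[str, Optional[bytearray]]:
    """Allocate a buffer per input, leaving lazy entries unallocated (None)."""
    return {name: None if name in lazy else bytearray(4) for name in input_names}


lazy = lazy_inputs(json.loads(GRAPH_JSON))
storage = setup_storage(["data", "p0"], lazy)

# Step 3 in spirit: bind an existing buffer instead of copying into a new one.
shared_weights = bytearray(b"\x01\x02\x03\x04")
storage["p0"] = shared_weights
```

The entry for `p0` stays empty until an external buffer is bound to it, which is the memory saving the message is after.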

committed 5 years ago

02 Sep, 2019 1 commit
- [WIP][µTVM] Add OpenOCD Low-Level Device (RISC-V Support) (#3756) · 60de5be1
  Logan Weber committed 5 years ago
  
01 Sep, 2019 1 commit

[Relay][Any] Add shape func for dynamic shape (#3606) · eef35a57

* init shape func in interpreter and vm compiler

* Update interpreter

* fix

* lint

* lint

* fix

* remove hack

* update

* fix

* fix

* update

* address comments & update for shape_of

* fix lint

* update

* fix hybrid

* lint

* fix bug & add take shape func

* lint

* lint

* update

* fix flaky test

* add todo

committed 5 years ago

29 Aug, 2019 1 commit
- [runtime] reduce set_input and set_input_zero_copy overhead (#3805) · 137bf5f4
  hlu1 committed 5 years ago
  
21 Aug, 2019 1 commit

[Relay][VM]VM Profiler (#3727) · 95f12e31

* [Relay][VM]VM debugger

* Report mean/min/max for op duration

* Typos

* Lint

* Lint

* Lint

* Support build debug VM in CMake

* Lint

* Enable VM debug in unit test

* Disable debug vm test until new docker image is built

* Add device sync code

* Fix qnn unit test

* Disable vm debug by default

* Rename files

* Rename classes

* Fix comment

* Fix comment

committed 5 years ago

01 Aug, 2019 1 commit

[Relay][VM] Support execution on devices (#3678) · 5357f49b

* [Relay][VM] Support execution on devices

* Reduce Copy calls

* Cleanup

* Lint

* CR comments

* Merge test into test_vm.py

committed 5 years ago

31 Jul, 2019 1 commit
- [Relay][VM] Relay VM serialization (#3647) · 90455121
```
* relay vm serialization

* fix lint

* load params, fix stream

* lint

* fix typo
```
  Zhi committed 5 years ago
30 Jul, 2019 2 commits
- ROCm: Add SaveToFile and LoadFile (#3665) · d4a51751
```
...and add rocm module_save to the tests.
```
  Thomas Viehmann committed 5 years ago
- Print llvm source by default in ROCMModuleNode::GetSource (#3662) · 52b63b9f
  Thomas Viehmann committed 5 years ago
  
25 Jul, 2019 2 commits

Implementation of uTVM (#3227) · ef909df1

* uTVM interfaces (#14)

* some minor interface changes

* implemented HostLowLevelDevice

* added MicroDeviceAPI

* implemented micro_common and added Python interfaces

* current status, semi implemented micro session

* added micro_common implementation and python interfaces (#18)

* added micro_common implementation and python interfaces (#18)

* current status, semi implemented

* host test working

* updated interfaces for MicroSession arguments allocation

* make somewhat lint compatible

* fix based on comments

* added rounding macro

* fix minor bug

* improvements based on comments

* Clean up `binutil.py` and make Python-3-compatible

* Change argument allocation design

* Address feedback and lint errors

* Improve binutil tests

* Simplify allocator (per @tqchen's suggestions)

* Doc/style fixes

* farts

* mcgee

* rodata section werks

(and so does `test_runtime_micro_workspace.py`)

* simple graph runtime werk

* TEMP

* ResNet works, yo

* First round of cleanup

* More cleanup

* runs a dyson over the code

* Another pass

* Fix `make lint` issues

* ready to pr... probably

* final

* Undo change

* Fix rebase resolution

* Minor fixes

* Undo changes to C codegen tests

* Add `obj_path` in `create_micro_lib`

* TEMP

* Address feedback

* Add missing TODO

* Partially address feedback

* Fix headers

* Switch to enum class for `SectionKind`

* Add missing ASF header

* Fix lint

* Fix lint again

* Fix lint

* Kill lint warnings

* Address feedback

* Change Python interface to MicroTVM

All interaction with the device is now through `Session` objects, which
are used through Python's `with` blocks.

* Reorder LowLevelDevice interface

* Store shared ptr to session in all alloced objects

* Move helper functions out of `tvm.micro`

* Switch static char arr to vector

* Improve general infra and code quality

Does not yet address all of tqchen's feedback

* Forgot a rename

* Fix lint

* Add ASF header

* Fix lint

* Partially address MarisaKirisame's feedback

* Lint

* Expose `MicroSession` as a node to Python

* Revert to using `Session` constructor

* Fix compiler error

* (Maybe) fix CI error

* Debugging

* Remove

* Quell lint

* Switch to stack-based session contexts

* Make uTVM less intrusive to host codegen

And use SSA for operands of generated ternary operators

* Inline UTVMArgs into UTVMTask struct

* Remove `HostLowLevelDevice` header

* Remove `BaseAddr` class

* Address feedback

* Add "utvm" prefix to global vars in runtime

* Fix lint

* Fix CI

* Fix `test_binutil.py`

* Fix submodules

* Remove ResNet tests

* Make `test_binutil.py` work with nose

* Fix CI

* I swear this actually fixes the binutil tests

* lint

* lint

* Add fcompile-compatible cross-compile func

* Add docs for uTVM runtime files

* Move pointer patching into `MicroSession`

* Fix lint

* First attempt at unifying cross-compile APIs

* Fix lint

* Rename `cross_compile` back to `cc`

* Address feedback

* Remove commented code

* Lint

* Figure out failing function

* Remove debugging code

* Change "micro_dev" target to "micro"

* Add checks in tests for whether uTVM is enabled

* Add TODO for 32-bit support

* Rename more "micro_dev" to "micro"

* Undo rename

We already have `tvm.micro` as a namespace.  Can't have it as a method
as well.

* Fix failing CI

Thanks to @tqchen for finding this bug.  Emitting ternary operators for
`min` and `max` causes concurrency bugs in CUDA, so we're moving the
ternary op emissions from `CodeGenC` to `CodeGenCHost`.

* Address feedback

* Fix lint

committed 5 years ago

Add a missing header in cuda_device_api.cc (#3621) · 443d023b
Philip Hyunsu Cho committed 5 years ago

23 Jul, 2019 1 commit

[Runtime] [ThreadPool] Make SpscTaskQueue::Pop(..) spin_count configurable (#3577) · 9b1c2e08

In cases where we have multiple models or threadpools active, spinning around
`sched_yield()` may not be desirable, as it prevents the OS from effectively
scheduling other threads.

Thus, allow users to conditionally disable this behaviour (via an environment
variable `TVM_THREAD_POOL_SPIN_COUNT`, similar to existing environment flags for
the thread pool such as `TVM_BIND_THREADS`, etc).

This substantially improves tail latencies in some of our multi-tenant
workloads in practice.

Unit tests have been added - on my laptop, running:

```
TVM_THREAD_POOL_SPIN_COUNT=0 ./build/threading_backend_test;
TVM_THREAD_POOL_SPIN_COUNT=1 ./build/threading_backend_test;
./build/threading_backend_test;
```

gives https://gist.github.com/ajtulloch/1805ca6cbaa27f5d442d23f9d0021ce6 (i.e.
97ms -> <1ms after this change)
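
The spin-then-block behaviour can be sketched in Python as follows. This is only an illustration, not the `SpscTaskQueue` code: `pop_with_spin` and `spin_count_from_env` are hypothetical helpers, the fallback of 1000 is an arbitrary illustrative value rather than TVM's default, and `time.sleep(0)` stands in for `sched_yield()`.

```python
import os
import queue
import time


def spin_count_from_env(default: int = 1000) -> int:
    """Read the spin count from TVM_THREAD_POOL_SPIN_COUNT, else use the fallback."""
    return int(os.environ.get("TVM_THREAD_POOL_SPIN_COUNT", default))


def pop_with_spin(tasks: "queue.Queue", spin_count: int):
    """Poll the queue up to spin_count times, then fall back to a blocking wait."""
    for _ in range(spin_count):
        try:
            return tasks.get_nowait()
        except queue.Empty:
            time.sleep(0)  # give up the time slice, then poll again
    # With spin_count == 0 the consumer blocks immediately, letting the OS
    # schedule other threads instead of burning cycles on polling.
    return tasks.get()


tasks: "queue.Queue" = queue.Queue()
tasks.put("task-0")
first = pop_with_spin(tasks, spin_count=0)
```

Setting the count to 0 trades a little wake-up latency for not competing with other thread pools, which matches the multi-tenant case described above.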

committed 5 years ago

16 Jul, 2019 1 commit

[Relay][VM] Port VM, VM compiler, and Object into python (#3391) · b6dc7826

* tmp

* Port vm and object to python

* clean up

* update vm build module

* update

* x

* tweak

* cleanup

* update

* fix rebase

* Rename to VMCompiler

* fix

committed 5 years ago

15 Jul, 2019 1 commit

[Runtime] Enable set_input_zero_copy in GraphRuntime (#3416) · afd4b3e4

* Enable set_input_zero_copy in GraphRuntime

* Fix LoadParams

* Fix

* lint

* Fix remote context issue

* Fix

* Remove LOG

* Remove unused variables

* Add tests

* works

* More test scenarios

* make it simpler

* Remove unnecessary changes

* Address comments

* More comments

* Address comments

* Fix build

committed 5 years ago

11 Jul, 2019 1 commit
- posix_memalign appears in API 17, not 16 (#3532) · 2d53f84d
  hlu1 committed 5 years ago
  
10 Jul, 2019 1 commit

[Relay][RFC] Implement type checking for Any (#3221) · 3fb84e2b

* Implement type checking for Any

Remove code generation related changes

Remove compile changes

Remove more

Remove unification hack

Add some code back that was needed, and clean up test

Refactor test cases

WIP

Implement TypeHint AST

Add test case which should fail

Remove unification changes, and fix bug with let rec

Restore unification for shapes

Improve error reporting while debugging

All examples type check

All examples type check

WIP

First version that works with hints, needs clean up

Remove dead code

Tweaks

Remove type hint

Remove unnecessary type hint stuff

Remove more type hints

Clean up

Expose Any expression node

Address CR

Fix

Fix solver

Kill unnecessary code

Fix

PyLint

Fix

Relocate loops

Fix license and test

Lint again

Lint again

Fix loops

Fix docstring

Fix template error

Fix compiler issue

Fix compile err

Remove more runtime changes

Restore buffer

Fix segfault

Fix

Fix arange

* Address feedback

* Fix typo

* Fix arange

* Fix op level3

* Fix issue with Python wrapper

committed 5 years ago

09 Jul, 2019 2 commits

[Relay][VM]Compiling pattern matching (#3470) · 93d1c06d

* [Relay][VM]Compiling pattern matching

* Fix lint

* Remove debug code

* Move TreeNode definition

* merge ifi and selecti, todo: remove them

* fix lint

* remove ifi and selecti

* rename GetTagi to GetTag

* fix dltype

* fix more dltype

* Generalize If and select, and rename to Ifi and Selecti

* Fix lint

* Rename Ifi to If

* Change register default to match value

* Remove bad specialization for Move

* Stop use Select

* Remove Select

* TreeNode refactor

* Change entry_func name

* Remove Cmp due to rebase issue

committed 5 years ago

[Vulkan] Added conversion from bool to float. (#3513) · a5acca92
```
* Added bool to float conversion support to spirv ir builder.

* Added unittest for vulkan bool conversion.

* Typo fix.
```
Josh Fromm committed 5 years ago

27 Jun, 2019 1 commit
- Fix Windows build (#3429) · 50f4c1d0
  Li committed 5 years ago
  
25 Jun, 2019 1 commit

[Runtime] Allow for parameter sharing in GraphRuntime (#3384) · 32be34a0

Summary:

In multi-threaded applications where we have multiple inferences on the
same model in parallel (consider e.g. a TTS system handling multiple
requests), it can be useful to share the parameters of a model amongst
these multiple instances. This improves the cache utilization behaviour
of the system, as multiple cores can use the same set of weights instead
of evicting the identical copies of weights in a shared cache.

As the underlying `NDArray` instances in `data_entry_` implement a
ref-count-based sharing system, this is a simple modification of the
`GraphRuntime::LoadParams` logic to instead copy parameters from an
existing GraphRuntime instance. This is a little ugly in that we need
both the pre-existing GraphRuntime instance, as well as the 'serialized'
params (since we need to know the set of names we should copy), but
without imposing additional assumptions (i.e. storing the set of param
names in GraphRuntime, and enforcing that shared param names are
identical to the parameters set in the preceding `LoadParams` call),
this seems unavoidable.
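
The sharing scheme can be modelled in a few lines of Python. This is a toy model, not the `GraphRuntime` API: `ToyRuntime`, `load_params`, and `share_params` are hypothetical, and Python object references stand in for ref-counted `NDArray` handles.

```python
from typing import Dict, Iterable


class ToyRuntime:
    """Toy stand-in for a runtime instance holding named parameter buffers."""

    def __init__(self) -> None:
        self.data_entry: Dict[str, bytearray] = {}

    def load_params(self, params: Dict[str, bytes]) -> None:
        # Regular loading: every instance gets its own copy of each parameter.
        for name, blob in params.items():
            self.data_entry[name] = bytearray(blob)

    def share_params(self, other: "ToyRuntime", names: Iterable[str]) -> None:
        # Shared loading: reuse the other instance's buffers by reference.
        # The names come from the serialized params, as the message notes.
        for name in names:
            self.data_entry[name] = other.data_entry[name]


serialized = {"weight": b"\x00\x01\x02\x03", "bias": b"\x04\x05"}

primary = ToyRuntime()
primary.load_params(serialized)

replica = ToyRuntime()
replica.share_params(primary, serialized.keys())

copied = ToyRuntime()
copied.load_params(serialized)
```

Here `replica` holds the very same buffers as `primary`, while `copied` holds equal but separate ones, which is the difference in cache footprint the summary describes.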

Test Plan:

Unit test added.

committed 5 years ago

14 Jun, 2019 1 commit

[Relay][VM] Add AllocTensor instruction and better instruction printer (#3306) · b8fa8f62

* Update vm print & add AllocTensor instruction

* patch

* fix invoke packed

* update cmake

* tweak move

* update invoke_closure

* lint

* add doc

* tweak

committed 5 years ago

23 May, 2019 1 commit
- [GraphRuntime] Debug graph runtime (#3232) · e1e91f1f
  hlu1 committed 5 years ago
  