Commits · 92c78266a614c59c310512d3dab4c23bd155d52c · wenyuanbo / tic

06 Apr, 2020 1 commit
- [Runtime][Contrib] Support cudnn softmax (#5214) · 799ff356
  Haichen Shen committed 4 years ago
  
30 Mar, 2020 1 commit
- rocm: fix miopen convolutions (#5179) · 84121966
```
* fix miopen convolutions

* fix overly long lines
```
  Thomas Viehmann committed 4 years ago
20 Mar, 2020 1 commit
- Add colors to compute_at edges and thread/block indices. (#5111) · b91dbca6
  yongfeng-nv committed 4 years ago
  
27 Feb, 2020 1 commit

[REFACTOR][PY][API-CHANGE] Remove legacy python files. (#4943) · 9816efc2

* [REFACTOR][PY][API-CHANGE] Remove legacy python files.

Remove legacy python files.
Use the te namespace for most of the tensor expression primitives.

- tvm.create_schedule -> tvm.te.create_schedule
- tvm.placeholder -> tvm.te.placeholder
- tvm.compute -> tvm.te.compute

* Remove top-level exposures.

committed 4 years ago

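As an illustration of the renames listed in #4943 above, here is a minimal sketch of a source rewriter. It assumes the three renames spelled out in the commit message are the only ones a given script needs; the commit also removes other top-level exposures that are not covered here. `TE_RENAMES` and `migrate_to_te` are hypothetical names used for this sketch and are not part of TVM.

```python
import re

# Old -> new names, copied from the commit message above.
TE_RENAMES = {
    "tvm.create_schedule": "tvm.te.create_schedule",
    "tvm.placeholder": "tvm.te.placeholder",
    "tvm.compute": "tvm.te.compute",
}

# Match an old name only when it is a complete dotted name:
# not preceded by an identifier character or a dot, and
# not followed by further identifier characters.
_PATTERN = re.compile(
    r"(?<![\w.])(" + "|".join(re.escape(old) for old in TE_RENAMES) + r")\b"
)


def migrate_to_te(source: str) -> str:
    """Rewrite legacy top-level calls to the tvm.te namespace (hypothetical helper)."""
    return _PATTERN.sub(lambda m: TE_RENAMES[m.group(1)], source)
```

Because the lookbehind rejects a preceding dot, already-migrated `tvm.te.*` calls and unrelated attributes such as `x.tvm.compute` are left alone, so running the helper twice should be harmless.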

26 Feb, 2020 1 commit

Tensor Expression Debug Display (TEDD) (#4651) · b0b1e7da

* Initial TEDD for publishing.

* 1. Fix lint issues. 2. Print intrin.body instead of intrin.name in Schedule Tree.  3. Add examples to top level APIs' comments.  4. Top level APIs don't print Dot string by default, unless outputdotstring is True.

* Fix more lint issues.

* Update top level API argument names and use raw strings to avoid Python lint warnings in the tests.

* Disable TEDD verification, but keep TE construction.

* Stop importing tedd to avoid failure.

* Separate data extraction and visualization. 1. Add API tedd.dump_json(schedule) to dump a json string for the schedule data for visualization.  2. Update tests.  3. Add a tutorial.  4. Add range information to IterVars.

* Update TEDD about InferBound failure.  1. TEDD doesn't call inferbound for DFG. 2. Update tutorial about the InferBound failure.

* 1. Import IPython only if SVG is requested.  This is required to fix a tutorial publishing failure.  2. Fix test about IPython availability check.

committed 4 years ago


12 Feb, 2020 1 commit

[REFACTOR][PY][API-CHANGE] establish tvm.ir, migrate corresponding files (#4862) · a5661611

* [REFACTOR][PY][API-CHANGE] establish tvm.ir, migrate corresponding relay files.

This PR establishes tvm.ir and migrates the corresponding relay
files into the new folder.

API Change:
- relay.Module -> tvm.IRModule

* Update with ADT

* Migrate transform

* address comments

* Migrate module

* Migrate json_compact

* Migrate attrs

* Move LoweredFunc to stmt temporarily

* temp migrate container

* Finish migrate container

committed 5 years ago


08 Feb, 2020 1 commit
- [TEST] test_cuddn flaky (#4846) · b46c2548
  Tianqi Chen committed 5 years ago
  
07 Feb, 2020 1 commit

[REFACTOR][PY][API-Change] Polish tvm.runtime, tvm.runtime.module API update (#4837) · e0122c0e

* [REFACTOR][PY-API] Polish tvm.runtime, tvm.runtime.module API update

This PR updates the tvm.runtime to use the new FFI style.

- Remove top-level tvm.module to avoid confusion between runtime.Module and IRModule
- API changes with respect to runtime.Module
  - tvm.module.load -> tvm.runtime.load_module
  - tvm.module.enabled -> tvm.runtime.enabled
  - tvm.module.system_lib -> tvm.runtime.system_lib
- Remove dep on api_internal from runtime.

* Update module.load in the latest API

committed 5 years ago


05 Feb, 2020 1 commit

[REFACTOR][PY] Establish tvm.runtime (#4818) · fc7dd6d7

* [REFACTOR][PY] Establish tvm.runtime

This PR establishes the tvm.runtime namespace that contains the core runtime data structures.
The top-level APIs are kept intact for now via re-exporting.

We will follow up later to clean up some of the top-level APIs.

* Fix ndarray name

committed 5 years ago


16 Jan, 2020 2 commits

[Runtime] EdgeTPU runtime for Coral Boards (#4698) · 31021d2b
Thierry Moreau committed 5 years ago


[Arith] add SizeVar representing non-neg valued variable in a tensor shape (#4684) · 3a672e3e

* [arith] add ShapeVar representing non-neg valued variable in a tensor shape

* bounder remover; deal with div in int_set differently

* fix bounder_remover

* migrate unittest to use shape_var

* use tvm.shape_var in integration & relay tests

* add test case; fix Var register

* fix lint

* fix lint again

* add default ShapeVar visitor in Relay

* fix override

* fix ShapeVar visit bug

* revert IntervalSet for shape_var

* remove bound_remover

* remove is_var; use constructor for shapevar/var instead

* ShapeVar -> SizeVar; add constructor comments

* shape_var -> size_var in doc

* tindex -> size

committed 5 years ago

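To show why a variable known to be non-negative (the idea behind SizeVar in #4684 above) helps an arithmetic analyzer, here is a toy interval sketch. It is not TVM's analyzer: `Interval`, `var_bound`, `size_var_bound`, `const`, and `can_prove_positive` are hypothetical names, and only addition is modeled.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Closed interval [lo, hi] of possible values (toy model)."""
    lo: float
    hi: float

    def __add__(self, other: "Interval") -> "Interval":
        return Interval(self.lo + other.lo, self.hi + other.hi)


def var_bound() -> Interval:
    # A plain variable: nothing is known about its sign.
    return Interval(-math.inf, math.inf)


def size_var_bound() -> Interval:
    # A size-like variable: assumed non-negative, as in a tensor shape.
    return Interval(0, math.inf)


def const(c: int) -> Interval:
    return Interval(c, c)


def can_prove_positive(iv: Interval) -> bool:
    # Provable only if even the smallest possible value is > 0.
    return iv.lo > 0
```

With these definitions, `n + 1 > 0` is provable when `n` uses `size_var_bound()` but not when it uses `var_bound()`, which is the kind of fact a shape-aware simplifier can exploit.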

06 Jan, 2020 1 commit

[CONV] Asymmetric padding (#4511) · 34b98eb7

* [CONV] Asymmetric padding

* fix lint error

* update for legalize, rocm and cudnn

* add more test cases

* change more symmetric padding

* change conv2d winograd tests according to original cases

* remove 'alter_op_layout.h' header in bitserial.cc

committed 5 years ago


29 Dec, 2019 1 commit
- [Perf] Add CublasLt extern support for better Igemm performance (#4550) · fadea922
```
* cublaslt added

* fix lint

* address comments

* address more comments

* Trigger CI

* Trigger CI
```
  Leyuan Wang committed 5 years ago
04 Dec, 2019 1 commit
- [CONTRIB] TFLite Runtime (#4439) · 24713bde
  ziheng committed 5 years ago
  
03 Dec, 2019 1 commit

[RUNTIME] Add cudnn conv3d (#4418) · 77bdd5f7

* [RUNTIME] Add cudnn conv3d

* add output checking to test_cudnn.verify()

* fix tests failure

* revise per review comments

* unify conv_output_shape, conv_find_algo and conv_forward

* convert python list to tvm.array in conv_forward

* revise per comments

* 'pass as reference' for vector args

* add back conv2d/3d separated implementation

* remove unused included header

* remove extra std::vectors

* remove unused header

committed 5 years ago


02 Dec, 2019 1 commit
- [µTVM] Enable AutoTVM for ARM STM32F746XX Boards (#4274) · 47c870a9
  Logan Weber committed 5 years ago
  
25 Nov, 2019 1 commit

[Perf] Enhance cudnn and cublas backend and enable TensorCore (#4353) · dabde40f

* add half and mix precision support to cublas backend

* add TensorCore support in CuDNN

* enhance CuDNN support

* address comments and fix lint

* fix

* add fp16 test

committed 5 years ago


17 Oct, 2019 1 commit
- [TOPI][x86] Cascade lake support. (#4123) · 972f019c
```
* [TOPI][x86] Cascade lake support.

* Jenkins test debug 1.

* Testing cascade lake alone.
```
  Animesh Jain committed 5 years ago
14 Oct, 2019 1 commit
- [CI] Update ci-cpu to latest (#4121) · 7530e043
  Tianqi Chen committed 5 years ago
  
25 Sep, 2019 3 commits

Revert "Added tesnorizeation for avx2 based gemm. (#3982)" (#4007) · 4a3abb94
```
This reverts commit 23727eb4.
```
Tianqi Chen committed 5 years ago

Added tesnorizeation for avx2 based gemm. (#3982) · 23727eb4

* Added tensorization for avx2 based gemm.

Summary:
Tensorized the same region as avx512. It produces 16x1 int32 results.
This is done by using two sets of AVX2 instructions to do the reduction on an 8x4 int8
kernel with 1x4 data.

Test Plan:
on avx2 machine:
python tests/python/contrib/test_gemm_avx2_acc32.py

* Fix lint errors. Removed commented out code.

committed 5 years ago


Changes to make tensorize work. These changes also fix the previously broken test. (#3981) · b410df8c

* Changes to make tensorize work. These changes also fix the previously
broken test.

Summary:
Tensorize was breaking for a few reasons.
1)
Assert at: src/op/tensorize.cc:234 CHECK(is_one(e.region[j]->extent))
In some cases this cannot be proven, e.g.:
expected shape=[16, 4], given region=[range(min=((ax1.outer*16)/16), ext=(((((ax1.outer*16) + 15)/16) + 1) - ax1.outer)), range(min=((k.outer*4)/4), ext=(((((k.outer*4) + 3)/4) + 1) - k.outer)), range(min=0, ext=16), range(min=0, ext=4)]
The unprovable one is: ext=(((((ax1.outer*16) + 15)/16) + 1) - ax1.outer)).
This can be simplified, but it is not, because to simplify the division the analyzer must
prove ax1.outer > 0, and since it is a var it cannot. The fix for this is to
find all the vars in the expr and replace them with some const value.

2) Equivalence between the tensorized expr and the one being asked to tensorize. For example,
the error would be:
TVMError: Check failed: Equal(lhs, rhs):
Failed to match the compute with TensorIntrin tensor_intrin's declaration
provided= reduce(combiner=comm_reducer(result=[(x + y)], lhs=[x], rhs=[y], identity_element=[(int16)0]), source=[(int16(data(k))*int16(kernel(((((((((k.outer.outer*64) + (k.outer.inner*2)) + k)/2)*128) + i) - (k.outer.inner*128)) - (k.outer.outer*4096)), ((((k.outer.outer*64) + (k.outer.inner*2)) + k) % 2))))], axis=[iter_var(k, range(min=0, ext=2))], where=(bool)1, value_index=0),
intrin=  reduce(combiner=comm_reducer(result=[(x + y)], lhs=[x], rhs=[y], identity_element=[(int16)0]), source=[(int16(data(k))*int16(kernel(i, k)))], axis=[iter_var(k, range(min=0, ext=2))], where=(bool)1, value_index=0)
Difference is mainly in the source part:
source=[(int16(data(k))*int16(kernel(((((((((k.outer.outer*64) + (k.outer.inner*2)) + k)/2)*128) + i) - (k.outer.inner*128)) - (k.outer.outer*4096)), ((((k.outer.outer*64) + (k.outer.inner*2)) + k) % 2))))]
source=[(int16(data(k))*int16(kernel(i, k)))], axis=[iter_var(k, range(min=0, ext=2))]
This was not being simplified due to compute_intrin_iter_space (map from
iter var to range) not containing leaf iter vars.

3) Here it fails with:
Check failed: is_one(Simplify(value->shape[i])): Argument b_buffer shape mismatch[16, 4] vs [(((((ax1.outer*16) + 15)/16) + 1) - ax1.outer), (((((k.outer*4) + 3)/4) + 1) - k.outer), 16, 4]
This is in buffer binding, where it thinks the expected shape and the bound buffer
shape are different. If we could simplify the expr, this would not
be the case.
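The extent in point 1) above can be checked numerically. The sketch below assumes `/` in the printed expression is C-style truncating division, which is an assumption about the IR semantics at the time rather than something stated in the message; `trunc_div` and `extent` are hypothetical helper names.

```python
def trunc_div(a: int, b: int) -> int:
    """Integer division that rounds toward zero (C-style)."""
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b >= 0) else -q


def extent(ax1_outer: int) -> int:
    """ext = ((ax1.outer*16 + 15)/16 + 1) - ax1.outer from the message above."""
    return (trunc_div(ax1_outer * 16 + 15, 16) + 1) - ax1_outer


# For every non-negative ax1.outer the extent collapses to 1,
# which is what CHECK(is_one(...)) needs.
assert all(extent(v) == 1 for v in range(0, 1000))

# For a negative value, truncation toward zero breaks the identity,
# so the simplifier cannot conclude ext == 1 without a sign bound.
assert extent(-1) == 2
```

Under this assumption the expression is only equal to 1 once the sign of `ax1.outer` is known, which matches the message's explanation of why the check could not be proven for an unconstrained var.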

Test Plan:
On skylake avx512 machine:
python tests/python/contrib/test_gemm_acc16.py

* Implemented bounded analyzer which traverses the tree and, for reduce/for
statements, binds the bounds in the analyzer. Later this is used to
simplify expressions. Inspired by ir_mutator_with_analyzer

* Addressed comments.

* Added ASF header + define macro for the header file: TVM_ARITHMETIC_IR_VISITOR_WITH_ANALYZER_H_
Some lint fixes as well.

* Relax the assumption that dom_map must always contain all leaf itervars.

* Disable copy constructor and move to raw ptr.

committed 5 years ago


15 Sep, 2019 1 commit
- Enable miopen transpose convolution and fp16 support (#3952) · 9e4f07b4
```
* Enable miopen transpose convolution and fp16 support

* linter
```
  Peter Yeh committed 5 years ago
13 Sep, 2019 1 commit
- Add AVX512VNNI support for TVM (#3388) · bb82e09f
  Jianyu Huang committed 5 years ago
  
12 Sep, 2019 1 commit
- [TOPI][CUDA] Support cuBLAS BatchMatMul (#3936) · 88f9bfd4
```
* Support cuBLAS BatchMatMul

* Add test and check target name
```
  Jon Soifer committed 5 years ago
08 Sep, 2019 1 commit
- change docker install script (#3524) · 184fa484
  雾雨魔理沙 committed 5 years ago
  
12 Aug, 2019 1 commit
- Revert compile_cmd kwarg name change (#3746) · d6b7b62d
```
* Revert compile_cmd kwarg name change

* Fix binutil tests
```
  Logan Weber committed 5 years ago
25 Jul, 2019 1 commit

Implementation of uTVM (#3227) · ef909df1

* uTVM interfaces (#14)

* some minor interface changes

* implemented HostLowLevelDevice

* added MicroDeviceAPI

* implemented micro_common and added Python interfaces

* current status, semi implemented micro session

* added micro_common implementation and python interfaces (#18)

* added micro_common implementation and python interfaces (#18)

* current status, semi implemented

* host test working

* updated interfaces for MicroSession arguments allocation

* make somewhat lint compatible

* fix based on comments

* added rounding macro

* fix minor bug

* improvements based on comments

* Clean up `binutil.py` and make Python-3-compatible

* Change argument allocation design

* Address feedback and lint errors

* Improve binutil tests

* Simplify allocator (per @tqchen's suggestions)

* Doc/style fixes

* farts

* mcgee

* rodata section werks

(and so does `test_runtime_micro_workspace.py`)

* simple graph runtime werk

* TEMP

* ResNet works, yo

* First round of cleanup

* More cleanup

* runs a dyson over the code

* Another pass

* Fix `make lint` issues

* ready to pr... probably

* final

* Undo change

* Fix rebase resolution

* Minor fixes

* Undo changes to C codegen tests

* Add `obj_path` in `create_micro_lib`

* TEMP

* Address feedback

* Add missing TODO

* Partially address feedback

* Fix headers

* Switch to enum class for `SectionKind`

* Add missing ASF header

* Fix lint

* Fix lint again

* Fix lint

* Kill lint warnings

* Address feedback

* Change Python interface to MicroTVM

All interaction with the device is now through `Session` objects, which
are used through Python's `with` blocks.

* Reorder LowLevelDevice interface

* Store shared ptr to session in all alloced objects

* Move helper functions out of `tvm.micro`

* Switch static char arr to vector

* Improve general infra and code quality

Does not yet address all of tqchen's feedback

* Forgot a rename

* Fix lint

* Add ASF header

* Fix lint

* Partially address MarisaKirisame's feedback

* Lint

* Expose `MicroSession` as a node to Python

* Revert to using `Session` constructor

* Fix compiler error

* (Maybe) fix CI error

* Debugging

* Remove

* Quell lint

* Switch to stack-based session contexts

* Make uTVM less intrusive to host codegen

And use SSA for operands of generated ternary operators

* Inline UTVMArgs into UTVMTask struct

* Remove `HostLowLevelDevice` header

* Remove `BaseAddr` class

* Address feedback

* Add "utvm" prefix to global vars in runtime

* Fix lint

* Fix CI

* Fix `test_binutil.py`

* Fix submodules

* Remove ResNet tests

* Make `test_binutil.py` work with nose

* Fix CI

* I swear this actually fixes the binutil tests

* lint

* lint

* Add fcompile-compatible cross-compile func

* Add docs for uTVM runtime files

* Move pointer patching into `MicroSession`

* Fix lint

* First attempt at unifying cross-compile APIs

* Fix lint

* Rename `cross_compile` back to `cc`

* Address feedback

* Remove commented code

* Lint

* Figure out failing function

* Remove debugging code

* Change "micro_dev" target to "micro"

* Add checks in tests for whether uTVM is enabled

* Add TODO for 32-bit support

* Rename more "micro_dev" to "micro"

* Undo rename

We already have `tvm.micro` as a namespace.  Can't have it as a method
as well.

* Fix failing CI

Thanks to @tqchen for finding this bug.  Emitting ternary operators for
`min` and `max` causes concurrency bugs in CUDA, so we're moving the
ternary op emissions from `CodeGenC` to `CodeGenCHost`.

* Address feedback

* Fix lint

committed 5 years ago

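The uTVM commit above mentions `Session` objects used through Python's `with` blocks and a later switch to stack-based session contexts. The following is a generic sketch of that pattern only; it does not reproduce the `tvm.micro` API, and the `Session` class, its `name` argument, and `current()` are hypothetical.

```python
class Session:
    """Toy session that tracks the active context on a stack."""

    _stack = []  # innermost active session is the last element

    def __init__(self, name: str):
        self.name = name

    def __enter__(self) -> "Session":
        # Entering a `with` block makes this session the current one.
        Session._stack.append(self)
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        # Leaving the block restores whichever session was active before.
        Session._stack.pop()
        return False  # do not swallow exceptions

    @classmethod
    def current(cls) -> "Session":
        if not cls._stack:
            raise RuntimeError("no active session")
        return cls._stack[-1]
```

A stack lets `with` blocks nest: code inside an inner block sees the inner session, and the outer session becomes current again once the inner block exits.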

21 May, 2019 1 commit
- [Contrib] cblas batch_matmul (#3210) · 3d1d17e3
  hlu1 committed 5 years ago
  
16 May, 2019 1 commit
- Add the acc16 intrinsic support (#3081) · c7794564
  llyfacebook committed 5 years ago
  
29 Apr, 2019 1 commit

[Relay][TOPI] Gluncv SSD support on the GPU (#2784) · a706ad16

* ssd gluoncv gpu op updated

* ssd gluoncv gpu op updated

* tutorials and tests modified

* tutorials and tests modified

* fix lint

* fix lint

* address comment

* multibox bug fixed

* space line added

* use less threads per block

* use less threads per block

* less threads per block for get valid count

* less threads per block for get valid count

* merge with master

* Revert "less threads per block for get valid count"

This reverts commit 08896cfccc34b0b2a1646d01d01ea4cad73941c4.

* Revert "less threads per block for get valid count"

This reverts commit 08896cfccc34b0b2a1646d01d01ea4cad73941c4.

* typo fixed

* elem length made to a variable

* fix lint error

* fix lint error

* lint fixed

* bug fixed

* bug fixed

* lint fixed

* error fixed

* error fixed

* test ci

* test ci

* separate argsort to be an independent op

* separate argsort to be an independent op

* fix lint

* fix lint

* remove unsupported models

* typo fixed

* argsort added to relay

* solve conflicts with master

* fix lint

* fix lint

* test push

* Revert "test push"

This reverts commit 6db00883fab6cc06bddf564c926bb27c874397d8.

* fix lint error

* fix more lint

* cpu test_sort updated

* debug ci

* nms fixed

* expose argsort to relay frontend

* test ci

* fix lint

* sort register error fixed

* fix nnvm

* nms type fixed

* adaptive pooling added to relay

* Revert "adaptive pooling added to relay"

This reverts commit 1119f1f2c055753e0cc5611627597749134c5c8c.

* fix lint

* expose argsort op

* fix lint

* fix lint

* fix lint

* sort test updated

* sort bug fixed

* nnvm error fixed

* fix argsort default data type returned to be float instead of int

* fix lint

* fix lint

* test fixed

* fix valid count

* fix titanx bug

* tutorial add both targets

* titanx error fixed

* try to fix CI old gpu error

* try to solve CI GPU error

* get_valid_count added

* reverse get_valid_count

* get valid count optimized

* address comments

* fix ci error

* remove unnecessary block sync

* add back one sync

* address comments

* address more comments

* more comments

* move sort to be an independent algorithm

* typo fixed

* more typos

* comments addressed

* doc updated

* fix pylint

* address final comments

* apache license added

committed 5 years ago


08 Apr, 2019 1 commit

[HEADER] Add Header to Comply with ASF Release Policy (#2982) · cffb4fba

* [HEADER] ASF header dir=include

* [HEADER] ASF Header dir=src

* [HEADER] ASF Header -dir=python

* [HEADER] ASF header dir=topi

* [HEADER] ASF Header dir=nnvm

* [HEADER] ASF Header -dir=tutorials

* [HEADER] ASF Header dir=tests

* [HEADER] ASF Header -dir=docker

* fix whitespace

* [HEADER] ASF Header -dir=jvm

* [HEADER] ASF Header -dir=web

* [HEADER] ASF Header --dir=apps

* [HEADER] ASF Header --dir=vta

* [HEADER] ASF Header -dir=go

* temp

* [HEADER] ASF Header --dir=rust

* [HEADER] Add ASF Header --dir=cmake

* [HEADER] ASF Header --dir=docs

* [HEADER] Header for Jenkinsfile

* [HEADER] ASF Header to toml and md

* [HEADER] ASF Header to gradle

* Finalize rat cleanup

* Fix permission

* Fix java test

* temporarily remove nnvm onnx test

committed 5 years ago


23 Mar, 2019 1 commit
- [NNPACK] Modernize test (#2868) · 46924406
  hlu1 committed 5 years ago
  
06 Dec, 2018 1 commit
- [contrib][nnpack] remove training-optimized ops (#2224) · f467377f
  hlu1 committed 6 years ago
  
15 Nov, 2018 2 commits
- [NNPACK] Add check for NNPACK being available (`nnp_initialize()` succeeding) (#2119) · 69ad6aed
```
This fixes issues with failing tests on PowerPC.
```
  Andrew Tulloch committed 6 years ago
- [NNPACK] temporary disable nnpack test (#2115) · fc55f34f
  Tianqi Chen committed 6 years ago
  
10 Nov, 2018 1 commit
- [TVM] [NNPACK] Modernize and improve NNPACK bindings (#2084) · fc83c7f2
  Andrew Tulloch committed 6 years ago
  
21 Oct, 2018 1 commit
- [TOPI] Specify non-zero absolute tolerance in tests (#1925) · 39c8bc2a
  Sergey Mironov committed 6 years ago
  
04 Oct, 2018 1 commit
- Correction in documentation (#1810) · 836cf13a
  Pariksheet Pinjari committed 6 years ago
  
06 Sep, 2018 1 commit
- [Sparse] add sparse tensor computation support (#1289) · d87c94d4
  Liangfu Chen committed 6 years ago
  