Commits · 623dd2087839b76bf7950f0759d5d8746497f2b7 · wenyuanbo / tic

20 Jan, 2020 1 commit

[REFACTOR][TYPE] Finish move all types to IR. (#4746) · 2c0c1849

* [REFACTOR][TYPE] Finish move all types to IR.

- Move definition of Ref and TensorType to ir
- Move type_functor.h to public header.
- Rename RefType -> RelayRefType for clarity.

* Add atol

committed 5 years ago

2c0c1849 Browse File

19 Jan, 2020 1 commit

[REFACTOR] Establish tir (#4740) · cf59b206

TIR is the new namespace for low-level IR
for tensor-level optimizations and loop transformations.

This PR establishes the namespace and files.

- lowered_func.h,buffer.h,data_layout.h -> tir/buffer.h,tir/data_layout.h,tir/lowered_func.h
- ir.h -> tir/expr.h, tir/stmt.h
- ir_functor_ext.h -> tir/expr_functor.h, tir/stmt_functor.h

committed 5 years ago

cf59b206 Browse File

14 Jan, 2020 1 commit

[REFACTOR][IR] Polish ir/type (#4705) · f4c4fde4

- Use consistent constructor style to construct objects.
- Move env_func to ir as it is mainly used to construct IRs.
- Make docs consistent.

committed 5 years ago

f4c4fde4 Browse File

04 Jan, 2020 1 commit

[REFACTOR] TVM_REGISTER_API -> TVM_REGISTER_GLOBAL (#4621) · 81523604

TVM_REGSISTER_API is an alias of TVM_REGISTER_GLOBAL.
In the spirit of simplify redirections, this PR removes
the original TVM_REGISTER_API macro and directly use TVM_REGISTER_GLOBAL.

This type of refactor will also simplify the IDE navigation tools
such as FFI navigator to provide better code reading experiences.

Move EnvFunc's definition to node.

committed 5 years ago

81523604 Browse File

31 Dec, 2019 1 commit

[REFACTOR][OBJECT] Consoldiate NodePtr/Ref/Hash/Equal to Object (#4603) · a8c36921

* [REFACTOR][OBJECT] Consoldiate NodePtr/Ref/Hash/Equal and macros to Object.

Historically, we have classes like NodePtr/Ref/HashEqual.
After unified object protocol, these names are just alias of the object counterpart.
Moreover, there are helper macros defined over the places for defining these object.

This PR consoldiate the terminologies into the corresponding ones
in the Object system so we have a clean and consistent API moving forward.

* Update include/tvm/attrs.h

Co-Authored-By: Wei Chen <ipondering.weic@gmail.com>

* fix compilation

Co-authored-by: Wei Chen <ipondering.weic@gmail.com>

committed 5 years ago

a8c36921 Browse File

26 Dec, 2019 1 commit
- [Relay] Convert Layout Pass. (#4335) · 73dda6be
  Animesh Jain committed 5 years ago
  
  73dda6be Browse File
24 Nov, 2019 1 commit

[LINT] Remove unnecessary copyright message for files with ASF header (#4409) · c8772288

* [LINT] Improve the check tool to handle ASF copyright message.

* [LINT] Remove unnecessary copyright message as per ASF requirement.

* Fix codegen hybrid

* [LINT] Broaden license checks to include html, xml

* [LINT] Fix rest of the files

* Fix notice

* [LINT] Improve check file type error message

committed 5 years ago

c8772288 Browse File

21 Oct, 2019 1 commit

[REFACTOR][NODE][RUNTIME] Move Node to the new Object protocol. (#4161) · 7895adb2

* [REFACTOR][NODE][RUNTIME] Move Node to the new Object protocol.

This PR removes the original node system, and make node as a subclass of Object.
This is a major refactor towards a better unified runtime object system.

List of changes in the refactor:

- We now hide data_ field, use Downcast explicitly to get a sub-class object.
- Removed the node system FFI in python.
- Removed the node C API, instead use PackedFunc for list and get attrs.
- Change relay::Op::set_attr_type_key(attr_key_name) to relay::Op::set_attr_type<AttrType>().
  - This change was necessary because of the new Object registration mechanism.
  - Subsequent changes to the op registrations
  - The change revealed a few previous problems that is now fixed.
- Patched up a few missing node type registration.
  - Now we will raise an error if we register object that is not registered.
- The original node.h and container.h are kept in the same location.
- Calling convention: kObjectHandle now equals the old kNodeHandle, kNodeHandle is removed.
- IRFunctor now dispatches on ObjectRef.
- Update to the new type checking API: is_type, derived_from are replaced by IsInstance.
- Removed .hash member function, instead use C++ convention hasher functors.

* Address review comments

committed 5 years ago

7895adb2 Browse File

06 Aug, 2019 1 commit

[Relay] [TOPI] `{relay,topi}.nn.sparse_transpose` for **Square** CSR matrices (#3707) · 3b287c4d

* add build gcn tutorial

* add transpose operator for square sparse matrices

* remove extra files

* change loop tag

* comply with lint

* comply with lint -- line too long

* comply with lint

* lint check

* lint check

* lint check

* apply marisa and theirry's reviews

committed 5 years ago

3b287c4d Browse File

23 Jul, 2019 1 commit

We observe multiple groups across a range of domains (ASR, NMT, LM, etc), (#3566) · d6dcd6c5

internally and externally, interested in replacing standard dense layers with
block-sparse matrix multiplication layers. The motivations are generally: higher
performance (due to reduction in FLOPs, memory bandwidth/cache footprint),
enabling larger models (e.g. fitting more layers in a given memory budget).

Some public work along these lines:

* https://openai.com/blog/block-sparse-gpu-kernels/
* https://openai.com/blog/sparse-transformer/
* https://arxiv.org/abs/1802.08435
* https://arxiv.org/abs/1711.02782

Various groups have been able to successfully train models with reasonable
levels of sparsity (90%+) with marginal accuracy changes, which suggests
substantial speedups are possible (as this implies a >10x reduction in FLOPs).

It is fairly straightforward to realize these theoretical speedups, see e.g. TVM
benchmarks for Intel CPUs in
https://gist.github.com/ajtulloch/e65f90487bceb8848128e8db582fe902, and CUDA
results in https://github.com/openai/blocksparse, etc.

* https://github.com/openai/blocksparse (CUDA)
* https://software.intel.com/en-us/mkl-developer-reference-c-mkl-bsrmm (MKL BSRM)
* https://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.sparse.bsr_matrix.html (SCIPY BSR representation)

This is extracted from an internal patch we've been using internally. There are
various extensions possible (int8/fp16/bf16, CUDA/other GPU architectures), but
this is a reasonable starting point. This needs more thorough unit test coverage
however.

We follow the conventions established by scipy.sparse.bsr_matrix and other
libraries, see the unit tests for details.

For folks interested in experimenting with scheduling/AutoTVM etc,
https://gist.github.com/ajtulloch/e65f90487bceb8848128e8db582fe902 is a useful
starting point.

committed 5 years ago

d6dcd6c5 Browse File