NNVM is not a deep learning library. It is a modular,
decentralized and lightweight part to help build deep learning libraries.
NNVM is a reusable computational graph optimization and compilation stack for deep learning systems.
## What is it
NNVM provides modules to:
- Represent deep learning workloads from front-end frameworks via a graph IR.
- Optimize computation graphs to improve performance.
- Compile into executable modules and deploy to different hardware backends with minimal dependencies.
While most deep learning systems offer end-to-end solutions,
it is interesting to assemble a deep learning system from parts.
The goal is to enable users to customize the optimizations, target platforms, and set of operators they care about.
We believe that a decentralized, modular system is an interesting direction.
The hope is that effective parts can be assembled together, just as you would assemble your own desktop computer.
The customized deep learning solution can then be minimax: minimal in terms of dependencies
while maximally meeting the users' needs.
NNVM offers one such part. It provides a generic way to do
computation graph optimization, such as memory reduction and device allocation,
while being agnostic to the operator interface definition and to how operators are executed.
NNVM is inspired by LLVM and aims to be a high-level intermediate representation library
for the generation and optimization of neural nets and computation graphs.
See [Overview](docs/overview.md) for an introduction on what it provides.
## Example
See [TinyFlow](https://github.com/tqchen/tinyflow) on how you can build a TensorFlow API with NNVM and Torch.
## Why build a learning system by parts
This is essentially the ***Unix philosophy*** applied to machine learning systems.
- Essential parts can be assembled in a minimal way for embedded systems.
- Developers can hack the parts they need and compose them with other well-defined parts.
- Decentralized modules enable creators of new extensions to own their projects
without creating a monolithic version.
A deep learning system itself is not necessarily one part. For example,
here are some relatively independent parts that can be isolated:
- Computation graph definition, manipulation.
- Computation graph intermediate optimization.
- Computation graph execution.
- Operator kernel libraries.
- Imperative task scheduling and parallel task coordination.
We hope that there will be more modular parts in the future,
so system building can be fun and rewarding.
NNVM is designed so that new frontends, operators, and graph optimizations can be added in a decentralized fashion without changing the core interface. NNVM is part of the [TVM stack](https://github.com/dmlc/tvm), which provides an end-to-end IR compilation stack for deploying deep learning workloads to different hardware backends.
## Links
- [TinyFlow](https://github.com/tqchen/tinyflow) on how you can use NNVM to build a TensorFlow-like API.
- [Apache MXNet](http://mxnet.io/) uses NNVM as a backend.
[MXNet](https://github.com/dmlc/mxnet) is moving to NNVM as its intermediate
NNVM is a reusable graph IR stack for deep learning systems. It provides APIs to construct, represent, and transform computation graphs, covering most of the high-level optimizations needed in deep learning.
As part of the TVM stack for deep learning, NNVM also provides a shared compiler that deep learning frameworks can use to optimize, compile, and deploy to different hardware backends via [TVM](https://github.com/dmlc/tvm).
## Key Requirements and Design Choices
- Have minimal dependencies in the deployment module.
- Be able to add new operators to the IR in a decentralized fashion.
- Be able to add new optimization passes to the IR and apply them to existing graphs.
Items 2 and 3 are particularly interesting if we compare them to a typical compiler IR. A compiler IR usually contains a fixed set of primitives (instructions) and uses them as a contract between optimization pass designers. This design makes it easy to add new optimization passes, but not new operators (instructions), because every time we add a new instruction we need to modify the passes to accommodate it.
Deep learning frameworks usually have a fixed operator interface (schema). These interfaces can contain properties such as a shape inference function or whether in-place computation can happen. The operator interface is again a contract, one that makes it easy to add a new operator. But it is hard to add new passes in a decentralized fashion: a new optimization pass usually requires additional information, which results in frequent changes to the centralized operator interface when we are exploring new optimizations. It is also a drawback for modularization. For example, a graph compiler for FPGA devices may not need GPU-specific attributes.
During our explorations in graph optimization and compilation, we find that it is important to quickly add both operators and passes to the framework without changing the core library.
Here is a list of key elements in NNVM's design:
- An operator registry system to register and add new operators.
- An operator attribute system that provides properties of operators in a decentralized fashion.
- A reusable IR data structure for optimization passes.
The list above is the generic language part of NNVM. Besides that, we also provide a collection of core operator primitives and graph optimization passes. The core tensor operator primitives and optimizations already cover common deep learning workloads. This design allows the NNVM compiler to be used directly as an optimization and compilation stack for frameworks. The extensible nature of NNVM makes new adjustments easy without constraining the backend providers.
## Minimum Registration for a Symbolic Front-End
To use NNVM to build a language front-end, a developer only needs to register minimal information about each operator.
```c++
NNVM_REGISTER_OP(add)
.describe("add two data together")
.set_num_inputs(2);
NNVM_REGISTER_OP(conv2d)
.describe("take 2d convolution of input")
.set_num_inputs(2);
NNVM_REGISTER_OP(assign)
.describe("assign second input argument to the first one")
.set_num_inputs(2);
```
After compiling this code with the NNVM library, users can compose the computation graph from Python.
The graph structure is interchangeable between the frontend and the backend. Only the Python interface is supported currently. Support for more languages can easily be
added in the future.
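
As a rough illustration of why so little registration is needed, here is a self-contained Python toy. It is not the NNVM API: `register_op`, `Symbol`, `Variable`, and `compose` are hypothetical names chosen for this sketch. It shows that an operator name and an input count are enough to compose a graph, with inputs passed positionally and attributes passed as keyword arguments.

```python
# Toy model of a symbolic front-end; every name here is hypothetical, not NNVM's API.
OP_REGISTRY = {}


def register_op(name, num_inputs, description=""):
    """Record the minimum information a front-end needs about an operator."""
    OP_REGISTRY[name] = {"num_inputs": num_inputs, "description": description}


class Symbol:
    """A node in the computation graph: an operator applied to input symbols."""

    def __init__(self, op, inputs=(), attrs=None, name=None):
        self.op = op                    # operator name, or None for a variable
        self.inputs = list(inputs)      # positional input symbols
        self.attrs = dict(attrs or {})  # operator attributes, e.g. kernel_size
        self.name = name

    def list_variables(self):
        """Return the names of the variables this symbol depends on, in order."""
        if self.op is None:
            return [self.name]
        names = []
        for sym in self.inputs:
            for var in sym.list_variables():
                if var not in names:
                    names.append(var)
        return names


def Variable(name):
    return Symbol(op=None, name=name)


def compose(op, *inputs, name=None, **attrs):
    """Build a node after checking the input count against the registry."""
    expected = OP_REGISTRY[op]["num_inputs"]
    if len(inputs) != expected:
        raise ValueError(f"{op} expects {expected} inputs, got {len(inputs)}")
    return Symbol(op, inputs, attrs, name)


# Mirror the three registrations from the C++ snippet above.
register_op("add", 2, "add two data together")
register_op("conv2d", 2, "take 2d convolution of input")
register_op("assign", 2, "assign second input argument to the first one")

x, y, w = Variable("x"), Variable("y"), Variable("w")
z = compose("conv2d", compose("add", x, y), w, kernel_size=(2, 2), name="conv1")
print(z.list_variables())  # ['x', 'y', 'w']
```

Note that the toy front-end never needs to know how `conv2d` is executed or what shapes it produces; that knowledge can be attached later.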
## Operator Attribute for More Extensions
The minimal information provided by the operator is enough to get a front-end. However, we need more knowledge about each operator to transform and execute the graph.
A typical difference between a neural net's computation graph and a traditional compiler IR is that there are many more high-level operators. We cannot fix the set of operators in the IR.
NNVM allows developers to register attributes for each operator. The attributes can include a shape inference function, whether the operator can perform in-place calculation, and so on.
Having an operator attribute registry is not uncommon in deep learning systems.
For example, MXNet has an ```OpProperty``` class, TensorFlow has an ```OpDef```, and Caffe2 has an ```OperatorSchema``` class.
However, the operator attribute interfaces in these frameworks support only a fixed number of defined attributes that are of interest to the system. If we want to extend the framework with a new attribute for each operator, we need to change the operator registry.
Eventually, the operator interface grows very large and has to evolve in the centralized repo.
In NNVM, we decided to change the design and support arbitrary types of operator attributes without changing the interface registry. The minimal interface also makes it easier to share across multiple projects.
Users can register a new attribute, such as an in-place property checking function.
We can query these attributes anywhere in the code, as in the following snippet. Under the hood, each attribute is stored in a columnar store that can easily be retrieved as a table for quick lookups.
```c++
void MyFunction() {
  const Op* add = Op::Get("add");
  // if we need quick query, we can use static variable
  // attribute map contains attributes of all operators.
  static auto& finplace_option_tbl = Op::GetAttr<FInplaceOption>("FInplaceOption");
  // quick look up attribute of add, O(1) time, vector index lookup internally.
  auto add_inplace = finplace_option_tbl[add];
}
```
Besides keeping the code minimal, this attribute store enables decentralization of projects.
Before, all the attributes of an operator had to sit in a centralized interface class.
Now, everyone can register attributes of their own and take other attributes they need from another project, without changing the operator interface or the core library.
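
One way to picture the columnar attribute store is sketched below in plain Python. This is a toy under stated assumptions, not NNVM's implementation: `Op`, `OpRegistry`, `set_attr`, `get_attr`, and the `num_flops_per_element` attribute are hypothetical names. Each operator receives an integer index when it is registered, and each attribute is one column (a list) indexed by that integer, so a lookup is a single list index.

```python
# Toy columnar attribute store; all names are hypothetical, not NNVM's API.
class Op:
    def __init__(self, name, index):
        self.name = name
        self.index = index  # position of this operator in every attribute column


class OpRegistry:
    def __init__(self):
        self._ops = {}      # operator name -> Op
        self._columns = {}  # attribute name -> list indexed by Op.index

    def register(self, name):
        """Return the Op for `name`, creating it on first use."""
        if name not in self._ops:
            self._ops[name] = Op(name, len(self._ops))
        return self._ops[name]

    def set_attr(self, op_name, attr_name, value):
        """Attach an attribute of any type without touching the Op class."""
        op = self.register(op_name)
        column = self._columns.setdefault(attr_name, [])
        # Grow the column so that op.index is a valid position.
        column.extend([None] * (op.index + 1 - len(column)))
        column[op.index] = value

    def get_attr(self, attr_name):
        """Return a lookup function over one attribute column."""
        column = self._columns.setdefault(attr_name, [])

        def lookup(op, default=None):
            if op.index < len(column) and column[op.index] is not None:
                return column[op.index]
            return default

        return lookup


registry = OpRegistry()
add = registry.register("add")
exp = registry.register("exp")
mul = registry.register("mul")

# One project registers an in-place option: (input index, output index) pairs.
registry.set_attr("add", "FInplaceOption", lambda attrs: [(0, 0), (1, 0)])
registry.set_attr("exp", "FInplaceOption", lambda attrs: [(0, 0)])
# Another project adds an unrelated attribute without changing the registry.
registry.set_attr("add", "num_flops_per_element", 1)

finplace_option_tbl = registry.get_attr("FInplaceOption")
print(finplace_option_tbl(add)({}))  # [(0, 0), (1, 0)]
print(finplace_option_tbl(mul))      # None: nothing registered for mul
```

Because columns are keyed by attribute name rather than declared on `Op`, a new attribute never requires a change to the operator class itself.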
## Graph and Pass
We can use the additional information in the attribute registry to do optimizations and get more information about the graph. The Graph is the unit we manipulate in these steps. A Graph in NNVM contains
two parts:
- The computation graph structure
- An attribute map from string to any type ```map<string, shared_ptr<any> >```
The second part, the attribute map, is quite important, as we may need different kinds
of information about the graph during the transformation process, such as
the shape of each tensor, the type of each tensor, or the storage allocation plan.
A ```Pass``` can take a graph with existing attribute information
and transform it into either the same graph structure with more graph attributes or another graph.
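
The sketch below illustrates the Graph-plus-Pass idea in plain Python. It is a toy model, not NNVM's API: `Graph`, `Node`, `infer_shape_pass`, and the attribute keys `"shape_inputs"` and `"shape"` are hypothetical names, and the only shape rule assumed is that elementwise operators preserve their input shape. The pass reads one graph attribute and returns the same structure with one more attribute attached.

```python
# Toy Graph and Pass; all names are hypothetical, not NNVM's API.
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Node:
    name: str
    op: Optional[str] = None  # None marks an input variable
    inputs: List[int] = field(default_factory=list)  # indices of producer nodes


@dataclass
class Graph:
    nodes: List[Node]  # graph structure, in topological order
    attrs: Dict[str, Any] = field(default_factory=dict)  # string -> any type


def infer_shape_pass(graph: Graph) -> Graph:
    """Return the same structure with a per-node "shape" attribute added."""
    known = graph.attrs["shape_inputs"]  # variable name -> shape
    shapes = []
    for node in graph.nodes:
        if node.op is None:
            shapes.append(known[node.name])
        else:
            in_shapes = [shapes[i] for i in node.inputs]
            # Toy rule: elementwise ops require equal shapes and preserve them.
            if any(s != in_shapes[0] for s in in_shapes):
                raise ValueError(f"shape mismatch at {node.name}: {in_shapes}")
            shapes.append(in_shapes[0])
    # Leave the input graph untouched; share structure, extend attributes.
    return Graph(graph.nodes, {**graph.attrs, "shape": shapes})


g = Graph(
    nodes=[Node("x"), Node("y"), Node("sum", "add", [0, 1]), Node("out", "exp", [2])],
    attrs={"shape_inputs": {"x": (4, 8), "y": (4, 8)}},
)
g = infer_shape_pass(g)
print(g.attrs["shape"])  # [(4, 8), (4, 8), (4, 8), (4, 8)]
```

A later pass, such as a storage planner, could then read `"shape"` from the attribute map without knowing which pass produced it.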
This page contains the list of core tensor operator primitives pre-defined in NNVM.
The core tensor operator primitives (``nnvm.top``) cover typical workloads in deep learning.
They can represent workloads in front-end frameworks and provide basic building blocks for optimization.
Since deep learning is a fast-evolving field, it is possible to encounter operators that are not listed here.
NNVM is designed for this problem and can easily add new operators without changing the core library.
.. note::

   Each operator node in the graph IR contains the following two kinds of parameters.

   - inputs: positional list of input tensors
   - attrs: attributes about the operator (e.g. kernel_size in conv2d)

   This document lists both inputs and attributes in the parameter field. You can distinguish them by the marked type. The inputs are of type Tensor, while the remaining parameters are attributes.
   To construct the graph with the NNVM Python API, a user can pass in the input Tensors as positional arguments and the attributes as keyword arguments.

Overview of Operators
---------------------
**Level 1: Basic Operators**
This level enables fully connected multi-layer perceptrons.
.. autosummary::
...
...