Unverified Commit 85f7e0a6 by Zhi Committed by GitHub

Fix doc after moving to unified IR (#4835)

parent 9d741ef7
...@@ -78,7 +78,7 @@ C++ Backend ...@@ -78,7 +78,7 @@ C++ Backend
We provide a ``PassInfo`` object to contain the basic information needed by We provide a ``PassInfo`` object to contain the basic information needed by
a pass. ``name`` is the pass name, ``opt_level`` indicates at which optimization a pass. ``name`` is the pass name, ``opt_level`` indicates at which optimization
level the pass will be enabled, and ``required`` represents the passes that are level the pass will be enabled, and ``required`` represents the passes that are
required to execute a certain pass (see `include/tvm/relay/transform.h`_ for required to execute a certain pass (see `include/tvm/ir/transform.h`_ for
more details). For example, during registration of a pass (will be covered in more details). For example, during registration of a pass (will be covered in
later), the pass developers can specify the name of the pass, the optimization later), the pass developers can specify the name of the pass, the optimization
level it will be performed at, and/or the passes that are required. level it will be performed at, and/or the passes that are required.
...@@ -183,7 +183,7 @@ optimization passes, e.g., function-level passes, module-level passes, and ...@@ -183,7 +183,7 @@ optimization passes, e.g., function-level passes, module-level passes, and
sequential passes. Each subclass itself could act as a pass manager. For sequential passes. Each subclass itself could act as a pass manager. For
instance, they could collect the required passes and execute them or build instance, they could collect the required passes and execute them or build
a dependency graph based on the given metadata. The full definition of them a dependency graph based on the given metadata. The full definition of them
can be found in `src/relay/pass/pass_manager.cc`_ can be found in `src/relay/ir/transform.cc`_ and `src/ir/transform.cc`_.
Module-Level Passes Module-Level Passes
^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^
...@@ -651,9 +651,11 @@ For more pass infra related examples in Python and C++, please refer to ...@@ -651,9 +651,11 @@ For more pass infra related examples in Python and C++, please refer to
.. _Relay module: https://docs.tvm.ai/langref/relay_expr.html#module-and-global-functions .. _Relay module: https://docs.tvm.ai/langref/relay_expr.html#module-and-global-functions
.. _include/tvm/relay/transform.h: https://github.com/apache/incubator-tvm/blob/master/include/tvm/relay/transform.h .. _include/tvm/ir/transform.h: https://github.com/apache/incubator-tvm/blob/master/include/tvm/ir/transform.h
.. _src/relay/pass/pass_manager.cc: https://github.com/apache/incubator-tvm/blob/master/src/relay/pass/pass_manager.cc .. _src/relay/ir/transform.cc: https://github.com/apache/incubator-tvm/blob/master/src/relay/ir/transform.cc
.. _src/ir/transform.cc: https://github.com/apache/incubator-tvm/blob/master/src/ir/transform.cc
.. _src/relay/pass/fold_constant.cc: https://github.com/apache/incubator-tvm/blob/master/src/relay/pass/fold_constant.cc .. _src/relay/pass/fold_constant.cc: https://github.com/apache/incubator-tvm/blob/master/src/relay/pass/fold_constant.cc
......
...@@ -94,7 +94,7 @@ Here are the common ones: ...@@ -94,7 +94,7 @@ Here are the common ones:
- PackedFunc itself - PackedFunc itself
- Module for compiled modules - Module for compiled modules
- DLTensor* for tensor object exchange - DLTensor* for tensor object exchange
- TVM Node to represent any object in IR - TVM Object to represent any object in IR
The restriction makes the implementation simple without the need of serialization. The restriction makes the implementation simple without the need of serialization.
Despite being minimum, the PackedFunc is sufficient for the use-case of deep learning deployment as Despite being minimum, the PackedFunc is sufficient for the use-case of deep learning deployment as
...@@ -141,7 +141,7 @@ One fun fact about PackedFunc is that we use it for both compiler and deployment ...@@ -141,7 +141,7 @@ One fun fact about PackedFunc is that we use it for both compiler and deployment
.. _here: https://github.com/apache/incubator-tvm/tree/master/src/api .. _here: https://github.com/apache/incubator-tvm/tree/master/src/api
To keep the runtime minimum, we isolated the IR Node support from the deployment runtime. The resulting runtime takes around 200K - 600K depending on how many runtime driver modules (e.g., CUDA) get included. To keep the runtime minimum, we isolated the IR Object support from the deployment runtime. The resulting runtime takes around 200K - 600K depending on how many runtime driver modules (e.g., CUDA) get included.
The overhead of calling into PackedFunc vs. a normal function is small, as it is only saving a few values on the stack. The overhead of calling into PackedFunc vs. a normal function is small, as it is only saving a few values on the stack.
So it is OK as long as we don't wrap small functions. So it is OK as long as we don't wrap small functions.
...@@ -182,7 +182,7 @@ RPC server on iPhone/android/raspberry pi or even the browser. The cross compila ...@@ -182,7 +182,7 @@ RPC server on iPhone/android/raspberry pi or even the browser. The cross compila
This instant feedback gives us a lot of advantages. For example, to test the correctness of generated code on iPhone, we no longer have to write test-cases in swift/objective-c from scratch -- We can use RPC to execute on iPhone, copy the result back and do verification on the host via numpy. We can also do the profiling using the same script. This instant feedback gives us a lot of advantages. For example, to test the correctness of generated code on iPhone, we no longer have to write test-cases in swift/objective-c from scratch -- We can use RPC to execute on iPhone, copy the result back and do verification on the host via numpy. We can also do the profiling using the same script.
TVM Node and Compiler Stack TVM Object and Compiler Stack
--------------------------- ---------------------------
As we mentioned earlier, we build compiler stack API on top of the PackedFunc runtime system. As we mentioned earlier, we build compiler stack API on top of the PackedFunc runtime system.
...@@ -192,17 +192,17 @@ However, we don't want to change our API from time to time. Besides that, we als ...@@ -192,17 +192,17 @@ However, we don't want to change our API from time to time. Besides that, we als
- be able to serialize any language object and IRs - be able to serialize any language object and IRs
- be able to explore, print, and manipulate the IR objects in front-end language to do quick prototyping. - be able to explore, print, and manipulate the IR objects in front-end language to do quick prototyping.
We introduced a base class, called `Node`_ to solve this problem. We introduced a base class, called `Object`_ to solve this problem.
All the language object in the compiler stack is a subclass of Node. Each node contains a string type_key that uniquely identifies All the language object in the compiler stack is a subclass of ``Object``. Each object contains a string type_key that uniquely identifies
the type of object. We choose string instead of int as type key so new Node class can be added in the decentralized fashion without the type of object. We choose string instead of int as type key so new ``Object`` class can be added in the decentralized fashion without
adding the code back to the central repo. To ease the speed of dispatching, we allocate an integer type_index at runtime for each type_key. adding the code back to the central repo. To ease the speed of dispatching, we allocate an integer type_index at runtime for each type_key.
.. _Node: https://github.com/dmlc/HalideIR/blob/master/src/tvm/node/node.h#L61 .. _Object: https://github.com/apache/incubator-tvm/blob/master/include/tvm/runtime/object.h
Since usually one Node object could be referenced in multiple places in the language, we use a shared_ptr to keep Since usually one ``Object`` could be referenced in multiple places in the language, we use a shared_ptr to keep
track of reference. We use NodeRef class to represent a reference to the Node. track of reference. We use ``ObjectRef`` class to represent a reference to the ``Object``.
We can roughly view NodeRef class as shared_ptr to the Node container. We can roughly view ``ObjectRef`` class as shared_ptr to the ``Object`` container.
We can also define subclass NodeRef to hold each subtypes of Node. Each Node class needs to define the VisitAttr function. We can also define subclass ``ObjectRef`` to hold each subtypes of ``Object``. Each subclass of ``Object`` needs to define the VisitAttr function.
.. code:: c .. code:: c
...@@ -216,21 +216,21 @@ We can also define subclass NodeRef to hold each subtypes of Node. Each Node cla ...@@ -216,21 +216,21 @@ We can also define subclass NodeRef to hold each subtypes of Node. Each Node cla
virtual void Visit(const char* key, std::string* value) = 0; virtual void Visit(const char* key, std::string* value) = 0;
virtual void Visit(const char* key, void** value) = 0; virtual void Visit(const char* key, void** value) = 0;
virtual void Visit(const char* key, Type* value) = 0; virtual void Visit(const char* key, Type* value) = 0;
virtual void Visit(const char* key, NodeRef* value) = 0; virtual void Visit(const char* key, ObjectRef* value) = 0;
// ... // ...
}; };
class Node { class BaseAttrsNode : public Object {
public: public:
virtual void VisitAttrs(AttrVisitor* visitor) {} virtual void VisitAttrs(AttrVisitor* v) {}
// ... // ...
}; };
Each Node subclass will override this to visit its members. Here is an example implementation of TensorNode. Each ``Object`` subclass will override this to visit its members. Here is an example implementation of TensorNode.
.. code:: c .. code:: c
class TensorNode : public Node { class TensorNode : public Object {
public: public:
/*! \brief The shape of the tensor */ /*! \brief The shape of the tensor */
Array<Expr> shape; Array<Expr> shape;
...@@ -251,7 +251,7 @@ Each Node subclass will override this to visit its members. Here is an example i ...@@ -251,7 +251,7 @@ Each Node subclass will override this to visit its members. Here is an example i
} }
}; };
In the above examples, both ``Operation`` and ``Array<Expr>`` are NodeRef. In the above examples, both ``Operation`` and ``Array<Expr>`` are ObjectRef.
The VisitAttrs gives us a reflection API to visit each member of the object. The VisitAttrs gives us a reflection API to visit each member of the object.
We can use this function to visit the node and serialize any language object recursively. We can use this function to visit the node and serialize any language object recursively.
It also allows us to get members of an object easily in front-end language. It also allows us to get members of an object easily in front-end language.
...@@ -264,7 +264,7 @@ For example, in the following code, we accessed the op field of the TensorNode. ...@@ -264,7 +264,7 @@ For example, in the following code, we accessed the op field of the TensorNode.
# access the op field of TensorNode # access the op field of TensorNode
print(x.op.name) print(x.op.name)
New Node can be added to C++ without changing the front-end runtime, making it easy to make extensions to the compiler stack. New ``Object`` can be added to C++ without changing the front-end runtime, making it easy to make extensions to the compiler stack.
Note that this is not the fastest way to expose members to front-end language, but might be one of the simplest Note that this is not the fastest way to expose members to front-end language, but might be one of the simplest
approaches possible. We also find that it fits our purposes as we mainly use python for testing and prototyping and still use c++ approaches possible. We also find that it fits our purposes as we mainly use python for testing and prototyping and still use c++
to do the heavy lifting job. to do the heavy lifting job.
......
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment