Unverified Commit 3bab699d by Tianqi Chen Committed by GitHub

[DOCS] Various sphinx related fix. (#5168)

* [DOCS] Various sphinx related fix.

- Use :ref: for reference.
- Use :py:class: to refer to API docs.
- Update installation guide to also refer to the download page.
- Only move html contents in doxygen.

* Address review comments

* Update wording
parent 6536b356
...@@ -40,8 +40,8 @@ The helper bash script can be useful to build demo sessions. ...@@ -40,8 +40,8 @@ The helper bash script can be useful to build demo sessions.
## Prebuilt Docker Images ## Prebuilt Docker Images
We provide several pre-built images for doing quick exploration with TVM installed. You can use third party pre-built images for doing quick exploration with TVM installed.
For example, you can run the following command to get ```tvmai/demo-cpu``` image. For example, you can run the following command to launch ```tvmai/demo-cpu``` image.
```bash ```bash
/path/to/tvm/docker/bash.sh tvmai/demo-cpu /path/to/tvm/docker/bash.sh tvmai/demo-cpu
...@@ -52,7 +52,8 @@ Then inside the docker container, you can type the following command to start th ...@@ -52,7 +52,8 @@ Then inside the docker container, you can type the following command to start th
jupyter notebook jupyter notebook
``` ```
Check out https://hub.docker.com/r/tvmai/ to get the full list of available prebuilt images. You can find some un-official prebuilt images in https://hub.docker.com/r/tvmai/ .
Note that these are convenience images and are not part of the ASF release.
## Use Local Build Script ## Use Local Build Script
......
...@@ -103,3 +103,17 @@ The tutorial code will run on our build server to generate the document page. ...@@ -103,3 +103,17 @@ The tutorial code will run on our build server to generate the document page.
So we may have a restriction like not being able to access a remote Raspberry Pi, So we may have a restriction like not being able to access a remote Raspberry Pi,
in such case add a flag variable to the tutorial (e.g. `use_rasp`) and allow users to easily switch to the real device by changing one flag. in such case add a flag variable to the tutorial (e.g. `use_rasp`) and allow users to easily switch to the real device by changing one flag.
Then use the existing environment to demonstrate the usage. Then use the existing environment to demonstrate the usage.
Refer to Another Location in the Document
-----------------------------------------
Please use sphinx's `:ref:` markup to refer to another location in the same doc.
.. code-block:: rst
.. _document-my-section-tag:
My Section
----------
You can use :ref:`document-my-section-tag` to refer to My Section.
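A `:ref:` only resolves if a matching label is defined somewhere in the docs, and a label definition has the form `.. _label-name:` with a trailing colon. The sketch below is a hypothetical stand-alone checker, not part of Sphinx or TVM, that scans a piece of reStructuredText for `:ref:` targets with no matching label. The helper name `undefined_refs` and the sample text are assumptions made for illustration; the Sphinx build itself remains the authoritative check.

```python
import re

# Hypothetical helper for illustration; not part of Sphinx or TVM.
# A label definition looks like ".. _label-name:" on its own line.
LABEL_RE = re.compile(r"^\.\. _([A-Za-z0-9_-]+):\s*$", re.MULTILINE)
# Matches :ref:`label` as well as :ref:`Custom title <label>`.
REF_RE = re.compile(r":ref:`(?:[^`<]*<)?([A-Za-z0-9_-]+)>?`")


def undefined_refs(rst_text):
    """Return the sorted :ref: targets that have no label in rst_text."""
    labels = set(LABEL_RE.findall(rst_text))
    refs = set(REF_RE.findall(rst_text))
    return sorted(refs - labels)


SAMPLE = """\
.. _document-my-section-tag:

My Section
----------

You can use :ref:`document-my-section-tag` to refer to My Section.
See also :ref:`missing-tag`.
"""

print(undefined_refs(SAMPLE))
```

In a real tree the labels and the references usually live in different files, so a checker like this would need to collect labels across all sources before comparing.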
...@@ -46,7 +46,7 @@ One of the interesting aspects of the TVM codebase is that interoperability betw ...@@ -46,7 +46,7 @@ One of the interesting aspects of the TVM codebase is that interoperability betw
Vector Add Example Vector Add Example
******************************************* *******************************************
We use a simple example that uses the low level TVM API directly. The example is vector addition, which is covered in detail in `this tutorial <https://docs.tvm.ai/tutorials/get_started.html#sphx-glr-tutorials-get-started-py>`_. We use a simple example that uses the low level TVM API directly. The example is vector addition, which is covered in detail in :ref:`tutorial-tensor-expr-get-started`
:: ::
...@@ -66,9 +66,9 @@ Here, types of ``A``, ``B``, ``C`` are ``tvm.tensor.Tensor``, defined in ``pytho ...@@ -66,9 +66,9 @@ Here, types of ``A``, ``B``, ``C`` are ``tvm.tensor.Tensor``, defined in ``pytho
def __call__(self, *indices): def __call__(self, *indices):
... ...
The object protocol is the basis of exposing C++ types to frontend languages, including Python. The way TVM implements Python wrapping is not straightforward. It is briefly covered in `this document <https://docs.tvm.ai/dev/runtime.html#tvm-node-and-compiler-stack>`_, and details are in ``python/tvm/_ffi/`` if you are interested. The object protocol is the basis of exposing C++ types to frontend languages, including Python. The way TVM implements Python wrapping is not straightforward. It is briefly covered in :ref:`tvm-runtime-system`, and details are in ``python/tvm/_ffi/`` if you are interested.
We use the ``TVM_REGISTER_*`` macro to expose C++ functions to frontend languages, in the form of a `PackedFunc <https://docs.tvm.ai/dev/runtime.html#packedfunc>`_. A ``PackedFunc`` is another mechanism by which TVM implements interoperability between C++ and Python. In particular, this is what makes calling Python functions from the C++ codebase very easy. We use the ``TVM_REGISTER_*`` macro to expose C++ functions to frontend languages, in the form of a :ref:`tvm-runtime-system-packed-func`. A ``PackedFunc`` is another mechanism by which TVM implements interoperability between C++ and Python. In particular, this is what makes calling Python functions from the C++ codebase very easy.
You can also checkout `FFI Navigator <https://github.com/tqchen/ffi-navigator>`_ which allows you to navigate between python and c++ FFI calls. You can also checkout `FFI Navigator <https://github.com/tqchen/ffi-navigator>`_ which allows you to navigate between python and c++ FFI calls.
A ``Tensor`` object has an ``Operation`` object associated with it, defined in ``python/tvm/te/tensor.py``, ``include/tvm/te/operation.h``, and ``src/tvm/te/operation`` subdirectory. A ``Tensor`` is an output of its ``Operation`` object. Each ``Operation`` object has in turn ``input_tensors()`` method, which returns a list of input ``Tensor`` to it. This way we can keep track of dependencies between ``Operation``. A ``Tensor`` object has an ``Operation`` object associated with it, defined in ``python/tvm/te/tensor.py``, ``include/tvm/te/operation.h``, and ``src/tvm/te/operation`` subdirectory. A ``Tensor`` is an output of its ``Operation`` object. Each ``Operation`` object has in turn ``input_tensors()`` method, which returns a list of input ``Tensor`` to it. This way we can keep track of dependencies between ``Operation``.
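The relationship described above (a tensor is the output of an operation, and an operation lists its input tensors) can be sketched with a toy model. The classes `ToyTensor` and `ToyOperation` and the function `dependency_order` are hypothetical stand-ins written for this illustration; they are not the TVM classes and only mirror the `input_tensors()` idea.

```python
# Toy model only: these classes are NOT the TVM Tensor/Operation classes.
class ToyOperation:
    def __init__(self, name, inputs=()):
        self.name = name
        self._inputs = list(inputs)

    def input_tensors(self):
        """Return the tensors this operation reads from."""
        return list(self._inputs)


class ToyTensor:
    def __init__(self, op):
        # Each tensor remembers the operation that produces it.
        self.op = op


def dependency_order(tensor):
    """List operation names so that every producer precedes its consumers."""
    order, seen = [], set()

    def visit(t):
        if id(t.op) in seen:
            return
        seen.add(id(t.op))
        for inp in t.op.input_tensors():
            visit(inp)
        order.append(t.op.name)

    visit(tensor)
    return order


# Vector add: C depends on the two placeholder tensors A and B.
A = ToyTensor(ToyOperation("A"))
B = ToyTensor(ToyOperation("B"))
C = ToyTensor(ToyOperation("C", [A, B]))
print(dependency_order(C))
```

Walking from the output tensor back through `input_tensors()` is enough to recover the whole dependency graph, which is the property the paragraph above relies on.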
...@@ -121,9 +121,7 @@ Lowering is done by ``tvm.lower()`` function, defined in ``python/tvm/build_modu ...@@ -121,9 +121,7 @@ Lowering is done by ``tvm.lower()`` function, defined in ``python/tvm/build_modu
stmt = schedule.ScheduleOps(sch, bounds) stmt = schedule.ScheduleOps(sch, bounds)
... ...
Bound inference is the process where all loop bounds and sizes of intermediate buffers are inferred. If you target the CUDA backend and you use shared memory, its required minimum size is automatically determined here. Bound inference is implemented in ``src/te/schedule/bound.cc``, ``src/te/schedule/graph.cc`` and ``src/te/schedule/message_passing.cc``. For more information on how bound inference works, see `InferBound Pass`_. Bound inference is the process where all loop bounds and sizes of intermediate buffers are inferred. If you target the CUDA backend and you use shared memory, its required minimum size is automatically determined here. Bound inference is implemented in ``src/te/schedule/bound.cc``, ``src/te/schedule/graph.cc`` and ``src/te/schedule/message_passing.cc``. For more information on how bound inference works, see :ref:`dev-InferBound-Pass`.
.. _InferBound Pass: http://docs.tvm.ai/dev/inferbound.html
``stmt``, which is the output of ``ScheduleOps()``, represents an initial loop nest structure. If you have applied ``reorder`` or ``split`` primitives to your schedule, then the initial loop nest already reflects those changes. ``ScheduleOps()`` is defined in ``src/te/schedule/schedule_ops.cc``. ``stmt``, which is the output of ``ScheduleOps()``, represents an initial loop nest structure. If you have applied ``reorder`` or ``split`` primitives to your schedule, then the initial loop nest already reflects those changes. ``ScheduleOps()`` is defined in ``src/te/schedule/schedule_ops.cc``.
......
...@@ -15,10 +15,13 @@ ...@@ -15,10 +15,13 @@
specific language governing permissions and limitations specific language governing permissions and limitations
under the License. under the License.
.. _dev-InferBound-Pass:
******************************************* *******************************************
InferBound Pass InferBound Pass
******************************************* *******************************************
The InferBound pass is run after normalize, and before ScheduleOps `build_module.py <https://github.com/apache/incubator-tvm/blob/master/python/tvm/build_module.py>`_. The main job of InferBound is to create the bounds map, which specifies a Range for each IterVar in the program. These bounds are then passed to ScheduleOps, where they are used to set the extents of For loops, see `MakeLoopNest <https://github.com/apache/incubator-tvm/blob/master/src/op/op_util.cc>`_, and to set the sizes of allocated buffers (`BuildRealize <https://github.com/apache/incubator-tvm/blob/master/src/op/compute_op.cc>`_), among other uses. The InferBound pass is run after normalize, and before ScheduleOps `build_module.py <https://github.com/apache/incubator-tvm/blob/master/python/tvm/build_module.py>`_. The main job of InferBound is to create the bounds map, which specifies a Range for each IterVar in the program. These bounds are then passed to ScheduleOps, where they are used to set the extents of For loops, see `MakeLoopNest <https://github.com/apache/incubator-tvm/blob/master/src/op/op_util.cc>`_, and to set the sizes of allocated buffers (`BuildRealize <https://github.com/apache/incubator-tvm/blob/master/src/op/compute_op.cc>`_), among other uses.
The output of InferBound is a map from IterVar to Range: The output of InferBound is a map from IterVar to Range:
......
...@@ -169,12 +169,12 @@ subclasses at the level of modules, functions, or sequences of passes.. ...@@ -169,12 +169,12 @@ subclasses at the level of modules, functions, or sequences of passes..
class PassNode : RelayNode { class PassNode : RelayNode {
virtual PassInfo Info() const = 0; virtual PassInfo Info() const = 0;
virtual Module operator()(const Module& mod, virtual Module operator()(const IRModule& mod,
const PassContext& pass_ctx) const = 0; const PassContext& pass_ctx) const = 0;
}; };
The functor shows how a pass must be realized, i.e. it always works on a `Relay The functor shows how a pass must be realized, i.e. it always works on a
module`_ under a certain context. All passes are designed in a ``Module`` to ``Module`` :py:class:`IRModule` under a certain context. All passes are designed in a ``Module`` to ``Module``
manner. Therefore, optimizations governed by the pass infra will manner. Therefore, optimizations governed by the pass infra will
always update the whole module. always update the whole module.
...@@ -649,8 +649,6 @@ For more pass infra related examples in Python and C++, please refer to ...@@ -649,8 +649,6 @@ For more pass infra related examples in Python and C++, please refer to
.. _Block: https://mxnet.incubator.apache.org/api/python/docs/api/gluon/block.html#gluon-block .. _Block: https://mxnet.incubator.apache.org/api/python/docs/api/gluon/block.html#gluon-block
.. _Relay module: https://docs.tvm.ai/langref/relay_expr.html#module-and-global-functions
.. _include/tvm/ir/transform.h: https://github.com/apache/incubator-tvm/blob/master/include/tvm/ir/transform.h .. _include/tvm/ir/transform.h: https://github.com/apache/incubator-tvm/blob/master/include/tvm/ir/transform.h
.. _src/relay/ir/transform.cc: https://github.com/apache/incubator-tvm/blob/master/src/relay/ir/transform.cc .. _src/relay/ir/transform.cc: https://github.com/apache/incubator-tvm/blob/master/src/relay/ir/transform.cc
......
...@@ -37,6 +37,8 @@ We need to satisfy quite a few interesting requirements: ...@@ -37,6 +37,8 @@ We need to satisfy quite a few interesting requirements:
We want to be able to define a function from any language and call from another. We want to be able to define a function from any language and call from another.
We also want the runtime core to be minimal to deploy to embedded devices. We also want the runtime core to be minimal to deploy to embedded devices.
.. _tvm-runtime-system-packed-func:
PackedFunc PackedFunc
---------- ----------
...@@ -176,9 +178,8 @@ Under the hood, we have an RPCModule that serializes the arguments to do the dat ...@@ -176,9 +178,8 @@ Under the hood, we have an RPCModule that serializes the arguments to do the dat
The RPC server itself is minimum and can be bundled into the runtime. We can start a minimum TVM The RPC server itself is minimum and can be bundled into the runtime. We can start a minimum TVM
RPC server on iPhone/android/raspberry pi or even the browser. The cross compilation on server and shipping of the module for testing can be done in the same script. Checkout RPC server on iPhone/android/raspberry pi or even the browser. The cross compilation on server and shipping of the module for testing can be done in the same script. Checkout
`Cross compilation and RPC tutorial`_ for more details. :ref:`tutorial-cross-compilation-and-rpc` for more details.
.. _Cross compilation and RPC tutorial: https://docs.tvm.ai/tutorials/cross_compilation_and_rpc.html#sphx-glr-tutorials-cross-compilation-and-rpc-py
This instant feedback gives us a lot of advantages. For example, to test the correctness of generated code on iPhone, we no longer have to write test-cases in swift/objective-c from scratch -- We can use RPC to execute on iPhone, copy the result back and do verification on the host via numpy. We can also do the profiling using the same script. This instant feedback gives us a lot of advantages. For example, to test the correctness of generated code on iPhone, we no longer have to write test-cases in swift/objective-c from scratch -- We can use RPC to execute on iPhone, copy the result back and do verification on the host via numpy. We can also do the profiling using the same script.
......
...@@ -19,28 +19,26 @@ ...@@ -19,28 +19,26 @@
Docker Images Docker Images
============= =============
We provide several prebuilt docker images to quickly try out TVM. We provide docker utility scripts to help developers to setup development environment.
These images are also helpful to run through TVM demo and tutorials. They are also helpful to run through TVM demo and tutorials.
You can get the docker images via the following steps.
We need `docker <https://docs.docker.com/engine/installation/>`_ and We need `docker <https://docs.docker.com/engine/installation/>`_ and
`nvidia-docker <https://github.com/NVIDIA/nvidia-docker/>`_ if we want to use cuda. `nvidia-docker <https://github.com/NVIDIA/nvidia-docker/>`_ if we want to use cuda.
First, clone TVM repo to get the auxiliary scripts Get a tvm source distribution or clone the github repo to get the auxiliary scripts
.. code:: bash .. code:: bash
git clone --recursive https://github.com/apache/incubator-tvm tvm git clone --recursive https://github.com/apache/incubator-tvm tvm
We can then use the following command to launch a `tvmai/demo-cpu` image. We can then use the following command to launch a docker image.
.. code:: bash .. code:: bash
/path/to/tvm/docker/bash.sh tvmai/demo-cpu /path/to/tvm/docker/bash.sh <image-name>
You can also change `demo-cpu` to `demo-gpu` to get a CUDA enabled image.
You can find all the prebuilt images in `<https://hub.docker.com/r/tvmai/>`_
Here the image-name can be a local docker image name, e.g. `tvm.ci_cpu` after you have done
the local build. Or a pre-built third party image (`tvmai/demo-cpu` or `tvmai/ci-gpu`).
This auxiliary script does the following things: This auxiliary script does the following things:
...@@ -67,7 +65,10 @@ Note that on macOS, because we use bridge network, jupyter notebook will be repo ...@@ -67,7 +65,10 @@ Note that on macOS, because we use bridge network, jupyter notebook will be repo
at an URL like ``http://{container_hostname}:8888/?token=...``. You should replace the ``container_hostname`` at an URL like ``http://{container_hostname}:8888/?token=...``. You should replace the ``container_hostname``
with ``localhost`` when pasting it into browser. with ``localhost`` when pasting it into browser.
You can find some un-official prebuilt images in `<https://hub.docker.com/r/tvmai/>`_.
Note that these are convenience images and are not part of the ASF release.
Docker Source Docker Source
------------- -------------
Check out `<https://github.com/apache/incubator-tvm/tree/master/docker>`_ if you are interested in Check out `The docker source <https://github.com/apache/incubator-tvm/tree/master/docker>`_ if you are interested in
building your own docker images. building your own docker images.
...@@ -25,7 +25,12 @@ scratch on various systems. It consists of two steps: ...@@ -25,7 +25,12 @@ scratch on various systems. It consists of two steps:
1. First build the shared library from the C++ codes (`libtvm.so` for linux, `libtvm.dylib` for macOS and `libtvm.dll` for windows). 1. First build the shared library from the C++ codes (`libtvm.so` for linux, `libtvm.dylib` for macOS and `libtvm.dll` for windows).
2. Setup for the language packages (e.g. Python Package). 2. Setup for the language packages (e.g. Python Package).
To get started, clone TVM repo from github. It is important to clone the submodules along, with ``--recursive`` option. To get started, download tvm source code from the `Download Page <https://tvm.apache.org/download>`_.
Developers: Get Source from Github
----------------------------------
You can also choose to clone the source repo from github.
It is important to clone the submodules along, with ``--recursive`` option.
.. code:: bash .. code:: bash
......
...@@ -22,7 +22,7 @@ VTA: Deep Learning Accelerator Stack ...@@ -22,7 +22,7 @@ VTA: Deep Learning Accelerator Stack
The Versatile Tensor Accelerator (VTA) is an open, generic, and customizable deep learning accelerator with a complete TVM-based compiler stack. We designed VTA to expose the most salient and common characteristics of mainstream deep learning accelerators. Together TVM and VTA form an end-to-end hardware-software deep learning system stack that includes hardware design, drivers, a JIT runtime, and an optimizing compiler stack based on TVM. The Versatile Tensor Accelerator (VTA) is an open, generic, and customizable deep learning accelerator with a complete TVM-based compiler stack. We designed VTA to expose the most salient and common characteristics of mainstream deep learning accelerators. Together TVM and VTA form an end-to-end hardware-software deep learning system stack that includes hardware design, drivers, a JIT runtime, and an optimizing compiler stack based on TVM.
.. image:: http://raw.githubusercontent.com/uwsaml/web-data/master/vta/blogpost/vta_overview.png .. image:: http://raw.githubusercontent.com/uwsampl/web-data/master/vta/blogpost/vta_overview.png
:align: center :align: center
:width: 60% :width: 60%
......
...@@ -35,7 +35,7 @@ find . -type f -path "*.log" | xargs rm -f ...@@ -35,7 +35,7 @@ find . -type f -path "*.log" | xargs rm -f
# C++ doc # C++ doc
make doc make doc
rm -f docs/doxygen/html/*.map docs/doxygen/html/*.md5 rm -f docs/doxygen/html/*.map docs/doxygen/html/*.md5
mv docs/doxygen docs/_build/html/doxygen mv docs/doxygen/html docs/_build/html/doxygen
# JS doc # JS doc
jsdoc -c web/.jsdoc_conf.json web/tvm_runtime.js web/README.md jsdoc -c web/.jsdoc_conf.json web/tvm_runtime.js web/README.md
......
...@@ -27,24 +27,15 @@ execute them and maintain their dependencies manually. Therefore, we have ...@@ -27,24 +27,15 @@ execute them and maintain their dependencies manually. Therefore, we have
introduced an infrastructure to manage the optimization passes. introduced an infrastructure to manage the optimization passes.
The optimizations of a Relay program could be applied at various granularity, The optimizations of a Relay program could be applied at various granularity,
namely function-level and module-level using `FunctionPass`_ and `ModulePass`_ namely function-level and module-level using :py:class:`tvm.relay.transform.FunctionPass`
respectively. Or users can rely on `Sequential`_ to apply a sequence of passes and :py:class:`tvm.relay.transform.ModulePass`
respectively. Or users can rely on :py:class:`tvm.relay.transform.Sequential` to apply a sequence of passes
on a Relay program where the dependencies between passes can be resolved by the on a Relay program where the dependencies between passes can be resolved by the
pass infra. For more details about each type of these passes, please refer to pass infra. For more details about each type of these passes, please refer to
the `pass infra doc`_. the :ref:`relay-pass-infra`
This tutorial demonstrates how developers can use the Relay pass infra to perform This tutorial demonstrates how developers can use the Relay pass infra to perform
a certain optimization and create an optimization pipeline. a certain optimization and create an optimization pipeline.
.. _FunctionPass: https://docs.tvm.ai/api/python/relay/transform.html#tvm.relay.transform.FunctionPass
.. _ModulePass: https://docs.tvm.ai/api/python/relay/transform.html#tvm.relay.transform.ModulePass
.. _Sequential: https://docs.tvm.ai/api/python/relay/transform.html#tvm.relay.transform.Sequential
.. _pass infra doc: https://docs.tvm.ai/dev/relay_pass_infra.html
.. _ToANormalForm: https://docs.tvm.ai/api/python/relay/transform.html#tvm.relay.transform.ToANormalForm
""" """
import numpy as np import numpy as np
...@@ -130,27 +121,27 @@ mod = relay.transform.FuseOps(fuse_opt_level=0)(mod) ...@@ -130,27 +121,27 @@ mod = relay.transform.FuseOps(fuse_opt_level=0)(mod)
print(mod) print(mod)
############################################################################### ###############################################################################
# Use `Sequential`_ to Apply a Sequence of Passes # Use Sequential to Apply a Sequence of Passes
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Applying passes as above is actually tedious and it may require users to have # Applying passes as above is actually tedious and it may require users to have
# better understanding about the dependencies between them. For example, fusion # better understanding about the dependencies between them. For example, fusion
# currently doesn't work well on let bindings. Therefore, we would not be able # currently doesn't work well on let bindings. Therefore, we would not be able
# to fuse operators that were fusable if `ToANormalForm`_ is applied before # to fuse operators that were fusable if :py:func:`relay.transform.ToANormalForm` is applied before
# fusion, as this pass generates let bindings for each expression to # fusion, as this pass generates let bindings for each expression to
# canonicalize a Relay program. # canonicalize a Relay program.
# #
# Relay, hence, provides `Sequential`_ to alleviate developers from handling # Relay, hence, provides :py:class:`tvm.relay.transform.Sequential` to alleviate developers from handling
# these issues explicitly by specifying the required passes of each pass and # these issues explicitly by specifying the required passes of each pass and
# packing them as a whole to execute. For example, the same passes can now be # packing them as a whole to execute. For example, the same passes can now be
# applied using the sequential style as the following. `Sequential`_ is # applied using the sequential style as the following. :py:class:`tvm.relay.transform.Sequential` is
# similar to `torch.nn.sequential <https://pytorch.org/docs/stable/nn.html#torch.nn.Sequential>`_ # similar to `torch.nn.sequential <https://pytorch.org/docs/stable/nn.html#torch.nn.Sequential>`_
# and `mxnet.gluon.block <https://mxnet.incubator.apache.org/api/python/docs/_modules/mxnet/gluon/block.html>`_. # and `mxnet.gluon.block <https://mxnet.incubator.apache.org/api/python/docs/_modules/mxnet/gluon/block.html>`_.
# For example, `torch.nn.sequential` is used to contain a sequence of PyTorch # For example, `torch.nn.sequential` is used to contain a sequence of PyTorch
# `Modules` that will be added to build a network. It focuses on the network # `Modules` that will be added to build a network. It focuses on the network
# layers. Instead, the `Sequential`_ in our pass infra works on the optimizing # layers. Instead, the :py:class:`tvm.relay.transform.Sequential` in our pass infra works on the optimizing
# pass. # pass.
# Now let's execute some passes through `Sequential`_ # Now let's execute some passes through :py:class:`tvm.relay.transform.Sequential`
f = example() f = example()
mod = tvm.IRModule.from_expr(f) mod = tvm.IRModule.from_expr(f)
# Glob the interested passes. # Glob the interested passes.
...@@ -165,7 +156,8 @@ print(mod1) ...@@ -165,7 +156,8 @@ print(mod1)
# identical addition operations. This is because `EliminateCommonSubexpr` # identical addition operations. This is because `EliminateCommonSubexpr`
# was not actually performed. The reason is because only the passes that have # was not actually performed. The reason is because only the passes that have
# optimization level less or equal to 2 will be executed by default under # optimization level less or equal to 2 will be executed by default under
# `Sequential`_. The pass infra, however, provides a configuration interface # :py:class:`tvm.relay.transform.Sequential`. The pass infra,
# however, provides a configuration interface
# for users to customize the optimization level that they want to execute. # for users to customize the optimization level that they want to execute.
with relay.build_config(opt_level=3): with relay.build_config(opt_level=3):
......
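The opt-level behaviour described in the comments above (passes whose optimization level exceeds the configured level are skipped, with 2 as the default) can be sketched with a toy model. `ToyPass` and `ToySequential` are hypothetical classes written for this illustration and are not the TVM pass infra; the "program" is just a list of expression strings.

```python
# Toy model only: ToyPass and ToySequential are NOT the TVM API.
class ToyPass:
    def __init__(self, name, opt_level, fn):
        self.name = name
        self.opt_level = opt_level
        self.fn = fn


class ToySequential:
    def __init__(self, passes):
        self.passes = list(passes)

    def __call__(self, program, opt_level=2):
        # Skip every pass whose level is above the configured level.
        for p in self.passes:
            if p.opt_level > opt_level:
                continue
            program = p.fn(program)
        return program


# Level-0 pass: strip whitespace from each expression.
normalize = ToyPass("normalize", 0, lambda prog: [e.replace(" ", "") for e in prog])
# Level-3 pass: drop repeated expressions, keeping first occurrences in order.
eliminate_common = ToyPass("eliminate_common", 3, lambda prog: list(dict.fromkeys(prog)))

seq = ToySequential([normalize, eliminate_common])
program = ["a + b", "a+b", "c"]

print(seq(program))               # default level 2: duplicates survive
print(seq(program, opt_level=3))  # level 3: duplicates are removed
```

The design point is that raising the level changes which passes run without changing the pass list itself.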
...@@ -304,7 +304,7 @@ tvm.testing.assert_allclose(c.asnumpy(), np.dot(a, b.T), rtol=1e-3) ...@@ -304,7 +304,7 @@ tvm.testing.assert_allclose(c.asnumpy(), np.dot(a, b.T), rtol=1e-3)
# For example, INT8 quantization on Intel CPUs uses tensorization # For example, INT8 quantization on Intel CPUs uses tensorization
# to invoke AVX instruction directly. # to invoke AVX instruction directly.
# It also enables TVM to compile to ASICs - # It also enables TVM to compile to ASICs -
# checkout `VTA <https://docs.tvm.ai/vta/index.html>`_ for details. # checkout :ref:`vta-index` for details.
# We also demonstrate how to use inline assembly importing, # We also demonstrate how to use inline assembly importing,
# which helps users inject asm easily into the schedule. # which helps users inject asm easily into the schedule.
# #
...@@ -15,6 +15,8 @@ ...@@ -15,6 +15,8 @@
# specific language governing permissions and limitations # specific language governing permissions and limitations
# under the License. # under the License.
""" """
.. _tutorial-tensor-expr-get-started:
Get Started with Tensor Expression Get Started with Tensor Expression
================================== ==================================
**Author**: `Tianqi Chen <https://tqchen.github.io>`_ **Author**: `Tianqi Chen <https://tqchen.github.io>`_
......
...@@ -141,7 +141,7 @@ def compile_network(env, target, model, start_pack, stop_pack): ...@@ -141,7 +141,7 @@ def compile_network(env, target, model, start_pack, stop_pack):
# Now we can register our devices to the tracker. The first step is to # Now we can register our devices to the tracker. The first step is to
# build the TVM runtime for the Pynq devices. # build the TVM runtime for the Pynq devices.
# #
# Follow `this section <https://docs.tvm.ai/vta/install.html#pynq-side-rpc-server-build-deployment>`_ # Follow :ref:`vta-index`
# to build the TVM runtime on the device. Then register the device to the tracker with: # to build the TVM runtime on the device. Then register the device to the tracker with:
# #
# .. code-block:: bash # .. code-block:: bash
......
...@@ -245,7 +245,7 @@ m.set_input(**params) ...@@ -245,7 +245,7 @@ m.set_input(**params)
m.set_input('data', image) m.set_input('data', image)
# Perform inference and gather execution statistics # Perform inference and gather execution statistics
# More on: https://docs.tvm.ai/api/python/module.html#tvm.runtime.Module.time_evaluator # More on: :py:meth:`tvm.runtime.Module.time_evaluator`
num = 4 # number of times we run module for a single measurement num = 4 # number of times we run module for a single measurement
rep = 3 # number of measurements (we derive std dev from this) rep = 3 # number of measurements (we derive std dev from this)
timer = m.module.time_evaluator("run", ctx, number=num, repeat=rep) timer = m.module.time_evaluator("run", ctx, number=num, repeat=rep)
......
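The roles of `number` and `repeat` in the snippet above (runs per measurement, and number of measurements used for the standard deviation) can be illustrated with the standard library alone. This is only an analogy built on `timeit.repeat`; the helper `measure` and the workload are assumptions for illustration and do not call the TVM `time_evaluator`.

```python
import statistics
import timeit


def measure(fn, number, repeat):
    """Time fn: each of `repeat` measurements runs fn `number` times."""
    totals = timeit.repeat(fn, number=number, repeat=repeat)
    # Convert each measurement to an average cost per single run.
    per_run = [total / number for total in totals]
    return statistics.mean(per_run), statistics.stdev(per_run)


num = 4  # number of runs inside a single measurement
rep = 3  # number of measurements (std dev is derived from these)
mean_s, std_s = measure(lambda: sum(range(1000)), number=num, repeat=rep)
print("mean per run: %.3e s, std dev: %.3e s" % (mean_s, std_s))
```

Averaging over `number` runs smooths per-call noise, while `repeat` gives several independent samples so that a spread can be reported.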
...@@ -270,7 +270,7 @@ m.set_input('data', data) ...@@ -270,7 +270,7 @@ m.set_input('data', data)
m.set_input(**params) m.set_input(**params)
# Perform inference and gather execution statistics # Perform inference and gather execution statistics
# More on: https://docs.tvm.ai/api/python/module.html#tvm.runtime.Module.time_evaluator # More on: :py:meth:`tvm.runtime.Module.time_evaluator`
num = 4 # number of times we run module for a single measurement num = 4 # number of times we run module for a single measurement
rep = 3 # number of measurements (we derive std dev from this) rep = 3 # number of measurements (we derive std dev from this)
timer = m.module.time_evaluator("run", ctx, number=num, repeat=rep) timer = m.module.time_evaluator("run", ctx, number=num, repeat=rep)
......