Commit ef909df1 by Logan Weber Committed by Tianqi Chen

Implementation of uTVM (#3227)

* uTVM interfaces (#14)

* some minor interface changes

* implemented HostLowLevelDevice

* added MicroDeviceAPI

* implemented micro_common and added Python interfaces

* current status, semi implemented micro session

* added micro_common implementation and python interfaces (#18)

* added micro_common implementation and python interfaces (#18)

* current status, semi implemented

* host test working

* updated interfaces for MicroSession arguments allocation

* make somewhat lint compatible

* fix based on comments

* added rounding macro

* fix minor bug

* improvements based on comments

* Clean up `binutil.py` and make Python-3-compatible

* Change argument allocation design

* Address feedback and lint errors

* Improve binutil tests

* Simplify allocator (per @tqchen's suggestions)

* Doc/style fixes

* farts

* mcgee

* rodata section werks

(and so does `test_runtime_micro_workspace.py`)

* simple graph runtime werk

* TEMP

* ResNet works, yo

* First round of cleanup

* More cleanup

* runs a dyson over the code

* Another pass

* Fix `make lint` issues

* ready to pr... probably

* final

* Undo change

* Fix rebase resolution

* Minor fixes

* Undo changes to C codegen tests

* Add `obj_path` in `create_micro_lib`

* TEMP

* Address feedback

* Add missing TODO

* Partially address feedback

* Fix headers

* Switch to enum class for `SectionKind`

* Add missing ASF header

* Fix lint

* Fix lint again

* Fix lint

* Kill lint warnings

* Address feedback

* Change Python interface to MicroTVM

All interaction with the device is now through `Session` objects, which
are used through Python's `with` blocks.

* Reorder LowLevelDevice interface

* Store shared ptr to session in all alloced objects

* Move helper functions out of `tvm.micro`

* Switch static char arr to vector

* Improve general infra and code quality

Does not yet address all of tqchen's feedback

* Forgot a rename

* Fix lint

* Add ASF header

* Fix lint

* Partially address MarisaKirisame's feedback

* Lint

* Expose `MicroSession` as a node to Python

* Revert to using `Session` constructor

* Fix compiler error

* (Maybe) fix CI error

* Debugging

* Remove

* Quell lint

* Switch to stack-based session contexts

* Make uTVM less intrusive to host codegen

And use SSA for operands of generated ternary operators

* Inline UTVMArgs into UTVMTask struct

* Remove `HostLowLevelDevice` header

* Remove `BaseAddr` class

* Address feedback

* Add "utvm" prefix to global vars in runtime

* Fix lint

* Fix CI

* Fix `test_binutil.py`

* Fix submodules

* Remove ResNet tests

* Make `test_binutil.py` work with nose

* Fix CI

* I swear this actually fixes the binutil tests

* lint

* lint

* Add fcompile-compatible cross-compile func

* Add docs for uTVM runtime files

* Move pointer patching into `MicroSession`

* Fix lint

* First attempt at unifying cross-compile APIs

* Fix lint

* Rename `cross_compile` back to `cc`

* Address feedback

* Remove commented code

* Lint

* Figure out failing function

* Remove debugging code

* Change "micro_dev" target to "micro"

* Add checks in tests for whether uTVM is enabled

* Add TODO for 32-bit support

* Rename more "micro_dev" to "micro"

* Undo rename

We already have `tvm.micro` as a namespace.  Can't have it as a method
as well.

* Fix failing CI

Thanks to @tqchen for finding this bug.  Emitting ternary operators for
`min` and `max` causes concurrency bugs in CUDA, so we're moving the
ternary op emissions from `CodeGenC` to `CodeGenCHost`.

* Address feedback

* Fix lint
parent 443d023b
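The commit message says device interaction goes through `Session` objects used in `with` blocks, and that session contexts became stack-based. A minimal sketch of that pattern follows. `ToySession` and its `current()` helper are hypothetical stand-ins for illustration and are not the `tvm.micro` API.

```python
class ToySession:
    """Hypothetical stand-in for a device session (not the tvm.micro API)."""

    # Active sessions, innermost last. Entering pushes, exiting pops.
    _stack = []

    def __init__(self, name):
        self.name = name

    def __enter__(self):
        ToySession._stack.append(self)
        return self  # lets `with ... as sess` bind the session object

    def __exit__(self, exc_type, exc_value, exc_traceback):
        ToySession._stack.pop()
        return False  # never swallow exceptions raised inside the block

    @classmethod
    def current(cls):
        """Return the innermost active session, or None outside any block."""
        return cls._stack[-1] if cls._stack else None


with ToySession("outer") as outer:
    assert ToySession.current() is outer
    with ToySession("inner") as inner:
        assert ToySession.current() is inner
    # Leaving the inner block restores the outer session.
    assert ToySession.current() is outer
```

With a stack, nested `with` blocks restore the previously active session on exit instead of clearing a single global.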
@@ -36,6 +36,7 @@ tvm_option(USE_RELAY_DEBUG "Building Relay in debug mode..." OFF)
 tvm_option(USE_SGX "Build with SGX" OFF)
 tvm_option(USE_RTTI "Build with RTTI" ON)
 tvm_option(USE_MSVC_MT "Build with MT" OFF)
+tvm_option(USE_MICRO "Build with Micro" OFF)
 tvm_option(INSTALL_DEV "Install compiler infrastructure" OFF)
 tvm_option(HIDE_PRIVATE_SYMBOLS "Compile with -fvisibility=hidden." OFF)
@@ -206,6 +207,7 @@ include(cmake/modules/Metal.cmake)
 include(cmake/modules/ROCM.cmake)
 include(cmake/modules/SGX.cmake)
 include(cmake/modules/LLVM.cmake)
+include(cmake/modules/Micro.cmake)
 include(cmake/modules/ANTLR.cmake)
 include(cmake/modules/contrib/BLAS.cmake)
 include(cmake/modules/contrib/Random.cmake)
......
@@ -62,6 +62,9 @@ set(USE_VULKAN OFF)
 # Whether enable OpenGL runtime
 set(USE_OPENGL OFF)
+# Whether enable MicroTVM runtime
+set(USE_MICRO OFF)
 # Whether to enable SGX runtime
 #
 # Possible values for USE_SGX:
......
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
if(USE_MICRO)
message(STATUS "Build with Micro support")
file(GLOB RUNTIME_MICRO_SRCS src/runtime/micro/*.cc)
list(APPEND RUNTIME_SRCS ${RUNTIME_MICRO_SRCS})
endif(USE_MICRO)
@@ -81,6 +81,7 @@ typedef enum {
 kDLAOCL = 5,
 kDLSDAccel = 6,
 kOpenGL = 11,
+kDLMicroDev = 13,
 // AddExtraTVMType which is not in DLPack here
 } TVMDeviceExtType;
......
@@ -215,6 +215,7 @@ inline const char* DeviceName(int type) {
 case kDLROCM: return "rocm";
 case kOpenGL: return "opengl";
 case kDLExtDev: return "ext_dev";
+case kDLMicroDev: return "micro_dev";
 default: LOG(FATAL) << "unknown type =" << type; return "Unknown";
 }
 }
......
@@ -42,7 +42,7 @@ from . import datatype
 from . import ndarray as nd
 from .ndarray import context, cpu, gpu, opencl, cl, vulkan, metal, mtl
-from .ndarray import vpi, rocm, opengl, ext_dev
+from .ndarray import vpi, rocm, opengl, ext_dev, micro_dev
 from ._ffi.runtime_ctypes import TypeCode, TVMType
 from ._ffi.ndarray import TVMContext
......
@@ -143,6 +143,7 @@ class TVMContext(ctypes.Structure):
 10: 'rocm',
 11: 'opengl',
 12: 'ext_dev',
+13: 'micro_dev',
 }
 STR2MASK = {
 'llvm': 1,
@@ -163,6 +164,7 @@ class TVMContext(ctypes.Structure):
 'rocm': 10,
 'opengl': 11,
 'ext_dev': 12,
+'micro_dev': 13,
 }
 def __init__(self, device_type, device_id):
 super(TVMContext, self).__init__()
......
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
"""Utilities for binary file manipulation"""
import os
import subprocess
from . import util
from .._ffi.base import py_str
from ..api import register_func
@register_func("tvm_callback_get_section_size")
def tvm_callback_get_section_size(binary_path, section_name, toolchain_prefix):
"""Finds size of the section in the binary.
Assumes `size` shell command exists (typically works only on Linux machines)
Parameters
----------
binary_path : str
path of the binary file
section_name : str
name of section
toolchain_prefix : str
prefix for binary names in target compiler toolchain
Returns
-------
size : integer
size of the section in bytes
"""
if not os.path.isfile(binary_path):
raise RuntimeError("no such file \"{}\"".format(binary_path))
# We use the "-A" flag here to get the ".rodata" section's size, which is
# not included by default.
size_proc = subprocess.Popen(
["{}size".format(toolchain_prefix), "-A", binary_path], stdout=subprocess.PIPE)
(size_output, _) = size_proc.communicate()
size_output = size_output.decode("utf-8")
if size_proc.returncode != 0:
msg = "error in finding section size:\n"
msg += size_output
raise RuntimeError(msg)
# TODO(weberlo): Refactor this method and `*relocate_binary` so they are
# both aware of [".bss", ".sbss", ".sdata"] being relocated to ".bss".
section_mapping = {
".text": [".text"],
".rodata": [".rodata"],
".data": [".data", ".sdata"],
".bss": [".bss", ".sbss"],
}
sections_to_sum = section_mapping["." + section_name]
section_size = 0
# Skip the first two header lines in the `size` output.
for line in size_output.split("\n")[2:]:
tokens = list(filter(lambda s: len(s) != 0, line.split(" ")))
if len(tokens) != 3:
continue
entry_name = tokens[0]
entry_size = int(tokens[1])
if entry_name in sections_to_sum:
section_size += entry_size
return section_size
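The parsing loop in `tvm_callback_get_section_size` can be run without a toolchain by feeding it canned text. The sketch below follows the same steps: skip two header lines, keep three-column rows, and sum the mapped sections. The sample output and its sizes are invented for illustration, and `sum_section_size` is a hypothetical name.

```python
# Same grouping as the callback: small-data sections are counted with
# their larger counterparts.
SECTION_MAPPING = {
    ".text": [".text"],
    ".rodata": [".rodata"],
    ".data": [".data", ".sdata"],
    ".bss": [".bss", ".sbss"],
}

# Invented sample in the System V layout printed by `size -A`.
SAMPLE_SIZE_OUTPUT = "\n".join([
    "utvm_runtime.obj  :",
    "section     size   addr",
    ".text        100      0",
    ".rodata       24    100",
    ".data          8    200",
    ".sdata         4    208",
    ".bss          16    300",
    ".sbss          2    316",
    "Total        154",
])


def sum_section_size(size_output, section_name):
    """Sum the sizes of all entries that map to `section_name`."""
    sections_to_sum = SECTION_MAPPING["." + section_name]
    total = 0
    # The first two lines are the file name and the column header.
    for line in size_output.split("\n")[2:]:
        tokens = line.split()
        if len(tokens) != 3:
            continue  # e.g. the "Total" row has only two columns
        entry_name, entry_size = tokens[0], int(tokens[1])
        if entry_name in sections_to_sum:
            total += entry_size
    return total


data_size = sum_section_size(SAMPLE_SIZE_OUTPUT, "data")  # .data + .sdata
bss_size = sum_section_size(SAMPLE_SIZE_OUTPUT, "bss")    # .bss + .sbss
```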
@register_func("tvm_callback_relocate_binary")
def tvm_callback_relocate_binary(
binary_path, text_addr, rodata_addr, data_addr, bss_addr, toolchain_prefix):
"""Relocates sections in the binary to new addresses
Parameters
----------
binary_path : str
path of the binary file
text_addr : str
text section absolute address
rodata_addr : str
rodata section absolute address
data_addr : str
data section absolute address
bss_addr : str
bss section absolute address
toolchain_prefix : str
prefix for binary names in target compiler toolchain
Returns
-------
rel_bin : bytearray
the relocated binary
"""
tmp_dir = util.tempdir()
rel_obj_path = tmp_dir.relpath("relocated.o")
ld_script_contents = ""
# TODO(weberlo): There should be a better way to configure this for different archs.
if "riscv" in toolchain_prefix:
ld_script_contents += "OUTPUT_ARCH( \"riscv\" )\n\n"
# TODO(weberlo): Generate the script in a more procedural manner.
ld_script_contents += """
SECTIONS
{
. = %s;
. = ALIGN(8);
.text :
{
*(.text)
. = ALIGN(8);
*(.text*)
}
. = %s;
. = ALIGN(8);
.rodata :
{
*(.rodata)
. = ALIGN(8);
*(.rodata*)
}
. = %s;
. = ALIGN(8);
.data :
{
*(.data)
. = ALIGN(8);
*(.data*)
. = ALIGN(8);
*(.sdata)
}
. = %s;
. = ALIGN(8);
.bss :
{
*(.bss)
. = ALIGN(8);
*(.bss*)
. = ALIGN(8);
*(.sbss)
}
}
""" % (text_addr, rodata_addr, data_addr, bss_addr)
rel_ld_script_path = tmp_dir.relpath("relocated.lds")
with open(rel_ld_script_path, "w") as f:
f.write(ld_script_contents)
ld_proc = subprocess.Popen(["{}ld".format(toolchain_prefix), binary_path,
"-T", rel_ld_script_path,
"-o", rel_obj_path],
stdout=subprocess.PIPE,
stderr=subprocess.STDOUT)
(out, _) = ld_proc.communicate()
if ld_proc.returncode != 0:
msg = "linking error using ld:\n"
msg += py_str(out)
raise RuntimeError(msg)
with open(rel_obj_path, "rb") as f:
rel_bin = bytearray(f.read())
return rel_bin
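The TODO in `tvm_callback_relocate_binary` asks for the linker script to be generated more procedurally. One possible shape is sketched below as an assumption, not the project's implementation. `make_ld_script` and the example addresses are hypothetical; the emitted layout mirrors the hand-written template (an `ALIGN(8)` before each section and between input patterns).

```python
def make_ld_script(sections, arch=None):
    """Build a linker script from (name, address, extra_inputs) tuples.

    `sections` is a list such as [(".data", "0x3000", [".sdata"])].
    `arch`, when given, is emitted as an OUTPUT_ARCH directive.
    """
    lines = []
    if arch is not None:
        lines.append('OUTPUT_ARCH( "%s" )' % arch)
        lines.append("")
    lines.append("SECTIONS")
    lines.append("{")
    for name, addr, extra_inputs in sections:
        lines.append("  . = %s;" % addr)
        lines.append("  . = ALIGN(8);")
        lines.append("  %s :" % name)
        lines.append("  {")
        # Exact name first, then the wildcard form, then any extras.
        patterns = [name, name + "*"] + list(extra_inputs)
        for i, pattern in enumerate(patterns):
            if i > 0:
                lines.append("    . = ALIGN(8);")
            lines.append("    *(%s)" % pattern)
        lines.append("  }")
    lines.append("}")
    return "\n".join(lines) + "\n"


# Hypothetical addresses, for illustration only.
script = make_ld_script([
    (".text", "0x1000", []),
    (".rodata", "0x2000", []),
    (".data", "0x3000", [".sdata"]),
    (".bss", "0x4000", [".sbss"]),
])
```

Keeping the section list as data would also let the size and relocation callbacks share one mapping, which is what the other TODO in this file asks for.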
@register_func("tvm_callback_read_binary_section")
def tvm_callback_read_binary_section(binary, section, toolchain_prefix):
"""Returns the contents of the specified section in the binary byte array
Parameters
----------
binary : bytearray
contents of the binary
section : str
type of section
toolchain_prefix : str
prefix for binary names in target compiler toolchain
Returns
-------
section_bin : bytearray
contents of the read section
"""
tmp_dir = util.tempdir()
tmp_bin = tmp_dir.relpath("temp.bin")
tmp_section = tmp_dir.relpath("tmp_section.bin")
with open(tmp_bin, "wb") as out_file:
out_file.write(bytes(binary))
objcopy_proc = subprocess.Popen(["{}objcopy".format(toolchain_prefix), "--dump-section",
".{}={}".format(section, tmp_section),
tmp_bin],
stdout=subprocess.PIPE,
stderr=subprocess.STDOUT)
(out, _) = objcopy_proc.communicate()
if objcopy_proc.returncode != 0:
msg = "error in using objcopy:\n"
msg += py_str(out)
raise RuntimeError(msg)
if os.path.isfile(tmp_section):
# Get section content if it exists.
with open(tmp_section, "rb") as f:
section_bin = bytearray(f.read())
else:
# Return empty bytearray if the section does not exist.
section_bin = bytearray("", "utf-8")
return section_bin
@register_func("tvm_callback_get_symbol_map")
def tvm_callback_get_symbol_map(binary, toolchain_prefix):
"""Obtains a map of symbols to addresses in the passed binary
Parameters
----------
binary : bytearray
contents of the binary
toolchain_prefix : str
prefix for binary names in target compiler toolchain
Returns
-------
map_str : str
map of defined symbols to addresses, encoded as a series of
alternating newline-separated keys and values
"""
tmp_dir = util.tempdir()
tmp_obj = tmp_dir.relpath("tmp_obj.bin")
with open(tmp_obj, "wb") as out_file:
out_file.write(bytes(binary))
nm_proc = subprocess.Popen(["{}nm".format(toolchain_prefix), "-C", "--defined-only", tmp_obj],
stdout=subprocess.PIPE,
stderr=subprocess.STDOUT)
(nm_output, _) = nm_proc.communicate()
if nm_proc.returncode != 0:
msg = "error in using nm:\n"
msg += py_str(nm_output)
raise RuntimeError(msg)
nm_output = nm_output.decode("utf8").splitlines()
map_str = ""
for line in nm_output:
line = line.split()
map_str += line[2] + "\n"
map_str += line[0] + "\n"
return map_str
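The symbol map is returned as alternating newline-separated names and addresses. The sketch below shows that encoding and one way a consumer might decode it. The `nm` lines are invented, `decode_symbol_map` is a hypothetical helper, and how the C++ side actually parses the string is not shown in this file.

```python
def encode_symbol_map(nm_lines):
    """Encode `nm` rows ("<addr> <type> <name>") as name/address pairs."""
    map_str = ""
    for line in nm_lines:
        fields = line.split()
        map_str += fields[2] + "\n"  # symbol name
        map_str += fields[0] + "\n"  # hexadecimal address
    return map_str


def decode_symbol_map(map_str):
    """Turn the alternating name/address string back into a dict."""
    fields = map_str.split("\n")
    if fields and fields[-1] == "":
        fields.pop()  # the trailing newline leaves one empty field
    return {
        fields[i]: int(fields[i + 1], 16)
        for i in range(0, len(fields), 2)
    }


# Invented `nm --defined-only` rows, for illustration only.
sample_nm = [
    "0000000000000010 T main",
    "0000000000000200 D utvm_task",
]
symbols = decode_symbol_map(encode_symbol_map(sample_nm))
```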
@@ -14,7 +14,7 @@
 # KIND, either express or implied. See the License for the
 # specific language governing permissions and limitations
 # under the License.
-"""Util to invoke c++ compilers in the system."""
+"""Util to invoke C/C++ compilers in the system."""
 # pylint: disable=invalid-name
 from __future__ import absolute_import as _abs
 import sys
@@ -24,11 +24,10 @@ import os
 from .._ffi.base import py_str
 from .util import tempdir
 def create_shared(output,
 objects,
 options=None,
-cc="g++"):
+compile_cmd="g++"):
 """Create shared library.
 Parameters
@@ -36,17 +35,17 @@ def create_shared(output,
 output : str
 The target shared library.
-objects : list
+objects : List[str]
 List of object files.
-options : list
+options : List[str]
 The list of additional options string.
-cc : str, optional
-The compile string.
+compile_cmd : Optional[str]
+The compiler command.
 """
 if sys.platform == "darwin" or sys.platform.startswith("linux"):
-_linux_shared(output, objects, options, cc)
+_linux_compile(output, objects, options, compile_cmd)
 elif sys.platform == "win32":
 _windows_shared(output, objects, options)
 else:
@@ -56,40 +55,44 @@ def create_shared(output,
 # assign so as default output format
 create_shared.output_format = "so" if sys.platform != "win32" else "dll"
-def cross_compiler(cc, options=None, output_format="so"):
+def cross_compiler(compile_func, base_options=None, output_format="so"):
 """Create a cross compiler function.
 Parameters
 ----------
-cc : str
-The cross compiler name.
+compile_func : Callable[[str, str, Optional[str]], None]
+Function that performs the actual compilation
-options : list, optional
+options : Optional[List[str]]
 List of additional optional string.
-output_format : str, optional
+output_format : Optional[str]
 Library output format.
 Returns
 -------
-fcompile : function
+fcompile : Callable[[str, str, Optional[str]], None]
 A compilation function that can be passed to export_library.
 """
-def _fcompile(outputs, objects, opts=None):
-opts = opts if opts else []
-if options:
-opts += options
-_linux_shared(outputs, objects, opts, cc=cc)
+if base_options is None:
+base_options = []
+def _fcompile(outputs, objects, options=None):
+all_options = base_options
+if options is not None:
+all_options += options
+compile_func(outputs, objects, options=all_options)
 _fcompile.output_format = output_format
 return _fcompile
-def _linux_shared(output, objects, options, cc="g++"):
-cmd = [cc]
+def _linux_compile(output, objects, options, compile_cmd="g++"):
+cmd = [compile_cmd]
+if output.endswith(".so") or output.endswith(".dylib"):
 cmd += ["-shared", "-fPIC"]
 if sys.platform == "darwin":
 cmd += ["-undefined", "dynamic_lookup"]
+elif output.endswith(".obj"):
+cmd += ["-c"]
 cmd += ["-o", output]
 if isinstance(objects, str):
 cmd += [objects]
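In the new `_fcompile`, `all_options = base_options` followed by `all_options += options` extends the captured list in place, so per-call options would accumulate across calls. The sketch below isolates that behavior and shows a copying variant. Both factory functions are hypothetical reductions of the closure that return the option list instead of invoking a compiler.

```python
def make_fcompile_aliasing(base_options=None):
    """Reduced form of the closure: the base list is shared between calls."""
    if base_options is None:
        base_options = []

    def _fcompile(options=None):
        all_options = base_options  # same list object, not a copy
        if options is not None:
            all_options += options  # list.__iadd__ mutates base_options
        return all_options

    return _fcompile


def make_fcompile_copying(base_options=None):
    """Variant that copies, so every call starts from the base options."""
    base_options = list(base_options or [])

    def _fcompile(options=None):
        all_options = list(base_options)
        if options is not None:
            all_options += options
        return all_options

    return _fcompile


aliasing = make_fcompile_aliasing(["-O2"])
first = list(aliasing(["-g"]))   # snapshot after the first call
second = list(aliasing(["-g"]))  # "-g" has now been appended twice

copying = make_fcompile_copying(["-O2"])
third = copying(["-g"])
fourth = copying(["-g"])
```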
......
"""uTVM module for bare-metal backends.
uTVM (or the micro backend) provides support for bare-metal devices.
Its targets currently include a host-emulated device, which is used for testing,
and a JTAG-based OpenOCD device, which allows actual interfacing with microdevices.
"""
from ..contrib import binutil
from .base import Session, cross_compiler, create_micro_lib
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
"""Base definitions for micro."""
from __future__ import absolute_import
import logging
import os
import sys
from tvm.contrib import util as _util
from tvm.contrib import cc as _cc
from .._ffi.function import _init_api
from .._ffi.libinfo import find_include_path
SUPPORTED_DEVICE_TYPES = ["host"]
class Session:
"""MicroTVM Device Session
Parameters
----------
device_type : str
type of low-level device
toolchain_prefix : str
toolchain prefix to be used. For example, a prefix of
"riscv64-unknown-elf-" means "riscv64-unknown-elf-gcc" is used as
the compiler and "riscv64-unknown-elf-ld" is used as the linker,
etc.
Example
--------
.. code-block:: python
c_mod = ... # some module generated with "c" as the target
device_type = "host"
toolchain_prefix = ""
with tvm.micro.Session(device_type, toolchain_prefix) as sess:
sess.create_micro_mod(c_mod)
"""
def __init__(self, device_type, toolchain_prefix):
if device_type not in SUPPORTED_DEVICE_TYPES:
raise RuntimeError("unknown micro device type \"{}\"".format(device_type))
self._check_system()
# First, find and compile runtime library.
runtime_src_path = os.path.join(_get_micro_device_dir(), "utvm_runtime.c")
tmp_dir = _util.tempdir()
runtime_obj_path = tmp_dir.relpath("utvm_runtime.obj")
create_micro_lib(
runtime_obj_path, runtime_src_path, toolchain_prefix, include_dev_lib_header=False)
self.module = _CreateSession(device_type, runtime_obj_path, toolchain_prefix)
self._enter = self.module["enter"]
self._exit = self.module["exit"]
def _check_system(self):
"""Check if the user's system is supported by MicroTVM.
Raises error if not supported.
"""
if not sys.platform.startswith("linux"):
raise RuntimeError("microTVM is currently only supported on Linux")
# TODO(weberlo): Add 32-bit support.
# It's primarily the compilation pipeline that isn't compatible.
if sys.maxsize <= 2**32:
raise RuntimeError("microTVM is currently only supported on 64-bit platforms")
def __enter__(self):
self._enter()
return self
def __exit__(self, exc_type, exc_value, exc_traceback):
self._exit()
def _get_micro_device_dir():
"""Get directory path for uTVM runtime source files.
Return
------
micro_device_dir : str
directory path
"""
micro_dir = os.path.dirname(os.path.realpath(os.path.expanduser(__file__)))
micro_device_dir = os.path.join(micro_dir, "..", "..", "..",
"src", "runtime", "micro", "device")
return micro_device_dir
def cross_compiler(toolchain_prefix, include_dev_lib_header=True):
"""Creates a cross compile function that wraps `create_micro_lib`.
For use in `tvm.module.Module.export_library`.
Parameters
----------
toolchain_prefix : str
toolchain prefix to be used
include_dev_lib_header : Optional[bool]
whether to include the device library header containing definitions of
library functions.
Return
------
func : Callable[[str, str, Optional[str]], None]
cross compile function taking a destination path for the object file
and a path for the input source file.
Example
--------
.. code-block:: python
c_mod = ... # some module generated with "c" as the target
fcompile = tvm.micro.cross_compiler(toolchain_prefix="")
c_mod.export_library("dev_lib.obj", fcompile=fcompile)
"""
def compile_func(obj_path, src_path, **kwargs):
if isinstance(obj_path, list):
obj_path = obj_path[0]
if isinstance(src_path, list):
src_path = src_path[0]
create_micro_lib(obj_path, src_path, toolchain_prefix,
kwargs.get("options", None), include_dev_lib_header)
return _cc.cross_compiler(compile_func)
def create_micro_lib(
obj_path, src_path, toolchain_prefix, options=None, include_dev_lib_header=True):
"""Compiles code into a binary for the target micro device.
Parameters
----------
obj_path : str
path to generated object file
src_path : str
path to source file
toolchain_prefix : str
toolchain prefix to be used
include_dev_lib_header : bool
whether to include the device library header containing definitions of
library functions.
"""
def replace_suffix(s, new_suffix):
if "." in os.path.basename(s):
# There already exists an extension.
return os.path.join(
os.path.dirname(s),
".".join(os.path.basename(s).split(".")[:-1] + [new_suffix]))
# No existing extension; we can just append.
return s + "." + new_suffix
# uTVM object files cannot have an ".o" suffix, because it triggers the
# code path for creating shared objects in `tvm.module.load`. So we replace
# ".o" suffixes with ".obj".
if obj_path.endswith(".o"):
logging.warning(
"\".o\" suffix in \"%s\" has been replaced with \".obj\"", obj_path)
obj_path = replace_suffix(obj_path, "obj")
options = list(options) if options is not None else []
options += ["-I" + path for path in find_include_path()]
options += ["-I{}".format(_get_micro_device_dir())]
options += ["-fno-stack-protector"]
if sys.maxsize > 2**32 and sys.platform.startswith("linux"):
# Only add this option if the host is a 64-bit Linux.
options += ["-mcmodel=large"]
compile_cmd = "{}gcc".format(toolchain_prefix)
if include_dev_lib_header:
# Create a temporary copy of the source, so we can inject the dev lib
# header without modifying the original.
tmp_dir = _util.tempdir()
temp_src_path = tmp_dir.relpath("temp.c")
with open(src_path, "r") as f:
src_lines = f.read().splitlines()
src_lines.insert(0, "#include \"utvm_device_dylib_redirect.c\"")
with open(temp_src_path, "w") as f:
f.write("\n".join(src_lines))
src_path = temp_src_path
_cc.create_shared(obj_path, src_path, options, compile_cmd)
_init_api("tvm.micro", "tvm.micro.base")
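`create_micro_lib` rewrites a trailing ".o" to ".obj" before compiling. The sketch below reproduces that suffix handling so it can be run on its own. `normalize_obj_path` is a hypothetical wrapper name, and the paths are invented.

```python
import os


def replace_suffix(path, new_suffix):
    """Swap the last extension of `path`, or append one if none exists."""
    base = os.path.basename(path)
    if "." in base:
        # Drop only the final extension; earlier dots stay in the stem.
        stem = ".".join(base.split(".")[:-1])
        return os.path.join(os.path.dirname(path), stem + "." + new_suffix)
    return path + "." + new_suffix


def normalize_obj_path(obj_path):
    """Map "*.o" to "*.obj"; leave every other path unchanged."""
    if obj_path.endswith(".o"):
        return replace_suffix(obj_path, "obj")
    return obj_path


renamed = normalize_obj_path(os.path.join("build", "utvm_runtime.o"))
unchanged = normalize_obj_path("dev_lib.obj")
```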
@@ -189,6 +189,22 @@ def ext_dev(dev_id=0):
 return TVMContext(12, dev_id)
+def micro_dev(dev_id=0):
+"""Construct a micro device
+Parameters
+----------
+dev_id : int, optional
+The integer device id
+Returns
+-------
+ctx : TVMContext
+The created context
+"""
+return TVMContext(13, dev_id)
 cl = opencl
 mtl = metal
......
@@ -443,7 +443,7 @@ inline void PrintBinaryExpr(const T* op,
 }
 }
-inline void PrintBinaryIntrinsitc(const Call* op,
+inline void PrintBinaryIntrinsic(const Call* op,
 const char *opstr,
 std::ostream& os, // NOLINT(*)
 CodeGenC* p) {
@@ -528,20 +528,20 @@ void CodeGenC::VisitExpr_(const Call *op, std::ostream& os) { // NOLINT(*)
 }
 os << ")";
 } else if (op->is_intrinsic(Call::bitwise_and)) {
-PrintBinaryIntrinsitc(op, " & ", os, this);
+PrintBinaryIntrinsic(op, " & ", os, this);
 } else if (op->is_intrinsic(Call::bitwise_xor)) {
-PrintBinaryIntrinsitc(op, " ^ ", os, this);
+PrintBinaryIntrinsic(op, " ^ ", os, this);
 } else if (op->is_intrinsic(Call::bitwise_or)) {
-PrintBinaryIntrinsitc(op, " | ", os, this);
+PrintBinaryIntrinsic(op, " | ", os, this);
 } else if (op->is_intrinsic(Call::bitwise_not)) {
 CHECK_EQ(op->args.size(), 1U);
 os << "(~";
 this->PrintExpr(op->args[0], os);
 os << ')';
 } else if (op->is_intrinsic(Call::shift_left)) {
-PrintBinaryIntrinsitc(op, " << ", os, this);
+PrintBinaryIntrinsic(op, " << ", os, this);
 } else if (op->is_intrinsic(Call::shift_right)) {
-PrintBinaryIntrinsitc(op, " >> ", os, this);
+PrintBinaryIntrinsic(op, " >> ", os, this);
 } else if (op->is_intrinsic(intrinsic::tvm_if_then_else)) {
 os << "(";
 PrintExpr(op->args[0], os);
......
@@ -31,13 +31,13 @@ namespace tvm {
 namespace codegen {
 CodeGenCHost::CodeGenCHost() {
-module_name = GetUniqueName("__tvm_module_ctx");
+module_name_ = GetUniqueName("__tvm_module_ctx");
 }
 void CodeGenCHost::Init(bool output_ssa) {
 decl_stream << "#include \"tvm/runtime/c_runtime_api.h\"\n";
 decl_stream << "#include \"tvm/runtime/c_backend_api.h\"\n";
-decl_stream << "extern void* " << module_name << " = NULL;\n";
+decl_stream << "extern void* " << module_name_ << " = NULL;\n";
 CodeGenC::Init(output_ssa);
 }
@@ -154,12 +154,13 @@ void CodeGenCHost::VisitExpr_(const Broadcast* op, std::ostream& os) { // NOLI
 os << "))";
 }
-void CodeGenCHost::PrintGetFuncFromBackend(std::string func_name, std::string packed_func_name) {
+void CodeGenCHost::PrintGetFuncFromBackend(const std::string& func_name,
+const std::string& packed_func_name) {
 this->PrintIndent();
 this->stream << "if (" << packed_func_name << " == NULL) {\n";
 int packed_func_if_scope = this->BeginScope();
 this->PrintIndent();
-this->stream << "if (TVMBackendGetFuncFromEnv(" << module_name
+this->stream << "if (TVMBackendGetFuncFromEnv(" << module_name_
 << ", \"" << func_name << "\""
 << ", &" << packed_func_name << ") != 0) {\n";
 int get_func_env_scope = this->BeginScope();
@@ -173,7 +174,7 @@ void CodeGenCHost::PrintGetFuncFromBackend(std::string func_name, std::string pa
 this->stream << "}\n";
 }
-void CodeGenCHost::PrintFuncCall(std::string packed_func_name, int num_args) {
+void CodeGenCHost::PrintFuncCall(const std::string& packed_func_name, int num_args) {
 this->PrintIndent();
 std::string ret_val = GetUniqueName("ret_val");
 std::string ret_type_code = GetUniqueName("ret_type_code");
@@ -251,6 +252,29 @@ void CodeGenCHost::VisitStmt_(const AssertStmt *op) { // NOLINT(*)
 this->PrintStmt(op->body);
 }
+void CodeGenCHost::VisitExpr_(const Min *op, std::ostream& os) { // NOLINT(*)
+PrintTernaryCondExpr(op, "<", os);
+}
+void CodeGenCHost::VisitExpr_(const Max *op, std::ostream& os) { // NOLINT(*)
+PrintTernaryCondExpr(op, ">", os);
+}
+template <typename T>
+inline void CodeGenCHost::PrintTernaryCondExpr(const T* op,
+const char* compare,
+std::ostream& os) { // NOLINT(*)
+std::ostringstream temp_a;
+VisitExpr(op->a, temp_a);
+std::string a_id = SSAGetID(temp_a.str(), op->a.type());
+std::ostringstream temp_b;
+VisitExpr(op->b, temp_b);
+std::string b_id = SSAGetID(temp_b.str(), op->b.type());
+os << "((" << a_id << ") " << compare << " (" << b_id << ") "
+<< "? (" << a_id << ") : (" << b_id << "))";
+}
 runtime::Module BuildCHost(Array<LoweredFunc> funcs) {
 using tvm::runtime::Registry;
 bool output_ssa = false;
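`PrintTernaryCondExpr` binds each operand to an SSA identifier first, so the emitted ternary repeats a name instead of the operand expression. A small Python model of that idea follows. `TernaryEmitter`, its `_N` naming scheme, the fixed `int32_t` type, and reuse keyed on expression text are all assumptions for illustration and do not reproduce `SSAGetID`.

```python
class TernaryEmitter:
    """Toy model: bind operands to temporaries, then emit a C ternary."""

    def __init__(self):
        self.decls = []  # emitted temporary declarations, in order
        self._ids = {}   # expression text -> temporary name

    def ssa_id(self, expr, ctype="int32_t"):
        """Return a temporary holding `expr`, declaring it on first use."""
        if expr not in self._ids:
            name = "_%d" % (len(self._ids) + 1)
            self._ids[expr] = name
            self.decls.append("%s %s = %s;" % (ctype, name, expr))
        return self._ids[expr]

    def ternary(self, a, b, compare):
        a_id = self.ssa_id(a)
        b_id = self.ssa_id(b)
        # Each operand name appears twice, but its expression is
        # evaluated only once, in the declaration.
        return "((%s) %s (%s) ? (%s) : (%s))" % (a_id, compare, b_id, a_id, b_id)

    def emit_min(self, a, b):
        return self.ternary(a, b, "<")

    def emit_max(self, a, b):
        return self.ternary(a, b, ">")


emitter = TernaryEmitter()
min_expr = emitter.emit_min("f(x)", "y + 1")
max_expr = emitter.emit_max("f(x)", "z")
```

Without the temporaries, an operand such as `f(x)` would appear twice in the generated C and be evaluated twice.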
......
@@ -45,12 +45,30 @@ class CodeGenCHost final : public CodeGenC {
 // overload visitor functions
 void VisitExpr_(const Broadcast* op, std::ostream& os) final; // NOLINT(*)
 void VisitExpr_(const Call *op, std::ostream& os) final; // NOLINT(*)
+// overload min and max to use the ternary operator, so we don't rely on the
+// standard library implementations
+void VisitExpr_(const Min *op, std::ostream& os) final; // NOLINT(*)
+void VisitExpr_(const Max *op, std::ostream& os) final; // NOLINT(*)
 void VisitStmt_(const AssertStmt *op) final; // NOLINT(*)
 private:
-std::string module_name;
-void PrintGetFuncFromBackend(std::string func_name, std::string packed_func_name);
-void PrintFuncCall(std::string packed_func_name, int num_args);
+std::string module_name_;
+void PrintGetFuncFromBackend(const std::string& func_name, const std::string& packed_func_name);
+void PrintFuncCall(const std::string& packed_func_name, int num_args);
+/*!
+* \brief Print ternary conditional operator implementing binary `op`
+* Forces the operands to be in SSA form.
+* \param op binary operator being expressed
+* \param compare string representation of comparison operator
+* \param os stream reference to print into
+*/
+template <typename T>
+inline void PrintTernaryCondExpr(const T* op,
+const char* compare,
+std::ostream& os); // NOLINT(*)
 };
 } // namespace codegen
......
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
/*!
* Copyright (c) 2019 by Contributors
* \file utvm_device_dylib_redirect.cc
* \brief uTVM dynamic linking stubs
*
* This is a library that gets included in each uTVM library. We redirect
* each library call into a pre-defined global function pointer, and we patch
* the correct addresses of each function into the pointers when we load the
* library.
*/
#ifdef __cplusplus
extern "C" {
#endif
#include <stdint.h>
#include <stddef.h>
void *(*TVMBackendAllocWorkspace_)(int, int, uint64_t, int, int) =
(void *(*)(int, int, uint64_t, int, int)) NULL;
int (*TVMBackendFreeWorkspace_)(int, int, void*) = (int (*)(int, int, void*)) NULL;
void (*TVMAPISetLastError_)(const char*) = (void (*)(const char*)) NULL;
void* TVMBackendAllocWorkspace(int device_type, int device_id, uint64_t size,
int dtype_code_hint, int dtype_bits_hint) {
return (*TVMBackendAllocWorkspace_)(device_type, device_id, size, dtype_code_hint,
dtype_bits_hint);
}
int TVMBackendFreeWorkspace(int device_type, int device_id, void* ptr) {
return (*TVMBackendFreeWorkspace_)(device_type, device_id, ptr);
}
void TVMAPISetLastError(const char* msg) {
(*TVMAPISetLastError_)(msg);
}
#ifdef __cplusplus
} // TVM_EXTERN_C
#endif
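The redirect pattern described in the file comment can be sketched in isolation: a stub with the public name forwards through a global function pointer, and a loader patches that pointer afterwards. All names below (`DemoFree`, `DemoFree_`, `HostSideFree`, `PatchDemoPointers`) are hypothetical and only illustrate the idea; they are not part of the uTVM sources.

```cpp
#include <cassert>

// Global function pointer that a loader would patch after loading the library.
int (*DemoFree_)(void*) = nullptr;

// Stub carrying the public name: forwards every call through the pointer.
int DemoFree(void* ptr) {
  assert(DemoFree_ != nullptr);  // calling before patching is a bug
  return (*DemoFree_)(ptr);
}

// Stand-in for the real implementation that lives elsewhere.
static int HostSideFree(void* ptr) { return ptr == nullptr ? -1 : 0; }

// Stand-in for the load-time step that writes the real address.
void PatchDemoPointers() { DemoFree_ = &HostSideFree; }
```

After `PatchDemoPointers()` has run, `DemoFree(p)` behaves exactly like `HostSideFree(p)`; before that, the pointer is null and the stub must not be called.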
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
/*!
* Copyright (c) 2019 by Contributors
* \file utvm_runtime.cc
* \brief uTVM runtime
*
* All function calls go through `UTVMMain`, which reads from the current
* `UTVMTask` and calls the appropriate function with the arguments from the
* task.
*
* Additionally included in this file are definitions for some of the most
* common functions used in the C runtime API.
*/
#ifdef __cplusplus
extern "C" {
#endif
#include "utvm_runtime.h"
// Task pointers must be patched before calling a function.
UTVMTask task;
// These pointers are patched at load time to point to the workspace section.
char* utvm_workspace_begin = NULL; // NOLINT(*)
char* utvm_workspace_end = NULL; // NOLINT(*)
char* utvm_workspace_curr = NULL; // NOLINT(*)
// Keep track of how many active allocations there are on the workspace.
size_t utvm_num_active_allocs = 0;
const char* utvm_last_error = NULL; // NOLINT(*)
int32_t utvm_return_code = 0; // NOLINT(*)
// We use a dummy function to signal execution is finished for device
// backends which require breakpoints.
void UTVMDone() { }
void UTVMMain() {
utvm_workspace_curr = utvm_workspace_begin;
utvm_num_active_allocs = 0;
utvm_last_error = NULL; // NOLINT(*)
utvm_return_code = 0;
utvm_return_code = task.func((void*) task.arg_values, (void*) task.arg_type_codes, // NOLINT(*)
task.num_args);
UTVMDone();
}
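The single-entry-point design of `UTVMMain` can be illustrated with a simplified sketch: the host fills in a task record, the device runs one fixed function, and the host reads back a return code. `DemoTask`, `DemoMain`, and `SumInts` are hypothetical names, and the argument arrays are plain pointers here instead of `TVMValue` arrays.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified task record.
struct DemoTask {
  int32_t (*func)(void* args, void* type_codes, int32_t num_args);
  void* args;
  void* type_codes;
  int32_t num_args;
};

DemoTask demo_task;            // a host would fill this in before each call
int32_t demo_return_code = 0;  // a host would read this back afterwards

// Single entry point: run whatever the current task describes.
void DemoMain() {
  assert(demo_task.func != nullptr);
  demo_return_code = demo_task.func(demo_task.args, demo_task.type_codes,
                                    demo_task.num_args);
}

// Example function following the (args, type_codes, num_args) convention.
int32_t SumInts(void* args, void* /*type_codes*/, int32_t num_args) {
  int32_t* values = static_cast<int32_t*>(args);
  int32_t total = 0;
  for (int32_t i = 0; i < num_args; ++i) total += values[i];
  return total;
}
```

Because the entry point never changes, a host only needs to know two addresses (the task record and the entry function) to invoke any function on the device.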
void* TVMBackendAllocWorkspace(int device_type, int device_id, uint64_t size,
int dtype_code_hint, int dtype_bits_hint) {
// Align up to 8 bytes.
utvm_workspace_curr += (8 - ((uintptr_t) utvm_workspace_curr % 8)) % 8; // NOLINT(*)
if (utvm_workspace_curr + size > utvm_workspace_end) {
// Out of space in workspace.
return NULL;
}
void* ret_ptr = (void*) utvm_workspace_curr; // NOLINT(*)
utvm_workspace_curr += size;
utvm_num_active_allocs++;
return ret_ptr;
}
int TVMBackendFreeWorkspace(int device_type, int device_id, void* ptr) {
// `utvm_num_active_allocs` is unsigned, so check for zero before decrementing
// (a `< 0` test after the decrement can never be true).
if (utvm_num_active_allocs == 0) {
TVMAPISetLastError("free called with no active workspace allocations");
// Reset workspace (for future task executions).
utvm_workspace_curr = utvm_workspace_begin;
return -1;
}
utvm_num_active_allocs--;
if (utvm_num_active_allocs == 0) {
// No more allocations. Reset workspace.
utvm_workspace_curr = utvm_workspace_begin;
}
return 0;
}
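The workspace routines above amount to a bump allocator: each allocation aligns the cursor up to 8 bytes and advances it, and the region is reclaimed all at once when the count of active allocations drops to zero. A self-contained sketch of that scheme follows; `BumpWorkspace` is a hypothetical class, and because its counter is unsigned, the empty case is tested before decrementing.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical bump allocator over a caller-provided buffer.
class BumpWorkspace {
 public:
  BumpWorkspace(char* begin, std::size_t size)
      : begin_(begin), end_(begin + size), curr_(begin), active_(0) {
    assert(begin != nullptr);
  }

  // Returns an 8-byte-aligned block, or nullptr when out of space.
  void* Alloc(std::size_t size) {
    std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(curr_);
    std::size_t pad = (8 - addr % 8) % 8;
    std::size_t remaining = static_cast<std::size_t>(end_ - curr_);
    if (pad > remaining || size > remaining - pad) return nullptr;
    char* ret = curr_ + pad;
    curr_ = ret + size;
    ++active_;
    return ret;
  }

  // Returns -1 if nothing is allocated, 0 otherwise. The whole region is
  // reclaimed only when the last active allocation is freed.
  int Free(void* /*ptr*/) {
    if (active_ == 0) return -1;
    if (--active_ == 0) curr_ = begin_;
    return 0;
  }

  std::size_t used() const { return static_cast<std::size_t>(curr_ - begin_); }

 private:
  char* begin_;
  char* end_;
  char* curr_;
  std::size_t active_;
};
```

The trade-off is that individual frees do not return memory; space only becomes reusable once every outstanding allocation has been released.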
void TVMAPISetLastError(const char* msg) {
utvm_last_error = msg;
}
#ifdef __cplusplus
} // TVM_EXTERN_C
#endif
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
/*!
* Copyright (c) 2019 by Contributors
* \file utvm_runtime.h
* \brief uTVM runtime headers
*/
#ifndef TVM_RUNTIME_MICRO_DEVICE_UTVM_RUNTIME_H_
#define TVM_RUNTIME_MICRO_DEVICE_UTVM_RUNTIME_H_
#ifdef __cplusplus
extern "C" {
#endif
#include <stdint.h>
#include <tvm/runtime/c_runtime_api.h>
/*!
* \brief Task structure for uTVM
*/
typedef struct {
/*! \brief Pointer to function to call for this task */
int32_t (*func)(void*, void*, int32_t);
/*! \brief Array of argument values */
TVMValue* arg_values;
/*! \brief Array of type codes for each argument value */
int* arg_type_codes;
/*! \brief Number of arguments */
int32_t num_args;
} UTVMTask;
#ifdef __cplusplus
} // TVM_EXTERN_C
#endif
#endif // TVM_RUNTIME_MICRO_DEVICE_UTVM_RUNTIME_H_
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
/*!
* Copyright (c) 2019 by Contributors
* \file host_low_level_device.cc
* \brief emulated low-level micro device implementation on host machine
*/
#include <sys/mman.h>
#include <cstring>
#include <memory>
#include "micro_common.h"
#include "low_level_device.h"
namespace tvm {
namespace runtime {
/*!
* \brief emulated low-level device on host machine
*/
class HostLowLevelDevice final : public LowLevelDevice {
public:
/*!
* \brief constructor to initialize on-host memory region to act as device
* \param num_bytes size of the emulated on-device memory region
*/
explicit HostLowLevelDevice(size_t num_bytes) : size_(num_bytes) {
size_t size_in_pages = (num_bytes + kPageSize - 1) / kPageSize;
// TODO(weberlo): Set permissions per section (e.g., read-write perms for
// the heap, execute perms for text, etc.).
int mmap_prot = PROT_READ | PROT_WRITE | PROT_EXEC;
int mmap_flags = MAP_ANONYMOUS | MAP_PRIVATE;
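void* mem = mmap(nullptr, size_in_pages * kPageSize, mmap_prot, mmap_flags, -1, 0);
// mmap signals failure with MAP_FAILED rather than nullptr.
CHECK(mem != MAP_FAILED) << "unable to mmap " << num_bytes << " bytes for host device";
base_addr_ = reinterpret_cast<std::uintptr_t>(mem);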
}
/*!
* \brief destructor to deallocate on-host device region
*/
virtual ~HostLowLevelDevice() {
munmap(reinterpret_cast<void*>(base_addr_), size_);
}
void Read(DevBaseOffset offset, void* buf, size_t num_bytes) {
void* addr = ToDevPtr(offset).cast_to<void*>();
std::memcpy(buf, addr, num_bytes);
}
void Write(DevBaseOffset offset, const void* buf, size_t num_bytes) {
void* addr = ToDevPtr(offset).cast_to<void*>();
std::memcpy(addr, buf, num_bytes);
}
void Execute(DevBaseOffset func_offset, DevBaseOffset breakpoint) {
DevPtr func_addr = ToDevPtr(func_offset);
reinterpret_cast<void (*)(void)>(func_addr.value())();
}
std::uintptr_t base_addr() const final {
return base_addr_;
}
const char* device_type() const final {
return "host";
}
private:
/*! \brief base address of the micro device memory region */
std::uintptr_t base_addr_;
/*! \brief size of memory region */
size_t size_;
};
const std::shared_ptr<LowLevelDevice> HostLowLevelDeviceCreate(size_t num_bytes) {
std::shared_ptr<LowLevelDevice> lld =
std::make_shared<HostLowLevelDevice>(num_bytes);
return lld;
}
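The emulated device boils down to an anonymous memory mapping that is read and written by offset. A reduced, POSIX-only sketch is shown below; `DemoHostDevice` is a hypothetical class, it omits execute permission and the `Execute` method, and the 4096-byte page size is an assumption.

```cpp
#include <sys/mman.h>
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical, simplified emulated device (POSIX only, no execute support).
class DemoHostDevice {
 public:
  explicit DemoHostDevice(std::size_t num_bytes) {
    const std::size_t kPage = 4096;  // assumed page size
    // Round the request up to a whole number of pages.
    mapped_size_ = (num_bytes + kPage - 1) / kPage * kPage;
    void* mem = mmap(nullptr, mapped_size_, PROT_READ | PROT_WRITE,
                     MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
    assert(mem != MAP_FAILED);
    base_ = static_cast<char*>(mem);
  }
  ~DemoHostDevice() { munmap(base_, mapped_size_); }
  DemoHostDevice(const DemoHostDevice&) = delete;
  DemoHostDevice& operator=(const DemoHostDevice&) = delete;

  // Offsets are relative to the start of the mapping.
  void Write(std::size_t offset, const void* buf, std::size_t n) {
    assert(offset + n <= mapped_size_);
    std::memcpy(base_ + offset, buf, n);
  }
  void Read(std::size_t offset, void* buf, std::size_t n) const {
    assert(offset + n <= mapped_size_);
    std::memcpy(buf, base_ + offset, n);
  }
  std::size_t mapped_size() const { return mapped_size_; }

 private:
  char* base_;
  std::size_t mapped_size_;
};
```

A request for 5000 bytes, for instance, maps two pages (8192 bytes), and data written at an offset can be read back from the same offset.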
} // namespace runtime
} // namespace tvm
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
/*!
* Copyright (c) 2019 by Contributors
* \file low_level_device.h
* \brief Abstract low-level micro device management
*/
#ifndef TVM_RUNTIME_MICRO_LOW_LEVEL_DEVICE_H_
#define TVM_RUNTIME_MICRO_LOW_LEVEL_DEVICE_H_
#include <memory>
#include "micro_common.h"
namespace tvm {
namespace runtime {
/*!
* \brief virtual interface for low-level micro device management
*/
class LowLevelDevice {
public:
/*! \brief virtual destructor */
virtual ~LowLevelDevice() {}
/*!
* \brief reads num_bytes from device memory at base_addr + offset into buffer
* \param offset on-device memory offset pointer to be read from
* \param buffer on-host buffer to be read into
* \param num_bytes number of bytes to be read
*/
virtual void Read(DevBaseOffset offset,
void* buffer,
size_t num_bytes) = 0;
/*!
* \brief writes num_bytes from buffer to device memory at base_addr + offset
* \param offset on-device memory offset pointer to be written to
* \param buffer on-host buffer to be written
* \param num_bytes number of bytes to be written
*/
virtual void Write(DevBaseOffset offset,
const void* buffer,
size_t num_bytes) = 0;
/*!
* \brief starts execution of device at offset
* \param func_offset offset of the init stub function
* \param breakpoint breakpoint at which to stop function execution
*/
virtual void Execute(DevBaseOffset func_offset, DevBaseOffset breakpoint) = 0;
/*!
* \brief convert from base offset to absolute address
* \param offset base offset
*/
DevPtr ToDevPtr(DevBaseOffset offset) {
return DevPtr(base_addr() + offset.value());
}
/*!
* \brief convert from absolute address to base offset
* \param ptr absolute address
*/
DevBaseOffset ToDevOffset(DevPtr ptr) {
return DevBaseOffset(ptr.value() - base_addr());
}
/*!
* \brief getter function for low-level device type
* \return string containing device type
*/
virtual const char* device_type() const = 0;
protected:
/*!
* \brief getter function for base_addr
* \return the base address of the device memory region
*/
virtual std::uintptr_t base_addr() const = 0;
};
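The two conversion helpers in the interface above are inverses of each other: an absolute address is the base address plus an offset, and an offset is an absolute address minus the base. The sketch below restates that with hypothetical wrapper types (`AbsAddr`, `BaseOffset`) so the round trip can be checked on its own.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical strong types: distinct wrappers keep absolute addresses and
// base-relative offsets from being mixed up by accident.
struct AbsAddr { std::uintptr_t value; };
struct BaseOffset { std::uintptr_t value; };

// Absolute address = base + offset.
inline AbsAddr ToAbs(std::uintptr_t base, BaseOffset off) {
  return AbsAddr{base + off.value};
}

// Offset = absolute address - base.
inline BaseOffset ToOffset(std::uintptr_t base, AbsAddr addr) {
  assert(addr.value >= base);
  return BaseOffset{addr.value - base};
}
```

With a base of `0x20000000`, offset `0x40` maps to address `0x20000040`, and converting that address back gives `0x40` again.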
/*!
* \brief create a host low-level device
* \param num_bytes size of the memory region
*/
const std::shared_ptr<LowLevelDevice> HostLowLevelDeviceCreate(size_t num_bytes);
} // namespace runtime
} // namespace tvm
#endif // TVM_RUNTIME_MICRO_LOW_LEVEL_DEVICE_H_
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
/*!
* Copyright (c) 2019 by Contributors
* \file micro_common.cc
* \brief common utilities for uTVM
*/
#include <tvm/runtime/c_runtime_api.h>
#include <tvm/runtime/registry.h>
#include <cstdio>
#include <string>
#include <sstream>
#include <cstdint>
#include "micro_session.h"
#include "micro_common.h"
#include "low_level_device.h"
namespace tvm {
namespace runtime {
size_t GetDefaultSectionSize(SectionKind kind) {
switch (kind) {
case SectionKind::kText:
return 0xF0000;
case SectionKind::kRodata:
return 0xF000;
case SectionKind::kData:
return 0xF00;
case SectionKind::kBss:
return 0xF00;
case SectionKind::kArgs:
return 0xF00000;
case SectionKind::kStack:
return 0xF000;
case SectionKind::kHeap:
return 0xF000000;
case SectionKind::kWorkspace:
return 0xF000000;
default:
LOG(FATAL) << "invalid section " << static_cast<size_t>(kind);
return 0;
}
}
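To get a feel for the default sizes, the sketch below places the eight sections back to back in `SectionKind` order and reports where each one starts. The sizes are copied from `GetDefaultSectionSize`; the back-to-back layout, the start offset, and the names `kDemoSizes` and `LayOutSections` are assumptions made for illustration, not a description of how the session actually lays out memory.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Default sizes copied from GetDefaultSectionSize, in SectionKind order:
// text, rodata, data, bss, args, stack, heap, workspace.
constexpr std::array<std::size_t, 8> kDemoSizes = {
    0xF0000, 0xF000, 0xF00, 0xF00, 0xF00000, 0xF000, 0xF000000, 0xF000000};

// Hypothetical layout: place sections back to back starting at `start`.
// Returns each section's start offset; `*total_end` receives the end offset.
std::array<std::size_t, 8> LayOutSections(std::size_t start,
                                          std::size_t* total_end) {
  assert(total_end != nullptr);
  std::array<std::size_t, 8> starts{};
  std::size_t curr = start;
  for (std::size_t i = 0; i < kDemoSizes.size(); ++i) {
    starts[i] = curr;
    curr += kDemoSizes[i];
  }
  *total_end = curr;
  return starts;
}
```

Starting at offset 0, rodata begins at `0xF0000`, data at `0xFF000`, and the eight defaults together span `0x1F00FE00` bytes, most of it heap and workspace.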
const char* SectionToString(SectionKind section) {
switch (section) {
case SectionKind::kText: return "text";
case SectionKind::kRodata: return "rodata";
case SectionKind::kData: return "data";
case SectionKind::kBss: return "bss";
case SectionKind::kArgs: return "args";
case SectionKind::kStack: return "stack";
case SectionKind::kHeap: return "heap";
case SectionKind::kWorkspace: return "workspace";
default: return "";
}
}
static std::string AddrToString(void* addr) {
std::stringstream stream;
if (addr != nullptr)
stream << addr;
else
stream << "0x0";
std::string string_addr = stream.str();
return string_addr;
}
std::string RelocateBinarySections(const std::string& binary_path,
DevPtr text,
DevPtr rodata,
DevPtr data,
DevPtr bss,
const std::string& toolchain_prefix) {
const auto* f = Registry::Get("tvm_callback_relocate_binary");
CHECK(f != nullptr)
<< "Require tvm_callback_relocate_binary to exist in registry";
std::string relocated_bin = (*f)(binary_path,
AddrToString(text.cast_to<void*>()),
AddrToString(rodata.cast_to<void*>()),
AddrToString(data.cast_to<void*>()),
AddrToString(bss.cast_to<void*>()),
toolchain_prefix);
return relocated_bin;
}
std::string ReadSection(const std::string& binary,
SectionKind section,
const std::string& toolchain_prefix) {
CHECK(section == SectionKind::kText || section == SectionKind::kRodata ||
section == SectionKind::kData || section == SectionKind::kBss)
<< "ReadSection requires section to be one of text, rodata, data, or bss.";
const auto* f = Registry::Get("tvm_callback_read_binary_section");
CHECK(f != nullptr)
<< "Require tvm_callback_read_binary_section to exist in registry";
TVMByteArray arr;
arr.data = &binary[0];
arr.size = binary.length();
std::string section_contents = (*f)(arr, SectionToString(section), toolchain_prefix);
return section_contents;
}
size_t GetSectionSize(const std::string& binary_path,
SectionKind section,
const std::string& toolchain_prefix,
size_t align) {
CHECK(section == SectionKind::kText || section == SectionKind::kRodata ||
section == SectionKind::kData || section == SectionKind::kBss)
<< "GetSectionSize requires section to be one of text, rodata, data, or bss.";
const auto* f = Registry::Get("tvm_callback_get_section_size");
CHECK(f != nullptr)
<< "Require tvm_callback_get_section_size to exist in registry";
int size = (*f)(binary_path, SectionToString(section), toolchain_prefix);
return UpperAlignValue(size, align);
}
} // namespace runtime
} // namespace tvm
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
/*!
* Copyright (c) 2019 by Contributors
* \file micro_common.h
*/
#ifndef TVM_RUNTIME_MICRO_MICRO_COMMON_H_
#define TVM_RUNTIME_MICRO_MICRO_COMMON_H_
#include <stdio.h>
#include <tvm/runtime/registry.h>
#include <sstream>
#include <string>
#include <unordered_map>
namespace tvm {
namespace runtime {
/*!
* \brief enum of device memory region sections
*
* The order in which the enum variants are defined also defines the order of
* the sections in device memory.
*/
enum class SectionKind : size_t {
kText = 0,
kRodata,
kData,
kBss,
kArgs,
kStack,
kHeap,
kWorkspace,
kNumKinds,
};
/*! \brief default size alignment */
constexpr int kDefaultSizeAlignment = 8;
/*! \brief Base class for interfacing with device locations (pointers/offsets) */
class DeviceLocation {
public:
/*! \brief construct a location with value `value` */
explicit DeviceLocation(std::uintptr_t value) : value_(value) {}
/*! \brief default constructor */
DeviceLocation() : value_(0) {}
/*! \brief construct a null location */
explicit DeviceLocation(std::nullptr_t value) : value_(0) {}
/*! \brief destructor */
virtual ~DeviceLocation() {}
/*!
* \brief get value of location
* \return value of location
*/
std::uintptr_t value() const { return value_; }
/*!
* \brief cast location to type `T`
* \return casted result
*/
template <typename T>
T cast_to() const { return reinterpret_cast<T>(value_); }
/*! \brief check if location is null */
bool operator==(std::nullptr_t) const { return value_ == 0; }
/*! \brief check if location is not null */
bool operator!=(std::nullptr_t) const { return value_ != 0; }
protected:
/*! \brief raw value storing the location */
std::uintptr_t value_;
};
/*! \brief absolute device address */
class DevPtr : public DeviceLocation {
public:
/*! \brief construct an absolute address with value `val` */
explicit DevPtr(std::uintptr_t val) : DeviceLocation(val) {}
/*! \brief default constructor */
DevPtr() : DeviceLocation() {}
/*! \brief construct a null absolute address */
explicit DevPtr(std::nullptr_t val) : DeviceLocation(val) {}
/*! \brief add an integer to this absolute address to get a larger absolute address */
DevPtr operator+(size_t n) const {
return DevPtr(value_ + n);
}
/*! \brief mutably add an integer to this absolute address */
DevPtr& operator+=(size_t n) {
value_ += n;
return *this;
}
/*! \brief subtract an integer from this absolute address to get a smaller absolute address */
DevPtr operator-(size_t n) const {
return DevPtr(value_ - n);
}
/*! \brief mutably subtract an integer from this absolute address */
DevPtr& operator-=(size_t n) {
value_ -= n;
return *this;
}
};
/*! \brief offset from device base address */
class DevBaseOffset : public DeviceLocation {
public:
/*! \brief construct a base offset with value `value` */
explicit DevBaseOffset(std::uintptr_t value) : DeviceLocation(value) {}
/*! \brief default constructor */
DevBaseOffset() : DeviceLocation() {}
/*! \brief construct a null base offset */
explicit DevBaseOffset(std::nullptr_t value) : DeviceLocation(value) {}
/*! \brief add an integer to this base offset to get a larger base offset */
DevBaseOffset operator+(size_t n) const {
return DevBaseOffset(value_ + n);
}
/*! \brief mutably add an integer to this base offset */
DevBaseOffset& operator+=(size_t n) {
value_ += n;
return *this;
}
/*! \brief subtract an integer from this base offset to get a smaller base offset */
DevBaseOffset operator-(size_t n) const {
return DevBaseOffset(value_ - n);
}
/*! \brief mutably subtract an integer from this base offset */
DevBaseOffset& operator-=(size_t n) {
value_ -= n;
return *this;
}
};
/*!
* \brief map from symbols to their on-device offsets
*/
class SymbolMap {
public:
/*!
* \brief default constructor
*/
SymbolMap() {}
/*!
* \brief constructor that builds the mapping
* \param binary contents of binary object file
* \param toolchain_prefix prefix of compiler toolchain to use
*/
SymbolMap(const std::string& binary,
const std::string& toolchain_prefix) {
const auto* f = Registry::Get("tvm_callback_get_symbol_map");
CHECK(f != nullptr) << "require tvm_callback_get_symbol_map to exist in registry";
TVMByteArray arr;
arr.data = &binary[0];
arr.size = binary.length();
std::string map_str = (*f)(arr, toolchain_prefix);
// Parse symbols and addresses from returned string.
std::stringstream stream;
stream << map_str;
std::string name;
std::uintptr_t addr;
stream >> name;
stream >> std::hex >> addr;
while (stream) {
map_[name] = DevPtr(addr);
stream >> name;
stream >> std::hex >> addr;
}
}
/*!
* \brief retrieve on-device offset for a symbol name
* \param name name of the symbol
* \return on-device offset of the symbol
*/
DevPtr operator[](const std::string& name) const {
auto result = map_.find(name);
CHECK(result != map_.end()) << "\"" << name << "\" not in symbol map";
return result->second;
}
private:
/*! \brief backing map */
std::unordered_map<std::string, DevPtr> map_;
};
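The parsing loop in the `SymbolMap` constructor reads alternating names and hexadecimal addresses from a whitespace-separated string. A standalone sketch of the same idea is given below; `ParseSymbolMap` and `Lookup` are hypothetical helpers, and the exact text returned by the registered callback is an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>
#include <unordered_map>

using DemoSymbolMap = std::unordered_map<std::string, std::uintptr_t>;

// Parses whitespace-separated "<name> <hex address>" pairs.
DemoSymbolMap ParseSymbolMap(const std::string& map_str) {
  DemoSymbolMap result;
  std::istringstream stream(map_str);
  std::string name;
  std::uintptr_t addr;
  // The condition fails as soon as either extraction fails, so a trailing
  // name without an address is ignored.
  while (stream >> name >> std::hex >> addr) {
    result[name] = addr;
  }
  return result;
}

// Lookup that fails loudly on a missing symbol.
std::uintptr_t Lookup(const DemoSymbolMap& map, const std::string& name) {
  auto it = map.find(name);
  assert(it != map.end());
  return it->second;
}
```

Given the input `"UTVMMain 1000\nUTVMDone 2a\n"`, the map holds two entries, with `UTVMMain` at `0x1000` and `UTVMDone` at `0x2a`.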
/*! \brief struct containing start and size of a device memory region */
struct DevMemRegion {
/*! \brief section start offset */
DevBaseOffset start;
/*! \brief size of section */
size_t size;
};
/*! \brief struct containing section locations and symbol mappings */
struct BinaryInfo {
/*! \brief text section region */
DevMemRegion text_section;
/*! \brief rodata section region */
DevMemRegion rodata_section;
/*! \brief data section region */
DevMemRegion data_section;
/*! \brief bss section region */
DevMemRegion bss_section;
/*! \brief symbol map to offsets */
SymbolMap symbol_map;
};
// TODO(weberlo): should this be here?
/*! \brief number of bytes in each page */
constexpr int kPageSize = 4096;
const DevBaseOffset kDeviceStart = DevBaseOffset(64);
/*!
* \brief return default size of given section kind in bytes
*/
size_t GetDefaultSectionSize(SectionKind kind);
/*!
* \brief upper-aligns value according to specified alignment
* \param value value to be aligned
* \param align alignment
* \return upper-aligned value
*/
inline size_t UpperAlignValue(size_t value, size_t align) {
return value + (align - (value % align)) % align;
}
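A quick worked check of the alignment arithmetic: for `value = 13` and `align = 8`, `value % align` is 5, the padding is `(8 - 5) % 8 = 3`, and the result is 16. The outer `% align` keeps already-aligned values unchanged, since `(8 - 0) % 8 = 0`. The sketch below restates the same formula under the hypothetical name `AlignUp` so it can be exercised on its own.

```cpp
#include <cassert>
#include <cstddef>

// Same arithmetic as UpperAlignValue, restated for a standalone check.
inline std::size_t AlignUp(std::size_t value, std::size_t align) {
  assert(align != 0);  // a zero alignment would divide by zero
  return value + (align - (value % align)) % align;
}
```

The formula works for any non-zero alignment, not only powers of two.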
/*!
* \brief maps section enums to text
* \param section section type
* \return text form of the specified section
*/
const char* SectionToString(SectionKind section);
/*!
* \brief links binary by repositioning section addresses
* \param binary_name input binary filename
* \param text new text section address
* \param rodata new rodata section address
* \param data new data section address
* \param bss new bss section address
* \param toolchain_prefix prefix of compiler toolchain to use
* \return relocated binary file contents
*/
std::string RelocateBinarySections(const std::string& binary_name,
DevPtr text,
DevPtr rodata,
DevPtr data,
DevPtr bss,
const std::string& toolchain_prefix);
/*!
* \brief reads section from binary
* \param binary input binary contents
* \param section section type to be read
* \param toolchain_prefix prefix of compiler toolchain to use
* \return contents of the section
*/
std::string ReadSection(const std::string& binary,
SectionKind section,
const std::string& toolchain_prefix);
/*!
* \brief finds size of the section in the binary
* \param binary_name input binary filename
* \param section section type
* \param toolchain_prefix prefix of compiler toolchain to use
* \param align alignment of the returned size (default: 8)
* \return size of the section if it exists, 0 otherwise
*/
size_t GetSectionSize(const std::string& binary_name,
SectionKind section,
const std::string& toolchain_prefix,
size_t align = kDefaultSizeAlignment);
} // namespace runtime
} // namespace tvm
#endif // TVM_RUNTIME_MICRO_MICRO_COMMON_H_
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
/*!
* Copyright (c) 2019 by Contributors
* \file micro_device_api.cc
*/
#include <tvm/runtime/registry.h>
#include <tvm/runtime/device_api.h>
#include <tvm/runtime/c_runtime_api.h>
#include <tuple>
#include <vector>
#include "../workspace_pool.h"
#include "micro_session.h"
namespace tvm {
namespace runtime {
/*!
* \brief device API for uTVM micro devices
*/
class MicroDeviceAPI final : public DeviceAPI {
public:
/*! \brief constructor */
MicroDeviceAPI() { }
void SetDevice(TVMContext ctx) final {}
void GetAttr(TVMContext ctx, DeviceAttrKind kind, TVMRetValue* rv) final {
if (kind == kExist) {
*rv = 1;
}
}
void* AllocDataSpace(TVMContext ctx,
size_t nbytes,
size_t alignment,
TVMType type_hint) final {
std::shared_ptr<MicroSession>& session = MicroSession::Current();
void* data = session->AllocateInSection(SectionKind::kHeap, nbytes).cast_to<void*>();
CHECK(data != nullptr) << "unable to allocate " << nbytes << " bytes on device heap";
MicroDevSpace* dev_space = new MicroDevSpace();
dev_space->data = data;
dev_space->session = session;
return static_cast<void*>(dev_space);
}
void FreeDataSpace(TVMContext ctx, void* ptr) final {
MicroDevSpace* dev_space = static_cast<MicroDevSpace*>(ptr);
dev_space->session->FreeInSection(
SectionKind::kHeap, DevBaseOffset(reinterpret_cast<std::uintptr_t>(dev_space->data)));
delete dev_space;
}
void CopyDataFromTo(const void* from,
size_t from_offset,
void* to,
size_t to_offset,
size_t size,
TVMContext ctx_from,
TVMContext ctx_to,
TVMType type_hint,
TVMStreamHandle stream) final {
std::tuple<int, int> type_from_to(ctx_from.device_type, ctx_to.device_type);
if (type_from_to == std::make_tuple(kDLMicroDev, kDLMicroDev)) {
// Copying from the device to the device.
MicroDevSpace* from_space = static_cast<MicroDevSpace*>(const_cast<void*>(from));
MicroDevSpace* to_space = static_cast<MicroDevSpace*>(const_cast<void*>(to));
CHECK(from_space->session == to_space->session)
<< "attempt to copy data between different micro sessions (" << from_space->session
<< " != " << to_space->session << ")";
CHECK(ctx_from.device_id == ctx_to.device_id)
<< "can only copy between the same micro device";
std::shared_ptr<MicroSession>& session = from_space->session;
const std::shared_ptr<LowLevelDevice>& lld = session->low_level_device();
DevBaseOffset from_dev_offset = GetDevLoc(from_space, from_offset);
DevBaseOffset to_dev_offset = GetDevLoc(to_space, to_offset);
std::vector<uint8_t> buffer(size);
lld->Read(from_dev_offset, static_cast<void*>(buffer.data()), size);
lld->Write(to_dev_offset, static_cast<void*>(buffer.data()), size);
} else if (type_from_to == std::make_tuple(kDLMicroDev, kDLCPU)) {
// Reading from the device.
MicroDevSpace* from_space = static_cast<MicroDevSpace*>(const_cast<void*>(from));
std::shared_ptr<MicroSession>& session = from_space->session;
const std::shared_ptr<LowLevelDevice>& lld = session->low_level_device();
DevBaseOffset from_dev_offset = GetDevLoc(from_space, from_offset);
void* to_host_ptr = GetHostLoc(to, to_offset);
lld->Read(from_dev_offset, to_host_ptr, size);
} else if (type_from_to == std::make_tuple(kDLCPU, kDLMicroDev)) {
// Writing to the device.
MicroDevSpace* to_space = static_cast<MicroDevSpace*>(const_cast<void*>(to));
std::shared_ptr<MicroSession>& session = to_space->session;
const std::shared_ptr<LowLevelDevice>& lld = session->low_level_device();
void* from_host_ptr = GetHostLoc(from, from_offset);
DevBaseOffset to_dev_offset = GetDevLoc(to_space, to_offset);
lld->Write(to_dev_offset, from_host_ptr, size);
} else {
LOG(FATAL) << "Expect copy from/to micro device or between micro device\n";
}
}
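`CopyDataFromTo` selects its strategy by comparing the (source, destination) device-type pair against known combinations. The dispatch can be sketched on its own as follows; `kDemoCPU`, `kDemoMicroDev`, and `ClassifyCopy` are hypothetical, and the numeric device codes are placeholders, not the real `kDLCPU`/`kDLMicroDev` values.

```cpp
#include <cassert>
#include <string>
#include <tuple>

// Placeholder device-type codes (not the real DLPack/TVM values).
constexpr int kDemoCPU = 1;
constexpr int kDemoMicroDev = 2;

// Picks a copy strategy from the (source, destination) device pair.
std::string ClassifyCopy(int from_type, int to_type) {
  // Device type codes are assumed to be non-negative.
  assert(from_type >= 0 && to_type >= 0);
  const std::tuple<int, int> from_to(from_type, to_type);
  if (from_to == std::make_tuple(kDemoMicroDev, kDemoMicroDev)) {
    return "device-to-device via host staging buffer";
  } else if (from_to == std::make_tuple(kDemoMicroDev, kDemoCPU)) {
    return "device read";
  } else if (from_to == std::make_tuple(kDemoCPU, kDemoMicroDev)) {
    return "device write";
  }
  return "unsupported";
}
```

Comparing tuples keeps each branch to a single readable condition; a host-to-host pair falls through to the unsupported case, matching the fatal-error branch in the code above.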
void StreamSync(TVMContext ctx, TVMStreamHandle stream) final {
}
void* AllocWorkspace(TVMContext ctx, size_t size, TVMType type_hint) final {
std::shared_ptr<MicroSession>& session = MicroSession::Current();
void* data = session->AllocateInSection(SectionKind::kWorkspace, size).cast_to<void*>();
CHECK(data != nullptr) << "unable to allocate " << size << " bytes on device workspace";
MicroDevSpace* dev_space = new MicroDevSpace();
dev_space->data = data;
dev_space->session = session;
return static_cast<void*>(dev_space);
}
void FreeWorkspace(TVMContext ctx, void* data) final {
MicroDevSpace* dev_space = static_cast<MicroDevSpace*>(data);
std::shared_ptr<MicroSession>& session = dev_space->session;
session->FreeInSection(SectionKind::kWorkspace,
DevBaseOffset(reinterpret_cast<std::uintptr_t>(dev_space->data)));
delete dev_space;
}
/*!
* \brief obtain a global singleton of MicroDeviceAPI
* \return global shared pointer to MicroDeviceAPI
*/
static const std::shared_ptr<MicroDeviceAPI>& Global() {
static std::shared_ptr<MicroDeviceAPI> inst = std::make_shared<MicroDeviceAPI>();
return inst;
}
private:
DevBaseOffset GetDevLoc(MicroDevSpace* dev_space, size_t offset) {
DevBaseOffset dev_offset =
DevBaseOffset(reinterpret_cast<std::uintptr_t>(dev_space->data) + offset);
return dev_offset;
}
void* GetHostLoc(const void* ptr, size_t offset) {
return reinterpret_cast<void*>(reinterpret_cast<std::uintptr_t>(ptr) + offset);
}
};
// register device that can be obtained from Python frontend
TVM_REGISTER_GLOBAL("device_api.micro_dev")
.set_body([](TVMArgs args, TVMRetValue* rv) {
DeviceAPI* ptr = MicroDeviceAPI::Global().get();
*rv = static_cast<void*>(ptr);
});
} // namespace runtime
} // namespace tvm
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
/*!
* Copyright (c) 2019 by Contributors
* \file micro_module.cc
*/
#include <tvm/runtime/registry.h>
#include <tvm/runtime/c_runtime_api.h>
#include <tvm/runtime/module.h>
#include <unordered_map>
#include <string>
#include "micro_session.h"
#include "low_level_device.h"
#include "micro_common.h"
#include "../pack_args.h"
namespace tvm {
namespace runtime {
/*!
* \brief module for uTVM micro devices
*/
class MicroModuleNode final : public ModuleNode {
public:
MicroModuleNode() {}
~MicroModuleNode() {}
const char* type_key() const final {
return "micro";
}
PackedFunc GetFunction(const std::string& name,
const std::shared_ptr<ModuleNode>& sptr_to_self) final;
/*!
* \brief initializes module by binding to the current session and loading the binary
* \param binary_path path of the binary to be loaded
*/
void InitMicroModule(const std::string& binary_path) {
session_ = MicroSession::Current();
binary_path_ = binary_path;
binary_info_ = session_->LoadBinary(binary_path_);
}
/*!
* \brief runs selected function on the micro device
* \param func_name name of the function to be run
* \param func_offset offset of the function to be run
* \param args type-erased arguments passed to the function
*/
void RunFunction(const std::string& func_name, DevBaseOffset func_offset, const TVMArgs& args) {
session_->PushToExecQueue(func_offset, args);
}
private:
/*! \brief module binary info */
BinaryInfo binary_info_;
/*! \brief path to module binary */
std::string binary_path_;
/*! \brief global session pointer */
std::shared_ptr<MicroSession> session_;
};
class MicroWrappedFunc {
public:
MicroWrappedFunc(MicroModuleNode* m,
std::shared_ptr<MicroSession> session,
const std::string& func_name,
DevBaseOffset func_offset) {
m_ = m;
session_ = session;
func_name_ = func_name;
func_offset_ = func_offset;
}
void operator()(TVMArgs args, TVMRetValue* rv) const {
m_->RunFunction(func_name_, func_offset_, args);
}
private:
/*! \brief internal module */
MicroModuleNode* m_;
/*! \brief reference to the session for this function (to keep the session alive) */
std::shared_ptr<MicroSession> session_;
/*! \brief name of the function */
std::string func_name_;
/*! \brief offset of the function to be called */
DevBaseOffset func_offset_;
};
PackedFunc MicroModuleNode::GetFunction(
const std::string& name,
const std::shared_ptr<ModuleNode>& sptr_to_self) {
DevBaseOffset func_offset =
session_->low_level_device()->ToDevOffset(binary_info_.symbol_map[name]);
MicroWrappedFunc f(this, session_, name, func_offset);
return PackedFunc(f);
}
// register loadfile function to load module from Python frontend
TVM_REGISTER_GLOBAL("module.loadfile_micro_dev")
.set_body([](TVMArgs args, TVMRetValue* rv) {
std::shared_ptr<MicroModuleNode> n = std::make_shared<MicroModuleNode>();
n->InitMicroModule(args[0]);
*rv = runtime::Module(n);
});
} // namespace runtime
} // namespace tvm
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
/*!
* Copyright (c) 2019 by Contributors
* \file micro_section_allocator.h
*/
#ifndef TVM_RUNTIME_MICRO_MICRO_SECTION_ALLOCATOR_H_
#define TVM_RUNTIME_MICRO_MICRO_SECTION_ALLOCATOR_H_
#include <unordered_map>
#include "micro_common.h"
namespace tvm {
namespace runtime {
/*!
* \brief allocator for an on-device memory section
*/
class MicroSectionAllocator {
public:
/*!
* \brief constructor that specifies section boundaries
* \param region location and size of the section on the device
*/
explicit MicroSectionAllocator(DevMemRegion region)
: start_offset_(region.start),
size_(0),
capacity_(region.size) {
CHECK_EQ(start_offset_.value() % 8, 0) << "micro section not aligned to 8 bytes";
}
/*!
* \brief destructor
*/
~MicroSectionAllocator() {}
/*!
* \brief memory allocator
* \param size size of allocated memory in bytes
* \return offset of the allocated memory region in the section (a CHECK fails if out of space)
*/
DevBaseOffset Allocate(size_t size) {
size_ = UpperAlignValue(size_, 8);
CHECK(size_ + size < capacity_)
<< "cannot alloc " << size << " bytes in section with start_addr " <<
start_offset_.value();
DevBaseOffset alloc_ptr = start_offset_ + size_;
size_ += size;
alloc_map_[alloc_ptr.value()] = size;
return alloc_ptr;
}
/*!
* \brief free prior allocation from section
* \param offs offset to allocated memory
* \note simple allocator scheme, more complex versions will be implemented later
*/
void Free(DevBaseOffset offs) {
std::uintptr_t ptr = offs.value();
CHECK(alloc_map_.find(ptr) != alloc_map_.end()) << "freed pointer was never allocated";
alloc_map_.erase(ptr);
if (alloc_map_.empty()) {
size_ = 0;
}
}
/*!
* \brief start offset of the memory region managed by this allocator
*/
DevBaseOffset start_offset() const { return start_offset_; }
/*!
* \brief current end offset of the space being used in this memory region
*/
DevBaseOffset curr_end_offset() const { return start_offset_ + size_; }
/*!
* \brief end offset of the memory region managed by this allocator
*/
DevBaseOffset max_end_offset() const { return start_offset_ + capacity_; }
/*!
* \brief size of the section
*/
size_t size() const { return size_; }
/*!
* \brief capacity of the section
*/
size_t capacity() const { return capacity_; }
private:
/*! \brief start address of the section */
DevBaseOffset start_offset_;
/*! \brief current size of the section */
size_t size_;
/*! \brief total storage capacity of the section */
size_t capacity_;
/*! \brief allocation map for allocation sizes */
std::unordered_map<std::uintptr_t, size_t> alloc_map_;
};
} // namespace runtime
} // namespace tvm
#endif // TVM_RUNTIME_MICRO_MICRO_SECTION_ALLOCATOR_H_
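The allocation scheme in `MicroSectionAllocator` can be illustrated with a host-only sketch. It is a simplified model, not the code from this commit: `SketchSectionAllocator` and `UpperAlign` are hypothetical names, offsets are plain integers instead of `DevBaseOffset`, and errors throw instead of failing a `CHECK`. It shows the two properties that matter to callers: every allocation starts on an 8-byte boundary, and space is reclaimed only once all allocations have been freed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <unordered_map>

// Hypothetical stand-in for UpperAlignValue from micro_common.h:
// rounds `value` up to the next multiple of `align`.
inline std::size_t UpperAlign(std::size_t value, std::size_t align) {
  return (value + align - 1) / align * align;
}

// Simplified model of the bump allocator (assumption: integer offsets).
class SketchSectionAllocator {
 public:
  SketchSectionAllocator(std::uintptr_t start, std::size_t capacity)
      : start_(start), size_(0), capacity_(capacity) {}

  // Bumps the end of the used region, aligning the new block to 8 bytes.
  std::uintptr_t Allocate(std::size_t size) {
    std::size_t aligned = UpperAlign(size_, 8);
    // Mirrors the strict `<` comparison used in the header.
    if (aligned + size >= capacity_) {
      throw std::runtime_error("section out of space");
    }
    std::uintptr_t offset = start_ + aligned;
    size_ = aligned + size;
    alloc_map_[offset] = size;
    return offset;
  }

  // Forgets one allocation; the region shrinks only when nothing is left.
  void Free(std::uintptr_t offset) {
    if (alloc_map_.erase(offset) == 0) {
      throw std::invalid_argument("offset was never allocated");
    }
    if (alloc_map_.empty()) {
      size_ = 0;
    }
  }

  std::size_t size() const { return size_; }

 private:
  std::uintptr_t start_;
  std::size_t size_;
  std::size_t capacity_;
  std::unordered_map<std::uintptr_t, std::size_t> alloc_map_;
};
```

With a section starting at `0x1000`, allocating 5 bytes and then 4 bytes yields offsets `0x1000` and `0x1008`. Freeing only the first block leaves the used size unchanged, which is the trade-off the `\note` on `Free` refers to.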
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
/*!
* Copyright (c) 2019 by Contributors
* \file micro_session.h
*/
#ifndef TVM_RUNTIME_MICRO_MICRO_SESSION_H_
#define TVM_RUNTIME_MICRO_MICRO_SESSION_H_
#include "micro_common.h"
#include "micro_section_allocator.h"
#include <tvm/runtime/registry.h>
#include <tvm/runtime/c_runtime_api.h>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>
#include <tuple>
#include "low_level_device.h"
#include "device/utvm_runtime.h"
#include "target_data_layout_encoder.h"
namespace tvm {
namespace runtime {
/*!
* \brief session for facilitating micro device interaction
*/
class MicroSession : public ModuleNode {
public:
/*!
* \brief Get member function to front-end
* \param name The name of the function.
* \param sptr_to_self The pointer to the module node.
* \return The corresponding member function.
*/
virtual PackedFunc GetFunction(const std::string& name,
const std::shared_ptr<ModuleNode>& sptr_to_self);
/*!
* \return The type key of the executor.
*/
const char* type_key() const final {
return "MicroSession";
}
/*!
* \brief constructor
*/
MicroSession();
/*!
* \brief destructor
*/
~MicroSession();
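/*!
* \brief returns the current session, i.e., the one on top of the thread-local session stack
*/
static std::shared_ptr<MicroSession>& Current();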
/*!
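* \brief creates session by setting up a low-level device and initializing allocators for it
* \param device_type type of low-level device to set up
* \param binary_path path to the uTVM runtime binary
* \param toolchain_prefix prefix for binary names in the target compiler toolchain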
*/
void CreateSession(const std::string& device_type,
const std::string& binary_path,
const std::string& toolchain_prefix);
/*!
* \brief ends the session by destructing the low-level device and its allocators
*/
void EndSession();
/*!
* \brief allocate memory in section
* \param type type of section to allocate in
* \param size size of allocated memory in bytes
* \return pointer to allocated memory region in section, nullptr if out of space
*/
DevBaseOffset AllocateInSection(SectionKind type, size_t size);
/*!
* \brief free prior allocation from section
* \param type type of section the allocation belongs to
* \param ptr pointer to allocated memory
*/
void FreeInSection(SectionKind type, DevBaseOffset ptr);
/*!
* \brief read string from device to host
* \param str_offset device offset of first character of string
* \return host copy of device string that was read
*/
std::string ReadString(DevBaseOffset str_offset);
/*!
* \brief sets up runtime metadata for `func` and copies arguments for on-device execution
* \param func address of the function to be executed
* \param args args to the packed function
*/
void PushToExecQueue(DevBaseOffset func, const TVMArgs& args);
/*!
* \brief loads binary onto device
* \param binary_path path to binary object file
* \param patch_dylib_pointers whether runtime API function pointer patching is needed
* \return info about loaded binary
*/
BinaryInfo LoadBinary(const std::string& binary_path, bool patch_dylib_pointers = true);
/*!
* \brief read value of symbol from device memory
* \param symbol_map symbol map to read location of symbol from
* \param symbol name of symbol being read from
* \return value at symbol in memory
*/
template <typename T>
T DevSymbolRead(const SymbolMap& symbol_map, const std::string& symbol);
/*!
* \brief write value into device memory corresponding to symbol
* \param symbol_map symbol map to read location of symbol from
* \param symbol name of symbol being written to
* \param value value being written into symbol
*/
template <typename T>
void DevSymbolWrite(const SymbolMap& symbol_map, const std::string& symbol, const T& value);
/*!
* \brief returns low-level device pointer
* \note assumes low-level device has been initialized
*/
const std::shared_ptr<LowLevelDevice>& low_level_device() const {
CHECK(low_level_device_ != nullptr) << "attempt to get uninitialized low-level device";
return low_level_device_;
}
private:
/*! \brief low-level device pointer */
std::shared_ptr<LowLevelDevice> low_level_device_;
/*! \brief prefix for binary names in target compiler toolchain */
std::string toolchain_prefix_;
/*! \brief array of memory allocators for each on-device section */
std::shared_ptr<MicroSectionAllocator>
section_allocators_[static_cast<size_t>(SectionKind::kNumKinds)];
/*! \brief total number of bytes of usable device memory for this session */
size_t memory_size_;
/*! \brief uTVM runtime binary info */
BinaryInfo runtime_bin_info_;
/*! \brief path to uTVM runtime binary */
std::string runtime_binary_path_;
/*! \brief offset of the runtime entry function */
DevBaseOffset utvm_main_symbol_;
/*! \brief offset of the runtime exit breakpoint */
DevBaseOffset utvm_done_symbol_;
/*!
* \brief patches a function pointer in this module to an implementation
* \param symbol_map symbol map to read the location of the function pointer from
* \param func_name name of the function pointer being patched
*/
void PatchImplHole(const SymbolMap& symbol_map, const std::string& func_name);
/*!
* \brief sets the runtime binary path
* \param path path to runtime binary
*/
void SetRuntimeBinaryPath(std::string path);
/*!
* \brief appends arguments to the host-side buffer of `encoder`
* \param encoder encoder being used to append `args`
* \param args args to be appended
* \return device addresses of the allocated argument values and type codes
*/
std::tuple<DevPtr, DevPtr> EncoderAppend(TargetDataLayoutEncoder* encoder, const TVMArgs& args);
/*!
* \brief appends a `TVMArray` to the host-side buffer of `encoder`
* \param encoder encoder being used to append `arr`
* \param arr TVMArray to be appended
* \return device address of the allocated `TVMArray`
*/
DevPtr EncoderAppend(TargetDataLayoutEncoder* encoder, const TVMArray& arr);
/*!
* \brief checks and logs if there was an error during the device's most recent execution
*/
void CheckDeviceError();
/*!
* \brief returns section allocator corresponding to the given section kind
* \param kind kind of target section
* \return shared pointer to section allocator
*/
std::shared_ptr<MicroSectionAllocator> GetAllocator(SectionKind kind) {
return section_allocators_[static_cast<size_t>(kind)];
}
/*!
* \brief returns the symbol map for the uTVM runtime
* \return reference to symbol map
*/
const SymbolMap& runtime_symbol_map() {
return runtime_bin_info_.symbol_map;
}
/*!
* \brief Push a new session context onto the thread-local stack.
* The session on top of the stack is used as the current global session.
*/
static void EnterWithScope(std::shared_ptr<MicroSession> session);
/*!
* \brief Pop a session off the thread-local context stack,
* restoring the previous session as the current context.
*/
static void ExitWithScope();
};
/*!
* \brief a device memory region associated with the session that allocated it
*
* We use this to store a reference to the session in each allocated object and
* only deallocate the session once there are no more references to it.
*/
struct MicroDevSpace {
/*! \brief data being wrapped */
void* data;
/*! \brief shared ptr to session where this data is valid */
std::shared_ptr<MicroSession> session;
};
} // namespace runtime
} // namespace tvm
#endif // TVM_RUNTIME_MICRO_MICRO_SESSION_H_
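The keep-alive role of `MicroDevSpace` can be sketched in isolation. The names below (`SketchSession`, `SketchDevSpace`, `AllocSpace`, `FreeSpace`) are hypothetical and carry none of the real session state. The sketch only shows that a `std::shared_ptr` stored in each allocated object keeps the session alive until the last such object is freed.

```cpp
#include <cassert>
#include <memory>

// Hypothetical placeholder for MicroSession; the real class holds device state.
struct SketchSession {};

// Same shape as MicroDevSpace: raw device data plus an owning session reference.
struct SketchDevSpace {
  void* data;
  std::shared_ptr<SketchSession> session;
};

// Wraps `data` together with a reference to the session that allocated it.
inline SketchDevSpace* AllocSpace(const std::shared_ptr<SketchSession>& session,
                                  void* data) {
  return new SketchDevSpace{data, session};
}

// Dropping the wrapper releases its reference to the session.
inline void FreeSpace(SketchDevSpace* space) {
  delete space;
}
```

If the caller resets its own `shared_ptr` while a `SketchDevSpace` is still outstanding, the session object survives until `FreeSpace` runs. This mirrors how `AllocWorkspace` and `FreeWorkspace` pair `new MicroDevSpace()` with `delete dev_space`.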
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
/*!
* Copyright (c) 2019 by Contributors
* \file target_data_layout_encoder.h
* \brief uTVM data layout encoder
*/
#ifndef TVM_RUNTIME_MICRO_TARGET_DATA_LAYOUT_ENCODER_H_
#define TVM_RUNTIME_MICRO_TARGET_DATA_LAYOUT_ENCODER_H_
#include <cstring>
#include <vector>
#include "device/utvm_runtime.h"
namespace tvm {
namespace runtime {
// TODO(weberlo): Handle endianness.
/*!
* \brief data encoder for uTVM that builds a host-side buffer
*/
class TargetDataLayoutEncoder {
public:
/*!
* \brief helper class for writing into `TargetDataLayoutEncoder`
*/
template <typename T>
class Slot {
public:
/*!
* \brief constructor
* \param parent pointer to parent encoder
* \param start_offset start byte offset of the slot in the backing buffer
* \param size size (in bytes) of the memory region allocated for this slot
* \param start_addr start address of the slot in the device's memory
*/
Slot(TargetDataLayoutEncoder* parent, size_t start_offset, size_t size, DevPtr start_addr);
~Slot();
/*!
* \brief writes `sizeof(T) * num_elems` bytes of data from `arr`
* \param arr array to be read from
* \param num_elems number of elements in array
*/
void WriteArray(const T* arr, size_t num_elems);
/*!
* \brief writes `val`
* \param val value to be written
*/
void WriteValue(const T& val);
/*!
* \brief returns start address of the slot in device memory
* \return device start address
*/
DevPtr start_addr();
/*!
* \brief returns number of bytes allocated for this slot
* \return size of this slot
*/
size_t size();
private:
/*! \brief pointer to parent encoder */
TargetDataLayoutEncoder* parent_;
/*! \brief start offset of the slot in the parent's backing buffer */
size_t start_offset_;
/*! \brief current offset relative to the start offset of this slot */
size_t curr_offset_;
/*! \brief size (in bytes) of the memory region allocated for this slot */
size_t size_;
/*! \brief start address of the slot in the device's memory */
DevPtr start_addr_;
};
/*!
* \brief constructor
* \param start_addr start address of the encoder in device memory
*/
explicit TargetDataLayoutEncoder(DevPtr start_addr)
: buf_(std::vector<uint8_t>()), curr_offset_(0) {
start_addr_ = DevPtr(UpperAlignValue(start_addr.value(), 8));
}
/*!
* \brief allocates a slot for `sizeof(T) * num_elems` bytes of data
* \param num_elems number of elements of type `T` being allocated (defaults to 1)
* \return slot of size `sizeof(T) * num_elems` bytes
*/
template <typename T>
Slot<T> Alloc(size_t num_elems = 1) {
curr_offset_ = UpperAlignValue(curr_offset_, 8);
size_t size = sizeof(T) * num_elems;
if (curr_offset_ + size > buf_.size()) {
buf_.resize(curr_offset_ + size);
}
size_t slot_start_offset = curr_offset_;
curr_offset_ += size;
return Slot<T>(this, slot_start_offset, size, start_addr_ + slot_start_offset);
}
/*!
* \brief returns the array backing the encoder's buffer
* \return array backing the encoder's buffer
*/
uint8_t* data() {
return buf_.data();
}
/*!
* \brief returns current size of the encoder's buffer
* \return buffer size
*/
size_t buf_size() {
return buf_.size();
}
private:
/*! \brief in-memory backing buffer */
std::vector<uint8_t> buf_;
/*! \brief current offset */
size_t curr_offset_;
/*! \brief start address of the encoder in device memory */
DevPtr start_addr_;
};
template <typename T>
TargetDataLayoutEncoder::Slot<T>::Slot(TargetDataLayoutEncoder* parent,
size_t start_offset,
size_t size,
DevPtr start_addr)
: parent_(parent),
start_offset_(start_offset),
curr_offset_(0),
size_(size),
start_addr_(start_addr) {}
template <typename T>
TargetDataLayoutEncoder::Slot<T>::~Slot() {
CHECK(curr_offset_ == size_) << "unwritten space in slot";
}
template <typename T>
void TargetDataLayoutEncoder::Slot<T>::WriteArray(const T* arr, size_t num_elems) {
if (num_elems == 0) return;
size_t size = sizeof(T) * num_elems;
CHECK(curr_offset_ + size <= size_) << "not enough space in slot";
uint8_t* curr_ptr = &(parent_->data())[start_offset_ + curr_offset_];
std::memcpy(curr_ptr, arr, size);
curr_offset_ += size;
}
template <typename T>
void TargetDataLayoutEncoder::Slot<T>::WriteValue(const T& val) {
WriteArray(&val, 1);
}
template <typename T>
DevPtr TargetDataLayoutEncoder::Slot<T>::start_addr() {
return start_addr_;
}
template <typename T>
size_t TargetDataLayoutEncoder::Slot<T>::size() {
return size_;
}
} // namespace runtime
} // namespace tvm
#endif // TVM_RUNTIME_MICRO_TARGET_DATA_LAYOUT_ENCODER_H_
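The relationship between host-buffer offsets and device addresses in `TargetDataLayoutEncoder` can be shown with a reduced sketch. `SketchEncoder` is a hypothetical simplification: it merges `Alloc` and `Slot::WriteArray` into a single `Append` call and uses plain integers instead of `DevPtr`. The point is that every value is placed at an 8-byte-aligned offset in the host buffer, and its eventual device address is the aligned start address plus that offset.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical, simplified encoder (assumption: integer device addresses).
class SketchEncoder {
 public:
  // The device start address is rounded up to an 8-byte boundary.
  explicit SketchEncoder(std::uintptr_t dev_start)
      : dev_start_((dev_start + 7) / 8 * 8), curr_offset_(0) {}

  // Copies `num_elems` values into the host buffer at an aligned offset and
  // returns the device address they will occupy once the buffer is written.
  template <typename T>
  std::uintptr_t Append(const T* vals, std::size_t num_elems = 1) {
    curr_offset_ = (curr_offset_ + 7) / 8 * 8;
    std::size_t size = sizeof(T) * num_elems;
    if (curr_offset_ + size > buf_.size()) {
      buf_.resize(curr_offset_ + size);
    }
    if (size > 0) {
      std::memcpy(buf_.data() + curr_offset_, vals, size);
    }
    std::uintptr_t dev_addr = dev_start_ + curr_offset_;
    curr_offset_ += size;
    return dev_addr;
  }

  const std::vector<std::uint8_t>& buffer() const { return buf_; }

 private:
  std::vector<std::uint8_t> buf_;
  std::uintptr_t dev_start_;
  std::size_t curr_offset_;
};
```

Starting from device address `0x2004`, the first appended value lands at `0x2008`. A following array of two 64-bit integers lands at `0x2010`, because the 4-byte value before it is padded up to the next 8-byte boundary.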
@@ -139,6 +139,8 @@ bool RuntimeEnabled(const std::string& target) {
     f_name = "device_api.rpc";
   } else if (target == "vpi" || target == "verilog") {
     f_name = "device_api.vpi";
+  } else if (target == "micro_dev") {
+    f_name = "device_api.micro_dev";
   } else if (target.length() >= 5 && target.substr(0, 5) == "nvptx") {
     f_name = "device_api.gpu";
   } else if (target.length() >= 4 && target.substr(0, 4) == "rocm") {
......
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
"""Test various utilities for interaction with compiled binaries.
Specifically, we test the following capabilities:
- querying the size of a binary section
- relocating sections within a binary to new addresses
- reading the contents of a binary section
- querying the address of a symbol in the binary
"""
import tvm
import subprocess
from tvm.contrib import util
from tvm.contrib import cc
from tvm.contrib.binutil import *

TOOLCHAIN_PREFIX = ""

def make_binary():
    prog = "int a = 7; \
            int main() { \
                int b = 5; \
                return 0; \
            }"
    tmp_dir = util.tempdir()
    tmp_source = tmp_dir.relpath("source.c")
    tmp_obj = tmp_dir.relpath("obj.obj")
    with open(tmp_source, "w") as f:
        f.write(prog)
    cc.create_shared(tmp_obj, tmp_source, [],
                     compile_cmd="{}gcc".format(TOOLCHAIN_PREFIX))
    with open(tmp_obj, "rb") as f:
        prog_bin = bytearray(f.read())
    return prog_bin


def test_tvm_callback_get_section_size(binary=None):
    if binary is None:
        binary = make_binary()
    tmp_dir = util.tempdir()
    tmp_bin = tmp_dir.relpath("obj.bin")
    with open(tmp_bin, "wb") as f:
        f.write(binary)
    def verify():
        print("Text section size: %d" %
              tvm_callback_get_section_size(tmp_bin, "text", TOOLCHAIN_PREFIX))
        print("Data section size: %d" %
              tvm_callback_get_section_size(tmp_bin, "data", TOOLCHAIN_PREFIX))
        print("Bss section size: %d" %
              tvm_callback_get_section_size(tmp_bin, "bss", TOOLCHAIN_PREFIX))
        print()
    verify()


def test_tvm_callback_relocate_binary():
    binary = make_binary()
    tmp_dir = util.tempdir()
    tmp_bin = tmp_dir.relpath("obj.bin")
    with open(tmp_bin, "wb") as f:
        f.write(binary)
    def verify():
        text_loc_str = "0x0"
        rodata_loc_str = "0x10000"
        data_loc_str = "0x20000"
        bss_loc_str = "0x30000"
        rel_bin = tvm_callback_relocate_binary(
            tmp_bin, text_loc_str, rodata_loc_str, data_loc_str, bss_loc_str, TOOLCHAIN_PREFIX)
        print("Relocated binary section sizes")
        test_tvm_callback_get_section_size(binary=rel_bin)
        relf = tmp_dir.relpath("rel.bin")
        with open(relf, "wb") as f:
            f.write(rel_bin)
        # Use the same toolchain prefix as the rest of the test for `nm`.
        nm_proc = subprocess.Popen(
            ["{}nm".format(TOOLCHAIN_PREFIX), "-C", "--defined-only", relf],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT)
        (out, _) = nm_proc.communicate()
        # Ensure the relocated symbols are within the ranges we specified.
        text_loc = int(text_loc_str, 16)
        data_loc = int(data_loc_str, 16)
        bss_loc = int(bss_loc_str, 16)
        symbol_entries = out.decode("utf-8").split("\n")
        for entry in symbol_entries:
            if len(entry) == 0:
                continue
            sym_loc, section, sym_name = entry.split(' ')
            sym_loc = int(sym_loc, 16)
            if section == 'T':  # text
                assert sym_loc >= text_loc and sym_loc < data_loc
            elif section == 'D':  # data
                assert sym_loc >= data_loc and sym_loc < bss_loc
            elif section == 'B':  # bss
                assert sym_loc >= bss_loc
    verify()


def test_tvm_callback_read_binary_section():
    binary = make_binary()
    def verify():
        text_bin = tvm_callback_read_binary_section(binary, "text", TOOLCHAIN_PREFIX)
        data_bin = tvm_callback_read_binary_section(binary, "data", TOOLCHAIN_PREFIX)
        bss_bin = tvm_callback_read_binary_section(binary, "bss", TOOLCHAIN_PREFIX)
        print("Read text section part of binary? %r" % (text_bin in binary))
        print("Read data section part of binary? %r" % (data_bin in binary))
        print("Read bss section part of binary? %r" % (bss_bin in binary))
        print()
    verify()


def test_tvm_callback_get_symbol_map():
    binary = make_binary()
    tmp_dir = util.tempdir()
    tmp_bin = tmp_dir.relpath("obj.bin")
    with open(tmp_bin, "wb") as f:
        f.write(binary)
    def verify():
        text_loc_str = "0x0"
        rodata_loc_str = "0x10000"
        data_loc_str = "0x20000"
        bss_loc_str = "0x30000"
        rel_bin = tvm_callback_relocate_binary(
            tmp_bin, text_loc_str, rodata_loc_str, data_loc_str, bss_loc_str, TOOLCHAIN_PREFIX)
        symbol_map = tvm_callback_get_symbol_map(rel_bin, TOOLCHAIN_PREFIX)
        symbols = set()
        for i, line in enumerate(symbol_map.split('\n')):
            # Every other line is the value the symbol maps to.
            if i % 2 == 0:
                symbols.add(line)
        assert "a" in symbols
        assert "main" in symbols
    verify()


if __name__ == "__main__":
    test_tvm_callback_get_section_size()
    test_tvm_callback_relocate_binary()
    test_tvm_callback_read_binary_section()
    test_tvm_callback_get_symbol_map()
@@ -95,31 +95,6 @@ def test_add_pipeline():
     with tvm.build_config(offset_factor=4):
         check_c()
-def test_reinterpret():
-    nn = 1024
-    n = tvm.convert(nn)
-    A = tvm.placeholder((n,), name='A', dtype="int32")
-    B = tvm.compute(A.shape, lambda *i: tvm.call_pure_intrin("float32", "reinterpret", A(*i)), name='B')
-    s = tvm.create_schedule(B.op)
-    def check_c():
-        mhost = tvm.build(s, [A, B], "c", name="reinterpret")
-        temp = util.tempdir()
-        path_dso = temp.relpath("temp.so")
-        mhost.export_library(path_dso)
-        m = tvm.module.load(path_dso)
-        fadd = m['reinterpret']
-        ctx = tvm.cpu(0)
-        n = nn
-        a = tvm.nd.array(np.random.randint(-2 ** 30, 2 ** 30, size=n).astype(A.dtype), ctx)
-        b = tvm.nd.array(np.zeros(n, dtype=B.dtype), ctx)
-        fadd(a, b)
-        tvm.testing.assert_allclose(
-            b.asnumpy(), a.asnumpy().view('float32'))
-    check_c()
 if __name__ == "__main__":
     test_add()
     test_add_pipeline()
-    test_reinterpret()
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
import tvm
import numpy as np
from tvm import relay
from tvm.contrib import util

def test_add():
    nn = 1024
    n = tvm.convert(nn)
    A = tvm.placeholder((n,), name='A')
    B = tvm.placeholder((n,), name='B')
    C = tvm.compute(A.shape, lambda *i: A(*i) + B(*i), name='C')
    s = tvm.create_schedule(C.op)

    def check_c():
        mhost = tvm.build(s, [A, B, C], "c", name="fadd")
        temp = util.tempdir()
        path_dso = temp.relpath("temp.so")
        mhost.export_library(path_dso)
        print(mhost.get_source())
        m = tvm.module.load(path_dso)
        fadd = m['fadd']
        ctx = tvm.cpu(0)
        # launch the kernel.
        n = nn
        a = tvm.nd.array(np.random.uniform(size=n).astype(A.dtype), ctx)
        b = tvm.nd.array(np.random.uniform(size=n).astype(B.dtype), ctx)
        c = tvm.nd.array(np.zeros(n, dtype=C.dtype), ctx)
        fadd(a, b, c)
        tvm.testing.assert_allclose(
            c.asnumpy(), a.asnumpy() + b.asnumpy())
    check_c()


def test_relay_id():
    # x = relay.var("x")
    # f = relay.Function([x], x)
    x = relay.var('x', shape=[])
    func = relay.Function([x], x)
    ttype = relay.TensorType([], dtype='float32')
    relay.FuncType([ttype], ttype)
    mod = relay.module.Module()
    func_gvar = relay.GlobalVar("f")
    mod[func_gvar] = func
    print(mod)


def test_add_pipeline():
    nn = 1024
    n = tvm.convert(nn)
    A = tvm.placeholder((n,), name='A')
    B = tvm.placeholder((n,), name='B')
    AA = tvm.compute((n,), lambda *i: A(*i), name='A')
    BB = tvm.compute((n,), lambda *i: B(*i), name='B')
    T = tvm.compute(A.shape, lambda *i: AA(*i) + BB(*i), name='T')
    C = tvm.compute(A.shape, lambda *i: T(*i), name='C')
    s = tvm.create_schedule(C.op)
    xo, xi = s[C].split(C.op.axis[0], factor=4)
    xo1, xo2 = s[C].split(xo, factor=13)
    s[C].parallel(xo2)
    s[C].pragma(xo1, "parallel_launch_point")
    s[C].pragma(xo2, "parallel_stride_pattern")
    s[C].pragma(xo2, "parallel_barrier_when_finish")
    s[C].vectorize(xi)

    def check_c():
        if not tvm.module.enabled("llvm"):
            return
        # Specifically allow offset to test codepath when offset is available
        Ab = tvm.decl_buffer(
            A.shape, A.dtype,
            elem_offset=tvm.var('Aoffset'),
            offset_factor=8,
            name='A')
        binds = {A : Ab}
        # BUILD and invoke the kernel.
        f1 = tvm.lower(s, [A,B,C], name="fadd_pipeline")
        fsplits = [x for x in tvm.ir_pass.SplitHostDevice(f1)]
        fsplits[0] = tvm.ir_pass.LowerTVMBuiltin(fsplits[0])
        mhost = tvm.codegen.build_module(fsplits[0], "c")
        temp = util.tempdir()
        path_dso = temp.relpath("temp.so")
        mhost.export_library(path_dso)
        m = tvm.module.load(path_dso)
        fadd = m["fadd_pipeline"]
        ctx = tvm.cpu(0)
        # launch the kernel.
        n = nn
        a = tvm.nd.array(np.random.uniform(size=n).astype(A.dtype), ctx)
        b = tvm.nd.array(np.random.uniform(size=n).astype(B.dtype), ctx)
        c = tvm.nd.array(np.zeros(n, dtype=C.dtype), ctx)
        fadd(a, b, c)
        tvm.testing.assert_allclose(
            c.asnumpy(), a.asnumpy() + b.asnumpy())

    with tvm.build_config(offset_factor=4):
        check_c()


def test_reinterpret():
    nn = 1024
    n = tvm.convert(nn)
    A = tvm.placeholder((n,), name='A', dtype="int32")
    B = tvm.compute(A.shape, lambda *i: tvm.call_pure_intrin("float32", "reinterpret", A(*i)), name='B')
    s = tvm.create_schedule(B.op)

    def check_c():
        mhost = tvm.build(s, [A, B], "c", name="reinterpret")
        temp = util.tempdir()
        path_dso = temp.relpath("temp.so")
        mhost.export_library(path_dso)
        m = tvm.module.load(path_dso)
        fadd = m['reinterpret']
        ctx = tvm.cpu(0)
        n = nn
        a = tvm.nd.array(np.random.randint(-2 ** 30, 2 ** 30, size=n).astype(A.dtype), ctx)
        b = tvm.nd.array(np.zeros(n, dtype=B.dtype), ctx)
        fadd(a, b)
        tvm.testing.assert_allclose(
            b.asnumpy(), a.asnumpy().view('float32'))
    check_c()


if __name__ == "__main__":
    test_add()
    test_add_pipeline()
    test_reinterpret()
@@ -24,7 +24,7 @@ def _default_schedule(outs, auto_inline):
     """Default schedule for llvm."""
     target = tvm.target.current_target(allow_none=False)
     outs = [outs] if isinstance(outs, tvm.tensor.Tensor) else outs
-    if target.target_name != "llvm":
+    if target.target_name not in ("llvm", "c"):
         raise RuntimeError("schedule not registered for '%s'" % target)
     s = tvm.create_schedule([x.op for x in outs])
     if auto_inline:
......
@@ -36,7 +36,7 @@ def pool_grad_nchw(a_np, out_grad_np,
     pad_np = np.zeros(shape=(n, ic, ih+pt+pb, iw+pl+pr)).astype(dtype)
     no_zero = (range(n), range(ic), (range(pt, ih+pt)), (range(pl, iw+pl)))
     pad_np[np.ix_(*no_zero)] = a_np
-    _, oc, oh, ow = out_grad_np.shape
+    _, _, oh, ow = out_grad_np.shape
     pool_grad_np = np.zeros(shape=a_np.shape)
     pad_pool_grad_np = np.zeros(shape=pad_np.shape)
......