Commit ef909df1 by Logan Weber Committed by Tianqi Chen

Implementation of uTVM (#3227)

* uTVM interfaces (#14)

* some minor interface changes

* implemented HostLowLevelDevice

* added MicroDeviceAPI

* implemented micro_common and added Python interfaces

* current status, semi implemented micro session

* added micro_common implementation and python interfaces (#18)

* added micro_common implementation and python interfaces (#18)

* current status, semi implemented

* host test working

* updated interfaces for MicroSession arguments allocation

* make somewhat lint compatible

* fix based on comments

* added rounding macro

* fix minor bug

* improvements based on comments

* Clean up `binutil.py` and make Python-3-compatible

* Change argument allocation design

* Address feedback and lint errors

* Improve binutil tests

* Simplify allocator (per @tqchen's suggestions)

* Doc/style fixes

* farts

* mcgee

* rodata section werks

(and so does `test_runtime_micro_workspace.py`)

* simple graph runtime werk

* TEMP

* ResNet works, yo

* First round of cleanup

* More cleanup

* runs a dyson over the code

* Another pass

* Fix `make lint` issues

* ready to pr... probably

* final

* Undo change

* Fix rebase resolution

* Minor fixes

* Undo changes to C codegen tests

* Add `obj_path` in `create_micro_lib`

* TEMP

* Address feedback

* Add missing TODO

* Partially address feedback

* Fix headers

* Switch to enum class for `SectionKind`

* Add missing ASF header

* Fix lint

* Fix lint again

* Fix lint

* Kill lint warnings

* Address feedback

* Change Python interface to MicroTVM

All interaction with the device is now through `Session` objects, which
are used through Python's `with` blocks.

* Reorder LowLevelDevice interface

* Store shared ptr to session in all alloced objects

* Move helper functions out of `tvm.micro`

* Switch static char arr to vector

* Improve general infra and code quality

Does not yet address all of tqchen's feedback

* Forgot a rename

* Fix lint

* Add ASF header

* Fix lint

* Partially address MarisaKirisame's feedback

* Lint

* Expose `MicroSession` as a node to Python

* Revert to using `Session` constructor

* Fix compiler error

* (Maybe) fix CI error

* Debugging

* Remove

* Quell lint

* Switch to stack-based session contexts

* Make uTVM less intrusive to host codegen

And use SSA for operands of generated ternary operators

* Inline UTVMArgs into UTVMTask struct

* Remove `HostLowLevelDevice` header

* Remove `BaseAddr` class

* Address feedback

* Add "utvm" prefix to global vars in runtime

* Fix lint

* Fix CI

* Fix `test_binutil.py`

* Fix submodules

* Remove ResNet tests

* Make `test_binutil.py` work with nose

* Fix CI

* I swear this actually fixes the binutil tests

* lint

* lint

* Add fcompile-compatible cross-compile func

* Add docs for uTVM runtime files

* Move pointer patching into `MicroSession`

* Fix lint

* First attempt at unifying cross-compile APIs

* Fix lint

* Rename `cross_compile` back to `cc`

* Address feedback

* Remove commented code

* Lint

* Figure out failing function

* Remove debugging code

* Change "micro_dev" target to "micro"

* Add checks in tests for whether uTVM is enabled

* Add TODO for 32-bit support

* Rename more "micro_dev" to "micro"

* Undo rename

We already have `tvm.micro` as a namespace.  Can't have it as a method
as well.

* Fix failing CI

Thanks to @tqchen for finding this bug.  Emitting ternary operators for
`min` and `max` causes concurrency bugs in CUDA, so we're moving the
ternary op emissions from `CodeGenC` to `CodeGenCHost`.

* Address feedback

* Fix lint
parent 443d023b
......@@ -36,6 +36,7 @@ tvm_option(USE_RELAY_DEBUG "Building Relay in debug mode..." OFF)
tvm_option(USE_SGX "Build with SGX" OFF)
tvm_option(USE_RTTI "Build with RTTI" ON)
tvm_option(USE_MSVC_MT "Build with MT" OFF)
tvm_option(USE_MICRO "Build with Micro" OFF)
tvm_option(INSTALL_DEV "Install compiler infrastructure" OFF)
tvm_option(HIDE_PRIVATE_SYMBOLS "Compile with -fvisibility=hidden." OFF)
......@@ -206,6 +207,7 @@ include(cmake/modules/Metal.cmake)
include(cmake/modules/ROCM.cmake)
include(cmake/modules/SGX.cmake)
include(cmake/modules/LLVM.cmake)
include(cmake/modules/Micro.cmake)
include(cmake/modules/ANTLR.cmake)
include(cmake/modules/contrib/BLAS.cmake)
include(cmake/modules/contrib/Random.cmake)
......
......@@ -62,6 +62,9 @@ set(USE_VULKAN OFF)
# Whether enable OpenGL runtime
set(USE_OPENGL OFF)
# Whether enable MicroTVM runtime
set(USE_MICRO OFF)
# Whether to enable SGX runtime
#
# Possible values for USE_SGX:
......
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
if(USE_MICRO)
message(STATUS "Build with Micro support")
file(GLOB RUNTIME_MICRO_SRCS src/runtime/micro/*.cc)
list(APPEND RUNTIME_SRCS ${RUNTIME_MICRO_SRCS})
endif(USE_MICRO)
......@@ -81,6 +81,7 @@ typedef enum {
kDLAOCL = 5,
kDLSDAccel = 6,
kOpenGL = 11,
kDLMicroDev = 13,
// AddExtraTVMType which is not in DLPack here
} TVMDeviceExtType;
......
......@@ -215,6 +215,7 @@ inline const char* DeviceName(int type) {
case kDLROCM: return "rocm";
case kOpenGL: return "opengl";
case kDLExtDev: return "ext_dev";
case kDLMicroDev: return "micro_dev";
default: LOG(FATAL) << "unknown type =" << type; return "Unknown";
}
}
......
......@@ -42,7 +42,7 @@ from . import datatype
from . import ndarray as nd
from .ndarray import context, cpu, gpu, opencl, cl, vulkan, metal, mtl
from .ndarray import vpi, rocm, opengl, ext_dev
from .ndarray import vpi, rocm, opengl, ext_dev, micro_dev
from ._ffi.runtime_ctypes import TypeCode, TVMType
from ._ffi.ndarray import TVMContext
......
......@@ -143,6 +143,7 @@ class TVMContext(ctypes.Structure):
10: 'rocm',
11: 'opengl',
12: 'ext_dev',
13: 'micro_dev',
}
STR2MASK = {
'llvm': 1,
......@@ -163,6 +164,7 @@ class TVMContext(ctypes.Structure):
'rocm': 10,
'opengl': 11,
'ext_dev': 12,
'micro_dev': 13,
}
def __init__(self, device_type, device_id):
super(TVMContext, self).__init__()
......
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
"""Utilities for binary file manipulation"""
import os
import subprocess
from . import util
from .._ffi.base import py_str
from ..api import register_func
@register_func("tvm_callback_get_section_size")
def tvm_callback_get_section_size(binary_path, section_name, toolchain_prefix):
"""Finds size of the section in the binary.
Assumes `size` shell command exists (typically works only on Linux machines)
Parameters
----------
binary_path : str
path of the binary file
section_name : str
name of section
toolchain_prefix : str
prefix for binary names in target compiler toolchain
Returns
-------
size : integer
size of the section in bytes
"""
if not os.path.isfile(binary_path):
raise RuntimeError("no such file \"{}\"".format(binary_path))
# We use the "-A" flag here to get the ".rodata" section's size, which is
# not included by default.
size_proc = subprocess.Popen(
["{}size".format(toolchain_prefix), "-A", binary_path], stdout=subprocess.PIPE)
(size_output, _) = size_proc.communicate()
size_output = size_output.decode("utf-8")
if size_proc.returncode != 0:
msg = "error in finding section size:\n"
msg += py_str(out)
raise RuntimeError(msg)
# TODO(weberlo): Refactor this method and `*relocate_binary` so they are
# both aware of [".bss", ".sbss", ".sdata"] being relocated to ".bss".
section_mapping = {
".text": [".text"],
".rodata": [".rodata"],
".data": [".data", ".sdata"],
".bss": [".bss", ".sbss"],
}
sections_to_sum = section_mapping["." + section_name]
section_size = 0
# Skip the first two header lines in the `size` output.
for line in size_output.split("\n")[2:]:
tokens = list(filter(lambda s: len(s) != 0, line.split(" ")))
if len(tokens) != 3:
continue
entry_name = tokens[0]
entry_size = int(tokens[1])
if entry_name in sections_to_sum:
section_size += entry_size
return section_size
@register_func("tvm_callback_relocate_binary")
def tvm_callback_relocate_binary(
binary_path, text_addr, rodata_addr, data_addr, bss_addr, toolchain_prefix):
"""Relocates sections in the binary to new addresses
Parameters
----------
binary_path : str
path of the binary file
text_addr : str
text section absolute address
rodata_addr : str
rodata section absolute address
data_addr : str
data section absolute address
bss_addr : str
bss section absolute address
toolchain_prefix : str
prefix for binary names in target compiler toolchain
Returns
-------
rel_bin : bytearray
the relocated binary
"""
tmp_dir = util.tempdir()
rel_obj_path = tmp_dir.relpath("relocated.o")
ld_script_contents = ""
# TODO(weberlo): There should be a better way to configure this for different archs.
if "riscv" in toolchain_prefix:
ld_script_contents += "OUTPUT_ARCH( \"riscv\" )\n\n"
# TODO(weberlo): Generate the script in a more procedural manner.
ld_script_contents += """
SECTIONS
{
. = %s;
. = ALIGN(8);
.text :
{
*(.text)
. = ALIGN(8);
*(.text*)
}
. = %s;
. = ALIGN(8);
.rodata :
{
*(.rodata)
. = ALIGN(8);
*(.rodata*)
}
. = %s;
. = ALIGN(8);
.data :
{
*(.data)
. = ALIGN(8);
*(.data*)
. = ALIGN(8);
*(.sdata)
}
. = %s;
. = ALIGN(8);
.bss :
{
*(.bss)
. = ALIGN(8);
*(.bss*)
. = ALIGN(8);
*(.sbss)
}
}
""" % (text_addr, rodata_addr, data_addr, bss_addr)
rel_ld_script_path = tmp_dir.relpath("relocated.lds")
with open(rel_ld_script_path, "w") as f:
f.write(ld_script_contents)
ld_proc = subprocess.Popen(["{}ld".format(toolchain_prefix), binary_path,
"-T", rel_ld_script_path,
"-o", rel_obj_path],
stdout=subprocess.PIPE,
stderr=subprocess.STDOUT)
(out, _) = ld_proc.communicate()
if ld_proc.returncode != 0:
msg = "linking error using ld:\n"
msg += py_str(out)
raise RuntimeError(msg)
with open(rel_obj_path, "rb") as f:
rel_bin = bytearray(f.read())
return rel_bin
@register_func("tvm_callback_read_binary_section")
def tvm_callback_read_binary_section(binary, section, toolchain_prefix):
"""Returns the contents of the specified section in the binary byte array
Parameters
----------
binary : bytearray
contents of the binary
section : str
type of section
toolchain_prefix : str
prefix for binary names in target compiler toolchain
Returns
-------
section_bin : bytearray
contents of the read section
"""
tmp_dir = util.tempdir()
tmp_bin = tmp_dir.relpath("temp.bin")
tmp_section = tmp_dir.relpath("tmp_section.bin")
with open(tmp_bin, "wb") as out_file:
out_file.write(bytes(binary))
objcopy_proc = subprocess.Popen(["{}objcopy".format(toolchain_prefix), "--dump-section",
".{}={}".format(section, tmp_section),
tmp_bin],
stdout=subprocess.PIPE,
stderr=subprocess.STDOUT)
(out, _) = objcopy_proc.communicate()
if objcopy_proc.returncode != 0:
msg = "error in using objcopy:\n"
msg += py_str(out)
raise RuntimeError(msg)
if os.path.isfile(tmp_section):
# Get section content if it exists.
with open(tmp_section, "rb") as f:
section_bin = bytearray(f.read())
else:
# Return empty bytearray if the section does not exist.
section_bin = bytearray("", "utf-8")
return section_bin
@register_func("tvm_callback_get_symbol_map")
def tvm_callback_get_symbol_map(binary, toolchain_prefix):
"""Obtains a map of symbols to addresses in the passed binary
Parameters
----------
binary : bytearray
contents of the binary
toolchain_prefix : str
prefix for binary names in target compiler toolchain
Returns
-------
map_str : str
map of defined symbols to addresses, encoded as a series of
alternating newline-separated keys and values
"""
tmp_dir = util.tempdir()
tmp_obj = tmp_dir.relpath("tmp_obj.bin")
with open(tmp_obj, "wb") as out_file:
out_file.write(bytes(binary))
nm_proc = subprocess.Popen(["{}nm".format(toolchain_prefix), "-C", "--defined-only", tmp_obj],
stdout=subprocess.PIPE,
stderr=subprocess.STDOUT)
(nm_output, _) = nm_proc.communicate()
if nm_proc.returncode != 0:
msg = "error in using nm:\n"
msg += py_str(nm_output)
raise RuntimeError(msg)
nm_output = nm_output.decode("utf8").splitlines()
map_str = ""
for line in nm_output:
line = line.split()
map_str += line[2] + "\n"
map_str += line[0] + "\n"
return map_str
......@@ -14,7 +14,7 @@
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
"""Util to invoke c++ compilers in the system."""
"""Util to invoke C/C++ compilers in the system."""
# pylint: disable=invalid-name
from __future__ import absolute_import as _abs
import sys
......@@ -24,11 +24,10 @@ import os
from .._ffi.base import py_str
from .util import tempdir
def create_shared(output,
objects,
options=None,
cc="g++"):
compile_cmd="g++"):
"""Create shared library.
Parameters
......@@ -36,17 +35,17 @@ def create_shared(output,
output : str
The target shared library.
objects : list
objects : List[str]
List of object files.
options : list
options : List[str]
The list of additional options string.
cc : str, optional
The compile string.
compile_cmd : Optional[str]
The compiler command.
"""
if sys.platform == "darwin" or sys.platform.startswith("linux"):
_linux_shared(output, objects, options, cc)
_linux_compile(output, objects, options, compile_cmd)
elif sys.platform == "win32":
_windows_shared(output, objects, options)
else:
......@@ -56,40 +55,44 @@ def create_shared(output,
# assign so as default output format
create_shared.output_format = "so" if sys.platform != "win32" else "dll"
def cross_compiler(cc, options=None, output_format="so"):
def cross_compiler(compile_func, base_options=None, output_format="so"):
"""Create a cross compiler function.
Parameters
----------
cc : str
The cross compiler name.
compile_func : Callable[[str, str, Optional[str]], None]
Function that performs the actual compilation
options : list, optional
options : Optional[List[str]]
List of additional optional string.
output_format : str, optional
output_format : Optional[str]
Library output format.
Returns
-------
fcompile : function
fcompile : Callable[[str, str, Optional[str]], None]
A compilation function that can be passed to export_library.
"""
def _fcompile(outputs, objects, opts=None):
opts = opts if opts else []
if options:
opts += options
_linux_shared(outputs, objects, opts, cc=cc)
if base_options is None:
base_options = []
def _fcompile(outputs, objects, options=None):
all_options = base_options
if options is not None:
all_options += options
compile_func(outputs, objects, options=all_options)
_fcompile.output_format = output_format
return _fcompile
def _linux_shared(output, objects, options, cc="g++"):
cmd = [cc]
cmd += ["-shared", "-fPIC"]
if sys.platform == "darwin":
cmd += ["-undefined", "dynamic_lookup"]
def _linux_compile(output, objects, options, compile_cmd="g++"):
cmd = [compile_cmd]
if output.endswith(".so") or output.endswith(".dylib"):
cmd += ["-shared", "-fPIC"]
if sys.platform == "darwin":
cmd += ["-undefined", "dynamic_lookup"]
elif output.endswith(".obj"):
cmd += ["-c"]
cmd += ["-o", output]
if isinstance(objects, str):
cmd += [objects]
......
"""uTVM module for bare-metal backends.
uTVM (or the micro backend) enables provides support for bare-metal devices.
Its targets currently include a host-emulated device which is used for testing,
and JTAG-based openocd device which allows actual interfacing with microdevices.
"""
from ..contrib import binutil
from .base import Session, cross_compiler, create_micro_lib
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
"""Base definitions for micro."""
from __future__ import absolute_import
import logging
import os
import sys
from tvm.contrib import util as _util
from tvm.contrib import cc as _cc
from .._ffi.function import _init_api
from .._ffi.libinfo import find_include_path
SUPPORTED_DEVICE_TYPES = ["host"]
class Session:
"""MicroTVM Device Session
Parameters
----------
device_type : str
type of low-level device
toolchain_prefix : str
toolchain prefix to be used. For example, a prefix of
"riscv64-unknown-elf-" means "riscv64-unknown-elf-gcc" is used as
the compiler and "riscv64-unknown-elf-ld" is used as the linker,
etc.
Example
--------
.. code-block:: python
c_mod = ... # some module generated with "c" as the target
device_type = "host"
with tvm.micro.Session(device_type) as sess:
sess.create_micro_mod(c_mod)
"""
def __init__(self, device_type, toolchain_prefix):
if device_type not in SUPPORTED_DEVICE_TYPES:
raise RuntimeError("unknown micro device type \"{}\"".format(device_type))
self._check_system()
# First, find and compile runtime library.
runtime_src_path = os.path.join(_get_micro_device_dir(), "utvm_runtime.c")
tmp_dir = _util.tempdir()
runtime_obj_path = tmp_dir.relpath("utvm_runtime.obj")
create_micro_lib(
runtime_obj_path, runtime_src_path, toolchain_prefix, include_dev_lib_header=False)
self.module = _CreateSession(device_type, runtime_obj_path, toolchain_prefix)
self._enter = self.module["enter"]
self._exit = self.module["exit"]
def _check_system(self):
"""Check if the user's system is supported by MicroTVM.
Raises error if not supported.
"""
if not sys.platform.startswith("linux"):
raise RuntimeError("microTVM is currently only supported on Linux")
# TODO(weberlo): Add 32-bit support.
# It's primarily the compilation pipeline that isn't compatible.
if sys.maxsize <= 2**32:
raise RuntimeError("microTVM is currently only supported on 64-bit platforms")
def __enter__(self):
self._enter()
def __exit__(self, exc_type, exc_value, exc_traceback):
self._exit()
def _get_micro_device_dir():
"""Get directory path for uTVM runtime source files.
Return
------
micro_device_dir : str
directory path
"""
micro_dir = os.path.dirname(os.path.realpath(os.path.expanduser(__file__)))
micro_device_dir = os.path.join(micro_dir, "..", "..", "..",
"src", "runtime", "micro", "device")
return micro_device_dir
def cross_compiler(toolchain_prefix, include_dev_lib_header=True):
"""Creates a cross compile function that wraps `create_micro_lib`.
For use in `tvm.module.Module.export_library`.
Parameters
----------
toolchain_prefix : str
toolchain prefix to be used
include_dev_lib_header : Optional[bool]
whether to include the device library header containing definitions of
library functions.
Return
------
func : Callable[[str, str, Optional[str]], None]
cross compile function taking a destination path for the object file
and a path for the input source file.
Example
--------
.. code-block:: python
c_mod = ... # some module generated with "c" as the target
fcompile = tvm.micro.cross_compiler(toolchain_prefix="")
c_mod.export_library("dev_lib.obj", fcompile=fcompile)
"""
def compile_func(obj_path, src_path, **kwargs):
if isinstance(obj_path, list):
obj_path = obj_path[0]
if isinstance(src_path, list):
src_path = src_path[0]
create_micro_lib(obj_path, src_path, toolchain_prefix,
kwargs.get("options", None), include_dev_lib_header)
return _cc.cross_compiler(compile_func)
def create_micro_lib(
obj_path, src_path, toolchain_prefix, options=None, include_dev_lib_header=True):
"""Compiles code into a binary for the target micro device.
Parameters
----------
obj_path : Optional[str]
path to generated object file (defaults to same directory as `src_path`)
src_path : str
path to source file
toolchain_prefix : str
toolchain prefix to be used
include_dev_lib_header : bool
whether to include the device library header containing definitions of
library functions.
"""
def replace_suffix(s, new_suffix):
if "." in os.path.basename(s):
# There already exists an extension.
return os.path.join(
os.path.dirname(s),
".".join(os.path.basename(s).split(".")[:-1] + [new_suffix]))
# No existing extension; we can just append.
return s + "." + new_suffix
# uTVM object files cannot have an ".o" suffix, because it triggers the
# code path for creating shared objects in `tvm.module.load`. So we replace
# ".o" suffixes with ".obj".
if obj_path.endswith(".o"):
logging.warning(
"\".o\" suffix in \"%s\" has been replaced with \".obj\"", obj_path)
obj_path = replace_suffix(obj_path, "obj")
options = ["-I" + path for path in find_include_path()]
options += ["-I{}".format(_get_micro_device_dir())]
options += ["-fno-stack-protector"]
if sys.maxsize > 2**32 and sys.platform.startswith("linux"):
# Only add this option if the host is a 64-bit Linux.
options += ["-mcmodel=large"]
compile_cmd = "{}gcc".format(toolchain_prefix)
if include_dev_lib_header:
# Create a temporary copy of the source, so we can inject the dev lib
# header without modifying the original.
tmp_dir = _util.tempdir()
temp_src_path = tmp_dir.relpath("temp.c")
with open(src_path, "r") as f:
src_lines = f.read().splitlines()
src_lines.insert(0, "#include \"utvm_device_dylib_redirect.c\"")
with open(temp_src_path, "w") as f:
f.write("\n".join(src_lines))
src_path = temp_src_path
_cc.create_shared(obj_path, src_path, options, compile_cmd)
_init_api("tvm.micro", "tvm.micro.base")
......@@ -189,6 +189,22 @@ def ext_dev(dev_id=0):
return TVMContext(12, dev_id)
def micro_dev(dev_id=0):
"""Construct a micro device
Parameters
----------
dev_id : int, optional
The integer device id
Returns
-------
ctx : TVMContext
The created context
"""
return TVMContext(13, dev_id)
cl = opencl
mtl = metal
......
......@@ -6,9 +6,9 @@
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
*
* http://www.apache.org/licenses/LICENSE-2.0
*
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
......
......@@ -443,7 +443,7 @@ inline void PrintBinaryExpr(const T* op,
}
}
inline void PrintBinaryIntrinsitc(const Call* op,
inline void PrintBinaryIntrinsic(const Call* op,
const char *opstr,
std::ostream& os, // NOLINT(*)
CodeGenC* p) {
......@@ -528,20 +528,20 @@ void CodeGenC::VisitExpr_(const Call *op, std::ostream& os) { // NOLINT(*)
}
os << ")";
} else if (op->is_intrinsic(Call::bitwise_and)) {
PrintBinaryIntrinsitc(op, " & ", os, this);
PrintBinaryIntrinsic(op, " & ", os, this);
} else if (op->is_intrinsic(Call::bitwise_xor)) {
PrintBinaryIntrinsitc(op, " ^ ", os, this);
PrintBinaryIntrinsic(op, " ^ ", os, this);
} else if (op->is_intrinsic(Call::bitwise_or)) {
PrintBinaryIntrinsitc(op, " | ", os, this);
PrintBinaryIntrinsic(op, " | ", os, this);
} else if (op->is_intrinsic(Call::bitwise_not)) {
CHECK_EQ(op->args.size(), 1U);
os << "(~";
this->PrintExpr(op->args[0], os);
os << ')';
} else if (op->is_intrinsic(Call::shift_left)) {
PrintBinaryIntrinsitc(op, " << ", os, this);
PrintBinaryIntrinsic(op, " << ", os, this);
} else if (op->is_intrinsic(Call::shift_right)) {
PrintBinaryIntrinsitc(op, " >> ", os, this);
PrintBinaryIntrinsic(op, " >> ", os, this);
} else if (op->is_intrinsic(intrinsic::tvm_if_then_else)) {
os << "(";
PrintExpr(op->args[0], os);
......
......@@ -6,9 +6,9 @@
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
*
* http://www.apache.org/licenses/LICENSE-2.0
*
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
......
......@@ -31,13 +31,13 @@ namespace tvm {
namespace codegen {
CodeGenCHost::CodeGenCHost() {
module_name = GetUniqueName("__tvm_module_ctx");
module_name_ = GetUniqueName("__tvm_module_ctx");
}
void CodeGenCHost::Init(bool output_ssa) {
decl_stream << "#include \"tvm/runtime/c_runtime_api.h\"\n";
decl_stream << "#include \"tvm/runtime/c_backend_api.h\"\n";
decl_stream << "extern void* " << module_name << " = NULL;\n";
decl_stream << "extern void* " << module_name_ << " = NULL;\n";
CodeGenC::Init(output_ssa);
}
......@@ -154,12 +154,13 @@ void CodeGenCHost::VisitExpr_(const Broadcast* op, std::ostream& os) { // NOLI
os << "))";
}
void CodeGenCHost::PrintGetFuncFromBackend(std::string func_name, std::string packed_func_name) {
void CodeGenCHost::PrintGetFuncFromBackend(const std::string& func_name,
const std::string& packed_func_name) {
this->PrintIndent();
this->stream << "if (" << packed_func_name << " == NULL) {\n";
int packed_func_if_scope = this->BeginScope();
this->PrintIndent();
this->stream << "if (TVMBackendGetFuncFromEnv(" << module_name
this->stream << "if (TVMBackendGetFuncFromEnv(" << module_name_
<< ", \"" << func_name << "\""
<< ", &" << packed_func_name << ") != 0) {\n";
int get_func_env_scope = this->BeginScope();
......@@ -173,7 +174,7 @@ void CodeGenCHost::PrintGetFuncFromBackend(std::string func_name, std::string pa
this->stream << "}\n";
}
void CodeGenCHost::PrintFuncCall(std::string packed_func_name, int num_args) {
void CodeGenCHost::PrintFuncCall(const std::string& packed_func_name, int num_args) {
this->PrintIndent();
std::string ret_val = GetUniqueName("ret_val");
std::string ret_type_code = GetUniqueName("ret_type_code");
......@@ -251,6 +252,29 @@ void CodeGenCHost::VisitStmt_(const AssertStmt *op) { // NOLINT(*)
this->PrintStmt(op->body);
}
void CodeGenCHost::VisitExpr_(const Min *op, std::ostream& os) { // NOLINT(*)
PrintTernaryCondExpr(op, "<", os);
}
void CodeGenCHost::VisitExpr_(const Max *op, std::ostream& os) { // NOLINT(*)
PrintTernaryCondExpr(op, ">", os);
}
template <typename T>
inline void CodeGenCHost::PrintTernaryCondExpr(const T* op,
const char* compare,
std::ostream& os) { // NOLINT(*)
std::ostringstream temp_a;
VisitExpr(op->a, temp_a);
std::string a_id = SSAGetID(temp_a.str(), op->a.type());
std::ostringstream temp_b;
VisitExpr(op->b, temp_b);
std::string b_id = SSAGetID(temp_b.str(), op->b.type());
os << "((" << a_id << ") " << compare << " (" << b_id << ") "
<< "? (" << a_id << ") : (" << b_id << "))";
}
runtime::Module BuildCHost(Array<LoweredFunc> funcs) {
using tvm::runtime::Registry;
bool output_ssa = false;
......
......@@ -45,12 +45,30 @@ class CodeGenCHost final : public CodeGenC {
// overload visitor functions
void VisitExpr_(const Broadcast* op, std::ostream& os) final; // NOLINT(*)
void VisitExpr_(const Call *op, std::ostream& os) final; // NOLINT(*)
// overload min and max to use the ternary operator, so we don't rely on the
// standard library implementations
void VisitExpr_(const Min *op, std::ostream& os) final; // NOLINT(*)
void VisitExpr_(const Max *op, std::ostream& os) final; // NOLINT(*)
void VisitStmt_(const AssertStmt *op) final; // NOLINT(*)
private:
std::string module_name;
void PrintGetFuncFromBackend(std::string func_name, std::string packed_func_name);
void PrintFuncCall(std::string packed_func_name, int num_args);
std::string module_name_;
void PrintGetFuncFromBackend(const std::string& func_name, const std::string& packed_func_name);
void PrintFuncCall(const std::string& packed_func_name, int num_args);
/*!
* \brief Print ternary conditional operator implementing binary `op`
* Forces the operands to be in SSA form.
* \param op binary operator being expressed
* \param compare string representation of comparison operator
* \param os stream reference to print into
*/
template <typename T>
inline void PrintTernaryCondExpr(const T* op,
const char* compare,
std::ostream& os); // NOLINT(*)
};
} // namespace codegen
......
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
/*!
* Copyright (c) 2019 by Contributors
* \file utvm_device_dylib_redirect.cc
* \brief uTVM dynamic linking stubs
*
* This is a library that gets included in each uTVM library. We redirect
* each library call into a pre-defined global function pointer, and we patch
* the correct addresses of each function into the pointers when we load the
* library.
*/
#ifdef __cplusplus
extern "C" {
#endif
#include <stdint.h>
#include <stddef.h>
void *(*TVMBackendAllocWorkspace_)(int, int, uint64_t, int, int) =
(void *(*)(int, int, uint64_t, int, int)) NULL;
int (*TVMBackendFreeWorkspace_)(int, int, void*) = (int (*)(int, int, void*)) NULL;
void (*TVMAPISetLastError_)(const char*) = (void (*)(const char*)) NULL;
void* TVMBackendAllocWorkspace(int device_type, int device_id, uint64_t size,
int dtype_code_hint, int dtype_bits_hint) {
return (*TVMBackendAllocWorkspace_)(device_type, device_id, size, dtype_code_hint,
dtype_bits_hint);
}
int TVMBackendFreeWorkspace(int device_type, int device_id, void* ptr) {
return (*TVMBackendFreeWorkspace_)(device_type, device_id, ptr);
}
void TVMAPISetLastError(const char* msg) {
(*TVMAPISetLastError_)(msg);
}
#ifdef __cplusplus
} // TVM_EXTERN_C
#endif
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
/*!
* Copyright (c) 2019 by Contributors
* \file utvm_runtime.cc
* \brief uTVM runtime
*
* All function calls go through `UTVMMain`, which reads from the current
* `UTVMTask` and calls the appropriate function with the arguments from the
* task.
*
* Additionally included in this file are definitions for some of the most
* common functions used in the C runtime API.
*/
#ifdef __cplusplus
extern "C" {
#endif
#include "utvm_runtime.h"
// Task pointers must be patched before calling a function.
UTVMTask task;
// These pointers are patched at load time to point to the workspace section.
char* utvm_workspace_begin = NULL; // NOLINT(*)
char* utvm_workspace_end = NULL; // NOLINT(*)
char* utvm_workspace_curr = NULL; // NOLINT(*)
// Keep track of how many active allocations there are on the workspace.
size_t utvm_num_active_allocs = 0;
const char* utvm_last_error = NULL; // NOLINT(*)
int32_t utvm_return_code = 0; // NOLINT(*)
// We use a dummy function to signal execution is finished for device
// backends which require breakpoints.
void UTVMDone() { }
void UTVMMain() {
utvm_workspace_curr = utvm_workspace_begin;
utvm_num_active_allocs = 0;
utvm_last_error = NULL; // NOLINT(*)
utvm_return_code = 0;
utvm_return_code = task.func((void*) task.arg_values, (void*) task.arg_type_codes, // NOLINT(*)
task.num_args);
UTVMDone();
}
void* TVMBackendAllocWorkspace(int device_type, int device_id, uint64_t size,
int dtype_code_hint, int dtype_bits_hint) {
// Align up to 8 bytes.
utvm_workspace_curr += (8 - ((uintptr_t) utvm_workspace_curr % 8)) % 8; // NOLINT(*)
if (utvm_workspace_curr + size > utvm_workspace_end) {
// Out of space in workspace.
return NULL;
}
void* ret_ptr = (void*) utvm_workspace_curr; // NOLINT(*)
utvm_workspace_curr += size;
utvm_num_active_allocs++;
return ret_ptr;
}
int TVMBackendFreeWorkspace(int device_type, int device_id, void* ptr) {
utvm_num_active_allocs--;
if (utvm_num_active_allocs < 0) {
TVMAPISetLastError("free called with no active workspace allocations");
// Reset allocations and workspace (for future task executions).
utvm_num_active_allocs = 0;
utvm_workspace_curr = utvm_workspace_begin;
return -1;
} else if (utvm_num_active_allocs == 0) {
// No more allocations. Reset workspace.
utvm_workspace_curr = utvm_workspace_begin;
return 0;
} else {
return 0;
}
}
void TVMAPISetLastError(const char* msg) {
utvm_last_error = msg;
}
#ifdef __cplusplus
} // TVM_EXTERN_C
#endif
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
/*!
* Copyright (c) 2019 by Contributors
* \file utvm_runtime.h
* \brief uTVM runtime headers
*/
#ifndef TVM_RUNTIME_MICRO_DEVICE_UTVM_RUNTIME_H_
#define TVM_RUNTIME_MICRO_DEVICE_UTVM_RUNTIME_H_
#ifdef __cplusplus
extern "C" {
#endif
#include <stdint.h>
#include <tvm/runtime/c_runtime_api.h>
/*!
* \brief Task structure for uTVM
*/
typedef struct {
/*! \brief Pointer to function to call for this task */
int32_t (*func)(void*, void*, int32_t);
/*! \brief Array of argument values */
TVMValue* arg_values;
/*! \brief Array of type codes for each argument value */
int* arg_type_codes;
/*! \brief Number of arguments */
int32_t num_args;
} UTVMTask;
#ifdef __cplusplus
} // TVM_EXTERN_C
#endif
#endif // TVM_RUNTIME_MICRO_DEVICE_UTVM_RUNTIME_H_
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
/*!
* Copyright (c) 2019 by Contributors
* \file host_low_level_device.cc
* \brief emulated low-level micro device implementation on host machine
*/
#include <sys/mman.h>
#include <cstring>
#include <memory>
#include "micro_common.h"
#include "low_level_device.h"
namespace tvm {
namespace runtime {
/*!
* \brief emulated low-level device on host machine
*/
class HostLowLevelDevice final : public LowLevelDevice {
public:
/*!
* \brief constructor to initialize on-host memory region to act as device
* \param num_bytes size of the emulated on-device memory region
*/
explicit HostLowLevelDevice(size_t num_bytes) : size_(num_bytes) {
size_t size_in_pages = (num_bytes + kPageSize - 1) / kPageSize;
// TODO(weberlo): Set permissions per section (e.g., read-write perms for
// the heap, execute perms for text, etc.).
int mmap_prot = PROT_READ | PROT_WRITE | PROT_EXEC;
int mmap_flags = MAP_ANONYMOUS | MAP_PRIVATE;
base_addr_ = reinterpret_cast<std::uintptr_t>(
mmap(nullptr, size_in_pages * kPageSize, mmap_prot, mmap_flags, -1, 0));
}
/*!
* \brief destructor to deallocate on-host device region
*/
virtual ~HostLowLevelDevice() {
munmap(reinterpret_cast<void*>(base_addr_), size_);
}
void Read(DevBaseOffset offset, void* buf, size_t num_bytes) {
void* addr = ToDevPtr(offset).cast_to<void*>();
std::memcpy(buf, addr, num_bytes);
}
void Write(DevBaseOffset offset, const void* buf, size_t num_bytes) {
void* addr = ToDevPtr(offset).cast_to<void*>();
std::memcpy(addr, buf, num_bytes);
}
void Execute(DevBaseOffset func_offset, DevBaseOffset breakpoint) {
DevPtr func_addr = ToDevPtr(func_offset);
reinterpret_cast<void (*)(void)>(func_addr.value())();
}
std::uintptr_t base_addr() const final {
return base_addr_;
}
const char* device_type() const final {
return "host";
}
private:
/*! \brief base address of the micro device memory region */
std::uintptr_t base_addr_;
/*! \brief size of memory region */
size_t size_;
};
const std::shared_ptr<LowLevelDevice> HostLowLevelDeviceCreate(size_t num_bytes) {
std::shared_ptr<LowLevelDevice> lld =
std::make_shared<HostLowLevelDevice>(num_bytes);
return lld;
}
} // namespace runtime
} // namespace tvm
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
/*!
* Copyright (c) 2019 by Contributors
* \file low_level_device.h
* \brief Abstract low-level micro device management
*/
#ifndef TVM_RUNTIME_MICRO_LOW_LEVEL_DEVICE_H_
#define TVM_RUNTIME_MICRO_LOW_LEVEL_DEVICE_H_
#include <memory>
#include "micro_common.h"
namespace tvm {
namespace runtime {
/*!
* \brief virtual interface for low-level micro device management
*/
class LowLevelDevice {
public:
/*! \brief virtual destructor */
virtual ~LowLevelDevice() {}
/*!
* \brief reads num_bytes from device memory at base_addr + offset into buffer
* \param offset on-device memory offset pointer to be read from
* \param buffer on-host buffer to be read into
* \param num_bytes number of bytes to be read
*/
virtual void Read(DevBaseOffset offset,
void* buffer,
size_t num_bytes) = 0;
/*!
* \brief writes num_bytes from buffer to device memory at base_addr + offset
* \param offset on-device memory offset pointer to be written to
* \param buffer on-host buffer to be written
* \param num_bytes number of bytes to be written
*/
virtual void Write(DevBaseOffset offset,
const void* buffer,
size_t num_bytes) = 0;
/*!
* \brief starts execution of device at offset
* \param func_addr offset of the init stub function
* \param breakpoint breakpoint at which to stop function execution
*/
virtual void Execute(DevBaseOffset func_offset, DevBaseOffset breakpoint) = 0;
/*!
* \brief convert from base offset to absolute address
* \param offset base offset
*/
DevPtr ToDevPtr(DevBaseOffset offset) {
return DevPtr(base_addr() + offset.value());
}
/*!
* \brief convert from absolute address to base offset
* \param ptr absolute address
*/
DevBaseOffset ToDevOffset(DevPtr ptr) {
return DevBaseOffset(ptr.value() - base_addr());
}
/*!
* \brief getter function for low-level device type
* \return string containing device type
*/
virtual const char* device_type() const = 0;
protected:
/*!
* \brief getter function for base_addr
* \return the base address of the device memory region
*/
virtual std::uintptr_t base_addr() const = 0;
};
/*!
* \brief create a host low-level device
* \param num_bytes size of the memory region
*/
const std::shared_ptr<LowLevelDevice> HostLowLevelDeviceCreate(size_t num_bytes);
} // namespace runtime
} // namespace tvm
#endif // TVM_RUNTIME_MICRO_LOW_LEVEL_DEVICE_H_
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
/*!
* Copyright (c) 2019 by Contributors
* \file micro_common.cc
* \brief common utilties for uTVM
*/
#include <tvm/runtime/c_runtime_api.h>
#include <tvm/runtime/registry.h>
#include <cstdio>
#include <string>
#include <sstream>
#include <cstdint>
#include "micro_session.h"
#include "micro_common.h"
#include "low_level_device.h"
namespace tvm {
namespace runtime {
size_t GetDefaultSectionSize(SectionKind kind) {
switch (kind) {
case SectionKind::kText:
return 0xF0000;
case SectionKind::kRodata:
return 0xF000;
case SectionKind::kData:
return 0xF00;
case SectionKind::kBss:
return 0xF00;
case SectionKind::kArgs:
return 0xF00000;
case SectionKind::kStack:
return 0xF000;
case SectionKind::kHeap:
return 0xF000000;
case SectionKind::kWorkspace:
return 0xF000000;
default:
LOG(FATAL) << "invalid section " << static_cast<size_t>(kind);
return 0;
}
}
const char* SectionToString(SectionKind section) {
switch (section) {
case SectionKind::kText: return "text";
case SectionKind::kRodata: return "rodata";
case SectionKind::kData: return "data";
case SectionKind::kBss: return "bss";
case SectionKind::kArgs: return "args";
case SectionKind::kStack: return "stack";
case SectionKind::kHeap: return "heap";
case SectionKind::kWorkspace: return "workspace";
default: return "";
}
}
static std::string AddrToString(void* addr) {
std::stringstream stream;
if (addr != nullptr)
stream << addr;
else
stream << "0x0";
std::string string_addr = stream.str();
return string_addr;
}
std::string RelocateBinarySections(const std::string& binary_path,
DevPtr text,
DevPtr rodata,
DevPtr data,
DevPtr bss,
const std::string& toolchain_prefix) {
const auto* f = Registry::Get("tvm_callback_relocate_binary");
CHECK(f != nullptr)
<< "Require tvm_callback_relocate_binary to exist in registry";
std::string relocated_bin = (*f)(binary_path,
AddrToString(text.cast_to<void*>()),
AddrToString(rodata.cast_to<void*>()),
AddrToString(data.cast_to<void*>()),
AddrToString(bss.cast_to<void*>()),
toolchain_prefix);
return relocated_bin;
}
std::string ReadSection(const std::string& binary,
SectionKind section,
const std::string& toolchain_prefix) {
CHECK(section == SectionKind::kText || section == SectionKind::kRodata ||
section == SectionKind::kData || section == SectionKind::kBss)
<< "ReadSection requires section to be one of text, rodata, data, or bss.";
const auto* f = Registry::Get("tvm_callback_read_binary_section");
CHECK(f != nullptr)
<< "Require tvm_callback_read_binary_section to exist in registry";
TVMByteArray arr;
arr.data = &binary[0];
arr.size = binary.length();
std::string section_contents = (*f)(arr, SectionToString(section), toolchain_prefix);
return section_contents;
}
size_t GetSectionSize(const std::string& binary_path,
SectionKind section,
const std::string& toolchain_prefix,
size_t align) {
CHECK(section == SectionKind::kText || section == SectionKind::kRodata ||
section == SectionKind::kData || section == SectionKind::kBss)
<< "GetSectionSize requires section to be one of text, rodata, data, or bss.";
const auto* f = Registry::Get("tvm_callback_get_section_size");
CHECK(f != nullptr)
<< "Require tvm_callback_get_section_size to exist in registry";
int size = (*f)(binary_path, SectionToString(section), toolchain_prefix);
return UpperAlignValue(size, align);
}
} // namespace runtime
} // namespace tvm
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
/*!
* Copyright (c) 2019 by Contributors
* \file micro_common.h
*/
#ifndef TVM_RUNTIME_MICRO_MICRO_COMMON_H_
#define TVM_RUNTIME_MICRO_MICRO_COMMON_H_
#include <stdio.h>
#include <tvm/runtime/registry.h>
#include <sstream>
#include <string>
#include <unordered_map>
namespace tvm {
namespace runtime {
/*!
* \brief enum of device memory region sections
*
* The order in which the enum variants are defined also defines the order of
* the sections in device memory.
*/
enum class SectionKind : size_t {
kText = 0,
kRodata,
kData,
kBss,
kArgs,
kStack,
kHeap,
kWorkspace,
kNumKinds,
};
/*! \brief default size alignment */
constexpr int kDefaultSizeAlignment = 8;
/*! \brief Base class for interfacing with device locations (pointers/offsets) */
class DeviceLocation {
public:
/*! \brief construct a location with value `value` */
explicit DeviceLocation(std::uintptr_t value) : value_(value) {}
/*! \brief default constructor */
DeviceLocation() : value_(0) {}
/*! \brief construct a null location */
explicit DeviceLocation(std::nullptr_t value) : value_(0) {}
/*! \brief destructor */
virtual ~DeviceLocation() {}
/*!
* \brief get value of location
* \return value of location
*/
std::uintptr_t value() const { return value_; }
/*!
* \brief cast location to type `T`
* \return casted result
*/
template <typename T>
T cast_to() const { return reinterpret_cast<T>(value_); }
/*! \brief check if location is null */
bool operator==(std::nullptr_t) const { return value_ == 0; }
/*! \brief check if location is not null */
bool operator!=(std::nullptr_t) const { return value_ != 0; }
protected:
/*! \brief raw value storing the location */
std::uintptr_t value_;
};
/*! \brief absolute device address */
class DevPtr : public DeviceLocation {
public:
/*! \brief construct an absolute address with value `value` */
explicit DevPtr(std::uintptr_t val) : DeviceLocation(val) {}
/*! \brief default constructor */
DevPtr() : DeviceLocation() {}
/*! \brief construct a null absolute address */
explicit DevPtr(std::nullptr_t val) : DeviceLocation(val) {}
/*! \brief add an integer to this absolute address to get a larger absolute address */
DevPtr operator+(size_t n) const {
return DevPtr(value_ + n);
}
/*! \brief mutably add an integer to this absolute address */
DevPtr& operator+=(size_t n) {
value_ += n;
return *this;
}
/*! \brief subtract an integer from this absolute address to get a smaller absolute address */
DevPtr operator-(size_t n) const {
return DevPtr(value_ - n);
}
/*! \brief mutably subtract an integer from this absolute address */
DevPtr& operator-=(size_t n) {
value_ -= n;
return *this;
}
};
/*! \brief offset from device base address */
class DevBaseOffset : public DeviceLocation {
public:
/*! \brief construct a base offset with value `value` */
explicit DevBaseOffset(std::uintptr_t value) : DeviceLocation(value) {}
/*! \brief default constructor */
DevBaseOffset() : DeviceLocation() {}
/*! \brief construct a null base offset */
explicit DevBaseOffset(std::nullptr_t value) : DeviceLocation(value) {}
/*! \brief add an integer to this base offset to get a larger base offset */
DevBaseOffset operator+(size_t n) const {
return DevBaseOffset(value_ + n);
}
/*! \brief mutably add an integer to this base offset */
DevBaseOffset& operator+=(size_t n) {
value_ += n;
return *this;
}
/*! \brief subtract an integer from this base offset to get a smaller base offset */
DevBaseOffset operator-(size_t n) const {
return DevBaseOffset(value_ - n);
}
/*! \brief mutably subtract an integer from this base offset */
DevBaseOffset& operator-=(size_t n) {
value_ -= n;
return *this;
}
};
/*!
* \brief map from symbols to their on-device offsets
*/
class SymbolMap {
public:
/*!
* \brief default constructor
*/
SymbolMap() {}
/*!
* \brief constructor that builds the mapping
* \param binary contents of binary object file
* \param toolchain_prefix prefix of compiler toolchain to use
*/
SymbolMap(const std::string& binary,
const std::string& toolchain_prefix) {
const auto* f = Registry::Get("tvm_callback_get_symbol_map");
CHECK(f != nullptr) << "require tvm_callback_get_symbol_map to exist in registry";
TVMByteArray arr;
arr.data = &binary[0];
arr.size = binary.length();
std::string map_str = (*f)(arr, toolchain_prefix);
// Parse symbols and addresses from returned string.
std::stringstream stream;
stream << map_str;
std::string name;
std::uintptr_t addr;
stream >> name;
stream >> std::hex >> addr;
while (stream) {
map_[name] = DevPtr(addr);
stream >> name;
stream >> std::hex >> addr;
}
}
/*!
* \brief retrieve on-device offset for a symbol name
* \param name name of the symbol
* \return on-device offset of the symbol
*/
DevPtr operator[](const std::string& name) const {
auto result = map_.find(name);
CHECK(result != map_.end()) << "\"" << name << "\" not in symbol map";
return result->second;
}
private:
/*! \brief backing map */
std::unordered_map<std::string, DevPtr> map_;
};
/*! \brief struct containing start and size of a device memory region */
struct DevMemRegion {
/*! \brief section start offset */
DevBaseOffset start;
/*! \brief size of section */
size_t size;
};
/*! \brief struct containing section locations and symbol mappings */
struct BinaryInfo {
/*! \brief text section region */
DevMemRegion text_section;
/*! \brief rodata section region */
DevMemRegion rodata_section;
/*! \brief data section region */
DevMemRegion data_section;
/*! \brief bss section region */
DevMemRegion bss_section;
/*! \brief symbol map to offsets */
SymbolMap symbol_map;
};
// TODO(weberlo): should this be here?
/*! \brief number of bytes in each page */
constexpr int kPageSize = 4096;
const DevBaseOffset kDeviceStart = DevBaseOffset(64);
/*!
* \brief return default size of given section kind in bytes
*/
size_t GetDefaultSectionSize(SectionKind kind);
/*!
* \brief upper-aligns value according to specified alignment
* \param value value to be aligned
* \param align alignment
* \return upper-aligned value
*/
inline size_t UpperAlignValue(size_t value, size_t align) {
return value + (align - (value % align)) % align;
}
/*!
* \brief maps section enums to text
* \param section section type
* \return text form of the specified section
*/
const char* SectionToString(SectionKind section);
/*!
* \brief links binary by repositioning section addresses
* \param binary_name input binary filename
* \param text new text section address
* \param rodata new rodata section address
* \param data new data section address
* \param bss new bss section address
* \param toolchain_prefix prefix of compiler toolchain to use
* \return relocated binary file contents
*/
std::string RelocateBinarySections(const std::string& binary_name,
DevPtr text,
DevPtr rodata,
DevPtr data,
DevPtr bss,
const std::string& toolchain_prefix);
/*!
* \brief reads section from binary
* \param binary input binary contents
* \param section section type to be read
* \param toolchain_prefix prefix of compiler toolchain to use
* \return contents of the section
*/
std::string ReadSection(const std::string& binary,
SectionKind section,
const std::string& toolchain_prefix);
/*!
* \brief finds size of the section in the binary
* \param binary input binary contents
* \param section section type
* \param toolchain_prefix prefix of compiler toolchain to use
* \param align alignment of the returned size (default: 8)
* \return size of the section if it exists, 0 otherwise
*/
size_t GetSectionSize(const std::string& binary_name,
SectionKind section,
const std::string& toolchain_prefix,
size_t align = kDefaultSizeAlignment);
} // namespace runtime
} // namespace tvm
#endif // TVM_RUNTIME_MICRO_MICRO_COMMON_H_
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
/*!
* Copyright (c) 2019 by Contributors
* \file micro_device_api.cc
*/
#include <tvm/runtime/registry.h>
#include <tvm/runtime/device_api.h>
#include <tvm/runtime/c_runtime_api.h>
#include "../workspace_pool.h"
#include "micro_session.h"
namespace tvm {
namespace runtime {
/*!
* \brief device API for uTVM micro devices
*/
class MicroDeviceAPI final : public DeviceAPI {
public:
/*! \brief constructor */
MicroDeviceAPI() { }
void SetDevice(TVMContext ctx) final {}
void GetAttr(TVMContext ctx, DeviceAttrKind kind, TVMRetValue* rv) final {
if (kind == kExist) {
*rv = 1;
}
}
void* AllocDataSpace(TVMContext ctx,
size_t nbytes,
size_t alignment,
TVMType type_hint) final {
std::shared_ptr<MicroSession>& session = MicroSession::Current();
void* data = session->AllocateInSection(SectionKind::kHeap, nbytes).cast_to<void*>();
CHECK(data != nullptr) << "unable to allocate " << nbytes << " bytes on device heap";
MicroDevSpace* dev_space = new MicroDevSpace();
dev_space->data = data;
dev_space->session = session;
return static_cast<void*>(dev_space);
}
void FreeDataSpace(TVMContext ctx, void* ptr) final {
MicroDevSpace* dev_space = static_cast<MicroDevSpace*>(ptr);
dev_space->session->FreeInSection(
SectionKind::kHeap, DevBaseOffset(reinterpret_cast<std::uintptr_t>(dev_space->data)));
delete dev_space;
}
void CopyDataFromTo(const void* from,
size_t from_offset,
void* to,
size_t to_offset,
size_t size,
TVMContext ctx_from,
TVMContext ctx_to,
TVMType type_hint,
TVMStreamHandle stream) final {
std::tuple<int, int> type_from_to(ctx_from.device_type, ctx_to.device_type);
if (type_from_to == std::make_tuple(kDLMicroDev, kDLMicroDev)) {
// Copying from the device to the device.
MicroDevSpace* from_space = static_cast<MicroDevSpace*>(const_cast<void*>(from));
MicroDevSpace* to_space = static_cast<MicroDevSpace*>(const_cast<void*>(to));
CHECK(from_space->session == to_space->session)
<< "attempt to copy data between different micro sessions (" << from_space->session
<< " != " << to_space->session << ")";
CHECK(ctx_from.device_id == ctx_to.device_id)
<< "can only copy between the same micro device";
std::shared_ptr<MicroSession>& session = from_space->session;
const std::shared_ptr<LowLevelDevice>& lld = session->low_level_device();
DevBaseOffset from_dev_offset = GetDevLoc(from_space, from_offset);
DevBaseOffset to_dev_offset = GetDevLoc(to_space, to_offset);
std::vector<uint8_t> buffer(size);
lld->Read(from_dev_offset, static_cast<void*>(buffer.data()), size);
lld->Write(to_dev_offset, static_cast<void*>(buffer.data()), size);
} else if (type_from_to == std::make_tuple(kDLMicroDev, kDLCPU)) {
// Reading from the device.
MicroDevSpace* from_space = static_cast<MicroDevSpace*>(const_cast<void*>(from));
std::shared_ptr<MicroSession>& session = from_space->session;
const std::shared_ptr<LowLevelDevice>& lld = session->low_level_device();
DevBaseOffset from_dev_offset = GetDevLoc(from_space, from_offset);
void* to_host_ptr = GetHostLoc(to, to_offset);
lld->Read(from_dev_offset, to_host_ptr, size);
} else if (type_from_to == std::make_tuple(kDLCPU, kDLMicroDev)) {
// Writing to the device.
MicroDevSpace* to_space = static_cast<MicroDevSpace*>(const_cast<void*>(to));
std::shared_ptr<MicroSession>& session = to_space->session;
const std::shared_ptr<LowLevelDevice>& lld = session->low_level_device();
void* from_host_ptr = GetHostLoc(from, from_offset);
DevBaseOffset to_dev_offset = GetDevLoc(to_space, to_offset);
lld->Write(to_dev_offset, from_host_ptr, size);
} else {
LOG(FATAL) << "Expect copy from/to micro device or between micro device\n";
}
}
void StreamSync(TVMContext ctx, TVMStreamHandle stream) final {
}
void* AllocWorkspace(TVMContext ctx, size_t size, TVMType type_hint) final {
std::shared_ptr<MicroSession>& session = MicroSession::Current();
void* data = session->AllocateInSection(SectionKind::kWorkspace, size).cast_to<void*>();
CHECK(data != nullptr) << "unable to allocate " << size << " bytes on device workspace";
MicroDevSpace* dev_space = new MicroDevSpace();
dev_space->data = data;
dev_space->session = session;
return static_cast<void*>(dev_space);
}
void FreeWorkspace(TVMContext ctx, void* data) final {
MicroDevSpace* dev_space = static_cast<MicroDevSpace*>(data);
std::shared_ptr<MicroSession>& session = dev_space->session;
session->FreeInSection(SectionKind::kWorkspace,
DevBaseOffset(reinterpret_cast<std::uintptr_t>(dev_space->data)));
delete dev_space;
}
/*!
* \brief obtain a global singleton of MicroDeviceAPI
* \return global shared pointer to MicroDeviceAPI
*/
static const std::shared_ptr<MicroDeviceAPI>& Global() {
static std::shared_ptr<MicroDeviceAPI> inst = std::make_shared<MicroDeviceAPI>();
return inst;
}
private:
DevBaseOffset GetDevLoc(MicroDevSpace* dev_space, size_t offset) {
DevBaseOffset dev_offset =
DevBaseOffset(reinterpret_cast<std::uintptr_t>(dev_space->data) + offset);
return dev_offset;
}
void* GetHostLoc(const void* ptr, size_t offset) {
return reinterpret_cast<void*>(reinterpret_cast<std::uintptr_t>(ptr) + offset);
}
};
// register device that can be obtained from Python frontend
TVM_REGISTER_GLOBAL("device_api.micro_dev")
.set_body([](TVMArgs args, TVMRetValue* rv) {
DeviceAPI* ptr = MicroDeviceAPI::Global().get();
*rv = static_cast<void*>(ptr);
});
} // namespace runtime
} // namespace tvm
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
/*!
* Copyright (c) 2019 by Contributors
* \file micro_module.cc
*/
#include <tvm/runtime/registry.h>
#include <tvm/runtime/c_runtime_api.h>
#include <tvm/runtime/module.h>
#include <unordered_map>
#include <string>
#include "micro_session.h"
#include "low_level_device.h"
#include "micro_common.h"
#include "../pack_args.h"
namespace tvm {
namespace runtime {
/*!
* \brief module for uTVM micro devices
*/
class MicroModuleNode final : public ModuleNode {
public:
MicroModuleNode() {}
~MicroModuleNode() {}
const char* type_key() const final {
return "micro";
}
PackedFunc GetFunction(const std::string& name,
const std::shared_ptr<ModuleNode>& sptr_to_self) final;
/*!
* \brief initializes module by establishing device connection and loads binary
* \param binary_path path of the binary to be loaded
*/
void InitMicroModule(const std::string& binary_path) {
session_ = MicroSession::Current();
binary_path_ = binary_path;
binary_info_ = session_->LoadBinary(binary_path_);
}
/*!
* \brief runs selected function on the micro device
* \param func_name name of the function to be run
* \param func_offset offset of the function to be run
* \param args type-erased arguments passed to the function
*/
void RunFunction(const std::string& func_name, DevBaseOffset func_offset, const TVMArgs& args) {
session_->PushToExecQueue(func_offset, args);
}
private:
/*! \brief module binary info */
BinaryInfo binary_info_;
/*! \brief path to module binary */
std::string binary_path_;
/*! \brief global session pointer */
std::shared_ptr<MicroSession> session_;
};
class MicroWrappedFunc {
public:
MicroWrappedFunc(MicroModuleNode* m,
std::shared_ptr<MicroSession> session,
const std::string& func_name,
DevBaseOffset func_offset) {
m_ = m;
session_ = session;
func_name_ = func_name;
func_offset_ = func_offset;
}
void operator()(TVMArgs args, TVMRetValue* rv) const {
m_->RunFunction(func_name_, func_offset_, args);
}
private:
/*! \brief internal module */
MicroModuleNode* m_;
/*! \brief reference to the session for this function (to keep the session alive) */
std::shared_ptr<MicroSession> session_;
/*! \brief name of the function */
std::string func_name_;
/*! \brief offset of the function to be called */
DevBaseOffset func_offset_;
};
PackedFunc MicroModuleNode::GetFunction(
const std::string& name,
const std::shared_ptr<ModuleNode>& sptr_to_self) {
DevBaseOffset func_offset =
session_->low_level_device()->ToDevOffset(binary_info_.symbol_map[name]);
MicroWrappedFunc f(this, session_, name, func_offset);
return PackedFunc(f);
}
// register loadfile function to load module from Python frontend
TVM_REGISTER_GLOBAL("module.loadfile_micro_dev")
.set_body([](TVMArgs args, TVMRetValue* rv) {
std::shared_ptr<MicroModuleNode> n = std::make_shared<MicroModuleNode>();
n->InitMicroModule(args[0]);
*rv = runtime::Module(n);
});
} // namespace runtime
} // namespace tvm
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
/*!
* Copyright (c) 2019 by Contributors
* \file micro_section_allocator.h
*/
#ifndef TVM_RUNTIME_MICRO_MICRO_SECTION_ALLOCATOR_H_
#define TVM_RUNTIME_MICRO_MICRO_SECTION_ALLOCATOR_H_
#include <unordered_map>
#include "micro_common.h"
namespace tvm {
namespace runtime {
/*!
* \brief allocator for an on-device memory section
*/
class MicroSectionAllocator {
public:
/*!
* \brief constructor that specifies section boundaries
* \param region location and size of the section on the device
*/
explicit MicroSectionAllocator(DevMemRegion region)
: start_offset_(region.start),
size_(0),
capacity_(region.size) {
CHECK_EQ(start_offset_.value() % 8, 0) << "micro section not aligned to 8 bytes";
}
/*!
* \brief destructor
*/
~MicroSectionAllocator() {}
/*!
* \brief memory allocator
* \param size size of allocated memory in bytes
* \return pointer to allocated memory region in section, nullptr if out of space
*/
DevBaseOffset Allocate(size_t size) {
size_ = UpperAlignValue(size_, 8);
CHECK(size_ + size < capacity_)
<< "cannot alloc " << size << " bytes in section with start_addr " <<
start_offset_.value();
DevBaseOffset alloc_ptr = start_offset_ + size_;
size_ += size;
alloc_map_[alloc_ptr.value()] = size;
return alloc_ptr;
}
/*!
* \brief free prior allocation from section
* \param offs offset to allocated memory
* \note simple allocator scheme, more complex versions will be implemented later
*/
void Free(DevBaseOffset offs) {
std::uintptr_t ptr = offs.value();
CHECK(alloc_map_.find(ptr) != alloc_map_.end()) << "freed pointer was never allocated";
alloc_map_.erase(ptr);
if (alloc_map_.empty()) {
size_ = 0;
}
}
/*!
* \brief start offset of the memory region managed by this allocator
*/
DevBaseOffset start_offset() const { return start_offset_; }
/*!
* \brief current end offset of the space being used in this memory region
*/
DevBaseOffset curr_end_offset() const { return start_offset_ + size_; }
/*!
* \brief end offset of the memory region managed by this allocator
*/
DevBaseOffset max_end_offset() const { return start_offset_ + capacity_; }
/*!
* \brief size of the section
*/
size_t size() const { return size_; }
/*!
* \brief capacity of the section
*/
size_t capacity() const { return capacity_; }
private:
/*! \brief start address of the section */
DevBaseOffset start_offset_;
/*! \brief current size of the section */
size_t size_;
/*! \brief total storage capacity of the section */
size_t capacity_;
/*! \brief allocation map for allocation sizes */
std::unordered_map<std::uintptr_t, size_t> alloc_map_;
};
} // namespace runtime
} // namespace tvm
#endif // TVM_RUNTIME_MICRO_MICRO_SECTION_ALLOCATOR_H_
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
/*!
* Copyright (c) 2019 by Contributors
* \file micro_session.h
*/
#ifndef TVM_RUNTIME_MICRO_MICRO_SESSION_H_
#define TVM_RUNTIME_MICRO_MICRO_SESSION_H_
#include "micro_common.h"
#include "micro_section_allocator.h"
#include <tvm/runtime/registry.h>
#include <tvm/runtime/c_runtime_api.h>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>
#include <tuple>
#include "low_level_device.h"
#include "device/utvm_runtime.h"
#include "target_data_layout_encoder.h"
namespace tvm {
namespace runtime {
/*!
* \brief session for facilitating micro device interaction
*/
class MicroSession : public ModuleNode {
public:
/*!
* \brief Get member function to front-end
* \param name The name of the function.
* \param sptr_to_self The pointer to the module node.
* \return The corresponding member function.
*/
virtual PackedFunc GetFunction(const std::string& name,
const std::shared_ptr<ModuleNode>& sptr_to_self);
/*!
* \return The type key of the executor.
*/
const char* type_key() const final {
return "MicroSession";
}
/*!
* \brief constructor
*/
MicroSession();
/*!
* \brief destructor
*/
~MicroSession();
static std::shared_ptr<MicroSession>& Current();
/*!
* \brief creates session by setting up a low-level device and initting allocators for it
* \param args TVMArgs passed into the micro.init packedfunc
*/
void CreateSession(const std::string& device_type,
const std::string& binary_path,
const std::string& toolchain_prefix);
/*!
* \brief ends the session by destructing the low-level device and its allocators
*/
void EndSession();
/*!
* \brief allocate memory in section
* \param type type of section to allocate in
* \param size size of allocated memory in bytes
* \return pointer to allocated memory region in section, nullptr if out of space
*/
DevBaseOffset AllocateInSection(SectionKind type, size_t size);
/*!
* \brief free prior allocation from section
* \param type type of section to allocate in
* \param ptr pointer to allocated memory
*/
void FreeInSection(SectionKind type, DevBaseOffset ptr);
/*!
* \brief read string from device to host
* \param str_offset device offset of first character of string
* \return host copy of device string that was read
*/
std::string ReadString(DevBaseOffset str_offset);
/*!
* \brief sets up runtime metadata for `func` and copies arguments for on-device execution
* \param func address of the function to be executed
* \param args args to the packed function
*/
void PushToExecQueue(DevBaseOffset func, const TVMArgs& args);
/*!
* \brief loads binary onto device
* \param binary_path path to binary object file
* \param patch_dylib_pointers whether runtime API function pointer patching is needed
* \return info about loaded binary
*/
BinaryInfo LoadBinary(const std::string& binary_path, bool patch_dylib_pointers = true);
/*!
* \brief read value of symbol from device memory
* \param symbol_map symbol map to read location of symbol from
* \param symbol name of symbol being read from
* \return value at symbol in memory
*/
template <typename T>
T DevSymbolRead(const SymbolMap& symbol_map, const std::string& symbol);
/*!
* \brief write value into device memory corresponding to symbol
* \param symbol_map symbol map to read location of symbol from
* \param symbol name of symbol being written to
* \param value value being written into symbol
*/
template <typename T>
void DevSymbolWrite(const SymbolMap& symbol_map, const std::string& symbol, const T& value);
/*!
* \brief returns low-level device pointer
* \note assumes low-level device has been initialized
*/
const std::shared_ptr<LowLevelDevice>& low_level_device() const {
CHECK(low_level_device_ != nullptr) << "attempt to get uninitialized low-level device";
return low_level_device_;
}
private:
/*! \brief low-level device pointer */
std::shared_ptr<LowLevelDevice> low_level_device_;
/*! \brief prefix for binary names in target compiler toolchain */
std::string toolchain_prefix_;
/*! \brief array of memory allocators for each on-device section */
std::shared_ptr<MicroSectionAllocator>
section_allocators_[static_cast<size_t>(SectionKind::kNumKinds)];
/*! \brief total number of bytes of usable device memory for this session */
size_t memory_size_;
/*! \brief uTVM runtime binary info */
BinaryInfo runtime_bin_info_;
/*! \brief path to uTVM runtime source code */
std::string runtime_binary_path_;
/*! \brief offset of the runtime entry function */
DevBaseOffset utvm_main_symbol_;
/*! \brief offset of the runtime exit breakpoint */
DevBaseOffset utvm_done_symbol_;
/*!
* \brief patches a function pointer in this module to an implementation
* \param func_name name of the function pointer being patched
*/
void PatchImplHole(const SymbolMap& symbol_map, const std::string& func_name);
/*!
* \brief sets the runtime binary path
* \param path to runtime binary
*/
void SetRuntimeBinaryPath(std::string path);
/*!
* \brief appends arguments to the host-side buffer of `encoder`
* \param encoder encoder being used to append `args`
* \param args args to be appended
* \return device address of the allocated args
*/
std::tuple<DevPtr, DevPtr> EncoderAppend(TargetDataLayoutEncoder* encoder, const TVMArgs& args);
/*!
* \brief appends a `TVMArray` to the host-side buffer of `encoder`
* \param encoder encoder being used to append `arr`
* \param arr TVMArray to be appended
* \return device address of the allocated `TVMArray`
*/
DevPtr EncoderAppend(TargetDataLayoutEncoder* encoder, const TVMArray& arr);
/*!
* \brief checks and logs if there was an error during the device's most recent execution
*/
void CheckDeviceError();
/*!
* \brief returns section allocator corresponding to the given section kind
* \param kind kind of target section
* \return shared pointer to section allocator
*/
std::shared_ptr<MicroSectionAllocator> GetAllocator(SectionKind kind) {
return section_allocators_[static_cast<size_t>(kind)];
}
/*!
* \brief returns the symbol map for the uTVM runtime
* \return reference to symbol map
*/
const SymbolMap& runtime_symbol_map() {
return runtime_bin_info_.symbol_map;
}
/*!
* \brief Push a new session context onto the thread-local stack.
* The session on top of the stack is used as the current global session.
*/
static void EnterWithScope(std::shared_ptr<MicroSession> session);
/*!
* \brief Pop a session off the thread-local context stack,
* restoring the previous session as the current context.
*/
static void ExitWithScope();
};
/*!
* \brief a device memory region associated with the session that allocated it
*
* We use this to store a reference to the session in each allocated object and
* only deallocate the session once there are no more references to it.
*/
struct MicroDevSpace {
/*! \brief data being wrapped */
void* data;
/*! \brief shared ptr to session where this data is valid */
std::shared_ptr<MicroSession> session;
};
} // namespace runtime
} // namespace tvm
#endif // TVM_RUNTIME_MICRO_MICRO_SESSION_H_
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
/*!
* Copyright (c) 2019 by Contributors
* \file target_data_layout_encoder.h
* \brief uTVM data layout encoder
*/
#ifndef TVM_RUNTIME_MICRO_TARGET_DATA_LAYOUT_ENCODER_H_
#define TVM_RUNTIME_MICRO_TARGET_DATA_LAYOUT_ENCODER_H_
#include <vector>
#include "device/utvm_runtime.h"
namespace tvm {
namespace runtime {
// TODO(weberlo): Handle endianness.
/*!
* \brief data encoder for uTVM that builds a host-side buffer
*/
class TargetDataLayoutEncoder {
public:
/*!
* \brief helper class for writing into `TargetDataLayoutEncoder`
*/
template <typename T>
class Slot {
public:
/*!
* \brief constructor
* \param parent pointer to parent encoder
* \param start_offset start byte offset of the slot in the backing buffer
* \param size size (in bytes) of the memory region allocated for this slot
* \param start_addr start address of the slot in the device's memory
*/
Slot(TargetDataLayoutEncoder* parent, size_t start_offset, size_t size, DevPtr start_addr);
~Slot();
/*!
* \brief writes `sizeof(T) * num_elems` bytes of data from `arr`
* \param arr array to be read from
* \param num_elems number of elements in array
*/
void WriteArray(const T* arr, size_t num_elems);
/*!
* \brief writes `val`
* \param val value to be written
*/
void WriteValue(const T& val);
/*!
* \brief returns start address of the slot in device memory
* \return device start address
*/
DevPtr start_addr();
/*!
* \brief returns number of bytes allocated for this slot
* \return size of this slot
*/
size_t size();
private:
/*! \brief pointer to parent encoder */
TargetDataLayoutEncoder* parent_;
/*! \brief start offset of the slot in the parent's backing parent_buffer */
size_t start_offset_;
/*! \brief current offset relative to the start offset of this slot */
size_t curr_offset_;
/*! \brief size (in bytes) of the memory region allocated for this slot */
size_t size_;
/*! \brief start address of the slot in the device's memory */
DevPtr start_addr_;
};
/*!
* \brief constructor
* \param start_addr start address of the encoder in device memory
*/
explicit TargetDataLayoutEncoder(DevPtr start_addr)
: buf_(std::vector<uint8_t>()), curr_offset_(0) {
start_addr_ = DevPtr(UpperAlignValue(start_addr.value(), 8));
}
/*!
* \brief allocates a slot for `sizeof(T) * num_elems` bytes of data
* \param num_elems number of elements of type `T` being allocated (defaults to 1)
* \return slot of size `sizeof(T) * num_elems` bytes
*/
template <typename T>
Slot<T> Alloc(size_t num_elems = 1) {
curr_offset_ = UpperAlignValue(curr_offset_, 8);
size_t size = sizeof(T) * num_elems;
if (curr_offset_ + size > buf_.size()) {
buf_.resize(curr_offset_ + size);
}
size_t slot_start_offset = curr_offset_;
curr_offset_ += size;
return Slot<T>(this, slot_start_offset, size, start_addr_ + slot_start_offset);
}
/*!
* \brief returns the array backing the encoder's buffer
* \return array backing the encoder's buffer
*/
uint8_t* data() {
return buf_.data();
}
/*!
* \brief returns current size of the encoder's buffer
* \return buffer size
*/
size_t buf_size() {
return buf_.size();
}
private:
/*! \brief in-memory backing buffer */
std::vector<uint8_t> buf_;
/*! \brief current offset */
size_t curr_offset_;
/*! \brief start address of the encoder in device memory */
DevPtr start_addr_;
};
template <typename T>
TargetDataLayoutEncoder::Slot<T>::Slot(TargetDataLayoutEncoder* parent,
size_t start_offset,
size_t size,
DevPtr start_addr)
: parent_(parent),
start_offset_(start_offset),
curr_offset_(0),
size_(size),
start_addr_(start_addr) {}
template <typename T>
TargetDataLayoutEncoder::Slot<T>::~Slot() {
CHECK(curr_offset_ == size_) << "unwritten space in slot";
}
template <typename T>
void TargetDataLayoutEncoder::Slot<T>::WriteArray(const T* arr, size_t num_elems) {
if (num_elems == 0) return;
size_t size = sizeof(T) * num_elems;
CHECK(curr_offset_ + size <= size_) << "not enough space in slot";
uint8_t* curr_ptr = &(parent_->data())[start_offset_ + curr_offset_];
std::memcpy(curr_ptr, arr, size);
curr_offset_ += size;
}
template <typename T>
void TargetDataLayoutEncoder::Slot<T>::WriteValue(const T& val) {
WriteArray(&val, 1);
}
template <typename T>
DevPtr TargetDataLayoutEncoder::Slot<T>::start_addr() {
return start_addr_;
}
template <typename T>
size_t TargetDataLayoutEncoder::Slot<T>::size() {
return size_;
}
} // namespace runtime
} // namespace tvm
#endif // TVM_RUNTIME_MICRO_TARGET_DATA_LAYOUT_ENCODER_H_
......@@ -6,9 +6,9 @@
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
*
* http://www.apache.org/licenses/LICENSE-2.0
*
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
......@@ -139,6 +139,8 @@ bool RuntimeEnabled(const std::string& target) {
f_name = "device_api.rpc";
} else if (target == "vpi" || target == "verilog") {
f_name = "device_api.vpi";
} else if (target == "micro_dev") {
f_name = "device_api.micro_dev";
} else if (target.length() >= 5 && target.substr(0, 5) == "nvptx") {
f_name = "device_api.gpu";
} else if (target.length() >= 4 && target.substr(0, 4) == "rocm") {
......
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
"""Test various utilities for interaction with compiled binaries.
Specifically, we test the following capabilities:
- querying the size of a binary section
- relocating sections within a binary to new addresses
- reading the contents of a binary section
- querying the address of a symbol in the binary
"""
import tvm
import subprocess
from tvm.contrib import util
from tvm.contrib import cc
from tvm.contrib.binutil import *
TOOLCHAIN_PREFIX = ""
def make_binary():
prog = "int a = 7; \
int main() { \
int b = 5; \
return 0; \
}"
tmp_dir = util.tempdir()
tmp_source = tmp_dir.relpath("source.c")
tmp_obj = tmp_dir.relpath("obj.obj")
with open(tmp_source, "w") as f:
f.write(prog)
cc.create_shared(tmp_obj, tmp_source, [],
compile_cmd="{}gcc".format(TOOLCHAIN_PREFIX))
prog_bin = bytearray(open(tmp_obj, "rb").read())
return prog_bin
def test_tvm_callback_get_section_size(binary=None):
if binary is None:
binary = make_binary()
tmp_dir = util.tempdir()
tmp_bin = tmp_dir.relpath("obj.bin")
with open(tmp_bin, "wb") as f:
f.write(binary)
def verify():
print("Text section size: %d" %
tvm_callback_get_section_size(tmp_bin, "text", TOOLCHAIN_PREFIX))
print("Data section size: %d" %
tvm_callback_get_section_size(tmp_bin, "data", TOOLCHAIN_PREFIX))
print("Bss section size: %d" %
tvm_callback_get_section_size(tmp_bin, "bss", TOOLCHAIN_PREFIX))
print()
verify()
def test_tvm_callback_relocate_binary():
binary = make_binary()
tmp_dir = util.tempdir()
tmp_bin = tmp_dir.relpath("obj.bin")
with open(tmp_bin, "wb") as f:
f.write(binary)
def verify():
text_loc_str = "0x0"
rodata_loc_str = "0x10000"
data_loc_str = "0x20000"
bss_loc_str = "0x30000"
rel_bin = tvm_callback_relocate_binary(
tmp_bin, text_loc_str, rodata_loc_str, data_loc_str, bss_loc_str, TOOLCHAIN_PREFIX)
print("Relocated binary section sizes")
test_tvm_callback_get_section_size(binary=rel_bin)
relf = tmp_dir.relpath("rel.bin")
with open(relf, "wb") as f:
f.write(rel_bin)
nm_proc = subprocess.Popen(["nm", "-C", "--defined-only", relf],
stdout=subprocess.PIPE,
stderr=subprocess.STDOUT)
(out, _) = nm_proc.communicate()
# Ensure the relocated symbols are within the ranges we specified.
text_loc = int(text_loc_str, 16)
data_loc = int(data_loc_str, 16)
bss_loc = int(bss_loc_str, 16)
symbol_entries = out.decode("utf-8").split("\n")
for entry in symbol_entries:
if len(entry) == 0:
continue
sym_loc, section, sym_name = entry.split(' ')
sym_loc = int(sym_loc, 16)
if section == 'T': # text
assert sym_loc >= text_loc and sym_loc < data_loc
elif section == 'D': # data
assert sym_loc >= data_loc and sym_loc < bss_loc
elif section == 'B': # bss
assert sym_loc >= bss_loc
verify()
def test_tvm_callback_read_binary_section():
binary = make_binary()
def verify():
text_bin = tvm_callback_read_binary_section(binary, "text", TOOLCHAIN_PREFIX)
data_bin = tvm_callback_read_binary_section(binary, "data", TOOLCHAIN_PREFIX)
bss_bin = tvm_callback_read_binary_section(binary, "bss", TOOLCHAIN_PREFIX)
print("Read text section part of binary? %r" % (text_bin in binary))
print("Read data section part of binary? %r" % (data_bin in binary))
print("Read bss section part of binary? %r" % (bss_bin in binary))
print()
verify()
def test_tvm_callback_get_symbol_map():
binary = make_binary()
tmp_dir = util.tempdir()
tmp_bin = tmp_dir.relpath("obj.bin")
with open(tmp_bin, "wb") as f:
f.write(binary)
def verify():
text_loc_str = "0x0"
rodata_loc_str = "0x10000"
data_loc_str = "0x20000"
bss_loc_str = "0x30000"
rel_bin = tvm_callback_relocate_binary(
tmp_bin, text_loc_str, rodata_loc_str, data_loc_str, bss_loc_str, TOOLCHAIN_PREFIX)
symbol_map = tvm_callback_get_symbol_map(rel_bin, TOOLCHAIN_PREFIX)
symbols = set()
for i, line in enumerate(symbol_map.split('\n')):
# Every other line is the value the symbol maps to.
if i % 2 == 0:
symbols.add(line)
assert "a" in symbols
assert "main" in symbols
verify()
if __name__ == "__main__":
test_tvm_callback_get_section_size()
test_tvm_callback_relocate_binary()
test_tvm_callback_read_binary_section()
test_tvm_callback_get_symbol_map()
......@@ -95,31 +95,6 @@ def test_add_pipeline():
with tvm.build_config(offset_factor=4):
check_c()
def test_reinterpret():
nn = 1024
n = tvm.convert(nn)
A = tvm.placeholder((n,), name='A', dtype="int32")
B = tvm.compute(A.shape, lambda *i: tvm.call_pure_intrin("float32", "reinterpret", A(*i)), name='B')
s = tvm.create_schedule(B.op)
def check_c():
mhost = tvm.build(s, [A, B], "c", name="reinterpret")
temp = util.tempdir()
path_dso = temp.relpath("temp.so")
mhost.export_library(path_dso)
m = tvm.module.load(path_dso)
fadd = m['reinterpret']
ctx = tvm.cpu(0)
n = nn
a = tvm.nd.array(np.random.randint(-2 ** 30, 2 ** 30, size=n).astype(A.dtype), ctx)
b = tvm.nd.array(np.zeros(n, dtype=B.dtype), ctx)
fadd(a, b)
tvm.testing.assert_allclose(
b.asnumpy(), a.asnumpy().view('float32'))
check_c()
if __name__ == "__main__":
test_add()
test_add_pipeline()
test_reinterpret()
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
import tvm
import numpy as np
from tvm import relay
from tvm.contrib import util
def test_add():
nn = 1024
n = tvm.convert(nn)
A = tvm.placeholder((n,), name='A')
B = tvm.placeholder((n,), name='B')
C = tvm.compute(A.shape, lambda *i: A(*i) + B(*i), name='C')
s = tvm.create_schedule(C.op)
def check_c():
mhost = tvm.build(s, [A, B, C], "c", name="fadd")
temp = util.tempdir()
path_dso = temp.relpath("temp.so")
mhost.export_library(path_dso)
print(mhost.get_source())
m = tvm.module.load(path_dso)
fadd = m['fadd']
ctx = tvm.cpu(0)
# launch the kernel.
n = nn
a = tvm.nd.array(np.random.uniform(size=n).astype(A.dtype), ctx)
b = tvm.nd.array(np.random.uniform(size=n).astype(B.dtype), ctx)
c = tvm.nd.array(np.zeros(n, dtype=C.dtype), ctx)
fadd(a, b, c)
tvm.testing.assert_allclose(
c.asnumpy(), a.asnumpy() + b.asnumpy())
check_c()
def test_relay_id():
# x = relay.var("x")
# f = relay.Function([x], x)
x = relay.var('x', shape=[])
func = relay.Function([x], x)
ttype = relay.TensorType([], dtype='float32')
relay.FuncType([ttype], ttype)
mod = relay.module.Module()
func_gvar = relay.GlobalVar("f")
mod[func_gvar] = func
print(mod)
def test_add_pipeline():
nn = 1024
n = tvm.convert(nn)
A = tvm.placeholder((n,), name='A')
B = tvm.placeholder((n,), name='B')
AA = tvm.compute((n,), lambda *i: A(*i), name='A')
BB = tvm.compute((n,), lambda *i: B(*i), name='B')
T = tvm.compute(A.shape, lambda *i: AA(*i) + BB(*i), name='T')
C = tvm.compute(A.shape, lambda *i: T(*i), name='C')
s = tvm.create_schedule(C.op)
xo, xi = s[C].split(C.op.axis[0], factor=4)
xo1, xo2 = s[C].split(xo, factor=13)
s[C].parallel(xo2)
s[C].pragma(xo1, "parallel_launch_point")
s[C].pragma(xo2, "parallel_stride_pattern")
s[C].pragma(xo2, "parallel_barrier_when_finish")
s[C].vectorize(xi)
def check_c():
if not tvm.module.enabled("llvm"):
return
# Specifically allow offset to test codepath when offset is available
Ab = tvm.decl_buffer(
A.shape, A.dtype,
elem_offset=tvm.var('Aoffset'),
offset_factor=8,
name='A')
binds = {A : Ab}
# BUILD and invoke the kernel.
f1 = tvm.lower(s, [A,B,C], name="fadd_pipeline")
fsplits = [x for x in tvm.ir_pass.SplitHostDevice(f1)]
fsplits[0] = tvm.ir_pass.LowerTVMBuiltin(fsplits[0])
mhost = tvm.codegen.build_module(fsplits[0], "c")
temp = util.tempdir()
path_dso = temp.relpath("temp.so")
mhost.export_library(path_dso)
m = tvm.module.load(path_dso)
fadd = m["fadd_pipeline"]
ctx = tvm.cpu(0)
# launch the kernel.
n = nn
a = tvm.nd.array(np.random.uniform(size=n).astype(A.dtype), ctx)
b = tvm.nd.array(np.random.uniform(size=n).astype(B.dtype), ctx)
c = tvm.nd.array(np.zeros(n, dtype=C.dtype), ctx)
fadd(a, b, c)
tvm.testing.assert_allclose(
c.asnumpy(), a.asnumpy() + b.asnumpy())
with tvm.build_config(offset_factor=4):
check_c()
def test_reinterpret():
nn = 1024
n = tvm.convert(nn)
A = tvm.placeholder((n,), name='A', dtype="int32")
B = tvm.compute(A.shape, lambda *i: tvm.call_pure_intrin("float32", "reinterpret", A(*i)), name='B')
s = tvm.create_schedule(B.op)
def check_c():
mhost = tvm.build(s, [A, B], "c", name="reinterpret")
temp = util.tempdir()
path_dso = temp.relpath("temp.so")
mhost.export_library(path_dso)
m = tvm.module.load(path_dso)
fadd = m['reinterpret']
ctx = tvm.cpu(0)
n = nn
a = tvm.nd.array(np.random.randint(-2 ** 30, 2 ** 30, size=n).astype(A.dtype), ctx)
b = tvm.nd.array(np.zeros(n, dtype=B.dtype), ctx)
fadd(a, b)
tvm.testing.assert_allclose(
b.asnumpy(), a.asnumpy().view('float32'))
check_c()
if __name__ == "__main__":
test_add()
test_add_pipeline()
test_reinterpret()
......@@ -24,7 +24,7 @@ def _default_schedule(outs, auto_inline):
"""Default schedule for llvm."""
target = tvm.target.current_target(allow_none=False)
outs = [outs] if isinstance(outs, tvm.tensor.Tensor) else outs
if target.target_name != "llvm":
if target.target_name not in ("llvm", "c"):
raise RuntimeError("schedule not registered for '%s'" % target)
s = tvm.create_schedule([x.op for x in outs])
if auto_inline:
......
......@@ -36,7 +36,7 @@ def pool_grad_nchw(a_np, out_grad_np,
pad_np = np.zeros(shape=(n, ic, ih+pt+pb, iw+pl+pr)).astype(dtype)
no_zero = (range(n), range(ic), (range(pt, ih+pt)), (range(pl, iw+pl)))
pad_np[np.ix_(*no_zero)] = a_np
_, oc, oh, ow = out_grad_np.shape
_, _, oh, ow = out_grad_np.shape
pool_grad_np = np.zeros(shape=a_np.shape)
pad_pool_grad_np = np.zeros(shape=pad_np.shape)
......
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment