Commit 0edf87e8 by Sergei Grechanik, committed by Yizhi Liu

[NNVM][TEST] Test against numerical grad (#1505)

* [NNVM][TEST] Numerical gradient testing

* [NNVM][TEST] Make some tests a little faster

* Fix the failing test_top_level3

* Target exclusion for the check_function

* Try to ignore singularities

* grad_input_vars now can't contain shapes

* Don't pass unnecessary grad_input_vars to check_function

* Multiple outputs; fixes; testing of check_function

* Use numerical_grads_params to pass parameters to numgrad checker

* Fail when no action is requested explicitly

* Pass additional params to functions

* Silence the linter issue

* Simplified numgrad checking

* Improved docs for check_function

* Fixed the error message when no dtype is provided

* Several fixes

* Tests with shape/dtype inference for inputs

* Don't check dense's grads on cuda

* Raise an error if output dtypes haven't been inferred

* Moved shape/dtype inference into a separate function; use float32 as fallback

* Remove redundant dtype=float32

* Fix multiple outputs

* Use check_function in the rest of the test_top_level1
parent 6eecec92
......@@ -11,3 +11,4 @@ This document contains the python API to NNVM compiler toolchain.
symbol
graph
top
testing
nnvm.testing
------------
.. automodule:: nnvm.testing
.. autofunction:: nnvm.testing.ctx_list
nnvm.testing.check_computation
------------------------------
.. automodule:: nnvm.testing.check_computation
:members:
.. include:: testing_new_ops.rst
Testing new operations
----------------------
When adding new operations, it is a good idea to test them. Testing
should be done with the function ``nnvm.testing.check_function``. You
should provide it with the symbol representing the result of a
computation and a reference numpy implementation. By default, it will
also check analytical gradients against numerical gradients if
analytical gradients are implemented for your operation. You can also
pass a reference implementation for the gradients, but numerical
gradients will still be checked. Numerical gradient checking may be
switched off explicitly, but this is generally not a good idea.
Here is an example testing the logarithm operation:
.. code:: python
import numpy as np
import nnvm
import nnvm.symbol as sym
from nnvm.testing.check_computation import check_function
x = sym.Variable("x")
y = sym.log(x)
def forward(x):
return np.log(x)
def backward(head_grads, x):
return [1. / x * head_grads]
dtype = "float32"
shape = {'x': (1, 3, 32, 32)}
check_function(y, forward, backward, in_range=(0.001, 2.0), dtype=dtype, shape=shape)
If you run the code above, you might get an ``AssertionError`` in rare
cases. That’s why it is recommended to run new tests many times.
.. code:: python
for _ in range(10000):
check_function(y, forward, backward, in_range=(0.001, 2.0), dtype=dtype, shape=shape)
If you run the code above, then sooner or later you will get an exception
that may look like this:
.. code-block:: text
AssertionError: Analytical and numerical grads wrt x differ too much
analytical grad = [
...
]
numerical grad = [
...
]
distance > atol*sqrt(n) + rtol*grad_norm
distance 308.50885009765625 > 0.01*55.42562584220407 + 0.1*2167.70703125
It means that either you have a mistake in the ``FGradient`` function or
the numerical error is too high. Generally, if the printed gradients
differ only slightly or just in a single position, it is a numerical
error. But if the gradients look completely different, especially if many
corresponding positions have different signs, then something is probably
wrong with the analytical gradient implementation.
Then try to make the error reproducible, and also try to reduce the shape
of the inputs, but not too much; a vector of 10 elements is a reasonable
choice. You also won’t need the reference functions ``forward`` and
``backward``, and restricting the set of targets is a good idea as well.
Since the error may manifest itself only in rare cases, you might want to
run the check in a loop.
.. code:: python
shape = {'x': (10,)}
np.random.seed(42)
for _ in range(1000):
check_function(y, in_range=(0.001, 2.0), dtype=dtype, shape=shape,
numerical_grads=True, only_targets=['llvm'])
Running this code will result in the following:
.. code-block:: text
check_function failed while checking gradients numerically, here is the main graph
Graph(%x, %head_grads_0) {
%x, shape=[10], dtype=0
%head_grads_0, shape=[10], dtype=0
%1 = log(%x), shape=[10], dtype=0
%3 = elemwise_div(%head_grads_0, %x), shape=[10], dtype=0
ret %1, %3, %head_grads_0
}
graph_attr_keys = [layout_inputs, dtype_num_unknown_nodes, dtype, shape_num_unknown_nodes, shape]
Generated inputs:
{'x': array([2.5660574e-01, 1.5313280e+00, 1.0232578e-03, 8.3371508e-01,
1.0454979e+00, 1.1021420e-01, 1.9461832e+00, 4.5302454e-01,
6.0909325e-01, 6.0858107e-01], dtype=float32), 'head_grads_0': array([0.4616029 , 0.00394617, 1.4589603 , 1.9337242 , 0.44936267,
1.3264314 , 1.4840508 , 1.6970023 , 0.84583575, 0.60655886],
dtype=float32)}
...
AssertionError: Analytical and numerical grads wrt x differ too much
analytical grad = [1.7988799e+00 2.5769596e-03 1.4257993e+03 2.3194065e+00 4.2980734e-01
1.2035031e+01 7.6254421e-01 3.7459390e+00 1.3886802e+00 9.9667716e-01]
numerical grad = [1.7948151e+00 1.9073486e-03 9.9268610e+02 2.3174286e+00 4.2915344e-01
1.1980057e+01 7.6198578e-01 3.7412643e+00 1.3866425e+00 9.9563599e-01]
distance > atol*sqrt(n) + rtol*grad_norm
distance 433.11322021484375 > 0.01*3.1622776601683795 + 0.1*992.7716674804688
In this case the largest difference is in the 2nd position (counting from
0), which corresponds to the input value ``1.0232578e-03``. This value is
too close to the singularity of the logarithm, so the numerical derivative
gets too imprecise. The solution is to shrink the range for ``x``; here,
for example, ``(0.002, 2.0)`` turned out to be enough. Don’t forget to run
lots of tests, so that other people don’t get false positives.
.. code:: python
for _ in range(100):
check_function(y, in_range={x: (0.002, 2.0)}, dtype=dtype, shape=(1, 3, 32, 32),
numerical_grads=True, only_targets=['llvm'])
If you need more precise control over which values are passed to the
function being checked, you can use ``values={x: ...}``:
.. code:: python
x_val = np.array([1.2594858e+00, 1.0960974e-01, 1.4975418e+00, 6.3585603e-01,
1.2692513e-03, 1.0227472e+00, 9.4656967e-02, 5.5306298e-01,
1.4142460e+00, 1.2631655e-01], dtype=np.float32)
check_function(y, values={x: x_val}, dtype=dtype, shape=shape,
numerical_grads=True, only_targets=['llvm'])
......@@ -13,3 +13,4 @@ from . import inception_v3
from . import dcgan
from . import dqn
from . import yolo2_detection
from . import check_computation
# pylint: disable=cell-var-from-loop,no-else-return
"""Helper utilities to check functions and their gradients."""
from __future__ import absolute_import as _abs
import logging
import numpy as np
import tvm
from tvm.contrib import graph_runtime
import nnvm
from nnvm.compiler import graph_util
from nnvm.compiler.graph_attr import TCODE_TO_DTYPE, DTYPE_TO_TCODE
from .config import ctx_list
def infer_shapes_dtypes(graph, shape=None, dtype=None, fallback_dtype=None):
"""Runs dtype and shape inference passes on a graph and returns the resulting graph
along with the inferred information.
Parameters
----------
graph : nnvm.graph.Graph
A graph we want to run inference on.
shape : Dict[str, Tuple[int]] or Tuple[int], optional
A dict mapping input variable names to shapes.
By default shapes will be inferred from variables' attributes.
Note that this parameter takes precedence over variables' attributes.
dtype : Dict[str, str] or str, optional
A dict mapping input variable names to dtypes, or just a single dtype.
By default dtypes will be inferred from variables' attributes.
Note that this parameter takes precedence over variables' attributes.
fallback_dtype : str, optional
A dtype that will be used for variables whose dtype can't be inferred from other
variables' dtypes.
Returns
-------
graph : nnvm.graph.Graph
The resulting graph with dtype and shape information on its nodes.
input_shapes : Dict[str, Tuple[int]]
The inferred shapes of input variables merged with the `shape` dictionary.
input_dtypes : Dict[str, str]
The inferred dtypes of input variables merged with the `dtype` dictionary.
output_shapes : List[Tuple[int]]
The inferred shapes of outputs.
output_dtypes : List[str]
The inferred dtypes of outputs.
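Example
-------
A minimal usage sketch (the variables and the expected results below are illustrative,
not taken from a real test):

.. code-block:: python

    x = nnvm.symbol.Variable("x", shape=(2, 3), dtype=0)
    y = nnvm.symbol.Variable("y")
    graph = nnvm.graph.create(x + y)
    graph, in_shapes, in_dtypes, out_shapes, out_dtypes = \
        infer_shapes_dtypes(graph, fallback_dtype='float32')
    # The shape and dtype of y and of the output should be inferred from x,
    # e.g. in_shapes['y'] == (2, 3) and out_dtypes == ['float32']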
"""
# Preprocess input parameters
if shape is None:
shape = {}
if dtype is None:
dtype = {}
if not isinstance(shape, dict):
shape = {x: shape for x in graph.symbol.list_input_variables()}
if not isinstance(dtype, dict):
dtype = {x: dtype for x in graph.symbol.list_input_variables()}
shape = _dict_var_to_dict_str(shape)
dtype = _dict_var_to_dict_str(dtype)
# The graph may already contain shape and dtype info, so extract it and merge with
# the user-specified shapes and dtypes (use the user-specified one on contradiction)
all_initial_shapes = graph.json_attr('shape')
all_initial_dtypes = graph.json_attr('dtype')
if all_initial_shapes:
for x in graph.index.input_names:
if x not in shape:
x_shape = tuple(all_initial_shapes[graph.index.entry_id(x)])
shape[x] = x_shape
if all_initial_dtypes:
for x in graph.index.input_names:
if x not in dtype:
x_dtype = TCODE_TO_DTYPE[all_initial_dtypes[graph.index.entry_id(x)]]
dtype[x] = x_dtype
# Perform inference
nnvm.compiler.graph_attr.set_shape_inputs(graph, shape)
nnvm.compiler.graph_attr.set_dtype_inputs(graph, dtype)
graph = graph.apply('InferShape').apply('InferType')
shapes = graph.json_attr('shape')
dtypes = graph.json_attr('dtype')
out_len = len(graph.symbol.list_output_names())
index = graph.index
output_shapes = \
[tuple(shapes[index.entry_id(index.output_entries[i])]) for i in range(out_len)]
output_dtypes = \
[TCODE_TO_DTYPE[dtypes[index.entry_id(index.output_entries[i])]] for i in range(out_len)]
# Postprocess the results
input_shapes = shape.copy()
input_dtypes = dtype.copy()
for x in graph.symbol.list_input_variables():
x_name = x.attr('name')
x_node_id = graph.index.node_id(x_name)
input_shapes[x_name] = tuple(shapes[x_node_id])
input_dtypes[x_name] = TCODE_TO_DTYPE[dtypes[x_node_id]]
# Merge the original user-specified shapes in case some of them are specified for non-existing
# variables
for x_name, x_shape in shape.items():
x_shape = tuple(x_shape)
if input_shapes.get(x_name, x_shape) != x_shape:
raise RuntimeError("Inferred shape differs from the provided shape.\n"
"Provided shapes: {}\nInferred shapes: {}"
.format(shapes, input_shapes))
else:
input_shapes[x_name] = x_shape
# Merge the original user-specified dtypes
for x_name, x_dtype in dtype.items():
if not isinstance(x_dtype, str):
x_dtype = TCODE_TO_DTYPE[x_dtype]
if input_dtypes.get(x_name, x_dtype) != x_dtype:
raise RuntimeError("Inferred dtype differs from the provided dtype.\n"
"Provided dtypes: {}\nInferred dtypes: {}"
.format(dtypes, input_dtypes))
else:
input_dtypes[x_name] = x_dtype
# If some dtypes weren't inferred and there is a fallback dtype, assign it to those variables
# and repeat the inference
if fallback_dtype is not None and not all(input_dtypes.values()):
input_dtypes = {x: input_dtypes[x] if input_dtypes[x] else fallback_dtype
for x in input_dtypes}
return infer_shapes_dtypes(graph, input_shapes, input_dtypes, fallback_dtype=None)
return graph, input_shapes, input_dtypes, output_shapes, output_dtypes
def graph_to_function(graph, target, ctx, shape=None, dtype=None):
"""Convert a graph to a function taking a keyword args and returning a list of results
(both args and results are numpy arrays).
Example::
fun = graph_to_function(graph, "llvm", tvm.cpu(0))
[res1, res2] = fun(x=np.zeros((1,2)), y=np.zeros((1,)))
Parameters
----------
graph : nnvm.graph.Graph
A graph we want to convert to a function.
target : str or :any:`tvm.target.Target`
The build target
ctx : TVMContext
The context to deploy the module.
shape : Dict[str, Tuple[int]], optional
A dict mapping input variable names to shapes.
By default shapes will be inferred from variables' attributes.
Note that this parameter takes precedence over variables' attributes.
dtype : Dict[str, str] or str, optional
A dict mapping input variable names to dtypes, or just a single dtype.
By default dtypes will be inferred from variables' attributes.
Note that this parameter takes precedence over variables' attributes.
Returns
-------
function : Callable[..., List[numpy.ndarray]]
"""
# Infer missing shapes and dtypes
graph, shape, dtype, output_shapes, output_dtypes = \
infer_shapes_dtypes(graph, shape=shape, dtype=dtype)
if None in dtype.values():
raise ValueError("Input variables with no type: {}".format(dtype))
if not all(shape.values()):
raise ValueError("Input variables with no shape: {}".format(shape))
compute_graph, lib, params = nnvm.compiler.build(graph, target, shape=shape, dtype=dtype)
module = graph_runtime.create(compute_graph, lib, ctx)
if params:
module.set_input(**params)
def run(**kwargs):
module.run(**kwargs)
res = []
for i, (o_shape, o_dtype) in enumerate(zip(output_shapes, output_dtypes)):
res.append(module.get_output(i, tvm.nd.empty(o_shape, o_dtype)).asnumpy())
return res
return run
def _dict_var_to_dict_str(dictionary):
"""Convert a Dict[nnvm.Symbol, T] to Dict[str, T]"""
if isinstance(dictionary, dict):
return {s.attr('name') if isinstance(s, nnvm.symbol.Symbol) else s:
dictionary[s] for s in dictionary}
else:
return dictionary
def check_function(symbol, forward=None, backward=None, grad_input_vars=None,
shape=None, dtype=None, in_range=None, values=None,
exclude_targets=None, only_targets=None,
additional_params=None,
numerical_grads=None, numerical_grads_params=None,
atol=1e-5, rtol=1e-5, quiet=False):
"""Compute the function and/or its gradients on a random input and raise
an exception if the result doesn't match the reference implementation.
Parameters
----------
symbol : nnvm.Symbol
A symbol representing the output.
forward : Callable[..., List[numpy.ndarray]], optional
A reference implementation to compare with.
backward : Callable[..., List[numpy.ndarray] or Dict[str, numpy.ndarray]], optional
A reference implementation of gradients. Besides the normal inputs, it should also
accept head_grads, which is a list of gradients of some scalar wrt the outputs
(or just a single gradient if there is only one output).
Should return either a dict mapping input variable names to the respective
gradients or a list of gradients wrt variables from grad_input_vars in
exactly the same order (in alphabetical order by default).
grad_input_vars : List[nnvm.Symbol or str], optional
A list of variables with respect to which the gradients will be computed.
None (default) means that all input variables will be used in alphabetical order.
shape : Dict[nnvm.Symbol or str, Tuple[int]] or Tuple[int], optional
A dict mapping input variable names to shapes, or just a single shape.
By default shapes will be inferred from variables' attributes (see the Examples).
Note that this parameter takes precedence over variables' attributes.
dtype : Dict[nnvm.Symbol or str, str] or str, optional
A dict mapping input variable names to dtypes, or just a single dtype.
By default dtypes will be inferred from variables' attributes (see the Examples).
If dtypes cannot be inferred for some variables then float32 will be used as a fallback.
Note that this parameter takes precedence over variables' attributes.
in_range : Dict[nnvm.Symbol or str, (float, float)] or (float, float), optional
A dict mapping input variable names to ranges or just a single range
(the same for all variables). Input values will be generated from
uniform distributions on these ranges. `head_grads` can also be
assigned a range this way.
values : Dict[nnvm.Symbol or str, numpy.ndarray], optional
A dict explicitly providing values for some variables instead of random generation.
exclude_targets : Set[str], optional
Skip compiling and running anything for these targets.
only_targets : Set[str], optional
Test only for those targets from `ctx_list()` that are also in this set.
additional_params : dict, optional
A dict of additional parameters which will be passed to forward and backward.
numerical_grads : bool or 'if_possible', optional
Whether to additionally check against numerically computed gradients. If 'if_possible' or
None (the default) is passed, gradients will be checked numerically only if the gradient
computation graph can be created (if some operations have unimplemented gradients, only a
warning will be issued).
Checking against numerical gradients is done via the `check_numerical_grads` function.
numerical_grads_params : dict, optional
Additional parameters for `check_numerical_grads`.
atol : float, optional
Absolute tolerance for `np.testing.assert_allclose`. NOT used for numerical gradients.
rtol : float, optional
Relative tolerance for `np.testing.assert_allclose`. NOT used for numerical gradients.
quiet : bool, optional
Don't dump additional information to stdout on failure.
Examples
--------
.. code-block:: python
x = sym.Variable("x", shape=(1, 2))
y = sym.Variable("y", shape=(1, 2))
# check the function and its gradients both numerically and using a reference function
check_function(x + 2*y,
lambda x, y: x + 2*y,
lambda x, y, head_grads: {'x': head_grads, 'y': 2*head_grads})
# just check gradients numerically
check_function(x + 2*y, numerical_grads=True)
# just check the forward computation
check_function(x + 2*y, lambda x, y: x + 2*y, numerical_grads=False)
# specifying dtype
check_function(x + 2*y, lambda x, y: x + 2*y, dtype='float64')
# dtypes can also be specified during variable creation with dtype codes
x = sym.Variable("x", dtype=0)
check_function(x + 1, shape=(2, 2), numerical_grads=True)
"""
# validate and preprocess the input params
if numerical_grads is None and forward is None and backward is None:
raise ValueError("No reference function was passed to check_function. If you only want to "
"check gradients numerically, pass numerical_grads=True explicitly.")
if numerical_grads is None:
numerical_grads = 'if_possible'
if numerical_grads not in [False, True, 'if_possible']:
raise ValueError("numerical_grads must be a bool or 'if_possible', not {}"
.format(numerical_grads))
if additional_params is None:
additional_params = {}
input_vars = symbol.list_input_variables()
input_dict = {x.attr('name'): x for x in input_vars}
if grad_input_vars is None:
grad_input_vars = sorted(input_vars, key=lambda x: x.attr('name'))
else:
grad_input_vars = [input_dict[x] if isinstance(x, str) else x for x in grad_input_vars]
in_range = _dict_var_to_dict_str(in_range)
values = _dict_var_to_dict_str(values)
out_len = len(symbol.list_output_names())
# Infer the output shapes and dtypes, and preprocess the shape and dtype params
forward_graph, shape, dtype, out_shapes, out_dtypes = \
infer_shapes_dtypes(nnvm.graph.create(symbol), shape=shape, dtype=dtype,
fallback_dtype='float32')
if not all(out_shapes) or not all(out_dtypes):
if not quiet:
print(forward_graph.ir(join_node_attrs=['shape', 'dtype']))
raise ValueError("Could not infer shapes or dtypes for outputs.\n"
"out_shapes = {}\nout_dtypes = {}".format(out_shapes, out_dtypes))
backward_graph = None
# If we want gradients, we have to recreate the graph, but now with gradient computations
# Note that here we need out_shapes for defining the shape of head grads, so we have to
# create the graph twice
if backward is not None or numerical_grads:
try:
head_grads_symbols = [nnvm.symbol.Variable("head_grads_" + str(i),
shape=out_shapes[i],
dtype=DTYPE_TO_TCODE[out_dtypes[i]])
for i in range(out_len)]
grad_symbols = graph_util.gradients([symbol], grad_input_vars,
grad_ys=head_grads_symbols)
# Sometimes grads do not depend on head_grads, so head_grads does not appear
# in the variable list; adding it manually prevents this, making things a bit easier
backward_graph = \
nnvm.graph.create(nnvm.symbol.Group([symbol] + grad_symbols + head_grads_symbols))
backward_graph, shape, dtype, out_shapes, out_dtypes = \
infer_shapes_dtypes(backward_graph, shape=shape, dtype=dtype,
fallback_dtype='float32')
except nnvm._base.NNVMError as err:
if backward is None and numerical_grads == "if_possible":
logging.warning("Won't check gradients because: %s", str(err).split('\n', 1)[0])
numerical_grads = False
backward_graph = None
else:
raise
main_graph = backward_graph if backward_graph is not None else forward_graph
# Generate random data for inputs (including head_grads)
np_inputs = {}
for x in main_graph.symbol.list_input_variables():
x_name = x.attr('name')
x_shape = shape[x_name]
x_dtype = dtype[x_name]
if values is not None and x_name in values:
np_inputs[x_name] = values[x_name].astype(x_dtype)
continue
low = -1.0
high = 1.0
if in_range is not None:
if isinstance(in_range, dict):
if x_name in in_range:
low = in_range[x_name][0]
high = in_range[x_name][1]
else:
low = in_range[0]
high = in_range[1]
np_inputs[x_name] = np.random.uniform(size=x_shape, low=low, high=high).astype(x_dtype)
np_inputs_without_head_grads = {k: np_inputs[k] for k in np_inputs
if not k.startswith('head_grads_')}
nothing_was_done = True
# Compute and compare the results
for target, ctx in ctx_list():
if exclude_targets is not None:
if target in exclude_targets or str(target) in exclude_targets:
logging.info("Skipping target = %s, ctx = %s", target, ctx)
continue
if only_targets is not None:
if target not in only_targets and str(target) not in only_targets:
logging.info("Skipping target = %s, ctx = %s", target, ctx)
continue
logging.info("Checking computation on target = %s, ctx = %s", target, ctx)
debug_stage = None
try:
nnvm_res = None
debug_stage = "compiling"
main_function = graph_to_function(main_graph, target, ctx)
# nnvm_res contains the output and gradients (if they are needed)
debug_stage = "running"
nnvm_res = main_function(**np_inputs)
if backward_graph is not None:
grad_var_names = [x.attr('name') for x in grad_input_vars]
nnvm_grads = {x: v for x, v in zip(grad_var_names, nnvm_res[out_len:])}
if forward is not None:
nothing_was_done = False
debug_stage = "checking forward computation"
logging.debug(debug_stage)
params = {}
params.update(np_inputs_without_head_grads)
params.update(additional_params)
numpy_res = forward(**params)
if isinstance(numpy_res, tuple):
numpy_res = list(numpy_res)
if not isinstance(numpy_res, list):
numpy_res = [numpy_res]
if len(numpy_res) != out_len:
raise ValueError("Forward function returned {} values, but "
"the nnvm graph returns {} values"
.format(len(numpy_res), out_len))
for i in range(out_len):
np.testing.assert_allclose(nnvm_res[i], numpy_res[i], atol=atol, rtol=rtol)
if backward is not None:
nothing_was_done = False
debug_stage = "checking gradients"
logging.debug(debug_stage)
np_head_grads = [np_inputs["head_grads_" + str(i)] for i in range(out_len)]
if out_len == 1:
np_head_grads = np_head_grads[0]
params = {'head_grads': np_head_grads}
params.update(np_inputs_without_head_grads)
params.update(additional_params)
numpy_grads = backward(**params)
if not isinstance(numpy_grads, dict):
if isinstance(numpy_grads, tuple):
numpy_grads = list(numpy_grads)
if not isinstance(numpy_grads, list):
numpy_grads = [numpy_grads]
numpy_grads = {x: v for x, v in zip(grad_var_names, numpy_grads)}
if len(numpy_grads) != len(grad_var_names):
raise ValueError("The backward function returns a list of gradients which "
"does not contain gradients for these variables: {}"
.format(set(grad_var_names) - set(numpy_grads)))
for x_name in numpy_grads:
np.testing.assert_allclose(nnvm_grads[x_name], numpy_grads[x_name],
atol=atol, rtol=rtol)
if numerical_grads:
nothing_was_done = False
debug_stage = "checking gradients numerically"
logging.debug(debug_stage)
forward_function = graph_to_function(forward_graph, target, ctx)
# Since the result may be non-scalar, we have to put another operation on top,
# so we just multiply by the randomly generated head_grads and then sum everything.
# This way we can reuse the gradient values that have already been computed.
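# The gradient of sum_i dot(head_grads_i, output_i) wrt an input is the sum of the
# corresponding vector-Jacobian products, which is exactly what the gradient graph
# computes for the same head_grads values, so the numerical gradients of this scalar
# function can be compared with nnvm_grads directly.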
def scalar_function(**kwargs):
res = forward_function(**kwargs)
return np.sum([np.dot(np_inputs['head_grads_' + str(i)].ravel(), res[i].ravel())
for i in range(out_len)])
if numerical_grads_params is None:
numerical_grads_params = {}
check_numerical_grads(
scalar_function,
input_values=np_inputs_without_head_grads,
grad_values=nnvm_grads,
**numerical_grads_params)
except:
if not quiet:
print("\ncheck_function failed while {}, here is the main graph"
.format(debug_stage))
print(main_graph.ir(join_node_attrs=['shape', 'dtype']))
if nnvm_res is not None:
print("Generated inputs:")
print(np_inputs)
print()
raise
if nothing_was_done:
logging.warning("Nothing was done in check_function. Check ctx_list().")
def check_numerical_grads(function, input_values, grad_values, function_value=None,
delta=1e-3, atol=1e-2, rtol=0.1):
"""A helper function that checks that numerical gradients of a function are equal to
gradients computed in some different way (analytical gradients).
Numerical gradients are computed using finite difference approximation. To reduce the number of
function evaluations, the number of points used is gradually increased if the error value is
too high (up to 5 points).
Parameters
----------
function
A function that takes inputs as keyword arguments (like `function(**input_values)`) and
returns a scalar result. Should accept numpy ndarrays.
input_values : Dict[str, numpy.ndarray]
A dict assigning values to variables. Represents the point at which gradients should be
computed.
grad_values : Dict[str, numpy.ndarray]
Gradients computed using a different method.
function_value : float, optional
Should be equal to `function(**input_values)`.
delta : float, optional
A small number used for numerical computation of partial derivatives. The default 1e-3 is a
good choice for float32.
atol : float, optional
Absolute tolerance.
rtol : float, optional
Relative tolerance.
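Example
-------
A minimal sketch checking the gradient of a simple quadratic function (the function and
values below are illustrative):

.. code-block:: python

    def func(x):
        return float(np.sum(x*x))

    x_val = np.array([1.0, 2.0, 3.0], dtype='float32')
    # the analytical gradient of sum(x*x) wrt x is 2*x
    check_numerical_grads(func, input_values={'x': x_val}, grad_values={'x': 2*x_val})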
"""
if function_value is None:
function_value = function(**input_values)
# a helper to modify j-th element of val by a_delta
def modify(val, j, a_delta):
val = val.copy()
val.reshape(-1)[j] = val.reshape(-1)[j] + a_delta
return val
# numerically compute a partial derivative with respect to the j-th element of the variable `x_name`
def derivative(x_name, j, a_delta):
modified_values = {n: modify(val, j, a_delta) if n == x_name else val
for n, val in input_values.items()}
return (function(**modified_values) - function_value)/a_delta
def compare_derivative(j, n_der, grad):
der = grad.reshape(-1)[j]
return np.abs(n_der - der) < atol + rtol*np.abs(n_der)
for x_name, grad in grad_values.items():
if grad.shape != input_values[x_name].shape:
raise AssertionError(
"Gradient wrt '{}' has unexpected shape {}, expected {} "
.format(x_name, grad.shape, input_values[x_name].shape))
ngrad = np.zeros_like(grad)
# compute partial derivatives for each position in this variable
for j in range(np.prod(grad.shape)):
# forward difference approximation
nder = derivative(x_name, j, delta)
# if the derivative is not equal to the analytical one, try to use more
# precise and expensive methods
if not compare_derivative(j, nder, grad):
# central difference approximation
nder = (derivative(x_name, j, -delta) + nder)/2
if not compare_derivative(j, nder, grad):
# central difference approximation using h = delta/2
cnder2 = (derivative(x_name, j, delta/2) + derivative(x_name, j, -delta/2))/2
# five-point derivative
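# (Richardson extrapolation of the two central differences, which cancels
# the O(delta**2) error term of the central difference approximation)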
nder = (4*cnder2 - nder)/3
ngrad.reshape(-1)[j] = nder
dist = np.sqrt(np.sum((ngrad - grad)**2))
grad_norm = np.sqrt(np.sum(ngrad**2))
# we multiply atol by this number to make it more universal for different sizes
sqrt_n = np.sqrt(float(np.prod(grad.shape)))
if dist > atol*sqrt_n + rtol*grad_norm:
raise AssertionError(
"Analytical and numerical grads wrt {} differ too much\n"
"analytical grad = {}\n numerical grad = {}\n"
"distance > atol*sqrt(n) + rtol*grad_norm\n"
"distance {} > {}*{} + {}*{}"
.format(x_name, grad, ngrad,
dist, atol, sqrt_n, rtol, grad_norm))
max_diff = np.max(np.abs(ngrad - grad))
avg_diff = np.mean(np.abs(ngrad - grad))
logging.info("Numerical grad test wrt %s of shape %s passes, "
"dist = %f, max_diff = %f, avg_diff = %f",
x_name, grad.shape, dist, max_diff, avg_diff)
......@@ -5,49 +5,162 @@ import topi.testing
import nnvm.symbol as sym
import nnvm.compiler
from nnvm.testing.config import ctx_list
from nnvm.testing.check_computation import check_function
def helper(symbol, inputs, dtype,
np_forward, np_backward=None,
need_input=True, need_head_grads=True,
rnd_min=-1, rnd_max=1):
ishapes = {}
itypes = {}
input_syms = []
np_inputs = {}
for (name, shape, s) in inputs:
ishapes.update({name: shape})
itypes.update({name: dtype})
np_inputs.update({name: np.random.uniform(rnd_min, rnd_max, size=shape).astype(dtype)})
input_syms.append(s)
for target, ctx in ctx_list():
graph, lib, _ = nnvm.compiler.build(symbol, target, ishapes, itypes)
m = graph_runtime.create(graph, lib, ctx)
m.run(**np_inputs)
y_np = np_forward(**np_inputs)
out = m.get_output(0, tvm.nd.empty(y_np.shape, dtype))
np.testing.assert_allclose(out.asnumpy(), y_np, atol=1e-5, rtol=1e-5)
# backward
if np_backward:
graph._set_symbol_list_attr("grad_ys", symbol)
graph._set_symbol_list_attr("grad_xs", input_syms)
graph._set_symbol_list_attr("grad_ys_out_grad", sym.Variable("head_grads", shape=y_np.shape))
graph = graph.apply("Gradient")
ishapes.update({"head_grads": y_np.shape})
graph, lib, _ = nnvm.compiler.build(graph, target, ishapes)
m = graph_runtime.create(graph, lib, ctx)
head_grads = np.random.uniform(size=y_np.shape).astype(dtype)
y_np = np_backward(head_grads=head_grads, **np_inputs)
b_inputs = {}
if need_input:
b_inputs.update(np_inputs)
if need_head_grads:
b_inputs.update({"head_grads":head_grads})
m.run(**b_inputs)
for i in range(len(y_np)):
out = m.get_output(i, tvm.nd.empty(y_np[i].shape, dtype))
np.testing.assert_allclose(out.asnumpy(), y_np[i], atol=1e-5, rtol=1e-5)
def test_check_function():
# test the testing function
x = sym.Variable("x")
y = sym.Variable("y")
# different styles of returning gradients from the backward function
check_function(x + 2*y, lambda x, y: x + 2*y,
lambda x, y, head_grads: [head_grads, 2*head_grads],
shape={'x': (1, 2), y: (1, 2)}, dtype='float32')
check_function(x + 2*y, lambda x, y: x + 2*y,
lambda x, y, head_grads: (head_grads, 2*head_grads),
shape={'x': (1, 2), y: (1, 2)}, dtype='float32')
check_function(x + 2*y, lambda x, y: x + 2*y,
lambda x, y, head_grads: {'x': head_grads, 'y': 2*head_grads},
shape={'x': (1, 2), y: (1, 2)}, dtype='float32')
check_function(x + 2*y, lambda x, y: x + 2*y,
lambda x, y, head_grads: {'y': 2*head_grads},
shape={'x': (1, 2), y: (1, 2)}, dtype='float32')
check_function(x + 2*y, lambda x, y: x + 2*y,
lambda x, y, head_grads: [2*head_grads],
grad_input_vars=[y],
shape={'x': (1, 2), y: (1, 2)}, dtype='float32')
check_function(x + 2*y, lambda x, y: x + 2*y,
lambda x, y, head_grads: 2*head_grads,
grad_input_vars=[y],
shape={'x': (1, 2), y: (1, 2)}, dtype='float32')
check_function(x + 2*y, lambda x, y: x + 2*y,
lambda x, y, head_grads: 2*head_grads,
grad_input_vars=[y],
shape={'x': (1, 2), y: (1, 2)}, dtype='float64')
# test just numerical gradients
# different styles of shape and dtype passing
check_function(x + 2*y, shape={'x': (1, 2), y: (1, 2)},
numerical_grads=True)
check_function(x + 2*y, shape={'x': (1, 2), y: (1, 2)}, dtype='float32',
numerical_grads=True)
check_function(x + 2*y, shape={'x': (1, 2), y: (1, 2)}, dtype={x: 'float32', 'y': 'float32'},
numerical_grads=True)
check_function(x + 2*y, shape=(1, 2), dtype='float32',
numerical_grads=True)
# specifying variable attributes on variable creation
# (in this case type codes must be used)
x = sym.Variable("x", dtype=0, shape=(1, 2))
check_function(x + 2*y, shape={y: (1, 2)}, dtype={'y': 'float32'}, numerical_grads=True)
y = sym.Variable("y", dtype=0, shape=(1, 2))
# shape overriding
def _fwd1(x, y):
assert x.shape == (1, 1)
assert y.shape == (1, 2)
return x + 2*y
check_function(x + 2*y, _fwd1, shape={x: (1, 1)})
# in_range
def _fwd2(x, y):
assert x.shape == (100,)
assert (x <= 0.9).all()
assert (x >= 0.8).all()
return x + 2*y
check_function(x + 2*y, _fwd2, shape=(100,), in_range=(0.8, 0.9), numerical_grads=False)
check_function(x + 2*y, _fwd2, shape=(100,), in_range={'x': (0.8, 0.9)}, numerical_grads=False)
check_function(x + 2*y, backward=lambda x, y, head_grads: [1.0, 2.0],
in_range={'head_grads_0': (1.0, 1.0)})
# explicit passing of values
check_function(x + 2*y, backward=lambda x, y, head_grads: [1.0, 2.0],
values={'head_grads_0': np.full((1, 2), 1.0)})
# check that the function reports errors
def _check_function_must_fail(*args, **kwargs):
error = AssertionError
if 'error' in kwargs:
error = kwargs['error']
del kwargs['error']
try:
check_function(*args, quiet=True, **kwargs)
except error:
pass
else:
raise AssertionError("check_function didn't raise an exception")
_check_function_must_fail(x + 2*y, error=ValueError)
_check_function_must_fail(x + 2*y, lambda x, y: x + y)
_check_function_must_fail(x + 2*y, backward=lambda x, y, head_grads: [1.0, 2.0])
_check_function_must_fail(sym.block_grad(x + 2*y), numerical_grads=True)
_check_function_must_fail(x*x, numerical_grads=True,
numerical_grads_params={'atol': 0.0, 'rtol': 0.0})
# different styles of returning results from the forward function
check_function(x + 2*y, lambda x, y: [x + 2*y], numerical_grads=False)
_check_function_must_fail(x + 2*y, lambda x, y: [x + 2*y, x], numerical_grads=False,
error=ValueError)
_check_function_must_fail(x + 2*y, lambda x, y: [], numerical_grads=False,
error=ValueError)
# multiple outputs
z = sym.Group([2*x + y, x + 2*y])
check_function(z, lambda x, y: [2*x + y, x + 2*y])
check_function(z, lambda x, y: (2*x + y, x + 2*y))
check_function(z, backward=lambda x, y, head_grads: [2*head_grads[0] + head_grads[1],
head_grads[0] + 2*head_grads[1]])
_check_function_must_fail(z, backward=lambda x, y, head_grads: [2*head_grads[0],
2*head_grads[1]])
check_function(z, backward=lambda x, y, head_grads: [head_grads[1], 2*head_grads[1]],
in_range={'head_grads_0': (0, 0)})
check_function(z, numerical_grads=True)
z = sym.Group([sym.block_grad(2*x + y), x + 2*y])
check_function(z, lambda x, y: [2*x + y, x + 2*y], numerical_grads=False)
_check_function_must_fail(z, lambda x, y: [2*x + y, x + 2*y])
_check_function_must_fail(z, numerical_grads=True)
z = sym.Group([2*x + y, sym.block_grad(x + 2*y)])
_check_function_must_fail(z, numerical_grads=True)
z = sym.Group([2*x + y, x + 2*y, x, y, sym.sum(x)])
check_function(z, lambda x, y: [2*x + y, x + 2*y, x, y, np.sum(x)])
# passing additional parameters to forward and backward
def _fwd3(x, p):
assert p == 'v'
return x + 1
def _bwd3(x, p, head_grads):
assert p == 'v'
return head_grads
check_function(x + 1, _fwd3, _bwd3, additional_params={'p': 'v'})
# implicitly created variables and shape/dtype inference for inputs
x = sym.Variable("x", shape=(2, 3), dtype=0)
b = sym.Variable("b")
y = sym.dense(data=x, bias=b, units=4)
# Don't check gradients on cuda because it doesn't yet support ewise after reduce
check_function(y, exclude_targets={'cuda'}, numerical_grads=True)
check_function(y, shape={'x': (3, 4)}, exclude_targets={'cuda'}, numerical_grads=True)
check_function(y, dtype={'x': 'float64'}, exclude_targets={'cuda'}, numerical_grads=True)
x = sym.Variable("x")
b = sym.Variable("b")
w = sym.Variable("w")
y = sym.dense(data=x, bias=b, weight=w, units=4)
def _fwd_dense(x, w, b):
return np.dot(x, w.T) + b
check_function(y, _fwd_dense, shape={'x': (1,2)}, dtype={'x': 'float32'}, numerical_grads=False)
check_function(y, _fwd_dense, shape={'x': (1,2)}, dtype={'w': 'float64'}, numerical_grads=False)
_check_function_must_fail(y, _fwd_dense, shape={'x': (1,2)},
dtype={'w': 'float64', 'b': 'float32'},
numerical_grads=False,
error=nnvm._base.NNVMError)
# fails because no shape
_check_function_must_fail(y, _fwd_dense, numerical_grads=False, error=ValueError)
# ok because type is float32 by default
check_function(y, _fwd_dense, shape={'x': (1,2)}, numerical_grads=False)
def test_relu():
x = sym.Variable("x")
......@@ -62,10 +175,8 @@ def test_relu():
return [(sub > 0).astype("float") * \
((x > 0).astype("float") + 0.3 * (x < 0).astype("float")) * head_grads]
dtype = "float32"
dshape = (1, 3, 32, 32)
inputs = [('x', dshape, x)]
helper(y, inputs, dtype, forward, backward)
shape = {'x': (1, 3, 32, 32)}
check_function(y, forward, backward, shape=shape)
def test_prelu_nchw():
x = sym.Variable("x")
......@@ -75,15 +186,8 @@ def test_prelu_nchw():
def forward(x, a):
return (x < 0) * (x * a.reshape(3, 1, 1)) + (x>=0) * x
dtype = "float32"
dshape_x = (1, 3, 32, 32)
dshape_w = (3,)
inputs = [
('x', dshape_x, x),
('a', dshape_w, a)
]
helper(y, inputs, dtype, forward)
shape = {'x': (1, 3, 32, 32), 'a': (3,)}
check_function(y, forward, shape=shape)
def test_prelu_nhwc():
x = sym.Variable("x")
......@@ -93,17 +197,8 @@ def test_prelu_nhwc():
def forward(x, a):
return (x < 0) * (x * a.reshape(1, 1, 3)) + (x>=0) * x
dtype = "float32"
dshape_x = (1, 32, 32, 3)
dshape_w = (3,)
inputs = [
('x', dshape_x, x),
('a', dshape_w, a)
]
helper(y, inputs, dtype, forward)
shape = {'x': (1, 32, 32, 3), 'a': (3,)}
check_function(y, forward, shape=shape)
def test_sym_scalar_pow():
scalar = 3
......@@ -116,10 +211,8 @@ def test_sym_scalar_pow():
def backward(head_grads, x):
return [scalar * x**(scalar - 1) * head_grads]
dtype = "float32"
dshape = (1, 3, 32, 32)
inputs = [('x', dshape, x)]
helper(y, inputs, dtype, forward, backward)
shape = {'x': (1, 3, 32, 32)}
check_function(y, forward, backward, shape=shape)
def test_scalar_sym_pow():
......@@ -133,10 +226,8 @@ def test_scalar_sym_pow():
def backward(head_grads, x):
return [np.log(scalar) * scalar**x * head_grads]
dtype = "float32"
dshape = (1, 3, 32, 32)
inputs = [('x', dshape, x)]
helper(y, inputs, dtype, forward, backward)
shape = {'x': (1, 3, 32, 32)}
check_function(y, forward, backward, shape=shape)
def test_exp():
......@@ -149,10 +240,8 @@ def test_exp():
def backward(head_grads, x):
return [np.exp(x) * head_grads]
dtype = "float32"
dshape = (1, 3, 32, 32)
inputs = [('x', dshape, x)]
helper(y, inputs, dtype, forward, backward)
shape = {'x': (1, 3, 32, 32)}
check_function(y, forward, backward, shape=shape)
def test_log():
......@@ -165,10 +254,8 @@ def test_log():
def backward(head_grads, x):
return [1. / x * head_grads]
dtype = "float32"
dshape = (1, 3, 32, 32)
inputs = [('x', dshape, x)]
helper(y, inputs, dtype, forward, backward, rnd_min=0.001)
shape = {'x': (1, 3, 32, 32)}
check_function(y, forward, backward, in_range=(0.002, 2.0), shape=shape)
def test_tanh():
......@@ -182,10 +269,8 @@ def test_tanh():
y_np = forward(x)
return [(1 - y_np**2) * head_grads]
dtype = "float32"
dshape = (1, 3, 32, 32)
inputs = [('x', dshape, x)]
helper(y, inputs, dtype, forward, backward)
shape = {'x': (1, 3, 32, 32)}
check_function(y, forward, backward, shape=shape)
def test_sigmoid():
......@@ -199,10 +284,8 @@ def test_sigmoid():
y_np = forward(x)
return [y_np *(1 - y_np) * head_grads]
dtype = "float32"
dshape = (1, 3, 32, 32)
inputs = [('x', dshape, x)]
helper(y, inputs, dtype, forward, backward)
shape = {'x': (1, 3, 32, 32)}
check_function(y, forward, backward, shape=shape)
def test_softmax():
......@@ -217,10 +300,10 @@ def test_softmax():
grad = y * (head_grads - np.sum(y * head_grads, axis=1, keepdims=True))
return [grad]
dtype = "float32"
dshape = (10, 1000)
inputs = [('x', dshape, x)]
helper(y, inputs, dtype, forward, backward)
check_function(y, forward, backward,
shape={'x': (10, 1000)}, numerical_grads=False)
check_function(y, forward, backward,
shape={'x': (2, 10)})
def test_log_softmax():
......@@ -235,10 +318,10 @@ def test_log_softmax():
grad = head_grads - np.exp(y) * np.sum(head_grads, axis=1, keepdims=True)
return [grad]
dtype = "float32"
dshape = (10, 1000)
inputs = [('x', dshape, x)]
helper(y, inputs, dtype, forward, backward)
check_function(y, forward, backward,
shape={'x': (10, 1000)}, numerical_grads=False)
check_function(y, forward, backward,
shape={'x': (2, 10)})
def test_dense():
......@@ -250,13 +333,16 @@ def test_dense():
def forward(x, dense_weight, dense_bias):
return np.dot(x, dense_weight.T) + dense_bias
dtype = "float32"
inputs = [
('x', (10, 100), x),
('dense_weight', (3, 100), w),
('dense_bias', (3,), b)
]
helper(y, inputs, dtype, forward)
shape = {
'x': (10, 100),
'w': (3, 100),
'b': (3,)
}
# Don't check gradients on cuda because it doesn't yet support ewise after reduce
check_function(y, forward, shape=shape,
exclude_targets={'cuda'}, numerical_grads=True)
check_function(y, forward, shape=shape,
only_targets={'cuda'}, numerical_grads=False)
def test_batchnorm():
......@@ -272,35 +358,25 @@ def test_batchnorm():
def forward(x, gamma, beta, moving_mean, moving_var):
return (x - moving_mean) / np.sqrt(moving_var + eps) * gamma + beta
dtype = "float32"
inputs = [
('x', (10, 20), x),
('gamma', (20,), gamma),
('beta', (20,), beta),
('moving_mean', (20,), moving_var),
('moving_var', (20,), moving_mean)
]
shape = {
'x': (10, 20),
'gamma': (20,),
'beta': (20,),
'moving_mean': (20,),
'moving_var': (20,)
}
helper(y, inputs, dtype, forward, rnd_min=0.001)
check_function(y, forward, in_range=(0.001, 1.0), shape=shape)
def verify_concatenate(ishape, axis):
x = [sym.Variable("x%d" % i) for i in range(len(ishape))]
x = [sym.Variable("x%d" % i, shape=ishape[i]) for i in range(len(ishape))]
y = sym.concatenate(*x, axis=axis) + 1
dtype = "float32"
for target, ctx in ctx_list():
# set input
data = []
for i, shape in enumerate(ishape):
data.append(np.random.uniform(size=shape).astype(dtype))
pdict = {"x%d" % i : v for i, v in enumerate(data)}
shape = {"x%d" % i : v.shape for i, v in enumerate(data)}
graph, lib, _ = nnvm.compiler.build(y, target, shape)
m = graph_runtime.create(graph, lib, ctx)
m.run(**pdict)
out_np = np.concatenate(data, axis=axis) + 1
out = m.get_output(0, tvm.nd.empty(out_np.shape))
np.testing.assert_allclose(out.asnumpy(), out_np, atol=1e-5, rtol=1e-5)
def forward(**kwargs):
return np.concatenate(list(kwargs.values()), axis=axis) + 1
check_function(y, forward)
def test_concatenate():
......@@ -309,19 +385,13 @@ def test_concatenate():
def verify_split(ishape, indices_or_sections, axis):
x = sym.Variable("x")
x = sym.Variable("x", shape=ishape)
y = sym.split(x, indices_or_sections=indices_or_sections, axis=axis)
dtype = "float32"
x_np = np.random.uniform(size=ishape).astype(dtype)
res = np.split(x_np, indices_or_sections, axis=axis)
for target, ctx in ctx_list():
# set input
graph, lib, _ = nnvm.compiler.build(y, target, {"x": ishape})
m = graph_runtime.create(graph, lib, ctx)
m.run(x=x_np)
for i, arr in enumerate(res):
out = m.get_output(i, tvm.nd.empty(arr.shape))
np.testing.assert_allclose(out.asnumpy(), arr, atol=1e-5, rtol=1e-5)
def forward(x):
return np.split(x, indices_or_sections, axis=axis)
check_function(y, forward)
def test_split():
......@@ -331,28 +401,22 @@ def test_split():
def verify_strided_slice(ishape, begin, end, strideinp=None):
stride = strideinp if strideinp else [1, 1, 1]
x = sym.Variable("x")
x = sym.Variable("x", shape=ishape)
if strideinp:
y = sym.strided_slice(x, begin = begin, end = end, stride = stride) + 1
else:
y = sym.strided_slice(x, begin = begin, end = end) + 1
x_np = np.random.uniform(size=ishape).astype("float32")
for i in range(len(begin), 3):
begin.append(0)
for i in range(len(end), 3):
end.append(ishape[i])
def test_forward(x, begin, end, stride):
def test_forward(x):
return x[begin[0]:end[0]:stride[0],
begin[1]:end[1]:stride[1], begin[2]:end[2]:stride[2]] + 1
for target, ctx in ctx_list():
# set input
graph, lib, _ = nnvm.compiler.build(y, target, {"x": ishape})
m = graph_runtime.create(graph, lib, ctx)
m.run(x=x_np)
res = test_forward(x_np, begin, end, stride)
out = m.get_output(0, tvm.nd.empty(res.shape))
np.testing.assert_allclose(out.asnumpy(), res, atol=1e-5, rtol=1e-5)
check_function(y, test_forward)
def test_strided_slice():
verify_strided_slice((3, 4, 3), [0, 0, 0], [4, -5, 4], [1, -1, 2])
......@@ -369,24 +433,18 @@ def verify_take(src_shape, indices_src, axis=None):
src_dtype = "float32"
indices_dtype = "int32"
indices_src = np.array(indices_src, dtype=indices_dtype)
a = sym.Variable("a")
indices = sym.Variable("indices")
a = sym.Variable("a", shape=src_shape)
indices = sym.Variable("indices", shape=indices_src.shape)
y = sym.take(a, indices, axis=axis)
for target, ctx in ctx_list():
# set input
shape_dict = {"a":src_shape, "indices":indices_src.shape}
type_dict = {"a":src_dtype, "indices":indices_dtype}
graph, lib, _ = nnvm.compiler.build(y, target, shape=shape_dict, dtype=type_dict)
m = graph_runtime.create(graph, lib, ctx)
shape_size = 1
for i in range(len(src_shape)):
shape_size = shape_size * src_shape[i]
a_src = np.arange(shape_size, dtype=src_dtype).reshape((src_shape))
out_np = np.take(a_src, indices_src, axis=axis)
m.run(a=a_src, indices=indices_src)
out = m.get_output(0, tvm.nd.empty(out_np.shape, dtype=src_dtype))
np.testing.assert_allclose(out.asnumpy(), out_np, atol=1e-5, rtol=1e-5)
def forward(a, indices):
return np.take(a, indices=indices, axis=axis)
a_src = np.arange(np.prod(src_shape), dtype=src_dtype).reshape(src_shape)
check_function(y, forward,
dtype={'a': src_dtype, 'indices': indices_dtype},
values={'a': a_src, 'indices': indices_src})
def test_take():
verify_take((4,), [1])
......@@ -399,9 +457,9 @@ def test_take():
verify_take((4,3,5,6), [[2,1,0,0]], -2)
def verify_squeeze(dshape, axis):
def verify_squeeze(shape, axis):
x = sym.Variable("x")
if axis:
if axis is not None:
y = sym.squeeze(x, axis=axis)
else:
y = sym.squeeze(x)
......@@ -413,9 +471,7 @@ def verify_squeeze(dshape, axis):
def backward(head_grads, x):
return [np.reshape(head_grads, x.shape)]
dtype = "float32"
inputs = [('x', dshape, x)]
helper(y, inputs, dtype, forward, backward)
check_function(y, forward, backward, shape=shape)
def test_squeeze():
......@@ -433,61 +489,40 @@ def test_pad():
pad_width=((0, 0), (0, 0), (0, 1), (2, 3)),
mode='constant', constant_values=1.)
dtype = "float32"
inputs = [('x', (1, 3, 28, 28), x)]
helper(y, inputs, dtype, forward)
shape = {'x': (1, 3, 28, 28)}
check_function(y, forward, shape=shape)
def verify_lrn(ishape, size, axis, bias, alpha, beta):
x = sym.Variable("x")
x = sym.Variable("x", shape=ishape)
y = sym.lrn(x, size=size, axis=axis, bias=bias, alpha=alpha, beta=beta)
dtype = "float32"
x_np = np.random.uniform(size=ishape).astype(dtype)
for target, ctx in ctx_list():
graph, lib, _ = nnvm.compiler.build(y, target, {"x": ishape})
m = graph_runtime.create(graph, lib, ctx)
m.run(x=x_np)
out = m.get_output(0, tvm.nd.empty(ishape))
out_np = topi.testing.lrn_python(x_np, size, axis, bias, alpha, beta)
np.testing.assert_allclose(out.asnumpy(), out_np, atol=1e-5, rtol=1e-5)
def forward1(x):
return topi.testing.lrn_python(x, size, axis, bias, alpha, beta)
check_function(y, forward1)
def forward2(x):
y = forward1(x)
return (y > 0)*y
# Checking LRN op followed by elementwise op relu
z = sym.relu(y)
x_np = np.random.uniform(low=-10.0, high=10.0, size=ishape).astype(dtype)
for target, ctx in ctx_list():
graph, lib, _ = nnvm.compiler.build(z, target, {"x": ishape})
m = graph_runtime.create(graph, lib, ctx)
m.run(x=x_np)
out = m.get_output(0, tvm.nd.empty(ishape))
out_np = topi.testing.lrn_python(x_np, size, axis, bias, alpha, beta)
out_np = (out_np > 0) * out_np
np.testing.assert_allclose(out.asnumpy(), out_np, atol=1e-5, rtol=1e-5)
check_function(sym.relu(y), forward2, in_range={'x': (-10.0, 10.0)})
def verify_l2_normalize(ishape, eps, axis):
x = sym.Variable("x")
x = sym.Variable("x", shape=ishape)
y = sym.l2_normalize(x, eps=eps, axis=axis)
dtype = "float32"
x_np = np.random.uniform(size=ishape).astype(dtype)
for target, ctx in ctx_list():
graph, lib, _ = nnvm.compiler.build(y, target, {"x": ishape})
m = graph_runtime.create(graph, lib, ctx)
m.run(x=x_np)
out = m.get_output(0, tvm.nd.empty(ishape))
out_np = topi.testing.l2_normalize_python(x_np, eps, axis)
np.testing.assert_allclose(out.asnumpy(), out_np, atol=1e-5, rtol=1e-5)
def forward1(x):
return topi.testing.l2_normalize_python(x, eps, axis)
check_function(y, forward1)
def forward2(x):
y = forward1(x)
return (y > 0)*y
# Checking L2 normalization op followed by elementwise op relu
z = sym.relu(y)
x_np = np.random.uniform(low=-10.0, high=10.0, size=ishape).astype(dtype)
for target, ctx in ctx_list():
graph, lib, _ = nnvm.compiler.build(z, target, {"x": ishape})
m = graph_runtime.create(graph, lib, ctx)
m.run(x=x_np)
out = m.get_output(0, tvm.nd.empty(ishape))
out_np = topi.testing.l2_normalize_python(x_np, eps, axis)
out_np = (out_np > 0) * out_np
np.testing.assert_allclose(out.asnumpy(), out_np, atol=1e-5, rtol=1e-5)
check_function(sym.relu(y), forward2, in_range={'x': (-10.0, 10.0)})
def test_lrn():
verify_lrn((1, 3, 20, 20), 3, 1, 1.0, 1.0, 0.5)
......@@ -498,6 +533,7 @@ def test_l2_normalize():
verify_l2_normalize((1, 3, 20, 20), 0.001, (1, 2))
if __name__ == "__main__":
test_check_function()
test_split()
test_concatenate()
test_log_softmax()
......
......@@ -5,15 +5,14 @@ import topi.testing
import nnvm.symbol as sym
import nnvm.compiler
from nnvm.testing.config import ctx_list
from test_top_level1 import helper
from nnvm.testing.check_computation import check_function
def check_map(symfunc, np_func, np_backward=None, dtype="float32", rnd_min=-1, rnd_max=1):
x = sym.Variable("x")
y = symfunc(x)
dshape = (1, 3, 32, 32)
inputs = [('x', dshape, x)]
helper(y, inputs, dtype, lambda x: np_func(x), np_backward,
rnd_min=rnd_min, rnd_max=rnd_max)
shape = {'x': (1, 3, 32, 32)}
check_function(y, lambda x: np_func(x), np_backward,
dtype=dtype, shape=shape, in_range=(rnd_min, rnd_max))
def test_floor():
......
......@@ -6,52 +6,7 @@ import topi
import nnvm.symbol as sym
import nnvm.compiler
from nnvm.testing.config import ctx_list
def helper(symbol, inputs, dtype,
np_forward, np_backward=None,
need_input=True, need_head_grads=True, in_range={}):
ishapes = {}
input_syms = []
np_inputs = {}
for (name, shape, s) in inputs:
ishapes.update({name: shape})
if name in in_range:
np_inputs.update({name: np.random.uniform(size=shape,
low=in_range[name][0],
high=in_range[name][1]).astype(dtype)})
else:
np_inputs.update({name: np.random.uniform(size=shape).astype(dtype)})
input_syms.append(s)
for target, ctx in ctx_list():
graph, lib, _ = nnvm.compiler.build(symbol, target, ishapes, dtype=dtype)
m = graph_runtime.create(graph, lib, ctx)
m.run(**np_inputs)
y_np = np_forward(**np_inputs)
out = m.get_output(0, tvm.nd.empty(y_np.shape, dtype))
np.testing.assert_allclose(out.asnumpy(), y_np, atol=1e-5, rtol=1e-5)
# backward
if np_backward:
graph._set_symbol_list_attr("grad_ys", symbol)
graph._set_symbol_list_attr("grad_xs", input_syms)
graph._set_symbol_list_attr("grad_ys_out_grad", sym.Variable("head_grads", shape=y_np.shape))
graph = graph.apply("Gradient")
ishapes.update({"head_grads": y_np.shape})
graph, lib, _ = nnvm.compiler.build(graph, target, ishapes)
m = graph_runtime.create(graph, lib, ctx)
head_grads = np.random.uniform(size=y_np.shape).astype(dtype)
y_np = np_backward(head_grads=head_grads, **np_inputs)
b_inputs = {}
if need_input:
b_inputs.update(np_inputs)
if need_head_grads:
b_inputs.update({"head_grads":head_grads})
m.run(**b_inputs)
for i in range(len(y_np)):
out = m.get_output(i, tvm.nd.empty(y_np[i].shape, dtype))
np.testing.assert_allclose(out.asnumpy(), y_np[i], atol=1e-5, rtol=1e-5)
from nnvm.testing.check_computation import check_function
def verify_transpose(dshape, axes):
x = sym.Variable("x")
......@@ -228,93 +183,92 @@ def test_clip():
mask2 = np.less_equal(x, a_max).astype("float")
return [head_grads * mask1 * mask2]
dtype = "float32"
inputs = [('x', (3, 4, 5), x)]
helper(y, inputs, dtype, forward, backward)
shape = {'x': (3, 4, 5)}
check_function(y, forward, backward, shape=shape)
def test_broadcast():
a = sym.Variable("a")
b = sym.Variable("b")
inputs = [('a', (3, 4, 5), a),
('b', (1, 5), b)]
dtype = "float32"
shape = {'a': (3, 4, 5), 'b': (1, 5)}
def _collapse(g):
return g.reshape(-1, inputs[-1][1][-1]).sum(0, keepdims=True)
return g.reshape(-1, shape['b'][-1]).sum(0, keepdims=True)
y = sym.broadcast_add(a, b)
def _backward_add(head_grads, a, b):
da = head_grads
db = _collapse(head_grads)
return da, db
helper(y, inputs, dtype, lambda a, b: a + b, _backward_add)
check_function(y, lambda a, b: a + b, _backward_add, shape=shape)
y = sym.broadcast_sub(a, b)
def _backward_sub(head_grads, a, b):
da = head_grads
db = -_collapse(head_grads)
return da, db
helper(y, inputs, dtype, lambda a, b: a - b, _backward_sub)
check_function(y, lambda a, b: a - b, _backward_sub, shape=shape)
y = sym.broadcast_mul(a, b)
def _backward_mul(head_grads, a, b):
da = head_grads * b
db = _collapse(head_grads * a)
return da, db
helper(y, inputs, dtype, lambda a, b: a * b, _backward_mul)
check_function(y, lambda a, b: a * b, _backward_mul, shape=shape)
y = sym.broadcast_div(a, b)
def _backward_div(head_grads, a, b):
da = head_grads / b
db = _collapse(- head_grads * a / b**2)
return da, db
helper(y, inputs, dtype, lambda a, b: a / b, _backward_div)
# We avoid computing numerical derivatives too close to zero here
check_function(y, lambda a, b: a / b, _backward_div, shape=shape, numerical_grads=False)
check_function(y, lambda a, b: a / b, _backward_div, shape=shape,
in_range={'b': (0.1, 20)})
y = sym.broadcast_mod(a, b)
helper(y, inputs, 'int32',
check_function(y,
lambda a, b: np.mod(a, b),
in_range={'a': (0.001, 100), 'b': (1, 100)})
in_range={'a': (0.001, 100), 'b': (1, 100)}, dtype='int32', shape=shape)
y = sym.broadcast_max(a, b)
helper(y, inputs, dtype, lambda a, b: np.maximum(a, b))
check_function(y, lambda a, b: np.maximum(a, b), shape=shape)
y = sym.broadcast_min(a, b)
helper(y, inputs, dtype, lambda a, b: np.minimum(a, b))
check_function(y, lambda a, b: np.minimum(a, b), shape=shape)
y = sym.broadcast_pow(a, b)
helper(y, inputs, dtype,
check_function(y,
lambda a, b: np.power(a, b),
in_range={'a': (0.001, 100), 'b': (0.001, 2)})
in_range={'a': (0.001, 100), 'b': (0.001, 2)}, shape=shape)
y = sym.broadcast_left_shift(a, b)
helper(y, inputs, 'int32', lambda a, b: a << b)
check_function(y, lambda a, b: a << b, dtype='int32', shape=shape)
y = sym.broadcast_right_shift(a, b)
helper(y, inputs, 'int32', lambda a, b: a >> b)
check_function(y, lambda a, b: a >> b, dtype='int32', shape=shape)
y = sym.broadcast_greater(a, b)
helper(y, inputs, dtype, lambda a, b: np.greater(a, b))
check_function(y, lambda a, b: np.greater(a, b), shape=shape)
y = sym.broadcast_less(a, b)
helper(y, inputs, dtype, lambda a, b: np.less(a, b))
check_function(y, lambda a, b: np.less(a, b), shape=shape)
y = sym.broadcast_equal(a, b)
helper(y, inputs, 'int32', lambda a, b: np.equal(a, b),
in_range={'a': (-2, 2), 'b': (-2, 2)})
check_function(y, lambda a, b: np.equal(a, b),
in_range={'a': (-2, 2), 'b': (-2, 2)}, dtype='int32', shape=shape)
y = sym.broadcast_not_equal(a, b)
helper(y, inputs, 'int32', lambda a, b: np.not_equal(a, b),
in_range={'a': (-2, 2), 'b': (-2, 2)})
check_function(y, lambda a, b: np.not_equal(a, b),
in_range={'a': (-2, 2), 'b': (-2, 2)}, dtype='int32', shape=shape)
y = sym.broadcast_greater_equal(a, b)
helper(y, inputs, 'int32', lambda a, b: np.greater_equal(a, b),
in_range={'a': (-3, 3), 'b': (-3, 3)})
check_function(y, lambda a, b: np.greater_equal(a, b),
in_range={'a': (-3, 3), 'b': (-3, 3)}, dtype='int32', shape=shape)
y = sym.broadcast_less_equal(a, b)
helper(y, inputs, 'int32', lambda a, b: np.less_equal(a, b),
in_range={'a': (-3, 3), 'b': (-3, 3)})
check_function(y, lambda a, b: np.less_equal(a, b),
in_range={'a': (-3, 3), 'b': (-3, 3)}, dtype='int32', shape=shape)
def test_greater():
l = sym.Variable("l")
......@@ -325,13 +279,10 @@ def test_greater():
return np.greater(l, r).astype("float32")
def backward(head_grads, l, r):
return [np.zeros_like(l)]
return {'l': np.zeros_like(l)}
dtype = "float32"
inputs = [('l', (3, 4, 5), l),
('r', (3, 4, 5), r)]
helper(y, inputs, dtype, forward, backward, need_head_grads=False)
shape = {'l': (3, 4, 5), 'r': (3, 4, 5)}
check_function(y, forward, backward, shape=shape)
def test_less():
......@@ -343,13 +294,10 @@ def test_less():
return np.less(l, r).astype("float32")
def backward(head_grads, l, r):
return [np.zeros_like(l)]
return {'l': np.zeros_like(l)}
dtype = "float32"
inputs = [('l', (3, 4, 5), l),
('r', (3, 4, 5), r)]
helper(y, inputs, dtype, forward, backward, need_head_grads=False)
shape = {'l': (3, 4, 5), 'r': (3, 4, 5)}
check_function(y, forward, backward, shape=shape)
def test_reshape_like():
......@@ -364,11 +312,8 @@ def test_reshape_like():
return [np.reshape(head_grads, x.shape),
np.zeros_like(y)]
dtype = "float32"
inputs = [('x', (3, 4, 5), x),
('y', (5, 4, 3), y)]
helper(z, inputs, dtype, forward, backward)
shape = {'x': (3, 4, 5), 'y': (5, 4, 3)}
check_function(z, forward, backward, shape=shape)
def verify_expand_like(in_shape, out_shape, axis, exclude):
......@@ -412,10 +357,8 @@ def verify_expand_like(in_shape, out_shape, axis, exclude):
np.zeros_like(y)]
dtype = "float32"
inputs = [('x', in_shape, x),
('y', out_shape, y)]
helper(z, inputs, dtype, forward, backward, need_input=False)
shape = {'x': in_shape, 'y': out_shape}
check_function(z, forward, backward, shape=shape)
def test_expand_like():
......@@ -440,10 +383,8 @@ def verify_elemwise_sum(num_args):
def backward(head_grads, **inputs):
return [head_grads] * num_args
dtype = "float32"
inputs = [("input" + str(i), (3, 4, 5), s[i])
for i in range(num_args)]
helper(y, inputs, dtype, forward, backward, need_input=False)
shape = {s[i]: (3, 4, 5) for i in range(num_args)}
check_function(y, forward, backward, shape=shape)
def test_elemwise_sum():
......@@ -463,9 +404,9 @@ def test_block_grad():
return [np.zeros_like(head_grads)]
dtype = "float32"
inputs = [('x', (3, 4, 5), x)]
helper(y, inputs, dtype, forward, backward, need_head_grads=False)
shape = {'x': (3, 4, 5)}
# Numerical grad checking would fail for this function
check_function(y, forward, backward, shape=shape, numerical_grads=False)
def test_full():
......