* [Relay] [Quantization] WIP - Common files for the quantization work.
* [Relay] [Quantization] WIP - Prototyping requantize op.
* Requantize operator implementation.

  Requantize converts one quantized tensor representation to another quantized representation. The PR has the following implementation features:

  - Requantize operator defined in qnn namespace - relay.qnn.requantize
  - Lowering of the requantize to existing Relay operators
  - Integer fixed point implementation of requantize
    - Two rounding modes - FE_UPWARDS (round towards infinity) and FE_AWAY_FROM_ZERO (std::round behavior)
  - Floating point implementation as well, that can act as reference or can be used for devices when FP32 computation is not used.
  - Unit test cases

  Relevant Issue - https://github.com/dmlc/tvm/issues/2351

  Credit to TFLite and GemmLowp for providing reference implementations.
* Typo and lint fixes.
* Doc fix.
* Uncommenting the lint script (fixing mistake).
* Modifying the unit tests.
* Moving C++ files into src/relay/qnn
* Moving python files to python/tvm/relay/qnn. Some minor fixes.
* Moving the attrs.h inside the include directory.
* Pushing files that I forgot earlier. Changing util location.
* Incorporating comments. API change. Lint fixes.
* Modifying the GetFixedPointMultiplierShift API as per comments.
* Forgot the dialect change.
* Changing rewrite to qnn_lower.
* Renaming Quantize to Qnn for clarity.
* Remove use_int_domain.
* Incorporating review comments.
* Adding API doc for QNN dialect.
* Move the qnn_lower pass to transform namespace.
* Moving from expr to module. Adding namespace in C++.
* Minor sentence rewrites. Added qnn namespace.
* Added the API doc.
* Changing default out_dtype to int8. Adding a test with in/out_dtype as uint8.
* Style fixes. Better error messages.
* Adding documentation.
* More documentation fixes.
* Adding out dtype check for requantize.
* Adding corner case for FP32 to fixed point conversion.
* Adding extra line.
* Documentation fix.
* Adding static inline.
* Incorporating jackwish comment. Removed idtype from requantize lowering.
* Removing Quantize/Dequantize code. Restricting Requantize to (u)int8/int32.
* Style fixes.
* Fix the docs.
* Move to Legalize API.
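
To make the requantize description above concrete, here is a minimal pure-Python sketch of the arithmetic it names: a floating point reference and an integer fixed point variant with the two rounding modes. It is an illustration under stated assumptions, not the code in this PR. All function names, parameter names, and the mode strings `"UPWARD"` and `"AWAY_FROM_ZERO"` are hypothetical; the fixed point path rounds once after a single wide multiply, which is a simplification of what reference implementations such as GemmLowp do, and the default clamp range assumes an int8 output.

```python
import math


def fixed_point_multiplier_shift(real_multiplier):
    """Approximate real_multiplier as (significand / 2**31) * 2**exponent.

    Hypothetical helper for illustration; it is not the PR's
    GetFixedPointMultiplierShift.
    """
    if real_multiplier == 0.0:
        return 0, 0
    frac, exponent = math.frexp(real_multiplier)  # frac lies in [0.5, 1)
    significand = int(round(frac * (1 << 31)))
    if significand == (1 << 31):  # rounding pushed the fraction up to 1.0
        significand //= 2
        exponent += 1
    return significand, exponent


def round_shift_right(value, shift, rounding):
    """Divide value by 2**shift, resolving ties per the rounding mode."""
    if shift <= 0:
        return value << -shift
    half = 1 << (shift - 1)
    if rounding == "UPWARD":  # ties go toward +infinity
        return (value + half) >> shift
    if rounding == "AWAY_FROM_ZERO":  # ties go away from zero, like std::round
        if value >= 0:
            return (value + half) >> shift
        return -((-value + half) >> shift)
    raise ValueError("unknown rounding mode: " + rounding)


def requantize_fixed_point(q_in, in_scale, in_zp, out_scale, out_zp,
                           rounding="AWAY_FROM_ZERO",
                           out_min=-128, out_max=127):
    """Integer-only requantize of one value (assumed int8 output range)."""
    significand, exponent = fixed_point_multiplier_shift(in_scale / out_scale)
    product = (q_in - in_zp) * significand  # Python ints do not overflow
    scaled = round_shift_right(product, 31 - exponent, rounding)
    return max(out_min, min(out_max, scaled + out_zp))


def requantize_float(q_in, in_scale, in_zp, out_scale, out_zp,
                     out_min=-128, out_max=127):
    """Floating point reference with round-half-away-from-zero."""
    scaled = (q_in - in_zp) * (in_scale / out_scale)
    rounded = int(math.copysign(math.floor(abs(scaled) + 0.5), scaled))
    return max(out_min, min(out_max, rounded + out_zp))


if __name__ == "__main__":
    # -3 at scale 0.5 is -1.5 at scale 1.0, a tie that separates the modes.
    print(requantize_fixed_point(-3, 0.5, 0, 1.0, 0, rounding="UPWARD"))
    print(requantize_fixed_point(-3, 0.5, 0, 1.0, 0, rounding="AWAY_FROM_ZERO"))
    print(requantize_float(-3, 0.5, 0, 1.0, 0))
```

The two rounding modes only differ on exact ties of negative values, which is why the usage lines pick an input that lands on -1.5. A real lowering would also have to keep the intermediate product within a fixed-width integer type, which this sketch sidesteps by relying on Python's arbitrary-precision integers.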
Name | Last commit | Last update |
---|---|---|
backend | | |
frontend | | |
grammar | | |
op | | |
qnn | | |
quantize | | |
testing | | |
__init__.py | | |
_analysis.py | | |
_base.py | | |
_build_module.py | | |
_expr.py | | |
_make.py | | |
_module.py | | |
_module.pyi | | |
_parser.py | | |
_transform.py | | |
adt.py | | |
analysis.py | | |
annotation.py | | |
base.py | | |
build_module.py | | |
contrib.py | | |
debug.py | | |
expr.py | | |
expr.pyi | | |
expr_functor.py | | |
feature.py | | |
image.py | | |
loops.py | | |
module.py | | |
nn.py | | |
param_dict.py | | |
parser.py | | |
prelude.py | | |
prelude.rly | | |
scope_builder.py | | |
transform.py | | |
transform.pyi | | |
ty.py | | |
ty.pyi | | |
vision.py | | |