* [Relay] [Quantization] WIP - Common files for the quantization work.
* [Relay] [Quantization] WIP - Prototyping requantize op.
* Requantize operator implementation.

  Requantize converts one quantized tensor representation to another quantized representation. The PR has the following implementation features:

  - Requantize operator defined in qnn namespace - relay.qnn.requantize
  - Lowering of the requantize to existing Relay operators
  - Integer fixed point implementation of requantize
    - Two rounding modes - FE_UPWARDS (round towards infinity) and FE_AWAY_FROM_ZERO (std::round behavior)
  - Floating point implementation as well, that can act as reference or can be used for devices when FP32 computation is not used.
  - Unit test cases

  Relevant Issue - https://github.com/dmlc/tvm/issues/2351

  Credit to TFLite and GemmLowp for providing reference implementations.
* Typo and lint fixes.
* Doc fix.
* Uncommenting the lint script (fixing mistake).
* Modifying the unit tests.
* Moving C++ files into src/relay/qnn
* Moving python files to python/tvm/relay/qnn. Some minor fixes.
* Moving the attrs.h inside the include directory.
* Pushing files that I forgot earlier. Changing util location.
* Incorporating comments. API change. Lint fixes.
* Modifying the GetFixedPointMultiplierShift API as per comments.
* Forgot the dialect change.
* Changing rewrite to qnn_lower.
* Renaming Quantize to Qnn for clarity.
* Remove use_int_domain.
* Incorporating review comments.
* Adding API doc for QNN dialect.
* Move the qnn_lower pass to transform namespace.
* Moving from expr to module. Adding namespace in C++.
* Minor sentence rewrites. Added qnn namespace.
* Added the API doc.
* Changing default out_dtype to int8. Adding a test with in/out_dtype as uint8.
* Style fixes. Better error messages.
* Adding documentation.
* More documentation fixes.
* Adding out dtype check for requantize.
* Adding corner case for FP32 to fixed point conversion.
* Adding extra line.
* Documentation fix.
* Adding static inline.
* Incorporating jackwish comment. Removed idtype from requantize lowering.
* Removing Quantize/Dequantize code. Restricting Requantize to (u)int8/int32.
* Style fixes.
* Fix the docs.
* Move to Legalize API.
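As a rough illustration of what the commit message means by converting one quantized representation to another, here is a minimal floating point sketch in Python. It is not the code from this PR: the function and parameter names (`requantize_ref`, `in_scale`, `in_zero_point`, and so on) are hypothetical, the int8 clamp range is an assumption, and it only mirrors the two rounding behaviors the message describes (ties towards +infinity versus ties away from zero). The integer fixed point path is not shown.

```python
import math


def round_upward(x):
    """Round to nearest, ties towards +infinity (the 'upward' behavior)."""
    return math.floor(x + 0.5)


def round_away_from_zero(x):
    """Round to nearest, ties away from zero (std::round-like behavior)."""
    return int(math.copysign(math.floor(abs(x) + 0.5), x))


def requantize_ref(q_in, in_scale, in_zero_point, out_scale, out_zero_point,
                   rounding="away_from_zero", qmin=-128, qmax=127):
    """Hypothetical floating point reference for requantizing one value.

    real value = in_scale * (q_in - in_zero_point)
    q_out      = round(real value / out_scale) + out_zero_point, clamped.
    """
    # Rescale in the real domain; the scale ratio is the only multiplier.
    scaled = (q_in - in_zero_point) * (in_scale / out_scale)
    if rounding == "upward":
        rounded = round_upward(scaled)
    elif rounding == "away_from_zero":
        rounded = round_away_from_zero(scaled)
    else:
        raise ValueError("unknown rounding mode: " + rounding)
    # Shift by the output zero point and saturate to the output range
    # (int8 by default here, which is an assumption of this sketch).
    return max(qmin, min(qmax, rounded + out_zero_point))


# The two modes differ only on exact ties with a negative value.
print(requantize_ref(-5, 0.5, 0, 1.0, 0, rounding="upward"))
print(requantize_ref(-5, 0.5, 0, 1.0, 0, rounding="away_from_zero"))
```

The two calls at the end feed the same tie value (-2.5 after rescaling) through both modes, which is the only case where the rounding choice changes the result.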
Name | Last commit | Last update |
---|---|---|
node | | |
relay | | |
runtime | | |
api_registry.h | | |
arithmetic.h | | |
attrs.h | | |
base.h | | |
buffer.h | | |
build_module.h | | |
c_dsl_api.h | | |
channel.h | | |
codegen.h | | |
data_layout.h | | |
dtype.h | | |
expr.h | | |
expr_operator.h | | |
ir.h | | |
ir_functor_ext.h | | |
ir_mutator.h | | |
ir_pass.h | | |
ir_visitor.h | | |
logging.h | | |
lowered_func.h | | |
operation.h | | |
packed_func_ext.h | | |
schedule.h | | |
schedule_pass.h | | |
target_info.h | | |
tensor.h | | |
tensor_intrin.h | | |