haoyifan / Model-Transfer-Adaptability

Commit c6cb641c, authored Apr 06, 2023 by Klin
doc: use png to show table
parent 81b919fa
Showing 3 changed files with 10 additions and 56 deletions (+10 −56)
ykl/AlexNet/README.md        +9  −32
ykl/AlexNet/image/table.png  +0  −0
ykl/AlexNet/qat.py           +1  −24
ykl/AlexNet/README.md
...
@@ -3,12 +3,18 @@
 ## ptq section
+
 INT/POT/FLOAT quantization all share the same framework; the mode is selected via `quant_type`.
+
 Quantization range: all modes use signed symmetric quantization, with the zero point fixed at 0.
+
 Quantization strategy: the first forward pass runs fake quantization and records the range of x and weight for each layer; subsequent inference uses the quantized values in the conv/pooling layers. Quantization is implemented by scaling into the target range and then snapping to the nearest point in the quantization point list.
+
 Notes on bias: under every quantization mode, the bias uses the same fixed strategy (INT: 32-bit quantization, POT: 8-bit quantization, FP8: FP16-E7 quantization). The quantization loss on bias has little effect on results, and this choice barely affects a hardware implementation, yet it makes the code noticeably simpler, hence it was adopted. (NVIDIA's quantization scheme even drops the bias entirely.)
+
 To change the bias strategy, edit the `bias_qmax` function in `module.py` and the `build_bias_list` function in `utils.py`.
+
 Since INT quantization uses comparatively high bit widths, a quantization lookup table would be too costly; a plain `round_` op is used instead.
+
 Choice of quantization points:
+
 INT: INT2-INT16 (from INT16 on there is no loss versus full precision)
+
 POT: POT2-POT8 (beyond POT8, overflow occurs easily)
+
 FP8: E1-E6 (E0 is equivalent to INT quantization and E7 to POT quantization; using those strategies directly gives better results)
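The scale-then-snap step described above can be sketched in plain Python; the function name `quantize_to_list` and the list construction here are illustrative, not the repo's actual API:

```python
def quantize_to_list(x, plist, scale):
    """Fake quantization: scale x into the quantized range, snap to the
    nearest point in the quantization point list, then scale back."""
    scaled = x / scale
    q = min(plist, key=lambda p: abs(p - scaled))
    return q * scale

# POT-style point list: 0 plus signed powers of two (an illustrative list)
pot_list = [0.0] + [s * 2.0 ** -e for e in range(7) for s in (1, -1)]

print(quantize_to_list(0.3, pot_list, scale=1.0))  # -> 0.25
```

A linear scan with `min` is fine for a sketch; for tensors one would broadcast against the point list and take an argmin along the list dimension.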
...
@@ -19,36 +25,7 @@
 FP32-acc: 85.08
-| title | js_flops | js_param | ptq_acc | acc_loss |
-| ---------- | ----------- | ----------- | ------- | ----------- |
-| INT_2 | 7507.750226 | 7507.750226 | 10 | 0.882463564 |
-| INT_3 | 2739.698391 | 2739.698391 | 10.16 | 0.880582981 |
-| INT_4 | 602.561331 | 602.561331 | 51.21 | 0.39809591 |
-| INT_5 | 140.9219722 | 140.9219722 | 77.39 | 0.09038552 |
-| INT_6 | 34.51721888 | 34.51721888 | 83.03 | 0.024094969 |
-| INT_7 | 8.518508719 | 8.518508719 | 84.73 | 0.004113775 |
-| INT_8 | 2.135373288 | 2.135373288 | 84.84 | 0.002820874 |
-| INT_9 | 0.531941163 | 0.531941163 | 85.01 | 0.000822755 |
-| INT_10 | 0.131627102 | 0.131627102 | 85.08 | 0 |
-| INT_11 | 0.032495647 | 0.032495647 | 85.07 | 0.000117536 |
-| INT_12 | 0.008037284 | 0.008037284 | 85.06 | 0.000235073 |
-| INT_13 | 0.00204601 | 0.00204601 | 85.08 | 0 |
-| INT_14 | 0.000418678 | 0.000418678 | 85.08 | 0 |
-| INT_15 | 0.000132161 | 0.000132161 | 85.08 | 0 |
-| INT_16 | 5.84143E-06 | 5.84143E-06 | 85.08 | 0 |
-| POT_2 | 7507.667349 | 7507.667349 | 10 | 0.882463564 |
-| POT_3 | 1654.377593 | 1654.377593 | 14.32 | 0.831687823 |
-| POT_4 | 136.7401731 | 136.7401731 | 72.49 | 0.147978373 |
-| POT_5 | 134.578297 | 134.578297 | 72.65 | 0.14609779 |
-| POT_6 | 134.5784142 | 134.5784142 | 72.95 | 0.142571697 |
-| POT_7 | 134.5783939 | 134.5783939 | 72.08 | 0.152797367 |
-| POT_8 | 134.5782946 | 134.5782946 | 72.23 | 0.151034321 |
-| FLOAT_8_E1 | 33.31638902 | 33.31638902 | 82.73 | 0.027621063 |
-| FLOAT_8_E2 | 32.12034309 | 32.12034309 | 83.3 | 0.020921486 |
-| FLOAT_8_E3 | 0.654188087 | 0.654188087 | 85.01 | 0.000822755 |
-| FLOAT_8_E4 | 2.442034365 | 2.442034365 | 84.77 | 0.00364363 |
-| FLOAT_8_E5 | 9.68811736 | 9.68811736 | 59.86 | 0.296426892 |
-| FLOAT_8_E6 | 37.70544899 | 37.70544899 | 51.87 | 0.390338505 |
+<img src="image/table.png" alt="table" style="zoom: 33%;" />
+
 Data fitting:
...
@@ -76,5 +53,4 @@
 - [x] center and scale
 
 ![fig4](image/fig4.png)
\ No newline at end of file
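The `acc_loss` column in the removed table is consistent with the relative accuracy drop against the full-precision baseline (FP32-acc = 85.08); a quick check against the INT_5 row:

```python
fp32_acc = 85.08
int5_acc = 77.39  # ptq_acc of the INT_5 row

# acc_loss = (baseline accuracy - quantized accuracy) / baseline accuracy
acc_loss = (fp32_acc - int5_acc) / fp32_acc

print(round(acc_loss, 8))  # -> 0.09038552, matching the table
```

The same formula reproduces the other rows, e.g. INT_2: (85.08 − 10) / 85.08 ≈ 0.882463564.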
ykl/AlexNet/image/table.png (new file, 0 → 100644, 113 KB)
ykl/AlexNet/qat.py
 from model import *
 from utils import *
 import gol
 import sys
 import argparse
...
@@ -10,29 +10,6 @@ from torchvision import datasets, transforms
 import os
 import os.path as osp
 
-def build_list(num_bits, e_bits):
-    m_bits = num_bits - 1 - e_bits
-    plist = [0.]
-    # gap between adjacent mantissa values
-    dist_m = 2 ** (-m_bits)
-    e = -2 ** (e_bits - 1) + 1
-    for m in range(1, 2 ** m_bits):
-        frac = m * dist_m  # mantissa part
-        expo = 2 ** e      # exponent part
-        flt = frac * expo
-        plist.append(flt)
-        plist.append(-flt)
-    for e in range(-2 ** (e_bits - 1) + 2, 2 ** (e_bits - 1) + 1):
-        expo = 2 ** e
-        for m in range(0, 2 ** m_bits):
-            frac = 1. + m * dist_m
-            flt = frac * expo
-            plist.append(flt)
-            plist.append(-flt)
-    plist = torch.Tensor(list(set(plist)))
-    return plist
 
 def quantize_aware_training(model, device, train_loader, optimizer, epoch):
     lossLayer = torch.nn.CrossEntropyLoss()
...
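The `build_list` helper removed above enumerates every representable value of a low-bit float format: zero, subnormals (no implicit leading 1, minimum exponent), and normals (implicit leading 1) over the remaining exponents, with both signs. A pure-Python restatement of the same enumeration, with the `torch.Tensor` conversion dropped so it stands alone:

```python
def build_list(num_bits, e_bits):
    """Enumerate all values of a signed float format with e_bits exponent
    bits and (num_bits - 1 - e_bits) mantissa bits, including subnormals."""
    m_bits = num_bits - 1 - e_bits
    dist_m = 2.0 ** (-m_bits)       # gap between adjacent mantissa values
    plist = [0.0]
    e_min = -2 ** (e_bits - 1) + 1  # minimum (subnormal) exponent
    # subnormals: mantissa only, no implicit leading 1
    for m in range(1, 2 ** m_bits):
        flt = m * dist_m * 2.0 ** e_min
        plist += [flt, -flt]
    # normals: implicit leading 1, remaining exponents
    for e in range(e_min + 1, 2 ** (e_bits - 1) + 1):
        for m in range(2 ** m_bits):
            flt = (1.0 + m * dist_m) * 2.0 ** e
            plist += [flt, -flt]
    return sorted(set(plist))

# FP8 with 4 exponent bits: 255 distinct values (zero shared by both signs)
print(len(build_list(8, 4)))  # -> 255
```

The value list is symmetric around zero, which matches the README's signed symmetric quantization with zero point 0.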