Commit a2ea6085 by Klin

feat: add Gen of ALL-cifar in GDFQ

parent 394c19ce
# GDFQ Notes
+ The approach follows the paper **Generative Low-bitwidth Data Free Quantization**; the reference implementation is available at https://github.com/xushoukai/GDFQ
+ Core idea of the paper:
  + Train a generator using information captured from the pre-trained (full-precision) model.
  + Classification-information matching: classification features are taken from the last layer of the pre-trained model. Given a random label y and Gaussian noise, the generator produces fake data x; x is fed through the full-precision model, and its output z is combined with y to compute loss_one_hot.
  + Data-distribution matching: the BN layers of the pre-trained model store the statistics of the training data; BNS_loss compares the distribution of the generated data against these stored statistics.
  + The generator is updated by back-propagating loss_one_hot and BNS_loss. The resulting generator produces data that closely follows the classification boundary of the full-precision model, and whose distribution matches that of the real (training-set) data, as illustrated in the paper (a minimal sketch of the two losses follows the figure below):
![paper_img](image/paper_img.png)
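A minimal sketch of the two generator losses, assuming a conditional generator `generator(z, y)` and a frozen full-precision model `fp_model` (these names and the hook-based BN statistics are illustrative, not the repository's exact API):

```python
import torch
import torch.nn.functional as F

def generator_step(generator, fp_model, num_classes=100, latent_dim=100, batch_size=32):
    y = torch.randint(0, num_classes, (batch_size,))      # random labels
    z = torch.randn(batch_size, latent_dim)               # Gaussian noise
    x_fake = generator(z, y)                               # fake data x

    # BNS_loss: compare batch statistics of the fake data (captured at every BN layer
    # with forward hooks) against the running statistics stored during FP training.
    bns_terms, handles = [], []
    def make_hook():
        def hook(module, inp, out):
            feat = inp[0]
            mean = feat.mean(dim=(0, 2, 3))
            var = feat.var(dim=(0, 2, 3), unbiased=False)
            bns_terms.append(F.mse_loss(mean, module.running_mean) +
                             F.mse_loss(var, module.running_var))
        return hook
    for m in fp_model.modules():
        if isinstance(m, torch.nn.BatchNorm2d):
            handles.append(m.register_forward_hook(make_hook()))

    logits = fp_model(x_fake)                              # output z of the FP model
    for h in handles:
        h.remove()

    loss_one_hot = F.cross_entropy(logits, y)              # classification-information matching
    bns_loss = sum(bns_terms) / max(len(bns_terms), 1)     # data-distribution matching
    return loss_one_hot + bns_loss
```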
+ Fake-data-driven low-bitwidth quantization:
The trained generator supplies inputs and the full-precision model supplies labels for training the quantized network, so that for the same input the quantized model's output moves closer to the full-precision model's output.
+ What we keep from the paper's code:
  + Our quantization targets direct deployment without fine-tuning, i.e. no further training or adjustment, so the paper's fake-data-driven fine-tuning stage can be omitted.
  + The paper's generator closely mimics the classification boundary of the full-precision model, which matches our original definition of transfer safety: tolerance to perturbations of samples near the boundary. The generator therefore fits well into our framework.
+ Experimental changes:
  + The original paper loads officially pre-trained full-precision models from torchcv; we use full-precision models trained by ourselves, because safety is not concerned with the objective classification boundary of the training set, but with how the full-precision model's own boundary is changed.
  + Generator quality: generators were trained for all 9 models from the earlier ALL-cifar10 setup. Random labels y with noise are passed through the generator to obtain fake data, the fake data is fed into the full-precision model, and agreement with y is taken as the accuracy metric. The generators for the 3 ResNet models reach 60-70% accuracy; all other models exceed 99%. This indicates the generators fit the classification boundary well; ResNet's many residual connections likely make its boundary more complex, so the fit is slightly worse.
+ How to run:
```shell
python main.py --conf_path=./cifar100_resnet20.hocon --id=01 --model_name=ResNet_18
```
+ Planned safety evaluation (see the sketch below):
Randomly generate labels, pass them through the generator to produce fake data, and feed the fake data to both the full-precision model and the quantized model. Taking the full-precision output as the reference, and given that the generator fits the full-precision classification boundary well, the fraction of samples on which the quantized model disagrees with the full-precision model measures how much quantization shifts the classification boundary.
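A minimal sketch of this consistency check (model and generator interfaces are assumed, mirroring the sketch above):

```python
import torch

@torch.no_grad()
def boundary_shift_rate(generator, fp_model, quant_model,
                        num_classes=100, latent_dim=100, n_batches=50, batch_size=200):
    mismatched, total = 0, 0
    for _ in range(n_batches):
        y = torch.randint(0, num_classes, (batch_size,))
        z = torch.randn(batch_size, latent_dim)
        x_fake = generator(z, y)
        fp_pred = fp_model(x_fake).argmax(dim=1)     # reference: full-precision prediction
        q_pred = quant_model(x_fake).argmax(dim=1)   # quantized-model prediction
        mismatched += (fp_pred != q_pred).sum().item()
        total += batch_size
    # fraction of disagreement ~ how much quantization shifted the classification boundary
    return mismatched / total
```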
+ Further experimental improvements:
Following **Qimera: Data-free Quantization with Synthetic Boundary Supporting Samples [NeurIPS 2021]**, further increase the proportion of decision-boundary samples.
# conv: 'C',''/'B'/'BRL'/'BRS',qi,in_ch,out_ch,kernel_size,stride,padding,bias
# relu: 'RL'
# relu6: 'RS'
# inception: 'Inc',inc_idx
# maxpool: 'MP',kernel_size,stride,padding
# adaptiveavgpool: 'AAP',output_size
# view: 'VW'
#   default: x = x.view(x.size(0),-1)
# dropout: 'D',p
# fc: 'FC',in_features,out_features,bias
# MakeLayer: 'ML','BBLK'/'BTNK'/'IRES', ml_idx, blocks
# softmax: 'SM'
# 100-class (CIFAR-100) model configurations
ResNet_18_cfg_table = [
['C','BRL',True,3,16,3,1,1,True],
['ML','BBLK',0,2],
['ML','BBLK',1,2],
['ML','BBLK',2,2],
['ML','BBLK',3,2],
['AAP',1],
['VW'],
['FC',128,100,True],
['SM']
]
ResNet_50_cfg_table = [
['C','BRL',True,3,16,3,1,1,True],
['ML','BTNK',0,3],
['ML','BTNK',1,4],
['ML','BTNK',2,6],
['ML','BTNK',3,3],
['AAP',1],
['VW'],
['FC',512,100,True],
['SM']
]
ResNet_152_cfg_table = [
['C','BRL',True,3,16,3,1,1,True],
['ML','BTNK',0,3],
['ML','BTNK',1,8],
['ML','BTNK',2,36],
['ML','BTNK',3,3],
['AAP',1],
['VW'],
['FC',512,100,True],
['SM']
]
MobileNetV2_cfg_table = [
['C','BRS',True,3,32,3,1,1,True],
['ML','IRES',0,1],
['ML','IRES',1,2],
['ML','IRES',2,3],
['ML','IRES',3,3],
['ML','IRES',4,3],
['ML','IRES',5,1],
['C','',False,320,1280,1,1,0,True],
['AAP',1],
['VW'],
['FC',1280,100,True]
]
AlexNet_cfg_table = [
['C','',True,3,32,3,1,1,True],
['RL'],
['MP',2,2,0],
['C','',False,32,64,3,1,1,True],
['RL'],
['MP',2,2,0],
['C','',False,64,128,3,1,1,True],
['RL'],
['C','',False,128,256,3,1,1,True],
['RL'],
['C','',False,256,256,3,1,1,True],
['RL'],
['MP',3,2,0],
['VW'],
['D',0.5],
['FC',2304,1024,True],
['RL'],
['D',0.5],
['FC',1024,512,True],
['RL'],
['FC',512,100,True]
]
AlexNet_BN_cfg_table = [
['C','BRL',True,3,32,3,1,1,True],
['MP',2,2,0],
['C','BRL',False,32,64,3,1,1,True],
['MP',2,2,0],
['C','BRL',False,64,128,3,1,1,True],
['C','BRL',False,128,256,3,1,1,True],
['C','BRL',False,256,256,3,1,1,True],
['MP',3,2,0],
['VW'],
['D',0.5],
['FC',2304,1024,True],
['RL'],
['D',0.5],
['FC',1024,512,True],
['RL'],
['FC',512,100,True]
]
VGG_16_cfg_table = [
['C','BRL',True,3,64,3,1,1,True],
['C','BRL',False,64,64,3,1,1,True],
['MP',2,2,0],
['C','BRL',False,64,128,3,1,1,True],
['C','BRL',False,128,128,3,1,1,True],
['MP',2,2,0],
['C','BRL',False,128,256,3,1,1,True],
['C','BRL',False,256,256,3,1,1,True],
['C','BRL',False,256,256,3,1,1,True],
['MP',2,2,0],
['C','BRL',False,256,512,3,1,1,True],
['C','BRL',False,512,512,3,1,1,True],
['C','BRL',False,512,512,3,1,1,True],
['MP',2,2,0],
['C','BRL',False,512,512,3,1,1,True],
['C','BRL',False,512,512,3,1,1,True],
['C','BRL',False,512,512,3,1,1,True],
['MP',2,2,0],
['VW'],
['FC',512,4096,True],
['RL'],
['D',0.5],
['FC',4096,4096,True],
['RL'],
['D',0.5],
['FC',4096,100,True]
]
VGG_19_cfg_table = [
['C','BRL',True,3,64,3,1,1,True],
['C','BRL',False,64,64,3,1,1,True],
['MP',2,2,0],
['C','BRL',False,64,128,3,1,1,True],
['C','BRL',False,128,128,3,1,1,True],
['MP',2,2,0],
['C','BRL',False,128,256,3,1,1,True],
['C','BRL',False,256,256,3,1,1,True],
['C','BRL',False,256,256,3,1,1,True],
['C','BRL',False,256,256,3,1,1,True],
['MP',2,2,0],
['C','BRL',False,256,512,3,1,1,True],
['C','BRL',False,512,512,3,1,1,True],
['C','BRL',False,512,512,3,1,1,True],
['C','BRL',False,512,512,3,1,1,True],
['MP',2,2,0],
['C','BRL',False,512,512,3,1,1,True],
['C','BRL',False,512,512,3,1,1,True],
['C','BRL',False,512,512,3,1,1,True],
['C','BRL',False,512,512,3,1,1,True],
['MP',2,2,0],
['VW'],
['FC',512,4096,True],
['RL'],
['D',0.5],
['FC',4096,4096,True],
['RL'],
['D',0.5],
['FC',4096,100,True]
]
Inception_BN_cfg_table = [
['C','',True,3,64,3,1,1,True],
['RL'],
['C','',False,64,64,3,1,1,True],
['RL'],
['Inc',0],
['Inc',1],
['MP',3,2,1],
['Inc',2],
['Inc',3],
['Inc',4],
['Inc',5],
['Inc',6],
['MP',3,2,1],
['Inc',7],
['Inc',8],
['AAP',1],
['C','',False,1024,100,1,1,0,True],
['VW']
]
model_cfg_table = {
'AlexNet' : AlexNet_cfg_table,
'AlexNet_BN' : AlexNet_BN_cfg_table,
'VGG_16' : VGG_16_cfg_table,
'VGG_19' : VGG_19_cfg_table,
'Inception_BN' : Inception_BN_cfg_table,
'ResNet_18' : ResNet_18_cfg_table,
'ResNet_50' : ResNet_50_cfg_table,
'ResNet_152' : ResNet_152_cfg_table,
'MobileNetV2' : MobileNetV2_cfg_table
}
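# Hedged sketch of how a single cfg_table row could be turned into layers. The actual
# parsing lives in make_layers / model_forward (model_deployment.py, not shown here);
# this only illustrates the row layouts documented in the comments above. 'qi' is the
# flag in conv rows marking whether the layer's input is quantized.
import torch.nn as nn

def build_row_sketch(row):
    op = row[0]
    if op == 'C':    # ['C', ''/'B'/'BRL'/'BRS', qi, in_ch, out_ch, kernel_size, stride, padding, bias]
        _, flag, qi, in_ch, out_ch, k, s, p, bias = row
        layers = [nn.Conv2d(in_ch, out_ch, k, stride=s, padding=p, bias=bias)]
        if 'B' in flag:
            layers.append(nn.BatchNorm2d(out_ch))
        if flag.endswith('RL'):
            layers.append(nn.ReLU(inplace=True))
        elif flag.endswith('RS'):
            layers.append(nn.ReLU6(inplace=True))
        return nn.Sequential(*layers)
    if op == 'FC':   # ['FC', in_features, out_features, bias]
        return nn.Linear(row[1], row[2], bias=row[3])
    if op == 'MP':   # ['MP', kernel_size, stride, padding]
        return nn.MaxPool2d(row[1], stride=row[2], padding=row[3])
    if op == 'AAP':  # ['AAP', output_size]
        return nn.AdaptiveAvgPool2d(row[1])
    if op == 'D':    # ['D', p]
        return nn.Dropout(row[1])
    if op == 'RL':
        return nn.ReLU(inplace=True)
    raise NotImplementedError(op)   # 'ML', 'Inc', 'VW', 'SM' need model-level handling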
# Each row is the channel parameter table for one Inc (Inception) block
inc_ch_table=[
[ 64, 64, 96,128, 16, 32, 32],#3a
[256,128,128,192, 32, 96, 64],#3b
[480,192, 96,208, 16, 48, 64],#4a
[512,160,112,224, 24, 64, 64],#4b
[512,128,128,256, 24, 64, 64],#4c
[512,112,144,288, 32, 64, 64],#4d
[528,256,160,320, 32,128,128],#4e
[832,256,160,320, 32,128,128],#5a
[832,384,192,384, 48,128,128] #5b
]
# br0,br1,br2,br3 <- br1x1,br3x3,br5x5,brM
# Each sub-list describes one branch of the Inc block; every conv implicitly carries 'BRL' and bias=False
# The 2nd and 3rd conv parameters are indices into the corresponding Inc row of inc_ch_table
# Since every Inc block has the same structure and only the widths differ, indices (rather than literal values) are used so the table can be reused
# The branch outputs are concatenated afterwards; as there is only one possible structure, this is not listed explicitly
# conv: 'C', ('BRL' default), in_ch_idx, out_ch_idx, kernel_size, stride, padding, (bias: True default)
# maxpool: 'MP', kernel_size, stride, padding
# relu: 'RL'
inc_cfg_table = [
[
['C',0,1,1,1,0]
],
[
['C',0,2,1,1,0],
['C',2,3,3,1,1]
],
[
['C',0,4,1,1,0],
['C',4,5,5,1,2]
],
[
['MP',3,1,1],
['RL'],
['C',0,6,1,1,0]
]
]
# ml_cfg_table = []
#BasicBlock
#value: downsample,inplanes,planes,planes*expansion,stride,1(default stride and groups)
bblk_ch_table = [
[False, 16, 16, 16,1,1], #layer1,first
[False, 16, 16, 16,1,1], # other
[True, 16, 32, 32,2,1], #layer2
[False, 32, 32, 32,1,1],
[True, 32, 64, 64,2,1], #layer3
[False, 64, 64, 64,1,1],
[True, 64,128,128,2,1], #layer4
[False,128,128,128,1,1]
]
#conv: 'C','B'/'BRL'/'BRS', in_ch_idx, out_ch_idx, kernel_sz, stride_idx, padding, groups_idx (bias: True default)
#add: 'AD', unconditional. When unconditional (or the block's flag) is True, the two tensors in outs are added
bblk_cfg_table = [
[
['C','BRL',1,2,3,4,1,5],
['C','B' ,2,2,3,5,1,5],
],
# downsample branch, used only when downsample is passed as True
[
['C','B' ,1,3,1,4,0,5]
],
# operations after the two branches merge
[
['AD',True],
['RL']
]
]
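# Hedged sketch: how one BasicBlock conv entry resolves against its bblk_ch_table row.
# e.g. entry ['C','BRL',1,2,3,4,1,5] with row [True,16,32,32,2,1] (layer2, first block) gives
# in_ch=row[1]=16, out_ch=row[2]=32, kernel_size=3, stride=row[4]=2, padding=1, groups=row[5]=1.
def resolve_conv_sketch(entry, ch_row):
    _, flag, in_idx, out_idx, k, s_idx, p, g_idx = entry
    return dict(in_channels=ch_row[in_idx], out_channels=ch_row[out_idx],
                kernel_size=k, stride=ch_row[s_idx], padding=p, groups=ch_row[g_idx])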
#BottleNeck
#value: downsample,inplanes,planes,planes*expansion,stride,1(default stride and groups)
btnk_ch_table = [
[True, 16, 16, 64,1,1], #layer1,first
[False, 64, 16, 64,1,1], # other
[True, 64, 32,128,2,1], #layer2
[False,128, 32,128,1,1],
[True, 128, 64,256,2,1], #layer3
[False,256, 64,256,1,1],
[True, 256,128,512,2,1], #layer4
[False,512,128,512,1,1]
]
#conv: 'C','B'/'BRL'/'BRS', in_ch_idx, out_ch_idx, kernel_sz, stride_idx, padding, groups_idx (bias: True default)
#add: 'AD', unconditional. When unconditional (or the block's flag) is True, the two tensors in outs are added
btnk_cfg_table = [
[
['C','BRL',1,2,1,5,0,5],
['C','BRL',2,2,3,4,1,5],
['C','B' ,2,3,1,5,0,5]
],
# downsample branch, used only when downsample is passed as True
[
['C','B' ,1,3,1,4,0,5]
],
# operations after the two branches merge
[
['AD',True],
['RL']
]
]
#InvertedResidual
#value: identity_flag, in_ch, out_ch, in_ch*expand_ratio, stride, 1(default stride and groups)
ires_ch_table = [
[False, 32, 16, 32,1,1], #layer1,first
[ True, 16, 16, 16,1,1], # other
[False, 16, 24, 96,2,1], #layer2
[ True, 24, 24, 144,1,1],
[False, 24, 32, 144,2,1], #layer3
[ True, 32, 32, 192,1,1],
[False, 32, 96, 192,1,1], #layer4
[ True, 96, 96, 576,1,1],
[False, 96,160, 576,2,1], #layer5
[ True,160,160, 960,1,1],
[False,160,320, 960,1,1], #layer6
[ True,320,320,1920,1,1]
]
#conv: 'C','B'/'BRL'/'BRS', in_ch_idx, out_ch_idx, kernel_sz, stride_idx, padding, groups_idx (bias: True default)
#add: 'AD', unconditional. When unconditional (or the block's flag) is True, the two tensors in outs are added
ires_cfg_table = [
[
['C','BRS',1,3,1,5,0,5],
['C','BRS',3,3,3,4,1,3],
['C','B' ,3,2,1,5,0,5]
],
# identity branch: empty (the shortcut is the input itself)
[
],
# operations after the two branches merge
[
['AD',False] # conditional add (performed only when the block's identity flag is True)
]
]
# ------------ General options ----------------------------------------
save_path = "./log_cifar100_ResNet_epoch1600/"
dataPath = "/lustre/datasets/CIFAR100"
dataset = "cifar100" # options: imagenet | cifar100
nGPU = 1 # number of GPUs to use by default
GPU = 0 # default gpu to use, options: range(nGPU)
visible_devices = "0"
# ------------- Data options -------------------------------------------
nThreads = 8 # number of data loader threads
# ---------- Optimization options for S --------------------------------------
# nEpochs = 400 # number of total epochs to train 400
nEpochs = 1600
batchSize = 200 # batchsize
momentum = 0.9 # momentum 0.9
weightDecay = 1e-4 # weight decay 1e-4
opt_type = "SGD"
warmup_epochs = 4 # number of epochs for warmup
lr_S = 0.0001 # initial learning rate (default 0.00001)
lrPolicy_S = "multi_step" # options: multi_step | linear | exp | const | step
step_S = [100,200,300] # step for linear or exp learning rate policy default [100, 200, 300]
decayRate_S = 0.1 # lr decay rate
# ---------- Model options ---------------------------------------------
experimentID = "_cifar100_4bit_"
nClasses = 100 # number of classes in the dataset
# ---------- Quantization options ---------------------------------------------
qw = 4
qa = 4
# ----------KD options ---------------------------------------------
temperature = 20
alpha = 1
# ----------Generator options ---------------------------------------------
latent_dim = 100
img_size = 32
channels = 3
lr_G = 0.001 # default 0.001
lrPolicy_G = "multi_step" # options: multi_step | linear | exp | const | step
#step_G = [100,200,300] # step for linear or exp learning rate policy
step_G = [1000,1200,1400]
decayRate_G = 0.1 # lr decay rate
b1 = 0.5
b2 = 0.999
# -*- coding: utf-8 -*-
import torch.nn as nn
import torch.nn.functional as F
from torch.nn import init
# Built on top of the standard BatchNorm2d layer
class ConditionalBatchNorm2d(nn.BatchNorm2d):
"""Conditional Batch Normalization"""
def __init__(self, num_features, eps=1e-05, momentum=0.1,
affine=False, track_running_stats=True):
super(ConditionalBatchNorm2d, self).__init__(
num_features, eps, momentum, affine, track_running_stats
)
def forward(self, input, weight, bias, **kwargs):
self._check_input_dim(input)
exponential_average_factor = 0.0
if self.training and self.track_running_stats:
self.num_batches_tracked += 1
# choose the factor for the moving averages
if self.momentum is None: # use cumulative moving average
exponential_average_factor = 1.0 / self.num_batches_tracked.item()
else: # use exponential moving average
exponential_average_factor = self.momentum
output = F.batch_norm(input, self.running_mean, self.running_var,
self.weight, self.bias,
self.training or not self.track_running_stats,
exponential_average_factor, self.eps)
if weight.dim() == 1:
weight = weight.unsqueeze(0)
if bias.dim() == 1:
bias = bias.unsqueeze(0)
size = output.size()
weight = weight.unsqueeze(-1).unsqueeze(-1).expand(size)
bias = bias.unsqueeze(-1).unsqueeze(-1).expand(size)
return weight * output + bias
class CategoricalConditionalBatchNorm2d(ConditionalBatchNorm2d):
def __init__(self, num_classes, num_features, eps=1e-5, momentum=0.1,
affine=False, track_running_stats=True):
super(CategoricalConditionalBatchNorm2d, self).__init__(
num_features, eps, momentum, affine, track_running_stats
)
self.weights = nn.Embedding(num_classes, num_features)
self.biases = nn.Embedding(num_classes, num_features)
self._initialize()
def _initialize(self):
init.ones_(self.weights.weight.data)
init.zeros_(self.biases.weight.data)
def forward(self, input, c, **kwargs):
weight = self.weights(c)
bias = self.biases(c)
return super(CategoricalConditionalBatchNorm2d, self).forward(input, weight, bias)
if __name__ == '__main__':
"""Forward computation check."""
import torch
size = (3, 3, 12, 12)
# first two dimensions: batch size and number of features
batch_size, num_features = size[:2]
print('# Affirm embedding output')
naive_bn = nn.BatchNorm2d(3)
idx_input = torch.tensor([1, 2, 0], dtype=torch.long)
embedding = nn.Embedding(3, 3)
weights = embedding(idx_input)
print('# weights size', weights.size())
empty = torch.tensor((), dtype=torch.float)
running_mean = empty.new_zeros((3,))
running_var = empty.new_ones((3,))
naive_bn_W = naive_bn.weight
# print('# weights from embedding | type {}\n'.format(type(weights)), weights)
# print('# naive_bn_W | type {}\n'.format(type(naive_bn_W)), naive_bn_W)
input = torch.rand(*size, dtype=torch.float32)
print('input size', input.size())
print('input ndim ', input.dim())
_ = naive_bn(input)
print('# batch_norm with given weights')
try:
with torch.no_grad():
output = F.batch_norm(input, running_mean, running_var,
weights, naive_bn.bias, False, 0.0, 1e-05)
except Exception as e:
print("\tFailed to use given weights")
print('# Error msg:', e)
print()
else:
print("Succeeded to use given weights")
print('\n# Batch norm before use given weights')
with torch.no_grad():
tmp_out = F.batch_norm(input, running_mean, running_var,
naive_bn_W, naive_bn.bias, False, .0, 1e-05)
weights_cast = weights.unsqueeze(-1).unsqueeze(-1)
weights_cast = weights_cast.expand(tmp_out.size())
try:
out = weights_cast * tmp_out
except Exception:
print("Failed")
else:
print("Succeeded!")
print('\t {}'.format(out.size()))
print(type(tuple(out.size())))
print('--- condBN and catCondBN ---')
catCondBN = CategoricalConditionalBatchNorm2d(3, 3)
output = catCondBN(input, idx_input)
assert tuple(output.size()) == size
condBN = ConditionalBatchNorm2d(3)
idx = torch.tensor([1], dtype=torch.long)
out = catCondBN(input, idx)
print('cat cond BN weights\n', catCondBN.weights.weight.data)
print('cat cond BN biases\n', catCondBN.biases.weight.data)
"""
data loader for loading data
"""
import os
import math
import torch
import torch.utils.data as data
import numpy as np
from PIL import Image
import torchvision
import torchvision.datasets as dsets
import torchvision.transforms as transforms
import struct
__all__ = ["DataLoader", "PartDataLoader"]
class ImageLoader(data.Dataset):
def __init__(self, dataset_dir, transform=None, target_transform=None):
class_list = os.listdir(dataset_dir)
datasets = []
for cla in class_list:
cla_path = os.path.join(dataset_dir, cla)
files = os.listdir(cla_path)
for file_name in files:
file_path = os.path.join(cla_path, file_name)
if os.path.isfile(file_path):
# datasets.append((file_path, tuple([float(v) for v in int(cla)])))
datasets.append((file_path, [float(cla)]))
# print(datasets)
# assert False
self.dataset_dir = dataset_dir
self.datasets = datasets
self.transform = transform
self.target_transform = target_transform
def __getitem__(self, index):
frames = []
file_path, label = self.datasets[index]
noise = torch.load(file_path, map_location=torch.device('cpu'))
return noise, torch.Tensor(label)
def __len__(self):
return len(self.datasets)
class DataLoader(object):
"""
data loader for CV data sets
"""
def __init__(self, dataset, batch_size, n_threads=4,
ten_crop=False, data_path='/home/dataset/', logger=None):
"""
create data loader for specific data set
:params n_threads: number of threads to load data, default: 4
:params ten_crop: use ten crop for testing, default: False
:params data_path: path to data set, default: /home/dataset/
"""
self.dataset = dataset
self.batch_size = batch_size
self.n_threads = n_threads
self.ten_crop = ten_crop
self.data_path = data_path
self.logger = logger
self.dataset_root = data_path
self.logger.info("|===>Creating data loader for " + self.dataset)
if self.dataset in ["cifar100"]:
self.train_loader, self.test_loader = self.cifar(
dataset=self.dataset)
elif self.dataset in ["imagenet"]:
self.train_loader, self.test_loader = self.imagenet(
dataset=self.dataset)
else:
assert False, "invalid data set"
def getloader(self):
"""
get train_loader and test_loader
"""
return self.train_loader, self.test_loader
def imagenet(self, dataset="imagenet"):
traindir = os.path.join(self.data_path, "train")
testdir = os.path.join(self.data_path, "val")
normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
std=[0.229, 0.224, 0.225])
train_loader = torch.utils.data.DataLoader(
dsets.ImageFolder(traindir, transforms.Compose([
transforms.RandomResizedCrop(224),
transforms.RandomHorizontalFlip(),
transforms.ToTensor(),
normalize,
])),
batch_size=self.batch_size,
shuffle=True,
num_workers=self.n_threads,
pin_memory=True)
test_transform = transforms.Compose([
transforms.Resize(256),
# transforms.Scale(256),
transforms.CenterCrop(224),
transforms.ToTensor(),
normalize
])
test_loader = torch.utils.data.DataLoader(
dsets.ImageFolder(testdir, test_transform),
batch_size=self.batch_size,
shuffle=False,
num_workers=self.n_threads,
pin_memory=False)
return train_loader, test_loader
def cifar(self, dataset="cifar100"):
"""
dataset: cifar
"""
if dataset == "cifar10":
norm_mean = [0.49139968, 0.48215827, 0.44653124]
norm_std = [0.24703233, 0.24348505, 0.26158768]
elif dataset == "cifar100":
norm_mean = [0.50705882, 0.48666667, 0.44078431]
norm_std = [0.26745098, 0.25568627, 0.27607843]
# norm_mean = [0.4914, 0.4822, 0.4465]
# norm_std = [0.2023, 0.1994, 0.2010]
else:
assert False, "Invalid cifar dataset"
test_data_root = self.dataset_root
test_transform = transforms.Compose([
transforms.ToTensor(),
transforms.Normalize(norm_mean, norm_std)])
if self.dataset == "cifar10":
test_dataset = dsets.CIFAR10(root=test_data_root,
train=False,
transform=test_transform)
elif self.dataset == "cifar100":
test_dataset = dsets.CIFAR100(root=test_data_root,
train=False,
transform=test_transform,
download=True)
else:
assert False, "invalid data set"
test_loader = torch.utils.data.DataLoader(dataset=test_dataset,
batch_size=200,
# batch_size=128,
shuffle=False,
pin_memory=True,
num_workers=self.n_threads)
return None, test_loader
from torch.autograd import Function
class FakeQuantize(Function):
@staticmethod
def forward(ctx, x, qparam):
x = qparam.quantize_tensor(x)
x = qparam.dequantize_tensor(x)
return x
@staticmethod
def backward(ctx, grad_output):
return grad_output, None
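# Usage sketch (straight-through estimator): the forward pass quantizes and immediately
# dequantizes x through qparam, while backward passes the gradient through unchanged.
# `QParamStub` is a stand-in exposing the two methods FakeQuantize expects; the real
# qparam object comes from the quantization modules elsewhere in this repository.
import torch

class QParamStub:
    def __init__(self, scale=0.1, zero_point=0.0):
        self.scale, self.zero_point = scale, zero_point
    def quantize_tensor(self, x):
        return torch.round(x / self.scale + self.zero_point)
    def dequantize_tensor(self, q):
        return (q - self.zero_point) * self.scale

if __name__ == '__main__':
    x = torch.randn(4, requires_grad=True)
    y = FakeQuantize.apply(x, QParamStub())
    y.sum().backward()   # gradient flows straight through: x.grad is all ones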
# -*- coding: utf-8 -*-
# Share global variables across multiple modules
def _init(): # initialization
global _global_dict
_global_dict = {}
def set_value(value,is_bias=False):
# store a global value
if is_bias:
_global_dict[0] = value
else:
_global_dict[1] = value
def get_value(is_bias=False): # bias gets a precision independent of the other variables
if is_bias:
return _global_dict[0]
else:
return _global_dict[1]
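# Usage sketch: these helpers share one setting across modules, with the bias precision
# stored independently of everything else (the module name `global_var` is an assumption):
#   import global_var
#   global_var._init()
#   global_var.set_value(8)                  # precision shared by weights/activations
#   global_var.set_value(16, is_bias=True)   # separate precision for bias
#   assert global_var.get_value() == 8 and global_var.get_value(is_bias=True) == 16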
# ------------ General options ----------------------------------------
save_path = "./save_ImageNet/"
dataPath = "/home/datasets/Datasets/imagenet"
dataset = "imagenet" # options: imagenet | cifar100
nGPU = 1 # number of GPUs to use by default
GPU = 0 # default gpu to use, options: range(nGPU)
visible_devices = "2"
# ------------- Data options -------------------------------------------
nThreads = 8 # number of data loader threads
# ---------- Optimization options --------------------------------------
nEpochs = 400 # number of total epochs to train 400
batchSize = 16 # batchsize
momentum = 0.9 # momentum 0.9
weightDecay = 1e-4 # weight decay 1e-4
opt_type = "SGD"
warmup_epochs = 50 # number of epochs for warmup
lr_S = 0.000001 # initial learning rate = 0.000001
lrPolicy_S = "multi_step" # options: multi_step | linear | exp | const | step
step_S = [100,200,300] # step for linear or exp learning rate policy default [200, 300, 400]
decayRate_S = 0.1 # lr decay rate
# ---------- Model options ---------------------------------------------
experimentID = "imganet_4bit_"
nClasses = 1000 # number of classes in the dataset
# ---------- Quantization options ---------------------------------------------
qw = 4
qa = 4
# ----------KD options ---------------------------------------------
temperature = 20
alpha = 1
# ----------Generator options ---------------------------------------------
latent_dim = 100
img_size = 224
channels = 3
lr_G = 0.001 # default 0.001
lrPolicy_G = "multi_step" # options: multi_step | linear | exp | const | step
step_G = [100,200,300] # step for linear or exp learning rate policy
decayRate_G = 0.1 # lr decay rate
b1 = 0.5
b2 = 0.999
import torch.nn as nn
from cfg import *
from module import *
from model_deployment import *
class Model(nn.Module):
def __init__(self,model_name):
super(Model, self).__init__()
self.cfg_table = model_cfg_table[model_name]
make_layers(self,self.cfg_table)
# # 参数初始化
# for m in self.modules():
# if isinstance(m, nn.Conv2d):
# nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
# elif isinstance(m, nn.BatchNorm2d):
# nn.init.constant_(m.weight, 1)
# nn.init.constant_(m.bias, 0)
# elif isinstance(m, nn.Linear):
# nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
def forward(self,x):
x = model_forward(self,self.cfg_table,x)
return x
def quantize(self, quant_type, num_bits=8, e_bits=3):
model_quantize(self,self.cfg_table,quant_type,num_bits,e_bits)
def quantize_forward(self,x):
return model_utils(self,self.cfg_table,func='forward',x=x)
def freeze(self):
model_utils(self,self.cfg_table,func='freeze')
def quantize_inference(self,x):
return model_utils(self,self.cfg_table,func='inference',x=x)
def fakefreeze(self):
model_utils(self,self.cfg_table,func='fakefreeze')
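# Usage sketch (cfg tables come from cfg.py, the helpers from model_deployment.py):
#   model = Model('ResNet_18')                      # built from model_cfg_table['ResNet_18']
#   out = model(torch.randn(1, 3, 32, 32))          # CIFAR-sized input
#   model.quantize('INT', num_bits=8, e_bits=3)
#   q_out = model.quantize_forward(torch.randn(1, 3, 32, 32))
#   model.freeze()
#   q_inf = model.quantize_inference(torch.randn(1, 3, 32, 32))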
# if __name__ == "__main__":
# model = Inception_BN()
# model.quantize('INT',8,3)
# print(model.named_modules)
# print('-------')
# print(model.named_parameters)
# print(len(model.conv0.named_parameters()))
import os
import shutil
from pyhocon import ConfigFactory
from utils.opt_static import NetOption
class Option(NetOption):
def __init__(self, conf_path):
super(Option, self).__init__()
self.conf = ConfigFactory.parse_file(conf_path)
# ------------ General options ----------------------------------------
self.save_path = self.conf['save_path']
self.dataPath = self.conf['dataPath'] # path for loading data set
# note: only cifar100 and imagenet appear to be supported here
self.dataset = self.conf['dataset'] # options: imagenet | cifar100
self.nGPU = self.conf['nGPU'] # number of GPUs to use by default
self.GPU = self.conf['GPU'] # default gpu to use, options: range(nGPU)
self.visible_devices = self.conf['visible_devices']
# ------------- Data options -------------------------------------------
self.nThreads = self.conf['nThreads'] # number of data loader threads
# ---------- Optimization options --------------------------------------
self.nEpochs = self.conf['nEpochs'] # number of total epochs to train
self.batchSize = self.conf['batchSize'] # mini-batch size
self.momentum = self.conf['momentum'] # momentum
self.weightDecay = float(self.conf['weightDecay']) # weight decay
# optimizer type, e.g. SGD / Adam
self.opt_type = self.conf['opt_type']
self.warmup_epochs = self.conf['warmup_epochs'] # number of epochs for warmup
self.lr_S = self.conf['lr_S'] # initial learning rate
# the hocon configs use multi_step
self.lrPolicy_S = self.conf['lrPolicy_S'] # options: multi_step | linear | exp | const | step
self.step_S = self.conf['step_S'] # step for linear or exp learning rate policy
self.decayRate_S = self.conf['decayRate_S'] # lr decay rate
# ---------- Model options ---------------------------------------------
self.experimentID = self.conf['experimentID']
self.nClasses = self.conf['nClasses'] # number of classes in the dataset
# ---------- Quantization options ---------------------------------------------
# W4A4 quantization is configured here: W = weight bit-width, A = activation (e.g. ReLU output) bit-width; both are 4 in the hocon configs
self.qw = self.conf['qw']
self.qa = self.conf['qa']
# ----------KD options ---------------------------------------------
self.temperature = self.conf['temperature']
self.alpha = self.conf['alpha']
# ----------Generator options ---------------------------------------------
# generator hyper-parameters
self.latent_dim = self.conf['latent_dim']
self.img_size = self.conf['img_size']
self.channels = self.conf['channels']
self.lr_G = self.conf['lr_G']
# also multi_step in the hocon
self.lrPolicy_G = self.conf['lrPolicy_G'] # options: multi_step | linear | exp | const | step
self.step_G = self.conf['step_G'] # step for linear or exp learning rate policy
self.decayRate_G = self.conf['decayRate_G'] # lr decay rate
self.b1 = self.conf['b1']
self.b2 = self.conf['b2']
def set_save_path(self):
self.save_path = self.save_path + "{}_bs{:d}_lr{:.4f}_{}_epoch{}/".format(
self.experimentID,
self.batchSize, self.lr, self.opt_type,
self.nEpochs)
if os.path.exists(self.save_path):
shutil.rmtree(self.save_path)
# print("{} file exist!".format(self.save_path))
# action = input("Select Action: d (delete) / q (quit):").lower().strip()
# act = action
# if act == 'd':
# shutil.rmtree(self.save_path)
# else:
# raise OSError("Directory {} exits!".format(self.save_path))
if not os.path.exists(self.save_path):
os.makedirs(self.save_path)
def paramscheck(self, logger):
logger.info("|===>The used PyTorch version is {}".format(
self.torch_version))
if self.dataset in ["cifar10", "mnist"]:
self.nClasses = 10
elif self.dataset == "cifar100":
self.nClasses = 100
elif self.dataset == "imagenet" or "thi_imgnet":
self.nClasses = 1000
elif self.dataset == "imagenet100":
self.nClasses = 100
# *
# @file Different utility functions
# Copyright (c) Yaohui Cai, Zhewei Yao, Zhen Dong, Amir Gholami
# All rights reserved.
# This file is part of ZeroQ repository.
#
# ZeroQ is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# ZeroQ is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with ZeroQ repository. If not, see <http://www.gnu.org/licenses/>.
# *
import torch
import time
import math
import numpy as np
import torch.nn as nn
import torch.nn.functional as F
from torch.nn import Module, Parameter
from .quant_utils import *
import sys
class QuantAct(Module):
"""
Class to quantize given activations
"""
def __init__(self,
activation_bit,
full_precision_flag=False,
running_stat=True,
beta=0.9):
"""
activation_bit: bit-setting for activation
full_precision_flag: full precision or not
running_stat: determines whether the activation range is updated or frozen
"""
super(QuantAct, self).__init__()
self.activation_bit = activation_bit
self.full_precision_flag = full_precision_flag
self.running_stat = running_stat
self.register_buffer('x_min', torch.zeros(1))
self.register_buffer('x_max', torch.zeros(1))
self.register_buffer('beta', torch.Tensor([beta]))
self.register_buffer('beta_t', torch.ones(1))
self.act_function = AsymmetricQuantFunction.apply
def __repr__(self):
return "{0}(activation_bit={1}, full_precision_flag={2}, running_stat={3}, Act_min: {4:.2f}, Act_max: {5:.2f})".format(
self.__class__.__name__, self.activation_bit,
self.full_precision_flag, self.running_stat, self.x_min.item(),
self.x_max.item())
# fix and unfix control whether the running statistics can still be updated
def fix(self):
"""
fix the activation range by setting running stat
"""
self.running_stat = False
def unfix(self):
"""
unfix the activation range by setting running stat back to True
"""
self.running_stat = True
def forward(self, x):
"""
quantize given activation x
"""
if self.running_stat:
x_min = x.data.min()
x_max = x.data.max()
# in-place operation used on multi-gpus
# self.x_min += -self.x_min + min(self.x_min, x_min)
# self.x_max += -self.x_max + max(self.x_max, x_max)
self.beta_t = self.beta_t * self.beta
self.x_min = (self.x_min * self.beta + x_min * (1 - self.beta))/(1 - self.beta_t)
self.x_max = (self.x_max * self.beta + x_max * (1 - self.beta)) / (1 - self.beta_t)
if not self.full_precision_flag:
# apply quantization
quant_act = self.act_function(x, self.activation_bit, self.x_min,
self.x_max)
return quant_act
else:
return x
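# Usage sketch: leave running_stat=True while calibrating so x_min/x_max track activations
# via the bias-corrected exponential moving average above, then call fix() to freeze the
# activation range before evaluation/deployment (`calibration_batches` is a placeholder):
#   act_q = QuantAct(activation_bit=4)
#   for x in calibration_batches:
#       _ = act_q(x)        # updates x_min / x_max and quantizes x
#   act_q.fix()             # freeze the range; subsequent calls only quantize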
class Quant_Linear(Module):
"""
Class to quantize given linear layer weights
"""
def __init__(self, weight_bit, full_precision_flag=False):
"""
weight_bit: bit-setting for weights
full_precision_flag: full precision or not
"""
super(Quant_Linear, self).__init__()
self.full_precision_flag = full_precision_flag
self.weight_bit = weight_bit
self.weight_function = AsymmetricQuantFunction.apply
def __repr__(self):
s = super(Quant_Linear, self).__repr__()
s = "(" + s + " weight_bit={}, full_precision_flag={})".format(
self.weight_bit, self.full_precision_flag)
return s
def set_param(self, linear):
self.in_features = linear.in_features
self.out_features = linear.out_features
self.weight = Parameter(linear.weight.data.clone())
try:
self.bias = Parameter(linear.bias.data.clone())
except AttributeError:
self.bias = None
def forward(self, x):
"""
using quantized weights to forward activation x
"""
w = self.weight
x_transform = w.data.detach()
w_min = x_transform.min(dim=1).values
w_max = x_transform.max(dim=1).values
if not self.full_precision_flag:
w = self.weight_function(self.weight, self.weight_bit, w_min,
w_max)
else:
w = self.weight
return F.linear(x, weight=w, bias=self.bias)
class Quant_Conv2d(Module):
"""
Class to quantize given convolutional layer weights
"""
def __init__(self, weight_bit, full_precision_flag=False):
super(Quant_Conv2d, self).__init__()
self.full_precision_flag = full_precision_flag
self.weight_bit = weight_bit
self.weight_function = AsymmetricQuantFunction.apply
def __repr__(self):
s = super(Quant_Conv2d, self).__repr__()
s = "(" + s + " weight_bit={}, full_precision_flag={})".format(
self.weight_bit, self.full_precision_flag)
return s
def set_param(self, conv):
self.in_channels = conv.in_channels
self.out_channels = conv.out_channels
self.kernel_size = conv.kernel_size
self.stride = conv.stride
self.padding = conv.padding
self.dilation = conv.dilation
self.groups = conv.groups
self.weight = Parameter(conv.weight.data.clone())
try:
self.bias = Parameter(conv.bias.data.clone())
except AttributeError:
self.bias = None
def forward(self, x):
"""
using quantized weights to forward activation x
"""
w = self.weight
x_transform = w.data.contiguous().view(self.out_channels, -1)
w_min = x_transform.min(dim=1).values
w_max = x_transform.max(dim=1).values
if not self.full_precision_flag:
# quantize the weights here; the bias is left unchanged
w = self.weight_function(self.weight, self.weight_bit, w_min,
w_max)
else:
w = self.weight
return F.conv2d(x, w, self.bias, self.stride, self.padding,
self.dilation, self.groups)
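# Usage sketch: wrap an existing full-precision layer and forward as usual; weights are
# fake-quantized per output channel, while the bias stays full precision.
#   conv = nn.Conv2d(3, 16, 3, padding=1)
#   qconv = Quant_Conv2d(weight_bit=4)
#   qconv.set_param(conv)                    # copies weight/bias and conv hyper-parameters
#   y = qconv(torch.randn(1, 3, 32, 32))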
#*
# @file Different utility functions
# Copyright (c) Yaohui Cai, Zhewei Yao, Zhen Dong, Amir Gholami
# All rights reserved.
# This file is part of ZeroQ repository.
#
# ZeroQ is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# ZeroQ is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with ZeroQ repository. If not, see <http://www.gnu.org/licenses/>.
#*
import math
import numpy as np
from torch.autograd import Function, Variable
import torch
def clamp(input, min, max, inplace=False):
"""
Clamp tensor input to (min, max).
input: input tensor to be clamped
"""
if inplace:
input.clamp_(min, max)
return input
return torch.clamp(input, min, max)
def linear_quantize(input, scale, zero_point, inplace=False):
"""
Quantize single-precision input tensor to integers with the given scaling factor and zeropoint.
input: single-precision input tensor to be quantized
scale: scaling factor for quantization
zero_point: shift for quantization
"""
# depending on input's shape, align the per-channel information along the first dimension
# reshape scale and zeropoint for convolutional weights and activation
if len(input.shape) == 4:
scale = scale.view(-1, 1, 1, 1)
zero_point = zero_point.view(-1, 1, 1, 1)
# reshape scale and zeropoint for linear weights
elif len(input.shape) == 2:
scale = scale.view(-1, 1)
zero_point = zero_point.view(-1, 1)
# mapping single-precision input to integer values with the given scale and zeropoint
if inplace:
# this is the actual quantization step
input.mul_(scale).sub_(zero_point).round_()
return input
return torch.round(scale * input - zero_point)
def linear_dequantize(input, scale, zero_point, inplace=False):
"""
Map integer input tensor to fixed point float point with given scaling factor and zeropoint.
input: integer input tensor to be mapped
scale: scaling factor for quantization
zero_point: shift for quantization
"""
# reshape scale and zeropoint for convolutional weights and activation
if len(input.shape) == 4:
scale = scale.view(-1, 1, 1, 1)
zero_point = zero_point.view(-1, 1, 1, 1)
# reshape scale and zeropoint for linear weights
elif len(input.shape) == 2:
scale = scale.view(-1, 1)
zero_point = zero_point.view(-1, 1)
# mapping integer input to fixed point float point value with given scaling factor and zeropoint
if inplace:
input.add_(zero_point).div_(scale)
return input
return (input + zero_point) / scale
# asymmetric linear quantization
def asymmetric_linear_quantization_params(num_bits,
saturation_min,
saturation_max,
integral_zero_point=True,
signed=True):
"""
Compute the scaling factor and zeropoint with the given quantization range.
saturation_min: lower bound for quantization range
saturation_max: upper bound for quantization range
"""
n = 2**num_bits - 1
# note: this scale/zero-point convention is the reverse of the one used in our own framework
scale = n / torch.clamp((saturation_max - saturation_min), min=1e-8)
zero_point = scale * saturation_min
if integral_zero_point:
if isinstance(zero_point, torch.Tensor):
zero_point = zero_point.round()
else:
zero_point = float(round(zero_point))
if signed:
zero_point += 2**(num_bits - 1)
return scale, zero_point
class AsymmetricQuantFunction(Function):
"""
Class to quantize the given floating-point values with given range and bit-setting.
Currently only support inference, but not support back-propagation.
"""
@staticmethod
def forward(ctx, x, k, x_min=None, x_max=None):
"""
x: single-precision value to be quantized
k: bit-setting for x
x_min: lower bound for quantization range
x_max: upper bound for quantization range
"""
# if x_min is None or x_max is None or (sum(x_min == x_max) == 1
# and x_min.numel() == 1):
# x_min, x_max = x.min(), x.max()
scale, zero_point = asymmetric_linear_quantization_params(
k, x_min, x_max)
# quantize the input
new_quant_x = linear_quantize(x, scale, zero_point, inplace=False)
n = 2**(k - 1)
new_quant_x = torch.clamp(new_quant_x, -n, n - 1)
quant_x = linear_dequantize(new_quant_x,
scale,
zero_point,
inplace=False)
# wrapping in Variable keeps the output in the autograd graph (legacy API)
return torch.autograd.Variable(quant_x)
@staticmethod
def backward(ctx, grad_output):
return grad_output, None, None, None
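# Worked example of the asymmetric scheme above for k = 4 bits and range [-1.0, 3.0]:
#   n          = 2**4 - 1 = 15
#   scale      = 15 / (3.0 - (-1.0)) = 3.75
#   zero_point = round(3.75 * (-1.0)) + 2**3 = -4 + 8 = 4      (signed=True)
# A value x = 0.5 quantizes to round(3.75 * 0.5 - 4) = -2 (within the clamp range [-8, 7])
# and dequantizes back to (-2 + 4) / 3.75 ≈ 0.533, i.e. a quantization error of about 0.033.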
# numpy==1.16.4
# requests==2.21.0
pyhocon==0.3.51
# torchvision==0.4.0
# torch==1.2.0+cu92
# Pillow==7.2.0
termcolor==1.1.0
from utils.lr_policy import *
from utils.compute import *
from utils.log_print import *
from utils.model_transform import *
# from utils.ifeige import *
import numpy as np
import math
import torch
__all__ = ["compute_tencrop", "compute_singlecrop", "AverageMeter"]
def compute_tencrop(outputs, labels):
output_size = outputs.size()
outputs = outputs.view(output_size[0] // 10, 10, output_size[1])
outputs = outputs.sum(1).squeeze(1)
# compute top1
_, pred = outputs.topk(1, 1, True, True)
pred = pred.t()
top1_count = pred.eq(labels.data.view(
1, -1).expand_as(pred)).view(-1).float().sum(0)
top1_error = 100.0 - 100.0 * top1_count / labels.size(0)
top1_error = float(top1_error.cpu().numpy())
# compute top5
_, pred = outputs.topk(5, 1, True, True)
pred = pred.t()
top5_count = pred.eq(labels.data.view(
1, -1).expand_as(pred)).view(-1).float().sum(0)
top5_error = 100.0 - 100.0 * top5_count / labels.size(0)
top5_error = float(top5_error.cpu().numpy())
return top1_error, 0, top5_error
def compute_singlecrop(outputs, labels, loss, top5_flag=False, mean_flag=False):
with torch.no_grad():
if isinstance(outputs, list):
top1_loss = []
top1_error = []
top5_error = []
for i in range(len(outputs)):
top1_accuracy, top5_accuracy = accuracy(outputs[i], labels, topk=(1, 5))
top1_error.append(100 - top1_accuracy)
top5_error.append(100 - top5_accuracy)
top1_loss.append(loss[i].item())
else:
top1_accuracy, top5_accuracy = accuracy(outputs, labels, topk=(1,5))
top1_error = 100 - top1_accuracy
top5_error = 100 - top5_accuracy
top1_loss = loss.item()
if top5_flag:
return top1_error, top1_loss, top5_error
else:
return top1_error, top1_loss
# compute accuracy
def accuracy(output, target, topk=(1,)):
"""Computes the precision@k for the specified values of k"""
with torch.no_grad():
maxk = max(topk)
batch_size = target.size(0)
_, pred = output.topk(maxk, 1, True, True)
pred = pred.t()
correct = pred.eq(target.view(1, -1).expand_as(pred))
res = []
for k in topk:
correct_k = correct[:k].reshape(-1).float().sum(0, keepdim=True)
res.append(correct_k.mul_(100.0 / batch_size).item())
return res
class AverageMeter(object):
"""Computes and stores the average and current value"""
# tracks the average value over an interval
def __init__(self):
self.reset()
def reset(self):
"""
reset all parameters
"""
self.val = 0
self.avg = 0
self.sum = 0
self.count = 0
def update(self, val, n=1):
"""
update parameters
"""
self.val = val
self.sum += val * n
self.count += n
self.avg = self.sum / self.count
from termcolor import colored
import numpy as np
import datetime
__all__ = ["compute_remain_time", "print_result", "print_weight", "print_grad"]
single_train_time = 0
single_test_time = 0
single_train_iters = 0
single_test_iters = 0
def compute_remain_time(epoch, nEpochs, count, iters, data_time, iter_time, mode="Train"):
global single_train_time, single_test_time
global single_train_iters, single_test_iters
# compute cost time
if mode == "Train":
single_train_time = single_train_time * \
0.95 + 0.05 * (data_time + iter_time)
# single_train_time = data_time + iter_time
single_train_iters = iters
train_left_iter = single_train_iters - count + \
(nEpochs - epoch - 1) * single_train_iters
# print "train_left_iters", train_left_iter
test_left_iter = (nEpochs - epoch) * single_test_iters
else:
single_test_time = single_test_time * \
0.95 + 0.05 * (data_time + iter_time)
# single_test_time = data_time+iter_time
single_test_iters = iters
train_left_iter = (nEpochs - epoch - 1) * single_train_iters
test_left_iter = single_test_iters - count + \
(nEpochs - epoch - 1) * single_test_iters
left_time = single_train_time * train_left_iter + \
single_test_time * test_left_iter
total_time = (single_train_time * single_train_iters +
single_test_time * single_test_iters) * nEpochs
time_str = "TTime: {}, RTime: {}".format(datetime.timedelta(seconds=total_time),
datetime.timedelta(seconds=left_time))
return time_str, total_time, left_time
def print_result(epoch, nEpochs, count, iters, lr, data_time, iter_time, error, loss, top5error=None,
mode="Train", logger=None):
log_str = ">>> {}: [{:0>3d}|{:0>3d}], Iter: [{:0>3d}|{:0>3d}], LR: {:.6f}, DataTime: {:.4f}, IterTime: {:.4f}, ".format(
mode, epoch + 1, nEpochs, count, iters, lr, data_time, iter_time)
if isinstance(error, list) or isinstance(error, np.ndarray):
for i in range(len(error)):
log_str += "Error_{:d}: {:.4f}, Loss_{:d}: {:.4f}, ".format(i, error[i], i, loss[i])
else:
log_str += "Error: {:.4f}, Loss: {:.4f}, ".format(error, loss)
if top5error is not None:
if isinstance(top5error, list) or isinstance(top5error, np.ndarray):
for i in range(len(top5error)):
log_str += " Top5_Error_{:d}: {:.4f}, ".format(i, top5error[i])
else:
log_str += " Top5_Error: {:.4f}, ".format(top5error)
time_str, total_time, left_time = compute_remain_time(epoch, nEpochs, count, iters, data_time, iter_time, mode)
logger.info(log_str + time_str)
return total_time, left_time
def print_weight(layers, logger):
if isinstance(layers, MD.qConv2d):
logger.info(layers.weight)
elif isinstance(layers, MD.qLinear):
logger.info(layers.weight)
logger.info(layers.weight_mask)
logger.info("------------------------------------")
def print_grad(m, logger):
if isinstance(m, MD.qLinear):
logger.info(m.weight.data)
"""
class LRPolicy
"""
import math
__all__ = ["LRPolicy"]
class LRPolicy:
"""
learning rate policy
"""
def __init__(self, lr, n_epochs, lr_policy="multi_step"):
self.lr_policy = lr_policy
self.params_dict = {}
self.n_epochs = n_epochs
self.base_lr = lr
self.lr = lr
def set_params(self, params_dict=None):
"""
set parameters of lr policy
"""
if self.lr_policy == "multi_step":
"""
params: decay_rate, step
"""
self.params_dict['decay_rate'] = params_dict['decay_rate']
self.params_dict['step'] = sorted(params_dict['step'])
if max(self.params_dict['step']) <= 1:
new_step_list = []
for ratio in self.params_dict['step']:
new_step_list.append(int(self.n_epochs * ratio))
self.params_dict['step'] = new_step_list
elif self.lr_policy == "step":
"""
params: end_lr, step
step: lr = base_lr*gamma^(floor(iter/step))
"""
self.params_dict['end_lr'] = params_dict['end_lr']
self.params_dict['step'] = params_dict['step']
max_iter = math.floor((self.n_epochs - 1.0) /
self.params_dict['step'])
if self.params_dict['end_lr'] == -1:
self.params_dict['gamma'] = params_dict['decay_rate']
else:
self.params_dict['gamma'] = math.pow(
self.params_dict['end_lr'] / self.base_lr, 1. / max_iter)
elif self.lr_policy == "linear":
"""
params: end_lr, step
"""
self.params_dict['end_lr'] = params_dict['end_lr']
self.params_dict['step'] = params_dict['step']
elif self.lr_policy == "exp":
"""
params: end_lr
exp: lr = base_lr*gamma^iter
"""
self.params_dict['end_lr'] = params_dict['end_lr']
self.params_dict['gamma'] = math.pow(
self.params_dict['end_lr'] / self.base_lr, 1. / (self.n_epochs - 1))
elif self.lr_policy == "inv":
"""
params: end_lr
inv: lr = base_lr*(1+gamma*iter)^(-power)
"""
self.params_dict['end_lr'] = params_dict['end_lr']
self.params_dict['power'] = params_dict['power']
self.params_dict['gamma'] = (math.pow(
self.base_lr / self.params_dict['end_lr'],
1. / self.params_dict['power']) - 1.) / (self.n_epochs - 1.)
elif self.lr_policy == "const":
"""
no params
const: lr = base_lr
"""
self.params_dict = None
else:
assert False, "invalid lr_policy" + self.lr_policy
def get_lr(self, epoch):
"""
get current learning rate
"""
if self.lr_policy == "multi_step":
gamma = 0
for step in self.params_dict['step']:
if epoch + 1.0 > step:
gamma += 1
lr = self.base_lr * math.pow(self.params_dict['decay_rate'], gamma)
elif self.lr_policy == "step":
lr = self.base_lr * \
math.pow(self.params_dict['gamma'], math.floor(
epoch * 1.0 / self.params_dict['step']))
elif self.lr_policy == "linear":
k = (self.params_dict['end_lr'] - self.base_lr) / \
math.ceil(self.n_epochs / self.params_dict['step'])
lr = k * math.ceil((epoch + 1) /
self.params_dict['step']) + self.base_lr
elif self.lr_policy == "inv":
lr = self.base_lr * \
math.pow(
1 + self.params_dict['gamma'] * epoch, -self.params_dict['power'])
elif self.lr_policy == "exp":
# power = math.floor((epoch + 1) / self.params_dict['step'])
# lr = self.base_lr * math.pow(self.params_dict['gamma'], power)
lr = self.base_lr * math.pow(self.params_dict['gamma'], epoch)
elif self.lr_policy == "const":
lr = self.base_lr
else:
assert False, "invalid lr_policy: " + self.lr_policy
self.lr = lr
return lr
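# Usage sketch matching the generator settings in the cifar100 hocon above
# (lr_G = 0.001, step_G = [1000, 1200, 1400], decayRate_G = 0.1):
#   policy = LRPolicy(lr=0.001, n_epochs=1600, lr_policy="multi_step")
#   policy.set_params({'decay_rate': 0.1, 'step': [1000, 1200, 1400]})
#   policy.get_lr(0)      # -> 0.001
#   policy.get_lr(1000)   # -> 0.0001   (decays once epoch + 1 > 1000)
#   policy.get_lr(1500)   # -> 1e-06    (after all three steps)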
import torch.nn as nn
import torch
import numpy as np
__all__ = ["data_parallel", "model2list",
"list2sequential", "model2state_dict"]
def data_parallel(model, ngpus, gpu0=0):
"""
assign model to multi-gpu mode
:params model: target model
:params ngpus: number of gpus to use
:params gpu0: id of the master gpu
:return: model, type is Module or Sequential or DataParallel
"""
if ngpus == 0:
assert False, "only support gpu mode"
gpu_list = list(range(gpu0, gpu0 + ngpus))
assert torch.cuda.device_count() >= gpu0 + ngpus, "Invalid Number of GPUs"
if isinstance(model, list):
for i in range(len(model)):
if ngpus >= 2:
if not isinstance(model[i], nn.DataParallel):
model[i] = torch.nn.DataParallel(model[i], gpu_list).cuda()
else:
model[i] = model[i].cuda()
else:
if ngpus >= 2:
if not isinstance(model, nn.DataParallel):
model = torch.nn.DataParallel(model, gpu_list).cuda()
else:
model = model.cuda()
return model
def model2list(model):
"""
convert model to list type
:param model: should be type of list or nn.DataParallel or nn.Sequential
:return: no return params
"""
if isinstance(model, nn.DataParallel):
model = list(model.module)
elif isinstance(model, nn.Sequential):
model = list(model)
return model
def list2sequential(model):
if isinstance(model, list):
model = nn.Sequential(*model)
return model
def model2state_dict(file_path):
model = torch.load(file_path)
if model['model'] is not None:
model_state_dict = model['model'].state_dict()
torch.save(model_state_dict, file_path.replace(
'.pth', 'state_dict.pth'))
else:
print((type(model)))
print(model)
print("skip")
"""
TODO: add doc for module
"""
import torch
__all__ = ["NetOption"]
"""
You can run your script with CUDA_VISIBLE_DEVICES=5,6 python your_script.py
or set the environment variable in the script by os.environ['CUDA_VISIBLE_DEVICES'] = '5,6'
to map GPU 5, 6 to device_ids 0, 1, respectively.
"""
# main instantiates this object and then overrides it with the hocon config; most values ultimately come from the hocon file
class NetOption(object):
def __init__(self):
# ------------ General options ----------------------------------------
self.save_path = "" # log path
# where the dataset lives
self.dataPath = "/home/dataset/" # path for loading data set
self.dataset = "cifar10" # options: imagenet | cifar10 | cifar100 | imagenet100 | mnist
self.manualSeed = 1 # manually set RNG seed
self.nGPU = 1 # number of GPUs to use by default
self.GPU = 0 # default gpu to use, options: range(nGPU)
# ------------- Data options -------------------------------------------
self.nThreads = 4 # number of data loader threads
# ------------- Training options ---------------------------------------
self.testOnly = False # run on validation set only
self.tenCrop = False # Ten-crop testing
# ---------- Optimization options --------------------------------------
self.nEpochs = 200 # number of total epochs to train
self.batchSize = 128 # mini-batch size
self.momentum = 0.9 # momentum
self.weightDecay = 1e-4 # weight decay 1e-4
self.opt_type = "SGD"
self.lr = 0.1 # initial learning rate
self.lrPolicy = "multi_step" # options: multi_step | linear | exp | fixed
self.power = 1 # power for learning rate policy (inv)
self.step = [0.6, 0.8] # step for linear or exp learning rate policy
self.endlr = 0.001 # final learning rate, only for "linear" lr policy
self.decayRate = 0.1 # lr decay rate
# ---------- Model options ---------------------------------------------
self.netType = "PreResNet" # options: ResNet | PreResNet | GreedyNet | NIN | LeNet5
self.experimentID = "refator-test-01"
self.depth = 20 # resnet depth: (n-2)%6==0
self.nClasses = 10 # number of classes in the dataset
self.wideFactor = 1 # wide factor for wide-resnet
# ---------- Resume or Retrain options ---------------------------------------------
self.retrain = None # path to model to retrain with, load model state_dict only
self.resume = None # path to directory containing checkpoint, load state_dicts of model and optimizer, as well as training epoch
# ---------- Visualization options -------------------------------------
self.drawNetwork = True
self.drawInterval = 30
self.torch_version = torch.__version__
torch_version_split = self.torch_version.split("_")
self.torch_version = torch_version_split[0]
# check parameters
# self.paramscheck()
def paramscheck(self):
if self.torch_version != "0.2.0":
self.drawNetwork = False
print("|===>DrawNetwork is supported by PyTorch with version: 0.2.0. The used version is ", self.torch_version)
if self.netType in ["PreResNet", "ResNet"]:
self.save_path = "log_%s%d_%s_bs%d_lr%0.3f_%s/" % (
self.netType, self.depth, self.dataset,
self.batchSize, self.lr, self.experimentID)
else:
self.save_path = "log_%s_%s_bs%d_lr%0.3f_%s/" % (
self.netType, self.dataset,
self.batchSize, self.lr, self.experimentID)
if self.dataset in ["cifar10", "mnist"]:
self.nClasses = 10
elif self.dataset == "cifar100":
self.nClasses = 100
elif self.dataset == "imagenet" or "thi_imgnet":
self.nClasses = 1000
elif self.dataset == "imagenet100":
self.nClasses = 100
if self.depth >= 100:
self.drawNetwork = False
print("|===>draw network with depth over 100 layers, skip this step")
from torchlearning.mio import MIO
train_dataset = MIO("/home/datasets/imagenet_mio/train/")
test_dataset = MIO("/home/datasets/imagenet_mio/val/")
for i in range(train_dataset.size):
print(i)
train_dataset.fetchone(i)
for i in range(test_dataset.size):
print(i)
test_dataset.fetchone(i)
# Change notes
## update: 2023/05/29
+ GDFQ: integrated with the existing framework; generators have been trained for all models. Evaluation and decision-boundary sample augmentation will be added next.
## update2: 2023/05/26
+ Added cifar100 dataset support; see ALL-cifar100. The original ALL folder has been renamed to ALL-cifar10.