This post is a learning-record blog from the 365天深度学习训练营 (365-day deep learning training camp). Original author: K同学啊 | tutoring and project customization available.

1 Inception V1

Inception v1 paper
1.1 Theory

GoogLeNet first appeared in the 2014 ILSVRC competition, where it won first place; this version is usually referred to as Inception V1. Inception V1 is 22 layers deep with about 5M parameters. VGGNet, from the same period, achieves comparable performance but has far more parameters than Inception V1.

The Inception Module is the core building block of Inception V1. It introduces a parallel arrangement of convolution layers so that different features can be extracted within the same layer, as shown in figure (a). Stacking such modules to increase network depth improves performance, but it still suffers from heavy computation and a large parameter count. To address this, the Inception Module borrows the Network-in-Network idea and uses 1x1 convolutions for dimensionality reduction, which also indirectly deepens the network, thereby reducing the parameter count and the amount of computation, as shown in figure (b).

Note (worked example): suppose the output of the previous layer is 100x100x128. After a convolution layer with 256 kernels of size 5x5 (stride 1, pad 2), the output is 100x100x256, and that layer has 5x5x128x256 + 256 parameters. If instead the previous output first passes through a layer with 32 kernels of size 1x1 (the 1x1 convolution reduces the number of channels while leaving the feature-map size unchanged) and then through the layer with 256 kernels of size 5x5, the final output is still 100x100x256, but the parameter count becomes (128x1x1x32 + 32) + (32x5x5x256 + 256), roughly a quarter of the original. The computation drops from about 8.19x10^9 to about 2.1x10^9 multiply-accumulates; a quick numerical check follows below.

The role of the 1x1 kernel: its main purpose is to reduce the number of channels of the input feature map, cutting the network's parameter count and computation.

Finally, an Inception Module consists of four basic units: a 1x1 convolution, a 3x3 convolution, a 5x5 convolution, and a 3x3 max pooling. The outputs of the four units are concatenated along the channel dimension. Kernels of different sizes provide receptive fields of different sizes, so information at different scales of the image is extracted and fused, yielding a better representation. This is the core idea of the Inception Module.
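To make the arithmetic above concrete, here is a small throwaway check of the two counts (an illustrative sketch only; the variable names are not from the original post):

# parameter / multiply-accumulate comparison for the example above:
# 100x100x128 input, a 5x5 conv with 256 filters, versus a 1x1 reduction
# to 32 channels followed by the same 5x5 conv
H = W = 100
c_in, c_mid, c_out, k = 128, 32, 256, 5

params_direct = k * k * c_in * c_out + c_out                               # 819,456 (weights + biases)
macs_direct = H * W * c_out * (k * k * c_in)                               # about 8.19e9

params_reduced = (c_in * c_mid + c_mid) + (k * k * c_mid * c_out + c_out)  # 209,184, roughly a quarter
macs_reduced = H * W * c_mid * c_in + H * W * c_out * (k * k * c_mid)      # about 2.09e9

print(params_direct, params_reduced)
print(f'{macs_direct:.3e}', f'{macs_reduced:.3e}')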
1.2 Network structure

The Inception v1 network structure implemented here is shown below.

Note: the original network adds two auxiliary classifier branches, which serve two purposes:
(1) Mitigating vanishing gradients by injecting gradient signal at intermediate layers. During backpropagation, if any layer's derivative is 0, the chain-rule product becomes 0; the auxiliary losses give the gradient shorter paths back to the early layers.
(2) Using an intermediate layer's output for classification, which acts as a form of model ensembling. At test time these two auxiliary softmax branches are removed.
This technique is rarely used in later models.

The detailed network structure is shown below.

2 Implementation
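For reference, below is a minimal sketch of what one such auxiliary classifier looks like, following the layer sizes reported in the GoogLeNet paper (5x5 average pooling with stride 3, a 1x1 projection to 128 channels, a 1024-unit fully connected layer with 70% dropout, and the classification layer). It is not part of the implementation that follows, since the auxiliary branches are dropped there:

import torch
import torch.nn as nn

class InceptionAux(nn.Module):
    """Auxiliary classifier sketch (not used in the main implementation below)."""
    def __init__(self, in_channels, num_classes):
        super().__init__()
        self.avgpool = nn.AvgPool2d(kernel_size=5, stride=3)     # e.g. 14x14 -> 4x4
        self.conv = nn.Conv2d(in_channels, 128, kernel_size=1)   # channel reduction
        self.fc1 = nn.Linear(128 * 4 * 4, 1024)
        self.dropout = nn.Dropout(0.7)
        self.fc2 = nn.Linear(1024, num_classes)

    def forward(self, x):
        x = self.avgpool(x)
        x = torch.relu(self.conv(x))
        x = torch.flatten(x, 1)
        x = torch.relu(self.fc1(x))
        x = self.dropout(x)
        return self.fc2(x)   # auxiliary logits, added to the main loss with a small weight during training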
2.1 Development environment

OS: Ubuntu 16.04
Editor: Jupyter Lab
Language: Python 3.7
Deep learning framework: PyTorch

2.2 Preparation

2.2.1 Set up the GPU
import torch
import torch.nn as nn
import torchvision.transforms as transforms
import torchvision
from torchvision import transforms, datasets
import os, PIL, pathlib, warnings

warnings.filterwarnings("ignore")

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(device)

2.2.2 Import the data
import os, PIL, random, pathlib

data_dir = '../data/4-data/'
data_dir = pathlib.Path(data_dir)

data_paths = list(data_dir.glob('*'))
classNames = [str(path).split("\\")[-1] for path in data_paths]
print('classNames:', classNames, '\n')

total_dir = '../data/4-data/'

train_transforms = transforms.Compose([
    transforms.Resize([224, 224]),   # resize the input images
    transforms.ToTensor(),           # convert a PIL Image or numpy.ndarray to a tensor
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225])   # statistics computed from a random sample of the dataset
])

total_data = datasets.ImageFolder(total_dir, transform=train_transforms)
print(total_data, '\n')
print(total_data.class_to_idx)

The output is shown below.

2.2.3 Split the dataset
train_size = int(0.8 * len(total_data))
test_size = len(total_data) - train_size
train_dataset, test_dataset = torch.utils.data.random_split(total_data, [train_size, test_size])
print(train_dataset, test_dataset)

batch_size = 4

train_dl = torch.utils.data.DataLoader(train_dataset,
                                       batch_size=batch_size,
                                       shuffle=True,
                                       num_workers=1,
                                       pin_memory=False)
test_dl = torch.utils.data.DataLoader(test_dataset,
                                      batch_size=batch_size,
                                      shuffle=True,
                                      num_workers=1,
                                      pin_memory=False)

for X, y in test_dl:
    print("Shape of X [N, C, H, W]:", X.shape)
    print("Shape of y:", y.shape, y.dtype)
    break

The output is shown below.

2.3 Implementing Inception

The two auxiliary branches are omitted here; only the main branch is reproduced.
2.3.1 inception_block

Define a class named inception_block that inherits from nn.Module. The inception_block class contains all the layers and parameters of a single Inception module.
import torch
import torch.nn as nn
import torch.nn.functional as F

class inception_block(nn.Module):
    def __init__(self, in_channels, ch1x1, ch3x3red, ch3x3, ch5x5red, ch5x5, pool_proj):
        super(inception_block, self).__init__()

        # 1x1 conv branch
        self.branch1 = nn.Sequential(
            nn.Conv2d(in_channels, ch1x1, kernel_size=1),
            nn.BatchNorm2d(ch1x1),
            nn.ReLU(inplace=True)
        )

        # 1x1 conv -> 3x3 conv branch
        self.branch2 = nn.Sequential(
            nn.Conv2d(in_channels, ch3x3red, kernel_size=1),
            nn.BatchNorm2d(ch3x3red),
            nn.ReLU(inplace=True),
            nn.Conv2d(ch3x3red, ch3x3, kernel_size=3, padding=1),
            nn.BatchNorm2d(ch3x3),
            nn.ReLU(inplace=True)
        )

        # 1x1 conv -> 5x5 conv branch
        self.branch3 = nn.Sequential(
            nn.Conv2d(in_channels, ch5x5red, kernel_size=1),
            nn.BatchNorm2d(ch5x5red),
            nn.ReLU(inplace=True),
            nn.Conv2d(ch5x5red, ch5x5, kernel_size=5, padding=2),
            nn.BatchNorm2d(ch5x5),
            nn.ReLU(inplace=True)
        )

        # 3x3 max pooling -> 1x1 conv branch
        self.branch4 = nn.Sequential(
            nn.MaxPool2d(kernel_size=3, stride=1, padding=1),
            nn.Conv2d(in_channels, pool_proj, kernel_size=1),
            nn.BatchNorm2d(pool_proj),
            nn.ReLU(inplace=True)
        )

    def forward(self, x):
        # compute the forward pass through all branches
        # and concatenate the output feature maps
        branch1_output = self.branch1(x)
        branch2_output = self.branch2(x)
        branch3_output = self.branch3(x)
        branch4_output = self.branch4(x)

        outputs = [branch1_output, branch2_output, branch3_output, branch4_output]
        return torch.cat(outputs, 1)

In the __init__ method we define four branches:
(1) branch1: a 1x1 convolution layer;
(2) branch2: a 1x1 convolution layer followed by a 3x3 convolution layer;
(3) branch3: a 1x1 convolution layer followed by a 5x5 convolution layer;
(4) branch4: a 3x3 max pooling layer followed by a 1x1 convolution layer.

Each branch contains convolution layers, batch normalization layers, and activation functions. These are standard PyTorch layers: nn.Conv2d, nn.BatchNorm2d, and nn.ReLU define the convolution, batch normalization, and ReLU activation, respectively.

In the forward method we compute the forward pass of the input through all branches and concatenate their feature maps along the channel dimension. The concatenated feature map is returned.
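As a quick sanity check (a small illustrative snippet, not part of the original post), we can instantiate the block with the channel widths used later for inception3a and confirm that the output channels equal the sum of the four branch widths (64 + 128 + 32 + 32 = 256):

# hypothetical quick check of the block defined above
block = inception_block(in_channels=192, ch1x1=64, ch3x3red=96, ch3x3=128,
                        ch5x5red=16, ch5x5=32, pool_proj=32)
x = torch.randn(1, 192, 28, 28)   # dummy feature map
print(block(x).shape)             # expected: torch.Size([1, 256, 28, 28])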
2.3.2 Inception v1

Next we define the Inception v1 model, using nn.Sequential to combine multiple Inception modules with the remaining layers.
class InceptionV1(nn.Module):
    def __init__(self, num_classes=4):
        super(InceptionV1, self).__init__()

        self.conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3)
        self.maxpool1 = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
        self.conv2 = nn.Conv2d(64, 64, kernel_size=1, stride=1, padding=0)
        self.conv3 = nn.Conv2d(64, 192, kernel_size=3, stride=1, padding=1)
        self.maxpool2 = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)

        self.inception3a = inception_block(192, 64, 96, 128, 16, 32, 32)
        self.inception3b = inception_block(256, 128, 128, 192, 32, 96, 64)
        self.maxpool3 = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)

        self.inception4a = inception_block(480, 192, 96, 208, 16, 48, 64)
        self.inception4b = inception_block(512, 160, 112, 224, 24, 64, 64)
        self.inception4c = inception_block(512, 128, 128, 256, 24, 64, 64)
        self.inception4d = inception_block(512, 112, 144, 288, 32, 64, 64)
        self.inception4e = inception_block(528, 256, 160, 320, 32, 128, 128)
        self.maxpool4 = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)

        self.inception5a = inception_block(832, 256, 160, 320, 32, 128, 128)
        self.inception5b = nn.Sequential(
            inception_block(832, 384, 192, 384, 48, 128, 128),
            nn.AvgPool2d(kernel_size=7, stride=1, padding=0),
            nn.Dropout(0.4)
        )

        # fully connected layers for classification
        self.classifier = nn.Sequential(
            nn.Linear(in_features=1024, out_features=1024),
            nn.ReLU(),
            nn.Linear(in_features=1024, out_features=num_classes),
            nn.Softmax(dim=1)   # note: nn.CrossEntropyLoss (used later) already applies log-softmax, so this extra Softmax is optional
        )

    def forward(self, x):
        x = self.conv1(x)
        x = F.relu(x)
        x = self.maxpool1(x)
        x = self.conv2(x)
        x = F.relu(x)
        x = self.conv3(x)
        x = F.relu(x)
        x = self.maxpool2(x)

        x = self.inception3a(x)
        x = self.inception3b(x)
        x = self.maxpool3(x)
        x = self.inception4a(x)
        x = self.inception4b(x)
        x = self.inception4c(x)
        x = self.inception4d(x)
        x = self.inception4e(x)
        x = self.maxpool4(x)
        x = self.inception5a(x)
        x = self.inception5b(x)

        x = torch.flatten(x, start_dim=1)
        x = self.classifier(x)
        return x
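A quick shape check (an illustrative snippet, not from the original post) confirms that the model maps a 224x224 RGB image to num_classes probabilities:

# hypothetical forward-pass check on a dummy batch
net = InceptionV1(num_classes=4)
dummy = torch.randn(2, 3, 224, 224)
out = net(dummy)
print(out.shape)        # expected: torch.Size([2, 4])
print(out.sum(dim=1))   # each row sums to 1 because of the final Softmax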
2.3.3 Print the model structure

# count the model's parameters and other statistics
import torchsummary

# instantiate the model and move it to the GPU
model = InceptionV1().to(device)

# display the network structure
torchsummary.summary(model, (3, 224, 224))
print(model)

The output is shown below.
----------------------------------------------------------------Layer (type) Output Shape Param #
Conv2d-1 [-1, 64, 112, 112] 9,472MaxPool2d-2 [-1, 64, 56, 56] 0Conv2d-3 [-1, 64, 56, 56] 4,160Conv2d-4 [-1, 192, 56, 56] 110,784MaxPool2d-5 [-1, 192, 28, 28] 0Conv2d-6 [-1, 64, 28, 28] 12,352BatchNorm2d-7 [-1, 64, 28, 28] 128ReLU-8 [-1, 64, 28, 28] 0Conv2d-9 [-1, 96, 28, 28] 18,528BatchNorm2d-10 [-1, 96, 28, 28] 192ReLU-11 [-1, 96, 28, 28] 0Conv2d-12 [-1, 128, 28, 28] 110,720BatchNorm2d-13 [-1, 128, 28, 28] 256ReLU-14 [-1, 128, 28, 28] 0Conv2d-15 [-1, 16, 28, 28] 3,088BatchNorm2d-16 [-1, 16, 28, 28] 32ReLU-17 [-1, 16, 28, 28] 0Conv2d-18 [-1, 32, 28, 28] 12,832BatchNorm2d-19 [-1, 32, 28, 28] 64ReLU-20 [-1, 32, 28, 28] 0MaxPool2d-21 [-1, 192, 28, 28] 0Conv2d-22 [-1, 32, 28, 28] 6,176BatchNorm2d-23 [-1, 32, 28, 28] 64ReLU-24 [-1, 32, 28, 28] 0inception_block-25 [-1, 256, 28, 28] 0Conv2d-26 [-1, 128, 28, 28] 32,896BatchNorm2d-27 [-1, 128, 28, 28] 256ReLU-28 [-1, 128, 28, 28] 0Conv2d-29 [-1, 128, 28, 28] 32,896BatchNorm2d-30 [-1, 128, 28, 28] 256ReLU-31 [-1, 128, 28, 28] 0Conv2d-32 [-1, 192, 28, 28] 221,376BatchNorm2d-33 [-1, 192, 28, 28] 384ReLU-34 [-1, 192, 28, 28] 0Conv2d-35 [-1, 32, 28, 28] 8,224BatchNorm2d-36 [-1, 32, 28, 28] 64ReLU-37 [-1, 32, 28, 28] 0Conv2d-38 [-1, 96, 28, 28] 76,896BatchNorm2d-39 [-1, 96, 28, 28] 192ReLU-40 [-1, 96, 28, 28] 0MaxPool2d-41 [-1, 256, 28, 28] 0Conv2d-42 [-1, 64, 28, 28] 16,448BatchNorm2d-43 [-1, 64, 28, 28] 128ReLU-44 [-1, 64, 28, 28] 0inception_block-45 [-1, 480, 28, 28] 0MaxPool2d-46 [-1, 480, 14, 14] 0Conv2d-47 [-1, 192, 14, 14] 92,352BatchNorm2d-48 [-1, 192, 14, 14] 384ReLU-49 [-1, 192, 14, 14] 0Conv2d-50 [-1, 96, 14, 14] 46,176BatchNorm2d-51 [-1, 96, 14, 14] 192ReLU-52 [-1, 96, 14, 14] 0Conv2d-53 [-1, 208, 14, 14] 179,920BatchNorm2d-54 [-1, 208, 14, 14] 416ReLU-55 [-1, 208, 14, 14] 0Conv2d-56 [-1, 16, 14, 14] 7,696BatchNorm2d-57 [-1, 16, 14, 14] 32ReLU-58 [-1, 16, 14, 14] 0Conv2d-59 [-1, 48, 14, 14] 19,248BatchNorm2d-60 [-1, 48, 14, 14] 96ReLU-61 [-1, 48, 14, 14] 0MaxPool2d-62 [-1, 480, 14, 14] 0Conv2d-63 [-1, 64, 14, 14] 30,784BatchNorm2d-64 [-1, 64, 14, 14] 128ReLU-65 [-1, 64, 14, 14] 0inception_block-66 [-1, 512, 14, 14] 0Conv2d-67 [-1, 160, 14, 14] 82,080BatchNorm2d-68 [-1, 160, 14, 14] 320ReLU-69 [-1, 160, 14, 14] 0Conv2d-70 [-1, 112, 14, 14] 57,456BatchNorm2d-71 [-1, 112, 14, 14] 224ReLU-72 [-1, 112, 14, 14] 0Conv2d-73 [-1, 224, 14, 14] 226,016BatchNorm2d-74 [-1, 224, 14, 14] 448ReLU-75 [-1, 224, 14, 14] 0Conv2d-76 [-1, 24, 14, 14] 12,312BatchNorm2d-77 [-1, 24, 14, 14] 48ReLU-78 [-1, 24, 14, 14] 0Conv2d-79 [-1, 64, 14, 14] 38,464BatchNorm2d-80 [-1, 64, 14, 14] 128ReLU-81 [-1, 64, 14, 14] 0MaxPool2d-82 [-1, 512, 14, 14] 0Conv2d-83 [-1, 64, 14, 14] 32,832BatchNorm2d-84 [-1, 64, 14, 14] 128ReLU-85 [-1, 64, 14, 14] 0inception_block-86 [-1, 512, 14, 14] 0Conv2d-87 [-1, 128, 14, 14] 65,664BatchNorm2d-88 [-1, 128, 14, 14] 256ReLU-89 [-1, 128, 14, 14] 0Conv2d-90 [-1, 128, 14, 14] 65,664BatchNorm2d-91 [-1, 128, 14, 14] 256ReLU-92 [-1, 128, 14, 14] 0Conv2d-93 [-1, 256, 14, 14] 295,168BatchNorm2d-94 [-1, 256, 14, 14] 512ReLU-95 [-1, 256, 14, 14] 0Conv2d-96 [-1, 24, 14, 14] 12,312BatchNorm2d-97 [-1, 24, 14, 14] 48ReLU-98 [-1, 24, 14, 14] 0Conv2d-99 [-1, 64, 14, 14] 38,464BatchNorm2d-100 [-1, 64, 14, 14] 128ReLU-101 [-1, 64, 14, 14] 0MaxPool2d-102 [-1, 512, 14, 14] 0Conv2d-103 [-1, 64, 14, 14] 32,832BatchNorm2d-104 [-1, 64, 14, 14] 128ReLU-105 [-1, 64, 14, 14] 0inception_block-106 [-1, 512, 14, 14] 0Conv2d-107 [-1, 112, 14, 14] 57,456BatchNorm2d-108 [-1, 112, 14, 14] 224ReLU-109 [-1, 112, 14, 14] 0Conv2d-110 [-1, 144, 14, 14] 73,872BatchNorm2d-111 [-1, 144, 
14, 14] 288ReLU-112 [-1, 144, 14, 14] 0Conv2d-113 [-1, 288, 14, 14] 373,536BatchNorm2d-114 [-1, 288, 14, 14] 576ReLU-115 [-1, 288, 14, 14] 0Conv2d-116 [-1, 32, 14, 14] 16,416BatchNorm2d-117 [-1, 32, 14, 14] 64ReLU-118 [-1, 32, 14, 14] 0Conv2d-119 [-1, 64, 14, 14] 51,264BatchNorm2d-120 [-1, 64, 14, 14] 128ReLU-121 [-1, 64, 14, 14] 0MaxPool2d-122 [-1, 512, 14, 14] 0Conv2d-123 [-1, 64, 14, 14] 32,832BatchNorm2d-124 [-1, 64, 14, 14] 128ReLU-125 [-1, 64, 14, 14] 0inception_block-126 [-1, 528, 14, 14] 0Conv2d-127 [-1, 256, 14, 14] 135,424BatchNorm2d-128 [-1, 256, 14, 14] 512ReLU-129 [-1, 256, 14, 14] 0Conv2d-130 [-1, 160, 14, 14] 84,640BatchNorm2d-131 [-1, 160, 14, 14] 320ReLU-132 [-1, 160, 14, 14] 0Conv2d-133 [-1, 320, 14, 14] 461,120BatchNorm2d-134 [-1, 320, 14, 14] 640ReLU-135 [-1, 320, 14, 14] 0Conv2d-136 [-1, 32, 14, 14] 16,928BatchNorm2d-137 [-1, 32, 14, 14] 64ReLU-138 [-1, 32, 14, 14] 0Conv2d-139 [-1, 128, 14, 14] 102,528BatchNorm2d-140 [-1, 128, 14, 14] 256ReLU-141 [-1, 128, 14, 14] 0MaxPool2d-142 [-1, 528, 14, 14] 0Conv2d-143 [-1, 128, 14, 14] 67,712BatchNorm2d-144 [-1, 128, 14, 14] 256ReLU-145 [-1, 128, 14, 14] 0inception_block-146 [-1, 832, 14, 14] 0MaxPool2d-147 [-1, 832, 7, 7] 0Conv2d-148 [-1, 256, 7, 7] 213,248BatchNorm2d-149 [-1, 256, 7, 7] 512ReLU-150 [-1, 256, 7, 7] 0Conv2d-151 [-1, 160, 7, 7] 133,280BatchNorm2d-152 [-1, 160, 7, 7] 320ReLU-153 [-1, 160, 7, 7] 0Conv2d-154 [-1, 320, 7, 7] 461,120BatchNorm2d-155 [-1, 320, 7, 7] 640ReLU-156 [-1, 320, 7, 7] 0Conv2d-157 [-1, 32, 7, 7] 26,656BatchNorm2d-158 [-1, 32, 7, 7] 64ReLU-159 [-1, 32, 7, 7] 0Conv2d-160 [-1, 128, 7, 7] 102,528BatchNorm2d-161 [-1, 128, 7, 7] 256ReLU-162 [-1, 128, 7, 7] 0MaxPool2d-163 [-1, 832, 7, 7] 0Conv2d-164 [-1, 128, 7, 7] 106,624BatchNorm2d-165 [-1, 128, 7, 7] 256ReLU-166 [-1, 128, 7, 7] 0inception_block-167 [-1, 832, 7, 7] 0Conv2d-168 [-1, 384, 7, 7] 319,872BatchNorm2d-169 [-1, 384, 7, 7] 768ReLU-170 [-1, 384, 7, 7] 0Conv2d-171 [-1, 192, 7, 7] 159,936BatchNorm2d-172 [-1, 192, 7, 7] 384ReLU-173 [-1, 192, 7, 7] 0Conv2d-174 [-1, 384, 7, 7] 663,936BatchNorm2d-175 [-1, 384, 7, 7] 768ReLU-176 [-1, 384, 7, 7] 0Conv2d-177 [-1, 48, 7, 7] 39,984BatchNorm2d-178 [-1, 48, 7, 7] 96ReLU-179 [-1, 48, 7, 7] 0Conv2d-180 [-1, 128, 7, 7] 153,728BatchNorm2d-181 [-1, 128, 7, 7] 256ReLU-182 [-1, 128, 7, 7] 0MaxPool2d-183 [-1, 832, 7, 7] 0Conv2d-184 [-1, 128, 7, 7] 106,624BatchNorm2d-185 [-1, 128, 7, 7] 256ReLU-186 [-1, 128, 7, 7] 0inception_block-187 [-1, 1024, 7, 7] 0AvgPool2d-188 [-1, 1024, 1, 1] 0Dropout-189 [-1, 1024, 1, 1] 0Linear-190 [-1, 1024] 1,049,600ReLU-191 [-1, 1024] 0Linear-192 [-1, 4] 4,100Softmax-193 [-1, 4] 0Total params: 7,041,172
Trainable params: 7,041,172
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.57
Forward/backward pass size (MB): 69.61
Params size (MB): 26.86
Estimated Total Size (MB): 97.05
----------------------------------------------------------------
InceptionV1((conv1): Conv2d(3, 64, kernel_size(7, 7), stride(2, 2), padding(3, 3))(maxpool1): MaxPool2d(kernel_size3, stride2, padding1, dilation1, ceil_modeFalse)(conv2): Conv2d(64, 64, kernel_size(1, 1), stride(1, 1))(conv3): Conv2d(64, 192, kernel_size(3, 3), stride(1, 1), padding(1, 1))(maxpool2): MaxPool2d(kernel_size3, stride2, padding1, dilation1, ceil_modeFalse)(inception3a): inception_block((branch1): Sequential((0): Conv2d(192, 64, kernel_size(1, 1), stride(1, 1))(1): BatchNorm2d(64, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(2): ReLU(inplaceTrue))(branch2): Sequential((0): Conv2d(192, 96, kernel_size(1, 1), stride(1, 1))(1): BatchNorm2d(96, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(2): ReLU(inplaceTrue)(3): Conv2d(96, 128, kernel_size(3, 3), stride(1, 1), padding(1, 1))(4): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(5): ReLU(inplaceTrue))(branch3): Sequential((0): Conv2d(192, 16, kernel_size(1, 1), stride(1, 1))(1): BatchNorm2d(16, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(2): ReLU(inplaceTrue)(3): Conv2d(16, 32, kernel_size(5, 5), stride(1, 1), padding(2, 2))(4): BatchNorm2d(32, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(5): ReLU(inplaceTrue))(branch4): Sequential((0): MaxPool2d(kernel_size3, stride1, padding1, dilation1, ceil_modeFalse)(1): Conv2d(192, 32, kernel_size(1, 1), stride(1, 1))(2): BatchNorm2d(32, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(3): ReLU(inplaceTrue)))(inception3b): inception_block((branch1): Sequential((0): Conv2d(256, 128, kernel_size(1, 1), stride(1, 1))(1): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(2): ReLU(inplaceTrue))(branch2): Sequential((0): Conv2d(256, 128, kernel_size(1, 1), stride(1, 1))(1): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(2): ReLU(inplaceTrue)(3): Conv2d(128, 192, kernel_size(3, 3), stride(1, 1), padding(1, 1))(4): BatchNorm2d(192, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(5): ReLU(inplaceTrue))(branch3): Sequential((0): Conv2d(256, 32, kernel_size(1, 1), stride(1, 1))(1): BatchNorm2d(32, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(2): ReLU(inplaceTrue)(3): Conv2d(32, 96, kernel_size(5, 5), stride(1, 1), padding(2, 2))(4): BatchNorm2d(96, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(5): ReLU(inplaceTrue))(branch4): Sequential((0): MaxPool2d(kernel_size3, stride1, padding1, dilation1, ceil_modeFalse)(1): Conv2d(256, 64, kernel_size(1, 1), stride(1, 1))(2): BatchNorm2d(64, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(3): ReLU(inplaceTrue)))(maxpool3): MaxPool2d(kernel_size3, stride2, padding1, dilation1, ceil_modeFalse)(inception4a): inception_block((branch1): Sequential((0): Conv2d(480, 192, kernel_size(1, 1), stride(1, 1))(1): BatchNorm2d(192, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(2): ReLU(inplaceTrue))(branch2): Sequential((0): Conv2d(480, 96, kernel_size(1, 1), stride(1, 1))(1): BatchNorm2d(96, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(2): ReLU(inplaceTrue)(3): Conv2d(96, 208, kernel_size(3, 3), stride(1, 1), padding(1, 1))(4): BatchNorm2d(208, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(5): ReLU(inplaceTrue))(branch3): Sequential((0): Conv2d(480, 16, kernel_size(1, 1), stride(1, 1))(1): BatchNorm2d(16, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(2): ReLU(inplaceTrue)(3): Conv2d(16, 48, kernel_size(5, 5), 
stride(1, 1), padding(2, 2))(4): BatchNorm2d(48, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(5): ReLU(inplaceTrue))(branch4): Sequential((0): MaxPool2d(kernel_size3, stride1, padding1, dilation1, ceil_modeFalse)(1): Conv2d(480, 64, kernel_size(1, 1), stride(1, 1))(2): BatchNorm2d(64, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(3): ReLU(inplaceTrue)))(inception4b): inception_block((branch1): Sequential((0): Conv2d(512, 160, kernel_size(1, 1), stride(1, 1))(1): BatchNorm2d(160, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(2): ReLU(inplaceTrue))(branch2): Sequential((0): Conv2d(512, 112, kernel_size(1, 1), stride(1, 1))(1): BatchNorm2d(112, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(2): ReLU(inplaceTrue)(3): Conv2d(112, 224, kernel_size(3, 3), stride(1, 1), padding(1, 1))(4): BatchNorm2d(224, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(5): ReLU(inplaceTrue))(branch3): Sequential((0): Conv2d(512, 24, kernel_size(1, 1), stride(1, 1))(1): BatchNorm2d(24, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(2): ReLU(inplaceTrue)(3): Conv2d(24, 64, kernel_size(5, 5), stride(1, 1), padding(2, 2))(4): BatchNorm2d(64, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(5): ReLU(inplaceTrue))(branch4): Sequential((0): MaxPool2d(kernel_size3, stride1, padding1, dilation1, ceil_modeFalse)(1): Conv2d(512, 64, kernel_size(1, 1), stride(1, 1))(2): BatchNorm2d(64, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(3): ReLU(inplaceTrue)))(inception4c): inception_block((branch1): Sequential((0): Conv2d(512, 128, kernel_size(1, 1), stride(1, 1))(1): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(2): ReLU(inplaceTrue))(branch2): Sequential((0): Conv2d(512, 128, kernel_size(1, 1), stride(1, 1))(1): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(2): ReLU(inplaceTrue)(3): Conv2d(128, 256, kernel_size(3, 3), stride(1, 1), padding(1, 1))(4): BatchNorm2d(256, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(5): ReLU(inplaceTrue))(branch3): Sequential((0): Conv2d(512, 24, kernel_size(1, 1), stride(1, 1))(1): BatchNorm2d(24, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(2): ReLU(inplaceTrue)(3): Conv2d(24, 64, kernel_size(5, 5), stride(1, 1), padding(2, 2))(4): BatchNorm2d(64, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(5): ReLU(inplaceTrue))(branch4): Sequential((0): MaxPool2d(kernel_size3, stride1, padding1, dilation1, ceil_modeFalse)(1): Conv2d(512, 64, kernel_size(1, 1), stride(1, 1))(2): BatchNorm2d(64, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(3): ReLU(inplaceTrue)))(inception4d): inception_block((branch1): Sequential((0): Conv2d(512, 112, kernel_size(1, 1), stride(1, 1))(1): BatchNorm2d(112, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(2): ReLU(inplaceTrue))(branch2): Sequential((0): Conv2d(512, 144, kernel_size(1, 1), stride(1, 1))(1): BatchNorm2d(144, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(2): ReLU(inplaceTrue)(3): Conv2d(144, 288, kernel_size(3, 3), stride(1, 1), padding(1, 1))(4): BatchNorm2d(288, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(5): ReLU(inplaceTrue))(branch3): Sequential((0): Conv2d(512, 32, kernel_size(1, 1), stride(1, 1))(1): BatchNorm2d(32, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(2): ReLU(inplaceTrue)(3): Conv2d(32, 64, kernel_size(5, 5), stride(1, 1), padding(2, 2))(4): BatchNorm2d(64, eps1e-05, 
momentum0.1, affineTrue, track_running_statsTrue)(5): ReLU(inplaceTrue))(branch4): Sequential((0): MaxPool2d(kernel_size3, stride1, padding1, dilation1, ceil_modeFalse)(1): Conv2d(512, 64, kernel_size(1, 1), stride(1, 1))(2): BatchNorm2d(64, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(3): ReLU(inplaceTrue)))(inception4e): inception_block((branch1): Sequential((0): Conv2d(528, 256, kernel_size(1, 1), stride(1, 1))(1): BatchNorm2d(256, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(2): ReLU(inplaceTrue))(branch2): Sequential((0): Conv2d(528, 160, kernel_size(1, 1), stride(1, 1))(1): BatchNorm2d(160, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(2): ReLU(inplaceTrue)(3): Conv2d(160, 320, kernel_size(3, 3), stride(1, 1), padding(1, 1))(4): BatchNorm2d(320, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(5): ReLU(inplaceTrue))(branch3): Sequential((0): Conv2d(528, 32, kernel_size(1, 1), stride(1, 1))(1): BatchNorm2d(32, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(2): ReLU(inplaceTrue)(3): Conv2d(32, 128, kernel_size(5, 5), stride(1, 1), padding(2, 2))(4): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(5): ReLU(inplaceTrue))(branch4): Sequential((0): MaxPool2d(kernel_size3, stride1, padding1, dilation1, ceil_modeFalse)(1): Conv2d(528, 128, kernel_size(1, 1), stride(1, 1))(2): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(3): ReLU(inplaceTrue)))(maxpool4): MaxPool2d(kernel_size3, stride2, padding1, dilation1, ceil_modeFalse)(inception5a): inception_block((branch1): Sequential((0): Conv2d(832, 256, kernel_size(1, 1), stride(1, 1))(1): BatchNorm2d(256, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(2): ReLU(inplaceTrue))(branch2): Sequential((0): Conv2d(832, 160, kernel_size(1, 1), stride(1, 1))(1): BatchNorm2d(160, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(2): ReLU(inplaceTrue)(3): Conv2d(160, 320, kernel_size(3, 3), stride(1, 1), padding(1, 1))(4): BatchNorm2d(320, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(5): ReLU(inplaceTrue))(branch3): Sequential((0): Conv2d(832, 32, kernel_size(1, 1), stride(1, 1))(1): BatchNorm2d(32, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(2): ReLU(inplaceTrue)(3): Conv2d(32, 128, kernel_size(5, 5), stride(1, 1), padding(2, 2))(4): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(5): ReLU(inplaceTrue))(branch4): Sequential((0): MaxPool2d(kernel_size3, stride1, padding1, dilation1, ceil_modeFalse)(1): Conv2d(832, 128, kernel_size(1, 1), stride(1, 1))(2): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(3): ReLU(inplaceTrue)))(inception5b): Sequential((0): inception_block((branch1): Sequential((0): Conv2d(832, 384, kernel_size(1, 1), stride(1, 1))(1): BatchNorm2d(384, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(2): ReLU(inplaceTrue))(branch2): Sequential((0): Conv2d(832, 192, kernel_size(1, 1), stride(1, 1))(1): BatchNorm2d(192, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(2): ReLU(inplaceTrue)(3): Conv2d(192, 384, kernel_size(3, 3), stride(1, 1), padding(1, 1))(4): BatchNorm2d(384, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(5): ReLU(inplaceTrue))(branch3): Sequential((0): Conv2d(832, 48, kernel_size(1, 1), stride(1, 1))(1): BatchNorm2d(48, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(2): ReLU(inplaceTrue)(3): Conv2d(48, 128, kernel_size(5, 5), stride(1, 1), 
padding(2, 2))(4): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(5): ReLU(inplaceTrue))(branch4): Sequential((0): MaxPool2d(kernel_size3, stride1, padding1, dilation1, ceil_modeFalse)(1): Conv2d(832, 128, kernel_size(1, 1), stride(1, 1))(2): BatchNorm2d(128, eps1e-05, momentum0.1, affineTrue, track_running_statsTrue)(3): ReLU(inplaceTrue)))(1): AvgPool2d(kernel_size7, stride1, padding0)(2): Dropout(p0.4, inplaceFalse))(classifier): Sequential((0): Linear(in_features1024, out_features1024, biasTrue)(1): ReLU()(2): Linear(in_features1024, out_features4, biasTrue)(3): Softmax(dim1))
)
2.4 Training the model

2.4.1 Write the training function
# Training loop
def train(dataloader, model, loss_fn, optimizer):
    size = len(dataloader.dataset)   # size of the training set
    num_batches = len(dataloader)    # number of batches, i.e. ceil(size / batch_size)
    train_loss, train_acc = 0, 0     # initialize training loss and accuracy

    for X, y in dataloader:          # fetch a batch of images and labels
        X, y = X.to(device), y.to(device)

        # compute the prediction error
        pred = model(X)              # network output
        loss = loss_fn(pred, y)      # difference between the prediction pred and the ground truth y, i.e. the loss

        # backpropagation
        optimizer.zero_grad()        # reset the gradients
        loss.backward()              # backpropagate
        optimizer.step()             # update the parameters

        # record accuracy and loss
        train_acc += (pred.argmax(1) == y).type(torch.float).sum().item()
        train_loss += loss.item()

    train_acc /= size
    train_loss /= num_batches
    return train_acc, train_loss

2.4.2 Write the test function
def test(dataloader, model, loss_fn):
    size = len(dataloader.dataset)   # size of the test set
    num_batches = len(dataloader)    # number of batches, i.e. ceil(size / batch_size)
    test_loss, test_acc = 0, 0       # initialize test loss and accuracy

    # gradients are not needed during evaluation, which saves memory
    for imgs, target in dataloader:  # fetch a batch of images and labels
        with torch.no_grad():
            imgs, target = imgs.to(device), target.to(device)

            # compute the error
            target_pred = model(imgs)              # network output
            loss = loss_fn(target_pred, target)    # difference between the prediction and the ground truth, i.e. the loss

            # record accuracy and loss
            test_loss += loss.item()
            test_acc += (target_pred.argmax(1) == target).type(torch.float).sum().item()

    test_acc /= size
    test_loss /= num_batches
    return test_acc, test_loss

2.4.3 Run the training
import copy

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()   # create the loss function

epochs = 40

train_loss = []
train_acc = []
test_loss = []
test_acc = []

best_acc = 0   # best test accuracy so far, used to pick the best model

if hasattr(torch.cuda, 'empty_cache'):
    torch.cuda.empty_cache()

for epoch in range(epochs):
    model.train()
    epoch_train_acc, epoch_train_loss = train(train_dl, model, loss_fn, optimizer)
    # scheduler.step()   # update the learning rate (only when using an official LR scheduler)

    model.eval()
    epoch_test_acc, epoch_test_loss = test(test_dl, model, loss_fn)

    # keep a copy of the best model in best_model
    if epoch_test_acc > best_acc:
        best_acc = epoch_test_acc
        best_model = copy.deepcopy(model)

    train_acc.append(epoch_train_acc)
    train_loss.append(epoch_train_loss)
    test_acc.append(epoch_test_acc)
    test_loss.append(epoch_test_loss)

    # read the current learning rate
    lr = optimizer.state_dict()['param_groups'][0]['lr']

    template = ('Epoch: {:2d}. Train_acc: {:.1f}%, Train_loss: {:.3f}, Test_acc: {:.1f}%, Test_loss: {:.3f}, Lr: {:.2E}')
    print(template.format(epoch + 1, epoch_train_acc * 100, epoch_train_loss,
                          epoch_test_acc * 100, epoch_test_loss, lr))

PATH = './J7_best_model.pth'
torch.save(model.state_dict(), PATH)   # saves the final model's weights; best_model holds the highest-accuracy copy
print('Done')

The output is shown below.

2.5 Visualizing the results
import matplotlib.pyplot as plt
# hide warnings
import warnings
warnings.filterwarnings("ignore")              # ignore warning messages
plt.rcParams['font.sans-serif'] = ['SimHei']   # display Chinese labels correctly
plt.rcParams['axes.unicode_minus'] = False     # display the minus sign correctly
plt.rcParams['figure.dpi'] = 100               # figure resolution

epochs_range = range(epochs)

plt.figure(figsize=(12, 3))

plt.subplot(1, 2, 1)
plt.plot(epochs_range, train_acc, label='Training Accuracy')
plt.plot(epochs_range, test_acc, label='Test Accuracy')
plt.legend(loc='lower right')
plt.title('Training and Validation Accuracy')

plt.subplot(1, 2, 2)
plt.plot(epochs_range, train_loss, label='Training Loss')
plt.plot(epochs_range, test_loss, label='Test Loss')
plt.legend(loc='upper right')
plt.title('Training and Validation Loss')
plt.show()

The output is shown below.
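Before moving on to the summary, here is a small optional sketch (not in the original post) of how the weights saved above could be reloaded for inference; the path comes from the training cell:

# reload the saved weights and switch to evaluation mode
loaded_model = InceptionV1(num_classes=4).to(device)
loaded_model.load_state_dict(torch.load('./J7_best_model.pth', map_location=device))
loaded_model.eval()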
3 Summary

Most popular CNNs improve performance by stacking more and more convolution layers (making the network deeper) while widening the channels (making the network wider), hoping to extract higher-level features. But naive stacking and widening has side effects, including exploding or vanishing gradients and a sharp growth in parameters and computation that makes training difficult. Inception was proposed to alleviate this.

Inception uses multiple parallel branches with different kernel sizes to extract features corresponding to receptive fields of different sizes. This structure turns a single path into several paths that can be computed in parallel. Kernels of different sizes capture features over different ranges around each position, so the network can "see" that position at several scales at once; the subsequent concatenation fuses the features from these different receptive fields into a combined representation. This way of reading an image is closer to how humans interpret it.

At the same time, to reduce the parameter count, the branches use 1x1 convolutions to shrink the channel dimension before the larger convolutions extract features and raise the channel count again. It looks like extra work, yet it cuts the parameter count substantially, and it also quietly increases the network depth, allowing higher-level features to be extracted. The dimensionality reduction is like compressing a large matrix into a small one, keeping its "essence" and discarding redundant information; the later expansion restores the original size so that the features from the different branches can be fused.