A First Try at PyTorch: MNIST Handwritten Digit Recognition

A simple machine learning starter... it feels like everyone except me has already written one ( ´•灬•`)

Preparation

Let’s Start

Import the libraries we need:

import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
import torchvision
import torchvision.transforms as transforms

Initialize and Load the Training & Test Sets

The test loader is just the training loader with train=False and no shuffling... it's a beginner exercise, after all.

transform = transforms.Compose([transforms.ToTensor()])

# training set
trainset = torchvision.datasets.MNIST(root='./data', train=True,
                                      download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=4,
                                          shuffle=True, num_workers=2)

# test set
testset = torchvision.datasets.MNIST(root='./data', train=False,
                                     download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=4,
                                         shuffle=False, num_workers=2)

Define the Neural Network

A simple convolutional neural network:

  • Layer 1:
    • conv1 convolution [channels: 1->6, spatial size: 28×28->24×24]
    • pool1 pooling 【channels: 6->6, spatial size: 24×24->12×12]
  • Layer 2:
    • conv2 convolution [channels: 6->16, spatial size: 12×12->8×8]
    • pool2 pooling [channels: 16->16, spatial size: 8×8->4×4]

So the number of input features to fc1 is 16×4×4.
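As a quick sanity check (my own sketch, not part of the original walkthrough), you can verify these shapes by pushing a dummy 28×28 input through the same convolution and pooling layers:

```python
import torch
import torch.nn as nn

x = torch.zeros(1, 1, 28, 28)  # dummy batch: 1 image, 1 channel, 28x28

conv1 = nn.Conv2d(1, 6, 5)     # 5x5 kernel, no padding: 28 - 5 + 1 = 24
pool = nn.MaxPool2d(2, 2)      # halves the spatial size
conv2 = nn.Conv2d(6, 16, 5)    # 12 - 5 + 1 = 8

x = pool(conv1(x))             # -> (1, 6, 12, 12)
x = pool(conv2(x))             # -> (1, 16, 4, 4)
print(x.shape)                 # torch.Size([1, 16, 4, 4])
```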

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5)
        self.pool1 = nn.MaxPool2d(kernel_size=2, stride=2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.pool2 = nn.MaxPool2d(2, 2)
        self.fc1 = nn.Linear(in_features=16 * 4 * 4, out_features=120)
        # the input feature count here must be computed (16 channels × 4 × 4)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)
        self.dp = nn.Dropout(0.5)

    def forward(self, x):
        x = self.pool1(F.relu(self.conv1(x)))
        x = self.pool2(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 4 * 4)
        x = F.relu(self.fc1(x))
        x = self.dp(x)
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

net = Net()

Define the Loss Function and Optimizer

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
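Note that nn.CrossEntropyLoss combines LogSoftmax and NLLLoss internally, which is why forward above returns the raw fc3 scores without a softmax. A toy illustration (values are arbitrary):

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()
logits = torch.tensor([[2.0, 0.5, -1.0]])  # raw scores for 3 classes, no softmax
label = torch.tensor([0])                  # correct class index
loss = criterion(logits, label)
print(loss.item())  # small positive value: class 0 already scores highest
```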

Train the Network

Print one line of loss every 1000 mini-batches; you can watch the loss shrink steadily.

epoch = 1  # epoch counter (the printout shows epoch + 1)
running_loss = 0.0
for i, data in enumerate(trainloader, 0):
    imgs, labels = data

    optimizer.zero_grad()
    outputs = net(imgs)
    loss = criterion(outputs, labels)
    loss.backward()
    optimizer.step()

    running_loss += loss.item()
    if i % 1000 == 999:  # print every 1000 mini-batches
        print('[%d, %5d] loss: %.3f' %
              (epoch + 1, i + 1, running_loss / 1000))
        running_loss = 0.0
[2,  1000] loss: 2.286
[2,  2000] loss: 1.367
[2,  3000] loss: 0.604
[2,  4000] loss: 0.412
[2,  5000] loss: 0.320
[2,  6000] loss: 0.271
[2,  7000] loss: 0.270
[2,  8000] loss: 0.241
[2,  9000] loss: 0.211
[2, 10000] loss: 0.200
[2, 11000] loss: 0.178
[2, 12000] loss: 0.183
[2, 13000] loss: 0.167
[2, 14000] loss: 0.141
[2, 15000] loss: 0.158

Evaluate the Accuracy

Now we have a trained net model~

Let's test this little network's accuracy. The idea: if the predicted digit matches the label in the dataset, the prediction is correct, and the == comparison contributes 1. The test set has 2500 batches of 4 images each, so we divide by the total of 10000 images. (Since the network uses dropout, switch it to eval mode before testing.)

net.eval()  # disable dropout for evaluation
r = 0
with torch.no_grad():  # no gradients needed for inference
    for i, data in enumerate(testloader, 0):
        imgs, labels = data
        outputs = net(imgs)
        r += (torch.max(outputs, 1)[1] == labels).sum()
print(int(r) / 10000)
0.9706
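torch.max(outputs, 1)[1] takes the index of the highest score in each row, i.e. the predicted digit (equivalent to outputs.argmax(dim=1)). A toy illustration with made-up scores:

```python
import torch

outputs = torch.tensor([[0.1, 2.0, -0.3],
                        [1.5, 0.2,  0.9]])  # scores for 2 samples, 3 classes
labels = torch.tensor([1, 2])

preds = torch.max(outputs, 1)[1]            # same as outputs.argmax(dim=1)
print(preds)                                # tensor([1, 0])
correct = (preds == labels).sum().item()
print(correct)                              # 1: first sample right, second wrong
```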

One more thing

If you want to take a peek at your dataset _(:з」∠)_

import matplotlib.pyplot as plt
import numpy as np
dataset = iter(testloader)
imgs, labels = next(dataset)  # dataset.next() no longer works in newer versions
img = imgs[1].numpy()
plt.imshow(img[0])

Or... if you want to view a whole batch:

images, labels = next(iter(trainloader))
img = torchvision.utils.make_grid(images)
img = img.numpy().transpose(1,2,0)
print([labels[i].item() for i in range(4)])
plt.imshow(img)

[0,9,1,2]

If you want to save the model:

torch.save(net, './mnist.sxy')  # save the model
net = torch.load('./mnist.sxy')  # load the model
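Saving the whole module pickles the class definition's import path along with it; the PyTorch docs generally recommend saving only the state_dict instead. A sketch with a stand-in model (nn.Linear here substitutes for Net, and the filename tmp_state.pt is an arbitrary example):

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)                            # stand-in for Net
torch.save(model.state_dict(), './tmp_state.pt')   # save parameters only

model2 = nn.Linear(4, 2)                           # must rebuild the same architecture
model2.load_state_dict(torch.load('./tmp_state.pt'))
```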

Reference

Cover image source: Bilibili - 3Blue1Brown - "But what is a neural network?" (Deep Learning, Part 1, ver 2.0)
PyTorch official site - beginner tutorials - Training a Classifier