Deep Learning from Scratch, Chapter 4 - Neural Network Training


Implementing the Learning Algorithm

- Implementing a two-layer neural network class

The class is named TwoLayerNet.

import sys, os
sys.path.append(os.pardir)  # make the parent directory importable (common/ lives there)
import numpy as np
from common.functions import *
from common.gradient import numerical_gradient

class TwoLayerNet:
    def __init__(self, input_size, hidden_size, output_size, weight_init_std=0.01):
        # Weights start as small Gaussian random values, biases as zeros
        self.params = {}
        self.params['W1'] = weight_init_std * np.random.randn(input_size, hidden_size)
        self.params['b1'] = np.zeros(hidden_size)
        self.params['W2'] = weight_init_std * np.random.randn(hidden_size, output_size)
        self.params['b2'] = np.zeros(output_size)

    # Forward pass: affine -> sigmoid -> affine -> softmax
    def predict(self, x):
        W1, W2 = self.params['W1'], self.params['W2']
        b1, b2 = self.params['b1'], self.params['b2']
        a1 = np.dot(x, W1) + b1
        z1 = sigmoid(a1)
        a2 = np.dot(z1, W2) + b2
        y = softmax(a2)
        return y

    # x: input data, t: correct labels
    def loss(self, x, t):
        y = self.predict(x)
        return cross_entropy_error(y, t)

    # Fraction of samples whose predicted class matches the label (t must be one-hot)
    def accuracy(self, x, t):
        y = self.predict(x)
        y = np.argmax(y, axis=1)
        t = np.argmax(t, axis=1)
        accuracy = np.sum(y == t) / float(x.shape[0])
        return accuracy

    # x: input data, t: correct labels
    def numerical_gradient(self, x, t):
        # W is a dummy argument; the loss reads the current values in self.params
        loss_W = lambda W: self.loss(x, t)
        grads = {}
        grads['W1'] = numerical_gradient(loss_W, self.params['W1'])
        grads['b1'] = numerical_gradient(loss_W, self.params['b1'])
        grads['W2'] = numerical_gradient(loss_W, self.params['W2'])
        grads['b2'] = numerical_gradient(loss_W, self.params['b2'])
        return grads
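
The class relies on a numerical_gradient helper imported from common.gradient, whose body is not shown in this post. As a rough idea of what such a helper can look like, here is a minimal sketch that assumes a central-difference approximation. The name numerical_gradient_sketch and the step size h are illustrative choices, not taken from the post.

```python
import numpy as np


def numerical_gradient_sketch(f, x, h=1e-4):
    """Central-difference gradient of f at x (hypothetical helper).

    x must be a float array; it is perturbed in place and restored entry by entry.
    """
    grad = np.zeros_like(x)
    it = np.nditer(x, flags=["multi_index"])
    while not it.finished:
        idx = it.multi_index
        original = x[idx]

        x[idx] = original + h
        f_plus = f(x)   # f with this one entry nudged up

        x[idx] = original - h
        f_minus = f(x)  # f with this one entry nudged down

        grad[idx] = (f_plus - f_minus) / (2 * h)
        x[idx] = original  # restore the entry before moving on
        it.iternext()
    return grad


# Usage: for f(w) = sum(w**2) the exact gradient is 2*w.
w = np.array([[1.0, -2.0], [0.5, 3.0]])
g = numerical_gradient_sketch(lambda v: np.sum(v ** 2), w)
```

Because the array is modified in place, a function that ignores its argument and reads the same array through another reference (as loss_W does with self.params) still sees each perturbation.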

- Implementing mini-batch training

Mini-batch training means drawing a random subset of the training data (a mini-batch) and updating the parameters by gradient descent on that subset. We train TwoLayerNet this way.
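
Before the full script, the sampling step in isolation may be easier to follow on toy arrays. This is only a sketch: the small x_train and t_train arrays below are made-up stand-ins for the MNIST data, and batch_size is chosen arbitrarily.

```python
import numpy as np

# Toy stand-ins for the real data: 10 samples, 3 features, 2 classes (one-hot labels).
x_train = np.arange(30, dtype=float).reshape(10, 3)
t_train = np.eye(2)[np.arange(10) % 2]

batch_size = 4
train_size = x_train.shape[0]

# Pick batch_size row indices at random (sampling with replacement by default).
batch_mask = np.random.choice(train_size, batch_size)

# Index inputs and labels with the same mask so each row keeps its own label.
x_batch = x_train[batch_mask]
t_batch = t_train[batch_mask]
```

Using one index array for both x_train and t_train is what keeps each sampled input paired with its label.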

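import sys, os
sys.path.append(os.pardir)  # make the parent directory importable (dataset/ lives there)
import numpy as np
from dataset.mnist import load_mnist
from two_layer_net import TwoLayerNet

# Load MNIST with normalized pixel values and one-hot labels
(x_train, t_train), (x_test, t_test) = load_mnist(normalize=True, one_hot_label=True)
network = TwoLayerNet(input_size=784, hidden_size=50, output_size=10)

# Hyperparameters
iters_num = 10000  # number of parameter updates
train_size = x_train.shape[0]
batch_size = 100
learning_rate = 0.1
train_loss_list = []
train_acc_list = []
test_acc_list = []
iter_per_epoch = max(train_size // batch_size, 1)  # iterations per epoch

for i in range(iters_num):
    # Draw a mini-batch
    batch_mask = np.random.choice(train_size, batch_size)
    x_batch = x_train[batch_mask]
    t_batch = t_train[batch_mask]

    # Compute the gradient. gradient() is not defined in the TwoLayerNet listing above;
    # with that listing, use the (much slower) numerical_gradient() line instead.
    #grad = network.numerical_gradient(x_batch, t_batch)
    grad = network.gradient(x_batch, t_batch)

    # Update the parameters by gradient descent
    for key in ('W1', 'b1', 'W2', 'b2'):
        network.params[key] -= learning_rate * grad[key]

    # Record the loss on the current mini-batch
    loss = network.loss(x_batch, t_batch)
    train_loss_list.append(loss)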

As the number of training iterations increases, the loss values recorded by the code above decrease. This means the network's weight parameters are gradually adapting to the data; in other words, the network is learning.

By training on the data repeatedly, the weight parameters gradually approach their optimal values.
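
The training loop calls network.gradient, which the TwoLayerNet listing in this post does not define. One common way to provide it is backpropagation, which is far faster than numerical differentiation. The following is a minimal sketch, written as a standalone function rather than a method. It assumes a sigmoid hidden layer, a softmax output, one-hot labels, and a cross-entropy loss averaged over the batch. The name backprop_gradient and the local sigmoid and softmax definitions are illustrative assumptions, not the post's code.

```python
import numpy as np


def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))


def softmax(a):
    a = a - np.max(a, axis=1, keepdims=True)  # shift each row for numerical stability
    e = np.exp(a)
    return e / np.sum(e, axis=1, keepdims=True)


def backprop_gradient(params, x, t):
    """Gradients of the batch-averaged cross-entropy loss (hypothetical helper).

    params: dict with 'W1', 'b1', 'W2', 'b2'
    x: inputs, shape (batch, input_size); t: one-hot labels, shape (batch, output_size)
    """
    W1, b1, W2, b2 = params["W1"], params["b1"], params["W2"], params["b2"]
    batch_num = x.shape[0]

    # Forward pass, same computation as predict()
    a1 = x @ W1 + b1
    z1 = sigmoid(a1)
    a2 = z1 @ W2 + b2
    y = softmax(a2)

    # Backward pass
    dy = (y - t) / batch_num        # gradient w.r.t. a2 for softmax + cross-entropy
    grads = {}
    grads["W2"] = z1.T @ dy
    grads["b2"] = np.sum(dy, axis=0)
    dz1 = dy @ W2.T
    da1 = dz1 * z1 * (1.0 - z1)     # derivative of sigmoid is z1 * (1 - z1)
    grads["W1"] = x.T @ da1
    grads["b1"] = np.sum(da1, axis=0)
    return grads
```

A reasonable sanity check for a function like this is to compare its output with a numerical gradient on a small random network; the two should agree to several decimal places.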

- Evaluating with test data

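The goal of neural network training is to acquire the ability to generalize, so we need to check that the network is not overfitting to the training data. The code below performs this evaluation by recording the accuracy on both the training data and the test data once per epoch.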

import sys, os
sys.path.append(os.pardir)  # make the parent directory importable (dataset/ lives there)
import numpy as np
from dataset.mnist import load_mnist
from two_layer_net import TwoLayerNet

# Load MNIST with normalized pixel values and one-hot labels
(x_train, t_train), (x_test, t_test) = load_mnist(normalize=True, one_hot_label=True)
network = TwoLayerNet(input_size=784, hidden_size=50, output_size=10)

# Hyperparameters
iters_num = 10000  # number of parameter updates
train_size = x_train.shape[0]
batch_size = 100
learning_rate = 0.1
train_loss_list = []
train_acc_list = []
test_acc_list = []
# Integer division keeps the epoch check below an exact integer comparison
iter_per_epoch = max(train_size // batch_size, 1)

for i in range(iters_num):
    # Draw a mini-batch
    batch_mask = np.random.choice(train_size, batch_size)
    x_batch = x_train[batch_mask]
    t_batch = t_train[batch_mask]

    # Compute the gradient. gradient() is not defined in the TwoLayerNet listing above;
    # with that listing, use the (much slower) numerical_gradient() line instead.
    #grad = network.numerical_gradient(x_batch, t_batch)
    grad = network.gradient(x_batch, t_batch)

    # Update the parameters by gradient descent
    for key in ('W1', 'b1', 'W2', 'b2'):
        network.params[key] -= learning_rate * grad[key]

    # Record the loss on the current mini-batch
    loss = network.loss(x_batch, t_batch)
    train_loss_list.append(loss)

    # Once per epoch, measure accuracy on the full training and test sets
    if i % iter_per_epoch == 0:
        train_acc = network.accuracy(x_train, t_train)
        test_acc = network.accuracy(x_test, t_test)
        train_acc_list.append(train_acc)
        test_acc_list.append(test_acc)
        print("train acc, test acc | " + str(train_acc) + ", " + str(test_acc))
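
One simple way to read the two accuracy lists is to look at the per-epoch gap between training and test accuracy: a gap that keeps widening suggests overfitting. The helper below is a minimal sketch of that idea. The name accuracy_gaps is hypothetical, and the example numbers are made up purely to show the call; they are not results from the training run.

```python
def accuracy_gaps(train_acc_list, test_acc_list):
    """Per-epoch gap: training accuracy minus test accuracy (hypothetical helper)."""
    return [train - test for train, test in zip(train_acc_list, test_acc_list)]


# Made-up illustrative values, NOT measured results.
example_train_acc = [0.10, 0.80, 0.95]
example_test_acc = [0.10, 0.78, 0.85]

gaps = accuracy_gaps(example_train_acc, example_test_acc)
for epoch, gap in enumerate(gaps):
    print("epoch", epoch, "train - test gap:", round(gap, 3))
```

In the script above, the same function could be called as accuracy_gaps(train_acc_list, test_acc_list) after the loop finishes.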
