[PyTorch] Introduction to PyTorch > (1) Preview

Everything about PyTorch can be found in the PyTorch Tutorials.

Links

(PyTorch docs) https://pytorch.org/docs/stable/index.html
(PyTorch Korea user group) https://tutorials.pytorch.kr/

1. Tensor

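A tensor is the same concept as NumPy's ndarray and is the most basic object for performing operations in PyTorch.
PyTorch stores values in tensors, provides functions that operate on those values, and can record the computation graph and gradients produced by operations between tensors.
In deep learning development, we should be able to convert freely between Tensor and ndarray.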

[Code review]

import torch   

x = torch.Tensor(2,2)   # allocate a 2x2 tensor (its values are uninitialized, not guaranteed to be zero)
print("1. torch.Tensor(2,2) ---- : {0}\n" .format(x))

x = torch.Tensor([[1,2], [3,4]])
print("2-1. torch.Tensor([[1,2], [3,4]])----{0}\n".format(x))
print("2-2. torch.Tensor   x[0][0] ----{0}\n" .format(x[0][0]))

import numpy as np 

x = np.array(x)
print("3.np.array(x) ---- : {0}\n" .format(x))

x = torch.from_numpy(x)
print("4.torch.from_numpy(x) ---- :{0}\n" .format(x))

[Result]

1. torch.Tensor(2,2) ---- : tensor([[0., 0.],[0., 0.]])

2-1. torch.Tensor([[1,2], [3,4]])----tensor([[1., 2.], [3., 4.]])

2-2. torch.Tensor   x[0][0] ----1.0

3.np.array(x) ---- : [[1. 2.] [3. 4.]]

4.torch.from_numpy(x) ---- :tensor([[1., 2.],[3., 4.]])
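
As a complement to the conversion shown above, the following is a minimal sketch of how memory is handled when moving between ndarray and Tensor. It assumes a float32 array; the variable names (a, t, back, c) are only illustrative. torch.from_numpy() and Tensor.numpy() share the underlying buffer, while torch.tensor() makes a copy.

```python
import numpy as np
import torch

a = np.array([[1.0, 2.0], [3.0, 4.0]], dtype=np.float32)

t = torch.from_numpy(a)   # shares memory with the ndarray
back = t.numpy()          # also a view onto the same buffer
c = torch.tensor(a)       # independent copy of the data

a[0, 0] = 10.0            # modify the original ndarray in place

print(t[0, 0].item())     # 10.0 (the shared tensor sees the change)
print(back[0, 0])         # 10.0 (so does the ndarray view)
print(c[0, 0].item())     # 1.0  (the copy does not)
```

If a later in-place update must not leak across the two libraries, copying with torch.tensor() (or ndarray.copy()) is the safer choice.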

2. Autograd

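PyTorch has an Autograd feature that performs differentiation and backpropagation automatically.
Thanks to this, we do not need to worry much about the matrix operations between tensors; we simply call the relevant functions.
Whenever operations are performed between tensors, PyTorch dynamically builds a computation graph and tracks which tensors and which operation produced each result.
When we ask for the final scalar to be differentiated with the backpropagation algorithm, each tensor automatically finds the tensors and operations it was computed from (its child nodes in the graph), so backpropagation can continue through the whole graph.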

[Code review 1]

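1. Add the tensors x and y, add a newly created tensor, and assign the result to z

2. Gradients can then be propagated back from z along the computation graph that has already been built

Note: torch.FloatTensor(2, 2) only allocates memory without initializing it, so the values printed below are arbitrary.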

import torch

x = torch.FloatTensor(2, 2)
y = torch.FloatTensor(2, 2)

print("1. torch.FloatTensor(2, 2):x = {0} \n y = {1} \n" .format(x , y))

y.requires_grad_(True)
print("2. y.requires_grad_(True)------>  {0} \n  " .format(y))

z = (x + y) + torch.FloatTensor(2, 2)
print("3. torch.FloatTensor(2, 2):\n  x = {0} \n y = {1} \n  z = {2}" .format(x , y , z))

[Result 1]

1. torch.FloatTensor(2, 2):
 x = tensor([[0., 0.], [0., 0.]])
 y = tensor([[0., 0.], [0., 0.]])

2. y.requires_grad_(True)------>  tensor([[0., 0.],
        [0., 0.]], requires_grad=True)

3. torch.FloatTensor(2, 2):
 x = tensor([[0., 0.],[0., 0.]])
 y = tensor([[0., 0.],[0., 0.]], requires_grad=True)
 z = tensor([[ 1.4013e-45,  0.0000e+00],
        [-6.2869e-11,  4.5916e-41]], grad_fn=<AddBackward0>)
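
The listing above builds the graph but never differentiates through it. As a small sketch with fixed values (chosen here only for illustration), the following calls backward() on a scalar and reads the gradient back. For z = sum(x * x), the gradient with respect to x is 2 * x.

```python
import torch

# A leaf tensor whose operations autograd should track
x = torch.tensor([[1.0, 2.0], [3.0, 4.0]], requires_grad=True)

# Scalar result: z = sum of x squared
z = (x * x).sum()

# Walk the recorded graph backwards and fill in x.grad
z.backward()

print(z.item())          # 30.0
print(x.grad.tolist())   # [[2.0, 4.0], [6.0, 8.0]]
```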

[Code review 2]

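1. When backpropagation is not needed, run the operations inside a with torch.no_grad(): block

2. The bookkeeping needed to compute gradients is skipped, which helps both computation speed and memory usage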

import torch

x = torch.FloatTensor(2, 2)
y = torch.FloatTensor(2, 2)

print("1. torch.FloatTensor(2, 2):x = {0} \n y = {1} \n" .format(x , y))


y.requires_grad_(True)
print("2. y.requires_grad_(True)------>  {0} \n  " .format(y))

with torch.no_grad():
    z = (x + y) + torch.FloatTensor(2, 2)
    print("3. torch.FloatTensor(2, 2):\n  x = {0} \n y = {1} \n  z = {2}" .format(x , y , z))

[Result 2]

1. torch.FloatTensor(2, 2):
  x = tensor([[0., 0.], [0., 0.]])
  y = tensor([[0., 0.],[0., 0.]])

2.y.requires_grad_(True)------>  tensor([[0., 0.],
        [0., 0.]], requires_grad=True)

3. torch.FloatTensor(2, 2):
 x = tensor([[0., 0.],[0., 0.]])
 y = tensor([[0., 0.],[0., 0.]], requires_grad=True)
 z = tensor([[2.3694e-38, 0.0000e+00],[0.0000e+00, 0.0000e+00]])
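
The difference between the two results is easiest to see by inspecting requires_grad and grad_fn directly. The sketch below uses torch.ones so the values are well defined; the names z_tracked and z_untracked are illustrative.

```python
import torch

y = torch.ones(2, 2, requires_grad=True)

# Normal mode: the multiplication is recorded in the graph
z_tracked = y * 2

# Inside no_grad: the same operation is not recorded
with torch.no_grad():
    z_untracked = y * 2

print(z_tracked.requires_grad, z_tracked.grad_fn is not None)   # True True
print(z_untracked.requires_grad, z_untracked.grad_fn)           # False None
```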

3. Feedforward

In a linear layer (also called a fully connected layer), given an M x N input matrix 'x',
we multiply it by an N x P matrix 'W'
and then add a P-dimensional bias vector 'b', which produces an M x P output.

[Code review]

import torch

def linear(x, W, b):
    y = torch.mm(x, W) + b  # multiply x by W, then add the bias b

    return y

x = torch.FloatTensor(8, 5) # 8 x 5 
W = torch.FloatTensor(5, 2) # 5 x 2 
b = torch.FloatTensor(2)    # 1 X 2
print("1.Feed-forward  :\n  x = {0} \n w = {1} \n  b = {2}" .format(x , W , b))

#Feed - forward
y = linear(x, W, b)         # returns an 8 x 2 result; y is a <class 'torch.Tensor'>
print("2.Feed-forward  torch.mm(x, W) + b :\n  y = {0} " .format(y))

[Result]

1.Feed-forward  :
  x = tensor([[1.2514e-14, 8.9634e-33, 3.0566e+32, 1.8469e+25, 8.7126e-04],
        [1.8889e+31, 4.3973e+21, 7.3773e+28, 2.8936e+12, 7.5338e+28],      
        [1.8037e+28, 3.4740e-12, 1.7743e+28, 9.4701e-01, 1.7338e+25],      
        [3.4732e-12, 1.7743e+28, 9.4701e-01, 2.6907e+20, 1.8910e+23],      
        [7.1443e+31, 1.9046e+31, 1.1128e+27, 2.0700e-19, 3.0263e+29],      
        [1.7704e+31, 6.9767e+22, 1.8037e+28, 3.0313e+32, 7.7783e+31],      
        [4.5453e+30, 7.4392e+28, 5.4275e-14, 1.6109e-19, 1.7743e+28],      
        [5.3831e-14, 1.0194e-38, 4.1328e-39, 5.1429e-39, 1.0286e-38]])     
 w = tensor([[0., 0.],
        [0., 0.],
        [0., 0.],
        [0., 0.],
        [0., 0.]])
  b = tensor([0., 0.])
2.Feed-forward  torch.mm(x, W) + b :
  y = tensor([[0., 0.],
        [0., 0.],
        [0., 0.],
        [0., 0.],
        [0., 0.],
        [0., 0.],
        [0., 0.],
        [0., 0.]])
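
Because the tensors above are uninitialized, the output is hard to check by hand. Here is a small sketch of the same x * W + b computation with known values (the numbers are arbitrary choices for illustration), which makes the shapes and the broadcasting of b easy to verify.

```python
import torch

x = torch.ones(8, 5)              # M x N = 8 x 5
W = torch.full((5, 2), 0.5)       # N x P = 5 x 2
b = torch.tensor([1.0, -1.0])     # P = 2, broadcast over the 8 rows

y = torch.mm(x, W) + b            # M x P = 8 x 2

print(tuple(y.shape))             # (8, 2)
print(y[0].tolist())              # [3.5, 1.5]  (5 * 0.5 + 1 and 5 * 0.5 - 1)

# The @ operator performs the same matrix product
print(torch.equal(y, x @ W + b))  # True
```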

4. nn.Module

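PyTorch provides a class called nn.Module. By inheriting from it, users can implement whatever model structure they need.
The most important method, forward(), is overridden to implement the feedforward computation. Calling the module object itself, as in linear(x), runs forward() automatically. (If you analyze code without knowing this beforehand, seeing forward() run on its own while debugging can send you down an endless rabbit hole.)
Understanding the other features of nn.Module also lets you save and load all of a network's weight parameters at once.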

[Code review]

import torch
import torch.nn as nn

# Implement MyLinear by inheriting from nn.Module
class MyLinear(nn.Module):

    # Build the network
    def __init__(self, input_size, output_size):
        super().__init__()
        print("1.__init__(self, input_size, output_size): \n  input_size = {0} \n output_size = {1} \n" .format(input_size, output_size))
        # Build the W and b tensors of x * W + b
        self.W = torch.FloatTensor(input_size, output_size)
        self.b = torch.FloatTensor(output_size)
        print("2._torch.FloatTensor \n  self.W  = {0} \n self.b = {1} \n" .format(self.W , self.b ))

    # Run the feedforward pass
    # The return value is a <class 'torch.Tensor'>
    def forward(self, x):
        y = torch.mm(x, self.W) + self.b   # multiply x by W, then add the bias b
        return y

##############################
# Declare the input tensor x
x = torch.FloatTensor(8, 5)  # the x in x * W + b

# Create the network
#  - input size, output size
linear = MyLinear(5, 2)   # the W and b in x * W + b

# Pass the input tensor x to MyLinear so that it computes x * W + b
y = linear(x)
print("3.y = linear(x) \n  linear  = {0} \n y = {1} \n " .format(linear  , y))

[Result]

1.__init__(self, input_size, output_size): 
  input_size = 5
  output_size = 2

2._torch.FloatTensor
  self.W  = tensor([[0., 0.],
        [0., 0.],
        [0., 0.],
        [0., 0.],
        [0., 0.]])
  self.b = tensor([0., 0.])

3.y = linear(x)
 linear  = MyLinear()
 y = tensor([[0., 0.],
        [0., 0.],
        [0., 0.],
        [0., 0.],
        [0., 0.],
        [0., 0.],
        [0., 0.],
        [0., 0.]])
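
The reason forward() runs without being called by name is that nn.Module defines __call__, which dispatches to forward(). The following sketch makes that visible with a counter; the Doubler class and its calls attribute are hypothetical names used only for this illustration.

```python
import torch
import torch.nn as nn

class Doubler(nn.Module):
    """Hypothetical module that counts how often forward() runs."""

    def __init__(self):
        super().__init__()
        self.calls = 0

    def forward(self, x):
        self.calls += 1
        return x * 2

m = Doubler()
out = m(torch.ones(3))   # calling the object triggers forward()

print(m.calls)           # 1
print(out.tolist())      # [2.0, 2.0, 2.0]
```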

(1) parameters() method: an iterator that returns the trainable parameters declared in the module

params = [p.size() for p in linear.parameters()]
print(params)

[Result]: linear reports no trainable parameters; a network's trainable tensors must be registered as parameters

[]

[Solution] Wrap the tensors in the nn.Parameter class

        # Build the W and b tensors of x * W + b
        self.W = nn.Parameter(torch.FloatTensor(input_size, output_size), requires_grad=True)
        self.b = nn.Parameter(torch.FloatTensor(output_size), requires_grad=True)

params = [p.size() for p in linear.parameters()]
print("parameters()  -->" , params) 

[Result]

parameters() --> [torch.Size([5, 2]), torch.Size([2])]
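
Once the tensors are registered with nn.Parameter, they also show up in state_dict(), which is what makes saving and loading all weights at once possible. The sketch below is one way to do that round trip. It assumes an in-memory io.BytesIO buffer instead of a file path, and it initializes W with torch.randn rather than uninitialized memory; both are choices made for this example, not part of the original listing.

```python
import io

import torch
import torch.nn as nn

class MyLinear(nn.Module):
    def __init__(self, input_size, output_size):
        super().__init__()
        # nn.Parameter registers the tensors as trainable parameters
        self.W = nn.Parameter(torch.randn(input_size, output_size))
        self.b = nn.Parameter(torch.zeros(output_size))

    def forward(self, x):
        return torch.mm(x, self.W) + self.b

torch.manual_seed(0)
src = MyLinear(5, 2)
print([name for name, _ in src.named_parameters()])   # ['W', 'b']

# Save every registered parameter in one call
buffer = io.BytesIO()
torch.save(src.state_dict(), buffer)

# Load them into a freshly created model of the same shape
buffer.seek(0)
dst = MyLinear(5, 2)
dst.load_state_dict(torch.load(buffer))

print(torch.equal(src.W, dst.W), torch.equal(src.b, dst.b))   # True True
```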

5. Back Propagation

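Feedforward passes values forward through the operations we define.
Backpropagation goes the other way: it takes the difference between the value obtained from the feedforward pass and the actual target value, and passes that error (loss) backward, so it needs to be understood as well.
The error value here must be a scalar; it cannot be a vector or a matrix.
By repeatedly applying gradient descent with the gradient obtained for each parameter, we can keep reducing the error.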

[Code review]

objective = 100  

x = torch.FloatTensor(8, 5)
linear = MyLinear(5, 2)
y = linear(x)
print("1. y = linear(x)\n  x = {0} \n y = {1} \n " .format(x   , y ))

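loss = (objective - y.sum())**2  # squared difference from the target
print("2. y = linear(x) \n loss = {0} \n  objective - y = {1} \n " .format(loss    , objective - y))

z = loss.backward()   # backpropagate the loss; backward() returns None and stores the gradients in each parameter's .grad
print("3. loss.backward() \n z = {0} \n  " .format(z))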

[Result]

1. y = linear(x) 
  x = tensor([[0.0000e+00, 0.0000e+00, 2.1019e-44, 0.0000e+00, 0.0000e+00],
        [0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00],
        [0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00],
        [0.0000e+00, 0.0000e+00, 0.0000e+00, 1.4013e-45, 0.0000e+00],
        [0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 1.0965e-38],
        [0.0000e+00, 1.2729e+26, 4.5916e-41, 0.0000e+00, 0.0000e+00],
        [2.1019e-44, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00],
        [0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00, 0.0000e+00]])
  y = tensor([[0.0000e+00, 0.0000e+00],
        [0.0000e+00, 0.0000e+00],
        [0.0000e+00, 0.0000e+00],
        [0.0000e+00, 0.0000e+00],
        [0.0000e+00, 0.0000e+00],
        [1.2742e-12, 1.1339e-12],
        [0.0000e+00, 0.0000e+00],
        [0.0000e+00, 0.0000e+00]], grad_fn=<AddBackward0>)

2. y = linear(x) 
  loss = 10000.0
  objective - y = tensor([[100., 100.],
        [100., 100.],
        [100., 100.],
        [100., 100.],
        [100., 100.],
        [100., 100.],
        [100., 100.],
        [100., 100.]], grad_fn=<RsubBackward1>)

3. loss.backward() 
  z = None
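
The listing above stops after a single backward() call. As a rough sketch of the repeated gradient descent step described earlier, the loop below updates W and b by hand. The learning rate (1e-4), the step count (50), and the all-ones input are assumptions picked so that the example converges; they do not come from the original post.

```python
import torch

x = torch.ones(8, 5)
W = torch.zeros(5, 2, requires_grad=True)
b = torch.zeros(2, requires_grad=True)

objective = 100.0
lr = 1e-4          # assumed learning rate for this toy problem
losses = []

for step in range(50):
    y = torch.mm(x, W) + b
    loss = (objective - y.sum()) ** 2    # scalar loss
    loss.backward()                      # fills W.grad and b.grad

    with torch.no_grad():                # the update itself is not tracked
        W -= lr * W.grad
        b -= lr * b.grad
        W.grad.zero_()                   # gradients accumulate, so reset them
        b.grad.zero_()

    losses.append(loss.item())

print(losses[0])                 # 10000.0
print(losses[-1] < losses[0])    # True
```

In practice an optimizer from torch.optim usually replaces the manual update, but writing it out shows where the gradients from backward() are actually consumed.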

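6. Training and Evaluation

With the train() and eval() methods that PyTorch provides, we can very easily switch a model between the modes required for training and for inference.
An object created from a class that inherits from nn.Module is in training mode by default. Use eval() to switch it to evaluation mode,
and once inference is finished, call train() to return to the original training mode.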

[Code review]

# Switch to evaluation mode for inference, then back to training mode
linear.eval()  
linear.train()    
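
The mode only matters for layers that behave differently during training and inference. MyLinear has no such layer, so as an illustration this sketch uses nn.Dropout (a standard PyTorch layer that does not appear in the original post) to show what eval() and train() actually change.

```python
import torch
import torch.nn as nn

drop = nn.Dropout(p=0.5)
x = torch.ones(1000)

print(drop.training)                    # True (modules start in training mode)

drop.eval()                             # evaluation mode: dropout is a no-op
with torch.no_grad():                   # inference needs no gradient tracking
    eval_out = drop(x)
print(drop.training)                    # False
print(torch.equal(eval_out, x))         # True

drop.train()                            # back to training mode
train_out = drop(x)
# Each element is either dropped (0.0) or scaled by 1 / (1 - p) = 2.0
print(set(train_out.tolist()) <= {0.0, 2.0})   # True
```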

8. GPU

Ordinary PyTorch work is possible on a CPU-only build with no GPU installed.
However, for problems that call for a GPU, the 'cuda()' function copies or moves the desired tensor to GPU memory.

[Code review]

x = torch.cuda.FloatTensor(16, 10)
linear = MyLinear(10, 5)
# .cuda() let module move to GPU memory.
linear.cuda()
y = linear(x)
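
The listing above fails on a machine without CUDA. A common alternative, sketched here, is to pick the device at runtime and move both the module and the input with .to(device). The built-in nn.Linear stands in for MyLinear so that the example is self-contained; that substitution is an assumption of this sketch.

```python
import torch
import torch.nn as nn

# Fall back to the CPU when no CUDA device is available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.ones(16, 10, device=device)    # create the input on the chosen device
layer = nn.Linear(10, 5).to(device)      # move the module's parameters as well

y = layer(x)                             # input and parameters share a device

print(tuple(y.shape))                    # (16, 5)
print(y.device.type == device.type)      # True
```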

ref

- Kim Ki-hyun, Natural Language Processing Deep Learning Camp (PyTorch edition) (김기현의 자연어 처리 딥러닝 캠프, 파이토치 편)
