Why Move to Code
The previous discussion focused on the mathematical modeling behind neural networks. Theory is of limited use, however, until it is implemented. This article takes a code-first approach and translates the mathematical concepts into executable PyTorch scripts.
Implementation Strategy
Following a style similar to classical deep learning tutorials, we will start by visualizing the computation graph and then progressively build the corresponding code structure.
Framework Selection
We use the PyTorch framework for rapid prototyping. Its automatic differentiation lets us focus on the model logic rather than deriving gradients by hand.
Reference Material
Dive into Deep Learning (2.0)
Linear Regression Implementation
Model Definition
- Input: A numeric feature vector (5 features in the example below).
- Output: A single scalar value.
- Mapping Function: A weighted sum of inputs plus a bias term.
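In symbols, the mapping above can be written as follows. The symbol names are assumed here for illustration: x is the input vector, w the weight vector, b the bias, and d the number of input features.

```latex
\hat{y} = \mathbf{w}^\top \mathbf{x} + b = \sum_{i=1}^{d} w_i x_i + b
```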
Code Implementation
We define the model as a single fully connected layer, nn.Linear(5, 1), which takes 5 input features and produces 1 output. The input dimension of 5 is a choice made for this example and can be changed.
import torch.nn as nn
# Define a sequential container with a single linear layer
model = nn.Sequential(nn.Linear(5, 1))
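To connect this layer back to the mapping function, the following minimal sketch checks that the forward pass of nn.Linear matches a weighted sum plus bias computed by hand. The batch of 4 samples and the variable names are assumptions made for illustration.

```python
import torch
from torch import nn

torch.manual_seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical batch: 4 samples with 5 features each
X = torch.randn(4, 5)

layer = nn.Linear(5, 1)

# Manual weighted sum plus bias: X @ W^T + b
# layer.weight has shape (1, 5) and layer.bias has shape (1,)
manual = X @ layer.weight.T + layer.bias

# The layer's forward pass computes the same mapping
assert torch.allclose(manual, layer(X), atol=1e-6)
print(manual.shape)  # torch.Size([4, 1])
```

Each row of the result is the scalar prediction for one sample.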
Practical Application
First, we generate synthetic training data (features and labels) using a predefined set of weights and bias.
import torch
from torch.utils import data
from d2l import torch as d2l  # helper package that accompanies Dive into Deep Learning
# Ground truth parameters
target_weights = torch.tensor([2, -3.4, 5.6, 3.2, -0.6])
target_bias = 4.2
# Generate synthetic dataset
inputs, outputs = d2l.synthetic_data(target_weights, target_bias, 1000)
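For readers without the d2l package, the sketch below shows roughly what such a generator does: draw random features, apply the linear mapping, and add a little Gaussian noise. The helper name make_synthetic_data and the noise level of 0.01 are assumptions for illustration, not a copy of the library's code.

```python
import torch

def make_synthetic_data(w, b, num_examples, noise_std=0.01):
    """Generate y = Xw + b + Gaussian noise (hypothetical stand-in helper)."""
    # Features drawn from a standard normal distribution
    X = torch.normal(0.0, 1.0, (num_examples, len(w)))
    # Noise-free labels from the linear mapping
    y = X @ w + b
    # Small observation noise (noise_std is an assumed value)
    y += torch.normal(0.0, noise_std, y.shape)
    # Return labels as a column so they match the model's (N, 1) output
    return X, y.reshape(-1, 1)

torch.manual_seed(0)
w = torch.tensor([2, -3.4, 5.6, 3.2, -0.6])
X, y = make_synthetic_data(w, 4.2, 1000)
print(X.shape, y.shape)  # torch.Size([1000, 5]) torch.Size([1000, 1])
```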
Next, we wrap the data in a PyTorch DataLoader to handle batching and shuffling.
def create_data_loader(tensors, batch_sz, is_training=True):
    dataset = data.TensorDataset(*tensors)
    return data.DataLoader(dataset, batch_sz, shuffle=is_training)
batch_size = 10
train_iter = create_data_loader((inputs, outputs), batch_size)
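A quick way to see what the loader yields is to pull a single batch and inspect its shapes. The sketch below uses placeholder random tensors (an assumption, standing in for the synthetic dataset) so that it runs on its own.

```python
import torch
from torch.utils import data

torch.manual_seed(0)

# Placeholder tensors standing in for the features and labels
features = torch.randn(1000, 5)
labels = torch.randn(1000, 1)

dataset = data.TensorDataset(features, labels)
loader = data.DataLoader(dataset, batch_size=10, shuffle=True)

# Each iteration yields one (features, labels) mini-batch
X_batch, y_batch = next(iter(loader))
print(X_batch.shape, y_batch.shape)  # torch.Size([10, 5]) torch.Size([10, 1])

# 1000 samples in batches of 10 give 100 batches per epoch
print(len(loader))  # 100
```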
We instantiate the model and initialize the parameters.
from torch import nn
regression_model = nn.Sequential(nn.Linear(5, 1))
# Initialize weights with normal distribution and bias with zeros
regression_model[0].weight.data.normal_(0, 0.01)
regression_model[0].bias.data.fill_(0)
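The same initialization can also be written with the functions in torch.nn.init, which avoids touching .data directly. This is an alternative sketch, not the approach used above; the variable name layer is chosen for illustration.

```python
import torch
from torch import nn

torch.manual_seed(0)
layer = nn.Linear(5, 1)

# Weights drawn from a normal distribution with mean 0 and std 0.01
nn.init.normal_(layer.weight, mean=0.0, std=0.01)
# Bias set to zero
nn.init.zeros_(layer.bias)

print(layer.weight.shape)  # torch.Size([1, 5])
print(layer.bias)
```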
Define the loss function (Mean Squared Error) and the optimizer (Stochastic Gradient Descent).
loss_function = nn.MSELoss()
optimizer = torch.optim.SGD(regression_model.parameters(), lr=0.03)
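To make these two objects less opaque, the sketch below computes the mean squared error by hand and applies one SGD step to a single parameter. The toy values and the parameter w are assumptions chosen so the arithmetic is easy to follow.

```python
import torch
from torch import nn

# Toy predictions and targets, chosen by hand for illustration
pred = torch.tensor([[1.0], [2.0], [3.0]])
target = torch.tensor([[1.5], [2.0], [2.0]])

# MSE is the mean of the squared differences: (0.25 + 0 + 1) / 3
manual_mse = ((pred - target) ** 2).mean()
builtin_mse = nn.MSELoss()(pred, target)
assert torch.isclose(manual_mse, builtin_mse)

# One SGD step on a single hypothetical parameter w with loss w^2
w = torch.tensor([1.0], requires_grad=True)
sgd = torch.optim.SGD([w], lr=0.03)
loss = (w ** 2).sum()
sgd.zero_grad()
loss.backward()  # gradient is 2 * w = 2.0
sgd.step()       # w <- 1.0 - 0.03 * 2.0 = 0.94
print(w)
```

Plain SGD, with no momentum or weight decay, moves each parameter against its gradient, scaled by the learning rate.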
Execute the training loop over multiple epochs.
num_epochs = 3
for epoch in range(num_epochs):
    for X_batch, y_batch in train_iter:
        pred = regression_model(X_batch)
        loss = loss_function(pred, y_batch)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    current_loss = loss_function(regression_model(inputs), outputs)
    print(f'Epoch {epoch + 1}, Loss: {current_loss:f}')
Output:
Epoch 1, Loss: 0.000357
Epoch 2, Loss: 0.000110
Epoch 3, Loss: 0.000111
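One detail worth checking in a loop like this is that predictions and labels have the same shape. The model outputs a column of shape (N, 1). If the labels are left as a flat vector of shape (N,), the subtraction broadcasts to an (N, N) matrix and the loss is silently wrong. The sketch below illustrates this with hand-picked toy values. It assumes the labels in this article are already column-shaped, which is how d2l.synthetic_data is understood to return them.

```python
import torch

pred = torch.tensor([[1.0], [2.0], [3.0], [4.0]])  # shape (4, 1), like a model output
labels_flat = torch.tensor([1.0, 2.0, 3.0, 4.0])   # shape (4,), labels left flat

# (4, 1) minus (4,) broadcasts to (4, 4): every prediction against every label
broadcast_diff = pred - labels_flat
wrong_loss = (broadcast_diff ** 2).mean()

# Reshaping the labels to a column gives the intended element-wise comparison
labels_col = labels_flat.reshape(-1, 1)
right_loss = ((pred - labels_col) ** 2).mean()

print(broadcast_diff.shape)                  # torch.Size([4, 4])
print(wrong_loss.item(), right_loss.item())  # 2.5 0.0
```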
Finally, we evaluate the learned parameters against the true values to see the estimation error.
learned_weights = regression_model[0].weight.data
learned_bias = regression_model[0].bias.data
print('Weight estimation error:', target_weights - learned_weights.reshape(target_weights.shape))
print('Bias estimation error:', target_bias - learned_bias)
Results:
Weight estimation error: tensor([ 5.8055e-05, -7.1454e-04, -2.3937e-04, -1.0014e-04, -3.5894e-04])
Bias estimation error: tensor([0.0013])
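Once the parameters are close to the true values, the model can be used for prediction. The sketch below is a stand-in for the trained model: it sets the parameters by hand to the ground-truth values (an assumption, so that the block runs on its own) and predicts for one hypothetical input.

```python
import torch
from torch import nn

model = nn.Sequential(nn.Linear(5, 1))

# Stand-in for training: copy the ground-truth parameters into the layer
with torch.no_grad():
    model[0].weight.copy_(torch.tensor([[2, -3.4, 5.6, 3.2, -0.6]]))
    model[0].bias.fill_(4.2)

# Hypothetical new input: one sample with all five features equal to 1
x_new = torch.ones(1, 5)

# Gradients are not needed for inference
with torch.no_grad():
    y_new = model(x_new)

# Expected value: 2 - 3.4 + 5.6 + 3.2 - 0.6 + 4.2 = 11.0 (up to float rounding)
print(y_new)
```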