Training and Predicting with LSTM Networks in PyTorch (With Full Source Code)

LSTM Background

LSTM core concepts, internal structure, and the backpropagation derivation are covered in depth by many existing resources. A high-level understanding of how LSTMs store and propagate information is enough to work through this implementation.

PyTorch Environment Setup

When configuring PyTorch in a conda environment on Windows, a common dependency error may occur:

OSError: [WinError 126] The specified module could not be found. Error loading "...\\torch\\lib\\c10_cuda.dll" or one of its dependencies

This error is often resolved by creating a clean conda environment with Python 3.9 and reinstalling PyTorch there. The official PyTorch installation documentation gives step-by-step setup instructions.
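
After reinstalling, a quick check can confirm that the import succeeds. The following is a minimal sketch, not part of the original project: the helper name `describe_torch_install` is hypothetical, and it only reports whether PyTorch loads and whether CUDA is visible.

```python
import importlib.util


def describe_torch_install():
    """Report whether PyTorch is importable and whether CUDA is usable."""
    # find_spec returns None when the package is not installed at all
    if importlib.util.find_spec("torch") is None:
        return {"installed": False, "version": None, "cuda": False}
    try:
        # A broken DLL dependency (such as the WinError 126 case) surfaces here
        import torch
    except (ImportError, OSError) as exc:
        return {"installed": False, "version": None, "cuda": False, "error": str(exc)}
    return {
        "installed": True,
        "version": torch.__version__,
        "cuda": bool(torch.cuda.is_available()),
    }


info = describe_torch_install()
print(info)
```

If `installed` is False and an `error` entry is present, the package was found but failed to load, which is the situation described above.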

LSTM Implementation

This is a simplified end-to-end example of LSTM training and prediction designed for beginners. All code includes detailed comments to aid readability, and the full project (including raw data, preprocessing scripts, and training code) is available in a public GitHub repository.

LSTM Network Definition

The following code defines a reusable LSTM model that leverages PyTorch's built-in LSTM implementation, so we do not need to code LSTM cell logic from scratch:

import torch
import torch.nn as nn

class TimeSeriesLSTM(nn.Module):
    def __init__(self, input_dim, hidden_dim, output_dim, num_stacked_layers=1):
        super().__init__()
        self.hidden_dim = hidden_dim
        self.num_layers = num_stacked_layers

        # Initialize LSTM layer: batch_first=True sets input shape to (batch_size, sequence_length, feature_count)
        self.lstm_layer = nn.LSTM(
            input_dim, 
            hidden_dim, 
            num_stacked_layers, 
            batch_first=True
        )

        # Fully connected layer maps LSTM output to final prediction
        self.output_layer = nn.Linear(hidden_dim, output_dim)

    def forward(self, input_sequence):
        # Initialize hidden and cell states to zero, matching the input's device and dtype
        batch_size = input_sequence.size(0)
        hidden_init = input_sequence.new_zeros(self.num_layers, batch_size, self.hidden_dim)
        cell_init = input_sequence.new_zeros(self.num_layers, batch_size, self.hidden_dim)

        # Forward pass through LSTM layers
        lstm_output, _ = self.lstm_layer(input_sequence, (hidden_init, cell_init))

        # Extract output from the last time step and generate final prediction
        final_prediction = self.output_layer(lstm_output[:, -1, :])
        return final_prediction
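
To make the tensor shapes in `forward` concrete, here is a small sketch that calls `nn.LSTM` directly. The sizes (batch of 4, 10 time steps, 3 features, 8 hidden units, 2 layers) are arbitrary values chosen for illustration, not values from the project. It also checks two facts the model relies on: omitting the initial states is equivalent to passing zeros, and for a unidirectional LSTM the last time step of the output equals the final hidden state of the top layer.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Illustrative sizes only
batch_size, seq_len, input_dim, hidden_dim, num_layers = 4, 10, 3, 8, 2

lstm = nn.LSTM(input_dim, hidden_dim, num_layers, batch_first=True)
head = nn.Linear(hidden_dim, 1)

x = torch.randn(batch_size, seq_len, input_dim)

# Initial states default to zeros when they are not provided
output, (h_n, c_n) = lstm(x)

# Passing explicit zero states gives the same result
zeros = torch.zeros(num_layers, batch_size, hidden_dim)
output_explicit, _ = lstm(x, (zeros, zeros))
same_as_explicit = torch.allclose(output, output_explicit)

# The last time step feeds the linear head
last_step = output[:, -1, :]
prediction = head(last_step)
last_equals_hidden = torch.allclose(last_step, h_n[-1])

print(output.shape)      # (batch_size, seq_len, hidden_dim)
print(h_n.shape)         # (num_layers, batch_size, hidden_dim)
print(prediction.shape)  # (batch_size, 1)
```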

Data Loading Module

This module loads time series data from a local CSV file, preprocesses features, and splits data into training and testing loaders:

import torch
from torch.utils.data import Dataset, DataLoader, random_split
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

class NormalizedTimeSeriesDataset(Dataset):
    def __init__(self, raw_data, sequence_length, target_column_index=-1):
        self.seq_len = sequence_length
        self.target_idx = target_column_index
        # Normalize all features to 0-1 range for stable training
        self.scaler = MinMaxScaler()
        self.normalized_data = self.scaler.fit_transform(raw_data)
        # Generate input-output sequence pairs
        self.input_seqs, self.target_vals = self._build_sequences()

    def _build_sequences(self):
        # Convert once up front; building a tensor from a list of NumPy arrays is slow
        data = torch.as_tensor(self.normalized_data, dtype=torch.float32)
        X, y = [], []
        for i in range(len(data) - self.seq_len):
            # Input: seq_len consecutive rows; target: the target column of the next row
            X.append(data[i:i + self.seq_len, :])
            y.append(data[i + self.seq_len, self.target_idx])
        # Shapes: (num_sequences, seq_len, features) and (num_sequences, 1)
        return torch.stack(X), torch.stack(y).unsqueeze(1)

    def __len__(self):
        return len(self.input_seqs)

    def __getitem__(self, idx):
        return self.input_seqs[idx], self.target_vals[idx]
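
The windowing done in `_build_sequences` can be easier to follow on a tiny example. The sketch below repeats the same indexing with plain Python lists. The function name `build_windows` and the five sample rows are made up for illustration.

```python
def build_windows(rows, seq_len, target_idx=-1):
    """Pair each run of seq_len rows with the target column of the next row."""
    inputs, targets = [], []
    for i in range(len(rows) - seq_len):
        inputs.append(rows[i:i + seq_len])
        targets.append(rows[i + seq_len][target_idx])
    return inputs, targets


# Five rows with two features each; the last column is the target
rows = [[0.0, 0.1], [0.2, 0.3], [0.4, 0.5], [0.6, 0.7], [0.8, 0.9]]
inputs, targets = build_windows(rows, seq_len=3)

print(len(inputs))  # 2
print(inputs[0])    # [[0.0, 0.1], [0.2, 0.3], [0.4, 0.5]]
print(targets)      # [0.7, 0.9]
```

With N rows and a window of length L, this yields N - L pairs, which is why the dataset length is shorter than the raw data.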

def get_train_test_loaders(csv_file_path, sequence_length, batch_size, train_split_ratio=0.8):
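    # Load raw data from CSV (every column must be numeric; the last column is the default target)
    raw_df = pd.read_csv(csv_file_path)
    dataset = NormalizedTimeSeriesDataset(raw_df.values, sequence_length)

    # Randomly split the windows into training and test sets (the split is not chronological)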
    train_size = int(train_split_ratio * len(dataset))
    test_size = len(dataset) - train_size
    train_dataset, test_dataset = random_split(dataset, [train_size, test_size])

    # Create batched dataloaders
    train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
    test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False)

    return train_loader, test_loader, dataset.scaler
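
Because `random_split` assigns windows to the two sets at random, test windows can overlap in time with training windows. For forecasting, a chronological split is a common alternative. The sketch below is one possible variant, not the original project's method: the function name `chronological_loaders` is hypothetical, and a `TensorDataset` of dummy values stands in for `NormalizedTimeSeriesDataset`.

```python
import torch
from torch.utils.data import DataLoader, Subset, TensorDataset


def chronological_loaders(dataset, batch_size, train_split_ratio=0.8):
    """Use the earliest windows for training and the latest ones for testing."""
    train_size = int(train_split_ratio * len(dataset))
    train_ds = Subset(dataset, list(range(train_size)))
    test_ds = Subset(dataset, list(range(train_size, len(dataset))))
    # Shuffling within the training set is fine; the test set keeps its time order
    train_loader = DataLoader(train_ds, batch_size=batch_size, shuffle=True)
    test_loader = DataLoader(test_ds, batch_size=batch_size, shuffle=False)
    return train_loader, test_loader


# Stand-in data: 10 windows whose targets are simply 0..9
windows = torch.arange(10, dtype=torch.float32).reshape(10, 1, 1)
targets = torch.arange(10, dtype=torch.float32).reshape(10, 1)
train_loader, test_loader = chronological_loaders(
    TensorDataset(windows, targets), batch_size=4
)

test_targets = torch.cat([y for _, y in test_loader]).flatten().tolist()
print(test_targets)  # [8.0, 9.0]
```

This sketch changes only the split. The scaler in the code above is still fitted on all rows, so a stricter setup would also fit it on the training portion alone.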

Tags: pytorch LSTM Deep Learning python Neural Networks

Posted on Tue, 19 May 2026 13:34:03 +0000 by neuro4848

Freaks City