Intermediate Feature Map Extraction and Visualization in Convolutional Neural Networks

Capturing intermediate layer outputs provides critical diagnostic visibility into representation quality during model training. This section outlines a systematic approach to intercepting and inspecting activation tensors using PyTorch's hook interface, applied to a symmetric encoder-decoder topology commonly used in signal reconstruction tasks.

Network Architecture Definition

The target model implements a multi-stage convolutional bottleneck followed by a transposed-convolution decoder. Each stage applies specific kernel sizes and strides to progressively compress and then restore spatial dimensions.
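The model below references ConvBlock and DeconvBlock helpers whose definitions are not shown in this post. A plausible minimal implementation, assuming the common Conv/BatchNorm/LeakyReLU composition (the exact internals are our assumption), might look like:

```python
import torch
import torch.nn as nn

class ConvBlock(nn.Module):
    """Conv2d -> BatchNorm2d -> LeakyReLU with 'same'-style padding at stride 1."""
    def __init__(self, in_ch, out_ch, kernel_size=3, stride=1):
        super().__init__()
        if isinstance(kernel_size, int):
            kernel_size = (kernel_size, kernel_size)
        # Half-kernel padding keeps spatial size unchanged when stride is 1
        padding = (kernel_size[0] // 2, kernel_size[1] // 2)
        self.layers = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size, stride, padding),
            nn.BatchNorm2d(out_ch),
            nn.LeakyReLU(0.2, inplace=True),
        )

    def forward(self, x):
        return self.layers(x)

class DeconvBlock(nn.Module):
    """ConvTranspose2d -> BatchNorm2d -> LeakyReLU for spatial upsampling."""
    def __init__(self, in_ch, out_ch, kernel_size=2, stride=1, padding=0):
        super().__init__()
        self.layers = nn.Sequential(
            nn.ConvTranspose2d(in_ch, out_ch, kernel_size, stride, padding),
            nn.BatchNorm2d(out_ch),
            nn.LeakyReLU(0.2, inplace=True),
        )

    def forward(self, x):
        return self.layers(x)
```

With these definitions, a kernel_size=4, stride=2, padding=1 DeconvBlock exactly doubles the spatial dimensions, which matches how the decoder below undoes each stride-2 encoder stage.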

import torch
import torch.nn as nn
import torch.nn.functional as F

class ReconstructionNet(nn.Module):
    def __init__(self, in_channels=1, base_channels=32, scale_factor=1.0):
        super().__init__()
        
        # Encoder stages
        self.enc = nn.Sequential(
            nn.Conv2d(in_channels, base_channels, kernel_size=(7, 1), stride=(2, 1), padding=(3, 0)),
            ConvBlock(base_channels, base_channels * 2, kernel_size=(3, 1), stride=(2, 1)),
            ConvBlock(base_channels * 2, base_channels * 2, kernel_size=(3, 1)),
            ConvBlock(base_channels * 2, base_channels * 4, stride=2),
            ConvBlock(base_channels * 4, base_channels * 8, stride=2),
            nn.Conv2d(base_channels * 8, base_channels * 16, kernel_size=(8, int(70 * scale_factor / 8)), padding=0)
        )
        
        # Decoder stages
        self.dec = nn.Sequential(
            DeconvBlock(base_channels * 16, base_channels * 8, kernel_size=5),
            ConvBlock(base_channels * 8, base_channels * 8),
            DeconvBlock(base_channels * 8, base_channels * 4, kernel_size=4, stride=2, padding=1),
            ConvBlock(base_channels * 4, base_channels * 4),
            DeconvBlock(base_channels * 4, base_channels * 2, kernel_size=4, stride=2, padding=1),
            ConvBlock(base_channels * 2, base_channels * 2),
            DeconvBlock(base_channels * 2, base_channels, kernel_size=4, stride=2, padding=1),
            ConvBlock(base_channels, base_channels),
            nn.Conv2d(base_channels, in_channels, kernel_size=1)
        )

    def forward(self, x):
        z = self.enc(x)
        out = self.dec(z)
        return out

Activation Capture Mechanism

PyTorch enables intervention at specific module executions via register_forward_hook. By attaching callbacks to designated layers, intermediate tensors can be intercepted immediately after computation without altering the backward pass.

activation_store = {}

def capture_layer_output(module, input_tensor, output_tensor):
    """Callback invoked post-forward-pass."""
    layer_id = f"{module.__class__.__name__}_{id(module)}"
    if isinstance(module, (ConvBlock, DeconvBlock)):
        # Detach from computation graph and move to CPU memory
        activation_store[layer_id] = output_tensor.detach().cpu()

Hook Registration Workflow

During initialization, iterate through child modules to attach listeners. Maintaining handles allows later deregistration if needed.

def initialize_hooks(model_instance):
    handles = []
    target_modules = [ConvBlock, DeconvBlock]
    
    for name, mod in model_instance.named_modules():
        if any(isinstance(mod, t) for t in target_modules):
            h = mod.register_forward_hook(capture_layer_output)
            handles.append((name, h))
            
    return handles
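The deregistration mentioned above needs only the stored handles. A minimal helper (the name remove_hooks is ours), demonstrated here on a stand-in nn.Linear rather than the full model:

```python
import torch
import torch.nn as nn

def remove_hooks(handles):
    """Deregister every stored forward hook; capture stops immediately."""
    for name, h in handles:
        h.remove()

# Demonstration: the callback fires once, then never again after removal.
captured = []
layer = nn.Linear(4, 4)
handle = layer.register_forward_hook(lambda mod, inp, out: captured.append(out.shape))
layer(torch.randn(1, 4))           # hook fires
remove_hooks([("layer", handle)])
layer(torch.randn(1, 4))           # hook no longer fires
```

Removing hooks after profiling avoids both the memory cost of the activation store and the per-forward overhead of the callbacks during normal training.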

Execution and Visualization Pipeline

Trigger the network with dummy or batched inputs. Post-forward, convert captured tensors to raster formats for inspection. The current implementation normalizes channels individually to [0, 1], scales to uint8, and renders as monochrome imagery.

import numpy as np
from PIL import Image
import os

def render_activations(store_dict, output_dir="feature_outputs"):
    os.makedirs(output_dir, exist_ok=True)
    
    for layer_name, tensor in store_dict.items():
        if tensor.dim() == 4:
            # Select first channel of first sample
            slice_data = tensor[0, 0].numpy()
            
            # Min-max normalize the selected slice to [0, 1]
            s_min, s_max = slice_data.min(), slice_data.max()
            norm_data = (slice_data - s_min) / (s_max - s_min + 1e-8)
            
            img_array = (norm_data * 255).astype(np.uint8)
            img = Image.fromarray(img_array, mode='L')
            
            safe_name = layer_name.replace('/', '_')
            img.save(os.path.join(output_dir, f"{safe_name}.png"))
            print(f"[Saved] {os.path.join(output_dir, safe_name + '.png')}")

# Example usage inside a training step:
# output = model(batch_input)  # call the module itself so hooks fire
# render_activations(activation_store)

Current Limitation: The renderer exports only single-channel data. Converting to three-channel RGB output requires explicit channel duplication or a perceptual colormap, neither of which is implemented in the current pipeline.
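Of the two options noted above, explicit channel duplication is the simpler one. A hedged sketch (the helper name to_rgb is ours) that stacks a normalized slice into three identical channels:

```python
import numpy as np
from PIL import Image

def to_rgb(norm_data):
    """Duplicate a [0, 1]-normalized single-channel map into three identical channels."""
    gray = (np.clip(norm_data, 0.0, 1.0) * 255).astype(np.uint8)
    rgb = np.stack([gray, gray, gray], axis=-1)  # (H, W) -> (H, W, 3)
    return Image.fromarray(rgb, mode='RGB')
```

This produces a grayscale-looking RGB image; a perceptual colormap would instead map intensity to color and may reveal structure that pure duplication hides.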

Empirical Observations

Benchmarking revealed high sensitivity to optimizer initialization steps. Adjusting the initial learning rate produced divergent convergence trajectories across evaluation metrics. Comparative analysis against the reference architecture (InversionNet) shows measurable gains on the smaller CurveFaultA subset (5,000 samples). However, scalability to larger distributed splits requires further validation. Loss landscape visualization indicates uneven metric trade-offs on constrained datasets, suggesting that regularization terms may need rebalancing before scaling operations.

Subsequent Objectives

  1. Migrate inference routines to dedicated GPU compute nodes to accelerate full-dataset profiling.
  2. Debug and implement robust RGB channel synthesis for multi-band activation visualization.
  3. Evaluate adaptive optimizers (e.g., LAMB, AdamW) to stabilize convergence on extended corpora.

Tags: PyTorch, Feature Maps, Hook Mechanisms, Model Diagnostics, Deep Learning, Optimization

Posted on Sun, 10 May 2026 07:23:25 +0000 by djs1