Enhancing Multi-Object Tracking Stability via Adaptive Kalman Filtering and OC-SORT

Conventional multi-object tracking pipelines, such as SORT, typically rely on linear motion hypotheses. While valid for high-frame-rate scenarios with minimal obstruction, this assumption degrades significantly during occlusions, low frame rates, or non-linear maneuvers. To address these limitations, an improved tracking system was developed using the BoxMot library, integrating YOLOv8 for detection and an enhanced OC-SORT algorithm for trajectory management.

System Architecture and Optimization Focus

The core tracking logic uses YOLOv8 to generate bounding box detections, which are then processed by the OC-SORT tracker. Two primary areas were targeted for optimization to improve robustness:

Data Association Refinement: The calculation of the cost matrix supplied to the Hungarian algorithm was refined. The algorithm matches predicted tracks to current detections by minimizing the total cost, so the quality of the matching depends on how well each cost entry reflects track-detection similarity. The refinements keep the costs representative of that similarity even when detection quality varies.
Adaptive State Estimation: Standard Kalman Filters assume constant noise parameters. However, real-world motion varies. An adaptive mechanism was implemented to dynamically adjust process and measurement noise covariances based on the observed prediction error, allowing the filter to respond to sudden changes in target velocity or direction.
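
As an illustration of what a similarity-based cost matrix can look like, the sketch below builds one from bounding-box IoU, with cost defined as 1 - IoU. The source does not state which similarity terms the refined matrix uses, so the IoU choice and the names iou and iou_cost_matrix are assumptions made for this example.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def iou_cost_matrix(track_boxes, detection_boxes):
    """Rows are tracks, columns are detections; lower cost means more similar."""
    return [[1.0 - iou(t, d) for d in detection_boxes] for t in track_boxes]


# Hypothetical predicted track boxes and detections for one frame.
tracks = [(0, 0, 10, 10), (20, 20, 30, 30)]
detections = [(21, 21, 31, 31), (1, 0, 11, 10), (50, 50, 60, 60)]
cost = iou_cost_matrix(tracks, detections)
```

Each row holds one track and each column one detection, which is the layout the Hungarian algorithm expects. A detection that overlaps no track, such as the third one here, gets a cost of 1.0 in every row.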

Hungarian Algorithm Implementation Details

The data association phase employs the Hungarian algorithm to solve the bipartite matching problem between predicted tracks and new detections. The implementation follows these logical steps:

Initialize the cost matrix and mark all rows and columns as uncovered.
Validate matrix dimensions to ensure the number of columns (detections) meets or exceeds rows (tracks).
Perform row reduction by subtracting the minimum value of each row from all elements in that row.
Identify zero elements and apply starring logic to find potential matches.
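Cover columns containing starred zeros. If the number of covered columns equals the number of rows (every track has a starred zero), the optimal assignment is found.
If not complete, find uncovered zeros and prime them. If a starred zero exists in the primed row, cover that row and uncover the starred zero's column. Otherwise, construct an alternating sequence of primed and starred zeros to augment the matching.
When no uncovered zeros remain, take the smallest uncovered value, add it to every covered row, and subtract it from every uncovered column to create new zeros.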
Repeat until a complete unique matching is established.
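
The sketch below solves the same assignment problem in plain Python. Note that it uses the potentials-based (shortest augmenting path) formulation of the Hungarian algorithm rather than the star/prime bookkeeping listed above. Both reach a minimum-cost matching, but the code is not a line-by-line rendering of those steps, and it is not the implementation used in the project. The function name solve_assignment and the sample matrix are hypothetical.

```python
import math


def solve_assignment(cost):
    """Return {row: col} minimising total cost; requires rows <= columns."""
    n, m = len(cost), len(cost[0])
    if n > m:
        raise ValueError("need at least as many columns (detections) as rows (tracks)")
    u = [0.0] * (n + 1)    # row potentials (play the role of row reductions)
    v = [0.0] * (m + 1)    # column potentials
    match = [0] * (m + 1)  # match[j] = 1-based row assigned to column j, 0 = free
    way = [0] * (m + 1)    # back-pointers along the alternating path
    for i in range(1, n + 1):
        match[0] = i
        j0 = 0
        minv = [math.inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0 = match[j0]
            delta, j1 = math.inf, 0
            for j in range(1, m + 1):
                if used[j]:
                    continue
                reduced = cost[i0 - 1][j - 1] - u[i0] - v[j]
                if reduced < minv[j]:
                    minv[j], way[j] = reduced, j0
                if minv[j] < delta:
                    delta, j1 = minv[j], j
            # Shift potentials so that at least one new zero reduced cost appears.
            for j in range(m + 1):
                if used[j]:
                    u[match[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if match[j0] == 0:  # reached a free column: augmenting path found
                break
        while j0 != 0:          # flip assignments along the alternating path
            j1 = way[j0]
            match[j0] = match[j1]
            j0 = j1
    return {match[j] - 1: j - 1 for j in range(1, m + 1) if match[j] != 0}


# Hypothetical 3x3 cost matrix (rows = tracks, columns = detections).
cost = [[4, 1, 3],
        [2, 0, 5],
        [3, 2, 2]]
assignment = solve_assignment(cost)
total_cost = sum(cost[r][c] for r, c in assignment.items())
```

The dimension check mirrors the validation step in the list: with more tracks than detections, the matrix would have to be transposed or padded before solving.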

OC-SORT Mechanisms

OC-SORT (Observation-Centric SORT) shifts the focus from estimation-centric to observation-centric tracking. This approach mitigates errors accumulated during occlusion periods where Kalman Filter predictions drift.

Observation-Centric Re-Update (ORU): When a track is reactivated after occlusion, ORU builds a virtual trajectory between the last observation before the track was lost and the newly associated observation, then replays the prediction-update cycle along it. This corrects the Kalman Filter parameters using observations rather than relying on stale predictions.
Observation-Centric Momentum (OCM): During association, OCM considers the consistency of motion direction based on raw observations rather than filtered states. This reduces noise in velocity calculations caused by imperfect filter estimates.
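
A minimal sketch of the two ideas follows, under stated assumptions. For ORU, the virtual trajectory is assumed to be a linear interpolation between the two real observations; in a tracker, each virtual box would then be fed through the filter's predict and update calls. For OCM, the heading of the link is measured from the track's last observation to the candidate detection, which is a simplification chosen for this example. The function names are hypothetical and do not come from BoxMot.

```python
import math


def virtual_trajectory(last_obs, new_obs, gap):
    """Interpolated boxes for the gap - 1 frames missed between two observations.

    last_obs and new_obs are (x1, y1, x2, y2); gap is the frame distance between them.
    """
    return [
        tuple(a + (b - a) * k / gap for a, b in zip(last_obs, new_obs))
        for k in range(1, gap)
    ]


def center(box):
    return ((box[0] + box[2]) / 2.0, (box[1] + box[3]) / 2.0)


def heading_angle_diff(obs_old, obs_last, detection):
    """Angle in radians between the track's observed heading and the heading
    implied by linking its last observation to a candidate detection."""
    (x0, y0), (x1, y1), (x2, y2) = center(obs_old), center(obs_last), center(detection)
    track_dir = (x1 - x0, y1 - y0)
    link_dir = (x2 - x1, y2 - y1)
    norm = math.hypot(*track_dir) * math.hypot(*link_dir)
    if norm == 0:
        return 0.0  # no usable direction; treat the candidate as neutral
    cos = (track_dir[0] * link_dir[0] + track_dir[1] * link_dir[1]) / norm
    return math.acos(max(-1.0, min(1.0, cos)))


# ORU: a track seen at frame t and re-associated at frame t + 4.
virtual = virtual_trajectory((0, 0, 10, 10), (40, 0, 50, 10), gap=4)

# OCM: a track moving right, with one candidate ahead and one off to the side.
straight = heading_angle_diff((0, 0, 10, 10), (10, 0, 20, 10), (20, 0, 30, 10))
sideways = heading_angle_diff((0, 0, 10, 10), (10, 0, 20, 10), (10, 10, 20, 20))
```

A smaller angle means the candidate is more consistent with the observed motion, so the angle can be turned into an extra term in the association cost. Both functions work only on raw observations, never on filtered states.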

Limitations of Standard Kalman Filters in Tracking

In standard SORT implementations, the Kalman Filter relies on its own previous estimates when observations are missing. In high-frame-rate videos, the displacement between consecutive frames is small relative to detection noise, so the estimated velocity becomes noisy. During occlusion, the linear motion assumption lets this error accumulate with every prediction step. When the target reappears or moves non-linearly, the filter's state estimate may be significantly off.

Extended Kalman Filters (EKF) and Unscented Kalman Filters (UKF) attempt to handle non-linearity via first-order Taylor series expansions and sigma points, respectively. However, they still rely on Gaussian distribution assumptions and specific motion models. The adaptive approach implemented here modifies the noise covariances directly within the standard filter structure to accommodate varying motion dynamics without changing the underlying model order.

Tracking Workflow

Initialization: Detections in the first frame initialize active trackers with position and velocity states.
Prediction: Active trackers predict current frame positions using the dynamic model.
Association: New detections are matched to predicted tracks using the optimized cost matrix and Hungarian algorithm.
Update: Matched tracks update their state using the measurement. Unmatched tracks are marked as lost.
Re-activation: Lost tracks are compared against new detections. If similarity exceeds a threshold, the track is re-activated using ORU logic.
Termination: Tracks remaining lost for a predefined duration are removed.
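
The bookkeeping part of this workflow (update, loss, re-activation, termination, and initialization of new tracks) can be sketched as below. This is a simplified illustration, not the BoxMot implementation: prediction, association, and the ORU step are left out, and Track, update_lifecycle, and max_missed are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Track:
    track_id: int
    box: tuple
    missed: int = 0  # consecutive frames without a matched detection


def update_lifecycle(tracks, matches, detections, max_missed, next_id):
    """Apply one frame of bookkeeping.

    matches maps track_id -> index into detections (output of the association step).
    Returns the surviving tracks and the next unused track id.
    """
    survivors = []
    for track in tracks:
        if track.track_id in matches:
            track.box = detections[matches[track.track_id]]
            track.missed = 0  # matched, or re-activated if it was lost
            survivors.append(track)
        else:
            track.missed += 1  # lost for one more frame
            if track.missed <= max_missed:
                survivors.append(track)  # kept as lost; otherwise terminated
    matched_detections = set(matches.values())
    for index, box in enumerate(detections):
        if index not in matched_detections:
            survivors.append(Track(next_id, box))  # initialize a new track
            next_id += 1
    return survivors, next_id


# Track 1 is matched, track 2 has already been lost for two frames.
tracks = [Track(1, (0, 0, 10, 10)), Track(2, (20, 20, 30, 30), missed=2)]
detections = [(1, 0, 11, 10), (50, 50, 60, 60)]
tracks, next_id = update_lifecycle(tracks, {1: 0}, detections, max_missed=2, next_id=3)
```

With max_missed=2, track 2 exceeds its allowance on this frame and is removed, while the unmatched second detection starts a new track.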

Adaptive Kalman Filter Implementation

The following implementation demonstrates the adaptive noise adjustment logic. Instead of fixed noise matrices, the covariances scale based on the magnitude of the innovation residual.

import numpy as np
from filterpy.kalman import KalmanFilter

class DynamicCovarianceFilter(KalmanFilter):
    def __init__(self, state_dim, measurement_dim):
        super().__init__(dim_x=state_dim, dim_z=measurement_dim)
        # Initialize base noise covariances
        self.process_cov_base = np.eye(state_dim) * 0.05
        self.measurement_cov_base = np.eye(measurement_dim) * 0.05
        self.Q = self.process_cov_base.copy()
        self.R = self.measurement_cov_base.copy()
        
    def _compute_scaling_factor(self, residual_vector):
        # Calculate magnitude of the residual
        residual_norm = np.sqrt(np.sum(np.square(residual_vector)))
        # Prevent division by zero using trace of current process noise
        noise_baseline = np.trace(self.Q)
        if noise_baseline < 1e-6:
            noise_baseline = 1e-6
            
        # Compute ratio and clip to prevent extreme variance spikes
        ratio = residual_norm / noise_baseline
        return np.clip(ratio, 0.5, 2.0)

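    def update(self, measurement):
        if measurement is None:
            # filterpy's update() requires z; passing None records a missed observation
            return super().update(None)

        # Calculate innovation (residual); flatten both sides so a 1-D measurement
        # does not broadcast against the column-vector state
        innovation = np.ravel(measurement) - np.ravel(np.dot(self.H, self.x))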
        
        # Adapt noise covariances based on innovation magnitude
        scale = self._compute_scaling_factor(innovation)
        self.Q = self.process_cov_base * scale
        self.R = self.measurement_cov_base * scale
        
        # Execute standard Kalman update step
        return super().update(measurement)

In this structure, np.sqrt(np.sum(np.square(residual_vector))) computes the Euclidean norm of the error. The np.trace function sums the diagonal elements of the covariance matrix, representing total variance. By comparing the residual norm to the current process noise trace, the filter determines if the current model uncertainty is underestimating the actual motion variability. The np.clip function ensures stability by preventing the noise parameters from becoming too large or too small.
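
To make the arithmetic concrete, the sketch below reproduces the scaling rule as a standalone function and evaluates it for three residual sizes. The 4-dimensional state and the residual values are assumptions chosen for illustration; np.linalg.norm is used as an equivalent of the square-root-of-sum-of-squares expression.

```python
import numpy as np


def scaling_factor(residual, Q, low=0.5, high=2.0):
    """Standalone copy of the scaling rule: residual norm over trace(Q), clipped."""
    residual_norm = np.linalg.norm(residual)
    baseline = max(np.trace(Q), 1e-6)  # guard against division by zero
    return float(np.clip(residual_norm / baseline, low, high))


Q_base = np.eye(4) * 0.05  # assumed 4-dimensional state, trace = 0.2

small = scaling_factor([0.03, 0.04], Q_base)   # norm 0.05, ratio 0.25, clipped up to 0.5
medium = scaling_factor([0.18, 0.24], Q_base)  # norm 0.30, ratio 1.5, inside the clip range
large = scaling_factor([3.0, 4.0], Q_base)     # norm 5.0, ratio 25, clipped down to 2.0

# The class divides by the trace of the current Q, not the base value, so a
# previous scale of 2.0 halves the ratio for the same residual on the next frame.
after_spike = scaling_factor([0.18, 0.24], Q_base * 2.0)  # ratio 0.30 / 0.40 = 0.75
```

The last line shows a side effect worth keeping in mind when tuning: because the baseline is the already-scaled Q, the scale from one frame influences the ratio computed on the next.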

Evaluation Metrics and Setup

Performance was validated using the MOT17 test dataset. For this specific benchmark, the detection confidence threshold was configured to 0.4.

Key metrics employed for assessment include:

HOTA (Higher Order Tracking Accuracy): Balances detection accuracy and association accuracy, providing a comprehensive view of tracker performance.
AssA (Association Accuracy): Specifically measures the quality of linking detections to correct track IDs.
IDF1 (Identity F1 Score): Harmonic mean of ID precision and ID recall.
MOTA (Multiple Object Tracking Accuracy): Aggregates false positives, false negatives, and identity switches. Note that MOTA is heavily influenced by detection quality.
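
For reference, the standard definitions behind three of these scores can be written as short functions. The counts in the example are made up for illustration and are not results from this evaluation. The function names are hypothetical. HOTA is shown for a single localization threshold only, and the reported HOTA score averages this quantity over a range of thresholds.

```python
import math


def mota(false_negatives, false_positives, id_switches, num_gt):
    """MOTA = 1 - (FN + FP + IDSW) / number of ground-truth boxes."""
    return 1.0 - (false_negatives + false_positives + id_switches) / num_gt


def idf1(idtp, idfp, idfn):
    """IDF1 = 2 * IDTP / (2 * IDTP + IDFP + IDFN)."""
    return 2 * idtp / (2 * idtp + idfp + idfn)


def hota_at_threshold(det_a, ass_a):
    """HOTA at one localization threshold: geometric mean of DetA and AssA."""
    return math.sqrt(det_a * ass_a)


# Illustrative counts only.
example_mota = mota(false_negatives=100, false_positives=50, id_switches=10, num_gt=1000)
example_idf1 = idf1(idtp=800, idfp=100, idfn=200)
example_hota = hota_at_threshold(det_a=0.64, ass_a=0.49)
```

The MOTA formula makes its dependence on detection quality visible: false negatives and false positives usually far outnumber identity switches, so they dominate the numerator.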

Evaluation was performed using the motmetrics library. Ground truth trajectories (gt.txt) were compared against predicted trajectories (ts1.txt). The compare_to_groundtruth function matched ground truth and predicted boxes by Intersection over Union (IoU), and the resulting statistics covered frame counts, identity switches, precision, recall, and overall accuracy scores.

Tags: object-tracking kalman-filter computer-vision deep-learning oc-sort

Posted on Wed, 13 May 2026 11:05:47 +0000 by mslinuz
