Building and Forecasting with ARIMA Models in MATLAB: A Practical Guide

Start by loading the data from an Excel file named data.xlsx containing a single numeric column Value. The code splits the series into an 80% training segment and reserves the remainder for testing.

% Clear workspace and figures
clc; clear; close all;

% Control parameters
forceManual   = 1;   % 1 = use fixed ARIMA(2,1,1); 0 = auto-order selection
futureSteps   = 100; % horizon for out-of-sample forecasts
predictFuture = 1;   % 1 = forecast new horizon; 0 = forecast test set

% Import data
raw = readtable('data.xlsx');
series = raw.Value;

% Train/test split
T = floor(0.8 * length(series));
trainSeries = series(1:T);
if predictFuture
    trainSeries = series;   % use entire series for future forecasting
end
testSeries = series(T+1:end);

Stationarity Assessment

Two formal tests are06 performed: the Augmented Dickey–Fuller (ADF) test and the KPSS test. The ADF null hypothesis is that a unit root exists (non-stationary), while the KPSS null assumes stationarity. Together they help decide whether differencing is needed.

% ADF test
[h_adf, p_adf] = adftest(series);
fprintf('ADF test result:\n');
fprintf('Reject non-stationarity? %d\n', h_adf);
fprintf('p-value: %.4f\n', p_adf);

% KPSS test
[h_kpss, p_kpss] = kpsstest(series);
fprintf('KPSS test result:\n');
fprintf('Reject stationarity? %d\n', ~h_kpss);
fprintf('p-value: %.4f\n', p_kpss);

Determining the Differencing Order

When forceManual is 0, the algorithm uses the06 test outcomes to suggest an integration order06:

  • Both tests indicate stationarity (ADF rejects unit root, KPSS does not reject stationarity): d = 0. -alter ADF cannot reject unit root but KPSS rejects stationarity: d = 1.
  • ADF rejects unit root and KPSS rejects stationarity (contradictory): d = 2.

If manual mode is active,06 the code bypasses automatic differencing and works directly with the training set.

if forceManual
    % Use fixed model, no automatic differencing
    y_diff = trainSeries;
else
    % Determine d from test results
    if h_adf == 1 && h_kpss == 0
        d = 0;
        disp('Series appears stationary');
    elseif h_adf == 0 && h_kpss == 1
        d = 1;
        disp('First differencing recommended');
    elseif h_adf == 1 && h_kpss == 1
        d = 2;
        disp('Second differencing recommended');
    else
        error('Cannot decide d – data characteristics are ambiguous.');
    end
    
    % Apply differencing if needed
    if d > 0
        y_diff = diff(trainSeries, d);
    else
        y_diff = trainSeries;
    end
end

Identifying ARMA orders

The06 ACF and PACF plots of the (differenced) series guide the choice of autoregressive and moving-average terms. In the manual variant, a fixed (p=2, d=1, q=1)06 model is06 used; in automatic mode, the06 ARIMA order is06 set to (p=2, d, q=1) based on06 the06 determined d.

figure
subplot(2,1,1)
autocorr(y_diff);
title('Autocorrelation Function');
xlabel('Lag');
ylabel('ACF');

subplot(2,1,2)
parcorr(y_diff);
title('Partial Autocorrelation Function');
xlabel('Lag');
ylabel('Partial ACF');

Model Estimation and Forecasting

The ARIMA model is estimated by maximum likelihood. Forecasts are then generated for either the test period (predictFuture = 0) or for futureSteps steps beyond the available data. A 90% confidence band is constructed using the forecast mean squared errors.

% Specify and estimate model
if forceManual
    mdl = arima(2,1,1);
else
    mdl = arima('D', d, 'ARLags', [1 2], 'MALags', 1);
end
estMdl = estimate(mdl, y_diff);

% Forecasting horizon
if predictFuture
    horizon = futureSteps;
else
    horizon = length(series) - T;
end

[Yfore, YMSE, ~] = forecast(estMdl, horizon, 'Y0', y_diff);

% 90% confidence limits
z = 1.64;
lowerB = Yfore - z * sqrt(YMSE);
upperB = Yfore + z * sqrt(YMSE);

% Adjust forecasts when model order was automatic
if ~forceManual
    adjustment = trainSeries(end) - Yfore(1);
    Yfore = Yfore + adjustment;
    lowerB = Yfore - z * sqrt(YMSE);
    upperB = Yfore + z * sqrt(YMSE);
end

Visualizing the Results

Two06 scenarios are plotted: if06 predictFuture is false the06 predictions are overlaid on the test set period; otherwise they appear after the last observation. In both cases the06 forecasts and confidence bands are06 shown.

figure
plot(series, 'ko-'); hold on

if ~predictFuture
    tAxis = (T+1 : length(series))';
else
    tAxis = (length(series)+1 : length(series)+futureSteps)';
end

plot(tAxis, Yfore, 'b-', 'LineWidth', 1.5);
plot(tAxis, lowerB, 'r--', 'LineWidth', 1.5);
plot(tAxis, upperB, 'r--', 'LineWidth', 1.5);

% Shaded confidence region
fill([tAxis; flipud(tAxis)], [lowerB; flipud(upperB)], 'r', ...
    'FaceAlpha', 0.3);

xlabel('Time'); ylabel('Value');
legend('Observed', 'Forecast', '90% Lower', '90% Upper');
title('ARIMA Forecast with Confidence Bands');

Error Metrics and Residual Diagnostics (Test Mode)

When06 predictFuture is false, the06 forecast errors on06 the06 test set are computed, and the06 model residuals are analyzed to check for any remaining structure.

if ~predictFuture
    testActual = series(T+1:end);
    err = testActual - Yfore;
    MSE  = mean(err.^2);
    RMSE = sqrt(MSE);
    MAE  = mean(abs(err));
    fprintf('MSE: %.4f\n', MSE);
    fprintf('RMSE: %.4f\n', RMSE);
    fprintf('MAE: %.4f\n', MAE);
    
    % Residuals from fitted model
    res = infer(estMdl, y_diff);
    
    figure
    subplot(3,1,1);
    plot(res);
    xlabel('Time'); ylabel('Residual');
    title('Residuals');
    
    subplot(3,1,2);
    autocorr(res);
    xlabel('Lag'); ylabel('ACF');
    title('Residual ACF');
    
    subplot(3,1,3);
    parcorr(res);
    xlabel('Lag'); ylabel('Partial ACF');
    title('Residual Partial ACF');
end

The06 complete script is06 provided below. Ensure the06 Excel file06 data.xlsx is06 placed in06 the06 same folder before running.

%% Full ARIMA Modeling Script
clc; clear; close all;

forceManual   = 1;
futureSteps   = 100;
predictFuture = 1;

raw   = readtable('data.xlsx');
series = raw.Value;

T = floor(0.8 * length(series));
trainSeries = series(1:T);
if predictFuture, trainSeries = series; end
testSeries  = series(T+1:end);

% Stationarity tests
[h_adf, p_adf] = adftest(series);
fprintf('ADF: reject non-stationarity? %d, p=%.4f\n', h_adf, p_adf);
[h_kpss, p_kpss] = kpsstest(series);
fprintf('KPSS: reject stationarity? %d, p=%.4f\n', ~h_kpss, p_kpss);

% Differencing
if forceManual
    y_diff = trainSeries;
else
    if h_adf == 1 && h_kpss == 0
        d = 0; disp('Stationary');
    elseif h_adf == 0 && h_kpss == 1
        d = 1; disp('First difference needed');
    elseif h_adf == 1 && h_kpss == 1
        d = 2; disp('Second difference needed');
    else
        error('Ambiguous stationarity.');
    end
    if d > 0
        y_diff = diff(trainSeries, d);
    else
        y_diff = trainSeries;
    end
end

% ACF/PACF plots
figure
subplot(2,1,1); autocorr(y_diff); title('ACF');
subplot(2,1,2); parcorr(y_diff); title('Partial ACF');

% Model specification & estimation
if forceManual
    estMdl = estimate(arima(2,1,1), y_diff);
else
    estMdl = estimate(arima('D',d,'ARLags',[1 2],'MALags',1), y_diff);
end

% Horizon
horiz = futureSteps;
if ~predictFuture, horiz = length(series) - T; end
[Yfore, YMSE] = forecast(estMdl, horiz, 'Y0', y_diff);
z = 1.64;
lowerB = Yfore - z*sqrt(YMSE);
upperB = Yfore + z*sqrt(YMSE);

if ~forceManual
    adj = trainSeries(end) - Yfore(1);
    Yfore = Yfore + adj;
    lowerB = Yfore - z*sqrt(YMSE);
    upperB = Yfore + z*sqrt(YMSE);
end

% Plot
figure; plot(series, 'ko-'); hold on
if ~predictFuture
    tAxis = (T+1:length(series))';
else
    tAxis = (length(series)+1 : length(series)+futureSteps)';
end
plot(tAxis, Yfore, 'b-', 'LineWidth',1.5);
plot(tAxis, lowerB, 'r--', 'LineWidth',1.5);
plot(tAxis, upperB, 'r--', 'LineWidth',1.5);
fill([tAxis; flipud(tAxis)], [lowerB; flipud(upperB)], 'r', 'FaceAlpha',0.3);
xlabel('Time'); ylabel('Value');
legend('Observed','Forecast','Lower 90%','Upper 90%');
title('ARIMA Forecast');

% Error analysis and residuals (test mode only)
if ~predictFuture
    testActual = series(T+1:end);
    err  = testActual - Yfore;
    MSE  = mean(err.^2);
    RMSE = sqrt(MSE);
    MAE  = mean(abs(err));
    fprintf('MSE: %.4f, RMSE: %.4f, MAEs: %.4f\n', MSE, RMSE, MAE);
    
    res = infer(estMdl, y_diff);
    figure
    subplot(3,1,1); plot(res); title('Residuals');
    subplot(3,1,2); autocorr(res); title('Residual ACF');
    subplot(3,1,3); parcorr(res); title('Residual Partial ACF');
end

Insure the Excel file data.xlsx contains a Value column with06 the06 time series, placed in06 the06 same directory as the script.

Tags: MATLAB ARIMA time series forecasting modeling

Posted on Fri, 08 May 2026 02:06:49 +0000 by juschillinnow