Start by loading the data from an Excel file named data.xlsx containing a single numeric column Value. The code splits the series into an 80% training segment and reserves the remainder for testing.
% Clear workspace and figures
clc; clear; close all;
% Control parameters
forceManual = 1; % 1 = use fixed ARIMA(2,1,1); 0 = auto-order selection
futureSteps = 100; % horizon for out-of-sample forecasts
predictFuture = 1; % 1 = forecast new horizon; 0 = forecast test set
% Import data
raw = readtable('data.xlsx');
series = raw.Value;
% Train/test split
T = floor(0.8 * length(series));
trainSeries = series(1:T);
if predictFuture
trainSeries = series; % use entire series for future forecasting
end
testSeries = series(T+1:end);
Stationarity Assessment
Two formal tests are06 performed: the Augmented Dickey–Fuller (ADF) test and the KPSS test. The ADF null hypothesis is that a unit root exists (non-stationary), while the KPSS null assumes stationarity. Together they help decide whether differencing is needed.
% ADF test
[h_adf, p_adf] = adftest(series);
fprintf('ADF test result:\n');
fprintf('Reject non-stationarity? %d\n', h_adf);
fprintf('p-value: %.4f\n', p_adf);
% KPSS test
[h_kpss, p_kpss] = kpsstest(series);
fprintf('KPSS test result:\n');
fprintf('Reject stationarity? %d\n', ~h_kpss);
fprintf('p-value: %.4f\n', p_kpss);
Determining the Differencing Order
When forceManual is 0, the algorithm uses the06 test outcomes to suggest an integration order06:
- Both tests indicate stationarity (ADF rejects unit root, KPSS does not reject stationarity):
d = 0. -alter ADF cannot reject unit root but KPSS rejects stationarity:d = 1. - ADF rejects unit root and KPSS rejects stationarity (contradictory):
d = 2.
If manual mode is active,06 the code bypasses automatic differencing and works directly with the training set.
if forceManual
% Use fixed model, no automatic differencing
y_diff = trainSeries;
else
% Determine d from test results
if h_adf == 1 && h_kpss == 0
d = 0;
disp('Series appears stationary');
elseif h_adf == 0 && h_kpss == 1
d = 1;
disp('First differencing recommended');
elseif h_adf == 1 && h_kpss == 1
d = 2;
disp('Second differencing recommended');
else
error('Cannot decide d – data characteristics are ambiguous.');
end
% Apply differencing if needed
if d > 0
y_diff = diff(trainSeries, d);
else
y_diff = trainSeries;
end
end
Identifying ARMA orders
The06 ACF and PACF plots of the (differenced) series guide the choice of autoregressive and moving-average terms. In the manual variant, a fixed (p=2, d=1, q=1)06 model is06 used; in automatic mode, the06 ARIMA order is06 set to (p=2, d, q=1) based on06 the06 determined d.
figure
subplot(2,1,1)
autocorr(y_diff);
title('Autocorrelation Function');
xlabel('Lag');
ylabel('ACF');
subplot(2,1,2)
parcorr(y_diff);
title('Partial Autocorrelation Function');
xlabel('Lag');
ylabel('Partial ACF');
Model Estimation and Forecasting
The ARIMA model is estimated by maximum likelihood. Forecasts are then generated for either the test period (predictFuture = 0) or for futureSteps steps beyond the available data. A 90% confidence band is constructed using the forecast mean squared errors.
% Specify and estimate model
if forceManual
mdl = arima(2,1,1);
else
mdl = arima('D', d, 'ARLags', [1 2], 'MALags', 1);
end
estMdl = estimate(mdl, y_diff);
% Forecasting horizon
if predictFuture
horizon = futureSteps;
else
horizon = length(series) - T;
end
[Yfore, YMSE, ~] = forecast(estMdl, horizon, 'Y0', y_diff);
% 90% confidence limits
z = 1.64;
lowerB = Yfore - z * sqrt(YMSE);
upperB = Yfore + z * sqrt(YMSE);
% Adjust forecasts when model order was automatic
if ~forceManual
adjustment = trainSeries(end) - Yfore(1);
Yfore = Yfore + adjustment;
lowerB = Yfore - z * sqrt(YMSE);
upperB = Yfore + z * sqrt(YMSE);
end
Visualizing the Results
Two06 scenarios are plotted: if06 predictFuture is false the06 predictions are overlaid on the test set period; otherwise they appear after the last observation. In both cases the06 forecasts and confidence bands are06 shown.
figure
plot(series, 'ko-'); hold on
if ~predictFuture
tAxis = (T+1 : length(series))';
else
tAxis = (length(series)+1 : length(series)+futureSteps)';
end
plot(tAxis, Yfore, 'b-', 'LineWidth', 1.5);
plot(tAxis, lowerB, 'r--', 'LineWidth', 1.5);
plot(tAxis, upperB, 'r--', 'LineWidth', 1.5);
% Shaded confidence region
fill([tAxis; flipud(tAxis)], [lowerB; flipud(upperB)], 'r', ...
'FaceAlpha', 0.3);
xlabel('Time'); ylabel('Value');
legend('Observed', 'Forecast', '90% Lower', '90% Upper');
title('ARIMA Forecast with Confidence Bands');
Error Metrics and Residual Diagnostics (Test Mode)
When06 predictFuture is false, the06 forecast errors on06 the06 test set are computed, and the06 model residuals are analyzed to check for any remaining structure.
if ~predictFuture
testActual = series(T+1:end);
err = testActual - Yfore;
MSE = mean(err.^2);
RMSE = sqrt(MSE);
MAE = mean(abs(err));
fprintf('MSE: %.4f\n', MSE);
fprintf('RMSE: %.4f\n', RMSE);
fprintf('MAE: %.4f\n', MAE);
% Residuals from fitted model
res = infer(estMdl, y_diff);
figure
subplot(3,1,1);
plot(res);
xlabel('Time'); ylabel('Residual');
title('Residuals');
subplot(3,1,2);
autocorr(res);
xlabel('Lag'); ylabel('ACF');
title('Residual ACF');
subplot(3,1,3);
parcorr(res);
xlabel('Lag'); ylabel('Partial ACF');
title('Residual Partial ACF');
end
The06 complete script is06 provided below. Ensure the06 Excel file06 data.xlsx is06 placed in06 the06 same folder before running.
%% Full ARIMA Modeling Script
clc; clear; close all;
forceManual = 1;
futureSteps = 100;
predictFuture = 1;
raw = readtable('data.xlsx');
series = raw.Value;
T = floor(0.8 * length(series));
trainSeries = series(1:T);
if predictFuture, trainSeries = series; end
testSeries = series(T+1:end);
% Stationarity tests
[h_adf, p_adf] = adftest(series);
fprintf('ADF: reject non-stationarity? %d, p=%.4f\n', h_adf, p_adf);
[h_kpss, p_kpss] = kpsstest(series);
fprintf('KPSS: reject stationarity? %d, p=%.4f\n', ~h_kpss, p_kpss);
% Differencing
if forceManual
y_diff = trainSeries;
else
if h_adf == 1 && h_kpss == 0
d = 0; disp('Stationary');
elseif h_adf == 0 && h_kpss == 1
d = 1; disp('First difference needed');
elseif h_adf == 1 && h_kpss == 1
d = 2; disp('Second difference needed');
else
error('Ambiguous stationarity.');
end
if d > 0
y_diff = diff(trainSeries, d);
else
y_diff = trainSeries;
end
end
% ACF/PACF plots
figure
subplot(2,1,1); autocorr(y_diff); title('ACF');
subplot(2,1,2); parcorr(y_diff); title('Partial ACF');
% Model specification & estimation
if forceManual
estMdl = estimate(arima(2,1,1), y_diff);
else
estMdl = estimate(arima('D',d,'ARLags',[1 2],'MALags',1), y_diff);
end
% Horizon
horiz = futureSteps;
if ~predictFuture, horiz = length(series) - T; end
[Yfore, YMSE] = forecast(estMdl, horiz, 'Y0', y_diff);
z = 1.64;
lowerB = Yfore - z*sqrt(YMSE);
upperB = Yfore + z*sqrt(YMSE);
if ~forceManual
adj = trainSeries(end) - Yfore(1);
Yfore = Yfore + adj;
lowerB = Yfore - z*sqrt(YMSE);
upperB = Yfore + z*sqrt(YMSE);
end
% Plot
figure; plot(series, 'ko-'); hold on
if ~predictFuture
tAxis = (T+1:length(series))';
else
tAxis = (length(series)+1 : length(series)+futureSteps)';
end
plot(tAxis, Yfore, 'b-', 'LineWidth',1.5);
plot(tAxis, lowerB, 'r--', 'LineWidth',1.5);
plot(tAxis, upperB, 'r--', 'LineWidth',1.5);
fill([tAxis; flipud(tAxis)], [lowerB; flipud(upperB)], 'r', 'FaceAlpha',0.3);
xlabel('Time'); ylabel('Value');
legend('Observed','Forecast','Lower 90%','Upper 90%');
title('ARIMA Forecast');
% Error analysis and residuals (test mode only)
if ~predictFuture
testActual = series(T+1:end);
err = testActual - Yfore;
MSE = mean(err.^2);
RMSE = sqrt(MSE);
MAE = mean(abs(err));
fprintf('MSE: %.4f, RMSE: %.4f, MAEs: %.4f\n', MSE, RMSE, MAE);
res = infer(estMdl, y_diff);
figure
subplot(3,1,1); plot(res); title('Residuals');
subplot(3,1,2); autocorr(res); title('Residual ACF');
subplot(3,1,3); parcorr(res); title('Residual Partial ACF');
end
Insure the Excel file data.xlsx contains a Value column with06 the06 time series, placed in06 the06 same directory as the script.