This document details eight experiments covering seven machine learning algorithms, implemented and evaluated on the classic Iris dataset with 5-fold cross-validation: Logistic Regression, a C4.5-style Decision Tree (with pre- and post-pruning), an SMO-based SVM, a BP Neural Network, Naive Bayes, K-means Clustering, and Random Forest.
Experiment 1: Data Preparation and Model Evaluation
Objective
Develop proficiency in Python for data handling and model evaluation, focusing on training/test set concepts, N-fold cross-validation, and performance metrics.
Implementation Steps
- Load the Iris dataset from a local file (iris.data) and from scikit-learn.
- Implement 5-fold cross-validation using a RandomForestClassifier (100 trees).
- Compute accuracy, precision (macro average), recall (macro average), and F1-score (macro average).
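The split behavior of KFold can be previewed on dummy data before touching the real dataset; a minimal sketch (X_demo and the ten indices are illustrative only):

```python
import numpy as np
from sklearn.model_selection import KFold

# Toy illustration: how KFold partitions ten sample indices into five folds.
X_demo = np.arange(10).reshape(-1, 1)
kf = KFold(n_splits=5, shuffle=True, random_state=42)

folds = [(tr.tolist(), te.tolist()) for tr, te in kf.split(X_demo)]
for i, (tr, te) in enumerate(folds, start=1):
    print(f"Fold {i}: train={tr}, test={te}")
```

Each sample appears in exactly one test fold, so the five test folds together cover the whole dataset once.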
Code
import pandas as pd
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score, KFold, cross_val_predict
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
from sklearn.preprocessing import LabelEncoder
np.random.seed(42)
# Load data from local file
data_path = "iris.data"
col_names = ['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)', 'target']
df = pd.read_csv(data_path, header=None, names=col_names)
X = df.drop('target', axis=1).values
encoder = LabelEncoder()
y = encoder.fit_transform(df['target'])
# Model and CV setup
clf = RandomForestClassifier(n_estimators=100, random_state=42)
kf = KFold(n_splits=5, shuffle=True, random_state=42)
# Predictions using cross_val_predict
y_pred = cross_val_predict(clf, X, y, cv=kf)
# Metrics
acc = accuracy_score(y, y_pred)
prec = precision_score(y, y_pred, average='macro')
rec = recall_score(y, y_pred, average='macro')
f1 = f1_score(y, y_pred, average='macro')
cv_acc = cross_val_score(clf, X, y, cv=kf, scoring='accuracy')
cv_prec = cross_val_score(clf, X, y, cv=kf, scoring='precision_macro')
cv_rec = cross_val_score(clf, X, y, cv=kf, scoring='recall_macro')
cv_f1 = cross_val_score(clf, X, y, cv=kf, scoring='f1_macro')
print(f"Accuracy: {acc:.4f} (CV mean: {np.mean(cv_acc):.4f})")
print(f"Precision: {prec:.4f} (CV mean: {np.mean(cv_prec):.4f})")
print(f"Recall: {rec:.4f} (CV mean: {np.mean(cv_rec):.4f})")
print(f"F1: {f1:.4f} (CV mean: {np.mean(cv_f1):.4f})")
Parameter Description
| Parameter | Meaning | Notes |
|---|---|---|
| n_estimators=100 | Number of trees in random forest | Standard choice |
| random_state=42 | Random seed for reproducibility | Ensures consistent results |
Results
- Accuracy: 96.67%
- Precision (macro): 0.9628
- Recall (macro): 0.9594
- F1 (macro): 0.9589
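The macro averaging used above weights every class equally, regardless of class size; a toy sketch on made-up 3-class labels:

```python
import numpy as np
from sklearn.metrics import precision_score

# Illustrative labels: class 0 has more samples than classes 1 and 2.
y_true = np.array([0, 0, 0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 0, 0, 1, 1, 1, 2, 0])

per_class = precision_score(y_true, y_pred, average=None)   # 3/4, 2/3, 1
macro = precision_score(y_true, y_pred, average='macro')    # unweighted mean
print(per_class, macro)
```

Weighted averaging (used in some later experiments) would instead scale each class's score by its sample count.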
Experiment 2: Logistic Regression
Objective
Understand the principles of logistic regression and implement it with multinomial extension for multi-class classification.
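Multinomial logistic regression turns per-class linear scores into probabilities with the softmax function; a minimal NumPy sketch (the scores are made-up numbers, not fitted values):

```python
import numpy as np

def softmax(z):
    # Subtract the max for numerical stability; result is unchanged.
    e = np.exp(z - z.max())
    return e / e.sum()

# Hypothetical per-class scores w_k . x + b_k for the three iris classes.
scores = np.array([2.0, 1.0, -1.0])
probs = softmax(scores)
print(probs, probs.sum())
```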
Implementation
from sklearn.linear_model import LogisticRegression
def load_and_preprocess():
    col_names = ['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species']
    df = pd.read_csv("iris.data", header=None, names=col_names)
    le = LabelEncoder()
    y = le.fit_transform(df['species'])
    X = df.drop('species', axis=1).values
    return X, y, le
X, y, le = load_and_preprocess()
log_clf = LogisticRegression(multi_class='multinomial', solver='lbfgs', max_iter=1000, random_state=42)
kf = KFold(n_splits=5, shuffle=True, random_state=42)
y_pred_cv = cross_val_predict(log_clf, X, y, cv=kf)
acc_cv = cross_val_score(log_clf, X, y, cv=kf, scoring='accuracy')
prec_cv = cross_val_score(log_clf, X, y, cv=kf, scoring='precision_weighted')
rec_cv = cross_val_score(log_clf, X, y, cv=kf, scoring='recall_weighted')
f1_cv = cross_val_score(log_clf, X, y, cv=kf, scoring='f1_weighted')
print(f"CV Accuracy: {np.mean(acc_cv):.4f}")
print(f"CV Precision (weighted): {np.mean(prec_cv):.4f}")
print(f"CV Recall (weighted): {np.mean(rec_cv):.4f}")
print(f"CV F1 (weighted): {np.mean(f1_cv):.4f}")
Parameter Description
| Parameter | Meaning | Notes |
|---|---|---|
| multi_class='multinomial' | Multinomial logistic regression | For 3-class classification |
| solver='lbfgs' | Optimization algorithm | Suitable for small datasets |
| max_iter=1000 | Maximum iterations | Ensures convergence |
Results
- CV Accuracy: 0.9733
- CV Precision (weighted): 0.9738
- CV Recall (weighted): 0.9733
- CV F1 (weighted): 0.9733
The model performs well, with petal length and width being the most important features.
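One way to sanity-check the feature-importance observation is to inspect the fitted coefficient magnitudes; a sketch using scikit-learn's bundled copy of the dataset so it runs without iris.data (the standardization step is an assumption added here to make coefficients comparable):

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

iris = load_iris()
X_s = StandardScaler().fit_transform(iris.data)
clf = LogisticRegression(max_iter=1000, random_state=42).fit(X_s, iris.target)

# Mean absolute coefficient per feature across the three classes.
importance = np.abs(clf.coef_).mean(axis=0)
for name, w in zip(iris.feature_names, importance):
    print(f"{name}: {w:.3f}")
```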
Experiment 3: C4.5 Decision Tree with Pre- and Post-Pruning
Objective
Implement a C4.5-like decision tree with pruning strategies to control overfitting.
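C4.5 selects splits by information gain (and gain ratio); a minimal sketch of the entropy arithmetic behind the criterion='entropy' setting (the toy parent/child label arrays are illustrative):

```python
import numpy as np

def entropy(labels):
    # Shannon entropy of a label array, in bits.
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

# Toy split: a parent node with 8 samples divided into two children.
parent = np.array([0, 0, 0, 0, 1, 1, 1, 1])
left, right = np.array([0, 0, 0, 1]), np.array([0, 1, 1, 1])

gain = entropy(parent) - (len(left) / len(parent)) * entropy(left) \
                       - (len(right) / len(parent)) * entropy(right)
print(f"information gain = {gain:.4f}")
```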
Implementation
from sklearn.tree import DecisionTreeClassifier
def create_c45_model(pruning='none'):
    params = {
        'criterion': 'entropy',
        'random_state': 42
    }
    if pruning == 'pre':
        params.update({'max_depth': 5, 'min_samples_split': 5, 'min_samples_leaf': 3})
    elif pruning == 'post':
        params.update({'ccp_alpha': 0.01})
    return DecisionTreeClassifier(**params)

def evaluate_model(model, X, y):
    kf = KFold(n_splits=5, shuffle=True, random_state=42)
    accs, precs, recs, f1s = [], [], [], []
    for train_idx, test_idx in kf.split(X):
        X_train, X_test = X[train_idx], X[test_idx]
        y_train, y_test = y[train_idx], y[test_idx]
        model.fit(X_train, y_train)
        y_pred = model.predict(X_test)
        accs.append(accuracy_score(y_test, y_pred))
        precs.append(precision_score(y_test, y_pred, average='macro'))
        recs.append(recall_score(y_test, y_pred, average='macro'))
        f1s.append(f1_score(y_test, y_pred, average='macro'))
    return np.mean(accs), np.mean(precs), np.mean(recs), np.mean(f1s)
X, y, _ = load_and_preprocess()
for pruning in ['none', 'pre', 'post']:
    model = create_c45_model(pruning)
    acc, prec, rec, f1 = evaluate_model(model, X, y)
    print(f"Pruning={pruning}: Acc={acc:.4f}, Prec={prec:.4f}, Rec={rec:.4f}, F1={f1:.4f}")
Results
- Without pruning: accuracy ~95.33%
- Pre-pruning: accuracy ~93.33%
- Post-pruning: accuracy similar to the unpruned tree, but with a noticeably simpler tree.
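The ccp_alpha=0.01 value used for post-pruning comes from scikit-learn's minimal cost-complexity pruning; a sketch of how candidate alphas can be inspected, here on the bundled copy of the dataset (the 0.05 below is an arbitrary illustrative value):

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
tree = DecisionTreeClassifier(criterion='entropy', random_state=42)
path = tree.cost_complexity_pruning_path(iris.data, iris.target)

# Effective alphas: each successive value prunes away one more subtree.
print(path.ccp_alphas)

# Larger alpha -> smaller tree.
full = DecisionTreeClassifier(criterion='entropy',
                              random_state=42).fit(iris.data, iris.target)
small = DecisionTreeClassifier(criterion='entropy', ccp_alpha=0.05,
                               random_state=42).fit(iris.data, iris.target)
print(full.get_n_leaves(), small.get_n_leaves())
```

In practice one would cross-validate over path.ccp_alphas rather than fix a single value.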
Experiment 4: SMO Algorithm for SVM
Objective
Implement SMO (Sequential Minimal Optimization) for training a Support Vector Machine. In practice this experiment uses scikit-learn's SVC, whose libsvm backend solves the dual problem with an SMO-type algorithm.
Implementation
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
def train_svm_cv(X, y, kernel='rbf', C=1.0, gamma='scale'):
    kf = KFold(n_splits=5, shuffle=True, random_state=42)
    accs, precs, recs, f1s = [], [], [], []
    for train_idx, test_idx in kf.split(X):
        X_train, X_test = X[train_idx], X[test_idx]
        y_train, y_test = y[train_idx], y[test_idx]
        scaler = StandardScaler()
        X_train_scaled = scaler.fit_transform(X_train)
        X_test_scaled = scaler.transform(X_test)
        model = SVC(kernel=kernel, C=C, gamma=gamma, random_state=42)
        model.fit(X_train_scaled, y_train)
        y_pred = model.predict(X_test_scaled)
        accs.append(accuracy_score(y_test, y_pred))
        precs.append(precision_score(y_test, y_pred, average='macro'))
        recs.append(recall_score(y_test, y_pred, average='macro'))
        f1s.append(f1_score(y_test, y_pred, average='macro'))
    return np.mean(accs), np.mean(precs), np.mean(recs), np.mean(f1s)
X, y, _ = load_and_preprocess()
acc, prec, rec, f1 = train_svm_cv(X, y)
print(f"SVM with RBF: Acc={acc:.4f}, Prec={prec:.4f}, Rec={rec:.4f}, F1={f1:.4f}")
Results
- Accuracy: 0.9667 (96.67%)
- Precision, recall, and F1 (macro): 0.966-0.969
- The SVM is stable across folds (accuracy standard deviation ~0.021).
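Because the fitted SVC exposes its dual solution, the SMO equality constraint (the alpha_i * y_i values summing to zero) can be verified directly. A sketch on a two-class slice of the bundled data (this setup is illustrative, not part of the experiment code):

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.svm import SVC

iris = load_iris()
mask = iris.target < 2                 # keep two classes for a binary problem
X_bin, y_bin = iris.data[mask], iris.target[mask]

model = SVC(kernel='rbf', C=1.0, gamma='scale').fit(X_bin, y_bin)

# dual_coef_ holds alpha_i * y_i for the support vectors; SMO updates
# alphas in pairs precisely so that this sum stays at zero.
print("support vectors:", model.support_vectors_.shape[0])
print("sum(alpha_i * y_i) =", model.dual_coef_.sum())
```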
Experiment 5: BP Neural Network
Objective
Implement a multi-layer perceptron (BP neural network) for classification.
Implementation
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import StratifiedKFold

def train_mlp_cv(X, y, hidden_layer_sizes=(100,), activation='relu', max_iter=200):
    kf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
    accs, precs, recs, f1s = [], [], [], []
    for train_idx, test_idx in kf.split(X, y):
        X_train, X_test = X[train_idx], X[test_idx]
        y_train, y_test = y[train_idx], y[test_idx]
        scaler = StandardScaler()
        X_train_scaled = scaler.fit_transform(X_train)
        X_test_scaled = scaler.transform(X_test)
        model = MLPClassifier(hidden_layer_sizes=hidden_layer_sizes, activation=activation,
                              solver='adam', alpha=0.0001, max_iter=max_iter, random_state=42)
        model.fit(X_train_scaled, y_train)
        y_pred = model.predict(X_test_scaled)
        accs.append(accuracy_score(y_test, y_pred))
        precs.append(precision_score(y_test, y_pred, average='weighted'))
        recs.append(recall_score(y_test, y_pred, average='weighted'))
        f1s.append(f1_score(y_test, y_pred, average='weighted'))
    return np.mean(accs), np.mean(precs), np.mean(recs), np.mean(f1s)
X, y, _ = load_and_preprocess()
acc, prec, rec, f1 = train_mlp_cv(X, y)
print(f"MLP: Acc={acc:.4f}, Prec={prec:.4f}, Rec={rec:.4f}, F1={f1:.4f}")
Results
- Accuracy: 0.9533 (95.33%)
- Lower stability compared to SVM/logistic regression.
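The forward pass of a BP network is just matrix products and nonlinearities; a minimal NumPy sketch of one hidden layer (all weights are random made-up values, not trained ones):

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Hypothetical tiny network: 4 inputs -> 3 hidden units -> 3 classes.
rng = np.random.default_rng(42)
W1, b1 = rng.normal(size=(4, 3)), np.zeros(3)
W2, b2 = rng.normal(size=(3, 3)), np.zeros(3)

x = np.array([5.1, 3.5, 1.4, 0.2])    # one iris-like sample
hidden = relu(x @ W1 + b1)
probs = softmax(hidden @ W2 + b2)
print(probs, probs.sum())
```

Backpropagation then pushes the gradient of the loss through these same layers in reverse, which is what MLPClassifier's adam solver does internally.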
Experiment 6: Naive Bayes
Objective
Implement Gaussian Naive Bayes for classification.
Implementation
from sklearn.naive_bayes import GaussianNB
def train_gnb_cv(X, y):
    kf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
    accs, precs, recs, f1s = [], [], [], []
    for train_idx, test_idx in kf.split(X, y):
        X_train, X_test = X[train_idx], X[test_idx]
        y_train, y_test = y[train_idx], y[test_idx]
        model = GaussianNB()
        model.fit(X_train, y_train)
        y_pred = model.predict(X_test)
        accs.append(accuracy_score(y_test, y_pred))
        precs.append(precision_score(y_test, y_pred, average='weighted'))
        recs.append(recall_score(y_test, y_pred, average='weighted'))
        f1s.append(f1_score(y_test, y_pred, average='weighted'))
    return np.mean(accs), np.mean(precs), np.mean(recs), np.mean(f1s)
X, y, _ = load_and_preprocess()
acc, prec, rec, f1 = train_gnb_cv(X, y)
print(f"Gaussian NB: Acc={acc:.4f}, Prec={prec:.4f}, Rec={rec:.4f}, F1={f1:.4f}")
Results
- Accuracy: 0.9467 (94.67%)
- Fast training, but assumes feature independence.
Experiment 7: K-Means Clustering
Objective
Implement K-means clustering and evaluate using ground truth labels after mapping.
Implementation
from sklearn.cluster import KMeans
from sklearn.metrics import confusion_matrix
from scipy.optimize import linear_sum_assignment

def map_clusters_to_labels(y_true, y_pred):
    # Map arbitrary cluster ids to true class labels with the Hungarian
    # algorithm, maximizing agreement in the confusion matrix.
    cm = confusion_matrix(y_true, y_pred)
    row_ind, col_ind = linear_sum_assignment(-cm)
    return {col: row for row, col in zip(row_ind, col_ind)}

def train_kmeans_cv(X, y, n_clusters=3):
    kf = KFold(n_splits=5, shuffle=True, random_state=42)
    accs, precs, recs, f1s = [], [], [], []
    for train_idx, test_idx in kf.split(X):
        X_train, X_test = X[train_idx], X[test_idx]
        y_train, y_test = y[train_idx], y[test_idx]
        model = KMeans(n_clusters=n_clusters, random_state=42, n_init=10)
        train_labels = model.fit_predict(X_train)
        # Map cluster ids to actual labels using the training set only
        label_map = map_clusters_to_labels(y_train, train_labels)
        test_labels = model.predict(X_test)
        y_pred_mapped = np.array([label_map.get(l, -1) for l in test_labels])
        accs.append(accuracy_score(y_test, y_pred_mapped))
        precs.append(precision_score(y_test, y_pred_mapped, average='macro'))
        recs.append(recall_score(y_test, y_pred_mapped, average='macro'))
        f1s.append(f1_score(y_test, y_pred_mapped, average='macro'))
    return np.mean(accs), np.mean(precs), np.mean(recs), np.mean(f1s)
X, y, _ = load_and_preprocess()
acc, prec, rec, f1 = train_kmeans_cv(X, y)
print(f"K-Means: Acc={acc:.4f}, Prec={prec:.4f}, Rec={rec:.4f}, F1={f1:.4f}")
Results
- Accuracy: 0.8333 (83.33%)
- Lower than supervised methods as expected.
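The cluster-to-label mapping above hinges on the Hungarian algorithm; a toy sketch of just that mapping step in isolation (the label arrays are made up):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.metrics import confusion_matrix

# Cluster ids are arbitrary: here cluster 2 mostly holds class 0, etc.
y_true     = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2])
y_clusters = np.array([2, 2, 2, 0, 0, 1, 1, 1, 1])

cm = confusion_matrix(y_true, y_clusters)
row_ind, col_ind = linear_sum_assignment(-cm)      # negate to maximize agreement
label_map = {col: row for row, col in zip(row_ind, col_ind)}
y_mapped = np.array([label_map[c] for c in y_clusters])
print(label_map, (y_mapped == y_true).mean())
```

Here the assignment recovers {2: 0, 0: 1, 1: 2}, and only one sample (a class-1 point landing in cluster 1) remains misassigned.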
Experiment 8: Random Forest
Objective
Implement random forest and evaluate its performance.
Implementation
from sklearn.ensemble import RandomForestClassifier
def train_rf_cv(X, y):
    kf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
    accs, precs, recs, f1s = [], [], [], []
    for train_idx, test_idx in kf.split(X, y):
        X_train, X_test = X[train_idx], X[test_idx]
        y_train, y_test = y[train_idx], y[test_idx]
        # Scaling is not required for tree ensembles; kept here for
        # consistency with the other experiments.
        scaler = StandardScaler()
        X_train_scaled = scaler.fit_transform(X_train)
        X_test_scaled = scaler.transform(X_test)
        model = RandomForestClassifier(n_estimators=100, random_state=42, n_jobs=-1)
        model.fit(X_train_scaled, y_train)
        y_pred = model.predict(X_test_scaled)
        accs.append(accuracy_score(y_test, y_pred))
        precs.append(precision_score(y_test, y_pred, average='macro'))
        recs.append(recall_score(y_test, y_pred, average='macro'))
        f1s.append(f1_score(y_test, y_pred, average='macro'))
    return np.mean(accs), np.mean(precs), np.mean(recs), np.mean(f1s)
X, y, _ = load_and_preprocess()
acc, prec, rec, f1 = train_rf_cv(X, y)
print(f"Random Forest: Acc={acc:.4f}, Prec={prec:.4f}, Rec={rec:.4f}, F1={f1:.4f}")
Results
- Accuracy: 0.9467 (94.67%)
- Good stability; on this run the accuracy is slightly below the single decision tree's, which can happen on a dataset this small.
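Random forests also expose an impurity-based importance score per feature; a sketch on the bundled copy of the dataset (feature ordering follows load_iris):

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

iris = load_iris()
rf = RandomForestClassifier(n_estimators=100,
                            random_state=42).fit(iris.data, iris.target)

# Importances sum to 1; sort descending for readability.
for name, imp in sorted(zip(iris.feature_names, rf.feature_importances_),
                        key=lambda t: -t[1]):
    print(f"{name}: {imp:.3f}")
```

The petal measurements dominate, consistent with the logistic-regression observation in Experiment 2.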
Summary Comparison
| Algorithm | Accuracy | Precision (macro) | Recall (macro) | F1 (macro) |
|---|---|---|---|---|
| Logistic Regression | 0.9733 | 0.9738 | 0.9733 | 0.9733 |
| SVM (RBF) | 0.9667 | 0.9688 | 0.9660 | 0.9662 |
| C4.5 Decision Tree | 0.9533 | 0.9581 | 0.9538 | 0.9530 |
| BP Neural Network | 0.9533 | 0.9559 | 0.9533 | 0.9532 |
| Random Forest | 0.9467 | 0.9512 | 0.9467 | 0.9464 |
| Naive Bayes (Gaussian) | 0.9467 | 0.9488 | 0.9467 | 0.9465 |
| K-Means Clustering | 0.8333 | 0.8312 | 0.8359 | 0.8302 |
These experiments demonstrate the strengths and weaknesses of various algorithms on a classic dataset. Logistic regression and SVM perform best, while clustering shows the limitation of unsupervised learning without label guidance.