Optimizing IoT Intrusion Detection: A Comparative Analysis of XGBoost and Optimized Sequential Neural Networks

2025-07-03 CoolPal

The rapidly growing Internet of Things (IoT) generates massive volumes of sensitive data, creating a critical need for robust cybersecurity measures. Machine learning (ML) and deep learning (DL) techniques offer a promising approach to anomaly-based intrusion detection, which identifies unusual network behavior that signals potential threats. However, existing methods often struggle against sophisticated, evolving cyberattacks, in part because their preprocessing and hyperparameter tuning are not well optimized.

This study addresses these limitations by proposing a novel intrusion detection system (IDS) that leverages enhanced XGBoost and Optimized Sequential Neural Networks (OSNNs). The methodology incorporates several key improvements over conventional approaches:

Rigorous Data Preprocessing: The research emphasizes comprehensive data preprocessing, including normalization, class imbalance handling (e.g., sub-sampling), categorical variable encoding, and feature extraction, so that the models train on consistently scaled, balanced, numerically encoded inputs.
Hyperparameter Optimization of XGBoost: The XGBoost model undergoes rigorous hyperparameter tuning via grid search to maximize detection accuracy, particularly for subtle intrusion patterns.
Optimized Sequential Neural Network Architecture: A custom OSNN architecture is developed and optimized. Hyperparameters such as filter sizes, kernel sizes, pooling methods, dense layer sizes, learning rates, and activation functions (ReLU, GeLU, LeakyReLU) are carefully tuned. Dropout layers are strategically implemented to enhance generalization and reduce computational costs. The OSNN model’s architecture is designed to effectively extract unique signatures of various attacks.
Robustness Measures: 5-fold cross-validation and L2 regularization are employed to mitigate overfitting and enhance the model’s generalization capabilities, addressing potential biases stemming from class imbalances in datasets.
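
The preprocessing, grid search, and cross-validation steps listed above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the study's implementation: every function name here (min_max_normalize, one_hot_encode, undersample, stratified_k_fold, grid_search) is hypothetical, the study does not say which normalization or encoding scheme it used, and a real pipeline would more likely rely on a library such as scikit-learn.

```python
import itertools
import random
from collections import defaultdict


def min_max_normalize(column):
    """Scale a list of numbers into [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


def one_hot_encode(column):
    """Map each categorical value to a 0/1 vector over the sorted category set."""
    categories = sorted(set(column))
    return [[1 if value == c else 0 for c in categories] for value in column]


def _indices_by_class(labels):
    """Group row indices by their class label."""
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)
    return by_class


def undersample(rows, labels, seed=0):
    """Randomly drop rows until every class has the minority-class count."""
    rng = random.Random(seed)
    by_class = _indices_by_class(labels)
    target = min(len(idx) for idx in by_class.values())
    keep = sorted(i for idx in by_class.values() for i in rng.sample(idx, target))
    return [rows[i] for i in keep], [labels[i] for i in keep]


def stratified_k_fold(labels, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs with similar class ratios in each fold."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for idx in _indices_by_class(labels).values():
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)  # deal each class round-robin across folds
    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, test


def grid_search(param_grid, evaluate):
    """Score every parameter combination; return (best_params, best_score).

    `evaluate` is a caller-supplied function (an assumption of this sketch)
    that would train a model with `params` and return its mean validation
    score across the cross-validation folds.
    """
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score
```

In use, evaluate would fit a model (for example XGBoost) on each training split from stratified_k_fold(labels, k=5) and average the validation scores, so that each hyperparameter combination is judged on held-out data rather than on the data it was trained on.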

The proposed IDS was evaluated on three publicly available benchmark datasets: NSL-KDD, UNSW-NB15, and CICIDS2017. On NSL-KDD, the optimized XGBoost model achieved 99.93% accuracy, a 99.84% F1-score, a 99.86% Matthews Correlation Coefficient (MCC), and a False Positive Rate (FPR) of 0.0004 (0.04%). The OSNN model achieved 99.0% accuracy and an AUC of 1.00 on NSL-KDD, 96.80% accuracy with a loss of 0.0777 on UNSW-NB15, and 99.53% accuracy with a loss of 0.0236 on CICIDS2017. These results indicate that the approach identifies a range of intrusion types accurately across the three datasets.
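
For readers less familiar with the reported metrics, the sketch below shows how accuracy, F1-score, MCC, and FPR follow from the four cells of a binary confusion matrix. The function name binary_metrics and the example counts are illustrative assumptions, not values from the study.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute common binary-classification metrics from confusion-matrix counts.

    tp/fn: attacks correctly flagged / missed.
    tn/fp: normal traffic correctly passed / wrongly flagged.
    Ratios with a zero denominator are reported as 0.0.
    """
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    # MCC uses all four cells, so it stays informative on imbalanced data.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "f1": f1,
        "mcc": mcc,
        "fpr": fp / (fp + tn) if fp + tn else 0.0,
    }


# Made-up counts: 100 attacks and 900 normal flows.
example = binary_metrics(tp=90, fp=10, tn=890, fn=10)
```

With these made-up counts, accuracy comes out at 0.98 while MCC is about 0.889 and F1 is 0.90, which illustrates why intrusion detection studies usually report MCC, F1, and FPR alongside accuracy when normal traffic dominates the dataset.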

According to the study, a comparison with existing work shows higher binary and multiclass classification accuracy than other ML/DL models reported in the literature. The authors attribute this to the combination of enhanced preprocessing techniques, optimized model architectures, and the use of regularization and cross-validation.

While the study demonstrates significant advancements in IoT intrusion detection, limitations exist. The computational cost of deep learning models, though minimized, remains a consideration for resource-constrained IoT devices. Furthermore, the interpretability of complex deep learning models can be challenging, potentially hindering trust and confidence in their decisions. Future research could focus on addressing these limitations through model compression techniques and the development of more explainable AI methods.

In conclusion, this research provides a valuable contribution to the field of IoT security by presenting a novel IDS that significantly improves intrusion detection accuracy and robustness. The proposed methodology, combining optimized ML and DL techniques, offers a promising solution for safeguarding the increasingly interconnected IoT environment.

Datasets Used:

NSL-KDD: https://data.mendeley.com/datasets/t5bffpjd28/1 and https://www.kaggle.com/datasets/hassan06/nslkdd (CC0 1.0)
UNSW-NB15: https://www.kaggle.com/datasets/dhoogla/unswnb15 (CC BY-NC-SA 4.0)
CICIDS2017: https://www.kaggle.com/datasets/chethuhn/network-intrusion-dataset (CC0)
