You are currently viewing a new version of our website. To view the old version click .
Applied Sciences
  • This is an early access version, the complete PDF, HTML, and XML versions will be available soon.
  • Article
  • Open Access

16 November 2025

A Lightweight Training Approach for MITM Detection in IoT Networks: Time-Window Selection and Generalization

,
and
1
Department of Computer Science and Information Engineering, Chung Cheng Institute of Technology, National Defense University, Taoyuan 335009, Taiwan
2
Department of Electrical and Electronic Engineering, Chung Cheng Institute of Technology, National Defense University, Taoyuan 335009, Taiwan
*
Author to whom correspondence should be addressed.
This article belongs to the Special Issue Machine Learning and Its Application for Anomaly Detection

Abstract

The world has adopted so many IoT devices but it comes with its own share of security vulnerabilities. One such issue is ARP spoofing attack which allows a man-in-the-middle to intercept packets and thereby modify the communication. Also, this allows an intruder to gain access to the user’s entire local area network. The ACI-IoT-2023 dataset captures ARP spoofing attacks, yet its absence of specified extracted features hinders its application in machine learning-aided intrusion detection systems. To combat this, we present a framework for ARP spoofing detection which improves the dataset by extracting ARP-specific features and evaluating their impact under different time-window configurations. Beyond generic feature engineering and model evaluation, we contribute by treating ARP spoofing as a time-window pattern and aligning the window length with observed spoofing persistence from the dataset timesheet—turning window choice into an explainable, repeatable setting for constrained IoT devices; by standardizing deployment-oriented efficiency profiling (inference latency, RAM usage, and model size) reported alongside accuracy, precision, recall and F1-scores to enable edge-feasible model selection; and by providing an ARP-focused, reproducible pipeline that reconstructs L2 labels from public PCAPs and derives missing link-layer indicators, yielding a transparent path from labeling to windowed features to training evaluation. Our research systematically analyzes five models with multiple time-windows, including Decision Tree, Random Forest, XGBoost, CatBoost, and K-Nearest Neighbors. This study shows that XGBoost and CatBoost provide maximum performance at the 1800 s window that corresponds to the longest spoofing duration in the timesheet, achieving accuracy greater than 0.93%, precision above 0.95%, recall near 0.91%, and F1-scores above 0.93%. Although Decision Tree has the least inference latency (∼0.4 ms.), its lower recall risks missed attacks. By contrast, XGBoost and CatBoost sustain strong detection with less than 6$ ms inference and moderate RAM, indicating practicality for IoT deployment. We also observe diminishing returns beyond (∼1800 s) due to temporal over-aggregation.

Article Metrics

Citations

Article Access Statistics

Multiple requests from the same IP address are counted as one view.