Electronics
  • Article
  • Open Access

20 November 2025

CLAIRE: A Four-Layer Active Learning Framework for Enhanced IoT Intrusion Detection

School of Computer Science & Information Technology, King Abdulaziz University, Jeddah 22254, Saudi Arabia
This article belongs to the Special Issue Applied Machine Learning in Data Science

Abstract

The Internet of Things (IoT) has become an essential part of daily life. It plays a core role in operating everyday infrastructure, from energy grids and water distribution systems to healthcare and household devices. However, the rapid growth of IoT connections exposes this infrastructure to a range of sophisticated cybersecurity threats. Many security measures have been proposed in response. The IoT intrusion detection system (IDS) is a salient component of the security layer that alerts security administrators to suspicious behavior. Machine learning-based IDSs, especially supervised models, show promising results, but such models require expensive labelling by domain experts. Active learning reduces the annotation cost by directing experts to label a small set of carefully selected instances. This paper proposes a robust approach called Clustering-based Layered Active Instance REpresentation (CLAIRE), which selects both representative and informative instances. Representative instances are selected through three sequential clustering-based layers, while informative instances are selected by a fourth layer that implements an ensemble-based uncertainty mechanism. Comprehensive evaluation on two well-known IoT datasets, N-BaIoT and CICIoT2023, demonstrates promising results in selecting a small set of instances that captures the various distributions in the data, even in imbalanced datasets. We compare the results of the proposed approach with state-of-the-art baselines that operate within the same scope of traditional machine learning.
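
The two selection criteria named in the abstract can be illustrated with a minimal sketch. This is not the CLAIRE implementation: the abstract does not specify the clustering algorithms or the uncertainty measure, so the sketch assumes cluster labels are already given, takes the point nearest each cluster mean as the "representative" instance, and uses vote entropy (one common ensemble disagreement measure) for "informative" instances. All function names are hypothetical.

```python
import math
from collections import Counter


def nearest_to_centroid(points, labels):
    """Assumed notion of 'representative': for each cluster, the index of
    the point closest (Euclidean) to that cluster's mean."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    reps = {}
    for lab, members in clusters.items():
        dim = len(points[members[0]])
        centroid = [sum(points[i][d] for i in members) / len(members)
                    for d in range(dim)]
        reps[lab] = min(members, key=lambda i: math.dist(points[i], centroid))
    return reps


def vote_entropy(votes):
    """Assumed uncertainty measure: entropy of the labels predicted by the
    ensemble members for a single instance (0 when all members agree)."""
    total = len(votes)
    return -sum((c / total) * math.log(c / total)
                for c in Counter(votes).values())


def most_uncertain(ensemble_votes, k):
    """Indices of the k instances on which the ensemble disagrees most."""
    scores = [vote_entropy(v) for v in ensemble_votes]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]


# Toy data (made up for illustration): six 2-D points in two clusters.
points = [(0.0, 0.0), (0.2, 0.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0), (5.0, 5.4)]
labels = [0, 0, 0, 1, 1, 1]
print(nearest_to_centroid(points, labels))

# Votes from a hypothetical three-member ensemble for three instances.
votes = [["benign", "benign", "benign"],
         ["benign", "ddos", "ddos"],
         ["benign", "ddos", "scan"]]
print(most_uncertain(votes, 2))
```

In an active learning loop of this general kind, the indices returned by both functions would be sent to a domain expert for labelling, and the ensemble would be retrained on the enlarged labelled set.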
