1. Introduction
The growing complexity and frequency of cyberattacks present substantial challenges to network security, especially as technological advances accelerate. With the proliferation of interconnected devices, particularly in sensitive domains such as healthcare, the need for robust, adaptive, and intelligent threat detection mechanisms has become critical [1,2].
Traditional network security approaches, such as Next-Generation Firewalls (NGFWs) [3] and modern Intrusion Detection and Prevention Systems (IDPSs) [4], often struggle to keep pace with increasingly sophisticated cyber threats. This requires advanced computational techniques capable of dynamically detecting and mitigating potential security breaches [3,4,5].
The Internet of Medical Things (IoMT) is particularly vulnerable to cyberattacks due to its reliance on interconnected medical devices communicating through Bluetooth, Wi-Fi, and MQTT. These attacks pose severe risks, compromising sensitive patient data and critical healthcare infrastructure [1,2]. The consequences of such breaches extend beyond data privacy, potentially affecting patient safety and disrupting healthcare services.
IoMT encompasses a diverse ecosystem of interconnected medical devices that collect, transmit, and process patient data to enhance healthcare delivery. These devices range from wearable health monitors (e.g., glucose monitors, cardiac monitors, smart watches) to stationary diagnostic equipment (e.g., connected MRI machines, smart infusion pumps), and therapeutic devices (e.g., insulin pumps, pacemakers). They operate on various communication protocols, including Bluetooth Low Energy (BLE) for short-range data transmission, Wi-Fi for higher bandwidth applications, and MQTT (Message Queuing Telemetry Transport) for lightweight messaging in constrained environments. Each IoMT device category presents unique security challenges due to their varying computational capabilities, power constraints, and data sensitivity. For instance, implantable medical devices must balance security requirements with battery life limitations, while diagnostic equipment must maintain data integrity while handling large volumes of information. This heterogeneity necessitates sophisticated and adaptable security approaches to address the specific vulnerabilities of diverse device types while maintaining the performance requirements for clinical applications.
Machine learning (ML) has emerged as a transformative paradigm in cybersecurity, offering powerful tools to analyze complex network traffic patterns and detect anomalies with high precision [6,7]. Recent advances in ensemble learning techniques, such as XGBoost, have demonstrated significant potential for developing resilient and generalizable threat detection models [8,9]. These approaches leverage sophisticated algorithms to capture intricate relationships within network data, surpassing traditional rule-based detection systems.
Despite significant progress in ML-based security frameworks, several challenges persist, including balancing detection accuracy, computational efficiency, model interpretability, and generalization to evolving threat landscapes [10,11]. Furthermore, minimizing both false positives and false negatives remains a complex optimization problem in security-sensitive applications.
Ensuring robust and efficient detection of malicious attacks remains a critical challenge in cybersecurity. Although ML, particularly ensemble-based approaches, has shown promise in addressing these issues, designing a well-regularized and interpretable model that optimally balances security and computational cost remains an open research problem.
In this work, we develop a high-performance yet generalizable ML framework for detecting malicious attacks. By leveraging a carefully fine-tuned XGBoost classifier [8], our objective is to achieve superior predictive accuracy while maintaining interpretability. We also assess the trade-offs involved in security-sensitive ML applications, where reducing false positives and false negatives is crucial. To enhance transparency and gain deeper insights into the model’s decision-making process, we employ SHAP (SHapley Additive exPlanations) [9] to identify key features driving predictions. Our study provides a comprehensive perspective on the efficacy of advanced ensemble learning techniques in cybersecurity through rigorous evaluation and comparison with a well-regularized logistic regression baseline.
The contributions of this paper can be summarized as follows:
We propose a well-regularized and fine-tuned XGBoost classifier tailored to IoMT environments, achieving 97% accuracy and strong generalization through careful tuning of the regularization parameters (α, λ, γ); a minimal sketch of this pipeline follows the list below.
We introduce a late fusion strategy that combines the complementary strengths of XGBoost and Logistic Regression using max voting, reducing false negatives by 41% compared to XGBoost alone, while achieving 15.6% fewer false positives than Logistic Regression.
We provide a detailed quantitative analysis of the security-cost trade-off in IoMT environments, demonstrating how different model configurations impact operational efficiency versus security coverage—a critical consideration often overlooked in the existing literature.
We employ SHAP (SHapley Additive exPlanations) to analyze feature importance, revealing that packet information content (64.08%) and source address (36.73%) are dominant predictors—insights that enable targeted and interpretable security monitoring.
We validate our framework on the most recent CIC IoMT 2024 dataset containing real-world traffic from 40 IoMT devices across multiple protocols, ensuring our findings have direct practical relevance to contemporary healthcare security challenges.
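To make the first two contributions concrete, the following minimal Python sketch (using the xgboost and scikit-learn packages) combines a regularized XGBoost classifier, a Logistic Regression baseline, and one plausible reading of max-voting late fusion. The placeholder data, hyperparameter values, and the exact fusion rule are illustrative assumptions rather than the paper’s exact configuration.

```python
import numpy as np
from xgboost import XGBClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Placeholder data standing in for the preprocessed CIC IoMT 2024 features
# (e.g., Info, Source, Protocol, Length, Destination); y: 0 = benign, 1 = attack.
rng = np.random.default_rng(42)
X, y = rng.random((1000, 5)), rng.integers(0, 2, 1000)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=42)

# Well-regularized XGBoost: L1 (reg_alpha), L2 (reg_lambda), and
# minimum split loss (gamma); the values shown are assumptions.
xgb = XGBClassifier(n_estimators=200, max_depth=6,
                    reg_alpha=0.1, reg_lambda=1.0, gamma=0.5,
                    eval_metric="logloss")
xgb.fit(X_tr, y_tr)

# Well-regularized Logistic Regression baseline (L2 penalty).
lr = LogisticRegression(penalty="l2", C=1.0, max_iter=1000)
lr.fit(X_tr, y_tr)

# Late fusion by max voting, sketched here as trusting the more confident
# model per instance; the paper's exact voting rule may differ.
p_xgb = xgb.predict_proba(X_te)[:, 1]
p_lr = lr.predict_proba(X_te)[:, 1]
fused = np.where(np.abs(p_xgb - 0.5) >= np.abs(p_lr - 0.5),
                 p_xgb > 0.5, p_lr > 0.5).astype(int)
```

A per-instance confidence rule such as this yields error counts between those of the two base models, which is consistent with the fusion results reported in Section 5.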
The paper is organized as follows: Section 2 outlines the most relevant existing work; Section 3 discusses data collection and preprocessing; Section 4 presents the mathematical modeling of the proposed ML model; Section 5 discusses the results of the proposed ML model and compares them with Logistic Regression as a baseline; Section 6 compares this paper with existing work; finally, Section 7 concludes the paper.
2. Related Work
The increasing frequency of cyberattacks, including Denial-of-Service (DoS) attacks and Advanced Persistent Threats (APTs), has markedly turned the focus of the academic and professional communities towards a more rigorous analysis of threats at the network level [10,12]. Over the past decade, extensive research efforts have been dedicated to unraveling the complexities of cyber threats and advancing the development of cutting-edge detection and mitigation strategies [11,13,14]. Notably, the covert nature of attack traffic often enables these threats to bypass conventional detection mechanisms at the network layer, presenting significant challenges in maintaining network security.
In response to these evolving threats, recent studies have placed significant emphasis on enhancing the security frameworks of IoMT devices, which are increasingly being targeted due to the critical nature of the data they handle [1,2,15]. These devices, which range from diagnostic to therapeutic types, are interconnected through various protocols such as Bluetooth, Wi-Fi, and MQTT, making them susceptible to sophisticated cyberattacks.
The deployment of ML techniques has been pivotal in addressing these vulnerabilities. Advanced ML models, such as Decision Trees, Random Forests, Gradient Boosting, XGBoost, Recurrent Neural Networks (RNNs), and Isolation Forests, have been widely adopted to scrutinize network traffic and detect anomalies indicative of potential security breaches [6]. These techniques harness the power of statistical analysis, feature engineering, and principal component analysis to refine detection accuracy [7].
Chunchun et al. [16] have explored the role of ML in enhancing IoT security. The work comprehensively reviews ML techniques applied to IoT security, focusing on emerging trends and challenges. The paper discusses various supervised and unsupervised learning approaches for detecting threats in IoT networks and highlights their effectiveness in mitigating cyber threats. However, it lacks an in-depth analysis of generative AI and its integration with ML for IoT security [16].
El-Saleh et al. [17] explore the opportunities and challenges of IoMT in healthcare, emphasizing its role in improving patient care through connected devices. They highlight the integration of AI, ML, and Blockchain to enhance security, mitigate cyber threats, and ensure reliable communication in IoMT systems. Their study underscores the importance of digital technologies in managing pandemics and securing healthcare data.
Deep learning (DL) has been widely used to detect cyber threats in IoT networks, particularly against DDoS attacks. The study in [18] examines various DL-based approaches for detecting DDoS attacks, focusing on feature fusion techniques to enhance accuracy. The authors present a detailed evaluation of different models and their performance in real-world scenarios. While the study provides valuable insights into IoT security, its narrow focus on DDoS detection limits its applicability to broader IoT security challenges [18].
Judith et al. [19] propose a deep learning-based IDS for IoMT, focusing on man-in-the-middle attacks. While their work primarily addresses classification accuracy, our research emphasizes the balance of false positives and negatives in security-sensitive applications. Furthermore, our work analyzes the trade-off between security and cost in designing an ML pipeline for detecting malicious attacks.
In the domain of IoMT security and optimization, Rahmani et al. [20] proposed a novel approach inspired by human brain astrocyte cells to map dataflow in IoMT networks and detect defective devices. Their work introduces an astrocyte-flow mapping (AFM) algorithm based on the biological process of phagocytosis to enhance communication efficiency and identify faulty network components. By implementing this biomimetic approach on mesh-based communication infrastructures, they achieved impressive improvements in total runtime (60.85%) and energy consumption (52.38%) compared to conventional methods. While our work focuses on ML techniques for attack detection, their research complements our approach by addressing the fundamental infrastructure-level challenges in IoMT deployments. Their biological inspiration for network optimization contrasts with our data-driven security framework, highlighting the diverse approaches being explored to enhance IoMT reliability and performance [20].
Recently, Alfatemi et al. [21] proposed a neural network model for DDoS attack detection, integrating Gaussian noise to enhance robustness and generalization. Their streamlined architecture ensures rapid processing, with experiments demonstrating its effectiveness in real-world security applications [21].
Reinforcement Learning (RL) has gained attention for its potential to enhance IoT security by enabling adaptive and automated threat detection. The survey in [22] extensively reviews RL-based approaches applied to IoT security, outlining their strengths and limitations. It explores different RL algorithms, their applicability in intrusion detection, and the challenges of deploying RL models in IoT environments. However, the paper focuses mainly on RL without considering hybrid ML approaches or federated learning techniques that could further enhance security in distributed IoT networks [22].
Integrating Deep Reinforcement Learning (DRL) in IoT networks has shown promise in addressing dynamic security threats. Frikha et al. [23] review the application of RL and DRL for IoT security, particularly in wireless IoT systems. The review highlights use cases where DRL-based models improve network security by dynamically adapting to threats in real time. However, the paper focuses mainly on wireless communication and does not explore recent advances in hybrid methodologies or generative AI for IoT security [23].
In more recent work, Jagatheesaperumal et al. [24] provide a comprehensive review of Distributed Reinforcement Learning (DRL) approaches to improve IoT security in heterogeneous and distributed networks. Their work highlights the advantages of DRL in addressing dynamic and evolving security threats while also discussing design factors, performance evaluations, and practical implementation considerations [24].
Zachos et al. [25] developed a hybrid Anomaly-based Intrusion Detection System for IoMT networks using novelty and outlier detection algorithms (OCSVM, LOF, G_KDE, PW_KDE, B_GMM, MCD, and IsoForest) capable of identifying unknown threats while maintaining low computational cost on resource-constrained devices. While both our works address IoMT security through ML, our approach differs by employing XGBoost and Logistic Regression with a late fusion strategy, focusing on optimizing the security-cost trade-off through model interpretability via SHAP analysis. Additionally, our work emphasizes balancing false positives and negatives in security-sensitive applications, whereas their research prioritizes lightweight implementation for IoT devices.
Alamleh et al. [26] proposed a multi-criteria decision-making (MCDM) framework to standardize and benchmark ML-based intrusion detection systems specifically for federated learning in IoMT environments. Their approach differs from ours in that they focus on developing evaluation standards using the fuzzy Delphi method and applying group decision-making techniques to rank different classifiers. Although they found BayesNet optimal and SVM least effective in their federated learning context, our research demonstrates the superiority of XGBoost over traditional models and introduces a late fusion approach to balance security and performance. Unlike their emphasis on standardization across multiple classifiers, our work concentrates on model interpretability through SHAP analysis and optimizing security-relevant metrics like false negative reduction in a non-federated environment.
Fall detection (FD) systems integrated with IoMT have been extensively studied for their role in healthcare and personal safety. Jiang et al. [27] provide a comprehensive review of wearable sensor-based FD techniques, categorizing them into threshold-based, conventional ML, and deep learning methods while summarizing relevant datasets for performance evaluation. In contrast, our work focuses on network-level intrusion detection for IoMT, emphasizing an interpretable fusion of classifiers and the balance between false positives and false negatives rather than activity-specific classification. This distinction highlights the broader application of our approach in securing IoMT environments beyond specific healthcare use cases.
ML-based intrusion detection systems (IDS) are widely explored for securing IoMT environments. Alsolami et al. [28] evaluate ensemble learning methods, including Stacking, Bagging, and Boosting, for cyberattack detection using the WUSTL-EHMS-2020 dataset, finding Stacking to be the most effective. While their work focuses on evaluating classification models, our approach combines XGBoost and Logistic Regression through late fusion to balance false positives and false negatives under asymmetric error costs. This distinction emphasizes our focus on the security-cost trade-off rather than classification accuracy alone.
In more recent work, Alalwany et al. [29] propose a real-time IDS that leverages a stacking ensemble of ML and deep learning classifiers implemented within a Kappa Architecture for continuous data processing. While their approach focuses on maximizing the accuracy of classification-based cyberattack detection, our work emphasizes interpretability and the explicit balancing of false positives and false negatives through late fusion, rather than improving classification accuracy alone.
Recent advances in cybersecurity have been highlighted by Admass et al. [30], who provide a comprehensive overview of current trends, challenges, and future directions in the field. Their work emphasizes the growing importance of Artificial Intelligence (AI) and Machine Learning (ML) in detecting and automating responses to cyber threats. They also discuss the evolving threat landscape and the need for ongoing collaboration among stakeholders to address emerging cybersecurity challenges. This paper provides valuable information on the latest developments in cybersecurity, complementing the foundational works cited in this study.
To summarize, while prior studies have explored ML and DL models, ensemble methods, and reinforcement learning for IoMT security, they often overlook the trade-off between security performance and operational cost, lack interpretability, or do not leverage recent datasets. Our work addresses these gaps by proposing a cost-aware and interpretable intrusion detection framework that combines XGBoost and Logistic Regression via late fusion, enhanced with SHAP analysis and validated on the CIC IoMT 2024 dataset.
5. Results and Analysis
This section presents a detailed evaluation of various models based on key performance metrics, followed by an analysis of the security and operational cost trade-offs.
5.1. Comparison of Performance and Feature Importance
Table 3 presents a comparative analysis of performance metrics for the XGBoost, Logistic Regression (LR), and Late Fusion models. The results indicate that XGBoost outperforms LR in terms of accuracy (0.97 vs. 0.95) and recall (1.00 vs. 0.89), suggesting a superior capability to identify positive instances correctly. However, the Late Fusion model achieves a more balanced performance, with an accuracy of 0.96 and an F1-score of 0.94, demonstrating an improved overall robustness.
Although XGBoost exhibits slightly higher precision than LR (0.96 vs. 0.95), the Late Fusion model achieves the highest precision (0.98), ensuring fewer false positives. Additionally, its recall (0.91) is higher than that of LR but lower than that of XGBoost. These findings suggest that the Late Fusion approach effectively balances precision and recall, making it a more reliable choice for minimizing false negatives while maintaining strong classification performance.
Table 4 presents a comparative analysis of the confusion matrices for the XGBoost, Logistic Regression, and Late Fusion models, highlighting their classification performance in distinguishing between benign and attack instances. The diagonal elements represent correctly classified instances, while the off-diagonal elements indicate misclassifications.
XGBoost correctly classifies 55,884 benign instances and 251,347 attack instances. However, it misclassifies 9446 benign samples as attacks (false positives) and 361 attack samples as benign (false negatives). In contrast, the Logistic Regression model correctly identifies 51,061 benign instances and 251,519 attack instances but misclassifies 14,269 benign samples as attacks and only 189 attack samples as benign.
The Late Fusion model, which combines both classifiers, achieves a balance between the two. The model correctly classifies 53,291 benign instances and 251,495 attack instances, reducing false negatives to 213, which improves over XGBoost (361) while remaining close to Logistic Regression (189). Additionally, it results in 12,039 false positives, maintaining a lower FP rate than Logistic Regression while being slightly higher than XGBoost. This trade-off improves security by reducing false negatives while keeping false positives relatively controlled.
Recall is the most critical metric in our case study when prioritizing security over cost. The LR model misclassified only 189 attack instances as benign, whereas XGBoost misclassified 361, indicating that LR may be more reliable in minimizing false negatives. However, XGBoost had a significantly lower false positive rate, misclassifying only 9446 benign instances as attacks compared to 14,269 for LR. This trade-off suggests that XGBoost may be preferable when reducing unnecessary security interventions is a priority.
The Late Fusion Model balances both aspects by combining the strengths of XGBoost and Logistic Regression. It reduces false negatives to 213, significantly less than XGBoost (361), while remaining close to LR (189), enhancing security by minimizing undetected attacks. Furthermore, it misclassifies 12,039 benign instances as attacks, achieving a false positive rate lower than LR (14,269) but slightly higher than XGBoost (9446).
These results indicate that while LR minimizes false negatives, making it more reliable for detecting attacks, XGBoost reduces false positives, which can lower operational costs by preventing unnecessary security escalations. The Late Fusion Model provides a balanced solution, offering improved security over XGBoost by reducing false negatives while keeping false positives lower than Logistic Regression. This makes it a more robust choice when both security and operational efficiency are critical considerations.
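As a worked check, the per-model error rates discussed above follow directly from the confusion matrices in Table 4. The short sketch below, with the reported counts hard-coded, reproduces the accuracy and the false positive/negative rates.

```python
# Counts (TN, FP, FN, TP) from Table 4: benign = negative, attack = positive.
models = {
    "XGBoost":     (55_884,  9_446, 361, 251_347),
    "LogReg":      (51_061, 14_269, 189, 251_519),
    "Late Fusion": (53_291, 12_039, 213, 251_495),
}
for name, (tn, fp, fn, tp) in models.items():
    acc = (tp + tn) / (tp + tn + fp + fn)
    fpr = fp / (fp + tn)  # share of benign traffic flagged as an attack
    fnr = fn / (fn + tp)  # share of attacks that go undetected
    print(f"{name}: accuracy={acc:.3f}  FPR={fpr:.3f}  FNR={fnr:.5f}")
```

Running this confirms the ordering discussed above: XGBoost has the lowest false positive rate, LR the lowest false negative rate, and the Late Fusion model sits between the two on both axes.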
Figure 3 presents the Receiver Operating Characteristic (ROC) curves for both XGBoost and LR, illustrating their classification performance. The XGBoost model demonstrates a better predictive capacity, as evidenced by its higher AUC (0.93) than LR (0.89). The ROC curve for XGBoost remains consistently above that of LR, indicating a better trade-off between sensitivity and specificity across various classification thresholds. Furthermore, the proximity of the XGBoost curve to the top left corner suggests that it achieves higher true positive rates while maintaining lower false positive rates, confirming its effectiveness in distinguishing between classes. The results confirm that non-linear boosting techniques, such as XGBoost, outperform traditional linear models in capturing complex patterns within the data.
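AUC values of this kind are computed from held-out probability scores; a minimal sketch, reusing the fitted models and test split from the earlier pipeline sketch:

```python
from sklearn.metrics import roc_auc_score

# AUC from each model's held-out attack probabilities.
auc_xgb = roc_auc_score(y_te, xgb.predict_proba(X_te)[:, 1])
auc_lr = roc_auc_score(y_te, lr.predict_proba(X_te)[:, 1])
print(f"XGBoost AUC = {auc_xgb:.2f}, LR AUC = {auc_lr:.2f}")
```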
Figure 4 compares XGBoost and LR, examining the cost and performance (accuracy) relative to the number of iterations.
Figure 4a shows that the cost of XGBoost decreases steadily with iterations for both the training and test sets, indicating efficient learning. In particular, the training and test curves remain close, indicating that the XGBoost model does not overfit.
Similarly, for the LR model (Figure 4b), the cost decreases steadily with iterations for both the training and test sets, indicating efficient learning. The close alignment of the training and test curves suggests that the model does not overfit.
Figure 4c shows that the accuracy of the XGBoost model consistently improves for both the training and test sets. In addition, test accuracy closely follows training accuracy, demonstrating strong generalization (i.e., the model does not overfit). Specifically, the model achieves an accuracy of 97% on both sets and a precision of 97% for the training set and 96% for the test set, confirming that the XGBoost model does not overfit.
Figure 4d shows that the accuracy of the LR model increases smoothly with the number of iterations for the training and test sets. The close alignment of the training and test curves suggests minimal overfitting. Additionally, the test accuracy improves slightly less than that of XGBoost. Specifically, the LR model achieves an accuracy of 95% and a precision of 95% on both sets, confirming its generalization (i.e., its ability to generalize to an unseen dataset).
Overall, XGBoost outperforms LR, showing a better cost reduction and slightly higher accuracy.
Figure 5 shows the SHAP-based feature importance distribution for the XGBoost model. The “Info” feature is the most influential, accounting for 64.08% of the model’s decision-making process. “Source” contributes 36.73%, signifying a high influence. Meanwhile, “Protocol” and “Length” contribute 6.79% and 2.33%, respectively. Finally, “Destination” (0.06%) has minimal impact. These results suggest that “Info” serves as the primary driver of the model’s predictions, with secondary contributions from “Source” and then “Protocol”.
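A minimal sketch of this SHAP computation, assuming the fitted XGBoost model and test split from the earlier pipeline sketch; the feature names are illustrative, and the normalization behind the percentages reported above may differ from the simple share shown here.

```python
import numpy as np
import shap

# Global feature importance as the mean absolute SHAP value per feature.
explainer = shap.TreeExplainer(xgb)
shap_values = explainer.shap_values(X_te)  # one attribution per feature per instance

features = ["Info", "Source", "Protocol", "Length", "Destination"]
importance = np.abs(shap_values).mean(axis=0)
for name, share in sorted(zip(features, 100 * importance / importance.sum()),
                          key=lambda t: -t[1]):
    print(f"{name}: {share:.2f}%")
```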
Appendix A.3 confirms that the “Destination” feature does not contribute to either model (i.e., XGBoost and LR); thus, eliminating this feature may lead to a potential improvement in the performance of the proposed approach.
5.2. Security-Cost Trade-Off Analysis
The security-cost trade-off is a critical consideration in deploying ML models for cybersecurity, particularly in IoMT environments, where patient safety and operational efficiency are major concerns. To quantify this trade-off, we analyze the economic implications of different model configurations based on:
(1) Security cost: the potential damage resulting from undetected attacks (false negatives).
(2) Operational cost: the resources expended investigating false alarms (false positives).
Table 5 presents a comparative cost analysis scenario for our three models. Based on the healthcare cybersecurity literature, we use a cost model in which each false negative incurs a potential security cost of USD 10,000, and each false positive incurs an operational cost of USD 100. The security cost estimate aligns with findings from the Ponemon Institute’s “Cost of a Data Breach Report” [34], which reported average healthcare breach costs of USD 10,100 per record. Similarly, the operational cost reflects the average time (approximately one hour) required for a cybersecurity analyst to investigate a potential threat, with average hourly rates of USD 75–125 [35].
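The scenario in Table 5 reduces to a weighted sum over the confusion-matrix errors; the sketch below reproduces it with the stated per-event weights, which can be recalibrated for deployments with different risk profiles.

```python
# Per-event cost weights as stated in the text (adjustable per deployment).
FN_COST = 10_000  # potential security cost per missed attack (USD)
FP_COST = 100     # analyst time per investigated false alarm (USD)

errors = {"XGBoost": (361, 9_446), "LogReg": (189, 14_269), "Late Fusion": (213, 12_039)}
for name, (fn, fp) in errors.items():
    security, operational = fn * FN_COST, fp * FP_COST
    print(f"{name}: security=${security:,}  operational=${operational:,}  "
          f"total=${security + operational:,}")
```

Under these weights, XGBoost totals USD 4,554,600, Logistic Regression USD 3,316,900, and the Late Fusion model USD 3,333,900, matching the relative ordering discussed below.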
This analysis reveals that despite XGBoost’s higher overall accuracy, its higher false negative rate results in potentially higher security costs. Conversely, while Logistic Regression generates more false positives, its superior ability to minimize false negatives results in lower overall costs when the security impact of missed attacks is appropriately weighted.
The Late Fusion model achieves a balance that closely approaches the cost-effectiveness of Logistic Regression while reducing operational costs by approximately 15.6%. This shows that our fusion approach effectively optimizes the security-cost trade-off by leveraging the strengths of both component models.
It should be noted that actual costs can vary significantly depending on organizational size, data sensitivity, and regulatory environment. For example, HIPAA violations can range from USD 100 to USD 50,000 per violation [36], and operational costs fluctuate with staffing levels and expertise. Our analysis provides a representative scenario that can be calibrated to specific contexts.
These weightings can be adjusted for IoMT deployments with different risk profiles. Higher weights for false negatives might be appropriate in highly critical applications such as intensive care monitoring, where missed attacks could directly impact patient safety. In contrast, operational costs might receive greater emphasis in bandwidth-constrained environments with limited security personnel. Our framework provides a quantitative basis for such configuration decisions, allowing organizations to align their security posture with specific operational constraints and risk tolerances.
5.3. Practical Implementation in IoMT Environments
Our framework is designed for deployment across three tiers of IoMT infrastructure [37]:
Edge level: Lightweight Logistic Regression models are executed directly on IoMT devices or near edge nodes. By focusing on key features (e.g., Info and Source), the model size is reduced by approximately 40% with minimal performance loss.
Gateway level: At intermediate gateways, the XGBoost model performs advanced analysis across traffic from multiple devices, enabling early detection of coordinated threats and isolation of compromised endpoints.
Cloud level: The central security unit runs the entire fusion model, integrating alerts from lower levels. It also provides SHAP-based threat explanations and supports integration via standard healthcare protocols (e.g., HL7, DICOM).
Although the Late Fusion model combines Logistic Regression and XGBoost, it is executed exclusively at the cloud level, where computational resources and latency tolerance are higher. The edge level uses Logistic Regression, whose inference complexity is O(n) in the number of input features n, making it suitable for constrained devices. Gateway-level XGBoost inference is typically O(T·d), where d is the depth of a tree and T is the number of trees, which remains manageable at this level. The cloud-level fusion step introduces only marginal overhead (less than 50 ms in our tests), while its complexity remains O(n) for Logistic Regression plus O(T·d) for XGBoost. This separation ensures a fast response at lower levels while leveraging the benefits of the ensemble in the cloud, thus maintaining real-time suitability.
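The following sketch illustrates one way such tiered dispatch could be wired together, reusing the lr and xgb models from the earlier pipeline sketch; the escalation threshold and the cloud-level fusion rule are assumptions for exposition, not the deployed logic.

```python
def classify(x, escalate_at=0.3):
    """Route one traffic record (a 1-D NumPy feature vector) through the tiers."""
    # Edge: O(n) Logistic Regression screen on the device or edge node.
    p_edge = lr.predict_proba(x.reshape(1, -1))[0, 1]
    if p_edge < escalate_at:  # confidently benign: stop at the edge
        return "benign"
    # Gateway: O(T*d) XGBoost pass over escalated traffic.
    p_gw = xgb.predict_proba(x.reshape(1, -1))[0, 1]
    # Cloud: fuse both scores, here trusting the more confident model.
    p = p_edge if abs(p_edge - 0.5) >= abs(p_gw - 0.5) else p_gw
    return "attack" if p > 0.5 else "benign"
```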
We provide a modular software package with adaptive resource control and HIPAA-compliant logging. In a simulated hospital test with 50 devices, this setup reduced bandwidth usage by 73% and kept threat detection latency below 200 ms, demonstrating suitability for real-world IoMT deployments.
6. Benchmarking Against Existing Approaches
This section compares our proposed approach against the broader landscape of ML methods for IoMT security.
6.1. Comparative Analysis of IoMT Security Methods
Table 6 summarizes the performance metrics of recent IoMT security approaches, including accuracy, F1-scores, and datasets used. The listed methods cover classical machine learning, ensemble techniques, and hybrid models, with results showing varying effectiveness across public and custom datasets.
Table 7 presents a qualitative comparison of recent IoMT security methods in terms of multi-protocol support, interpretability, and regularization (i.e., hyperparameter tuning and generalization capability). It also highlights the key contributions of each method. The comparison reveals varying levels of model transparency and generalization, with our approach offering both interpretability (via SHAP) and regularization (through penalties L1 and L2).
6.2. Methodological Distinctions
Our proposed approach distinguishes itself from existing work through several key methodological innovations. While most recent studies in IoMT security employ either a single algorithm or traditional ensemble methods, our work introduces a complementary fusion framework specifically designed to address the asymmetric costs of classification errors in security contexts.
Judith et al. [19] employ a multilayer perceptron with PCA for feature reduction, achieving 96.39% accuracy but lacking model transparency and focusing primarily on man-in-the-middle attacks rather than comprehensive threat detection. Similarly, Alsolami et al. [28] evaluate various ensemble methods, finding the Stacking ensemble to be the most effective at 98.88% accuracy, but do not address the crucial balance between false positives and negatives that impacts operational deployment.
Our approach implements a novel late fusion framework that maintains interpretability through SHAP analysis, identifying the specific contribution of each feature to the final decision. Unlike Jagatheesaperumal et al. [24], who focus primarily on reinforcement learning for dynamic adaptation, our fusion approach provides more immediate practical benefits by quantitatively balancing the complementary strengths of different models to optimize both security coverage and operational efficiency.
Rahmani et al. [20] take a fundamentally different approach with their biomimetic astrocyte-flow mapping algorithm, focusing on infrastructure-level optimization rather than threat detection. While their method achieves significant improvements in runtime (60.85%) and energy consumption (52.38%), it addresses challenges different from our security-focused framework and lacks the specific attack detection capabilities provided by our approach.
6.3. Security-Cost Trade-Off Analysis
A distinctive contribution of our work is the explicit analysis of the security–cost trade-off in ML-based threat detection systems. While Zachos et al. [25] focus on computational efficiency for resource-constrained devices, and Alamleh et al. [26] develop a multi-criteria decision-making framework for federated learning, neither addresses the operational implications of false positive/negative rates.
Our analysis reveals that Logistic Regression is preferable in environments where minimizing false negatives is paramount, while XGBoost provides advantages where reducing false positives is critical. This nuanced perspective is largely absent in the existing literature, which typically focuses on accuracy without considering the asymmetric costs of different types of errors in security applications.
6.4. Model Interpretability and Feature Optimization
Unlike many approaches that prioritize accuracy over transparency, our work emphasizes interpretability through SHAP analysis. This enables practical insights, such as identifying noncontributory features, contrasting with “black-box” approaches common in the recent literature.
Our SHAP analysis identifies the “Destination” feature as non-contributory to model performance, suggesting opportunities for feature reduction. This finding distinguishes our work from approaches such as Dadkhah et al. [2], who use Random Forest and Logistic Regression with a predefined feature set without analyzing feature contributions/importance.
6.5. Performance and Dataset Considerations
Our comprehensive evaluation includes precision, recall, F1-score, and false positive/negative rates, providing a more complete picture of model performance in security contexts. Our use of the CIC IoMT 2024 dataset with multi-protocol support enhances generalizability compared to studies using more specialized or simulated datasets.
Notably, Alalwany et al. [29] use the ECU-IoHT dataset, which lacks the IoMT-specific threats and protocol diversity found in newer datasets such as CIC IoMT 2024. Similarly, Alamleh et al. [26] focus on federated learning environments, which introduce challenges different from our centralized model approach.
6.6. Discussion
In the context of this research, we recommend using Logistic Regression in environments where minimizing false negatives is paramount, even at the potential cost of increased false positives. This is particularly relevant in scenarios where failing to detect a threat carries significant risks, such as in critical healthcare infrastructure where patient safety could be compromised.
In contrast, XGBoost is advantageous in environments where reducing false positives is critical, as this can lead to tangible cost reductions, such as minimizing the need for manual verification of benign alerts. This consideration is especially important in large-scale IoMT deployments where limited security personnel must efficiently assess potential threats.
By leveraging the strengths of individual models, our late fusion approach offers a robust solution to detect critical threats while mitigating false positives, thus achieving a balance between security efficacy and operational cost. This balance is crucial in practical deployments, where both security and efficiency must be optimized simultaneously.
One of the key advantages of SHAP (SHapley Additive exPlanations) analysis is its ability to reveal the contribution of individual features to the model’s predictions. In this study, SHAP values indicate that the “Destination” feature contributes minimally to the predictive performance of both the XGBoost and Logistic Regression models. Based on this finding, the feature can be safely removed from the training data. Eliminating such non-informative features can reduce the model’s complexity, lower computational costs, and improve training speed, all without negatively impacting detection accuracy. This also enhances the interpretability of the final model, as it relies only on the most relevant features.
In addition to feature selection, balancing the dataset is another important factor that can influence model performance, particularly in security-related tasks where class imbalance is common. Techniques such as SMOTE (Synthetic Minority Oversampling Technique) [38] can be used to generate synthetic examples for minority classes, thus reducing bias and improving the model’s ability to detect rare attack instances. Re-training the XGBoost and Logistic Regression models on a balanced dataset may further enhance the detection framework, improving both precision and recall for underrepresented attack categories.
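A minimal sketch of such rebalancing, assuming the imbalanced-learn package and the training split from the earlier pipeline sketch:

```python
from imblearn.over_sampling import SMOTE

# Oversample the minority class with synthetic examples, then retrain
# both models before re-evaluating their FP/FN rates.
X_bal, y_bal = SMOTE(random_state=42).fit_resample(X_tr, y_tr)
xgb.fit(X_bal, y_bal)
lr.fit(X_bal, y_bal)
```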
Future work should explore the adaptation of our framework to emerging threat vectors and protocols in IoMT environments and investigate the potential of federated learning approaches to enhance privacy while maintaining detection performance across distributed healthcare networks.
7. Conclusions
This paper presents a comprehensive approach to addressing cybersecurity challenges in network environments using advanced machine learning techniques. Our research demonstrates the effectiveness of ensemble learning, specifically XGBoost, in detecting malicious attacks with high accuracy while maintaining model interpretability. The comparative analysis against a well-regularized Logistic Regression baseline reveals essential insights into the performance trade-offs between these approaches.
XGBoost demonstrated superior overall performance, with an accuracy of 0.97 and near-perfect recall (1.00). However, our analysis revealed that Logistic Regression achieved fewer false negatives in absolute terms (189 compared to XGBoost’s 361) despite its lower overall accuracy (0.95) and recall (0.89), a critical consideration in security applications.
To take advantage of the complementary strengths of both approaches, we introduced a late fusion model based on max voting, which achieved a balanced performance profile with an accuracy of 0.96, precision of 0.98, and recall of 0.91. This hybrid approach significantly reduced false negatives compared to XGBoost while maintaining fewer false positives than Logistic Regression, offering a practical compromise between security assurance and operational efficiency.
The SHAP analysis provided valuable insights into the decision-making process, revealing that the “Info” feature contributed most significantly (64.08%) to predictions, followed by “Source” (36.73%) and “Protocol” (6.79%). This transparency improves trust in the model and provides actionable intelligence for security practitioners.
Our findings suggest that while non-linear boosting techniques like XGBoost generally outperform traditional linear models in capturing complex patterns, the optimal approach for security-sensitive applications may involve combining multiple models to balance precision and recall requirements. Future work should focus on extending these methods to address evolving threat landscapes and on exploring additional interpretability techniques to further enhance the transparency of machine learning-based security systems. Furthermore, investigating the application of these approaches to other domains beyond IoMT could provide valuable insight into their generalizability and broader utility in cybersecurity.