Article

ML-RASPF: A Machine Learning-Based Rate-Adaptive Framework for Dynamic Resource Allocation in Smart Healthcare IoT

Department of Electrical and Software Engineering, University of Calgary, Calgary, AB T2N 1N4, Canada
Algorithms 2025, 18(6), 325; https://doi.org/10.3390/a18060325
Submission received: 25 April 2025 / Revised: 21 May 2025 / Accepted: 23 May 2025 / Published: 29 May 2025

Abstract

The growing adoption of the Internet of Things (IoT) in healthcare has led to a surge in real-time data from wearable devices, medical sensors, and patient monitoring systems. This latency-sensitive environment poses significant challenges to traditional cloud-centric infrastructures, which often struggle with unpredictable service demands, network congestion, and end-to-end delay constraints. Consistently meeting the stringent quality-of-service (QoS) requirements of smart healthcare, particularly for life-critical applications, requires new adaptive architectures. We propose ML-RASPF, a machine learning-based framework for efficient service delivery in smart healthcare systems. Unlike existing methods, ML-RASPF jointly optimizes latency and service delivery rate through predictive analytics and adaptive control across a modular mist–edge–cloud architecture. The framework formulates task provisioning as a joint optimization problem that aims to minimize service latency and maximize delivery throughput. We evaluate ML-RASPF using a realistic smart hospital scenario involving IoT-enabled kiosks and wearable devices that generate both latency-sensitive and latency-tolerant service requests. Experimental results demonstrate that ML-RASPF achieves up to 20% lower latency, 18% higher service delivery rate, and 19% reduced energy consumption compared to leading baselines.

1. Introduction

The proliferation of Internet of Things (IoT) devices has accelerated rapidly in recent years, with projections estimating nearly 125 billion devices by 2032 [1]. This exponential growth, coupled with advancements in quantum computing and the increasing convergence of IoT and artificial intelligence (AI), is expected to generate unprecedented volumes of heterogeneous data. One of the most impactful domains leveraging IoT is smart healthcare, where intelligent systems support personalized care, continuous monitoring, and timely interventions. In such environments, connected medical devices and wearables operate autonomously to facilitate real-time data exchange and decision making. Unlike traditional session-based models, IoT communication is inherently content-centric, where devices request and consume data or services directly from the network without persistent connectivity to specific service hosts [2]. Latency-sensitive and compute-intensive healthcare use cases, such as remote diagnostics, real-time monitoring, and emergency response, can benefit significantly from IoT’s sensing capabilities and intelligent service provisioning [3]. However, the explosive growth in IoT-enabled healthcare services introduces major challenges in ensuring timely, scalable, and efficient resource provisioning. This calls for advanced, machine learning (ML)-based frameworks capable of adaptive and optimized service delivery tailored to the unique demands of smart healthcare systems.
The emerging vision of the Internet of Services emphasizes the need to collect, analyze, and respond to data in a personalized and context-aware manner [4], which poses serious challenges in smart healthcare environments due to the limited capacity of IoT devices such as wearables and monitoring sensors [3]. Although cloud computing has traditionally addressed scalability through centralized processing, its high end-to-end latency and privacy concerns make it unsuitable for delay-critical applications such as real-time patient monitoring, emergency alerts, or AR-assisted diagnostics [5]. Edge computing offers a partial solution by enabling local analytics and stream mining closer to the data source, but alone it cannot meet the demands of heterogeneous services with fluctuating latency and throughput needs. To address these gaps, mist computing further reduces delay through lightweight, near-device processing, enabling context-aware responsiveness in mobile and critical care scenarios [6].
The motivation of this work is best illustrated through a realistic smart hospital scenario where multiple IoT-enabled services operate simultaneously across a mist–edge–cloud infrastructure. Consider a situation where three services are concurrently active: (i) an ECG streams continuous cardiac data to an edge server, requiring ultra-low transmission latency (e.g., <100 ms) and a steady delivery rate of 3 Mbps [3]; (ii) a public health display delivers real-time video content in a hospital lobby, requiring a delivery rate of over 10 Mbps with moderate latency tolerance [7]; and (iii) a diagnostic kiosk fetches on-demand lab records via a hospital portal, tolerating up to 500 ms delay but needing consistent delivery around 5 Mbps [8]. These services traverse network links with constrained bandwidth. For example, an inter-zone edge-to-cloud link with a capacity of 15 Mbps may concurrently serve both ECG and video streams. If both are routed to the central cloud, the compounded queuing delay and transmission overhead can cause the ECG stream to exceed its 100 ms threshold, potentially compromising patient safety. In contrast, intelligent routing through edge nodes with local decision-making can meet the latency requirements while relieving bandwidth stress on core cloud links. However, most existing service provisioning methods handle such workloads using static scheduling or heuristic policies [9,10]. These models often prioritize either latency or throughput independently and fail to adjust to real-time network fluctuations or service priority shifts. Such rigidity leads to service degradation under load surges, network failures, or emergency conditions.
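To make the link-capacity arithmetic in this scenario concrete, the following sketch checks the shared 15 Mbps edge-to-cloud link against the example workloads. The M/M/1-style delay model and the 12 kbit (roughly 1500-byte) packet size are our illustrative assumptions, not part of the scenario.

```python
# Feasibility sketch for the shared 15 Mbps edge-to-cloud link in the smart
# hospital scenario. Service rates come from the text; the M/M/1 delay model
# and the ~1500-byte packet size are simplifying assumptions for illustration.

LINK_CAPACITY_MBPS = 15.0
PACKET_KBITS = 12.0  # ~1500-byte packets (assumed)

def mean_delay_ms(offered_mbps):
    """M/M/1 mean sojourn time 1/(mu - lambda), converted to milliseconds."""
    mu = LINK_CAPACITY_MBPS * 1000 / PACKET_KBITS   # service rate (packets/s)
    lam = offered_mbps * 1000 / PACKET_KBITS        # arrival rate (packets/s)
    if lam >= mu:
        return float("inf")  # queue is unstable: demand exceeds capacity
    return 1000.0 / (mu - lam)

print(mean_delay_ms(3.0 + 10.0))        # ECG + video: link is stable
print(mean_delay_ms(3.0 + 10.0 + 5.0))  # adding the 5 Mbps kiosk: overload
```

Routing all three flows over the same link pushes demand (18 Mbps) past capacity, so queuing delay grows without bound; offloading even one flow to an edge node restores stability, which is exactly the adaptive routing decision motivated above.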
Most existing service provisioning approaches in edge–cloud IoT ecosystems suffer from three key limitations: (i) they treat latency and service delivery rate independently, optimizing one at the expense of the other rather than addressing both jointly for QoS-critical scenarios; (ii) they rely on static or heuristically tuned allocation models that lack adaptability under dynamic and bursty healthcare traffic [10,11]; and (iii) they overlook predictive learning and reinforcement-based control, which are essential for proactive decision-making in time-sensitive and resource-constrained environments. As illustrated in Figure 1, a dynamic mist–edge–cloud strategy is needed that adaptively chooses optimal execution paths based on predicted traffic, latency sensitivity, and available capacity. This motivates the need for ML-RASPF: a unified learning framework that jointly optimizes latency and delivery rate under realistic smart healthcare conditions.
Recent rate-adaptive methods such as those by Du et al. [9], Mahapatra et al. [8], and Wen et al. [7] have introduced optimization-based or queue-aware scheduling for IoT service allocation. However, these approaches often target latency-tolerant services and fall short in supporting real-time use cases such as ECG monitoring, emergency response, or clinical decision support, where service disruptions or delays can be life-threatening. Moreover, these methods do not jointly model latency and delivery rate in a unified framework, leading to inefficiencies during network surges or unexpected traffic spikes. Furthermore, many existing solutions impose continuous load monitoring or centralized scheduling overheads that do not scale effectively in distributed mist–edge–cloud environments, especially where edge nodes are resource-limited [12,13]. The lack of modular, learning-enhanced strategies also makes it difficult to adapt to dynamic service demands in real time without incurring excessive computation or communication costs. These shortcomings present a need for a unified and lightweight ML framework that can predict traffic variation, dynamically adapt service rates, and allocate resources across hierarchical layers with minimal latency. An intelligent provisioning system that jointly optimizes latency and service delivery rate while maintaining scalability and robustness under constrained settings remains an open research challenge in smart healthcare IoT.
To address these limitations, we present ML-RASPF, an ML-based framework designed to enable intelligent, rate-adaptive, and latency-aware service provisioning in smart healthcare environments. As illustrated in Figure 1, ML-RASPF operates across mist, edge, and cloud layers, dynamically adjusting service paths and resource allocation based on real-time service demands and network conditions. The framework integrates supervised learning (LSTM) for traffic forecasting and reinforcement learning (RL) for rate adaptation, enabling predictive, context-aware decisions across the hierarchy. Edge nodes collaborate vertically to ensure that both latency-sensitive and throughput-heavy services meet their respective QoS constraints. This work proposes and evaluates a unified service provisioning framework that jointly optimizes latency and delivery rate, addressing critical gaps in adaptability, predictive control, and real-time responsiveness across heterogeneous IoT workloads. ML-RASPF delivers robust performance under dynamic loads while minimizing energy and bandwidth usage. Extensive simulations in a realistic smart hospital scenario validate its superiority over existing approaches. The key contributions of this work are as follows:
  • We present ML-RASPF, a novel hybrid mist–edge–cloud framework for rate-adaptive and latency-aware IoT service provisioning in smart healthcare systems.
  • We formulate the service provisioning problem as a joint optimization model that integrates both latency constraints and service delivery rates. This formulation enables intelligent, QoS-aware resource allocation across heterogeneous IoT environments.
  • We propose a modular, ML-based algorithmic suite combining supervised learning for traffic prediction and RL for real-time service rate adaptation.
  • We evaluate the proposed framework in EdgeCloudSim using realistic smart healthcare workloads; the results show that ML-RASPF significantly outperforms state-of-the-art rate-adaptive methods, reducing latency, energy consumption, and bandwidth utilization while improving the service delivery rate.
The remainder of this paper is organized as follows. Section 2 reviews the related work on IoT service provisioning, edge–cloud architectures, and ML applications in healthcare. Section 3 introduces the proposed rate-adaptive framework, detailing its layered architecture and operational workflow. Section 4 formulates the service provisioning problem as a joint optimization model and describes the ML-based algorithm used for resource allocation. Section 5 presents the experimental setup and performance evaluation results based on a smart healthcare scenario. Finally, Section 6 concludes the paper and outlines potential directions for future research.

2. Related Work

The increasing deployment of IoT systems in smart healthcare has triggered a wave of research into latency-aware and QoS-based service provisioning techniques across edge–cloud environments. Existing works have proposed diverse strategies ranging from heuristic scheduling to ML-based optimization. Each category brings unique strengths; however, they often operate in isolation, targeting either latency or throughput, but rarely both. Moreover, few frameworks adopt holistic architectures that include mist, edge, and cloud layers. To contextualize the contributions of ML-RASPF, this section categorizes prior research into four major themes.

2.1. Heuristic and Optimization-Based Resource Allocation

Traditional service provisioning techniques in IoT and edge–cloud environments have predominantly relied on static heuristics or mathematical optimization models. These methods aim to reduce latency and optimize resource utilization under fixed constraints, but they often fail to adapt in real time to fluctuating service demands and network dynamics. Azmi et al. [14] proposed a tensor-based resource mapping mechanism to assign service requests to cloud and edge servers using predefined latency thresholds. Although the method effectively reduces bandwidth overhead, its reliance on tensor decompositions introduces computation latency, limiting scalability in dynamic environments. Similarly, Centofanti et al. [12] utilized a mixed integer linear programming model to optimize crowd-sensing-based service placement. Although their model demonstrates efficient workload distribution under static conditions, it lacks predictive adaptability and incurs significant computation time for larger systems.
Other linear programming approaches have tackled multi-objective service optimization. For example, Li et al. [15] formulated service provisioning as a latency-bounded function chain placement problem. However, the model neglects real-time delivery rate optimization, making it unsuitable for data-intensive services. Ahmed et al. [16] addressed deployment efficiency in fog computing infrastructures for healthcare, proposing a resource-aware placement scheme that reduces operational cost. However, the model operates under fixed network assumptions and does not take advantage of learning techniques, limiting its effectiveness under unpredictable service bursts.
These optimization and heuristic solutions offer analytical clarity and performance guarantees but suffer from three major drawbacks: (i) lack of real-time adaptability, (ii) inability to jointly model latency and throughput constraints, and (iii) reliance on static thresholds that fail to generalize across heterogeneous workloads. These limitations require more adaptive, learning-enabled architectures that can anticipate traffic surges, react dynamically to network variations, and jointly optimize key QoS parameters.

2.2. Machine Learning and Deep Learning-Based Service Provisioning

Recent advancements in ML and Deep Learning (DL) have led to intelligent service provisioning that aims to achieve better scalability and adaptiveness, especially in heterogeneous IoT environments. Najim et al. [17] proposed a DL-based service placement framework for vehicular networks, leveraging convolutional neural networks for classification of service types. However, this model suffers from reduced accuracy due to training data limitations and lacks rate-adaptive logic. Ji et al. [18] developed a collaborative cloud-edge framework using small deep reinforcement learning models. It improves task offloading performance through hierarchical model compression; however, it introduces significant coordination overhead and does not directly target service rate or latency optimization.
Shang et al. [19] presented a DL-based video adaptation mechanism in smart city environments to maintain service quality under varying bandwidth. Their method does not generalize to emergency healthcare services. Similarly, Fei et al. [20] explored federated learning (FL) for privacy-aware offloading in healthcare devices but ignored the latency–throughput trade-off, which is essential for healthcare workloads. These efforts demonstrate the potential of ML-based methods to improve decision-making but generally suffer from three key issues: (i) limited focus on joint latency and delivery rate optimization, (ii) lack of cross-layer collaboration (i.e., mist–edge–cloud), and (iii) high training complexity without real-time responsiveness.

2.3. Latency-Aware and Fog–Edge–Cloud Architectures

To mitigate latency and reduce bandwidth bottlenecks in IoT systems, several frameworks have adopted fog and edge computing architectures. These systems push computation closer to the data source, thereby reducing reliance on cloud infrastructure. Asif et al. [13] proposed a latency-sensitive model for healthcare, incorporating fog, mist, and cloud layers. Their approach suffers from scalability limitations and introduces network overhead due to redundant processing. Mishra et al. [21] introduced a hierarchical scheduling policy for delay-sensitive tasks; however, this scheme lacks predictive traffic models. Likewise, Tripathy et al. [6] proposed an SDN-enabled fog framework focused on architecture flexibility; however, they did not model rate adaptiveness or energy-aware behavior, both critical in healthcare IoT.
Wen et al. [7] proposed JANUS, a stream prioritization and latency-aware scheduling system for IoT streaming; however, it focuses only on bandwidth improvements and lacks proactive adaptation. Similarly, Mahapatra et al. [8] proposed an energy-efficient task offloading mechanism for fog–cloud setups; however, their model is tuned primarily for energy and load balance and does not support dynamic service-specific QoS demands. These architectures highlight the benefits of multi-tier deployment; however, they lack holistic QoS optimization, particularly under real-time, dynamic workloads.

2.4. Healthcare Resource Allocation Approaches

Several works have targeted healthcare-specific IoT service delivery; however, many fail to address optimization of latency and delivery rate under dynamic network and service conditions.
Banitalebi et al. [3] propose a hybrid architecture for secure and energy-efficient healthcare monitoring; however, they primarily focus on security and encryption overhead. Fei et al. [20] introduced an FL-based offloading model for healthcare, improving privacy; however, it lacks the flexibility to adapt to various latency and delivery demands. Najim et al. [17] present a DL-based framework for continuous patient monitoring; however, it lacks scalability and rate adaptiveness, limiting its performance under fluctuating workloads. Furthermore, Ahmed et al. [16] employ IOTA distributed ledgers to support data integrity in fog-based systems; however, their work does not engage with predictive or rate-adaptive provisioning strategies necessary for workload balancing in high-traffic healthcare settings. Although these studies offer domain-specific contributions, they typically optimize isolated performance objectives (e.g., privacy, energy, latency), leaving a critical gap in unified frameworks that jointly optimize multiple QoS parameters.
In summary, although recent advances in resource allocation, traffic prediction, and edge-enabled computing have yielded significant progress, most existing approaches do not meet the compound requirements of modern smart healthcare, namely, the need for real-time responsiveness, joint latency–throughput optimization, and scalable deployment across hierarchical IoT layers. Many systems rely on static configurations, centralized schedulers, or domain-specific assumptions that limit their adaptability under volatile healthcare workloads. Table 1 summarizes key features of recent techniques compared to ML-RASPF, highlighting that our framework uniquely supports joint latency–rate optimization and outperforms previous methods in adaptability, predictive control, and architectural coverage. In contrast, ML-RASPF bridges this critical research gap by offering a modular, learning framework that integrates supervised forecasting and reinforcement-learning-based rate control. Its ability to proactively manage traffic surges and dynamically allocate resources across mist, edge, and cloud layers positions it as a scalable and robust solution for next-generation healthcare service provisioning.
These gaps motivate the development of ML-RASPF, an ML-based, rate-adaptive framework designed to jointly optimize latency and service delivery rate across mist–edge–cloud layers. ML-RASPF uses LSTM-based traffic forecasting and RL-based rate control in a convex optimization structure, which addresses limitations of prior models and delivers a lightweight, scalable, and QoS-aware provisioning mechanism for real-time healthcare IoT services.

3. Optimal Service Provisioning Framework

The proposed ML-RASPF framework is based on distributed data processing principles to enhance resource utilization and ensure QoS in latency-sensitive smart healthcare environments. As shown in Figure 2, the architecture consists of five coordinated layers that collaboratively manage end-to-end service provisioning across mist, edge, and cloud infrastructures. To illustrate its real-world applicability, we present a smart healthcare use case focused on continuous patient monitoring and emergency response. In this scenario, IoT-enabled medical devices, such as wearable health trackers, bedside monitors, and mobile diagnostic units, collect real-time physiological data from patients across various hospital zones or remote care settings. These data include metrics such as heart rate, blood pressure, oxygen saturation, and mobility patterns, all of which require timely and reliable processing to enable prompt medical interventions.
A critical requirement in such a healthcare setup is the ability to perform localized, on-demand analytics with minimal latency. For instance, edge cloudlets deployed within hospital premises can process time-sensitive alerts, such as detecting arrhythmias or fall events, before forwarding relevant summaries to central hospital servers or cloud-based electronic health record (EHR) systems. Data from non-critical or latency-tolerant services, such as periodic wellness logs or historical diagnostic trends, can be cached and analyzed at the cloud layer when needed. The proposed framework supports energy-efficient sensing, data transmission, and real-time service delivery by adaptively allocating resources across its layers. This layered orchestration not only ensures low-latency responses for critical healthcare events but also maintains consistent service delivery rates for ongoing data streams. By balancing delivery speed and throughput across mist, edge, and cloud layers, the framework effectively fulfills QoS requirements for both time-sensitive and bandwidth-intensive smart healthcare services.

3.1. Customized Mist–Edge–Cloud Framework

The framework comprises five layers (Perception, Mist, Edge, Central Cloud, and Cloud Application) that work collaboratively to provide latency-aware and rate-adaptive service provisioning in smart healthcare systems. A depiction of the framework is shown in Figure 2.

3.1.1. Perception Layer

The lowest layer consists of IoT-enabled medical sensors, such as heart rate monitors, pulse oximeters, ECG patches, and ambient condition sensors, and is responsible for data acquisition. These sensors operate at patient bedsides, in ICUs, or in remote home-care settings, collecting vital signs and contextual information in real time. The captured data are forwarded to the mist layer for low-latency processing.

3.1.2. Mist Layer

Mist nodes operate at the edge of perception, adding lightweight analytics and decision-making capabilities to reduce data load and minimize latency. For example, if a patient’s heart rate exceeds critical thresholds, the mist node can instantly trigger alerts to nearby healthcare staff without waiting for cloud-level processing. The mist layer is also mobility-aware and maintains continuous service flow even when patient devices move across network zones.

3.1.3. Edge Computing Layer

This layer consists of hospital-deployed edge servers and mobile cloudlets with moderate-to-high computational capabilities. It processes latency-sensitive services like anomaly detection in ECG signals or medication reminders. The edge also caches temporary data and forwards non-urgent requests to the central cloud. It supports container-based virtualization and uses heterogeneous connectivity (e.g., Wi-Fi, 5G, 6G) to ensure responsive service provisioning.

3.1.4. Central Cloud Layer

The central cloud manages long-term analytics, large-scale training of predictive models, and population-level trend analysis. It stores historical patient data and supports computationally intensive operations such as ML-based risk stratification, disease progression modeling, and cross-site healthcare analytics. It also ensures system-wide fault tolerance and service continuity.

3.1.5. Cloud Application Layer

The topmost layer presents healthcare dashboards, analytics interfaces, and visualization tools for medical personnel. It supports telemedicine consultations, alerts for emergency services, and the integration of personalized health insights into the hospital workflow. This layer is also the point of orchestration for deploying new healthcare applications across the lower tiers.

3.2. Security and Privacy Consideration

ML-RASPF focuses on optimizing latency and service delivery rate, but smart healthcare deployments must also ensure data security and patient privacy. To support secure and privacy-preserving service provisioning, the framework can be extended with FL or differential privacy mechanisms to ensure that sensitive patient data are processed locally at the edge or mist layer. Furthermore, blockchain-based technologies, such as the privacy-preserving blockchain P2B-Trace [22], can be integrated to enable tamper-proof service logging and decentralized trust management. These enhancements would allow ML-RASPF to ensure both operational performance and regulatory compliance in highly sensitive healthcare environments.
The proposed framework can be extended with security-aware ML techniques to further strengthen the resilience of ML-RASPF against malicious behavior and unauthorized access. In particular, integrating differential privacy into LSTM-based traffic forecasting can help prevent inference attacks and protect sensitive healthcare patterns. Moreover, FL can be adopted to collaboratively train prediction models across edge nodes without exposing raw data, mitigating risks from data exfiltration. Future versions of ML-RASPF may also incorporate anomaly detection mechanisms to identify credential abuse (e.g., traffic mimicry or credential sharing) based on deviations from established traffic behaviors.

3.3. Smart Healthcare with Emergency and Routine Services

As illustrated in Figure 3, the ML-RASPF framework enables differentiated treatment of services. High-priority emergency alerts, such as cardiac events, are handled entirely at the mist or edge layers, ensuring sub-second responsiveness. Meanwhile, routine services like patient check-in logs or diet monitoring are processed at the cloud layer. The framework dynamically adapts delivery paths, execution points, and resource allocation based on latency requirements and real-time network states, making it ideal for robust and scalable smart healthcare IoT deployments.

4. Analytical Framework for Heterogeneous Service Provisioning

To address the challenges of latency-sensitive and dynamically evolving IoT services in smart healthcare, we develop an analytical framework for adaptive service provisioning across mist, edge, and cloud layers. The objective is to jointly optimize delivery rate and latency by allocating bandwidth and computing resources based on service priority, link conditions, and predicted demand. Unlike conventional static models, our framework supports heterogeneous healthcare workloads, ranging from real-time telemetry to elastic background services, by integrating ML modules with a mathematically grounded optimization engine. This model enables the system to dynamically route, throttle, and prioritize services with respect to utility and delay constraints, ensuring quality of service (QoS) across all tiers of the architecture.
Figure 3 illustrates the operational flow of the proposed ML-RASPF framework, which incorporates dynamic decision-making, feedback loops, and exception handling. IoT data sources (e.g., wearables and sensors) initiate service requests evaluated for predicted load conditions. If forecast demand exceeds available capacity, the system triggers early adaptation via edge resource provisioning; otherwise, service paths are optimized centrally. A QoS classifier categorizes incoming requests into latency-sensitive or bandwidth-intensive classes, enabling priority-based control. The RL agents then adjust delivery rates based on real-time performance metrics. The optimization engine jointly considers utility and delay, guiding service allocation across mist, edge, and cloud layers. A feedback mechanism continuously monitors service performance: exception handling is activated if quality falls below thresholds, and rate policies are re-optimized. This architecture ensures robust, context-aware, and adaptive service provisioning across distributed healthcare infrastructures.
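The feedback loop described above can be sketched as follows. The `ServiceRequest` fields, the delay budgets, and the AIMD-style (additive-increase/multiplicative-decrease) update rule are illustrative assumptions standing in for ML-RASPF's RL agent, not the paper's actual policy.

```python
# Minimal sketch of the classify -> monitor -> adapt loop described above.
# Class names, thresholds, and the AIMD-style update rule are illustrative
# stand-ins for the RL agent; they are not the actual ML-RASPF policy.

from dataclasses import dataclass

@dataclass
class ServiceRequest:
    name: str
    latency_sensitive: bool   # set by the QoS classifier
    rate_mbps: float          # current delivery rate
    measured_delay_ms: float  # feedback from performance monitoring

def adapt_rate(req: ServiceRequest, delay_budget_ms: float) -> float:
    """One control step: back off when the delay budget is violated
    (more aggressively for latency-sensitive services), otherwise
    probe additively for more bandwidth."""
    if req.measured_delay_ms > delay_budget_ms:
        req.rate_mbps *= 0.5 if req.latency_sensitive else 0.8
    else:
        req.rate_mbps += 0.5
    return req.rate_mbps

ecg = ServiceRequest("ecg", latency_sensitive=True,
                     rate_mbps=3.0, measured_delay_ms=140.0)
print(adapt_rate(ecg, delay_budget_ms=100.0))  # budget violated: back off
```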

4.1. Problem Overview and Healthcare Service Requirements

In a typical smart healthcare system, service consumers (e.g., patients, caregivers, clinicians) interact with distributed service providers through a network comprising mist nodes, edge devices, and cloud infrastructure. The goal is to ensure that healthcare services are delivered at appropriate rates while meeting strict QoS constraints, especially for latency-critical applications such as emergency alerts or continuous glucose monitoring.
Let $S = \{s_1, s_2, \ldots, s_n\}$ denote the set of healthcare services, and let $C_s = \{c_{s,1}, c_{s,2}, \ldots, c_{s,n_s}\}$ denote the set of consumers of each service $s \in S$. Each consumer is connected to the service provider via a set of links $T = \{t_1, t_2, \ldots, t_m\}$ with associated capacities $C = \{c_1, c_2, \ldots, c_m\}$. The objective is to allocate network resources such that each consumer $i$ receives service $s$ at an optimal delivery rate $g_{s,i}$, measured in Mbps, while satisfying both rate and latency constraints.
Services are broadly categorized into two types: (i) latency-sensitive services, such as real-time ECG monitoring or fall detection, which require immediate data processing and response, and (ii) latency-tolerant services, such as access to EHRs or non-urgent data synchronization, which can tolerate higher latency and lower bandwidth. This heterogeneous service demand motivates the development of a rate-adaptive and latency-aware provisioning model that dynamically adjusts service delivery rates based on network conditions, service criticality, and user context. In this work, we aim to jointly optimize the service delivery rate and latency using a utility-based convex optimization framework, further enhanced by ML techniques for traffic prediction and adaptive control.
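As a minimal sketch of how such heterogeneous QoS targets might be encoded (the field names are our own, and the numeric targets reuse the example figures from the Introduction):

```python
# QoS profiles for the service classes, reusing the example figures from the
# smart hospital scenario. Field names and structure are illustrative only.

SERVICE_PROFILES = {
    "ecg_stream":  {"class": "latency-sensitive",  "max_delay_ms": 100,
                    "min_rate_mbps": 3.0},
    "lobby_video": {"class": "bandwidth-intensive", "max_delay_ms": None,
                    "min_rate_mbps": 10.0},
    "ehr_kiosk":   {"class": "latency-tolerant",    "max_delay_ms": 500,
                    "min_rate_mbps": 5.0},
}

def meets_qos(profile: dict, delay_ms: float, rate_mbps: float) -> bool:
    """Check a measured (delay, rate) pair against a service profile."""
    delay_ok = (profile["max_delay_ms"] is None
                or delay_ms <= profile["max_delay_ms"])
    return delay_ok and rate_mbps >= profile["min_rate_mbps"]

print(meets_qos(SERVICE_PROFILES["ecg_stream"], 80.0, 3.2))   # within budget
print(meets_qos(SERVICE_PROFILES["ecg_stream"], 150.0, 3.2))  # delay violated
```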

4.1.1. Utility-Based Formulation for Service Delivery Rate

To support heterogeneous healthcare services, we adopt a utility-based formulation that maps the service delivery rate to a corresponding utility value, representing user satisfaction or QoS level. In the context of healthcare IoT services, user satisfaction is quantified in terms of service-specific QoS goals. For latency-sensitive services such as ECG streaming or fall detection, it corresponds to minimizing detection and alerting delay. For bandwidth-intensive services like real-time video consultations, it translates to maintaining a continuous stream at the required bitrate. For latency-tolerant tasks (e.g., EHR access), satisfaction reflects the ability to complete data transactions within acceptable thresholds. Thus, the utility function aligns system-level delivery rates with the perceived performance at the application level.
Let $g_{s,i}$ denote the service delivery rate for consumer $i$ receiving service $s$. The utility function $U_s(g_{s,i})$ is defined as a strictly increasing, continuous, and positive function over the interval $[m_s, M_s]$, where $m_s$ and $M_s$ denote the minimum and maximum allowable rates for service $s$, respectively. However, traditional utility functions may fail to maintain concavity for latency-sensitive services, which is essential for tractable convex optimization. Therefore, we introduce a pseudo-utility function $\bar{U}_s(g_{s,i})$ that satisfies strict concavity, ensuring the global optimality of the rate allocation. The pseudo-utility function is defined as:

$$\bar{U}_s(g_{s,i}) = \int_{m_s}^{g_{s,i}} \frac{1}{U_s(y)} \, dy, \qquad m_s \le g_{s,i} \le M_s.$$
This transformation allows us to reformulate the service provisioning task as a convex optimization problem, aiming to maximize the total utility across all services and consumers:
P 1 : maximize g 0 s S i = 1 n s U s ( g s , i ) ,
subject to the following constraint:
s S g s t c t , t T ,
where g s t = max { i | t T s , i } g s , i represents the maximum service delivery rate for service s on link t, and c t is the capacity of the link. This constraint ensures that the total service delivery rate on any link does not exceed its capacity. To maintain differentiability and facilitate the optimization process, we approximate the non-differentiable max operator using:
g s t { i | t T s , i } g s , i n 1 n ,
where n is a large integer. This leads to the reformulated convex optimization problem P2:
P 2 : maximize g 0 s S i = 1 n s U s ( g s , i ) ,
subject to s S { i | t T s , i } g s , i n 1 n c t , t T .
This utility-based formulation provides a scalable and flexible approach to optimize heterogeneous healthcare services, balancing latency sensitivity and resource efficiency. It also serves as a foundation for integrating learning-based techniques in the subsequent sections.
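To make the transform concrete, the following sketch (illustrative only; the raw utility $U_s(g) = g^2$ and the bounds $[1, 10]$ are assumed for demonstration, not taken from the paper) evaluates the pseudo-utility integral numerically and checks that the result is strictly concave even though the raw utility is convex:

```python
def pseudo_utility(U, m_s, g, steps=1000):
    """Numerically evaluate the pseudo-utility integral_{m_s}^{g} 1/U(y) dy
    via the trapezoidal rule; U must be positive and increasing on [m_s, g]."""
    if g <= m_s:
        return 0.0
    h = (g - m_s) / steps
    total = 0.5 * (1.0 / U(m_s) + 1.0 / U(g))
    for k in range(1, steps):
        total += 1.0 / U(m_s + k * h)
    return total * h

# Hypothetical raw utility for a latency-sensitive service: U(g) = g^2 is
# increasing and positive on [1, 10] but convex, so it breaks concavity.
U = lambda g: g ** 2
m_s = 1.0

# For this U the transform has the closed form 1 - 1/g; the quadrature matches.
assert abs(pseudo_utility(U, m_s, 5.0) - (1 - 1 / 5.0)) < 1e-4

# The transformed utility is strictly concave: second differences < 0.
gs = [1.0 + 0.5 * k for k in range(1, 18)]
vals = [pseudo_utility(U, m_s, g) for g in gs]
assert all(vals[k + 1] - 2 * vals[k] + vals[k - 1] < 0
           for k in range(1, len(vals) - 1))
```

The closed form $1 - 1/g$ for this particular choice of $U_s$ is what makes the numerical check easy to verify by hand.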

4.1.2. Delay-Aware Utility Adjustment

In smart healthcare environments, many IoT-based services are highly delay-sensitive. Applications such as real-time cardiac monitoring, fall detection, and emergency alerts require not only optimal service delivery rates but also stringent latency guarantees. To account for these latency constraints, we extend the utility model to penalize delays incurred during service delivery. In this work, we model the delay function $d_s(g_{s,i}^t)$ using a hybrid approach. For analytical formulation and theoretical derivation, we assume an $M/M/1$ queuing model, which provides a tractable and convex approximation of delay under varying arrival rates. For simulation and learning-based integration, delay is estimated using a supervised regression model trained on network telemetry data, capturing variations due to congestion, link quality, and mobility. This dual representation ensures that our optimization model remains mathematically robust while being practically deployable in real-world smart healthcare settings.
Let $d_s(g_{s,i}^t)$ denote the average delay experienced by consumer $i$ of service $s$ when traversing link $t$. The cumulative delay experienced by a consumer over the entire delivery path is:
$$D_{s,i} = \sum_{t \in T_{s,i}} d_s(g_{s,i}^t),$$
where $T_{s,i}$ is the set of links in the delivery path from the service provider to consumer $i$ for service $s$. To incorporate delay sensitivity, we define a delay-weighted utility function $U_s^*(g_{s,i})$ as follows:
$$U_s^*(g_{s,i}) = \tilde{U}_s(g_{s,i}) - \varphi_s D_{s,i},$$
where $\varphi_s \ge 0$ is a delay penalty coefficient that captures the importance of latency for service $s$ and is tuned to its criticality: for emergency alerts or ICU monitoring, $\varphi_s$ is set high so that the optimization penalizes high-delay paths heavily, whereas for routine data logging or archival services it may be set to zero or a small value.
Substituting the delay-adjusted utility function into the optimization objective leads to:
$$\mathbf{P3}: \underset{g \ge 0}{\text{maximize}} \; \sum_{s \in S} \sum_{i=1}^{n_s} \Big( \tilde{U}_s(g_{s,i}) - \varphi_s \sum_{t \in T_{s,i}} d_s(g_{s,i}^t) \Big),$$
subject to the same capacity constraints on links as defined earlier. This formulation balances the trade-off between high service delivery rates and low end-to-end latency, making it well-suited for real-time healthcare applications where both factors are mission-critical.
The delay function $d_s(g_{s,i}^t)$ can be modeled using queuing theory (e.g., $M/M/1$ or $M/D/1$ approximations), empirical measurements, or predicted via ML models, which will be discussed in the next subsection.
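As a worked illustration of the delay penalty, the sketch below pairs a simple $M/M/1$ sojourn-time formula, $d = 1/(\mu - \lambda)$, with the delay-penalized utility described above. The utility function, link capacities, and penalty values are assumptions chosen for clarity, not parameters from the paper:

```python
import math

def mm1_delay(rate, capacity):
    """Average M/M/1 sojourn time on a link of service capacity `capacity`
    carrying offered load `rate`; diverges as rate approaches capacity."""
    assert rate < capacity, "M/M/1 delay is finite only for rate < capacity"
    return 1.0 / (capacity - rate)

def delay_weighted_utility(util, rate, path_capacities, phi):
    """U*_s = util(rate) - phi_s * (sum of per-link M/M/1 delays)."""
    D = sum(mm1_delay(rate, c) for c in path_capacities)
    return util(rate) - phi * D

# Hypothetical setup: a concave utility and a two-link delivery path (Mb/s).
util = lambda g: math.log(1.0 + g)
path = [20.0, 30.0]

relaxed = delay_weighted_utility(util, 10.0, path, phi=0.0)  # latency-tolerant
strict = delay_weighted_utility(util, 10.0, path, phi=5.0)   # latency-sensitive
assert strict < relaxed   # the delay penalty can only lower the utility
```

Raising `phi` models moving a service from the latency-tolerant to the life-critical class: the same rate allocation scores lower once path delay is penalized.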

4.1.3. Approximation and Convex Transformation

The utility-based optimization problem, as formulated in the previous subsections, includes a non-differentiable maximum function in the computation of the per-link service delivery rate:
$$g_s^t = \max_{\{i \mid t \in T_{s,i}\}} g_{s,i}.$$
This formulation poses a challenge for conventional convex optimization techniques due to the non-differentiability of the max operator. To address this, we use a smooth approximation of the maximum function based on the $n$-norm, which is differentiable and converges to the maximum as $n \to \infty$.
We approximate Equation (10) as follows:
$$g_s^t \approx \Big( \sum_{\{i \mid t \in T_{s,i}\}} g_{s,i}^{\,n} \Big)^{\frac{1}{n}},$$
where $n$ is a large positive integer. This approximation retains smoothness while closely matching the original maximum value when $n$ is sufficiently large. Substituting this into the constraint set, we redefine the link capacity constraint as:
$$\sum_{s \in S} \Big( \sum_{\{i \mid t \in T_{s,i}\}} g_{s,i}^{\,n} \Big)^{\frac{1}{n}} \le c_t, \quad \forall t \in T.$$
With this transformation, the optimization problem becomes convex, and standard solvers or learning-augmented techniques (e.g., primal–dual updates or neural approximators) can be employed to find optimal or near-optimal solutions.
The updated delay-aware and smoothed optimization problem is now expressed as:
$$\mathbf{P4}: \underset{g \ge 0}{\text{maximize}} \; \sum_{s \in S} \sum_{i=1}^{n_s} \Big( \tilde{U}_s(g_{s,i}) - \varphi_s \sum_{t \in T_{s,i}} d_s(g_{s,i}^t) \Big),$$
subject to
$$\sum_{s \in S} \Big( \sum_{\{i \mid t \in T_{s,i}\}} g_{s,i}^{\,n} \Big)^{\frac{1}{n}} \le c_t, \quad \forall t \in T.$$
This convex transformation enables efficient convergence and supports scalable implementation in dynamic and resource-constrained healthcare environments. The formulation is also amenable to integration with ML models, particularly for delay estimation and adaptive parameter tuning, which are discussed in the next subsection.
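The behavior of the $n$-norm surrogate is easy to verify numerically. In this sketch (the rate values are invented for illustration), the surrogate upper-bounds the true maximum and tightens as $n$ grows:

```python
def smooth_max(rates, n=40):
    """Differentiable n-norm surrogate (sum_i g_i^n)^(1/n) for max_i g_i."""
    return sum(g ** n for g in rates) ** (1.0 / n)

# Hypothetical per-consumer rates sharing one link (Mb/s).
rates = [1.2, 3.4, 2.8, 0.5]
true_max = max(rates)

# The surrogate upper-bounds the true max and tightens as n grows.
assert smooth_max(rates, 10) >= true_max
assert abs(smooth_max(rates, 200) - true_max) < abs(smooth_max(rates, 10) - true_max)
assert abs(smooth_max(rates, 200) - true_max) < 0.01
```

In practice the choice of $n$ balances tightness of the approximation against floating-point overflow, since each rate is raised to the $n$-th power before summing.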

4.1.4. Machine Learning Integration

To enhance adaptability and predictive capabilities in dynamic healthcare environments, we integrate ML techniques into the proposed rate-adaptive and latency-aware framework. These ML modules enable proactive resource management, anticipate changes in network load, and intelligently adjust service parameters in real time. The integration focuses on lightweight and scalable models suitable for deployment in resource-constrained IoT infrastructures. In particular, we utilize Long Short-Term Memory (LSTM) networks for traffic prediction, gradient-boosted decision trees (GBDT) for delay estimation, and Deep Q-Networks (DQN) for RL-based rate control. These models were chosen for their balance between prediction accuracy and computational feasibility at the mist and edge layers, benefiting from recent advances in TinyML and embedded inference.

4.1.5. Traffic and Demand Prediction

In smart healthcare IoT systems, traffic patterns can vary due to unpredictable patient mobility, device behavior, and monitoring intensity. To anticipate such variations, we employ an LSTM neural network trained on historical service request traces and real-time link load data. The LSTM model forecasts traffic demand on each communication link over a short-term horizon, yielding predicted traffic levels $\hat{g}_s^t$. These forecasts are used to proactively update pricing coefficients $p_t$ and inform adaptive scheduling before congestion occurs.
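A minimal sketch of this forecasting interface follows. To stay dependency-free it substitutes an exponential moving average for the LSTM; the sliding-window supervision and the proactive price nudge are the parts that carry over, while the trace values, window size, and smoothing factor are illustrative assumptions:

```python
def sliding_windows(series, w):
    """(history window, next value) pairs: the supervision used to train a
    short-horizon traffic forecaster (an LSTM in ML-RASPF; any sequence
    model fits this interface)."""
    return [(series[i:i + w], series[i + w]) for i in range(len(series) - w)]

def ema_forecast(window, alpha=0.5):
    """Stand-in predictor (exponential moving average) used here so the
    sketch stays dependency-free; a trained LSTM would replace this."""
    est = window[0]
    for x in window[1:]:
        est = alpha * x + (1 - alpha) * est
    return est

# Hypothetical per-link load trace (Mb/s), mildly trending upward.
trace = [10, 11, 12, 11, 13, 14, 13, 15, 16, 15]
pairs = sliding_windows(trace, w=4)
assert len(pairs) == 6 and pairs[0] == ([10, 11, 12, 11], 13)

# Forecast g_hat for the next epoch from the most recent window, then
# nudge the link price proactively if congestion is predicted.
g_hat = ema_forecast(trace[-4:])
c_t, p_t, lam = 14.0, 0.0, 0.1
p_t += lam * max(g_hat - c_t, 0.0)   # price rises before overload occurs
assert 13.0 < g_hat < 16.0
```

The key design point is that the price update consumes a *predicted* load $\hat{g}_s^t$ rather than the last observed one, which is what lets the scheduler react before a queue builds up.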

4.1.6. Delay Estimation and Latency Modeling

Accurate estimation of latency is essential for delivering critical healthcare services such as real-time ECG monitoring or fall detection. Rather than relying solely on analytical queueing approximations, we adopt a regression-based ML model, specifically a gradient-boosted decision tree (GBDT), to predict delay functions $d_s(g_{s,i}^t)$. The model is trained on time-series network metrics, including past traffic volumes, queue lengths, and packet loss events, allowing it to generalize to unseen traffic conditions and rapidly adapt to disruptions such as link failures or bursty demand.
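To illustrate the regression approach without pulling in an ML library, the sketch below implements a toy gradient-boosted-stump regressor on synthetic telemetry. The feature set, link capacity, and delay model are assumptions for demonstration only; a production system would use a full GBDT implementation:

```python
def fit_stump(X, y):
    """Best single-feature threshold split minimizing squared error."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({x[j] for x in X}):
            left = [yi for x, yi in zip(X, y) if x[j] <= t]
            right = [yi for x, yi in zip(X, y) if x[j] > t]
            if not right:
                continue
            ml, mr = sum(left) / len(left), sum(right) / len(right)
            err = (sum((yi - ml) ** 2 for yi in left)
                   + sum((yi - mr) ** 2 for yi in right))
            if best is None or err < best[0]:
                best = (err, j, t, ml, mr)
    _, j, t, ml, mr = best
    return lambda x, j=j, t=t, ml=ml, mr=mr: ml if x[j] <= t else mr

def gbdt_fit(X, y, rounds=30, lr=0.3):
    """Toy gradient boosting: with squared loss, each round fits a stump
    to the current residuals (the negative gradient)."""
    base = sum(y) / len(y)
    stumps, resid = [], [yi - base for yi in y]
    for _ in range(rounds):
        s = fit_stump(X, resid)
        stumps.append(s)
        resid = [r - lr * s(x) for r, x in zip(resid, X)]
    return lambda x: base + lr * sum(s(x) for s in stumps)

# Hypothetical telemetry: [offered load (Mb/s), queue length] -> delay (s),
# generated from an M/M/1-like curve on a 20 Mb/s link plus a backlog term.
X = [[load, q] for load in (5, 8, 11, 14, 17) for q in (0, 2, 4)]
y = [1.0 / (20 - load) + 0.05 * q for load, q in X]
predict = gbdt_fit(X, y)
assert predict([17, 4]) > predict([5, 0])   # delay grows with congestion
```

The residual-fitting loop is the essence of gradient boosting; swapping stumps for depth-limited trees and adding subsampling recovers the standard GBDT.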

4.1.7. Reinforcement Learning for Adaptive Rate Control

To achieve dynamic and continuous rate adaptation, we implement an RL agent based on the Deep Q-Network (DQN) architecture. The agent is trained off-policy with temporal-difference updates over replayed experience, as is standard for value-based DQN learning. It interacts with the network simulation environment, receiving rewards based on delay-aware utility improvements and penalties for rate allocations that lead to link congestion or deadline violations. Training is conducted in episodic rounds using historical and synthetic traffic traces to ensure generalization across different healthcare service scenarios. The environment is modeled as a Markov Decision Process (MDP) in which state transitions reflect traffic fluctuations and resource updates in the edge–cloud topology. The agent observes the system state, comprising current delivery rates $g_{s,i}$, link utilization, estimated delays, and historical rewards, and selects rate control actions to maximize cumulative network utility while minimizing penalties from excessive latency. The reward at decision epoch $t$ is defined as:
$$r_t = \tilde{U}_s(g_{s,i}) - \varphi_s \sum_{t' \in T_{s,i}} d_s(g_{s,i}^{t'}),$$
where $\varphi_s$ denotes the service-specific delay penalty weight and $t'$ ranges over the links of the delivery path. The DQN model enables the system to learn an optimal rate allocation policy over time, even in the presence of non-stationary network dynamics and partial observability, allowing it to adapt to changing network and service conditions and ensure QoS across a variety of healthcare applications.
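The reward structure can be sketched with a tabular Q-learning stand-in for the DQN. This is a deliberate simplification: the paper's agent uses a neural Q-function, and the two-state environment below is invented for illustration, but the Bellman target in the update is the same quantity a DQN regresses toward:

```python
import math, random

def reward(rate, path_delays, phi, util=lambda g: math.log(1 + g)):
    """RL reward: utility of the delivered rate minus the weighted sum of
    per-link delays along the consumer's path."""
    return util(rate) - phi * sum(path_delays)

# Tabular Q-learning stand-in: two coarse congestion states, three
# rate-multiplier actions (decrease / hold / increase).
random.seed(0)
actions = (0.8, 1.0, 1.25)
Q = {(s, a): 0.0 for s in ("low", "high") for a in actions}
alpha, gamma, eps = 0.2, 0.9, 0.1

def step(state, rate, a):
    """Hypothetical environment: raising the rate on a congested link
    inflates an M/M/1-style delay, which lowers the reward."""
    new_rate = min(max(rate * a, 1.0), 19.0)
    cap = 20.0 if state == "low" else 14.0   # less headroom when congested
    delay = 1.0 / (cap - min(new_rate, cap - 0.5))
    next_state = "high" if new_rate > 10 else "low"
    return next_state, new_rate, reward(new_rate, [delay], phi=2.0)

state, rate = "low", 8.0
for _ in range(2000):
    a = (random.choice(actions) if random.random() < eps
         else max(actions, key=lambda x: Q[(state, x)]))
    nxt, rate, r = step(state, rate, a)
    Q[(state, a)] += alpha * (r + gamma * max(Q[(nxt, x)] for x in actions)
                              - Q[(state, a)])
    state = nxt

# Under congestion, increasing the rate should not look more attractive
# than backing off.
assert Q[("high", 1.25)] <= Q[("high", 0.8)]
```

The learned Q-values discourage rate increases precisely in the congested state, which is the qualitative behavior the delay-penalized reward is designed to induce.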

4.1.8. Emergency-Aware Prioritization

ML-RASPF supports service-level prioritization by adjusting the delay penalty coefficient $\varphi_s$ in the utility model, enabling elevated prioritization for life-critical services such as ICU telemetry, cardiac alerts, and emergency broadcasts. During unexpected surges (e.g., mass casualty events), the RL agent adapts service delivery by assigning higher utility weights to delay-sensitive tasks and ensuring they are routed via low-latency paths. Additionally, integrating context-aware classification can help distinguish legitimate emergency traffic from anomalies or credential misuse by analyzing surge origin, timing, and service type.

4.1.9. Framework Integration

These ML components operate alongside the core optimization engine. Traffic predictors and delay estimators continuously feed updated parameters into the analytical model, while the RL agent fine-tunes service delivery rates and cost coefficients in response to real-time system feedback. This hybrid approach ensures both the scalability of analytical optimization and the adaptability of learning-based decision-making. By leveraging ML, the proposed framework achieves a higher degree of responsiveness and robustness in smart healthcare IoT scenarios, supporting mission-critical, real-time services while maintaining optimal bandwidth and energy efficiency.

4.2. Algorithms for Rate-Adaptive Provisioning

To operationalize the analytical model, we develop a modular three-phase algorithmic pipeline tailored for rate-adaptive and latency-aware service provisioning in smart healthcare IoT environments, where responsiveness and prioritization are vital. The pipeline comprises three core algorithms, each targeting a distinct stage of the provisioning process. Algorithm 1 handles network initialization by gathering critical parameters such as link capacities, service-to-link mappings, and current traffic states, establishing a stable baseline for iterative optimization. Algorithm 2 dynamically updates link prices and adjusts weight distributions using gradient-based techniques, enabling real-time adaptation to congestion and incorporating predictions from ML models to anticipate traffic fluctuations. Algorithm 3 fine-tunes service delivery rates per user by computing end-to-end path costs and applying inverse utility functions, with optional RL modules to support adaptive control based on real-time system feedback.
Algorithm 1 Network Data Collection and Initialization
Require: Set of services $S = \{s_1, \ldots, s_n\}$, links $T = \{t_1, \ldots, t_m\}$, link capacities $C = \{c_1, \ldots, c_m\}$
Ensure: Initialized weight matrix $W$ and price matrix $P$
  1: Initialize matrices $W \in \mathbb{R}^{|S| \times |T|}$ and $P \in \mathbb{R}^{|S| \times |T|}$
  2: for all services $s \in S$ do
  3:  for all consumers $i$ of service $s$ do
  4:   for all links $t \in T_{s,i}$ do
  5:    $w_{s,i}^t \leftarrow 1 / |\{j \mid t \in T_{s,j}\}|$ {Compute using Equation (16)}
  6:    $p_{s,i}^t \leftarrow 0$ {Initialize price (see Equation (16))}
  7:   end for
  8:  end for
  9: end for
10: return $W$, $P$
Algorithm 2 Price Computation and Weight Update
Require: Current weights $W$, prices $P$, link capacities $C$, service rates $g_{s,i}$
Ensure: Updated $W$ and $P$
  1: for all nodes $e$ do
  2:  for all links $t$ connected to $e$ do
  3:   for all services $s$ using $t$ do
  4:    $g_s^t \leftarrow \max_{\{i \mid t \in T_{s,i}\}} g_{s,i}$ {Compute per-service max rate on link $t$ (see Equation (10))}
  5:   end for
  6:   $g^t \leftarrow \sum_{s \in S} g_s^t$ {Compute total link load}
  7:   $p^t \leftarrow p^t + \lambda (g^t - c_t)^+$ {Update link price based on overload (ref. Equation (3))}
  8:   for all services $s$ and consumers $i$ on $t$ do
  9:    $w_{s,i}^t \leftarrow w_{s,i}^t + \lambda (g_{s,i} - g_s^t)^+$ {Update allocation weight}
10:    if $g_{s,i} = g_s^t$ then
11:     $w_{s,i}^t \leftarrow 1 - \sum_{j \ne i,\, t \in T_{s,j}} w_{s,j}^t$ {Normalize weight to preserve fairness}
12:    end if
13:    $p_{s,i}^t \leftarrow w_{s,i}^t \cdot p^t$ {Consumer-specific price (used in Equation (8))}
14:    Update $W$ and $P$
15:   end for
16:  end for
17: end for
18: return $W$, $P$
These three algorithms operate in a closed loop. Initialization sets up baseline values; the pricing algorithm adjusts to live network conditions, while the rate adaptation phase ensures compliance with utility and latency constraints. In ML-enhanced deployments, prediction and feedback models can be called asynchronously between iterations to further improve responsiveness and decision quality. The design is modular and interpretable, allowing each algorithm to function independently while contributing to the end-to-end optimization cycle. The algorithms are tightly integrated with the analytical framework and leverage ML-generated insights for improved accuracy and responsiveness.
Algorithm 1 is responsible for network setup and initialization. It collects the real-time parameters, including the set of services S, network links T, and capacities C, and constructs the initial weight matrix W and price matrix P. The weights are distributed evenly across consumer paths, and link prices are initialized to zero, as shown in Equation (16).
$$w_{s,i}^t = \frac{1}{|\{j \mid t \in T_{s,j}\}|}, \qquad p_{s,i}^t = 0.$$
This ensures that all consumers begin with equal opportunity to access network resources.
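Expressed in code, this initialization step looks as follows (a schematic sketch; the dictionary-based layout and the example topology are illustrative choices, not the paper's implementation):

```python
def init_weights_prices(paths):
    """Algorithm 1 sketch: even initial weights per shared link, zero prices.
    `paths[(s, i)]` is the link set T_{s,i} for consumer i of service s."""
    W, P = {}, {}
    for (s, i), links in paths.items():
        for t in links:
            sharers = sum(1 for (s2, _), l2 in paths.items()
                          if s2 == s and t in l2)
            W[(s, i, t)] = 1.0 / sharers   # w^t_{s,i} = 1 / |{j : t in T_{s,j}}|
            P[(s, i, t)] = 0.0             # p^t_{s,i} = 0
    return W, P

# Hypothetical topology: two consumers of service "ecg" share link "t1".
paths = {("ecg", 1): {"t1", "t2"}, ("ecg", 2): {"t1", "t3"}}
W, P = init_weights_prices(paths)
assert W[("ecg", 1, "t1")] == 0.5        # shared link: weight split evenly
assert W[("ecg", 1, "t2")] == 1.0        # private link: full weight
assert all(v == 0.0 for v in P.values())
```

Starting every consumer at an equal share of each shared link is what gives the iterative pricing phase a neutral baseline to adapt from.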
Algorithm 3 Service Rate Adaptation and Delivery
Require: Updated link prices $P$, weights $W$, service paths $T_{s,i}$
Ensure: Adapted delivery rates $g_{s,i}$
  1: for all nodes $e$ do
  2:  if $e$ is a provider of service $s$ then
  3:   for all consumers $i$ of service $s$ do
  4:    $p_{s,i} \leftarrow \sum_{t \in T_{s,i}} p_{s,i}^t$ {Aggregate path cost (used in Equation (8))}
  5:    Observe state $s_t = \{p_{s,i}, d_{s,i}, g_{s,i}^{\text{prev}}\}$ {State includes price, delay, previous rate}
  6:    Select action $a_t$ using RL policy $\pi(a_t \mid s_t)$: adjust $g_{s,i}$ {Action chosen to optimize utility}
  7:    Receive reward $r_t = \tilde{U}_s(g_{s,i}) - \varphi_s D_{s,i}$ {Reward defined in Equation (8)}
  8:    Update policy parameters using gradient of $r_t$ {Policy improvement via DQN gradient}
  9:    $g_{s,i} \leftarrow \alpha \cdot U_s^{-1}\big(\big[\tfrac{1}{p_{s,i}}\big]_{U_s(m_s)}^{U_s(M_s)}\big) + (1 - \alpha) \cdot a_t$ {Adaptive rate via inverse utility (related to Equation (1))}
10:   end for
11:  end if
12:  // Check for new service requests to trigger feedback-based re-optimization
13:  if any new service is requested by node $e$ then
14:   Update network state and repeat Algorithm 2
15:  end if
16: end for
17: return Updated rates $g_{s,i}$
Algorithm 2 iteratively updates prices and weights. Lines 1–6 iterate over each active node $e$ and its outgoing links $t$. For each link, the maximum rate of any service traversing it is computed:
$$g_s^t = \max_{\{i \mid t \in T_{s,i}\}} g_{s,i},$$
and the aggregate link load is:
$$g^t = \sum_{s \in S} g_s^t.$$
Link prices are adjusted via a gradient step:
$$p^t \leftarrow p^t + \lambda (g^t - c_t)^+,$$
where $\lambda$ is a learning rate or step size. Consumer-specific weight coefficients are then updated to reflect usage:
$$w_{s,i}^t \leftarrow w_{s,i}^t + \lambda (g_{s,i} - g_s^t)^+,$$
followed by normalization for fairness:
$$w_{s,i}^t = 1 - \sum_{j \ne i,\, t \in T_{s,j}} w_{s,j}^t.$$
Finally, consumer-level link prices are computed:
$$p_{s,i}^t = w_{s,i}^t \cdot p^t.$$
These updates reflect both current traffic and anticipated changes (predicted via ML models), ensuring dynamic resource reallocation under evolving healthcare conditions.
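A single iteration of the pricing step can be sketched as follows, combining the smooth $n$-norm max with the projected price update; the topology, rates, and step size are illustrative assumptions:

```python
def price_update(g, paths, cap, price, lam=0.05, n=40):
    """One Algorithm 2 gradient step: smooth per-service max rate per link,
    aggregate load, then the projected update p_t <- p_t + lam*(g_t - c_t)^+."""
    for t in cap:
        load = 0.0
        for s in {s for (s, _) in g}:
            rates = [g[(s2, i)] for (s2, i) in g
                     if s2 == s and t in paths[(s2, i)]]
            if rates:
                load += sum(x ** n for x in rates) ** (1.0 / n)  # smooth max
        price[t] += lam * max(load - cap[t], 0.0)
    return price

# Hypothetical state: two consumers of one service; link t1 is overloaded.
g = {("ecg", 1): 8.0, ("ecg", 2): 6.0}
paths = {("ecg", 1): {"t1"}, ("ecg", 2): {"t1", "t2"}}
cap = {"t1": 5.0, "t2": 10.0}
price = price_update(g, paths, cap, {"t1": 0.0, "t2": 0.0}, lam=1.0)
assert price["t1"] > 0.0   # overloaded link gets priced
assert price["t2"] == 0.0  # under capacity: projection keeps the price at 0
```

Because the price only rises on overloaded links, consumers downstream see higher path costs exactly where congestion forms, which is the signal the rate-adaptation phase consumes.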
Algorithm 3 focuses on service rate adaptation. For each service provider node, lines 1–4 compute the total service delivery path price for consumer $i$:
$$p_{s,i} = \sum_{t \in T_{s,i}} p_{s,i}^t.$$
Using the inverse utility function, the consumer's service delivery rate is adjusted:
$$g_{s,i} = U_s^{-1}\!\left(\left[\frac{1}{p_{s,i}}\right]_{U_s(m_s)}^{U_s(M_s)}\right),$$
where $[\cdot]_a^b$ denotes projection onto the interval $[a, b]$, so the resulting rate always lies within $[m_s, M_s]$.
This ensures that rate adjustments respect both the bounds of the service and its urgency. The algorithm supports real-time healthcare needs by ensuring that critical services (e.g., real-time monitoring, emergency alerts) receive bandwidth quickly as network conditions change. If any node is receiving a new service, the process iterates to re-evaluate the network status, ensuring up-to-date adaptation.
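The inverse-utility update, with $1/p_{s,i}$ projected onto $[U_s(m_s), U_s(M_s)]$ before inversion so the rate stays within bounds, can be sketched as follows (the square-root utility and the rate bounds are assumptions chosen so the inverse has a closed form):

```python
import math

def adapt_rate(p_path, U, U_inv, m_s, M_s):
    """Inverse-utility rate update: clip 1/p to [U(m_s), U(M_s)] before
    inverting, so the resulting rate always lies in [m_s, M_s]."""
    target = max(U(m_s), min(1.0 / p_path, U(M_s)))
    return U_inv(target)

# Hypothetical raw utility U(g) = sqrt(g) with inverse U_inv(x) = x^2,
# and service rate bounds [1, 16] Mb/s.
U, U_inv = math.sqrt, lambda x: x * x
m_s, M_s = 1.0, 16.0

assert adapt_rate(10.0, U, U_inv, m_s, M_s) == 1.0    # costly path -> floor
assert adapt_rate(0.01, U, U_inv, m_s, M_s) == 16.0   # cheap path -> ceiling
assert abs(adapt_rate(0.5, U, U_inv, m_s, M_s) - 4.0) < 1e-9  # 1/p = 2 -> g = 4
```

The mapping is monotone in the path price: as congestion pricing rises, the allocated rate falls smoothly until it hits the service's minimum guarantee $m_s$.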
Together, these algorithms implement a closed-loop control mechanism that dynamically adjusts service delivery in response to both real-time network data and predictive insights from ML models, making the system robust, intelligent, and responsive for smart healthcare IoT applications.

4.3. Complexity Analysis

The computational complexity of the proposed framework is primarily influenced by the iterative nature of Algorithms 2 and 3, which operate on per-node and per-link bases and rely on dynamic network conditions. Let n represent the number of services, m the number of network links, and k the average number of consumers per link.
Algorithm 1, which is responsible for initialization, constructs the weight and price matrices by processing all service–link combinations. This results in a complexity of:
$$O(nm).$$
Algorithm 2 traverses each node and evaluates its outgoing links to compute delivery rates, prices, and weights. For each iteration, the computation over all consumers on each link incurs:
$$O(mk),$$
leading to a total complexity of:
$$O(I \cdot mk),$$
where I is the number of iterations until convergence. In practice, I remains moderate due to the adaptive pricing scheme and ML-based forecasting, which stabilize the convergence process.
Algorithm 3 processes each service path to compute the cumulative path price and adapt the delivery rate accordingly. This operation takes:
$$O(nk)$$
per iteration, contributing an additional:
$$O(I \cdot nk)$$
to the overall complexity.
Combining all phases, the total complexity of the framework is:
$$O(I \cdot mk + I \cdot nk + nm) = O(I(mk + nk) + nm).$$
Given that k is typically small relative to m and n and that I remains bounded in realistic deployments, the overall framework is computationally efficient and suitable for large-scale, real-time healthcare networks. The computational demands of ML-RASPF are well within the capabilities of modern edge servers and cloud data centers, enabling real-time deployment in mid- to large-scale smart hospital networks.
The proposed analytical framework enables rate-adaptive and latency-aware service provisioning for heterogeneous IoT services in healthcare environments. Through its convex utility formulation, delay penalization model, and integration of predictive and adaptive ML components, the system offers intelligent, dynamic resource allocation while maintaining scalability. The modular algorithmic structure allows for fine-grained control, rapid convergence, and real-time responsiveness, making the framework suitable for mission-critical healthcare applications such as patient monitoring, diagnostics, and emergency services. In summary, ML-RASPF bridges the gap between traditional resource allocation models and the dynamic needs of smart healthcare environments. By combining utility-based optimization, delay sensitivity, and learning-based forecasting and control, the framework ensures robust, real-time service delivery.

5. Experimental Evaluation

This section presents the experimental evaluation of the proposed ML-enhanced, rate-adaptive service provisioning framework in a smart healthcare environment. We assess ML-RASPF's performance in delivering rate-adaptive and latency-sensitive IoT services across a mist–edge–cloud infrastructure. The evaluation focuses on key performance indicators: latency, service delivery rate, energy consumption, bandwidth utilization, and load balancing efficiency. Comparative results are presented against four recent techniques, including JANUS [7], which employs queue management and heuristic coordination, an energy-aware task offloading framework [8] leveraging JAYA-based optimization, and classical scheduling strategies like First-Come-First-Served (FCFS) and Least-Request-First-Served (LRFS). We begin by outlining the experimental setup and the layered simulation architecture used to evaluate ML-RASPF.

5.1. Simulation Setup

To evaluate the proposed framework, we adapt and extend EdgeCloudSim [23], a widely-used simulator for cloud-edge IoT environments. The simulator is customized to model three-tier service delivery (mist, edge, and cloud) and to support service types commonly found in smart healthcare systems. The simulation includes ML components, such as predictive traffic models and adaptive rate control mechanisms that reflect anticipated real-time analytics in actual deployments. We model a healthcare use case involving a network of IoT-enabled hospital zones. Within this environment, three types of service consumers are simulated:
  • Interactive diagnostic kiosks provide patient-specific diagnostic support, lab report access, and symptom checkers. These require moderate delivery rates but have stringent latency requirements.
  • Informational displays broadcast hospital alerts, safety protocols, and public health information. These involve high-bandwidth, video-rich content with stable rate requirements.
  • Patient devices such as tablets or smartphones used by inpatients or visitors for accessing hospital Wi-Fi, EHRs, or teleconsultation. These represent latency-tolerant services.
The simulation environment emulates a multi-zone smart hospital setup with three physical zones and nine service consumers. Each zone includes local mist nodes (service forwarders) that collect and process requests from nearby IoT devices, forwarding them to edge servers or a central cloud data center based on latency, delivery rate, and resource availability. The simulation models 13 network links ($t_1$ to $t_{13}$), with varying bandwidth capacities to reflect realistic congestion points in healthcare infrastructure.
We incorporate an LSTM-based time series forecast to simulate traffic variation patterns and integrate an RL agent that adjusts service rates in response to predicted load and observed delay metrics. These predictive models influence the rate adaptation and path selection process, emulating a real-world intelligent system. The setup includes one cloud node, three edge nodes, and mist-layer service forwarders. Each IoT device is assigned to a service forwarder, which performs local analytics and collaborates with the upper layers for optimal provisioning. The RL controller uses observed system states, including current buffer sizes, link load, and latency feedback, to select rate adaptation actions during each decision epoch. Figure 4 illustrates the simulation topology comprising healthcare zones, service consumers, ML modules, and mist–edge–cloud layers.
The simulation models a 24 h smart hospital operation cycle with three zones and nine service endpoints. Each communication link is configured with realistic bandwidth constraints to emulate bottlenecks and prioritization scenarios. All experiments are conducted on a machine with an Intel Core i7 processor (3.3 GHz, 16 GB RAM). Each scenario is repeated over 10 independent trials to account for stochastic variability, with metrics averaged and reported using 95% confidence intervals. The RL-based controller operates with 15 s decision epochs and uses reward functions based on delivery rate improvement and latency minimization to guide adaptive actions.

5.2. Simulation Parameters

Table 2 summarizes the key simulation parameters used in evaluating the ML-RASPF framework. The simulation environment models an IoT-enabled hospital system comprising diagnostic kiosks, informational displays, and patient devices. Each consumer device is associated with one of three distinct service types: latency-sensitive (real-time diagnostics), bandwidth-intensive (video-based displays), or latency-tolerant (Wi-Fi and EHR access). We simulate 13 communication links with different bandwidth capacities, reflecting the heterogeneous nature of real-world hospital networking infrastructure. Link $t_1$ represents a high-capacity backbone connection, while $t_2$ to $t_4$ simulate medium-capacity inter-zone links. The remaining links ($t_5$ to $t_{13}$) represent lower-bandwidth edge-to-device connections.
In the absence of publicly available, granular healthcare IoT traffic datasets tailored for our experiments, we generate synthetic traces based on empirically modeled service request patterns observed in smart hospitals. The simulation includes patient monitoring, kiosk-based diagnostics, and public display services, each with distinct latency and bandwidth demands. Request arrivals follow a Poisson process augmented with diurnal variations to mimic real-world usage peaks. The LSTM model is trained on 70% of the generated traces and validated on the remaining 30%, ensuring stable prediction performance across unseen sequences. For RL, we simulate episodic training in the environment using a reward function based on latency and delivery utility trade-offs.
To introduce temporal dependencies and reflect realistic traffic bursts, we train an LSTM model on synthetic sequences derived from statistically informed rate distributions. These traces simulate a 24 h workload with both periodic and bursty patterns. The LSTM predictor is further benchmarked against a classical ARIMA model to validate forecasting accuracy. The predicted bandwidth demand directly influences link pricing and capacity updates in the provisioning algorithm. In parallel, an RL control agent adapts service delivery rates in response to real-time feedback, including queue delays and link utilization. Together, these ML components enable anticipatory resource allocation and dynamic service adaptation under fluctuating network conditions.
Static and dynamic energy models are also applied, with energy values for mist nodes, edge cloudlets, and the central cloud. The simulation captures both static and dynamic resource usage, allowing energy-aware evaluation of the proposed algorithms. ML-enhanced components operate asynchronously to generate demand predictions and guide rate adaptation decisions, ensuring the provisioning process remains robust to service fluctuations in real-time healthcare environments.
Traffic traces are generated using a Poisson process ($\lambda = 25$–$50$ req/min) with a burstiness index of 1.3 to capture temporal fluctuations. Diurnal variations are introduced via sinusoidal modulation to reflect peak hours in hospital environments. The LSTM model is trained on 70% of synthetic traces with early stopping and validated on the remaining 30% over 60 epochs. The synthetic workloads simulate service types such as real-time diagnostic kiosk queries, public health video displays, and periodic EHR lookups. These generation strategies align with patterns observed in related healthcare IoT simulations [5,24].
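The trace-generation recipe (Poisson arrivals with sinusoidal diurnal modulation) can be reproduced in a few lines; the sampler, seed, and the particular base/peak rates below are illustrative choices consistent with the stated 25–50 req/min range:

```python
import math, random

def diurnal_trace(hours=24, base=25.0, peak=50.0, seed=7):
    """Per-minute Poisson request counts whose rate swings sinusoidally
    between `base` (overnight) and `peak` (mid-day) requests/min."""
    random.seed(seed)
    trace = []
    for minute in range(hours * 60):
        phase = 2 * math.pi * minute / (24 * 60)
        lam = base + (peak - base) * 0.5 * (1 - math.cos(phase))
        # Knuth's multiplicative Poisson sampler (fine for lambda <= ~50).
        limit, k, prod = math.exp(-lam), 0, random.random()
        while prod > limit:
            prod *= random.random()
            k += 1
        trace.append(k)
    return trace

trace = diurnal_trace()
assert len(trace) == 1440
night = sum(trace[:60]) / 60              # first hour, lambda near 25
noon = sum(trace[11 * 60:13 * 60]) / 120  # around the peak, lambda near 50
assert noon > night
```

Windows of such a trace are exactly the sequences on which the LSTM forecaster is trained, with the held-out 30% drawn from later segments of the same cycle.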

5.3. Performance Metrics and Baselines

To evaluate the effectiveness of the proposed framework, we measure its performance across five key dimensions: latency, service delivery rate, energy consumption, bandwidth utilization, and load balancing efficiency. These metrics collectively capture the system’s responsiveness, throughput, energy profile, communication efficiency, and fairness in resource distribution.

5.3.1. Latency

Latency represents the responsiveness of the system in delivering services. We evaluate transmission latency and end-to-end delay. Transmission latency is the time taken for a service request to reach the service provider from the service consumer via the mist and edge nodes. End-to-end latency is the total time taken for a complete service transaction, including request transmission, service execution, and result delivery back to the consumer. Low latency is crucial for time-sensitive healthcare applications, such as remote monitoring, emergency alert systems, and clinical decision support systems.

5.3.2. Service Delivery Rate

This metric captures the rate at which services are successfully delivered to end-users, expressed in Mb/s. A higher service delivery rate indicates better bandwidth utilization and overall system throughput. In healthcare contexts, this directly impacts the QoS for applications such as telehealth, medical video streaming, and real-time diagnostics.

5.3.3. Energy Consumption

We measure total energy consumption, including both static and dynamic components across mist, edge, and cloud layers. Static energy accounts for idle node consumption, while dynamic energy is proportional to service-specific computation and communication loads. This metric reflects the framework’s ability to deliver healthcare services in an energy-aware and sustainable manner.

5.3.4. Bandwidth Utilization

Bandwidth utilization is defined as the ratio of consumed bandwidth to the total available capacity across all network links. High utilization implies effective communication, scheduling, and link optimization. This metric helps quantify how well the system avoids network congestion and supports concurrent services.

5.3.5. Load Balancing Efficiency

Load balancing efficiency measures the uniformity of workload distribution across mist and edge nodes. It is computed using the standard deviation of task loads across nodes, with lower variance indicating better balance. This metric assesses the framework’s ability to prevent node overload and idle states, ensuring high availability and resource fairness under dynamic IoT traffic conditions.

5.3.6. Baselines for Comparison

We evaluate ML-RASPF against four diverse baseline techniques, representing a spectrum of heuristic, scheduling-based, and energy-aware optimization strategies. These baselines enable a comprehensive comparative analysis of latency handling, throughput, energy efficiency, and adaptability under dynamic workloads:
  • Energy-aware offloading [8]: an energy-aware task offloading framework that leverages dynamic load balancing and resource compatibility evaluation among fog nodes. The method uses lightweight metaheuristics to optimize offloading decisions based on task priority, fog availability, and energy profile, but it does not consider ML-based traffic prediction or RL-based rate control.
  • JANUS [7]: a latency-aware traffic scheduling system for IoT data streaming in edge environments. JANUS employs multi-level queue management and global coordination using heuristic stream selection policies. Although effective in managing latency-sensitive streams, it does not perform joint rate-latency optimization or predictive traffic adaptation.
  • FCFS: a baseline queuing strategy where incoming service requests are served in the order they arrive, without considering bandwidth, latency sensitivity, or system state. FCFS represents non-prioritized resource allocation and serves as a lower-bound reference.
  • LRFS: a heuristic baseline that prioritizes service requests with the smallest bandwidth requirements. While this may help reduce short-term congestion, it often leads to unfair treatment of larger or high-priority flows, particularly in healthcare workloads.
All comparative results are averaged over 10 simulation runs and include 95% confidence intervals. These baselines collectively offer a rigorous evaluation landscape to highlight ML-RASPF’s adaptability and QoS-aware performance in real-time healthcare IoT scenarios. We also performed a sensitivity analysis by varying key system parameters such as RL learning rate, link bandwidth capacity, and LSTM prediction window. ML-RASPF showed robust behavior across these settings, consistently maintaining its performance advantage in latency and throughput across all configurations.
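For reproducibility, the averaging procedure can be sketched as follows. Only the n = 10 runs and the 95% interval come from the text; the per-run latencies below are fabricated for illustration:

```python
import statistics

def mean_ci95(samples):
    """Mean and 95% confidence-interval half-width via the normal
    approximation (1.96 x standard error). For n = 10 runs a Student-t
    critical value (~2.26) would widen the interval slightly."""
    n = len(samples)
    mean = statistics.mean(samples)
    sem = statistics.stdev(samples) / n ** 0.5  # standard error of the mean
    return mean, 1.96 * sem

# Made-up per-run latencies for one configuration (ms):
runs = [117, 119, 116, 118, 117, 115, 118, 117, 116, 117]
m, hw = mean_ci95(runs)
print(f"{m:.1f} ms ± {hw:.1f} ms")  # 117.0 ms ± 0.7 ms
```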

5.4. Results and Analysis

In this section, we present a comprehensive analysis of the simulation results against the four baseline techniques introduced above: JANUS [7], energy-aware offloading [8], FCFS, and LRFS. We discuss each evaluation metric in turn.

5.4.1. Latency

Figure 5 illustrates the transmission latency across the three representative healthcare services: diagnostic kiosks, video displays, and patient Wi-Fi. ML-RASPF consistently achieves the lowest latency across all services due to its predictive load routing and adaptive congestion handling. For instance, in diagnostic kiosk services, ML-RASPF records a latency of 117 ms, compared to 143 ms for JANUS, 155 ms for energy-aware offloading, and over 170 ms for FCFS and LRFS. Video and Wi-Fi services reflect similar trends, albeit under slightly relaxed constraints. End-to-end delay, which compounds transmission, queueing, and processing time, follows the same ranking among approaches; those results are therefore omitted for brevity. ML-RASPF maintained up to 20–30% lower end-to-end delay than the baselines, reinforcing its advantage for real-time responsiveness in smart healthcare environments.
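The compounding of delay components is a simple sum. Using the reported 117 ms kiosk transmission latency with queueing and processing components that are assumed here purely for illustration:

```python
def end_to_end_delay(tx_ms, queue_ms, proc_ms):
    # End-to-end delay compounds transmission, queueing, and processing time.
    return tx_ms + queue_ms + proc_ms

# 117 ms transmission (reported) + 30 ms queueing + 15 ms processing (assumed):
print(end_to_end_delay(117, 30, 15))  # 162
```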

5.4.2. Service Delivery Rate

Figure 6 compares the service delivery rate (in Mbps) for the three healthcare service types (diagnostic kiosks, video displays, and patient Wi-Fi) across the five approaches. ML-RASPF consistently achieves the highest delivery rates: 8.5 Mbps for diagnostic kiosks, 11.8 Mbps for video displays, and 6.8 Mbps for patient Wi-Fi. In comparison, the next-best method, JANUS, achieves 7.1 Mbps, 8.9 Mbps, and 6.1 Mbps, respectively, for the same services. This reflects an average improvement of 18–22% over JANUS and up to 35% over FCFS and LRFS, especially in bandwidth-heavy services. These gains are attributed to ML-RASPF's proactive traffic forecasting (via LSTM) and its RL rate-control mechanism, which adjusts service rates based on predicted traffic and real-time latency feedback. Such capabilities enable it to prioritize high-impact services and sustain high throughput even under fluctuating load conditions.
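The paper's exact RL policy is not spelled out in this section; as a rough stand-in, an additive-increase/multiplicative-decrease rule gated by latency feedback and forecast headroom captures the described behavior (all parameter names and values here are illustrative):

```python
def adapt_rate(rate_mbps, latency_ms, target_ms, headroom_mbps,
               incr=0.5, decay=0.8):
    """Toy AIMD-style controller in the spirit of the RL rate control
    described above; NOT the paper's exact policy. Latency feedback and
    forecast bandwidth headroom gate the increase."""
    if latency_ms <= target_ms and headroom_mbps > 0:
        return rate_mbps + min(incr, headroom_mbps)  # additive increase
    return rate_mbps * decay                         # multiplicative back-off

r = 8.0
r = adapt_rate(r, latency_ms=110, target_ms=120, headroom_mbps=2.0)  # -> 8.5
r = adapt_rate(r, latency_ms=140, target_ms=120, headroom_mbps=0.5)  # backs off
print(round(r, 2))  # 6.8
```

A learned policy replaces the fixed `incr`/`decay` constants with state-dependent actions, but the feedback loop (observe latency and forecast, then adjust rate) is the same.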

5.4.3. Energy Consumption

Figure 7 presents the total energy consumption (static + dynamic) for the three healthcare service types under different provisioning approaches. ML-RASPF demonstrates the most energy-efficient performance across all service categories. For diagnostic kiosks, ML-RASPF reduces energy usage by approximately 18.2% compared to JANUS and by 26.5% compared to the energy-aware offloading method. In the case of video displays, ML-RASPF consumes around 21.7% and 16.8% less energy than FCFS and LRFS, respectively. For patient Wi-Fi services, ML-RASPF achieves a 19.3% reduction over JANUS and a 14.6% saving relative to LRFS. This improvement stems from ML-RASPF’s RL-based rate control, which reduces unnecessary data transmission and optimizes local processing at the mist layer. The ability to intelligently route latency-tolerant services away from overloaded nodes also contributes to reduced overall energy usage. These findings confirm ML-RASPF’s suitability for resource-constrained environments, where sustainable and low-power operation is essential.
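The static + dynamic split can be made concrete with a small sketch. The 3.5 W static figure matches the mist-node baseline in Table 2; the 2.0 W dynamic draw and the duty cycle are assumptions for illustration only:

```python
def node_energy_wh(static_w, dynamic_w, busy_s, idle_s):
    """Illustrative static + dynamic energy model (result in watt-hours).
    Static power is drawn for the whole interval; dynamic power only
    while the node is busy. The 2.0 W dynamic draw is an assumption."""
    total_s = busy_s + idle_s
    return (static_w * total_s + dynamic_w * busy_s) / 3600.0

# Mist node busy for half of a one-hour interval:
print(node_energy_wh(3.5, 2.0, busy_s=1800, idle_s=1800))  # 4.5
```

Under such a model, routing latency-tolerant traffic away from busy nodes reduces the dynamic term, which is consistent with the savings reported above.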

5.4.4. Bandwidth Utilization

Figure 8 compares the bandwidth utilization across different IoT service types and provisioning strategies. ML-RASPF consistently achieves higher utilization across all three service categories, demonstrating its effectiveness in congestion management and predictive load balancing. Specifically, it attains 86%, 89%, and 84% utilization for diagnostic kiosks, video displays, and patient Wi-Fi services, respectively. In contrast, JANUS achieves 78%, 81%, and 76%, while the energy-aware offloading baseline yields lower values of 72%, 75%, and 70%. Heuristic methods such as FCFS and LRFS show even lower utilization, particularly under high service concurrency. This validates ML-RASPF’s adaptive capability in fully exploiting available bandwidth resources and avoiding underutilization in dynamic healthcare IoT networks.
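Bandwidth utilization here is simply delivered throughput as a fraction of link capacity. For example, the reported 86% kiosk figure would correspond to roughly 17.2 Mb/s on the 20 Mb/s t1 link from Table 2 (the 17.2 Mb/s value is back-calculated, not reported):

```python
def bandwidth_utilization(delivered_mbps, link_capacity_mbps):
    # Utilization: percentage of link capacity actually carrying traffic.
    return 100.0 * delivered_mbps / link_capacity_mbps

print(round(bandwidth_utilization(17.2, 20.0), 1))  # 86.0
```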

5.4.5. Load Balancing Efficiency

Figure 9 compares the load balancing efficiency of ML-RASPF against four baseline methods across three healthcare service types. ML-RASPF demonstrates superior balancing performance, with an average load deviation reduction of 18.2% compared to JANUS and 24.6% compared to energy-aware offloading. Specifically, for latency-sensitive diagnostic kiosks, ML-RASPF achieves 91.5% load balancing efficiency, outperforming FCFS (78.3%) and LRFS (84.1%). For bandwidth-intensive video displays, ML-RASPF maintains 87.2% efficiency, while JANUS and energy-aware offloading drop to 69.8% and 65.4%, respectively. For latency-tolerant patient Wi-Fi services, ML-RASPF achieves 85.6%, reflecting its stable adaptation even under relaxed latency requirements. This improvement is attributed to ML-RASPF’s RL resource control, which allows it to dynamically assess edge node capacity and queue states before adjusting delivery paths. In contrast, heuristic-based strategies such as FCFS and LRFS fail to account for real-time link saturation or cross-tier compatibility. The consistently higher efficiency across all services underscores ML-RASPF’s robustness in maintaining system-wide balance, minimizing node overloads, and optimizing edge resource usage.
Overall, the experimental results affirm the superior performance and adaptability of the ML-RASPF framework across diverse healthcare IoT service scenarios. ML-RASPF achieves consistent gains in latency reduction, delivery rate, energy efficiency, bandwidth utilization, and load balancing by integrating predictive traffic forecasting and RL-based rate control into a mist–edge–cloud architecture. Compared to state-of-the-art methods such as JANUS and energy-aware offloading, as well as classical heuristics like FCFS and LRFS, ML-RASPF delivers up to 25–35% improvements across key metrics. These outcomes demonstrate the framework’s suitability for dynamic, resource-constrained smart healthcare environments and establish its potential as a robust, QoS-aware orchestration layer for next-generation IoT deployments.
The simulation framework effectively captures the behavior of healthcare IoT systems under dynamic load; however, it abstracts certain real-world variables such as human-in-the-loop delays, hospital-specific device heterogeneity, and emergency surge patterns. Future work will focus on extending ML-RASPF for deployment on real hospital networks to validate generalizability under operational constraints.

6. Conclusions

This paper presented ML-RASPF, an ML-based, rate-adaptive service-provisioning framework for heterogeneous IoT services in smart healthcare systems. ML-RASPF adopts a modular mist–edge–cloud architecture that orchestrates service delivery through context-aware resource allocation, delay-aware utility modeling, and predictive control. By formulating service provisioning as a joint optimization problem over both latency and service-rate objectives, we developed a convex model supported by adaptive pricing, dynamic weight adjustment, and RL-based rate control. The integration of LSTM-based traffic prediction and RL agents enables service-delivery rate adaptation under varying load conditions and service requirements. A comprehensive evaluation using an extended EdgeCloudSim environment demonstrates that ML-RASPF consistently outperforms state-of-the-art baselines, including heuristic, queue-based, and energy-aware techniques, across five critical performance metrics: latency, service delivery rate, energy consumption, bandwidth utilization, and load balancing efficiency. These findings confirm the potential of ML-RASPF to enable adaptive and intelligent service orchestration in real-world hospital environments, supporting use cases such as ECG monitoring, diagnostic kiosks, and emergency response systems.
In future work, we plan to extend ML-RASPF with proactive service migration, privacy-preserving mechanisms (e.g., federated learning), and multi-agent reinforcement learning for cooperative and fault-tolerant edge orchestration. We also aim to support emergency-mode prioritization for critical scenarios, incorporate workload-aware soft preemptive scheduling for neural processing unit-based ML execution, and validate the framework through real-world hospital deployments under live traffic conditions.

Funding

This research received no external funding.

Data Availability Statement

The data supporting the findings of this study were synthetically generated during the simulation experiments. These data are not publicly available but may be obtained from the author upon reasonable request.

Conflicts of Interest

The author declares no conflicts of interest.

References

  1. International Data Corporation (IDC). The Growth in Connected IoT Devices Is Expected to Generate 79.4 ZB of Data in 2025, According to a New IDC Forecast. Available online: https://www.telecomtv.com/content/iot/the-growth-in-connected-iot-devices-is-expected-to-generate-79-4zb-of-data-in-2025-according-to-a-new-idc-forecast-35522/ (accessed on 16 April 2025).
  2. Sun, M.; Quan, S.; Wang, X.; Huang, Z. Latency-aware scheduling for data-oriented service requests in collaborative IoT-edge-cloud networks. Future Gener. Comput. Syst. 2025, 163, 107538.
  3. Banitalebi Dehkordi, A. EDBLSD-IIoT: A comprehensive hybrid architecture for enhanced data security, reduced latency, and optimized energy in industrial IoT networks. J. Supercomput. 2025, 81, 359.
  4. Khan, S.; Khan, S. Latency aware graph-based microservice placement in the edge-cloud continuum. Clust. Comput. 2025, 28, 88.
  5. Pervez, F.; Zhao, L. Efficient Queue-Aware Communication and Computation Optimization for a MEC-Assisted Satellite-Aerial-Terrestrial Network. IEEE Internet Things J. 2025, 12, 13972–13987.
  6. Tripathy, S.S.; Bebortta, S.; Mohammed, M.A.; Nedoma, J.; Martinek, R.; Marhoon, H.A. An SDN-enabled fog computing framework for wban applications in the healthcare sector. Internet Things 2024, 26, 101150.
  7. Wen, Z.; Yang, R.; Qian, B.; Xuan, Y.; Lu, L.; Wang, Z.; Peng, H.; Xu, J.; Zomaya, A.Y.; Ranjan, R. JANUS: Latency-aware traffic scheduling for IoT data streaming in edge environments. IEEE Trans. Serv. Comput. 2023, 16, 4302–4316.
  8. Mahapatra, A.; Majhi, S.K.; Mishra, K.; Pradhan, R.; Rao, D.C.; Panda, S.K. An energy-aware task offloading and load balancing for latency-sensitive IoT applications in the Fog-Cloud continuum. IEEE Access 2024, 12, 14334–14349.
  9. Du, A.; Jia, J.; Chen, J.; Wang, X.; Huang, M. Online Queue-Aware Service Migration and Resource Allocation in Mobile Edge Computing. IEEE Trans. Veh. Technol. 2025, 74, 8063–8078.
  10. San José, S.G.; Marquès, J.M.; Panadero, J.; Calvet, L. NARA: Network-Aware Resource Allocation mechanism for minimizing quality-of-service impact while dealing with energy consumption in volunteer networks. Future Gener. Comput. Syst. 2025, 164, 107593.
  11. Al-Saedi, A.A.; Boeva, V.; Casalicchio, E. Fedco: Communication-efficient federated learning via clustering optimization. Future Internet 2022, 14, 377.
  12. Centofanti, C.; Tiberti, W.; Marotta, A.; Graziosi, F.; Cassioli, D. Taming latency at the edge: A user-aware service placement approach. Comput. Netw. 2024, 247, 110444.
  13. Liu, Z.; Xu, X. Latency-aware service migration with decision theory for Internet of Vehicles in mobile edge computing. Wirel. Netw. 2024, 30, 4261–4273.
  14. Amzil, A.; Abid, M.; Hanini, M.; Zaaloul, A.; El Kafhali, S. Stochastic analysis of fog computing and machine learning for scalable low-latency healthcare monitoring. Clust. Comput. 2024, 27, 6097–6117.
  15. Li, Y.; Zhang, Q.; Yao, H.; Gao, R.; Xin, X.; Guizani, M. Next-Gen Service Function Chain Deployment: Combining Multi-Objective Optimization with AI Large Language Models. IEEE Netw. 2025, 39, 20–28.
  16. Ahmed, W.; Iqbal, W.; Hassan, A.; Ahmad, A.; Ullah, F.; Srivastava, G. Elevating e-health excellence with IOTA distributed ledger technology: Sustaining data integrity in next-gen fog-driven systems. Future Gener. Comput. Syst. 2025, 168, 107755.
  17. Najim, A.H.; Al-sharhanee, K.A.M.; Al-Joboury, I.M.; Kanellopoulos, D.; Sharma, V.K.; Hassan, M.Y.; Issa, W.; Abbas, F.H.; Abbas, A.H. An IoT healthcare system with deep learning functionality for patient monitoring. Int. J. Commun. Syst. 2025, 38, e6020.
  18. Ji, X.; Gong, F.; Wang, N.; Xu, J.; Yan, X. Cloud-Edge Collaborative Service Architecture with Large-Tiny Models Based on Deep Reinforcement Learning. IEEE Trans. Cloud Comput. 2025, 13, 288–302.
  19. Shang, L.; Zhang, Y.; Deng, Y.; Wang, D. MultiTec: A Data-Driven Multimodal Short Video Detection Framework for Healthcare Misinformation on TikTok. IEEE Trans. Big Data 2025; early access.
  20. Fei, Y.; Fang, H.; Yan, Z.; Qi, L.; Bilal, M.; Li, Y.; Xu, X.; Zhou, X. Privacy-Aware Edge Computation Offloading with Federated Learning in Healthcare Consumer Electronics System. IEEE Trans. Consum. Electron. 2025; early access.
  21. Ali, A.; Arafa, A. Delay sensitive hierarchical federated learning with stochastic local updates. IEEE Trans. Cogn. Commun. Netw. 2025; early access.
  22. Peng, Z.; Xu, C.; Wang, H.; Huang, J.; Xu, J.; Chu, X. P2b-trace: Privacy-preserving blockchain-based contact tracing to combat pandemics. In Proceedings of the 2021 International Conference on Management of Data, Virtual, 20–25 June 2021; pp. 2389–2393.
  23. EdgeCloudSim. Available online: https://github.com/CagataySonmez/EdgeCloudSim (accessed on 16 April 2025).
  24. Zhang, T.; Jin, J.; Zheng, X.; Yang, Y. Rate Adaptive Fog Service Platform for Heterogeneous IoT Applications. IEEE Internet Things J. 2019, 7, 176–188.
Figure 1. Illustration of latency-aware and rate-adaptive service flow across cloud, edge, and mist layers in a smart healthcare setting. RL and LSTM modules dynamically optimize traffic routing and service delivery.
Figure 2. Architecture of the ML-RASPF framework for optimal service provisioning.
Figure 3. ML-RASPF workflow from IoT input to rate-adaptive service provisioning across mist–edge–cloud.
Figure 4. Simulation topology for ML-RASPF with mist–edge–cloud architecture, predictive traffic and adaptive control modules. Each zone includes a mist layer, an edge cloudlets layer, and a diverse mix of latency-sensitive and latency-tolerant healthcare service consumers.
Figure 5. Transmission latency for diagnostic kiosks, video displays, and patient Wi-Fi services.
Figure 6. Service delivery rate vs. buffer size for diagnostic kiosks, video displays, and patient Wi-Fi services.
Figure 7. Total energy consumption (in Watts) for diagnostic kiosks, video displays, and patient Wi-Fi services across five provisioning schemes. ML-RASPF shows consistent energy savings across all service categories.
Figure 8. Bandwidth utilization across healthcare IoT services. ML-RASPF demonstrates superior utilization efficiency under variable traffic loads.
Figure 9. Load balancing efficiency across healthcare services. Higher percentages reflect better resource distribution across mist, edge, and cloud layers.
Table 1. Comparative analysis of recent service provisioning approaches in IoT and smart healthcare environments.

| Approach | ML-Based | Latency-Aware | Rate-Adaptive | Architecture | Domain | Key Limitations |
|---|---|---|---|---|---|---|
| Amzil et al. [14] | ✓ | × | × | Fog–Cloud | Healthcare | High overhead due to tensor mapping; lacks adaptiveness under dynamic loads. |
| Centofanti et al. [12] | × | ✓ | × | Edge–Cloud | Crowdsensing | Assumes deterministic environment; lacks real-time adaptability. |
| Mahapatra et al. [8] | × | ✓ | ✓ | Fog–Cloud | IoT/Healthcare | Uses metaheuristics; lacks learning-based decision-making or predictive models. |
| Wen et al. (JANUS) [7] | × | ✓ | ✓ | Edge–Cloud | Streaming | Queue-based heuristic stream selection; lacks proactive learning and mist integration. |
| Du et al. [9] | × | ✓ | ✓ | Edge–Cloud | General IoT | Focus on queue-aware migration; not suitable for highly variable healthcare demands. |
| Fei et al. [20] | ✓ | × | × | Edge–Cloud | Healthcare | Focuses on privacy with FL; lacks delivery rate optimization and latency support. |
| Najim et al. [17] | ✓ | × | × | Fog–Cloud | IoT–Vehicles | Lacks training scalability and rate adaptation; accuracy affected by data gaps. |
| Ji et al. [18] | ✓ | ✓ | × | Edge–Cloud | Smart City | DRL adds processing delay; lacks fine-grained control in mist layers. |
| Li et al. [15] | × | ✓ | × | Cloud–Edge | SFC | Multi-objective model, but lacks ML support and ignores dynamic adaptation. |
| ML-RASPF | ✓ | ✓ | ✓ | Mist–Edge–Cloud | Healthcare | Integrates ML forecasting + RL adaptation in modular real-time architecture. |
Table 2. Simulation parameters for smart healthcare evaluation.

| Parameter | Value/Description |
|---|---|
| Gradient-based step size λ | 0.01 |
| Number of service consumers | 9 |
| Number of service types | 3 (diagnostics, info, Wi-Fi) |
| Number of communication links | 13 |
| Link capacity t1 | 20 Mb/s |
| Link capacities t2–t4 | 16, 15, 14 Mb/s |
| Link capacities t5–t13 | 13 Mb/s |
| Edge forwarders (mist nodes) | 3 |
| Edge cloudlet nodes | 1 per forwarder |
| Central cloud nodes | 1 |
| Mist node energy consumption | 3.5 W (static baseline) |
| Edge node energy consumption | 3.7 W (static baseline) |
| Cloud energy consumption | 9.7 kW (data center model) |
| CPU | Intel Core i7 E3-1225 |
| Processor frequency | 3.3 GHz |
| RAM | 16 GB |
| Operating system | Windows 10 64-bit |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Rafique, W. ML-RASPF: A Machine Learning-Based Rate-Adaptive Framework for Dynamic Resource Allocation in Smart Healthcare IoT. Algorithms 2025, 18, 325. https://doi.org/10.3390/a18060325

