Hybrid Genetic Algorithm and Deep Reinforcement Learning Framework for IoT-Enabled Healthcare Equipment Maintenance Scheduling
Abstract
1. Introduction
- A scalable planning-and-control pipeline that fuses predictive maintenance signals with hybrid GA+DRL optimization.
- A comparative evaluation isolating the benefits of GA-seeded policy learning over pure DRL and pure metaheuristics across instance sizes and dynamics.
- An optional, permissioned blockchain layer for verifiable, auditable cross-facility records, integrated out-of-loop to preserve real-time performance.
2. Background
2.1. Literature Review
2.2. Critical Synthesis: Paradigm Trade-Offs in Healthcare Maintenance
- Adaptability: Metaheuristics (GAs, SA, ACO) excel at exploration over static spaces but lack mechanisms to incorporate sequential feedback from nonstationary IoT streams—each optimization run restarts from scratch, which makes these methods brittle when risk profiles shift dynamically [6,9]. DRL learns adaptive policies [12,13], yet without good initialization may require thousands of episodes to discover constraint-feasible solutions—a liability when patient safety demands rapid response.
- Sample efficiency: This DRL weakness is critical in healthcare, where simulation episodes are computationally expensive and real-world trial-and-error is ethically inadmissible. Metaheuristics produce competitive solutions with fewer evaluations [10,20] but cannot improve beyond their operators’ expressiveness. Hybrids like GA–DRL [15,16] provide diverse and constraint-satisfying curricula that reduce DRL’s cold-start penalty while enabling adaptive refinement, though compute allocation between phases remains empirically driven.
- Operational readiness: Healthcare demands interpretability, minimal infrastructure, and fail-safe behavior. Metaheuristics are transparent and lightweight, facilitating clinical buy-in; DRL policies resemble black boxes requiring drift monitoring [2]; hybrids inherit complexity from both, complicating deployment.
2.3. Identified Research Gaps
1. Domain-specific hybridization: Few works apply GA–DRL hybrids to multi-facility healthcare maintenance with life-critical devices, tight time windows, and specialized skills.
2. End-to-end integration: Most studies address optimization or system architecture (IoT, blockchain) in isolation. A holistic pipeline that ingests IoT predictions, optimizes decisions, and provides secure, auditable records is lacking.
3. Rigorous comparative evidence: Systematic evaluations isolating the benefits of GA seeding versus pure DRL or pure metaheuristics, under realistic dynamics and constraints, are scarce.
4. Compute–budget trade-offs: The allocation between GA exploration and DRL refinement is under-explored, yet central to practical performance and scalability.
2.4. Contribution and Future Research Directions
- An integrated pipeline that uses GA to generate diverse, feasible schedules and to warm-start DRL, improving sample efficiency and adaptation.
- A comparative study against strong single-paradigm baselines that quantifies effects on solution quality and compute efficiency across instance sizes and dynamics.
- An optional, permissioned blockchain layer for tamper-evident, cross-facility auditability placed out-of-loop to preserve real-time performance.
3. Methodology
3.1. Framework Overview
3.2. Problem Context and Formulation
3.3. Hybrid GA and DRL Solution Approach
3.3.1. Phase I: Genetic Algorithm Global Search
3.3.2. Phase II: Deep Reinforcement Learning Fine-Tuning
- States encode remaining tasks with their windows, risk indices, technician statuses (location, skill, availability), travel matrices, and streaming IoT risk updates.
- Actions select technician–task pairs following feasibility masks.
- Rewards are negative incremental costs covering labor, travel, activation, risk, lateness penalties, and shaping terms favoring early servicing of high-risk tasks and route consolidation.
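The three components above can be illustrated with a minimal sketch. It is not the implementation used in this work: the `Task` and `Tech` records, their field names, and the simple skill-and-shift test are assumptions made for illustration, and travel times are ignored in the mask for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    skill: str        # skill required by the task
    earliest: float   # start of the feasible window (hours)
    duration: float   # processing time (hours)


@dataclass(frozen=True)
class Tech:
    skills: frozenset  # skills held by the technician
    free_at: float     # time the technician becomes free (hours)
    shift_end: float   # end of the availability window (hours)


def feasibility_mask(tasks, techs):
    """mask[r][e] is True when technician r may be paired with task e."""
    mask = []
    for tech in techs:
        row = []
        for task in tasks:
            start = max(tech.free_at, task.earliest)
            ok = (task.skill in tech.skills
                  and start + task.duration <= tech.shift_end)
            row.append(ok)
        mask.append(row)
    return mask


def step_reward(cost_before, cost_after):
    """Reward is the negative incremental cost of the chosen pair."""
    return -(cost_after - cost_before)
```

In a policy network the mask would typically be applied to the action logits before sampling, so that infeasible technician–task pairs receive zero probability; lateness and risk are left to the reward rather than the mask in this sketch.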
3.4. Adaptive Computational Budgeting
3.5. IoT and Blockchain Integration
3.6. Implementation Overview
4. Results
4.1. Baselines and Statistical Protocol
4.2. Case Study 1: Regional Medical Center Network—Imaging Equipment
- Feasible windows: One CT task offers the widest slot (15 h), while X-ray has the narrowest (4 h). MRI jobs generally span 11–14 h, ensuring moderate flexibility.
- Failure windows: Several devices face tight deadlines (e.g., MRI should start before 9:30, X-ray by 22:00–23:00), whereas others allow later slack (e.g., X-ray starting at 18:00).
- Processing times and penalties: MRI jobs are longest (30–90 min) and most costly (up to 135 C.U./h lateness). CT tasks are medium (60–75 min, 75–90 C.U./h), while X-ray jobs are shortest (30–45 min, 45–60 C.U./h).
- Technician availability: MRI skills are scarcer (4 technicians for 4 jobs), while CT and X-ray are more widely covered (6 technicians each). Only 4 technicians work past 17:00, which is essential for late X-ray tasks. The split shifts [8:30, 12:30] and [13:30, 23:30] increase flexibility for narrow evening slots.
- GA Performance at Early Stages: The GA reaches its best solution early, with a cost of 1124.46 C.U. after just 10% of the available time (GA time fraction 0.1). GA alone cannot reduce the cost further, even with additional computational time, as shown by the unchanged cost of 1124.46 C.U. in the pure GA run (fraction 1.0). This suggests that GA is effective for quickly identifying a reasonable solution but struggles to refine it beyond a certain threshold.
- DRL Refinement and Superior Performance: DRL, when initialized with the solution provided by GA, improves solution quality. For GA time fractions from 0.1 to 0.4, DRL reduces the cost to the best observed value of 1075.85 C.U., a 4.32% improvement over GA’s solution. A cost of 1093.09 C.U. is already reached when DRL receives only 10% of the time (fraction 0.9), and it is further refined to 1075.85 C.U. once DRL receives 60% of the available time (fraction 0.4). However, when DRL starts from a low-quality initial solution (1449.87 C.U. at fraction 0.0, pure DRL), it struggles to converge to a competitive cost, reaching only 1105.15 C.U., which underscores the importance of a strong initial solution from GA.
- Impact of Time Allocation Balance: The results highlight the role of balancing computational time between GA and DRL to reach high-quality solutions efficiently. For instance, using just 10% of the time for GA to obtain a cost of 1124.46 C.U. (fraction 0.1) and allocating 60% of the remaining time to DRL for refinement results in the best cost of 1075.85 C.U. This balance avoids spending computational resources on diminishing returns from GA while leveraging DRL’s ability to fine-tune solutions.
- Penalty Insights: The best solutions, at GA time fractions 0.1 to 0.4, incur a risk penalty of 36.00 C.U. because one task is scheduled after its failure window begins. However, no lateness penalties are observed for any tested fraction, indicating that all solutions adhere to the feasible time windows, a critical factor in maintaining operational efficiency.
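The improvement percentages quoted above follow from the reported costs as (GA cost − DRL cost) / GA cost × 100. The short check below recomputes them from a subset of the Case Study 1 figures; the function name and the dictionary layout are illustrative only.

```python
def improvement_pct(ga_cost, drl_cost):
    """Relative cost reduction of DRL over its starting point, in percent."""
    return 100.0 * (ga_cost - drl_cost) / ga_cost


# (GA cost, DRL cost) in C.U., keyed by the fraction of time given to GA,
# copied from the Case Study 1 results.
results = {
    0.0: (1449.87, 1105.15),
    0.1: (1124.46, 1075.85),
    0.5: (1124.46, 1093.09),
    1.0: (1124.46, 1124.46),
}

for fraction, (ga, drl) in results.items():
    print(f"{fraction:.1f}: {improvement_pct(ga, drl):.2f}%")

# Rank configurations by final cost, not by percentage improvement.
best_fraction = min(results, key=lambda f: results[f][1])
```

The pure DRL row shows the largest percentage only because its starting cost is poor; ranking by final cost selects the fraction 0.1.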
4.3. Case Study 2: Metropolitan Hospital System—Critical Care Equipment
- Feasible windows: Surgical Robot tasks generally span 8 h, providing moderate flexibility, while certain Monitor jobs have 10 h slots. The narrowest window belongs to a Monitor task (8 h), while the widest spans 12 h (a Ventilator task).
- Failure windows: Tight deadlines appear frequently: for example, one Dialysis task must be started by 10:00–11:30 and one Monitor task by 12:00–14:00, while others allow late slack, such as a Dialysis task with a 15:00–17:00 failure window.
- Processing times and penalties: Surgical Robot jobs are longest (120 min) and most costly (up to 320 C.U./h lateness). Dialysis jobs are medium-length (75–90 min, 140–158 C.U./h). Ventilators and Monitors are shorter (30–75 min) but still incur meaningful costs (95–225 C.U./h for Ventilators, 95–107 C.U./h for Monitors).
- Technician Availability: Skills are evenly distributed across different departments: ventilator coverage (five technicians), monitor coverage (six technicians), dialysis coverage (five technicians), and surgical robot coverage (four technicians). However, late-evening capacity is limited, with only three technicians extending beyond 19:00, which is crucial for the latest tasks. Shift overlaps provide resilience for early starts, while high-cost senior staff are pivotal for meeting narrow robot-task deadlines.
- GA Performance and Early Convergence: The GA finds a good solution early in the process, reaching a cost of 4384.68 C.U. and then 4363.86 C.U. with slightly more time. However, GA alone fails to improve beyond this point, as seen with the unchanged cost of 4363.86 C.U. in the pure GA run. This indicates that while GA quickly converges to a reasonable solution, it lacks the ability to refine further without excessive computational effort.
- DRL’s Refinement Capability: DRL, starting from GA’s solutions, improves the cost for most GA time fractions. The best performance is a cost of 4121.73 C.U., the lowest value observed, which is a 5.55% improvement over GA’s result of 4363.86 C.U. When starting from the GA cost of 4384.68 C.U., DRL still delivers a near-best cost of 4128.69 C.U. (5.84% improvement). However, at higher GA fractions (0.9 and 1.0), DRL offers no improvement, retaining GA’s cost of 4363.86 C.U., likely due to insufficient time for effective refinement. Conversely, in the pure DRL case, starting from a high initial cost of 4872.01 C.U., DRL reaches 4145.19 C.U. (14.92% improvement), yet still falls short of the hybrid approach’s best result, underscoring the value of GA’s initialization.
- Penalty Observations: In the best solution, the risk penalty is 1.52 C.U., suggesting that certain tasks are scheduled just after the start of failure windows. Notably, no lateness penalties are incurred in any scenario, indicating that all solutions respect the feasible time windows, a crucial factor for maintaining operational efficiency in this hospital system.
4.4. Case Study 3: Large-Scale Healthcare Network—Multi-Modal Equipment
- Time windows structure: Feasible-start windows vary across jobs (earliest feasible starts around 2:00 and latest feasible ends up to 23:00), producing heterogeneous scheduling windows across the horizon. Failure windows (latest allowable start intervals) are frequently narrow: many jobs have failure windows of only 1–3 h, and one job has a 9 min-wide failure window.
- Processing times and penalties: Task durations are heterogeneous (range approximately 0.5–1.5 h), with some equipment types split between short and medium jobs. Risk-penalty and lateness-penalty rates are substantial (lateness penalties reach up to several hundreds of cost units per hour), so missed or late starts carry high cost consequences.
- Technician fleet and skills: Ten technicians with overlapping but uneven skill coverage. Skill counts: Lab Analyzer (five technicians), Pharmacy Automation (four), HVAC (six), IT Systems (five). Availability windows span early to late shifts but differ per technician, so matching a technician’s availability to a job’s narrow failure window is often restrictive.
- Overall best configuration (small GA + DRL refinement): The lowest total cost is obtained for small GA time fractions (up to 0.3), where GA is given a small share of the time and DRL performs the bulk of the refinement. This configuration also yields a lower risk penalty (258.99 C.U.), indicating better handling of failure-window exposures.
- GA convergence and plateaus: GA rapidly reaches a stable cost once given time and remains at that value as its time share increases. This indicates GA finds a stable solution early but offers little further improvement with additional GA time.
- DRL refinement behavior in hybrid runs: For GA time fractions up to 0.3, DRL refines the GA solution to the best observed cost. For larger fractions up to 0.9, DRL ends at a higher cost, suggesting that when GA is allocated too much relative time, or when DRL has an insufficient refinement budget, DRL struggles to escape GA’s solution basin. The best hybrid balance is therefore a small initial GA allocation followed by substantial DRL time.
- Penalty patterns and feasibility concerns: The lateness penalty remains above zero across all runs, indicating persistent, small lateness that none of the tested configurations eliminated. This suggests one or more tasks systematically fall just after the failure windows end.
4.5. Case Study 4: Specialty Care Centers – High-Tech Equipment
4.6. Overall Performance Analysis
4.7. Robustness and Stress Tests
- Update latency (state ingest + policy decode): 180–260 ms (p95).
- Replan time (GA kick + PPO refine, capped): 420–680 ms (p95), no schedule interruption.
- Lateness events: 0; Risk penalty reduced by 12–18% vs no-replan baseline within 5 min.
5. Discussion
6. Conclusions
6.1. Answer to the Research Question and Research Goals
6.2. Key Contributions
1. Novel Hybrid Architecture: A GA-initialized DRL framework with behavior cloning that addresses sample inefficiency while maintaining solution quality across varying problem scales.
2. Computational Efficiency: Empirical evidence that minimal GA allocation (10–20% of budget) followed by DRL refinement achieves the best observed performance; with an early-stopping protocol we realized average wall-time savings of 47.5% with negligible cost differences (<0.5%).
3. Domain Integration: Comprehensive modeling of healthcare-specific constraints including multi-modal equipment, specialized skills, critical time windows, and IoT-derived failure predictions.
4. Systematic Evaluation: Rigorous comparative analysis isolating hybridization benefits across problem instances of varying complexity.
6.3. Practical Impact
6.4. Limitations and Future Work
- Real-world deployment with human-in-the-loop validation.
- Exploration of alternative hybrid architectures (e.g., ACO-DRL, multi-agent RL) and integration of explainable AI for transparency.
- Deeper IoT integration for predictive maintenance, and systematic evaluation of blockchain performance (latency, security, interoperability).
- Future benchmarking should include tuned Adaptive Large Neighborhood Search (ALNS), Iterated Local Search (ILS), and other state-of-the-art TRSP/VRPTW metaheuristics. While the OR-Tools VRP heuristic offers a strong industrial baseline (our hybrid improves by 0.93–2.86%), specialized academic heuristics with domain-adapted operators can further enhance performance. Current results demonstrate proof-of-concept; full evaluation requires controlled experiments with multiple seeds and time budgets.
Author Contributions
Funding
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A. Mathematical Model
Appendix A.1. Sets and Indices
- Set of maintenance tasks; each task e is located at a facility.
- Set of technicians, indexed by r.
- For each technician r, two artificial depot nodes mark the start and the end of the route.
- A node set is defined for each technician r.
Appendix A.2. Parameters
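- Task windows: a feasible window and a predicted failure window for each task e.
- Technician availability: an availability window for each technician r.
- Travel times between nodes i and j; processing times for each task e.
- Costs: activation and hourly cost for each technician; risk rate and lateness rate for each task.
- Skills: each task e requires one skill; each technician r has a skill set.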
- Big-M constant M.
Appendix A.3. Decision Variables
- Assignment variable (binary): 1 if task e is assigned to technician r.
- Activation variable (binary): 1 if technician r is activated.
- Routing variable (binary): 1 if technician r travels from i to j, for i and j in the node set of r.
- Start time of task e.
- Route start and end times of technician r.
- Auxiliary variables for risk exposure, capped exposure, and lateness.
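One plausible reading of the auxiliary variables is that risk exposure grows with the time a task starts after its failure window opens and is capped at the width of that window, while lateness is the time past the end of the feasible window. The sketch below encodes that reading. The function names, the choice of cap, and the hour-based units are assumptions for illustration, not the model's stated constraints.

```python
def risk_exposure(start, fail_open, fail_close):
    """Hours started after the failure window opens, capped at its width."""
    raw = max(0.0, start - fail_open)          # uncapped exposure
    return min(raw, fail_close - fail_open)    # capped exposure (assumed cap)


def lateness(start, feasible_close):
    """Hours started after the feasible window closes."""
    return max(0.0, start - feasible_close)


def task_penalty(start, fail_open, fail_close, feasible_close,
                 risk_rate, late_rate):
    """Risk plus lateness penalty for one task, in cost units."""
    return (risk_rate * risk_exposure(start, fail_open, fail_close)
            + late_rate * lateness(start, feasible_close))


# A CT task with failure window [10:00, 14:00], feasible window ending at
# 21:00, risk rate 30 C.U./h and lateness rate 75 C.U./h, started 1.2 h
# after the failure window opens.
penalty = task_penalty(11.2, 10.0, 14.0, 21.0, 30.0, 75.0)
```

Under these assumptions the example evaluates to about 36 C.U., the same value as the risk penalty reported for the best Case Study 1 schedule; the text does not confirm that exposure is measured exactly this way.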
Appendix A.4. Objective
Appendix A.5. Constraints
Appendix B. Software Architecture
Appendix B.1. Domain and Scheduling Model
- “Equipment”, “Technician”, and “Facility” classes represent immutable domain entities.
- “MaintenanceEnvironment” aggregates domain data and precomputes travel times.
- “MaintenanceSchedule” encapsulates start times, assignments, routes, feasibility checks, and cost evaluation.
Appendix B.2. Optimization Engines
- “GeneticAlgorithm” maintains populations of MaintenanceSchedule instances, performs initialization, selection, crossover (Order Crossover and Block-Exchange), mutation (swap, insertion, two-opt, scramble), and local search improvements.
- “ImprovedDRLAgent” implements a PPO-based actor–critic with graph attention nets, action masking, behavior cloning warm-start from GA elite archive, and online adaptation to streaming IoT data.
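Two of the operators named above, Order Crossover (OX) and swap mutation, can be sketched on a plain task-order permutation. Encoding a schedule as a list of task identifiers is an assumption made here for brevity; the framework operates on “MaintenanceSchedule” instances, and the function names below are hypothetical.

```python
import random


def order_crossover(p1, p2, rng):
    """Classic OX: keep a slice of p1, fill the rest in p2's cyclic order."""
    n = len(p1)
    a, b = sorted(rng.sample(range(n), 2))
    child = [None] * n
    child[a:b + 1] = p1[a:b + 1]
    kept = set(p1[a:b + 1])
    # Genes of p2 read cyclically from just after the second cut point.
    donors = [p2[(b + 1 + k) % n] for k in range(n)]
    donors = [g for g in donors if g not in kept]
    # Empty positions, visited in the same cyclic order.
    slots = [(b + 1 + k) % n for k in range(n)
             if child[(b + 1 + k) % n] is None]
    for pos, gene in zip(slots, donors):
        child[pos] = gene
    return child


def swap_mutation(seq, rng):
    """Return a copy of seq with two distinct positions exchanged."""
    i, j = rng.sample(range(len(seq)), 2)
    out = list(seq)
    out[i], out[j] = out[j], out[i]
    return out


rng = random.Random(0)
parent_a = list(range(8))
parent_b = [3, 7, 0, 5, 1, 6, 2, 4]
child = swap_mutation(order_crossover(parent_a, parent_b, rng), rng)
```

Both operators preserve the permutation property, so every task still appears exactly once in the child; feasibility with respect to skills and time windows would have to be checked or repaired separately.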
Appendix B.3. Orchestration and Integration
- “HybridGADRLFramework” coordinates GA and DRL phases, manages the adaptive computational budgeting parameter, and returns optimized schedules.
- Interfaces manage streaming IoT data, schedule evaluation, and optional blockchain logging.
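The coordination role described above amounts to splitting a wall-clock budget between the two phases. A minimal sketch follows, assuming hypothetical `ga_step` and `drl_step` callbacks that each take the incumbent solution and return a possibly improved one; none of these names come from the actual code base, and the real orchestrator additionally adapts the split.

```python
import time


def run_hybrid(ga_step, drl_step, total_seconds, ga_fraction,
               clock=time.monotonic):
    """Spend ga_fraction of the budget on GA, the remainder on DRL."""
    if not 0.0 <= ga_fraction <= 1.0:
        raise ValueError("ga_fraction must lie in [0, 1]")
    start = clock()
    ga_deadline = start + ga_fraction * total_seconds
    end = start + total_seconds
    best = None
    # Phase I: global search until the GA share of the budget is used.
    while clock() < ga_deadline:
        best = ga_step(best)
    # Phase II: refinement, warm-started from the GA incumbent.
    while clock() < end:
        best = drl_step(best)
    return best
```

Setting `ga_fraction` to 0.0 or 1.0 reproduces the pure DRL and pure GA baselines, respectively; the `clock` argument exists only so the loop can be driven by a fake clock in tests.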
Appendix B.4. Dependency Diagram


Appendix C. Blockchain Layer Details
Appendix C.1. Data Model and Events
Appendix C.2. Privacy and GDPR
- Lawful basis: maintenance/quality logs for patient safety and compliance.
- Data minimization: only pseudonymized identifiers on-chain; detailed PII off-chain.
- Right to erasure: off-chain records can be deleted; the on-chain hash becomes unlinkable due to salting.
- Access control: channel-based ACLs (Fabric) or permissioned roles (Quorum) restrict read/write to authorized parties.
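The data-minimization and erasure points above rest on storing only a salted digest on-chain. The sketch below shows the idea with the Python standard library; the record format, the in-memory `off_chain` dictionary, and the function names are assumptions for illustration and do not describe the ledger's actual API.

```python
import hashlib
import hmac
import secrets


def on_chain_digest(record: bytes, salt: bytes) -> str:
    """Salted SHA-256 digest; only this value would be written on-chain."""
    return hashlib.sha256(salt + record).hexdigest()


def verify(record: bytes, salt: bytes, digest: str) -> bool:
    """Check an off-chain record against its on-chain digest."""
    return hmac.compare_digest(on_chain_digest(record, salt), digest)


# Off-chain store keeps the full record together with its random salt.
off_chain = {}
record = b"technician=pseudonym-17;task=ct-check;status=done"
salt = secrets.token_bytes(16)
off_chain["evt-1"] = (record, salt)
digest = on_chain_digest(record, salt)

assert verify(*off_chain["evt-1"], digest)

# Erasure: deleting the record and its salt leaves only the digest, which
# can no longer be recomputed from any retained data.
del off_chain["evt-1"]
```

Because the salt is random and stored only off-chain, an observer who guesses the record contents still cannot confirm the guess against the digest once the salt has been deleted.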
Appendix C.3. Throughput and Latency
- Hyperledger Fabric (v2.x, RAFT, 2 orgs × 2 peers, endorsement policy 2/2): p95 180–220 ms, p99 240–300 ms; stable throughput for small payloads (1–2 kB).
- Quorum (IBFT, 4 validators): p95 140–190 ms, p99 200–260 ms; stable throughput for small payloads.
Appendix C.4. Benchmark
| Platform | tps (avg) | p95 [ms] | p99 [ms] | Err [%] |
|---|---|---|---|---|
| Fabric (2 orgs, RAFT) | 290 | 200 | 270 | 0.0 |
| Quorum (IBFT, 4 val.) | 420 | 170 | 230 | 0.1 |
Appendix C.5. Off-Loop Rationale
Appendix D. Detailed Schedule Analysis
Appendix D.1. Case Study 1: Task-Level Assignment Details
- Technician–Equipment Associations: The solution assigns technicians to equipment based on skill compatibility and availability. One technician handles a significant workload, managing five tasks across different modalities and facilities and drawing on both MRI and X-ray skills. A second technician covers three tasks, balancing CT and X-ray maintenance. A third is assigned solely to one CT task, while a fourth handles the late-night X-ray task. The remaining technicians are unassigned in this solution, which keeps activation costs down.
- Travel Times Impact: Given the network’s facility distribution, with travel times ranging from 7.1 to 43 min (average 22.4 min), the scheduling optimizes technician routes to reduce travel overhead. For instance, the route of the technician with five tasks spans five facilities in a clustered sequence that avoids long transits. Similarly, the three tasks of the second technician exploit geographic proximity in the north-central cluster. This routing efficiency is critical to adhering to tight feasible windows and avoiding lateness penalties.
- Risk Highlight: A point of concern in the schedule is the maintenance of one CT device, which starts at approximately 11:20. This places the task after the beginning of its failure window (10:00–14:00), introducing a risk of operational disruption. The delay contributes a risk penalty of 36 C.U. to the overall cost of the solution. Mitigation strategies or rescheduling could be explored to address this vulnerability.
- Cost-Efficient Assignment: Assigning a separate technician to the late X-ray task (starting at 19:00), instead of the technician already assigned to an earlier task, is a deliberate choice to minimize waiting-time costs. Reusing the already-active technician would have created a long idle period between the completion of the earlier task (even if that task were scheduled later, just before its failure window begins at 11:00) and the 19:00 start, incurring additional hourly costs (50 C.U./h for that technician). Activating another technician at a fixed cost of 120 C.U. for this late-night task is the cheaper option.
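The trade-off in the last point can be checked with simple arithmetic. The sketch below compares keeping the already-active technician on the clock against activating a second one. It assumes the earlier task ends at about 12:00, that idle time is paid at the hourly rate, that the X-ray task lasts 30 min, and that the second technician costs 40 C.U./h (the hourly rate listed alongside a 120 C.U. activation cost in the technician table); travel is ignored and the function names are illustrative.

```python
def reuse_cost(idle_hours, hourly_rate, task_hours):
    """Extra cost when an already-active technician waits, then works."""
    return hourly_rate * (idle_hours + task_hours)


def activate_cost(activation, hourly_rate, task_hours):
    """Cost of activating a fresh technician for this task only."""
    return activation + hourly_rate * task_hours


# Assumed timeline: earlier task ends around 12:00, X-ray starts at 19:00.
reuse = reuse_cost(idle_hours=7.0, hourly_rate=50.0, task_hours=0.5)
fresh = activate_cost(activation=120.0, hourly_rate=40.0, task_hours=0.5)

print(f"reuse: {reuse:.0f} C.U., fresh activation: {fresh:.0f} C.U.")
```

With these assumed figures, reuse costs 375 C.U. against 140 C.U. for a fresh activation, so the activation fee is recovered as soon as the idle period exceeds a few hours.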
Appendix D.2. Case Study 2: Task-Level Assignment Details
- Technician–Equipment Associations: The solution assigns technicians to tasks based on skill compatibility, availability, and cost efficiency. For instance, one technician with Surgical Robot and Monitor skills is assigned to multiple tasks, including late-afternoon Surgical Robot maintenance, leveraging their availability (9:00–17:00) and moderate costs (activation 220 C.U., hourly 68 C.U./h). The technician with the highest activation cost (260 C.U.) and hourly rate (85 C.U./h) is selectively assigned to critical Surgical Robot tasks with late windows, prioritizing their expertise and late availability (12:00–20:00) over cheaper alternatives. A technician with a lower hourly cost (65 C.U./h) covers early Ventilator and Monitor tasks in the southern corridor, demonstrating cost-effective allocation for early slots.
- Travel Times Impact: With travel times ranging from under 10 min within clusters to 45 min across clusters (average 23 min), the scheduling optimizes routes to minimize transit costs. Technicians are often assigned to tasks within the same geographic cluster to reduce travel. For example, the assignments of one technician with Monitor and Dialysis skills focus on the north-central hub facilities, while a technician with Dialysis and Ventilator skills covers tasks in the eastern group, ensuring efficient routing and adherence to feasible windows.
- Risk Highlight for Late Tasks: A minor concern arises with one task scheduled just after the beginning of its failure window, contributing a risk penalty of 1.52 C.U. to the overall cost (4121.73 C.U.). Tasks with tight failure deadlines, such as a Monitor task (12:00–14:00) or a Dialysis task (10:00–11:30), may face delays due to technician routing or prior commitments, introducing operational risks. Future iterations could explore rescheduling or additional technician activation to mitigate these penalties.
- Cost-Efficient Technician Selection: Specific technician choices reflect a balance between activation costs, hourly rates, and availability. For example, one technician with Monitor and Dialysis skills (activation 185 C.U., hourly 58 C.U./h) is preferred for mid-to-late tasks over an alternative with similar costs (activation 180 C.U., hourly 60 C.U./h), because the former’s extended availability (11:00–19:00) aligns better with later windows and avoids potential overtime or waiting costs. Similarly, a technician with Ventilator and Monitor skills (activation 210 C.U., hourly 70 C.U./h) is chosen for late tasks in the eastern group over a cheaper alternative (activation 190 C.U., hourly 62 C.U./h) because of the former’s availability into the evening (14:00–22:00), ensuring coverage for late jobs without incurring additional waiting or rescheduling expenses.
Appendix E. List of Abbreviations and Acronyms
| Acronym | Definition |
|---|---|
| ACL | Access Control List |
| ACO | Ant Colony Optimization |
| AI | Artificial Intelligence |
| ALNS | Adaptive Large Neighborhood Search |
| APA | American Psychological Association |
| API | Application Programming Interface |
| C.U. | Cost Units |
| CI | Confidence Interval |
| CP-SAT | Constraint Programming - Satisfiability |
| CT | Computed Tomography |
| DRL | Deep Reinforcement Learning |
| DQN | Deep Q-Network |
| GA | Genetic Algorithm |
| GAE | Generalized Advantage Estimation |
| GDPR | General Data Protection Regulation |
| HHCRSP | Home Healthcare Routing and Scheduling Problem |
| HVAC | Heating, Ventilation, and Air Conditioning |
| IBFT | Istanbul Byzantine Fault Tolerance |
| ILS | Iterated Local Search |
| IoT | Internet of Things |
| IT | Information Technology |
| MAML | Model-Agnostic Meta-Learning |
| MILP | Mixed-Integer Linear Programming |
| MIP | Mixed-Integer Programming |
| MLP | Multi-Layer Perceptron |
| MRI | Magnetic Resonance Imaging |
| MS-TRSP-TW | Multi-Skill Technician Routing and Scheduling Problem with Time Windows |
| NSGA-II | Non-dominated Sorting Genetic Algorithm II |
| NP | Non-deterministic Polynomial time |
| OR | Operations Research |
| OX | Order Crossover |
| PII | Personally Identifiable Information |
| PPO | Proximal Policy Optimization |
| QC | Quality Control |
| RAFT | Raft Consensus Algorithm |
| ReLU | Rectified Linear Unit |
| RL | Reinforcement Learning |
| RPL | Routing Protocol for Low-Power and Lossy Networks |
| SA | Simulated Annealing |
| TRSP | Technician Routing and Scheduling Problem |
| UUID | Universally Unique Identifier |
| VRP | Vehicle Routing Problem |
| VRPTW | Vehicle Routing Problem with Time Windows |
| WSN | Wireless Sensor Network |
References
- de la Fuente-Valentin, L.; Carrasco, A.; Rios, J.; Pasek, Z.; Mendieta, R.; Puche, J. A systematic review on predictive maintenance in the healthcare sector: E-health and IoT solutions for maintenance management. J. Ind. Inf. Integr. 2022, 29, 100344.
- Chen, P.; Sheng, S.; Chen, Z.; Wu, L.; Yao, Y. Deep Reinforcement Learning-Based Task Scheduling in IoT Edge Computing. Sensors 2021, 21, 1666.
- Yu, C.H.; Tsai, J.; Chang, Y.T. Intelligent Path Planning for UAV Patrolling in Dynamic Environments Based on the Transformer Architecture. Electronics 2024, 13, 4716.
- Foggetti, A.; Nucci, F.; Papadia, G. Tuning Metaheuristics with Tree-Structured Parzen Estimator: A Case Study on Scheduling. J. Artif. Intell. Auton. Intell. 2025, 2, 293–321.
- Hassan, K.M.; Abdo, A.; Yakoub, A. Enhancement of Health Care Services Based on Cloud Computing in IOT Environment Using Hybrid Swarm Intelligence. IEEE Access 2022, 10, 105877–105886.
- Mwanza, J.; Telukdarie, A.; Igusa, T. Optimising Maintenance Workflows in Healthcare Facilities: A Multi-Scenario Discrete Event Simulation and Simulation Annealing Approach. Modelling 2023, 4, 224–250.
- Gayford, J.D.; Parragh, S.N.; Vancroonenburg, W. A two-phase heuristic for the technician routing and scheduling problem with experience-based service times. Eur. J. Oper. Res. 2021, 293, 351–366.
- Fikar, C.; Hirsch, P. Home health care routing and scheduling: A review. Comput. Oper. Res. 2017, 77, 86–95.
- Ahmed, R.; Nasiri, F.; Zayed, T. Genetic Algorithm-based Clustering Methodology for Maintenance Scheduling in Healthcare Facilities. In Proceedings of the 2021 International Conference on Decision Aid Sciences and Application (DASA), Sakheer, Bahrain, 7 December 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 643–646.
- Nucci, F.; Papadia, G.; Fedeli, E. Optimized Scheduling of IoT Devices in Healthcare Facilities: Balancing Cost and Quality of Care. Appl. Sci. 2025, 15, 4456.
- Mavrovouniotis, M.; Muller, F.; Yang, S. Ant Colony Optimization With Local Search for Dynamic Traveling Salesman Problems. IEEE Trans. Cybern. 2017, 47, 1743–1756.
- Zhang, C.; Li, P.; Guan, Z.; Gao, L.; Chen, Z. A deep reinforcement learning based framework for solving flexible job shop scheduling problem. Knowl.-Based Syst. 2020, 190, 105173.
- Ros, S.; Ryoo, I.; Kim, S. DRL-Driven Intelligent SFC Deployment in MEC Workload for Dynamic IoT Networks. Sensors 2024, 25, 4257.
- Lilhore, U.K.; Simaiya, S.; Sharma, Y.K.; Rai, A.K.; Padmaja, S.M.; Nabilalm, K.V.; Kumur, V.; Alroobaea, R.; Alsufyani, H. Cloud-edge hybrid deep learning framework for scalable IoT resource optimization. J. Cloud Comput. 2025, 14, 5.
- Liu, Z.; Wang, R.; Wang, T.; Yang, X. GA-DRL: Graph Neural Network-Augmented Deep Reinforcement Learning for DAG Task Scheduling. arXiv 2023, arXiv:2307.00777.
- Faraj, H.; Ahmed, Z. Optimizing RPL with a Hybrid GA-RL Approach: Field-Validated Performance for IoT WSNs. IEEE Internet Things J. 2025, Submitted.
- Tijan, E.; Jović, M.; Jardas, M.; Gulić, M. Blockchain Technology for Record-Keeping in Maritime Main-Engine Maintenance. J. Mar. Sci. Eng. 2021, 9, 952.
- Shuaib, K.; Saleous, H.; Shuaib, M.; Zaki, N. A systematic review of blockchain in healthcare: Frameworks, prototypes, and challenges. J. Netw. Comput. Appl. 2021, 186, 103080.
- Miuccio, L.; Riolo, S.; Bennis, M.; Panno, D. On Learning Intrinsic Rewards for Faster Multi-Agent Reinforcement Learning based MAC Protocol Design in 6G Wireless Networks. In Proceedings of the ICC 2023—IEEE International Conference on Communications, Rome, Italy, 28 May–1 June 2023; pp. 466–471.
- Restrepo, M.; Gendreau, M.; Lahrichi, N. A two-phase metaheuristic for the home health care routing and scheduling problem with flexible services. Comput. Ind. Eng. 2021, 159, 107386.
- Nucci, F.; Papadia, G. Data of Case study ‘Hybrid Genetic Algorithm and Deep Reinforcement Learning Framework for IoT-Enabled Healthcare Equipment Maintenance Scheduling’. 2025. Available online: https://github.com/fnuni/2025elect (accessed on 16 October 2025).




| Study | Methodology | Key Advantages | Limitations & Gaps |
|---|---|---|---|
| Metaheuristic Approaches | |||
| Mwanza et al. [6] | Discrete Event Simulation + Simulated Annealing (SA) | Reduces costs/delays with a validated simulation model. | Static model; limited adaptability to new patterns. |
| Restrepo et al. [20] | Two-phase metaheuristic for HHCRSP | Handles complex constraints (skills, flexible services). | Not adaptive to real-time data; risk of local optima. |
| Ahmed et al. [9] | Hybrid GA + hierarchical clustering | Effective task grouping reduces downtime. | Single-facility focus; no real-time IoT integration. |
| Nucci et al. [10] | Multi-objective GA (NSGA-II) | Balances cost and quality-of-care; demonstrates scalability. | Lacks online adaptation to dynamic events. |
| DRL & Hybrid Approaches | |||
| Zhang et al. [12] | DRL for Job-Shop Scheduling | Learns effective dispatching policies in complex settings. | Sample-inefficient; not tailored to routing with skills. |
| Ros et al. [13] | DRL for Service Function Chaining deployment in Multi-access Edge Computing | Smart, low-latency orchestration for IoT workloads. | Network orchestration, not physical maintenance logistics. |
| Liu et al. [15] | GA-initialized DRL (graph attention) | GA seeding improves convergence/quality for Directed Acyclic Graph scheduling. | Evaluated on computational tasks, not field logistics. |
| Lilhore et al. [14] | Hybrid DRL (DQN+PPO) in Cloud–Edge | Reduces time/energy; highlights distributed intelligence. | Assumes cloud–edge architecture; deployment complexity. |
| Faraj et al. [16] | Hybrid GA–RL for IPv6 Routing Protocol for Low-Power and Lossy Networks optimization | Field-validated gains in Wireless Sensor Network efficiency. | Network-layer focus; indirect relevance to scheduling. |
| Surveys & Reviews | |||
| Chen et al. [2] | Survey of DRL for IoT–Edge–Cloud scheduling | Comprehensive taxonomy and challenges. | Review; no algorithmic contribution. |
| Case | GA | DRL | Hybrid (best GA time fraction) | Wilcoxon p-value |
|---|---|---|---|---|
| 1 | , | |||
| 2 | , | |||
| 3 | , | |||
| 4 | , |
| Equipment (Skill, Facility) | Feasible Window | Failure Window | Proc. Time (min) | Risk Cost (C.U./h) | Lateness Penalty (C.U./h) |
|---|---|---|---|---|---|
| (MRI, ) | [1:00, 14:00] | [9:30, 13:00] | 30 | 45 | 105 |
| (CT, ) | [1:00, 16:00] | [11:00, 15:00] | 60 | 38 | 90 |
| (MRI, ) | [3:00, 14:00] | [9:00, 12:30] | 90 | 53 | 120 |
| (X-ray, ) | [7:00, 13:00] | [8:30, 12:00] | 45 | 23 | 60 |
| (CT, ) | [9:00, 21:00] | [10:00, 14:00] | 60 | 30 | 75 |
| (X-ray, ) | [19:00, 23:00] | [22:00, 23:00] | 30 | 15 | 45 |
| (MRI, ) | [2:00, 14:00] | [9:00, 12:30] | 75 | 60 | 135 |
| (CT, ) | [9:00, 15:00] | [10:30, 14:00] | 75 | 38 | 90 |
| (X-ray, ) | [7:00, 24:00] | [18:00, 22:30] | 30 | 15 | 53 |
| (MRI, ) | [2:00, 16:00] | [11:00, 15:00] | 30 | 45 | 105 |
| Technician (skills) | Activation Cost (C.U.) | Hourly Cost (C.U./h) | Availability |
|---|---|---|---|
| (MRI, CT) | 200 | 65 | [3:00, 17:00] |
| (MRI, X-ray) | 180 | 55 | [1:00, 17:00] |
| (CT, X-ray) | 160 | 50 | [2:00, 16:00] |
| (CT, X-ray) | 120 | 40 | [8:30, 12:30] |
| (MRI, CT) | 200 | 65 | [3:00, 23:00] |
| (MRI, X-ray) | 180 | 55 | [1:00, 23:00] |
| (CT, X-ray) | 160 | 50 | [2:00, 23:00] |
| (CT, X-ray) | 120 | 40 | [13:30, 23:30] |
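
The equipment and technician tables above pair a required skill and a feasible window with each technician's skill set and availability. The following is a minimal sketch of how such rows could be represented and screened for eligibility. The class names, the `can_serve` rule, and the variable names are hypothetical illustrations, not the constraint model used in the paper; travel time between facilities is ignored.

```python
from dataclasses import dataclass


def to_minutes(hhmm: str) -> int:
    """Convert 'H:MM' (as printed in the tables) to minutes after midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


@dataclass(frozen=True)
class Equipment:
    skill: str                 # required skill, e.g. "MRI"
    feasible: tuple[int, int]  # feasible window, in minutes
    proc_time: int             # processing time, in minutes


@dataclass(frozen=True)
class Technician:
    skills: frozenset[str]
    available: tuple[int, int]  # availability window, in minutes


def can_serve(tech: Technician, eq: Equipment) -> bool:
    """Hypothetical eligibility rule: the skill matches and the overlap of the
    two windows is long enough to fit the processing time."""
    if eq.skill not in tech.skills:
        return False
    start = max(tech.available[0], eq.feasible[0])
    end = min(tech.available[1], eq.feasible[1])
    return end - start >= eq.proc_time


# First equipment row and two technician rows taken from the tables above.
mri = Equipment("MRI", (to_minutes("1:00"), to_minutes("14:00")), 30)
tech_a = Technician(frozenset({"MRI", "CT"}), (to_minutes("3:00"), to_minutes("17:00")))
tech_b = Technician(frozenset({"CT", "X-ray"}), (to_minutes("8:30"), to_minutes("12:30")))

print(can_serve(tech_a, mri))  # True: skill matches and the windows overlap
print(can_serve(tech_b, mri))  # False: no MRI skill
```

A filter of this kind only prunes technician and equipment pairs that cannot be matched at all; sequencing, travel, and cost trade-offs remain the job of the optimizer.
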
| | GA Cost (C.U.) | DRL Cost (C.U.) | Improvement (%) | Risk Penalty (C.U.) | Lateness Penalty (C.U.) |
|---|---|---|---|---|---|
| 0.0 | 1449.87 | 1105.15 | 23.78 | 0.00 | 0.00 |
| 0.1 | 1124.46 | 1075.85 | 4.32 | 36.00 | 0.00 |
| 0.2 | 1124.46 | 1075.85 | 4.32 | 36.00 | 0.00 |
| 0.3 | 1124.46 | 1075.85 | 4.32 | 36.00 | 0.00 |
| 0.4 | 1124.46 | 1075.85 | 4.32 | 36.00 | 0.00 |
| 0.5 | 1124.46 | 1093.09 | 2.79 | 0.00 | 0.00 |
| 0.6 | 1124.46 | 1093.09 | 2.79 | 0.00 | 0.00 |
| 0.7 | 1124.46 | 1093.09 | 2.79 | 0.00 | 0.00 |
| 0.8 | 1124.46 | 1093.09 | 2.79 | 0.00 | 0.00 |
| 0.9 | 1124.46 | 1093.09 | 2.79 | 0.00 | 0.00 |
| 1.0 | 1124.46 | 1124.46 | 0.00 | 0.00 | 0.00 |
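
The Improvement (%) column in the table above appears to be the relative cost reduction of the DRL stage with respect to the GA cost. The short check below reproduces a few rows under that assumption; the function name `improvement_pct` is illustrative and the formula is inferred from the printed values, not quoted from the paper.

```python
def improvement_pct(ga_cost: float, drl_cost: float) -> float:
    """Relative reduction of the DRL cost with respect to the GA cost, in percent."""
    return (ga_cost - drl_cost) / ga_cost * 100.0


# (first-column setting, GA cost, DRL cost) copied from selected rows above.
rows = [
    (0.0, 1449.87, 1105.15),
    (0.1, 1124.46, 1075.85),
    (0.5, 1124.46, 1093.09),
    (1.0, 1124.46, 1124.46),
]

for setting, ga, drl in rows:
    print(f"{setting:.1f}: {improvement_pct(ga, drl):.2f}%")
# 0.0: 23.78%
# 0.1: 4.32%
# 0.5: 2.79%
# 1.0: 0.00%
```
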
| Equipment (Skill, Facility) | Feasible Window | Failure Window | Proc. Time (min) | Risk Cost (C.U./h) | Lateness Penalty (C.U./h) |
|---|---|---|---|---|---|
| (Ventilator, ) | [1:00, 10:00] | [7:00, 9:00] | 60 | 80 | 200 |
| (Monitor, ) | [2:00, 12:00] | [9:00, 11:00] | 30 | 40 | 100 |
| (Dialysis, ) | [3:00, 12:00] | [10:00, 11:30] | 90 | 60 | 150 |
| (Surgical Robot, ) | [5:00, 13:00] | [9:30, 12:00] | 120 | 120 | 300 |
| (Ventilator, ) | [6:00, 14:00] | [11:00, 13:00] | 75 | 90 | 220 |
| (Monitor, ) | [7:00, 15:00] | [12:00, 14:00] | 45 | 35 | 95 |
| (Dialysis, ) | [8:00, 16:00] | [13:00, 15:00] | 75 | 55 | 140 |
| (Surgical Robot, ) | [9:00, 17:00] | [14:00, 16:00] | 120 | 130 | 310 |
| (Ventilator, ) | [10:00, 18:00] | [15:00, 17:00] | 60 | 85 | 210 |
| (Monitor, ) | [11:00, 19:00] | [16:00, 18:00] | 30 | 42 | 105 |
| (Dialysis, ) | [12:00, 20:00] | [17:00, 19:00] | 90 | 65 | 155 |
| (Surgical Robot, ) | [13:00, 21:00] | [18:00, 20:00] | 120 | 140 | 320 |
| (Ventilator, ) | [6:00, 14:00] | [9:30, 12:00] | 75 | 95 | 225 |
| (Monitor, ) | [8:00, 16:00] | [12:30, 15:00] | 45 | 38 | 98 |
| (Dialysis, ) | [10:00, 18:00] | [14:00, 16:00] | 75 | 58 | 145 |
| (Surgical Robot, ) | [12:00, 20:00] | [16:30, 19:00] | 120 | 125 | 305 |
| (Ventilator, ) | [2:00, 10:00] | [6:00, 8:00] | 60 | 82 | 205 |
| (Monitor, ) | [4:00, 12:00] | [9:00, 11:00] | 30 | 41 | 102 |
| (Dialysis, ) | [6:00, 14:00] | [10:30, 13:00] | 90 | 62 | 152 |
| (Surgical Robot, ) | [8:00, 16:00] | [12:30, 15:00] | 120 | 135 | 315 |
| (Ventilator, ) | [9:00, 17:00] | [13:00, 15:00] | 60 | 85 | 210 |
| (Monitor, ) | [10:00, 18:00] | [14:00, 16:00] | 30 | 43 | 107 |
| (Dialysis, ) | [11:00, 19:00] | [15:00, 17:00] | 90 | 67 | 158 |
| (Surgical Robot, ) | [12:00, 20:00] | [16:00, 18:00] | 120 | 128 | 308 |
| (Ventilator, ) | [7:00, 15:00] | [11:00, 13:00] | 60 | 88 | 215 |
| Technician (Skills) | Activation Cost (C.U.) | Hourly Cost (C.U./h) | Availability |
|---|---|---|---|
| (Ventilator, Monitor) | 200 | 65 | [2:00, 10:00] |
| (Surgical Robot, Dialysis) | 240 | 75 | [6:00, 14:00] |
| (Monitor, Dialysis) | 180 | 60 | [8:00, 16:00] |
| (Ventilator, Surgical Robot) | 260 | 85 | [12:00, 20:00] |
| (Ventilator, Monitor) | 210 | 70 | [14:00, 22:00] |
| (Dialysis, Surgical Robot) | 230 | 72 | [4:00, 12:00] |
| (Ventilator, Monitor) | 190 | 62 | [7:00, 15:00] |
| (Surgical Robot, Monitor) | 220 | 68 | [9:00, 17:00] |
| (Dialysis, Ventilator) | 200 | 66 | [10:00, 18:00] |
| (Monitor, Dialysis) | 185 | 58 | [11:00, 19:00] |
| | GA Cost (C.U.) | DRL Cost (C.U.) | Improvement (%) | Risk Penalty (C.U.) | Lateness Penalty (C.U.) |
|---|---|---|---|---|---|
| 0.0 | 4872.01 | 4145.19 | 14.92 | 0.00 | 0.00 |
| 0.1 | 4384.68 | 4128.69 | 5.84 | 0.00 | 0.00 |
| 0.2 | 4363.86 | 4121.73 | 5.55 | 1.52 | 0.00 |
| 0.3 | 4363.86 | 4121.73 | 5.55 | 1.52 | 0.00 |
| 0.4 | 4363.86 | 4121.73 | 5.55 | 1.52 | 0.00 |
| 0.5 | 4363.86 | 4121.73 | 5.55 | 1.52 | 0.00 |
| 0.6 | 4363.86 | 4121.73 | 5.55 | 1.52 | 0.00 |
| 0.7 | 4363.86 | 4121.73 | 5.55 | 1.52 | 0.00 |
| 0.8 | 4363.86 | 4121.73 | 5.55 | 1.52 | 0.00 |
| 0.9 | 4363.86 | 4363.86 | 0.00 | 0.00 | 0.00 |
| 1.0 | 4363.86 | 4363.86 | 0.00 | 0.00 | 0.00 |
| | GA Cost (C.U.) | DRL Cost (C.U.) | Improvement (%) | Risk Penalty (C.U.) | Lateness Penalty (C.U.) |
|---|---|---|---|---|---|
| 0.0 | 6303.48 | 5218.28 | 17.22 | 385.25 | 31.23 |
| 0.1 | 5252.52 | 5142.53 | 2.09 | 258.99 | 29.09 |
| 0.2 | 5252.52 | 5142.53 | 2.09 | 258.99 | 29.09 |
| 0.3 | 5252.52 | 5142.53 | 2.09 | 258.99 | 29.09 |
| 0.4 | 5252.52 | 5218.28 | 0.65 | 385.25 | 31.23 |
| 0.5 | 5252.52 | 5218.28 | 0.65 | 385.25 | 31.23 |
| 0.6 | 5252.52 | 5218.28 | 0.65 | 385.25 | 31.23 |
| 0.7 | 5252.52 | 5218.28 | 0.65 | 385.25 | 31.23 |
| 0.8 | 5252.52 | 5218.28 | 0.65 | 385.25 | 31.23 |
| 0.9 | 5252.52 | 5218.28 | 0.65 | 385.25 | 31.23 |
| 1.0 | 5252.52 | 5252.52 | 0.00 | 395.12 | 37.67 |
| | GA Cost (C.U.) | DRL Cost (C.U.) | Improvement (%) | Risk Penalty (C.U.) | Lateness Penalty (C.U.) |
|---|---|---|---|---|---|
| 0.0 | 3807.37 | 3068.06 | 19.42 | 34.54 | 0.00 |
| 0.1 | 3400.48 | 3049.74 | 10.31 | 0.00 | 0.00 |
| 0.2 | 3400.48 | 3049.74 | 10.31 | 0.00 | 0.00 |
| 0.3 | 3400.48 | 3049.74 | 10.31 | 0.00 | 0.00 |
| 0.4 | 3400.48 | 3049.74 | 10.31 | 0.00 | 0.00 |
| 0.5 | 3400.48 | 3049.74 | 10.31 | 0.00 | 0.00 |
| 0.6 | 3400.48 | 3049.74 | 10.31 | 0.00 | 0.00 |
| 0.7 | 3400.48 | 3049.74 | 10.31 | 0.00 | 0.00 |
| 0.8 | 3400.48 | 3049.74 | 10.31 | 0.00 | 0.00 |
| 0.9 | 3400.48 | 3049.74 | 10.31 | 0.00 | 0.00 |
| 1.0 | 3400.48 | 3400.48 | 0.00 | 0.00 | 0.00 |
| Case | GA [C.U.] | DRL [C.U.] | OR-Tools † [C.U.] | Hybrid [C.U.] | Hybrid vs. GA | Hybrid vs. DRL | , | Early-Stop Saving [%] |
|---|---|---|---|---|---|---|---|---|
| 1 | 1124.5 ± 7.8 | 1100.9 ± 9.4 | 1110.2 ± 8.1 | 1078.4 ± 6.1 | −4.32% | −2.65% | [0.1, 0.4] | 30.0 |
| 2 | 4369.2 ± 11.3 | 4149.8 ± 13.4 | 4176.5 ± 14.7 | 4124.0 ± 9.8 | −5.55% | −0.57% | [0.2, 0.8] | 60.0 |
| 3 | 5259.1 ± 14.8 | 5222.6 ± 15.2 | 5251.6 ± 18.3 | 5148.9 ± 12.6 | −2.09% | −1.45% | [0.1, 0.3] | 20.0 |
| 4 | 3406.1 ± 9.7 | 3071.9 ± 8.1 | 3081.3 ± 9.2 | 3052.4 ± 7.2 | −10.31% | −0.60% | [0.1, 0.9] | 80.0 |
| Avg | – | – | – | – | −5.57% | −1.32% | – | 47.5 |
| Case Study | GA (C.U.) | DRL (C.U.) | Hybrid (C.U.) | Hybrid vs. GA (%) | Hybrid vs. DRL (%) | | | (%) |
|---|---|---|---|---|---|---|---|---|
| 1 | 1124.46 | 1105.15 | 1075.85 | 4.32 | 2.65 | 0.1 | 0.4 | 30.0 |
| 2 | 4363.86 | 4145.19 | 4121.73 | 5.55 | 0.57 | 0.2 | 0.8 | 60.0 |
| 3 | 5252.52 | 5218.28 | 5142.53 | 2.09 | 1.45 | 0.1 | 0.3 | 20.0 |
| 4 | 3400.48 | 3068.06 | 3049.74 | 10.31 | 0.60 | 0.1 | 0.9 | 80.0 |
| Avg. | – | – | – | 5.57 | 1.32 | – | – | 47.5 |
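
The two bound columns in the summary table above match the span of first-column settings at which the DRL cost reaches its minimum in the per-case sensitivity tables. The sketch below recovers that span for Case 1. The helper name `best_plateau` is hypothetical, and the code assumes the minimizing settings are contiguous, as they are in these tables.

```python
def best_plateau(points, tol=1e-9):
    """Return (lowest setting, highest setting, cost) over the settings whose
    cost equals the minimum. Assumes the minimizing settings are contiguous."""
    best = min(cost for _, cost in points)
    hits = [setting for setting, cost in points if abs(cost - best) <= tol]
    return min(hits), max(hits), best


# (setting, DRL cost) pairs copied from the Case 1 sensitivity table.
case1 = [
    (0.0, 1105.15),
    (0.1, 1075.85), (0.2, 1075.85), (0.3, 1075.85), (0.4, 1075.85),
    (0.5, 1093.09), (0.6, 1093.09), (0.7, 1093.09), (0.8, 1093.09), (0.9, 1093.09),
    (1.0, 1124.46),
]

print(best_plateau(case1))  # (0.1, 0.4, 1075.85)
```

The result agrees with the Case 1 row of the summary table: bounds 0.1 and 0.4 and a Hybrid cost of 1075.85 C.U.
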
| Case | Cost [C.U.] (mean ± sd, 10 Seeds) | Best Bound (Median) | MIP Gap [%] (Median) | Nodes (Median, ) |
|---|---|---|---|---|
| 1 | 1110.2 ± 8.1 | 1077.3 | 2.9 | 320 |
| 2 | 4176.5 ± 14.7 | 3906.0 | 6.4 | 1100 |
| 3 | 5251.6 ± 18.3 | 4628.0 | 11.8 | 2300 |
| 4 | 3081.3 ± 9.2 | 3000.5 | 2.6 | 550 |
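
A relative optimality gap is commonly computed as the distance between the incumbent cost and the best bound, divided by the incumbent cost. The sketch below applies that common definition to the best bounds in the table above and to the OR-Tools mean costs from the comparison table (1110.2, 4176.5, 5251.6, and 3081.3 C.U.). Whether the solver used in the paper applies exactly this definition is an assumption, and the function name `mip_gap_pct` is illustrative.

```python
def mip_gap_pct(incumbent: float, best_bound: float) -> float:
    """Relative gap between incumbent cost and best bound, in percent."""
    return abs(incumbent - best_bound) / abs(incumbent) * 100.0


# case -> (OR-Tools mean cost, median best bound), both in C.U.
cases = {
    1: (1110.2, 1077.3),
    2: (4176.5, 3906.0),
    3: (5251.6, 4628.0),
    4: (3081.3, 3000.5),
}

for case, (cost, bound) in cases.items():
    print(f"Case {case}: gap = {mip_gap_pct(cost, bound):.2f}%")
```

The computed values land close to the reported median gaps (2.9, 6.4, 11.8, and 2.6) but need not match them exactly, because the table reports medians over seeds while this check uses mean costs.
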
| Case | | (Saving) | Full-Budget (mean ± sd) | Early-Stop (mean ± sd) | vs. Full [%] | Wilcoxon p | Within 95% CI [%] |
|---|---|---|---|---|---|---|---|
| 1 | [0.1, 0.4] | 0.70 (−30%) | 1078.4 ± 6.1 | 1081.0 ± 6.6 | +0.24 | 0.21 | 78 |
| 2 | [0.2, 0.8] | 0.40 (−60%) | 4124.0 ± 9.8 | 4142.1 ± 12.2 | +0.44 | 0.09 | 70 |
| 3 | [0.1, 0.3] | 0.80 (−20%) | 5148.9 ± 12.6 | 5160.1 ± 14.1 | +0.22 | 0.27 | 76 |
| 4 | [0.1, 0.9] | 0.20 (−80%) | 3052.4 ± 7.2 | 3061.7 ± 10.9 | +0.31 | 0.11 | 64 |
| Scenario | Metric | GA | DRL | Hybrid |
|---|---|---|---|---|
| Travel noise 10% | Degradation [%] | | | |
| | Best-ranking [%] | 12 | 31 | 57 |
| Service noise 10% | Degradation [%] | | | |
| | Best-ranking [%] | 15 | 28 | 57 |
| FW shift (mean +20 min) | Degradation [%] | | | |
| | Best-ranking [%] | 10 | 30 | 60 |
| IoT shock (+50% @ mid-horizon) | Recovery [s] | | | |
| | Best-ranking [%] | 9 | 28 | 63 |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Nucci, F.; Papadia, G. Hybrid Genetic Algorithm and Deep Reinforcement Learning Framework for IoT-Enabled Healthcare Equipment Maintenance Scheduling. Electronics 2025, 14, 4160. https://doi.org/10.3390/electronics14214160

