EPRS: Experience-Prioritized Reinforcement Scheduler in Edge Clusters
Abstract
1. Introduction
- We propose a four-dimensional dynamic resource sensing model for edge cluster scheduling, which jointly captures CPU utilization, memory pressure, disk I/O, and container density to characterize multidimensional resource states in Kubernetes-based edge environments.
- We design an Experience-Prioritized Reinforcement Scheduler (EPRS), a reinforcement learning–based scheduling framework with outcome-aware experience prioritization, in which the learning process emphasizes scheduling decisions that induce significant multidimensional resource imbalance or utilization variation, thereby improving adaptation to dynamic edge workloads.
- We implement the proposed EPRS in a real-world Kubernetes edge cluster and evaluate it under dynamic workloads, demonstrating its effectiveness in improving multidimensional resource utilization balance compared with baseline schedulers.
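The outcome-aware experience prioritization described in the contributions above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the class and function names are hypothetical, and the priority is assumed to be proportional to the post-decision cross-node imbalance of the four sensed resources (CPU, memory, disk I/O, pod density), so that transitions which caused strong imbalance are replayed more often.

```python
import math
import random
from collections import namedtuple

Transition = namedtuple("Transition", "state action reward next_state")

def imbalance(node_loads):
    """Population standard deviation of one resource's utilization across nodes."""
    mean = sum(node_loads) / len(node_loads)
    return math.sqrt(sum((u - mean) ** 2 for u in node_loads) / len(node_loads))

class OutcomePrioritizedBuffer:
    """Replay buffer weighting transitions by the multidimensional resource
    imbalance their scheduling decision produced (illustrative sketch)."""

    def __init__(self, capacity=10_000, eps=1e-3):
        self.capacity, self.eps = capacity, eps
        self.items, self.priorities = [], []

    def push(self, transition, cluster_state):
        # cluster_state maps each resource dimension to per-node utilizations
        # observed after the action; summed imbalance becomes the priority,
        # with a small eps so balanced outcomes are still replayed occasionally.
        p = sum(imbalance(v) for v in cluster_state.values()) + self.eps
        if len(self.items) >= self.capacity:
            self.items.pop(0)
            self.priorities.pop(0)
        self.items.append(transition)
        self.priorities.append(p)

    def sample(self, k):
        # Priority-proportional sampling for the learner's minibatch.
        return random.choices(self.items, weights=self.priorities, k=k)
```

A decision that skews CPU, memory, disk, and pod counts across nodes thus receives a much larger sampling weight than one that leaves the cluster balanced.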
2. Related Work
2.1. Container-Based Edge Frameworks
2.2. Reinforcement Learning-Based Container Scheduling
3. Model and Algorithm
3.1. Modeling the Scheduling Problem in Edge Clustering
3.2. EPRS: Scheduling Algorithm Design
3.3. Edge Cluster System Design
4. Tests and Performance Evaluation
4.1. System Setup
4.2. Hyperparameter Sensitivity Analysis
4.3. Experiments
4.3.1. First Set of Experiments
4.3.2. Second Set of Experiments
4.3.3. Third Set of Experiments
4.3.4. Performance Analysis of Scheduling Algorithms
5. Conclusions and Future Work
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Wang, Z.; Goudarzi, M.; Gong, M.; Buyya, R. Deep reinforcement learning-based scheduling for optimizing system load and response time in edge and fog computing environments. Future Gener. Comput. Syst. 2024, 152, 55–69.
- Zhang, S.; He, J.; Liang, W.; Li, K. MMDS: A secure and verifiable multimedia data search scheme for cloud-assisted edge computing. Future Gener. Comput. Syst. 2024, 151, 32–44.
- Shi, W.; Cao, J.; Zhang, Q.; Li, Y.; Xu, L. Edge computing: Vision and challenges. IEEE Internet Things J. 2016, 3, 637–646.
- García-Valls, M.; Dubey, A.; Botti, V. Introducing the new paradigm of social dispersed computing: Applications, technologies and challenges. J. Syst. Archit. 2018, 91, 83–102.
- Gong, Y.; Bian, K.; Hao, F.; Sun, Y.; Wu, Y. Dependent tasks offloading in mobile edge computing: A multi-objective evolutionary optimization strategy. Future Gener. Comput. Syst. 2023, 148, 314–325.
- Zhang, W.; Luo, J.; Chen, L.; Liu, J. A trajectory prediction-based and dependency-aware container migration for mobile edge computing. IEEE Trans. Serv. Comput. 2023, 16, 3168–3181.
- Chen, Y.; He, S.; Jin, X.; Wang, Z.; Wang, F.; Chen, L. Resource utilization and cost optimization oriented container placement for edge computing in industrial internet. J. Supercomput. 2023, 79, 3821–3849.
- Burns, B.; Grant, B.; Oppenheimer, D.; Brewer, E.; Wilkes, J. Borg, Omega, and Kubernetes. Commun. ACM 2016, 59, 50–57.
- Toka, L.; Dobreff, G.; Fodor, B.; Sonkoly, B. Machine learning-based scaling management for Kubernetes edge clusters. IEEE Trans. Netw. Serv. Manag. 2021, 18, 958–972.
- Kaur, K.; Garg, S.; Kaddoum, G.; Ahmed, S.H.; Atiquzzaman, M. KEIDS: Kubernetes-based energy and interference driven scheduler for industrial IoT in edge-cloud ecosystem. IEEE Internet Things J. 2019, 7, 4228–4237.
- Ju, H.; Juan, R.; Gomez, R.; Nakamura, K.; Li, G. Transferring policy of deep reinforcement learning from simulation to reality for robotics. Nat. Mach. Intell. 2022, 4, 1077–1087.
- Li, C.; Zheng, P.; Yin, Y.; Wang, B.; Wang, L. Deep reinforcement learning in smart manufacturing: A review and prospects. CIRP J. Manuf. Sci. Technol. 2023, 40, 75–101.
- Ladosz, P.; Weng, L.; Kim, M.; Oh, H. Exploration in deep reinforcement learning: A survey. Inf. Fusion 2022, 85, 1–22.
- Shakya, A.K.; Pillai, G.; Chakrabarty, S. Reinforcement learning algorithms: A brief survey. Expert Syst. Appl. 2023, 231, 120495.
- Sindhu, V.; Prakash, M.; Mohan Kumar, P. Energy-efficient task scheduling and resource allocation for improving the performance of a cloud–fog environment. Symmetry 2022, 14, 2340.
- Rani, M.; K P, S.; Jayasingh, B.B. Deep reinforcement learning for dynamic task scheduling in edge-cloud environments. Int. J. Electr. Comput. Eng. Syst. 2024, 15, 837–850.
- Bandyopadhyay, A.; Mishra, V.; Swain, S.; Chatterjee, K.; Dey, S.; Mallik, S.; Al-Rasheed, A.; Abbas, M.; Soufiene, B.O. EdgeMatch: A smart approach for scheduling IoT-edge tasks with multiple criteria using game theory. IEEE Access 2024, 12, 7609–7623.
- Kristiani, E.; Yang, C.T.; Huang, C.Y.; Wang, Y.T.; Ko, P.C. The implementation of a cloud-edge computing architecture using OpenStack and Kubernetes for air quality monitoring application. Mob. Netw. Appl. 2021, 26, 1070–1092.
- Goethals, T.; De Turck, F.; Volckaert, B. Extending Kubernetes clusters to low-resource edge devices using virtual kubelets. IEEE Trans. Cloud Comput. 2020, 10, 2623–2636.
- Parra-Ullauri, J.M.; Madhukumar, H.; Nicolaescu, A.C.; Zhang, X.; Bravalheri, A.; Hussain, R.; Vasilakos, X.; Nejabati, R.; Simeonidou, D. kubeFlower: A privacy-preserving framework for Kubernetes-based federated learning in cloud–edge environments. Future Gener. Comput. Syst. 2024, 157, 558–572.
- Nguyen, Q.M.; Phan, L.A.; Kim, T. Load-balancing of Kubernetes-based edge computing infrastructure using resource adaptive proxy. Sensors 2022, 22, 2869.
- Ali, B.; Golec, M.; Murugesan, S.S.; Wu, H.; Gill, S.S.; Cuadrado, F.; Uhlig, S. GAIKube: Generative AI-based proactive Kubernetes container orchestration framework for heterogeneous edge computing. IEEE Trans. Cogn. Commun. Netw. 2024, 11, 933–945.
- Oleghe, O. Container placement and migration in edge computing: Concept and scheduling models. IEEE Access 2021, 9, 68028–68043.
- Youn, J.; Han, Y.H. Intelligent task dispatching and scheduling using a Deep Q-Network in a cluster edge computing system. Sensors 2022, 22, 4098.
- Li, Y.; Zhang, X.; Zeng, T.; Duan, J.; Wu, C.; Wu, D.; Chen, X. Task placement and resource allocation for edge machine learning: A GNN-based multi-agent reinforcement learning paradigm. IEEE Trans. Parallel Distrib. Syst. 2023, 34, 3073–3089.
- Qiao, Y.; Shen, S.; Zhang, C.; Wang, W.; Qiu, T.; Wang, X. EdgeOptimizer: A programmable containerized scheduler of time-critical tasks in Kubernetes-based edge-cloud clusters. Future Gener. Comput. Syst. 2024, 156, 221–230.
- Jian, Z.; Xie, X.; Fang, Y.; Jiang, Y.; Lu, Y.; Dash, A.; Li, T.; Wang, G. DRS: A deep reinforcement learning enhanced Kubernetes scheduler for microservice-based system. Softw. Pract. Exp. 2024, 54, 2102–2126.
- Shen, W.; Lin, W.; Wu, W.; Wu, H.; Li, K. Reinforcement learning-based task scheduling for heterogeneous computing in end-edge-cloud environment. Clust. Comput. 2025, 28, 179.
- Cui, H.; Tang, Z.; Lou, J.; Jia, W.; Zhao, W. Latency-aware container scheduling in edge cluster upgrades: A deep reinforcement learning approach. IEEE Trans. Serv. Comput. 2024, 17, 2530–2543.
- Do, H.M.; Tran, T.P.; Yoo, M. Deep reinforcement learning-based task offloading and resource allocation for industrial IoT in MEC federation system. IEEE Access 2023, 11, 83150–83170.
- Zhang, P.; Li, S.; Li, D.; Ding, Q.; Shi, L. Sensor-generated in situ data management for smart grids: Dynamic optimization driven by Double Deep Q-Network with prioritized experience replay. Appl. Sci. 2025, 15, 5980.
- Lai, W.K.; Wang, Y.C.; Wei, S.C. Delay-aware container scheduling in Kubernetes. IEEE Internet Things J. 2023, 10, 11813–11824.

| Group | Metric | EPRS | D3QN | Dueling DQN | Double DQN | DQN | Kubernetes |
|---|---|---|---|---|---|---|---|
| Group I | CPU | 9.339 | 10.512 | 10.158 | 12.893 | 13.423 | 9.981 |
| | Memory | 7.245 | 7.455 | 6.645 | 7.432 | 7.797 | 7.838 |
| | Disk | 16.108 | 19.713 | 20.556 | 25.334 | 27.143 | 17.724 |
| | Pods | 2.000 | 2.625 | 3.801 | 3.559 | 3.496 | 1.563 |
| Group II | CPU | 5.931 | 7.323 | 6.736 | 8.704 | 8.666 | 7.931 |
| | Memory | 5.011 | 5.879 | 6.706 | 6.941 | 6.800 | 9.445 |
| | Disk | 11.845 | 14.001 | 13.069 | 17.413 | 17.723 | 16.192 |
| | Pods | 1.247 | 2.108 | 4.082 | 3.559 | 4.216 | 2.944 |
| Group III | CPU | 5.699 | 8.631 | 8.103 | 10.069 | 6.642 | 9.087 |
| | Memory | 5.394 | 8.546 | 8.177 | 6.970 | 6.203 | 10.388 |
| | Disk | 11.613 | 17.315 | 13.877 | 18.006 | 13.331 | 19.165 |
| | Pods | 1.414 | 2.944 | 3.232 | 5.617 | 3.162 | 3.771 |
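As a quick consistency check on the table, assuming (as the comparison implies) that lower values indicate lower cross-node dispersion and hence better balance, the schedulers can be ranked by their mean dispersion across the four resource dimensions. The Group III values below are transcribed from the table; the averaging is only an illustrative aggregate, not a metric from the paper.

```python
# Group III values (CPU, Memory, Disk, Pods) per scheduler; lower is better.
group3 = {
    "EPRS":        [5.699, 5.394, 11.613, 1.414],
    "D3QN":        [8.631, 8.546, 17.315, 2.944],
    "Dueling DQN": [8.103, 8.177, 13.877, 3.232],
    "Double DQN":  [10.069, 6.970, 18.006, 5.617],
    "DQN":         [6.642, 6.203, 13.331, 3.162],
    "Kubernetes":  [9.087, 10.388, 19.165, 3.771],
}

# Rank schedulers by mean dispersion across the four dimensions.
ranking = sorted(group3, key=lambda s: sum(group3[s]) / len(group3[s]))
print(ranking[0])  # EPRS has the lowest mean dispersion in Group III
```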
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Share and Cite
Tan, S.; Huang, T.; Zhu, E.; Qin, J.; Fan, X. EPRS: Experience-Prioritized Reinforcement Scheduler in Edge Clusters. Sensors 2026, 26, 1168. https://doi.org/10.3390/s26041168