Queue Stability-Constrained Deep Reinforcement Learning Algorithms for Adaptive Transmission Control in Multi-Access Edge Computing Systems
Abstract
1. Introduction
2. Proposed System Model
2.1. Multi-Access Transmission Scheme
2.2. Queue Stability and Queueing Delay Model
3. Queue Stability-Constrained Deep Reinforcement Learning Approach for Adaptive Transmission Control
3.1. Description of Optimization Function
3.2. Modeling Data Transmission Control Using Markov Decision Processes
3.3. Reinforcement Learning Algorithms for Adaptive Transmission Control
3.3.1. DQN-Based Multi-Access Data Transmission Algorithm
Algorithm 1. Multi-Access Data Transmission Algorithm Based on DQN
Input: State space S, action space A, exploration parameter ε, discount factor γ
Output: Packet allocation policy π
1: Initialize: Experience replay buffer, main network weights θ, target network weights θ′
2: for each training episode do:
3:   for each time step do:
4:     Select allocation ratios from the discretized action space using the ε-greedy policy
5:     Execute the action, observe the reward and the next state
6:     Store the transition in the experience replay buffer
7:     Sample a minibatch from the replay buffer, compute the TD target
8:     Compute the loss between the TD target and the predicted Q-value
9:     Perform a gradient step on the loss and update θ
10:    Every C steps: Synchronize target network θ′ ← θ
11:   end for
12: end for
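As a concrete illustration of Algorithm 1, the sketch below implements an ε-greedy DQN agent with an experience replay buffer, TD-target computation against a target network, and periodic target synchronization. It is a minimal sketch under assumed settings, not the authors' implementation: the names (QNetwork, DQNAgent, STATE_DIM, N_ACTIONS, SYNC_EVERY), the layer sizes, and the use of the Adam optimizer are illustrative assumptions; only the discount factor 0.98 is taken from the simulation setup.

```python
# Minimal DQN sketch for discretized multi-access packet allocation.
# All names, sizes, and hyperparameters below are illustrative assumptions
# except the discount factor (0.98), which comes from the simulation setup.
import random
from collections import deque

import torch
import torch.nn as nn
import torch.nn.functional as F

STATE_DIM = 6     # assumed: queue backlog and mean rate for the 4G/5G/Wi-Fi links
N_ACTIONS = 11    # assumed size of the discretized allocation-ratio action space
GAMMA = 0.98      # discount factor from the simulation parameters
SYNC_EVERY = 100  # target-network synchronization period C (assumed)


class QNetwork(nn.Module):
    """Small fully connected Q-network; layer sizes are assumptions."""
    def __init__(self, state_dim, n_actions):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64), nn.Tanh(),
            nn.Linear(64, 64), nn.Tanh(),
            nn.Linear(64, n_actions),
        )

    def forward(self, x):
        return self.net(x)


class DQNAgent:
    def __init__(self):
        self.q = QNetwork(STATE_DIM, N_ACTIONS)          # main network θ
        self.target_q = QNetwork(STATE_DIM, N_ACTIONS)   # target network θ′
        self.target_q.load_state_dict(self.q.state_dict())
        self.optimizer = torch.optim.Adam(self.q.parameters(), lr=1e-3)
        self.replay = deque(maxlen=10_000)               # experience replay buffer
        self.steps = 0

    def select_action(self, state, epsilon):
        """ε-greedy selection over the discretized allocation ratios (step 4)."""
        if random.random() < epsilon:
            return random.randrange(N_ACTIONS)
        with torch.no_grad():
            q_values = self.q(torch.as_tensor(state, dtype=torch.float32))
        return int(q_values.argmax().item())

    def store(self, state, action, reward, next_state, done):
        """Store the observed transition (steps 5-6)."""
        self.replay.append((state, action, reward, next_state, done))

    def update(self, batch_size=64):
        """Sample a minibatch, compute the TD target, and update θ (steps 7-10)."""
        if len(self.replay) < batch_size:
            return
        batch = random.sample(self.replay, batch_size)
        s, a, r, s_next, done = (torch.as_tensor(x, dtype=torch.float32)
                                 for x in zip(*batch))
        with torch.no_grad():
            max_next_q = self.target_q(s_next).max(dim=1).values
            td_target = r + GAMMA * (1.0 - done) * max_next_q
        q_sa = self.q(s).gather(1, a.long().unsqueeze(1)).squeeze(1)
        loss = F.mse_loss(q_sa, td_target)
        self.optimizer.zero_grad()
        loss.backward()
        self.optimizer.step()
        self.steps += 1
        if self.steps % SYNC_EVERY == 0:
            self.target_q.load_state_dict(self.q.state_dict())  # θ′ ← θ
```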
3.3.2. PPO-Based Multi-Access Data Transmission Algorithm
Algorithm 2. Multi-Access Data Transmission Algorithm Based on PPO
Input: State space S and action space A for the network links, clipping range, discount factor γ
Output: Packet allocation policy π
1: Initialize: Policy network parameters and value network parameters
2: for each training iteration do:
3:   Collect trajectories using the current policy
4:   Compute the advantage function from the collected trajectories
5:   for each optimization epoch do:
6:     for each minibatch do:
7:       Compute the probability ratio between the new and old policies
8:       Compute the clipped surrogate objective and the value-function loss
9:       Calculate the entropy and update the parameters
10:    end for
11:   end for
12:   Compute the optimal allocation ratio from the updated policy
13:   Update the old policy parameters
14: end for
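Steps 7-9 of Algorithm 2 correspond to PPO's clipped surrogate objective combined with a value-function loss and an entropy bonus. The sketch below shows one plausible way to compute that combined loss; the clipping range [0.85, 1.15] is taken from the simulation setup, while the function name, the coefficient values, and the tensor shapes are assumptions rather than the paper's exact formulation.

```python
# Illustrative PPO clipped-objective loss (Algorithm 2, steps 7-9).
# The clipping range comes from the simulation parameters; the coefficients
# and names are assumptions for illustration.
import torch
import torch.nn.functional as F

CLIP_LOW, CLIP_HIGH = 0.85, 1.15   # clipping range from the simulation setup
VALUE_COEF = 0.5                   # assumed weight of the value-function loss
ENTROPY_COEF = 0.01                # assumed weight of the entropy bonus


def ppo_loss(new_log_probs, old_log_probs, advantages, values, returns, entropy):
    """Combined PPO loss for one minibatch of collected trajectories.

    new_log_probs / old_log_probs: log-probabilities of the taken actions under
    the current and the old policy; advantages: estimated advantage function
    (step 4); values / returns: value-network predictions and empirical returns;
    entropy: mean policy entropy over the minibatch.
    """
    # Step 7: probability ratio between the new and old policies
    ratio = torch.exp(new_log_probs - old_log_probs)
    clipped_ratio = torch.clamp(ratio, CLIP_LOW, CLIP_HIGH)
    # Step 8: clipped surrogate objective and value-function loss
    policy_loss = -torch.min(ratio * advantages, clipped_ratio * advantages).mean()
    value_loss = F.mse_loss(values, returns)
    # Step 9: subtract the entropy bonus to encourage exploration
    return policy_loss + VALUE_COEF * value_loss - ENTROPY_COEF * entropy
```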
4. Experimental Results and Performance Analysis
4.1. Simulation Setup
4.2. Results and Performance Analysis
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Sylla, T.; Mendiboure, L.; Maaloul, S.; Aniss, H.; Chalouf, M.A.; Delbruel, S. Multi-Connectivity for 5G Networks and Beyond: A Survey. Sensors 2022, 22, 7591.
- Subhan, F.E.; Yaqoob, A.; Muntean, C.H.; Muntean, G.-M. A Survey on Artificial Intelligence Techniques for Improved Rich Media Content Delivery in a 5G and Beyond Network Slicing Context. IEEE Commun. Surv. Tutor. 2025, 27, 1427–1487.
- He, Z.; Li, L.; Lin, Z.; Dong, Y.; Qin, J.; Li, K. Joint Optimization of Service Migration and Resource Allocation in Mobile Edge–Cloud Computing. Algorithms 2024, 17, 370.
- Jun, S.; Choi, Y.S.; Chung, H. A Study on Mobility Enhancement in 3GPP Multi-Radio Multi-Connectivity. In Proceedings of the 2024 15th International Conference on Information and Communication Technology Convergence (ICTC), Jeju Island, Republic of Korea, 16–18 October 2024; pp. 1347–1348.
- Rahman, M.H.; Chowdhury, M.R.; Sultana, A.; Tripathi, A.; Silva, A.P.D. Deep Learning Based Uplink Power Allocation in Multi-Radio Dual Connectivity Heterogeneous Wireless Networks. In Proceedings of the 2024 IEEE 35th International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC), Valencia, Spain, 2–5 September 2024; pp. 1–6.
- Han, L.; Li, S.; Ao, C.; Liu, Y.; Liu, G.; Zhang, Y.; Zhao, J. MEC-Based Cooperative Multimedia Caching Mechanism for the Internet of Vehicles. Wirel. Commun. Mob. Comput. 2022, 2022, 1–10.
- Zhang, Y.; He, X.; Xing, J.; Li, W.; Seah, W.K.G. Load-Balanced Offloading of Multiple Task Types for Mobile Edge Computing in IoT. Internet Things 2024, 28, 101385.
- Santos, D.; Chi, H.R.; Almeida, J.; Silva, R.; Perdigão, A.; Corujo, D.; Ferreira, J.; Aguiar, R.L. Fully-Decentralized Multi-MNO Interoperability of MEC-Enabled Cooperative Autonomous Mobility. IEEE Trans. Consum. Electron. 2025, 1, 1.
- Qu, B.; Bai, Y.; Chu, Y.; Wang, L.; Yu, F.; Li, X. Resource Allocation for MEC System with Multi-Users Resource Competition Based on Deep Reinforcement Learning Approach. Comput. Netw. 2022, 215, 109181.
- Chu, W.; Jia, X.; Yu, Z.; Lui, J.C.S.; Lin, Y. Joint Service Caching, Resource Allocation and Task Offloading for MEC-Based Networks: A Multi-Layer Optimization Approach. IEEE Trans. Mob. Comput. 2024, 23, 2958–2975.
- Zhang, S.; Bao, S.; Chi, K.; Yu, K.; Mumtaz, S. DRL-Based Computation Rate Maximization for Wireless Powered Multi-AP Edge Computing. IEEE Trans. Commun. 2024, 72, 1105–1118.
- Zhang, S.; Tong, X.; Chi, K.; Gao, W.; Chen, X.; Shi, Z. Stackelberg Game-Based Multi-Agent Algorithm for Resource Allocation and Task Offloading in MEC-Enabled C-ITS. IEEE Trans. Intell. Transport. Syst. 2025, 1–12.
- Mao, Y.; Zhang, J.; Letaief, K.B. Dynamic Computation Offloading for Mobile-Edge Computing with Energy Harvesting Devices. IEEE J. Select. Areas Commun. 2016, 34, 3590–3605.
- Dai, L.; Mei, J.; Yang, Z.; Tong, Z.; Zeng, C.; Li, K. Lyapunov-Guided Deep Reinforcement Learning for Delay-Aware Online Task Offloading in MEC Systems. J. Syst. Archit. 2024, 153, 103194.
- Ning, Z.; Dong, P.; Kong, X.; Xia, F. A Cooperative Partial Computation Offloading Scheme for Mobile Edge Computing Enabled Internet of Things. IEEE Internet Things J. 2019, 6, 4804–4814.
- Ren, J.; Yu, G.; He, Y.; Li, G.Y. Collaborative Cloud and Edge Computing for Latency Minimization. IEEE Trans. Veh. Technol. 2019, 68, 5031–5044.
- Wang, X.; Ning, Z.; Guo, L.; Guo, S.; Gao, X.; Wang, G. Online Learning for Distributed Computation Offloading in Wireless Powered Mobile Edge Computing Networks. IEEE Trans. Parallel Distrib. Syst. 2022, 33, 1841–1855.
- Cui, Y.; Liang, Y.; Wang, R. Resource Allocation Algorithm with Multi-Platform Intelligent Offloading in D2D-Enabled Vehicular Networks. IEEE Access 2019, 7, 21246–21253.
- Bae, S.; Han, S.; Sung, Y. A Reinforcement Learning Formulation of the Lyapunov Optimization: Application to Edge Computing Systems with Queue Stability. arXiv 2020.
- Wu, Q.; Wang, W.; Fan, P.; Fan, Q.; Wang, J.; Letaief, K.B. URLLC-Awared Resource Allocation for Heterogeneous Vehicular Edge Computing. IEEE Trans. Veh. Technol. 2024, 73, 11789–11805.
- Keramidi, I.; Uzunidis, D.; Moscholios, I.; Logothetis, M.; Sarigiannidis, P. Analytical Modelling of a Vehicular Ad Hoc Network Using Queueing Theory Models and the Notion of Channel Availability. AEU-Int. J. Electron. Commun. 2023, 170, 154811.
- Wu, G.; Xu, Z.; Zhang, H.; Shen, S.; Yu, S. Multi-Agent DRL for Joint Completion Delay and Energy Consumption with Queuing Theory in MEC-Based IIoT. J. Parallel Distrib. Comput. 2023, 176, 80–94.
- Zhang, J.; Bao, B.; Wang, C.; Zhu, F. Shapley Value-Driven Multi-Modal Deep Reinforcement Learning for Complex Decision-Making. Neural Netw. 2025, 191, 107650.
- Femminella, M.; Reali, G. Comparison of Reinforcement Learning Algorithms for Edge Computing Applications Deployed by Serverless Technologies. Algorithms 2024, 17, 320.
- Yuan, M.; Yu, Q.; Zhang, L.; Lu, S.; Li, Z.; Pei, F. Deep Reinforcement Learning Based Proximal Policy Optimization Algorithm for Dynamic Job Shop Scheduling. Comput. Oper. Res. 2025, 183, 107149.
- Kumar, P.; Hota, L.; Nayak, B.P.; Kumar, A. An Adaptive Contention Window Using Actor-Critic Reinforcement Learning Algorithm for Vehicular Ad-Hoc NETworks. Procedia Comput. Sci. 2024, 235, 3045–3054.
- Chen, Z.; Maguluri, S.T. An Approximate Policy Iteration Viewpoint of Actor–Critic Algorithms. Automatica 2025, 179, 112395.
Simulation parameters:

Parameter | Value
---|---
Maximum Queue Length | 2000 packets |
Packet Arrival Rate | 500 ± 50 packets/step |
Packet Arrival Distribution | Poisson Distribution |
Queue Service Distribution | Exponential Distribution |
Mean Link Bandwidth (4G, 5G, Wi-Fi) | (150, 300, 400) Mbps |
Clipping Range | [0.85, 1.15] |
Learning Rate | 0.01 |
Discount Factor | 0.98 |
Policy Network Architecture | 64 × 64 fully connected |
Value Network Architecture | 16 × 16 fully connected |
Activation Function | Tanh |
Loss Function | MSE loss
Total Training Steps | 100 × 10⁴
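The simulation parameters above can be gathered into a small configuration object, together with a one-step queue update that draws Poisson arrivals as specified in the table. This is a minimal sketch for reproducibility; the field and function names are illustrative assumptions, while the values mirror the table.

```python
# Simulation parameters from the table, as an illustrative configuration object,
# plus a one-step queue update with Poisson arrivals. Names are assumptions.
from dataclasses import dataclass

import numpy as np


@dataclass
class SimulationConfig:
    max_queue_length: int = 2000                   # packets
    mean_arrival_rate: float = 500.0               # packets per step (±50)
    link_bandwidth_mbps: tuple = (150, 300, 400)   # mean 4G, 5G, Wi-Fi bandwidth
    clip_range: tuple = (0.85, 1.15)               # PPO clipping range
    learning_rate: float = 0.01
    discount_factor: float = 0.98
    total_training_steps: int = 100 * 10**4


def step_queue(queue_len, served, cfg, rng):
    """Advance the queue by one step: Poisson arrivals, bounded by the buffer."""
    arrivals = rng.poisson(cfg.mean_arrival_rate)
    return int(min(max(queue_len + arrivals - served, 0), cfg.max_queue_length))


# Example: one queue update with the default configuration.
cfg = SimulationConfig()
rng = np.random.default_rng(0)
print(step_queue(1000, 600, cfg, rng))
```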
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Han, L.; Zeng, T.; Zhao, J.; Bao, X.; Liu, G.; Liu, Y. Queue Stability-Constrained Deep Reinforcement Learning Algorithms for Adaptive Transmission Control in Multi-Access Edge Computing Systems. Algorithms 2025, 18, 498. https://doi.org/10.3390/a18080498