Flash-Attention-Enhanced Multi-Agent Deep Deterministic Policy Gradient for Mobile Edge Computing in Digital Twin-Powered Internet of Things
Abstract
1. Introduction
- To enable accurate virtual representations within the MEC-assisted IoT, this paper designs a comprehensive and layered digital twin model. The proposed model provides valuable insights into the IoT ecosystem and offers data support, decision supervision, and real-time control to tackle MEC problems.
- To improve performance in an MEC scenario, this paper proposes a Flash-Attention-enhanced MADDPG algorithm (FA-MADDPG) for decision making and analyzes its time complexity. Integrating an attention mechanism into the critic network lets each agent focus on relevant information, while the flash mechanism keeps training efficient and timely in complex IoT scenarios.
- To validate the effectiveness of the proposed FA-MADDPG algorithm in the DT, this paper conducts extensive experiments. A DT is constructed, and the algorithm is evaluated through reward convergence curves, time efficiency analysis, and performance comparison against several baseline algorithms.
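To make the second contribution concrete, the following minimal Python sketch contrasts standard softmax attention with a block-wise computation that keeps a running maximum and a running normalizer, which is the numerical idea behind FlashAttention. It is an illustration only, not the FA-MADDPG implementation: the function names, the block size, and the random per-agent embeddings are assumptions, and an actual flash kernel tiles the computation in GPU on-chip memory rather than in Python lists.

```python
import math
import random


def naive_attention(q, k, v):
    """Standard softmax attention; builds the full score row for each query."""
    d = len(q[0])
    out = []
    for qi in q:
        scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) for kj in k]
        m = max(scores)
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        out.append([sum(w * vj[c] for w, vj in zip(weights, v)) / total
                    for c in range(len(v[0]))])
    return out


def tiled_attention(q, k, v, block_size=2):
    """Same result, but keys and values are streamed block by block.

    Only one block of scores exists at a time; a running max and a running
    normalizer (online softmax) keep the result exact.
    """
    d, dv = len(q[0]), len(v[0])
    out = []
    for qi in q:
        running_max = -math.inf
        running_sum = 0.0
        acc = [0.0] * dv  # unnormalized weighted sum of values
        for start in range(0, len(k), block_size):
            k_blk = k[start:start + block_size]
            v_blk = v[start:start + block_size]
            scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d)
                      for kj in k_blk]
            new_max = max(running_max, max(scores))
            # Rescale everything accumulated under the previous maximum.
            scale = math.exp(running_max - new_max)
            running_sum *= scale
            acc = [a * scale for a in acc]
            for s, vj in zip(scores, v_blk):
                w = math.exp(s - new_max)
                running_sum += w
                for c in range(dv):
                    acc[c] += w * vj[c]
            running_max = new_max
        out.append([a / running_sum for a in acc])
    return out


# Hypothetical setup: 5 agents, each with a 4-dimensional embedding.
rng = random.Random(0)
n_agents, dim = 5, 4
q = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_agents)]
k = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_agents)]
v = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_agents)]

reference = naive_attention(q, k, v)
tiled = tiled_attention(q, k, v, block_size=2)
max_diff = max(abs(a - b)
               for row_a, row_b in zip(reference, tiled)
               for a, b in zip(row_a, row_b))
```

The two functions agree up to floating-point rounding, which is the point of the flash formulation: it changes how attention is computed and stored, not what it computes.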
2. Related Work
2.1. DT-Powered MEC
2.2. MADRL for Computation Offloading Decisions
2.3. Flash-Attention-Advanced DRL
3. System Model
3.1. Digital Twin-Based IoT Model
3.1.1. Physical Entity Layer
3.1.2. Virtual Twin Layer
3.1.3. Application Layer
3.2. Communication Model
3.3. Computation Model
3.3.1. Local Computation Model
3.3.2. Offloading Computation Model
3.3.3. System Computation Model
4. Problem Formulation and Proposed FA-MADDPG Solution
4.1. Problem Formulation
4.2. MDP Problem Construction
4.2.1. State Space
- represents the number of user tasks at the beginning of time slot t, which is randomly generated;
- represents the task weight of task j at the beginning of time slot t, which is randomly generated;
- represents the horizontal distance between the base station k and the user device i, and is a random variable sampled from ;
- represents the vertical distance between the base station k and the user device i, and is a random variable sampled from ;
- represents the altitude of a base station.
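As a reading aid, the five state components listed above can be grouped into a single per-agent record. The sketch below is an assumption about layout only: the class name, field names, and the flattening order are hypothetical and do not come from the paper, whose symbols are not reproduced in this section.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Observation:
    """Hypothetical container for one agent's observation in time slot t."""
    num_tasks: int                # number of user tasks in this slot
    task_weights: List[int]       # one weight per task
    horizontal_dist: List[float]  # horizontal distance to each base station
    vertical_dist: List[float]    # vertical distance to each base station
    bs_altitude: float            # altitude of a base station

    def to_vector(self) -> List[float]:
        """Flatten the fields into the input vector of an actor network."""
        return [
            float(self.num_tasks),
            *[float(w) for w in self.task_weights],
            *self.horizontal_dist,
            *self.vertical_dist,
            self.bs_altitude,
        ]


# Illustrative values only; 5 base stations matches the parameter table.
obs = Observation(
    num_tasks=2,
    task_weights=[3, 1],
    horizontal_dist=[10.0, 25.0, 40.0, 55.0, 70.0],
    vertical_dist=[5.0, 15.0, 30.0, 45.0, 60.0],
    bs_altitude=20.0,
)
state_vector = obs.to_vector()
```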
4.2.2. Action Space
- represents the task offloading decision for the task j;
- stands for the transmission power distribution for the task j;
- stands for the computation power distribution for the task j.
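One possible way to turn raw actor outputs into the three action components above is sketched below. The mapping is an assumption made for illustration: it supposes the actor emits three values in [-1, 1] per task, that target 0 means local execution, and that the computation allocation is expressed as a share in [0, 1]. The function name and return keys are hypothetical; the defaults of 5 base stations and 0.4 W are taken from the parameter table.

```python
def decode_action(raw, num_base_stations=5, p_max=0.4):
    """Map three raw outputs in [-1, 1] to one task's action (hypothetical)."""
    # Clip, then rescale each component from [-1, 1] to [0, 1].
    unit = [(min(1.0, max(-1.0, x)) + 1.0) / 2.0 for x in raw]
    # Offloading decision: 0 = compute locally, 1..K = index of a base station.
    target = min(int(unit[0] * (num_base_stations + 1)), num_base_stations)
    return {
        "target": target,
        "tx_power_w": unit[1] * p_max,  # transmission power in watts
        "cpu_share": unit[2],           # share of computation resources
    }


local_action = decode_action([-1.0, 0.0, 1.0])
offload_action = decode_action([1.0, 1.0, -1.0])
```

Clipping before rescaling keeps every component inside its valid range even if exploration noise pushes the raw output outside [-1, 1].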
4.2.3. Reward Function
4.3. FA-MADDPG Algorithm Model
4.3.1. Attention-Based MADDPG Algorithm
4.3.2. Flash Attention Mechanism
4.3.3. Joint Optimization Algorithm Framework
Algorithm 1: Joint optimization with FA-MADDPG
4.3.4. Complexity Analysis
5. Simulation and Results
5.1. Convergence of Proposed Algorithm
5.2. Time Efficiency Analysis
5.3. Performance Evaluation
5.4. Flash Attention Benefits Analysis
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Abbreviations
Symbol | Meaning |
---|---|
 | user set |
 | time slot set |
 | task set |
 | base station set |
n | number of tasks |
 | base station Cartesian coordinates |
 | user device Cartesian coordinates |
w | channel bandwidth |
 | data transmission power |
 | noise power |
g | channel gain |
r | required transmission rate |
s | task size |
 | local computation ability |
 | local computation time |
 | local computation power |
 | local computation energy consumption |
 | base station computation ability |
 | base station computation power |
 | total offloading time |
 | total offloading energy consumption |
t | total delay for one task |
e | total energy consumption for one task |
 | task weight |
 | weight coefficient |
 | observation |
 | state space |
 | action space |
r | reward |
Task Weight | Traffic Throughput |
---|---|
0 | Background data or unimportant task |
1 | Notification data or lightweight task |
2 | Management or normal task |
3 | Network control or important task |
4 | Emergency task |
Parameters | Value |
---|---|
Task size | 120 KB |
MEC servers computing frequency | 3 GHz |
User device computing frequency | 0.8 GHz |
Channel bandwidth | 80 MHz |
Maximum transmission power | 0.4 W |
Noise power | 0.001 W |
Channel gain | 0.01 W |
Number of base stations | 5 |
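The parameter values above allow a quick sanity check of the uplink. The sketch below assumes the standard Shannon capacity form, rate = w · log2(1 + p · g / noise), which is a common choice for MEC communication models but is not confirmed by this section. It also assumes that the channel gain is used as a dimensionless factor, that the device transmits at the maximum power, and that 1 KB equals 1024 bytes. All variable names are hypothetical.

```python
import math

# Values from the parameter table.
bandwidth_hz = 80e6        # channel bandwidth
tx_power_w = 0.4           # maximum transmission power (assumed in use)
noise_power_w = 0.001      # noise power
channel_gain = 0.01        # treated as dimensionless (assumption)
task_bits = 120 * 1024 * 8  # 120 KB task, assuming 1 KB = 1024 bytes


def shannon_rate(w, p, g, noise):
    """Achievable rate in bit/s under the assumed Shannon model."""
    return w * math.log2(1.0 + p * g / noise)


rate_bps = shannon_rate(bandwidth_hz, tx_power_w, channel_gain, noise_power_w)
upload_time_s = task_bits / rate_bps          # time to transmit one task
upload_energy_j = tx_power_w * upload_time_s  # energy spent transmitting
```

Under these assumptions the signal-to-noise ratio is 4, so the rate is about 186 Mbit/s and one task uploads in roughly 5.3 ms at a transmission energy of about 2 mJ.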
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Gao, Y.; Yuan, X.; Wang, S.; Chen, L.; Zhang, Z.; Wang, T. Flash-Attention-Enhanced Multi-Agent Deep Deterministic Policy Gradient for Mobile Edge Computing in Digital Twin-Powered Internet of Things. Mathematics 2025, 13, 2164. https://doi.org/10.3390/math13132164