Research on Multi-Agent D2D Communication Resource Allocation Algorithm Based on A2C
Abstract
1. Introduction
2. System Model
2.1. System Model Establishment
2.2. Problem Establishment
3. Proposed Algorithm
3.1. A2C Environment Model based on Deep Learning
3.2. Network Parameter Update
Algorithm 1. D2D communication resource allocation algorithm |
Algorithm: A D2D Communication Resource Allocation Algorithm Based on A2C |
Initialization: |
Initialize the cell, base station, cellular users, and D2D users in the communication simulation. |
Policy model for all D2D users. |
Actor network: parameters and learning rate. |
Critic network: parameters and learning rate. |
Discount factor. |
Advantage function. |
T: number of communication simulation timeslot cycles. |
Train: |
1: Initialize the timeslot counter t. |
2: Cycle: |
3: Each D2D user observes its own state. |
4: Based on its current state and policy, each D2D user outputs its action, including the transmit power. |
5: Each D2D user obtains the reward for its current state and chosen action. |
6: Each D2D user observes its next state. |
7: Each D2D user feeds its observations into the critic network and uses the expected value returned by the critic to calculate the advantage function. |
8: Update the critic network parameters. |
9: Update the actor network parameters. |
10: Update the MAA2C policy. |
11: Output the obtained rewards. |
12: Update the simulation platform. |
13: Until t = T; return the test result. |
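The training loop in Algorithm 1 can be illustrated with a small NumPy sketch. This is not the authors' implementation: it assumes a linear softmax actor and a linear critic in place of deep networks, a one-step advantage estimate, a random placeholder state, and a toy reward that pays 1 when a D2D pair is the only one on its resource block. The names `A2CAgent`, `toy_rewards`, `decode`, the state size, the discretised power levels, and all hyperparameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

N_AGENTS = 5                          # D2D pairs (smallest setting in the parameter table)
N_RB = 5                              # resource blocks
POWER_LEVELS_DBM = [5.0, 14.0, 23.0]  # hypothetical discretisation, capped at 23 dBm
N_ACTIONS = N_RB * len(POWER_LEVELS_DBM)
STATE_DIM = 4                         # hypothetical observation size
GAMMA = 0.9                           # hypothetical discount factor
LR_ACTOR, LR_CRITIC = 0.01, 0.05      # hypothetical learning rates


def softmax(z):
    e = np.exp(z - z.max())           # subtract the max for numerical stability
    return e / e.sum()


def decode(action):
    """Map a flat action index to (resource block, transmit power in dBm)."""
    rb, p = divmod(action, len(POWER_LEVELS_DBM))
    return rb, POWER_LEVELS_DBM[p]


class A2CAgent:
    def __init__(self):
        self.theta = np.zeros((N_ACTIONS, STATE_DIM))  # actor parameters
        self.w = np.zeros(STATE_DIM)                   # critic parameters

    def policy(self, s):
        return softmax(self.theta @ s)

    def act(self, s):
        return int(rng.choice(N_ACTIONS, p=self.policy(s)))

    def update(self, s, a, r, s_next):
        # Step 7: one-step advantage, A = r + gamma * V(s') - V(s)
        advantage = r + GAMMA * (self.w @ s_next) - (self.w @ s)
        # Step 8: critic update (semi-gradient TD(0))
        self.w += LR_CRITIC * advantage * s
        # Step 9: actor update; for a linear softmax policy,
        # grad log pi(a|s) = (onehot(a) - pi(.|s)) outer s
        grad_log = -np.outer(self.policy(s), s)
        grad_log[a] += s
        self.theta += LR_ACTOR * advantage * grad_log
        return advantage


def toy_rewards(actions):
    """Toy stand-in for the simulator: reward 1 if no other pair shares the RB."""
    rbs = [decode(a)[0] for a in actions]
    return [1.0 if rbs.count(rb) == 1 else 0.0 for rb in rbs]


def train(T=200):
    agents = [A2CAgent() for _ in range(N_AGENTS)]
    states = [rng.normal(size=STATE_DIM) for _ in range(N_AGENTS)]
    history = []
    for _ in range(T):                                   # steps 2 and 13
        actions = [ag.act(s) for ag, s in zip(agents, states)]   # steps 3-4
        rewards = toy_rewards(actions)                           # step 5
        next_states = [rng.normal(size=STATE_DIM) for _ in range(N_AGENTS)]  # step 6
        for ag, s, a, r, s2 in zip(agents, states, actions, rewards, next_states):
            ag.update(s, a, r, s2)                               # steps 7-9
        states = next_states
        history.append(sum(rewards))                             # step 11
    return agents, history
```

Calling `train(T=200)` returns the trained agents and the per-timeslot sum of rewards. Each agent learns only from its own observation and reward, which mirrors the per-user wording of the algorithm steps; a real simulator would replace `toy_rewards` and the random states with channel and interference measurements.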
4. Simulation Results and Analysis
4.1. Simulation Parameter Setting
4.2. Result Analysis
5. Conclusions
6. Patents
Author Contributions
Funding
Acknowledgments
Conflicts of Interest
References
Simulation parameter | Value (unit) |
---|---|
Base station transmit power | 46 dBm |
Maximum transmit power of a D2D user | 23 dBm |
Maximum D2D communication distance | 20 m |
Number of cellular users | 5 |
Number of resource blocks (RBs) | 5 |
Number of D2D communication pairs | 5, 10, 15, 20, 25 |
Path loss model of cellular communication link | |
D2D communication link path loss factor | 4 |
Cellular user target SINR threshold | 0 dB |
Cellular subscriber outage probability threshold | 0.01 |
D2D user target SINR threshold | 0 dB |
D2D user outage probability threshold | 0.01 |
Noise power spectral density | −174 dBm/Hz |
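The table gives its values in logarithmic units, while SINR is computed from linear powers. The following sketch shows the conversions under stated assumptions: a resource block bandwidth of 180 kHz (the LTE value, not given in the table), the D2D "path loss factor" of 4 read as a path loss exponent in a simplified distance-power model with unit gain at 1 m, and hypothetical helper names throughout.

```python
import math

NOISE_PSD_DBM_HZ = -174.0      # from the table
BS_TX_DBM = 46.0               # from the table
D2D_MAX_TX_DBM = 23.0          # from the table
D2D_PATH_LOSS_EXPONENT = 4     # from the table, read as an exponent (assumption)
SINR_THRESHOLD_DB = 0.0        # from the table
RB_BANDWIDTH_HZ = 180e3        # assumption: LTE resource block bandwidth


def dbm_to_watt(p_dbm):
    """Convert a power in dBm to watts."""
    return 10 ** ((p_dbm - 30.0) / 10.0)


def noise_power_dbm(bandwidth_hz):
    """Thermal noise power over a bandwidth, from the noise power spectral density."""
    return NOISE_PSD_DBM_HZ + 10.0 * math.log10(bandwidth_hz)


def d2d_path_gain(distance_m):
    """Simplified d^(-alpha) path gain with unit gain at 1 m (assumption)."""
    return distance_m ** (-D2D_PATH_LOSS_EXPONENT)


def sinr_db(signal_w, interference_w, noise_w):
    """SINR in dB from linear powers in watts."""
    return 10.0 * math.log10(signal_w / (interference_w + noise_w))


def meets_threshold(signal_w, interference_w, noise_w):
    """Check a link against the target SINR threshold from the table."""
    return sinr_db(signal_w, interference_w, noise_w) >= SINR_THRESHOLD_DB
```

For example, `dbm_to_watt(noise_power_dbm(RB_BANDWIDTH_HZ))` gives the noise power per resource block in watts, and `dbm_to_watt(D2D_MAX_TX_DBM) * d2d_path_gain(20.0)` gives the received power at the maximum D2D distance under this simplified model.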
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Li, X.; Chen, G.; Wu, G.; Sun, Z.; Chen, G. Research on Multi-Agent D2D Communication Resource Allocation Algorithm Based on A2C. Electronics 2023, 12, 360. https://doi.org/10.3390/electronics12020360