An Intelligent Access Channel Algorithm Based on Distributed Double Q Learning
Abstract
1. Introduction
- A distributed double Q-learning communication anti-jamming algorithm (DDQL) is proposed that avoids both interference between users and external malicious jamming. The algorithm reduces the dimension of the action set and shortens its convergence time.
- It is proved that the joint channel and subframe optimization problem, which has two variables, can be equivalently transformed into two single-variable optimization problems.
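The effect of such a transformation on the action set can be illustrated with a small sketch. It is not the paper's proof: it assumes the two variables are a channel index and a subframe index and that the objective is additively separable, which is the simplest case in which one two-variable search equals two one-variable searches. The function names and the score lists `f` and `g` are hypothetical.

```python
from itertools import product


def argmax_joint(f, g):
    """Search every (channel, subframe) pair: len(f) * len(g) candidates."""
    pairs = product(range(len(f)), range(len(g)))
    return max(pairs, key=lambda p: f[p[0]] + g[p[1]])


def argmax_separate(f, g):
    """Search each variable on its own: len(f) + len(g) candidates."""
    best_channel = max(range(len(f)), key=f.__getitem__)
    best_subframe = max(range(len(g)), key=g.__getitem__)
    return best_channel, best_subframe


# Illustrative per-channel and per-subframe scores (assumed values).
f = [0.2, 0.9, 0.5]
g = [0.1, 0.4, 0.3, 0.8]

joint_choice = argmax_joint(f, g)
separate_choice = argmax_separate(f, g)
joint_size = len(f) * len(g)
separate_size = len(f) + len(g)
print(joint_choice, separate_choice, joint_size, separate_size)
```

Under this assumption both searches return the same pair, while the number of candidates drops from a product to a sum, which is one way a smaller action set can arise.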
2. System Model and Problem Formulation
2.1. System Model
- As shown in Figure 1, we consider a wireless communication network. There are users, one base station, and one jammer. The number of channels available is . The set of channels available is . Each user computes and executes its decisions in a distributed manner, so the hardware performance requirements are modest and the scheme has good scalability and strong processing power.
- A MAC frame consists of subframes, and each subframe consists of time slots. Therefore, the number of time slots in a MAC frame is . Each subframe is followed by a feedback signal from the base station. The feedback signal carries the current subframe and the current channel occupancy information, so that all users can make full use of the global resource occupancy information.
- The jamming mode of the jammer is multi-channel intelligent blocking interference [23]. The jammer can quickly sense the channel occupied by the user and select channels that have been occupied for the longest time in the previous multiple subframes for jamming. The jamming channel set can be defined as .
- It is assumed that the subframe and time slot of the system are synchronized, and the subframe is the basic transmission unit in this paper. Each user has packets to be transmitted. Each time slot can successfully transmit one packet.
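The jammer behavior and the success condition described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's model: the jammer ranks channels by how many of the observed subframes they were occupied in, ties go to the lower channel index, and a transmission succeeds only when the user is alone on a channel that is not jammed. The names `select_jammed_channels` and `successful_users` are hypothetical.

```python
from collections import Counter
from typing import List, Set


def select_jammed_channels(occupancy_history: List[Set[int]], num_jammed: int) -> Set[int]:
    """Pick the channels occupied in the largest number of observed subframes.

    occupancy_history[t] is the set of channels occupied in subframe t.
    Ties are broken by lower channel index (an assumption of this sketch).
    """
    counts = Counter()
    for occupied in occupancy_history:
        counts.update(occupied)
    ranked = sorted(counts, key=lambda ch: (-counts[ch], ch))
    return set(ranked[:num_jammed])


def successful_users(choices: List[int], jammed: Set[int]) -> List[int]:
    """Return indices of users that are alone on an unjammed channel."""
    load = Counter(choices)
    return [u for u, ch in enumerate(choices) if load[ch] == 1 and ch not in jammed]


# Illustrative history over four subframes and one round of user choices.
history = [{0, 2}, {0, 3}, {0, 2}, {1, 2}]
jammed = select_jammed_channels(history, num_jammed=2)
winners = successful_users([0, 1, 1, 3], jammed)
print(jammed, winners)
```

In the example, channels 0 and 2 were each occupied in three subframes and are jammed; the user on channel 0 is blocked, the two users sharing channel 1 collide, and only the user on channel 3 succeeds.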
2.2. Problem Formulation
3. Anti-Jamming Algorithm of Intelligent Communication Based on Distributed Double Q-Learning
Algorithm 1: The user-cooperative distributed double Q-learning algorithm
1: Initialization: , , , , .
Set the learning factor and the maximum number of iterations .
2: repeat
3: for do
4: According to the base station feedback signal, users calculate their own reward and according to Equations (10) and (11).
5: Update and according to Equations (18) and (19).
6: end for
7: Find the optimal strategy according to Equation (20).
8: until the number of iterations has reached or convergence has been reached.
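The loop in Algorithm 1 can be approximated with a short sketch. Equations (10), (11), and (18) to (20) are not reproduced in this text, so the code below substitutes the standard tabular double Q-learning update (two tables, one selecting the next action and the other evaluating it) and a toy reward of 1 for a collision-free, unjammed channel. The class `DoubleQUser`, the function `run_toy_episode`, the single-state environment, and all parameter values are assumptions for illustration.

```python
import random
from collections import Counter


class DoubleQUser:
    """Tabular double Q-learning for one user (standard update rule, assumed)."""

    def __init__(self, num_states, num_actions, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.num_actions = num_actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)
        self.qa = [[0.0] * num_actions for _ in range(num_states)]
        self.qb = [[0.0] * num_actions for _ in range(num_states)]

    def act(self, state):
        """Epsilon-greedy action on the sum of both tables."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.num_actions)
        total = [a + b for a, b in zip(self.qa[state], self.qb[state])]
        return max(range(self.num_actions), key=total.__getitem__)

    def update(self, state, action, reward, next_state, update_a=None):
        """One double Q-learning step; update_a=None picks the table at random."""
        if update_a is None:
            update_a = self.rng.random() < 0.5
        learn, evaluate = (self.qa, self.qb) if update_a else (self.qb, self.qa)
        # Select the next action with the learning table, evaluate it with the other.
        best = max(range(self.num_actions), key=learn[next_state].__getitem__)
        target = reward + self.gamma * evaluate[next_state][best]
        learn[state][action] += self.alpha * (target - learn[state][action])


def run_toy_episode(users, jammed, steps=200):
    """Toy loop: single state, reward 1 for a collision-free, unjammed channel."""
    for _ in range(steps):
        choices = [u.act(0) for u in users]
        load = Counter(choices)
        for u, ch in zip(users, choices):
            reward = 1.0 if load[ch] == 1 and ch not in jammed else 0.0
            u.update(0, ch, reward, 0)


# Hand-checkable single update on table A.
user = DoubleQUser(num_states=2, num_actions=2, alpha=0.5, gamma=0.9)
user.qa[1] = [1.0, 3.0]
user.qb[1] = [2.0, 0.5]
user.update(state=0, action=0, reward=1.0, next_state=1, update_a=True)
print(user.qa[0][0])
```

In the single update, table A selects action 1 in the next state and table B supplies its value, so the target is 1.0 + 0.9 × 0.5. Decoupling selection from evaluation in this way is what distinguishes double Q-learning from the single-table update.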
4. Simulation Results and Discussion
4.1. Parameter Settings
4.2. Analysis of Simulation
- Sensing-based random channel selection algorithm: before selecting a channel, the user senses whether the channel is subject to interference. If no interference is detected in the current transmission channel, the user continues to transmit on the same channel; otherwise, the user randomly switches to another channel in the next action slot. No information is exchanged between users.
- Binary independent Q-learning algorithm (ILQ): each user runs the Q-learning algorithm independently, makes transmission decisions using only its local learning results, and never considers the decisions of other users.
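The two comparison schemes above can be sketched in a few lines. This is an illustration under assumptions rather than the simulated implementation: the sensing result is passed in as a boolean, the random hop is uniform over the other channels, and the independent learner uses the standard single-table Q-learning update. The function names and parameter values are hypothetical.

```python
import random


def sense_and_switch(current_channel, interfered, num_channels, rng):
    """Stay if no interference was sensed; otherwise hop to a random other channel."""
    if not interfered:
        return current_channel
    others = [ch for ch in range(num_channels) if ch != current_channel]
    return rng.choice(others)


def independent_q_update(q, state, action, reward, next_state, alpha, gamma):
    """Standard Q-learning step on one user's own table, ignoring other users."""
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])


rng = random.Random(0)
stay = sense_and_switch(2, interfered=False, num_channels=5, rng=rng)
hop = sense_and_switch(2, interfered=True, num_channels=5, rng=rng)

q_table = [[0.0, 0.0], [1.0, 3.0]]
independent_q_update(q_table, state=0, action=0, reward=1.0, next_state=1, alpha=0.5, gamma=0.9)
print(stay, hop, q_table[0][0])
```

Unlike the double Q-learning update, the independent learner both selects and evaluates the next action with the same table, through the `max` over `q[next_state]`.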
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Alsabah, M.; Naser, M.A.; Mahmmod, B.M. 6G Wireless Communications Networks: A Comprehensive Survey. IEEE Access 2021, 9, 2169–3536.
- Klaus, W.; Puttnam, B.J.; Luis, R.S.; Sakaguchi, J.; Mendinueta, J.D.; Awaji, Y.; Wada, N. Advanced space division multiplexing technologies for optical networks [Invited]. J. Opt. Commun. Netw. 2017, 9, C1–C11.
- Li, S.; Hou, Y.T.; Lou, W.; Jalaian, B.A.; Russell, S. Maximizing Energy Efficiency with Channel Uncertainty under Mutual Interference. IEEE Trans. Wirel. Commun. 2022, 21, 1276–1536.
- Lin, J.; Tian, B.; Wu, J.; He, J. Spectrum Resource Trading and Radio Management Data Sharing Based on Blockchain. In Proceedings of the 2020 IEEE 3rd International Conference on Information Systems and Computer Aided Education (ICISCAE), Dalian, China, 27–29 September 2020; pp. 83–87.
- Zou, Y.; Zhu, J.; Wang, X.; Hanzo, L. A Survey on Wireless Security: Technical Challenges, Recent Advances, and Future Trends. Proc. IEEE 2016, 104, 1727–1765.
- Adil, M.; Khan, R.; Ali, J. An Energy Proficient Load Balancing Routing Scheme for Wireless Sensor Networks to Maximize Their Lifespan in an Operational Environment. IEEE Access 2020, 8, 163209–163224.
- Ram, S.S.; Ghatak, G. Optimization of Network Throughput of Joint Radar Communication System Using Stochastic Geometry. Front. Sig. Proc. 2022, 2, 835743.
- Zhang, S.; Li, M.; Jian, M. AIRIS: Artificial Intelligence Enhanced Signal Processing in Reconfigurable Intelligent Surface Communications. China Commun. 2021, 18, 1276–1536.
- Noels, N.; Moeneclaey, M. Performance of advanced telecommand frame synchronizer under pulsed jamming conditions. In Proceedings of the IEEE International Conference on Communications (ICC), Paris, France, 21–25 May 2017.
- Hall, M.; Silvennoinen, A.; Haggman, S. Effect of pulse jamming on IEEE 802.11 wireless LAN performance. In Proceedings of the MILCOM 2005—2005 IEEE Military Communications Conference, Atlantic City, NJ, USA, 17–20 October 2005; pp. 2301–2306.
- Hwang, K.; Chen, M.; Gharavi, H. Artificial Intelligence for Cognitive Wireless Communications (Editorial). IEEE Wirel. Commun. 2019, 26, 1284–1536.
- Wang, J.; Jiang, C.; Zhang, H.; Ren, Y.; Chen, K.; Hanzo, L. Pareto-Optimal Wireless Networks. IEEE Commun. Surv. Tutor. 2020, 22, 1472–1514.
- Zhou, Z. Machine Learning. In Proceedings of the ACM SIGMOD International Conference on Management of Data, Vancouver, BC, Canada, 10–12 June 2008; pp. 1247–1250.
- Sutton, R.S.; Barto, A.G. Reinforcement Learning; Electronic Industry Press: Beijing, China, 2019.
- Galindo-Serrano, A.; Giupponi, L. Distributed Q-Learning for Aggregated Interference Control in Cognitive Radio Networks. IEEE Trans. Veh. Technol. 2010, 59, 1823–1834.
- Sharma, S.K.; Wang, X. Collaborative Distributed Q-Learning for RACH Congestion Minimization in Cellular IoT Networks. IEEE Commun. Lett. 2019, 23, 600–603.
- Aref, M.A.; Jayaweera, S.K. A novel cognitive anti-jamming stochastic game. In Proceedings of the Cognitive Communications for Aerospace Applications Workshop (CCAA), Cleveland, OH, USA, 27–28 June 2017; pp. 1–4.
- Zhou, Q.; Li, Y.; Niu, Y.; Qin, Z.; Zhao, L.; Wang, J. “One Plus One Is Greater Than Two”: Defeating Intelligent Dynamic Jamming with Collaborative Multi-Agent Reinforcement Learning. In Proceedings of the International Conference on Computer and Communications (ICCC), Chengdu, China, 11–14 December 2020.
- Littman, M.L. Value-function reinforcement learning in Markov games. Cogn. Syst. Res. 2001, 2, 55–66.
- Aref, M.A.; Jayaweera, S.K.; Machuzak, S. Multi-agent reinforcement learning based cognitive anti-jamming. In Proceedings of the IEEE Wireless Communications and Networking Conference (WCNC), San Francisco, CA, USA, 19–22 March 2017; pp. 1–6.
- Machuzak, S.; Jayaweera, S.K. Reinforcement learning based anti-jamming with wideband autonomous cognitive radios. In Proceedings of the IEEE International Conference on Communications in China (ICCC), Chengdu, China, 27–29 July 2016; pp. 1–5.
- Su, J.; Ren, G. An SCMA-Based Decoupled Distributed Q-Learning Random Access Scheme for Machine-Type Communication. IEEE Commun. Lett. 2021, 10, 1737–1741.
- Zhou, Q.; Li, Y.; Niu, Y. Intelligent Anti-Jamming Communication for Wireless Sensor Networks: A Multi-Agent Reinforcement Learning Approach. IEEE Commun. Lett. 2021, 2, 775–784.
- Wang, J. Optimization Theory and Methods; Beijing University of Technology Press: Beijing, China, 2018.
- Vlassis, N. A Concise Introduction to Multiagent Systems and Distributed Artificial Intelligence; Morgan and Claypool Publishers: Williston, VT, USA, 2007.
- Zhong, A. Preamble Design and Collision Resolution in a Massive Access IoT System. Sensors 2021, 21, 250.
Parameters | Value |
---|---|
Communication users | |
Available channels | |
Subframes | |
Slots | |
Packets to be transmitted | |
Number of blocked channels | |
Learning | |
Discount factor | |
Greedy index | |
Number of iterations | |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Zhang, G.; Niu, Y.; Li, Y.; Zhao, L. An Intelligent Access Channel Algorithm Based on Distributed Double Q Learning. Appl. Sci. 2022, 12, 10815. https://doi.org/10.3390/app122110815