An Intelligent Decision-Making for Electromagnetic Spectrum Allocation Method Based on the Monte Carlo Counterfactual Regret Minimization Algorithm in Complex Environments
Abstract
1. Introduction
2. Methods and Model
2.1. Methods
2.1.1. Monte Carlo Method (MCM)
2.1.2. Counterfactual Regret Minimization
2.2. Model
2.2.1. Modeling Approach
2.2.2. Modeling of Red Team Defensive Agents
- (1) Environment modeling
- (2) Frequency selection for communication links based on MCM
- (3) Communication link calculation
- (1) Determination of communication link interference
- (2) Punish the agent
- (3) Reward the agent
- (4) Calculate regret values and update regret and strategy values based on CFR
2.2.3. Modeling of Blue Team Attacking Agents
3. Verification and Analysis
3.1. Interference Determination
3.2. Typical Scenario Analysis
4. Conclusions
Author Contributions
Funding
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Wang, J.; Hao, Y.; Yang, C. The Current Progress and Future Prospects of Path Loss Model for Terrestrial Radio Propagation. Electronics 2023, 12, 4959.
- Xu, M.; Olaimat, M.; Tang, T.; Ramahi, O.M.; Aldhaeebi, M.; Jin, Z.; Zhu, M. Numerical Modeling of the Radio Wave over-the-Horizon Propagation in the Troposphere. Atmosphere 2022, 13, 1184.
- Serkov, A.; Kasilov, O.; Lazurenko, B.; Pevnev, V.; Trubchaninova, K. Strategy of Building a Wireless Mobile Communication System in the Conditions of Electronic Counteraction. Electronics 2023, 12, 160–170.
- Zhang, C. The Construction and Characteristics of Foreign Army’s Electronic Countermeasure Equipment. SHS Web Conf. 2021, 96, 01012.
- Lv, Q.; Zhang, W.; Feng, W.; Xing, M. Deep Neural Network-Based Interrupted Sampling Deceptive Jamming Countermeasure Method. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2022, 15, 9073–9085.
- Wang, J.; Yang, C.; An, W. Regional Refined Long-Term Predictions Method of Usable Frequency for HF Communication Based on Machine Learning over Asia. IEEE Trans. Antennas Propag. 2022, 70, 4040–4055.
- Wang, J.; Yang, C.; Yan, N. Study on Digital Twin Channel for the B5G and 6G Communication. Chin. J. Radio Sci. 2021, 36, 340–348.
- Wang, J.; Shi, D. Cyber-Attacks Related to Intelligent Electronic Devices and Their Countermeasures: A Review. In Proceedings of the 2018 53rd International Universities Power Engineering Conference (UPEC), Glasgow, UK, 4–7 September 2018; pp. 1–6.
- Tu, Z. Design and Implementation of an Artificial Intelligence Gomoku System. Master’s Thesis, Hunan University, Changsha, China, 2016.
- Zhang, H.; Li, D.; He, Y. Artificial Intelligence and “StarCraft”: New Progress in Multi-Agent Game Research. Unmanned Syst. Technol. 2019, 2, 5–16.
- Sutton, R.S.; Barto, A.G. Reinforcement Learning: An Introduction; MIT Press: Cambridge, MA, USA, 1998.
- Bichler, M.; Fichtl, M.; Oberlechner, M. Computing Bayes–Nash Equilibrium Strategies in Auction Games via Simultaneous Online Dual Averaging. Oper. Res. 2023, ahead of print.
- LeCun, Y.; Bengio, Y.; Hinton, G. Deep Learning. Nature 2015, 521, 436–444.
- Henderson, P.; Islam, R.; Bachman, P.; Pineau, J.; Precup, D.; Meger, D. Deep Reinforcement Learning That Matters. In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018.
- Du, W.; Ding, S. A Survey on Multi-Agent Deep Reinforcement Learning: From the Perspective of Challenges and Applications. Artif. Intell. Rev. 2021, 54, 3215–3238.
- Hu, J.; Wellman, M.P. Nash Q-Learning for General-Sum Stochastic Games. J. Mach. Learn. Res. 2003, 4, 1039–1069.
- Lee, D.; Defourny, B.; Powell, W.B. Bias-Corrected Q-Learning to Control Max-Operator Bias in Q-Learning. In Proceedings of the IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, Singapore, 16–18 April 2013; pp. 93–99.
- Bowling, M.; Veloso, M. Multiagent Learning Using a Variable Learning Rate. Artif. Intell. 2002, 136, 215–250.
- Claus, C.; Boutilier, C. The Dynamics of Reinforcement Learning in Cooperative Multiagent Systems. In Proceedings of the Fifteenth National Conference on Artificial Intelligence (AAAI), Madison, WI, USA, 26–30 July 1998.
- Lauer, M.; Riedmiller, M. An Algorithm for Distributed Reinforcement Learning in Cooperative Multi-Agent Systems. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Takamatsu, Japan, 30 October–5 November 2000; pp. 120–126.
- Kok, J.R.; Vlassis, N. Sparse Cooperative Q-Learning. In Proceedings of the Twenty-First International Conference on Machine Learning (ICML), Banff, AB, Canada, 4–8 July 2004; pp. 481–488.
- Lowe, R.; Wu, Y.; Tamar, A.; Harb, J.; Abbeel, P.; Mordatch, I. Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments. arXiv 2017, arXiv:1706.02275.
- Meng, Y.; Qi, P.; Lei, Q.; Zhang, Z.; Ren, J.; Zhou, X. Electromagnetic Spectrum Allocation Method for Multi-Service Irregular Frequency-Using Devices in the Space–Air–Ground Integrated Network. Sensors 2022, 22, 9227.
- Chen, J.; Wu, Q.; Xu, Y.; Qi, N.; Fang, T.; Liu, D. Spectrum Allocation for Task-Driven UAV Communication Networks Exploiting Game Theory. IEEE Wirel. Commun. 2021, 28, 174–181.
- Thakur, P.; Kumar, A.; Pandit, S.; Singh, G.; Satashia, S.N. Performance Analysis of Cognitive Radio Networks Using Channel-Prediction-Probabilities and Improved Frame Structure. Digit. Commun. Netw. 2018, 4, 287–295.
- ITU-R P.676-11: Attenuation by Atmospheric Gases and Related Effects; International Telecommunication Union: Geneva, Switzerland, 2022.
- ITU-R P.838-3: Specific Attenuation Model for Rain for Use in Prediction Methods; International Telecommunication Union: Geneva, Switzerland, 2005.
- ITU-R P.525-5: Calculation of Free-Space Attenuation; International Telecommunication Union: Geneva, Switzerland, 2024.
| Algorithm: The frequency allocation process for the Red Team defensive agents. | Time Complexity | Space Complexity |
|---|---|---|
| Input: the inputs listed in Table 2. | | |
| Output: the outputs listed in Table 2. | | |
| Begin | | |
| Step 1: Initialization. | | |
| Load scene. | O(1) | O(1) |
| Read input information. | O(1) | O(1) |
| Step 2: Frequency selection for communication links based on MCCFR. | | |
| Initialize strategy and regret values. | O(1) | O(1) |
| for i = 1:Na | O(1) | O(1) |
| for j = 1:N | O(Na + N) | O(1) |
| rBF = MC(rF). | O(1) | |
| Simulate confrontation. | O(1) | |
| Accumulate regret. | O(1) | |
| Update strategy. | O(1) | |
| end | O(1) | |
| end | O(1) | |
| Determine the final frequency by CFR from the strategy and rBFP. | O(1) | O(1) |
| Step 3: Communication link calculation. | | |
| Build links with the other intelligent agents. | O(1) | O(1) |
| Calculate the path loss L by ITU-R P.525. | O(1) | O(1) |
| Step 4: Output the parameters listed in Table 2. | O(1) | O(1) |
| End | | |
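To make Step 2 concrete, the following is a minimal, self-contained Python sketch of frequency selection by regret matching with Monte Carlo sampling, the core of the MCCFR loop above. The `payoff` callback, the single-decision game abstraction, and all identifiers are illustrative assumptions, not the authors' implementation.

```python
import random

def regret_matching(regrets):
    """Turn cumulative regrets into a strategy: normalize the positive
    regrets; if none are positive, fall back to a uniform strategy."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0.0:
        return [p / total for p in positive]
    return [1.0 / len(regrets)] * len(regrets)

def mccfr_select_frequency(freqs, payoff, n_iters=1000, seed=0):
    """Sketch of Step 2: sample confrontations, accumulate counterfactual
    regret for every candidate frequency, and update the average strategy.

    freqs  : candidate frequency points, e.g. [40, 500, 900, 2200, 2700] MHz
    payoff : payoff(freq, rng) -> utility of `freq` in one simulated
             confrontation (hypothetical callback supplied by the caller)
    """
    rng = random.Random(seed)
    k = len(freqs)
    regrets = [0.0] * k        # cumulative regret per frequency
    strategy_sum = [0.0] * k   # running sum for the average strategy

    for _ in range(n_iters):
        strategy = regret_matching(regrets)
        # Monte Carlo step: sample one frequency from the current strategy.
        a = rng.choices(range(k), weights=strategy)[0]
        utils = [payoff(f, rng) for f in freqs]   # simulated confrontation
        for b in range(k):
            # Counterfactual regret: gain of playing b instead of a.
            regrets[b] += utils[b] - utils[a]
            strategy_sum[b] += strategy[b]

    total = sum(strategy_sum)
    average = [s / total for s in strategy_sum]
    best = max(range(k), key=lambda i: average[i])
    return freqs[best], average

# Toy usage: frequencies near active jammers pay off poorly.
jammed = {500, 900}
best_freq, avg_strategy = mccfr_select_frequency(
    [40, 500, 900, 2200, 2700],
    lambda f, rng: -1.0 if f in jammed and rng.random() < 0.8 else 1.0)
print(best_freq, [round(p, 2) for p in avg_strategy])
```

Reading the final frequency off the accumulated average strategy, rather than the last iterate, mirrors the "Determine the final frequency by CFR" step: it is the average strategy, not the current one, that carries regret matching's convergence guarantee.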
Number | Type | Item | Identification | Format | Remarks |
---|---|---|---|---|---|
1 | Input | Longitude | rLon | double | Range: −180 to 180 degrees
2 | Input | Latitude | rLat | double | Range: −90 to 90 degrees
3 | Input | Altitude | rH | double | Unit: m |
4 | Input | Power | rP | double | Unit: W |
5 | Input | Antenna Gain | rGt,rGr | double | Unit: dB; includes both the transmitting-antenna gain (rGt) and the receiving-antenna gain (rGr)
6 | Input | Frequency to be Selected | rF | double | Range: 30 MHz to 3000 MHz |
7 | Input | Frequency Priority | rFP | double | Priority is expressed as a percentage; a higher percentage indicates higher priority |
8 | Input | Antenna Type | bard | char | - |
9 | Input | Modulation Scheme | MS | int | The simulation uses a modulation order of 6; common orders such as 2, 4, 6, and 8 are supported.
10 | Input | Coding Scheme | CS | char | The simulation is set to ’MPSK’; supported schemes include ’MPSK’, ’BPSK’, ’2PSK’, ’QPSK’, ’8FSK’, and ’2FSK’.
11 | Output | Communication Frequency Point | rBF | double | Range: 30 MHz to 3000 MHz |
12 | Output | Interference Probability of Communication Frequency Point | rBFP | double | Probability of interference is expressed as a percentage; a higher percentage indicates a higher likelihood of interference |
13 | Output | Signal-to-Interference-plus-Noise Ratio | rSINR | double | Unit: dB
14 | Output | Signal-to-Noise Ratio | rSNR | double | Unit: dB |
15 | Output | Bit Error Rate | RER | double | Dimensionless (probability of a bit error)
16 | Output | Interference Status | biter | char | Output: Yes/No |
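Step 3 of the algorithm evaluates the link using the free-space attenuation of ITU-R P.525. Below is a hedged Python sketch of that link budget built from the inputs and outputs of Table 2 (rP, rGt, rGr, rF → rSNR, rSINR); the −104 dBm noise floor, the distance argument, and the dBm bookkeeping are illustrative assumptions, not values from the paper.

```python
import math

def fspl_db(f_mhz, d_km):
    """Free-space basic transmission loss per ITU-R P.525:
    Lbf = 32.4 + 20*log10(f) + 20*log10(d), f in MHz, d in km."""
    return 32.4 + 20.0 * math.log10(f_mhz) + 20.0 * math.log10(d_km)

def link_budget(r_p_w, r_gt_db, r_gr_db, r_f_mhz, d_km,
                noise_dbm=-104.0, jam_dbm=None):
    """Received power, rSNR, and rSINR for one communication link.
    `jam_dbm` is the jamming power at the receiver (None if unjammed)."""
    p_tx_dbm = 10.0 * math.log10(r_p_w * 1e3)                 # W -> dBm
    p_rx_dbm = p_tx_dbm + r_gt_db + r_gr_db - fspl_db(r_f_mhz, d_km)
    r_snr_db = p_rx_dbm - noise_dbm
    if jam_dbm is None:
        return p_rx_dbm, r_snr_db, r_snr_db
    # Noise and jamming add in linear (mW) units in the SINR denominator.
    denom_mw = 10.0 ** (noise_dbm / 10.0) + 10.0 ** (jam_dbm / 10.0)
    r_sinr_db = p_rx_dbm - 10.0 * math.log10(denom_mw)
    return p_rx_dbm, r_snr_db, r_sinr_db

# Example: a 30 W, 0 dB gain link (Communication Agent 1) at 900 MHz over
# 20 km, with -90 dBm of jamming power arriving at the receiver.
print(link_budget(30.0, 0.0, 0.0, 900.0, 20.0, jam_dbm=-90.0))
```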
| Algorithm: The spectrum allocation process for the Blue Team attacking agents. | Time Complexity | Space Complexity |
|---|---|---|
| Input: the inputs listed in Table 4. | | |
| Output: the outputs listed in Table 4. | | |
| Begin | | |
| Step 1: Initialization. | | |
| Load scene. | O(1) | O(1) |
| Read input information. | O(1) | O(1) |
| Step 2: Frequency selection for attacking links based on MCCFR. | | |
| Initialize strategy and regret values. | O(1) | O(1) |
| for i = 1:Na | O(1) | O(1) |
| for j = 1:N | O(Na + N) | O(1) |
| bBF = MC(bF). | O(1) | |
| Simulate confrontation. | O(1) | |
| Accumulate regret. | O(1) | |
| Update strategy. | O(1) | |
| end | O(1) | |
| end | O(1) | |
| Determine the final frequency by CFR from the strategy and bfp. | O(1) | O(1) |
| Step 3: Attacking link calculation. | | |
| Attack the Red Team intelligent agents. | O(1) | O(1) |
| Calculate the path loss L by ITU-R P.525. | O(1) | O(1) |
| Step 4: Output the parameters listed in Table 4. | O(1) | O(1) |
| End | | |
Number | Type | Item | Identification | Format | Remarks |
---|---|---|---|---|---|
1 | Input | Longitude | blon | double | Range: −180 to 180 degrees
2 | Input | Latitude | blat | double | Range: −90 to 90 degrees
3 | Input | Altitude | bh | double | Unit: m |
4 | Input | Power | bp | int | Unit: W |
5 | Input | Antenna Gain | bGt,bGr | double | Unit: dB; includes both the transmitting-antenna gain (bGt) and the receiving-antenna gain (bGr)
6 | Input | Frequency to be Selected | bf | double | Range: 30 MHz to 3000 MHz |
7 | Input | Frequency Band Priority | bfp | double | Priority expressed as a percentage; higher percentage indicates higher priority |
8 | Input | Antenna Radiation Pattern | bard | double | Unit: dB |
9 | Output | Jamming Frequency Band | bbf | double | Range: 30 MHz to 3000 MHz |
10 | Output | Jamming Success Probability | bbfp | double | Probability of jamming success expressed as a percentage; higher percentage indicates greater likelihood of successful jamming |
11 | Output | Jamming Success Status | biter | char | Success/Failure |
Interference Parameter | Level 1 | Level 2 | Level 3 | Level 4 | Level 5 | Level 6 | Level 7 | Level 8 | Level 9 | Level 10
---|---|---|---|---|---|---|---|---|---|---
SINR | 10 dB | 10 dB | 10 dB | 10 dB | 10 dB | 10 dB | 10 dB | 10 dB | 10 dB | 10 dB
Rbw | 10% | 20% | 30% | 40% | 50% | 60% | 70% | 80% | 90% | 100%
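The table reads as a fixed 10 dB SINR threshold combined with a jamming-bandwidth overlap ratio Rbw that sets the interference level in 10% steps. A minimal Python sketch of that mapping follows; treating the SINR test as a strict lower bound and clamping Rbw to levels 1–10 are assumptions, not rules stated by the paper.

```python
import math

def interference_level(sinr_db, rbw, sinr_threshold_db=10.0):
    """Map a link's SINR and the jammer's bandwidth-overlap ratio Rbw
    (0.0-1.0) onto interference levels 1-10; return 0 for no interference."""
    if sinr_db >= sinr_threshold_db or rbw <= 0.0:
        return 0                       # SINR above threshold: link survives
    # Rbw of 10% -> level 1, ..., 100% -> level 10.
    return min(10, max(1, math.ceil(rbw * 10.0)))

print(interference_level(8.5, 0.35))   # SINR below 10 dB, 35% overlap -> 4
```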
Device Name | Frequency to Be Selected (MHz) | Power and Antenna Gain ([rP/bP, rGt/bGt, rGr/bGr]) | Longitude and Latitude | Modulation Scheme | Coding Scheme | Altitude (m) | Frequency Priority | Antenna Type |
---|---|---|---|---|---|---|---|---|
Communication Agent 1 | 40, 500, 900, 2200, 2700 | [30, 0, 0] | [116.5372, 39.7563] | Modulation order 6 (orders 2, 4, 6, and 8 supported) | ’MPSK’ (supported: ’MPSK’, ’BPSK’, ’2PSK’, ’QPSK’, ’8FSK’, ’2FSK’) | 26 | No priority set | Omnidirectional antenna
Communication Agent 2 | [116.6348, 39.8921] | 23 | ||||||
Communication Agent 3 | [35, 2, 2] | [116.2436, 39.9314] | 22 | |||||
Communication Agent 4 | [40, 3, 3] | [116.4929, 39.8257] | 23 | |||||
Communication Agent 5 | [45, 5, 5] | [116.4733, 40.1075] | 78 | |||||
Communication Agent 6 | [50, 6, 6] | [116.1234, 39.6983] | 67 | |||||
Communication Agent 7 | [55, 8, 8] | [116.7456, 39.9112] | 97 | |||||
Communication Agent 8 | [60, 10, 10] | [116.5527, 40.0185] | 89 | |||||
Jamming Agent 1 | 30~300, 300~1000, 1000~3000 | [20, −3, −3] | [116.6958, 39.4557] | - | - | 123 | Omnidirectional antenna | |
Jamming Agent 2 | [25, 0, 0] | [116.1436, 40.1104] | - | - | 102 | |||
Jamming Agent 3 | [30, 1, 1] | [117.0221, 40.3129] | - | - | 67 | |||
Jamming Agent 4 | [35, 2, 2] | [116.9320, 40.0356] | - | - | 78 | |||
Jamming Agent 5 | [40, 3, 3] | [116.2034, 40.2039] | - | - | 78 | |||
Jamming Agent 6 | [45, 4, 4] | [115.9635, 39.8112] | - | - | 95 | |||
Jamming Agent 7 | [50, 5, 5] | [116.8013, 40.2974] | - | - | 96 | |||
Jamming Agent 8 | [55, 6, 6] | [116.7215, 39.4798] | - | - | 89 |
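For reproducing this scenario, each table row maps naturally onto a small record. The sketch below shows one plausible Python encoding; the `Agent` type and its field names are hypothetical conveniences, not structures from the paper.

```python
from dataclasses import dataclass

@dataclass
class Agent:
    """One scenario-table row; field names mirror Tables 2 and 4."""
    name: str
    freqs_mhz: list      # candidate frequency points or (low, high) bands
    power_w: float       # rP / bP
    gt_db: float         # rGt / bGt
    gr_db: float         # rGr / bGr
    lon: float
    lat: float
    alt_m: float

red_1 = Agent("Communication Agent 1", [40, 500, 900, 2200, 2700],
              30.0, 0.0, 0.0, 116.5372, 39.7563, 26.0)
blue_1 = Agent("Jamming Agent 1", [(30, 300), (300, 1000), (1000, 3000)],
               20.0, -3.0, -3.0, 116.6958, 39.4557, 123.0)
```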
Game Setting | Single Simulation Time (s) | Average Time for 100 Simulations (s)
---|---|---
Multi-player incomplete-information game | 6.28 | 3.06