Security State Estimation for Cyber-Physical Systems against DoS Attacks via Reinforcement Learning and Game Theory
Abstract
1. Introduction
2. Preliminaries
2.1. System Model
2.2. Two-Player Zero-Sum Game
2.3. Q-Learning Algorithm
3. Problem Statement
3.1. State Estimation Based on Kalman Filter
3.2. DoS Attack Model
3.3. Remote Estimation
4. Reinforcement Learning for Reliable Channel
Algorithm 1 Q-learning Algorithm for Reliable Channel
Input: The parameters of the system A, C; the steady-state error covariance; the costs; the learning rate, discount factor, and exploration rate.
Output: The optimal Q-value matrix and the Nash equilibrium strategies.
Initialize: Set the initial state, initialize the Q-value matrix with m for all s and all action pairs, set .
1: while do
2: if exploring (with probability equal to the exploration rate) then
3: Choose actions randomly;
4: else
5: Find the optimal actions by the linear programming method.
6: end if
7: Observe the reward by (15).
8: Observe the next state according to (14).
9: Update the Q-value matrix by (16).
10:
11:
12: end while
13: Return the Q-value matrix.
14: Observe the Nash equilibrium strategies.
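The loop in Algorithm 1 can be sketched in Python as a minimax-style Q-learning routine. This is only an illustration under stated assumptions, not the authors' implementation: Equations (14), (15), and (16) are not reproduced here, so `step` and the update line are hypothetical stand-ins (the update is the standard minimax-Q form), and the three states, `TRACE`, the costs, and the hyperparameters are invented toy values. Because each player has two actions in this toy setup, the linear program of step 5 is replaced by the closed-form solution of a 2x2 zero-sum matrix game.

```python
import random

# Hypothetical toy model: none of these numbers come from the paper.
TRACE = [1.0, 2.0, 4.0]            # assumed trace of the error covariance per state
C_SENSOR, C_ATTACK = 0.8, 0.5      # assumed cost of the protected channel / of attacking
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.2  # learning rate, discount factor, exploration rate


def solve_2x2(M):
    """Value and row-player (maximizer) strategy of a 2x2 zero-sum matrix game.

    Closed-form stand-in for the linear program of step 5; valid for 2x2 games only.
    """
    (a, b), (c, d) = M
    maximin = max(min(a, b), min(c, d))
    minimax = min(max(a, c), max(b, d))
    if maximin == minimax:  # pure saddle point
        row = 0 if min(a, b) >= min(c, d) else 1
        return maximin, [1.0 - row, float(row)]
    denom = (a - b) + (d - c)  # nonzero whenever no saddle point exists
    p = (d - c) / denom        # probability of playing row 0
    return p * a + (1 - p) * c, [p, 1 - p]


def attacker_strategy(M):
    """Mixed strategy of the column player, who minimizes M."""
    neg_t = [[-M[0][0], -M[1][0]], [-M[0][1], -M[1][1]]]
    return solve_2x2(neg_t)[1]


def step(s, a, d):
    """Assumed stand-in for (14) and (15): a=1 is the protected channel, d=1 an attack."""
    lost = d == 1 and a == 0
    s_next = min(s + 1, len(TRACE) - 1) if lost else 0
    reward = -TRACE[s_next] - C_SENSOR * a + C_ATTACK * d  # payoff to the sensor
    return reward, s_next


def train(iterations=20000, seed=0):
    rng = random.Random(seed)
    Q = [[[0.0, 0.0], [0.0, 0.0]] for _ in TRACE]  # Q[s][a][d]
    s = 0
    for _ in range(iterations):
        if rng.random() < EPS:  # explore: both players act at random
            a, d = rng.randrange(2), rng.randrange(2)
        else:                   # exploit: sample from the equilibrium strategies
            pi_s = solve_2x2(Q[s])[1]
            pi_a = attacker_strategy(Q[s])
            a = 0 if rng.random() < pi_s[0] else 1
            d = 0 if rng.random() < pi_a[0] else 1
        reward, s_next = step(s, a, d)
        v_next = solve_2x2(Q[s_next])[0]  # game value of the next state
        Q[s][a][d] += ALPHA * (reward + GAMMA * v_next - Q[s][a][d])
        s = s_next
    return Q
```

Calling `train()` returns a nested list indexed as `Q[s][a][d]`; applying `solve_2x2` and `attacker_strategy` to each `Q[s]` then gives the per-state equilibrium strategies of the toy game. With more than two actions per player, a general linear-programming solver would be needed in place of `solve_2x2`.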
5. Reinforcement Learning for Unreliable Channel
Algorithm 2 Q-learning Algorithm for Unreliable Channel
Input: The parameters of the system A, C; the steady-state error covariance; the costs; the packet loss probability for each action combination; the learning rate, discount factor, and exploration rate.
Output: The optimal Q-value matrix and the Nash equilibrium strategies.
Initialize: Set the initial state, initialize the Q-value matrix with m for all s and all action pairs, set .
1: while do
2: if exploring (with probability equal to the exploration rate) then
3: Choose actions randomly;
4: else
5: Find the optimal actions by the linear programming method.
6: end if
7: According to the actions of the sensors and attackers, the packet loss probability is obtained by (17).
8: Observe the reward by (19).
9: Observe the next state according to (18).
10: Update the Q-value matrix by (16).
11:
12:
13: end while
14: Return the Q-value matrix.
15: Observe the Nash equilibrium strategies.
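What distinguishes Algorithm 2 from Algorithm 1 is step 7: the action pair no longer decides the packet's fate outright but selects a loss probability. A minimal Python sketch of steps 7 to 9 follows. It is an illustration only: Equations (17), (18), and (19) are not reproduced here, so the table `LOSS_PROB`, the state model, `TRACE`, and the costs are all hypothetical values chosen for the example.

```python
import random

# Hypothetical values, chosen only for illustration.
LOSS_PROB = [[0.2, 0.8],          # indexed [sensor action a][attacker action d]
             [0.05, 0.3]]
TRACE = [1.0, 2.0, 4.0]           # assumed trace of the error covariance per state
C_SENSOR, C_ATTACK = 0.8, 0.5     # assumed action costs


def step_unreliable(s, a, d, rng):
    """One environment step on an unreliable channel (stand-in for steps 7 to 9)."""
    loss_p = LOSS_PROB[a][d]                   # step 7: look up the loss probability
    lost = rng.random() < loss_p               # sample whether the packet is dropped
    s_next = min(s + 1, len(TRACE) - 1) if lost else 0
    reward = -TRACE[s_next] - C_SENSOR * a + C_ATTACK * d  # payoff to the sensor
    return reward, s_next


def expected_reward(s, a, d):
    """Mean of the sampled reward over the packet-loss outcome."""
    p = LOSS_PROB[a][d]
    s_lost = min(s + 1, len(TRACE) - 1)
    return -(p * TRACE[s_lost] + (1 - p) * TRACE[0]) - C_SENSOR * a + C_ATTACK * d
```

Passing a seeded `random.Random` instance as `rng` makes runs repeatable. The rest of the learning loop is unchanged from the reliable-channel case, with this stochastic step in place of the deterministic one.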
6. Simulations and Experiments
6.1. Case 1: Simulation Example for Reliable Channel
6.2. Case 2: Simulation Example for Unreliable Channel
7. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
State-Action | | | | |
---|---|---|---|---|
| 7.500 | 2.100 | 14.300 | 6.300 |
| 7.500 | 10.190 | 14.300 | 6.300 |
| 7.538 | −0.499 | 14.301 | 7.237 |
State-Action | | | | |
---|---|---|---|---|
| 7.565 | 2.163 | 13.344 | 5.859 |
| 7.605 | 9.620 | 13.352 | 6.659 |
| 7.682 | 12.820 | 13.364 | 7.819 |
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Jin, Z.; Zhang, S.; Hu, Y.; Zhang, Y.; Sun, C. Security State Estimation for Cyber-Physical Systems against DoS Attacks via Reinforcement Learning and Game Theory. Actuators 2022, 11, 192. https://doi.org/10.3390/act11070192