Volume 4, Issue 2

Q-learning based strategy analysis of cyber-physical systems considering unequal cost

Xin Chen¹, Jixiang Cheng¹, Luanjuan Jiang², Qianmu Li², Ting Wang¹, Dafang Li¹
¹ School of Management Science and Engineering, Nanjing University of Finance and Economics, Nanjing 210023, China
² School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210014, China

Abstract

This paper proposes a Q-learning-based cyber security strategy for cyber-physical systems (CPS) under unequal cost, with the aim of obtaining a more efficient and lower-cost defense strategy in the presence of misclassification interference. The system losses caused by different strategy selection errors in CPS cyber security are usually assumed to be equal. In practice, however, the cost of an error depends on the severity of the consequences of the misclassification, so different errors may incur different costs. Such unequal costs, in which different strategy selection errors lead to different levels of system loss, can significantly affect the overall performance of the strategy selection process. By introducing a weight parameter that adjusts the unequal cost associated with different types of misclassification errors, a modified Q-learning algorithm is proposed whose objective is shifted towards minimizing the overall cost, yielding a defense strategy that minimizes system loss in CPS with misclassification interference. Finally, simulations compare the proposed approach with the standard Q-learning-based cyber security strategy method, which assumes equal costs for all types of misclassification errors. The results demonstrate the effectiveness and feasibility of the proposed approach.

Keywords: Q-learning, cyber security, policy selection, unequal cost, misclassification interference
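
The idea of weighting misclassification costs inside Q-learning can be illustrated with a small toy example. The sketch below is not the algorithm or the simulation setup of the paper: the two-attack, two-defense scenario, the detector error rate `FLIP_PROB`, the cost matrices, and all function names are assumptions chosen only for illustration. It runs ordinary tabular Q-learning twice, once with equal error costs and once with a cost matrix in which under-defending against the severe attack is weighted ten times more heavily.

```python
import random

# Hypothetical toy setting (not taken from the paper):
# two attack types (0 = mild, 1 = severe) and two defenses (0 = light, 1 = heavy).
N_OBS, N_ACT = 2, 2
FLIP_PROB = 0.3  # assumed probability that the detector mislabels the attack


def train(cost, steps=50000, alpha=0.01, gamma=0.9, seed=0):
    """Tabular Q-learning where the reward is the negative weighted cost.

    cost[true_type][action] is the loss of choosing `action` against an
    attack of `true_type`; the agent only sees the noisy detector label.
    """
    rng = random.Random(seed)
    q = [[0.0] * N_ACT for _ in range(N_OBS)]

    def draw():
        # Sample a true attack type and a possibly misclassified observation.
        true_type = rng.randrange(2)
        flipped = rng.random() < FLIP_PROB
        obs = 1 - true_type if flipped else true_type
        return true_type, obs

    true_type, obs = draw()
    for _ in range(steps):
        # Uniform random exploration; Q-learning is off-policy, so the
        # greedy policy can still be read from the learned table.
        action = rng.randrange(N_ACT)
        reward = -cost[true_type][action]
        next_true, next_obs = draw()
        target = reward + gamma * max(q[next_obs])
        q[obs][action] += alpha * (target - q[obs][action])
        true_type, obs = next_true, next_obs
    return q


def greedy(q):
    """Return the greedy action for each observed label."""
    return [max(range(N_ACT), key=lambda a: row[a]) for row in q]


# Equal cost: every strategy selection error costs 1.
EQUAL_COST = [[0.0, 1.0], [1.0, 0.0]]
# Unequal cost (assumed weights): a light defense against a severe attack costs 10.
UNEQUAL_COST = [[0.0, 1.0], [10.0, 0.0]]

policy_equal = greedy(train(EQUAL_COST))
policy_weighted = greedy(train(UNEQUAL_COST))
print("equal-cost policy:  ", policy_equal)
print("unequal-cost policy:", policy_weighted)
```

Under these assumed numbers, the equal-cost learner simply trusts the detector label, whereas the cost-weighted learner prefers the heavy defense even when the detector reports a mild attack, because a 30% chance of a cost-10 error outweighs a 70% chance of a cost-1 error. This is only meant to show how unequal costs can change the learned strategy under misclassification interference.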


Publication history

Received: 31 March 2023
Revised: 30 April 2023
Accepted: 30 May 2023
Published: 30 June 2023
Issue date: June 2023

Copyright

© All articles included in the journal are copyrighted to the ITU and TUP.

Rights and permissions

This work is available under the CC BY-NC-ND 3.0 IGO license: https://creativecommons.org/licenses/by-nc-nd/3.0/igo/
