Intelligent and Converged Networks 2023, 4(2): 116-126 https://doi.org/10.23919/ICN.2023.0012

Open Access | Issue | Published: 30 June 2023

Q-learning based strategy analysis of cyber-physical systems considering unequal cost

Xin Chen¹, Jixiang Cheng¹, Luanjuan Jiang², Qianmu Li², Ting Wang¹, Dafang Li¹

¹ School of Management Science and Engineering, Nanjing University of Finance and Economics, Nanjing 210023, China

² School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210014, China

Keywords: Q-learning, cyber security, policy selection, unequal cost, misclassification interference

Cite this article:

Chen X, Cheng J, Jiang L, et al. Q-learning based strategy analysis of cyber-physical systems considering unequal cost. Intelligent and Converged Networks, 2023, 4(2): 116-126. https://doi.org/10.23919/ICN.2023.0012


Abstract

This paper proposes a Q-learning-based cyber security strategy for cyber-physical systems (CPS) under unequal cost, aiming to obtain an efficient, low-cost defense strategy in the presence of misclassification interference. The system loss caused by strategy selection errors in CPS cyber security is often assumed to be the same for every type of error. In practice, however, different strategy selection errors can carry different costs, because the consequences of misclassification differ in severity. Unequal costs, meaning that different strategy selection errors lead to different levels of system loss, can therefore significantly affect the overall performance of the strategy selection process. By introducing a weight parameter that adjusts the unequal cost associated with different types of misclassification errors, a modified Q-learning algorithm is proposed to develop a defense strategy that minimizes system loss in CPS with misclassification interference; the objective of the algorithm is thereby shifted towards minimizing the overall cost. Finally, simulations compare the proposed approach with the standard Q-learning-based cyber security strategy method, which assumes equal costs for all types of misclassification errors. The results demonstrate the effectiveness and feasibility of the proposed approach.
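
The idea of weighting misclassification errors unequally can be illustrated with a minimal sketch. It is not the algorithm from the paper, whose exact formulation is not reproduced on this page; it is a toy tabular Q-learning defender built on stated assumptions. The two attack types, the two defenses, the cost matrix, the 70% observation accuracy, and the names `train_defender`, `greedy_policy`, and `weight` are all hypothetical. The defender only sees a noisy label of the attack type (the misclassification interference), and the reward is the negative cost, so maximizing return means minimizing overall cost.

```python
import random


def train_defender(weight, steps=200_000, alpha=0.01, gamma=0.5,
                   epsilon=0.1, accuracy=0.7, seed=0):
    """Tabular Q-learning on a toy two-attack, two-defense problem.

    State  : the observed (possibly misclassified) attack type,
             0 = mild, 1 = severe.
    Action : defense 0 (light) or defense 1 (heavy).
    weight : assumed cost ratio; missing a severe attack costs `weight`,
             over-reacting to a mild attack costs 1.
    """
    rng = random.Random(seed)
    # cost[true_type][action]; an illustrative matrix, not taken from the paper.
    cost = [[0.0, 1.0],      # true type mild: heavy defense is a false alarm
            [weight, 0.0]]   # true type severe: light defense is a miss
    q = [[0.0, 0.0], [0.0, 0.0]]

    def observe():
        # Attack types arrive independently; the label is wrong with
        # probability 1 - accuracy.
        true_type = rng.randrange(2)
        seen = true_type if rng.random() < accuracy else 1 - true_type
        return true_type, seen

    true_type, state = observe()
    for _ in range(steps):
        # Epsilon-greedy action selection on the observed state.
        if rng.random() < epsilon:
            action = rng.randrange(2)
        else:
            action = 0 if q[state][0] >= q[state][1] else 1
        # Reward is the negative cost of the chosen defense.
        reward = -cost[true_type][action]
        next_true, next_state = observe()
        # Standard Q-learning temporal-difference update.
        target = reward + gamma * max(q[next_state])
        q[state][action] += alpha * (target - q[state][action])
        true_type, state = next_true, next_state
    return q


def greedy_policy(q):
    """Return the greedy defense for each observed attack type."""
    return [0 if row[0] >= row[1] else 1 for row in q]


if __name__ == "__main__":
    print("equal cost  :", greedy_policy(train_defender(weight=1.0)))
    print("unequal cost:", greedy_policy(train_defender(weight=5.0)))
```

In this toy setting, an attack observed as mild is actually severe with probability 0.3. With `weight=1.0` the light defense has expected cost 0.3 against 0.7 for the heavy one, so the learned policy should simply follow the observed label. With `weight=5.0` the light defense has expected cost 1.5 against 0.7, so the learned policy should choose the heavy defense even when the attack is observed as mild.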



About this article

Publication history

Received: 31 March 2023

Revised: 30 April 2023

Accepted: 30 May 2023

Published: 30 June 2023

Issue date: June 2023

Rights and permissions

This work is available under the CC BY-NC-ND 3.0 IGO license: https://creativecommons.org/licenses/by-nc-nd/3.0/igo/