A Hybrid Unsupervised Clustering-Based Anomaly Detection Method

Guo Pu; Lijuan Wang; Jun Shen; Fang Dong

doi:10.26599/TST.2019.9010051

Tsinghua Science and Technology 2021, 26(2): 146-153 https://doi.org/10.26599/TST.2019.9010051

Open Access | Issue | Published: 24 July 2020

School of Cyber Engineering, Xidian University, Xi’an 710126, China.

School of Computing and Information Technology, University of Wollongong, Wollongong, NSW 2522, Australia.

School of Computer Science and Engineering, Southeast University, Nanjing 211189, China.

Keywords: unsupervised learning, clustering, intrusion detection, feature selection

Cite this article:

Pu G, Wang L, Shen J, et al. A Hybrid Unsupervised Clustering-Based Anomaly Detection Method. Tsinghua Science and Technology, 2021, 26(2): 146-153. https://doi.org/10.26599/TST.2019.9010051

Abstract

In recent years, machine learning-based cyber intrusion detection methods have gained increasing popularity. The number and complexity of new attacks continue to rise; therefore, effective and intelligent solutions are necessary. Unsupervised machine learning techniques are particularly appealing to intrusion detection systems since they can detect known and unknown types of attacks as well as zero-day attacks. In the current paper, we present an unsupervised anomaly detection method, which combines Sub-Space Clustering (SSC) and One Class Support Vector Machine (OCSVM) to detect attacks without any prior knowledge. The proposed approach is evaluated using the well-known NSL-KDD dataset. The experimental results demonstrate that our method performs better than some of the existing techniques.
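The abstract does not spell out how the two components are wired together, so the following is only a minimal sketch of one plausible way to pair feature subspaces with a one-class SVM. It is not the authors' pipeline. The synthetic data, the fixed partition of features into pairs, the `nu` value, the vote threshold, and the helper name `subspace_ocsvm_votes` are all assumptions made for illustration. It relies on scikit-learn's `OneClassSVM`, whose `predict` method returns -1 for samples it treats as outliers.

```python
import numpy as np
from sklearn.svm import OneClassSVM


def subspace_ocsvm_votes(X, subspaces, nu=0.05):
    """Count, per sample, how many feature subspaces flag it as an outlier.

    Hypothetical helper: the subspace list and nu are illustrative choices,
    not values taken from the paper.
    """
    votes = np.zeros(X.shape[0], dtype=int)
    for cols in subspaces:
        sub = X[:, cols]  # project the data onto one feature subspace
        model = OneClassSVM(kernel="rbf", gamma="scale", nu=nu).fit(sub)
        votes += (model.predict(sub) == -1)  # predict() gives -1 for outliers
    return votes


# Synthetic, unlabeled traffic: 200 "normal" rows plus 10 rows shifted
# in the first two features to mimic an attack (assumed data, not NSL-KDD).
rng = np.random.default_rng(0)
normal = rng.normal(0.0, 1.0, size=(200, 6))
attacks = rng.normal(0.0, 1.0, size=(10, 6))
attacks[:, :2] += 6.0
X = np.vstack([normal, attacks])

subspaces = [[0, 1], [2, 3], [4, 5]]  # assumed fixed partition of the features
votes = subspace_ocsvm_votes(X, subspaces)
flagged = votes >= 1  # assumed rule: flag if any subspace calls it an outlier
print(flagged.sum(), "of", len(X), "samples flagged")
```

No labels are used at any point, which is the property the abstract emphasizes. In practice the vote threshold and `nu` trade detection rate against false alarms and would need tuning on the target dataset.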

About this article

Publication history

Received: 02 July 2019

Revised: 02 September 2019

Accepted: 09 September 2019

Published: 24 July 2020

Issue date: April 2021

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China (Nos. 61702398 and 61872079), the China 111 Project (No. B16037), and the University Global Partnership Network (UGPN) Project of the University of Wollongong 2018-2019.

Rights and permissions

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).