Applying Big Data Based Deep Learning System to Intrusion Detection

Wei Zhong; Ning Yu; Chunyu Ai

doi:10.26599/BDMA.2020.9020003

Open Access

∙ Division of Math and Computer Science, University of South Carolina Upstate, Spartanburg, SC 29303, USA.

∙ Department of Computing Sciences, State University of New York College at Brockport, Brockport, NY 14420, USA.

Abstract

With vast amounts of data generated daily and the ever-increasing interconnectivity of the world’s internet infrastructure, machine learning based Intrusion Detection Systems (IDSs) have become a vital component in protecting our economic and national security. Previous shallow learning and deep learning strategies adopt a single learning model for intrusion detection. A single model may struggle to capture the increasingly complicated data distribution of intrusion patterns. In particular, a single deep learning model may fail to capture the unique patterns of intrusive attacks that have only a small number of samples. To further enhance the performance of machine learning based IDSs, we propose the Big Data based Hierarchical Deep Learning System (BDHDLS). BDHDLS uses behavioral features and content features to understand both network traffic characteristics and the information stored in the payload. Each deep learning model in BDHDLS concentrates on learning the unique data distribution of one cluster. This strategy can increase the detection rate of intrusive attacks compared with previous single learning model approaches. Based on a parallel training strategy and big data techniques, the model construction time of BDHDLS is reduced substantially when multiple machines are deployed.
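The idea of assigning one model to each cluster can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single level of k-means clustering instead of the multi-level clustering named in the keywords, and it uses a simple nearest-class-mean classifier as a placeholder where BDHDLS would train a deep learning model. All names (`ClusterRoutedDetector`, `ClassMeanModel`, `farthest_first`) and the toy data are hypothetical.

```python
from collections import defaultdict


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def nearest(centroids, p):
    """Index of the centroid closest to p."""
    return min(range(len(centroids)), key=lambda i: sq_dist(centroids[i], p))


def farthest_first(points, k):
    """Deterministic seeding: repeatedly pick the point farthest from the chosen centroids."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centroids)))
    return centroids


def kmeans(points, k, iters=20):
    """Plain k-means (assumption: one clustering level only)."""
    centroids = farthest_first(points, k)
    for _ in range(iters):
        groups = defaultdict(list)
        for p in points:
            groups[nearest(centroids, p)].append(p)
        # Keep the old centroid if a cluster received no points.
        centroids = [mean(groups[i]) if groups[i] else centroids[i] for i in range(k)]
    return centroids


class ClassMeanModel:
    """Placeholder for a per-cluster deep model: predicts the label with the closest class mean."""

    def fit(self, X, y):
        by_label = defaultdict(list)
        for x, label in zip(X, y):
            by_label[label].append(x)
        self.means = {label: mean(xs) for label, xs in by_label.items()}
        return self

    def predict(self, x):
        return min(self.means, key=lambda label: sq_dist(self.means[label], x))


class ClusterRoutedDetector:
    """Cluster the training data, fit one model per cluster, route each query to its cluster's model."""

    def __init__(self, k):
        self.k = k

    def fit(self, X, y):
        self.centroids = kmeans(X, self.k)
        buckets = defaultdict(lambda: ([], []))
        for x, label in zip(X, y):
            xs, ys = buckets[nearest(self.centroids, x)]
            xs.append(x)
            ys.append(label)
        # Each per-cluster fit is independent of the others.
        self.models = {c: ClassMeanModel().fit(xs, ys) for c, (xs, ys) in buckets.items()}
        return self

    def predict(self, x):
        # Route only among clusters that actually have a trained model.
        c = min(self.models, key=lambda i: sq_dist(self.centroids[i], x))
        return self.models[c].predict(x)


# Toy data: two traffic regions whose attacks differ along different features.
X = [[0, 0], [0, 1], [1, 0], [1, 1], [10, 10], [10, 11], [11, 10], [11, 11]]
y = ["normal", "normal", "attack", "attack", "normal", "attack", "normal", "attack"]

single = ClassMeanModel().fit(X, y)            # one global model
routed = ClusterRoutedDetector(k=2).fit(X, y)  # one model per cluster

print(single.predict([0.9, 0.2]))    # normal (the global model misses this attack)
print(routed.predict([0.9, 0.2]))    # attack
print(routed.predict([10.2, 10.9]))  # attack
```

In this toy setting the single global model blurs the two regions together, while each per-cluster model only has to separate the classes inside its own region. Because the per-cluster fits do not depend on one another, they could in principle be distributed across machines, which is consistent with the parallel training the abstract describes.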

Keywords

deep learning; intrusion detection; convolution neural network; fully connected feedforward neural network; multi-level clustering algorithm

Big Data Mining and Analytics

Volume 3, Issue 3, September 2020

Pages 181-195

DOI: 10.26599/BDMA.2020.9020003

Cite this article:

Zhong W, Yu N, Ai C. Applying Big Data Based Deep Learning System to Intrusion Detection. Big Data Mining and Analytics, 2020, 3(3): 181-195. https://doi.org/10.26599/BDMA.2020.9020003

Received: 08 March 2020

Revised: 27 March 2020

Accepted: 30 March 2020

Published: 16 July 2020

The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).