An increasing number of websites use HTTPS encryption to enhance security and privacy for their users. However, encryption makes it very difficult to identify the service carried by an HTTPS flow, which poses challenges for network security management. In this paper we present DTA-HOC, a novel DNS-based, two-level association method for online identification of services in HTTPS traffic on large-scale networks. DTA-HOC correlates HTTPS flows with DNS flows using big data stream processing and association technologies, and labels the service in each HTTPS flow with a specific associated domain name. It is designed to address three practical challenges in the service identification process: domain name ambiguity, domain name query invisibility, and conflicting requirements on the size of the data association time window. Experiments on datasets collected from a 10-Gbps campus network, covering both offline and online testing, show that DTA-HOC achieves an average online association rate of 83% on HTTPS traffic and a generic accuracy of 86.16%, and that it processes one minute of data in less than 20 seconds. These results indicate that DTA-HOC is an efficient method for online identification of services in HTTPS flows on large-scale networks. Moreover, the proposed method can contribute to the identification of other applications that perform a Domain Name System (DNS) lookup before establishing a connection.
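To make the idea of associating HTTPS flows with DNS flows concrete, the following is a minimal Python sketch, not the DTA-HOC implementation. The abstract does not specify what the two association levels are, so the sketch assumes one plausible reading: first match on the (client IP, server IP) pair, then fall back to the server IP alone when the client's own query was not observed. The record types, the function names, the 60-second window, and the example addresses are all hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class DnsRecord(NamedTuple):
    ts: float        # time the DNS response was observed (seconds)
    client_ip: str   # host that issued the query
    domain: str      # queried domain name
    server_ip: str   # address returned in the answer


class HttpsFlow(NamedTuple):
    ts: float        # flow start time (seconds)
    client_ip: str
    server_ip: str


def build_indexes(dns_records):
    """Index DNS answers by (client, server) pair and by server alone."""
    by_pair = defaultdict(list)
    by_server = defaultdict(list)
    for rec in dns_records:
        by_pair[(rec.client_ip, rec.server_ip)].append(rec)
        by_server[rec.server_ip].append(rec)
    return by_pair, by_server


def latest_before(records, ts, window):
    """Return the domain of the most recent answer within `window` seconds before `ts`."""
    candidates = [r for r in records if 0 <= ts - r.ts <= window]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r.ts).domain


def label_flow(flow, by_pair, by_server, window=60.0):
    """Label an HTTPS flow with a domain name and the level that matched.

    Level 1 (assumed): an answer seen by the same client for the same server.
    Level 2 (assumed): any client's answer for the same server, used when the
    client's own query is invisible (for example, served from a local cache).
    """
    pair_key = (flow.client_ip, flow.server_ip)
    domain = latest_before(by_pair.get(pair_key, []), flow.ts, window)
    if domain is not None:
        return domain, "client+server"
    domain = latest_before(by_server.get(flow.server_ip, []), flow.ts, window)
    if domain is not None:
        return domain, "server-only"
    return None, "unassociated"


dns_log = [
    DnsRecord(0.0, "10.0.0.1", "example.org", "93.184.216.34"),
    DnsRecord(10.0, "10.0.0.2", "cdn.example.net", "198.51.100.7"),
]
by_pair, by_server = build_indexes(dns_log)
print(label_flow(HttpsFlow(5.0, "10.0.0.1", "93.184.216.34"), by_pair, by_server))
```

The window parameter illustrates the trade-off named in the abstract: a short window misses flows whose DNS answer was cached earlier, while a long window raises memory cost and the chance of matching a stale or ambiguous name. Picking the most recent answer is only one simple way to handle several domains sharing one address.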
This work was partially funded by the National Natural Science Foundation of China (No. 61802270), National Entrepreneurship & Innovation Demonstration Base of China (No. C700011), Key Research & Development Project of Sichuan Province of China (No. 2018GZ0100), and Fundamental Research Business Fee Basic Research Project of Central Universities (No. 2017SCU11065).
The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).