Scholar - SciOpen

Open Access Issue

IDEA: A Utility-Enhanced Approach to Incomplete Data Stream Anonymization

Lu Yang, Xingshu Chen, Yonggang Luo, Xiao Lan, Wei Wang

Tsinghua Science and Technology 2022, 27 (1): 127-140

Published: 17 August 2021

Abstract

The prevalence of missing values in the data streams collected in real environments makes them impossible to ignore in the privacy preservation of data streams. However, most privacy preservation methods are developed without considering missing values. A few studies allow missing values to participate in data anonymization but introduce considerable extra information loss. To balance the utility and privacy preservation of incomplete data streams, we present a utility-enhanced approach for Incomplete Data strEam Anonymization (IDEA). In this approach, a sliding-window-based processing framework is introduced to anonymize data streams continuously, in which each tuple can be output with clustering or anonymized clusters. We consider both the attribute and tuple dimensions in the similarity measurement, which enables clustering between incomplete and complete records and generates the cluster with minimal information loss. To avoid missing-value pollution, we propose a generalization method based on maybe match for generalizing incomplete data. Experiments conducted on real datasets show that the proposed approach can efficiently anonymize incomplete data streams while effectively preserving utility.
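As a rough illustration of why missing values need special handling during generalization, the sketch below generalizes a cluster of numeric tuples to per-attribute ranges computed from known values only, and treats a missing value as one that may match any range. This is not the IDEA algorithm from the paper: the function names `generalize_cluster` and `maybe_matches`, the use of `None` for a missing value, and the range-based generalization are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence, Tuple

Record = Sequence[Optional[float]]          # None marks a missing value (assumption)
Range = Optional[Tuple[float, float]]       # None means "no known value in the cluster"


def generalize_cluster(cluster: List[Record]) -> List[Range]:
    """Generalize each attribute to a (low, high) range over its known values.

    Missing values are skipped, so they do not widen the range.
    """
    n_attrs = len(cluster[0])
    ranges: List[Range] = []
    for j in range(n_attrs):
        known = [rec[j] for rec in cluster if rec[j] is not None]
        ranges.append((min(known), max(known)) if known else None)
    return ranges


def maybe_matches(record: Record, ranges: List[Range]) -> bool:
    """Return True if the record is compatible with the generalized ranges.

    A missing value, or an attribute with no known range, is treated as a
    possible match (hypothetical reading of "maybe match").
    """
    for value, rng in zip(record, ranges):
        if value is None or rng is None:
            continue  # unknown on either side: cannot rule the match out
        low, high = rng
        if not (low <= value <= high):
            return False
    return True


cluster = [(25, 50000), (31, None), (None, 62000)]
ranges = generalize_cluster(cluster)
```

With the sample cluster above, the incomplete records contribute only their known attributes, so neither range is stretched by a placeholder for the missing entries.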

Open Access Issue

DGA-Based Botnet Detection Toward Imbalanced Multiclass Learning

Yijing Chen, Bo Pang, Guolin Shao, Guozhu Wen, Xingshu Chen

Tsinghua Science and Technology 2021, 26 (4): 387-402

Published: 04 January 2021

Abstract

Botnets based on the Domain Generation Algorithm (DGA) mechanism pose great challenges to the main current detection methods because of their strong concealment and robustness. However, the complexity of the DGA family and the imbalance of samples continue to impede research on DGA detection. In the existing work, the sample size of each DGA family is regarded as the most important determinant of the resampling proportion; thus, differences in the characteristics of various samples are ignored, and the optimal resampling effect is not achieved. In this paper, a Long Short-Term Memory-based Property and Quantity Dependent Optimization (LSTM.PQDO) method is proposed. This method takes advantage of LSTM to automatically mine the comprehensive features of DGA domain names. It iterates the resampling proportion with the optimal solution based on a comprehensive consideration of the original number and characteristics of the samples to heuristically search for a better solution around the initial solution in the right direction; thus, dynamic optimization of the resampling proportion is realized. The experimental results show that the LSTM.PQDO method can achieve better performance compared with existing models to overcome the difficulties of unbalanced datasets; moreover, it can function as a reference for sample resampling tasks in similar scenarios.

Open Access Issue

DTA-HOC: Online HTTPS Traffic Service Identification Using DNS in Large-Scale Networks

Xuemei Zeng, Xingshu Chen, Guolin Shao, Tao He, Lina Wang

Tsinghua Science and Technology 2020, 25 (2): 239-254

Published: 02 September 2019

Abstract

An increasing number of websites are making use of HTTPS encryption to enhance security and privacy for their users. However, HTTPS encryption makes it very difficult to identify the service over HTTPS flows, which poses challenges to network security management. In this paper, we present DTA-HOC, a novel DNS-based two-level association HTTPS traffic online service identification method for large-scale networks, which correlates HTTPS flows with DNS flows using big data stream processing and association technologies to label the service in an HTTPS flow with a specific associated domain name. DTA-HOC has been specifically designed to address three practical challenges in the service identification process: domain name ambiguity, domain name query invisibility, and data association time window size contradictions. Several experiments on datasets collected from a 10-Gbps campus network are conducted alongside offline and online testing. Results show that DTA-HOC can achieve an average online association rate on HTTPS traffic of 83% and a generic accuracy of 86.16%. Its processing time for one minute of data is less than 20 seconds. These results indicate that DTA-HOC is an efficient method for online identification of services in HTTPS flows for large-scale networks. Moreover, our proposed method can contribute to the identification of other applications that perform a Domain Name System (DNS) lookup before establishing a connection.
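The basic idea of labelling an HTTPS flow with a domain name taken from an earlier DNS response can be sketched as follows. This is a minimal single-level lookup under stated assumptions, not the two-level association of DTA-HOC, and it does not address domain name ambiguity or query invisibility. The class `DnsAssociator`, its method names, and the fixed `window_seconds` parameter are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, Tuple


class DnsAssociator:
    """Label flows with the most recent domain a client resolved to the server IP."""

    def __init__(self, window_seconds: float = 300.0) -> None:
        self.window = window_seconds
        # (client_ip, server_ip) -> list of (timestamp, domain) observations
        self._seen: Dict[Tuple[str, str], List[Tuple[float, str]]] = defaultdict(list)

    def observe_dns(self, ts: float, client_ip: str, domain: str,
                    resolved_ips: Iterable[str]) -> None:
        """Record a DNS response seen at time ts for the given client."""
        for ip in resolved_ips:
            self._seen[(client_ip, ip)].append((ts, domain))

    def label_flow(self, ts: float, client_ip: str, server_ip: str) -> Optional[str]:
        """Return the domain for a flow starting at ts, or None if no DNS
        response for this client and server falls inside the time window."""
        candidates = [
            (t, d) for t, d in self._seen.get((client_ip, server_ip), [])
            if 0 <= ts - t <= self.window
        ]
        if not candidates:
            return None
        # Prefer the most recent resolution before the flow started.
        return max(candidates, key=lambda c: c[0])[1]


assoc = DnsAssociator(window_seconds=60.0)
assoc.observe_dns(100.0, "10.0.0.5", "example.org", ["93.184.216.34"])
label = assoc.label_flow(130.0, "10.0.0.5", "93.184.216.34")
```

The window size is the main trade-off in this sketch: a short window misses flows that reuse a cached resolution, while a long window raises the chance of attaching a stale domain to a reused server address.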

Open Access Issue

Cloud Virtual Machine Lifecycle Security Framework Based on Trusted Computing

Xin Jin, Qixu Wang, Xiang Li, Xingshu Chen, Wei Wang

Tsinghua Science and Technology 2019, 24 (5): 520-534

Published: 29 April 2019

Abstract

As a foundational component of cloud computing platforms, Virtual Machines (VMs) are confronted with numerous security threats. However, existing solutions tend to focus on threats in a specific state of the VM. In this paper, we propose a novel VM lifecycle security protection framework based on trusted computing to address the security threats to VMs throughout their entire lifecycle. Specifically, the VM lifecycle is defined and divided according to the different active conditions of the VM. Then, a trusted-computing-based security protection framework is developed, which can extend the trusted relationship from the Trusted Platform Module to the VM and protect the security and reliability of the VM throughout its lifecycle. The theoretical analysis shows that our proposed framework can provide comprehensive protection to the VM in all of its states. Furthermore, experimental results demonstrate that the proposed framework is feasible and achieves a higher level of security compared with some state-of-the-art schemes.

Open Access Issue

Efficient Feature Extraction Using Apache Spark for Network Behavior Anomaly Detection

Xiaoming Ye, Xingshu Chen, Dunhu Liu, Wenxian Wang, Li Yang, Gang Liang, Guolin Shao

Tsinghua Science and Technology 2018, 23 (5): 561-573

Published: 17 September 2018

Abstract

Extracting and analyzing network traffic features is fundamental in the design and implementation of network behavior anomaly detection methods. The traditional network traffic feature method focuses on the statistical features of traffic volume. However, this approach is not sufficient to reflect the communication pattern features. A different approach is required to detect anomalous behaviors that do not exhibit traffic volume changes, such as low-intensity anomalous behaviors caused by Denial of Service/Distributed Denial of Service (DoS/DDoS) attacks, Internet worms and scanning, and botnets. We propose an efficient traffic feature extraction architecture that combines the benefits of traffic volume features and network communication pattern features. This method can detect low-intensity anomalous network behaviors and conventional traffic volume anomalies. We implemented our approach on Spark Streaming and validated our feature set using a labelled real-world dataset collected from the Sichuan University campus network. Our results demonstrate that the traffic feature extraction approach is efficient in detecting both traffic variations and communication structure changes. Based on our evaluation of the MIT-DARPA dataset, the same detection approach utilizes traffic volume features with detection precision of 82.3% and communication pattern features with detection precision of 89.9%. Our proposed feature set improves precision by 94%.
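To make the distinction between traffic volume features and communication pattern features concrete, the sketch below computes both kinds per source host from a batch of flow records. It uses plain Python rather than Spark Streaming, and the `Flow` record layout and the four feature names are assumptions chosen for illustration, not the feature set defined in the paper.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple


class Flow(NamedTuple):
    """One flow record in the current time window (hypothetical layout)."""
    src: str
    dst: str
    dst_port: int
    n_bytes: int


def extract_features(flows: Iterable[Flow]) -> Dict[str, Dict[str, int]]:
    """Aggregate per-source features over one window of flows.

    Volume features: total bytes and flow count.
    Pattern features: number of distinct destinations and destination ports.
    """
    acc = defaultdict(lambda: {"bytes": 0, "flows": 0, "dsts": set(), "ports": set()})
    for f in flows:
        a = acc[f.src]
        a["bytes"] += f.n_bytes
        a["flows"] += 1
        a["dsts"].add(f.dst)
        a["ports"].add(f.dst_port)
    return {
        src: {
            "bytes": a["bytes"],
            "flows": a["flows"],
            "distinct_dsts": len(a["dsts"]),
            "distinct_ports": len(a["ports"]),
        }
        for src, a in acc.items()
    }


window = [
    Flow("10.0.0.1", "10.0.1.1", 443, 12000),
    Flow("10.0.0.1", "10.0.1.1", 443, 8000),
    # A low-volume scanner: tiny flows to many hosts on one port.
    Flow("10.0.0.2", "10.0.1.1", 22, 60),
    Flow("10.0.0.2", "10.0.1.2", 22, 60),
    Flow("10.0.0.2", "10.0.1.3", 22, 60),
]
features = extract_features(window)
```

In this sample window, the second host sends far fewer bytes than the first, so a volume-only view would rank it as unremarkable, while its distinct-destination count stands out.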

Open Access Issue

Trusted Attestation Architecture on an Infrastructure-as-a-Service

Xin Jin, Xingshu Chen, Cheng Zhao, Dandan Zhao

Tsinghua Science and Technology 2017, 22 (5): 469-478

Published: 11 September 2017

Abstract

Trusted attestation is the main obstruction preventing large-scale promotion of cloud computing. How to extend a trusted relationship from a single physical node to an Infrastructure-as-a-Service (IaaS) platform is a problem that must be solved. The IaaS platform provides the Virtual Machine (VM), and the Trusted VM, equipped with a virtual Trusted Platform Module (vTPM), is the foundation of the trusted IaaS platform. We propose a multi-dimensional trusted attestation architecture that can collect and verify trusted attestation information from the computing nodes, and manage the information centrally on a cloud management platform. The architecture verifies the IaaS's trusted attestation by appraising the trusted status of the VM, Hypervisor, and host Operating System (OS). The theory and the technology roadmap are introduced, and the key technologies are analyzed. The key technologies include dynamic measurement of the Hypervisor at the process level, the protection of vTPM instances, the reinforcement of Hypervisor security, and the verification of the IaaS trusted attestation. A prototype was deployed to verify the feasibility of the system. The advantages of the prototype system were compared with the Open CIT (Intel Cloud attestation solution). A performance analysis experiment was performed on computing nodes, and the results show that the performance loss is within an acceptable range.

Open Access Issue

An Anomalous Behavior Detection Model in Cloud Computing

Xiaoming Ye, Xingshu Chen, Haizhou Wang, Xuemei Zeng, Guolin Shao, Xueyuan Yin, Chun Xu

Tsinghua Science and Technology 2016, 21 (3): 322-332

Published: 13 June 2016

Abstract

This paper proposes an anomalous behavior detection model based on cloud computing. Virtual Machines (VMs) are one of the key components of cloud Infrastructure as a Service (IaaS). The security of such VMs is critical to IaaS security. Many studies have been done on cloud computing security issues, but research into VM security issues, especially regarding VM network traffic anomalous behavior detection, remains inadequate. More and more studies show that communication among internal nodes exhibits complex patterns. Communication among VMs in cloud computing is invisible. Researchers find such issues challenging, and few solutions have been proposed, leaving cloud computing vulnerable to network attacks. This paper proposes a model that uses Software-Defined Networks (SDN) to implement traffic redirection. Our model can capture inter-VM traffic, detect known and unknown anomalous network behaviors, adopt hybrid techniques to analyze VM network behaviors, and control network systems. The experimental results indicate that the effectiveness of our approach is greater than 90%, and prove the feasibility of the model.

Open Access Issue

Research and Practice of Dynamic Network Security Architecture for IaaS Platforms

Lin Chen, Xingshu Chen, Junfang Jiang, Xueyuan Yin, Guolin Shao

Tsinghua Science and Technology 2014, 19 (5): 496-507

Published: 13 October 2014

Abstract

Network security requirements based on virtual network technologies in IaaS platforms and corresponding solutions were reviewed. A dynamic network security architecture was proposed, which was built on the technologies of software defined networking, Virtual Machine (VM) traffic redirection, network policy unified management, software defined isolation networks, vulnerability scanning, and software updates. The proposed architecture was able to obtain the capacity for detection and access control for VM traffic by redirecting it to configurable security appliances, and ensured the effectiveness of network policies in the total life cycle of the VM by configuring the policies to the right place at the appropriate time, according to the impacts of VM state transitions. The virtual isolation domains for tenants’ VMs could be built flexibly based on VLAN policies or Netfilter/Iptables firewall appliances, and vulnerability scanning as a service and software update as a service were both provided as security supports. Through cooperation with IDS appliances and automatic alarm mechanisms, the proposed architecture could dynamically mitigate a wide range of network-based attacks. The experimental results demonstrate the effectiveness of the proposed architecture.