Abstract
In network traffic anomaly detection, unsupervised learning plays a critical role yet faces significant challenges, including accurately determining anomaly thresholds and modeling the intricate temporal dynamics of network traffic. To address these challenges, we present a novel approach termed Convolutional Autoencoder-Isolation Forest (CAE-IF). By combining packet-level reconstruction errors with contextual information, our approach obviates the need for manual threshold setting and effectively captures temporal dynamics. The process begins by applying the damped incremental statistics algorithm to extract statistical features that carry temporal information from network traffic. A Convolutional Autoencoder (CAE) is then used to compute the Root Mean Square Error (RMSE) of reconstruction, which reflects the temporal correlations in network traffic. This RMSE is refined through an aggregation mechanism based on source IP addresses, yielding a fine-grained temporal representation. Finally, the Isolation Forest (IF) algorithm is applied to establish the anomaly detection framework. Experiments on three datasets (Mirai, OS Scan, and SSDP Flood) demonstrate the efficacy of CAE-IF, which achieves F1 scores of 96.14%, 99.81%, and 99.98%, respectively. These results represent substantial improvements over existing methods on the Mirai and OS Scan datasets and match the highest F1 score obtained on the SSDP Flood dataset.
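To make the feature-extraction and aggregation steps concrete, the following is a minimal sketch and not the implementation evaluated in this work. It assumes the commonly used form of a damped incremental statistic, in which a decayed weight, linear sum, and squared sum are multiplied by 2^(-λΔt) before each update. The names `DampedStat`, `rmse`, and `aggregate_rmse_by_source` are hypothetical, and averaging the RMSE per source IP is an assumed stand-in for the aggregation mechanism, whose exact form is not given in this abstract.

```python
import math
from collections import defaultdict


class DampedStat:
    """Damped incremental statistic over one stream (assumed form)."""

    def __init__(self, decay):
        self.decay = decay    # lambda: larger values forget old packets faster
        self.w = 0.0          # decayed packet count
        self.ls = 0.0         # decayed linear sum
        self.ss = 0.0         # decayed squared sum
        self.t_last = None    # timestamp of the previous update

    def update(self, x, t):
        # Decay the stored sums by 2^(-lambda * dt), then add the new value.
        if self.t_last is not None:
            factor = 2.0 ** (-self.decay * (t - self.t_last))
            self.w *= factor
            self.ls *= factor
            self.ss *= factor
        self.t_last = t
        self.w += 1.0
        self.ls += x
        self.ss += x * x

    def mean(self):
        return self.ls / self.w

    def std(self):
        # abs() guards against tiny negative values from rounding.
        return math.sqrt(abs(self.ss / self.w - self.mean() ** 2))


def rmse(x, x_hat):
    """Root mean square error between a feature vector and its reconstruction."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x))


def aggregate_rmse_by_source(records):
    """Average packet-level RMSE per source IP (assumed aggregation rule).

    records: iterable of (src_ip, rmse_value) pairs.
    """
    groups = defaultdict(list)
    for src_ip, value in records:
        groups[src_ip].append(value)
    return {ip: sum(vals) / len(vals) for ip, vals in groups.items()}
```

In such a sketch, the per-source values would then serve as input to an Isolation Forest implementation (for example, scikit-learn's `IsolationForest`), which scores how easily each point is isolated and so avoids a hand-set threshold on the raw RMSE.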