Comparative Analysis of Anomaly Detection Approaches in Firewall Logs: Integrating Light-Weight Synthesis of Security Logs and Artificially Generated Attack Detection

Sensors (Basel). 2024 Apr 20;24(8):2636. doi: 10.3390/s24082636.

Abstract

Detecting anomalies in large networks is a major challenge, and many recent studies rely on machine learning techniques to address it. However, much of this research depends on synthetic or limited datasets and tends to use specialized machine learning methods to achieve good detection results. This study analyzes firewall logs from a large industrial control network and presents a novel method for generating anomalies that simulate real attacker actions within the network, without the need for a dedicated testbed or installed security controls. To demonstrate that the proposed method is feasible and that the constructed logs behave as real-world logs would be expected to behave, different supervised and unsupervised learning models were compared using different feature subsets, feature construction methods, scaling methods, and aggregation levels. The experimental results show that unsupervised learning methods have difficulty detecting the injected anomalies, suggesting that these anomalies can be seamlessly integrated into existing firewall logs. Conversely, supervised learning methods performed significantly better than the unsupervised approaches and were better suited to use in real systems.
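The idea of injecting generated attack records into existing logs and then examining them at different aggregation levels can be illustrated with a small sketch. Everything here is an assumption made for illustration and is not taken from the paper: the record layout `(timestamp, src_ip, dst_port, action)`, the helper names `aggregate` and `zscore_flags`, the toy traffic, and the simple z-score rule, which stands in for the unsupervised models the study actually compares.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical log record layout: (timestamp_seconds, src_ip, dst_port, action)

def aggregate(records, window_s):
    """Group records into fixed time windows and build per-window features."""
    buckets = defaultdict(list)
    for ts, src, port, action in records:
        buckets[ts // window_s].append((src, port, action))
    features = {}
    for window, rows in sorted(buckets.items()):
        features[window] = {
            "n_events": len(rows),
            "n_ports": len({port for _, port, _ in rows}),
            "deny_ratio": sum(a == "deny" for _, _, a in rows) / len(rows),
        }
    return features

def zscore_flags(features, key, threshold=2.0):
    """Flag windows whose feature value deviates strongly from the mean.

    This is a deliberately simple stand-in detector, not the paper's method.
    """
    values = [f[key] for f in features.values()]
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return {w: False for w in features}
    return {w: abs(f[key] - mu) / sigma > threshold for w, f in features.items()}

# Toy baseline: ten minutes of regular web traffic (assumed, not real data).
baseline = [
    (w * 60 + i * 10, "10.0.0.1", 80 if i % 2 == 0 else 443, "allow")
    for w in range(10)
    for i in range(5)
]

# Injected anomaly: a short port scan from a hypothetical attacker address.
injected = [(7 * 60 + i, "10.0.0.99", 1000 + i, "deny") for i in range(20)]

records = sorted(baseline + injected)

fine = zscore_flags(aggregate(records, 60), "n_ports")     # 1-minute windows
coarse = zscore_flags(aggregate(records, 600), "n_ports")  # 10-minute window

print("flagged 1-minute windows:", [w for w, hit in fine.items() if hit])
print("flagged 10-minute windows:", [w for w, hit in coarse.items() if hit])
```

In this toy setting the scan stands out at one-minute granularity but disappears when all traffic falls into a single ten-minute window, which hints at why the choice of aggregation level is one of the factors worth comparing.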

Keywords: anomaly detection; artificially generated attacks; cybersecurity; datasets; firewall logs; machine learning; security logs.

Grants and funding

This work has been supported by the European Union’s European Regional Development Fund, Operational Programme Competitiveness and Cohesion 2014–2020 for Croatia, through the project Center of competencies for cyber-security of control systems (CEKOM SUS), grant KK.01.2.2.03.0019.