Information systems play an important role in business management, especially in personnel, budget, and financial management. If an anomaly ensues in an information system, all operations are paralyzed until their recovery. In this study, we propose a method for collecting and labeling datasets from actual operating systems in corporate environments for deep learning. The construction of a dataset from actual operating systems in a company's information system involves constraints. Collecting anomalous data from these systems is challenging because of the need to maintain system stability. Even with data collected over a long period, the training dataset may have an imbalance of normal and anomalous data. We propose a method that utilizes contrastive learning with data augmentation through negative sampling for anomaly detection, which is particularly suitable for small datasets. To evaluate the effectiveness of the proposed method, we compared it with traditional deep learning models, such as the convolutional neural network (CNN) and long short-term memory (LSTM). The proposed method achieved a true positive rate (TPR) of 99.47%, whereas CNN and LSTM achieved TPRs of 98.8% and 98.67%, respectively. The experimental results demonstrate the method's effectiveness in utilizing contrastive learning and detecting anomalies in small datasets from a company's information system.
Keywords: anomaly detection; constructing dataset; contrastive learning; enterprise information system dataset.