Abnormal Samples Oversampling for Anomaly Detection Based on Uniform Scale Strategy and Closed Area
Published in: | IEEE Transactions on Knowledge and Data Engineering, 2023-12, Vol. 35 (12), p. 11999-12011 |
---|---|
Main authors: | , , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Abstract: | Samples representing abnormal situations are usually scarce in a dataset, which makes it difficult for machine-learning-based methods to learn their features. To improve the accuracy of anomaly detection, the number of abnormal samples should be expanded so that the dataset is balanced. In this paper, a discrete synthetic minority oversampling technique (D-SMOTE) is proposed to generate new samples. A closed area is constructed from the three nearest abnormal samples in the dataset, and new samples are then uniformly interpolated within this closed area. In this way, the imbalance of the original dataset is addressed and the data quality is improved. Based on the expanded datasets, a two-dimensional convolutional neural network (2D CNN) is constructed to detect abnormal samples. In the experiments, three cases and several machine learning methods are considered for comparison. Several indexes, including accuracy, precision, confusion matrix, F1-score, and recall, are used to evaluate detection effectiveness. The results show that abnormal samples can be detected accurately using the oversampled data obtained from the proposed D-SMOTE method. |
---|---|
ISSN: | 1041-4347 1558-2191 |
DOI: | 10.1109/TKDE.2021.3130595 |
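
Note: The abstract describes interpolating new abnormal samples uniformly inside a closed area spanned by three nearby abnormal samples. The following is only a minimal sketch of that general idea, not the authors' D-SMOTE implementation. It assumes that the closed area is the triangle formed by a randomly chosen abnormal sample and its two nearest abnormal neighbours, and that "uniform" means uniform over that triangle, obtained here with Dirichlet(1, 1, 1) barycentric weights. The function name `triangle_oversample` and its parameters are hypothetical.

```python
import numpy as np


def triangle_oversample(X_min, n_new, seed=0):
    """Draw n_new synthetic points inside triangles of nearby minority samples.

    Assumption (not taken from the paper): each triangle consists of an
    anchor sample and its two nearest neighbours among the minority samples.
    """
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    if X_min.ndim != 2 or len(X_min) < 3:
        raise ValueError("need a 2-D array with at least three abnormal samples")

    # Pairwise Euclidean distances between all minority samples.
    diff = X_min[:, None, :] - X_min[None, :, :]
    dist = np.linalg.norm(diff, axis=2)

    new = np.empty((n_new, X_min.shape[1]))
    for k in range(n_new):
        i = rng.integers(len(X_min))      # pick an anchor sample at random
        idx = np.argsort(dist[i])[:3]     # the anchor (distance 0) and its 2 nearest
        w = rng.dirichlet(np.ones(3))     # uniform barycentric weights, sum to 1
        new[k] = w @ X_min[idx]           # convex combination lies in the triangle
    return new


if __name__ == "__main__":
    abnormal = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
    synthetic = triangle_oversample(abnormal, n_new=5)
    print(synthetic)
```

Because every synthetic point is a convex combination of three real abnormal samples, it cannot fall outside the region those samples enclose, which is the property the abstract's "closed area" suggests. How the paper selects the three samples and discretizes the area may differ from this sketch.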