A novel metric learning method based on constructing a uniform data hypersphere via simulated forging approach

Bibliographic details
Published in: Neural computing & applications, 2024-08, Vol. 36 (24), p. 15137-15148
Main authors: Liang, Lu; Su, Linxin; Fei, Lunke
Format: Article
Language: English
Description
Summary: Non-uniformly distributed data in unbalanced datasets exhibit both data stacking and data scattering. Most traditional metric learning algorithms, however, overemphasize intra-class compactness and inter-class dispersion. When applied to typical non-uniformly distributed data, these methods may distort a few classes of data in the embedding space and ultimately reduce classification accuracy. To solve these problems, we propose a novel metric learning method based on constructing a uniform data hypersphere via a simulated forging approach (CUDH-SF). CUDH-SF aims to achieve a uniformly distributed embedding space by applying local forging and global forging to the original data, so that data stacking and data scattering are alleviated more effectively. Although the uniform data hypersphere changes the absolute positions of the original data points, it does not change their relative relationships; that is, the geometric structure of the data is maintained. This transformation strengthens the representation ability of the input data, so that subsequent distance-based classifiers can construct better decision boundaries. Moreover, CUDH-SF has no parameters that need to be tuned on a validation set, which makes it easier to use. Extensive experiments on eight datasets demonstrate the improved performance of the proposed method.
ISSN: 0941-0643, 1433-3058
DOI: 10.1007/s00521-024-09854-0