Static and adaptive subspace information fusion for indefinite heterogeneous proximity data

Bibliographic details
Published in: Neurocomputing (Amsterdam), 2023-10, Vol. 555, p. 126635, Article 126635
Main authors: Münch, Maximilian; Röder, Manuel; Heilig, Simon; Raab, Christoph; Schleif, Frank-Michael
Format: Article
Language: English
Description
Summary: Heterogeneous data is common in many real-world machine learning applications, such as healthcare, market analysis, environmental sciences, and social media analysis. In these domains, data is often represented in different modalities and, most of the time, in non-vectorial formats such as text, images, and video. Traditional machine learning algorithms are often limited in their ability to analyze and learn from such diverse data types effectively. In this paper, we propose two approaches for such heterogeneous data analysis: static and adaptive subspace kernel fusion. The first approach is a kernel-based method that extracts the essential parts of the subspace of each input modality and creates a single fused representation of the data. The second approach adds an adaptation step that integrates the weighting of spectral properties into the fusion process in order to improve the data’s representation with respect to a given classification task. Our proposed methods are evaluated on several multi-modal, heterogeneous data sets and demonstrate significant performance improvement compared to other methods in the field. Our results highlight the importance of fusing the underlying subspace information of heterogeneous data for achieving superior performance in machine learning tasks.
• We propose two novel kernel-based techniques for learning classification models from heterogeneous data to address various limitations of existing multiple kernel learning (MKL) methods.
• Our approaches leverage the spectral properties of multiple proximity functions to create a new, information-rich representation of the data in the form of a single kernel matrix across multiple modalities.
• We provide an efficient out-of-sample extension for multi-modal kernel learning for new, unseen data.
• The efficiency of the approach is shown on a variety of benchmark data sets from the MKL domain.
ISSN: 0925-2312, 1872-8286
DOI: 10.1016/j.neucom.2023.126635
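
The summary describes static fusion only at a high level: extract the essential subspace of each modality's proximity matrix and combine the results into one kernel matrix. The following is a minimal sketch of that general idea, not the authors' algorithm. It assumes each modality is given as a symmetric, possibly indefinite, similarity matrix, keeps the k eigenpairs with the largest absolute eigenvalue (so negative eigenvalues are effectively flipped), and fuses the modalities by concatenating the resulting embeddings. The function names `top_subspace` and `fuse_static`, the choice of k, and the toy data are all hypothetical.

```python
import numpy as np


def top_subspace(S, k):
    """Return an n x k embedding built from the k eigenpairs of S
    with the largest absolute eigenvalue (assumed selection rule)."""
    S = 0.5 * (S + S.T)                        # guard against numerical asymmetry
    vals, vecs = np.linalg.eigh(S)             # eigenpairs of a symmetric matrix
    idx = np.argsort(np.abs(vals))[::-1][:k]   # rank by magnitude, since S may be indefinite
    # Scale eigenvectors by sqrt(|eigenvalue|); this flips negative eigenvalues.
    return vecs[:, idx] * np.sqrt(np.abs(vals[idx]))


def fuse_static(proximities, k):
    """Concatenate per-modality embeddings and return one fused kernel matrix."""
    Z = np.hstack([top_subspace(S, k) for S in proximities])
    return Z @ Z.T                             # positive semi-definite by construction


# Toy data (hypothetical): one PSD kernel and one indefinite similarity matrix.
rng = np.random.default_rng(0)
n = 6
A = rng.normal(size=(n, 3))
S1 = A @ A.T
B = rng.normal(size=(n, n))
S2 = 0.5 * (B + B.T)

K = fuse_static([S1, S2], k=2)
print(K.shape)
```

Because the fused matrix is a Gram matrix of the concatenated embeddings, it is positive semi-definite even when an input similarity is not, so it can be passed to a standard kernel classifier that accepts precomputed kernels. The adaptive variant and the out-of-sample extension mentioned in the summary are not covered by this sketch.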