An extended DEIM algorithm for subset selection and class identification

The discrete empirical interpolation method (DEIM) has been shown to be a viable index-selection technique for identifying representative subsets in data. Having gained some popularity in reducing dimensionality of physical models involving differential equations, its use in subset-/pattern-identifi...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Veröffentlicht in:Machine learning 2021-04, Vol.110 (4), p.621-650
Hauptverfasser: Hendryx, Emily P., Rivière, Béatrice M., Rusin, Craig G.
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:The discrete empirical interpolation method (DEIM) has been shown to be a viable index-selection technique for identifying representative subsets in data. Having gained some popularity in reducing dimensionality of physical models involving differential equations, its use in subset-/pattern-identification tasks is not yet broadly known within the machine learning community. While it has much to offer as is, the DEIM algorithm is limited in that the number of selected indices cannot exceed the rank of the corresponding data matrix. Although this is not an issue for many data sets, there are cases in which the number of classes represented in a given data set is greater than the rank of the data matrix; in such cases, it is impossible for the standard DEIM algorithm to identify all classes. To overcome this issue, we present a novel extension of DEIM, called E-DEIM. With the proposed algorithm, we also provide some theoretical results for using extensions of DEIM to form the CUR matrix factorization in identifying both rows and columns to approximate the original data matrix. Results from applying variations of E-DEIM to two different data sets indicate that the presented extension can indeed allow for the identification of additional classes along with those selected by standard DEIM. In addition, comparing these results to those of some more familiar methods demonstrates that the proposed deterministic E-DEIM approach including coherence performs comparably to or better than the other evaluated methods and should be considered in future class-identification tasks.
ISSN:0885-6125
1573-0565
DOI:10.1007/s10994-021-05954-3