Simultaneous class-modelling in chemometrics: A generalization of Partial Least Squares class modelling for more than two classes by using error correcting output code matrices

Bibliographic details
Published in: Chemometrics and Intelligent Laboratory Systems, 2022-08, Vol. 227, Article 104614
Main authors: Valencia, O., Ortiz, M.C., Ruiz, S., Sánchez, M.S., Sarabia, L.A.
Format: Article
Language: English
Online access: Full text
Description
Abstract: The paper presents a new methodology within the framework of the so-called compliant class-models, PLS2-CM, designed to improve the performance of class-modelling in settings with more than two classes. The improvement is achieved by using multi-response PLS models with the classes encoded via Error-Correcting Output Codes (ECOC) instead of the traditional class indicator variables used in chemometrics. The proposed PLS2-CM decomposes a class-modelling problem into a series of binary learners, based on a family of code matrices with different code lengths, which are evaluated to obtain the simultaneous compliant class-models with the best performance. The methodology develops both a new encoding system, based on multi-criteria optimization to search for optimal coding matrices, and a new decoding system, based on probability thresholds to assign objects to class-models. Because the characteristics of the dataset at hand affect the final selection of the coding matrix, and therefore the class-models that are built, the whole procedure is a data-driven strategy. Applying PLS2-CM to a variety of cases (controlled data, experimental data and repository datasets) results in enhanced class-modelling performance, as measured by the DMCEN (Diagonal Modified Confusion Entropy) index and by sensitivity-specificity matrices. The predictive ability of the compliant class-models has also been evaluated.

• A new methodological approach to class-modelling with error-correcting output codes.
• Coupling error-correcting output codes to PLS2 models for multiclass settings.
• Trade-off codes are obtained for five different criteria in a data-driven approach.
• Sensitivity and specificity of the built class-models are notably improved.
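To make the ECOC idea in the abstract concrete, the following is a minimal sketch of generic ECOC encoding and decoding, not the PLS2-CM procedure itself. Everything in it is an assumption for illustration: the four class labels, the 7-bit exhaustive code matrix (a textbook ECOC construction, not one of the optimized matrices from the paper), and the function names. In particular, decoding here uses a plain Hamming-distance cutoff as a stand-in; the paper describes a decoding system based on probability thresholds, whose details are not given in this record.

```python
from itertools import combinations

# Hypothetical code matrix: one row (codeword) per class, one column per
# binary learner. This is the standard "exhaustive" ECOC for 4 classes.
CODE_MATRIX = {
    "A": (1, 1, 1, 1, 1, 1, 1),
    "B": (0, 0, 0, 0, 1, 1, 1),
    "C": (0, 0, 1, 1, 0, 0, 1),
    "D": (0, 1, 0, 1, 0, 1, 0),
}


def hamming(a, b):
    """Number of positions in which two equal-length codewords differ."""
    return sum(x != y for x, y in zip(a, b))


def min_row_distance(matrix):
    """Smallest Hamming distance between any two class codewords."""
    return min(hamming(u, v) for u, v in combinations(matrix.values(), 2))


def encode(labels, matrix=CODE_MATRIX):
    """Replace each class label by its codeword.

    The result plays the role of the multi-response block that a
    multi-response model would be fitted against (one column per learner).
    """
    return [matrix[label] for label in labels]


def decode(bits, matrix=CODE_MATRIX, max_distance=1):
    """Return every class whose codeword lies within max_distance of bits.

    Assumption: a distance cutoff stands in for the paper's probability
    thresholds. As in class-modelling, an object may be assigned to one
    class, to several classes, or to none.
    """
    return [c for c, word in matrix.items() if hamming(bits, word) <= max_distance]


if __name__ == "__main__":
    print(min_row_distance(CODE_MATRIX))
    print(encode(["B", "D"]))
    print(decode((0, 0, 0, 0, 1, 1, 0)))
```

With a minimum row distance of d, a single wrong binary learner can be tolerated whenever d is at least 3, which is the motivation for using longer codewords than plain class indicator variables. Returning a list rather than a single label keeps the class-modelling flavour, where rejecting an object from all classes is a valid outcome.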
ISSN: 0169-7439, 1873-3239
DOI: 10.1016/j.chemolab.2022.104614