Simultaneous class-modelling in chemometrics: A generalization of Partial Least Squares class modelling for more than two classes by using error correcting output code matrices
Published in: | Chemometrics and intelligent laboratory systems 2022-08, Vol.227, p.104614, Article 104614 |
---|---|
Main authors: | , , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | The paper presents PLS2-CM, a new methodology within the framework of the so-called compliant class-models, designed to improve the performance of class-modelling in settings with more than two classes. The improvement is achieved by using multi-response PLS models with the classes encoded via Error-Correcting Output Codes (ECOC), instead of the traditional class indicator variables used in chemometrics.
The proposed PLS2-CM decomposes a class-modelling problem into a series of binary learners, based on a family of code matrices with different code lengths, which are evaluated to obtain the simultaneous compliant class-models with the best performance.
The methodology develops both a new encoding system, based on multi-criteria optimization to search for optimal coding matrices, and a new decoding system, based on probability thresholds to assign objects to class-models. As a result, the characteristics of the dataset at hand affect the final selection of the coding matrix, and therefore the class-models that are built, giving rise to a data-driven strategy.
Applying PLS2-CM to a variety of cases (controlled data, experimental data and repository datasets) enhances class-modelling performance, as measured by the DMCEN (Diagonal Modified Confusion Entropy) index and by sensitivity-specificity matrices. The predictive ability of the compliant class-models has also been evaluated.
• A new methodological approach to class-modelling with error-correcting output codes.
• Coupling error-correcting output codes to PLS2 models for multiclass settings.
• Trading-off codes are obtained for five different criteria in a data-driven approach.
• Sensitivity and specificity of the built class-models are notably improved. |
---|---|
ISSN: | 0169-7439, 1873-3239 |
DOI: | 10.1016/j.chemolab.2022.104614 |
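
The abstract describes replacing class indicator variables with ECOC codewords as the responses of a multi-response model. The following is a minimal sketch of that encoding idea only, under stated assumptions: the four class labels, the 7-column exhaustive code matrix, and the function names are all hypothetical, and the decoding shown is generic nearest-codeword (Hamming) decoding, not the probability-threshold decoding developed in the paper. No PLS model is fitted here.

```python
from itertools import combinations

# Hypothetical exhaustive code matrix for four classes (labels A-D are
# illustrative). Each row is a class codeword; each column defines one
# binary learner, i.e. one response column of a multi-response model.
CODE_MATRIX = {
    "A": (1, 1, 1, 1, 1, 1, 1),
    "B": (0, 0, 0, 0, 1, 1, 1),
    "C": (0, 0, 1, 1, 0, 0, 1),
    "D": (0, 1, 0, 1, 0, 1, 0),
}


def encode(labels, code=CODE_MATRIX):
    """Build the multi-response target matrix: one codeword per object."""
    return [list(code[label]) for label in labels]


def hamming(a, b):
    """Number of positions in which two equal-length codewords differ."""
    return sum(x != y for x, y in zip(a, b))


def min_row_distance(code=CODE_MATRIX):
    """Smallest Hamming distance between any two class codewords."""
    return min(hamming(u, v) for u, v in combinations(code.values(), 2))


def decode(bits, code=CODE_MATRIX):
    """Generic nearest-codeword decoding (assumption: not the paper's method)."""
    return min(code, key=lambda label: hamming(code[label], bits))


if __name__ == "__main__":
    print(encode(["B", "D"]))
    print(min_row_distance())
    # Codeword of class C with its last bit flipped by a faulty binary learner.
    print(decode((0, 0, 1, 1, 0, 0, 0)))
```

With a minimum row distance of d, nearest-codeword decoding tolerates up to floor((d - 1) / 2) wrong binary learners, which is the error-correcting property the name refers to. Note that this decoder always returns exactly one class, whereas a class-modelling approach may assign an object to several class-models or to none.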