Self-Optimized One-Class Classification Using Sum of Ranking Differences Combined with a Receiver Operator Characteristic Curve

Bibliographic Details
Published in: Analytical chemistry (Washington) 2020-04, Vol.92 (7), p.5354-5361
Main Authors: Lemos, Tony, Kalivas, John H
Format: Article
Language: English
Description
Summary: A significant and common problem in analytical chemistry is determining if a sample belongs to a specific class, e.g., establishing if a food product is genuine or counterfeit or a tissue sample is benign or malignant. This problem is termed one-class classification (class modeling). Problematic with class modeling is determining which one-class classifier to use followed by the challenge of optimizing the chosen classifier (identifying the best tuning parameter value(s)). With spectroscopic data, two other conundrums arise: which data preprocessing method(s) and spectral region(s) to use. Presented in this paper is a hybrid fusion process that can combine nonoptimized classifiers across multiple instruments, preprocessing methods, and measurements. Instead of optimizing classifiers, a window of tuning parameters is used for each classifier. The flexible fusion method of sum of ranking differences (SRD) is applied to combine all assessment values. Defining the best SRD ranking value (threshold) for determining class membership is the one tuning parameter value needed. However, this SRD ranking value is automatically optimized by using a receiver operator characteristic (ROC) curve. The approach is demonstrated on two analytical data sets. The first is a beer authentication sample set measured on five instruments: near-infrared, mid-infrared (MIR), ultraviolet, visible, and thermogravimetric. Three different fusion protocols of all five instruments are demonstrated. The second data set is MIR spectra of strawberry puree with two categories: strawberry puree and nonstrawberry puree. Fusing nonoptimized classifiers provides reliable classifications relative to accuracy, sensitivity, and specificity.
ISSN: 0003-2700, 1520-6882
DOI: 10.1021/acs.analchem.0c00017
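
The summary names two computational steps: fusing assessment values with a sum of ranking differences (SRD), and choosing the SRD threshold from a ROC curve. The sketch below illustrates the general idea only and is not the authors' protocol. Several things are assumptions made for illustration: the function names `average_ranks`, `srd`, and `best_threshold`; the convention that a lower SRD value means closer agreement with the reference and therefore class membership; the use of Youden's J (sensitivity + specificity - 1) as the ROC criterion; and all toy numbers.

```python
from typing import List, Sequence, Tuple


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1 (smallest) upward; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # 1-based mean position of the tie group
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def srd(column: Sequence[float], reference: Sequence[float]) -> float:
    """Sum of absolute rank differences between one column and the reference."""
    col_ranks = average_ranks(column)
    ref_ranks = average_ranks(reference)
    return sum(abs(c - r) for c, r in zip(col_ranks, ref_ranks))


def best_threshold(
    srd_values: Sequence[float], is_member: Sequence[bool]
) -> Tuple[float, float]:
    """Scan the observed SRD values as thresholds and return (threshold, Youden J).

    Assumption: a sample is called a class member when its SRD is <= threshold.
    Both members and non-members must be present in `is_member`.
    """
    n_pos = sum(1 for m in is_member if m)
    n_neg = len(is_member) - n_pos
    best = (float("nan"), -1.0)
    for t in sorted(set(srd_values)):
        tp = sum(1 for s, m in zip(srd_values, is_member) if m and s <= t)
        fp = sum(1 for s, m in zip(srd_values, is_member) if not m and s <= t)
        j = tp / n_pos - fp / n_neg  # sensitivity - (1 - specificity)
        if j > best[1]:
            best = (t, j)
    return best


# Toy data (hypothetical): each list holds assessment values from four
# classifier / tuning-parameter combinations for one sample.
reference = [1.0, 2.0, 3.0, 4.0]
agreeing = [1.1, 2.2, 2.9, 4.5]   # same ordering as the reference
reversed_ = [4.0, 3.0, 2.0, 1.0]  # opposite ordering
print(srd(agreeing, reference), srd(reversed_, reference))

# Toy SRD values for samples of known class, used to pick the threshold.
print(best_threshold([0, 2, 4, 8, 6], [True, True, False, False, False]))
```

Because the threshold is chosen from labeled SRD values, this step needs both member and non-member samples; with only one class present the sensitivity or specificity term is undefined.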