Analysis of handmade paper by Raman spectroscopy combined with machine learning

Bibliographic details
Published in: Journal of Raman spectroscopy 2022-02, Vol. 53 (2), p. 260-271
Main authors: Yan, Chunsheng; Cheng, Zhongyi; Luo, Si; Huang, Chen; Han, Songtao; Han, Xiuli; Du, Yuandong; Ying, Chaonan
Format: Article
Language: English
Description
Summary: Handmade paper is a major carrier and restoration material for traditional Chinese ancient books, calligraphy, and paintings. In this study, we carried out a Raman spectroscopy analysis of 18 types of handmade paper samples. According to the wavenumbers and Raman vibration assignments, the main components of the handmade paper were cellulose and lignin. We divided the Raman spectrum into eight subbands. Five machine learning models were employed: principal component analysis (PCA), partial least squares (PLS), support vector machine (SVM), k‐nearest neighbors (KNN), and random forest (RF). The Raman spectral data were normalized, and the fluorescence envelope was subtracted using the airPLS algorithm, yielding four types of data: raw, normalized, defluorescence, and fluorescence data. An RF variable importance analysis of the data processing showed that normalization alone eliminated the intensity differences of the fluorescence signals caused by lignin, which contained important information about raw materials and papermaking technology; defluorescence did so all the more. The data processing also reduced the average importance of the variables in almost all spectral bands. Nevertheless, the data processing is worthwhile because it significantly improves the accuracy of machine learning, and the information loss does not affect the prediction. The handmade paper samples were classified and predicted using PCA, PLS, and SVM, each combined with linear regression (LR), as well as KNN and RF. For almost all processed data, including the fluorescence data, PCA‐LR had the highest classification and prediction accuracy (R² = 1) in almost all spectral bands. PLS‐LR and SVM‐LR had the second‐highest accuracies (R² = 0.4–0.9), whereas KNN and RF had the lowest accuracies (R² = 0.1–0.4) for full‐band spectral data.
Our results suggest that the abundant information contained in Raman spectra, combined with powerful machine learning models, could inspire further studies on handmade paper and related cultural relics. We measured the Raman spectra of 18 types of handmade paper samples and constructed five machine‐learning models, namely PCA‐LS, PLS‐LS, SVM‐LS, KNN, and RF, to evaluate the role of data processing and to classify and predict the samples. The results show that data processing caused the loss of fluorescence‐related features. Nevertheless, data processing greatly improved the accuracy of machine learning. PCA‐LR has the highest classifi
ISSN: 0377-0486, 1097-4555
DOI: 10.1002/jrs.6280