Application of spectral small-sample data combined with a method of spectral data augmentation fusion (SDA-Fusion) in cancer diagnosis
Published in: | Chemometrics and intelligent laboratory systems 2022-12, Vol.231, p.104681, Article 104681 |
---|---|
Main authors: | , , , , , , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Summary: | Cancer is one of the most life-threatening diseases, and accurate diagnosis is a prerequisite for precise treatment. Computer-aided vibrational spectroscopy has achieved encouraging results in intelligent cancer diagnosis. However, the limited number of cancer cases in clinical practice and the cost of spectral acquisition make it difficult to obtain large amounts of spectral data, which ultimately constrains the optimization and improvement of diagnostic models.
To address these challenges, we adopted different data augmentation strategies in this study to obtain more usable training data. In addition to the augmentation methods commonly used in vibrational spectroscopy, such as adding random noise, adding random variations in offset, multiplication, and slope, and the synthetic minority over-sampling technique (SMOTE), two generative adversarial networks with different architectures were selected for comparison: one based on artificial neural networks (ANN) and the other on convolutional neural networks (CNN). In the experiments, t-distributed stochastic neighbor embedding (t-SNE) visualization and the cosine similarity (CS) measure were used to evaluate the quality of the generated spectra. Different augmentation techniques produced new spectra with different characteristics. Effectively merging the heterogeneous data generated by different augmentation techniques can further enlarge the sample space and increase sample diversity. With this in mind, we proposed a new spectral data augmentation fusion (SDA-Fusion) method to acquire more usable instances. The method fuses the new data generated by the five data augmentation techniques mentioned above. Finally, three groups of experiments were designed, using the original training data, the augmented training data, and the fused training data as input. Support vector machines (SVM) with different kernel functions, a CNN, and ResNet were used as classification models. Group five-fold (Group5Fold) cross-validation was used to assess model performance.
We applied the augmentation methods and experimental design described above to two real datasets: a Raman spectral dataset of lung cancer and a mid-infrared spectral dataset of glioma. The results illustrate that the generative adversarial networks working through adversaria |
---|---|
ISSN: | 0169-7439 1873-3239 |
DOI: | 10.1016/j.chemolab.2022.104681 |
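
The abstract names several classical augmentations (random noise; random offset, multiplication, and slope), a cosine similarity check, and a fusion step. The following is a minimal Python sketch of those ideas, not the authors' implementation. All function names, parameter values, and the synthetic spectra are hypothetical. Fusion is assumed here to be plain concatenation of the augmented sets, which the abstract does not confirm. SMOTE and the two GANs are omitted.

```python
import numpy as np


def add_noise(X, rng, scale=0.01):
    """Add Gaussian noise, scaled by each spectrum's standard deviation."""
    sd = X.std(axis=1, keepdims=True)
    return X + rng.normal(0.0, scale, X.shape) * sd


def offset_mult_slope(X, rng, offset=0.05, mult=0.05, slope=0.05):
    """Apply a random offset, multiplicative factor, and linear slope per spectrum."""
    n, p = X.shape
    sd = X.std(axis=1, keepdims=True)
    off = rng.uniform(-offset, offset, (n, 1)) * sd
    mul = 1.0 + rng.uniform(-mult, mult, (n, 1))
    ramp = np.linspace(-0.5, 0.5, p)[None, :]  # linear baseline tilt
    tilt = rng.uniform(-slope, slope, (n, 1)) * sd * ramp
    return mul * X + off + tilt


def cosine_similarity(a, b):
    """Cosine similarity between two spectra (1.0 means identical shape)."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def fuse(X, y, groups, augmenters, rng):
    """Assumed fusion: stack the output of every augmenter.

    Labels and group IDs are repeated so that each augmented spectrum
    keeps the label and group of the spectrum it was derived from.
    """
    X_fused = np.vstack([aug(X, rng) for aug in augmenters])
    y_fused = np.tile(y, len(augmenters))
    g_fused = np.tile(groups, len(augmenters))
    return X_fused, y_fused, g_fused


# Synthetic stand-in for measured spectra: one Gaussian band, six samples.
rng = np.random.default_rng(0)
axis = np.linspace(0.0, 1.0, 200)
peak = np.exp(-((axis - 0.5) ** 2) / 0.005)
X = np.linspace(1.0, 2.0, 6)[:, None] * peak[None, :]
y = np.array([0, 0, 0, 1, 1, 1])       # hypothetical class labels
groups = np.array([0, 0, 1, 1, 2, 2])  # hypothetical patient IDs

X_fused, y_fused, g_fused = fuse(X, y, groups, [add_noise, offset_mult_slope], rng)
print(X_fused.shape)
print(cosine_similarity(X[0], X_fused[0]))
```

Carrying the group IDs through the fusion step matters for grouped cross-validation such as the Group5Fold scheme mentioned in the abstract: if augmented copies are assigned to the same group as their source spectrum, they cannot land in a different fold and leak information into the validation set.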