Alignment and clustering strategies for GC×GC-MS features using a cylindrical mapping
Published in: | Analytica chimica acta 2012-05, Vol.726, p.9-21 |
---|---|
Main authors: | , , , |
Format: | Article |
Language: | eng |
Keywords: | |
Online access: | Full text |
Summary: | Comprehensive two-dimensional gas chromatography coupled to mass spectrometry is a powerful tool for analyzing complex samples. To apply the technique in studies such as biomarker discovery, in which large sets of complex samples have to be analyzed, extensive preprocessing is needed to align the data obtained in several injections (analyses). We developed new alignment and clustering algorithms for this type of data. New in these procedures is the consistent way in which the phenomenon referred to as wrap-around is treated. The data analysis problems associated with this phenomenon are solved by treating the 2D display as the surface of a three-dimensional cylinder. Based on this transformation we developed a new similarity metric for features as a function of both the cylindrical distance (reflecting similarity in chromatographic behavior) and the mass spectral correlation (reflecting similarity in chemical structure). The concepts are used in warping and clustering, and include a protection against greedy warping. As an example, the methods were applied to the analysis of 11 replicates of a human urine sample concentrated by solid phase extraction. It is shown that the alignment is well protected against greedy warping, which is important for analytical qualities such as robustness and repeatability. It is also demonstrated that chemically similar features are clustered together. The paper is organized as follows. First, a brief introduction addresses the background of the GC×GC-MS data structure, followed by a theoretical section with a conceptual description of the procedures and details of the algorithms. Finally, an example is given in the experimental section, illustrating the application of the procedures. |
---|---|
ISSN: | 0003-2670 |
DOI: | 10.1016/j.aca.2012.03.009 |
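
The summary describes treating the 2D display as the surface of a cylinder and scoring feature pairs by both cylindrical distance and mass spectral correlation, but this record does not give the formulas. The Python sketch below only illustrates the idea and should not be read as the paper's metric. It assumes that second-dimension retention times wrap around at the modulation period, and every name in it (`circular_gap`, `cylindrical_distance`, `spectral_correlation`, `similarity`, the `sigma` width, the Gaussian weighting) is hypothetical.

```python
import math


def circular_gap(t2_a, t2_b, period):
    """Shortest separation of two second-dimension times on a circle
    whose circumference is the modulation period (assumed wrap-around point)."""
    gap = abs(t2_a - t2_b) % period
    return min(gap, period - gap)


def cylindrical_distance(rt_a, rt_b, period, scale_t1=1.0, scale_t2=1.0):
    """Distance along the cylinder surface between two (t1, t2) positions.

    t1 runs along the cylinder axis; t2 runs around the circumference.
    The scale factors are illustrative assumptions for balancing the two axes.
    """
    d1 = (rt_a[0] - rt_b[0]) / scale_t1
    d2 = circular_gap(rt_a[1], rt_b[1], period) / scale_t2
    return math.hypot(d1, d2)


def spectral_correlation(x, y):
    """Pearson correlation of two equal-length intensity vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant spectrum carries no correlation information
    return sxy / math.sqrt(sxx * syy)


def similarity(feat_a, feat_b, period, sigma=1.0):
    """Hypothetical combined score: spectral correlation (clipped at 0)
    down-weighted by a Gaussian of the cylindrical distance."""
    d = cylindrical_distance(feat_a["rt"], feat_b["rt"], period)
    r = max(0.0, spectral_correlation(feat_a["spectrum"], feat_b["spectrum"]))
    return r * math.exp(-0.5 * (d / sigma) ** 2)


# Two features on either side of the wrap-around boundary (period of 5 s assumed).
a = {"rt": (10.0, 0.1), "spectrum": [0.0, 5.0, 1.0, 9.0]}
b = {"rt": (10.0, 4.9), "spectrum": [0.0, 5.0, 1.0, 9.0]}
flat = math.hypot(a["rt"][0] - b["rt"][0], a["rt"][1] - b["rt"][1])
wrapped = cylindrical_distance(a["rt"], b["rt"], period=5.0)
print(flat, wrapped, similarity(a, b, period=5.0))
```

In the flat display the two example features appear almost a full modulation period apart, while on the cylinder they are neighbours, which is the situation the wrap-around treatment is meant to handle.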