Prediction of standard cell types and functional markers from textual descriptions of flow cytometry gating definitions using machine learning

Bibliographic details
Published in:	Cytometry. Part B, Clinical cytometry, 2022-05, Vol.102 (3), p.220-227
Main authors:	Rodriguez‐Esteban, Raul; Duarte, José; Teixeira, Priscila C.; Richard, Fabien; Koltsova, Svetlana; So, W. Venus
Format:	Article
Language:	English
Subjects:	Annotations; assay annotation; automatic text annotation; Data analysis; Data interpretation; Data retrieval; Datasets; Flow Cytometry; flow cytometry; gating; Gating; gating definitions; Genes; Gold; Humans; Learning algorithms; Machine Learning; Markers; Training; Workflow
Online access:	Full text

Description
Summary:	Background: A key step in clinical flow cytometry data analysis is gating, which involves the identification of cell populations. The process of gating produces a set of reportable results, which are typically described by gating definitions. The non‐standardized, non‐interpreted nature of gating definitions represents a hurdle for data interpretation and data sharing across and within organizations. Interpreting and standardizing gating definitions for subsequent analysis of gating results requires a curation effort from experts. Machine learning approaches have the potential to help in this process by predicting expert annotations associated with gating definitions. Methods: We created a gold‐standard dataset by manually annotating thousands of gating definitions with cell type and functional marker annotations. We used this dataset to train and test a machine learning pipeline able to predict standard cell types and functional marker genes associated with gating definitions. Results: The machine learning pipeline predicted annotations with high accuracy for both cell types and functional marker genes. Accuracy was lower for gating definitions from assays belonging to laboratories from which limited or no prior data was available in the training. Manual error review ensured that resulting predicted annotations could be reused subsequently as additional gold‐standard training data. Conclusions: Machine learning methods are able to consistently predict annotations associated with gating definitions from flow cytometry assays. However, a hybrid automatic and manual annotation workflow would be recommended to achieve optimal results.
ISSN:	1552-4949, 1552-4957
DOI:	10.1002/cyto.b.22065