Optimal Bayesian Filtering for Biomarker Discovery: Performance and Robustness
Published in: | IEEE/ACM transactions on computational biology and bioinformatics 2020-01, Vol.17 (1), p.250-263 |
---|---|
Main authors: | , |
Format: | Article |
Language: | eng |
Subjects: | |
Abstract: | Optimal Bayesian feature filtering (OBF) is a fast and memory-efficient algorithm that optimally identifies markers with distributional differences between treatment groups under Gaussian models. Here, we study the performance and robustness of OBF for biomarker discovery. Our contributions are twofold: (1) we examine how OBF performs on data that violates modeling assumptions, and (2) we provide guidelines on how to set input parameters for robust performance. Contribution (1) addresses an important, relevant, and commonplace problem in computational biology, where it is often impossible to validate an algorithm's core assumptions. To accomplish both tasks, we present a battery of simulations that implement OBF with different inputs and challenge each assumption made by OBF. In particular, we examine the robustness of OBF with respect to incorrect input parameters, false independence, and imbalanced sample size, and we address the Gaussianity assumption by considering performance on an extensive family of non-Gaussian distributions. We discuss the advantages and disadvantages of different priors and optimization criteria throughout. Finally, we evaluate the utility of OBF in biomarker discovery using acute myeloid leukemia (AML) and colon cancer microarray datasets, and show that OBF is successful at identifying well-known biomarkers for these diseases that rank low under the moderated t-test. |
---|---|
ISSN: | 1545-5963 1557-9964 |
DOI: | 10.1109/TCBB.2018.2858814 |