Abstract 24: Multi-feature ensemble learning on cell-free DNA for accurately detecting and locating cancer

Bibliographic details
Published in: Cancer research (Chicago, Ill.), 2021-07, Vol.81 (13_Supplement), p.24-24
Main authors: Stackpole, Mary, Zeng, Weihua, Li, Shuo, Liu, Chun-Chi, Zhou, Yonggang, He, Shanshan, Yeh, Angela, Wang, Ziye, Sun, Fengzhu, Li, Qingjiao, Yuan, Zuyang, Yildirim, Asli, Chen, Pin Jung, Winograd, Paul, Li, Shize, Noor, Zorawar, Garon, Edward, French, Samuel, Magyar, Clara, Dry, Sarah, Lajonchere, Clara, Geschwind, Daniel, Choi, Gina, Saab, Sammy, Alber, Frank, Wong, Wing Hung, Dubinett, Steven, Aberle, Denise, Agopian, Vatche, Han, Steven-Huy, Ni, Xiaohui, Li, Wenyuan, Zhou, Xianghong Jasmine
Format: Article
Language: eng
Online access: Full text
Description
Summary: Early cancer detection by cell-free DNA (cfDNA) faces multiple challenges: the low fraction of tumor DNA in cfDNA, the molecular heterogeneity of cancer, and sample sizes that are too small to reflect the heterogeneous patient population. We have developed an integrated cancer detection system, CancerRadar, that addresses all three challenges. It consists of (1) a cost-effective experimental assay, cfMethyl-Seq, for genome-wide methylation profiling of cfDNA, which provides >12-fold enrichment over whole-genome bisulfite sequencing (WGBS) in CpG islands; and (2) a computational platform that extracts information from cfMethyl-Seq data and diagnoses the patient. The platform derives cfDNA methylation, cfDNA fragment sizes, copy number variations (CNV), and microbial composition from the raw cfMethyl-Seq data, and performs multi-feature ensemble learning. We demonstrate the power of CancerRadar by detecting and locating cancer in a cohort of 275 colon, liver, lung, and stomach cancer patients and 204 non-cancer individuals. For cancer detection, we achieve a sensitivity of 85.6%±6.7% across all stages and 80.6%±9.1% for early stages (I and II), with a specificity of 99% in both cases. These metrics are derived using leave-one-out cross-validation. In independent validation on a reserved subsample, CancerRadar achieves a sensitivity of 89.1%±11.3% across all stages and 85.7%±14.2% for early stages, with a specificity of 97% (one false positive). For locating a tumor's tissue of origin (TOO), CancerRadar achieved an accuracy of 91.5%±5.0% for all stages and 89.1%±7.3% for early stages on an independent subsample. This study is the first to integrate cfDNA methylation, cfDNA fragment size, CNV, and microbial composition analyses for cancer detection on the same patient cohort. cfDNA methylation was the most useful feature category for detecting cancer, but including features from other categories significantly increased performance, especially for early-stage cancer.
In contrast, with respect to TOO prediction, methylation-derived features were overwhelmingly important while including other features did not further improve performance. To fully exploit the power of cfDNA methylation, we identified four types of methylation markers with different characteristics. We have also improved our previous read-level deconvolution algorithm to more accurately identify trace tumor signals. Finally, our data show that as training sample sizes increase, the detection power of CancerRadar cont
ISSN: 0008-5472, 1538-7445
DOI: 10.1158/1538-7445.AM2021-24
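
Illustrative note: the summary reports sensitivity and specificity for a classifier that combines four feature categories. The abstract does not say how the per-category predictions are combined, so the sketch below is only a minimal illustration under stated assumptions: each category is assumed to yield a cancer probability, the ensemble is assumed to be a plain average, and the cutoff of 0.5 is arbitrary. The function names, the feature keys, and the four toy samples are all hypothetical and are not taken from the study.

```python
from statistics import mean

# Hypothetical keys, named after the four feature categories in the summary.
FEATURES = ("methylation", "fragment_size", "cnv", "microbial")


def ensemble_score(scores):
    """Combine per-category cancer probabilities by simple averaging.

    Averaging is an assumption made for illustration; the abstract does not
    describe the actual ensemble rule.
    """
    return mean(scores[name] for name in FEATURES)


def sensitivity_specificity(labels, scores, cutoff):
    """Return (sensitivity, specificity) for binary labels (1 = cancer).

    A sample is called cancer when its score is at or above the cutoff.
    """
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= cutoff)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < cutoff)
    positives = sum(labels)
    negatives = len(labels) - positives
    return tp / positives, tn / negatives


# Toy, made-up samples: (label, per-category probabilities).
samples = [
    (1, {"methylation": 0.9, "fragment_size": 0.7, "cnv": 0.6, "microbial": 0.6}),
    (1, {"methylation": 0.4, "fragment_size": 0.5, "cnv": 0.3, "microbial": 0.4}),
    (0, {"methylation": 0.1, "fragment_size": 0.3, "cnv": 0.2, "microbial": 0.2}),
    (0, {"methylation": 0.2, "fragment_size": 0.2, "cnv": 0.1, "microbial": 0.3}),
]

labels = [label for label, _ in samples]
combined = [ensemble_score(scores) for _, scores in samples]
sens, spec = sensitivity_specificity(labels, combined, cutoff=0.5)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

In practice the cutoff would be chosen on training data to reach a target specificity (the summary reports 99% under leave-one-out cross-validation), and sensitivity would then be read off at that cutoff.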