eDNAssay: A machine learning tool that accurately predicts qPCR cross‐amplification

Bibliographic details
Published in: Molecular Ecology Resources, 2022-11, Vol. 22 (8), p. 2994-3005
Main authors: Kronenberger, John A., Wilcox, Taylor M., Mason, Daniel H., Franklin, Thomas W., McKelvey, Kevin S., Young, Michael K., Schwartz, Michael K.
Format: Article
Language: English
Description
Summary: Environmental DNA (eDNA) sampling is a highly sensitive and cost‐effective technique for wildlife monitoring, notably through the use of qPCR assays. However, it can be difficult to ensure assay specificity when many closely related species co‐occur. In theory, specificity may be assessed in silico by determining whether assay oligonucleotides have enough base‐pair mismatches with nontarget sequences to preclude amplification. However, the mismatch qualities required are poorly understood, making in silico assessments difficult and often necessitating extensive in vitro testing, which is typically the greatest bottleneck in assay development. Increasing the accuracy of in silico assessments would therefore streamline the assay development process. In this study, we paired 10 qPCR assays with 82 synthetic gene fragments for 530 specificity tests using SYBR Green intercalating dye (n = 262) and TaqMan hydrolysis probes (n = 268). Test results were used to train random forest classifiers to predict amplification. The primer‐only model (SYBR Green results) and full‐assay model (TaqMan probe‐based results) were 99.6% and 100% accurate, respectively, in cross‐validation. We further assessed model performance using six independent assays not used in model training. In these tests, the primer‐only model was 92.4% accurate (n = 119) and the full‐assay model was 96.5% accurate (n = 144). The high performance achieved by these models makes it possible for eDNA practitioners to more quickly and confidently develop assays specific to the intended target. Practitioners can access the full‐assay model online via eDNAssay (https://NationalGenomicsCenter.shinyapps.io/eDNAssay), a user‐friendly tool for predicting qPCR cross‐amplification.
ISSN: 1755-098X, 1755-0998
DOI: 10.1111/1755-0998.13681
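
The summary describes in silico specificity assessment as checking whether assay oligonucleotides have enough base-pair mismatches with nontarget sequences. As a rough illustration of that basic idea only, the Python sketch below counts mismatches between a primer and a binding site that has already been aligned to it. It is not the eDNAssay model or the authors' feature set: the function names, the 3'-end window of 5 bases, and both sequences are assumptions made up for this example. The record itself notes that the mismatch qualities required to prevent amplification are poorly understood, so raw counts like these are at best inputs to a trained classifier, not a decision rule.

```python
# Illustrative only: position-wise mismatch counting between an oligo and an
# already-aligned binding site. This is NOT the eDNAssay random forest model.


def count_mismatches(oligo: str, site: str) -> int:
    """Count positions where two equal-length, pre-aligned sequences differ."""
    if len(oligo) != len(site):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(oligo.upper(), site.upper()))


def mismatches_near_3prime(oligo: str, site: str, window: int = 5) -> int:
    """Count mismatches within the last `window` bases (the oligo's 3' end).

    The window size is an arbitrary assumption chosen for this sketch.
    """
    return count_mismatches(oligo[-window:], site[-window:])


# Hypothetical sequences, both written 5'->3' in the primer's orientation.
primer = "ACGTTGCAAGCTTACGGATC"
nontarget_site = "ACGTTGCTAGCTTACGGTTC"

total = count_mismatches(primer, nontarget_site)
near_end = mismatches_near_3prime(primer, nontarget_site)
print(f"total mismatches: {total}, within 5 bases of the 3' end: {near_end}")
```

Counts of this kind could be tabulated per primer and probe for each nontarget sequence before any model is applied; for actual predictions, the trained full-assay model is available through the eDNAssay tool linked in the summary.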