Optimising predictive models to prioritise viral discovery in zoonotic reservoirs

Despite the global investment in One Health disease surveillance, it remains difficult and costly to identify and monitor the wildlife reservoirs of novel zoonotic viruses. Statistical models can guide sampling target prioritisation, but the predictions from any given model might be highly uncertain...

Bibliographic details
Published in: The Lancet. Microbe 2022-08, Vol.3 (8), p.e625-e637
Main authors: Becker, Daniel J, Albery, Gregory F, Sjodin, Anna R, Poisot, Timothée, Bergner, Laura M, Chen, Binqi, Cohen, Lily E, Dallas, Tad A, Eskew, Evan A, Fagre, Anna C, Farrell, Maxwell J, Guth, Sarah, Han, Barbara A, Simmons, Nancy B, Stock, Michiel, Teeling, Emma C, Carlson, Colin J
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page e637
container_issue 8
container_start_page e625
container_title The Lancet. Microbe
container_volume 3
creator Becker, Daniel J
Albery, Gregory F
Sjodin, Anna R
Poisot, Timothée
Bergner, Laura M
Chen, Binqi
Cohen, Lily E
Dallas, Tad A
Eskew, Evan A
Fagre, Anna C
Farrell, Maxwell J
Guth, Sarah
Han, Barbara A
Simmons, Nancy B
Stock, Michiel
Teeling, Emma C
Carlson, Colin J
description Despite the global investment in One Health disease surveillance, it remains difficult and costly to identify and monitor the wildlife reservoirs of novel zoonotic viruses. Statistical models can guide sampling target prioritisation, but the predictions from any given model might be highly uncertain; moreover, systematic model validation is rare, and the drivers of model performance are consequently under-documented. Here, we use the bat hosts of betacoronaviruses as a case study for the data-driven process of comparing and validating predictive models of probable reservoir hosts. In early 2020, we generated an ensemble of eight statistical models that predicted host–virus associations and developed priority sampling recommendations for potential bat reservoirs of betacoronaviruses and bridge hosts for SARS-CoV-2. During a time frame of more than a year, we tracked the discovery of 47 new bat hosts of betacoronaviruses, validated the initial predictions, and dynamically updated our analytical pipeline. We found that ecological trait-based models performed well at predicting these novel hosts, whereas network methods consistently performed approximately as well or worse than expected at random. These findings illustrate the importance of ensemble modelling as a buffer against mixed-model quality and highlight the value of including host ecology in predictive models. Our revised models showed an improved performance compared with the initial ensemble, and predicted more than 400 bat species globally that could be undetected betacoronavirus hosts. We show, through systematic validation, that machine learning models can help to optimise wildlife sampling for undiscovered viruses and illustrate how such approaches are best implemented through a dynamic process of prediction, data collection, validation, and updating.
doi_str_mv 10.1016/S2666-5247(21)00245-7
format Article
fullrecord <record><control><sourceid>pubmed_cross</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_8747432</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S2666524721002457</els_id><sourcerecordid>35036970</sourcerecordid><originalsourceid>FETCH-LOGICAL-c467t-1093ddc4c28585c492513887ab7b0024134d5fb429e63fb0976b80a8f1026cc13</originalsourceid><addsrcrecordid>eNqFkF1LwzAYhYMobsz9BKWXelFN0ny0N4oMv2AwRL0ObZrOV9ZmJLEwf73ZpmNeeZXw5pzznjwInRJ8STARVy9UCJFyyuQ5JRcYU8ZTeYCGu_Hh3n2Axt5_4KjihBLOj9Eg4zgThcRD9DxbBmjBQzdPls7UoAP0JmltbRY-CTYOwToI4E3SgysXSQ1e2964VQJd8mVtZwPoxBlvXG_B-RN01JQLb8Y_5wi93d-9Th7T6ezhaXI7TTUTMqQEF1lda6ZpznOuWRHbZXkuy0pW6w-RjNW8qRgtjMiaChdSVDku84ZgKrQm2Qhdb3OXn1Vram26EOup2Lct3UrZEtTflw7e1dz2KpdMsozGAL4N0M5670yz8xKs1pjVBrNaM1SUqA1mJaPvbH_xzvULNQputoKI0PRgnPIaTKcjXWd0ULWFf1Z8AxoEjtE</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Optimising predictive models to prioritise viral discovery in zoonotic reservoirs</title><source>MEDLINE</source><source>DOAJ Directory of Open Access Journals</source><source>Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals</source><source>Alma/SFX Local Collection</source><creator>Becker, Daniel J ; Albery, Gregory F ; Sjodin, Anna R ; Poisot, Timothée ; Bergner, Laura M ; Chen, Binqi ; Cohen, Lily E ; Dallas, Tad A ; Eskew, Evan A ; Fagre, Anna C ; Farrell, Maxwell J ; Guth, Sarah ; Han, Barbara A ; Simmons, Nancy B ; Stock, Michiel ; Teeling, Emma C ; Carlson, Colin J</creator><creatorcontrib>Becker, Daniel J ; Albery, Gregory F ; Sjodin, Anna R ; Poisot, Timothée ; Bergner, Laura M ; Chen, Binqi ; Cohen, Lily E ; Dallas, Tad A ; Eskew, Evan A ; Fagre, Anna C ; Farrell, Maxwell J ; Guth, Sarah ; Han, Barbara A ; Simmons, Nancy B ; Stock, Michiel ; Teeling, Emma C ; Carlson, Colin J</creatorcontrib><description>Despite the global investment in 
One Health disease surveillance, it remains difficult and costly to identify and monitor the wildlife reservoirs of novel zoonotic viruses. Statistical models can guide sampling target prioritisation, but the predictions from any given model might be highly uncertain; moreover, systematic model validation is rare, and the drivers of model performance are consequently under-documented. Here, we use the bat hosts of betacoronaviruses as a case study for the data-driven process of comparing and validating predictive models of probable reservoir hosts. In early 2020, we generated an ensemble of eight statistical models that predicted host–virus associations and developed priority sampling recommendations for potential bat reservoirs of betacoronaviruses and bridge hosts for SARS-CoV-2. During a time frame of more than a year, we tracked the discovery of 47 new bat hosts of betacoronaviruses, validated the initial predictions, and dynamically updated our analytical pipeline. We found that ecological trait-based models performed well at predicting these novel hosts, whereas network methods consistently performed approximately as well or worse than expected at random. These findings illustrate the importance of ensemble modelling as a buffer against mixed-model quality and highlight the value of including host ecology in predictive models. Our revised models showed an improved performance compared with the initial ensemble, and predicted more than 400 bat species globally that could be undetected betacoronavirus hosts. 
We show, through systematic validation, that machine learning models can help to optimise wildlife sampling for undiscovered viruses and illustrates how such approaches are best implemented through a dynamic process of prediction, data collection, validation, and updating.</description><identifier>ISSN: 2666-5247</identifier><identifier>EISSN: 2666-5247</identifier><identifier>DOI: 10.1016/S2666-5247(21)00245-7</identifier><identifier>PMID: 35036970</identifier><language>eng</language><publisher>England: Elsevier Ltd</publisher><subject>Animals ; Chiroptera ; COVID-19 - epidemiology ; Phylogeny ; Review ; SARS-CoV-2 ; Viruses</subject><ispartof>The Lancet. Microbe, 2022-08, Vol.3 (8), p.e625-e637</ispartof><rights>2022 The Author(s). Published by Elsevier Ltd. This is an Open Access article under the CC BY 4.0 license</rights><rights>2022 The Author(s). Published by Elsevier Ltd. This is an Open Access article under the CC BY 4.0 license.</rights><rights>2022 The Author(s). Published by Elsevier Ltd. 
This is an Open Access article under the CC BY 4.0 license 2022</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c467t-1093ddc4c28585c492513887ab7b0024134d5fb429e63fb0976b80a8f1026cc13</citedby><cites>FETCH-LOGICAL-c467t-1093ddc4c28585c492513887ab7b0024134d5fb429e63fb0976b80a8f1026cc13</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>230,314,776,780,860,881,27901,27902</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/35036970$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Becker, Daniel J</creatorcontrib><creatorcontrib>Albery, Gregory F</creatorcontrib><creatorcontrib>Sjodin, Anna R</creatorcontrib><creatorcontrib>Poisot, Timothée</creatorcontrib><creatorcontrib>Bergner, Laura M</creatorcontrib><creatorcontrib>Chen, Binqi</creatorcontrib><creatorcontrib>Cohen, Lily E</creatorcontrib><creatorcontrib>Dallas, Tad A</creatorcontrib><creatorcontrib>Eskew, Evan A</creatorcontrib><creatorcontrib>Fagre, Anna C</creatorcontrib><creatorcontrib>Farrell, Maxwell J</creatorcontrib><creatorcontrib>Guth, Sarah</creatorcontrib><creatorcontrib>Han, Barbara A</creatorcontrib><creatorcontrib>Simmons, Nancy B</creatorcontrib><creatorcontrib>Stock, Michiel</creatorcontrib><creatorcontrib>Teeling, Emma C</creatorcontrib><creatorcontrib>Carlson, Colin J</creatorcontrib><title>Optimising predictive models to prioritise viral discovery in zoonotic reservoirs</title><title>The Lancet. Microbe</title><addtitle>Lancet Microbe</addtitle><description>Despite the global investment in One Health disease surveillance, it remains difficult and costly to identify and monitor the wildlife reservoirs of novel zoonotic viruses. 
Statistical models can guide sampling target prioritisation, but the predictions from any given model might be highly uncertain; moreover, systematic model validation is rare, and the drivers of model performance are consequently under-documented. Here, we use the bat hosts of betacoronaviruses as a case study for the data-driven process of comparing and validating predictive models of probable reservoir hosts. In early 2020, we generated an ensemble of eight statistical models that predicted host–virus associations and developed priority sampling recommendations for potential bat reservoirs of betacoronaviruses and bridge hosts for SARS-CoV-2. During a time frame of more than a year, we tracked the discovery of 47 new bat hosts of betacoronaviruses, validated the initial predictions, and dynamically updated our analytical pipeline. We found that ecological trait-based models performed well at predicting these novel hosts, whereas network methods consistently performed approximately as well or worse than expected at random. These findings illustrate the importance of ensemble modelling as a buffer against mixed-model quality and highlight the value of including host ecology in predictive models. Our revised models showed an improved performance compared with the initial ensemble, and predicted more than 400 bat species globally that could be undetected betacoronavirus hosts. 
We show, through systematic validation, that machine learning models can help to optimise wildlife sampling for undiscovered viruses and illustrates how such approaches are best implemented through a dynamic process of prediction, data collection, validation, and updating.</description><subject>Animals</subject><subject>Chiroptera</subject><subject>COVID-19 - epidemiology</subject><subject>Phylogeny</subject><subject>Review</subject><subject>SARS-CoV-2</subject><subject>Viruses</subject><issn>2666-5247</issn><issn>2666-5247</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNqFkF1LwzAYhYMobsz9BKWXelFN0ny0N4oMv2AwRL0ObZrOV9ZmJLEwf73ZpmNeeZXw5pzznjwInRJ8STARVy9UCJFyyuQ5JRcYU8ZTeYCGu_Hh3n2Axt5_4KjihBLOj9Eg4zgThcRD9DxbBmjBQzdPls7UoAP0JmltbRY-CTYOwToI4E3SgysXSQ1e2964VQJd8mVtZwPoxBlvXG_B-RN01JQLb8Y_5wi93d-9Th7T6ezhaXI7TTUTMqQEF1lda6ZpznOuWRHbZXkuy0pW6w-RjNW8qRgtjMiaChdSVDku84ZgKrQm2Qhdb3OXn1Vram26EOup2Lct3UrZEtTflw7e1dz2KpdMsozGAL4N0M5670yz8xKs1pjVBrNaM1SUqA1mJaPvbH_xzvULNQputoKI0PRgnPIaTKcjXWd0ULWFf1Z8AxoEjtE</recordid><startdate>20220801</startdate><enddate>20220801</enddate><creator>Becker, Daniel J</creator><creator>Albery, Gregory F</creator><creator>Sjodin, Anna R</creator><creator>Poisot, Timothée</creator><creator>Bergner, Laura M</creator><creator>Chen, Binqi</creator><creator>Cohen, Lily E</creator><creator>Dallas, Tad A</creator><creator>Eskew, Evan A</creator><creator>Fagre, Anna C</creator><creator>Farrell, Maxwell J</creator><creator>Guth, Sarah</creator><creator>Han, Barbara A</creator><creator>Simmons, Nancy B</creator><creator>Stock, Michiel</creator><creator>Teeling, Emma C</creator><creator>Carlson, Colin J</creator><general>Elsevier Ltd</general><general>The Authors. 
Published by Elsevier Ltd</general><scope>6I.</scope><scope>AAFTH</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>5PM</scope></search><sort><creationdate>20220801</creationdate><title>Optimising predictive models to prioritise viral discovery in zoonotic reservoirs</title><author>Becker, Daniel J ; Albery, Gregory F ; Sjodin, Anna R ; Poisot, Timothée ; Bergner, Laura M ; Chen, Binqi ; Cohen, Lily E ; Dallas, Tad A ; Eskew, Evan A ; Fagre, Anna C ; Farrell, Maxwell J ; Guth, Sarah ; Han, Barbara A ; Simmons, Nancy B ; Stock, Michiel ; Teeling, Emma C ; Carlson, Colin J</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c467t-1093ddc4c28585c492513887ab7b0024134d5fb429e63fb0976b80a8f1026cc13</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Animals</topic><topic>Chiroptera</topic><topic>COVID-19 - epidemiology</topic><topic>Phylogeny</topic><topic>Review</topic><topic>SARS-CoV-2</topic><topic>Viruses</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Becker, Daniel J</creatorcontrib><creatorcontrib>Albery, Gregory F</creatorcontrib><creatorcontrib>Sjodin, Anna R</creatorcontrib><creatorcontrib>Poisot, Timothée</creatorcontrib><creatorcontrib>Bergner, Laura M</creatorcontrib><creatorcontrib>Chen, Binqi</creatorcontrib><creatorcontrib>Cohen, Lily E</creatorcontrib><creatorcontrib>Dallas, Tad A</creatorcontrib><creatorcontrib>Eskew, Evan A</creatorcontrib><creatorcontrib>Fagre, Anna C</creatorcontrib><creatorcontrib>Farrell, Maxwell J</creatorcontrib><creatorcontrib>Guth, Sarah</creatorcontrib><creatorcontrib>Han, Barbara A</creatorcontrib><creatorcontrib>Simmons, Nancy B</creatorcontrib><creatorcontrib>Stock, Michiel</creatorcontrib><creatorcontrib>Teeling, Emma 
C</creatorcontrib><creatorcontrib>Carlson, Colin J</creatorcontrib><collection>ScienceDirect Open Access Titles</collection><collection>Elsevier:ScienceDirect:Open Access</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>The Lancet. Microbe</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Becker, Daniel J</au><au>Albery, Gregory F</au><au>Sjodin, Anna R</au><au>Poisot, Timothée</au><au>Bergner, Laura M</au><au>Chen, Binqi</au><au>Cohen, Lily E</au><au>Dallas, Tad A</au><au>Eskew, Evan A</au><au>Fagre, Anna C</au><au>Farrell, Maxwell J</au><au>Guth, Sarah</au><au>Han, Barbara A</au><au>Simmons, Nancy B</au><au>Stock, Michiel</au><au>Teeling, Emma C</au><au>Carlson, Colin J</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Optimising predictive models to prioritise viral discovery in zoonotic reservoirs</atitle><jtitle>The Lancet. Microbe</jtitle><addtitle>Lancet Microbe</addtitle><date>2022-08-01</date><risdate>2022</risdate><volume>3</volume><issue>8</issue><spage>e625</spage><epage>e637</epage><pages>e625-e637</pages><issn>2666-5247</issn><eissn>2666-5247</eissn><abstract>Despite the global investment in One Health disease surveillance, it remains difficult and costly to identify and monitor the wildlife reservoirs of novel zoonotic viruses. Statistical models can guide sampling target prioritisation, but the predictions from any given model might be highly uncertain; moreover, systematic model validation is rare, and the drivers of model performance are consequently under-documented. 
Here, we use the bat hosts of betacoronaviruses as a case study for the data-driven process of comparing and validating predictive models of probable reservoir hosts. In early 2020, we generated an ensemble of eight statistical models that predicted host–virus associations and developed priority sampling recommendations for potential bat reservoirs of betacoronaviruses and bridge hosts for SARS-CoV-2. During a time frame of more than a year, we tracked the discovery of 47 new bat hosts of betacoronaviruses, validated the initial predictions, and dynamically updated our analytical pipeline. We found that ecological trait-based models performed well at predicting these novel hosts, whereas network methods consistently performed approximately as well or worse than expected at random. These findings illustrate the importance of ensemble modelling as a buffer against mixed-model quality and highlight the value of including host ecology in predictive models. Our revised models showed an improved performance compared with the initial ensemble, and predicted more than 400 bat species globally that could be undetected betacoronavirus hosts. We show, through systematic validation, that machine learning models can help to optimise wildlife sampling for undiscovered viruses and illustrates how such approaches are best implemented through a dynamic process of prediction, data collection, validation, and updating.</abstract><cop>England</cop><pub>Elsevier Ltd</pub><pmid>35036970</pmid><doi>10.1016/S2666-5247(21)00245-7</doi><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 2666-5247
ispartof The Lancet. Microbe, 2022-08, Vol.3 (8), p.e625-e637
issn 2666-5247
2666-5247
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_8747432
source MEDLINE; DOAJ Directory of Open Access Journals; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals; Alma/SFX Local Collection
subjects Animals
Chiroptera
COVID-19 - epidemiology
Phylogeny
Review
SARS-CoV-2
Viruses
title Optimising predictive models to prioritise viral discovery in zoonotic reservoirs
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-21T19%3A28%3A09IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-pubmed_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Optimising%20predictive%20models%20to%20prioritise%20viral%20discovery%20in%20zoonotic%20reservoirs&rft.jtitle=The%20Lancet.%20Microbe&rft.au=Becker,%20Daniel%20J&rft.date=2022-08-01&rft.volume=3&rft.issue=8&rft.spage=e625&rft.epage=e637&rft.pages=e625-e637&rft.issn=2666-5247&rft.eissn=2666-5247&rft_id=info:doi/10.1016/S2666-5247(21)00245-7&rft_dat=%3Cpubmed_cross%3E35036970%3C/pubmed_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/35036970&rft_els_id=S2666524721002457&rfr_iscdi=true