AB_SA: Accessory genes-Based Source Attribution – tracing the source of Salmonella enterica Typhimurium environmental strains

The partitioning of pathogenic strains isolated in environmental or human cases to their sources is challenging. The pathogens usually colonize multiple animal hosts, including livestock, which contaminate the food-production chain and the environment (e.g. soil and water), posing an additional publ...

Bibliographic details
Published in: Microbial genomics 2020-07, Vol.6 (7)
Main authors: Guillier, Laurent, Gourmelon, Michèle, Lozach, Solen, Cadel-Six, Sabrina, Vignaud, Marie-Léone, Munck, Nanna, Hald, Tine, Palma, Federica
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page
container_issue 7
container_start_page
container_title Microbial genomics
container_volume 6
creator Guillier, Laurent
Gourmelon, Michèle
Lozach, Solen
Cadel-Six, Sabrina
Vignaud, Marie-Léone
Munck, Nanna
Hald, Tine
Palma, Federica
description The partitioning of pathogenic strains isolated in environmental or human cases to their sources is challenging. The pathogens usually colonize multiple animal hosts, including livestock, which contaminate the food-production chain and the environment (e.g. soil and water), posing an additional public-health burden and major challenges in the identification of the source. Genomic data opens up new opportunities for the development of statistical models aiming to indicate the likely source of pathogen contamination. Here, we propose a computationally fast and efficient multinomial logistic regression source-attribution classifier to predict the animal source of bacterial isolates based on ‘source-enriched’ loci extracted from the accessory-genome profiles of a pangenomic dataset. Depending on the accuracy of the model’s self-attribution step, the modeller selects the number of candidate accessory genes that best fit the model for calculating the likelihood of (source) category membership. The Accessory genes-Based Source Attribution (AB_SA) method was applied to a dataset of strains of Salmonella enterica Typhimurium and its monophasic variant (S. enterica 1,4,[5],12:i:-). The model was trained on 69 strains with known animal-source categories (i.e. poultry, ruminant and pig). The AB_SA method helped to identify 8 genes as predictors among the 2802 accessory genes. The self-attribution accuracy was 80 %. The AB_SA model was then able to classify 25 of the 29 S. enterica Typhimurium and S. enterica 1,4,[5],12:i:- isolates collected from the environment (considered to be of unknown source) into a specific category (i.e. animal source), with more than 85 % of probability. The AB_SA method herein described provides a user-friendly and valuable tool for performing source-attribution studies in only a few steps. AB_SA is written in R and freely available at https://github.com/lguillier/AB_SA.
doi_str_mv 10.1099/mgen.0.000366
format Article
fullrecord <record><control><sourceid>hal</sourceid><recordid>TN_cdi_hal_primary_oai_HAL_hal_04669875v1</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>oai_HAL_hal_04669875v1</sourcerecordid><originalsourceid>FETCH-hal_primary_oai_HAL_hal_04669875v13</originalsourceid><addsrcrecordid>eNqVjL1OwzAUhS0EEhV0ZL8rQ4LTYidhcxGoA1u6W7fBbYz8U9lOpUzwDrwhT4JRGViZztE5nz5CbipaVrRt7-xeuZKWlNIl52dktqCsLljDmvM__ZLMY3zLTMUa3tZsRt7FSnbiAUTfqxh9mCB7VCxWGNUrdH4MvQKRUtDbMWnv4OvjE1LAXrs9pEFBPCF-Bx0a650yBkG5pILuETbTYdB2DHq0eTzq4J3NJxqIWaJdvCYXOzRRzX_zitw-P20e18WARh6Cthgm6VHLtXiRPxu957xtanaslv9hvwHZaVyW</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>AB_SA: Accessory genes-Based Source Attribution – tracing the source of Salmonella enterica Typhimurium environmental strains</title><source>TestCollectionTL3OpenAccess</source><source>EZB-FREE-00999 freely available EZB journals</source><source>PubMed Central</source><creator>Guillier, Laurent ; Gourmelon, Michèle ; Lozach, Solen ; Cadel-Six, Sabrina ; Vignaud, Marie-Léone ; Munck, Nanna ; Hald, Tine ; Palma, Federica</creator><creatorcontrib>Guillier, Laurent ; Gourmelon, Michèle ; Lozach, Solen ; Cadel-Six, Sabrina ; Vignaud, Marie-Léone ; Munck, Nanna ; Hald, Tine ; Palma, Federica</creatorcontrib><description>The partitioning of pathogenic strains isolated in environmental or human cases to their sources is challenging. The pathogens usually colonize multiple animal hosts, including livestock, which contaminate the food-production chain and the environment (e.g. soil and water), posing an additional public-health burden and major challenges in the identification of the source. Genomic data opens up new opportunities for the development of statistical models aiming to indicate the likely source of pathogen contamination. 
Here, we propose a computationally fast and efficient multinomial logistic regression source-attribution classifier to predict the animal source of bacterial isolates based on ‘source-enriched’ loci extracted from the accessory-genome profiles of a pangenomic dataset. Depending on the accuracy of the model’s self-attribution step, the modeller selects the number of candidate accessory genes that best fit the model for calculating the likelihood of (source) category membership. The Accessory genes-Based Source Attribution (AB_SA) method was applied to a dataset of strains of Salmonella enterica Typhimurium and its monophasic variant ( S . enterica 1,4,[5],12:i:-). The model was trained on 69 strains with known animal-source categories (i.e. poultry, ruminant and pig). The AB_SA method helped to identify 8 genes as predictors among the 2802 accessory genes. The self-attribution accuracy was 80 %. The AB_SA model was then able to classify 25 of the 29 S . enterica Typhimurium and S . enterica 1,4,[5],12:i:- isolates collected from the environment (considered to be of unknown source) into a specific category (i.e. animal source), with more than 85 % of probability. The AB_SA method herein described provides a user-friendly and valuable tool for performing source-attribution studies in only a few steps. 
AB_SA is written in R and freely available at https://github.com/lguillier/AB_SA.</description><identifier>ISSN: 2057-5858</identifier><identifier>EISSN: 2057-5858</identifier><identifier>DOI: 10.1099/mgen.0.000366</identifier><language>eng</language><publisher>Society for General Microbiology</publisher><subject>Life Sciences</subject><ispartof>Microbial genomics, 2020-07, Vol.6 (7)</ispartof><rights>Distributed under a Creative Commons Attribution 4.0 International License</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><orcidid>0000-0003-1654-0960 ; 0000-0002-7865-961X ; 0000-0002-7867-2937 ; 0000-0002-1115-9792 ; 0000-0001-8880-166X ; 0000-0003-1133-6282 ; 0000-0001-5291-2181 ; 0000-0001-8880-166X ; 0000-0002-7865-961X ; 0000-0002-7867-2937 ; 0000-0003-1654-0960 ; 0000-0003-1133-6282 ; 0000-0002-1115-9792 ; 0000-0001-5291-2181</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>230,314,780,784,864,885,27924,27925</link.rule.ids><backlink>$$Uhttps://hal.science/hal-04669875$$DView record in HAL$$Hfree_for_read</backlink></links><search><creatorcontrib>Guillier, Laurent</creatorcontrib><creatorcontrib>Gourmelon, Michèle</creatorcontrib><creatorcontrib>Lozach, Solen</creatorcontrib><creatorcontrib>Cadel-Six, Sabrina</creatorcontrib><creatorcontrib>Vignaud, Marie-Léone</creatorcontrib><creatorcontrib>Munck, Nanna</creatorcontrib><creatorcontrib>Hald, Tine</creatorcontrib><creatorcontrib>Palma, Federica</creatorcontrib><title>AB_SA: Accessory genes-Based Source Attribution – tracing the source of Salmonella enterica Typhimurium environmental strains</title><title>Microbial genomics</title><description>The partitioning of pathogenic strains isolated in environmental or human cases to their sources is challenging. 
The pathogens usually colonize multiple animal hosts, including livestock, which contaminate the food-production chain and the environment (e.g. soil and water), posing an additional public-health burden and major challenges in the identification of the source. Genomic data opens up new opportunities for the development of statistical models aiming to indicate the likely source of pathogen contamination. Here, we propose a computationally fast and efficient multinomial logistic regression source-attribution classifier to predict the animal source of bacterial isolates based on ‘source-enriched’ loci extracted from the accessory-genome profiles of a pangenomic dataset. Depending on the accuracy of the model’s self-attribution step, the modeller selects the number of candidate accessory genes that best fit the model for calculating the likelihood of (source) category membership. The Accessory genes-Based Source Attribution (AB_SA) method was applied to a dataset of strains of Salmonella enterica Typhimurium and its monophasic variant ( S . enterica 1,4,[5],12:i:-). The model was trained on 69 strains with known animal-source categories (i.e. poultry, ruminant and pig). The AB_SA method helped to identify 8 genes as predictors among the 2802 accessory genes. The self-attribution accuracy was 80 %. The AB_SA model was then able to classify 25 of the 29 S . enterica Typhimurium and S . enterica 1,4,[5],12:i:- isolates collected from the environment (considered to be of unknown source) into a specific category (i.e. animal source), with more than 85 % of probability. The AB_SA method herein described provides a user-friendly and valuable tool for performing source-attribution studies in only a few steps. 
AB_SA is written in R and freely available at https://github.com/lguillier/AB_SA.</description><subject>Life Sciences</subject><issn>2057-5858</issn><issn>2057-5858</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><recordid>eNqVjL1OwzAUhS0EEhV0ZL8rQ4LTYidhcxGoA1u6W7fBbYz8U9lOpUzwDrwhT4JRGViZztE5nz5CbipaVrRt7-xeuZKWlNIl52dktqCsLljDmvM__ZLMY3zLTMUa3tZsRt7FSnbiAUTfqxh9mCB7VCxWGNUrdH4MvQKRUtDbMWnv4OvjE1LAXrs9pEFBPCF-Bx0a650yBkG5pILuETbTYdB2DHq0eTzq4J3NJxqIWaJdvCYXOzRRzX_zitw-P20e18WARh6Cthgm6VHLtXiRPxu957xtanaslv9hvwHZaVyW</recordid><startdate>20200701</startdate><enddate>20200701</enddate><creator>Guillier, Laurent</creator><creator>Gourmelon, Michèle</creator><creator>Lozach, Solen</creator><creator>Cadel-Six, Sabrina</creator><creator>Vignaud, Marie-Léone</creator><creator>Munck, Nanna</creator><creator>Hald, Tine</creator><creator>Palma, Federica</creator><general>Society for General Microbiology</general><scope>1XC</scope><orcidid>https://orcid.org/0000-0003-1654-0960</orcidid><orcidid>https://orcid.org/0000-0002-7865-961X</orcidid><orcidid>https://orcid.org/0000-0002-7867-2937</orcidid><orcidid>https://orcid.org/0000-0002-1115-9792</orcidid><orcidid>https://orcid.org/0000-0001-8880-166X</orcidid><orcidid>https://orcid.org/0000-0003-1133-6282</orcidid><orcidid>https://orcid.org/0000-0001-5291-2181</orcidid><orcidid>https://orcid.org/0000-0001-8880-166X</orcidid><orcidid>https://orcid.org/0000-0002-7865-961X</orcidid><orcidid>https://orcid.org/0000-0002-7867-2937</orcidid><orcidid>https://orcid.org/0000-0003-1654-0960</orcidid><orcidid>https://orcid.org/0000-0003-1133-6282</orcidid><orcidid>https://orcid.org/0000-0002-1115-9792</orcidid><orcidid>https://orcid.org/0000-0001-5291-2181</orcidid></search><sort><creationdate>20200701</creationdate><title>AB_SA: Accessory genes-Based Source Attribution – tracing the source of Salmonella enterica Typhimurium environmental 
strains</title><author>Guillier, Laurent ; Gourmelon, Michèle ; Lozach, Solen ; Cadel-Six, Sabrina ; Vignaud, Marie-Léone ; Munck, Nanna ; Hald, Tine ; Palma, Federica</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-hal_primary_oai_HAL_hal_04669875v13</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>Life Sciences</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Guillier, Laurent</creatorcontrib><creatorcontrib>Gourmelon, Michèle</creatorcontrib><creatorcontrib>Lozach, Solen</creatorcontrib><creatorcontrib>Cadel-Six, Sabrina</creatorcontrib><creatorcontrib>Vignaud, Marie-Léone</creatorcontrib><creatorcontrib>Munck, Nanna</creatorcontrib><creatorcontrib>Hald, Tine</creatorcontrib><creatorcontrib>Palma, Federica</creatorcontrib><collection>Hyper Article en Ligne (HAL)</collection><jtitle>Microbial genomics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Guillier, Laurent</au><au>Gourmelon, Michèle</au><au>Lozach, Solen</au><au>Cadel-Six, Sabrina</au><au>Vignaud, Marie-Léone</au><au>Munck, Nanna</au><au>Hald, Tine</au><au>Palma, Federica</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>AB_SA: Accessory genes-Based Source Attribution – tracing the source of Salmonella enterica Typhimurium environmental strains</atitle><jtitle>Microbial genomics</jtitle><date>2020-07-01</date><risdate>2020</risdate><volume>6</volume><issue>7</issue><issn>2057-5858</issn><eissn>2057-5858</eissn><abstract>The partitioning of pathogenic strains isolated in environmental or human cases to their sources is challenging. The pathogens usually colonize multiple animal hosts, including livestock, which contaminate the food-production chain and the environment (e.g. 
soil and water), posing an additional public-health burden and major challenges in the identification of the source. Genomic data opens up new opportunities for the development of statistical models aiming to indicate the likely source of pathogen contamination. Here, we propose a computationally fast and efficient multinomial logistic regression source-attribution classifier to predict the animal source of bacterial isolates based on ‘source-enriched’ loci extracted from the accessory-genome profiles of a pangenomic dataset. Depending on the accuracy of the model’s self-attribution step, the modeller selects the number of candidate accessory genes that best fit the model for calculating the likelihood of (source) category membership. The Accessory genes-Based Source Attribution (AB_SA) method was applied to a dataset of strains of Salmonella enterica Typhimurium and its monophasic variant ( S . enterica 1,4,[5],12:i:-). The model was trained on 69 strains with known animal-source categories (i.e. poultry, ruminant and pig). The AB_SA method helped to identify 8 genes as predictors among the 2802 accessory genes. The self-attribution accuracy was 80 %. The AB_SA model was then able to classify 25 of the 29 S . enterica Typhimurium and S . enterica 1,4,[5],12:i:- isolates collected from the environment (considered to be of unknown source) into a specific category (i.e. animal source), with more than 85 % of probability. The AB_SA method herein described provides a user-friendly and valuable tool for performing source-attribution studies in only a few steps. 
AB_SA is written in R and freely available at https://github.com/lguillier/AB_SA.</abstract><pub>Society for General Microbiology</pub><doi>10.1099/mgen.0.000366</doi><orcidid>https://orcid.org/0000-0003-1654-0960</orcidid><orcidid>https://orcid.org/0000-0002-7865-961X</orcidid><orcidid>https://orcid.org/0000-0002-7867-2937</orcidid><orcidid>https://orcid.org/0000-0002-1115-9792</orcidid><orcidid>https://orcid.org/0000-0001-8880-166X</orcidid><orcidid>https://orcid.org/0000-0003-1133-6282</orcidid><orcidid>https://orcid.org/0000-0001-5291-2181</orcidid><orcidid>https://orcid.org/0000-0001-8880-166X</orcidid><orcidid>https://orcid.org/0000-0002-7865-961X</orcidid><orcidid>https://orcid.org/0000-0002-7867-2937</orcidid><orcidid>https://orcid.org/0000-0003-1654-0960</orcidid><orcidid>https://orcid.org/0000-0003-1133-6282</orcidid><orcidid>https://orcid.org/0000-0002-1115-9792</orcidid><orcidid>https://orcid.org/0000-0001-5291-2181</orcidid></addata></record>
fulltext fulltext
identifier ISSN: 2057-5858
ispartof Microbial genomics, 2020-07, Vol.6 (7)
issn 2057-5858
2057-5858
language eng
recordid cdi_hal_primary_oai_HAL_hal_04669875v1
source TestCollectionTL3OpenAccess; EZB-FREE-00999 freely available EZB journals; PubMed Central
subjects Life Sciences
title AB_SA: Accessory genes-Based Source Attribution – tracing the source of Salmonella enterica Typhimurium environmental strains
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-27T10%3A43%3A12IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-hal&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=AB_SA:%20Accessory%20genes-Based%20Source%20Attribution%20%E2%80%93%20tracing%20the%20source%20of%20Salmonella%20enterica%20Typhimurium%20environmental%20strains&rft.jtitle=Microbial%20genomics&rft.au=Guillier,%20Laurent&rft.date=2020-07-01&rft.volume=6&rft.issue=7&rft.issn=2057-5858&rft.eissn=2057-5858&rft_id=info:doi/10.1099/mgen.0.000366&rft_dat=%3Chal%3Eoai_HAL_hal_04669875v1%3C/hal%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true
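
The description above outlines a multinomial logistic regression classifier trained on presence/absence profiles of accessory genes, followed by a self-attribution check and a probability cut-off for environmental isolates. The following is a minimal illustrative sketch of that general idea in plain Python, not the AB_SA implementation (which is written in R). The toy profiles, the three hypothetical gene columns, the function names and the training settings are all assumptions made for illustration; the selection of source-enriched genes is assumed to have been done beforehand.

```python
import math

SOURCES = ["poultry", "ruminant", "pig"]  # categories named in the abstract

# Hypothetical presence/absence profiles (1 = gene present) for three
# made-up accessory genes; these are NOT data from the study.
X = [[1, 0, 0], [1, 0, 0], [1, 1, 0],   # poultry
     [0, 1, 0], [0, 1, 0], [0, 1, 1],   # ruminant
     [0, 0, 1], [0, 0, 1], [1, 0, 1]]   # pig
y = [0, 0, 0, 1, 1, 1, 2, 2, 2]

def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_proba(W, x):
    """Class probabilities for one profile; W[k][0] is the bias of class k."""
    scores = [w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)) for w in W]
    return softmax(scores)

def fit_multinomial_logit(X, y, n_classes, lr=0.5, epochs=1000, l2=0.001):
    """Batch gradient descent on the L2-penalised multinomial log-loss."""
    n, d = len(X), len(X[0])
    W = [[0.0] * (d + 1) for _ in range(n_classes)]
    for _ in range(epochs):
        grad = [[0.0] * (d + 1) for _ in range(n_classes)]
        for xi, yi in zip(X, y):
            probs = predict_proba(W, xi)
            for k in range(n_classes):
                err = probs[k] - (1.0 if k == yi else 0.0)
                grad[k][0] += err
                for j, v in enumerate(xi):
                    grad[k][j + 1] += err * v
        for k in range(n_classes):
            for j in range(d + 1):
                penalty = l2 * W[k][j] if j > 0 else 0.0  # bias is not penalised
                W[k][j] -= lr * (grad[k][j] / n + penalty)
    return W

def self_attribution_accuracy(W, X, y):
    """Fraction of training strains assigned back to their known source."""
    hits = 0
    for xi, yi in zip(X, y):
        probs = predict_proba(W, xi)
        hits += probs.index(max(probs)) == yi
    return hits / len(X)

def attribute(W, x, threshold=0.85):
    """Return (source, probability); source is None below the cut-off."""
    probs = predict_proba(W, x)
    best = max(probs)
    label = SOURCES[probs.index(best)] if best >= threshold else None
    return label, best

W = fit_multinomial_logit(X, y, n_classes=len(SOURCES))
accuracy = self_attribution_accuracy(W, X, y)
label, prob = attribute(W, [1, 0, 0])  # a hypothetical environmental isolate
```

In this sketch the 0.85 default in `attribute` mirrors the "more than 85 %" probability mentioned in the description; an isolate whose best class probability falls below it is left unassigned rather than forced into a category.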