SigHunt: horizontal gene transfer finder optimized for eukaryotic genomes

Genomic islands (GIs) are DNA fragments incorporated into a genome through horizontal gene transfer (also called lateral gene transfer), often with functions novel for a given organism. While methods for their detection are well researched in prokaryotes, the complexity of eukaryotic genomes makes d...

Bibliographic details
Published in: Bioinformatics 2014-04, Vol.30 (8), p.1081-1086
Main authors: Jaron, Kamil S, Moravec, Jiří C, Martínková, Natália
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 1086
container_issue 8
container_start_page 1081
container_title Bioinformatics
container_volume 30
creator Jaron, Kamil S
Moravec, Jiří C
Martínková, Natália
description Genomic islands (GIs) are DNA fragments incorporated into a genome through horizontal gene transfer (also called lateral gene transfer), often with functions novel for a given organism. While methods for their detection are well researched in prokaryotes, the complexity of eukaryotic genomes makes direct utilization of these methods unreliable, and so labour-intensive phylogenetic searches are used instead. We present a surrogate method that investigates nucleotide base composition of the DNA sequence in a eukaryotic genome and identifies putative GIs. We calculate a genomic signature as a vector of tetranucleotide (4-mer) frequencies using a sliding window approach. Extending the neighbourhood of the sliding window, we establish a local kernel density estimate of the 4-mer frequency. We score the number of 4-mer frequencies in the sliding window that deviate from the credibility interval of their local genomic density using a newly developed discrete interval accumulative score (DIAS). To further improve the effectiveness of DIAS, we select informative 4-mers in a range of organisms using the tetranucleotide quality score developed herein. We show that the SigHunt method is computationally efficient and able to detect GIs in eukaryotic genomes that represent non-ameliorated integration. Thus, it is suited to scanning for change in organisms with different DNA composition. Source code and scripts, freely available for download at http://www.iba.muni.cz/index-en.php?pg=research-data-analysis-tools-sighunt, are implemented in C and R and are platform-independent. Contact: 376090@mail.muni.cz or martinkova@ivb.cz.
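note The description above outlines a pipeline: 4-mer frequency vectors per sliding window, a local reference built from neighbouring windows, and a count of 4-mers that fall outside a local interval. The following Python sketch illustrates that general idea only. It is not the SigHunt implementation (which the record says is written in C and R) and it does not compute DIAS: the kernel density estimate and credibility interval are replaced here by a plain mean ± n_sd standard deviations over neighbouring windows. All function names, window sizes, and thresholds are assumptions chosen for illustration.

```python
from collections import Counter
from itertools import product
from statistics import mean, pstdev

# All 256 tetranucleotides in a fixed order, so every signature
# vector uses the same indexing.
KMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def kmer_freqs(seq, k=4):
    """Return relative 4-mer frequencies of seq, ordered as in KMERS."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[m] for m in KMERS)  # ignores k-mers containing N etc.
    if total == 0:
        return [0.0] * len(KMERS)
    return [counts[m] / total for m in KMERS]


def sliding_signatures(seq, window=1000, step=500):
    """Compute one 4-mer signature per sliding window (hypothetical defaults)."""
    seq = seq.upper()
    return [kmer_freqs(seq[s:s + window])
            for s in range(0, len(seq) - window + 1, step)]


def deviation_scores(signatures, neighbours=5, n_sd=2.0):
    """Count, per window, the 4-mers whose frequency lies outside
    mean +/- n_sd * sd of the neighbouring windows.

    Simplified stand-in for the interval-based score in the description;
    not the published DIAS formula.
    """
    scores = []
    for i, sig in enumerate(signatures):
        lo = max(0, i - neighbours)
        hi = min(len(signatures), i + neighbours + 1)
        local = [signatures[j] for j in range(lo, hi) if j != i]
        if not local:  # a single window has no neighbourhood to compare with
            scores.append(0)
            continue
        score = 0
        for k, f in enumerate(sig):
            vals = [s[k] for s in local]
            if abs(f - mean(vals)) > n_sd * pstdev(vals):
                score += 1
        scores.append(score)
    return scores


# Toy usage: a uniform "host" sequence with one compositionally distinct insert.
host = "ACGT" * 250
toy = host[:500] + "GGCC" * 25 + host[500:]
toy_scores = deviation_scores(sliding_signatures(toy, window=100, step=100),
                              neighbours=3)
```

In the toy usage, the window that coincides with the inserted block is the only one whose composition departs from its neighbourhood, so it receives the highest score. Real genomes are far noisier, which is why the description relies on a kernel density estimate and a selection of informative 4-mers instead of this crude interval.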
doi_str_mv 10.1093/bioinformatics/btt727
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_1826582324</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>1534806594</sourcerecordid><originalsourceid>FETCH-LOGICAL-c422t-f25bec9062e08f7546973eaaceedb425ff87e7516ebec96bba408da8b58270673</originalsourceid><addsrcrecordid>eNqFkUtPwzAQhC0EoqXwE0A5cgn1I36UG6qAVqrEAThHTrIuhiYutnOgvx5XLZU49bR7-GZHO4PQNcF3BE_YuLLOdsb5Vkdbh3EVo6TyBA0JEzIvFCGnhx2zAboI4RNjzDEX52hACyYJ4WyI5q92Oeu7eJ99OG83rot6lS2hgyx63QUDPjO2a9Jw62hbu4EmS64Z9F_a_7jkvaVdC-ESnRm9CnC1nyP0_vT4Np3li5fn-fRhkdcFpTE3lFdQT7CggJWRvBATyUDrGqCpCsqNURIkJwK2mKgqXWDVaFVxRSUWko3Q7e7u2rvvHkIsWxtqWK10B64PJVFUJJalH4-iUgmiFE9JHEU5S0EKPtle5Tu09i4ED6Zce9umNEqCy2035f9uyl03SXezt-irFpqD6q8M9gtCaJC0</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1534806594</pqid></control><display><type>article</type><title>SigHunt: horizontal gene transfer finder optimized for eukaryotic genomes</title><source>MEDLINE</source><source>Access via Oxford University Press (Open Access Collection)</source><source>EZB-FREE-00999 freely available EZB journals</source><source>PubMed Central</source><source>Alma/SFX Local Collection</source><creator>Jaron, Kamil S ; Moravec, Jiří C ; Martínková, Natália</creator><creatorcontrib>Jaron, Kamil S ; Moravec, Jiří C ; Martínková, Natália</creatorcontrib><description>Genomic islands (GIs) are DNA fragments incorporated into a genome through horizontal gene transfer (also called lateral gene transfer), often with functions novel for a given organism. While methods for their detection are well researched in prokaryotes, the complexity of eukaryotic genomes makes direct utilization of these methods unreliable, and so labour-intensive phylogenetic searches are used instead. 
We present a surrogate method that investigates nucleotide base composition of the DNA sequence in a eukaryotic genome and identifies putative GIs. We calculate a genomic signature as a vector of tetranucleotide (4-mer) frequencies using a sliding window approach. Extending the neighbourhood of the sliding window, we establish a local kernel density estimate of the 4-mer frequency. We score the number of 4-mer frequencies in the sliding window that deviate from the credibility interval of their local genomic density using a newly developed discrete interval accumulative score (DIAS). To further improve the effectiveness of DIAS, we select informative 4-mers in a range of organisms using the tetranucleotide quality score developed herein. We show that the SigHunt method is computationally efficient and able to detect GIs in eukaryotic genomes that represent non-ameliorated integration. Thus, it is suited to scanning for change in organisms with different DNA composition. Source code and scripts freely available for download at http://www.iba.muni.cz/index-en.php?pg=research-data-analysis-tools-sighunt are implemented in C and R and are platform-independent. 376090@mail.muni.cz or martinkova@ivb.cz.</description><identifier>ISSN: 1367-4803</identifier><identifier>EISSN: 1367-4811</identifier><identifier>EISSN: 1460-2059</identifier><identifier>DOI: 10.1093/bioinformatics/btt727</identifier><identifier>PMID: 24371153</identifier><language>eng</language><publisher>England</publisher><subject>Base Composition ; Computational Biology ; Density ; Deoxyribonucleic acid ; Eukaryota - genetics ; Gene Transfer, Horizontal ; Genes ; Genomes ; Genomic Islands ; Genomics - methods ; Intervals ; Mathematical analysis ; Organisms ; Phylogeny ; Sequence Analysis, DNA - methods ; Sliding</subject><ispartof>Bioinformatics, 2014-04, Vol.30 (8), p.1081-1086</ispartof><rights>The Author 2013. Published by Oxford University Press. All rights reserved. 
For Permissions, please e-mail: journals.permissions@oup.com.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c422t-f25bec9062e08f7546973eaaceedb425ff87e7516ebec96bba408da8b58270673</citedby><cites>FETCH-LOGICAL-c422t-f25bec9062e08f7546973eaaceedb425ff87e7516ebec96bba408da8b58270673</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,27924,27925</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/24371153$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Jaron, Kamil S</creatorcontrib><creatorcontrib>Moravec, Jiří C</creatorcontrib><creatorcontrib>Martínková, Natália</creatorcontrib><title>SigHunt: horizontal gene transfer finder optimized for eukaryotic genomes</title><title>Bioinformatics</title><addtitle>Bioinformatics</addtitle><description>Genomic islands (GIs) are DNA fragments incorporated into a genome through horizontal gene transfer (also called lateral gene transfer), often with functions novel for a given organism. While methods for their detection are well researched in prokaryotes, the complexity of eukaryotic genomes makes direct utilization of these methods unreliable, and so labour-intensive phylogenetic searches are used instead. We present a surrogate method that investigates nucleotide base composition of the DNA sequence in a eukaryotic genome and identifies putative GIs. We calculate a genomic signature as a vector of tetranucleotide (4-mer) frequencies using a sliding window approach. Extending the neighbourhood of the sliding window, we establish a local kernel density estimate of the 4-mer frequency. 
We score the number of 4-mer frequencies in the sliding window that deviate from the credibility interval of their local genomic density using a newly developed discrete interval accumulative score (DIAS). To further improve the effectiveness of DIAS, we select informative 4-mers in a range of organisms using the tetranucleotide quality score developed herein. We show that the SigHunt method is computationally efficient and able to detect GIs in eukaryotic genomes that represent non-ameliorated integration. Thus, it is suited to scanning for change in organisms with different DNA composition. Source code and scripts freely available for download at http://www.iba.muni.cz/index-en.php?pg=research-data-analysis-tools-sighunt are implemented in C and R and are platform-independent. 376090@mail.muni.cz or martinkova@ivb.cz.</description><subject>Base Composition</subject><subject>Computational Biology</subject><subject>Density</subject><subject>Deoxyribonucleic acid</subject><subject>Eukaryota - genetics</subject><subject>Gene Transfer, Horizontal</subject><subject>Genes</subject><subject>Genomes</subject><subject>Genomic Islands</subject><subject>Genomics - methods</subject><subject>Intervals</subject><subject>Mathematical analysis</subject><subject>Organisms</subject><subject>Phylogeny</subject><subject>Sequence Analysis, DNA - 
methods</subject><subject>Sliding</subject><issn>1367-4803</issn><issn>1367-4811</issn><issn>1460-2059</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2014</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNqFkUtPwzAQhC0EoqXwE0A5cgn1I36UG6qAVqrEAThHTrIuhiYutnOgvx5XLZU49bR7-GZHO4PQNcF3BE_YuLLOdsb5Vkdbh3EVo6TyBA0JEzIvFCGnhx2zAboI4RNjzDEX52hACyYJ4WyI5q92Oeu7eJ99OG83rot6lS2hgyx63QUDPjO2a9Jw62hbu4EmS64Z9F_a_7jkvaVdC-ESnRm9CnC1nyP0_vT4Np3li5fn-fRhkdcFpTE3lFdQT7CggJWRvBATyUDrGqCpCsqNURIkJwK2mKgqXWDVaFVxRSUWko3Q7e7u2rvvHkIsWxtqWK10B64PJVFUJJalH4-iUgmiFE9JHEU5S0EKPtle5Tu09i4ED6Zce9umNEqCy2035f9uyl03SXezt-irFpqD6q8M9gtCaJC0</recordid><startdate>20140415</startdate><enddate>20140415</enddate><creator>Jaron, Kamil S</creator><creator>Moravec, Jiří C</creator><creator>Martínková, Natália</creator><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7QO</scope><scope>7TM</scope><scope>8FD</scope><scope>FR3</scope><scope>P64</scope><scope>7SC</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>7X8</scope></search><sort><creationdate>20140415</creationdate><title>SigHunt: horizontal gene transfer finder optimized for eukaryotic genomes</title><author>Jaron, Kamil S ; Moravec, Jiří C ; Martínková, Natália</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c422t-f25bec9062e08f7546973eaaceedb425ff87e7516ebec96bba408da8b58270673</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2014</creationdate><topic>Base Composition</topic><topic>Computational Biology</topic><topic>Density</topic><topic>Deoxyribonucleic acid</topic><topic>Eukaryota - genetics</topic><topic>Gene Transfer, Horizontal</topic><topic>Genes</topic><topic>Genomes</topic><topic>Genomic Islands</topic><topic>Genomics - 
methods</topic><topic>Intervals</topic><topic>Mathematical analysis</topic><topic>Organisms</topic><topic>Phylogeny</topic><topic>Sequence Analysis, DNA - methods</topic><topic>Sliding</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Jaron, Kamil S</creatorcontrib><creatorcontrib>Moravec, Jiří C</creatorcontrib><creatorcontrib>Martínková, Natália</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Biotechnology Research Abstracts</collection><collection>Nucleic Acids Abstracts</collection><collection>Technology Research Database</collection><collection>Engineering Research Database</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>Computer and Information Systems Abstracts</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>MEDLINE - Academic</collection><jtitle>Bioinformatics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Jaron, Kamil S</au><au>Moravec, Jiří C</au><au>Martínková, Natália</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>SigHunt: horizontal gene transfer finder optimized for eukaryotic genomes</atitle><jtitle>Bioinformatics</jtitle><addtitle>Bioinformatics</addtitle><date>2014-04-15</date><risdate>2014</risdate><volume>30</volume><issue>8</issue><spage>1081</spage><epage>1086</epage><pages>1081-1086</pages><issn>1367-4803</issn><eissn>1367-4811</eissn><eissn>1460-2059</eissn><abstract>Genomic 
islands (GIs) are DNA fragments incorporated into a genome through horizontal gene transfer (also called lateral gene transfer), often with functions novel for a given organism. While methods for their detection are well researched in prokaryotes, the complexity of eukaryotic genomes makes direct utilization of these methods unreliable, and so labour-intensive phylogenetic searches are used instead. We present a surrogate method that investigates nucleotide base composition of the DNA sequence in a eukaryotic genome and identifies putative GIs. We calculate a genomic signature as a vector of tetranucleotide (4-mer) frequencies using a sliding window approach. Extending the neighbourhood of the sliding window, we establish a local kernel density estimate of the 4-mer frequency. We score the number of 4-mer frequencies in the sliding window that deviate from the credibility interval of their local genomic density using a newly developed discrete interval accumulative score (DIAS). To further improve the effectiveness of DIAS, we select informative 4-mers in a range of organisms using the tetranucleotide quality score developed herein. We show that the SigHunt method is computationally efficient and able to detect GIs in eukaryotic genomes that represent non-ameliorated integration. Thus, it is suited to scanning for change in organisms with different DNA composition. Source code and scripts freely available for download at http://www.iba.muni.cz/index-en.php?pg=research-data-analysis-tools-sighunt are implemented in C and R and are platform-independent. 376090@mail.muni.cz or martinkova@ivb.cz.</abstract><cop>England</cop><pmid>24371153</pmid><doi>10.1093/bioinformatics/btt727</doi><tpages>6</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1367-4803
ispartof Bioinformatics, 2014-04, Vol.30 (8), p.1081-1086
issn 1367-4803
1367-4811
1460-2059
language eng
recordid cdi_proquest_miscellaneous_1826582324
source MEDLINE; Access via Oxford University Press (Open Access Collection); EZB-FREE-00999 freely available EZB journals; PubMed Central; Alma/SFX Local Collection
subjects Base Composition
Computational Biology
Density
Deoxyribonucleic acid
Eukaryota - genetics
Gene Transfer, Horizontal
Genes
Genomes
Genomic Islands
Genomics - methods
Intervals
Mathematical analysis
Organisms
Phylogeny
Sequence Analysis, DNA - methods
Sliding
title SigHunt: horizontal gene transfer finder optimized for eukaryotic genomes
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-26T06%3A35%3A07IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=SigHunt:%20horizontal%20gene%20transfer%20finder%20optimized%20for%20eukaryotic%20genomes&rft.jtitle=Bioinformatics&rft.au=Jaron,%20Kamil%20S&rft.date=2014-04-15&rft.volume=30&rft.issue=8&rft.spage=1081&rft.epage=1086&rft.pages=1081-1086&rft.issn=1367-4803&rft.eissn=1367-4811&rft_id=info:doi/10.1093/bioinformatics/btt727&rft_dat=%3Cproquest_cross%3E1534806594%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1534806594&rft_id=info:pmid/24371153&rfr_iscdi=true