SVDF: enhancing structural variation detect from long-read sequencing via automatic filtering strategies

Abstract Structural variation (SV) is an important form of genomic variation that influences gene function and expression by altering the structure of the genome. Although long-read data have been proven to better characterize SVs, SVs detected from noisy long-read data still include a considerable...

Bibliographic details
Published in: Briefings in bioinformatics 2024-05, Vol.25 (4)
Main authors: Hu, Heng; Gao, Runtian; Gao, Wentao; Gao, Bo; Jiang, Zhongjun; Zhou, Murong; Wang, Guohua; Jiang, Tao
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page
container_issue 4
container_start_page
container_title Briefings in bioinformatics
container_volume 25
creator Hu, Heng
Gao, Runtian
Gao, Wentao
Gao, Bo
Jiang, Zhongjun
Zhou, Murong
Wang, Guohua
Jiang, Tao
description Abstract Structural variation (SV) is an important form of genomic variation that influences gene function and expression by altering the structure of the genome. Although long-read data have been proven to better characterize SVs, SVs detected from noisy long-read data still include a considerable portion of false-positive calls. To accurately detect SVs in long-read data, we present SVDF, a method that employs a learning-based noise filtering strategy and an SV signature-adaptive clustering algorithm, for effectively reducing the likelihood of false-positive events. Benchmarking results from multiple orthogonal experiments demonstrate that, across different sequencing platforms and depths, SVDF achieves higher calling accuracy for each sample compared to several existing general SV calling tools. We believe that, with its meticulous and sensitive SV detection capability, SVDF can bring new opportunities and advancements to cutting-edge genomic research.
doi_str_mv 10.1093/bib/bbae336
format Article
fullrecord <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_11232458</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><oup_id>10.1093/bib/bbae336</oup_id><sourcerecordid>3103006754</sourcerecordid><originalsourceid>FETCH-LOGICAL-c329t-4dde9506b2344b1cb987b69d06b14e3653e7444758fd8263e03bd961a24300c73</originalsourceid><addsrcrecordid>eNp9kc1rFTEUxYMotlZX7iUgiCBjk8nnuBGpVoWCCz-2IcnceS9lJnlNMg_87015z6IuXOWS-7uHczgIPaXkNSUDO3fBnTtngTF5D51SrlTHieD3b2epOsElO0GPSrkmpCdK04fohOlBE6bEKdp-_fH-8g2GuLXRh7jBpebV1zXbGe9tDraGFPEIFXzFU04LnlPcdBnsiAvcrHC42geL7VrT0niPpzBXyEc1W2EToDxGDyY7F3hyfM_Q98sP3y4-dVdfPn6-eHfVedYPtePjCIMg0vWMc0e9G7RychjbD-XApGCgOOdK6GnUvWRAmBsHSW3PGSFesTP09qC7W90Co4fYLMxml8Ni80-TbDB_b2LYmk3aG0p71nOhm8LLo0JOLWCpZgnFwzzbCGkthhGlqNYDFw19_g96ndYcWz7DKGmGpBK8Ua8OlM-plAzTnRtKzG2FplVojhU2-tmfAe7Y35014MUBSOvuv0q_AKJ2pj8</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3103006754</pqid></control><display><type>article</type><title>SVDF: enhancing structural variation detect from long-read sequencing via automatic filtering strategies</title><source>MEDLINE</source><source>Oxford Journals Open Access Collection</source><source>PubMed Central</source><creator>Hu, Heng ; Gao, Runtian ; Gao, Wentao ; Gao, Bo ; Jiang, Zhongjun ; Zhou, Murong ; Wang, Guohua ; Jiang, Tao</creator><creatorcontrib>Hu, Heng ; Gao, Runtian ; Gao, Wentao ; Gao, Bo ; Jiang, Zhongjun ; Zhou, Murong ; Wang, Guohua ; Jiang, Tao</creatorcontrib><description>Abstract Structural variation (SV) is an important form of genomic variation that influences gene function and expression by altering the structure of the genome. Although long-read data have been proven to better characterize SVs, SVs detected from noisy long-read data still include a considerable portion of false-positive calls. 
To accurately detect SVs in long-read data, we present SVDF, a method that employs a learning-based noise filtering strategy and an SV signature-adaptive clustering algorithm, for effectively reducing the likelihood of false-positive events. Benchmarking results from multiple orthogonal experiments demonstrate that, across different sequencing platforms and depths, SVDF achieves higher calling accuracy for each sample compared to several existing general SV calling tools. We believe that, with its meticulous and sensitive SV detection capability, SVDF can bring new opportunities and advancements to cutting-edge genomic research.</description><identifier>ISSN: 1467-5463</identifier><identifier>ISSN: 1477-4054</identifier><identifier>EISSN: 1477-4054</identifier><identifier>DOI: 10.1093/bib/bbae336</identifier><identifier>PMID: 38980375</identifier><language>eng</language><publisher>England: Oxford University Press</publisher><subject>Adaptive algorithms ; Adaptive sampling ; Algorithms ; Clustering ; Filtration ; Gene sequencing ; Genomic Structural Variation ; Genomics ; Genomics - methods ; High-Throughput Nucleotide Sequencing - methods ; Humans ; Machine learning ; Problem Solving Protocol ; Sequence Analysis, DNA - methods ; Software ; Structure-function relationships ; Variation</subject><ispartof>Briefings in bioinformatics, 2024-05, Vol.25 (4)</ispartof><rights>The Author(s) 2024. Published by Oxford University Press. 2024</rights><rights>The Author(s) 2024. 
Published by Oxford University Press.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c329t-4dde9506b2344b1cb987b69d06b14e3653e7444758fd8263e03bd961a24300c73</cites><orcidid>0000-0002-0673-8503 ; 0000-0002-9505-4049 ; 0009-0009-0870-6693</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC11232458/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC11232458/$$EHTML$$P50$$Gpubmedcentral$$Hfree_for_read</linktohtml><link.rule.ids>230,314,727,780,784,885,1604,27924,27925,53791,53793</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/38980375$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Hu, Heng</creatorcontrib><creatorcontrib>Gao, Runtian</creatorcontrib><creatorcontrib>Gao, Wentao</creatorcontrib><creatorcontrib>Gao, Bo</creatorcontrib><creatorcontrib>Jiang, Zhongjun</creatorcontrib><creatorcontrib>Zhou, Murong</creatorcontrib><creatorcontrib>Wang, Guohua</creatorcontrib><creatorcontrib>Jiang, Tao</creatorcontrib><title>SVDF: enhancing structural variation detect from long-read sequencing via automatic filtering strategies</title><title>Briefings in bioinformatics</title><addtitle>Brief Bioinform</addtitle><description>Abstract Structural variation (SV) is an important form of genomic variation that influences gene function and expression by altering the structure of the genome. Although long-read data have been proven to better characterize SVs, SVs detected from noisy long-read data still include a considerable portion of false-positive calls. 
To accurately detect SVs in long-read data, we present SVDF, a method that employs a learning-based noise filtering strategy and an SV signature-adaptive clustering algorithm, for effectively reducing the likelihood of false-positive events. Benchmarking results from multiple orthogonal experiments demonstrate that, across different sequencing platforms and depths, SVDF achieves higher calling accuracy for each sample compared to several existing general SV calling tools. We believe that, with its meticulous and sensitive SV detection capability, SVDF can bring new opportunities and advancements to cutting-edge genomic research.</description><subject>Adaptive algorithms</subject><subject>Adaptive sampling</subject><subject>Algorithms</subject><subject>Clustering</subject><subject>Filtration</subject><subject>Gene sequencing</subject><subject>Genomic Structural Variation</subject><subject>Genomics</subject><subject>Genomics - methods</subject><subject>High-Throughput Nucleotide Sequencing - methods</subject><subject>Humans</subject><subject>Machine learning</subject><subject>Problem Solving Protocol</subject><subject>Sequence Analysis, DNA - methods</subject><subject>Software</subject><subject>Structure-function 
relationships</subject><subject>Variation</subject><issn>1467-5463</issn><issn>1477-4054</issn><issn>1477-4054</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>TOX</sourceid><sourceid>EIF</sourceid><recordid>eNp9kc1rFTEUxYMotlZX7iUgiCBjk8nnuBGpVoWCCz-2IcnceS9lJnlNMg_87015z6IuXOWS-7uHczgIPaXkNSUDO3fBnTtngTF5D51SrlTHieD3b2epOsElO0GPSrkmpCdK04fohOlBE6bEKdp-_fH-8g2GuLXRh7jBpebV1zXbGe9tDraGFPEIFXzFU04LnlPcdBnsiAvcrHC42geL7VrT0niPpzBXyEc1W2EToDxGDyY7F3hyfM_Q98sP3y4-dVdfPn6-eHfVedYPtePjCIMg0vWMc0e9G7RychjbD-XApGCgOOdK6GnUvWRAmBsHSW3PGSFesTP09qC7W90Co4fYLMxml8Ni80-TbDB_b2LYmk3aG0p71nOhm8LLo0JOLWCpZgnFwzzbCGkthhGlqNYDFw19_g96ndYcWz7DKGmGpBK8Ua8OlM-plAzTnRtKzG2FplVojhU2-tmfAe7Y35014MUBSOvuv0q_AKJ2pj8</recordid><startdate>20240523</startdate><enddate>20240523</enddate><creator>Hu, Heng</creator><creator>Gao, Runtian</creator><creator>Gao, Wentao</creator><creator>Gao, Bo</creator><creator>Jiang, Zhongjun</creator><creator>Zhou, Murong</creator><creator>Wang, Guohua</creator><creator>Jiang, Tao</creator><general>Oxford University Press</general><general>Oxford Publishing Limited (England)</general><scope>TOX</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7QO</scope><scope>7SC</scope><scope>8FD</scope><scope>FR3</scope><scope>JQ2</scope><scope>K9.</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>P64</scope><scope>RC3</scope><scope>7X8</scope><scope>5PM</scope><orcidid>https://orcid.org/0000-0002-0673-8503</orcidid><orcidid>https://orcid.org/0000-0002-9505-4049</orcidid><orcidid>https://orcid.org/0009-0009-0870-6693</orcidid></search><sort><creationdate>20240523</creationdate><title>SVDF: enhancing structural variation detect from long-read sequencing via automatic filtering strategies</title><author>Hu, Heng ; Gao, Runtian ; Gao, Wentao ; Gao, Bo ; Jiang, 
Zhongjun ; Zhou, Murong ; Wang, Guohua ; Jiang, Tao</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c329t-4dde9506b2344b1cb987b69d06b14e3653e7444758fd8263e03bd961a24300c73</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Adaptive algorithms</topic><topic>Adaptive sampling</topic><topic>Algorithms</topic><topic>Clustering</topic><topic>Filtration</topic><topic>Gene sequencing</topic><topic>Genomic Structural Variation</topic><topic>Genomics</topic><topic>Genomics - methods</topic><topic>High-Throughput Nucleotide Sequencing - methods</topic><topic>Humans</topic><topic>Machine learning</topic><topic>Problem Solving Protocol</topic><topic>Sequence Analysis, DNA - methods</topic><topic>Software</topic><topic>Structure-function relationships</topic><topic>Variation</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Hu, Heng</creatorcontrib><creatorcontrib>Gao, Runtian</creatorcontrib><creatorcontrib>Gao, Wentao</creatorcontrib><creatorcontrib>Gao, Bo</creatorcontrib><creatorcontrib>Jiang, Zhongjun</creatorcontrib><creatorcontrib>Zhou, Murong</creatorcontrib><creatorcontrib>Wang, Guohua</creatorcontrib><creatorcontrib>Jiang, Tao</creatorcontrib><collection>Oxford Journals Open Access Collection</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Biotechnology Research Abstracts</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>Engineering Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>ProQuest Health &amp; Medical Complete (Alumni)</collection><collection>Advanced 
Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>Genetics Abstracts</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Briefings in bioinformatics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Hu, Heng</au><au>Gao, Runtian</au><au>Gao, Wentao</au><au>Gao, Bo</au><au>Jiang, Zhongjun</au><au>Zhou, Murong</au><au>Wang, Guohua</au><au>Jiang, Tao</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>SVDF: enhancing structural variation detect from long-read sequencing via automatic filtering strategies</atitle><jtitle>Briefings in bioinformatics</jtitle><addtitle>Brief Bioinform</addtitle><date>2024-05-23</date><risdate>2024</risdate><volume>25</volume><issue>4</issue><issn>1467-5463</issn><issn>1477-4054</issn><eissn>1477-4054</eissn><abstract>Abstract Structural variation (SV) is an important form of genomic variation that influences gene function and expression by altering the structure of the genome. Although long-read data have been proven to better characterize SVs, SVs detected from noisy long-read data still include a considerable portion of false-positive calls. To accurately detect SVs in long-read data, we present SVDF, a method that employs a learning-based noise filtering strategy and an SV signature-adaptive clustering algorithm, for effectively reducing the likelihood of false-positive events. Benchmarking results from multiple orthogonal experiments demonstrate that, across different sequencing platforms and depths, SVDF achieves higher calling accuracy for each sample compared to several existing general SV calling tools. 
We believe that, with its meticulous and sensitive SV detection capability, SVDF can bring new opportunities and advancements to cutting-edge genomic research.</abstract><cop>England</cop><pub>Oxford University Press</pub><pmid>38980375</pmid><doi>10.1093/bib/bbae336</doi><orcidid>https://orcid.org/0000-0002-0673-8503</orcidid><orcidid>https://orcid.org/0000-0002-9505-4049</orcidid><orcidid>https://orcid.org/0009-0009-0870-6693</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1467-5463
ispartof Briefings in bioinformatics, 2024-05, Vol.25 (4)
issn 1467-5463
1477-4054
1477-4054
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_11232458
source MEDLINE; Oxford Journals Open Access Collection; PubMed Central
subjects Adaptive algorithms
Adaptive sampling
Algorithms
Clustering
Filtration
Gene sequencing
Genomic Structural Variation
Genomics
Genomics - methods
High-Throughput Nucleotide Sequencing - methods
Humans
Machine learning
Problem Solving Protocol
Sequence Analysis, DNA - methods
Software
Structure-function relationships
Variation
title SVDF: enhancing structural variation detect from long-read sequencing via automatic filtering strategies
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-27T01%3A13%3A58IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=SVDF:%20enhancing%20structural%20variation%20detect%20from%20long-read%20sequencing%20via%20automatic%20filtering%20strategies&rft.jtitle=Briefings%20in%20bioinformatics&rft.au=Hu,%20Heng&rft.date=2024-05-23&rft.volume=25&rft.issue=4&rft.issn=1467-5463&rft.eissn=1477-4054&rft_id=info:doi/10.1093/bib/bbae336&rft_dat=%3Cproquest_pubme%3E3103006754%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3103006754&rft_id=info:pmid/38980375&rft_oup_id=10.1093/bib/bbae336&rfr_iscdi=true