SVDF: enhancing structural variation detect from long-read sequencing via automatic filtering strategies
Abstract Structural variation (SV) is an important form of genomic variation that influences gene function and expression by altering the structure of the genome. Although long-read data have been proven to better characterize SVs, SVs detected from noisy long-read data still include a considerable...
Published in: | Briefings in bioinformatics 2024-05, Vol.25 (4) |
---|---|
Main authors: | Hu, Heng; Gao, Runtian; Gao, Wentao; Gao, Bo; Jiang, Zhongjun; Zhou, Murong; Wang, Guohua; Jiang, Tao |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | |
---|---|
container_issue | 4 |
container_start_page | |
container_title | Briefings in bioinformatics |
container_volume | 25 |
creator | Hu, Heng; Gao, Runtian; Gao, Wentao; Gao, Bo; Jiang, Zhongjun; Zhou, Murong; Wang, Guohua; Jiang, Tao |
description | Abstract
Structural variation (SV) is an important form of genomic variation that influences gene function and expression by altering the structure of the genome. Although long-read data have been proven to better characterize SVs, SVs detected from noisy long-read data still include a considerable portion of false-positive calls. To accurately detect SVs in long-read data, we present SVDF, a method that employs a learning-based noise filtering strategy and an SV signature-adaptive clustering algorithm, for effectively reducing the likelihood of false-positive events. Benchmarking results from multiple orthogonal experiments demonstrate that, across different sequencing platforms and depths, SVDF achieves higher calling accuracy for each sample compared to several existing general SV calling tools. We believe that, with its meticulous and sensitive SV detection capability, SVDF can bring new opportunities and advancements to cutting-edge genomic research. |
doi_str_mv | 10.1093/bib/bbae336 |
format | Article |
fullrecord | <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_11232458</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><oup_id>10.1093/bib/bbae336</oup_id><sourcerecordid>3103006754</sourcerecordid><originalsourceid>FETCH-LOGICAL-c329t-4dde9506b2344b1cb987b69d06b14e3653e7444758fd8263e03bd961a24300c73</originalsourceid><addsrcrecordid>eNp9kc1rFTEUxYMotlZX7iUgiCBjk8nnuBGpVoWCCz-2IcnceS9lJnlNMg_87015z6IuXOWS-7uHczgIPaXkNSUDO3fBnTtngTF5D51SrlTHieD3b2epOsElO0GPSrkmpCdK04fohOlBE6bEKdp-_fH-8g2GuLXRh7jBpebV1zXbGe9tDraGFPEIFXzFU04LnlPcdBnsiAvcrHC42geL7VrT0niPpzBXyEc1W2EToDxGDyY7F3hyfM_Q98sP3y4-dVdfPn6-eHfVedYPtePjCIMg0vWMc0e9G7RychjbD-XApGCgOOdK6GnUvWRAmBsHSW3PGSFesTP09qC7W90Co4fYLMxml8Ni80-TbDB_b2LYmk3aG0p71nOhm8LLo0JOLWCpZgnFwzzbCGkthhGlqNYDFw19_g96ndYcWz7DKGmGpBK8Ua8OlM-plAzTnRtKzG2FplVojhU2-tmfAe7Y35014MUBSOvuv0q_AKJ2pj8</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3103006754</pqid></control><display><type>article</type><title>SVDF: enhancing structural variation detect from long-read sequencing via automatic filtering strategies</title><source>MEDLINE</source><source>Oxford Journals Open Access Collection</source><source>PubMed Central</source><creator>Hu, Heng ; Gao, Runtian ; Gao, Wentao ; Gao, Bo ; Jiang, Zhongjun ; Zhou, Murong ; Wang, Guohua ; Jiang, Tao</creator><creatorcontrib>Hu, Heng ; Gao, Runtian ; Gao, Wentao ; Gao, Bo ; Jiang, Zhongjun ; Zhou, Murong ; Wang, Guohua ; Jiang, Tao</creatorcontrib><description>Abstract
Structural variation (SV) is an important form of genomic variation that influences gene function and expression by altering the structure of the genome. Although long-read data have been proven to better characterize SVs, SVs detected from noisy long-read data still include a considerable portion of false-positive calls. To accurately detect SVs in long-read data, we present SVDF, a method that employs a learning-based noise filtering strategy and an SV signature-adaptive clustering algorithm, for effectively reducing the likelihood of false-positive events. Benchmarking results from multiple orthogonal experiments demonstrate that, across different sequencing platforms and depths, SVDF achieves higher calling accuracy for each sample compared to several existing general SV calling tools. We believe that, with its meticulous and sensitive SV detection capability, SVDF can bring new opportunities and advancements to cutting-edge genomic research.</description><identifier>ISSN: 1467-5463</identifier><identifier>ISSN: 1477-4054</identifier><identifier>EISSN: 1477-4054</identifier><identifier>DOI: 10.1093/bib/bbae336</identifier><identifier>PMID: 38980375</identifier><language>eng</language><publisher>England: Oxford University Press</publisher><subject>Adaptive algorithms ; Adaptive sampling ; Algorithms ; Clustering ; Filtration ; Gene sequencing ; Genomic Structural Variation ; Genomics ; Genomics - methods ; High-Throughput Nucleotide Sequencing - methods ; Humans ; Machine learning ; Problem Solving Protocol ; Sequence Analysis, DNA - methods ; Software ; Structure-function relationships ; Variation</subject><ispartof>Briefings in bioinformatics, 2024-05, Vol.25 (4)</ispartof><rights>The Author(s) 2024. Published by Oxford University Press. 2024</rights><rights>The Author(s) 2024. 
Published by Oxford University Press.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c329t-4dde9506b2344b1cb987b69d06b14e3653e7444758fd8263e03bd961a24300c73</cites><orcidid>0000-0002-0673-8503 ; 0000-0002-9505-4049 ; 0009-0009-0870-6693</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC11232458/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC11232458/$$EHTML$$P50$$Gpubmedcentral$$Hfree_for_read</linktohtml><link.rule.ids>230,314,727,780,784,885,1604,27924,27925,53791,53793</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/38980375$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Hu, Heng</creatorcontrib><creatorcontrib>Gao, Runtian</creatorcontrib><creatorcontrib>Gao, Wentao</creatorcontrib><creatorcontrib>Gao, Bo</creatorcontrib><creatorcontrib>Jiang, Zhongjun</creatorcontrib><creatorcontrib>Zhou, Murong</creatorcontrib><creatorcontrib>Wang, Guohua</creatorcontrib><creatorcontrib>Jiang, Tao</creatorcontrib><title>SVDF: enhancing structural variation detect from long-read sequencing via automatic filtering strategies</title><title>Briefings in bioinformatics</title><addtitle>Brief Bioinform</addtitle><description>Abstract
Structural variation (SV) is an important form of genomic variation that influences gene function and expression by altering the structure of the genome. Although long-read data have been proven to better characterize SVs, SVs detected from noisy long-read data still include a considerable portion of false-positive calls. To accurately detect SVs in long-read data, we present SVDF, a method that employs a learning-based noise filtering strategy and an SV signature-adaptive clustering algorithm, for effectively reducing the likelihood of false-positive events. Benchmarking results from multiple orthogonal experiments demonstrate that, across different sequencing platforms and depths, SVDF achieves higher calling accuracy for each sample compared to several existing general SV calling tools. We believe that, with its meticulous and sensitive SV detection capability, SVDF can bring new opportunities and advancements to cutting-edge genomic research.</description><subject>Adaptive algorithms</subject><subject>Adaptive sampling</subject><subject>Algorithms</subject><subject>Clustering</subject><subject>Filtration</subject><subject>Gene sequencing</subject><subject>Genomic Structural Variation</subject><subject>Genomics</subject><subject>Genomics - methods</subject><subject>High-Throughput Nucleotide Sequencing - methods</subject><subject>Humans</subject><subject>Machine learning</subject><subject>Problem Solving Protocol</subject><subject>Sequence Analysis, DNA - methods</subject><subject>Software</subject><subject>Structure-function 
relationships</subject><subject>Variation</subject><issn>1467-5463</issn><issn>1477-4054</issn><issn>1477-4054</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>TOX</sourceid><sourceid>EIF</sourceid><recordid>eNp9kc1rFTEUxYMotlZX7iUgiCBjk8nnuBGpVoWCCz-2IcnceS9lJnlNMg_87015z6IuXOWS-7uHczgIPaXkNSUDO3fBnTtngTF5D51SrlTHieD3b2epOsElO0GPSrkmpCdK04fohOlBE6bEKdp-_fH-8g2GuLXRh7jBpebV1zXbGe9tDraGFPEIFXzFU04LnlPcdBnsiAvcrHC42geL7VrT0niPpzBXyEc1W2EToDxGDyY7F3hyfM_Q98sP3y4-dVdfPn6-eHfVedYPtePjCIMg0vWMc0e9G7RychjbD-XApGCgOOdK6GnUvWRAmBsHSW3PGSFesTP09qC7W90Co4fYLMxml8Ni80-TbDB_b2LYmk3aG0p71nOhm8LLo0JOLWCpZgnFwzzbCGkthhGlqNYDFw19_g96ndYcWz7DKGmGpBK8Ua8OlM-plAzTnRtKzG2FplVojhU2-tmfAe7Y35014MUBSOvuv0q_AKJ2pj8</recordid><startdate>20240523</startdate><enddate>20240523</enddate><creator>Hu, Heng</creator><creator>Gao, Runtian</creator><creator>Gao, Wentao</creator><creator>Gao, Bo</creator><creator>Jiang, Zhongjun</creator><creator>Zhou, Murong</creator><creator>Wang, Guohua</creator><creator>Jiang, Tao</creator><general>Oxford University Press</general><general>Oxford Publishing Limited (England)</general><scope>TOX</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7QO</scope><scope>7SC</scope><scope>8FD</scope><scope>FR3</scope><scope>JQ2</scope><scope>K9.</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>P64</scope><scope>RC3</scope><scope>7X8</scope><scope>5PM</scope><orcidid>https://orcid.org/0000-0002-0673-8503</orcidid><orcidid>https://orcid.org/0000-0002-9505-4049</orcidid><orcidid>https://orcid.org/0009-0009-0870-6693</orcidid></search><sort><creationdate>20240523</creationdate><title>SVDF: enhancing structural variation detect from long-read sequencing via automatic filtering strategies</title><author>Hu, Heng ; Gao, Runtian ; Gao, Wentao ; Gao, Bo ; Jiang, 
Zhongjun ; Zhou, Murong ; Wang, Guohua ; Jiang, Tao</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c329t-4dde9506b2344b1cb987b69d06b14e3653e7444758fd8263e03bd961a24300c73</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Adaptive algorithms</topic><topic>Adaptive sampling</topic><topic>Algorithms</topic><topic>Clustering</topic><topic>Filtration</topic><topic>Gene sequencing</topic><topic>Genomic Structural Variation</topic><topic>Genomics</topic><topic>Genomics - methods</topic><topic>High-Throughput Nucleotide Sequencing - methods</topic><topic>Humans</topic><topic>Machine learning</topic><topic>Problem Solving Protocol</topic><topic>Sequence Analysis, DNA - methods</topic><topic>Software</topic><topic>Structure-function relationships</topic><topic>Variation</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Hu, Heng</creatorcontrib><creatorcontrib>Gao, Runtian</creatorcontrib><creatorcontrib>Gao, Wentao</creatorcontrib><creatorcontrib>Gao, Bo</creatorcontrib><creatorcontrib>Jiang, Zhongjun</creatorcontrib><creatorcontrib>Zhou, Murong</creatorcontrib><creatorcontrib>Wang, Guohua</creatorcontrib><creatorcontrib>Jiang, Tao</creatorcontrib><collection>Oxford Journals Open Access Collection</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Biotechnology Research Abstracts</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>Engineering Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>ProQuest Health & Medical Complete (Alumni)</collection><collection>Advanced 
Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>Genetics Abstracts</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Briefings in bioinformatics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Hu, Heng</au><au>Gao, Runtian</au><au>Gao, Wentao</au><au>Gao, Bo</au><au>Jiang, Zhongjun</au><au>Zhou, Murong</au><au>Wang, Guohua</au><au>Jiang, Tao</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>SVDF: enhancing structural variation detect from long-read sequencing via automatic filtering strategies</atitle><jtitle>Briefings in bioinformatics</jtitle><addtitle>Brief Bioinform</addtitle><date>2024-05-23</date><risdate>2024</risdate><volume>25</volume><issue>4</issue><issn>1467-5463</issn><issn>1477-4054</issn><eissn>1477-4054</eissn><abstract>Abstract
Structural variation (SV) is an important form of genomic variation that influences gene function and expression by altering the structure of the genome. Although long-read data have been proven to better characterize SVs, SVs detected from noisy long-read data still include a considerable portion of false-positive calls. To accurately detect SVs in long-read data, we present SVDF, a method that employs a learning-based noise filtering strategy and an SV signature-adaptive clustering algorithm, for effectively reducing the likelihood of false-positive events. Benchmarking results from multiple orthogonal experiments demonstrate that, across different sequencing platforms and depths, SVDF achieves higher calling accuracy for each sample compared to several existing general SV calling tools. We believe that, with its meticulous and sensitive SV detection capability, SVDF can bring new opportunities and advancements to cutting-edge genomic research.</abstract><cop>England</cop><pub>Oxford University Press</pub><pmid>38980375</pmid><doi>10.1093/bib/bbae336</doi><orcidid>https://orcid.org/0000-0002-0673-8503</orcidid><orcidid>https://orcid.org/0000-0002-9505-4049</orcidid><orcidid>https://orcid.org/0009-0009-0870-6693</orcidid><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 1467-5463 |
ispartof | Briefings in bioinformatics, 2024-05, Vol.25 (4) |
issn | ISSN: 1467-5463; ISSN: 1477-4054; EISSN: 1477-4054 |
language | eng |
recordid | cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_11232458 |
source | MEDLINE; Oxford Journals Open Access Collection; PubMed Central |
subjects | Adaptive algorithms; Adaptive sampling; Algorithms; Clustering; Filtration; Gene sequencing; Genomic Structural Variation; Genomics; Genomics - methods; High-Throughput Nucleotide Sequencing - methods; Humans; Machine learning; Problem Solving Protocol; Sequence Analysis, DNA - methods; Software; Structure-function relationships; Variation |
title | SVDF: enhancing structural variation detect from long-read sequencing via automatic filtering strategies |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-27T01%3A13%3A58IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=SVDF:%20enhancing%20structural%20variation%20detect%20from%20long-read%20sequencing%20via%20automatic%20filtering%20strategies&rft.jtitle=Briefings%20in%20bioinformatics&rft.au=Hu,%20Heng&rft.date=2024-05-23&rft.volume=25&rft.issue=4&rft.issn=1467-5463&rft.eissn=1477-4054&rft_id=info:doi/10.1093/bib/bbae336&rft_dat=%3Cproquest_pubme%3E3103006754%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3103006754&rft_id=info:pmid/38980375&rft_oup_id=10.1093/bib/bbae336&rfr_iscdi=true |