ReadCurrent: a VDCNN-based tool for fast and accurate nanopore selective sequencing

Abstract Nanopore selective sequencing allows the targeted sequencing of DNA of interest using computational approaches rather than experimental methods such as targeted multiplex polymerase chain reaction or hybridization capture. Compared to sequence-alignment strategies, deep learning (DL) models...

Detailed description

Bibliographic details
Published in: Briefings in bioinformatics 2024-07, Vol.25 (5)
Main authors: Fan, Kechen; Li, Mengfan; Zhang, Jiarong; Xie, Zihan; Jiang, Daguang; Bo, Xiaochen; Zhao, Dongsheng; Shi, Shenghui; Ni, Ming
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page
container_issue 5
container_start_page
container_title Briefings in bioinformatics
container_volume 25
creator Fan, Kechen
Li, Mengfan
Zhang, Jiarong
Xie, Zihan
Jiang, Daguang
Bo, Xiaochen
Zhao, Dongsheng
Shi, Shenghui
Ni, Ming
description Abstract Nanopore selective sequencing allows the targeted sequencing of DNA of interest using computational approaches rather than experimental methods such as targeted multiplex polymerase chain reaction or hybridization capture. Compared to sequence-alignment strategies, deep learning (DL) models for classifying target and nontarget DNA provide large speed advantages. However, the relatively low accuracy of these DL-based tools hinders their application in nanopore selective sequencing. Here, we present a DL-based tool named ReadCurrent for nanopore selective sequencing, which takes electric currents as inputs. ReadCurrent employs a modified very deep convolutional neural network (VDCNN) architecture, enabling significantly lower computational costs for training and quicker inference compared to conventional VDCNN. We evaluated the performance of ReadCurrent across 10 nanopore sequencing datasets spanning human, yeasts, bacteria, and viruses. We observed that ReadCurrent achieved a mean accuracy of 98.57% for classification, outperforming four other DL-based selective sequencing methods. In experimental validation that selectively sequenced microbial DNA from human DNA, ReadCurrent achieved an enrichment ratio of 2.85, which was higher than the 2.7 ratio achieved by MinKNOW using the sequence-alignment strategy. In summary, ReadCurrent can rapidly classify target and nontarget DNA with high accuracy, providing an alternative in the toolbox for nanopore selective sequencing. ReadCurrent is available at https://github.com/Ming-Ni-Group/ReadCurrent.
doi_str_mv 10.1093/bib/bbae435
format Article
fullrecord <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_11370629</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><oup_id>10.1093/bib/bbae435</oup_id><sourcerecordid>3100562731</sourcerecordid><originalsourceid>FETCH-LOGICAL-c329t-8b24b62bd1e8b0462ea54ed5ac727a49fe3afc75fca51e21ca7bdc2742f2d8733</originalsourceid><addsrcrecordid>eNp9kUtr3DAUhUVpaV5ddV8EgVIoTvSyZXVTyqRpCiGB9LEVV_J14uCRppIdyL-vhpmEJouu7oH7cTiHQ8hbzo44M_LYDe7YOUAl6xdklyutK8Vq9XKtG13VqpE7ZC_nW8YE0y1_TXakEaJpDdslP64QusWcEobpEwX6-2RxcVE5yNjRKcaR9jHRHvJEIXQUvJ8TTEgDhLiKCWnGEf003K3VnxmDH8L1AXnVw5jxzfbuk1-nX38uzqrzy2_fF1_OKy-FmarWCeUa4TqOrWOqEQi1wq4Gr4UGZXqU0Htd9x5qjoJ70K7zQivRi67VUu6Tzxvf1eyW2PlSIcFoV2lYQrq3EQb79BOGG3sd7yznUrNGmOLwYeuQYkmfJ7scssdxhIBxzlZyxupGaMkLevgMvY1zCqXfmhLGGMFVoT5uKJ9izgn7xzSc2fVatqxlt2sV-t2_BR7Zh3kK8H4DxHn1X6e_Ewye0g</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3102999214</pqid></control><display><type>article</type><title>ReadCurrent: a VDCNN-based tool for fast and accurate nanopore selective sequencing</title><source>MEDLINE</source><source>Oxford Open</source><source>PubMed Central</source><creator>Fan, Kechen ; Li, Mengfan ; Zhang, Jiarong ; Xie, Zihan ; Jiang, Daguang ; Bo, Xiaochen ; Zhao, Dongsheng ; Shi, Shenghui ; Ni, Ming</creator><creatorcontrib>Fan, Kechen ; Li, Mengfan ; Zhang, Jiarong ; Xie, Zihan ; Jiang, Daguang ; Bo, Xiaochen ; Zhao, Dongsheng ; Shi, Shenghui ; Ni, Ming</creatorcontrib><description>Abstract Nanopore selective sequencing allows the targeted sequencing of DNA of interest using computational approaches rather than experimental methods such as targeted multiplex polymerase chain reaction or hybridization capture. Compared to sequence-alignment strategies, deep learning (DL) models for classifying target and nontarget DNA provide large speed advantages. 
However, the relatively low accuracy of these DL-based tools hinders their application in nanopore selective sequencing. Here, we present a DL-based tool named ReadCurrent for nanopore selective sequencing, which takes electric currents as inputs. ReadCurrent employs a modified very deep convolutional neural network (VDCNN) architecture, enabling significantly lower computational costs for training and quicker inference compared to conventional VDCNN. We evaluated the performance of ReadCurrent across 10 nanopore sequencing datasets spanning human, yeasts, bacteria, and viruses. We observed that ReadCurrent achieved a mean accuracy of 98.57% for classification, outperforming four other DL-based selective sequencing methods. In experimental validation that selectively sequenced microbial DNA from human DNA, ReadCurrent achieved an enrichment ratio of 2.85, which was higher than the 2.7 ratio achieved by MinKNOW using the sequence-alignment strategy. In summary, ReadCurrent can rapidly classify target and nontarget DNA with high accuracy, providing an alternative in the toolbox for nanopore selective sequencing. 
ReadCurrent is available at https://github.com/Ming-Ni-Group/ReadCurrent.</description><identifier>ISSN: 1467-5463</identifier><identifier>ISSN: 1477-4054</identifier><identifier>EISSN: 1477-4054</identifier><identifier>DOI: 10.1093/bib/bbae435</identifier><identifier>PMID: 39226890</identifier><language>eng</language><publisher>England: Oxford University Press</publisher><subject>Accuracy ; Alignment ; Artificial neural networks ; Classification ; Computational Biology - methods ; Computer applications ; Computing costs ; Deep Learning ; Deoxyribonucleic acid ; DNA ; DNA sequencing ; Experimental methods ; Gene sequencing ; High-Throughput Nucleotide Sequencing - methods ; Human performance ; Humans ; Hybridization ; Machine learning ; Microorganisms ; Nanopore Sequencing - methods ; Nanopores ; Neural networks ; Neural Networks, Computer ; Nucleotide sequence ; Polymerase chain reaction ; Problem Solving Protocol ; Sequence Analysis, DNA - methods ; Software ; Yeasts</subject><ispartof>Briefings in bioinformatics, 2024-07, Vol.25 (5)</ispartof><rights>The Author(s) 2024. Published by Oxford University Press. 2024</rights><rights>The Author(s) 2024. 
Published by Oxford University Press.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c329t-8b24b62bd1e8b0462ea54ed5ac727a49fe3afc75fca51e21ca7bdc2742f2d8733</cites><orcidid>0009-0009-3262-6909 ; 0000-0003-2616-8891 ; 0000-0001-9465-2787</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC11370629/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC11370629/$$EHTML$$P50$$Gpubmedcentral$$Hfree_for_read</linktohtml><link.rule.ids>230,314,723,776,780,881,1598,27903,27904,53770,53772</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/39226890$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Fan, Kechen</creatorcontrib><creatorcontrib>Li, Mengfan</creatorcontrib><creatorcontrib>Zhang, Jiarong</creatorcontrib><creatorcontrib>Xie, Zihan</creatorcontrib><creatorcontrib>Jiang, Daguang</creatorcontrib><creatorcontrib>Bo, Xiaochen</creatorcontrib><creatorcontrib>Zhao, Dongsheng</creatorcontrib><creatorcontrib>Shi, Shenghui</creatorcontrib><creatorcontrib>Ni, Ming</creatorcontrib><title>ReadCurrent: a VDCNN-based tool for fast and accurate nanopore selective sequencing</title><title>Briefings in bioinformatics</title><addtitle>Brief Bioinform</addtitle><description>Abstract Nanopore selective sequencing allows the targeted sequencing of DNA of interest using computational approaches rather than experimental methods such as targeted multiplex polymerase chain reaction or hybridization capture. Compared to sequence-alignment strategies, deep learning (DL) models for classifying target and nontarget DNA provide large speed advantages. 
However, the relatively low accuracy of these DL-based tools hinders their application in nanopore selective sequencing. Here, we present a DL-based tool named ReadCurrent for nanopore selective sequencing, which takes electric currents as inputs. ReadCurrent employs a modified very deep convolutional neural network (VDCNN) architecture, enabling significantly lower computational costs for training and quicker inference compared to conventional VDCNN. We evaluated the performance of ReadCurrent across 10 nanopore sequencing datasets spanning human, yeasts, bacteria, and viruses. We observed that ReadCurrent achieved a mean accuracy of 98.57% for classification, outperforming four other DL-based selective sequencing methods. In experimental validation that selectively sequenced microbial DNA from human DNA, ReadCurrent achieved an enrichment ratio of 2.85, which was higher than the 2.7 ratio achieved by MinKNOW using the sequence-alignment strategy. In summary, ReadCurrent can rapidly classify target and nontarget DNA with high accuracy, providing an alternative in the toolbox for nanopore selective sequencing. 
ReadCurrent is available at https://github.com/Ming-Ni-Group/ReadCurrent.</description><subject>Accuracy</subject><subject>Alignment</subject><subject>Artificial neural networks</subject><subject>Classification</subject><subject>Computational Biology - methods</subject><subject>Computer applications</subject><subject>Computing costs</subject><subject>Deep Learning</subject><subject>Deoxyribonucleic acid</subject><subject>DNA</subject><subject>DNA sequencing</subject><subject>Experimental methods</subject><subject>Gene sequencing</subject><subject>High-Throughput Nucleotide Sequencing - methods</subject><subject>Human performance</subject><subject>Humans</subject><subject>Hybridization</subject><subject>Machine learning</subject><subject>Microorganisms</subject><subject>Nanopore Sequencing - methods</subject><subject>Nanopores</subject><subject>Neural networks</subject><subject>Neural Networks, Computer</subject><subject>Nucleotide sequence</subject><subject>Polymerase chain reaction</subject><subject>Problem Solving Protocol</subject><subject>Sequence Analysis, DNA - methods</subject><subject>Software</subject><subject>Yeasts</subject><issn>1467-5463</issn><issn>1477-4054</issn><issn>1477-4054</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>TOX</sourceid><sourceid>EIF</sourceid><recordid>eNp9kUtr3DAUhUVpaV5ddV8EgVIoTvSyZXVTyqRpCiGB9LEVV_J14uCRppIdyL-vhpmEJouu7oH7cTiHQ8hbzo44M_LYDe7YOUAl6xdklyutK8Vq9XKtG13VqpE7ZC_nW8YE0y1_TXakEaJpDdslP64QusWcEobpEwX6-2RxcVE5yNjRKcaR9jHRHvJEIXQUvJ8TTEgDhLiKCWnGEf003K3VnxmDH8L1AXnVw5jxzfbuk1-nX38uzqrzy2_fF1_OKy-FmarWCeUa4TqOrWOqEQi1wq4Gr4UGZXqU0Htd9x5qjoJ70K7zQivRi67VUu6Tzxvf1eyW2PlSIcFoV2lYQrq3EQb79BOGG3sd7yznUrNGmOLwYeuQYkmfJ7scssdxhIBxzlZyxupGaMkLevgMvY1zCqXfmhLGGMFVoT5uKJ9izgn7xzSc2fVatqxlt2sV-t2_BR7Zh3kK8H4DxHn1X6e_Ewye0g</recordid><startdate>20240725</startdate><enddate>20240725</enddate><creator>Fan, Kechen</creator><creator>Li, 
Mengfan</creator><creator>Zhang, Jiarong</creator><creator>Xie, Zihan</creator><creator>Jiang, Daguang</creator><creator>Bo, Xiaochen</creator><creator>Zhao, Dongsheng</creator><creator>Shi, Shenghui</creator><creator>Ni, Ming</creator><general>Oxford University Press</general><general>Oxford Publishing Limited (England)</general><scope>TOX</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7QO</scope><scope>7SC</scope><scope>8FD</scope><scope>FR3</scope><scope>JQ2</scope><scope>K9.</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>P64</scope><scope>RC3</scope><scope>7X8</scope><scope>5PM</scope><orcidid>https://orcid.org/0009-0009-3262-6909</orcidid><orcidid>https://orcid.org/0000-0003-2616-8891</orcidid><orcidid>https://orcid.org/0000-0001-9465-2787</orcidid></search><sort><creationdate>20240725</creationdate><title>ReadCurrent: a VDCNN-based tool for fast and accurate nanopore selective sequencing</title><author>Fan, Kechen ; Li, Mengfan ; Zhang, Jiarong ; Xie, Zihan ; Jiang, Daguang ; Bo, Xiaochen ; Zhao, Dongsheng ; Shi, Shenghui ; Ni, Ming</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c329t-8b24b62bd1e8b0462ea54ed5ac727a49fe3afc75fca51e21ca7bdc2742f2d8733</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Accuracy</topic><topic>Alignment</topic><topic>Artificial neural networks</topic><topic>Classification</topic><topic>Computational Biology - methods</topic><topic>Computer applications</topic><topic>Computing costs</topic><topic>Deep Learning</topic><topic>Deoxyribonucleic acid</topic><topic>DNA</topic><topic>DNA sequencing</topic><topic>Experimental methods</topic><topic>Gene sequencing</topic><topic>High-Throughput Nucleotide Sequencing - methods</topic><topic>Human 
performance</topic><topic>Humans</topic><topic>Hybridization</topic><topic>Machine learning</topic><topic>Microorganisms</topic><topic>Nanopore Sequencing - methods</topic><topic>Nanopores</topic><topic>Neural networks</topic><topic>Neural Networks, Computer</topic><topic>Nucleotide sequence</topic><topic>Polymerase chain reaction</topic><topic>Problem Solving Protocol</topic><topic>Sequence Analysis, DNA - methods</topic><topic>Software</topic><topic>Yeasts</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Fan, Kechen</creatorcontrib><creatorcontrib>Li, Mengfan</creatorcontrib><creatorcontrib>Zhang, Jiarong</creatorcontrib><creatorcontrib>Xie, Zihan</creatorcontrib><creatorcontrib>Jiang, Daguang</creatorcontrib><creatorcontrib>Bo, Xiaochen</creatorcontrib><creatorcontrib>Zhao, Dongsheng</creatorcontrib><creatorcontrib>Shi, Shenghui</creatorcontrib><creatorcontrib>Ni, Ming</creatorcontrib><collection>Oxford Open</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Biotechnology Research Abstracts</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>Engineering Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>ProQuest Health &amp; Medical Complete (Alumni)</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>Genetics Abstracts</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full 
Participant titles)</collection><jtitle>Briefings in bioinformatics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Fan, Kechen</au><au>Li, Mengfan</au><au>Zhang, Jiarong</au><au>Xie, Zihan</au><au>Jiang, Daguang</au><au>Bo, Xiaochen</au><au>Zhao, Dongsheng</au><au>Shi, Shenghui</au><au>Ni, Ming</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>ReadCurrent: a VDCNN-based tool for fast and accurate nanopore selective sequencing</atitle><jtitle>Briefings in bioinformatics</jtitle><addtitle>Brief Bioinform</addtitle><date>2024-07-25</date><risdate>2024</risdate><volume>25</volume><issue>5</issue><issn>1467-5463</issn><issn>1477-4054</issn><eissn>1477-4054</eissn><abstract>Abstract Nanopore selective sequencing allows the targeted sequencing of DNA of interest using computational approaches rather than experimental methods such as targeted multiplex polymerase chain reaction or hybridization capture. Compared to sequence-alignment strategies, deep learning (DL) models for classifying target and nontarget DNA provide large speed advantages. However, the relatively low accuracy of these DL-based tools hinders their application in nanopore selective sequencing. Here, we present a DL-based tool named ReadCurrent for nanopore selective sequencing, which takes electric currents as inputs. ReadCurrent employs a modified very deep convolutional neural network (VDCNN) architecture, enabling significantly lower computational costs for training and quicker inference compared to conventional VDCNN. We evaluated the performance of ReadCurrent across 10 nanopore sequencing datasets spanning human, yeasts, bacteria, and viruses. We observed that ReadCurrent achieved a mean accuracy of 98.57% for classification, outperforming four other DL-based selective sequencing methods. 
In experimental validation that selectively sequenced microbial DNA from human DNA, ReadCurrent achieved an enrichment ratio of 2.85, which was higher than the 2.7 ratio achieved by MinKNOW using the sequence-alignment strategy. In summary, ReadCurrent can rapidly classify target and nontarget DNA with high accuracy, providing an alternative in the toolbox for nanopore selective sequencing. ReadCurrent is available at https://github.com/Ming-Ni-Group/ReadCurrent.</abstract><cop>England</cop><pub>Oxford University Press</pub><pmid>39226890</pmid><doi>10.1093/bib/bbae435</doi><orcidid>https://orcid.org/0009-0009-3262-6909</orcidid><orcidid>https://orcid.org/0000-0003-2616-8891</orcidid><orcidid>https://orcid.org/0000-0001-9465-2787</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1467-5463
ispartof Briefings in bioinformatics, 2024-07, Vol.25 (5)
issn 1467-5463
1477-4054
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_11370629
source MEDLINE; Oxford Open; PubMed Central
subjects Accuracy
Alignment
Artificial neural networks
Classification
Computational Biology - methods
Computer applications
Computing costs
Deep Learning
Deoxyribonucleic acid
DNA
DNA sequencing
Experimental methods
Gene sequencing
High-Throughput Nucleotide Sequencing - methods
Human performance
Humans
Hybridization
Machine learning
Microorganisms
Nanopore Sequencing - methods
Nanopores
Neural networks
Neural Networks, Computer
Nucleotide sequence
Polymerase chain reaction
Problem Solving Protocol
Sequence Analysis, DNA - methods
Software
Yeasts
title ReadCurrent: a VDCNN-based tool for fast and accurate nanopore selective sequencing
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-22T17%3A06%3A35IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=ReadCurrent:%20a%20VDCNN-based%20tool%20for%20fast%20and%20accurate%20nanopore%20selective%20sequencing&rft.jtitle=Briefings%20in%20bioinformatics&rft.au=Fan,%20Kechen&rft.date=2024-07-25&rft.volume=25&rft.issue=5&rft.issn=1467-5463&rft.eissn=1477-4054&rft_id=info:doi/10.1093/bib/bbae435&rft_dat=%3Cproquest_pubme%3E3100562731%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3102999214&rft_id=info:pmid/39226890&rft_oup_id=10.1093/bib/bbae435&rfr_iscdi=true