Modular, efficient and constant-memory single-cell RNA-seq preprocessing
We describe a workflow for preprocessing of single-cell RNA-sequencing data that balances efficiency and accuracy. Our workflow is based on the kallisto and bustools programs, and is near optimal in speed with a constant memory requirement providing scalability for arbitrarily large datasets. The wo...
Published in: | Nature biotechnology 2021-07, Vol.39 (7), p.813-818 |
---|---|
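Main authors: | Melsted, Páll ; Booeshaghi, A. Sina ; Liu, Lauren ; Gao, Fan ; Lu, Lambda ; Min, Kyung Hoi (Joseph) ; da Veiga Beltrame, Eduardo ; Hjörleifsson, Kristján Eldjárn ; Gehring, Jase ; Pachter, Lior |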
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 818 |
---|---|
container_issue | 7 |
container_start_page | 813 |
container_title | Nature biotechnology |
container_volume | 39 |
creator | Melsted, Páll ; Booeshaghi, A. Sina ; Liu, Lauren ; Gao, Fan ; Lu, Lambda ; Min, Kyung Hoi (Joseph) ; da Veiga Beltrame, Eduardo ; Hjörleifsson, Kristján Eldjárn ; Gehring, Jase ; Pachter, Lior |
description | We describe a workflow for preprocessing of single-cell RNA-sequencing data that balances efficiency and accuracy. Our workflow is based on the kallisto and bustools programs, and is near optimal in speed with a constant memory requirement providing scalability for arbitrarily large datasets. The workflow is modular, and we demonstrate its flexibility by showing how it can be used for RNA velocity analyses.
A preprocessing workflow for single-cell RNA-seq data achieves near-optimal speed. |
doi_str_mv | 10.1038/s41587-021-00870-2 |
format | Article |
fullrecord | <record><control><sourceid>gale_proqu</sourceid><recordid>TN_cdi_proquest_miscellaneous_2508576726</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><galeid>A668270166</galeid><sourcerecordid>A668270166</sourcerecordid><originalsourceid>FETCH-LOGICAL-c622t-6f2545c266fba2da8aa9978558a789ff6e88e271b0f387fa1f79a3fafd6698fd3</originalsourceid><addsrcrecordid>eNqNkltrFTEUhYMotlb_gA8yIEgLTU0yJzuZx0OptlAt1MtryJnZGVPmcprMgP33ZjqtdURF8pCQ_a2VTfYi5CVnR5zl-m1ccakVZYJTxrRiVDwiu1yugHIo4HE6s6nMJeyQZzFeMcZgBfCU7OS5KqTWepecfuirsbHhMEPnfOmxGzLbVVnZd3Gw3UBbbPtwk0Xf1Q3SEpsmu_y4phGvs23AbehLjFPxOXnibBPxxd2-R768O_l8fErPL96fHa_PaQlCDBSckCtZCgC3saKy2tqiUFpKbZUunAPUGoXiG-ZyrZzlThU2d9ZVAIV2Vb5H9mff9PT1iHEwrY9TW7bDfoxGSKalAiUgoa9_Q6_6MXSpu0RJJkEwUA9UbRs0vnP9EGw5mZo1gBaKcZi8jv5ApVVh69NnofPpfiF4sxAkZsDvQ23HGM0SPPg7ePbp8v_Zi69L9vAXdjOmId1OKvr62xBnyQIXM16GPsaAzmyDb224MZyZKW5mjptJcTO3cTMiiV7d_fC4abH6KbnPVwLyGYip1NUYHkbwD9sfnJLa2w</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2550562067</pqid></control><display><type>article</type><title>Modular, efficient and constant-memory single-cell RNA-seq preprocessing</title><source>MEDLINE</source><source>Nature Journals Online</source><source>Alma/SFX Local Collection</source><creator>Melsted, Páll ; Booeshaghi, A. Sina ; Liu, Lauren ; Gao, Fan ; Lu, Lambda ; Min, Kyung Hoi (Joseph) ; da Veiga Beltrame, Eduardo ; Hjörleifsson, Kristján Eldjárn ; Gehring, Jase ; Pachter, Lior</creator><creatorcontrib>Melsted, Páll ; Booeshaghi, A. Sina ; Liu, Lauren ; Gao, Fan ; Lu, Lambda ; Min, Kyung Hoi (Joseph) ; da Veiga Beltrame, Eduardo ; Hjörleifsson, Kristján Eldjárn ; Gehring, Jase ; Pachter, Lior</creatorcontrib><description>We describe a workflow for preprocessing of single-cell RNA-sequencing data that balances efficiency and accuracy. 
Our workflow is based on the kallisto and bustools programs, and is near optimal in speed with a constant memory requirement providing scalability for arbitrarily large datasets. The workflow is modular, and we demonstrate its flexibility by showing how it can be used for RNA velocity analyses.
A preprocessing workflow for single-cell RNA-seq data achieves near-optimal speed.</description><identifier>ISSN: 1087-0156</identifier><identifier>ISSN: 1546-1696</identifier><identifier>EISSN: 1546-1696</identifier><identifier>DOI: 10.1038/s41587-021-00870-2</identifier><identifier>PMID: 33795888</identifier><language>eng</language><publisher>New York: Nature Publishing Group US</publisher><subject>631/114/2785 ; 631/114/794 ; 631/61/212/2019 ; Agriculture ; Analysis ; Base Sequence ; Bioinformatics ; Biomedical and Life Sciences ; Biomedical Engineering/Biotechnology ; Biomedicine ; Biotechnology ; Computer engineering ; Computer science ; Datasets ; Efficiency ; Experiments ; Gene sequencing ; Genes ; Genetic engineering ; Genetic markers ; Genomics ; High-Throughput Nucleotide Sequencing ; Humans ; Identification and classification ; Letter ; Life Sciences ; Mechanical engineering ; Methods ; Preprocessing ; Ribonucleic acid ; RNA ; RNA sequencing ; Sequence Analysis, RNA ; Single-Cell Analysis ; Software ; Workflow</subject><ispartof>Nature biotechnology, 2021-07, Vol.39 (7), p.813-818</ispartof><rights>The Author(s), under exclusive licence to Springer Nature America, Inc. 2021. Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.</rights><rights>2021. 
The Author(s), under exclusive licence to Springer Nature America, Inc.</rights><rights>COPYRIGHT 2021 Nature Publishing Group</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c622t-6f2545c266fba2da8aa9978558a789ff6e88e271b0f387fa1f79a3fafd6698fd3</citedby><cites>FETCH-LOGICAL-c622t-6f2545c266fba2da8aa9978558a789ff6e88e271b0f387fa1f79a3fafd6698fd3</cites><orcidid>0000-0003-0894-4017 ; 0000-0002-9164-6231</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,27924,27925</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/33795888$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Melsted, Páll</creatorcontrib><creatorcontrib>Booeshaghi, A. Sina</creatorcontrib><creatorcontrib>Liu, Lauren</creatorcontrib><creatorcontrib>Gao, Fan</creatorcontrib><creatorcontrib>Lu, Lambda</creatorcontrib><creatorcontrib>Min, Kyung Hoi (Joseph)</creatorcontrib><creatorcontrib>da Veiga Beltrame, Eduardo</creatorcontrib><creatorcontrib>Hjörleifsson, Kristján Eldjárn</creatorcontrib><creatorcontrib>Gehring, Jase</creatorcontrib><creatorcontrib>Pachter, Lior</creatorcontrib><title>Modular, efficient and constant-memory single-cell RNA-seq preprocessing</title><title>Nature biotechnology</title><addtitle>Nat Biotechnol</addtitle><addtitle>Nat Biotechnol</addtitle><description>We describe a workflow for preprocessing of single-cell RNA-sequencing data that balances efficiency and accuracy. Our workflow is based on the kallisto and bustools programs, and is near optimal in speed with a constant memory requirement providing scalability for arbitrarily large datasets. The workflow is modular, and we demonstrate its flexibility by showing how it can be used for RNA velocity analyses.
A preprocessing workflow for single-cell RNA-seq data achieves near-optimal speed.</description><subject>631/114/2785</subject><subject>631/114/794</subject><subject>631/61/212/2019</subject><subject>Agriculture</subject><subject>Analysis</subject><subject>Base Sequence</subject><subject>Bioinformatics</subject><subject>Biomedical and Life Sciences</subject><subject>Biomedical Engineering/Biotechnology</subject><subject>Biomedicine</subject><subject>Biotechnology</subject><subject>Computer engineering</subject><subject>Computer science</subject><subject>Datasets</subject><subject>Efficiency</subject><subject>Experiments</subject><subject>Gene sequencing</subject><subject>Genes</subject><subject>Genetic engineering</subject><subject>Genetic markers</subject><subject>Genomics</subject><subject>High-Throughput Nucleotide Sequencing</subject><subject>Humans</subject><subject>Identification and classification</subject><subject>Letter</subject><subject>Life Sciences</subject><subject>Mechanical engineering</subject><subject>Methods</subject><subject>Preprocessing</subject><subject>Ribonucleic acid</subject><subject>RNA</subject><subject>RNA sequencing</subject><subject>Sequence Analysis, RNA</subject><subject>Single-Cell 
Analysis</subject><subject>Software</subject><subject>Workflow</subject><issn>1087-0156</issn><issn>1546-1696</issn><issn>1546-1696</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><sourceid>N95</sourceid><sourceid>8G5</sourceid><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><sourceid>GNUQQ</sourceid><sourceid>GUQSH</sourceid><sourceid>M2O</sourceid><recordid>eNqNkltrFTEUhYMotlb_gA8yIEgLTU0yJzuZx0OptlAt1MtryJnZGVPmcprMgP33ZjqtdURF8pCQ_a2VTfYi5CVnR5zl-m1ccakVZYJTxrRiVDwiu1yugHIo4HE6s6nMJeyQZzFeMcZgBfCU7OS5KqTWepecfuirsbHhMEPnfOmxGzLbVVnZd3Gw3UBbbPtwk0Xf1Q3SEpsmu_y4phGvs23AbehLjFPxOXnibBPxxd2-R768O_l8fErPL96fHa_PaQlCDBSckCtZCgC3saKy2tqiUFpKbZUunAPUGoXiG-ZyrZzlThU2d9ZVAIV2Vb5H9mff9PT1iHEwrY9TW7bDfoxGSKalAiUgoa9_Q6_6MXSpu0RJJkEwUA9UbRs0vnP9EGw5mZo1gBaKcZi8jv5ApVVh69NnofPpfiF4sxAkZsDvQ23HGM0SPPg7ePbp8v_Zi69L9vAXdjOmId1OKvr62xBnyQIXM16GPsaAzmyDb224MZyZKW5mjptJcTO3cTMiiV7d_fC4abH6KbnPVwLyGYip1NUYHkbwD9sfnJLa2w</recordid><startdate>20210701</startdate><enddate>20210701</enddate><creator>Melsted, Páll</creator><creator>Booeshaghi, A. 
Sina</creator><creator>Liu, Lauren</creator><creator>Gao, Fan</creator><creator>Lu, Lambda</creator><creator>Min, Kyung Hoi (Joseph)</creator><creator>da Veiga Beltrame, Eduardo</creator><creator>Hjörleifsson, Kristján Eldjárn</creator><creator>Gehring, Jase</creator><creator>Pachter, Lior</creator><general>Nature Publishing Group US</general><general>Nature Publishing Group</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>N95</scope><scope>XI7</scope><scope>IOV</scope><scope>ISR</scope><scope>3V.</scope><scope>7QO</scope><scope>7QP</scope><scope>7QR</scope><scope>7T7</scope><scope>7TK</scope><scope>7TM</scope><scope>7X7</scope><scope>7XB</scope><scope>88A</scope><scope>88E</scope><scope>88I</scope><scope>8AO</scope><scope>8FD</scope><scope>8FE</scope><scope>8FG</scope><scope>8FH</scope><scope>8FI</scope><scope>8FJ</scope><scope>8FK</scope><scope>8G5</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BBNVY</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>BHPHI</scope><scope>C1K</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>FR3</scope><scope>FYUFA</scope><scope>GHDGH</scope><scope>GNUQQ</scope><scope>GUQSH</scope><scope>HCIFZ</scope><scope>K9.</scope><scope>L6V</scope><scope>LK8</scope><scope>M0S</scope><scope>M1P</scope><scope>M2O</scope><scope>M2P</scope><scope>M7P</scope><scope>M7S</scope><scope>MBDVC</scope><scope>P64</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>PTHSS</scope><scope>Q9U</scope><scope>RC3</scope><scope>7X8</scope><orcidid>https://orcid.org/0000-0003-0894-4017</orcidid><orcidid>https://orcid.org/0000-0002-9164-6231</orcidid></search><sort><creationdate>20210701</creationdate><title>Modular, efficient and constant-memory single-cell RNA-seq preprocessing</title><author>Melsted, Páll ; Booeshaghi, A. 
Sina ; Liu, Lauren ; Gao, Fan ; Lu, Lambda ; Min, Kyung Hoi (Joseph) ; da Veiga Beltrame, Eduardo ; Hjörleifsson, Kristján Eldjárn ; Gehring, Jase ; Pachter, Lior</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c622t-6f2545c266fba2da8aa9978558a789ff6e88e271b0f387fa1f79a3fafd6698fd3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>631/114/2785</topic><topic>631/114/794</topic><topic>631/61/212/2019</topic><topic>Agriculture</topic><topic>Analysis</topic><topic>Base Sequence</topic><topic>Bioinformatics</topic><topic>Biomedical and Life Sciences</topic><topic>Biomedical Engineering/Biotechnology</topic><topic>Biomedicine</topic><topic>Biotechnology</topic><topic>Computer engineering</topic><topic>Computer science</topic><topic>Datasets</topic><topic>Efficiency</topic><topic>Experiments</topic><topic>Gene sequencing</topic><topic>Genes</topic><topic>Genetic engineering</topic><topic>Genetic markers</topic><topic>Genomics</topic><topic>High-Throughput Nucleotide Sequencing</topic><topic>Humans</topic><topic>Identification and classification</topic><topic>Letter</topic><topic>Life Sciences</topic><topic>Mechanical engineering</topic><topic>Methods</topic><topic>Preprocessing</topic><topic>Ribonucleic acid</topic><topic>RNA</topic><topic>RNA sequencing</topic><topic>Sequence Analysis, RNA</topic><topic>Single-Cell Analysis</topic><topic>Software</topic><topic>Workflow</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Melsted, Páll</creatorcontrib><creatorcontrib>Booeshaghi, A. 
Sina</creatorcontrib><creatorcontrib>Liu, Lauren</creatorcontrib><creatorcontrib>Gao, Fan</creatorcontrib><creatorcontrib>Lu, Lambda</creatorcontrib><creatorcontrib>Min, Kyung Hoi (Joseph)</creatorcontrib><creatorcontrib>da Veiga Beltrame, Eduardo</creatorcontrib><creatorcontrib>Hjörleifsson, Kristján Eldjárn</creatorcontrib><creatorcontrib>Gehring, Jase</creatorcontrib><creatorcontrib>Pachter, Lior</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Gale Business: Insights</collection><collection>Business Insights: Essentials</collection><collection>Gale In Context: Opposing Viewpoints</collection><collection>Gale In Context: Science</collection><collection>ProQuest Central (Corporate)</collection><collection>Biotechnology Research Abstracts</collection><collection>Calcium & Calcified Tissue Abstracts</collection><collection>Chemoreception Abstracts</collection><collection>Industrial and Applied Microbiology Abstracts (Microbiology A)</collection><collection>Neurosciences Abstracts</collection><collection>Nucleic Acids Abstracts</collection><collection>Health & Medical Collection</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>Biology Database (Alumni Edition)</collection><collection>Medical Database (Alumni Edition)</collection><collection>Science Database (Alumni Edition)</collection><collection>ProQuest Pharma Collection</collection><collection>Technology Research Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Natural Science Collection</collection><collection>Hospital Premium Collection</collection><collection>Hospital Premium Collection (Alumni Edition)</collection><collection>ProQuest Central (Alumni) (purchase 
pre-March 2016)</collection><collection>Research Library (Alumni Edition)</collection><collection>Materials Science & Engineering Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>ProQuest Central Essentials</collection><collection>Biological Science Collection</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>Natural Science Collection</collection><collection>Environmental Sciences and Pollution Management</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>Engineering Research Database</collection><collection>Health Research Premium Collection</collection><collection>Health Research Premium Collection (Alumni)</collection><collection>ProQuest Central Student</collection><collection>Research Library Prep</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Health & Medical Complete (Alumni)</collection><collection>ProQuest Engineering Collection</collection><collection>ProQuest Biological Science Collection</collection><collection>Health & Medical Collection (Alumni Edition)</collection><collection>Medical Database</collection><collection>Research Library</collection><collection>Science Database</collection><collection>Biological Science Database</collection><collection>Engineering Database</collection><collection>Research Library (Corporate)</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Engineering Collection</collection><collection>ProQuest Central Basic</collection><collection>Genetics Abstracts</collection><collection>MEDLINE - 
Academic</collection><jtitle>Nature biotechnology</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Melsted, Páll</au><au>Booeshaghi, A. Sina</au><au>Liu, Lauren</au><au>Gao, Fan</au><au>Lu, Lambda</au><au>Min, Kyung Hoi (Joseph)</au><au>da Veiga Beltrame, Eduardo</au><au>Hjörleifsson, Kristján Eldjárn</au><au>Gehring, Jase</au><au>Pachter, Lior</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Modular, efficient and constant-memory single-cell RNA-seq preprocessing</atitle><jtitle>Nature biotechnology</jtitle><stitle>Nat Biotechnol</stitle><addtitle>Nat Biotechnol</addtitle><date>2021-07-01</date><risdate>2021</risdate><volume>39</volume><issue>7</issue><spage>813</spage><epage>818</epage><pages>813-818</pages><issn>1087-0156</issn><issn>1546-1696</issn><eissn>1546-1696</eissn><abstract>We describe a workflow for preprocessing of single-cell RNA-sequencing data that balances efficiency and accuracy. Our workflow is based on the kallisto and bustools programs, and is near optimal in speed with a constant memory requirement providing scalability for arbitrarily large datasets. The workflow is modular, and we demonstrate its flexibility by showing how it can be used for RNA velocity analyses.
A preprocessing workflow for single-cell RNA-seq data achieves near-optimal speed.</abstract><cop>New York</cop><pub>Nature Publishing Group US</pub><pmid>33795888</pmid><doi>10.1038/s41587-021-00870-2</doi><tpages>6</tpages><orcidid>https://orcid.org/0000-0003-0894-4017</orcidid><orcidid>https://orcid.org/0000-0002-9164-6231</orcidid><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 1087-0156 |
ispartof | Nature biotechnology, 2021-07, Vol.39 (7), p.813-818 |
issn | 1087-0156 ; 1546-1696 |
language | eng |
recordid | cdi_proquest_miscellaneous_2508576726 |
source | MEDLINE; Nature Journals Online; Alma/SFX Local Collection |
subjects | 631/114/2785 ; 631/114/794 ; 631/61/212/2019 ; Agriculture ; Analysis ; Base Sequence ; Bioinformatics ; Biomedical and Life Sciences ; Biomedical Engineering/Biotechnology ; Biomedicine ; Biotechnology ; Computer engineering ; Computer science ; Datasets ; Efficiency ; Experiments ; Gene sequencing ; Genes ; Genetic engineering ; Genetic markers ; Genomics ; High-Throughput Nucleotide Sequencing ; Humans ; Identification and classification ; Letter ; Life Sciences ; Mechanical engineering ; Methods ; Preprocessing ; Ribonucleic acid ; RNA ; RNA sequencing ; Sequence Analysis, RNA ; Single-Cell Analysis ; Software ; Workflow |
title | Modular, efficient and constant-memory single-cell RNA-seq preprocessing |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-21T09%3A15%3A09IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_proqu&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Modular,%20efficient%20and%20constant-memory%20single-cell%20RNA-seq%20preprocessing&rft.jtitle=Nature%20biotechnology&rft.au=Melsted,%20P%C3%A1ll&rft.date=2021-07-01&rft.volume=39&rft.issue=7&rft.spage=813&rft.epage=818&rft.pages=813-818&rft.issn=1087-0156&rft.eissn=1546-1696&rft_id=info:doi/10.1038/s41587-021-00870-2&rft_dat=%3Cgale_proqu%3EA668270166%3C/gale_proqu%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2550562067&rft_id=info:pmid/33795888&rft_galeid=A668270166&rfr_iscdi=true |