VaDiR: an integrated approach to Variant Detection in RNA

Advances in next-generation DNA sequencing technologies are now enabling detailed characterization of sequence variations in cancer genomes. With whole-genome sequencing, variations in coding and non-coding sequences can be discovered. But the cost associated with it is currently limiting its genera...

Detailed description

Bibliographic details
Published in: Gigascience 2018-02, Vol.7 (2), p.1-13
Main authors: Neums, Lisa; Suenaga, Seiji; Beyerlein, Peter; Anders, Sara; Koestler, Devin; Mariani, Andrea; Chien, Jeremy
Format: Article
Language: eng
Online access: Full text
container_end_page 13
container_issue 2
container_start_page 1
container_title Gigascience
container_volume 7
creator Neums, Lisa
Suenaga, Seiji
Beyerlein, Peter
Anders, Sara
Koestler, Devin
Mariani, Andrea
Chien, Jeremy
description Advances in next-generation DNA sequencing technologies are now enabling detailed characterization of sequence variations in cancer genomes. With whole-genome sequencing, variations in coding and non-coding sequences can be discovered. But the cost associated with it is currently limiting its general use in research. Whole-exome sequencing is used to characterize sequence variations in coding regions, but the cost associated with capture reagents and biases in capture rate limit its full use in research. Additional limitations include uncertainty in assigning the functional significance of the mutations when these mutations are observed in the non-coding region or in genes that are not expressed in cancer tissue. We investigated the feasibility of uncovering mutations from expressed genes using RNA sequencing datasets with a method called Variant Detection in RNA (VaDiR) that integrates 3 variant callers, namely SNPiR, RVBoost, and MuTect2. The combination of all 3 methods, which we called Tier 1 variants, produced the highest precision with true positive mutations from RNA-seq that could be validated at the DNA level. We also found that the integration of Tier 1 variants with those called by MuTect2 and SNPiR produced the highest recall with acceptable precision. Finally, we observed a higher rate of mutation discovery in genes that are expressed at higher levels. Our method, VaDiR, provides a possibility of uncovering mutations from RNA sequencing datasets that could be useful in further functional analysis. In addition, our approach allows orthogonal validation of DNA-based mutation discovery by providing complementary sequence variation analysis from paired RNA/DNA sequencing datasets.
doi_str_mv 10.1093/gigascience/gix122
format Article
eissn 2047-217X
pmid 29267927
publisher United States: Oxford University Press
rights The Authors 2017. Published by Oxford University Press.
toplevel peer_reviewed; online_resources
oa free_for_read
creationdate 2018
date 2018-02-01
tpages 13
genre article
sourcetype Open Access Repository
sourcerecordid 2715807386
linktopdf https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5827345/pdf/
linktohtml https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5827345/
backlink https://www.ncbi.nlm.nih.gov/pubmed/29267927
collection Medline; MEDLINE (Ovid); PubMed; CrossRef; ProQuest Computer Science Collection; ProQuest Health & Medical Complete (Alumni); MEDLINE - Academic; PubMed Central (Full Participant titles)
fulltext fulltext
identifier ISSN: 2047-217X
ispartof Gigascience, 2018-02, Vol.7 (2), p.1-13
issn 2047-217X
2047-217X
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_5827345
source Oxford Journals Open Access Collection; MEDLINE; EZB-FREE-00999 freely available EZB journals; PubMed Central
subjects Base Pairing
Cancer
Datasets
Datasets as Topic
Deoxyribonucleic acid
DNA
DNA sequencing
DNA, Neoplasm - genetics
Female
Functional analysis
Genes
Genetic Variation
Genome, Human
Genomes
High-Throughput Nucleotide Sequencing - statistics & numerical data
Humans
Mutation
Nucleotide sequence
Ovarian cancer
Ovarian Neoplasms - diagnosis
Ovarian Neoplasms - genetics
Reagents
Ribonucleic acid
RNA
RNA, Neoplasm - genetics
Software
Technical Note
Transcriptome
Variation
Whole genome sequencing
title VaDiR: an integrated approach to Variant Detection in RNA
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-22T20%3A36%3A21IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=VaDiR:%20an%20integrated%20approach%20to%20Variant%20Detection%20in%20RNA&rft.jtitle=Gigascience&rft.au=Neums,%20Lisa&rft.date=2018-02-01&rft.volume=7&rft.issue=2&rft.spage=1&rft.epage=13&rft.pages=1-13&rft.issn=2047-217X&rft.eissn=2047-217X&rft_id=info:doi/10.1093/gigascience/gix122&rft_dat=%3Cproquest_pubme%3E2715807386%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2715807386&rft_id=info:pmid/29267927&rfr_iscdi=true
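
The abstract describes combining the call sets of three variant callers and comparing them with DNA-level validation. The following is a minimal sketch of that idea, not the VaDiR implementation. It assumes that "Tier 1" means variants reported by all three callers, and that the higher-recall set consists of variants reported by both MuTect2 and SNPiR; neither reading is confirmed by this record. The `Variant` type, the function names, and the toy data are hypothetical and do not represent results from the paper.

```python
from typing import NamedTuple, Set, Tuple


class Variant(NamedTuple):
    """Hypothetical minimal variant key: chromosome, position, ref and alt allele."""
    chrom: str
    pos: int
    ref: str
    alt: str


def tier1(snpir: Set[Variant], rvboost: Set[Variant], mutect2: Set[Variant]) -> Set[Variant]:
    """Assumed Tier 1: variants reported by all three callers."""
    return snpir & rvboost & mutect2


def mutect2_and_snpir(snpir: Set[Variant], mutect2: Set[Variant]) -> Set[Variant]:
    """Assumed broader set: variants reported by both MuTect2 and SNPiR.

    Under the assumption above this is a superset of Tier 1.
    """
    return snpir & mutect2


def precision_recall(called: Set[Variant], validated: Set[Variant]) -> Tuple[float, float]:
    """Precision and recall of RNA-derived calls against DNA-validated variants."""
    true_positives = len(called & validated)
    precision = true_positives / len(called) if called else 0.0
    recall = true_positives / len(validated) if validated else 0.0
    return precision, recall


# Toy data, invented for illustration only.
a = Variant("chr1", 100, "A", "G")
b = Variant("chr2", 200, "C", "T")
c = Variant("chr3", 300, "G", "A")
d = Variant("chr4", 400, "T", "C")
e = Variant("chr5", 500, "A", "T")

snpir_calls = {a, b, c, e}
rvboost_calls = {a, c, d}
mutect2_calls = {a, b, d, e}
dna_validated = {a, b, d}

tier1_calls = tier1(snpir_calls, rvboost_calls, mutect2_calls)
broad_calls = mutect2_and_snpir(snpir_calls, mutect2_calls)

tier1_precision, tier1_recall = precision_recall(tier1_calls, dna_validated)
broad_precision, broad_recall = precision_recall(broad_calls, dna_validated)
```

In this toy example the three-caller intersection keeps only variants everyone agrees on, so its precision is high and its recall is low, while the two-caller set recovers more validated variants at the cost of admitting a false positive. This mirrors the precision/recall trade-off the abstract reports, without reproducing any of its numbers.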