VaDiR: an integrated approach to Variant Detection in RNA
Advances in next-generation DNA sequencing technologies are now enabling detailed characterization of sequence variations in cancer genomes. With whole-genome sequencing, variations in coding and non-coding sequences can be discovered. But the cost associated with it is currently limiting its genera...
Published in: | Gigascience 2018-02, Vol.7 (2), p.1-13 |
---|---|
Main authors: | Neums, Lisa; Suenaga, Seiji; Beyerlein, Peter; Anders, Sara; Koestler, Devin; Mariani, Andrea; Chien, Jeremy |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 13 |
---|---|
container_issue | 2 |
container_start_page | 1 |
container_title | Gigascience |
container_volume | 7 |
creator | Neums, Lisa Suenaga, Seiji Beyerlein, Peter Anders, Sara Koestler, Devin Mariani, Andrea Chien, Jeremy |
description | Advances in next-generation DNA sequencing technologies are now enabling detailed characterization of sequence variations in cancer genomes. With whole-genome sequencing, variations in coding and non-coding sequences can be discovered. But the cost associated with it is currently limiting its general use in research. Whole-exome sequencing is used to characterize sequence variations in coding regions, but the cost associated with capture reagents and biases in capture rate limit its full use in research. Additional limitations include uncertainty in assigning the functional significance of the mutations when these mutations are observed in the non-coding region or in genes that are not expressed in cancer tissue.
We investigated the feasibility of uncovering mutations from expressed genes using RNA sequencing datasets with a method called Variant Detection in RNA (VaDiR) that integrates 3 variant callers: SNPiR, RVBoost, and MuTect2. The combination of all 3 methods, which we called Tier 1 variants, produced the highest precision with true positive mutations from RNA-seq that could be validated at the DNA level. We also found that the integration of Tier 1 variants with those called by MuTect2 and SNPiR produced the highest recall with acceptable precision. Finally, we observed a higher rate of mutation discovery in genes that are expressed at higher levels.
Our method, VaDiR, provides a possibility of uncovering mutations from RNA sequencing datasets that could be useful in further functional analysis. In addition, our approach allows orthogonal validation of DNA-based mutation discovery by providing complementary sequence variation analysis from paired RNA/DNA sequencing datasets. |
doi_str_mv | 10.1093/gigascience/gix122 |
format | Article |
fullrecord | <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_5827345</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2715807386</sourcerecordid><originalsourceid>FETCH-LOGICAL-c430t-c7ac1e73d21b498621eef83195ff7cda30b5eff218761c364dd6465d151b23313</originalsourceid><addsrcrecordid>eNpdkV9LwzAUxYMobsx9AR-k4Isv1SZpm8YHYWz-g6EwdPgW0vS2y-iamaSi396OTZnel3vh_u7hHg5Cpzi6xBGnV5WupFMaGgXd_IkJOUB9EsUsJJi9He7NPTR0bhl1xViWMXqMeoSTlHHC-ojP5UTPrgPZBLrxUFnpoQjkem2NVIvAm2AurZaNDybgQXltNmAwexqdoKNS1g6Guz5Ar3e3L-OHcPp8_zgeTUMV08iHikmFgdGC4DzmWUowQJlRzJOyZKqQNMoTKEuCM5ZiRdO4KNI4TQqc4JxQiukA3Wx1122-gkJB462sxdrqlbRfwkgt_m4avRCV-RBJRhiNk07gYidgzXsLzouVdgrqWjZgWicwZ5ynJOakQ8__oUvT2qazJwjDSRYxmqUdRbaUssY5C-XvMzgSm3DEXjhiG053dLZv4_fkJwr6DQ7WjbU</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2715807386</pqid></control><display><type>article</type><title>VaDiR: an integrated approach to Variant Detection in RNA</title><source>Oxford Journals Open Access Collection</source><source>MEDLINE</source><source>EZB-FREE-00999 freely available EZB journals</source><source>PubMed Central</source><creator>Neums, Lisa ; Suenaga, Seiji ; Beyerlein, Peter ; Anders, Sara ; Koestler, Devin ; Mariani, Andrea ; Chien, Jeremy</creator><creatorcontrib>Neums, Lisa ; Suenaga, Seiji ; Beyerlein, Peter ; Anders, Sara ; Koestler, Devin ; Mariani, Andrea ; Chien, Jeremy</creatorcontrib><description>Advances in next-generation DNA sequencing technologies are now enabling detailed characterization of sequence variations in cancer genomes. With whole-genome sequencing, variations in coding and non-coding sequences can be discovered. But the cost associated with it is currently limiting its general use in research. 
Whole-exome sequencing is used to characterize sequence variations in coding regions, but the cost associated with capture reagents and biases in capture rate limit its full use in research. Additional limitations include uncertainty in assigning the functional significance of the mutations when these mutations are observed in the non-coding region or in genes that are not expressed in cancer tissue.
We investigated the feasibility of uncovering mutations from expressed genes using RNA sequencing datasets with a method called Variant Detection in RNA (VaDiR) that integrates 3 variant callers: SNPiR, RVBoost, and MuTect2. The combination of all 3 methods, which we called Tier 1 variants, produced the highest precision with true positive mutations from RNA-seq that could be validated at the DNA level. We also found that the integration of Tier 1 variants with those called by MuTect2 and SNPiR produced the highest recall with acceptable precision. Finally, we observed a higher rate of mutation discovery in genes that are expressed at higher levels.
Our method, VaDiR, provides a possibility of uncovering mutations from RNA sequencing datasets that could be useful in further functional analysis. In addition, our approach allows orthogonal validation of DNA-based mutation discovery by providing complementary sequence variation analysis from paired RNA/DNA sequencing datasets.</description><identifier>ISSN: 2047-217X</identifier><identifier>EISSN: 2047-217X</identifier><identifier>DOI: 10.1093/gigascience/gix122</identifier><identifier>PMID: 29267927</identifier><language>eng</language><publisher>United States: Oxford University Press</publisher><subject>Base Pairing ; Cancer ; Datasets ; Datasets as Topic ; Deoxyribonucleic acid ; DNA ; DNA sequencing ; DNA, Neoplasm - genetics ; Female ; Functional analysis ; Genes ; Genetic Variation ; Genome, Human ; Genomes ; High-Throughput Nucleotide Sequencing - statistics & numerical data ; Humans ; Mutation ; Nucleotide sequence ; Ovarian cancer ; Ovarian Neoplasms - diagnosis ; Ovarian Neoplasms - genetics ; Reagents ; Ribonucleic acid ; RNA ; RNA, Neoplasm - genetics ; Software ; Technical Note ; Transcriptome ; Variation ; Whole genome sequencing</subject><ispartof>Gigascience, 2018-02, Vol.7 (2), p.1-13</ispartof><rights>The Authors 2017. Published by Oxford University Press.</rights><rights>The Authors 2017. Published by Oxford University Press. 
2017</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c430t-c7ac1e73d21b498621eef83195ff7cda30b5eff218761c364dd6465d151b23313</citedby><cites>FETCH-LOGICAL-c430t-c7ac1e73d21b498621eef83195ff7cda30b5eff218761c364dd6465d151b23313</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC5827345/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC5827345/$$EHTML$$P50$$Gpubmedcentral$$Hfree_for_read</linktohtml><link.rule.ids>230,314,723,776,780,881,27903,27904,53770,53772</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/29267927$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Neums, Lisa</creatorcontrib><creatorcontrib>Suenaga, Seiji</creatorcontrib><creatorcontrib>Beyerlein, Peter</creatorcontrib><creatorcontrib>Anders, Sara</creatorcontrib><creatorcontrib>Koestler, Devin</creatorcontrib><creatorcontrib>Mariani, Andrea</creatorcontrib><creatorcontrib>Chien, Jeremy</creatorcontrib><title>VaDiR: an integrated approach to Variant Detection in RNA</title><title>Gigascience</title><addtitle>Gigascience</addtitle><description>Advances in next-generation DNA sequencing technologies are now enabling detailed characterization of sequence variations in cancer genomes. With whole-genome sequencing, variations in coding and non-coding sequences can be discovered. But the cost associated with it is currently limiting its general use in research. Whole-exome sequencing is used to characterize sequence variations in coding regions, but the cost associated with capture reagents and biases in capture rate limit its full use in research. 
Additional limitations include uncertainty in assigning the functional significance of the mutations when these mutations are observed in the non-coding region or in genes that are not expressed in cancer tissue.
We investigated the feasibility of uncovering mutations from expressed genes using RNA sequencing datasets with a method called Variant Detection in RNA (VaDiR) that integrates 3 variant callers: SNPiR, RVBoost, and MuTect2. The combination of all 3 methods, which we called Tier 1 variants, produced the highest precision with true positive mutations from RNA-seq that could be validated at the DNA level. We also found that the integration of Tier 1 variants with those called by MuTect2 and SNPiR produced the highest recall with acceptable precision. Finally, we observed a higher rate of mutation discovery in genes that are expressed at higher levels.
Our method, VaDiR, provides a possibility of uncovering mutations from RNA sequencing datasets that could be useful in further functional analysis. In addition, our approach allows orthogonal validation of DNA-based mutation discovery by providing complementary sequence variation analysis from paired RNA/DNA sequencing datasets.</description><subject>Base Pairing</subject><subject>Cancer</subject><subject>Datasets</subject><subject>Datasets as Topic</subject><subject>Deoxyribonucleic acid</subject><subject>DNA</subject><subject>DNA sequencing</subject><subject>DNA, Neoplasm - genetics</subject><subject>Female</subject><subject>Functional analysis</subject><subject>Genes</subject><subject>Genetic Variation</subject><subject>Genome, Human</subject><subject>Genomes</subject><subject>High-Throughput Nucleotide Sequencing - statistics & numerical data</subject><subject>Humans</subject><subject>Mutation</subject><subject>Nucleotide sequence</subject><subject>Ovarian cancer</subject><subject>Ovarian Neoplasms - diagnosis</subject><subject>Ovarian Neoplasms - genetics</subject><subject>Reagents</subject><subject>Ribonucleic acid</subject><subject>RNA</subject><subject>RNA, Neoplasm - genetics</subject><subject>Software</subject><subject>Technical Note</subject><subject>Transcriptome</subject><subject>Variation</subject><subject>Whole genome 
sequencing</subject><issn>2047-217X</issn><issn>2047-217X</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2018</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNpdkV9LwzAUxYMobsx9AR-k4Isv1SZpm8YHYWz-g6EwdPgW0vS2y-iamaSi396OTZnel3vh_u7hHg5Cpzi6xBGnV5WupFMaGgXd_IkJOUB9EsUsJJi9He7NPTR0bhl1xViWMXqMeoSTlHHC-ojP5UTPrgPZBLrxUFnpoQjkem2NVIvAm2AurZaNDybgQXltNmAwexqdoKNS1g6Guz5Ar3e3L-OHcPp8_zgeTUMV08iHikmFgdGC4DzmWUowQJlRzJOyZKqQNMoTKEuCM5ZiRdO4KNI4TQqc4JxQiukA3Wx1122-gkJB462sxdrqlbRfwkgt_m4avRCV-RBJRhiNk07gYidgzXsLzouVdgrqWjZgWicwZ5ynJOakQ8__oUvT2qazJwjDSRYxmqUdRbaUssY5C-XvMzgSm3DEXjhiG053dLZv4_fkJwr6DQ7WjbU</recordid><startdate>20180201</startdate><enddate>20180201</enddate><creator>Neums, Lisa</creator><creator>Suenaga, Seiji</creator><creator>Beyerlein, Peter</creator><creator>Anders, Sara</creator><creator>Koestler, Devin</creator><creator>Mariani, Andrea</creator><creator>Chien, Jeremy</creator><general>Oxford University Press</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>JQ2</scope><scope>K9.</scope><scope>7X8</scope><scope>5PM</scope></search><sort><creationdate>20180201</creationdate><title>VaDiR: an integrated approach to Variant Detection in RNA</title><author>Neums, Lisa ; Suenaga, Seiji ; Beyerlein, Peter ; Anders, Sara ; Koestler, Devin ; Mariani, Andrea ; Chien, Jeremy</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c430t-c7ac1e73d21b498621eef83195ff7cda30b5eff218761c364dd6465d151b23313</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2018</creationdate><topic>Base Pairing</topic><topic>Cancer</topic><topic>Datasets</topic><topic>Datasets as Topic</topic><topic>Deoxyribonucleic acid</topic><topic>DNA</topic><topic>DNA sequencing</topic><topic>DNA, Neoplasm - 
genetics</topic><topic>Female</topic><topic>Functional analysis</topic><topic>Genes</topic><topic>Genetic Variation</topic><topic>Genome, Human</topic><topic>Genomes</topic><topic>High-Throughput Nucleotide Sequencing - statistics & numerical data</topic><topic>Humans</topic><topic>Mutation</topic><topic>Nucleotide sequence</topic><topic>Ovarian cancer</topic><topic>Ovarian Neoplasms - diagnosis</topic><topic>Ovarian Neoplasms - genetics</topic><topic>Reagents</topic><topic>Ribonucleic acid</topic><topic>RNA</topic><topic>RNA, Neoplasm - genetics</topic><topic>Software</topic><topic>Technical Note</topic><topic>Transcriptome</topic><topic>Variation</topic><topic>Whole genome sequencing</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Neums, Lisa</creatorcontrib><creatorcontrib>Suenaga, Seiji</creatorcontrib><creatorcontrib>Beyerlein, Peter</creatorcontrib><creatorcontrib>Anders, Sara</creatorcontrib><creatorcontrib>Koestler, Devin</creatorcontrib><creatorcontrib>Mariani, Andrea</creatorcontrib><creatorcontrib>Chien, Jeremy</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>ProQuest Computer Science Collection</collection><collection>ProQuest Health & Medical Complete (Alumni)</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Gigascience</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Neums, Lisa</au><au>Suenaga, Seiji</au><au>Beyerlein, Peter</au><au>Anders, Sara</au><au>Koestler, Devin</au><au>Mariani, Andrea</au><au>Chien, Jeremy</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>VaDiR: an integrated approach to Variant Detection 
in RNA</atitle><jtitle>Gigascience</jtitle><addtitle>Gigascience</addtitle><date>2018-02-01</date><risdate>2018</risdate><volume>7</volume><issue>2</issue><spage>1</spage><epage>13</epage><pages>1-13</pages><issn>2047-217X</issn><eissn>2047-217X</eissn><abstract>Advances in next-generation DNA sequencing technologies are now enabling detailed characterization of sequence variations in cancer genomes. With whole-genome sequencing, variations in coding and non-coding sequences can be discovered. But the cost associated with it is currently limiting its general use in research. Whole-exome sequencing is used to characterize sequence variations in coding regions, but the cost associated with capture reagents and biases in capture rate limit its full use in research. Additional limitations include uncertainty in assigning the functional significance of the mutations when these mutations are observed in the non-coding region or in genes that are not expressed in cancer tissue.
We investigated the feasibility of uncovering mutations from expressed genes using RNA sequencing datasets with a method called Variant Detection in RNA (VaDiR) that integrates 3 variant callers: SNPiR, RVBoost, and MuTect2. The combination of all 3 methods, which we called Tier 1 variants, produced the highest precision with true positive mutations from RNA-seq that could be validated at the DNA level. We also found that the integration of Tier 1 variants with those called by MuTect2 and SNPiR produced the highest recall with acceptable precision. Finally, we observed a higher rate of mutation discovery in genes that are expressed at higher levels.
Our method, VaDiR, provides a possibility of uncovering mutations from RNA sequencing datasets that could be useful in further functional analysis. In addition, our approach allows orthogonal validation of DNA-based mutation discovery by providing complementary sequence variation analysis from paired RNA/DNA sequencing datasets.</abstract><cop>United States</cop><pub>Oxford University Press</pub><pmid>29267927</pmid><doi>10.1093/gigascience/gix122</doi><tpages>13</tpages><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 2047-217X |
ispartof | Gigascience, 2018-02, Vol.7 (2), p.1-13 |
issn | 2047-217X 2047-217X |
language | eng |
recordid | cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_5827345 |
source | Oxford Journals Open Access Collection; MEDLINE; EZB-FREE-00999 freely available EZB journals; PubMed Central |
subjects | Base Pairing Cancer Datasets Datasets as Topic Deoxyribonucleic acid DNA DNA sequencing DNA, Neoplasm - genetics Female Functional analysis Genes Genetic Variation Genome, Human Genomes High-Throughput Nucleotide Sequencing - statistics & numerical data Humans Mutation Nucleotide sequence Ovarian cancer Ovarian Neoplasms - diagnosis Ovarian Neoplasms - genetics Reagents Ribonucleic acid RNA RNA, Neoplasm - genetics Software Technical Note Transcriptome Variation Whole genome sequencing |
title | VaDiR: an integrated approach to Variant Detection in RNA |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-22T20%3A36%3A21IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=VaDiR:%20an%20integrated%20approach%20to%20Variant%20Detection%20in%20RNA&rft.jtitle=Gigascience&rft.au=Neums,%20Lisa&rft.date=2018-02-01&rft.volume=7&rft.issue=2&rft.spage=1&rft.epage=13&rft.pages=1-13&rft.issn=2047-217X&rft.eissn=2047-217X&rft_id=info:doi/10.1093/gigascience/gix122&rft_dat=%3Cproquest_pubme%3E2715807386%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2715807386&rft_id=info:pmid/29267927&rfr_iscdi=true |
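
The description above says that Tier 1 variants are those reported by all 3 callers, and that adding the calls from MuTect2 and SNPiR raised recall. The following is a minimal sketch of that set logic, not the VaDiR implementation. The `Variant` record, the function names, and the reading of "those called by MuTect2 and SNPiR" as the calls shared by both of those callers are all assumptions made for illustration; the record does not specify them.

```python
from typing import NamedTuple, Set, Tuple


class Variant(NamedTuple):
    """Hypothetical variant key: chromosome, 1-based position, ref and alt alleles."""
    chrom: str
    pos: int
    ref: str
    alt: str


def tier1(snpir: Set[Variant], rvboost: Set[Variant], mutect2: Set[Variant]) -> Set[Variant]:
    """Variants reported by all three callers (Tier 1 in the description)."""
    return snpir & rvboost & mutect2


def tier1_plus_shared(snpir: Set[Variant], rvboost: Set[Variant], mutect2: Set[Variant]) -> Set[Variant]:
    """Tier 1 plus variants reported by both MuTect2 and SNPiR.

    Assumption: the description's "those called by MuTect2 and SNPiR"
    is read here as the intersection of those two call sets.
    """
    return tier1(snpir, rvboost, mutect2) | (mutect2 & snpir)


def precision_recall(called: Set[Variant], validated: Set[Variant]) -> Tuple[float, float]:
    """Precision and recall of RNA-based calls against DNA-validated variants."""
    true_positives = len(called & validated)
    precision = true_positives / len(called) if called else 0.0
    recall = true_positives / len(validated) if validated else 0.0
    return precision, recall
```

Under this reading, the combined set can only grow relative to Tier 1, so recall cannot decrease, while precision depends on how many of the added calls are confirmed at the DNA level.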