Identification of alternatively spliced gene isoforms and novel noncoding RNAs by single-molecule long-read sequencing in Camellia
Direct single-molecule sequencing of full-length transcripts allows efficient identification of gene isoforms, which is apt to alternative splicing (AS), polyadenylation, and long non-coding RNA analyses. However, the identification of gene isoforms and long non-coding RNAs with novel regulatory fun...
Published in: | RNA biology 2020-07, Vol.17 (7), p.966-976 |
---|---|
Main authors: | Hu, Zhikang; Lyu, Tao; Yan, Chao; Wang, Yupeng; Ye, Ning; Fan, Zhengqi; Li, Xinlei; Li, Jiyuan; Yin, Hengfu |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 976 |
---|---|
container_issue | 7 |
container_start_page | 966 |
container_title | RNA biology |
container_volume | 17 |
creator | Hu, Zhikang; Lyu, Tao; Yan, Chao; Wang, Yupeng; Ye, Ning; Fan, Zhengqi; Li, Xinlei; Li, Jiyuan; Yin, Hengfu |
description | Direct single-molecule sequencing of full-length transcripts allows efficient identification of gene isoforms, which is apt to alternative splicing (AS), polyadenylation, and long non-coding RNA analyses. However, the identification of gene isoforms and long non-coding RNAs with novel regulatory functions remains challenging, especially for species without a reference genome. Here, we present a comprehensive analysis of a combined long-read and short-read transcriptome sequencing in Camellia japonica. Through a novel bioinformatic pipeline of reverse-tracing the split-sites, we have uncovered 257,692 AS sites from 61,838 transcripts; and 13,068 AS isoforms have been validated by aligning the short reads. We have identified the tissue-specific AS isoforms along with 6,373 AS events that were found in all tissues. Furthermore, we have analysed the polyadenylation (polyA) patterns of transcripts, and found that the preference for polyA signals was different between the AS and non-AS transcripts. Moreover, we have predicted the phased small interfering RNA (phasiRNA) loci through integrative analyses of transcriptome and small RNA sequencing. We have shown that a newly evolved phasiRNA locus from lipoxygenases generated 12 consecutive 21 bp secondary RNAs, which were responsive to cold and heat stress in Camellia. Our studies of the isoform transcriptome provide insights into gene splicing and functions that may facilitate the mechanistic understanding of plants. |
doi_str_mv | 10.1080/15476286.2020.1738703 |
format | Article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_crossref_primary_10_1080_15476286_2020_1738703</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2376729102</sourcerecordid><originalsourceid>FETCH-LOGICAL-c468t-52a494f917ffe172b0d5f0397ea5fa299e4fe747387bb2f95904a666add1f8a53</originalsourceid><addsrcrecordid>eNp9kU2PFCEQhjtG466rP0HD0UuvQDfQXIybiR-bbDQxeibVTTFiaBihZzdz9ZdLZ2Y3evECFDxvFVVv07xk9JLRgb5holeSD_KSU16vVDco2j1qzpkQoh3E0D9ez71qV-iseVbKT0o7OWjxtDnrOJOUUXne_L62GBfv_ASLT5EkRyAsmGMNbzEcSNkFP6ElW4xIfEku5bkQiJbEVIG6xilZH7fk6-erQsaqqEHAdk4Bp31AElLcthnBkoK_9hinFfaRbGDGEDw8b544CAVfnPaL5vuH9982n9qbLx-vN1c37dTLYWkFh173TjPlHDLFR2qFo51WCMIB1xp7h6pfBzGO3GmhaQ9SSrCWuQFEd9G8Pebd7ccZ7VT7zhDMLvsZ8sEk8Obfl-h_mG26NUr0WipeE7w-JcipNlIWM_sy1R4gYtoXwztVMc3oioojOuVUSkb3UIZRs_pn7v0zq3_m5F_Vvfr7jw-qe8Mq8O4I-Lg6AXcpB2sWOISUXYY63GK6_9f4A4GMrcw</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2376729102</pqid></control><display><type>article</type><title>Identification of alternatively spliced gene isoforms and novel noncoding RNAs by single-molecule long-read sequencing in Camellia</title><source>MEDLINE</source><source>Elektronische Zeitschriftenbibliothek</source><source>PubMed Central</source><creator>Hu, Zhikang ; Lyu, Tao ; Yan, Chao ; Wang, Yupeng ; Ye, Ning ; Fan, Zhengqi ; Li, Xinlei ; Li, Jiyuan ; Yin, Hengfu</creator><creatorcontrib>Hu, Zhikang ; Lyu, Tao ; Yan, Chao ; Wang, Yupeng ; Ye, Ning ; Fan, Zhengqi ; Li, Xinlei ; Li, Jiyuan ; Yin, Hengfu</creatorcontrib><description>Direct single-molecule sequencing of full-length transcripts allows efficient identification of gene isoforms, which is apt to alternative splicing (AS), polyadenylation, and long non-coding RNA analyses. 
However, the identification of gene isoforms and long non-coding RNAs with novel regulatory functions remains challenging, especially for species without a reference genome. Here, we present a comprehensive analysis of a combined long-read and short-read transcriptome sequencing in Camellia japonica. Through a novel bioinformatic pipeline of reverse-tracing the split-sites, we have uncovered 257,692 AS sites from 61,838 transcripts; and 13,068 AS isoforms have been validated by aligning the short reads. We have identified the tissue-specific AS isoforms along with 6,373 AS events that were found in all tissues. Furthermore, we have analysed the polyadenylation (polyA) patterns of transcripts, and found that the preference for polyA signals was different between the AS and non-AS transcripts. Moreover, we have predicted the phased small interfering RNA (phasiRNA) loci through integrative analyses of transcriptome and small RNA sequencing. We have shown that a newly evolved phasiRNA locus from lipoxygenases generated 12 consecutive 21 bp secondary RNAs, which were responsive to cold and heat stress in Camellia. 
Our studies of the isoform transcriptome provide insights into gene splicing and functions that may facilitate the mechanistic understanding of plants.</description><identifier>ISSN: 1547-6286</identifier><identifier>EISSN: 1555-8584</identifier><identifier>DOI: 10.1080/15476286.2020.1738703</identifier><identifier>PMID: 32160106</identifier><language>eng</language><publisher>United States: Taylor & Francis</publisher><subject>Alternative Splicing ; Camellia ; Camellia - genetics ; Computational Biology ; Gene Expression Profiling ; Gene Expression Regulation, Plant ; Genome, Plant ; High-Throughput Nucleotide Sequencing ; lipoxygenase ; Molecular Sequence Annotation ; phased small interfering RNA ; Phenotype ; Polyadenylation ; Research Paper ; RNA Isoforms ; RNA, Untranslated - genetics ; Single Molecule Imaging ; single-molecule sequencing ; Transcriptome</subject><ispartof>RNA biology, 2020-07, Vol.17 (7), p.966-976</ispartof><rights>2020 Informa UK Limited, trading as Taylor & Francis Group 2020</rights><rights>2020 Informa UK Limited, trading as Taylor & Francis Group 2020 Informa UK Limited, trading as Taylor & Francis Group</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c468t-52a494f917ffe172b0d5f0397ea5fa299e4fe747387bb2f95904a666add1f8a53</citedby><cites>FETCH-LOGICAL-c468t-52a494f917ffe172b0d5f0397ea5fa299e4fe747387bb2f95904a666add1f8a53</cites><orcidid>0000-0001-7249-8352 ; 
0000-0002-0720-5311</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC7549672/pdf/$$EPDF$$P50$$Gpubmedcentral$$H</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC7549672/$$EHTML$$P50$$Gpubmedcentral$$H</linktohtml><link.rule.ids>230,314,723,776,780,881,27903,27904,53769,53771</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/32160106$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Hu, Zhikang</creatorcontrib><creatorcontrib>Lyu, Tao</creatorcontrib><creatorcontrib>Yan, Chao</creatorcontrib><creatorcontrib>Wang, Yupeng</creatorcontrib><creatorcontrib>Ye, Ning</creatorcontrib><creatorcontrib>Fan, Zhengqi</creatorcontrib><creatorcontrib>Li, Xinlei</creatorcontrib><creatorcontrib>Li, Jiyuan</creatorcontrib><creatorcontrib>Yin, Hengfu</creatorcontrib><title>Identification of alternatively spliced gene isoforms and novel noncoding RNAs by single-molecule long-read sequencing in Camellia</title><title>RNA biology</title><addtitle>RNA Biol</addtitle><description>Direct single-molecule sequencing of full-length transcripts allows efficient identification of gene isoforms, which is apt to alternative splicing (AS), polyadenylation, and long non-coding RNA analyses. However, the identification of gene isoforms and long non-coding RNAs with novel regulatory functions remains challenging, especially for species without a reference genome. Here, we present a comprehensive analysis of a combined long-read and short-read transcriptome sequencing in Camellia japonica. Through a novel bioinformatic pipeline of reverse-tracing the split-sites, we have uncovered 257,692 AS sites from 61,838 transcripts; and 13,068 AS isoforms have been validated by aligning the short reads. 
We have identified the tissue-specific AS isoforms along with 6,373 AS events that were found in all tissues. Furthermore, we have analysed the polyadenylation (polyA) patterns of transcripts, and found that the preference for polyA signals was different between the AS and non-AS transcripts. Moreover, we have predicted the phased small interfering RNA (phasiRNA) loci through integrative analyses of transcriptome and small RNA sequencing. We have shown that a newly evolved phasiRNA locus from lipoxygenases generated 12 consecutive 21 bp secondary RNAs, which were responsive to cold and heat stress in Camellia. Our studies of the isoform transcriptome provide insights into gene splicing and functions that may facilitate the mechanistic understanding of plants.</description><subject>Alternative Splicing</subject><subject>Camellia</subject><subject>Camellia - genetics</subject><subject>Computational Biology</subject><subject>Gene Expression Profiling</subject><subject>Gene Expression Regulation, Plant</subject><subject>Genome, Plant</subject><subject>High-Throughput Nucleotide Sequencing</subject><subject>lipoxygenase</subject><subject>Molecular Sequence Annotation</subject><subject>phased small interfering RNA</subject><subject>Phenotype</subject><subject>Polyadenylation</subject><subject>Research Paper</subject><subject>RNA Isoforms</subject><subject>RNA, Untranslated - genetics</subject><subject>Single Molecule Imaging</subject><subject>single-molecule 
sequencing</subject><subject>Transcriptome</subject><issn>1547-6286</issn><issn>1555-8584</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNp9kU2PFCEQhjtG466rP0HD0UuvQDfQXIybiR-bbDQxeibVTTFiaBihZzdz9ZdLZ2Y3evECFDxvFVVv07xk9JLRgb5holeSD_KSU16vVDco2j1qzpkQoh3E0D9ez71qV-iseVbKT0o7OWjxtDnrOJOUUXne_L62GBfv_ASLT5EkRyAsmGMNbzEcSNkFP6ElW4xIfEku5bkQiJbEVIG6xilZH7fk6-erQsaqqEHAdk4Bp31AElLcthnBkoK_9hinFfaRbGDGEDw8b544CAVfnPaL5vuH9982n9qbLx-vN1c37dTLYWkFh173TjPlHDLFR2qFo51WCMIB1xp7h6pfBzGO3GmhaQ9SSrCWuQFEd9G8Pebd7ccZ7VT7zhDMLvsZ8sEk8Obfl-h_mG26NUr0WipeE7w-JcipNlIWM_sy1R4gYtoXwztVMc3oioojOuVUSkb3UIZRs_pn7v0zq3_m5F_Vvfr7jw-qe8Mq8O4I-Lg6AXcpB2sWOISUXYY63GK6_9f4A4GMrcw</recordid><startdate>20200702</startdate><enddate>20200702</enddate><creator>Hu, Zhikang</creator><creator>Lyu, Tao</creator><creator>Yan, Chao</creator><creator>Wang, Yupeng</creator><creator>Ye, Ning</creator><creator>Fan, Zhengqi</creator><creator>Li, Xinlei</creator><creator>Li, Jiyuan</creator><creator>Yin, Hengfu</creator><general>Taylor & Francis</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><scope>5PM</scope><orcidid>https://orcid.org/0000-0001-7249-8352</orcidid><orcidid>https://orcid.org/0000-0002-0720-5311</orcidid></search><sort><creationdate>20200702</creationdate><title>Identification of alternatively spliced gene isoforms and novel noncoding RNAs by single-molecule long-read sequencing in Camellia</title><author>Hu, Zhikang ; Lyu, Tao ; Yan, Chao ; Wang, Yupeng ; Ye, Ning ; Fan, Zhengqi ; Li, Xinlei ; Li, Jiyuan ; Yin, 
Hengfu</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c468t-52a494f917ffe172b0d5f0397ea5fa299e4fe747387bb2f95904a666add1f8a53</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>Alternative Splicing</topic><topic>Camellia</topic><topic>Camellia - genetics</topic><topic>Computational Biology</topic><topic>Gene Expression Profiling</topic><topic>Gene Expression Regulation, Plant</topic><topic>Genome, Plant</topic><topic>High-Throughput Nucleotide Sequencing</topic><topic>lipoxygenase</topic><topic>Molecular Sequence Annotation</topic><topic>phased small interfering RNA</topic><topic>Phenotype</topic><topic>Polyadenylation</topic><topic>Research Paper</topic><topic>RNA Isoforms</topic><topic>RNA, Untranslated - genetics</topic><topic>Single Molecule Imaging</topic><topic>single-molecule sequencing</topic><topic>Transcriptome</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Hu, Zhikang</creatorcontrib><creatorcontrib>Lyu, Tao</creatorcontrib><creatorcontrib>Yan, Chao</creatorcontrib><creatorcontrib>Wang, Yupeng</creatorcontrib><creatorcontrib>Ye, Ning</creatorcontrib><creatorcontrib>Fan, Zhengqi</creatorcontrib><creatorcontrib>Li, Xinlei</creatorcontrib><creatorcontrib>Li, Jiyuan</creatorcontrib><creatorcontrib>Yin, Hengfu</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>RNA biology</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Hu, Zhikang</au><au>Lyu, Tao</au><au>Yan, Chao</au><au>Wang, Yupeng</au><au>Ye, Ning</au><au>Fan, 
Zhengqi</au><au>Li, Xinlei</au><au>Li, Jiyuan</au><au>Yin, Hengfu</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Identification of alternatively spliced gene isoforms and novel noncoding RNAs by single-molecule long-read sequencing in Camellia</atitle><jtitle>RNA biology</jtitle><addtitle>RNA Biol</addtitle><date>2020-07-02</date><risdate>2020</risdate><volume>17</volume><issue>7</issue><spage>966</spage><epage>976</epage><pages>966-976</pages><issn>1547-6286</issn><eissn>1555-8584</eissn><abstract>Direct single-molecule sequencing of full-length transcripts allows efficient identification of gene isoforms, which is apt to alternative splicing (AS), polyadenylation, and long non-coding RNA analyses. However, the identification of gene isoforms and long non-coding RNAs with novel regulatory functions remains challenging, especially for species without a reference genome. Here, we present a comprehensive analysis of a combined long-read and short-read transcriptome sequencing in Camellia japonica. Through a novel bioinformatic pipeline of reverse-tracing the split-sites, we have uncovered 257,692 AS sites from 61,838 transcripts; and 13,068 AS isoforms have been validated by aligning the short reads. We have identified the tissue-specific AS isoforms along with 6,373 AS events that were found in all tissues. Furthermore, we have analysed the polyadenylation (polyA) patterns of transcripts, and found that the preference for polyA signals was different between the AS and non-AS transcripts. Moreover, we have predicted the phased small interfering RNA (phasiRNA) loci through integrative analyses of transcriptome and small RNA sequencing. We have shown that a newly evolved phasiRNA locus from lipoxygenases generated 12 consecutive 21 bp secondary RNAs, which were responsive to cold and heat stress in Camellia. 
Our studies of the isoform transcriptome provide insights into gene splicing and functions that may facilitate the mechanistic understanding of plants.</abstract><cop>United States</cop><pub>Taylor & Francis</pub><pmid>32160106</pmid><doi>10.1080/15476286.2020.1738703</doi><tpages>11</tpages><orcidid>https://orcid.org/0000-0001-7249-8352</orcidid><orcidid>https://orcid.org/0000-0002-0720-5311</orcidid><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 1547-6286 |
ispartof | RNA biology, 2020-07, Vol.17 (7), p.966-976 |
issn | 1547-6286; 1555-8584 |
language | eng |
recordid | cdi_crossref_primary_10_1080_15476286_2020_1738703 |
source | MEDLINE; Elektronische Zeitschriftenbibliothek; PubMed Central |
subjects | Alternative Splicing; Camellia; Camellia - genetics; Computational Biology; Gene Expression Profiling; Gene Expression Regulation, Plant; Genome, Plant; High-Throughput Nucleotide Sequencing; lipoxygenase; Molecular Sequence Annotation; phased small interfering RNA; Phenotype; Polyadenylation; Research Paper; RNA Isoforms; RNA, Untranslated - genetics; Single Molecule Imaging; single-molecule sequencing; Transcriptome |
title | Identification of alternatively spliced gene isoforms and novel noncoding RNAs by single-molecule long-read sequencing in Camellia |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-24T16%3A13%3A03IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Identification%20of%20alternatively%20spliced%20gene%20isoforms%20and%20novel%20noncoding%20RNAs%20by%20single-molecule%20long-read%20sequencing%20in%20Camellia&rft.jtitle=RNA%20biology&rft.au=Hu,%20Zhikang&rft.date=2020-07-02&rft.volume=17&rft.issue=7&rft.spage=966&rft.epage=976&rft.pages=966-976&rft.issn=1547-6286&rft.eissn=1555-8584&rft_id=info:doi/10.1080/15476286.2020.1738703&rft_dat=%3Cproquest_cross%3E2376729102%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2376729102&rft_id=info:pmid/32160106&rfr_iscdi=true |