Identification of alternatively spliced gene isoforms and novel noncoding RNAs by single-molecule long-read sequencing in Camellia


Bibliographic Details
Published in: RNA biology 2020-07, Vol.17 (7), p.966-976
Main Authors: Hu, Zhikang, Lyu, Tao, Yan, Chao, Wang, Yupeng, Ye, Ning, Fan, Zhengqi, Li, Xinlei, Li, Jiyuan, Yin, Hengfu
Format: Article
Language: eng
Online Access: Full text
container_end_page 976
container_issue 7
container_start_page 966
container_title RNA biology
container_volume 17
creator Hu, Zhikang
Lyu, Tao
Yan, Chao
Wang, Yupeng
Ye, Ning
Fan, Zhengqi
Li, Xinlei
Li, Jiyuan
Yin, Hengfu
description Direct single-molecule sequencing of full-length transcripts allows efficient identification of gene isoforms, which makes it well suited to analyses of alternative splicing (AS), polyadenylation, and long non-coding RNAs. However, the identification of gene isoforms and long non-coding RNAs with novel regulatory functions remains challenging, especially for species without a reference genome. Here, we present a comprehensive analysis of combined long-read and short-read transcriptome sequencing in Camellia japonica. Through a novel bioinformatic pipeline that reverse-traces the split-sites, we uncovered 257,692 AS sites from 61,838 transcripts, and 13,068 AS isoforms were validated by aligning the short reads. We identified tissue-specific AS isoforms along with 6,373 AS events that were found in all tissues. Furthermore, we analysed the polyadenylation (polyA) patterns of transcripts and found that the preference for polyA signals differed between AS and non-AS transcripts. Moreover, we predicted phased small interfering RNA (phasiRNA) loci through integrative analyses of transcriptome and small RNA sequencing. We showed that a newly evolved phasiRNA locus from lipoxygenases generated 12 consecutive 21 bp secondary RNAs, which were responsive to cold and heat stress in Camellia. Our studies of the isoform transcriptome provide insights into gene splicing and functions that may facilitate the mechanistic understanding of plants.
doi_str_mv 10.1080/15476286.2020.1738703
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_crossref_primary_10_1080_15476286_2020_1738703</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2376729102</sourcerecordid><originalsourceid>FETCH-LOGICAL-c468t-52a494f917ffe172b0d5f0397ea5fa299e4fe747387bb2f95904a666add1f8a53</originalsourceid><addsrcrecordid>eNp9kU2PFCEQhjtG466rP0HD0UuvQDfQXIybiR-bbDQxeibVTTFiaBihZzdz9ZdLZ2Y3evECFDxvFVVv07xk9JLRgb5holeSD_KSU16vVDco2j1qzpkQoh3E0D9ez71qV-iseVbKT0o7OWjxtDnrOJOUUXne_L62GBfv_ASLT5EkRyAsmGMNbzEcSNkFP6ElW4xIfEku5bkQiJbEVIG6xilZH7fk6-erQsaqqEHAdk4Bp31AElLcthnBkoK_9hinFfaRbGDGEDw8b544CAVfnPaL5vuH9982n9qbLx-vN1c37dTLYWkFh173TjPlHDLFR2qFo51WCMIB1xp7h6pfBzGO3GmhaQ9SSrCWuQFEd9G8Pebd7ccZ7VT7zhDMLvsZ8sEk8Obfl-h_mG26NUr0WipeE7w-JcipNlIWM_sy1R4gYtoXwztVMc3oioojOuVUSkb3UIZRs_pn7v0zq3_m5F_Vvfr7jw-qe8Mq8O4I-Lg6AXcpB2sWOISUXYY63GK6_9f4A4GMrcw</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2376729102</pqid></control><display><type>article</type><title>Identification of alternatively spliced gene isoforms and novel noncoding RNAs by single-molecule long-read sequencing in Camellia</title><source>MEDLINE</source><source>Elektronische Zeitschriftenbibliothek</source><source>PubMed Central</source><creator>Hu, Zhikang ; Lyu, Tao ; Yan, Chao ; Wang, Yupeng ; Ye, Ning ; Fan, Zhengqi ; Li, Xinlei ; Li, Jiyuan ; Yin, Hengfu</creator><creatorcontrib>Hu, Zhikang ; Lyu, Tao ; Yan, Chao ; Wang, Yupeng ; Ye, Ning ; Fan, Zhengqi ; Li, Xinlei ; Li, Jiyuan ; Yin, Hengfu</creatorcontrib><description>Direct single-molecule sequencing of full-length transcripts allows efficient identification of gene isoforms, which is apt to alternative splicing (AS), polyadenylation, and long non-coding RNA analyses. 
However, the identification of gene isoforms and long non-coding RNAs with novel regulatory functions remains challenging, especially for species without a reference genome. Here, we present a comprehensive analysis of a combined long-read and short-read transcriptome sequencing in Camellia japonica. Through a novel bioinformatic pipeline of reverse-tracing the split-sites, we have uncovered 257,692 AS sites from 61,838 transcripts; and 13,068 AS isoforms have been validated by aligning the short reads. We have identified the tissue-specific AS isoforms along with 6,373 AS events that were found in all tissues. Furthermore, we have analysed the polyadenylation (polyA) patterns of transcripts, and found that the preference for polyA signals was different between the AS and non-AS transcripts. Moreover, we have predicted the phased small interfering RNA (phasiRNA) loci through integrative analyses of transcriptome and small RNA sequencing. We have shown that a newly evolved phasiRNA locus from lipoxygenases generated 12 consecutive 21 bp secondary RNAs, which were responsive to cold and heat stress in Camellia. 
Our studies of the isoform transcriptome provide insights into gene splicing and functions that may facilitate the mechanistic understanding of plants.</description><identifier>ISSN: 1547-6286</identifier><identifier>EISSN: 1555-8584</identifier><identifier>DOI: 10.1080/15476286.2020.1738703</identifier><identifier>PMID: 32160106</identifier><language>eng</language><publisher>United States: Taylor &amp; Francis</publisher><subject>Alternative Splicing ; Camellia ; Camellia - genetics ; Computational Biology ; Gene Expression Profiling ; Gene Expression Regulation, Plant ; Genome, Plant ; High-Throughput Nucleotide Sequencing ; lipoxygenase ; Molecular Sequence Annotation ; phased small interfering RNA ; Phenotype ; Polyadenylation ; Research Paper ; RNA Isoforms ; RNA, Untranslated - genetics ; Single Molecule Imaging ; single-molecule sequencing ; Transcriptome</subject><ispartof>RNA biology, 2020-07, Vol.17 (7), p.966-976</ispartof><rights>2020 Informa UK Limited, trading as Taylor &amp; Francis Group 2020</rights><rights>2020 Informa UK Limited, trading as Taylor &amp; Francis Group 2020 Informa UK Limited, trading as Taylor &amp; Francis Group</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c468t-52a494f917ffe172b0d5f0397ea5fa299e4fe747387bb2f95904a666add1f8a53</citedby><cites>FETCH-LOGICAL-c468t-52a494f917ffe172b0d5f0397ea5fa299e4fe747387bb2f95904a666add1f8a53</cites><orcidid>0000-0001-7249-8352 ; 
0000-0002-0720-5311</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC7549672/pdf/$$EPDF$$P50$$Gpubmedcentral$$H</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC7549672/$$EHTML$$P50$$Gpubmedcentral$$H</linktohtml><link.rule.ids>230,314,723,776,780,881,27903,27904,53769,53771</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/32160106$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Hu, Zhikang</creatorcontrib><creatorcontrib>Lyu, Tao</creatorcontrib><creatorcontrib>Yan, Chao</creatorcontrib><creatorcontrib>Wang, Yupeng</creatorcontrib><creatorcontrib>Ye, Ning</creatorcontrib><creatorcontrib>Fan, Zhengqi</creatorcontrib><creatorcontrib>Li, Xinlei</creatorcontrib><creatorcontrib>Li, Jiyuan</creatorcontrib><creatorcontrib>Yin, Hengfu</creatorcontrib><title>Identification of alternatively spliced gene isoforms and novel noncoding RNAs by single-molecule long-read sequencing in Camellia</title><title>RNA biology</title><addtitle>RNA Biol</addtitle><description>Direct single-molecule sequencing of full-length transcripts allows efficient identification of gene isoforms, which is apt to alternative splicing (AS), polyadenylation, and long non-coding RNA analyses. However, the identification of gene isoforms and long non-coding RNAs with novel regulatory functions remains challenging, especially for species without a reference genome. Here, we present a comprehensive analysis of a combined long-read and short-read transcriptome sequencing in Camellia japonica. Through a novel bioinformatic pipeline of reverse-tracing the split-sites, we have uncovered 257,692 AS sites from 61,838 transcripts; and 13,068 AS isoforms have been validated by aligning the short reads. 
We have identified the tissue-specific AS isoforms along with 6,373 AS events that were found in all tissues. Furthermore, we have analysed the polyadenylation (polyA) patterns of transcripts, and found that the preference for polyA signals was different between the AS and non-AS transcripts. Moreover, we have predicted the phased small interfering RNA (phasiRNA) loci through integrative analyses of transcriptome and small RNA sequencing. We have shown that a newly evolved phasiRNA locus from lipoxygenases generated 12 consecutive 21 bp secondary RNAs, which were responsive to cold and heat stress in Camellia. Our studies of the isoform transcriptome provide insights into gene splicing and functions that may facilitate the mechanistic understanding of plants.</description><subject>Alternative Splicing</subject><subject>Camellia</subject><subject>Camellia - genetics</subject><subject>Computational Biology</subject><subject>Gene Expression Profiling</subject><subject>Gene Expression Regulation, Plant</subject><subject>Genome, Plant</subject><subject>High-Throughput Nucleotide Sequencing</subject><subject>lipoxygenase</subject><subject>Molecular Sequence Annotation</subject><subject>phased small interfering RNA</subject><subject>Phenotype</subject><subject>Polyadenylation</subject><subject>Research Paper</subject><subject>RNA Isoforms</subject><subject>RNA, Untranslated - genetics</subject><subject>Single Molecule Imaging</subject><subject>single-molecule 
sequencing</subject><subject>Transcriptome</subject><issn>1547-6286</issn><issn>1555-8584</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNp9kU2PFCEQhjtG466rP0HD0UuvQDfQXIybiR-bbDQxeibVTTFiaBihZzdz9ZdLZ2Y3evECFDxvFVVv07xk9JLRgb5holeSD_KSU16vVDco2j1qzpkQoh3E0D9ez71qV-iseVbKT0o7OWjxtDnrOJOUUXne_L62GBfv_ASLT5EkRyAsmGMNbzEcSNkFP6ElW4xIfEku5bkQiJbEVIG6xilZH7fk6-erQsaqqEHAdk4Bp31AElLcthnBkoK_9hinFfaRbGDGEDw8b544CAVfnPaL5vuH9982n9qbLx-vN1c37dTLYWkFh173TjPlHDLFR2qFo51WCMIB1xp7h6pfBzGO3GmhaQ9SSrCWuQFEd9G8Pebd7ccZ7VT7zhDMLvsZ8sEk8Obfl-h_mG26NUr0WipeE7w-JcipNlIWM_sy1R4gYtoXwztVMc3oioojOuVUSkb3UIZRs_pn7v0zq3_m5F_Vvfr7jw-qe8Mq8O4I-Lg6AXcpB2sWOISUXYY63GK6_9f4A4GMrcw</recordid><startdate>20200702</startdate><enddate>20200702</enddate><creator>Hu, Zhikang</creator><creator>Lyu, Tao</creator><creator>Yan, Chao</creator><creator>Wang, Yupeng</creator><creator>Ye, Ning</creator><creator>Fan, Zhengqi</creator><creator>Li, Xinlei</creator><creator>Li, Jiyuan</creator><creator>Yin, Hengfu</creator><general>Taylor &amp; Francis</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><scope>5PM</scope><orcidid>https://orcid.org/0000-0001-7249-8352</orcidid><orcidid>https://orcid.org/0000-0002-0720-5311</orcidid></search><sort><creationdate>20200702</creationdate><title>Identification of alternatively spliced gene isoforms and novel noncoding RNAs by single-molecule long-read sequencing in Camellia</title><author>Hu, Zhikang ; Lyu, Tao ; Yan, Chao ; Wang, Yupeng ; Ye, Ning ; Fan, Zhengqi ; Li, Xinlei ; Li, Jiyuan ; Yin, 
Hengfu</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c468t-52a494f917ffe172b0d5f0397ea5fa299e4fe747387bb2f95904a666add1f8a53</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>Alternative Splicing</topic><topic>Camellia</topic><topic>Camellia - genetics</topic><topic>Computational Biology</topic><topic>Gene Expression Profiling</topic><topic>Gene Expression Regulation, Plant</topic><topic>Genome, Plant</topic><topic>High-Throughput Nucleotide Sequencing</topic><topic>lipoxygenase</topic><topic>Molecular Sequence Annotation</topic><topic>phased small interfering RNA</topic><topic>Phenotype</topic><topic>Polyadenylation</topic><topic>Research Paper</topic><topic>RNA Isoforms</topic><topic>RNA, Untranslated - genetics</topic><topic>Single Molecule Imaging</topic><topic>single-molecule sequencing</topic><topic>Transcriptome</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Hu, Zhikang</creatorcontrib><creatorcontrib>Lyu, Tao</creatorcontrib><creatorcontrib>Yan, Chao</creatorcontrib><creatorcontrib>Wang, Yupeng</creatorcontrib><creatorcontrib>Ye, Ning</creatorcontrib><creatorcontrib>Fan, Zhengqi</creatorcontrib><creatorcontrib>Li, Xinlei</creatorcontrib><creatorcontrib>Li, Jiyuan</creatorcontrib><creatorcontrib>Yin, Hengfu</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>RNA biology</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Hu, Zhikang</au><au>Lyu, Tao</au><au>Yan, Chao</au><au>Wang, Yupeng</au><au>Ye, Ning</au><au>Fan, 
Zhengqi</au><au>Li, Xinlei</au><au>Li, Jiyuan</au><au>Yin, Hengfu</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Identification of alternatively spliced gene isoforms and novel noncoding RNAs by single-molecule long-read sequencing in Camellia</atitle><jtitle>RNA biology</jtitle><addtitle>RNA Biol</addtitle><date>2020-07-02</date><risdate>2020</risdate><volume>17</volume><issue>7</issue><spage>966</spage><epage>976</epage><pages>966-976</pages><issn>1547-6286</issn><eissn>1555-8584</eissn><abstract>Direct single-molecule sequencing of full-length transcripts allows efficient identification of gene isoforms, which is apt to alternative splicing (AS), polyadenylation, and long non-coding RNA analyses. However, the identification of gene isoforms and long non-coding RNAs with novel regulatory functions remains challenging, especially for species without a reference genome. Here, we present a comprehensive analysis of a combined long-read and short-read transcriptome sequencing in Camellia japonica. Through a novel bioinformatic pipeline of reverse-tracing the split-sites, we have uncovered 257,692 AS sites from 61,838 transcripts; and 13,068 AS isoforms have been validated by aligning the short reads. We have identified the tissue-specific AS isoforms along with 6,373 AS events that were found in all tissues. Furthermore, we have analysed the polyadenylation (polyA) patterns of transcripts, and found that the preference for polyA signals was different between the AS and non-AS transcripts. Moreover, we have predicted the phased small interfering RNA (phasiRNA) loci through integrative analyses of transcriptome and small RNA sequencing. We have shown that a newly evolved phasiRNA locus from lipoxygenases generated 12 consecutive 21 bp secondary RNAs, which were responsive to cold and heat stress in Camellia. 
Our studies of the isoform transcriptome provide insights into gene splicing and functions that may facilitate the mechanistic understanding of plants.</abstract><cop>United States</cop><pub>Taylor &amp; Francis</pub><pmid>32160106</pmid><doi>10.1080/15476286.2020.1738703</doi><tpages>11</tpages><orcidid>https://orcid.org/0000-0001-7249-8352</orcidid><orcidid>https://orcid.org/0000-0002-0720-5311</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1547-6286
ispartof RNA biology, 2020-07, Vol.17 (7), p.966-976
issn 1547-6286
1555-8584
language eng
recordid cdi_crossref_primary_10_1080_15476286_2020_1738703
source MEDLINE; Elektronische Zeitschriftenbibliothek; PubMed Central
subjects Alternative Splicing
Camellia
Camellia - genetics
Computational Biology
Gene Expression Profiling
Gene Expression Regulation, Plant
Genome, Plant
High-Throughput Nucleotide Sequencing
lipoxygenase
Molecular Sequence Annotation
phased small interfering RNA
Phenotype
Polyadenylation
Research Paper
RNA Isoforms
RNA, Untranslated - genetics
Single Molecule Imaging
single-molecule sequencing
Transcriptome
title Identification of alternatively spliced gene isoforms and novel noncoding RNAs by single-molecule long-read sequencing in Camellia
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-24T16%3A13%3A03IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Identification%20of%20alternatively%20spliced%20gene%20isoforms%20and%20novel%20noncoding%20RNAs%20by%20single-molecule%20long-read%20sequencing%20in%20Camellia&rft.jtitle=RNA%20biology&rft.au=Hu,%20Zhikang&rft.date=2020-07-02&rft.volume=17&rft.issue=7&rft.spage=966&rft.epage=976&rft.pages=966-976&rft.issn=1547-6286&rft.eissn=1555-8584&rft_id=info:doi/10.1080/15476286.2020.1738703&rft_dat=%3Cproquest_cross%3E2376729102%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2376729102&rft_id=info:pmid/32160106&rfr_iscdi=true
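
The url field above is an OpenURL (ctx_ver=Z39.88-2004) in key/value form, so the citation it carries can be recovered with an ordinary query-string parser. The following is a minimal sketch, assuming Python's standard urllib.parse module; the function name parse_openurl and the shortened copy of the link are illustrative assumptions and not part of the record.

```python
from urllib.parse import urlsplit, parse_qs

# Shortened copy of the resolver link in the url field; only a few of its
# key/value pairs are kept here for readability (an assumption for the demo).
openurl = (
    "https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004"
    "&rft.genre=article"
    "&rft.jtitle=RNA%20biology"
    "&rft.au=Hu,%20Zhikang"
    "&rft.date=2020-07-02"
    "&rft.volume=17&rft.issue=7&rft.spage=966&rft.epage=976"
    "&rft.issn=1547-6286&rft.eissn=1555-8584"
    "&rft_id=info:doi/10.1080/15476286.2020.1738703"
    "&rft_id=info:pmid/32160106"
)


def parse_openurl(url):
    """Return (metadata, identifiers) decoded from an OpenURL query string.

    Hypothetical helper written for this example only.
    """
    # parse_qs percent-decodes values and collects repeated keys into lists.
    query = parse_qs(urlsplit(url).query)
    # 'rft.*' keys describe the cited article; keep the first value of each.
    metadata = {
        key[len("rft."):]: values[0]
        for key, values in query.items()
        if key.startswith("rft.")
    }
    # 'rft_id' can repeat (DOI, PMID, ...), so keep every value.
    identifiers = query.get("rft_id", [])
    return metadata, identifiers


metadata, identifiers = parse_openurl(openurl)
print(metadata["jtitle"], metadata["volume"], metadata["spage"], metadata["epage"])
print(identifiers)
```

The decoded start and end pages (966 and 976) give an 11-page article, which agrees with the tpages value in the full record.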