Full-Length Transcript-Based Proteogenomics of Rice Improves Its Genome and Proteome Annotation

Rice (Oryza sativa) molecular breeding has gained considerable attention in recent years, but inaccurate genome annotation hampers its progress and functional studies of the rice genome. In this study, we applied single-molecule long-read RNA sequencing (lrRNA_seq)-based proteogenomics to reveal the complexity...

Detailed description

Bibliographic details
Published in:	Plant physiology (Bethesda) 2020-03, Vol.182 (3), p.1510-1526
Main authors:	Chen, Mo-Xian; Zhu, Fu-Yuan; Gao, Bei; Ma, Kai-Long; Zhang, Youjun; Fernie, Alisdair R; Chen, Xi; Dai, Lei; Ye, Neng-Hui; Zhang, Xue; Tian, Yuan; Zhang, Di; Xiao, Shi; Zhang, Jianhua; Liu, Ying-Gao
Format:	Article
Language:	eng
Subjects:	Oryza - genetics; Oryza - metabolism; Proteogenomics - methods; Proteome - metabolism; RNA, Antisense - genetics; Sequence Analysis, RNA; Transcriptome
Online access:	Full text

container_end_page	1526
container_issue	3
container_start_page	1510
container_title	Plant physiology (Bethesda)
container_volume	182
creator	Chen, Mo-Xian; Zhu, Fu-Yuan; Gao, Bei; Ma, Kai-Long; Zhang, Youjun; Fernie, Alisdair R; Chen, Xi; Dai, Lei; Ye, Neng-Hui; Zhang, Xue; Tian, Yuan; Zhang, Di; Xiao, Shi; Zhang, Jianhua; Liu, Ying-Gao
description	Rice (Oryza sativa) molecular breeding has gained considerable attention in recent years, but inaccurate genome annotation hampers its progress and functional studies of the rice genome. In this study, we applied single-molecule long-read RNA sequencing (lrRNA_seq)-based proteogenomics to reveal the complexity of the rice transcriptome and its coding abilities. Surprisingly, approximately 60% of loci identified by lrRNA_seq are associated with natural antisense transcripts (NATs). The high-density genomic arrangement of NAT genes suggests their potential roles in the multifaceted control of gene expression. In addition, a large number of fusion and intergenic transcripts have been observed. Furthermore, 906,456 transcript isoforms were identified, and 72.9% of the genes can generate splicing isoforms. A total of 706,075 posttranscriptional events were subsequently categorized into 10 subtypes, demonstrating the interdependence of posttranscriptional mechanisms that contribute to transcriptome diversity. Parallel short-read RNA sequencing indicated that lrRNA_seq has a superior capacity for the identification of longer transcripts. In addition, over 190,000 unique peptides belonging to 9,706 proteoforms/protein groups were identified, expanding the diversity of the rice proteome. Our findings indicate that the genome organization, transcriptome diversity, and coding potential of the rice transcriptome are far more complex than previously anticipated.
doi_str_mv	10.1104/pp.19.00430
format	Article
fullrecord	<record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_7054881</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2329729027</sourcerecordid><originalsourceid>FETCH-LOGICAL-c423t-f965a6f6dbbd7824a6477c296cdd6345cf5ea9cf3bb2200d52cca3fc718d34663</originalsourceid><addsrcrecordid>eNpVkctLAzEQxoMotlZP3mWPgmzNa18XQYuthYIi9RyyebQru8maZAv-927tAz3NDPPjm4_5ALhGcIwQpPdtO0bFGEJK4AkYooTgGCc0PwVDCPse5nkxABfef0IIEUH0HAwIypOMYjIEbNrVdbxQZhXW0dJx44Wr2hA_ca9k9OZsUHaljG0q4SOro_dKqGjetM5ulI_mwUez7VZF3Bzwfng0xgYeKmsuwZnmtVdX-zoCH9Pn5eQlXrzO5pPHRSx6GyHWRZrwVKeyLGWWY8pTmmUCF6mQMiU0ETpRvBCalCXGEMoEC8GJFhnKJaFpSkbgYafbdmWjpFAmOF6z1lUNd9_M8or935hqzVZ2wzLY_ypHvcDtXsDZr075wJrKC1XX3CjbeYYJLjJcQJz16N0OFc5675Q-nkGQbSNhbctQwX4j6embv86O7CED8gNCxoki</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2329729027</pqid></control><display><type>article</type><title>Full-Length Transcript-Based Proteogenomics of Rice Improves Its Genome and Proteome Annotation</title><source>Oxford University Press Journals All Titles (1996-Current)</source><source>MEDLINE</source><source>Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals</source><creator>Chen, Mo-Xian ; Zhu, Fu-Yuan ; Gao, Bei ; Ma, Kai-Long ; Zhang, Youjun ; Fernie, Alisdair R ; Chen, Xi ; Dai, Lei ; Ye, Neng-Hui ; Zhang, Xue ; Tian, Yuan ; Zhang, Di ; Xiao, Shi ; Zhang, Jianhua ; Liu, Ying-Gao</creator><creatorcontrib>Chen, Mo-Xian ; Zhu, Fu-Yuan ; Gao, Bei ; Ma, Kai-Long ; Zhang, Youjun ; Fernie, Alisdair R ; Chen, Xi ; Dai, Lei ; Ye, Neng-Hui ; Zhang, Xue ; Tian, Yuan ; Zhang, Di ; Xiao, Shi ; Zhang, Jianhua ; Liu, Ying-Gao</creatorcontrib><description>Rice ( ) molecular breeding has gained considerable attention in recent years, but inaccurate genome annotation hampers its progress and functional studies of the rice genome. 
In this study, we applied single-molecule long-read RNA sequencing (lrRNA_seq)-based proteogenomics to reveal the complexity of the rice transcriptome and its coding abilities. Surprisingly, approximately 60% of loci identified by lrRNA_seq are associated with natural antisense transcripts (NATs). The high-density genomic arrangement of NAT genes suggests their potential roles in the multifaceted control of gene expression. In addition, a large number of fusion and intergenic transcripts have been observed. Furthermore, 906,456 transcript isoforms were identified, and 72.9% of the genes can generate splicing isoforms. A total of 706,075 posttranscriptional events were subsequently categorized into 10 subtypes, demonstrating the interdependence of posttranscriptional mechanisms that contribute to transcriptome diversity. Parallel short-read RNA sequencing indicated that lrRNA_seq has a superior capacity for the identification of longer transcripts. In addition, over 190,000 unique peptides belonging to 9,706 proteoforms/protein groups were identified, expanding the diversity of the rice proteome. Our findings indicate that the genome organization, transcriptome diversity, and coding potential of the rice transcriptome are far more complex than previously anticipated.</description><identifier>ISSN: 0032-0889</identifier><identifier>EISSN: 1532-2548</identifier><identifier>DOI: 10.1104/pp.19.00430</identifier><identifier>PMID: 31857423</identifier><language>eng</language><publisher>United States: American Society of Plant Biologists</publisher><subject>Oryza - genetics ; Oryza - metabolism ; Proteogenomics - methods ; Proteome - metabolism ; RNA, Antisense - genetics ; Sequence Analysis, RNA ; Transcriptome</subject><ispartof>Plant physiology (Bethesda), 2020-03, Vol.182 (3), p.1510-1526</ispartof><rights>2020 American Society of Plant Biologists. All Rights Reserved.</rights><rights>2020 American Society of Plant Biologists. All Rights Reserved. 
2020</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c423t-f965a6f6dbbd7824a6477c296cdd6345cf5ea9cf3bb2200d52cca3fc718d34663</citedby><orcidid>0000-0003-3942-5797 ; 0000-0002-5261-8107 ; 0000-0003-1052-0256 ; 0000-0002-6632-8952 ; 0000-0001-9000-335X ; 0000-0002-7676-6075</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>230,314,776,780,881,27901,27902</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/31857423$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Chen, Mo-Xian</creatorcontrib><creatorcontrib>Zhu, Fu-Yuan</creatorcontrib><creatorcontrib>Gao, Bei</creatorcontrib><creatorcontrib>Ma, Kai-Long</creatorcontrib><creatorcontrib>Zhang, Youjun</creatorcontrib><creatorcontrib>Fernie, Alisdair R</creatorcontrib><creatorcontrib>Chen, Xi</creatorcontrib><creatorcontrib>Dai, Lei</creatorcontrib><creatorcontrib>Ye, Neng-Hui</creatorcontrib><creatorcontrib>Zhang, Xue</creatorcontrib><creatorcontrib>Tian, Yuan</creatorcontrib><creatorcontrib>Zhang, Di</creatorcontrib><creatorcontrib>Xiao, Shi</creatorcontrib><creatorcontrib>Zhang, Jianhua</creatorcontrib><creatorcontrib>Liu, Ying-Gao</creatorcontrib><title>Full-Length Transcript-Based Proteogenomics of Rice Improves Its Genome and Proteome Annotation</title><title>Plant physiology (Bethesda)</title><addtitle>Plant Physiol</addtitle><description>Rice ( ) molecular breeding has gained considerable attention in recent years, but inaccurate genome annotation hampers its progress and functional studies of the rice genome. In this study, we applied single-molecule long-read RNA sequencing (lrRNA_seq)-based proteogenomics to reveal the complexity of the rice transcriptome and its coding abilities. 
Surprisingly, approximately 60% of loci identified by lrRNA_seq are associated with natural antisense transcripts (NATs). The high-density genomic arrangement of NAT genes suggests their potential roles in the multifaceted control of gene expression. In addition, a large number of fusion and intergenic transcripts have been observed. Furthermore, 906,456 transcript isoforms were identified, and 72.9% of the genes can generate splicing isoforms. A total of 706,075 posttranscriptional events were subsequently categorized into 10 subtypes, demonstrating the interdependence of posttranscriptional mechanisms that contribute to transcriptome diversity. Parallel short-read RNA sequencing indicated that lrRNA_seq has a superior capacity for the identification of longer transcripts. In addition, over 190,000 unique peptides belonging to 9,706 proteoforms/protein groups were identified, expanding the diversity of the rice proteome. Our findings indicate that the genome organization, transcriptome diversity, and coding potential of the rice transcriptome are far more complex than previously anticipated.</description><subject>Oryza - genetics</subject><subject>Oryza - metabolism</subject><subject>Proteogenomics - methods</subject><subject>Proteome - metabolism</subject><subject>RNA, Antisense - genetics</subject><subject>Sequence Analysis, 
RNA</subject><subject>Transcriptome</subject><issn>0032-0889</issn><issn>1532-2548</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNpVkctLAzEQxoMotlZP3mWPgmzNa18XQYuthYIi9RyyebQru8maZAv-927tAz3NDPPjm4_5ALhGcIwQpPdtO0bFGEJK4AkYooTgGCc0PwVDCPse5nkxABfef0IIEUH0HAwIypOMYjIEbNrVdbxQZhXW0dJx44Wr2hA_ca9k9OZsUHaljG0q4SOro_dKqGjetM5ulI_mwUez7VZF3Bzwfng0xgYeKmsuwZnmtVdX-zoCH9Pn5eQlXrzO5pPHRSx6GyHWRZrwVKeyLGWWY8pTmmUCF6mQMiU0ETpRvBCalCXGEMoEC8GJFhnKJaFpSkbgYafbdmWjpFAmOF6z1lUNd9_M8or935hqzVZ2wzLY_ypHvcDtXsDZr075wJrKC1XX3CjbeYYJLjJcQJz16N0OFc5675Q-nkGQbSNhbctQwX4j6embv86O7CED8gNCxoki</recordid><startdate>20200301</startdate><enddate>20200301</enddate><creator>Chen, Mo-Xian</creator><creator>Zhu, Fu-Yuan</creator><creator>Gao, Bei</creator><creator>Ma, Kai-Long</creator><creator>Zhang, Youjun</creator><creator>Fernie, Alisdair R</creator><creator>Chen, Xi</creator><creator>Dai, Lei</creator><creator>Ye, Neng-Hui</creator><creator>Zhang, Xue</creator><creator>Tian, Yuan</creator><creator>Zhang, Di</creator><creator>Xiao, Shi</creator><creator>Zhang, Jianhua</creator><creator>Liu, Ying-Gao</creator><general>American Society of Plant Biologists</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><scope>5PM</scope><orcidid>https://orcid.org/0000-0003-3942-5797</orcidid><orcidid>https://orcid.org/0000-0002-5261-8107</orcidid><orcidid>https://orcid.org/0000-0003-1052-0256</orcidid><orcidid>https://orcid.org/0000-0002-6632-8952</orcidid><orcidid>https://orcid.org/0000-0001-9000-335X</orcidid><orcidid>https://orcid.org/0000-0002-7676-6075</orcidid></search><sort><creationdate>20200301</creationdate><title>Full-Length Transcript-Based Proteogenomics of Rice Improves Its Genome and Proteome Annotation</title><author>Chen, Mo-Xian ; Zhu, Fu-Yuan ; Gao, Bei 
; Ma, Kai-Long ; Zhang, Youjun ; Fernie, Alisdair R ; Chen, Xi ; Dai, Lei ; Ye, Neng-Hui ; Zhang, Xue ; Tian, Yuan ; Zhang, Di ; Xiao, Shi ; Zhang, Jianhua ; Liu, Ying-Gao</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c423t-f965a6f6dbbd7824a6477c296cdd6345cf5ea9cf3bb2200d52cca3fc718d34663</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>Oryza - genetics</topic><topic>Oryza - metabolism</topic><topic>Proteogenomics - methods</topic><topic>Proteome - metabolism</topic><topic>RNA, Antisense - genetics</topic><topic>Sequence Analysis, RNA</topic><topic>Transcriptome</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Chen, Mo-Xian</creatorcontrib><creatorcontrib>Zhu, Fu-Yuan</creatorcontrib><creatorcontrib>Gao, Bei</creatorcontrib><creatorcontrib>Ma, Kai-Long</creatorcontrib><creatorcontrib>Zhang, Youjun</creatorcontrib><creatorcontrib>Fernie, Alisdair R</creatorcontrib><creatorcontrib>Chen, Xi</creatorcontrib><creatorcontrib>Dai, Lei</creatorcontrib><creatorcontrib>Ye, Neng-Hui</creatorcontrib><creatorcontrib>Zhang, Xue</creatorcontrib><creatorcontrib>Tian, Yuan</creatorcontrib><creatorcontrib>Zhang, Di</creatorcontrib><creatorcontrib>Xiao, Shi</creatorcontrib><creatorcontrib>Zhang, Jianhua</creatorcontrib><creatorcontrib>Liu, Ying-Gao</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Plant physiology (Bethesda)</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Chen, Mo-Xian</au><au>Zhu, Fu-Yuan</au><au>Gao, Bei</au><au>Ma, 
Kai-Long</au><au>Zhang, Youjun</au><au>Fernie, Alisdair R</au><au>Chen, Xi</au><au>Dai, Lei</au><au>Ye, Neng-Hui</au><au>Zhang, Xue</au><au>Tian, Yuan</au><au>Zhang, Di</au><au>Xiao, Shi</au><au>Zhang, Jianhua</au><au>Liu, Ying-Gao</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Full-Length Transcript-Based Proteogenomics of Rice Improves Its Genome and Proteome Annotation</atitle><jtitle>Plant physiology (Bethesda)</jtitle><addtitle>Plant Physiol</addtitle><date>2020-03-01</date><risdate>2020</risdate><volume>182</volume><issue>3</issue><spage>1510</spage><epage>1526</epage><pages>1510-1526</pages><issn>0032-0889</issn><eissn>1532-2548</eissn><abstract>Rice ( ) molecular breeding has gained considerable attention in recent years, but inaccurate genome annotation hampers its progress and functional studies of the rice genome. In this study, we applied single-molecule long-read RNA sequencing (lrRNA_seq)-based proteogenomics to reveal the complexity of the rice transcriptome and its coding abilities. Surprisingly, approximately 60% of loci identified by lrRNA_seq are associated with natural antisense transcripts (NATs). The high-density genomic arrangement of NAT genes suggests their potential roles in the multifaceted control of gene expression. In addition, a large number of fusion and intergenic transcripts have been observed. Furthermore, 906,456 transcript isoforms were identified, and 72.9% of the genes can generate splicing isoforms. A total of 706,075 posttranscriptional events were subsequently categorized into 10 subtypes, demonstrating the interdependence of posttranscriptional mechanisms that contribute to transcriptome diversity. Parallel short-read RNA sequencing indicated that lrRNA_seq has a superior capacity for the identification of longer transcripts. In addition, over 190,000 unique peptides belonging to 9,706 proteoforms/protein groups were identified, expanding the diversity of the rice proteome. 
Our findings indicate that the genome organization, transcriptome diversity, and coding potential of the rice transcriptome are far more complex than previously anticipated.</abstract><cop>United States</cop><pub>American Society of Plant Biologists</pub><pmid>31857423</pmid><doi>10.1104/pp.19.00430</doi><tpages>17</tpages><orcidid>https://orcid.org/0000-0003-3942-5797</orcidid><orcidid>https://orcid.org/0000-0002-5261-8107</orcidid><orcidid>https://orcid.org/0000-0003-1052-0256</orcidid><orcidid>https://orcid.org/0000-0002-6632-8952</orcidid><orcidid>https://orcid.org/0000-0001-9000-335X</orcidid><orcidid>https://orcid.org/0000-0002-7676-6075</orcidid><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	ISSN: 0032-0889
ispartof	Plant physiology (Bethesda), 2020-03, Vol.182 (3), p.1510-1526
issn	0032-0889 1532-2548
language	eng
recordid	cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_7054881
source	Oxford University Press Journals All Titles (1996-Current); MEDLINE; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals
subjects	Oryza - genetics; Oryza - metabolism; Proteogenomics - methods; Proteome - metabolism; RNA, Antisense - genetics; Sequence Analysis, RNA; Transcriptome
title	Full-Length Transcript-Based Proteogenomics of Rice Improves Its Genome and Proteome Annotation
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-09T22%3A18%3A28IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Full-Length%20Transcript-Based%20Proteogenomics%20of%20Rice%20Improves%20Its%20Genome%20and%20Proteome%20Annotation&rft.jtitle=Plant%20physiology%20(Bethesda)&rft.au=Chen,%20Mo-Xian&rft.date=2020-03-01&rft.volume=182&rft.issue=3&rft.spage=1510&rft.epage=1526&rft.pages=1510-1526&rft.issn=0032-0889&rft.eissn=1532-2548&rft_id=info:doi/10.1104/pp.19.00430&rft_dat=%3Cproquest_pubme%3E2329729027%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2329729027&rft_id=info:pmid/31857423&rfr_iscdi=true