Genome-scale analysis of human mRNA 5′ coding sequences based on expressed sequence tag (EST) database

The “5′ end mRNA artifact” issue refers to the incorrect assignment of the first AUG codon in an mRNA, due to the incomplete determination of its 5′ end sequence. We performed a systematic identification of coding regions at the 5′ end of all human known mRNAs, using an automated expressed sequence...
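The record's full description mentions one concrete criterion: an in-frame stop codon upstream of the annotated start codon indicates that the coding sequence is complete at its 5′ end. The sketch below illustrates only that check on a single sequence. It is not the authors' EST/BLAT pipeline, and the function names, the 0-based `start` index, and the return labels are hypothetical choices made for illustration.

```python
from typing import Optional

# Standard stop codons, written in DNA letters (cDNA-style sequence).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def upstream_in_frame_stop(mrna: str, start: int) -> Optional[int]:
    """Return the 0-based index of the nearest stop codon that lies upstream
    of `start` and in the same reading frame, or None if there is none.

    `start` is the index of the first base of the annotated start codon
    (an assumption of this sketch, not a convention taken from the paper).
    """
    seq = mrna.upper().replace("U", "T")  # accept RNA or DNA letters
    pos = start - 3
    while pos >= 0:
        if seq[pos:pos + 3] in STOP_CODONS:
            return pos
        pos -= 3  # step back one codon, staying in frame
    return None


def classify_five_prime(mrna: str, start: int) -> str:
    """Label a sequence 'complete' when an upstream in-frame stop codon
    exists, otherwise 'possibly_incomplete' (hypothetical labels)."""
    seq = mrna.upper().replace("U", "T")
    if seq[start:start + 3] != "ATG":
        raise ValueError("no ATG/AUG at the given start position")
    if upstream_in_frame_stop(seq, start) is not None:
        return "complete"
    return "possibly_incomplete"
```

A sequence labelled `possibly_incomplete` here is only a candidate: deciding whether the coding region really extends further 5′ would still need additional upstream sequence, which is where the EST alignments described in the abstract come in.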

Bibliographic details
Published in: Genomics (San Diego, Calif.), 2012-08, Vol.100 (2), p.125-130
Main authors: Casadei, Raffaella, Piovesan, Allison, Vitale, Lorenza, Facchin, Federica, Pelleri, Maria Chiara, Canaider, Silvia, Bianconi, Eva, Frabetti, Flavia, Strippoli, Pierluigi
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 130
container_issue 2
container_start_page 125
container_title Genomics (San Diego, Calif.)
container_volume 100
creator Casadei, Raffaella
Piovesan, Allison
Vitale, Lorenza
Facchin, Federica
Pelleri, Maria Chiara
Canaider, Silvia
Bianconi, Eva
Frabetti, Flavia
Strippoli, Pierluigi
description The “5′ end mRNA artifact” issue refers to the incorrect assignment of the first AUG codon in an mRNA, due to the incomplete determination of its 5′ end sequence. We performed a systematic identification of coding regions at the 5′ end of all human known mRNAs, using an automated expressed sequence tag (EST)-based approach. Following parsing of more than 7 million BLAT alignments, we found 477 human loci, out of 18,665 analyzed, in which an extension of the mRNA 5′ coding region was identified. Proof-of-concept confirmation was obtained by in vitro cloning and sequencing for GNB2L1, QARS and TDP2 cDNAs, and the consequences for the functional studies of these loci are discussed. We also generated a list of 20,775 human mRNAs where the presence of an in-frame stop codon upstream of the known start codon indicates completeness of the coding sequence at 5′ in the current form. ► Start codon may be incorrectly assigned if mRNA sequence is not complete at 5′ end. ► An EST-based computational approach identifies extended 5′ coding regions. ► The mRNA 5′ coding region is extended in 477 human loci out of 18,665 analyzed. ► Experimental confirmation has been obtained for GNB2L1, QARS and TDP2. ► 20,775 human mRNAs have a coding region not further extendable by EST sequences.
doi_str_mv 10.1016/j.ygeno.2012.05.012
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_1027680185</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S088875431200105X</els_id><sourcerecordid>1027680185</sourcerecordid><originalsourceid>FETCH-LOGICAL-c458t-28271fb9442659b1e8a9013cac2f515414c73c954a94de2cb853139b9dc513283</originalsourceid><addsrcrecordid>eNp9kc9u1DAQhy0EosvCEyCBL0jtIcF_E_vQQ1WVglSBRNuz5TiTrVdJvNhZxN54Jh6JJ6nDbsutp5Hlb2Z-_ozQW0pKSmj1cV3uVjCGkhHKSiLLXJ6hBSVKF6oS1XO0IEqpopaCH6FXKa0JIZor9hIdMVZJTZhaoLvLPGKAIjnbA7aj7XfJJxw6fLcd7IiH71_PsPz7-w92ofXjCif4sYXRQcKNTdDiMGL4tYmQ5sPDJZ7sCh9fXN-c4NZOdiZfoxed7RO8OdQluv10cXP-ubj6dvnl_OyqcEKqqWCK1bRrtBBzxIaCsppQ7qxjnaRSUOFq7rQUVosWmGuU5JTrRrdOUs4UX6Lj_dxNDDlMmszgk4O-tyOEbTKUsLpShOa-JeJ71MWQUoTObKIfbNxlyMyKzdr8U2xmxYZIk0vuendYsG0GaB97Hpxm4MMBsLPVLtrR-fSfq5iUWtSZe7_nOhuMXcXM3F7nTTL_k2D5UZk43ROQhf30EE1yfvbb-ghuMm3wT0a9B8ABozw</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1027680185</pqid></control><display><type>article</type><title>Genome-scale analysis of human mRNA 5′ coding sequences based on expressed sequence tag (EST) database</title><source>MEDLINE</source><source>Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals</source><source>ScienceDirect Journals (5 years ago - present)</source><creator>Casadei, Raffaella ; Piovesan, Allison ; Vitale, Lorenza ; Facchin, Federica ; Pelleri, Maria Chiara ; Canaider, Silvia ; Bianconi, Eva ; Frabetti, Flavia ; Strippoli, Pierluigi</creator><creatorcontrib>Casadei, Raffaella ; Piovesan, Allison ; Vitale, Lorenza ; Facchin, Federica ; Pelleri, Maria Chiara ; Canaider, Silvia ; Bianconi, Eva ; Frabetti, Flavia ; Strippoli, Pierluigi</creatorcontrib><description>The “5′ end mRNA artifact” issue refers to the incorrect assignment of the first AUG codon in an mRNA, due to the incomplete determination of its 5′ end sequence. 
We performed a systematic identification of coding regions at the 5′ end of all human known mRNAs, using an automated expressed sequence tag (EST)-based approach. Following parsing of more than 7million BLAT alignments, we found 477 human loci, out of 18,665 analyzed, in which an extension of the mRNA 5′ coding region was identified. Proof-of-concept confirmation was obtained by in vitro cloning and sequencing for GNB2L1, QARS and TDP2 cDNAs, and the consequences for the functional studies of these loci are discussed. We also generated a list of 20,775 human mRNAs where the presence of an in-frame stop codon upstream of the known start codon indicates completeness of the coding sequence at 5′ in the current form. ► Start codon may be incorrectly assigned if mRNA sequence is not complete at 5′ end. ► An EST-based computational approach identifies extended 5′ coding regions. ► The mRNA 5′ coding region is extended in 477 human loci out of 18,665 analyzed. ► Experimental confirmation has been obtained for GNB2L1, QARS and TDP2. ► 20,775 human mRNAs have a coding region not further extendable by EST sequences.</description><identifier>ISSN: 0888-7543</identifier><identifier>EISSN: 1089-8646</identifier><identifier>DOI: 10.1016/j.ygeno.2012.05.012</identifier><identifier>PMID: 22659028</identifier><language>eng</language><publisher>Amsterdam: Elsevier Inc</publisher><subject>5' Untranslated Regions - genetics ; 5′ Untranslated region (5′ UTR) ; Amino Acid Sequence ; Biological and medical sciences ; Cloning, Molecular ; Codon, Initiator ; complementary DNA ; Computational Biology ; Databases, Genetic ; Diverse techniques ; DNA, Complementary ; Expressed sequence tag (EST) ; Expressed Sequence Tags ; Fundamental and applied biological sciences. Psychology ; Genes. Genome ; Genetic Association Studies - methods ; Genetic Loci ; Genetics of eukaryotes. 
Biological and molecular evolution ; Genome, Human ; Human genome ; Humans ; loci ; messenger RNA ; Molecular and cellular biology ; Molecular genetics ; Molecular Sequence Data ; mRNA 5′ coding sequence ; Open Reading Frames ; RNA, Messenger - genetics ; Sequence Alignment ; Sequence Analysis, DNA ; start codon ; stop codon ; Translation start codon</subject><ispartof>Genomics (San Diego, Calif.), 2012-08, Vol.100 (2), p.125-130</ispartof><rights>2012 Elsevier Inc.</rights><rights>2015 INIST-CNRS</rights><rights>Copyright © 2012 Elsevier Inc. All rights reserved.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c458t-28271fb9442659b1e8a9013cac2f515414c73c954a94de2cb853139b9dc513283</citedby><cites>FETCH-LOGICAL-c458t-28271fb9442659b1e8a9013cac2f515414c73c954a94de2cb853139b9dc513283</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://dx.doi.org/10.1016/j.ygeno.2012.05.012$$EHTML$$P50$$Gelsevier$$Hfree_for_read</linktohtml><link.rule.ids>314,778,782,3539,27911,27912,45982</link.rule.ids><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&amp;idt=26255947$$DView record in Pascal Francis$$Hfree_for_read</backlink><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/22659028$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Casadei, Raffaella</creatorcontrib><creatorcontrib>Piovesan, Allison</creatorcontrib><creatorcontrib>Vitale, Lorenza</creatorcontrib><creatorcontrib>Facchin, Federica</creatorcontrib><creatorcontrib>Pelleri, Maria Chiara</creatorcontrib><creatorcontrib>Canaider, Silvia</creatorcontrib><creatorcontrib>Bianconi, Eva</creatorcontrib><creatorcontrib>Frabetti, Flavia</creatorcontrib><creatorcontrib>Strippoli, Pierluigi</creatorcontrib><title>Genome-scale 
analysis of human mRNA 5′ coding sequences based on expressed sequence tag (EST) database</title><title>Genomics (San Diego, Calif.)</title><addtitle>Genomics</addtitle><description>The “5′ end mRNA artifact” issue refers to the incorrect assignment of the first AUG codon in an mRNA, due to the incomplete determination of its 5′ end sequence. We performed a systematic identification of coding regions at the 5′ end of all human known mRNAs, using an automated expressed sequence tag (EST)-based approach. Following parsing of more than 7million BLAT alignments, we found 477 human loci, out of 18,665 analyzed, in which an extension of the mRNA 5′ coding region was identified. Proof-of-concept confirmation was obtained by in vitro cloning and sequencing for GNB2L1, QARS and TDP2 cDNAs, and the consequences for the functional studies of these loci are discussed. We also generated a list of 20,775 human mRNAs where the presence of an in-frame stop codon upstream of the known start codon indicates completeness of the coding sequence at 5′ in the current form. ► Start codon may be incorrectly assigned if mRNA sequence is not complete at 5′ end. ► An EST-based computational approach identifies extended 5′ coding regions. ► The mRNA 5′ coding region is extended in 477 human loci out of 18,665 analyzed. ► Experimental confirmation has been obtained for GNB2L1, QARS and TDP2. 
► 20,775 human mRNAs have a coding region not further extendable by EST sequences.</description><subject>5' Untranslated Regions - genetics</subject><subject>5′ Untranslated region (5′ UTR)</subject><subject>Amino Acid Sequence</subject><subject>Biological and medical sciences</subject><subject>Cloning, Molecular</subject><subject>Codon, Initiator</subject><subject>complementary DNA</subject><subject>Computational Biology</subject><subject>Databases, Genetic</subject><subject>Diverse techniques</subject><subject>DNA, Complementary</subject><subject>Expressed sequence tag (EST)</subject><subject>Expressed Sequence Tags</subject><subject>Fundamental and applied biological sciences. Psychology</subject><subject>Genes. Genome</subject><subject>Genetic Association Studies - methods</subject><subject>Genetic Loci</subject><subject>Genetics of eukaryotes. Biological and molecular evolution</subject><subject>Genome, Human</subject><subject>Human genome</subject><subject>Humans</subject><subject>loci</subject><subject>messenger RNA</subject><subject>Molecular and cellular biology</subject><subject>Molecular genetics</subject><subject>Molecular Sequence Data</subject><subject>mRNA 5′ coding sequence</subject><subject>Open Reading Frames</subject><subject>RNA, Messenger - genetics</subject><subject>Sequence Alignment</subject><subject>Sequence Analysis, DNA</subject><subject>start codon</subject><subject>stop codon</subject><subject>Translation start 
codon</subject><issn>0888-7543</issn><issn>1089-8646</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2012</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNp9kc9u1DAQhy0EosvCEyCBL0jtIcF_E_vQQ1WVglSBRNuz5TiTrVdJvNhZxN54Jh6JJ6nDbsutp5Hlb2Z-_ozQW0pKSmj1cV3uVjCGkhHKSiLLXJ6hBSVKF6oS1XO0IEqpopaCH6FXKa0JIZor9hIdMVZJTZhaoLvLPGKAIjnbA7aj7XfJJxw6fLcd7IiH71_PsPz7-w92ofXjCif4sYXRQcKNTdDiMGL4tYmQ5sPDJZ7sCh9fXN-c4NZOdiZfoxed7RO8OdQluv10cXP-ubj6dvnl_OyqcEKqqWCK1bRrtBBzxIaCsppQ7qxjnaRSUOFq7rQUVosWmGuU5JTrRrdOUs4UX6Lj_dxNDDlMmszgk4O-tyOEbTKUsLpShOa-JeJ71MWQUoTObKIfbNxlyMyKzdr8U2xmxYZIk0vuendYsG0GaB97Hpxm4MMBsLPVLtrR-fSfq5iUWtSZe7_nOhuMXcXM3F7nTTL_k2D5UZk43ROQhf30EE1yfvbb-ghuMm3wT0a9B8ABozw</recordid><startdate>20120801</startdate><enddate>20120801</enddate><creator>Casadei, Raffaella</creator><creator>Piovesan, Allison</creator><creator>Vitale, Lorenza</creator><creator>Facchin, Federica</creator><creator>Pelleri, Maria Chiara</creator><creator>Canaider, Silvia</creator><creator>Bianconi, Eva</creator><creator>Frabetti, Flavia</creator><creator>Strippoli, Pierluigi</creator><general>Elsevier Inc</general><general>Elsevier</general><scope>6I.</scope><scope>AAFTH</scope><scope>FBQ</scope><scope>IQODW</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope></search><sort><creationdate>20120801</creationdate><title>Genome-scale analysis of human mRNA 5′ coding sequences based on expressed sequence tag (EST) database</title><author>Casadei, Raffaella ; Piovesan, Allison ; Vitale, Lorenza ; Facchin, Federica ; Pelleri, Maria Chiara ; Canaider, Silvia ; Bianconi, Eva ; Frabetti, Flavia ; Strippoli, 
Pierluigi</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c458t-28271fb9442659b1e8a9013cac2f515414c73c954a94de2cb853139b9dc513283</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2012</creationdate><topic>5' Untranslated Regions - genetics</topic><topic>5′ Untranslated region (5′ UTR)</topic><topic>Amino Acid Sequence</topic><topic>Biological and medical sciences</topic><topic>Cloning, Molecular</topic><topic>Codon, Initiator</topic><topic>complementary DNA</topic><topic>Computational Biology</topic><topic>Databases, Genetic</topic><topic>Diverse techniques</topic><topic>DNA, Complementary</topic><topic>Expressed sequence tag (EST)</topic><topic>Expressed Sequence Tags</topic><topic>Fundamental and applied biological sciences. Psychology</topic><topic>Genes. Genome</topic><topic>Genetic Association Studies - methods</topic><topic>Genetic Loci</topic><topic>Genetics of eukaryotes. Biological and molecular evolution</topic><topic>Genome, Human</topic><topic>Human genome</topic><topic>Humans</topic><topic>loci</topic><topic>messenger RNA</topic><topic>Molecular and cellular biology</topic><topic>Molecular genetics</topic><topic>Molecular Sequence Data</topic><topic>mRNA 5′ coding sequence</topic><topic>Open Reading Frames</topic><topic>RNA, Messenger - genetics</topic><topic>Sequence Alignment</topic><topic>Sequence Analysis, DNA</topic><topic>start codon</topic><topic>stop codon</topic><topic>Translation start codon</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Casadei, Raffaella</creatorcontrib><creatorcontrib>Piovesan, Allison</creatorcontrib><creatorcontrib>Vitale, Lorenza</creatorcontrib><creatorcontrib>Facchin, Federica</creatorcontrib><creatorcontrib>Pelleri, Maria Chiara</creatorcontrib><creatorcontrib>Canaider, Silvia</creatorcontrib><creatorcontrib>Bianconi, Eva</creatorcontrib><creatorcontrib>Frabetti, 
Flavia</creatorcontrib><creatorcontrib>Strippoli, Pierluigi</creatorcontrib><collection>ScienceDirect Open Access Titles</collection><collection>Elsevier:ScienceDirect:Open Access</collection><collection>AGRIS</collection><collection>Pascal-Francis</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><jtitle>Genomics (San Diego, Calif.)</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Casadei, Raffaella</au><au>Piovesan, Allison</au><au>Vitale, Lorenza</au><au>Facchin, Federica</au><au>Pelleri, Maria Chiara</au><au>Canaider, Silvia</au><au>Bianconi, Eva</au><au>Frabetti, Flavia</au><au>Strippoli, Pierluigi</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Genome-scale analysis of human mRNA 5′ coding sequences based on expressed sequence tag (EST) database</atitle><jtitle>Genomics (San Diego, Calif.)</jtitle><addtitle>Genomics</addtitle><date>2012-08-01</date><risdate>2012</risdate><volume>100</volume><issue>2</issue><spage>125</spage><epage>130</epage><pages>125-130</pages><issn>0888-7543</issn><eissn>1089-8646</eissn><abstract>The “5′ end mRNA artifact” issue refers to the incorrect assignment of the first AUG codon in an mRNA, due to the incomplete determination of its 5′ end sequence. We performed a systematic identification of coding regions at the 5′ end of all human known mRNAs, using an automated expressed sequence tag (EST)-based approach. Following parsing of more than 7million BLAT alignments, we found 477 human loci, out of 18,665 analyzed, in which an extension of the mRNA 5′ coding region was identified. 
Proof-of-concept confirmation was obtained by in vitro cloning and sequencing for GNB2L1, QARS and TDP2 cDNAs, and the consequences for the functional studies of these loci are discussed. We also generated a list of 20,775 human mRNAs where the presence of an in-frame stop codon upstream of the known start codon indicates completeness of the coding sequence at 5′ in the current form. ► Start codon may be incorrectly assigned if mRNA sequence is not complete at 5′ end. ► An EST-based computational approach identifies extended 5′ coding regions. ► The mRNA 5′ coding region is extended in 477 human loci out of 18,665 analyzed. ► Experimental confirmation has been obtained for GNB2L1, QARS and TDP2. ► 20,775 human mRNAs have a coding region not further extendable by EST sequences.</abstract><cop>Amsterdam</cop><pub>Elsevier Inc</pub><pmid>22659028</pmid><doi>10.1016/j.ygeno.2012.05.012</doi><tpages>6</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 0888-7543
ispartof Genomics (San Diego, Calif.), 2012-08, Vol.100 (2), p.125-130
issn 0888-7543
1089-8646
language eng
recordid cdi_proquest_miscellaneous_1027680185
source MEDLINE; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals; ScienceDirect Journals (5 years ago - present)
subjects 5' Untranslated Regions - genetics
5′ Untranslated region (5′ UTR)
Amino Acid Sequence
Biological and medical sciences
Cloning, Molecular
Codon, Initiator
complementary DNA
Computational Biology
Databases, Genetic
Diverse techniques
DNA, Complementary
Expressed sequence tag (EST)
Expressed Sequence Tags
Fundamental and applied biological sciences. Psychology
Genes. Genome
Genetic Association Studies - methods
Genetic Loci
Genetics of eukaryotes. Biological and molecular evolution
Genome, Human
Human genome
Humans
loci
messenger RNA
Molecular and cellular biology
Molecular genetics
Molecular Sequence Data
mRNA 5′ coding sequence
Open Reading Frames
RNA, Messenger - genetics
Sequence Alignment
Sequence Analysis, DNA
start codon
stop codon
Translation start codon
title Genome-scale analysis of human mRNA 5′ coding sequences based on expressed sequence tag (EST) database
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-15T14%3A41%3A40IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Genome-scale%20analysis%20of%20human%20mRNA%205%E2%80%B2%20coding%20sequences%20based%20on%20expressed%20sequence%20tag%20(EST)%20database&rft.jtitle=Genomics%20(San%20Diego,%20Calif.)&rft.au=Casadei,%20Raffaella&rft.date=2012-08-01&rft.volume=100&rft.issue=2&rft.spage=125&rft.epage=130&rft.pages=125-130&rft.issn=0888-7543&rft.eissn=1089-8646&rft_id=info:doi/10.1016/j.ygeno.2012.05.012&rft_dat=%3Cproquest_cross%3E1027680185%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1027680185&rft_id=info:pmid/22659028&rft_els_id=S088875431200105X&rfr_iscdi=true