Improving the Arabidopsis genome annotation using maximal transcript alignment assemblies

The spliced alignment of expressed sequence data to genomic sequence has proven a key tool in the comprehensive annotation of genes in eukaryotic genomes. A novel algorithm was developed to assemble clusters of overlapping transcript alignments (ESTs and full‐length cDNAs) into maximal alignment ass...

Detailed description

Bibliographic details
Published in:	Nucleic acids research 2003-10, Vol.31 (19), p.5654-5666
Main authors:	Haas, Brian J., Delcher, Arthur L., Mount, Stephen M., Wortman, Jennifer R., Smith Jr, Roger K., Hannick, Linda I., Maiti, Rama, Ronning, Catherine M., Rusch, Douglas B., Town, Christopher D., Salzberg, Steven L., White, Owen
Format:	Article
Language:	eng
Subjects:	Algorithms; Alternative Splicing; Arabidopsis - genetics; Arabidopsis - metabolism; Arabidopsis thaliana; DNA, Complementary - analysis; Expressed Sequence Tags; Genome, Plant; Introns; Plant Proteins - genetics; RNA, Plant - analysis; RNA, Plant - chemistry; Sequence Alignment - methods; Software; Transcription, Genetic; Untranslated Regions
Online access:	Full text

container_end_page	5666
container_issue	19
container_start_page	5654
container_title	Nucleic acids research
container_volume	31
creator	Haas, Brian J. ; Delcher, Arthur L. ; Mount, Stephen M. ; Wortman, Jennifer R. ; Smith Jr, Roger K. ; Hannick, Linda I. ; Maiti, Rama ; Ronning, Catherine M. ; Rusch, Douglas B. ; Town, Christopher D. ; Salzberg, Steven L. ; White, Owen
description	The spliced alignment of expressed sequence data to genomic sequence has proven a key tool in the comprehensive annotation of genes in eukaryotic genomes. A novel algorithm was developed to assemble clusters of overlapping transcript alignments (ESTs and full‐length cDNAs) into maximal alignment assemblies, thereby comprehensively incorporating all available transcript data and capturing subtle splicing variations. Complete and partial gene structures identified by this method were used to improve The Institute for Genomic Research Arabidopsis genome annotation (TIGR release v.4.0). The alignment assemblies permitted the automated modeling of several novel genes and >1000 alternative splicing variations as well as updates (including UTR annotations) to nearly half of the ∼27 000 annotated protein coding genes. The algorithm of the Program to Assemble Spliced Alignments (PASA) tool is described, as well as the results of automated updates to Arabidopsis gene annotations.
doi_str_mv	10.1093/nar/gkg770
format	Article
fullrecord	<record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_206470</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>73671874</sourcerecordid><originalsourceid>FETCH-LOGICAL-c541t-94fcfe1196a505a022b0cb7627a9fedc0ded6e3be09f6c0aa9fce8f68ef059433</originalsourceid><addsrcrecordid>eNqNkU1v1DAQhi0EotvChR-AIg4cKoWO46_4wKFUlK1UiQtICxfLyU5St4md2klV_j2udlU-LnDyyPO8o3fmJeQVhXcUNDvxNp70N71S8ISsKJNVybWsnpIVMBAlBV4fkMOUrgEop4I_JweUC4C60ivy7WKcYrhzvi_mKyxOo23cNkzJpaJHH0YsrPdhtrMLvljSAzfaezfaoZij9amNbpoLO7jej-hzlRKOzeAwvSDPOjskfLl_j8jX849fztbl5edPF2enl2UrOJ1Lzbu2Q0q1tAKEhapqoG2UrJTVHW5b2OJWImsQdCdbsPm3xbqTNXYgNGfsiLzfzZ2WZsyC7CLawUwxm4w_TLDO_Nnx7sr04c5UILmCrH-718dwu2CazehSi8NgPYYlGcWkorXi_wTzClKLuv4PkAotQGbwzV_gdViiz9fK5kDUVNQiQ8c7qI0hpYjd42oUzEP-Judvdvln-PXvx_iF7gPPQLkDXJrx_rFv442Riilh1pvvZsMAzj-sN4azn9uKvqI</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>200581585</pqid></control><display><type>article</type><title>Improving the Arabidopsis genome annotation using maximal transcript alignment assemblies</title><source>MEDLINE</source><source>PubMed (Medline)</source><source>Oxford Journals Open Access Collection</source><source>Free Full-Text Journals in Chemistry</source><creator>Haas, Brian J. ; Delcher, Arthur L. ; Mount, Stephen M. ; Wortman, Jennifer R. ; Smith Jr, Roger K. ; Hannick, Linda I. ; Maiti, Rama ; Ronning, Catherine M. ; Rusch, Douglas B. ; Town, Christopher D. ; Salzberg, Steven L. ; White, Owen</creator><creatorcontrib>Haas, Brian J. ; Delcher, Arthur L. ; Mount, Stephen M. ; Wortman, Jennifer R. ; Smith Jr, Roger K. ; Hannick, Linda I. ; Maiti, Rama ; Ronning, Catherine M. ; Rusch, Douglas B. ; Town, Christopher D. ; Salzberg, Steven L. 
; White, Owen</creatorcontrib><description>The spliced alignment of expressed sequence data to genomic sequence has proven a key tool in the comprehensive annotation of genes in eukaryotic genomes. A novel algorithm was developed to assemble clusters of overlapping transcript alignments (ESTs and full‐length cDNAs) into maximal alignment assemblies, thereby comprehensively incorporating all available transcript data and capturing subtle splicing variations. Complete and partial gene structures identified by this method were used to improve The Institute for Genomic Research Arabidopsis genome annotation (TIGR release v.4.0). The alignment assemblies permitted the automated modeling of several novel genes and >1000 alternative splicing variations as well as updates (including UTR annotations) to nearly half of the ∼27 000 annotated protein coding genes. The algorithm of the Program to Assemble Spliced Alignments (PASA) tool is described, as well as the results of automated updates to Arabidopsis gene annotations.</description><identifier>ISSN: 0305-1048</identifier><identifier>ISSN: 1362-4962</identifier><identifier>EISSN: 1362-4962</identifier><identifier>DOI: 10.1093/nar/gkg770</identifier><identifier>PMID: 14500829</identifier><identifier>CODEN: NARHAD</identifier><language>eng</language><publisher>England: Oxford University Press</publisher><subject>Algorithms ; Alternative Splicing ; Arabidopsis - genetics ; Arabidopsis - metabolism ; Arabidopsis thaliana ; DNA, Complementary - analysis ; Expressed Sequence Tags ; Genome, Plant ; Introns ; Plant Proteins - genetics ; RNA, Plant - analysis ; RNA, Plant - chemistry ; Sequence Alignment - methods ; Software ; Transcription, Genetic ; Untranslated Regions</subject><ispartof>Nucleic acids research, 2003-10, Vol.31 (19), p.5654-5666</ispartof><rights>Copyright Oxford University Press(England) Oct 01, 2003</rights><rights>Copyright © 2003 Oxford University Press 
2003</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c541t-94fcfe1196a505a022b0cb7627a9fedc0ded6e3be09f6c0aa9fce8f68ef059433</citedby></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC206470/pdf/$$EPDF$$P50$$Gpubmedcentral$$H</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC206470/$$EHTML$$P50$$Gpubmedcentral$$H</linktohtml><link.rule.ids>230,314,723,776,780,881,27903,27904,53769,53771</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/14500829$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Haas, Brian J.</creatorcontrib><creatorcontrib>Delcher, Arthur L.</creatorcontrib><creatorcontrib>Mount, Stephen M.</creatorcontrib><creatorcontrib>Wortman, Jennifer R.</creatorcontrib><creatorcontrib>Smith Jr, Roger K.</creatorcontrib><creatorcontrib>Hannick, Linda I.</creatorcontrib><creatorcontrib>Maiti, Rama</creatorcontrib><creatorcontrib>Ronning, Catherine M.</creatorcontrib><creatorcontrib>Rusch, Douglas B.</creatorcontrib><creatorcontrib>Town, Christopher D.</creatorcontrib><creatorcontrib>Salzberg, Steven L.</creatorcontrib><creatorcontrib>White, Owen</creatorcontrib><title>Improving the Arabidopsis genome annotation using maximal transcript alignment assemblies</title><title>Nucleic acids research</title><addtitle>Nucl. Acids Res</addtitle><description>The spliced alignment of expressed sequence data to genomic sequence has proven a key tool in the comprehensive annotation of genes in eukaryotic genomes. 
A novel algorithm was developed to assemble clusters of overlapping transcript alignments (ESTs and full‐length cDNAs) into maximal alignment assemblies, thereby comprehensively incorporating all available transcript data and capturing subtle splicing variations. Complete and partial gene structures identified by this method were used to improve The Institute for Genomic Research Arabidopsis genome annotation (TIGR release v.4.0). The alignment assemblies permitted the automated modeling of several novel genes and >1000 alternative splicing variations as well as updates (including UTR annotations) to nearly half of the ∼27 000 annotated protein coding genes. The algorithm of the Program to Assemble Spliced Alignments (PASA) tool is described, as well as the results of automated updates to Arabidopsis gene annotations.</description><subject>Algorithms</subject><subject>Alternative Splicing</subject><subject>Arabidopsis - genetics</subject><subject>Arabidopsis - metabolism</subject><subject>Arabidopsis thaliana</subject><subject>DNA, Complementary - analysis</subject><subject>Expressed Sequence Tags</subject><subject>Genome, Plant</subject><subject>Introns</subject><subject>Plant Proteins - genetics</subject><subject>RNA, Plant - analysis</subject><subject>RNA, Plant - chemistry</subject><subject>Sequence Alignment - methods</subject><subject>Software</subject><subject>Transcription, Genetic</subject><subject>Untranslated 
Regions</subject><issn>0305-1048</issn><issn>1362-4962</issn><issn>1362-4962</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2003</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNqNkU1v1DAQhi0EotvChR-AIg4cKoWO46_4wKFUlK1UiQtICxfLyU5St4md2klV_j2udlU-LnDyyPO8o3fmJeQVhXcUNDvxNp70N71S8ISsKJNVybWsnpIVMBAlBV4fkMOUrgEop4I_JweUC4C60ivy7WKcYrhzvi_mKyxOo23cNkzJpaJHH0YsrPdhtrMLvljSAzfaezfaoZij9amNbpoLO7jej-hzlRKOzeAwvSDPOjskfLl_j8jX849fztbl5edPF2enl2UrOJ1Lzbu2Q0q1tAKEhapqoG2UrJTVHW5b2OJWImsQdCdbsPm3xbqTNXYgNGfsiLzfzZ2WZsyC7CLawUwxm4w_TLDO_Nnx7sr04c5UILmCrH-718dwu2CazehSi8NgPYYlGcWkorXi_wTzClKLuv4PkAotQGbwzV_gdViiz9fK5kDUVNQiQ8c7qI0hpYjd42oUzEP-Judvdvln-PXvx_iF7gPPQLkDXJrx_rFv442Riilh1pvvZsMAzj-sN4azn9uKvqI</recordid><startdate>20031001</startdate><enddate>20031001</enddate><creator>Haas, Brian J.</creator><creator>Delcher, Arthur L.</creator><creator>Mount, Stephen M.</creator><creator>Wortman, Jennifer R.</creator><creator>Smith Jr, Roger K.</creator><creator>Hannick, Linda I.</creator><creator>Maiti, Rama</creator><creator>Ronning, Catherine M.</creator><creator>Rusch, Douglas B.</creator><creator>Town, Christopher D.</creator><creator>Salzberg, Steven L.</creator><creator>White, Owen</creator><general>Oxford University Press</general><general>Oxford Publishing Limited (England)</general><scope>BSCLL</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7QL</scope><scope>7QO</scope><scope>7QP</scope><scope>7QR</scope><scope>7SS</scope><scope>7TK</scope><scope>7TM</scope><scope>7U9</scope><scope>8FD</scope><scope>C1K</scope><scope>FR3</scope><scope>H94</scope><scope>K9.</scope><scope>M7N</scope><scope>P64</scope><scope>RC3</scope><scope>7X8</scope><scope>5PM</scope></search><sort><creationdate>20031001</creationdate><title>Improving the Arabidopsis genome annotation using maximal transcript alignment 
assemblies</title><author>Haas, Brian J. ; Delcher, Arthur L. ; Mount, Stephen M. ; Wortman, Jennifer R. ; Smith Jr, Roger K. ; Hannick, Linda I. ; Maiti, Rama ; Ronning, Catherine M. ; Rusch, Douglas B. ; Town, Christopher D. ; Salzberg, Steven L. ; White, Owen</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c541t-94fcfe1196a505a022b0cb7627a9fedc0ded6e3be09f6c0aa9fce8f68ef059433</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2003</creationdate><topic>Algorithms</topic><topic>Alternative Splicing</topic><topic>Arabidopsis - genetics</topic><topic>Arabidopsis - metabolism</topic><topic>Arabidopsis thaliana</topic><topic>DNA, Complementary - analysis</topic><topic>Expressed Sequence Tags</topic><topic>Genome, Plant</topic><topic>Introns</topic><topic>Plant Proteins - genetics</topic><topic>RNA, Plant - analysis</topic><topic>RNA, Plant - chemistry</topic><topic>Sequence Alignment - methods</topic><topic>Software</topic><topic>Transcription, Genetic</topic><topic>Untranslated Regions</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Haas, Brian J.</creatorcontrib><creatorcontrib>Delcher, Arthur L.</creatorcontrib><creatorcontrib>Mount, Stephen M.</creatorcontrib><creatorcontrib>Wortman, Jennifer R.</creatorcontrib><creatorcontrib>Smith Jr, Roger K.</creatorcontrib><creatorcontrib>Hannick, Linda I.</creatorcontrib><creatorcontrib>Maiti, Rama</creatorcontrib><creatorcontrib>Ronning, Catherine M.</creatorcontrib><creatorcontrib>Rusch, Douglas B.</creatorcontrib><creatorcontrib>Town, Christopher D.</creatorcontrib><creatorcontrib>Salzberg, Steven L.</creatorcontrib><creatorcontrib>White, Owen</creatorcontrib><collection>Istex</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE 
(Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Bacteriology Abstracts (Microbiology B)</collection><collection>Biotechnology Research Abstracts</collection><collection>Calcium & Calcified Tissue Abstracts</collection><collection>Chemoreception Abstracts</collection><collection>Entomology Abstracts (Full archive)</collection><collection>Neurosciences Abstracts</collection><collection>Nucleic Acids Abstracts</collection><collection>Virology and AIDS Abstracts</collection><collection>Technology Research Database</collection><collection>Environmental Sciences and Pollution Management</collection><collection>Engineering Research Database</collection><collection>AIDS and Cancer Research Abstracts</collection><collection>ProQuest Health & Medical Complete (Alumni)</collection><collection>Algology Mycology and Protozoology Abstracts (Microbiology C)</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>Genetics Abstracts</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Nucleic acids research</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Haas, Brian J.</au><au>Delcher, Arthur L.</au><au>Mount, Stephen M.</au><au>Wortman, Jennifer R.</au><au>Smith Jr, Roger K.</au><au>Hannick, Linda I.</au><au>Maiti, Rama</au><au>Ronning, Catherine M.</au><au>Rusch, Douglas B.</au><au>Town, Christopher D.</au><au>Salzberg, Steven L.</au><au>White, Owen</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Improving the Arabidopsis genome annotation using maximal transcript alignment assemblies</atitle><jtitle>Nucleic acids research</jtitle><addtitle>Nucl. 
Acids Res</addtitle><date>2003-10-01</date><risdate>2003</risdate><volume>31</volume><issue>19</issue><spage>5654</spage><epage>5666</epage><pages>5654-5666</pages><issn>0305-1048</issn><issn>1362-4962</issn><eissn>1362-4962</eissn><coden>NARHAD</coden><abstract>The spliced alignment of expressed sequence data to genomic sequence has proven a key tool in the comprehensive annotation of genes in eukaryotic genomes. A novel algorithm was developed to assemble clusters of overlapping transcript alignments (ESTs and full‐length cDNAs) into maximal alignment assemblies, thereby comprehensively incorporating all available transcript data and capturing subtle splicing variations. Complete and partial gene structures identified by this method were used to improve The Institute for Genomic Research Arabidopsis genome annotation (TIGR release v.4.0). The alignment assemblies permitted the automated modeling of several novel genes and >1000 alternative splicing variations as well as updates (including UTR annotations) to nearly half of the ∼27 000 annotated protein coding genes. The algorithm of the Program to Assemble Spliced Alignments (PASA) tool is described, as well as the results of automated updates to Arabidopsis gene annotations.</abstract><cop>England</cop><pub>Oxford University Press</pub><pmid>14500829</pmid><doi>10.1093/nar/gkg770</doi><tpages>13</tpages><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	ISSN: 0305-1048
ispartof	Nucleic acids research, 2003-10, Vol.31 (19), p.5654-5666
issn	0305-1048 1362-4962 1362-4962
language	eng
recordid	cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_206470
source	MEDLINE; PubMed (Medline); Oxford Journals Open Access Collection; Free Full-Text Journals in Chemistry
subjects	Algorithms ; Alternative Splicing ; Arabidopsis - genetics ; Arabidopsis - metabolism ; Arabidopsis thaliana ; DNA, Complementary - analysis ; Expressed Sequence Tags ; Genome, Plant ; Introns ; Plant Proteins - genetics ; RNA, Plant - analysis ; RNA, Plant - chemistry ; Sequence Alignment - methods ; Software ; Transcription, Genetic ; Untranslated Regions
title	Improving the Arabidopsis genome annotation using maximal transcript alignment assemblies
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-24T21%3A38%3A41IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Improving%20the%20Arabidopsis%20genome%20annotation%20using%20maximal%20transcript%20alignment%20assemblies&rft.jtitle=Nucleic%20acids%20research&rft.au=Haas,%20Brian%20J.&rft.date=2003-10-01&rft.volume=31&rft.issue=19&rft.spage=5654&rft.epage=5666&rft.pages=5654-5666&rft.issn=0305-1048&rft.eissn=1362-4962&rft.coden=NARHAD&rft_id=info:doi/10.1093/nar/gkg770&rft_dat=%3Cproquest_pubme%3E73671874%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=200581585&rft_id=info:pmid/14500829&rfr_iscdi=true