Gene structure prediction and alternative splicing analysis using genomically aligned ESTs

With the availability of a nearly complete sequence of the human genome, aligning expressed sequence tags (EST) to the genomic sequence has become a practical and powerful strategy for gene prediction. Elucidating gene structure is a complex problem requiring the identification of splice junctions,...

Bibliographic details
Published in: Genome research 2001-05, Vol.11 (5), p.889-900
Main authors: Kan, Z, Rouchka, E C, Gish, W R, States, D J
Format: Article
Language: eng
Online access: Full text
container_end_page 900
container_issue 5
container_start_page 889
container_title Genome research
container_volume 11
creator Kan, Z
Rouchka, E C
Gish, W R
States, D J
description With the availability of a nearly complete sequence of the human genome, aligning expressed sequence tags (EST) to the genomic sequence has become a practical and powerful strategy for gene prediction. Elucidating gene structure is a complex problem requiring the identification of splice junctions, gene boundaries, and alternative splicing variants. We have developed a software tool, Transcript Assembly Program (TAP), to delineate gene structures using genomically aligned EST sequences. TAP assembles the joint gene structure of the entire genomic region from individual splice junction pairs, using a novel algorithm that uses the EST-encoded connectivity and redundancy information to sort out the complex alternative splicing patterns. A method called polyadenylation site scan (PASS) has been developed to detect poly-A sites in the genome. TAP uses these predictions to identify gene boundaries by segmenting the joint gene structure at polyadenylated terminal exons. Reconstructing 1007 known transcripts, TAP scored a sensitivity (Sn) of 60% and a specificity (Sp) of 92% at the exon level. The gene boundary identification process was found to be accurate 78% of the time. TAP also reports alternative splicing patterns in EST alignments. An analysis of alternative splicing in 1124 genic regions suggested that more than half of human genes undergo alternative splicing. Surprisingly, we saw an absolute majority of the detected alternative splicing events affect the coding region. Furthermore, the evolutionary conservation of alternative splicing between human and mouse was analyzed using an EST-based approach. (See http://stl.wustl.edu/~zkan/TAP/)
doi_str_mv 10.1101/gr.155001
format Article
fullrecord <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_311065</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>70802123</sourcerecordid><originalsourceid>FETCH-LOGICAL-c436t-711482b7c962e3a1c11637665a538ae189253ec1008ece2399607ab65f1759cf3</originalsourceid><addsrcrecordid>eNpVkU1PwzAMhiMEYnwd-AOoJyQOHXHTpOmBA5pgICFxAC5coiz1SlCWjqSdtH9Ppk18nGzrfV7bsgk5BzoGoHDdhjFwTinskSPgZZ3zUtT7KadS5jXlMCLHMX5SSlkp5SEZATBWlbI4Iu9T9JjFPgymHwJmy4CNNb3tfKZ9k2nXY_C6t6sELZ011rdJ0G4dbcyGuClb9N3CGu3cOvG29dhkdy-v8ZQczLWLeLaLJ-Tt_u518pA_PU8fJ7dPuSmZ6PMKIG0yq0wtCmQaDIBglRBccyY1gqwLztAApRINFqyuBa30TPA5VLw2c3ZCbrZ9l8NsgY1B3wft1DLYhQ5r1Wmr_ivefqi2WymWjid48l_u_KH7GjD2amGjQee0x26IqqKSFlCwBF5tQRO6GAPOf2YAVZtHqDao7SMSe_F3qV9yd3n2DeguhOM</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>70802123</pqid></control><display><type>article</type><title>Gene structure prediction and alternative splicing analysis using genomically aligned ESTs</title><source>MEDLINE</source><source>PubMed Central</source><source>Alma/SFX Local Collection</source><creator>Kan, Z ; Rouchka, E C ; Gish, W R ; States, D J</creator><creatorcontrib>Kan, Z ; Rouchka, E C ; Gish, W R ; States, D J</creatorcontrib><description>With the availability of a nearly complete sequence of the human genome, aligning expressed sequence tags (EST) to the genomic sequence has become a practical and powerful strategy for gene prediction. Elucidating gene structure is a complex problem requiring the identification of splice junctions, gene boundaries, and alternative splicing variants. We have developed a software tool, Transcript Assembly Program (TAP), to delineate gene structures using genomically aligned EST sequences. 
TAP assembles the joint gene structure of the entire genomic region from individual splice junction pairs, using a novel algorithm that uses the EST-encoded connectivity and redundancy information to sort out the complex alternative splicing patterns. A method called polyadenylation site scan (PASS) has been developed to detect poly-A sites in the genome. TAP uses these predictions to identify gene boundaries by segmenting the joint gene structure at polyadenylated terminal exons. Reconstructing 1007 known transcripts, TAP scored a sensitivity (Sn) of 60% and a specificity (Sp) of 92% at the exon level. The gene boundary identification process was found to be accurate 78% of the time. also reports alternative splicing patterns in EST alignments. An analysis of alternative splicing in 1124 genic regions suggested that more than half of human genes undergo alternative splicing. Surprisingly, we saw an absolute majority of the detected alternative splicing events affect the coding region. Furthermore, the evolutionary conservation of alternative splicing between human and mouse was analyzed using an EST-based approach. 
(See http://stl.wustl.edu/~zkan/TAP/)</description><identifier>ISSN: 1088-9051</identifier><identifier>EISSN: 1549-5469</identifier><identifier>DOI: 10.1101/gr.155001</identifier><identifier>PMID: 11337482</identifier><language>eng</language><publisher>United States: Cold Spring Harbor Laboratory Press</publisher><subject>Alternative Splicing - genetics ; Computational Biology - instrumentation ; Computational Biology - methods ; Expressed Sequence Tags ; Genes - genetics ; Genome, Human ; Humans ; Methods ; RNA, Messenger - metabolism ; Sequence Alignment - instrumentation ; Sequence Alignment - methods ; Software ; Software Validation ; Transcription, Genetic</subject><ispartof>Genome research, 2001-05, Vol.11 (5), p.889-900</ispartof><rights>Copyright © 2001, Cold Spring Harbor Laboratory Press 2001</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c436t-711482b7c962e3a1c11637665a538ae189253ec1008ece2399607ab65f1759cf3</citedby><cites>FETCH-LOGICAL-c436t-711482b7c962e3a1c11637665a538ae189253ec1008ece2399607ab65f1759cf3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC311065/pdf/$$EPDF$$P50$$Gpubmedcentral$$H</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC311065/$$EHTML$$P50$$Gpubmedcentral$$H</linktohtml><link.rule.ids>230,314,723,776,780,881,27901,27902,53766,53768</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/11337482$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Kan, Z</creatorcontrib><creatorcontrib>Rouchka, E C</creatorcontrib><creatorcontrib>Gish, W R</creatorcontrib><creatorcontrib>States, D J</creatorcontrib><title>Gene structure prediction and alternative splicing analysis using genomically aligned 
ESTs</title><title>Genome research</title><addtitle>Genome Res</addtitle><description>With the availability of a nearly complete sequence of the human genome, aligning expressed sequence tags (EST) to the genomic sequence has become a practical and powerful strategy for gene prediction. Elucidating gene structure is a complex problem requiring the identification of splice junctions, gene boundaries, and alternative splicing variants. We have developed a software tool, Transcript Assembly Program (TAP), to delineate gene structures using genomically aligned EST sequences. TAP assembles the joint gene structure of the entire genomic region from individual splice junction pairs, using a novel algorithm that uses the EST-encoded connectivity and redundancy information to sort out the complex alternative splicing patterns. A method called polyadenylation site scan (PASS) has been developed to detect poly-A sites in the genome. TAP uses these predictions to identify gene boundaries by segmenting the joint gene structure at polyadenylated terminal exons. Reconstructing 1007 known transcripts, TAP scored a sensitivity (Sn) of 60% and a specificity (Sp) of 92% at the exon level. The gene boundary identification process was found to be accurate 78% of the time. also reports alternative splicing patterns in EST alignments. An analysis of alternative splicing in 1124 genic regions suggested that more than half of human genes undergo alternative splicing. Surprisingly, we saw an absolute majority of the detected alternative splicing events affect the coding region. Furthermore, the evolutionary conservation of alternative splicing between human and mouse was analyzed using an EST-based approach. 
(See http://stl.wustl.edu/~zkan/TAP/)</description><subject>Alternative Splicing - genetics</subject><subject>Computational Biology - instrumentation</subject><subject>Computational Biology - methods</subject><subject>Expressed Sequence Tags</subject><subject>Genes - genetics</subject><subject>Genome, Human</subject><subject>Humans</subject><subject>Methods</subject><subject>RNA, Messenger - metabolism</subject><subject>Sequence Alignment - instrumentation</subject><subject>Sequence Alignment - methods</subject><subject>Software</subject><subject>Software Validation</subject><subject>Transcription, Genetic</subject><issn>1088-9051</issn><issn>1549-5469</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2001</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNpVkU1PwzAMhiMEYnwd-AOoJyQOHXHTpOmBA5pgICFxAC5coiz1SlCWjqSdtH9Ppk18nGzrfV7bsgk5BzoGoHDdhjFwTinskSPgZZ3zUtT7KadS5jXlMCLHMX5SSlkp5SEZATBWlbI4Iu9T9JjFPgymHwJmy4CNNb3tfKZ9k2nXY_C6t6sELZ011rdJ0G4dbcyGuClb9N3CGu3cOvG29dhkdy-v8ZQczLWLeLaLJ-Tt_u518pA_PU8fJ7dPuSmZ6PMKIG0yq0wtCmQaDIBglRBccyY1gqwLztAApRINFqyuBa30TPA5VLw2c3ZCbrZ9l8NsgY1B3wft1DLYhQ5r1Wmr_ivefqi2WymWjid48l_u_KH7GjD2amGjQee0x26IqqKSFlCwBF5tQRO6GAPOf2YAVZtHqDao7SMSe_F3qV9yd3n2DeguhOM</recordid><startdate>200105</startdate><enddate>200105</enddate><creator>Kan, Z</creator><creator>Rouchka, E C</creator><creator>Gish, W R</creator><creator>States, D J</creator><general>Cold Spring Harbor Laboratory Press</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><scope>5PM</scope></search><sort><creationdate>200105</creationdate><title>Gene structure prediction and alternative splicing analysis using genomically aligned ESTs</title><author>Kan, Z ; Rouchka, E C ; Gish, W R ; States, D 
J</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c436t-711482b7c962e3a1c11637665a538ae189253ec1008ece2399607ab65f1759cf3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2001</creationdate><topic>Alternative Splicing - genetics</topic><topic>Computational Biology - instrumentation</topic><topic>Computational Biology - methods</topic><topic>Expressed Sequence Tags</topic><topic>Genes - genetics</topic><topic>Genome, Human</topic><topic>Humans</topic><topic>Methods</topic><topic>RNA, Messenger - metabolism</topic><topic>Sequence Alignment - instrumentation</topic><topic>Sequence Alignment - methods</topic><topic>Software</topic><topic>Software Validation</topic><topic>Transcription, Genetic</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Kan, Z</creatorcontrib><creatorcontrib>Rouchka, E C</creatorcontrib><creatorcontrib>Gish, W R</creatorcontrib><creatorcontrib>States, D J</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Genome research</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Kan, Z</au><au>Rouchka, E C</au><au>Gish, W R</au><au>States, D J</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Gene structure prediction and alternative splicing analysis using genomically aligned ESTs</atitle><jtitle>Genome research</jtitle><addtitle>Genome 
Res</addtitle><date>2001-05</date><risdate>2001</risdate><volume>11</volume><issue>5</issue><spage>889</spage><epage>900</epage><pages>889-900</pages><issn>1088-9051</issn><eissn>1549-5469</eissn><abstract>With the availability of a nearly complete sequence of the human genome, aligning expressed sequence tags (EST) to the genomic sequence has become a practical and powerful strategy for gene prediction. Elucidating gene structure is a complex problem requiring the identification of splice junctions, gene boundaries, and alternative splicing variants. We have developed a software tool, Transcript Assembly Program (TAP), to delineate gene structures using genomically aligned EST sequences. TAP assembles the joint gene structure of the entire genomic region from individual splice junction pairs, using a novel algorithm that uses the EST-encoded connectivity and redundancy information to sort out the complex alternative splicing patterns. A method called polyadenylation site scan (PASS) has been developed to detect poly-A sites in the genome. TAP uses these predictions to identify gene boundaries by segmenting the joint gene structure at polyadenylated terminal exons. Reconstructing 1007 known transcripts, TAP scored a sensitivity (Sn) of 60% and a specificity (Sp) of 92% at the exon level. The gene boundary identification process was found to be accurate 78% of the time. also reports alternative splicing patterns in EST alignments. An analysis of alternative splicing in 1124 genic regions suggested that more than half of human genes undergo alternative splicing. Surprisingly, we saw an absolute majority of the detected alternative splicing events affect the coding region. Furthermore, the evolutionary conservation of alternative splicing between human and mouse was analyzed using an EST-based approach. 
(See http://stl.wustl.edu/~zkan/TAP/)</abstract><cop>United States</cop><pub>Cold Spring Harbor Laboratory Press</pub><pmid>11337482</pmid><doi>10.1101/gr.155001</doi><tpages>12</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1088-9051
ispartof Genome research, 2001-05, Vol.11 (5), p.889-900
issn 1088-9051
1549-5469
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_311065
source MEDLINE; PubMed Central; Alma/SFX Local Collection
subjects Alternative Splicing - genetics
Computational Biology - instrumentation
Computational Biology - methods
Expressed Sequence Tags
Genes - genetics
Genome, Human
Humans
Methods
RNA, Messenger - metabolism
Sequence Alignment - instrumentation
Sequence Alignment - methods
Software
Software Validation
Transcription, Genetic
title Gene structure prediction and alternative splicing analysis using genomically aligned ESTs
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-30T16%3A22%3A56IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Gene%20structure%20prediction%20and%20alternative%20splicing%20analysis%20using%20genomically%20aligned%20ESTs&rft.jtitle=Genome%20research&rft.au=Kan,%20Z&rft.date=2001-05&rft.volume=11&rft.issue=5&rft.spage=889&rft.epage=900&rft.pages=889-900&rft.issn=1088-9051&rft.eissn=1549-5469&rft_id=info:doi/10.1101/gr.155001&rft_dat=%3Cproquest_pubme%3E70802123%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=70802123&rft_id=info:pmid/11337482&rfr_iscdi=true