The challenge of annotating protein sequences: The tale of eight domains of unknown function in Pfam
The Pfam database is an important tool in genome annotation, since it provides a collection of curated protein families. However, a subset of these families, known as domains of unknown function (DUFs), remains poorly characterized. We have related sequences from DUF404, DUF407, DUF482, DUF608, DUF8...
Published in: | Computational biology and chemistry 2010-06, Vol.34 (3), p.210-214 |
---|---|
Main authors: | Goonesekere, Nalin C.W.; Shipely, Krysten; O’Connor, Kevin |
Format: | Article |
Language: | eng |
Keywords: | |
Online access: | Full text |
container_end_page | 214 |
---|---|
container_issue | 3 |
container_start_page | 210 |
container_title | Computational biology and chemistry |
container_volume | 34 |
creator | Goonesekere, Nalin C.W.; Shipely, Krysten; O’Connor, Kevin |
description | The Pfam database is an important tool in genome annotation, since it provides a collection of curated protein families. However, a subset of these families, known as domains of unknown function (DUFs), remains poorly characterized. We have related sequences from DUF404, DUF407, DUF482, DUF608, DUF810, DUF853, DUF976 and DUF1111 to homologs in PDB, within the midnight zone (9–20%) of sequence identity. These relationships were extended to provide functional annotation by sequence analysis and model building. Also described are examples of residue plasticity within enzyme active sites, and change of function within homologous sequences of a DUF. |
doi_str_mv | 10.1016/j.compbiolchem.2010.04.001 |
format | Article |
pmid | 20537955 |
publisher | England: Elsevier Ltd |
rights | Copyright 2010 Elsevier Ltd. All rights reserved. |
fulltext | fulltext |
identifier | ISSN: 1476-9271 |
ispartof | Computational biology and chemistry, 2010-06, Vol.34 (3), p.210-214 |
issn | 1476-9271; 1476-928X |
language | eng |
recordid | cdi_proquest_miscellaneous_748944123 |
source | MEDLINE; Elsevier ScienceDirect Journals Complete |
subjects | Annotations; Biology; Catalytic Domain; Databases, Protein; Domains of unknown function; Function prediction; Genomes; Mathematical analysis; Mathematical models; Molecular Sequence Annotation - methods; Pfam; Proteins; Sequence Analysis, Protein; Sequence homology; Sequence Homology, Amino Acid |
title | The challenge of annotating protein sequences: The tale of eight domains of unknown function in Pfam |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-04T19%3A39%3A12IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=The%20challenge%20of%20annotating%20protein%20sequences:%20The%20tale%20of%20eight%20domains%20of%20unknown%20function%20in%20Pfam&rft.jtitle=Computational%20biology%20and%20chemistry&rft.au=Goonesekere,%20Nalin%20C.W.&rft.date=2010-06-01&rft.volume=34&rft.issue=3&rft.spage=210&rft.epage=214&rft.pages=210-214&rft.issn=1476-9271&rft.eissn=1476-928X&rft_id=info:doi/10.1016/j.compbiolchem.2010.04.001&rft_dat=%3Cproquest_cross%3E748944123%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1283701521&rft_id=info:pmid/20537955&rft_els_id=S1476927110000253&rfr_iscdi=true |