The challenge of annotating protein sequences: The tale of eight domains of unknown function in Pfam

The Pfam database is an important tool in genome annotation, since it provides a collection of curated protein families. However, a subset of these families, known as domains of unknown function (DUFs), remains poorly characterized. We have related sequences from DUF404, DUF407, DUF482, DUF608, DUF8...

Bibliographic details
Published in: Computational biology and chemistry 2010-06, Vol.34 (3), p.210-214
Main authors: Goonesekere, Nalin C.W., Shipely, Krysten, O’Connor, Kevin
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 214
container_issue 3
container_start_page 210
container_title Computational biology and chemistry
container_volume 34
creator Goonesekere, Nalin C.W.
Shipely, Krysten
O’Connor, Kevin
description The Pfam database is an important tool in genome annotation, since it provides a collection of curated protein families. However, a subset of these families, known as domains of unknown function (DUFs), remains poorly characterized. We have related sequences from DUF404, DUF407, DUF482, DUF608, DUF810, DUF853, DUF976 and DUF1111 to homologs in PDB, within the midnight zone (9–20%) of sequence identity. These relationships were extended to provide functional annotation by sequence analysis and model building. Also described are examples of residue plasticity within enzyme active sites, and change of function within homologous sequences of a DUF.
doi_str_mv 10.1016/j.compbiolchem.2010.04.001
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_748944123</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S1476927110000253</els_id><sourcerecordid>748944123</sourcerecordid><originalsourceid>FETCH-LOGICAL-c412t-cc72dd9279e584da53a52dbb97c3551bbc86e18f2c8e54888cf736ae798c1473</originalsourceid><addsrcrecordid>eNqNkcFvFCEYxYnR2Fr9FwzpRS-7AjMsTG9Nq7VJE3vYQ2-E-fhml3UGtgNT439fxq1NT8YTBH6P7_EeIaecLTnjqy-7JcRh3_rYwxaHpWDlgtVLxvgrcsxrtVo0Qt-9ft4rfkTepbRjTFSMybfkSDBZqUbKY-LWW6SwtX2PYYM0dtSGELPNPmzofowZfaAJ7ycMgOmMzni2_R8S_WabqYuD9SHNB1P4GeKvQLspQPYx0KK97ezwnrzpbJ_ww9N6Qtbfvq4vvi9uflxdX5zfLKDmIi8AlHCu-G1Q6tpZWVkpXNs2CiopeduCXiHXnQCNstZaQ6eqlUXVaChfrU7Ip8OzxXcxnLIZfALsexswTsmoWjd1mVQV8vM_SS50pRiXghf07IDCGFMasTP70Q92_G04M3MdZmde1mHmOgyrTamjiD8-zZnaAd2z9G_-Bbg8AFhiefA4mgR-jtr5ESEbF_3_zHkEvJ-jIA</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1283701521</pqid></control><display><type>article</type><title>The challenge of annotating protein sequences: The tale of eight domains of unknown function in Pfam</title><source>MEDLINE</source><source>Elsevier ScienceDirect Journals Complete</source><creator>Goonesekere, Nalin C.W. ; Shipely, Krysten ; O’Connor, Kevin</creator><creatorcontrib>Goonesekere, Nalin C.W. ; Shipely, Krysten ; O’Connor, Kevin</creatorcontrib><description>The Pfam database is an important tool in genome annotation, since it provides a collection of curated protein families. However, a subset of these families, known as domains of unknown function (DUFs), remains poorly characterized. We have related sequences from DUF404, DUF407, DUF482, DUF608, DUF810, DUF853, DUF976 and DUF1111 to homologs in PDB, within the midnight zone (9–20%) of sequence identity. These relationships were extended to provide functional annotation by sequence analysis and model building. 
Also described are examples of residue plasticity within enzyme active sites, and change of function within homologous sequences of a DUF.</description><identifier>ISSN: 1476-9271</identifier><identifier>EISSN: 1476-928X</identifier><identifier>DOI: 10.1016/j.compbiolchem.2010.04.001</identifier><identifier>PMID: 20537955</identifier><language>eng</language><publisher>England: Elsevier Ltd</publisher><subject>Annotations ; Biology ; Catalytic Domain ; Databases, Protein ; Domains of unknown function ; Function prediction ; Genomes ; Mathematical analysis ; Mathematical models ; Molecular Sequence Annotation - methods ; Pfam ; Proteins ; Sequence Analysis, Protein ; Sequence homology ; Sequence Homology, Amino Acid</subject><ispartof>Computational biology and chemistry, 2010-06, Vol.34 (3), p.210-214</ispartof><rights>2010 Elsevier Ltd</rights><rights>Copyright 2010 Elsevier Ltd. All rights reserved.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c412t-cc72dd9279e584da53a52dbb97c3551bbc86e18f2c8e54888cf736ae798c1473</citedby><cites>FETCH-LOGICAL-c412t-cc72dd9279e584da53a52dbb97c3551bbc86e18f2c8e54888cf736ae798c1473</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://dx.doi.org/10.1016/j.compbiolchem.2010.04.001$$EHTML$$P50$$Gelsevier$$H</linktohtml><link.rule.ids>314,780,784,3550,27924,27925,45995</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/20537955$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Goonesekere, Nalin C.W.</creatorcontrib><creatorcontrib>Shipely, Krysten</creatorcontrib><creatorcontrib>O’Connor, Kevin</creatorcontrib><title>The challenge of annotating protein sequences: The tale of eight domains of unknown function in Pfam</title><title>Computational biology and 
chemistry</title><addtitle>Comput Biol Chem</addtitle><description>The Pfam database is an important tool in genome annotation, since it provides a collection of curated protein families. However, a subset of these families, known as domains of unknown function (DUFs), remains poorly characterized. We have related sequences from DUF404, DUF407, DUF482, DUF608, DUF810, DUF853, DUF976 and DUF1111 to homologs in PDB, within the midnight zone (9–20%) of sequence identity. These relationships were extended to provide functional annotation by sequence analysis and model building. Also described are examples of residue plasticity within enzyme active sites, and change of function within homologous sequences of a DUF.</description><subject>Annotations</subject><subject>Biology</subject><subject>Catalytic Domain</subject><subject>Databases, Protein</subject><subject>Domains of unknown function</subject><subject>Function prediction</subject><subject>Genomes</subject><subject>Mathematical analysis</subject><subject>Mathematical models</subject><subject>Molecular Sequence Annotation - methods</subject><subject>Pfam</subject><subject>Proteins</subject><subject>Sequence Analysis, Protein</subject><subject>Sequence homology</subject><subject>Sequence Homology, Amino 
Acid</subject><issn>1476-9271</issn><issn>1476-928X</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2010</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNqNkcFvFCEYxYnR2Fr9FwzpRS-7AjMsTG9Nq7VJE3vYQ2-E-fhml3UGtgNT439fxq1NT8YTBH6P7_EeIaecLTnjqy-7JcRh3_rYwxaHpWDlgtVLxvgrcsxrtVo0Qt-9ft4rfkTepbRjTFSMybfkSDBZqUbKY-LWW6SwtX2PYYM0dtSGELPNPmzofowZfaAJ7ycMgOmMzni2_R8S_WabqYuD9SHNB1P4GeKvQLspQPYx0KK97ezwnrzpbJ_ww9N6Qtbfvq4vvi9uflxdX5zfLKDmIi8AlHCu-G1Q6tpZWVkpXNs2CiopeduCXiHXnQCNstZaQ6eqlUXVaChfrU7Ip8OzxXcxnLIZfALsexswTsmoWjd1mVQV8vM_SS50pRiXghf07IDCGFMasTP70Q92_G04M3MdZmde1mHmOgyrTamjiD8-zZnaAd2z9G_-Bbg8AFhiefA4mgR-jtr5ESEbF_3_zHkEvJ-jIA</recordid><startdate>20100601</startdate><enddate>20100601</enddate><creator>Goonesekere, Nalin C.W.</creator><creator>Shipely, Krysten</creator><creator>O’Connor, Kevin</creator><general>Elsevier Ltd</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7U5</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>7X8</scope></search><sort><creationdate>20100601</creationdate><title>The challenge of annotating protein sequences: The tale of eight domains of unknown function in Pfam</title><author>Goonesekere, Nalin C.W. 
; Shipely, Krysten ; O’Connor, Kevin</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c412t-cc72dd9279e584da53a52dbb97c3551bbc86e18f2c8e54888cf736ae798c1473</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2010</creationdate><topic>Annotations</topic><topic>Biology</topic><topic>Catalytic Domain</topic><topic>Databases, Protein</topic><topic>Domains of unknown function</topic><topic>Function prediction</topic><topic>Genomes</topic><topic>Mathematical analysis</topic><topic>Mathematical models</topic><topic>Molecular Sequence Annotation - methods</topic><topic>Pfam</topic><topic>Proteins</topic><topic>Sequence Analysis, Protein</topic><topic>Sequence homology</topic><topic>Sequence Homology, Amino Acid</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Goonesekere, Nalin C.W.</creatorcontrib><creatorcontrib>Shipely, Krysten</creatorcontrib><creatorcontrib>O’Connor, Kevin</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Solid State and Superconductivity Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>MEDLINE - Academic</collection><jtitle>Computational biology and chemistry</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Goonesekere, Nalin C.W.</au><au>Shipely, 
Krysten</au><au>O’Connor, Kevin</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>The challenge of annotating protein sequences: The tale of eight domains of unknown function in Pfam</atitle><jtitle>Computational biology and chemistry</jtitle><addtitle>Comput Biol Chem</addtitle><date>2010-06-01</date><risdate>2010</risdate><volume>34</volume><issue>3</issue><spage>210</spage><epage>214</epage><pages>210-214</pages><issn>1476-9271</issn><eissn>1476-928X</eissn><abstract>The Pfam database is an important tool in genome annotation, since it provides a collection of curated protein families. However, a subset of these families, known as domains of unknown function (DUFs), remains poorly characterized. We have related sequences from DUF404, DUF407, DUF482, DUF608, DUF810, DUF853, DUF976 and DUF1111 to homologs in PDB, within the midnight zone (9–20%) of sequence identity. These relationships were extended to provide functional annotation by sequence analysis and model building. Also described are examples of residue plasticity within enzyme active sites, and change of function within homologous sequences of a DUF.</abstract><cop>England</cop><pub>Elsevier Ltd</pub><pmid>20537955</pmid><doi>10.1016/j.compbiolchem.2010.04.001</doi><tpages>5</tpages></addata></record>
fulltext fulltext
identifier ISSN: 1476-9271
ispartof Computational biology and chemistry, 2010-06, Vol.34 (3), p.210-214
issn 1476-9271
1476-928X
language eng
recordid cdi_proquest_miscellaneous_748944123
source MEDLINE; Elsevier ScienceDirect Journals Complete
subjects Annotations
Biology
Catalytic Domain
Databases, Protein
Domains of unknown function
Function prediction
Genomes
Mathematical analysis
Mathematical models
Molecular Sequence Annotation - methods
Pfam
Proteins
Sequence Analysis, Protein
Sequence homology
Sequence Homology, Amino Acid
title The challenge of annotating protein sequences: The tale of eight domains of unknown function in Pfam
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-04T19%3A39%3A12IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=The%20challenge%20of%20annotating%20protein%20sequences:%20The%20tale%20of%20eight%20domains%20of%20unknown%20function%20in%20Pfam&rft.jtitle=Computational%20biology%20and%20chemistry&rft.au=Goonesekere,%20Nalin%20C.W.&rft.date=2010-06-01&rft.volume=34&rft.issue=3&rft.spage=210&rft.epage=214&rft.pages=210-214&rft.issn=1476-9271&rft.eissn=1476-928X&rft_id=info:doi/10.1016/j.compbiolchem.2010.04.001&rft_dat=%3Cproquest_cross%3E748944123%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1283701521&rft_id=info:pmid/20537955&rft_els_id=S1476927110000253&rfr_iscdi=true