Profile analysis: detection of distantly related proteins

Profile analysis is a method for detecting distantly related proteins by sequence comparison. The basis for comparison is not only the customary Dayhoff mutational-distance matrix but also the results of structural studies and information implicit in the alignments of the sequences of families of si...

Bibliographic details
Published in: Proceedings of the National Academy of Sciences - PNAS 1987-07, Vol.84 (13), p.4355-4358
Main authors: Gribskov, M, McLachlan, A.D, Eisenberg, D
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 4358
container_issue 13
container_start_page 4355
container_title Proceedings of the National Academy of Sciences - PNAS
container_volume 84
creator Gribskov, M
McLachlan, A.D
Eisenberg, D
description Profile analysis is a method for detecting distantly related proteins by sequence comparison. The basis for comparison is not only the customary Dayhoff mutational-distance matrix but also the results of structural studies and information implicit in the alignments of the sequences of families of similar proteins. This information is expressed in a position-specific scoring table (profile), which is created from a group of sequences previously aligned by structural or sequence similarity. The similarity of any other sequence (target) to the group of aligned sequences (probe) can be tested by comparing the target to the profile using dynamic programming algorithms. The profile method differs in two major respects from methods of sequence comparison in common use: (i) Any number of known sequences can be used to construct the profile, allowing more information to be used in the testing of the target than is possible with pairwise alignment methods. (ii) The profile includes the penalties for insertion or deletion at each position, which allow one to include the probe secondary structure in the testing scheme. Tests with globin and immunoglobulin sequences show that profile analysis can distinguish all members of these families from all other sequences in a database containing 3800 protein sequences.
doi_str_mv 10.1073/pnas.84.13.4355
format Article
fullrecord <record><control><sourceid>jstor_pnas_</sourceid><recordid>TN_cdi_pnas_primary_84_13_4355_fulltext</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><jstor_id>30172</jstor_id><sourcerecordid>30172</sourcerecordid><originalsourceid>FETCH-LOGICAL-c577t-3a1793ddeff8d5c0b1f17845d757182a91bd6e4341f8ab4ead507e5c83b0d8443</originalsourceid><addsrcrecordid>eNp9kMFrFDEUh4Moda2eBVGZg-hpti-TZJIRPEixrVBQ0J7Dm0lSU7KTNckW9793lh1GvXh6h-_7vff4EfKcwpqCZGfbEfNa8TVla86EeEBWFDpat7yDh2QF0Mha8YY_Jk9yvgOATig4ISeMS96CXJHua4rOB1vhiGGffX5fGVvsUHwcq-gq43PBsYR9lWzAYk21TbFYP-an5JHDkO2zeZ6Sm4tP38-v6usvl5_PP17Xg5Cy1Ayp7Jgx1jllxAA9dVQqLowUkqoGO9qb1nLGqVPYc4tGgLRiUKwHozhnp-TDce9212-sGexYEga9TX6Daa8jev0vGf0PfRvvNQMBSk75t3M-xZ87m4ve-DzYEHC0cZe1lC1QrtpJPDuKQ4o5J-uWGxT0oWx9KFsrrinTh7KnxKu_X1v8ud2Jv5k55gGDSzgOPi-aYg1VUk3au1k77F_ocke7XQjF_iqT-fq_5iS8OAp3ucT05yGgspngyyN0GDXepumVm29KctkqqhT7DV0Nsoc</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>77601486</pqid></control><display><type>article</type><title>Profile analysis: detection of distantly related proteins</title><source>Jstor Complete Legacy</source><source>MEDLINE</source><source>PubMed Central</source><source>Alma/SFX Local Collection</source><source>Free Full-Text Journals in Chemistry</source><creator>Gribskov, M ; McLachlan, A.D ; Eisenberg, D</creator><creatorcontrib>Gribskov, M ; McLachlan, A.D ; Eisenberg, D</creatorcontrib><description>Profile analysis is a method for detecting distantly related proteins by sequence comparison. The basis for comparison is not only the customary Dayhoff mutational-distance matrix but also the results of structural studies and information implicit in the alignments of the sequences of families of similar proteins. This information is expressed in a position-specific scoring table (profile), which is created from a group of sequences previously aligned by structural or sequence similarity. 
The similarity of any other sequence (target) to the group of aligned sequences (probe) can be tested by comparing the target to the profile using dynamic programming algorithms. The profile method differs in two major respects from methods of sequence comparison in common use: (i) Any number of known sequences can be used to construct the profile, allowing more information to be used in the testing of the target than is possible with pairwise alignment methods. (ii) The profile includes the penalties for insertion or deletion at each position, which allow one to include the probe secondary structure in the testing scheme. Tests with globin and immunoglobulin sequences show that profile analysis can distinguish all members of these families from all other sequences in a database containing 3800 protein sequences.</description><identifier>ISSN: 0027-8424</identifier><identifier>EISSN: 1091-6490</identifier><identifier>DOI: 10.1073/pnas.84.13.4355</identifier><identifier>PMID: 3474607</identifier><identifier>CODEN: PNASA6</identifier><language>eng</language><publisher>Washington, DC: National Academy of Sciences of the United States of America</publisher><subject>ACIDE AMINE ; Algorithms ; Amino Acid Sequence ; AMINO ACIDS ; AMINOACIDOS ; Analytical, structural and metabolic biochemistry ; ANIMAL PROTEIN ; Base Sequence ; Biochemistry ; Biological and medical sciences ; COMPUTER SOFTWARE ; Dynamic programming methods ; Fundamental and applied biological sciences. 
Psychology ; General aspects, investigation methods ; GLOBINS ; Globins - genetics ; Immunoglobulin constant regions ; IMMUNOGLOBULINE ; IMMUNOGLOBULINS ; Immunoglobulins - genetics ; Information search ; Information Systems ; INMUNOGLOBULINA ; LOGICIEL ; Molecular structure ; Nucleic acids ; PLANT PROTEINS ; PROFMAKE PROGRAM ; PROGRAMAS DE ORDENADOR ; Protein Conformation ; PROTEINAS DE ORIGEN ANIMAL ; PROTEINAS VEGETALES ; PROTEINE ANIMALE ; PROTEINE VEGETALE ; Proteins ; Proteins - genetics ; Sequence Homology, Nucleic Acid ; Software</subject><ispartof>Proceedings of the National Academy of Sciences - PNAS, 1987-07, Vol.84 (13), p.4355-4358</ispartof><rights>1987 INIST-CNRS</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c577t-3a1793ddeff8d5c0b1f17845d757182a91bd6e4341f8ab4ead507e5c83b0d8443</citedby></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Uhttp://www.pnas.org/content/84/13.cover.gif</thumbnail><linktopdf>$$Uhttps://www.jstor.org/stable/pdf/30172$$EPDF$$P50$$Gjstor$$H</linktopdf><linktohtml>$$Uhttps://www.jstor.org/stable/30172$$EHTML$$P50$$Gjstor$$H</linktohtml><link.rule.ids>230,314,723,776,780,799,881,27901,27902,53766,53768,57992,58225</link.rule.ids><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&amp;idt=8321878$$DView record in Pascal Francis$$Hfree_for_read</backlink><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/3474607$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Gribskov, M</creatorcontrib><creatorcontrib>McLachlan, A.D</creatorcontrib><creatorcontrib>Eisenberg, D</creatorcontrib><title>Profile analysis: detection of distantly related proteins</title><title>Proceedings of the National Academy of Sciences - PNAS</title><addtitle>Proc Natl Acad Sci U S A</addtitle><description>Profile 
analysis is a method for detecting distantly related proteins by sequence comparison. The basis for comparison is not only the customary Dayhoff mutational-distance matrix but also the results of structural studies and information implicit in the alignments of the sequences of families of similar proteins. This information is expressed in a position-specific scoring table (profile), which is created from a group of sequences previously aligned by structural or sequence similarity. The similarity of any other sequence (target) to the group of aligned sequences (probe) can be tested by comparing the target to the profile using dynamic programming algorithms. The profile method differs in two major respects from methods of sequence comparison in common use: (i) Any number of known sequences can be used to construct the profile, allowing more information to be used in the testing of the target than is possible with pairwise alignment methods. (ii) The profile includes the penalties for insertion or deletion at each position, which allow one to include the probe secondary structure in the testing scheme. Tests with globin and immunoglobulin sequences show that profile analysis can distinguish all members of these families from all other sequences in a database containing 3800 protein sequences.</description><subject>ACIDE AMINE</subject><subject>Algorithms</subject><subject>Amino Acid Sequence</subject><subject>AMINO ACIDS</subject><subject>AMINOACIDOS</subject><subject>Analytical, structural and metabolic biochemistry</subject><subject>ANIMAL PROTEIN</subject><subject>Base Sequence</subject><subject>Biochemistry</subject><subject>Biological and medical sciences</subject><subject>COMPUTER SOFTWARE</subject><subject>Dynamic programming methods</subject><subject>Fundamental and applied biological sciences. 
Psychology</subject><subject>General aspects, investigation methods</subject><subject>GLOBINS</subject><subject>Globins - genetics</subject><subject>Immunoglobulin constant regions</subject><subject>IMMUNOGLOBULINE</subject><subject>IMMUNOGLOBULINS</subject><subject>Immunoglobulins - genetics</subject><subject>Information search</subject><subject>Information Systems</subject><subject>INMUNOGLOBULINA</subject><subject>LOGICIEL</subject><subject>Molecular structure</subject><subject>Nucleic acids</subject><subject>PLANT PROTEINS</subject><subject>PROFMAKE PROGRAM</subject><subject>PROGRAMAS DE ORDENADOR</subject><subject>Protein Conformation</subject><subject>PROTEINAS DE ORIGEN ANIMAL</subject><subject>PROTEINAS VEGETALES</subject><subject>PROTEINE ANIMALE</subject><subject>PROTEINE VEGETALE</subject><subject>Proteins</subject><subject>Proteins - genetics</subject><subject>Sequence Homology, Nucleic Acid</subject><subject>Software</subject><issn>0027-8424</issn><issn>1091-6490</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>1987</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNp9kMFrFDEUh4Moda2eBVGZg-hpti-TZJIRPEixrVBQ0J7Dm0lSU7KTNckW9793lh1GvXh6h-_7vff4EfKcwpqCZGfbEfNa8TVla86EeEBWFDpat7yDh2QF0Mha8YY_Jk9yvgOATig4ISeMS96CXJHua4rOB1vhiGGffX5fGVvsUHwcq-gq43PBsYR9lWzAYk21TbFYP-an5JHDkO2zeZ6Sm4tP38-v6usvl5_PP17Xg5Cy1Ayp7Jgx1jllxAA9dVQqLowUkqoGO9qb1nLGqVPYc4tGgLRiUKwHozhnp-TDce9212-sGexYEga9TX6Daa8jev0vGf0PfRvvNQMBSk75t3M-xZ87m4ve-DzYEHC0cZe1lC1QrtpJPDuKQ4o5J-uWGxT0oWx9KFsrrinTh7KnxKu_X1v8ud2Jv5k55gGDSzgOPi-aYg1VUk3au1k77F_ocke7XQjF_iqT-fq_5iS8OAp3ucT05yGgspngyyN0GDXepumVm29KctkqqhT7DV0Nsoc</recordid><startdate>19870701</startdate><enddate>19870701</enddate><creator>Gribskov, M</creator><creator>McLachlan, A.D</creator><creator>Eisenberg, D</creator><general>National Academy of Sciences of the United States of America</general><general>National Acad 
Sciences</general><scope>FBQ</scope><scope>IQODW</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><scope>5PM</scope></search><sort><creationdate>19870701</creationdate><title>Profile analysis: detection of distantly related proteins</title><author>Gribskov, M ; McLachlan, A.D ; Eisenberg, D</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c577t-3a1793ddeff8d5c0b1f17845d757182a91bd6e4341f8ab4ead507e5c83b0d8443</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>1987</creationdate><topic>ACIDE AMINE</topic><topic>Algorithms</topic><topic>Amino Acid Sequence</topic><topic>AMINO ACIDS</topic><topic>AMINOACIDOS</topic><topic>Analytical, structural and metabolic biochemistry</topic><topic>ANIMAL PROTEIN</topic><topic>Base Sequence</topic><topic>Biochemistry</topic><topic>Biological and medical sciences</topic><topic>COMPUTER SOFTWARE</topic><topic>Dynamic programming methods</topic><topic>Fundamental and applied biological sciences. 
Psychology</topic><topic>General aspects, investigation methods</topic><topic>GLOBINS</topic><topic>Globins - genetics</topic><topic>Immunoglobulin constant regions</topic><topic>IMMUNOGLOBULINE</topic><topic>IMMUNOGLOBULINS</topic><topic>Immunoglobulins - genetics</topic><topic>Information search</topic><topic>Information Systems</topic><topic>INMUNOGLOBULINA</topic><topic>LOGICIEL</topic><topic>Molecular structure</topic><topic>Nucleic acids</topic><topic>PLANT PROTEINS</topic><topic>PROFMAKE PROGRAM</topic><topic>PROGRAMAS DE ORDENADOR</topic><topic>Protein Conformation</topic><topic>PROTEINAS DE ORIGEN ANIMAL</topic><topic>PROTEINAS VEGETALES</topic><topic>PROTEINE ANIMALE</topic><topic>PROTEINE VEGETALE</topic><topic>Proteins</topic><topic>Proteins - genetics</topic><topic>Sequence Homology, Nucleic Acid</topic><topic>Software</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Gribskov, M</creatorcontrib><creatorcontrib>McLachlan, A.D</creatorcontrib><creatorcontrib>Eisenberg, D</creatorcontrib><collection>AGRIS</collection><collection>Pascal-Francis</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Proceedings of the National Academy of Sciences - PNAS</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Gribskov, M</au><au>McLachlan, A.D</au><au>Eisenberg, D</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Profile analysis: detection of distantly related proteins</atitle><jtitle>Proceedings of the National Academy of Sciences - PNAS</jtitle><addtitle>Proc Natl Acad Sci U S 
A</addtitle><date>1987-07-01</date><risdate>1987</risdate><volume>84</volume><issue>13</issue><spage>4355</spage><epage>4358</epage><pages>4355-4358</pages><issn>0027-8424</issn><eissn>1091-6490</eissn><coden>PNASA6</coden><abstract>Profile analysis is a method for detecting distantly related proteins by sequence comparison. The basis for comparison is not only the customary Dayhoff mutational-distance matrix but also the results of structural studies and information implicit in the alignments of the sequences of families of similar proteins. This information is expressed in a position-specific scoring table (profile), which is created from a group of sequences previously aligned by structural or sequence similarity. The similarity of any other sequence (target) to the group of aligned sequences (probe) can be tested by comparing the target to the profile using dynamic programming algorithms. The profile method differs in two major respects from methods of sequence comparison in common use: (i) Any number of known sequences can be used to construct the profile, allowing more information to be used in the testing of the target than is possible with pairwise alignment methods. (ii) The profile includes the penalties for insertion or deletion at each position, which allow one to include the probe secondary structure in the testing scheme. Tests with globin and immunoglobulin sequences show that profile analysis can distinguish all members of these families from all other sequences in a database containing 3800 protein sequences.</abstract><cop>Washington, DC</cop><pub>National Academy of Sciences of the United States of America</pub><pmid>3474607</pmid><doi>10.1073/pnas.84.13.4355</doi><tpages>4</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 0027-8424
ispartof Proceedings of the National Academy of Sciences - PNAS, 1987-07, Vol.84 (13), p.4355-4358
issn 0027-8424
1091-6490
language eng
recordid cdi_pnas_primary_84_13_4355_fulltext
source Jstor Complete Legacy; MEDLINE; PubMed Central; Alma/SFX Local Collection; Free Full-Text Journals in Chemistry
subjects ACIDE AMINE
Algorithms
Amino Acid Sequence
AMINO ACIDS
AMINOACIDOS
Analytical, structural and metabolic biochemistry
ANIMAL PROTEIN
Base Sequence
Biochemistry
Biological and medical sciences
COMPUTER SOFTWARE
Dynamic programming methods
Fundamental and applied biological sciences. Psychology
General aspects, investigation methods
GLOBINS
Globins - genetics
Immunoglobulin constant regions
IMMUNOGLOBULINE
IMMUNOGLOBULINS
Immunoglobulins - genetics
Information search
Information Systems
INMUNOGLOBULINA
LOGICIEL
Molecular structure
Nucleic acids
PLANT PROTEINS
PROFMAKE PROGRAM
PROGRAMAS DE ORDENADOR
Protein Conformation
PROTEINAS DE ORIGEN ANIMAL
PROTEINAS VEGETALES
PROTEINE ANIMALE
PROTEINE VEGETALE
Proteins
Proteins - genetics
Sequence Homology, Nucleic Acid
Software
title Profile analysis: detection of distantly related proteins
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-09T21%3A08%3A58IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-jstor_pnas_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Profile%20analysis:%20detection%20of%20distantly%20related%20proteins&rft.jtitle=Proceedings%20of%20the%20National%20Academy%20of%20Sciences%20-%20PNAS&rft.au=Gribskov,%20M&rft.date=1987-07-01&rft.volume=84&rft.issue=13&rft.spage=4355&rft.epage=4358&rft.pages=4355-4358&rft.issn=0027-8424&rft.eissn=1091-6490&rft.coden=PNASA6&rft_id=info:doi/10.1073/pnas.84.13.4355&rft_dat=%3Cjstor_pnas_%3E30172%3C/jstor_pnas_%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=77601486&rft_id=info:pmid/3474607&rft_jstor_id=30172&rfr_iscdi=true
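
The description field of this record outlines two steps: building a position-specific scoring table (profile) from a group of aligned sequences, and comparing a target sequence to that profile by dynamic programming with position-specific insertion/deletion penalties. The following is a minimal sketch of that idea, not the authors' implementation. The names `toy_similarity`, `build_profile`, and `score_target` are hypothetical, the identity/mismatch scores stand in for the Dayhoff matrix, and both the gap-penalty scaling rule and the local-alignment recurrence are simplifying assumptions made for illustration.

```python
from collections import Counter

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def toy_similarity(a, b):
    """Assumed stand-in for a Dayhoff-style matrix: +2 identity, -1 otherwise."""
    return 2.0 if a == b else -1.0


def build_profile(aligned, base_gap=3.0):
    """Build one (scores, gap_penalty) row per column of the probe alignment.

    `aligned` is a list of equal-length strings that use '-' for gaps.
    """
    n_seqs = len(aligned)
    profile = []
    for column in zip(*aligned):
        counts = Counter(c for c in column if c != "-")
        n_res = sum(counts.values())
        if n_res == 0:
            continue  # skip columns that contain only gaps
        # Score of residue `a` here: frequency-weighted average similarity
        # to the residues observed in this column of the probe.
        scores = {
            a: sum(n * toy_similarity(a, b) for b, n in counts.items()) / n_res
            for a in ALPHABET
        }
        # Assumption: columns where the probe itself has gaps get a
        # cheaper insertion/deletion penalty.
        gap_fraction = (n_seqs - n_res) / n_seqs
        gap_penalty = base_gap * (1.0 - 0.5 * gap_fraction)
        profile.append((scores, gap_penalty))
    return profile


def score_target(profile, target):
    """Best local alignment score of `target` against `profile`."""
    rows, cols = len(profile), len(target)
    h = [[0.0] * (cols + 1) for _ in range(rows + 1)]
    best = 0.0
    for i in range(1, rows + 1):
        scores, gap = profile[i - 1]
        for j in range(1, cols + 1):
            # Unknown residue codes are scored as mismatches.
            match = h[i - 1][j - 1] + scores.get(target[j - 1], -1.0)
            skip_profile = h[i - 1][j] - gap  # target lacks this position
            skip_target = h[i][j - 1] - gap   # target has an extra residue
            h[i][j] = max(0.0, match, skip_profile, skip_target)
            best = max(best, h[i][j])
    return best


if __name__ == "__main__":
    probe = ["ACDE", "ACDE", "AC-E"]  # toy probe alignment, not real data
    profile = build_profile(probe)
    for target in ("ACDE", "ACE", "WWWW"):
        print(target, score_target(profile, target))
```

Because every probe sequence contributes to each column's scores, adding sequences to `probe` changes the table without changing the comparison step, which is the first of the two differences from pairwise methods that the description notes. The per-column `gap_penalty` corresponds to the second.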