Revisiting gap locations in amino acid sequence alignments and a proposal for a method to improve them by introducing solvent accessibility

In comparative modeling, the quality of amino acid sequence alignment still constitutes a major bottleneck in the generation of high quality models of protein three‐dimensional (3D) structures. Substantial efforts have been made to improve alignment quality by revising the substitution matrix, intro...

Detailed description

Bibliographic details
Published in: Proteins, structure, function, and bioinformatics, 2011-06, Vol.79 (6), p.1868-1877
Main authors: Hijikata, Atsushi; Yura, Kei; Noguti, Tosiyuki; Go, Mitiko
Format: Article
Language: eng
Online access: Full text
container_end_page 1877
container_issue 6
container_start_page 1868
container_title Proteins, structure, function, and bioinformatics
container_volume 79
creator Hijikata, Atsushi
Yura, Kei
Noguti, Tosiyuki
Go, Mitiko
description In comparative modeling, the quality of amino acid sequence alignment still constitutes a major bottleneck in the generation of high quality models of protein three‐dimensional (3D) structures. Substantial efforts have been made to improve alignment quality by revising the substitution matrix, introducing multiple sequences, replacing dynamic programming with hidden Markov models, and incorporating 3D structure information. Improvements in the gap penalty have not been a major focus, however, following the development of the affine gap penalty and of the secondary structure dependent gap penalty. We revisited the correlation between protein 3D structure and gap location in a large protein 3D structure data set, and found that the frequency of gap locations approximated to an exponential function of the solvent accessibility of the inserted residues. The nonlinearity of the gap frequency as a function of accessibility corresponded well to the relationship between residue mutation pattern and residue accessibility. By introducing this relationship into the gap penalty calculation for pairwise alignment between template and target amino acid sequences, we were able to obtain a sequence alignment much closer to the structural alignment. The quality of the alignments was substantially improved on a pair of sequences with identity in the “twilight zone” between 20 and 40%. The relocation of gaps by our new method made a significant improvement in comparative modeling, exemplified here by the Bacillus subtilis yitF protein. The method was implemented in a computer program, ALAdeGAP (ALignment with Accessibility dependent GAp Penalty), which is available at http://cib.cf.ocha.ac.jp/target_protein/. Proteins 2011; © 2011 Wiley‐Liss, Inc.
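The idea summarized in the description, a gap penalty that depends on the solvent accessibility of the residue facing the gap, can be illustrated with a minimal sketch. This is not the ALAdeGAP implementation. Everything below is an assumption made for illustration: the function names, the constants `a` and `k` of the exponential model, the choice of penalty as the negative logarithm of the modeled gap frequency, the simple match/mismatch scores, the linear (non-affine) gap cost, and the constant penalty for gaps placed in the template.

```python
import math


def gap_frequency(acc, a=0.01, k=3.0):
    """Hypothetical exponential model of gap frequency versus
    relative solvent accessibility acc (0 = buried, 1 = exposed).
    The constants a and k are illustrative, not fitted values."""
    return a * math.exp(k * acc)


def gap_penalty(acc, a=0.01, k=3.0):
    """Assumed penalty: negative log of the modeled gap frequency,
    so exposed residues are cheaper to leave unaligned."""
    return -math.log(gap_frequency(acc, a, k))


def align(template, target, acc, match=2.0, mismatch=-1.0, default_gap=3.0):
    """Global pairwise alignment (Needleman-Wunsch style).

    A template residue left unaligned (gap in the target) costs
    gap_penalty(acc[i]); a target residue left unaligned (gap in the
    template) costs the constant default_gap, because no accessibility
    is known for it. Returns (aligned_template, aligned_target, score).
    """
    n, m = len(template), len(target)
    pen = [gap_penalty(x) for x in acc]
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    move = [[""] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = score[i - 1][0] - pen[i - 1]
        move[i][0] = "up"
    for j in range(1, m + 1):
        score[0][j] = score[0][j - 1] - default_gap
        move[0][j] = "left"
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if template[i - 1] == target[j - 1] else mismatch
            options = [
                (score[i - 1][j - 1] + s, "diag"),       # align the two residues
                (score[i - 1][j] - pen[i - 1], "up"),    # gap in the target
                (score[i][j - 1] - default_gap, "left"),  # gap in the template
            ]
            score[i][j], move[i][j] = max(options, key=lambda o: o[0])
    # Trace back from the bottom-right corner to recover the alignment.
    a_tpl, a_tgt = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if move[i][j] == "diag":
            a_tpl.append(template[i - 1])
            a_tgt.append(target[j - 1])
            i, j = i - 1, j - 1
        elif move[i][j] == "up":
            a_tpl.append(template[i - 1])
            a_tgt.append("-")
            i -= 1
        else:
            a_tpl.append("-")
            a_tgt.append(target[j - 1])
            j -= 1
    return "".join(reversed(a_tpl)), "".join(reversed(a_tgt)), score[n][m]


if __name__ == "__main__":
    # The second template residue is modeled as exposed, the rest as buried.
    print(align("ACCD", "ACD", [0.1, 0.9, 0.1, 0.1]))
```

With the toy template `ACCD` and target `ACD`, the gap can face either `C`. Under these assumed parameters it is placed opposite whichever `C` is given the higher accessibility, which is the qualitative behavior the description attributes to an accessibility dependent gap penalty.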
doi_str_mv 10.1002/prot.23011
format Article
fullrecord <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_3110861</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>866251345</sourcerecordid><originalsourceid>FETCH-LOGICAL-c6151-5f283348f32f6ca9103bea8d6efed84415ce2e9530f5ebfa3af25b57830b2b473</originalsourceid><addsrcrecordid>eNqNks1uEzEURkcIRNPChgdAllggIU3xz9jj2SChQAsioqgKgp3l8dxJXGbsYDspeQZeGoe0EbAAVpblc8-1r7-ieETwKcGYPl8Fn04pw4TcKSYEN3WJCavuFhMsZV0yLvlRcRzjFcZYNEzcL44oqQTngk6K75ewsdEm6xZooVdo8EYn611E1iE9WueRNrZDEb6uwRlAerALN4JLEWnXIY1y95WPekC9D3k7Qlr6DiWP7JiPNoDSEkbUbrMwBd-tza5V9MMmO7LbQIy2tYNN2wfFvV4PER7erCfFx7PX8-mbcnZx_nb6clYaQTgpeU8lY5XsGe2F0Q3BrAUtOwE9dLKqCDdAoeEM9xzaXjPdU97yWjLc0raq2UnxYu9drdsROpMvEvSgVsGOOmyV11b9fuLsUi38RjFCsBQkC57eCILPY4lJjTYaGAbtwK-jkrkVk6L-P5LSRlb_JoWgPH8rz-STP8grvw4uT0wRKupG0Kra-Z7tKRN8jAH6w_sIVrvYqF1s1M_YZPjxrxM5oLc5yQDZA9d2gO1fVOrD5cX8Vlrua2xM8O1Qo8MXJWpWc_Xp_bl692rWfJ7KuTpjPwC8q9-z</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1267962444</pqid></control><display><type>article</type><title>Revisiting gap locations in amino acid sequence alignments and a proposal for a method to improve them by introducing solvent accessibility</title><source>MEDLINE</source><source>Access via Wiley Online Library</source><creator>Hijikata, Atsushi ; Yura, Kei ; Noguti, Tosiyuki ; Go, Mitiko</creator><creatorcontrib>Hijikata, Atsushi ; Yura, Kei ; Noguti, Tosiyuki ; Go, Mitiko</creatorcontrib><description>In comparative modeling, the quality of amino acid sequence alignment still constitutes a major bottleneck in the generation of high quality models of protein three‐dimensional (3D) structures. 
Substantial efforts have been made to improve alignment quality by revising the substitution matrix, introducing multiple sequences, replacing dynamic programming with hidden Markov models, and incorporating 3D structure information. Improvements in the gap penalty have not been a major focus, however, following the development of the affine gap penalty and of the secondary structure dependent gap penalty. We revisited the correlation between protein 3D structure and gap location in a large protein 3D structure data set, and found that the frequency of gap locations approximated to an exponential function of the solvent accessibility of the inserted residues. The nonlinearity of the gap frequency as a function of accessibility corresponded well to the relationship between residue mutation pattern and residue accessibility. By introducing this relationship into the gap penalty calculation for pairwise alignment between template and target amino acid sequences, we were able to obtain a sequence alignment much closer to the structural alignment. The quality of the alignments was substantially improved on a pair of sequences with identity in the “twilight zone” between 20 and 40%. The relocation of gaps by our new method made a significant improvement in comparative modeling, exemplified here by the Bacillus subtilis yitF protein. The method was implemented in a computer program, ALAdeGAP (ALignment with Accessibility dependent GAp Penalty), which is available at http://cib.cf.ocha.ac.jp/target_protein/. 
Proteins 2011; © 2011 Wiley‐Liss, Inc.</description><identifier>ISSN: 0887-3585</identifier><identifier>EISSN: 1097-0134</identifier><identifier>DOI: 10.1002/prot.23011</identifier><identifier>PMID: 21465562</identifier><language>eng</language><publisher>Hoboken: Wiley Subscription Services, Inc., A Wiley Company</publisher><subject>ALAdeGAP ; amino acid sequence alignment ; Amino Acids - chemistry ; Bacillus subtilis ; Bacillus subtilis - chemistry ; Bacterial Proteins - chemistry ; comparative modeling ; Escherichia coli - chemistry ; Models, Molecular ; position dependent gap penalty ; Protein Conformation ; Sequence Alignment - methods ; solvent accessibility ; Solvents</subject><ispartof>Proteins, structure, function, and bioinformatics, 2011-06, Vol.79 (6), p.1868-1877</ispartof><rights>Copyright © 2011 Wiley‐Liss, Inc.</rights><rights>Copyright © 2011 Wiley-Liss, Inc.</rights><rights>Copyright © 2011 Wiley-Liss, Inc., A Wiley Company 2011</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c6151-5f283348f32f6ca9103bea8d6efed84415ce2e9530f5ebfa3af25b57830b2b473</citedby><cites>FETCH-LOGICAL-c6151-5f283348f32f6ca9103bea8d6efed84415ce2e9530f5ebfa3af25b57830b2b473</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://onlinelibrary.wiley.com/doi/pdf/10.1002%2Fprot.23011$$EPDF$$P50$$Gwiley$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://onlinelibrary.wiley.com/doi/full/10.1002%2Fprot.23011$$EHTML$$P50$$Gwiley$$Hfree_for_read</linktohtml><link.rule.ids>230,314,780,784,885,1417,27924,27925,45574,45575</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/21465562$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Hijikata, Atsushi</creatorcontrib><creatorcontrib>Yura, 
Kei</creatorcontrib><creatorcontrib>Noguti, Tosiyuki</creatorcontrib><creatorcontrib>Go, Mitiko</creatorcontrib><title>Revisiting gap locations in amino acid sequence alignments and a proposal for a method to improve them by introducing solvent accessibility</title><title>Proteins, structure, function, and bioinformatics</title><addtitle>Proteins</addtitle><description>In comparative modeling, the quality of amino acid sequence alignment still constitutes a major bottleneck in the generation of high quality models of protein three‐dimensional (3D) structures. Substantial efforts have been made to improve alignment quality by revising the substitution matrix, introducing multiple sequences, replacing dynamic programming with hidden Markov models, and incorporating 3D structure information. Improvements in the gap penalty have not been a major focus, however, following the development of the affine gap penalty and of the secondary structure dependent gap penalty. We revisited the correlation between protein 3D structure and gap location in a large protein 3D structure data set, and found that the frequency of gap locations approximated to an exponential function of the solvent accessibility of the inserted residues. The nonlinearity of the gap frequency as a function of accessibility corresponded well to the relationship between residue mutation pattern and residue accessibility. By introducing this relationship into the gap penalty calculation for pairwise alignment between template and target amino acid sequences, we were able to obtain a sequence alignment much closer to the structural alignment. The quality of the alignments was substantially improved on a pair of sequences with identity in the “twilight zone” between 20 and 40%. The relocation of gaps by our new method made a significant improvement in comparative modeling, exemplified here by the Bacillus subtilis yitF protein. 
The method was implemented in a computer program, ALAdeGAP (ALignment with Accessibility dependent GAp Penalty), which is available at http://cib.cf.ocha.ac.jp/target_protein/. Proteins 2011; © 2011 Wiley‐Liss, Inc.</description><subject>ALAdeGAP</subject><subject>amino acid sequence alignment</subject><subject>Amino Acids - chemistry</subject><subject>Bacillus subtilis</subject><subject>Bacillus subtilis - chemistry</subject><subject>Bacterial Proteins - chemistry</subject><subject>comparative modeling</subject><subject>Escherichia coli - chemistry</subject><subject>Models, Molecular</subject><subject>position dependent gap penalty</subject><subject>Protein Conformation</subject><subject>Sequence Alignment - methods</subject><subject>solvent accessibility</subject><subject>Solvents</subject><issn>0887-3585</issn><issn>1097-0134</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2011</creationdate><recordtype>article</recordtype><sourceid>24P</sourceid><sourceid>WIN</sourceid><sourceid>EIF</sourceid><recordid>eNqNks1uEzEURkcIRNPChgdAllggIU3xz9jj2SChQAsioqgKgp3l8dxJXGbsYDspeQZeGoe0EbAAVpblc8-1r7-ieETwKcGYPl8Fn04pw4TcKSYEN3WJCavuFhMsZV0yLvlRcRzjFcZYNEzcL44oqQTngk6K75ewsdEm6xZooVdo8EYn611E1iE9WueRNrZDEb6uwRlAerALN4JLEWnXIY1y95WPekC9D3k7Qlr6DiWP7JiPNoDSEkbUbrMwBd-tza5V9MMmO7LbQIy2tYNN2wfFvV4PER7erCfFx7PX8-mbcnZx_nb6clYaQTgpeU8lY5XsGe2F0Q3BrAUtOwE9dLKqCDdAoeEM9xzaXjPdU97yWjLc0raq2UnxYu9drdsROpMvEvSgVsGOOmyV11b9fuLsUi38RjFCsBQkC57eCILPY4lJjTYaGAbtwK-jkrkVk6L-P5LSRlb_JoWgPH8rz-STP8grvw4uT0wRKupG0Kra-Z7tKRN8jAH6w_sIVrvYqF1s1M_YZPjxrxM5oLc5yQDZA9d2gO1fVOrD5cX8Vlrua2xM8O1Qo8MXJWpWc_Xp_bl692rWfJ7KuTpjPwC8q9-z</recordid><startdate>201106</startdate><enddate>201106</enddate><creator>Hijikata, Atsushi</creator><creator>Yura, Kei</creator><creator>Noguti, Tosiyuki</creator><creator>Go, Mitiko</creator><general>Wiley Subscription Services, Inc., A Wiley Company</general><general>Wiley Subscription Services, 
Inc</general><scope>BSCLL</scope><scope>24P</scope><scope>WIN</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7QL</scope><scope>7QO</scope><scope>7QP</scope><scope>7QR</scope><scope>7TK</scope><scope>7TM</scope><scope>7U9</scope><scope>8FD</scope><scope>C1K</scope><scope>FR3</scope><scope>H94</scope><scope>K9.</scope><scope>M7N</scope><scope>P64</scope><scope>RC3</scope><scope>7X8</scope><scope>5PM</scope></search><sort><creationdate>201106</creationdate><title>Revisiting gap locations in amino acid sequence alignments and a proposal for a method to improve them by introducing solvent accessibility</title><author>Hijikata, Atsushi ; Yura, Kei ; Noguti, Tosiyuki ; Go, Mitiko</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c6151-5f283348f32f6ca9103bea8d6efed84415ce2e9530f5ebfa3af25b57830b2b473</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2011</creationdate><topic>ALAdeGAP</topic><topic>amino acid sequence alignment</topic><topic>Amino Acids - chemistry</topic><topic>Bacillus subtilis</topic><topic>Bacillus subtilis - chemistry</topic><topic>Bacterial Proteins - chemistry</topic><topic>comparative modeling</topic><topic>Escherichia coli - chemistry</topic><topic>Models, Molecular</topic><topic>position dependent gap penalty</topic><topic>Protein Conformation</topic><topic>Sequence Alignment - methods</topic><topic>solvent accessibility</topic><topic>Solvents</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Hijikata, Atsushi</creatorcontrib><creatorcontrib>Yura, Kei</creatorcontrib><creatorcontrib>Noguti, Tosiyuki</creatorcontrib><creatorcontrib>Go, Mitiko</creatorcontrib><collection>Istex</collection><collection>Wiley Online Library Open Access</collection><collection>Wiley Online Library (Open Access 
Collection)</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Bacteriology Abstracts (Microbiology B)</collection><collection>Biotechnology Research Abstracts</collection><collection>Calcium &amp; Calcified Tissue Abstracts</collection><collection>Chemoreception Abstracts</collection><collection>Neurosciences Abstracts</collection><collection>Nucleic Acids Abstracts</collection><collection>Virology and AIDS Abstracts</collection><collection>Technology Research Database</collection><collection>Environmental Sciences and Pollution Management</collection><collection>Engineering Research Database</collection><collection>AIDS and Cancer Research Abstracts</collection><collection>ProQuest Health &amp; Medical Complete (Alumni)</collection><collection>Algology Mycology and Protozoology Abstracts (Microbiology C)</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>Genetics Abstracts</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Proteins, structure, function, and bioinformatics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Hijikata, Atsushi</au><au>Yura, Kei</au><au>Noguti, Tosiyuki</au><au>Go, Mitiko</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Revisiting gap locations in amino acid sequence alignments and a proposal for a method to improve them by introducing solvent accessibility</atitle><jtitle>Proteins, structure, function, and 
bioinformatics</jtitle><addtitle>Proteins</addtitle><date>2011-06</date><risdate>2011</risdate><volume>79</volume><issue>6</issue><spage>1868</spage><epage>1877</epage><pages>1868-1877</pages><issn>0887-3585</issn><eissn>1097-0134</eissn><abstract>In comparative modeling, the quality of amino acid sequence alignment still constitutes a major bottleneck in the generation of high quality models of protein three‐dimensional (3D) structures. Substantial efforts have been made to improve alignment quality by revising the substitution matrix, introducing multiple sequences, replacing dynamic programming with hidden Markov models, and incorporating 3D structure information. Improvements in the gap penalty have not been a major focus, however, following the development of the affine gap penalty and of the secondary structure dependent gap penalty. We revisited the correlation between protein 3D structure and gap location in a large protein 3D structure data set, and found that the frequency of gap locations approximated to an exponential function of the solvent accessibility of the inserted residues. The nonlinearity of the gap frequency as a function of accessibility corresponded well to the relationship between residue mutation pattern and residue accessibility. By introducing this relationship into the gap penalty calculation for pairwise alignment between template and target amino acid sequences, we were able to obtain a sequence alignment much closer to the structural alignment. The quality of the alignments was substantially improved on a pair of sequences with identity in the “twilight zone” between 20 and 40%. The relocation of gaps by our new method made a significant improvement in comparative modeling, exemplified here by the Bacillus subtilis yitF protein. The method was implemented in a computer program, ALAdeGAP (ALignment with Accessibility dependent GAp Penalty), which is available at http://cib.cf.ocha.ac.jp/target_protein/. 
Proteins 2011; © 2011 Wiley‐Liss, Inc.</abstract><cop>Hoboken</cop><pub>Wiley Subscription Services, Inc., A Wiley Company</pub><pmid>21465562</pmid><doi>10.1002/prot.23011</doi><tpages>10</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 0887-3585
ispartof Proteins, structure, function, and bioinformatics, 2011-06, Vol.79 (6), p.1868-1877
issn 0887-3585
1097-0134
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_3110861
source MEDLINE; Access via Wiley Online Library
subjects ALAdeGAP
amino acid sequence alignment
Amino Acids - chemistry
Bacillus subtilis
Bacillus subtilis - chemistry
Bacterial Proteins - chemistry
comparative modeling
Escherichia coli - chemistry
Models, Molecular
position dependent gap penalty
Protein Conformation
Sequence Alignment - methods
solvent accessibility
Solvents
title Revisiting gap locations in amino acid sequence alignments and a proposal for a method to improve them by introducing solvent accessibility
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-02T17%3A46%3A22IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Revisiting%20gap%20locations%20in%20amino%20acid%20sequence%20alignments%20and%20a%20proposal%20for%20a%20method%20to%20improve%20them%20by%20introducing%20solvent%20accessibility&rft.jtitle=Proteins,%20structure,%20function,%20and%20bioinformatics&rft.au=Hijikata,%20Atsushi&rft.date=2011-06&rft.volume=79&rft.issue=6&rft.spage=1868&rft.epage=1877&rft.pages=1868-1877&rft.issn=0887-3585&rft.eissn=1097-0134&rft_id=info:doi/10.1002/prot.23011&rft_dat=%3Cproquest_pubme%3E866251345%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1267962444&rft_id=info:pmid/21465562&rfr_iscdi=true