Structure-based Comparative Analysis and Prediction of N-linked Glycosylation Sites in Evolutionarily Distant Eukaryotes

The asparagine-X-serine/threonine (NXS/T) motif, where X is any amino acid except proline, is the consensus motif for N-linked glycosylation. Significant numbers of high-resolution crystal structures of glycosylated proteins allow us to carry out structural analysis of the N-linked glycosylation sit...

Bibliographic details
Published in: Genomics, proteomics & bioinformatics, 2013-04, Vol.11 (2), p.96-104
Main authors: Lam, Phuc Vinh Nguyen, Goldman, Radoslav, Karagiannis, Konstantinos, Narsule, Tejas, Simonyan, Vahan, Soika, Valerii, Mazumder, Raja
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page 104
container_issue 2
container_start_page 96
container_title Genomics, proteomics & bioinformatics
container_volume 11
creator Lam, Phuc Vinh Nguyen
Goldman, Radoslav
Karagiannis, Konstantinos
Narsule, Tejas
Simonyan, Vahan
Soika, Valerii
Mazumder, Raja
description The asparagine-X-serine/threonine (NXS/T) motif, where X is any amino acid except proline, is the consensus motif for N-linked glycosylation. Significant numbers of high-resolution crystal structures of glycosylated proteins allow us to carry out structural analysis of the N-linked glycosylation sites (NGS). Our analysis shows that there is enough structural information from diverse glycoproteins to allow the development of rules which can be used to predict NGS. A Python-based tool was developed to investigate asparagines implicated in N-glycosylation in five species: Homo sapiens, Mus musculus, Drosophila melanogaster, Arabidopsis thaliana and Saccharomyces cerevisiae. Our analysis shows that 78% of all asparagines of NXS/T motif involved in N-glycosylation are localized in the loop/turn conformation in the human proteome. Similar distribution was revealed for all the other species examined. Comparative analysis of the occurrence of NXS/T motifs not known to be glycosylated and their reverse sequence (S/TXN) shows a similar distribution across the secondary structural elements, indicating that the NXS/T motif in itself is not biologically relevant. Based on our analysis, we have defined rules to determine NGS. Using machine learning methods based on these rules we can predict with 93% accuracy if a particular site will be glycosylated. If structural information is not available the tool uses structural prediction results resulting in 74% accuracy. The tool was used to identify glycosylation sites in 108 human proteins with structures and 2247 proteins without structures that have acquired NXS/T site/s due to non-synonymous variation. The tool, Structure Feature Analysis Tool (SFAT), is freely available to the public at http://hive.biochemistry.gwu.edu/tools/sfat.
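The motif definition in the description above (N, then any amino acid except proline, then S or T) can be illustrated with a short sequence scan. This is a minimal sketch only and is not taken from SFAT: the function names `find_sequons` and `find_reverse_motifs` are hypothetical, the alphabet is assumed to be the 20 standard one-letter amino acid codes, and the reverse motif (S/TXN) is assumed to exclude proline at X in the same way. It finds candidate motifs in a sequence; it does not predict whether a site is actually glycosylated, which the abstract attributes to structure-based rules and machine learning.

```python
import re

# The 19 standard residues allowed at position X (proline is excluded).
NON_PRO = "ACDEFGHIKLMNQRSTVWY"

# N-X-S/T motif. The zero-width lookahead lets overlapping motifs
# (for example the two in "NNTS") each be reported.
SEQUON = re.compile(rf"(?=N[{NON_PRO}][ST])")

# Reverse motif S/T-X-N (assumption: X also excludes proline here).
REVERSE = re.compile(rf"(?=[ST][{NON_PRO}]N)")


def find_sequons(sequence):
    """Return 1-based positions of the asparagine in each N-X-S/T motif."""
    return [m.start() + 1 for m in SEQUON.finditer(sequence.upper())]


def find_reverse_motifs(sequence):
    """Return 1-based positions of the asparagine in each S/T-X-N motif."""
    # The asparagine is the third residue of the reverse motif.
    return [m.start() + 3 for m in REVERSE.finditer(sequence.upper())]


if __name__ == "__main__":
    toy = "MNGTNPSANNTS"  # made-up sequence, not a real protein
    print(find_sequons(toy))
    print(find_reverse_motifs(toy))
```

Positions are reported 1-based because residue numbering in protein databases conventionally starts at 1. In the toy sequence, the `NPS` stretch is skipped because proline occupies position X.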
doi_str_mv 10.1016/j.gpb.2012.11.003
format Article
fullrecord <record><control><sourceid>wanfang_jour_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_3914773</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><cqvip_id>45557338</cqvip_id><wanfj_id>jyzdbzzyswxxxb_e201302003</wanfj_id><els_id>S1672022913000272</els_id><sourcerecordid>jyzdbzzyswxxxb_e201302003</sourcerecordid><originalsourceid>FETCH-LOGICAL-c5183-265c550001dd536fa45a71e0734342251a849c24c2a1caeae6f209340783e9b03</originalsourceid><addsrcrecordid>eNp9Uk2P0zAUjBCILQs_gAsETlxS_Jk0QkJalbIgrQCp7Nl6cZzW3dTu2k636a_HId0FLpws-c0bzcybJHmJ0RQjnL_fTFe7akoQJlOMpwjRR8mEEIwyShh7nExwXpAMEVKeJc-83yDEOGP4aXJGKOMl5uUkOSyD62TonMoq8KpO53a7AwdB71V6YaDtvfYpmDr94VStZdDWpLZJv2WtNjcRf9n20vq-hd-TpQ7Kp9qki71tu-ELnG779JP2AUxIF90NuN5G0PPkSQOtVy9O73ly_Xnxc_4lu_p--XV-cZVJjmc0IzmXnCOEcF1zmjfAOBRYoYIyygjhGGaslIRJAliCApU3BJWUoWJGVVkhep58HHl3XbVVtVQmOGjFzultVCIsaPHvxOi1WNm9oCVmRUEjwduRwPqghZfRoVxLa4ySQeCclCXDEYRH0B2YBsxKbGznYnpebPpjXR2Pvb87HA6VUPFaFJF4q7jz7qTM2dtO-SC22kvVtmCU7bzAtCCI85KhP_TSWe-dah70YySGKoiNiFUQQxUExmKkf_W38YeN-9tHwOsR0IAVsHLai-tlZBjSZpSiPCI-jAgVD7TXyg32lZGxCG5wX1v9XwFvTqLX1qxudUzlXgPjnMdoZ_QXQPjagA</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1372055940</pqid></control><display><type>article</type><title>Structure-based Comparative Analysis and Prediction of N-linked Glycosylation Sites in Evolutionarily Distant Eukaryotes</title><source>MEDLINE</source><source>DOAJ Directory of Open Access Journals</source><source>Access via Oxford University Press (Open Access Collection)</source><source>Access via ScienceDirect (Elsevier)</source><source>EZB-FREE-00999 freely available EZB journals</source><source>PubMed Central</source><source>Alma/SFX Local Collection</source><creator>Lam, Phuc Vinh Nguyen ; Goldman, Radoslav ; Karagiannis, Konstantinos ; Narsule, Tejas ; Simonyan, Vahan ; Soika, Valerii ; Mazumder, Raja</creator><creatorcontrib>Lam, Phuc 
Vinh Nguyen ; Goldman, Radoslav ; Karagiannis, Konstantinos ; Narsule, Tejas ; Simonyan, Vahan ; Soika, Valerii ; Mazumder, Raja ; Oak Ridge Institute for Science and Education (ORISE), Oak Ridge, TN (United States)</creatorcontrib><description>The asparagine-X-serine/threonine (NXS/T) motif, where X is any amino acid except proline, is the consensus motif for N-linked glycosylation. Significant numbers of high-resolution crystal structures of glycosylated proteins allow us to carry out structural analysis of the N-linked glycosylation sites (NGS). Our analysis shows that there is enough structural information from diverse glycoproteins to allow the development of rules which can be used to predict NGS. A Python-based tool was developed to investigate asparagines implicated in N-glycosylation in five species: Homo sapiens, Mus musculus, Drosophila melanogaster, Arabidopsis thaliana and Saccharomyces cerevisiae. Our analysis shows that 78% of all asparagines of NXS/T motif involved in N-glycosylation are localized in the loop/turn conformation in the human proteome. Similar distribution was revealed for all the other species examined. Comparative analysis of the occurrence of NXS/T motifs not known to be glycosylated and their reverse sequence (S/TXN) shows a similar distribution across the secondary structural elements, indicating that the NXS/T motif in itself is not biologically relevant. Based on our analysis, we have defined rules to determine NGS. Using machine learning methods based on these rules we can predict with 93% accuracy if a particular site will be glycosylated. If structural information is not available the tool uses structural prediction results resulting in 74% accuracy. The tool was used to identify glycosylation sites in 108 human proteins with structures and 2247 proteins without structures that have acquired NXS/T site/s due to non-synonymous variation. 
The tool, Structure Feature Analysis Tool (SFAT), is freely available to the public at http://hive.biochemistry.gwu.edu/tools/sfat.</description><identifier>ISSN: 1672-0229</identifier><identifier>EISSN: 2210-3244</identifier><identifier>DOI: 10.1016/j.gpb.2012.11.003</identifier><identifier>PMID: 23459159</identifier><language>eng</language><publisher>China: Elsevier Ltd</publisher><subject>Amino Acid Motifs ; Amino Acids - metabolism ; Animals ; Arabidopsis ; Arabidopsis Proteins - metabolism ; Arabidopsis thaliana ; Artificial Intelligence ; Asparagine - metabolism ; BASIC BIOLOGICAL SCIENCES ; Biological Evolution ; crystal structure ; Databases, Protein ; Drosophila melanogaster ; Drosophila Proteins - metabolism ; Eukaryota ; eukaryotic cells ; Gain and loss of glycosylation ; glycoproteins ; Glycoproteins - genetics ; Glycoproteins - metabolism ; Glycosylation ; Humans ; Mice ; Mus musculus ; N-linked glycosylation ; N-糖基化 ; nsSNP ; nsSNV ; Original Research ; Polymorphism, Single Nucleotide ; prediction ; proline ; Protein Processing, Post-Translational ; Proteome ; Saccharomyces cerevisiae ; Saccharomyces cerevisiae Proteins - metabolism ; Software ; Variation ; 分析工具 ; 生物进化 ; 真核 ; 结构比较 ; 结构预测 ; 蛋白质构象 ; 连接</subject><ispartof>Genomics, proteomics &amp; bioinformatics, 2013-04, Vol.11 (2), p.96-104</ispartof><rights>2013</rights><rights>Copyright © 2013. Production and hosting by Elsevier Ltd.</rights><rights>Copyright © Wanfang Data Co. Ltd. All Rights Reserved.</rights><rights>2013 Beijing Institute of Genomics, Chinese Academy of Sciences and Genetics Society of China. Production and hosting by Elsevier B.V. All rights reserved. 
2013</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c5183-265c550001dd536fa45a71e0734342251a849c24c2a1caeae6f209340783e9b03</citedby><cites>FETCH-LOGICAL-c5183-265c550001dd536fa45a71e0734342251a849c24c2a1caeae6f209340783e9b03</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Uhttp://image.cqvip.com/vip1000/qk/86775X/86775X.jpg</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC3914773/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://dx.doi.org/10.1016/j.gpb.2012.11.003$$EHTML$$P50$$Gelsevier$$Hfree_for_read</linktohtml><link.rule.ids>230,314,727,780,784,864,885,3550,27924,27925,45995,53791,53793</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/23459159$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink><backlink>$$Uhttps://www.osti.gov/servlets/purl/1629941$$D View this record in Osti.gov$$Hfree_for_read</backlink></links><search><creatorcontrib>Lam, Phuc Vinh Nguyen</creatorcontrib><creatorcontrib>Goldman, Radoslav</creatorcontrib><creatorcontrib>Karagiannis, Konstantinos</creatorcontrib><creatorcontrib>Narsule, Tejas</creatorcontrib><creatorcontrib>Simonyan, Vahan</creatorcontrib><creatorcontrib>Soika, Valerii</creatorcontrib><creatorcontrib>Mazumder, Raja</creatorcontrib><creatorcontrib>Oak Ridge Institute for Science and Education (ORISE), Oak Ridge, TN (United States)</creatorcontrib><title>Structure-based Comparative Analysis and Prediction of N-linked Glycosylation Sites in Evolutionarily Distant Eukaryotes</title><title>Genomics, proteomics &amp; bioinformatics</title><addtitle>Genomics Proteomics & Bioinformatics</addtitle><description>The asparagine-X-serine/threonine (NXS/T) motif, where X is any amino acid except proline, is the consensus motif for N-linked glycosylation. 
Significant numbers of high-resolution crystal structures of glycosylated proteins allow us to carry out structural analysis of the N-linked glycosylation sites (NGS). Our analysis shows that there is enough structural information from diverse glycoproteins to allow the development of rules which can be used to predict NGS. A Python-based tool was developed to investigate asparagines implicated in N-glycosylation in five species: Homo sapiens, Mus musculus, Drosophila melanogaster, Arabidopsis thaliana and Saccharomyces cerevisiae. Our analysis shows that 78% of all asparagines of NXS/T motif involved in N-glycosylation are localized in the loop/turn conformation in the human proteome. Similar distribution was revealed for all the other species examined. Comparative analysis of the occurrence of NXS/T motifs not known to be glycosylated and their reverse sequence (S/TXN) shows a similar distribution across the secondary structural elements, indicating that the NXS/T motif in itself is not biologically relevant. Based on our analysis, we have defined rules to determine NGS. Using machine learning methods based on these rules we can predict with 93% accuracy if a particular site will be glycosylated. If structural information is not available the tool uses structural prediction results resulting in 74% accuracy. The tool was used to identify glycosylation sites in 108 human proteins with structures and 2247 proteins without structures that have acquired NXS/T site/s due to non-synonymous variation. 
The tool, Structure Feature Analysis Tool (SFAT), is freely available to the public at http://hive.biochemistry.gwu.edu/tools/sfat.</description><subject>Amino Acid Motifs</subject><subject>Amino Acids - metabolism</subject><subject>Animals</subject><subject>Arabidopsis</subject><subject>Arabidopsis Proteins - metabolism</subject><subject>Arabidopsis thaliana</subject><subject>Artificial Intelligence</subject><subject>Asparagine - metabolism</subject><subject>BASIC BIOLOGICAL SCIENCES</subject><subject>Biological Evolution</subject><subject>crystal structure</subject><subject>Databases, Protein</subject><subject>Drosophila melanogaster</subject><subject>Drosophila Proteins - metabolism</subject><subject>Eukaryota</subject><subject>eukaryotic cells</subject><subject>Gain and loss of glycosylation</subject><subject>glycoproteins</subject><subject>Glycoproteins - genetics</subject><subject>Glycoproteins - metabolism</subject><subject>Glycosylation</subject><subject>Humans</subject><subject>Mice</subject><subject>Mus musculus</subject><subject>N-linked glycosylation</subject><subject>N-糖基化</subject><subject>nsSNP</subject><subject>nsSNV</subject><subject>Original Research</subject><subject>Polymorphism, Single Nucleotide</subject><subject>prediction</subject><subject>proline</subject><subject>Protein Processing, Post-Translational</subject><subject>Proteome</subject><subject>Saccharomyces cerevisiae</subject><subject>Saccharomyces cerevisiae Proteins - 
metabolism</subject><subject>Software</subject><subject>Variation</subject><subject>分析工具</subject><subject>生物进化</subject><subject>真核</subject><subject>结构比较</subject><subject>结构预测</subject><subject>蛋白质构象</subject><subject>连接</subject><issn>1672-0229</issn><issn>2210-3244</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2013</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNp9Uk2P0zAUjBCILQs_gAsETlxS_Jk0QkJalbIgrQCp7Nl6cZzW3dTu2k636a_HId0FLpws-c0bzcybJHmJ0RQjnL_fTFe7akoQJlOMpwjRR8mEEIwyShh7nExwXpAMEVKeJc-83yDEOGP4aXJGKOMl5uUkOSyD62TonMoq8KpO53a7AwdB71V6YaDtvfYpmDr94VStZdDWpLZJv2WtNjcRf9n20vq-hd-TpQ7Kp9qki71tu-ELnG779JP2AUxIF90NuN5G0PPkSQOtVy9O73ly_Xnxc_4lu_p--XV-cZVJjmc0IzmXnCOEcF1zmjfAOBRYoYIyygjhGGaslIRJAliCApU3BJWUoWJGVVkhep58HHl3XbVVtVQmOGjFzultVCIsaPHvxOi1WNm9oCVmRUEjwduRwPqghZfRoVxLa4ySQeCclCXDEYRH0B2YBsxKbGznYnpebPpjXR2Pvb87HA6VUPFaFJF4q7jz7qTM2dtO-SC22kvVtmCU7bzAtCCI85KhP_TSWe-dah70YySGKoiNiFUQQxUExmKkf_W38YeN-9tHwOsR0IAVsHLai-tlZBjSZpSiPCI-jAgVD7TXyg32lZGxCG5wX1v9XwFvTqLX1qxudUzlXgPjnMdoZ_QXQPjagA</recordid><startdate>201304</startdate><enddate>201304</enddate><creator>Lam, Phuc Vinh Nguyen</creator><creator>Goldman, Radoslav</creator><creator>Karagiannis, Konstantinos</creator><creator>Narsule, Tejas</creator><creator>Simonyan, Vahan</creator><creator>Soika, Valerii</creator><creator>Mazumder, Raja</creator><general>Elsevier Ltd</general><general>Department of Biochemistry and Molecular Biology, George Washington University Medical Center, Washington, DC 20037, USA%Department of Oncology, Georgetown University, Washington, DC 20057, USA%Department of Biochemistry and Molecular Biology, George Washington University Medical Center, Washington, DC 20037, USA%Center for Biologics Evaluation and Research, Food and Drug Administration, Rockville, MD 20852, USA</general><general>Life Sciences Department, Paris Diderot University, Paris 75013, 
France</general><general>Elsevier</general><scope>2RA</scope><scope>92L</scope><scope>CQIGP</scope><scope>W94</scope><scope>WU4</scope><scope>~WA</scope><scope>6I.</scope><scope>AAFTH</scope><scope>FBQ</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7QO</scope><scope>8FD</scope><scope>FR3</scope><scope>P64</scope><scope>RC3</scope><scope>2B.</scope><scope>4A8</scope><scope>92I</scope><scope>93N</scope><scope>PSX</scope><scope>TCJ</scope><scope>OIOZB</scope><scope>OTOTI</scope><scope>5PM</scope></search><sort><creationdate>201304</creationdate><title>Structure-based Comparative Analysis and Prediction of N-linked Glycosylation Sites in Evolutionarily Distant Eukaryotes</title><author>Lam, Phuc Vinh Nguyen ; Goldman, Radoslav ; Karagiannis, Konstantinos ; Narsule, Tejas ; Simonyan, Vahan ; Soika, Valerii ; Mazumder, Raja</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c5183-265c550001dd536fa45a71e0734342251a849c24c2a1caeae6f209340783e9b03</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2013</creationdate><topic>Amino Acid Motifs</topic><topic>Amino Acids - metabolism</topic><topic>Animals</topic><topic>Arabidopsis</topic><topic>Arabidopsis Proteins - metabolism</topic><topic>Arabidopsis thaliana</topic><topic>Artificial Intelligence</topic><topic>Asparagine - metabolism</topic><topic>BASIC BIOLOGICAL SCIENCES</topic><topic>Biological Evolution</topic><topic>crystal structure</topic><topic>Databases, Protein</topic><topic>Drosophila melanogaster</topic><topic>Drosophila Proteins - metabolism</topic><topic>Eukaryota</topic><topic>eukaryotic cells</topic><topic>Gain and loss of glycosylation</topic><topic>glycoproteins</topic><topic>Glycoproteins - genetics</topic><topic>Glycoproteins - 
metabolism</topic><topic>Glycosylation</topic><topic>Humans</topic><topic>Mice</topic><topic>Mus musculus</topic><topic>N-linked glycosylation</topic><topic>N-糖基化</topic><topic>nsSNP</topic><topic>nsSNV</topic><topic>Original Research</topic><topic>Polymorphism, Single Nucleotide</topic><topic>prediction</topic><topic>proline</topic><topic>Protein Processing, Post-Translational</topic><topic>Proteome</topic><topic>Saccharomyces cerevisiae</topic><topic>Saccharomyces cerevisiae Proteins - metabolism</topic><topic>Software</topic><topic>Variation</topic><topic>分析工具</topic><topic>生物进化</topic><topic>真核</topic><topic>结构比较</topic><topic>结构预测</topic><topic>蛋白质构象</topic><topic>连接</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Lam, Phuc Vinh Nguyen</creatorcontrib><creatorcontrib>Goldman, Radoslav</creatorcontrib><creatorcontrib>Karagiannis, Konstantinos</creatorcontrib><creatorcontrib>Narsule, Tejas</creatorcontrib><creatorcontrib>Simonyan, Vahan</creatorcontrib><creatorcontrib>Soika, Valerii</creatorcontrib><creatorcontrib>Mazumder, Raja</creatorcontrib><creatorcontrib>Oak Ridge Institute for Science and Education (ORISE), Oak Ridge, TN (United States)</creatorcontrib><collection>中文科技期刊数据库</collection><collection>中文科技期刊数据库-CALIS站点</collection><collection>中文科技期刊数据库-7.0平台</collection><collection>中文科技期刊数据库-自然科学</collection><collection>中文科技期刊数据库-自然科学-生物科学</collection><collection>中文科技期刊数据库- 镜像站点</collection><collection>ScienceDirect Open Access Titles</collection><collection>Elsevier:ScienceDirect:Open Access</collection><collection>AGRIS</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Biotechnology Research Abstracts</collection><collection>Technology Research Database</collection><collection>Engineering Research 
Database</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>Genetics Abstracts</collection><collection>Wanfang Data Journals - Hong Kong</collection><collection>WANFANG Data Centre</collection><collection>Wanfang Data Journals</collection><collection>万方数据期刊 - 香港版</collection><collection>China Online Journals (COJ)</collection><collection>China Online Journals (COJ)</collection><collection>OSTI.GOV - Hybrid</collection><collection>OSTI.GOV</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Genomics, proteomics &amp; bioinformatics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Lam, Phuc Vinh Nguyen</au><au>Goldman, Radoslav</au><au>Karagiannis, Konstantinos</au><au>Narsule, Tejas</au><au>Simonyan, Vahan</au><au>Soika, Valerii</au><au>Mazumder, Raja</au><aucorp>Oak Ridge Institute for Science and Education (ORISE), Oak Ridge, TN (United States)</aucorp><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Structure-based Comparative Analysis and Prediction of N-linked Glycosylation Sites in Evolutionarily Distant Eukaryotes</atitle><jtitle>Genomics, proteomics &amp; bioinformatics</jtitle><addtitle>Genomics Proteomics & Bioinformatics</addtitle><date>2013-04</date><risdate>2013</risdate><volume>11</volume><issue>2</issue><spage>96</spage><epage>104</epage><pages>96-104</pages><issn>1672-0229</issn><eissn>2210-3244</eissn><abstract>The asparagine-X-serine/threonine (NXS/T) motif, where X is any amino acid except proline, is the consensus motif for N-linked glycosylation. Significant numbers of high-resolution crystal structures of glycosylated proteins allow us to carry out structural analysis of the N-linked glycosylation sites (NGS). Our analysis shows that there is enough structural information from diverse glycoproteins to allow the development of rules which can be used to predict NGS. 
A Python-based tool was developed to investigate asparagines implicated in N-glycosylation in five species: Homo sapiens, Mus musculus, Drosophila melanogaster, Arabidopsis thaliana and Saccharomyces cerevisiae. Our analysis shows that 78% of all asparagines of NXS/T motif involved in N-glycosylation are localized in the loop/turn conformation in the human proteome. Similar distribution was revealed for all the other species examined. Comparative analysis of the occurrence of NXS/T motifs not known to be glycosylated and their reverse sequence (S/TXN) shows a similar distribution across the secondary structural elements, indicating that the NXS/T motif in itself is not biologically relevant. Based on our analysis, we have defined rules to determine NGS. Using machine learning methods based on these rules we can predict with 93% accuracy if a particular site will be glycosylated. If structural information is not available the tool uses structural prediction results resulting in 74% accuracy. The tool was used to identify glycosylation sites in 108 human proteins with structures and 2247 proteins without structures that have acquired NXS/T site/s due to non-synonymous variation. The tool, Structure Feature Analysis Tool (SFAT), is freely available to the public at http://hive.biochemistry.gwu.edu/tools/sfat.</abstract><cop>China</cop><pub>Elsevier Ltd</pub><pmid>23459159</pmid><doi>10.1016/j.gpb.2012.11.003</doi><tpages>9</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1672-0229
ispartof Genomics, proteomics & bioinformatics, 2013-04, Vol.11 (2), p.96-104
issn 1672-0229
2210-3244
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_3914773
source MEDLINE; DOAJ Directory of Open Access Journals; Access via Oxford University Press (Open Access Collection); Access via ScienceDirect (Elsevier); EZB-FREE-00999 freely available EZB journals; PubMed Central; Alma/SFX Local Collection
subjects Amino Acid Motifs
Amino Acids - metabolism
Animals
Arabidopsis
Arabidopsis Proteins - metabolism
Arabidopsis thaliana
Artificial Intelligence
Asparagine - metabolism
BASIC BIOLOGICAL SCIENCES
Biological Evolution
crystal structure
Databases, Protein
Drosophila melanogaster
Drosophila Proteins - metabolism
Eukaryota
eukaryotic cells
Gain and loss of glycosylation
glycoproteins
Glycoproteins - genetics
Glycoproteins - metabolism
Glycosylation
Humans
Mice
Mus musculus
N-linked glycosylation
N-糖基化
nsSNP
nsSNV
Original Research
Polymorphism, Single Nucleotide
prediction
proline
Protein Processing, Post-Translational
Proteome
Saccharomyces cerevisiae
Saccharomyces cerevisiae Proteins - metabolism
Software
Variation
分析工具
生物进化
真核
结构比较
结构预测
蛋白质构象
连接
title Structure-based Comparative Analysis and Prediction of N-linked Glycosylation Sites in Evolutionarily Distant Eukaryotes
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-28T13%3A16%3A18IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-wanfang_jour_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Structure-based%20Comparative%20Analysis%20and%20Prediction%20of%20N-linked%20Glycosylation%20Sites%20in%20Evolutionarily%20Distant%20Eukaryotes&rft.jtitle=Genomics,%20proteomics%20&%20bioinformatics&rft.au=Lam,%20Phuc%20Vinh%20Nguyen&rft.aucorp=Oak%20Ridge%20Institute%20for%20Science%20and%20Education%20(ORISE),%20Oak%20Ridge,%20TN%20(United%20States)&rft.date=2013-04&rft.volume=11&rft.issue=2&rft.spage=96&rft.epage=104&rft.pages=96-104&rft.issn=1672-0229&rft.eissn=2210-3244&rft_id=info:doi/10.1016/j.gpb.2012.11.003&rft_dat=%3Cwanfang_jour_pubme%3Ejyzdbzzyswxxxb_e201302003%3C/wanfang_jour_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=1372055940&rft_id=info:pmid/23459159&rft_cqvip_id=45557338&rft_wanfj_id=jyzdbzzyswxxxb_e201302003&rft_els_id=S1672022913000272&rfr_iscdi=true