Protein Solvent Accessibility Prediction Using Support Vector Machines and Sequence Conservations

A two-stage method is developed for single-sequence prediction of protein solvent accessibility solely from its amino acid sequence. The first stage classifies each residue in a protein sequence as exposed or buried using a support vector machine (SVM). The features used in the SVM are physico-che...
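
The first stage described above classifies each residue from its own properties together with those of its neighbors. A minimal sketch of how such per-residue feature vectors could be assembled is given below. It is an illustration under stated assumptions, not the authors' implementation: the Kyte-Doolittle hydropathy scale, the window size, the zero padding at the sequence ends, and the function name `window_features` are all choices made for this example, since the record does not say which physico-chemical properties or window the authors used.

```python
# Hypothetical sketch: sliding-window feature vectors for per-residue
# exposed/buried classification. The Kyte-Doolittle hydropathy scale and
# the window size are assumptions; the catalog record does not specify
# the features actually used.

KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_features(sequence, half_window=3, pad_value=0.0):
    """Return one feature vector per residue.

    Each vector holds the hydropathy of the residue and of its
    `half_window` neighbors on either side. Positions past the sequence
    ends, and unknown residue letters, are filled with `pad_value`.
    """
    seq = sequence.upper()
    features = []
    for i in range(len(seq)):
        row = []
        for j in range(i - half_window, i + half_window + 1):
            if 0 <= j < len(seq):
                row.append(KYTE_DOOLITTLE.get(seq[j], pad_value))
            else:
                row.append(pad_value)
        features.append(row)
    return features


if __name__ == "__main__":
    vectors = window_features("MKTAYIAK", half_window=2)
    # One vector per residue, each of length 2 * half_window + 1.
    print(len(vectors), len(vectors[0]))
```

Vectors of this shape could then be passed, together with exposed/buried labels derived from known structures, to any SVM implementation, for example scikit-learn's `sklearn.svm.SVC`.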

Bibliographic details
Main authors:	Oğul, Hasan; Mumcuoğlu, Erkan Ü.
Format:	Book chapter
Language:	eng
Subjects:	Efficient Data Structure; Remote Homology Detection; Solvent Accessibility; Suffix Tree; Support Vector Machine
Online access:	Full text

container_end_page	148
container_issue
container_start_page	141
container_title
container_volume
creator	Oğul, Hasan; Mumcuoğlu, Erkan Ü.
description	A two-stage method is developed for single-sequence prediction of protein solvent accessibility solely from its amino acid sequence. The first stage classifies each residue in a protein sequence as exposed or buried using a support vector machine (SVM). The features used in the SVM are physico-chemical properties of the amino acid to be predicted as well as information from its neighboring residues. The SVM-based predictions are then refined using pairwise conservative patterns, called maximal unique matches (MUMs). The MUMs are identified with an efficient data structure called a suffix tree. The baseline predictions, SVM-based predictions and MUM-based refinements are tested on a nonredundant protein data set, and ~73% prediction accuracy is achieved for a solvent accessibility threshold that provides an even distribution between the buried and exposed classes. The results demonstrate that the new method achieves slightly better accuracy than recent methods using single-sequence prediction.
doi_str_mv	10.1007/11803089_17
format	Book Chapter
fullrecord	<record><control><sourceid>springer</sourceid><recordid>TN_cdi_springer_books_10_1007_11803089_17</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>springer_books_10_1007_11803089_17</sourcerecordid><originalsourceid>FETCH-springer_books_10_1007_11803089_173</originalsourceid><addsrcrecordid>eNqVjz1PwzAURc2XRAqd-ANeGVL84hDbI6qoWJAqpbBarvsAQ2QHP7cS_55WMLAy3eHec6XD2BWIGQihbgC0kEIbC-qITY3S8rYVstMd6GNWQQdQS9maEzb5KRTI7pRVe6apjWrlOZsQvQshGmWairllTgVD5H0adhgLv_MeicI6DKF88WXGTfAlpMifKMRX3m_HMeXCn9GXlPmj828hInEXN7zHzy1Gj3yeImHeuQNHl-zsxQ2E09-8YNeL-9X8oaYx7x8x23VKH2RB2IOg_SMo_7P9BhahUeQ</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>book_chapter</recordtype></control><display><type>book_chapter</type><title>Protein Solvent Accessibility Prediction Using Support Vector Machines and Sequence Conservations</title><source>Springer Books</source><creator>Oğul, Hasan ; Mumcuoğlu, Erkan Ü.</creator><contributor>Savacı, F. Acar</contributor><creatorcontrib>Oğul, Hasan ; Mumcuoğlu, Erkan Ü. ; Savacı, F. Acar</creatorcontrib><description>A two-stage method is developed for the single sequence prediction of protein solvent accessibility from solely its amino acid sequence. The first stage classifies each residue in a protein sequence as exposed or buried using support vector machine (SVM). The features used in the SVM are physico-chemical properties of the amino acid to be predicted as well as the information coming from its neighboring residues. The SVM-based predictions are refined using pairwise conservative patterns, called maximal unique matches (MUMs). The MUMs are identified by an efficient data structure called suffix tree. The baseline predictions, SVM-based predictions and MUM-based refinements are tested on a nonredundant protein data set and ~73% prediction accuracy is achieved for a solvent accessibility threshold that provides an evenly distribution between buried and exposed classes. 
The results demonstrate that the new method achieves slightly better accuracy than recent methods using single sequence prediction.</description><identifier>ISSN: 0302-9743</identifier><identifier>ISBN: 3540367136</identifier><identifier>ISBN: 9783540367130</identifier><identifier>EISSN: 1611-3349</identifier><identifier>EISBN: 9783540368618</identifier><identifier>EISBN: 3540368612</identifier><identifier>DOI: 10.1007/11803089_17</identifier><language>eng</language><publisher>Berlin, Heidelberg: Springer Berlin Heidelberg</publisher><subject>Efficient Data Structure ; Remote Homology Detection ; Solvent Accessibility ; Suffix Tree ; Support Vector Machine</subject><ispartof>Artificial Intelligence and Neural Networks, 2006, p.141-148</ispartof><rights>Springer-Verlag Berlin Heidelberg 2006</rights><woscitedreferencessubscribed>false</woscitedreferencessubscribed><relation>Lecture Notes in Computer Science</relation></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://link.springer.com/content/pdf/10.1007/11803089_17$$EPDF$$P50$$Gspringer$$H</linktopdf><linktohtml>$$Uhttps://link.springer.com/10.1007/11803089_17$$EHTML$$P50$$Gspringer$$H</linktohtml><link.rule.ids>779,780,784,793,27925,38255,41442,42511</link.rule.ids></links><search><contributor>Savacı, F. Acar</contributor><creatorcontrib>Oğul, Hasan</creatorcontrib><creatorcontrib>Mumcuoğlu, Erkan Ü.</creatorcontrib><title>Protein Solvent Accessibility Prediction Using Support Vector Machines and Sequence Conservations</title><title>Artificial Intelligence and Neural Networks</title><description>A two-stage method is developed for the single sequence prediction of protein solvent accessibility from solely its amino acid sequence. The first stage classifies each residue in a protein sequence as exposed or buried using support vector machine (SVM). 
The features used in the SVM are physico-chemical properties of the amino acid to be predicted as well as the information coming from its neighboring residues. The SVM-based predictions are refined using pairwise conservative patterns, called maximal unique matches (MUMs). The MUMs are identified by an efficient data structure called suffix tree. The baseline predictions, SVM-based predictions and MUM-based refinements are tested on a nonredundant protein data set and ~73% prediction accuracy is achieved for a solvent accessibility threshold that provides an evenly distribution between buried and exposed classes. The results demonstrate that the new method achieves slightly better accuracy than recent methods using single sequence prediction.</description><subject>Efficient Data Structure</subject><subject>Remote Homology Detection</subject><subject>Solvent Accessibility</subject><subject>Suffix Tree</subject><subject>Support Vector Machine</subject><issn>0302-9743</issn><issn>1611-3349</issn><isbn>3540367136</isbn><isbn>9783540367130</isbn><isbn>9783540368618</isbn><isbn>3540368612</isbn><fulltext>true</fulltext><rsrctype>book_chapter</rsrctype><creationdate>2006</creationdate><recordtype>book_chapter</recordtype><sourceid/><recordid>eNqVjz1PwzAURc2XRAqd-ANeGVL84hDbI6qoWJAqpbBarvsAQ2QHP7cS_55WMLAy3eHec6XD2BWIGQihbgC0kEIbC-qITY3S8rYVstMd6GNWQQdQS9maEzb5KRTI7pRVe6apjWrlOZsQvQshGmWairllTgVD5H0adhgLv_MeicI6DKF88WXGTfAlpMifKMRX3m_HMeXCn9GXlPmj828hInEXN7zHzy1Gj3yeImHeuQNHl-zsxQ2E09-8YNeL-9X8oaYx7x8x23VKH2RB2IOg_SMo_7P9BhahUeQ</recordid><startdate>2006</startdate><enddate>2006</enddate><creator>Oğul, Hasan</creator><creator>Mumcuoğlu, Erkan Ü.</creator><general>Springer Berlin Heidelberg</general><scope/></search><sort><creationdate>2006</creationdate><title>Protein Solvent Accessibility Prediction Using Support Vector Machines and Sequence Conservations</title><author>Oğul, Hasan ; Mumcuoğlu, Erkan 
Ü.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-springer_books_10_1007_11803089_173</frbrgroupid><rsrctype>book_chapters</rsrctype><prefilter>book_chapters</prefilter><language>eng</language><creationdate>2006</creationdate><topic>Efficient Data Structure</topic><topic>Remote Homology Detection</topic><topic>Solvent Accessibility</topic><topic>Suffix Tree</topic><topic>Support Vector Machine</topic><toplevel>online_resources</toplevel><creatorcontrib>Oğul, Hasan</creatorcontrib><creatorcontrib>Mumcuoğlu, Erkan Ü.</creatorcontrib></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Oğul, Hasan</au><au>Mumcuoğlu, Erkan Ü.</au><au>Savacı, F. Acar</au><format>book</format><genre>bookitem</genre><ristype>CHAP</ristype><atitle>Protein Solvent Accessibility Prediction Using Support Vector Machines and Sequence Conservations</atitle><btitle>Artificial Intelligence and Neural Networks</btitle><seriestitle>Lecture Notes in Computer Science</seriestitle><date>2006</date><risdate>2006</risdate><spage>141</spage><epage>148</epage><pages>141-148</pages><issn>0302-9743</issn><eissn>1611-3349</eissn><isbn>3540367136</isbn><isbn>9783540367130</isbn><eisbn>9783540368618</eisbn><eisbn>3540368612</eisbn><abstract>A two-stage method is developed for the single sequence prediction of protein solvent accessibility from solely its amino acid sequence. The first stage classifies each residue in a protein sequence as exposed or buried using support vector machine (SVM). The features used in the SVM are physico-chemical properties of the amino acid to be predicted as well as the information coming from its neighboring residues. The SVM-based predictions are refined using pairwise conservative patterns, called maximal unique matches (MUMs). The MUMs are identified by an efficient data structure called suffix tree. 
The baseline predictions, SVM-based predictions and MUM-based refinements are tested on a nonredundant protein data set and ~73% prediction accuracy is achieved for a solvent accessibility threshold that provides an evenly distribution between buried and exposed classes. The results demonstrate that the new method achieves slightly better accuracy than recent methods using single sequence prediction.</abstract><cop>Berlin, Heidelberg</cop><pub>Springer Berlin Heidelberg</pub><doi>10.1007/11803089_17</doi></addata></record>
fulltext	fulltext
identifier	ISSN: 0302-9743
ispartof	Artificial Intelligence and Neural Networks, 2006, p.141-148
issn	0302-9743 1611-3349
language	eng
recordid	cdi_springer_books_10_1007_11803089_17
source	Springer Books
subjects	Efficient Data Structure; Remote Homology Detection; Solvent Accessibility; Suffix Tree; Support Vector Machine
title	Protein Solvent Accessibility Prediction Using Support Vector Machines and Sequence Conservations
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-21T09%3A25%3A39IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-springer&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=bookitem&rft.atitle=Protein%20Solvent%20Accessibility%20Prediction%20Using%20Support%20Vector%20Machines%20and%20Sequence%20Conservations&rft.btitle=Artificial%20Intelligence%20and%20Neural%20Networks&rft.au=O%C4%9Ful,%20Hasan&rft.date=2006&rft.spage=141&rft.epage=148&rft.pages=141-148&rft.issn=0302-9743&rft.eissn=1611-3349&rft.isbn=3540367136&rft.isbn_list=9783540367130&rft_id=info:doi/10.1007/11803089_17&rft_dat=%3Cspringer%3Espringer_books_10_1007_11803089_17%3C/springer%3E%3Curl%3E%3C/url%3E&rft.eisbn=9783540368618&rft.eisbn_list=3540368612&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true