Binary coding, mRNA information and protein structure

We describe a new binary algorithm for the prediction of alpha and beta protein folding types from RNA, DNA and amino acid sequences. The method enables quick, simple and accurate prediction of alpha and beta protein folds on a personal computer by means of a few binary patterns of coded amino acid and...

Bibliographic details
Published in:	26th International Conference on Information Technology Interfaces, 2004, 2004-01, Vol.12 (2), p.53-61 Vol.1
Main authors:	Stambuk, N.; Konjevoda, P.; Gotovac, N.
Format:	Article
Language:	eng
Subjects:	Amino acids ; Analytical biochemistry: general aspects, technics, instrumentation ; Analytical, structural and metabolic biochemistry ; Applied sciences ; Biological and medical sciences ; Chemical structures ; Classification tree analysis ; Computer applications ; Computer science; control theory; systems ; Computer systems and distributed systems. User interface ; DNA ; Exact sciences and technology ; Fundamental and applied biological sciences. Psychology ; Genetics ; Information systems. Data bases ; Memory organisation. Data processing ; Microcomputers ; Prediction algorithms ; Proteins ; RNA ; Sequences ; Software ; Testing

container_end_page	61 Vol.1
container_issue	2
container_start_page	53
container_title	26th International Conference on Information Technology Interfaces, 2004
container_volume	12
creator	Stambuk, N. ; Konjevoda, P. ; Gotovac, N.
description	We describe a new binary algorithm for the prediction of alpha and beta protein folding types from RNA, DNA and amino acid sequences. The method enables quick, simple and accurate prediction of alpha and beta protein folds on a personal computer by means of a few binary patterns of coded amino acid and nucleotide physicochemical properties. The algorithm was tested with the machine learning SMO (sequential minimal optimisation) classifier for support vector machines and with classification trees, on a dataset of 140 dissimilar protein folds. Depending on the method of testing, the overall classification accuracy was 91.43% to 100% and the tenfold cross-validation result of the procedure was 83.57% to >90%. Genetic code randomisation analysis, based on 100,000 different codes tested for protein fold prediction quality, indicated that: a) there is a very low chance (p = 2.7 × 10⁻⁴) that a code better than the natural one specified by the binary coding algorithm is randomly produced; b) dipeptides represent basic protein units with respect to the natural genetic code's definition of protein secondary structure.
doi_str_mv	10.2498/cit.2004.02.02
format	Article
fullrecord	<record><control><sourceid>proquest_6IE</sourceid><recordid>TN_cdi_hrcak_primary_oai_hrcak_srce_hr_44722</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>1372374</ieee_id><sourcerecordid>57582582</sourcerecordid><originalsourceid>FETCH-LOGICAL-c387t-bdfea833a7895c75481a7707a781f257aa77979438ead22ba29d3cd58a87774c3</originalsourceid><addsrcrecordid>eNqFkMtPxCAQxomPxPVx9eKlFz3ZlTLQgeNqfCVGE6NnMlKq6G6r0D3438umxj1KvgSG-c2XycfYYcWnQhp95sIwFZzLKRdZG2xSaVmXYLjeZLtGgamxNqbayg0AXlYV1DvsIKV3ng8YJThOmDoPHcXvwvVN6F5Pi8Xj_awIXdvHBQ2h7wrqmuIz9oMPXZGGuHTDMvp9tt3SPPmD33uPPV9dPl3clHcP17cXs7vSgcahfGlaTxqAUBvlUEldESLHXFetUEi5MmgkaE-NEC8kTAOuUZo0IkoHe-x09H2Ljj7sZwyLvKztKdjxJ0Xn89NKiUJk_GTE88JfS58GuwjJ-fmcOt8vk1WotMjK4HQEXexTir79s664XYVrc7h2Fa7lIisPHP86U3I0byN1LqT1VA214SD_57JlXZvMHY1c8N6v24ACUMIPXEOMZQ</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>57582582</pqid></control><display><type>article</type><title>Binary coding, mRNA information and protein structure</title><source>IEEE Electronic Library (IEL) Conference Proceedings</source><creator>Stambuk, N. ; Konjevoda, P. ; Gotovac, N.</creator><creatorcontrib>Stambuk, N. ; Konjevoda, P. ; Gotovac, N.</creatorcontrib><description>We describe new binary algorithm for the prediction of alpha and beta protein folding types from RNA, DNA and amino acid sequences. The method enables quick, simple and accurate prediction of alpha and beta protein folds on a personal computer by means of few binary patterns of coded amino acid and nucleotide physicochemical properties. The algorithm was tested with machine learning SMO (sequential minimal optimisation) classifier for the support vector machines and classification trees, on a dataset of 140 dissimilar protein folds. Depending on the method of testing, the overall classification accuracy was 91.43%-100% and the tenfold cross-validation result of the procedure was 83.57%->90%. 
Genetic code randomisation analysis based on 100,000 different codes tested for the protein fold prediction quality indicated that: a) there is a very low chance of p=2.7times10 -4 that a better code than the natural one specified by the binary coding algorithm is randomly produced, b) dipeptides represent basic protein units with respect to the natural genetic code defining of the secondary protein structure</description><identifier>ISSN: 1330-1136</identifier><identifier>ISBN: 9539676991</identifier><identifier>ISBN: 9789539676993</identifier><identifier>EISSN: 1846-3908</identifier><identifier>DOI: 10.2498/cit.2004.02.02</identifier><identifier>CODEN: CJCTEM</identifier><language>eng</language><publisher>Zagreb: IEEE</publisher><subject>Amino acids ; Analytical biochemistry: general aspects, technics, instrumentation ; Analytical, structural and metabolic biochemistry ; Applied sciences ; Biological and medical sciences ; Chemical structures ; Classification tree analysis ; Computer applications ; Computer science; control theory; systems ; Computer systems and distributed systems. User interface ; DNA ; Exact sciences and technology ; Fundamental and applied biological sciences. Psychology ; Genetics ; Information systems. Data bases ; Memory organisation. 
Data processing ; Microcomputers ; Prediction algorithms ; Proteins ; RNA ; Sequences ; Software ; Testing</subject><ispartof>26th International Conference on Information Technology Interfaces, 2004, 2004-01, Vol.12 (2), p.53-61 Vol.1</ispartof><rights>2004 INIST-CNRS</rights><rights>2005 INIST-CNRS</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c387t-bdfea833a7895c75481a7707a781f257aa77979438ead22ba29d3cd58a87774c3</citedby></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/1372374$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>230,309,310,314,780,784,789,790,885,2058,27924,27925,54920</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/1372374$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=16004669$$DView record in Pascal Francis$$Hfree_for_read</backlink><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=16369034$$DView record in Pascal Francis$$Hfree_for_read</backlink></links><search><creatorcontrib>Stambuk, N.</creatorcontrib><creatorcontrib>Konjevoda, P.</creatorcontrib><creatorcontrib>Gotovac, N.</creatorcontrib><title>Binary coding, mRNA information and protein structure</title><title>26th International Conference on Information Technology Interfaces, 2004</title><addtitle>ITI</addtitle><description>We describe new binary algorithm for the prediction of alpha and beta protein folding types from RNA, DNA and amino acid sequences. The method enables quick, simple and accurate prediction of alpha and beta protein folds on a personal computer by means of few binary patterns of coded amino acid and nucleotide physicochemical properties. 
The algorithm was tested with machine learning SMO (sequential minimal optimisation) classifier for the support vector machines and classification trees, on a dataset of 140 dissimilar protein folds. Depending on the method of testing, the overall classification accuracy was 91.43%-100% and the tenfold cross-validation result of the procedure was 83.57%->90%. Genetic code randomisation analysis based on 100,000 different codes tested for the protein fold prediction quality indicated that: a) there is a very low chance of p=2.7times10 -4 that a better code than the natural one specified by the binary coding algorithm is randomly produced, b) dipeptides represent basic protein units with respect to the natural genetic code defining of the secondary protein structure</description><subject>Amino acids</subject><subject>Analytical biochemistry: general aspects, technics, instrumentation</subject><subject>Analytical, structural and metabolic biochemistry</subject><subject>Applied sciences</subject><subject>Biological and medical sciences</subject><subject>Chemical structures</subject><subject>Classification tree analysis</subject><subject>Computer applications</subject><subject>Computer science; control theory; systems</subject><subject>Computer systems and distributed systems. User interface</subject><subject>DNA</subject><subject>Exact sciences and technology</subject><subject>Fundamental and applied biological sciences. Psychology</subject><subject>Genetics</subject><subject>Information systems. Data bases</subject><subject>Memory organisation. 
Data processing</subject><subject>Microcomputers</subject><subject>Prediction algorithms</subject><subject>Proteins</subject><subject>RNA</subject><subject>Sequences</subject><subject>Software</subject><subject>Testing</subject><issn>1330-1136</issn><issn>1846-3908</issn><isbn>9539676991</isbn><isbn>9789539676993</isbn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2004</creationdate><recordtype>article</recordtype><sourceid>6IE</sourceid><sourceid>RIE</sourceid><recordid>eNqFkMtPxCAQxomPxPVx9eKlFz3ZlTLQgeNqfCVGE6NnMlKq6G6r0D3438umxj1KvgSG-c2XycfYYcWnQhp95sIwFZzLKRdZG2xSaVmXYLjeZLtGgamxNqbayg0AXlYV1DvsIKV3ng8YJThOmDoPHcXvwvVN6F5Pi8Xj_awIXdvHBQ2h7wrqmuIz9oMPXZGGuHTDMvp9tt3SPPmD33uPPV9dPl3clHcP17cXs7vSgcahfGlaTxqAUBvlUEldESLHXFetUEi5MmgkaE-NEC8kTAOuUZo0IkoHe-x09H2Ljj7sZwyLvKztKdjxJ0Xn89NKiUJk_GTE88JfS58GuwjJ-fmcOt8vk1WotMjK4HQEXexTir79s664XYVrc7h2Fa7lIisPHP86U3I0byN1LqT1VA214SD_57JlXZvMHY1c8N6v24ACUMIPXEOMZQ</recordid><startdate>20040101</startdate><enddate>20040101</enddate><creator>Stambuk, N.</creator><creator>Konjevoda, P.</creator><creator>Gotovac, N.</creator><general>IEEE</general><general>University Computing Centre</general><general>Fakultet elektrotehnike i računarstva Sveučilišta u Zagrebu</general><scope>6IE</scope><scope>6IL</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIL</scope><scope>IQODW</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>E3H</scope><scope>F2A</scope><scope>VP8</scope></search><sort><creationdate>20040101</creationdate><title>Binary coding, mRNA information and protein structure</title><author>Stambuk, N. ; Konjevoda, P. 
; Gotovac, N.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c387t-bdfea833a7895c75481a7707a781f257aa77979438ead22ba29d3cd58a87774c3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2004</creationdate><topic>Amino acids</topic><topic>Analytical biochemistry: general aspects, technics, instrumentation</topic><topic>Analytical, structural and metabolic biochemistry</topic><topic>Applied sciences</topic><topic>Biological and medical sciences</topic><topic>Chemical structures</topic><topic>Classification tree analysis</topic><topic>Computer applications</topic><topic>Computer science; control theory; systems</topic><topic>Computer systems and distributed systems. User interface</topic><topic>DNA</topic><topic>Exact sciences and technology</topic><topic>Fundamental and applied biological sciences. Psychology</topic><topic>Genetics</topic><topic>Information systems. Data bases</topic><topic>Memory organisation. 
Data processing</topic><topic>Microcomputers</topic><topic>Prediction algorithms</topic><topic>Proteins</topic><topic>RNA</topic><topic>Sequences</topic><topic>Software</topic><topic>Testing</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Stambuk, N.</creatorcontrib><creatorcontrib>Konjevoda, P.</creatorcontrib><creatorcontrib>Gotovac, N.</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Xplore</collection><collection>IEEE Proceedings Order Plans (POP All) 1998-Present</collection><collection>Pascal-Francis</collection><collection>CrossRef</collection><collection>Library & Information Sciences Abstracts (LISA)</collection><collection>Library & Information Science Abstracts (LISA)</collection><collection>Hrcak: Portal of scientific journals of Croatia</collection><jtitle>26th International Conference on Information Technology Interfaces, 2004</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Stambuk, N.</au><au>Konjevoda, P.</au><au>Gotovac, N.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Binary coding, mRNA information and protein structure</atitle><jtitle>26th International Conference on Information Technology Interfaces, 2004</jtitle><stitle>ITI</stitle><date>2004-01-01</date><risdate>2004</risdate><volume>12</volume><issue>2</issue><spage>53</spage><epage>61 Vol.1</epage><pages>53-61 Vol.1</pages><issn>1330-1136</issn><eissn>1846-3908</eissn><isbn>9539676991</isbn><isbn>9789539676993</isbn><coden>CJCTEM</coden><abstract>We describe new binary algorithm for the prediction of alpha and beta protein folding types from RNA, DNA and amino acid sequences. 
The method enables quick, simple and accurate prediction of alpha and beta protein folds on a personal computer by means of few binary patterns of coded amino acid and nucleotide physicochemical properties. The algorithm was tested with machine learning SMO (sequential minimal optimisation) classifier for the support vector machines and classification trees, on a dataset of 140 dissimilar protein folds. Depending on the method of testing, the overall classification accuracy was 91.43%-100% and the tenfold cross-validation result of the procedure was 83.57%->90%. Genetic code randomisation analysis based on 100,000 different codes tested for the protein fold prediction quality indicated that: a) there is a very low chance of p=2.7times10 -4 that a better code than the natural one specified by the binary coding algorithm is randomly produced, b) dipeptides represent basic protein units with respect to the natural genetic code defining of the secondary protein structure</abstract><cop>Zagreb</cop><pub>IEEE</pub><doi>10.2498/cit.2004.02.02</doi><tpages>9</tpages><oa>free_for_read</oa></addata></record>
fulltext	fulltext_linktorsrc
identifier	ISSN: 1330-1136
ispartof	26th International Conference on Information Technology Interfaces, 2004, 2004-01, Vol.12 (2), p.53-61 Vol.1
issn	1330-1136 1846-3908
language	eng
recordid	cdi_hrcak_primary_oai_hrcak_srce_hr_44722
source	IEEE Electronic Library (IEL) Conference Proceedings
subjects	Amino acids ; Analytical biochemistry: general aspects, technics, instrumentation ; Analytical, structural and metabolic biochemistry ; Applied sciences ; Biological and medical sciences ; Chemical structures ; Classification tree analysis ; Computer applications ; Computer science; control theory; systems ; Computer systems and distributed systems. User interface ; DNA ; Exact sciences and technology ; Fundamental and applied biological sciences. Psychology ; Genetics ; Information systems. Data bases ; Memory organisation. Data processing ; Microcomputers ; Prediction algorithms ; Proteins ; RNA ; Sequences ; Software ; Testing
title	Binary coding, mRNA information and protein structure
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-24T13%3A46%3A44IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Binary%20coding,%20mRNA%20information%20and%20protein%20structure&rft.jtitle=26th%20International%20Conference%20on%20Information%20Technology%20Interfaces,%202004&rft.au=Stambuk,%20N.&rft.date=2004-01-01&rft.volume=12&rft.issue=2&rft.spage=53&rft.epage=61%20Vol.1&rft.pages=53-61%20Vol.1&rft.issn=1330-1136&rft.eissn=1846-3908&rft.isbn=9539676991&rft.isbn_list=9789539676993&rft.coden=CJCTEM&rft_id=info:doi/10.2498/cit.2004.02.02&rft_dat=%3Cproquest_6IE%3E57582582%3C/proquest_6IE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=57582582&rft_id=info:pmid/&rft_ieee_id=1372374&rfr_iscdi=true