PATSIM: Prediction and analysis of protein sequences using hybrid Knuth-Morris Pratt (KMP) and Boyer-Moore (BM) algorithm

In phylogenomic profiling, the genomic context based methods are based on the observation that two or more proteins having the same pattern of presence or absence in many diverse genomes most likely have a functional link. In this research work, a tool (PATSIM) has been developed to predict the prot...

Bibliographic details
Published in: Gene 2018-05, Vol.657, p.50-59
Main authors: Manikandan, P., Ramyachitra, D.
Format: Article
Language: eng
Online access: Full text
container_end_page 59
container_issue
container_start_page 50
container_title Gene
container_volume 657
creator Manikandan, P.
Ramyachitra, D.
description In phylogenomic profiling, genomic-context-based methods rely on the observation that two or more proteins with the same pattern of presence or absence across many diverse genomes most likely have a functional link. In this work, a tool (PATSIM) has been developed to predict protein patterns based on the SOPM tool. The secondary structure of CATH database protein sequences, as predicted by the SOPM (Self Optimized Prediction Method) server, is passed as input to fulfill the following objectives: (i) predict the amino acid pattern using the proposed hybrid KMP and BM algorithm; (ii) predict the physiochemical properties of the protein sequence, such as the Hydrophobic Non-Polar ALKYL, Hydrophobic Non-Polar AROMATIC, Hydrophilic Polar Neutral, Hydrophilic Polar Acidic, and Hydrophilic Polar Basic amino acid groups; (iii) predict the secondary structure of a protein whose structure is unknown; and (iv) analyze the similarity of a protein sequence (structure unknown) to the CATH database. The results indicate that the tool effectively predicts the similarity between sequences and identifies protein patterns for four secondary structural classes, namely Alpha Helix (h), Beta Sheet (e), Turn (t) and Coil (c). The experimental results also indicate that the tool identifies the physiochemical properties of the protein sequence effectively. The source code and documentation for the PATSIM tool are freely available in a public GitHub repository (https://github.com/manimkn89/Protein-Sequence-Analysis).
• To propose a hybrid algorithm to predict the amino acid patterns from the protein sequences.
• To predict the physiochemical properties of protein sequences.
• To predict the secondary structure of protein where the structure of protein sequence is unknown.
• To perform the similarity analysis of protein sequence (structure unknown) with the CATH database.
doi_str_mv 10.1016/j.gene.2018.02.069
format Article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_2010837452</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S0378111918302178</els_id><sourcerecordid>2010837452</sourcerecordid><originalsourceid>FETCH-LOGICAL-c356t-4d2dbd0fcc03c1708052febf28fa57560f06df5fa63f184cec10a059d282b7eb3</originalsourceid><addsrcrecordid>eNp9kEFv0zAYhi0EYt3gD3BAPpZDwmenThzEZZsYTFtFJcbZSuzPras03mwHKf8elw6OWLIs2e_7yN9DyDsGJQNWf9yXWxyx5MBkCbyEun1BFkw2bQFQyZdkAVUjC8ZYe0bOY9xDXkLw1-SMtyIDOCzIvLl8-HG7_kQ3AY3TyfmRdqPJuxvm6CL1lj4Gn9CNNOLThKPGSKfoxi3dzX1wht6NU9oVax9Cjm9ClxJd3q03H_5wrvyMIT_6gHR5tc6Xw9YHl3aHN-SV7YaIb5_PC_Lz5svD9bfi_vvX2-vL-0JXok7FynDTG7BaQ6VZAxIEt9hbLm0nGlGDhdpYYbu6skyuNGoGHYjWcMn7BvvqgixP3DxG_n9M6uCixmHoRvRTVFkfyKpZCZ6j_BTVwccY0KrH4A5dmBUDdVSu9uqo_NiRCrjKynPp_TN_6g9o_lX-Os6Bz6cA5il_OQwqanf0aFxAnZTx7n_83xVFkj4</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2010837452</pqid></control><display><type>article</type><title>PATSIM: Prediction and analysis of protein sequences using hybrid Knuth-Morris Pratt (KMP) and Boyer-Moore (BM) algorithm</title><source>Access via ScienceDirect (Elsevier)</source><creator>Manikandan, P. ; Ramyachitra, D.</creator><creatorcontrib>Manikandan, P. ; Ramyachitra, D.</creatorcontrib><description>In phylogenomic profiling, the genomic context based methods are based on the observation that two or more proteins having the same pattern of presence or absence in many diverse genomes most likely have a functional link. In this research work, a tool (PATSIM) has been developed to predict the protein patterns based on the SOPM tool. 
In this tool, the secondary structure for CATH database protein sequences, predicted by the SOPM (Self Optimized Prediction Method) server is passed as input to fulfill objectives such as, (i) Predict the Amino Acid Pattern using the proposed Hybrid KMP and BM algorithm, (ii) Predict the physiochemical properties such as Hydrophobic Non-Polar ALKYL Amino Acid groups, Hydrophobic Non-Polar AROMATIC Amino Acid groups, Hydrophilic Polar Neutral Amino Acid groups, Hydrophilic Polar Acidic Amino Acid groups and Hydrophilic Polar Basic Amino Acid groups of protein sequence, (iii) Predict the secondary structure of protein where the structure of protein sequence is unknown, and (iv) Similarity analysis of protein sequence (structure unknown) with the CATH database. From the results, it is inferred that this tool effectively predicts the similarity between the sequences and also identifies the protein patterns for four secondary structural classes, namely Alpha Helix (h), Beta Sheet (e), Turn (t) and Coil (c). Based on the experimental results, it is inferred that this tool identifies the physiochemical properties of the protein sequence in an effective manner. The source code and its documentation for the PATSIM tool is freely available in the GitHub public repository (https://github.com/manimkn89/Protein-Sequence-Analysis). 
•To propose a hybrid algorithm to predict the amino acid patterns from the protein sequences.•To predict the physiochemical properties of protein sequences.•To predict the secondary structure of protein where the structure of protein sequence is unknown.•To perform the similarity analysis of protein sequence (structure unknown) with the CATH database.</description><identifier>ISSN: 0378-1119</identifier><identifier>EISSN: 1879-0038</identifier><identifier>DOI: 10.1016/j.gene.2018.02.069</identifier><identifier>PMID: 29501620</identifier><language>eng</language><publisher>Netherlands: Elsevier B.V</publisher><subject>Amino acid patterns ; Boyer-Moore (BM) ; CATH ; Knuth-Morris Pratt (KMP) ; Physiochemical properties ; Protein secondary structure ; Similarity analysis</subject><ispartof>Gene, 2018-05, Vol.657, p.50-59</ispartof><rights>2018 Elsevier B.V.</rights><rights>Copyright © 2017. Published by Elsevier B.V.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c356t-4d2dbd0fcc03c1708052febf28fa57560f06df5fa63f184cec10a059d282b7eb3</citedby><cites>FETCH-LOGICAL-c356t-4d2dbd0fcc03c1708052febf28fa57560f06df5fa63f184cec10a059d282b7eb3</cites><orcidid>0000-0002-7060-6206</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://dx.doi.org/10.1016/j.gene.2018.02.069$$EHTML$$P50$$Gelsevier$$H</linktohtml><link.rule.ids>314,780,784,3550,27924,27925,45995</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/29501620$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Manikandan, P.</creatorcontrib><creatorcontrib>Ramyachitra, D.</creatorcontrib><title>PATSIM: Prediction and analysis of protein sequences using hybrid Knuth-Morris Pratt (KMP) and Boyer-Moore (BM) 
algorithm</title><title>Gene</title><addtitle>Gene</addtitle><description>In phylogenomic profiling, the genomic context based methods are based on the observation that two or more proteins having the same pattern of presence or absence in many diverse genomes most likely have a functional link. In this research work, a tool (PATSIM) has been developed to predict the protein patterns based on the SOPM tool. In this tool, the secondary structure for CATH database protein sequences, predicted by the SOPM (Self Optimized Prediction Method) server is passed as input to fulfill objectives such as, (i) Predict the Amino Acid Pattern using the proposed Hybrid KMP and BM algorithm, (ii) Predict the physiochemical properties such as Hydrophobic Non-Polar ALKYL Amino Acid groups, Hydrophobic Non-Polar AROMATIC Amino Acid groups, Hydrophilic Polar Neutral Amino Acid groups, Hydrophilic Polar Acidic Amino Acid groups and Hydrophilic Polar Basic Amino Acid groups of protein sequence, (iii) Predict the secondary structure of protein where the structure of protein sequence is unknown, and (iv) Similarity analysis of protein sequence (structure unknown) with the CATH database. From the results, it is inferred that this tool effectively predicts the similarity between the sequences and also identifies the protein patterns for four secondary structural classes, namely Alpha Helix (h), Beta Sheet (e), Turn (t) and Coil (c). Based on the experimental results, it is inferred that this tool identifies the physiochemical properties of the protein sequence in an effective manner. The source code and its documentation for the PATSIM tool is freely available in the GitHub public repository (https://github.com/manimkn89/Protein-Sequence-Analysis). 
•To propose a hybrid algorithm to predict the amino acid patterns from the protein sequences.•To predict the physiochemical properties of protein sequences.•To predict the secondary structure of protein where the structure of protein sequence is unknown.•To perform the similarity analysis of protein sequence (structure unknown) with the CATH database.</description><subject>Amino acid patterns</subject><subject>Boyer-Moore (BM)</subject><subject>CATH</subject><subject>Knuth-Morris Pratt (KMP)</subject><subject>Physiochemical properties</subject><subject>Protein secondary structure</subject><subject>Similarity analysis</subject><issn>0378-1119</issn><issn>1879-0038</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2018</creationdate><recordtype>article</recordtype><recordid>eNp9kEFv0zAYhi0EYt3gD3BAPpZDwmenThzEZZsYTFtFJcbZSuzPras03mwHKf8elw6OWLIs2e_7yN9DyDsGJQNWf9yXWxyx5MBkCbyEun1BFkw2bQFQyZdkAVUjC8ZYe0bOY9xDXkLw1-SMtyIDOCzIvLl8-HG7_kQ3AY3TyfmRdqPJuxvm6CL1lj4Gn9CNNOLThKPGSKfoxi3dzX1wht6NU9oVax9Cjm9ClxJd3q03H_5wrvyMIT_6gHR5tc6Xw9YHl3aHN-SV7YaIb5_PC_Lz5svD9bfi_vvX2-vL-0JXok7FynDTG7BaQ6VZAxIEt9hbLm0nGlGDhdpYYbu6skyuNGoGHYjWcMn7BvvqgixP3DxG_n9M6uCixmHoRvRTVFkfyKpZCZ6j_BTVwccY0KrH4A5dmBUDdVSu9uqo_NiRCrjKynPp_TN_6g9o_lX-Os6Bz6cA5il_OQwqanf0aFxAnZTx7n_83xVFkj4</recordid><startdate>20180530</startdate><enddate>20180530</enddate><creator>Manikandan, P.</creator><creator>Ramyachitra, D.</creator><general>Elsevier B.V</general><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><orcidid>https://orcid.org/0000-0002-7060-6206</orcidid></search><sort><creationdate>20180530</creationdate><title>PATSIM: Prediction and analysis of protein sequences using hybrid Knuth-Morris Pratt (KMP) and Boyer-Moore (BM) algorithm</title><author>Manikandan, P. 
; Ramyachitra, D.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c356t-4d2dbd0fcc03c1708052febf28fa57560f06df5fa63f184cec10a059d282b7eb3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2018</creationdate><topic>Amino acid patterns</topic><topic>Boyer-Moore (BM)</topic><topic>CATH</topic><topic>Knuth-Morris Pratt (KMP)</topic><topic>Physiochemical properties</topic><topic>Protein secondary structure</topic><topic>Similarity analysis</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Manikandan, P.</creatorcontrib><creatorcontrib>Ramyachitra, D.</creatorcontrib><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><jtitle>Gene</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Manikandan, P.</au><au>Ramyachitra, D.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>PATSIM: Prediction and analysis of protein sequences using hybrid Knuth-Morris Pratt (KMP) and Boyer-Moore (BM) algorithm</atitle><jtitle>Gene</jtitle><addtitle>Gene</addtitle><date>2018-05-30</date><risdate>2018</risdate><volume>657</volume><spage>50</spage><epage>59</epage><pages>50-59</pages><issn>0378-1119</issn><eissn>1879-0038</eissn><abstract>In phylogenomic profiling, the genomic context based methods are based on the observation that two or more proteins having the same pattern of presence or absence in many diverse genomes most likely have a functional link. In this research work, a tool (PATSIM) has been developed to predict the protein patterns based on the SOPM tool. 
In this tool, the secondary structure for CATH database protein sequences, predicted by the SOPM (Self Optimized Prediction Method) server is passed as input to fulfill objectives such as, (i) Predict the Amino Acid Pattern using the proposed Hybrid KMP and BM algorithm, (ii) Predict the physiochemical properties such as Hydrophobic Non-Polar ALKYL Amino Acid groups, Hydrophobic Non-Polar AROMATIC Amino Acid groups, Hydrophilic Polar Neutral Amino Acid groups, Hydrophilic Polar Acidic Amino Acid groups and Hydrophilic Polar Basic Amino Acid groups of protein sequence, (iii) Predict the secondary structure of protein where the structure of protein sequence is unknown, and (iv) Similarity analysis of protein sequence (structure unknown) with the CATH database. From the results, it is inferred that this tool effectively predicts the similarity between the sequences and also identifies the protein patterns for four secondary structural classes, namely Alpha Helix (h), Beta Sheet (e), Turn (t) and Coil (c). Based on the experimental results, it is inferred that this tool identifies the physiochemical properties of the protein sequence in an effective manner. The source code and its documentation for the PATSIM tool is freely available in the GitHub public repository (https://github.com/manimkn89/Protein-Sequence-Analysis). •To propose a hybrid algorithm to predict the amino acid patterns from the protein sequences.•To predict the physiochemical properties of protein sequences.•To predict the secondary structure of protein where the structure of protein sequence is unknown.•To perform the similarity analysis of protein sequence (structure unknown) with the CATH database.</abstract><cop>Netherlands</cop><pub>Elsevier B.V</pub><pmid>29501620</pmid><doi>10.1016/j.gene.2018.02.069</doi><tpages>10</tpages><orcidid>https://orcid.org/0000-0002-7060-6206</orcidid></addata></record>
fulltext fulltext
identifier ISSN: 0378-1119
ispartof Gene, 2018-05, Vol.657, p.50-59
issn 0378-1119
1879-0038
language eng
recordid cdi_proquest_miscellaneous_2010837452
source Access via ScienceDirect (Elsevier)
subjects Amino acid patterns
Boyer-Moore (BM)
CATH
Knuth-Morris Pratt (KMP)
Physiochemical properties
Protein secondary structure
Similarity analysis
title PATSIM: Prediction and analysis of protein sequences using hybrid Knuth-Morris Pratt (KMP) and Boyer-Moore (BM) algorithm
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-19T07%3A34%3A49IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=PATSIM:%20Prediction%20and%20analysis%20of%20protein%20sequences%20using%20hybrid%20Knuth-Morris%20Pratt%20(KMP)%20and%20Boyer-Moore%20(BM)%20algorithm&rft.jtitle=Gene&rft.au=Manikandan,%20P.&rft.date=2018-05-30&rft.volume=657&rft.spage=50&rft.epage=59&rft.pages=50-59&rft.issn=0378-1119&rft.eissn=1879-0038&rft_id=info:doi/10.1016/j.gene.2018.02.069&rft_dat=%3Cproquest_cross%3E2010837452%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2010837452&rft_id=info:pmid/29501620&rft_els_id=S0378111918302178&rfr_iscdi=true