PSICalc: a novel approach to identifying and ranking critical non-proximal interdependencies within the overall protein structure

Motivation: AlphaFold has been a major advance in predicting protein structure, but still leaves the problem of determining which sub-molecular components of a protein are essential for it to carry out its function within the cell. Direct coupling analysis predicts two- and three-amino acid contacts,...

Detailed description

Bibliographic details
Published in: Bioinformatics advances 2022, Vol.2 (1), p.vbac058-vbac058
Main authors: Townsley, Thomas D; Wilson, James T; Akers, Harrison; Bryant, Timothy; Cordova, Salvador; Wallace, T L; Durston, Kirk K; Deweese, Joseph E
Format: Article
Language: eng
Subjects:
Online access: Full text
container_end_page vbac058
container_issue 1
container_start_page vbac058
container_title Bioinformatics advances
container_volume 2
creator Townsley, Thomas D
Wilson, James T
Akers, Harrison
Bryant, Timothy
Cordova, Salvador
Wallace, T L
Durston, Kirk K
Deweese, Joseph E
description Motivation: AlphaFold has been a major advance in predicting protein structure, but still leaves the problem of determining which sub-molecular components of a protein are essential for it to carry out its function within the cell. Direct coupling analysis predicts two- and three-amino acid contacts, but there may be essential interdependencies that are not proximal within the 3D structure. The problem to be addressed is to design a computational method that locates and ranks essential non-proximal interdependencies within a protein involving five or more amino acids, using large multiple sequence alignments (MSAs) for both globular and intrinsically unstructured proteins. Results: We developed PSICalc (Protein Subdomain Interdependency Calculator), a laptop-friendly, pattern-discovery bioinformatics software tool that analyzes large MSAs for both structured and unstructured proteins, locates both proximal and non-proximal interdependent sites, clusters them into pairwise (second-order), third-order and higher-order clusters using a k-modes approach, and provides ranked results within minutes. To aid in visualizing these interdependencies, we developed a graphical user interface that displays these subdomain relationships as a polytree graph. To demonstrate, we provide examples of both proximal and non-proximal interdependencies documented for eukaryotic topoisomerase II, including between the unstructured C-terminal domain and the N-terminal domain. Availability and implementation: https://github.com/jdeweeselab/psicalc-package Supplementary information: Supplementary data are available at Bioinformatics Advances online.
doi_str_mv 10.1093/bioadv/vbac058
format Article
fullrecord <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_9710643</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><oup_id>10.1093/bioadv/vbac058</oup_id><sourcerecordid>2769994606</sourcerecordid><originalsourceid>FETCH-LOGICAL-c424t-925368eed842d1fba29b8eca18b9961beb7993c9a2adeac224c7d52321eccbf93</originalsourceid><addsrcrecordid>eNqFkTtvFDEUhS0EIlFIS4lcQjGJX-sZUyChFYRIkUACasuPO1nDrD3Yng0p-edxtEsUKipf298999gHoZeUnFGi-LkNyfjd-c4aR1bDE3TMJF91hAj69FF9hE5L-UEIYX0vqeDP0RGXUilBxDH68-Xr5dpM7i02OKYdTNjMc07GbXBNOHiINYy3IV5jEz3OJv68r10ONTgztZbYNfx32LZNiBWyhxlia3MBCr4JdRMirhvATTubacKNrtDOSs2Lq0uGF-jZaKYCp4f1BH3_-OHb-lN39fnicv3-qnOCidoptuJyAPCDYJ6O1jBlB3CGDlYpSS3YXinulGHGg3GMCdf7FeOMgnN2VPwEvdvrzovdgnftZc2QnnPznm91MkH_exPDRl-nnVY9JVLwJvD6IJDTrwVK1dtQHEyTiZCWolnfPlUJSWRDz_aoy6mUDOPDGEr0fXR6H50-RNcaXj0294D_DaoBb_ZAWub_id0BCW-qaA</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2769994606</pqid></control><display><type>article</type><title>PSICalc: a novel approach to identifying and ranking critical non-proximal interdependencies within the overall protein structure</title><source>Oxford Journals Open Access Collection</source><source>DOAJ Directory of Open Access Journals</source><source>EZB-FREE-00999 freely available EZB journals</source><source>PubMed Central</source><creator>Townsley, Thomas D ; Wilson, James T ; Akers, Harrison ; Bryant, Timothy ; Cordova, Salvador ; Wallace, T L ; Durston, Kirk K ; Deweese, Joseph E</creator><creatorcontrib>Townsley, Thomas D ; Wilson, James T ; Akers, Harrison ; Bryant, Timothy ; Cordova, Salvador ; Wallace, T L ; Durston, Kirk K ; Deweese, Joseph E</creatorcontrib><description>Motivation AlphaFold has been a major advance in predicting protein structure, but still leaves the problem of determining which sub-molecular components of a protein are essential for it to carry 
out its function within the cell. Direct coupling analysis predicts two- and three-amino acid contacts, but there may be essential interdependencies that are not proximal within the 3D structure. The problem to be addressed is to design a computational method that locates and ranks essential non-proximal interdependencies within a protein involving five or more amino acids, using large, multiple sequence alignments (MSAs) for both globular and intrinsically unstructured proteins. Results We developed PSICalc (Protein Subdomain Interdependency Calculator), a laptop-friendly, pattern-discovery, bioinformatics software tool that analyzes large MSAs for both structured and unstructured proteins, locates both proximal and non-proximal inter-dependent sites, and clusters them into pairwise (second order), third-order and higher-order clusters using a k-modes approach, and provides ranked results within minutes. To aid in visualizing these interdependencies, we developed a graphical user interface that displays these subdomain relationships as a polytree graph. To demonstrate, we provide examples of both proximal and non-proximal interdependencies documented for eukaryotic topoisomerase II including between the unstructured C-terminal domain and the N-terminal domain. Availability and implementation https://github.com/jdeweeselab/psicalc-package Supplementary information Supplementary data are available at Bioinformatics Advances online.</description><identifier>ISSN: 2635-0041</identifier><identifier>EISSN: 2635-0041</identifier><identifier>DOI: 10.1093/bioadv/vbac058</identifier><identifier>PMID: 36699404</identifier><language>eng</language><publisher>England: Oxford University Press</publisher><subject>Original Paper</subject><ispartof>Bioinformatics advances, 2022, Vol.2 (1), p.vbac058-vbac058</ispartof><rights>The Author(s) 2022. Published by Oxford University Press. 2022</rights><rights>The Author(s) 2022. 
Published by Oxford University Press.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c424t-925368eed842d1fba29b8eca18b9961beb7993c9a2adeac224c7d52321eccbf93</citedby><cites>FETCH-LOGICAL-c424t-925368eed842d1fba29b8eca18b9961beb7993c9a2adeac224c7d52321eccbf93</cites><orcidid>0000-0001-9683-6723 ; 0000-0002-0877-0523</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC9710643/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC9710643/$$EHTML$$P50$$Gpubmedcentral$$Hfree_for_read</linktohtml><link.rule.ids>230,314,723,776,780,860,881,1598,4010,27900,27901,27902,53766,53768</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/36699404$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Townsley, Thomas D</creatorcontrib><creatorcontrib>Wilson, James T</creatorcontrib><creatorcontrib>Akers, Harrison</creatorcontrib><creatorcontrib>Bryant, Timothy</creatorcontrib><creatorcontrib>Cordova, Salvador</creatorcontrib><creatorcontrib>Wallace, T L</creatorcontrib><creatorcontrib>Durston, Kirk K</creatorcontrib><creatorcontrib>Deweese, Joseph E</creatorcontrib><title>PSICalc: a novel approach to identifying and ranking critical non-proximal interdependencies within the overall protein structure</title><title>Bioinformatics advances</title><addtitle>Bioinform Adv</addtitle><description>Motivation AlphaFold has been a major advance in predicting protein structure, but still leaves the problem of determining which sub-molecular components of a protein are essential for it to carry out its function within the cell. 
Direct coupling analysis predicts two- and three-amino acid contacts, but there may be essential interdependencies that are not proximal within the 3D structure. The problem to be addressed is to design a computational method that locates and ranks essential non-proximal interdependencies within a protein involving five or more amino acids, using large, multiple sequence alignments (MSAs) for both globular and intrinsically unstructured proteins. Results We developed PSICalc (Protein Subdomain Interdependency Calculator), a laptop-friendly, pattern-discovery, bioinformatics software tool that analyzes large MSAs for both structured and unstructured proteins, locates both proximal and non-proximal inter-dependent sites, and clusters them into pairwise (second order), third-order and higher-order clusters using a k-modes approach, and provides ranked results within minutes. To aid in visualizing these interdependencies, we developed a graphical user interface that displays these subdomain relationships as a polytree graph. To demonstrate, we provide examples of both proximal and non-proximal interdependencies documented for eukaryotic topoisomerase II including between the unstructured C-terminal domain and the N-terminal domain. 
Availability and implementation https://github.com/jdeweeselab/psicalc-package Supplementary information Supplementary data are available at Bioinformatics Advances online.</description><subject>Original Paper</subject><issn>2635-0041</issn><issn>2635-0041</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><sourceid>TOX</sourceid><recordid>eNqFkTtvFDEUhS0EIlFIS4lcQjGJX-sZUyChFYRIkUACasuPO1nDrD3Yng0p-edxtEsUKipf298999gHoZeUnFGi-LkNyfjd-c4aR1bDE3TMJF91hAj69FF9hE5L-UEIYX0vqeDP0RGXUilBxDH68-Xr5dpM7i02OKYdTNjMc07GbXBNOHiINYy3IV5jEz3OJv68r10ONTgztZbYNfx32LZNiBWyhxlia3MBCr4JdRMirhvATTubacKNrtDOSs2Lq0uGF-jZaKYCp4f1BH3_-OHb-lN39fnicv3-qnOCidoptuJyAPCDYJ6O1jBlB3CGDlYpSS3YXinulGHGg3GMCdf7FeOMgnN2VPwEvdvrzovdgnftZc2QnnPznm91MkH_exPDRl-nnVY9JVLwJvD6IJDTrwVK1dtQHEyTiZCWolnfPlUJSWRDz_aoy6mUDOPDGEr0fXR6H50-RNcaXj0294D_DaoBb_ZAWub_id0BCW-qaA</recordid><startdate>2022</startdate><enddate>2022</enddate><creator>Townsley, Thomas D</creator><creator>Wilson, James T</creator><creator>Akers, Harrison</creator><creator>Bryant, Timothy</creator><creator>Cordova, Salvador</creator><creator>Wallace, T L</creator><creator>Durston, Kirk K</creator><creator>Deweese, Joseph E</creator><general>Oxford University Press</general><scope>TOX</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><scope>5PM</scope><orcidid>https://orcid.org/0000-0001-9683-6723</orcidid><orcidid>https://orcid.org/0000-0002-0877-0523</orcidid></search><sort><creationdate>2022</creationdate><title>PSICalc: a novel approach to identifying and ranking critical non-proximal interdependencies within the overall protein structure</title><author>Townsley, Thomas D ; Wilson, James T ; Akers, Harrison ; Bryant, Timothy ; Cordova, Salvador ; Wallace, T L ; Durston, Kirk K ; Deweese, Joseph 
E</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c424t-925368eed842d1fba29b8eca18b9961beb7993c9a2adeac224c7d52321eccbf93</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Original Paper</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Townsley, Thomas D</creatorcontrib><creatorcontrib>Wilson, James T</creatorcontrib><creatorcontrib>Akers, Harrison</creatorcontrib><creatorcontrib>Bryant, Timothy</creatorcontrib><creatorcontrib>Cordova, Salvador</creatorcontrib><creatorcontrib>Wallace, T L</creatorcontrib><creatorcontrib>Durston, Kirk K</creatorcontrib><creatorcontrib>Deweese, Joseph E</creatorcontrib><collection>Oxford Journals Open Access Collection</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Bioinformatics advances</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Townsley, Thomas D</au><au>Wilson, James T</au><au>Akers, Harrison</au><au>Bryant, Timothy</au><au>Cordova, Salvador</au><au>Wallace, T L</au><au>Durston, Kirk K</au><au>Deweese, Joseph E</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>PSICalc: a novel approach to identifying and ranking critical non-proximal interdependencies within the overall protein structure</atitle><jtitle>Bioinformatics advances</jtitle><addtitle>Bioinform Adv</addtitle><date>2022</date><risdate>2022</risdate><volume>2</volume><issue>1</issue><spage>vbac058</spage><epage>vbac058</epage><pages>vbac058-vbac058</pages><issn>2635-0041</issn><eissn>2635-0041</eissn><abstract>Motivation AlphaFold has been a major advance in predicting protein structure, but still leaves the problem of determining which sub-molecular 
components of a protein are essential for it to carry out its function within the cell. Direct coupling analysis predicts two- and three-amino acid contacts, but there may be essential interdependencies that are not proximal within the 3D structure. The problem to be addressed is to design a computational method that locates and ranks essential non-proximal interdependencies within a protein involving five or more amino acids, using large, multiple sequence alignments (MSAs) for both globular and intrinsically unstructured proteins. Results We developed PSICalc (Protein Subdomain Interdependency Calculator), a laptop-friendly, pattern-discovery, bioinformatics software tool that analyzes large MSAs for both structured and unstructured proteins, locates both proximal and non-proximal inter-dependent sites, and clusters them into pairwise (second order), third-order and higher-order clusters using a k-modes approach, and provides ranked results within minutes. To aid in visualizing these interdependencies, we developed a graphical user interface that displays these subdomain relationships as a polytree graph. To demonstrate, we provide examples of both proximal and non-proximal interdependencies documented for eukaryotic topoisomerase II including between the unstructured C-terminal domain and the N-terminal domain. Availability and implementation https://github.com/jdeweeselab/psicalc-package Supplementary information Supplementary data are available at Bioinformatics Advances online.</abstract><cop>England</cop><pub>Oxford University Press</pub><pmid>36699404</pmid><doi>10.1093/bioadv/vbac058</doi><orcidid>https://orcid.org/0000-0001-9683-6723</orcidid><orcidid>https://orcid.org/0000-0002-0877-0523</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 2635-0041
ispartof Bioinformatics advances, 2022, Vol.2 (1), p.vbac058-vbac058
issn 2635-0041
2635-0041
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_9710643
source Oxford Journals Open Access Collection; DOAJ Directory of Open Access Journals; EZB-FREE-00999 freely available EZB journals; PubMed Central
subjects Original Paper
title PSICalc: a novel approach to identifying and ranking critical non-proximal interdependencies within the overall protein structure
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-05T12%3A40%3A46IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=PSICalc:%20a%20novel%20approach%20to%20identifying%20and%20ranking%20critical%20non-proximal%20interdependencies%20within%20the%20overall%20protein%20structure&rft.jtitle=Bioinformatics%20advances&rft.au=Townsley,%20Thomas%20D&rft.date=2022&rft.volume=2&rft.issue=1&rft.spage=vbac058&rft.epage=vbac058&rft.pages=vbac058-vbac058&rft.issn=2635-0041&rft.eissn=2635-0041&rft_id=info:doi/10.1093/bioadv/vbac058&rft_dat=%3Cproquest_pubme%3E2769994606%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2769994606&rft_id=info:pmid/36699404&rft_oup_id=10.1093/bioadv/vbac058&rfr_iscdi=true
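
Illustrative note on the description field: the abstract speaks of locating and ranking interdependent sites in a multiple sequence alignment. The sketch below shows only the simplest, pairwise version of that idea and is not taken from PSICalc. It assumes plain mutual information between alignment columns as a stand-in measure (the abstract does not name the measure used), and the function names and the toy alignment are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equally long symbol lists."""
    n = len(xs)
    px = Counter(xs)            # counts of symbols in the first column
    py = Counter(ys)            # counts of symbols in the second column
    pxy = Counter(zip(xs, ys))  # counts of co-occurring symbol pairs
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def rank_column_pairs(msa):
    """Score every pair of alignment columns and sort by descending score.

    `msa` is a list of aligned sequences (strings of equal length).
    This is a hypothetical helper, not part of the PSICalc package.
    """
    width = len(msa[0])
    if any(len(seq) != width for seq in msa):
        raise ValueError("all aligned sequences must have the same length")
    cols = [[seq[i] for seq in msa] for i in range(width)]
    scores = [
        ((i, j), mutual_information(cols[i], cols[j]))
        for i, j in combinations(range(width), 2)
    ]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Toy alignment (made up for illustration): columns 0 and 3 covary,
# column 2 is fully conserved, column 1 varies independently of column 0.
toy_msa = ["AKLD", "AKLD", "GKLE", "GKLE", "ARLD", "GRLE"]
ranked = rank_column_pairs(toy_msa)
top_pair, top_score = ranked[0]
```

On the toy alignment the covarying columns 0 and 3 rank first. Grouping sites into third-order and higher-order clusters, as the abstract describes with a k-modes approach, would need an additional clustering step that this sketch does not attempt.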