K-means find density peaks in molecular conformation clustering


Bibliographic details
Published in: Chinese journal of chemical physics 2022-04, Vol.35 (2), p.353-368
Main authors: Wang, Guiyan; Fu, Ting; Ren, Hong; Xu, Peijun; Guo, Qiuhan; Mou, Xiaohong; Li, Yan; Li, Guohui
Format: Article
Language: eng
Online access: Full text
container_end_page 368
container_issue 2
container_start_page 353
container_title Chinese journal of chemical physics
container_volume 35
creator Wang, Guiyan
Fu, Ting
Ren, Hong
Xu, Peijun
Guo, Qiuhan
Mou, Xiaohong
Li, Yan
Li, Guohui
description Cluster analysis of molecular conformations is an important way to find representative conformations in molecular dynamics trajectories, and it is usually a critical step in interpreting complex conformational changes or interaction mechanisms. Among density-based clustering algorithms, find density peaks (FDP) is an accurate and reasonable candidate for molecular conformation clustering. However, as simulation lengths grow rapidly with increasing computing power, the low computational efficiency of FDP limits its applicability. Here we propose a marginal extension to FDP, named K-means find density peaks (KFDP), to address this heavy resource consumption. In KFDP, the points are first clustered by a highly efficient clustering algorithm such as K-means. The resulting cluster centers are taken as typical points, each carrying a weight that represents its cluster size. The weighted typical points are then clustered again by FDP and refined into core, boundary, and redefined halo points. In this way, KFDP achieves accuracy comparable to FDP while its computational complexity is reduced from O(n^2) to O(n). We apply and test the KFDP method on trajectory data of multiple small proteins in terms of torsion angle, secondary structure, or contact map. Comparisons with K-means and density-based spatial clustering of applications with noise (DBSCAN) validate the proposed KFDP.
doi_str_mv 10.1063/1674-0068/cjcp2111261
format Article
fullrecord <record><control><sourceid>wanfang_jour_scita</sourceid><recordid>TN_cdi_scitation_primary_10_1063_1674_0068_cjcp2111261</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><wanfj_id>hxwlxb202202017</wanfj_id><sourcerecordid>hxwlxb202202017</sourcerecordid><originalsourceid>FETCH-LOGICAL-c368t-64d7a4b050aaabd9a81d6b2d3c3158f95db1d6bb765a86db025d3c7805e4724e3</originalsourceid><addsrcrecordid>eNqNkE1LxDAQhoMouK7-BCE3D1J3kuajexJZ_MIFL3oOaZKuXdu0JK27--9tXREvgjAww8z7zDAvQucErgiIdEaEZAmAyGZmbVpKCKGCHKAJTalMKGXsEE1-NMfoJMb1UHECMEHXT0nttI-4KL3F1vlYdjvcOv0ecelx3VTO9JUO2DS-aEKtu7Lx2FR97Fwo_eoUHRW6iu7sO0_R693ty-IhWT7fPy5ulolJRdYlglmpWQ4ctNa5neuMWJFTm5qU8KyYc5uPjVwKrjNhc6B8mMkMuGOSMpdO0cV-70b7QvuVWjd98MNF9bbdVNucAh0CiByUfK80oYkxuEK1oax12CkCavRLjV6o0Qv1y6-BE3sumrL7evPf4OVf4EcTRhiAAJOZam2RfgKsL38l</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>K-means find density peaks in molecular conformation clustering</title><source>Alma/SFX Local Collection</source><creator>Wang, Guiyan ; Fu, Ting ; Ren, Hong ; Xu, Peijun ; Guo, Qiuhan ; Mou, Xiaohong ; Li, Yan ; Li, Guohui</creator><creatorcontrib>Wang, Guiyan ; Fu, Ting ; Ren, Hong ; Xu, Peijun ; Guo, Qiuhan ; Mou, Xiaohong ; Li, Yan ; Li, Guohui</creatorcontrib><description>Performing cluster analysis on molecular conformation is an important way to find the representative conformation in the molecular dynamics trajectories. Usually, it is a critical step for interpreting complex conformational changes or interaction mechanisms. As one of the density-based clustering algorithms, find density peaks (FDP) is an accurate and reasonable candidate for the molecular conformation clustering. However, facing the rapidly increasing simulation length due to the increase in computing power, the low computing efficiency of FDP limits its application potential. 
Here we propose a marginal extension to FDP named K-means find density peaks (KFDP) to solve the mass source consuming problem. In KFDP, the points are initially clustered by a high efficiency clustering algorithm, such as K-means. Cluster centers are defined as typical points with a weight which represents the cluster size. Then, the weighted typical points are clustered again by FDP, and then are refined as core, boundary, and redefined halo points. In this way, KFDP has comparable accuracy as FDP but its computational complexity is reduced from O(n2) to O(n). We apply and test our KFDP method to the trajectory data of multiple small proteins in terms of torsion angle, secondary structure or contact map. The comparing results with K-means and density-based spatial clustering of applications with noise show the validation of the proposed KFDP.</description><identifier>ISSN: 1674-0068</identifier><identifier>EISSN: 2327-2244</identifier><identifier>DOI: 10.1063/1674-0068/cjcp2111261</identifier><identifier>CODEN: CJCPA6</identifier><language>eng</language><publisher>School of Information Engineering,Dalian Ocean University,Dalian 116029,China%Pharmacy Department of Affiliated Zhongshan Hospital of Dalian University,Dalian 116001,China</publisher><ispartof>Chinese journal of chemical physics, 2022-04, Vol.35 (2), p.353-368</ispartof><rights>Chinese Physical Society</rights><rights>Copyright © Wanfang Data Co. Ltd. 
All Rights Reserved.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c368t-64d7a4b050aaabd9a81d6b2d3c3158f95db1d6bb765a86db025d3c7805e4724e3</citedby><cites>FETCH-LOGICAL-c368t-64d7a4b050aaabd9a81d6b2d3c3158f95db1d6bb765a86db025d3c7805e4724e3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Uhttp://www.wanfangdata.com.cn/images/PeriodicalImages/hxwlxb/hxwlxb.jpg</thumbnail><link.rule.ids>314,776,780,27903,27904</link.rule.ids></links><search><creatorcontrib>Wang, Guiyan</creatorcontrib><creatorcontrib>Fu, Ting</creatorcontrib><creatorcontrib>Ren, Hong</creatorcontrib><creatorcontrib>Xu, Peijun</creatorcontrib><creatorcontrib>Guo, Qiuhan</creatorcontrib><creatorcontrib>Mou, Xiaohong</creatorcontrib><creatorcontrib>Li, Yan</creatorcontrib><creatorcontrib>Li, Guohui</creatorcontrib><title>K-means find density peaks in molecular conformation clustering</title><title>Chinese journal of chemical physics</title><description>Performing cluster analysis on molecular conformation is an important way to find the representative conformation in the molecular dynamics trajectories. Usually, it is a critical step for interpreting complex conformational changes or interaction mechanisms. As one of the density-based clustering algorithms, find density peaks (FDP) is an accurate and reasonable candidate for the molecular conformation clustering. However, facing the rapidly increasing simulation length due to the increase in computing power, the low computing efficiency of FDP limits its application potential. Here we propose a marginal extension to FDP named K-means find density peaks (KFDP) to solve the mass source consuming problem. In KFDP, the points are initially clustered by a high efficiency clustering algorithm, such as K-means. 
Cluster centers are defined as typical points with a weight which represents the cluster size. Then, the weighted typical points are clustered again by FDP, and then are refined as core, boundary, and redefined halo points. In this way, KFDP has comparable accuracy as FDP but its computational complexity is reduced from O(n2) to O(n). We apply and test our KFDP method to the trajectory data of multiple small proteins in terms of torsion angle, secondary structure or contact map. The comparing results with K-means and density-based spatial clustering of applications with noise show the validation of the proposed KFDP.</description><issn>1674-0068</issn><issn>2327-2244</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><recordid>eNqNkE1LxDAQhoMouK7-BCE3D1J3kuajexJZ_MIFL3oOaZKuXdu0JK27--9tXREvgjAww8z7zDAvQucErgiIdEaEZAmAyGZmbVpKCKGCHKAJTalMKGXsEE1-NMfoJMb1UHECMEHXT0nttI-4KL3F1vlYdjvcOv0ecelx3VTO9JUO2DS-aEKtu7Lx2FR97Fwo_eoUHRW6iu7sO0_R693ty-IhWT7fPy5ulolJRdYlglmpWQ4ctNa5neuMWJFTm5qU8KyYc5uPjVwKrjNhc6B8mMkMuGOSMpdO0cV-70b7QvuVWjd98MNF9bbdVNucAh0CiByUfK80oYkxuEK1oax12CkCavRLjV6o0Qv1y6-BE3sumrL7evPf4OVf4EcTRhiAAJOZam2RfgKsL38l</recordid><startdate>20220401</startdate><enddate>20220401</enddate><creator>Wang, Guiyan</creator><creator>Fu, Ting</creator><creator>Ren, Hong</creator><creator>Xu, Peijun</creator><creator>Guo, Qiuhan</creator><creator>Mou, Xiaohong</creator><creator>Li, Yan</creator><creator>Li, Guohui</creator><general>School of Information Engineering,Dalian Ocean University,Dalian 116029,China%Pharmacy Department of Affiliated Zhongshan Hospital of Dalian University,Dalian 116001,China</general><general>State Key Laboratory of Molecular Reaction Dynamics,Dalian Institute of Chemical Physics,Dalian 116023,China%Department of Ophthalmology Aerospace Center Hospital,Beijing 10049,China%Liaoning Normal University,Dalian 116023,China%State Key Laboratory of Molecular Reaction Dynamics,Dalian 
Institute of Chemical Physics,Dalian 116023,China</general><scope>AAYXX</scope><scope>CITATION</scope><scope>2B.</scope><scope>4A8</scope><scope>92I</scope><scope>93N</scope><scope>PSX</scope><scope>TCJ</scope></search><sort><creationdate>20220401</creationdate><title>K-means find density peaks in molecular conformation clustering</title><author>Wang, Guiyan ; Fu, Ting ; Ren, Hong ; Xu, Peijun ; Guo, Qiuhan ; Mou, Xiaohong ; Li, Yan ; Li, Guohui</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c368t-64d7a4b050aaabd9a81d6b2d3c3158f95db1d6bb765a86db025d3c7805e4724e3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Wang, Guiyan</creatorcontrib><creatorcontrib>Fu, Ting</creatorcontrib><creatorcontrib>Ren, Hong</creatorcontrib><creatorcontrib>Xu, Peijun</creatorcontrib><creatorcontrib>Guo, Qiuhan</creatorcontrib><creatorcontrib>Mou, Xiaohong</creatorcontrib><creatorcontrib>Li, Yan</creatorcontrib><creatorcontrib>Li, Guohui</creatorcontrib><collection>CrossRef</collection><collection>Wanfang Data Journals - Hong Kong</collection><collection>WANFANG Data Centre</collection><collection>Wanfang Data Journals</collection><collection>万方数据期刊 - 香港版</collection><collection>China Online Journals (COJ)</collection><collection>China Online Journals (COJ)</collection><jtitle>Chinese journal of chemical physics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Wang, Guiyan</au><au>Fu, Ting</au><au>Ren, Hong</au><au>Xu, Peijun</au><au>Guo, Qiuhan</au><au>Mou, Xiaohong</au><au>Li, Yan</au><au>Li, Guohui</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>K-means find density peaks in molecular conformation clustering</atitle><jtitle>Chinese journal of chemical 
physics</jtitle><date>2022-04-01</date><risdate>2022</risdate><volume>35</volume><issue>2</issue><spage>353</spage><epage>368</epage><pages>353-368</pages><issn>1674-0068</issn><eissn>2327-2244</eissn><coden>CJCPA6</coden><abstract>Performing cluster analysis on molecular conformation is an important way to find the representative conformation in the molecular dynamics trajectories. Usually, it is a critical step for interpreting complex conformational changes or interaction mechanisms. As one of the density-based clustering algorithms, find density peaks (FDP) is an accurate and reasonable candidate for the molecular conformation clustering. However, facing the rapidly increasing simulation length due to the increase in computing power, the low computing efficiency of FDP limits its application potential. Here we propose a marginal extension to FDP named K-means find density peaks (KFDP) to solve the mass source consuming problem. In KFDP, the points are initially clustered by a high efficiency clustering algorithm, such as K-means. Cluster centers are defined as typical points with a weight which represents the cluster size. Then, the weighted typical points are clustered again by FDP, and then are refined as core, boundary, and redefined halo points. In this way, KFDP has comparable accuracy as FDP but its computational complexity is reduced from O(n2) to O(n). We apply and test our KFDP method to the trajectory data of multiple small proteins in terms of torsion angle, secondary structure or contact map. The comparing results with K-means and density-based spatial clustering of applications with noise show the validation of the proposed KFDP.</abstract><pub>School of Information Engineering,Dalian Ocean University,Dalian 116029,China%Pharmacy Department of Affiliated Zhongshan Hospital of Dalian University,Dalian 116001,China</pub><doi>10.1063/1674-0068/cjcp2111261</doi><tpages>16</tpages></addata></record>
fulltext fulltext
identifier ISSN: 1674-0068
ispartof Chinese journal of chemical physics, 2022-04, Vol.35 (2), p.353-368
issn 1674-0068
2327-2244
language eng
recordid cdi_scitation_primary_10_1063_1674_0068_cjcp2111261
source Alma/SFX Local Collection
title K-means find density peaks in molecular conformation clustering
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-23T15%3A20%3A37IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-wanfang_jour_scita&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=K-means%20find%20density%20peaks%20in%20molecular%20conformation%20clustering&rft.jtitle=Chinese%20journal%20of%20chemical%20physics&rft.au=Wang,%20Guiyan&rft.date=2022-04-01&rft.volume=35&rft.issue=2&rft.spage=353&rft.epage=368&rft.pages=353-368&rft.issn=1674-0068&rft.eissn=2327-2244&rft.coden=CJCPA6&rft_id=info:doi/10.1063/1674-0068/cjcp2111261&rft_dat=%3Cwanfang_jour_scita%3Ehxwlxb202202017%3C/wanfang_jour_scita%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_wanfj_id=hxwlxb202202017&rfr_iscdi=true
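
The two-stage procedure summarized in the description field (K-means first, then FDP on weighted cluster centers) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names (`kmeans`, `weighted_density_peaks`, `kfdp`), the Gaussian kernel used for the weighted density, the cutoff parameter `d_c`, and the rule of picking peaks by the largest density-times-distance product are all assumptions made for illustration. The refinement into core, boundary, and halo points mentioned in the abstract is omitted.

```python
import numpy as np


def kmeans(points, k, n_iter=20, seed=0):
    """Plain Lloyd's K-means. Returns (centers, labels)."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)].copy()
    for _ in range(n_iter):
        dist = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members):  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    # Final assignment against the final centers.
    dist = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    return centers, dist.argmin(axis=1)


def weighted_density_peaks(centers, weights, d_c, n_clusters):
    """Density-peaks clustering of weighted points (assumed Gaussian kernel)."""
    m = len(centers)
    dist = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=2)
    # Weighted local density: each neighbor contributes its cluster size.
    rho = (weights[None, :] * np.exp(-(dist / d_c) ** 2)).sum(axis=1)
    order = np.argsort(-rho)  # indices sorted from densest to sparsest
    delta = np.zeros(m)       # distance to the nearest denser point
    parent = np.full(m, -1)   # index of that nearest denser point
    delta[order[0]] = dist[order[0]].max()
    for rank in range(1, m):
        i = order[rank]
        higher = order[:rank]
        j = higher[dist[i, higher].argmin()]
        delta[i], parent[i] = dist[i, j], j
    gamma = rho * delta
    gamma[order[0]] = np.inf  # the densest point is always a peak
    peaks = np.argsort(-gamma)[:n_clusters]
    labels = np.full(m, -1)
    labels[peaks] = np.arange(n_clusters)
    for i in order:  # denser points are labeled before sparser ones
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels


def kfdp(points, k, d_c, n_clusters, seed=0):
    """Stage 1: K-means into k typical points. Stage 2: weighted FDP."""
    centers, km_labels = kmeans(points, k, seed=seed)
    weights = np.bincount(km_labels, minlength=k).astype(float)
    center_labels = weighted_density_peaks(centers, weights, d_c, n_clusters)
    return center_labels[km_labels]  # each point inherits its center's label
```

In this sketch the pairwise distance matrix is built only over the k typical points, so the density-peaks stage costs on the order of k^2 regardless of n, while the K-means stage scales linearly in n for a fixed k and a fixed number of iterations. That is one way to read the abstract's reduction from O(n^2) to O(n), under the assumption that k is chosen independently of n.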