The maximum capability of a topological feature in link prediction

Abstract: Networks offer a powerful approach to modeling complex systems by representing the underlying set of pairwise interactions. Link prediction is the task of predicting links of a network that are not directly visible, with profound applications in biological, social, and other complex systems...

Bibliographic details
Published in:	PNAS nexus 2024-03, Vol.3 (3), p.pgae113-pgae113
Main authors:	Ran, Yijun; Xu, Xiao-Ke; Jia, Tao
Format:	Article
Language:	English
Subjects:	Algorithms; Analysis; Data mining; Machine learning; Methods; Physical Sciences and Engineering; Social networks
Online access:	Full text

container_end_page	pgae113
container_issue	3
container_start_page	pgae113
container_title	PNAS nexus
container_volume	3
creator	Ran, Yijun; Xu, Xiao-Ke; Jia, Tao
description	Abstract: Networks offer a powerful approach to modeling complex systems by representing the underlying set of pairwise interactions. Link prediction is the task of predicting links of a network that are not directly visible, with profound applications in biological, social, and other complex systems. Despite the intensive use of topological features in this task, it is unclear to what extent a feature can be leveraged to infer missing links. Here, we aim to unveil the capability of a topological feature in link prediction by identifying the upper bound of its prediction performance. We introduce a theoretical framework that is compatible with different indexes to gauge the feature, different prediction approaches to utilize the feature, and different metrics to quantify the prediction performance. The maximum capability of a topological feature follows a simple yet theoretically validated expression, which depends only on the extent to which the feature is held in missing and nonexistent links. Because a family of indexes based on the same feature shares the same upper bound, the potential of all others can be estimated from a single index. Furthermore, a feature’s capability is lifted in supervised prediction, and this lift can be mathematically quantified, allowing us to estimate the benefit of applying machine learning algorithms. The universality of the pattern uncovered is empirically verified on 550 structurally diverse networks. The findings have applications in feature and method selection, and shed light on network characteristics that make a topological feature effective in link prediction.
doi_str_mv	10.1093/pnasnexus/pgae113
format	Article
fullrecord	<record><control><sourceid>gale_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_10962729</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><galeid>A791605808</galeid><oup_id>10.1093/pnasnexus/pgae113</oup_id><sourcerecordid>A791605808</sourcerecordid><originalsourceid>FETCH-LOGICAL-c504t-e9c7291d5cb6c085662710a768cc7f4e22246cff534854bb12af21bc9150ae583</originalsourceid><addsrcrecordid>eNqNkU9rFDEYxoMottR-AC8S8OLBbfN3MnOSWmwVCl7qObyTfbONZpJxMlPab2-WXZcWPEgOCcnvefIkDyFvOTvjrJPnY4KS8GEp5-MGkHP5ghwLo8Wq0Uq8fLI-Iqel_GSMCWM4V_o1OZKtFm2n1TH5fHuHdICHMCwDdTBCH2KYH2n2FOicxxzzJjiI1CPMy4Q0JBpD-kXHCdfBzSGnN-SVh1jwdD-fkB9XX24vv65uvl9_u7y4WTnN1LzCzhnR8bV2feNYq5tGGM7ANK1zxisUQqjGea-larXqey7AC967jmsGqFt5Qj7tfMelH3DtMM0TRDtOYYDp0WYI9vlJCnd2k-9t_a56l-iqw4e9w5R_L1hmO4TiMEZImJdiJWNSSSO5ruj7HbqBiDYkn6ul2-L2wnS8Ybpl20hn_6DqWOMQXE7oQ91_JuA7gZtyKRP6Q3zOtjmlPdRq97VWzbun7z4o_pZYgY87IC_jf_j9AZUXsIg</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3003437315</pqid></control><display><type>article</type><title>The maximum capability of a topological feature in link prediction</title><source>Oxford Journals Open Access Collection</source><source>DOAJ Directory of Open Access Journals</source><source>EZB-FREE-00999 freely available EZB journals</source><source>PubMed Central</source><creator>Ran, Yijun ; Xu, Xiao-Ke ; Jia, Tao</creator><creatorcontrib>Ran, Yijun ; Xu, Xiao-Ke ; Jia, Tao</creatorcontrib><description>Abstract Networks offer a powerful approach to modeling complex systems by representing the underlying set of pairwise interactions. Link prediction is the task that predicts links of a network that are not directly visible, with profound applications in biological, social, and other complex systems. Despite intensive utilization of the topological feature in this task, it is unclear to what extent a feature can be leveraged to infer missing links. 
Here, we aim to unveil the capability of a topological feature in link prediction by identifying its prediction performance upper bound. We introduce a theoretical framework that is compatible with different indexes to gauge the feature, different prediction approaches to utilize the feature, and different metrics to quantify the prediction performance. The maximum capability of a topological feature follows a simple yet theoretically validated expression, which only depends on the extent to which the feature is held in missing and nonexistent links. Because a family of indexes based on the same feature shares the same upper bound, the potential of all others can be estimated from one single index. Furthermore, a feature’s capability is lifted in the supervised prediction, which can be mathematically quantified, allowing us to estimate the benefit of applying machine learning algorithms. The universality of the pattern uncovered is empirically verified by 550 structurally diverse networks. The findings have applications in feature and method selection, and shed light on network characteristics that make a topological feature effective in link prediction.</description><identifier>ISSN: 2752-6542</identifier><identifier>EISSN: 2752-6542</identifier><identifier>DOI: 10.1093/pnasnexus/pgae113</identifier><identifier>PMID: 38528954</identifier><language>eng</language><publisher>US: Oxford University Press</publisher><subject>Algorithms ; Analysis ; Data mining ; Machine learning ; Methods ; Physical Sciences and Engineering ; Social networks</subject><ispartof>PNAS nexus, 2024-03, Vol.3 (3), p.pgae113-pgae113</ispartof><rights>The Author(s) 2024. Published by Oxford University Press on behalf of National Academy of Sciences. 2024</rights><rights>The Author(s) 2024. 
Published by Oxford University Press on behalf of National Academy of Sciences.</rights><rights>COPYRIGHT 2024 Oxford University Press</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c504t-e9c7291d5cb6c085662710a768cc7f4e22246cff534854bb12af21bc9150ae583</citedby><cites>FETCH-LOGICAL-c504t-e9c7291d5cb6c085662710a768cc7f4e22246cff534854bb12af21bc9150ae583</cites><orcidid>0000-0002-2337-2857 ; 0000-0002-7047-3343 ; 0000-0002-9148-3145</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC10962729/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC10962729/$$EHTML$$P50$$Gpubmedcentral$$Hfree_for_read</linktohtml><link.rule.ids>230,314,723,776,780,860,881,1598,27901,27902,53766,53768</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/38528954$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Ran, Yijun</creatorcontrib><creatorcontrib>Xu, Xiao-Ke</creatorcontrib><creatorcontrib>Jia, Tao</creatorcontrib><title>The maximum capability of a topological feature in link prediction</title><title>PNAS nexus</title><addtitle>PNAS Nexus</addtitle><description>Abstract Networks offer a powerful approach to modeling complex systems by representing the underlying set of pairwise interactions. Link prediction is the task that predicts links of a network that are not directly visible, with profound applications in biological, social, and other complex systems. Despite intensive utilization of the topological feature in this task, it is unclear to what extent a feature can be leveraged to infer missing links. 
Here, we aim to unveil the capability of a topological feature in link prediction by identifying its prediction performance upper bound. We introduce a theoretical framework that is compatible with different indexes to gauge the feature, different prediction approaches to utilize the feature, and different metrics to quantify the prediction performance. The maximum capability of a topological feature follows a simple yet theoretically validated expression, which only depends on the extent to which the feature is held in missing and nonexistent links. Because a family of indexes based on the same feature shares the same upper bound, the potential of all others can be estimated from one single index. Furthermore, a feature’s capability is lifted in the supervised prediction, which can be mathematically quantified, allowing us to estimate the benefit of applying machine learning algorithms. The universality of the pattern uncovered is empirically verified by 550 structurally diverse networks. 
The findings have applications in feature and method selection, and shed light on network characteristics that make a topological feature effective in link prediction.</description><subject>Algorithms</subject><subject>Analysis</subject><subject>Data mining</subject><subject>Machine learning</subject><subject>Methods</subject><subject>Physical Sciences and Engineering</subject><subject>Social networks</subject><issn>2752-6542</issn><issn>2752-6542</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>TOX</sourceid><recordid>eNqNkU9rFDEYxoMottR-AC8S8OLBbfN3MnOSWmwVCl7qObyTfbONZpJxMlPab2-WXZcWPEgOCcnvefIkDyFvOTvjrJPnY4KS8GEp5-MGkHP5ghwLo8Wq0Uq8fLI-Iqel_GSMCWM4V_o1OZKtFm2n1TH5fHuHdICHMCwDdTBCH2KYH2n2FOicxxzzJjiI1CPMy4Q0JBpD-kXHCdfBzSGnN-SVh1jwdD-fkB9XX24vv65uvl9_u7y4WTnN1LzCzhnR8bV2feNYq5tGGM7ANK1zxisUQqjGea-larXqey7AC967jmsGqFt5Qj7tfMelH3DtMM0TRDtOYYDp0WYI9vlJCnd2k-9t_a56l-iqw4e9w5R_L1hmO4TiMEZImJdiJWNSSSO5ruj7HbqBiDYkn6ul2-L2wnS8Ybpl20hn_6DqWOMQXE7oQ91_JuA7gZtyKRP6Q3zOtjmlPdRq97VWzbun7z4o_pZYgY87IC_jf_j9AZUXsIg</recordid><startdate>20240301</startdate><enddate>20240301</enddate><creator>Ran, Yijun</creator><creator>Xu, Xiao-Ke</creator><creator>Jia, Tao</creator><general>Oxford University Press</general><scope>TOX</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><scope>5PM</scope><orcidid>https://orcid.org/0000-0002-2337-2857</orcidid><orcidid>https://orcid.org/0000-0002-7047-3343</orcidid><orcidid>https://orcid.org/0000-0002-9148-3145</orcidid></search><sort><creationdate>20240301</creationdate><title>The maximum capability of a topological feature in link prediction</title><author>Ran, Yijun ; Xu, Xiao-Ke ; Jia, 
Tao</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c504t-e9c7291d5cb6c085662710a768cc7f4e22246cff534854bb12af21bc9150ae583</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Algorithms</topic><topic>Analysis</topic><topic>Data mining</topic><topic>Machine learning</topic><topic>Methods</topic><topic>Physical Sciences and Engineering</topic><topic>Social networks</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Ran, Yijun</creatorcontrib><creatorcontrib>Xu, Xiao-Ke</creatorcontrib><creatorcontrib>Jia, Tao</creatorcontrib><collection>Oxford Journals Open Access Collection</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>PNAS nexus</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Ran, Yijun</au><au>Xu, Xiao-Ke</au><au>Jia, Tao</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>The maximum capability of a topological feature in link prediction</atitle><jtitle>PNAS nexus</jtitle><addtitle>PNAS Nexus</addtitle><date>2024-03-01</date><risdate>2024</risdate><volume>3</volume><issue>3</issue><spage>pgae113</spage><epage>pgae113</epage><pages>pgae113-pgae113</pages><issn>2752-6542</issn><eissn>2752-6542</eissn><abstract>Abstract Networks offer a powerful approach to modeling complex systems by representing the underlying set of pairwise interactions. Link prediction is the task that predicts links of a network that are not directly visible, with profound applications in biological, social, and other complex systems. Despite intensive utilization of the topological feature in this task, it is unclear to what extent a feature can be leveraged to infer missing links. 
Here, we aim to unveil the capability of a topological feature in link prediction by identifying its prediction performance upper bound. We introduce a theoretical framework that is compatible with different indexes to gauge the feature, different prediction approaches to utilize the feature, and different metrics to quantify the prediction performance. The maximum capability of a topological feature follows a simple yet theoretically validated expression, which only depends on the extent to which the feature is held in missing and nonexistent links. Because a family of indexes based on the same feature shares the same upper bound, the potential of all others can be estimated from one single index. Furthermore, a feature’s capability is lifted in the supervised prediction, which can be mathematically quantified, allowing us to estimate the benefit of applying machine learning algorithms. The universality of the pattern uncovered is empirically verified by 550 structurally diverse networks. The findings have applications in feature and method selection, and shed light on network characteristics that make a topological feature effective in link prediction.</abstract><cop>US</cop><pub>Oxford University Press</pub><pmid>38528954</pmid><doi>10.1093/pnasnexus/pgae113</doi><orcidid>https://orcid.org/0000-0002-2337-2857</orcidid><orcidid>https://orcid.org/0000-0002-7047-3343</orcidid><orcidid>https://orcid.org/0000-0002-9148-3145</orcidid><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	ISSN: 2752-6542
ispartof	PNAS nexus, 2024-03, Vol.3 (3), p.pgae113-pgae113
issn	2752-6542 2752-6542
language	eng
recordid	cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_10962729
source	Oxford Journals Open Access Collection; DOAJ Directory of Open Access Journals; EZB-FREE-00999 freely available EZB journals; PubMed Central
subjects	Algorithms; Analysis; Data mining; Machine learning; Methods; Physical Sciences and Engineering; Social networks
title	The maximum capability of a topological feature in link prediction
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-07T03%3A20%3A28IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=The%20maximum%20capability%20of%20a%20topological%20feature%20in%20link%20prediction&rft.jtitle=PNAS%20nexus&rft.au=Ran,%20Yijun&rft.date=2024-03-01&rft.volume=3&rft.issue=3&rft.spage=pgae113&rft.epage=pgae113&rft.pages=pgae113-pgae113&rft.issn=2752-6542&rft.eissn=2752-6542&rft_id=info:doi/10.1093/pnasnexus/pgae113&rft_dat=%3Cgale_pubme%3EA791605808%3C/gale_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=3003437315&rft_id=info:pmid/38528954&rft_galeid=A791605808&rft_oup_id=10.1093/pnasnexus/pgae113&rfr_iscdi=true