Network cross-validation by edge sampling

Summary: While many statistical models and methods are now available for network analysis, resampling of network data remains a challenging problem. Cross-validation is a useful general tool for model selection and parameter tuning, but it is not directly applicable to networks since splitting networ...

Detailed description

Bibliographic details
Published in:	Biometrika 2020-06, Vol.107 (2), p.257-276
Main authors:	Li, Tianxi; Levina, Elizaveta; Zhu, Ji
Format:	Article
Language:	eng
Subjects:	Computer simulation; Mathematical models; Network analysis; Nodes; Parameters; Resampling; Splitting; Statistical analysis; Statistical methods; Statistical models; Tuning
Online access:	Full text

container_end_page	276
container_issue	2
container_start_page	257
container_title	Biometrika
container_volume	107
creator	Li, Tianxi; Levina, Elizaveta; Zhu, Ji
description	Summary: While many statistical models and methods are now available for network analysis, resampling of network data remains a challenging problem. Cross-validation is a useful general tool for model selection and parameter tuning, but it is not directly applicable to networks since splitting network nodes into groups requires deleting edges and destroys some of the network structure. In this paper we propose a new network resampling strategy, based on splitting node pairs rather than nodes, that is applicable to cross-validation for a wide range of network model selection tasks. We provide theoretical justification for our method in a general setting and examples of how the method can be used in specific network model selection and parameter tuning tasks. Numerical results on simulated networks and on a statisticians’ citation network show that the proposed cross-validation approach works well for model selection.
doi_str_mv	10.1093/biomet/asaa006
format	Article
fullrecord	<record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2429813379</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><oup_id>10.1093/biomet/asaa006</oup_id><sourcerecordid>2429813379</sourcerecordid><originalsourceid>FETCH-LOGICAL-c407t-50d9cdb51ba19c3d9dd165b155da58f4ef6e757fa73f6843e70f61a6043ebe2d3</originalsourceid><addsrcrecordid>eNqFkDtPwzAUhS0EEqGwMkdi6uDWjl_xiCpeUgULzJYd21VKEgc7AfXfk5LuTPeh79xzdQC4xWiFkSRrU4fWDWudtEaIn4EMU04hYRidgwxNK0gopZfgKqX9ceSMZ2D56oafED_zKoaU4LduaquHOnS5OeTO7lyedNs3dbe7BhdeN8ndnOoCfDw-vG-e4fbt6WVzv4UVRWKADFlZWcOw0VhWxEprMWcGM2Y1Kz11njvBhNeCeF5S4gTyHGuOpta4wpIFuJvv9jF8jS4Nah_G2E2WqqCFLDEhQk7Uaqb-_o7Oqz7WrY4HhZE6xqHmONQpjkmwnAVh7P9jfwFq52L5</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2429813379</pqid></control><display><type>article</type><title>Network cross-validation by edge sampling</title><source>Oxford University Press Journals All Titles (1996-Current)</source><creator>Li, Tianxi ; Levina, Elizaveta ; Zhu, Ji</creator><creatorcontrib>Li, Tianxi ; Levina, Elizaveta ; Zhu, Ji</creatorcontrib><description>Summary While many statistical models and methods are now available for network analysis, resampling of network data remains a challenging problem. Cross-validation is a useful general tool for model selection and parameter tuning, but it is not directly applicable to networks since splitting network nodes into groups requires deleting edges and destroys some of the network structure. In this paper we propose a new network resampling strategy, based on splitting node pairs rather than nodes, that is applicable to cross-validation for a wide range of network model selection tasks. We provide theoretical justification for our method in a general setting and examples of how the method can be used in specific network model selection and parameter tuning tasks. 
Numerical results on simulated networks and on a statisticians’ citation network show that the proposed cross-validation approach works well for model selection.</description><identifier>ISSN: 0006-3444</identifier><identifier>EISSN: 1464-3510</identifier><identifier>DOI: 10.1093/biomet/asaa006</identifier><language>eng</language><publisher>Oxford: Oxford University Press</publisher><subject>Computer simulation ; Mathematical models ; Network analysis ; Nodes ; Parameters ; Resampling ; Splitting ; Statistical analysis ; Statistical methods ; Statistical models ; Tuning</subject><ispartof>Biometrika, 2020-06, Vol.107 (2), p.257-276</ispartof><rights>2020 Biometrika Trust 2020</rights><rights>2020 Biometrika Trust</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c407t-50d9cdb51ba19c3d9dd165b155da58f4ef6e757fa73f6843e70f61a6043ebe2d3</citedby><cites>FETCH-LOGICAL-c407t-50d9cdb51ba19c3d9dd165b155da58f4ef6e757fa73f6843e70f61a6043ebe2d3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,1584,27924,27925</link.rule.ids></links><search><creatorcontrib>Li, Tianxi</creatorcontrib><creatorcontrib>Levina, Elizaveta</creatorcontrib><creatorcontrib>Zhu, Ji</creatorcontrib><title>Network cross-validation by edge sampling</title><title>Biometrika</title><description>Summary While many statistical models and methods are now available for network analysis, resampling of network data remains a challenging problem. Cross-validation is a useful general tool for model selection and parameter tuning, but it is not directly applicable to networks since splitting network nodes into groups requires deleting edges and destroys some of the network structure. 
In this paper we propose a new network resampling strategy, based on splitting node pairs rather than nodes, that is applicable to cross-validation for a wide range of network model selection tasks. We provide theoretical justification for our method in a general setting and examples of how the method can be used in specific network model selection and parameter tuning tasks. Numerical results on simulated networks and on a statisticians’ citation network show that the proposed cross-validation approach works well for model selection.</description><subject>Computer simulation</subject><subject>Mathematical models</subject><subject>Network analysis</subject><subject>Nodes</subject><subject>Parameters</subject><subject>Resampling</subject><subject>Splitting</subject><subject>Statistical analysis</subject><subject>Statistical methods</subject><subject>Statistical models</subject><subject>Tuning</subject><issn>0006-3444</issn><issn>1464-3510</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><recordid>eNqFkDtPwzAUhS0EEqGwMkdi6uDWjl_xiCpeUgULzJYd21VKEgc7AfXfk5LuTPeh79xzdQC4xWiFkSRrU4fWDWudtEaIn4EMU04hYRidgwxNK0gopZfgKqX9ceSMZ2D56oafED_zKoaU4LduaquHOnS5OeTO7lyedNs3dbe7BhdeN8ndnOoCfDw-vG-e4fbt6WVzv4UVRWKADFlZWcOw0VhWxEprMWcGM2Y1Kz11njvBhNeCeF5S4gTyHGuOpta4wpIFuJvv9jF8jS4Nah_G2E2WqqCFLDEhQk7Uaqb-_o7Oqz7WrY4HhZE6xqHmONQpjkmwnAVh7P9jfwFq52L5</recordid><startdate>20200601</startdate><enddate>20200601</enddate><creator>Li, Tianxi</creator><creator>Levina, Elizaveta</creator><creator>Zhu, Ji</creator><general>Oxford University Press</general><general>Oxford Publishing Limited (England)</general><scope>AAYXX</scope><scope>CITATION</scope><scope>7QO</scope><scope>8FD</scope><scope>FR3</scope><scope>P64</scope></search><sort><creationdate>20200601</creationdate><title>Network cross-validation by edge sampling</title><author>Li, Tianxi ; Levina, Elizaveta ; Zhu, 
Ji</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c407t-50d9cdb51ba19c3d9dd165b155da58f4ef6e757fa73f6843e70f61a6043ebe2d3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>Computer simulation</topic><topic>Mathematical models</topic><topic>Network analysis</topic><topic>Nodes</topic><topic>Parameters</topic><topic>Resampling</topic><topic>Splitting</topic><topic>Statistical analysis</topic><topic>Statistical methods</topic><topic>Statistical models</topic><topic>Tuning</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Li, Tianxi</creatorcontrib><creatorcontrib>Levina, Elizaveta</creatorcontrib><creatorcontrib>Zhu, Ji</creatorcontrib><collection>CrossRef</collection><collection>Biotechnology Research Abstracts</collection><collection>Technology Research Database</collection><collection>Engineering Research Database</collection><collection>Biotechnology and BioEngineering Abstracts</collection><jtitle>Biometrika</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Li, Tianxi</au><au>Levina, Elizaveta</au><au>Zhu, Ji</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Network cross-validation by edge sampling</atitle><jtitle>Biometrika</jtitle><date>2020-06-01</date><risdate>2020</risdate><volume>107</volume><issue>2</issue><spage>257</spage><epage>276</epage><pages>257-276</pages><issn>0006-3444</issn><eissn>1464-3510</eissn><abstract>Summary While many statistical models and methods are now available for network analysis, resampling of network data remains a challenging problem. 
Cross-validation is a useful general tool for model selection and parameter tuning, but it is not directly applicable to networks since splitting network nodes into groups requires deleting edges and destroys some of the network structure. In this paper we propose a new network resampling strategy, based on splitting node pairs rather than nodes, that is applicable to cross-validation for a wide range of network model selection tasks. We provide theoretical justification for our method in a general setting and examples of how the method can be used in specific network model selection and parameter tuning tasks. Numerical results on simulated networks and on a statisticians’ citation network show that the proposed cross-validation approach works well for model selection.</abstract><cop>Oxford</cop><pub>Oxford University Press</pub><doi>10.1093/biomet/asaa006</doi><tpages>20</tpages><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	ISSN: 0006-3444
ispartof	Biometrika, 2020-06, Vol.107 (2), p.257-276
issn	0006-3444; 1464-3510
language	eng
recordid	cdi_proquest_journals_2429813379
source	Oxford University Press Journals All Titles (1996-Current)
subjects	Computer simulation; Mathematical models; Network analysis; Nodes; Parameters; Resampling; Splitting; Statistical analysis; Statistical methods; Statistical models; Tuning
title	Network cross-validation by edge sampling
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-04T18%3A08%3A03IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Network%20cross-validation%20by%20edge%20sampling&rft.jtitle=Biometrika&rft.au=Li,%20Tianxi&rft.date=2020-06-01&rft.volume=107&rft.issue=2&rft.spage=257&rft.epage=276&rft.pages=257-276&rft.issn=0006-3444&rft.eissn=1464-3510&rft_id=info:doi/10.1093/biomet/asaa006&rft_dat=%3Cproquest_cross%3E2429813379%3C/proquest_cross%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2429813379&rft_id=info:pmid/&rft_oup_id=10.1093/biomet/asaa006&rfr_iscdi=true
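
The description in this record says the method resamples by splitting node pairs rather than nodes. The Python sketch below illustrates that idea for one possible task, choosing the rank of a low-rank network model. It is a minimal sketch under stated assumptions, not the authors' implementation: the record does not say how held-out pairs are handled, so the truncated-SVD fill-in, the squared-error loss, and all function and parameter names (`split_node_pairs`, `low_rank_fill`, `edge_cv_rank`, `holdout_frac`) are hypothetical choices made for illustration.

```python
import numpy as np


def split_node_pairs(n, holdout_frac, rng):
    """Return a symmetric boolean mask; True marks held-out node pairs."""
    iu, ju = np.triu_indices(n, k=1)           # all unordered pairs i < j
    held = rng.random(iu.size) < holdout_frac  # sample pairs, not nodes
    mask = np.zeros((n, n), dtype=bool)
    mask[iu[held], ju[held]] = True
    mask[ju[held], iu[held]] = True            # mirror to keep symmetry
    return mask


def low_rank_fill(adj, held_out, rank, holdout_frac):
    """Assumed fill-in step: zero the held-out pairs, rescale, truncate the SVD."""
    train = np.where(held_out, 0.0, adj) / (1.0 - holdout_frac)
    u, s, vt = np.linalg.svd(train)
    return (u[:, :rank] * s[:rank]) @ vt[:rank]


def edge_cv_rank(adj, ranks, holdout_frac=0.1, n_splits=3, seed=0):
    """Pick the candidate rank with the lowest average held-out squared error."""
    rng = np.random.default_rng(seed)
    n = adj.shape[0]
    losses = {r: 0.0 for r in ranks}
    for _ in range(n_splits):
        held = split_node_pairs(n, holdout_frac, rng)
        for r in ranks:
            est = low_rank_fill(adj, held, r, holdout_frac)
            # Score only the node pairs that were hidden from the fit.
            losses[r] += np.mean((est[held] - adj[held]) ** 2) / n_splits
    best = min(losses, key=losses.get)
    return best, losses
```

Calling `edge_cv_rank(adj, [1, 2, 3])` on a symmetric 0/1 adjacency matrix returns the selected rank and the per-rank losses. Because every node stays in the training data and only individual pairs are hidden, no node loses all of its edges, which is the motivation the description gives for splitting pairs instead of nodes.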