Defining and Evaluating Network Communities Based on Ground-Truth
Nodes in real-world networks organize into densely linked communities where edges appear with high concentration among the members of the community. Identifying such communities of nodes has proven to be a challenging task mainly due to a plethora of definitions of a community, intractability of alg...
Main authors: | Jaewon Yang ; Leskovec, J. |
---|---|
Format: | Conference proceedings |
Language: | eng |
Subjects: | |
Online access: | Order full text |
container_end_page | 754 |
---|---|
container_issue | |
container_start_page | 745 |
container_title | |
container_volume | |
creator | Jaewon Yang ; Leskovec, J. |
description | Nodes in real-world networks organize into densely linked communities where edges appear with high concentration among the members of the community. Identifying such communities of nodes has proven to be a challenging task mainly due to a plethora of definitions of a community, intractability of algorithms, issues with evaluation and the lack of a reliable gold-standard ground-truth. In this paper we study a set of 230 large real-world social, collaboration and information networks where nodes explicitly state their group memberships. For example, in social networks nodes explicitly join various interest-based social groups. We use such groups to define a reliable and robust notion of ground-truth communities. We then propose a methodology which allows us to compare and quantitatively evaluate how different structural definitions of network communities correspond to ground-truth communities. We choose 13 commonly used structural definitions of network communities and examine their sensitivity, robustness and performance in identifying the ground-truth. We show that the 13 structural definitions are heavily correlated and naturally group into four classes. We find that two of these definitions, Conductance and Triad-participation-ratio, consistently give the best performance in identifying ground-truth communities. We also investigate the task of detecting communities given a single seed node. We extend the local spectral clustering algorithm into a heuristic parameter-free community detection method that easily scales to networks with more than a hundred million nodes. The proposed method achieves a 30% relative improvement over current local clustering methods. |
doi_str_mv | 10.1109/ICDM.2012.138 |
format | Conference Proceeding |
isbn | 1467346497 ; 9781467346498 |
eisbn | 9780769549057 ; 0769549055 |
coden | IEEPAD |
publisher | IEEE |
creationdate | 2012-12 |
tpages | 10 |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 1550-4786 |
ispartof | 2012 IEEE 12th International Conference on Data Mining, 2012, p.745-754 |
issn | 1550-4786 ; 2374-8486 |
language | eng |
recordid | cdi_ieee_primary_6413740 |
source | IEEE Electronic Library (IEL) Conference Proceedings |
subjects | Collaboration ; Communities ; Community detection ; Community scoring functions ; Image edge detection ; Measurement ; Modularity ; Network communities ; Robustness ; Social network services |
title | Defining and Evaluating Network Communities Based on Ground-Truth |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-11T14%3A03%3A17IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Defining%20and%20Evaluating%20Network%20Communities%20Based%20on%20Ground-Truth&rft.btitle=2012%20IEEE%2012th%20International%20Conference%20on%20Data%20Mining&rft.au=Jaewon%20Yang&rft.date=2012-12&rft.spage=745&rft.epage=754&rft.pages=745-754&rft.issn=1550-4786&rft.eissn=2374-8486&rft.isbn=1467346497&rft.isbn_list=9781467346498&rft.coden=IEEPAD&rft_id=info:doi/10.1109/ICDM.2012.138&rft_dat=%3Cieee_6IE%3E6413740%3C/ieee_6IE%3E%3Curl%3E%3C/url%3E&rft.eisbn=9780769549057&rft.eisbn_list=0769549055&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=6413740&rfr_iscdi=true |
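
The description above names Conductance and Triad-participation-ratio as the two best-performing structural definitions but does not define them. The following is a minimal sketch using the commonly cited forms of these scores: conductance as the fraction of a node set's edge endpoints that cross its boundary, and triad participation ratio as the fraction of nodes in the set that close at least one triangle inside it. These formulas are an assumption, not taken from this record, and the function names, the adjacency-dictionary representation and the toy graph are hypothetical, chosen only for illustration.

```python
from itertools import combinations


def conductance(adj, nodes):
    """Boundary edges divided by the total degree (volume) of the node set.

    adj maps each node to the set of its neighbours (undirected graph).
    Lower values indicate a better-separated community.
    """
    members = set(nodes)
    cut = 0            # edge endpoints leaving the set
    inside_twice = 0   # internal edges, each counted from both ends
    for u in members:
        for v in adj[u]:
            if v in members:
                inside_twice += 1
            else:
                cut += 1
    volume = inside_twice + cut
    return cut / volume if volume else 0.0


def triad_participation_ratio(adj, nodes):
    """Fraction of nodes in the set that belong to a triangle inside the set."""
    members = set(nodes)
    if not members:
        return 0.0
    in_triad = 0
    for u in members:
        inner = [v for v in adj[u] if v in members]
        # u is in a triad if two of its in-set neighbours are also adjacent
        if any(b in adj[a] for a, b in combinations(inner, 2)):
            in_triad += 1
    return in_triad / len(members)


# Toy graph (assumed for illustration): a triangle 0-1-2 with a tail 2-3-4.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4}, 4: {3}}

if __name__ == "__main__":
    for candidate in ({0, 1, 2}, {2, 3, 4}):
        print(sorted(candidate),
              conductance(adj, candidate),
              triad_participation_ratio(adj, candidate))
```

On the toy graph the triangle scores as a better community than the tail on both measures: it has lower conductance and a higher triad participation ratio.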