Efficient Correlation Search from Graph Databases

Correlation mining has gained great success in many application domains for its ability to capture the underlying dependency between objects. However, research on correlation mining from graph databases is still lacking despite the proliferation of graph data in recent years. We propose a new proble...

Full description

Bibliographic details
Published in:	IEEE transactions on knowledge and data engineering, 2008-12, Vol.20 (12), p.1601-1615
Main authors:	Yiping Ke ; Cheng, J. ; Ng, W.
Format:	Article
Language:	eng
Subjects:	Algorithms ; Applied sciences ; Chemicals ; Chemistry ; Computational biology ; Computer science; control theory; systems ; Correlation ; Data mining ; Data models ; Data processing. List processing. Character string processing ; Drugs ; Exact sciences and technology ; Graphs ; Information retrieval. Graph ; Memory organisation. Data processing ; Mines ; Mining ; Mining methods and algorithms ; Multimedia databases ; Queries ; Searching ; Software ; Streaming media ; Studies ; Theoretical computing ; Transaction databases ; XML
Online access:	Order full text

container_end_page	1615
container_issue	12
container_start_page	1601
container_title	IEEE transactions on knowledge and data engineering
container_volume	20
creator	Yiping Ke ; Cheng, J. ; Ng, W.
description	Correlation mining has gained great success in many application domains for its ability to capture the underlying dependency between objects. However, research on correlation mining from graph databases is still lacking despite the proliferation of graph data in recent years. We propose a new problem of correlation mining from graph databases, called correlated graph search (CGS). CGS adopts Pearson's correlation coefficient to take into account the occurrence distributions of graphs. However, the problem poses significant challenges, since every subgraph of a graph in the database is a candidate but the number of subgraphs is exponential. We derive two necessary conditions that set bounds on the occurrence probability of a candidate in the database. With this result, we devise an efficient algorithm that mines the candidate set from a much smaller projected database and thus a significantly smaller set of candidates is obtained. Three heuristic rules are further developed to refine the candidate set. We also make use of the bounds to directly answer high-support queries without mining the candidates. Experimental results justify the efficiency of our algorithm. Finally, we generalize the CGS problem and show that our algorithm provides a general solution to most of the existing correlation measures.
doi_str_mv	10.1109/TKDE.2008.86
format	Article
fullrecord	<record><control><sourceid>proquest_RIE</sourceid><recordid>TN_cdi_pascalfrancis_primary_20850808</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>4515864</ieee_id><sourcerecordid>875067874</sourcerecordid><originalsourceid>FETCH-LOGICAL-c413t-a404ca89d78f2425bc3db30a93cbf0b3fca1f27acc38c839027dd9e988d70c743</originalsourceid><addsrcrecordid>eNp90D1PwzAQBmALgUQpbGwsERKwkOKvxOcRteVDVGKgzNbFsdVUaVLsdODfk9KqAwOTT_Jzr3QvIZeMjhij-mH-NpmOOKUwgvyIDFiWQcqZZsf9TCVLpZDqlJzFuKQ9UsAGhE29r2zlmi4ZtyG4GruqbZIPh8EuEh_aVfIccL1IJthhgdHFc3LisY7uYv8OyefTdD5-SWfvz6_jx1lqJRNdipJKi6BLBZ5LnhVWlIWgqIUtPC2Et8g8V2itAAtCU67KUjsNUCpqlRRDcrfLXYf2a-NiZ1ZVtK6usXHtJhpQGc0V_Mrbf6WQGTBQeQ-v_8BluwlNf4XRjHOlpRY9ut8hG9oYg_NmHaoVhm_DqNnWbLY1m23NBraZN_tMjBZrH7CxVTzscAoZBQq9u9q5yjl3-JYZyyCX4gf9z4ON</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>912279493</pqid></control><display><type>article</type><title>Efficient Correlation Search from Graph Databases</title><source>IEEE Electronic Library (IEL)</source><creator>Yiping Ke ; Cheng, J. ; Ng, W.</creator><creatorcontrib>Yiping Ke ; Cheng, J. ; Ng, W.</creatorcontrib><description>Correlation mining has gained great success in many application domains for its ability to capture the underlying dependency between objects. However, research on correlation mining from graph databases is still lacking despite the proliferation of graph data in recent years. We propose a new problem of correlation mining from graph databases, called correlated graph search (CGS). CGS adopts Pearson's correlation coefficient to take into account the occurrence distributions of graphs. However, the problem poses significant challenges, since every subgraph of a graph in the database is a candidate but the number of subgraphs is exponential. We derive two necessary conditions that set bounds on the occurrence probability of a candidate in the database. 
With this result, we devise an efficient algorithm that mines the candidate set from a much smaller projected database and thus a significantly smaller set of candidates is obtained. Three heuristic rules are further developed to refine the candidate set. We also make use of the bounds to directly answer high-support queries without mining the candidates. Experimental results justify the efficiency of our algorithm. Finally, we generalize the CGS problem and show that our algorithm provides a general solution to most of the existing correlation measures.</description><identifier>ISSN: 1041-4347</identifier><identifier>EISSN: 1558-2191</identifier><identifier>DOI: 10.1109/TKDE.2008.86</identifier><identifier>CODEN: ITKEEH</identifier><language>eng</language><publisher>New York, NY: IEEE</publisher><subject>Algorithms ; Applied sciences ; Chemicals ; Chemistry ; Computational biology ; Computer science; control theory; systems ; Correlation ; Data mining ; Data models ; Data processing. List processing. Character string processing ; Drugs ; Exact sciences and technology ; Graphs ; Information retrieval. Graph ; Memory organisation. Data processing ; Mines ; Mining ; Mining methods and algorithms ; Multimedia databases ; Queries ; Searching ; Software ; Streaming media ; Studies ; Theoretical computing ; Transaction databases ; XML</subject><ispartof>IEEE transactions on knowledge and data engineering, 2008-12, Vol.20 (12), p.1601-1615</ispartof><rights>2009 INIST-CNRS</rights><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2008</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c413t-a404ca89d78f2425bc3db30a93cbf0b3fca1f27acc38c839027dd9e988d70c743</citedby><cites>FETCH-LOGICAL-c413t-a404ca89d78f2425bc3db30a93cbf0b3fca1f27acc38c839027dd9e988d70c743</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/4515864$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,776,780,792,27901,27902,54733</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/4515864$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=20850808$$DView record in Pascal Francis$$Hfree_for_read</backlink></links><search><creatorcontrib>Yiping Ke</creatorcontrib><creatorcontrib>Cheng, J.</creatorcontrib><creatorcontrib>Ng, W.</creatorcontrib><title>Efficient Correlation Search from Graph Databases</title><title>IEEE transactions on knowledge and data engineering</title><addtitle>TKDE</addtitle><description>Correlation mining has gained great success in many application domains for its ability to capture the underlying dependency between objects. However, research on correlation mining from graph databases is still lacking despite the proliferation of graph data in recent years. We propose a new problem of correlation mining from graph databases, called correlated graph search (CGS). CGS adopts Pearson's correlation coefficient to take into account the occurrence distributions of graphs. However, the problem poses significant challenges, since every subgraph of a graph in the database is a candidate but the number of subgraphs is exponential. We derive two necessary conditions that set bounds on the occurrence probability of a candidate in the database. 
With this result, we devise an efficient algorithm that mines the candidate set from a much smaller projected database and thus a significantly smaller set of candidates is obtained. Three heuristic rules are further developed to refine the candidate set. We also make use of the bounds to directly answer high-support queries without mining the candidates. Experimental results justify the efficiency of our algorithm. Finally, we generalize the CGS problem and show that our algorithm provides a general solution to most of the existing correlation measures.</description><subject>Algorithms</subject><subject>Applied sciences</subject><subject>Chemicals</subject><subject>Chemistry</subject><subject>Computational biology</subject><subject>Computer science; control theory; systems</subject><subject>Correlation</subject><subject>Data mining</subject><subject>Data models</subject><subject>Data processing. List processing. Character string processing</subject><subject>Drugs</subject><subject>Exact sciences and technology</subject><subject>Graphs</subject><subject>Information retrieval. Graph</subject><subject>Memory organisation. 
Data processing</subject><subject>Mines</subject><subject>Mining</subject><subject>Mining methods and algorithms</subject><subject>Multimedia databases</subject><subject>Queries</subject><subject>Searching</subject><subject>Software</subject><subject>Streaming media</subject><subject>Studies</subject><subject>Theoretical computing</subject><subject>Transaction databases</subject><subject>XML</subject><issn>1041-4347</issn><issn>1558-2191</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2008</creationdate><recordtype>article</recordtype><sourceid>RIE</sourceid><recordid>eNp90D1PwzAQBmALgUQpbGwsERKwkOKvxOcRteVDVGKgzNbFsdVUaVLsdODfk9KqAwOTT_Jzr3QvIZeMjhij-mH-NpmOOKUwgvyIDFiWQcqZZsf9TCVLpZDqlJzFuKQ9UsAGhE29r2zlmi4ZtyG4GruqbZIPh8EuEh_aVfIccL1IJthhgdHFc3LisY7uYv8OyefTdD5-SWfvz6_jx1lqJRNdipJKi6BLBZ5LnhVWlIWgqIUtPC2Et8g8V2itAAtCU67KUjsNUCpqlRRDcrfLXYf2a-NiZ1ZVtK6usXHtJhpQGc0V_Mrbf6WQGTBQeQ-v_8BluwlNf4XRjHOlpRY9ut8hG9oYg_NmHaoVhm_DqNnWbLY1m23NBraZN_tMjBZrH7CxVTzscAoZBQq9u9q5yjl3-JYZyyCX4gf9z4ON</recordid><startdate>20081201</startdate><enddate>20081201</enddate><creator>Yiping Ke</creator><creator>Cheng, J.</creator><creator>Ng, W.</creator><general>IEEE</general><general>IEEE Computer Society</general><general>The Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>IQODW</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>F28</scope><scope>FR3</scope></search><sort><creationdate>20081201</creationdate><title>Efficient Correlation Search from Graph Databases</title><author>Yiping Ke ; Cheng, J. 
; Ng, W.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c413t-a404ca89d78f2425bc3db30a93cbf0b3fca1f27acc38c839027dd9e988d70c743</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2008</creationdate><topic>Algorithms</topic><topic>Applied sciences</topic><topic>Chemicals</topic><topic>Chemistry</topic><topic>Computational biology</topic><topic>Computer science; control theory; systems</topic><topic>Correlation</topic><topic>Data mining</topic><topic>Data models</topic><topic>Data processing. List processing. Character string processing</topic><topic>Drugs</topic><topic>Exact sciences and technology</topic><topic>Graphs</topic><topic>Information retrieval. Graph</topic><topic>Memory organisation. Data processing</topic><topic>Mines</topic><topic>Mining</topic><topic>Mining methods and algorithms</topic><topic>Multimedia databases</topic><topic>Queries</topic><topic>Searching</topic><topic>Software</topic><topic>Streaming media</topic><topic>Studies</topic><topic>Theoretical computing</topic><topic>Transaction databases</topic><topic>XML</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Yiping Ke</creatorcontrib><creatorcontrib>Cheng, J.</creatorcontrib><creatorcontrib>Ng, W.</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>Pascal-Francis</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics & Communications Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts 
Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>ANTE: Abstracts in New Technology & Engineering</collection><collection>Engineering Research Database</collection><jtitle>IEEE transactions on knowledge and data engineering</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Yiping Ke</au><au>Cheng, J.</au><au>Ng, W.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Efficient Correlation Search from Graph Databases</atitle><jtitle>IEEE transactions on knowledge and data engineering</jtitle><stitle>TKDE</stitle><date>2008-12-01</date><risdate>2008</risdate><volume>20</volume><issue>12</issue><spage>1601</spage><epage>1615</epage><pages>1601-1615</pages><issn>1041-4347</issn><eissn>1558-2191</eissn><coden>ITKEEH</coden><abstract>Correlation mining has gained great success in many application domains for its ability to capture the underlying dependency between objects. However, research on correlation mining from graph databases is still lacking despite the proliferation of graph data in recent years. We propose a new problem of correlation mining from graph databases, called correlated graph search (CGS). CGS adopts Pearson's correlation coefficient to take into account the occurrence distributions of graphs. However, the problem poses significant challenges, since every subgraph of a graph in the database is a candidate but the number of subgraphs is exponential. We derive two necessary conditions that set bounds on the occurrence probability of a candidate in the database. With this result, we devise an efficient algorithm that mines the candidate set from a much smaller projected database and thus a significantly smaller set of candidates is obtained. Three heuristic rules are further developed to refine the candidate set. 
We also make use of the bounds to directly answer high-support queries without mining the candidates. Experimental results justify the efficiency of our algorithm. Finally, we generalize the CGS problem and show that our algorithm provides a general solution to most of the existing correlation measures.</abstract><cop>New York, NY</cop><pub>IEEE</pub><doi>10.1109/TKDE.2008.86</doi><tpages>15</tpages><oa>free_for_read</oa></addata></record>
fulltext	fulltext_linktorsrc
identifier	ISSN: 1041-4347
ispartof	IEEE transactions on knowledge and data engineering, 2008-12, Vol.20 (12), p.1601-1615
issn	1041-4347 1558-2191
language	eng
recordid	cdi_pascalfrancis_primary_20850808
source	IEEE Electronic Library (IEL)
subjects	Algorithms ; Applied sciences ; Chemicals ; Chemistry ; Computational biology ; Computer science; control theory; systems ; Correlation ; Data mining ; Data models ; Data processing. List processing. Character string processing ; Drugs ; Exact sciences and technology ; Graphs ; Information retrieval. Graph ; Memory organisation. Data processing ; Mines ; Mining ; Mining methods and algorithms ; Multimedia databases ; Queries ; Searching ; Software ; Streaming media ; Studies ; Theoretical computing ; Transaction databases ; XML
title	Efficient Correlation Search from Graph Databases
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-01T16%3A50%3A12IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_RIE&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Efficient%20Correlation%20Search%20from%20Graph%20Databases&rft.jtitle=IEEE%20transactions%20on%20knowledge%20and%20data%20engineering&rft.au=Yiping%20Ke&rft.date=2008-12-01&rft.volume=20&rft.issue=12&rft.spage=1601&rft.epage=1615&rft.pages=1601-1615&rft.issn=1041-4347&rft.eissn=1558-2191&rft.coden=ITKEEH&rft_id=info:doi/10.1109/TKDE.2008.86&rft_dat=%3Cproquest_RIE%3E875067874%3C/proquest_RIE%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=912279493&rft_id=info:pmid/&rft_ieee_id=4515864&rfr_iscdi=true
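
Illustrative note on the description field: the abstract says that CGS adopts Pearson's correlation coefficient to take into account the occurrence distributions of graphs. The sketch below shows one common way to compute that coefficient for two binary occurrence variables (the phi coefficient), assuming each pattern either occurs or does not occur in each database graph. The function names, the representation of occurrences as sets of graph ids, and the sample data are assumptions made for illustration; they are not taken from the article, and this is not the authors' search algorithm.

```python
from math import sqrt


def support(occurrences, db_size):
    """Fraction of database graphs that contain the pattern.

    `occurrences` is a set of graph ids (hypothetical representation).
    """
    return len(occurrences) / db_size


def pearson_correlation(occ_a, occ_b, db_size):
    """Pearson (phi) correlation of two binary occurrence variables.

    phi = (supp(a,b) - supp(a)*supp(b))
          / sqrt(supp(a)*supp(b)*(1 - supp(a))*(1 - supp(b)))
    """
    supp_a = support(occ_a, db_size)
    supp_b = support(occ_b, db_size)
    # Joint support: graphs that contain both patterns.
    supp_ab = support(occ_a & occ_b, db_size)
    denom = sqrt(supp_a * supp_b * (1 - supp_a) * (1 - supp_b))
    if denom == 0:
        # A pattern that occurs in no graph or in every graph has zero variance.
        raise ValueError("correlation is undefined when a support is 0 or 1")
    return (supp_ab - supp_a * supp_b) / denom


# Made-up example: a database of 10 graphs, ids 0..9.
query_occ = {0, 1, 2, 3}      # graphs containing the query pattern
candidate_occ = {0, 1, 2, 5}  # graphs containing a candidate pattern
phi = pearson_correlation(query_occ, candidate_occ, 10)
```

In a search setting, a candidate would be reported only if this value reaches a user-chosen minimum correlation threshold. The bounds on occurrence probability mentioned in the abstract serve to avoid evaluating most candidates at all; they are not reproduced here.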