When Subgraph Isomorphism is Really Hard, and Why This Matters for Graph Databases

The subgraph isomorphism problem involves deciding whether a copy of a pattern graph occurs inside a larger target graph. The non-induced version allows extra edges in the target, whilst the induced version does not. Although both variants are NP-complete, algorithms inspired by constraint programmi...

Detailed description

Bibliographic details
Published in:	The Journal of artificial intelligence research 2018-03, Vol.61, p.723-759
Main authors:	McCreesh, Ciaran; Prosser, Patrick; Solnon, Christine; Trimble, James
Format:	Article
Language:	eng
Subjects:	Algorithms; Apexes; Artificial Intelligence; Computer Science; Graph theory; Isomorphism; Phase transitions
Online access:	Full text

container_end_page	759
container_issue
container_start_page	723
container_title	The Journal of artificial intelligence research
container_volume	61
creator	McCreesh, Ciaran; Prosser, Patrick; Solnon, Christine; Trimble, James
description	The subgraph isomorphism problem involves deciding whether a copy of a pattern graph occurs inside a larger target graph. The non-induced version allows extra edges in the target, whilst the induced version does not. Although both variants are NP-complete, algorithms inspired by constraint programming can operate comfortably on many real-world problem instances with thousands of vertices. However, they cannot handle arbitrary instances of this size. We show how to generate "really hard" random instances for subgraph isomorphism problems, which are computationally challenging with a couple of hundred vertices in the target, and only twenty pattern vertices. For the non-induced version of the problem, these instances lie on a satisfiable / unsatisfiable phase transition, whose location we can predict; for the induced variant, much richer behaviour is observed, and constrainedness gives a better measure of difficulty than does proximity to a phase transition. These results have practical consequences: we explain why the widely researched "filter / verify" indexing technique used in graph databases is founded upon a misunderstanding of the empirical hardness of NP-complete problems, and cannot be beneficial when paired with any reasonable subgraph isomorphism algorithm.
doi_str_mv	10.1613/jair.5768
format	Article
fullrecord	<record><control><sourceid>proquest_hal_p</sourceid><recordid>TN_cdi_hal_primary_oai_HAL_hal_01741928v1</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2554081629</sourcerecordid><originalsourceid>FETCH-LOGICAL-c326t-58decf09dca080c4e821c4ef8400a57c1772e87996770e1d681e8fb729dd493</originalsourceid><addsrcrecordid>eNpNkE9Lw0AQxRdRsFYPfoMFT4Kpu5tk_xxL1bZQEdpCj8s0uzEpaTfupkK-vYkV6WVmePPmx_AQuqdkRDmNn3dQ-lEquLxAA0oEj5RIxeXZfI1uQtgRQlXC5AAtN4U94NVx--mhLvA8uL3zdVGGPS4DXlqoqhbPwJsnDAeDN0WL190Wv0PTWB9w7jye_p6-QANbCDbcoqscqmDv_voQrd5e15NZtPiYzifjRZTFjDdRKo3NcqJMBkSSLLGS0a7mMiEEUpFRIZiVQikuBLHUcEmtzLeCKWMSFQ_R44laQKVrX-7Bt9pBqWfjhe41QkVCFZPftPM-nLy1d19HGxq9c0d_6J7TLE0TIilnZ8TMuxC8zf-xlOg-XN2Hq_tw4x9StGry</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2554081629</pqid></control><display><type>article</type><title>When Subgraph Isomorphism is Really Hard, and Why This Matters for Graph Databases</title><source>DOAJ Directory of Open Access Journals</source><source>EZB-FREE-00999 freely available EZB journals</source><source>Free E- Journals</source><creator>McCreesh, Ciaran ; Prosser, Patrick ; Solnon, Christine ; Trimble, James</creator><creatorcontrib>McCreesh, Ciaran ; Prosser, Patrick ; Solnon, Christine ; Trimble, James</creatorcontrib><description>The subgraph isomorphism problem involves deciding whether a copy of a pattern graph occurs inside a larger target graph. The non-induced version allows extra edges in the target, whilst the induced version does not. Although both variants are NP-complete, algorithms inspired by constraint programming can operate comfortably on many real-world problem instances with thousands of vertices. However, they cannot handle arbitrary instances of this size. 
We show how to generate "really hard" random instances for subgraph isomorphism problems, which are computationally challenging with a couple of hundred vertices in the target, and only twenty pattern vertices. For the non-induced version of the problem, these instances lie on a satisfiable / unsatisfiable phase transition, whose location we can predict; for the induced variant, much richer behaviour is observed, and constrainedness gives a better measure of difficulty than does proximity to a phase transition. These results have practical consequences: we explain why the widely researched "filter / verify" indexing technique used in graph databases is founded upon a misunderstanding of the empirical hardness of NP-complete problems, and cannot be beneficial when paired with any reasonable subgraph isomorphism algorithm.</description><identifier>ISSN: 1076-9757</identifier><identifier>EISSN: 1076-9757</identifier><identifier>EISSN: 1943-5037</identifier><identifier>DOI: 10.1613/jair.5768</identifier><language>eng</language><publisher>San Francisco: AI Access Foundation</publisher><subject>Algorithms ; Apexes ; Artificial Intelligence ; Computer Science ; Graph theory ; Isomorphism ; Phase transitions</subject><ispartof>The Journal of artificial intelligence research, 2018-03, Vol.61, p.723-759</ispartof><rights>2018. 
Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the associated terms available at https://www.jair.org/index.php/jair/about</rights><rights>Distributed under a Creative Commons Attribution 4.0 International License</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c326t-58decf09dca080c4e821c4ef8400a57c1772e87996770e1d681e8fb729dd493</citedby><orcidid>0000-0002-0919-496X</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>230,315,782,786,866,887,27933,27934</link.rule.ids><backlink>$$Uhttps://hal.science/hal-01741928$$DView record in HAL$$Hfree_for_read</backlink></links><search><creatorcontrib>McCreesh, Ciaran</creatorcontrib><creatorcontrib>Prosser, Patrick</creatorcontrib><creatorcontrib>Solnon, Christine</creatorcontrib><creatorcontrib>Trimble, James</creatorcontrib><title>When Subgraph Isomorphism is Really Hard, and Why This Matters for Graph Databases</title><title>The Journal of artificial intelligence research</title><description>The subgraph isomorphism problem involves deciding whether a copy of a pattern graph occurs inside a larger target graph. The non-induced version allows extra edges in the target, whilst the induced version does not. Although both variants are NP-complete, algorithms inspired by constraint programming can operate comfortably on many real-world problem instances with thousands of vertices. However, they cannot handle arbitrary instances of this size. We show how to generate "really hard" random instances for subgraph isomorphism problems, which are computationally challenging with a couple of hundred vertices in the target, and only twenty pattern vertices. 
For the non-induced version of the problem, these instances lie on a satisfiable / unsatisfiable phase transition, whose location we can predict; for the induced variant, much richer behaviour is observed, and constrainedness gives a better measure of difficulty than does proximity to a phase transition. These results have practical consequences: we explain why the widely researched "filter / verify" indexing technique used in graph databases is founded upon a misunderstanding of the empirical hardness of NP-complete problems, and cannot be beneficial when paired with any reasonable subgraph isomorphism algorithm.</description><subject>Algorithms</subject><subject>Apexes</subject><subject>Artificial Intelligence</subject><subject>Computer Science</subject><subject>Graph theory</subject><subject>Isomorphism</subject><subject>Phase transitions</subject><issn>1076-9757</issn><issn>1076-9757</issn><issn>1943-5037</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2018</creationdate><recordtype>article</recordtype><sourceid>ABUWG</sourceid><sourceid>AFKRA</sourceid><sourceid>AZQEC</sourceid><sourceid>BENPR</sourceid><sourceid>CCPQU</sourceid><sourceid>DWQXO</sourceid><sourceid>GNUQQ</sourceid><recordid>eNpNkE9Lw0AQxRdRsFYPfoMFT4Kpu5tk_xxL1bZQEdpCj8s0uzEpaTfupkK-vYkV6WVmePPmx_AQuqdkRDmNn3dQ-lEquLxAA0oEj5RIxeXZfI1uQtgRQlXC5AAtN4U94NVx--mhLvA8uL3zdVGGPS4DXlqoqhbPwJsnDAeDN0WL190Wv0PTWB9w7jye_p6-QANbCDbcoqscqmDv_voQrd5e15NZtPiYzifjRZTFjDdRKo3NcqJMBkSSLLGS0a7mMiEEUpFRIZiVQikuBLHUcEmtzLeCKWMSFQ_R44laQKVrX-7Bt9pBqWfjhe41QkVCFZPftPM-nLy1d19HGxq9c0d_6J7TLE0TIilnZ8TMuxC8zf-xlOg-XN2Hq_tw4x9StGry</recordid><startdate>20180330</startdate><enddate>20180330</enddate><creator>McCreesh, Ciaran</creator><creator>Prosser, Patrick</creator><creator>Solnon, Christine</creator><creator>Trimble, James</creator><general>AI Access Foundation</general><general>Association for the Advancement of Artificial 
Intelligence</general><scope>AAYXX</scope><scope>CITATION</scope><scope>8FE</scope><scope>8FG</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>GNUQQ</scope><scope>HCIFZ</scope><scope>JQ2</scope><scope>K7-</scope><scope>P62</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>1XC</scope><scope>VOOES</scope><orcidid>https://orcid.org/0000-0002-0919-496X</orcidid></search><sort><creationdate>20180330</creationdate><title>When Subgraph Isomorphism is Really Hard, and Why This Matters for Graph Databases</title><author>McCreesh, Ciaran ; Prosser, Patrick ; Solnon, Christine ; Trimble, James</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c326t-58decf09dca080c4e821c4ef8400a57c1772e87996770e1d681e8fb729dd493</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2018</creationdate><topic>Algorithms</topic><topic>Apexes</topic><topic>Artificial Intelligence</topic><topic>Computer Science</topic><topic>Graph theory</topic><topic>Isomorphism</topic><topic>Phase transitions</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>McCreesh, Ciaran</creatorcontrib><creatorcontrib>Prosser, Patrick</creatorcontrib><creatorcontrib>Solnon, Christine</creatorcontrib><creatorcontrib>Trimble, James</creatorcontrib><collection>CrossRef</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central UK/Ireland</collection><collection>Advanced Technologies & Aerospace Collection</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest 
One Community College</collection><collection>ProQuest Central Korea</collection><collection>ProQuest Central Student</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Computer Science Collection</collection><collection>Computer Science Database</collection><collection>ProQuest Advanced Technologies & Aerospace Collection</collection><collection>Access via ProQuest (Open Access)</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Hyper Article en Ligne (HAL)</collection><collection>Hyper Article en Ligne (HAL) (Open Access)</collection><jtitle>The Journal of artificial intelligence research</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>McCreesh, Ciaran</au><au>Prosser, Patrick</au><au>Solnon, Christine</au><au>Trimble, James</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>When Subgraph Isomorphism is Really Hard, and Why This Matters for Graph Databases</atitle><jtitle>The Journal of artificial intelligence research</jtitle><date>2018-03-30</date><risdate>2018</risdate><volume>61</volume><spage>723</spage><epage>759</epage><pages>723-759</pages><issn>1076-9757</issn><eissn>1076-9757</eissn><eissn>1943-5037</eissn><abstract>The subgraph isomorphism problem involves deciding whether a copy of a pattern graph occurs inside a larger target graph. The non-induced version allows extra edges in the target, whilst the induced version does not. Although both variants are NP-complete, algorithms inspired by constraint programming can operate comfortably on many real-world problem instances with thousands of vertices. However, they cannot handle arbitrary instances of this size. 
We show how to generate "really hard" random instances for subgraph isomorphism problems, which are computationally challenging with a couple of hundred vertices in the target, and only twenty pattern vertices. For the non-induced version of the problem, these instances lie on a satisfiable / unsatisfiable phase transition, whose location we can predict; for the induced variant, much richer behaviour is observed, and constrainedness gives a better measure of difficulty than does proximity to a phase transition. These results have practical consequences: we explain why the widely researched "filter / verify" indexing technique used in graph databases is founded upon a misunderstanding of the empirical hardness of NP-complete problems, and cannot be beneficial when paired with any reasonable subgraph isomorphism algorithm.</abstract><cop>San Francisco</cop><pub>AI Access Foundation</pub><doi>10.1613/jair.5768</doi><tpages>37</tpages><orcidid>https://orcid.org/0000-0002-0919-496X</orcidid><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	ISSN: 1076-9757
ispartof	The Journal of artificial intelligence research, 2018-03, Vol.61, p.723-759
issn	1076-9757 1076-9757 1943-5037
language	eng
recordid	cdi_hal_primary_oai_HAL_hal_01741928v1
source	DOAJ Directory of Open Access Journals; EZB-FREE-00999 freely available EZB journals; Free E- Journals
subjects	Algorithms; Apexes; Artificial Intelligence; Computer Science; Graph theory; Isomorphism; Phase transitions
title	When Subgraph Isomorphism is Really Hard, and Why This Matters for Graph Databases
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-03T08%3A44%3A40IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_hal_p&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=When%20Subgraph%20Isomorphism%20is%20Really%20Hard,%20and%20Why%20This%20Matters%20for%20Graph%20Databases&rft.jtitle=The%20Journal%20of%20artificial%20intelligence%20research&rft.au=McCreesh,%20Ciaran&rft.date=2018-03-30&rft.volume=61&rft.spage=723&rft.epage=759&rft.pages=723-759&rft.issn=1076-9757&rft.eissn=1076-9757&rft_id=info:doi/10.1613/jair.5768&rft_dat=%3Cproquest_hal_p%3E2554081629%3C/proquest_hal_p%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2554081629&rft_id=info:pmid/&rfr_iscdi=true