SAGA: a subgraph matching tool for biological graphs

Motivation: With the rapid increase in the availability of biological graph datasets, there is a growing need for effective and efficient graph querying methods. Due to the noisy and incomplete characteristics of these datasets, exact graph matching methods have limited use and approximate graph mat...

Detailed description

Bibliographic details
Published in: Bioinformatics 2007-01, Vol.23 (2), p.232-239
Main authors: Tian, Yuanyuan; McEachin, Richard C.; Santos, Carlos; States, David J.; Patel, Jignesh M.
Format: Article
Language: eng
Online access: Order full text
container_end_page 239
container_issue 2
container_start_page 232
container_title Bioinformatics
container_volume 23
creator Tian, Yuanyuan
McEachin, Richard C.
Santos, Carlos
States, David J.
Patel, Jignesh M.
description Motivation: With the rapid increase in the availability of biological graph datasets, there is a growing need for effective and efficient graph querying methods. Due to the noisy and incomplete characteristics of these datasets, exact graph matching methods have limited use and approximate graph matching methods are required. Unfortunately, existing graph matching methods are too restrictive as they only allow exact or near-exact graph matching. This paper presents a novel approximate graph matching technique called SAGA. This technique employs a flexible model for computing graph similarity, which allows for node gaps, node mismatches and graph structural differences. SAGA employs an indexing technique that allows it to efficiently evaluate queries even against large graph datasets. Results: SAGA has been used to query biological pathways and literature datasets, which has revealed interesting similarities between distinct pathways that cannot be found by existing methods. These matches associate seemingly unrelated biological processes, connect studies in different sub-areas of biomedical research and thus pose hypotheses for new discoveries. SAGA is also orders of magnitude faster than existing methods. Availability: SAGA can be accessed freely via the web at . Binaries are also freely available at this website. Contact: jignesh@eecs.umich.edu Supplementary material: Supplementary material is available at .
doi_str_mv 10.1093/bioinformatics/btl571
format Article
fullrecord <record><control><sourceid>proquest_TOX</sourceid><recordid>TN_cdi_proquest_miscellaneous_68933353</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><oup_id>10.1093/bioinformatics/btl571</oup_id><sourcerecordid>1202725431</sourcerecordid><originalsourceid>FETCH-LOGICAL-c587t-734df383db5d48fbba8b10bc711d203e3ff10e3499999b2bce810ec18eda35b93</originalsourceid><addsrcrecordid>eNqNkM1O3DAUha2qqPz1EYqiSmUXsOfasdPddNSZQYBYAFLFxrIdZzBk4sFOpPbt65ARqGzo3fha-s45ugehLwSfEFzCqXbetbUPa9U5E0911zBOPqA9QgucTzArP6YdCp5TgWEX7cf4gDEjlNJPaJdwQjAUYg_R6-li-j1TWez1KqjNfZYMzb1rV1nnfZOlhCxFNX7ljGqyZyQeop1aNdF-3r4H6Hb-82a2zC-uFmez6UVumOBdzoFWNQioNKuoqLVWQhOsTQqvJhgs1DXBFmg5jJ5oY0X6GyJspYDpEg7Q8ei7Cf6pt7GTaxeNbRrVWt9HWYgSABi8C5KSAWcwOH59Az74PrTpiMSI5McnA8RGyAQfY7C13AS3VuGPJFgO5ct_y5dj-Ul3tDXv9dpWr6pt2wn4tgVUTG3WQbXGxVdOUM4LPpyDR873m__OzkeJi539_SJS4VEmR87k8tednP2Y3_HL-blcwF_UsLFg</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>198689729</pqid></control><display><type>article</type><title>SAGA: a subgraph matching tool for biological graphs</title><source>Access via Oxford University Press (Open Access Collection)</source><creator>Tian, Yuanyuan ; McEachin, Richard C. ; Santos, Carlos ; States, David J. ; Patel, Jignesh M.</creator><creatorcontrib>Tian, Yuanyuan ; McEachin, Richard C. ; Santos, Carlos ; States, David J. ; Patel, Jignesh M.</creatorcontrib><description>Motivation: With the rapid increase in the availability of biological graph datasets, there is a growing need for effective and efficient graph querying methods. Due to the noisy and incomplete characteristics of these datasets, exact graph matching methods have limited use and approximate graph matching methods are required. Unfortunately, existing graph matching methods are too restrictive as they only allow exact or near exact graph matching. This paper presents a novel approximate graph matching technique called SAGA. 
This technique employs a flexible model for computing graph similarity, which allows for node gaps, node mismatches and graph structural differences. SAGA employs an indexing technique that allows it to efficiently evaluate queries even against large graph datasets. Results: SAGA has been used to query biological pathways and literature datasets, which has revealed interesting similarities between distinct pathways that cannot be found by existing methods. These matches associate seemingly unrelated biological processes, connect studies in different sub-areas of biomedical research and thus pose hypotheses for new discoveries. SAGA is also orders of magnitude faster than existing methods. Availability: SAGA can be accessed freely via the web at . Binaries are also freely available at this website. Contact:jignesh@eecs.umich.edu Supplementary material: Supplementary material is available at .</description><identifier>ISSN: 1367-4803</identifier><identifier>EISSN: 1460-2059</identifier><identifier>EISSN: 1367-4811</identifier><identifier>DOI: 10.1093/bioinformatics/btl571</identifier><identifier>PMID: 17110368</identifier><identifier>CODEN: BOINFP</identifier><language>eng</language><publisher>Oxford: Oxford University Press</publisher><subject>Algorithms ; Biological and medical sciences ; Computer Graphics ; Computer Simulation ; Database Management Systems ; Databases, Protein ; Fundamental and applied biological sciences. Psychology ; General aspects ; Mathematics in biology. Statistical analysis. Models. Metrology. 
Data processing in biology (general aspects) ; Models, Biological ; Pattern Recognition, Automated - methods ; Proteome - metabolism ; Signal Transduction - physiology ; Software ; User-Computer Interface</subject><ispartof>Bioinformatics, 2007-01, Vol.23 (2), p.232-239</ispartof><rights>2006 The Author(s) 2006</rights><rights>2007 INIST-CNRS</rights><rights>Copyright Oxford University Press(England) Jan 2007</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c587t-734df383db5d48fbba8b10bc711d203e3ff10e3499999b2bce810ec18eda35b93</citedby><cites>FETCH-LOGICAL-c587t-734df383db5d48fbba8b10bc711d203e3ff10e3499999b2bce810ec18eda35b93</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>315,781,785,1605,27929,27930</link.rule.ids><linktorsrc>$$Uhttps://dx.doi.org/10.1093/bioinformatics/btl571$$EView_record_in_Oxford_University_Press$$FView_record_in_$$GOxford_University_Press</linktorsrc><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&amp;idt=18477673$$DView record in Pascal Francis$$Hfree_for_read</backlink><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/17110368$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Tian, Yuanyuan</creatorcontrib><creatorcontrib>McEachin, Richard C.</creatorcontrib><creatorcontrib>Santos, Carlos</creatorcontrib><creatorcontrib>States, David J.</creatorcontrib><creatorcontrib>Patel, Jignesh M.</creatorcontrib><title>SAGA: a subgraph matching tool for biological graphs</title><title>Bioinformatics</title><addtitle>Bioinformatics</addtitle><description>Motivation: With the rapid increase in the availability of biological graph datasets, there is a growing need for effective and efficient graph querying methods. 
Due to the noisy and incomplete characteristics of these datasets, exact graph matching methods have limited use and approximate graph matching methods are required. Unfortunately, existing graph matching methods are too restrictive as they only allow exact or near exact graph matching. This paper presents a novel approximate graph matching technique called SAGA. This technique employs a flexible model for computing graph similarity, which allows for node gaps, node mismatches and graph structural differences. SAGA employs an indexing technique that allows it to efficiently evaluate queries even against large graph datasets. Results: SAGA has been used to query biological pathways and literature datasets, which has revealed interesting similarities between distinct pathways that cannot be found by existing methods. These matches associate seemingly unrelated biological processes, connect studies in different sub-areas of biomedical research and thus pose hypotheses for new discoveries. SAGA is also orders of magnitude faster than existing methods. Availability: SAGA can be accessed freely via the web at . Binaries are also freely available at this website. Contact:jignesh@eecs.umich.edu Supplementary material: Supplementary material is available at .</description><subject>Algorithms</subject><subject>Biological and medical sciences</subject><subject>Computer Graphics</subject><subject>Computer Simulation</subject><subject>Database Management Systems</subject><subject>Databases, Protein</subject><subject>Fundamental and applied biological sciences. Psychology</subject><subject>General aspects</subject><subject>Mathematics in biology. Statistical analysis. Models. Metrology. 
Data processing in biology (general aspects)</subject><subject>Models, Biological</subject><subject>Pattern Recognition, Automated - methods</subject><subject>Proteome - metabolism</subject><subject>Signal Transduction - physiology</subject><subject>Software</subject><subject>User-Computer Interface</subject><issn>1367-4803</issn><issn>1460-2059</issn><issn>1367-4811</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2007</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNqNkM1O3DAUha2qqPz1EYqiSmUXsOfasdPddNSZQYBYAFLFxrIdZzBk4sFOpPbt65ARqGzo3fha-s45ugehLwSfEFzCqXbetbUPa9U5E0911zBOPqA9QgucTzArP6YdCp5TgWEX7cf4gDEjlNJPaJdwQjAUYg_R6-li-j1TWez1KqjNfZYMzb1rV1nnfZOlhCxFNX7ljGqyZyQeop1aNdF-3r4H6Hb-82a2zC-uFmez6UVumOBdzoFWNQioNKuoqLVWQhOsTQqvJhgs1DXBFmg5jJ5oY0X6GyJspYDpEg7Q8ei7Cf6pt7GTaxeNbRrVWt9HWYgSABi8C5KSAWcwOH59Az74PrTpiMSI5McnA8RGyAQfY7C13AS3VuGPJFgO5ct_y5dj-Ul3tDXv9dpWr6pt2wn4tgVUTG3WQbXGxVdOUM4LPpyDR873m__OzkeJi539_SJS4VEmR87k8tednP2Y3_HL-blcwF_UsLFg</recordid><startdate>20070115</startdate><enddate>20070115</enddate><creator>Tian, Yuanyuan</creator><creator>McEachin, Richard C.</creator><creator>Santos, Carlos</creator><creator>States, David J.</creator><creator>Patel, Jignesh M.</creator><general>Oxford University Press</general><general>Oxford Publishing Limited 
(England)</general><scope>BSCLL</scope><scope>IQODW</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7QF</scope><scope>7QO</scope><scope>7QQ</scope><scope>7SC</scope><scope>7SE</scope><scope>7SP</scope><scope>7SR</scope><scope>7TA</scope><scope>7TB</scope><scope>7TM</scope><scope>7TO</scope><scope>7U5</scope><scope>8BQ</scope><scope>8FD</scope><scope>F28</scope><scope>FR3</scope><scope>H8D</scope><scope>H8G</scope><scope>H94</scope><scope>JG9</scope><scope>JQ2</scope><scope>K9.</scope><scope>KR7</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>P64</scope><scope>7X8</scope></search><sort><creationdate>20070115</creationdate><title>SAGA: a subgraph matching tool for biological graphs</title><author>Tian, Yuanyuan ; McEachin, Richard C. ; Santos, Carlos ; States, David J. ; Patel, Jignesh M.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c587t-734df383db5d48fbba8b10bc711d203e3ff10e3499999b2bce810ec18eda35b93</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2007</creationdate><topic>Algorithms</topic><topic>Biological and medical sciences</topic><topic>Computer Graphics</topic><topic>Computer Simulation</topic><topic>Database Management Systems</topic><topic>Databases, Protein</topic><topic>Fundamental and applied biological sciences. Psychology</topic><topic>General aspects</topic><topic>Mathematics in biology. Statistical analysis. Models. Metrology. 
Data processing in biology (general aspects)</topic><topic>Models, Biological</topic><topic>Pattern Recognition, Automated - methods</topic><topic>Proteome - metabolism</topic><topic>Signal Transduction - physiology</topic><topic>Software</topic><topic>User-Computer Interface</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Tian, Yuanyuan</creatorcontrib><creatorcontrib>McEachin, Richard C.</creatorcontrib><creatorcontrib>Santos, Carlos</creatorcontrib><creatorcontrib>States, David J.</creatorcontrib><creatorcontrib>Patel, Jignesh M.</creatorcontrib><collection>Istex</collection><collection>Pascal-Francis</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Aluminium Industry Abstracts</collection><collection>Biotechnology Research Abstracts</collection><collection>Ceramic Abstracts</collection><collection>Computer and Information Systems Abstracts</collection><collection>Corrosion Abstracts</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>Engineered Materials Abstracts</collection><collection>Materials Business File</collection><collection>Mechanical &amp; Transportation Engineering Abstracts</collection><collection>Nucleic Acids Abstracts</collection><collection>Oncogenes and Growth Factors Abstracts</collection><collection>Solid State and Superconductivity Abstracts</collection><collection>METADEX</collection><collection>Technology Research Database</collection><collection>ANTE: Abstracts in New Technology &amp; Engineering</collection><collection>Engineering Research Database</collection><collection>Aerospace Database</collection><collection>Copper Technical Reference Library</collection><collection>AIDS and Cancer Research Abstracts</collection><collection>Materials Research 
Database</collection><collection>ProQuest Computer Science Collection</collection><collection>ProQuest Health &amp; Medical Complete (Alumni)</collection><collection>Civil Engineering Abstracts</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>MEDLINE - Academic</collection><jtitle>Bioinformatics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Tian, Yuanyuan</au><au>McEachin, Richard C.</au><au>Santos, Carlos</au><au>States, David J.</au><au>Patel, Jignesh M.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>SAGA: a subgraph matching tool for biological graphs</atitle><jtitle>Bioinformatics</jtitle><addtitle>Bioinformatics</addtitle><date>2007-01-15</date><risdate>2007</risdate><volume>23</volume><issue>2</issue><spage>232</spage><epage>239</epage><pages>232-239</pages><issn>1367-4803</issn><eissn>1460-2059</eissn><eissn>1367-4811</eissn><coden>BOINFP</coden><abstract>Motivation: With the rapid increase in the availability of biological graph datasets, there is a growing need for effective and efficient graph querying methods. Due to the noisy and incomplete characteristics of these datasets, exact graph matching methods have limited use and approximate graph matching methods are required. Unfortunately, existing graph matching methods are too restrictive as they only allow exact or near exact graph matching. This paper presents a novel approximate graph matching technique called SAGA. This technique employs a flexible model for computing graph similarity, which allows for node gaps, node mismatches and graph structural differences. 
SAGA employs an indexing technique that allows it to efficiently evaluate queries even against large graph datasets. Results: SAGA has been used to query biological pathways and literature datasets, which has revealed interesting similarities between distinct pathways that cannot be found by existing methods. These matches associate seemingly unrelated biological processes, connect studies in different sub-areas of biomedical research and thus pose hypotheses for new discoveries. SAGA is also orders of magnitude faster than existing methods. Availability: SAGA can be accessed freely via the web at . Binaries are also freely available at this website. Contact:jignesh@eecs.umich.edu Supplementary material: Supplementary material is available at .</abstract><cop>Oxford</cop><pub>Oxford University Press</pub><pmid>17110368</pmid><doi>10.1093/bioinformatics/btl571</doi><tpages>8</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext_linktorsrc
identifier ISSN: 1367-4803
ispartof Bioinformatics, 2007-01, Vol.23 (2), p.232-239
issn 1367-4803
1460-2059
1367-4811
language eng
recordid cdi_proquest_miscellaneous_68933353
source Access via Oxford University Press (Open Access Collection)
subjects Algorithms
Biological and medical sciences
Computer Graphics
Computer Simulation
Database Management Systems
Databases, Protein
Fundamental and applied biological sciences. Psychology
General aspects
Mathematics in biology. Statistical analysis. Models. Metrology. Data processing in biology (general aspects)
Models, Biological
Pattern Recognition, Automated - methods
Proteome - metabolism
Signal Transduction - physiology
Software
User-Computer Interface
title SAGA: a subgraph matching tool for biological graphs
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-16T01%3A04%3A54IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_TOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=SAGA:%20a%20subgraph%20matching%20tool%20for%20biological%20graphs&rft.jtitle=Bioinformatics&rft.au=Tian,%20Yuanyuan&rft.date=2007-01-15&rft.volume=23&rft.issue=2&rft.spage=232&rft.epage=239&rft.pages=232-239&rft.issn=1367-4803&rft.eissn=1460-2059&rft.coden=BOINFP&rft_id=info:doi/10.1093/bioinformatics/btl571&rft_dat=%3Cproquest_TOX%3E1202725431%3C/proquest_TOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=198689729&rft_id=info:pmid/17110368&rft_oup_id=10.1093/bioinformatics/btl571&rfr_iscdi=true
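The `url` field above is an OpenURL link: the citation is carried as key/value pairs in the query string, with `rft.`-prefixed keys describing the referenced article and repeated `rft_id` keys carrying identifiers. A minimal sketch of reading those pairs back out is given below, assuming Python's standard `urllib.parse`. The helper name `referent_fields` is hypothetical, and `OPENURL` is a shortened copy of the link that keeps only a handful of its keys.

```python
from urllib.parse import urlsplit, parse_qs

# Shortened copy of the resolver link from the record (illustrative subset of keys).
OPENURL = (
    "https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004"
    "&rft.genre=article"
    "&rft.atitle=SAGA:%20a%20subgraph%20matching%20tool%20for%20biological%20graphs"
    "&rft.jtitle=Bioinformatics"
    "&rft.au=Tian,%20Yuanyuan"
    "&rft.date=2007-01-15&rft.volume=23&rft.issue=2&rft.pages=232-239"
    "&rft_id=info:doi/10.1093/bioinformatics/btl571"
    "&rft_id=info:pmid/17110368"
)


def referent_fields(url):
    """Hypothetical helper: split an OpenURL into citation fields and identifiers."""
    # parse_qs percent-decodes values and groups repeated keys into lists.
    query = parse_qs(urlsplit(url).query)
    # Keys starting with "rft." describe the cited item; keep the first value of each.
    fields = {
        key[len("rft."):]: values[0]
        for key, values in query.items()
        if key.startswith("rft.")
    }
    # "rft_id" may repeat (DOI, PMID, ...), so return every value.
    identifiers = query.get("rft_id", [])
    return fields, identifiers


fields, identifiers = referent_fields(OPENURL)
print(fields["atitle"])
print(fields["jtitle"], fields["volume"], fields["issue"], fields["pages"])
print(identifiers)
```

Keeping the identifiers as a list rather than a single value reflects that the record supplies several (`info:doi/...`, `info:pmid/...`) under the same key.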