SAGA: a subgraph matching tool for biological graphs
Motivation: With the rapid increase in the availability of biological graph datasets, there is a growing need for effective and efficient graph querying methods. Due to the noisy and incomplete characteristics of these datasets, exact graph matching methods have limited use and approximate graph mat...
Published in: | Bioinformatics 2007-01, Vol.23 (2), p.232-239 |
---|---|
Main authors: | Tian, Yuanyuan; McEachin, Richard C.; Santos, Carlos; States, David J.; Patel, Jignesh M. |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Order full text |

container_end_page | 239 |
---|---|
container_issue | 2 |
container_start_page | 232 |
container_title | Bioinformatics |
container_volume | 23 |
creator | Tian, Yuanyuan; McEachin, Richard C.; Santos, Carlos; States, David J.; Patel, Jignesh M. |
description | Motivation: With the rapid increase in the availability of biological graph datasets, there is a growing need for effective and efficient graph querying methods. Due to the noisy and incomplete characteristics of these datasets, exact graph matching methods have limited use and approximate graph matching methods are required. Unfortunately, existing graph matching methods are too restrictive as they only allow exact or near exact graph matching. This paper presents a novel approximate graph matching technique called SAGA. This technique employs a flexible model for computing graph similarity, which allows for node gaps, node mismatches and graph structural differences. SAGA employs an indexing technique that allows it to efficiently evaluate queries even against large graph datasets. Results: SAGA has been used to query biological pathways and literature datasets, which has revealed interesting similarities between distinct pathways that cannot be found by existing methods. These matches associate seemingly unrelated biological processes, connect studies in different sub-areas of biomedical research and thus pose hypotheses for new discoveries. SAGA is also orders of magnitude faster than existing methods. Availability: SAGA can be accessed freely via the web at . Binaries are also freely available at this website. Contact: jignesh@eecs.umich.edu Supplementary material: Supplementary material is available at . |
doi_str_mv | 10.1093/bioinformatics/btl571 |
format | Article |
fullrecord | <record><control><sourceid>proquest_TOX</sourceid><recordid>TN_cdi_proquest_miscellaneous_68933353</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><oup_id>10.1093/bioinformatics/btl571</oup_id><sourcerecordid>1202725431</sourcerecordid><originalsourceid>FETCH-LOGICAL-c587t-734df383db5d48fbba8b10bc711d203e3ff10e3499999b2bce810ec18eda35b93</originalsourceid><addsrcrecordid>eNqNkM1O3DAUha2qqPz1EYqiSmUXsOfasdPddNSZQYBYAFLFxrIdZzBk4sFOpPbt65ARqGzo3fha-s45ugehLwSfEFzCqXbetbUPa9U5E0911zBOPqA9QgucTzArP6YdCp5TgWEX7cf4gDEjlNJPaJdwQjAUYg_R6-li-j1TWez1KqjNfZYMzb1rV1nnfZOlhCxFNX7ljGqyZyQeop1aNdF-3r4H6Hb-82a2zC-uFmez6UVumOBdzoFWNQioNKuoqLVWQhOsTQqvJhgs1DXBFmg5jJ5oY0X6GyJspYDpEg7Q8ei7Cf6pt7GTaxeNbRrVWt9HWYgSABi8C5KSAWcwOH59Az74PrTpiMSI5McnA8RGyAQfY7C13AS3VuGPJFgO5ct_y5dj-Ul3tDXv9dpWr6pt2wn4tgVUTG3WQbXGxVdOUM4LPpyDR873m__OzkeJi539_SJS4VEmR87k8tednP2Y3_HL-blcwF_UsLFg</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>198689729</pqid></control><display><type>article</type><title>SAGA: a subgraph matching tool for biological graphs</title><source>Access via Oxford University Press (Open Access Collection)</source><creator>Tian, Yuanyuan ; McEachin, Richard C. ; Santos, Carlos ; States, David J. ; Patel, Jignesh M.</creator><creatorcontrib>Tian, Yuanyuan ; McEachin, Richard C. ; Santos, Carlos ; States, David J. ; Patel, Jignesh M.</creatorcontrib><description>Motivation: With the rapid increase in the availability of biological graph datasets, there is a growing need for effective and efficient graph querying methods. Due to the noisy and incomplete characteristics of these datasets, exact graph matching methods have limited use and approximate graph matching methods are required. Unfortunately, existing graph matching methods are too restrictive as they only allow exact or near exact graph matching. This paper presents a novel approximate graph matching technique called SAGA. 
This technique employs a flexible model for computing graph similarity, which allows for node gaps, node mismatches and graph structural differences. SAGA employs an indexing technique that allows it to efficiently evaluate queries even against large graph datasets. Results: SAGA has been used to query biological pathways and literature datasets, which has revealed interesting similarities between distinct pathways that cannot be found by existing methods. These matches associate seemingly unrelated biological processes, connect studies in different sub-areas of biomedical research and thus pose hypotheses for new discoveries. SAGA is also orders of magnitude faster than existing methods. Availability: SAGA can be accessed freely via the web at . Binaries are also freely available at this website. Contact:jignesh@eecs.umich.edu Supplementary material: Supplementary material is available at .</description><identifier>ISSN: 1367-4803</identifier><identifier>EISSN: 1460-2059</identifier><identifier>EISSN: 1367-4811</identifier><identifier>DOI: 10.1093/bioinformatics/btl571</identifier><identifier>PMID: 17110368</identifier><identifier>CODEN: BOINFP</identifier><language>eng</language><publisher>Oxford: Oxford University Press</publisher><subject>Algorithms ; Biological and medical sciences ; Computer Graphics ; Computer Simulation ; Database Management Systems ; Databases, Protein ; Fundamental and applied biological sciences. Psychology ; General aspects ; Mathematics in biology. Statistical analysis. Models. Metrology. 
Data processing in biology (general aspects) ; Models, Biological ; Pattern Recognition, Automated - methods ; Proteome - metabolism ; Signal Transduction - physiology ; Software ; User-Computer Interface</subject><ispartof>Bioinformatics, 2007-01, Vol.23 (2), p.232-239</ispartof><rights>2006 The Author(s) 2006</rights><rights>2007 INIST-CNRS</rights><rights>Copyright Oxford University Press(England) Jan 2007</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c587t-734df383db5d48fbba8b10bc711d203e3ff10e3499999b2bce810ec18eda35b93</citedby><cites>FETCH-LOGICAL-c587t-734df383db5d48fbba8b10bc711d203e3ff10e3499999b2bce810ec18eda35b93</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>315,781,785,1605,27929,27930</link.rule.ids><linktorsrc>$$Uhttps://dx.doi.org/10.1093/bioinformatics/btl571$$EView_record_in_Oxford_University_Press$$FView_record_in_$$GOxford_University_Press</linktorsrc><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=18477673$$DView record in Pascal Francis$$Hfree_for_read</backlink><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/17110368$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Tian, Yuanyuan</creatorcontrib><creatorcontrib>McEachin, Richard C.</creatorcontrib><creatorcontrib>Santos, Carlos</creatorcontrib><creatorcontrib>States, David J.</creatorcontrib><creatorcontrib>Patel, Jignesh M.</creatorcontrib><title>SAGA: a subgraph matching tool for biological graphs</title><title>Bioinformatics</title><addtitle>Bioinformatics</addtitle><description>Motivation: With the rapid increase in the availability of biological graph datasets, there is a growing need for effective and efficient graph querying methods. 
Due to the noisy and incomplete characteristics of these datasets, exact graph matching methods have limited use and approximate graph matching methods are required. Unfortunately, existing graph matching methods are too restrictive as they only allow exact or near exact graph matching. This paper presents a novel approximate graph matching technique called SAGA. This technique employs a flexible model for computing graph similarity, which allows for node gaps, node mismatches and graph structural differences. SAGA employs an indexing technique that allows it to efficiently evaluate queries even against large graph datasets. Results: SAGA has been used to query biological pathways and literature datasets, which has revealed interesting similarities between distinct pathways that cannot be found by existing methods. These matches associate seemingly unrelated biological processes, connect studies in different sub-areas of biomedical research and thus pose hypotheses for new discoveries. SAGA is also orders of magnitude faster than existing methods. Availability: SAGA can be accessed freely via the web at . Binaries are also freely available at this website. Contact:jignesh@eecs.umich.edu Supplementary material: Supplementary material is available at .</description><subject>Algorithms</subject><subject>Biological and medical sciences</subject><subject>Computer Graphics</subject><subject>Computer Simulation</subject><subject>Database Management Systems</subject><subject>Databases, Protein</subject><subject>Fundamental and applied biological sciences. Psychology</subject><subject>General aspects</subject><subject>Mathematics in biology. Statistical analysis. Models. Metrology. 
Data processing in biology (general aspects)</subject><subject>Models, Biological</subject><subject>Pattern Recognition, Automated - methods</subject><subject>Proteome - metabolism</subject><subject>Signal Transduction - physiology</subject><subject>Software</subject><subject>User-Computer Interface</subject><issn>1367-4803</issn><issn>1460-2059</issn><issn>1367-4811</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2007</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNqNkM1O3DAUha2qqPz1EYqiSmUXsOfasdPddNSZQYBYAFLFxrIdZzBk4sFOpPbt65ARqGzo3fha-s45ugehLwSfEFzCqXbetbUPa9U5E0911zBOPqA9QgucTzArP6YdCp5TgWEX7cf4gDEjlNJPaJdwQjAUYg_R6-li-j1TWez1KqjNfZYMzb1rV1nnfZOlhCxFNX7ljGqyZyQeop1aNdF-3r4H6Hb-82a2zC-uFmez6UVumOBdzoFWNQioNKuoqLVWQhOsTQqvJhgs1DXBFmg5jJ5oY0X6GyJspYDpEg7Q8ei7Cf6pt7GTaxeNbRrVWt9HWYgSABi8C5KSAWcwOH59Az74PrTpiMSI5McnA8RGyAQfY7C13AS3VuGPJFgO5ct_y5dj-Ul3tDXv9dpWr6pt2wn4tgVUTG3WQbXGxVdOUM4LPpyDR873m__OzkeJi539_SJS4VEmR87k8tednP2Y3_HL-blcwF_UsLFg</recordid><startdate>20070115</startdate><enddate>20070115</enddate><creator>Tian, Yuanyuan</creator><creator>McEachin, Richard C.</creator><creator>Santos, Carlos</creator><creator>States, David J.</creator><creator>Patel, Jignesh M.</creator><general>Oxford University Press</general><general>Oxford Publishing Limited 
(England)</general><scope>BSCLL</scope><scope>IQODW</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7QF</scope><scope>7QO</scope><scope>7QQ</scope><scope>7SC</scope><scope>7SE</scope><scope>7SP</scope><scope>7SR</scope><scope>7TA</scope><scope>7TB</scope><scope>7TM</scope><scope>7TO</scope><scope>7U5</scope><scope>8BQ</scope><scope>8FD</scope><scope>F28</scope><scope>FR3</scope><scope>H8D</scope><scope>H8G</scope><scope>H94</scope><scope>JG9</scope><scope>JQ2</scope><scope>K9.</scope><scope>KR7</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>P64</scope><scope>7X8</scope></search><sort><creationdate>20070115</creationdate><title>SAGA: a subgraph matching tool for biological graphs</title><author>Tian, Yuanyuan ; McEachin, Richard C. ; Santos, Carlos ; States, David J. ; Patel, Jignesh M.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c587t-734df383db5d48fbba8b10bc711d203e3ff10e3499999b2bce810ec18eda35b93</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2007</creationdate><topic>Algorithms</topic><topic>Biological and medical sciences</topic><topic>Computer Graphics</topic><topic>Computer Simulation</topic><topic>Database Management Systems</topic><topic>Databases, Protein</topic><topic>Fundamental and applied biological sciences. Psychology</topic><topic>General aspects</topic><topic>Mathematics in biology. Statistical analysis. Models. Metrology. 
Data processing in biology (general aspects)</topic><topic>Models, Biological</topic><topic>Pattern Recognition, Automated - methods</topic><topic>Proteome - metabolism</topic><topic>Signal Transduction - physiology</topic><topic>Software</topic><topic>User-Computer Interface</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Tian, Yuanyuan</creatorcontrib><creatorcontrib>McEachin, Richard C.</creatorcontrib><creatorcontrib>Santos, Carlos</creatorcontrib><creatorcontrib>States, David J.</creatorcontrib><creatorcontrib>Patel, Jignesh M.</creatorcontrib><collection>Istex</collection><collection>Pascal-Francis</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Aluminium Industry Abstracts</collection><collection>Biotechnology Research Abstracts</collection><collection>Ceramic Abstracts</collection><collection>Computer and Information Systems Abstracts</collection><collection>Corrosion Abstracts</collection><collection>Electronics & Communications Abstracts</collection><collection>Engineered Materials Abstracts</collection><collection>Materials Business File</collection><collection>Mechanical & Transportation Engineering Abstracts</collection><collection>Nucleic Acids Abstracts</collection><collection>Oncogenes and Growth Factors Abstracts</collection><collection>Solid State and Superconductivity Abstracts</collection><collection>METADEX</collection><collection>Technology Research Database</collection><collection>ANTE: Abstracts in New Technology & Engineering</collection><collection>Engineering Research Database</collection><collection>Aerospace Database</collection><collection>Copper Technical Reference Library</collection><collection>AIDS and Cancer Research Abstracts</collection><collection>Materials Research 
Database</collection><collection>ProQuest Computer Science Collection</collection><collection>ProQuest Health & Medical Complete (Alumni)</collection><collection>Civil Engineering Abstracts</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>MEDLINE - Academic</collection><jtitle>Bioinformatics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Tian, Yuanyuan</au><au>McEachin, Richard C.</au><au>Santos, Carlos</au><au>States, David J.</au><au>Patel, Jignesh M.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>SAGA: a subgraph matching tool for biological graphs</atitle><jtitle>Bioinformatics</jtitle><addtitle>Bioinformatics</addtitle><date>2007-01-15</date><risdate>2007</risdate><volume>23</volume><issue>2</issue><spage>232</spage><epage>239</epage><pages>232-239</pages><issn>1367-4803</issn><eissn>1460-2059</eissn><eissn>1367-4811</eissn><coden>BOINFP</coden><abstract>Motivation: With the rapid increase in the availability of biological graph datasets, there is a growing need for effective and efficient graph querying methods. Due to the noisy and incomplete characteristics of these datasets, exact graph matching methods have limited use and approximate graph matching methods are required. Unfortunately, existing graph matching methods are too restrictive as they only allow exact or near exact graph matching. This paper presents a novel approximate graph matching technique called SAGA. This technique employs a flexible model for computing graph similarity, which allows for node gaps, node mismatches and graph structural differences. 
SAGA employs an indexing technique that allows it to efficiently evaluate queries even against large graph datasets. Results: SAGA has been used to query biological pathways and literature datasets, which has revealed interesting similarities between distinct pathways that cannot be found by existing methods. These matches associate seemingly unrelated biological processes, connect studies in different sub-areas of biomedical research and thus pose hypotheses for new discoveries. SAGA is also orders of magnitude faster than existing methods. Availability: SAGA can be accessed freely via the web at . Binaries are also freely available at this website. Contact:jignesh@eecs.umich.edu Supplementary material: Supplementary material is available at .</abstract><cop>Oxford</cop><pub>Oxford University Press</pub><pmid>17110368</pmid><doi>10.1093/bioinformatics/btl571</doi><tpages>8</tpages><oa>free_for_read</oa></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | ISSN: 1367-4803 |
ispartof | Bioinformatics, 2007-01, Vol.23 (2), p.232-239 |
issn | 1367-4803; 1460-2059; 1367-4811 |
language | eng |
recordid | cdi_proquest_miscellaneous_68933353 |
source | Access via Oxford University Press (Open Access Collection) |
subjects | Algorithms; Biological and medical sciences; Computer Graphics; Computer Simulation; Database Management Systems; Databases, Protein; Fundamental and applied biological sciences. Psychology; General aspects; Mathematics in biology. Statistical analysis. Models. Metrology. Data processing in biology (general aspects); Models, Biological; Pattern Recognition, Automated - methods; Proteome - metabolism; Signal Transduction - physiology; Software; User-Computer Interface |
title | SAGA: a subgraph matching tool for biological graphs |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-16T01%3A04%3A54IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_TOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=SAGA:%20a%20subgraph%20matching%20tool%20for%20biological%20graphs&rft.jtitle=Bioinformatics&rft.au=Tian,%20Yuanyuan&rft.date=2007-01-15&rft.volume=23&rft.issue=2&rft.spage=232&rft.epage=239&rft.pages=232-239&rft.issn=1367-4803&rft.eissn=1460-2059&rft.coden=BOINFP&rft_id=info:doi/10.1093/bioinformatics/btl571&rft_dat=%3Cproquest_TOX%3E1202725431%3C/proquest_TOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=198689729&rft_id=info:pmid/17110368&rft_oup_id=10.1093/bioinformatics/btl571&rfr_iscdi=true |