SAGA: a subgraph matching tool for biological graphs

Motivation: With the rapid increase in the availability of biological graph datasets, there is a growing need for effective and efficient graph querying methods. Due to the noisy and incomplete characteristics of these datasets, exact graph matching methods have limited use and approximate graph mat...

Bibliographic details
Published in:	Bioinformatics 2007-01, Vol.23 (2), p.232-239
Main authors:	Tian, Yuanyuan; McEachin, Richard C.; Santos, Carlos; States, David J.; Patel, Jignesh M.
Format:	Article
Language:	eng
Subjects:	Algorithms; Biological and medical sciences; Computer Graphics; Computer Simulation; Database Management Systems; Databases, Protein; Fundamental and applied biological sciences. Psychology; General aspects; Mathematics in biology. Statistical analysis. Models. Metrology. Data processing in biology (general aspects); Models, Biological; Pattern Recognition, Automated - methods; Proteome - metabolism; Signal Transduction - physiology; Software; User-Computer Interface
Online access:	Order full text

container_end_page	239
container_issue	2
container_start_page	232
container_title	Bioinformatics
container_volume	23
creator	Tian, Yuanyuan; McEachin, Richard C.; Santos, Carlos; States, David J.; Patel, Jignesh M.
description	Motivation: With the rapid increase in the availability of biological graph datasets, there is a growing need for effective and efficient graph querying methods. Due to the noisy and incomplete characteristics of these datasets, exact graph matching methods have limited use and approximate graph matching methods are required. Unfortunately, existing graph matching methods are too restrictive as they only allow exact or near-exact graph matching. This paper presents a novel approximate graph matching technique called SAGA. This technique employs a flexible model for computing graph similarity, which allows for node gaps, node mismatches and graph structural differences. SAGA employs an indexing technique that allows it to efficiently evaluate queries even against large graph datasets. Results: SAGA has been used to query biological pathways and literature datasets, which has revealed interesting similarities between distinct pathways that cannot be found by existing methods. These matches associate seemingly unrelated biological processes, connect studies in different sub-areas of biomedical research and thus pose hypotheses for new discoveries. SAGA is also orders of magnitude faster than existing methods. Availability: SAGA can be accessed freely via the web at . Binaries are also freely available at this website. Contact: jignesh@eecs.umich.edu Supplementary material: Supplementary material is available at .
doi_str_mv	10.1093/bioinformatics/btl571
format	Article
fullrecord	<record><control><sourceid>proquest_TOX</sourceid><recordid>TN_cdi_proquest_miscellaneous_68933353</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><oup_id>10.1093/bioinformatics/btl571</oup_id><sourcerecordid>1202725431</sourcerecordid><originalsourceid>FETCH-LOGICAL-c587t-734df383db5d48fbba8b10bc711d203e3ff10e3499999b2bce810ec18eda35b93</originalsourceid><addsrcrecordid>eNqNkM1O3DAUha2qqPz1EYqiSmUXsOfasdPddNSZQYBYAFLFxrIdZzBk4sFOpPbt65ARqGzo3fha-s45ugehLwSfEFzCqXbetbUPa9U5E0911zBOPqA9QgucTzArP6YdCp5TgWEX7cf4gDEjlNJPaJdwQjAUYg_R6-li-j1TWez1KqjNfZYMzb1rV1nnfZOlhCxFNX7ljGqyZyQeop1aNdF-3r4H6Hb-82a2zC-uFmez6UVumOBdzoFWNQioNKuoqLVWQhOsTQqvJhgs1DXBFmg5jJ5oY0X6GyJspYDpEg7Q8ei7Cf6pt7GTaxeNbRrVWt9HWYgSABi8C5KSAWcwOH59Az74PrTpiMSI5McnA8RGyAQfY7C13AS3VuGPJFgO5ct_y5dj-Ul3tDXv9dpWr6pt2wn4tgVUTG3WQbXGxVdOUM4LPpyDR873m__OzkeJi539_SJS4VEmR87k8tednP2Y3_HL-blcwF_UsLFg</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>198689729</pqid></control><display><type>article</type><title>SAGA: a subgraph matching tool for biological graphs</title><source>Access via Oxford University Press (Open Access Collection)</source><creator>Tian, Yuanyuan ; McEachin, Richard C. ; Santos, Carlos ; States, David J. ; Patel, Jignesh M.</creator><creatorcontrib>Tian, Yuanyuan ; McEachin, Richard C. ; Santos, Carlos ; States, David J. ; Patel, Jignesh M.</creatorcontrib><description>Motivation: With the rapid increase in the availability of biological graph datasets, there is a growing need for effective and efficient graph querying methods. Due to the noisy and incomplete characteristics of these datasets, exact graph matching methods have limited use and approximate graph matching methods are required. Unfortunately, existing graph matching methods are too restrictive as they only allow exact or near exact graph matching. This paper presents a novel approximate graph matching technique called SAGA. 
This technique employs a flexible model for computing graph similarity, which allows for node gaps, node mismatches and graph structural differences. SAGA employs an indexing technique that allows it to efficiently evaluate queries even against large graph datasets. Results: SAGA has been used to query biological pathways and literature datasets, which has revealed interesting similarities between distinct pathways that cannot be found by existing methods. These matches associate seemingly unrelated biological processes, connect studies in different sub-areas of biomedical research and thus pose hypotheses for new discoveries. SAGA is also orders of magnitude faster than existing methods. Availability: SAGA can be accessed freely via the web at . Binaries are also freely available at this website. Contact:jignesh@eecs.umich.edu Supplementary material: Supplementary material is available at .</description><identifier>ISSN: 1367-4803</identifier><identifier>EISSN: 1460-2059</identifier><identifier>EISSN: 1367-4811</identifier><identifier>DOI: 10.1093/bioinformatics/btl571</identifier><identifier>PMID: 17110368</identifier><identifier>CODEN: BOINFP</identifier><language>eng</language><publisher>Oxford: Oxford University Press</publisher><subject>Algorithms ; Biological and medical sciences ; Computer Graphics ; Computer Simulation ; Database Management Systems ; Databases, Protein ; Fundamental and applied biological sciences. Psychology ; General aspects ; Mathematics in biology. Statistical analysis. Models. Metrology. 
Data processing in biology (general aspects) ; Models, Biological ; Pattern Recognition, Automated - methods ; Proteome - metabolism ; Signal Transduction - physiology ; Software ; User-Computer Interface</subject><ispartof>Bioinformatics, 2007-01, Vol.23 (2), p.232-239</ispartof><rights>2006 The Author(s) 2006</rights><rights>2007 INIST-CNRS</rights><rights>Copyright Oxford University Press(England) Jan 2007</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c587t-734df383db5d48fbba8b10bc711d203e3ff10e3499999b2bce810ec18eda35b93</citedby><cites>FETCH-LOGICAL-c587t-734df383db5d48fbba8b10bc711d203e3ff10e3499999b2bce810ec18eda35b93</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>315,781,785,1605,27929,27930</link.rule.ids><linktorsrc>$$Uhttps://dx.doi.org/10.1093/bioinformatics/btl571$$EView_record_in_Oxford_University_Press$$FView_record_in_$$GOxford_University_Press</linktorsrc><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=18477673$$DView record in Pascal Francis$$Hfree_for_read</backlink><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/17110368$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Tian, Yuanyuan</creatorcontrib><creatorcontrib>McEachin, Richard C.</creatorcontrib><creatorcontrib>Santos, Carlos</creatorcontrib><creatorcontrib>States, David J.</creatorcontrib><creatorcontrib>Patel, Jignesh M.</creatorcontrib><title>SAGA: a subgraph matching tool for biological graphs</title><title>Bioinformatics</title><addtitle>Bioinformatics</addtitle><description>Motivation: With the rapid increase in the availability of biological graph datasets, there is a growing need for effective and efficient graph querying methods. 
Due to the noisy and incomplete characteristics of these datasets, exact graph matching methods have limited use and approximate graph matching methods are required. Unfortunately, existing graph matching methods are too restrictive as they only allow exact or near exact graph matching. This paper presents a novel approximate graph matching technique called SAGA. This technique employs a flexible model for computing graph similarity, which allows for node gaps, node mismatches and graph structural differences. SAGA employs an indexing technique that allows it to efficiently evaluate queries even against large graph datasets. Results: SAGA has been used to query biological pathways and literature datasets, which has revealed interesting similarities between distinct pathways that cannot be found by existing methods. These matches associate seemingly unrelated biological processes, connect studies in different sub-areas of biomedical research and thus pose hypotheses for new discoveries. SAGA is also orders of magnitude faster than existing methods. Availability: SAGA can be accessed freely via the web at . Binaries are also freely available at this website. Contact:jignesh@eecs.umich.edu Supplementary material: Supplementary material is available at .</description><subject>Algorithms</subject><subject>Biological and medical sciences</subject><subject>Computer Graphics</subject><subject>Computer Simulation</subject><subject>Database Management Systems</subject><subject>Databases, Protein</subject><subject>Fundamental and applied biological sciences. Psychology</subject><subject>General aspects</subject><subject>Mathematics in biology. Statistical analysis. Models. Metrology. 
Data processing in biology (general aspects)</subject><subject>Models, Biological</subject><subject>Pattern Recognition, Automated - methods</subject><subject>Proteome - metabolism</subject><subject>Signal Transduction - physiology</subject><subject>Software</subject><subject>User-Computer Interface</subject><issn>1367-4803</issn><issn>1460-2059</issn><issn>1367-4811</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2007</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNqNkM1O3DAUha2qqPz1EYqiSmUXsOfasdPddNSZQYBYAFLFxrIdZzBk4sFOpPbt65ARqGzo3fha-s45ugehLwSfEFzCqXbetbUPa9U5E0911zBOPqA9QgucTzArP6YdCp5TgWEX7cf4gDEjlNJPaJdwQjAUYg_R6-li-j1TWez1KqjNfZYMzb1rV1nnfZOlhCxFNX7ljGqyZyQeop1aNdF-3r4H6Hb-82a2zC-uFmez6UVumOBdzoFWNQioNKuoqLVWQhOsTQqvJhgs1DXBFmg5jJ5oY0X6GyJspYDpEg7Q8ei7Cf6pt7GTaxeNbRrVWt9HWYgSABi8C5KSAWcwOH59Az74PrTpiMSI5McnA8RGyAQfY7C13AS3VuGPJFgO5ct_y5dj-Ul3tDXv9dpWr6pt2wn4tgVUTG3WQbXGxVdOUM4LPpyDR873m__OzkeJi539_SJS4VEmR87k8tednP2Y3_HL-blcwF_UsLFg</recordid><startdate>20070115</startdate><enddate>20070115</enddate><creator>Tian, Yuanyuan</creator><creator>McEachin, Richard C.</creator><creator>Santos, Carlos</creator><creator>States, David J.</creator><creator>Patel, Jignesh M.</creator><general>Oxford University Press</general><general>Oxford Publishing Limited 
(England)</general><scope>BSCLL</scope><scope>IQODW</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7QF</scope><scope>7QO</scope><scope>7QQ</scope><scope>7SC</scope><scope>7SE</scope><scope>7SP</scope><scope>7SR</scope><scope>7TA</scope><scope>7TB</scope><scope>7TM</scope><scope>7TO</scope><scope>7U5</scope><scope>8BQ</scope><scope>8FD</scope><scope>F28</scope><scope>FR3</scope><scope>H8D</scope><scope>H8G</scope><scope>H94</scope><scope>JG9</scope><scope>JQ2</scope><scope>K9.</scope><scope>KR7</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>P64</scope><scope>7X8</scope></search><sort><creationdate>20070115</creationdate><title>SAGA: a subgraph matching tool for biological graphs</title><author>Tian, Yuanyuan ; McEachin, Richard C. ; Santos, Carlos ; States, David J. ; Patel, Jignesh M.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c587t-734df383db5d48fbba8b10bc711d203e3ff10e3499999b2bce810ec18eda35b93</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2007</creationdate><topic>Algorithms</topic><topic>Biological and medical sciences</topic><topic>Computer Graphics</topic><topic>Computer Simulation</topic><topic>Database Management Systems</topic><topic>Databases, Protein</topic><topic>Fundamental and applied biological sciences. Psychology</topic><topic>General aspects</topic><topic>Mathematics in biology. Statistical analysis. Models. Metrology. 
Data processing in biology (general aspects)</topic><topic>Models, Biological</topic><topic>Pattern Recognition, Automated - methods</topic><topic>Proteome - metabolism</topic><topic>Signal Transduction - physiology</topic><topic>Software</topic><topic>User-Computer Interface</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Tian, Yuanyuan</creatorcontrib><creatorcontrib>McEachin, Richard C.</creatorcontrib><creatorcontrib>Santos, Carlos</creatorcontrib><creatorcontrib>States, David J.</creatorcontrib><creatorcontrib>Patel, Jignesh M.</creatorcontrib><collection>Istex</collection><collection>Pascal-Francis</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Aluminium Industry Abstracts</collection><collection>Biotechnology Research Abstracts</collection><collection>Ceramic Abstracts</collection><collection>Computer and Information Systems Abstracts</collection><collection>Corrosion Abstracts</collection><collection>Electronics & Communications Abstracts</collection><collection>Engineered Materials Abstracts</collection><collection>Materials Business File</collection><collection>Mechanical & Transportation Engineering Abstracts</collection><collection>Nucleic Acids Abstracts</collection><collection>Oncogenes and Growth Factors Abstracts</collection><collection>Solid State and Superconductivity Abstracts</collection><collection>METADEX</collection><collection>Technology Research Database</collection><collection>ANTE: Abstracts in New Technology & Engineering</collection><collection>Engineering Research Database</collection><collection>Aerospace Database</collection><collection>Copper Technical Reference Library</collection><collection>AIDS and Cancer Research Abstracts</collection><collection>Materials Research 
Database</collection><collection>ProQuest Computer Science Collection</collection><collection>ProQuest Health & Medical Complete (Alumni)</collection><collection>Civil Engineering Abstracts</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>MEDLINE - Academic</collection><jtitle>Bioinformatics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Tian, Yuanyuan</au><au>McEachin, Richard C.</au><au>Santos, Carlos</au><au>States, David J.</au><au>Patel, Jignesh M.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>SAGA: a subgraph matching tool for biological graphs</atitle><jtitle>Bioinformatics</jtitle><addtitle>Bioinformatics</addtitle><date>2007-01-15</date><risdate>2007</risdate><volume>23</volume><issue>2</issue><spage>232</spage><epage>239</epage><pages>232-239</pages><issn>1367-4803</issn><eissn>1460-2059</eissn><eissn>1367-4811</eissn><coden>BOINFP</coden><abstract>Motivation: With the rapid increase in the availability of biological graph datasets, there is a growing need for effective and efficient graph querying methods. Due to the noisy and incomplete characteristics of these datasets, exact graph matching methods have limited use and approximate graph matching methods are required. Unfortunately, existing graph matching methods are too restrictive as they only allow exact or near exact graph matching. This paper presents a novel approximate graph matching technique called SAGA. This technique employs a flexible model for computing graph similarity, which allows for node gaps, node mismatches and graph structural differences. 
SAGA employs an indexing technique that allows it to efficiently evaluate queries even against large graph datasets. Results: SAGA has been used to query biological pathways and literature datasets, which has revealed interesting similarities between distinct pathways that cannot be found by existing methods. These matches associate seemingly unrelated biological processes, connect studies in different sub-areas of biomedical research and thus pose hypotheses for new discoveries. SAGA is also orders of magnitude faster than existing methods. Availability: SAGA can be accessed freely via the web at . Binaries are also freely available at this website. Contact:jignesh@eecs.umich.edu Supplementary material: Supplementary material is available at .</abstract><cop>Oxford</cop><pub>Oxford University Press</pub><pmid>17110368</pmid><doi>10.1093/bioinformatics/btl571</doi><tpages>8</tpages><oa>free_for_read</oa></addata></record>
fulltext	fulltext_linktorsrc
identifier	ISSN: 1367-4803
ispartof	Bioinformatics, 2007-01, Vol.23 (2), p.232-239
issn	1367-4803; 1460-2059; 1367-4811
language	eng
recordid	cdi_proquest_miscellaneous_68933353
source	Access via Oxford University Press (Open Access Collection)
subjects	Algorithms; Biological and medical sciences; Computer Graphics; Computer Simulation; Database Management Systems; Databases, Protein; Fundamental and applied biological sciences. Psychology; General aspects; Mathematics in biology. Statistical analysis. Models. Metrology. Data processing in biology (general aspects); Models, Biological; Pattern Recognition, Automated - methods; Proteome - metabolism; Signal Transduction - physiology; Software; User-Computer Interface
title	SAGA: a subgraph matching tool for biological graphs
url	https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-16T01%3A04%3A54IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_TOX&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=SAGA:%20a%20subgraph%20matching%20tool%20for%20biological%20graphs&rft.jtitle=Bioinformatics&rft.au=Tian,%20Yuanyuan&rft.date=2007-01-15&rft.volume=23&rft.issue=2&rft.spage=232&rft.epage=239&rft.pages=232-239&rft.issn=1367-4803&rft.eissn=1460-2059&rft.coden=BOINFP&rft_id=info:doi/10.1093/bioinformatics/btl571&rft_dat=%3Cproquest_TOX%3E1202725431%3C/proquest_TOX%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=198689729&rft_id=info:pmid/17110368&rft_oup_id=10.1093/bioinformatics/btl571&rfr_iscdi=true