An efficient, large-scale, non-lattice-detection algorithm for exhaustive structural auditing of biomedical ontologies
[Display omitted]
• A new algorithm for computing all non-trivial lowest common ancestors in an ontology.
• A fundamental step in non-lattice approaches for ontology quality assurance.
• Correctness proofs for an algorithm achieving a speedup of two orders of magnitude.
• Non-lattice methods can lead to more effe...
Published in: | Journal of biomedical informatics 2018-04, Vol.80, p.106-119 |
---|---|
Main authors: | Zhang, Guo-Qiang; Xing, Guangming; Cui, Licong |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
container_end_page | 119 |
---|---|
container_issue | |
container_start_page | 106 |
container_title | Journal of biomedical informatics |
container_volume | 80 |
creator | Zhang, Guo-Qiang; Xing, Guangming; Cui, Licong |
description | [Display omitted]
• A new algorithm for computing all non-trivial lowest common ancestors in an ontology.
• A fundamental step in non-lattice approaches for ontology quality assurance.
• Correctness proofs for an algorithm achieving a speedup of two orders of magnitude.
• Non-lattice methods can lead to more effective tools.
One of the basic challenges in developing structural methods for systematically auditing the quality of biomedical ontologies is the computational cost usually involved in exhaustive sub-graph analysis. We introduce ANT-LCA, a new algorithm for computing all non-trivial lowest common ancestors (LCAs) of each pair of concepts in the hierarchical order induced by an ontology. Computing LCAs is a fundamental step in the non-lattice approach to ontology quality assurance. Unlike existing approaches, ANT-LCA computes LCAs only for non-trivial pairs, that is, those having at least one common ancestor. To skip all trivial pairs, which may be of no practical interest, ANT-LCA employs a simple but innovative algorithmic strategy that combines topological order and dynamic programming to keep track of non-trivial pairs. We provide correctness proofs and demonstrate a substantial reduction in computation time for two of the largest biomedical ontologies: SNOMED CT and Gene Ontology (GO). ANT-LCA achieved average computation times of 30 and 3 seconds per version for SNOMED CT and GO, respectively, about two orders of magnitude faster than the best known approaches. Our algorithm overcomes a fundamental computational barrier in sub-graph-based structural analysis of large ontological systems. It enables the implementation of a new breed of structural auditing methods that not only identify potentially problematic areas but also automatically suggest changes to fix the issues. Such structural auditing methods can lead to more effective tools supporting ontology quality assurance work. |
doi_str_mv | 10.1016/j.jbi.2018.03.004 |
format | Article |
fullrecord | <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_6070340</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S1532046418300467</els_id><sourcerecordid>2014956049</sourcerecordid><originalsourceid>FETCH-LOGICAL-c451t-2ccfe47173e719f7c7a2caa3084939f64cea192c15366b4e1be377c3cb06b2053</originalsourceid><addsrcrecordid>eNp9kUFr3DAQhUVpaJJtf0AvRcceYmdkybJNoRBCmxYCuaRnIWtHXi1eKZXkJf331bLp0l56kkDznt68j5D3DGoGTF5v6-3o6gZYXwOvAcQrcsFa3lQgenh9uktxTi5T2gIw1rbyDTlvhlb0HWMXZH_jKVrrjEOfr-is44RVMnrGK-qDr2adszNYrTGjyS54qucpRJc3O2pDpPi80UvKbo805biYvEQ9U72sXXZ-osHS0YUdrl2xpMHnMIfJYXpLzqyeE757OVfkx9cvj7ffqvuHu--3N_eVES3LVWOMRdGxjmPHBtuZTjdGaw69GPhgpTCo2dCYsqiUo0A2Iu86w80Icmyg5Svy-ej7tIwlhSlLlnzqKbqdjr9U0E79--LdRk1hryR0wAUUg48vBjH8XDBltXPJ4Dxrj2FJqpQvhlZCybMi7DhqYkgpoj19w0AdeKmtKrwOkl4BV4VX0Xz4O99J8QdQGfh0HMDS0t5hVOmAypRGYwGi1sH9x_43kmKpYw</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2014956049</pqid></control><display><type>article</type><title>An efficient, large-scale, non-lattice-detection algorithm for exhaustive structural auditing of biomedical ontologies</title><source>MEDLINE</source><source>Elsevier ScienceDirect Journals</source><source>Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals</source><creator>Zhang, Guo-Qiang ; Xing, Guangming ; Cui, Licong</creator><creatorcontrib>Zhang, Guo-Qiang ; Xing, Guangming ; Cui, Licong</creatorcontrib><description>[Display omitted]
• A new algorithm for computing all non-trivial lowest common ancestors in an ontology.
• A fundamental step in non-lattice approaches for ontology quality assurance.
• Correctness proofs for an algorithm achieving a speedup of two orders of magnitude.
• Non-lattice methods can lead to more effective tools.
One of the basic challenges in developing structural methods for systematically auditing the quality of biomedical ontologies is the computational cost usually involved in exhaustive sub-graph analysis. We introduce ANT-LCA, a new algorithm for computing all non-trivial lowest common ancestors (LCAs) of each pair of concepts in the hierarchical order induced by an ontology. Computing LCAs is a fundamental step in the non-lattice approach to ontology quality assurance. Unlike existing approaches, ANT-LCA computes LCAs only for non-trivial pairs, that is, those having at least one common ancestor. To skip all trivial pairs, which may be of no practical interest, ANT-LCA employs a simple but innovative algorithmic strategy that combines topological order and dynamic programming to keep track of non-trivial pairs. We provide correctness proofs and demonstrate a substantial reduction in computation time for two of the largest biomedical ontologies: SNOMED CT and Gene Ontology (GO). ANT-LCA achieved average computation times of 30 and 3 seconds per version for SNOMED CT and GO, respectively, about two orders of magnitude faster than the best known approaches. Our algorithm overcomes a fundamental computational barrier in sub-graph-based structural analysis of large ontological systems. It enables the implementation of a new breed of structural auditing methods that not only identify potentially problematic areas but also automatically suggest changes to fix the issues.
Such structural auditing methods can lead to more effective tools supporting ontology quality assurance work.</description><identifier>ISSN: 1532-0464</identifier><identifier>EISSN: 1532-0480</identifier><identifier>DOI: 10.1016/j.jbi.2018.03.004</identifier><identifier>PMID: 29548711</identifier><language>eng</language><publisher>United States: Elsevier Inc</publisher><subject>Algorithms ; Biological Ontologies ; Biomedical ontology ; Data Mining - methods ; Graph-theoretic algorithm ; Lattice vs non-lattice ; Medical Informatics - methods ; Partial order ; Quality assurance ; SNOMED CT ; Systematized Nomenclature of Medicine</subject><ispartof>Journal of biomedical informatics, 2018-04, Vol.80, p.106-119</ispartof><rights>2018 Elsevier Inc.</rights><rights>Copyright © 2018 Elsevier Inc. All rights reserved.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c451t-2ccfe47173e719f7c7a2caa3084939f64cea192c15366b4e1be377c3cb06b2053</citedby><cites>FETCH-LOGICAL-c451t-2ccfe47173e719f7c7a2caa3084939f64cea192c15366b4e1be377c3cb06b2053</cites><orcidid>0000-0001-5549-8780</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://www.sciencedirect.com/science/article/pii/S1532046418300467$$EHTML$$P50$$Gelsevier$$Hfree_for_read</linktohtml><link.rule.ids>230,314,776,780,881,3537,27901,27902,65306</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/29548711$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Zhang, Guo-Qiang</creatorcontrib><creatorcontrib>Xing, Guangming</creatorcontrib><creatorcontrib>Cui, Licong</creatorcontrib><title>An efficient, large-scale, non-lattice-detection algorithm for exhaustive structural auditing of biomedical ontologies</title><title>Journal of biomedical 
informatics</title><addtitle>J Biomed Inform</addtitle><description>[Display omitted]
• A new algorithm for computing all non-trivial lowest common ancestors in an ontology.
• A fundamental step in non-lattice approaches for ontology quality assurance.
• Correctness proofs for an algorithm achieving a speedup of two orders of magnitude.
• Non-lattice methods can lead to more effective tools.
One of the basic challenges in developing structural methods for systematically auditing the quality of biomedical ontologies is the computational cost usually involved in exhaustive sub-graph analysis. We introduce ANT-LCA, a new algorithm for computing all non-trivial lowest common ancestors (LCAs) of each pair of concepts in the hierarchical order induced by an ontology. Computing LCAs is a fundamental step in the non-lattice approach to ontology quality assurance. Unlike existing approaches, ANT-LCA computes LCAs only for non-trivial pairs, that is, those having at least one common ancestor. To skip all trivial pairs, which may be of no practical interest, ANT-LCA employs a simple but innovative algorithmic strategy that combines topological order and dynamic programming to keep track of non-trivial pairs. We provide correctness proofs and demonstrate a substantial reduction in computation time for two of the largest biomedical ontologies: SNOMED CT and Gene Ontology (GO). ANT-LCA achieved average computation times of 30 and 3 seconds per version for SNOMED CT and GO, respectively, about two orders of magnitude faster than the best known approaches. Our algorithm overcomes a fundamental computational barrier in sub-graph-based structural analysis of large ontological systems. It enables the implementation of a new breed of structural auditing methods that not only identify potentially problematic areas but also automatically suggest changes to fix the issues.
Such structural auditing methods can lead to more effective tools supporting ontology quality assurance work.</description><subject>Algorithms</subject><subject>Biological Ontologies</subject><subject>Biomedical ontology</subject><subject>Data Mining - methods</subject><subject>Graph-theoretic algorithm</subject><subject>Lattice vs non-lattice</subject><subject>Medical Informatics - methods</subject><subject>Partial order</subject><subject>Quality assurance</subject><subject>SNOMED CT</subject><subject>Systematized Nomenclature of Medicine</subject><issn>1532-0464</issn><issn>1532-0480</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2018</creationdate><recordtype>article</recordtype><sourceid>EIF</sourceid><recordid>eNp9kUFr3DAQhUVpaJJtf0AvRcceYmdkybJNoRBCmxYCuaRnIWtHXi1eKZXkJf331bLp0l56kkDznt68j5D3DGoGTF5v6-3o6gZYXwOvAcQrcsFa3lQgenh9uktxTi5T2gIw1rbyDTlvhlb0HWMXZH_jKVrrjEOfr-is44RVMnrGK-qDr2adszNYrTGjyS54qucpRJc3O2pDpPi80UvKbo805biYvEQ9U72sXXZ-osHS0YUdrl2xpMHnMIfJYXpLzqyeE757OVfkx9cvj7ffqvuHu--3N_eVES3LVWOMRdGxjmPHBtuZTjdGaw69GPhgpTCo2dCYsqiUo0A2Iu86w80Icmyg5Svy-ej7tIwlhSlLlnzqKbqdjr9U0E79--LdRk1hryR0wAUUg48vBjH8XDBltXPJ4Dxrj2FJqpQvhlZCybMi7DhqYkgpoj19w0AdeKmtKrwOkl4BV4VX0Xz4O99J8QdQGfh0HMDS0t5hVOmAypRGYwGi1sH9x_43kmKpYw</recordid><startdate>20180401</startdate><enddate>20180401</enddate><creator>Zhang, Guo-Qiang</creator><creator>Xing, Guangming</creator><creator>Cui, Licong</creator><general>Elsevier Inc</general><scope>6I.</scope><scope>AAFTH</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><scope>5PM</scope><orcidid>https://orcid.org/0000-0001-5549-8780</orcidid></search><sort><creationdate>20180401</creationdate><title>An efficient, large-scale, non-lattice-detection algorithm for exhaustive structural auditing of biomedical ontologies</title><author>Zhang, Guo-Qiang ; Xing, Guangming ; Cui, 
Licong</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c451t-2ccfe47173e719f7c7a2caa3084939f64cea192c15366b4e1be377c3cb06b2053</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2018</creationdate><topic>Algorithms</topic><topic>Biological Ontologies</topic><topic>Biomedical ontology</topic><topic>Data Mining - methods</topic><topic>Graph-theoretic algorithm</topic><topic>Lattice vs non-lattice</topic><topic>Medical Informatics - methods</topic><topic>Partial order</topic><topic>Quality assurance</topic><topic>SNOMED CT</topic><topic>Systematized Nomenclature of Medicine</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Zhang, Guo-Qiang</creatorcontrib><creatorcontrib>Xing, Guangming</creatorcontrib><creatorcontrib>Cui, Licong</creatorcontrib><collection>ScienceDirect Open Access Titles</collection><collection>Elsevier:ScienceDirect:Open Access</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Journal of biomedical informatics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Zhang, Guo-Qiang</au><au>Xing, Guangming</au><au>Cui, Licong</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>An efficient, large-scale, non-lattice-detection algorithm for exhaustive structural auditing of biomedical ontologies</atitle><jtitle>Journal of biomedical informatics</jtitle><addtitle>J Biomed 
Inform</addtitle><date>2018-04-01</date><risdate>2018</risdate><volume>80</volume><spage>106</spage><epage>119</epage><pages>106-119</pages><issn>1532-0464</issn><eissn>1532-0480</eissn><abstract>[Display omitted]
• A new algorithm for computing all non-trivial lowest common ancestors in an ontology.
• A fundamental step in non-lattice approaches for ontology quality assurance.
• Correctness proofs for an algorithm achieving a speedup of two orders of magnitude.
• Non-lattice methods can lead to more effective tools.
One of the basic challenges in developing structural methods for systematically auditing the quality of biomedical ontologies is the computational cost usually involved in exhaustive sub-graph analysis. We introduce ANT-LCA, a new algorithm for computing all non-trivial lowest common ancestors (LCAs) of each pair of concepts in the hierarchical order induced by an ontology. Computing LCAs is a fundamental step in the non-lattice approach to ontology quality assurance. Unlike existing approaches, ANT-LCA computes LCAs only for non-trivial pairs, that is, those having at least one common ancestor. To skip all trivial pairs, which may be of no practical interest, ANT-LCA employs a simple but innovative algorithmic strategy that combines topological order and dynamic programming to keep track of non-trivial pairs. We provide correctness proofs and demonstrate a substantial reduction in computation time for two of the largest biomedical ontologies: SNOMED CT and Gene Ontology (GO). ANT-LCA achieved average computation times of 30 and 3 seconds per version for SNOMED CT and GO, respectively, about two orders of magnitude faster than the best known approaches. Our algorithm overcomes a fundamental computational barrier in sub-graph-based structural analysis of large ontological systems. It enables the implementation of a new breed of structural auditing methods that not only identify potentially problematic areas but also automatically suggest changes to fix the issues. Such structural auditing methods can lead to more effective tools supporting ontology quality assurance work.</abstract><cop>United States</cop><pub>Elsevier Inc</pub><pmid>29548711</pmid><doi>10.1016/j.jbi.2018.03.004</doi><tpages>14</tpages><orcidid>https://orcid.org/0000-0001-5549-8780</orcidid><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 1532-0464 |
ispartof | Journal of biomedical informatics, 2018-04, Vol.80, p.106-119 |
issn | 1532-0464, 1532-0480 |
language | eng |
recordid | cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_6070340 |
source | MEDLINE; Elsevier ScienceDirect Journals; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals |
subjects | Algorithms; Biological Ontologies; Biomedical ontology; Data Mining - methods; Graph-theoretic algorithm; Lattice vs non-lattice; Medical Informatics - methods; Partial order; Quality assurance; SNOMED CT; Systematized Nomenclature of Medicine |
title | An efficient, large-scale, non-lattice-detection algorithm for exhaustive structural auditing of biomedical ontologies |
url | https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-06T14%3A36%3A38IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=An%20efficient,%20large-scale,%20non-lattice-detection%20algorithm%20for%20exhaustive%20structural%20auditing%20of%20biomedical%20ontologies&rft.jtitle=Journal%20of%20biomedical%20informatics&rft.au=Zhang,%20Guo-Qiang&rft.date=2018-04-01&rft.volume=80&rft.spage=106&rft.epage=119&rft.pages=106-119&rft.issn=1532-0464&rft.eissn=1532-0480&rft_id=info:doi/10.1016/j.jbi.2018.03.004&rft_dat=%3Cproquest_pubme%3E2014956049%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2014956049&rft_id=info:pmid/29548711&rft_els_id=S1532046418300467&rfr_iscdi=true |
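
The abstract in this record describes computing lowest common ancestors only for non-trivial pairs, using topological order and dynamic programming. The Python sketch below illustrates that general idea on a toy hierarchy. It is not the authors' ANT-LCA implementation and makes no claim about its data structures or running time. The input format (a dict from each node to its direct parents), the function names `up_sets`, `nontrivial_lcas`, and `non_lattice_pairs`, and the choice to count a node as its own ancestor are all assumptions made for illustration.

```python
from collections import defaultdict
from graphlib import TopologicalSorter  # Python 3.9+
from itertools import combinations


def up_sets(parents):
    """Map each node to itself plus all of its ancestors.

    `parents` maps a node to the set of its direct parents (assumed format).
    Nodes are visited in topological order, so each parent's set is complete
    before a child reuses it (the dynamic-programming step).
    """
    up = {}
    # graphlib treats the mapped values as predecessors: parents come first.
    for node in TopologicalSorter(parents).static_order():
        up[node] = {node}
        for parent in parents.get(node, ()):
            up[node] |= up[parent]
    return up


def nontrivial_lcas(parents):
    """Return {(u, v): set of lowest common ancestors} for non-trivial pairs.

    Pairs are generated from the nodes lying at or below each ancestor, so
    pairs without any common ancestor are never enumerated.
    """
    up = up_sets(parents)
    below = defaultdict(set)  # ancestor -> nodes at or below it
    for node, ancestors in up.items():
        for a in ancestors:
            below[a].add(node)
    common = defaultdict(set)  # pair -> all common ancestors
    for a, nodes in below.items():
        for pair in combinations(sorted(nodes), 2):
            common[pair].add(a)
    lcas = {}
    for pair, shared in common.items():
        # Keep c only if no other common ancestor d lies strictly below it.
        lcas[pair] = {
            c for c in shared
            if not any(d != c and c in up[d] for d in shared)
        }
    return lcas


def non_lattice_pairs(lcas):
    """Pairs with more than one lowest common ancestor."""
    return sorted(pair for pair, lowest in lcas.items() if len(lowest) > 1)


# Toy hierarchy (hypothetical): c and d both have parents a and b.
toy = {"a": {"r"}, "b": {"r"}, "c": {"a", "b"}, "d": {"a", "b"}, "x": {"y"}}
lcas = nontrivial_lcas(toy)
print(sorted(lcas[("c", "d")]))   # ['a', 'b']
print(non_lattice_pairs(lcas))    # [('c', 'd')]
```

In the toy hierarchy, the pair `("a", "x")` never appears in the result because the two nodes share no ancestor, while `("c", "d")` has two lowest common ancestors and therefore no unique least one, which is the kind of pair a lattice would not contain. This sketch stores full ancestor sets for clarity, so it would likely need a more economical representation before being applied to an ontology the size of SNOMED CT.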