An efficient, large-scale, non-lattice-detection algorithm for exhaustive structural auditing of biomedical ontologies

• A new algorithm for computing all non-trivial lowest common ancestors in an ontology.
• A fundamental step in non-lattice approaches for ontology quality assurance.
• Correctness proofs for an algorithm achieving 2-orders of magnitude speedup.
• Non-lattice methods can lead to more effe...

Detailed description

Bibliographic details
Published in: Journal of biomedical informatics 2018-04, Vol.80, p.106-119
Main authors: Zhang, Guo-Qiang, Xing, Guangming, Cui, Licong
Format: Article
Language: English
Online access: Full text
container_end_page 119
container_issue
container_start_page 106
container_title Journal of biomedical informatics
container_volume 80
creator Zhang, Guo-Qiang
Xing, Guangming
Cui, Licong
description • A new algorithm for computing all non-trivial lowest common ancestors in an ontology.
• A fundamental step in non-lattice approaches for ontology quality assurance.
• Correctness proofs for an algorithm achieving a speedup of two orders of magnitude.
• Non-lattice methods can lead to more effective tools.
One of the basic challenges in developing structural methods for systematically auditing the quality of biomedical ontologies is the computational cost usually involved in exhaustive sub-graph analysis. We introduce ANT-LCA, a new algorithm for computing all non-trivial lowest common ancestors (LCAs) of each pair of concepts in the hierarchical order induced by an ontology. The computation of LCAs is a fundamental step in non-lattice approaches to ontology quality assurance. Distinct from existing approaches, ANT-LCA computes LCAs only for non-trivial pairs, that is, pairs having at least one common ancestor. To skip all trivial pairs, which may be of no practical interest, ANT-LCA employs a simple but innovative algorithmic strategy that combines topological order and dynamic programming to keep track of non-trivial pairs. We provide correctness proofs and demonstrate a substantial reduction in computation time for the two largest biomedical ontologies: SNOMED CT and Gene Ontology (GO). ANT-LCA achieved an average computation time of 30 and 3 seconds per version for SNOMED CT and GO, respectively, about two orders of magnitude faster than the best known approaches. Our algorithm overcomes a fundamental computational barrier in sub-graph-based structural analysis of large ontological systems. It enables the implementation of a new breed of structural auditing methods that not only identify potentially problematic areas but also automatically suggest changes to fix the issues. Such structural auditing methods can lead to more effective tools supporting ontology quality assurance work.
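The abstract describes the idea only at a high level, so the following is a rough sketch of the general technique it names (a topological order plus dynamic programming over ancestor sets), not the authors' ANT-LCA algorithm. It assumes the hierarchy is a DAG given as a child-to-parents mapping, treats every concept as its own ancestor, and checks pairs naively; the function names `ancestor_sets` and `nontrivial_lcas` and the toy hierarchy are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from itertools import combinations


def ancestor_sets(parents):
    """Map each node to the set holding itself and all of its ancestors."""
    # TopologicalSorter expects node -> predecessors, so parents come first
    # and each ancestor set can be built from already-finished parent sets.
    up = {}
    for node in TopologicalSorter(parents).static_order():
        up[node] = {node}
        for parent in parents.get(node, ()):
            up[node] |= up[parent]
    return up


def nontrivial_lcas(parents):
    """Return {(a, b): set of lowest common ancestors} for non-trivial pairs."""
    up = ancestor_sets(parents)
    result = {}
    for a, b in combinations(sorted(up), 2):
        common = up[a] & up[b]
        if not common:
            continue  # trivial pair: no common ancestor, skip it
        # A common ancestor is "lowest" if no other common ancestor lies below it.
        result[(a, b)] = {
            c for c in common
            if not any(d != c and c in up[d] for d in common)
        }
    return result


# Hypothetical toy hierarchy: C and D share two incomparable parents (A and B),
# and X is a separate root with no relation to the rest.
toy = {"A": ["root"], "B": ["root"], "C": ["A", "B"], "D": ["A", "B"], "X": []}
lcas = nontrivial_lcas(toy)
print(lcas[("C", "D")])  # two lowest common ancestors: a non-lattice pair
```

In this toy case the pair (C, D) has more than one lowest common ancestor, which is the kind of non-lattice fragment the abstract refers to, while any pair involving X is skipped as trivial. The naive loop still visits every pair, so it illustrates the definitions rather than the speedup reported in the abstract.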
doi_str_mv 10.1016/j.jbi.2018.03.004
format Article
fullrecord <record><control><sourceid>proquest_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_6070340</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S1532046418300467</els_id><sourcerecordid>2014956049</sourcerecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2014956049</pqid></control><display><type>article</type><title>An efficient, large-scale, non-lattice-detection algorithm for exhaustive structural auditing of biomedical ontologies</title><source>MEDLINE</source><source>Elsevier ScienceDirect Journals</source><source>Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals</source><creator>Zhang, Guo-Qiang ; Xing, Guangming ; Cui, Licong</creator><identifier>ISSN: 1532-0464</identifier><identifier>EISSN: 1532-0480</identifier><identifier>DOI: 10.1016/j.jbi.2018.03.004</identifier><identifier>PMID: 29548711</identifier><language>eng</language><publisher>United States: Elsevier Inc</publisher><ispartof>Journal of biomedical informatics, 2018-04, Vol.80, p.106-119</ispartof><rights>2018 Elsevier Inc.</rights><rights>Copyright © 2018 Elsevier Inc. All rights reserved.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><orcidid>0000-0001-5549-8780</orcidid></display><links><linktohtml>$$Uhttps://www.sciencedirect.com/science/article/pii/S1532046418300467$$EHTML$$P50$$Gelsevier$$Hfree_for_read</linktohtml><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/29548711$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><addata><format>journal</format><genre>article</genre><ristype>JOUR</ristype><date>2018-04-01</date><tpages>14</tpages><cop>United States</cop><pub>Elsevier Inc</pub></addata></record>
fulltext fulltext
identifier ISSN: 1532-0464
ispartof Journal of biomedical informatics, 2018-04, Vol.80, p.106-119
issn 1532-0464
1532-0480
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_6070340
source MEDLINE; Elsevier ScienceDirect Journals; Elektronische Zeitschriftenbibliothek - Frei zugängliche E-Journals
subjects Algorithms
Biological Ontologies
Biomedical ontology
Data Mining - methods
Graph-theoretic algorithm
Lattice vs non-lattice
Medical Informatics - methods
Partial order
Quality assurance
SNOMED CT
Systematized Nomenclature of Medicine
title An efficient, large-scale, non-lattice-detection algorithm for exhaustive structural auditing of biomedical ontologies
url https://sfx.bib-bvb.de/sfx_tum?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-06T14%3A36%3A38IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=An%20efficient,%20large-scale,%20non-lattice-detection%20algorithm%20for%20exhaustive%20structural%20auditing%20of%20biomedical%20ontologies&rft.jtitle=Journal%20of%20biomedical%20informatics&rft.au=Zhang,%20Guo-Qiang&rft.date=2018-04-01&rft.volume=80&rft.spage=106&rft.epage=119&rft.pages=106-119&rft.issn=1532-0464&rft.eissn=1532-0480&rft_id=info:doi/10.1016/j.jbi.2018.03.004&rft_dat=%3Cproquest_pubme%3E2014956049%3C/proquest_pubme%3E%3Curl%3E%3C/url%3E&disable_directlink=true&sfx.directlink=off&sfx.report_link=0&rft_id=info:oai/&rft_pqid=2014956049&rft_id=info:pmid/29548711&rft_els_id=S1532046418300467&rfr_iscdi=true