An efficient, large-scale, non-lattice-detection algorithm for exhaustive structural auditing of biomedical ontologies
Published in: | Journal of biomedical informatics 2018-04, Vol.80, p.106-119 |
---|---|
Main authors: | , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: |
• A new algorithm for computing all non-trivial lowest common ancestors in an ontology.
• A fundamental step in non-lattice approaches for ontology quality assurance.
• Correctness proofs for an algorithm achieving a speedup of 2 orders of magnitude.
• Non-lattice methods can lead to more effective tools.
One of the basic challenges in developing structural methods for systematically auditing the quality of biomedical ontologies is the computational cost usually involved in exhaustive sub-graph analysis. We introduce ANT-LCA, a new algorithm for computing all non-trivial lowest common ancestors (LCAs) of each pair of concepts in the hierarchical order induced by an ontology. Computing LCAs is a fundamental step in non-lattice approaches to ontology quality assurance. Unlike existing approaches, ANT-LCA computes LCAs only for non-trivial pairs, that is, pairs having at least one common ancestor. To skip all trivial pairs, which may be of no practical interest, ANT-LCA employs a simple but innovative algorithmic strategy that combines topological order and dynamic programming to keep track of non-trivial pairs. We provide correctness proofs and demonstrate a substantial reduction in computation time for two of the largest biomedical ontologies: SNOMED CT and Gene Ontology (GO). ANT-LCA achieved an average computation time of 30 seconds per version for SNOMED CT and 3 seconds per version for GO, about 2 orders of magnitude faster than the best known approaches. Our algorithm overcomes a fundamental computational barrier in sub-graph-based structural analysis of large ontological systems. It enables the implementation of a new breed of structural auditing methods that not only identify potentially problematic areas but also automatically suggest changes to fix the issues. Such structural auditing methods can lead to more effective tools supporting ontology quality assurance work. |
---|---|
ISSN: | 1532-0464 1532-0480 |
DOI: | 10.1016/j.jbi.2018.03.004 |
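
To make the terms in the abstract concrete, the following is a minimal Python sketch of the underlying problem, not the ANT-LCA algorithm itself, whose details are not given in this record. It builds ancestor sets in topological order (a simple form of dynamic programming), then scans every pair by brute force, skipping trivial pairs and collecting the lowest common ancestors of the rest. The function names, the toy hierarchy, and the convention that a concept counts as its own ancestor are all assumptions made for illustration. Pairs with more than one LCA are the non-lattice pairs that a structural audit would flag.

```python
from graphlib import TopologicalSorter  # Python 3.9+
from itertools import combinations


def ancestor_sets(parents):
    """Map each concept to its set of ancestors, itself included (assumed convention)."""
    anc = {}
    # static_order() yields every parent before any of its children.
    for node in TopologicalSorter(parents).static_order():
        anc[node] = {node}
        for parent in parents.get(node, ()):
            anc[node] |= anc[parent]  # reuse the parent's already computed set
    return anc


def nontrivial_lcas(parents):
    """Return {(a, b): set of LCAs} for pairs with at least one common ancestor."""
    anc = ancestor_sets(parents)
    result = {}
    for a, b in combinations(sorted(anc), 2):  # brute-force scan over all pairs
        common = anc[a] & anc[b]
        if not common:
            continue  # trivial pair: no common ancestor
        # Keep c only if no other common ancestor lies below it.
        result[(a, b)] = {
            c for c in common
            if not any(d != c and c in anc[d] for d in common)
        }
    return result


# Hypothetical toy hierarchy: concept -> list of direct parents.
parents = {
    "C": ["A", "B"],
    "D": ["A", "B"],
    "E": ["C", "D"],
    "X": [],
}

lcas = nontrivial_lcas(parents)
non_lattice = [pair for pair, lowest in lcas.items() if len(lowest) > 1]
print(non_lattice)  # [('C', 'D')]
```

In this toy hierarchy, C and D share the two incomparable ancestors A and B, so the pair has two LCAs and violates the lattice property. The brute-force scan visits every pair, including trivial ones such as (A, B); avoiding that cost on ontologies the size of SNOMED CT is the improvement the abstract attributes to ANT-LCA.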