OntoGenix: Leveraging Large Language Models for enhanced ontology engineering from datasets
Knowledge Graphs integrate data from multiple, heterogeneous sources, using ontologies to facilitate data interoperability. Ontology development is a resource-consuming task that requires the collaborative work of domain experts and ontology engineers. Therefore, companies invest considerable resour...
Gespeichert in:
Veröffentlicht in: | Information processing & management 2025-05, Vol.62 (3), p.104042, Article 104042 |
---|---|
Hauptverfasser: | , , , , , , , , , |
Format: | Artikel |
Sprache: | eng |
Schlagworte: | |
Online-Zugang: | Volltext |
Tags: |
Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
|
Zusammenfassung: | Knowledge Graphs integrate data from multiple, heterogeneous sources, using ontologies to facilitate data interoperability. Ontology development is a resource-consuming task that requires the collaborative work of domain experts and ontology engineers. Therefore, companies invest considerable resources in order to generate and maintain Enterprise Knowledge Graphs and ontologies from large and complex datasets, most of which can be unfamiliar for ontology engineers. In this work, we study the use of Large Language Models to aid in the development of ontologies from datasets, ultimately increasing the automation of the generation of ontology-based Knowledge Graphs. As a result we have developed a structured workflow that leverages Large Language Models to enhance ontology engineering through data pre-processing, ontology planning, building, and entity improvement. Our method is also able to generate mappings and RDF data, but in this work we focus on the ontologies. The pipeline has been implemented in the OntoGenix tool. In this work we show the results of the application of OntoGenix to six datasets related to commercial activities. The findings indicate that the ontologies produced exhibit patterns of coherent modeling, and features that closely resemble those created by humans, although the most complex situations are better reflected by the ontologies developed by humans.
[Display omitted]
•OntoGenix is an LLM-powered pipeline for automating ontology development.•Ontologies developed by OntoGenix are similar to human-generated ontologies for most quality metrics applied.•Ontologies developed by OntoGenix are limited for capturing complex modeling situations.•Our experiments on e-commerce datasets show that OntoGenix modeling is consistent. |
---|---|
ISSN: | 0306-4573 |
DOI: | 10.1016/j.ipm.2024.104042 |