Automated extraction of concept matcher thesaurus from semi-structured catalogue-like sources of data on the web

Bibliographic details
Main author: Lapaev, Maxim
Format: Conference proceedings
Language: English
Subjects:
Online access: Full text
Description
Summary: Designing an ontology and populating a data-set with knowledge that follows the chosen or developed ontology, so that it fits the principles of the Semantic Web and Linked Open Data, is a time-consuming and iterative process that requires either expert knowledge or a set of tools for scraping data from the web. A valid and consistent ontology, and consistent knowledge within the data-set, require unification of concepts, which means overcoming the ambiguity and synonymy of the terms that become individuals of the ontology. In this paper we focus on the techniques used for organising a Russian food product data-set under a light-weight FOOD Ontology, and on concept matching in particular. The main approaches to data-set concept unification and synonymic term matching, and ways of collecting dictionaries for the matcher, are outlined. A tool for parsing catalogue-like semi-structured resources and extracting a thesaurus is developed and introduced for the task of on-the-fly concept matching.
ISSN: 2305-7254, 2343-0737
DOI: 10.1109/FRUCT-ISPIT.2016.7561521