An Extensible Approach to Searching and Selecting Data Sources for Materialized Big Data Integration in Distributed Computing Environments

Bibliographic Details
Published in: Pattern Recognition and Image Analysis, 2023-06, Vol. 33 (2), p. 147-156
Main Authors: Sazontev, V. V., Stupnikov, S. A.
Format: Article
Language: English
Description
Summary: This work concerns big data integration in distributed computing environments. One of the challenges of data integration is searching for and selecting relevant data sources. Many existing systems act as registries that hold descriptions of, and links to, data sources. These registries implement various types of search, such as keyword search and/or semantic search. An extensible approach is proposed for embedding various types of data source retrieval systems into a materialized big data integration system deployed in a distributed computing environment. An automated process for searching and selecting relevant data sources within the integration system is described. The implemented software components are described, and an example of embedding one of the search systems into a prototype data integration system is given.
ISSN: 1054-6618, 1555-6212
DOI: 10.1134/S1054661823020141