Distributed subgraph query for RDF graph data based on MapReduce


Bibliographic details
Published in: Computers & electrical engineering 2022-09, Vol.102, p.108221, Article 108221
Main authors: Su, Qianxiang; Huang, Qingrong; Wu, Nan; Pan, Ying
Format: Article
Language: English
Online access: Full text
Description
Summary:
• A subgraph query algorithm for RDF graph data in a distributed environment is proposed.
• A star decomposition method is used to decompose RDF graphs into star subgraphs.
• Query efficiency is improved by optimizing the query order of the star subgraphs.
• RDF query efficiency is improved by using adjacency lists to store RDF graphs, distributed across multiple tables.

Resource Description Framework (RDF) queries are now widely used in social networks, biomedicine, and other fields. With the explosion of RDF data driven by the Internet of Things and the Semantic Web, demand for intelligent computing and intelligent search is increasing, and querying RDF data efficiently has become a major challenge. Current query methods often introduce a large number of join operations and repeatedly traverse some subgraphs during the query process, which degrades query performance. To address these problems, this paper proposes a subgraph query algorithm for RDF graph data in a distributed environment. The graph structure is used to decompose the RDF graph into stars, and the optimal query sequence of the stars is calculated. Following this sequence produces fewer intermediate results and reduces repeated calculations. In addition, adjacency lists are used to store RDF graphs, distributed across multiple tables. Operating on multiple tables reduces the scope of subject node traversal, and matching one star per iteration further improves the efficiency of RDF subgraph queries. Experimental results show that the proposed method improves the query efficiency of RDF subgraphs.
ISSN:0045-7906
1879-0755
DOI:10.1016/j.compeleceng.2022.108221