Demystifying Graph Databases: Analysis and Taxonomy of Data Organization, System Designs, and Graph Queries
Main authors: | , , , , , , , |
---|---|
Format: | Article |
Language: | eng |
Subjects: | |
Summary: | Graph processing has become an important part of multiple areas of computer
science, such as machine learning, computational sciences, medical
applications, social network analysis, and many others. Numerous graphs such as
web or social networks may contain up to trillions of edges. Often, these
graphs are also dynamic (their structure changes over time) and have
domain-specific rich data associated with vertices and edges. Graph database
systems such as Neo4j enable storing, processing, and analyzing such large,
evolving, and rich datasets. Due to the sheer size of such datasets, combined
with the irregular nature of graph processing, these systems face unique design
challenges. To facilitate the understanding of this emerging domain, we present
the first survey and taxonomy of graph database systems. We focus on
identifying and analyzing fundamental categories of these systems (e.g., triple
stores, tuple stores, native graph database systems, or object-oriented
systems), the associated graph models (e.g., RDF or Labeled Property Graph),
data organization techniques (e.g., storing graph data in indexing structures
or dividing data into records), and different aspects of data distribution and
query execution (e.g., support for sharding and ACID). 51 graph database
systems are presented and compared, including Neo4j, OrientDB, or Virtuoso. We
outline graph database queries and relationships with associated domains (NoSQL
stores, graph streaming, and dynamic graph algorithms). Finally, we describe
research and engineering challenges to outline the future of graph databases. |
---|---|
DOI: | 10.48550/arxiv.1910.09017 |