EXOD: A tool for building and exploring a large graph of open datasets

Bibliographic details
Published in: Computers & Graphics, 2014-04, Vol. 39, pp. 117-130
Main authors: Liu, Tianyang; Bouali, Fatma; Venturini, Gilles
Format: Article
Language: English
Description
Abstract: We present EXOD (EXploration of Open Datasets), a tool for the visual analysis of a large collection of open datasets. EXOD aims to help users find datasets of interest. It first downloads a large collection of datasets from an open data website and, for each dataset, extracts its metadata and its content. To describe each dataset in a vector space, EXOD extracts features with text mining techniques, considering both the metadata and the content of each dataset. Using this feature space, EXOD establishes a proximity graph by computing the Relative Neighborhood Graph. Because of the size of the collection, EXOD uses a GPU-based implementation to build this graph. We visualize the graph with the Tulip software and provide a visual and interactive global map of the collection. We developed a dedicated plug-in for Tulip to download and open the datasets interactively. All of the presented results concern the French Open Data. EXOD was able to process 293,000 datasets, and half of this collection was visualized in Tulip. We show how clusters and other information can be discovered and how the created links can be used for local, content-based exploration.

Highlights:
• We propose a tool called EXOD to analyze a large collection of open datasets.
• Text mining is used to extract features from open datasets.
• A GPU-based approach is proposed for computing the Relative Neighborhood Graph.
• Graph visualization techniques are used as a user interface for information retrieval.
• Results are presented on the French Open Data.
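
For readers unfamiliar with the Relative Neighborhood Graph named in the abstract, the following is a minimal sketch of its standard definition: two points are linked unless some third point is strictly closer to both of them than they are to each other. It is a brute-force CPU illustration on toy 2D points, not the authors' GPU-based implementation; the function name `relative_neighborhood_graph`, the Euclidean distance, and the sample points are assumptions made for this example only.

```python
from itertools import combinations
from math import dist


def relative_neighborhood_graph(points):
    """Return the RNG edges as a set of index pairs (i, j) with i < j.

    Brute-force O(n^3) check of the standard RNG definition; this is an
    illustrative sketch, not the GPU method described in the article.
    """
    n = len(points)
    # Precompute all pairwise Euclidean distances (assumed metric).
    d = [[dist(points[i], points[j]) for j in range(n)] for i in range(n)]
    edges = set()
    for i, j in combinations(range(n), 2):
        # (i, j) is an edge unless some third point k is strictly closer
        # to both i and j than i and j are to each other.
        blocked = any(
            max(d[i][k], d[j][k]) < d[i][j]
            for k in range(n)
            if k != i and k != j
        )
        if not blocked:
            edges.add((i, j))
    return edges


# Hypothetical toy data: three collinear points and one distant point.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (1.0, 5.0)]
edges = relative_neighborhood_graph(points)
print(sorted(edges))
```

In a setting like the one the abstract describes, each point would instead be the feature vector of one dataset, and the cubic cost of this naive check is the kind of workload that motivates a parallel implementation.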
ISSN: 0097-8493, 1873-7684
DOI: 10.1016/j.cag.2013.11.014