Geospatial Queries on Data Collection Using a Common Provenance Model

Bibliographic details
Published in: ISPRS International Journal of Geo-Information, 2021-03, Vol. 10 (3), p. 139
Main authors: Closa, Guillem, Masó, Joan, Julià, Núria, Pons, Xavier
Format: Article
Language: English
Description
Abstract: Lineage information is the part of the metadata that describes “what”, “when”, “who”, “how”, and “where” geospatial data were generated. If it is well presented and queryable, lineage becomes valuable for inferring data quality, tracing error sources, and increasing trust in geospatial information. In addition, if the lineage of a collection of datasets can be related and presented together, datasets, process chains, and methodologies can be compared. This paper proposes extending process step lineage descriptions into four explicit levels of abstraction (process run, tool, algorithm, and functionality). Including functionality and algorithm descriptions as part of lineage provides high-level information that is independent of the details of the software used. It therefore becomes possible to transform lineage metadata that initially documents specific processing steps into a reusable workflow that describes a set of operations as a processing chain. This paper presents a system that provides lineage information as a service in a distributed environment. The system is complemented by an integrated provenance web application that can visualize and query a provenance graph composed of the lineage of a collection of datasets. The International Organization for Standardization (ISO) 19115 family of standards and the World Wide Web Consortium (W3C) provenance initiative (W3C PROV) were combined to integrate the provenance of a collection of datasets. To represent lineage elements, the ISO 19115-2 lineage class names were chosen, because they name the geospatial objects involved more precisely. The relationship naming conventions of W3C PROV are used to represent relationships among these elements. The elements and relationships are presented in a queryable graph.
ISSN: 2220-9964
DOI: 10.3390/ijgi10030139
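
As a rough illustration of the queryable graph described in the abstract, the following Python sketch stores the lineage of a small collection of datasets as triples. The node kinds reuse ISO 19115-2 class names (LE_Source, LE_ProcessStep) and the edge labels reuse W3C PROV relation names (used, wasGeneratedBy). Everything else is an assumption made for illustration: the dataset and step identifiers, the tool and algorithm names, and the helper functions are hypothetical, and the mapping the authors actually implemented may differ.

```python
from collections import defaultdict

# Node kinds borrow ISO 19115-2 lineage class names.
# All identifiers are hypothetical examples, not taken from the paper.
NODES = {
    "dem": "LE_Source",
    "slope_a": "LE_Source",
    "slope_b": "LE_Source",
    "hazard_a": "LE_Source",
    "run_1": "LE_ProcessStep",
    "run_2": "LE_ProcessStep",
    "run_3": "LE_ProcessStep",
}

# Edges are (subject, relation, object) triples labelled with W3C PROV relation names.
EDGES = [
    ("run_1", "used", "dem"),
    ("slope_a", "wasGeneratedBy", "run_1"),
    ("run_2", "used", "dem"),
    ("slope_b", "wasGeneratedBy", "run_2"),
    ("run_3", "used", "slope_a"),
    ("hazard_a", "wasGeneratedBy", "run_3"),
]

# Each process step described at the four abstraction levels named in the abstract.
# Tool and algorithm names are placeholders.
LEVELS = {
    "run_1": {"process_run": "run_1", "tool": "ToolA", "algorithm": "alg_x", "functionality": "slope"},
    "run_2": {"process_run": "run_2", "tool": "ToolB", "algorithm": "alg_y", "functionality": "slope"},
    "run_3": {"process_run": "run_3", "tool": "ToolA", "algorithm": "alg_z", "functionality": "reclassify"},
}

# Index outgoing edges by (subject, relation) for quick lookup.
OUT = defaultdict(list)
for subject, relation, obj in EDGES:
    OUT[(subject, relation)].append(obj)


def upstream_sources(dataset):
    """Return every dataset that the given dataset was derived from, directly or indirectly."""
    found, stack = set(), [dataset]
    while stack:
        current = stack.pop()
        # dataset --wasGeneratedBy--> step --used--> source dataset
        for step in OUT[(current, "wasGeneratedBy")]:
            for source in OUT[(step, "used")]:
                if source not in found:
                    found.add(source)
                    stack.append(source)
    return found


def steps_with(level, value):
    """Return the process steps whose description matches a value at one abstraction level."""
    return sorted(step for step, desc in LEVELS.items() if desc[level] == value)


if __name__ == "__main__":
    print(upstream_sources("hazard_a"))
    print(steps_with("functionality", "slope"))
    print(steps_with("tool", "ToolA"))
```

Querying at the functionality level groups run_1 and run_2 together even though they ran in different tools, which is the kind of software-independent comparison the abstract attributes to the higher abstraction levels.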