Assisting Biologists in Editing Taxonomic Information by Confronting Multiple Data Sources using Linked Data Standards

Bibliographic Details
Published in: Biodiversity Information Science and Standards 2019-06, Vol. 3 (37421)
Main authors: Michel, Franck; Faron-Zucker, Catherine; Tercerie, Sandrine; Ettorre, Antonia; Gargominy, Olivier
Format: Article
Language: English
Description
Abstract: During the last decade, Web APIs (Application Programming Interfaces) have gained significant traction to the extent that they have become a de facto standard to enable HTTP-based, machine-processable data access. Despite this success, however, they still often fail in making data interoperable, insofar as they commonly rely on proprietary data models and vocabularies that lack formal semantic descriptions essential to ensure reliable data integration. In the biodiversity domain, multiple data aggregators, such as the Global Biodiversity Information Facility (GBIF) and the Encyclopedia of Life (EoL), maintain specialized Web APIs giving access to billions of records about taxonomies, occurrences, or life traits (Triebel et al. 2012). They publish data sets spanning complementary and often overlapping regions, epochs or domains, but may also report or rely on potentially conflicting perspectives, e.g. with respect to the circumscription of taxonomic concepts. It is therefore of utmost importance for biologists and collection curators to be able to confront the knowledge they have about taxa with related data coming from third-party data sources. To tackle this issue, the French National Museum of Natural History (MNHN) has developed an application to edit TAXREF, the French taxonomic register for fauna, flora and fungi (Gargominy et al. 2018). TAXREF registers all species recorded in metropolitan France and overseas territories, accounting for 260,000+ biological taxa (200,000+ species) along with 570,000+ scientific names. The TAXREF-Web application compares data available in TAXREF with corresponding data from third-party data sources, points out disagreements and allows biologists to add, remove or amend TAXREF accordingly. This requires that TAXREF-Web developers write a specific piece of code for each considered Web API to align the TAXREF representation with the Web API counterpart.
This task is time-consuming and makes maintenance of the web application cumbersome. In this presentation, we report on a new implementation of TAXREF-Web that harnesses the Linked Data standards: Resource Description Framework (RDF), the Semantic Web format to represent knowledge graphs, and SPARQL, the W3C standard to query RDF graphs. In addition, we leverage the SPARQL Micro-Service architecture (Michel et al. 2018), a lightweight approach to query Web APIs using SPARQL. A SPARQL micro-service is a SPARQL endpoint that wraps a Web API service; it typically produces a smal
ISSN: 2535-0897
DOI: 10.3897/biss.3.37421