Reading vector data more efficiently: Assessing performance of the OGR simple feature library


Bibliographic details
Published in: Transactions in GIS 2022-02, Vol.26 (1), p.201-213
Main authors: Yang, Anran, Jia, Qingren, Zhong, Zhinong, Jing, Ning
Format: Article
Language: English
Description
Summary: Reading vector data files into in‐memory data models efficiently is a crucial step in handling the ever‐growing volume of geographical data. Although advanced IO solutions like distributed file systems or Message Passing Interface parallel IO work well in some high‐end computing environments, the 20‐year‐old OGR simple features library is still the de facto tool for loading vector files when developing GIS algorithms in most scenarios, and it becomes inefficient as data grow larger. In this article, we analyze the bottleneck of the OGR library and find that an excessive number of small objects is the main source of slowness. We then offer advice to improve efficiency when using OGR. To further verify our findings and provide an alternative to the OGR library in performance‐sensitive scenarios, we develop a library based on continuous memory pools to avoid small objects. Experiments show that our advice is effective and that our library can be several times faster than the OGR library for IO‐intensive programs.
ISSN:1361-1682, 1467-9671
DOI:10.1111/tgis.12840
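
The summary attributes the slowness to many small objects and describes a library built on continuous memory pools. The sketch below is not the authors' library. It is only a minimal illustration of the general idea, assuming a hypothetical CoordPool class that keeps every vertex in one flat buffer and records where each geometry starts, instead of allocating one object per point.

```python
from array import array


class CoordPool:
    """Hypothetical contiguous pool (illustration only, not the article's API)."""

    def __init__(self):
        # All coordinates live in one flat buffer: x0, y0, x1, y1, ...
        self.xy = array("d")
        # offsets[i] is the vertex index where geometry i starts;
        # offsets[i + 1] is where it ends.
        self.offsets = array("Q", [0])

    def add_linestring(self, points):
        """Append a linestring's vertices and return its geometry id."""
        for x, y in points:
            self.xy.append(x)
            self.xy.append(y)
        self.offsets.append(len(self.xy) // 2)
        return len(self.offsets) - 2

    def vertex_count(self, geom_id):
        return self.offsets[geom_id + 1] - self.offsets[geom_id]

    def vertices(self, geom_id):
        """Materialize (x, y) tuples for one geometry on demand."""
        start, end = self.offsets[geom_id], self.offsets[geom_id + 1]
        return [(self.xy[2 * i], self.xy[2 * i + 1]) for i in range(start, end)]


pool = CoordPool()
a = pool.add_linestring([(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)])
b = pool.add_linestring([(5.0, 5.0), (6.0, 7.0)])
```

Here the two linestrings share a single buffer of ten doubles plus a short offsets array, so loading more geometries grows two arrays rather than creating a separate object for each point.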