Which Data Format To Store Scientific Data Should I Use? A Performance Analysis
Published in: | Acta electrotechnica et informatica 2022-09, Vol.22 (3), p.32-40 |
---|---|
Main authors: | , , |
Format: | Article |
Language: | English |
Subjects: | |
Online access: | Full text |
Abstract: | A lot of scientific work is dedicated to the analysis of data. Most of the analyzed data, like data from space missions, are structured. The choice of data format can affect various characteristics: read/write speed of standard files, read/write speed of small files, and read/write speed of compressed data formats. In this paper, we analyze binary data formats, propose test types and testing methods, and compare the performance of these formats with that of a human-readable text format. We also discuss the compressed and uncompressed modes available for data formats like HDF5 and netCDF. When disregarding precision, the best data format from the size perspective is lossy HDF5 without compression. Lossless HDF5 without compression shows the best speed performance. Lossy HDF5 without compression is the best balance between size reduction and speed. However, for specific criteria and types of files, there might be better candidates, as detailed in the conclusion. |
---|---|
ISSN: | 1338-3957 |
DOI: | 10.2478/aei-2022-0015 |