Fast and Efficient Compression of Floating-Point Data

Large scale scientific simulation codes typically run on a cluster of CPUs that write/read time steps to/from a single file system. As data sets are constantly growing in size, this increasingly leads to I/O bottlenecks. When the rate at which data is produced exceeds the available I/O bandwidth, th...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Veröffentlicht in:	IEEE transactions on visualization and computer graphics 2006-09, Vol.12 (5), p.1245-1250
Hauptverfasser:	Lindstrom, P., Isenburg, M.
Format:	Artikel
Sprache:	eng
Schlagworte:	Analytical models Bandwidth Central processing units Compressing Computer simulation Data compression Data visualization Entropy fast entropy coding file compaction for I/O efficiency File systems Floating point arithmetic High throughput Image coding Integers large scale simulation and visualization Large-scale systems lossless compression predictive coding Predictive models range coder Studies Throughput Visualization
Online-Zugang:	Volltext bestellen
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Beschreibung
Zusammenfassung:	Large scale scientific simulation codes typically run on a cluster of CPUs that write/read time steps to/from a single file system. As data sets are constantly growing in size, this increasingly leads to I/O bottlenecks. When the rate at which data is produced exceeds the available I/O bandwidth, the simulation stalls and the CPUs are idle. Data compression can alleviate this problem by using some CPU cycles to reduce the amount of data needed to be transfered. Most compression schemes, however, are designed to operate offline and seek to maximize compression, not throughput. Furthermore, they often require quantizing floating-point values onto a uniform integer grid, which disqualifies their use in applications where exact values must be retained. We propose a simple scheme for lossless, online compression of floating-point data that transparently integrates into the I/O of many applications. A plug-in scheme for data-dependent prediction makes our scheme applicable to a wide variety of data used in visualization, such as unstructured meshes, point sets, images, and voxel grids. We achieve state-of-the-art compression rates and speeds, the latter in part due to an improved entropy coder. We demonstrate that this significantly accelerates I/O throughput in real simulation runs. Unlike previous schemes, our method also adapts well to variable-precision floating-point and integer data
ISSN:	1077-2626 1941-0506
DOI:	10.1109/TVCG.2006.143