Identifying Excessively Rounded or Truncated Data

COMPSTAT 2006, Rome Italy, Physica-Verlag, Heidelberg, pp. 313-324, 2006 All data are digitized, and hence are essentially integers rather than true real numbers. Ordinarily this causes no difficulties since the truncation or rounding usually occurs below the noise level. However, in some instances,...

Ausführliche Beschreibung

Gespeichert in:
Bibliographische Detailangaben
Hauptverfasser: Knuth, Kevin H, Castle, J. Patrick, Wheeler, Kevin R
Format: Artikel
Sprache:eng
Schlagworte:
Online-Zugang:Volltext bestellen
Tags: Tag hinzufügen
Keine Tags, Fügen Sie den ersten Tag hinzu!
Beschreibung
Zusammenfassung:COMPSTAT 2006, Rome Italy, Physica-Verlag, Heidelberg, pp. 313-324, 2006 All data are digitized, and hence are essentially integers rather than true real numbers. Ordinarily this causes no difficulties since the truncation or rounding usually occurs below the noise level. However, in some instances, when the instruments or data delivery and storage systems are designed with less than optimal regard for the data or the subsequent data analysis, the effects of digitization may be comparable to important features contained within the data. In these cases, information has been irrevocably lost in the truncation process. While there exist techniques for dealing with truncated data, we propose a straightforward method that will allow us to detect this problem before the data analysis stage. It is based on an optimal histogram binning algorithm that can identify when the statistical structure of the digitization is on the order of the statistical structure of the data set itself.
DOI:10.48550/arxiv.1602.04292