Extending Summation Precision for Network Reduction Operations
Published in: | International journal of parallel programming 2015-12, Vol.43 (6), p.1218-1243 |
---|---|
Main authors: | , , , |
Format: | Article |
Language: | eng |
Subjects: | |
Online access: | Full text |
Abstract: | Double-precision summation is at the core of numerous important algorithms, such as Newton–Krylov methods, and of other operations involving inner products, such as matrix multiplication and dot products. However, the effectiveness of summation is limited by the accumulation of rounding errors due to compressed representations, an increasing problem given the scale of modern HPC systems and data sets, where summations can easily involve millions or billions of operands. To reduce the impact of precision loss, researchers have proposed increased- and arbitrary-precision libraries that provide reproducible error or even bounded error accumulation for large sums. However, such libraries increase computation and communication time significantly and do not always guarantee an exact result. In this article, we propose fixed-point representations of double-precision variables that enable arbitrarily large summations without error and provide exact and reproducible results. We call this format big integer (BigInt). Although such formats have been studied for local processor computations, we make the case that using a fixed-point representation for distributed computation over a system-wide network is feasible, with performance comparable to that of double-precision floating-point summation. This is made possible by adding simple, inexpensive logic to modern NICs, or by using the programmable logic already found in many of them, which accelerates performance on large-scale systems and avoids waking up processors. |
ISSN: | 0885-7458 1573-7640 |
DOI: | 10.1007/s10766-014-0326-5 |
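
The idea in the abstract, that a fixed-point representation makes summation exact and independent of operand order, can be illustrated with a minimal Python sketch. This is not the article's BigInt format or its NIC logic. It assumes only that every finite double is an integer multiple of 2^-1074, so scaling by 2^1074 gives an integer that Python's arbitrary-precision `int` can add without rounding. The names `SCALE_BITS`, `to_fixed`, `from_fixed`, and `exact_sum` are hypothetical and chosen for illustration.

```python
from fractions import Fraction

# Assumption for this sketch: 2**-1074 is the smallest positive (subnormal)
# double, so every finite double is an exact integer multiple of it.
SCALE_BITS = 1074


def to_fixed(x: float) -> int:
    """Convert a finite double to an exact fixed-point integer (units of 2**-1074)."""
    # as_integer_ratio() is exact; the denominator is a power of two <= 2**1074.
    numerator, denominator = x.as_integer_ratio()
    return numerator * ((1 << SCALE_BITS) // denominator)


def from_fixed(value: int) -> float:
    """Round a fixed-point integer back to the nearest double (a single rounding)."""
    return float(Fraction(value, 1 << SCALE_BITS))


def exact_sum(values) -> float:
    """Sum finite doubles with integer addition, which is exact and associative."""
    return from_fixed(sum(to_fixed(x) for x in values))


if __name__ == "__main__":
    data = [1e16, 1.0, -1e16]
    print(sum(data))                  # naive floating-point sum loses the 1.0
    print(exact_sum(data))            # fixed-point sum keeps it
    print(exact_sum(reversed(data)))  # same result in any operand order
```

Because integer addition is associative, partial sums computed on different nodes could be combined in any order and still give the same result, which is the property a network reduction needs. The sketch handles finite inputs only: `as_integer_ratio()` raises an exception for infinities and NaN.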