Remark on Algorithm 539: A Modern Fortran Reference Implementation for Carefully Computing the Euclidean Norm
Published in: | ACM Transactions on Mathematical Software 2018-09, Vol.44 (3), p.1-23 |
---|---|
Main authors: | , |
Format: | Article |
Language: | eng |
Online access: | Full text |
Abstract: | We propose a set of new Fortran reference implementations, based on an algorithm proposed by Kahan, for the Level 1 BLAS routines *NRM2 that compute the Euclidean norm of a real or complex input vector. The principal advantage of these routines over the current offerings is that, rather than losing accuracy as the length of the vector increases, they generate results that are accurate to almost machine precision for vectors of length $N < N_{\max}$, where $N_{\max}$ depends upon the precision of the floating point arithmetic being used. In addition, we make use of intrinsic modules, introduced in the latest Fortran standards, to detect occurrences of non-finite numbers in the input data and return suitable values as well as setting IEEE floating point status flags as appropriate. A set of C interface routines is also provided to allow simple, portable access to the new routines.
To improve execution speed, we advocate a hybrid algorithm: a simple loop is used first, and we fall back on Kahan’s algorithm only if IEEE floating point exception flags signal. Since most input vectors are “easy,” i.e., they do not require the sophistication of Kahan’s algorithm, the simple loop improves performance while the use of compensated summation ensures high accuracy.
We also report on a comprehensive suite of test problems that has been developed to test both our new implementation and existing codes for both accuracy and the appropriate settings of the IEEE arithmetic status flags. |
---|---|
ISSN: | 0098-3500 1557-7295 |
DOI: | 10.1145/3134441 |
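
The hybrid strategy described in the abstract can be illustrated with a short Python sketch. This is not the authors' Fortran code, and several parts are assumptions made only for illustration: the function name `norm2_hybrid` is hypothetical, the IEEE overflow and underflow flags are emulated by inspecting each intermediate value (Python does not expose the hardware flags), the fallback is a plain max-scaling pass rather than Kahan's algorithm, and the handling of non-finite inputs (NaN takes precedence over infinity) is a simplification.

```python
import math
import sys

TINY = sys.float_info.min  # smallest positive normal double


def norm2_hybrid(x):
    """Euclidean norm of a sequence of floats (illustrative sketch only)."""
    # Non-finite inputs: assumed convention, NaN wins over infinity.
    if any(math.isnan(v) for v in x):
        return math.nan
    if any(math.isinf(v) for v in x):
        return math.inf

    # Fast path: unscaled sum of squares with compensated summation.
    s = 0.0        # running sum
    c = 0.0        # running compensation term
    flagged = False
    for v in x:
        t = v * v
        # Emulated "flags": the square overflowed, or underflowed below
        # the normal range for a nonzero input.
        if math.isinf(t) or (v != 0.0 and t < TINY):
            flagged = True
            break
        y = t - c
        u = s + y
        if math.isinf(u):  # the accumulated sum itself overflowed
            flagged = True
            break
        c = (u - s) - y
        s = u
    if not flagged:
        return math.sqrt(s)

    # Slow path (stand-in for the careful algorithm): scale by the largest
    # magnitude so every squared term lies in [0, 1].
    scale = max(abs(v) for v in x)
    s = 0.0
    c = 0.0
    for v in x:
        r = v / scale
        y = r * r - c
        u = s + y
        c = (u - s) - y
        s = u
    return scale * math.sqrt(s)
```

For a vector such as `[3.0, 4.0]` only the fast loop runs. Inputs like `[3e200, 4e200]` or `[3e-200, 4e-200]`, whose squares leave the double-precision range, take the scaled path instead of returning infinity or zero.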