Energy-efficient floating-point arithmetic for software-defined radio architectures
Main authors: | , , |
---|---|
Format: | Conference proceeding |
Language: | eng |
Subjects: | |
Online access: | Order full text |
Summary: | The lack of hardware support for floating-point arithmetic in low-power software-defined radio architectures can significantly increase their software design time due to a time-consuming process of converting floating-point code to fixed-point code. Moreover, emerging wireless communication protocols involve several matrix-based algorithms that are extremely sensitive to round-off errors in computations. Using fixed-point arithmetic for these algorithms can significantly impact the accuracy of algorithm results and may incur additional energy overhead due to the extra instructions required for fixed-point arithmetic. In this paper, we demonstrate that supporting floating-point arithmetic in hardware can deliver nearly 30% higher performance and energy efficiency than supporting only fixed-point arithmetic for key kernels of modern wireless communication protocols. The improvements can be further enhanced by our proposed high-throughput floating-point fused-multiply-add unit. Applying our proposed fused-multiply-add unit to key kernels improves performance of the baseline floating-point unit by as much as 60%, while reducing energy consumption by 30% and area by 33%. Although our approach may cause execution stalls depending on data, we show the performance impact of these stalls is negligible. We also employ dynamic range-based dynamic voltage and frequency scaling to further reduce the energy consumption of the processor by 25% for the same worst-case performance as the baseline floating-point implementation. |
---|---|
ISSN: | 1063-6862 |
DOI: | 10.1109/ASAP.2011.6043260 |
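
The summary's point about extra fixed-point instructions and round-off sensitivity can be illustrated with a small sketch. This is not the paper's implementation: it assumes a 16-bit Q15 format, and every name in it (`to_q15`, `q15_mul`, `q15_dot`, `float_dot`) is hypothetical and chosen only for this example. It shows the rounding, shift, and saturation steps that each fixed-point multiply needs, and how small products can vanish entirely.

```python
import math

Q = 15                       # assumption: Q15, 1 sign bit + 15 fractional bits
SCALE = 1 << Q
INT16_MAX, INT16_MIN = 32767, -32768


def saturate(v):
    """Clamp an integer to the signed 16-bit range."""
    return max(INT16_MIN, min(INT16_MAX, v))


def to_q15(x):
    """Quantize a float in [-1, 1) to a saturated Q15 integer."""
    return saturate(round(x * SCALE))


def q15_mul(a, b):
    """Q15 multiply: add half an LSB, shift back by Q, then saturate.

    These three steps are the extra work a fixed-point kernel performs
    on top of the integer multiply itself.
    """
    return saturate((a * b + (1 << (Q - 1))) >> Q)


def q15_dot(xs, ys):
    """Dot product with each partial product rescaled back to Q15."""
    acc = 0
    for a, b in zip(xs, ys):
        acc = saturate(acc + q15_mul(a, b))
    return acc / SCALE


def float_dot(xs, ys):
    """Reference dot product in double-precision floating point."""
    return math.fsum(a * b for a, b in zip(xs, ys))


if __name__ == "__main__":
    values = [0.001] * 4
    fixed = q15_dot([to_q15(v) for v in values], [to_q15(v) for v in values])
    print("fixed-point:", fixed)
    print("floating-point:", float_dot(values, values))
```

With inputs this small, each Q15 product rounds to zero, so the fixed-point result loses the whole sum while the floating-point result keeps it. Real fixed-point code avoids this with per-stage scaling, which is part of the conversion effort the summary describes.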