An Area-Efficient, Conflict-Free, and Configurable Architecture for Accelerating NTT/INTT
Published in: | IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 2024-03, Vol. 32 (3), pp. 519-529 |
---|---|
Main authors: | , , , |
Format: | Article |
Language: | English |
Subjects: | |
Abstract: | The Number Theoretic Transform (NTT) is a widely adopted method for accelerating polynomial multiplication in lattice-based cryptosystems. Consequently, numerous hardware accelerators have been developed to speed up the NTT algorithm. Area-Time Product (ATP) and configurability are critical metrics for evaluating these accelerators: ATP measures efficiency, while configurability ensures adaptability to various algorithm parameters and hardware limitations. In this article, we propose an area-efficient, conflict-free, and configurable architecture for accelerating NTT/inverse NTT (INTT). The proposed architecture adapts to various parameter settings while maintaining a consistently low ATP, even under a high level of parallelism. Moreover, our architecture supports polynomials of different degrees after compilation, further enhancing its versatility. To minimize latency, we adopt the existing merged Constant-Geometry (CG) NTT and INTT algorithm, which combines preprocessing and postprocessing and eliminates the bit-reversal operation. Building upon this foundation, we propose low-complexity memory access patterns for polynomial coefficients and twiddle factors, which contribute to the high area efficiency and configurability of our architecture. The implementation results on FPGA show that the proposed architecture outperforms existing works in terms of configurability and area efficiency. |
---|---|
ISSN: | 1063-8210, 1557-9999 |
DOI: | 10.1109/TVLSI.2023.3336951 |
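
The constant-geometry idea mentioned in the abstract can be illustrated with a minimal software sketch. This is not the article's merged NTT/INTT algorithm or its hardware memory scheme; it is a plain textbook constant-geometry (Pease-style) decimation-in-frequency NTT, written under the assumption of a power-of-two length `n` and a primitive `n`-th root of unity `omega` modulo a prime `q`. The function names (`cg_ntt`, `naive_ntt`, `bit_reverse`) and the toy parameters are illustrative only. The point it shows is that every stage reads `cur[j]` and `cur[j + n/2]` and writes `nxt[2j]` and `nxt[2j + 1]`, so the access pattern is identical across stages; the result comes out in bit-reversed order.

```python
def bit_reverse(x, bits):
    """Reverse the lowest `bits` bits of x."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (x & 1)
        x >>= 1
    return r


def naive_ntt(a, omega, q):
    """Reference O(n^2) transform: X[k] = sum_i a[i] * omega^(i*k) mod q."""
    n = len(a)
    return [sum(a[i] * pow(omega, i * k, q) for i in range(n)) % q
            for k in range(n)]


def cg_ntt(a, omega, q):
    """Constant-geometry DIF NTT (illustrative sketch).

    Assumes len(a) is a power of two and omega is a primitive
    len(a)-th root of unity mod q. Input is in natural order;
    output is in bit-reversed order.
    """
    n = len(a)
    stages = n.bit_length() - 1
    half = n // 2
    cur = [x % q for x in a]
    for s in range(stages):
        nxt = [0] * n
        for j in range(half):
            # Same read/write addresses in every stage; only the
            # twiddle exponent depends on the stage index s.
            w = pow(omega, (j >> s) << s, q)
            u, v = cur[j], cur[j + half]
            nxt[2 * j] = (u + v) % q
            nxt[2 * j + 1] = (u - v) * w % q
        cur = nxt
    return cur
```

With the toy parameters `q = 17`, `n = 8`, and `omega = 2` (2 has multiplicative order 8 modulo 17), `cg_ntt(a, 2, 17)[bit_reverse(k, 3)]` should equal `naive_ntt(a, 2, 17)[k]` for every `k`. A bit-reversed output like this is what a matching inverse transform can consume directly, which is one common way to avoid an explicit reordering pass.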