An Area-Efficient, Conflict-Free, and Configurable Architecture for Accelerating NTT/INTT

Bibliographic details
Published in:	IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 2024-03, Vol. 32 (3), pp. 519-529
Main authors:	Liu, Si-Huang; Kuo, Chia-Yi; Mo, Yan-Nan; Su, Tao
Format:	Article
Language:	English
Subjects:	Accelerators; Algorithms; Clocks; Computer architecture; Computer systems; Configurable architecture for accelerating number theoretic transform (NTT)/inverse NTT (INTT); conflict-free memory access pattern; Cryptography; Efficiency; Hardware; lattice-based cryptosystems; Lattices; NTT; Parallel processing; Parameters; polynomial multiplier; Polynomials; Random access memory; Transforms; Very large scale integration

Description
Abstract:	The Number Theoretic Transform (NTT) is a widely adopted method for accelerating polynomial multiplication in lattice-based cryptosystems. Consequently, numerous hardware accelerators have been developed to enhance the speed of the NTT algorithm. Area-Time Product (ATP) and configurability are critical metrics for evaluating these accelerators. ATP measures efficiency, while configurability ensures adaptability to various algorithm parameters and hardware limitations. In this article, we propose an area-efficient, conflict-free, and configurable architecture for accelerating NTT/inverse NTT (INTT). The proposed architecture demonstrates adaptability to various parameter settings while maintaining a consistently low ATP, even under a high level of parallelism. Moreover, our architecture supports polynomials of different degrees after compilation, further enhancing its versatility. To minimize latency, we adopt the existing merged Constant-Geometry (CG) NTT and INTT algorithm to combine preprocessing and postprocessing and further eliminate the bit-reverse operation in the algorithm. Building upon this foundation, we propose low-complexity memory access patterns for polynomial coefficients and twiddle factors, which contribute to high area efficiency and configurability in our architecture. The implementation results on FPGA substantiate that the proposed architecture exhibits remarkable superiority over existing works in terms of configurability and area efficiency.
ISSN:	1063-8210, 1557-9999
DOI:	10.1109/TVLSI.2023.3336951
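
Illustrative sketch: the abstract refers to NTT-based polynomial multiplication with separate preprocessing and postprocessing steps, which the article's merged Constant-Geometry algorithm folds into the transform. The Python sketch below shows only the textbook, unmerged formulation for multiplication in Z_q[x]/(x^n + 1); it is not the article's algorithm or architecture. The toy parameters (q = 17, n = 4, psi = 2) and all function names are assumptions chosen for illustration. It needs Python 3.8+ for modular inverses via pow(x, -1, q).

```python
def ntt(a, omega, q):
    """Recursive radix-2 NTT; len(a) is a power of two and omega is a
    primitive len(a)-th root of unity modulo q."""
    n = len(a)
    if n == 1:
        return a[:]
    omega_sq = omega * omega % q
    even = ntt(a[0::2], omega_sq, q)
    odd = ntt(a[1::2], omega_sq, q)
    out = [0] * n
    w = 1
    for k in range(n // 2):
        t = w * odd[k] % q
        out[k] = (even[k] + t) % q           # butterfly, upper output
        out[k + n // 2] = (even[k] - t) % q  # butterfly, lower output
        w = w * omega % q
    return out


def negacyclic_mul(a, b, psi, q):
    """Multiply a and b in Z_q[x]/(x^n + 1); psi is a primitive 2n-th root
    of unity modulo q (assumed, not checked)."""
    n = len(a)
    omega = psi * psi % q
    # Preprocessing: weight coefficient i by psi^i.
    a_w = [x * pow(psi, i, q) % q for i, x in enumerate(a)]
    b_w = [x * pow(psi, i, q) % q for i, x in enumerate(b)]
    # Pointwise product in the transform domain.
    prod = [x * y % q for x, y in zip(ntt(a_w, omega, q), ntt(b_w, omega, q))]
    # Inverse transform: NTT with omega^-1, then scale by n^-1.
    c_w = ntt(prod, pow(omega, -1, q), q)
    n_inv = pow(n, -1, q)
    psi_inv = pow(psi, -1, q)
    # Postprocessing: unweight coefficient i by psi^-i.
    return [x * n_inv * pow(psi_inv, i, q) % q for i, x in enumerate(c_w)]


def schoolbook_negacyclic(a, b, q):
    """O(n^2) reference multiplication using x^n = -1."""
    n = len(a)
    c = [0] * n
    for i in range(n):
        for j in range(n):
            k = i + j
            if k < n:
                c[k] = (c[k] + a[i] * b[j]) % q
            else:
                c[k - n] = (c[k - n] - a[i] * b[j]) % q
    return c


# Toy parameters (assumed): 2 is a primitive 8th root of unity mod 17.
Q, PSI = 17, 2
a = [1, 2, 3, 4]
b = [5, 6, 7, 8]
assert negacyclic_mul(a, b, PSI, Q) == schoolbook_negacyclic(a, b, Q)
```

The weighting by psi^i and the final unweighting by psi^-i are the preprocessing and postprocessing steps that a merged algorithm absorbs into the twiddle factors. The recursive even/odd split used here also hides the bit-reverse reordering that an in-place iterative implementation would have to handle explicitly.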