Scalable and Conflict-Free NTT Hardware Accelerator Design: Methodology, Proof, and Implementation
Published in: | IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2023-05, Vol. 42 (5), pp. 1504-1517 |
---|---|
Main authors: | , , , , , , , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Abstract: | The number theoretic transform (NTT) accelerates polynomial multiplication, which is the main performance bottleneck in next-generation cryptographic schemes. Different NTT-based cryptographic algorithms have different security settings, and their diverse application scenarios introduce different cost-performance tradeoffs and hardware constraints. Motivated by the emerging demand for more versatile NTT hardware accelerators, we propose a new design methodology that can generate area-efficient and high-performance NTT accelerators for any polynomial length and modulus, using either a single processing element (PE) or a PE array with a varying number of layers. The proposed NTT accelerator architecture pivots on a conflict-free memory access pattern that adapts to different combinations of security and PE array configuration parameters. This memory access pattern is formally proved to be conflict-free for any parametric configuration. A criterion for avoiding read-after-write conflicts without pipeline stalls is also established. The methodology produces NTT accelerators with a single PE or a multilayer PE array for different polynomial sizes and moduli, with hardware area and computational efficiency comparable to accelerators customized for a fixed set of parameters. The resulting parameterized accelerators are more scalable than existing parameterized accelerator designs. On average, the accelerators generated by the proposed method are 71.4% more area-time efficient. For the same security parameters, they achieve up to 30.7% area-time reduction over the most area-time efficient state-of-the-art scalable NTT accelerator. |
---|---|
ISSN: | 0278-0070, 1937-4151
DOI: | 10.1109/TCAD.2022.3205552 |
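
The abstract states that the NTT accelerates polynomial multiplication. As a rough software illustration of that idea only (not the paper's hardware architecture or its memory access pattern), the sketch below multiplies two polynomials with a textbook radix-2 NTT. The toy parameters `Q = 17`, `N = 8`, and `ROOT = 2`, the function names, and the choice of cyclic reduction modulo x^N - 1 are assumptions made for this example; lattice-based schemes typically use much larger parameters and a negacyclic variant.

```python
# Toy parameters (assumed for illustration): Q is prime, N is a power of two
# dividing Q - 1, and ROOT is a primitive N-th root of unity modulo Q.
Q = 17
N = 8
ROOT = 2  # 2**8 % 17 == 1 and 2**4 % 17 == 16 (i.e. -1), so the order is exactly 8


def ntt(values, root, q):
    """Textbook iterative radix-2 Cooley-Tukey NTT; returns a new list."""
    a = list(values)
    n = len(a)
    # Bit-reversal permutation so that the butterflies can run in place.
    j = 0
    for i in range(1, n):
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j ^= bit
        if i < j:
            a[i], a[j] = a[j], a[i]
    # log2(n) stages of butterflies.
    length = 2
    while length <= n:
        w_len = pow(root, n // length, q)  # primitive length-th root of unity
        half = length // 2
        for start in range(0, n, length):
            w = 1
            for k in range(half):
                u = a[start + k]
                v = a[start + k + half] * w % q
                a[start + k] = (u + v) % q
                a[start + k + half] = (u - v) % q
                w = w * w_len % q
        length <<= 1
    return a


def intt(values, root, q):
    """Inverse NTT: forward transform with root^-1, then scale by n^-1."""
    n = len(values)
    inv_root = pow(root, q - 2, q)  # Fermat inverse, valid because q is prime
    inv_n = pow(n, q - 2, q)
    return [x * inv_n % q for x in ntt(values, inv_root, q)]


def ntt_multiply(a, b, root=ROOT, q=Q):
    """Cyclic product a * b mod (x^n - 1, q) via pointwise multiplication."""
    fa, fb = ntt(a, root, q), ntt(b, root, q)
    return intt([x * y % q for x, y in zip(fa, fb)], root, q)


def schoolbook_multiply(a, b, q=Q):
    """O(n^2) reference cyclic convolution, used to check the NTT result."""
    n = len(a)
    out = [0] * n
    for i in range(n):
        for j in range(n):
            out[(i + j) % n] = (out[(i + j) % n] + a[i] * b[j]) % q
    return out


if __name__ == "__main__":
    a = [1, 2, 3, 4, 0, 0, 0, 0]
    b = [5, 6, 7, 8, 0, 0, 0, 0]
    print(ntt_multiply(a, b))
```

The transform replaces the quadratic schoolbook loop with log2(N) stages of N/2 butterflies each. Each butterfly reads and writes two operands, which is why a hardware design with several PEs has to schedule memory accesses so that concurrent butterflies do not contend for the same memory bank.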