ECO-BIKE: Bridging the Gap Between PQC BIKE and GPU Acceleration

Bibliographic Details
Published in:	IEEE Transactions on Information Forensics and Security, 2024, Vol. 19, pp. 8952-8965
Main Authors:	Dong, Jiankuo; Fu, Yusheng; Qin, Xusheng; Dong, Zhenjiang; Xiao, Fu; Lin, Jingqiang
Format:	Article
Language:	English
Subjects:	BIKE; Cryptography; CUDA; Encapsulation; GPU; Graphics processing units; Karatsuba; NIST; Parallel processing; Polynomials; PQC; Throughput

Description
Summary:	Advancements in quantum computing pose a threat to public-key cryptosystems, leading to the development of post-quantum cryptography (PQC). NIST is standardizing candidate algorithms, with BIKE, a code-based key encapsulation mechanism, among those under consideration. Performance is crucial in the NIST PQC standardization process, and researchers have introduced a range of optimization techniques for BIKE across various platforms. To the best of our knowledge, our Efficient CryptOgraphy BIKE (ECO-BIKE) is the first attempt to optimize an implementation of BIKE on a GPU architecture. In this paper, we introduce a comprehensive 3-thread parallel architecture tailored to the BIKE cryptosystem. The architecture covers computational tasks from low-level to high-level operations. These include a parallel dense polynomial multiplication scheme with an improved memory access pattern and an improved XOR calculation, which forms the basis of a parallel execution framework for the entire BIKE algorithm. Targeted optimizations are implemented for the individual modules (KEYGEN, ENCAPS, DECAPS), which together improve the overall efficiency of the algorithm. ECO-BIKE achieves high throughput on the NVIDIA GeForce RTX 4090. In 3-thread mode, the throughput of the KEYGEN, ENCAPS, and DECAPS modules reaches 24.033 kops/s, 277.789 kops/s, and 5.817 kops/s, respectively. Our proposed optimal parallel multiplication scheme achieves a significantly higher overall throughput of 481.302 kops/s. These results highlight the substantial computational advantages our approach provides for cryptographic workloads.
ISSN:	1556-6013, 1556-6021
DOI:	10.1109/TIFS.2024.3443617