LE QUY DON
Technical University

High-Efficiency Multi-Standard Polynomial Multiplication Accelerator on RISC-V SoC for Post-Quantum Cryptography

Dam, D.-T. and Nguyen, T.-H. and Tran, T.-H. and Le, D.-H. and Hoang, T.-T. and Pham, C.-K. (2024) High-Efficiency Multi-Standard Polynomial Multiplication Accelerator on RISC-V SoC for Post-Quantum Cryptography. IEEE Access, 12. pp. 195015-195031.

Full text not available from this repository.

Abstract

The Number Theoretic Transform (NTT) speeds up polynomial multiplication, thereby accelerating implementations of lattice-based post-quantum cryptography (PQC) algorithms. The standardized PQC algorithms FIPS 203 (CRYSTALS-Kyber) and FIPS 204 (CRYSTALS-Dilithium), as well as FIPS 206 (FALCON), which is still being standardized, all use the NTT to perform polynomial multiplication. This paper proposes a high-speed, low-complexity, run-time configurable accelerator that supports all three standards. First, we propose a unified design using four parallel radix-2 butterflies, targeting a high-speed polynomial multiplier. With this unified design, the accelerator performs NTT, inverse NTT (INTT), point-wise multiplication (PWM), and matrix-vector polynomial multiplication. Second, we propose a compact, configurable reordering unit for effective coefficient processing under high parallelism. As a bonus, the required memory size is minimal and the memory access pattern is straightforward. Finally, we present a RISC-V SoC architecture with a loosely coupled accelerator, using register-map communication, together with the data flow for accelerating NTT-based operations in software. The FPGA implementation results show that NTT/INTT/PWM executions take 224/224/64 clock cycles (CCs) for Kyber, 512/512/128 CCs for Dilithium, 576/576/128 CCs for FALCON-512, and 1280/1280/256 CCs for FALCON-1024, respectively. The Area × Time Product (ATP) results also show superiority over other algorithm-specific and configurable designs, achieving improvements of up to 82, 63, 79, and 50 for Kyber, Dilithium, FALCON-512, and FALCON-1024, respectively. The SoC implementation results show that the NTT-based operations improve by up to 5.29×, 27.49×, 56.79×, and 58.91× over software, with speed-ups of up to 10.53×, 9.81×, 9.57×, and 9.99× for the considered algorithms compared to previous SW/HW works on RISC-V platforms. © 2013 IEEE.
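
For readers unfamiliar with the technique named in the abstract, the following is a minimal software sketch of NTT-based multiplication in Z_q[x]/(x^n + 1) built from radix-2 butterflies. It is an illustration only and not the accelerator's design: the function names (`find_psi`, `ntt`, `negacyclic_mul`, `schoolbook_negacyclic`) are hypothetical, the modulus q = 12289 is the one used by FALCON, and the pre/post-scaling by powers of a 2n-th root of unity is one common textbook way to obtain the negacyclic product, assumed here for simplicity.

```python
Q = 12289  # FALCON modulus; q - 1 = 2^12 * 3, so 2n-th roots exist for n <= 2048


def find_psi(n, q=Q):
    """Search for a primitive 2n-th root of unity mod q (psi^n == -1 mod q)."""
    for g in range(2, q):
        psi = pow(g, (q - 1) // (2 * n), q)
        if pow(psi, n, q) == q - 1:
            return psi
    raise ValueError("no primitive 2n-th root of unity for this n and q")


def ntt(a, omega, q=Q):
    """Iterative radix-2 cyclic NTT; omega is a primitive n-th root of unity."""
    n = len(a)
    a = list(a)
    # Bit-reversal permutation so the butterflies can run in place.
    j = 0
    for i in range(1, n):
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j ^= bit
        if i < j:
            a[i], a[j] = a[j], a[i]
    # log2(n) stages, each made of n/2 radix-2 butterflies.
    length = 2
    while length <= n:
        w_len = pow(omega, n // length, q)
        half = length // 2
        for start in range(0, n, length):
            w = 1
            for k in range(half):
                u = a[start + k]
                v = a[start + k + half] * w % q
                a[start + k] = (u + v) % q         # butterfly sum
                a[start + k + half] = (u - v) % q  # butterfly difference
                w = w * w_len % q
        length *= 2
    return a


def negacyclic_mul(a, b, q=Q):
    """Multiply a and b in Z_q[x]/(x^n + 1): NTT, point-wise product, inverse NTT."""
    n = len(a)
    psi = find_psi(n, q)
    omega = psi * psi % q
    psi_pows = [pow(psi, i, q) for i in range(n)]
    a_hat = ntt([x * p % q for x, p in zip(a, psi_pows)], omega, q)
    b_hat = ntt([x * p % q for x, p in zip(b, psi_pows)], omega, q)
    c_hat = [x * y % q for x, y in zip(a_hat, b_hat)]  # point-wise multiplication
    c = ntt(c_hat, pow(omega, -1, q), q)               # unscaled inverse transform
    n_inv = pow(n, -1, q)
    psi_inv = pow(psi, -1, q)
    return [x * n_inv * pow(psi_inv, i, q) % q for i, x in enumerate(c)]


def schoolbook_negacyclic(a, b, q=Q):
    """O(n^2) reference product, using x^n = -1 to wrap high-degree terms."""
    n = len(a)
    c = [0] * n
    for i in range(n):
        for j in range(n):
            k = i + j
            if k < n:
                c[k] = (c[k] + a[i] * b[j]) % q
            else:
                c[k - n] = (c[k - n] - a[i] * b[j]) % q
    return c
```

Calling `negacyclic_mul(a, b)` on two length-n coefficient lists (n a power of two) should agree with `schoolbook_negacyclic(a, b)`. Each stage performs n/2 butterflies, so a full size-n transform needs (n/2)·log2(n) of them; for n = 512 that is 2304 butterflies, or 576 steps when four run in parallel, which is consistent with the FALCON-512 cycle count quoted above.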

Item Type: Article
Divisions: Offices > Office of International Cooperation
Identification Number: 10.1109/ACCESS.2024.3520592
Uncontrolled Keywords: C (programming language); Conformal mapping; Integrated circuit design; Inverse transforms; Light velocity; Network coding; Network security; Parallel architectures; Polynomial approximation; Quantum electronics; Reconfigurable architectures; System-on-chip; Clock cycles; Dilithium; FIPS 203; FIPS 204; FIPS 206; Number-theoretic transforms; Polynomial multiplication; Post quantum cryptography; RISC-V; Software/hardware; Quantum cryptography
URI: http://eprints.lqdtu.edu.vn/id/eprint/11485
