Compact Dilithium implementations on Cortex-M3 and Cortex-M4

Abstract

We present implementations of the lattice-based digital signature scheme Dilithium for ARM Cortex-M3 and ARM Cortex-M4. Dilithium is one of the three signature finalists of the NIST post-quantum cryptography competition. As our Cortex-M4 target, we use the popular STM32F407-DISCOVERY development board. Compared to the previous speed records on the Cortex-M4 by Ravi, Gupta, Chattopadhyay, and Bhasin, we speed up the key operations NTT and NTT⁻¹ by 20%, which together with other optimizations results in speedups of 7%, 15%, and 9% for Dilithium3 key generation, signing, and verification, respectively. We also present the first constant-time Dilithium implementation on the Cortex-M3 and use the Arduino Due for benchmarks. For Dilithium3, we achieve on average 2 562 kilocycles for key generation, 10 667 kilocycles for signing, and 2 321 kilocycles for verification. Additionally, we present stack consumption optimizations that apply to both our Cortex-M3 and Cortex-M4 implementations. Due to the iterative nature of the Dilithium signing algorithm, there is no optimal way to achieve the best speed and the lowest stack consumption at the same time. We present three different strategies for the signing procedure that allow trading more stack and flash memory for higher speed, or vice versa. Our implementation of Dilithium3 with the smallest memory footprint uses less than 12 kB. As an additional output of this work, we present the first Cortex-M3 implementations of the key-encapsulation schemes NewHope and Kyber.
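For readers unfamiliar with the NTT and NTT⁻¹ operations named in the abstract, the following is a minimal textbook sketch in C, not the authors' optimized code. It assumes the Dilithium modulus q = 8380417 and the 512th root of unity 1753 from the Dilithium specification. The function names (`naive_ntt`, `naive_intt`, `dft_mod_q`, `mulmod`, `powmod`) are hypothetical, the transform is a plain cyclic one evaluated in O(n²) time (Dilithium itself uses a fast negacyclic transform of length 256), and the `%` operator used here is not a constant-time reduction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define Q 8380417u      /* Dilithium modulus q = 2^23 - 2^13 + 1 */
#define ROOT_512 1753u  /* primitive 512th root of unity mod q (Dilithium spec) */

/* (a * b) mod q; both operands are below 2^23, so the product fits in 64 bits. */
static uint32_t mulmod(uint32_t a, uint32_t b) {
    return (uint32_t)(((uint64_t)a * b) % Q);
}

/* base^exp mod q by square-and-multiply. */
static uint32_t powmod(uint32_t base, uint32_t exp) {
    uint32_t result = 1;
    base %= Q;
    while (exp > 0) {
        if (exp & 1u) result = mulmod(result, base);
        base = mulmod(base, base);
        exp >>= 1;
    }
    return result;
}

/* out[k] = sum_j in[j] * w^(j*k) mod q for an n-th root of unity w.
 * in and out must not overlap. */
static void dft_mod_q(const uint32_t *in, uint32_t *out, size_t n, uint32_t w) {
    for (size_t k = 0; k < n; k++) {
        uint32_t wk = powmod(w, (uint32_t)k); /* w^k */
        uint32_t x = 1;                       /* running power (w^k)^j */
        uint32_t acc = 0;
        for (size_t j = 0; j < n; j++) {
            acc = (acc + mulmod(in[j] % Q, x)) % Q;
            x = mulmod(x, wk);
        }
        out[k] = acc;
    }
}

/* Forward transform; n must divide 512 so that an n-th root of unity exists. */
void naive_ntt(const uint32_t *in, uint32_t *out, size_t n) {
    assert(n > 0 && 512 % n == 0);
    dft_mod_q(in, out, n, powmod(ROOT_512, (uint32_t)(512 / n)));
}

/* Inverse transform: same sum with w^-1, then scaling by n^-1 mod q.
 * Inverses are computed with Fermat's little theorem (q is prime). */
void naive_intt(const uint32_t *in, uint32_t *out, size_t n) {
    assert(n > 0 && 512 % n == 0);
    uint32_t w = powmod(ROOT_512, (uint32_t)(512 / n));
    uint32_t w_inv = powmod(w, Q - 2);
    uint32_t n_inv = powmod((uint32_t)n, Q - 2);
    dft_mod_q(in, out, n, w_inv);
    for (size_t i = 0; i < n; i++) out[i] = mulmod(out[i], n_inv);
}
```

Applying `naive_ntt` and then `naive_intt` to a vector of residues below q returns the original vector. Optimized implementations of the kind the abstract describes typically replace the `%` reductions with a division-free technique such as Montgomery reduction and use a butterfly-based O(n log n) structure.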

Cite

Greconici, D. O. C., Kannwischer, M. J., & Sprenkels, A. (2021). Compact Dilithium implementations on Cortex-M3 and Cortex-M4. IACR Transactions on Cryptographic Hardware and Embedded Systems, 2021(1), 1–24. https://doi.org/10.46586/tches.v2021.i1.1-24
