This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).
The computing environment is changing with the spread of resource-constrained devices such as smartphones and other smart devices. Because our daily life is deeply intertwined with ubiquitous networks, the importance of security is growing. A lightweight encryption algorithm is essential for secure communication between such resource-constrained devices, and many researchers have been investigating this field. Recently, a lightweight block cipher called LEA was proposed. LEA was originally designed for efficient implementation on microprocessors: it is fast in software and has a small memory footprint. To suit current processors, all required calculations use 32-bit wide operations. In addition, the algorithm consists not of complex S-box-like structures but of simple addition, rotation, and XOR operations. To the best of our knowledge, this paper is the first report on a comprehensive hardware implementation of LEA. We present various hardware structures and their implementation results for each key size. Even though LEA was originally targeted at software efficiency, it also shows high efficiency when implemented in hardware.
Recent improvements in semiconductor technology have made the computing environment mobile and accelerated the shift to ubiquitous computing. The use of small mobile devices is growing explosively, and the importance of security is increasing daily. One of the essential ingredients of smart device security is a block cipher, and small mobile devices require lightweight, energy-efficient implementation techniques.
Techniques for securing resource-constrained devices such as RFID (Radio-frequency Identification) tags have been proposed. In 2005, Lim and Korkishko [
Both lightweight block ciphers and methods to optimize legacy block ciphers have been studied. Moradi
Recently, the Electronics and Telecommunications Research Institute in Korea announced a new lightweight block cipher called LEA [
Cryptographic hardware for small devices in resource-constrained environments, such as RFID tags or smart meters for smart grids, usually calls for a small chip size and reasonably fast encryption. In this paper, we propose several methods to optimize LEA hardware for all key sizes and present implementation results in terms of time and chip area cost. This work is the first comprehensive study of LEA hardware implementation. LEA was originally designed for software implementation, but we aim to demonstrate that it is also efficient when implemented in hardware.
The rest of this paper is organized as follows: We introduce the LEA algorithm in Section 2, and then present elemental techniques for implementing LEA in hardware in Section 3. Section 4 presents hardware structures for the 128-, 192-, and 256-bit key versions of LEA, and the corresponding implementation results are presented in Section 5. We conclude this paper in Section 6.
In this section, we introduce the LEA block cipher. LEA has 128-bit message blocks and 128-, 192-, or 256-bit keys. We denote the versions of this algorithm as LEA-128, LEA-192, and LEA-256 according to key length.
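As a compact reference, the parameters of the three versions can be collected in a small table. The sketch below is illustrative only: the names `LeaParams` and `LEA_VERSIONS` are hypothetical and not part of any LEA specification, and the round and constant counts are those stated later in this section.

```python
from dataclasses import dataclass

BLOCK_BITS = 128      # LEA always processes 128-bit blocks
ROUND_KEY_BITS = 192  # each round consumes six 32-bit round-key words


@dataclass(frozen=True)
class LeaParams:
    key_bits: int   # master key length
    rounds: int     # number of round iterations
    constants: int  # number of 32-bit key-schedule constants


# Hypothetical lookup table keyed by version name.
LEA_VERSIONS = {
    "LEA-128": LeaParams(key_bits=128, rounds=24, constants=4),
    "LEA-192": LeaParams(key_bits=192, rounds=28, constants=6),
    "LEA-256": LeaParams(key_bits=256, rounds=32, constants=8),
}

for name, p in LEA_VERSIONS.items():
    print(name, p.key_bits // 32, "key words,", p.rounds, "rounds")
```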
We present notations and corresponding descriptions required to explain the LEA algorithm in
The LEA-128, LEA-192, and LEA-256 key schedules use four, six, and eight 32-bit constants, respectively. Each constant is defined as follows:
The constants are generated from the hexadecimal expression of
At the beginning of the LEA-128 key schedule, the key state
The key schedule of LEA-192 also starts with setting
Likewise, the key schedule of LEA-256 starts with setting
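The three key schedules can be modelled in software as in the following minimal sketch. It is a behavioural Python model, not the hardware design of this paper. The constant values and rotation amounts are assumptions taken from the published LEA specification to the best of our knowledge and should be verified against it before use; the names `DELTA`, `ROT`, `rol`, and `key_schedule` are illustrative.

```python
MASK = 0xFFFFFFFF

# Key-schedule constants (assumed values; verify against the LEA specification).
DELTA = (0xC3EFE9DB, 0x44626B02, 0x79E27C8A, 0x78DF30EC,
         0x715EA49E, 0xC785DA0A, 0xE04EF22A, 0xE5C40957)
# Left-rotation amount applied to each updated key-state word (assumed).
ROT = (1, 3, 6, 11, 13, 17)
ROUNDS = {4: 24, 6: 28, 8: 32}  # number of key words -> number of rounds


def rol(x, r):
    """Rotate the 32-bit word x left by r bits (r is taken modulo 32)."""
    r %= 32
    return ((x << r) | (x >> (32 - r))) & MASK


def key_schedule(key_words):
    """Return one 6-word (192-bit) round key per round for a 4-, 6-, or 8-word key."""
    n = len(key_words)
    t = list(key_words)            # key state, initialised with the master key
    round_keys = []
    for i in range(ROUNDS[n]):
        d = DELTA[i % n]           # LEA-128/192/256 cycle through 4/6/8 constants
        if n == 4:
            idx = (0, 1, 2, 3)     # LEA-128 updates all four state words
        else:
            # LEA-192: always words 0..5; LEA-256: a window sliding over 8 words
            idx = tuple((6 * i + j) % n for j in range(6))
        for j, k in enumerate(idx):
            t[k] = rol((t[k] + rol(d, i + j)) & MASK, ROT[j])
        if n == 4:
            # LEA-128 reuses the second state word for three round-key words.
            round_keys.append((t[0], t[1], t[2], t[1], t[3], t[1]))
        else:
            round_keys.append(tuple(t[k] for k in idx))
    return round_keys
```

In this model the only differences between the versions are the number of state words, the number of constants, and which state words are updated in each round.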
As described in Section 2.1, LEA-128/192/256 iterates in 24/28/32 rounds. Unlike AES [
The final
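The round function can be modelled in a few lines of software. The sketch below assumes the round structure of the LEA specification (three XOR-add-rotate branches with rotations of 9, 5, and 3 bits, plus a word-wise shift of the state); the function names are illustrative and the code is not the hardware datapath of this paper. The inverse round is included to show that every step can be undone with subtraction modulo 2^32.

```python
MASK = 0xFFFFFFFF


def rol(x, r):
    """Rotate the 32-bit word x left by r bits (0 < r < 32)."""
    return ((x << r) | (x >> (32 - r))) & MASK


def ror(x, r):
    """Rotate the 32-bit word x right by r bits (0 < r < 32)."""
    return ((x >> r) | (x << (32 - r))) & MASK


def encrypt_round(x, rk):
    """One round: x holds four 32-bit words, rk holds six 32-bit round-key words."""
    return (
        rol(((x[0] ^ rk[0]) + (x[1] ^ rk[1])) & MASK, 9),
        ror(((x[1] ^ rk[2]) + (x[2] ^ rk[3])) & MASK, 5),
        ror(((x[2] ^ rk[4]) + (x[3] ^ rk[5])) & MASK, 3),
        x[0],  # the first input word passes through unchanged
    )


def decrypt_round(y, rk):
    """Inverse of encrypt_round; the words are recovered in the order x0, x1, x2, x3."""
    x0 = y[3]
    x1 = ((ror(y[0], 9) - (x0 ^ rk[0])) & MASK) ^ rk[1]
    x2 = ((rol(y[1], 5) - (x1 ^ rk[2])) & MASK) ^ rk[3]
    x3 = ((rol(y[2], 3) - (x2 ^ rk[4])) & MASK) ^ rk[5]
    return (x0, x1, x2, x3)


def encrypt(block, round_keys):
    """Apply one round per round key (24, 28, or 32 rounds depending on key size)."""
    for rk in round_keys:
        block = encrypt_round(block, rk)
    return block


def decrypt(block, round_keys):
    """Undo encrypt by applying the inverse rounds in reverse key order."""
    for rk in reversed(round_keys):
        block = decrypt_round(block, rk)
    return block
```

The three additions in a round are independent of one another, which is what allows a speed-oriented design to evaluate them in parallel.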
This section describes elemental hardware structures used for implementing LEA hardware.
LEA employs several constants for key scheduling. To design the constant schedule logic, the usage patterns of constants need to be analyzed. In
To minimize the number of gates required, some logic gates are shared and used iteratively within a round. In an area-optimized implementation, one round can be split into several clock cycles. Therefore, the four constants must be generated one by one within a round. The intuitive structure of the constant scheduling logic is depicted in
An alternative logic structure for area-optimized LEA is depicted in
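One way to see why the constants can be produced one by one is to note that, in the LEA-128 key schedule, the four constants used in a round are successive 1-bit rotations of the same base constant. The sketch below is a behavioural model of that observation only. It assumes the usage pattern ROL_{i+j}(δ[i mod 4]) for j = 0..3 in round i and the constant values of the LEA specification; it is not the logic structure of the figures in this section, and all names are illustrative.

```python
MASK = 0xFFFFFFFF
# Assumed LEA-128 constants; verify against the LEA specification.
DELTA = (0xC3EFE9DB, 0x44626B02, 0x79E27C8A, 0x78DF30EC)


def rol(x, r):
    """Rotate the 32-bit word x left by r bits (r is taken modulo 32)."""
    r %= 32
    return ((x << r) | (x >> (32 - r))) & MASK


def constants_direct(rounds=24):
    """Reference: compute every constant with a variable-distance rotation."""
    return [rol(DELTA[i % 4], i + j) for i in range(rounds) for j in range(4)]


def constants_streamed(rounds=24):
    """Model of four constant registers that only ever rotate left by one bit."""
    # Register k is preloaded with delta[k] rotated left by k bits.
    regs = [rol(d, k) for k, d in enumerate(DELTA)]
    out = []
    for i in range(rounds):
        k = i % 4                      # register selected in round i
        for _ in range(4):             # one constant per clock cycle
            out.append(regs[k])
            regs[k] = rol(regs[k], 1)  # fixed 1-bit rotation, no barrel shifter
    return out


assert constants_streamed() == constants_direct()
```

Each register is selected once every four rounds and is rotated four times while selected, so it is already aligned for its next use and no variable-distance rotation is needed in this model.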
In this section, we describe hardware implementation methods for the three key sizes and the two optimization goals (speed or area). Even though the three key versions of LEA use the same round function, their key scheduling algorithms differ in structure. Therefore, the key scheduling logic cannot be shared among the versions, and each version requires its own hardware implementation. The following subsections describe each LEA implementation with a focus on the key scheduling method. Each version is denoted LEA-KEYSIZE-OPTIMIZATION GOAL according to its key size and optimization goal (e.g., LEA-128-SPEED refers to the 128-bit version of the LEA implementation optimized for speed).
Plaintexts
The round function is the same as that used by LEA-128, but it differs in terms of the key schedule logic. First, the key input sequence differs from that found in LEA-128-AREA-1. Keys
LEA-192-SPEED in
The structure of LEA-256-SPEED is depicted in
All of the designs described in Section 4 were implemented at the register transfer level (RTL) in Verilog. We present FPGA synthesis results for two well-known device families: the Xilinx Virtex 5 series and the Altera Cyclone-III series. The Xilinx designs were synthesized using ISE 13.4, and the Altera designs using Quartus-II 11.1sp2.
The implementation results for the Xilinx Virtex 5 chip are summarized in
We also used the same RTL code to implement the designs as an ASIC using Synopsys's Design Compiler B-2008-09.SP5 and the UMC 0.13 µm technology library. The maximum target frequency was 100 MHz, and all the designs met the timing constraints.
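The derived figures for the LEA-128 designs in the result tables can be reproduced from the cycle count, the clock frequency, and the area. The sketch below assumes that latency is cycles divided by frequency, that throughput is the 128-bit block size divided by latency, and that efficiency is throughput divided by area. These formulas are inferred from the reported numbers and are not stated explicitly in the text, and the function names are illustrative.

```python
BLOCK_BITS = 128  # LEA block size in bits


def latency_us(cycles, freq_mhz):
    """Time to encrypt one block, in microseconds."""
    return cycles / freq_mhz


def throughput_mbps(cycles, freq_mhz, bits=BLOCK_BITS):
    """Bits processed per microsecond, i.e., Mbit/s."""
    return bits * freq_mhz / cycles


def efficiency(cycles, freq_mhz, area, bits=BLOCK_BITS):
    """Throughput per unit area (area in the unit reported by the synthesis tool)."""
    return throughput_mbps(cycles, freq_mhz, bits) / area


# LEA-128-SPEED in the ASIC flow: 24 cycles at 100 MHz, total area 5,426.
print(round(latency_us(24, 100.0), 2))        # 0.24
print(round(throughput_mbps(24, 100.0), 2))   # 533.33
print(round(efficiency(24, 100.0, 5426), 2))  # 0.1
```

The same functions reproduce the FPGA figures, for example 168 cycles at 269.658 MHz for LEA-128-AREA-1 on the Virtex 5 gives about 205.45 Mbit/s.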
In this paper, we proposed hardware designs and implementations of a new lightweight encryption algorithm, LEA. LEA uses the same round function irrespective of key size, but the key scheduling differs, so we presented a suitable hardware design for each key size. For the area-optimized versions, we presented a resource-shared structure and showed that on-the-fly key scheduling or scheduling two keys simultaneously reduces the number of clock cycles. For the speed-optimized versions, we parallelized all operations required for a round and thereby achieved high throughput. We implemented our designs in Verilog HDL and synthesized them for FPGA chips and an ASIC, targeting commonly used FPGA chips and an open library for the ASIC. The implementation results show that the area-optimized versions save little area compared to the speed-optimized versions. This is because the structure of LEA is so simple that sharing components saves little. Consequently, the speed-optimized versions show better throughput per area than the area-optimized versions: the area savings are small, while the speed is significantly lower. Compared to other published results, our design is not the best in throughput per area, but it ranks among the better designs and is the best in throughput. We leave further improvements to our designs as future work.
This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MEST) (No. 2010-0026621).
The authors declare no conflict of interest.
Round function of LEA.
Constant scheduling logic structure for speed-optimized LEA hardware.
Intuitive constant scheduling logic structure for area-optimized LEA hardware.
Alternative constant scheduling logic structure for area-optimized LEA hardware.
Datapath of LEA-128-AREA-1.
Datapath of LEA-128-AREA-2.
Datapath of LEA-128-SPEED.
Datapath of LEA-192-AREA-1.
Datapath of LEA-192-AREA-2.
Datapath of LEA-192-SPEED.
Datapath of LEA-256-AREA-1.
Datapath of LEA-256-AREA-2.
Datapath of LEA-256-SPEED.
Generalized throughput and area graph to compare relative performance (Xilinx Virtex-5).
Generalized throughput and area graph to compare relative performance (Altera Cyclone-III).
Generalized throughput and area graph to compare relative performance (ASIC).
Notations used to explain LEA algorithm.
128-bit plaintext. | |
128-bit ciphertext. | |
Length of bit sequence | |
Master key. | |
Intermediate value of the | |
Intermediate value of the | |
Constant value used for the key schedule. | |
Number of round iterations. | |
192-bit round key used for the i-th round.
| |
⊕ | XOR operation. |
⊞ | Addition modulo 2^{32}. |
Comparison of implementation results using Xilinx Virtex 5.
LEA-128-AREA-1 | 168 | 269.658 | 0.62 | 16.8 | 205.45 | 392 | 249 | 503 | 311.86 | 0.41 |
LEA-128-AREA-2 | 96 | 163.861 | 0.59 | 9.6 | 218.48 | 388 | 306 | 559 | 329.81 | 0.39 |
LEA-128-SPEED | 24 | 217.806 | 0.11 | 2.4 | 1,161.63 | 386 | 713 | 854 | 93.94 | 1.36 |
LEA-192-AREA-1 | 168 | 197.797 | 0.85 | 16.8 | 226.05 | 423 | 408 | 620 | 527 | 0.36 |
LEA-192-AREA-2 | 84 | 198.364 | 0.42 | 8.4 | 453.40 | 514 | 403 | 709 | 297.78 | 0.64 |
LEA-192-SPEED | 28 | 218.250 | 0.13 | 2.8 | 1,496.57 | 508 | 911 | 1,103 | 143.39 | 1.36 |
LEA-256-AREA-1 | 288 | 257.652 | 1.12 | 28.8 | 229.02 | 663 | 713 | 994 | 1,113.28 | 0.23 |
LEA-256-AREA-2 | 192 | 169.2 | 1.13 | 19.2 | 225.60 | 649 | 987 | 1,003 | 1,133.4 | 0.22 |
LEA-256-SPEED | 32 | 126.23 | 0.25 | 3.2 | 1,009.84 | 645 | 1,131 | 1,137 | 284.3 | 0.89 |
Comparison of implementation results using Altera Cyclone-III.
LEA-128-AREA-1 | 168 | 184.47 | 0.91 | 16.8 | 140.55 | 392 | 632 | 680 | 618.8 | 0.21 |
LEA-128-AREA-2 | 96 | 97.98 | 0.98 | 9.6 | 130.64 | 391 | 721 | 721 | 706.6 | 0.18 |
LEA-128-SPEED | 24 | 121.91 | 0.20 | 2.4 | 650.19 | 389 | 812 | 813 | 162.6 | 0.80 |
LEA-192-AREA-1 | 168 | 119.03 | 1.41 | 16.8 | 136.03 | 520 | 823 | 828 | 1,167.5 | 0.16 |
LEA-192-AREA-2 | 84 | 119.13 | 0.71 | 8.4 | 272.30 | 519 | 864 | 881 | 625.5 | 0.31 |
LEA-192-SPEED | 28 | 122.35 | 0.23 | 2.8 | 838.97 | 517 | 1,003 | 1,003 | 230.7 | 0.84 |
LEA-256-AREA-1 | 288 | 174.76 | 1.65 | 28.8 | 155.34 | 650 | 996 | 1,044 | 1,722.6 | 0.15 |
LEA-256-AREA-2 | 192 | 169.2 | 1.13 | 19.2 | 225.60 | 649 | 987 | 1,003 | 1,133.4 | 0.22 |
LEA-256-SPEED | 32 | 126.23 | 0.25 | 3.2 | 1,009.84 | 645 | 1,131 | 1,137 | 284.3 | 0.89 |
Comparison of ASIC implementation results (UMC 0.13 µm, target frequency: 100 MHz).
LEA-128-AREA-1 | 168 | 1.68 | 76.19 | 1,707.5 | 2,118.5 | 3,826 | 6,427.7 | 0.02 |
LEA-128-AREA-2 | 96 | 0.96 | 133.33 | 2,157.75 | 2,137.75 | 4,295.5 | 4,123.7 | 0.03 |
LEA-128-SPEED | 24 | 0.24 | 533.33 | 3,309.25 | 2,116.75 | 5,426 | 1,302.2 | 0.10 |
LEA-192-AREA-1 | 168 | 1.68 | 114.29 | 2,245 | 2,813.5 | 5,058.5 | 8,498.3 | 0.02 |
LEA-192-AREA-2 | 84 | 0.84 | 228.57 | 2,538.5 | 2,812.5 | 5,351 | 4,494.8 | 0.04 |
LEA-192-SPEED | 28 | 0.28 | 685.71 | 3,907.75 | 2,823.5 | 6,731.25 | 1,884.8 | 0.10 |
LEA-256-AREA-1 | 288 | 2.88 | 88.89 | 2,376.5 | 3,555.75 | 5,932.25 | 17,084.9 | 0.01 |
LEA-256-AREA-2 | 192 | 1.92 | 133.33 | 2,440.75 | 3,655.5 | 6,096.25 | 11,704.8 | 0.02 |
LEA-256-SPEED | 32 | 0.32 | 800.00 | 4,142.5 | 3,540 | 7,682.5 | 2,458.4 | 0.10 |
Comparison to other encryption algorithms.
DESL [ | 56 | 64 | 144 | 44.4 | 0.18 | 1,848 | 0.024026 |
KATAN [ | 80 | 64 | 255 | 25.1 | 0.13 | 1,054 | 0.023814 |
HIGHT [ | 128 | 64 | 34 | 188.2 | 0.25 | 3,048 | 0.061745 |
PRESENT [ | 128 | 64 | 32 | 200.0 | 0.18 | 1,570 | 0.127389 |
PRESENT [ | 128 | 64 | 547 | 11.7 | 0.18 | 1,075 | 0.010884 |
HummingBird2 [ | 128 | 16 | 4 | 400.0 | 0.18 | 3,220 | 0.124224 |
HummingBird2 [ | 128 | 16 | 20 | 80.0 | 0.18 | 2,159 | 0.037054 |
AES [ | 128 | 128 | 226 | 56.6 | 0.13 | 2,400 | 0.023583 |
LED [ | 128 | 64 | 1,872 | 3.4 | 0.18 | 1,265 | 0.002688 |
LEA-128-SPEED | 128 | 128 | 24 | 533.3 | 0.13 | 5,426 | 0.098286 |
DESXL [ | 184 | 64 | 144 | 44.4 | 0.18 | 2,168 | 0.02048 |
LEA-192-SPEED | 192 | 128 | 28 | 457.1 | 0.13 | 6,731 | 0.06791 |
LEA-256-SPEED | 256 | 128 | 32 | 400.0 | 0.13 | 7,683 | 0.052063 |