1. Introduction
In the modern era of digital communication, data security is a fundamental concern due to increasing cyberattack and unauthorized access threats. Among the various cryptographic algorithms, the Advanced Encryption Standard (AES) has emerged as the predominant algorithm due to its robustness, efficiency, and standardized structure [1]. Implementing AES with a Finite State Machine (FSM) architecture provides a modular, systematic approach to managing encryption processes, which is particularly well-suited for hardware implementations [2]. Field-programmable gate arrays (FPGAs) offer reconfigurable logic and parallel processing capabilities, making them an ideal platform for high-throughput cryptographic systems [3,4]. Previous studies have demonstrated FPGA implementations achieving speeds of up to 3 Gbps on devices such as Spartan-6 and Artix-7, with optimizations targeting area, power, or speed [5,6]. Conversely, MATLAB is a versatile environment for modeling and simulation, enabling the validation of algorithms and verification of functionality before hardware deployment. Its high-level syntax and built-in matrix operations allow for the quick development of AES encryption for various applications, including image, video, and signal processing [7,8].
Despite extensive research on AES implementation, a detailed comparative analysis of FSM-based AES encryption in FPGA and MATLAB environments remains limited. Each platform introduces unique constraints and advantages in terms of resource utilization, timing performance, and design complexity [9]. This study aims to address this gap by presenting a structured design for FSM-controlled AES encryption implemented on FPGA hardware and MATLAB simulations. Through empirical analysis and synthesis reports, the research evaluated critical metrics such as throughput, latency, logic utilization, and scalability. While earlier studies have documented encryption times as low as 87 ms in MATLAB and throughput above 4 Gbps in optimized FPGA architectures [6,8], few have systematically compared these platforms within a unified design framework. These findings aim to provide researchers and engineers with practical insights for selecting the most suitable platform for their application-specific requirements in the design of secure embedded systems [10].
Reference [8] explores the application of AES-128 encryption to grayscale images using MATLAB. The authors provide a structured overview of AES operations, including SubBytes, ShiftRows, MixColumns, AddRoundKey, and Key Expansion, and then demonstrate the integration of these operations within a modular MATLAB program. The study included experiments on three grayscale images with dimensions of 120 × 160, 256 × 256, and 336 × 412 pixels. Observed encryption and decryption times were 50.125 and 57.266 s for the smallest image, 115.703 and 154.032 s for the medium-sized image, and 214.047 and 293.125 s for the largest image, respectively. The findings indicated that processing time increased substantially with image size, underscoring the performance limitations of using MATLAB for large-scale data processing. Despite its instructional value, the paper has several critical limitations. First, it does not include finite state machine (FSM) design details or diagrams, despite the title suggesting such an approach. Second, the implementation is limited to MATLAB, and there is no comparison with hardware platforms, such as FPGAs, which would be essential for evaluating performance in real-time systems [8].
Paper [6] provides a thorough implementation of the 128-bit Advanced Encryption Standard (AES) using MATLAB, focusing on simulating encryption and decryption processes. The authors provide a thorough description of the structure of the AES algorithm and the implementation of its core byte-oriented operations, including SubBytes, ShiftRows, MixColumns, and AddRoundKey. These operations are outlined using MATLAB functions. The experimental section includes measurements of key setup time, single-round encryption time, and full encryption and decryption times. The highest recorded encryption and decryption times were 87.57 and 88.007 milliseconds, respectively. These results were obtained using a modern PC with 4 GB of RAM and an Intel i5 processor. These results were considerably faster than those of certain embedded implementations (e.g., 2.06 s in an EDK-based system), demonstrating MATLAB’s efficiency for prototyping. Despite the aforementioned merits, the study is subject to several limitations. First, it lacks broader performance metrics, such as memory usage, throughput, and scalability for large datasets. Additionally, the design omits formal modeling, such as state diagrams or finite state machine (FSM) representations, which would have strengthened the clarity and portability of the AES logic structure.
Efficient hardware implementation is a central theme in leading electrical engineering and computer science journals [11]. For example, the method proposed in [11] enhances a conventional, non-pipelined Advanced Encryption Standard (AES) algorithm by integrating a PN sequence generator that dynamically creates S-box values and the initial key. This approach centers on a linear feedback shift register (LFSR) with specific feedback taps and a secret seed value. It aims to enhance security by making these fundamental components unpredictable to an attacker. The security of this modification was evaluated using the Strict Avalanche Criterion (SAC). However, a critical evaluation reveals certain shortcomings in the research. The selection of the PN generator’s parameters is justified solely as “proof of concept” without deeper analysis of their cryptographic impact. Furthermore, the claim that the design is “invulnerable to attacks” is an overstatement based solely on avalanche effect results. In addition, while the design improves throughput, it does so at the cost of significantly increased area usage, a trade-off that is acknowledged but not thoroughly examined for its practical implications. For the hardware implementation, the authors’ primary design was implemented on a Spartan-6 XC6SLX150 FPGA device, with plaintext and a 128-bit key as inputs to the AES algorithm. The synthesis results for this component showed a resource utilization of 5566 slices. The design achieved a maximum operating frequency (F_max) of 237.45 MHz, resulting in a calculated throughput of 3.03 Gbps based on a 10-cycle latency. The efficiency, measured as Mbps/slice, was 0.54 for this device implementation [11].
Several other studies have addressed different aspects of AES on FPGAs. One study [12] examined the use of pipelining techniques to reduce area and delay time in the parts of the AES process that require a large number of resources, such as the SubBytes stage. This approach has been shown to increase throughput to 79.7 Gbps and FPGA efficiency to 13.3 Mbps/slice. Research [13] focuses on preventing unauthorized access to data by external parties by implementing the AES algorithm in hardware on an FPGA, an approach described as uncomplicated, flexible, and efficient. That study centers on the utilization of Slice Registers (SRs), Look-Up Tables (LUTs), Input/Outputs (I/Os), and Global Buffers (BUFGs) to enhance data security at the hardware level. Research [14] focuses on using a real-time hardware platform to encrypt multimedia data, especially video, using the AES algorithm. The proposed system uses a CMOS camera as an input device and processes the data directly using an AES encryption processor developed with Xilinx System Generator. This processor is integrated as a dedicated peripheral with a MicroBlaze 32-bit soft RISC processor. A comparison of previous AES research can be seen in Table 1.
For the target application domain, an FSM-based AES design is preferable because it offers a practical balance of performance, resource efficiency, and design simplicity, all of which are critical requirements for secure embedded systems. Compared with fully pipelined or unrolled designs, an FSM-controlled architecture significantly reduces hardware area and power consumption by reusing a single round data path across multiple encryption rounds, making it well suited for low- to mid-range FPGA devices. The finite state machine’s deterministic sequencing provides precise control over each AES operation, resulting in predictable latency and straightforward integration with embedded processors, memory interfaces, and control logic. Additionally, the modular, state-oriented structure simplifies verification and validation against high-level reference models, such as MATLAB implementations, which enhances the reliability and portability of the design. While FSM-based implementations generally have lower peak throughput than highly parallel architectures, their throughput remains sufficient for many real-time and embedded security applications where cost, scalability, and power efficiency are prioritized over extreme data rates.
In light of these findings, our study proposes a comprehensive design and comparative analysis of FSM-based AES encryption on MATLAB and FPGA platforms. By leveraging MATLAB for rapid simulation and functional validation and the FPGA for high-throughput deployment under resource constraints, the study aims to bridge the gap between software prototyping and hardware realization. The evaluation focuses on critical performance metrics, namely execution time, throughput, and resource utilization. Furthermore, this study addresses the limitations of prior research by incorporating FSM modeling and detailed synthesis results. This dual-platform investigation provides valuable insights for researchers and engineers seeking optimized, tailored cryptographic solutions that balance algorithmic flexibility and real-time hardware efficiency.
2. Materials and Methods
The Advanced Encryption Standard (AES) is a symmetric block cipher algorithm that secures digital data using substitution, permutation, and key-mixing operations. The research was conducted in the following steps:
An in-depth literature review of various encryption and decryption models with a focus on algorithmic structures and implementation methods to identify their strengths and weaknesses.
A modified AES algorithm employing a finite state machine (FSM) approach was then designed using MATLAB. The algorithm was developed as a software prototype and tested using multimedia data (text, audio, images, and videos) in bitstream form.
The modified, FSM-based algorithm was then converted into a circuit design and implemented in an FPGA using VHDL.
The testing process was conducted through behavioral simulation with numerical data under the assumption that the multimedia data had been prepared for encryption and decryption.
The final stage yielded a functional decryption IP core ready for FPGA integration.
2.1. Implementation Environment and Design Tools
The AES architecture proposed in this study is resource-efficient and was developed and validated using a standard hardware and software co-design workflow. The hardware logic was described in VHDL, and the design was synthesized using the Xilinx ISE 14.7 design suite (Xilinx, Inc., San Jose, CA, USA). Comprehensive behavioral simulations were conducted using the integrated ISim Simulator to verify logical correctness and analyze signal timing. The clock frequency constraint was left at its default setting of 100 MHz, which corresponds to a clock period of 10 ns (10,000 ps). The final synthesized IP core was implemented and tested on a Digilent Nexys board (Digilent c/o NI, 11500 N Mopac Expy, Austin, TX, USA) with a Xilinx Artix-7 100T FPGA (part number XC7A100T-1CSG324C). The design properties of the FPGA are shown in Figure 1. This board served as the target hardware platform for all performance evaluations.
In conjunction with the hardware development, MATLAB was used as a high-level modeling and simulation tool. It was employed to validate the functional behavior of the AES algorithm, verify intermediate transformation stages, and provide baseline performance metrics in a controlled software environment. The design methodology for all transformations employed sequential logic techniques governed by a finite state machine (FSM) because the output of each stage depended on previous inputs and required storage elements.
2.2. Architectural Design: An FSM-Based AES
The primary architectural innovation of this work is the redesign of the dataflow of the AES algorithm, which is managed by a finite state machine. This approach was selected to overcome the inherent inefficiencies of conventional sequential logic in field-programmable gate array (FPGA) implementations. In such cases, hardware components are often redundantly re-declared for each computational round. Our FSM-based architecture enables the central innovation of this research: the use of reusable components. Key functional units, particularly the resource-intensive XOR gate arrays required for the AddRoundKey and MixColumns transformations, were modeled as shared resources. These components, including specialized two-, three-, and four-input XOR blocks, were instantiated once and subsequently invoked programmatically by various states as needed throughout the cryptographic process. This paradigm significantly reduced the overall hardware footprint.
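The idea of shared XOR blocks can be illustrated with a small software model. The following Python sketch is not the VHDL used in this work; it only shows how one MixColumns column can be computed from a doubling operation in GF(2^8) and a single reusable n-input XOR helper. The names `xtime`, `xor_n`, and `mix_column` are hypothetical and chosen for illustration.

```python
def xtime(b: int) -> int:
    """Multiply a byte by 2 in GF(2^8), reducing by the AES polynomial 0x11B."""
    b <<= 1
    if b & 0x100:
        b ^= 0x11B
    return b & 0xFF


def xor_n(*operands: int) -> int:
    """One shared XOR helper standing in for the 2-, 3-, and 4-input XOR blocks."""
    result = 0
    for value in operands:
        result ^= value
    return result


def times3(b: int) -> int:
    """Multiply by 3 as (2*b) XOR b, using the 2-input form of the shared XOR."""
    return xor_n(xtime(b), b)


def mix_column(col):
    """Apply the AES MixColumns matrix to one 4-byte column."""
    a0, a1, a2, a3 = col
    return [
        xor_n(xtime(a0), times3(a1), a2, a3),  # 2*a0 + 3*a1 + a2 + a3
        xor_n(a0, xtime(a1), times3(a2), a3),  # a0 + 2*a1 + 3*a2 + a3
        xor_n(a0, a1, xtime(a2), times3(a3)),  # a0 + a1 + 2*a2 + 3*a3
        xor_n(times3(a0), a1, a2, xtime(a3)),  # 3*a0 + a1 + a2 + 2*a3
    ]


if __name__ == "__main__":
    # Widely used MixColumns test column: db 13 53 45 -> 8e 4d a1 bc
    print([hex(v) for v in mix_column([0xDB, 0x13, 0x53, 0x45])])
```

Every row of the matrix product goes through the same `xor_n` helper, which mirrors, in software terms, the notion of instantiating the XOR logic once and invoking it from several states.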
Concurrently, MATLAB was used on the software side. The research focused on comparing the processing speeds of MATLAB and FPGA from input to output. MATLAB and FPGA were given a similar method (AES with FSM) and similar input data, enabling analysis of the results and the speed of each platform.
The architecture comprised four primary transformations, each managed as a distinct process within the FSM: KeySchedule, AddRoundKey, SubBytes, and ShiftRows. To optimize performance further, the SubShift entity was designed to execute the SubBytes and ShiftRows transformations in parallel. This combined the nonlinear substitution and byte-wise permutation steps into a single, efficient clock cycle operation.
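Merging SubBytes and ShiftRows is possible because one is a per-byte substitution and the other a fixed byte permutation, so each output byte is a single S-box lookup on a permuted input position. The Python sketch below illustrates this under two assumptions that are not taken from the design itself: the state is stored column-major (index = row + 4 × column, as in FIPS-197), and the S-box is computed rather than read from a ROM. The names `gf_mul`, `build_sbox`, and `sub_shift` are hypothetical.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) modulo the AES polynomial 0x11B."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def build_sbox():
    """Build the AES S-box: multiplicative inverse followed by the affine map."""
    sbox = []
    for x in range(256):
        # Brute-force inverse; 0 maps to 0 by convention.
        inv = 0 if x == 0 else next(y for y in range(1, 256) if gf_mul(x, y) == 1)
        s = inv
        for shift in range(1, 5):
            s ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF  # rotate left
        sbox.append(s ^ 0x63)
    return sbox


SBOX = build_sbox()


def sub_shift(state):
    """Combined SubBytes + ShiftRows on a 16-byte, column-major state.

    Row r is rotated left by r columns, so output (r, c) reads input (r, (c + r) % 4).
    """
    return [SBOX[state[r + 4 * ((c + r) % 4)]] for c in range(4) for r in range(4)]


if __name__ == "__main__":
    print([hex(v) for v in sub_shift(list(range(16)))])
```

In hardware, the permutation costs only wiring, which is consistent with performing both steps in one clock cycle.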
2.3. MATLAB Algorithms
The Advanced Encryption Standard (AES) is one of the most widely used symmetric encryption algorithms. It is known for its security, speed, and efficiency in hardware and software implementations. A Finite State Machine (FSM) approach is often used to manage the complexity of AES operations in hardware environments, such as Field Programmable Gate Arrays (FPGAs).
This algorithm uses a finite state machine (FSM) to control the flow of operations across multiple rounds of the AES encryption process. Encryption begins with an initial key addition (AddRoundKey), followed by a series of transformation steps: SubBytes, ShiftRows, MixColumns, and AddRoundKey. These steps are repeated for ten rounds. The FSM architecture ensures structured, modular execution, making it highly suitable for hardware-based cryptographic systems that require deterministic control and optimized performance.
This implementation incorporates a key scheduling function that generates round keys dynamically. It also handles the transition logic between states, including a special case for the final round where the MixColumns step is omitted, as specified in the AES. Execution time was measured in MATLAB 2016b using the tic/toc functions. The experiments were run on a 10th-generation i7 processor with 16 GB of RAM, and MATLAB performed byte-oriented operations. The AES MATLAB algorithms used for the decryption process in this research are described in Appendix A.
AES encryption transforms plaintext into ciphertext. The decryption process restores the original message. A Finite State Machine (FSM) approach offers a structured and modular design, making it an efficient implementation of AES decryption, particularly in hardware platforms such as FPGAs.
This algorithm uses an FSM to manage the sequence of inverse operations across ten rounds to implement AES decryption. The process begins by applying the last round key and performing Inverse ShiftRows and Inverse SubBytes. Then, a loop executes AddRoundKey, Inverse MixColumns, and other inverse transformations in the reverse order of encryption. The final step of the procedure utilizes the original key for a final key addition to retrieve the plaintext.
The FSM approach organizes each decryption phase into distinct states, ensuring clarity, modularity, and synchronization. This makes the approach ideal for hardware synthesis or simulation environments, where precise control over each step is essential.
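The decryption ordering described above can be written out as a stage list. The Python sketch below is only an illustration of the sequencing, not the Appendix A code; the order of the inverse steps inside the loop follows the standard FIPS-197 inverse cipher, which is an assumption here, and `decryption_stages` is a hypothetical name.

```python
def decryption_stages(rounds: int = 10):
    """Return (stage, round_key_index) pairs in execution order.

    round_key_index is None for stages that do not consume a round key.
    Index `rounds` is the last round key; index 0 is the original cipher key.
    """
    stages = [
        ("AddRoundKey", rounds),   # apply the last round key first
        ("InvShiftRows", None),
        ("InvSubBytes", None),
    ]
    for key_index in range(rounds - 1, 0, -1):
        stages += [
            ("AddRoundKey", key_index),
            ("InvMixColumns", None),
            ("InvShiftRows", None),
            ("InvSubBytes", None),
        ]
    stages.append(("AddRoundKey", 0))  # final addition with the original key
    return stages


if __name__ == "__main__":
    for stage, key_index in decryption_stages():
        print(stage if key_index is None else f"{stage} (round key {key_index})")
```

Listing the stages this way makes it easy to check that the round keys are consumed in the reverse order of encryption and that Inverse MixColumns appears in nine of the ten rounds.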
2.4. FSM Logic, Dataflow, and Memory Structure for FPGA
Each of the encryption and decryption processes is governed by a dedicated seven-state finite state machine (FSM), which controls the data flow and state transitions, as depicted in Figure 2 and Figure 3, respectively.
For the encryption process, the FSM begins in an Idle state, which initializes the system upon receiving the 128-bit plaintext and key inputs. The Keys process state is then responsible for iteratively generating and storing all ten round keys. Subsequent states manage the execution of the Subshift, MixColumn, and AddRound processes for the nine main AES rounds. For the tenth and final round, the FSM logic alters the data flow to bypass the MixColumn state, as specified in the AES, and then transitions to the AddRound final state to produce the ciphertext.
The decryption finite state machine (FSM) employs the corresponding inverse transformations and follows a parallel but distinct logic. Data management in this architecture relies on a structured memory system comprising ten distinct memory blocks. Three of these are constant, read-only memories, which store the RCON values for the key schedule (10 cells), the S-box substitution table (256 cells), and the MIXCOLUMNS matrix constants (16 cells). The remaining seven blocks are temporary storage buffers that hold intermediate data between state transitions. These include memories dedicated to the addround results (mem4), the parallelized subbyte/shiftrows output (mem5), the mixcolumns results (mem6), and the complete 160-byte key schedule (mem2p). This organized memory structure is critical for enabling the FSM to control the complex, multi-stage data flow.
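As an illustration of what the 10-cell RCON memory holds, the round constants of AES-128 can be generated by repeated doubling in GF(2^8). The Python sketch below is a software illustration only; `rcon_table` is a hypothetical name, and in the hardware design these values are stored as constants rather than computed.

```python
def rcon_table(count: int = 10):
    """Return the first `count` AES round constants (high byte of each Rcon word)."""
    values = []
    value = 0x01
    for _ in range(count):
        values.append(value)
        value <<= 1            # multiply by x in GF(2^8)
        if value & 0x100:
            value ^= 0x11B     # reduce by the AES polynomial
    return values


if __name__ == "__main__":
    # prints ['0x1', '0x2', '0x4', '0x8', '0x10', '0x20', '0x40', '0x80', '0x1b', '0x36']
    print([hex(v) for v in rcon_table()])
```

Ten constants are sufficient because AES-128 derives ten round keys of 16 bytes each from the original key, which matches the 160-byte key schedule memory mentioned above.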
After the encryption and decryption processes were completed, the total number of clock cycles was calculated theoretically in order to determine the throughput and efficiency of the AES FSM system. On an FPGA, the total clock cycle count is the number of clock cycles required to complete one process or data unit, such as one encryption block in the AES algorithm. In an FSM-based implementation, it can be calculated from the number of states executed in each encryption round. In this design, the process starts with the initial AddRoundKey stage, followed by nine main rounds, each of which comprises three stages: SubShift (SubBytes and ShiftRows combined), MixColumns, and AddRoundKey. The tenth and final round includes only two stages because it omits MixColumns. Therefore, the total number of clock cycles required is 1 (initial round) + 9 × 3 (main rounds) + 2 (final round) = 30 clock cycles to complete one data block.
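The cycle count can be reproduced by enumerating the stages the FSM visits for one block. The Python sketch below assumes one clock cycle per stage and excludes key generation, as the calculation above does; the stage names follow the Subshift, MixColumn, and AddRound states of the encryption FSM and are used here as an illustrative assumption, and `encryption_schedule` is a hypothetical name.

```python
def encryption_schedule(rounds: int = 10):
    """List the stages executed for one 128-bit block, one clock cycle each."""
    schedule = ["AddRound"]                      # initial key addition
    for _ in range(rounds - 1):                  # nine main rounds
        schedule += ["Subshift", "MixColumn", "AddRound"]
    schedule += ["Subshift", "AddRound final"]   # final round skips MixColumn
    return schedule


if __name__ == "__main__":
    stages = encryption_schedule()
    print(len(stages), "clock cycles per block")  # 1 + 9*3 + 2 = 30
```

Changing the number of stages per round in this model shows directly how merging or splitting FSM states would affect the per-block latency.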
This calculation is essential for analyzing the performance of hardware-based cryptographic systems. It serves as a foundation for evaluating the efficiency and throughput of the design [16,17]. Subsequently, we employed well-known Equations (1)–(3) to calculate the throughput, the efficiency, and the maximum frequency, respectively [15].
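As a worked illustration of these metrics, the Python sketch below uses the definitions commonly applied to FPGA block ciphers: throughput = block size × F_max / clock cycles per block, efficiency = throughput / occupied slices, and F_max = 1 / minimum clock period. These forms are assumed here rather than copied from Equations (1)–(3), and the function names are hypothetical. The example figures (100 MHz and 30 cycles) are illustrative inputs, not synthesis results.

```python
def fmax_mhz(min_period_ns: float) -> float:
    """Maximum clock frequency in MHz from the minimum clock period in ns."""
    return 1000.0 / min_period_ns


def throughput_mbps(block_bits: int, f_mhz: float, cycles_per_block: int) -> float:
    """Throughput in Mbps: bits per block times blocks processed per microsecond."""
    return block_bits * f_mhz / cycles_per_block


def efficiency_mbps_per_slice(throughput: float, slices: int) -> float:
    """Throughput-to-area ratio in Mbps per occupied slice."""
    return throughput / slices


if __name__ == "__main__":
    f = fmax_mhz(10.0)                 # a 10 ns period gives 100 MHz
    t = throughput_mbps(128, f, 30)    # 128-bit block, 30 cycles per block
    print(f"F_max = {f:.1f} MHz, throughput = {t:.2f} Mbps")
```

With the 30-cycle latency derived above, throughput scales linearly with the achieved clock frequency, so the frequency reported by synthesis determines the final figure.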
A commonly used metric to evaluate the performance of AES implementations on FPGAs is the throughput-to-area ratio, expressed in megabits per second (Mbps) per slice. This metric balances the consideration of speed (throughput) and hardware resource consumption (logic slices). A survey of multiple academic works, including references [5,12,18], was conducted to determine the efficiency of AES FPGA designs. The results of the survey are shown in Table 2.
These thresholds were derived from typical values documented in existing literature. For example, one study reported an AES implementation with an efficiency of 13.3 Mbps/slice on a Virtex-5 device, demonstrating the high efficiency of pipelined architectures [12]. Similarly, an AES-XTS design attained 8.4 Mbps/slice, which also falls within the “highly efficient” category [18]. Conversely, many FSM-based or iterative designs operate at efficiencies between 0.5 and 1.0 Mbps/slice, which is still considered efficient, particularly for lightweight or embedded applications [5].
5. Conclusions
This study presents a detailed comparison of Advanced Encryption Standard (AES)-128 encryption and decryption implementations based on finite state machines (FSMs) on field-programmable gate array (FPGA) and MATLAB platforms, utilizing identical architectures and input parameters. Experimental results from both platforms yielded accurate, consistent cryptographic outputs, confirming functional equivalence. However, the FPGA implementation demonstrated substantially faster execution speeds, achieving microsecond-level performance, while the MATLAB implementation achieved millisecond-level performance. The FPGA implementation demonstrated high throughput, achieving 872.53 Mbps for encryption and 858.49 Mbps for decryption. It maintained efficient area usage at 0.691 and 0.601 Mbps/slice, respectively.
A comparison of the FSM FPGA design with other software-based implementations revealed several advantages. Not only did it achieve execution speeds several orders of magnitude faster, but it also introduced formal FSM modeling, a feature lacking in many existing MATLAB-only approaches. Unlike highly optimized pipelined FPGA designs, which offer higher throughput but consume more logic area, this FSM-based approach strikes a balance between speed and hardware efficiency. Its moderate area usage, combined with substantial throughput, makes it well suited to lightweight, embedded cryptography systems where resource constraints and power efficiency are critical.
In conclusion, integrating FSM control into AES-128 enhances structural clarity and logic reuse while providing real-time performance advantages. This work contributes valuable insights to the field of designing secure, efficient, and scalable encryption systems for academic and industrial embedded applications.