Abstract
One of the very important problems connected with FPGA-based design is reducing the hardware amount in implemented circuits. In this paper, we discuss the implementation of Mealy finite state machines (FSMs) by circuits consisting of look-up tables (LUT). A method is proposed to reduce the LUT count of three-block circuits of Mealy FSMs. The method is based on finding a partition of set of internal states by classes of compatible states. To reduce the LUT count, we propose a special kind of state code, named complex state codes. The complex codes include two parts. The first part includes the binary codes of a state as the element of some partition class. The second part consists of the code of corresponding partition class. Using complex state codes allows obtaining FPGA-based FSM circuits with exactly four logic blocks. If some conditions hold, then any FSM function from the first and second blocks is implemented by a single LUT. The third level is represented as a network of multiplexers. These multiplexers generate either additional variable encoding collections of outputs or input memory functions. The fourth level generates FSM outputs. An example of synthesis and experimental results is shown and discussed. The experiments prove that the proposed approach allows reducing hardware compared to such methods as auto and one-hot of Vivado, JEDI. Further, the proposed approach produces circuits with fewer LUTs than for three-level Mealy FSMs based on joint use of several methods of structural decomposition. The experiments show that our approach allows reducing the LUT counts on average from 11 to 77 percent. As the complexity of an FSM increases, the gain from the application of the proposed method grows; the same is true for both the FSM performance and power consumption.
1. Introduction
The behavior of a sequential device can be represented by the model of a Mealy finite state machine (FSM) [,]. This stimulates constant development of various methods of designing Mealy FSM logic circuits [,]. As a rule, these methods are aimed at optimizing one or more basic characteristics of resulting FSM circuits []. There are three basic characteristics, namely: (1) the chip area occupied by an FSM circuit), (2) the operating frequency, and (3) the power consumption; however, as a rule, it is impossible to optimize these three characteristics at the same time. For example, a decrease in the required internal resources (the required chip area) is often associated with a decrease in the maximum operating frequency []. As it is known, the occupied chip area significantly affects other characteristics of an FSM circuit []. At the same time, it is important that reducing the area as little as possible increases the delay time of the circuit. As it is known [], the major challenge in the LUT-based FSM design is developing a low-area circuit without the compromising an FSM performance. In this paper, we propose a method to create Mealy FSMs whose three-level circuits are implemented using internal resources of field-programmable gate arrays (FPGAs) [,]. The proposed approach belongs to methods of structural decomposition [].
Recently, more and more digital systems are implemented using FPGA chips []. The analysis of VLSI’ market shows that Xilinx [] is the largest manufacturer of FPGA chips. This fact explains the orientation of our article to FPGAs of Xilinx. We discuss a case when an FSM circuit is implemented using internal resources of FPGAs such as look-up table (LUT) elements, programmable flip-flops, inter-slice multiplexers, programmable interconnects, synchronization tree, and programmable input–outputs.
Our current article is devoted to improving the LUT count of three-block LUT-based Mealy FSM circuits obtained with the simultaneous usage of the replacement of FSM inputs and the encoding of collections of FSM outputs []. The resulting FSM circuits have three blocks of LUTs; each block has a unique system of inputs and outputs. When certain conditions are met, the circuit of some (or even all) logic block is synthesized using the methods of functional decomposition [,]. Such blocks are represented by circuits having several levels of LUTs. This leads to significant decrease in the FSM operating frequency. Moreover, the interconnection system of a multi-level block becomes dramatically more complex, which leads to a further decrease in the FSM performance. This is why it is so important to reduce the number of levels in each logic block of FSM circuits.
The main contribution of this paper is a novel design method aimed at reducing the number of LUTs and their levels in circuits of three-block LUT-based Mealy FSMs. The reduction diminishes the total number of LUTs in an FSM circuit compared to this number for equivalent FSMs based on the functional decomposition. To apply our method, it is necessary to construct classes of compatible states. This in turn leads to an increase in the number of state variables compared to their minimum number. To reduce the number of state variables, we propose a new type of state codes. We name them complex state codes (CSC). A CSC of any state includes two parts. The first part is a code of a class of compatible states including the particular state. The second part is a code of this state as an element of a particular class. Our method produces four-block FSM circuits. In the best case, each block is represented by a single-level LUT-based circuit. As experimental results show, the proposed approach also provides the performance at the level of three-block FSMs and reduces the power consumption. These phenomena are additional positive qualities of the proposed method.
The further text of the paper includes five sections. Section 2 shows the background of LUT-based Mealy FSMs. Section 3 analyses the related works. The main idea of the proposed method is shown in Section 4. An example of a CSC-based FSM synthesis is shown in Section 5. Section 6 analyses the results of experiments. The paper ends with a short conclusion.
2. Basic Information
A Mealy FSM logic circuit can be represented by two systems of Boolean functions (SBFs) []. One of these SBFs represents FSM outputs connected with operational units of a particular digital system. The second SBF represents input memory functions (IMFs). The arguments of these SBFs are external FSM inputs and internal state variables. The inputs form a set ; the IMFs create a set . An FSM circuit is represented by the following SBFs:
The state variables encode internal states from a set . To encode M states, the minimum number of state variables is determined as []
Each state is represented by a binary code having R bits. These codes are kept into the state code register (RG). In this article, we discuss a case when the RG has informational inputs of D type. This is the most common case []. The systems (1) and (2) determine so called P Mealy FSM [] shown in Figure 1.
Figure 1.
Structural diagram of P Mealy FSM.
In Figure 1, the block of IMFs is implemented using SBF (2); the block of outputs is based on SBF (1). The state register has R D flip-flops. The r-th flip-flop keeps the state variable . The pulse allows clearing the content of RG. This pulse loads a code of the initial state into RG. As a rule, the code consists of zeros. The pulse shows an instant when the RG content can be changed by current IMFs.
As a rule, an FSM is represented by either a state transition table (STT) [] or a state transition graph (STG) []. To obtain the systems (1) and (2), it is necessary to form an FSM direct structure table (DST) []. In this article, we start from the STG. Next, this graph is transformed into the equivalent STT. Using the STT, we construct the DST.
An STG is a directed graph whose nodes correspond to FSM states. Interstate transitions are represented by edges of STG. Each edge is marked by a combination of inputs causing a particular transition and collection of outputs (COs) generated during this transition. An STT is a representation of STG as a list of interstate transitions. An STT includes five columns with: a current state ; a state of transition ; an input signal which is a conjunction of some inputs (or their complements) determining this particular transition; CO generated during this transition; h is a number of transition () [].
A DST includes the columns with state codes and IMFs []. These columns are: the code of the current state , the code of the next state , and a collection of IMFs equal to 1 to load the code of the next state into the state register RG.
In this paper, we consider a case when internal resources of FPGA chips are used for implementing SBFs (1) and (2). An FSM circuit is implemented using configurable logic blocks (CLB) of FPGAs produced by Xilinx []. A circuit is represented as a network of CLBs connected with help of a programmable routing matrix []. In this paper, we discuss a case when CLBs include LUTs, multiplexers, and programmable flip-flops. Using the notation [], we denote as -LUT a single-output LUT with inputs. If a Boolean function depends on up to arguments, then it is represented by a single-LUT logic circuit. If the number of LUT inputs is less than the number of arguments, then a circuit has more than a single level of LUTs. To implement multi-level circuits, the methods of functional decomposition (FD) are used [,]. As a rule, the FD-based circuits have the complicated systems of “spaghetti-type” interconnections [].
We discuss a case when each CLB is a part of slice []. The slice includes internal multiplexers. They can be used for changing the number of LUT inputs within one slice. The internal multiplexers are connected with LUTs by a system of fast inter-slice interconnections. Due to this, the delay time for 6-, 7-, and 8-input LUTs is practically the same for SLICEL of Virtex-7 [,]. This approach makes it possible to flexibly adapt the LUT parameters to the characteristics of the function being implemented. For example, the SLICEL of Virtex-7 includes four 6-LUTs, 8 flip-flops, and 27 multiplexers []. Each 6-LUT can be used as two 5-LUTs with shared inputs. This explains the presence of eight flip-flops in each SLICEL. Using internal multiplexers allows combining two 6-LUTs into a single 7-LUT. Next, four 6-LUTs can be combined into a single 8-LUT. The control inputs of multiplexers can be used as inputs of 7- and 8-LUTs. Each SLICEL possesses special carry chains used for organization of fast multi-bit adders. It is worth noting that these circuits can be used to implement arbitrary logic circuits [,].
In this paper, we use multiplexers to generate functions (1) and (2). We denote a multiplexer having K data inputs as . Using a single 6-LUT, we can implement a circuit of . It has two control inputs and four data inputs. Further, we can organize an with the help of two 6-LUTs. Its circuit has only slightly bigger delay than a circuit of a []. It is possible due to using fast interconnections inside a slice. If a has the control inputs , then its circuit includes four 6-LUTs controlled by . To implement a , two slices and inter-slice interconnections are used. As a result, a is much slower than a .
In LUT-based FSMs, the flip-flops of RG are distributed among LUTs generating functions (1). Due to this, the RG is “hidden” inside the slices where the IMFs are generated. There are two blocks in an LUT-based P Mealy FSM (Figure 2).
Figure 2.
Structural diagram of LUT-based P Mealy FSM.
In this paper, we denote as LUTer a logic block consisting of LUT-based CLBs. In the P Mealy FSM (Figure 2), a implements SBF (1) and implements SBF (2). To control the RG, the pulses and are used.
If each function from systems (1) and (2) depends on not more than arguments, then both blocks of LUT-based P Mealy FSM (Figure 2) are represented by single-level circuits. For Xilinx-based solutions, an LUT has 6 inputs []. There is no point in increasing this value, because provides the best balance for such LUT characteristics as the occupied chip area, performance and consumed power []; however, even for FSMs with average complexity [], it could be up to 40 arguments in functions (1) and (2). Obviously, there is a distinct imbalance between such big number of arguments in SBFs representing FSM circuits and a fairly small value of LUT inputs. This imbalance requires improving synthesis methods of LUT-based FSMs.
Denote as the number of literals in a sum-of-product of function . If the condition
holds, then it is impossible to represent the function by a single-level circuit. In this case, it is very important to optimize the system of connections between different slices of an FSM circuit. This follows from the fact that more than 70% of the power consumption is due to the interconnections []. Moreover, time delays of the interconnection system are starting to play a major role in comparison with CLB delays []. The results of research [] show that the optimization of interconnections leads to increasing the maximum operating frequency and reducing the power consumption of LUT-based FSM circuits. This can be performed, for example, using various methods of structural decomposition [].
3. Related Work
There are a huge number of methods for improving characteristics of circuits targeting LUT-based FSMs. The survey of these methods can be found, for example, in [,]. These methods should be applied if the condition (4) holds []. These methods can improve either the LUT count or the maximum operating frequency or the power consumption []. Sometimes, these methods are looking for a solution that allows joint improvement of more than only one FSM circuit’s characteristic. In this paper, we propose a method for decreasing the number of LUTs (this is an LUT count) of FPGA-based Mealy FSMs.
This task can be solved using various methods of state assignment [,,]. In these methods, the number of bits in the state codes ranges from the minimum determined by formula (3) to the maximum determined by the total number of states, M. If , then this is a one-hot state assignment. These approaches are used in both academic and industrial CAD tools. The examples of academic systems are SIS [] and ABC by Berkeley [,]. The examples of industrial systems are Vivado [] of Xilinx and Quartus of Intel (Altera) [].
Now, there is no single universal method of state encoding that provides the best possible characteristics of FSM circuits. The applicability of a particular method can be judged both by the required number of state variables (R) and by the number of FSM inputs (L). As follows from [], the one-hot codes improve FSM characteristics if . The rather small value of increases the influence of the value of L on the characteristics of LUT-based FSM circuits []. It is shown in [] that it is better to use the codes with the minimum number of bits (), if .
So, in one case, it is better to use the one-hot state codes, and, in the other case, it is better to use the maximum binary state assignment with ; therefore, it makes sense to compare several state assignment methods for the same FSM and find a method with the best characteristics. Due to this, we have compared the FSMs circuits produced by our proposed approach with FSM circuits produced by four other methods of state assignment. As a base for comparison, we use: the method of maximum state assignment Auto used in the CAD tool Vivado [] by Xilinx; one-hot state assignment used in Vivado; the algorithm JEDI [] which is one of the best methods of binary state assignment []. Our choice of Vivado is dictated by the fact that it operates with FPGAs of Xilinx. Further, we compared FSM circuits produced by our approach with three-block FSM circuits [].
In this paper, we propose a method leading to four-block FSM circuits. It belongs to the methods of structural decomposition (SD) []. The main idea of these methods is the elimination of the direct connection between FSM outputs and IMFs , on the one hand, and FSM inputs and state variables , on the other hand. The SD leads to an increase in the total number of implemented functions having significantly fewer arguments than functions (1) and (2). These methods are analyzed, for example, in [].
In [], we have proposed an optimization method based on the combined use of two structural decomposition methods. These methods are the replacement of FSM inputs and encoding of collections of outputs. This approach leads to so called Mealy FSMs. Let us discus these two methods.
The first method is based on the replacement of inputs by additional variables , where . The method uses the fact that transitions from any state depend on inputs, where []. In the case of LUT-based FSMs, the variables are generated by an additional block . This block implements the system
The second method is based on the fact that only a limited set of outputs is formed during transitions from any FSM state. Each transition is accompanied by generating some CO , where . Each CO is encoded by a binary code having bits, where
To encode these COs, additional variables are used.
In FSMs, a special block generate FSM outputs as functions
The variables and IMFs are generated by a block :
So, the structural diagram of an LUT-based Mealy FSM includes three blocks connected in series (Figure 3).
Figure 3.
Structural diagram of LUT-based Mealy FSM.
There is a hidden register RG inside the block . This explains why pulses and enter the block . Obviously, the informational inputs of D flip-flops are connected with IMFs .
As shown in [], such joint usage of two methods of SD leads to a significant decrease in the LUT count compared with other investigated methods; however, the gain in LUTs is significantly reduced if the condition (4) is met for the functions , where these systems are represented as (8) and (9). In this case, the circuit of is designed using the methods of functional decomposition []. As a result, there are several levels of LUTs in the circuit of with all the negative consequences.
The proposed method is an evolution of methods of twofold state assignment []. These methods are based on construction a partition of the set of states by the classes of compatible states: . Each state determines a set including FSM inputs causing transitions from this state. Inside each class, the states are encoded using maximum binary codes. If a set includes elements, then it is enough
variables to encode these states by maximum binary codes. One additional code corresponds to the relation .
We use the same definition of compatible states as the one propose in the paper []. The states are compatible if
In (11), the symbol stands for the number of inputs determining transitions from states . These inputs form a set of inputs .
To create the partition with minimum number of classes, J, we use the method from []. Each class consists of compatible states. Each class determines local sets of inputs and outputs . If outputs are generated during transitions from states , then they are included into a set .
In the case of twofold state assignment, each state has two codes []. The code determines the state as an element of the set A. The extended state code (ESC) [] determines this very state an element of a compatibility class . Each class determines a collection of partial functions generating by a block . These partial functions are partial outputs and partial IMFs . The set includes IMFs generating during the transitions from the states . These partial functions are denoted as and . Because of (11), each partial function is represented by a single LUT.
An ESC consists of J fields. The j-th field has bits and corresponds to the class . So, there are bits in ESCs. If the relation holds, then only bits from the j-th field differ from zeros.
In [], it is propose to produce ESCs by the transforming state codes kept into RG. Unfortunately, the transformation requires an additional block of CLBs which consumes some internal resources of a chip and decreases the performance of a resulting FSM circuit. The improvement of this approach is proposed in []. In this case, the codes are generated in parallel with FSM outputs. This approach allows eliminating the block of code transformation; however, this approach also has some drawbacks. Firstly, there are flip-flops in the RG. Secondly, the total number of LUTs generating IMFs is increased by compared to the previous approach.
The experiments [] show that using only ESCs allows increasing performance up to 15.9% compared with equivalent FSMs based on the twofold state assignment. The growth of operating frequency is accompanied by a slight growth in the LUT count (up to 7.7%). In this paper, we propose a method of reducing the LUT count in Mealy FSM. The method is based on the replacing ESCs by complex state codes proposed in this paper.
4. Main Idea of the Proposed Method
The proposed method is based on finding a partition of the set A by classes of compatible states. The same state variables are used for encoding states from different compatibility classes. The states are encoded by codes using state variables:
A code determines the state as the element of a particular class of . The classes are encoded by class codes . These codes include bits:
We propose to represent FSM states by the complex state codes denoted as . For any state , a CSC is a concatenation of the class code and a state code :
In (14), the sign “*” denotes the concatenation of the codes. There are state variables in CSCs. The value of is determined as
To encode the classes, we use the variables from the set . To encode states as elements of classes , we use variables from the set . Together, these sets form a set having elements.
The proposed method of state assignment is aimed at the reducing LUT count for LUT-based circuits of FSMs. The method is based on the joint application of: (1) the replacement of FSM inputs; (2) the encoding of collections of outputs; (3) the encoding of states by complex state codes. As a result, we propose to replace FSMs by FSMs. The subscript “C” means that the complex state codes are used in FSM. There is the structural diagram of FSM shown in Figure 4.
Figure 4.
Structural diagram of LUT-based Mealy FSM.
There are four levels of logic blocks in FSMs. The first level is represented by LUTerP. This block implements the SBF (5).
The second level includes blocks , where . A class determines three sets of variables. The set includes additional variables determining transitions from the states . The set contains IMFs generated during the transitions from the states . The set consists of the variables equal to 1 in codes of COs produced during the transitions from the states determined by each class . Each block produces the following partial functions:
The block represents the third logic level. It consists of multiplexers generating IMFs and additional variables . The data inputs of these multiplexers are the partial functions (16) and (17). To select a particular partial function, we use the class variables . So, the multiplexers generate the following SBFs:
The functions (18) enter the inputs of the flip-flops that make up the hidden register RG. Due to this, the control signals and enter this block.
The fourth logic level is represented by the block . It implements the SBF (7).
So, there are four levels of logic blocks in the circuits of Mealy FSMs. In the best case, each block is represented by a single-level LUT-based circuit.
In this paper, we propose a synthesis method for LUT-based Mealy FSMs. We start the synthesis process from an FSM state transition graph. The proposed method includes the following steps:
- (1)
- Creating the state transition table of Mealy FSM.
- (2)
- Constructing the partition of the set of states by classes of compatible states.
- (3)
- Encoding of FSM states by complex state codes .
- (4)
- Executing the replacement of FSM inputs by additional variables .
- (5)
- Creating SBF (5) representing .
- (6)
- Encoding of collections of outputs by codes .
- (7)
- Creating SBF (7) representing .
- (8)
- Creating direct structure table of Mealy FSM.
- (9)
- Creating tables of blocks of partial functions –.
- (10)
- (11)
- Creating table of .
- (12)
- (13)
- Implementing the LUT-based circuit of Mealy FSM using internal resources of a particular FPGA chip.
The partition is created using the method []. This approach allows minimizing LUT counts in the resulting Mealy FSM circuits. If it is possible, each class of compatible states should include the maximum possible number of states. This helps minimizing the number of classes (and the blocks of the second level of logic). In turn, this optimizes the number of LUTs in the circuit of . Any multiplexer from this block is implemented as a single LUT if the following condition takes place:
Even if condition (20) is violated, then the multiplexers could be implemented as single-level circuits. This is possible, if the number of partial functions for a given function does not exceed the value . Otherwise, the internal multiplexers of CLBs are used for generating functions (18) and (19).
5. Example of Synthesis
We use the symbol to show that the model of Mealy FSM (Figure 4) is used to implement the circuit of an FSM . Consider an FSM represented by its STG (Figure 5). Let us synthesize the circuit of Mealy FSM using 5-LUTs.
Figure 5.
The state transition graph of Mealy FSM .
Step 1. The h-th edge of an STG is transformed into a row of an STT []. There are 19 edges in the STG (Figure 5). So, it should be rows in the corresponding STT. The transformation is executed in a trivial way []. Table 1 is a resulting STT of FSM . The following sets can be derived from Table 1: the set of states , the set of inputs , and the set of outputs . This gives the following parameters: , , and .
Table 1.
State transition table of Mealy FSM .
Step 2. Using the methods [], we can obtain the partition with . There are the following classes of this partition: and . So, there is . Using (12), we can obtain the value . Using (13), we can obtain the value . Now, we have the sets , , and .
Step 3. As known [], the state codes do not affect the number of LUTs in circuits of FSMs based on twofold or extended state codes []. So, the states can be encoded in the arbitrary way. For our example, one of the possible outcomes of the state assignment is shown in Figure 6.
Figure 6.
Complex state codes of Mealy FSM .
The following class and state codes can be found from Figure 6: and , . Using the codes of classes of compatible states and state codes gives the following complex state codes: and .
Step 4. To execute the replacement, we should find the minimum value of additional variables, G. To do it, we use the methods from []. It is necessary to analyze sets including FSM inputs which determine the transitions from states []. These sets can be found using either the STG (Figure 5) or STT (Table 1). In the discussed case, there are the following sets: and . If is a number of elements in the set , then .
The value of G is equal to the maximum value of . Obviously, there is . So, it is enough additional variables to replace inputs: .
The columns of table of inputs’ replacement are marked by FSM states , the rows are marked by additional variables . If an input is replaced by a variable in a state , then this input is written at the intersection of the corresponding column and row. Using methods from [] gives the table of replacement (Table 2).
Table 2.
Replacement of inputs for Mealy FSM .
Step 5.Table 2 is a base for finding SBF (5). The following SBF can be derived from Table 2:
In (21), the symbol stands for a conjunction of state variables corresponding to the state . Obviously, each of Equation (21) can be implemented as .
Step 6. There are different collections of outputs in STT (Table 1). They are the following: .
To optimize the circuit of , it is necessary to encode COs in a way minimizing the total number of literals in SBF (7) []. Each literal determines an interconnection between and . Using the approach from [], we can encode the COs as it is shown in Figure 7.
Figure 7.
Outcome of encoding of COs for Mealy FSM .
There are 16 literals in (22). The maximum number of literals is equal to . So, due to encoding shown in Figure 7, the number of literals (and interconnections) has almost halved.
Step 8. The DST of Mealy FSM is constructed using the initial STT, codes of states and COs, and a table of replacement of inputs. A DST includes the following columns: , , , , , , , h. The columns of state codes include codes from Figure 6. The column is constructed using the initial STT and table of replacement of inputs (Table 2). The column includes IMFs equal to 1 for loading the code into state register. The column includes variables equal to 1 in the code of CO written in the h-th row of STT. This column is constructed using the initial STT and codes of COs (Figure 7).
In the discussed case, the DST is represented by Table 3. Let us analyze the first row of Table 3. There is the input in this row of Table 1. As follows from Table 2, the input is replaced by the additional variable in the state . For this row, there is the following relation: . As follows from Figure 6, there is . Due to this, column of Table 3 contains in row . In row 1 of Table 1, there is the CO in column . As follows from Figure 7, there is . Due to this, column of Table 3 contains in row . All other rows of Table 3 are constructed in the same way.
Table 3.
Direct structure table of Mealy FSM .
Step 9. These tables are constructed using the classes , DST of Mealy FSM, codes and . For the discussed example, there is . So, there are two blocks ( and ) generating the partial functions (16) and (17). The transitions from the states from the class are represented by Table 4, for the class by Table 5.
Table 4.
Table of of Mealy FSM .
Table 5.
Table of of Mealy FSM .
There is a transparent correspondence between Table 3, on the one hand, and tables of (Table 4) and (Table 5), on the other hand. There are rows in Table 4 and rows in Table 5. Obviously, the following equality takes place: .
Step 10. The following sets can be found from Table 4 and Table 5: , , and . It means that each contains 5-LUTs. Together, this gives 14 5-LUTs in the mutual circuit of and .
The functions (16) and (17) are constructed in the trivial way. For example, the following SBF of partial functions , and can be derived from Table 4 and Table 5:
All other partial functions are created in the same manner.
Step 11. The table of is constructed using sets and where . The table contains the columns "Function" and "j". For our example, this block is represented by Table 6.
Table 6.
Table of of Mealy .
For example, the IMF appears in both tables. Due to this, there are ones in columns with and . All other rows are filled used the similar analysis.
Step 12. The SBFs (18) and (19) representing the third level of the logic circuit are constructed in the trivial way. They include two components: (1) conjunctions of variables corresponding to class codes and (2) corresponding partial functions. For example, functions and are represented by the following SBF:
All other functions (18) and (19) are constructed in the same manner.
Step 13. To implementing the LUT-based circuit of Mealy FSM , it is necessary to use some CAD tools. In the case of FPGAs from Virtex-7, the system Vivado [] should be used; for our simple example we can design this circuit manually.
As follows from (21), there are nine literals in the sum-of-products of and eight literals in the sum-of-products of . The circuit should be implemented using LUTs with inputs. So, the condition (4) holds. To implement the circuit of , it is necessary to apply the methods of FD [,]. As a result, we obtain a two-level circuit of including six LUTs.
In the discussed case, each function is represented by partial functions. Further, the condition (11) holds for the blocks of . Due to this, there are enough 5-LUTs for implementing the circuits of -. Since , there are seven LUTs in the circuit of . As follows from (22), there are seven LUTs in the circuit of .
So, there are 34 5-LUTs in the circuit of Mealy FSM . This circuit has five levels of LUTs (Figure 8).
Figure 8.
Logic circuit of Mealy FSM .
In this circuit, is represented by LUT1–LUT6. This circuit has two levels of LUTs shown in Figure 8. The delivers the inputs and state variables for generating the additional variables . These variables enter to be transformed into the partial functions (16) and (17). The transformation is executed by and . These blocks include elements LUT7–LUT20. The fourth level of the FSM circuit is represented by LUT21–LUT27. The IMFs are generated by LUT21–LUT23. The outputs of these LUTs are connected with flip-flops implementing the register RG. The flip-flops are controlled by the pulses and . The variables are generated by LUT24–LUT27. The outputs of form the . At last, level five consists of seven LUTs (LUT28–LUT34) creating the circuit of .
We compared the characteristics of the 5-LUT-based circuits of and FSMs. In both cases, there is the same number of flip-flops in the state register (). In the case of , there are six LUTs in the and seven LUTs in . There are two levels of LUTs in the circuit of . So, these subcircuits are the same for and . There are two levels of LUTs in the circuit of of . This block’s circuit includes 24 LUTs. So, there are LUTs in the circuit of .
Thus, for the FSM , the transition from model to model allows you to reducing the LUT count by 1.088 times. Note that both circuits have the same number of logical levels; therefore, the model proposed in this article allows reducing the number of LUTs without reducing the operating frequency compared to the circuit of equivalent FSM. In the next Section, we compare some FSM models with the one proposed in this article.
6. Experimental Results
In this Section, we show the results of experiments which have been conducted to compare characteristics of Mealy FSMs with characteristics of FSM circuits based on some other models. To conduct the experiments, we use: (1) the internal resources of Virtex-7; (2) the benchmark FSMs from the library []; (3) the industrial package Vivado []. The library [] includes 48 benchmarks represented in the format KISS2. The benchmarks have a wide range of basic characteristics (numbers of states, inputs, and outputs). They are used very often by different researchers to compare area and time characteristics of FSMs obtained using various synthesis methods. The characteristics of benchmarks are shown Table 7.
Table 7.
Characteristics of benchmark Mealy FSMs [].
We executed the experiments using a personal computer with the following characteristics: CPU: Intel Core i7 6700 K 4.2@4.4 GHz, Memory: 32 GB RAM 2400 MHz CL15. Further, we used the Virtex-7 VC709 Evaluation Platform (xc7vx690tffg1761-2) [] and CAD tool Vivado v2019.1 (64-bit) []. There is for FPGAs of Virtex-7. To obtain the results of experiments, the reports produced by Vivado are used. To enter Vivado, we use thed CAD tool K2F [].
We compared three basic characteristics of resulting FSM circuits. These parameters are: (1) the LUT count; (2) the time of cycle; (3) the power consumption. In addition, two integral characteristics were investigated, namely: (1) the area-time products and (2) the area-time-power products. To conduct the experiments, five FSM models were used. They are: (1) Auto of Vivado (it uses binary state codes); (2) one-hot of Vivado; (3) JEDI; (4) -based FSMs; (5) -based FSMs proposed in this article. Obviously, the first three methods are based on the model of P FSM shown in Figure 2.
Based on the methodology [], we divide the benchmark FSMs [] by five categories. To divide the benchmarks, we use the relation between the values of and . There is for LUTs of Virtex-7. We use this value to divide the benchmarks by the categories.
The benchmarks belong to category of trivial FSMs (category 0), if the following condition holds: . This category includes the following 11 benchmarks: and . The benchmarks belong to category of simple FSMs (category 1), if there is . The category 1 consists of the benchmarks bbara, bbsse, beecount, cse, dk14, dk15, dk16, donfile, ex2, ex4, ex6, ex7, keyb, mark1, opus, s27, s386, s840, and . The benchmarks belong to category of average FSMs (category 2), if . The category 2 contains the benchmarks ex1, kirkman, planet, planet1, pma, s1, s1488, s1494, s1a, s208, styr, and . The benchmarks belong to category of big FSMs (category 3), if the following condition takes place: . The category three includes only the benchmark . The category of very big FSMs (category 4) includes benchmarks satisfying relation . The benchmarks s420, s510, s820, and s832 belong to this category.
The results of experiments are shown in Table 8, Table 9, Table 10, Table 11 and Table 12. There is the same organization of these tables. The investigated methods are listed in the table columns. The table rows contain the names of benchmarks. Inside each table, the benchmarks are listed in alphabetical order, and sorted by ascending category number. The rows “Total” contain results of summation of values for each column. The row “Percentage” includes the percentage of summarized characteristics of FSM circuits produced by other methods respectively to -based FSMs. We use the model of P Mealy FSM as a starting point for methods Auto, one-hot, and JEDI. The basic data (the LUT count, time, and power consumption) are taken from reports of Vivado. Next these data were used to obtain the integral characteristics.
Table 8.
Experimental results (LUT counts).
Table 9.
Experimental results (the latency time for all benchmarks, nsec).
Table 10.
Experimental results (Total On-Chip Power, Watts).
Table 11.
Experimental results (area-time products).
Table 12.
Experimental results (area-time-power products).
Let us analyze the experimental data taken from reports of Vivado. These tables contain the following data: (1) the LUT counts (Table 8); (2) the minimum time of cycle (Table 9); (3) the total On-Chip Power (Table 10); (4) the area-time products (Table 11); (5) the area-time-power products (Table 12). In addition, we compared each of the characteristics for each category; however, in order to avoid a significant increase in the size of the article, we did not show the corresponding tables. We just showed the results of these comparisons.
As follows from Table 8, the –based FSMs require fewer LUTs than it is for other investigated counterparts. Using the proposed approach, we can obtain circuits having 52.19% less 6-LUTs than it is for equivalent Auto–based FSMs; 77.1% less 6-LUTs than for equivalent one-hot–based FSMs; 25.34% less 6-LUTs than for equivalent JEDI–based FSMs. Our approach produces circuits having on average 11.36% less 6-LUTs than the circuits of -FSMs.
Using Table 8, we can compare LUT counts for different categories of benchmark FSMs. Comparing the results for category 0 shows that both multi-level approaches ( and ) lose out to the other methods. This loss is 30.4% compared to auto-based FSMs, 3.4% compared to one-hot-based FSMs, and 31.5% compared to JEDI-based FSMs. We explain this by the fact that condition (4) is not satisfied for benchmark FSMs of the category 0. This means that only a single LUT is needed to implement any function for systems (1) and (2). Obviously, for category 0, the replacement of inputs should not be performed for both and FSMs; however, the encoding of output collections is always performed for these multi-level FSMs. Due to this, for the category 0, the multi-level FSMs have higher LUT counts than they are for other investigated design methods. Let us point out, that equivalent - and -FSMs have the same LUT counts for this category.
Starting from category 1, the condition (4) is met. At the same time, it makes sense to use structural reduction methods instead of methods of functional decomposition. For this category, using the complex state codes in FSMs allows obtaining FSM circuits with fewer LUTs than it is for other methods used in our experiments. This gain is 40.0% compared to auto-based FSMs, 81.1% compared to one-hot-based FSMs, 16.2% compared to JEDI-based FSMs, and 11.0% compared to FSMs.
As follows from this part of research, the winnings increase with the increase in the category number. The gain in LUTs increases up to 65.64% (for categories 2–4) compared to auto-based FSMs. The gain increases up to 65.64% (for categories 2–4) compared to one-hot-based FSMs. Comparison with JEDI-based FSMs shows that the gain increases up to 34.86% (for categories 2–4). At last, compared to -based FSMs, the gain increases from 8.44% (for categories 0–1) to 12.73% (for categories 2–4).
As follows from Table 9, the -based FSMs are faster than their investigated counterparts. They require a cycle time 9.39% less than the equivalent auto-based FSMs, 10.24% less than one-hot-based FSMs, and 1.08% less than the equivalent JEDI-based FSMs. win 18.73%. They also marginally benefit (0.31%) in relation to FSMs. It follows from this that our approach allows reducing the number of LUTs without losing performance. As we have already noted, this is the greatest challenge associated with the optimization of chip area occupied by an FSM circuit. So, our approach allows overcoming this obstacle.
Using Table 9, we have compared time characteristics for different categories of benchmark FSMs. Comparing the results for category 0 shows that -based FSMs lose out to the other methods. This loss is 3.23% compared to auto-based FSMs, 0.2% compared to one-hot-based FSMs, 3.68% compared to JEDI-based FSMs, and 1.13% compared to -based FSMs. As it is for LUT counts, we explain this by the fact that condition (4) is not satisfied for benchmarks of this category. Starting from category 1, the condition (4) is met. This allows obtaining some gain compared to FSMs based on both Auto (3.48%) and one-hot (3.37%); however, other models provide better performance than our approach (3.84% compared to JEDI and 1.93% compared to ).
Starting from the category 2, our approach gives better results compared to all other investigated methods. This gain for the category 2 is the following: (1) 24.17% compared to Auto; (2) 25.96% compared to one-hot; (3) 8.88% compared to JEDI; (4) 3.78% compared to FSMs. For the category 3, the gain increases. It is the following: (1) 31.49% compared to Auto; (2) 31.49% compared to one-hot; (3) 20.24% compared to JEDI; (4) 4.76% compared to FSMs. Further, there is a gain for category 4; however, the gain is less than for category 3. It is the following: (1) 18.73% compared to Auto; (2) 16.48% compared to one-hot; (3) 7.96% compared to JEDI; (4) 3.07% compared to FSMs. We explain this decrease in winnings by an increase in the number of levels in the circuits of -based FSMs compared to their number for category 3; however, the following conclusion can be drawn: the proposed approach allows obtaining faster LUT-based FSM circuits starting from category 2.
The Vivado provides us by information about the total on-chip power. We combine these reports in Table 10. As follows from Table 10, the -based FSMs consume less energy than their investigated counterparts. On average, they provide the following gain in power consumption: (1) 47.02% compared to auto-based FSMs; (2) 59.17% compared to one-hot-based FSMs; (3) 23.96% compared to JEDI-based FSMs; (4) 5.44% compared to Mealy FSMs.
Using Table 10, we have compared total on-chip power for each category. Comparing the results for category 0 shows that -based FSMs lose out to the other methods. This loss is 19.37% compared to auto-based FSMs, 17.6% compared to one-hot-based FSMs and 21.08% compared to JEDI-based FSMs. The same data are correct for FSMs; however, starting from the category 1, our approach allows designing circuits consuming less power. The winnings grow as the category number grows. With respect to auto-based FSMs, our method provides the following gain: (1) 33.95% for the category 1; (2) 85.68% for the category 2; (3) 106.28% for the category 3; (4) 124.46% for the category 4. With respect to one-hot-based FSMs, our method provides the following gain: (1) 47.22% for the category 1; (2) 98.26% for the category 2; (3) 106.28% for the category 3; (4) 163.44% for the category 4. With respect to JEDI-based FSMs, the proposed method provides the following gain: (1) 19.69% for the category 1; (2) 43.98% for the category 2; (3) 77.38% for the category 3; (4) 80.97% for the category 4. Further, there is the following gain compared to -based FSMs: (1) 5.20% for the category 1; (2) 7.58% for the category 2; (3) 10.77% for the category 3; (4) 12.36% for the category 4. So, the proposed organization of the FSM circuit allows reducing the power consumption, starting with simple FSMs (category 1).
Using data from Table 8, Table 9 and Table 10, we can calculate the values for two integral characteristics. One of them is an area-time product [,], the second is an area-time-power product. The smaller the values of these products, the better the quality of the resulting FSM circuit []. As it is the case in many articles [,], we estimate the area of an FSM circuit by its LUT count.
As follows from Table 11, the -based FSMs have better area-time characteristics than their investigated counterparts. On average, they provide the following gain: (1) 84.13% compared to auto-based FSMs; (2) 113.34% compared to one-hot-based FSMs; (3) 33.41% compared to JEDI-based FSMs; (4) 13.53% compared to Mealy FSMs. Using Table 11, we have compared area-time characteristics for each category of benchmark FSMs. As in the previous cases, for category 0 our approach gives the worst results; however, starting from category 1, the benefits of our approach are steadily increasing.
Comparing the results for category 0 shows that -based FSMs lose out to the other methods. This loss is 31.8% compared to auto-based FSMs, 2.88% compared to one-hot-based FSMs, 33.33% compared to JEDI-based FSMs, and 1.1% compared to FSMs; however, starting from category 1, our approach allows designing circuits having smaller values of area-time products than they are for all other approaches. With respect to auto-based FSMs, our method provides the following gain: (1) 46.49% for the category 1; (2) 107.13% for the category 2; (3) 90.73% for the category 3; (4) 141.9% for the category 4. With respect to one-hot-based FSMs, our method provides the following gain: (1) 88.39% for the category 1; (2) 139.36% for the category 2; (3) 90.73% for the category 3; (4) 148.74% for the category 4. With respect to JEDI-based FSMs, the proposed method provides the following gain: (1) 11.42% for the category 1; (2) 45.42% for the category 2; (3) 50.63% for the category 3; (4) 61.03% for the category 4. Further, there is the following gain compared to -based FSMs: (1) 8.74% for the category 1; (2) 16.92% for the category 2; (3) 13.88% for the category 3; (4) 18.75% for the category 4. So, the proposed organization of the FSM circuit allows reducing the values of area-time products, starting with simple FSMs (category 1).
As follows from Table 12, the -based FSMs have much smaller values of area-time-power products than they are for their investigated counterparts. On average, they provide the following gain: (1) 254.69% compared to auto-based FSMs; (2) 325.06% compared to one-hot-based FSMs; (3) 96.36% compared to JEDI-based FSMs; (4) 22.75% compared to Mealy FSMs. Using Table 12, we have compared area-time-power products for each category of benchmark FSMs. As in the previous cases, for category 0 our approach gives the worst results; however, starting from category 1, the benefits of our approach are steadily increasing.
Comparing the results for category 0 shows that -based FSMs lose out to the other methods. This loss is 45.51% compared to auto-based FSMs, 10.12% compared to one-hot-based FSMs, 48.69% compared to JEDI-based FSMs, and 1.13% compared to MPY FSMs; however, starting from category 1, our approach allows designing circuits having smaller values of area-time-power products than they are for all other approaches. With respect to auto-based FSMs, our method provides the following gain: (1) 104.39% for the category 1; (2) 301.21% for the category 2; (3) 293.45% for the category 3; (4) 502.86% for the category 4. With respect to one-hot-based FSMs, our method provides the following gain: (1) 194.34% for the category 1; (2) 376.56% for the category 2; (3) 293.45% for the category 3; (4) 528.13% for the category 4. With respect to JEDI-based FSMs, the proposed method provides the following gain: (1) 36.51% for the category 1; (2) 112.16% for the category 2; (3) 167.19% for the category 3; (4) 213.43% for the category 4. Further, there is the following gain compared to -based FSMs: (1) 15.65% for the category 1; (2) 25.71% for the category 2; (3) 26.14% for the category 3; (4) 33.91% for the category 4. So, the proposed organization of the FSM circuit allows reducing the values of area-time-power products, starting with simple FSMs (category 1).
The main goal of the proposed approach is the reducing LUT counts in FPGA-based circuits of Mealy FSMs. The results of experiments (Table 8) show that this goal has been achieved. Obviously, this gain is achieved by using complex state codes in FSMs. Using these codes leads to introducing an additional level of LUTs forming the partial functions. It was natural to expect that the introduction of this additional level would lead to a decrease in performance; however, as follows from Table 9, our approach leads to slower FSM circuits only for FSMs from categories 0–1. As the complexity of FSMs increases, our approach begins to give a win in terms of minimum cycle time. Moreover, the proposed approach allows reducing the power consumption of resulting FSM circuits (starting from the category 1). The same is true for the integral characteristics of FSM circuits (the area-time and area-time-power products). These phenomena are positive side effects associated with our approach.
So, the results of our experiments show that the proposed approach can be used instead of other models starting from the simple FSMs (category 1). Our approach allows improving LUT counts starting from the simple FSMs. The same is true for the power consumption. Further, starting from the category 2, the proposed method allows improving the minimum cycle time compared with other investigated methods. In our research, we use the chip xc7vx690tffg1761-2 by Virtex-7 (Xilinx); however, this chip has no unique architecture of CLBs. This very architecture of CLBs is used in all chips of the 7th generation of Xilinx chips. Due to this, the results of our experiments show that the proposed approach can be used for improving LUT counts for designs based on any FPGA chip of the 7th generation. Moreover, all Xilinx FPGA families have one fundamental property in common: an extremely limited number of LUT inputs. This leads to the need to develop FSM synthesis methods aimed at reducing the influence of this factor on the characteristics of the LUT-based FSM circuits. The results of our research show that the proposed method allows solving this problem better than some well-known methods combining various approaches of state assignment (auto, one-hot, and JEDI) and functional decomposition, as well as our previous method based on structural decomposition ( FSMs).
7. Conclusions
Modern FPGA chips include more than 7.5 billion transistors []. They have proved to be a very effective means of implementing a variety of digital systems. There is a serious drawback inherent in FPGAs, namely, a rather small number of LUT inputs. This leads to the need of using various methods of functional decomposition under the design of LUT-based FSM circuits. As a result, the LUT-based circuits of rather complex FSMs are presented in the form of multi-level networks with a complex system of spaghetti-type interconnections []. This disadvantage can be overcome due to applying methods of structural decompositions [,]. This leads to FSM circuits with predicted number of levels and regular systems of interconnections [].
Our research [,] shows that SD-based Mealy FSM circuits have better characteristics than their FD-based counterparts. As a rule, the combined use of methods of SD allows obtaining a greater gain in the number of LUTs than from the use of each of these methods separately []. In [], we proposed to use two methods of SD, namely, the replacement of inputs and encoding of collections of outputs; however, even in this case, some parts of the resulting FSM circuits can have more than a single level of LUTs. In this article, we discuss just such a case.
To diminish the number of LUTs and their levels, we use the ideas []. These methods are based on finding a partition of states by classes of compatible states; however, in contrast to the known methods [], we have replaced known extended state codes by the complex state codes, which have not been known before. The complex state codes are represented by concatenations of class codes and the codes of FSM states as elements of these classes. This approach leads to four-level FSM circuits, which require fewer LUTs than their counterparts based on methods []. There is a gain in the LUT count around 11.36% relative to three-level FSM circuits []. Moreover, our approach provides a very small increase in the FSM performance (on average, only 0.31%) and a decrease in the power consumption (on average by 5.79%) for the benchmarks from the library []. In our opinion, the proposed method can be applied instead of LUT-based MPY Mealy FSMs.
Author Contributions
Conceptualization, A.B., L.T., and K.K.; methodology, A.B., L.T., K.K., and S.S.; software, A.B., L.T., and K.K.; validation, A.B., L.T., and K.K.; formal analysis, A.B., L.T., K.K., and S.S.; investigation, A.B., L.T., and K.K.; writing—original draft preparation, A.B., L.T., K.K., and S.S.; supervision, A.B. All authors have read and agreed to the published version of the manuscript.
Funding
This research received no external funding.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
The data presented in this study are available in the article.
Conflicts of Interest
The authors declare no conflict of interest.
Abbreviations
The following abbreviations are used in this manuscript:
| CLB | configurable logic block |
| CO | collection of outputs |
| CSC | complex state codes |
| DST | direct structure table |
| ESC | extended state code |
| FD | functional decomposition |
| FPGA | field-programmable gate array |
| FSM | finite state machine |
| IMF | input memory function |
| LUT | look-up table |
| RG | state code register |
| SBF | systems of Boolean functions |
| SD | structural decomposition |
| STG | state transition graph |
| STT | state transition table |
References
- Micheli, G.D. Synthesis and Optimization of Digital Circuits; McGraw-Hill: Cambridge, MA, USA, 1994. [Google Scholar]
- Barkalov, A.; Titarenko, L.; Krzywicki, K. Structural Decomposition in FSM Design: Roots, Evolution, Current State—A Review. Electronics 2021, 10, 1174. [Google Scholar] [CrossRef]
- Kubica, M.; Kania, D. Technology Mapping of FSM Oriented to LUT-Based FPGA. Appl. Sci. 2020, 10, 3926. [Google Scholar] [CrossRef]
- Czerwinski, R.; Kania, D. Finite State Machine Logic Synthesis for Complex Programmable Logic Devices. In Lecture Notes in Electrical Engineering; Springer: Berlin/Heidelberg, Germany, 2013; Volume 231. [Google Scholar]
- Kubica, M.; Kania, D.; Kulisz, J. A technology mapping of fsms based on a graph of excitations and outputs. IEEE Access 2019, 7, 16123–16131. [Google Scholar] [CrossRef]
- Islam, M.M.; Hossain, M.S.; Shahjalal, M.D.; Hasan, M.K.; Jang, Y.M. Area-time efficient hardware implementation of modular multiplication for elliptic curve cryptography. IEEE Access 2020, 8, 73898–73906. [Google Scholar] [CrossRef]
- Grout, I. Digital Systems Design with FPGAs and CPLDs; Elsevier Science: Amsterdam, The Netherlands, 2011. [Google Scholar]
- Trimberger, S.M. Field-Programmable Gate Array Technology; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2012. [Google Scholar]
- Ruiz-Rosero, J.; Ramirez-Gonzalez, G.; Khanna, R. Field Programmable Gate Array Applications—A Scientometric Review. Computation 2019, 7, 63. [Google Scholar] [CrossRef] [Green Version]
- Xilinx FPGAs. Available online: https://www.xilinx.com/products/silicon-devices/fpga.html (accessed on 18 December 2021).
- Barkalov, A.; Titarenko, L.; Krzywicki, K. Reducing LUT Count for FPGA-Based Mealy FSMs. Appl. Sci. 2020, 10, 5115. [Google Scholar] [CrossRef]
- Kubica, M.; Opara, A.; Kania, D. Technology mapping for LUT-based FSMs. In Lecture Notes in Electrical Engineering; Springer: Berlin/Heidelberg, Germany, 2021; Volume 713, p. 216. [Google Scholar]
- Scholl, C. Functional Decomposition with Application to FPGA Synthesis; Kluwer Academic Publishers: Boston, MA, USA, 2001. [Google Scholar]
- Baranov, S. Logic Synthesis of Control Automata; Kluwer Academic Publishers: Dordrecht, The Netherlands, 1994. [Google Scholar]
- Sklarova, D.; Sklarov, V.A.; Sudnitson, A. Design of FPGA-Based Circuits Using Hierarchical Finite State Machines; TUT Press: Tallinn, Estonia, 2012. [Google Scholar]
- Kuon, I.; Tessier, R.; Rose, J. FPGA architecture: Survey and challenges—Found trends. Electr. Des. Autom. 2008, 2, 135–253. [Google Scholar]
- Barkalov, A.; Titarenko, L.; Mielcarek, K. Improving characteristics of LUT-based Mealy FSMs. Int. J. Appl. Math. Comput. Sci. 2020, 30, 745–759. [Google Scholar]
- Kubica, M.; Kania, D. Decomposition of multi-level functions oriented to configurability of logic blocks. Bull. Pol. Acad. Sci. 2017, 67, 317–331. [Google Scholar]
- Mishchenko, A.; Chattarejee, S.; Brayton, R. Improvements to technology mapping for LUT-based FPGAs. IEEE Trans. CAD 2006, 27, 240–253. [Google Scholar]
- Chapman, K. Multiplexer Design Techniques for Datapath Performance with Minimized Routing Resources. Application Note. 2012. Available online: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.259.5300&rep=rep1&type=pdf (accessed on 17 March 2022).
- Sasao, T.; Mishchenko, A. LUTMIN: FPGA Logic Synthesis with MUX-Based and Cascade Realizations. Proc. IWLS. 2009, pp. 310–316. Available online: http://www.lsi-cad.com/sasao/Papers/files/IWLS2009_sasao_mis.pdf (accessed on 17 March 2022).
- Senhadji-Navarro, R.; Garcia-Vargas, I. Mapping Arbitrary Logic Functions onto Carry Chains in FPGAs. Electronics 2022, 11, 27. [Google Scholar] [CrossRef]
- Kim, J.H.; Anderson, J. Post-LUT-Mapping Implementation of General Logic on Carry Chains Via a MIG-Based Circuit Representation. In Proceedings of the 2021 31st International Conference on Field-Programmable Logic and Applications (FPL), Dresden, Germany, 30 August–3 September 2021; pp. 334–340. [Google Scholar]
- Kubica, M.; Kania, D. Technology mapping oriented to adaptive logic modules. Bull. Pol. Acad. Sci. 2019, 67, 947–956. [Google Scholar]
- Das, N. Reset: A Reconfigurable state encoding technique for FSM to achieve security and hardware optimality. Microprocess. Microsyst. 2020, 77, 103196. [Google Scholar] [CrossRef]
- Tao, Y.; Zhang, Y.; Wang, Q.; Cao, J. MPGA: An evolutionary state assignment for dynamic and leakage power reduction in FSM synthesis. IET Comput. Digit. Tech. 2018, 12, 111–120. [Google Scholar] [CrossRef]
- El-Maleh, A.H. A Probabilistic Tabu Search State Assignment Algorithm for Area and Power Optimization of Sequential Circuits. Arab. J. Sci. Eng. 2020, 45, 6273–6285. [Google Scholar] [CrossRef]
- Sentovich, E.M.; Singh, K.J.; Lavagno, L.; Moon, C.; Murgai, R.; Saldanha, A.; Savoj, H.; Stephan, P.R.; Brayton, R.K.; Sangiovanni-Vincentelli, A. SIS: A System for Sequential Circuit Synthesis; University of California: Berkely, CA, USA, 1992. [Google Scholar]
- ABC System. Available online: https://people.eecs.berkeley.edu/~alanmi/abc/ (accessed on 18 December 2021).
- Brayton, R.; Mishchenko, A. ABC: An Academic Industrial-Strength Verification Tool. In Computer Aided Verification (Berlin, Heidelberg, 2010); Touili, T., Cook, B., Jackson, P., Eds.; Springer: Berlin/Heidelberg, Germany, 2010; pp. 24–40. [Google Scholar]
- Vivado Design Suite User Guide: Synthesis. UG901 (v2019.1). Available online: https://www.xilinx.com/support/documentation/sw_manuals/xilinx2019_1/ug901-vivado-synthesis.pdf (accessed on 18 December 2021).
- Quartus Prime. Available online: https://www.intel.pl/content/www/pl/pl/software/programmable/quartus-prime/overview.html (accessed on 18 December 2021).
- Khatri, S.P.; Gulati, K. Advanced Techniques in Logic Synthesis, Optimizations and Applications; Springer: New York, NY, USA, 2011. [Google Scholar]
- Sklyarov, V. Synthesis and implementation of RAM-based finite state machines in FPGAs. In International Workshop on Field Programmable Logic and Applications; Springer: Berlin/Heidelberg, Germany, 2000; pp. 718–727. [Google Scholar]
- Barkalov, A.; Titarenko, L.; Krzywicki, K.; Saburova, S. Improving the Characteristics of Multi-Level LUT-Based Mealy FSMs. Electronics 2020, 9, 1859. [Google Scholar] [CrossRef]
- McElvain, K. LGSynth93 Benchmark; Mentor Graphics: Wilsonville, OR, USA, 1993. [Google Scholar]
- VC709 Evaluation Board for the Virtex-7 FPGA User Guide; UG887 (v1.6); Xilinx, Inc.: San Jose, CA, USA, 2019.
- Islam, M.M.; Hossain, M.S.; Hasan, M.K.; Shahjalal, M.; Jang, Y.M. FPGA implementation of high-speed area-efficient processor for elliptic curve point multiplication over prime field. IEEE Access 2019, 7, 178811–178826. [Google Scholar] [CrossRef]
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations. |
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).