Transforming Group Codes in Mealy Finite State Machines with Composite State Codes

Barkalov, Alexander; Titarenko, Larysa; Mielcarek, Kamil

doi:10.3390/app15084289

by Alexander Barkalov ¹, Larysa Titarenko ¹,² and Kamil Mielcarek ¹,*

¹ Institute of Metrology, Electronics and Computer Science, University of Zielona Góra, ul. Licealna 9, 65-417 Zielona Góra, Poland

² Department of Infocommunication Engineering, Faculty of Infocommunications, Kharkiv National University of Radio Electronics, Nauky Avenue 14, 61166 Kharkiv, Ukraine

* Author to whom correspondence should be addressed.

Appl. Sci. 2025, 15(8), 4289; https://doi.org/10.3390/app15084289

Submission received: 28 February 2025 / Revised: 25 March 2025 / Accepted: 11 April 2025 / Published: 13 April 2025

Abstract

A new state assignment method for Mealy finite state machines (FSMs) is proposed. The proposed codes are an alternative to composite state codes (CSCs). CSCs are concatenations of group codes and partial state codes, both of which are maximum binary codes. We propose encoding the groups using one-hot codes instead. The main goal of this method is to reduce the FSM cycle time without a significant degradation of the spatial characteristics. The method can be applied when FSM circuits are implemented using the look-up table (LUT) elements of field-programmable gate arrays (FPGAs). The resulting FSM circuit includes three logic blocks. The first block generates partial input memory functions and partial FSM outputs depending on maximum binary partial state codes and one-hot group codes. The partial codes are assigned in a way that minimizes the number of arguments in the partial functions, which allows most partial functions to be generated by single-LUT circuits. The second block generates the final values of the input memory functions and FSM outputs. Unlike in CSC-based FSMs, this block does not need group codes to generate its functions. The third block transforms maximum binary group codes into their one-hot equivalents. The proposed approach reduces the number of series-connected LUTs in comparison with CSC-based FSMs, which improves the temporal characteristics of the FSM circuit. This paper includes an example of FSM synthesis applying the proposed method. The experiments were conducted using standard benchmark FSMs. The results show that the proposed method improved the cycle time by an average of 8.81%. Moreover, in relation to CSC-based FSMs, the LUT counts decreased by an average of 4.00%.

Keywords: Mealy FSM; FPGA; LUT; synthesis; composite state assignment; one-hot group codes

1. Introduction

Modern digital systems include many different sequential devices [1,2]. These devices are very often represented by the model of a Mealy finite state machine (FSM) [3]. An FSM model is the starting point for creating the electrical circuit of a sequential device. In this paper, we discuss the case where the circuits of Mealy FSMs are created using the internal resources of field-programmable gate arrays (FPGAs) [4]. This choice is justified by the fact that FPGAs are a very popular platform for implementing the circuits of various digital devices [5]. Moreover, FPGAs are expected to remain in use in logic design for the next thirty years [6].

As a rule, some optimization problems arise in the process of FSM design. These problems are interrelated. The following characteristics may require optimization [7,8,9]: the amount of internal resources used (occupied chip area), or performance, or power consumption. In this article, we propose a method which allows for an increase in the maximum operating frequency of an FPGA-based Mealy FSM circuit. The following internal resources are used to create the circuit: look-up table (LUT) elements, programmable flip-flops, dedicated multiplexers, a synchronization tree, programmable interconnections and programmable input–outputs [4,8,10]. We discuss a case where FSM states are encoded by composite state codes (CSCs) [11]. This approach is based on creating groups of compatible states. Both groups and states are encoded using the minimum possible number of variables. The proposed method improves performance without a significant increase in the amount of internal resources used. We use the LUT count [12] to estimate the chip area occupied by the FSM circuit.

Modern digital systems process ever larger amounts of information [13]. When digital systems are implemented on FPGAs, it is often difficult to increase the system performance without a significant increase in the amount of internal resources consumed [14]. At the same time, the bottleneck may be the control device of the digital system, implemented in the form of an FSM. The control device generates control signals at each point in the system's operation. Therefore, the performance of this control FSM significantly affects the performance of the digital system as a whole [15]. One of the ways to increase the performance of an FSM is a state assignment that reduces the number of logic levels in the FPGA-based FSM circuit [16,17]. In this article, we propose a state assignment method that increases the FSM performance without a significant overhead in the internal resources used.

The main contribution of this paper is a novel approach that improves the cycle time (the operating frequency) of LUT-based Mealy FSM circuits with composite state codes. The improvement is based on transforming the maximum binary codes of groups of compatible states into one-hot codes (OHCs). Within each group, states are encoded by maximum binary codes (MBCs). This approach leads to a new architecture of an FSM circuit, different from the architecture where OHCs are not used [11]. The performance improvement comes from the fact that the function assembler block no longer needs the group codes. This approach improves the performance of rather complex FSMs. A positive feature of this approach is that it does not significantly increase the amount of internal FPGA resources used. Moreover, the proposed approach leads to circuits with slightly fewer LUTs than their equivalent CSC-based counterparts.

The main idea of this article is as follows: the transformation of maximum binary group codes into one-hot codes allows for a decrease in the number of partial Boolean functions generated by LUTs from the first logic level of an FSM circuit. Obviously, the code transformation requires additional LUTs and programmable interconnections (these are overheads of the proposed method), but our new method provides a decrease in the number of LUTs in the circuit assembling partial Boolean functions. This may result in a reduction in the values of LUT counts compared to the equivalent CSC-based FSMs. Our research shows that this phenomenon takes place for the proposed method.

The article includes seven sections, the first of which is this brief Introduction. Section 2 shows the peculiarities of implementing an FSM with LUTs. The analysis of FSMs with composite state codes is discussed in Section 3. An FSM architecture based on the transformation of group codes is proposed in Section 4. Section 5 is devoted to a synthesis example. Section 6 shows and analyzes the obtained experimental results. The paper also includes a Conclusion (Section 7).

2. LUT-Based FSM Design

There are many approaches for representing the behavior of a Mealy FSM [1]. The most common are state transition graphs (STGs) and state transition tables (STTs) [18]. We use both tools to represent transitions between the states from the set $A = \{a_{1}, \dots, a_{M}\}$. The transitions depend on the FSM inputs comprising the set $X = \{x_{1}, \dots, x_{L}\}$. During the transitions, FSM outputs from the set $Y = \{y_{1}, \dots, y_{N}\}$ are generated. As follows from these sets, an FSM has $M$ internal states, $L$ external inputs and $N$ outputs. For each state $a_{m} \in A$, an STG (or an STT) shows which outputs are generated under the influence of the FSM inputs [1]. There are $H$ interstate transitions represented by either an STG or an STT.

The first step of FSM synthesis is a state assignment [19]. During the state assignment, the abstract states $a_{m} \in A$ are represented by binary codes $BC(a_{m})$ having $R$ bits. The bits of the state codes correspond to the state variables combined into the set $T = \{T_{1}, \dots, T_{R}\}$. Using an STT, it is possible to derive systems of Boolean functions (SBFs) representing the combinational part of an FSM circuit. The state codes $BC(a_{m})$ are stored in a state code register, RG. The minimum width (the number of bits) is provided by so-called maximum binary codes. Their width $R$ is determined as [1]

$$R = \lceil \log_{2} M \rceil. \tag{1}$$
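As a quick numeric check of (1), the following minimal sketch computes the width of a maximum binary code for a given number of states. The function name `mbc_width` is hypothetical and not part of the paper.

```python
def mbc_width(num_states: int) -> int:
    """Width R of a maximum binary code for M states: R = ceil(log2(M))."""
    if num_states < 1:
        raise ValueError("an FSM needs at least one state")
    # (M - 1).bit_length() equals ceil(log2(M)) for every M >= 1 and avoids
    # floating-point rounding; it yields 0 for the degenerate case M = 1.
    return (num_states - 1).bit_length()
```

For an FSM with twelve states, this gives a four-bit code, since $2^{3} < 12 \leq 2^{4}$.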

To change the content of the RG, three objects are used: input memory functions (IMFs), a pulse of initialization, $Start$, and a pulse of synchronization, $Clock$. As a rule, the RG consists of D flip-flops [1]. Due to this, the IMFs form the set $D = \{D_{1}, \dots, D_{R}\}$. If $Start = 1$, then the code of the initial state $a_{1} \in A$ is loaded into the RG. On the active edge of the pulse $Clock$, the values of the IMFs are loaded into the RG.

Using an initial STT (or STG) and state codes, it is possible to derive the following SBFs [20]:

$$D = D(T, X); \tag{2}$$

$$Y = Y(T, X). \tag{3}$$

These SBFs determine the combinational part of an FSM circuit. Systems (2) and (3) determine a so-called P FSM [14]. The combinational part is implemented using some logic elements. The peculiarities of these elements significantly influence the architecture of an FSM circuit [7].

This paper is devoted to the design of FPGA-based FSMs. Modern FPGAs include many LUTs, which are parts of configurable logic blocks (CLBs) [21]. In this article, we target chips produced by AMD Xilinx [4]. The CLBs of these chips consist of LUTs, dedicated multiplexers and programmable flip-flops. Using the multiplexers, it is possible to obtain a super-LUT having more inputs than a basic LUT. For example, the basic LUT of the Virtex-7 family of FPGAs has $S_{L} = 6$ inputs [4,22]. To show the number of inputs, we denote an LUT as $S_{L}$-LUT. Inside a single CLB, either two 7-LUTs or one 8-LUT can be created with the help of multiplexers. These super-LUTs (SLUTs) have practically the same delay as their six-input counterparts [23]. If an SLUT must have $S_{L} \geq 9$ inputs, then the internal resources of several CLBs are used, and such big SLUTs are significantly slower than SLUTs created inside a single CLB [23]. In this paper, we use the symbol $LUTer$ to show that a logic block is implemented using LUTs and other internal CLB resources.

Because the outputs of LUTs are directly connected to the inputs of flip-flops inside a CLB, registers are implemented using CLB resources. So, the state code register is a part of an LUTer. This preliminary information allows for the creation of a structural diagram of the FPGA-based FSM $U_{1}$ (Figure 1).

The block $LUTerT$ implements the IMFs represented by system (2). We show the state register RG as a separate block, but in reality, it is a part of $LUTerT$. The pulses $Start$ and $Clock$ enter $LUTerT$. The FSM outputs (3) are produced by the LUTs of the block $LUTerY$. These outputs are used by the other blocks of a digital system. So, it makes sense to reduce the delay between the arrival of the FSM inputs $x_{l} \in X$ and the production of the FSM outputs $y_{n} \in Y$.

Modern FPGAs have a serious drawback: the number of basic LUT inputs ($S_{L}$) is really small [8]. This significantly affects the number of LUTs and their levels in the FSM circuit. If the sum of products (SOP) of a Boolean function $f_{i} \in D \cup Y$ includes $NL_{i}$ literals, then the corresponding LUT-based circuit is multi-level if

$$NL_{i} > S_{L}. \tag{4}$$

To construct the SOPs of functions $f_{i} \in D \cup Y$, it is necessary to create a direct structure table (DST) [20]. Both STGs and STTs could be used for constructing a DST [18]. In our paper, we start synthesis using an STG.

An STG includes $M$ vertices corresponding to the FSM states. The vertices are connected by $H$ arcs determining the interstate transitions. The arc number $h \in \{1, \dots, H\}$ is labeled with $X_{h}$ and $Y_{h}$. The conjunction of inputs $X_{h}$ is the input signal determining a particular transition. The symbol $Y_{h}$ stands for the collection of outputs (CO) generated during that transition.

To create a DST, it is necessary to transform an STG into the equivalent STT. An STT includes the following columns: $a_{m}$ (the current state), $a_{S}$ (the state of transition), $X_{h}$ (the input signal determining the transition from the current state into the next state), $Y_{h}$ (the CO generated during the transition) and $h$ (the number of the interstate transition).

A DST is an extension of the original STT with three additional columns. Two of the additional columns contain the binary codes $BC(a_{m})$ and $BC(a_{S})$. The third column includes the collection of IMFs, $D_{h} \subseteq D$, determining the code $BC(a_{S})$ [20]. Of course, the equivalent STT and DST have the same number of rows.
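The relation between an STT row and its DST row can be illustrated with a small data-structure sketch. The names `SttRow` and `to_dst_row`, the string encoding of $X_{h}$, and the dictionary layout are assumptions made for illustration only. The rule for $D_{h}$ relies on the register being built from D flip-flops, as stated above: $D_{r}$ must equal 1 exactly where bit $r$ of $BC(a_{S})$ is 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SttRow:
    h: int             # transition number
    current: str       # a_m, the current state
    nxt: str           # a_S, the state of transition
    x_h: str           # input signal, e.g. "x1 & ~x2" (free-form text here)
    y_h: frozenset     # collection of outputs generated on this transition


def to_dst_row(row: SttRow, codes: dict) -> dict:
    """Extend an STT row with BC(a_m), BC(a_S) and the collection D_h."""
    bc_next = codes[row.nxt]
    # Bits are numbered 1..R from the left; D_r is listed when bit r is '1'.
    d_h = frozenset(
        f"D{r}" for r, bit in enumerate(bc_next, start=1) if bit == "1"
    )
    return {
        "row": row,
        "bc_current": codes[row.current],
        "bc_next": bc_next,
        "d_h": d_h,
    }
```

Each STT row maps to exactly one DST row, which is why both tables have the same number of rows.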

Two types of codes are typical for FSMs: maximum binary codes and one-hot codes [24]. Both have positive and negative aspects. Using MBCs leads to the minimum possible number of flip-flops in the RG, but it also leads to really complex SOPs for functions (2) and (3). As a result, an FSM circuit may include many LUT levels and a very sophisticated system of interconnections [7]. One-hot codes have the maximum number of bits ($R_{OH} = M$). This requires many more flip-flops than in the case of MBC-based FSMs, but the SOPs of the IMFs and FSM outputs are rather simple.

A comparison of FSMs based on MBCs and OHCs was carried out in [25]. This comparison showed that using OHCs provides better results for FSMs with $M > 16$. As was shown in [3], both LUT counts and power consumption depend significantly on the value of $L$. The following conclusion can be drawn from the experiments reported in [26]: MBC-based FSMs have better characteristics than OHC-based FSMs if $L > 10$.

If condition (4) is true for a function $f_{i} \in D \cup Y$, then this function should be decomposed. In this case, the function is represented by a collection of partial Boolean functions (PBFs). To find such a representation, methods of functional decomposition (FD) [27] could be applied. To be represented by a single-LUT circuit, any partial function should depend on no more than $S_{L}$ arguments. Next, all LUTs generating the partial Boolean functions of the same function $f_{i} \in D \cup Y$ should be combined into a multi-level circuit. It is known that applying FD-based methods results in multi-level FSM circuits having a very complicated system of interconnections [28,29].

To increase the operating frequency, it is necessary to reduce the number of logic levels in an FPGA-based FSM circuit. This could be achieved, for example, by reducing the number of literals in the sums of products representing the IMFs and FSM outputs. One of the ways to solve this problem is an optimal state assignment [28]. The well-known algorithm JEDI [30] is the best example of the efficiency of such an approach. Optimization is achieved by placing some state codes into generalized cubes of an $R$-dimensional Boolean space. These cubes include the maximum binary codes of states whose transitions depend on the same input signals $X_{h}$. To minimize the corresponding SOPs, it is necessary to use adjacent codes for these states. This goal may also be achieved by increasing the number of bits in the state codes. This leads to FSM circuits having fewer LUTs, LUT levels and interconnections than equivalent FSM circuits based on other state assignment methods. Based on the results shown in [31], we should point out that applying JEDI leads to an improvement in LUT counts, performance and power consumption.

3. Peculiarities of FSMs Based on Composite State Assignment

Composite state assignment is proposed in paper [11]. As follows from [11], using CSCs may improve the main characteristics of FPGA-based FSM circuits: the resulting circuits occupy a smaller chip area and provide a higher operating frequency than their counterparts based on MBCs, OHCs and JEDI.

To execute the composite state assignment, it is necessary to split the set of states $A$ into $KC$ groups of compatible states, $G^{k} \subseteq A$. This leads to a partition $\Pi_{AC} = \{G^{1}, \dots, G^{KC}\}$ with the minimum possible number of groups. The groups $G^{k} \in \Pi_{AC}$ are encoded by maximum binary codes $CG(G^{k})$ with $R_{G}$ bits:

$$R_{G} = \lceil \log_{2} KC \rceil. \tag{5}$$

The groups are encoded using the elements of the set $\tau_{G} = \{\tau_{1}, \dots, \tau_{R_{G}}\}$.

Maximum binary partial state codes encode the states $a_{m} \in G^{k}$ inside each group $G^{k} \in \Pi_{AC}$. If a group $G^{k} \in \Pi_{AC}$ includes $NS_{k}$ elements, then the width of its partial codes is determined as

$$R_{k} = \lceil \log_{2} NS_{k} \rceil. \tag{6}$$

To create the partial state codes, $R_{S}$ variables are necessary. The value of $R_{S}$ is determined as

$$R_{S} = \max(R_{1}, \dots, R_{KC}). \tag{7}$$

The partial state variables form the set $\tau_{S} = \{\tau_{R_{G}+1}, \dots, \tau_{R_{G}+R_{S}}\}$. So, the total number of group and partial state variables, $R_{CC}$, is determined as

$$R_{CC} = R_{G} + R_{S}. \tag{8}$$

Expression (8) also determines the number of IMFs generated (and the number of flip-flops in the RG).

The concatenation of the codes $CG(G^{k})$ and $CS(a_{m})$ determines the composite state code $CC(a_{m})$ of the state $a_{m} \in G^{k}$. So, the composite state codes are determined by the following expression:

$$CC(a_{m}) = CG(G^{k}) * CS(a_{m}). \tag{9}$$

In (9), the symbol $*$ denotes the concatenation of two binary vectors.
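Expressions (5)–(9) can be traced with a short sketch that builds composite state codes for a given partition. The function `composite_codes` is hypothetical, and the codes are assigned simply in list order, which is an assumption made for illustration; the source does not prescribe this particular order, and an optimized assignment may differ.

```python
def _width(n: int) -> int:
    """ceil(log2(n)) for n >= 1, as used in (5) and (6)."""
    return (n - 1).bit_length()


def composite_codes(groups):
    """Return (R_G, R_S, codes) for a partition given as lists of state names.

    Group k (0-based) gets the binary code of k on R_G bits; the i-th state
    of a group gets the binary code of i on R_S bits (illustrative order).
    """
    r_g = _width(len(groups))                      # (5)
    r_s = max(_width(len(g)) for g in groups)      # (6) and (7)
    codes = {}
    for k, group in enumerate(groups):
        cg = format(k, f"0{r_g}b") if r_g else ""
        for i, state in enumerate(group):
            cs = format(i, f"0{r_s}b") if r_s else ""
            codes[state] = cg + cs                 # (9): CC = CG * CS
    return r_g, r_s, codes
```

For a partition with three groups of four states each, the sketch gives $R_{G} = 2$ and $R_{S} = 2$, so every composite code has $R_{CC} = 4$ bits.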

To implement a CSC-based FSM circuit, it is necessary to find three sets for each group $G^{k} \in \Pi_{AC}$. The partial set of inputs $X^{k} \subseteq X$ includes the inputs determining transitions from the states $a_{m} \in G^{k}$; this set includes $L_{k}$ elements. The partial set of outputs $Y^{k} \subseteq Y$ includes the FSM outputs $y_{n} \in Y$ generated during transitions from the states $a_{m} \in G^{k}$. The set of IMFs $D^{k} \subseteq D$ includes the input memory functions generated during transitions from the states $a_{m} \in G^{k}$.

Each group $G^{k} \in \Pi_{AC}$ should satisfy the following condition:

$$R_{S} + L_{k} \leq S_{L}. \tag{10}$$

To create the partition $\Pi_{AC}$, it is possible to use the method proposed in [32]. This approach allows us to find a partition $\Pi_{AC}$ with the minimum possible number of groups, $KC$.
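To make condition (10) concrete, the sketch below groups states with a simple first-fit heuristic. It is not the method of [32]: the function `first_fit_partition`, its arguments, and the decision to fix $R_{S}$ in advance are all assumptions made for illustration, and a first-fit pass does not guarantee the minimum number of groups.

```python
def first_fit_partition(state_inputs, r_s, s_l):
    """Group states so that every group satisfies R_S + L_k <= S_L.

    state_inputs maps a state name to the set of inputs on which the
    transitions from that state depend. A group holds at most 2**r_s states
    so that r_s partial state variables can encode it.
    """
    groups = []                      # list of (states, inputs X^k) pairs
    max_states = 2 ** r_s
    for state, inputs in state_inputs.items():
        if r_s + len(inputs) > s_l:
            raise ValueError(f"{state} alone violates condition (10)")
        for members, x_k in groups:
            merged = x_k | set(inputs)
            if len(members) < max_states and r_s + len(merged) <= s_l:
                members.append(state)
                x_k.update(inputs)   # X^k grows with the new state's inputs
                break
        else:
            groups.append(([state], set(inputs)))
    return groups
```

The returned pairs correspond to the groups $G^{k}$ and their partial input sets $X^{k}$; the length of each input set is $L_{k}$.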

So, each group determines the following PBFs:

$$D^{k} = D^{k}(\tau_{S}, X^{k}); \tag{11}$$

$$Y^{k} = Y^{k}(\tau_{S}, X^{k}). \tag{12}$$

To obtain the final values of the functions $f_{i} \in D \cup Y$, it is necessary to assemble the corresponding partial functions. This is achieved using the group codes. Each partial function should be multiplied by the conjunction $G_{k}$ corresponding to the code $CG(G^{k})$. So, the final values of SBFs (2) and (3) are determined by the following systems:

$$D = D(G_{1}, \dots, G_{KC}, D^{1}, \dots, D^{KC}); \tag{13}$$

$$Y = Y(G_{1}, \dots, G_{KC}, Y^{1}, \dots, Y^{KC}). \tag{14}$$
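One plausible reading of systems (13) and (14), consistent with the multiplication by $G_{k}$ described above, is a disjunction of gated partial functions. The expansion below is a sketch under that assumption rather than a formula quoted from [11]; the symbol $f_{i}^{k}$ denotes the partial function of $f_{i}$ produced for the group $G^{k}$.

```latex
f_{i} = \bigvee_{k=1}^{KC} G_{k} \, f_{i}^{k}, \qquad f_{i} \in D \cup Y .
```

Since each $G_{k}$ is a conjunction of the $R_{G}$ group variables, an assembled function in this form may depend on up to $R_{G} + KC$ arguments.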

As shown in [11], the functions (13) and (14) could be generated using the dedicated multiplexers of CLBs [23,33].

So, there should be two levels of logic blocks in the circuit of the CSC-based Mealy FSM $U_{2}$. The architecture of this circuit is determined by systems (11)–(14). The architecture of FSM $U_{2}$ (Figure 2) is proposed in article [11].

In FSM $U_{2}$, the LUT-based blocks $LUTer1$–$LUTerKC$ form the first logic level of the FSM circuit. The block $LUTerk$ generates the partial functions (11) and (12) representing the group $G^{k} \in \Pi_{AC}$. The function assembler is represented by the block $LUTerY\tau$. The function assembler generates functions (13) and (14). The FSM outputs (14) enter the other blocks of the digital system. The IMFs (13) enter the flip-flops of the register RG. The outputs of the flip-flops determine both the group codes and the partial state codes.

In [11], the results of various experiments are shown. The experiments allowed for a comparison of the characteristics of CSC-, MBC- and OHC-based FSMs. These results show that CSC-based FSMs have better LUT counts than are obtained using either MBCs or OHCs. It is very important that reducing the LUT count does not reduce the values of the maximum operating frequencies.

The conditions under which CSC-based FSMs give the best results were determined in [11]:

$$R + L \geq 2 S_{L}; \tag{15}$$

$$R_{G} + KC \leq S_{L}. \tag{16}$$

If condition (15) is violated, then there is practically no need to use FD-based approaches, because the corresponding FSM circuits are mostly single-level [11]. If condition (16) is true, then there is a single level of LUTs in the function assembler circuit. If both conditions (15) and (16) hold, then there are two levels of LUTs in the circuits of $U_{2}$-based FSMs.
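Conditions (15) and (16) are easy to evaluate for a given FSM. The helper below is a minimal sketch; the name `csc_conditions` and the returned tuple layout are assumptions, and the numbers in the usage note are hypothetical rather than taken from any benchmark.

```python
def csc_conditions(r: int, l: int, r_g: int, kc: int, s_l: int):
    """Evaluate conditions (15) and (16) for a CSC-based FSM.

    Returns (cond_15, cond_16):
      cond_15 is True when R + L >= 2 * S_L, i.e. decomposition is needed;
      cond_16 is True when R_G + KC <= S_L, i.e. the function assembler
      fits into a single level of LUTs.
    """
    cond_15 = r + l >= 2 * s_l
    cond_16 = r_g + kc <= s_l
    return cond_15, cond_16
```

For instance, hypothetical values $R = 5$, $L = 9$, $R_{G} = 3$, $KC = 5$ and $S_{L} = 6$ satisfy (15) but violate (16), which corresponds to a function assembler that cannot be implemented as a single level of LUTs.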

As follows from the analysis of systems (13) and (14), the violation of condition (16) is connected with the direct dependence of the functions $f_{i} \in D \cup Y$ on the group variables $\tau_{r} \in \tau_{G}$. So, to reduce the number of levels in the function assembler circuit, it is necessary to eliminate this direct dependence. This will allow for the deletion of the conjunctions $G_{1}, \dots, G_{KC}$ from the functions generated by the function assembler. One of the possible approaches for solving this problem is proposed in our current article.

4. The Main Idea of the Proposed Method

In this paper, we propose a method aimed at the relocation of group variables from $LUTerY\tau$ to the blocks of the first logic level. This diminishes the number of arguments in the functions $f_{i} \in D \cup Y$. In turn, this increases the probability of obtaining a single-level circuit for the block $LUTerY\tau$. We propose encoding the groups $G^{k} \in \Pi_{AC}$ using one-hot codes. The states are still encoded by maximum binary partial codes. Now, the groups are encoded using the elements of the set $\tau_{0} = \{g_{1}, \dots, g_{KC}\}$. These variables enter the blocks $LUTer1$–$LUTerKC$. This means that the variable $g_{k} \in \tau_{0}$ determines a literal used by all the PBFs generated by the LUTs of the block $LUTerk$. This method is focused on the situation where a single-level implementation of the function assembler circuit is impossible.

Now, the variables determining the one-hot group codes should enter the LUTs of the blocks $LUTer1$–$LUTerKC$. This leads to a change in the equations for both the PBFs and the full functions. Now, the LUTs of the first logic level generate the following SBFs:

$$D^{k} = D^{k}(g_{k}, \tau_{S}, X^{k}); \tag{17}$$

$$Y^{k} = Y^{k}(g_{k}, \tau_{S}, X^{k}). \tag{18}$$

In this case, there are no connections between the group codes and the LUTs of $LUTerY\tau$. This leads to the following SBFs representing the function assembler:

$$D = D(D^{1}, \dots, D^{KC}); \tag{19}$$

$$Y = Y(Y^{1}, \dots, Y^{KC}). \tag{20}$$
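The effect of moving the group variables to the first logic level can be checked on a toy model. The sketch below assumes that assembly is a disjunction of gated partial functions, which is an interpretation rather than a statement from the paper; the function names are hypothetical. It compares a second level that still receives group signals with one that only ORs already gated partial functions.

```python
from itertools import product


def one_hot(active: int, kc: int):
    """One-hot vector with a single 1 at position `active` (0-based)."""
    return [1 if k == active else 0 for k in range(kc)]


def assemble_with_group_signals(active: int, partials):
    """Second level receives group signals and ungated partial functions."""
    g = one_hot(active, len(partials))
    return int(any(gk and fk for gk, fk in zip(g, partials)))


def assemble_gated_partials(active: int, partials):
    """First level gates each partial function by g_k; second level only ORs."""
    g = one_hot(active, len(partials))
    gated = [gk & fk for gk, fk in zip(g, partials)]   # first logic level
    return int(any(gated))                             # function assembler


def check_equivalence(kc: int) -> bool:
    """Exhaustively compare both schemes for kc groups."""
    return all(
        assemble_with_group_signals(a, p) == assemble_gated_partials(a, p) == p[a]
        for a in range(kc)
        for p in product([0, 1], repeat=kc)
    )
```

Both schemes select the partial function of the active group, but in the second one the assembler depends on the $KC$ partial functions only.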

Now, the IMFs in (19) are used to create the composite state codes. In these codes, groups are represented by maximum binary codes. To create the one-hot group codes, it is necessary to transform these maximum binary group codes into their one-hot equivalents. The transformation is executed by an additional code transformer block. The code transformer generates the following system of functions:

$$\tau_{0} = \tau_{0}(\tau_{G}). \tag{21}$$

We name the resulting state codes mixed composite state codes (MCSCs). The term “mixed” means that the groups are represented by OHCs and the partial state codes are maximum binary codes. The proposed approach leads to the development of the MCSC-based Mealy FSM $U_{3}$. Its architecture is shown in Figure 3.

In an MCSC-based FSM, SBFs (17) and (18) are generated by the LUTs of the first logic level. These LUTs form the blocks $LUTerk$ ($k \in \{1, \dots, KC\}$). The LUTs of the block $LUTerY\tau$ generate the final values of the FSM outputs $y_{n} \in Y$ and the input memory functions. The block $LUTerOH$ generates SBF (21). As a result, one-hot group codes are created.

The block $LUTerY\tau$ includes the hidden state code register. This register is controlled by the IMFs, together with the signals $Start$ and $Clock$. Obviously, the equivalent FSMs $U_{2}$ and $U_{3}$ have the same number of flip-flops in the RG. The outputs of the RG are the variables from the set $\tau = \tau_{G} \cup \tau_{S}$.

The block $LUTerOH$ is a code transformer. It turns maximum binary group codes into the corresponding one-hot group codes. This block is based on SBF (21).
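The behavior of such a code transformer can be modeled in a few lines. The sketch assumes that the group with index $k$ (counting from 1) has the binary code of $k - 1$ and that the first element of the input tuple is the most significant bit; both conventions and the name `group_code_to_one_hot` are illustrative assumptions.

```python
def group_code_to_one_hot(tau_g, kc: int):
    """Map a maximum binary group code to the one-hot code (g_1, ..., g_KC).

    tau_g is a tuple of bits, most significant bit first.
    """
    index = int("".join(str(bit) for bit in tau_g), 2)
    if index >= kc:
        raise ValueError("binary code is not assigned to any group")
    return tuple(1 if k == index else 0 for k in range(kc))
```

Every output $g_{k}$ depends only on the $R_{G}$ group variables, so each of them fits into a single LUT whenever $R_{G} \leq S_{L}$.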

To create the partition $\Pi_{AC}$, we use the greedy algorithm proposed in [34]. The compatible states are found using condition (10). As follows from Equations (17) and (18), it is necessary to reserve one of the LUT inputs for the corresponding one-hot group variable. Obviously, this can make it necessary to use more than a single basic LUT to implement the circuit of a partial Boolean function. So, it is possible that more LUTs would be necessary on the first logic level than for the equivalent FSM $U_{2}$.

For each group $G^{k} \in \Pi_{AC}$, let the SOP of any partial function $f_{i}^{k} \in D^{k} \cup Y^{k}$ depend on $NL_{i}^{k}$ literals, and let the following condition hold:

$$NL_{i}^{k} \leq S_{L} - 1. \tag{22}$$

In this case, including the variable $g_{k} \in \tau_{0}$ in any SOP does not lead to a multi-level circuit. Obviously, if condition (22) is violated, then the function $f_{i}^{k} \in D^{k} \cup Y^{k}$ is represented by a multi-LUT circuit.

To diminish the number of CLBs required for implementing the partial functions $f_{i}^{k} \in D^{k} \cup Y^{k}$, it makes sense to encode some states $a_{m} \in G^{k}$ using adjacent partial codes $CS(a_{m})$. This may be performed using, for example, a method similar to that of the algorithm JEDI [30].

For a better understanding of the novelty of the proposed method, it is necessary to show the main difference between the proposed method of state assignment and one-hot state assignment. One-hot codes include the maximum possible number of bits. This leads to a simplification of input memory functions due to a significant increase in the number of flip-flops used (compared to this number for FSMs using MBCs). An increase in the number of memory elements leads to an increase in the number of interconnections, which can significantly increase the FSM cycle time [16]. The method we propose allows for obtaining state codes where the number of bits is practically the same as for maximum binary codes. This allows for an increase in the FSM performance compared to the performance of equivalent OHC-based FSMs.

In this paper, we propose a synthesis method for the LUT-based Mealy FSM $U_{3}$. The synthesis process starts with an STG. The proposed method includes the following steps:

1. Transforming the initial STG into a state transition table.
2. Creating the partition $\Pi_{AC}$ of the set of states with the minimum number of groups.
3. Encoding the groups $G^{k} \in \Pi_{AC}$ using maximum binary codes $CG(G^{k})$.
4. Encoding the groups $G^{k} \in \Pi_{AC}$ using one-hot codes $OCG(G^{k})$.
5. Encoding the states $a_{m} \in G^{k}$ using partial state codes $CS(a_{m})$.
6. Creating the composite state codes $CC(a_{m})$.
7. Creating the tables of the blocks $LUTer1$–$LUTerKC$.
8. Deriving SBFs (17)–(18), representing the circuits of $LUTer1$–$LUTerKC$.
9. Creating a table for the block $LUTerY\tau$.
10. Deriving SBFs (19)–(20).
11. Creating a table for the block $LUTerOH$ and deriving SBF (21).
12. Implementing the FSM circuit using the resources of a particular FPGA chip.

5. Example of Synthesis of Mealy FSM $U_{3}$

Let us discuss an example of synthesis based on the proposed method. We applied this method to synthesize an LUT-based circuit for FSM A1. To obtain the circuit, we used LUTs with six inputs. The behavior of FSM A1 is represented by a state transition graph, shown in Figure 4.

Step 1. An analysis of Figure 4 shows that FSM $A1$ has the following characteristics: $M = 12$, $L = 7$, $N = 8$ and $H = 26$. This determines the sets $A = \{a_{1}, \dots, a_{12}\}$, $X = \{x_{1}, \dots, x_{7}\}$ and $Y = \{y_{1}, \dots, y_{8}\}$. Using the approach from [18], we can transform the STG (Figure 4) into the equivalent STT with $H = 26$ rows (Table 1).

Each arc of the STG represents a pair of vertices $\langle a_{m}, a_{S} \rangle$. The $h$th row of the STT corresponds to the $h$th arc of the STG. The columns of the STT include the following variables: $a_{m}$ is the current state (the arc's beginning); $a_{S}$ is the state of transition (the arc's end); $X_{h}$ is the input signal written above the $h$th arc (it determines the transition from $a_{m}$ into $a_{S}$); $Y_{h}$ is the collection of outputs written above the $h$th arc ($Y_{h} \subseteq Y$); $h$ is the number of the transition ($h \in \{1, \dots, H\}$). The transition from an STG to the equivalent STT is transparent and straightforward [18].

Step 2. This step is executed using the greedy algorithm proposed in [34]. We have $S_{L} = 6$. Applying the method from [34] gives us the partition $\Pi_{AC} = \{G^{1}, G^{2}, G^{3}\}$ with $KC = 3$.

The states $a_{m} \in A$ are distributed among the groups in the following way: $G^{1} = \{a_{1}, a_{2}, a_{3}, a_{5}\}$, $G^{2} = \{a_{4}, a_{6}, a_{7}, a_{11}\}$ and $G^{3} = \{a_{8}, a_{9}, a_{10}, a_{12}\}$. So, for each group, we have $NS_{k} = 4$. The group $G^{1}$ determines the partial sets $X^{1} = \{x_{1}, x_{6}, x_{7}\}$ and $Y^{1} = \{y_{1}, \dots, y_{7}\}$. The group $G^{2}$ determines the partial sets $X^{2} = \{x_{2}, x_{3}, x_{4}\}$ and $Y^{2} = \{y_{1}, y_{2}, y_{3}, y_{5}, y_{6}, y_{7}\}$. The group $G^{3}$ determines the partial sets $X^{3} = \{x_{2}, x_{4}, x_{7}\}$ and $Y^{3} = \{y_{1}, \dots, y_{5}, y_{7}, y_{8}\}$. Using (6), we obtain $R_{k} = 2$ ($k \in \{1, 2, 3\}$). Using (7), we obtain the value $R_{S} = 2$. We have $L_{k} = 3$ for each group $G^{k} \subset A$. Taking into account the variables $g_{k}$, we find that a single LUT with six inputs is enough to generate any partial function.
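The input budget of a first-level LUT can be checked directly from the values above. The count below assumes, as in functions (17) and (18), that each partial function of the group $G^{k}$ takes one one-hot group variable, the $R_{S}$ partial state variables and the $L_{k}$ inputs of $X^{k}$ as arguments.

```latex
1 + R_{S} + L_{k} = 1 + 2 + 3 = 6 \leq S_{L} = 6, \qquad k \in \{1, 2, 3\}.
```

The bound is met with equality, so no spare LUT input remains in any of the three groups.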

Step 3. We have $KC = 3$. Using (5), we obtain $R_{G} = 2$. This determines the set $\tau_{G} = \{\tau_{1}, \tau_{2}\}$. Now, we can form the set $\tau_{S} = \{\tau_{3}, \tau_{4}\}$. The group codes do not influence the number of LUTs in the blocks $LUTer1$–$LUTerKC$. Due to this, we can encode the groups in the following way: $CG(G^{1}) = 00$, $CG(G^{2}) = 01$ and $CG(G^{3}) = 10$.

Step 4. Obviously, three bits is enough to create the one-hot group codes. This determines the set

τ_{0} = {g_{1}, g_{2}, g_{3}}

. Let us encode the groups in the following way:

O C G (G^{1}) = 100

,

O C G (G^{2}) = 010

and

O C G (G^{3}) = 001

.
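A minimal Python sketch of the two group encodings used in Steps 3 and 4 is given below. It assumes that (5) has the usual form R_G = ceil(log2(KC)) and that the groups are encoded in their natural order; the function names are illustrative only and are not taken from the article.

```python
KC = 3  # number of groups in the partition

# Assumed form of (5): minimum number of bits of a maximum binary group code.
R_G = max(1, (KC - 1).bit_length())


def binary_group_code(k):
    """Maximum binary code CG(G^k): the number k-1 written with R_G bits."""
    return format(k - 1, f"0{R_G}b")


def one_hot_group_code(k):
    """One-hot code OCG(G^k): KC bits with a single 1 in position k."""
    return "".join("1" if i == k else "0" for i in range(1, KC + 1))


codes = {f"G{k}": (binary_group_code(k), one_hot_group_code(k))
         for k in range(1, KC + 1)}

for group, (cg, ocg) in codes.items():
    print(group, cg, ocg)
```

The binary code needs R_G = 2 bits, whereas the one-hot code needs KC = 3 bits; the extra bit is the price paid for removing the group variables from the second logic level.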

Step 5. In complex FSMs, it makes sense to encode states using the algorithm JEDI for optimal state encoding. This can lead to the optimization of the partial SOPs for output functions.

But in this very simple example, we encode states in the basic way. The method is the following: the smaller the subscript of a state, the more zeros its partial code includes. Using this approach, we can assign the following partial codes (Figure 5).

Step 6. In the Karnaugh map (Figure 5), we show these partial codes together with the maximum binary group codes. So, this map shows the composite state codes.

So, steps 1–6 are executed. Using the obtained codes, we can create tables representing the blocks LUTer1–LUTer3.

Step 7. In the discussed case, LUTer1 is represented by Table 2, LUTer2 by Table 3, and LUTer3 by Table 4. These tables have the columns

a_{m}

,

C S (a_{m})

,

a_{S}

,

C C (a_{S})

,

X_{h}

,

Y_{h}

,

D_{h}

and h. For each state, the columns

a_{m}

,

a_{S}

,

X_{h}

and

Y_{h}

are the same as they were in the initial STT (Table 1). The column

D_{h}

is created using the composite state codes from Figure 5.

Step 8. The outcome of the previous step provides information which can be used for creating the SOPs of partial IMFs (13) and partial microoperations (14). This is performed using the partial state codes, inputs

x_{l} \in X

and one-hot group variables. There are two stages in creating PBFs. The first stage is deriving the partial Boolean functions (11) and (12) from the tables of blocks LUTer1–LUTerKC. The second stage is multiplying each term of SOPs (11) and (12) by the corresponding one-hot group variables. So, the PBFs representing

L U T e r k

are multiplied by

g_{k} (k \in {1, \dots, K C})

. The resulting expressions represent SBFs (13) and (14).

For our example, we show only the SOPs resulting from the partial functions

y_{1}^{1}

and

D_{1}^{1}

,

y_{1}^{2}

and

D_{1}^{2}

, and

y_{1}^{3}

and

D_{1}^{3}

. These partial SOPs are represented by the following equations:

\begin{matrix} D_{1}^{1} = F_{10}^{1} = g_{1} τ_{3} τ_{4} \bar{x_{6}}; \\ D_{1}^{2} = F_{4}^{2} \lor F_{8}^{2} = g_{2} τ_{3} \bar{τ_{4}} x_{3} \lor g_{2} τ_{3} τ_{4} \bar{x_{4}}; \\ D_{1}^{3} = F_{1}^{3} \lor F_{2}^{3} \lor F_{6}^{3} = g_{3} \bar{τ_{3}} \bar{τ_{4}} \lor g_{3} \bar{τ_{3}} τ_{4} x_{7} \lor g_{3} τ_{3} \bar{τ_{4}} \bar{x_{3}} \bar{x_{4}} . \end{matrix}

(23)

\begin{matrix} y_{1}^{1} = F_{1}^{1} \lor F_{8}^{1} = g_{1} \bar{τ_{3}} \bar{τ_{4}} x_{1} \lor g_{1} τ_{3} \bar{τ_{4}} \bar{x_{1}} \bar{x_{7}}; \\ y_{1}^{2} = F_{4}^{2} = g_{2} τ_{3} \bar{τ_{4}} x_{3}; \\ y_{1}^{3} = F_{3}^{3} \lor F_{6}^{3} = g_{3} \bar{τ_{3}} τ_{4} \bar{x_{7}} \lor g_{3} τ_{3} \bar{τ_{4}} \bar{x_{3}} \bar{x_{4}} . \end{matrix}

(24)

In these systems, variables

F_{h}^{k}

are product terms created from the lines, h, of tables representing blocks LUTer1–LUTerKC. We hope there is a transparent connection between SBFs (23) and (24) and Table 2, Table 3 and Table 4.
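The two-stage construction of the partial SOPs can be illustrated for block LUTer1. The sketch below is not the authors' tool: it transcribes the partial state codes, input literals and asserted functions of Table 2 into a Python list and multiplies every product term by the one-hot variable g_k. The names `LUTER1` and `partial_sop`, and the textual notation (`t3`, `t4` for τ_3, τ_4 and `~` for negation), are assumptions made for this illustration.

```python
# Rows of the LUTer1 table: (h, partial state code, input literals, functions equal to 1).
LUTER1 = [
    (1, "00", ["x1"], {"D2", "y1", "y2"}),
    (2, "00", ["~x1", "x6"], {"D3", "y3"}),
    (3, "00", ["~x1", "~x6"], {"D4", "y2", "y5"}),
    (4, "01", ["x6"], {"D4", "y3", "y4"}),
    (5, "01", ["~x6"], {"D3", "D4", "y6", "y7"}),
    (6, "10", ["x1"], {"D2", "y7"}),
    (7, "10", ["~x1", "x7"], {"D4", "y3"}),
    (8, "10", ["~x1", "~x7"], {"D2", "D4", "y1", "y2"}),
    (9, "11", ["x6"], {"D2", "D4", "y3"}),
    (10, "11", ["~x6"], {"D1", "D4", "y6", "y7"}),
]


def partial_sop(table, k, func):
    """Return the terms F_h^k of the partial function func^k as (h, term) pairs."""
    terms = []
    for h, code, inputs, funcs in table:
        if func not in funcs:
            continue
        # Stage 1: conjunction of the partial state code (t3, t4) and the input literals.
        state = [("" if bit == "1" else "~") + f"t{i}" for i, bit in zip((3, 4), code)]
        # Stage 2: multiply the term by the one-hot group variable g_k.
        terms.append((h, " ".join([f"g{k}"] + state + inputs)))
    return terms


print(partial_sop(LUTER1, 1, "D1"))
print(partial_sop(LUTER1, 1, "y1"))
```

For D_1 and y_1 the resulting terms correspond to the first lines of systems (23) and (24), respectively.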

Step 9. To obtain the final values of both the IMFs and FSM outputs, it is necessary to create a table for the function assembler. This is a table for the block

L U T e r Y τ

. The table contains

R_{C C} + N

rows. This table provides information about the disjunctions of PBFs. Using this information, the final values of the FSM outputs and IMFs are obtained. There are the following columns in this table:

f_{i}

, 1, …,

K C

. The row number i includes the function

f_{i} \in D \cup Y

. The column number k corresponds to the partial function

f_{i}^{k} \in D^{k} \cup Y^{k}

. If the function

f_{i} \in D \cup Y

is generated by the block

L U T e r k

, then there is a value of 1 at the intersection of the column k and row i. Otherwise, the symbol 0 is written at this intersection. In the discussed case, the block

L U T e r Y τ

is represented by Table 5.

Step 10. Using Table 5 allows us to obtain a system of disjunctions representing functions

f_{i} \in D \cup Y

. In the discussed case, some IMFs are represented by SBF (25), and some outputs are represented by SBF (26):

D_{1} = D_{1}^{1} \lor D_{1}^{2} \lor D_{1}^{3}; \dots; D_{4} = D_{4}^{1} \lor D_{4}^{2} \lor D_{4}^{3};

(25)

y_{1} = y_{1}^{1} \lor y_{1}^{2} \lor y_{1}^{3}; \dots; y_{7} = y_{7}^{1} \lor y_{7}^{2} \lor y_{7}^{3}; y_{8} = y_{8}^{3}

(26)

Obviously, using a similar approach, we can obtain the SOPs of all the functions from SBFs (19) and (20).
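The table of the function assembler and the disjunctions derived from it can be reproduced mechanically from the partial output sets Y^1 to Y^3 listed in Step 2. The Python sketch below is a hypothetical helper, not part of the original design flow. Following Table 5, it assumes that every block LUTerk contributes to every IMF, and it takes R_CC = 4 and N = 8 from the example.

```python
KC = 3    # number of groups (blocks LUTer1..LUTerKC)
R_CC = 4  # bits in a composite state code in the example (assumed R_G + R_S)
N = 8     # number of FSM outputs in the example (y1..y8)

# Indices n of the outputs y_n produced by each block (sets Y^1, Y^2, Y^3 from Step 2).
PARTIAL_OUTPUTS = {
    1: {1, 2, 3, 4, 5, 6, 7},
    2: {1, 2, 3, 5, 6, 7},
    3: {1, 2, 3, 4, 5, 7, 8},
}

FUNCTIONS = [f"D{r}" for r in range(1, R_CC + 1)] + [f"y{n}" for n in range(1, N + 1)]


def assembler_row(f):
    """Row of the assembler table: 1 in column k if block LUTerk generates f^k."""
    if f.startswith("D"):
        return [1] * KC  # assumption taken from Table 5: all blocks generate all IMFs
    n = int(f[1:])
    return [int(n in PARTIAL_OUTPUTS[k]) for k in range(1, KC + 1)]


def disjunction(f):
    """Textual disjunction of the partial functions that make up f."""
    parts = [f"{f}^{k}" for k, bit in enumerate(assembler_row(f), start=1) if bit]
    return f"{f} = " + " v ".join(parts)


for f in FUNCTIONS:
    print(assembler_row(f), disjunction(f))
```

The table has R_CC + N = 12 rows; a function with a single 1 in its row, such as y_8, needs no assembling LUT at all.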

Step 11. To construct SBF (21), it is necessary to create a table for the block

L U T e r O H

. This table includes the columns

G^{k}

,

C G (G^{k})

,

O C G (G^{k})

,

τ_{0}

and k. In the discussed case, the code transformer is represented by Table 6.

The following SBF is derived from Table 6:

g_{1} = \bar{τ_{1}} \bar{τ_{2}}; g_{2} = τ_{2}; g_{3} = τ_{1} .

(27)

As follows from SBF (27), there is only a single LUT in the circuit of

L U T e r O H

. The variables

g_{2}

and

g_{3}

are exactly the same as group variables

τ_{2}

and

τ_{1}

, respectively.
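SBF (27) is small enough to be checked exhaustively. The sketch below (plain Python, with a hypothetical function name `to_one_hot`) evaluates the three equations for the group codes of Table 6. The unused code 11 is treated as a don't-care and is not checked.

```python
def to_one_hot(tau1, tau2):
    """Evaluate SBF (27): g1 = not(tau1) and not(tau2), g2 = tau2, g3 = tau1."""
    g1 = int(not tau1 and not tau2)
    g2 = int(tau2)
    g3 = int(tau1)
    return f"{g1}{g2}{g3}"


# Binary group codes CG(G^k) and the one-hot codes OCG(G^k) expected from Table 6.
TABLE6 = {"00": "100", "01": "010", "10": "001"}

for cg, ocg in TABLE6.items():
    tau1, tau2 = (bit == "1" for bit in cg)
    assert to_one_hot(tau1, tau2) == ocg

print("SBF (27) reproduces all one-hot group codes of Table 6")
```

Only g_1 depends on more than one variable, which agrees with the statement that a single LUT implements the code transformer in this example.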

Step 12. To implement the FPGA-based circuit, it is necessary to apply some industrial CAD tools. These tools are necessary to execute the technology mapping [3,35]. The most popular tools are Vivado [36] and Quartus [37]. We do not discuss this step for our example.

To analyse the efficiency of the proposed approach compared to some other known methods, we conducted a series of experiments. We compared the characteristics of

U_{3}

-based FSMs with the characteristics of their counterparts based on maximum binary state codes, one-hot state codes and composite state codes. In the next section, we show the results of our experiments and their analysis.

6. Experimental Results

We conducted experiments to compare four characteristics of

U_{3}

-based FSM circuits with the characteristics of FSM circuits based on other state assignment methods. These characteristics were the cycle times, maximum operating frequencies, LUT counts and area–time products. As an example of an MBC-based method, we use Vivado’s Auto method [36]. Let us point out that the number of code bits used by Auto may exceed the minimum value determined by (1). Vivado’s one-hot method [36] was used as an example of OHC-based state assignment. Also, we compared our approach with the algorithm JEDI [30], which is an MBC-based algorithm creating cubes covering the codes of some states [3]. Finally, we executed a comparison of

U_{3}

-based FSM circuits with their

U_{2}

-based counterparts [11].

In the experiments, we used benchmark FSMs from the well-known library LGSynth93 [38]. There are 48 Mealy FSMs (benchmarks) in the library. The benchmarks are represented in the KISS2 format. These benchmarks have a wide range of basic characteristics (numbers of states, inputs, outputs and transitions). Many researchers have used these benchmarks as a basis for the comparison of various design methods [39,40,41]. The characteristics of the benchmarks are shown in Table 7. The last column of this table includes the sum of the number of FSM inputs and the minimum number of bits in the maximum binary state codes. We used this additional column to select benchmarks where structural decomposition made sense.

To conduct the experiments, we used a platform including the FPGA chip from the Virtex-7 family. This was a VC709 Evaluation Platform (xc7vx690tffg1761-2) [22]. The CLBs used included basic LUTs with six inputs

(S_{L} = 6)

. To implement a CLB-based circuit of

L U T e r Y τ

, the dedicated multiplexers could be used. Each CLB included four basic LUTs and three dedicated multiplexers. We used the industrial CAD package Vivado v2019.1 (64-bit) [36] to execute the step of technology mapping. Using the Vivado reports, we created four tables showing the results of the conducted experiments.

Each Boolean function,

f_{i} \in D \cup Y

, depended on

N L_{i}

arguments. If condition (4) was violated for all IMFs and FSM outputs, then the circuit of FSM

U_{1}

consisted of exactly

R + N

LUTs. So, each function,

f_{i} \in D \cup Y

, was generated by a single-level circuit. In this case, the

U_{1}

-based FSM circuit possessed the best spatial and temporal characteristics. This model provided the minimum values of the LUT count and propagation time. So, if condition (4) is violated for all input memory functions and FSM outputs, there is no need to use either functional or structural decomposition. Obviously, the states may be encoded in a way minimizing the power consumption [24,27].

As follows from the results of the experiments shown in article [11], condition (4) only held for some benchmarks. These benchmarks had the following common property:

L + R > 2 S_{L} .

(28)

For FSMs satisfying (28), it made sense to replace

U_{1}

-based FSMs with FSMs with composite state assignment. We could use either the

U_{2}

or

U_{3}

models. For FPGAs from the Virtex 7 family, the basic LUT had

S_{L} = 6

inputs. Due to this, in our research, we only used benchmarks for which

L + R > 12

. The experimental results are shown in Table 8 (cycle times, nanoseconds), Table 9 (maximum operating frequency, MHz), Table 10 (LUT counts) and Table 11 (area–time products).
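The selection rule (28) is easy to apply to Table 7. The following Python sketch uses a handful of rows from that table and assumes, in line with the description of its last column, that R is the minimum number of bits of a maximum binary state code, R = ceil(log2(M)). The dictionary and function names are illustrative.

```python
S_L = 6  # inputs of the basic LUT for the Virtex-7 family

# A few rows of Table 7: benchmark -> (L, M) = (number of inputs, number of states).
BENCHMARKS = {
    "bbsse": (7, 16),
    "ex1": (9, 20),
    "keyb": (7, 19),
    "kirkman": (12, 16),
    "s298": (3, 218),
    "s510": (19, 47),
    "tma": (8, 20),
}


def state_bits(m):
    """Minimum number of bits R needed to encode m states with maximum binary codes."""
    return max(1, (m - 1).bit_length())


def satisfies_28(l, m, s_l=S_L):
    """Condition (28): L + R > 2 * S_L."""
    return l + state_bits(m) > 2 * s_l


selected = sorted(name for name, (l, m) in BENCHMARKS.items() if satisfies_28(l, m))
print(selected)
```

Note that keyb, with L + R = 12, is not selected, because the inequality in (28) is strict.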

Table 8, Table 9, Table 10 and Table 11 include the following columns: FSM (names of benchmark FSMs); MB (experimental results for MBC-based FSMs); OH (experimental results for OHC-based FSMs); JEDI (experimental results for FSMs with state codes generated by JEDI);

U_{2}

(experimental results for FSMs with composite state codes); and

U_{3}

(experimental results obtained for our new approach). The results of the summation of values from the corresponding columns are shown in the row “Total”. The row “Percentage” shows the percentage of the summarized characteristics of the investigated FSM circuits compared to those of

U_{3}

-based FSMs.

The main goal of the proposed method was increasing the performance of LUT-based FSM circuits in relation to the circuits of FSMs based on composite state codes. So, the most important characteristic was the cycle time (or operating frequency). The values for the cycle times were obtained directly from Vivado reports. These values are shown in Table 8.

As follows from Table 8, the proposed method made it possible to obtain LUT-based FSM circuits with shorter cycle times than those of the other studied FSMs. As can be seen from Table 8,

U_{3}

-based FSMs had the following improvements in cycle times: (1) 82.85% compared with MBC-based FSMs; (2) 84.37% compared with OHC-based FSMs; (3) 60.28% compared with JEDI-based FSMs; and (4) 8.81% compared with

U_{2}

-based FSMs.
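The "Percentage" row of Table 8 follows from the "Total" row by normalising each total to the total of the U_3-based circuits. The short Python sketch below reproduces this arithmetic. It reads the totals as nanoseconds with three decimal places, which is an assumption about the number format of the table that does not affect the ratios, and the names used are illustrative.

```python
# Totals of the cycle-time columns of Table 8 (assumed to be in ns).
TOTAL_CYCLE_TIME = {
    "MB": 103.937,
    "OH": 104.800,
    "JEDI": 91.106,
    "U2": 61.851,
    "U3": 56.843,
}


def percentage_row(totals, baseline="U3"):
    """Express every total as a percentage of the baseline total."""
    return {name: round(100.0 * value / totals[baseline], 2)
            for name, value in totals.items()}


percentages = percentage_row(TOTAL_CYCLE_TIME)

# Extra cycle time of each method relative to the U3-based circuits, in percent.
extra_time = {name: round(p - 100.0, 2)
              for name, p in percentages.items() if name != "U3"}

print(percentages)
print(extra_time)
```

The same normalisation applies to the "Percentage" rows of the LUT-count and area-time tables.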

An analysis of Table 8 shows that, for every benchmark considered, our method produced circuits that were faster than the circuits of the equivalent FSMs with composite state codes, but the level of improvement depended on the value of

L + R

. For example, we found

L + R = 13

for the benchmark

t m a

. As follows from Table 8, our method provided an improvement of around 1%. Next, we found

L + R = 27

for the benchmark

s 510

. As follows from Table 8, our method provided an improvement of around 19%. As we compare the results for each benchmark, we can make the following conclusion: the improvement achieved by replacing composite state codes with mixed CSCs increased with the growth of the value

L + R

.

We think that this phenomenon was related to the difference in the number of CLBs connected in series in the circuit creating the input memory functions and outputs of Mealy FSMs. As the value of

L + R

increased, the number of consecutive CLBs grew faster for CSC-based FSMs than for the equivalent MCSC-based FSMs.

Using the values for the cycle time allowed us to create Table 9. This table shows the values of the maximum operating frequencies.

As follows from Table 9, the proposed method produced FPGA-based FSM circuits with higher frequencies than the other studied FSMs. As can be seen from Table 9,

U_{3}

-based FSMs achieved the following improvements in the frequency: (1) 45.20% compared with Auto-based FSMs; (2) 45.49% compared with OHC-based FSMs; (3) 37.44% compared with JEDI-based FSMs; and (4) 7.79% compared with

U_{2}

-based FSMs. The reasons for this state of affairs have already been considered in the analysis of the data from Table 8. We will not repeat them.
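The entries of Table 9 are the reciprocals of the cycle times in Table 8. As a sketch, assuming that the cycle time is expressed in nanoseconds (for example, 6.625 ns for ex1 with MB codes), the conversion to MHz can be written as follows; the function name is hypothetical.

```python
def max_frequency_mhz(cycle_time_ns):
    """Maximum operating frequency in MHz for a cycle time given in ns."""
    if cycle_time_ns <= 0:
        raise ValueError("cycle time must be positive")
    return 1000.0 / cycle_time_ns


# Cycle times of benchmark ex1 from Table 8 (assumed unit: ns).
EX1 = {"MB": 6.625, "U2": 3.620, "U3": 3.410}

freq = {name: round(max_frequency_mhz(t), 2) for name, t in EX1.items()}
print(freq)
```

Small differences in the last digit with respect to Table 9 are to be expected, because the cycle times reported in Table 8 are themselves rounded.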

The main goal of the proposed approach is to improve the temporal characteristics of LUT-based FSM circuits in relation to the circuits of the equivalent CSC-based FSMs, but sometimes, it is very important that an increase in the frequency does not result in a significant deterioration of the spatial characteristics. These characteristics depend significantly on the number of LUTs used (LUT count) [12]. A comparison of the values of the LUT counts for FSM circuits based on various state assignment methods is shown in Table 10.

As follows from Table 10, the proposed approach led to the creation of FSM circuits that required, in total, fewer LUTs than the other investigated methods (only for the benchmarks s510 and s832 did the circuits based on composite state codes require two LUTs fewer). Our method had an average improvement of (1) 55.06% compared with Auto-based FSMs, (2) 68.35% compared with OHC-based FSMs, (3) 26.12% compared with JEDI-based FSMs, and (4) 4% compared with FSMs based on composite state codes.

Sometimes, it is very important to reach a balance between the spatial and temporal characteristics of FSM circuits [19]. This balance can be estimated using some integral evaluations of the quality of a digital circuit. One of the main integral evaluations involves the product of the area occupied by a circuit and its performance (area–time product) [12]. In the case of LUT-based circuits, the area is estimated using a characteristic such as the LUT count [12]. Obviously, the performance is represented by the value of the cycle time. The smaller the value of this integral characteristic is, the higher the quality of the circuit (and the better the balance between the spatial and temporal characteristics). Using Vivado reports, we created Table 11, including values of the area–time products obtained for the circuits of the benchmark FSMs.

As follows from Table 11, the proposed method made it possible to obtain LUT-based FSM circuits with smaller values for the area–time products compared with their

U_{2}

-based counterparts. The

U_{2}

-based FSMs were inferior in relation to the equivalent

U_{3}

-based FSMs (the difference was 12.19%). Also, our method provided significantly better results than those for the other methods studied. The improvement was 189.39% in relation to MB-based FSMs, 216.87% in relation to OH-based FSMs and 104.06% in relation to JEDI-based FSMs.
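The area-time product used in Table 11 is the LUT count multiplied by the cycle time. A minimal Python sketch for the benchmark s510, using the LUT counts from Table 10 and assuming cycle times of 4.581 ns and 3.729 ns from Table 8, shows how a circuit with slightly more LUTs can still have the smaller product. The names below are illustrative.

```python
def area_time_product(lut_count, cycle_time_ns):
    """Area-time product: LUT count multiplied by the cycle time (in ns)."""
    return lut_count * cycle_time_ns


# Benchmark s510: (LUT count from Table 10, cycle time from Table 8, assumed in ns).
S510 = {"U2": (26, 4.581), "U3": (28, 3.729)}

atp = {name: round(area_time_product(luts, t), 2)
       for name, (luts, t) in S510.items()}

# Relative advantage of the U3-based circuit, in percent of its own product.
gain_percent = round(100.0 * (atp["U2"] / atp["U3"] - 1.0), 2)

print(atp, gain_percent)
```

In this case the shorter cycle time outweighs the two additional LUTs, so the integral characteristic still favours the U_3-based circuit.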

An analysis of the results of the experiments carried out allowed us to draw the following conclusion regarding the equivalent

U_{2}

- and

U_{3}

-based FSMs. If condition (28) holds, then our method improves on the method based on composite state codes [11]: for all the benchmarks considered, it provides a shorter cycle time, and on average it requires fewer LUTs than the method based on CSCs. So, mixed composite state codes can be viewed as a good alternative to maximum binary CSCs.

7. Conclusions

One of the basic problems associated with LUT-based FSM design is the problem of improving the temporal characteristics of the produced circuits. Under certain conditions, the best spatial characteristics are provided if FSM states are encoded by composite state codes [11]. This method is based on finding a partition of the set of states according to the minimum number of groups of compatible states. In the composite state codes proposed in article [11], there was the minimum possible number of bits in the group codes. Within each group, states were encoded by partial maximum binary codes. To increase the maximum operating frequency of CSC-based FSM circuits, we propose encoding groups using one-hot codes.

As the conducted experiments show, the proposed method provides a reduction in the value of the cycle time compared with that of the equivalent CSC-based FSMs. This reduction is possible for rather complex FSMs for which the total number of inputs and state variables exceeds twice the number of inputs of the base LUT. For example, for FSMs whose circuits are implemented using six-input LUTs, an improvement in the temporal characteristics is possible if

L + R > 12

. The proposed method has the following positive feature: an increase in performance (an increase in the maximum operating frequency) is accompanied by a slight decrease in the LUT count.

As shown by the experiments reported in the previous section, the proposed method should only be applied to FSMs for which condition (28) is met. If the sum of the number of FSM inputs and the number of state code bits does not exceed twice the number of LUT inputs, then the use of this method does not make sense. In this case, it is advisable to use the other methods of state coding discussed in this article. An analysis of the tables containing the results of the studies shows that the improvements from the application of the proposed method increase as the value of the parameter

L + R

increases.

So, the proposed state assignment method allows for an increase in the FSM performance without increasing the amount of internal chip resources used. The results of the investigations show that the proposed method can be used for designing rather complex FSMs.

Author Contributions

Conceptualization, A.B., L.T. and K.M.; methodology, A.B., L.T. and K.M.; formal analysis, A.B., L.T. and K.M.; writing–original draft preparation, A.B., L.T. and K.M.; writing—review and editing, A.B., L.T. and K.M.; supervision, A.B. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available in the article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Baranov, S. Finite State Machines and Algorithmic State Machines; Amazon: Seattle, WA, USA, 2018; p. 185.
2. Kubica, M.; Kania, D. Technology mapping oriented to adaptive logic modules. Bull. Pol. Acad. Sci. Tech. Sci. 2019, 67, 947–956.
3. Kubica, M.; Opara, A.; Kania, D. Technology Mapping for LUT-Based FPGA; Springer: Berlin/Heidelberg, Germany, 2021.
4. Kuon, I.; Tessier, R.; Rose, J. FPGA Architecture: Survey and Challenges. Found. Trends Electron. Des. Autom. 2008, 2, 135–253.
5. Ruiz-Rosero, J.; Ramirez-Gonzalez, G.; Khanna, R. Field Programmable Gate Array Applications—A Scientometric Review. Computation 2019, 7, 63.
6. Gazi, O.; Arli, A. State Machines using VHDL: FPGA Implementation of Serial Communication and Display Protocols; Springer: Berlin/Heidelberg, Germany, 2021; p. 326.
7. Barkalov, A.; Titarenko, L.; Krzywicki, K. Structural Decomposition in FSM Design: Roots, Evolution, Current State—A Review. Electronics 2021, 10, 44.
8. Amagasaki, M.; Shibata, Y. FPGA Structure; Springer: Singapore, 2018; pp. 47–86.
9. Senhadji-Navarro, R.; Garcia-Vargas, I. High-Performance Architecture for Binary-Tree-Based Finite State Machines. IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 2018, 37, 796–805.
10. Grout, I. Digital Systems Design with FPGAs and CPLDs; Elsevier Science: Amsterdam, The Netherlands, 2011; p. 718.
11. Barkalov, A.; Titarenko, L.; Krzywicki, K. Improving Characteristics of LUT-Based Sequential Blocks for Cyber-Physical Systems. Energies 2022, 15, 2636.
12. Islam, M.M.; Hossain, M.; Shahjalal, M.; Hasan, M.K.; Jang, Y.M. Area-Time Efficient Hardware Implementation of Modular Multiplication for Elliptic Curve Cryptography. IEEE Access 2020, 8, 73898–73906.
13. Mazur, P.; Czerwinski, R.; Chmiel, M. PLC implementation in the form of a System-on-a-Chip. Bull. Pol. Acad. Sci. Tech. Sci. 2020, 68, 1263–1273.
14. Baranov, S. From Algorithm to Digital System: HLS and RTL Tool Synthagate in Digital System Design; Amazon: Boston, MA, USA, 2020; p. 76.
15. Salauyou, V. Area and Performance Estimates of Finite State Machines in Reconfigurable Systems. Appl. Sci. 2024, 14, 11833.
16. Barkalov, A.; Titarenko, L.; Krzywicki, K. Logic Synthesis for FPGA-Based Mealy Finite State Machines: Structural Decomposition in Logic Design; Taylor & Francis Group, CRC Press: Boca Raton, FL, USA, 2024; p. 332.
17. Salauyou, V.; Bułatow, W. Optimized Sequential State Encoding Methods for Finite-State Machines in Field-Programmable Gate Array Implementations. Appl. Sci. 2024, 14, 5594.
18. De Micheli, G. Synthesis and Optimization of Digital Circuits; McGraw–Hill: New York, NY, USA, 1994; p. 578.
19. Teren, V.; Cortadella, J.; Villa, T. Generation of synchronizing state machines from a transition system: A region-based approach. Int. J. Appl. Math. Comput. Sci. 2023, 33, 133–149.
20. Baranov, S. High Level Synthesis of Digital Systems; Amazon: Seattle, WA, USA, 2018; p. 207.
21. Trimberger, S. Field-Programmable Gate Array Technology; Springer: New York, NY, USA, 2012.
22. Xilinx. VC709 Evaluation Board for the Virtex-7 FPGA; Xilinx: San Jose, CA, USA, 2023.
23. Chapman, K. Multiplexer Design Techniques for Datapath Performance with Minimized Routing Resources; Xilinx All Programmable: San Jose, CA, USA, 2014; pp. 1–32.
24. Barkalov, A.; Titarenko, L.; Bieganowski, J.; Krzywicki, K. Basic Approaches for Reducing Power Consumption in Finite State Machine Circuits—A Review. Appl. Sci. 2024, 14, 2693.
25. Skliarova, I.; Sklyarov, V.; Sudnitson, A. Design of FPGA-Based Circuits Using Hierarchical Finite State Machines; TUT Press: Tallinn, Estonia, 2012.
26. Sklyarov, V. Synthesis and Implementation of RAM-based Finite State Machines in FPGAs. In Field-Programmable Logic and Applications: The Roadmap to Reconfigurable Computing; Hartenstein, R.W., Grünbacher, H., Eds.; Springer: Berlin/Heidelberg, Germany, 2000; pp. 718–727.
27. Opara, A.; Kubica, M. Technology mapping of multi-output functions leading to the reduction of dynamic power consumption in FPGAs. Int. J. Appl. Math. Comput. Sci. 2023, 33, 267–284.
28. Park, J.; Yoo, H. Area-Efficient Differential Fault Tolerance Encoding for Finite State Machines. Electronics 2020, 9, 1110.
29. Barkalov, A.; Titarenko, L.; Krzywicki, K.; Saburova, S. Improving Characteristics of LUT-Based Mealy FSMs with Twofold State Assignment. Electronics 2021, 10, 901.
30. Sentovich, E.; Singh, K.; Lavagno, L.; Moon, C.; Murgai, R.; Saldanha, A.; Savoj, H.; Stephan, P.; Brayton, R.; Sangiovanni-Vincentelli, A. SIS: A System for Sequential Circuit Synthesis; Technical Report; University of California: Berkeley, CA, USA, 1992.
31. Tatalov, E. Synthesis of Compositional Microprogram Control Units for Programmable Devices. Master’s Thesis, Donetsk National Technical University, Donetsk, Ukraine, 2011.
32. Barkalov, O.; Titarenko, L.; Mielcarek, K. Improving characteristics of LUT-based Mealy FSMs. Int. J. Appl. Math. Comput. Sci. 2020, 30, 745–759.
33. Sasao, T.; Mishchenko, A. LUTMIN: FPGA Logic Synthesis with MUX-Based and Cascade Realizations; ResearchGate: Berlin, Germany, 2009.
34. Barkalov, O.; Titarenko, L.; Mielcarek, K. Hardware reduction for LUT-based Mealy FSMs. Int. J. Appl. Math. Comput. Sci. 2018, 28, 595–607.
35. Mishchenko, A.; Chatterjee, S.; Brayton, R. Improvements to technology mapping for LUT-based FPGAs. IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 2007, 26, 240–253.
36. Xilinx. Vivado Design Suite User Guide: Synthesis; UG901 (v2019.1). 2025. Available online: https://www.xilinx.com/support/documentation/sw_manuals/xilinx2019_1/ug901-vivado-synthesis.pdf (accessed on 1 January 2025).
37. Quartus Prime. 2025. Available online: https://www.intel.pl/content/www/pl/pl/software/programmable/quartus-prime/overview.html (accessed on 1 January 2025).
38. McElvain, K. LGSynth93 Benchmark Set. Version 4.0. 1993. Available online: https://people.engr.ncsu.edu/brglez/CBL/benchmarks/LGSynth93/LGSynth93.tar (accessed on 1 February 2018).
39. Feng, W.; Greene, J.; Mishchenko, A. Improving FPGA Performance with a S44 LUT Structure. In Proceedings of the 2018 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, FPGA ’18, Monterey, CA, USA, 25–27 February 2018; Association for Computing Machinery: New York, NY, USA, 2018; pp. 61–66.
40. Kubica, M.; Opara, A.; Kania, D. Logic Synthesis for FPGAs Based on Cutting of BDD. Microprocess. Microsyst. 2017, 52, 173–187.
41. Kubica, M.; Kania, D.; Kulisz, J. A Technology Mapping of FSMs Based on a Graph of Excitations and Outputs. IEEE Access 2019, 7, 16123–16131.

Figure 1. Architecture of FPGA-based FSM

U_{1}

.

Figure 2. Architecture of CSC-based FSM

U_{2}

.

Figure 3. Architecture of MCSC-based Mealy FSM

U_{3}

.

Figure 4. State transition graph of Mealy FSM

A 1

.

Figure 5. Outcome of composite state assignment for FSM A1.

Table 1. State transition table for FSM

A 1

.

$a_{m}$	$a_{S}$	$X_{h}$	$Y_{h}$	h
$a_{1}$	$a_{4}$	$x_{1}$	$y_{1} y_{2}$	1
	$a_{3}$	$\bar{x_{1}} x_{6}$	$y_{3}$	2
	$a_{2}$	$\bar{x_{1}} \bar{x_{6}}$	$y_{2} y_{5}$	3
$a_{2}$	$a_{2}$	$x_{6}$	$y_{3} y_{4}$	4
$a_{2}$	$a_{5}$	$\bar{x_{6}}$	$y_{6} y_{7}$	5
$a_{3}$	$a_{4}$	$x_{1}$	$y_{7}$	6
	$a_{2}$	$\bar{x_{1}} x_{7}$	$y_{3}$	7
	$a_{6}$	$\bar{x_{1}} \bar{x_{7}}$	$y_{1} y_{2}$	8
$a_{4}$	$a_{1}$	$x_{3}$	$y_{6}$	9
$a_{4}$	$a_{3}$	$\bar{x_{3}}$	$y_{2} y_{3}$	10
$a_{5}$	$a_{6}$	$x_{6}$	$y_{3}$	11
$a_{5}$	$a_{9}$	$\bar{x_{6}}$	$y_{6} y_{7}$	12
$a_{6}$	$a_{4}$	1	$y_{3} y_{5}$	13
$a_{7}$	$a_{8}$	$x_{3}$	$y_{1} y_{7}$	14
	$a_{5}$	$\bar{x_{3}} x_{2}$	$y_{2}$	15
	$a_{11}$	$\bar{x_{3}} \bar{x_{2}}$	$y_{3}$	16
$a_{8}$	$a_{9}$	1	$y_{2} y_{4}$	17
$a_{9}$	$a_{8}$	$x_{7}$	$y_{3}$	18
$a_{9}$	$a_{7}$	$\bar{x_{7}}$	$y_{1} y_{2}$	19
$a_{10}$	$a_{11}$	$x_{3}$	$y_{2} y_{8}$	20
	$a_{5}$	$\bar{x_{3}} x_{4}$	$y_{5} y_{7}$	21
	$a_{12}$	$\bar{x_{3}} \bar{x_{4}}$	$y_{1} y_{5}$	22
$a_{11}$	$a_{5}$	$x_{4}$	$y_{2} y_{6}$	23
$a_{11}$	$a_{10}$	$\bar{x_{4}}$	$y_{3}$	24
$a_{12}$	$a_{5}$	$x_{4}$	$y_{3}$	25
$a_{12}$	$a_{1}$	$\bar{x_{4}}$	–	26

Table 2. Table for

L U T e r 1

of FSM

A 1

.

$a_{m}$	$CS (a_{m})$	$a_{S}$	$CC (a_{S})$	$X_{h}$	$D_{h}$	$Y_{h}$	h
$a_{1}$	00	$a_{4}$	0100	$x_{1}$	$D_{2}$	$y_{1} y_{2}$	1
		$a_{3}$	0010	$\bar{x_{1}} x_{6}$	$D_{3}$	$y_{3}$	2
		$a_{2}$	0001	$\bar{x_{1}} \bar{x_{6}}$	$D_{4}$	$y_{2} y_{5}$	3
$a_{2}$	01	$a_{2}$	0001	$x_{6}$	$D_{4}$	$y_{3} y_{4}$	4
$a_{2}$	01	$a_{5}$	0011	$\bar{x_{6}}$	$D_{3} D_{4}$	$y_{6} y_{7}$	5
$a_{3}$	10	$a_{4}$	0100	$x_{1}$	$D_{2}$	$y_{7}$	6
		$a_{2}$	0001	$\bar{x_{1}} x_{7}$	$D_{4}$	$y_{3}$	7
		$a_{6}$	0101	$\bar{x_{1}} \bar{x_{7}}$	$D_{2} D_{4}$	$y_{1} y_{2}$	8
$a_{5}$	11	$a_{6}$	0101	$x_{6}$	$D_{2} D_{4}$	$y_{3}$	9
$a_{5}$	11	$a_{9}$	1001	$\bar{x_{6}}$	$D_{1} D_{4}$	$y_{6} y_{7}$	10

Table 3. Table for

L U T e r 2

of FSM

A 1

.

$a_{m}$	$CS (a_{m})$	$a_{S}$	$CC (a_{S})$	$X_{h}$	$D_{h}$	$Y_{h}$	h
$a_{4}$	00	$a_{1}$	0000	$x_{3}$	–	$y_{6}$	1
$a_{4}$	00	$a_{3}$	0010	$\bar{x_{3}}$	$D_{3}$	$y_{2} y_{3}$	2
$a_{6}$	01	$a_{4}$	0100	1	$D_{2}$	$y_{3} y_{5}$	3
$a_{7}$	10	$a_{8}$	1000	$x_{3}$	$D_{1}$	$y_{1} y_{7}$	4
		$a_{5}$	0011	$\bar{x_{3}} x_{2}$	$D_{3} D_{4}$	$y_{2}$	5
		$a_{11}$	0111	$\bar{x_{3}} \bar{x_{2}}$	$D_{2} D_{3} D_{4}$	$y_{3}$	6
$a_{11}$	11	$a_{5}$	0011	$x_{4}$	$D_{3} D_{4}$	$y_{2} y_{6}$	7
$a_{11}$	11	$a_{10}$	1010	$\bar{x_{4}}$	$D_{1} D_{3}$	$y_{3}$	8

Table 4. Table for

L U T e r 3

of FSM

A 1

.

$a_{m}$	$CS (a_{m})$	$a_{S}$	$CC (a_{S})$	$X_{h}$	$D_{h}$	$Y_{h}$	h
$a_{8}$	00	$a_{9}$	1001	1	$D_{1} D_{4}$	$y_{2} y_{4}$	1
$a_{9}$	01	$a_{8}$	1000	$x_{7}$	$D_{1}$	$y_{3}$	2
$a_{9}$	01	$a_{7}$	0110	$\bar{x_{7}}$	$D_{2} D_{3}$	$y_{1} y_{2}$	3
$a_{10}$	10	$a_{11}$	0111	$x_{3}$	$D_{2} D_{3} D_{4}$	$y_{2} y_{8}$	4
		$a_{5}$	0011	$\bar{x_{3}} x_{4}$	$D_{3} D_{4}$	$y_{5} y_{7}$	5
		$a_{12}$	1011	$\bar{x_{3}} \bar{x_{4}}$	$D_{1} D_{3} D_{4}$	$y_{1} y_{5}$	6
$a_{12}$	11	$a_{5}$	0011	$x_{4}$	$D_{3} D_{4}$	$y_{3}$	7
$a_{12}$	11	$a_{1}$	0000	$\bar{x_{4}}$	–	–	8

Table 5. Table for

L U T e r Y τ

of Mealy FSM

A 1

.

$f_{i}$	1	2	3	$f_{i}$	1	2	3	$f_{i}$	1	2	3	$f_{i}$	1	2	3
$D_{1}$	1	1	1	$D_{4}$	1	1	1	$y_{3}$	1	1	1	$y_{6}$	1	1	0
$D_{2}$	1	1	1	$y_{1}$	1	1	1	$y_{4}$	1	0	1	$y_{7}$	1	1	1
$D_{3}$	1	1	1	$y_{2}$	1	1	1	$y_{5}$	1	1	1	$y_{8}$	0	0	1

Table 6. Table for

L U T e r O H

of Mealy FSM A1.

$G^{k}$	$CG (G^{k})$	$OCG (G^{k})$	$τ_{0}$	k
$G^{1}$	00	100	$g_{1}$	1
$G^{2}$	01	010	$g_{2}$	2
$G^{3}$	10	001	$g_{3}$	3

Table 7. Characteristics of benchmarks from LGSynth93 [38].

Benchmark	L	N	H	M	L + R
bbara	4	2	60	10	8
bbsse	7	7	56	16	11
bbtas	2	2	24	6	5
beecount	3	4	28	7	6
cse	7	7	91	16	11
dk14	3	5	56	7	6
dk15	3	5	32	4	5
dk16	2	3	108	27	7
dk17	2	3	32	8	5
dk27	1	2	14	7	4
dk512	1	3	15	15	5
donfile	2	1	96	24	7
ex1	9	19	138	20	14
ex2	2	2	72	19	7
ex3	2	2	36	10	6
ex4	6	9	21	14	10
ex5	2	2	32	9	6
ex6	5	8	34	8	8
ex7	2	2	36	10	6
keyb	7	7	170	19	12
kirkman	12	6	370	16	16
lion	2	1	11	4	4
lion9	2	1	25	9	6
mark1	5	16	22	15	9
mc	3	5	10	4	5
modulo12	1	1	24	12	5
opus	5	6	22	10	9
planet	7	19	115	48	13
planet1	7	19	115	48	13
pma	8	8	73	24	13
s1	8	7	106	20	13
s1488	8	19	251	48	14
s1494	8	19	250	48	14
s1a	8	6	107	20	13
s27	4	1	34	6	7
s298	3	6	1096	218	11
s386	7	7	64	13	11
s510	19	7	77	47	25
s8	4	1	20	5	7
s820	18	19	232	25	23
s832	18	19	245	25	23
sand	11	9	184	32	16
shiftreg	1	1	16	8	4
sse	7	7	56	16	11
styr	9	10	166	30	14
tma	8	9	44	20	13

Table 8. Results of experiments (cycle time, nsec).

FSM	MB	OH	JEDI	$U_{2}$	$U_{3}$	L + R
ex1	6.625	7.155	5.654	3.620	3.410	16
kirkman	7.073	6.494	6.382	4.359	3.993	18
planet	7.535	7.535	5.344	3.491	3.321	14
planet1	7.535	7.535	5.344	3.491	3.321	14
pma	6.841	6.841	5.888	4.080	3.708	14
s1	6.830	7.361	6.363	4.233	3.902	14
s1488	7.220	7.579	6.362	4.384	4.217	15
s1494	6.694	6.861	6.085	4.420	4.178	15
s1a	6.520	5.669	5.911	4.521	4.252	15
s510	5.629	5.629	5.512	4.581	3.729	27
s820	6.579	6.529	5.663	4.419	3.687	25
s832	6.863	6.526	5.754	4.553	3.966	25
sand	8.623	8.623	7.885	4.424	4.130	18
styr	7.267	7.697	6.866	3.719	3.505	16
tma	6.102	6.766	6.092	3.557	3.525	13
Total	103.937	104.800	91.106	61.851	56.843
Percentage, %	182.85	184.37	160.28	108.81	100

Table 9. Results of experiments (maximum operating frequency, MHz).

Benchmark	MB	OH	JEDI	$U_{2}$	$U_{3}$	L + R
ex1	150.94	139.76	176.87	276.23	293.27	16
kirkman	141.38	154.00	156.68	229.41	250.42	18
planet	132.71	132.71	187.14	286.48	301.08	14
planet1	132.71	132.71	187.14	286.48	301.08	14
pma	146.18	146.18	169.83	245.12	269.72	14
s1	146.41	135.85	157.16	236.24	256.31	14
s1488	138.50	131.94	157.18	228.12	237.11	15
s1494	149.39	145.75	164.34	226.23	239.35	15
s1a	153.37	176.40	169.17	221.18	235.21	15
s510	177.65	177.65	181.42	218.31	268.17	27
s820	152.00	153.16	176.58	226.28	271.23	25
s832	145.71	153.23	173.78	219.65	252.13	25
sand	115.97	115.97	126.82	226.03	242.16	18
styr	137.61	129.92	145.64	268.92	285.32	16
tma	163.88	147.80	164.14	281.14	283.71	13
Total	2184.41	2173.03	2493.89	3675.82	3986.27
Percentage, %	54.80	54.51	62.56	92.21	100.00
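
The frequencies in Table 9 appear to be the reciprocals of the cycle times in Table 8. This section does not state that relation, so the sketch below treats it as an assumption and checks it on the ex1 row. It also assumes the cycle times are read as nanoseconds with three decimal places (for example, 6.625 ns for ex1 under MB). The names `cycle_time_ns` and `max_frequency_mhz` are illustrative only.

```python
# Cycle times for benchmark ex1, read from Table 8 as nanoseconds.
# Assumption: the MB entry for ex1 stands for 6.625 ns, and so on.
cycle_time_ns = {"MB": 6.625, "OH": 7.155, "JEDI": 5.654, "U2": 3.620, "U3": 3.410}


def max_frequency_mhz(t_ns: float) -> float:
    """Convert a cycle time in ns to a frequency in MHz (f = 1000 / T)."""
    return 1000.0 / t_ns


# Frequency per method, rounded to two decimals as in Table 9.
freq_mhz = {method: round(max_frequency_mhz(t), 2) for method, t in cycle_time_ns.items()}

for method, f in freq_mhz.items():
    print(f"{method}: {f} MHz")
```

The computed values agree with the ex1 row of Table 9 to within about 0.02 MHz. The small differences are consistent with the published cycle times having been rounded to three decimals.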

Table 10. Results of experiments (LUT count).

Benchmark	MB	OH	JEDI	$U_{2}$	$U_{3}$	L + R
ex1	70	74	53	40	38	16
kirkman	42	58	39	31	30	18
planet	131	131	88	77	72	14
planet1	131	131	88	77	72	14
pma	94	94	86	72	69	14
s1	65	99	61	55	52	14
s1488	124	131	108	87	84	15
s1494	126	132	110	85	82	15
s1a	49	81	43	43	40	15
s510	48	48	32	26	28	27
s820	88	82	68	48	46	25
s832	80	79	62	50	52	25
sand	132	132	114	91	87	18
styr	93	120	81	71	69	16
tma	45	39	39	31	29	13
Total	1318	1431	1072	884	850
Percentage, %	155.06	168.35	126.12	104.00	100.00
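
The "Percentage" row in Tables 8 to 11 can be read as each column total divided by the $U_{3}$ total, so that $U_{3}$ is the 100% baseline. A minimal sketch of that normalization follows, using the LUT totals from Table 10. The helper name `relative_to_baseline` is hypothetical, and reading the row as a ratio to $U_{3}$ is an inference from the numbers rather than a definition given in this section.

```python
# Column totals of Table 10 (LUT counts summed over the 15 benchmarks).
lut_totals = {"MB": 1318, "OH": 1431, "JEDI": 1072, "U2": 884, "U3": 850}


def relative_to_baseline(totals: dict, baseline: str = "U3") -> dict:
    """Express every total as a percentage of the baseline column."""
    base = totals[baseline]
    return {method: round(100.0 * value / base, 2) for method, value in totals.items()}


lut_percent = relative_to_baseline(lut_totals)

for method, pct in lut_percent.items():
    print(f"{method}: {pct}%")
```

Under this reading the $U_{2}$ column comes out at 104.00%, which corresponds to the 4.00% average LUT reduction quoted in the abstract.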

Table 11. Results of experiments (area–time products, LUTs × ns).

Benchmark	MB	OH	JEDI	$U_{2}$	$U_{3}$	L + R
ex1	463.76	529.48	299.66	144.81	129.57	16
kirkman	297.07	376.62	248.91	135.13	119.80	18
planet	987.11	987.11	470.24	268.78	239.14	14
planet1	987.11	987.11	470.24	268.78	239.14	14
pma	643.04	643.04	506.39	293.73	255.82	14
s1	443.96	728.74	388.14	232.81	202.88	14
s1488	895.31	992.88	687.11	381.38	354.27	15
s1494	843.43	905.66	669.34	375.72	342.59	15
s1a	319.49	459.18	254.18	194.41	170.06	15
s510	270.19	270.19	176.39	119.10	104.41	27
s820	578.95	535.39	385.09	212.13	169.60	25
s832	549.04	515.56	356.77	227.63	206.24	25
sand	1138.23	1138.23	898.91	402.60	359.27	18
styr	675.82	923.65	556.17	264.02	241.83	16
tma	274.59	263.87	237.60	110.27	102.22	13
Total	9367	10,257	6605	3631	3237
Percentage, %	289.39	316.87	204.06	112.19	100.00
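
The entries of Table 11 are consistent with multiplying the LUT count from Table 10 by the cycle time from Table 8. The sketch below checks this for the ex1 row, assuming cycle times in nanoseconds with three decimal places (for example, 6.625 ns under MB). The variable names are illustrative, and the product formula is inferred from the table values rather than quoted from this section.

```python
# ex1 row of Table 10 (LUT counts) and Table 8 (cycle times, assumed to be in ns).
luts = {"MB": 70, "OH": 74, "JEDI": 53, "U2": 40, "U3": 38}
cycle_time_ns = {"MB": 6.625, "OH": 7.155, "JEDI": 5.654, "U2": 3.620, "U3": 3.410}

# Area-time product: occupied area (LUTs) multiplied by cycle time (ns).
area_time = {method: luts[method] * cycle_time_ns[method] for method in luts}

for method, product in area_time.items():
    print(f"{method}: {product:.2f} LUTs x ns")
```

The results match the ex1 row of Table 11 to within about 0.01, a gap that rounding of the published cycle times would explain. A lower product is better, since it means a smaller and faster circuit.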

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Barkalov, A.; Titarenko, L.; Mielcarek, K. Transforming Group Codes in Mealy Finite State Machines with Composite State Codes. Appl. Sci. 2025, 15, 4289. https://doi.org/10.3390/app15084289
