# Efficient Construction of the Equation Automaton


## Abstract


## 1. Introduction

## 2. Preliminaries

#### 2.1. Regular Expressions and Finite Automata

#### 2.1.1. Regular Expressions and Languages

- $1+1=1$,
- $1+E+1=1+E$,
- E is in Star-Normal Form (SNF) [16].

#### 2.1.2. Finite Automata and Recognizable Languages

- $\sim \subseteq {(Q-F)}^{2}\cup {F}^{2}$ (final and non-final states are not ∼-equivalent),
- for any $p,q\in Q$, $a\in A$, if $p\sim q$, then $\delta (p,a){/}_{\sim}=\delta (q,a){/}_{\sim}$.
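The two conditions above can be checked mechanically. The following is a minimal sketch, assuming the automaton is given as a dictionary `delta` mapping `(state, letter)` to a set of states and the equivalence as a map `block_of` from states to class identifiers; both representations, and the toy automaton, are illustrative choices rather than notation from the paper.

```python
from itertools import product

def is_right_invariant(states, alphabet, delta, finals, block_of):
    """Check both conditions for an equivalence given as state -> class id."""
    def image_blocks(p, a):
        # delta(p, a) projected onto equivalence classes
        return {block_of[r] for r in delta.get((p, a), set())}

    for p, q in product(states, repeat=2):
        if block_of[p] != block_of[q]:
            continue
        # Condition 1: equivalent states agree on finality.
        if (p in finals) != (q in finals):
            return False
        # Condition 2: successors agree up to the equivalence.
        for a in alphabet:
            if image_blocks(p, a) != image_blocks(q, a):
                return False
    return True

# Toy automaton (hypothetical): 0 -a-> {1, 2}, 1 -a-> 1, 2 -a-> 2.
toy_states = {0, 1, 2}
toy_delta = {(0, "a"): {1, 2}, (1, "a"): {1}, (2, "a"): {2}}
toy_finals = {1, 2}

good = is_right_invariant(toy_states, "a", toy_delta, toy_finals, {0: 0, 1: 1, 2: 1})
bad = is_right_invariant(toy_states, "a", toy_delta, toy_finals, {0: 0, 1: 0, 2: 1})
```

Here `good` merges the two final states, which behave identically, while `bad` puts a final and a non-final state in the same class and therefore violates the first condition.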

#### 2.2. Thompson Automaton

**Example 1.**

## 3. Equation Automaton

**Theorem 1.**

- $\delta ({E}^{\prime},a)=\{{E}^{\prime\prime}\in D(E)\;|\;{E}^{\prime\prime}\in {\partial}_{a}({E}^{\prime})\}$,
- $F=\{{E}^{\prime}\in D(E)\;|\;\lambda ({E}^{\prime})=1\}$.
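To make the definition above concrete, here is a minimal sketch that builds the automaton by exploring partial derivatives. It assumes a tuple encoding of expressions and the usual simplification $\epsilon \cdot F = F$; the function names (`pd`, `nullable`, `equation_automaton`) are illustrative and not taken from the paper, and this direct exploration is not the efficient algorithm the paper develops.

```python
# Expressions as nested tuples: ("eps",), ("sym", a), ("alt", l, r),
# ("cat", l, r), ("star", e). This encoding is an assumption of the sketch.
EPS = ("eps",)

def sym(a): return ("sym", a)
def alt(l, r): return ("alt", l, r)
def star(e): return ("star", e)

def cat(l, r):
    # Simplify eps . r to r so that derived terms stay syntactically small.
    return r if l == EPS else ("cat", l, r)

def nullable(e):
    """lambda(e) = 1 iff the empty word belongs to L(e)."""
    tag = e[0]
    if tag in ("eps", "star"):
        return True
    if tag == "sym":
        return False
    if tag == "alt":
        return nullable(e[1]) or nullable(e[2])
    return nullable(e[1]) and nullable(e[2])  # cat

def pd(e, a):
    """Set of partial derivatives of e with respect to the letter a."""
    tag = e[0]
    if tag == "eps":
        return set()
    if tag == "sym":
        return {EPS} if e[1] == a else set()
    if tag == "alt":
        return pd(e[1], a) | pd(e[2], a)
    if tag == "star":
        return {cat(d, e) for d in pd(e[1], a)}
    # cat: derive the left factor, and the right one if the left is nullable
    left = {cat(d, e[2]) for d in pd(e[1], a)}
    return (left | pd(e[2], a)) if nullable(e[1]) else left

def equation_automaton(e, alphabet):
    """Explore derived terms from e; return (states, delta, finals)."""
    states, todo, delta = {e}, [e], {}
    while todo:
        s = todo.pop()
        for a in alphabet:
            targets = pd(s, a)
            delta[(s, a)] = targets
            for t in targets - states:
                states.add(t)
                todo.append(t)
    finals = {s for s in states if nullable(s)}
    return states, delta, finals

a, b = sym("a"), sym("b")
E = star(alt(alt(star(a), cat(b, star(a))), star(b)))  # (a* + b a* + b*)*
states, delta, finals = equation_automaton(E, "ab")
```

For $E={({a}^{\ast}+b{a}^{\ast}+{b}^{\ast})}^{\ast}$ this exploration reaches three derived terms, $E$, ${a}^{\ast}E$ and ${b}^{\ast}E$, all of which are final.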

**Example 2.**

#### 3.1. C-Continuation Automaton

**Definition 1.**

**Theorem 2.**

**Proposition 1.**

**Corollary 1.**

**Definition 2.**

- $Q=\{(x,{c}_{x}(E))\;|\;x\in pos(E)\cup \{0\}\}$,
- $i=(0,{c}_{0}(E))$,
- $F=\{(x,{c}_{x}(E))\;|\;\lambda ({c}_{x}(E))=1\}$,
- $\delta ((x,{c}_{x}(E)),a)=\{(y,{c}_{y}(E))\;|\;h(y)=a\ \mathrm{and}\ {d}_{y}({c}_{x}(E))\equiv {c}_{y}(E)\},\;\forall x\in pos(E)\cup \{0\}\ \mathrm{and}\ \forall a\in {A}_{E}$.

**Corollary 2.**

**Example 3.**

$\begin{array}{lcl}{c}_{0}(E) & = & {({a}_{1}^{\ast}+{b}_{2}{a}_{3}^{\ast}+{b}_{4}^{\ast})}^{\ast}\\ {c}_{{a}_{1}}(E) & = & {a}_{1}^{\ast}{({a}_{1}^{\ast}+{b}_{2}{a}_{3}^{\ast}+{b}_{4}^{\ast})}^{\ast}\\ {c}_{{b}_{2}}(E) & = & {a}_{3}^{\ast}{({a}_{1}^{\ast}+{b}_{2}{a}_{3}^{\ast}+{b}_{4}^{\ast})}^{\ast}\\ {c}_{{b}_{4}}(E) & = & {b}_{4}^{\ast}{({a}_{1}^{\ast}+{b}_{2}{a}_{3}^{\ast}+{b}_{4}^{\ast})}^{\ast}\end{array}$

$\begin{array}{lcl}{c}_{{a}_{3}}(E) & = & {c}_{{a}_{3}}({a}_{1}^{\ast}+{b}_{2}{a}_{3}^{\ast}+{b}_{4}^{\ast})\cdot {({a}_{1}^{\ast}+{b}_{2}{a}_{3}^{\ast}+{b}_{4}^{\ast})}^{\ast}\\ & = & {c}_{{a}_{3}}({b}_{2}{a}_{3}^{\ast})\cdot {({a}_{1}^{\ast}+{b}_{2}{a}_{3}^{\ast}+{b}_{4}^{\ast})}^{\ast}\\ & = & {c}_{{a}_{3}}({a}_{3}^{\ast})\cdot {({a}_{1}^{\ast}+{b}_{2}{a}_{3}^{\ast}+{b}_{4}^{\ast})}^{\ast}\\ & = & {a}_{3}^{\ast}{({a}_{1}^{\ast}+{b}_{2}{a}_{3}^{\ast}+{b}_{4}^{\ast})}^{\ast}\end{array}$

$\begin{array}{l}{d}_{{a}_{1}}({c}_{0}(E))={c}_{{a}_{1}}(E),\quad {d}_{{b}_{2}}({c}_{0}(E))={c}_{{b}_{2}}(E),\quad {d}_{{a}_{3}}({c}_{0}(E))=0,\quad {d}_{{b}_{4}}({c}_{0}(E))={c}_{{b}_{4}}(E)\\ {d}_{{a}_{1}}({c}_{{a}_{1}}(E))={c}_{{a}_{1}}(E),\quad {d}_{{b}_{2}}({c}_{{a}_{1}}(E))={c}_{{b}_{2}}(E),\quad {d}_{{a}_{3}}({c}_{{a}_{1}}(E))=0,\quad {d}_{{b}_{4}}({c}_{{a}_{1}}(E))={c}_{{b}_{4}}(E)\\ {d}_{{a}_{1}}({c}_{{b}_{2}}(E))={c}_{{a}_{1}}(E),\quad {d}_{{b}_{2}}({c}_{{b}_{2}}(E))={c}_{{b}_{2}}(E),\quad {d}_{{a}_{3}}({c}_{{b}_{2}}(E))={c}_{{a}_{3}}(E),\quad {d}_{{b}_{4}}({c}_{{b}_{2}}(E))={c}_{{b}_{4}}(E)\\ {d}_{{a}_{1}}({c}_{{a}_{3}}(E))={c}_{{a}_{1}}(E),\quad {d}_{{b}_{2}}({c}_{{a}_{3}}(E))={c}_{{b}_{2}}(E),\quad {d}_{{a}_{3}}({c}_{{a}_{3}}(E))={c}_{{a}_{3}}(E),\quad {d}_{{b}_{4}}({c}_{{a}_{3}}(E))={c}_{{b}_{4}}(E)\\ {d}_{{a}_{1}}({c}_{{b}_{4}}(E))={c}_{{a}_{1}}(E),\quad {d}_{{b}_{2}}({c}_{{b}_{4}}(E))={c}_{{b}_{2}}(E),\quad {d}_{{a}_{3}}({c}_{{b}_{4}}(E))=0,\quad {d}_{{b}_{4}}({c}_{{b}_{4}}(E))={c}_{{b}_{4}}(E)\end{array}$
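As a cross-check of the derivatives listed above, one can compute the First and Follow sets of the linearized expression, relying on the standard correspondence between the c-continuation automaton and the position automaton: ${d}_{y}({c}_{x}(E))$ is nonzero exactly when $y$ follows $x$ (or, for $x=0$, when $y$ is a first position). The sketch below assumes a tuple encoding of linearized expressions; the function name `glushkov` and the encoding are illustrative choices, not the paper's.

```python
def glushkov(e):
    """Return (nullable, first, last, follow) for a linearized expression.

    e uses the tuple encoding ("pos", x), ("alt", l, r), ("cat", l, r),
    ("star", f); the encoding is an assumption of this sketch.
    """
    follow = {}

    def walk(e):
        tag = e[0]
        if tag == "pos":
            follow.setdefault(e[1], set())
            return False, {e[1]}, {e[1]}
        if tag == "alt":
            n1, f1, l1 = walk(e[1])
            n2, f2, l2 = walk(e[2])
            return n1 or n2, f1 | f2, l1 | l2
        if tag == "cat":
            n1, f1, l1 = walk(e[1])
            n2, f2, l2 = walk(e[2])
            for x in l1:  # last positions of the left factor reach First(right)
                follow[x] |= f2
            first = (f1 | f2) if n1 else f1
            last = (l1 | l2) if n2 else l2
            return n1 and n2, first, last
        # star: last positions loop back to first positions
        _, f, l = walk(e[1])
        for x in l:
            follow[x] |= f
        return True, f, l

    nullable, first, last = walk(e)
    return nullable, first, last, follow

def pos(x): return ("pos", x)

# Linearized expression of Example 3: (a1* + b2 a3* + b4*)*
E_bar = ("star", ("alt",
                  ("alt", ("star", pos("a1")), ("cat", pos("b2"), ("star", pos("a3")))),
                  ("star", pos("b4"))))
nullable_E, first, last, follow = glushkov(E_bar)
```

The computed sets agree with the table: ${a}_{3}$ is absent from `first`, from `follow["a1"]` and from `follow["b4"]`, which are the three entries equal to 0, while `follow["b2"]` and `follow["a3"]` contain all four positions.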

#### 3.2. Equation Automaton as a Quotient of C-Continuation Automaton

**Proposition 2.**

- ${Q}_{{\equiv}_{e}}=\{{C}_{x}\;|\;x\in pos(E)\cup \{0\}\}$,
- ${q}_{0}={C}_{0}$,
- $F=\{{C}_{x}\;|\;\lambda ({c}_{x}(E))=1\}$,
- $\delta ({C}_{x},a)=\{{C}_{y}\;|\;h(y)=a\ \mathrm{and}\ {d}_{y}({c}_{x}(E))\equiv {c}_{y}(E)\},\;\forall {C}_{x}\in {Q}_{{\equiv}_{e}}\ \mathrm{and}\ \forall a\in {A}_{E}$.

**Theorem 3.**

**Example 4.**

$\begin{array}{c}h({c}_{0}(E))={({a}^{\ast}+b{a}^{\ast}+{b}^{\ast})}^{\ast}\\ h({c}_{{a}_{1}}(E))=h({c}_{{b}_{2}}(E))=h({c}_{{a}_{3}}(E))={a}^{\ast}{({a}^{\ast}+b{a}^{\ast}+{b}^{\ast})}^{\ast}\\ h({c}_{{b}_{4}}(E))={b}^{\ast}{({a}^{\ast}+b{a}^{\ast}+{b}^{\ast})}^{\ast}\end{array}$
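The grouping in this example can be reproduced with a few lines of code. The sketch below assumes that two positions are merged exactly when their c-continuations coincide after applying $h$, and it encodes the continuations of Example 3 as plain strings; the string encoding and the helper names are ad hoc choices for illustration.

```python
import re
from collections import defaultdict

# c-continuations of Example 3 as plain strings (ad hoc encoding):
# "a1" stands for the position a_1, and so on.
E_BAR = "(a1*+b2a3*+b4*)*"
continuations = {
    "0": E_BAR,
    "a1": "a1*" + E_BAR,
    "b2": "a3*" + E_BAR,
    "a3": "a3*" + E_BAR,
    "b4": "b4*" + E_BAR,
}

def h(expr):
    """Drop position indices, mapping each position to its letter."""
    return re.sub(r"\d+", "", expr)

def classes_mod_h(conts):
    """Group positions whose continuations have the same image under h."""
    groups = defaultdict(list)
    for x, c in conts.items():
        groups[h(c)].append(x)
    return dict(groups)

classes = classes_mod_h(continuations)
```

The result has three classes, `["0"]`, `["a1", "b2", "a3"]` and `["b4"]`, matching the three distinct images under $h$ shown above.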

## 4. Allauzen and Mohri’s Algorithm

**Proposition 3.**

**Theorem 4.**

## 5. Efficient Conversion Algorithm

**Algorithm 1.** Computation of the equation automaton.

Input: The Thompson automaton ${\mathcal{T}}_{E}=\langle Q,{A}_{E},I,\delta ,F\rangle $ associated with a regular expression E.

Output: The equation automaton ${\mathcal{E}}_{E}$ associated with E.

/* Computation of states */

Compute $Id({\mathcal{T}}_{E})$.

Compute ${C}_{{\equiv}_{e}}(Id({\mathcal{T}}_{E}))$:

- Compute pseudo-continuations for all position states of $Id({\mathcal{T}}_{E})$.
- Merge equivalent states having the same pseudo-continuation.

/* Computation of transitions and final states */

Compute $rmeps({C}_{{\equiv}_{e}}(Id({\mathcal{T}}_{E})))$:

- Perform the epsilon removal operation using the $rmeps()$ function over ${C}_{{\equiv}_{e}}(Id({\mathcal{T}}_{E}))$.

#### 5.1. Computation of States

#### 5.1.1. Sub-Expressions Identification

- $A=\{{\epsilon}_{l+},{\epsilon}_{\cdot},{\epsilon}_{r+},{\epsilon}_{\ast}^{1},{\epsilon}_{\ast}^{2}\}\cup {A}_{E}$, where $l$ (resp. $r$) denotes left (resp. right),
- ${Q}^{\prime}=\{(q,N(q))\;|\;q\in Q\}$, i.e., a state in ${\mathcal{T}}_{E}$ is augmented by the letter $N(q)$.
- The transition function ${\delta}^{\prime}$ is defined over the Thompson automaton as follows:

**Lemma 1.**

**Proof.**

**Proposition 4.**

**Proof.**

**Example 5.**

#### 5.1.2. ${\equiv}_{e}$-Equivalent States Merging

**Proposition 5.**

**Definition 3.**

- $A=\{{\epsilon}_{0},\cdots ,{\epsilon}_{|Exp|}\}\cup {A}_{E}$.
- ${Q}_{{\equiv}_{e}}=\{(q,C(q))\;|\;(q,N(q))\in {Q}^{\prime}\}$, i.e., a state in $Id({\mathcal{T}}_{E})$ is replaced by $(q,C(q))$.
- The transition function ${\delta}^{\prime\prime}$ is defined as follows:

**Proposition 6.**

**Example 6.**

**Proposition 7.**

**Proposition 8.**

**Theorem 5.**

**Proof.**

**Example 7.** The automaton ${C}_{{\equiv}_{e}}(Id({\mathcal{T}}_{E}))$, shown in Figure 8, is obtained from $Id({\mathcal{T}}_{E})$ by merging the ${\equiv}_{e}$-equivalent position states 5, 8 and 11.

#### 5.2. Computation of Transitions and Final States

**Lemma 2.**

**Lemma 3.**

**Proposition 9.**

**Example 8.**

- The set of states of the equation automaton is $\{0,\{5,8,11\},16\}$.
- Since the final state of ${C}_{{\equiv}_{e}}(Id({\mathcal{T}}_{E}))$ is state 19 and $19\in rmeps(0)$, $19\in rmeps(\{5,8,11\})$, and $19\in rmeps(16)$, the set of final states is $\{0,\{5,8,11\},16\}$.
- There are two paths in ${C}_{{\equiv}_{e}}(Id({\mathcal{T}}_{E}))$ from state 0 to state $\{5,8,11\}$, labeled respectively by $\epsilon \cdot \epsilon \cdot \epsilon \cdot \epsilon \cdot a$ and $\epsilon \cdot \epsilon \cdot \epsilon \cdot b$; hence $\{5,8,11\}\in rmeps(0)$. Consequently, the two transitions $(0,a,\{5,8,11\})$ and $(0,b,\{5,8,11\})$ are added to the equation automaton. The same process is applied to obtain the other transitions.
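The reasoning in this example follows the usual epsilon-removal pattern: a kept state is final if its epsilon-closure contains a final state, and it inherits every letter transition leaving a state of its closure. The sketch below is a generic version of that pattern under an assumed dictionary representation; it is not the paper's $rmeps()$ implementation, and the toy automaton is hypothetical rather than the one of Figure 8.

```python
def eps_closure(state, eps):
    """States reachable from `state` by epsilon-transitions only."""
    seen, stack = {state}, [state]
    while stack:
        p = stack.pop()
        for q in eps.get(p, ()):
            if q not in seen:
                seen.add(q)
                stack.append(q)
    return seen

def remove_epsilons(initial, finals, eps, trans):
    """eps: state -> set of states; trans: (state, letter) -> set of states.

    Keeps the initial state and every target of a letter transition.
    """
    new_trans, new_finals = {}, set()
    states, todo = {initial}, [initial]
    while todo:
        p = todo.pop()
        closure = eps_closure(p, eps)
        if closure & finals:
            new_finals.add(p)
        for (q, a), targets in trans.items():
            if q in closure:
                new_trans.setdefault((p, a), set()).update(targets)
                for r in targets - states:
                    states.add(r)
                    todo.append(r)
    return states, new_trans, new_finals

# Toy epsilon-automaton for a* (hypothetical): 0 -eps-> 1, 0 -eps-> 3,
# 1 -a-> 2, 2 -eps-> 1, 2 -eps-> 3, with 3 final.
toy_eps = {0: {1, 3}, 2: {1, 3}}
toy_trans = {(1, "a"): {2}}
kept, kept_trans, kept_finals = remove_epsilons(0, {3}, toy_eps, toy_trans)
```

On the toy automaton only states 0 and 2 survive, both are final, and each has a single `a`-transition to state 2.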

**Theorem 6.**

## 6. Conclusions

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest

## References

1. Mirkin, B.G. Novyj algoritm postroéniá bazisa v ázyké régularnyh vyražénij. Izvéstiá Akadémii Nauk SSSR. Engineering cybernetics, no. 5 (1966). English translation of the preceding: Brzozowski, J. An algorithm for constructing a base in a language of regular expressions. pp. 110–116. J. Symb. Log. **1971**, 36, 694.
2. Antimirov, V. Partial derivatives of regular expressions and finite automaton constructions. Theor. Comput. Sci. **1996**, 155, 291–319.
3. Glushkov, V.M. The abstract theory of automata. Russ. Math. Surv. **1961**, 16, 1–53. Available online: https://iopscience.iop.org/article/10.1070/RM1961v016n05ABEH004112 (accessed on 27 May 2021).
4. McNaughton, R.F.; Yamada, H. Regular expressions and state graphs for automata. IEEE Trans. Electron. Comput. **1960**, 9, 39–57.
5. Ziadi, D.; Ponty, J.-L.; Champarnaud, J.-M. A New Quadratic Algorithm to Convert a Regular Expression into an Automaton. In Proceedings of the Workshop on Implementing Automata, London, ON, Canada, 29–31 August 1996; pp. 109–119.
6. Champarnaud, J.-M.; Ziadi, D. Canonical derivatives, partial derivatives and finite automaton constructions. Theor. Comput. Sci. **2002**, 289, 137–163.
7. Khorsi, A.; Ouardi, F.; Ziadi, D. Fast equation automaton computation. J. Discret. Algorithms **2008**, 6, 433–448.
8. Allauzen, C.; Mohri, M. A Unified Construction of the Glushkov, Follow, and Antimirov Automata. In Proceedings of the International Conference of Mathematical Foundations of Computer Science, Stará Lesná, Slovakia, 28 August–1 September 2006; pp. 110–121.
9. Ilie, L.; Yu, S. Follow automata. Inf. Comput. **2003**, 186, 140–162.
10. Champarnaud, J.-M.; Nicart, F.; Ziadi, D. From the ZPC Structure of a Regular Expression to its Follow Automaton. IJAC **2006**, 16, 17–34.
11. Kleene, S. Representation of Events in Nerve Nets and Finite Automata; Automata Studies, Ann. Math. Studies 34; Princeton University Press: Princeton, NJ, USA, 1956; pp. 3–41.
12. Thompson, K. Regular Expression Search Algorithm. Commun. ACM **1968**, 11, 410–422.
13. Hopcroft, J. An n log n Algorithm for Minimizing States in a Finite Automaton; Technical Report; Stanford University, CS Dept.: Stanford, CA, USA, 1971.
14. Hopcroft, J.E.; Ullman, J.D. Introduction to Automata Theory, Languages and Computation; Addison-Wesley: Reading, MA, USA, 1979.
15. Sakarovitch, J.; Thomas, R. Elements of Automata Theory; Cambridge University Press: Cambridge, UK, 2009.
16. Brüggemann-Klein, A. Regular expressions into finite automata. Theor. Comp. Sci. **1993**, 120, 117–126.
17. Bubenzer, J. Cycle-aware minimization of acyclic deterministic finite-state automata. J. Discret. Appl. Math. **2014**, 163, 238–246.
18. Revuz, D. Minimization of acyclic deterministic automata in linear time. Theor. Comput. Sci. **1992**, 92, 181–189.
19. Giammarresi, D.; Ponty, J.-L.; Wood, D.; Ziadi, D. A characterization of Thompson digraphs. Discret. Appl. Math. **2004**, 134, 317–337.
20. Myhill, J. Finite automata and the representation of events. In WADD TR-57-624; Wright Patterson AFB: Dayton, OH, USA, 1957; pp. 112–137.
21. Nerode, A. Linear automata transformation. Proc. AMS **1958**, 9, 541–544.

**Figure 5.** (**a**) The c-continuation automaton ${\mathcal{C}}_{E}$ versus (**b**) the quotient automaton ${\mathcal{C}}_{E}{/}_{{\equiv}_{e}}$.

**Figure 6.** The automaton $Id({\mathcal{T}}_{E})$ associated with $E={({a}^{\ast}+b{a}^{\ast}+{b}^{\ast})}^{\ast}$.

**Figure 8.** The automaton ${C}_{{\equiv}_{e}}(Id({\mathcal{T}}_{E}))$ associated with $E={({a}^{\ast}+b{a}^{\ast}+{b}^{\ast})}^{\ast}$.

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Ouardi, F.; Lotfi, Z.; Elghadyry, B.
Efficient Construction of the Equation Automaton. *Algorithms* **2021**, *14*, 238.
https://doi.org/10.3390/a14080238
