Efficient Algorithms for Computing the Inner Edit Distance of a Regular Language via Transducers

Abstract: The concept of edit distance and its variants has applications in many areas such as computational linguistics, bioinformatics, and synchronization error detection in data communications. Here, we revisit the problem of computing the inner edit distance of a regular language given via a Nondeterministic Finite Automaton (NFA). This problem relates to the inherent maximal error-detecting capability of the language in question. We present two efficient algorithms for solving this problem, both of which execute in time $O(r^2 n^2 d)$, where r is the cardinality of the alphabet involved, n is the number of transitions in the given NFA, and d is the computed edit distance. We have implemented one of the two algorithms and present here a set of performance tests. The correctness of the algorithms is based on the connection between word distances and error detection and the fact that nondeterministic transducers can be used to represent the errors (resp., edit operations) involved in error-detection (resp., in word distances).


Introduction
The concept of edit distance and its variants has applications in many areas such as computational linguistics [1], bioinformatics [2], and synchronization error detection in data communications [3]. The edit distance of a language L with at least two words (also referred to as the inner edit distance of L) is the minimum edit distance between any two different words in L. In [4], the author considers the problem of computing the edit distance of a regular language, which is given via a Nondeterministic Finite Automaton (NFA) or a Deterministic Finite Automaton (DFA). For a given automaton a with n transitions and an alphabet of r symbols, the algorithm proposed in [4] has worst-case time complexity

$$O(r^2 n^2 q^2 (q + r)), \qquad (1)$$

where q is either the number of states in a (if a is a DFA), or the square of the number of states in a (if a is an NFA). If the size of the alphabet is ignored and the automaton in question has only states that can be reached from the start state, then the number of states is O(n) and the worst-case time complexity shown in (1) can be written as $O(n^5)$ for DFAs and $O(n^8)$ for NFAs.

In this paper, we present two efficient algorithms to compute the inner edit distance of a regular language given via an NFA with n transitions (see Theorems 1 and 3). Both algorithms, which are called DistErrDetect and DistInpAlter, have the same worst-case time complexity

$$O(r^2 n^2 d),$$

where d is the computed distance, which is a significant improvement over the original algorithm in [4].
Our first algorithm, DistErrDetect, is based on the general method of [5] for computing distances via the error-detection property. Now, however, we have an efficient way of realizing that general method algorithmically, using an incremental construction of a (nondeterministic) transducer and the test of [6] for partial identity of transducers. In our second algorithm, DistInpAlter, the idea is to model the edit operations of the desired distance using an input-altering transducer (a transducer whose output is always different from the input used) that is efficient in terms of size. Please see subsequent sections for definitions of terms. For clarity of presentation, we give in detail not only the new algorithms, but also their preliminary versions PrelimDistErrDetect and PrelimDistInpAlter, which could possibly be applied to other types of distances. We have implemented the preliminary and final versions of the second algorithm (PrelimDistInpAlter and DistInpAlter) in Python using the well-maintained, open-source package FAdo for automata [7]. We have also tested our implementation experimentally, and we present in this paper the outcomes of the tests.
We note that some related problems involving distances between words and languages can be found in [8,9] (edit distance between a word and a language), and in [10-14] (various distances between two languages). Also, in [15], the newer concept of edit distance with moves is investigated. The problem considered here is technically different, however, as the desired distance involves different words within the same language. More specifically, if we directly used the tools of [10,11], for instance, to compute an edit string with a minimal number of errors between the given language and itself, then that string would simply be an edit string of zero errors, as the edit distance between any word and itself is zero. We also note that the inner prefix distance of a regular language, which is quite different from the inner edit distance, is considered in [16] and computed in time $O(n^2 \log n)$.
The paper is organized as follows. The next section contains basic notions on languages, word relations, finite-state machines and edit-strings. Section 3 describes the approach of computing the desired edit distance via the concept of error-detection and presents the preliminary version PrelimDistErrDetect of the first algorithm. Then, Section 4 explains the improved and final version DistErrDetect of the first algorithm. In Section 5, it is shown that the edit distance is definable via an efficient input-altering transducer (see Theorem 2), and then the second algorithm DistInpAlter is presented. Section 6 discusses the implementation and testing of the second algorithm and its preliminary version. The last section contains a few concluding remarks. The appendix contains the proofs of two technical lemmata.

Notation, Background and Preliminary Results
This section contains basic terminology about formal languages, automata, transducers, and edit strings. Most of the basic notions presented here can be found in various texts such as [17-21].

Sets, Words, Languages, Channels
The set of positive integers is denoted by $\mathbb{N}$. Then, $\mathbb{N}_0 = \mathbb{N} \cup \{0\}$. If S is any set, the expression |S| denotes the cardinality of S. We use standard basic notation and terminology for alphabets, words and languages (see [22], for instance). For example, Σ denotes an alphabet, $\Sigma^+$ the set of nonempty words, λ the empty word, $\Sigma^* = \Sigma^+ \cup \{\lambda\}$, and |w| the length of the word w. We write $u \le_p w$ to indicate that the word u is a prefix of w, that is, w = uv for some word v. Then, $u <_p w$ means that u is a proper prefix, that is, $u \le_p w$ and $u \neq w$. We use the concepts of (formal) language and concatenation between words, or languages, in the usual way. We say that w is an L-word if w ∈ L and L is a language.
A binary word relation ρ on $\Sigma^*$ is any subset of $\Sigma^* \times \Sigma^*$. The domain of ρ is the set of words u such that (u, v) ∈ ρ for some word v. A channel γ is a binary word relation that is input-preserving, that is, $\gamma \subseteq \Sigma^* \times \Sigma^*$ and (w, w) ∈ γ for all words w in the domain of γ. When (u, v) ∈ γ, we say that u can be received as v via the channel γ, or v is a possible output of γ when u is used as input. If $v \neq u$, then we say that u can be received (via γ) with errors. Here, we only consider the channel sid(k), for some $k \in \mathbb{N}$, such that (u, v) ∈ sid(k) if and only if v can be obtained by applying at most k errors in u, where an error could be a deletion of a symbol in u, a substitution of a symbol in u with another symbol, or an insertion of a symbol in u (see further below for a more rigorous definition via edit-strings).

NFAs and Transducers
A Nondeterministic Finite Automaton with empty transitions, λ-NFA for short, or just automaton, is a quintuple a = (Q, Σ, T, s, F) such that Q is the finite set of states, Σ is the alphabet, s ∈ Q is the start (or initial) state, F ⊆ Q is the set of final states, and $T \subseteq Q \times (\Sigma \cup \{\lambda\}) \times Q$ is the finite set of transitions or edges. Let (p, x, q) be a transition of a. Then, x is called the label of the transition, and we say that the transition goes out of p. We also use the notation $p \xrightarrow{x} q$ for a transition (p, x, q). The λ-NFA a is called an NFA if no transition label is empty, that is, $T \subseteq Q \times \Sigma \times Q$. A deterministic finite automaton, DFA for short, is a special type of NFA in which, for each state p, there are no two transitions with equal labels going out of p.
A path of a is a finite sequence of consecutive transitions:

$$(p_0, x_1, p_1)(p_1, x_2, p_2) \cdots (p_{\ell-1}, x_\ell, p_\ell), \qquad (4)$$

for some nonnegative integer $\ell$, where we use concatenation of these transitions to denote the path. Then, if $P_1$ and $P_2$ are two paths such that the last state of $P_1$ is equal to the first state of $P_2$, $P_1 P_2$ denotes the path that results from concatenating the transitions of $P_1$ and $P_2$.
The word $x_1 \cdots x_\ell$ is called the label of the path in (4). We write $p_0 \xrightarrow{x}{}^{*} p_\ell$ to indicate that there is a path with label x from $p_0$ to $p_\ell$. A path as above is called a computation of a if $p_0$ is the start state. It is called an accepting path/computation if $p_0$ is the start state and $p_\ell$ is a final state. The language accepted by a, denoted as L(a), is the set of labels of all the accepting paths of a. The automaton a is called trim if every state appears in some accepting path of a.
A (finite nondeterministic) transducer [17,20] is a quintuple t = (Q, Σ, T, s, F) such that Q, s, F are exactly the same as those in λ-NFAs, Σ is the alphabet, and T is the finite set of transitions or edges. (In the literature, a transducer also has an output alphabet Γ, but we consider here that Γ is the same as the input alphabet Σ. Without further mention, all transducers considered here are nondeterministic.) We write (p, x/y, q), or $p \xrightarrow{x/y} q$, for a transition; the label here is (x/y), with x being the input and y being the output label of the transition. The concepts of path, computation, accepting path, and trim transducer are similar to those in λ-NFAs. However, the label of a transducer path $(p_0, x_1/y_1, p_1) \cdots (p_{\ell-1}, x_\ell/y_\ell, p_\ell)$ is the pair $(x_1 \cdots x_\ell, y_1 \cdots y_\ell)$ of the two words consisting of the concatenations of the input and output labels in the path, respectively. The relation realized by the transducer t, denoted by R(t), is the set of labels in all the accepting paths of t. We write t(u) for the set of possible outputs of t on input u, that is, $t(u) = \{v : (u, v) \in R(t)\}$. The transducer t is called functional if the relation R(t) is a function, that is, t(u) consists of at most one word, for all input words u. We say that t realizes a partial identity if $v \in t(u)$ implies v = u, for all words u and v.

If m is an automaton or a transducer, then the size of m, denoted by |m|, is the number of states plus the number of transitions in m. We shall write $Q_m$, $T_m$ for the sets of states and transitions of m, respectively.
We recall that making an automaton or transducer m trim can be done in linear time O(|m|).

Edit Strings and Edit Distance
The alphabet $E_\Sigma$ of the (basic) edit operations, which depends on the alphabet Σ of ordinary symbols, consists of all symbols (x/y) such that $x, y \in \Sigma \cup \{\lambda\}$ and at least one of x and y is in Σ. If $(x/y) \in E_\Sigma$ and x is not equal to y, then (x/y) is called an error [23]. The edit operations (a/b), (λ/a), (a/λ), where $a, b \in \Sigma - \{\lambda\}$ and $a \neq b$, are called substitution, insertion, deletion, respectively. We write (λ/λ) for the empty word over the alphabet $E_\Sigma$. We note that λ is used as a formal symbol in the elements of $E_\Sigma$. For example, if a, b ∈ Σ, then $(\lambda/a)(b/b) \neq (b/a)(\lambda/b)$. The elements of $E_\Sigma^*$ are called edit strings. The weight of an edit string h, denoted by weight(h), is the number of errors occurring in h. For example, for g = [...], weight(g) = 2.

The input and output parts of an edit string $h = (x_1/y_1) \cdots (x_m/y_m)$ are the words $x_1 \cdots x_m$ and $y_1 \cdots y_m$, respectively. We write inp(h) for the input part and out(h) for the output part of h. For example, for the g shown above, inp(g) = aabbb and out(g) = abab. The inverse of an edit string h is the edit string that results from inverting the order of the input and output parts in every edit operation in h. For example, the inverse of g shown above is [...]. The channel sid(k) can be defined more rigorously via edit strings:

$$\mathrm{sid}(k) = \{(u, v) : u = \mathrm{inp}(h),\ v = \mathrm{out}(h),\ \text{for some } h \in E_\Sigma^* \text{ with } \mathrm{weight}(h) \le k\}.$$

The edit (or Levenshtein) distance [24] between two words u and v, denoted by δ(u, v), is the smallest number of errors (substitutions, insertions and deletions) that can be used to transform u to v. More formally,

$$\delta(u, v) = \min\{\mathrm{weight}(h) : h \in E_\Sigma^*,\ \mathrm{inp}(h) = u,\ \mathrm{out}(h) = v\}.$$

We say that an edit string h realizes the edit distance between two words u and v if weight(h) = δ(u, v) and either inp(h) = u and out(h) = v, or inp(h) = v and out(h) = u. For example, for Σ = {a, b}, we have that δ(ababa, babbb) = 3 and the edit string [...] realizes δ(ababa, babbb). Note that several edit strings can realize the distance δ(u, v). If L is a language containing at least two words, then the edit distance of L is

$$\delta(L) = \min\{\delta(u, v) : u, v \in L,\ u \neq v\}.$$

Testing whether a given NFA accepts at least two words is not a concern in this paper, but we note that this can be done efficiently (in linear time via a breadth-first-search type algorithm) [25].
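The definitions above can be tried out on small examples. The following is a minimal sketch, assuming edit strings are represented as Python lists of pairs (x, y) with the empty string standing for λ; the function names and the sample edit string h are illustrative choices and are not taken from the paper (in particular, h is only one edit string with input part aabbb and output part abab, not necessarily the paper's g). The inner distance is computed by brute force over a finite set of words, which is unrelated to the transducer-based algorithms of the later sections.

```python
from itertools import combinations

LAMBDA = ""  # stands for the empty word λ inside an edit operation


def weight(h):
    """Number of errors in h, i.e., operations (x, y) with x != y."""
    return sum(1 for x, y in h if x != y)


def inp(h):
    """Input part of the edit string h."""
    return "".join(x for x, _ in h)


def out(h):
    """Output part of the edit string h."""
    return "".join(y for _, y in h)


def inverse(h):
    """Swap input and output in every edit operation of h."""
    return [(y, x) for x, y in h]


def edit_distance(u, v):
    """Levenshtein distance via the standard row-by-row dynamic program."""
    prev = list(range(len(v) + 1))
    for i, cu in enumerate(u, start=1):
        cur = [i]
        for j, cv in enumerate(v, start=1):
            cur.append(min(prev[j] + 1,                 # delete cu
                           cur[j - 1] + 1,              # insert cv
                           prev[j - 1] + (cu != cv)))   # copy or substitute
        prev = cur
    return prev[-1]


def inner_edit_distance(words):
    """Brute-force inner edit distance of a finite language (illustration only)."""
    words = set(words)
    if len(words) < 2:
        raise ValueError("the language must contain at least two words")
    return min(edit_distance(u, v) for u, v in combinations(words, 2))


# Hypothetical edit string of weight 2 (one deletion, one substitution).
h = [("a", "a"), ("a", LAMBDA), ("b", "b"), ("b", "a"), ("b", "b")]
```

With these helpers, `edit_distance("ababa", "babbb")` agrees with the value 3 stated in the text.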
The next lemma comes from [4]. The bound $D_a$ is always less than or equal to the number of states in the NFA a. Moreover, there are NFAs for which this bound is tight (see Section 6).
Lemma 1. For every NFA a accepting at least two words, we have that $\delta(L(a)) \le D_a$, where $D_a$ is the number of states in the longest path in a from the start state having no repeated state.
However, the bound $D_a$ is of no use in our context, as the problem of determining the length of a longest path in a given automaton, or a graph in general, is NP-complete, since an algorithm solving this problem can be used to decide the existence of a Hamiltonian path; see for example [26]. There are many ways to obtain an efficiently computable upper bound on the edit distance of L(a) that is always at most equal to the number of states in a. For example, that distance is always less than or equal to the distance of the two shortest accepted words. We agree to use this as a working upper bound:

Lemma 2. For every NFA a accepting at least two words, we have that $\delta(L(a)) \le B_a$, where $B_a$ is the edit distance of two shortest words in L(a).
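As an illustration of the bound in Lemma 2, the sketch below finds two shortest accepted words by a breadth-first enumeration over sets of states and returns their edit distance. It rests on several assumptions that are not from the paper: the NFA is given as a start state, a set of final states, and a set of triples (p, symbol, q) without λ-transitions; "two shortest words" is read as the first two accepted words in length-lexicographic order; and the search is cut off at a hypothetical `max_len`. The enumeration is naive and can be exponential, so it is not the efficient procedure the text alludes to.

```python
from collections import deque


def edit_distance(u, v):
    """Levenshtein distance via the standard dynamic program."""
    prev = list(range(len(v) + 1))
    for i, cu in enumerate(u, start=1):
        cur = [i]
        for j, cv in enumerate(v, start=1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (cu != cv)))
        prev = cur
    return prev[-1]


def two_shortest_words(start, finals, trans, alphabet, max_len=50):
    """First two accepted words in length-lexicographic order (naive search)."""
    found = []
    queue = deque([("", frozenset([start]))])
    while queue and len(found) < 2:
        word, states = queue.popleft()
        if states & finals:
            found.append(word)
            if len(found) == 2:
                break
        if len(word) < max_len:
            for s in sorted(alphabet):
                nxt = frozenset(q for (p, x, q) in trans if p in states and x == s)
                if nxt:  # skip words that lead to no state at all
                    queue.append((word + s, nxt))
    return found


def bound_B(start, finals, trans, alphabet):
    """Edit distance of the two words found above, used as the bound B_a."""
    words = two_shortest_words(start, finals, trans, alphabet)
    if len(words) < 2:
        raise ValueError("fewer than two accepted words found within max_len")
    return edit_distance(words[0], words[1])


# Example automaton (an assumption for illustration): accepts 00(100)*.
example = (0, {2}, {(0, "0", 1), (1, "0", 2), (2, "1", 0)}, "01")
```

For the example automaton, the two words found are 00 and 00100, which gives a bound of 3.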

Edit Distance via Error-Detection
In [5], the authors discuss a conceptual method for computing integral distances of regular languages (integral means that all distance values are nonnegative integers) via the property of error-detection. In this section, we review that method and produce a concrete preliminary algorithm for computing the edit distance of a regular language.
A language L is error-detecting for a channel γ [27] if no L-word can be received as a different L-word via γ; that is, for any words u and v,

$$u, v \in L \ \text{ and } \ (u, v) \in \gamma \ \text{ imply } \ u = v. \qquad (6)$$

(The definition of error-detection in [27] uses L ∪ {λ} instead of L in (6). This slight change makes the presentation here simpler and has no bearing on any existing results regarding error-detecting languages.)

Remark 1. The error-detection method of [5] for computing inner distances of regular languages is based on the following observations, where a is an NFA and t is an input-preserving transducer.
1. A language L is error-detecting for sid(m) if and only if δ(L) > m.
2. δ(L) is equal to the positive integer k such that L is error-detecting for sid(k − 1) and L is not error-detecting for sid(k).
3. We have the following facts from [27]. A language L is error-detecting for a channel γ if and only if the following relation is a function:

$$\gamma \cap (L \times L). \qquad (7)$$

If γ = R(t) and L = L(a), then the transducer t ↓ a ↑ a realizes relation (7) and can be constructed in time $O(|t||a|^2)$.
4. There is an $O(|T_s|^2 + r|Q_s|^2)$ time algorithm that decides whether a given transducer s is functional [6,28], where r is the size of the alphabet.
Using the above observations, we present first a preliminary error-detection-based algorithm for computing the desired edit distance.

Algorithm PrelimDistErrDetect
0. Input: NFA a
1. Let $B_a$ be the edit distance bound in Lemma 2
2. Let min ← 1 and max ← $B_a$ − 1
3. Perform binary search to find the largest k in {min, . . ., max} for which L(a) is error-detecting for sid(k) as follows:
[...]

Remark 2. Step (3d) of the above algorithm can be computed using the transducer functionality algorithm on $t_k$, which leads again to a polynomial but expensive algorithm. It turns out, however, using standard logical arguments, that Condition (6) is equivalent to whether (t ↓ a ↑ a) realizes a partial identity, when t realizes γ (in the above algorithm, t is $\mathrm{sid}_k$). Moreover, by [6], there is an algorithm that tests whether a given transducer t realizes a partial identity, whose running time depends on the size of t and on the size r of the alphabet.

Corollary 1. Consider the algorithm PrelimDistErrDetect. Using the partial identity test for $t_k$ in step 3d, the algorithm computes the edit distance of a language given via a trim NFA a in time $O(n^2 r^2 B_a \log B_a)$, where r is the cardinality of the alphabet used in $T_a$, and $n = |T_a|$.
Proof. The correctness of the algorithm follows from Remarks 1 and 2. For the time complexity, the whole loop will perform $O(\log B_a)$ iterations. In each iteration, the value k is used to construct the transducer $\mathrm{sid}_k$ shown in Figure 1, with alphabet being the set of alphabet symbols appearing in the definition of a. Then, the transducer $t_k$ is constructed, having $O(k|Q_a|^2)$ states and $O(k r^2 |T_a|^2)$ transitions. Then, the partial identity of $t_k$ is tested in time $O(|T_a|^2 k r^2)$. As $k < B_a$, it follows that the total time complexity is as required.
We note that, in the worst case, $B_a$ is of order O(n) and, assuming a fixed alphabet, the above algorithm operates in time $O(n^3 \log n)$, which is asymptotically better than the time complexity of the algorithm in [4], even when the given automaton is a DFA.
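The binary search of PrelimDistErrDetect can be sketched as follows. This is only an outline under stated assumptions: the callable `is_error_detecting(k)` is a hypothetical stand-in for constructing $t_k$ and running the partial identity test, and it is assumed to be monotone in k, which is justified by the first observation of Remark 1 (L is error-detecting for sid(k) if and only if δ(L) > k). The returned value is the largest such k plus one, as in the second observation.

```python
def prelim_distance(bound, is_error_detecting):
    """Binary search for the inner distance, given an upper bound B_a.

    bound: an upper bound on the distance (for example, B_a of Lemma 2).
    is_error_detecting: hypothetical oracle; is_error_detecting(k) is True
        exactly when the language is error-detecting for sid(k).
    """
    lo, hi = 1, bound - 1
    largest = 0  # sid(0) introduces no errors, so k = 0 always qualifies
    while lo <= hi:
        k = (lo + hi) // 2
        if is_error_detecting(k):
            largest = k      # distance is larger than k; search upwards
            lo = k + 1
        else:
            hi = k - 1       # distance is at most k; search downwards
    return largest + 1


def make_oracle(d):
    """Toy oracle for a language whose inner distance is known to be d."""
    return lambda k: d > k
```

For a toy oracle built from a known distance d and any bound at least d, `prelim_distance` returns d after $O(\log B_a)$ oracle calls.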

An $O(n^2 d)$ Algorithm for Edit Distance via Error-Detection
In this section, we observe that the algorithm of the previous section repeats a lot of computations, and we eliminate those repeated computations to arrive at an improved algorithm that computes the edit distance d of a trim NFA a in time $O(n^2 d r^2)$, where r is the cardinality of the alphabet used in $T_a$, and $n = |T_a|$. The improved algorithm is based on the following two observations:

• The previous algorithm starts the binary search loop by constructing the transducer $t_{B_a/2}$, but the edit distance might be much smaller than $B_a/2$. It turns out that it is more efficient in the end to construct in turn $t_1, t_2, \ldots$ until the first $t_d$ that does not realize a partial identity.

• If $t_k$ is constructed and found to realize a partial identity, then the transducer $t_{k+1}$ is constructed from scratch and the partial identity test is repeated for the part of $t_{k+1}$ that corresponds to $t_k$. We shall define the transducer $t'_{k+1}$ to be the part that is added to $t_k$ in order to obtain $t_{k+1}$, plus some initial state. Moreover, we shall show that, if $t_k$ realizes a partial identity, then $t_{k+1}$ realizes a partial identity if and only if $t'_{k+1}$ does so. Thus, the partial identity test in each step will apply only to the new part that is added to the transducer of the previous step.
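The effect of the two observations on the running time can be summarized by the following rough comparison. It is a back-of-the-envelope sketch rather than a proof: the per-iteration cost of the binary search is taken from the proof of Corollary 1, and the per-step cost of the incremental approach is an assumption, namely that building and testing only the part added in step k costs $O(r^2 n^2)$.

```latex
% Assumption: the part added in each incremental step costs O(r^2 n^2).
\begin{align*}
\text{binary search:}\quad
  & O(\log B_a)\ \text{iterations} \times O(r^2 n^2 B_a)
    = O(r^2 n^2 B_a \log B_a),\\
\text{incremental:}\quad
  & \sum_{k=1}^{d} O(r^2 n^2) = O(r^2 n^2 d).
\end{align*}
```

Since $d \le B_a$, the incremental bound is never worse, and it is much smaller when the distance d is small compared to $B_a$.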
We proceed with details based on the above observations.
Product construction of trim t' = t ↓ a ↑ a, given transducer t and NFA a. As usual in cross product constructions, the states of t' are triples of the form (ϕ, q, q'), where ϕ is a state of t, and q, q' are states of a. The initial state of t' is $(\phi_0, q_0, q_0)$, where $\phi_0$ is the initial state of t and $q_0$ is the initial state of a.

The construction is incremental, starting with the creation of $(\phi_0, q_0, q_0)$; then:

• If state (ϕ, q, q') has been created, and there are transitions $\phi \xrightarrow{x/y} \psi$, $q \xrightarrow{x} r$, $q' \xrightarrow{y} r'$ of t, $a^\lambda$ and $a^\lambda$, respectively, then the transition $(\phi, q, q') \xrightarrow{x/y} (\psi, r, r')$ is added in t'. Here, $a^\lambda$ is the λ-NFA that results if we add in a the loop transitions (q, λ, q) to all states q of a.

The final states of t' are those constructed triples consisting of final states in t and a. In the end, we also make t' trim.
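The product construction just described can be sketched in plain Python. The representation is an assumption made for illustration: an NFA is a triple (start, finals, transitions) with transitions (p, x, q) and no λ-transitions of its own, and a transducer is a triple with transitions (p, x, y, q), where the empty string stands for λ. The helper `sid` builds one plausible transducer for at most k errors (Figure 1 is not reproduced here, so its exact layout is a guess), and `accepted_pairs` enumerates labels of accepting paths, which only terminates when the product is acyclic (for example, when the NFA accepts a finite language).

```python
from collections import deque

LAMBDA = ""  # the empty word, used as a missing input or output symbol


def sid(k, alphabet):
    """Assumed layout of a transducer for at most k edit errors."""
    trans = set()
    symbols = list(alphabet) + [LAMBDA]
    for i in range(k + 1):
        for s in alphabet:
            trans.add((i, s, s, i))                  # copy a symbol
        if i < k:
            for x in symbols:
                for y in symbols:
                    if x != y:                       # any error raises the counter
                        trans.add((i, x, y, i + 1))
    return 0, set(range(k + 1)), trans


def product(t, a):
    """Trim product t ↓ a ↑ a: input and output are both read by the NFA a."""
    t_start, t_finals, t_trans = t
    a_start, a_finals, a_trans = a
    a_states = {a_start} | {p for p, _, _ in a_trans} | {q for _, _, q in a_trans}
    a_lam = set(a_trans) | {(q, LAMBDA, q) for q in a_states}  # add λ-loops
    start = (t_start, a_start, a_start)
    seen, queue, trans = {start}, deque([start]), set()
    while queue:
        phi, q, q2 = queue.popleft()
        for (p, x, y, psi) in t_trans:
            if p != phi:
                continue
            for (qa, lx, r) in a_lam:
                if qa != q or lx != x:
                    continue
                for (qb, ly, r2) in a_lam:
                    if qb != q2 or ly != y:
                        continue
                    target = (psi, r, r2)
                    trans.add(((phi, q, q2), x, y, target))
                    if target not in seen:
                        seen.add(target)
                        queue.append(target)
    finals = {s for s in seen
              if s[0] in t_finals and s[1] in a_finals and s[2] in a_finals}
    useful, changed = set(finals), True
    while changed:                                   # keep states reaching a final
        changed = False
        for (src, _, _, dst) in trans:
            if dst in useful and src not in useful:
                useful.add(src)
                changed = True
    trans = {e for e in trans if e[0] in useful and e[3] in useful}
    return start, finals, trans


def accepted_pairs(t):
    """Labels (u, v) of all accepting paths; assumes the transducer is acyclic."""
    start, finals, trans = t
    pairs, stack = set(), [(start, "", "")]
    while stack:
        state, u, v = stack.pop()
        if state in finals:
            pairs.add((u, v))
        stack.extend((dst, u + x, v + y) for (src, x, y, dst) in trans if src == state)
    return pairs


# Example NFA (an assumption for illustration) accepting {ab, ba}.
nfa = (0, {2, 4}, {(0, "a", 1), (1, "b", 2), (0, "b", 3), (3, "a", 4)})
```

For the example NFA, whose language has inner distance 2, the product with `sid(1, "ab")` accepts only pairs of the form (w, w), while the product with `sid(2, "ab")` also accepts (ab, ba). The triple-nested loop is written for readability, not efficiency.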
Optimized construction of $t_{k+1}$ and $t'_{k+1}$ from the trim $t_k$. Suppose that $t_k$ has been constructed, where initially $t_1 = \mathrm{sid}_1 \downarrow a \uparrow a$. Constructing $t_{k+1}$ using $t_k$ will be done again incrementally. The first phase of the incremental construction is to add the new transitions

$$([k], q, q') \xrightarrow{x/y} ([k+1], r, r'), \qquad (8)$$

where x/y is an error and $q \xrightarrow{x} r$, $q' \xrightarrow{y} r'$ are transitions in $a^\lambda$. There will be no new transitions of the form $([i], q, q') \xrightarrow{x/y} ([k+1], r, r')$ for i < k, because the transducer $\mathrm{sid}_{k+1}$ has no transitions from any state [i] with i < k to state [k + 1].

After the first phase, the incremental construction proceeds from the new states ([k + 1], r, r') in (8). Any new transition must be of the form

$$([k+1], q, q') \xrightarrow{\sigma/\sigma} ([k+1], r, r'), \qquad (9)$$

where σ ∈ Σ. This is because the transducer $\mathrm{sid}_{k+1}$ has only transitions of the form $[k+1] \xrightarrow{\sigma/\sigma} [k+1]$ going out of state [k + 1]. The process ends when no new states are created. The transitions and final states of the transducer $t_{k+1}$ are those in $t_k$ plus the newly created ones, after removing any new states that cannot reach a final state (thus, also $t_{k+1}$ is trim). The transducer $t'_{k+1}$ has as transitions and final states only the newly created ones, and has as initial state a new state [−1] with transitions $[-1] \xrightarrow{\lambda/\lambda} ([k], q, q')$, for all states of the form ([k], q, q').

Lemma 3. Suppose the trim transducer $t_k$ realizes a partial identity. [...]

[...] Let $[-1] \xrightarrow{\lambda/\lambda} ([k], p, p')$ be the first transition of $C'_2$. Let $C_2$ be the path that results when we remove the first transition of $C'_2$. By the construction of $t'_{k+1}$, there is a computation $C_1$ of $t_k$ that ends at state ([k], p, p'). Let $(w_1, w_1)$ be the label of $C_1$. Then, $C_1 C_2$ is an accepting computation of $t_{k+1}$ with label $(w_1 w_2, w_1 w'_2)$. As $t_{k+1}$ realizes a partial identity, $w_1 w_2 = w_1 w'_2$, which implies $w_2 = w'_2$, as required.
For the 'if' part of the second statement, assume that $t'_{k+1}$ realizes a partial identity. Consider any accepting computation C of $t_{k+1}$. We show that the label of C must be of the form (w, w). If C is already a computation of $t_k$, then this holds, as $t_k$ realizes a partial identity. Now suppose that $C = C_1 C_2$ such that $C_1$ is a computation of $t_k$ and $C_2$ is a path in $t_{k+1}$ that starts with a transition as in (8) and then uses transitions as in (9). Let ([k], p, p') be the last state of $C_1$, which is also the first state of $C_2$. Then, $C_1$ has some label $(w_1, w_1)$. In addition, the path $([-1], \lambda/\lambda, ([k], p, p'))\, C_2$ is an accepting computation of $t'_{k+1}$, which implies that it has some label $(w_2, w_2)$. Hence, the label of C is $(w_1 w_2, w_1 w_2)$ and, therefore, $t_{k+1}$ realizes a partial identity.

An $O(n^2 d)$ Algorithm for Edit Distance via Input-Altering Transducers
In this section, we present another algorithm for computing the desired edit distance via input-altering transducers (see Theorem 3 and the associated algorithm). A transducer t is called input-altering if $w \notin t(w)$ for all words w, that is, the output of t is never equal to the input used.
We explain now how input-altering transducers are related to edit-distance and error-detection. Let t be a transducer. A language L is t-independent [29,30] if

$$u, v \in L \ \text{ and } \ v \in t(u) \ \text{ imply } \ u = v. \qquad (10)$$

Of course, when R(t) is input-preserving, then t-independence is the same as error-detection for the channel R(t), and condition (10) can be tested as explained in Remark 2. On the other hand, if the transducer t is input-altering, then, by [30], condition (10) is equivalent to

$$t(L) \cap L = \emptyset. \qquad (11)$$

If L is accepted by some NFA a, then the above condition can be tested using two product constructions: first, construct an NFA b accepting t(L), then construct an NFA c by intersecting b with a, and then test whether there is a path from the start to a final state of c. Thus, condition (11) can be tested in time polynomial in the sizes of t and a.

Certain types of input-altering transducers are useful in constructing maximal t-independent languages [30]. In Theorem 2, we show how an input-altering transducer can be used to model the edit operations used in the definition of the edit distance.
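The emptiness test behind condition (11) can be sketched as a reachability search that combines the two product constructions into one pass over triples (transducer state, NFA state on the input side, NFA state on the output side). The representation is assumed for illustration: transducer transitions are tuples (p, x, y, q) with the empty string for λ, and the NFA has transitions (p, x, q) and no λ-transitions. The transducer `one_substitution` is a hypothetical example of an input-altering transducer (it changes exactly one symbol); it is not the transducer $\mathrm{ia}_k$ of the paper.

```python
from collections import deque

LAMBDA = ""  # the empty word


def one_substitution(alphabet):
    """Hypothetical input-altering transducer: substitutes exactly one symbol."""
    trans = set()
    for x in alphabet:
        trans.add((0, x, x, 0))          # copy before the error
        trans.add((1, x, x, 1))          # copy after the error
        for y in alphabet:
            if x != y:
                trans.add((0, x, y, 1))  # the single substitution
    return 0, {1}, trans


def is_independent(t, a):
    """For an input-altering t: True iff t(L(a)) and L(a) are disjoint."""
    t_start, t_finals, t_trans = t
    a_start, a_finals, a_trans = a

    def a_steps(q, label):
        # A λ label leaves the NFA state unchanged (the added λ-loop).
        if label == LAMBDA:
            return [q]
        return [r for (p, x, r) in a_trans if p == q and x == label]

    start = (t_start, a_start, a_start)
    seen, queue = {start}, deque([start])
    while queue:
        phi, q, q2 = queue.popleft()
        if phi in t_finals and q in a_finals and q2 in a_finals:
            return False                 # some L-word is output on some L-word
        for (p, x, y, psi) in t_trans:
            if p != phi:
                continue
            for r in a_steps(q, x):
                for r2 in a_steps(q2, y):
                    nxt = (psi, r, r2)
                    if nxt not in seen:
                        seen.add(nxt)
                        queue.append(nxt)
    return True


# Example NFAs (assumptions for illustration).
nfa_ab_ba = (0, {2, 4}, {(0, "a", 1), (1, "b", 2), (0, "b", 3), (3, "a", 4)})
nfa_ab_aa = (0, {2}, {(0, "a", 1), (1, "b", 2), (1, "a", 2)})
```

With the one-substitution transducer, the language {ab, ba} passes the test, whereas {ab, aa} fails it because ab can be turned into aa by a single substitution. Stopping at the first reachable final triple means that no transitions need to be stored.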

An Input-Altering Transducer for Edit-Distance
We shall define the input-altering transducer $\mathrm{ia}_k$, which is partially shown in Figure 2. The value i in a state [i] or [i, a] is called the error counter, meaning that any path from [0] to a state with error counter i has to be labeled u/v such that δ(u, v) ≤ i. More precisely, we will define the edges such that a state [i, a] can be reached from [0] via a path with label u/v if and only if u = vax for some word x and i = |ax|; thus, v is a proper prefix of u and state [i, a] remembers the left-most letter of u that occurs after its prefix v. A state [i] with i ≥ 1 can only be reached via a path labeled u/v from [0] if 1 ≤ δ(u, v) ≤ i; thus, $u \neq v$. Furthermore, we make sure that, for $u \neq v$ such that neither $u \le_p v$ nor $v \le_p u$, there is a path from [0] to [δ(u, v)], which is labeled by u/v or v/u.
The transducer $\mathrm{ia}_k$ is defined as follows. The set of states is

$$Q = \{[i] : 0 \le i \le k\} \cup \{[i, a] : 1 \le i \le k,\ a \in \Sigma\},$$

with all but the initial state [0] being final states: $F = Q \setminus \{[0]\}$. The transitions in $\mathrm{ia}_k$ can be divided into the four sets of edges $E = E_0 \cup E_s \cup E_i \cup E_d$. The transitions from $E_0$ do not introduce any error; edges from the other sets model one substitution ($E_s$), insertion ($E_i$), or deletion ($E_d$): [...]

Terminology. If t = (Q, Σ, T, $q_0$, F) is a transducer in standard form, then we write $t^e$ for the NFA $t^e = (Q, E_\Sigma, T, q_0, F)$ over the edit alphabet $E_\Sigma$, where the labels of the transitions in t are viewed as elements of $E_\Sigma$. Note that the label of a path P in t is a pair of words (u, v), whereas the label of the corresponding path in $t^e$, which we denote as $P^e$, is an edit string h such that inp(h) = u and out(h) = v. This type of NFA is called an eNFA in [23].
Definition 2. An edit string h of nonzero weight is called reduced if (a) the first error in h is not an insertion, and (b) if the first error in h is a deletion of the form (a/λ), then the first non-deletion edit operation that follows (a/λ) in h (if any) is of the form σ/σ with σ ∈ Σ \ {a}.

The proofs of the next two lemmata are given in the appendix.
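Definition 2 can be checked mechanically. The sketch below assumes an edit string is a Python list of pairs (x, y) with the empty string standing for λ; the function name is illustrative. Since the definition only covers edit strings of nonzero weight, the sketch returns False for edit strings without errors, which is a convention chosen here and not part of the definition.

```python
LAMBDA = ""  # stands for λ inside an edit operation


def is_reduced(h):
    """Check conditions (a) and (b) of Definition 2 for the edit string h."""
    error_positions = [i for i, (x, y) in enumerate(h) if x != y]
    if not error_positions:
        return False  # convention: zero-weight edit strings are not covered
    first = error_positions[0]
    x, y = h[first]
    if x == LAMBDA:
        return False  # (a) the first error must not be an insertion
    if y == LAMBDA:
        a = x         # the first error is the deletion (a/λ)
        for (u, v) in h[first + 1:]:
            if u != LAMBDA and v == LAMBDA:
                continue  # skip the deletions that follow
            # (b) the first non-deletion must be σ/σ with σ different from a
            return u == v and u != a
    return True       # first error is a substitution, or only deletions follow
```

For instance, the edit string (a/λ)(b/b) is reduced, while (a/λ)(a/a) is not: in the latter, deleting the second a instead of the first gives the same input and output words.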
Lemma 4. Let x, y, u, v be words. The following statements hold true: [...] then there is a reduced edit string h realizing δ(u, v).
Lemma 5. Let $k \in \mathbb{N}$ and let u, v be words. The following statements hold true with respect to the transducer $\mathrm{ia}_k$.
1. In $\mathrm{ia}^e_k$, every path from the start state [0] to any state [i] or [i, a] has as label a reduced edit string whose weight is equal to i.
Theorem 2. For each $k \in \mathbb{N}$, the transducer $\mathrm{ia}_k$ is input-altering and of size $O(kr^2)$, where r is the cardinality of the alphabet, and satisfies the following condition, for any language L containing at least two words:

$$\mathrm{ia}_k(L) \cap L = \emptyset \ \text{ if and only if } \ \delta(L) > k. \qquad (20)$$

Proof. By construction, it follows that $\mathrm{ia}_k$ is trim and has O(rk) states and $O(kr^2)$ transitions. Hence, it is indeed of size $O(kr^2)$. The third statement of Lemma 5 implies that the transducer is input-altering. Next, we show that (20) is true for all languages L containing at least two words. First, for the 'if' part, assume δ(L) > k and consider any words u, v ∈ L. We need to prove $v \notin \mathrm{ia}_k(u)$. If u = v, then this holds as $\mathrm{ia}_k$ is input-altering. Else, it follows from the third statement of Lemma 5. Now, for the 'only if' part, assume

$$\mathrm{ia}_k(L) \cap L = \emptyset \qquad (21)$$

but, for the sake of contradiction, suppose there are different words u, v ∈ L such that δ(u, v) ≤ k. Let h be a reduced edit string realizing δ(u, v). By the second statement of Lemma 5, h is accepted by $\mathrm{ia}^e_k$ via some path $P^e$ and, therefore, either of (u, v) and (v, u) is the label of the path P of $\mathrm{ia}_k$, that is, we have $u \in \mathrm{ia}_k(v)$ or $v \in \mathrm{ia}_k(u)$, which contradicts (21).
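Corollary 2. For each NFA a accepting at least two words and for each transducer $\mathrm{ia}_k$, with $k \in \mathbb{N}$, the following condition is satisfied:

$$R(\mathrm{ia}_k \downarrow a \uparrow a) = \emptyset \ \text{ if and only if } \ \delta(L(a)) > k.$$

Proof. The statement follows from the above theorem and the fact (based on standard logic arguments) that $\mathrm{ia}_k(L(a)) \cap L(a) = \emptyset$ if and only if $R(\mathrm{ia}_k \downarrow a \uparrow a) = \emptyset$.

The reason why condition $R(\mathrm{ia}_k \downarrow a \uparrow a) = \emptyset$ is preferred to the equivalent one in Theorem 2 is explained further below in the remark that follows Theorem 3.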

The Second $O(n^2 d)$ Algorithm for Edit Distance
Here, we use the results of the previous subsection to arrive at the second algorithm for computing the desired edit distance. Corollary 2 implies that the preliminary algorithm PrelimDistInpAlter shown below correctly computes the desired edit distance. Moreover, by reasoning as in the proof of Corollary 1, it follows that this algorithm also executes in time $O(n^2 r^2 B_a \log B_a)$, where r is the cardinality of the alphabet used in $T_a$, and $n = |T_a|$.
Algorithm PrelimDistInpAlter
0. Input: NFA a
1. Let $B_a$ be the bound in Lemma 2
2. Let min ← 1 and max ← $B_a$ − 1
3. Perform binary search to find the largest k in {min, . . ., max} for which L(a) is error-detecting for sid(k) as follows:
[...]

We discuss now how to improve the above algorithm. The two observations we made at the beginning of Section 4 apply here as well if, instead of partial identity of $t_k$, we talk about the emptiness of $t_k$. Thus, we want the improved algorithm to construct in turn $t_1, t_2, \ldots$ until the first $t_d$ with $R(t_d) \neq \emptyset$. Moreover, when $t_k$ has been constructed and realizes ∅, we continue in the next step with new transitions added to $t_k$ in order to get $t_{k+1}$.
Optimized construction of $t_{k+1}$ from the trim $t_k$. Suppose that the trim $t_k$ has been constructed, where initially $t_1 \leftarrow \mathrm{ia}_1 \downarrow a \uparrow a$. Constructing $t_{k+1}$ using $t_k$ will be done again incrementally. The first phase of the incremental construction is to add two sets of new transitions: the new transitions [...], where x/y is an error and $q \xrightarrow{x} r$, $q' \xrightarrow{y} r'$ are transitions in $a^\lambda$; and the new transitions [...], where a, σ ∈ Σ, and $q \xrightarrow{\sigma} r$, $q' \xrightarrow{\lambda} r'$ are transitions in $a^\lambda$. Note that the total numbers of new transitions and states created in the first phase are $O(|T_a|^2 r^2)$ and $O(|Q_a|^2 r)$, respectively.
After the first phase, the incremental construction proceeds from the new states ([k + 1], r, r') and ([k + 1, a], r, r'). Any new transition must be of the form [...], where σ ∈ Σ and, in the second case above, $\sigma \neq a$. This is because the transducer $\mathrm{ia}_{k+1}$ has only transitions of the form [...]

Remark 4. When a final state f, say, of $t_k$ is created, then we know that there is an accepting path of $\mathrm{ia}_k \downarrow a \uparrow a$ ending at f. The label of that path is a word pair (u, v) such that δ(u, v) = k. Thus, the above algorithm can be modified to return not only the edit distance of L(a), but also a witness pair for that distance.

Implementation and Testing
As both algorithms DistErrDetect and DistInpAlter have the same theoretical complexity, we chose to implement one of the two.We chose to implement DistInpAlter because it requires a simpler test for each constructed transducer (although ia k is slightly more complex than sid k , the test for partial identity, [6], is more sophisticated than testing merely for existence of final states).We have also implemented the preliminary algorithm PrelimDistInpAlter.
Our implementation uses the FAdo package for automata, version 1.3.5.1 [7], which is well maintained and provides several useful tools for manipulating automata. We have performed several tests for the correctness of these algorithms, as well as two sets of tests for the time complexity, which confirm the theoretical result that DistInpAlter is indeed faster than PrelimDistInpAlter. (All tests were performed on a laptop with the following specification. Make: Apple, Model: MacBook Pro, Processor: 2.5 GHz Intel Core i7, Memory (RAM): 16.00 GB, Operating System: macOS High Sierra Version 10.13.6.) The two sets of tests correspond to two lists of DFAs, $(a_n)$ and $(b_n)$, shown in Figures 3 and 4. The first test set is such that the desired distance is equal to n, for each DFA $a_n$, that is, the distance grows with n and, in fact, it is a worst-case scenario where the distance is equal to the number of states of the automaton. The second test set is such that the desired distance is fixed, equal to 2, for all n.
Figure 3. The automaton $a_n$ accepting the language $0^{n-1}(10^{n-1})^*$.

Figure 4. [...] where '%' is the integer division remainder operation. This code has edit distance equal to 2. On the other hand, its distance for insertion/deletion errors only is 3. The automaton (before making it trim) has $n^2 + n + 1$ states: [n, 0] and [i, s], with 0 ≤ i ≤ n − 1 and 0 ≤ s ≤ n. The meaning of state [i, s] is that the automaton has read i bits b [...]

Table 1 shows the actual running times (in seconds) of the two algorithms on the DFAs $a_{28}, a_{41}, a_{56}, a_{76}, a_{100}, a_{124}, a_{152}, a_{184}$. The number in parentheses next to each $a_n$ indicates the number of transitions in $a_n$. The column d shows the computed edit distance, and the column $B_{a_n}$ shows the computed upper bound on the edit distance (used in algorithm PrelimDistInpAlter). Table 2 shows the actual running times (in seconds) of the two algorithms on the DFAs $b_6, \ldots, b_{13}$. Again, the number in parentheses next to each $b_n$ indicates the number of transitions in $b_n$.
In both test sets, the empirical outcomes confirm the asymptotic outcome that the improved algorithm based on the optimized construction is faster than the preliminary one.
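The first test family can be reproduced on a small scale without any automata library. The sketch below builds one natural DFA for the language $0^{n-1}(10^{n-1})^*$ with n states (an assumption, since Figure 3 is not reproduced here and the authors' automaton may differ in detail), enumerates the accepted words up to a length cutoff, and computes their minimum pairwise edit distance by brute force. It is meant only as a sanity check of the claim that the distance equals n, not as a substitute for DistInpAlter or for FAdo.

```python
from itertools import combinations


def edit_distance(u, v):
    """Levenshtein distance via the standard dynamic program."""
    prev = list(range(len(v) + 1))
    for i, cu in enumerate(u, start=1):
        cur = [i]
        for j, cv in enumerate(v, start=1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (cu != cv)))
        prev = cur
    return prev[-1]


def build_a(n):
    """Assumed DFA for 0^(n-1) (1 0^(n-1))*: a cycle of n states."""
    trans = {(i, "0", i + 1) for i in range(n - 1)}
    trans.add((n - 1, "1", 0))  # reading 1 restarts the block of zeros
    return 0, {n - 1}, trans


def accepted_words(dfa, max_len):
    """All accepted words of length at most max_len, shortest first."""
    start, finals, trans = dfa
    words, frontier = [], [("", start)]
    for _ in range(max_len + 1):
        nxt = []
        for w, q in frontier:
            if q in finals:
                words.append(w)
            nxt.extend((w + x, r) for (p, x, r) in trans if p == q)
        frontier = nxt
    return words


def sampled_inner_distance(n):
    """Minimum pairwise distance among the accepted words of length <= 3n."""
    words = accepted_words(build_a(n), 3 * n)
    return min(edit_distance(u, v) for u, v in combinations(words, 2))
```

Consecutive words of the language differ in length by exactly n, so the sampled value matches n for the small cases tried here.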

Conclusions
This paper presents a significant improvement in the time complexity of computing the inner edit distance of a given regular language. The performance tests of the implemented algorithm show that, in practice, the algorithm is reasonably fast for moderate-size automata. As discussed in [4], this problem is related to the inherent capability of a language to detect substitution, insertion, and deletion errors.
The two preliminary algorithms can be applied to different distances as long as these distances can be related to appropriate transducers. For some of those distances, the idea in the optimized algorithms can also be used. For example, one can construct a transducer similar to $sid_k$ for insertion/deletion-only errors. A direction for future research is to investigate to what extent the methods used here can be extended to compute inner weighted distances or the inner edit distance with moves.
where the $e_i$'s are non-errors, $d \in \mathbb{N}_0$ and each $(a_j/\lambda)$ is a deletion, and $g_0$ does not start with a deletion. We have the following subcases:

• If $g_0$ is empty or starts with an edit operation $(\sigma/\sigma)$ in which $\sigma \in \Sigma \setminus \{a\}$, then the required $h$ is $g_0$.

• If $g_0$ starts with an edit operation $(x/a)$ in which $x \in \Sigma \cup \{\lambda\}$, then it is of the form $g_0 = (x/a)g_1$, and the edit string realizes $\delta(u, v)$, as $\mathrm{weight}(g_1) = \mathrm{weight}(g_0)$. The process now continues from the first step using $g_1$ for $g_0$.

As the edit string $g_0$ is finite, the above process terminates with a reduced edit string $h$, as required.

Proof. (Of Lemma 5)
The first statement follows when we note that the definition of $ia_k$ and $ia^e_k$ implies the following facts: (a) an edge exists from a state with error counter $i$ to one with error counter $i + 1$ if and only if the label of that edge is an error; thus, for any path from $[0]$ to $[i]$ or $[i, a]$, the label of that path contains exactly $i$ errors; (b) any edit string accepted by $ia^e_k$ is indeed reduced.

For the second statement, consider any reduced edit string $h$ realizing $\delta(u, v)$. We have two cases for the first error of $h$. If the first error in $h$ is a deletion, then $h$ is of the form where each $e_i$ is a non-error edit operation of the form $(\sigma_i/\sigma_i)$, $(a/\lambda)$ is a deletion error, $d \in \mathbb{N}_0$ and each $(b_j/\lambda)$ is a deletion error, and $h'$ is an edit string that is either empty or starts with a non-error $(\sigma/\sigma)$ such that $\sigma \neq a$. Consider the following path of $ia^e_k$ accepting $h$. For the case where the first error in $h$ is a substitution, one verifies that again $h$ is accepted by $ia^e_k$.

For the third statement, if $v \in ia_k(u)$, then $(u, v)$ is the label of a path $P$ from $[0]$ to a final state $[i]$ or $[i, a]$, with $0 < i \le k$. As the label of the path $P^e$ has exactly $i$ errors, it follows that $\delta(u, v) \le i \le k$.

We also need to show that $\delta(u, v) \ge 1$, that is, $u \neq v$. First, consider the case where the path $P$ ends at $[i, a]$, with $1 \le i \le k$. Then, the label of $P^e$ is an edit string of the form

Figure 1. The input-preserving transducer $sid_k$ realizing the channel sid($k$). Each edge label $\sigma/\sigma$ represents many transitions, one for each symbol $\sigma$ of the alphabet, and similarly for $\sigma/\lambda$ and $\lambda/\sigma$. Each edge label $\sigma/\tau$ represents many transitions, one for each pair of distinct symbols $\sigma$ and $\tau$ from the alphabet. Thus, if the alphabet size is $r$, then the transducer has $O(k)$ states and $O(r^2 k)$ transitions.
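The size bounds in the caption can be illustrated with a small sketch that lists the transitions of a transducer shaped like $sid_k$. The layout is an assumption, since the figure itself is not reproduced here: states $[0], \ldots, [k]$ act as error counters, every state has the non-error loops $\sigma/\sigma$, and every error label leads from $[i]$ to $[i+1]$. The function name `build_sid` and the tuple encoding are hypothetical, and the empty string stands for $\lambda$.

```python
def build_sid(k, alphabet):
    """Return (states, transitions) of a sid_k-like transducer.

    Assumed layout: state i counts the errors made so far; a transition
    is a tuple (source, input, output, target), with "" standing for
    the empty word lambda.
    """
    states = list(range(k + 1))
    transitions = []
    for i in states:
        for a in alphabet:
            transitions.append((i, a, a, i))               # non-error a/a
        if i < k:
            for a in alphabet:
                transitions.append((i, a, "", i + 1))      # deletion a/lambda
                transitions.append((i, "", a, i + 1))      # insertion lambda/a
                for b in alphabet:
                    if a != b:
                        transitions.append((i, a, b, i + 1))  # substitution a/b
    return states, transitions
```

Under these assumptions, an alphabet of size $r$ gives $k + 1$ states and $(k+1)r + k\,r(r+1)$ transitions, which is consistent with the $O(k)$ and $O(r^2 k)$ bounds in the caption.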
with $i < k$ to state $[k + 1]$. Note that the numbers of new transitions and new states created as in (8) are $O(|T_a|^2 r^2)$ and $O(|Q_a|^2)$, respectively.

Figure 2. A segment of the input-altering transducer $ia_k$: for each $a \in \Sigma$ the complete transducer has $k$ states of the form $[i, a]$. The labels $\sigma$ and $\tau$ on an edge mean: one edge for each $\sigma, \tau \in \Sigma$ with $\sigma \neq \tau$; for some edge sets, additional restrictions apply, denoted, for example, by $|_{\sigma \neq a}$.

Definition 1. The transducer $ia_k = (Q, \Sigma, E, [0], F)$

Example.
The edit string $(a/a)(a/b)(a/\lambda)(\lambda/a)$ is reduced as its first error is a substitution. The edit string $(a/a)(a/\lambda)(b/b)(b/a)$ is reduced as well. The edit string $(\lambda/a)(a/a)$ is not reduced as it starts with an insertion, and the edit string $(a/\lambda)(b/a)(b/b)$ is not reduced either.
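The four examples above can be checked mechanically with a short sketch. The test it applies is an assumption pieced together from the examples and from the proof of Lemma 5, not a quotation of the formal definition: the first error must be a substitution, or it must be a deletion $(a/\lambda)$ followed by zero or more deletions and then either nothing or a non-error $(\sigma/\sigma)$ with $\sigma \neq a$. The names `is_reduced` and `LAMBDA` are hypothetical, and treating an error-free edit string as reduced is also an assumption.

```python
LAMBDA = ""  # stands for the empty word in an edit operation (x/y)


def is_error(op):
    """An edit operation (x/y) is an error unless it has the form (s/s)."""
    x, y = op
    return x != y


def is_reduced(edit_string):
    """Check the assumed 'reduced' condition on a list of pairs (x, y)."""
    first = next((i for i, op in enumerate(edit_string) if is_error(op)), None)
    if first is None:
        return True            # assumption: no errors counts as reduced
    x, y = edit_string[first]
    if x == LAMBDA:
        return False           # first error is an insertion
    if y != LAMBDA:
        return True            # first error is a substitution
    # First error is the deletion (x/lambda): skip the deletions after it.
    j = first + 1
    while j < len(edit_string) and edit_string[j][1] == LAMBDA:
        j += 1
    if j == len(edit_string):
        return True            # nothing follows the block of deletions
    s, t = edit_string[j]
    return s == t and s != x   # must continue with a non-error (s/s), s != x
```

For instance, `is_reduced([("a", "a"), ("a", ""), ("b", "b"), ("b", "a")])` evaluates the second example.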

Figure 4. The automaton $b_n$ accepting the Levenshtein code, which consists of all binary words $b_1 \cdots b_n$ of length $n$ such that $\left(\sum_{i=1}^{n} i \cdot b_i\right) \,\%\, (n + 1) = 0$, where '%' is the integer division remainder operation. This code has edit distance equal to 2. On the other hand, its distance for insertion/deletion errors only is 3. The automaton (before making it trim) has $n^2 + n + 1$ states: $[n, 0]$ and $[i, s]$, with $0 \le i \le n - 1$ and $0 \le s \le n$. The meaning of state $[i, s]$ is that the automaton has read $i$ bits $b_1 \cdots b_i$ and $s = 1 \cdot b_1 + \cdots + i \cdot b_i$. We have that $f(i, s) = [i + 1, (s + i + 1) \,\%\, (n + 1)]$.
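The state description in the caption can be turned into a small simulation. This is a sketch under stated assumptions rather than the FAdo construction used in the tests: the rule $f(i, s)$ is taken to describe reading the bit 1, reading the bit 0 is assumed to leave $s$ unchanged, the running sum is kept modulo $n + 1$, and a move into level $n$ is assumed to exist only when it reaches $[n, 0]$. The function names are hypothetical.

```python
def b_n_states(n):
    """States [i, s] with 0 <= i <= n-1 and 0 <= s <= n, plus [n, 0]."""
    return [(i, s) for i in range(n) for s in range(n + 1)] + [(n, 0)]


def b_n_step(n, state, bit):
    """One transition of the (untrimmed) automaton, or None if undefined.

    Assumption: on bit 1 the weighted sum grows by i + 1 (mod n + 1);
    on bit 0 it is unchanged.
    """
    i, s = state
    if i >= n:
        return None                      # no transitions leave [n, 0]
    target = (i + 1, (s + (i + 1) * bit) % (n + 1))
    if target[0] == n and target[1] != 0:
        return None                      # only [n, 0] exists at level n
    return target


def b_n_accepts(n, word):
    """Run the automaton on a binary string; [n, 0] is the final state."""
    state = (0, 0)
    for ch in word:
        state = b_n_step(n, state, int(ch))
        if state is None:
            return False
    return state == (n, 0)
```

With these assumptions, `len(b_n_states(n))` equals $n^2 + n + 1$, and the accepted words of length $n$ are those satisfying the checksum condition in the caption.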

• If $C_1$ is a computation of $t_k$ ending at a state of the form $([k], p, p')$, then the label of $C_1$ is of the form $(w_1, w_1)$.

• $t_{k+1}$ realizes a partial identity if and only if $t'_{k+1}$ does so.

Proof. For the first statement, consider any computation $C_1$ of $t_k$ having some label $(w_1, w_1')$ and ending at a state of the form $([k], p, p')$. We show that $w_1 = w_1'$. If the state $([k], p, p')$ is final, then $C_1$ is an accepting computation, which implies $w_1 = w_1'$, as $t_k$ realizes a partial identity. If $([k], p, p')$ is not a final state, then, as $t_k$ is trim, there is a path $C_1'$ from $([k], p, p')$ to a final state of $t_k$, where all states of that path are of the form $([k], r, r')$ and all labels of that path are of the form $\sigma/\sigma$; this is because any transition of $sid_k$ from state $[k]$ can only go to state $[k]$ and can only have a label of the form $\sigma/\sigma$. Thus, there is an accepting path of $t_k$ of the form $C_1 C_1'$ with label $(w_1 z, w_1' z)$ for some nonempty word $z$. Then, as $t_k$ realizes a partial identity, we have that $w_1 z = w_1' z$, which implies $w_1 = w_1'$, as required.

For the 'only if' part of the second statement, assume that $t_{k+1}$ realizes a partial identity. Consider any accepting computation $C_2$ of $t'_{k+1}$ with some label $(w_2, w_2')$. We show that $w_2 = w_2'$. Let [−1]

Algorithm DistErrDetect
1. Construct the transducer $sid_1$ realizing the channel sid(1); see Figure 1
2. Construct the trim transducer $t_1 = sid_1 \downarrow a \uparrow a$
3. Let $k \leftarrow 1$
4. Let $s \leftarrow t_1$
5. while ($s$ realizes a partial identity)
   a) Construct $t_{k+1}$ and $t'_{k+1}$ from $t_k$ using the optimized construction
   b) Let $s \leftarrow t_{k+1}$
   c) Let $k \leftarrow k + 1$
6. return $k$

Theorem 1. Algorithm DistErrDetect computes the edit distance of a language given via a trim NFA $a$ in time $O(n^2 d r^2)$, where $d$ is the computed edit distance, $n = |T_a|$, and $r$ is the cardinality of the alphabet used in $T_a$.

Proof. The correctness of the algorithm follows from the optimized construction and the above lemma. For the time complexity of the algorithm, we note the following. First, $t_1$ is constructed in time $O(|a|^2 r^2)$. Then, $t_2, \ldots, t_d$ are constructed according to the optimized construction. Each of these is constructed in time $O(|a|^2 r^2)$ and has $O(|Q_a|^2)$ states and $O(|T_a|^2 r^2)$ transitions. In addition, each $t_k$ is tested for partial identity in time $O(|T_a|^2 r^2 + |Q_a|^2 r)$, which is $O(|a|^2 r^2)$.
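The control flow of the algorithm can be mimicked on a finite language without any transducers. The following is only an illustrative analogue under that restriction, not the transducer-based procedure analyzed above: the test "realizes a partial identity" is replaced by the check that no word of the language reaches a different word of the language with at most $k$ edit operations, and each round extends the sets computed in the previous round instead of starting over. The names `one_edit` and `dist_err_detect_finite` are hypothetical.

```python
def one_edit(w, alphabet):
    """All words obtained from w by exactly one substitution,
    insertion, or deletion."""
    out = set()
    for i in range(len(w) + 1):
        for c in alphabet:
            out.add(w[:i] + c + w[i:])                 # insertion
        if i < len(w):
            out.add(w[:i] + w[i + 1:])                 # deletion
            for c in alphabet:
                if c != w[i]:
                    out.add(w[:i] + c + w[i + 1:])     # substitution
    return out


def dist_err_detect_finite(language, alphabet):
    """Smallest k such that two different words of a finite language
    are within k edit operations of each other."""
    language = set(language)
    if len(language) < 2:
        raise ValueError("the language must contain at least two words")
    reached = {w: {w} for w in language}    # words within k - 1 edits of w
    frontier = {w: {w} for w in language}   # words at exactly k - 1 edits
    k = 1
    while True:
        for w in language:
            new = set()
            for x in frontier[w]:
                new |= one_edit(x, alphabet)
            new -= reached[w]               # keep words at exactly k edits
            reached[w] |= new
            frontier[w] = new
            if any(v != w for v in new & language):
                return k                    # error detection fails for k
        k += 1
```

The sets grow quickly with $k$ and with the word length, so this sketch is only practical for small examples such as `dist_err_detect_finite({"0000", "1001", "0110", "1111"}, "01")`.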
1], with σ = a, going out of the state $[k + 1]$. The incremental process ends when no new states are created. The transitions and final states of the transducer $t_{k+1}$ are those in $t_k$ plus the newly created ones, after removing any new states that cannot reach a final state (thus, $t_{k+1}$ is also trim).

Remark 3. If the trim transducer $t_k$ has no final states, then $t_{k+1}$ has no final states if and only if none of the newly created states in the optimized construction is a final state.

Theorem 3. Algorithm DistInpAlter computes the edit distance of the language given via a trim NFA $a$ in time $O(n^2 d r^2)$, where $d$ is the computed edit distance, $n = |T_a|$, and $r$ is the cardinality of the alphabet used in $T_a$.

Proof. The correctness of the algorithm follows from the above optimized construction and Corollary 2. For the time complexity of the algorithm, we note the following. First, $t_1$ is constructed in time $O(|a|^2 r^2)$. Then, $t_2, \ldots, t_d$ are constructed according to the optimized construction. Each of these is constructed in time $O(|a|^2 r^2)$ and has $O(|Q_a|^2 r)$ states and $O(|T_a|^2 r^2)$ transitions. In addition, each $t_k$ is tested for final states in linear time.

Table 1. Outcomes of performance tests on the automata $(a_n)$.

Table 2. Outcomes of performance tests on the automata $(b_n)$.