Spheres of Strings Under the Levenshtein Distance

Algarni, Said; Echi, Othman

doi:10.3390/axioms14080550

by Said Algarni † and Othman Echi *,†

Department of Mathematics, College of Computing and Mathematics, King Fahd University of Petroleum and Minerals, Dhahran 31261, Saudi Arabia

* Author to whom correspondence should be addressed.

† These authors contributed equally to this work.

Axioms 2025, 14(8), 550; https://doi.org/10.3390/axioms14080550

Submission received: 3 May 2025 / Revised: 14 July 2025 / Accepted: 20 July 2025 / Published: 22 July 2025

Abstract

Let

Σ

be a nonempty set of characters, called an alphabet. The run-length encoding

(RLE)

algorithm processes any nonempty string u over

Σ

and produces two outputs: a k-tuple

(b_{1}, b_{2}, \dots, b_{k})

, where each

b_{i}

is a character and

b_{i + 1} \neq b_{i}

; and a corresponding k-tuple

(q_{1}, q_{2}, \dots, q_{k})

of positive integers, so that the original string can be reconstructed as

u = b_{1}^{q_{1}} b_{2}^{q_{2}} \dots b_{k}^{q_{k}}

. The integer k is termed the run-length of u, and symbolized by

ρ (u)

. By convention, we let

ρ (ε) = 0

. In the Euclidean space

(R^{n}, ∥ \cdot ∥_{2})

, the volume of a sphere is determined solely by the dimension n and the radius, following well-established formulas. However, for spheres of strings under the edit metric, the situation is more complex, and no general formulas have been identified. This work shows that the volume of the sphere

S_{L} (u, 1)

, composed of all strings of Levenshtein distance 1 from u, is dependent on the specific structure of the “

RLE

-decomposition” of u. Notably, this volume equals

(2 l (u) + 1) s - (2 l (u) - ρ (u))

, where

ρ (u)

represents the run-length of u and

l (u)

denotes its length (i.e., the number of characters in u), and s is the size of the alphabet. Given an integer

p \geq 2

, we present a partial result concerning the computation of the volume

| S_{L} (u, p) |

in the specific case where the run-length

ρ (u) = 1

. More precisely, for a fixed integer

n \geq 1

and a character

a \in Σ

, we explicitly compute the volume of the Levenshtein sphere of radius p, centered at the string

u = a^{n}

. This case corresponds to the simplest run structure and serves as a foundational step toward understanding the general behavior of Levenshtein spheres.

Keywords:

edit distance; Hamming distance; inclusion–exclusion principle; run-length encoding; sphere of strings

MSC:

05A20; 68R15

1. Introduction

The Levenshtein metric [1], denoted

L

, is widely used for identifying the closest valid strings to a misspelled term by evaluating edit distances. This metric is also pivotal in genetic sequence comparison, where it quantifies the count of mutations required to transform one sequence into another, thereby elucidating evolutionary relationships. Furthermore, it plays a significant role in plagiarism detection, text similarity analysis, and the comparison of documents or code to identify discrepancies (see, for instance, [2]).

In natural language processing (NLP), the edit metric is crucial for tasks like machine translation (see [3,4]), where it helps measure the similarity between machine-generated translations and human-created reference texts. Additionally, it is employed to detect duplicate records in databases by assessing the similarity between entries.

In voice recognition, the edit metric serves to accurately compare phonetic transcriptions, aiding in the correct identification of spoken strings [5]. Moreover, it helps correct OCR errors by suggesting the most likely corrections based on the recognized text and a reference dictionary.

The Hamming metric, extensively used in coding theory, particularly in error-correcting codes like Hamming codes, is essential for detecting and correcting errors in data transmission or storage (see [6,7,8]). It is also used in computer science for comparing binary data, such as hashes or binary fingerprints [9].

In bioinformatics, the Hamming metric is utilized to compare genetic sequences or protein structures of equal length, accounting for substitutions without considering insertions or deletions. Both the Hamming and edit metrics are effective tools in bioinformatics and computational biology (see [10,11]).

The Hamming metric is also applied in machine learning for clustering and classification tasks, particularly with binary or categorical data.

Run-length encoding

(RLE)

is a fundamental data compression technique in which sequences of the same data value, called runs, are replaced by a single value followed by a count. This is especially effective for compressing data with many runs, such as simple graphics and animations (see [12]).

Throughout this paper,

Σ

represents a (finite) alphabet of size

s \in N

;

Σ^{*}

denotes the set of all finite strings, including the empty string

ε

; and

Σ^{+}

is the set of all nonempty strings over

Σ

.

Given a nonempty string u of length n over the alphabet

Σ

,

RLE

produces two k-tuples:

(i): $(a_{1}, a_{2}, \dots, a_{k})$ , where each $a_{i}$ is a character and $a_{i} \neq a_{i + 1}$ ;
(ii): $(r_{1}, r_{2}, \dots, r_{k})$ , where each $r_{i} \in N$ , with $u = a_{1}^{r_{1}} a_{2}^{r_{2}} \dots a_{k}^{r_{k}}$ .

The integer k, known as the run-length of u, is denoted

ρ (u)

. By convention, we let

ρ (ε) = 0

.

For

r \in R

, with

r > 0

, and a vector

x \in R^{n}

, the sphere and ball are specified by

S^{n} (x, r) := \{ y \in R^{n} : ∥ x - y ∥_{2} = r \}, \quad B^{n} (x, r) := \{ y \in R^{n} : ∥ x - y ∥_{2} \leq r \},

where

{∥ \cdot ∥}_{2}

represents the Euclidean norm. The volumes of

S^{n} (x, r)

and of

B^{n} (x, r)

depend only on the dimension n and the radius r, and not on the center

x

.

If u is a string, we denote its length by

l (u)

, that is, the number of characters in u. The set of all strings of length n formed from the alphabet

Σ

is denoted by

Σ^{(n)}

.

For the Hamming and Levenshtein metrics

H

and

L

on the monoid

Σ^{*}

of strings over the alphabet

Σ

of size s, consider

p \in N

and a string u of length n. We define

\begin{aligned} S_{H} (u, p) &:= \{ v \in Σ^{*} : l (u) = l (v) = n \text{ and } H (u, v) = p \}, \\ S_{L} (u, p) &:= \{ v \in Σ^{*} : L (u, v) = p \}. \end{aligned}

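The corresponding balls $B_{H} (u, p)$ and $B_{L} (u, p)$ are defined analogously, with the condition "= p" replaced by "≤ p". For further information regarding volume formulas for balls of strings, see [13].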

Given a string u of length n over an alphabet

Σ

, the volumes of

S_{H} (u, p)

and

B_{H} (u, p)

are precisely

\binom{n}{p} {(s - 1)}^{p}

and

\sum_{i = 0}^{p} \binom{n}{i} {(s - 1)}^{i}

, respectively (see Section 2). These volumes are solely determined by

n = l (u)

, the alphabet size s, and the radius p. However, for the edit metric, this is not the case. We show that the volume of

S_{L} (u, 1)

(consisting of Levenshtein neighbors of u) is influenced by the structure of the “

RLE

-decomposition” of u. Specifically, this volume equals

(2 l (u) + 1) s - (2 l (u) - ρ (u))

.

Given integers

p \geq 2

and

n \geq 1

and a character

a \in Σ

, we compute the volume of the Levenshtein sphere

S_{L} (a^{n}, p)

.

2. Spheres of Strings Under the Hamming Metric

Let

Σ

be an alphabet of size

s \geq 1

, and let

Σ^{*}

represent the set of all possible strings over

Σ

, including the empty string

ε

. The set

Σ^{+}

is defined as

Σ^{*} ∖ {ε}

, which includes all nonempty strings over

Σ

.

Consider two strings u and v over

Σ

, both sharing the same length, denoted

l (u) = l (v)

. The Hamming metric between u and v, denoted by

H (u, v)

, is the count of positions where the corresponding characters in u and v differ. Alternatively, it is the smallest number of substitutions needed to transform one string into the other.

Named after Richard Hamming, the Hamming metric was initially introduced to aid in error detection and correction in data transmission [6]. Since then, it has been applied in various fields, particularly in coding theory and information theory.

It is worth noting that the Hamming distance on strings is closely related to hypercubes and k-ary n-cube structures; see, for instance, [14,15].

Formally, if

u = u_{1} u_{2} \dots u_{n}

and

v = v_{1} v_{2} \dots v_{n}

are two strings of length n over the alphabet

Σ

, the Hamming metric

H (u, v)

is specified by

H (u, v) = \sum_{i = 1}^{n} δ (u_{i}, v_{i}),

where

δ (u_{i}, v_{i}) = 0

if

u_{i} = v_{i}

and 1 otherwise. This function essentially counts the total number of positions i where

u_{i}

and

v_{i}

differ.

This section reviews how to count the strings v at Hamming distance p from a given string u.

We now discuss a well-known formula for the volume of a p-Hamming sphere, demonstrating that

| S_{H} (u, p) |

depends solely on p, the length

l (u)

, and the size s of the alphabet

Σ

.

To construct all the strings in

S_{H} (u, p)

, we select a subset I of

[n] = {1, \dots, n}

with size p, which can be achieved in

\binom{n}{p}

ways. For each chosen subset I, there are

{(s - 1)}^{p}

possible p-tuples of characters

{(v_{i})}_{i \in I}

, so that

v_{i} \in Σ ∖ {u_{i}}

. This results in the expression

| S_{H} (u, p) | = {(s - 1)}^{p} \binom{n}{p} .

Another approach to derive this formula involves using the “combinatorial additive rule”. Let

P (n, p) = {I \subseteq {1, \dots, n} : | I | = p}

. Consider the mapping

\begin{matrix} φ : & S_{H} (u, p) & ⟶ & P (n, p) \\ & v = v_{1} \dots v_{n} & ⟼ & \{i \in [n] : v_{i} \neq u_{i}\} \end{matrix}

Given that

| S_{H} (u, p) | = \sum_{I \in P (n, p)} | φ^{- 1} (I) |,

and noting that

\begin{matrix} γ : & φ^{- 1} (I) & ⟶ & \prod_{i \in I} (Σ ∖ {u_{i}}) \\ & v = v_{1} \dots v_{n} & ⟼ & {(v_{i})}_{i \in I} \end{matrix}

is bijective, we can infer that

| S_{H} (u, p) | = \sum_{I \in P (n, p)} {(s - 1)}^{p} = | P (n, p) | \cdot {(s - 1)}^{p} = {(s - 1)}^{p} \binom{n}{p} .

This confirms the following desired result.

Proposition 1.

The volume of the p-Hamming sphere is expressed by the formula

| S_{H} (u, p) | = {(s - 1)}^{p} \binom{n}{p},

where

n = l (u)

.
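As a sanity check on Proposition 1, the following Python sketch enumerates a Hamming sphere by brute force and compares its size with the closed form. The function names and the three-letter alphabet are illustrative assumptions and do not come from the paper.

```python
from itertools import product
from math import comb


def hamming(u, v):
    """Hamming distance between two strings of equal length."""
    if len(u) != len(v):
        raise ValueError("Hamming distance needs strings of equal length")
    return sum(x != y for x, y in zip(u, v))


def hamming_sphere(u, p, alphabet):
    """Brute force: all strings of length len(u) at Hamming distance p from u."""
    candidates = ("".join(t) for t in product(alphabet, repeat=len(u)))
    return {v for v in candidates if hamming(u, v) == p}


def hamming_sphere_volume(n, p, s):
    """Closed form of Proposition 1: binom(n, p) * (s - 1)^p."""
    return comb(n, p) * (s - 1) ** p


# Illustrative check with the assumed alphabet {a, b, c}, so s = 3.
u = "abca"
for p in range(len(u) + 1):
    assert len(hamming_sphere(u, p, "abc")) == hamming_sphere_volume(len(u), p, 3)
```

The brute-force enumeration is exponential in the length of u, so it is only meant for small test cases.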

3. Scattered Strings and Run-Length Encoding

Run-length encoding

(RLE)

is a simple yet effective data compression method in which sequences of repeated elements are replaced by a single value followed by a count of its occurrences. The concept of

RLE

has its roots in early developments in data compression and digital image processing.

In 1961, Freeman introduced a method for encoding lines and curves in digital images that utilized run-length encoding [12]. This work laid the foundation for the widespread use of

RLE

in various fields, including image compression, data transmission, and digital storage.

RLE

operates by replacing each run of consecutive identical characters in a string with the length of the run followed by the character. For instance, the string “bbbabb” would be encoded as “3b1a2b”. This method is particularly effective for compressing data that contains many consecutive repeating characters, such as simple visual graphics, text with repeated letters, or other forms of repetitive data.

This section presents a fundamental theorem concerning the decomposition of a nonempty string using run-length encoding

(RLE)

.

To begin, we introduce the following concept.

Definition 1.

A string

u \in Σ^{*}

is termed scattered if either

u = ε

or

u = u_{1} \dots u_{n}

, where each

u_{i}

is a character and

u_{i} \neq u_{i + 1}

. The set of all scattered strings of length n is denoted

{SS}_{n} (Σ)

.

Partitioning an integer

n \in N

into k positive parts involves expressing n as the sum

x_{1} + x_{2} + \dots + x_{k}

, where each

x_{i} \in N

. Since the order of the summands matters here, such a decomposition is what is usually called a composition of n into k parts. The ordered k-tuple

(x_{1}, x_{2}, \dots, x_{k})

is known as a k-partition of n, and the set of all such k-partitions is denoted

P r (n, k)

.

Theorem 1

(

RLE

-decomposition). Let u be a nonempty string over Σ with length n. There exist a unique positive integer k, a unique scattered string

v = a_{1} \dots a_{k}

, and a unique k-partition

(r_{1}, r_{2}, \dots, r_{k})

of n so that

u = a_{1}^{r_{1}} \dots a_{k}^{r_{k}},

where this expression is termed the

RLE

-decomposition of u.

Proof.

Existence. We will establish the existence through induction on n. For the initial step, where

n = 1

, let

u = u_{1}

. We can simply set

k = 1

,

v = u

, and

r_{1} = 1

.

Assume that a decomposition exists for all nonempty strings of length n. Let

u = u_{1} u_{2} \dots u_{n} u_{n + 1}

be a string of length

n + 1

. By the inductive assumption, there exist

s \in N

, a sequence

w = a_{1} \dots a_{s}

, and an s-partition

(r_{1}, \dots, r_{s})

of n such that

u_{1} u_{2} \dots u_{n} = a_{1}^{r_{1}} \dots a_{s}^{r_{s}} .

Two cases may be considered.

If

u_{n + 1} = a_{s}

, then the required decomposition of u is

u = a_{1}^{r_{1}} \dots a_{s}^{r_{s} + 1}

, associated with

w = a_{1} \dots a_{s}

, and the s-partition of

n + 1

is

(r_{1}, r_{2}, \dots, r_{s} + 1)

.

If

u_{n + 1} \neq a_{s}

, then the decomposition of u is

u = a_{1}^{r_{1}} \dots a_{s}^{r_{s}} a_{s + 1}^{1}

, where

a_{s + 1} = u_{n + 1}

, and the

(s + 1)

-partition of

n + 1

is

(r_{1}, r_{2}, \dots, r_{s}, 1)

.

Uniqueness of the decomposition. Again, we proceed through induction on n. For the initial step, where

n = 1

, suppose

u = u_{1} = a_{1}^{α_{1}} \dots a_{k}^{α_{k}} = b_{1}^{β_{1}} \dots b_{s}^{β_{s}},

then

k = 1 = s

,

a_{1} = b_{1} = u_{1}

, and

α_{1} = β_{1} = 1

.

Now, for a given positive integer n, assume that every nonempty string u with

l (u) \leq n

possesses a unique decomposition. Let

u = u_{1} \dots u_{n + 1}

be a nonempty string of length

n + 1

, and assume

u = a_{1}^{α_{1}} \dots a_{k}^{α_{k}} = b_{1}^{β_{1}} \dots b_{s}^{β_{s}},

are two decompositions of u. Thus,

u_{1} = b_{1} = a_{1}

.

If

k = 1

, then the equality

u = a_{1}^{α_{1}} = b_{1}^{β_{1}} \dots b_{s}^{β_{s}}

implies

s = 1

, since

b_{1} b_{2} \dots b_{s}

is scattered. Consequently,

α_{1} = β_{1}

, establishing the decomposition as unique.

If

k \geq 2

, then necessarily

s \geq 2

(since

a_{1} a_{2} \dots a_{k}

is scattered). We assert that

α_{1} = β_{1}

. Indeed, if (for instance)

α_{1} > β_{1}

, then we would have

a_{1}^{α_{1} - β_{1}} \dots a_{k}^{α_{k}} = b_{2}^{β_{2}} \dots b_{s}^{β_{s}}

. Therefore,

b_{1} = a_{1} = b_{2}

, contradicting the condition that

b_{1} b_{2} \dots b_{s}

is scattered. As a consequence,

α_{1} = β_{1}

. Thus, we obtain

a_{2}^{α_{2}} \dots a_{k}^{α_{k}} = b_{2}^{β_{2}} \dots b_{s}^{β_{s}}

. By the inductive assumption, we derive

k - 1 = s - 1

,

α_{i} = β_{i}

, and

a_{i} = b_{i}

for

i = 2, \dots, k

. This confirms that the decomposition is unique. □

The above result motivates the following definitions.

Definition 2.

The integer k in the previous theorem is termed the run-length of u and is denoted by

ρ (u)

. The scattered string

a_{1} a_{2} \dots a_{k}

is termed the run-root of u, denoted by

rr (u)

. The k-tuple

(r_{1}, r_{2}, \dots, r_{k})

is called the run-partition of u and is denoted by

rp (u)

.
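The three objects of Definition 2 can be computed in a single pass over the string. The sketch below uses Python's `itertools.groupby`; the function names `rle_decomposition`, `run_length`, and `rebuild` are hypothetical and chosen only for illustration.

```python
from itertools import groupby


def rle_decomposition(u):
    """Return (run_root, run_partition) for a string u.

    run_root is the scattered string a_1...a_k and run_partition is the
    tuple (r_1, ..., r_k) with u = a_1^{r_1} ... a_k^{r_k}.
    """
    runs = [(ch, sum(1 for _ in group)) for ch, group in groupby(u)]
    run_root = "".join(ch for ch, _ in runs)
    run_partition = tuple(r for _, r in runs)
    return run_root, run_partition


def run_length(u):
    """rho(u): the number of runs of u (0 for the empty string)."""
    return len(rle_decomposition(u)[0])


def rebuild(run_root, run_partition):
    """Inverse direction: reconstruct u from its run-root and run-partition."""
    return "".join(ch * r for ch, r in zip(run_root, run_partition))
```

For the string “bbbabb”, `rle_decomposition` returns the run-root `"bab"` and the run-partition `(3, 1, 2)`, and `rebuild` recovers the original string, which mirrors the existence and uniqueness statements of Theorem 1.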

Remark 1.

Theorem 1 can be reformulated using functions. For each positive integer n, the map

\begin{matrix} Φ : & Σ^{(n)} & ⟶ & ⨆_{k = 1}^{n} {SS}_{k} (Σ) \times P r (n, k) \\ & u & ⟼ & (rr (u), rp (u)) \end{matrix}

defines a bijection.
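One way to sanity-check Remark 1 is to count both sides of the bijection. The two counts used below, $|{SS}_{k} (Σ)| = s (s - 1)^{k - 1}$ and $|P r (n, k)| = \binom{n - 1}{k - 1}$, are standard facts that are not stated in the text and are assumed here. With them, the equality of cardinalities reduces to the binomial theorem:

```latex
s^{n} = |Σ^{(n)}|
      = \sum_{k = 1}^{n} |{SS}_{k} (Σ)| \cdot |P r (n, k)|
      = \sum_{k = 1}^{n} s (s - 1)^{k - 1} \binom{n - 1}{k - 1}
      = s \, (1 + (s - 1))^{n - 1}
      = s^{n} .
```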

4. Unit Spheres of Strings Under the Levenshtein Metric

Unlike the Hamming metric, where the volume of a unit sphere depends only on the length of the string and the size of the alphabet, the volume of

S_{L} (u, 1)

under the Levenshtein metric exhibits a more intricate dependence. Specifically, it is entirely determined by three parameters: the run-length

ρ (u)

of the string u, the size s of the alphabet

Σ

, and the length

l (u)

of u. This dependence highlights the sensitivity of the Levenshtein metric to the internal structure and redundancy within strings, especially with respect to repeated characters and their arrangements.

For further detailed comparisons between the Hamming distance and the Levenshtein distance, we refer the reader to [16].

To proceed with the calculation of

| S_{L} (u, 1) |

, the number of strings at Levenshtein distance one from u, we begin by introducing some notation and conventions that are used throughout the computation.

Notation 1.

Let u be a string of length n over the alphabet Σ, and let

a \in Σ

.

1.: For real numbers $x \leq y$ , the notation $[[x, y]]$ represents the intersection $[x, y] \cap Z$ .
2.: If $u = v w$ , where $v, w \in Σ^{*}$ , $l (v) = i$ , and $l (w) = j$ , then v is referred to as ${pref}_{i} (u)$ (the prefix of u of length i), and w is referred to as ${suf}_{j} (u)$ (the suffix of u of length j). By convention, ${pref}_{0} (u) = {suf}_{0} (u) = ε$ .
3.: $DEL (u)$ denotes the set of all strings obtained from u by deleting one character. The cardinality of $DEL (u)$ is referred to as the deletion degree of u and denoted by $del (u)$ . By convention, $DEL (ε) = \emptyset$ . Using prefix–suffix notation,

$DEL (u) = {{pref}_{i - 1} (u) \cdot {suf}_{n - i} (u) : i \in [[1, n]]} .$
4.: $INS (u)$ denotes the set of all strings obtained from u by inserting one character. The cardinality of $INS (u)$ is referred to as the insertion degree of u and denoted by $ins (u)$ . Using prefix–suffix notation, we have

$INS (u) = ⋃_{i = 0}^{n} {{pref}_{i} (u) \cdot x \cdot {suf}_{n - i} (u) : x \in Σ} .$
5.: $SUB (u)$ represents the set of all strings obtained from u by replacing one character $u_{i}$ with a character from $Σ ∖ {u_{i}}$ . The cardinality of $SUB (u)$ is referred to as the substitution degree of u and denoted by $sub (u)$ . By convention, $SUB (ε) = \emptyset$ . Using prefix–suffix notation, we have

$SUB (u) = ⋃_{i = 1}^{n} {{pref}_{i - 1} (u) \cdot x \cdot {suf}_{n - i} (u) : x \in Σ ∖ {u_{i}}} .$
6.: For $i \in [[1, n]]$ , we denote by $D_{i} (u)$ the string obtained from u by deleting the character $u_{i}$ , i.e., $D_{i} (u) = {pref}_{i - 1} (u) \cdot {suf}_{n - i} (u)$ .
7.: For $i \in [[0, n]]$ and $a \in Σ$ , we denote by $I (i, a) (u)$ the string obtained by inserting the character a at position $i + 1$ in u, i.e., $I (i, a) (u) = {pref}_{i} (u) \cdot a \cdot {suf}_{n - i} (u)$ .
8.: For $i \in [[1, n]]$ and $a \in Σ ∖ {u_{i}}$ , we denote by $S (i, a) (u)$ the string obtained by substituting the character $u_{i}$ with a, i.e., $S (i, a) (u) = {pref}_{i - 1} (u) \cdot a \cdot {suf}_{n - i} (u)$ .

Remark 2.

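The sets $SUB (u)$ , $DEL (u)$ , and $INS (u)$ consist of strings of lengths $l (u)$ , $l (u) - 1$ , and $l (u) + 1$ , respectively, so they are pairwise disjoint. Applying the additive rule, for any string u over Σ we have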

| S_{L} (u, 1) | = sub (u) + del (u) + ins (u) .

In what follows, let

u = u_{1} \dots u_{n}

be a nonempty string over

Σ

of length n. The enumeration of

| S_{L} (u, 1) |

requires a sequence of lemmas.

From Proposition 1, we derive the following lemma.

Lemma 1.

The substitution degree of the string u equals

sub (u) = | S_{H} (u, 1) | = (s - 1) n

.

The strings

D_{i} (u)

and

D_{j} (u)

share a common prefix of length

i - 1

and a common suffix of length

n - j

. Consequently, the equality

D_{i} (u) = D_{j} (u)

is equivalent to the condition

u_{i + 1} \dots u_{j} = u_{i} \dots u_{j - 1}

. This insight results in the following lemma.

Lemma 2.

Let

i < j

be indices in

[[1, n]]

. Then,

D_{i} (u) = D_{j} (u)

is equivalent to

u_{t} = u_{i}

for every

t \in [[i, j]]

.

In other words, $D_{i} (u) = D_{j} (u)$ exactly when positions i and j lie in the same run of u, so each run contributes exactly one string to $DEL (u)$ . As a consequence, we have the following corollary.

Corollary 1.

The deletion degree of a string u equals

del (u) = ρ (u)

, where

ρ (u)

denotes the run-length of u.
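Corollary 1 is easy to test exhaustively on short strings. The following sketch is only an illustration; the alphabet {a, b, c} and the helper names are assumptions.

```python
from itertools import groupby, product


def deletions(u):
    """DEL(u): the set of strings obtained by deleting one character of u."""
    return {u[:i] + u[i + 1:] for i in range(len(u))}


def run_length(u):
    """rho(u): number of maximal runs of equal characters in u."""
    return sum(1 for _ in groupby(u))


# Exhaustive check of del(u) = rho(u) for all strings of length 1 to 5.
for n in range(1, 6):
    for letters in product("abc", repeat=n):
        u = "".join(letters)
        assert len(deletions(u)) == run_length(u)
```

For instance, deleting either of the first two characters of “aab” gives “ab”, and deleting the last one gives “aa”, so the deletion degree is 2, the number of runs.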

For the insertion degree, the two strings

I (i, a) (u)

and

I (j, b) (u)

share a common prefix of length i and a common suffix of length

n - j

. Therefore, the equality

I (i, a) (u) = I (j, b) (u)

is equivalent to

a u_{i + 1} \dots u_{j} = u_{i + 1} \dots u_{j} b

. This insight results in the following lemma.

Lemma 3.

Let

i < j

be indices in

[[0, n]]

, and let

a, b \in Σ

. Then,

I (i, a) (u) = I (j, b) (u)

holds if and only if

a = b = u_{t}

for every

t \in [[i + 1, j]]

.

Now, we provide the enumeration of

ins (u)

for a run-length 1 string.

Lemma 4.

Let a be a character and

r \in N

, then the insertion degree of the string

a^{r}

is

ins (a^{r}) = (s - 1) r + s .

Proof.

It is clear that

INS (a^{r}) = \{a^{r + 1}\} ⋃ (⋃_{i = 0}^{r} \{I (i, x) (a^{r}) : x \in Σ ∖ {a}\}),

and by Lemma 3 the union is disjoint. Consequently,

ins (a^{r}) = 1 + \sum_{i = 0}^{r} |{I (i, x) (a^{r}) : x \in Σ ∖ {a}}| = 1 + (s - 1) (r + 1) = (s - 1) r + s .

□

Lemma 5.

Let

u = a_{1}^{r_{1}} \dots a_{k}^{r_{k}}

be a string over Σ, with run-length

k \geq 2

, and let

\begin{matrix} A_{1} & = & \{v a_{2}^{r_{2}} \dots a_{k}^{r_{k}} : v \in INS (a_{1}^{r_{1}})\}, \\ A_{2} & = & \{a_{1}^{r_{1}} v a_{3}^{r_{3}} \dots a_{k}^{r_{k}} : v \in INS (a_{2}^{r_{2}})\}, \\ ⋮ \\ A_{k} & = & \{a_{1}^{r_{1}} a_{2}^{r_{2}} \dots a_{k - 1}^{r_{k - 1}} v : v \in INS (a_{k}^{r_{k}})\}; \end{matrix}

the following properties are satisfied:

0.: $INS (u) = A_{1} \cup A_{2} \cup \dots \cup A_{k} .$
1.: If $i \in [[1, k - 1]]$ , then

$A_{i} \cap A_{i + 1} = {a_{1}^{r_{1}} \dots a_{i}^{r_{i}} x a_{i + 1}^{r_{i + 1}} \dots a_{k}^{r_{k}} : x \in Σ} .$
2.: If $k \geq 3$ and $i \in [[1, k - 2]]$ , then

$A_{i} \cap A_{i + 2} = {a_{1}^{r_{1}} \dots a_{i}^{r_{i}} a_{i + 1}^{r_{i + 1} + 1} a_{i + 2}^{r_{i + 2}} \dots a_{k}^{r_{k}}} = A_{i} \cap A_{i + 1} \cap A_{i + 2} .$
3.: If $j - i \geq 3$ , then $A_{i} \cap A_{j} = \emptyset$ .

Proof.

1.: Let

$w = a_{1}^{r_{1}} \dots a_{i - 1}^{r_{i - 1}} v_{1} a_{i + 1}^{r_{i + 1}} \dots a_{k}^{r_{k}} = a_{1}^{r_{1}} \dots a_{i}^{r_{i}} v_{2} a_{i + 2}^{r_{i + 2}} \dots a_{k}^{r_{k}},$

for some $v_{1} \in INS (a_{i}^{r_{i}})$ and $v_{2} \in INS (a_{i + 1}^{r_{i + 1}})$ . In turn, there is $p \in [[0, r_{i}]]$ , $q \in [[0, r_{i + 1}]]$ , and $x, y \in Σ$ so that $v_{1} = I (p, x) (a_{i}^{r_{i}})$ and $v_{2} = I (q, y) (a_{i + 1}^{r_{i + 1}})$ . We consider four cases.
Case 1: $p < r_{i}$ and $q > 0$ . Based on Lemma 3, the two equal insertions of u lead to $x = a_{i} = a_{i + 1}$ , contradicting the $RLE$ -decomposition of u. So this case cannot happen.
Case 2: $p < r_{i}$ and $q = 0$ . Following Lemma 3, we obtain $x = y = a_{i}$ , and

$w = a_{1}^{r_{1}} \dots a_{i}^{r_{i} + 1} a_{i + 1}^{r_{i + 1}} \dots a_{k}^{r_{k}} .$

Case 3: $p = r_{i}$ and $q > 0$ . Lemma 3 guarantees the equality $x = a_{i + 1} = y$ . This gives

$v_{1} = a_{i}^{r_{i}} a_{i + 1}, v_{2} = a_{i + 1}^{r_{i + 1} + 1}, and w = a_{1}^{r_{1}} \dots a_{i}^{r_{i}} a_{i + 1}^{r_{i + 1} + 1} \dots a_{k}^{r_{k}} .$

Case 4: $p = r_{i}$ and $q = 0$ . In this scenario, $v_{1} = a_{i}^{r_{i}} x$ and $v_{2} = y a_{i + 1}^{r_{i + 1}}$ , and as a consequence, $x = y$ can be any character of $Σ$ .
As a conclusion, we obtain

$A_{i} \cap A_{i + 1} = {a_{1}^{r_{1}} \dots a_{i}^{r_{i}} x a_{i + 1}^{r_{i + 1}} \dots a_{k}^{r_{k}} : x \in Σ} .$
2.: Let

$w = a_{1}^{r_{1}} \dots a_{i - 1}^{r_{i - 1}} v_{1} a_{i + 1}^{r_{i + 1}} \dots a_{k}^{r_{k}} = a_{1}^{r_{1}} \dots a_{i + 1}^{r_{i + 1}} v_{2} a_{i + 3}^{r_{i + 3}} \dots a_{k}^{r_{k}},$

for some $v_{1} \in INS (a_{i}^{r_{i}})$ and $v_{2} \in INS (a_{i + 2}^{r_{i + 2}})$ . So there exist $p \in [[0, r_{i}]]$ , $q \in [[0, r_{i + 2}]]$ , and $x, y \in Σ$ so that $v_{1} = I (p, x) (a_{i}^{r_{i}})$ and $v_{2} = I (q, y) (a_{i + 2}^{r_{i + 2}})$ .
Based on Lemma 3, this results in $x = a_{i + 1} = y$ .
We assert that $p = r_{i}$ . Otherwise, by Lemma 3, the equality of two insertions of u implies that $x = a_{i} = a_{i + 1} = y$ , which contradicts the $RLE$ -decomposition of u.
We also claim that $q = 0$ . Otherwise, by Lemma 3, $x = a_{i + 1} = a_{i + 2} = y$ , again obtaining a contradiction.
As a consequence,

$v_{1} = a_{i}^{r_{i}} a_{i + 1}, v_{2} = a_{i + 1} a_{i + 2}^{r_{i + 2}}, and w = a_{1}^{r_{1}} \dots a_{i}^{r_{i}} a_{i + 1}^{r_{i + 1} + 1} a_{i + 2}^{r_{i + 2}} \dots a_{k}^{r_{k}} .$

Additionally, it is clear that $w \in A_{i + 1}$ . Therefore,

$A_{i} \cap A_{i + 2} = {a_{1}^{r_{1}} \dots a_{i}^{r_{i}} a_{i + 1}^{r_{i + 1} + 1} a_{i + 2}^{r_{i + 2}} \dots a_{k}^{r_{k}}} = A_{i} \cap A_{i + 1} \cap A_{i + 2} .$
3.: If $j \geq i + 3$ , and $A_{i} \cap A_{j} \neq \emptyset$ , then there would be a string

$w = a_{1}^{r_{1}} \dots a_{i - 1}^{r_{i - 1}} v_{1} a_{i + 1}^{r_{i + 1}} \dots a_{k}^{r_{k}} = a_{1}^{r_{1}} \dots a_{j - 1}^{r_{j - 1}} v_{2} a_{j + 1}^{r_{j + 1}} \dots a_{k}^{r_{k}},$

with $v_{1} = I (p, x) (a_{i}^{r_{i}})$ and $v_{2} = I (q, y) (a_{j}^{r_{j}})$ . Again, using Lemma 3, $x = a_{i + 1} = a_{i + 2} = y$ , which contradicts $a_{i + 1} \neq a_{i + 2}$ .

□

The preceding lemma highlights the existence of bijections: between

A_{i}

and

INS (a_{i}^{r_{i}})

, and between

A_{i} \cap A_{i + 1}

and

Σ

. Additionally, the intersections

A_{i} \cap A_{i + 2}

and

A_{i} \cap A_{i + 1} \cap A_{i + 2}

are singletons. Using Lemma 4, the following cardinalities are derived.

Corollary 2.

1.: $| A_{i} | = r_{i} (s - 1) + s$ ;
2.: $| A_{i} \cap A_{i + 1} | = s$ ;
3.: $| A_{i} \cap A_{i + 2} | = | A_{i} \cap A_{i + 1} \cap A_{i + 2} | = 1 .$

To evaluate

ins (u)

, we use the inclusion–exclusion principle (or the sieve formula), first introduced by Abraham de Moivre as part of his efforts to understand and quantify probabilities [17].

Recall that if

S_{1}, S_{2}, \dots, S_{k}

are finite sets, then the cardinality of their union equals

|⋃_{i = 1}^{k} S_{i}| = \sum_{i = 1}^{k} {(- 1)}^{i - 1} (\sum_{J \subseteq [k], | J | = i} |⋂_{j \in J} S_{j}|) .

The following lemma indicates that the insertion degree of a string u is solely determined by

l (u)

and

s = | Σ |

.

Lemma 6.

The insertion degree of u equals

ins (u) = (s - 1) l (u) + s .

Proof.

Lemma 4 confirms the validity of the given equality for strings with a run-length

k = ρ (u) = 1

.

Now, we extend this result to strings with a run-length

k = ρ (u) \geq 2

.

Applying the inclusion–exclusion principle in conjunction with Lemma 5, we obtain

\begin{aligned} ins (u) &= \sum_{i = 1}^{k} | A_{i} | - \left( \sum_{i = 1}^{k - 1} | A_{i} \cap A_{i + 1} | + \sum_{i = 1}^{k - 2} | A_{i} \cap A_{i + 2} | \right) + \sum_{i = 1}^{k - 2} | A_{i} \cap A_{i + 1} \cap A_{i + 2} | \\ &= \sum_{i = 1}^{k} (r_{i} (s - 1) + s) - \left( \sum_{i = 1}^{k - 1} s + \sum_{i = 1}^{k - 2} 1 \right) + \sum_{i = 1}^{k - 2} 1 \\ &= (s - 1) l (u) + k s - ((k - 1) s + k - 2) + k - 2 = (s - 1) l (u) + s . \end{aligned}

□
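Lemma 6 says that the insertion degree does not depend on the run structure at all. A brute-force check over short strings illustrates this; the alphabet and the function names below are assumptions made for the example.

```python
from itertools import product


def insertions(u, alphabet):
    """INS(u): the set of strings obtained by inserting one character into u."""
    return {u[:i] + x + u[i:] for i in range(len(u) + 1) for x in alphabet}


def insertion_degree(n, s):
    """Closed form of Lemma 6: (s - 1) * n + s."""
    return (s - 1) * n + s


ALPHABET = "abc"  # assumed alphabet, s = 3
for n in range(1, 5):
    for letters in product(ALPHABET, repeat=n):
        u = "".join(letters)
        assert len(insertions(u, ALPHABET)) == insertion_degree(n, len(ALPHABET))
```

Both “aaa” and “abc” have exactly 9 single-character insertions over this alphabet, even though their run-lengths are 1 and 3.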

By combining Lemma 1, Corollary 1, and Lemma 6, we can now present the main result of this paper, which calculates the volume of

S_{L} (u, 1)

.

Theorem 2.

The volume of

S_{L} (u, 1)

is given by

| S_{L} (u, 1) | = (2 l (u) + 1) s - (2 l (u) - ρ (u)) .
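The formula of Theorem 2 can be compared with a direct construction of the unit sphere as the union of single substitutions, deletions, and insertions. The sketch below is a minimal check under the assumption of a three-letter alphabet; `unit_sphere` and `unit_sphere_volume` are hypothetical names.

```python
from itertools import groupby, product


def unit_sphere(u, alphabet):
    """S_L(u, 1) built as SUB(u) | DEL(u) | INS(u)."""
    n = len(u)
    sub = {u[:i] + x + u[i + 1:] for i in range(n) for x in alphabet if x != u[i]}
    dele = {u[:i] + u[i + 1:] for i in range(n)}
    ins = {u[:i] + x + u[i:] for i in range(n + 1) for x in alphabet}
    return sub | dele | ins


def unit_sphere_volume(u, s):
    """Closed form of Theorem 2: (2 l(u) + 1) s - (2 l(u) - rho(u))."""
    n = len(u)
    rho = sum(1 for _ in groupby(u))
    return (2 * n + 1) * s - (2 * n - rho)


ALPHABET = "abc"  # assumed alphabet, s = 3
for n in range(1, 5):
    for letters in product(ALPHABET, repeat=n):
        u = "".join(letters)
        assert len(unit_sphere(u, ALPHABET)) == unit_sphere_volume(u, len(ALPHABET))
```

As a worked instance, for u = “aab” and s = 3 we have l(u) = 3 and ρ(u) = 2, so the volume is 7 · 3 − (6 − 2) = 17, which splits as 6 substitutions, 2 deletions, and 9 insertions.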

Corollary 3

(Minimum and maximum values of volumes). Let

n \in N

and Σ be an alphabet of size

s \geq 2

. The following properties are satisfied.

1.: The minimum value of $\{|S_{L} (u, 1)| : u \in Σ^{(n)}\}$ is $(2 n + 1) s - (2 n - 1)$ . This value is attained by strings structured as $u = a^{n}$ , where a is a character from the alphabet.
2.: The maximum value of $\{|S_{L} (u, 1)| : u \in Σ^{(n)}\}$ is $(2 n + 1) s - n$ . This value is attained by strings u where the characters are maximally scattered.
3.: Intermediate value result: Every integer in $[(2 n + 1) s - (2 n - 1), (2 n + 1) s - n]$ is the volume of $S_{L} (u, 1)$ for some string u of length n.

Proof.

Clearly, minimizing the value of

(2 n + 1) s - (2 n - ρ (u))

is equivalent to minimizing

ρ (u)

. This occurs when

ρ (u) = 1

, which corresponds to

u = a^{n}

for some character a. Therefore, the minimum value is

(2 n + 1) s - (2 n - 1)

.

Similarly, the maximum value of

(2 n + 1) s - (2 n - ρ (u))

is attained when

ρ (u)

is maximized. This occurs when

ρ (u) = n

, corresponding to a scattered string u. The maximum value is

(2 n + 1) s - (2 n - n) = (2 n + 1) s - n

.

Now, let

x \in [[(2 n + 1) s - (2 n - 1), (2 n + 1) s - n]]

. Then

x = (2 n + 1) s - i

for some integer

n \leq i \leq 2 n - 1

. Let v be a scattered string of length

2 n - i

with a tail character

a \in Σ

, and let

u = v a^{i - n}

. Then

ρ (u) = 2 n - i

. By Theorem 2,

|S_{L} (u, 1)| = (2 n + 1) s - (2 n - ρ (u)) = (2 n + 1) s - i = x .

□
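The construction used in the proof of the intermediate value result can be made concrete. The sketch below builds, for each k between 1 and n, a string of length n with exactly k runs (a scattered prefix of length k ending in a, padded with further copies of a) and checks that the resulting volumes fill the whole integer interval. The two-letter alphabet and the function names are assumptions; in the notation of the proof, k plays the role of 2n − i.

```python
from itertools import groupby


def string_with_runs(n, k, a="a", b="b"):
    """Length-n string with exactly k runs, built as in the proof of Corollary 3."""
    if not 1 <= k <= n:
        raise ValueError("need 1 <= k <= n")
    # Scattered string of length k that alternates between a and b and ends with a.
    v = "".join(a if (k - 1 - j) % 2 == 0 else b for j in range(k))
    # Padding with copies of the tail character a does not create new runs.
    return v + a * (n - k)


def unit_sphere_volume(u, s):
    """Closed form of Theorem 2: (2 l(u) + 1) s - (2 l(u) - rho(u))."""
    n = len(u)
    rho = sum(1 for _ in groupby(u))
    return (2 * n + 1) * s - (2 * n - rho)


n, s = 6, 2
volumes = [unit_sphere_volume(string_with_runs(n, k), s) for k in range(1, n + 1)]
low, high = (2 * n + 1) * s - (2 * n - 1), (2 * n + 1) * s - n
assert volumes == list(range(low, high + 1))
```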

5. Spheres of Strings with Centers of Run-Length 1

This section is devoted to computing the cardinality of the Levenshtein sphere

S_{L} (u, p)

, where

p \geq 2

is a fixed integer and

u = a^{n}

is a string consisting of n repetitions of a single character a from the alphabet

Σ

. In other words, u has run-length one.

This special case is of particular interest due to the structural simplicity of u, which allows for an explicit enumeration of all strings at Levenshtein distance p. The analysis will rely on combinatorial principles and recursive formulations of the Levenshtein distance.

We begin by recalling a classical result often referred to as the recurrence relation for the Levenshtein distance, which forms the foundation for our computations; the details may be found, for example, in [18].

Remark 3.

Let Σ be a nontrivial alphabet (i.e., of cardinality at least 2). Let

s, t

be strings over Σ; ε be the empty string; and

a, b \in Σ

be single characters. Then, the following properties hold.

1.: $L (s, ε) = | s |$ .
2.: $L (ε, t) = | t |$ .
3.: $L (a, b) = \begin{cases} 1 & \text{if } a \neq b, \\ 0 & \text{if } a = b . \end{cases}$
4.: $L (s a, t a) = L (s, t)$ .
5.: If $a \neq b$ , then $L (s a, t b) = min (\{L (s, t) + 1, L (s, t b) + 1, L (s a, t) + 1\}) .$
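The five properties of Remark 3 translate directly into a recursive procedure. The following memoized sketch works on prefix lengths rather than on the strings themselves; this indexing and the function name `levenshtein` are implementation choices made here and are not taken from the text. It relies on recursion, so it is meant for short strings only.

```python
from functools import lru_cache


def levenshtein(s, t):
    """Levenshtein distance computed with the recurrence of Remark 3."""

    @lru_cache(maxsize=None)
    def dist(i, j):
        # dist(i, j) is the distance between the prefixes s[:i] and t[:j].
        if j == 0:
            return i  # property 1: L(s, eps) = |s|
        if i == 0:
            return j  # property 2: L(eps, t) = |t|
        if s[i - 1] == t[j - 1]:
            return dist(i - 1, j - 1)  # property 4: equal last characters
        # property 5: last characters differ (substitute, delete, or insert)
        return 1 + min(dist(i - 1, j - 1), dist(i - 1, j), dist(i, j - 1))

    return dist(len(s), len(t))
```

For example, `levenshtein("kitten", "sitting")` evaluates to 3 (two substitutions and one insertion).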

The following result provides a formula for computing the Levenshtein distance between a string

u = a^{n}

, which has run-length 1, and an arbitrary string v of any length.

Remark 4.

Before stating the result, we first show how to transform the string v into the string

u = a^{n}

by means of edit operations. Three situations arise:

1.: Case 1: $l (v) \leq n$ .
Replace every character of v that is not equal to a with a, obtaining the intermediate string $a^{l (v)}$ . The number of such substitutions is $l (v) - c_{a} (v)$ , where $c_{a} (v)$ denotes the number of occurrences of a in v.
Then, insert the character a exactly $n - l (v)$ times to produce the string $a^{n}$ .
The total number of edit operations is therefore

$(l (v) - c_{a} (v)) + (n - l (v)) = n - c_{a} (v),$

which shows that $L (a^{n}, v) \leq n - c_{a} (v)$ .
2.: Case 2: $l (v) > n$ and $n \leq c_{a} (v)$ .
Delete every character of v that is not equal to a, obtaining the string $a^{c_{a} (v)}$ .
Then, delete $c_{a} (v) - n$ occurrences of a to obtain the string $a^{n}$ .
The number of edit operations performed is

$(l (v) - c_{a} (v)) + (c_{a} (v) - n) = l (v) - n,$

which implies $L (a^{n}, v) \leq l (v) - n$ .
3.: Case 3: $c_{a} (v) < n < l (v)$ .
Delete $l (v) - n$ characters of v that are not equal to a, producing a string w of length n.
Then, substitute each character of w that is not equal to a with a, resulting in the string $a^{n}$ .
The total number of edit operations is

$(l (v) - n) + (n - c_{a} (v)) = l (v) - c_{a} (v),$

hence $L (a^{n}, v) \leq l (v) - c_{a} (v)$ .

The following result shows that the inequalities obtained in cases 1, 2, and 3 are in fact equalities.

Theorem 3.

Let v be a string, n a non-negative integer, and a a character. Then, the Levenshtein distance between the constant string

a^{n}

and v is given by

L (a^{n}, v) = \begin{cases} n - c_{a} (v), & \text{if } l (v) \leq n, \\ l (v) - n, & \text{if } l (v) > n \text{ and } n \leq c_{a} (v), \\ l (v) - c_{a} (v), & \text{if } c_{a} (v) < n < l (v), \end{cases}

where

l (v)

denotes the length of v, and

c_{a} (v)

is the number of occurrences of the character a in v.
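Theorem 3 is easy to test numerically on small inputs. The sketch below is ours and makes no claim about the authors' implementation; the names `levenshtein`, `distance_to_constant` and `check_theorem3` are illustrative. It compares the piecewise formula with the standard dynamic-programming computation of the Levenshtein distance.

```python
from itertools import product


def levenshtein(x: str, y: str) -> int:
    """Standard dynamic-programming Levenshtein distance (two-row version)."""
    prev = list(range(len(y) + 1))
    for i, cx in enumerate(x, start=1):
        curr = [i]
        for j, cy in enumerate(y, start=1):
            cost = 0 if cx == cy else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def distance_to_constant(a: str, n: int, v: str) -> int:
    """Piecewise formula of Theorem 3 for L(a^n, v)."""
    length, count_a = len(v), v.count(a)
    if length <= n:
        return n - count_a
    if count_a >= n:
        return length - n
    return length - count_a


def check_theorem3(alphabet: str = "ab", max_n: int = 4, max_len: int = 5) -> bool:
    """Compare both computations for every short string over the alphabet."""
    a = alphabet[0]
    for n in range(max_n + 1):
        for length in range(max_len + 1):
            for letters in product(alphabet, repeat=length):
                v = "".join(letters)
                if levenshtein(a * n, v) != distance_to_constant(a, n, v):
                    return False
    return True
```

Such a check is of course no substitute for the proof; it only guards against transcription errors in the three cases.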

Proof.

The formula given in the theorem can be equivalently written in a more compact form as

L (a^{n}, v) = \max (l (v), n) - \min (n, c_{a} (v)) .

We proceed by induction on

l (v) + n

.

Base case: Suppose

l (v) + n = 0

. Then

l (v) = n = 0

, and we have

L (a^{n}, v) = L (ε, ε) = 0 = n - c_{a} (v) .

Hence, the formula holds in the base case.

Inductive step: Let k be a non-negative integer. Assume that for every non-negative integer n and every string w such that

l (w) + n \leq k

, the formula given in the theorem for

L (a^{n}, w)

holds.

Now, let v be a string such that

l (v) + n = k + 1

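. If $n = 0$, then $L (ε, v) = l (v)$, and if $v = ε$, then $L (a^{n}, ε) = n$; both values agree with the compact formula. We may therefore assume $n \geq 1$ and $v \neq ε$, and write

v = w b

, where w is a string and b is a character. We consider two cases.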

Case 1: If

a = b

, then since

l (w) + (n - 1) = l (v) - 1 + (n - 1) = k - 1

, the induction hypothesis gives

\begin{aligned} L (a^{n}, v) & = L (a^{n - 1}, w) = \max (l (w), n - 1) - \min (n - 1, c_{a} (w)) \\ & = \max (l (v) - 1, n - 1) - \min (n - 1, c_{a} (v) - 1) \\ & = \max (l (v), n) - 1 - (\min (n, c_{a} (v)) - 1) \\ & = \max (l (v), n) - \min (n, c_{a} (v)) . \end{aligned}

Case 2: If

a \neq b

, then by Remark 3, we have

L (a^{n}, v) = L (a^{n - 1} a, w b) = \min \{L (a^{n - 1}, w) + 1, L (a^{n - 1}, v) + 1, L (a^{n}, w) + 1\} .

(1)

By the induction hypothesis, since

l (w) + (n - 1) = k - 1

,

l (v) + (n - 1) = k

, and

l (w) + n = k

, we deduce the following:

\begin{aligned} L (a^{n - 1}, w) + 1 & = \max (l (w), n - 1) - \min (n - 1, c_{a} (w)) + 1 \\ & = \max (l (v) - 1, n - 1) - \min (n - 1, c_{a} (w)) + 1 \\ & = \max (l (v), n) - \min (n - 1, c_{a} (v)), \end{aligned}

\begin{aligned} L (a^{n - 1}, v) + 1 & = \max (l (v), n - 1) - \min (n - 1, c_{a} (v)) + 1 \\ & = \max (l (v) + 1, n) - \min (n - 1, c_{a} (v)) \geq L (a^{n - 1}, w) + 1, \end{aligned}

\begin{aligned} L (a^{n}, w) + 1 & = \max (l (w), n) - \min (n, c_{a} (w)) + 1 \\ & = \max (l (v), n + 1) - \min (n, c_{a} (v)) . \end{aligned}

From Equation (1), it follows that

L (a^{n}, v) = \min \{\max (l (v), n) - \min (n - 1, c_{a} (v)), \max (l (v), n + 1) - \min (n, c_{a} (v))\} .

(2)

We now consider several sub-cases:

(i): Suppose $l (v) \leq n$ and $c_{a} (v) \leq n - 1$. Then, $\max (l (v), n) - \min (n - 1, c_{a} (v)) = n - c_{a} (v)$ and $\max (l (v), n + 1) - \min (n, c_{a} (v)) = n + 1 - c_{a} (v)$.
Thus, by Equation (2), $L (a^{n}, v) = n - c_{a} (v) = \max (l (v), n) - \min (n, c_{a} (v))$.
(ii): Suppose $l (v) \leq n$ and $c_{a} (v) = n$. Then, $v = a^{n}$, so $L (a^{n}, v) = 0 = \max (l (v), n) - \min (n, c_{a} (v))$.
(iii): Suppose $l (v) \geq n + 1$ and $c_{a} (v) \leq n - 1$. Then,

$\max (l (v), n) = \max (l (v), n + 1) = l (v) \quad \text{and} \quad \min (n - 1, c_{a} (v)) = \min (n, c_{a} (v)) = c_{a} (v) .$

Therefore, by Equation (2), $L (a^{n}, v) = l (v) - c_{a} (v) = \max (l (v), n) - \min (n, c_{a} (v))$.
(iv): Suppose $l (v) \geq n + 1$ and $c_{a} (v) > n - 1$. Then,

$\max (l (v), n) = \max (l (v), n + 1) = l (v), \quad \min (n - 1, c_{a} (v)) = n - 1, \quad \text{and} \quad \min (n, c_{a} (v)) = n .$

Thus, $\max (l (v), n) - \min (n - 1, c_{a} (v)) = l (v) - n + 1$ and $\max (l (v), n + 1) - \min (n, c_{a} (v)) = l (v) - n$. Hence, by Equation (2),

$L (a^{n}, v) = l (v) - n = \max (l (v), n) - \min (n, c_{a} (v)) .$

This completes the induction. □

The above theorem allows us to compute the volume of the Levenshtein sphere

S_{L} (a^{n}, p)

. In line with the three cases considered in the computation of

L (a^{n}, v)

in Theorem 3, we begin by establishing the following lemma.

Lemma 7.

Let a be a character in an alphabet Σ of size

s \geq 2

, and let

n \geq 1

and

p \geq 2

be integers. Set

\begin{aligned} S_{1} & = \{v \in S_{L} (a^{n}, p) : l (v) \leq n\}, \\ S_{2} & = \{v \in S_{L} (a^{n}, p) : l (v) > n \text{ and } n \leq c_{a} (v)\}, \\ S_{3} & = \{v \in S_{L} (a^{n}, p) : c_{a} (v) < n < l (v)\} . \end{aligned}

The following statements hold:

1.: $| S_{1} | = \begin{cases} 0, & \text{if } p > n, \\ \sum_{j = n - p}^{n} \binom{j}{n - p} {(s - 1)}^{j - n + p}, & \text{if } p \leq n . \end{cases}$
2.: $| S_{2} | = \sum_{j = n}^{p + n} \binom{p + n}{j} {(s - 1)}^{p + n - j} .$
3.: $| S_{3} | = \begin{cases} \sum_{j = 0}^{n - 1} \binom{p + j}{j} {(s - 1)}^{p}, & \text{if } n < p, \\ \sum_{j = n - p + 1}^{n - 1} \binom{p + j}{j} {(s - 1)}^{p}, & \text{if } n \geq p . \end{cases}$

Proof.

1.: By Theorem 3, a string v with $l (v) \leq n$ lies in $S_{1}$ if and only if $p = n - c_{a} (v)$; equivalently, $c_{a} (v) = n - p \geq 0$. Consequently, $S_{1} = \emptyset$ whenever $p > n$.
Assume now that $p \leq n$. Any string v of length $j \leq n$ in $S_{1}$ contains exactly $c_{a} (v) = n - p$ occurrences of a, so that $n - p \leq j \leq n$; all remaining characters belong to $Σ \setminus \{a\}$. The number of such strings of length j is

$\binom{j}{n - p} {(s - 1)}^{j - (n - p)} .$

Hence,

$| S_{1} | = \sum_{j = n - p}^{n} \binom{j}{n - p} {(s - 1)}^{j - n + p} .$
2.: A string v with length $l (v) > n$ lies in $S_{2}$ precisely when $p = l (v) - n$ and $n \leq c_{a} (v)$. Set $j = c_{a} (v)$; then $n \leq j = c_{a} (v) \leq l (v) = p + n .$
The number of strings v of length $p + n$ for which $c_{a} (v) = j$ equals $\binom{p + n}{j} {(s - 1)}^{p + n - j} .$
Consequently, $| S_{2} | = \sum_{j = n}^{p + n} \binom{p + n}{j} {(s - 1)}^{p + n - j} .$
3.: Strings $v \in S_{3}$ satisfy $l (v) > n$, $n > c_{a} (v)$, and $l (v) = c_{a} (v) + p$. Set $j = c_{a} (v)$; then $j \geq 0$ and $n - p < j < n$. For such an integer j, the number of strings v of length $p + j$ with $c_{a} (v) = j$ equals $\binom{p + j}{j} {(s - 1)}^{p + j - j} = \binom{p + j}{j} {(s - 1)}^{p} .$

-: If $n < p$, then the admissible values of j are $0, \dots, n - 1$. Hence, $| S_{3} | = \sum_{j = 0}^{n - 1} \binom{p + j}{j} {(s - 1)}^{p} .$
-: If $n \geq p$, then the admissible values of j are $n - p + 1, \dots, n - 1$. Thus, $| S_{3} | = \sum_{j = n - p + 1}^{n - 1} \binom{p + j}{j} {(s - 1)}^{p} .$

□
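The three counts of Lemma 7 can be compared with a direct classification of short strings. The following sketch is an illustration under our own naming (`lemma7_counts`, `classified_counts`); it encodes the character a as the symbol 0 and uses the closed form of Theorem 3, rather than a dynamic programme, to decide membership in the sphere.

```python
from itertools import product
from math import comb


def lemma7_counts(n: int, p: int, s: int) -> tuple[int, int, int]:
    """(|S1|, |S2|, |S3|) according to the formulas of Lemma 7."""
    s1 = 0
    if p <= n:
        s1 = sum(comb(j, n - p) * (s - 1) ** (j - n + p) for j in range(n - p, n + 1))
    s2 = sum(comb(p + n, j) * (s - 1) ** (p + n - j) for j in range(n, p + n + 1))
    # Admissible j: 0..n-1 if n < p, and n-p+1..n-1 if n >= p.
    s3 = sum(comb(p + j, j) * (s - 1) ** p for j in range(max(0, n - p + 1), n))
    return s1, s2, s3


def classified_counts(n: int, p: int, s: int) -> tuple[int, int, int]:
    """Count the strings of S1, S2, S3 by enumeration; symbol 0 plays the role of a."""
    counts = [0, 0, 0]
    for length in range(max(0, n - p), n + p + 1):
        for v in product(range(s), repeat=length):
            c = v.count(0)
            if max(length, n) - min(n, c) != p:  # Theorem 3: v is not at distance p
                continue
            if length <= n:
                counts[0] += 1
            elif c >= n:
                counts[1] += 1
            else:
                counts[2] += 1
    return counts[0], counts[1], counts[2]
```

For n = 2, p = 2 and s = 2 both functions return (3, 11, 3), which matches the 3 strings of length at most 2, the 11 strings of length 4 and the 3 strings of length 3 listed for u = 00 in Table 3.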

By Theorem 3, the sets $S_{1}$, $S_{2}$, and $S_{3}$ of Lemma 7 are pairwise disjoint and cover the whole sphere, so summing their cardinalities yields a formula for the volume of the Levenshtein sphere

S_{L} (a^{n}, p)

.

Theorem 4.

Let a be a character of an alphabet Σ with

| Σ | = s \geq 2

. For integers

n \geq 1

and

p \geq 2

, the volume of the Levenshtein sphere centered at

a^{n}

with radius p is given by

| S_{L} (a^{n}, p) | = \begin{cases} \sum_{j = n - p}^{n} \binom{j}{n - p} {(s - 1)}^{j - n + p} + \sum_{j = n}^{p + n} \binom{p + n}{j} {(s - 1)}^{p + n - j} + \sum_{j = n - p + 1}^{n - 1} \binom{p + j}{j} {(s - 1)}^{p}, & \text{if } p \leq n, \\ \sum_{j = n}^{p + n} \binom{p + n}{j} {(s - 1)}^{p + n - j} + \sum_{j = 0}^{n - 1} \binom{p + j}{j} {(s - 1)}^{p}, & \text{if } p > n . \end{cases}
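The formula of Theorem 4 translates directly into a few lines of Python. The sketch below is ours (the function name `constant_sphere_volume` is an assumption made for illustration); it evaluates the two branches exactly as stated.

```python
from math import comb


def constant_sphere_volume(n: int, p: int, s: int) -> int:
    """|S_L(a^n, p)| over an alphabet of size s, following Theorem 4."""
    # Strings of length p + n with at least n occurrences of a (the set S2).
    longer = sum(comb(p + n, j) * (s - 1) ** (p + n - j) for j in range(n, p + n + 1))
    if p <= n:
        # Strings of length at most n with exactly n - p occurrences of a (S1).
        shorter = sum(comb(j, n - p) * (s - 1) ** (j - n + p) for j in range(n - p, n + 1))
        # Strings of length p + j with j occurrences of a, n - p < j < n (S3).
        mixed = sum(comb(p + j, j) * (s - 1) ** p for j in range(n - p + 1, n))
        return shorter + longer + mixed
    # Case p > n: S1 is empty and j ranges over 0, ..., n - 1 in S3.
    mixed = sum(comb(p + j, j) * (s - 1) ** p for j in range(0, n))
    return longer + mixed
```

With s = 2, the values for n = 1, ..., 5 are 8, 17, 28, 42, 59 for p = 2 and 16, 31, 60, 104, 168 for p = 3, in agreement with the cardinalities reported in Table 3 and Table 4.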

In [13], the authors provided a formula for the volume of the sphere

S_{L} (a, p)

, where

a \in Σ

and

p \geq 2

is an integer. This formula can be recovered from the above theorem by taking

n = 1

.

Corollary 4

([13]).

| S_{L} (a, p) | = {(s - 1)}^{p} + s^{p + 1} - {(s - 1)}^{p + 1} .

Proof.

Taking $n = 1$ in Theorem 4, so that $p > n$, we have

| S_{L} (a, p) | = \sum_{j = 1}^{p + 1} \binom{p + 1}{j} {(s - 1)}^{p + 1 - j} + \sum_{j = 0}^{0} \binom{p + j}{j} {(s - 1)}^{p} .

Since, by the binomial theorem,

\sum_{j = 1}^{p + 1} \binom{p + 1}{j} {(s - 1)}^{p + 1 - j} = {(s - 1 + 1)}^{p + 1} - {(s - 1)}^{p + 1} = s^{p + 1} - {(s - 1)}^{p + 1},

it follows that

| S_{L} (a, p) | = {(s - 1)}^{p} + s^{p + 1} - {(s - 1)}^{p + 1} .

□
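As a quick numerical illustration of Corollary 4 (our own worked instance, using the binary alphabet so that s = 2), the formula gives:

```latex
\begin{aligned}
s = 2,\ p = 2: \quad | S_{L} (a, 2) | &= (2 - 1)^{2} + 2^{3} - (2 - 1)^{3} = 1 + 8 - 1 = 8, \\
s = 2,\ p = 3: \quad | S_{L} (a, 3) | &= (2 - 1)^{3} + 2^{4} - (2 - 1)^{4} = 1 + 16 - 1 = 16 .
\end{aligned}
```

These two values coincide with the cardinalities listed for u = 0 in Table 3 and Table 4, respectively.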

In [13], the authors also suggested the following conjecture.

Problem 1

([13]). There is a function

f : \mathbb{Z}^{+} \times Σ^{*} \times \mathbb{N} \to [0, \infty)

that satisfies the following conditions:

$(i)$: $| f (z, u, p) - z |$ is monotonically increasing as $l (u) \to p$ ;
$(i i)$: $f (z, u, p) \to z$ as $p \to \infty$ ,

so that if

l (u) \leq p

, then

| S_{L} (u, p) | = f {(s, u, p)}^{l (u)} s^{p},

where s is the size of Σ.

We conclude this section by formulating the following open problems, which naturally arise from the preceding analysis. These problems highlight directions for future investigation and aim to deepen our understanding of the combinatorial structure of Levenshtein spheres.

Problem 2.

Given an integer

p \geq 2

and a string u over an alphabet Σ of size s, a considerably more difficult task is to determine the volume of the Levenshtein sphere

S_{L} (u, p)

when the run-length of u satisfies

ρ (u) \geq 2

.

Problem 3.

Given an alphabet Σ, an integer

p \geq 2

, and two strings

u_{1}, u_{2}

over Σ of the same length, under what conditions on

u_{1}

and

u_{2}

does the equality

|S_{L} (u_{1}, p)| = |S_{L} (u_{2}, p)|

hold?

6. Pseudocode and Illustrative Examples

In this section we present concise pseudocode for our principal computational tasks (Algorithm 1):

1.: Evaluating the Levenshtein distance between arbitrary strings;
2.: Computing, by exhaustive enumeration, the volume $|S_{L} (u, p)|$ of the Levenshtein sphere of radius p centered at a fixed string u.

To illustrate this pseudocode and to corroborate the theoretical results established earlier, we also include a series of tables (Tables 1–6). Each table lists selected center strings u, the relevant parameters (radius, alphabet and, where shown, run-length), and the corresponding output values. The examples serve as a practical guide for implementation and show how the run-length

ρ (u)

, the word length

l (u)

, the radius p, and the alphabet size s influence the cardinality of

S_{L} (u, p)

.

Implementing this pseudocode in Python 3.12 yields the sequence of tables presented in this section.

We now present a series of tables that illustrate the theoretical results established in the preceding sections. These tables are intended to highlight how the structure of the string u, particularly its length and run-length, influences the cardinality of the corresponding Levenshtein sphere

S_{L} (u, p)

. Each table serves to validate and exemplify the formulas and properties discussed earlier.

Algorithm 1 Enumerate all strings at Levenshtein distance p from a given string u

Require: Integer $n \geq 1$, radius $p \geq 1$, alphabet size $s \geq 2$
Require: Alphabet $Σ$ with $| Σ | = s$
Require: String u over $Σ$ with length n
Ensure: All strings v such that $L (u, v) = p$ and the count $| S_{L} (u, p) |$

function LevenshteinDistance($x, y$)    ▹ internal helper; strings are indexed from 0
    $m_{x} \leftarrow length (x)$, $m_{y} \leftarrow length (y)$
    Create array $D [0 . . m_{x}] [0 . . m_{y}]$
    for $i \leftarrow 0$ to $m_{x}$ do    ▹ initialise first column
        $D [i] [0] \leftarrow i$
    for $j \leftarrow 0$ to $m_{y}$ do    ▹ initialise first row
        $D [0] [j] \leftarrow j$
    for $i \leftarrow 1$ to $m_{x}$ do
        for $j \leftarrow 1$ to $m_{y}$ do
            $cost \leftarrow \begin{cases} 0 & \text{if } x [i - 1] = y [j - 1] \\ 1 & \text{otherwise} \end{cases}$
            $D [i] [j] \leftarrow \min (D [i - 1] [j] + 1, D [i] [j - 1] + 1, D [i - 1] [j - 1] + cost)$
    return $D [m_{x}] [m_{y}]$

/* enumeration phase */
Solutions $\leftarrow \emptyset$
$minLen \leftarrow \max (0, n - p)$, $maxLen \leftarrow n + p$
for $l \leftarrow minLen$ to $maxLen$ do
    for all $v \in Σ^{l}$ do
        if LevenshteinDistance($u, v$) $= p$ then
            Solutions $\leftarrow$ Solutions $\cup \{v\}$
Output every string in Solutions
Output $| S_{L} (u, p) | \leftarrow |$Solutions$|$
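A possible Python rendering of Algorithm 1 is sketched below. It is our own transcription, given for illustration, and not necessarily the code used to produce the tables; the names `levenshtein_distance` and `levenshtein_sphere` are assumptions.

```python
from itertools import product


def levenshtein_distance(x: str, y: str) -> int:
    """Full-matrix dynamic programme, mirroring the helper of Algorithm 1."""
    rows, cols = len(x), len(y)
    d = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows + 1):
        d[i][0] = i  # delete the first i characters of x
    for j in range(cols + 1):
        d[0][j] = j  # insert the first j characters of y
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            cost = 0 if x[i - 1] == y[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost)
    return d[rows][cols]


def levenshtein_sphere(u: str, p: int, alphabet: str) -> list[str]:
    """All strings v over the alphabet with L(u, v) = p, by exhaustive search."""
    n = len(u)
    sphere = []
    # Any v at distance p from u has length between max(0, n - p) and n + p.
    for length in range(max(0, n - p), n + p + 1):
        for letters in product(alphabet, repeat=length):
            v = "".join(letters)
            if levenshtein_distance(u, v) == p:
                sphere.append(v)
    return sphere
```

For example, `levenshtein_sphere("01", 1, "012")` returns the 13 strings listed in the first row of Table 1. Since the search visits on the order of $s^{n + p}$ candidates, this approach is only practical for short strings and small radii.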

Observe from Table 5 that for radii

p \geq 2

, the equality

| S_{L} (u_{1}, p) | = | S_{L} (u_{2}, p) |

does not necessarily hold, even when the two strings share the same run-length

ρ (u_{1}) = ρ (u_{2})

and length

l (u_{1}) = l (u_{2})

. By contrast, when

p = 1

, these two conditions are sufficient to guarantee

| S_{L} (u_{1}, 1) | = | S_{L} (u_{2}, 1) |

(see Theorem 2).
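For p = 1 the dependence on length and run-length alone can be made explicit. The sketch below assumes the closed form quoted in the abstract for Theorem 2, namely $(2 l (u) + 1) s - (2 l (u) - ρ (u))$; the function names are ours and serve only as an illustration.

```python
from itertools import groupby


def run_length(u: str) -> int:
    """Number of maximal runs of equal characters in u (zero for the empty string)."""
    return sum(1 for _ in groupby(u))


def unit_sphere_volume(u: str, s: int) -> int:
    """|S_L(u, 1)| over an alphabet of size s, assuming the formula of Theorem 2."""
    length = len(u)
    return (2 * length + 1) * s - (2 * length - run_length(u))
```

With s = 3, the strings 01, 010 and 0101 give 13, 18 and 23, the cardinalities shown in Table 1.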

7. Discussion

In this paper, we established a closed-form expression for the volume of the unit Levenshtein sphere, that is, the set

S_{L} (u, 1) = {v \in Σ^{*} ∣ L (u, v) = 1},

expressed as a function of the string length

l (u)

, the alphabet size s, and the run-length

ρ (u)

.

To provide a comparative perspective, we also revisited the classical Hamming metric, for which we derived exact counts of Hamming sphere volumes. This comparison highlighted the structural differences between the Hamming distance, which only accounts for substitutions, and the more flexible Levenshtein distance, which captures a broader spectrum of edit operations—including insertions, deletions, and substitutions.

The general problem of determining the volume

| S_{L} (u, p) |

for an arbitrary radius

p \geq 2

remains a challenging and largely unresolved combinatorial question. Nonetheless, for strings of the form

u = a^{n}

—i.e., strings with run-length

ρ (u) = 1

—we obtained a closed-form expression for

| S_{L} (a^{n}, p) |

valid for all

p \geq 2

.

The difficulty in deriving a closed-form expression for

| S_{L} (u, p) |

when

ρ (u) \geq 2

stems from the absence of an explicit formula for the Levenshtein distance

L (u, v)

, even in seemingly simple cases. For instance, in the case where

u = a^{n} b^{m}

(so that

ρ (u) = 2

), no general closed-form expression for

L (u, v)

is known. This lack of an analytic formula complicates the enumeration of all strings v such that

L (u, v) = p

, and thereby hinders the computation of

| S_{L} (u, p) |

.

When the run-length satisfies

ρ (u) \geq 2

, the allowable transformations are strongly influenced by the run-partition

rp (u)

, which records the lengths of the successive character runs in u. We generated experimental values for the volumes

| S_{L} (u, p) |

for selected strings with

ρ (u) \geq 2

, providing benchmark data that future theoretical developments must aim to explain.
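The run-partition itself is straightforward to compute. The following sketch is illustrative only, and the function name `run_partition` is our own choice; it returns the tuple of run lengths of a string, whose number of entries is the run-length $ρ (u)$.

```python
from itertools import groupby


def run_partition(u: str) -> tuple[int, ...]:
    """Lengths of the successive maximal runs of equal characters in u."""
    return tuple(len(list(group)) for _, group in groupby(u))
```

For instance, `run_partition("0100000")` and `run_partition("0122222")` both return `(1, 1, 5)`, as recorded in the first two rows of Table 6.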

A key open direction is to identify and formalize the structural patterns that emerge in the

ρ (u) \geq 2

case and translate them into rigorous closed-form expressions.

Beyond their intrinsic combinatorial appeal, explicit formulas for Levenshtein sphere volumes have concrete applications in areas such as error-correcting codes (particularly in sequence alignment) and compressed-text indexing. Advancing our understanding of the multi-run case thus offers both theoretical insight and practical impact.

Author Contributions

Methodology, S.A. and O.E.; Investigation, S.A. and O.E.; Writing—review & editing, S.A. and O.E. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Acknowledgments

The authors gratefully acknowledge the valuable comments and suggestions of the three anonymous referees, which significantly improved both the mathematical content and the clarity of the exposition. The authors also acknowledge the support provided by the Deanship of Research at King Fahd University of Petroleum and Minerals, Saudi Arabia.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Levenshtein, V.I. Binary codes capable of correcting deletions, insertions, and reversals. Sov. Phys. Dokl. 1966, 10, 707–710.
2. Barrón-Cedeno, A.; Stein, B.; Rosso, P. Cross-language plagiarism detection. Lang Resour. Eval. 2011, 45, 45–62.
3. Brill, E.; Moore, R.C. An improved error model for noisy channel spelling correction. In Proceedings of the 38th Annual Meeting on Association for Computational Linguistics, Hong Kong, China, 3–6 October 2000; pp. 286–293.
4. Koehn, P. Europarl: A Parallel Corpus for Statistical Machine Translation. In Proceedings of the 10th Machine Translation Summit, Phuket, Thailand, 12–16 September 2005; pp. 79–86.
5. Jurafsky, D.; Martin, J.H. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition; Prentice Hall: Upper Saddle River, NJ, USA, 2008.
6. Hamming, R.W. Error detecting and error correcting codes. Bell Syst. Tech. J. 1950, 29, 147–160.
7. Katz, J.; Lindell, Y. Introduction to Modern Cryptography, 3rd ed.; Chapman & Hall/CRC Cryptography and Network Security; CRC Press: Boca Raton, FL, USA, 2021.
8. Lin, S.; Costello, D.J. Error Control Coding: Fundamentals and Applications, 2nd ed.; Pearson/Prentice Hall: Upper Saddle River, NJ, USA, 2004.
9. Andoni, A.; Indyk, P. Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions. Commun. ACM 2018, 51, 117–122.
10. Amir, A.; Amit, M.; Landau, G.M.; Sokol, D. Period recovery of strings over the Hamming and edit distances. Theor. Comput. Sci. 2018, 710, 2–18.
11. Marçais, G.; DeBlasio, D.; Pandey, P.; Kingsford, C. Locality-sensitive hashing for the edit distance. Bioinformatics 2019, 35, i127–i135.
12. Malon, S.; Freeman, H. On the encoding of arbitrary geometric configurations. IRE Trans. EC 1961, 10, 260–268.
13. Koyano, H.; Hayashida, M. Volume formula and growth rates of the balls of strings under the edit distances. Appl. Math. Comput. 2023, 458, 128202.
14. Wang, M.; Wang, S. Connectivity and diagnosability of center k-ary n-cubes. Discrete Appl. Math. 2021, 294, 98–107.
15. Wang, M.; Lin, Y.; Wang, S. The connectivity and nature diagnosability of expanded k-ary n-cubes. RAIRO Theor. Inform. Appl. 2017, 51, 71–89.
16. Bakhtary, P.; Echi, O. On minimal Hamming compatible distances. RAIRO Theor. Inform. Appl. 2014, 48, 495–503.
17. de Moivre, A. The Doctrine of Chances: Or, a Method of Calculating the Probabilities of Events in Play; Chelsea Publishing Company: New York, NY, USA, 1967.
18. Navarro, G. A guided tour to approximate string matching. ACM Comput. Surv. 2001, 33, 31–88.

Table 1. Levenshtein spheres

S_{L} (u, 1)

of center u and radius

p = 1

over the alphabet

Σ = {0, 1, 2}

, illustrating the cardinality

| S_{L} (u, 1) |

stated in Theorem 2.

u	$S_{L} (u, 1)$	$| S_{L} (u, 1) |$
01	0, 1, 00, 02, 11, 21, 001, 010, 011, 012, 021, 101, 201	13
010	00, 01, 10, 000, 011, 012, 020, 110, 210, 0010, 0100, 0101, 0102, 0110, 0120, 0210, 1010, 2010	18
0101	001, 010, 011, 101, 0001, 0100, 0102, 0111, 0121, 0201, 1101, 2101, 00101, 01001, 01010, 01011, 01012, 01021, 01101, 01201, 02101, 10101, 20101	23

Table 2. Levenshtein spheres

S_{L} (u, 2)

over the alphabet

Σ = {0, 1, 2}

.

u	$S_{L} (u, 2)$	$| S_{L} (u, 2) |$
01	$ε$ , 2, 10, 12, 20, 22, 000, 002, 020, 022, 100, 102, 110, 111, 112, 121, 200, 202, 210, 211, 212, 221, 0001, 0010, 0011, 0012, 0021, 0100, 0101, 0102, 0110, 0111, 0112, 0120, 0121, 0122, 0201, 0210, 0211, 0212, 0221, 1001, 1010, 1011, 1012, 1021, 1101, 1201, 2001, 2010, 2011, 2012, 2021, 2101, 2201	55
010	0, 1, 02, 11, 12, 20, 21, 001, 002, 021, 022, 100, 101, 102, 111, 112, 120, 200, 201, 211, 212, 220, 0000, 0001, 0002, 0011, 0012, 0020, 0111, 0112, 0121, 0122, 0200, 0201, 0202, 0211, 0212, 0220, 1000, 1011, 1012, 1020, 1100, 1101, 1102, 1110, 1120, 1210, 2000, 2011, 2012, 2020, 2100, 2101, 2102, 2110, 2120, 2210, 00010, 00100, 00101, 00102, 00110, 00120, 00210, 01000, 01001, 01002, 01010, 01011, 01012, 01020, 01021, 01022, 01100, 01101, 01102, 01110, 01120, 01200, 01201, 01202, 01210, 01220, 02010, 02100, 02101, 02102, 02110, 02120, 02210, 10010, 10100, 10101, 10102, 10110, 10120, 10210, 11010, 12010, 20010, 20100, 20101, 20102, 20110, 20120, 20210, 21010, 22010	109
0101	00, 01, 10, 11, 000, 002, 012, 020, 021, 100, 102, 110, 111, 121, 201, 210, 211, 0000, 0002, 0010, 0011, 0012, 0021, 0110, 0112, 0120, 0122, 0200, 0202, 0210, 0211, 0221, 1001, 1010, 1011, 1012, 1021, 1100, 1102, 1111, 1121, 1201, 2001, 2010, 2011, 2100, 2102, 2111, 2121, 2201, 00001, 00010, 00011, 00012, 00021, 00100, 00102, 00111, 00121, 00201, 01000, 01002, 01020, 01022, 01100, 01102, 01110, 01111, 01112, 01121, 01200, 01202, 01210, 01211, 01212, 01221, 02001, 02010, 02011, 02012, 02021, 02100, 02102, 02111, 02121, 02201, 10001, 10100, 10102, 10111, 10121, 10201, 11001, 11010, 11011, 11012, 11021, 11101, 11201, 12101, 20001, 20100, 20102, 20111, 20121, 20201, 21001, 21010, 21011, 21012, 21021, 21101, 21201, 22101, 000101, 001001, 001010, 001011, 001012, 001021, 001101, 001201, 002101, 010001, 010010, 010011, 010012, 010021, 010100, 010101, 010102, 010110, 010111, 010112, 010120, 010121, 010122, 010201, 010210, 010211, 010212, 010221, 011001, 011010, 011011, 011012, 011021, 011101, 011201, 012001, 012010, 012011, 012012, 012021, 012101, 012201, 020101, 021001, 021010, 021011, 021012, 021021, 021101, 021201, 022101, 100101, 101001, 101010, 101011, 101012, 101021, 101101, 101201, 102101, 110101, 120101, 200101, 201001, 201010, 201011, 201012, 201021, 201101, 201201, 202101, 210101, 220101	187

Table 3. Levenshtein spheres

S_{L} (a^{n}, 2)

over

Σ = {0, 1}

, illustrating Theorem 4.

u	$S_{L} (u, 2)$	$| S_{L} (u, 2) |$
0	11, 000, 001, 010, 011, 100, 101, 110	8
00	$ε$ , 1, 11, 011, 101, 110, 0000, 0001, 0010, 0011, 0100, 0101, 0110, 1000, 1001, 1010, 1100	17
000	0, 01, 10, 011, 101, 110, 0011, 0101, 0110, 1001, 1010, 1100, 00000, 00001, 00010, 00011, 00100, 00101, 00110, 01000, 01001, 01010, 01100, 10000, 10001, 10010, 10100, 11000	28
0000	00, 001, 010, 100, 0011, 0101, 0110, 1001, 1010, 1100, 00011, 00101, 00110, 01001, 01010, 01100, 10001, 10010, 10100, 11000, 000000, 000001, 000010, 000011, 000100, 000101, 000110, 001000, 001001, 001010, 001100, 010000, 010001, 010010, 010100, 011000, 100000, 100001, 100010, 100100, 101000, 110000	42
00000	000, 0001, 0010, 0100, 1000, 00011, 00101, 00110, 01001, 01010, 01100, 10001, 10010, 10100, 11000, 000011, 000101, 000110, 001001, 001010, 001100, 010001, 010010, 010100, 011000, 100001, 100010, 100100, 101000, 110000, 0000000, 0000001, 0000010, 0000011, 0000100, 0000101, 0000110, 0001000, 0001001, 0001010, 0001100, 0010000, 0010001, 0010010, 0010100, 0011000, 0100000, 0100001, 0100010, 0100100, 0101000, 0110000, 1000000, 1000001, 1000010, 1000100, 1001000, 1010000, 1100000	59

Table 4. Levenshtein spheres

S_{L} (u, 3)

over the alphabet

Σ = {0, 1}

, illustrating the volume computation described in Theorem 4.

u	$S_{L} (u, 3)$	$| S_{L} (u, 3) |$
0	111, 0000, 0001, 0010, 0011, 0100, 0101, 0110, 0111, 1000, 1001, 1010, 1011, 1100, 1101, 1110	16
00	111, 0111, 1011, 1101, 1110, 00000, 00001, 00010, 00011, 00100, 00101, 00110, 00111, 01000, 01001, 01010, 01011, 01100, 01101, 01110, 10000, 10001, 10010, 10011, 10100, 10101, 10110, 11000, 11001, 11010, 11100	31
000	$ε$ , 1, 11, 111, 0111, 1011, 1101, 1110, 00111, 01011, 01101, 01110, 10011, 10101, 10110, 11001, 11010, 11100, 000000, 000001, 000010, 000011, 000100, 000101, 000110, 000111, 001000, 001001, 001010, 001011, 001100, 001101, 001110, 010000, 010001, 010010, 010011, 010100, 010101, 010110, 011000, 011001, 011010, 011100, 100000, 100001, 100010, 100011, 100100, 100101, 100110, 101000, 101001, 101010, 101100, 110000, 110001, 110010, 110100, 111000	60
0000	0, 01, 10, 011, 101, 110, 0111, 1011, 1101, 1110, 00111, 01011, 01101, 01110, 10011, 10101, 10110, 11001, 11010, 11100, 000111, 001011, 001101, 001110, 010011, 010101, 010110, 011001, 011010, 011100, 100011, 100101, 100110, 101001, 101010, 101100, 110001, 110010, 110100, 111000, 0000000, 0000001, 0000010, 0000011, 0000100, 0000101, 0000110, 0000111, 0001000, 0001001, 0001010, 0001011, 0001100, 0001101, 0001110, 0010000, 0010001, 0010010, 0010011, 0010100, 0010101, 0010110, 0011000, 0011001, 0011010, 0011100, 0100000, 0100001, 0100010, 0100011, 0100100, 0100101, 0100110, 0101000, 0101001, 0101010, 0101100, 0110000, 0110001, 0110010, 0110100, 0111000, 1000000, 1000001, 1000010, 1000011, 1000100, 1000101, 1000110, 1001000, 1001001, 1001010, 1001100, 1010000, 1010001, 1010010, 1010100, 1011000, 1100000, 1100001, 1100010, 1100100, 1101000, 1110000	104
00000	00, 001, 010, 100, 0011, 0101, 0110, 1001, 1010, 1100, 00111, 01011, 01101, 01110, 10011, 10101, 10110, 11001, 11010, 11100, 000111, 001011, 001101, 001110, 010011, 010101, 010110, 011001, 011010, 011100, 100011, 100101, 100110, 101001, 101010, 101100, 110001, 110010, 110100, 111000, 0000111, 0001011, 0001101, 0001110, 0010011, 0010101, 0010110, 0011001, 0011010, 0011100, 0100011, 0100101, 0100110, 0101001, 0101010, 0101100, 0110001, 0110010, 0110100, 0111000, 1000011, 1000101, 1000110, 1001001, 1001010, 1001100, 1010001, 1010010, 1010100, 1011000, 1100001, 1100010, 1100100, 1101000, 1110000, 00000000, 00000001, 00000010, 00000011, 00000100, 00000101, 00000110, 00000111, 00001000, 00001001, 00001010, 00001011, 00001100, 00001101, 00001110, 00010000, 00010001, 00010010, 00010011, 00010100, 00010101, 00010110, 00011000, 00011001, 00011010, 00011100, 00100000, 00100001, 00100010, 00100011, 00100100, 00100101, 00100110, 00101000, 00101001, 00101010, 00101100, 00110000, 00110001, 00110010, 00110100, 00111000, 01000000, 01000001, 01000010, 01000011, 01000100, 01000101, 01000110, 01001000, 01001001, 01001010, 01001100, 01010000, 01010001, 01010010, 01010100, 01011000, 01100000, 01100001, 01100010, 01100100, 01101000, 01110000, 10000000, 10000001, 10000010, 10000011, 10000100, 10000101, 10000110, 10001000, 10001001, 10001010, 10001100, 10010000, 10010001, 10010010, 10010100, 10011000, 10100000, 10100001, 10100010, 10100100, 10101000, 10110000, 11000000, 11000001, 11000010, 11000100, 11001000, 11010000, 11100000	168

Table 5. Illustration of

| S_{L} (u, p) |

for

p = 2, 3, 5, 8

over the alphabet

Σ = {0, 1}

with fixed

n = l (u)

and run-length

ρ (u)

. See Problem 3 for further discussion.

n	$ρ (u)$	u	$| S_{L} (u, 2) |$	$| S_{L} (u, 3) |$	$| S_{L} (u, 5) |$	$| S_{L} (u, 8) |$
4	2	0001	47	110	472	4026
		1110	47	110	472	4026
		0011	49	111	474	4027
		1100	49	111	474	4027
4	3	0010	52	112	474	4028
		1101	52	112	474	4028
		0110	53	112	475	4029
		1001	53	112	475	4029
4	4	0101	55	112	474	4028
		1010	55	112	474	4028

Table 6. Distinct Volumes for equal run-partitions (the alphabet is

Σ = {0, 1, 2}

).

String u	Run-Length $ρ (u)$	Run-Partition $rp (u)$	Volume $| S_{L} (u, 2) |$
0100000	3	(1, 1, 5)	446
0122222	3	(1, 1, 5)	448
0100001	4	(1, 1, 4, 1)	479
0122220	4	(1, 1, 4, 1)	481
0111010	5	(1, 3, 1, 1, 1)	514
0111012	5	(1, 3, 1, 1, 1)	517
0010101	6	(2, 1, 1, 1, 1, 1)	542
1102020	6	(2, 1, 1, 1, 1, 1)	547
0101010	7	(1, 1, 1, 1, 1, 1, 1)	565
0101012	7	(1, 1, 1, 1, 1, 1, 1)	571

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
