# Static Analysis for ECMAScript String Manipulation Programs


## Abstract


## 1. Introduction

`eval`. In this way, dynamic languages provide multiple string features that simplify the writing of programs while allowing statically unpredictable executions, which may make programs harder to understand [1]. For this reason, string obfuscation (e.g., string splitting) is becoming one of the most common obfuscation techniques in JavaScript malware [2], making code hard to analyze statically. Consider, for example, the JavaScript program fragment in Figure 1, where strings are manipulated, de-obfuscated, combined into the variable `d` and finally transformed into executable code, the statement `ws = new ActiveXObject(WScript.Shell)`. This command, in Internet Explorer, opens a shell which may execute malicious commands. The command is not hard-coded in the fragment but is built at run-time, and the initial values of $\mathtt{i}$, $\mathtt{j}$ and $\mathtt{k}$ are unknown, as is the number of iterations of the loops.

`d`. Unfortunately, existing static analyzers for dynamic languages [3,4,5,6] might fail to precisely analyze strings in dynamic contexts. For instance, in the example above, TAJS [3], JSAI [4] and SAFE [5] lose precision on the `eval` input value and any information gathered so far about it. Namely, the issue of analyzing dynamic languages, even if tackled by sophisticated tools such as the cited ones, still lacks formal approaches for handling the dynamic features of string manipulation, such as dynamic typing, implicit type conversion and dynamic code generation. Instead, in [7], a new approach for dynamic language analysis is proposed, based on finite state automata for abstracting strings, coming with both a precise string abstraction able to infer string properties in general and a sound abstract interpreter for dynamically-generated code.

`eval` statement might execute, we surely need to (over-)approximate the set of precise string values of its input. Hence, we propose an approach defining a collecting semantics for strings. With this task in mind, we will first discuss how to combine abstract domains of primitive types (strings, integers and booleans) in order to capture dynamic typing. Once we have such an abstract domain, we will define on it an abstract semantics for a $\mu \mathsf{JS}$ language, augmented with implicit type conversion, dynamic typing and several interesting string operations taken from the official ECMAScript language specification [10], i.e., the JavaScript language specification; the concrete semantics of $\mu \mathsf{JS}$ is inspired by the JavaScript one. In particular, for each of these operations we will provide the algorithm computing its abstract semantics and discuss its soundness and completeness.

## 2. Background

#### 2.1. String Notation

`"1"`, and not `"+1"`). Given $\sigma \in {\Sigma}^{*}$ and $n\in \mathbb{N}$, we denote with ${\sigma}^{n}$ the $n$-times concatenation of $\sigma $. Given a symbol $c\in \Sigma $ we denote with $\mathsf{toLowerCase}\left(c\right)$ its corresponding lower-case symbol, if it is a capital letter; otherwise $c$ is returned. We abuse notation denoting by $\mathsf{toLowerCase}\left(\sigma \right)$ the string $\sigma $ where at each position any upper-case symbol is replaced with the corresponding lower-case symbol.

#### 2.2. Regular Languages and Finite State Automata

#### 2.3. Abstract Interpretation

## 3. The Core Language

`String`, detailed in the ECMAScript language specification [10]. Even though we have decided to focus on a core of the operations, note that the missing methods (e.g., `indexOf` or `endsWith`) can be easily modeled as compositions of our chosen string methods or as particular cases of them. Nevertheless, as we will discuss in Section 6, these operations have been implemented and tested.

#### $\mu \mathsf{JS}$ Semantics

- `substring`: It extracts the substring between two indexes from a string. The semantics is defined by the function $\mathrm{Ss}:\mathbb{S}\times \mathbb{Z}\times \mathbb{Z}\to \mathbb{S}$ as: $$\mathrm{Ss}(\sigma ,i,j)\stackrel{\mathrm{def}}{=}\begin{cases}\mathrm{Ss}(\sigma ,j,i) & j<i\\ {\sigma}_{i}\cdots {\sigma}_{j} & j<\left|\sigma \right|\wedge i\le j\\ {\sigma}_{i}\cdots {\sigma}_{n} & j\ge n=\left|\sigma \right|\wedge i\le j\end{cases}$$
- `charAt`: It returns the character, i.e., the string of unitary length, at a specified index in a string $\sigma $. The semantics is the function $\mathrm{Ca}:\mathbb{S}\times \mathbb{Z}\to \mathbb{S}$ defined as follows: $$\mathrm{Ca}(\sigma ,i)\stackrel{\mathrm{def}}{=}\begin{cases}{\sigma}_{i} & 0\le i<\left|\sigma \right|\\ \epsilon & \mathrm{otherwise}\end{cases}$$
- `length`: It returns the length of a string $\sigma \in \mathbb{S}$. Its semantics is the function $\mathrm{Le}:\mathbb{S}\to \mathbb{Z}$ defined as $\mathrm{Le}\left(\sigma \right)\stackrel{\mathrm{def}}{=}\left|\sigma \right|$.
- `concat`: It returns the concatenation of two strings. Its concrete semantics $\mathrm{Cc}:\mathbb{S}\times \mathbb{S}\to \mathbb{S}$ relies on the concatenation operator reported in Section 2: $$\mathrm{Cc}(\sigma ,{\sigma}^{\prime})=\sigma \cdot {\sigma}^{\prime}$$
- `startsWith`: It determines whether a specified string $\sigma $ starts with ${\sigma}^{\prime}$. The semantics is the function $\mathrm{Sw}:\mathbb{S}\times \mathbb{S}\to \mathbb{B}$ defined as: $$\mathrm{Sw}(\sigma ,{\sigma}^{\prime})\stackrel{\mathrm{def}}{=}\begin{cases}\mathtt{true} & \exists {\sigma}^{\prime \prime}\in {\Sigma}^{*}.\ \sigma ={\sigma}^{\prime}\cdot {\sigma}^{\prime \prime}\\ \mathtt{false} & \mathrm{otherwise}\end{cases}$$
- `repeat`: It returns the given string repeated $n$ times. The semantics is the function $\mathrm{Rt}:\mathbb{S}\times \mathbb{Z}\to \mathbb{S}$ defined as $\mathrm{Rt}(\sigma ,n)\stackrel{\mathrm{def}}{=}{\sigma}^{n}$.
- `includes`: It determines whether a string ${\sigma}^{\prime}$ is a substring of $\sigma $. The semantics is the function $\mathrm{In}:\mathbb{S}\times \mathbb{S}\to \mathbb{B}$ defined as: $$\mathrm{In}(\sigma ,{\sigma}^{\prime})\stackrel{\mathrm{def}}{=}\begin{cases}\mathtt{true} & \exists \varphi ,\psi \in {\Sigma}^{*}.\ \sigma =\varphi \cdot {\sigma}^{\prime}\cdot \psi \\ \mathtt{false} & \mathrm{otherwise}\end{cases}$$
- `toLowerCase`: It returns the given string in all lowercase letters. The semantics is the function $\mathrm{Lc}:\mathbb{S}\to \mathbb{S}$ defined as $\mathrm{Lc}\left(\sigma \right)\stackrel{\mathrm{def}}{=}\mathsf{toLowerCase}\left(\sigma \right)$.
- `trimLeft`: It removes all the white-spaces at the beginning of a string. The semantics is the function $\mathrm{Tl}:\mathbb{S}\to \mathbb{S}$ defined as: $$\mathrm{Tl}\left(\sigma \right)\stackrel{\mathrm{def}}{=}{\sigma}^{\prime}\quad \mathrm{where}\ \psi =\mathsf{max}\{\,{\psi}^{\prime}\in {(␣)}^{*}\mid \sigma ={\psi}^{\prime}\cdot {\sigma}^{\prime}\,\}\wedge \sigma =\psi \cdot {\sigma}^{\prime}$$
- `trimRight`: It removes all the white-spaces at the end of a string. The semantics is the function $\mathrm{Tr}:\mathbb{S}\to \mathbb{S}$ defined as: $$\mathrm{Tr}\left(\sigma \right)\stackrel{\mathrm{def}}{=}{\sigma}^{\prime}\quad \mathrm{where}\ \psi =\mathsf{max}\{\,{\psi}^{\prime}\in {(␣)}^{*}\mid \sigma ={\sigma}^{\prime}\cdot {\psi}^{\prime}\,\}\wedge \sigma ={\sigma}^{\prime}\cdot \psi $$
- `trim`: It removes all the white-spaces at the end and beginning of a string. The semantics is the function $\mathrm{Tm}:\mathbb{S}\to \mathbb{S}$ defined as $\mathrm{Tm}\left(\sigma \right)\stackrel{\mathrm{def}}{=}\mathrm{Tr}\left(\mathrm{Tl}\left(\sigma \right)\right)$.
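As a rough illustration, the case definitions above can be transcribed almost literally into JavaScript. This is only a sketch: the function names (`ss`, `ca`, `sw`, `inc`, `tl`, `tr`, `tm`) are hypothetical, negative indexes are clamped to zero and the end index of `ss` is treated as exclusive (both assumptions chosen to agree with ECMAScript's built-in `substring`), and only the blank character is trimmed, whereas the built-in trimming methods also remove other white-space characters.

```javascript
// Hypothetical transcription of the concrete semantics; not the authors' code.
const clamp = (n) => Math.max(0, n); // negative indexes are treated as zero

function ss(sigma, i, j) {
  i = clamp(i);
  j = clamp(j);
  if (j < i) return ss(sigma, j, i); // swap case
  // An end index beyond the string is cut at |sigma|.
  return sigma.slice(i, Math.min(j, sigma.length));
}

function ca(sigma, i) {
  // The character at index i, or the empty string when i is out of range.
  return 0 <= i && i < sigma.length ? sigma[i] : "";
}

function sw(sigma, prefix) {
  // true iff sigma = prefix · rest for some rest
  return sigma.slice(0, prefix.length) === prefix;
}

function inc(sigma, factor) {
  // true iff sigma = phi · factor · psi for some phi, psi
  for (let k = 0; k + factor.length <= sigma.length; k++) {
    if (sigma.slice(k, k + factor.length) === factor) return true;
  }
  return false;
}

function tl(sigma) {
  let k = 0;
  while (k < sigma.length && sigma[k] === " ") k++; // maximal blank prefix
  return sigma.slice(k);
}

function tr(sigma) {
  let k = sigma.length;
  while (k > 0 && sigma[k - 1] === " ") k--; // maximal blank suffix
  return sigma.slice(0, k);
}

const tm = (sigma) => tr(tl(sigma));
```

On non-negative and negative integer indexes alike, `ss` is meant to behave like the built-in `substring`, which gives a simple way to cross-check the transcription.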

#### Implicit Type Conversion

## 4. An Abstract Domain for String Manipulation

#### 4.1. The Finite State Automata Abstract Domain for Strings

**Theorem 1.**

#### Widening

**Example 1.**

`str = " "; while (x < 100) { str = str + "a"; x = x + 1; }`

`x` is unknown and so is the number of iterations of the `while`-loop. In these cases, in order to guarantee soundness and termination, we apply the widening operator.

`str` at the beginning of the second iteration of the loop, while in Figure 5b the abstract value of the variable `str` at the end of the second iteration. Before starting a new iteration, in the example, we apply ${\nabla}_{1}$ between the two automata; specifically, we merge all the states having the same outgoing character. The minimization of the automaton so obtained is reported in Figure 5c. The next iteration reaches the fix-point, guaranteeing termination.

#### 4.2. An Abstract Domain for $\mu \mathsf{JS}$

#### 4.2.1. Coalesced Sum

**Definition 1**

`x` may be both a string and a boolean value after the `if` statement. On the coalesced sum domain, the analysis would lose any precision w.r.t. the collecting semantics by returning ${\alpha}_{\mathbb{S}}(\mathtt{"42"})\bigsqcup {\alpha}_{\mathbb{B}}\left(\mathtt{true}\right)=\top $.

#### 4.2.2. Cartesian Product

`x` after the `if`-execution is precisely $(\perp ,{\alpha}_{\mathbb{B}}\left(\mathtt{true}\right),{\alpha}_{\mathbb{S}}(\mathtt{"42"}),\perp )$, now an element of the domain, inferring that the value of `x` can be ${\alpha}_{\mathbb{B}}\left(\mathtt{true}\right)$ or ${\alpha}_{\mathbb{S}}(\mathtt{"42"})$ but surely not an abstract integer or $\mathtt{NaN}$.
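The difference between the two combinations can be sketched with a component-wise join on 4-tuples. The sketch below makes several assumptions that are not taken from the text: the component order (integer, boolean, string, NaN) is read off the tuple above, the string component is a finite set rather than an automaton, and all names (`BOT`, `TOP`, `join`, `absStr`, `absBool`) are hypothetical.

```javascript
// Hypothetical Cartesian-product abstract value: one slot per primitive type.
const BOT = Symbol("bot");
const TOP = Symbol("top");

// Join on the flat constant lattice, used here for the integer slot.
function joinConst(a, b) {
  if (a === BOT) return b;
  if (b === BOT) return a;
  return a === b ? a : TOP;
}

// Join on finite sets, used for booleans and (as a simplification) strings.
const joinSet = (a, b) => new Set([...a, ...b]);

const bottom = () => ({ int: BOT, bool: new Set(), str: new Set(), nan: false });

// The join never mixes slots, so type information is preserved.
function join(x, y) {
  return {
    int: joinConst(x.int, y.int),
    bool: joinSet(x.bool, y.bool),
    str: joinSet(x.str, y.str),
    nan: x.nan || y.nan,
  };
}

const absBool = (b) => ({ ...bottom(), bool: new Set([b]) });
const absStr = (s) => ({ ...bottom(), str: new Set([s]) });

// Value of x after a branch assigning "42" on one side and true on the other.
const x = join(absStr("42"), absBool(true));
```

Here `x` keeps the integer and NaN slots at bottom, which is exactly the information the coalesced sum discards when it jumps to the top element.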

## 5. Abstract Semantics of ECMAScript String Operations

**Definition 2**

**Definition 3**

**Definition 4**

#### 5.1. Abstract Semantics of `substring`

`substring`. In particular, we define the operator ${\mathsf{SS}}^{\sharp}:\mathrm{D}{\mathrm{FA}}_{/\equiv}\times \mathsf{Const}\times \mathsf{Const}\to \mathrm{D}{\mathrm{FA}}_{/\equiv}$, that takes as input an automaton and two constant integer indexes $i,j\in \mathsf{Const}$, and computes the automaton recognizing the set of all substrings of the input automaton's language between the two provided integer indexes. Since the abstract semantics has to take into account the swaps when the initial index is greater than the final one, several cases arise when one of the two integer parameters is unknown, namely when it is equal to ${\top}_{\mathsf{Const}}$. Indeed, the abstract semantics ${\mathsf{SS}}^{\sharp}$ is divided into four cases, reported in Table 1. Consider $\mathtt{A}\in \mathrm{D}{\mathrm{FA}}_{/\equiv}$, $i,j\in \mathsf{Const}$ (for the sake of readability we denote by ⊔ the automata lub ${\bigsqcup}_{\mathrm{D}\mathrm{FA}}$, and by ⊓ the glb ${\sqcap}_{\mathrm{D}\mathrm{FA}}$). As in the concrete semantics of $\mathtt{substring}$, negative integer values are treated as zero.

- If $i,j\in \mathbb{Z}$ (second row, second column of Table 1) we have to compute the language of all the substrings between the initial index i and the final index j, i.e., $\mathrm{Ss}(\mathcal{L}(\mathtt{A}),i,j)$. For example, let $\mathtt{L}={\left\{a\right\}}^{*}\cup \{hello,bc\}$; the set of its substrings from 1 to 3 is $\mathrm{Ss}(\mathtt{L},1,3)=\{\epsilon ,a,aa,el,c\}$. When $i<j$, as in the example, the automaton accepting this language is computed by the operator $$\mathsf{SS}(\mathtt{A},i,j)\stackrel{\mathrm{def}}{=}(\mathsf{RQ}(\mathsf{SU}(\mathtt{A},i),\mathsf{SU}(\mathtt{A},j))\sqcap \mathsf{Min}\left({\Sigma}^{j-i}\right))\sqcup (\mathsf{SU}(\mathtt{A},i)\sqcap \mathsf{Min}\left({\Sigma}^{<j-i}\right))$$ If $i>j$, the integer arguments are simply swapped, as in Table 1.
- When both integer parameters correspond to ${\top}_{\mathsf{Const}}$, the result is the automaton of all possible factors of $\mathtt{A}$ (third row, third column), i.e., $\mathsf{FA}\left(\mathtt{A}\right)$.
- When i is defined and $j={\top}_{\mathsf{Const}}$ (second row, third column), we have to compute the automaton recognizing all the substrings of $\mathcal{L}\left(\mathtt{A}\right)$ from 0 to i and any substring starting from i. For example, let us consider ${\mathsf{SS}}^{\sharp}(\mathsf{Min}\left(\left\{helloworld\right\}\right),5,{\top}_{\mathsf{Const}})$. Due to the semantics of `substring` reported in Section 3, we need to compute the substrings from $a\in [0,5]$ to 5 and then any substring with initial index equal to 5. The automaton recognizing any substring starting at a specific index l is defined as ${\mathsf{SS}}^{\leftrightarrow}(\mathtt{A},l)\stackrel{\mathrm{def}}{=}\mathsf{FA}\left(\mathsf{SU}(\mathtt{A},l)\right)$. The abstract semantics returns the least upper bound of the automata of substrings from each a in $[0,i]$ to i and the automaton recognizing any substring with initial index equal to i.
- Similarly to the previous case, when j is defined and $i={\top}_{\mathsf{Const}}$ (third row, second column), we have to compute the automaton recognizing all the substrings of $\mathcal{L}\left(\mathtt{A}\right)$ from 0 to j and any substring starting from j. Let us consider ${\mathsf{SS}}^{\sharp}(\mathsf{Min}\left(\left\{helloworld\right\}\right),{\top}_{\mathsf{Const}},5)$. As before, we compute the substrings from $a\in [0,5]$ to 5 and then any substring with initial index equal to 5. The abstract semantics therefore returns the least upper bound of the automata of substrings from each a in $[0,j]$ to j and the automaton recognizing any substring with initial index equal to j.
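The cases above can be checked on small examples by brute-force enumeration over a finite sample of the language. The sketch below does not implement the automata operators $\mathsf{RQ}$, $\mathsf{SU}$ or $\mathsf{FA}$; it only lifts the built-in `substring` to sets of strings, and the helper names `substringAll` and `substringUnknownEnd` are hypothetical.

```javascript
// Collecting substring over a finite sample of a language (known indexes).
function substringAll(language, i, j) {
  const out = new Set();
  for (const s of language) out.add(s.substring(i, j));
  return out;
}

// Finite sample of L = {a}* ∪ {hello, bc}: a^0 .. a^4 plus the two words.
const sample = ["", "a", "aa", "aaa", "aaaa", "hello", "bc"];
const known = substringAll(sample, 1, 3);

// Unknown end index: every concrete end index from 0 to |s| is tried.
// Indexes below i trigger the swap, indexes above i extend to the right.
function substringUnknownEnd(language, i) {
  const out = new Set();
  for (const s of language) {
    for (let j = 0; j <= s.length; j++) out.add(s.substring(i, j));
  }
  return out;
}

const unknown = substringUnknownEnd(["helloworld"], 5);
console.log([...known], [...unknown]);
```

On this sample, `known` coincides with the set $\{\epsilon ,a,aa,el,c\}$ from the example, while `unknown` contains only substrings ending or starting at index 5, which is why the result can be smaller than the set of all factors of the input.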

**Theorem 2.**

#### 5.2. Abstract Semantics of `charAt`

`charAt` should return an automaton accepting the language of the characters at position i in the strings accepted by the given automaton. Since `charAt` is a particular case of `substring`, its abstract semantics, determined by ${\mathsf{CA}}^{\sharp}:\mathrm{D}{\mathrm{FA}}_{/\equiv}\times \mathsf{Const}\to \mathrm{D}{\mathrm{FA}}_{/\equiv}$, relies on the abstract semantics of `substring` previously defined. In particular,

**Theorem 3.**

#### 5.3. Abstract Semantics of `length`

`length` should return a value of the integer domain $\mathsf{Const}$ that, in a sound way, approximates the length of all the possible strings of an automaton. The abstract semantics of `length` is defined by the function ${\mathsf{LE}}^{\sharp}:\mathrm{D}{\mathrm{FA}}_{/\equiv}\to \mathsf{Const}$, computed by Algorithm 1, where $\mathsf{Paths}:\mathrm{D}{\mathrm{FA}}_{/\equiv}\to \wp \left(\wp \left(Q\right)\right)$ returns the set of the paths from the initial state to any final state of $\mathtt{A}$ [35]. Given a path $\mathsf{p}\in \mathsf{Paths}\left(\mathtt{A}\right)$, we denote by $\left|\mathsf{p}\right|$ the length of $\mathsf{p}$.

Algorithm 1: ${\mathsf{LE}}^{\sharp}:\mathrm{D}{\mathrm{FA}}_{/\equiv}\to \mathsf{Const}$ algorithm

`length` can give a precise answer only when every string of the automaton has precisely the same length. More accurate results can be obtained by using more precise integer abstract domains, e.g., intervals, as we will discuss in Section 6. For example, consider the automata $\mathtt{A}$ and ${\mathtt{A}}^{\prime}$ in Figure 8a,b, respectively. ${\mathsf{LE}}^{\sharp}\left(\mathtt{A}\right)$ precisely returns 5, since all the strings recognized by $\mathtt{A}$ have the same length, while ${\mathsf{LE}}^{\sharp}\left({\mathtt{A}}^{\prime}\right)$ returns ${\top}_{\mathsf{Const}}$.
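The behaviour just described can be mimicked on finite languages. This is a simplified sketch, not Algorithm 1: it works on an explicit set of strings instead of the paths of an automaton, and `TOP`, `BOT` and `lengthConst` are hypothetical names.

```javascript
const TOP = Symbol("top"); // stands in for the top element of Const
const BOT = Symbol("bot"); // empty language: no length at all

// Constant-domain length: precise only if every string has the same length.
function lengthConst(language) {
  const lengths = new Set([...language].map((s) => s.length));
  if (lengths.size === 0) return BOT;
  return lengths.size === 1 ? [...lengths][0] : TOP;
}

console.log(lengthConst(["hello", "world"])); // all strings share one length
console.log(lengthConst(["a", "hello"]) === TOP); // lengths differ
```

The two sample languages are illustrative and are not the automata of Figure 8.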

**Theorem 4.**

#### 5.4. Abstract Semantics of `concat`

**Theorem 5.**

#### 5.5. Abstract Semantics of `startsWith`

`startsWith` takes as input two automata and checks whether a string of the language of the first automaton starts with a string of the language of the second one. The abstract semantics of `startsWith` is captured by the function ${\mathsf{SW}}^{\sharp}:\mathrm{D}{\mathrm{FA}}_{/\equiv}\times \mathrm{D}{\mathrm{FA}}_{/\equiv}\to {\mathbb{B}}^{\sharp}$, computed by Algorithm 2, where $\mathsf{maxString}:\mathrm{D}{\mathrm{FA}}_{/\equiv}\to \mathrm{D}{\mathrm{FA}}_{/\equiv}$ returns the (minimal) automaton recognizing the longest string of the automaton given as input and $\mathsf{isSinglePath}:\mathrm{D}{\mathrm{FA}}_{/\equiv}\to \{\mathtt{true},\mathtt{false}\}$ checks whether the input automaton $\mathtt{A}=(Q,\Sigma ,\delta ,{q}_{0},F)$ respects the following condition: $\delta ={\bigcup}_{i\in [0,\left|Q\right|]}({q}_{i},{q}_{i+1},c)$. Informally, a single-path automaton is an automaton where, if we sort the strings of its language from the shortest to the longest, each string is a prefix of the next one. An example of a single-path automaton is reported in Figure 9b, where it is graphically clear that each state, excluding the initial and last one, has one incoming and one outgoing transition. Since the longest string in a single-path automaton has, as prefixes, all the other strings of the language, it is sufficient to check whether the strings of an automaton $\mathtt{A}$ start with that longest string. For example, let $\mathcal{L}\left(\mathtt{A}\right)=\left\{softer\right\}$ and $\mathcal{L}\left({\mathtt{A}}^{\prime}\right)=\{s,so,soft\}$. The string s is a prefix of $so$, which is in turn a prefix of $soft$, so ${\mathtt{A}}^{\prime}$ is a single-path automaton. Therefore, in this case, it is sufficient to check whether $softer$ starts with $soft$ (the longest string of $\mathcal{L}\left({\mathtt{A}}^{\prime}\right)$) since, ${\mathtt{A}}^{\prime}$ being single-path, the other strings (s and $so$) are consequently prefixes of $softer$. Instead, consider $\mathcal{L}\left({\mathtt{A}}^{\prime}\right)=\{s,no\}$. It would be impossible for a string to start with both of them since there is no prefix relation between them.

Algorithm 2: ${\mathsf{SW}}^{\sharp}:\mathrm{D}{\mathrm{FA}}_{/\equiv}\times \mathrm{D}{\mathrm{FA}}_{/\equiv}\to {\mathbb{B}}^{\sharp}$ algorithm

`B`, which is the (minimal) automaton that recognizes $pan$, and `C`, with $\mathcal{L}\left(\mathtt{C}\right)=\{pan,koa\}$, and compare them (line 13). We return $\left\{\mathtt{true}\right\}$ if `B` and `C` recognize the same language; otherwise we return ${\top}_{\mathsf{Bool}}$. In the other cases, as already mentioned, we return $\{\mathtt{true},\mathtt{false}\}$. For example, in Figure 9, $\{\mathtt{true},\mathtt{false}\}$ is returned because, although ${\mathtt{A}}^{\prime}$ is a single-path automaton, only the string $panda\in \mathcal{L}\left(\mathtt{A}\right)$ begins with $pan$, namely the longest string of $\mathcal{L}\left({\mathtt{A}}^{\prime}\right)$.
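The single-path idea can be sketched on finite languages. The code below follows only the prose above and is not Algorithm 2: languages are plain arrays of strings, any case not covered by the single-path check conservatively yields $\{\mathtt{true},\mathtt{false}\}$, and the names `isSinglePath` and `startsWithAbs`, as well as the sample language `{panda, koala}`, are assumptions made for illustration.

```javascript
// A finite language is single-path when, sorted by length,
// each string is a prefix of the next one.
function isSinglePath(language) {
  const sorted = [...new Set(language)].sort((a, b) => a.length - b.length);
  return sorted.every((s, k) => k === 0 || s.startsWith(sorted[k - 1]));
}

// Returns a set of booleans: {true} or {true, false}.
function startsWithAbs(langA, langP) {
  const top = new Set([true, false]);
  if (!isSinglePath(langP)) return top;
  // B: the longest string of the single-path language.
  const longest = [...langP].reduce((m, s) => (s.length > m.length ? s : m), "");
  // C: the strings of A cut at the length of that longest string.
  const prefixes = new Set([...langA].map((s) => s.slice(0, longest.length)));
  // {true} only when B and C denote the same language.
  const same = prefixes.size === 1 && prefixes.has(longest);
  return same ? new Set([true]) : top;
}

const r1 = startsWithAbs(["softer"], ["s", "so", "soft"]);
const r2 = startsWithAbs(["panda", "koala"], ["pan"]);
const r3 = startsWithAbs(["softer"], ["s", "no"]);
```

With these inputs, `r1` is the singleton containing `true`, while `r2` and `r3` contain both truth values: in `r2` the cut prefixes are `pan` and `koa`, and in `r3` the second language is not single-path.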

**Example 2.**

**Theorem 6.**

#### 5.6. Abstract Semantics of `toLowerCase`

`toLowerCase` is defined by the function ${\mathsf{LC}}^{\sharp}:\mathrm{D}{\mathrm{FA}}_{/\equiv}\to \mathrm{D}{\mathrm{FA}}_{/\equiv}$, which returns an automaton that recognizes the same strings as the input automaton, where any upper-case symbol is replaced with the corresponding lower-case symbol. ${\mathsf{LC}}^{\sharp}$ is computed by Algorithm 3.

Algorithm 3: ${\mathsf{LC}}^{\sharp}:\mathrm{D}{\mathrm{FA}}_{/\equiv}\to \mathrm{D}{\mathrm{FA}}_{/\equiv}$ algorithm

**Theorem 7.**

#### 5.7. Abstract Semantics of `includes`

`true`} is returned, since the empty string is always a substring of a non-empty automaton (lines 2–4); if no substring of $\mathtt{A}$ is contained in ${\mathtt{A}}^{\prime}$, $\left\{\mathtt{false}\right\}$ is returned (lines 5–7); and if one of the input automata is cyclic, ${\top}_{\mathsf{Bool}}$ is returned (lines 8–10). When these corner cases are excluded, we check each string recognized by $\mathtt{A}$. If the algorithm finds at least one string ${\sigma}^{\prime}$ in $\mathcal{L}\left({\mathtt{A}}^{\prime}\right)$ that is not a substring of a string $\sigma $ of $\mathtt{A}$, ${\top}_{\mathsf{Bool}}$ is returned; otherwise $\left\{\mathtt{true}\right\}$ is returned. This is done in lines 10–14 where, for each path $\mathsf{p}$ of $\mathtt{A}$, we create $\mathsf{Min}\left(\mathsf{p}\right)$ and check if its factorization with ${\mathtt{A}}^{\prime}$ equals ${\mathtt{A}}^{\prime}$, i.e., we check if it contains any string of ${\mathtt{A}}^{\prime}$.
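On finite (hence acyclic) languages, the three possible answers described above can be illustrated by checking every pair of strings. This sketch is not Algorithm 4: it ignores the cyclic case, which the text maps to ${\top}_{\mathsf{Bool}}$, it assumes both languages are non-empty, and `includesAbs` is a hypothetical name.

```javascript
// Returns a set of booleans: {true}, {false} or {true, false}.
// Assumes langA and langB are non-empty finite collections of strings.
function includesAbs(langA, langB) {
  const a = [...langA];
  const b = [...langB];
  // One entry per pair (sigma, sigma'): is sigma' a substring of sigma?
  const pairs = a.flatMap((s) => b.map((t) => s.includes(t)));
  if (pairs.every(Boolean)) return new Set([true]);
  if (!pairs.some(Boolean)) return new Set([false]);
  return new Set([true, false]);
}

const always = includesAbs(["helloworld", "hello"], ["ell"]);
const never = includesAbs(["helloworld", "hello"], ["xyz"]);
const maybe = includesAbs(["helloworld", "hello"], ["ell", "world"]);
```

In the last call the answer is imprecise because `world` occurs in `helloworld` but not in `hello`.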

Algorithm 4: ${\mathsf{IN}}^{\sharp}:\mathrm{D}{\mathrm{FA}}_{/\equiv}\times \mathrm{D}{\mathrm{FA}}_{/\equiv}\to {\mathbb{B}}^{\sharp}$ algorithm

**Theorem 8.**

#### 5.8. Abstract Semantics of `repeat`

`repeat` is defined by the function ${\mathsf{RT}}^{\sharp}:\mathrm{D}{\mathrm{FA}}_{/\equiv}\times \mathsf{Const}\to \mathrm{D}{\mathrm{FA}}_{/\equiv}$ that, given as input an automaton $\mathtt{A}$ and a constant integer value i, returns an automaton that recognizes any string of $\mathcal{L}\left(\mathtt{A}\right)$ repeated i times. ${\mathsf{RT}}^{\sharp}$ is computed by Algorithm 5, and we suppose that the abstract integer value i is positive or zero; any negative value is treated as zero. The algorithm first checks some corner cases. If $i=0$ or the input automaton only recognizes the empty string, then $\mathsf{Min}\left(\epsilon \right)$ is returned (lines 1–3). If the automaton has a cycle or $i={\top}_{\mathsf{Const}}$, it returns the Kleene-closure of the input automaton (lines 4–6). If none of these corner cases is detected then, for each string in $\mathcal{L}\left(\mathtt{A}\right)$, we concatenate it with itself $(i-1)$ times using the already defined ${\mathsf{CC}}^{\sharp}$. The result is the union of all the concatenated automata.
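The case analysis can be sketched on finite languages, where the cyclic case cannot arise. This is an illustration rather than Algorithm 5: the Kleene closure is represented symbolically by a `star` flag instead of an automaton, and `TOP` and `repeatAbs` are hypothetical names.

```javascript
const TOP = Symbol("top"); // unknown integer, standing in for the top of Const

// Returns { star: false, strings } for a finite result, or
// { star: true, strings } to denote the Kleene closure of `strings`.
function repeatAbs(language, i) {
  const strings = [...language];
  const onlyEmpty = strings.every((s) => s === "");
  // i = 0 (or negative, treated as zero) or only the empty string: {ε}.
  if (onlyEmpty || (i !== TOP && i <= 0)) {
    return { star: false, strings: new Set([""]) };
  }
  // Unknown repetition count: over-approximate with the Kleene closure.
  if (i === TOP) return { star: true, strings: new Set(strings) };
  const out = new Set();
  for (const s of strings) {
    let acc = s;
    for (let k = 1; k < i; k++) acc = acc.concat(s); // (i - 1) concatenations
    out.add(acc);
  }
  return { star: false, strings: out };
}

const three = repeatAbs(["ab", "c"], 3);
const zero = repeatAbs(["ab"], 0);
const unknownCount = repeatAbs(["ab"], TOP);
```

Each string is repeated on its own, so `three` contains `ababab` and `ccc` but no mixed word such as `abcab`.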

Algorithm 5: ${\mathsf{RT}}^{\sharp}:\mathrm{D}{\mathrm{FA}}_{/\equiv}\times \mathsf{Const}\to \mathrm{D}{\mathrm{FA}}_{/\equiv}$ algorithm

**Theorem 9.**

#### 5.9. Abstract Semantics of `trimLeft`, `trimRight` and `trim`

`trimLeft`, `trimRight` and `trim` operations. The abstract semantics of `trimLeft` is defined by the function ${\mathsf{TL}}^{\sharp}:\mathrm{D}{\mathrm{FA}}_{/\equiv}\to \mathrm{D}{\mathrm{FA}}_{/\equiv}$. In particular, it takes as input an automaton $\mathtt{A}$ and returns an automaton accepting the same strings as $\mathtt{A}$ after removing, at the beginning of each string, consecutive white spaces, if present. In the following, we denote a white-space as ␣. The function is computed by Algorithm 6. The idea of the algorithm is to iteratively replace white-space transitions from the initial state with $\epsilon $-transitions (lines 5–7), while leaving the other transitions unaltered (lines 7–9). At each iteration, the resulting automaton is minimized, and hence determinized (line 11). This operation is repeated until the initial state has no white-space transitions, checking the condition that white-space is not a prefix of the automaton (line 3). Figure 14 depicts an example of application of our algorithm.
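The iteration can be mimicked on a finite language: turning a white-space transition out of the initial state into an $\epsilon $-transition amounts to dropping one leading blank from every string that has one. The sketch below is only an analogue of Algorithm 6 on sets of strings; `trimLeftAbs` is a hypothetical name and only the blank character is considered.

```javascript
// Finite-language analogue of the iterative trimLeft construction.
function trimLeftAbs(language) {
  let current = new Set(language);
  // Repeat while white-space is still a prefix of some string.
  while ([...current].some((s) => s.startsWith(" "))) {
    const next = new Set();
    for (const s of current) {
      // Drop one leading blank; strings without one are left unaltered.
      next.add(s.startsWith(" ") ? s.slice(1) : s);
    }
    current = next; // using a Set merges duplicates, loosely like minimization
  }
  return current;
}

const trimmed = trimLeftAbs(["  ab", " ab ", "c"]);
console.log([...trimmed]);
```

The loop terminates because every round strictly decreases the total number of leading blanks.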

Algorithm 6: ${\mathsf{TL}}^{\sharp}:\mathrm{D}{\mathrm{FA}}_{/\equiv}\to \mathrm{D}{\mathrm{FA}}_{/\equiv}$ algorithm

**Theorem 10.**

`trimRight` can be defined in terms of the already defined function ${\mathsf{TL}}^{\sharp}$. Indeed, the abstract semantics ${\mathsf{TR}}^{\sharp}:\mathrm{D}{\mathrm{FA}}_{/\equiv}\to \mathrm{D}{\mathrm{FA}}_{/\equiv}$ reverses the input automaton, applies ${\mathsf{TL}}^{\sharp}$ and finally reverses again the automaton so obtained. Formally,

`trim` applies both the abstract semantics of `trimLeft` and `trimRight`. Thus, the abstract semantics of `trim` is captured by the function ${\mathsf{TM}}^{\sharp}:\mathrm{D}{\mathrm{FA}}_{/\equiv}\to \mathrm{D}{\mathrm{FA}}_{/\equiv}$ and it is defined as

**Theorem 11.**

**Proof.**

#### 5.10. Concerning Abstract Implicit Type Conversion

## 6. $\mathbf{\mu}\mathrm{F}\mathrm{ASA}$ Implementation

#### 6.1. Theoretical Concerns

#### 6.2. Implementation

`endsWith` w.r.t. `startsWith`, `slice` w.r.t. `substring`). The full list of implemented string operations is reported in Table 2, which also summarizes for which operations soundness and completeness hold and the average complexity of their algorithms (w.r.t. the constant integer abstract domain).

#### 6.3. Extension to Interval Abstract Domain

`substring`, `charAt`, `repeat` and `length`, while the other methods only use strings or booleans. Nevertheless, $\mu \mathrm{F}\mathrm{ASA}$ abstracts integer values to the more precise interval abstract domain [9], i.e., to the set $\mathsf{Intervals}$.

`substring` abstract semantics. Since intervals can be unbounded (e.g., $[5,+\infty ]$), more than 20 different cases have been identified in its abstract semantics [8]. Given $\mathtt{substr}(\mathtt{A},[a,b],[c,d])$, for some $\mathtt{A}\in \mathrm{D}{\mathrm{FA}}_{/\equiv}$, these cases include $b=+\infty $ with d a definite value, b definite with $d=+\infty $, and $b,d=+\infty $ with $a,c$ definite values and $a\le c$, to cite only a few. Moreover, the interval-based abstract semantics does not add any further important technical detail to our contribution, since the cases cited above, met with an interval-based analysis, were handled in an ad hoc manner and would have made this paper harder to follow. In particular, since the constant integer abstract domain is strictly contained in the interval one, restricting the presentation to constant integers allowed us to report only the technically meaningful cases, leaving out the others, which are related to intervals, handled in specific ways and relevant only for the implementation.

`substring`, `charAt`, `length` and `repeat`, as reported in [8]. The abstract semantics of the other string operations remains unaffected by the change. Just as an example, in the following we report the abstract semantics of $\mathtt{length}$ on intervals.

#### `length` Abstract Semantics with Intervals

`length`, reported in Section 5.3. The idea behind the algorithm capturing its abstract semantics is to check whether all strings recognized by the input automaton have the same length $l\in \mathbb{N}$. If so, l is returned as result; otherwise ${\top}_{\mathsf{Const}}$ is returned. Clearly, this is a forced choice given by the fact that the constant integer abstract domain is only able to track a single integer value. In this sense, the abstract semantics of $\mathtt{length}$ can be improved, from a precision point of view, when we deal with intervals rather than constant integers. Algorithm 7 reports the abstract semantics of $\mathtt{length}$ using the interval abstract domain. We compute the minimum and the maximum path reaching each final state in the automaton and then we abstract the set of lengths obtained so far into intervals. Problems arise when the automaton contains cycles. In that case, we return the unbounded interval from the length of the minimum path to a final state up to $+\infty $.
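The interval version can be sketched on a small explicit automaton. The code below is not Algorithm 7: the DFA encoding (`start`, `finals`, `delta` with string state names), the function name `lengthInterval`, the use of `null` for the empty language and the two sample automata are all assumptions made for illustration.

```javascript
// dfa: { start, finals: Set<string>, delta: { state: { symbol: state } } }
// Returns [min, max] (max may be Infinity), or null for the empty language.
function lengthInterval(dfa) {
  const succ = (q) => Object.values(dfa.delta[q] || {});

  // Breadth-first search from the initial state: shortest distances.
  const dist = new Map([[dfa.start, 0]]);
  const queue = [dfa.start];
  while (queue.length > 0) {
    const q = queue.shift();
    for (const p of succ(q)) {
      if (!dist.has(p)) {
        dist.set(p, dist.get(q) + 1);
        queue.push(p);
      }
    }
  }

  // Reachable states that can still reach a final state (backward fixpoint).
  const live = new Set([...dfa.finals].filter((q) => dist.has(q)));
  let changed = true;
  while (changed) {
    changed = false;
    for (const q of dist.keys()) {
      if (!live.has(q) && succ(q).some((p) => live.has(p))) {
        live.add(q);
        changed = true;
      }
    }
  }
  if (!live.has(dfa.start)) return null;

  const reachedFinals = [...dfa.finals].filter((q) => dist.has(q));
  const min = Math.min(...reachedFinals.map((q) => dist.get(q)));

  // Longest accepted length; a cycle through live states makes it unbounded.
  const memo = new Map();
  const onStack = new Set();
  function longest(q) {
    if (onStack.has(q)) return Infinity;
    if (memo.has(q)) return memo.get(q);
    onStack.add(q);
    let best = dfa.finals.has(q) ? 0 : -Infinity;
    for (const p of succ(q)) {
      if (live.has(p)) best = Math.max(best, 1 + longest(p));
    }
    onStack.delete(q);
    memo.set(q, best);
    return best;
  }
  return [min, longest(dfa.start)];
}

// Accepts {ab, abc}.
const finite = {
  start: "q0",
  finals: new Set(["q2", "q3"]),
  delta: { q0: { a: "q1" }, q1: { b: "q2" }, q2: { c: "q3" } },
};
// Accepts a, aa, aaa, ... (one or more a's).
const cyclic = {
  start: "q0",
  finals: new Set(["q1"]),
  delta: { q0: { a: "q1" }, q1: { a: "q1" } },
};
console.log(lengthInterval(finite), lengthInterval(cyclic));
```

For the acyclic automaton the result is a bounded interval, whereas the self-loop in the second automaton yields an upper bound of `Infinity`, mirroring the treatment of cycles described above.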

#### 6.4. Qualitative Evaluation of $\mu \mathrm{F}\mathrm{ASA}$

`eval`, while the second is a benevolent function taken from a real-world string manipulation program. In both cases, we will show that important string information can be obtained by $\mu \mathrm{F}\mathrm{ASA}$.

Algorithm 7: ${\mathsf{LE}}^{\sharp}:\mathrm{D}{\mathrm{FA}}_{/\equiv}\to \mathsf{Intervals}$ algorithm

#### 6.4.1. Obfuscated Malware

`d`, at the `eval` call, is the automaton ${\mathtt{A}}_{d}$ in Figure 16. The cycles are caused by the widening application in `while` computations.

`eval` string? We can also answer that by checking if ${\mathtt{A}}_{d}\sqcap \mathsf{Min}\left(\left\{eval\right\}\right)\ne \mathsf{Min}\left(\varnothing \right)$. In this case the check returns false, which supports the conclusion that no explicit call to `eval` can occur.

#### 6.4.2. String Manipulation Program

`fixStations` reported in Figure 17, taken from [38]. The function takes as input an object `stations` containing information about train stations (each item contains the three-letter station code, followed by some machine-readable data, followed by a semicolon, followed by the human-readable station name) and extracts the station code (in capital letters) and the station name. For instance, given the input $\mathtt{stations}=\{\mathtt{st1}:\mathtt{"MANay781;Manchester"},\mathtt{st2}:\mathtt{"gNfbx420;Greenfield"}\}$, the function returns the object $\{\mathtt{st1}:\mathtt{"MAN:Manchester"},\mathtt{st2}:\mathtt{"GNF:Greenfield"}\}$.
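To make the input/output behaviour concrete, a function with the described effect can be sketched as follows. This is a hypothetical reimplementation written only from the description above; it is not the code of Figure 17, and its body (in particular the use of `indexOf` and `toUpperCase`) is an assumption.

```javascript
// Hypothetical stand-in for the function of Figure 17.
function fixStations(stations) {
  const result = {};
  for (const key of Object.keys(stations)) {
    const item = stations[key];
    // Three-letter station code, normalized to capital letters.
    const code = item.substring(0, 3).toUpperCase();
    // Human-readable name: everything after the first semicolon.
    const name = item.substring(item.indexOf(";") + 1);
    result[key] = code.concat(":", name);
  }
  return result;
}

const fixed = fixStations({
  st1: "MANay781;Manchester",
  st2: "gNfbx420;Greenfield",
});
console.log(fixed);
```

Every value of the resulting object consists of three capital letters, a colon and a name, which is the kind of shape a string analysis is expected to preserve for the returned object.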

`fixStations` returns another object containing strings following the pattern of three capital letters concatenated with a colon concatenated with a string. The goal of our analyzer is to exactly preserve this information on the variable $\mathtt{result}$. Let us consider a statically unknown value of $\mathtt{stations}$, namely where $\mathtt{stations}=\{\mathtt{st1}:{\sigma}_{1},\cdots ,\mathtt{stn}:{\sigma}_{n}\}$, $n\in \mathbb{N}$ and ${\sigma}_{i}$ follows the station information pattern, for each $i\in [1,n]$. While other static analyzers, such as TAJS, which has a finite height string abstract domain, lose any information about the returned string, $\mu \mathrm{F}\mathrm{ASA}$ is able to infer, for the variable $\mathtt{result}$, the object $\{\mathtt{st1}:{\mathtt{p}}_{1},\cdots ,\mathtt{stn}:{\mathtt{p}}_{n}\}$, where each ${\mathtt{p}}_{i}$ is a string abstract value, namely a finite state automaton, following the desired pattern

## 7. Discussion and Related Work

#### 7.1. Analysis vs. Verification

#### 7.2. Main Related Works

`substring`; (2) our focus is on the characterization of a formal abstract interpretation-based framework where it is possible to prove soundness and to analyze the completeness of string operations, in order to understand where it is possible to tune precision versus efficiency. The main feature we have in common with existing works is the use of DFA (regular expressions) for abstracting strings. In [21], the authors propose a symbolic string verifier for PHP based on finite state automata represented by a particular form of binary decision diagrams, the MBDD. Even if it could be interesting to understand whether this representation of DFAs may also be used for improving our algorithms, their work only considers operations exclusively involving strings (and not also integers, such as `substring`) and therefore it provides a solution for different string manipulations. In [20], the authors propose an abstract interpretation-based string analyzer approximating strings into a subset of regular languages, called regular strings, and define the abstract semantics of four string operations of interest equipped with a widening. This is the most related work, but our approach is strictly more general, since we do not introduce any restriction on regular languages. In [19], the authors propose a scalable static analysis for jQuery that relies on a novel abstract domain of regular expressions. The abstract domain in [19] contains the finite state automata one but pursues a different task and does not provide semantics for string operations. It may certainly be interesting to integrate our library for string manipulation operators into SAFE. Finally, [42] proposes a lattice-based generalization of regular expressions, formally illustrating a parametric abstract domain of regular expressions starting from a complete lattice of reference. However, this work does not tackle the problem of analyzing string manipulations, since it instantiates the parametric abstract domain in the network communication environment, analyzing the exchanged messages as regular expressions.

`eval` patterns has been defined in [50]. In [37], the authors define the Loop-Sensitive Analysis (LSA), which distinguishes loop iterations using loop strings in the same way call strings distinguish function calls from different call sites in k-CFA [51]. The authors have implemented LSA in SAFE [5], a static analyzer for JavaScript web applications. As future work, it may be intriguing to combine LSA with our abstract semantics to reduce the false positives introduced by the widening operator during fix-point computations.

#### 7.3. Future Ideas

`eval` function, transforming strings into code. Our semantics is sound and precise enough to answer some non-trivial properties of interest. Indeed, in [7], the finite state automata domain and the corresponding abstract semantics for strings turned out to be the basis for a sound and precise enough analysis of

`eval`.

## Author Contributions

## Funding

## Conflicts of Interest

## Appendix A. Selected Proofs

**Proof of Theorem 2**.

`substring` semantics any negative value is treated as zero, in the proof, we suppose w.l.o.g. that when a negative value arises it is treated as zero.

- $\gamma(i)=\{l\}$, $\gamma(j)=\{k\}$, $l$ and $k\in\mathbb{Z}$: let us suppose, w.l.o.g., that $l<k$ (otherwise the indexes are swapped).
  $$\begin{aligned}
  \mathrm{Ss}(\mathcal{L}(\mathtt{A}),\{l\},\{k\}) &= \{\,\sigma_l\cdots\sigma_k \mid \sigma\in\mathcal{L}(\mathtt{A}),\ k<|\sigma|\,\}\cup\{\,\sigma_i\cdots\sigma_n \mid \sigma\in\mathcal{L}(\mathtt{A}),\ k\ge n=|\sigma|\,\}\\
  &= \{\,y \mid \exists z\in\Sigma^{*}.\ yz\in\mathrm{SU}(\mathcal{L}(\mathtt{A}),l),\ z\in\mathrm{SU}(\mathcal{L}(\mathtt{A}),k),\ |y|=k-l,\ k<|\sigma|\,\}\\
  &\quad\cup\{\,y \mid y\in\mathrm{SU}(\mathcal{L}(\mathtt{A}),l),\ y\in\Sigma^{\le k-l}\,\}\\
  &= (\mathrm{RQ}(\mathrm{SU}(\mathcal{L}(\mathtt{A}),l),\mathrm{SU}(\mathcal{L}(\mathtt{A}),k))\cap\Sigma^{k-l})\cup\mathrm{SU}(\mathcal{L}(\mathtt{A}),l)\cap\Sigma^{\le k-l}\\
  &= \mathcal{L}((\mathsf{RQ}(\mathsf{SU}(\mathtt{A},i),\mathsf{SU}(\mathtt{A},j))\sqcap\mathsf{Min}(\Sigma^{j-i}))\sqcup(\mathsf{SU}(\mathtt{A},i)\sqcap\mathsf{Min}(\Sigma^{<j-i})))\\
  &= \mathcal{L}(\mathsf{SS}(\mathtt{A},i,j))
  \end{aligned}$$
- $\gamma(i)=\mathbb{Z}$, $\gamma(j)=\{k\}$, with $k\in\mathbb{Z}$:
  $$\begin{aligned}
  \mathrm{Ss}(\mathcal{L}(\mathtt{A}),\mathbb{Z},\{k\}) &= \{\,\mathrm{Ss}(\sigma,l,k) \mid \sigma\in\mathcal{L}(\mathtt{A}),\ l\in\mathbb{Z}\,\}\\
  &= \{\,\mathrm{Ss}(\sigma,l,k) \mid \sigma\in\mathcal{L}(\mathtt{A}),\ 0\le l<k\,\}\cup\{\,\mathrm{Ss}(\sigma,k,l) \mid \sigma\in\mathcal{L}(\mathtt{A}),\ l\ge k\wedge l<|\sigma|\,\}\\
  &= \bigcup_{a\in[0,k]}\mathrm{Ss}(\mathcal{L}(\mathtt{A}),a,k)\cup\mathrm{FA}(\mathrm{SU}(\mathcal{L}(\mathtt{A}),l))\\
  &= \mathcal{L}\Big(\bigsqcup_{a\in[0,k]}\mathsf{SS}^{\sharp}(\mathtt{A},a,k)\sqcup_{\mathrm{DFA}}\mathsf{FA}(\mathsf{SU}(\mathtt{A},l))\Big)\\
  &= \mathcal{L}\Big(\bigsqcup_{a\in[0,k]}\mathsf{SS}(\mathtt{A},a,k)\sqcup_{\mathrm{DFA}}\mathsf{SS}^{\leftrightarrow}(\mathtt{A},l)\Big)\\
  &= \mathcal{L}(\mathsf{SS}^{\sharp}(\mathtt{A},i,j))
  \end{aligned}$$
- $\gamma(i)=l\in\mathbb{Z}$, $\gamma(j)=\mathbb{Z}$:
  $$\begin{aligned}
  \mathrm{Ss}(\mathcal{L}(\mathtt{A}),l,\mathbb{Z}) &= \{\,\mathrm{Ss}(\sigma,l,k) \mid \sigma\in\mathcal{L}(\mathtt{A}),\ k\in\mathbb{Z}\,\}\\
  &= \{\,\mathrm{Ss}(\sigma,l,k) \mid \sigma\in\mathcal{L}(\mathtt{A}),\ k\ge l\wedge k\le|\sigma|\,\}\cup\{\,\mathrm{Ss}(\sigma,k,l) \mid \sigma\in\mathcal{L}(\mathtt{A}),\ 0\le k<l\,\}\\
  &= \bigcup_{a\in[0,l]}\mathrm{Ss}(\mathcal{L}(\mathtt{A}),a,l)\cup\mathrm{FA}(\mathrm{SU}(\mathcal{L}(\mathtt{A}),l))\\
  &= \mathcal{L}\Big(\bigsqcup_{a\in[0,l]}\mathsf{SS}^{\sharp}(\mathtt{A},a,l)\sqcup_{\mathrm{DFA}}\mathsf{FA}(\mathsf{SU}(\mathtt{A},l))\Big)\\
  &= \mathcal{L}\Big(\bigsqcup_{a\in[0,l]}\mathsf{SS}(\mathtt{A},a,k)\sqcup_{\mathrm{DFA}}\mathsf{SS}^{\leftrightarrow}(\mathtt{A},l)\Big)\\
  &= \mathcal{L}(\mathsf{SS}^{\sharp}(\mathtt{A},i,j))
  \end{aligned}$$
- $\gamma(i)=\gamma(j)=\mathbb{Z}$:
  $$\begin{aligned}
  \mathrm{Ss}(\mathcal{L}(\mathtt{A}),\mathbb{Z},\mathbb{Z}) &= \{\,\mathrm{Ss}(\sigma,l,k) \mid \sigma\in\mathcal{L}(\mathtt{A}),\ l,k\in\mathbb{Z}\,\}\\
  &= \{\,\mathrm{Ss}(\sigma,l,k) \mid \sigma\in\mathcal{L}(\mathtt{A}),\ l,k\ge 0,\ l,k<|\sigma|\,\}\\
  &= \mathrm{FA}(\mathcal{L}(\mathtt{A}))=\mathsf{FA}(\mathtt{A})\\
  &= \mathcal{L}(\mathsf{SS}^{\sharp}(\mathtt{A},i,j))
  \end{aligned}$$
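The concrete side of the case analysis above can be tried out on finite sets of strings. The following is a minimal sketch, not the paper's implementation: `ss`, `ssAll`, `factors` and `range` are hypothetical helper names, languages are plain arrays of strings rather than automata, and the end index is taken as exclusive, as in ECMAScript's `substring`.

```javascript
// Concrete substring on one string, following the conventions used in the proof:
// negative indexes count as zero, the two indexes are swapped when l > k,
// and anything past the end of sigma is cut off. The end index is exclusive,
// as in ECMAScript's String.prototype.substring.
function ss(sigma, l, k) {
  let lo = Math.min(Math.max(0, l), sigma.length);
  let hi = Math.min(Math.max(0, k), sigma.length);
  if (lo > hi) [lo, hi] = [hi, lo];
  let out = "";
  for (let n = lo; n < hi; n++) out += sigma[n];
  return out;
}

// Collecting semantics over a finite language and finite sets of indexes.
function ssAll(language, ls, ks) {
  const result = new Set();
  for (const sigma of language)
    for (const l of ls)
      for (const k of ks) result.add(ss(sigma, l, k));
  return result;
}

// All factors (contiguous substrings) of sigma, including the empty string.
function factors(sigma) {
  const result = new Set([""]);
  for (let a = 0; a < sigma.length; a++)
    for (let b = a + 1; b <= sigma.length; b++) result.add(sigma.substring(a, b));
  return result;
}

// Hypothetical helper: the integers from lo to hi, both included.
function range(lo, hi) {
  const out = [];
  for (let n = lo; n <= hi; n++) out.push(n);
  return out;
}
```

With these definitions, `ssAll(["abc"], range(-1, 4), range(-1, 4))` produces the same set as `factors("abc")`, which mirrors the last case of the proof, where both indexes are unknown and the result is the set of factors.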

**Proof of Theorem 3**.

- Let us suppose that $i\ne\top_{\mathsf{Const}}$, hence $\gamma(i)=\{n\}$, where $n\in\mathbb{Z}$.
  $$\begin{aligned}
  \mathrm{CA}(\mathcal{L}(\mathtt{A}),\{n\}) &= \{\,\mathrm{CA}(\sigma,n) \mid \sigma\in\mathcal{L}(\mathtt{A})\,\}\\
  &= \{\,\sigma_n \mid \sigma\in\mathcal{L}(\mathtt{A}),\ 0\le n<|\sigma|\,\}\\
  &\quad\cup\{\,\epsilon \mid \exists\sigma\in\mathcal{L}(\mathtt{A}).\ n\ge|\sigma|\vee n<0\,\}\\
  &= \{\,\mathrm{Ss}(\sigma,n,n+1) \mid \sigma\in\mathcal{L}(\mathtt{A}),\ 0\le n<|\sigma|\,\}\\
  &\quad\cup\{\,\mathrm{Ss}(\sigma,n,n+1) \mid \exists\sigma\in\mathcal{L}(\mathtt{A}).\ n\ge|\sigma|\vee n<0\,\}\\
  &= \{\,\mathrm{Ss}(\sigma,\{n\},\{n+1\}) \mid \sigma\in\mathcal{L}(\mathtt{A})\,\}=\mathrm{Ss}(\mathcal{L}(\mathtt{A}),n,n+1)\\
  &= \mathcal{L}(\mathsf{SS}^{\sharp}(\mathtt{A},i,i+1))\\
  &= \mathcal{L}(\mathsf{CA}^{\sharp}(\mathtt{A},i))
  \end{aligned}$$
- Let us suppose that $i=\top_{\mathsf{Const}}$, hence $\gamma(i)=\mathbb{Z}$. It is worth noting that the function $\mathtt{chars}$ we used in the abstract semantics of `charAt` is complete. Let $\mathrm{CHARS}:\wp(\Sigma^{*})\to\wp(\Sigma)$ be the function that given a set of strings returns the set of characters inside any string of the input string set. It holds that $\mathrm{CHARS}(\mathcal{L}(\mathtt{A}))=\mathtt{chars}(\mathtt{A})$.
  $$\begin{aligned}
  \mathrm{CA}(\mathcal{L}(\mathtt{A}),\gamma(i)) &= \{\,\mathrm{CA}(\sigma,n) \mid \sigma\in\mathcal{L}(\mathtt{A}),\ n\in[0,|\sigma|-1]\,\}\cup\{\epsilon\}\\
  &= \{\,\sigma_n \mid \sigma\in\mathcal{L}(\mathtt{A}),\ n\in[0,|\sigma|-1]\,\}\cup\{\epsilon\}\\
  &= \mathrm{CHARS}(\mathcal{L}(\mathtt{A}))\cup\{\epsilon\}\\
  &= \mathcal{L}(\mathsf{Min}(\mathtt{chars}(\mathtt{A}))\sqcup\mathsf{Min}(\{\epsilon\}))\\
  &= \mathcal{L}(\mathsf{CA}^{\sharp}(\mathtt{A},i))
  \end{aligned}$$
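Both cases of this proof can be checked on small finite languages. The sketch below is only illustrative: `ca`, `charsOf` and `caUnknownIndex` are hypothetical names, and the unknown index is simulated by trying every in-bounds index plus one out-of-bounds index on each side.

```javascript
// CA(sigma, n) expressed through substring, as in the first case of the proof.
// Out-of-bounds and negative indexes yield the empty string.
function ca(sigma, n) {
  return sigma.substring(n, n + 1);
}

// CHARS(language): every character occurring in some string of the language.
function charsOf(language) {
  const result = new Set();
  for (const sigma of language)
    for (let n = 0; n < sigma.length; n++) result.add(sigma[n]);
  return result;
}

// charAt with an unknown index, simulated on a finite language:
// n ranges over -1 .. |sigma|, so both out-of-bounds sides are covered.
function caUnknownIndex(language) {
  const result = new Set();
  for (const sigma of language)
    for (let n = -1; n <= sigma.length; n++) result.add(ca(sigma, n));
  return result;
}
```

On `["ab", "cb"]`, `caUnknownIndex` returns the characters `a`, `b`, `c` together with the empty string, that is, `charsOf(["ab", "cb"])` plus the empty string, matching the second case.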

**Proof of Theorem 4**.

- $\mathrm{LE}(\mathcal{L}(\mathtt{A}))=I\in\wp(\mathbb{Z})$, s.t. $|I|=1$:
  $$\begin{aligned}
  |\mathrm{LE}(\mathcal{L}(\mathtt{A}))|=1 &\iff \mathrm{LE}(\mathcal{L}(\mathtt{A}))=\{n\}\ \text{for some } n\in\mathbb{N}\\
  &\iff \forall\sigma\in\mathcal{L}(\mathtt{A}).\ |\sigma|=n\\
  &\iff \forall\mathsf{p}\in\mathsf{Paths}(\mathtt{A}).\ |\mathsf{p}|=n
  \end{aligned}$$
  This condition checks whether the size of any path of $\mathtt{A}$ is $n$. This check is performed by Algorithm 1 at lines 5–8.
- $\mathrm{LE}(\mathcal{L}(\mathtt{A}))=I\in\wp(\mathbb{Z})$, s.t. $|I|>1$: this means that
  $$\begin{aligned}
  |\mathrm{LE}(\mathcal{L}(\mathtt{A}))|>1 &\iff \exists\sigma,\sigma'\in\mathcal{L}(\mathtt{A}).\ |\sigma|\ne|\sigma'|\\
  &\iff \exists\mathsf{p},\mathsf{p}'\in\mathsf{Paths}(\mathtt{A}).\ |\mathsf{p}|\ne|\mathsf{p}'|
  \end{aligned}$$
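The path-length condition can be illustrated on a toy automaton. This is a sketch under stated assumptions, not Algorithm 1 itself: the automaton encoding (`initial`, `finals`, `delta`) and the function `pathLengths` are hypothetical, and the traversal only terminates on acyclic automata.

```javascript
// Collect the lengths of all accepting paths of an acyclic automaton.
// Assumed encoding: dfa.initial is the start state, dfa.finals lists the
// accepting states, dfa.delta maps a state to { symbol: nextState }.
function pathLengths(dfa) {
  const lengths = new Set();
  const visit = (state, depth) => {
    if (dfa.finals.includes(state)) lengths.add(depth);
    for (const next of Object.values(dfa.delta[state] || {})) visit(next, depth + 1);
  };
  visit(dfa.initial, 0);
  return lengths;
}

// Recognizes {"ab", "cd"}: every accepting path has length 2.
const sameLength = {
  initial: 0,
  finals: [2],
  delta: { 0: { a: 1, c: 3 }, 1: { b: 2 }, 3: { d: 2 } },
};

// Recognizes {"a", "bc"}: accepting paths of length 1 and 2.
const mixedLength = {
  initial: 0,
  finals: [1, 2],
  delta: { 0: { a: 1, b: 3 }, 3: { c: 2 } },
};
```

A singleton result from `pathLengths` corresponds to the first case, where the length is a single constant; a result with more than one element corresponds to the second case.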

**Proof of Theorem 6**.

- Let us suppose that $\mathrm{SW}(\mathcal{L}(\mathtt{A}),\mathcal{L}(\mathtt{A}'))=\{\mathtt{false}\}$.
  $$\begin{aligned}
  \mathrm{SW}(\mathcal{L}(\mathtt{A}),\mathcal{L}(\mathtt{A}'))=\{\mathtt{false}\} &\iff \forall\sigma\in\mathcal{L}(\mathtt{A}).\ \forall\sigma'\in\mathcal{L}(\mathtt{A}').\ \nexists\varphi\in\Sigma^{*}.\ \sigma'\cdot\varphi=\sigma\\
  &\iff \mathrm{PR}(\mathcal{L}(\mathtt{A}))\cap\mathcal{L}(\mathtt{A}')=\varnothing\\
  &\iff \mathsf{PR}(\mathtt{A})\sqcap_{\mathrm{DFA}}\mathtt{A}'=\mathsf{Min}(\varnothing)\quad(\text{lines 4–6 of Algorithm 2})
  \end{aligned}$$
- Let us suppose that $\mathrm{SW}(\mathcal{L}(\mathtt{A}),\mathcal{L}(\mathtt{A}'))=\{\mathtt{true}\}$. We split the proof in the following cases:
  - if $\mathtt{A}'=\mathsf{Min}(\{\epsilon\})$: Algorithm 2 verifies the condition ($\mathtt{A}'==\mathsf{Min}(\{\epsilon\})$) at lines 1–3 and returns $\{\mathtt{true}\}$.
  - if $\mathtt{A}$ or $\mathtt{A}'$ are cyclic: Algorithm 2 verifies the condition ($\mathsf{hasCycle}(\mathtt{A})\vee\mathsf{hasCycle}(\mathtt{A}')$) at lines 7–9 and returns $\{\mathtt{true},\mathtt{false}\}$.
  - if $\mathtt{A}'$ is not a single-path automaton: in this case, we check if $\mathtt{A}'$ is not a single-path automaton at line 10 of Algorithm 2 and, if so, $\{\mathtt{true},\mathtt{false}\}$ is returned at line 17.
  - if $\mathtt{A}'$ is a single-path automaton: let us denote by $\mathrm{MAXSTRING}(\mathcal{L}(\mathtt{A}'))$ the longest string in $\mathcal{L}(\mathtt{A}')$. As we already highlighted, if $\mathtt{A}'$ is single path, the longest string is unique. Clearly, we have that $\mathrm{MAXSTRING}(\mathcal{L}(\mathtt{A}'))=\mathsf{maxString}(\mathtt{A}')$. Let us denote $\mathrm{MAXSTRING}(\mathcal{L}(\mathtt{A}'))$ by $\sigma^{m}$.
    $$\begin{aligned}
    \mathrm{SW}(\mathcal{L}(\mathtt{A}),\mathcal{L}(\mathtt{A}'))=\{\mathtt{true}\} &\iff \forall\sigma\in\mathcal{L}(\mathtt{A}),\ \forall\sigma'\in\mathcal{L}(\mathtt{A}').\ \exists\varphi\in\Sigma^{*}.\ \sigma'\cdot\varphi=\sigma\\
    &\Rightarrow \forall\sigma\in\mathcal{L}(\mathtt{A})\ \exists\varphi\in\Sigma^{*}.\ \sigma=\sigma^{m}\cdot\varphi\\
    &\Rightarrow \mathrm{Ss}(\mathcal{L}(\mathtt{A}),0,|\sigma^{m}|)==\sigma^{m}\\
    &\iff \mathsf{SS}^{\sharp}(\mathtt{A},0,\mathsf{LE}^{\sharp}(\mathsf{maxString}(\mathtt{A}')))==\mathsf{maxString}(\mathtt{A}')\\
    &\qquad(\text{lines 10–15 of Algorithm 2})
    \end{aligned}$$

- Let us suppose that $\mathrm{SW}(\mathcal{L}(\mathtt{A}),\mathcal{L}(\mathtt{A}'))=\{\mathtt{true},\mathtt{false}\}$. We split the proof in the following cases:
  - if $\mathtt{A}$ or $\mathtt{A}'$ are cyclic: Algorithm 2 verifies the condition ($\mathsf{hasCycle}(\mathtt{A})\vee\mathsf{hasCycle}(\mathtt{A}')$) at lines 7–9 and returns $\{\mathtt{true},\mathtt{false}\}$.
  - if $\mathtt{A}'$ is not a single-path automaton: the check at line 10 of Algorithm 2 fails and $\{\mathtt{true},\mathtt{false}\}$ is returned at line 17.
  - if $\mathtt{A}'$ is a single-path automaton: as before, if $\mathtt{A}'$ is single path, the longest string is unique. Let us denote $\mathrm{MAXSTRING}(\mathcal{L}(\mathtt{A}'))$ by $\sigma^{m}$.
    $$\begin{aligned}
    \mathrm{SW}(\mathcal{L}(\mathtt{A}),\mathcal{L}(\mathtt{A}'))=\{\mathtt{true},\mathtt{false}\} &\Rightarrow \exists\sigma\in\mathcal{L}(\mathtt{A}),\ \exists\sigma'\in\mathcal{L}(\mathtt{A}')\ \forall\varphi\in\Sigma^{*}.\ \sigma'\cdot\varphi\ne\sigma\\
    &\Rightarrow \forall\sigma\in\mathcal{L}(\mathtt{A})\ \exists\varphi\in\Sigma^{*}.\ \sigma^{m}\cdot\varphi\ne\sigma\\
    &\Rightarrow \mathrm{Ss}(\mathcal{L}(\mathtt{A}),0,|\sigma^{m}|)\ne\sigma^{m}\\
    &\iff \mathsf{SS}^{\sharp}(\mathtt{A},0,\mathsf{LE}^{\sharp}(\mathsf{maxString}(\mathtt{A}')))\ne\mathsf{maxString}(\mathtt{A}')\\
    &\qquad(\text{lines 10–15 of Algorithm 2})
    \end{aligned}$$
    The condition is checked at line 13 of Algorithm 2; it fails, hence $\{\mathtt{true},\mathtt{false}\}$ is returned at line 17.
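The three possible outcomes of the collecting `startsWith` semantics, and the longest-string check used for a single-path right-hand side, can be sketched on finite languages. The names `sw` and `allStartWithLongest` are hypothetical, and the second function assumes its prefix set is single path, that is, every shorter string is a prefix of the longest one.

```javascript
// Collecting semantics of startsWith: the set of booleans obtained by
// pairing every string of the language with every candidate prefix.
function sw(language, prefixes) {
  const result = new Set();
  for (const sigma of language)
    for (const prefix of prefixes) result.add(sigma.startsWith(prefix));
  return result;
}

// Check in the spirit of the single-path case: take the longest candidate
// prefix and compare it with the first |longest| characters of every string.
// Assumes the prefixes form a chain (a "single path").
function allStartWithLongest(language, prefixes) {
  const longest = [...prefixes].reduce((a, b) => (b.length > a.length ? b : a));
  return [...language].every(
    (sigma) => sigma.substring(0, longest.length) === longest
  );
}
```

For example, `sw(["abc", "abd"], ["a", "ab"])` contains only `true` and `allStartWithLongest` agrees, while `sw(["abc", "xbc"], ["a"])` contains both booleans and the longest-string check fails.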

**Proof of Theorem 7**.

**Proof of Theorem 8**.

- Let us suppose that $\mathrm{IN}(\mathcal{L}(\mathtt{A}),\mathcal{L}(\mathtt{A}'))=\{\mathtt{false}\}$.
  $$\begin{aligned}
  \mathrm{IN}(\mathcal{L}(\mathtt{A}),\mathcal{L}(\mathtt{A}'))=\{\mathtt{false}\} &\iff \forall\sigma\in\mathcal{L}(\mathtt{A}').\ \sigma\notin\mathrm{FA}(\mathtt{A})\\
  &\iff \mathcal{L}(\mathtt{A}')\cap\mathrm{FA}(\mathcal{L}(\mathtt{A}))=\varnothing\\
  &\iff \mathtt{A}'\sqcap_{\mathrm{DFA}}\mathsf{FA}(\mathtt{A})=\mathsf{Min}(\varnothing)\\
  &\qquad(\text{lines 4–6 of Algorithm 4})
  \end{aligned}$$
- Let us suppose that $\mathrm{IN}(\mathcal{L}(\mathtt{A}),\mathcal{L}(\mathtt{A}'))=\{\mathtt{true}\}$. Thus, consider the following cases:
  - $\mathtt{A}'=\mathsf{Min}(\{\epsilon\})$: Algorithm 4 verifies the condition ($\mathtt{A}'==\mathsf{Min}(\{\epsilon\})$) at lines 1–3 and returns $\{\mathtt{true}\}$.
  - $\mathtt{A}$ or $\mathtt{A}'$ are cyclic: Algorithm 4 verifies the condition ($\mathsf{hasCycle}(\mathtt{A})\vee\mathsf{hasCycle}(\mathtt{A}')$) at lines 7–9 and returns $\{\mathtt{true},\mathtt{false}\}$.
  - $\mathtt{A}'\ne\mathsf{Min}(\{\epsilon\})$ and $\mathtt{A},\mathtt{A}'$ are not cyclic:
    $$\begin{aligned}
    \mathrm{IN}(\mathcal{L}(\mathtt{A}),\mathcal{L}(\mathtt{A}'))=\{\mathtt{true}\} &\iff \forall\sigma'\in\mathcal{L}(\mathtt{A}').\ \forall\sigma\in\mathcal{L}(\mathtt{A})\ \exists\varphi,\psi\in\Sigma^{*}.\ \varphi\cdot\sigma'\cdot\psi=\sigma\\
    &\Rightarrow \forall\sigma\in\mathcal{L}(\mathtt{A}).\ \mathrm{FA}(\{\sigma\})\cap\mathcal{L}(\mathtt{A}')=\mathcal{L}(\mathtt{A}')\\
    &\Rightarrow \forall\mathsf{p}\in\mathsf{Paths}(\mathtt{A}).\ \mathsf{FA}(\mathsf{Min}(\mathsf{p}))\sqcap_{\mathrm{DFA}}\mathtt{A}'=\mathtt{A}'
    \end{aligned}$$
    This condition is verified in lines 11–15 of Algorithm 4 and in this case the algorithm returns $\{\mathtt{true}\}$.

- Let us suppose that $\mathrm{IN}(\mathcal{L}(\mathtt{A}),\mathcal{L}(\mathtt{A}'))=\{\mathtt{true},\mathtt{false}\}$. Thus, consider the following cases:
  - $\mathtt{A}$ or $\mathtt{A}'$ are cyclic: Algorithm 4 verifies the condition ($\mathsf{hasCycle}(\mathtt{A})\vee\mathsf{hasCycle}(\mathtt{A}')$) at lines 7–9 and returns $\{\mathtt{true},\mathtt{false}\}$.
  - $\mathtt{A}'\ne\mathsf{Min}(\{\epsilon\})$ and $\mathtt{A},\mathtt{A}'$ are not cyclic:
    $$\begin{aligned}
    \mathrm{IN}(\mathcal{L}(\mathtt{A}),\mathcal{L}(\mathtt{A}'))=\{\mathtt{true},\mathtt{false}\} &\Rightarrow \exists\sigma'\in\mathcal{L}(\mathtt{A}')\ \exists\sigma\in\mathcal{L}(\mathtt{A}).\ \nexists\varphi,\psi\in\Sigma^{*}.\ \varphi\cdot\sigma'\cdot\psi=\sigma\\
    &\Rightarrow \exists\sigma\in\mathcal{L}(\mathtt{A}).\ \mathrm{FA}(\{\sigma\})\cap\mathcal{L}(\mathtt{A}')\ne\mathcal{L}(\mathtt{A}')\\
    &\Rightarrow \exists\mathsf{p}\in\mathsf{Paths}(\mathtt{A}).\ \mathsf{FA}(\mathsf{Min}(\mathsf{p}))\sqcap_{\mathrm{DFA}}\mathtt{A}'\ne\mathtt{A}'
    \end{aligned}$$
    This condition is verified in lines 11–15 of Algorithm 4 and in this case the algorithm returns $\{\mathtt{true},\mathtt{false}\}$.
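The role of the factor set in the `includes` proof can be seen on finite languages. In this sketch, `includesAll`, `factorsOfLanguage` and `needlesDisjointFromFactors` are hypothetical names; the last one plays the part of the emptiness test between $\mathcal{L}(\mathtt{A}')$ and the factors of $\mathcal{L}(\mathtt{A})$.

```javascript
// Collecting semantics of includes over finite sets of strings.
function includesAll(language, needles) {
  const result = new Set();
  for (const sigma of language)
    for (const needle of needles) result.add(sigma.includes(needle));
  return result;
}

// All factors (contiguous substrings) of every string of the language.
function factorsOfLanguage(language) {
  const result = new Set([""]);
  for (const sigma of language)
    for (let a = 0; a < sigma.length; a++)
      for (let b = a + 1; b <= sigma.length; b++) result.add(sigma.substring(a, b));
  return result;
}

// True when no needle is a factor of any string of the language,
// i.e., the intersection between the needles and the factors is empty.
function needlesDisjointFromFactors(language, needles) {
  const fa = factorsOfLanguage(language);
  return [...needles].every((needle) => !fa.has(needle));
}
```

On these finite inputs, `includesAll` returns only `false` exactly when `needlesDisjointFromFactors` holds, which is the equivalence used in the first case of the proof.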

**Proof of Theorem 9**.

- Let us suppose that $i\ne\top_{\mathsf{Const}}$, hence $\gamma(i)=n$, where $n\in\mathbb{Z}$. We split the proof in the following cases:
  - $i=0$: $\mathrm{RT}(\mathcal{L}(\mathtt{A}),0)=\{\epsilon\}$ and Algorithm 5 checks this condition and returns $\mathsf{Min}(\{\epsilon\})$ at lines 1–3.
  - $i\ne 0$:
    - if $\mathtt{A}$ is s.t. $\mathcal{L}(\mathtt{A})=\{\epsilon\}$: since $\mathrm{RT}(\{\epsilon\},i)=\{\epsilon\}$, Algorithm 5 checks this condition and returns $\mathsf{Min}(\{\epsilon\})$ at lines 1–3.
    - if $\mathtt{A}$ is cyclic: $\mathrm{RT}(\mathcal{L}(\mathtt{A}),i)\subseteq\mathcal{L}(\mathsf{Kleene}(\mathtt{A}))$ and Algorithm 5 checks this condition and returns $\mathsf{Kleene}(\mathtt{A})$ at lines 4–6.
    - if $\mathtt{A}$ is not cyclic:
      $$\begin{aligned}
      \mathrm{RT}(\mathcal{L}(\mathtt{A}),i) &= \{\,\sigma^{i} \mid \sigma\in\mathcal{L}(\mathtt{A})\,\}\\
      &= \{\,\overbrace{\sigma\cdot\sigma\cdots\sigma}^{i\text{-times}} \mid \sigma\in\mathcal{L}(\mathtt{A})\,\}\\
      &= \mathcal{L}\big(\{\,\overbrace{\mathsf{Min}(\mathsf{p})\cdot\mathsf{Min}(\mathsf{p})\cdots\mathsf{Min}(\mathsf{p})}^{i\text{-times}} \mid \mathsf{p}\in\mathsf{Paths}(\mathtt{A})\,\}\big)
      \end{aligned}$$
      In this case, Algorithm 5 returns the above automaton at lines 8–15.

- Let us suppose that $i=\top_{\mathsf{Const}}$, hence we have that $\gamma(i)=\mathbb{Z}$.
  $$\mathrm{RT}(\mathcal{L}(\mathtt{A}),\gamma(i))=\mathrm{RT}(\mathcal{L}(\mathtt{A}),\mathbb{Z})=\{\,\sigma^{n} \mid \sigma\in\mathcal{L}(\mathtt{A}),\ n>0\,\}\subseteq\mathcal{L}(\mathsf{Kleene}(\mathtt{A}))$$
  In this case, Algorithm 5 returns $\mathsf{Kleene}(\mathtt{A})$, guaranteeing the soundness of $\mathsf{RT}^{\sharp}$.
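The inclusion of the repeated strings in the Kleene closure can be checked on a finite language. This is a sketch under two assumptions: $\mathsf{Kleene}(\mathtt{A})$ is read as the Kleene star of $\mathcal{L}(\mathtt{A})$, and only non-negative repetition counts are tried (JavaScript's `repeat` rejects negative counts). The names `rt` and `inKleene` are hypothetical.

```javascript
// RT(language, i): every string of the language repeated i times (i >= 0).
function rt(language, i) {
  return new Set([...language].map((sigma) => sigma.repeat(i)));
}

// Membership in the Kleene star of a finite language, by dynamic programming:
// ok[end] is true when word[0..end) splits into pieces taken from the language.
function inKleene(word, language) {
  const ok = new Array(word.length + 1).fill(false);
  ok[0] = true; // the empty word always belongs to the star
  for (let end = 1; end <= word.length; end++)
    for (const piece of language)
      if (
        piece.length > 0 &&
        end >= piece.length &&
        ok[end - piece.length] &&
        word.endsWith(piece, end)
      )
        ok[end] = true;
  return ok[word.length];
}
```

For `["ab", "c"]`, every word in `rt(["ab", "c"], i)` with `i` from 0 to 4 passes `inKleene`, while a word such as `"abx"` does not; this is the over-approximation that the cyclic and unknown-count cases rely on.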

**Proof of Theorem 10**.

## References

- Pradel, M.; Sen, K. The Good, the Bad, and the Ugly: An Empirical Study of Implicit Type Conversions in JavaScript. In Proceedings of the 29th European Conference on Object-Oriented Programming, ECOOP 2015, Prague, Czech Republic, 5–10 July 2015; Boyland, J.T., Ed.; LIPIcs. Schloss Dagstuhl-Leibniz-Zentrum für Informatik: Wadern, Germany, 2015; Volume 37, pp. 519–541.
- Xu, W.; Zhang, F.; Zhu, S. The power of obfuscation techniques in malicious JavaScript code: A measurement study. In Proceedings of the 7th International Conference on Malicious and Unwanted Software, MALWARE 2012, Fajardo, PR, USA, 16–18 October 2012; IEEE Computer Society: Washington, DC, USA, 2012; pp. 9–16.
- Jensen, S.H.; Møller, A.; Thiemann, P. Type Analysis for JavaScript. In Proceedings of the 16th International Symposium on Static Analysis, SAS 2009, Los Angeles, CA, USA, 9–11 August 2009; Palsberg, J., Su, Z., Eds.; Lecture Notes in Computer Science. Springer: Berlin, Germany, 2009; Volume 5673, pp. 238–255.
- Kashyap, V.; Dewey, K.; Kuefner, E.A.; Wagner, J.; Gibbons, K.; Sarracino, J.; Wiedermann, B.; Hardekopf, B. JSAI: A static analysis platform for JavaScript. In Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering, (FSE-22), Hong Kong, China, 16–22 November 2014; Cheung, S., Orso, A., Storey, M.D., Eds.; ACM: New York, NY, USA, 2014; pp. 121–132.
- Lee, H.; Won, S.; Jin, J.; Cho, J.; Ryu, S. SAFE: Formal specification and implementation of a scalable analysis framework for ECMAScript. In Proceedings of the 19th International Workshop on Foundations of Object-Oriented Languages (FOOL’12), Tucson, AZ, USA, 19–26 October 2012.
- Hauzar, D.; Kofron, J. Framework for Static Analysis of PHP Applications. In Proceedings of the 29th European Conference on Object-Oriented Programming, ECOOP 2015, Prague, Czech Republic, 5–10 July 2015; Boyland, J.T., Ed.; LIPIcs. Schloss Dagstuhl-Leibniz-Zentrum für Informatik: Wadern, Germany, 2015; Volume 37, pp. 689–711.
- Arceri, V.; Mastroeni, I. A sound abstract interpreter for dynamic code. In Proceedings of the SAC ’20: The 35th ACM/SIGAPP Symposium on Applied Computing, Brno, Czech Republic, 30 March–3 April 2020; Hung, C., Cerný, T., Shin, D., Bechini, A., Eds.; ACM: New York, NY, USA, 2020; pp. 1979–1988.
- Arceri, V.; Mastroeni, I. Static Program Analysis for String Manipulation Languages. In Proceedings of the Seventh International Workshop on Verification and Program Transformation, VPT@Programming 2019, Genova, Italy, 2 April 2019; Volume 299, pp. 19–33.
- Cousot, P.; Cousot, R. Abstract Interpretation: A Unified Lattice Model for Static Analysis of Programs by Construction or Approximation of Fixpoints. In Proceedings of the Conference Record of the Fourth ACM Symposium on Principles of Programming Languages, Los Angeles, CA, USA, 17–19 January 1977; Graham, R.M., Harrison, M.A., Sethi, R., Eds.; ACM: New York, NY, USA, 1977; pp. 238–252.
- ECMA. Standard ECMA-262 Language Specification, 9th ed. Available online: https://www.ecma-international.org/publications/files/ECMA-ST/Ecma-262.pdf (accessed on 6 December 2018).
- Hopcroft, J.E.; Ullman, J.D. Introduction to Automata Theory, Languages and Computation; Addison-Wesley: Reading, MA, USA, 1979.
- Davis, M.D.; Sigal, R.; Weyuker, E.J. Computability, Complexity, and Languages: Fundamentals of Theoretical Computer Science; Academic Press Professional, Inc.: Cambridge, MA, USA, 1994.
- Cousot, P.; Cousot, R. Systematic Design of Program Analysis Frameworks. In Proceedings of the Conference Record of the Sixth Annual ACM Symposium on Principles of Programming Languages, San Antonio, TX, USA, 29–31 January 1979; Aho, A.V., Zilles, S.N., Rosen, B.K., Eds.; ACM Press: New York, NY, USA, 1979; pp. 269–282.
- Cousot, P.; Cousot, R. Abstract Interpretation Frameworks. J. Log. Comput. **1992**, 2, 511–547.
- Giacobazzi, R.; Quintarelli, E. Incompleteness, Counterexamples, and Refinements in Abstract Model-Checking. In Proceedings of the Static Analysis, 8th International Symposium, SAS 2001, Paris, France, 16–18 July 2001; Cousot, P., Ed.; Lecture Notes in Computer Science. Springer: Berlin, Germany, 2001; Volume 2126, pp. 356–373.
- Giacobazzi, R.; Mastroeni, I. Making abstract models complete. Math. Struct. Comput. Sci. **2016**, 26, 658–701.
- Giacobazzi, R.; Mastroeni, I. Transforming Abstract Interpretations by Abstract Interpretation. In Proceedings of the Static Analysis, 15th International Symposium, SAS 2008, Valencia, Spain, 16–18 July 2008; Alpuente, M., Vidal, G., Eds.; Lecture Notes in Computer Science. Springer: Berlin, Germany, 2008; Volume 5079, pp. 1–17.
- Arceri, V.; Maffeis, S. Abstract Domains for Type Juggling. Electron. Notes Theor. Comput. Sci. **2017**, 331, 41–55.
- Park, C.; Im, H.; Ryu, S. Precise and scalable static analysis of jQuery using a regular expression domain. In Proceedings of the 12th Symposium on Dynamic Languages, DLS 2016, Amsterdam, The Netherlands, 1 November 2016; Ierusalimschy, R., Ed.; ACM: New York, NY, USA, 2016; pp. 25–36.
- Choi, T.; Lee, O.; Kim, H.; Doh, K. A Practical String Analyzer by the Widening Approach. In Proceedings of the 4th Asian Symposium on Programming Languages and Systems, APLAS 2006, Sydney, Australia, 8–10 November 2006; Kobayashi, N., Ed.; Lecture Notes in Computer Science. Springer: Berlin, Germany, 2006; Volume 4279, pp. 374–388.
- Yu, F.; Bultan, T.; Cova, M.; Ibarra, O.H. Symbolic String Verification: An Automata-Based Approach. In Proceedings of the 15th International SPIN Workshop on Model Checking Software, Los Angeles, CA, USA, 10–12 August 2008; Havelund, K., Majumdar, R., Palsberg, J., Eds.; Lecture Notes in Computer Science. Springer: Berlin, Germany, 2008; Volume 5156, pp. 306–324.
- Câmpeanu, C.; Paun, A.; Yu, S. An Efficient Algorithm for Constructing Minimal Cover Automata for Finite Languages. Int. J. Found. Comput. Sci. **2002**, 13, 83–97.
- Domaratzki, M.; Shallit, J.O.; Yu, S. Minimal Covers of Formal Languages. In Proceedings of the 5th International Conference Developments in Language Theory, DLT 2001, Vienna, Austria, 16–21 July 2001; Revised Papers. Kuich, W., Rozenberg, G., Salomaa, A., Eds.; Lecture Notes in Computer Science. Springer: Berlin, Germany, 2001; Volume 2295, pp. 319–329.
- Mohri, M.; Nederhof, M. Regular Approximation of Context-Free Grammars through Transformation. In Robustness in Language and Speech Technology; Springer: Dordrecht, The Netherlands, 2001; pp. 153–163.
- Cousot, P.; Halbwachs, N. Automatic Discovery of Linear Restraints Among Variables of a Program. In Proceedings of the Conference Record of the Fifth Annual ACM Symposium on Principles of Programming Languages, Tucson, AZ, USA, 23–25 January 1978; Aho, A.V., Zilles, S.N., Szymanski, T.G., Eds.; ACM Press: New York, NY, USA, 1978; pp. 84–96.
- Costantini, G.; Ferrara, P.; Cortesi, A. A suite of abstract domains for static analysis of string values. Softw. Pract. Exp. **2015**, 45, 245–287.
- Cousot, P.; Cousot, R. Comparing the Galois Connection and Widening/Narrowing Approaches to Abstract Interpretation. In Proceedings of the 4th International Symposium on Programming Language Implementation and Logic Programming, PLILP’92, Leuven, Belgium, 26–28 August 1992; Bruynooghe, M., Wirsing, M., Eds.; Lecture Notes in Computer Science. Springer: Berlin, Germany, 1992; Volume 631, pp. 269–295.
- D’Silva, V. Widening for Automata. Ph.D. Thesis, Institut für Informatik, UZH, Zurich, Switzerland, 2006.
- Bartzis, C.; Bultan, T. Widening Arithmetic Automata. In Proceedings of the 16th International Conference on Computer Aided Verification, CAV 2004, Boston, MA, USA, 13–17 July 2004; Alur, R., Peled, D.A., Eds.; Lecture Notes in Computer Science. Springer: Berlin, Germany, 2004; Volume 3114, pp. 321–333.
- Cousot, P. Types as Abstract Interpretations. In Proceedings of the Conference Record of POPL’97: The 24th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, Paris, France, 15–17 January 1997; Lee, P., Henglein, F., Jones, N.D., Eds.; ACM Press: New York, NY, USA, 1997; pp. 316–331.
- Reynolds, J.C. Theories of Programming Languages; Cambridge University Press: Cambridge, UK, 1998.
- Giacobazzi, R.; Ranzato, F.; Scozzari, F. Making abstract interpretations complete. J. ACM **2000**, 47, 361–416.
- Fromherz, A.; Ouadjaout, A.; Miné, A. Static Value Analysis of Python Programs by Abstract Interpretation. In Proceedings of the 10th International Symposium on NASA Formal Methods, NFM 2018, Newport News, VA, USA, 17–19 April 2018; Dutle, A., Muñoz, C.A., Narkawicz, A., Eds.; Lecture Notes in Computer Science. Springer: Berlin, Germany, 2018; Volume 10811, pp. 185–202.
- Bordihn, H.; Holzer, M.; Kutrib, M. Determination of finite automata accepting subregular languages. Theor. Comput. Sci. **2009**, 410, 3209–3222.
- Cormen, T.H.; Leiserson, C.E.; Rivest, R.L.; Stein, C. Introduction to Algorithms, 3rd ed.; MIT Press: Cambridge, MA, USA, 2009.
- Holzer, M.; Jakobi, S. Brzozowski’s Minimization Algorithm - More Robust than Expected - (Extended Abstract). In Proceedings of the 18th International Conference on Implementation and Application of Automata, CIAA 2013, Halifax, NS, Canada, 16–19 July 2013; Konstantinidis, S., Ed.; Lecture Notes in Computer Science. Springer: Berlin, Germany, 2013; Volume 7982, pp. 181–192.
- Park, C.; Ryu, S. Scalable and Precise Static Analysis of JavaScript Applications via Loop-Sensitivity. In Proceedings of the 29th European Conference on Object-Oriented Programming, ECOOP 2015, Prague, Czech Republic, 5–10 July 2015; LIPIcs. Boyland, J.T., Ed.; Schloss Dagstuhl-Leibniz-Zentrum für Informatik: Wadern, Germany, 2015; Volume 37, pp. 735–756.
- Mozilla. MDN Web Docs-Useful String Methods. Available online: https://developer.mozilla.org/en-US/docs/Learn/JavaScript/First_steps/Useful_string_methods (accessed on 20 April 2020).
- Abdulla, P.A.; Atig, M.F.; Chen, Y.; Holík, L.; Rezine, A.; Rümmer, P.; Stenman, J. Norn: An SMT Solver for String Constraints. In Proceedings of the Computer Aided Verification - 27th International Conference, CAV 2015, San Francisco, CA, USA, 18–24 July 2015; Part I. Kroening, D., Pasareanu, C.S., Eds.; Lecture Notes in Computer Science. Springer: Berlin, Germany, 2015; Volume 9206, pp. 462–469.
- Liang, T.; Reynolds, A.; Tsiskaridze, N.; Tinelli, C.; Barrett, C.W.; Deters, M. An efficient SMT solver for string constraints. Form. Methods Syst. Des. **2016**, 48, 206–234.
- Cousot, P.; Giacobazzi, R.; Ranzato, F. Program Analysis Is Harder Than Verification: A Computability Perspective. In Proceedings of the Computer Aided Verification - 30th International Conference, CAV 2018, Held as Part of the Federated Logic Conference, FloC 2018, Oxford, UK, 14–17 July 2018; Part II. Chockler, H., Weissenbacher, G., Eds.; Lecture Notes in Computer Science. Springer: Berlin, Germany, 2018; Volume 10982, pp. 75–95.
- Midtgaard, J.; Nielson, F.; Nielson, H.R. A Parametric Abstract Domain for Lattice-Valued Regular Expressions. In Proceedings of the Static Analysis - 23rd International Symposium, SAS 2016, Edinburgh, UK, 8–10 September 2016; Rival, X., Ed.; Lecture Notes in Computer Science. Springer: Berlin, Germany, 2016; Volume 9837, pp. 338–360.
- Lin, A.W.; Barceló, P. String solving with word equations and transducers: Towards a logic for analysing mutation XSS. In Proceedings of the 43rd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL 2016, St. Petersburg, FL, USA, 20–22 January 2016; Bodík, R., Majumdar, R., Eds.; ACM: New York, NY, USA, 2016; pp. 123–136.
- Abdulla, P.A.; Atig, M.F.; Chen, Y.; Holík, L.; Rezine, A.; Rümmer, P.; Stenman, J. String Constraints for Verification. In Proceedings of the Computer Aided Verification - 26th International Conference, CAV 2014, Held as Part of the Vienna Summer of Logic, VSL 2014, Vienna, Austria, 18–22 July 2014; Biere, A., Bloem, R., Eds.; Lecture Notes in Computer Science. Springer: Berlin, Germany, 2014; Volume 8559, pp. 150–166.
- Bouajjani, A.; Habermehl, P.; Vojnar, T. Abstract Regular Model Checking. In Proceedings of the 16th International Conference on Computer Aided Verification, CAV 2004, Boston, MA, USA, 13–17 July 2004; Alur, R., Peled, D.A., Eds.; Lecture Notes in Computer Science. Springer: Berlin, Germany, 2004; Volume 3114, pp. 372–386.
- Bouajjani, A.; Habermehl, P.; Holík, L.; Touili, T.; Vojnar, T. Antichain-Based Universality and Inclusion Testing over Nondeterministic Finite Tree Automata. In Proceedings of the 13th International Conference on Implementation and Applications of Automata, CIAA 2008, San Francisco, CA, USA, 21–24 July 2008; Ibarra, O.H., Ravikumar, B., Eds.; Lecture Notes in Computer Science. Springer: Berlin, Germany, 2008; Volume 5148, pp. 57–67.
- Alur, R.; Madhusudan, P. Visibly pushdown languages. In Proceedings of the 36th Annual ACM Symposium on Theory of Computing, Chicago, IL, USA, 13–16 June 2004; Babai, L., Ed.; ACM: New York, NY, USA, 2004; pp. 202–211.
- Holík, L.; Janku, P.; Lin, A.W.; Rümmer, P.; Vojnar, T. String constraints with concatenation and transducers solved efficiently. Proc. ACM Program. Lang. **2018**, 2, 4.
- Balakrishnan, G.; Reps, T.W. Recency-Abstraction for Heap-Allocated Storage. In Proceedings of the 13th International Symposium on Static Analysis, SAS 2006, Seoul, Korea, 29–31 August 2006; Yi, K., Ed.; Lecture Notes in Computer Science. Springer: Berlin, Germany, 2006; Volume 4134, pp. 221–239.
- Jensen, S.H.; Jonsson, P.A.; Møller, A. Remedying the eval that men do. In Proceedings of the International Symposium on Software Testing and Analysis, ISSTA 2012, Minneapolis, MN, USA, 15–20 July 2012; Heimdahl, M.P.E., Su, Z., Eds.; ACM: New York, NY, USA, 2012; pp. 34–44.
- Sharir, M.; Pnueli, A. Two Approaches to Interprocedural Data Flow Analysis; NYU CS: New York, NY, USA, 1978.

**Figure 7.** (**a**) $\mathtt{A}$, $\mathcal{L}(\mathtt{A})=\{lang,hello\}$; (**b**) ${\mathtt{A}}^{\prime}={\mathsf{SS}}^{\sharp}(\mathtt{A},2,{\top}_{\mathsf{Const}})$.

**Figure 8.** (**a**) $\mathtt{A}$, $\mathcal{L}(\mathtt{A})=\{paper,hello\}$; (**b**) ${\mathtt{A}}^{\prime}$, $\mathcal{L}({\mathtt{A}}^{\prime})=\{abc,hello\}$.

**Figure 9.** (**a**) $\mathtt{A}$, $\mathcal{L}(\mathtt{A})=\{panda,koala\}$; (**b**) ${\mathtt{A}}^{\prime}$, $\mathcal{L}({\mathtt{A}}^{\prime})=\{pan,p\}$.

**Figure 10.** (**a**) $\mathtt{A}$, $\mathcal{L}(\mathtt{A})=\{!Ab,CdE\}$; (**b**) ${\mathsf{LC}}^{\sharp}(\mathtt{A})$.

**Figure 11.** (**a**) $\mathtt{A}$, $\mathcal{L}(\mathtt{A})=\{abc,abd,efg\}$; (**b**) ${\mathtt{A}}^{\prime}$, $\mathcal{L}({\mathtt{A}}^{\prime})=\{ab,fg\}$.

**Figure 12.** (**a**) $\mathtt{A}$, $\mathcal{L}(\mathtt{A})=\{panda,candy,andy\}$; (**b**) ${\mathtt{A}}^{\prime}$, $\mathcal{L}({\mathtt{A}}^{\prime})=\{an,nd\}$.

**Figure 13.** (**a**) $\mathtt{A}$, $\mathcal{L}(\mathtt{A})=\{do,mi\}$; (**b**) ${\mathsf{RT}}^{\sharp}(\mathtt{A},2)$; (**c**) ${\mathsf{RT}}^{\sharp}(\mathtt{A},{\top}_{\mathsf{Const}})$.

**Figure 14.** (**a**) $\mathtt{A}$, $\mathcal{L}(\mathtt{A})=\{(\text{␣})^{*}ab,\ \text{␣}d\}$; (**b**) ${\mathsf{TL}}^{\sharp}(\mathtt{A})$.

**Figure 17.** Useful string manipulation method taken from [38].

${\mathsf{SS}}^{\sharp}(\mathtt{A},\mathbf{i},\mathbf{j})$ | $\mathbf{j}\in \mathbb{Z}$ $(\mathbf{j}\ne {\top}_{\mathsf{Const}})$ | $\mathbf{j}={\top}_{\mathsf{Const}}$ |
---|---|---|
$\mathbf{i}\in \mathbb{Z}$ $(\mathbf{i}\ne {\top}_{\mathsf{Const}})$ | $\mathsf{SS}(\mathtt{A},\mathsf{min}(i,j),\mathsf{max}(i,j))$ | $\bigsqcup_{a\in [0,i]}\mathsf{SS}(\mathtt{A},a,i) \sqcup {\mathsf{SS}}^{\leftrightarrow}(\mathtt{A},i)$ |
$\mathbf{i}={\top}_{\mathsf{Const}}$ | $\bigsqcup_{a\in [0,j]}\mathsf{SS}(\mathtt{A},a,j) \sqcup {\mathsf{SS}}^{\leftrightarrow}(\mathtt{A},j)$ | $\mathsf{FA}(\mathtt{A})$ |
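The case analysis in the table above can be illustrated on finite sets of strings rather than on automata. The following is only a minimal sketch under that assumption: `TOP`, `range` and `substringCases` are hypothetical names introduced here, and the automata constructions $\mathsf{SS}$, ${\mathsf{SS}}^{\leftrightarrow}$ and $\mathsf{FA}$ are not reproduced. An unknown index is simply enumerated, which suffices for finite inputs because JavaScript's `substring` clamps its arguments to `[0, s.length]` and swaps them when the first exceeds the second.

```javascript
// Hypothetical marker for an unknown integer index (the table's ⊤_Const).
const TOP = Symbol("TOP");

// All integers lo..hi, inclusive.
function range(lo, hi) {
  const out = [];
  for (let k = lo; k <= hi; k++) out.push(k);
  return out;
}

// Collects every possible result of s.substring(i, j) for s in `strings`.
// An unknown index is enumerated over 0..s.length; any other integer would be
// clamped by substring to one of these values, so no result is missed.
function substringCases(strings, i, j) {
  const results = new Set();
  for (const s of strings) {
    const starts = i === TOP ? range(0, s.length) : [i];
    const ends = j === TOP ? range(0, s.length) : [j];
    for (const a of starts) {
      for (const b of ends) results.add(s.substring(a, b));
    }
  }
  return results;
}

// Both indices known: one substring per input string.
console.log([...substringCases(["lang", "hello"], 1, 3)]); // [ 'an', 'el' ]

// Second index unknown: substrings ending at 2 and substrings starting at 2.
console.log([...substringCases(["lang", "hello"], 2, TOP)].sort());
```

With both indices unknown, the same function returns every factor of every input string, which is the finite-set counterpart of the bottom-right cell of the table.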

**Table 2.** $\mu \mathsf{JS}$ Finite-state Automata String Analyzer ($\mu \mathsf{FASA}$) string operations.

String Operation | Soundness | Completeness | Average Complexity |
---|---|---|---|
substring | ✓ | ✓ | $O(n\log n)$ |
charAt | ✓ | ✓ | $O(n\log n)$ |
length | ✓ | ✓ | $O(n+m)$ |
concat | ✓ | ✓ | $O(n\log n+n+m)$ |
startsWith, endsWith | ✓ | ✗ | $O(n\log n+n+m)$ |
toLowerCase, toUpperCase | ✓ | ✓ | $O(m)$ |
includes | ✓ | ✗ | $O(n\log n+n+m)$ |
repeat | ✓ | ✗ | $O(n\log n+n+m)$ |
replace | ✓ | ✗ | $O((n+m)\,n\log n)$ |
indexOf | ✓ | ✗ | $O(n\,(n\log n)\,({n}^{2}m))$ |
slice | ✓ | ✗ | $O((n+m)(n\log n))$ |

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Arceri, V.; Mastroeni, I.; Xu, S.
Static Analysis for ECMAScript String Manipulation Programs. *Appl. Sci.* **2020**, *10*, 3525.
https://doi.org/10.3390/app10103525
