Adding Matrix Control: Insertion-Deletion Systems with Substitutions III

: Insertion-deletion systems have been introduced as a formalism to model operations that ﬁnd their counterparts in ideas of bio-computing, more speciﬁcally, when using DNA or RNA strings and biological mechanisms that work on these strings. So-called matrix control has been introduced to insertion-deletion systems in order to enable writing short program fragments. We discuss substitutions as a further type of operation, added to matrix insertion-deletion systems. For such systems, we additionally discuss the effect of appearance checking . This way, we obtain new characterizations of the family of context-sensitive and the family of recursively enumerable languages. Not much context is needed for systems with appearance checking to reach computational completeness. This also suggests that bio-computers may run rather traditionally written programs, as our simulations also show how Turing machines, like any other computational device, can be simulated by certain matrix insertion-deletion-substitution systems.


Introduction
Insertion-deletion systems, or ins-del systems for short, are well-established as computational devices and as a research topic within Formal Languages throughout the past nearly 30 years, starting off with the PhD thesis of Lila Kari [1]. Corresponding to the mismatched annealing of DNA sequences, both the insertion and deletion operations have a strong biological background, which led to their study in the molecular computing framework (cf. [2]). Insertion rules add a substring to a string, given a specified left and right context, while deletion rules remove a substring from a string, again taking a specified left and right context into consideration. The replacement of single letters (possibly within some context) by other letters by an operation, called substitution, is discussed in [3,4], again from a bio-computing background.Interestingly, all of the theoretical studies on grammatical mechanisms involving insertions and deletions (except [5]) omitted, including the substitution operation in their studies. In [6][7][8], we started out a project to formally and systematically study insertion-deletion systems with substitutions as an additional operation, leading to ins-del-sub systems (for short).
It can be argued that the potentially most error-prone part of a bio-computing implementation of ins-del-sub systems are context checks concerning the site where the operation (be it an insertion, a deletion or a substitution operation) may be applied. Therefore, it is interesting to study how much context dependency is necessary for achieving computational completeness, given the ability to add a substring of length n and to delete a substring of length p.
Conversely, assuming that one can prove that ins-del-sub systems with limited context checks exist that can simulate arbitrary Turing machines (or phrase structure grammars, or any arbitrary computational mechanism that can be used to define terms like computability), then this means that there is some hope that some day one could build computers that are no longer silicon-based, but that compute with RNA, DNA, or protein structures.
It should be noted that all of our computational completeness proofs are constructive, which means that algorithms exist that will finally turn any program written in some high-level programming language of your choice into an ins-del-sub system that executes this program based on insertion, deletion, or substitution operations.
As the following definitions will show, the 'derivation relation', although formally defined as a sequential series of insertion, deletion, or substitution operations, can be easily seen to allow an inherent parallelism, as all operations that are 'far enough' from each other can be executed in parallel. Because the potentially high degree of parallelism in bio-computing is argued as one of the main attractive features of this form of computation, this gives another reason as to why one should strive for operations that are as context-independent as possible, as context dependencies could be seen as semaphore-like synchronization points. Namely, on the level of bio-computing, checking certain strings means building up some structures (mostly proteins) that read and check 'matching structures' by forming chemical links between molecules. This means that 'checking from one side' prevents and, hence, rules out 'checking from the other side' in parallel.
Ins-del systems can be extended with some form of control to further reduce their context dependency. Matrix insertion-deletion systems, or matrix ins-del systems for short, were introduced in [9,10]. These systems group insertion and deletion rules in sequences, called matrices; either the whole sequence of operations is applied consecutively, or no rule is applied at all, thus resembling traditional matrix grammars, originally introduced with a linguistic motivation [11]. From the perspective of bio-computing, the matrices correspond to small program fragments without jumps that are easier to implement than longer and more involved ones. Allowing such program fragments should also make a better fit for finally implementing compilers that produce executable 'bio-code' from traditional high-level programming languages, as the sequential execution of commands is one of the cornerstones of basically any of the programming languages of today. There is certainly a certain trade-off between the potentially high degree of parallelism and these new sequential program fragments. This is one of the motivations to limit (on top of context lengths) the length of these program fragments (i.e., technical speaking the length of the matrices).
Additionally, we discuss appearance checking in the context of matrix ins-del-sub systems. In the case of matrix grammars, it is known that allowing certain rules of a matrix to be skipped if not applicable increases the computational power [12]. In this paper, we investigate the effect of appearance checking on matrix ins-del-sub systems. We show that the context dependency of ins-del systems can be greatly reduced if matrices, appearance checking, and substitution rules are allowed. For instance, it is shown that a matrix ins-delsub system that only allows context-free single letter insertions and two-letter deletions in addition to context-free substitution is sufficient for generating any recursively enumerable language. In addition, we show that a 'normal form' for matrix ins-del-sub systems exists, in which only matrices of size at most 2 occur. On the downside, it can be argued that appearance checks is a particularly expensive feature when it comes to implementing it in bio-computing devices, as many sites of potential rule applications have to be checked before being able to execute the next command in the matrix. However, one clearly sees a trade-off in our results between the necessity to have larger contexts and the necessity to have appearance checks. Because it is not clear which of these mechanisms is really harder to implement when it comes to build real bio-computers, it appears to be reasonable to study the general possibilities of these mechanisms, hence paving the way to future generations of new computing devices.

Definitions
We assume the reader to be familiar with the standard notations in formal language theory. By λ we denote the empty string. Let w be an arbitrary string. We denote by w R the reversal or mirror image of w. By L R and L R , we denote the reversal of a language L and a language family L, respectively. We denote, by RE, CS, CF, and REG, the families of recursively enumerable, context-sensitive, context-free, and regular languages, respectively. We are interested in computational completeness results, i.e., in describing RE with matrix ins-del-sub systems with little resources as formally explained next.

Matrix Grammars
A matrix grammar is a tuple G = (N, T, M G , S) where N, T and S are the finite set of nonterminals, the finite set of terminals and the start symbol, respectively. M G is a finite set of sequences of the form m = [r 1 , r 2 , . . . , r n ], n ≥ 1, with rewriting rules Such a sequence m is called a matrix [12]. The relation ⇒ induced by G is defined, as follows. For words w 1 , w 2 ∈ (N ∪ T) * , w 1 ⇒ w 2 holds if a matrix m = [r 1 , r 2 , . . . , r n ] and w 0 , . . . , w n ∈ (N ∪ T) * with w 0 = w 1 and w n = w 2 exist, such that w j ⇒ r j+1 w j+1 holds for all 0 ≤ j < n. The language that is generated by G is L(G) = {w ∈ T * | S ⇒ * w}. We denote, by L(M, CF), the language family that is generated by matrix grammars with context-free rewriting rules [12]. A matrix grammar with appearance checking is a tuple G ac = (N, T, M G , S, F), where N, T, M G , and S are defined as in usual matrix grammars. F is a set of rewriting rules occurring in matrices of M G . All of the rules in F may be skipped in a transition of G ac , if not applicable. Thus, the absence of symbols can be checked.

Insertion-Deletion Systems
An insertion-deletion system (ins-del system for short) is a five-tuple ID = (V, T, A, I, D), consisting of two alphabets V and T with T ⊆ V, a finite language A over V, a set of insertion rules I and a set of deletion rules D. Both sets of rules are formally defined as sets of triples of the form (u, a, v) with u, v ∈ V * and a ∈ V + . We call elements occurring in T terminal symbols, while referring to elements of V \ T as nonterminals. Elements of A are called axioms.
Let w 1 uvw 2 and w 1 uavw 2 , with w 1 , u, v, w 2 ∈ V * , a ∈ V + , be strings. The application of an insertion rule (u, a, v) ∈ I (also written (u, a, v) ins ) to w 1 uvw 2 corresponds to inserting the string a ∈ V * between u and v, which results in the string w 1 uavw 2 . The application of a deletion rule (u, a, v) ∈ D (also written (u, a, v) del ) to w 1 uavw 2 results in the removal of a substring a from the context (u, v), which results in the string w 1 uvw 2 . The relation =⇒ is defined, as follows: Let x, y ∈ V * . Afterwards, we write x =⇒ y iff y is the result of applying an insertion or deletion rule to x. We write =⇒ ins /=⇒ del if y is obtained via an insertion/ a deletion rule. We denote, by =⇒ + and =⇒ * , the transitive and the reflexive and transitive closure, respectively. The language that is generated by ID is defined by L(ID) = {w ∈ T * | ∃α ∈ A : α =⇒ * w}. Consider (u, a, v) ins or (u, a, v) del . We refer to u as the left context and v as the right context of (u, a, v) ins /(u, a, v) del . A sentential form of ID is a string over V. The size of ID describes its complexity and it is defined by a vector (n, m, m ; p, q, q ), By INS m,m n DEL q,q p , we denote the family of all insertion-deletion systems of size (n, m, m ; p, q, q ) [13,14]. Depending on the context, we also denote the family of languages that can be generated by insertion-deletion systems of size (n, m, m ; p, q, q ) by INS m,m n DEL q,q p . We first study a concrete example now to clarify the definitions and also to return to some of the general discussions of the introduction. (1, 1, 1; 1, 0, 0) and the generated language is L(ID) = a + b + ∪ a + cb + .
As mentioned in the introduction, ins-del systems offer a high degree of parallelism. The system ID, for instance, exhibits this trait when considering the rules (a, a, c) and (c, b, b). It is easy to see that the order in which these rules are applied is insignificant and, thus, these rules do not affect each other. Hence, (a, a, c) and (c, b, b) are "far enough" from each other to be applied in parallel without affecting the computation. However, note that the deletion rule (λ, c, λ) cannot be applied in parallel with any insertion rule as the order does matter in this case. More precisely, the deletion (λ, c, λ) cannot be applied before applying an insertion, as it removes necessary context information.
Finally, observe that the amount of parallelism that is observable in generating the language a + b + ∪ a + cb + can be further significantly increased by adding the insertion rules (a, a, a) and (b, b, b).

Combining Ideas: Matrix Insertion-Deletion Systems
The idea of regulating ins-del systems with matrix control goes back to [10,15]. A matrix ins-del system [10] is a construct MID = (V, T, A, M) where V, T and A are defined as in usual ins-del systems. M = {m 1 , . . . , m t }, t ≥ 1, is a finite set of sequences, called matrices, of the form m i = [r i,1 , . . . , r i,k i ], where k i ≥ 1. r i,j , with 1 ≤ i ≤ t, 1 ≤ j ≤ k i , is either an insertion or a deletion rule. A sentential form of MID is a string w ∈ V * . Consider a matrix m i = [r i,1 , . . . , r i,k i ]. A transition w =⇒ m i w is performed if there exist strings w 1 , . . . , w k i +1 ∈ V * such that w j =⇒ r i,j w j+1 with w 1 = w and w k i +1 = w . Let =⇒:= m∈M =⇒ m . The language that is generated by MID is defined as We say that MID has matrices of size k if k = max 1≤i≤t k i . If MID is an ins-del system of size (n, m, m ; p, q, q ) with matrices of size k , we also say that MID is of size (k; n, m, m ; p, q, q ). We denote by MAT k INS m,m n DEL q,q p either the family of languages that are generated by ins-del systems of size (k; n, m, m ; p, q, q ) or the family of ins-del systems of size (k; n, m, m ; p, q, q ), depending on the context. Denote, by MAT * INS m,m n DEL q,q p SUB r,r , the family of matrix ins-del-sub systems with matrices of arbitrary size and insertion rules and deletion rules, of size (n, m, m ) and (p, q, q ), respectively. The following matrix ins-del systems are known to describe RE:  [10]. The incompleteness results for matrix ins-del sub systems includeMAT * INS 0,0 2 DEL 0,0 2 , for instance. More precisely, the following theorem holds.
This result follows from [18], as stated as in [10]. For reasons of completeness, we give a formal proof of this result in the following.
Proof. Before we begin with our proof, we introduce 'markings' [14,19] in the following paragraph. We explain the details of the marking approach with the following example. Consider the derivation λ =⇒ ins aA =⇒ ins aAaA =⇒ ins abCAaA =⇒ ins abDECAaA =⇒ del abDAaA =⇒ del abaA =⇒ ins abaAbc =⇒ del abac of an ins-del system of size (2, 0, 0; 2, 0, 0). Then we introduce 'markings', as follows: Two symbols that have been introduced together are joined with an overline, while two symbols that are deleted together are joined with an underline. The overlines and underlines mark the insertion pairs and the deletion pairs, respectively. The marking to a word abac, which is derived in the manner presented above, is shown in Figure 1. Figure 1. Marking corresponding to a derivation of abac [14,19].
Interpreting all of the symbols as labelled nodes and all lines as edges, it is easy to see that any word w that is generated by an arbitrary ins-del system of size (2, 0, 0; 2, 0, 0) corresponds to a graph, which consists of disjoint paths and/or cycles. If a symbol A is deleted by a deletion rule (λ, A, λ), then we replace the corresponding node in the graph with A . The node A is interpreted as λ. Additionally, the application of an insertion rule (λ, A, λ) corresponds to adding a path, which consists of a single edge with nodes λ and A, to the graph at the corresponding position. We assume that all letters of the axiom have been introduced by such insertion rules. There is clearly no interaction between two symbols of two different paths/cycles or between a symbol of a cycle and a symbol of a path. We remark that all of the symbols/nodes, which have a degree of two, are deleted and hence only symbols with a degree of one contribute to the final word w. Note that all of the symbols of degree one only have an 'overline' edge, while nodes of degree two have both types of edges. Therefore, it is clear that at most two symbols of a path can contribute to w. We refer to [14,19] for more details on the 'marking' approach. Clearly, the marking approach can be applied to matrix-ins-del systems of size ( * ; 2, 0, 0; 2, 0, 0), as well.
We now show that the language a * b cannot be generated by any matrix ins-del systems of size ( * ; 2, 0, 0; 2, 0, 0) by contradiction. Assume that there is a matrix insertion-deletion system MID, which generates the language a * b. Subsequently, clearly the word a m b, where m > 2n + 1 and n is the length of the longest axiom, can also be generated. Consider the graph that corresponds to a m b. It is clear that this graph consist of more than n + 1 paths. Then there is at least one path which does not involve any symbols of the axiom and contributes at least one letter a (and no b) to a m b. Because this path does not involve any letters of the axiom, all symbols of this path have been introduced with context-free insertion rules. We denote this path as P. Consider the derivation of a m b. Because there is no interaction between two symbols of different paths and /or cycles, it is clear that there is a derivation that applies all of the matrices used in the derivation of a m b in the same order, but in which all insertions and deletions corresponding to the path P occur right of the final b. This, in turn, means that ID also generates a word in a * ba + .

Adding Substitutions
With substitution rules, we now introduce the central notion of this paper. We define substitution rules to be of the form (u, allows us to substitute a single letter a with another letter b in the context of u and v, which results in the string w 1 ubvw 2 . Formally, we define an insertion-deletion-substitution system, or ins-del-sub system for short, to be specified by a six-tuple ID ς = (V, T, A, I, D, S), where V, T, A, I, and D are defined as in the case of usual ins-del systems and S is a set of substitution rules. Let x = w 1 uavw 2 and y = w 1 ubvw 2 be the strings over V. The substitution rules define a relation =⇒ sub , as follows: In the context of ins-del-sub systems, we write= ⇒ to denote any of the relations =⇒ ins , =⇒ del or =⇒ sub . We define= ⇒ * and= ⇒ + as usual, denoting the reflexive-transitive and transitive closure of= ⇒, respectively.
The language that is generated by an ins-del-sub system ID ς is defined as As with usual ins-del system, we measure the complexity of an ins-del-sub system ID ς = (V, T, A, I, D, S) via its size, which is, an eight-tuple (n, m, m ; p, q, q ; r, r ), where n, m, m , p, q, and q are defined as in the case of usual ins-del systems and r and r limit the maximal length of the left and right context of a substitution rule, respectively, i.e., p SUB r,r denotes the family of all ins-del-sub systems of size (n, m, m ; p, q, q ; r, r ) Note that as only one letter is replaced by any substitution rule, there is no subscript at SUB. Depending on the context, we also refer to the family of languages generated by ins-del-sub systems of size (n, m, m ; p, q, q ; r, r ) by INS m,m n DEL q,q p SUB r,r . As with ins-del systems, ins-del-sub systems can be regulated with matrix control introducing matrix ins-del-sub systems. A matrix ins-del-sub system is a construct is either an insertion, a deletion or a substitution rule.
We define the relation between the strings w and w over V w= ⇒ m i w , as well as the generated language of MID ς , analogously to the case without substitution rules. We say that matrix ins-del-sub systems, which have insertion rules, deletion rules, substitution rules, and matrices of size (n, m, m ),(p, q, q ), (r, r ), and k, respectively, is of size (k; n, m, m ; p, q, q ; r, r ). By MAT k INS m,m n DEL q,q p SUB r,r , denote the family of matrix ins-del-sub systems of size (k; n, m, m ; p, q, q ; r, r ), as well as the family of languages generated by such systems, depending on the context. Consider a language family MAT k INS m,m n DEL q,q p SUB r,r . Concerning the reversal operator (that reads words from right to left), the following lemma holds. Proof. These claims follow analogously to [17].
By definition, it is also clear that the following lemma holds.

Appearance Checking: An Additional Feature
It is known that matrix grammars with context-free production are not computationally complete, but they can reach computational completeness if used with appearance checking. Transferring this idea to matrix ins-del-sub systems, we introduce matrix ins-delsub systems with appearance checking and show that, similar to the matrix grammar case, formerly computationally incomplete matrix ins-del-sub systems can reach computational completeness if they are used in conjunction with appearance checking. We begin by defining matrix ins-del-sub systems and appearance checking. A matrix ins-del-sub systems and appearance checking is a tuple MID ς,ac = (V, T, A, M ς , F), where V, T, A, and M ς are defined as in usual matrix ins-del-sub systems. F is a subset of all rules occurring in M ς .
Let w 1 , w 2 ∈ V * and z be an arbitrary rule occurring in M ς . We define the re-lation= ⇒ ac z , as follows: w 1= ⇒ ac z w 2 if one of the following conditions hold: (a) the rule z is applicable to w 1 , such that w 1= ⇒ z w 2 or (b) the rule z is not applicable to w 1 , z ∈ F and w 1 = w 2 . Basically, this means that, if some rule in F is not applicable, we can skip that rule. Let m = [r 1 , . . . , r n ] ∈ M ς and x, y ∈ V * . Then x= ⇒ ac m y iff x = w 0= ⇒ ac r 1 w 1= ⇒ ac r 2 . . .= ⇒ ac r n w n = y. The language that is generated by MID ς,ac is defined as We define the size of MID ς,ac analogously to matrix ins-del-sub systems without appearance checking. We denote the language family that is generated by matrix insdel-sub systems with appearance checking of size (k; n, m, m ; p, q, q ; r, r ) by the term MAT ac k INS m,m n DEL q,q p SUB r,r . Again, we like to clarify these definitions by presenting a concrete example of a matrix ins-del-sub system that makes use of appearance checking.

Example 2. Consider the matrix ins-del-sub systems with appearance checking MID
The language that is generated by MID ς,ac is a * b. This can be shown, as follows.
Assume that during the application of m the nonterminal X is inserted to the right of b. Subsequently, clearly the deletion rule (λ, bX, λ) is applicable after the insertion and we delete X along with b. However, now the substitution rule (λ, X → a, λ) cannot be applied anymore and, as (λ, X → a, λ) / ∈ F, this means that the matrix as a whole cannot be applied. Assume that, during the application of m, the nonterminal X is inserted somewhere left of b. Clearly, (λ, bX, λ) del is not applicable and, as (λ, bX, λ) del ∈ F, we skip this rule and proceed with the application of (λ, X → a, λ). Applying m in this way effectively inserts the letter a somewhere left of b and, therefore, w 2 ∈ a * b.
Using the argument above inductively yields our claim. We remark that, for any word w derived from the axiom b, |w| {X} = 0 holds, as any X introduced during the application of m is resolved at the end of m.
The example above clarifies that appearance checking can result in an increase in computational power, as a * b / ∈ MAT * INS 0,0 2 DEL 0,0 2 SUB 0,0 is known (Theorem 1).

Computational (In-)Completeness Results
In this section, we present the main results of our research.

A Normal Form Theorem
We begin by introducing a (binary) normal form for matrix ins-del-sub systems that are similar to the two-normal form (also known as binary normal form) for matrix grammars ( [12] Def. 1.2.1). Recall our discussion in the introductory section concerning possible applications of matrix ins-del-sub systems: there, we argued that short matrices offer advantages, as they help allow for parallel execution in this type of computational devices. However, when looking at the proof of the next theorem, one sees that it enforces some sequentialization by introducing a shared resource in the form of specific nonterminals. Therefore, in essence, the question of parallelizability within computational devices, like (matrix) ins-del-sub systems, remains an interesting topic of future research.
A matrix ins-del-sub systems where r is some insertion, deletion, or substitution rule and A, B ∈ V. Clearly, all matrix ins-del-sub systems in normal form are included in MAT 2 INS m,m n DEL q,q p SUB r,r . We show that, for every matrix ins-del-sub system, there is a matrix ins-del-sub system in normal form that generates the same language.
and all matrices of MID ς be labelled in a one-to-one manner, i.e., a bijection from M ς to a set of labels exists. Subsequently, we define For every matrix m = [r 1 , r 2 , . . . , r n ] of MID ς , where r j , with j = 1, . . . , n, is some insertion, deletion, or substitution rule and i is the label of m, we add the following matrices to MID ς . For every label i of some matrix of MID ς and every matrix m = [r 1 , r 2 , . . . , r n ] of MID ς , we add a matrix [r n , (λ, (i, n) → (i , 1), λ)] to MID ς . By definition, the second component of every matrix of MID ς is either a context-free deletion rule of the form (λ, (i, j), λ) del or a context-free substitution rule of the form (λ, (i, j) → (i , j ), λ), where i and i are the labels of some matrices of MID ς and j, j ≤ k. Furthermore, the first rule of any matrix of MID ς does not involve any nonterminals in V \ V.
Consider a sentential form w 1 (i, j)w 2 with w 1 , w 2 ∈ V * . It can be shown that all the sentential forms of MID ς are either of this form or of the form w ∈ V * . The basic idea is that the nonterminal (i, j) serves as an indicator where the matrix is to be applied next. For instance, the occurrence of (i, j) in w 1 (i, j)w 2 signifies that the next rule to be applied is either (r j , (λ, (i, j) → (i, j + 1), λ)) if the length of the matrix of MID ς labelled by i is greater than j or (r j , (λ, (i, j) → (i , 1), λ)) / (r j , (λ, (i, j), λ) del ), otherwise. We note that, in every derivation of MID ς , a matrix of the form (r j , (λ, (i, j), λ) del ) is applied at most once. Furthermore, we remark that, if a sentential form w ∈ V * occurs during a derivation of MID ς , then the derivation cannot proceed as the second rule of any matrix cannot be applied. We now prove the correctness of the construction, i.e., L(MID ς ) = L(MID ς ), by showing both inclusion directions separately. '⊇': Consider the following derivation w 1 (i, j)w 2= ⇒ [r,(λ,(i,j)→(i ,j ),λ)] w 1 (i , j )w 2 , where r is some insertion, deletion, or substitution rule and w 1 , w 2 ∈ V * . because r does not involve any nonterminals in V \ V, clearly w 1 w 2= ⇒ r w 1 w 2 holds. We now extend this result. Consider a matrix m = [r 1 , . . . , w We remark that, in the simulation of the application a matrix of MID ς , we can assume that the leftmost symbol of any sentential form of a derivation of MID ς is of a nonterminal (i, j) ∈ V \ V (unless a matrix of the form [r n , (λ, (i, n), λ) del ] is applied).
Similarly, for every matrix ins-del  Proof. Let MID ς = (V, T, A, M ς ) and all matrices of MID ς be labelled in a one-to-one manner, i.e., a bijection M ς to a set of labels exists. We define the set of symbols as where k denotes the maximal length of a matrix in M ς . We now describe how to construct an equivalent matrix ins-del-sub system MID ς of the same size, such that MID ς is in normal form. For every v ∈ V, every label i of a matrix of MID ς and every matrix m = [r 1 , r 2 , . . . , r n ] of MID ς , where r j , with j = 1, . . . , n, is some insertion or substitution rule and i is the label of m, we add the following matrices to MID ς . Intuitively, any letters of any sentential form of MID may be substituted or used as a context of some rule. Hence, to simulate MID, any letter of the form v i,j may have to be substituted or used as a context. Therefore, if one of the matrices added to MID ς is of the The cases are treated analogously. The set of axioms is defined as A ={α 1 i,1 α 2 . . . α n | α 1 , . . . , α n ∈ V, n ≥ 1, α 1 . . . α n ∈ A and i is the label of a matrix of MID ς }.
Additionally, if λ ∈ A, for every matrix of MID ς , which has the form m = [(λ, a 1 . . . a n , λ) ins , r 2 , . . . , r n ] with a 1 , . . . , a n ∈ V, we add a 1 i,2 a 2 . . . a n to A , where i is the label of m. This simulates the case that λ is the axiom of a derivation of MID ς and the matrix m is applied. The basic overall idea is the same as in Theorem 2. Here, nonterminals v i,j control the derivations. Hence, we can hence forego a formal inductive proof.

One-Sided Context Dependence
Our previous computational completeness results either required two-sided contexts for insertion or two-sided contexts for deletions. As argued in the introduction, there are good motivations to try to reduce the context dependence. Hence, we are now looking at one-sided contexts for deletions and insertions. We are going to prove two main results in this subsection: first, we show that uni-directional (for instance, left) single-symbol context in insertions and deletions of single symbols suffice in achievinf computational completeness, which this is in contrast with the second result that tells that we cannot completely forego using context information: we do need one-sided context dependency for both deletions and for insertions.

Computational Incompleteness
Though ins-del systems with matrix control and substitution rules are powerful devices, they are not always sufficient for achieving computational completeness. For every matrix of the form X of MID ς , we add a matrix of the form Y to M G .
For every matrix of the form X of MID ς and every N a ∈ N, we add matrices of the Furthermore, we add matrices of the form [N a → a] to M G if a ∈ T. By induction, it can be shown that a 1 . . . a n ⇒ G b 1 . . . b m if and only if ] is applicable to a 1 . . . a n ∈ V * , some matrix of the form [N a → N a N b , N c → N d ] is applicable to N a 1 . . . N a n ∈ N * and vice versa.
We remark that, if a sentential form λ occurs during either a derivation of MID ς or G, the derivation cannot proceed, as no matrix is applicable any more. (see Theorem 2). Hence, the case that the axiom of a derivation of MID ς is λ is covered by an application of S → λ. Therefore, it is easy to see that L(G) = L(MID ς ) holds. Moreover, L(G) ∈ L(M, CF).
Proof. Let G = (N, T, M G , S) be a matrix grammar with context-free production rules. Afterwards, we construct MID ς = (V, T, A, M ς ) as follows: Consider a context-free production rule of the form A → λ. Clearly, this rule is equivalent to a deletion rule (λ, A, λ). Analogously, a production rule A → B is essentially the same as a substitution rule (λ, A → B, λ).
Clearly, applying this sequence to a word w ∈ (N ∪ T) * is the same as applying the production rule A → B 1 . . . B n .
Hence, we add the matrices that were obtained by the Algorithm 1 to M ς .
Consider the following construction for the proof. For each derivation of a matrix ins-del-sub system MID ς = (V, T, A, M ς ) of size ( * ; 1, 0, 0; 1, 1, 0; 0, 0), we construct a group of trees that represents the structure of the derivation.
Each tree node is labelled by a string over V, such that reading the rightmost symbols of all root labels of the corresponding group of trees from left to right yields w (we refer to Figure 2). If an insertion rule (λ, a, λ) adds the letter a at some position of the sentential form, we add a new tree with a single node labelled a at the corresponding position in the group of trees (see Figure 3). Figure 3. The tree group following the application of (λ, a, λ) ins .
Applying a deletion rule (a, X, λ) has the following effect on the group of trees: the node corresponding to X becomes the rightmost child of the node corresponding to a (see Figure 4). a X Y YX Y X Figure 4. The tree group following the application of (a, X, λ) del .
Let w Y be the string of the node corresponding to a letter Y. If a substitution rule (λ, Y → b, λ) is applied, then we concatenate b right of w Y (see Figure 5). a X Y YX Yb X Figure 5. The tree group following the application of (λ, Y → b, λ).
Let the axiom of the derivation be α 1 , . . . , α n . Subsequently, the group of trees consists initially of n trees with single nodes, each being labelled by a symbol of the axiom, such that reading the labels of the respective roots from left to right yields α 1 , . . . , α n . Each root node corresponds to a letter of the current sentential form.
By construction, it is clear that only (the rightmost letter of) root labels contribute letters to the final word, i.e., each tree contributes, at most, one letter to the final word. Furthermore, it is clear that there is no interaction between letters of two different trees, i.e., a letter belonging to certain tree is not a context for some operation on a letter of another tree.

Example 3.
Consider a matrix ins-del-sub system MID ς of size ( * ; 1, 0, 0; 1, 1, 0; 0, 0), which has the axiom X 1 X 2 . Let be matrices of MID ς . Clearly X 1 X 2= ⇒ m 1 m 2 m 3 bab holds and the corresponding group of trees is b X a a b Note that all of the letters corresponding to the eventual a-tree originate from context-free insertions. Furthermore, because there is no interaction between letters of two different trees, inserting all letters of the eventual a-tree (in the order specified by the a-tree) left or right of all letters belonging to eventual b-trees does not affect the b-trees. Thus, X 1 X 2= ⇒ m 1 m 2 m 3 abb and X 1 X 2= ⇒ m 1 m 2 m 3 bba also hold.
Proof. We show that there is no matrix ins-del-sub system of size (1, 0, 0; 1, 1, 0; 0, 0) generating the regular language a + b + . Assume to the contrary that MID ς = (V, T, A, M ς ) generates a + b + . Subsequently, MID ς generates the word a n b n , n > γ, as well, where γ is the length of the longest axiom of MID ς . Consider the group of trees corresponding to a derivation of a n b n starting from the axiom α. Because n > γ, there exists a tree t with the following properties: (1) the tree contributes a letter a to a n b n and (2) all nodes of the tree originate from the application of some insertion rule. Consider the derivation from α to a n b n . Subsequently, MID ς also generates a n−1 b n a. The string a n−1 b n a is generated by applying the same matrices used in the derivation from α to a n b n in the same order. All of the insertion rules corresponding to nodes of the tree t that are specified above are applied right of all letters belonging to (eventual) b-trees. Because there is no interaction between letters of two different trees, none of the letters that correspond to nodes of t are used as context to delete symbols not affiliated with nodes of t. Thus, inserting these letters right of all letters belonging to (eventual) b-trees changes nothing for the other trees. Note that the tree t specifies the position of the inserted letters in relation to each other as well as how the rest of the rules concerning symbols of t are applied.

Context-Free Substitutions Do Not Always Help
We now show that extending context-free matrix ins-del systems with context-free substitution rules does not result in an increase in computational power.
Therefore, the set M is constructed, as follows: let m ∈ M, then we replace all of the occurrences of substitution rules (λ, a → b, λ) in m with the sequence (λ, Xb, λ) ins , (λ, aX, λ) del . The resulting matrix is added to M . This procedure is applied to all m ∈ M.

One-Sided Substitutions
We now consider matrix ins-del systems with one-sided substitution rules. In particular, the families of systems that are discussed in detail now are MAT 2 INS 0,0 1 DEL 0,0 1 SUB 1,0 and MAT 2 INS 0,0 1 DEL 0,0 0 SUB 1,0 . The following procedure is applied to all matrices of MID. Consider an arbitrary matrix m of MID. We replace every occurrence of an insertion rule of the form (a, b, λ) with an insertion rule (λ, X, λ) and a substitution rule (a, X → b, λ). Additionally, any deletion rule of the form (a, b, λ) is replaced with a substitution rule (a, b → X, λ) and a deletion rule (λ, X, λ). The matrix that is obtained by these replacements is added to M .
Clearly, MID ς ∈ MAT * INS 0,0 1 DEL 0,0 1 SUB 1,0 . The basic idea of this proof is that the nonterminal X is immediately resolved after being introduced. It is easy to see that applying the substitution rule (a, b → X, λ) immediately after the insertion rule (λ, X, λ) is essentially the same as applying a rule of the form (a, b, λ). Likewise, applying the deletion rule (λ, X, λ) immediately after the substitution rule (a, b → X, λ) is basically the same as applying the deletion rule (a, b, λ).
We now show that omitting deletion rules in the systems mentioned above yields a characterization of context-sensitive languages, which is quite rare with ins-del systems. Clearly, G is a matrix grammar with context-sensitive production rules. In ( [12] [Theorem 1.2.1]), it is shown that L(M, CS) = L(CS) holds. Therefore, our claim holds.
We now prove the converse.
Proof. Let G = (V, T, S, P) be a context-sensitive grammar in Penttonen normal form [21]. Subsequently, we construct the following matrix ins-del-sub systems to simulate G: The simulation of a production rule of the form AB → AC is carried out by the following matrix A production rule of the form A → BC is simulated by the matrix [(λ, A → X 1 , λ), (λ, X 2 , λ) ins , (λ, X 2 → B, X 1 ), (λ, X 1 → C, λ)], while a production rule of the form A → a is simulated by a matrix [(λ, A → a, λ)]. We make the following observation: all nonterminals X 1 , X 2 , which are introduced by a matrix m, are resolved at the end of m.
In the following paragraph, we prove the equality L(MID ς ) = L(G) by induction. We begin by proving that, if there are matrices m 1 , . . . , m n ∈ M ς with n ∈ N, such that S= ⇒ m 1 . . .= ⇒ m n w, then S ⇒ n G w holds. The base case, i.e., n = 0, is clear. We now consider the inductive step. Let S= ⇒ m 1 . . .= ⇒ m n w= ⇒ m n+1 w .
Let the matrix that is used in the derivation step w= ⇒ m n+1 w be a matrix of the form Subsequently, w = w 1 ABw 2 and w = w 1 ACw 2 follow. Consider an application of m n+1 . Clearly |w| {X 1 ,X 2 } = 0 due to our observation. Therefore, the substitution rule (λ, B → X 1 , λ) of m n+1 must be applied to a letter B whose left context is A. Otherwise, the substitution rule (λ, A → X 2 , X 1 ) of m n+1 (and m n+1 itself) cannot be applied. Hence, it is easy to see that w = w 1 ABw 2 and w = w 1 ACw 2 hold. Because of our induction hypothesis S ⇒ n G w = w 1 ABw 2 holds. Because of our construction, the existence of a matrix of the form m n+1 implies the existence of a production rule AB → AC. Clearly, is handled analogously (clearly, the insertion rule (λ, X 2 , λ) ins must insert X 2 left of X 1 , otherwise (λ, X 2 → B, X 1 ) cannot be applied), while the case m n+1 = [(λ, A → a, λ)] is obvious.
The converse follows analogously.
More specifically, with Lemmas 1, 5 and 6, we can state the following result.

Adding Appearance Checking
We have previously shown that there are even regular languages not included in MAT * INS 0,0 2 DEL 0,0 2 SUB 0,0 . We now show that expanding these systems with appearance checking yields computational completeness. More precisely, we show that the expanded systems can simulate type-0 grammars in Penttonen normal form, which means that all of the production rules are of the form where A, B, C are nonterminal symbols and a is a terminal symbol. Proof. Because the inclusion MAT ac * INS 0,0 2 DEL 0,0 2 SUB 0,0 ⊆ RE is clear, we now proceed to prove the converse by simulating a type-0 grammar G = (N, T, P, S) in Penttonen normal form. The matrix ins-del-sub systems with appearance checking simulating G is defined as MID ς,ac = (V, T, {$S}, M ς , F), with V = N ∪ T ∪ {X, X 1 , X 2 , $, $ } and (N ∪ T) ∩ {X, X 1 , X 2 , $, $ } = ∅. The nonterminal $ is an auxiliary symbol that marks the beginning of a sentential form, and it is eventually deleted by the matrix [(λ, $, λ) del ]. We now describe how the rules of G are simulated.
For every rule of G of the form A → a, we add a matrix [(λ, A → a, λ)] to M ς , and, for every rule A → λ, we add the matrix [(λ, A, λ) del ] to M ς . We remark that the following matrices that are introduced to M ς require the sentential form to have $ as the leftmost symbol. Hence, these rules cannot be applied if $ is absent. For every rule of the form AB → AC, to M ς , we add a matrix . . , (λ, A n X, λ) del } to F. These deletion rules are used to check whether the left context of the B, which is substituted by X, has been A. The basic idea is as follows: consider the application of the matrix above to $w, where w is a string over V \ {X, X 1 , X 2 , $, $ }. It is clear that the letter B, which is substituted by X, must have some symbol as its left context, i.e., this B cannot be the leftmost symbol. Furthermore, if the matrix above has been successfully applied, neither of the deletion rules in F has been applicable, as, otherwise, the substitution rule (λ, X → C, λ) could not have been applied. Therefore, the application of these deletion rules has been skipped during the processing of the matrix above. Because the letter B (which is eventually substituted by X) must have some left context, but neither of the deletion rules from F has been applicable, this means that the left context of this B could not have been a letter from {A 1 , . . . , A n } = V \ {A}. Hence, the left context of this B has been A. Therefore, it is easy to see that the matrix above correctly simulates AB → AC.
We remark that applying any of the matrices, which simulate a production rule of G, to a sentential form of MID ς,ac , whose leftmost symbol is $, results in a string whose leftmost symbol remains $. By induction, we can show that S ⇒ * G w iff $S= ⇒ ac * $w.
Additionally, we can show that MAT ac * INS 0,0 1 DEL 1,0 1 SUB 0,0 is also computationally complete. Proof. The idea is the same as in Theorem 9. Replacing all the deletion rules of the form (λ, ab, λ) in the matrices of the system constructed in Theorem 9 with deletion rules of the form (a, b, λ) yields our claim.
This result shows that appearance checking is indeed powerful, as we have previously seen that MAT * INS 0,0 1 DEL 1,0 1 SUB 0,0 does not even include all regular languages. Furthermore, we can conclude the following from this result.
Proof. All of the substitution rules occurring in matrices of the construction in Theorem 9 can be replaced, as specified in the proof of Theorem 6.
Additionally, the following can be derived from [22], based on ideas on P systems. This is interesting, as P systems (also known as membrane systems, introduced in [23]) are another formalization of a bio-computing device, when considering a cell (abstractly viewed as a membrane structure) as a computing mechanism.
An insertion-deletion P system is a construct Π = (O, T, µ, M 1 , . . . , M n , µ is the tree structure of the system which has n nodes, • M i , 1 ≤ i ≤ n, is a finite language associated to the membrane i, and • R i , 1 ≤ i ≤ n, is a set of insertion and deletion rules with target indicators of the node i. The rules are of the form: (u, x, v; tar), where (u, x, v) is an insertion rule or a deletion rule, and tar ∈ {here, in j , out | 1 ≤ j ≤ n}. A configuration of Π is an n-tuple (N 1 , . . . , N n ) of finite languages over O. The transition between two configurations consists of applying rules in parallel to all possible strings, non-deterministically with respect to the target indications associated with the rules. A sequence of transitions between configurations of a given insertion-deletion P system Π starting from the initial configuration is called a computation with respect to Π. The result of Π's computations is collected in the language L(Π) that consists of all strings over T that are sent out of the root node during its computations.
An insertion-deletion P system has the priority of deletion rules over insertion rules if a rule (u 1 , x 1 , v 1 ; tar 1 ), where (u 1 , x 1 , v 1 ) is an insertion rule, is only allowed to be applied if no rule (u 2 , x 2 , v 2 ; tar 2 ), where (u 2 , x 2 , v 2 ) is a deletion rule, is applicable.
Proof. Consider an insertion-deletion P system with the priority of deletion rules over insertion rules. It is known that such systems with insertion rules of size (1, 0, 1) and deletion rules of size (1, 0, 0) are computationally complete, see [22]. We now show that such an insertion-deletion P system with a priority of deletion rules over insertion rules can be simulated by an matrix ins-del system with the appearance checking of size ( * ; 1, 0, 1; 1, 0, 0). Let Π = (O, T, µ, M 1 , . . . , M n , R 1 , . . . , R n ) be such a system. An equivalent matrix insdel system with appearance checking MID ς = (O ∪ {1, . . . , n} ∪ {X}, T, A, M ς , F), where X is a trap symbol, is constructed, as follows. We define A = {wi | w ∈ M i }. For every deletion rule (λ, x, λ; tar) ∈ R i , we add a matrix [(λ, i, λ) del , (λ, x, λ), (λ, k, λ) ins ] to M ς , where k = i if tar = here, k = j if tar = in j and k = i if tar = out and i is the parent node of i. Furthermore, if the node i has no parent node and if the deletion rule (λ, x, λ; out) is in R i , then we add the matrix [(λ, i, λ) del , (λ, x, λ)] to M ς .
We remark that deletion rules are essential for computational completeness results, as we have shown that, even with the addition of substitution rules, systems that do not have deletion rules lack computational power beyond CS. Furthermore, we have shown that INS 0,0 n DEL 0,0 m = INS 0,0 n DEL 0,0 m SUB 0,0 , i.e., ins-del-sub systems require some context information; otherwise, substitution rules do not offer any benefits in regards to the size complexity of those systems.
We have shown that matrix ins-del systems do not need much context to reach computational completeness if substitution rules and appearance checking are used. For instance, we have shown that, in this setting, no context other than single symbol context for deletion rules is necessary for computational completeness. In the case of no context other than single symbol context for insertion rules, we have shown that appearance checking is a necessary and sufficient feature for ensuring computational completeness. Notice that the term "appearance checking" might, indeed, be a bit misleading, as it usually applies to checking the non-applicability of certain rules. Because we (always) test for the absence of single symbols, it might be better to formalize or interpret our results in the context of (forbidden) random context [25][26][27][28][29], or, more generally speaking, to semiconditional (matrix) grammars. So-called generalized forbidding matrix grammars (GFM) have been presented at ICMC 2020 (the corresponding paper will appear in the Springer LNCS volume 12687 of CMC21) and cover the idea of associating tests for the absence of symbols (or even words) as a filter prior to applying a matrix. The mentioned paper uses (more traditional) context-free rules within the matrices. In a sense, the results of our paper could also be seen as initializing the study of generalized forbidding matrix ins-del-sub systems. Subsequently, also studies on random context with respect to ins-del systems are also interesting for comparison, see [30,31]. Another interpretation could be in terms of ordered variants of matrix ins-del-sub systems, a topic so far explored only in connection with context-free grammars [32], because ordered grammars and forbidden context grammars mostly coincide, also see [12,33,34]. To provide an overview, our completeness results are collected in the following table.

Normal Matrix Extended with Substitution ins-del Systems and Appearance Checking
Size Family Size Family ac? The incompleteness reults for normal matrix ins-del systems follow from [10] and Theorem 1, Theorems 8, Theorem 5, and Corollary 2.
Although matrix ins-del systems extended with substitution rules are powerful devices, this extension is not always sufficient for reaching computational completeness. More precisely, our incompleteness results have been thus as follows.

Size Family
( * ; 1, 1, 0; 1, 0, 0; 0, 0) = L(M, CF) RE ( * ; 1, 0, 0; 1, 1, 0; 0, 0) RE ( * ; 2, 0, 0; 2, 0, 0; 0, 0) RE Coming back to the original motivation of our study, that of exploring the limits of computability with bio-computing devices, it is not that clear whether appearance checks (which always mean that the whole, say, RNA string has to be checked for possible sites where a rule might apply) can be efficiently implemented in real bio-computing mechanisms. Yet, they may serve as a yardstick, showing what is possible and also indicating in which way one might want to modify the formalisms in order to achieve computational completeness with smaller context dependencies for the different types of rules. Conversely, we suppose that, if bio-computing devices based on processing RNA or other large molecules leaves the experimental laboratories one day, turning into an industrial-strength technology, then computational completeness results, as presented in this paper on matrix ins-del-sub systems, as well as in previous papers for ins-del-sub systems, may serve as a basis of implementing compilers and so forth to master these future machines.