A New Class of Graph Grammars and Modelling of Certain Biological Structures

Vijayakumar, Jayakrishna; Mathew, Lisa; Nagar, Atulya K.

doi:10.3390/sym15020349


by Jayakrishna Vijayakumar ^1,2,†, Lisa Mathew ^3,*,† and Atulya K. Nagar ^4,†

¹ Department of Computer Science and Engineering, Amal Jyothi College of Engineering, Kanjirappally 686 518, Kerala, India

² Research Scholar, APJ Abdul Kalam Technological University, Thiruvananthapuram 695 016, Kerala, India

³ Department of Basic Sciences, Amal Jyothi College of Engineering, Kanjirappally 686 518, Kerala, India

⁴ School of Mathematics, Computer Science and Engineering, Liverpool Hope University, Liverpool L16 9JD, UK

^* Author to whom correspondence should be addressed.

^† These authors contributed equally to this work.

Symmetry 2023, 15(2), 349; https://doi.org/10.3390/sym15020349

Submission received: 28 December 2022 / Revised: 18 January 2023 / Accepted: 20 January 2023 / Published: 27 January 2023

(This article belongs to the Special Issue Graph Theory and Its Applications)


Abstract: Graph grammars can be used to model the development of diverse graph families. Since their creation in the late 1960s, graph grammars have found usage in a variety of fields, such as the design of sophisticated computer systems and electronic circuits, as well as visual languages, computer animation, and even the modelling of intricate molecular structures. Edge replacement and node replacement are the two primary approaches to graph rewriting. In this paper we introduce a new type of node replacement graph grammar known as the nc-eNCE graph grammar. With this new class of graph grammars we generated certain graph classes, and we showed that this class of graph grammars is more powerful than the existing edge and node controlled embedding graph grammars. In addition, these graph grammars were used to model several common protein secondary structures such as parallel and anti-parallel β-sheet structures in different configurations. The use of these graph grammars in modelling other bio-chemical structures and their interactions remains to be explored.

Keywords:

graph grammars; non-confluence; connection instructions; regular control; protein secondary structures; generative power; node replacement; node rewriting

1. Introduction

Graphs offer a natural way to represent a collection of items as nodes and their interactions as edges. Numerous fields of study and practical applications have benefited from graph theory. Problems in combinatorial optimization [1], which involve selecting the best option from a finite set of options, have been studied using graph theory. Various applications, including routing in communication networks [2], transportation systems, and geographic information systems, use shortest path algorithms in graphs. Graph coloring has applications in scheduling, register allocation, and resource allocation, among others. Graph theory is also used in the study of network reliability, which measures the ability of a communication network to function correctly even in the presence of failures or disruptions. A network can be modeled as a graph, and graph-theoretic techniques can be used to analyze its robustness and identify potential points of failure.

In biology, graphs can be used to represent the relationships between different biological entities, such as genes, proteins, and metabolic pathways [3]. One of the major applications of graph theory in biology is the study of protein-protein interaction networks [4], which are networks of proteins that interact with each other in the cell. These networks can be represented as graphs, with the vertices representing the proteins and the edges representing the interactions between the proteins. Graph-theoretic techniques have been used to study the topology and structure of protein-protein interaction networks and to identify key proteins that play important roles in the network.

Another application of graph theory in biology is the study of metabolic networks, which are networks of chemical reactions that take place in the cell. These networks can also be represented as graphs, with the vertices representing the chemical compounds and the edges representing the reactions between the compounds. Graph-theoretic techniques have been used to study the structure and function of metabolic networks and to identify key pathways that are involved in the metabolism of the cell. In addition, graph theory has been used to study the structure and function of other types of biological networks, such as gene regulatory networks [5], neural networks, and social networks.

Generation of families of graphs [6,7,8] has been studied extensively. Graph grammars, which were introduced in the 1960s, have attracted the interest of researchers because of their possible applications in disciplines that range from program design to the modelling of biochemical interactions. They are used to formalise the idea of a set of graphs that can be specified recursively. The technique of transforming a graph is known as graph rewriting. It starts with a graph G (called the host graph), from which a subgraph S is removed, and then embeds a graph E into the remaining portion R of G. Graph rewriting is usually performed by the recursive replacement of either a node or an edge. We develop non-confluent edge and node label controlled embedding to generate new graphs. This class of graph grammars has an enhanced generative capacity compared to ordinary node-replacement graph grammars.

A protein, on the other hand, is a complex organic compound made up of amino acids linked by chemical bonds [9,10]. The linear amino acid sequence that makes up a protein’s primary structure can be represented as a string over an alphabet of 20 letters. Secondary structures in proteins are created by the local orientation of amino acids along the chain. These structures are stabilised by hydrogen bonds among the amino and keto groups of the peptide bonds. Protein secondary structures are divided into four categories: α-helix, β-sheet, turns, and coils.

Graph grammar formalisms have many areas of application. Recently, Guo et al. [11] studied graph grammars for molecular generation. Here we study non-confluent eNCE graph grammars for modelling biological structures. Section 2 covers the essential definitions involving graph grammars. Section 3 introduces the notion of non-confluent eNCE ($nc$-$eNCE$) graph grammars and some variants of this class of graph grammars. Section 4 shows how these grammars generate certain classic graph classes, and Section 5 compares their generative power with that of $eNCE$ graph grammars. Section 6 addresses the application of the constructs discussed earlier in the simulation and structural analysis of some standard biological structures. There we demonstrate how parallel and anti-parallel β-sheet structures are generated using our new grammar.

2. Graph Grammars

In general, a graph grammar [7,12] has a set of rules of the form $p (M, D, E)$, which are used to recursively transform a host graph H (initially H is the start graph $G_{S}$). Here M (known as the mother graph) is a subgraph of H which is to be replaced by another graph D (known as the daughter graph). Because the edges incident to the vertices of M are lost, new edges are established with designated nodes of D using an appropriate embedding mechanism E. As a result a new family of graphs is generated.

Graph grammars have been applied to the modelling of numerous biological structures. They have been used to model the structure and dynamics of proteins and to forecast protein folding patterns. They have also been used to model the structure and operation of genetic networks, including the control of gene expression; of metabolic pathways, including the movement of metabolites through a cell; and of cell signalling pathways, which are crucial in cell communication.

A bipartite graph can also be used to represent a graph grammar. The edges in the bipartite graph represent the production rules of the grammar, connecting a nonterminal symbol on the left of a rule to the terminal symbols on the right that it can be expanded into. In this representation, a graph can be generated by starting with a single nonterminal symbol and repeatedly expanding it according to the production rules until only terminal symbols remain. This process is called graph rewriting, and the resulting graph is called a derived graph.

Embedding styles may be categorized into two types, namely, connecting and gluing. In a graph grammar based on gluing [8,13,14,15], certain nodes of $H^{'}$ ($H^{'} = H - M$) and D are fused. In addition, some edges in H (which were initially incident on M) and D are fused.

In the connecting approach [6], new edges are introduced between the daughter graph D and some specific nodes of $H^{'}$ (which were initially adjacent to nodes in M).

Node Replacement Graph Grammars

When M is a single node and the connection instructions are defined independently for each production rule, the graph grammar is termed a node replacement graph grammar [16].

A graph grammar is said to be a node label controlled ($NLC$) [17,18] graph grammar if the nodes that participate in the embedding process are specified using labels. When a production rule is applied, a node labelled by the non-terminal A is removed and a graph D is embedded in its place. This embedding is done using any valid connection instruction of the form ($α, β$). As a result of applying this connection instruction, undirected edges labelled β are established between some node of D and any neighbor of A in H which had an edge labelled α incident on it. Formally we have the definition:

Definition 1 ([17]). A construct $n G = (Σ, Γ, P, C, G_{S})$ is known as an $NLC$ graph grammar where

Σ is an alphabet used to label nodes,
Γ is a collection of terminal symbols in Σ,
A production rule in P, $p : A \to D$ acts on the mother node labelled A,
C is a collection of embedding instructions in $Σ \times Σ$ ,
$G_{S}$ is the initial graph.

We say $G \overset{p}{\Rightarrow} G^{'}$ or $G \Rightarrow G^{'}$ if an application of p to the graph G yields $G^{'}$. Furthermore we can write $G \overset{*}{\Rightarrow} H$ if $G \Rightarrow G_{1} \Rightarrow G_{2} \Rightarrow \dots \Rightarrow G_{n} = H$. This grammar [19] generates the language

$L (n G) = \{G \in G_{Γ} \mid G_{S} \overset{*}{\Rightarrow} G\}$

where all nodes of graphs in $G_{Γ}$ are labelled using Γ.

In a neighbourhood controlled embedding ($NCE$) graph grammar [6], $_{c} G = (Σ, Γ, P, S)$, each production rule $p : A \to (D, C)$ has its own set C of connection instructions, unlike the construct in Definition 1. Here $C \subseteq Σ \times N_{D}$, where $N_{D}$ is the collection of nodes in D.

Another extension of the $NLC$ embedding [17] is the $eNCE$ or edge and node controlled embedding [6], which uses both edge and node labels to specify the new edges added during the embedding process. Formally, we have the definition:

Definition 2 ([20]). A construct $_{e} G = (Σ, Δ, Γ, Ω, P, G_{S})$ is known as an $eNCE$ graph grammar where

Σ and Γ are sets of symbols used to label nodes and edges respectively,
Δ and Ω are the collections of terminal symbols in Σ and Γ respectively,
A production rule in P, $p : A \to (D, C)$ acting on the mother node M with label A has a collection C of connection instructions $(a, p ∣ q, B)$ associated with it. Here B is a node in D and x with label a is one of the neighbors of M. The edge p which connected x and M is removed and an edge q is established between x and B.
$G_{S}$ is the initial graph.

The graph grammar $_{e} G$ generates the language [19]

$L (_{e} G) = \{G \in G_{Δ} \mid G_{S} \overset{*}{\Rightarrow} G\}$

where all the nodes of graphs in $G_{Δ}$ are labelled using Δ. Here $G_{S} \overset{*}{\Rightarrow} G$ is defined as in Definition 1.
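A single rewriting step of Definition 2 can be sketched in Python as follows. This is only an illustrative sketch under our own assumptions: the graph encoding (integer node identifiers, undirected edges stored as `frozenset` pairs), the function name `apply_production`, and the small host graph are hypothetical and do not come from the paper. Each connection instruction $(a, p ∣ q, B)$ is written as the tuple `(a, p, q, B)`.

```python
from itertools import count

def apply_production(labels, edges, mother, d_labels, d_edges, instructions):
    """Replace node `mother` by a daughter graph and re-embed it.

    labels: dict node -> node label; edges: dict frozenset({u, v}) -> edge label.
    instructions: iterable of (a, p, q, B) meaning: for a neighbour x labelled a
    that was joined to the mother node by an edge labelled p, add an edge
    labelled q between x and node B of the daughter graph.
    """
    fresh = count(max(labels) + 1)                 # unused integer node ids
    rename = {b: next(fresh) for b in d_labels}    # daughter node -> host id
    # Record the edges that will be lost when the mother node is removed.
    incident = []
    for e, lab in edges.items():
        if mother in e:
            (x,) = e - {mother}
            incident.append((x, lab))
    new_labels = {n: l for n, l in labels.items() if n != mother}
    new_edges = {e: l for e, l in edges.items() if mother not in e}
    # Copy the daughter graph into the host graph.
    for b, l in d_labels.items():
        new_labels[rename[b]] = l
    for e, l in d_edges.items():
        u, v = tuple(e)
        new_edges[frozenset((rename[u], rename[v]))] = l
    # Apply the connection instructions to the former neighbours.
    for x, old in incident:
        for a, p, q, b in instructions:
            if labels[x] == a and old == p:
                new_edges[frozenset((x, rename[b]))] = q
    return new_labels, new_edges

# Hypothetical host graph a -α- A -α- b; node 1 (label A) is rewritten.
labels = {0: "a", 1: "A", 2: "b"}
edges = {frozenset((0, 1)): "α", frozenset((1, 2)): "α"}
d_labels = {0: "c", 1: "A"}
d_edges = {frozenset((0, 1)): "α"}
C = [("a", "α", "β", 0), ("b", "α", "α", 1)]
new_labels, new_edges = apply_production(labels, edges, 1, d_labels, d_edges, C)
```

In this toy run the neighbour labelled a is reconnected to the new node labelled c by a β edge, while the neighbour labelled b is reconnected to the new non-terminal node by an α edge.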

3. Non-Confluent Edge and Node Controlled Embedding ($nc$-$eNCE$) Graph Grammar

In order to introduce a measure of determinism in the intrinsically non-deterministic concept of a graph grammar, we restrict the sequence of production rules used and thereby obtain a new class of graph grammar. Formally we have the definition:

Definition 3 ([21]). A construct $n c G = (Σ, Δ, Γ, Ω, P, G_{S}, R (P))$ is known as an $nc$-$eNCE$ graph grammar where

Σ and Γ are sets of symbols used to label nodes and edges respectively,
Δ and Ω are the collections of terminal symbols in Σ and Γ respectively,
A production rule in P, $p : A \to (D, C)$ acting on the mother node M with label A has a collection C of connection instructions $(a, p ∣ q, B)$ associated with it. Here x with label a is a neighbor of M and B is a node in D. The edge p which connected x and M is removed and a new edge q is established between x and B.
$G_{S}$ is the initial graph,
The regular control, $R (P)$ , regulates the sequence of application of the production rules.

The graph grammar $n c G$ generates the language

$L (n c G) = \{G \in G_{Δ} \mid G_{S} \overset{R (P)}{\Rightarrow} G\}$

Here, all the nodes of graphs in $G_{Δ}$ are labelled using Δ and $G_{S} \overset{*}{\Rightarrow} G$ is defined as in Definition 1. The ordered application of productions in the sequence $p_{1}, p_{2}, \dots, p_{n}$ (with $p_{1} p_{2} \dots p_{n} \in L (R (P))$) leads to the generation of the graphs specified by the language of the grammar.
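The role of the regular control can be illustrated with a small membership test: a sequence of productions is admissible only if it spells a word of $L (R (P))$. The sketch below uses Python's standard `re` module and an illustrative control expression $(p_{1} + p_{2})^{*} p_{3}$; the token encoding `<p1>` and the function name `allowed` are our own assumptions, not part of the formal definition.

```python
import re

def allowed(sequence, control):
    """Return True if the production sequence is a word of the control language.

    Each production name is wrapped in angle brackets so that, for example,
    p1 and p10 can never be confused when the names are concatenated.
    """
    word = "".join(f"<{p}>" for p in sequence)
    return re.fullmatch(control, word) is not None

# Illustrative control R(P) = (p1 + p2)* p3, where "+" denotes union.
CONTROL = r"(?:<p1>|<p2>)*<p3>"

ok = allowed(["p1", "p2", "p1", "p3"], CONTROL)       # ends with p3
too_early = allowed(["p3", "p1"], CONTROL)            # p1 after p3 is rejected
```

A derivation engine would consult such a check before (or while) applying productions, so that only controlled derivations contribute graphs to the language.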

3.1. $nc$-$eNCE$ Graph Grammar with Deletion ($dnc$-$eNCE$)

In order to improve its generative capability, we introduce here another version of the $nc$-$eNCE$ graph grammar. In this version some special productions are used simply to delete a node and establish edges between its neighbors. Formally, we have the definition:

Definition 4. A construct $d n c G = (Σ, Δ, Γ, Ω, P, G_{S}, R (P))$ is known as a $dnc$-$eNCE$ graph grammar ($nc$-$eNCE$ graph grammar with deletion) where

Σ is an alphabet used to label nodes,
$Δ \subset Σ$ is a collection of terminal symbols,
Γ is an edge labelling alphabet,
Ω is the collection of edge labels of the final graph [22],
P contains, in addition to the productions defined in Definition 2, rules of the form $A \to ε$, where M labelled A is the node to be deleted. The connection instruction for this rule has the format $((a, x), z, (y, b))$ . In this rule the edges labelled x and y connecting M to a and b respectively are removed, and a new edge with label z is established between the two nodes.
$G_{S}$ is the initial graph.
The regular control $R (P)$ regulates the sequence of application of the production rules.

The graph grammar $d n c G$ generates the language

$L (d n c G) = \{G \in G_{Δ} \mid G_{S} \overset{R (P)}{\Rightarrow} G\}$

where all the nodes of graphs in $G_{Δ}$ are labelled using Δ. The following example shows the generation of parse trees with yield $\{w w^{r} w w^{r} \mid w \in {(a + b)}^{*}\}$ using a $dnc$-$eNCE$ graph grammar $d n c G$.

Example 1. Consider $d n c G = (Σ, Δ, Γ, Ω, P, G_{S}, R (P))$ with $Σ = \{S, r, u, a, b\}$, $Δ = \{r, u, a, b\}$, $Γ = \{α\}$, $Ω = \{α\}$, $P = \{p_{1}, p_{2}, p_{3}\}$, $G_{S}$ consisting of a single node labelled S, and $R (P) = {(p_{1} + p_{2})}^{*} p_{3}$. Figure 1 depicts the production rules that generate parse trees with the yield $w w^{r} w w^{r}$, $w \in {(a + b)}^{*}$, and Figure 2 demonstrates the application of these rules when $w = a b$.
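The deletion rule $A \to ε$ of Definition 4 can be sketched in the same spirit. The code below is a minimal illustration under our own assumptions (integer node identifiers, undirected edges stored as `frozenset` pairs, and the hypothetical function name `delete_node`); it is not taken from the paper. A connection instruction $((a, x), z, (y, b))$ joins a neighbour labelled a (formerly attached by an x edge) to a neighbour labelled b (formerly attached by a y edge) with a new z edge.

```python
def delete_node(labels, edges, mother, instructions):
    """Delete `mother` and reconnect its neighbours.

    labels: dict node -> node label; edges: dict frozenset({u, v}) -> edge label.
    instructions: iterable of ((a, x), z, (y, b)) as in Definition 4.
    """
    # Neighbours of the deleted node together with the label of the lost edge.
    incident = [
        (next(iter(e - {mother})), lab)
        for e, lab in edges.items()
        if mother in e
    ]
    new_labels = {n: l for n, l in labels.items() if n != mother}
    new_edges = {e: l for e, l in edges.items() if mother not in e}
    for (a, x), z, (y, b) in instructions:
        for u, lab_u in incident:
            for v, lab_v in incident:
                if (u != v and labels[u] == a and lab_u == x
                        and labels[v] == b and lab_v == y):
                    new_edges[frozenset((u, v))] = z
    return new_labels, new_edges

# Hypothetical path a -α- A -α- b; deleting A leaves a single edge a -α- b.
labels = {0: "a", 1: "A", 2: "b"}
edges = {frozenset((0, 1)): "α", frozenset((1, 2)): "α"}
rule = [(("a", "α"), "α", ("α", "b"))]
del_labels, del_edges = delete_node(labels, edges, 1, rule)
```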

3.2. Non-Confluent $eNCE$ Graph Grammar with $ψ$ Labelled Edges ($ψnc$-$eNCE$)

Another version of the $nc$-$eNCE$ graph grammar introduces special edges labelled ψ. Formally, we have the following definition.

Definition 5. A construct $ψ n c G = (Σ, Δ, Γ, Ω, P, G_{S}, R (P))$ is known as a non-confluent $eNCE$ graph grammar with ψ labelled edges or simply a $ψnc$-$eNCE$ graph grammar where

Σ is an alphabet used to label nodes,
$Δ \subset Σ$ is a collection of terminal symbols,
$Γ \cup \{ψ\}$ is the edge labelling alphabet,
Ω is the collection of edge labels of the final graph,
P contains rules similar to those in Definition 2. In addition we introduce a special edge label ψ which acts as follows: While concatenating two graphs together or embedding a daughter graph in a mother graph, this edge can be introduced between two nodes with terminal labels. The connection instruction associated with this edge has the format $(x, ψ, y)$ . This label can be bypassed or ignored while specifying the graph language.
$G_{S}$ is the initial graph.
The regular control $R (P)$ regulates the sequence of application of the production rules.

The graph grammar $ψ n c G$ generates the language

$L (ψ n c G) = \{G \in G_{Δ} \mid G_{S} \overset{R (P)}{\Rightarrow} G\}$

where all the nodes of graphs in $G_{Δ}$ are labelled using Δ. Edges with label ψ can be used with $nc$-$eNCE$ graph grammars as well as $dnc$-$eNCE$ graph grammars for connecting two or more graphs without loss of generality.

Example 2. Consider the graph grammar $ψ n c G = (Σ, Δ, Γ, Ω, P, G_{S}, R (P))$ with $Σ = \{S, t, \#, u, v, w\}$, $Δ = \{t, u, v, w, a, b\}$, $Γ = \{α\}$, $Ω = \{α\}$, $P = \{p_{1}, p_{2}, p_{3}, p_{4}\}$, $G_{S}$ the graph shown in Figure 3, and $R (P) = {((p_{1} + p_{2}) p_{3})}^{*} p_{4}$. Figure 4 depicts the production rules that generate a parse tree with the yield $w t w^{r} t w t w^{r}$, $w \in {(a + b)}^{*}$. Figure 5 shows the application of these rules when $w = a b$.

The graph grammar in Example 2 can generate parse trees with yield of the form $\{w t w^{r} t w t w^{r} \mid w \in {(a + b)}^{*}\}$. Figure 5 shows the derivation of the tree yielding the string $a b t b a t a b t b a$, which is obtained when $w = a b$.
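The yield language of Example 2 is easy to check at the string level. The following sketch, with hypothetical helper names `example2_yield` and `has_example2_form`, builds the yield $w t w^{r} t w t w^{r}$ for a given w and tests whether an arbitrary string has that form. It only illustrates the target strings and does not simulate the productions of Figure 4.

```python
def example2_yield(w):
    """Return the yield w t w^r t w t w^r for a word w over {a, b}."""
    r = w[::-1]                      # w^r, the reversal of w
    return "t".join([w, r, w, r])

def has_example2_form(s):
    """Check whether s equals w t w^r t w t w^r for some w over {a, b}."""
    parts = s.split("t")             # t never occurs inside w
    if len(parts) != 4:
        return False
    w, r1, w2, r2 = parts
    return set(w) <= {"a", "b"} and r1 == w[::-1] and w2 == w and r2 == r1

sample = example2_yield("ab")        # the string derived in Figure 5
```

For $w = a b$ the function returns the string a b t b a t a b t b a (written without spaces), which is the yield stated in the text.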

4. Generation of Certain Graph Classes

The class of $nc$-$eNCE$ graph grammars is capable of generating several graph classes. Some of these graph classes are dealt with in the following results.

4.1. Wheel Graphs

Definition 6 ([23]). A wheel graph $W_{n}$ is a graph that consists of a central node, called the hub, and several spoke nodes that are connected to the hub. The spoke nodes are connected to the hub by edges, and to each other by a rim or cycle of edges. The number of spoke nodes in a wheel graph is referred to as its order. A wheel graph with n spoke nodes is also known as a wheel graph of order n.

Lemma 1. The class of wheel graphs can be generated by an $nc$-$eNCE$ graph grammar.

Proof. Consider $n c G_{W} = (Σ, Δ, Γ, Ω, P, G_{S}, R (P))$ with $Σ = \{W, E, a, c, s, e\}$, $Δ = \{a, c, s, e\}$, $Γ = \{α\}$, $Ω = \{α\}$, $P = \{p_{1}, p_{2}, p_{3}\}$, $G_{S}$ consisting of a single node labelled W, and $R (P) = p_{1} p_{2}^{*} p_{3}$. The productions in P are depicted in Figure 6. □

Figure 7 shows the generation of the wheel graph $W_{6}$ using the grammar $n c G_{W}$.
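For comparison with the graphs derived by $n c G_{W}$, the wheel graph of Definition 6 can be constructed directly. The sketch below is an assumption-laden illustration: it builds $W_{n}$ from the definition (hub plus rim), not from the productions of Figure 6, and the node numbering and the function name `wheel_graph` are ours.

```python
def wheel_graph(n):
    """Return the edge set of the wheel graph with n spoke nodes.

    Node 0 is the hub and nodes 1..n are the spoke nodes on the rim.
    """
    if n < 3:
        raise ValueError("a rim cycle needs at least three spoke nodes")
    spokes = {frozenset((0, i)) for i in range(1, n + 1)}
    rim = {frozenset((i, i % n + 1)) for i in range(1, n + 1)}  # n wraps to 1
    return spokes | rim

def degree(edge_set, node):
    """Number of edges of edge_set incident on node."""
    return sum(1 for e in edge_set if node in e)

w6 = wheel_graph(6)
```

With n spoke nodes the construction gives n spokes and n rim edges, so a graph produced by the grammar for the same n can be compared against this edge count and against the degree pattern (hub of degree n, spoke nodes of degree 3).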

4.2. Complete Bipartite Graphs

Definition 7 ([23]). A bipartite graph is a graph in which the vertex set can be decomposed into two disjoint sets such that no two vertices within the same set are adjacent. A complete bipartite graph is a bipartite graph such that every vertex in the first set is connected to each vertex of the second set.

Lemma 2. The class of complete bipartite graphs can be generated by an $nc$-$eNCE$ graph grammar.

Proof. Let $n c C B = (Σ, Δ, Γ, Ω, P, S = A, R (P))$ be an $nc$-$eNCE$ graph grammar with $Σ = \{A, a, b\}$, $Δ = \{a, b\}$, $Γ = \{α, β\}$, $Ω = \{α\}$, $P = \{p_{0}, p_{1}, p_{2}, p_{3}, p_{4}, p_{5}\}$ and $R (P) = (p_{0} p_{1}^{*} p_{2} p_{3}^{*} p_{4}) + p_{5}$. Figure 8 shows the production rules and the associated connection instructions for generating complete bipartite graphs. □
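The target graphs of Lemma 2 can be described by a short reference construction. The sketch below builds a complete bipartite graph directly from Definition 7 and checks the defining property; it does not simulate the productions of Figure 8. The use of the terminal labels a and b for the two sides is our assumption, as are the function names.

```python
def complete_bipartite(m, n):
    """Return (left, right, edges) for the complete bipartite graph K_{m,n}."""
    left = [("a", i) for i in range(m)]     # first vertex set, labelled a
    right = [("b", j) for j in range(n)]    # second vertex set, labelled b
    edges = {frozenset((u, v)) for u in left for v in right}
    return left, right, edges

def is_complete_bipartite(left, right, edges):
    """Check Definition 7: all cross edges present, no edge inside a side."""
    cross = {frozenset((u, v)) for u in left for v in right}
    return edges == cross

left, right, k23 = complete_bipartite(2, 3)
```

Because every admissible edge joins the two sides, the graph with m and n vertices on its sides has exactly m·n edges, which gives a simple sanity check for graphs derived by the grammar.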

4.3. Binary Tree

Definition 8 ([23]). A rooted tree is a connected acyclic graph which has a special vertex called the root. A binary tree is a rooted tree in which each vertex has at most two children.

Lemma 3. The class of binary trees can be generated by an $nc$-$eNCE$ graph grammar.

Proof. Let $n c B T = (Σ, Δ, Γ, Ω, P, S = A, R (P))$ be an $nc$-$eNCE$ graph grammar with $Σ = \{A, a\}$, $Δ = \{a\}$, $Γ = \{α\}$, $Ω = \{α\}$, $P = \{p_{1}, p_{2}, p_{3}\}$ and $R (P) = {(p_{1} + p_{2})}^{*} p_{3}^{+}$. Figure 9 shows the production rules and the associated connection instructions for generating binary trees. □
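Definition 8 translates into a simple check that can be run on any graph produced by a grammar such as $n c B T$. The sketch below assumes the tree is given as a hypothetical `children` dictionary mapping each node to the list of its children; this encoding and the function name are ours, not the paper's.

```python
def is_binary_tree(children, root):
    """Check that `children` describes a rooted tree with at most 2 children per node."""
    all_nodes = set(children) | {c for kids in children.values() for c in kids}
    all_nodes.add(root)
    seen = set()
    stack = [root]
    while stack:
        node = stack.pop()
        if node in seen:            # reached twice: a cycle or a shared child
            return False
        seen.add(node)
        kids = children.get(node, [])
        if len(kids) > 2:           # violates the binary condition
            return False
        stack.extend(kids)
    return seen == all_nodes        # every node must be reachable from the root

small_tree = {0: [1, 2], 1: [3], 2: [], 3: []}
```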

Other classic graph classes which can be generated using $nc$-$eNCE$ graph grammars include complete graphs, caterpillar graphs and star graphs. In fact, $nc$-$eNCE$ graph grammars provide finite tools for the generation of such classic graph classes.

5. Generative Power of $nc$-$eNCE$ Graph Grammars

Theorem 1. Let $eNCEGG$ be the class of all the Edge and Node Controlled Embedding Graph Grammars and $nc$-$eNCEGG$ be the class of all the Non-Confluent Edge and Node Controlled Embedding Graph Grammars. Then $L (eNCEGG) \subset L (nc$-$eNCEGG)$.

Proof. Let $_{e} G = (Σ, Δ, Γ, Ω, P, G_{S}) \in eNCEGG$ with $P = \{p_{1}, p_{2}, p_{3}, \dots, p_{m}\}$ and let $A = \{W \in P^{*} \mid G_{S} \overset{W}{\Rightarrow} G \in L (_{e} G)\}$. Then we can construct an $nc$-$eNCEGG$, $n c G = (Σ, Δ, Γ, Ω, P, G_{S}, R (P))$, where all the components except $R (P)$ are obtained from $_{e} G$ and $R (P)$ is a regular expression corresponding to the set A. □

Theorem 2. $L (nc$-$eNCEGG) - L (eNCEGG) \neq \emptyset$

Proof. Consider a graph language consisting of graphs of the form shown in Figure 10. These graphs can be interpreted as follows: each graph contains a series of n hanging triangles T, followed by a series of $(2 n + 1)$ rhombuses R glued together, followed again by a series of n hanging triangles T. This graph language can be represented using the string language $L = \{T^{n} R^{2 n + 1} T^{n} \mid n \geq 0\}$. It is known that the language L is not context free. Based on this feature it can be shown that a graph language of this form cannot be generated using an edge and node controlled embedding graph grammar. The $eNCEGG$ production rules can be applied in any order, and a single node replacement happens when a rule is applied. Hence any grammar of this type will generate certain graphs which are not in the required form. We now show that this graph language can be generated using a non-confluent edge and node controlled embedding graph grammar. Consider an $nc$-$eNCEGG$ $n c G_{d i f f} = (Σ, Δ, Γ, Ω, P, G_{S}, R (P))$ with $Σ = \{A, C, X, Y, a, b\}$, $Δ = \{a, b\}$, $Γ = \{α\}$, $Ω = \{α\}$, $P = \{p_{1}, p_{2}, p_{3}, p_{4}, p_{5}, p_{6}, p_{7}, p_{8}\}$, $G_{S}$ the graph in Figure 11, and $R (P) = {(p_{1} p_{2} p_{3} p_{4})}^{*} (p_{5} p_{6} p_{7} p_{8})$. Figure 12 shows the production rules and the associated connection instructions for generating the pattern of the form $T^{n} R^{2 n + 1} T^{n}$, $n \geq 0$. Figure 13 shows the derivation of the pattern $T^{n} R^{2 n + 1} T^{n}$ for $n = 2$. □
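The string language used in the proof and the shape of the control words can both be made concrete. The sketch below uses Python's standard `re` module; the function names `in_L` and `control_word` are hypothetical, and the reading that k repetitions of the starred block precede the final block is simply the k-th word of $L (R (P))$, with no claim about what each production in Figure 12 does.

```python
import re

def in_L(s):
    """Membership test for L = { T^n R^(2n+1) T^n : n >= 0 }."""
    m = re.fullmatch(r"(T*)(R+)(T*)", s)
    if m is None:
        return False
    n = len(m.group(1))
    return len(m.group(3)) == n and len(m.group(2)) == 2 * n + 1

def control_word(k):
    """Word of R(P) = (p1 p2 p3 p4)* (p5 p6 p7 p8) with k passes of the star."""
    return ["p1", "p2", "p3", "p4"] * k + ["p5", "p6", "p7", "p8"]

pattern_n2 = "T" * 2 + "R" * 5 + "T" * 2    # the case n = 2 of Figure 13
```

The membership test needs to count three blocks against one another, which is exactly what a context-free device cannot do; the regular control sidesteps this by fixing the order in which the productions fire.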

6. Modelling of Biological Structures

Graphs and graph grammars play a pivotal role in the field of bioinformatics, especially in studies related to the structural analysis of DNA and proteins. In particular, the secondary structure prediction of proteins is a topic of active research [9,10,24]. In this section, we show how $nc$-$eNCE$ graph grammars can be used to model some biological structures with their associated characteristics.

In Example 2, an informal description of this concept is given using a $ψnc$-$eNCE$ graph grammar to generate a tree whose yield is the string read from the leftmost leaf node label to the rightmost leaf node label. The structure shown in Figure 5 is a linguistic description of the anti-parallel β-sheet structure of a protein. The symbols $a, b$ and t in the graph shown in Figure 5 can be replaced with the corresponding amino-acid sequences so that the original β-sheet sequence can be obtained. Since the generated structure is a tree, it can be parsed using a computational device. This model can also be used to learn and predict the occurrence of a β-sheet structure when a sequence is given. The following examples show the generation of some popular β-sheet structures.

6.1. Modelling of Parallel $β$-Sheet Structures

Consider a $dnc$-$eNCE$ graph grammar $p β n c G = (Σ, Δ, Γ, Ω, P, G_{S}, R (P))$ with $Σ = \{S, t, \#, u_{1}, u_{2}, l_{1}, l_{2}, v, a, b\}$, $Δ = \{u_{1}, u_{2}, l_{1}, l_{2}, a, b, v\}$, $Γ = \{α\}$, $Ω = \{α\}$, $P = \{p_{1}, p_{2}, p_{3}, p_{4}, p_{5}\}$, $G_{S}$ the graph in Figure 14, and $R (P) = {((p_{1} + p_{2}) p_{3} p_{4})}^{*} p_{5}$. Figure 15 depicts the production rules that generate parse trees with yield $w l_{1} w l_{2} w$, $w \in {(a + b)}^{*}$. Figure 16 shows the application of these rules when $w = a b$.

6.2. Modelling of Anti-Parallel $β$-Sheet Structures with a Semi-Greek Key Conformation

Consider $g β n c G = (Σ, Δ, Γ, Ω, P, G_{S}, R (P))$ with $Σ = \{S, l_{1}, l_{2}, t, u, \#, v, a, b\}$, $Δ = \{l_{1}, l_{2}, t, u, v, a, b\}$, $Γ = \{α\}$, $Ω = \{α\}$, $P = \{p_{1}, p_{2}, p_{3}, p_{4}\}$, $G_{S}$ the graph in Figure 17, and $R (P) = {((p_{1} + p_{2}) p_{3})}^{*} p_{4}$. Figure 18 depicts the production rules that generate parse trees with the yield $w l_{1} w^{r} l_{2} w^{r}$, $w \in {(a + b)}^{*}$. Figure 19 shows the application of these rules when $w = a b$.
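The yields of the two β-sheet grammars in Sections 6.1 and 6.2 can be written out explicitly for a given strand w. The sketch below is only a string-level illustration with hypothetical function names; yields are represented as lists of symbols because $l_{1}$ and $l_{2}$ are multi-character labels, and the productions of Figures 15 and 18 are not simulated.

```python
def parallel_sheet_yield(w):
    """Yield w l1 w l2 w of Section 6.1: three strands in the same direction."""
    return [*w, "l1", *w, "l2", *w]

def semi_greek_key_yield(w):
    """Yield w l1 w^r l2 w^r of Section 6.2, where w^r is the reversal of w."""
    r = w[::-1]
    return [*w, "l1", *r, "l2", *r]

parallel_ab = parallel_sheet_yield("ab")
greek_ab = semi_greek_key_yield("ab")
```

Replacing the symbols a and b by amino-acid residues, as suggested for Figure 5, turns such a yield into a candidate β-sheet sequence whose strands are separated by the loop symbols.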

As stated in the introduction, we investigate the modelling and prediction of the β-sheet regions. Figure 5, Figure 16 and Figure 19 depict this feature, as does Figure 20, which shows a schematic illustration of various common β-sheet configurations. The β-sheet strands are shown by solid arrows, while the turns between the strands are represented by light line segments. Another illustration can be found to the right of this schematic, which depicts the β-sheets with amino acids (stereotyped as a and b) held together by hydrogen bonds (shown with dotted lines). This leads us to believe that there is a strong link between amino acids in those positions. Our new grammar, together with its variants, can handle these β-sheet topologies. Parsing becomes easier when the development of graphs using our grammar is limited by the regular control of graph production rules. It is also worth noting that there will be sequences in between the β-sheet sections that can be handled by the $nc$-$eNCE$ graph grammar.

7. Conclusions

We have established a new type of edge and node controlled embedding graph grammar called the $nc$-$eNCE$ graph grammar, along with some variants. We have shown how some of the classic symmetric graph classes can be easily generated by our new graph grammar. The generative power of $nc$-$eNCE$ graph grammars has been demonstrated. We have shown how these grammars can be used to represent and analyse biological structures such as β-sheets in proteins. It will be interesting to see how these grammars can be used for identifying biochemical structures, especially those of higher order proteins, and their symmetry in terms of the sequences of amino acids present. It will also be of interest to apply these graph grammar constructs and explore their symmetry in building unpredictable linear games.

Author Contributions

Conceptualization, J.V., L.M. and A.K.N.; Writing—original draft, J.V.; Writing—review & editing, L.M.; Supervision, L.M.; Funding acquisition, A.K.N. All authors have read and agreed to the published version of the manuscript.

Funding

The APC was funded by Liverpool Hope University.

Data Availability Statement

The data is contained within the article.

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

References

1. Avis, D.; Hertz, A.; Marcotte, O. Graph Theory and Combinatorial Optimization; Springer: Berlin/Heidelberg, Germany, 2005; Volume 8.
2. Majeed, A.; Rauf, I. Graph Theory: A Comprehensive Survey about Graph Theory Applications in Computer Science and Social Networks. Inventions 2020, 5, 10.
3. Koutrouli, M.; Karatzas, E.; Paez-Espino, D.; Pavlopoulos, G.A. A guide to conquer the biological network era using graph theory. Front. Bioeng. Biotechnol. 2020, 8, 34.
4. Balasubramanian, K.; Gupta, S.P. Quantum Molecular Dynamics, Topological, Group Theoretical and Graph Theoretical Studies of Protein-Protein Interactions. Curr. Top. Med. Chem. 2019, 19, 426–443.
5. Yue, H.; Chunmei, L. Study of gene regulatory network based on graph. In Proceedings of the 2011 4th International Conference on Biomedical Engineering and Informatics (BMEI), Shanghai, China, 15–17 October 2011.
6. Engelfriet, J.; Rozenberg, G. Node Replacement Graph Grammars. In Handbook of Graph Grammars and Computing by Graph Transformation; World Scientific: Singapore, 1997; pp. 1–94.
7. Fahmy, H.; Blostein, D. A survey of graph grammars: Theory and applications. In Proceedings of the International Conference on Pattern Recognition. Conference B: Pattern Recognition Methodology and Systems, The Hague, The Netherlands, 30 August–3 September 1992; Volume 2, pp. 294–298.
8. Ehrig, H. Introduction to the algebraic theory of graph grammars (a survey). In Graph-Grammars and Their Application to Computer Science and Biology; Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 1978; Volume 1073, pp. 1–69.
9. Abe, N.; Mamitsuka, H. Predicting Protein Secondary Structure Using Stochastic Tree Grammars. Mach. Learn. 1997, 29, 275–301.
10. Vishveshwara, S.; Brinda, K.V.; Kannan, N. Protein Structure: Insights from Graph Theory. J. Theor. Comput. Chem. 2002, 1, 187–211.
11. Guo, M.; Thost, V.; Li, B.; Das, P.; Chen, J.; Matusik, W. Data-Efficient Graph Grammar Learning for Molecular Generation. In Proceedings of the International Conference on Learning Representations, Virtually, 25–29 April 2022.
12. Engelfriet, J.; Vereijken, J.J. Concatenation of Graphs. In Graph Grammars and Their Application to Computer Science. Graph Grammars 1994; Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 2005; Volume 1073, pp. 368–382.
13. Ehrig, H.; Korff, M.; Löwe, M. Tutorial Introduction to the Algebraic Approach of Graph Grammars. In Graph Grammars and Their Application to Computer Science; Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 1986; Volume 291, pp. 24–37.
14. Ehrig, H.; Habel, A.; Kreowski, H.J. Introduction to graph grammars with applications to semantic networks. Comput. Math. Appl. 1992, 23, 557–572.
15. Habel, A.; Kreowski, H. May we introduce to you: Hyperedge Replacement. In Graph Grammars and Their Application to Computer Science; Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 1986; Volume 291, pp. 15–26.
16. Drewes, F.; Kreowski, H.J.; Habel, A. Hyperedge Replacement Graph Grammars. In Handbook of Graph Grammars and Computing by Graph Transformation; World Scientific: Singapore, 1997; pp. 95–162.
17. Engelfriet, J.; Rozenberg, G. Graph grammars based on node rewriting: An introduction to NLC graph grammars. In Graph Grammars and Their Application to Computer Science: 4th International Workshop, Bremen, Germany, 5–9 March 1990; Springer: Berlin/Heidelberg, Germany, 1991; pp. 12–23.
18. Subramanian, K.G. Regular control on NLC grammars. Bull. EATCS 1985, 26, 63–64.
19. Janssens, D.; Rozenberg, G. Generating graph languages using hypergraph grammars. In Fundamentals of Computation Theory; Springer: Berlin/Heidelberg, Germany, 1981; pp. 154–164.
20. Rozenberg, G.; Salomaa, A. (Eds.) Handbook of Formal Languages, Volume 1: Word, Language, Grammar; Springer: Berlin/Heidelberg, Germany, 1997.
21. Jayakrishna, V.; Mathew, L. nc-eNCE Graph Grammars and Graph Rewriting P Systems. In Proceedings of the International Conference on Membrane Computing, Online, 25–26 August 2021; pp. 582–589.
22. Pavlidis, T. Linear and context-free graph grammars. J. ACM 1972, 19, 11–22.
23. Harary, F. Graph Theory; Addison-Wesley: Boston, MA, USA, 1991.
24. Searls, D.B. The Computational Linguistics of Biological Sequences. In Artificial Intelligence and Molecular Biology; American Association for Artificial Intelligence: Palo Alto, CA, USA, 1993; pp. 47–120.

Figure 1. Production rules for tree strings $w w^{r} w w^{r}$, $w \in (a + b)^{*}$.

Figure 2. Derivation of trees with yield $w w^{r} w w^{r}$, $w \in (a + b)^{*}$.

Figure 3. Start graph $G_{S}$.

Figure 4. Production rules for tree string $w t w^{r} t w t w^{r}$, $w \in (a + b)^{*}$.

Figure 5. Derivation of the tree string $a b t b a t a b t b a$.

Figure 6. Production rules for Wheel graph.

Figure 7. Generation of the Wheel graph $W_{6}$.

Figure 8. Production rules for Complete bipartite graph.

Figure 9. Production rules for Binary Tree.

Figure 10. $L = \{T^{n} R^{2n + 1} T^{n}, n \geq 0\}$.

Figure 11. Start graph $G_{S}$.

Figure 12. Production rules for $ncG_{diff}$.

Figure 13. Sample derivation for grammar in Example for $n = 2$.

Figure 14. Start graph $G_{S}$.

Figure 15. Production rules for tree strings $w l_{1} w l_{2} w$, $w \in (a + b)^{*}$.

Figure 16. Derivation for tree strings $w l_{1} w l_{2} w$, $w \in (a + b)^{*}$.

Figure 17. Start graph $G_{S}$.

Figure 18. Production rules for tree strings $w l_{1} w^{r} l_{2} w^{r}$, $w \in (a + b)^{*}$.

Figure 19. Derivation of tree strings $w l_{1} w^{r} l_{2} w^{r}$, $w \in (a + b)^{*}$.

Figure 20. Schematic illustration of different β-sheet configurations.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Vijayakumar, J.; Mathew, L.; Nagar, A.K. A New Class of Graph Grammars and Modelling of Certain Biological Structures. Symmetry 2023, 15, 349. https://doi.org/10.3390/sym15020349

