Article

Formalization of Side-Aware DNA Origami Words and Their Rewriting System, and Equivalent Classes

Department of Software and Computer Engineering, Ajou University, Suwon 16499, Republic of Korea
Mathematics 2025, 13(6), 895; https://doi.org/10.3390/math13060895
Submission received: 22 December 2024 / Revised: 16 February 2025 / Accepted: 3 March 2025 / Published: 7 March 2025
(This article belongs to the Section E1: Mathematics and Computer Science)

Abstract

DNA origami is a powerful technique for constructing nanoscale structures by folding a single-stranded DNA scaffold with short staple strands. While traditional models assume staples bind to a fixed side of the scaffold, we introduce a side-aware DNA origami framework that incorporates the directional binding of staples to either the left or right side. The graphical representation of DNA origami is described using rectangular basic modules of scaffolds and staples, which we refer to as symbols in side-aware DNA origami words. We further define the concatenation of these symbols to represent side-aware DNA origami words. A set of rewriting rules is introduced to define equivalent words that correspond to the same graphical structure. Finally, we compute the number of possible structures by determining the equivalence classes of these words.

1. Introduction

DNA is an ideal material for building nanostructures due to its unique features, such as molecular recognition, self-assembly, programmability, predictable nanoscale structure, and ease of synthesis. Self-assembly is a process where components, typically molecules, autonomously organize into a larger structure [1]. Two primary methods have been developed for constructing DNA nanostructures: tile-based self-assembly [2,3] and the DNA origami method, introduced by Rothemund [4,5]. The tile-based approach is effective for building small, simple structures with repeated domains, whereas the DNA origami method excels at creating larger, more complex, and precisely defined structures [6]. The term “origami” originates from the Japanese word for folding paper into a special shape. Similarly, DNA origami builds complex two- and three-dimensional nanostructures by folding a single-stranded DNA plasmid, called the scaffold, which outlines a shape. Short DNA strands, called staples, connect different parts of the scaffold. The specificity of interactions between complementary base pairs makes DNA a highly programmable material, allowing precise control over the resulting structure through sequence design. Following Rothemund’s pioneering work on DNA origami, significant advancements have been made in designing DNA nanostructures through scaffold and staple sequence optimization. Researchers have focused on improving the scalability, stability, and accuracy of DNA origami, addressing challenges like minimizing defects and optimizing folding efficiency [7]. The development of functional DNA origami has expanded the design space by tailoring nanostructures for specific uses, such as drug delivery, molecular sensors, and biosensing [8]. Modular DNA origami, which combines smaller units into larger and more complex structures, has created new opportunities for dynamic and multifunctional designs [9].
Theoretical research in DNA origami primarily focuses on understanding the mechanisms of self-assembly and optimizing computational aspects of sequence design. Major areas of research include developing sequence design algorithms for scaffolds and staples [10], thermodynamic modeling to predict folding stability and efficiency [11,12], and combinatorial optimization to minimize errors in assembly [13]. Additionally, studies on folding kinetics and assembly pathways aim to improve the speed and accuracy of self-assembly [14]. These theoretical approaches not only enhance the efficiency of DNA origami design but also offer insights into the underlying principles governing molecular self-assembly. Recently, Garrett et al. [5] introduced a graphical framework for DNA origami structures by defining a graphical alphabet to represent DNA origami and a corresponding word rewriting system. They analyzed the equivalence classes of DNA origami words.
Our research builds upon the work of Garrett et al. [5], but we extend it by considering the fact that staples can bind to either the left or right side of the scaffold, rather than being fixed to one side. Traditional designs often assume fixed-side binding, but recent studies have demonstrated that the orientation of staple strands can significantly influence the positioning and orientation of attached molecules, such as fluorophores, within the DNA origami structure [15,16]. To address this, we propose a formal framework for describing side-aware DNA origami words, which explicitly captures the directional binding of staples to either side of the scaffold. This framework provides a more comprehensive understanding of how directional binding impacts the resulting DNA origami structures.
The main contributions of this work are as follows:
  • We introduce the concept of side-aware DNA origami words, explicitly modeling the directional binding of staples (left or right) to the scaffold. This new concept accommodates the representation of more flexible and biologically realistic DNA origami structures, and is described in Section 3.
  • We define the side-aware DNA origami word rewriting system and analyze the properties of its graphical structures, focusing on equivalence classes. Since staples can bind to either side of the scaffold or fail to form stable structures if blocked by staples from the opposite side, the diversity of rewriting patterns increases. We categorize all patterns arising from concatenation and define them as rewriting rules for side-aware DNA origami words. This is described in Section 4.

2. Preliminaries

An alphabet $\Sigma$ is a non-empty finite set of symbols. A word $w = w_1 w_2 \cdots w_n \in \Sigma^n$ is a finite sequence of $n$ symbols over $\Sigma$, and $|w| = n$ denotes the size of the word. We use $\epsilon$ to denote the empty word. A subword, or a factor, of a word $w = w_1 w_2 \cdots w_n$ is $w' = w_i \cdots w_j$ where $1 \le i \le j \le n$. We use $\Sigma^*$ to denote the set of all words over $\Sigma$. The concatenation of two words $x$ and $y$ is denoted by $x \cdot y$, or simply $xy$.
A word rewriting system $(\Sigma, R)$ consists of an alphabet $\Sigma$ and a set $R \subseteq \Sigma^* \times \Sigma^*$ of rewriting rules. In this paper, $\Sigma$ is finite. $R$ generates an equivalence relation $\hat{R}$ on $\Sigma^*$. An element $(x, y)$ of $R$ is called a rewriting rule, and is written as $x \to y$. In general, we rewrite $uxv$ as $uyv$ for $u, v \in \Sigma^*$ if $(x, y) \in R$, and denote such rewriting by $uxv \to uyv$. For a sequence of words $u = x_1 \to x_2 \to \cdots \to x_n = v$ in a rewriting system $(\Sigma, R)$, we write $u \to^* v$. We consider $\hat{R}$ and denote an equivalence class of a word $w$ as $[w]$. A word $w_0 \in [w]$ is irreducible if $|w_0| \le |w'|$ for all $w' \in [w]$. We consider the set of equivalence classes $O$. The reader may refer to Book and Otto [17] for more information about word rewriting systems.
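The definitions above can be illustrated with a minimal Python sketch. It assumes words are plain strings over single-character symbols, uses a toy rule set that is not taken from the paper, and explores only words up to a length bound, so it returns a truncated view of the equivalence class rather than the full class. The names `rewrite_once`, `equivalence_class`, and `RULES` are hypothetical.

```python
from collections import deque

def rewrite_once(word, rules):
    """Yield every word obtained by replacing one occurrence of x by y, for (x, y) in rules."""
    for x, y in rules:
        start = word.find(x)
        while start != -1:
            yield word[:start] + y + word[start + len(x):]
            start = word.find(x, start + 1)

def equivalence_class(word, rules, max_len):
    """Words reachable from `word` using the rules in both directions,
    visiting only words of length at most max_len (a truncation, by assumption)."""
    symmetric = set(rules) | {(y, x) for x, y in rules}
    seen, queue = {word}, deque([word])
    while queue:
        current = queue.popleft()
        for candidate in rewrite_once(current, symmetric):
            if len(candidate) <= max_len and candidate not in seen:
                seen.add(candidate)
                queue.append(candidate)
    return seen

# Toy rules over the alphabet {a, b}: aa -> a and ab -> ba (illustrative only).
RULES = {("aa", "a"), ("ab", "ba")}
cls = equivalence_class("aab", RULES, max_len=3)
# A shortest word of the class; ties are broken alphabetically.
irreducible = min(cls, key=lambda w: (len(w), w))
```

Breadth-first search is used here only because it is simple; the length bound keeps the search finite even though rules applied right-to-left can lengthen a word.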
The Jones monoid $J_n$ is defined as a monoid with generators $h_1, \ldots, h_{n-1}$ that satisfy three types of relations:
  • $h_i h_j h_i = h_i$ for $|i - j| = 1$
  • $h_i h_i = h_i$
  • $h_i h_j = h_j h_i$ for $|i - j| \ge 2$
It is a monoid counterpart of the Temperley–Lieb algebra, which has been extensively studied in physics and knot theory [18,19,20].
Figure 1 shows the graphical representation of the generators and relations in the Jones monoid. Each element in $J_n$ is depicted with $n$ endpoints at both the top and bottom. The generator $h_i$ connects the $i$th and $(i+1)$st endpoints at both the top and bottom, with the remaining endpoints connected by vertical lines. For instance, Figure 1a illustrates the generators $h_1$, $h_2$, and $h_3$ in $J_5$. The generator $h_1$ connects the 1st and 2nd points at both the top and bottom, $h_2$ connects the 2nd and 3rd points, and $h_3$ connects the 3rd and 4th points. The multiplication of two elements corresponds to the concatenation of diagrams, where the diagram of the first element is placed above the second, and closed loops are removed. Relations 1, 2, and 3 are depicted graphically in Figure 1b–d, respectively. Two elements in the Jones monoid are equal if their graphical representations are equivalent, that is, they have the same set of top–bottom connecting segments after deleting internal loops. For any two words that have equivalent diagrams, one word can be rewritten to the other using a sequence of relations 1 to 3. To simplify DNA origami structures, Garrett et al. [5] introduced a base unit for DNA origami words using the Jones monoid framework. Their approach considers the endpoints of scaffolds and staples that are visible along the top and bottom borders of the overall structure. We adopt the same approach as Garrett et al. [5], but we further include the left or right bonding side of staples to the scaffold in our model.
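The diagram calculus described above can be sketched in Python. This is an illustrative encoding, not the authors' code: a diagram is assumed to be a set of unordered pairs of points `("t", k)` and `("b", k)`, and the function names are hypothetical. Composition glues the bottom of the first diagram to the top of the second, follows each strand through the glued middle points, and drops closed loops.

```python
def generator(i, n):
    """Diagram of h_i in J_n: a cup/cap on columns i, i+1 and vertical lines elsewhere."""
    pairs = {frozenset({("t", i), ("t", i + 1)}), frozenset({("b", i), ("b", i + 1)})}
    pairs |= {frozenset({("t", k), ("b", k)}) for k in range(1, n + 1) if k not in (i, i + 1)}
    return frozenset(pairs)

def identity(n):
    """Diagram with n vertical lines."""
    return frozenset(frozenset({("t", k), ("b", k)}) for k in range(1, n + 1))

def compose(top, bottom):
    """Place `top` above `bottom`, glue the middle points, and drop closed loops."""
    edges = {}
    for diagram, glued in ((top, "b"), (bottom, "t")):
        for pair in diagram:
            a, b = [("m", k) if side == glued else (side, k) for side, k in pair]
            edges.setdefault(a, []).append(b)
            edges.setdefault(b, []).append(a)
    result = set()
    for start, neighbours in edges.items():
        if start[0] == "m":
            continue  # strands are traced from outer (top or bottom) points only
        prev, cur = start, neighbours[0]
        while cur[0] == "m":  # each middle point has exactly two neighbours
            a, b = edges[cur]
            prev, cur = cur, (b if a == prev else a)
        result.add(frozenset({start, cur}))
    return frozenset(result)

def jones_monoid(n):
    """All diagrams generated by h_1, ..., h_{n-1} together with the identity."""
    gens = [generator(i, n) for i in range(1, n)]
    elements, frontier = {identity(n)}, [identity(n)]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = compose(x, g)
            if y not in elements:
                elements.add(y)
                frontier.append(y)
    return elements

h1, h2 = generator(1, 3), generator(2, 3)
```

With this encoding the three relations can be checked directly, for example `compose(compose(h1, h2), h1) == h1`, and the number of generated diagrams for small `n` matches the Catalan numbers.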

3. Formalization of Side-Aware DNA Origami Words

The schematics of the DNA origami structure presented in this paper build upon previous research on DNA origami words and rewriting systems [5]. A key distinction is that the earlier model has no staple endpoint sides and therefore no block staples; there, any pair of staple endpoints is connected as long as they occupy the same position. In this section, we introduce a formal framework to describe side-aware DNA origami words, incorporating the directional binding of staples to either the left or right side of the scaffold, and the concatenation of graphical structures of side-aware DNA origami words. A brief introduction to side-aware DNA origami words and their concatenation is provided as follows:
  • Side-aware DNA origami words, consisting of scaffolds and staples, are represented by graphical structures of width n. These graphical structures contain n columns, and each column has two endpoints: top and bottom. Scaffolds and staples are represented by directed pairs of endpoints. By combining the set of pairs of scaffolds and the set of pairs of staples, we define a graphical structure. Details are described in Section 3.1.
  • We define the concatenation of two graphical structures of side-aware DNA origami words. This concatenation is based on the relations of the Jones monoid and its graphical representation, as described in Figure 1. Unlike the classical concatenation of words, the concatenation of graphical structures is performed vertically, aligning their columns. Details are described in Section 3.2.

3.1. Side-Aware DNA Origami Words

We define columns consisting of scaffolds and staples, where the staples align along the scaffolds with the following properties:
  • We observe that there are two different types of motifs in the DNA origami structure: there are places where two adjacent scaffolds connect the two columns, and also places where two adjacent staples connect the two columns.
  • The scaffolds and staples have directions: adjacent scaffolds are anti-parallel, and a staple is anti-parallel to the connected scaffold.
  • Each staple end can reside on two different sides of the connected scaffold—left and right.
To represent such scaffolds and staples by directed pairs of endpoints, we first define the set of geometric positions to represent locations of points. Given $n$ as the width of the graphical structure, we define $x = i_t$ (respectively, $i_b$) to represent a position at the top (respectively, bottom) of the $i$th column for $1 \le i \le n$. We assume that scaffolds at the $i$th column go upward if $i$ is odd, and downward if $i$ is even, and that the direction of the staples is opposite to the direction of the scaffolds. The set of geometric positions for given $n$ is defined as $E_n = \{ i_t, i_b \mid 1 \le i \le n \}$.
A scaffold segment is defined by an ordered pair $(x, y)$ of endpoints where $x, y \in E_n$. A staple segment is categorized into one of four types:
  • Virtual (V): A staple is not present on the given column (we formally regard this “absence” of a staple as one type of staple for the concise definition of concatenation of structures). Other staples can be extended on a virtual staple through concatenation.
  • Left (L): Two endpoints of the staple are on the left side of the scaffolds.
  • Right (R): Two endpoints of the staple are on the right side of the scaffolds.
  • Block (B): A staple is disconnected in-between. Other staples cannot be extended on a block staple through concatenation.
Thus, a staple segment is defined by an ordered pair $(x^d, y^d)$ of endpoints where $x, y \in E_n$ and $d \in \{V, B, L, R\}$. Figure 2 shows an example of staple segments over $E_5$. Then, a graphical structure is defined as a pair $(C, P)$ of a set $C$ of scaffold segments and a set $P$ of staple segments that satisfies the following condition: each position in $E_n$ is occupied exactly once by endpoints in $C$ (respectively, $P$). For any segment $s = (x, y)$ of endpoints, we define the reverse of $s$ as the segment $(y, x)$; the reverse of a set of segments is defined analogously.
For a given width $n$, we define basic modules and corresponding generators $\Sigma_n = \{ \alpha_i, \beta_i \mid 1 \le i \le n-1 \}$ as an alphabet for DNA origami words with the order $\alpha_1 < \cdots < \alpha_{n-1} < \beta_1 < \cdots < \beta_{n-1}$. We say that $\alpha_i$ is complementary to $\beta_i$, and vice versa. For each generator $\alpha_i$ and $\beta_i$ we have four maps as described below:
  • $\alpha_i = \{ (i_b, i_t), ((i+1)_t, (i+1)_b), (i_t^L, (i+1)_t^L), ((i+1)_b^R, i_b^R) \}$
  • $\beta_i = \{ ((i+1)_t, i_t), (i_b, (i+1)_b), (i_t^L, i_b^L), ((i+1)_b^R, (i+1)_t^R) \}$
We call the four maps that describe $\alpha_i$ (respectively, $\beta_i$) units for $\alpha_i$ (respectively, $\beta_i$). Note that units are reversed for even $i$. The units for each generator describe their structures between the $i$th and the $(i+1)$st columns.
Table 1 shows the units for generators of odd $i$.
The full graphical structure $G(\gamma_i)$ for $\gamma_i$ must include not only the units for the $i$th and $(i+1)$st columns but also the remaining columns, so that it represents a complete structure of size $n$. The structure of each generator $\gamma_i \in \Sigma_n$ has a context $T(\gamma_i)$, which consists of a scaffold $(k_b, k_t)$ and a staple $(k_t^V, k_b^V)$ for odd $k$, and their reverses for even $k$, where $k \notin \{ i, i+1 \}$. Figure 3 illustrates the graphical structure of $\alpha_i$, $\alpha_{i+1}$, $\beta_i$, and $\beta_{i+1}$ of size 3, including the context corresponding to the 3rd column, with $i = 1$. The context $T(\gamma_i)$ can include either real or virtual pairs of scaffolds and staples. Depending on the choice of real or virtual context, different structural descriptions of graphical structures are possible [5]:
  • $G_{max}(n)$: The entire context is real, with both $V_c$ and $V_p$ being empty.
  • $G_{mid}(n)$: The context for the scaffold is real, while the context for the staple is virtual.
  • $G_{min}(n)$: The entire context is virtual.
In this study, we focus on $G_{max}(n)$, as the four types of staples enhance the structural diversity.
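The construction of a full structure from units and context can be sketched as follows. This is a minimal illustration under stated assumptions: positions are encoded as tuples such as `(1, "t")`, a staple carries its type as a third component, the second staple unit of $\beta_i$ is taken to run from $(i+1)_b$ to $(i+1)_t$ so that every position is used exactly once, and every context staple is given type V. All function names are hypothetical.

```python
def reverse(segment):
    """Reverse a scaffold (x, y) or a staple (x, y, d), keeping the staple type."""
    return (segment[1], segment[0]) + tuple(segment[2:])

def units(kind, i):
    """Scaffold and staple units of alpha_i or beta_i; reversed when i is even."""
    a, b = i, i + 1
    if kind == "alpha":
        scaffolds = [((a, "b"), (a, "t")), ((b, "t"), (b, "b"))]
        staples = [((a, "t"), (b, "t"), "L"), ((b, "b"), (a, "b"), "R")]
    else:  # beta
        scaffolds = [((b, "t"), (a, "t")), ((a, "b"), (b, "b"))]
        staples = [((a, "t"), (a, "b"), "L"), ((b, "b"), (b, "t"), "R")]
    if i % 2 == 0:
        scaffolds = [reverse(s) for s in scaffolds]
        staples = [reverse(s) for s in staples]
    return scaffolds, staples

def graphical_structure(kind, i, n):
    """Units on columns i, i+1 plus context segments on every other column."""
    scaffolds, staples = units(kind, i)
    for k in range(1, n + 1):
        if k in (i, i + 1):
            continue
        scaffold, staple = ((k, "b"), (k, "t")), ((k, "t"), (k, "b"), "V")
        scaffolds.append(scaffold if k % 2 == 1 else reverse(scaffold))
        staples.append(staple if k % 2 == 1 else reverse(staple))
    return scaffolds, staples

def is_valid(structure, n):
    """Check that scaffolds, and separately staples, use each position of E_n exactly once."""
    positions = sorted((k, side) for k in range(1, n + 1) for side in "bt")
    return all(
        sorted(point for segment in segments for point in segment[:2]) == positions
        for segments in structure
    )

alpha1 = graphical_structure("alpha", 1, 3)
beta2 = graphical_structure("beta", 2, 3)
```

For width 3, `is_valid(alpha1, 3)` and `is_valid(beta2, 3)` both hold, which is the occupancy condition stated for graphical structures.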
Figure 3. Graphical structures of $\alpha_i$ and $\alpha_{i+1}$, where $i = 1$. Real scaffolds are represented by black lines and real staples are represented by red dashed lines. Virtual staples are grey dashed lines. Virtual staples can be assigned to either the left or right side of a scaffold, but in this figure, we assign the virtual staples to the right side of a scaffold. In the graphical structure of $\alpha_1$, the context $T_1$ contains $(3_b, 3_t) \in R_c$ and $(3_b^\epsilon, 3_t^\epsilon) \in V_p$. The pair $(3_b^\epsilon, 3_t^\epsilon)$ shows that the virtual staple has possible strands to either the left or right of the scaffold $(3_b, 3_t)$.

3.2. Concatenation of Side-Aware DNA Origami Words

We now introduce how side-aware DNA origami structures are concatenated based on the relations of the Jones monoid and its graphical representation. The graphical structure corresponding to the concatenation of two words is formed by placing the graphical structure of the first word on top of that of the second word. When a virtual staple meets a left (or right) staple, the virtual staple becomes the left (or right) staple. This process demonstrates that the empty space represented by a virtual staple can be extended through concatenation, depending on whether it encounters a left or right staple. In contrast, if a left (or right) staple meets a right (or left) staple, both staples become blocked, meaning the staples are disconnected.
For two words $w_1$ and $w_2$, the concatenation $w_1 \cdot w_2 = w$ is defined as follows: suppose $G(w_1) = (C_1, P_1)$ and $G(w_2) = (C_2, P_2)$ are graphical structures of $w_1$ and $w_2$, respectively. We now describe the procedure to define the graphical structure $G(w)$, the concatenation of the two words $w_1$ and $w_2$, with the set $C$ of scaffolds and the set $P$ of staples. Note that $G(w) = G(w_1 w_2) = (C, P)$.
First, the set $C$ of scaffolds is obtained as follows:
(i) For all scaffolds in $C_1$, replace the subscript $b$ by $m$.
(ii) For all scaffolds in $C_2$, replace the subscript $t$ by $m$.
(iii) Given set $C_1 \cup C_2$, for each sequence
$$(q(0), q(1)), (q(1), q(2)), \ldots, (q(v-1), q(v))$$
of scaffolds where points $q(0)$ and $q(v)$ have subscripts $t$ or $b$, add $(q(0), q(v))$ to $C$. Figure 4 shows an example of the set $C$ of scaffolds of a graphical structure $G(w) = (C, P)$ of a word $w = \alpha_1 \beta_1$.
The set $P$ of staples is obtained as follows:
(i) For all staples in $P_1$, replace the subscript $b$ by $m$.
(ii) For all staples in $P_2$, replace the subscript $t$ by $m$.
(iii) Given set $P_1 \cup P_2$, for each sequence
$$(q(0)^{d(0)}, q(1)^{d(0)}), (q(1)^{d(1)}, q(2)^{d(1)}), \ldots, (q(v-1)^{d(v-1)}, q(v)^{d(v-1)})$$
of staples, where the $q(i)$ denote positions, the $d(i)$ denote staple types, and the points $q(0)^{d(0)}$ and $q(v)^{d(v-1)}$ have subscripts $t$ or $b$, we perform the following:
(a) If (1) $d(i) = B$, or (2) $d(i) = L$ and $d(i+1) = R$, or (3) $d(i) = R$ and $d(i+1) = L$ for any $0 \le i \le v-1$, add $(q(0)^B, q(v)^B)$ to $P$.
(b) Otherwise, if $d(i) = V$ for all $0 \le i \le v-1$, add $(q(0)^V, q(v)^V)$ to $P$.
(c) Otherwise, if $d(i) = L$ for any $0 \le i \le v-1$, add $(q(0)^L, q(v)^L)$ to $P$.
(d) Otherwise, add $(q(0)^R, q(v)^R)$ to $P$.
Figure 5 shows an example of each case (a)–(d).
We observe that the composition of graphical structures is associative, that is, $G(w_1 w_2 w_3) = G((w_1 w_2) w_3) = G(w_1 (w_2 w_3))$. We use $G_e(n)$ to refer to the set of all graphical structures that can be constructed from $\Sigma_n$.
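The concatenation procedure can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: it assumes positions are tuples such as `(1, "t")`, staples carry their type as a third component, the staple-type rule follows cases (a) to (d) literally (left/right clashes are detected between adjacent segments of a chain), and the width-2 structures of $\alpha_1$ and $\beta_1$ are written out by hand, with the second staple of $\beta_1$ taken to run from $2_b$ to $2_t$. The names `relabel`, `chains`, `resolve`, and `concatenate` are hypothetical.

```python
def relabel(point, old):
    """Rename subscript `old` ('b' or 't') to the glued middle subscript 'm'."""
    k, side = point
    return (k, "m") if side == old else point

def chains(segments):
    """Follow directed segments head-to-tail, starting from each outer start point."""
    by_start = {segment[0]: segment for segment in segments}
    for start, segment in by_start.items():
        if start[1] == "m":
            continue
        chain = [segment]
        while chain[-1][1][1] == "m":
            chain.append(by_start[chain[-1][1]])
        yield chain

def resolve(types):
    """Staple type of a chain, following cases (a)-(d) of the procedure."""
    clash = {("L", "R"), ("R", "L")}
    if "B" in types or any(pair in clash for pair in zip(types, types[1:])):
        return "B"
    if all(d == "V" for d in types):
        return "V"
    return "L" if "L" in types else "R"

def concatenate(g1, g2):
    """Graphical structure of w1 w2: g1 is placed on top of g2; middle loops vanish."""
    def glue(top, bottom):
        merged = [(relabel(s[0], "b"), relabel(s[1], "b")) + tuple(s[2:]) for s in top]
        merged += [(relabel(s[0], "t"), relabel(s[1], "t")) + tuple(s[2:]) for s in bottom]
        return merged
    (c1, p1), (c2, p2) = g1, g2
    scaffolds = {(ch[0][0], ch[-1][1]) for ch in chains(glue(c1, c2))}
    staples = {(ch[0][0], ch[-1][1], resolve([s[2] for s in ch]))
               for ch in chains(glue(p1, p2))}
    return scaffolds, staples

# Width-2 structures of alpha_1 and beta_1 (units only, no context columns).
ALPHA1 = ([((1, "b"), (1, "t")), ((2, "t"), (2, "b"))],
          [((1, "t"), (2, "t"), "L"), ((2, "b"), (1, "b"), "R")])
BETA1 = ([((2, "t"), (1, "t")), ((1, "b"), (2, "b"))],
         [((1, "t"), (1, "b"), "L"), ((2, "b"), (2, "t"), "R")])

ab = concatenate(ALPHA1, BETA1)
ba = concatenate(BETA1, ALPHA1)
```

In this sketch `ab` and `ba` have the same scaffolds, while the bottom cap is blocked in `ab` and the top cup is blocked in `ba`; chains that never reach a top or bottom point are closed loops and are simply not emitted.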

4. Side-Aware DNA Origami Rewriting Systems

In this section, we introduce rewriting rules for two graphical structures corresponding to $w_1$ and $w_2$. Specifically, if two words $w_1$ and $w_2$ produce the same graphical structure, i.e., $G(w_1) = G(w_2)$, then $w_1$ can be rewritten as $w_2$, and vice versa. Since the concatenation of two graphical structures occurs in a vertical manner, the types of endpoints at the bonding sites influence the resulting structure. For instance, the bottom endpoints of $G(w_1)$ and the top endpoints of $G(w_2)$ determine the outcome of the concatenated structure $G(w_1 \cdot w_2)$. If the bottom endpoints of $G(w_1)$ have a “cap” shape, with two endpoints connected from the bottom of one column to the bottom of the adjacent column, how does this affect the resulting structures when considering all possible generators that could bond below $G(w_1)$? Note that a “cup” shape at the bottom endpoints of $G(w_1)$ does not constrain or affect other generators bonding below $G(w_1)$, as it does not interfere with the bonding process. Conversely, if the top endpoints of $G(w_2)$ have a “cup” shape, with two endpoints connected from the top of one column to the top of the adjacent column, how does this influence the resulting structures when considering all possible generators that could bond above $G(w_2)$?
We define the rewriting rules as patterns of structural types and describe how each structural type affects the resulting structures. The concept of side-aware DNA origami words, which allows for the directional binding of staples to the scaffold, increases the complexity of the rewriting rules, even with only six possible generators to concatenate. We note that if the gap between columns is greater than two, the resulting structures are unaffected by the relations in the Jones monoid. Therefore, we only need to consider six generators: $\alpha_{i-2}, \alpha_{i-1}, \alpha_i, \beta_{i-2}, \beta_{i-1}, \beta_i$.

4.1. Rewriting Rules

We define a rewriting rule $w_1 \to w_2$ under $G_e(n)$ if $G(w_1) = G(w_2)$ under $G_e(n)$. Here, we establish the finite set of rewriting rules $R_e(n)$ under $G_e(n)$. For convenience, we use $\gamma$ and $\delta$ to represent an arbitrary generator type ($\alpha$ or $\beta$), and $\overline{\gamma}$ to represent a complementary generator of $\gamma$. We observe the following statements:
  • If $w_1 \to w_2$ under $G_e(n)$, then $w_1 \to w_2$ under $G_{max}(n)$. Namely, if two graphical structures $G(w_1)$ and $G(w_2)$ are the same under $G_e(n)$, then they should be the same when we ignore sides of staples and connect all staple endpoints at the same position. This implies that for any rule $w_1 \to w_2 \in R_e(n)$, there exist $w_1' \to w_2' \in R_{max}(n)$, and words $w$ and $w'$ such that $w_1 = w w_1' w'$ and $w_2 = w w_2' w'$.
  • If $w_1 \to w_2$ under $G_{max}(n)$, then the scaffolds of $G(w_1)$ and $G(w_2)$ are the same under $G_e(n)$.
  • Staple segments in the generators are categorized into three geometric types: a cap (e.g., $(1_b^R, 2_b^R)$), a cup (e.g., $(2_t^R, 1_t^R)$), and a straight segment (e.g., $(3_b^B, 3_t^B)$). For convenience, we refer to a cap (cup) between the $i$th and the $(i-1)$st columns as the $i$th cap (cup), and a straight staple at the $i$th column as the $i$th straight staple.
Under these observations, for each pair $(s(i), r(i))$ of staple segments with the same start and end positions dependent on $i$, we establish a set $\{(w, w')\}$ of pairs of a prefix $w \in \Sigma^*$ and a suffix $w' \in \Sigma^*$ that satisfies the following condition: for any pair $(w_1, w_2)$ of words where $G(w_1)$ and $G(w_2)$ are the same under $G_e(n)$ except $s(i)$ in $G(w_1)$ and $r(i)$ in $G(w_2)$, the equality $G(w w_1 w') = G(w w_2 w')$ holds under $G_e(n)$. We call such a set an override set, denoted by $O_C(i)$, where $C \in \{cap, cup, RB, LB, VR, VL, VB\}$ represents the type of the pairs. Then, for each rule $w_1 \to w_2 \in R_{max}(n)$, we observe the staple difference between $G(w_1)$ and $G(w_2)$ under $G_e(n)$, concatenate a prefix and a suffix from the corresponding override set to make the graphical structures equivalent, and retrieve the rewriting rules under $G_e(n)$.
An override set is a collection of “prefix–suffix pairs” that can adjust two words to make their graphical structures equivalent, even if there is a specific difference between them in terms of staple segments. For any given type of difference—like a “cap” or “cup” at a certain position—this set provides the necessary adjustments to “override” that difference. If two DNA origami words would result in the same structure except for a small discrepancy in their staples at position i, the override set tells us how to add the same prefix and suffix around both words so that their final structures become identical.
Below is the list of each pair of staple segments and the corresponding override sets.
  • Cap: We first consider a pair $s(i) = (i_b^B, (i-1)_b^B)$, $r(i) = (i_b^R, (i-1)_b^R)$ of $i$th caps for even $i$. Since the cap segments are at the bottom of the structure, concatenating any prefix does not affect the equivalence of graphical structures. We have six different generators that affect the $i$th cap segments: $\gamma_{i-2}$ through $\gamma_i$. Figure 6 illustrates an example of $G(w_1)$ and $G(w_2)$ where the only difference is $(s(i), r(i))$, and changes in the graphical structures from concatenation of these six generators after $w_1$ and $w_2$. We observe that all generators except $\beta_{i-2}$ make the two graphical structures equivalent. For the $\beta_{i-2}$ case, the difference in graphical structures remains the same. Therefore, we establish the override set as $O_{cap}(i) = \{ (\epsilon, vz) \mid v \in (\Sigma_n \setminus Z)^*, z \in Z \}$ for $Z = \{ \alpha_{i-2}, \alpha_{i-1}, \alpha_i, \beta_{i-1}, \beta_i \}$. When $i$ is odd, the override set for $s(i)$, $r(i)$ is the same as in the even case; this holds for all of the following cases as well.
  • Cup: We consider a pair $s(i) = (i_t^B, (i-1)_t^B)$, $r(i) = (i_t^L, (i-1)_t^L)$ of $i$th cups for odd $i$. Since the cup segments are at the top of the structure, concatenating any suffix does not affect the equivalence of graphical structures. We have six different generators that affect the $i$th cup segments: $\gamma_{i-2}$ through $\gamma_i$. Figure 7 illustrates an example of $G(w_1)$ and $G(w_2)$ where the only difference is $(s(i), r(i))$, and changes in the graphical structures from concatenation of these six generators before $w_1$ and $w_2$. We observe that all generators except $\beta_i$ make the two graphical structures equivalent. For the $\beta_i$ case, the difference in graphical structures remains the same. Therefore, we establish the override set as $O_{cup}(i) = \{ (zv, \epsilon) \mid v \in (\Sigma_n \setminus Z)^*, z \in Z \}$ for $Z = \{ \alpha_{i-2}, \alpha_{i-1}, \alpha_i, \beta_{i-2}, \beta_{i-1} \}$.
  • Straight segment: We have the following cases.
    (a) When $s(i)$ is right and $r(i)$ is block: we consider $s(i) = (i_t^R, i_b^R)$, $r(i) = (i_t^B, i_b^B)$ for odd $i$. We have four different generators that affect the straight segments: $\gamma_{i-1}$ through $\gamma_i$. Figure 8 illustrates an example of $G(w_1)$ and $G(w_2)$ where the only difference is $(s(i), r(i))$, and changes in the graphical structures from concatenation of these four generators before and after $w_1$ and $w_2$. If concatenation of a generator does not make the two graphical structures equivalent, it may change the geometric type of the staple segment difference. For example, concatenating $\alpha_i$ before $w_1$ and $w_2$ changes the geometric type of the staple segment difference to the $(i+1)$st cap. Thus, for any $(\epsilon, w) \in O_{cap}(i+1)$, $G(\alpha_i w_1 w) = G(\alpha_i w_2 w)$ holds. We summarize all of the cases as the following override set: $O_{RB}(i) = \{ (\beta_i v, \epsilon) \} \cup \{ (\epsilon, vz) \mid z \in Z \} \cup \{ (\alpha_{i-1} v, w) \mid (\epsilon, w) \in O_{cap}(i) \} \cup \{ (\alpha_i v, w) \mid (\epsilon, w) \in O_{cap}(i+1) \}$ for $Z = \{ \alpha_{i-1}, \alpha_i, \beta_i \}$ and $v \in (\Sigma_n \setminus Z)^*$.
    (b) When $s(i)$ is left and $r(i)$ is block: we consider $s(i) = (i_t^L, i_b^L)$, $r(i) = (i_t^B, i_b^B)$ for odd $i$ in Figure 9. We establish the following override set: $O_{LB}(i) = \{ (zv, \epsilon) \mid z \in Z \} \cup \{ (\epsilon, v \beta_{i-1}) \} \cup \{ (w, v \alpha_{i-1}) \mid (w, \epsilon) \in O_{cup}(i) \} \cup \{ (w, v \alpha_i) \mid (w, \epsilon) \in O_{cup}(i+1) \}$ for $Z = \{ \alpha_{i-1}, \alpha_i, \beta_{i-1} \}$ and $v \in (\Sigma_n \setminus Z)^*$.
    (c) When $s(i)$ is virtual and $r(i)$ is right: we consider $s(i) = (i_t^V, i_b^V)$, $r(i) = (i_t^R, i_b^R)$ for odd $i$ in Figure 10. We summarize all of the cases as the following override set: $O_{VR}(i) = \{ (zv, \epsilon) \mid z \in \{ \alpha_{i-1}, \alpha_i, \beta_{i-1} \} \} \cup \{ (\epsilon, v \beta_{i-1}) \} \cup \{ (w, v \alpha_{i-1}) \mid (w, \epsilon) \in O_{cup}(i) \} \cup \{ (w, v \alpha_i) \mid (w, \epsilon) \in O_{cup}(i+1) \} \cup \{ (w \beta_i, w'), (w, \beta_i w') \mid (w, w') \in O_{LB}(i) \}$ for $Z = \{ \alpha_{i-1}, \alpha_i, \beta_{i-1}, \beta_i \}$ and $v \in (\Sigma_n \setminus Z)^*$.
    (d) When $s(i)$ is virtual and $r(i)$ is left: we consider $s(i) = (i_t^V, i_b^V)$, $r(i) = (i_t^L, i_b^L)$ for odd $i$ in Figure 11. We establish the following override set: $O_{VL}(i) = \{ (\beta_i v, \epsilon) \} \cup \{ (\epsilon, vz) \mid z \in \{ \alpha_{i-1}, \alpha_i, \beta_i \} \} \cup \{ (\alpha_{i-1} v, w) \mid (\epsilon, w) \in O_{cap}(i) \} \cup \{ (\alpha_i v, w) \mid (\epsilon, w) \in O_{cap}(i+1) \} \cup \{ (w \beta_{i-1}, w'), (w, \beta_{i-1} w') \mid (w, w') \in O_{RB}(i) \}$ for $Z = \{ \alpha_{i-1}, \alpha_i, \beta_{i-1}, \beta_i \}$ and $v \in (\Sigma_n \setminus Z)^*$.
    (e) When $s(i)$ is virtual and $r(i)$ is block: we consider $s(i) = (i_t^V, i_b^V)$, $r(i) = (i_t^B, i_b^B)$ for odd $i$ in Figure 12. We establish the following override set: $O_{VB}(i) = \{ (\beta_i v, \epsilon), (\epsilon, v \beta_i) \} \cup \{ (\alpha_{i-1} v, w) \mid (\epsilon, w) \in O_{cap}(i) \} \cup \{ (\alpha_i v, w) \mid (\epsilon, w) \in O_{cap}(i+1) \} \cup \{ (w, v \alpha_{i-1}) \mid (w, \epsilon) \in O_{cup}(i) \} \cup \{ (w, v \alpha_i) \mid (w, \epsilon) \in O_{cup}(i+1) \} \cup \{ (w \beta_{i-1}, w'), (w, \beta_{i-1} w') \mid (w, w') \in O_{RB}(i) \}$ for $Z = \{ \alpha_{i-1}, \alpha_i, \beta_{i-1}, \beta_i \}$ and $v \in (\Sigma_n \setminus Z)^*$.
Note that the definition of each override set only refers to the previously defined override sets. We reorganize the override sets by substitution:
  • $O_{cap}(i) = \{ (\epsilon, vz) \mid v \in (\Sigma_n \setminus Z)^*, z \in Z \}$ for $Z = \{ \alpha_{i-2}, \alpha_{i-1}, \alpha_i, \beta_{i-1}, \beta_i \}$.
  • $O_{cup}(i) = \{ (zv, \epsilon) \mid v \in (\Sigma_n \setminus Z)^*, z \in Z \}$ for $Z = \{ \alpha_{i-2}, \alpha_{i-1}, \alpha_i, \beta_{i-2}, \beta_{i-1} \}$.
  • $O_{RB}(i) = \{ (\beta_i v[1], \epsilon), (\epsilon, v[1] z[1]), (\alpha_{i-1} v[1], v[2] z[2]), (\alpha_i v[1], v[3] z[3]) \}$ for $Z[1] = \{ \alpha_{i-1}, \alpha_i, \beta_i \}$, $Z[2] = \{ \alpha_{i-2}, \alpha_{i-1}, \alpha_i, \beta_{i-1}, \beta_i \}$, $Z[3] = \{ \alpha_{i-1}, \alpha_i, \alpha_{i+1}, \beta_i, \beta_{i+1} \}$, $z[j] \in Z[j]$ and $v[j] \in (\Sigma_n \setminus Z[j])^*$ for $1 \le j \le 3$.
  • $O_{LB}(i) = \{ (z[1] v[1], \epsilon), (\epsilon, v[1] \beta_{i-1}), (z[2] v[2], v[1] \alpha_{i-1}), (z[3] v[3], v[1] \alpha_i) \}$ for $Z[1] = \{ \alpha_{i-1}, \alpha_i, \beta_{i-1} \}$, $Z[2] = \{ \alpha_{i-2}, \alpha_{i-1}, \alpha_i, \beta_{i-2}, \beta_{i-1} \}$, $Z[3] = \{ \alpha_{i-1}, \alpha_i, \alpha_{i+1}, \beta_{i-1}, \beta_i \}$, $z[j] \in Z[j]$ and $v[j] \in (\Sigma_n \setminus Z[j])^*$ for $1 \le j \le 3$.
  • $O_{VR}(i) = \{ (z[4] v[1], \epsilon), (\epsilon, v[1] \beta_{i-1}), (z[2] v[2], v[1] \alpha_{i-1}), (z[3] v[3], v[1] \alpha_i), (z[4] v[4] \beta_i, \epsilon), (z[4] v[4], \beta_i), (\beta_i, v[4] \beta_{i-1}), (\epsilon, \beta_i v[4] \beta_{i-1}), (z[2] v[2] \beta_i, v[4] \alpha_{i-1}), (z[2] v[2], \beta_i v[4] \alpha_{i-1}), (z[3] v[3] \beta_i, v[4] \alpha_i), (z[3] v[3], \beta_i v[4] \alpha_i) \}$ for $Z[1] = \{ \alpha_{i-1}, \alpha_i, \beta_{i-1}, \beta_i \}$, $Z[2] = \{ \alpha_{i-2}, \alpha_{i-1}, \alpha_i, \beta_{i-2}, \beta_{i-1} \}$, $Z[3] = \{ \alpha_{i-1}, \alpha_i, \alpha_{i+1}, \beta_{i-1}, \beta_i \}$, $Z[4] = \{ \alpha_{i-1}, \alpha_i, \beta_{i-1} \}$, $z[j] \in Z[j]$ and $v[j] \in (\Sigma_n \setminus Z[j])^*$ for $1 \le j \le 4$.
  • $O_{VL}(i) = \{ (\beta_i v[1], \epsilon), (\epsilon, v[4] z[1]), (\alpha_{i-1} v[1], v[2] z[2]), (\alpha_i v[1], v[3] z[3]), (\beta_i v[4] \beta_{i-1}, \epsilon), (\beta_i v[4], \beta_{i-1}), (\beta_{i-1}, v[4] z[4]), (\epsilon, \beta_{i-1} v[4] z[4]), (\alpha_{i-1} v[4] \beta_{i-1}, v[2] z[2]), (\alpha_{i-1} v[4], \beta_{i-1} v[2] z[2]), (\alpha_i v[4] \beta_{i-1}, v[3] z[3]), (\alpha_i v[4], \beta_{i-1} v[3] z[3]) \}$ for $Z[1] = \{ \alpha_{i-1}, \alpha_i, \beta_{i-1}, \beta_i \}$, $Z[2] = \{ \alpha_{i-2}, \alpha_{i-1}, \alpha_i, \beta_{i-1}, \beta_i \}$, $Z[3] = \{ \alpha_{i-1}, \alpha_i, \alpha_{i+1}, \beta_i, \beta_{i+1} \}$, $Z[4] = \{ \alpha_{i-1}, \alpha_i, \beta_i \}$, $z[j] \in Z[j]$ and $v[j] \in (\Sigma_n \setminus Z[j])^*$ for $1 \le j \le 4$.
  • $O_{VB}(i) = \{ (\beta_i v[1], \epsilon), (\epsilon, v[1] \beta_i), (\alpha_{i-1} v[1], v[2] z[2]), (\alpha_i v[1], v[3] z[3]), (z[4] v[4], v[1] \alpha_{i-1}), (z[5] v[5], v[1] \alpha_i), (\beta_i v[6] \beta_{i-1}, \epsilon), (\beta_i v[6], \beta_{i-1}), (\beta_{i-1}, v[6] z[6]), (\epsilon, \beta_{i-1} v[6] z[6]), (\alpha_{i-1} v[6] \beta_{i-1}, v[2] z[2]), (\alpha_{i-1} v[6], \beta_{i-1} v[2] z[2]), (\alpha_i v[6] \beta_{i-1}, v[3] z[3]), (\alpha_i v[6], \beta_{i-1} v[3] z[3]) \}$ for $Z[1] = \{ \alpha_{i-1}, \alpha_i, \beta_{i-1}, \beta_i \}$, $Z[2] = \{ \alpha_{i-2}, \alpha_{i-1}, \alpha_i, \beta_{i-1}, \beta_i \}$, $Z[3] = \{ \alpha_{i-1}, \alpha_i, \alpha_{i+1}, \beta_i, \beta_{i+1} \}$, $Z[4] = \{ \alpha_{i-2}, \alpha_{i-1}, \alpha_i, \beta_{i-2}, \beta_{i-1} \}$, $Z[5] = \{ \alpha_{i-1}, \alpha_i, \alpha_{i+1}, \beta_{i-1}, \beta_i \}$, $Z[6] = \{ \alpha_{i-1}, \alpha_i, \beta_i \}$, $z[j] \in Z[j]$ and $v[j] \in (\Sigma_n \setminus Z[j])^*$ for $1 \le j \le 6$.
Now, we inspect the rules in $R_{max}(n)$ to construct $R_e(n)$.
  • $\gamma_i \overline{\gamma_j} \leftrightarrow \overline{\gamma_j} \gamma_i$: When $|i-j| \ge 2$, the rule also holds under $G_e(n)$.
    (a) When $j = i+1$ and $\gamma = \alpha$, we observe that the only difference in graphical structures is the $i$th cap. Thus, $\alpha_i \beta_{i+1} w \leftrightarrow \beta_{i+1} \alpha_i w$ holds for $(\epsilon, w) \in O_{cap}(i)$.
    (b) When $j = i+1$ and $\gamma = \beta$, we observe that the only difference in graphical structures is the $(i+1)$st cup. Thus, $w \beta_i \alpha_{i+1} \leftrightarrow w \alpha_{i+1} \beta_i$ holds for $(w, \epsilon) \in O_{cup}(i+1)$.
    (c) When $i = j$, we observe that there are $(i+1)$st cup and cap differences. Thus, $w \alpha_i \beta_i w' \leftrightarrow w \beta_i \alpha_i w'$ holds for $(w, \epsilon) \in O_{cup}(i+1)$ and $(\epsilon, w') \in O_{cap}(i+1)$.
  • $\gamma_i \gamma_i \leftrightarrow \gamma_i$: This rule also holds under $G_e(n)$.
  • $\gamma_i \gamma_j \leftrightarrow \gamma_j \gamma_i$ for $|i-j| \ge 2$: This rule also holds under $G_e(n)$.
  • $\gamma_i \gamma_j \gamma_i \leftrightarrow \gamma_i$ for $|i-j| = 1$
    (a) When $\gamma = \alpha$ and $j = i-1$, we observe that the only difference in graphical structures is the $i$th straight virtual/block staples. Thus, $w \alpha_i w' \leftrightarrow w \alpha_i \alpha_{i-1} \alpha_i w'$ holds for $(w, w') \in O_{VB}(i-1)$.
    (b) When $\gamma = \alpha$ and $j = i+1$, we observe that the only difference in graphical structures is the $(i+2)$nd straight virtual/block staples. Thus, $w \alpha_i w' \leftrightarrow w \alpha_i \alpha_{i+1} \alpha_i w'$ holds for $(w, w') \in O_{VB}(i+2)$.
    (c) When $\gamma = \beta$ and $j = i-1$, we observe that the differences in graphical structures are the $(i-1)$st straight virtual/block staple and the $i$th straight left/block staple. Since straight differences on two adjacent columns can become the same by simply applying $O_{VB}(i-1)$ and $O_{LB}$, we attach generators that affect the different staple types and observe the changes on the differences. Figure 13 illustrates the transition graph of the changes in the different staple types. Each node denotes the different staple types by override sets for the differences, and each transition has a set of pairs of a prefix and a suffix that changes the different staple types. Nodes with at most one type difference are double-circled to denote the "final" nodes, since we can directly apply the override set on the node to make two graphical structures equivalent. For $G(\beta_i)$ and $G(\beta_i \beta_{i-1} \beta_i)$, we start from the node with two override sets $O_{VB}(i-1)$ and $O_{LB}(i)$. Then, for example, if we follow transitions $(\alpha_{i-2}, \epsilon)$ and $(\epsilon, \alpha_{i-3})$, the resulting node has one override set $O_{LB}(i)$. From this sequence of transitions, we establish the rewriting rule $w \alpha_{i-2} v[1] \beta_i v[2] \alpha_{i-3} w' \leftrightarrow w \alpha_{i-2} v[1] \beta_i \beta_{i-1} \beta_i v[2] \alpha_{i-3} w'$ for $v[1] \in (\Sigma_n \setminus \{\alpha_{i-1}, \alpha_i, \beta_{i-2}, \beta_{i-1}, \beta_i\})^*$, $v[2] \in (\Sigma_n \setminus \{\alpha_{i-2}, \alpha_{i-1}, \alpha_i, \beta_{i-3}, \beta_{i-2}, \beta_{i-1}, \beta_i\})^*$ and $(w, w') \in O_{LB}(i)$. In general, we can recursively construct a rewriting rule based on a sequence of transitions that leads to a final node as follows:
    • We start with $w_1 = \beta_i$ and $w_2 = \beta_i \beta_{i-1} \beta_i$ and the current node with two override sets $O_{VB}(i-1)$ and $O_{LB}(i)$.
    • For a transition $(\zeta, \epsilon)$ from the current node, attach the prefix $\zeta v$ to $w_1$ and $w_2$, where $v$ contains no symbol $\zeta$ for which a transition $(\zeta, \epsilon)$ exists from the current node.
    • For a transition $(\epsilon, \zeta)$ from the current node, attach the suffix $v \zeta$ to $w_1$ and $w_2$, where $v$ contains no symbol $\zeta$ for which a transition $(\epsilon, \zeta)$ exists from the current node.
    • Update the current node by following the transition. If the current node is final with the override set $O$, we establish the rewriting rule $w w_1 w' \leftrightarrow w w_2 w'$ for $(w, w') \in O$. Otherwise, repeat the process from the current node.
    (d) When $\gamma = \beta$ and $j = i+1$, we can similarly construct the transition graph shown in Figure 14. Based on this transition graph, we can recursively construct a set of rewriting rules.
Note that the transition graphs in Figure 13 and Figure 14 are not symmetric with respect to the indices (i.e., changing index $j$ to $2i+1-j$ in Figure 13 does not yield Figure 14).
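The recursive construction in case (c) can be sketched in Python. This is only an illustration under stated assumptions: the dictionary `TOY_GRAPH`, the node names (including the intermediate node `"mid"`), the string spellings such as `"a[i-2]"` for $\alpha_{i-2}$, and the helper `build_rule_template` are all hypothetical, and the graph encodes only the two transitions mentioned in the text rather than the full content of Figure 13.

```python
# Hypothetical toy fragment of a transition graph: node -> {(prefix, suffix): next node}.
# Only the path described in the text is encoded; the real figure has more transitions.
TOY_GRAPH = {
    "OVB(i-1)+OLB(i)": {("a[i-2]", ""): "mid"},
    "mid": {("", "a[i-3]"): "OLB(i)"},
}
FINAL_NODES = {"OLB(i)"}  # nodes with at most one remaining type difference


def build_rule_template(start, transitions, w1, w2):
    """Follow `transitions` from `start` and wrap w1 and w2 with the collected context.

    A step (zeta, "") adds the prefix `zeta v[k]`; a step ("", zeta) adds the
    suffix `v[k] zeta`. Each v[k] is a placeholder for a word that avoids the
    symbols labelling transitions of the same kind at the current node.
    Returns (left_side, right_side, final_node_name).
    """
    node, left, right = start, list(w1), list(w2)
    for k, (prefix, suffix) in enumerate(transitions, start=1):
        if (prefix, suffix) not in TOY_GRAPH.get(node, {}):
            raise ValueError(f"no transition {(prefix, suffix)} from node {node}")
        filler = f"v[{k}]"
        if prefix:
            left, right = [prefix, filler] + left, [prefix, filler] + right
        else:
            left, right = left + [filler, suffix], right + [filler, suffix]
        node = TOY_GRAPH[node][(prefix, suffix)]
        if node in FINAL_NODES:
            # The outer context (w, w') is drawn from the override set of the final node.
            return ["w"] + left + ["w'"], ["w"] + right + ["w'"], node
    raise ValueError("the transition sequence does not reach a final node")


lhs, rhs, final = build_rule_template(
    "OVB(i-1)+OLB(i)",
    [("a[i-2]", ""), ("", "a[i-3]")],
    ["b[i]"],
    ["b[i]", "b[i-1]", "b[i]"],
)
print(" ".join(lhs))
print(" ".join(rhs))
```

The two printed templates have the same shape as the rule derived in the text for the path $(\alpha_{i-2}, \epsilon)$, $(\epsilon, \alpha_{i-3})$, with the outer context taken from the override set named by `final`.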
We take the union of the sets of rewriting rules analyzed above to construct $R_e(n)$. It is straightforward that if $w_1$ and $w_2$ are related by the rules of $R_e(n)$, then $G(w_1) = G(w_2)$ under $G_e(n)$. Moreover, since $R_e(n)$ is constructed by recursively tracing every possible difference in the graphical structures of $G(w_1)$ and $G(w_2)$ where $G(w_1) = G(w_2)$ under $G_{max}(n)$, the converse also holds: if $G(w_1) = G(w_2)$ under $G_e(n)$, then $w_1$ and $w_2$ are related by the rules of $R_e(n)$. Based on $R_e(n)$, we can also define $O_e(n)$, the set of equivalence classes under $\hat{R}_e(n)$.

4.2. Properties of Graphical Structures and Equivalent Classes

Here, we inspect the properties of graphical structures under $G_e(n)$, which we will use to count the number of distinct equivalence classes. For two graphical structures $G(w) = (C, P)$ and $G(w') = (C', P')$, we say that $G(w)$ is isomorphic to $G(w')$ (and vice versa) if there exists a one-to-one correspondence between scaffolds in $C$ and $C'$ (staples in $P$ and $P'$) with the same pair of endpoints. For instance, if we compare the graphical structures from the same word under $G_e(n)$ and $G_{max}(n)$, one is isomorphic to the other; the only difference is the four different types of staples (virtual, left, right, block) under $G_e(n)$. However, not all combinations of these types of staples are realizable under $G_e(n)$. We first analyze the possible combinations of different types of staples. Then, for each given graphical structure $G(w)$ under $G_{max}(n)$, we count the number of distinct graphical structures under $G_e(n)$ that are isomorphic to $G(w)$.
For further analysis of staples, we first define four different types of a column according to the scaffold structure. For a given graphical structure, the ith column can be categorized into four different types:
  • Straight if there exists a straight scaffold on the ith column.
  • Crossing if there exists a scaffold from the jth column to the kth column such that j < i < k or k < i < j holds.
  • Left-sided if the ith column is not crossing and all scaffolds with endpoints on the ith column have the other endpoints on the jth column such that j < i .
  • Right-sided if the ith column is not crossing and all scaffolds with endpoints on the ith column have the other endpoints on the jth column such that i < j .
Since each position is assigned to distinct endpoints of scaffolds and there is no crossing of scaffolds, these four different types partition the set of all columns.
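As an illustration of the four column types, the following Python sketch classifies a column from a list of scaffolds. The encoding of a scaffold as a pair of `(column, level)` endpoints and the function name `classify_column` are assumptions made for this example. The sample scaffold set is the one listed in the caption of Figure 4.

```python
def classify_column(i, scaffolds):
    """Return 'straight', 'crossing', 'left-sided' or 'right-sided' for column i.

    `scaffolds` is an iterable of endpoint pairs ((j, p), (k, q)) with column
    indices j, k and levels p, q in {'t', 'b'} (a hypothetical encoding).
    """
    others = []  # columns at the far end of scaffolds that touch column i
    crossing = False
    for (j, _), (k, _) in scaffolds:
        if j == i and k == i:
            return "straight"  # both endpoints lie on column i
        if min(j, k) < i < max(j, k):
            crossing = True  # the scaffold passes over column i
        elif j == i:
            others.append(k)
        elif k == i:
            others.append(j)
    if crossing:
        return "crossing"
    if others and all(c < i for c in others):
        return "left-sided"
    if others and all(c > i for c in others):
        return "right-sided"
    raise ValueError(f"column {i} does not fit any of the four types")


# Scaffold set taken from the Figure 4 caption: (2_t, 1_t), (3_b, 3_t), (1_b, 2_b).
C = [((2, "t"), (1, "t")), ((3, "b"), (3, "t")), ((1, "b"), (2, "b"))]
for column in (1, 2, 3):
    print(column, classify_column(column, C))
```

The function raises an error for inputs that match none of the four types, which should not occur for a valid graphical structure according to the partition claim above.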
Lemma 1. 
For a given graphical structure $G(w)$ under $G_e(n)$, staples in $G(w)$ have the following properties:
  • If a staple is virtual straight, then the staple is on a straight column.
  • If a staple is a cup, then the staple is left or block.
  • If a staple is a cap, then the staple is right or block.
  • If a staple is left straight, then the staple is on a straight or right-sided column.
  • If a staple is right straight, then the staple is on a straight or left-sided column.
  • If a staple is not a cap/a cup/a straight segment, then the staple is block.
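The six conditions of Lemma 1 can be read as a filter on (shape, side, column type) combinations. The sketch below encodes them directly; the function name `allowed_by_lemma1` and the string labels are assumptions for illustration. Because the lemma states necessary conditions only, a combination that passes this check is not guaranteed to be realizable.

```python
def allowed_by_lemma1(shape, side, column_type=None):
    """Check one staple against the necessary conditions of Lemma 1.

    shape: 'cup', 'cap', 'straight' or 'other'
    side: 'virtual', 'left', 'right' or 'block'
    column_type: type of the column of a straight staple (ignored otherwise)
    """
    if shape == "cup":
        return side in {"left", "block"}
    if shape == "cap":
        return side in {"right", "block"}
    if shape == "straight":
        if side == "virtual":
            return column_type == "straight"
        if side == "left":
            return column_type in {"straight", "right-sided"}
        if side == "right":
            return column_type in {"straight", "left-sided"}
        return side == "block"  # the lemma places no restriction on block straight staples
    # Any staple that is not a cap, a cup, or a straight segment must be block.
    return side == "block"


print(allowed_by_lemma1("cup", "left"))
print(allowed_by_lemma1("straight", "virtual", "crossing"))
```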
Lemma 2. 
For a given graphical structure $G(w)$ under $G_e(n)$, there always exists a graphical structure $G(w')$ under $G_e(n)$ that satisfies the following properties:
  • $G(w')$ is isomorphic to $G(w)$.
  • A staple is virtual if and only if the staple is straight and on a straight column.
  • If a staple is a cup, then the staple is left.
  • If a staple is a cap, then the staple is right.
  • A staple is left straight if and only if the staple is on a right-sided column.
  • A staple is right straight if and only if the staple is on a left-sided column.
Proof of Lemma 2. 
Figure 15 gives an example of $G(w)$ and $G(w')$. □
We also observe that if $\alpha_i$ is in $w$, then the $i$th and the $(i+1)$st columns cannot have left or right straight staples.
We now classify graphical structures using a binary string $b$ of length $n$, where the $i$th digit is 1 if the $i$th column has a straight scaffold and 0 otherwise. The set of binary strings of length $n$ is in bijection with the set $T(\beta)_n = \{p = (a_1, b_1, \ldots, a_k, b_k) \mid k \ge 1,\ 0 \le a_1, b_k \le n,\ 1 \le a_i \le n,\ 0 \le b_i \le n \text{ for all } i, \text{ and } \sum_{i=1}^{k}(a_i + b_i) = n\}$. Thus, let $D(p)$ be the number of graphical structures which correspond to $p \in T(\beta)_n$.
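One way to realize the correspondence between binary strings and tuples is to read the string as alternating runs of 0s and 1s. The sketch below assumes that each $a_i$ counts a run of 0s (columns without a straight scaffold) and each $b_i$ counts a run of 1s, which matches the values $D(0, 1) = 1$ and $D(1, 0) = 0$ used later. The function name `binary_to_tuple` is hypothetical.

```python
def binary_to_tuple(bits):
    """Convert a 0/1 string into (a_1, b_1, ..., a_k, b_k).

    a_i is the length of the i-th run of 0s and b_i the length of the run of 1s
    that follows it; a_1 and b_k may be 0 when the string starts with 1 or
    ends with 0.
    """
    if set(bits) - {"0", "1"}:
        raise ValueError("expected a string over the alphabet {0, 1}")
    out, pos, n = [], 0, len(bits)
    while True:
        a = 0
        while pos < n and bits[pos] == "0":
            a += 1
            pos += 1
        b = 0
        while pos < n and bits[pos] == "1":
            b += 1
            pos += 1
        out.extend((a, b))
        if pos >= n:
            return tuple(out)


print(binary_to_tuple("0011"))
print(binary_to_tuple("10"))
```

In every tuple produced this way the entries sum to the length of the input string.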
Theorem 1. 
Given $n \in \mathbb{N}_0$, for each tuple $p \in \bigcup_{1 \le i \le n} T(\beta)_i$, let $D(p) \in \mathbb{N}_0$ be recursively defined as follows:
  • for $p \in T(\beta)_0$, $D(0, 0) = 1$.
  • for $p \in T(\beta)_1$, $D(p) = 1$ if $p = (0, 1)$ and $D(p) = 0$ if $p = (1, 0)$.
  • for $p = (a_1, b_1, \ldots, a_k, b_k) \in T(\beta)_n$ $(n > 0)$, we have $D(p) = \prod_{i=1}^{k} D(a_i, 0)$.
  • for $n > 1$, we have $D(0, n) = 1$ and
    $$D(n, 0) = \frac{1}{n+1}\binom{2n}{n} - \sum_{p \in T(\beta)_n \setminus \{(n, 0)\}} D(p).$$
Then, the size of the set of equivalence classes of words from $\Sigma(\beta)_n$ is given as
| O ( β ) e ( n ) | = p T ( β ) n D ( p ) × x ( p ) n ,
where x ( a 1 , b 1 , , a k , b k ) is given as
p = ( a 1 , b 1 , , a l , b l ) T ( β ) k i = 1 l 1 c i · ( b ( s l b l ) + 1 ) 1
for s i = j = 1 i ( a j + b j ) and c i = ( b ( s i b i ) + 1 ) ( b s i + 1 ) if b i 1 and ( b s i + 1 ) ( b s i + 2 ) 2 if b i = 0 .
Proof of Theorem 1. 
We observe that all staples in a graphical structure of a word from $\Sigma(\beta)_n$ are straight. Moreover, all straight staples are either left or right in the generators, and if the $i$th straight staple is block in $G(w)$, then the $i$th straight staple is block in all $G(uwv)$ where $u, w, v \in \Sigma(\beta)_n^*$. Thus, for each graphical structure, we first find a structure with the fewest block staples, and count the number of graphical structures with the same connectivity. It is straightforward to verify that $D(0, 1) = 1$ because there is only one case for $p = (0, 1)$, while $D(1, 0) = 0$ since $p = (1, 0)$ corresponds to an invalid case. For tuples $p = (a_1, b_1, \ldots, a_k, b_k) \in T(\beta)_n$, the columns with 1s in the binary representation always have only one possible case (straight scaffold and staple). Therefore, $D(p)$ is computed as the product of $D(a_i, 0)$ for all $i$, that is:
$$D(p) = \prod_{i=1}^{k} D(a_i, 0).$$
To calculate $D(a_1, 0)$, we observe that the total number of graphical structures for $a_1$ is given by the Catalan number:
$$|J_{a_1}| = \frac{1}{a_1+1}\binom{2a_1}{a_1}.$$
Using the recursive definition, $D(a_1, 0)$ can be computed by summing over all valid tuples $p \in T(\beta)_{a_1}$:
$$|J_{a_1}| = \sum_{p \in T(\beta)_{a_1}} D(p).$$
For $D(n, 0)$, the value is determined by subtracting the contributions of all tuples $p \in T(\beta)_n \setminus \{(n, 0)\}$ from the total number of structures:
$$D(n, 0) = \frac{1}{n+1}\binom{2n}{n} - \sum_{p \in T(\beta)_n \setminus \{(n, 0)\}} D(p).$$
Finally, the size of the set of equivalence classes of words from $\Sigma(\beta)_n$ is given by:
| O ( β ) e ( n ) | = p T ( β ) n D ( p ) × x ( p ) n ,
where $x(p)$ is calculated recursively based on the tuples $p \in T(\beta)_n$. □
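The recursion for $D$ can be checked numerically for small $n$. The following Python sketch is a minimal illustration, assuming that tuples are indexed by binary strings in which each $a_i$ is the length of a maximal run of 0s, so that $D(p)$ is the product of $D(a_i, 0)$ over those runs. The function names `catalan`, `zero_runs`, `d_block`, and `d_of_binary` are hypothetical, and the quantity $x(p)$ is not computed here.

```python
from functools import lru_cache
from itertools import product
from math import comb, prod


def catalan(n):
    """The n-th Catalan number, (1 / (n + 1)) * C(2n, n)."""
    return comb(2 * n, n) // (n + 1)


def zero_runs(bits):
    """Lengths of the maximal runs of 0s in a sequence of 0/1 digits."""
    return [len(run) for run in "".join(map(str, bits)).split("1") if run]


@lru_cache(maxsize=None)
def d_block(n):
    """D(n, 0): structures on n columns, none of which has a straight scaffold."""
    if n == 0:
        return 1  # D(0, 0) = 1
    total = 0
    for bits in product((0, 1), repeat=n):
        if any(bits):  # skip the all-zero string, i.e. the tuple (n, 0)
            total += prod(d_block(a) for a in zero_runs(bits))
    return catalan(n) - total


def d_of_binary(bits):
    """D(p) for the tuple encoded by `bits`: the product of D(a_i, 0)."""
    return prod(d_block(a) for a in zero_runs(bits))


print([d_block(n) for n in range(5)])
```

By construction, summing `d_of_binary` over all binary strings of length $n$ returns the Catalan number, which mirrors the identity $|J_{a_1}| = \sum_p D(p)$ used in the proof. The enumeration is exponential in $n$ and is meant only for small checks.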
Given a graphical structure with n straight scaffolds, there exist the following different types of staple segment:
  • a cup if the staple segment is $s = (i_t, j_t)$, where $0 < |j-i| = 1$,
  • an extended_cup if the staple segment is $s = (i_t, j_t)$, where $0 < |j-i| = 2k+1$ for $k \ge 1$,
  • a cap if the staple segment is $s = (i_b, j_b)$, where $0 < |j-i| = 1$,
  • an extended_cap if the staple segment is $s = (i_b, j_b)$, where $0 < |j-i| = 2k+1$ for $k \ge 1$,
  • a junction if the staple segment is $s = (i_t, j_b)$, where $0 < |j-i| = 2k$ for $k \ge 1$.
Without loss of generality, we omit the superscripts $B, L, R$ that represent where the two endpoints of the staple are placed on the scaffolds. Figure 16 shows the five different types of staple segment with $n$ columns.
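The five segment types depend only on the levels of the two endpoints and on the column distance, so they can be sketched as a small classifier. The code assumes that caps join two bottom endpoints (by analogy with extended caps) and that a junction may be listed with its top or bottom endpoint first. The name `classify_segment` and the `(column, level)` encoding are hypothetical.

```python
def classify_segment(start, end):
    """Classify a staple segment given endpoints (column, level), level in {'t', 'b'}.

    Returns 'cup', 'extended_cup', 'cap', 'extended_cap', 'junction', or None
    when the segment matches none of the five types.
    """
    (i, p), (j, q) = start, end
    if p not in ("t", "b") or q not in ("t", "b"):
        raise ValueError("levels must be 't' or 'b'")
    width = abs(j - i)
    if p == q and width % 2 == 1:
        kind = "cup" if p == "t" else "cap"
        # Distance 1 gives the plain type; odd distances of 3 or more are extended.
        return kind if width == 1 else "extended_" + kind
    if p != q and width >= 2 and width % 2 == 0:
        return "junction"
    return None


print(classify_segment((1, "t"), (6, "t")))
print(classify_segment((5, "t"), (1, "b")))
```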
Given a graphical structure with $n$ straight scaffolds, the staple segments of types extended_cup, extended_cap, and junction are the building blocks of the bridges defined below.
Given a graphical structure of width $n$, a staple segment $s = (i_p^d, j_q^d)$ of endpoints, where $p, q \in \{t, b\}$, $d \in \{B, L, R\}$, and $1 \le i, j \le n$, spans columns from $i$ to $j$ if $i < j$ (or from $j$ to $i$ if $j < i$).
We call a set of staple segments a bridge of width $n$ if it satisfies the following properties (see Figure 17 for an example of a bridge of width $n$):
(i) A bridge of width $n$ is a subset of the staple set $P$ of the given graphical structure,
(ii) For any staple segment $s = (i_p^d, j_q^d)$ in a bridge of width $n$, $|j-i| = 2k$ for $k \ge 1$,
(iii) For any staple segment $s = (i_p^d, j_p^d)$ in a bridge of width $n$, $|j-i| = 2k+1$ for $k \ge 0$,
(iv) A bridge of width $n$ consequently spans $n$ columns,
(v) There is no proper subset of a bridge of width $n$ which is a bridge of width $n$.
Figure 17. An example of a bridge of width $n$. (a) A set $\{(1_t^L, 3_b^L), (5_t^L, 2_t^L), (4_b^L, 6_t^R)\}$ of staple segments is a bridge of width 6. (b) There exists a set $\{(1_t^L, 5_b^R), (6_b^R, 4_t^L), (4_b^R, 1_b^L)\}$ that violates the property (v) since a proper subset $\{(6_b^R, 4_t^L), (4_b^R, 1_b^L)\}$ is a bridge of width 6. Therefore, the set $\{(1_t^L, 5_b^R), (6_b^R, 4_t^L), (4_b^R, 1_b^L)\}$ is not a bridge of width 6 whereas the set $\{(1_t^L, 5_b^L), (6_b^R, 4_t^L)\}$ is a bridge. (c) A set $\{(1_t^L, 3_b^R), (6_b^R, 2_t^L)\}$ is a bridge of width 6. (d) A staple segment $(5_t^L, 2_t^L)$ violates the property (ii) for a bridge of width 7. A set $\{(1_t^L, 7_b^R)\}$ is a bridge of width 7.
We define n to represent a set of all bridges of width n.
For a bridge of width $n$ there are consecutive endpoints at the top and at the bottom which are not spanned by the bridge. For instance, there are consecutive endpoints $(2_t, \ldots, 5_t)$ and $(1_b, \ldots, 4_b)$ for a bridge $\{(1_t^L, 5_b^R)\}$ of width 5.
We recall that a binary string of length $n$ represents the graphical structures, and $T(\alpha)_n = \{p \mid p = (a_1, b_1, \ldots, a_k, b_k),\ k \ge 1,\ 0 \le a_1, b_k \le n,\ 1 \le a_i, b_i \le n \text{ for all } i, \text{ and } \sum_{i=1}^{k}(a_i + b_i) = n\}$ denotes the set of tuples corresponding to the binary string.
Given a set of $2n$ positions, we calculate the number of partial graphical structures with $t$ caps (or cups) as follows:
$$D(2n, t) = \sum_{\substack{n_1 + \cdots + n_i = n,\ t_1 + \cdots + t_i = t, \\ n_j \ge t_j \ge 1 \text{ for all } j \le i}} \ \prod_{j=1}^{i} D(2n_j - 2, t_j)$$
with initial conditions $D(2, 1) = D(0, 1) = 1$ and $D(2n, t) = 0$ for all $n < t$.

5. Conclusions

This study presents a theoretical framework for side-aware DNA origami words, accounting for the directional attachment of staples to either the left or right side of the scaffold. This expansion of the traditional real and virtual staple classification to include left, right, virtual, and block types offers a more realistic representation of DNA origami structures, enabling the design of more intricate configurations.
We establish a rewriting system for side-aware DNA origami words and explore the properties of the corresponding graphical structures, with a focus on equivalence classes and rewriting patterns. Our model highlights the role of staple binding directionality in shaping the final structure, enhancing control over positioning and folding, which is crucial for applications in molecular devices and nanoscale assembly.
Although the complexity of rewriting rules in the $G_{max}(n)$ case poses challenges for computational efficiency, future work will consider the $G_{mid}(n)$ and $G_{min}(n)$ contexts. These will help examine their impact on rewriting patterns and structural diversity, offering deeper insights into the role of virtual elements in DNA origami design. Additionally, identifying equivalence classes streamlines the design process by grouping structurally similar configurations, improving both the efficiency and flexibility of DNA origami structures. This classification is especially useful for optimizing stable designs and their applications in molecular devices and nanoscale assembly.
As a next step, we plan to validate the model through molecular simulations, specifically focusing on how structural width and staple types influence the overall shape. This will confirm the model’s accuracy and demonstrate its potential for real-world applications.
Finally, while this study focuses on 2D side-aware DNA origami, the framework can be extended to 3D structures. Addressing challenges related to staple orientation and computational efficiency will be key, and future work will explore these extensions to enable the design of more complex 3D DNA origami structures for advanced applications in molecular devices and nanoscale assembly.

Funding

This work was supported by Institute of Information and Communications Technology Planning and Evaluation (IITP) under the Artificial Intelligence Convergence Innovation Human Resources Development (IITP-2024-RS-2023-00255968) grant and the National Research Foundation of Korea (NRF-2022R1G1A1013287).

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The author declares no conflicts of interest.

References

  1. Whitesides, G.M.; Boncheva, M. Beyond Molecules: Self-assembly of mesoscopic and macroscopic components. Proc. Natl. Acad. Sci. USA 2002, 99, 4769–4774.
  2. Evans, C.G.; Winfree, E. Physical principles for DNA tile self-assembly. Chem. Soc. Rev. 2017, 46, 3808–3829.
  3. Winfree, E.; Eng, T.; Rozenberg, G. String Tile Models for DNA Computing by Self-Assembly. In Proceedings of the 6th International Workshop on DNA-Based Computers, Leiden, The Netherlands, 13–17 June 2000; pp. 63–88.
  4. Rothemund, P.W.K. Folding DNA to create nanoscale shapes and patterns. Nature 2006, 440, 297–302.
  5. Garrett, J.; Jonoska, N.; Kim, H.; Saito, M. DNA origami words, graphical structures and their rewriting systems. Nat. Comput. 2021, 20, 217–231.
  6. Zadegan, R.M.; Norton, M.L. Structural DNA nanotechnology: From design to applications. Int. J. Mol. Sci. 2012, 13, 7149–7162.
  7. Kim, H.; Surwade, S.P.; Powell, A.; O’Donnell, C.; Liu, H. Stability of DNA Origami Nanostructure under Diverse Chemical Environments. Chem. Mater. 2014, 26, 5265–5273.
  8. Voigt, N.V.; Tørring, T.; Rotaru, A.; Jacobsen, M.F.; Ravnsbæk, J.B.; Subramani, R.; Mamdouh, W.; Kjems, J.; Mokhir, A.; Besenbacher, F.; et al. Single-molecule chemical reactions on DNA origami. Nat. Nanotechnol. 2010, 5, 200–203.
  9. Lin, C.; Liu, Y.; Rinker, S.; Yan, H. DNA tile based self-assembly: Building complex nanoarchitectures. ChemPhysChem 2006, 7, 1641–1647.
  10. Douglas, S.M.; Dietz, H.; Liedl, T.; Högberg, B.; Graf, F.; Shih, W.M. Self-assembly of DNA into nanoscale three-dimensional shapes. Nature 2009, 459, 414–418.
  11. Yin, P.; Hariadi, R.F.; Sahu, S.; Choi, H.M.T.; Park, S.H.; LaBean, T.H.; Reif, J.H. Programming DNA tube circumferences. Science 2008, 321, 824–826.
  12. SantaLucia, J.; Allawi, H.T.; Seneviratne, P.A. Improved nearest-neighbor parameters for predicting DNA duplex stability. Biochemistry 1996, 35, 3555–3562.
  13. Wei, B.; Dai, M.; Yin, P. Complex shapes self-assembled from single-stranded DNA tiles. Nature 2012, 485, 623–626.
  14. Ke, Y.; Ong, L.L.; Shih, W.M.; Yin, P. Three-dimensional structures self-assembled from DNA bricks. Science 2012, 338, 1177–1183.
  15. Adamczyk, A.K.; Huijben, T.A.P.M.; Sison, M.; Di Luca, A.; Chiarelli, G.; Vanni, S.; Brasselet, S.; Mortensen, K.I.; Stefani, F.D.; Pilo-Pais, M.; et al. DNA self-assembly of single molecules with deterministic position and orientation. ACS Nano 2022, 16, 16924–16931.
  16. Lee, C.; Lee, J.Y.; Kim, D.-N. Polymorphic design of DNA origami structures through mechanical control of modular components. Nat. Commun. 2017, 8, 2067.
  17. Book, R.V.; Otto, F. String-Rewriting Systems; Springer: Berlin/Heidelberg, Germany, 1993.
  18. Kauffman, L.H. Knots and Physics; World Scientific: Singapore, 2001.
  19. Borisavljević, M.; Došen, K.; Petric, Z. Kauffman Monoids. J. Knot Theory Its Ramif. 2002, 11, 127–143.
  20. Lau, K.W.; FitzGerald, D.G. Ideal Structure of the Kauffman and Related Monoids. Commun. Algebra 2006, 34, 2617–2629.
Figure 1. Graphical representation of the Jones monoid $J_5$. (a) The generators $h_1$, $h_2$, and $h_3$ in $J_5$. (b) The relation $h_1 h_2 h_1 = h_1$. (c) The relation $h_1 h_1 = h_1$. (d) The relation $h_1 h_3 = h_3 h_1$.
Figure 2. An example of four different types of staple segments over $E_5$. Scaffolds are represented by black plain lines, virtual staples are represented by grey dotted lines, left and right staples are represented by red dotted lines, and block staples are represented by pink dotted lines. The virtual staple is not present and can be extended to left, right, or block according to its neighbor endpoints. In the figure, we arbitrarily draw the virtual staple on the right side of the scaffolds. For convenience, the representation of virtual staples can be omitted.
Figure 4. The set $C$ of scaffolds of a graphical structure $G(w) = (C, P)$ of a word $w = \alpha_1 \beta_1$. We omit the representation of staples. For all scaffolds in $C_1$, replace the subscript $b$ by $m$, and for all staples in $C_2$, replace the subscript $t$ by $m$. Given set $C_1 \cup C_2$, for a sequence $(2_t, 2_m), (2_m, 1_m), (1_m, 1_t)$, add $(2_t, 1_t)$ to $C$. For a sequence $(3_b, 3_m), (3_m, 3_t)$, add $(3_b, 3_t)$ to $C$. Furthermore, add $(1_b, 2_b)$ to $C$.
Figure 5. An example of concatenation of two graphical structures. Each case (a–d) describes the concatenation process of staples.
Figure 6. An example of $G(w_1)$ and $G(w_2)$ for the cap case and the graphical structures after concatenating six different generators. The top two images illustrate the graphical structures for $w_1$ and $w_2$. (i–vi) illustrate the concatenation of $G(w_1)$ with each of the six different generators $\alpha_{i-2}, \alpha_{i-1}, \alpha_i, \beta_{i-2}, \beta_{i-1}, \beta_i$, and the concatenation of $G(w_2)$ with each of the six different generators $\alpha_{i-2}, \alpha_{i-1}, \alpha_i, \beta_{i-2}, \beta_{i-1}, \beta_i$.
Figure 7. An example of $G(w_1)$ and $G(w_2)$ for the cup case, and the graphical structures after concatenating six different generators. (i–vi) illustrate the concatenation of each of the six different generators $\alpha_{i-2}, \alpha_{i-1}, \alpha_i, \beta_{i-2}, \beta_{i-1}, \beta_i$ and $G(w_1)$, and the concatenation of each of the six different generators $\alpha_{i-2}, \alpha_{i-1}, \alpha_i, \beta_{i-2}, \beta_{i-1}, \beta_i$ and $G(w_2)$.
Figure 8. An example of $G(w_1)$ and $G(w_2)$ when $s(i)$ is right and $r(i)$ is block, and the graphical structures before and after concatenating four different generators. (i–viii) illustrate the case when the right staple from $G(w_1)$ affects the concatenation of $G(w_1)$ with four different generators and the case when the block staple from $G(w_2)$ affects the concatenation of $G(w_2)$ with four different generators.
Figure 9. An example of $G(w_1)$ and $G(w_2)$ when $s(i)$ is left and $r(i)$ is block, and the graphical structures before and after concatenating four different generators. (i–viii) illustrate the case when the left staple from $G(w_1)$ affects the concatenation of $G(w_1)$ with four different generators and the case when the block staple from $G(w_2)$ affects the concatenation of $G(w_2)$ with four different generators.
Figure 10. (i–viii) illustrate the graphical structures after concatenating four different generators on top, see (i–iv), and the bottom, see (v–viii), with $G(w_1)$ and $G(w_2)$ when $S(i)$ is virtual and $r(i)$ is right.
Figure 11. (i–viii) illustrate the graphical structures after concatenating four different generators on top, see (i–iv), and the bottom, see (v–viii), with $G(w_1)$ and $G(w_2)$ when $S(i)$ is virtual and $r(i)$ is left.
Figure 12. (i–viii) illustrate the graphical structures after concatenating four different generators on top, see (i–iv), and the bottom, see (v–viii), with $G(w_1)$ and $G(w_2)$ when $S(i)$ is virtual and $r(i)$ is block.
Figure 13. The transition graph of the changes in the different staple types for $w_1 = \beta_i$ and $w_2 = \beta_i \beta_{i-1} \beta_i$.
Figure 14. The transition graph of the changes in the different staple types for $w_1 = \beta_i$ and $w_2 = \beta_i \beta_{i+1} \beta_i$.
Figure 15. An example of $G(w)$ and $G(w')$. Each number $i$ illustrates the $i$th property in the lemma.
Figure 16. (a) Shows an example of staple segments extended_cup and extended_cap. Given a graphical structure with 6 columns, staple segments $(1_t, 6_t)$, $(3_t, 2_t)$, $(5_t, 4_t)$ are extended_cup, and $(2_b, 1_b)$, $(4_b, 3_b)$, $(6_b, 5_b)$ are extended_cap. (b) Shows an example of staple segments junction and extended_cap. With 5 columns, $(2_b, 5_b)$ is extended_cap, and $(5_t, 1_b)$ is junction.
Table 1. Units for generators of odd $i$ (pairs are reversed for even $i$).

|  | $R_c$ | $R_p$ |
|---|---|---|
| $\alpha_i$ | $(i_b, i_t)$, $((i+1)_t, (i+1)_b)$ | $(i_t^L, (i+1)_t^L)$, $((i+1)_b^R, i_b^R)$ |
| $\beta_i$ | $((i+1)_t, i_t)$, $(i_b, (i+1)_b)$ | $(i_t^L, i_b^L)$, $((i+1)_b^R, (i+1)_t^R)$ |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Cho, D.-J. Formalization of Side-Aware DNA Origami Words and Their Rewriting System, and Equivalent Classes. Mathematics 2025, 13, 895. https://doi.org/10.3390/math13060895
