Abstract
DNA origami is a powerful technique for constructing nanoscale structures by folding a single-stranded DNA scaffold with short staple strands. While traditional models assume staples bind to a fixed side of the scaffold, we introduce a side-aware DNA origami framework that incorporates the directional binding of staples to either the left or right side. The graphical representation of DNA origami is described using rectangular basic modules of scaffolds and staples, which we refer to as symbols in side-aware DNA origami words. We further define the concatenation of these symbols to represent side-aware DNA origami words. A set of rewriting rules is introduced to define equivalent words that correspond to the same graphical structure. Finally, we compute the number of possible structures by determining the equivalence classes of these words.
MSC:
92-10
1. Introduction
DNA is an ideal material for building nanostructures due to its unique features, such as molecular recognition, self-assembly, programmability, predictable nanoscale structure, and ease of synthesis. Self-assembly is a process where components, typically molecules, autonomously organize into a larger structure [1]. Two primary methods have been developed for constructing DNA nanostructures: tile-based self-assembly [2,3] and the DNA origami method, introduced by Rothemund [4,5]. The tile-based approach is effective for building small, simple structures with repeated domains, whereas the DNA origami method excels at creating larger, more complex, and precisely defined structures [6]. The term “origami” originates from the Japanese word for folding paper into a special shape. Similarly, DNA origami builds complex two- and three-dimensional nanostructures by folding a single-stranded DNA plasmid, called the scaffold, which outlines a shape. Short DNA strands, called staples, connect different parts of the scaffold. The specificity of interactions between complementary base pairs makes DNA a highly programmable material, allowing precise control over the resulting structure through sequence design. Following Rothemund’s pioneering work on DNA origami, significant advancements have been made in designing DNA nanostructures through scaffold and staple sequence optimization. Researchers have focused on improving the scalability, stability, and accuracy of DNA origami, addressing challenges like minimizing defects and optimizing folding efficiency [7]. The development of functional DNA origami has expanded the design space by tailoring nanostructures for specific uses, such as drug delivery, molecular sensors, and biosensing [8]. Modular DNA origami, which combines smaller units into larger and more complex structures, has created new opportunities for dynamic and multifunctional designs [9].
Theoretical research in DNA origami primarily focuses on understanding the mechanisms of self-assembly and optimizing computational aspects of sequence design. Major areas of research include developing sequence design algorithms for scaffolds and staples [10], thermodynamic modeling to predict folding stability and efficiency [11,12], and combinatorial optimization to minimize errors in assembly [13]. Additionally, studies on folding kinetics and assembly pathways aim to improve the speed and accuracy of self-assembly [14]. These theoretical approaches not only enhance the efficiency of DNA origami design but also offer insights into the underlying principles governing molecular self-assembly. Recently, Garrett et al. [5] introduced a graphical framework for DNA origami structures by defining a graphical alphabet to represent DNA origami and a corresponding word rewriting system. They analyzed the equivalence classes of DNA origami words.
Our research builds upon the work of Garrett et al. [5], but we extend it by considering the fact that staples can bind to either the left or right side of the scaffold, rather than being fixed to one side. Traditional designs often assume fixed-side binding, but recent studies have demonstrated that the orientation of staple strands can significantly influence the positioning and orientation of attached molecules, such as fluorophores, within the DNA origami structure [15,16]. To address this, we propose a formal framework for describing side-aware DNA origami words, which explicitly captures the directional binding of staples to either side of the scaffold. This framework provides a more comprehensive understanding of how directional binding impacts the resulting DNA origami structures.
The main contributions of this work are as follows:
- We introduce the concept of side-aware DNA origami words, explicitly modeling the directional binding of staples (left or right) to the scaffold. This new concept accommodates the representation of more flexible and biologically realistic DNA origami structures, and is described in Section 3.
- We define the side-aware DNA origami word rewriting system and analyze the properties of its graphical structures, focusing on equivalence classes. Since staples can bind to either side of the scaffold or fail to form stable structures if blocked by staples from the opposite side, the diversity of rewriting patterns increases. We categorize all patterns arising from concatenation and define them as rewriting rules for side-aware DNA origami words. This is described in Section 4.
2. Preliminaries
An alphabet $\Sigma$ is a non-empty finite set of symbols. A word $w = a_1 a_2 \cdots a_n$ is a finite sequence of n symbols over $\Sigma$, and $|w| = n$ denotes the size of the word. We use $\varepsilon$ to denote the empty word. A subword, or a factor, of a word $w$ is $a_i a_{i+1} \cdots a_j$ where $1 \le i \le j \le n$. We use $\Sigma^*$ to denote the set of all words over $\Sigma$. The concatenation of two words x and y is denoted by $x \cdot y$, or simply $xy$.
A word rewriting system consists of an alphabet and a set of rewriting rules. In this paper, is finite. R generates an equivalence relation on . An element of R is called a rewriting rule, and is written as . In general, we rewrite as for if , and denote such rewriting by . For a sequence of words in a rewriting system , we write . We consider and denote an equivalence class of a word w as . A word is irreducible if for all . We consider the set of equivalence classes . The reader may refer to Book and Otto [17] for more information about word rewriting systems.
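As a small illustration of a word rewriting system, the following sketch computes irreducible words by repeatedly applying length-reducing rules. The alphabet and rules are a toy choice made for this example (the letters "1" and "2" stand for the Jones monoid generators $h_1$ and $h_2$ of $J_3$, introduced below); they are not the origami alphabet of this paper, and the function names are hypothetical.

```python
from itertools import product

# Toy rules over the alphabet {"1", "2"} (an assumption for illustration):
# h_i h_i -> h_i and h_i h_j h_i -> h_i for |i - j| = 1.
RULES = [("11", "1"), ("22", "2"), ("121", "1"), ("212", "2")]


def rewrite_once(word, rules=RULES):
    """Apply the first rule that matches, at its leftmost occurrence.

    Returns the rewritten word, or None if the word is irreducible.
    """
    for lhs, rhs in rules:
        pos = word.find(lhs)
        if pos != -1:
            return word[:pos] + rhs + word[pos + len(lhs):]
    return None


def normal_form(word, rules=RULES):
    """Rewrite until no rule applies; every rule shortens the word, so this halts."""
    while True:
        rewritten = rewrite_once(word, rules)
        if rewritten is None:
            return word
        word = rewritten


def normal_forms_up_to(max_len, alphabet="12"):
    """Collect the irreducible words reached from all words of length <= max_len."""
    forms = set()
    for length in range(max_len + 1):
        for letters in product(alphabet, repeat=length):
            forms.add(normal_form("".join(letters)))
    return forms
```

For this toy system, `normal_forms_up_to(4)` yields five irreducible words, one per element of $J_3$, so counting irreducible words counts equivalence classes here. This relies on the toy rules being confluent; a general rewriting system need not have a unique irreducible word per class.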
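The Jones monoid $J_n$ is defined as a monoid with generators $h_1, h_2, \ldots, h_{n-1}$ that satisfy three types of relations:
- $h_i h_i = h_i$
- $h_i h_j h_i = h_i$ for $|i - j| = 1$
- $h_i h_j = h_j h_i$ for $|i - j| \ge 2$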
It is a monoid counterpart of the Temperley–Lieb algebra, which has been extensively studied in physics and knot theory [18,19,20].
Figure 1 shows the graphical representation of the generators and relations in the Jones monoid. Each element in $J_n$ is depicted with n endpoints at both the top and bottom. The generator $h_i$ connects the ith and (i+1)st endpoints at both the top and bottom, with the remaining endpoints connected by vertical lines. For instance, Figure 1a illustrates the generators $h_1$, $h_2$, and $h_3$ in $J_4$: $h_1$ connects the 1st and 2nd points at both the top and bottom, $h_2$ connects the 2nd and 3rd points, and $h_3$ connects the 3rd and 4th points. The multiplication of two elements corresponds to the concatenation of diagrams, where the diagram of the first element is placed above the second, and closed loops are removed. Relations 1, 2, and 3 are also depicted graphically in Figure 1 (b)–(d), respectively. Two elements in the Jones monoid are equal if their graphical representations are equivalent, that is, they have the same set of connecting segments between endpoints after deleting internal loops. For any two words that have equivalent diagrams, one word can be rewritten to the other using a sequence of relations 1 to 3. To simplify DNA origami structures, Garrett et al. [5] introduced a base unit for DNA origami words using the Jones monoid framework. Their approach considers the endpoints of scaffolds and staples that are visible along the top and bottom borders of the overall structure. We adopt the same approach as Garrett et al. [5], but we further include the left or right bonding side of staples to the scaffold in our model.
Figure 1.
Graphical representation of the Jones monoid $J_n$. (a) The generators $h_1$, $h_2$, and $h_3$ in $J_4$. (b)–(d) The three defining relations of the Jones monoid.
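The diagram calculus described above can be sketched in a few lines. In this sketch a diagram is stored as a symmetric pairing of the points `("t", k)` and `("b", k)`; this encoding and the function names are assumptions made for illustration, not notation from the paper.

```python
def generator(n, i):
    """Diagram of the i-th generator on n columns (1-indexed)."""
    pairing = {}

    def link(p, q):
        pairing[p] = q
        pairing[q] = p

    link(("t", i), ("t", i + 1))      # arc joining two adjacent top points
    link(("b", i), ("b", i + 1))      # arc joining two adjacent bottom points
    for k in range(1, n + 1):
        if k not in (i, i + 1):
            link(("t", k), ("b", k))  # vertical line
    return pairing


def identity(n):
    """Diagram with n vertical lines."""
    pairing = {}
    for k in range(1, n + 1):
        pairing[("t", k)] = ("b", k)
        pairing[("b", k)] = ("t", k)
    return pairing


def compose(a, b):
    """Place diagram a above diagram b and trace strands through the seam.

    Closed loops in the middle are never visited, so they are dropped.
    """
    result = {}
    starts = [p for p in a if p[0] == "t"] + [p for p in b if p[0] == "b"]
    for start in starts:
        in_a = start[0] == "t"
        point = start
        while True:
            nxt = a[point] if in_a else b[point]
            outer = "t" if in_a else "b"
            if nxt[0] == outer:           # reached an outer endpoint
                result[start] = nxt
                break
            # Cross the seam: bottom of a is glued to top of b, column-wise.
            point = ("t", nxt[1]) if in_a else ("b", nxt[1])
            in_a = not in_a
    return result


def monoid_elements(n):
    """All diagrams reachable from the identity by multiplying generators."""
    gens = [generator(n, i) for i in range(1, n)]
    start = identity(n)
    seen = {frozenset(start.items())}
    frontier = [start]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = compose(x, g)
            key = frozenset(y.items())
            if key not in seen:
                seen.add(key)
                frontier.append(y)
    return seen
```

With this encoding, two words are equal in the monoid exactly when their composed pairings are equal as dictionaries, and `len(monoid_elements(n))` gives the number of distinct diagrams on n columns (the Catalan number for small n checked here).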
3. Formalization of Side-Aware DNA Origami Words
The schematics of the DNA origami structure presented in this paper build upon previous research on DNA origami words and rewriting systems [5]. A key distinction is that the earlier model does not record the side on which a staple endpoint lies and therefore has no block staples: any pair of staple endpoints is connected as long as they occupy the same position. In this section, we introduce a formal framework to describe side-aware DNA origami words, incorporating the directional binding of staples to either the left or right side of the scaffold, and the concatenation of graphical structures of side-aware DNA origami words. A brief introduction to side-aware DNA origami words and their concatenation is provided as follows:
- Side-aware DNA origami words, consisting of scaffolds and staples, are represented by graphical structures of width n. These graphical structures contain n columns, and each column has two endpoints: top and bottom. Scaffolds and staples are represented by directed pairs of endpoints. By combining the set of pairs of scaffolds and the set of pairs of staples, we define a graphical structure. Details are described in Section 3.1.
- We define the concatenation of two graphical structures of side-aware DNA origami words. This concatenation is based on the relations of the Jones monoid and its graphical representation, as described in Figure 1. Unlike the classical concatenation of words, the concatenation of graphical structures is performed vertically, aligning their columns. Details are described in Section 3.2.
3.1. Side-Aware DNA Origami Words
We define columns consisting of scaffolds and staples, where the staples align along the scaffolds with the following properties:
- We observe that there are two different types of motifs in the DNA origami structure: there are places where two adjacent scaffolds connect the two columns, and also places where two adjacent staples connect the two columns.
- The scaffolds and staples have directions: adjacent scaffolds are anti-parallel, and a staple is anti-parallel to the connected scaffold.
- Each staple end can reside on two different sides of the connected scaffold—left and right.
To represent such scaffolds and staples by directed pairs of endpoints, we first define the set of geometric positions to represent locations of points. Given n as the width of the graphical structure, we first define (respectively, ) to represent a position at the top (respectively, bottom) of the ith column for . We assume that scaffolds at the ith column go upward if i is odd, and downward if i is even, and the direction of the staples is opposite from the direction of the scaffolds. The set of geometric positions for given n is defined as .
A scaffold segment is defined by an ordered pair of endpoints where . For a staple segment, each segment is categorized into four different types:
- Virtual (V): A staple is not present on the given column (we formally regard this “absence” of a staple as one type of staple for the concise definition of concatenation of structures). Other staples can be extended on a virtual staple through concatenation.
- Left (L): Two endpoints of the staple are on the left side of the scaffolds.
- Right (R): Two endpoints of the staple are on the right side of the scaffolds.
- Block (B): A staple is disconnected in-between. Other staples cannot be extended on a block staple through concatenation.
Thus, a staple segment is defined by an ordered pair of endpoints where and . Figure 2 shows an example of staple segments over . Then, a graphical structure is defined as a pair of a set of scaffold segments and a set of staple segments that satisfies the following condition: each position in is occupied exactly once by endpoints in (). For any segment of endpoints, we define the reverse of the segment as , which is defined similarly to a set of segments.
Figure 2.
An example of four different types of staples segments over . Scaffolds are represented by black plain lines, virtual staples are represented by grey dotted lines, left and right staples are represented by red dotted lines, and block staples are represented by pink dotted lines. The virtual staple is not present and can be extended to left, right, or block according to their neighbor endpoints. In the figure, we arbitrarily draw the virtual staple on the right side of the scaffolds. For convenience, representation of virtual staples can be omitted.
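The definition of a graphical structure can be checked mechanically. The sketch below is a minimal validator under stated assumptions: positions are encoded as `("t", i)` and `("b", i)`, a staple segment carries a type letter from V, L, R, B, and the occupancy condition is read as "every position is used exactly once by scaffold endpoints and exactly once by staple endpoints". The encoding and names are hypothetical.

```python
STAPLE_TYPES = {"V", "L", "R", "B"}  # virtual, left, right, block


def positions(n):
    """All top and bottom positions of a structure of width n."""
    return [(side, i) for side in ("t", "b") for i in range(1, n + 1)]


def is_graphical_structure(n, scaffolds, staples):
    """Check the occupancy condition for scaffolds and staples.

    scaffolds: list of (start, end) position pairs.
    staples:   list of (start, end, kind) triples with kind in STAPLE_TYPES.
    """
    if any(kind not in STAPLE_TYPES for (_, _, kind) in staples):
        return False
    scaffold_ends = [p for segment in scaffolds for p in segment]
    staple_ends = [e for (p, q, _) in staples for e in (p, q)]
    expected = sorted(positions(n))
    return sorted(scaffold_ends) == expected and sorted(staple_ends) == expected
```

For example, a width-2 structure with an upward scaffold in column 1, a downward scaffold in column 2, and anti-parallel straight staples passes the check, while dropping one staple makes it fail.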
For a given width n, we define basic modules and corresponding generators as an alphabet for DNA origami words with the order . We say that is complementary to , and vice versa. For each generator and we have four maps as described below:
We call four maps that describe (respectively, ) units for (respectively, ). Note that units are reversed for even is. The units for each generator describe their structures between the ith and the st columns.
Table 1 shows the units for the generators with odd i.
Table 1.
Units for generators with odd i (pairs are reversed for even i).
The full graphical structure for must include not only the units for the ith and st columns but also the remaining columns to fully represent a complete structure of size n. The structure of each generator has a context which consists of a scaffold and a staple for odd ks and their reverses for even ks where . Figure 3 illustrates the graphical structure of , and of size 3, including the context corresponding to the 3rd column, with . We notice that context can have real or virtual pairs of scaffolds and staples. Based on the choice of r, we observe that the context can include either real or virtual pairs of scaffolds and staples. Depending on the choice of real or virtual context, different structural descriptions of graphical structures are possible [5]:
- : The entire context is real, with both and being empty.
- : The context for the scaffold is real, while the context for the staple is virtual.
- : The entire context is virtual.
In this study, we focus on , as the four types of staples enhance the structural diversity.
Figure 3.
Graphical structure of s and , where . Real scaffolds are represented by black lines and real staples are represented by red dashed lines. Virtual staples are grey dashed lines. Virtual staples can be assigned to either the left or right side of a scaffold, but in this figure, we assign the virtual staples to the right side of a scaffold. In the graphical structure of , the context contains and . The pair shows that the virtual staple has possible strands to either the left or right of the scaffold .
3.2. Concatenation of Side-Aware DNA Origami Words
We now introduce how side-aware DNA origami structures are concatenated based on the relations of Jones monoid and its graphical representation. The graphical structure corresponding to the concatenation of two words is formed by placing the graphical structure of the first word on top of that of the second word. When a virtual staple meets a left (or right) staple, the virtual staple becomes the left (or right) staple. This process demonstrates that the empty space represented by a virtual staple can be extended through concatenation, depending on whether it encounters a left or right staple. In contrast, if a left (or right) staple meets a right (or left) staple, both staples become blocked, meaning the staples are disconnected.
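The way staple types combine when staples meet can be sketched as a small merge function. Two behaviours are taken from the description above: a virtual staple adopts the type of the staple it meets, and left meeting right gives block. Two further behaviours are assumptions of this sketch: equal types are kept unchanged, and a block staple stays block whatever it meets. The function names are hypothetical, and the geometric conditions of the formal definition are not modelled.

```python
from functools import reduce


def merge_pair(x, y):
    """Combine the types of two staples that meet during concatenation."""
    if x == "B" or y == "B":
        return "B"            # assumption: block absorbs every other type
    if x == "V":
        return y              # virtual adopts the other staple's type
    if y == "V":
        return x
    return x if x == y else "B"   # left meets right: both become blocked


def merge_chain(types):
    """Type of the staple obtained from a chain of connected staple segments."""
    return reduce(merge_pair, types, "V")
```

Because `merge_pair` is associative and commutative under these assumptions, the order in which a chain is merged does not matter.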
For two graphical structures of words and , the concatenation is defined as follows: suppose and are graphical structures of and , respectively. We now describe the procedure to define the graphical structure , the concatenation of the two words and , with the set of scaffolds and the set of staples. Note that .
First, the set of scaffolds is obtained as follows:
- (i)
- For all scaffolds in , replace the subscript b by m.
- (ii)
- For all scaffolds in , replace the subscript t by m.
- (iii)
- Given set , for each sequenceof scaffolds where points and have subscripts t or b, add to . Figure 4 shows an example of the set of scaffolds of a graphical structure of a word .
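Figure 4. The set of scaffolds of the graphical structure of a concatenated word. We omit the representation of staples. The subscript b is replaced by m in all scaffolds of the upper structure, and the subscript t is replaced by m in all scaffolds of the lower structure. Each sequence of connected scaffolds whose two outer endpoints have subscripts t or b then contributes a single scaffold, joining those two outer endpoints, to the resulting set.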
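Steps (i)–(iii) for scaffolds can be sketched as follows. The sketch assumes that scaffolds are stored as directed pairs of positions `("t", i)` / `("b", i)`, that the first structure is placed on top, and that both inputs are valid structures of the same width with consistent column directions; the encoding and function names are hypothetical.

```python
def relabel(point, old_side):
    """Rename a seam-side endpoint to the middle label 'm'."""
    side, column = point
    return ("m", column) if side == old_side else point


def concat_scaffolds(top, bottom):
    """Scaffold set of the structure obtained by placing `top` above `bottom`.

    top, bottom: iterables of directed pairs (start, end).
    """
    successor = {}
    for start, end in top:        # (i) bottom endpoints of the upper structure
        successor[relabel(start, "b")] = relabel(end, "b")
    for start, end in bottom:     # (ii) top endpoints of the lower structure
        successor[relabel(start, "t")] = relabel(end, "t")

    result = set()
    for start in successor:       # (iii) follow each chain between outer points
        if start[0] == "m":
            continue
        end = successor[start]
        while end[0] == "m":
            end = successor[end]
        result.add((start, end))
    return result
```

Chains are only started from outer endpoints, so any closed loop formed entirely at the seam is discarded, in line with the removal of closed loops in the Jones monoid diagrams.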
Set of staples is obtained as follows:
- (i)
- For all staples in , replace the subscript b by m.
- (ii)
- For all staples in , replace the subscript t by m.
- (iii)
- Given set , for each sequenceof staples where s denote positions, s denote staple types and points , and have subscripts t or b, we perform the following:
- (a)
- If (1) or (2) and or (3) and for any , add to .
- (b)
- Otherwise, if for all , add to .
- (c)
- Otherwise, if for any , add to .
- (d)
- Otherwise, add to .
Figure 5 shows an example of each case (a)–(d).
Figure 5.
An example of concatenation of two graphical structures. Each case (a–d) describes the concatenation process of staples.
We observe that the composition of graphical structures is associative, that is, . We use to refer to the set of all graphical structures that can be constructed from .
4. Side-Aware DNA Origami Rewriting Systems
In this section, we introduce rewriting rules for pairs of words whose graphical structures coincide. Specifically, if two words produce the same graphical structure, then either word can be rewritten as the other. Since the concatenation of two graphical structures is performed vertically, the types of endpoints at the bonding sites influence the resulting structure: the bottom endpoints of the upper structure and the top endpoints of the lower structure determine the outcome of the concatenation. This raises two questions. If the bottom endpoints of the upper structure have a “cap” shape, with two endpoints connected from the bottom of one column to the bottom of the adjacent column, how does this affect the resulting structures when considering all possible generators that could bond below it? Conversely, if the top endpoints of the lower structure have a “cup” shape, with two endpoints connected from the top of one column to the top of the adjacent column, how does this influence the resulting structures when considering all possible generators that could bond above it? Note that a “cup” shape at the bottom endpoints of the upper structure does not constrain or affect other generators bonding below it, as it does not interfere with the bonding process.
We define the rewriting rules as patterns of structural types and describe how each structural type affects the resulting structures. The concept of side-aware DNA origami words, which allows for the directional binding of staples to the scaffold, increases the complexity of the rewriting rules, even with only six possible generators to concatenate. We note that if the gap between columns is greater than two, the resulting structures are unaffected by the relations in the Jones monoid. Therefore, we only need to consider six generators: .
4.1. Rewriting Rules
We define a rewriting rule under if under . Here, we establish the finite set of rewriting rules under . For convenience, we use and to represent an arbitrary generator type ( or ), and to represent a complementary generator of . We observe the following statements:
- If under , then under . Namely, if two graphical structures and are the same under , then they should be the same when we ignore sides of staples and connect all staple endpoints at the same position. This implies that for any rule , there exist , and words and such that and .
- If under , then the scaffolds of and are the same under .
- Staple segments in the generators are categorized into three geometric types: a cap (i.e., ), a cup (i.e., ), and a straight segment (i.e., ()). For convenience, we refer to a cap (cup) between the ith and the st columns as the ith cap (cup), and a straight staple at the ith column as the ith straight staple.
Under these observations, for each pair of staple segments with the same start and end positions dependent on i, we establish a set of pairs of a prefix and a suffix that satisfies the following condition: for any pair of words where and are the same under except in and in , the equality holds under . We call such set an override set, denoted by , where represents the type of the pairs. Then, for each rule , we observe the staple difference between and under , concatenate a prefix and a suffix from the corresponding override set to make the graphical structures equivalent, and retrieve the rewriting rules under .
An override set is a collection of “prefix–suffix pairs” that can adjust two words to make their graphical structures equivalent, even if there is a specific difference between them in terms of staple segments. For any given type of difference—like a “cap” or “cup” at a certain position—this set provides the necessary adjustments to “override” that difference. If two DNA origami words would result in the same structure except for a small discrepancy in their staples at position i, the override set tells us how to add the same prefix and suffix around both words so that their final structures become identical.
Below is the list of each pair of staple segments and the corresponding override sets.
- Cap: We first consider a pair of ith caps for even is. Since the cap segments are at the bottom of the structure, concatenating any prefix does not affect the equivalence of graphical structures. We have six different generators that affect the ith cap segments: through . Figure 6 illustrates an example of and where the only difference is , and changes in the graphical structures from concatenation of these six generators after and . We observe that all generators except make two graphical structures equivalent. For the case, the difference in graphical structures remains the same. Therefore, we establish the override set as for . When i is odd, for , the override set is the same as the even case, which holds for all of the following cases.
Figure 6. An example of and for the cap case and the graphical structures after concatenating six different generators. The top two images illustrate the graphical structures for and . (i–vi) illustrate the concatenation of with each of the six different generators , and the concatenation of with each of the six different generators . - Cup: We consider a pair of ith cups for odd is. Since the cup segments are at the top of the structure, concatenating any suffix does not affect the equivalence of graphical structures. We have six different generators that affect the ith cap segments: through . Figure 7 illustrates an example of and where the only difference is , and changes in the graphical structures from concatenation of these six generators before and . We observe that all generators except make two graphical structures equivalent. For the case, the difference in graphical structures remains the same. Therefore, we establish the override set as for .
Figure 7. An example of and for the cup case, and the graphical structures after concatenating six different generators. (i–vi) illustrate the concatenation of each of the six different generators and , and the concatenation of each of the six different generators and . - Straight segment: We have the following cases.
- (a)
- When is right and is block: we consider for odd is. We have four different generators that affect the straight segments: through . Figure 8 illustrates an example of and where the only difference is , and changes in the graphical structures from concatenation of these four generators before and after and . If concatenation of a generator does not make two graphical structures equivalent, it may change the geometric type of the staple segment difference. For example, concatenating before and changes the geometric type of the staple segment difference to the st cap. Thus, for any , holds. We summarize all of the cases as the following override set: for and .
Figure 8. An example of and when is right and is block, and the graphical structures before and after concatenating four different generators. (i–viii) illustrate the case when the right staple from affects the concatenation of with four different generators and the case when the block staple from affects the concatenation of with four different generators. - (b)
- When is left and is block: we consider for odd is in Figure 9. We establish the following override set: for and .
Figure 9. An example of and when is left and is block, and the graphical structures before and after concatenating four different generators. (i–viii) illustrate the case when the left staple from affects the concatenation of with four different generators and the case when the block staple from affects the concatenation of with four different generators. - (c)
- When is virtual and is right: We first consider for odd is in Figure 10. We summarize all of the cases as the following override set: for and .
Figure 10. (i–viii) illustrate the graphical structures after concatenating four different generators on top, see (i–iv), and the bottom, see (v–viii), with and when is virtual and is right. - (d)
- When is virtual and is left: We consider for odd is in Figure 11. We establish the following override set: for and .
Figure 11. (i–viii) illustrate the graphical structures after concatenating four different generators on top, see (i–iv), and the bottom, see (v–viii), with and when is virtual and is left. - (e)
- When is virtual and is block: We consider for odd is in Figure 12. We establish the following override set: for and .
Figure 12. (i–viii) illustrate the graphical structures after concatenating four different generators on top, see (i–iv), and the bottom, see (v–viii), with and when is virtual and is block.
Note that the definition of each override set only refers to the previously defined override sets. We reorganize the override sets by substitution:
- for .
- for .
- for , , , and for .
- for , , , and for .
- for , , , , and for .
- for , ,, , and for .
- for ,, ,, , ,and for .
Now, we inspect the rules in to construct .
- : When , the rule also holds under .
- (a)
- When and , we observe that the only difference in graphical structures is the ith cap. Thus, holds for .
- (b)
- When and , we observe that the only difference in graphical structures is the st cup. Thus, holds for .
- (c)
- When , we observe that there are st cup and cap differences. Thus, holds for and .
- : This rule also holds under .
- for : This rule also holds under .
- for
- (a)
- When and , we observe that the only difference in graphical structures is the ith straight virtual/block staples. Thus, holds for .
- (b)
- When and , we observe that the only difference in graphical structures is the nd straight virtual/block staples. Thus, holds for .
- (c)
- When and , we observe that the differences in graphical structures are the st straight virtual/block staple and the ith straight left/block staple. Since straight differences on two adjacent columns can become the same by simply applying and , we attach generators that affect the different staple types and observe the changes on the differences. Figure 13 illustrates the transition graph of the changes in the different staple types. Each node denotes the different staple types by override sets for the differences, and each transition has a set of pairs of a prefix and a suffix that changes the different staple types. Nodes with less than or equal to one type difference are double-circled to denote the “final” nodes, since we can directly apply the override set on the node to make two graphical structures equivalent. For and , we start from the node with two override sets and . Then, for example, if we follow transitions and , the resulting node has one override set . From this sequence of transitions, we establish the rewriting rule for , and . In general, we can recursively construct a rewriting rule based on a sequence of transitions that leads to a final node as follows:
Figure 13. The transition graph of the changes in the different staple types for and .- We start with and and the current node with two override sets and .
- For a transition from the current node, append a prefix to and where v does not have any such that the transition exists from the current node.
- For a transition from the current node, append a suffix to and where v does not have any such that the transition exists from the current node.
- Update the current node by following the transition. If the current node is final with the override set O, we establish the rewriting rule for . Otherwise, repeat the process from the current node.
- (d)
- When and , we can similarly construct the transition graph as Figure 14. Based on this transition graph, we can recursively construct a set of rewriting rules.
Figure 14. The transition graph of the changes in the different staple types for and .
Note that the transition graphs in Figure 13 and Figure 14 are not symmetric with respect to the indices (i.e., changing index j to from Figure 13 does not construct Figure 14).
We union the sets of previously analyzed rewriting rules to construct . It is straightforward that if under , then under . Moreover, since is constructed from recursively tracing every possible difference in the graphical structures of and where under , the converse of the statement also holds: if under , then under . Based on , we can also define , the set of equivalent classes under .
4.2. Properties of Graphical Structures and Equivalence Classes
Here, we inspect the properties of graphical structures under , which we will use to count the number of distinct equivalent classes. For two graphical structures and , we say that is isomorphic to (and vice versa) if there exists one-to-one correspondence between scaffolds in and (staples in and ) with the same pair of endpoints. For instance, if we compare the graphical structures from the same word under and , one is isomorphic to the other—the only difference is the four different types of staples (virtual, left, right, block) under . However, not all combinations of these types of staples are realizable under . We first analyze the possible combinations of different types of staples. Then, for each given graphical structure under , we count the number of distinct graphical structures under that are isomorphic to .
For further analysis of staples, we first define four different types of a column according to the scaffold structure. For a given graphical structure, the ith column can be categorized into four different types:
- Straight if there exists a straight scaffold on the ith column.
- Crossing if there exists a scaffold from the jth column to the kth column such that $j < i < k$ or $k < i < j$ holds.
- Left-sided if the ith column is not crossing and all scaffolds with endpoints on the ith column have the other endpoints on the jth column such that .
- Right-sided if the ith column is not crossing and all scaffolds with endpoints on the ith column have the other endpoints on the jth column such that .
Since each position is assigned to distinct endpoints of scaffolds and there is no crossing of scaffolds, these four different types partition the set of all columns.
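The four column types can be computed directly from the scaffold set. The sketch below assumes scaffolds are given as pairs of positions `(side, column)`, and it assumes that "left-sided" means the other endpoints lie in columns j < i and "right-sided" means j > i; both the encoding and this reading of the two sided types are assumptions of the example.

```python
def classify_column(i, scaffolds):
    """Return the type of column i for a valid, non-crossing scaffold set."""
    # Straight: a scaffold has both endpoints on column i.
    for p, q in scaffolds:
        if p[1] == i and q[1] == i:
            return "straight"
    # Crossing: a scaffold joins columns strictly on either side of column i.
    for p, q in scaffolds:
        low, high = sorted((p[1], q[1]))
        if low < i < high:
            return "crossing"
    # Columns at the far end of each scaffold that touches column i.
    others = [q[1] if p[1] == i else p[1]
              for p, q in scaffolds if i in (p[1], q[1])]
    if not others:
        raise ValueError("column has no scaffold endpoint")
    if all(j < i for j in others):
        return "left-sided"       # assumed reading: partners lie to the left
    if all(j > i for j in others):
        return "right-sided"      # assumed reading: partners lie to the right
    raise ValueError("column fits none of the four types")


def classify_columns(n, scaffolds):
    """Types of columns 1..n, in order."""
    return [classify_column(i, scaffolds) for i in range(1, n + 1)]
```

For a width-3 structure whose scaffold joins the bottom of column 1 to the top of column 3, with arcs between the tops of columns 1 and 2 and between the bottoms of columns 2 and 3, the middle column is crossing and the outer columns are sided.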
Lemma 1.
For given graphical structure under , staples in have the following properties:
- If a staple is virtual straight, then the staple is on a straight column.
- If a staple is a cup, then the staple is left or block.
- If a staple is a cap, then the staple is right or block.
- If a staple is left straight, then the staple is on a straight or right-sided column.
- If a staple is right straight, then the staple is on a straight or left-sided column.
- If a staple is not a cap/a cup/a straight segment, then the staple is block.
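Read as constraints, the properties in Lemma 1 give, for each staple shape and column type, the set of staple types that are not excluded. The sketch below encodes that reading as a lookup. The function name `allowed_types` and the string labels are hypothetical, and the lemma states necessary conditions only, so a type returned here is a candidate rather than a guaranteed realizable choice.

```python
def allowed_types(shape, column_type=None):
    """Staple types not excluded by Lemma 1.

    shape: "cup", "cap", "straight", or anything else (treated as "other").
    column_type: only consulted for straight staples.
    """
    if shape == "cup":
        return {"left", "block"}
    if shape == "cap":
        return {"right", "block"}
    if shape == "straight":
        allowed = {"block"}  # Lemma 1 places no restriction on block staples
        if column_type == "straight":
            allowed |= {"virtual", "left", "right"}
        elif column_type == "right-sided":
            allowed.add("left")
        elif column_type == "left-sided":
            allowed.add("right")
        return allowed
    # Not a cap, a cup, or a straight segment: the staple must be block.
    return {"block"}
```

For instance, `allowed_types("straight", "crossing")` leaves only `{"block"}`, since a straight staple on a crossing column can be neither virtual, left, nor right under the lemma.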
Lemma 2.
For a given graphical structure under , there always exists a graphical structure under that satisfies the following properties:
- is isomorphic to .
- A staple is virtual if and only if the staple is straight and on a straight column.
- If a staple is a cup, then the staple is left.
- If a staple is a cap, then the staple is right.
- A staple is left straight if and only if the staple is on a right-sided column.
- A staple is right straight if and only if the staple is on a left-sided column.
Proof of Lemma 2.
Figure 15 gives an example of and . □
Figure 15.
An example of and . Each number i illustrates the ith property in the lemma.
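The structure promised by Lemma 2 can be viewed as a canonical assignment of one type to each staple. The following sketch records that assignment as a function of the staple shape and the column type. The name `canonical_type` is hypothetical. The fallback to block for the remaining cases (a straight staple on a crossing column, or a staple that is neither a cup, a cap, nor straight) is not stated in Lemma 2 itself; it is inferred here from Lemma 1 and should be read as an assumption.

```python
def canonical_type(shape, column_type=None):
    """Staple type in the canonical structure described by Lemma 2."""
    if shape == "cup":
        return "left"
    if shape == "cap":
        return "right"
    if shape == "straight":
        by_column = {
            "straight": "virtual",    # virtual iff straight staple on a straight column
            "right-sided": "left",    # left straight iff on a right-sided column
            "left-sided": "right",    # right straight iff on a left-sided column
        }
        # ASSUMPTION: straight staples on crossing columns are block (via Lemma 1).
        return by_column.get(column_type, "block")
    # ASSUMPTION: all other shapes are block (via Lemma 1).
    return "block"
```

This assignment uses a block staple only where Lemma 1 leaves no other option, which is one way to read the lemma's role as a starting point for counting.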
We also observe that if is in w, then the ith and (i+1)st columns cannot have left or right straight staples.
We now classify graphical structures using a binary string b of length n, where the ith digit is 1 if the ith column has a straight scaffold and 0 otherwise. The set of binary strings of length n is in bijection with the set . Thus, let be the number of graphical structures which correspond to .
Theorem 1.
Given , for each tuple , let be recursively defined as follows:
- for , .
- for , if and if .
- for , we have .
- for , we have and
Then, the size of the set of equivalence classes of words from is given as
where is given as
for and if and if .
Proof of Theorem 1.
We observe that all staples in a graphical structure of a word from are straight. Moreover, all straight staples are either left or right in the generators, and if the ith straight staple is block in , then the ith straight staple is block in all where . Thus, for each graphical structure, we first find a structure with the minimum number of block staples, and then count the number of graphical structures with the same connectivity. It is straightforward to verify that because there is only one case for , while since corresponds to an invalid case. For tuples , the columns with 1s in the binary representation always have only one possible case (straight scaffold and staple). Therefore, is computed as the product of for all i, that is:
To calculate , we observe that the total number of graphical structures for is given by the Catalan number:
Using the recursive definition, can be computed by summing over all valid tuples :
For , the value is determined by subtracting the contributions of all tuples from the total number of structures:
Finally, the size of the equivalence classes of words from is given by:
where is calculated recursively based on the tuples . □
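As a sanity check on the Catalan count invoked in the proof, the scaffold structures on n columns can be enumerated by brute force and grouped by their binary string of straight columns. The sketch below is illustrative only. It assumes that a scaffold structure is a noncrossing pairing of n top and n bottom endpoints, it ignores staple types entirely, and the names `noncrossing_matchings`, `straight_indicator`, and `count_by_indicator` are hypothetical. Points are numbered around the boundary: top endpoints 0..n-1 from left to right, then bottom endpoints from right to left, so column i (0-based) is straight when point i is paired with point 2n-1-i.

```python
from collections import Counter
from math import comb


def noncrossing_matchings(points):
    """Yield every noncrossing perfect matching of points listed in circular order."""
    if not points:
        yield frozenset()
        return
    first = points[0]
    # The partner of `first` must leave an even number of points on each side.
    for k in range(1, len(points), 2):
        for inner in noncrossing_matchings(points[1:k]):
            for outer in noncrossing_matchings(points[k + 1:]):
                yield inner | outer | {(first, points[k])}


def straight_indicator(matching, n):
    """Binary string whose ith digit is 1 iff column i carries a straight scaffold."""
    pairs = {frozenset(p) for p in matching}
    return "".join(
        "1" if frozenset((i, 2 * n - 1 - i)) in pairs else "0" for i in range(n)
    )


def count_by_indicator(n):
    """Number of scaffold structures on n columns for each binary string."""
    points = tuple(range(2 * n))
    return Counter(straight_indicator(m, n) for m in noncrossing_matchings(points))


def catalan(n):
    """The nth Catalan number."""
    return comb(2 * n, n) // (n + 1)


counts = count_by_indicator(3)
```

Under these assumptions the counts over all binary strings of length n add up to the nth Catalan number, and some strings never occur: for n = 3 no structure has a straight scaffold in the middle column only, which matches the remark in the proof that certain cases are invalid.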
Given a graphical structure with n straight scaffolds, there exist the following different types of staple segment:
- a if a staple segment , where ,
- an if a staple segment , where for ,
- a if a staple segment , where
- an if a staple segment , where for ,
- a if a staple segment , where for .
Without loss of generality, we omit superscripts that represent where two endpoints of the staple are placed on the scaffolds. Figure 16 shows the five different types of staple segment with n columns.
Figure 16.
(a) Shows an example of staple segments and . Given a graphical structure with 6 columns, staple segments are , and are . (b) Shows an example of staple segments and . With 5 columns, is , and is .
Given a graphical structure with n straight scaffolds, we next consider sets of staple segments over extended_cup, extended_cap, and junction. The sets of interest, called bridges of width n, are defined below once the notion of spanning is in place.
Given a graphical structure of width n, a staple segment with endpoints, where , and , spans columns from i to j if (or from j to i if ).
We call a set of staple segments a bridge of width n if it satisfies the following properties (see Figure 17 for an example of a bridge of width n):
- (i) A bridge of width n is a subset of staple set of the given graphical structure,
- (ii) For any staple segments in a bridge of width n, for ,
- (iii) For any staple segments in a bridge of width n, , for ,
- (iv) A bridge of width n consequently spans n columns,
- (v) There is no proper subset of a bridge of width n which is a bridge of width n.
Figure 17.
An example of a bridge of width n. (a) A set of staple segments is a bridge of width 6. (b) There exists a set that violates property (v) since a proper subset is a bridge of width 6. Therefore, the set is not a bridge of width 6 whereas the set is a bridge. (c) A set is a bridge of width 6. (d) A staple segment violates property (ii) for a bridge of width 7. A set is a bridge of width 7.
We define to represent the set of all bridges of width n.
For a bridge of width n there are consecutive endpoints at the top and at the bottom which are not spanned by the bridge. For instance, there are consecutive endpoints and for a bridge of width 5.
We recall that a binary string of length n represents the graphical structures, and denotes the set of tuples corresponding to the binary string.
Given a set of positions, we calculate the number of partial graphical structures with t caps (or cups) as follows:
with initial conditions and for all .
5. Conclusions
This study presents a theoretical framework for side-aware DNA origami words, accounting for the directional attachment of staples to either the left or right side of the scaffold. This expansion of the traditional real and virtual staple classification to include left, right, virtual, and block types offers a more realistic representation of DNA origami structures, enabling the design of more intricate configurations.
We establish a rewriting system for side-aware DNA origami words and explore the properties of the corresponding graphical structures, with a focus on equivalence classes and rewriting patterns. Our model highlights the role of staple binding directionality in shaping the final structure, enhancing control over positioning and folding, which is crucial for applications in molecular devices and nanoscale assembly.
Although the complexity of rewriting rules in the case poses challenges for computational efficiency, future work will consider the and contexts. These will help examine their impact on rewriting patterns and structural diversity, offering deeper insights into the role of virtual elements in DNA origami design. Additionally, identifying equivalence classes streamlines the design process by grouping structurally similar configurations, improving both the efficiency and flexibility of DNA origami structures. This classification is especially useful for optimizing stable designs and their applications in molecular devices and nanoscale assembly.
As a next step, we plan to validate the model through molecular simulations, specifically focusing on how structural width and staple types influence the overall shape. This will confirm the model’s accuracy and demonstrate its potential for real-world applications.
Finally, while this study focuses on 2D side-aware DNA origami, the framework can be extended to 3D structures. Addressing challenges related to staple orientation and computational efficiency will be key, and future work will explore these extensions to enable the design of more complex 3D DNA origami structures for advanced applications in molecular devices and nanoscale assembly.
Funding
This work was supported by the Institute of Information and Communications Technology Planning and Evaluation (IITP) under the Artificial Intelligence Convergence Innovation Human Resources Development (IITP-2024-RS-2023-00255968) grant and the National Research Foundation of Korea (NRF-2022R1G1A1013287).
Data Availability Statement
The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.
Conflicts of Interest
The author declares no conflicts of interest.
References
- Whitesides, G.M.; Boncheva, M. Beyond Molecules: Self-assembly of mesoscopic and macroscopic components. Proc. Natl. Acad. Sci. USA 2002, 99, 4769–4774.
- Evans, C.G.; Winfree, E. Physical principles for DNA tile self-assembly. Chem. Soc. Rev. 2017, 46, 3808–3829.
- Winfree, E.; Eng, T.; Rozenberg, G. String Tile Models for DNA Computing by Self-Assembly. In Proceedings of the 6th International Workshop on DNA-Based Computers, Leiden, The Netherlands, 13–17 June 2000; pp. 63–88.
- Rothemund, P.W.K. Folding DNA to create nanoscale shapes and patterns. Nature 2006, 440, 297–302.
- Garrett, J.; Jonoska, N.; Kim, H.; Saito, M. DNA origami words, graphical structures and their rewriting systems. Nat. Comput. 2021, 20, 217–231.
- Zadegan, R.M.; Norton, M.L. Structural DNA nanotechnology: From design to applications. Int. J. Mol. Sci. 2012, 13, 7149–7162.
- Kim, H.; Surwade, S.P.; Powell, A.; O’Donnell, C.; Liu, H. Stability of DNA Origami Nanostructure under Diverse Chemical Environments. Chem. Mater. 2014, 26, 5265–5273.
- Voigt, N.V.; Tørring, T.; Rotaru, A.; Jacobsen, M.F.; Ravnsbæk, J.B.; Subramani, R.; Mamdouh, W.; Kjems, J.; Mokhir, A.; Besenbacher, F.; et al. Single-molecule chemical reactions on DNA origami. Nat. Nanotechnol. 2010, 5, 200–203.
- Lin, C.; Liu, Y.; Rinker, S.; Yan, H. DNA tile based self-assembly: Building complex nanoarchitectures. ChemPhysChem 2006, 7, 1641–1647.
- Douglas, S.M.; Dietz, H.; Liedl, T.; Högberg, B.; Graf, F.; Shih, W.M. Self-assembly of DNA into nanoscale three-dimensional shapes. Nature 2009, 459, 414–418.
- Yin, P.; Hariadi, R.F.; Sahu, S.; Choi, H.M.T.; Park, S.H.; LaBean, T.H.; Reif, J.H. Programming DNA tube circumferences. Science 2008, 321, 824–826.
- SantaLucia, J.; Allawi, H.T.; Seneviratne, P.A. Improved nearest-neighbor parameters for predicting DNA duplex stability. Biochemistry 1996, 35, 3555–3562.
- Wei, B.; Dai, M.; Yin, P. Complex shapes self-assembled from single-stranded DNA tiles. Nature 2012, 485, 623–626.
- Ke, Y.; Ong, L.L.; Shih, W.M.; Yin, P. Three-dimensional structures self-assembled from DNA bricks. Science 2012, 338, 1177–1183.
- Adamczyk, A.K.; Huijben, T.A.P.M.; Sison, M.; Di Luca, A.; Chiarelli, G.; Vanni, S.; Brasselet, S.; Mortensen, K.I.; Stefani, F.D.; Pilo-Pais, M.; et al. DNA self-assembly of single molecules with deterministic position and orientation. ACS Nano 2022, 16, 16924–16931.
- Lee, C.; Lee, J.Y.; Kim, D.-N. Polymorphic design of DNA origami structures through mechanical control of modular components. Nat. Commun. 2017, 8, 2067.
- Book, R.V.; Otto, F. String-Rewriting Systems; Springer: Berlin/Heidelberg, Germany, 1993.
- Kauffman, L.H. Knots and Physics; World Scientific: Singapore, 2001.
- Borisavljević, M.; Došen, K.; Petric, Z. Kauffman Monoids. J. Knot Theory Its Ramif. 2002, 11, 127–143.
- Lau, K.W.; FitzGerald, D.G. Ideal Structure of the Kauffman and Related Monoids. Commun. Algebra 2006, 34, 2617–2629.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).