An Operational DNA Strand Displacement Encryption Approach

DeoxyriboNucleic Acid (DNA) encryption is a new encryption method that has emerged alongside research on DNA nanotechnology in recent years. Due to the biological complexity of DNA nanotechnology, DNA encryption makes deciphering more difficult and can thus enhance information security. As a new approach in DNA nanotechnology, DNA strand displacement has particular advantages, such as being enzyme free and self-assembling. However, existing research on DNA-strand-displacement-based encryption has mostly remained at the theoretical or simulation stage. To this end, this paper proposes a new DNA-strand-displacement-based encryption framework, which involves three main strategies. The first strategy is a tri-phase conversion from plaintext to DNA sequences according to a Huffman-coding-based transformation rule, which enhances the concealment of the information. The second strategy is the development of DNA strand displacement molecular modules, which produce the initial key for information encryption. The third strategy is a cyclic-shift-based operation that extends the initial key to a sufficient length and thus increases the deciphering difficulty. The results of simulation and biological experiments demonstrated the feasibility of our scheme for encryption. The approach was further validated in terms of key sensitivity, key space, and statistical characteristics. Our encryption framework provides a potential way to realize DNA-strand-displacement-based encryption via biological experiments and promotes the research on DNA-strand-displacement-based encryption.


Introduction
Over the past few years, the world has seen a stunning transformation in how information is exchanged. Communication online (through various platforms) has gradually become an indispensable means for information exchange, and ensuring data security has become one of the most concerning problems.
Cryptography plays a pivotal role in protecting the security of data communication by transforming plaintexts into unrecognizable codes [1,2]. Conventional cryptography, which depends heavily on the high computational complexity of mathematical calculations, faces increasing risks of being cracked as computing capabilities rise. Therefore, new encryption methods have been increasingly studied. As a nanomaterial, DeoxyriboNucleic Acid (DNA) can store a large amount of information, and with the rapid development of nanotechnology, DNA nanotechnology has been widely studied for encryption. DNA encryption, as a novel technique of cryptography, was proposed by Gehani et al. [3]. In DNA encryption, data are protected by transforming them into digital DNA codes. Because of the exclusive advantages of DNA molecules, including their large-scale parallelism, high storage capacity, and low power consumption, it is widely believed that DNA encryption can work with huge data and can potentially increase information security [4].
There is a growing body of literature recognizing the importance of DNA encryption. To solve the storage problem of the one-time pad, Gehani et al. [3] first designed a one-time-pad-based DNA encryption program. In 2012, Wang et al. [5] proposed a new one-time one-key encryption algorithm based on the ergodicity of the skew tent chaotic graph. In 2014, Mokhtar et al. [6] combined a chaotic system with DNA coding to design a one-time pad encryption scheme. In [7], Yang et al. proposed a one-time pad encryption device based on DNA self-assembly technology. Because the keys generated in one-time pad approaches are not reusable, it is difficult to produce enough keys for encryption. A common method to address this problem is code transformation (i.e., transforming (0,1)-sequences into DNA sequences). In 2012, Liu et al. [8] proposed an image encryption method by means of a novel confusion and diffusion method, in which a DNA complementary rule was designed to confuse the pixels. To enhance the degree of confusing the pixels, Rehman et al. [9] in 2014 proposed a new gray image block cipher, which dynamically selects a rule from newly designed DNA complementary rules to encode and decode each pixel in a block. In 2016, based on the combination of the dynamic S-box and chaotic systems, Liu et al. [10] proposed a new image encryption scheme and showed that the proposed algorithm can reduce the correlation coefficients of images in three directions. In 2018, Wu et al. [11] designed a new chaotic mapping, called 2D-HSM. Then, they proposed an image encryption scheme combining 2D-HSM with DNA approaches and demonstrated its excellent performance. In [12,13], the authors employed chaotic series generated by a chaotic system to randomly select the coding rules, by which the security of encryption can be improved significantly. More recently, Wang et al. [14] proposed an image encryption algorithm based on ladder scrambling and DNA coding, which has a lower correlation of images compared to previous algorithms. In addition, some studies have attempted to improve the security of DNA encryption by performing operations on DNA codes, such as Addition (ADD) [15,16], Subtraction (SUB) [15,16], Exclusive Or (XOR) [16-18], and Exclusive Nor (XNOR) [18].
DNA encryption has been extensively studied along with the research on DNA nanotechnology in recent years. Due to the biological complexity of DNA nanotechnology, DNA encryption makes deciphering more difficult and can thus enhance information security. As a new approach in dynamic DNA nanotechnology, the DNA Strand Displacement Reaction (SDR) has particular advantages, such as being enzyme free and self-assembling. SDR has attracted considerable attention in recent years and has been widely applied to build various molecular systems [19]. It should be noted that the materials (DNA single strands) required for DNA strand displacement experiments are first designed by researchers, then commissioned for manufacture, and finally assembled into DNA molecules (complex structures). A DNA SDR can be described as a molecular dynamic process (Figure 1), in which a single-stranded DNA molecule combines with a double-stranded DNA molecule through short complementary single-stranded DNA domains (called toeholds; see td and td*), so that a new stable double-stranded DNA molecule is formed and a new single-stranded DNA molecule is released from the original double strand. Notice that this can only happen gradually. Previous research has demonstrated that by designing appropriate DNA SDR, one can approximately realize all chemical reactions with ideal forms [20,21]. For example, in [22], SDR-based DNA switching circuits were designed for digital computing; in [23], the authors developed a time-sensitive molecular circuit based on SDR, called the cross-inhibitor, which can execute mutual inhibition; in [24,25], DNA strand displacement for microRNA detection was investigated; in [26], the authors analyzed the morphological manipulation of DNA gel microbeads with biomolecular stimuli by using SDR; in [27], the authors proposed an SDR-based chemical reaction network to solve 0-1 integer programming problems.
Designing encryption algorithms with the aid of DNA SDR has also been attempted. In [28], by using DNA SDR to extract secret keys, Zhang et al. proposed an image encryption algorithm on the basis of a chaos system. To obtain the keys with this approach, the DNA of the chains obtained by SDR must be sequenced. This may lead to decryption failures when current sequencing techniques are used. In [29], the authors designed six DNA SDR modules and combined them with the XOR operation to create a new encryption algorithm. Although the proposed algorithm may have a high capacity to resist statistical attacks, it relies heavily on real-time concentration detection. Therefore, it is still in a simulation stage and is difficult to realize via biological experiments because of the complicated design program.
During a DNA strand displacement experiment, it is difficult to monitor and detect the concentration of the target DNA strand in real time, and changes in the design of the DNA sequence can easily lead to changes in the reaction rate. For these reasons, the study of DNA-strand-displacement-based encryption is still in the theoretical or simulation stage. To facilitate the implementation of DNA encryption via biological experiments of DNA strand displacement, we introduce in this work a novel bio-experiment-based encryption framework. In this approach, three strategies are adopted: a Huffman-coding-based transformation rule to confuse the plaintext, two SDR-based molecular modules to generate the initial key, and a cyclic-shift-based mechanism to extend and confuse the key. Note that most studies on DNA encryption techniques focus mainly on how to design complex rules to hide confidential information in DNA codes, without considering whether the designed scheme can be realized by biochemical experiments. Our approach enhances the feasibility of biochemical experiments and has two advantages. First, it improves the security of key transmission. To obtain the keys, one has to perform biochemical experiments, whose results are sensitive to various conditions, such as temperature, time, and concentration. Therefore, our approach provides excellent protection against decoding. Second, it combines biochemical experiments with other techniques such as code transformation, which yields a new confusion and diffusion method to create a secure cipher, thus enhancing the cipher strength.
In order to verify the feasibility of the proposed approach, we first present an encryption example. Then, following [29], we analyze its encryption performance in terms of three aspects, viz., key sensitivity, key space, and statistical characteristics. Note that a good encryption method should be sensitive to the key, that is, when the key changes slightly, the encryption and decryption results should be sufficiently different. Meanwhile, a good encryption method should also have a large key space to resist brute force attacks. In addition, we analyze the statistical characteristics of our approach to demonstrate that it can cope with statistical attacks. Our encryption framework provides a potential way to realize DNA-strand-displacement-based encryption via biological experiments and promotes the research on DNA-strand-displacement-based encryption.
The remainder of the paper is organized as follows. Section 2 introduces the encryption framework and the process of the encryption algorithm. Section 3 presents the experimental validation of the feasibility of our approach by designing specific modular reactions. In Section 4, we analyze the performance of our approach in encryption security. The results imply that the proposed scheme is sensitive to the keys and possesses high resistance against statistical attacks. Finally, a summary of the main findings, along with some discussion and concluding remarks, is provided in Section 5.

Encryption Framework
In view of the increasing need for dealing with large data and ensuring data security, we propose a novel bio-experiment-based DNA encryption method based on the DNA strand displacement technique. In this section, we first present the framework of our encryption method (Algorithm 1).

Algorithm 1: A new bio-experiment-based encryption framework.
Input: Plaintext P (an arbitrary string)
Output: Ciphertext C
begin
    Transform P into a DNA sequence D 1 ;
    Design the DNA strand displacement molecular module, and obtain the initial key, a DNA sequence D;
    Design a shift rule, by which D is extended to a new DNA sequence D 2 whose length is not less than that of D 1 ;
    Perform DNA operation between D 1 and D 2 , and transform the result into ciphertext C;
    return C;
end
The encryption starts with a plaintext input P, i.e., an arbitrary string, and transforms it into a DNA sequence D 1 (Line 2), which will be taken as a substrate in the subsequent DNA computation. To generate the DNA sequence key D by biochemical experiments, some digital seeds are first obtained by recording the state changes (such as fluorescence color change or concentration change) during a designed experiment (Line 3). The next step is to extend D (Line 4) to a new DNA sequence D 2 with a length at least that of D 1 for later use in DNA computation. Finally, it produces the desired ciphertext (Line 5) by performing DNA computations (such as XOR and ADD) between D 1 and D 2 , together with some transformation strategies.

Huffman Coding and Data Transformation
Huffman coding is an efficient method for compressing data without losing information. By using this technique, Ailenberg and Rotstein [30] proposed a simple, but efficient coding method for information storage in DNA and showed its potential ability in coding DNA. Inspired by this, we designed a Huffman-coding-based method, called tri-phase transformation (TPT), to confuse P.
TPT first transforms P into a DNA sequence P 1 according to the rule listed in Supplementary Table S1; then, it transforms P 1 into a (0,1)-sequence P 2 via Huffman coding; finally, by using the rules listed in the first column in Supplementary Table S2, it transforms P 2 into a new DNA sequence D 1 , which is an ingredient for subsequent DNA operations. Specifically, the process from P 1 to P 2 can be described as follows.
For each base x ∈ {A, T, G, C}, denote by ω(x) the weight of x, which is defined as the number of times x appears in P 1 . Then, construct a Huffman binary tree with four leaves in the following way: select two bases with the smallest weights as two leaves, denoted by x 1 and x 2 , where ω(x 1 ) ≤ ω(x 2 ), and add a new vertex y 1 joining x 1 and x 2 such that x 1 and x 2 are the left and the right children of y 1 , respectively; set ω(y 1 ) = ω(x 1 ) + ω(x 2 ); select two elements from {A, G, C, T, y 1 }\{x 1 , x 2 } with the smallest weights, denoted by x 3 and x 4 , where ω(x 3 ) ≤ ω(x 4 ), and add a new vertex y 2 joining x 3 and x 4 such that x 3 and x 4 are the left and right children of y 2 , respectively; set ω(y 2 ) = ω(x 3 ) + ω(x 4 ), and add a new vertex y 3 joining y 2 and the element in {A, G, C, T, y 1 }\{x 1 , x 2 , x 3 , x 4 } such that the one with the smaller weight is the left child of y 3 and the other is the right child of y 3 .
Now, for each edge xy of the constructed tree such that y is a child of x, assign weight zero to it if y is the left child of x, and assign weight one to it if y is the right child of x. As a result, each base (a leaf) can be encoded into a (0,1)-sequence, which is read off the edges of the path from the root to the leaf, and P 1 is encoded into a (0,1)-sequence Z. Observe that the length of Z may be an odd number. To transform Z into the DNA sequence D 1 according to Supplementary Table S2, we have to modify it to have an even length. Our approach was as follows: if Z has an even length, append 00 to the end of Z; otherwise, append 101 to the end of Z.
As an example, we considered a DNA sequence TTCCAGCGGAC, for which ω(A) = 2, ω(G) = 3, ω(C) = 4, and ω(T) = 2. By constructing a Huffman tree, A is encoded into 000, G is encoded into 01, C is encoded into 1, and T is encoded into 001. As a result, TTCCAGCGGAC is encoded into Z = 0010011100001101010001. Since Z has an even length, 00 is added at the end of Z and D 1 = ACGTAATGGAGA.
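The worked example above can be replayed in a few lines. The sketch below is only an illustration: it takes the code table stated in the example (A = 000, G = 01, C = 1, T = 001) as given rather than rebuilding the tree, and the 2-bit rule 00→A, 01→G, 10→C, 11→T is an assumption inferred from the example output, since Supplementary Table S2 is not reproduced here. The function names are hypothetical.

```python
# Code table taken from the worked example in the text.
HUFFMAN_CODE = {"A": "000", "G": "01", "C": "1", "T": "001"}

# ASSUMPTION: 2-bit rule inferred from the example (Z -> D1),
# standing in for the first column of Supplementary Table S2.
BITS_TO_BASE = {"00": "A", "01": "G", "10": "C", "11": "T"}


def huffman_bits(p1: str) -> str:
    """Encode the DNA sequence P1 into the (0,1)-sequence Z."""
    return "".join(HUFFMAN_CODE[base] for base in p1)


def pad_even(z: str) -> str:
    """Append 00 if Z has even length, otherwise 101, so the result is even."""
    return z + ("00" if len(z) % 2 == 0 else "101")


def bits_to_dna(bits: str) -> str:
    """Map consecutive bit pairs to bases."""
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


z = huffman_bits("TTCCAGCGGAC")
d1 = bits_to_dna(pad_even(z))
print(z)   # 0010011100001101010001
print(d1)  # ACGTAATGGAGA
```

Both printed values agree with Z and D 1 in the example, which is what motivates the assumed 2-bit rule.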
As described in Algorithm 1, our approach depends on DNA operations to generate the final ciphertext. Two such operations, XOR and ADD, are used in our subsequently designed algorithm, where the rules of these two operations are shown in Supplementary Tables S5 and S6, respectively.
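To make the idea of a base-wise DNA operation concrete, here is a minimal sketch. It does not reproduce Supplementary Tables S5 and S6: it assumes the 2-bit coding A = 00, G = 01, C = 10, T = 11 (inferred from the worked example), takes XOR as bitwise XOR of the codes, and takes ADD as addition modulo 4. The actual tables may define these operations differently, and all function names are hypothetical.

```python
# ASSUMPTION: 2-bit codes inferred from the worked example, not Table S2.
BASE_TO_INT = {"A": 0, "G": 1, "C": 2, "T": 3}
INT_TO_BASE = {value: base for base, value in BASE_TO_INT.items()}


def dna_xor(x: str, key: str) -> str:
    """Base-wise XOR of the assumed 2-bit codes (self-inverse)."""
    return "".join(
        INT_TO_BASE[BASE_TO_INT[a] ^ BASE_TO_INT[b]] for a, b in zip(x, key)
    )


def dna_add(x: str, key: str) -> str:
    """Base-wise addition modulo 4 of the assumed codes."""
    return "".join(
        INT_TO_BASE[(BASE_TO_INT[a] + BASE_TO_INT[b]) % 4] for a, b in zip(x, key)
    )


def dna_sub(x: str, key: str) -> str:
    """Inverse of dna_add under the same assumption."""
    return "".join(
        INT_TO_BASE[(BASE_TO_INT[a] - BASE_TO_INT[b]) % 4] for a, b in zip(x, key)
    )


cipher = dna_xor("ACGT", "GGGG")
print(cipher)                   # GTAC
print(dna_xor(cipher, "GGGG"))  # ACGT
```

Because `zip` stops at the shorter input, the key must be at least as long as the data, which is the reason the framework extends D to D 2 before this step.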

SDR Modules and Seed Encoding
Let us now turn to the design of the initial keys. Seeds for the keys, in the form of "2-1" or "1-2", are first generated via the corresponding SDR modules. Two SDR modules are used to encode these two seeds based on the concentration change of the main species before and after the strand displacement reactions. The concentration change of the species should be normalized to the form p-q such that both p and q are integers; e.g., 1-1/2 should be replaced by 2-1.
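The normalization to integer form can be sketched as follows. This is an illustration only: the function name `seed_from_concentrations` is hypothetical, and it assumes the seed is the before/after concentration ratio reduced to lowest terms.

```python
from fractions import Fraction


def seed_from_concentrations(before, after) -> str:
    """Return the seed 'p-q' with p:q equal to before:after in lowest terms.

    Inputs should be ints or Fractions so that the ratio stays exact.
    """
    ratio = Fraction(before) / Fraction(after)
    return f"{ratio.numerator}-{ratio.denominator}"


print(seed_from_concentrations(1, Fraction(1, 2)))  # 2-1
print(seed_from_concentrations(20, 10))             # 2-1 (concentration halved)
print(seed_from_concentrations(10, 20))             # 1-2 (concentration doubled)
```

Exact rational arithmetic is used here because floating-point ratios of measured concentrations would not reduce cleanly; in practice the measured values would first have to be rounded to the intended ratio.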

Degradation Reaction Module
The principle of this module is presented in Figure 2, and its mechanism can be described by the reactions listed in Equation (1).
Figure 2. Schematic illustration of the degradation reaction module, by which the concentration of Species A is reduced to half of its original concentration. Thus, Species A can be used to encode the seed "2-1".
The process of the reaction can be described as follows. This module mainly involves four initial species: Single-stranded A and Complexes B, D, and G. We add the inputs A, B, D, and G into the biochemical reaction module simultaneously, and then a series of reactions is activated, after which the concentration of A is reduced to half of its original concentration, as shown in Equation (2). This is because A is consumed by both B and D and is generated by only one reaction (the third reaction listed in Equation (3)). Specifically, the toehold a 5 of A binds to the domain a * 5 of B (and also D), and then, branch migration moves gradually to domain a 1 , which releases Single-stranded W 1 (and Single-stranded E and F) together with double-stranded W 2 . Furthermore, the toehold s 2 of E (and t 2 of F) binds to the domain s * 2 (and t * 2 ) of G, and then, branch migration moves gradually to the domain a 3 (and a 1 ), which releases the desired Single-stranded A and forms double strands W 3 and W 4 . Observe that both A and G carry a dye at their 3′ end, and B, D, and G each carry a quencher at their 5′ end. Therefore, the beacon-labeled Strand A can be monitored in real time.

Catalysis Reaction Module
The principle of this module is presented in Figure 3, and its mechanism can be described by the reactions listed in Equations (3) and (4).
Figure 3. Schematic illustration of the catalytic reaction module, by which the concentration of Species A is increased to twice its original concentration so that Species A can be used to encode the seed "1-2".
The process of the reaction can be described as follows. This module involves three main species: Single-stranded A and Complexes B and D, where A and D each carry a dye at their 5′ end, B carries a quencher at its 3′ end, and D carries a quencher at the end of t * 1 (close to its 3′ end). The toehold t 1 of A binds to the domain t * 1 of B, and the branch migration moves gradually to domain t 3 , which releases Single-stranded C together with double-stranded W 1 . Then, toeholds a 1 and a 2 of C bind to the domains a * 1 and a * 2 of D, respectively, and the branch migration moves gradually to domain t 3 , which releases double-stranded W 2 and two single-stranded A molecules. This implies that the concentration of A will be increased to twice its initial concentration.

Group Cyclic Shift
To extend the DNA-sequence-based initial key (Species A) so that it is sufficiently long, we introduce Algorithm 2, hereafter referred to as groupCS, which is based on the group Cyclic Shift (to describe Algorithms 2 and 3 clearly, we followed the conventions of [31-33]). For any sequence S = s 1 s 2 . . . s n−1 s n , let O(S) = s 2 . . . s n−1 s n s 1 and E(S) = s 3 . . . s n−1 s n s 1 s 2 . For two sequences S and S′, we denote by S + S′ the sequence obtained by connecting S′ to S (at the end of S).

Algorithm 2: groupCS.
. . .
. . . Supplementary Table S2;
. . .
. . . , Q m from left to right such that each Q i contains eight elements, except possibly the last group Q m ;
for j = 1 to m do
    if Q j has length eight (say Q j = q 1 q 2 . . . q 8 ) and (4q 6 + 2q 7 + q 8 ) ∉ {2, 3, 5, 7} then
        if the middle two positions (the fourth and fifth positions) of Q j are 00 or 11 then
            label Q j by xor;
        end
        else
            label Q j by add;
        end
        D ← the DNA sequence obtained by transforming Q j according to the rule listed in the (4q 1 + 2q 2 + q 3 + 1)-th column of Supplementary Table S2;
if the length of Q m is less than eight then
. . .

The algorithm first transforms the input DNA sequence D into a (0,1)-sequence S according to the first column in Supplementary Table S2 (Line 2). Note that each base corresponds to a (0,1)-sequence of length 2; therefore, the length n of S is twice the length of D. Then, a loop iteratively generates a DNA sequence D 2 until it reaches the required length (Lines 4-27).
In each iteration, a new (0,1)-sequence Q is constructed by k rounds of cyclic shift based on the current S (Lines 5-9), where the value of k is set initially and gradually decreases (the initial value and the update rule are given in Lines 3 and 23). To transform S into a DNA sequence, the algorithm divides S into m = ⌈nk/8⌉ groups (say Q 1 , Q 2 , . . . , Q m ) from left to right such that each Q i contains eight elements, except possibly the last group Q m ; that is, when nk ≢ 0 (mod 8), the last group contains fewer than eight elements (Line 10). Only a group of length eight (say Q j = q 1 q 2 . . . q 8 ) such that (4q 6 + 2q 7 + q 8 ) ∉ {2, 3, 5, 7} is transformed into the corresponding DNA sequence according to the rule listed in the (4q 1 + 2q 2 + q 3 + 1)-th column of Supplementary Table S2 (Lines 11-18). Next, if D 2 has reached the required length, then the algorithm breaks out of the loop and returns D 2 ; otherwise, k is reduced, Q is updated by Q m or ∅, and the algorithm implements the next iteration (Lines 19-27). We refer to the DNA sequence D 2 returned by groupCS as the final key. For examples of groupCS, refer to Supplementary Table S7.
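The building blocks of groupCS that are fully specified in the text (the two shifts O and E, the division into groups of eight, and the per-group rule) can be sketched as below. This is a partial illustration, not the complete algorithm: how Q is assembled from the shifts and how k evolves are not reproduced here, and the function names are hypothetical.

```python
def shift_o(s: str) -> str:
    """O(S): cyclic left shift by one position."""
    return s[1:] + s[:1]


def shift_e(s: str) -> str:
    """E(S): cyclic left shift by two positions."""
    return s[2:] + s[:2]


def split_groups(q: str) -> list:
    """Divide Q into groups of eight from left to right (last may be shorter)."""
    return [q[i:i + 8] for i in range(0, len(q), 8)]


def classify_group(group: str):
    """Apply the per-group rule to Q_j = q1 q2 ... q8.

    Returns None if the group is skipped, otherwise (label, column), where
    label is 'xor' or 'add' and column is the 1-based column of
    Supplementary Table S2 to use, i.e. 4*q1 + 2*q2 + q3 + 1.
    """
    if len(group) != 8:
        return None
    bits = [int(c) for c in group]
    if 4 * bits[5] + 2 * bits[6] + bits[7] in {2, 3, 5, 7}:
        return None  # the text transforms only groups outside this set
    label = "xor" if group[3:5] in ("00", "11") else "add"
    column = 4 * bits[0] + 2 * bits[1] + bits[2] + 1
    return label, column


print(shift_o("abcd"), shift_e("abcd"))  # bcda cdab
print(classify_group("01100110"))        # ('xor', 4)
print(classify_group("00000010"))        # None
```

In the second call, the last three bits 110 give 6, which is outside {2, 3, 5, 7}, the middle bits are 00, and the first three bits 011 select column 4.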

The BioEN Algorithm
Based on the encryption framework and the above techniques, we developed a DNA-strand-displacement-based encryption algorithm (Algorithm 3), hereafter referred to as BioEN, which utilizes Huffman coding, DNA SDR, and cyclic shift. Note that the reverse process of BioEN is the corresponding decryption algorithm. This is illustrated by an example in Supplementary Table S8.

Algorithm 3: BioEN.
. . .
. . . Supplementary Table S2;
return C;
end
In light of the foregoing discussion, it is enough to explain how to transform D 3 into the final ASCII code, i.e., the ciphertext C (Line 18). First, transform D 3 into a (0,1)-sequence, denoted by S, according to the first column in Supplementary Table S2. Then, divide S into k = ⌈t/8⌉ groups, from left to right, such that each group contains eight elements, except possibly the last group, where t is the length of S. Now, if the last group contains exactly eight elements, then add a new group consisting of eight zeros to S (at the end of S); if the last group contains fewer than eight elements, then add enough ones at the end of the last group so that its length is extended to eight, and add a new group of length eight, consisting of zeros and ones, whose decimal value equals the number of ones added to the last group. As a result, a (0,1)-sequence of length 8(k + 1) is obtained, which can be divided into (k + 1) groups, from left to right, such that each group contains eight elements. We refer to each of these groups as an ASC-group. Observe that the last ASC-group records how many ones were added, which is needed for decryption.
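The ASC-group padding rule described above, together with its inverse, can be sketched as follows. The function names are hypothetical, and the input is assumed to be a non-empty string of the characters 0 and 1.

```python
def to_asc_groups(s: str) -> list:
    """Pad the (0,1)-sequence S and split it into ASC-groups of eight bits."""
    groups = [s[i:i + 8] for i in range(0, len(s), 8)]
    added = 0
    if groups and len(groups[-1]) < 8:
        added = 8 - len(groups[-1])
        groups[-1] += "1" * added        # fill the last group with ones
    groups.append(format(added, "08b"))  # final group records how many ones
    return groups


def from_asc_groups(groups: list) -> str:
    """Inverse step used in decryption: drop the counter group and the padding."""
    added = int(groups[-1], 2)
    body = "".join(groups[:-1])
    return body[:len(body) - added] if added else body


groups = to_asc_groups("01011")
print(groups)                          # ['01011111', '00000011']
print([int(g, 2) for g in groups])     # [95, 3]
print(from_asc_groups(groups))         # 01011
```

Each ASC-group is then read as one byte of the ciphertext, which is why values over the whole range 0 to 255 can occur.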

Experimental Setup
To show the feasibility of our approach in encryption, each experiment was set up with an experimental group and a control group. The concentration of the target DNA was expressed in the form of fluorescence intensity. The assembled DNA molecules were mixed according to the designed ratio, and the fluorescence intensity was monitored to obtain the final concentration of the target DNA strand.
All spectrofluorimetric measurements were performed using a real-time PCR system (QuantStudio 3 & 5 fluorescence quantitative PCR, Thermo Fisher Scientific, Waltham, MA, USA) equipped with a 96-well fluorescence plate reader. In the hold stage, the temperature was decreased at 1.6 °C/s to 4 °C and was then held for 10 s prior to the PCR stage. Then, the temperature was increased at 3 °C/s to 23 °C, and the fluorescence intensity was monitored every 10 s. The volume of each DNA sample was 20 µL.

Tools and Data
The sequences of all DNA strands in the experiment, listed in Supplementary Table S4, were designed by obtaining the original sequences using Nupack and then modifying the sequences by hand. The DNA oligonucleotides used were manufactured by Sangon Biotech (Shanghai, China). DNA oligonucleotides were purified by Sangon using high-performance liquid chromatography. Individual unlabeled DNA oligonucleotides were dissolved in 1× TE buffer (nuclease free, pH 8.0, Sigma-Aldrich, St. Louis, MO, USA) and stored at −20 °C. Oligos labeled with dyes or quenchers were dissolved in deionized water (Milli-Q) and stored at −20 °C. The DNA sample concentration was measured by NanoPhotometer N120 (Implen Inc., Westlake Village, CA, USA). All reagents were of analytical grade without further purification.
The DNA oligonucleotides were mixed in Tris-EDTA buffer (1× Tris-EDTA: 40 mM Tris base, 20 mM acetic acid, 2 mM EDTA adjusted to pH 8.0) with 12.5 mM MgCl2. All DNA complexes (listed in Supplementary Table S3) were mixed with an equal amount of corresponding single-stranded DNA to 10 µM. All samples were annealed in a polymerase chain reaction (PCR) thermal cycler. The temperature was set at 95 °C for 2 min initially and then decreased to 4 °C at a rate of −0.1 °C every 6 s. The hybridized molecules were stored at 4 °C for further use.
For simulation and dynamic analysis, we used Visual DSD [34]. The simulation duration was set to 600 s. The reactant concentration was at least 10 nM.

Experimentation Procedure
The initial key was obtained by biological experiments. Two DNA strand displacement modules were designed to obtain seeds 2-1 and 1-2. Before carrying out the biological experiments, simulation experiments were conducted as an auxiliary verification.

Simulation Experiment of the DR-Module
In the Degradation Reaction (DR)-module, there were Single-stranded A and auxiliary Complexes B, D, and G. The initial concentration of A was [A] 0 = 20 nM, and the initial concentration of each of B, D, and G was C m = 10 nM. The (DSD) reaction rates were k 1 = k 2 = 7 × 10^−4 /nM/s and k 3 = 10^−1 /nM/s, where k 3 is the maximum reaction rate.
The rate constants of the corresponding DNA reactions were determined according to the rate constants of the formal chemical reactions, which were equal to the rate constants of the corresponding DNA strand reactions multiplied by the initial concentration of the auxiliary complexes. Specifically, k 1 , k 4 , and C m satisfy k 4 = k 1 C m . The simulation process was performed for 600 s, and the concentration of A was reduced from 20 nM to 10 nM (see Figure 4a).

Biological Experiments of the DR-Module
To obtain the seed 2-1, we conducted two groups of biochemical reactions, named the experiment group and the control group, where the concentration of all species involved in the experiments (i.e., A, B, D, and G) was 10 µM, and the control group served only as a reference. The experiment group included 4 µL each of A, D, and G and 6 µL of B, while the control group included 4 µL of A and 14 µL of Tris-EDTA buffer (1× Tris-EDTA: 40 mM Tris base, 20 mM acetic acid, and 2 mM EDTA adjusted to pH 8.0). We put these two groups into the fluorescence quantitative PCR instrument and examined the fluorescence intensity change of A. Initially, they had the same concentration of A. When the reaction tended to be stable, the concentration of A in the experiment group was reduced by half, while the concentration of A in the control group was unchanged (see Figure 5a).
To show the key sensitivity of our approach (see Section 4.1.2), we conducted a contrast experiment, in which the experiment group included 5 µL each of A, B, D, and G, while the control group included 5 µL of A and 15 µL of Tris-EDTA buffer. The results are shown in Figure S1.

Simulation Experiment of the CR-Module
In the Catalysis Reaction (CR)-module, there were Single-stranded A and auxiliary Complexes B and D. The initial concentration of A was [A] 0 = 10 nM, and the initial concentration of each of B and D was C m = 10 nM. The (DSD) reaction rates were k 5 = 9 × 10^−3 /nM/s and k 6 = 10^−2 /nM/s, where k 6 is the maximum reaction rate. The rate constants of the formal chemical reactions were equal to the rate constants of the corresponding DNA strand reactions multiplied by the initial concentration of the auxiliary complexes. Specifically, k 5 , k 7 , and C m satisfy k 7 = k 5 C m . The simulation process was performed for 600 s, and the concentration of A was increased from 10 nM to 20 nM (see Figure 4b).
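As a quick check of the relation between the DNA-level and formal rate constants, the following sketch evaluates k 4 = k 1 C m for the DR-module and k 7 = k 5 C m for the CR-module with the values quoted in the simulation setup. The variable names are hypothetical; the resulting units are per second because a bimolecular constant in /nM/s is multiplied by a concentration in nM.

```python
import math

C_M = 10.0   # nM, initial concentration of each auxiliary complex
K1 = 7e-4    # /nM/s, DR-module strand displacement rate constant
K5 = 9e-3    # /nM/s, CR-module strand displacement rate constant

k4 = K1 * C_M  # formal rate constant of the DR-module, in /s
k7 = K5 * C_M  # formal rate constant of the CR-module, in /s

print(f"k4 = {k4:.3g} /s, k7 = {k7:.3g} /s")
```

With these inputs the formal constants come out as roughly 0.007 /s and 0.09 /s, respectively.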

Biological Experiments of CR-Module
To obtain the seed 1-2, we conducted two groups of experiments, as for the DR-module, in which the concentration of A, B, and D was 10 µM. The experiment group included 5 µL each of A and D and 6 µL of B, while the control group included 5 µL of A and 11 µL of Tris-EDTA buffer. We put these two groups into the fluorescence quantitative PCR instrument and examined the fluorescence intensity change of A. Initially, they had the same concentration of A. When the reaction tended to be stable, the concentration of A in the experiment group was doubled, while the concentration of A in the control group was unchanged (see Figure 5b).
To show the key sensitivity of our approach (see Section 4.1.2), we conducted a contrast experiment, in which the experiment group included 7 µL each of A, B, and D, while the control group included 7 µL of A and 14 µL of Tris-EDTA buffer. The results are shown in Figure S2.

Experimental Results
The results are shown in Figures 4 and 5, respectively. As expected, the simulation and biological experiment produced consistent results. This provides a guarantee for the performance of these two SDR modules, which can be used to encode the seeds 2-1 and 1-2, respectively.

Security Analysis
In this section, we analyzed the security of our encryption algorithm.

Key Sensitivity
An excellent encryption scheme should be sensitive to the key, meaning a minor change to the key will cause major changes to the results of encryption and decryption. Because our key is highly associated with biological experiments and the experiments are very sensitive to the environment, the desired key can be generated only when all experimental conditions are set correctly. Any mistake will lead to a different result, which implies that the key is sensitive. In addition, the key extension mechanism (groupCS) introduces considerable confusion to the final key. To illustrate this, we designed three types of experiments. The plaintext we used was "anewencryptionapproachusingdnabiotechnologyandhuffmancoding", and the seed was 2-11-2.

Change One Base
Referring to Supplementary Table S1, the seed 2-11-2 was transformed into the DNA sequence key D = GCCCGCAAGCCGGCCGGCAAGCCC. We wanted to investigate the difference of the encryption results (obtained by BioEN) when an arbitrary base in D is changed. In the experiment, we selected the fifth base G and changed it to T, i.e., the changed DNA sequence was D′ = GCCCTCAAGCCGGCCGGCAAGCCC. Based on D and D′, the ciphertexts obtained by BioEN were completely different; see Figure 6a.

Change Experiment Conditions
Note that when conducting the biological experiment, for the DR-module, the concentration ratio of Species A, B, D, and G was 2:3:2:2; and for the CR-module, the concentration ratio of A, B, and D was 5:6:5. To show that the key is sensitive, we conducted a new experiment by setting the concentration ratio of A, B, D, and G to 1:1:1:1 for the DR-module and the concentration ratio of A, B, and D to 1:1:1 for the CR-module (see Supplementary Figures S1 and S2 for the results of the experiment). Consequently, the concentration changes of Species A for the DR-module and CR-module were 8:5 and 3:5, respectively, by which the seed we obtained was 8-53-5. Thus, the ciphertexts obtained by BioEN, based on the seeds 2-11-2 and 8-53-5, respectively, were very different; see Figure 6b.

To extend the seed 2-11-2, groupCS first transforms it into the DNA sequence D, which is further transformed to a (0,1)-sequence S by the rule listed in the first column of Supplementary Table S2. Then, based on S, a longer (0,1)-sequence Q is constructed according to the corresponding rules (Lines 6-9; groupCS). Note that here, we only considered the first iteration. We wanted to change an element of Q to test the effect on the final ciphertext. Thus, given the importance of each element's position in Q (Lines 10-14; groupCS), we changed the eighth element of Q from zero to one, and all other elements remained unchanged. Figure 6c shows that even such a slight modification led to a significant change in the final ciphertexts.

Key Space Analysis
Note that the (0,1)-sequence S mentioned in Section 4.1.3 has length 48. Denote by R(S) the (0,1)-sequence obtained from S by conducting the shift operation once, and let
$$R^{i}(S) = \begin{cases} O\left(R^{i-1}(S)\right), & i \equiv 1 \pmod{2}, \\ E\left(R^{i-1}(S)\right), & i \equiv 0 \pmod{2}, \end{cases}$$
where $R^{0}(S) = S$ and i is a positive integer. Since the length of S is finite, there may exist some positive integer r such that $R^{r}(S) = S$ and $R^{r+i}(S) = R^{i}(S)$ for every nonnegative integer i. We call the smallest r with this property the rank of S. Clearly, the final key is generated based on a (0,1)-sequence of length 48r, where r is the rank of S. We refer to the set of all distinct (0,1)-sequences of length 48r as the key space of the encryption algorithm BioEN. By a simple exhaustive analysis, we have the following proposition, which shows that the key space of BioEN is large enough to be secure. The detailed proof can be found in Supplementary Table S9.
The rank of S is 32, and the cardinality of the key space of BioEN is $2^{1536}$.
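The arithmetic behind this cardinality can be checked with a few lines of Python. The values 48 and 32 are taken from the text; the variable names and the comparison against a 256-bit key space are illustrative assumptions added only to give a sense of scale.

```python
SEED_BITS = 48   # length of the (0,1)-sequence S, from the text
RANK = 32        # rank of S stated in the proposition

# The final key is generated from a (0,1)-sequence of length 48r.
key_bits = SEED_BITS * RANK

# Each of the key_bits positions is 0 or 1, so there are 2**key_bits sequences.
key_space_size = 2 ** key_bits

# Number of decimal digits, to give a feel for the magnitude.
decimal_digits = len(str(key_space_size))

print(key_bits)
print(decimal_digits)
print(key_space_size > 2 ** 256)
```

This counts every (0,1)-sequence of length 48r, which is the definition of the key space used above; it does not by itself say how many of those sequences groupCS can actually reach from a given seed.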

Statistic Characteristic
We investigated the ASCII values of the characters appearing in the plaintext and ciphertext. The ASCII values of the plaintext characters fell within the range 95-125, whereas those of the ciphertext characters spread over the range 0-255; see Figure 7. Such a large difference in the distribution of ASCII values provides a strong guarantee of protection against statistical attacks.
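A comparison of this kind can be sketched as follows. The sample plaintext `b"hello_world"` is a hypothetical input and not the text used in the experiment, and the helpers `value_range` and `histogram` are illustrative names; the same functions would be applied to the ciphertext bytes produced by BioEN.

```python
from collections import Counter


def value_range(data: bytes) -> tuple[int, int]:
    """Return the smallest and largest byte value occurring in data."""
    if not data:
        raise ValueError("data must not be empty")
    return min(data), max(data)


def histogram(data: bytes) -> list[int]:
    """Return a 256-entry list with the number of occurrences of each byte value."""
    counts = Counter(data)
    return [counts.get(value, 0) for value in range(256)]


# Hypothetical sample plaintext (lowercase letters and an underscore).
plaintext = b"hello_world"

low, high = value_range(plaintext)
hist = histogram(plaintext)

print(low, high)
print(sum(1 for count in hist if count > 0))  # number of distinct byte values
```

Plotting `hist` for the plaintext and for the ciphertext side by side gives a figure of the same type as Figure 7: a narrow band of occupied values for the plaintext against a much wider spread for the ciphertext.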

Conclusions
We proposed a bio-experiment-based DNA encryption framework for data security (i.e., Algorithm 1). Based on the proposed framework, we introduced an encryption algorithm (i.e., BioEN) by designing a Huffman-coding-based tri-phase transformation method to process the raw plaintext, two DNA SDR modules to generate the initial key, and a cyclic-shift-based mechanism (i.e., groupCS) to extend the key. The proposed algorithm highlights the importance of biochemical experiments. To validate the feasibility of the proposed algorithm, we conducted both a DSD simulation and a biochemical experiment. In contrast to the existing DNA strand displacement encryption algorithms, the proposed algorithm depends heavily on the experiments and generates pseudo-random sequences by tracing the concentration change of the target DNA strand. Further analysis of the security showed that our algorithm is key sensitive, has a large key space, and can effectively resist statistical attacks. Compared with the works in [28,29], our encryption approach has the advantage of performing encryption through DNA strand displacement experiments rather than remaining at the theoretical or simulation stage, which is expected to push forward the research on DNA-strand-displacement-based encryption. Though designed for text encryption, our encryption framework may also be applicable to image encryption or other areas of encryption, which would be worth exploring in future work.
Supplementary Materials: The following supporting information can be downloaded at: www.mdpi.com/article/10.3390/nano12050877/s1, Table S1: DNA coding. Table S2: DNA encoding and decoding rules. Table S3: Synthetic DNA complexes. Table S4: DNA sequence design. Table S5: DNA XOR operation. Table S6: DNA ADD operation. Table S7: Example of groupCS. Table S8: An example of the algorithm BioEN. Table S9: Proof of the key space analysis. Figure S1: The fluorescence intensity changes when the concentration ratio of A, B, D, and G is 1:1:1:1 in the DR-module. Figure S2: The fluorescence intensity changes when the concentration ratio of A, B, and D is 1:1:1 in the CR-module.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author.