Novel Algorithm for Multi-Time Data Implantation in a Special Cyber-Manufacturing Architecture

Abstract: A physical cyber system connects all authenticated cyber devices in its network. Nowadays, many wearable devices function as cyber devices, and people increasingly use them for their healthcare. These devices would become very popular if an easy-to-use manufacturing architecture could be created and, at the same time, the devices could protect the data stored on them. In this article, we suggest a manufacturing architecture for a healthcare device in which our proposed data protection method works well. The architecture is simple to implement, and the data protection method hides information in a DNA sequence. Present DNA-based data hiding schemes implant secrets in the converted binaries of nucleotides. The number of implanted bits is no more than the number of nucleotides; moreover, these schemes expand the stego DNA sequence noticeably. When implanting a large message, e.g., the historical records of patients, it is hard for these schemes to implant the whole secret in a single DNA sequence. A large DNA sequence might be a solution in some contexts. Nevertheless, managing a large DNA sequence and its expanded part in a fixed memory space would be challenging because of their size. To address this problem, we propose a multi-time data embedment method that can implant as much data in a DNA sequence as needed. Although it offers a greater embedding capacity, it does not increase the length of the modified DNA/RNA sequence. Thus, it limits the extra memory load on the chip. The proposed method implements several features to improve the security of both the implanted data and the DNA sequence. In the experiments, the proposed method outperforms the competing schemes on all measurements.


Introduction
Cyber manufacturing is one of the most influential branches of the physical cyber system. Typically, a cyber-manufacturing system focuses on the architecture of the devices used in cyber systems, their data acquisition, retrieval and processing methods, and the functional efficiency and security features of data and devices [1]. At the same time, it deals with the related engineering and technological aspects of the devices to make them good-looking, easy to use and robust. Commonly used cyber devices include smart watches and healthcare and assistance devices for disabled and elderly people [2]. With the advancement of industry and technology, those devices have started to stand out in terms of their functionality. It is widely expected that, in the near future, a single small device will provide us with multiple services, e.g., personal security, day-to-day work, accounting, asset management, etc. Body-worn devices will then take on a different dimension. These devices will be widely used to simplify our day-to-day operations and increase security [2].
Certainly, these devices will include healthcare services as well. Current healthcare devices are capable of storing several pieces of casual information about the people who wear them. Some devices may update a user's information to a limited extent through a set of device-implanted sensors. However, future healthcare devices will offer capabilities that are hard to imagine at present. They will be able to perform many physical and pathological tests on the human body automatically. These devices will automatically start communicating with the designated doctors or any healthcare provider to receive suggestions and consultations. It will not be surprising if a robotic device takes its owner to the doctor or hospital when it senses that the owner's health has deteriorated. Although we have started to see similar applications on a small scale, there are still questions about their capability and the quality of their skills. That limitation is mainly due to the small size and low-cost demand of such devices. However, we expect better services from them once they are capable of processing and storing huge amounts of information.
The healthcare devices will store demographic, pathological, historical and other data of the owner in their memory. This information will be updated regularly. If we preserve the historical data, at some point in time that data will become large, especially for old patients and chronic diseases. It could even become larger than the memory capacity of the device. Furthermore, the information stored on the device will be threatened because the device will be connected to the cyber world, and its information can easily be stolen if we store plain data in it. Hence, converting the data to a secured format is essential. Moreover, failure to provide data protection will also hamper the cyber-manufacturing process, because a security breach of the information will be a big obstacle to obtaining good service. If we do not work now on the security architecture of the cyber-manufacturing world, the cyber-manufacturing method, especially for wearable healthcare devices, will not be implemented [3]. Consequently, we have to look at three things: (i) a simple architecture for the device, (ii) its information management and (iii) information security. Hence, in this article, we place an emphasis on developing an architecture for future healthcare devices and an efficient algorithm for their data management and security.
Our primary target is to design a simple but effective architecture. We first assume that the intended wearable cyber device is equipped with the necessary sensors to collect and process human health information as well as to store it. The sensors are connected to a memory unit and a signaling unit through the processing unit of the device. The memory unit consists of two parts: a restricted part and a reusable part. The initial information of the person, e.g., the demographic information, will be kept in the restricted part of the memory at the time of manufacturing or in the one-time erasable memory of the device. Generally, this is very ordinary personal information, e.g., name, parent's name, address, gender, etc. All of this information will be encrypted and stored in that memory. The details of the architecture are given in Section 3.
Our secondary target is to meet the demand of storing a large volume of data, even if it is an ever-growing dataset. As we are working on smart devices where memory size is a concern, we present a method of repeatedly implanting secrets in a small-sized cover. As a medium, we have chosen a small DNA/RNA sequence. That technique helps us both to manage large data and to secure it.
The proposed scheme has several distinct features: (i) It presents a simple architecture for cyber devices that will be a boon for production in the industry; (ii) The architecture would be useful for managing its memory, a signaling unit and sensors effectively and building a perfect synchronization among these units; (iii) Uses of restricted memory will allow the devices to be used for other purposes, e.g., in banking, owner identifying, etc.; (iv) As far as we know, no other study has used a small chunk of nucleotides for embedment of a large volume of data, e.g., a chunk of the ten nucleotides is enough to conceive any volume of information; (v) A multi-time implantation method is proposed for the first time in the field of its kind; (vi) There is no other scheme that generalizes its functionality to work with any size of message, e.g., message size of some multiple of the length of DNA; (vii) Shuffling elements in the stego key by the proposed scheme is an innovative idea that is effective for improving the security of data. For this, we think that the proposed method will be a striking one in the field of cyber physical manufacturing, as well as the DNA-based data hiding arena.
The study involved experiments on real medical data, as well as on four separate RNA and three DNA sequences to justify its robustness. Four RNA sequences of COVID-19 and one DNA sequence were taken from the NCBI repository. The RNA sequences of nCoV-19 and DNA sequences are open to all [4][5][6][7] and, therefore, freely available. Two additional sequences were prepared by us from real data. The proposed method worked successfully in all the RNA/DNA sequences and provided promising results.
The article comprises seven sections. Section 2 is provided to give a description of the background studies. Section 3 narrates the proposed cyber physical manufacturing architecture. The proposed multi-time data embedment method is given in Section 4. Section 5 analyses the proposed scheme. We discuss the results in Section 6. Section 7 concludes the article.

Related Study on Architecture
Kamal and Islam in [8,9] presented two architectures for securing data in a smart card. They implanted patients' data in the card owner's photograph. However, these methods can neither handle the sensor data nor connect the devices to a cloud and doctors. Rather, the information security provisions have been worked out in those schemes, when someone wants to receive healthcare at a terminal using a smart card. Lin W. D. and Low Y. H. in 2020 [10] proposed a digital twin architecture of the physical cyber system. Again, Liu, Xiaoqing F. et al. [11] in 2017 proposed a communication layer architecture of the cyber physical system. However, they did not state the hardware architecture of the device, which is essential for manufacturing industries.

Related Study on Data Security
Cryptography is a traditional method of managing the secrecy of a message. In cryptography, an encryption technique destroys the meaning of the data. The sender does this at its end using a suitable key. The secret key helps the communicating parties to decrypt the modified data at the receiver end [12]. A smart technique in present data communication is to hide the secret message in a medium, e.g., image, audio or video. The processes of hiding data in media are classified as watermarking and steganography. Watermarking is mainly used to provide data integrity, rather than security [13][14][15]. In contrast, steganography is the art of hiding message data in a cover medium to secure a data communication method. A steganography method modifies the cover medium by some rules while implanting the secret message in it. The modified medium is then called the stego medium. The steganography process makes the data invisible in the stego medium and thus, one cannot realize the existence of a secret in it even if the stego medium is stolen. In fact, the system deceives unauthorized persons or devices by pretending that there is nothing more inside the medium. Therefore, the interest of both data handlers and researchers in the field of steganography is increasing. Steganography schemes are broadly classified into two groups: reversible and irreversible. Reversible schemes retrieve both the data and the cover medium at the extractor end without taking any help from the encoder module [9,16-24]. On the other hand, irreversible schemes only extract the secret data from the stego contents [8,25-29]. These schemes are not concerned with rebuilding the cover medium.
Although image, audio and video are famous media for steganography, recently, some researchers have used a deoxyribonucleic acid (DNA) sequence in the cloud as a medium of hiding secrets [30,31]. DNA sequences are available at various repositories. A very renowned one is NCBI [7,30,31]. Rahman et al. [30] and Hamed G. et al. [31] implanted secret bits in DNA. In 2019, the authors in [30] stated a DNA data hiding approach for data authenticity in mobile cloud. The cloud was based on the healthcare system. The authors used a cover DNA sequence with nucleotides A, C, G, T of length l. They implanted a message M of n-bits. They converted the DNA sequence into a 2D matrix with n rows and l/n columns. They randomly calculated a one-time string E of n-bits. That E was a secret key. The method divides both M and E in σ segments with b-bits each, where σ = n/b and 2 ≤ b ≤ l/n. They also used a substitution rule (SR), presented in Table 1, where they used '00' for nucleotide 'A', '01' for 'C', '10' for 'G' and '11' for 'T'.
The position of a nucleotide in a 2D matrix was selected randomly using a sequence generator. The column sequence was stored in a set of named payload (R). Then, the method substitutes a pointed nucleotides by mapping binary values of the SR table, i.e., Table 1. That substitution mechanism produced a codeword W s of length of b-bits. After that, W s were generated by the exclusive OR operation between message segment M s , codeword W s , and string segment E s . The W s was converted into nucleotides according to the SR. A 2D matrix was generated after repeating the aforementioned process and this 2D matrix was transformed into one-dimensional vector of nucleotides. Finally, it appended the payload R and tail T with the one-dimensional vector to construct the stego DNA sequence C. The stego DNA sequence C and a stego key SK are sent to the cloud, where the stego key contains l, M, E, b, SR and a seed for generating a pseudo-random sequence. The length of C was less than the length of C. Hamed G. et al. in 2016 [31] combined cryptography and steganography to hide data in the DNA sequence as well as to improve the volatility of stego DNA. The method first converts the DNA and the cipher text into binary. The scheme places bits of cipher text into arbitrary position of DNA's binary using true random number seed. That placement introduces an expansion in the final string. The scheme executes a conversion method to create stego nucleotides from these binaries. Instead of managing true random number seed, the scheme was simple by nature.
Wang et al. [32] in 2019 applied a recombinant DNA technique to embed secrets in a DNA sequence. The method works in two phases. First, it converts each character of the message into a DNA triplet, known as the encoded message. The scheme next selects a cover DNA and a reference DNA. During data embedment, a three-input substitution table maps to a single nucleotide, where the three input elements come from the cover DNA sequence, the reference DNA sequence and the encoded message, respectively and progressively. The mapped values are called the stego DNA sequence. As the method is based on living organisms, in its second phase it ligates the stego DNA sequence with a selectable marker and thus obtains a recombinant DNA sequence. Finally, the method transforms that recombinant DNA sequence into a host cell. These cells, along with many dummy cells, are sent to a destination. After cultivating these cells, reversible functions are applied to extract the secrets and the cover DNA sequence. As the second phase is fully based on biological processes and does not carry any secret message, we compare only its first phase with our scheme.

Proposed Architecture
The list of main hardware of the target smart devices is a processing unit, a memory unit, a set of sensors for collecting user data, a wireless sensor to connect to the cloud and a set of connections among the units. The memory unit consists of a restricted ROM and a general purpose ROM. These two ROMs could be implanted in the device as either two separate ROMs or in a single ROM. In the second case, a part of the ROM will be declared as a restricted part of the memory. The restricted ROM is writable for a single time only. The demographic information of the device owner will be written there.
The proposed method will convert this demographic information into a DNA sequence. The converted DNA sequence will be stored in an erasable ROM (E-ROM) (or part of one). In all subsequent tasks, it will achieve five aims-(i) it will gather the latest health information of a person through its sensors; (ii) it will collect the previously stored information from E-ROM and will extract all healthcare data; (iii) it will understand the situation of the patient by comparing that information with the most recently collected information; (iv) if that measurement signifies some serious issues that are to be treated soon, it will take the necessary steps to save the patient, otherwise, it will notify the person through a gentle signalling process; (v) it will update the device memory with the latest information.
The updating task will be performed by the proposed data hiding technique. The architecture of the proposed cyber-manufacturing system is graphically shown in Figure 1. The functionality of the hardware equipment of the proposed architecture is described in detail in the section on the analysis of the proposed method with a different figure. The proposed device has several sensors to generate pathological data. The device stores the sensed data in a memory unit after implanting them in a DNA sequence. The DNA sequence is generated from the demographic information of the owner. Each time, it senses the present health statistics through its sensors. Then, it compares the sensed data with the historical data. Historical data were saved earlier by implanting them in a DNA sequence.
A key concern of this research work is to secure the information in the second and fifth of the above five tasks. For this, during its update task, the scheme will implant the latest information in the previously stored DNA sequence. That is why it is advised to generate a DNA sequence from the owner's demographic information at the first stage of the device registration.
The proposed scheme will convert the demographic information to a DNA sequence, as explained. That conversion technique will, indeed, make the information useless to outer applications. However, this demographic information might be used for other purposes, e.g., in banking. To make the device usable for applications other than healthcare, we have preserved this demographic information in a restricted ROM as raw data.

Proposed Method of Securing Data in Cyber Device
It is already mentioned that a major issue of concern in this research is to secure the data at the fifth operational step, as stated in the proposed architecture. We intend to apply a data embedment policy for establishing a data security task. The embedment is performed in the fake DNA sequence where the fake DNA sequence was generated from demographic information of the device owner. However, the length of the fake DNA sequence would not be long enough to conceive the information in a single DNA sequence, because the information might be larger than the length of the DNA sequence for considering historical data. For example, if we consider a cyber device for a patient that will mainly be used for telemedicine purposes or as a smart medical card, the device has to save all previous information of the patient as well as the current information. Such data become larger for a chronically ill or old patient. To overcome this problem, we, in this research, apply multi-time data embedment in the same DNA sequence.

Proposed Multi-Time Data Embedment Scheme
All DNA sequences comprise the four nucleotides A, C, G and T. According to [4][5][6][33][34], the RNA sequence of nCoV-19 also consists of the same four nucleotides. Let N_r be an RNA sequence of nCoV-19, for example, N_r = "AGGCTCCA...". The proposed method implants a binary message stream M in the nucleotides of N_r. The method converts the nucleotides to binaries. It generates a secret key S one time only; this could be the password of the owner. The method combines a chunk M_i of the binary message, the secret key S and the binary form R of the nucleotides to generate stego nucleotides. If the message is too long to hide in the nucleotides in one pass, the scheme repeats the process. The steps are as follows.

1. Convert the nucleotides in N_r to binary values according to Table 1. Let the converted binaries of the RNA sequence be R. The size of R is twice that of N_r, as two bits are generated for each nucleotide.

2. Measure the length of R and store that integer value in L.
3. Generate a secret key S that consists of L binary bits.
4. Divide the binary message M into M_1, M_2, ..., M_n, M_t, where the size of each of M_1 to M_n is L and the size of the last chunk M_t is equal to or less than L, so that M = M_1 ∥ M_2 ∥ ... ∥ M_n ∥ M_t, where ∥ stands for concatenation of binary strings. Let the length of M_t be L_t. Thus, the total length of the message is n × L + L_t bits. Now, append L − L_t zeros to the left of M_t to make it equal in size to M_1.
5. Initialize R̃ with R.
6. Execute the following loop n + 1 times. In each cycle, take a chunk M_i of the message M and generate the stego result by an exclusive-OR operation of M_i with R̃ and S, as in Equation (1). In the first cycle, R̃ is R, according to step 5. After each execution of Equation (1), R̃ is appended to the end of R̃_x according to Equation (2). R̃_x is a dynamically expandable array that grows by L bits each time. The purpose of R̃_x is to keep track of the stegos of the first (n − 1) cycles, which is why its length is (n − 1) × L; the proposed scheme makes it a part of the stego key. The stego of the last cycle is stored in R̃, whose length is L. Finally, R̃ and R̃_x are converted to Ñ_r and Ñ_x, respectively.

   For i = 1 to n + 1, do
   $$\tilde{R} = M_i \oplus \tilde{R} \oplus S \qquad (1)$$
   $$\tilde{R}_x = \tilde{R}_x \,\|\, \tilde{R} \qquad (2)$$
   End loop

   where ⊕ stands for the exclusive-OR operator and ∥ for concatenation.
7. Grouping the bits into pairs and converting R̃ and R̃_x into nucleotides again according to Table 1, we obtain the stego nucleotides. Let the sequences of stego nucleotides be Ñ_r and Ñ_x, respectively. The sender side thus implants n × L bits of secrets in an RNA sequence and produces two new stego sequences, Ñ_r and Ñ_x. Of these two, Ñ_r is used as the stego nucleotides and stored in a cloud for further use. The other sequence, Ñ_x, is kept as a part of the stego key. The length of Ñ_r is the same as that of N_r, and the length of Ñ_x is n − 1 times that of N_r.
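The implantation steps can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`to_bits`, `to_nucleotides`, `xor_bits`, `embed`) are hypothetical, and the sketch assumes that the history sequence records the state before each cycle, so that the cover is its first block, as in the worked example later in the article. The prose description of the history length could be read differently.

```python
# Substitution rule quoted in the text (Table 1): A=00, C=01, G=10, T=11.
NUC_TO_BITS = {"A": "00", "C": "01", "G": "10", "T": "11"}
BITS_TO_NUC = {bits: nuc for nuc, bits in NUC_TO_BITS.items()}


def to_bits(nucleotides):
    """Convert a nucleotide string to its binary string (two bits each)."""
    return "".join(NUC_TO_BITS[n] for n in nucleotides)


def to_nucleotides(bits):
    """Group bits into pairs and map each pair back to a nucleotide."""
    return "".join(BITS_TO_NUC[bits[i:i + 2]] for i in range(0, len(bits), 2))


def xor_bits(*operands):
    """Bitwise XOR of equal-length binary strings, keeping leading zeros."""
    width = len(operands[0])
    value = 0
    for operand in operands:
        value ^= int(operand, 2)
    return format(value, f"0{width}b")


def embed(cover, message_bits, key_bits):
    """Implant message_bits into cover in as many cycles as needed.

    Returns (stego, history, tail_len): the stego nucleotides, the list of
    earlier states (assumption: cover first), and the unpadded tail length.
    """
    if not message_bits:
        raise ValueError("message must not be empty")
    state = to_bits(cover)                      # step 1: R
    L = len(state)                              # step 2
    if len(key_bits) != L:                      # step 3: S has L bits
        raise ValueError("key must have exactly L bits")
    # Step 4: split into L-bit chunks and left-pad the last one with zeros.
    chunks = [message_bits[i:i + L] for i in range(0, len(message_bits), L)]
    tail_len = len(chunks[-1])
    chunks[-1] = chunks[-1].zfill(L)
    history = []
    for chunk in chunks:                        # step 6
        history.append(state)                   # assumed ordering, see lead-in
        state = xor_bits(chunk, state, key_bits)  # Equation (1)
    return to_nucleotides(state), [to_nucleotides(h) for h in history], tail_len
```

For instance, `embed("TTTGTTGAGT", "1011" * 12 + "10", "00001111110111000001")` runs three cycles for a 50-bit message and returns a stego sequence that is still ten nucleotides long.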
The scheme can serve both standalone and cloud-based applications. In a cloud-based application, only Ñ_r is stored in the cloud space. The intended recipient collects Ñ_r from the cloud and then requests the stego key from the sender side/server. In that case, a cloud space of only L/2 nucleotides is enough for communicating n × L bits of the message.
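The storage claim can be checked with a few lines of arithmetic. The sketch below is illustrative only; the function name `cloud_footprint` and the returned field names are hypothetical. It computes L, the number of implantation cycles for a given message length, and the number of nucleotides that must be kept in the cloud, which stays at L/2 regardless of the message size.

```python
def cloud_footprint(cover_nucleotides, message_bits):
    """Return L, the cycle count and the cloud storage in nucleotides."""
    L = 2 * cover_nucleotides                   # two bits per nucleotide
    cycles = (message_bits + L - 1) // L        # ceiling division
    return {"L_bits": L, "cycles": cycles, "cloud_nucleotides": L // 2}


# A ten-nucleotide cover and an 800,000-bit message.
print(cloud_footprint(10, 800_000))
```

The part that does grow with the number of cycles is the history sequence kept in the stego key, not the sequence stored in the cloud.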

Stego Key Generation
The stego key is a secret key that is used to extract the implanted secrets from the stego DNA/RNA. The encoder shares it through private communication; thus, the authorized decoder collects it from the encoder. The stego key consists of Ñ_x, the number of message bits L that are implanted in the last cycle, the total number of execution cycles n and the secret key S. Thus, the stego key (SK) is defined by SK = <Ñ_x, L, n, S>. The length of SK is shown in Table 2.

The destination end, i.e., the decoder, may collect Ñ_r from a cloud, a standalone system or a communication link, according to the nature of the application. Our concern is to extract the message bits from Ñ_r and to regenerate the cover DNA/RNA N_r. Before doing that, the decoder collects SK and separates Ñ_x, L, n and S from it. The method converts both Ñ_r and Ñ_x into binary values using Table 1, which are, indeed, R̃ and R̃_x, respectively. It then concatenates R̃ and R̃_x and divides the result into pieces of L bits. Let the pieces be R̃_{x,1}, R̃_{x,2}, ..., R̃_{x,n+1}.
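As a rough illustration of the key layout, the following Python sketch models SK as a small record and shows the preparatory step of cutting the history sequence into blocks. The class name `StegoKey`, its field names and the `blocks` helper are hypothetical and not taken from the article; only the four components come from the definition of SK above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StegoKey:
    history: str     # history nucleotides kept in the key (N_x with tilde)
    last_bits: int   # number of message bits implanted in the last cycle
    cycles: int      # total number of execution cycles n
    secret: str      # secret key S as a binary string

    def blocks(self, block_nucleotides):
        """Split the history into blocks as long as the cover sequence."""
        step = block_nucleotides
        return [self.history[i:i + step]
                for i in range(0, len(self.history), step)]


# History from the worked example in the analysis section (four blocks).
key = StegoKey("TTTGTTGAGTCACGCGGGTACGTGAAACGAGCTTTTATGA", 20, 4,
               "00001111110111000001")
print(key.blocks(10))
```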
The method executes the following steps of Algorithm 2 and Equations (3)-(6) to extract secrets.

Algorithm 2: Data Extraction
Step 1: For i = n + 1 down to 2 (in steps of −1), do
$$R_1 = \tilde{R}_{x,i} \qquad (3)$$
$$R_2 = \tilde{R}_{x,i-1} \qquad (4)$$
$$M_i = R_1 \oplus R_2 \oplus S \qquad (5)$$
$$M = M_i \,\|\, M \qquad (6)$$
End loop

The flowchart of the extraction process is depicted in Figure 3. If we look back at Equation (1), we see that R_1 was produced by an exclusive-OR operation among R_2, S and M_i. Therefore, R_1 ⊕ R_2 ⊕ S in Equation (5) produces M_i. Because the most recently implanted chunk is extracted first (the last chunk, then the second-last chunk, and so on), Equation (6) places every extracted chunk in front of the chunks already collected in M. After the loop completes, R_2 holds the value of R̃_{x,1}. That R̃_{x,1} is R, as presented in step (5) of Figure 1. Hence, R_2 is R. R is converted to the nucleotides N_r according to Table 1. As the scheme is able to regenerate N_r, it is a reversible method. The proposed reversible method thus extracts both M and N_r at the receiver end.
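The extraction loop can be sketched in Python as follows. This is an illustration under assumptions, not the authors' code: the names `extract`, `to_bits`, `to_nucleotides` and `xor_bits` are hypothetical, the history is assumed to start with the cover block, and the unpadded length of the last chunk is passed in explicitly as `tail_len`.

```python
# Substitution rule quoted in the text (Table 1): A=00, C=01, G=10, T=11.
NUC_TO_BITS = {"A": "00", "C": "01", "G": "10", "T": "11"}
BITS_TO_NUC = {bits: nuc for nuc, bits in NUC_TO_BITS.items()}


def to_bits(nucleotides):
    return "".join(NUC_TO_BITS[n] for n in nucleotides)


def to_nucleotides(bits):
    return "".join(BITS_TO_NUC[bits[i:i + 2]] for i in range(0, len(bits), 2))


def xor_bits(*operands):
    width = len(operands[0])
    value = 0
    for operand in operands:
        value ^= int(operand, 2)
    return format(value, f"0{width}b")


def extract(stego, history, key_bits, tail_len):
    """Recover (message_bits, cover) from the stego sequence and the key.

    history is the list of earlier states (assumption: cover first);
    tail_len is the number of real bits in the last implanted chunk.
    """
    pieces = [to_bits(block) for block in history] + [to_bits(stego)]
    message = ""
    last = len(pieces) - 1
    for i in range(last, 0, -1):                 # newest chunk first
        chunk = xor_bits(pieces[i], pieces[i - 1], key_bits)
        if i == last:
            chunk = chunk[len(chunk) - tail_len:]  # drop the zero padding
        message = chunk + message                # prepend to the result
    return message, to_nucleotides(pieces[0])


# Two cycles on the two-nucleotide cover "AC" with key 1010.
print(extract("TT", ["AC", "CA"], "1010", 2))
```

Prepending each recovered chunk keeps the message in its original order even though the chunks are recovered from last to first.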

Manufacturing Architecture
The hardware architecture of the system is depicted in Figure 4.

Figure 4. Hardware architecture. The processing unit collects information from the sensors. It then reads the DNA sequence from the main memory and retrieves the historical and other data from it. The processing unit compares the sensors' current data with the older data and signals accordingly. After updating the historical data with the sensed information, the processing unit implants the data in the original DNA sequence again. Finally, the processing unit sends the stego DNA sequence back to the main memory.

Manufacturing Architecture: Processing Unit
The processing unit of the cyber device stores the demographic information of the device owner in the restricted memory as raw data. At the same time, it converts the demographic information into a DNA sequence and hides the other information in it using our proposed data hiding method. After that, it saves the stego DNA sequence in the main memory. Next, when the processing unit senses a new piece of data from its sensors, it collects the most recently saved DNA sequence from the main memory. The processing unit then extracts the secret data from that stego DNA sequence. The unit also regenerates the original DNA sequence by the proposed reversible data extraction method. The processing unit compares the newly sensed data with the extracted data and sends the results to the signaling unit.
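The cycle performed by the processing unit can be summarized as a short Python sketch. Everything here is hypothetical scaffolding: `update_cycle` and its seven callable parameters stand in for the sensors, the memory unit, the data hiding routines and the signaling unit, none of which are specified at code level in the article.

```python
def update_cycle(read_sensors, load, extract, assess, notify, embed, store):
    """One sense-extract-compare-signal-update pass of the processing unit."""
    reading = read_sensors()                    # latest sensed data
    stego, key = load()                         # stego sequence from main memory
    records, cover = extract(stego, key)        # reversible extraction
    level = assess(records, reading)            # compare new data with history
    notify(level)                               # hand the result to signaling
    new_stego, new_key = embed(cover, records + [reading])
    store(new_stego, new_key)                   # write back to main memory
    return level


# Stub components, only to show the order of the calls.
log = []
update_cycle(
    read_sensors=lambda: 72,
    load=lambda: ("STEGO", "KEY"),
    extract=lambda stego, key: ([70, 71], "COVER"),
    assess=lambda records, reading: 1,
    notify=log.append,
    embed=lambda cover, records: (cover + str(len(records)), "NEWKEY"),
    store=lambda stego, key: log.append((stego, key)),
)
print(log)
```

Passing the components in as callables keeps the sketch independent of any particular sensor, memory or signaling hardware.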

Manufacturing Architecture: Signaling Unit
The signaling unit is developed to provide one of five notifications. The first four notifications are called in-control signals. The last one is referred to as an out-control signal. According to the demand of the users, the manufacturer can increase the number of in- and out-control signals. The signaling unit applies a mathematical function f to generate a number from the set {1, 2, 3, 4}, i.e., f = 1, f = 2, f = 3 or f = 4. We have not given any fixed equation for that function. Rather, it is left to the manufacturers so that they can set their own function and fitted parameters according to this proposal or a modified one. The signaling unit generates one of the four notifications automatically based on the value of f. When f = 1, the cyber device creates a gentle notification, which means that there is no problem in the sensed data. f = 2 indicates some minor changes in the health statistics, and the device signals accordingly. f = 3 indicates major changes in the health condition; the person needs extra care. If f = 4, the person is in danger and should consult a doctor immediately; the signaling unit then automatically calls a designated doctor. Notification 5 is not generated by the automatic system of the signaling unit. It is issued after receiving a request from a doctor, which automatically triggers the signaling unit to set f = 5. When the device sees f = 5, it sends the report of the last-sensed data and the extracted historical data to the doctor. A list of those notifications is shown in Figure 5.

Figure 5. A mathematical function f is used to generate one of the numbers from the set {1, 2, 3, 4}. Here, f = 1 refers to a gentle notification, which means that there is no problem. A moderate notification, i.e., f = 2, is generated for some level of change in the sensed data. f = 3 is generated to alert the user about their health condition. When the signaling unit measures f = 4, it automatically calls a doctor. For f = 5, the signaling unit sends the latest report and the historical data to the designated doctor.
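A small Python sketch shows how such a signaling unit could be wired. Since the article deliberately leaves f to the manufacturer, the function `example_f` and its thresholds (5%, 15% and 30% relative change) are purely hypothetical placeholders, as are the names `NOTIFICATIONS` and `signal`.

```python
NOTIFICATIONS = {
    1: "gentle notification",
    2: "moderate notification",
    3: "alarm to the user",
    4: "automatic call to the designated doctor",
    5: "report sent to the designated doctor",
}


def example_f(previous, current):
    """Hypothetical f: grade the relative change of one health statistic."""
    if previous == 0:
        return 4                                # no baseline: be conservative
    change = abs(current - previous) / abs(previous)
    if change < 0.05:
        return 1
    if change < 0.15:
        return 2
    if change < 0.30:
        return 3
    return 4


def signal(previous, current, doctor_request=False, f=example_f):
    """Return (level, notification); level 5 only follows a doctor's request."""
    level = 5 if doctor_request else f(previous, current)
    return level, NOTIFICATIONS[level]


print(signal(100, 110))
print(signal(100, 100, doctor_request=True))
```

Keeping f as a parameter mirrors the proposal that each manufacturer plugs in its own function and fitted parameters.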

Analysis on Data Hiding Algorithm
To analyze the data hiding method, we first consider a small DNA sequence, TTTGTTGAGT.

Execution of Data Implantation Module
In the following, we explain the execution steps of the data implantation module.

Execution of Data Extraction Module
Data extraction rules work at receiver end. The receiver end has all the necessary information to extract the implanted data and to reconstruct the cover DNA/RNA. It has N x = TTTGTTGAGT-CACGCGGGTA-CGTGAAACGA-GCTTTTATGA; Secret key, S = 00001111110111000001.

Boosting Up Vulnerability
One noticeable point of the scheme is that the cover DNA is found in the first block of Ñ_x. The same is true in the case of RNA. In the above example, the cover DNA is TTTGTTGAGT. As the cover DNA is always the first block of Ñ_x, the scheme cannot preserve the security of the cover DNA if the key is stolen. Therefore, we introduce a policy of shuffling the blocks of Ñ_x. We did not explain that policy in the previous section for ease of understanding. Moreover, it is not necessary for the functioning of the algorithm.
We know that Ñ_x consists of n − 1 stego DNA blocks. Table 9 shows the method of shuffling the n − 1 stego blocks among themselves. The first column of the table shows the current block numbers of Ñ_x. These block numbers are randomly rearranged, so the block positions change after shuffling. The second column shows the positions of the blocks after shuffling, e.g., the blocks 0, 1, 2, 3, and so on, are moved to the index positions 1, 3, 2, 0, ..., respectively. The indices of all blocks after their movements are tabulated in the third column. The fourth and fifth columns show the binary equivalents of the moved positions and the nucleotides converted from those binaries. These nucleotides are read in a top-down manner, and thus the read value is ACATAGAACGCATCTATTCTTGGTGGGACCGC. Let it be Ñ_s. We make Ñ_s a part of the stego key. In that case, SK will increase in size by more than 64 bits. Finally, Ñ_x is replaced by Ñ_3 ∥ Ñ_0 ∥ Ñ_2 ∥ Ñ_1, where ∥ stands for concatenation. The new sequence is then Ñ_x = GCTTTTATGA-TTTGTTGAGT-CGTGAAACGA-CACGCGGGTA. At the receiver end, the decoder first separates Ñ_s from the last 32 nucleotides of SK. For this example, it is ACATAGAACGCATCTATTCTTGGTGGGACCGC. At that stage, it is necessary to know the number of blocks that the stego nucleotides contain. That could be part of a negotiated stego key or might be managed by adding two additional nucleotides. In our case, we communicate that information between the encoder and the decoder through the stego key. Indeed, it is the cycle number, i.e., n. In the above example, we used 4 blocks. Hence, the decoder separates the first eight nucleotides of ACATAGAACGCATCTATTCTTGGTGGGACCGC, i.e., ACATAGAA. The decimal conversion of ACATAGAA produces 1-3-2-0. That result states that the 2nd block, 4th block, 3rd block and 1st block form the reading sequence. The reader regenerates Ñ_x as Ñ_1 ∥ Ñ_3 ∥ Ñ_2 ∥ Ñ_0. Thus, Ñ_x = TTTGTTGAGT-CACGCGGGTA-CGTGAAACGA-GCTTTTATGA.
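The shuffling example can be reproduced with a short Python sketch. The function names (`decode_positions`, `shuffle_blocks`, `restore_blocks`) are hypothetical, and the sketch assumes that every block index is stored in two nucleotides (four bits), which is what the example suggests when eight nucleotides are read for four blocks.

```python
# Substitution rule quoted in the text (Table 1): A=00, C=01, G=10, T=11.
NUC_TO_BITS = {"A": "00", "C": "01", "G": "10", "T": "11"}


def decode_positions(index_nucleotides, n_blocks):
    """Read one target position per block, two nucleotides per position."""
    positions = []
    for i in range(n_blocks):
        pair = index_nucleotides[2 * i:2 * i + 2]
        positions.append(int("".join(NUC_TO_BITS[n] for n in pair), 2))
    return positions


def shuffle_blocks(blocks, positions):
    """Move block i to index positions[i] (encoder side)."""
    shuffled = [None] * len(blocks)
    for source, target in enumerate(positions):
        shuffled[target] = blocks[source]
    return shuffled


def restore_blocks(shuffled, positions):
    """Read the shuffled blocks back in their original order (decoder side)."""
    return [shuffled[target] for target in positions]


original = ["TTTGTTGAGT", "CACGCGGGTA", "CGTGAAACGA", "GCTTTTATGA"]
positions = decode_positions("ACATAGAACGCATCTATTCTTGGTGGGACCGC", 4)
shuffled = shuffle_blocks(original, positions)
print(positions)
print("-".join(shuffled))
print("-".join(restore_blocks(shuffled, positions)))
```

With the values from the example, the decoded positions are 1, 3, 2, 0, the shuffled sequence starts with GCTTTTATGA, and restoring the blocks returns the cover block TTTGTTGAGT to the front.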
From that stage, the process of extracting secrets and recovering cover DNA is explained in earlier discussions.

Result Analysis and Discussions
A good number of experiments were conducted on different DNA and RNA datasets, most of which were collected from the NCBI datasets [4][5][6][7]. The rest were generated by our own program. Our generated DNA sequences were used to check whether the proposed method is able to work on an independent dataset. We found no anomalies, i.e., the method worked successfully.
Among those datasets, the results for seven datasets are presented in the following discussions. Table 10 states the properties of those seven datasets. For use in our analysis, we produced several binary data streams of different lengths, as listed in Table 11. We treat these as secret messages and implant them in the various DNA and RNA datasets of Table 10.

| Message Name | Description | Length in Bits |
|---|---|---|
| OurMsg. 1 | We produced it with our program. | 800,000 |
| OurMsg. 2 | We produced it manually. | 60 |
| OurMsg. 3 | We produced it with our program. | 2000 |
| OurMsg. 4 | We produced it with our program. | 20,000 |
| OurMsg. 5 | We produced it with our program. | 100,000 |
| OurMsg. 6 | We produced it with our program. | 200,000 |

Checking Reversibility
First, we let the data embedment module implant the bits of a confidential message from Table 11 in a DNA/RNA sequence from Table 10. Thereafter, we execute the extraction module to retrieve the data from the stego DNA/RNA and to reconstruct the original cover DNA/RNA. Our system does this blindly, i.e., the decoder requires no extra help. The extracted message was stored in RetrivedMsg.txt, while the original message stream was saved to SecretMsg.txt. We compared these two files using the DOS file comparison command fc. We followed the same procedure for the cover and extracted DNA. In both cases, no difference was encountered. The result for one DNA is shown in Figure 6. Thus, the proposed method proves successful in both secret extraction and cover DNA reconstruction.
Figure 6. Checking the reversibility of the proposed method. Here, we first save the cover and stego nucleotides in two different files. Then, we rebuild the cover DNA sequence from the stego DNA sequence using our proposed algorithm. The rebuilt DNA sequence is saved to another file. DOS provides a command named fc, which compares two files. We used the fc command to compare the files containing the cover and the rebuilt cover DNA sequences. The fc command encountered no differences, which means that the two files contain identical information. Thus, proof of reversibility is obtained.
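
The same reversibility check can be reproduced outside DOS. The sketch below uses Python's standard `filecmp` module as a stand-in for fc; it is not the authors' tooling, and the helper name `files_identical` is hypothetical.

```python
import filecmp


def files_identical(path_a, path_b):
    """Return True when both files have exactly the same contents."""
    # shallow=False forces a comparison of the file contents,
    # not only of the os.stat() signatures (size, modification time).
    return filecmp.cmp(path_a, path_b, shallow=False)


# Example call with the file names used in the experiment:
# files_identical("SecretMsg.txt", "RetrivedMsg.txt")
```

A result of `True` for both the message pair and the cover/rebuilt DNA pair corresponds to fc reporting no differences.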
The method of calculating the error rate is shown in Equation (7). The equation is given just to measure the error rate if an error arises. As the fc command of DOS encounters no differences between cover DNA and extracted DNA in our case, the error rate is set to zero, as shown in Table 12.

Payload Comparison
The payload is the number of bits that a scheme implants in a DNA or RNA sequence. In our experiment, the proposed scheme implants data in about $|N_r|$ nucleotides. Therefore, in each cycle, the scheme implants $|N_r| \times 2$ bits, and the payload tallies to $(|N_r| \times 2) \times c$ bits for $c$ embedding cycles, as shown in Equation (8). Likewise, as in Equation (9), capacity measures the number of bits that can be implanted per nucleotide.
where $|N_r|$ stands for the number of cover nucleotides and $c$ indicates the number of embedment cycles. Table 13 presents the payload results. Our method strongly dominates the competing schemes. The same result is visible in Figure 7. Even a single-cycle execution (c = 1) implants twice as many bits as [30] and three times as many as [31,32]. By Equation (8), every additional cycle adds a further $|N_r| \times 2$ bits to the payload.
Figure 7. Payloads of the three competing methods and of our method for three different cycles. Over the whole dataset, the proposed scheme gives higher payloads. The payload increases with the number of embedding cycles.
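
As a quick numerical check of the payload expression, the following sketch evaluates $(|N_r| \times 2) \times c$ for a given sequence length and cycle count. The function names are hypothetical, and the capacity helper assumes that capacity is the payload divided by the number of cover nucleotides, which is one reading of "bits per nucleotide" rather than a confirmed form of Equation (9).

```python
def payload_bits(num_cover_nucleotides, cycles):
    """Payload in bits: two bits per nucleotide, repeated for each cycle."""
    return num_cover_nucleotides * 2 * cycles


def capacity_bits_per_nucleotide(num_cover_nucleotides, cycles):
    """Assumed definition: payload divided by the number of cover nucleotides."""
    return payload_bits(num_cover_nucleotides, cycles) / num_cover_nucleotides


# A 10-nucleotide sequence embedded over 1, 2 and 3 cycles.
payloads = [payload_bits(10, c) for c in (1, 2, 3)]
```

For a 10-nucleotide sequence, three cycles give a payload of 60 bits, which matches the length of the shortest message in Table 11.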

Expansion
The state-of-the-art methods that we studied show that data embedment increases the size of the stego nucleotides. The expansion rate is computed as the ratio of the difference in length between the stego and cover nucleotides to the length of the cover nucleotides. For the proposed scheme, it is c × 100%, as shown in Equation (10).
Recall that the stego DNAs of all cycles except the last are held in $\tilde{N}_x$, while the final stego DNA is in $\tilde{N}_r$. If we keep both in the cloud, the large size of the stego DNA might tempt an intruder to search it for the secret message. Therefore, we propose to upload only $\tilde{N}_r$ to the cloud and to make $\tilde{N}_x$ a part of SK. Thus, we reduce the expansion rate of the stego DNA to 0%. Even then, the size of SK is L × n + 20 bits, which is smaller than that of the competing method [30], because the length of SK in [30] is 2 × |M| + 136. In fact, this 2 × |M| + 136 is about double of L × n + 136. Table 14 shows the results of the expansion rate. Our method presents dominating results. Although we executed our method for 16 cycles, for reasons of paper organization, only the results of the first three cycles are provided in Tables 13 and 14.
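
The expansion rate described above, i.e., the length difference between stego and cover relative to the cover length, can be written as a one-line helper. This is a sketch with a hypothetical function name; lengths are counted in nucleotides.

```python
def expansion_rate(cover_len, stego_len):
    """Expansion rate in percent: (stego - cover) / cover * 100."""
    return (stego_len - cover_len) / cover_len * 100.0


# A stego sequence of the same length as the cover does not expand.
same_size = expansion_rate(40, 40)
# A stego sequence twice as long as the cover expands by 100%.
doubled = expansion_rate(40, 80)
```

When only $\tilde{N}_r$ is uploaded, the stego and cover lengths are equal and the helper returns 0.0.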

Robustness
There are many steganalysis methods that try to detect stego contents in a cover medium [35,36]. Therefore, checking the robustness of a scheme is necessary.

Cracking Probability
To crack the data concealed by our method, an intruder first needs to guess S. To this end, the length of S must be guessed as well. Let the length of S be l, where $1 \le l \le |N_r| \times 2$. The total number of distinct candidates for S is given by Equation (11).
Note that, in practice, l is a very large number. Moreover, the number of possibilities grows with every subsequent cycle. It is hard to assume the probable number of executed cycles; for simplicity, we consider 16 execution cycles. Then, the cracking possibility (CP) increases by more than 16 times. Again, $2^n$ possible messages can be formed in a given length n, where n is the length of the message. Consequently, CP increases further, as defined by Equation (12).
We further strengthen the scheme by shuffling the 16 blocks of $\tilde{N}_x$. That shuffling again enlarges the number of possibilities by a factor of $2^{16}$. Consequently, the total CP is $2^{16}$ times that of Equation (12). After simplifying the exponential values, we can compute CP using Equation (13).
The cracking probability is the inverse of the number of possibilities that an intruder has to try, i.e., the cracking probability P is 1/CP. Hence, we can now deduce P from Equation (14).
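
The way the individual factors combine into P = 1/CP can be illustrated with exact rational arithmetic. The sketch below is an assumption-laden illustration, not a reconstruction of Equations (11) to (14): `base_count` is a placeholder for the value of Equation (11), and the "more than 16 times" growth over the cycles is treated as a factor of exactly 16, so the returned value is an upper bound on P.

```python
from fractions import Fraction


def cracking_probability(base_count, message_bits, cycles=16, shuffled_blocks=16):
    """Upper bound on P = 1/CP under the stated assumptions.

    base_count      -- number of candidates for S (placeholder for Equation (11))
    message_bits    -- message length n, contributing a factor of 2**n
    cycles          -- number of embedding cycles (16 in the text)
    shuffled_blocks -- number of shuffled blocks, contributing 2**16 in the text
    """
    cp = base_count * cycles * 2 ** message_bits * 2 ** shuffled_blocks
    return Fraction(1, cp)


# Toy numbers only: 2 candidates for S and a 3-bit message.
p = cracking_probability(base_count=2, message_bits=3)
```

With these toy numbers, CP is 2 × 16 × 2^3 × 2^16 = 2^24, so P is 1/16,777,216; realistic values of l and n make P far smaller.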

Machine Learning
Nowadays, intruders apply various machine learning algorithms to guess whether a medium contains stego contents. They even try to extract the secrets through their machine learning algorithms. To evaluate the robustness of our scheme, we implemented a one-hot encoder so that nucleotides could be fed to a convolutional neural network. The method first takes the cover nucleotides and the stego nucleotides of the proposed method, Rahman et al. [30], Hamed et al. [31] and Wang et al. [32] in five separate files. We first extracted the nucleotides from the cover file. As each nucleotide is one of 'A', 'C', 'G' and 'T', we generated four bits as the one-hot code of each nucleotide. 'A', 'C', 'G' and 'T' are used as label/class values. Thus, we had a long list of data in which each item consists of 4 bits. Let this list be X. Similarly, we have class labels, i.e., nucleotides, of the same length. Let these data be Y. We divided both X and Y in a ratio of 9:1. Let the first 90% of the data be trainX and trainY and, in the same way, the remaining 10% be testX and testY. We built a convolutional neural network (CNN) model and trained it on trainX and trainY. Afterwards, we applied testX to predict testY and measured the accuracy in decoding the one-hot code. We performed the same tasks for the other four files, i.e., for the stego nucleotides of the proposed method, Rahman et al. [30], Hamed et al. [31] and Wang et al. [32].
The results are presented in Table 15. The table shows that the accuracies for the stego nucleotides are very close to those for the cover. This means that detecting stego contents in a stego DNA sequence by deep learning is quite hard. Another noticeable point is that the accuracy is 0 for Rahman and Wang on the OurDNA.2 cover sequence. This is a result of the small size of the sequence, which contains only 10 nucleotides. For such a small dataset, a deep learning or machine learning algorithm cannot provide accurate results; rather, it may produce any result. Hence, the other three accuracy values reported for the OurDNA.2 cover sequence should not be trusted either.
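
The data preparation described above (one-hot coding of nucleotides and a 9:1 split) can be sketched as follows. This is a minimal illustration with hypothetical function names; the bit order A, C, G, T within the one-hot code is an assumption, and the CNN itself is deliberately left out.

```python
NUCLEOTIDES = "ACGT"  # assumed bit order of the one-hot code


def one_hot(sequence):
    """Map each nucleotide to a 4-bit one-hot vector."""
    return [[1 if n == ref else 0 for ref in NUCLEOTIDES] for n in sequence]


def split_9_to_1(X, Y):
    """Use the first 90% of the samples for training, the rest for testing."""
    cut = (len(X) * 9) // 10
    return X[:cut], Y[:cut], X[cut:], Y[cut:]


sequence = "TTTGTTGAGT"   # a 10-nucleotide block from the shuffling example
X = one_hot(sequence)     # features: 4-bit one-hot codes
Y = list(sequence)        # labels: the nucleotides themselves
trainX, trainY, testX, testY = split_9_to_1(X, Y)
```

For a 10-nucleotide sequence, this split leaves a single test sample, which illustrates why accuracy figures on such a short sequence are not meaningful.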

Summary
Table 16 summarizes the comparison results of the proposed scheme with two closely related works. Although the expansion rate of our scheme is 0%, it provides a notably higher payload than the others. The scheme creates a stego DNA of the same size as the cover DNA. As a result, the stego DNA does not demand extra space in the cloud. The cracking probability is also considerably lower than that of the others, which ensures higher security. The proposed scheme works with an RNA sequence while the others do not. The proposed scheme can work on only a few nucleotides, e.g., about 10 nucleotides, while the others cannot operate on such a small number of nucleotides. As the proposed scheme implants secrets in the working DNA multiple times, it can implant more bits in a small DNA set than the others can, even when the others implant in a large DNA set.
A small memory is therefore enough to make the scheme functional. This opens up a unique opportunity for data hiding-based healthcare applications as well as for the manufacturing industries. Moreover, the scheme could be used in message communication, in identifying a device and in recognizing the source of data. It will work both offline and in the cloud. It will be a very useful scheme for any organization that works with DNA. The medical and criminology departments of a country work on DNA and/or RNA, and they have to manage a huge volume of information on their patients and criminals. Hiding that large volume of secrets in their official DNA/RNA would be a very notable application. From a managerial point of view, this contribution will be a smart one in handling their data. In our future work, we hope to work more on the manufacturing architecture. We will work to improve the hardware and software architectures of such cyber devices to make them more user-friendly, efficient and functional. Additionally, in our next research, we intend to embed personal information and one's DNA strand in a smart wearable device. Then, criminals and unidentified dead bodies could be identified easily.