Unintended Creation or Insertion of Antisense Promoter Motifs During Codon Optimization: A Cyber-Biosecurity Risk
Abstract
1. Introduction
2. Materials and Methods
2.1. The Promoter “−10” Motif: An Example Used for Our Proof of Concept (POC)
- The −10 Region: A sequence positioned ~10 nucleotides upstream of the transcription start site, which is crucial for DNA unwinding and the formation of the open transcription complex.
- The −35 Region: A sequence located ~35 nucleotides upstream of the transcription start site, responsible for initial recognition by the RNA polymerase sigma factor.
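Because the concern here is a promoter motif on the antisense strand, a scan of the coding (sense) strand has to look for the reverse complement of the motif. The following is a minimal Python sketch, assuming the canonical E. coli −10 consensus hexamer TATAAT (a textbook value, not quoted from this section); the function names are hypothetical and are not taken from the authors' tool.

```python
# Assumption: canonical -10 consensus hexamer; all names here are illustrative.
MINUS_10 = "TATAAT"
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def antisense_motif_positions(sense, motif=MINUS_10):
    """0-based sense-strand start positions where the antisense strand reads `motif`."""
    target = reverse_complement(motif)  # what the motif looks like on the sense strand
    sense = sense.upper()
    hits = []
    start = sense.find(target)
    while start != -1:
        hits.append(start)
        start = sense.find(target, start + 1)  # step by 1 to allow overlapping hits
    return hits
```

With this convention, `reverse_complement("TATAAT")` is `"ATTATA"`, so an antisense −10 motif shows up as ATTATA when reading the sense strand.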
2.2. Dataset Choice and Preprocessing
2.3. Codon Optimization Framework: Analyzing Popular Tools’ Output Sequences
2.4. New Quality Assurance (QA) Open-Source SW Tool for Codon Optimization
- Backend: This part contains the logic for the various tests, including:
  - Manipulations on sequences.
  - SW interface with the optimization tools.
  - Searching for the promoter motif at different positions in sequences.
  - Attempting to insert the promoter motif at different positions in sequences.
- Frontend: This part presents the product to the client as a server that renders the results as a web page built with HyperText Markup Language (HTML) and Cascading Style Sheets (CSS).

- Users—Running the application via an “.exe” file.
- Developers—Running the application via Python.
- Modular structure and division between Backend and Frontend.
- For each option available on the web home page, a separate Python script was written to simplify development and debugging.
- For convenience, a developer only needs to run a single script, “app.py”, which integrates all the scripts written in the project.
2.4.1. Antisense Promoter Motif Detection Algorithm
2.4.2. Motif Insertion Algorithm: Silent Insertion Probability
2.4.3. Motif Defense Module: An SW Tool to Protect Your Lab
3. Results
3.1. Natural Occurrence of Antisense Promoter Motifs and Insertion Feasibility in Motif-Free Sequences
3.2. Comparison with Random k-mer Frequencies
- All amino acids appear with equal probability in the sequence.
- No specific codon usage bias.
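As a worked illustration of these two assumptions (a sketch, not a reproduction of the article's own calculation), the chance that two consecutive codons spell a given hexamer can be written out directly. The example pair ATT, ATA consists of two isoleucine codons; the figures of 20 amino acids and three synonymous Ile codons come from the standard genetic code, and stop codons are ignored for simplicity.

```latex
\begin{align}
P(\text{ATT}) &= P(\text{Ile}) \cdot P(\text{ATT} \mid \text{Ile})
              = \tfrac{1}{20} \cdot \tfrac{1}{3} = \tfrac{1}{60}, \\
P(\text{ATT--ATA}) &= \tfrac{1}{60} \cdot \tfrac{1}{60}
              = \tfrac{1}{3600} \approx 2.8 \times 10^{-4}, \\
P_{\text{uniform nt}}(\text{any 6-mer}) &= \left(\tfrac{1}{4}\right)^{6}
              = \tfrac{1}{4096} \approx 2.4 \times 10^{-4}.
\end{align}
```

The last line gives the reference value for a model in which every nucleotide is drawn independently and uniformly, so the two models can be compared on the same hexamer.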
- Case 1: Two Complete Codons: “ATT”–“ATA”
- Case 2: Across Three Codons: “_ _ A”–“TTA”–“TA _”
- Case 3: Across Three Codons: “_ AT”–“TAT”–“A _ _”
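The three cases differ only in where the hexamer starts relative to the reading frame: at the first codon position (Case 1), the third (Case 2), or the second (Case 3). A minimal Python sketch of that classification follows, assuming the sense-strand hexamer ATTATA that the article lists as the motif; the function and constant names are hypothetical.

```python
import re

SENSE_MOTIF = "ATTATA"  # sense-strand hexamer listed as the motif in the article

# Offset of the motif start inside its codon -> case number used in the text
CASE_BY_OFFSET = {0: 1, 2: 2, 1: 3}


def motif_cases(cds, motif=SENSE_MOTIF):
    """Return (codon_number, case) for every motif occurrence.

    Codon numbers are 1-based; `cds` is assumed to start in frame.
    """
    cds = cds.upper()
    # A lookahead pattern reports overlapping occurrences as well.
    pattern = f"(?={re.escape(motif)})"
    return [
        (m.start() // 3 + 1, CASE_BY_OFFSET[m.start() % 3])
        for m in re.finditer(pattern, cds)
    ]
```

For the sequence `AATTATATGATTATAGCATTATAT`, this returns `[(1, 3), (4, 1), (6, 2)]`: Case 3 starting in codon 1, Case 1 in codon 4, and Case 2 in codon 6.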
4. Discussion
5. Conclusions and Future Research Directions
- Gain a deeper understanding of the genetic code and of motifs whose probabilities differ markedly from random expectation (both much higher and much lower).
- Develop more QA tools to minimize the potential damage from synthetic DNA sequences that might contain surprises for the biologists using these tools.
- Identify other bioinformatics tools that should be examined within this framework to ensure their quality and minimize their potential for damage.
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest


| Sequence | Probability by Nucleotides | Probability by Sequences |
|---|---|---|
| ATTATA (motif) | | |
| GCCATC | | |
| CATTGC | | |
| TGGCAT | | |
| ATACCT | | |
| CGATCG | | |
“AAT”. The other two insertion cases also generate the −10 motif in the antisense strand. The second case requires two changes, and the third case only one synonymous change.

| Predicted DNA | Protein Sequence | Expected Case |
|---|---|---|
| AACTATATGATCATTGCTTTATAT | NYMIIALY | [(case 3: start in codon 1), (case 1: start in codon 4), (case 2: start in codon 6)] |

| Original DNA | Modified DNA | # Insertions | Changes | Existing Motif | Cases |
|---|---|---|---|---|---|
| AACTATATGATCATTGCTTTATAT | AATTATATGATTATAGCATTATAT | 3 | [('Insertion_1', 2, 'C', 'T'), ('Insertion_2', 11, 'C', 'T'), ('Insertion_2', 14, 'T', 'A'), ('Insertion_3', 17, 'T', 'A')] | FALSE | [3, 1, 2] |
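The insertion example above can be reproduced with a small Python sketch of silent motif insertion. It is an illustration under stated assumptions, not the authors' implementation: the standard genetic code, a coding sequence whose length is a multiple of three, and a rule that picks the synonymous codon closest to the original. The names `silent_insert` and `translate` are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
SENSE_MOTIF = "ATTATA"  # sense-strand form of the antisense -10 motif


def translate(dna):
    """Translate an in-frame DNA string into one-letter amino acids."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def silent_insert(dna, pos, motif=SENSE_MOTIF):
    """Write `motif` at 0-based `pos` using only synonymous codon changes.

    Returns the modified DNA, or None if some overlapped codon has no
    synonymous alternative compatible with the motif.
    """
    if pos < 0 or pos + len(motif) > len(dna):
        return None
    codons = [dna[i:i + 3] for i in range(0, len(dna), 3)]
    first, last = pos // 3, (pos + len(motif) - 1) // 3
    for k in range(first, last + 1):
        original = codons[k]
        # Base required at each codon position; None where the motif does not reach.
        required = [
            motif[3 * k + j - pos] if pos <= 3 * k + j < pos + len(motif) else None
            for j in range(3)
        ]
        candidates = [
            codon for codon, aa in CODON_TABLE.items()
            if aa == CODON_TABLE[original]
            and all(r is None or r == b for r, b in zip(required, codon))
        ]
        if not candidates:
            return None
        # Prefer the candidate with the fewest nucleotide changes.
        codons[k] = min(candidates, key=lambda c: sum(a != b for a, b in zip(c, original)))
    return "".join(codons)
```

Applying `silent_insert` at positions 1, 9 and 17 of `AACTATATGATCATTGCTTTATAT` yields `AATTATATGATTATAGCATTATAT`, which still translates to NYMIIALY.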
| Name of Amino Acid | Optional Codons | Codons That End in A | Probability |
|---|---|---|---|
| Gly (G) | 4 | 1 | 1/4 |
| Glu (E) | 2 | 1 | 1/2 |
| Ala (A) | 4 | 1 | 1/4 |
| Val (V) | 4 | 1 | 1/4 |
| Arg (R) | 6 | 2 | 1/3 |
| Lys (K) | 2 | 1 | 1/2 |
| Thr (T) | 4 | 1 | 1/4 |
| Ile (I) | 3 | 1 | 1/3 |
| Gln (Q) | 2 | 1 | 1/2 |
| Pro (P) | 4 | 1 | 1/4 |
| Leu (L) | 6 | 2 | 1/3 |
| Ser (S) | 6 | 1 | 1/6 |

| Name of Amino Acid | Optional Codons | Codons That End in AT | Probability |
|---|---|---|---|
| Asp (D) | 2 | 1 | 1/2 |
| Asn (N) | 2 | 1 | 1/2 |
| His (H) | 2 | 1 | 1/2 |
| Tyr (Y) | 2 | 1 | 1/2 |

| Name of Amino Acid | Optional Codons | Codons That End in A | Probability |
|---|---|---|---|
| Arg (R) | 2 | 1 | 1/2 |
| Ser (S) | 2 | 1 | 1/2 |
| Lys (K) | 2 | 1 | 1/2 |
| Asn (N) | 2 | 1 | 1/2 |
| Thr (T) | 4 | 1 | 1/4 |
| Ile (I) | 3 | 1 | 1/3 |
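Per-amino-acid probabilities of this kind can be recomputed from the standard genetic code. The sketch below is an illustration rather than the authors' code: it assumes that every synonymous codon is equally likely (no codon usage bias) and that stop codons are excluded, and the function name `suffix_probabilities` is hypothetical.

```python
from fractions import Fraction
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def suffix_probabilities(suffix):
    """Map each amino acid to P(codon ends with `suffix`) under uniform codon choice.

    Amino acids with no matching codon are omitted; stop codons are ignored.
    """
    totals, hits = {}, {}
    for codon, aa in CODON_TABLE.items():
        if aa == "*":
            continue
        totals[aa] = totals.get(aa, 0) + 1
        if codon.endswith(suffix):
            hits[aa] = hits.get(aa, 0) + 1
    return {aa: Fraction(n, totals[aa]) for aa, n in hits.items()}
```

With the suffix `"A"`, the function returns twelve amino acids, including Ser at 1/6 and Arg, Leu and Ile at 1/3. With the suffix `"AT"`, it returns exactly Asp, Asn, His and Tyr, each at 1/2.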
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Carmi, E.; Glikman, R.; Dorfan, Y. Unintended Creation or Insertion of Antisense Promoter Motifs During Codon Optimization: A Cyber-Biosecurity Risk. Microorganisms 2026, 14, 638. https://doi.org/10.3390/microorganisms14030638

