# A Comparative Analysis of Different Strains of Coronavirus Based on Genometric Mappings

## Abstract

**Methods**. To study and identify features in the genetic composition of the nucleotide sequences of various coronaviruses, we applied our own algorithms and visualization techniques, which allowed us to compare the biochemical parameters of diverse coronavirus RNAs in visual form.

**Results**. We give examples of coronavirus RNA structure visualized in one- and two-dimensional parametric spaces, using three types of visualization: structural, integral, and frequency. We discuss the visualization methods, present a visualization and comparative analysis of coronavirus serotypes, visualize SARS-CoV-2 datasets, and discuss the results. The presented techniques make it possible to display the structure of coronavirus RNA sequences in spaces of various dimensions.

**Conclusions**. Our findings indicate that the proposed method contributes to the visualization of the genetic coding of coronaviruses. We discuss how machine learning and neural network technology can be applied to the analysis of coronaviruses on the basis of the presented approach. This line of research is essential for the study and control of complex quantum mechanical systems, such as RNA or DNA.

## 1. Introduction

## 2. Methods: An Algorithm for Genometric Visualization of Genetic Sequences

- (1) A sequence of characters encoding the nitrogenous bases, drawn from the set A, G, C, T (or A, G, C, U), is divided into fragments of equal length N, where N is a parameter of the method. These equal-length fragments are called N-plets.
- (2) The pattern of nitrogenous bases in each N-plet is described by three binary sequences of 0s and 1s, one per sub-alphabet. The choice of which bases are encoded as 0 and which as 1 affects the orientation and other symmetry transformations of the resulting display.
- (3) Each of the three resulting binary records is converted to a decimal value, or passed through another uniquely identifying function.
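The three steps above can be sketched in Python as follows. This is an illustrative sketch, not the authors' implementation: the function names `n_plets` and `structural_mapping` are hypothetical, and the three binary traits in `SUB_ALPHABETS` are an assumption, since this section states only that three sub-alphabets are used and does not say which traits define them.

```python
from typing import Dict, List

# ASSUMPTION: three binary traits chosen for illustration only.
# Each table maps a base to 1 (trait present) or 0 (trait absent).
SUB_ALPHABETS: Dict[str, Dict[str, int]] = {
    "purine": {"A": 1, "G": 1, "C": 0, "T": 0, "U": 0},
    "amino": {"A": 1, "C": 1, "G": 0, "T": 0, "U": 0},
    "strong": {"G": 1, "C": 1, "A": 0, "T": 0, "U": 0},
}


def n_plets(seq: str, n: int, step: int = 1) -> List[str]:
    """Step (1): cut seq into fragments of length n, advancing by step."""
    return [seq[i:i + n] for i in range(0, len(seq) - n + 1, step)]


def structural_mapping(seq: str, n: int, step: int = 1) -> Dict[str, List[int]]:
    """Steps (2)-(3): binary-encode each N-plet per sub-alphabet,
    then read the n bits as one decimal integer."""
    result: Dict[str, List[int]] = {}
    for name, table in SUB_ALPHABETS.items():
        values = []
        for plet in n_plets(seq.upper(), n, step):
            bits = "".join(str(table[base]) for base in plet)
            values.append(int(bits, 2))  # value lies in [0, 2**n - 1]
        result[name] = values
    return result


demo = structural_mapping("GATTACA", n=3, step=1)
print(demo["purine"])  # one decimal value per N-plet
```

With a step of 1, a sequence of length L yields L − N + 1 N-plets. For the SARS-CoV-2 reference length of 29,903 and N = 16 this gives 29,888, which agrees with the x-range quoted in the caption of Figure 2.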

## 3. Results

#### 3.1. Visualization and Comparative Analysis of Coronavirus Serotypes

- (1) HCoV-229E, HCoV-NL63, HCoV-HKU1, HCoV-OC43
- (2) MERS-CoV, SARS-CoV, SARS-CoV-2.

#### 3.2. A Visualization of SARS-CoV-2 Coronavirus Datasets

## 4. Conclusions

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest


**Figure 1.** (**Left**): system of three sub-alphabets as rows with white (true) and black (false) cells in a matrix (a). (**Right**): two ways of applying parameter N with steps equal to 1 and to N.

**Figure 2.** One-dimensional (**left**) and two-dimensional (**right**) structural mappings at N = 16 of the nucleotide composition of SARS-CoV-2. 1D sets (left top and bottom): y from 0 to $2^{16}$ decimal values; x from 1 to 29,888 N-plets. The 2D set (**right**): x and y are from 0 to $2^{16}$ decimal values.

**Figure 3.** One- and two-dimensional integral imaging of the nucleotide composition of SARS-CoV-2 at N = 512. 1D sets: x from 1 to 29,888 N-plets; (**left top**) y from 207 to 300 values of the integral characteristic; (**left bottom**) y from 185 to 306 values of the integral characteristic. 2D set (**right**): x from 207 to 300 N-plets, y from 185 to 306 N-plets.

**Figure 4.** One- and two-dimensional frequency imaging at N = 10 of the nucleotide composition of SARS-CoV-2. 1D sets (**left top** and **bottom**): x from 0 to 29,894 N-plets; y from 12 to 62 times (**top**) and from 6 to 94 times (**bottom**). The 2D set (**right**): x is from 12 to 62 times, and y is from 6 to 94 times.

**Figure 5.** One-dimensional integral mappings of the nucleotide structure of coronavirus genomic data (N = 1024). For each graph, x is from 0 to 30,741 N-plets and y is from 0 to 1024 values.

**Figure 6.** Two-dimensional imaging of the nucleotide structure of coronavirus genomic data (N = 8). X and Y each span $2^{8}$ = 256 pixels; the header includes parameter values for each coordinate.

**Figure 7.** One-dimensional integral mappings of the nucleotide structure of coronavirus genomic data (N = 1024; 401,330 samples of SARS-CoV-2 from different people); y from 1 to 1024 values, x from 1 to 31,500 N-plets.

**Figure 8.** One-dimensional frequency mappings of the nucleotide structure of coronavirus genomic data (N = 9; 401,330 samples of SARS-CoV-2 from different people); y from 1 to 31,500 N-plets, x from 1 to 250 times.
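The captions above refer to integral and frequency mappings without defining them. The sketch below shows one reading that is consistent with the quoted axis ranges, and it rests on two assumptions: the integral characteristic of an N-plet is taken to be the number of 1s in its binary record (so it lies between 0 and N), and the frequency characteristic is taken to be the number of times that binary N-plet occurs in the whole sequence. The `PURINE` table and all function names are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple

# ASSUMPTION: a single illustrative binary trait (1 = purine, 0 = pyrimidine).
PURINE: Dict[str, int] = {"A": 1, "G": 1, "C": 0, "T": 0, "U": 0}


def binary_n_plets(seq: str, n: int) -> List[Tuple[int, ...]]:
    """Binary-encode seq, then take every window of length n (step 1)."""
    bits = [PURINE[base] for base in seq.upper()]
    return [tuple(bits[i:i + n]) for i in range(len(bits) - n + 1)]


def integral_mapping(seq: str, n: int) -> List[int]:
    """Assumed integral characteristic: count of 1s in each N-plet."""
    return [sum(plet) for plet in binary_n_plets(seq, n)]


def frequency_mapping(seq: str, n: int) -> List[int]:
    """Assumed frequency characteristic: occurrences of each N-plet
    within the full list of N-plets of the same sequence."""
    plets = binary_n_plets(seq, n)
    counts = Counter(plets)
    return [counts[plet] for plet in plets]


integral_demo = integral_mapping("GATTACA", n=3)
frequency_demo = frequency_mapping("ATATAT", n=2)
print(integral_demo)
print(frequency_demo)
```

Under these assumptions an integral value can never exceed N, which matches the 0 to 1024 range given for N = 1024 in Figure 5.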

| Index | Abbreviation | Coronavirus Genome | Length (nt) |
|---|---|---|---|
| NC_006577 | HCoV-HKU1 | Human (HKU1) | 29,926 |
| NC_005831 | HCoV-NL63 | Human (NL63) | 27,553 |
| NC_002645 | HCoV-229E | Human (229E) | 27,317 |
| NC_006213 | HCoV-OC43 | Human (OC43) | 30,741 |
| NC_004718 | SARS-CoV | Severe acute respiratory syndrome | 29,751 |
| NC_045512 | SARS-CoV-2 | Severe acute respiratory syndrome-2 | 29,903 |
| NC_019843 | MERS-CoV | Middle East respiratory syndrome | 30,119 |

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Stepanyan, I.V.; Lednev, M.Y.
A Comparative Analysis of Different Strains of Coronavirus Based on Genometric Mappings. *Symmetry* **2022**, *14*, 942.
https://doi.org/10.3390/sym14050942
