# The Matrix Method of Representation, Analysis and Classification of Long Genetic Sequences


## Abstract


## 1. Introduction

## 2. Matrix Representations of Whole Sets of N-Plets (or N-Mers)

- the set of $4^1 = 4$ monoplets (in DNA: A, C, G, T; in RNA, uracil U replaces thymine T);
- the set of $4^2 = 16$ duplets (AA, AC, AG, AT, …);
- the set of $4^3 = 64$ triplets (AAA, AAC, ACA, ACG, ACT, …);
- etc.

The whole set of $4^N$ N-plets coincides with the whole set of $4^N$ entries in a $(2^N \times 2^N)$-matrix, which belongs to the Kronecker family of genetic matrices $[A\ G;\ C\ T]^{(N)}$, where $(N)$ means Kronecker (or tensor) power. Figure 1 shows the first three members of this Kronecker family for N = 1, 2, 3. It also shows that, inside such a matrix $[A\ G;\ C\ T]^{(N)}$, each N-plet has its individual binary coordinates (or the corresponding coordinates in decimal notation), which are determined by biochemical attributes of N-plets. This is explained in detail below.
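The construction of this Kronecker family can be sketched in a few lines of Python. This is only a minimal illustration: it assumes that the Kronecker product of symbolic matrices concatenates the symbols of the two factors, so that the entries of the N-th power are N-plets. The helper names `kron_symbolic` and `kron_power` are hypothetical and do not come from the original work.

```python
def kron_symbolic(a, b):
    """Kronecker product of two matrices of strings.

    Block (i, j) of the result is the matrix b with the symbol a[i][j]
    prepended to every entry (symbol concatenation replaces multiplication).
    """
    return [
        [x + y for x in row_a for y in row_b]
        for row_a in a
        for row_b in b
    ]


def kron_power(base, n):
    """N-th Kronecker power of a symbolic matrix, for n >= 1."""
    result = base
    for _ in range(n - 1):
        result = kron_symbolic(result, base)
    return result


BASE = [["A", "G"], ["C", "T"]]

# Print the (4 x 4)-matrix of duplets, one row per line.
for row in kron_power(BASE, 2):
    print(" ".join(row))
```

Under this assumption, the third power is an (8 × 8)-matrix containing each of the 64 triplets exactly once.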

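The two classes of the first kind of binary attributes (purines versus pyrimidines) are denoted by the binary symbols $0_1$ and $1_1$, which one can name “the binary sub-alphabet of the first kind of binary attributes”.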

- 10110... (in accordance with the first sub-alphabet; its decimal equivalent can be located on the “X” axis of a Cartesian system of coordinates);
- 01110... (in accordance with the second sub-alphabet; its decimal equivalent can be located on the “Y” axis of a Cartesian system of coordinates);
- 11000... (in accordance with the third sub-alphabet; its decimal equivalent can be located on the “Z” axis of a Cartesian system of coordinates).

In the (8 × 8)-matrix $[A\ G;\ C\ T]^{(3)}$ in Figure 1, the second row has the binary numeration 110 because each of its triplets (AAC, AAT, AGC, AGT, GAC, GAT, GGC, GGT) is a “purine–purine–pyrimidine” sequence, which corresponds to the binary number 110 from the point of view of the first sub-alphabet in Figure 2. Analogously, in genetic matrices of the Kronecker family (see Figure 1), each column has its individual binary number, because all N-plets inside this column have an identical binary representation from the point of view of the second sub-alphabet in Figure 2. For example, in the (8 × 8)-matrix $[A\ G;\ C\ T]^{(3)}$ in Figure 1, the third column has the binary numeration 010 because each of its triplets (AGA, AGC, ATA, ATC, CGA, CGC, CTA, CTC) is an “amino–keto–amino” sequence, which corresponds to the binary number 010 from the point of view of the second sub-alphabet in Figure 2. Accordingly, each N-plet, which is located in the appropriate genetic matrix at a “column–row” crossing, obtains its individual 2-dimensional coordinates on the basis of the binary numeration of its column and row. For example, the triplet AGC, which is located at the crossing of the mentioned column and row (Figure 1), obtains its individual binary coordinates (010, 110), or (2, 6) in decimal notation.
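The row and column numeration can be computed directly from the two sub-alphabets. The sketch below assumes the digit assignments implied by the examples in this section (purine = 1 and pyrimidine = 0 for the first sub-alphabet; keto = 1 and amino = 0 for the second); the function names are illustrative only.

```python
# First sub-alphabet: purines (A, G) -> 1, pyrimidines (C, T) -> 0.
FIRST = {"A": "1", "G": "1", "C": "0", "T": "0"}
# Second sub-alphabet: keto bases (G, T) -> 1, amino bases (A, C) -> 0.
SECOND = {"G": "1", "T": "1", "A": "0", "C": "0"}


def binary_label(nplet, sub_alphabet):
    """Read an N-plet as an N-bit binary string in the given sub-alphabet."""
    return "".join(sub_alphabet[base] for base in nplet)


def matrix_coordinates(nplet):
    """Return the (column, row) pair in binary and in decimal notation."""
    row = binary_label(nplet, FIRST)      # rows use the first sub-alphabet
    column = binary_label(nplet, SECOND)  # columns use the second one
    return (column, row), (int(column, 2), int(row, 2))


print(matrix_coordinates("AGC"))  # (('010', '110'), (2, 6))
```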

For instance, a 5-plet with the binary coordinates 10110 and 01110 can be represented not only as an entry of the matrix $[A\ G;\ C\ T]^{(5)}$ but also as the point with decimal coordinates (22, 14) in the orthogonal Cartesian system of coordinates (x, y). Taking into account the chosen connection (Figure 2) between each sub-alphabet and one of the X, Y, Z axes of the Cartesian system of coordinates, the following correspondence exists between Kronecker families of genomatrices and the 2-dimensional planes (x, y), (x, z) and (y, z) of the Cartesian system:

- the plane (x, y) corresponds to matrices $[A\ G;\ C\ T]^{(N)}$, whose rows and columns are binary numerated from the point of view of the first sub-alphabet and the second sub-alphabet respectively;
- the plane (x, z) corresponds to matrices $[G\ A;\ C\ T]^{(N)}$, whose rows and columns are binary numerated from the point of view of the first sub-alphabet and the third sub-alphabet respectively;
- the plane (y, z) corresponds to matrices $[G\ T;\ C\ A]^{(N)}$, whose rows and columns are binary numerated from the point of view of the second sub-alphabet and the third sub-alphabet respectively.

The distance between two points $V(a_1, b_1)$ and $W(a_2, b_2)$ is given by the Euclidean formula:

$$\left[(a_2 - a_1)^2 + (b_2 - b_1)^2\right]^{0.5}$$
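As a small worked example, the distance formula can be evaluated for two points of the same plane. The point (2, 6) is taken from the triplet example above; the second point (5, 2) is a hypothetical value chosen only to give a round result.

```python
def distance(v, w):
    """Euclidean distance between points V(a1, b1) and W(a2, b2)."""
    (a1, b1), (a2, b2) = v, w
    return ((a2 - a1) ** 2 + (b2 - b1) ** 2) ** 0.5


print(distance((2, 6), (5, 2)))  # 5.0
```

In Python 3.8 and later, the standard-library function `math.dist` computes the same quantity.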

Each whole set of N-plets, which contains $4^N$ members, is located inside one of the matrices of a Kronecker family such as $[A\ G;\ C\ T]^{(N)}$. Correspondingly, this method is closely connected with Kronecker multiplication of matrices, which is widely used in mathematics, informatics, physics, etc., and which is one of the main mathematical operations in the field of matrix genetics [32,33,34,35,36,37]. Kronecker multiplication of matrices is used when one needs to go from spaces of smaller dimension into associated spaces of higher dimension. If one uses the mathematical language of vector spaces for modeling the ontogenetic complication of a living organism, it is natural to apply the idea of a gradual transition from spaces of low dimension into spaces of higher dimension. Such a gradual transition is described by a series of Kronecker multiplications of matrices.

## 3. The Description of the Matrix Method for Long Nucleotide Sequences

- Any long nucleotide sequence, which contains K nucleotides, is divided into equal fragments of length N (N-plets or N-mers), where N takes different values: N = 1, 2, 3, …, K; as a result, a set of different symbolic representations of this sequence as a chain of N-plets appears;
- Each N-plet in each of these representations of the sequence is transformed into three kinds of N-bit binary numbers by reading it from the point of view of the three sub-alphabets (Figure 2). Each of these binary numbers is transformed into its decimal equivalent. As a result, a set of different decimal representations of the initial symbolic sequence appears in the form of three kinds of sequences of decimal numbers, which serve respectively as non-negative integer coordinates on the Cartesian axes X, Y, Z (or as the numeration of rows and columns of the appropriate genetic matrices);
- Any two of the obtained numeric sequences define a sequence of pairs of non-negative integer coordinates of points on the 2-dimensional Cartesian plane (or coordinates of cells inside the appropriate genetic matrix of a Kronecker family). On the basis of these pairs of coordinates, a set of corresponding points is built on the 2-dimensional Cartesian plane (or a set of corresponding cells is shown in black inside the respective genetic matrix of a Kronecker family, in contrast to the other cells, which remain white).
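The three steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' program: the digit assignments for the first two sub-alphabets follow the examples in Section 2, the assignment for the third (A = T = 1, C = G = 0) is an assumption that reproduces the bit string 11000 quoted there, an incomplete trailing fragment is simply dropped, and all function names are hypothetical. The test 5-plet ATGGC is chosen because it yields the bit strings 10110, 01110 and 11000 listed in Section 2.

```python
SUB_ALPHABETS = {
    # X axis: purines (A, G) -> 1, pyrimidines (C, T) -> 0.
    "x": {"A": "1", "G": "1", "C": "0", "T": "0"},
    # Y axis: keto (G, T) -> 1, amino (A, C) -> 0.
    "y": {"G": "1", "T": "1", "A": "0", "C": "0"},
    # Z axis (assumed): two hydrogen bonds (A, T) -> 1, three (C, G) -> 0.
    "z": {"A": "1", "T": "1", "C": "0", "G": "0"},
}


def split_into_nplets(sequence, n):
    """Step 1: cut the sequence into consecutive N-plets (tail is dropped)."""
    usable = len(sequence) - len(sequence) % n
    return [sequence[i:i + n] for i in range(0, usable, n)]


def to_decimal(nplet, axis):
    """Step 2: read an N-plet as a binary number and return its decimal value."""
    table = SUB_ALPHABETS[axis]
    return int("".join(table[base] for base in nplet), 2)


def coordinates(sequence, n):
    """Decimal (x, y, z) triple for every N-plet of the sequence."""
    return [
        tuple(to_decimal(nplet, axis) for axis in "xyz")
        for nplet in split_into_nplets(sequence, n)
    ]


def occupied_cells(sequence, n, axes=("x", "y")):
    """Step 3: the set of points (or black matrix cells) on the chosen plane."""
    first, second = axes
    return {
        (to_decimal(nplet, first), to_decimal(nplet, second))
        for nplet in split_into_nplets(sequence, n)
    }


print(coordinates("ATGGC", 5))  # [(22, 14, 24)]
```

Plotting the set returned by `occupied_cells` as black pixels on a $2^N \times 2^N$ grid would give the kind of binary mosaic described in the last step; the plotting itself is left out to keep the sketch dependency-free.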

- Homo sapiens contactin associated protein-like 2 (CNTNAP2), RefSeqGene on chromosome 7 (N = 63).
- Homo sapiens contactin associated protein-like 2 (CNTNAP2), RefSeqGene on chromosome 7 (N = 63).
- Sorangium cellulosum So0157-2, complete genome (N = 63).
- Burkholderia multivorans ATCC 17616 genomic DNA, complete genome, chromosome 2 (N = 63).
- Thermofilum sp. 1910b, complete genome (N = 63).
- Thermofilum sp. 1910b, complete genome (N = 63).
- Dinoroseobacter shibae DFL 12, complete genome (N = 8).
- Escherichia coli LY180, complete genome (N = 24).
- Francisella tularensis subsp. tularensis SCHU S4 complete genome (N = 24).
- Halomonas elongata DSM 2581, complete genome (N = 24).
- Helicobacter mustelae 12198 complete genome (N = 24).
- Helicobacter mustelae 12198 complete genome (N = 12).
- Invertebrate iridovirus 22 complete genome (N = 8).
- Methanosalsum zhilinae DSM 4017, complete genome (N = 12).
- Methanosalsum zhilinae DSM 4017, complete genome (N = 12).
- Mycobacterium abscessus subsp. bolletii INCQS 00594 INCQS00594_scaffold1, whole genome shotgun sequence (N = 12).
- Penicillium chrysogenum Wisconsin 54-1255 complete genome, contig Pc00c12 (N = 32).
- Riemerella anatipestifer DSM 15868, complete genome (N = 12).
- Riemerella anatipestifer DSM 15868, complete genome (N = 12).
- Burkholderia multivorans ATCC 17616 genomic DNA, complete genome, chromosome 2 (N = 8).

In matrix form, these patterns correspond to the matrices $[A\ G;\ C\ T]^{(16)}$ and $[G\ T;\ C\ A]^{(16)}$ respectively, where cells with existing 16-plets of the sequence are shown in black and cells with missing 16-plets are shown in white.

The set of n-bit binary numbers contains $2^n$ members. For example, the set of 3-bit binary numbers contains $2^3 = 8$ members: 000, 001, 010, 011, 100, 101, 110, 111 (their equivalents in decimal notation are 0, 1, 2, 3, 4, 5, 6, 7). The decimal equivalent of the biggest member of a set of n-bit binary numbers is equal to $2^n - 1$. Such sets of n-bit binary numbers are named “dyadic groups” (see details in [6]).

- a long nucleotide sequence that is divided into relatively short N-mers (N = 1, 2, 3, 4) usually contains all possible kinds of such short N-mers; correspondingly, its visual pattern is trivial because it contains all possible points with non-negative integer coordinates (x, y) inside the appropriate numeric range;
- a long nucleotide sequence that is divided into relatively long N-mers (N = 8, 9, 10, …) usually generates a regular non-trivial mosaic of a fractal-like or other character. This was detected using a special computer program in the course of initial investigations of different long nucleotide sequences by means of the described method.
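The difference between the two cases can be made concrete by counting which fraction of the $4^N$ possible N-mers actually occurs. The sketch below is an illustration only: the helper names are hypothetical, an incomplete trailing fragment is dropped, and the test sequence is pseudo-random rather than a real genome.

```python
import random


def distinct_nplets(sequence, n):
    """Set of different N-mers found when the sequence is cut into N-mers."""
    usable = len(sequence) - len(sequence) % n
    return {sequence[i:i + n] for i in range(0, usable, n)}


def coverage(sequence, n):
    """Fraction of the 4**n possible N-mers that occur in the sequence."""
    return len(distinct_nplets(sequence, n)) / 4 ** n


def max_coverage(length, n):
    """Upper bound on coverage for a sequence with `length` nucleotides."""
    return min(1.0, (length // n) / 4 ** n)


rng = random.Random(0)  # fixed seed so that the run is repeatable
seq = "".join(rng.choice("ACGT") for _ in range(100_000))

for n in (2, 4, 8, 16):
    print(n, coverage(seq, n), max_coverage(len(seq), n))
```

The bound in `max_coverage` reflects a simple counting argument: a sequence of K nucleotides yields at most K // N fragments, so for large N most cells must stay white regardless of the sequence content.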

## 4. Long Random Sequences

## 5. Kronecker Multiplication, Fractal Lattices and the Problem of Coding an Organism on Different Stages of Its Ontogenesis

Raising a $(k \times k)$-matrix $M$ with entries 0 and 1 to the Kronecker power $N$ gives a $(k^N \times k^N)$-matrix $M^{(N)}$ with a fractal arrangement of entries 0 and 1 inside it (Figure 8 and Figure 9). These fractal mosaics inside matrices of Kronecker families are called “fractal lattices.” The theme of “Kronecker multiplication and fractal lattices” is described in detail in the book [39]. Such fractal lattices (Figure 8) arise directly from the general definition of Kronecker multiplication of matrices as a special mathematical operation.
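The emergence of a fractal lattice from Kronecker exponentiation can be reproduced with a short sketch. The generator matrix below is a hypothetical (2 × 2) example, not the (4 × 4)-matrix M of Figure 8, whose entries are not reproduced in the text; the function names are illustrative as well.

```python
def kron(a, b):
    """Kronecker product of two numeric matrices given as lists of rows."""
    return [
        [x * y for x in row_a for y in row_b]
        for row_a in a
        for row_b in b
    ]


def kron_power(m, n):
    """N-th Kronecker power of the matrix m, for n >= 1."""
    result = m
    for _ in range(n - 1):
        result = kron(result, m)
    return result


def render(matrix):
    """Text picture of a 0/1 matrix: '#' for entries 1, '.' for entries 0."""
    return "\n".join(
        "".join("#" if value else "." for value in row) for row in matrix
    )


M = [[1, 1],
     [1, 0]]  # hypothetical generator with a single zero entry

print(render(kron_power(M, 3)))
```

Each zero of the generator is copied into every block at every scale, so the number of entries equal to 1 in the N-th power of this generator is $3^N$ out of $4^N$ cells, which is what produces the self-similar pattern.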

Consider a variant of the symbolic matrix $[A\ G;\ C\ T]^{(3)}$ of 64 triplets in which 8 triplets are missing; these triplets occupy the same places in the matrix and are marked in red in Figure 9 (upper level, right side). Let us replace these 8 missing triplets by the number 0, and all other 56 triplets by the number 1. This transforms this variant of the symbolic matrix $[A\ G;\ C\ T]^{(3)}$ into a numeric matrix S (Figure 9, bottom level, left side).

Kronecker exponentiation of the matrix S gives the matrices $S^{(2)}$, $S^{(3)}$, …, whose visual patterns illustrate the corresponding fractal lattices; the lattice for the matrix $S^{(2)}$ is shown in Figure 9 (bottom level, right side). The numeric matrix $S^{(16)}$ contains the whole set of 16-plets with an appropriate fractal lattice, which resembles the visual pattern of the real nucleotide sequence Homo sapiens chromosome 22 genomic scaffold in Figure 4. One should note that the visual pattern of this real sequence contains more white places than the matrix $S^{(16)}$ because many additional 16-plets are absent, since the sequence has a finite length of 648,059 nucleotides.

## 6. Patterns of Human Chromosomes

## 7. Patterns of Penicillin

## 8. About 3D-Representations

## 9. Conclusions

## Acknowledgments

## Author Contributions

## Conflicts of Interest

## References

1. Bell, S.J.; Forsdyke, D.R. Deviations from Chargaff’s second parity rule correlate with direction of transcription. J. Theor. Biol. **1999**, 197, 63–76.
2. Chen, L.; Zhao, H. Negative correlation between compositional symmetries and local recombination rates. Bioinformatics **2005**, 21, 3951–3958.
3. Dong, Q.; Cuticchia, A.J. Compositional symmetries in complete genomes. Bioinformatics **2001**, 17, 557–559.
4. Forsdyke, D.R. A stem-loop “kissing” model for the initiation of recombination and the origin of introns. Mol. Biol. Evol. **1995**, 12, 949–958.
5. Forsdyke, D.R. Symmetry observations in long nucleotide sequences: A commentary on the discovery of Qi and Cuticchia. Bioinform. Lett. **2002**, 18, 215–217.
6. Forsdyke, D.R.; Bell, S.J. A discussion of the application of elementary principles to early chemical observations. Appl. Bioinform. **2004**, 3, 3–8.
7. Mitchell, D.; Bride, R. A test of Chargaff’s second rule. BBRC **2006**, 340, 90–94.
8. Perez, J.-C. Codon populations in single-stranded whole human genome DNA are fractal and fine-tuned by the golden ratio 1.618. Interdiscip. Sci. Comput. Life Sci. **2010**, 2, 1–13.
9. Prabhu, V.V. Symmetry observation in long nucleotide sequences. Nucleic Acids Res. **1993**, 21, 2797–2800.
10. Grebnev, Y.V.; Sadovsky, M.G. Second Chargaff's rules and symmetry genomes. Fundam. Res. **2014**, 12, 965–968. (In Russian)
11. Yamagishi, M.; Herai, R. Chargaff’s “Grammar of Biology”: New Fractal-like Rules. Available online: https://arxiv.org/pdf/1112.1528.pdf (accessed on 7 December 2011).
12. Jeffrey, H.J. Chaos game representation of gene structure. Nucleic Acids Res. **1990**, 18, 2163–2170.
13. Goldman, N. Nucleotide, dinucleotide and trinucleotide frequencies explain patterns observed in chaos game representations of DNA sequences. Nucleic Acids Res. **1993**, 21, 2487–2491.
14. Gutierrez, J.M.; Rodriguez, M.A.; Abramson, G. Multifractal analysis of DNA sequences using novel chaos-game representation. Physica A **2001**, 300, 271–284.
15. Joseph, J.; Sasikumar, R. Chaos game representation for comparison of whole genomes. BMC Bioinform. **2006**, 7, 243–246.
16. Oliver, J.L.; Bernaola-Galvan, P.; Guerrero-Garcia, J.; Roman-Roldan, R. Entropic profiles of DNA sequences through chaos-game-derived images. J. Theor. Biol. **1993**, 160, 457–470.
17. Tavassoly, I.; Tavassoly, O.; Rad, M.; Dastjerdi, N. Multifractal analysis of Chaos Game Representation images of mitochondrial DNA. In Proceedings of the IEEE Conference: Frontiers in the Convergence of Bioscience and Information Technologies, Jeju City, Korea, 11–13 October 2007; Howard, D., Ed.; IEEE Press: Jeju City, Korea, 2007; pp. 224–229.
18. Tavassoly, I.; Tavassoly, O.; Rad, M.; Dastjerdi, N. Three dimensional Chaos Game Representation of genomic sequences. In Proceedings of the IEEE Conference: Frontiers in the Convergence of Bioscience and Information Technologies, Jeju City, Korea, 11–13 October 2007; Howard, D., Ed.; IEEE Press: Jeju City, Korea, 2007; pp. 219–223.
19. Wang, Y.; Hill, K.; Singh, S.; Kari, L. The spectrum of genomic signatures: From dinucleotides to chaos game representation. Gene **2005**, 346, 173–185.
20. Petoukhov, S.V. The genetic code, 8-dimensional hypercomplex numbers and dyadic shifts. Available online: https://arxiv.org/pdf/1102.3596v11.pdf (accessed on 15 July 2016).
21. Petoukhov, S.V. Symmetries of the genetic code, Walsh functions and the theory of genetic logical holography. Symmetry Cult. Sci. **2016**, 27, 95–98.
22. Petoukhov, S.V.; Petukhova, E.S. Symmetries in genetics, Walsh functions and the geno-logical code. In Periodic Collection of Articles: “Symmetry: Theoretical and Methodological Aspects”, Issue 21; Ammosova, N.V., Ed.; Publishing House LLC “Triad”: Astrakhan, Russia, 2016; pp. 79–87. (In Russian)
23. Petoukhov, S.V.; Petukhova, E.S. Resonances, Walsh functions and logical holography in genetics and musicology. Symmetry Cult. Sci. **2017**, 28, 21–40.
24. Horimoto, K.; Nakatsui, M.; Popov, N. (Eds.) Algebraic and Numeric Biology, 2012 ed.; In Proceedings of the 4th International Conference, ANB 2010, Hagenberg, Austria, 31 July–2 August 2010; Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 2012; Volume 6479.
25. Hornos, J.E.M.; Hornos, Y.M.M. Algebraic model for the evolution of the genetic code. Phys. Rev. Lett. **1993**, 71, 4401–4404.
26. Gonzalez, D.L. The mathematical structure of the genetic code. In The Codes of Life: The Rules of Macroevolution, Biosemiotics; Barbieri, M., Hoffmeyer, J., Eds.; Springer: Dordrecht, The Netherlands, 2008; Volume 1, Chapter 8; pp. 111–152.
27. Gonzalez, D.L.; Giannerini, S.; Rosa, R. On the origin of the mitochondrial genetic code: Towards a unified mathematical framework for the management of genetic information. Nat. Proc. **2012**.
28. Dragovich, B. p-Adic structure of the genetic code. NeuroQuantology **2011**, 9, 716–727.
29. Fimmel, E.; Giannerini, S.; Gonzalez, D.; Strüngmann, L. Dinucleotide circular codes and bijective transformations. J. Theor. Biol. **2015**, 386, 159–165.
30. Fimmel, E.; Giannerini, S.; Gonzalez, D.; Strüngmann, L. Circular codes, symmetries and transformations. J. Math. Biol. **2014**, 70, 1623–1644.
31. Petoukhov, S.V. Biperiodic Table of the Genetic Code and Number of Protons; MKC: Moscow, Russia, 2001; p. 258. (In Russian)
32. Petoukhov, S.V. Matrix Genetics, Algebras of the Genetic Code, Noise-Immunity; Regular and Chaotic Dynamics: Moscow, Russia, 2008; p. 316. (In Russian)
33. Petoukhov, S.V. Matrix genetics and algebraic properties of the multi-level system of genetic alphabets. Neuroquantology **2011**, 9, 60–81.
34. Petoukhov, S.V. Symmetries of the genetic code, hypercomplex numbers and genetic matrices with internal complementarities. Symmetry Cult. Sci. **2012**, 23, 275–301.
35. Petoukhov, S.V. Dyadic Groups, Dyadic Trees and Symmetries in Long Nucleotide Sequences. Available online: http://arxiv.org/abs/1204.6247v2 (accessed on 17 January 2013).
36. Petoukhov, S.V. The Genetic Code, Algebra of Projection Operators and Problems of Inherited Biological Ensembles. Available online: http://arxiv.org/abs/1307.7882 (accessed on 31 December 2014).
37. Petoukhov, S.V.; He, M. Symmetrical Analysis Techniques for Genetic Systems and Bioinformatics: Advanced Patterns and Applications; IGI Global: Hershey, PA, USA, 2010; p. 271.
38. Karlin, S.; Ost, F.; Blaisdell, B.E. Patterns in DNA and Amino Acid Sequences and Their Statistical Significance; Waterman, M.S., Ed.; Mathematical Methods for DNA Sequences; CRC Press: Boca Raton, FL, USA, 1989.
39. Gazalé, M.J. Gnomon: From Pharaohs to Fractals; Princeton University Press: Princeton, NJ, USA, 1999; p. 280.
40. Homo Sapiens Chromosome 22 Genomic Scaffold, Alternate Assembly CHM1_1.0, Whole Genome Shotgun Sequence. NCBI Reference Sequence: NW_004078110.1. Available online: http://www.ncbi.nlm.nih.gov/nuccore/NW_004078110.1?report=genbank (accessed on 31 October 2013).
41. Stepanyan, I.V.; Petoukhov, S.V. The Matrix Method of Representation, Analysis and Classification of Long Genetic Sequences. Available online: https://arxiv.org/abs/1310.8469v1 (accessed on 31 October 2013).
42. Human Chromosomes. Available online: ftp://ftp.ncbi.nih.gov//genomes/H_sapiens/April_14_2003/ (accessed on 14 April 2003).
43. Kappraff, J.; Petoukhov, S.V. Symmetries, generalized numbers and harmonic laws in matrix genetics. Symmetry Cult. Sci. **2009**, 20, 23–50.
44. Petoukhov, S.V.; Svirin, V.I. Fractal genetic nets and symmetry principles in long nucleotide sequences. Symmetry Cult. Sci. **2012**, 23, 303–322.

**Figure 1.** The first members of the Kronecker family of symbolic genomatrices $[A\ G;\ C\ T]^{(N)}$ for N = 1, 2, 3. Inside each genomatrix $[A\ G;\ C\ T]^{(N)}$, each row and each column has its individual binary numeration due to genetic sub-alphabets (see explanation in the text). Correspondingly, each N-plet, which is located at a row–column crossing, has two binary coordinates in such a matrix. The decimal equivalents of these binary numbers are shown in red.

**Figure 2.** Three binary sub-alphabets according to three kinds of binary-opposite attributes in the set of nitrogenous bases C, A, G, T/U. The symbols X, Y, Z in the left column are the names of the axes of the Cartesian system of coordinates. The schemes in the right column graphically symbolize each sub-alphabet, which is characterized by a set of numbers 0 and 1.

**Figure 3.** Examples of visual patterns obtained on the basis of the described method for different nucleotide sequences (see explanation in the text). Two symbols are shown at the right side of each pattern to indicate which of the sub-alphabets from Figure 2 were used to construct the pattern.

**Figure 4.** Two examples of patterns constructed on the basis of the described method. **Left** side: the visual pattern of the nucleotide sequence Homo sapiens chromosome 22 genomic scaffold, which has 648,059 nucleotides and which is divided into a sequence of 16-mers; these 16-mers are transformed into 16-bit binary numbers on the basis of the first sub-alphabet and of the second sub-alphabet (Figure 2); then their decimal equivalents are plotted on the x and y axes respectively. **Right** side: the visual pattern of the nucleotide sequence Arabidopsis thaliana mitochondrion, which has 366,924 nucleotides and which is divided into a sequence of 16-mers; these 16-mers are transformed into 16-bit binary numbers on the basis of the second sub-alphabet and of the third sub-alphabet (Figure 2); then their decimal equivalents are plotted on the y and z axes respectively.

**Figure 5.** The pattern of the sequence “Burkholderia multivorans ATCC 17616 genomic DNA, complete genome, chromosome 2” with 2,473,162 nucleotides in the case of its division into 63-plets.

**Figure 6.** Examples of patterns for the sequence Fistulifera sp. JPCC DA0580 chloroplast, complete genome in cases of its division into N-plets with N = 4, 5, 6, 7, 8, 9.

**Figure 7.** Examples of visual patterns of a random nucleotide sequence with 100,000 nucleotides (pentagramon.com) in cases of its division with N = 8, 16, 28.

**Figure 8.** An example of generating fractal lattices by means of Kronecker exponentiation of matrices. Left side: the (4 × 4)-matrix M with entries 0 and 1. Right side: visual patterns of the matrix M and its Kronecker powers $M^{(2)}$ and $M^{(3)}$, which are a (16 × 16)-matrix and a (64 × 64)-matrix respectively. Here, black corresponds to matrix cells with entries of 1 and white corresponds to cells with entries of 0.

**Figure 9.** Illustration of relations among Kronecker multiplication, fractal lattices and fractal-like patterns of long nucleotide sequences (explanations in the text).

**Figure 10.** The binary mosaics of the first 15,000,000 nucleotides of the sequences of Homo sapiens chromosomes X and Y (two upper levels) and Homo sapiens chromosome 1 in the case of their division into 63-plets. Two symbols are shown at the right side of each pattern to indicate which of the sub-alphabets from Figure 2 were used to construct the pattern. Initial data were taken from [41,42].

**Figure 11.** Examples of binary mosaics for two parts of Homo sapiens chromosome 1 [41,42], which has 245,203,898 nucleotides: one part corresponds to the interval of this sequence from 45,000,000 to 60,000,000 nucleotides, and the second part corresponds to the interval from 135,000,000 to 150,000,000 nucleotides.

**Figure 12.** Examples of binary mosaics for long nucleotide sequences of different contigs of Penicillium chrysogenum Wisconsin 54-1255 complete genome. Upper level: the mosaic for the contig 22, which contains 6,387,817 nucleotides, for the case of 63-plets. The second level: the mosaic for the contig 15, which contains 535,560 nucleotides, for the case of 63-plets. The third level: the mosaic for the contig 18, which contains 1,591,038 nucleotides, for the case of 63-plets. Lower level: the mosaic for the contig 12, which contains 3,988,431 nucleotides, for the case of 12-plets. The symbols at the right side of each mosaic indicate the pair of sub-alphabets that were used to transform these N-plets into binary numbers.

**Figure 13.** **Upper** level: two 2-dimensional images of an “ideal” 3D configuration viewed from two oblique angles; **Lower** level: a real 3D configuration for a real nucleotide sequence (see explanation in the text).

© 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license ( http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Stepanyan, I.V.; Petoukhov, S.V. The Matrix Method of Representation, Analysis and Classification of Long Genetic Sequences. *Information* **2017**, *8*, 12.
https://doi.org/10.3390/info8010012
