The Inconsistency of the Algorithms of Jaro–Winkler and Needleman–Wunsch Applied to DNA Chain Similarity Results

Boris Melnikov

doi:10.3390/math13020263

Abstract

There are many different algorithms for calculating the distances between DNA chains, and different algorithms give different results. This paper does not consider which of the classical algorithms is better; instead, it shows the inconsistency of two classical algorithms, namely those of Jaro–Winkler and Needleman–Wunsch. To do this, we consider distance matrices based on both of these algorithms. We explain that, ideally, the triangle formed by each triple of distances in a distance matrix should be acute-angled isosceles. In reality, this property is violated, and we can determine the badness of each such triangle. Two algorithms for determining distances are considered consistent when their sequences of badness are ordered in the same way, and the greater the deviation from this order, the less consistent they are. In this paper, we consider the distance matrices for the two mentioned algorithms, calculated for the mitochondrial DNA of 32 species of monkeys belonging to different genera. For them, 4960 triangles are formed in each matrix, and we calculate the rank correlation between the two sequences of badness. The values obtained are very small (with different methods of calculating the rank correlation, they do not exceed 0.14), which indicates the inconsistency of the two algorithms under consideration.

Keywords:

heuristic algorithms; DNA chains; distance matrix; Jaro–Winkler algorithm; Needleman–Wunsch algorithm; pair correlation

MSC:

62P10; 92B15

1. Introduction and Motivation

There are many different algorithms for calculating the distances between sequences of symbols of different natures, and, in particular, between DNA chains, which are the main subject of research in this article. At the same time, both biologists and specialists in applied mathematics consider some of the provisions used in these algorithms to be unshakable. (However, “Nothing in Biology Makes Sense Except in the Light of Evolution” (1973, Theodosius Dobzhansky). We believe that both the title of this essay and its content are directly related to the study of DNA sequences in general and to algorithms for calculating distances between them in particular.) Certainly, this applies to the once-calculated distance between the genomes of different species.

However, there are some important points to make about this case:

Firstly, if we talk specifically about mammals, whose genomes are the object of the research of this paper, then one of three options is usually used as the object for analysis:
– Mitochondrial DNA (mt DNA, which will be the main object of this research);
– The “tails” of Y chromosomes;
– The major histocompatibility complex.
Secondly, different algorithms are used to determine the distances between genomes, and, in the author’s opinion, they are all modifications of the Levenshtein metric (sometimes significant modifications, as a result of which their relation to the Levenshtein metric is not always obvious). At the same time, in this paper, we do not address which of these algorithms is better.
Thirdly, the main difficulty encountered in calculating the distance between such sequences is their great length. For example, the length of the human mt DNA sequence, i.e., a very short DNA, exceeds 16,000 characters, while the total length of human DNA exceeds 3,000,000,000 characters. Therefore, it is impossible to solve real problems with the exact calculation of the Levenshtein distance, and all the algorithms used in them can be called heuristic.
Fourthly, it is possible to conduct distance studies either before “combining triples of letters into one” or after such combining. However, for the task described in this paper, this is not fundamental.
Fifthly, for 4 variants of nucleotides in the genome, natural selection results not in 64 variants of the triples but in 21 variants only. Each of these options can be considered an encoded letter. Moreover, as is written in the popular scientific literature (we shall not give specific references), at least four artificial amino acids have been designed, which can “on full grounds” enter artificial DNA chains, and the triples of amino acids in such artificial DNA chains can be replaced by fours or even fives.

Thus, different algorithms for determining the distances between DNA chains give different results. At the same time, the general trend is correct: for any adequate algorithm, the distance between the genomes of humans and chimpanzees is less than that between the genomes of humans and, e.g., an elephant. However, we would like to obtain a more detailed answer to the question of quantifying this distance. The paper is devoted to one of the issues of this big topic.

When considering such different algorithms for determining the distances between genomes, the author has long assumed that the Jaro–Winkler and Needleman–Wunsch algorithms are highly inconsistent with each other. The main subject of the paper is the quantitative verification of this hypothesis. We shall show in the paper that it holds, i.e., these algorithms are not consistent.

We calculate such a quantitative characteristic as follows: First, for the algorithm in question, we calculate the distance matrix between pairs of genomes. Note that, for example, for a matrix of dimension 32, we have

\frac{32 \cdot 31}{2} = 496

cells to fill in the matrix, and the algorithms mentioned above take quite a long time to run. Using the Needleman–Wunsch algorithm for mt DNA, filling the 496 cells of the 32-dimensional matrix takes about a day on an average modern computer.

Next, in this matrix, we consider all the triangles. It is important to note that there are quite a lot of them. For example, for a matrix of dimension 32, there are

\frac{32 \cdot 31 \cdot 30}{2 \cdot 3} = 4960 .
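As a quick check of the two counts above, the following minimal Python sketch enumerates the index pairs (matrix cells above the main diagonal) and the index triples (triangles) for a matrix of dimension 32. The variable names are illustrative only and are not taken from the author's programs.

```python
from itertools import combinations
from math import comb

n_species = 32  # dimension of the distance matrix

# Cells above the main diagonal: unordered pairs of species indices.
pairs = list(combinations(range(n_species), 2))

# Triangles: unordered triples of species indices.
triples = list(combinations(range(n_species), 3))

print(len(pairs), comb(n_species, 2))    # 496 496
print(len(triples), comb(n_species, 3))  # 4960 4960
```

Enumerating the triples in a fixed order (here, lexicographic) matters later: the badness sequences of two matrices can only be compared if both are listed over the same triples in the same order.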

Each of the triangles has a special characteristic, the so-called badness, which will be discussed in detail in Section 2.

For two different algorithms, as a result of such constructions, we can calculate sufficiently long sequences of badness values for all triangles. Next, we calculate the values of the pair correlation for these sequences, and we consider sufficiently large values of this correlation (that is, $0.7$ or more) to indicate that the algorithms are consistent.

However, in our situation, real calculations give results that are very far from such values. Specifically, with different methods of calculating the correlation coefficient, the values obtained for the examples we are considering lie in the range from $0.075$ to $0.14$ (see details below). Furthermore, it would not be an exaggeration to say that it is good that the correlation coefficients are positive at all. Thus, as we have already said, the Jaro–Winkler and Needleman–Wunsch algorithms are inconsistent.

At the end of this section, we note a few small general remarks:

Firstly, everything said here is described in more detail below, but we do not provide detailed content by section in the introduction.
Secondly, the technique considered in the paper, which can be called an algorithm for determining the consistency of algorithms for calculating distances between strings, is applicable to any pair of such distance calculation algorithms and to any set of species.
Thirdly,
– Algorithms for determining the distances between two specific strings (in particular, DNA sequences) are heuristic due to the total size of the data under consideration;
– Algorithms for calculating the badness for triples of DNA sequences are therefore heuristic algorithms for analyzing heuristic algorithms;
– Algorithms for determining consistency between two distance calculation algorithms can therefore be called heuristic algorithms for analyzing heuristic algorithms designed to analyze heuristic algorithms.
In other words, a “triple embedding” appears.

2. Preliminaries: DNA Chains, Their Distance and Statistical Characteristics

The theory presented in the paper related to the analysis of DNA sequences is based on the author’s previous works, among which it is primarily worth noting [1,2,3,4]. The standard concepts and formulas of mathematical statistics are consistent with the monographs [5,6].

Above, we talked about the triangles formed by the distances between genomes; let us now explain where they come from. We continue the example of chimpanzees and humans but add a third, very close species.

For this interesting example, let us consider the three following species: human (H), chimpanzee (C) and bonobo (B). According to biologists,

The ancestors of these two apes and of humans diverged about 7,000,000 years ago;
The ancestors of chimpanzees and bonobos diverged about 2,500,000 years ago.

At the same time, the exact values are not particularly important. The only important thing is that the distances H–C and H–B should be equal to each other and greater than the distance C–B, so the triangle formed by the corresponding three distances should ideally be acute-angled isosceles. Moreover, the same must hold for any three species.

Table 1. Some triangles and options for their badness.

Sides ¹,²	Angles ²	Bad. (0)	Bad. (1)	Bad. (2)	Bad. (3)	Bad. (5)
$a$ , $b$ , $c$	$α$ , $β$ , $γ$	$(α - β) / γ$	$(α - β) / (π / 2)$	$(α - β) / α$	$(a - b) / a$	$(a - b) / c$
$1 1 1$	$60 60 60$	0	0	0	0	0
$5 5 4$	$66 66 47$	0	0	0	0	0
$42 41 28$	$72 68 39$	$0.10$	$0.04$	$0.05$	$0.02$	$0.04$
$19 18 17$	$66 60 55$	$0.11$	$0.07$	$0.09$	$0.05$	$0.06$
$10 9 8$	$72 59 50$	$0.26$	$0.14$	$0.18$	$0.10$	$0.13$
$6 5 5$	$74 53 53$	$0.39$	$0.23$	$0.28$	$0.17$	$0.20$
$13 12 5$	$90 67 23$	$1.00$	$0.25$	$0.25$	$0.08$	$0.20$
$5 4 3$	$90 53 37$	$1.00$	$0.41$	$0.41$	$0.20$	$0.33$
$12 6 5$	−			$1.09$
$20 6 5$	−			$1.81$

¹ Angles are rounded to an integer number of degrees; therefore, their sum may not be equal to 180. ² $a ⩾ b ⩾ c$, $α ⩾ β ⩾ γ$.

Thus, we consider distance matrices based on both of these algorithms. Certainly, in reality, the property that the triangles formed in these matrices are acute-angled isosceles is violated. Then, for each such triangle, we determine the numerical value of the badness. In the process of the calculations, several variants of such badness are considered. Examples for some specified triangles are shown in Table 1. Below, we will use the value of badness indicated by “Bad. (0)”, which we consider to be the most adequate.
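The Bad. (0) column of Table 1 can be reproduced with a few lines of Python. The sketch below is only an illustration under stated assumptions: the function names `angles_deg` and `badness0` are ours, the angles are obtained from the law of cosines, and degenerate triples (where the triangle inequality fails, marked “−” in Table 1) are rejected rather than assigned a value.

```python
import math

def angles_deg(a, b, c):
    """Return (alpha, beta, gamma) in degrees for sides sorted as a >= b >= c."""
    a, b, c = sorted((a, b, c), reverse=True)
    if a >= b + c:
        raise ValueError("degenerate triple: the triangle inequality fails")
    # Law of cosines: each angle lies opposite the side of the same name.
    alpha = math.degrees(math.acos((b * b + c * c - a * a) / (2 * b * c)))
    beta = math.degrees(math.acos((a * a + c * c - b * b) / (2 * a * c)))
    gamma = 180.0 - alpha - beta
    return alpha, beta, gamma

def badness0(a, b, c):
    """Bad. (0) of Table 1: (alpha - beta) / gamma."""
    alpha, beta, gamma = angles_deg(a, b, c)
    return (alpha - beta) / gamma

for sides in [(1, 1, 1), (5, 5, 4), (42, 41, 28), (10, 9, 8), (6, 5, 5), (5, 4, 3)]:
    print(sides, round(badness0(*sides), 2))
```

For an isosceles triangle with two equal longest sides the value is 0, and for a right triangle it is 1, because then α − β = γ.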

It should be noted that such matrices are used primarily by biologists, in particular in the popular science literature, but are little used in related mathematical modeling tasks. Of course, there are successful exceptions. As one of them, we note the most recent study [7]. Among earlier works, we note [8,9].

The author hopes that the presented work will be considered not only for the use of such distance matrices by biologists, but also, above all, as one of the applications for creating mathematical models and algorithms for working with such matrices.

Based on such matrices, the total badness of all triangles can be considered and it can be argued that algorithms with lower badness values are better than algorithms with higher values. However, such consideration is not the subject of this paper. Here, we consider another natural assumption. It can be assumed that two algorithms for determining distances are consistent when the sequences of badness of their corresponding matrices are ordered in the same way, and the greater the deviation from this order, the less consistent they are.

Two such sequences of badness values can be compared by applying rank correlation algorithms. At the same time, as in many statistical experiments, if we obtained a value exceeding, e.g., $0.7$, then it could be argued that the considered sequences of badness are consistent, and, therefore, the two algorithms under consideration are also consistent. However, we do not obtain such values (see details below). We also note in advance that different classical variants for calculating rank correlation give approximately the same results; therefore, the specific algorithm of rank correlation is unimportant.

Now, let us move on to the description of the standard statistical characteristics we use, as well as their small variations. Sometimes, we use “more customary” notation. For example, we do not use the “standard statistical” notation $M X Y$ (we write $M_{X \cdot Y}$ instead), etc.

The two random variables under consideration are denoted by X and Y; their observed realizations are denoted in the same way with the corresponding subscripts, i.e.,

X_{i} \text{ and } Y_{i} \text{ for } i = 1, 2, \dots, N .

Firstly, let us formulate the usual definition of correlation. Recall that the pair correlation coefficient can be calculated using the usual formula:

R (X, Y) = \frac{\operatorname{cov} (X, Y)}{σ_{X} \cdot σ_{Y}},

where

\operatorname{cov} (X, Y) = M_{X \cdot Y} - M_{X} \cdot M_{Y} .
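The formula above translates directly into code. The following Python sketch mirrors the notation M for means; it assumes that the standard deviations are the population ones (division by N), which the text does not state explicitly, and the names `mean` and `pair_correlation` are ours.

```python
import math

def mean(values):
    return sum(values) / len(values)

def pair_correlation(xs, ys):
    """R(X, Y) = cov(X, Y) / (sigma_X * sigma_Y), with cov = M_{X*Y} - M_X * M_Y."""
    m_x, m_y = mean(xs), mean(ys)
    m_xy = mean([x * y for x, y in zip(xs, ys)])
    cov = m_xy - m_x * m_y
    # Population standard deviations (an assumption of this sketch).
    sigma_x = math.sqrt(mean([x * x for x in xs]) - m_x ** 2)
    sigma_y = math.sqrt(mean([y * y for y in ys]) - m_y ** 2)
    return cov / (sigma_x * sigma_y)

print(pair_correlation([1, 2, 3, 4], [2, 4, 6, 8]))  # close to 1.0
print(pair_correlation([1, 2, 3, 4], [8, 6, 4, 2]))  # close to -1.0
```

The coefficient is undefined when one of the sequences is constant, since the corresponding standard deviation is then zero.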

In our further tables and program fragments, this variant of the coefficient has the number 0.

Secondly, let us formulate a modified Kendall correlation coefficient. For it, we define the number of discrepancies (“entropy coefficient”) as follows: a discrepancy holds if, for a pair $(i, j)$ where $i \neq j$, we have

X_{i} > X_{j} \text{ but } Y_{i} < Y_{j} . \qquad (1)

Let us denote the number of such discrepancies by $\operatorname{entr} (X, Y)$, or simply E in the next formula. We should also note that the correlation, calculated in any way, between the usual Kendall correlation coefficient and our variant is always equal to 1 (“correlation between correlations”). This is easily obtained by simply considering the formulas.

Since the maximum possible number of such discrepancies is $\frac{N \cdot (N - 1)}{2}$, we define the modified Kendall correlation coefficient as

1 - \frac{4 \cdot E}{N \cdot (N - 1)} .

This value is equal to 1 in the case of 0 discrepancies, and is equal to $-1$ in the case of the maximum possible number of discrepancies. In our further tables and program fragments, this variant of the coefficient has the number 2.

Note that we could calculate this coefficient as follows: We define the “entropy coefficient” considered before for each pair of pairs by (1), taking the value $-1$ for a discrepancy and 1 otherwise. Then, we calculate the sum of these coefficients and divide the result by the value $\frac{N \cdot (N - 1)}{2}$ already used earlier.

Different publications provide different versions of criticism of the Kendall criterion, but the author of the current paper considers the following flaw to be the most important: it does not give very adequate results when there are many coincidences in the values of the considered random variables. Therefore, we also consider the following “strongly modified” Kendall correlation coefficient.

It is most convenient to consider it as a search over pairs of pairs, as in the last remark. However, unlike in (1), we also use the value 0 (not only 1 and $-1$). Specifically, the value 0 is selected if and only if the values of at least one of the random variables in the considered pairs match. In our further tables and program fragments, this variant of the coefficient has the number 3.
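The two Kendall-type variants (numbers 2 and 3) can be sketched in Python as a sum of scores over pairs of observations. This is our reading of the description, not the author's program: the names `pair_score` and `kendall_modified` and the flag `zero_on_ties` are hypothetical, and in variant 2 we assume that a tie is simply not a discrepancy and therefore scores 1.

```python
from itertools import combinations

def pair_score(x_i, x_j, y_i, y_j, zero_on_ties):
    """Score one pair of observations: -1 for a discrepancy (1), else 1, or 0 on a tie."""
    if zero_on_ties and (x_i == x_j or y_i == y_j):
        return 0  # variant 3 only: at least one of the variables coincides
    # A discrepancy means the two differences have opposite signs.
    return -1 if (x_i - x_j) * (y_i - y_j) < 0 else 1

def kendall_modified(xs, ys, zero_on_ties=False):
    """Variant 2 (zero_on_ties=False) or variant 3 (zero_on_ties=True)."""
    n = len(xs)
    total = sum(pair_score(xs[i], xs[j], ys[i], ys[j], zero_on_ties)
                for i, j in combinations(range(n), 2))
    return total / (n * (n - 1) / 2)

xs, ys = [1, 1, 2], [1, 2, 3]
print(kendall_modified(xs, ys))                     # variant 2: 1.0
print(kendall_modified(xs, ys, zero_on_ties=True))  # variant 3: 2/3
```

With E discrepancies among N(N − 1)/2 pairs and no zeros, the sum equals N(N − 1)/2 − 2E, so variant 2 coincides with 1 − 4E/(N(N − 1)). The direct double loop takes quadratic time, which for 4960 triangles means about 12 million pairs.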

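Thirdly, the Spearman correlation coefficient is calculated in the usual way, i.e., by applying the pair correlation formula to the ranks $x_{i}$ and $y_{i}$ of the observations $X_{i}$ and $Y_{i}$:

\frac{\sum_{i = 1}^{N} (x_{i} - M_{X}) \cdot (y_{i} - M_{Y})}{N \cdot σ_{X} \cdot σ_{Y}},

where $M_{X}$, $M_{Y}$, $σ_{X}$ and $σ_{Y}$ are here computed over the ranks.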

In our further tables and program fragments, this variant of the coefficient has the number 1.
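A minimal Python sketch of the Spearman coefficient follows. It assumes the usual convention that tied values receive the average of their positions, which the text does not specify; the names `average_ranks` and `spearman` are ours.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in order[start:end + 1]:
            ranks[k] = shared
        start = end + 1
    return ranks

def spearman(xs, ys):
    """Pair correlation of the ranks of xs and ys."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    m_x, m_y = sum(rx) / n, sum(ry) / n
    num = sum((a - m_x) * (b - m_y) for a, b in zip(rx, ry))
    s_x = math.sqrt(sum((a - m_x) ** 2 for a in rx) / n)
    s_y = math.sqrt(sum((b - m_y) ** 2 for b in ry) / n)
    return num / (n * s_x * s_y)

print(average_ranks([10, 20, 20, 30]))  # [1.0, 2.5, 2.5, 4.0]
print(spearman([1, 2, 3], [1, 4, 9]))   # close to 1.0: the order is preserved
```

Because only ranks enter the formula, any strictly increasing transformation of either sequence leaves the result unchanged.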

Note in advance that, in Section 5, we will briefly describe and apply another way to calculate the rank correlation where it will have the number 4.

3. Problem Statement

Continuing what was said in the introduction, we formulate the problem statement. Based on general considerations, our expert assessment is that the Needleman–Wunsch algorithm [10] is much more adequate than the Jaro–Winkler algorithm [11]. However, we do not show this in this paper (many indirect arguments were given in our publications cited above), but rather show a less significant fact, namely, the inconsistency of these two algorithms.

By using correlation analysis, such inconsistency could be examined by simply calculating the rank correlation for the sequences of the corresponding elements of the two distance matrices, i.e., matrices for the Jaro–Winkler and Needleman–Wunsch algorithms. However, we do not consider such a calculation to be sufficient, for the following three reasons:

Firstly, there are not very many such elements. In square matrices of size 32, we have 496 elements located above the main diagonal.
Secondly, based on the calculations performed, we came to the conclusion that the results of such a correlation comparison are not very informative. The values of the correlation coefficient (with different methods of calculating it) do not differ much from $0.5$ (specifically, they range from $0.38$ to $0.61$), and this fact, apparently, does not allow us to draw unambiguous conclusions.
Thirdly, we are considering a specific task (and not just comparing any two abstract matrices), and, as we noted in Section 2, our matrices must have an important special property, i.e., a small value of badness (we use the Bad. (0) value); moreover, in our case, these badness values should also be consistent between the two matrices.

Thus, we use some more complex comparisons. We will talk about them in the next section.

4. Algorithms, Methods and Some More About the Motivation

In this section, we discuss different opinions about whether the conclusions that can be drawn based on the simplest study of the distances between DNA sequences are correct or not. In particular, we discuss the simplest question, to which there is still no definitive answer: whether the initial results expressing the distances between genomes, i.e., those given by the algorithms of Jaro–Winkler and Needleman–Wunsch (as well as several other algorithms not considered in this paper), are themselves correct and adequate. In any case, the most basic conclusion is as follows: the research related to the consideration of algorithms for determining the distances between DNA strands should be continued.

On the one hand, it is worth adding that there are scientific publications where the difference between the genomes of humans and chimpanzees is significantly greater than usually reported (see [12] and many others); estimates of this difference reach up to 19% (and this is only in works known to the author; see also some links in [12]). This is explained by the authors of those works as follows: Geneticists allegedly sequenced “small pieces of chimpanzee DNA”, i.e., using conventional chemical laboratory procedures, they determined the sequence of the chemical symbols. Then, these small chains of “symbols” were connected to the human genome in those places where, in their opinion, they should match. After that, the human genome was removed and a chimpanzee pseudo-genome was obtained, which allegedly indicated a common kinship with humans. Thus, a mixed sequence was obtained, which, apparently, is not real. Hence, it is concluded that the real differences are significantly more than 1%.

However, on the other hand, there are arguments for the fact that the resulting distances between genomes are more or less correct, and the grounds for the noticeable difference between humans and their closest relatives lie elsewhere.

According to the general DNA sequence, humans actually stand apart from other hominids. Moreover, this is not according to the “formal set of genes”, but according to their distribution on chromosomes. It is precisely the following factors that apply:

Multiple chromosomal aberrations;
The deletion of a huge section;
The transition of another section to the other chromosome, due to which humans have one pair of autosomes fewer;
The reversal of another section.

Most likely, it leads to a radical change in the phenotype. We consider the phenotype to be a set of internal and external features, properties and traits of a specific organism. There are some other definitions in the scientific literature.

This last, i.e., the change in the phenotype, can be described primarily by the following signs (also based on material taken from numerous scientific and popular science publications):

The absence of a massive, protruding jaw in humans, and, consequently, a significantly different structure of the oral cavity, which is the most important resonator in speech formation;
The structure of the nose (as well as the larynx) is significantly different;
Lack of wool cover;
Walking upright;
Rebuilt work of sebaceous and sweat glands;
Reconstruction of the upper part of the skull;
Many other things that distinguish humans from anthropoids in general.

However, as can be understood, all of the above are general reflections at the level of “apparently”, not supported by specific genomic studies. At the same time, it is precisely this “lack of strength”, i.e., the inability to strictly prove the above dependencies, at least in the near future, that suggests the need to continue detailed studies of DNA strands, in particular, to analyze their similarity.

Such tasks remain and they will remain very relevant for a long time. As we said before, the research related to the consideration of algorithms for determining the distances between DNA strands should be continued. In particular, it is possible in future studies to try to algorithmically formulate the grounds for the strong difference between humans and great apes.

Thus, for the considered matrices corresponding to 32 species of monkeys of different genera, we consider two sequences of badness values for all the corresponding triangles (as we said above, there are 4960 of them) and calculate the rank correlation values for them.
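How a sequence of badness values arises from a distance matrix can be illustrated by a short Python sketch. For brevity it uses only the upper-left 4 × 4 corner of Table 3 (the entries as printed, which is harmless because badness does not depend on a common scale factor). The function names are ours, and the sketch assumes that every triple satisfies the triangle inequality.

```python
import math
from itertools import combinations

# Upper-left 4 x 4 corner of Table 3 (Jaro–Winkler), entries as printed.
D = [
    [0, 541, 677, 583],
    [541, 0, 635, 387],
    [677, 635, 0, 665],
    [583, 387, 665, 0],
]

def badness0(a, b, c):
    """Bad. (0): (alpha - beta) / gamma for sides sorted as a >= b >= c."""
    a, b, c = sorted((a, b, c), reverse=True)
    alpha = math.acos((b * b + c * c - a * a) / (2 * b * c))
    beta = math.acos((a * a + c * c - b * b) / (2 * a * c))
    gamma = math.pi - alpha - beta
    return (alpha - beta) / gamma

def badness_sequence(matrix):
    """Badness of every triangle, with triples (i, j, k) in lexicographic order."""
    n = len(matrix)
    return [badness0(matrix[i][j], matrix[i][k], matrix[j][k])
            for i, j, k in combinations(range(n), 3)]

seq = badness_sequence(D)
print(len(seq))                      # 4 triangles for 4 species
print([round(v, 3) for v in seq])
```

For the full 32 × 32 matrices, the same function would return 4960 values per matrix. Since both sequences are produced over the same ordered list of triples, their elements correspond to each other, and a correlation coefficient can be applied to them.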

We also repeat once again that we do not consider in this paper the issues related to determining which of the algorithms is better; we are considering only a method that can show the inconsistency of these algorithms.

5. Description of Computational Experiments and the Results

Firstly, let us list the species of monkeys we are considering (see Table 2). It is important to remark that all the species belong to different genera. Apparently, this fact leads to a more or less successful distribution of the elements of the distance matrix.

Table 2. The considered monkey species in alphabetical order.

After that, we present the distances calculated for the mt DNA of these species in the form of two tables. Everything is considered for two different distance calculation algorithms. Namely, for our article, we have reviewed the algorithms of Jaro–Winkler and Needleman–Wunsch.

Table 3 is the calculated distance matrix for the Jaro–Winkler algorithm. The species numbers correspond to those shown in Table 2. The peculiarity of this algorithm is that it gives very close answers for these species; therefore, the three-digit numbers shown in the table are the three decimal digits following $0.0$. For instance, 541 means $0.0541$.

If it is necessary to verify the algorithms described by us and the calculation results, the values of the table elements can simply be copied from a PDF file and pasted to the computer program. The author can also submit them by e-mail.

Table 3. The matrix obtained by applying the Jaro–Winkler algorithm.

	1	2	3	4	5	6	7	8	9	10	11	12	13	14	15	16	17	18	19	20	21	22	23	24	25	26	27	28	29	30	31	32
1	000	541	677	583	592	541	589	536	562	633	465	610	530	370	512	565	545	800	624	640	520	556	548	562	515	570	726	524	511	589	589	540
2	541	000	635	387	342	369	396	381	386	733	600	686	463	542	409	549	349	722	698	708	515	440	401	543	462	455	681	388	452	464	383	532
3	677	635	000	665	676	627	668	626	670	714	728	739	666	678	655	777	617	731	744	760	737	661	663	767	692	680	690	646	648	710	661	753
4	583	387	665	000	334	396	385	384	396	767	630	727	457	579	422	577	383	677	723	733	546	447	411	571	442	434	637	403	474	447	378	568
5	592	342	676	334	000	384	395	321	397	777	644	736	481	584	433	570	375	672	742	751	554	451	421	579	429	444	650	418	498	453	393	562
6	541	369	627	396	384	000	401	319	406	706	581	665	455	528	387	526	383	753	676	675	510	458	381	499	481	457	693	320	436	475	400	527
7	589	396	668	385	395	401	000	397	389	763	630	727	471	580	425	584	392	695	738	741	556	429	346	573	458	451	657	400	488	463	382	573
8	536	381	626	384	321	319	397	000	400	723	595	691	453	537	396	527	345	724	687	696	518	457	392	534	474	457	685	312	448	470	392	526
9	562	386	670	396	397	406	389	400	000	747	585	700	462	561	415	565	390	725	703	722	532	448	403	571	467	469	681	409	477	482	327	546
10	633	733	714	767	777	706	763	723	747	000	628	635	706	661	676	699	723	674	653	678	634	720	693	677	767	758	538	712	697	793	775	656
11	465	600	728	630	644	581	630	595	585	628	000	560	584	462	549	535	594	859	568	579	494	596	589	526	636	608	790	582	560	631	639	464
12	610	686	739	727	736	665	727	691	700	635	560	000	673	610	631	601	687	871	379	381	556	688	669	589	724	706	795	667	646	729	731	571
13	530	463	666	457	481	455	471	453	462	706	584	673	000	535	446	467	449	741	665	678	434	391	454	454	414	402	678	463	454	413	461	448
14	370	542	678	579	584	528	580	537	561	661	462	610	535	000	502	566	545	790	614	627	526	545	539	558	578	549	723	511	492	545	582	539
15	512	409	655	422	433	387	425	396	415	676	549	631	446	502	000	515	400	772	630	642	477	478	395	506	510	483	705	390	437	509	426	493
16	565	549	777	577	570	526	584	527	565	699	535	601	467	566	515	000	529	913	580	571	401	484	548	350	483	461	836	528	513	481	589	379
17	545	349	617	383	375	383	392	345	390	723	594	687	449	545	400	529	000	719	684	701	514	442	391	543	462	461	673	376	443	468	387	503
18	800	722	731	677	672	753	695	724	725	674	859	871	741	790	772	913	719	000	871	884	851	708	759	897	664	690	538	759	763	694	709	874
19	624	698	744	723	742	676	738	687	703	653	568	379	665	614	630	580	684	871	000	366	579	701	682	565	734	711	799	668	647	721	729	547
20	640	708	760	733	751	675	741	696	722	678	579	381	678	627	642	571	701	884	366	000	585	717	688	567	752	718	806	679	656	729	739	551
21	520	515	737	546	554	510	556	518	532	634	494	556	434	526	477	401	514	851	579	585	000	446	515	386	469	462	787	508	498	485	549	344
22	556	440	661	447	451	458	429	457	448	720	596	688	391	545	478	484	442	708	701	717	446	000	438	473	377	369	644	465	471	379	451	469
23	548	401	663	411	421	381	346	392	403	693	589	669	454	539	395	548	391	759	682	688	515	438	000	539	492	478	705	380	451	490	416	528
24	562	543	767	571	579	499	573	534	571	677	526	589	454	558	506	350	543	897	565	567	386	473	539	000	503	479	822	522	509	465	569	372
25	515	462	692	442	429	481	458	474	467	767	636	724	414	578	510	483	462	664	734	752	469	377	492	503	000	346	627	484	486	344	467	486
26	570	455	680	434	444	457	451	457	469	758	608	706	402	549	483	461	461	690	711	718	462	369	478	479	346	000	621	460	453	366	451	471
27	726	681	690	637	650	693	657	685	681	538	790	795	678	723	705	836	673	538	799	806	787	644	705	822	627	621	000	694	699	634	663	805
28	524	388	646	403	418	320	400	312	409	712	582	667	463	511	390	528	376	759	668	679	508	465	380	522	484	460	694	000	389	478	409	525
29	511	452	648	474	498	436	488	448	477	697	560	646	454	492	437	513	443	763	647	656	498	471	451	509	486	453	699	389	000	476	488	500
30	589	464	710	447	453	475	463	470	482	793	631	729	413	545	509	481	468	694	721	729	485	379	490	465	344	366	634	478	476	000	466	479
31	589	383	661	378	393	400	382	392	327	775	639	731	461	582	426	589	387	709	729	739	549	451	416	569	467	451	663	409	488	466	000	541
32	540	532	753	568	562	527	573	526	546	656	464	571	448	539	493	379	503	874	547	551	344	469	528	372	486	471	805	525	500	479	541	000

The following Table 4 is the calculated distance matrix for the Needleman–Wunsch algorithm. The species numbers also correspond to those shown in Table 2.

Table 4. The matrix obtained by applying the Needleman–Wunsch algorithm.

This algorithm gives answers that are not so close for these species; therefore, the three-digit numbers shown in the table are the three decimal digits following $0.$ (not $0.0$). For instance, 375 means $0.375$. It is important to note that such a tenfold increase in values does not change any of the values of the badness of the triangles we are considering. Indeed, considering the first triangle of Table 3, with the sides $0.0541$, $0.0677$ and $0.0635$, we can say that its badness is exactly equal to the badness of the triangle with the sides $0.541$, $0.677$ and $0.635$.
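The invariance of badness under a common scaling of the sides follows from the law of cosines. A brief derivation in the notation of Table 1 is given below, where $k > 0$ is an arbitrary scale factor introduced here only for illustration:

```latex
\cos α = \frac{b^{2} + c^{2} - a^{2}}{2 \cdot b \cdot c},
\qquad
\frac{(k b)^{2} + (k c)^{2} - (k a)^{2}}{2 \cdot (k b) \cdot (k c)}
= \frac{k^{2} \cdot (b^{2} + c^{2} - a^{2})}{2 \cdot k^{2} \cdot b \cdot c}
= \cos α .
```

The same cancellation holds for $β$ and $γ$; hence, all three angles, and with them the value $(α - β) / γ$, are the same for the triangles with sides $(a, b, c)$ and $(k a, k b, k c)$. The ratio variants $(a - b) / a$ and $(a - b) / c$ are unchanged for the same reason.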

As follows from the previous material, we can work with Table 3 and Table 4 (they are given after the text of the paper), as well as with any other tables built on the same principle, simply as with tables of integers: the values of badness that we are interested in will be the same.

In general, all the calculation results are shown in Table 5. The column designations are clear; they are related to the options described above for calculating the rank correlation. The row designations have the following meaning:

“Simple” means considering the sequences of matrix elements above the main diagonal, while “main” means considering the sequences of badness (Bad. (0)) of triangles;
“With” (unlike “without”) means that we used normalization before calculations. As usual, normalization is what we call the linear mapping of all the received data into the segment $[0, 1]$.
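A minimal Python sketch of such a normalization is given below. It assumes the common min–max form of the linear mapping (the smallest value goes to 0 and the largest to 1); the text only states that the mapping is linear and maps into $[0, 1]$, and the name `normalize` is ours.

```python
def normalize(values):
    """Map values linearly onto [0, 1]: the minimum goes to 0, the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are equal; the linear map is undefined")
    return [(v - lo) / (hi - lo) for v in values]

# The first triangle of Table 3, entries as printed.
print(normalize([541, 677, 635]))
print(normalize([2, 4, 6]))  # [0.0, 0.5, 1.0]
```

Such a mapping is strictly increasing, so it preserves the order of the values within each sequence.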

In the next section, we will discuss the numerical results obtained and some conclusions.

Table 5. The calculation results.

Option	Corr-0, Usual	Corr-1, Spearman	Corr-2, Kendall+	Corr-3, Kendall++
simple, with	$0.610$	$0.533$	$0.386$	$0.376$
main, without	$0.0817$	$0.136$	$0.0742$	$0.0909$
main, with	$0.0817$	$0.136$	$0.139$	$0.0909$

At the end of this section, we note that our own method of calculating rank correlation was given in a recently published work [13]; see its detailed description in that paper. This method differs from all “classical” methods and at the same time gives adequate results in all fields of application known to us. The first computational experiments show its good applicability in the subject area considered in this paper. For example, for the same set of genomes, the values of the rank correlation in all three calculation variants (i.e., in three rows) for the pair of Needleman–Wunsch and Jaro–Winkler algorithms do not exceed $0.14$, while the values obtained for the pair of Needleman–Wunsch and Damerau–Levenshtein [14] algorithms are greater than $0.6$. However, of course, computational experiments related to this method of calculating the rank correlation should be continued.

6. Discussion and Conclusions

Summarizing some of the thoughts of this paper, we can formulate the following: The estimated difference between genomes varies greatly between studies, although the vast majority of both scientific and popular scientific papers give a distance between the genomes of humans and chimpanzees in the range from 0.5% to 2% (i.e., the similarity is from 98% to 99.5%). For example, according to [15], the genomes of humans and chimpanzees are “identical by more than 98.5%”, and this statement is very often quoted “as the ultimate truth”. However, for the sake of completeness, in the next section, we give even more detailed reflections on the topic of specific values of genome proximity.

Regarding the various rank correlation algorithms used in this article, we note that very interesting results related to their different ways of calculation (including the method described in [13]) are provided by the following interesting example, specially selected by the author:

1001 1002 1003 1004 1005 1006 2001 2002 2003 2004 2005 2006

1006 1005 1004 1003 1002 1001 2006 2005 2004 2003 2002 2001

(The corresponding elements of the two rows form the pairs of elements.)
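To see why this example is instructive, one can evaluate the classical coefficients on it. The Python sketch below is only an illustration: it covers the usual pair correlation, the Spearman coefficient and the modified Kendall coefficient (number 2), but not the method of [13], and the function names are ours.

```python
from itertools import combinations

xs = [1001, 1002, 1003, 1004, 1005, 1006, 2001, 2002, 2003, 2004, 2005, 2006]
ys = [1006, 1005, 1004, 1003, 1002, 1001, 2006, 2005, 2004, 2003, 2002, 2001]

def pearson(u, v):
    n = len(u)
    m_u, m_v = sum(u) / n, sum(v) / n
    cov = sum((a - m_u) * (b - m_v) for a, b in zip(u, v))
    var_u = sum((a - m_u) ** 2 for a in u)
    var_v = sum((b - m_v) ** 2 for b in v)
    return cov / (var_u * var_v) ** 0.5

def ranks(u):
    # There are no ties in this example, so a rank is the 1-based sorted position.
    position = {value: r for r, value in enumerate(sorted(u), start=1)}
    return [position[value] for value in u]

def kendall_modified(u, v):
    n = len(u)
    # E counts the pairs whose differences have opposite signs (discrepancies).
    e = sum(1 for i, j in combinations(range(n), 2)
            if (u[i] - u[j]) * (v[i] - v[j]) < 0)
    return 1 - 4 * e / (n * (n - 1))

print(round(pearson(xs, ys), 5))                # 0.99998
print(round(pearson(ranks(xs), ranks(ys)), 4))  # 0.5105 (Spearman)
print(round(kendall_modified(xs, ys), 4))       # 0.0909
```

The three coefficients disagree sharply: the order is reversed inside each group of six (30 discrepancies), but it is preserved between the groups (36 of the 66 pairs), and the large gap between the groups dominates the usual pair correlation.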

From our point of view, this example, as well as some other specially selected ones, shows the need to use special algorithms for calculating rank correlation and improving existing ones. Therefore, in the following publications, we propose to return to the consideration of this example and its connection with the methods of calculating rank correlation that we have considered.

Here are some references to biologists’ works that use distance matrices between genomes. The following are recent works that are not related to the study of mammalian mitochondrial DNA: [16,17,18]. However, we have already noted that the application of mathematical methods and the creation of algorithms for analyzing such matrices, including heuristic algorithms, are reflected in a very small number of publications. We have already cited the following: [7,8,9].

Thus, the main work performed is as follows: Based on the above tables of calculation results, we can say that the method we described as “simple” can hardly answer the question we posed about the consistency of the two algorithms. Correlation values of approximately 0.5, as a rule, do not say much. However, everything is clarified by a more complex method that examines the rank correlation of the badness of all the triangles under consideration. On the same input data, the pair of Needleman–Wunsch and Damerau–Levenshtein algorithms gives large values of this rank correlation, whereas the pair of Needleman–Wunsch and Jaro–Winkler algorithms gives very small values (not exceeding 0.14).
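The more complex method can be sketched in Python as follows, under loudly stated assumptions. The badness used here, the relative gap between the two longest sides of a triangle, is only one plausible variant assumed for illustration; the variants actually used are those defined in Section 2. The Spearman coefficient stands in for whichever rank correlation is preferred, and all function names are hypothetical. The input is limited to rows and columns 1 to 5 of Table 4 and of the other 32 × 32 matrix listed after it, so the sketch covers 10 triangles rather than all 4960.

```python
from itertools import combinations
from math import comb

# Rows and columns 1 to 5 of Table 4 (Needleman–Wunsch).
nw = [
    [0, 250, 375, 260, 253],
    [250, 0, 293, 184, 168],
    [375, 293, 0, 322, 323],
    [260, 184, 322, 0, 191],
    [253, 168, 323, 191, 0],
]
# Rows and columns 1 to 5 of the other 32 x 32 matrix listed after Table 4.
other = [
    [0, 541, 677, 583, 592],
    [541, 0, 635, 387, 342],
    [677, 635, 0, 665, 676],
    [583, 387, 665, 0, 334],
    [592, 342, 676, 334, 0],
]


def badness(a, b, c):
    """ASSUMED badness: relative gap between the two longest sides.

    It is 0 when the two longest sides are equal; the longest side must be
    positive, which holds for distances between distinct species.
    """
    longest, second, _ = sorted((a, b, c), reverse=True)
    return (longest - second) / longest


def triangle_badness(matrix):
    """Badness of every triangle (i, j, k) with i < j < k, in a fixed order."""
    n = len(matrix)
    return [
        badness(matrix[i][j], matrix[i][k], matrix[j][k])
        for i, j, k in combinations(range(n), 3)
    ]


def average_ranks(values):
    """Ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1
        for idx in order[pos:end + 1]:
            ranks[idx] = shared
        pos = end + 1
    return ranks


def spearman(x, y):
    """Spearman coefficient: the Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    norm_x = sum((a - mean_x) ** 2 for a in rx) ** 0.5
    norm_y = sum((b - mean_y) ** 2 for b in ry) ** 0.5
    return cov / (norm_x * norm_y)


bad_nw = triangle_badness(nw)
bad_other = triangle_badness(other)
rho = spearman(bad_nw, bad_other)

print("triangles in this sketch:", len(bad_nw))
print("triangles for 32 species:", comb(32, 3))
print("rank correlation of the two badness sequences:", round(rho, 3))
```

Because both badness sequences are produced in the same triangle order, their rank correlation compares how the two matrices order the same triangles. A value obtained from this 5 × 5 excerpt says nothing about the full matrices; the excerpt only keeps the example short.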

Therefore, we think that we have shown the inconsistency between two well-known algorithms for determining the distances between genomes, namely, the algorithms of Jaro–Winkler and Needleman–Wunsch. Moreover, we assume (although this is not yet fully confirmed) that the Needleman–Wunsch algorithm is significantly more adequate than the Jaro–Winkler one.

Here are possible directions for continuing the work described in the paper, including outcomes that can be drawn based on its material:

We hope to obtain a distance matrix for all species of monkeys (500 to 850 species, according to various sources); as a first step, this will rely on algorithms for restoring a partially filled matrix.
For this problem, it is best to use the Needleman–Wunsch algorithm alone, ignoring the other algorithms described.
The author believes that the following task is very important. It consists of examining, for the given distance matrix, all five variants of badness and choosing “the best” of them. In previous papers and in Section 2, it was said that, ideally, this value should be equal to 0. Then, “the best” badness can be obtained by minimizing a linear combination of the considered variants. At the same time, of course, functions that are identically zero are pointless to consider. Therefore, in our model, we consider a linear combination of several of the above functions for variants of badness.
We hope to continue working on the tasks described in this paper and on our algorithm for calculating rank correlation [13], which can be called corr-4.

Funding

This work was partially supported by a grant from the scientific program of Chinese universities, the “Higher Education Stability Support Program” (section “Shenzhen 2022—Science, Technology and Innovation Commission of Shenzhen Municipality”).

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Acknowledgments

This work was partially supported by a grant from the scientific program of Chinese universities, the “Higher Education Stability Support Program” (section “Shenzhen 2022—Science, Technology and Innovation Commission of Shenzhen Municipality”). The author also expresses gratitude to post-graduate students Li Jiamian and Mu Jingyuan (Shenzhen MSU–BIT University), who obtained the values in Table 3 and Table 4 by independently implementing these algorithms.

Conflicts of Interest

The author declares no conflicts of interest.

References

1. Melnikov, B.; Pivneva, S.; Trifonov, M. Various algorithms, calculating distances of DNA sequences, and some computational recommendations for use such algorithms. CEUR Workshop Proc. 2017, 1902, 43–50.
2. Melnikov, B.; Melnikova, E.; Pivneva, S.; Trenina, M. An approach to analysis of the similarity of DNA-sequences. CEUR Workshop Proc. 2018, 2212, 67–72.
3. Melnikov, B.; Zhang, Y.; Chaikovskii, D. An inverse problem for matrix processing: An improved algorithm for restoring the distance matrix for DNA chains. Cybern. Phys. 2022, 11, 217–226.
4. Melnikov, B.; Chaikovskii, D. On the application of heuristics of the TSP for the task of restoring the DNA matrix. Front. Artif. Intell. Appl. 2024, 385, 36–44.
5. Lagutin, M. Visual Mathematical Statistics; BINOM. Laboratoriya Znaniy: Moscow, Russia, 2012; 472p. (In Russian)
6. Wasserman, L. All of Statistics: A Concise Course in Statistical Inference; Springer Science & Business Media: Berlin, Germany, 2013; 442p.
7. Young, S.; Gilles, J. Use of 3D chaos game representation to quantify DNA sequence similarity with applications for hierarchical clustering. J. Theor. Biol. 2024, 5967, 111972.
8. Ballester, P.J.; Richards, W.G. Ultrafast shape recognition to search compound databases for similar molecular shapes. J. Comput. Chem. 2007, 28, 1711–1723.
9. Bodenhofer, U.; Bonatesta, E.; Horejs-Kainrath, C.; Hochreiter, S. Msa: An R package for multiple sequence alignment. Bioinformatics 2015, 31, 3997–3999.
10. Needleman, S.; Wunsch, C. A general method applicable to the search for similarities in the amino acid sequence of two proteins. J. Mol. Biol. 1970, 48, 443–453.
11. Winkler, W. String comparator metrics and enhanced decision rules in the Fellegi-Sunter model of record linkage. In Proceedings of the Survey Research Methods Sections, Anaheim, CA, USA, 6–9 August 1990; American Statistical Association: Alexandria, VA, USA, 1990; pp. 354–359.
12. Cohen, J. Relative differences: The myth of 1%. Science 2007, 316, 1836.
13. Melnikov, B.; Lysak, T. On some algorithms for comparing models of femtosecond laser radiation propagation in a medium with gold nanorods. Cybern. Phys. 2024, 13, 261–267.
14. Levenshtein, V. Binary codes capable of correcting deletions, insertions, and reversals. Sov. Phys. Dokl. 1966, 10, 707–710.
15. Polavarapu, N.; Arora, G.G.; Mittal, V.; McDonald, J. Characterization and potential functional significance of human-chimpanzee large INDEL variation. Mob. DNA 2011, 2, 13.
16. Sampaio, J.R.; Oliveira, W.D.d.S.; Junior, L.C.d.S.; Nascimento, F.d.S.; Moreira, R.F.C.; Ramos, A.P.d.S.; Santos-Serejo, J.A.d.; Amorim, E.P.; Jesus, R.D.M.d.; Ferreira, C.F. Diversity of Improved Diploids and Commercial Triploids from Musa spp. via Molecular Markers. Curr. Issues Mol. Biol. 2024, 46, 11783–11796.
17. Memon, J.; Patel, R.; Patel, B.N.; Patel, M.P.; Madariya, R.B.; Patel, J.K.; Kumar, S. Genetic diversity, population structure and association mapping of morphobiochemical traits in castor (Ricinus communis L.) through simple sequence repeat markers. Ind. Crops Prod. 2024, 221, 119348.
18. Mansueto, L.; Tandayu, E.; Mieog, J.; Garcia-de Heer, L.; Das, R.; Burn, A.; Mauleon, R.; Kretzschmar, T. HASCH—A high-throughput amplicon-based SNP-platform for medicinal cannabis and industrial hemp genotyping applications. BMC Genom. 2024, 25, 818.

Table 2. The considered monkey species in alphabetical order.

No.	Species of Monkeys	No.	Species of Monkeys
1	Allenopithecus nigroviridis	17	Lagothrix lagotricha
2	Ateles belzebuth	18	Leontopithecus rosalia
3	Brachyteles arachnoides	19	Macaca fascicularis
4	Cacajao calvus	20	Macaca fuscata
5	Callimico goeldii	21	Mandrillus leucophaeus
6	Callithrix jacchus	22	Nasalis larvatus
7	Carlito syrichta	23	Nycticebus coucang
8	Cebuella pygmaea	24	Papio anubis
9	Cephalopachus bancanus	25	Presbytis melalophos
10	Cercocebus atys	26	Pygathrix nemaeus
11	Cercopithecus albogularis	27	Rhinopithecus roxellana
12	Chlorocebus sabaeus	28	Saguinus oedipus
13	Colobus angolensis	29	Saimiri boliviensis
14	Erythrocebus patas	30	Semnopithecus entellus
15	Galago moholi	31	Tarsius dentatus
16	Gorilla gorilla	32	Theropithecus gelada

Table 4. The matrix obtained by applying the Needleman–Wunsch algorithm.

	1	2	3	4	5	6	7	8	9	10	11	12	13	14	15	16	17	18	19	20	21	22	23	24	25	26	27	28	29	30	31	32
1	000	250	375	260	253	256	283	253	277	156	143	192	197	157	274	216	253	477	206	204	154	187	284	161	188	192	381	263	256	192	281	153
2	250	000	293	184	168	168	267	167	265	250	253	287	258	256	264	240	123	473	289	289	246	247	275	254	245	249	473	180	179	251	267	249
3	375	293	000	322	323	320	371	320	368	373	377	476	380	375	384	375	286	476	474	476	374	372	383	378	377	376	296	327	329	381	372	374
4	260	184	322	000	191	191	271	189	270	258	263	295	264	264	271	258	182	476	297	298	257	258	278	265	258	259	405	196	199	259	270	261
5	253	168	323	191	000	146	268	145	265	255	258	289	259	259	272	251	169	474	292	293	253	253	276	260	250	250	472	169	184	251	269	256
6	256	168	320	191	146	000	276	091	271	254	254	286	261	259	271	249	165	477	286	287	252	253	274	255	255	255	474	163	180	256	273	253
7	283	267	371	272	268	276	000	272	152	283	287	319	286	285	255	279	266	477	319	320	281	277	253	289	276	275	406	273	278	282	177	281
8	253	167	320	189	145	091	272	000	266	251	253	286	257	257	269	247	165	474	289	288	251	254	275	256	253	253	471	163	181	255	269	254
9	277	265	368	270	266	271	152	266	000	275	277	312	279	276	250	272	263	477	311	313	275	271	250	282	272	271	402	270	273	275	172	275
10	156	250	373	258	255	254	283	251	275	000	159	201	202	173	275	212	249	477	191	191	084	190	279	153	191	196	377	260	253	192	279	148
11	143	253	377	263	258	254	287	253	277	159	000	174	202	140	273	215	251	480	205	202	153	191	280	162	191	193	384	264	258	194	283	156
12	192	287	476	295	289	286	319	286	312	201	174	000	244	193	301	246	285	479	160	157	201	236	312	203	235	234	478	291	287	237	316	200
13	197	258	380	264	259	261	286	257	279	202	202	244	000	209	281	227	256	478	246	245	197	167	284	200	176	174	363	266	267	174	283	197
14	157	256	375	264	259	259	285	257	276	173	140	193	209	000	278	225	258	478	221	219	169	200	287	179	200	206	378	270	265	207	282	173
15	274	264	384	271	272	271	255	269	250	275	273	301	281	278	000	267	266	476	301	301	274	273	202	277	273	273	476	271	275	278	249	272
16	216	240	375	258	251	249	279	247	272	212	215	246	227	225	267	000	244	481	245	245	208	217	275	216	219	222	399	254	250	222	275	210
17	253	123	286	182	169	165	266	165	263	249	251	285	256	259	266	245	000	473	288	289	247	248	272	252	252	250	407	179	179	251	267	247
18	477	472	476	476	474	477	476	474	477	477	480	479	478	478	476	482	473	000	480	481	478	479	477	478	475	476	476	477	477	479	474	479
19	206	289	474	297	292	286	319	289	311	191	205	160	246	221	301	245	288	480	000	077	189	234	311	200	237	237	477	296	290	239	317	199
20	204	289	475	298	293	287	320	288	313	191	202	157	245	219	301	245	289	481	077	000	189	236	312	196	236	236	478	293	288	241	316	195
21	154	246	374	257	253	252	281	251	275	084	153	201	197	169	274	208	247	477	189	189	000	185	281	146	187	190	379	256	253	190	276	141
22	187	247	372	258	254	253	277	254	271	190	191	236	167	200	273	217	248	479	234	236	185	000	279	193	142	129	336	264	257	145	271	187
23	284	275	383	278	276	274	253	275	250	279	280	312	284	287	202	275	272	477	311	312	281	279	000	287	282	281	476	276	279	284	253	282
24	161	254	378	265	260	255	289	256	282	153	162	203	200	179	277	216	252	479	200	196	146	193	286	000	199	197	382	264	260	196	286	095
25	188	245	377	258	250	255	276	253	272	191	192	235	176	200	273	219	252	474	237	236	187	142	282	199	000	148	348	267	256	148	272	192
26	192	249	376	259	250	255	275	253	271	196	193	234	174	206	273	222	250	477	237	236	190	129	281	197	148	000	339	264	256	153	276	192
27	381	473	296	405	472	474	406	471	402	377	384	478	363	378	475	399	407	476	477	478	379	336	476	382	348	339	000	477	471	352	403	380
28	263	180	327	196	169	163	273	163	270	260	264	291	266	270	270	254	179	477	296	293	256	264	276	264	267	264	477	000	190	265	273	259
29	256	179	329	199	184	180	278	181	273	253	258	286	267	265	275	250	179	477	290	288	253	257	279	260	256	256	472	190	000	261	275	255
30	192	251	380	259	251	256	282	254	275	192	194	237	174	207	278	222	251	480	239	241	190	145	284	196	148	153	352	265	261	000	279	195
31	281	267	372	270	269	273	177	269	172	280	283	316	283	282	249	275	267	475	317	316	276	272	253	286	272	276	403	273	275	279	000	279
32	153	249	374	261	256	253	281	254	275	148	156	200	197	173	272	210	247	479	199	195	141	187	282	095	192	192	380	259	255	195	279	000

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

	1	2	3	4	5	6	7	8	9	10	11	12	13	14	15	16	17	18	19	20	21	22	23	24	25	26	27	28	29	30	31	32
1	000	541	677	583	592	541	589	536	562	633	465	610	530	370	512	565	545	800	624	640	520	556	548	562	515	570	726	524	511	589	589	540
2	541	000	635	387	342	369	396	381	386	733	600	686	463	542	409	549	349	722	698	708	515	440	401	543	462	455	681	388	452	464	383	532
3	677	635	000	665	676	627	668	626	670	714	728	739	666	678	655	777	617	731	744	760	737	661	663	767	692	680	690	646	648	710	661	753
4	583	387	665	000	334	396	385	384	396	767	630	727	457	579	422	577	383	677	723	733	546	447	411	571	442	434	637	403	474	447	378	568
5	592	342	676	334	000	384	395	321	397	777	644	736	481	584	433	570	375	672	742	751	554	451	421	579	429	444	650	418	498	453	393	562
6	541	369	627	396	384	000	401	319	406	706	581	665	455	528	387	526	383	753	676	675	510	458	381	499	481	457	693	320	436	475	400	527
7	589	396	668	385	395	401	000	397	389	763	630	727	471	580	425	584	392	695	738	741	556	429	346	573	458	451	657	400	488	463	382	573
8	536	381	626	384	321	319	397	000	400	723	595	691	453	537	396	527	345	724	687	696	518	457	392	534	474	457	685	312	448	470	392	526
9	562	386	670	396	397	406	389	400	000	747	585	700	462	561	415	565	390	725	703	722	532	448	403	571	467	469	681	409	477	482	327	546
10	633	733	714	767	777	706	763	723	747	000	628	635	706	661	676	699	723	674	653	678	634	720	693	677	767	758	538	712	697	793	775	656
11	465	600	728	630	644	581	630	595	585	628	000	560	584	462	549	535	594	859	568	579	494	596	589	526	636	608	790	582	560	631	639	464
12	610	686	739	727	736	665	727	691	700	635	560	000	673	610	631	601	687	871	379	381	556	688	669	589	724	706	795	667	646	729	731	571
13	530	463	666	457	481	455	471	453	462	706	584	673	000	535	446	467	449	741	665	678	434	391	454	454	414	402	678	463	454	413	461	448
14	370	542	678	579	584	528	580	537	561	661	462	610	535	000	502	566	545	790	614	627	526	545	539	558	578	549	723	511	492	545	582	539
15	512	409	655	422	433	387	425	396	415	676	549	631	446	502	000	515	400	772	630	642	477	478	395	506	510	483	705	390	437	509	426	493
16	565	549	777	577	570	526	584	527	565	699	535	601	467	566	515	000	529	913	580	571	401	484	548	350	483	461	836	528	513	481	589	379
17	545	349	617	383	375	383	392	345	390	723	594	687	449	545	400	529	000	719	684	701	514	442	391	543	462	461	673	376	443	468	387	503
18	800	722	731	677	672	753	695	724	725	674	859	871	741	790	772	913	719	000	871	884	851	708	759	897	664	690	538	759	763	694	709	874
19	624	698	744	723	742	676	738	687	703	653	568	379	665	614	630	580	684	871	000	366	579	701	682	565	734	711	799	668	647	721	729	547
20	640	708	760	733	751	675	741	696	722	678	579	381	678	627	642	571	701	884	366	000	585	717	688	567	752	718	806	679	656	729	739	551
21	520	515	737	546	554	510	556	518	532	634	494	556	434	526	477	401	514	851	579	585	000	446	515	386	469	462	787	508	498	485	549	344
22	556	440	661	447	451	458	429	457	448	720	596	688	391	545	478	484	442	708	701	717	446	000	438	473	377	369	644	465	471	379	451	469
23	548	401	663	411	421	381	346	392	403	693	589	669	454	539	395	548	391	759	682	688	515	438	000	539	492	478	705	380	451	490	416	528
24	562	543	767	571	579	499	573	534	571	677	526	589	454	558	506	350	543	897	565	567	386	473	539	000	503	479	822	522	509	465	569	372
25	515	462	692	442	429	481	458	474	467	767	636	724	414	578	510	483	462	664	734	752	469	377	492	503	000	346	627	484	486	344	467	486
26	570	455	680	434	444	457	451	457	469	758	608	706	402	549	483	461	461	690	711	718	462	369	478	479	346	000	621	460	453	366	451	471
27	726	681	690	637	650	693	657	685	681	538	790	795	678	723	705	836	673	538	799	806	787	644	705	822	627	621	000	694	699	634	663	805
28	524	388	646	403	418	320	400	312	409	712	582	667	463	511	390	528	376	759	668	679	508	465	380	522	484	460	694	000	389	478	409	525
29	511	452	648	474	498	436	488	448	477	697	560	646	454	492	437	513	443	763	647	656	498	471	451	509	486	453	699	389	000	476	488	500
30	589	464	710	447	453	475	463	470	482	793	631	729	413	545	509	481	468	694	721	729	485	379	490	465	344	366	634	478	476	000	466	479
31	589	383	661	378	393	400	382	392	327	775	639	731	461	582	426	589	387	709	729	739	549	451	416	569	467	451	663	409	488	466	000	541
32	540	532	753	568	562	527	573	526	546	656	464	571	448	539	493	379	503	874	547	551	344	469	528	372	486	471	805	525	500	479	541	000

28	524	388	646	403	418	320	400	312	409	712	582	667	463	511	390	528	376	759	668	679	508	465	380	522	484	460	694	000	389	478	409	525
29	511	452	648	474	498	436	488	448	477	697	560	646	454	492	437	513	443	763	647	656	498	471	451	509	486	453	699	389	000	476	488	500
30	589	464	710	447	453	475	463	470	482	793	631	729	413	545	509	481	468	694	721	729	485	379	490	465	344	366	634	478	476	000	466	479
31	589	383	661	378	393	400	382	392	327	775	639	731	461	582	426	589	387	709	729	739	549	451	416	569	467	451	663	409	488	466	000	541
32	540	532	753	568	562	527	573	526	546	656	464	571	448	539	493	379	503	874	547	551	344	469	528	372	486	471	805	525	500	479	541	000

	1	2	3	4	5	6	7	8	9	10	11	12	13	14	15	16	17	18	19	20	21	22	23	24	25	26	27	28	29	30	31	32
1	000	250	375	260	253	256	283	253	277	156	143	192	197	157	274	216	253	477	206	204	154	187	284	161	188	192	381	263	256	192	281	153
2	250	000	293	184	168	168	267	167	265	250	253	287	258	256	264	240	123	473	289	289	246	247	275	254	245	249	473	180	179	251	267	249
3	375	293	000	322	323	320	371	320	368	373	377	476	380	375	384	375	286	476	474	476	374	372	383	378	377	376	296	327	329	381	372	374
4	260	184	322	000	191	191	271	189	270	258	263	295	264	264	271	258	182	476	297	298	257	258	278	265	258	259	405	196	199	259	270	261
5	253	168	323	191	000	146	268	145	265	255	258	289	259	259	272	251	169	474	292	293	253	253	276	260	250	250	472	169	184	251	269	256
6	256	168	320	191	146	000	276	091	271	254	254	286	261	259	271	249	165	477	286	287	252	253	274	255	255	255	474	163	180	256	273	253
7	283	267	371	272	268	276	000	272	152	283	287	319	286	285	255	279	266	477	319	320	281	277	253	289	276	275	406	273	278	282	177	281
8	253	167	320	189	145	091	272	000	266	251	253	286	257	257	269	247	165	474	289	288	251	254	275	256	253	253	471	163	181	255	269	254
9	277	265	368	270	266	271	152	266	000	275	277	312	279	276	250	272	263	477	311	313	275	271	250	282	272	271	402	270	273	275	172	275
10	156	250	373	258	255	254	283	251	275	000	159	201	202	173	275	212	249	477	191	191	084	190	279	153	191	196	377	260	253	192	279	148
11	143	253	377	263	258	254	287	253	277	159	000	174	202	140	273	215	251	480	205	202	153	191	280	162	191	193	384	264	258	194	283	156
12	192	287	476	295	289	286	319	286	312	201	174	000	244	193	301	246	285	479	160	157	201	236	312	203	235	234	478	291	287	237	316	200
13	197	258	380	264	259	261	286	257	279	202	202	244	000	209	281	227	256	478	246	245	197	167	284	200	176	174	363	266	267	174	283	197
14	157	256	375	264	259	259	285	257	276	173	140	193	209	000	278	225	258	478	221	219	169	200	287	179	200	206	378	270	265	207	282	173
15	274	264	384	271	272	271	255	269	250	275	273	301	281	278	000	267	266	476	301	301	274	273	202	277	273	273	476	271	275	278	249	272
16	216	240	375	258	251	249	279	247	272	212	215	246	227	225	267	000	244	481	245	245	208	217	275	216	219	222	399	254	250	222	275	210
17	253	123	286	182	169	165	266	165	263	249	251	285	256	259	266	245	000	473	288	289	247	248	272	252	252	250	407	179	179	251	267	247
18	477	472	476	476	474	477	476	474	477	477	480	479	478	478	476	482	473	000	480	481	478	479	477	478	475	476	476	477	477	479	474	479
19	206	289	474	297	292	286	319	289	311	191	205	160	246	221	301	245	288	480	000	077	189	234	311	200	237	237	477	296	290	239	317	199
20	204	289	475	298	293	287	320	288	313	191	202	157	245	219	301	245	289	481	077	000	189	236	312	196	236	236	478	293	288	241	316	195
21	154	246	374	257	253	252	281	251	275	084	153	201	197	169	274	208	247	477	189	189	000	185	281	146	187	190	379	256	253	190	276	141
22	187	247	372	258	254	253	277	254	271	190	191	236	167	200	273	217	248	479	234	236	185	000	279	193	142	129	336	264	257	145	271	187
23	284	275	383	278	276	274	253	275	250	279	280	312	284	287	202	275	272	477	311	312	281	279	000	287	282	281	476	276	279	284	253	282
24	161	254	378	265	260	255	289	256	282	153	162	203	200	179	277	216	252	479	200	196	146	193	286	000	199	197	382	264	260	196	286	095
25	188	245	377	258	250	255	276	253	272	191	192	235	176	200	273	219	252	474	237	236	187	142	282	199	000	148	348	267	256	148	272	192
26	192	249	376	259	250	255	275	253	271	196	193	234	174	206	273	222	250	477	237	236	190	129	281	197	148	000	339	264	256	153	276	192
27	381	473	296	405	472	474	406	471	402	377	384	478	363	378	475	399	407	476	477	478	379	336	476	382	348	339	000	477	471	352	403	380
28	263	180	327	196	169	163	273	163	270	260	264	291	266	270	270	254	179	477	296	293	256	264	276	264	267	264	477	000	190	265	273	259
29	256	179	329	199	184	180	278	181	273	253	258	286	267	265	275	250	179	477	290	288	253	257	279	260	256	256	472	190	000	261	275	255
30	192	251	380	259	251	256	282	254	275	192	194	237	174	207	278	222	251	480	239	241	190	145	284	196	148	153	352	265	261	000	279	195
31	281	267	372	270	269	273	177	269	172	280	283	316	283	282	249	275	267	475	317	316	276	272	253	286	272	276	403	273	275	279	000	279
32	153	249	374	261	256	253	281	254	275	148	156	200	197	173	272	210	247	479	199	195	141	187	282	095	192	192	380	259	255	195	279	000
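
The matrices above are the input from which the triangles discussed in this paper are formed, one triangle per unordered triple of species. The following Python fragment is a minimal sketch of how such a tab-separated table could be read and turned into triples of distances. It is an illustration only, not the implementation used in the paper: the function names `parse_matrix`, `asymmetric_pairs`, and `triangles` are hypothetical, the entries are read as plain integers with no assumption about their scale, and the badness measure itself is not reproduced here. The embedded sample is the upper-left 4×4 corner of the last matrix above.

```python
from itertools import combinations

# Upper-left 4x4 corner of the last matrix above, kept tab-separated
# exactly as in the tables (a header row, then one labelled row per species).
SAMPLE = (
    "\t1\t2\t3\t4\n"
    "1\t000\t250\t375\t260\n"
    "2\t250\t000\t293\t184\n"
    "3\t375\t293\t000\t322\n"
    "4\t260\t184\t322\t000\n"
)

def parse_matrix(text):
    """Return (row_labels, matrix) for a tab-separated table with a header row."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    labels, matrix = [], []
    for line in lines[1:]:              # lines[0] is the column-number header
        cells = line.split("\t")
        labels.append(cells[0])
        matrix.append([int(c) for c in cells[1:]])   # "000" is read as 0
    n = len(matrix)
    if any(len(row) != n for row in matrix):
        raise ValueError("the table is not square")
    return labels, matrix

def asymmetric_pairs(matrix):
    """List (i, j, d_ij, d_ji), 1-based, for every pair with d_ij != d_ji."""
    n = len(matrix)
    return [(i + 1, j + 1, matrix[i][j], matrix[j][i])
            for i, j in combinations(range(n), 2)
            if matrix[i][j] != matrix[j][i]]

def triangles(matrix):
    """Yield ((i, j, k), (d_ij, d_ik, d_jk)) for every unordered triple.

    Assumption of this sketch: the upper-triangle entries are used
    whenever the two halves of the matrix disagree.
    """
    n = len(matrix)
    for i, j, k in combinations(range(n), 3):
        yield (i + 1, j + 1, k + 1), (matrix[i][j], matrix[i][k], matrix[j][k])

labels, sample_matrix = parse_matrix(SAMPLE)
for triple, sides in triangles(sample_matrix):
    print(triple, sides)
```

Applied to a full 32-row table, `triangles` yields one entry for each of the 4960 unordered triples, which is the number of triangles considered in this paper. Running `asymmetric_pairs` first is a cheap check of whether the table, as printed, is symmetric.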