Linguistic Communication Channels Reveal Connections between Texts: The New Testament and Greek Literature

Abstract: We studied two fundamental linguistic channels, the sentences channel and the interpunctions channel, and showed that they can reveal deeper connections between texts. The applied theory does not follow the current paradigm of linguistic studies. As a case study, we considered the Greek New Testament, with the purpose of determining mathematical connections between its texts and possible differences in the writing style (mathematically defined) of the writers and in the reading skill required of their readers. The analysis was based on deep-language parameters and communication/information theory. To set the New Testament texts in the larger Greek classical literature, we considered texts written by Aesop, Polybius, Flavius Josephus, and Plutarch. The results largely confirmed what scholars have found about the New Testament texts, therefore giving credibility to the theory. The Gospel according to John is very similar to the fables written by Aesop. Surprisingly, the Epistle to the Hebrews and Apocalypse are each other's "photocopies" in the two linguistic channels and are not linked to any of the other texts. These two texts deserve further study by historians of the early Christian church literature at the level of meaning, readers, and possible Old Testament texts that might have influenced them. The theory can guide scholars in studying any literary corpus.


A Mathematical Theory of Texts Outside the Paradigm of Natural Language Processing
In recent papers [1][2][3][4][5][6][7][8], we have developed a general theory of the deep-language mathematical structure of literary texts (or of any long text), including their translation. The theory is based on suitably defined linguistic communication channels, which are always contained in texts, on the theory of regression lines [9,10], and on Shannon's communication and information theory [11].
In our theory, "translation" means not only the conversion of a text from one language to another (what is properly understood as translation) but also how some linguistic parameters of a text are related to those of another text, either in the same language or in another language. "Translation", therefore, also refers to the case in which a text is mathematically compared (metaphorically "translated") with another text, whatever the language of the two texts [2].
The theory does not follow the current paradigm of linguistic studies. Most studies on the relationships between texts concern translation because of the importance of automatic translation. References [12][13][14][15][16][17][18] report results not based on mathematical analyses of texts, as our theory is, and when a mathematical approach is used, most of these studies consider neither Shannon's communication theory nor the fundamental connection that some linguistic variables seem to have with reading ability and short-term memory (STM) capacity [1][2][3][4][5][6][7][8]. In fact, these studies are mainly concerned with automatic translation, not with the high-level direct response of human readers, as our theory is. Very often, they refer to only one very limited linguistic variable, not to sentences that convey a completely developed thought or to deep-language parameters, as our theory does.
The theory allows one to perform experiments with ancient readers (otherwise impossible) or with modern readers by studying the literary texts of their epoch. These "experiments" can reveal unexpected similarities and dependences between texts because they consider mathematical parameters not consciously controlled by writers, either ancient or modern, as we also show in the present paper.
In addition to the total number of characters, words, sentences, and interpunctions (punctuation marks) of a text, the linguistic parameters considered in our theory are the number of words $n_W$ per chapter, the number of sentences $n_S$ per chapter, and the number of interpunctions $n_I$ per chapter. Instead of referring to chapters, the analysis can refer to any chosen subdivision of a literary text that is large enough to provide reliable statistics, such as a few hundred words [1][2][3][4][5][6][7][8].
We also consider four important deep-language parameters, calculated in each chapter (or in any large-enough block of text): characters per word $C_P$, words per sentence $P_F$, words per interpunction $I_P$, and interpunctions per sentence $M_F = P_F / I_P$ (this variable gives the number of $I_P$s contained in a sentence).
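As a minimal sketch of how these four parameters follow from the raw counts of one block of text, consider the function below. The function name, the argument names, and the sample counts are illustrative assumptions, not taken from the paper.

```python
def deep_language_parameters(n_chars, n_words, n_sentences, n_interpunctions):
    """Return (C_P, P_F, I_P, M_F) for one block of text from raw counts.

    Hypothetical helper: the names are illustrative only.
    """
    c_p = n_chars / n_words            # characters per word
    p_f = n_words / n_sentences        # words per sentence
    i_p = n_words / n_interpunctions   # words per interpunction (word interval)
    m_f = p_f / i_p                    # interpunctions per sentence
    return c_p, p_f, i_p, m_f


# Made-up counts for one chapter: 4000 characters, 800 words,
# 40 sentences, 100 interpunctions.
c_p, p_f, i_p, m_f = deep_language_parameters(4000, 800, 40, 100)
```

With these made-up counts, the ratio of words per sentence to words per interpunction equals the number of interpunctions divided by the number of sentences, which is why the last parameter counts the word intervals contained in a sentence.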
The parameter $I_P$ (words per interpunction), also referred to as the "words interval" (i.e., an "interval" measured in words [1]), is very likely linked to readers' STM capacity [52], and it can be used to study how much two populations of readers of different languages overlap in reading a literary text in translation [7].
To study the chaotic data that emerge in any language, the theory compares a text (the reference, or input, text) with another text (the output text, "cross-channel") or with itself ("self-channel") through a complex communication channel. This channel consists of several parallel single channels [4], two of which are explicitly considered in the present paper, in which both input and output are affected by "noise", i.e., by diverse scattering of the data around a mean linear relationship, namely, a regression line.
In [3] we have shown how much of the mathematical structure of a literary text is saved or lost in translation. To make objective comparisons, we have defined a likeness index $I_L$, based on the probability and communication theory of noisy digital channels. We have shown that two linguistic parameters can be related by regression lines. This is a general feature of texts. If we consider the regression line linking a dependent variable to an independent variable in a reference text and the regression line linking the same parameters in another text, then the dependent variable of the first text can be linked to that of the second text with another regression line without explicitly calculating its parameters (slope and correlation coefficient) from the samples because the mathematical problem has the same structure as the theory developed in Reference [2].
In Reference [4] we have applied the theory of linguistic channels to show how an author shapes a character speaking to diverse audiences by diversifying and adjusting ("fine tuning") two important linguistic communication channels, namely, the sentences channel (S-channel) and the interpunctions channel (I-channel). The S-channel links $n_S$ of the output text to $n_S$ of the input text, for the same number of words. The I-channel links $n_I$ (i.e., the number of word intervals $I_P$) of the output text to $n_I$ of the input text, for the same number of sentences.
In Reference [5] we have further developed the theory of linguistic channels by applying it to Charles Dickens' novels and to other novels of the English literature and found, for example, that this author was very likely affected by King James' New Testament.
In Reference [6] we have defined a universal readability index, applicable to any alphabetical language, by including the readers' STM capacity, modeled by $I_P$; in Reference [7] we have studied the STM capacity across time and language, and in Reference [8] we have studied the readability of a text across time and language.
In this paper, as the title claims, we further study linguistic communication channels (namely, S-channels and I-channels) and show that they can reveal deeper connections between texts. As a case study, we consider an important historical literary corpus, the Greek New Testament (NT), with the purpose of determining the mathematical connections between its books (in the following referred to as "texts") and possible differences in the writing style (mathematically defined) of writers and in the reading skill required of their readers. To set the NT texts in the Greek classical literature, we have considered texts written by Aesop, Polybius, Flavius Josephus, and Plutarch.
The analysis is based on the deep-language parameters and communication channels mentioned above, which were not explicitly known to the ancient writer/reader and are not known to any modern writer/reader unacquainted with this theory.
After this introductory section, Section 2 recalls and defines the deep-language parameters of texts, Section 3 recalls the vector representation of texts, Section 4 summarizes the theory of linguistic communication channels, Section 5 defines the theoretical signal-to-noise ratio in linguistic channels (S-channels and I-channels), Section 6 defines the experimental signal-to-noise ratio in these channels, Section 7 recalls the likeness index of texts and defines the channels quadrants, Section 8 presents a concise synthesis of the main findings, and Section 9 concludes and suggests future work. Appendices A and B report numerical tables.

Deep-Language Parameters of Texts
The original NT Greek texts were first processed manually to delete all notes, titles, and other textual material added by modern editors, leaving in the end only the original texts, as was done in Reference [53]. The original Greek texts of the New Testament were downloaded from Tyndale House Greek New Testament (THGNT), BibleGateway.com (last accessed on 31 May 2023).
Interpunctions were introduced by ancient readers acting as "editors" [54]. They were well-educated readers of the early Christian Church and very respectful of the original text and its meaning; therefore, they likely maintained a correct subdivision into sentences and word intervals within sentences, so as not to distort the correct meaning and emphasis of the text. In other words, we can reasonably assume that interpunctions were effectively introduced by the author.
In Reference [53], we compared the Gospels according to Matthew (Mt), Mark (Mk), Luke (Lk), and John (Jh) and the book of Acts (Ac) by considering only deep-language parameters, not S-channels and I-channels, as we do in this paper. Moreover, we have now enlarged our case study by including the Epistle to the Hebrews (Hb) and Apocalypse (Ap, also known as Revelation), texts that show unexpected connections, and some texts written by the historians Polybius (Po), Plutarch (Pl), and Flavius Josephus (Fl) and by the story-teller Aesop (Ae) to set the NT in the larger classical Greek literature. These texts were downloaded from Greek and Roman Materials (tufts.edu) (last accessed on 31 May 2023).
The theory is very robust against slightly different versions of the Greek texts (e.g., of the New Testament) because it never considers meaning. If a word is missing or is substituted with another one in the NT texts, or if a short passage is not present in a version, the statistical analysis is not significantly affected. This applies also to the quality of the Greek used both in the NT texts and in Josephus. This is a strength of the theory.
The samples used in the statistical analysis refer to chapters: for example, Matthew has 28 chapters; therefore, this text is described by 28 samples for each deep-language parameter. The lists of names ("genealogy" of Jesus of Nazareth) in Matthew and in Luke have been deleted so as not to bias the statistical results. As in References [1][2][3][4][5][6][7][8]53], samples were statistically weighted with the fraction of total words; therefore, in Matthew, which contains 18,121 total words, Chapter 5, for example, has 824 words, and its weight is 824/18121 ≈ 0.0455, not 1/28 ≈ 0.0357. This choice is mandatory to prevent a short chapter (or, in general, a short text) from affecting the statistical results as much as a long one.
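The word-fraction weighting described above can be sketched as follows. This is an illustrative helper under stated assumptions: the function name is hypothetical and the two-chapter data in the usage lines are made up; only the Matthew figures (824 words in Chapter 5, 18,121 words in total) come from the text.

```python
def word_weighted_mean(values, words_per_chapter):
    """Mean of per-chapter values, each weighted by its fraction of total words.

    Hypothetical helper name; the weighting rule is the one stated in the text.
    """
    total_words = sum(words_per_chapter)
    return sum(v * w / total_words for v, w in zip(values, words_per_chapter))


# Weight of Matthew's Chapter 5, using the figures quoted in the text.
weight_ch5 = 824 / 18121

# Made-up two-chapter example: the longer chapter dominates the mean.
example_mean = word_weighted_mean([10.0, 20.0], [100, 300])
```

In the made-up example the second chapter holds three quarters of the words, so the weighted mean (17.5) lies closer to its value than the plain average (15) would.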
After this processing, we obtained the mean values of $C_P$, $P_F$, $I_P$, and $M_F$ reported in Table 1, together with the universal readability index $G_U$, defined and discussed in Reference [6]. In Equation (1) we set the reference constant to 4.48, the mean value found in the Italian literature, since Italian is the reference language in the definition of $G_U$ [1].

To set the NT texts in the Greek classical literature, we have considered texts written by Aesop, Polybius, Flavius Josephus, and Plutarch. The rationale for selecting these authors is the following: Aesop wrote texts (Fables) that may recall the parables of the Gospels for their brevity and similar narrative purpose and style, while Polybius, Flavius Josephus, and Plutarch were historians and therefore wrote essays narrating facts, as the Gospels partially do and as Acts especially does. Table 2 lists the texts and the mean values of the deep-language parameters of these authors. These texts have been processed manually like the NT.

The mean values of Tables 1 and 2 can be used for a first assessment of how "close", or mathematically similar, texts are in a Cartesian plane, by defining a linear combination of deep-language parameters. Texts are then modeled as vectors, the representation of which is discussed in detail in [1][2][3][4][5][6] and briefly recalled in the next section.
In this Cartesian plane, two texts are likely connected (their vectors have close ending points) if their relative Pythagorean distance is small, and they are likely not connected if their distance is large. In other words, a small distance means that the texts share a similar mathematical structure. This is a necessary, but not sufficient, condition for two texts to be very likely connected to each other.
In Figure 1, the three synoptic Gospels (Mt, Mk, and Lk) are the closest texts of the NT. In particular, Mt and Lk are practically coincident, almost a mathematical "photocopy" of each other, as was also shown, with a different analysis, in References [1,2]. Notice also that $G_U$ (Table 1) is very similar for the synoptics but not for the other NT texts (except Hebrews) and that John (Jh) is the most readable text.
Acts and Luke, although written by the same author (as widely accepted by scholars; see References [55,56], a very small selection of the huge body of literature on this topic), are quite different. When Luke writes the Gospel, he is significantly constrained because his sources are very likely shared with Matthew. When he writes Acts, he has few or no sources to share with Matthew; therefore, he is free to use his personal writing style, oriented to narrating the early facts of the church. It is not surprising, therefore, that Acts, because of its contents, is closer to Plutarch and Polybius than to the synoptics and that its $G_U = 41.37$ is close to that of Plutarch's Parallel Lives, $G_U = 45.53$ (Tables 1 and 2), which sheds some light on the similar reading skill required of the readers of these historical narrations.
John is distinctly different from Matthew, Luke, and Mark, but it is very close to Aesop's Fables.
The vicinity of Hebrews and Apocalypse, two NT texts that scholars rarely consider to be connected [57][58][59][60], is unexpected, as is their great distance from the Gospels. Their universal readability indices are also very similar, $G_U = 53.10$ for Hebrews and $G_U = 49.46$ for Apocalypse.
As for the Greek historians, we can notice that they are distinctly grouped and distant from the Gospels.
In conclusion, the vector modeling of texts can reveal first connections that are otherwise hidden. These connections can be further addressed by studying their S-channels and I-channels and the likeness index $I_L$. Therefore, in the next section we first recall the theory of linguistic communication channels.

Theory of Linguistic Communication Channels
In a text, an independent (reference) variable $x$ (e.g., $n_W$ in S-channels) and a dependent variable $y$ (e.g., $n_S$) can be related by a regression line (slope $m$) passing through the origin of the Cartesian coordinates:

$$y = m x \tag{3}$$

Let us consider two different texts $Y_k$ and $Y_j$. For both we can write Equation (3) for the same pair of parameters; however, in both cases, Equation (3) does not give the full relationship between the two parameters because it links only the mean conditional values. We can write more general linear relationships, which take into account the scattering of the data around the regression lines (slopes $m_k$ and $m_j$); this scattering is measured by the correlation coefficients $r_k$ and $r_j$, not considered in Equation (3), and is modeled by the additive noise terms $n_k$ and $n_j$:

$$y_k = m_k x + n_k, \qquad y_j = m_j x + n_j \tag{4}$$

We can compare two texts by eliminating $x$. In other words, we compare the output variable $y$ for the same value of the input variable $x$ in the two texts. In the example just mentioned, we can compare the number of sentences in two texts, for an equal number of words, by considering not only the mean relationship (Equation (3)) but also the scattering of the data (Equation (4)).
As recalled before, we refer to this communication channel as the "sentences channel" and to this processing as "fine tuning" because it deepens the analysis of the data and provides more insight into the relationship between two texts.The mathematical theory follows.
By eliminating $x$ from Equation (4), we obtain the linear relationship between the sentences in text $Y_k$ (now the reference, input text) and the sentences in text $Y_j$ (now the output text):

$$y_j = \frac{m_j}{m_k}\, y_k + n_{jk} \tag{5}$$

Compared with the independent (input) text $Y_k$, the slope $m_{jk}$ is given by

$$m_{jk} = \frac{m_j}{m_k} \tag{6}$$

The noise source that produces the correlation coefficient between $y_j$ and $y_k$ is given by

$$n_{jk} = -\frac{m_j}{m_k}\, n_k + n_j = -m_{jk}\, n_k + n_j \tag{7}$$

The "regression noise-to-signal ratio", $R_m$, due to $m_{jk} \neq 1$, of the channel is given by [2]

$$R_m = (m_{jk} - 1)^2 \tag{8}$$

The unknown correlation coefficient $r_{jk}$ between $y_j$ and $y_k$ is given by [2,9]

$$r_{jk} = \cos\left| \arccos(r_j) - \arccos(r_k) \right| \tag{9}$$

The "correlation noise-to-signal ratio", $R_r$, due to $|r_{jk}| < 1$, of the channel that connects the input text $Y_k$ to the output text $Y_j$ is given by [1]

$$R_r = \frac{1 - r_{jk}^2}{r_{jk}^2}\, m_{jk}^2 \tag{10}$$

Because the two noise sources are disjoint and additive, the total noise-to-signal ratio of the channel connecting text $Y_k$ to text $Y_j$ is given by

$$R = R_m + R_r = (m_{jk} - 1)^2 + \frac{1 - r_{jk}^2}{r_{jk}^2}\, m_{jk}^2 \tag{11}$$

Notice that Equation (11) can be represented graphically [2] to study the impact of $m_{jk}$ and $r_{jk}$ on $R$. Finally, the total signal-to-noise ratio is given by

$$\Gamma_{th} = 10 \log_{10} \frac{1}{R} \tag{12}$$

The last expression is in dB. Notice that no channel can yield $|r_{jk}| = 1$ and $m_{jk} = 1$ (i.e., $\Gamma_{th} = \infty$), a case referred to as the ideal channel, unless a text is compared with itself (self-comparison, self-channel). In practice, we always find $|r_{jk}| < 1$ and $m_{jk} \neq 1$. The slope $m_{jk}$ measures the multiplicative "bias" of the dependent variable compared with the independent variable; the correlation coefficient $r_{jk}$ measures how "precise" the linear best fit is.
In conclusion, the slope $m_{jk}$ is the source of the regression noise, and the correlation coefficient $r_{jk}$ is the source of the correlation noise of the channel.
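The chain from the two regression lines to the signal-to-noise ratio in dB can be sketched in a few lines. This is a minimal illustration that assumes the standard form of the channel relations (slope ratio, arccosine difference for the correlation coefficient, and the sum of regression and correlation noise); the function name and the numerical slopes and correlation coefficients are made up for illustration.

```python
import math


def channel_snr_db(m_k, r_k, m_j, r_j):
    """Theoretical signal-to-noise ratio (dB) of the channel from input text k
    (slope m_k, correlation r_k) to output text j (slope m_j, correlation r_j).

    Hypothetical helper: names are illustrative only.
    """
    m_jk = m_j / m_k                                        # channel slope
    r_jk = math.cos(abs(math.acos(r_j) - math.acos(r_k)))   # channel correlation
    regression_noise = (m_jk - 1.0) ** 2                    # due to m_jk != 1
    correlation_noise = (1.0 - r_jk ** 2) / r_jk ** 2 * m_jk ** 2
    total_noise = regression_noise + correlation_noise
    if total_noise == 0.0:
        return math.inf                                     # ideal (self) channel
    return 10.0 * math.log10(1.0 / total_noise)


# Made-up values: same correlation, slopes differing by 10%.
forward = channel_snr_db(m_k=0.050, r_k=0.90, m_j=0.055, r_j=0.90)
backward = channel_snr_db(m_k=0.055, r_k=0.90, m_j=0.050, r_j=0.90)
ideal = channel_snr_db(m_k=0.050, r_k=0.90, m_j=0.050, r_j=0.90)
```

With equal correlation coefficients only the regression noise remains, so a 10% slope mismatch gives about 20 dB in one direction and a slightly different value in the other, which illustrates why the channel is not symmetric in input and output.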
In the next section we study how sentences and interpunctions build S-channels and I-channels and calculate their theoretical signal-to-noise ratio.

S-Channels and I-Channels: Theoretical Signal-to-Noise Ratio $\Gamma_{th}$
In S-channels the number of sentences of two texts is compared for the same number of words. Therefore, they describe how many sentences the writer of text $Y_j$ uses to convey a meaning, compared with the writer of text $Y_k$ (who may, of course, convey a different meaning), by using the same number of words. Simply stated, it is all about how a writer shapes his/her style in communicating the full meaning of a sentence with a given number of words available; therefore, it is more linked to $P_F$ than to other parameters.
In I-channels the number of word intervals $n_I$ of two texts is compared for the same number of sentences. Therefore, they describe how many short texts (the text between two contiguous punctuation marks) two writers use to make a full sentence. Since $I_P$ is connected with short-term memory [1], I-channels are more related to readers' STM capacity than to authors' style.
Finally, notice that the universal readability index, Equation (1), depends on both $P_F$ and $I_P$; therefore, it can better measure reading difficulty, as discussed in Reference [6].
To apply the theory of Section 4, we need the slope $m$ and the correlation coefficient $r$ of the regression line between (a) $n_S$ and $n_W$ to study S-channels and (b) $n_I$ and $n_S$ to study I-channels. We first consider the NT and then the texts from the Greek literature.

New Testament
Figures 2 and 3 show the scatterplots and regression lines linking $n_S$ to $n_W$, and Figures 4 and 5 show those linking $n_I$ to $n_S$. By looking at these figures, we can see at a glance which texts have very similar regression lines, but it is more difficult to see whether the scattering of the data is similar or not.

Regression lines, however, consider and describe only one aspect of the linear relationship, namely, that concerning (conditional) mean values. They do not consider the other aspect of the relationship, namely, the scattering of the data, which may not be similar when two regression lines almost coincide, as is clearly shown in Figure 2 for Mark and John, for Matthew and Luke, and for Hebrews and Apocalypse. The theory of linguistic channels (Section 4), on the contrary, by considering both slopes and correlation coefficients, provides a reliable tool to fully compare two sets of data and can confirm the findings shown in Figure 1.
As an example, Table 4 reports the calculated values of $m_{jk}$ (Equation (6)) and $r_{jk}$ (Equation (9)) in S-channels and in I-channels by assuming Matthew as the output text and the others as input texts. For instance, the number of sentences in Matthew (text $Y_j$) is linked to the sentences in Luke (text $Y_k$), for the same number of words, with a regression line whose slope and correlation coefficient are reported in Table 4. Let us examine in detail some results. ... Miller's law [52] because as sentences grow long, the writer, who is, of course, also a reader of his/her own text, unconsciously introduces more interpunctions, therefore limiting $I_P$ to Miller's range [1]. Consequently, $I_P$ is longer in Acts (8.77) than in Luke (7.11).
Hebrews and Apocalypse are completely disconnected from the other NT texts in the S-channel but not from each other. These two texts unexpectedly coincide in the S-channels, in both the slope and the correlation coefficient (Tables 7 and 8). This coincidence produces very large signal-to-noise ratios (Tables 5 and 6), namely, $\Gamma_{th} = 42.61$ dB in Hebrews→Apocalypse and $\Gamma_{th} = 42.68$ dB in Apocalypse→Hebrews, practically the same value (i.e., about 18,500 in linear units). The texts share the same style ($P_F = 32$ in Hebrews and $P_F = 30.70$ in Apocalypse); therefore, the two datasets, in this channel, seem to be produced by the same source. In the I-channel, Hebrews and Apocalypse are also completely disconnected from the other NT texts, but they are significantly connected to each other because $\Gamma_{th} = 15.25$ dB in Hebrews→Apocalypse and $\Gamma_{th} = 13.92$ dB in Apocalypse→Hebrews. Finally, notice that the four Gospels are closer to each other than to the other texts.

Greek Literature
For the Greek literature, Table 9 reports the slope $m$ and the correlation coefficient $r$ of the regression lines of $n_S$ versus $n_W$ and of $n_I$ versus $n_S$. Table 10 (S-channels) and Table 11 (I-channels) report $\Gamma_{th}$. The data referring to John are also reported for comparison with Aesop's Fables because of their vicinity in the vector plane (Figure 1).

Let us examine the connection of John with Fables. Figure 6 shows the scatterplots and regression lines between $n_W$ (words, independent variable) and $n_S$ (sentences, dependent variable) in John (cyan triangles and cyan line) and in Aesop (magenta circles and magenta line). Notice that the two regression lines are practically superposed, and the scattering of the two sets is very similar. Figure 7 shows the scatterplots and regression lines between $n_S$ (sentences, independent variable) and $n_I$ (interpunctions, dependent variable) in John (cyan triangles and cyan line) and in Aesop (magenta circles and magenta line). In this case, it is clear that they do not share the slope. John and Aesop share a large $\Gamma_{th}$ in the S-channel and a significant $\Gamma_{th}$ in the I-channel; therefore, this "fine tuning" clarifies that the vicinity of the two ending points in Figure 1 is due more to a shared style than to a shared readers' STM capacity.
In conclusion, the S-channel results suggest that John's style was likely affected by Fables, or by that particular type of story-telling, while the I-channel results suggest that John's readers were not different, as far as their STM capacity is concerned, from the readers of the other texts listed (see the last column in Table 11).
As for the historians, Flavius Josephus shares the style of Polybius more than that of the other writers (Table 10), and his readers share the same STM capacity as Polybius' readers since $\Gamma_{th} = 30.80$ dB in the I-channel Polybius → Flavius Josephus and $\Gamma_{th} = 30.56$ dB in Flavius Josephus → Polybius (Table 11).

Issues and Solutions
At this stage, however, as discussed in Reference [3], important issues arise, likely due to the small sample size used in calculating the regression line parameters, especially for the NT texts, and some questions must be answered.
Is the large and unexpected $\Gamma_{th}$ in the channels Hebrews↔Apocalypse just due to chance, or is it due to a real likeness of the two texts? How can we assess whether these values are reliable? It is practically impossible to estimate the probability distributions of the parameters $m$ and $r$ of the regression lines of Table 3 because the texts available are very few. If Matthew had written, say, hundreds of texts, then we could attempt an analysis based on probability, but this is not the case, of course, and we are in the same situation for many ancient or modern authors.
In fact, because of the small sample size used in calculating a regression line, the slope $m$ and the correlation coefficient $r$, being stochastic parameters, are characterized by mean values and standard deviations that depend on the sample size [9]. Obviously, the theory would yield more precise estimates of the signal-to-noise ratio $\Gamma_{th}$ for larger sample sizes, as can be assumed for the Greek literature.
With a small sample size, the standard deviations of $m$ and $r$ can give too large a variation in $\Gamma_{th}$ (see the sensitivity of this parameter to $m$ and $r$ in [3]). To avoid this inaccuracy, which is due to the small sample size and not to the theory of Section 4, we have defined and discussed in [3,4] a "renormalization" of the texts and their subsequent analysis, based on Monte Carlo (MC) simulations of multiple texts attributed to the same writer, whose results can be considered "experimental". Therefore, in the case of texts with small sample sizes for which we suspect that $\Gamma_{th}$ is due only to chance, as may be the case with Hebrews and Apocalypse, the results of the simulation can replace the theoretical values.
In addition to the usefulness of the simulation as a "renormalization" tool, the generated new texts have another property, very likely more interesting. Since the mathematical theory does not consider meaning, these new texts could have been "written" by the author because they maintain the main statistical properties of the original text. In other words, they are "literary texts" that the author might have written at the time when he/she wrote the original text. Based on this hypothesis, we can consider a large number of texts for each author. With this strategy, we think we have solved these issues in Reference [3]. In the next section we recall the rationale of the MC simulation.

S-Channels and I-Channels: Experimental Signal-to-Noise Ratio $\Gamma_{ex}$
In this section, after recalling the Monte Carlo simulation steps to obtain the new texts attributed to the same author, we examine S-channels and I-channels.

Multiple Versions of a Text: Monte Carlo Simulation
Let the literary text $Y_j$ be the "output", of which we consider $n$ disjoint block texts (e.g., chapters), and let us compare it with a particular input literary text $Y_k$ characterized by a regression line, as detailed in Section 4. The steps of the MC simulation are the following (here explicitly described for S-channels):
1. Generate $n$ independent integers ($n$ being the number of disjoint block texts, e.g., chapters, 28 in Matthew) from a discrete uniform probability distribution in the range 1 to $n$, with replacement, i.e., a block text can be selected more than once.
2. "Write" another "text $Y_j$" with $n$ new block texts, e.g., the sequence 2, 1, $n$, $n-2$, and so on; hence, take block text 2, followed by block text 1, block text $n$, block text $n-2$, up to $n$ block texts. A block text can appear twice (with probability $1/n^2$), three times (with probability $1/n^3$), etc., and the new "text $Y_j$" can contain a number of words greater or smaller than the original text, on average; however, the differences are small and do not affect the final statistical results and analysis.
3. Calculate the parameters $m_j$ and $r_j$ of the regression line between words (independent variable) and sentences (dependent variable) in the new "text $Y_j$", namely, Equation (3).
4. Compare $m_j$ and $r_j$ of the new "text $Y_j$" (output, dependent text) with any other text (input, independent text, $m_k$ and $r_k$), in the "cross-channels" so defined, including the original text $Y_j$ (this latter case is referred to as the "self-channel").
5. Calculate $m_{jk}$, $r_{jk}$, and the signal-to-noise ratio $\Gamma$ of the cross-channels, or of the self-channel, according to the theory of Section 4.
6. Consider the signal-to-noise ratios obtained as "experimental" results.
7. Repeat steps 1 to 6 many times to obtain reliable results (we have repeated the sequence 5000 times, ensuring a standard deviation of the mean value of less than about 0.1 dB).
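The steps above can be sketched as a short resampling loop. This is an illustration under stated assumptions, not the authors' code: samples are unweighted (the paper weights them by word fraction), the slope is the least-squares slope through the origin, the correlation coefficient is Pearson's, non-finite signal-to-noise values are discarded, and the two "texts" are synthetic data generated only to make the example runnable. All function names are hypothetical.

```python
import math
import random


def slope_and_r(blocks):
    """Slope through the origin and Pearson correlation for (words, sentences) pairs."""
    xs = [w for w, _ in blocks]
    ys = [s for _, s in blocks]
    m = sum(x * y for x, y in blocks) / sum(x * x for x in xs)
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in blocks)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return m, None                      # degenerate resample: r undefined
    return m, max(-1.0, min(1.0, cov / math.sqrt(var_x * var_y)))


def snr_db(m_k, r_k, m_j, r_j):
    """Channel signal-to-noise ratio (dB) from input (k) to output (j)."""
    m_jk = m_j / m_k
    r_jk = math.cos(abs(math.acos(r_j) - math.acos(r_k)))
    noise = (m_jk - 1) ** 2 + (1 - r_jk ** 2) / r_jk ** 2 * m_jk ** 2
    return math.inf if noise == 0 else 10 * math.log10(1 / noise)


def monte_carlo_snr(input_blocks, output_blocks, runs=5000, seed=0):
    """Mean and standard deviation (dB) of the simulated signal-to-noise ratio."""
    rng = random.Random(seed)
    m_k, r_k = slope_and_r(input_blocks)
    n = len(output_blocks)
    values = []
    for _ in range(runs):
        # Steps 1-2: draw n block indices with replacement and "write" the new text.
        new_text = [output_blocks[rng.randrange(n)] for _ in range(n)]
        # Step 3: regression parameters of the new text.
        m_j, r_j = slope_and_r(new_text)
        if r_j is None:
            continue
        # Steps 4-5: channel from the input text to the new output text.
        gamma = snr_db(m_k, r_k, m_j, r_j)
        if math.isfinite(gamma):            # assumption: drop exact-copy resamples
            values.append(gamma)
    mean = sum(values) / len(values)
    std = math.sqrt(sum((g - mean) ** 2 for g in values) / len(values))
    return mean, std


def synthetic_text(rng, n_blocks, words_per_sentence):
    """Made-up (words, sentences) pairs; not real data from any text."""
    blocks = []
    for _ in range(n_blocks):
        words = rng.randint(400, 900)
        sentences = max(1, round(words / words_per_sentence + rng.gauss(0, 2)))
        blocks.append((words, sentences))
    return blocks


data_rng = random.Random(1)
text_a = synthetic_text(data_rng, 20, 20)   # about 20 words per sentence
text_b = synthetic_text(data_rng, 20, 30)   # about 30 words per sentence
self_mean, self_std = monte_carlo_snr(text_a, text_a, runs=1000)
cross_mean, cross_std = monte_carlo_snr(text_a, text_b, runs=1000)
```

In this synthetic setting the self-channel mean is larger than the cross-channel mean because the two made-up texts were generated with clearly different sentence lengths; with real texts the two values may be much closer.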
In conclusion, the MC simulation substitutes for a probability study of the joint density function of $m$ and $r$ on real texts, which are not available in such a large number. Let us now apply the MC simulation to the NT texts.

From Figure 8, for example, or from Appendix A, in S-channels we can notice that if the input is Matthew and the output is Luke (blue line), then $\Gamma_{ex} = 20.52$ dB; vice versa, if the input is Luke and the output is Matthew (black line), then $\Gamma_{ex} = 19.68$ dB. If the input is Matthew and the output is Matthew (self-channel), then $\Gamma_{ex} = 25.01$ dB. In this case we compare Matthew with 5000 "new" Matthews obtained randomly. Notice that the self-channel value exceeds both cross-channel values. The Gospels are clearly distinguishable from the other texts, especially from Hebrews and Apocalypse, which can be confused with each other. Notice that $\Gamma_{ex} = 15.66$ dB for Hebrews and $\Gamma_{ex} = 19.76$ dB for Apocalypse are very similar to $\Gamma_{ex} = 15.73$ dB and $\Gamma_{ex} = 19.64$ dB, respectively; therefore, the striking theoretical similarity of the two texts found in Section 5 (Table 5) is confirmed.

S-Channels and I-Channels
Notice that the Gospels differ quite significantly from Acts, Hebrews, and Apocalypse and that they are very similar to each other, therefore confirming, with this "fine-tuning", the findings shown in Figure 1.
Let us discuss the results for I-channels (lower panel). For example, if the input is Matthew and the output is Luke, then Γ , . The Gospels are very similar to each other and are clearly distinguished from Hebrews and Apocalypse, therefore confirming, also in this channel, what is shown in Figure 1. Finally, notice that also in the I-channel, Hebrews and Apocalypse are always the most similar texts.
In the next sub-section we compare $\Gamma_{ex}$ with $\Gamma_{th}$ because this comparison gives fundamental insight into the range in which $\Gamma_{th}$ is reliable.

$\Gamma_{ex}$ Versus $\Gamma_{th}$ and Minimum Reliable Range of $\Gamma_{th}$
As done in Reference [3], it is very interesting to compare $\Gamma_{ex}$ with $\Gamma_{th}$. This comparison gives the minimum range in which $\Gamma_{th}$ is reliable.
Figure 9 shows $\Gamma_{ex}$ versus $\Gamma_{th}$ in S-channels, for self- and cross-channels (a), and the difference $\Gamma_{th} - \Gamma_{ex}$ versus $\Gamma_{th}$ (b). This difference represents the ratio (expressed in dB) between the noise power in the experimental channel and that in the theoretical channel. As in Reference [3], we notice that the two signal-to-noise ratios are very well correlated up to a maximum value set by $\Gamma_{ex}$, presently at about 20 to 22 dB (horizontal asymptote), beyond which $\Gamma_{ex}$ cannot follow the large increase in $\Gamma_{th}$, which reaches about 42 dB in Hebrews and Apocalypse. Figure 10 shows $\Gamma_{ex}$ versus $\Gamma_{th}$ and $\Gamma_{th} - \Gamma_{ex}$ versus $\Gamma_{th}$ in I-channels. We notice the same behavior as in S-channels but with the asymptote set at about 24 dB.
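Because both signal-to-noise ratios are expressed in dB, their difference converts to a linear noise-power ratio with a single exponentiation. The sketch below assumes equal signal power in the two channels; the function name is hypothetical, and the 42 dB and 22 dB inputs are the approximate values quoted in the text for Hebrews and Apocalypse and for the S-channel asymptote.

```python
def noise_power_ratio(gamma_th_db, gamma_ex_db):
    """Linear ratio of experimental to theoretical noise power,
    assuming the same signal power in both channels."""
    return 10 ** ((gamma_th_db - gamma_ex_db) / 10)


# Approximate values from the text: theory near 42 dB, asymptote near 22 dB.
ratio_at_saturation = noise_power_ratio(42.0, 22.0)
ratio_when_equal = noise_power_ratio(20.0, 20.0)
```

A gap of 20 dB therefore corresponds to experimental noise about one hundred times larger than the theoretical noise, while equal values correspond to a ratio of one.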
From these figures we can draw the following conclusions:
1) There is a horizontal asymptote that sets the maximum reliable value of $\Gamma_{th}$, given by the largest $\Gamma_{ex}$.
2) In this range the MC simulation is not indispensable because $\Gamma_{th}$, calculated from Equation (12), is reliable. However, MC simulations are very useful to calculate the likeness index [3], which is based on a large number of texts an author might have written.
3) The theory can predict large values, as in Hebrews and Apocalypse, but we may suspect they are just due to chance because of the large sensitivity of $\Gamma_{th}$ to slopes and correlation coefficients, as discussed in Reference [3]. Therefore, a cautionary (pessimistic) choice is to assume $\Gamma_{th} \approx \Gamma_{ex}$.
4) The difference $\Gamma_{th} - \Gamma_{ex}$, i.e., the ratio (expressed in dB) between the noise power in the experimental channel and that in the theoretical channel, tends to be constant before saturation; afterward, it increases linearly, therefore indicating the end of the reliable range of $\Gamma_{th}$.
In the next section we calculate the likeness index of texts and define a useful graphical tool, the "channels quadrants".

Likeness Index of Texts and Channels Quadrants
In Reference [3] we explored a way of comparing the signal-to-noise ratios Γ of self- and cross-channels objectively, possibly obtaining more insight into the texts' mathematical likeness. In comparing a self-channel with a cross-channel, the probability of mistaking one text for another is a binary problem, because a decision must be made between two alternatives. The problem is classical in binary digital communication channels affected by noise. In digital communication, "error" means that bit 1 is mistaken for bit 0 or vice versa; therefore, the channel performance worsens as the error frequency (i.e., the error probability) increases. However, in linguistic self- and cross-channels, "error" means that a text can be more or less mistaken for, or confused with, another text; consequently, two texts are more similar as the "error probability" increases. Therefore, a large error probability means that two literary texts are mathematically similar.
We first recall the theory of the likeness index and then define the "channels quadrants", a graphical tool that classifies texts, with the aim of showing how well the writers' style and the readers' STM capacity are matched.

Likeness Index
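In digital communication channels affected by noise, the probability of error is given by [3]

$$ p_e = \frac{1}{2}\int_{\Gamma_T}^{\infty} f_c(x)\,dx + \frac{1}{2}\int_{-\infty}^{\Gamma_T} f_s(x)\,dx \qquad (13) $$

In Equation (13), the signal-to-noise ratios of the cross-channel and of the self-channel are modeled as Gaussian random variables, with the mean and standard deviation given in Appendix A. The decision threshold, Γ_T, is given by the intersection of the two known probability density functions f_c (cross-channel) and f_s (self-channel). The integral limits are fixed as shown because, in general, the self-channel value is larger than the cross-channel value.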
If the error probability p_e = 0, there is no intersection between the two densities: their mean values are centered at −∞ and +∞, respectively, or the two densities have collapsed to Dirac delta functions. If p_e = 0.5, the two densities are identical, e.g., a self-channel is compared with itself. In conclusion, 0 ≤ p_e ≤ 0.5; therefore, if p_e = 0, the cross- and self-channels can be considered totally uncorrelated, and if p_e = 0.5, the self- and cross-channels coincide, and the two texts are mathematically identical.
The likeness index I_L is defined by

$$ I_L = 2 p_e \qquad (14) $$

The likeness index ranges over 0 ≤ I_L ≤ 1; I_L = 0 means totally uncorrelated texts, and I_L = 1 means totally correlated texts.
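The definitions above can be sketched numerically. The Python code below is a minimal illustration, not the author's implementation: it assumes equal weights of 1/2 on the two integrals, takes the likeness index as twice the error probability, and picks the density intersection closest to the midpoint of the two means as the decision threshold. All function names are hypothetical.

```python
import math


def gaussian_cdf(x: float, mean: float, std: float) -> float:
    """Cumulative distribution function of a Gaussian variable."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def decision_threshold(mean_c: float, std_c: float,
                       mean_s: float, std_s: float) -> float:
    """Intersection of the cross- and self-channel Gaussian densities."""
    midpoint = 0.5 * (mean_c + mean_s)
    if math.isclose(std_c, std_s):
        return midpoint  # equal spreads: densities cross at the midpoint
    # Equating the log-densities gives a*x^2 + b*x + c = 0.
    a = 1.0 / (2.0 * std_s**2) - 1.0 / (2.0 * std_c**2)
    b = mean_c / std_c**2 - mean_s / std_s**2
    c = (mean_s**2 / (2.0 * std_s**2) - mean_c**2 / (2.0 * std_c**2)
         + math.log(std_s / std_c))
    root = math.sqrt(max(b * b - 4.0 * a * c, 0.0))
    candidates = ((-b + root) / (2.0 * a), (-b - root) / (2.0 * a))
    # Assumption: use the intersection nearest to the midpoint of the means.
    return min(candidates, key=lambda x: abs(x - midpoint))


def likeness_index(mean_c: float, std_c: float,
                   mean_s: float, std_s: float) -> float:
    """Likeness index as twice the error probability (range 0 to 1)."""
    t = decision_threshold(mean_c, std_c, mean_s, std_s)
    p_cross_above = 1.0 - gaussian_cdf(t, mean_c, std_c)
    p_self_below = gaussian_cdf(t, mean_s, std_s)
    p_error = 0.5 * p_cross_above + 0.5 * p_self_below
    return 2.0 * p_error


# Illustrative (made-up) means and standard deviations in dB.
print(likeness_index(20.0, 6.0, 27.0, 7.0))
```

With identical densities the sketch returns 1, and with widely separated densities it returns a value close to 0, which matches the two limiting cases described in the text.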

Channels Quadrants
Some insight into the "fine-tuning" (i.e., matching the writers' style and the readers' STM capacity) and into the relationship between texts can be visualized through the "channels quadrants" shown in Figure 11, where each axis reports the likeness index I_L of one channel. In quadrant IV, the S-channels of two texts are significantly similar, and the texts coincide along the vertical line I_L = 1. Similarly, in quadrant II, the I-channels are significantly similar, and the texts coincide along the horizontal line I_L = 1. In quadrant III, the two texts can be considered unmatched, and completely uncorrelated at the origin (0,0). Finally, in quadrant I, the two texts are very much matched in both channels and fully matched at (1,1); therefore, at this point, the two texts are mathematically indistinguishable.
We can notice that only 19.0% of the cases have good matching in both channels (quadrant I), 21.4% have good matching only in the I-channel (quadrant II), 54.8% have poor matching in both channels (quadrant III), and 4.8% have good matching only in the S-channel (quadrant IV).
The marginal probabilities that the likeness index I_L exceeds 0.5 are 23.8% in the S-channel and 40.4% in the I-channel. This fact, together with the other percentages, marks some interesting differences between the S-channels and I-channels. Tables 12 and 13 report the average values of I_L of the two asymmetric channels (e.g., Matthew→Luke and Luke→Matthew; see Appendix B) in S-channels and in I-channels, respectively. We can notice that the mathematical similarity of Matthew and Luke, already observed, is further reinforced by noting that they are quite similar in both channels. Another interesting fact is the high likeness index between Mark and John, whose Greek, according to scholars [64,65], shares some similarities.
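The quadrant classification and the marginal percentages quoted above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the threshold is taken as a likeness index strictly larger than 0.5, and the function and variable names are hypothetical.

```python
def quadrant(il_s: float, il_i: float, threshold: float = 0.5) -> str:
    """Classify a pair of likeness indices (S-channel, I-channel)."""
    s_matched = il_s > threshold
    i_matched = il_i > threshold
    if s_matched and i_matched:
        return "I"    # good matching in both channels
    if i_matched:
        return "II"   # good matching only in the I-channel
    if s_matched:
        return "IV"   # good matching only in the S-channel
    return "III"      # poor matching in both channels


# Quadrant shares (percent) as reported in the text.
REPORTED = {"I": 19.0, "II": 21.4, "III": 54.8, "IV": 4.8}

# Marginals: the S-channel is matched in quadrants I and IV,
# the I-channel is matched in quadrants I and II.
s_marginal = REPORTED["I"] + REPORTED["IV"]
i_marginal = REPORTED["I"] + REPORTED["II"]
print(f"S-channel: {s_marginal:.1f}%, I-channel: {i_marginal:.1f}%")
```

The two sums reproduce the marginal values of 23.8% and 40.4% given in the text.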
For I-channels, there are confirmations and differences compared with S-channels. Recall that I-channels are more concerned with the readers' STM capacity than with the authors' style. The large likeness index between Hebrews and Apocalypse found in the S-channel is not confirmed in the I-channel, although it is large enough (0.697) to link the two groups of readers.
Very insightful is the large likeness index, 0.863, between Luke and Acts, both texts written by Luke, who very likely addressed, as already mentioned, similar groups of readers. Further, notice that Acts is very close to all other texts, except Hebrews and Apocalypse, which means that Acts likely addressed all the early Christians.
Finally, let us reconsider the vicinity of John to Aesop's Fables shown in Figure 1.The signal-to-noise ratio in the S-channel Aesop → John is Γ , 23 1 and 2).
In conclusion, the coincidence of John and Aesop in Figure 1 is a necessary condition for being similar, but only the fine-tuning provided by linguistic channels can fully reveal the nature of this similarity. In this example, John might have been inspired by the long tradition of short stories telling a truth, such as Aesop's Fables.

I-Channel Versus S-Channel: Hebrews and Apocalypse
According to Tables 12 and 13, Hebrews and Apocalypse are mathematically each other's "photocopies" in the S-channel and very similar in the I-channel; therefore, the styles of the two authors, as style is meant in this paper, coincide, and their readers share similar STM capacities. As already mentioned, the likeness of these texts is unexpected; therefore, it may be realistic to suppose that their writers and readers belonged to the same group of Jewish-Christians, an issue to be researched by scholars of the Greek language used in the NT and by historians of early Christianity.
In conclusion, the S-channel and the I-channel describe the deep mathematical joint structure of two texts, namely, the authors' styles and the readers' STM capacities required to read the texts. If both likeness indices are large, then the two texts are very similar. These mathematical results may be used to confirm, in a multidisciplinary approach, what scholars of humanistic disciplines find, and they can even suggest new paths of research, such as the relationship between the author and the readers of Hebrews and Apocalypse.

Synthesis of Main Results
At this point, the reader of the present paper may be overwhelmed by tables and figures. However, given the nature of a mathematical theory based on regression lines and linguistic channels, and given the many comparisons that can be carried out even in a small literary corpus such as the New Testament, these numbers and figures are the only means we know for supporting the partial conclusions reached in each section above. We can now attempt a final compact comparison based on one more table and figure.
Table 14 shows the most synthetic comparison of the NT texts, namely, the overall mean value of the likeness index, averaged from Tables 12 and 13. By assuming a likeness index of 0.5 as the threshold beyond which texts are reasonably similar, this threshold is exceeded in Luke-Matthew, Luke-Mark, John-Matthew, John-Mark, and Luke-Acts.
The couple Hebrews-Apocalypse is completely disconnected from the other texts, and their likeness index is the largest. We reiterate that these two texts deserve further study by historians of the early Christian church literature at the higher level of meaning, readers, and possible Old Testament texts that might have affected them, a task well beyond the knowledge of the present author. We now show that a likeness index of 0.5 carries a special meaning, besides defining the borders of the quadrants in Figure 12.
Figure 13 shows the scatterplot of the likeness index I_L of S-channels and I-channels versus the difference ΔΓ between the signal-to-noise ratio of the self-channel and that of the cross-channel, found in each channel, for all NT texts. The scatterplot suggests a tight inverse relationship between I_L and ΔΓ. A very similar scatterplot and tight relationship were also found for texts taken from the Italian literature [4], therefore suggesting that this relationship is "universal" for alphabetical texts. Notice that ΔΓ is the ratio (expressed in dB) between the noise, defined in Section 4, affecting a cross-channel and that found in the corresponding self-channel.
The value I_L = 0.5 is obtained from Equation (15) at ΔΓ = 6.50 dB, a value that practically coincides with the standard deviation of Γ in all cases, because this parameter ranges from 6 to 7 dB.
We can link this last observation to the quadrants of Figure 11. As a general rule, we can say that in quadrant I (I_L larger than 0.5 in both channels), we will always find texts whose cross-channel Γ is approximately 6~7 dB away from the corresponding self-channel Γ. In other words, a noise power ratio of 6~7 dB indicates that the two texts considered tend to be matched in both channels; therefore, it can be taken, with the vector representation of Figure 1, as a first objective assessment of the texts' likeness.

Conclusions
We studied two fundamental linguistic channels, namely the S-channel and the I-channel, and showed that they can reveal deeper connections between texts. As a study case, we considered the Greek New Testament, with the purpose of determining mathematical connections between its texts and possible differences in the writing style (mathematically defined) of the writers and in the reading skill required of their readers. The analysis is based on deep-language parameters and communication/information theory developed in previous papers.
Our theory does not follow the current paradigm of linguistic studies, which considers neither Shannon's communication theory nor the fundamental connection that some linguistic parameters have with the reading skill and short-term memory capacity of readers.
Table A1. S-channels. Experimental mean signal-to-noise ratio Γ (dB) and standard deviation (dB, in parentheses) in the channel between the (input) text indicated in the first column and the (output) text indicated in the first line.

Table A2 reports Γ (dB) and its standard deviation (dB, in parentheses) in the I-channel between the (input) text indicated in the first column and the (output) text indicated in the first line.

Figure 2. Scatterplots and regression lines between the number of words (independent variable) and the number of sentences (dependent variable) in the following texts: Matthew (green triangles and green line), Mark (black triangles and black line), Luke (blue triangles and blue line), and John (cyan triangles and cyan line).

Figure 3. Scatterplots and regression lines between the number of words (independent variable) and the number of sentences (dependent variable) in the following texts: Matthew (green triangles and green line), Acts (blue circles and blue line), Hebrews (red circles and red line), and Apocalypse (magenta circles and magenta line). The magenta line (Apocalypse) and the red line (Hebrews) are superposed because they practically coincide (see Table 3).

Figure 4. Scatterplots and regression lines between the number of sentences (independent variable) and the number of interpunctions (dependent variable) in the following texts: Matthew (green triangles and green line), Mark (black triangles and black line), Luke (blue triangles and blue line), and John (cyan triangles and cyan line).

Figure 5. Scatterplots and regression lines between the number of sentences (independent variable) and the number of interpunctions (dependent variable) in the following texts: Matthew (green triangles and green line), Acts (blue circles and blue line), Hebrews (red circles and red line), and Apocalypse (magenta circles and magenta line). The green line (Matthew) and the blue line (Acts) are superposed because they practically coincide (see Table 3).

Figure 6. Scatterplots and regression lines between the number of words (independent variable) and the number of sentences (dependent variable) in John (cyan triangles and cyan line) and in Aesop (magenta circles and magenta line). Notice that the two regression lines are practically superposed, and the scattering of the two sets is very similar.

Figure 7. Scatterplots and regression lines between the number of sentences (independent variable) and the number of interpunctions (dependent variable) in John (cyan triangles and cyan line) and in Aesop (magenta circles and magenta line).

Figure 8 .
Figure 8. Γ , and Γ , for each NT input texts indicated in abscissa.Upper panel: S-channel; Lower panel: I-channel.Output texts: Matthew, black; Mark, yellow; Luke, blue; John, green; Acts, cyan; Hebrews, red; Apocalypse, magenta.The mean and standard deviation numerical values are reported in Appendix A. Notice that Γ , Γ , .

[...] 20.46 dB; vice versa, if the input is Luke and the output is Matthew, then Γ = 21.23 dB. If the input is Matthew and the output is Matthew, then Γ = 26.63 dB, very close to that obtained in the S-channel. As in S-channels, the self-channel value is larger than the cross-channel values.

Figure 11. Matching texts in S-channels and in I-channels.

Figure 12 shows the scatterplot of the likeness index of the I-channel (ordinate) versus the likeness index of the S-channel (abscissa) for the NT. The numerical values are reported in Appendix B.

Figure 12. Scatterplot of the likeness index of the interpunctions channel (ordinate scale) versus the likeness index of the S-channel (abscissa scale). Output channels (first line in Tables 11 and 12): Matthew, black circles; Mark, yellow; Luke, blue; John, green; Acts, cyan; Hebrews, red; Apocalypse, magenta. The percentages indicate the relative number of cases falling in a quadrant.

Table 1. New Testament. Mean values (averaged over all chapters) of characters per word, words per sentence, interpunctions per sentence, words per interpunction, and the universal readability index. The genealogies in Matthew (verses 1.1-1.17) and in Luke (verses 3.23-3.38) have been deleted so as not to bias the statistical analyses. All parameters have been computed by weighting each chapter with its fraction of the total words of the literary text.

Table 2. Greek literature. Mean values (averaged over all chapters) of characters per word, words per sentence, interpunctions per sentence, words per interpunction (or word interval), and the corresponding universal readability index. All parameters have been computed by weighting each chapter with its fraction of the total words of the literary text.

Table 3 reports the slope and the correlation coefficient of the regression lines in the NT texts. In Matthew, for example, if we consider 100 words, then the text, on average, contains 100 × 0.0508 = 5.08 sentences and 2.7271 × 5.08 = 13.85 interpunctions.
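The Matthew example can be reproduced with a short Python sketch. It assumes, as the arithmetic in the text implies, regression lines through the origin; the function name is hypothetical, and the two slopes are the values quoted above for Matthew.

```python
def expected_counts(n_words: float,
                    sentences_per_word: float,
                    interpunctions_per_sentence: float) -> tuple[float, float]:
    """Average sentences and interpunctions predicted by the two slopes."""
    n_sentences = sentences_per_word * n_words
    n_interpunctions = interpunctions_per_sentence * n_sentences
    return n_sentences, n_interpunctions


# Slopes quoted in the text for Matthew.
sentences, interpunctions = expected_counts(100, 0.0508, 2.7271)
print(f"{sentences:.2f} sentences, {interpunctions:.2f} interpunctions")
```

Chaining the two slopes in this way links words to interpunctions through the number of sentences, which is the quantity shared by the two regression lines.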

Table 3. Slope and correlation coefficient of the regression lines of sentences versus words, and of interpunctions versus sentences, in the indicated texts. Four decimal digits are reported because some values differ only from the third digit. These parameters are calculated by uniformly weighting each text block, e.g., weight 1/28 in Matthew.
The number of interpunctions in Matthew is linked to the number of interpunctions in Luke, for the same number of sentences, by a regression line with slope 0.9638 and correlation coefficient 0.9960.
[...] 1.0180 and correlation coefficient 0.9938. In other terms, 100 sentences in Luke give 1.0180 × 100 = 101.80 sentences in Matthew, for the same number of words.

Table 4. Theoretical slope and correlation coefficient of the regression line according to Section 4, for the indicated input texts. Output channel: Matthew.

Let us calculate the theoretical signal-to-noise ratio Γ obtained in S-channels and in I-channels. Table 5 (S-channel) and Table 6 (I-channel) report Γ (dB) between the input text indicated in the first column and the output text indicated in the first line.

Table 5. S-channel. Theoretical signal-to-noise ratio Γ (dB) in the channel between the (input) text indicated in the first column and the (output) text indicated in the first line. For example, if the input is Matthew and the output is Mark, then Γ = 17.70 dB; vice versa, if the input is Mark and the output is Matthew, then Γ = 18.59 dB.

Table 6. I-channel. Theoretical signal-to-noise ratio Γ (dB) in the channel between the (input) text indicated in the first column and the (output) text indicated in the first line. For example, if the input is Matthew and the output is Mark, then Γ = 14.25 dB; vice versa, if the input is Mark and the output is Matthew, then Γ = 13.16 dB.

In I-channels (Table 6), we read Γ = 19.94 dB in Matthew→Luke and Γ = 20.53 dB in Luke→Matthew. These results say not only that the asymmetry is very small but, more importantly, that the S-channel and the I-channel are practically identical, with Γ of about 19~20 dB, therefore confirming that the very small distance between Matthew and Luke shown in Figure 1 is not due to chance. From the point of view of communication theory, therefore, Matthew and Luke appear as each other's mathematical "photocopies".

Luke and Acts, both universally attributed to Luke [55–65], have very similar Γ in the S-channel: Γ = 15.14 dB in Luke→Acts and Γ = 13.44 dB in Acts→Luke. These values are low enough to agree with the large distance shown in Figure 1; therefore, the style used in the two texts is significantly different, in agreement with the different values 20.47 in Luke and 25.47 in Acts. On the contrary, the large and practically identical values in the I-channel (Γ = 27.93 dB in Luke→Acts and Γ = 27.56 dB in Acts→Luke) indicate that the readers addressed by these texts may even coincide, as far as their STM capacity is concerned.

Table 7. Theoretical slope and correlation coefficient of the regression line according to Section 4, for the indicated input texts. Output channel: Hebrews. Notice that five decimal digits are reported for Apocalypse because its value is very close to 1.

Table 8. Theoretical slope and correlation coefficient of the regression line according to Section 4, for the indicated input texts. Output channel: Apocalypse. Notice that five decimal digits are reported for Hebrews because its value is very close to 1.

Table 9. Slope and correlation coefficient of the regression lines of sentences versus words, and of interpunctions versus sentences, for the indicated texts of the Greek literature. The slopes and correlation coefficients have been calculated in the same way as those reported in Table 3.

Table 10. S-channel, Greek literature. Theoretical signal-to-noise ratio Γ (dB) in the channel between the (input) text indicated in the first column and the (output) text indicated in the first line. For example, if the input is Polybius and the output is Plutarch, then Γ = 9.81 dB; vice versa, if the input is Plutarch and the output is Polybius, then Γ = 8.48 dB.

Table 11. I-channel, Greek literature. Theoretical signal-to-noise ratio Γ (dB) in the channel between the (input) text indicated in the first column and the (output) text indicated in the first line. For example, if the input is Polybius and the output is Plutarch, then Γ = 17.06 dB; vice versa, if the input is Plutarch and the output is Polybius, then Γ = 16.49 dB.

Table 9 reports the slope and correlation coefficient of the regression lines. From these data we calculate Γ, according to Section 4, reported in Table 10 (S-channels) and Table 11 (I-channels).

Table 12. Average value of the likeness index in S-channels. For example, in the channels Hebrews ↔ Apocalypse, from Appendix B, we obtain the average value (0.993 + 0.999)/2 = 0.996. In bold type are the cases in which the likeness index exceeds 0.5.

Table 13. Average value of the likeness index in I-channels. In bold type are the cases in which the likeness index exceeds 0.5.

Table 14. Overall average value of the likeness index. For example, in the channels Hebrews ↔ Apocalypse, from Tables 12 and 13 we obtain the average value (0.996 + 0.697)/2 = 0.847. In bold type are the cases in which the likeness index exceeds 0.5.
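The two-step averaging behind Tables 12 and 14 can be checked with the Hebrews ↔ Apocalypse values quoted in the captions. The Python sketch below uses decimal arithmetic and assumes half-up rounding to three digits, which is not stated in the source; the function name is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP


def average_index(first: str, second: str) -> Decimal:
    """Mean of two likeness indices, rounded half-up to three decimals."""
    mean = (Decimal(first) + Decimal(second)) / 2
    return mean.quantize(Decimal("0.001"), rounding=ROUND_HALF_UP)


# Step 1: average the two asymmetric S-channels (values from Appendix B).
s_channel = average_index("0.993", "0.999")

# Step 2: average the S-channel and I-channel means (I-channel: 0.697).
overall = average_index(str(s_channel), "0.697")

print(s_channel, overall)
```

Decimal arithmetic is used here only so that the midpoint 0.8465 rounds deterministically; with binary floating point, the last digit could depend on representation error.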

Table A2. I-channels. Experimental mean signal-to-noise ratio Γ (dB) and standard deviation (dB, in parentheses) in the channel between the (input) text indicated in the first column and the (output) text indicated in the first line.