A Mathematical Structure Underlying Sentences and Its Connection with Short-Term Memory

The purpose of the present paper is to further investigate the mathematical structure of sentences, proposed in a recent paper, and its connections with human short-term memory. This structure is defined by two independent variables which apparently engage two short-term memory buffers in series. The first buffer is modelled according to the number of words between two consecutive interpunctions (a variable referred to as the word interval, I_P), which follows Miller's 7 ± 2 law; the second buffer is modelled by the number of word intervals contained in a sentence, M_F, ranging approximately from one to seven. These values result from studying a large number of literary texts belonging to ancient and modern alphabetical languages. After studying the numerical patterns (combinations of I_P and M_F) that determine the number of sentences that can theoretically be recorded in the two memory buffers, a number which increases with I_P and M_F, we compare the theoretical results with those actually found in novels from Italian and English literature. We have found that most writers, in both languages, write for readers with small memory buffers and, consequently, are forced to reuse sentence patterns to convey multiple meanings.


Does the Short-Term Memory Process Words with Two Independent Buffers in Series?
Recently [1], we proposed a well-grounded conjecture that a sentence (read or pronounced, as the two activities are similarly processed by the brain [2]) is elaborated by the short-term memory (STM) with two independent processing units in series that have similar buffer sizes. The clues for conjecturing this model emerged from considering many novels belonging to Italian and English literature. In [1], we showed that there are no significant mathematical/statistical differences between the two literary corpora, according to the surface deep-language variables. In other words, the mathematical surface structure of alphabetical languages, a creation of the human mind, seems to be deeply rooted in humans, independently of the particular language used.
A two-unit STM processing can be justified according to how the human mind seems to memorize "chunks" of information written in a sentence. Although simple and related to the surface of language, the model seems to describe mathematically the input-output characteristics of a complex, largely unknown mental process.
According to [1], the first processing unit is linked to the number of words between two contiguous interpunctions, a variable indicated by I_P and termed the word interval (Appendix A lists the mathematical symbols used in the present paper), ranging approximately within Miller's 7 ± 2 law range [3-12]. The second unit is linked to the number M_F of I_Ps contained in a sentence, referred to as the extended STM, or E-STM, ranging approximately from one to six. We have shown that the capacity (expressed in words) required to process a sentence ranges from 8.3 to 61.2 words, values that can be converted into time by assuming a reading speed. This conversion gives the range 2.6∼19.5 s for a fast reader [13], and 5.3∼30.1 s for an average reader of novels, values that are well supported by the experiments reported in the literature [14-29].
The E-STM must not be confused with the intermediate memory [30,31]. It is not modelled by studying neuronal activity, but by studying the surface aspects of human communication, such as words and interpunctions, whose effects writers and readers have experienced since the invention of writing.
The modelling of the STM processing by two units in series has never been considered in the literature before [1,32]. The reader is very likely aware that the literature on the STM and its various aspects is very large and multidisciplinary, but nobody, as far as we know, has ever considered the connections we have found and discussed in [1,32]. Moreover, a sentence conveys meaning; therefore, the theory we are further developing in the present paper might be a starting point for arriving at an information theory that includes meaning.
Currently, attempts are being made by many scholars to arrive at a "semantic communication" theory or a "semantic information" theory, but the results are still, in our opinion, in their infancy [33-41]. These theories, like those concerning the STM, have not considered the main "ingredients" of our theory, namely I_P and P_F, as a starting point for including meaning, which is still a very open issue.
Figure 1 sketches the flowchart of the two processing units [1]. The words p_1, p_2, …, p_j are stored in the first buffer up to j items, approximately in Miller's range, until an interpunction is introduced to fix the length of I_P. The word interval I_P is then stored in the second buffer up to k items, from about one to six, until the sentence ends. The process is then repeated for the next sentence.
The purpose of the present paper is to further investigate the mathematical structure underlying sentences, both theoretically and experimentally, by considering the novels previously mentioned [1], listed in Table A1 for Italian literature and in Table A2 for English literature.
After this introduction, in Section 2, we study the probability distribution function (PDF) of sentence size, measured in words, that is recordable by an E-STM buffer made of C_F cells (this parameter plays the role of M_F). In other words, in this section, we study and discuss the length of sentences that humans can possibly conceive with an E-STM made of C_F memory cells.
In Section 3, we study the number of sentences, with the same number of words, that C_F cells can process. In this section, we study and discuss the issue complementary to Section 2, namely, how many sentences with a constant number of words humans can conceive, based solely on an E-STM of C_F cells.
In Section 4, we compare the number of sentences that authors of Italian and English literature actually wrote for their novels to the number of sentences theoretically available to them, by defining a multiplicity factor. In Section 5, we define a mismatch index, which synthetically measures to what extent a writer uses the number of sentences that are theoretically available. In Section 6, we show that the parameters studied increase with the year of novel publication. Finally, in Section 7, we summarize the main results and propose future work.

Probability Distribution of Sentence Length versus E-STM Buffer Size
First, we study the conditional PDF of sentence length, measured in words W (the parameter which, in long texts such as chapters, gives the P_F of each chapter), recordable in an E-STM buffer made of C_F cells (the parameter which gives M_F in chapters). Second, we study the overlap of the PDFs, because this overlap gives interesting indications.

Probability Distribution of Sentence Length
To estimate the PDF of sentence length, we run a Monte Carlo simulation based on the PDF of I_P obtained in [1] by merging the two literatures listed in Section 1.
In [1], we have shown that the PDFs of I_P, P_F, and M_F (as previously mentioned, these averages refer to single chapters of the novels) can be modelled with a three-parameter log-normal density function [42] (natural logs):

f(z) = exp(−(ln(z − 1) − µ_x)² / (2σ_x²)) / ((z − 1) σ_x √(2π)), z ≥ 1. (1)

In Equation (1), µ_x and σ_x are, respectively, the mean value and the standard deviation of the underlying Gaussian variable ln(z − 1). Table 1 reports these values for the three deep-language variables. The Monte Carlo simulation steps are as follows:
1. Consider a buffer made of C_F cells. The sentence contains C_F word intervals: for example, if C_F = 3, the sentence contains two interpunctions followed by a full stop, a question mark, or an exclamation mark.
2. Generate C_F independent values of I_P according to the log-normal model given by Equation (1) and Table 1. The independence of I_P from one cell to another is reasonable [1]. In detail, from a random number generator of standard Gaussian variables X_i (zero mean and unit standard deviation), we use the relationship X_i = (y_i − µ_x)/σ_x; the three-parameter log-normal variable I_{P,i} ≥ 1 is then given by I_{P,i} = exp(y_i) + 1 = exp(σ_x X_i + µ_x) + 1.
3. Add the numbers of words contained in the C_F cells to obtain W:

   W = Σ_{i=1}^{C_F} I_{P,i}. (2)

4. Repeat steps 1 through 3 many times (we repeated these steps 100,000 times, i.e., we simulated 100,000 sentences of different lengths) to obtain a stable conditional PDF of W.
5. Repeat steps 1 through 4 for another C_F and obtain another PDF.
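The five steps above can be sketched in a few lines of Python. This is a minimal sketch: the log-normal parameters below are illustrative assumptions, chosen so that the C_F = 3 statistics match the values reported next (m_3 ≈ 18.00 words, s_3 ≈ 1.79 words); they are not the actual Table 1 values.

```python
import math
import random
import statistics

# Illustrative log-normal parameters (assumed, NOT the paper's Table 1 values),
# chosen so that the C_F = 3 statistics match m_3 ≈ 18.00 and s_3 ≈ 1.79 words.
MU_X, SIGMA_X = 1.5885, 0.2045

def sample_sentence_length(c_f, rng):
    """Steps 1-3: sum C_F independent word intervals I_P = exp(sigma*X + mu) + 1."""
    return sum(math.exp(SIGMA_X * rng.gauss(0.0, 1.0) + MU_X) + 1.0
               for _ in range(c_f))

rng = random.Random(42)
n = 100_000                      # step 4: simulate 100,000 sentences
lengths = [sample_sentence_length(3, rng) for _ in range(n)]
m3 = statistics.fmean(lengths)
s3 = statistics.stdev(lengths)
print(f"C_F = 3: mean = {m3:.2f} words, std = {s3:.2f} words")
```

Repeating the last block for other values of C_F (step 5) yields the family of conditional PDFs discussed below.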
Figure 2 shows the conditional PDF for several values of C_F. Each PDF can be very well modelled by a Gaussian PDF f_{C_F}(x), because the probability of getting unacceptable negative values is negligible in any of the PDFs shown in Figure 2. For example, for C_F = 3, the mean value and the standard deviation are, respectively, m_3 = 18.00 words and s_3 = 1.79 words.
In general terms [42], the mean value of Equation (2) is given by:

m_{C_F} = C_F (exp(µ_x + σ_x²/2) + 1). (3)

Therefore, m_{C_F} is proportional to C_F. As for the standard deviation of W, if the I_{P,i}'s are independent, as we assume, then the variance s²_{C_F} of W is given by:

s²_{C_F} = C_F (exp(σ_x²) − 1) exp(2µ_x + σ_x²). (4)

Therefore, the standard deviation s_{C_F} is proportional to √C_F. Finally, according to the central limit theorem [42], the PDF can be modelled as Gaussian in a significant range about the mean.
In conclusion, the Monte Carlo simulation produces a Gaussian PDF with a mean value proportional to C_F and a standard deviation proportional to √C_F. These findings are clearly evident in the PDFs shown in Figure 2, in which m_{C_F} and s_{C_F} increase as theoretically expected; therefore, the mean values and standard deviations of the other PDFs can be calculated by scaling the values found for C_F = 3. For example, for C_F = 6, m_6 = 2 × 18.00 = 36.00 words and s_6 = √2 × 1.79 = 2.53 words. Figure 3 shows the histograms corresponding to Figure 2. The number of samples for each conditional PDF, out of the 100,000 considered in the Monte Carlo simulation, is obtained by distributing the samples according to the PDF of M_F given by Equation (1) and Table 1. The case C_F = 3 gives the largest sample size. The results shown above have an experimental basis, because the relationship between <P_F> (the average number of words per sentence for the entire novel, calculated by averaging the P_F of single chapters weighted with each chapter's fraction of the novel's total words, as discussed in [32]) and <M_F> (the average M_F of a novel, calculated in the same way) is linear, as Figure 4 shows by drawing <P_F> versus <M_F> for the Italian and English novels mentioned above.
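The scaling rule can be checked directly; a minimal sketch using the C_F = 3 values quoted above:

```python
import math

# m_CF scales linearly with C_F; s_CF scales with sqrt(C_F)
m3, s3 = 18.00, 1.79            # C_F = 3 values from the simulation
m6 = (6 / 3) * m3               # expected 36.00 words
s6 = math.sqrt(6 / 3) * s3      # expected ~2.53 words
print(f"m_6 = {m6:.2f} words, s_6 = {s6:.2f} words")
```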

Overlap of the Conditional Probability Distributions
Figure 2 shows that the conditional PDFs overlap; therefore, some sentences can be processed by buffers of diverse C_F sizes, either larger or smaller. Let us define the probability of these overlaps.
Let W_th be the intersection of two contiguous Gaussian PDFs, for example, f_{C_F−1}(x) and f_{C_F}(x); therefore, the probability p_HL that a sentence length can be found in the nearest lower Gaussian PDF (going from C_F → C_F − 1) is given by [42]:

p_HL = ∫_{−∞}^{W_th} f_{C_F}(x) dx. (5)

Similarly, the probability that a sentence length can be found in the nearest higher Gaussian PDF (going from C_F − 1 → C_F) is given by:

p_LH = ∫_{W_th}^{∞} f_{C_F−1}(x) dx. (6)

For example, the threshold value between f_{C_F=3}(x) and f_{C_F=4}(x) is W_th = 20.9 words, and p_HL = 6.6%, while p_LH = 5.5%.
Figure 5 draws these probabilities (%) versus C_F − 1 (the lower C_F). Because s_{C_F} increases with √C_F, p_HL > p_LH. However, this is not only a mathematically obvious result; it is also meaningful, because it indicates that: (a) a human mind can process sentences of lengths belonging to the contiguous lower or higher M_F (the probability of going to more distant PDFs is negligible), and (b) the number of these sentences is larger in the case C_F → C_F − 1, which simply means that an E-STM buffer can process to a larger extent data matched to a smaller capacity buffer than data matched to a larger capacity buffer.
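The threshold and the two overlap probabilities can be reproduced numerically; a sketch assuming the Gaussian models above, with the C_F = 4 parameters scaled from m_3 = 18.00 and s_3 = 1.79:

```python
import math

def gaussian_pdf(x, m, s):
    return math.exp(-(x - m) ** 2 / (2 * s * s)) / (s * math.sqrt(2 * math.pi))

def gaussian_cdf(x, m, s):
    return 0.5 * (1.0 + math.erf((x - m) / (s * math.sqrt(2))))

# Conditional PDFs for C_F = 3 and C_F = 4, scaled from m_3 = 18.00, s_3 = 1.79
m3, s3 = 18.00, 1.79
m4, s4 = 18.00 * 4 / 3, 1.79 * math.sqrt(4 / 3)

# W_th: intersection of the two PDFs between the means (bisection search)
lo, hi = m3, m4
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if gaussian_pdf(mid, m3, s3) > gaussian_pdf(mid, m4, s4):
        lo = mid
    else:
        hi = mid
w_th = 0.5 * (lo + hi)

p_hl = gaussian_cdf(w_th, m4, s4)        # C_F -> C_F - 1: mass of f_4 below W_th
p_lh = 1.0 - gaussian_cdf(w_th, m3, s3)  # C_F - 1 -> C_F: mass of f_3 above W_th
print(f"W_th = {w_th:.1f} words, p_HL = {p_hl:.1%}, p_LH = {p_lh:.1%}")
```

The computed values agree with the quoted W_th ≈ 20.9 words and the percentages to within rounding.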
Finally, notice that each sentence conveys meaning. Theoretically, any sequence of words might be meaningful, although this may not always be the case (we do not know the proportion); therefore, the PDFs found above are also the PDFs associated with meaning. Moreover, the same numerical sequence of W words can carry different meanings, according to the words used. Multiplicity of meaning, therefore, is "built in" in a sequence of W words. We will further explore this issue in the next sections by considering the number of sentences that authors of Italian and English literature actually wrote.
So far, we have explored the processing of the words of a sentence by simulating sentences of diverse lengths, conditioned on the E-STM buffer size. In the next section, we explore the complementary processing concerning the number of sentences that contain the same number of words.

Theoretical Number of Sentences Recordable in C_F Cells
We study the number of sentences of W words that an E-STM buffer made of C_F cells can theoretically process. In summary, we ask the following question: how many sentences containing the same number of words W (Equation (2)) can be theoretically written in C_F cells?
Table 2 reports these numbers as a function of W and C_F. We calculated these data first by running a code and then by finding the mathematical recursive formula that generates them, given by the following:

S_W^{(C_F)} = S_{W−1}^{(C_F−1)} + S_{W−1}^{(C_F)}, with S_W^{(1)} = 1 for W ≥ 1 and S_W^{(C_F)} = 0 for W < C_F. (7)

Figure 7 draws the data reported in some columns of Table 2, i.e., the number of sentences S_W^{(C_F)} versus W, for fixed C_F. In this case, it is useful to adopt an efficiency factor η, defined as the ratio between S_W^{(C_F)} and W for a given C_F:

η = S_W^{(C_F)} / W. (8)
This factor explains, summarily, how efficient a buffer of C_F cells is in providing sentences with a given number of words, its units being sentences per word.
Figure 8 shows η versus W. It is interesting to note that, for W ≲ 10 words, the buffer C_F = 2 can be more efficient than the others. Beyond W = 10, the larger buffers become very efficient with very large W.
If a writer uses short buffers, e.g., deliberately because of his/her style, or necessarily because of the reader's E-STM memory size, then he/she has to repeat the same numerical sequence of words many times, according to the number of meanings conveyed. For example, if C_F = 2 and W = 10, the writer has only nine different choices, or patterns, of two numbers whose sum is 10 (Table 2). Therefore, Table 2 gives the minimum number of meanings that can be conveyed. The larger the C_F, the larger the variety of sentences that can be written with W words.
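The counts in Table 2 are the number of ways of writing W as an ordered sum of C_F word intervals of at least one word each. A sketch verifying the recursion against the C_F = 2, W = 10 example (the closed binomial form in the assertion is an observation of ours, not stated in the text):

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def n_sentences(w, c_f):
    """Sentence patterns: ordered C_F word intervals, each >= 1 word, summing to w."""
    if w < c_f:
        return 0
    if c_f == 1:
        return 1
    # Recursion generating the entries of Table 2
    return n_sentences(w - 1, c_f - 1) + n_sentences(w - 1, c_f)

# C_F = 2, W = 10: nine patterns (1+9, 2+8, ..., 9+1)
print(n_sentences(10, 2))

# The entries coincide with binomial coefficients (compositions of W into C_F parts)
assert all(n_sentences(w, c) == comb(w - 1, c - 1)
           for c in range(1, 7) for w in range(c, 40))

# Efficiency factor: eta = S / W
eta = n_sentences(10, 2) / 10
```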

The following question naturally arises: how many sentences do authors write in their texts, as compared to the theoretical number available to them? In the next section, we will compare these two sets of data by studying the novels taken from the Italian and English literature listed in Appendix B, by assuming their average values <P_F> and <M_F> and by defining a multiplicity factor.

Experimental Multiplicity Factor of Sentences
We compare the number of sentences that authors of Italian and English literature actually wrote for each novel to the number of sentences theoretically available to them, according to the <P_F> and <M_F> of each novel. In this analysis, we do not consider the values of P_F and M_F of each chapter of a novel, because the detail would be so fine as to miss the general trend given by the average values <P_F> and <M_F> of the complete novel.
As is well known, the average value and the standard deviation of integer data are very likely not integers, as is always the case for the linguistic parameters; therefore, to apply the mathematical theory of the previous sections, we must interpolate and only at the end of the calculation round to integers.
Let us compare the experimental number of sentences S_{P_F}^{(M_F)} in a novel, as reported in Tables A1 and A2, to the theoretical number S_W^{(C_F)} available to the author, according to the experimental values <P_F> (which plays the role of W) and <M_F> (which plays the role of C_F) of the novel.
By referring to Figure 7, the interpolation between the integers of Table 2 to find the curve of constant C_F, given by the real number <M_F>, is linear along both axes. At the intersection of the vertical line (corresponding to the real number <P_F>) and the new curve (corresponding to the real number <M_F>), we find the theoretical S_W^{(C_F)} by rounding the value to the nearest integer toward zero. For example, for David Copperfield, in Table A2 we read S_{P_F}^{(M_F)} = 19,610, and the interpolation gives S_W^{(C_F)} = 1553. Figure 9 shows the result of this exercise. We see that S_W^{(C_F)} increases rapidly with <P_F>. The most displaced (red) circle is due to Robinson Crusoe.
We define the multiplicity factor α as the ratio between the experimental and the theoretical number of sentences, α = S_{P_F}^{(M_F)} / S_W^{(C_F)}. The values of α for each novel are reported in Tables A1 and A2. For example, for David Copperfield, α = 19,610/1553 = 12.63. Figure 10 shows α versus S_{P_F}^{(M_F)}. We notice a fairly significant increasing trend of α with S_{P_F}^{(M_F)}, modelled by Equation (10) for Italian and Equation (11) for English; the correlation coefficient of the log values is −0.9873 for Italian and −0.9710 for English.
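The interpolation rule and the multiplicity factor can be sketched as follows. The `s_theoretical` function is a hypothetical re-implementation of the rule stated above (linear along both axes of Table 2, then rounding toward zero); the <P_F> and <M_F> values of individual novels are not reproduced here, so only the David Copperfield ratio uses numbers from the text.

```python
from math import comb, floor

def table2(w, c_f):
    """Table 2 entries: number of sentence patterns of w words in c_f cells."""
    return comb(w - 1, c_f - 1) if w >= c_f else 0

def s_theoretical(p_f_avg, m_f_avg):
    """Bilinear interpolation of Table 2 at real-valued <P_F>, <M_F>,
    rounded toward zero (hypothetical re-implementation of the paper's rule)."""
    w0, c0 = floor(p_f_avg), floor(m_f_avg)
    tw, tc = p_f_avg - w0, m_f_avg - c0
    s = ((1 - tw) * (1 - tc) * table2(w0, c0)
         + tw * (1 - tc) * table2(w0 + 1, c0)
         + (1 - tw) * tc * table2(w0, c0 + 1)
         + tw * tc * table2(w0 + 1, c0 + 1))
    return floor(s)

# Multiplicity factor for David Copperfield (values from Table A2 and the interpolation)
alpha = 19_610 / 1_553
print(f"alpha = {alpha:.2f}")
```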
Based on Equations (10) and (11), α = 1 when S_W^{(C_F)} = 3886 for Italian novels and S_W^{(C_F)} = 6028 for English novels; therefore, novels with about 4000∼6000 sentences use, on average, the number of sentences theoretically available for their averages <P_F> and <M_F>.
Figure 12 shows α versus <M_F>. In this case, an exponential law is a good fit (Equation (12) for Italian, Equation (13) for English). For the Italian literature in question (correlation coefficient of linear-log values of −0.9697), α = 1 when M_F = 3.48; for the English literature (correlation coefficient of linear-log values of −0.9603), α = 1 when M_F = 3.65. Therefore, novels with 4000∼6000 sentences use, on average, the same E-STM buffer size of M_F ≈ 3.5 cells.
From Figures 10-12, we can draw the following conclusion: in general, α > 1 is more likely than α < 1, and often α ≫ 1. When α ≫ 1, the writer reuses the same pattern of number of words many times. The multiplicity factor, therefore, also indicates the minimum multiplicity of meaning conveyed by an E-STM, besides, of course, the many diverse meanings conveyed by the same sequence of I_Ps obtainable by only changing words. Few novels show α < 1. In these cases, the writer has enough diverse patterns to convey meaning, but most of them are not used.
Finally, it is interesting to relate α to a universal readability factor G_U, which is a function of both P_F and I_P [43].
The universal readability index, as compared to the current readability indices for the few languages for which they are available (mainly for English [43]), also considers the reader's short-term memory processing capacity. It can be used to assess the readability of texts written in any alphabetical language, as described in [43].
Figure 13 shows α versus G_U. Because the readability of a text increases as G_U increases, we can see that the novels with α < 1 tend to be less readable than those with α > 1. The less-readable novels have, in general, large values of <P_F> and therefore may contain more E-STM cells (large <M_F>).
AppliedMath 2024, 4, FOR PEER REVIEW
In conclusion, if a writer does use the full variety of sentence patterns available, or even overuses them, then he/she writes texts that are easier to read. On the other hand, if a writer does not use the full variety of sentence patterns available, then he/she tends to write texts that are more difficult to read. In the next section, we define a useful index, the mismatch index, which describes these cases.


Mismatch Index
We define a useful index, the mismatch index I_M, which measures to what extent a writer uses the number of sentences that are theoretically available according to the averages <P_F> and <M_F> of the novel; I_M is defined in Equation (14). According to Equation (14), I_M = 0 when S_{P_F}^{(M_F)} = S_W^{(C_F)}; hence, α = 1, and in this case the experiment and theory are perfectly matched. They are overmatched when I_M > 0 (α > 1) and undermatched when I_M < 0 (α < 1).
Figure 14 shows the scatterplot of I_M versus M_F. The mathematical models drawn are calculated by substituting Equations (12) and (13) in Equation (14). We can reiterate that when I_M > 0 (overmatching, M_F ≲ 3.5), the writer repeats sentence patterns because there are not enough diverse patterns to convey all the meanings; the texts are easier to read. When I_M < 0 (undermatching, M_F ≳ 3.5), the writer has theoretically many sentence patterns to choose from, but he/she uses only a few or very few of them; the texts are more difficult to read. Figure 15 shows the scatterplot of I_M versus S_W^{(C_F)}. The mathematical models drawn were calculated by substituting Equations (10) and (11) in Equation (14). Overmatching was found for S_W^{(C_F)} < 3886 for Italian and S_W^{(C_F)} < 6028 for English. Finally, Figure 16 shows I_M versus α (Equation (14)), a picture that summarizes the entire analysis of mismatch.
As we can see by reading the years of publication in Tables A1 and A2, the novels span a long period. Do the parameters studied depend on time? In the next section, we show that the answer to this question is positive.

Time Dependence
The novels considered in Tables A1 and A2 were published in a period spanning several centuries.We show that the multiplicity factor α and the mismatch index I M do depend on time.
Figure 17 shows the multiplicity factors versus the years of publication of the novels since 1800. It is evident that writers tend to use larger values of α, and therefore E-STM buffers of smaller sizes, as we approach the present epoch, with a possible saturation at α ≈ 100. The English literature shows a stable increasing pattern, while the Italian literature seems to contain samples from two diverse sets of data, one of which evolved in agreement with the English literature; the other (given by the novels labelled with "*" in Table A1) also increases with time, but with a diverse slope. Figure 18 shows the mismatch index versus the year of novel publication. Figure 19 shows the universal readability index versus time. In both Figures 18 and 19, we can observe the same trends shown in Figure 17, which therefore reinforces the conjecture that: (a) writers are partially changing their style with time by making their novels more readable, i.e., more matched to less-educated readers, according to the relationship between G U and the schooling years in the Italian school system, as discussed in [43]; (b) a saturation seems to occur in all parameters in the novels written in the second half of the XX century, at least according to the novels of Appendix B.

Summary and Future Work
In the present paper, we have further investigated the mathematical structure of sentences and its connections with human short-term memory. This structure is defined by two independent variables which apparently engage two short-term memory buffers in series. The first buffer is modelled according to the number of words between two consecutive interpunctions, a variable termed word interval, I P , which follows Miller's 7 ± 2 law; the second buffer is modelled by the number of word intervals contained in a sentence, M F , ranging approximately from one to seven. These values arise from an extensive analysis of alphabetical texts [44].
We have studied the numerical patterns (combinations of I P and M F ) that determine the number of sentences that theoretically can be recorded in the two memory buffers, which increases with I P and M F , and we have compared the theoretical results with those actually found in novels from Italian and English literature. We have found that most writers, in both languages, write for readers with small memory buffers and, consequently, are forced to reuse sentence patterns to convey multiple meanings. In this case, texts are easier to read, according to the universal readability index.
Future work should consider other literatures to confirm what, in our opinion, is general because the topic is connected to the human mind. The same analysis performed on ancient languages, such as Greek and Latin, for which there are large literary corpora, would show whether these ancient writers/readers displayed similar short-term memory buffers. Tables A1 and A2 list the authors, the titles of the novels, and their years of publication in either Italian or English literature as considered in the paper, with deep-language average statistics, multiplicity factor α, and mismatch index I M . The averages < C P >, < P F >, < I P >, and < M F > have been calculated by weighting each chapter value with its fraction of the total number of words in the novel, as described in [32]. Notice that, for Dickens' novels, Table 1 of [45] reported the number of sentences ending only with full stops; sentences ending with question marks and exclamation marks were not reported, contrary to all the other literary texts reported there. Moreover, the analysis conducted in [45] was performed by considering only the sentences ending with full stops; this is why the values of < P F > and < M F > reported there are larger (upper bounds) than those listed below.

Figure 1 .
Figure1.Flowchart of the two processing units of a sentence.The words p 1 , p 2 ,. . .p j are stored in the first buffer up to j items to complete a word interval I P , which is approximately in Miller's range, when an interpunction is introduced.I P is then stored in the E-STM buffer, up to k items, i.e., in M F cells, approximately one to six, until the sentence ends.

Figure 2 .
Figure 2. Conditional PDFs of words per sentence versus an E-STM buffer of C F cells from two to eight.Each PDF can be modelled with a Gaussian PDF f C F (x) with a mean value proportional to C F and a standard deviation proportional to √ C F .

Figure 3 .
Figure 3. Conditional histograms of words per sentence versus an E-STM buffer of C F cells from two to eight, obtained from Figure 1, by simulating 100,000 sentences weighted with the PDF of M F .


Figure 4 .
Figure 4. Scatterplot of < P F > versus < M F > of Italian novels (blue circles) and English novels (red circles).

Figure 6
Figure 6 draws the data reported in some lines of Table 2 for a quick overview. We see how fast the number of sentences changes with C F for constant W. For example, if W = 20 words, then S W=20 ranges from 1 (C F = 1) to 52,698 sentences (C F = 8). Maxima are clearly visible for W = 5 and W = 10 words at C F = 3 and C F = 5 or 6, respectively. Values become fantastically large for larger W and C F , well beyond the ability and creativity of single writers, as we will show in Section 4.
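The combinatorics behind these counts can be sketched as follows. Assuming, purely for illustration, that each sentence pattern is a composition of W words into exactly C F non-empty word intervals (this is our assumption, not necessarily the exact counting rule of Table 2), the count is the binomial coefficient C(W − 1, C F − 1). This simple model reproduces the maxima noted above (C F = 3 for W = 5, and C F = 5 or 6 for W = 10), although the exact values in Table 2 may differ:

```python
from math import comb

def sentence_patterns(w, c_f):
    """Number of ways to split w words into c_f non-empty word intervals
    (compositions of w into c_f positive parts): C(w - 1, c_f - 1)."""
    return comb(w - 1, c_f - 1)

# The maxima noted in the text for Figure 6:
assert max(range(1, 6), key=lambda c: sentence_patterns(5, c)) == 3   # W = 5
assert sentence_patterns(10, 5) == sentence_patterns(10, 6) == 126    # W = 10
# A single interval (C_F = 1) always yields exactly one pattern:
assert sentence_patterns(20, 1) == 1
```

The rapid growth with W and C F visible in Figure 6 follows directly from the growth of the binomial coefficient.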

Figure 6 .
Figure 6.Number of sentences S (C F ) W made of W words versus an E-STM buffer capacity of C F .

Figure 7
Figure 7 draws the data reported in some columns of Table 2, i.e., the number of sentences S (C F ) W versus W, for fixed C F . In this case, it is useful to adopt an efficiency factor ε, which is defined as the ratio between S (C F ) W and W for a given C F : ε = S (C F ) W /W.
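For illustration, the efficiency factor ε = S (C F ) W /W of Equation (8) can be sketched by modelling S (C F ) W as the number of compositions of W words into C F non-empty word intervals (our assumption; Table 2 may use a different counting rule):

```python
from math import comb

def efficiency(w, c_f):
    """Efficiency factor epsilon = S / W (Equation (8)); S is modelled
    here as the number of compositions of w words into c_f non-empty
    word intervals (an illustrative assumption)."""
    return comb(w - 1, c_f - 1) / w

# For short sentences a small buffer can be the more efficient one:
assert efficiency(5, 2) > efficiency(5, 6)
# Beyond W = 10 the larger buffers become very efficient (large epsilon):
assert efficiency(20, 6) > efficiency(20, 2)
```

This reproduces the qualitative behaviour described for Figure 8: small buffers compete well for short sentences, while larger buffers dominate as W grows.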

Figure 7 .
Figure 7. Number of sentences S (C F ) W recordable in an E-STM buffer capacity of C F versus words per sentence.

Figure 8
Figure 8 shows ε versus W. It is interesting to note that for W ≲ 10 words, the buffer C F = 2 can be more efficient than the others. Beyond W = 10, the larger buffers become very efficient, with very large ε.

Figure 8 .
Figure 8. Efficiency ε, Equation (8), of an E-STM buffer of C F cells versus words per sentence W.

S (C F ) W increases rapidly with < M F >. The most displaced (red) circle is due to Robinson Crusoe. The comparison between S (M F ) P F and S (C F ) W is performed by defining a multiplicity factor α, the ratio between S (M F ) P F (experimental value) and S (C F ) W (theoretical value): α = S (M F ) P F / S (C F ) W .

Figure 9 .
Figure 9. Theoretical number of sentences S (C F ) W versus < M F > for Italian (blue circles) and English (red circles) novels. The most displaced (red) circle is due to Robinson Crusoe.

Figure 10 .
Figure 10. Multiplicity factor α versus S (M F ) P F for Italian (blue circles) and English (red circles) novels.

Figure 11
Figure 11 shows α versus S (C F ) W . An inverse power law is a good fit: α = 3886/S (C F ) W for Italian novels (Equation (10)) and α = 6028/S (C F ) W for English novels (Equation (11)).

Figure 11 .
Figure 11. Multiplicity factor α versus theoretical number of sentences S (C F ) W for Italian (blue circles and blue line) and English (red circles and red line) novels. The correlation coefficient of log values is −0.9873 for Italian and −0.9710 for English. Based on Equations (10) and (11), α = 1 when S (C F ) W = 3886 for Italian novels and when S (C F ) W = 6028 for English novels.

Figure 12 .
Figure 12. Multiplicity factor α versus M F for Italian (blue circles and blue line) and English (red circles and red line) novels.

Figure 13 .
Figure 13.Multiplicity factor α versus readability index G U for Italian (blue circles) and English (red circles) novels.

Figure 14 .
Figure 14.Mismatch index I M versus M F for Italian (blue circles and blue line) and English (red circles and red line) novels.

Figure 15 .
Figure 15.Mismatch index I M versus the theoretical number of sentences S (C F ) W for Italian (blue circles and blue line) and English (red circles and red line) novels.

Figure 16 .
Figure 16. Mismatch index I M versus the multiplicity factor α for Italian (blue circles) and English (red circles) novels.

Figure 17 .
Figure 17. Multiplicity factor α versus the year of novel publication for Italian (blue circles) and English (red circles) novels.


Figure 18 .
Figure 18. Mismatch index I M versus the year of novel publication for Italian (blue circles) and English (red circles) novels.

Figure 19 .
Figure 19.Universal readability index G U versus the year of novel publication for Italian (blue circles) and English (red circles) novels.
µ x : mean value of the log-normal PDF; σ x : standard deviation of the log-normal PDF.

Appendix B. List of the Novels Considered from Italian and English Literature

Table 1 .
Table 1. Mean value µ x and standard deviation σ x of the log-normal PDF of the indicated variable [1].

Table 2 .
Theoretical number of sentences S (C F ) W (columns) recordable in an E-STM buffer made of C F cells with the same number of words (items) indicated in the first column.

Table A1 .
Authors of novels of Italian literature. Number of total sentences (sentences ending with full stops, question marks, or exclamation marks); average number of characters per word, < C P >; average number of words per sentence, < P F >; average word interval, < I P >; average number of word intervals per sentence, < M F >; multiplicity factor α; and mismatch index I M .

Table A2 .
Authors of the novels of English literature. Number of total sentences; average number of characters per word, < C P >; average number of words per sentence, < P F >; average word interval, < I P >; average number of word intervals per sentence, < M F >; multiplicity factor α; and mismatch index I M . Notice that, for Dickens' novels, Table 1 of [45] reported the number of sentences ending only with full stops.