First Digits’ Shannon Entropy

For the letters of an alphabet, the entropy is the average number of binary digits required for the transmission of one character. Checking tables of statistical data, one finds that the digits 1 to 9 occur with different frequencies in the first position of the numbers. Correspondingly, a value for the Shannon entropy H can be determined from these probabilities as well. Although in many cases the Newcomb-Benford Law applies, distributions have been found in which the 1 occurs more than 40 times as frequently in the first position as the 9. In such cases, the probability of occurrence of a particular first digit can be derived from a power function x^(-p) with exponent p > 1. While the entropy of first digits following an NB distribution amounts to H = 2.88 bits per digit, entropy values of 2.76 and 2.04 bits per digit have been found for other data distributions (diameters of craters on Venus and the weights of fragments of crushed minerals, respectively).


Introduction
According to information theory, the Shannon entropy H of a variable is the average level of information inherent in the variable's possible outcomes [1]. Originally, the term entropy was coined by Clausius [2] for a state variable in thermodynamics. Boltzmann [3] related the entropy of an ideal gas to the multiplicity, the number of microstates corresponding to the gas's macrostate. From the probabilities of occurrence of letters, Shannon calculated the entropy H of an alphabet [4,5]. This gives the minimum number of binary digits necessary for the transmission of one character and, at the same time, its average information. In quite a similar way, one can obtain a value of H from the probability of occurrence of the first digits in tables of numbers. This gives the average information of a first digit and, likewise, the number of binary digits required for its transmission. According to Kwiatkowsky [6], entropy is a measure of uncertainty.
One goal of this manuscript is to apply Shannon's method to the first digits of data collections. The entropy of the first digits, giving the minimum number of binary digits necessary to characterize their distribution, may be of some value when the signal-to-noise ratio is extremely low, e.g., when transmitting data from space probes.
Another goal is to find first-digit distributions that, like data obeying Newcomb-Benford's Law, follow a continuous function, but exhibit clearly different frequencies of the first digits. Examples seem to be rare.
Distributions of first digits that deviate strongly from the NB Law are also discussed; for fractured minerals, a possible explanation is presented.
For an alphabet, the entropy H is determined in the following way [7]: if the letters occur with different probabilities p_i, one forms the logarithmus dualis (binary logarithm) of the probability of a particular character, ld p_i, and multiplies this value by the probability of its occurrence in the text, giving p_i ld p_i. Taking the sum over all letters results in

H = -Σ_(i=1)^(n) p_i ld p_i,    (1)

with n being the total number of letters [8,9] in this alphabet. The average information of a single character reaches its highest value when all letters are equally likely. Correspondingly, one can calculate a value for the entropy H from the probability of occurrence of first digits in data collections as well.
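As a numerical illustration, Equation (1) can be evaluated directly; the following sketch (not part of the original manuscript) computes H in bits for an arbitrary probability list:

```python
import math

def shannon_entropy(probs):
    """Shannon entropy H = -sum(p_i * ld p_i) in bits, Equation (1)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# The maximum is reached when all n symbols are equally likely: H = ld n.
# E.g., for 26 equally likely letters:
print(round(shannon_entropy([1 / 26] * 26), 4))  # -> 4.7004
```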

The Newcomb-Benford Distribution
Checking tables of statistical data (e.g., atomic weights, masses of exoplanets, but also wages and prices, stock prices, populations of communities, street addresses, lengths of rivers, page numbers in literature citations, development aid), one finds quite often that 1 occurs more frequently in the first position of numbers than 9. The best-known example is the Newcomb-Benford distribution, according to which the number 1 occurs 6.58 times more often as the leading digit than 9 [10][11][12][13]. Some continuous processes satisfy this exactly (in the asymptotic limit as the process continues through time). Examples are exponential growth or decay processes [12]: if a quantity is exponentially increasing or decreasing in time, then the percentage of time that it has each first digit satisfies Newcomb-Benford's Law. An example is given in Appendix A. Many, though not all, distributions follow the NB Law more or less closely (the population of municipalities, microorganisms in a culture, spread of internet contents, capital growth, wages and prices, loss of value over time, street addresses [12], page numbers in literature citations [14]). Biau [15] analyses the discrepancies between the NB Law and first-digit frequencies in data of turbulent flows through Shannon's entropy.
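This asymptotic behaviour can be checked numerically. The sketch below (an illustration, not taken from the manuscript) samples an exponentially growing quantity at equally spaced times over one decade of growth, using the growth constant of Appendix A, and compares the leading-digit frequencies with log10(1 + 1/z):

```python
import math
from collections import Counter

N = 100_000
counts = Counter()
for k in range(N):
    t = 40 * k / N                 # equally spaced times over 40 "years"
    s = math.exp(0.057565 * t)     # grows from 1 to (almost) 10
    counts[int(s)] += 1            # s stays in [1, 10), so int(s) is its first digit

# Empirical frequency vs. the Newcomb-Benford value log10(1 + 1/z):
for z in (1, 9):
    print(z, counts[z] / N, round(math.log10(1 + 1 / z), 5))
```

The empirical frequencies agree with the NB values to within the sampling resolution.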
For a more general approach to the problem of first digits' probabilities, one starts from the density function

D(x) = A x^(-p).    (3)

The probability W(a,b) of the object exhibiting a size between a and b then is

W(a,b) = ∫_a^b D(x) dx / ∫_1^10 D(x) dx.    (4)

In the case of exponential growth of a certain object, x means the time and p equals 1; integration then yields W(a,b) = log10(b/a) and, with subsequent integers a = z, b = z + 1, the NB Law W(z) = log10(1 + 1/z).
The bars in Figure 1 refer to subsequent integers a, b, defining a "z-interval". Equations (3) and (4) are scale invariant, i.e., one would obtain the same power law and the same ratio of 1s to 9s in the first digital place if one would use other units to measure the objects' properties, as long as one chooses comparable intervals (e.g., one decade).

Entropy of the Newcomb-Benford Distribution
Similar to the letters of an alphabet, the probability of occurrence can be determined for the first digits of a data set following the Newcomb-Benford Law. In Table 1, the values of W(z), 1 ≤ z ≤ 9, are listed, together with the probability of occurrence of a particular digit in the first position of numbers. From these probabilities, Equation (1) yields the entropy of the first digits' distribution, H (NB Law) = 2.8762 bits per digit. This is the average information of a single first digit. If the leading digits were evenly distributed (all p_i = 1/9), the entropy would amount to H = -9 · (1/9) ld (1/9) = ld 9 = 3.1699 bpd.
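Both values can be reproduced from the NB probabilities W(z) = log10(1 + 1/z); a short numerical sketch:

```python
import math

nb = [math.log10(1 + 1 / z) for z in range(1, 10)]   # Newcomb-Benford W(z)
H_nb = -sum(p * math.log2(p) for p in nb)            # Equation (1)
H_uniform = math.log2(9)                             # evenly distributed digits
print(f"{H_nb:.3f} {H_uniform:.4f}")                 # -> 2.876 3.1699
```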

Non-NB Distributions
In case the first digits' distribution does not follow the NB Law (Figure 2), it can be derived from the density function D(x) = A x^(-p) with p ≠ 1. The proportion of numbers falling into the range between x_1 and x_2 is calculated from

W(x_1, x_2) = (x_1^(1-p) - x_2^(1-p)) / (1 - 10^(1-p)).    (5)

This law is scale invariant. In case x_1 and x_2 are chosen as subsequent integers, W(x_1, x_2) = W(z) is in proportion to the numbers starting with the digit x_1 = z. From the probabilities of the first digits, the average entropy of a digit can be determined (Equation (1)).
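Equation (5) can be evaluated for any exponent p ≠ 1; the sketch below (an illustration under the one-decade normalization stated above) returns the nine first-digit probabilities:

```python
def first_digit_probs(p):
    """First-digit probabilities W(z), Equation (5), for the density
    D(x) = A*x**(-p), p != 1, normalized over one decade 1 <= x < 10."""
    norm = 1 - 10 ** (1 - p)
    return [(z ** (1 - p) - (z + 1) ** (1 - p)) / norm for z in range(1, 10)]

w = first_digit_probs(2.0)
print(round(sum(w), 6), round(w[0] / w[8], 1))  # -> 1.0 45.0
```

For p = 2, the probabilities telescope to (1/z - 1/(z+1))/0.9, so the 1s outnumber the 9s by a factor of exactly 45.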

Fragments of Marble
The distribution of the first digits of fractions of minerals occasionally deviates strongly from the NB Law [16]. Six samples of marble (irregularly shaped plates of 10 mm thickness, weighing between 4.70 and 51.69 g) were crushed in a hydraulic press, and the weights of 1052 fragments between 10 and 99 mg were taken. Fragments below 10 and above 99 mg have been omitted. In Table 2, the numbers of fragments sharing the same first digit are listed. From the probabilities p_i, the value of the entropy H = 2.0370 has been found. From a fit of Equation (5) (red curve in Figure 3), the value p = 2.003(25) has been determined. Assuming the distribution follows this function exactly, the statistical weights and the first digits' probabilities can be obtained from Equation (5) with x_1 = a and x_2 = b, where a, b are the subsequent integers defining the z-intervals. In this case, a 1 (a = 1, b = 2) is expected to occur as a leading digit 45.27 times more often than a 9 (a = 9, b = 10). For statistical reasons, the observed ratio is different (58.8). It is, e.g., closer for the ratio of the 1s to the 7s (calculated: 29.8; observed: 30.9).
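As a check, the expected 1-to-9 ratio can be recomputed from Equation (5) with the fitted exponent (a sketch; small rounding differences against the quoted value 45.27 remain):

```python
p = 2.003                            # fitted exponent for the marble fragments

def w(a, b):
    """Equation (5) for the z-interval between the integers a and b."""
    return (a ** (1 - p) - b ** (1 - p)) / (1 - 10 ** (1 - p))

print(round(w(1, 2) / w(9, 10), 2))  # -> 45.26
```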
Why Is the Number 1 So Strong in the Majority of Cases?
It is surprising that, when it comes to the weight of the fragments, the number 1 occurs even more often as a leading digit than expected from the NB Law. To answer this question, one has to determine the probability of the weight of a fragment exhibiting a certain first digit. Table 3 shows the first digit obtained when the sample is divided into the number of equal-sized fragments given in the left column. For example, cutting a mass of 81,475 g (a weight chosen arbitrarily) into 815 pieces of equal size, one finds that the weight of each piece starts with a 9. This holds up to 905 fragments. Dividing the sample into smaller parts (between 906 and 1018 fragments) results in an 8 in the first position. Between 4074 and 8147 fragments of equal size, the leading digit will be a 1. Although the sample will probably never break into pieces of equal size, there is at least a high probability that the weight of a majority of the fragments will start with a 1. A 9 will be rare. Table 4 gives the probabilities of occurrence of the first digits.
From Table 5, one can see that, in this example, the ratio of 1s to 9s is 45,265/1006 = 44.995. Starting from samples with different weights produces quite similar results. Figure 4 gives the distribution of probabilities among the z-intervals. From a fit of the probability function W(z) (Equation (5)), a value of p = 2.00003(4) is obtained for the density function exponent. Table 3 also shows that there is a kind of periodicity in the occurrence of a certain first digit when the sample is divided into more and more pieces: inspecting the table from top to bottom, one finds that the number of divisions yielding a particular first digit increases approximately by a factor of 10.
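The counting behind Tables 3 and 5 can be reproduced in a few lines: divide the total mass by every possible number of equal pieces and record the leading digit of the piece weight. This is a sketch for the arbitrary 81,475 g sample; the cutoff (pieces no smaller than one gram) is an assumption chosen to match the totals quoted above:

```python
from collections import Counter

TOTAL = 81_475                       # sample mass in g (arbitrary, as above)
counts = Counter()
for n in range(1, TOTAL + 1):        # divide the sample into n equal pieces
    piece = TOTAL / n                # weight of one piece
    counts[f"{piece:.8e}"[0]] += 1   # leading digit of the piece weight

print(counts['1'], counts['9'], round(counts['1'] / counts['9'], 3))
```

The resulting ratio of 1s to 9s is close to 45, in line with Table 5.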

Fragments of Calcite
From two samples of calcite, 775 fragments between 10 and 99 mg were obtained. Table 6 and Figure 5 give the distribution with respect to the first digits of their weights. A value for the entropy of H = 2.1064 is obtained. The exponent of the density function amounts to p = 1.966(36). From the fitted function, the probabilities W(z = 1) and W(z = 9) have been calculated; their ratio is 42.12. From the counted numbers of fragments, a ratio of 423/7 = 60.43 is obtained.

Fragments of Granite
Samples of granite (13 square discs, 34 × 34 × 9.5 mm³, 28 g each, with 3 more samples, slightly larger) were crushed in a hydraulic press. The weights of 3849 fragments between 10 mg and 99 mg were taken. The distribution is given in Figure 6 as well as in Table 7. From this, a value of H = 2.1817 bits per digit has been found.

From a fit of Equation (5) (red curve in Figure 6, which shows the probabilities of the first digits of the weights of the 3849 granite fragments), the value p = 1.748(81) has been determined. Assuming the distribution follows this function exactly, the statistical weights and the first digits' probabilities can be obtained from Equation (5) with subsequent integers as limits. From this, a 1 is expected to occur as a leading digit 27.62 times more often than a 9; the counted numbers of fragments give a larger ratio (1923/51 = 37.71).
The distribution of the first digits would be close to the same power law when fragments are selected in the comparable 10-fold range of 1.0 × 10 −4 ounces to 9.9 × 10 −4 ounces.

Prices of Shares and Equity Funds
Equation (5) can be used to obtain a p-value from share prices. On a certain day in 2022, a listing of 397 share prices was taken (including stock indices and funds); Table 8 exhibits the distribution of the first digits shown in Figure 7 (prices in €). From the frequencies of the first digits, the entropy has been determined to be H = 2.841 bpd. A fit of Equation (5) resulted in p = 1.045(74). Usually, for share prices, the distribution of the first digits deviates only slightly from the NB Law [17]. According to the fitted function, a 1 occurs as a first digit 7.17 times as frequently as a 9.

Craters on Venus
The first digits of the diameters of 874 craters on Venus [18] produced an H value of 2.764 bits per digit (Table 9). Figure 8 shows the distribution. According to the fitted function (p = 1.238(73)), a 1 should be 10.36 times as frequent as a 9.

Results
The entropy of most of the first digits' distributions in listings encountered in daily life exhibits a value around H = 2.9 bits per digit, while in some cases (diameters of craters on the planet Venus, weights of mineral fragments), considerably lower values have been found. Table 10 gives the results. The largest deviation of the entropy from the NB distribution was observed for the weight of fragments of some minerals. Whether this is due to material constants or whether it can be attributed to the experimental conditions has to be left to future investigations.
Funding: This research received no external funding.

Conflicts of Interest:
The author declares no conflict of interest.

Appendix A. First Digits' Distribution of Exponentially Growing Objects
Suppose an object starts growing at size S = 1 according to the function

S = exp[0.057565 t].    (A1)

The constant 0.057565 = ln 10/40 has been chosen such that the first object reaches size S = 10 after 40 years. Figure A1 shows the sizes of objects starting to grow at different times: the first at t = 0, the others following at intervals of four years. After 40 years, one will find more objects whose size starts with a 1 than with a 9.
This can be shown in the following way: from the functions n = 0.25 t + 1 (n = number of objects; t = time in years) and S = exp[0.057565 t], one obtains dn = 0.25 dt and dS = 0.057565 exp[0.057565 t] dt = 0.057565 S dt. Hence

dn = (0.25/0.057565) dS/S = 4.343 dS/S,

and the number n of objects within a certain size range is

n = 4.343 ∫ dS/S (from S_1 to S_2) = 4.343 (ln S_2 - ln S_1).    (A2)

After 40 years, in the limiting case of an infinite number of objects, 30.102% of them will exhibit a size between 1 and 2 units, but only 4.576% between 9 and 10 units, their ratio being 6.58, corresponding to the NB Law (Figure 1).
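These percentages follow directly from Equation (A2), since the constant 4.343 cancels when the fraction of objects in one z-interval is divided by the total over all sizes between 1 and 10; a numerical sketch:

```python
import math

c = 0.25 / 0.057565                  # the constant 4.343 in Equation (A2)
total = c * math.log(10)             # all objects, sizes between 1 and 10
frac = [c * math.log((z + 1) / z) / total for z in range(1, 10)]

# prints: 30.103% between 1 and 2, 4.576% between 9 and 10, ratio 6.58
print(f"{100 * frac[0]:.3f}% between 1 and 2, "
      f"{100 * frac[8]:.3f}% between 9 and 10, "
      f"ratio {frac[0] / frac[8]:.2f}")
```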