Some New Tests of Conformity with Benford’s Law

Cerqueti, Roy; Lupi, Claudio

doi:10.3390/stats4030044

by Roy Cerqueti ¹,† and Claudio Lupi ²,*

¹ Department of Social and Economic Sciences, Sapienza University of Rome, P.le Aldo Moro 5, I-00185 Rome, Italy

² Department of Economics, University of Molise, Via De Sanctis snc, I-86100 Campobasso, Italy

* Author to whom correspondence should be addressed.

† London South Bank University Business School, 103 Borough Road, London SE1 0AA, UK.

Stats 2021, 4(3), 745-761; https://doi.org/10.3390/stats4030044

Submission received: 19 July 2021 / Revised: 1 September 2021 / Accepted: 2 September 2021 / Published: 6 September 2021

(This article belongs to the Special Issue Benford's Law(s) and Applications)


Abstract:

This paper presents new perspectives and methodological instruments for verifying the validity of Benford’s law for a large given dataset. To this aim, we first propose new general tests for checking the statistical conformity of a given dataset with a generic target distribution; we also provide the explicit representation of the asymptotic distributions of the relevant test statistics. Then, we discuss the applicability of such novel devices to the case of Benford’s law. We implement extensive Monte Carlo simulations to investigate the size and the power of the introduced tests. Finally, we discuss the challenging theme of interpreting, in a statistically reliable way, the conformity between two distributions in the presence of a large number of observations.

Keywords:

Benford’s law; conformity tests; goodness-of-fit tests; Monte Carlo simulation; size–power graphs

MSC:

60E05; 62F03

1. Introduction

Data regularities are relevant properties of many datasets whose elements maintain their individuality while creating a unified framework. One of the most illustrative examples of such statistical features is that of Benford’s law, introduced in [1] and successfully tested and described in [2]. Benford’s law is a sort of magic rule, for which the first digit(s) of the elements of a given dataset follow a specific distribution—hereafter called Benford’s distribution. For all the details on such a law, we refer the interested reader to [3,4,5,6].

Benford’s law is not at all intuitive; however, over the years, long after Frank Benford’s paper appeared (see [2]), several solid theoretical motivations and explanations have been found, mathematically validating the phenomenon (see, among others, refs. [4,5,7,8,9,10,11,12]). Surprisingly, this digital pattern holds true in a large number of cases, with datasets in the fields of economics (e.g., [13,14,15,16]), accounting (e.g., [17,18,19]), finance (e.g., [20,21,22,23,24,25]), geophysics and hydrology (e.g., [26,27,28]), as well as social sciences (e.g., [29,30,31]).

A methodological aspect of Benford’s law lies in how to test the compliance of the empirical distribution of a given sample with Benford’s variable. The root of such an issue lies in the definition of a statistical distance between two random variables, the most popular being the chi-square and the mean absolute deviation (MAD).

This paper deals with this challenging research theme. Specifically, we advance herein some new tests for verifying the compliance of the empirical distribution of a given sample with a target distribution. In this respect, we mention the recent contribution [32], where the authors suggested a statistical test based on the mean. Following the quoted paper, we start by introducing a mean-based conformity test. We then develop a variance-based test and a joint mean- and variance-based test for verifying the compliance of a given distribution with a target one. Furthermore, we present a test based on Wald’s statistic and a new version of a MAD-based test. We derive the asymptotic distributions of the proposed test statistics and pay specific attention to their size and power, which are investigated through a large set of Monte Carlo simulations.

We also focus on the so-called “excess of power problem”. In this respect, we mention Kossovsky’s criticism ([12], in this Special Issue), where the author refers to the “mistaken use of the Chi-Square test in Benford’s law”. From this perspective, we also mention the theme of the selection of the critical thresholds for having perfect/marginal/acceptable conformity with Benford’s law (see [3,4] and the recent study developed by [33]). Finally, the “excess of power problem” is treated using resampling techniques.

The rest of the paper is organised as follows: the next Section introduces the new tests and derives their asymptotic null distribution; Section 3 illustrates the extensive Monte Carlo analysis carried out to investigate the size and power properties of the proposed tests in the relevant cases of the first digit and first two digits Benford’s law; the “excess of power problem” is addressed in Section 4; the last Section draws some conclusions. An Appendix reports some further technical details.

2. New Tests of Conformity with Benford’s Law

In this Section, we report the analytical derivations of the new test statistics of conformity to Benford’s law and their asymptotic distributions.

Proposition 1.

Consider a random sample $x_{1}, \dots, x_{n}$ from a population with mean $μ$, variance $σ^{2}$, and third and fourth central moments $μ_{3}$ and $μ_{4}$. All moments up to the fourth are assumed to be finite. Let $\bar{x}_{n}$ and $s_{n}^{2}$ be the sample mean and the sample variance, respectively. Then:

\tilde{x}_{n} := \frac{\sqrt{n} (\bar{x}_{n} - μ)}{σ} \overset{d}{\to} N(0, 1);

(1)

\tilde{s}_{n}^{2} := \frac{\sqrt{n} (s_{n}^{2} - σ^{2})}{\sqrt{μ_{4} - σ^{4}}} \overset{d}{\to} N(0, 1);

(2)

n_{μσ} := \frac{\tilde{x}_{n} + \tilde{s}_{n}^{2}}{\left[ 2 \left( 1 + \frac{μ_{3}}{σ \sqrt{μ_{4} - σ^{4}}} \right) \right]^{\frac{1}{2}}} \overset{d}{\to} N(0, 1);

(3)

\begin{aligned} w_{μσ} &:= (\tilde{x}_{n}, \tilde{s}_{n}^{2}) \begin{pmatrix} 1 & \frac{μ_{3}}{σ \sqrt{μ_{4} - σ^{4}}} \\ \frac{μ_{3}}{σ \sqrt{μ_{4} - σ^{4}}} & 1 \end{pmatrix}^{-1} \begin{pmatrix} \tilde{x}_{n} \\ \tilde{s}_{n}^{2} \end{pmatrix} \\ &= z_{n}' \Sigma^{-1} z_{n} \overset{d}{\to} χ^{2}(2), \end{aligned}

(4)

with $z_{n}'$ denoting the transpose of $z_{n}$.

Proof.

To prove (1) and (2) see, e.g., [34] (Theorem 10.1).

To prove (3), first note that:

cov(\tilde{x}_{n}, \tilde{s}_{n}^{2}) = \frac{\sqrt{n}}{σ} \frac{\sqrt{n}}{\sqrt{μ_{4} - σ^{4}}} cov(\bar{x}_{n}, s_{n}^{2})

(5)

= \frac{n}{σ \sqrt{μ_{4} - σ^{4}}} \frac{μ_{3}}{n}

(6)

= \frac{μ_{3}}{σ \sqrt{μ_{4} - σ^{4}}}

(7)

where $cov(\bar{x}_{n}, s_{n}^{2}) = μ_{3} / n$ (see [35]); (3) then follows from (1) and (2) and from the rule for the variance of a sum of correlated random variables.

Let us now define $z_{n} := (\tilde{x}_{n}, \tilde{s}_{n}^{2})'$. The Cramér–Wold device implies that $z_{n}$ is (asymptotically) multivariate normal if $λ' z_{n}$ is (asymptotically) univariate normal for every $λ \in \mathbb{R}^{2}$. However, every $λ \in \mathbb{R}^{2}$ defines a linear combination of two (asymptotically) normal variables and $λ' z_{n}$ is trivially (asymptotically) univariate normal. Therefore:

z_{n} = \begin{pmatrix} \tilde{x}_{n} \\ \tilde{s}_{n}^{2} \end{pmatrix} \overset{d}{\to} N \left[ \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 & \frac{μ_{3}}{σ \sqrt{μ_{4} - σ^{4}}} \\ \frac{μ_{3}}{σ \sqrt{μ_{4} - σ^{4}}} & 1 \end{pmatrix} \right] \equiv N(0, \Sigma)

(8)

and (4) follows. □

Remark 1.

The results stated in Proposition 1 can be used to test conformity (goodness of fit) with any given distribution with finite moments up to the fourth. If $μ$, $σ$, $μ_{3}$, and $μ_{4}$ are those of Benford’s distribution, then Equation (1) can be used to build a conformity test based on the mean: such a test has indeed recently been suggested by Hassler and Hosseinkouchack [32]. Equation (2) is the basis for a normal conformity test based on the variance, whereas (3) can be used to build a normal conformity test jointly based on the mean and the variance. Finally, (4) is a chi-square conformity test which jointly considers the mean and the variance.
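As an illustration of Remark 1, the following Python sketch evaluates statistics (1)–(4) with the moments of Benford’s first-digit distribution plugged in. It is only a minimal sketch under stated assumptions, not the authors’ code (the paper’s computations were carried out in R): the function names are hypothetical, the sample variance uses the $n - 1$ divisor, and the two-sided normal p-value is one possible way of turning (1)–(3) into a decision rule.

```python
import math

def benford_probs(first=1, last=9):
    """Benford probabilities log10(1 + 1/d) for d = first..last."""
    return {d: math.log10(1 + 1 / d) for d in range(first, last + 1)}

def central_moments(probs):
    """Mean, variance, third and fourth central moments of a discrete law."""
    mu = sum(d * p for d, p in probs.items())
    def moment(r):
        return sum((d - mu) ** r * p for d, p in probs.items())
    return mu, moment(2), moment(3), moment(4)

def moment_tests(digits, probs):
    """Statistics (1)-(4) of Proposition 1 for a sample of digits (n >= 2)."""
    n = len(digits)
    mu, var, mu3, mu4 = central_moments(probs)
    xbar = sum(digits) / n
    s2 = sum((x - xbar) ** 2 for x in digits) / (n - 1)  # assumption: n - 1 divisor
    x_t = math.sqrt(n) * (xbar - mu) / math.sqrt(var)             # Equation (1)
    s_t = math.sqrt(n) * (s2 - var) / math.sqrt(mu4 - var ** 2)   # Equation (2)
    rho = mu3 / (math.sqrt(var) * math.sqrt(mu4 - var ** 2))      # Equation (7)
    n_ms = (x_t + s_t) / math.sqrt(2 * (1 + rho))                 # Equation (3)
    # Equation (4): z' Sigma^{-1} z with Sigma = [[1, rho], [rho, 1]]
    w_ms = (x_t ** 2 - 2 * rho * x_t * s_t + s_t ** 2) / (1 - rho ** 2)
    return {"mean": x_t, "variance": s_t, "normal_joint": n_ms, "chi2_joint": w_ms}

def normal_two_sided_p(z):
    """Two-sided p-value under N(0, 1)."""
    return math.erfc(abs(z) / math.sqrt(2))

def chi2_2df_p(w):
    """Upper-tail p-value under chi-square(2), whose survival function is exp(-w/2)."""
    return math.exp(-w / 2)
```

The quadratic form in (4) is written out explicitly because the inverse of a 2 × 2 correlation matrix has a simple closed form, which avoids any matrix library.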

Remark 2.

When conformity is tested with reference to the normal distribution, (4) is simplified because $μ_{3} = 0$. Indeed, under normality, the sample mean and the sample variance are independent, whereas they are not independent random variables for any other distribution, as can be seen in [36].

Proposition 2.

Consider a random sample $x_{1}, \dots, x_{n}$ from a discrete random variable with $k ≪ n$ classes with individual probabilities $p := (p_{1}, \dots, p_{k})'$, with $p_{j} \neq 0$ $\forall j \in \{1, \dots, k\}$. Let $f_{n} := (f_{n1}, \dots, f_{nk})'$ be a consistent estimate of $p$ and define $e_{n} := (e_{n1}, \dots, e_{nk})' = f_{n} - p$ and $\Sigma := diag(p) - p p'$. Then:

w := n {e_{n}^{*}}' {\Sigma^{*}}^{-1} e_{n}^{*} \overset{d}{\to} χ^{2}(k - 1)

(9)

where $e_{n}^{*} := (e_{n1}, \dots, e_{n,k-1})'$ and $\Sigma^{*}$ is made of the first $k - 1$ rows and columns of $\Sigma$. Furthermore:

MAD^{⋆} := \frac{\sqrt{n}}{k} \sum_{j=1}^{k} \frac{|f_{nj} - p_{j}|}{\sqrt{p_{j} (1 - p_{j})}} \overset{d}{\to} N \left( \sqrt{\frac{2}{π}}, \frac{1}{k^{2}} \sum_{i=1}^{k} \sum_{j=1}^{k} r_{ij} \right)

(10)

where:

r_{ij} = \frac{2}{π} \left( ρ_{ij} \arcsin(ρ_{ij}) + \sqrt{1 - ρ_{ij}^{2}} \right) - \frac{2}{π}

(11)

and, for $i \neq j$ (with $ρ_{jj} = 1$):

ρ_{ij} = - \sqrt{\frac{p_{i} p_{j}}{(1 - p_{i}) (1 - p_{j})}} .

(12)

Proof.

To prove (9), let $Y_{ij} := 1_{X_{i} = j}$ with $1_{κ}$ being the indicator function which is equal to 1 when condition $κ$ is satisfied and 0 otherwise. Furthermore, $Y_{ij} \sim Bern(p_{j})$ and $S_{nj} := \sum_{i=1}^{n} Y_{ij} \sim Binom(n, p_{j})$, with mean $n p_{j}$ and variance $n p_{j} (1 - p_{j})$. Then:

\begin{aligned} e_{nj}^{†} := \frac{S_{nj} - n p_{j}}{\sqrt{n p_{j} (1 - p_{j})}} &= \frac{\sqrt{n} \left( \frac{S_{nj}}{n} - p_{j} \right)}{\sqrt{p_{j} (1 - p_{j})}} \\ &= \frac{\sqrt{n} (f_{nj} - p_{j})}{\sqrt{p_{j} (1 - p_{j})}} \\ &= \frac{\sqrt{n} e_{nj}}{\sqrt{p_{j} (1 - p_{j})}} \overset{d}{\to} N(0, 1) \end{aligned}

(13)

by the central limit theorem. Furthermore, the covariance matrix of $\sqrt{n} e_{n}$ is $\Sigma = diag(p) - p p'$, as can be seen in [37]. Again invoking the Cramér–Wold device, $\sqrt{n} e_{n} \overset{d}{\to} N(0, \Sigma)$ and (9) is a Wald-like statistic with a $χ^{2}(k - 1)$ limiting distribution under the null [38] (p. 71).

To prove (10), we exploit the fact that if $Y \sim N(0, 1)$, then (see [36]):

E|Y| = \sqrt{\frac{2}{π}} .

(14)

Furthermore:

\begin{aligned} var(|Y|) &= E(|Y|^{2}) - [E(|Y|)]^{2} \\ &= \sqrt{\frac{2}{π}} \int_{0}^{\infty} y^{2} e^{-\frac{y^{2}}{2}} dy - \frac{2}{π} \\ &= 1 - \frac{2}{π} . \end{aligned}

(15)

Therefore, by (13):

\sqrt{n} e_{nj}^{⋆} := \frac{\sqrt{n} |f_{nj} - p_{j}|}{\sqrt{p_{j} (1 - p_{j})}} \overset{d}{\to} N \left( \sqrt{\frac{2}{π}}, 1 - \frac{2}{π} \right) .

(16)

Furthermore, $\sqrt{n} e_{n}^{⋆} := \sqrt{n} (e_{n1}^{⋆}, \dots, e_{nk}^{⋆})' \overset{d}{\to} N \left( ı \sqrt{\frac{2}{π}}, R \right)$ by the Cramér–Wold device, with $ı$ a k-vector of ones.

We now use the fact that when $(X, Y)$ have a bivariate normal distribution with means 0, variances 1 and correlation $θ$, then [39]:

E(|X| |Y|) = \frac{2}{π} \left( θ \arcsin(θ) + \sqrt{1 - θ^{2}} \right)

(17)

and therefore:

E(|e_{ni}^{†}| |e_{nj}^{†}|) = \frac{2}{π} \left( ρ_{ij} \arcsin(ρ_{ij}) + \sqrt{1 - ρ_{ij}^{2}} \right)

(18)

where $ρ_{ij}$ is the correlation between $e_{ni}^{†}$ and $e_{nj}^{†}$:

ρ_{ij} = - \sqrt{\frac{p_{i} p_{j}}{(1 - p_{i}) (1 - p_{j})}} .

(19)

Then, note that:

\begin{aligned} cov(|e_{ni}^{†}|, |e_{nj}^{†}|) &= E(|e_{ni}^{†}| |e_{nj}^{†}|) - E(|e_{ni}^{†}|) E(|e_{nj}^{†}|) \\ &= \frac{2}{π} \left( ρ_{ij} \arcsin(ρ_{ij}) + \sqrt{1 - ρ_{ij}^{2}} \right) - \frac{2}{π} . \end{aligned}

(20)

Therefore, the covariance matrix $R$ is:

R = \begin{pmatrix} r_{11} & r_{12} & \dots & r_{1k} \\ r_{12} & r_{22} & \dots & r_{2k} \\ \vdots & \vdots & \ddots & \vdots \\ r_{1k} & r_{2k} & \dots & r_{kk} \end{pmatrix} = \{r_{ij}\}

(21)

with:

r_{ij} = \frac{2}{π} \left( ρ_{ij} \arcsin(ρ_{ij}) + \sqrt{1 - ρ_{ij}^{2}} \right) - \frac{2}{π} .

(22)

Finally:

\frac{\sqrt{n}}{k} \sum_{j=1}^{k} e_{nj}^{⋆} = \frac{1}{k} \sum_{j=1}^{k} \frac{\sqrt{n} |f_{nj} - p_{j}|}{\sqrt{p_{j} (1 - p_{j})}} \overset{d}{\to} N \left( \sqrt{\frac{2}{π}}, \frac{1}{k^{2}} ı' R ı \right) .

(23)

□

Remark 3.

The results stated in Proposition 2 can be used to test conformity (goodness of fit) with any given discrete distribution and specialise to the first digit or first two digits Benford’s law when $p_{d} = \log_{10}(1 + 1/d)$, with either $d = 1, \dots, 9$ or $d = 10, \dots, 99$. Here, (9) is a Wald-like test, whereas (10) is a modification of the mean absolute deviation (MAD) statistic advocated in [3,40], where each absolute deviation is adjusted by the factor $1 / \sqrt{p_{j} (1 - p_{j})}$, thereby emphasising deviations from smaller expected frequencies, as well as incorporating (the square root of) the sample size n as a factor in the measure of deviation.
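The adjusted MAD statistic in (10) and the parameters of its limiting distribution can be computed directly from the observed counts. The sketch below is a hypothetical helper, not the authors’ implementation; it assumes that the diagonal terms of $R$ are obtained with $ρ_{jj} = 1$, which gives $r_{jj} = 1 - 2/π$ in agreement with (15), and it returns the statistic together with the asymptotic mean and variance so that the reader can standardise it as needed.

```python
import math

def adjusted_mad(counts, probs):
    """MAD* of Equation (10) plus the mean and variance of its limiting normal law.

    counts: observed counts per class; probs: null probabilities (same order).
    """
    n = sum(counts)
    k = len(probs)
    stat = math.sqrt(n) / k * sum(
        abs(c / n - p) / math.sqrt(p * (1 - p)) for c, p in zip(counts, probs)
    )

    def r(i, j):
        if i == j:
            # Assumption: rho_jj = 1, so r_jj = var(|N(0,1)|) = 1 - 2/pi, Equation (15)
            return 1 - 2 / math.pi
        # Equation (12): correlation between standardised frequencies i and j
        rho = -math.sqrt(probs[i] * probs[j] / ((1 - probs[i]) * (1 - probs[j])))
        root = math.sqrt(max(0.0, 1 - rho ** 2))  # guard against rounding below zero
        return 2 / math.pi * (rho * math.asin(rho) + root) - 2 / math.pi  # Equation (11)

    variance = sum(r(i, j) for i in range(k) for j in range(k)) / k ** 2
    return stat, math.sqrt(2 / math.pi), variance
```

A standardised value can then be formed as `(stat - mean) / math.sqrt(variance)`; whether to reject only for large values is a choice left to the user here.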

Remark 4.

The Wald-like $χ^{2}$ statistic in (9) is equivalent to the usual $χ^{2}$ computed as $n \sum_{j=1}^{k} e_{nj}^{2} / p_{j}$. A proof, which also proves that $\Sigma^{*}$ is nonsingular, is offered in Appendix A.
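The equivalence stated in Remark 4 can also be checked numerically. The following sketch, with hypothetical function names and arbitrary illustrative counts, builds $\Sigma^{*}$ explicitly, solves the linear system by plain Gaussian elimination (so that no closed-form inverse is presupposed), and compares the Wald form (9) with the classical Pearson statistic.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def wald_chi2(counts, probs):
    """Equation (9): n e*' (Sigma*)^{-1} e*, using the first k - 1 classes."""
    n = sum(counts)
    k = len(probs)
    e = [counts[j] / n - probs[j] for j in range(k - 1)]
    sigma = [
        [(probs[i] if i == j else 0.0) - probs[i] * probs[j] for j in range(k - 1)]
        for i in range(k - 1)
    ]
    y = solve(sigma, e)  # y = (Sigma*)^{-1} e*
    return n * sum(ei * yi for ei, yi in zip(e, y))

def pearson_chi2(counts, probs):
    """Classical chi-square: sum of (observed - expected)^2 / expected."""
    n = sum(counts)
    return sum((c - n * p) ** 2 / (n * p) for c, p in zip(counts, probs))
```

With Benford’s first-digit probabilities and any vector of nine counts, the two functions are expected to agree up to floating-point rounding.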

Remark 5.

Equation (10) makes it clear that, contrary to what is commonly asserted, as can be seen in, e.g., [3] (p. 158), the $MAD$ statistic:

MAD := \frac{1}{k} \sum_{j=1}^{k} |f_{nj} - p_{j}|

(24)

is not independent of n and is, in fact, $O_{p}(n^{-\frac{1}{2}})$.
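To make the order of magnitude in Remark 5 concrete, one can combine (13) and (14): under the null, $E|f_{nj} - p_{j}| \approx \sqrt{2/π} \sqrt{p_{j}(1 - p_{j})/n}$ for large n. The short sketch below averages this approximation over the classes. It is an illustrative large-sample approximation under that assumption, with a hypothetical function name, and not a result stated in the paper.

```python
import math

def expected_mad_under_null(probs, n):
    """Large-n approximation to E[MAD] when the data follow `probs`.

    Uses E|f_nj - p_j| ~ sqrt(2/pi) * sqrt(p_j (1 - p_j) / n), from (13)-(14).
    """
    k = len(probs)
    scale = math.sqrt(2 / (math.pi * n))
    return scale * sum(math.sqrt(p * (1 - p)) for p in probs) / k
```

Quadrupling n halves the returned value, which is the $1/\sqrt{n}$ behaviour that a fixed MAD threshold ignores.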

3. Monte Carlo Simulations

The size (the probability of falsely rejecting the null hypothesis) and power (the ability of the test to reject the null when it is false) of the proposed tests are investigated over 25,000 Monte Carlo replications for varying sample sizes n, under the null and under selected interesting alternatives (all computations and graphics were produced using R, version 4.0.5 [41] and ggplot2, version 3.3.3 [42]). Each alternative is expressed in terms of the mixture:

p = λ p_{B} + (1 - λ) p_{A}

(25)

where $p_{B} := (p_{B1}, \dots, p_{Bk})'$ is the vector of Benford’s probabilities, $p_{A} := (p_{A1}, \dots, p_{Ak})'$ is the vector of probabilities of some “contaminating” distribution, $k$ is the number of digits, and $λ \in \{0.75, 0.80, \dots, 0.95\}$ is the mixing parameter. When dealing with data manipulation issues, $1 - λ$ can be interpreted as the fraction of manipulated data.

The following mixtures were used in the simulations:

Uniform mixture: $p_{A}$ describes the discrete uniform distribution with the same support as the considered Benford’s distribution;
Normal mixture: $p_{Ai}$ are the probabilities of $N(μ_{B}, σ^{2})$, with $μ_{B}$ the mean of Benford’s distribution and $σ = \sqrt{4 μ_{B}}$;
Randomly perturbed mixture: Benford’s law is perturbed by a random quantity in correspondence to each digit. More precisely, $p_{Ai} = u_{i} p_{Bi}$ with $u_{i} \sim U(0, 2 p_{Bi})$. Since this mixture contains elements of randomness, each Monte Carlo iteration uses a different mixture. However, the mixtures are the same across all tests;
Under-reporting mixture: under the alternative, Benford’s distribution is modified by putting to zero the probability of “round” numbers and giving this probability to the preceding number: for example, $p_{A20} = 0$ and $p_{A19} = p_{B19} + p_{B20}$. This mixture is only considered with reference to the first two digits case.

The above mixtures are plotted in Figure 1 for the first two-digit case. The corresponding data for each mixture are generated from a multinomial distribution with probability vector $p$. In order to reduce Monte Carlo variability, all tests were applied to the same data, and larger samples include the observations from the smaller ones.
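The data-generating step described above can be sketched in a few lines of Python. This is a hedged illustration rather than the authors’ R code: all function names are hypothetical, only the uniform and under-reporting alternatives are shown, the “round” numbers are assumed to be 20, 30, ..., 90 (10 has no preceding number in the support), and nested samples are obtained by taking prefixes of a single large draw.

```python
import math
import random

def benford(first, last):
    """Digits first..last (1..9 or 10..99) and their Benford probabilities."""
    digits = list(range(first, last + 1))
    return digits, [math.log10(1 + 1 / d) for d in digits]

def mix(p_b, p_a, lam):
    """Equation (25): p = lam * p_B + (1 - lam) * p_A."""
    return [lam * b + (1 - lam) * a for b, a in zip(p_b, p_a)]

def uniform_alternative(p_b):
    """Discrete uniform distribution on the same support as p_B."""
    return [1 / len(p_b)] * len(p_b)

def underreporting_alternative(digits, p_b):
    """Move the probability of each 'round' number to the preceding number.

    Assumption: 'round' numbers are 20, 30, ..., 90 (first two digits case).
    """
    p_a = list(p_b)
    for i, d in enumerate(digits):
        if d % 10 == 0 and i > 0:
            p_a[i - 1] += p_a[i]
            p_a[i] = 0.0
    return p_a

def nested_samples(digits, probs, sizes, seed=0):
    """Draw the largest sample once; smaller samples are its leading observations."""
    rng = random.Random(seed)
    full = rng.choices(digits, weights=probs, k=max(sizes))
    return {n: full[:n] for n in sizes}
```

Drawing the largest sample once and slicing it is one simple way to guarantee that larger samples contain the smaller ones, so that differences across sample sizes are not driven by fresh random draws.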

Rather than reporting long and difficult-to-compare tables of outcomes, we summarise the simulation results by relying on a graphical approach (as can be seen in, e.g., [43,44]). In order to summarise the size properties of the tests, we plot the size deviations (i.e., actual size − nominal size) against nominal size. When no size distortions are present, actual size = nominal size, and this graph coincides with a horizontal line with the ordinate equal to zero; however, this is a theoretical case only, since in practice, size deviations will tend to reflect experimental randomness. To report power results, we use size–power curves: these curves allow us to easily visualise the power of each test in correspondence of its actual (rather than nominal) size and to compare the power of different tests on perfectly fair grounds. The line power = actual size is also reported as a reference, representing the performance of a test of no practical use (the fraction of rejections under the null and under the alternative is the same); the more distant the size–power curve is from this line, the more powerful the test is.
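The quantities plotted in these graphs can be obtained from two sets of simulated p-values, one generated under the null and one under the alternative. The sketch below uses hypothetical function names and assumes that a test rejects when its p-value does not exceed the nominal level; it returns, for each nominal level, the size deviation together with the (actual size, power) pair that defines a point on the size–power curve.

```python
def rejection_rate(pvalues, alpha):
    """Fraction of replications in which the test rejects at nominal level alpha."""
    return sum(p <= alpha for p in pvalues) / len(pvalues)

def size_power_curve(p_null, p_alt, nominal_levels):
    """Rows of (nominal level, size deviation, actual size, power).

    p_null: p-values simulated under the null hypothesis.
    p_alt:  p-values simulated under the alternative, using the same test.
    """
    rows = []
    for alpha in nominal_levels:
        actual_size = rejection_rate(p_null, alpha)
        power = rejection_rate(p_alt, alpha)
        rows.append((alpha, actual_size - alpha, actual_size, power))
    return rows
```

Plotting the second column against the first gives the size-deviation graph, while plotting the fourth against the third gives the size–power curve.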

3.1. First-Digit Law

The tests generally have very good size properties, irrespective of the sample size, with size deviations of approximately zero (see Figure 2). Only the modified MAD test tends to over-reject slightly under the null (with a +0.01 deviation with respect to nominal size) at the 5% nominal size. In other words, the actual size of the modified MAD test at the 5% nominal size is around 6%, and the discrepancy tends to reduce for larger nominal sizes.

As far as power is concerned, the performance of the different tests depends on the specific alternative hypothesis considered. The normal mean test (1) is the most powerful in the presence of a uniform mixing alternative (Figure 3), followed by the $χ^{2}(2)$ test on the mean and the variance (4) and the normal test on the mean and the variance (3).

In the presence of a normal mixing alternative (Figure 4), the $χ^{2}(2)$ (4) and the normal test on the mean (1) perform the best, followed by the adjusted MAD (10) and the Wald-like $χ^{2}(d - 1)$ test (9).

Finally, in the presence of a perturbed Benford distribution (Figure 5), the highest power is reached by the $χ^{2}(d - 1)$ (9) and the adjusted MAD (10) tests, followed by the $χ^{2}(2)$ test (4).

3.2. First Two Digits Law

All the tests have approximately the correct size, even in the presence of fairly small samples (see Figure 6). All deviations with respect to the nominal size are within $\pm 0.005$, with the sole exception of the ordinary chi-square test, which shows a deviation of around $0.010$ at commonly used nominal sizes for $n = 250$.

As anticipated, the power performance of the tests crucially depends on the alternative. The normal test based on the mean (1) is the most powerful test among those considered here, in the presence of a uniform mixing alternative (see Figure 7). The $χ^{2}(2)$ test on the mean and variance (4) and the normal test on the mean and variance (3) follow at a short distance.

In the presence of a normal mixing alternative (see Figure 8), the $χ^{2}(2)$ test (4) is the most powerful one, followed by the normal variance test (2). It is interesting to note that in the first digit case, the normal variance test had no power; here, the normal mean test has no power. The other tests are generally more powerful in the first two digits than in the first digit case.

When the alternative can be described as a “perturbed Benford” distribution (Figure 9) or in terms of a rounding behaviour (Figure 10), then the $χ^{2}(d - 1)$, either in the “classical” or in the equivalent Wald’s formulation (9), and the modified MAD (10) perform very closely and are by far the most powerful tests. The ordering of the tests is the same as in the first digit case; however, the tests are generally more powerful in the first digit case.

These results suggest that in applications it is generally a good idea not to rely on a single test, but to use a battery of different tests designed to detect particular deviations from the null.

4. Statistical versus Practical Significance

In 1998, Granger [45] (p. 260) pointed out that in the presence of very large datasets:

“Virtually all specific null hypotheses will be rejected using present standards. It will probably be necessary to replace the concept of statistical significance with some measure of economic significance.”

This is obviously related to the fact that the power of any consistent test increases with the sample size n, i.e., $π \to 1$ as $n \to \infty$ (with $π$ denoting the power of the test). Of course, consistency is a desirable property of any statistical test. The symmetrical case, with small n, is somewhat less relevant in empirical applications of Benford’s law where typical sample sizes are large. However, it has been observed that standard conformity tests may substantially lack power in the presence of small sample sizes (see, e.g., [12]). In our context, a large n is required to approximate the asymptotic distributions of the tests.

In fact, the “large n problem” and some related apparently paradoxical implications were already highlighted in a paper by Lindley in 1957 [46]. The idea that a “large n problem” plagues empirical tests of conformity with Benford’s distribution is widespread in the literature on Benford’s law (as can be seen in, e.g., Nigrini’s contributions [3,40] and Kossovsky’s paper in this Special Issue [12]). In fact, Nigrini [3] (p. 158) claims that:

“What is needed is a test that ignores the number of records. The mean absolute deviation (MAD) test is such a test, and the formula is shown in Equation 7.7. [...] There is no reference to the number of records, N, in Equation 7.7.”

However, Nigrini’s statement that the $MAD$ does not depend on the number of observations would only be valid if the relative frequencies of the digits for the data were given, not estimated. The fact that the relative frequencies must be estimated from the observed data makes the $MAD$ dependent on the sample size, despite the sample size not explicitly appearing in the MAD formula. In fact, in Proposition 2, we show that Nigrini’s $MAD$ is $O_{p}(n^{-\frac{1}{2}})$ under Benford’s distribution (see Remark 5 above). Indeed, Figure 11 clearly shows that the behaviour of the estimated $MAD$ is perfectly consistent with $1 / \sqrt{n}$ under the null: therefore, taking a fixed “critical value” for the $MAD$ irrespective of the sample size may lead to biased conclusions.

The risk of rejecting the (Benford’s law) null hypothesis for tiny uninteresting deviations in the presence of large datasets can be dealt with in two different ways: (i) using significance levels $α_{n}$ decreasing with increasing n; and (ii) using a sort of “m out of n bootstrap” procedure [47] to assess significance. In what follows, we explain this second route with specific reference to the “first two digits” case.

If the available sample is very large (e.g., $n > 3000$), then the idea is to repeatedly test for conformity on a large number of smaller samples randomly resampled from the original data. If the observations are independent, identically distributed (IID), then the smaller samples will have the same distribution as the original data, making it possible to check conformity on the smaller datasets. In doing so, we are sacrificing some power in order to only detect “interesting” (or sizeable) departures from the null. The fact that the test statistics are computed over a large number of random sub-samples allows us to derive the distribution of the statistics and not to rely on a single outcome. The whole procedure is exemplified in Figure 12 in the case of data conforming with the “first two digits” Benford’s law (first row in the Figure) as well as for a possibly uninteresting deviation from the null (second row) and a more substantial deviation from the null (third row). In this example, the random subsamples were made of 1750 observations, consistent with Figure 11, which indicates that using 0.0022 as the “critical value” for the $MAD$ jointly with $n = 1750$ ensures an approximate size of 5% for Nigrini’s test. The tests considered are the $MAD$ and those that, according to our simulations, are the most powerful in the presence of a perturbed Benford’s alternative (see Figure 9). The third column (panels C, F, I) of Figure 12 reports the estimated densities of the conventional (or Wald) chi-square test statistic over 5000 random subsamples of length $n = 1750$ (blue curve) along with the $χ^{2}(89)$ null distribution (red). The probability of superiority (a measure of the effect size that corresponds to the probability that a randomly chosen point under the experimental curve is larger than a randomly chosen point under the null curve: see, e.g., [48] (Chapter 11)) is also reported to compare the two distributions.
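The resampling procedure described above can be outlined as follows. The sketch is only indicative and rests on several assumptions that are not taken from the paper: the function names are hypothetical, subsamples are drawn without replacement, the null reference distribution is represented by simulated $χ^{2}(89)$ draws (generated as gamma variates with shape 89/2 and scale 2), and the probability of superiority is estimated by counting pairs, with ties given weight one half.

```python
import bisect
import random

def pearson_chi2(sample, digits, probs):
    """Classical chi-square statistic of a sample of digits against `probs`."""
    n = len(sample)
    counts = {d: 0 for d in digits}
    for x in sample:
        counts[x] += 1
    return sum((counts[d] - n * p) ** 2 / (n * p) for d, p in zip(digits, probs))

def subsample_statistics(data, m, reps, stat, seed=0):
    """Evaluate `stat` on `reps` random subsamples of size m (without replacement)."""
    rng = random.Random(seed)
    return [stat(rng.sample(data, m)) for _ in range(reps)]

def chi2_draws(df, reps, seed=0):
    """Simulated chi-square(df) values, generated as Gamma(shape=df/2, scale=2)."""
    rng = random.Random(seed)
    return [rng.gammavariate(df / 2, 2.0) for _ in range(reps)]

def probability_of_superiority(xs, ys):
    """Estimate P(X > Y) + 0.5 * P(X = Y) for X drawn from xs and Y from ys."""
    ys = sorted(ys)
    total = 0.0
    for x in xs:
        below = bisect.bisect_left(ys, x)          # number of ys strictly below x
        ties = bisect.bisect_right(ys, x) - below  # number of ys equal to x
        total += below + 0.5 * ties
    return total / (len(xs) * len(ys))
```

In use, `stat` would be a one-argument function such as `lambda s: pearson_chi2(s, digits, probs)`, and the resulting list would be compared with `chi2_draws(89, reps)` through `probability_of_superiority`; values near 0.5 indicate that the subsample statistics are distributed much like the null.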

Panels A–C in Figure 12 show that the null of conformity is not rejected: this conclusion carries over using the full sample (panel A) as well as using a single subsample (panel B) or 5000 random subsamples (panel C). The null is rejected in the full sample under the “uninteresting” alternative using either the chi-square or the adjusted $MAD$ test, but it is not rejected using the fixed “critical value” 0.0022 for the $MAD$ (panel D). Using the subsamples, none of the criteria are able to decidedly reject the null, suggesting that the deviation of the data from the null is tiny. When the deviation is substantial (panels G–I), the $MAD$ still cannot reject the null in the full sample (panel G) whereas the p value of the other two tests is virtually zero. In the single subsample, all three criteria correctly reject the null of conformity (panel H) and panel I shows that the “effect size” on the chi-square test is substantial, with the probability of superiority being approximately 0.9.

5. Conclusions

This paper introduces new tests of conformity with a given distribution with finite moments up to the fourth. The tests are then specialised to the first digit and first two digits Benford’s law. An extensive Monte Carlo analysis was carried out to study the size and power properties of the tests. The results show that it is advisable to use different tests in real applications, given that the tests perform differently according to the nature of the alternative hypothesis.

This paper also addresses the “excess of power” problem of the tests in the presence of very large samples: the proposed solution, based on resampling techniques, seems to be able to reconcile the evidence stemming from the MAD criterion (as can be seen in, e.g., [3]) with firmly statistically based tests.

Author Contributions

The authors equally contributed to this work. Both authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Acknowledgments

The authors wish to thank three anonymous referees for their comments and constructive criticisms.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Proof of Remark 4.

For simplicity and without loss of generality, we consider $k = 3$ classes. $p := (p_{1}, p_{2}, p_{3})'$ and $f_{n} := (f_{n1}, f_{n2}, f_{n3})'$ are such that $p_{i} \neq 0$ $\forall i$, and $ı' p = ı' f_{n} = 1$ with $ı := (1, 1, 1)'$.

The “classical” chi-square statistic is:

\begin{aligned} χ^{2} &= \sum_{i=1}^{3} \frac{(n f_{ni} - n p_{i})^{2}}{n p_{i}} \\ &= n \left[ \frac{(f_{n1} - p_{1})^{2}}{p_{1}} + \frac{(f_{n2} - p_{2})^{2}}{p_{2}} + \frac{(f_{n3} - p_{3})^{2}}{p_{3}} \right] \\ &= \frac{n}{p_{1} p_{2} p_{3}} \left[ (f_{n1} - p_{1})^{2} p_{2} p_{3} + (f_{n2} - p_{2})^{2} p_{1} p_{3} + (f_{n3} - p_{3})^{2} p_{1} p_{2} \right] \\ &= \frac{n}{p_{1} p_{2} p_{3}} \left\{ \left[ (f_{n1} - p_{1})^{2} p_{2} + (f_{n2} - p_{2})^{2} p_{1} \right] (1 - p_{1} - p_{2}) \right. \\ &\quad \left. + \left[ (p_{1} - f_{n1}) + (p_{2} - f_{n2}) \right]^{2} p_{1} p_{2} \right\} \\ &= \frac{n}{p_{1} p_{2} p_{3}} \left[ (f_{n1} - p_{1})^{2} p_{2} - (f_{n1} - p_{1})^{2} p_{1} p_{2} - (f_{n1} - p_{1})^{2} p_{2}^{2} + (f_{n2} - p_{2})^{2} p_{1} \right. \\ &\quad - (f_{n2} - p_{2})^{2} p_{1}^{2} - (f_{n2} - p_{2})^{2} p_{1} p_{2} + (f_{n1} - p_{1})^{2} p_{1} p_{2} \\ &\quad \left. + (f_{n2} - p_{2})^{2} p_{1} p_{2} + 2 (p_{1} - f_{n1}) (p_{2} - f_{n2}) p_{1} p_{2} \right] \\ &= \frac{n}{p_{1} p_{2} p_{3}} \left[ (f_{n1} - p_{1})^{2} p_{2} - (f_{n1} - p_{1})^{2} p_{2}^{2} + (f_{n2} - p_{2})^{2} p_{1} \right. \\ &\quad \left. - (f_{n2} - p_{2})^{2} p_{1}^{2} + 2 (p_{1} - f_{n1}) (p_{2} - f_{n2}) p_{1} p_{2} \right] . \end{aligned}

(A1)

Notice that $\Sigma$ in this case is:

\begin{aligned} \Sigma &= diag(p) - p p' \\ &= \begin{pmatrix} p_{1} - p_{1}^{2} & - p_{1} p_{2} & - p_{1} p_{3} \\ - p_{1} p_{2} & p_{2} - p_{2}^{2} & - p_{2} p_{3} \\ - p_{1} p_{3} & - p_{2} p_{3} & p_{3} - p_{3}^{2} \end{pmatrix} \end{aligned}

(A2)

and that the determinant of $\Sigma^{*}$ is:

\begin{aligned} |\Sigma^{*}| &= (p_{1} - p_{1}^{2}) (p_{2} - p_{2}^{2}) - p_{1}^{2} p_{2}^{2} \\ &= p_{1} p_{2} - p_{1} p_{2}^{2} - p_{1}^{2} p_{2} \\ &= p_{1} p_{2} (1 - p_{1} - p_{2}) \\ &= p_{1} p_{2} p_{3} \end{aligned}

(A3)

which is different from zero unless at least one of the $p_{i}$s is zero, which is excluded by the hypothesis. Therefore, $\Sigma^{*}$ is always invertible.

The Wald statistic $χ_{W}^{2}$ can be explicitly written as

\begin{aligned} w &:= n (f_{n1} - p_{1}, f_{n2} - p_{2}) {\Sigma^{*}}^{-1} \begin{pmatrix} f_{n1} - p_{1} \\ f_{n2} - p_{2} \end{pmatrix} \\ &= \frac{n}{p_{1} p_{2} p_{3}} (f_{n1} - p_{1}, f_{n2} - p_{2}) \begin{pmatrix} p_{2} - p_{2}^{2} & p_{1} p_{2} \\ p_{1} p_{2} & p_{1} - p_{1}^{2} \end{pmatrix} \begin{pmatrix} f_{n1} - p_{1} \\ f_{n2} - p_{2} \end{pmatrix} \\ &= \frac{n}{p_{1} p_{2} p_{3}} \begin{pmatrix} (f_{n1} - p_{1}) (p_{2} - p_{2}^{2}) + (f_{n2} - p_{2}) p_{1} p_{2} \\ (f_{n2} - p_{2}) (p_{1} - p_{1}^{2}) + (f_{n1} - p_{1}) p_{1} p_{2} \end{pmatrix}' \begin{pmatrix} f_{n1} - p_{1} \\ f_{n2} - p_{2} \end{pmatrix} \\ &= \frac{n}{p_{1} p_{2} p_{3}} \left[ (f_{n1} - p_{1})^{2} (p_{2} - p_{2}^{2}) + (f_{n1} - p_{1}) (f_{n2} - p_{2}) p_{1} p_{2} \right. \\ &\quad \left. + (f_{n2} - p_{2})^{2} (p_{1} - p_{1}^{2}) + (f_{n1} - p_{1}) (f_{n2} - p_{2}) p_{1} p_{2} \right] \\ &= \frac{n}{p_{1} p_{2} p_{3}} \left[ (f_{n1} - p_{1})^{2} (p_{2} - p_{2}^{2}) + (f_{n2} - p_{2})^{2} (p_{1} - p_{1}^{2}) \right. \\ &\quad \left. + 2 (f_{n1} - p_{1}) (f_{n2} - p_{2}) p_{1} p_{2} \right] \\ &= \frac{n}{p_{1} p_{2} p_{3}} \left[ (f_{n1} - p_{1})^{2} p_{2} - (f_{n1} - p_{1})^{2} p_{2}^{2} + (f_{n2} - p_{2})^{2} p_{1} \right. \\ &\quad \left. - (f_{n2} - p_{2})^{2} p_{1}^{2} + 2 (f_{n1} - p_{1}) (f_{n2} - p_{2}) p_{1} p_{2} \right] \end{aligned}

(A4)

which is equal to (A1). □

References

Newcomb, S. Note on the frequency of use of the different digits in natural numbers. Am. J. Math. 1881, 4, 39–40.
Benford, F. The law of anomalous numbers. Proc. Am. Philos. Soc. 1938, 78, 551–572.
Nigrini, M.J. Benford’s Law: Applications for Forensic Accounting, Auditing, and Fraud Detection; John Wiley & Sons: Hoboken, NJ, USA, 2012.
Kossovsky, A.E. Benford’s Law: Theory, the General Law of Relative Quantities, and Forensic Fraud Detection Applications; World Scientific: Singapore, 2014.
Berger, A.; Hill, T.P. An Introduction to Benford’s Law; Princeton University Press: Princeton, NJ, USA, 2015.
Miller, S.J. (Ed.) Benford’s Law: Theory and Applications; Princeton University Press: Princeton, NJ, USA, 2015.
Raimi, R.A. The first digit problem. Am. Math. Mon. 1976, 83, 521–538.
Hill, T.P. A statistical derivation of the significant-digit law. Stat. Sci. 1995, 10, 354–363.
Leemis, L. Benford’s Law Geometry. In Benford’s Law: Theory and Applications; Miller, S.J., Ed.; Princeton University Press: Princeton, NJ, USA, 2015; Chapter 4; pp. 109–118.
Miller, S.J. (Ed.) Fourier Analysis and Benford’s Law. In Benford’s Law: Theory and Applications; Princeton University Press: Princeton, NJ, USA, 2015; Chapter 3; pp. 68–105.
Schürger, K. Lévy Processes and Benford’s Law. In Benford’s Law: Theory and Applications; Miller, S.J., Ed.; Princeton University Press: Princeton, NJ, USA, 2015; Chapter 6; pp. 135–173.
Kossovsky, A.E. On the Mistaken Use of the Chi-Square Test in Benford’s Law. Stats 2021, 4, 27.
Ausloos, M.; Cerqueti, R.; Mir, T.A. Data science for assessing possible tax income manipulation: The case of Italy. Chaos Solitons Fractals 2017, 104, 238–256.
Mir, T.A.; Ausloos, M.; Cerqueti, R. Benford’s law predicted digit distribution of aggregated income taxes: The surprising conformity of Italian cities and regions. Eur. Phys. J. B 2014, 87, 1–8.
Nye, J.; Moul, C. The Political Economy of Numbers: On the Application of Benford’s Law to International Macroeconomic Statistics. BE J. Macroecon. 2007, 7, 17.
Tödter, K.H. Benford’s Law as an Indicator of Fraud in Economics. Ger. Econ. Rev. 2009, 10, 339–351.
Durtschi, C.; Hillison, W.; Pacini, C. The effective use of Benford’s law to assist in detecting fraud in accounting data. J. Forensic Account. 2004, 5, 17–34.
Nigrini, M.J. I have got your number. J. Account. 1999, 187, 79–83.
Shi, J.; Ausloos, M.; Zhu, T. Benford’s law first significant digit and distribution distances for testing the reliability of financial reports in developing countries. Phys. A Stat. Mech. Appl. 2018, 492, 878–888.
Ley, E. On the peculiar distribution of the US stock indexes’ digits. Am. Stat. 1996, 50, 311–313.
Ceuster, M.J.D.; Dhaene, G.; Schatteman, T. On the hypothesis of psychological barriers in stock markets and Benford’s Law. J. Empir. Financ. 1998, 5, 263–279.
Clippe, P.; Ausloos, M. Benford’s law and Theil transform of financial data. Phys. A Stat. Mech. Appl. 2012, 391, 6556–6567.
Mir, T.A. The leading digit distribution of the worldwide illicit financial flows. Qual. Quant. 2014, 50, 271–281.
Ausloos, M.; Castellano, R.; Cerqueti, R. Regularities and discrepancies of credit default swaps: A data science approach through Benford’s law. Chaos Solitons Fractals 2016, 90, 8–17.
Riccioni, J.; Cerqueti, R. Regular paths in financial markets: Investigating the Benford’s law. Chaos Solitons Fractals 2018, 107, 186–194.
Sambridge, M.; Tkalčić, H.; Jackson, A. Benford’s law in the natural sciences. Geophys. Res. Lett. 2010, 37.
Diaz, J.; Gallart, J.; Ruiz, M. On the Ability of the Benford’s Law to Detect Earthquakes and Discriminate Seismic Signals. Seismol. Res. Lett. 2014, 86, 192–201.
Ausloos, M.; Cerqueti, R.; Lupi, C. Long-range properties and data validity for hydrogeological time series: The case of the Paglia river. Phys. A Stat. Mech. Appl. 2017, 470, 39–50.
Mir, T. The law of the leading digits and the world religions. Phys. A Stat. Mech. Appl. 2012, 391, 792–798.
Mir, T. The Benford law behavior of the religious activity data. Phys. A Stat. Mech. Appl. 2014, 408, 1–9.
Ausloos, M.; Herteliu, C.; Ileanu, B. Breakdown of Benford’s law for birth data. Phys. A Stat. Mech. Appl. 2015, 419, 736–745.
Hassler, U.; Hosseinkouchack, M. Testing the Newcomb-Benford Law: Experimental evidence. Appl. Econ. Lett. 2019, 26, 1762–1769.
Cerqueti, R.; Maggi, M. Data validity and statistical conformity with Benford’s Law. Chaos Solitons Fractals 2021, 144, 110740.
Linton, O. Probability, Statistics and Econometrics; Academic Press: London, UK, 2017.
Zhang, L. Sample Mean and Sample Variance: Their Covariance and Their (In)Dependence. Am. Stat. 2007, 61, 159–160.
Geary, R.C. Moments of the Ratio of the Mean Deviation to the Standard Deviation for Normal Samples. Biometrika 1936, 28, 295.
Choulakian, V.; Lockhart, R.A.; Stephens, M.A. Cramér-von Mises statistics for discrete distributions. Can. J. Stat. 1994, 22, 125–137.
White, H. Asymptotic Theory for Econometricians; Economic Theory, Econometrics, and Mathematical Economics; Academic Press: London, UK, 1984.
Wellner, J.A.; Smythe, R.T. Computing the covariance of two Brownian area integrals. Stat. Neerl. 2002, 56, 101–109.
Drake, P.D.; Nigrini, M.J. Computer assisted analytical procedures using Benford’s Law. J. Account. Educ. 2000, 18, 127–146.
R Development Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2019.
Wickham, H. ggplot2: Elegant Graphics for Data Analysis; Use R! Springer: New York, NY, USA, 2016.
Davidson, R.; MacKinnon, J.G. Graphical Methods for Investigating The Size and Power of Hypothesis Tests. Manch. Sch. 1998, 66, 1–26.
Lloyd, C.J. Estimating Test Power Adjusted for Size. J. Stat. Comput. Simul. 2005, 75, 921–934.
Granger, C.W. Extracting information from mega-panels and high-frequency data. Stat. Neerl. 1998, 52, 258–272.
Lindley, D.V. A Statistical Paradox. Biometrika 1957, 44, 187–192.
Bickel, P.J.; Götze, F.; van Zwet, W.R. Resampling Fewer Than n Observations: Gains, Losses, and Remedies for Losses. In Selected Works of Willem van Zwet; Springer: New York, NY, USA, 2011; pp. 267–297.
Cumming, G. Understanding the New Statistics: Effect Sizes, Confidence Intervals, and Meta-Analysis; Routledge: New York, NY, USA, 2012.

Figure 1. Probability function of the first-two-digits Benford’s law (red) compared to the probability functions of the mixtures used under the alternative hypothesis (blue). In the figure, a mixing parameter λ = 0.6 was used to exaggerate the visual effect. Larger values of λ were used in the simulations, with the consequence that the distribution under the alternative is closer to the distribution under the null.

Figure 2. First digit tests: deviation of the actual size from the nominal size. Tests are as follows: “Adj. MAD”, adjusted MAD (10); “Chi-sq(2)”, χ²(2) test on mean and variance (4); “Chi-sq(d-1)”, χ²(8) test (9); “Mean”, normal test on the mean (1); “Mean & var.”, normal test on mean and variance (3); “Variance”, normal test on variance (2). The number of observations is indicated on top of each panel.

Figure 3. First digit tests: size–power curves of the tests against the uniform mixing alternative with λ = 0.9. Tests are as follows: “Adj. MAD”, adjusted MAD (10); “Chi-sq(2)”, χ²(2) test on mean and variance (4); “Chi-sq(d-1)”, χ²(8) test (9); “Mean”, normal test on the mean (1); “Mean & var.”, normal test on mean and variance (3); “Variance”, normal test on variance (2). The dashed line is power = actual size. The number of observations is indicated on top of each panel.

Figure 4. First digit tests: size–power curves of the tests against the normal mixing alternative with λ = 0.9. Tests are as follows: “Adj. MAD”, adjusted MAD (10); “Chi-sq(2)”, χ²(2) test on mean and variance (4); “Chi-sq(d-1)”, χ²(8) test (9); “Mean”, normal test on the mean (1); “Mean & var.”, normal test on mean and variance (3); “Variance”, normal test on variance (2). The dashed line is power = actual size. The number of observations is indicated on top of each panel.

Figure 5. First digit tests: size–power curves of the tests against the perturbed mixing alternative with λ = 0.75. Tests are as follows: “Adj. MAD”, adjusted MAD (10); “Chi-sq(2)”, χ²(2) test on mean and variance (4); “Chi-sq(d-1)”, χ²(8) test (9); “Mean”, normal test on the mean (1); “Mean & var.”, normal test on mean and variance (3); “Variance”, normal test on variance (2). The dashed line is power = actual size. The number of observations is indicated on top of each panel.

Figure 6. First two digits tests: deviation of the actual size from the nominal size. Tests are as follows: “Adj. MAD”, adjusted MAD (10); “Chi-sq(2)”, χ²(2) test on mean and variance (4); “Chi-sq(d-1)”, χ²(89) test (9); “Mean”, normal test on the mean (1); “Mean & var.”, normal test on mean and variance (3); “Variance”, normal test on variance (2). The number of observations is indicated on top of each panel.

Figure 7. First two digits tests: size–power curves of the tests against the uniform mixing alternative with λ = 0.9. Tests are as follows: “Adj. MAD”, adjusted MAD (10); “Chi-sq(2)”, χ²(2) test on mean and variance (4); “Chi-sq(d-1)”, χ²(89) test (9); “Mean”, normal test on the mean (1); “Mean & var.”, normal test on mean and variance (3); “Variance”, normal test on variance (2). The dashed line is power = actual size. The number of observations is indicated on top of each panel.

Figure 8. First two digits tests: size–power curves of the tests against the normal mixing alternative with λ = 0.9. Tests are as follows: “Adj. MAD”, adjusted MAD (10); “Chi-sq(2)”, χ²(2) test on mean and variance (4); “Chi-sq(d-1)”, χ²(89) test (9); “Mean”, normal test on the mean (1); “Mean & var.”, normal test on mean and variance (3); “Variance”, normal test on variance (2). The dashed line is power = actual size. The number of observations is indicated on top of each panel.

Figure 9. First two digits tests: size–power curves of the tests against the perturbed mixing alternative with λ = 0.75. Tests are as follows: “Adj. MAD”, adjusted MAD (10); “Chi-sq(2)”, χ²(2) test on mean and variance (4); “Chi-sq(d-1)”, χ²(89) test (9); “Mean”, normal test on the mean (1); “Mean & var.”, normal test on mean and variance (3); “Variance”, normal test on variance (2). The dashed line is power = actual size. The number of observations is indicated on top of each panel.

Figure 10. First two digits tests: size–power curves of the tests against the rounding mixing alternative with λ = 0.75. Tests are as follows: “Adj. MAD”, adjusted MAD (10); “Chi-sq(2)”, χ²(2) test on mean and variance (4); “Chi-sq(d-1)”, χ²(89) test (9); “Mean”, normal test on the mean (1); “Mean & var.”, normal test on mean and variance (3); “Variance”, normal test on variance (2). The dashed line is power = actual size. The number of observations is indicated on top of each panel.

Figure 11. Average estimated MADs over 1000 replications under the (Benford’s law) null hypothesis (blue points) and α/√n (black curve) for varying sample sizes n ∈ (250, 500, …, 10,000). α is a scale factor used to report 1/√n on the same scale as MAD. The shaded area represents the central 90% of the distribution of estimated MADs. The horizontal dashed line corresponds to Nigrini’s suggested critical value (0.0022). The vertical dashed line corresponds to n = 1750.

Figure 12. Behaviour of conformance tests across samples. In the first row (panels A–C), data conform to the “first two digits” Benford’s law. In the second row (panels D–F), data follow a perturbed Benford’s law with λ = 0.95. In the third row (panels G–I), data follow a perturbed Benford’s law with λ = 0.75. The first column (panels A,D,G) reports the results computed over the full sample, with n = 15,000. The second column (panels B,E,H) refers to a single random subsample with n = 1750. The third column (panels C,F,I) reports the estimated densities (blue) of the conventional (or Wald) chi-square test statistic over 5000 random subsamples of length n = 1750, along with the χ²(89) null distribution (red). P(χ²₈₉) and P(Adj. MAD) denote p values of the conventional (or Wald) chi-square test and of the adjusted MAD test, respectively. Prob. of sup. is an estimate of the probability of superiority.

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Cerqueti, R.; Lupi, C. Some New Tests of Conformity with Benford’s Law. Stats 2021, 4, 745-761. https://doi.org/10.3390/stats4030044
