
# Bayesian Inference in Auditing with Partial Prior Information Using Maximum Entropy Priors

Department of Quantitative Methods, University of Las Palmas de Gran Canaria, 35001 Las Palmas de Gran Canaria, Spain
* Author to whom correspondence should be addressed.
All three authors contributed equally to this work.
Entropy 2018, 20(12), 919; https://doi.org/10.3390/e20120919
Received: 15 October 2018 / Revised: 15 November 2018 / Accepted: 28 November 2018 / Published: 1 December 2018
(This article belongs to the Special Issue Bayesian Inference and Information Theory)

## Abstract

Problems in statistical auditing are usually one–sided. In fact, the main interest for auditors is to determine the quantiles of the total amount of error, and then to compare these quantiles with a given materiality fixed by the auditor, so that the accounting statement can be accepted or rejected. Dollar unit sampling (DUS) is a useful procedure to collect sample information, whereby items are chosen with a probability proportional to book amounts and in which the relevant error amount distribution is the distribution of the taints weighted by the book value. The likelihood induced by DUS refers to a 201–variate parameter $p$, but the prior information concerns a subparameter $θ$, a linear function of $p$ representing the total amount of error. This means that partial prior information must be processed. In this paper, two main proposals are made: (1) to modify the likelihood, to make it compatible with prior information and thus obtain a Bayesian analysis for hypotheses to be tested; (2) to use a maximum entropy prior to incorporate limited auditor information. To achieve these goals, we obtain a modified likelihood function inspired by the induced likelihood described by Zehna (1966) and then adapt Bayes’ theorem to this likelihood in order to derive a posterior distribution for $θ$. This approach shows that the DUS methodology can be justified as a natural method of processing partial prior information in auditing and that a Bayesian analysis can be performed even when prior information is only available for a subparameter of the model. Finally, some numerical examples are presented.

## 1. Introduction

This paper addresses the statistical problem of estimating the total amount of error in an account balance obtained from auditing. To do so, the statistical toolbox employed by the auditor must be adapted to a Bayesian approach. The conclusions drawn from the audit process are commonly based on statistical methods such as hypothesis testing, which in turn is based on compliance testing and substantive testing. The first of these is conducted to provide reasonable assurance that internal control mechanisms are present and function adequately. Substantive testing seeks to determine whether errors are present and, if so, their size. In auditing practice, the total amount of error in a single statement, denoted by $θ$, and the associated substantive testing are highly important to decision making. For instance, the test $H_0: θ \le θ_m$ vs. $H_1: θ > θ_m$ can be conducted in order to accept or reject the amount of error detected in the audit, where $θ_m$ denotes the total amount of error the auditor deems material. Johnstone (1995) presented auditing evidence showing that the classical hypothesis test is incoherent and that Bayesian techniques are to be preferred.
Monetary Unit Sampling (MUS), or equivalently Dollar Unit Sampling (DUS), is commonly used to obtain sample information. In DUS, the population size is the recorded book value ($B$) and the sampling plan consists of selecting monetary (dollar) units with an equal chance of being selected. The amount of error for each dollar selected is the difference between its book value and its audit value. The taint of a randomly selected dollar unit is given by the quotient between the error and the book value. Most of the audited values will be correct and so the associated errors will be zero. The taints in a dollar unit sample are recorded and used to draw inferences about the parameter of interest, i.e., the total amount of error. In practice, auditors usually assume that no amount can be over– or under–stated by an amount greater than its book value. Therefore, the range of taints extends from −100 to +100 per cent in increments of one per cent, $-100, -99, \ldots, -1, 0, 1, \ldots, 99, 100$, and the proportions of each taint are $p_{-100}, p_{-99}, \ldots, p_{-1}, p_0, p_1, \ldots, p_{99}, p_{100}$. For a DUS sample of size $n$, the practitioner knows the observed number of tainted dollar units in the sample with $i\%$ taints, $n_i$, where $0 \le n_i \le n$, $i = -100, \ldots, 0, \ldots, 100$, and $\sum_{i=-100}^{100} n_i = n$. In practice, $B$ is very large in relation to the sample size $n$, and then the multinomial model adequately reflects the likelihood function. The likelihood of the problem is expressed in terms of a parameter $p = (p_{-100}, \ldots, p_{100})$ of dimension $201$,
$f(n \mid p) = \frac{n!}{n_{-100}! \cdots n_{100}!} \prod_{i=-100}^{100} p_i^{n_i}, \qquad (1)$
where $n = (n_{-100}, \ldots, n_{100})$.
To complete a Bayesian analysis, a prior distribution is required, frequently a conjugate Dirichlet prior. However, there are certain difficulties. On the one hand, quantifying the expert’s opinion as a probability distribution is a difficult task, especially for complex multivariate problems. Furthermore, although the auditor usually has an intuitive understanding of the magnitude, i.e., the total amount of error $θ$, the individual proportions $p_i$ will be unknown. Finally, the likelihood of the observed data depends on the parameters $p_{-100}, \ldots, p_{100}$. In consequence, the analyst must consider a Bayesian scenario under partial prior information, and seek to combine prior information about $θ$ with the sample information about the individual proportions.
In a non–Bayesian context, McCray (1984) introduced a heuristic procedure to obtain a maximum likelihood function. Following Hernández et al. (1998), we now propose a modification of the likelihood to make it compatible with prior information on $θ$ and then perform a Bayesian analysis. The prior distribution for the total amount of error in the population is commonly asymmetrical and right–tailed, and statistically trained auditors can readily elicit values such as the mean and/or certain quantiles. In this paper, we propose using as the prior the maximum entropy prior with a specified mean. The advantages of this objective “automatised” prior are that it requires only a small amount of prior information and is computationally feasible.
The remainder of the paper is organized as follows. Section 2 outlines the technical results needed to derive the modified likelihood that we combine with prior distributions. Section 3 shows how maximum entropy priors can be incorporated into the auditing context. Section 4 then presents some numerical illustrations of the method, and the results obtained are discussed in Section 5.

## 2. The Likelihood Function

Assume the joint probability mass function given in (1), and consider that there exists a measurable function $ψ(p_{-100}, \ldots, p_{100}) = θ$ such that the auditor has prior information about $θ \in Θ$, with $Θ$ a discrete set of values of $θ$. Observe that by construction
$θ = ψ(p_{-100}, \ldots, p_{100}) = \frac{B}{100} \sum_{i=-100}^{100} i\, p_i. \qquad (2)$
The following notation will be used: $Π$ denotes a separable metric space; $A$ is the natural $σ$-field of subsets of $Π$; $B$ is a sub-$σ$-field of $A$; $A_b^+(p)$ denotes the set of all real–valued functions $f(p)$, $p \in Π$, which are nonnegative, bounded and $A$-measurable; $π$ is a probability measure on $(Π, B)$; $\int_Π^* f(p)\, π(dp)$, for $f \in A_b^+(p)$, is the upper integral of $f(p)$ with respect to $π$; and $1_C(\cdot)$ is the indicator function of the set $C$.
Theorem A1 ([5,6]) in the Appendix provides a modified likelihood function for the subparameter $θ$. The function $f_π^B$ in Theorem A1 is the desired modified likelihood. In fact, we have an $A$-measurable function $ψ(p_{-100}, \ldots, p_{100})$, with $A$ the usual Borel $σ$-field, and we also have prior information given on $Θ$ with its usual $σ$-field. As $Θ$ is discrete, all atoms of its $σ$-field are of the form $\{θ\}$, and therefore the sets
$ψ^{-1}(\{θ\}), \quad θ \in Θ, \qquad (3)$
belong to the $σ$-field $A$. Let $B$ be the sub-$σ$-field of $A$ induced by $ψ$ on $Π$. If we define the probability of a set $ψ^{-1}(\{θ\})$ as the probability of $\{θ\}$ (known a priori), we obtain a probability measure on the sub-$σ$-field $B$, denoted by $π$. Furthermore, this sub-$σ$-field is generated by a countable partition, and in consequence the modified likelihood is given by
$f^B(p) = \sum_{θ \in Θ} \left( \sup_{p \in ψ^{-1}(\{θ\})} f(p) \right) \cdot 1_{ψ^{-1}(\{θ\})}(p), \qquad (4)$
where for simplicity we write $f(p)$ to refer to the function in (1). Observe that the function in (4) is constant on every set $ψ^{-1}(\{θ\})$, and thus we can write the modified likelihood as
$f^B(θ) = \sup_{p \in ψ^{-1}(\{θ\})} f(p). \qquad (5)$
We also note that expressions (4) and (5) are similar to empirical likelihood functions and to the induced likelihood in the notation introduced by Zehna (1966). The likelihood function in (5) is a $B$-measurable function and is compatible with the prior $π$; Bayes’ theorem therefore applies as
$π(θ \mid \text{data}) = \frac{f^B(θ)\, π(θ)}{\int_Θ f^B(θ)\, π(dθ)}. \qquad (6)$
We illustrate how to obtain the modified likelihood in (5) with a simulated example.
Example 1.
Consider a DUS sample of 100 items in which no errors were discovered in 90 cases, one taint is 10% in error, one more taint is 90%, and eight taints are −10% in error (understatement errors). We also assume that the monetary units are drawn from a population of accounts totaling $B = 10^6$. To find the likelihood of a value θ we solve the following optimization problem
$\max \; \frac{100!}{90! \cdot 8!}\, p_{-10}^{8}\, p_0^{90}\, p_{10}\, p_{90}$
subject to:
$p_{-10} + p_0 + p_{10} + p_{90} = 1,$
and
$\frac{B}{100}\left(-10\, p_{-10} + 10\, p_{10} + 90\, p_{90}\right) = θ,$
and that all proportions are nonnegative and less than one.
For example, for a total amount of error $θ = 12{,}000$ the proportions obtained are $p_{-10} = 0.075$, $p_0 = 0.894$, $p_{10} = 0.011$ and $p_{90} = 0.020$, and the likelihood of this error is 0.014. All computations are easily obtained with Mathematica© using the command NMaximize.
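The same maximization can be reproduced with open-source tools. The sketch below is ours (the paper used Mathematica's NMaximize): it maximizes the multinomial log-likelihood with SciPy's SLSQP, with the linear constraint obtained from (2) rescaled by $100/B$ for better conditioning.

```python
# Sketch: modified likelihood (5) of theta = 12,000 for Example 1,
# via constrained maximization of the multinomial log-likelihood.
import numpy as np
from scipy.optimize import minimize
from scipy.special import gammaln

B = 1_000_000
theta = 12_000
taints = np.array([-10, 0, 10, 90])   # observed taint percentages
counts = np.array([8, 90, 1, 1])      # multiplicities, n = 100

# log of the multinomial coefficient 100! / (8! 90! 1! 1!)
log_coef = gammaln(counts.sum() + 1) - gammaln(counts + 1).sum()

cons = [
    {'type': 'eq', 'fun': lambda p: p.sum() - 1.0},
    # Eq. (2) divided by B/100:  sum_i i * p_i = 100 * theta / B
    {'type': 'eq', 'fun': lambda p: (taints * p).sum() - 100.0 * theta / B},
]
res = minimize(lambda p: -(counts * np.log(p)).sum(),
               x0=counts / counts.sum(),
               bounds=[(1e-9, 1.0)] * 4,
               constraints=cons, method='SLSQP')

p_hat = res.x
likelihood = np.exp(log_coef - res.fun)
print(p_hat, likelihood)   # ≈ (0.075, 0.894, 0.011, 0.020), likelihood ≈ 0.014
```

Because the log-likelihood is concave and the constraints are linear, the maximizer is unique, and the solver recovers the proportions and the likelihood value 0.014 reported above.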

## 3. The Maximum Entropy Priors

To apply Bayesian methods in auditing, a prior distribution must be assigned to the total error parameter $θ$. References [9,10], among others, have described how this might be done. In practice, however, Bayesian methods are not widely used because auditors frequently find it difficult to assess a prior probability function. They often lack statistical expertise in this respect, and so cannot easily assess hyperprior parameters, which might not have an intuitive meaning. In most cases, only certain descriptive summaries, such as the mean and/or median of a probability distribution, can be assigned straightforwardly. Thus, auditors tend to feel comfortable assessing certain values of the prior distribution and disregarding the other possible values of the parameter. In such a situation, the maximum entropy procedure might be an appropriate way to obtain the required prior distribution.
Let the parameter space $Θ$ be an interval $Θ = [θ_L, θ_U]$. It is well known that the probability distribution $π$ which maximizes the entropy with respect to the objective uniform prior on $[θ_L, θ_U]$, subject to partial prior information given by
$\int_Θ g_k(θ)\, π(θ)\, dθ = μ_k, \qquad k = 1, \ldots, m, \qquad (7)$
has the form
$π(θ) \propto \exp\left\{ \sum_{k=1}^{m} λ_k\, g_k(θ) \right\}, \qquad (8)$
where the $λ_k$ are constants to be determined from the constraints in (7). Observe that the functions $g_k$ can adopt several interesting expressions. For example, for $g_1(θ) = θ$ and $g_k(θ) = (θ - μ_1)^k$, $k = 2, \ldots, m$, the partial prior information consists of specifying $m$ central moments of the distribution. Quantiles are also easy to incorporate by considering $g_k(θ) = 1_{(θ_L, θ_k)}(θ)$.
For practical applications and illustrative purposes, we focus on situations where only the mean $θ_0$ is given, i.e., $g_1(θ) = θ$ and $μ_1 = θ_0$. In that case:
• If $θ_0 = \frac{θ_L + θ_U}{2}$, then $π(θ) \sim U(θ_L, θ_U)$, that is, the uniform distribution on the interval $(θ_L, θ_U)$.
• If $θ_0 \ne \frac{θ_L + θ_U}{2}$, then
$π(θ) \propto \exp\{λ θ\}, \qquad θ_L \le θ \le θ_U, \qquad (9)$
where $λ$ is obtained by solving the nonlinear equation
$\frac{θ_U \exp\{-λ θ_L\} - θ_L \exp\{-λ θ_U\}}{\exp\{-λ θ_L\} - \exp\{-λ θ_U\}} - \frac{1}{λ} = θ_0. \qquad (10)$

## 4. Numerical Illustrations

For illustrative purposes, we present a simulated audit situation in which two auditors have partial prior information about the mean and are comfortable using a maximum entropy prior in a DUS context. Let us consider the DUS data from an inventory with a reported book value of $B = 10^6$, a sample size of 100 items, and observed taints of 0, 5, 10 and 90 per cent in 94, 4, 1 and 1 cases, respectively. In order to decide whether to accept the auditee’s aggregate account balance, the auditors then conduct the statistical hypothesis test of $H_0: θ \le θ_m$ vs. $H_1: θ > θ_m$, where $θ_m$ denotes an intolerable material error. Assume that a figure of five to seven per cent of the reported book value is a common value for this materiality. For instance, let us suppose that the auditors wish to test
$H_0: θ \le 50{,}000 \quad \text{vs.} \quad H_1: θ > 50{,}000, \qquad (11)$
that is, $θ_m$ = $50,000. Following (5), for every value of the total error $θ$ the modified likelihood associated with the DUS data is obtained by solving $\max \; p_0^{94}\, p_5^{4}\, p_{10}\, p_{90}$ subject to $p_0 + p_5 + p_{10} + p_{90} = 1$, $\frac{B}{100}\left(5\, p_5 + 10\, p_{10} + 90\, p_{90}\right) = θ$, and the requirement that all proportions are nonnegative and less than one. For a given maximum entropy prior on $θ$, we can now derive its posterior distribution using (6). Using Bayes’ theorem, these priors can then be updated to posteriors conditioned on the data actually observed. To facilitate reproducibility of the results presented, a simplified code version is available as Supplementary Material to this paper.

To compare scenarios in which no prior or only limited partial prior information is available, so that the auditors must base their decisions on the information in the data, we present the following situation. Auditor #1 adopts a reference non-informative prior for the parameter $θ$, i.e., uniform on $Θ$. Observe that for a constant prior Bayes’ theorem is applicable because the constant cancels out in (6), and the posterior distribution is equivalent to the normalised modified likelihood. Figure 1 shows the posterior distribution (in grey) of the total amount of error for the DUS data given above. On the other hand, the partial prior information provided by Auditor #2 is the a priori mean $θ_0$ = $40,000. With this partial prior information, the maximum entropy prior for $θ$, deduced by solving Equation (10), corresponds to $λ = -25$. In practical applications, we suggest using a grid of 1000 total error points for a good approximation to the likelihood function. Figure 1 shows the prior and posterior distribution for Auditor #2 with the sample information considered above. Observe that when just a small amount of prior information is included via the mean, there are differences between the posterior distributions obtained by Auditors #1 and #2.
The estimated mean total error, that is, the posterior mean of the distribution in each case, is $16,576.2 and $19,577.2, respectively, and so the posterior distribution for Auditor #1 is more right–skewed than that obtained by Auditor #2.
The posterior probabilities of the null hypothesis are similar, presenting strong evidence for $H_0$, although more so under MEP. Table 1 details the posterior probability of the null hypothesis tested in (11). All computations were conducted using Mathematica© (version 11.2).
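The grid computation can be sketched in a few lines. The code below is ours (the paper used Mathematica and a 1000-point grid): it evaluates the modified likelihood (5) on a coarser grid of total-error values, multiplies by Auditor #2's maximum entropy prior, and normalises as in (6). The prior uses $λ = -25$ applied to $θ/B$, our reading of the scaling implicit in the paper's value of $λ$.

```python
# Sketch: grid posterior for the Section 4 DUS data and Auditor #2's prior.
import numpy as np
from scipy.optimize import minimize
from scipy.special import gammaln

B = 1_000_000
taints = np.array([0, 5, 10, 90])   # observed taint percentages
counts = np.array([94, 4, 1, 1])    # frequencies, n = 100

log_coef = gammaln(counts.sum() + 1) - gammaln(counts + 1).sum()

def modified_loglik(theta):
    """log of (5): sup of the multinomial log-likelihood over the set
    {p : sum p = 1, (B/100) * sum_i i*p_i = theta}."""
    cons = [{'type': 'eq', 'fun': lambda p: p.sum() - 1.0},
            {'type': 'eq', 'fun': lambda p: (taints * p).sum() - 100.0 * theta / B}]
    res = minimize(lambda p: -(counts * np.log(p)).sum(),
                   x0=np.full(len(taints), 1.0 / len(taints)),
                   bounds=[(1e-9, 1.0)] * len(taints),
                   constraints=cons, method='SLSQP')
    return log_coef - res.fun

grid = np.linspace(1_000, 200_000, 200)   # candidate total errors (coarser than the paper's 1000 points)
loglik = np.array([modified_loglik(t) for t in grid])
prior = np.exp(-25.0 * grid / B)          # maxent prior, lambda = -25 on theta/B (our scaling assumption)
post = np.exp(loglik - loglik.max()) * prior
post /= post.sum()                        # normalised grid posterior, as in (6)
print('Pr(H0 | data) approx.', post[grid <= 50_000].sum())
```

With a uniform prior in place of `prior`, the same code yields Auditor #1's posterior, i.e., the normalised modified likelihood.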
In practice, auditors commonly wish to obtain a high probability quantile of the posterior distribution, say 0.95, and will then accept the accounting balance if this quantile represents a small proportion of the book value, for example no more than five per cent. In Table 1, which shows these quantiles, there is a significant difference between the non-informative and the maximum entropy cases, which represent 4.4% and 3.6%, respectively, of the recorded book value. In other words, the posterior probability of the actual total error in the accounting balance being less than $36,000 is 0.95, which represents a reduction of almost 18% in the 95%–quantile compared with the non-informative scenario.

The advantages of the proposed model are highlighted by comparing it with conventional methods, namely the conventional Bayesian approach and the classical statistical procedure. Accordingly, let us first consider a conventional conjugate Bayesian model with a multinomial sampling distribution and a non-informative conjugate Dirichlet prior. A burn-in of 10,000 updates followed by a further 50,000 updates produces the parameter estimates $θ_{0.95}$ = $48,880 and $\Pr\{H_0 \mid \text{DUS data}\} = 0.954$ (the WinBUGS code is available as Supplementary Material to this paper). Therefore, both of the new Bayesian upper bounds shown in Table 1 are tighter than the above conventional Bayesian bound. Furthermore, the Bayesian Multinomial–Dirichlet model is fairly sensitive to the dimension of $p$, a concern which does not arise in the proposed formulation. For instance, the above numerical illustration developed with a non-informative Dirichlet prior over the range 0–100 yields an unrealistic 95% upper bound of $295,900, in contrast with the MEP upper bound of $36,000.
On the other hand, under a classical approach and following Fienberg et al. (1977), an upper confidence bound with confidence coefficient $α$ per cent obtained with the Stringer method, based on the total overstatement error, is given by
$B\, π_{0;1-α} + B \sum_{i=1}^{k} \left( π_{i;1-α} - π_{i-1;1-α} \right) p_i.$
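A sketch of one common implementation of the Stringer bound, applied to the Section 4 data, is given below. The code is ours: it takes $π_{i;1-α}$ to be the binomial upper confidence limit for the error rate given $i$ errors in $n$ items, computed via the beta quantile, which is a standard choice but may differ in detail from the construction in Fienberg et al. (1977).

```python
# Sketch: Stringer upper confidence bound for the total overstatement error.
# pi_{i;1-alpha} = binomial upper confidence limit with i errors in n items
# (beta-quantile form) -- a standard choice, stated here as an assumption.
import numpy as np
from scipy.stats import beta

def stringer_bound(B, n, taints, conf=0.95):
    """Upper bound on total overstatement; taints are fractions in (0, 1]."""
    t = np.sort(np.asarray(taints, dtype=float))[::-1]   # sort descending
    k = len(t)
    # upper confidence limits pi_{i;1-alpha}, i = 0, ..., k
    pi = np.array([beta.ppf(conf, i + 1, n - i) for i in range(k + 1)])
    return B * (pi[0] + np.sum((pi[1:] - pi[:-1]) * t))

# Section 4 overstatement data: four taints of 5%, one of 10%, one of 90%
bound = stringer_bound(1_000_000, 100, [0.05] * 4 + [0.10, 0.90])
print(round(bound))
```

As expected for the Stringer method, the resulting bound is noticeably looser than the Bayesian upper bounds discussed above.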
Table 1. Posterior probability of the null hypothesis and 95% posterior quantile.

| Prior | $\Pr\{H_0 \mid \text{DUS data}\}$ | $θ_{0.95}$ |
|---|---|---|
| Non-informative | — | $44,000 |
| MEP | 0.99 | $36,000 |

## Share and Cite

MDPI and ACS Style

Martel-Escobar, M.; Vázquez-Polo, F.-J.; Hernández-Bastida, A. Bayesian Inference in Auditing with Partial Prior Information Using Maximum Entropy Priors. Entropy 2018, 20, 919. https://doi.org/10.3390/e20120919
