Estimation of Expectations and Variance Components in Two-Level Nested Simulation Experiments

Muñoz, David Fernando

doi:10.3390/appliedmath3030031


by

David Fernando Muñoz

Department of Industrial and Operations Engineering, Instituto Tecnológico Autónomo de México, Rio Hondo 1, Mexico City 01080, Mexico

AppliedMath 2023, 3(3), 582-600; https://doi.org/10.3390/appliedmath3030031

Submission received: 7 May 2023 / Revised: 28 July 2023 / Accepted: 1 August 2023 / Published: 7 August 2023

(This article belongs to the Special Issue Trends in Simulation and Its Applications)


Abstract

When there is uncertainty in the value of parameters of the input random components of a stochastic simulation model, two-level nested simulation algorithms are used to estimate the expectation of performance variables of interest. In the outer level of the algorithm n observations are generated for the parameters, and in the inner level m observations of the simulation model are generated with the values of parameters fixed at the values generated in the outer level. In this article, we consider the case in which the observations at both levels of the algorithm are independent and show how the variance of the observations can be decomposed into the sum of a parametric variance and a stochastic variance. Next, we derive central limit theorems that allow us to compute asymptotic confidence intervals to assess the accuracy of the simulation-based estimators for the point forecast and the variance components. Under this framework, we derive analytical expressions for the point forecast and the variance components of a Bayesian model to forecast sporadic demand, and we use these expressions to illustrate the validity of our theoretical results by performing simulation experiments with this forecast model. We found that, given a fixed number of total observations

n m

, the choice of only one replication in the inner level (

m = 1

) is recommended to obtain a more accurate estimator for the expectation of a performance variable.

Keywords:

Bayesian forecasting; stochastic simulation; parameter uncertainty; two-level simulation

1. Introduction and Notation

Simulation is usually regarded as a powerful tool for producing forecasts, evaluating risk (see, e.g., [1]), and animating and illustrating a system’s performance over time (see, e.g., [2]). When a component of a simulation model has a certain degree of uncertainty, it is said to be a random component, and it is modeled by using a probability distribution and/or a stochastic process that is sampled throughout the simulation run to produce a stochastic simulation. A random component typically depends on the value of certain parameters; we denote by

θ

a particular value for the vector of parameters of the random components of a stochastic simulation, and

Θ

denotes a random vector that corresponds to the parameter values when there is uncertainty in the values of these parameters.

Following the notation of [1], the output of a stochastic (dynamic) simulation can be regarded as a stochastic process

{Y (s) : s \geq 0; Θ}

, where

Y (s)

is a random vector (of arbitrary dimension d) representing the state of the simulation at time

s \geq 0

. The term transient simulation applies to a dynamic simulation that has a well-defined termination time, so the output of a transient simulation can be viewed as a stochastic process

{Y (s) : 0 \leq s \leq T; Θ}

, where T is a stopping time (which may be deterministic); see, e.g., [3] for a definition of stopping time.

A performance variable W in a transient simulation is a real-valued random variable (r.v.) that depends on the simulation output up to time T, i.e.,

W = f (Y (s), 0 \leq s \leq T; Θ)

, and the expectation of a performance variable W is a performance measure that we usually estimate through experimentation with the simulation model. When there is no uncertainty in the parameters of the random components, the accepted technique for estimating a performance measure in transient simulation is the method of independent replications. This method consists of running experiments with the simulation model to produce n replications

W_{1}, W_{2}, \dots, W_{n}

that can be regarded as independent and identically distributed (i.i.d.) random variables (see Figure 1).

In the method of independent replications, a point estimator for the expectation

α = E [W_{1}]

is the average

\hat{α} (n) = \frac{\sum_{i = 1}^{n} W_{i}}{n}

. If

E [| W_{1} |] < \infty

, it follows from the classical Law of Large Numbers (LLN) that

\hat{α} (n)

is consistent, i.e., it satisfies

\hat{α} (n) \Rightarrow α

, as

n \to \infty

(where ⇒ denotes weak convergence of random variables); see, e.g., [3] for a proof. Consistency ensures that the estimator approaches the parameter as the number of replications n increases, and an asymptotic confidence interval (ACI) for the parameter is often used to evaluate the accuracy of the simulation-based estimator. Typically, a Central Limit Theorem (CLT) for the estimator is used to derive the expression for an ACI (see, for example, chapter 3 of [4]). For the case of the expectation

α

in the algorithm of Figure 1, if

E [W_{1}^{2}] < \infty

, the classical CLT implies that

\frac{\sqrt{n} (\hat{α} (n) - α)}{σ} \Rightarrow N (0, 1),

(1)

as

n \to \infty

, where

σ^{2} = E [{(W_{1} - α)}^{2}]

, and

N (0, 1)

denotes an r.v. distributed as normal with a mean of 0 and variance of 1. Then, if

E [W_{1}^{2}] < \infty

, it follows from (1) and Slutsky’s Theorem (see Appendix A) that

\frac{\sqrt{n} (\hat{α} (n) - α)}{\hat{σ} (n)} \Rightarrow N (0, 1),

as

n \to \infty

, where

\hat{σ} (n)

denotes the sample standard deviation, i.e.,

{\hat{σ}}^{2} (n) = \frac{\sum_{i = 1}^{n} {(W_{i} - \hat{α} (n))}^{2}}{n - 1}

. This CLT implies that

\lim_{n \to \infty} P [|\hat{α} (n) - α| \leq z_{β} \hat{σ} (n) / \sqrt{n}] = 1 - β,

for

0 < β < 1

, where

z_{β}

is the (

1 - β / 2

)-quantile of a

N (0, 1)

, so the preceding CLT is sufficient to establish a

100 (1 - β) %

ACI for

α

with the following halfwidth:

H W_{α} = z_{β} \hat{σ} (n) / \sqrt{n} .

(2)

The standard measurement used in simulation software (e.g., Simio; see [2]) to evaluate the accuracy of

\hat{α} (n)

as an estimator of expectation

α

is a halfwidth in the form of (2).
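The halfwidth in (2) can be computed directly from a list of replications. The following Python sketch is only an illustration: the function name `replication_ci` and the exponential stand-in for one replication of a simulation are assumptions made for this example, not part of any simulation package.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def replication_ci(observations, beta=0.10):
    """Return (point estimate, halfwidth) of a 100(1 - beta)% asymptotic CI."""
    n = len(observations)
    alpha_hat = mean(observations)      # sample average of W_1, ..., W_n
    sigma_hat = stdev(observations)     # sample standard deviation (divisor n - 1)
    z_beta = NormalDist().inv_cdf(1.0 - beta / 2.0)  # (1 - beta/2)-quantile of N(0, 1)
    return alpha_hat, z_beta * sigma_hat / math.sqrt(n)


# Hypothetical stand-in for a replication: W is exponential with mean 2,
# so the expectation being estimated is alpha = 2.
rng = random.Random(12345)
w = [rng.expovariate(0.5) for _ in range(10_000)]
alpha_hat, hw = replication_ci(w, beta=0.10)
print(f"alpha_hat = {alpha_hat:.3f} +/- {hw:.3f}")
```

Because the interval is asymptotic, its coverage is only approximately 1 - β for a finite number of replications n.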

The parameters of the random input components of a simulation model are typically estimated from real-data observations (denoted by a real-valued vector x), in contrast to an estimation of output performance measures that uses observations from simulation experiments. While the majority of applications covered in the related literature assume that there is no uncertainty in the value of input parameters, the uncertainty can be significant when these parameters are estimated by using small amounts of data. In these situations, Bayesian statistics can be used to incorporate this uncertainty in the output analysis of simulation experiments via the use of a posterior distribution

p (θ | x)

. A two-level nested simulation algorithm (see, e.g., [5,6,7]) is a methodology that is currently proposed for the analysis of simulation experiments under parameter uncertainty. In the outer level, we simulate n observations for the parameters from a posterior distribution

p (θ | x)

, while in the inner level, we simulate m observations for the performance variable with the parameters fixed at the value

θ

generated in the outer level. In this paper, we focus on the case where the observations at both levels of the algorithm are independent (as illustrated in Figure 2). We first show how the variance of a simulated observation can be decomposed into parametric and stochastic variance components, and then we obtain CLTs for the estimator of the point forecast and the estimators of the variance components. Our CLTs allow us to compute an ACI for each estimator. Our results are validated through experiments with a forecast model for sporadic demand reported in [8]. The main theoretical results reported in this paper were first stated in [9] (although the proofs were omitted), and in this paper, we provide the missing proofs, a more comprehensive literature review, and a more complete set of experiments with different values for the parameters of our experiments.

The existing literature on quantifying the impact of uncertainty on the input components of a stochastic simulation is very extensive; detailed reviews can be found, e.g., in [10,11,12] and the references therein. However, in order to situate our results within the framework of the bibliography related to the input analysis of simulation experiments, next, we present a brief discussion of the different approaches that have been proposed on this topic.

According to Barton et al. [13], input analysis in the simulation literature has been addressed essentially in two ways: sensitivity analysis and the characterization of the impact of input uncertainty to provide an ACI (to evaluate the accuracy of a point estimator) that explicitly considers this uncertainty. A sensitivity analysis is performed by running simulation experiments under different distributions for the random components and/or different parameters in order to investigate and describe the changes in the main performance measures of the simulation experiments (see [14,15] for early proposals). Although formalization of this approach was initially proposed by using techniques (e.g., design of experiments and/or regression; see [16]) that were previously proposed to analyze real-world experiments (see, e.g., [17]), some other techniques were proposed for the special purpose of simulation; for example, Freimer and Schruben [18] discussed methods for the design of experiments to search for the sample size of input data that ensured that the difference in the results of two simulation experiments was dominated by the stochastic variance (induced by the simulation experiments) so that the parametric variance (induced by input uncertainty) was not significant for decision making. As pointed out by Bruce Schmeiser in his discussion in [19], sensitivity analysis has a wide range of applications, since it can handle model uncertainty, as well as situations where no real-world data exist; however, a significant drawback is that it does not provide a statistical characterization of input uncertainty. This characterization can be achieved through the construction of an ACI that explicitly considers input uncertainty based on sample data.

According to several authors (e.g., [10,13]), for the construction of an ACI that explicitly considers the impact of input uncertainty, there have been essentially three approaches: the delta method, resampling, and Bayesian methods. Let us denote by

η (θ)

the expectation of

W_{1}

(of Figure 1) as a function of

θ

, and let

{\hat{θ}}_{r}

be an estimator for parameter

θ

(where r is the sample size of real-world observations); in this notation, the main idea of the delta method is to consider a Taylor series expansion for

η ({\hat{θ}}_{r})

around

θ

to investigate convergence properties (as

r \to \infty

) of

η ({\hat{θ}}_{r})

as an estimator of

η (θ)

. Cheng and Holland ([20,21,22]) introduced the use of the delta method to propose the construction of confidence intervals for the expectation of a performance variable of a stochastic simulation under uncertainty in the parameters of a proposed (known) parametric family of distributions for a random component. In [20,22], the authors did not formally prove the asymptotic validity of their proposed confidence intervals but justified them by appealing to the asymptotic normality (as

r \to \infty

) of the estimator

{\hat{θ}}_{r}

(which is the case, under regularity conditions, when

{\hat{θ}}_{r}

is the maximum likelihood estimator). In a later publication ([23]), Cheng and Holland provided proof of the asymptotic validity of their confidence intervals based on the delta method under regularity conditions as

r \to \infty

and

n \to \infty

. In [20,23], the authors also proposed the construction of asymptotic confidence intervals under parameter uncertainty based on a resampling technique known as parametric bootstrapping; this proposal basically consisted of using the algorithm of Figure 1 but sampling the values for parameter

θ

from the likelihood evaluated at the maximum likelihood estimator. Some other proposals for the construction of an ACI are based on resampling from the empirical distribution of real-data observations (see, e.g., [13,24,25]). A drawback of proposed confidence intervals based on the delta method and resampling is that their asymptotic validity (see the Theorem of [23] and Theorem 1 of [13]) requires that the sample size of real observations be large (

r \to \infty

), which is a condition that is probably true for big data, in which case parameter uncertainty may not be significant. Another drawback of techniques based on the delta method and parametric bootstrapping is that parameter

θ

is assumed to be deterministic (although unknown) so that, at some point, the value of

θ

is replaced by

\hat{θ}

, and this is one reason why these methods are called frequentist in the statistics community. In contrast, under a Bayesian approach, a parameter is regarded as a random variable

Θ

, and the uncertainty about

Θ

is assessed through a posterior distribution

p (θ | x)

that explicitly incorporates available information from sample data x.

Bayesian methods have solid theoretical foundations (see, e.g., [26]), and they have been proposed to assess not only parameter uncertainty, but also model uncertainty (see [27]). Bayesian methods for input analysis in simulation experiments were introduced by Chick in [28], and since then, there has been a fair number of publications on Bayesian methods for input simulation analysis (see, e.g., [7,29,30,31]). Bayesian methods require the specification of a prior distribution on the input parameters of the simulation model, and some users object to this requirement; however, there is a well-developed theory on objective priors (see, e.g., [32]), and some authors (e.g., [10,33]) consider that this requirement is actually a strength of the approach. Another strength of Bayesian methods for input analysis in stochastic simulation is that some Bayesian methods have been developed to construct an ACI for parameters that are not the expectation—for example, the variance and quantiles for a consistent estimation of a credible interval for the performance variable W (see [34]). It is worth mentioning that some methods need the extra assumption that the simulation output satisfies a meta-model in order to justify the asymptotic validity of their proposed confidence intervals (e.g., [13,35]); as we will see, this extra assumption is not required to establish an ACI for the expectation and variance components of two-level simulation experiments in a Bayesian framework.

The organization of this article is as follows. After this introduction, our proposed methodologies for the computation of an ACI for the point forecast and the variance components in a two-level simulation experiment are then described, and the mathematical results that support the validity of the proposed ACIs are stated (the corresponding proofs are provided in Appendix A). In the next section, we illustrate our notation by obtaining the analytical solutions for the point forecast and the variance components of a Bayesian model to forecast sporadic demand. This solution is used in the next section to illustrate and support, through simulation experiments, the validity of the ACIs proposed in this article. In the final section, we summarize our findings and suggestions for future research.

2. Theoretical Results

To identify the parameters that we wish to estimate by using the two-level nested algorithm in Figure 2, we denote

μ (Θ) \overset{def}{=} E [W_{11} | Θ]

, and

σ^{2} (Θ) \overset{def}{=} E [W_{11}^{2} | Θ] - μ^{2} (Θ)

. In this notation, the point forecast is the expectation

α = E [μ (Θ)]

, and the variance of each

W_{i j}

is:

V [W_{i j}] \overset{def}{=} E [W_{i j}^{2}] - E {[W_{i j}]}^{2} = E [E [W_{i j}^{2} | Θ] - μ {(Θ)}^{2}] + E [μ {(Θ)}^{2}] - E {[μ (Θ)]}^{2} = σ_{S}^{2} + σ_{P}^{2},

(3)

for

i = 1, \dots, n

;

j = 1, \dots, m

, where

σ_{P}^{2} = V [μ (Θ)] \overset{def}{=} E [μ {(Θ)}^{2}] - E {[μ (Θ)]}^{2}

, and

σ_{S}^{2} = E [σ^{2} (Θ)]

, where

σ^{2} (Θ)

was previously defined. We mention that, in the relevant literature,

σ_{S}^{2}

is usually referred to as stochastic variance, and

σ_{P}^{2}

is usually referred to as parametric variance.
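The decomposition in (3) can be checked exactly on a small example. The sketch below uses a hypothetical toy model (not the model studied in this article): Θ takes the values 1/4 and 3/4 with equal probability and, given Θ = θ, W is Bernoulli(θ). Exact rational arithmetic avoids rounding error.

```python
from fractions import Fraction as F

# Hypothetical toy model: theta -> P[Theta = theta]; W | theta is Bernoulli(theta).
posterior = {F(1, 4): F(1, 2), F(3, 4): F(1, 2)}


def mu(theta):
    return theta                    # E[W | Theta = theta]


def sigma2(theta):
    return theta * (1 - theta)      # V[W | Theta = theta]


alpha = sum(p * mu(t) for t, p in posterior.items())
sigma2_S = sum(p * sigma2(t) for t, p in posterior.items())                # E[sigma^2(Theta)]
sigma2_P = sum(p * mu(t) ** 2 for t, p in posterior.items()) - alpha ** 2  # V[mu(Theta)]

# Unconditionally, W is Bernoulli(alpha), so V[W] = alpha * (1 - alpha).
total_variance = alpha * (1 - alpha)
print(sigma2_S, sigma2_P, total_variance)  # prints: 3/16 1/16 1/4
```

Here the stochastic variance 3/16 and the parametric variance 1/16 add up to the total variance 1/4, as (3) states.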

2.1. Point Estimators

In this paper, we are interested in both the estimation of the point forecast

α = E [μ (Θ)]

and the estimators of the variance components of every observation generated in the algorithm in Figure 2 and defined in (3); thus, we first consider the natural point estimators

\hat{α} (n) = \frac{1}{n} \sum_{i = 1}^{n} {\hat{α}}_{i}, {\hat{σ}}_{T}^{2} (n) = \frac{1}{n - 1} \sum_{i = 1}^{n} {({\hat{α}}_{i} - \hat{α} (n))}^{2}, {\hat{σ}}_{S}^{2} (n) = \frac{1}{n} \sum_{i = 1}^{n} S_{i}^{2},

(4)

where

{\hat{α}}_{i} = m^{- 1} \sum_{j = 1}^{m} W_{i j}

, and

S_{i}^{2} = {(m - 1)}^{- 1} \sum_{j = 1}^{m} {(W_{i j} - {\hat{α}}_{i})}^{2}

,

i = 1, \dots, n

. Note that the

{\hat{α}}_{i}

values are i.i.d. with expectation

E [{\hat{α}}_{1}] = α

and variance

\begin{aligned} σ_{T}^{2} & \overset{def}{=} E [{({\hat{α}}_{1} - α)}^{2}] = m^{- 2} (m E [{(W_{11} - α)}^{2}] + m (m - 1) E [(W_{11} - α) (W_{12} - α)]) \\ & = m^{- 1} (σ_{S}^{2} + σ_{P}^{2}) + m^{- 1} (m - 1) σ_{P}^{2} = σ_{P}^{2} + m^{- 1} σ_{S}^{2} . \end{aligned}

(5)

In addition, note that the

S_{i}^{2}

values are i.i.d. with expectation

E [S_{1}^{2}] = σ_{S}^{2}

,

i = 1, \dots, n

. Thus, the next proposition follows from the classical LLN.

Proposition 1.

If

m \geq 1

and

E [W_{11}^{2}] < \infty

, then

\hat{α} (n)

and

{\hat{σ}}_{T}^{2} (n)

are unbiased and consistent (as

n \to \infty

) estimators for α and

σ_{T}^{2}

(as defined in (5)), respectively.

Furthermore, if

m \geq 2

and

E [W_{11}^{2}] < \infty

, then

{\hat{σ}}_{S}^{2} (n)

is an unbiased and consistent (as

n \to \infty

) estimator for

σ_{S}^{2}

(as defined in (3)).
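The point estimators in (4) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the implementation used for the experiments: `sample_theta` and `sample_w` are hypothetical user-supplied samplers for the posterior and for the performance variable given the parameters, and the toy model in the usage example (Θ uniform on (0, 1), W normal with mean θ and unit variance, so α = 1/2 and the stochastic variance equals 1) is chosen only for convenience.

```python
import random
from statistics import mean, variance


def nested_estimates(sample_theta, sample_w, n, m, rng):
    """Two-level nested simulation with n outer draws and m inner replications.

    sample_theta(rng) draws the parameters; sample_w(theta, rng) draws one
    observation of W with the parameters fixed at theta (both hypothetical).
    """
    alpha_i, s2_i = [], []
    for _ in range(n):
        theta = sample_theta(rng)                      # outer level
        w = [sample_w(theta, rng) for _ in range(m)]   # inner level
        alpha_i.append(mean(w))
        if m >= 2:
            s2_i.append(variance(w))                   # S_i^2, divisor m - 1
    alpha_hat = mean(alpha_i)
    sigma2_T_hat = variance(alpha_i)                   # divisor n - 1
    sigma2_S_hat = mean(s2_i) if s2_i else None        # only defined for m >= 2
    return alpha_hat, sigma2_T_hat, sigma2_S_hat, alpha_i, s2_i


rng = random.Random(2023)
est = nested_estimates(
    sample_theta=lambda g: g.random(),                 # Theta ~ Uniform(0, 1)
    sample_w=lambda theta, g: g.gauss(theta, 1.0),     # W | theta ~ N(theta, 1)
    n=5000, m=2, rng=rng)
print(est[:3])
```

The function also returns the per-draw averages and sample variances, since these are the quantities needed later to estimate the accuracy of the point estimators.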

2.2. Accuracy of the Point Estimators

As we stated in Proposition 1, under mild assumptions, the point estimators proposed in (4) are consistent and, thus, converge to the corresponding parameter value (as

n \to \infty

). However, to determine the degree of accuracy of these estimators, we must establish a CLT for each estimator to derive a valid expression for the corresponding ACI. Note that both

\hat{α} (n)

and

{\hat{σ}}_{S}^{2} (n)

are averages of i.i.d. observations; thus, the next proposition follows from the classical CLT for i.i.d. observations.

Proposition 2.

If

m \geq 1

and

E [W_{11}^{2}] < \infty

, then

\frac{\sqrt{n} (\hat{α} (n) - α)}{σ_{T}} \Rightarrow N (0, 1),

as

n \to \infty

.

Furthermore, if

m \geq 2

and

E [W_{11}^{4}] < \infty

, then

\frac{\sqrt{n} ({\hat{σ}}_{S}^{2} (n) - σ_{S}^{2})}{\sqrt{V_{S}}} \Rightarrow N (0, 1),

as

n \to \infty

, where

σ_{S}^{2}

is defined in (3),

σ_{T}^{2}

is defined in (5),

\hat{α} (n)

,

{\hat{σ}}_{S}^{2} (n)

are defined in (4), and

V_{S} = E [{(S_{1}^{2} - σ_{S}^{2})}^{2}]

, where

S_{1}^{2}

is defined in (4).

Since we have consistent estimators for

σ_{T}^{2}

and

V_{S}

(under mild assumptions), the next corollary follows from Proposition 2 and Slutsky’s Theorem; the details of the proof are given in Appendix A.

Corollary 1.

Under the same notation and assumptions as in Proposition 2, for

m \geq 1

, we have

\frac{\sqrt{n} (\hat{α} (n) - α)}{\sqrt{{\hat{σ}}_{T}^{2} (n)}} \Rightarrow N (0, 1),

as

n \to \infty

, and for

m \geq 2

, we have

\frac{\sqrt{n} ({\hat{σ}}_{S}^{2} (n) - σ_{S}^{2})}{\sqrt{{\hat{V}}_{S} (n)}} \Rightarrow N (0, 1),

as

n \to \infty

, where

{\hat{σ}}_{S}^{2} (n)

and

{\hat{σ}}_{T}^{2} (n)

are defined in (4), and

{\hat{V}}_{S} (n) = \frac{1}{n - 1} \sum_{i = 1}^{n} {(S_{i}^{2} - S^{2})}^{2}, \quad S^{2} = \frac{1}{n} \sum_{i = 1}^{n} S_{i}^{2} .

In order to obtain a CLT for

{\hat{σ}}_{T}^{2} (n)

, note that this estimator is the sample variance of i.i.d. observations; thus, we can use the following lemma. A proof using the delta method is provided in Appendix A (see also, e.g., Proposition 2 of [36]).

Lemma 1.

If

X_{1}, X_{2}, \dots

is a sequence of i.i.d. random variables with

E [X_{1}^{4}] < \infty

, then

\frac{\sqrt{n} (S^{2} (n) - σ_{1}^{2})}{\sqrt{σ_{2}^{2}}} \Rightarrow N (0, 1),

as

n \to \infty

, where

σ_{1}^{2} = μ_{2} - μ_{1}^{2}

,

σ_{2}^{2} = 8 μ_{1}^{2} μ_{2} - 4 μ_{1}^{4} - 4 μ_{1} μ_{3} + μ_{4} - μ_{2}^{2}

,

μ_{k} = E [X_{1}^{k}]

,

k = 1, 2, 3, 4

;

S^{2} (n) = {(n - 1)}^{- 1} \sum_{i = 1}^{n} {(X_{i} - {\hat{μ}}_{1})}^{2}

,

{\hat{μ}}_{1} = n^{- 1} \sum_{i = 1}^{n} X_{i}

.

Note that, for

k = 1, 2, 3, 4

,

{\hat{μ}}_{k}

of Lemma 1 is an unbiased and consistent estimator for

μ_{k}

, so the next corollary follows directly from Lemma 1.

Corollary 2.

Under the same assumptions as those in Lemma 1, we have

\frac{\sqrt{n} (S^{2} (n) - σ_{1}^{2})}{\sqrt{{\hat{σ}}_{2}^{2} (n)}} \Rightarrow N (0, 1),

as

n \to \infty

, where

{\hat{σ}}_{2}^{2} = 8 {\hat{μ}}_{1}^{2} {\hat{μ}}_{2} - 4 {\hat{μ}}_{1}^{4} - 4 {\hat{μ}}_{1} {\hat{μ}}_{3} + {\hat{μ}}_{4} - {\hat{μ}}_{2}^{2}

,

{\hat{μ}}_{k} = n^{- 1} \sum_{i = 1}^{n} X_{i}^{k}

.
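The raw-moment expression for the variance estimator in Corollary 2 is algebraically the same as the fourth central moment minus the squared variance (both computed with divisor n). The sketch below checks this identity in exact rational arithmetic on a small, arbitrary dataset; the function names and the data are hypothetical.

```python
from fractions import Fraction as F


def raw_moment(xs, k):
    """k-th raw sample moment with divisor n."""
    return sum(F(x) ** k for x in xs) / len(xs)


def sigma2_2_hat(xs):
    """Raw-moment form used in Corollary 2."""
    m1, m2, m3, m4 = (raw_moment(xs, k) for k in (1, 2, 3, 4))
    return 8 * m1**2 * m2 - 4 * m1**4 - 4 * m1 * m3 + m4 - m2**2


def central_form(xs):
    """Fourth central moment minus the squared second central moment."""
    m1 = raw_moment(xs, 1)
    c2 = sum((F(x) - m1) ** 2 for x in xs) / len(xs)
    c4 = sum((F(x) - m1) ** 4 for x in xs) / len(xs)
    return c4 - c2**2


xs = [0, 1, 1, 3, 7]  # arbitrary illustrative data
print(sigma2_2_hat(xs) == central_form(xs))  # prints: True
```

In floating-point arithmetic the central-moment form is usually the safer choice, because the raw-moment form subtracts large terms of similar magnitude.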

Since

{\hat{σ}}_{T}^{2} (n)

is the sample variance of

{\hat{α}}_{i}

,

i = 1, \dots, n

, the next corollary follows directly from Lemma 1.

Corollary 3.

If

m \geq 1

and

E [W_{11}^{4}] < \infty

, then

\frac{\sqrt{n} ({\hat{σ}}_{T}^{2} (n) - σ_{T}^{2})}{\sqrt{{\hat{V}}_{T} (n)}} \Rightarrow N (0, 1),

as

n \to \infty

, where

{\hat{V}}_{T} (n) = 8 {\bar{α}}_{1}^{2} {\bar{α}}_{2} - 4 {\bar{α}}_{1}^{4} - 4 {\bar{α}}_{1} {\bar{α}}_{3} + {\bar{α}}_{4} - {\bar{α}}_{2}^{2}

,

{\bar{α}}_{k} = n^{- 1} \sum_{i = 1}^{n} {\hat{α}}_{i}^{k}

,

k = 1, 2, 3, 4

.

Corollaries 1, 2, and 3 are the CLTs required to establish an ACI for the point forecast

α

, the stochastic variance

σ_{S}^{2}

, and the variance

σ_{T}^{2} = σ_{P}^{2} + m^{- 1} σ_{S}^{2}

, respectively. According to these corollaries, for

0 < β < 1

, the

100 (1 - β) %

ACIs are centered at the corresponding point estimator, and their halfwidths are given by:

H W_{α} = z_{β} \frac{\sqrt{{\hat{σ}}_{T}^{2} (n)}}{\sqrt{n}}, \quad H W_{σ_{S}^{2}} = z_{β} \frac{\sqrt{{\hat{V}}_{S} (n)}}{\sqrt{n}}, \quad \text{and} \quad H W_{σ_{T}^{2}} = z_{β} \frac{\sqrt{{\hat{V}}_{T} (n)}}{\sqrt{n}},

(6)

for

α

,

σ_{S}^{2}

, and

σ_{T}^{2}

, respectively, where

{\hat{σ}}_{T}^{2} (n)

is defined in (4), and

{\hat{V}}_{S} (n)

and

{\hat{V}}_{T} (n)

are defined in Corollary 1 and in Corollary 3, respectively.
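The three halfwidths in (6) depend only on the inner-level averages and the inner-level sample variances. The following sketch shows one way to compute them; the function name `halfwidths` and its argument names are assumptions made for illustration, and the guard against a negative value is a numerical precaution that is not part of the formulas.

```python
import math
from statistics import NormalDist, mean, variance


def halfwidths(alpha_i, s2_i, beta=0.10):
    """Halfwidths of the ACIs for alpha, sigma_S^2, and sigma_T^2.

    alpha_i: inner-level averages, one per outer draw.
    s2_i: inner-level sample variances (requires m >= 2), one per outer draw.
    """
    n = len(alpha_i)
    z = NormalDist().inv_cdf(1.0 - beta / 2.0)    # (1 - beta/2)-quantile of N(0, 1)
    sigma2_T_hat = variance(alpha_i)              # sample variance of the averages
    v_S_hat = variance(s2_i)                      # sample variance of the S_i^2
    a1, a2, a3, a4 = (mean([a**k for a in alpha_i]) for k in (1, 2, 3, 4))
    v_T_hat = 8 * a1**2 * a2 - 4 * a1**4 - 4 * a1 * a3 + a4 - a2**2
    v_T_hat = max(v_T_hat, 0.0)                   # guard against rounding error
    return (z * math.sqrt(sigma2_T_hat / n),
            z * math.sqrt(v_S_hat / n),
            z * math.sqrt(v_T_hat / n))


# Arbitrary illustrative inputs.
hw_alpha, hw_S, hw_T = halfwidths([0.0, 1.0, 1.0, 3.0, 7.0], [1.0, 2.0, 3.0, 4.0, 5.0])
print(hw_alpha, hw_S, hw_T)
```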

As we have seen, by using the algorithm in Figure 2, we can build valid ACIs for the parameters of interest in this article, although a relevant question is that of how to distribute the computing time between the two loops to obtain more accurate point estimators—that is, given a budget

k = n m

, which value of m provides the most accurate estimators? For the case of the estimation of the point forecast

α

, we can answer this question, as we explain below. Since the point estimator

\hat{α} (n)

is an average of i.i.d. random variables, for fixed values of

k = n m

, it follows from Equation (5) that the variance of

\hat{α} (n)

is given by

n^{- 1} σ_{T}^{2} = k^{- 1} (m σ_{P}^{2} + σ_{S}^{2}),

(7)

and takes its minimal value when

m = 1

, suggesting that the point estimator

\hat{α} (n)

defined in (4) is more accurate when m is smaller (i.e., it takes the value of 1). However, note that a smaller value of m is convenient (from the point of view of running time) for a fixed budget of

k = n m

when the computing time needed to generate a random variate from

p (θ | x)

in the algorithm in Figure 2 is negligible compared to the computing time needed to generate

W_{i j}

, as is the case in most real applications.
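To illustrate the trade-off numerically, the following sketch evaluates the variance of the point estimator of α for a fixed budget k = nm. It assumes that m divides k and relies on the law of total variance, under which each inner-level average has variance V[μ(Θ)] + E[σ²(Θ)]/m; the function name and the values of the variance components are hypothetical.

```python
def var_alpha_hat(k, m, sigma2_S, sigma2_P):
    """Variance of the point estimator of alpha for a budget k = n * m."""
    n = k // m                                    # outer-level sample size
    var_inner_average = sigma2_P + sigma2_S / m   # law of total variance
    return var_inner_average / n


# Hypothetical variance components, chosen only for illustration.
values = [var_alpha_hat(k=240, m=m, sigma2_S=4.0, sigma2_P=1.0) for m in (1, 2, 3, 4, 5)]
for m, v in zip((1, 2, 3, 4, 5), values):
    print(m, round(v, 5))
```

For these inputs the variance grows with m, so the smallest value is obtained with a single inner-level observation per outer draw.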

Note that the CLTs stated in Corollaries 1, 2, and 3 were obtained for a fixed value of m as

n \to \infty

in the algorithm in Figure 2, which means that the accuracy of the corresponding point estimator increases as the number of observations in the outer level increases and m remains fixed. An interesting result is that we can also obtain a CLT for the point forecast

α

if we allow m to increase with n, as we state in the following proposition (a proof using the Lindeberg–Feller theorem is provided in Appendix A).

Proposition 3.

Given

0 < p \leq 1

, if

m = ⌊ n^{- 1 + 1 / p} ⌋

and

E [W_{11}^{2}] < \infty

, then

\frac{\sqrt{n} (\hat{α} (n) - α)}{\sqrt{σ_{T}^{2}}} \Rightarrow N (0, 1),

as

n \to \infty

, where

σ_{T}^{2}

is defined in (5) and, for any real number s,

⌊ s ⌋

denotes the integer part of s.

Note that the last proposition implies that the ACI defined in Equation (6) for the point forecast

α

is also valid under the assumptions of Proposition 3. If, once again, we set the total number of iterations in the algorithm in Figure 2 to

k = n m

, we let

n \approx k^{p}

,

m \approx k^{1 - p}

, and nm = k, as in Proposition 3, and it follows from Equation (5) that the variance of

\hat{α} (n)

is

n^{- 1} σ_{T}^{2} \approx k^{- p} (σ_{P}^{2} + k^{- (1 - p)} σ_{S}^{2}) = k^{- p} σ_{P}^{2} + k^{- 1} σ_{S}^{2}

for every

0 < p \leq 1

. In this case, for a fixed value of k,

n^{- 1} σ_{T}^{2}

reaches its minimum value when

p = 1

, that is, when

n = k

and

m = 1

. However, note that we need

m \geq 2

in order to estimate

σ_{S}^{2}

. In Section 4, we report some empirical results that confirm our theoretical results. It is worth mentioning that the case of

n = k

and

m = 1

has been reported in the literature as the posterior sampling algorithm (see, e.g., [34,37]).

3. A Forecast Model for Inventory Management with an Analytical Solution

The following model was proposed in [8] to forecast sporadic demand by incorporating data on times between arrivals and customer demand, and uncertainty in the model parameters was incorporated by using a Bayesian approach. For this model, we will show analytical expressions for the performance measures defined in Section 2. These expressions are used in Section 4 to provide empirical evidence of the validity of the ACIs proposed in Section 2. This application example will also illustrate the notation that we used in the previous sections.

The arrivals of customers who enter a store to buy a certain item follow a Poisson process. There is uncertainty in the value of the arrival rate

Θ_{0}

, although we assume that given

[Θ_{0} = θ_{0}]

, the times between customer arrivals are i.i.d. with exponential density:

f (y | θ_{0}) = \begin{cases} θ_{0} e^{- θ_{0} y}, & y > 0, \\ 0, & \text{otherwise}, \end{cases}

(8)

where

θ_{0} \in S_{00} = (0, \infty)

. We assume that any client can order j units of this item with probability

Θ_{1 j}

,

j = 1, \dots, q

,

q \geq 2

. Let

Θ_{1} = (Θ_{11}, \dots Θ_{1 (q - 1)})

and

Θ_{1 q} = 1 - \sum_{j = 1}^{q - 1} Θ_{1 j}

; then,

Θ = (Θ_{0}, Θ_{1})

is the vector of parameters, and

S_{0} = S_{00} ⨂ S_{01}

is the parameter space, where

S_{01} = {(θ_{11}, \dots, θ_{1 (q - 1)}) : \sum_{j = 1}^{q - 1} θ_{1 j} \leq 1; θ_{1 j} \geq 0, j = 1, \dots, q - 1}

.

We are interested in forecasting the total demand for this item over a period of length T:

D = \begin{cases} \sum_{i = 1}^{N (T)} U_{i}, & N (T) > 0, \\ 0, & \text{otherwise}, \end{cases}

(9)

where, for any

s \geq 0

,

N (s)

is the number of customers who came to buy the item during the interval

[0, s]

, and

U_{1}, U_{2}, \dots

are the individual demands, which are assumed to be conditionally independent given

Θ

. The vector of real-data observations is denoted by

x = (v, u)

and consists of i.i.d. observations

v = (v_{1}, \dots, v_{r})

,

u = (u_{1}, \dots, u_{r})

of past customers, where

v_{i}

is the interarrival time between customer i and customer (

i - 1

), and

u_{i}

is the number of units ordered by customer i. By taking Jeffreys’ non-informative prior as the prior density for

Θ

, we obtain the posterior density (see [8] for details)

p (θ | x) = p (θ_{0} | v) p (θ_{1} | u)

, where

θ = (θ_{0}, θ_{1})

, and

p (θ_{0} | v) = \frac{θ_{0}^{r - 1} {(\sum_{i = 1}^{r} v_{i})}^{r} e^{- θ_{0} \sum_{i = 1}^{r} v_{i}}}{(r - 1)!}, \quad p (θ_{1} | u) = \frac{{(1 - \sum_{j = 1}^{q - 1} θ_{1 j})}^{c_{q} - 1 / 2} \prod_{j = 1}^{q - 1} θ_{1 j}^{c_{j} - 1 / 2}}{B (c_{1} + 1 / 2, \dots, c_{q} + 1 / 2)},

(10)

where

c_{j} = \sum_{i = 1}^{r} I [u_{i} = j]

, and

B (a_{1}, \dots, a_{q}) = \prod_{j = 1}^{q} Γ (a_{j}) / Γ (\sum_{j = 1}^{q} a_{j})

for

a_{1}, \dots, a_{q} > 0

. With this notation, we can show that (see [1] for details)

α = E [T Θ_{0}] \sum_{j = 1}^{q} j p_{j},

σ_{P}^{2} = \frac{E [T^{2} Θ_{0}^{2}]}{(q_{0} + 1)} \sum_{j = 1}^{q} j^{2} p_{j} + \frac{E {[T Θ_{0}]}^{2} [(q_{0} / r) - 1]}{(q_{0} + 1)} {(\sum_{j = 1}^{q} j p_{j})}^{2},

σ_{S}^{2} = E [T Θ_{0}] \sum_{j = 1}^{q} j^{2} p_{j},

where

E [T Θ_{0}] = T r {(\sum_{i = 1}^{r} v_{i})}^{- 1}

,

E [T^{2} Θ_{0}^{2}] = T^{2} r (1 + r) {(\sum_{i = 1}^{r} v_{i})}^{- 2}

,

p_{j} = q_{j} / q_{0}

,

q_{j} = c_{j} + 1 / 2

,

j = 1, \dots, q

,

q_{0} = \sum_{j = 1}^{q} q_{j}

, and

c_{j}

is defined in (10).
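The posterior in (10) is the product of a gamma density with shape r and rate equal to the sum of the observed interarrival times, and a Dirichlet density with parameters c_j + 1/2, so both levels of a nested simulation for this model are easy to sample. The sketch below is a minimal illustration: the function names are hypothetical, the Dirichlet draw uses the standard normalized-gamma construction, and the data values in the usage example are assumptions chosen only to exercise the code.

```python
import random


def sample_theta(sum_v, r, counts, rng):
    """Draw (theta_0, theta_1) from the posterior in (10).

    theta_1 is returned with all q components, including the last one.
    """
    theta0 = rng.gammavariate(r, 1.0 / sum_v)          # shape r, scale 1 / sum_v
    g = [rng.gammavariate(c + 0.5, 1.0) for c in counts]
    total = sum(g)
    theta1 = [x / total for x in g]                    # Dirichlet(c_j + 1/2)
    return theta0, theta1


def sample_demand(theta, horizon, rng):
    """Simulate the total demand D over [0, horizon] given theta."""
    theta0, theta1 = theta
    sizes = range(1, len(theta1) + 1)                  # possible order sizes 1, ..., q
    demand, clock = 0, rng.expovariate(theta0)         # first arrival time
    while clock <= horizon:
        demand += rng.choices(sizes, weights=theta1)[0]
        clock += rng.expovariate(theta0)               # next interarrival time
    return demand


rng = random.Random(7)
draws = []
for _ in range(2000):                                  # one inner observation per outer draw
    theta = sample_theta(sum_v=10.0, r=20, counts=[5, 3, 2, 3, 7], rng=rng)
    draws.append(sample_demand(theta, horizon=15.0, rng=rng))
print(sum(draws) / len(draws))
```

Note that `random.gammavariate` is parameterized by shape and scale, so the rate of the gamma posterior enters as its reciprocal.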

4. Empirical Results

To validate the ACIs proposed in (6), we conducted some experiments with the Bayesian model of Section 3 to illustrate the estimation of

α

,

σ_{S}^{2}

, and

σ_{T}^{2}

. We considered the values

T = 15

,

r = 20

,

\sum_{i = 1}^{r} v_{i} = 10

,

q = 5

,

c_{1} = 5

,

c_{2} = 3

,

c_{3} = 2

,

c_{4} = 3

, and

c_{5} = 7

. With these data, the point forecast is

α \approx 95.333

, and the variance components are

σ_{S}^{2} \approx 380.667

and

σ_{P}^{2} \approx 558.598

. Note that

σ_{S}^{2} < σ_{P}^{2}

in this case; therefore, we also ran the same experiments with

T = 5

, so

α \approx 31.778

,

σ_{S}^{2} \approx 126.889

,

σ_{P}^{2} \approx 62.066

, and

σ_{S}^{2} > σ_{P}^{2}

in the latter case. The empirical results that we report below illustrate the typical behavior that we would expect to observe for any other feasible dataset.
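The analytical values of the point forecast and the variance components can be recomputed from the closed-form expressions of Section 3. The sketch below is an illustration under two assumptions: the function and argument names are hypothetical, and the divisor of q_0 in the second term of the parametric variance is taken to be the sample size r, which is what a direct calculation with the moments of the Dirichlet posterior gives.

```python
def analytic_values(horizon, r, sum_v, counts):
    """Closed-form alpha, sigma_S^2, and sigma_P^2 for the demand model."""
    q = [c + 0.5 for c in counts]                  # q_j = c_j + 1/2
    q0 = sum(q)
    p = [qj / q0 for qj in q]                      # p_j = q_j / q_0
    first = sum(j * pj for j, pj in enumerate(p, start=1))        # sum_j j p_j
    second = sum(j * j * pj for j, pj in enumerate(p, start=1))   # sum_j j^2 p_j
    e1 = horizon * r / sum_v                       # E[T Theta_0]
    e2 = horizon**2 * r * (1 + r) / sum_v**2       # E[T^2 Theta_0^2]
    alpha = e1 * first
    sigma2_S = e1 * second
    sigma2_P = (e2 * second + e1**2 * (q0 / r - 1) * first**2) / (q0 + 1)
    return alpha, sigma2_S, sigma2_P


for horizon in (15, 5):
    print(horizon, analytic_values(horizon, r=20, sum_v=10.0, counts=[5, 3, 2, 3, 7]))
```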

In each of the estimation experiments carried out for this research, we considered 1000 independent replications of the algorithm in Figure 2 for different numbers of observations in the outer loop (n) and in the inner loop (m); in each replication, we computed the point estimators for

α

,

σ_{S}^{2}

, and

σ_{T}^{2}

, as well as the corresponding halfwidths of 90% ACIs according to Equation (6). Because we are estimating parameters whose values we know a priori, we can report (for a given n and m) the empirical coverage (i.e., the proportion of independent replications in which the corresponding ACI covers the true value of the parameter), the average and the standard deviation of halfwidths, and the square root of the empirical mean squared error defined by

R M S E = \sqrt{\frac{1}{n_{0}} \sum_{i = 1}^{n_{0}} {({\hat{θ}}_{i} - θ)}^{2}},

where

{\hat{θ}}_{i}

is the value obtained in replication i for the point estimation of a parameter

θ, i = 1, 2, \dots, n_{0}

(we set the number of replications to

n_{0} = 1000

).
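The summary statistics described above can be computed from the point estimates and halfwidths collected over the independent replications. The following sketch assumes that these are stored in two lists of equal length; the function name `summarize` and the small input in the usage example are hypothetical.

```python
import math
from statistics import mean, stdev


def summarize(estimates, halfwidths, true_value):
    """Empirical coverage, mean and std. dev. of halfwidths, and RMSE."""
    n0 = len(estimates)
    # An ACI covers the true value when |estimate - true value| <= halfwidth.
    covered = sum(abs(e - true_value) <= h for e, h in zip(estimates, halfwidths))
    rmse = math.sqrt(sum((e - true_value) ** 2 for e in estimates) / n0)
    return covered / n0, mean(halfwidths), stdev(halfwidths), rmse


# Tiny hypothetical input: four replications, true value 2.5.
coverage, hw_mean, hw_sd, rmse = summarize(
    [1.0, 2.0, 3.0, 4.0], [0.5, 0.6, 0.5, 2.0], true_value=2.5)
print(coverage, hw_mean, hw_sd, rmse)
```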

In the first set of experiments, we considered

n m = 240, 2400, 24,000

and

m = 1, 2, 3, 4, 5

for each value of

n m

to compare the effect of increasing the number of observations in the inner loop for a given value of

n m

. The main results of this set of experiments are summarized in Figures 3-8. Note that we do not consider

m = 1

in Figure 5 and Figure 6 to be able to construct an ACI for the stochastic variance

σ_{S}^{2}

.
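For orientation, a single replication of a two-level nested experiment can be sketched as follows. This is an illustration under stated assumptions, not the algorithm of Figure 2 itself: the toy model (a normal parameter and normal observations) replaces the sporadic-demand model, the function and key names are hypothetical, and the estimators are the ones suggested by the proofs in Appendix A (the average of the inner-loop means for the point forecast, their sample variance for the total variance, and the average of the inner-loop sample variances for the stochastic variance).

```python
import math
import random
from statistics import NormalDist, mean, variance


def nested_simulation(n, m, sample_theta, sample_w, level=0.90):
    """One replication: n outer draws of the parameter, m inner observations each."""
    if n < 2 or m < 1:
        raise ValueError("need n >= 2 and m >= 1")
    inner_means, inner_vars = [], []
    for _ in range(n):
        theta = sample_theta()                      # outer level: draw a parameter
        w = [sample_w(theta) for _ in range(m)]     # inner level: m observations
        inner_means.append(mean(w))
        if m >= 2:                                  # a sample variance needs m >= 2
            inner_vars.append(variance(w))
    z = NormalDist().inv_cdf(0.5 + level / 2)       # normal quantile for the ACI
    total_var_hat = variance(inner_means)
    return {
        "alpha": mean(inner_means),
        "total_var": total_var_hat,
        "alpha_halfwidth": z * math.sqrt(total_var_hat / n),
        "stochastic_var": mean(inner_vars) if m >= 2 else None,
    }


# Toy model (an assumption): theta ~ N(10, 2^2) and W | theta ~ N(theta, 3^2),
# so the expectation is 10, the parametric variance 4, the stochastic variance 9.
rng = random.Random(2023)
out = nested_simulation(
    n=2000,
    m=3,
    sample_theta=lambda: rng.gauss(10.0, 2.0),
    sample_w=lambda theta: rng.gauss(theta, 3.0),
)
print(out)
```

With m = 1 the sketch returns `None` for the stochastic variance, which mirrors the reason why m = 1 is excluded from Figure 5 and Figure 6.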

In Figure 3, we illustrate the performance measures for the quality of the estimation procedure that we obtained for the estimation of the point forecast

α

when

T = 15

. As we can observe in Figure 3, the coverages are acceptable (very close to the nominal value of 0.9, even for

n m = 240

). These results validate the ACI defined in (6) for the point forecast

α

. We also observe in Figure 3 that the RMSE, average halfwidth, and standard deviation of the halfwidths improve (decrease) as the number of observations in the outer loop (n) increases, as suggested by Corollary 1. Note also in Figure 3 that a smaller value of m provides smaller RMSEs, average halfwidths, and standard deviations of the halfwidths, thus validating our suggestion that m should be as small as possible to improve the accuracy in the estimation of

α

. In Figure 4, we illustrate the corresponding results for

T = 5

, where we can observe that the same conclusions mentioned for

T = 15

are fulfilled, and the main difference from the previous case is that the RMSE, average halfwidths, and standard deviation of halfwidths are smaller, which is consistent with the fact that the point forecast

α

is smaller than in the case in which

T = 15

. Note that both in the case of Figure 3 and in the case of Figure 4, the RMSE, average halfwidth, and average standard deviation of the halfwidths seem to be a linear function of m.
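One way to see why the error measures for α grow with m at a fixed budget is the usual variance decomposition. Assuming that the point estimator is the average of the n inner-loop means and that the observations are independent at both levels, its variance is (σ_P^2 + σ_S^2 / m) / n = (m σ_P^2 + σ_S^2) / (n m), so the mean squared error is linear in m for fixed n m and the RMSE is its square root. The sketch below evaluates this expression with the values quoted above for T = 15; the function name is hypothetical, and the numbers are a back-of-the-envelope check rather than the values plotted in Figure 3.

```python
import math


def std_error_of_alpha_hat(k, m, sigma_p2, sigma_s2):
    """Standard error of the average of n = k / m inner-loop means of size m."""
    if k % m != 0:
        raise ValueError("m must divide the total budget k")
    n = k // m
    # Variance of one inner-loop mean is sigma_p2 + sigma_s2 / m; averaging n of them divides by n.
    return math.sqrt((sigma_p2 + sigma_s2 / m) / n)


SIGMA_S2, SIGMA_P2 = 380.667, 568.598   # values reported in the text for T = 15
errors = {m: std_error_of_alpha_hat(240, m, SIGMA_P2, SIGMA_S2) for m in (1, 2, 3, 4, 5)}
for m, se in errors.items():
    print(m, round(se, 3))
```

Under these assumptions the standard error increases with m for every positive σ_P^2, and it is insensitive to m only when the parametric variance vanishes.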

In Figure 5, we illustrate the performance measures for the quality of the estimation procedure that we obtained for the estimation of the stochastic variance

σ_{S}^{2}

when

T = 15

. As we can observe in Figure 5, the coverages are acceptable (very close to the nominal value of 0.9, even for

n m = 240

). These results validate the ACI defined in (6) for the stochastic variance

σ_{S}^{2}

. We also observe in Figure 5 that the RMSE, average halfwidth, and standard deviation of halfwidths improve (decrease) as the number of observations in the outer level (n) increases, as suggested by Corollary 2. However, contrary to what we observed for the estimation of

α

, a larger value of m provides smaller RMSE, average halfwidths, and standard deviations of the halfwidths, suggesting that, for a fixed value of

n m

, the quality of the estimation for the stochastic variance

σ_{S}^{2}

improves as the number of the observations in the inner loop (m) increases, although the values are very close for

n m = 2400, 24,000

. In Figure 6, we illustrate the corresponding results for

T = 5

, where we can observe that the same conclusions as those mentioned for

T = 15

are fulfilled, and the main difference from the previous instance that we observed is that, now, the RMSE, average halfwidths, and standard deviation of halfwidths are smaller, which is consistent with the fact that the point forecast

α

is smaller than in the case

T = 15

. Contrary to what we observed for the case of the estimation of

α

, note that both in the case of Figure 5 and in the case of Figure 6, the RMSE, average halfwidth, and average standard deviation of the halfwidth do not seem to be a linear function of m. We emphasize that the case in which

m = 1

is not considered in Figure 5 and Figure 6 because

σ_{S}^{2}

cannot be estimated when

m = 1

.

For the estimation of the total variance

σ_{T}^{2}

(illustrated in Figure 7 for

T = 15

, and in Figure 8 for

T = 5

), we obtained results for the quality of the estimation that were similar to those for the estimation of the point forecast

α

, except that larger values of n were required to obtain reliable coverages. As we can observe in Figure 7 and Figure 8, the coverages are acceptable (very close to the nominal value of 0.9) only for

n = 2400

and 24,000. These results validate the ACI defined in (6) for the total variance

σ_{T}^{2}

. We can also observe in Figure 7 and Figure 8 that the RMSE, average halfwidth, and standard deviation of the halfwidths improve (decrease) as the number of observations in the outer loop (n) increases, as suggested by Corollary 3. Note also in Figure 7 that a smaller value of m provides smaller RMSEs, average halfwidths, and standard deviations of halfwidths. However, for the case in which

T = 5

(illustrated in Figure 8), where

σ_{S}^{2} > σ_{P}^{2}

, we observe that the RMSE and the average halfwidth decrease from

m = 1

to

m = 2

and then increase, showing that the best value of m for the estimation of

σ_{T}^{2}

depends on the value of the ratio of

σ_{S}^{2} / σ_{P}^{2}

.

In a second set of experiments, we considered

n m = 100, 1000, 10,000

, with

m = 1

,

m \approx {(n m)}^{1 / 3}

, and

m \approx {(n m)}^{1 / 2}

for each value of

n m

, to compare the quality of the estimation procedures by using the value of m that we suggested as appropriate for the estimation of the point forecast

α

with other choices of p to illustrate the validity of Proposition 3.

Note that

m \approx {(n m)}^{1 / 3}

is equivalent to

p = 2 / 3

in Proposition 3, and

m \approx {(n m)}^{1 / 2}

corresponds to

p = 1 / 2

. Note also that

m \approx {(n m)}^{1 / 3}

corresponds to the value of m suggested in [5], which is a good option for the case of biased estimation in the inner loop of the algorithm in Figure 2. The results of this set of experiments are summarized in Figure 9, Figure 10, Figure 11 and Figure 12. Note that we do not consider the estimation of the stochastic variance

σ_{S}^{2}

in this set of experiments because

m \geq 2

is required to construct an ACI for the stochastic variance

σ_{S}^{2}

. Note also that we consider

a = 100

,

b = 1000

,

c = 10,000

,

a^{1 / 3} \approx 5

,

c^{1 / 3} \approx 20

, and

b^{1 / 2} \approx 32

, and we use the same color (red) for

m = a^{1 / 3}, b^{1 / 3}

, and

c^{1 / 3}

, as well as the same color (yellow) for

m = a^{1 / 2}, b^{1 / 2},

and

c^{1 / 2}

, to report our results in Figure 9, Figure 10, Figure 11 and Figure 12.
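The correspondence between p and the inner sample size follows from writing the budget as k = n m with n = k^p, which gives m = k^(1 - p); this matches the choice m_n = ⌊n^(-1 + 1/p)⌋ used in the proof of Proposition 3. The short sketch below prints the unrounded values for the three budgets considered; the function name is hypothetical, and the values quoted in the text are roundings of these roots.

```python
def split_budget(k, p):
    """Return the unrounded pair (n, m) with n = k**p and m = k**(1 - p), so n * m = k."""
    return k ** p, k ** (1.0 - p)


for k in (100, 1000, 10_000):
    for p in (2 / 3, 1 / 2):
        n, m = split_budget(k, p)
        print(f"k={k:>6}  p={p:.3f}  n={n:9.2f}  m={m:7.2f}")
```

In practice both n and m must be integers, so some rounding of m (and a matching adjustment of n) is unavoidable.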

In Figure 9 and Figure 10, we illustrate the performance measures for the quality of the estimation procedure that we obtained for the estimation of the point forecast

α

in our second set of experiments. As we can observe in Figure 9 and Figure 10, the coverages are acceptable (very close to the nominal value of 0.9, even for

n = 100

). These results validate the ACI defined in (6) for the point forecast

α

and the ACI suggested by Proposition 3. We can also observe in Figure 9 and Figure 10 that the RMSE, average halfwidth, and standard deviation of the halfwidths are worse than those for

m = 1

when

m \approx {(n m)}^{1 / 3}

, and they are even worse for

m \approx {(n m)}^{1 / 2}

, thus confirming our finding that, for a fixed number of simulated observations

k = n m

, a smaller value of m produces better point estimators for

α

, in agreement with Proposition 3.

Finally, in Figure 11 and Figure 12, we show our results for the second set of experiments and the estimation of the total variance

σ_{T}^{2}

.

In Figure 11 and Figure 12, we found similar results to those for the case of the estimation of the point forecast

α

: the coverages were very good (even for n = 100), and all measures of the quality of the point estimation (RMSE, average and standard deviation of halfwidths) were worse than those for

m = 1

when

m \approx {(n m)}^{1 / 3}

, and they were even worse for

m \approx {(n m)}^{1 / 2}

, suggesting that, as in the case of the estimation of

α

, a smaller value of m produces better point estimators for

σ_{T}^{2}

given a fixed number of replications

k = n m

, with the only exception in Figure 12 (

σ_{S}^{2} > σ_{P}^{2}

) being for the case of the average halfwidths and

n m = 100

, where the average halfwidth seems to decrease with the value of m.

5. Conclusions

In this article, we discussed methods for the estimation of both the point forecast and the variance components of Bayesian forecast models from the output of stochastic simulations by using a two-level nested algorithm in which the simulated observations at both levels of the algorithm are independent. Our main contribution is the development of valid asymptotic confidence intervals for assessing the accuracy of the simulation-based point estimators. These methods are particularly useful when there is uncertainty in the parameter values of the random input components of a forecast model, and we wish to incorporate this uncertainty into the simulation-based forecasting of the model’s performance measures.

The proposed point estimators and their corresponding halfwidths are asymptotically valid, as shown by the theoretical and experimental results, which show that the point estimators converge to the corresponding parameter values and that the coverages of the ACIs converge to the nominal value as the number of replications n of the outer level increases. In addition, the halfwidths corresponding to the proposed ACIs tend to zero (as

n \to \infty

), which normally occurs with the appropriate simulation-based estimators of performance measurements from the outputs of simulation experiments.

We also investigated the best option for the number of observations m in the inner loop of the algorithm given a fixed number of observations

k = n m

, and we found that the choice of only one observation (

m = 1

) is the best option for obtaining the smallest variance of the point estimator for the expectation of a performance variable. However, for the estimation of the stochastic variance

σ_{S}^{2}

in a two-level nested algorithm,

m \geq 2

is required. On the other hand, for the estimation of the stochastic variance

σ_{S}^{2}

, our experimental results show that larger values of m are better, and the best choice of m depends on the ratio of

σ_{S}^{2} / σ_{P}^{2}

for the case of the estimation of the total variance (

σ_{T}^{2}

).

We remark that we did not consider the case of correlated observations in the outer loop of the two-level algorithm or the case of steady-state simulations, which may have important applications for simulation-based estimation by using Markov chain Monte Carlo (see [38]). In addition, experimentation with other computational procedures, such as quasi-Monte Carlo (see [6]) or Simpson integration, may be other directions for future research.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/appliedmath3030031/s1.

Funding

This research was supported by the Asociación Mexicana de Cultura A.C. and the National Council of Science and Technology of Mexico under Award Number 1200/158/2022.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The raw data corresponding to the results of our experiments are available in the repository of AppliedMath (Supplementary Materials).

Acknowledgments

The author is very grateful for the valuable suggestions and comments from the Academic Editor and the three anonymous referees.

Conflicts of Interest

The author declares no conflict of interest.

Appendix A

For completeness, we first cite three well-known theorems. Proofs of Theorems A1 and A2 can be found, e.g., in [39], and a proof of Theorem A3 can be found, e.g., in [36]. In what follows, we write ⇒ for weak convergence (as

n \to \infty

without explicit mention). In addition, ℜ will denote the space of real numbers, and, for an integer k,

ℜ^{k}

will denote the k-dimensional space of real numbers.

Theorem A1.

(Slutsky). Let

X, Y, X_{1}, X_{2}, \dots, Y_{1}, Y_{2}, \dots

be random variables and let c be a real constant. If

X_{n} \Rightarrow X

and

Y_{n} \Rightarrow c

, then:

(i): $X_{n} + Y_{n} \Rightarrow X + c$
(ii): $X_{n} Y_{n} \Rightarrow c X$
(iii): $X_{n} / Y_{n} \Rightarrow X / c$ , if $c \neq 0$ .

Theorem A2.

(Continuous mapping). Let

X, X_{1}, X_{2}, \dots

be

ℜ^{k}

-valued random vectors, and let

g : ℜ^{k} \to ℜ

be a function such that

P [X \in D (g)] = 0

, where

D (g) = {x : g (x)

is not continuous at

x}

. If

X_{n} \Rightarrow X

, then

g (X_{n}) \Rightarrow g (X)

.

Theorem A3.

(Delta method). Let

Y_{1}, Y_{2}, \dots

be

ℜ^{k}

-valued random vectors, and let

g : ℜ^{k} \to ℜ

be a function that is differentiable in a neighborhood of

μ \in ℜ^{k}

. If there exists a

k \times k

matrix G such that the CLT

\sqrt{n} [\bar{Y} (n) - μ] \Rightarrow G N_{k} (0, I)

is satisfied, where

\bar{Y} (n) = n^{- 1} \sum_{i = 1}^{n} Y_{i}

, and

N_{k} (0, I)

denotes a (k-variate) normal distribution with a mean of 0 and variance of I (the identity), then

\sqrt{n} [g (\bar{Y} (n)) - g (μ)] \Rightarrow σ N (0, 1),

where

σ = \sqrt{\nabla g {(μ)}^{T} G G^{T} \nabla g (μ)}

.

Proof of Corollary 1.

Since

{\hat{α}}_{1}, {\hat{α}}_{2}, \dots

are i.i.d. with

E [{\hat{α}}_{1}^{2}] < \infty

, it follows from the Law of Large Numbers that

n^{- 1} (\sum_{i = 1}^{n} {\hat{α}}_{i}^{2}, \sum_{i = 1}^{n} {\hat{α}}_{i}) \Rightarrow (E [{\hat{α}}_{1}^{2}], E [{\hat{α}}_{1}])

. Therefore, by taking

g (x_{1}, x_{2}) = \sqrt{x_{1} - x_{2}^{2}}

for

x_{1} - x_{2}^{2} \geq 0

in Theorem A2, we have

\sqrt{(n - 1) {\hat{σ}}_{T}^{2} (n) / n} \Rightarrow \sqrt{σ_{T}^{2}}

, so

Y_{n} = \sqrt{{\hat{σ}}_{T}^{2} (n)} / \sqrt{σ_{T}^{2}} \Rightarrow 1

. Finally, by taking

X_{n} = \sqrt{n} (\hat{α} (n) - α) / \sqrt{σ_{T}^{2}}

in Theorem A1, it follows from Proposition 2 that

\frac{\sqrt{n} (\hat{α} (n) - α)}{\sqrt{{\hat{σ}}_{T}^{2} (n)}} \Rightarrow N (0, 1)

Similarly, since

S_{1}^{2}, S_{2}^{2}, \dots

are i.i.d. with

E [S_{1}^{4}] < \infty

, it follows from Theorem A1 and Proposition 2 that

\frac{\sqrt{n} ({\hat{σ}}_{S}^{2} (n) - σ_{S}^{2})}{\sqrt{{\hat{V}}_{S} (n)}} \Rightarrow N (0, 1) .

□

Proof of Lemma 1.

Let

k = 2

,

Y_{i} = (X_{i}, X_{i}^{2})

,

μ = (μ_{1}, μ_{2})

; then, the CLT of Theorem A3 is satisfied for

G G^{T} = [\begin{matrix} μ_{2} - μ_{1}^{2} & μ_{3} - μ_{1} μ_{2} \\ μ_{3} - μ_{1} μ_{2} & μ_{4} - μ_{2}^{2} \end{matrix}] .

Taking

g (μ) = μ_{2} - μ_{1}^{2} = σ_{1}^{2}

, we have

g (\bar{Y} (n)) = (n - 1) S^{2} (n) / n

,

\nabla g {(μ)}^{T} = (- 2 μ_{1}, 1)

, and

\nabla g {(μ)}^{T} G G^{T} \nabla g (μ) = 8 μ_{1}^{2} μ_{2} - 4 μ_{1}^{4} - 4 μ_{1} μ_{3} + μ_{4} - μ_{2}^{2} = σ_{2}^{2} .

It follows from Theorem A3 that

\sqrt{n} [(n - 1) S^{2} (n) / n - σ_{1}^{2}] \Rightarrow σ_{2} N (0, 1),

and the final conclusion follows from Theorem A1. □
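The quadratic form in the proof of Lemma 1 can be checked numerically. The sketch below, with hypothetical function names, evaluates ∇g(μ)^T G G^T ∇g(μ) directly from the raw moments μ_1, ..., μ_4 and compares it with the closed-form expression; for illustration it uses the raw moments of the exponential distribution with mean 1 (1, 2, 6, 24) and of the standard normal distribution (0, 1, 0, 3), which are standard facts rather than values taken from this article.

```python
def quadratic_form(mu1, mu2, mu3, mu4):
    """Evaluate grad(g)^T (G G^T) grad(g) for g(mu) = mu2 - mu1**2."""
    cov = [
        [mu2 - mu1 ** 2, mu3 - mu1 * mu2],
        [mu3 - mu1 * mu2, mu4 - mu2 ** 2],
    ]
    grad = [-2.0 * mu1, 1.0]
    return sum(grad[i] * cov[i][j] * grad[j] for i in range(2) for j in range(2))


def closed_form(mu1, mu2, mu3, mu4):
    """Closed-form expression stated in the proof of Lemma 1."""
    return 8 * mu1 ** 2 * mu2 - 4 * mu1 ** 4 - 4 * mu1 * mu3 + mu4 - mu2 ** 2


for moments in [(1.0, 2.0, 6.0, 24.0), (0.0, 1.0, 0.0, 3.0)]:
    print(moments, quadratic_form(*moments), closed_form(*moments))
```

Both expressions agree, and the result equals the fourth central moment minus the squared variance, which is the familiar asymptotic variance of the sample variance.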

Proof of Proposition 3.

In this proof, we follow the notation of the Lindeberg–Feller Theorem as stated in Theorem 7.2.1 of [3].

For n = 1, 2, …, let

m_{n} = ⌊ n^{- 1 + 1 / p} ⌋

and

α_{j} (n) = (\sum_{i = 1}^{m_{n}} W_{i j}) / m_{n}, j = 1, \dots, n

. Then,

α_{1} (n), α_{2} (n), \dots, α_{n} (n)

are independent, and for

X_{n j} = (α_{j} (n) - α) / \sqrt{n σ_{T}^{2}}

, we also have that

X_{n 1}, X_{n 2}, \dots, X_{n n}

are independent.

Then, if

Y_{n j} = (α_{j} (n) - α) / σ_{T}

, we have

E [Y_{n j}] = 0

and

E [Y_{n j}^{2}] = 1

, so given

ϵ > 0

, there exists

η_{0} > 0

such that

\int_{| y | > η_{0}} y^{2} d F_{Y_{n j}} (y) < ϵ

.

Therefore, given

η > 0

, for

n \geq \max \{1, {(η_{0} / η)}^{2}\}

, we have

\sum_{j = 1}^{n} \int_{| x | > η} x^{2} d F_{X_{n j}} (x) \leq \sum_{j = 1}^{n} \frac{1}{n} \int_{| y | > η_{0}} y^{2} d F_{Y_{n j}} (y) < ϵ,

so (1) of Theorem 7.2.1 of [3] is satisfied, and it follows from this Theorem that

S_{n} \Rightarrow N (0, 1)

, where

S_{n} = \sum_{j = 1}^{n} X_{n j} = \frac{\sqrt{n} (\hat{α} (n) - α)}{\sqrt{σ_{T}^{2}}} .

□

References

1. Muñoz, D.F. Simulation output analysis for risk assessment and mitigation. In Multi-Criteria Decision Analysis for Risk Assessment and Management; Ren, J., Ed.; Springer: Heidelberg, Germany, 2021; pp. 111–148.
2. Smith, J.S.; Sturrock, D.T. Simio and Simulation: Modeling, Analysis, Applications, 6th ed.; Simio LLC: Sewickley, PA, USA, 2022.
3. Chung, K.L. A Course in Probability Theory, 3rd ed.; Academic Press: Cambridge, MA, USA, 2001.
4. Asmussen, S.; Glynn, P.W. Stochastic Simulation Algorithms and Analysis; Springer: Heidelberg, Germany, 2007.
5. Andradóttir, S.; Glynn, P.W. Computing bayesian means using simulation. ACM TOMACS 2016, 26, 10.
6. L’Ecuyer, P. Quasi-Monte Carlo methods with applications in finance. Financ. Stoch. 2009, 13, 307–349.
7. Zouaoui, F.; Wilson, J.R. Accounting for parameter uncertainty in simulation input modeling. IIE Trans. 2003, 35, 781–792.
8. Muñoz, D.F.; Muñoz, D.F. Bayesian forecasting of spare parts using simulation. In Service Parts Management: Demand Forecasting and Inventory Control; Altay, N., Litteral, L.A., Eds.; Springer: Heidelberg, Germany, 2011; pp. 105–123.
9. Muñoz, D.F. Estimation of expectations in two-level nested simulation experiments. In Proceedings of the 29th European Modeling and Simulation Symposium, Barcelona, Spain, 18–20 September 2017; pp. 233–238.
10. Henderson, S.G. Input model uncertainty: Why do we care and what should we do about it? In Proceedings of the 2003 Winter Simulation Conference, New Orleans, LA, USA, 7–10 December 2003; pp. 90–100.
11. Song, E.; Nelson, B.L.; Pegden, C.D. Advanced tutorial: Input uncertainty quantification. In Proceedings of the 2014 Winter Simulation Conference, Savannah, GA, USA, 7–10 December 2014; pp. 162–176.
12. Barton, R.R.; Lam, H.; Song, E. Input uncertainty in stochastic simulation. In The Palgrave Handbook of Operations Research; Salhi, S., Boylan, J., Eds.; Springer: Heidelberg, Germany, 2022; pp. 573–620.
13. Barton, R.R.; Nelson, B.L.; Xie, W. Quantifying input uncertainty via simulation confidence intervals. Inf. J. Comput. 2014, 26, 74–87.
14. Kleijnen, J.P.C. Sensitivity analysis versus uncertainty analysis: When to use what? In Predictability and Nonlinear Modelling in Natural Sciences and Economics; Grassman, J., van Straten, G., Eds.; Kluwer: Dordrecht, The Netherlands, 1994; pp. 322–333.
15. Kleijnen, J.P.C. Five-stage procedure for the evaluation of simulation models through statistical technique. In Proceedings of the 1996 Winter Simulation Conference, Coronado, CA, USA, 8–11 December 1996; pp. 248–254.
16. Kleijnen, J.P.C. Experimental design for sensitivity analysis, optimization, and validation of simulation models. In Handbook of Simulation: Principles, Methodology, Advances, Applications, and Practice; Banks, J., Ed.; Wiley: New York, NY, USA, 1998; pp. 133–140.
17. Montgomery, D.C. Design and Analysis of Experiments, 8th ed.; Wiley: New York, NY, USA, 2012.
18. Freimer, M.; Schruben, L. Collecting data and estimating parameters for input distributions. In Proceedings of the 2002 Winter Simulation Conference, San Diego, CA, USA, 8–11 December 2002; pp. 392–399.
19. Barton, R.R.; Cheng, R.C.H.; Chick, S.E.; Henderson, S.G.; Law, A.M.; Leemis, L.M.; Schmeiser, B.W.; Schruben, L.W.; Wilson, J.R. Panel on current issues in simulation input modeling. In Proceedings of the 2002 Winter Simulation Conference, San Diego, CA, USA, 8–11 December 2002; pp. 353–369.
20. Cheng, R.C.H. Selecting input models. In Proceedings of the 1994 Winter Simulation Conference, Orlando, FL, USA, 11–14 December 1994; pp. 184–191.
21. Cheng, R.C.H.; Holland, W. Sensitivity of computer simulation experiments to errors in input data. J. Stat. Comput. Simul. 1997, 57, 219–241.
22. Cheng, R.C.H.; Holland, W. Two-point methods for assessing variability in simulation output. J. Stat. Comput. Simul. 1998, 60, 183–205.
23. Cheng, R.C.H.; Holland, W. Calculation of confidence intervals for simulation output. ACM TOMACS 2004, 14, 344–362.
24. Barton, R.R.; Schruben, L.W. Uniform and bootstrap resampling of input distributions. In Proceedings of the 1993 Winter Simulation Conference, Los Angeles, CA, USA, 12–15 December 1993; pp. 503–508.
25. Barton, R.R.; Schruben, L.W. Resampling methods for input modeling. In Proceedings of the 2001 Winter Simulation Conference, Arlington, VA, USA, 9–12 December 2001; pp. 372–378.
26. Bernardo, J.M.; Smith, A.F.M. Bayesian Theory, 8th ed.; Wiley: New York, NY, USA, 2009.
27. Draper, D. Assessment and propagation of model uncertainty. J. R. Stat. Soc. Ser. B Stat. Methodol. 1995, 57, 45–70.
28. Chick, S.E. Bayesian analysis for simulation input and output. In Proceedings of the 1997 Winter Simulation Conference, Atlanta, GA, USA, 7–10 December 1997; pp. 253–260.
29. Chick, S.E. Input distribution selection for simulation experiments: Accounting for input uncertainty. Oper. Res. 2001, 49, 744–758.
30. Zouaoui, F.; Wilson, J.R. Accounting for input-model and input-parameter uncertainties in simulation. IIE Trans. 2004, 36, 1135–1151.
31. Biller, B.; Corlu, C.G. Accounting for parameter uncertainty in large-scale stochastic simulations with correlated inputs. Oper. Res. 2011, 49, 661–673.
32. Berger, J.O.; Bernardo, J.M. On the development of the reference prior method. Bayesian Stat. 1992, 4, 35–60.
33. Chick, S.E. Subjective probability and Bayesian methodology. In Handbooks in Operations Research and Management Science; Henderson, S.G., Nelson, B.L., Eds.; Elsevier: Amsterdam, The Netherlands, 2006; Volume 13, pp. 225–257.
34. Muñoz, D.F.; Muñoz, D.F.; Ramírez-López, A. On the incorporation of parameter uncertainty for inventory management using simulation. Int. Trans. Oper. Res. 2013, 20, 493–513.
35. Xie, W.; Nelson, B.L.; Barton, R.R. A Bayesian framework for quantifying uncertainty in stochastic simulation. Oper. Res. 2014, 62, 1439–1452.
36. Muñoz, D.F.; Glynn, P.W. A batch means methodology for estimation of a nonlinear function of a steady-state mean. Manag. Sci. 1997, 43, 1121–1135.
37. Russo, D.; Van Roy, B. Learning to optimize via posterior sampling. Math. Oper. Res. 2014, 39, 1221–1243.
38. Brooks, S.; Gelman, A.; Jones, G.; Meng, X.L. (Eds.) Handbook of Markov Chain Monte Carlo; CRC Press: Boca Raton, FL, USA, 2011.
39. Serfling, R.J. Approximation Theorems of Mathematical Statistics; Wiley: New York, NY, USA, 2009.

Figure 1. Algorithm for the method of independent replications with a parameter fixed at the value

θ

.

Figure 2. Algorithm for a two-level nested simulation experiment to calculate a point estimator for the expectation of a performance variable under parameter uncertainty.

Figure 3. Performance of the estimation of the point forecast

α

for

T = 15

and fixed

n m

with different values of m.

Figure 4. Performance of the estimation of the point forecast

α

for

T = 5

and fixed

n m

with different values of m.

Figure 5. Performance of the estimation of the stochastic variance

σ_{S}^{2}

for

T = 15

and fixed

n m

with different values of m.

Figure 6. Performance of the estimation of the stochastic variance

σ_{S}^{2}

for

T = 5

and fixed

n m

with different values of m.

Figure 7. Performance of the estimation of the total variance

σ_{T}^{2}

for

T = 15

and fixed

n m

with different values of m.

Figure 8. Performance of the estimation of the total variance

σ_{T}^{2}

for

T = 5

and fixed

n m

with different values of m.

Figure 9. Performance of the estimation of the point forecast

α

for

T = 15

and fixed

n m

to compare

m = 1

,

m \approx {(n m)}^{1 / 3}

, and

m \approx {(n m)}^{1 / 2}

.

Figure 10. Performance of the estimation of the point forecast

α

for

T = 5

and fixed

n m

to compare

m = 1

,

m \approx {(n m)}^{1 / 3}

, and

m \approx {(n m)}^{1 / 2}

.

Figure 11. Performance of the estimation of the total variance

σ_{T}^{2}

for

T = 15

and fixed

n m

to compare

m = 1

,

m \approx {(n m)}^{1 / 3}

, and

m \approx {(n m)}^{1 / 2}

.

Figure 12. Performance of the estimation of the total variance

σ_{T}^{2}

for

T = 5

and fixed

n m

to compare

m = 1

,

m \approx {(n m)}^{1 / 3}

, and

m \approx {(n m)}^{1 / 2}

.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Muñoz, D.F. Estimation of Expectations and Variance Components in Two-Level Nested Simulation Experiments. AppliedMath 2023, 3, 582-600. https://doi.org/10.3390/appliedmath3030031
