1. Introduction
The INAR(1) and GINAR(1) processes were originally proposed by McKenzie [1,2]; the latter model was soon after discussed in more detail by Alzaid and Al-Osh [3]. They rely on the binomial thinning operation due to Steutel and van Harn [4], which is defined below.
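Definition 1. Let X be a non-negative integer-valued r.v. with range $\mathbb{N}_0 = \{0, 1, 2, \dots\}$ and ρ a scalar in $(0,1)$. Then the binomial thinning operation on X results in the r.v.
$$\rho \circ X = \sum_{i=1}^{X} Y_i,$$
where ∘ represents the binomial thinning operator; $\{Y_i : i \in \mathbb{N}\}$ is a sequence of i.i.d. Bernoulli r.v. with parameter ρ; and $\{Y_i : i \in \mathbb{N}\}$ is independent of X. We usually refer to $\rho \circ X$ as the r.v. that arises from X by binomial thinning. Furthermore, we define $0 \circ X = 0$ and $1 \circ X = X$.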
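To make Definition 1 concrete, the following minimal Python sketch draws one realization of the thinned r.v. by counting how many of the x units survive independent Bernoulli trials. The function name `binomial_thinning` and its `rng` argument are illustrative choices, not part of the original text.

```python
import random

def binomial_thinning(rho, x, rng=random):
    """Draw one realization of rho ∘ x: a sum of x i.i.d. Bernoulli(rho) indicators."""
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [0, 1]")
    if x < 0:
        raise ValueError("x must be a non-negative integer")
    # Each of the x units is kept independently with probability rho.
    # rng.random() lies in [0, 1), so rho = 0 keeps nothing and rho = 1 keeps everything,
    # which matches the boundary conventions 0 ∘ X = 0 and 1 ∘ X = X.
    return sum(1 for _ in range(x) if rng.random() < rho)
```

Conditionally on X = x, the returned value follows a binomial(x, ρ) distribution, which is the fact used later to obtain the transition probabilities.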
Now that we have defined the binomial thinning operation, a sort of scalar multiplication counterpart in the integer-valued setting, the reader is reminded of the definition of McKenzie’s GINAR(1) process and its main properties.
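Definition 2. Let $\rho \in (0,1)$. Then $\{X_t : t \in \mathbb{N}_0\}$ is said to be a GINAR(1) process if $X_t$ is written in the form
$$X_t = \rho \circ X_{t-1} + B_t \, G_t, \quad t \in \mathbb{N},$$
where $\{B_t : t \in \mathbb{N}\}$ and $\{G_t : t \in \mathbb{N}\}$ are independent sequences of i.i.d. Bernoulli r.v. with parameter $1-\rho$ and of i.i.d. geometric r.v. with parameter p, respectively; the sequence of innovations $\{\epsilon_t = B_t G_t : t \in \mathbb{N}\}$ and $X_0$ are independent; all thinning operations are performed independently of each other and of $\{\epsilon_t : t \in \mathbb{N}\}$; and all the thinning operations at time t are independent of $\{X_0, \dots, X_{t-1}\}$.
According to McKenzie [2] and Alzaid and Al-Osh [3], if $X_0 \sim \text{geometric}(p)$ then $\{X_t : t \in \mathbb{N}_0\}$ is a stationary AR(1) process with geometric(p) marginal distribution.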
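The recursion in Definition 2 can be simulated directly. The sketch below is only an illustration under stated assumptions: the geometric distribution is taken on {0, 1, 2, ...} with p.f. p(1-p)^k, the Bernoulli sequence has parameter 1-ρ, and the path starts from a geometric(p) draw so that it is stationary. The name `simulate_ginar1` is hypothetical.

```python
import math
import random

def simulate_ginar1(rho, p, n, seed=None):
    """Simulate X_1, ..., X_n from X_t = rho ∘ X_{t-1} + B_t * G_t with X_0 ~ geometric(p)."""
    rng = random.Random(seed)

    def geometric():
        # Inverse-transform draw on {0, 1, 2, ...}: P(G >= k) = (1 - p)**k.
        u = 1.0 - rng.random()  # uniform on (0, 1]
        return int(math.log(u) / math.log(1.0 - p))

    x = geometric()  # assumed stationary start
    path = []
    for _ in range(n):
        survivors = sum(1 for _ in range(x) if rng.random() < rho)   # rho ∘ X_{t-1}
        innovation = geometric() if rng.random() < 1.0 - rho else 0  # B_t * G_t
        x = survivors + innovation
        path.append(x)
    return path
```

For a long simulated path, the sample mean should be close to (1-p)/p and the lag-one sample autocorrelation close to ρ, which offers a quick sanity check of the stationarity claim.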
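McKenzie [2] also adds that $\{X_t : t \in \mathbb{N}_0\}$ is a DTMC with TPM $\mathbf{P} = [p_{ij}]_{i,j \in \mathbb{N}_0}$, where
$$p_{ij} = P(X_t = j \mid X_{t-1} = i) = \rho \binom{i}{j} \rho^{j} (1-\rho)^{i-j} \, I_{\mathbb{N}_0}(i-j) + (1-\rho) \sum_{m=0}^{\min\{i,j\}} \binom{i}{m} \rho^{m} (1-\rho)^{i-m} \, p \, (1-p)^{j-m},$$
and $I_{\mathbb{N}_0}$ represents the indicator function of the set of non-negative integers. These entries can be obtained by taking advantage of a few facts: $\rho \circ X_{t-1} \mid X_{t-1} = 0$ equals 0 with probability 1; $\rho \circ X_{t-1} \mid X_{t-1} = i \sim \text{binomial}(i, \rho)$, for $i \in \mathbb{N}$; the p.f. of the innovations, $\epsilon_t = B_t G_t$, is equal to
$$P(\epsilon_t = k) = \rho \, I_{\{0\}}(k) + (1-\rho) \, p \, (1-p)^{k}, \quad k \in \mathbb{N}_0.$$
The autocorrelation function of the GINAR(1) process is equal to
$$corr(X_t, X_{t-k}) = \rho^{k}, \quad k \in \mathbb{N}_0.$$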
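The facts listed above translate into a short computation of the transition probabilities: the binomial(i, ρ) p.f. of the survivors is convolved with the p.f. of the innovations. The sketch below assumes that the innovations equal zero with probability ρ and are geometric(p) on {0, 1, 2, ...} otherwise; the function names are illustrative.

```python
from math import comb

def innovation_pf(k, rho, p):
    """P(B_t * G_t = k): extra mass rho at zero plus (1 - rho) times the geometric(p) p.f."""
    return (rho if k == 0 else 0.0) + (1.0 - rho) * p * (1.0 - p) ** k

def transition_prob(i, j, rho, p):
    """P(X_t = j | X_{t-1} = i): binomial(i, rho) survivors convolved with the innovation."""
    return sum(
        comb(i, m) * rho ** m * (1.0 - rho) ** (i - m) * innovation_pf(j - m, rho, p)
        for m in range(min(i, j) + 1)
    )

def truncated_tpm(rho, p, size):
    """Top-left size x size block of the (infinite) TPM."""
    return [[transition_prob(i, j, rho, p) for j in range(size)] for i in range(size)]
```

Since the state space is infinite, any stored matrix is a truncation; its rows sum to slightly less than one, and the deficit is the probability of jumping beyond the last retained state.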
We ought to point out that the GINAR(1) process is a particular case of the generalized geometric INAR(1) or GGINAR(1) process, introduced by (Al-Osh and Aly [5], Section 3). Moreover, autocorrelated geometric counts can also be modeled by the new geometric INAR(1) or NGINAR(1) process, proposed by Ristić et al. [6] and relying on the negative binomial thinning operator. Finally, the NGINAR(1) process is a special instance of the ZMGINAR(1) process, the zero-modified geometric first-order integer-valued autoregressive process, introduced and thoroughly described by Barreto-Souza [7].
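The remainder of the paper is organized as follows. In Section 2, we shall prove that $\mathbf{P}$ has two important features stated in the two following theorems.
Theorem 1. The TPM of a GINAR(1) process is totally positive of order 2,
$$\mathbf{P} \in TP_2, \quad (6)$$
i.e., all the $2 \times 2$ minors of $\mathbf{P}$ are non-negative.
Theorem 2. Let $\{X_t : t \in \mathbb{N}_0\}$ and $\{X'_t : t \in \mathbb{N}_0\}$ be two independent GINAR(1) processes, with parameters $(\rho, p)$ and $(\rho, p')$, and let $\mathbf{P} = [p_{ij}]$ and $\mathbf{P}' = [p'_{ij}]$ be their corresponding TPM. Then $\mathbf{P}$ is stochastically smaller than $\mathbf{P}'$ in the usual (or in the Kalmykov order) sense,
$$\mathbf{P} \leq_{st} \mathbf{P}', \quad (7)$$
if $p \geq p'$, that is,
$$\sum_{j \geq l} p_{ij} \leq \sum_{j \geq l} p'_{mj}, \quad i \leq m, \; l \in \mathbb{N}_0,$$
in case $p \geq p'$.
In Section 3, we discuss and illustrate the impact of (6) and (7) on the run length of an upper one-sided geometric chart for monitoring GINAR(1) processes. In Section 4, we sum up our findings and briefly refer to related and future work.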
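The ordering in Theorem 2 can be probed numerically on a truncated TPM. The sketch below compares lower partial row sums, which are exact on a truncated block and equivalent to comparing upper tails. It assumes the innovations are zero with probability ρ and geometric(p) on {0, 1, 2, ...} otherwise; the helper names are hypothetical and the check is an illustration, not a proof.

```python
from math import comb

def tpm(rho, p, size):
    """Top-left size x size block of the GINAR(1) TPM under the assumptions above."""
    def eps(k):
        return (rho if k == 0 else 0.0) + (1.0 - rho) * p * (1.0 - p) ** k
    return [[sum(comb(i, m) * rho ** m * (1.0 - rho) ** (i - m) * eps(j - m)
                 for m in range(min(i, j) + 1))
             for j in range(size)]
            for i in range(size)]

def row_cdfs(matrix):
    """Cumulative sums along each row, i.e. P(X_t <= l | X_{t-1} = i)."""
    out = []
    for row in matrix:
        total, cdf = 0.0, []
        for value in row:
            total += value
            cdf.append(total)
        out.append(cdf)
    return out

def kalmykov_smaller(a, b, tol=1e-12):
    """True if row i of a is stochastically smaller than row k of b whenever i <= k."""
    fa, fb = row_cdfs(a), row_cdfs(b)
    n = len(a)
    return all(fa[i][l] >= fb[k][l] - tol
               for i in range(n) for k in range(i, n) for l in range(n))
```

For instance, `kalmykov_smaller(tpm(0.5, 0.5, 10), tpm(0.5, 0.3, 10))` compares a process with p = 0.5 against one with the smaller value p' = 0.3 and the same ρ.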
3. Practical Implications in Statistical Process Control
Time series of counts arise naturally in several applications, namely the manufacturing industry, health care, service industry, insurance, and network analysis. Using control charts for monitoring the underlying count processes is essential to swiftly detect changes in such processes and start preventive or corrective actions (see Weiß [13]). For an overview of control charts for count processes, we refer the reader to Weiß [14].
As noted by Ristić et al. [6], counts with geometric marginal distributions play a major role in several areas, for instance reliability, medicine, and precipitation modeling. These counts may refer to the number of machines waiting for maintenance, congenital malformations, or thunderstorms in a day.
In statistical process control, the GINAR(1) process can be used to model, for example, the cumulative counts of conforming items between two nonconforming items when these successive counts are no longer independent, say because the observations are generated by automated high-frequency sampling.
The literature review reveals that no charts have been proposed for monitoring GINAR(1) or GGINAR(1) counts. However, Li et al. [15] proposed a combined jumps chart, a cumulative sum (CUSUM) chart, and a combined exponentially weighted moving average (EWMA) chart for monitoring the NGINAR(1) counts. Furthermore, Li et al. [16] described upper and lower one-sided CUSUM charts for monitoring the mean of ZMGINAR(1) counts.
Let us consider that the following quality control chart is being used to detect decreases in the parameter p of the GINAR(1) process.
Definition 3. Let $\{X_t : t \in \mathbb{N}_0\}$ be a GINAR(1) process. The upper one-sided geometric chart makes use of the set of control statistics $\{X_t : t \in \mathbb{N}\}$ and triggers a signal at time t if $X_t > U$, where U is a fixed upper control limit (UCL) in $\mathbb{N}$.
We should bear in mind that the control statistic $X_t$ becomes stochastically smaller in the usual sense as p increases (see Lemma A4). Consequently, and as suggested by (Xie et al. [17] p. 42), when an observed value of $X_t$ exceeds the UCL of the chart, this should be taken as a sign that p has decreased, that is, an indication of a potential increase in the process mean $E(X_t) = (1-p)/p$.
The performance of the upper one-sided geometric chart is assessed in terms of the run length (RL), the random number of samples collected before a signal is triggered by this control chart. Consequently, the following first passage time of the stochastic process $\{X_t : t \in \mathbb{N}\}$, under the condition that $X_0 = u$, is a vital performance measure of this chart for monitoring a GINAR(1) process:
$$RL^{u} = \min\{t \in \mathbb{N} : X_t > U \mid X_0 = u\},$$
where u is a fixed initial value in the set $\{0, 1, \dots, U\}$.
U is chosen in such a way that false alarms are rather infrequent and increases in the process mean (i.e., decreases in p) are detected as quickly as possible. Hence, we should be dealing with a large in-control RL and smaller out-of-control run lengths.
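The average run length (ARL) for a given UCL can be approximated by a standard Markov chain argument: restrict the TPM to the non-signaling states {0, ..., U}, propagate the survival probabilities P(RL > m | X_0 = u), and sum them over m. The sketch below follows that route under the assumption that the innovations are zero with probability ρ and geometric(p) on {0, 1, 2, ...} otherwise. The names `sub_tpm` and `arl` are hypothetical, and this is not claimed to be the routine used by the author.

```python
from math import comb

def sub_tpm(rho, p, ucl):
    """TPM restricted to the non-signaling states 0, ..., ucl."""
    def eps(k):
        return (rho if k == 0 else 0.0) + (1.0 - rho) * p * (1.0 - p) ** k
    return [[sum(comb(i, m) * rho ** m * (1.0 - rho) ** (i - m) * eps(j - m)
                 for m in range(min(i, j) + 1))
             for j in range(ucl + 1)]
            for i in range(ucl + 1)]

def arl(rho, p, ucl, u, tol=1e-12, max_steps=1_000_000):
    """ARL = sum over m >= 0 of P(RL > m | X_0 = u), truncated once the term drops below tol."""
    q = sub_tpm(rho, p, ucl)
    surv = [1.0] * (ucl + 1)  # surv[i] = P(RL > m | X_0 = i), starting at m = 0
    total = 0.0
    for _ in range(max_steps):
        total += surv[u]
        if surv[u] < tol:
            break
        surv = [sum(a * b for a, b in zip(row, surv)) for row in q]
    return total
```

In the limiting case ρ = 0 the counts are i.i.d. geometric(p), so the ARL reduces to 1/(1-p)^(U+1); this gives a simple check of the routine before using it with ρ in (0,1).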
3.1. Significance of $\mathbf{P} \in TP_2$
By invoking the first part of Theorem 3.1 of Assaf et al. [18], we can state that the $TP_2$ character of the TPM of the GINAR(1) process leads to the following result.
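Corollary 1. Let $\{X_t : t \in \mathbb{N}_0\}$ be a GINAR(1) process. Then $RL^{0} \in PF_2$, i.e., $[P(RL^{0} = m)]^2 \geq P(RL^{0} = m-1) \, P(RL^{0} = m+1)$, for $m \in \{2, 3, \dots\}$.
Corollary 1 implies that $RL^{0}$ has an increasing hazard rate ($RL^{0} \in IHR$), that is, $\lambda_{RL^{0}}(m) = P(RL^{0} = m) / P(RL^{0} \geq m)$ is a nondecreasing function of $m \in \mathbb{N}$ (see Kijima [19] p. 118, Theorem 3.7(ii)). $RL^{0} \in IHR$ means that signaling, given that no observation has previously exceeded the UCL, becomes more likely as we proceed with the collection of observations, provided that $X_0 = 0$.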
Note, however, that
may not be IHR, for
. In fact, the second part of Theorem 3.1 of Assaf et al. [
18] allows us to state that the p.f.
is
in
l and
n, i.e.,
, for
. As a consequence,
, for
, thus we can add that
has a decreasing hazard rate
.
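The hazard rate of the run length, P(RL = m) / P(RL >= m), can be inspected numerically for any initial value u by propagating the survival probabilities through the TPM restricted to {0, ..., U}. The sketch below assumes innovations that are zero with probability ρ and geometric(p) on {0, 1, 2, ...} otherwise; `hazard_rates` is a hypothetical helper name.

```python
from math import comb

def sub_tpm(rho, p, ucl):
    """TPM restricted to the non-signaling states 0, ..., ucl."""
    def eps(k):
        return (rho if k == 0 else 0.0) + (1.0 - rho) * p * (1.0 - p) ** k
    return [[sum(comb(i, m) * rho ** m * (1.0 - rho) ** (i - m) * eps(j - m)
                 for m in range(min(i, j) + 1))
             for j in range(ucl + 1)]
            for i in range(ucl + 1)]

def hazard_rates(rho, p, ucl, u, horizon):
    """Return [lambda(1), ..., lambda(horizon)], lambda(m) = P(RL = m) / P(RL >= m), X_0 = u."""
    q = sub_tpm(rho, p, ucl)
    surv = [1.0] * (ucl + 1)  # surv[i] = P(RL > m | X_0 = i), starting at m = 0
    previous = 1.0            # P(RL >= 1 | X_0 = u)
    rates = []
    for _ in range(horizon):
        surv = [sum(a * b for a, b in zip(row, surv)) for row in q]
        # P(RL = m) / P(RL >= m) = 1 - P(RL > m) / P(RL > m - 1)
        rates.append(1.0 - surv[u] / previous)
        previous = surv[u]
    return rates
```

Plotting the returned values for different u is one way to examine how the hazard rate evolves with the sample number and with the initial value.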
The next corollary translates the stochastic influence of an increase in the initial value u and can be shown to be valid by capitalizing on (Karlin [20] pp. 42–43, Theorem 2.1).
Corollary 2. Let be a GINAR(1) process. Then, for , Let us denote the upper one-sided geometric chart with
(resp.
) by Scheme 1 (resp. Scheme 2). Then (
10) can be interpreted as follows: the odds of Scheme 1 signaling at sample
m against Scheme 2 triggering a signal at the same sample decreases as
m increases (see [
21] p. 5).
Result (10) seems quite evident; nevertheless, it would not be valid if the GINAR(1) process were not governed by a $TP_2$ TPM.
3.2. Other Comparisons of Run Lengths
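The stochastic inequality $\mathbf{P} \leq_{st} \mathbf{P}'$, for $p \geq p'$, allows us to stochastically compare two GINAR(1) processes. As a matter of fact, by invoking Lemma A4 and Theorem 6.B.32 of (Shaked and Shanthikumar [9] p. 282), we can state the next result.
Corollary 3. Let $\{X_t : t \in \mathbb{N}_0\}$ and $\{X'_t : t \in \mathbb{N}_0\}$ be two GINAR(1) processes with parameters $(\rho, p)$ and $(\rho, p')$. If $p \geq p'$ and the initial states are deterministic or random, say $X_0 \leq_{st} X'_0$, then
$$\{X_t : t \in \mathbb{N}_0\} \leq_{st} \{X'_t : t \in \mathbb{N}_0\}. \quad (11)$$
From (11) we can infer that $X_t \leq_{st} X'_t$, for $t \in \mathbb{N}_0$.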
The next lemma plays a vital role in the comparison of run lengths and is taken from (Shaked and Shanthikumar [9] p. 283).
Lemma 1. If two stochastic processes $\{X_t : t \in \mathbb{N}_0\}$ and $\{X'_t : t \in \mathbb{N}_0\}$ satisfy $\{X_t : t \in \mathbb{N}_0\} \leq_{st} \{X'_t : t \in \mathbb{N}_0\}$, then $\min\{t : X_t > U\} \geq_{st} \min\{t : X'_t > U\}$.
Lemma 1 states what could be considered obvious: if we are dealing with two stochastic processes ordered in the usual sense, the larger process exceeds the critical level U stochastically sooner, also in the usual sense.
By combining Corollary 3 and Lemma 1, we can provide a stochastic flavor to the influence of an increase in
p not only on
but also on another important RL:
which we coin as
overall run length, following (Weiß [
22] Section 20.2.2).
refers to a first passage time of the stochastic process
under the condition that the initial state coincides with the r.v.
. In point of fact, it is reasonable to resort to this performance measure because in practice we do not know
, hence it is plausible to rely, for example, on
.
Corollary 4. The following stochastic ordering results hold for the run lengths of the upper one-sided geometric chart for monitoring GINAR(1) processes:for and . Note that we could have also invoked (
14) and the closure of the usual stochastic order
under mixtures (see Shaked and Shanthikumar [
9] p. 6, Theorem 1.A.3.(d)) to prove (
15).
Results (14) and (15) mean that the upper one-sided geometric chart for the GINAR(1) process stochastically increases its detection speed (in the usual sense) as the downward shift in p becomes more extreme. This stochastic ordering result parallels the notion of a sequentially repeated uniformly powerful test.
3.3. An Illustration
Ristić et al. [
6] found that an NGINAR(1) model with estimated parameters
and
adequately described the monthly counts of sex offenses reported in the 21st police car beat in Pittsburgh. This data set comprises 144 observations, starting in January 1990 and ending in December 2001.
Note that the GINAR(1) and NGINAR(1) processes share the same geometric marginal distribution and, as far as the offense data set is concerned, the values of the Akaike information criterion (AIC) for the NGINAR(1) and GINAR(1) models are very close, namely
and
, respectively, as (Ristić et al. [
6] Table 2) attest. Hence, we are going to consider the upper one-sided geometric chart from Definition 3 with
and
for monitoring such counts.
A UCL equal to
and an initial state
(resp.
) yield an in-control ARL of
(resp.
). These and other RL-related performance measures used in this subsection are described in
Appendix A.2.
The plots of the hazard rate function in Figure 1 give additional insights into the RL performance of the geometric chart as we proceed with the sampling and into the impact of the adoption of a head start. Indeed, they illustrate two results that follow from Corollary 1:
and
. This last result suggests that the false-alarm rate conveniently decreases in the first samples when we adopt a head start
.
According to Brook and Evans [
23], the limiting form of the p.f. of the RL is geometric-like with parameter
, where
is the maximum real eigenvalue of
, regardless of the initial value
u of the control statistic
. Therefore, it comes as no surprise that the values of the hazard rate functions of
and
converge to
as suggested by
Figure 1.
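The limiting hazard rate mentioned above can be approximated without a linear algebra library by power iteration on the TPM restricted to the non-signaling states {0, ..., U}, whose entries are all positive here. In the sketch below the symbol ξ and the names `sub_tpm` and `perron_root` are notational assumptions, as is the innovation p.f. (zero with probability ρ, geometric(p) on {0, 1, 2, ...} otherwise).

```python
from math import comb

def sub_tpm(rho, p, ucl):
    """TPM restricted to the non-signaling states 0, ..., ucl."""
    def eps(k):
        return (rho if k == 0 else 0.0) + (1.0 - rho) * p * (1.0 - p) ** k
    return [[sum(comb(i, m) * rho ** m * (1.0 - rho) ** (i - m) * eps(j - m)
                 for m in range(min(i, j) + 1))
             for j in range(ucl + 1)]
            for i in range(ucl + 1)]

def perron_root(q, iterations=500):
    """Largest eigenvalue of a positive square matrix q via power iteration."""
    v = [1.0] * len(q)
    root = 0.0
    for _ in range(iterations):
        w = [sum(a * b for a, b in zip(row, v)) for row in q]
        root = max(w)              # v is scaled so that max(v) = 1
        v = [x / root for x in w]  # rescale to keep max(v) = 1
    return root

def limiting_hazard(rho, p, ucl):
    """Approximate limiting hazard rate 1 - xi of the run length."""
    return 1.0 - perron_root(sub_tpm(rho, p, ucl))
```

Because the limit depends only on the restricted TPM, the value returned by `limiting_hazard` is the same for every initial value u, in line with the convergence of the two hazard rate curves.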
Furthermore, the hazard rate function of is pointwise below the one of because Corollary 2 establishes that and this result in turn implies , that is, , for (see Definition A4).
We now illustrate the first result of Corollary 4 and also of a consequence of its second result: , for ; is an increasing function of p in the interval, .
In the left panel of
Figure 2, we plotted the survival functions of
and
.
Since
, the plot of survival function of
is pointwise below the one of
, as
Figure 2 plainly demonstrates. Hence, the number of samples taken until the detection of a
decrease in
p by the upper one-sided geometric chart is indeed stochastically smaller than the number of samples we collect until this chart emits a false alarm.
The right panel of
Figure 2 refers to the overall ARL function,
, for
. It increases with
p in this particular interval from
to
. We ought to note that it increases further when we take
, therefore the upper one-sided geometric chart cannot detect increases in
p in an expedient manner, as we have anticipated.
We wrote a program for Mathematica 10.3 (Wolfram [24]) to produce all the graphs and results in this subsection.
4. Concluding Remarks
As expertly put by Montgomery and Mastrangelo [25], the independence assumption is often violated in practice. As a consequence, we often deal with discrete-valued time series, particularly when we are dealing with very high sampling rates, as suggested by Weiß and Testik [26] and Rakitzis et al. [27].
In this paper, we considered the GINAR(1) count process, resorted to stochastic ordering to prove two features of its TPM, and discussed the implications of these two traits on RL-related performance measures of an upper one-sided geometric control chart that accounts for the autocorrelated character of such a process.
For example: the character of the TPM of the GINAR(1) process implies an IHR behaviour of the run length of that same chart; the run length and the overall run length stochastically increase in the usual sense in the interval .
These features of the GINAR(1) process and the associated results are comparable to the ones derived by (Morais [21] Section 3.2) and Morais and Pacheco [8,28].
It is important to note that the notion of stochastically monotone matrices in the usual sense was introduced by Daley [29] for real-valued discrete-time Markov chains. Moreover, Karlin [20] implicitly states that a $TP_2$ TPM possesses a monotone likelihood ratio property and, thus, virtually defines stochastically monotone Markov chains in the likelihood ratio sense. Furthermore, the comparison of counting processes and queues in the usual sense can be traced back, for instance, to Whitt [30], and the multivariate likelihood ratio order of random vectors (or $\leq_{lr}$ order) is discussed, for example, by (Shaked and Shanthikumar [9] pp. 298–305).
As far as we have investigated, the stochastic order in the likelihood ratio sense for stochastic processes or TPM has not been defined up to now. For this reason and the fact that the $\leq_{lr}$ order is not closed under mixtures (see Shaked and Shanthikumar [31] p. 33), we did not state or prove the $\leq_{lr}$ analogue of the two results in Corollary 4.
We also failed to prove that , for , because of two opposing stochastic behaviors of the summands and : the r.v. binomial (resp. ) stochastically increases (resp. decreases) with in the likelihood ratio sense. Had we proven that result, we could have concluded that the larger the upward shifts in the autocorrelation parameter, the longer it takes the upper one-sided geometric chart to detect such a change in .
It would be pertinent to investigate the stochastic properties of the RL and overall RL of lower one-sided geometric charts for detecting increases in the parameter p of a GINAR(1) process.
Another possibility for further work, which certainly deserves some consideration, is to investigate the extension of Theorems 1 and 2 to the NGINAR(1) process, the novel geometric INAR(1) process proposed by Guerrero et al. [32], or the new INAR(1) process with Poisson binomial-exponential 2 innovations studied by Zhang et al. [33], and to assess the impact of these two results on the RL performance of upper one-sided charts for monitoring such autocorrelated geometric counts.
We ought to mention that deriving results similar to (6) and (7) seems to be very unlikely for the mixed generalized Poisson INAR process [34]. This follows from the fact that the generalized Poisson distribution does not have a
p.f.