Waiting Time Problems for Patterns in a Sequence of Multi-State Trials

In this paper, we investigate waiting time problems for a finite collection of patterns in a sequence of independent multi-state trials. By constructing a finite GI/M/1-type Markov chain with a disaster and then using the matrix analytic method, we obtain the probability generating function of the waiting time. From this, we can obtain the stopping probabilities and the mean waiting time, and it also enables us to compute the waiting time distribution by numerical inversion.


Introduction
Waiting time problems for runs and patterns in a random sequence of trials are considered important, as they are of theoretical interest and have practical applications in various areas of statistics and applied probability such as reliability, sampling inspection, quality control, DNA/RNA sequence analysis, and hypothesis testing ([1]). For comprehensive surveys and applications of related waiting time problems, refer to the books of Balakrishnan and Koutras [2] and Fu and Lou [3].
Let {X_t, t ≥ 1} be a sequence of random variables taking values in a finite set A. A finite sequence of elements of A is called a pattern. We consider a finite collection C = {C_1, C_2, . . . , C_K} of patterns, possibly of different lengths. For i = 1, . . . , K, let τ_{C_i} be the waiting time until the first occurrence of pattern C_i as a run in the series X_1, X_2, . . .. Let W be the waiting time until one of the K patterns appears, i.e., W = min{τ_{C_1}, . . . , τ_{C_K}}.
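Before developing the analytical machinery, the sooner waiting time W can be estimated by direct simulation; the following sketch uses an illustrative two-pattern collection and alphabet of our own choosing (not one taken from this paper) simply to make the definitions concrete.

```python
import random

def sooner_waiting_time(patterns, probs, rng):
    """Simulate i.i.d. trials X_1, X_2, ... until some pattern in `patterns`
    occurs as a run; return (W, index of the pattern that occurred first)."""
    symbols = list(probs)
    weights = [probs[s] for s in symbols]
    window_len = max(len(p) for p in patterns)
    history = ""
    for t in range(1, 10**6):
        # keep only the last window_len symbols; that suffices to detect occurrences
        history = (history + rng.choices(symbols, weights=weights)[0])[-window_len:]
        for j, pat in enumerate(patterns):
            if history.endswith(pat):
                return t, j
    raise RuntimeError("no pattern occurred within the simulation horizon")

# Illustrative collection C = {ab, ba} over A = {a, b} with p_a = p_b = 1/2;
# here W = 1 + (a geometric number of trials until the symbol changes), so E[W] = 3.
rng = random.Random(2024)
samples = [sooner_waiting_time(["ab", "ba"], {"a": 0.5, "b": 0.5}, rng)
           for _ in range(20000)]
mean_W = sum(t for t, _ in samples) / len(samples)
```

Such a simulation is also a convenient sanity check for the analytical results derived later.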
Many researchers have studied waiting time problems for general and specific choices of C in a random sequence of trials. When {X_t, t ≥ 1} is a sequence of independent and identically distributed (i.i.d.) Bernoulli trials, Fu and Koutras [4] developed a finite Markov chain embedding method, which was first employed by Fu [5], to study the exact distributions for the numbers of specified runs and patterns. Fu [6] extended the finite Markov chain embedding method to study the exact distributions for the numbers of runs and patterns in a sequence of i.i.d. multi-state trials. In addition, he obtained the waiting time distribution of a specified pattern.
In this paper, we are mainly interested in computing the waiting time distribution, as well as the stopping probabilities P(W = τ_{C_j}), j = 1, . . . , K. Li [7], Gerber and Li [8], Guibas and Odlyzko [9], Blom and Thorburn [10] and Antzoulakos [11] considered the case when {X_t, t ≥ 1} is a sequence of i.i.d. multi-state trials. Li [7] and Gerber and Li [8] used the martingale approach to obtain the mean waiting time E[W] and the stopping probabilities P(W = τ_{C_i}), i = 1, . . . , K, for a finite collection C of patterns. Guibas and Odlyzko [9] used the combinatorial method to obtain the probability generating function of the waiting time. Blom and Thorburn [10] also used the combinatorial method to obtain the mean waiting time E[W] and the stopping probabilities P(W = τ_{C_i}), i = 1, . . . , K, for a finite collection C of patterns with the same length. Antzoulakos [11] used the finite Markov chain embedding method to study waiting time problems for a single pattern as well as a finite collection C of patterns.
Han and Hirano [12], Fu and Chang [13], Glaz et al. [14], Pozdnyakov [15], Gava and Salotti [16], Zhao et al. [17] and Kerimov and Öner [18] considered the case when {X_t, t ≥ 1} is a discrete time homogeneous Markov chain with a finite state space, i.e., {X_t, t ≥ 1} is a sequence of Markov dependent multi-state trials. Han and Hirano [12] studied waiting time problems for two different patterns. Fu and Chang [13] studied waiting time problems for a finite collection of patterns by using the finite Markov chain embedding method. Glaz et al. [14] obtained the mean waiting time E[W] and the probability generating function of the waiting time for a finite collection of patterns in a two-state Markov chain by using the method of gambling teams and the martingale approach. Pozdnyakov [15] investigated the same problems as in Glaz et al. [14] for multi-state Markovian trials. Gava and Salotti [16] obtained a system of linear equations for the stopping probabilities P(W = τ_{C_i}), i = 1, . . . , K, by using the methods developed for gambling teams in [14,15]. Recently, Zhao et al. [17] found a method, based on the method of [9], to calculate E[W] and P(W = τ_{C_i}), i = 1, . . . , K. Even more recently, Kerimov and Öner [18] found oscillation properties of the expected stopping times and stopping probabilities for patterns consisting of two consecutive states. For useful reviews of different approaches to waiting time problems of patterns for both i.i.d. and Markov dependent trials, refer to Fu and Lou [3].
Antzoulakos [11] and Fu and Chang [13] obtained the probability generating function of the waiting time for a finite collection of patterns in a sequence of i.i.d. and Markov dependent multi-state trials, respectively. They used a Markov chain with absorbing states corresponding to the patterns and considered the waiting time as the first entrance time into an absorbing state. The Markov chain has a transition probability matrix P of the form

P = [ P_TT  P_TA ]
    [  O     I   ]                                                        (1)

where P_TT is the submatrix of P whose entries are transition probabilities from a transient state to a transient state, P_TA is the submatrix of P whose entries are transition probabilities from a transient state to an absorbing state, O is the zero matrix and I is the identity matrix. By using the general formula for the probability generating function of the first entrance time into an absorbing state, they obtained the probability generating function of the waiting time. Their results are expressed in terms of the submatrices P_TT and P_TA, as well as variants of them. Chang [19] also studied waiting time problems for a finite collection of patterns. He investigated the distribution of the waiting time until the rth occurrence of any pattern in the collection, also using the expression (1) for the analysis.
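For reference, the general formula alluded to above can be written in standard absorbing-chain notation (a textbook identity, with ξ the initial distribution over transient states and e_a the indicator vector of absorbing state a; the notation is ours, not quoted from [11] or [13]):

```latex
% PGF of the first entrance time T into the absorbing set, for a chain with
% transition matrix P = \begin{pmatrix} P_{TT} & P_{TA} \\ O & I \end{pmatrix}:
\mathbb{E}\bigl[z^{T}\bigr] = \xi\,(I - zP_{TT})^{-1}\,zP_{TA}\,\mathbf{1},
\qquad
\mathbb{E}\bigl[z^{T}\,\mathbf{1}\{X_{T}=a\}\bigr] = \xi\,(I - zP_{TT})^{-1}\,zP_{TA}\,e_{a}.
```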
In this paper, we consider a sequence of i.i.d. multi-state trials. We also use a Markov chain with a transition probability matrix of the form (1). However, we closely investigate the structure of the submatrices P_TT and P_TA. This enables us to construct a finite GI/M/1-type Markov chain with a disaster and to regard the waiting time as the time until the occurrence of the disaster. Based on this and the matrix analytic method, we obtain the probability generating function of the waiting time W on {W = τ_{C_j}}, j = 1, . . . , K. From this, we can obtain the stopping probabilities P(W = τ_{C_i}), i = 1, . . . , K, as well as the conditional/unconditional mean waiting times E[W|W = τ_{C_j}] and E[W], and it also enables us to compute the waiting time distribution by numerical inversion. The benefit of our method is that it remains efficient even when the lengths of the patterns are large. Our method can also be extended to Markov dependent multi-state trials.
The paper is organized as follows. In Section 2, we formulate our waiting time problems. In Section 3, we construct a GI/M/1-type Markov chain with a disaster. From this we can obtain our results, which are given in Section 4. In Section 5, numerical examples are presented to illustrate our results. Conclusions are given in Section 6.

Problem Formulation
Let {X_t, t ≥ 1} be a sequence of i.i.d. trials taking values in a finite set A. Assume that P(X_t = x) = p_x, x ∈ A, for t = 1, 2, . . ., where ∑_{x∈A} p_x = 1. For a finite collection C = {C_1, C_2, . . . , C_K} of patterns, suppose that pattern C_i is of the form C_i = s^i_1 s^i_2 · · · s^i_{l_i}, where s^i_j ∈ A, j = 1, . . . , l_i, i.e., C_i is a pattern of length l_i. Here, l_i, i = 1, . . . , K, are fixed positive integers with l_1 ≥ l_2 ≥ · · · ≥ l_K. Recall that W is the waiting time until one of the K patterns appears, i.e., W = min{τ_{C_1}, . . . , τ_{C_K}}. We will call W the sooner waiting time.
Our main interest is to derive the probability generating function of the sooner waiting time W on {W = τ_{C_j}}, j = 1, . . . , K, i.e.,

E[z^W 1{W = τ_{C_j}}], j = 1, . . . , K.                                  (2)

From this, we can obtain the stopping probabilities, the conditional/unconditional probability mass functions of W, and the conditional/unconditional means of W as follows:

• The conditional probability mass function of W given W = τ_{C_j}, i.e., P(W = n|W = τ_{C_j}), j = 1, . . . , K, can be computed, by numerical inversion, from the conditional probability generating function of W given W = τ_{C_j}.
• The probability mass function of W, P(W = n), can be computed from the marginal probability generating function of W.
• The unconditional mean of W, E[W], can be obtained by differentiating the probability generating function at z = 1.
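In symbols, writing g_j(z) for the generating function on {W = τ_{C_j}} (notation ours), the quantities above follow directly:

```latex
g_j(z) = \mathbb{E}\bigl[z^{W}\,\mathbf{1}\{W=\tau_{C_j}\}\bigr], \qquad
P(W=\tau_{C_j}) = g_j(1), \qquad
\mathbb{E}\bigl[z^{W}\mid W=\tau_{C_j}\bigr] = \frac{g_j(z)}{g_j(1)},
\qquad
\mathbb{E}\bigl[z^{W}\bigr] = \sum_{j=1}^{K} g_j(z), \qquad
\mathbb{E}[W] = \sum_{j=1}^{K} g_j'(1).
```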

GI/M/1-Type Markov Chain with a Disaster
In this section, we construct a finite GI/M/1-type Markov chain with a disaster to obtain an expression for (2). We define the following three terms:
• s_1 · · · s_j is a subpattern of pattern C_i if s_1 · · · s_j = s^i_k s^i_{k+1} · · · s^i_{k+j−1} for some k with 1 ≤ k ≤ l_i − j + 1; when j = 0, s_1 · · · s_j means the null pattern (i.e., the pattern with length 0).
• s_1 · · · s_j is a proper subpattern of C_i if it is a subpattern of C_i and j < l_i.
• s_1 · · · s_j is a leading subpattern of C_i if s_1 · · · s_j = s^i_1 s^i_2 · · · s^i_j; it is a proper leading subpattern if, in addition, j < l_i.
Assume that, for i ≠ j, C_i is not a proper subpattern of C_j.
We now introduce a two-dimensional process (N_t, J_t), t = 0, 1, 2, . . ., where N_t and J_t are defined as follows: (i) N_0 = 0, and for t = 1, 2, . . .; (ii) J_0 = 1, and for t = 1, 2, . . .. To clarify the definitions of N_t and J_t, we provide two examples. If we consider the sequence of trials bcaababbaaabbbbc · · · , then (N_t, J_t), t = 0, 1, 2, . . . are given in Table 1. If we consider the sequence of trials ababaccaacaabaa · · · , then (N_t, J_t), t = 0, 1, 2, . . . are given in Table 2. Note that {(N_t, J_t), t = 0, 1, . . .} is a discrete time Markov chain. Table 1. Sample paths of N_t and J_t corresponding to the sample path of X_t, bcaababbaaabbbbc · · · . Table 2. Sample paths of N_t and J_t corresponding to the sample path of X_t, ababaccaacaabaa · · · .
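One natural reading of the level process, consistent with the construction above (an assumption on our part, since the displayed definitions did not survive extraction): N_t is the length of the longest suffix of X_1 · · · X_t that is a proper leading subpattern of some pattern in C, and the level ∆ is entered as soon as some pattern occurs as a suffix. A minimal sketch under that assumption (function name and representation are ours):

```python
def levels(patterns, seq):
    """Level process N_t: the length of the longest suffix of X_1 ... X_t that
    is a proper leading subpattern of some pattern in C; '∆' once some pattern
    occurs as a suffix (the disaster level)."""
    # all proper leading subpatterns (nonempty strict prefixes) of the patterns
    prefixes = {p[:j] for p in patterns for j in range(1, len(p))}
    out = []
    for t in range(1, len(seq) + 1):
        prefix = seq[:t]
        if any(prefix.endswith(p) for p in patterns):
            out.append("∆")   # a pattern has just been completed
            break
        out.append(max((j for j in range(1, t + 1) if prefix[-j:] in prefixes),
                       default=0))
    return out

levels(["aaabbb", "aaba", "abc"], "bcaab")   # [0, 0, 1, 2, 3]
```

Note that under this reading the level can increase by at most one per trial, which is exactly the skip-free-upward structure of a GI/M/1-type chain.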
Define m_0 = 1 and, for k = 1, 2, . . . , l_1 − 1, let m_k be the number of patterns in C whose lengths are larger than k, i.e., m_k = |{i ∈ {1, . . . , K} : l_i > k}|. Furthermore, the set of all possible values of J_t when N_t = k ∈ {0, 1, . . . , l_1 − 1, ∆} yields the state space E in (3). For each state (k, i), the first component k is called the level. The one-step transition probability matrix P of {(N_t, J_t), t = 0, 1, . . .} is given in (4), in lexicographic order with ∆ being the last element in the set of levels, where the submatrices are described below. A matrix consisting of (i, j) components with i ∈ I and j ∈ J will be called an I × J matrix.
For k̄ = 0, 1, . . . , k + 1, (P_{k k̄})_{ij} = p_{s^j_{k̄}} if the following three conditions hold: (i) s^j_1 · · · s^j_{k̄} is a proper leading subpattern of pattern C_j; (ii) s^j_1 · · · s^j_{k̄} is not a proper leading subpattern of pattern C_{j′} for j′ ∈ {1, . . . , j − 1}; (iii) s^i_n s^i_{n+1} · · · s^i_k s^j_{k̄} is not a leading subpattern of a pattern in C for n ∈ {1, 2, . . . , k − k̄ + 1}.
Otherwise, (P_{k k̄})_{ij} = 0. To make it easier to understand how the matrix P in (4) is constructed, we explain with an example. For the previously described example with A = {a, b, c}, C_1 = aaabbb, C_2 = aaba and C_3 = abc, the matrix P has the block structure

        P_00 P_01  O    O    O    O   P_0∆
        P_10 P_11 P_12  O    O    O   P_1∆
        P_20 P_21 P_22 P_23  O    O   P_2∆
P  =    P_30 P_31 P_32 P_33 P_34  O   P_3∆
        P_40 P_41 P_42 P_43 P_44 P_45 P_4∆
        P_50 P_51 P_52 P_53 P_54  O   P_5∆
         O    O    O    O    O    O    I

Let {(N_t, J_t), t = 0, 1, . . .} be a two-dimensional discrete time Markov chain with the same state space E as that given in (3) and the same transition probability matrix P as that given in (4), but with an arbitrary initial state. Note that {(N_t, J_t), t = 0, 1, . . .} is a finite GI/M/1-type Markov chain with a disaster. This disaster occurs when N_t reaches ∆.
Equation (7) can be interpreted as follows: Starting from level n, the Markov chain may visit level n + 1 (while avoiding level ∆) in two ways: it may move up to level n + 1 at the very next transition (contributing the factor zP_{n,n+1}), or it may move to level k (0 ≤ k ≤ n) at the first transition, move up from level k to level k + 1, then from level k + 1 to level k + 2, and so on, until finally moving from level n to level n + 1 (contributing the factor z ∑_{k=0}^{n} P_{nk} G_k(z)G_{k+1}(z) · · · G_n(z)). From (7), we obtain

G_n(z) = ( I_n − z ∑_{k=0}^{n} P_{nk} G_k(z)G_{k+1}(z) · · · G_{n−1}(z) )^{−1} z P_{n,n+1},

where I_n is the I_n × I_n identity matrix and the product G_k(z) · · · G_{n−1}(z) is interpreted as the identity when k = n. For n = 0, 1, . . . , l_1 − 1, we define (H_n(z))_{ij} so that (H_n(z))_{ij} (n = 0, 1, . . . , l_1 − 2) is the probability generating function for the time of the first visit to state (∆, j), starting from state (n, i), before the first visit to level n + 1, and (H_{l_1−1}(z))_{ij} is the probability generating function for the time of the first visit to state (∆, j), starting from state (l_1 − 1, i). Let H_n(z) be the matrix whose (i, j)-component is (H_n(z))_{ij}. Conditioning on the first transition, we have

H_n(z) = z ( P_{n∆} + ∑_{k=0}^{n} P_{nk} H_{k:n+1}(z) ),   0 ≤ n ≤ l_1 − 1,

where the (i, j)-component of H_{k:n+1}(z) is the probability generating function for the time of the first visit to state (∆, j), starting from state (k, i), before the first visit to level n + 1. Solving this system, together with (10) and (11), yields explicit expressions for H_n(z), and hence for the probability generating function of the sooner waiting time. In summary, we obtain the following theorem.
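The upward first-passage recursion can be evaluated numerically level by level. The sketch below is ours (the block layout, function name, and the scalar example blocks are illustrative, not data from the paper); it solves for G_n(z) at a fixed z for a generic skip-free-upward chain with an implicit killing ("disaster") level.

```python
import numpy as np

def first_passage_up(P_blocks, z):
    """P_blocks[n] = [P_{n,0}, ..., P_{n,n}, P_{n,n+1}]: transition blocks from
    level n; any missing mass at level n is killing (the disaster).  Returns the
    matrices G_n(z), computed level by level from
        G_n(z) = (I - z * sum_{k<=n} P_{nk} G_k(z)...G_{n-1}(z))^{-1} z P_{n,n+1},
    where the product G_k...G_{n-1} is the identity when k = n."""
    G = []
    for n, row in enumerate(P_blocks):
        size = row[n].shape[0]
        B = np.zeros((size, size))
        for k in range(n + 1):
            prod = row[k]
            for m in range(k, n):        # empty product when k == n
                prod = prod @ G[m]
            B = B + prod
        G.append(np.linalg.solve(np.eye(size) - z * B, z * row[n + 1]))
    return G

# Scalar sanity check: P_00 = 0.5, P_01 = 0.3 (the remaining 0.2 is the disaster).
# At z = 1, G_0 is the probability of reaching level 1 before the disaster:
# g = 0.3 + 0.5 g, so g = 0.6.
G = first_passage_up([[np.array([[0.5]]), np.array([[0.3]])]], z=1.0)
```

At z = 1 the recursion reduces to a first-passage probability calculation, which makes small hand-checked examples easy to verify.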
From Theorem 1, we can obtain the following results.

Corollary 1.
(i) The stopping probabilities P(W = τ_{C_j}), j = 1, . . . , K, are given by (13).
(ii) The conditional probability generating functions of W, given W = τ_{C_j}, j = 1, . . . , K, are given by (14).
(iii) The marginal probability generating function of W is given by (15).

Remark. As mentioned in Section 2, the conditional probability mass functions P(W = n|W = τ_{C_j}), j = 1, . . . , K, can be computed from (14) by numerical inversion. In addition, the probability mass function P(W = n) can be computed from (15) by numerical inversion. For the numerical inversion of probability generating functions, refer to Abate and Whitt [23]. By Theorem 1, we can also obtain the conditional/unconditional means of the sooner waiting time. To this end, we introduce auxiliary matrices indexed by k = 0, 1, . . . , l_1 − 1.
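To illustrate the numerical inversion step (in the spirit of Abate and Whitt [23], though the code below is a generic FFT-based coefficient extraction of our own, not an implementation from that reference), a probability mass function can be recovered from its generating function by sampling the pgf on the unit circle:

```python
import numpy as np

def invert_pgf(g, N=4096):
    """Recover p_0, ..., p_{N-1} from a pgf g(z) = sum_n p_n z^n by an inverse
    DFT on the unit circle; accurate when the tail mass beyond N is negligible."""
    z = np.exp(2j * np.pi * np.arange(N) / N)
    return np.fft.fft(g(z)).real / N

# Sanity check on a geometric waiting time: g(z) = pz / (1 - qz), p_n = p q^(n-1).
p, q = 0.3, 0.7
pmf = invert_pgf(lambda z: p * z / (1 - q * z))
```

The same routine, applied to the (conditional) probability generating functions of W, produces the waiting time distributions reported in Section 5.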

Numerical Examples
In this section, we present numerical results for the computations of the stopping probabilities, the probability mass functions (along with the tail probabilities) of the sooner waiting time, and the conditional/unconditional means of the sooner waiting time. To illustrate our results, we provide two examples. In the first example, suppose that K = 10, i.e., the collection C consists of 10 patterns, C = {C_1, . . . , C_10}. We select the collection of patterns {C_1, . . . , C_10} as shown in Table 3, where the lengths of the patterns, l_1, . . . , l_10, are chosen from the order statistics of i.i.d. random variables with mean 5. The set of patterns given in Table 3 is an example of a randomly selected pattern set in which no pattern is a subpattern of another. The procedure for randomly selecting such a pattern set is omitted here. Table 3. The patterns used in Example 1.
In Figure 1, we plot the joint probabilities P(W ≥ n, W = τ_{C_j}), j = 1, 4, 7, 10, with n varying. These can be computed by the numerical inversion of their generating functions. The stopping probabilities P(W = τ_{C_j}) for the larger indices are

j	P(W = τ_{C_j})
6	1.3893 × 10^{−1}
7	9.3671 × 10^{−2}
8	1.2852 × 10^{−1}
9	1.5249 × 10^{−1}
10	4.2152 × 10^{−1}

In Table 5, we present the probability mass function of W,

P(W = n) = ∑_{j=1}^{10} P(W = n, W = τ_{C_j}),

and the tail probability of W,

P(W ≥ n) = ∑_{j=1}^{10} P(W ≥ n, W = τ_{C_j}),

with n varying. Here, P(W = n, W = τ_{C_j}) can be computed by the numerical inversion of its generating function. By Theorem 2, we can compute the conditional means of the sooner waiting time W, E[W|W = τ_{C_j}], j = 1, . . . , K, and the unconditional mean of W, E[W]. Table 6 shows the conditional and unconditional mean waiting times for Example 1.
In the second example, suppose that the collection C consists of 5 patterns, C = {C_1, . . . , C_5}, where C_1 = 1111111111111111, . . .. For Example 2, the joint probabilities P(W ≥ n, W = τ_{C_j}), j = 1, . . . , 5, are shown in Figure 2. Also, the stopping probabilities P(W = τ_{C_j}), j = 1, . . . , 5, the probability mass function of W (along with the tail probability) and the conditional/unconditional means of W are shown in Tables 7-9. Table 8. The probability mass function P(W = n) and tail probability P(W ≥ n) for Example 2.

Conclusions
We have derived the probability generating function of the sooner waiting time for a finite collection of patterns in a sequence of i.i.d. multi-state trials. From this probability generating function we have obtained the stopping probabilities and the mean waiting time, and it has also enabled us to compute the waiting time distribution by numerical inversion. As mentioned in the introduction, our method can be extended to Markov dependent multi-state trials.
For further research, we will investigate the tail asymptotics of the sooner waiting time W. From Figures 1 and 2, we can expect that the distribution of W has a geometric tail. This is true, under a certain aperiodicity condition, because W is the first passage time to a subset of the state space in a discrete time Markov chain with a finite state space. Under such an assumption, the distribution of W exhibits geometric tail behavior, i.e., P(W ≥ n) ∼ cσ^n as n → ∞ for some c > 0 and σ ∈ (0, 1), where "∼" means that the ratio of the two sides tends to 1. It would be of interest to find explicit expressions for c and σ. We also have the corresponding geometric tail behavior P(W ≥ n, W = τ_{C_i}) ∼ c_i σ^n as n → ∞ for some c_i > 0 and σ ∈ (0, 1), where σ is independent of i and is the same as that described above. It would also be of interest to find explicit expressions for c_i, i = 1, . . . , K.
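The anticipated decay rate σ is the spectral radius of the transient block P_TT in (1). A tiny single-pattern illustration (the pattern, alphabet, and probabilities are our own choices, not an example from the paper): for the pattern 11 in fair Bernoulli trials, the transient states are the progress levels 0 and 1, and σ = (1 + √5)/4 ≈ 0.809, the familiar golden-ratio tail of the waiting time for two consecutive successes.

```python
import numpy as np

# Transient block P_TT for waiting for the pattern "11" with P(1) = P(0) = 1/2;
# the state is the current progress (longest suffix matching a prefix of "11").
P_TT = np.array([[0.5, 0.5],    # from progress 0: symbol 0 stays, symbol 1 advances
                 [0.5, 0.0]])   # from progress 1: symbol 0 resets, symbol 1 absorbs
sigma = max(abs(np.linalg.eigvals(P_TT)))   # geometric decay rate of P(W >= n)
```

Here P(W > n) = F_{n+2}/2^n (with F_n the Fibonacci numbers), whose ratio indeed tends to (1 + √5)/4.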