Delay in a 2-State Discrete-Time Queue with Stochastic State-Period Lengths and State-Dependent Server Availability and Arrivals

: In this paper, we consider a discrete-time multiserver queueing system with correlation in the arrival process and in the server availability. Speciﬁcally, we are interested in the delay characteristics. The system is assumed to be in one of two different system states, and each state is characterized by its own distributions for the number of arrivals and the number of available servers in a slot. Within a state, these numbers are independent and identically distributed random variables. State changes can only occur at slot boundaries and mark the beginnings and ends of state periods. Each state has its own distribution for its period lengths, expressed in the number of slots. The stochastic process that describes the state changes introduces correlation to the system, e.g., long periods with low arrival intensity can be alternated by short periods with high arrival intensity. Using probability generating functions and the theory of the dominant singularity, we ﬁnd the tail probabilities of the delay.


Introduction
When, in the early 20th century, the Danish mathematician Agner Erlang used a mathematical model to describe a telephone switch (at the time, an office where workers manually connected phone lines), he became the founder of the field of queueing theory. More than a hundred years later, queueing theory is used in a broad range of practical problems such as in traffic engineering and industrial engineering, but telecommunications remains one of the main fields of application [1][2][3].
Multiserver queueing models have received considerable attention in the past. Most papers, however, focus on the analysis of the queue length characteristics only. Results are available for queueing systems with a constant number of available servers [4][5][6] as well as for queueing systems with a variable number of available servers [7][8][9].
When it comes to delay analysis, multiserver queueing systems are notoriously hard to analyze, but some results do exist in the literature. In [10][11][12], discrete-time queueing systems are treated where the number of available servers is constant, while in [13], the number of available servers in a slot is a stochastic variable which is independent and identically distributed (i.i.d.) from slot to slot. In [14], a continuous-time queueing system is treated where N servers are subject to breakdowns and repair; the results are the distributions of the queue length and waiting times. The authors of [15] obtain the queue length distribution and the delay distribution for a discrete-time model with general service demands and correlated service capacities. In their model, the service time of a customer depends on both its service demand and the service capacity.
In this paper, we consider a discrete-time multiserver queueing system that is special in the way that it allows correlation in both the slot-to-slot server availability and the arrival Table 1. Brief summary of the setting and delay results of relevant papers in comparison with the current paper.

Paper Servers Arrivals Delay Results
Chaudhry et al. [10] constant general (bulk), i.i.d. full delay distribution Bruneel et al. [11] constant general, i.i.d. tail probabilities Gao et al. [12] constant general, i.i.d. tail probabilities Laevens et al. [13] i.i.d. general, i.i.d. tail probabilities Neuts et al. [14] i.i.d. general, i.i.d. full delay distribution De Muynck et al. [15] correlated general, i.i.d. full delay distribution The current paper correlated correlated tail probabilites The considered queueing system can be in two different system states. For every slot, a stochastic number of servers is available, and a stochastic number of new customers arrive. These are both i.i.d. within each state, but these distributions can be different for each state. The system resides for a stochastic number of slots in the same state before state transitions occur, which can only happen at slot boundaries. The sojourn times follow a distribution which is also dependent on the system state.
The above setup can be used to model a wide variety of queueing systems, such as queues with bursty arrivals (where long periods of low/no arrival intensity are alternated by short periods of high arrival intensity) and queues with cyclical service capacities (where alternately few and many servers are available for fixed period lengths).
The potential applications can be found in many fields. In [17,18], continuous time models are used to calculate the delay in a system with time-varying arrival intensity and service capacity. The envisaged application is a hospital emergency department, but the model can also be used in other applications. In [19], time-dependent arrival rates are called a key feature of many real life service systems and it is, therefore, included in the Erlang loss model which is presented. In [20], an overview of time-varying queueing models is presented, from a telecommunications point of view. A possible application of the method of this paper is a processor sharing system with speed scaling [21], where the processing speed is adapted to the (expected) load. Another possible application is the domain of mobile ad hoc networks [22]. In such networks, nodes (which can move freely) must cooperate to send and receive packages. The number of available nodes as well as the traffic varies over time.
The outline of the paper is as follows. In Section 2, we give the mathematical outline of the queueing model. In Section 3, we repeat some key results regarding the system content from our earlier paper [16]. In Section 4, we condition the delay of a customer on the state of the arrival slot and on the number of customers present in the queue at the moment of arrival. Section 5 then describes how the delay of an arbitrary customer can be obtained. In Section 6, we use the theory of the dominant singularity to obtain the tail characteristics of the delay. The numerical examples in Section 7 illustrate the model and Section 8 concludes the paper.

Queueing Model
We consider the discrete-time multiserver queueing model of [16] with correlation over time in both the arrival process and the server availability. In this discrete-time model, the time horizon is divided into slots of equal length and, during a slot, the system can be either in state 1 or state 2. Note that, in general, in our paper, we will always use i to indicate a system state and not repeat that it can only take values 1 and 2. Furthermore, we will useī to refer to the other state:ī Therefore, put simply, there are two types of slots, which we call state-1-slots and state-2-slots, and during a state-i-slot the queueing system is in system state i. State changes can only occur at slot boundaries and mark the beginnings and ends of state-1-periods and state-2-periods. If we denote with r i,k the length (expressed in the number of slots) of the kth state-i-period, then the series {r i,k } form two sets of i.i.d. stochastic variables. Their distribution is given by r i (n) Prob[state-i-period has n slots] , n ≥ 1 ; (2) where we have introduced the probability generating functions (pgfs) R i (z). We limit the R i (z) to be rational functions of their argument: with A i r (z) and B i r (z) mutually prime polynomials of degree m i Ar and m i Br , respectively, and with A i r (1) = B i r (1) = 1. We define m i r as: The probability that an arbitrary slot belongs to a given state is equal to the fraction of time the system is in that state and is given by The special feature of our model is that the server availability and the arrival process both depend on the system state. Specifically, the distribution of the number of available servers during a slot depends on the system state during that slot in the following way: s i (n) Prob[n servers available during a state-i-slot] , n ≥ 1 ; (7) During every slot, there is at least one server available and within a given state-iperiod the numbers of available servers during the different slots are i.i.d. from slot to slot. Similarly to the R i (z), we also limit the S i (z) to be rational functions of their argument: with A i s (z) and B i s (z) mutually prime polynomials of degree m i As and m i Bs , respectively, and with A i s (1) = B i s (1) = 1. We define m i s as: The numbers of arrivals during the different slots of a given state-i-period are i.i.d. from slot to slot as well. Their common distribution is characterized by c i (n) Prob[state-i-slot has n arrivals] , n ≥ 0 ; (11) The average arrival intensity is then given by Customers can only start service at slot boundaries, so an arriving customer can only be taken into service at the beginning of the next slot, even if a server is idle at the moment of arrival. The queue capacity is assumed to be infinite, so an arriving customer will always join the system. Each customer requires exactly one slot of service. We assume the system reaches steady state and, therefore, the average number of customers entering the system should be strictly smaller than the average number of available servers [23], leading to the following stability condition: Furthermore, it will prove useful to introduce the following notation:

System Content
In an earlier work [16], we analyzed the system content for a queueing model as described above. In the current section, we repeat the main results that are necessary for the delay analysis in this paper. Let us denote with the stochastic variable g i k (k ≥ 0), the total number of customers in the system at the beginning of the (k + 1)st slot of a state-i-period. The corresponding pgf is G i k (z). We can derive the following recursive equation, valid for k ≥ 1: We can obtain a set of two linear equations for the functions G i 0 (z) by recursive application of (16) and by stating that the system content at the beginning of a state-iperiod equals the system content at the end of a state-ī-period. This set of equations can then be solved to yield the following expression: with the bivariate functions Q i (x, z) unknown. It can be proven that if R i (z) and S i (z) are rational functions of their argument, then also the Q i (x, z) are rational and of the following form: The (finite number of) unknowns i nj can be determined by relying on the properties of pgfs, namely that they are normalized and that they are analytical within the complex unit disk. To do this, the roots within the complex unit disk of the denominator of (17) need to be determined and a set of linear equations needs to be solved. For many common choices of the distributions in the model, this does not require large computational effort. Similarly, it is possible to obtain the pgfs of the system content at the beginning of an arbitrary state-islot and at the beginning of an arbitrary slot. Based on these pgfs, important performance metrics can be calculated, such as the expected number of customers in the system.

Delay of a Customer with K Customers Ahead
For the system content, the specific queueing discipline is not of importance, as long as it is work conserving. However, for the delay analysis it needs to be specified. We will assume a First Come First Served (FCFS) policy. We do not specify the exact arrival instant of a customer within a slot and, therefore, define its delay as the time interval from the first slot boundary after the customer's arrival until the end of its service slot. This definition is illustrated in Figure 1. The delay thus consists of an integer number of slots and is at least one slot long. This setup is also referred to as a Late Arrival System with Delayed Access (LAS-DA) [24]. Let us denote with the stochastic variable d i,n k the delay of a customer that arrives during a state-i-slot, with n more slots until the next state-ī-slot and with k customers waiting in the queue with priority over the arriving customer (excluding the customer(s) currently in service, if any). The corresponding pgf is D i,n k (z). Clearly, if during the slot after the considered customer's arrival slot more than k servers are available, the customer's delay will consist of exactly one slot. Otherwise, if during the first delay slot only l ≤ k servers are available, there will be an additional number of delay slots that corresponds to the delay of a customer with k − l customers ahead. Based on these observations, we can state Furthermore, we use the stochastic variable d i k for the delay of a customer arriving during the first slot of a state-i-period and with k customers in the queue with priority over the arriving customer. The corresponding pgf D i k (z) can be expressed as Let us now introduce two auxiliary functions Using the Relations (19) and (20), we obtain After recursive application this leads to Multiplying (26) with r i (n + 1) and summing over all n ≥ 0 we get Using (21) we can also work out an expression for D i,0 (x, z): Combining (27) and (28) we then obtain the following set of equations: which can be solved to find an explicit expression for D i (x, z): Bearing in mind that all pgfs are normalized, we can furthermore verify that . In order to rewrite (30) as a rational function of x, we divide numerator and denominator by those common factors. To remove the remaining poles in x ing i (x, z), we multiply with the auxiliary function u(x, z), which in view of (4) and (9) is defined as Therefore, we obtain with The new denominator g(x, z) of D i (x, z) is a polynomial in x of degree M m i s m i r + m¯ı s m¯ı r and the numerator is of degree M − 1. We can do a partial fraction expansion of D i (x, z) based on its poles x φ in x, which we assume distinct. Note that the x φ (φ = 1, . . . , M) are functions of z, but for notational simplicity the argument is omitted. We can then write D i (x, z) as with The above allows us to easily obtain an expression for D i k (z), which is the pgf of the delay of a customer arriving during the first slot of a state-i-period with k customers waiting in the queue with priority over the considered customer:

Delay of an Arbitrary Customer
Let us now consider the arbitrary customer P, arriving during slot S. We denote the probability that S is a state-i-slot with h i and that it is the lth slot of a state-i-period of in total l + n slots as h i l|n : The pgf F i (z) of the number of customers that arrive during slot S but before customer P and given that S is a state-i-slot is known to be given by (see, e.g., [1]) The queue content as experienced by P upon arrival, are the customers with priority over P that can start service after S. It thus consists of the customers that were present in the system at the beginning of S, minus those that receive service during S and plus those that arrived during S but before P. It is a stochastic variable that depends on the state of S and on the time since the last state change. Given that S is the lth slot of a state-i-period, we will denote this stochastic variable as t i l (l ≥ 1). Its pgf T i l (z) can be obtained by translating the above observations into the z-domain: We will denote the inverse z-transform of the above pgf as t i l (k). We can now develop the pgf W(z) of the delay of an arbitrary customer The functions R i (z) are assumed to be rational, if we further assume that they only have poles of multiplicity 1, we can write them in the form Note that the summations in the above formula do not necessarily both appear. In the remainder of this paper, we will assume that both summations are present, the results can be easily modified for the other cases. The corresponding probability mass function (pmf) r i (n) can be written as We substitute (46) into (44) and rearrange the summations to get: We will look at the above expression in more detail in two steps and, therefore, introduce the following auxiliary notations: Let us first look at (48) for l = j. In that case, the arrival slot of customer P is the last slot of a state-i-period. We denote with n the number of servers available in the next slot (which is the first slot of a state-ī-period). With probability t i j (k) there are k customers waiting in the queue with priority over P. The delay of the tagged customer P is 1 slot if n > k, or the pgf of its delay is given by zD¯ı k−n (z) if n ≤ k. In the z-domain, this yields u j,j where we have also introduced (39). Similarly, we now look at the situation where l = j − 1, with j > 1. The arrival slot of customer P is the penultimate slot of a state-i-period. We denote with n 1 and n 2 the number of servers available in the next two slots (of which one is the last slot of a state-i-period and one is the first slot of a state-ī-period). With probability t i j−1 (k), there are k customers waiting in the queue with priority over the tagged customer P. The delay of P equals 1 slot if n 1 > k, the delay equals 2 slots if n 1 ≤ k < n 1 + n 2 and the pgf of its delay is given by z 2 D¯ı k−n 1 −n 2 (z) if n 1 + n 2 ≤ k. In the z-domain, this yields The same reasoning can be applied to obtain an expression for the general function u j,l i (z). There are (j − l) full slots until the next state-ī-period. In these slots, n 1 , n 2 , . . . , n j−l servers are available, with probabilities s i (n 1 ), s i (n 2 ), . . . , s i (n j−l ) and in the slot afterwards (the first slot of a state-ī-period), there are n j−l+1 servers available with probability sī(n j−l+1 ). With probability t i l (k) there are k customers waiting in the queue with priority over the tagged customer P. The delay of P equals s slots (with 1 ≤ s ≤ j − l + 1) if ∑ s−1 p=1 n p ≤ k < ∑ s p=1 n p and its delay is defined by the pgf z j−l+1 D¯ı k−n 1 −..
Now, we look at the second part of (47). In order to work out v j i (z) as given in (49), we first introduce the following auxiliary functions: Taking first the sum over k in (54), we identify the definition of D i,n (x, z) as given in (23), so we can bring expression (26) into (54) and then work out the summation over n: Using (15), (16) and (43), we can rewrite the expression for T i,j (z) as where the functions Q i,j (z) are still unknown, but they are similar to the functions Q i (x, z) in (18). It can be proven that they are rational functions with denominator equal to the denominator of S i (z) and that the unknown coefficients of their numerators can be determined using the properties of pgfs. However, it will turn out that it is not necessary to determine these unknowns. We now have an expression for W(z), the pgf of the delay of an arbitrary customer: The above expression is rather complex and, therefore, cannot be easily inverted to give the full delay analysis. On top of that, it still contains (a finite number of) unknowns. However, a tail approximation can be obtained, which will be done in the following section.

Tail Approximation
To obtain a tail approximation of the delay, we can use the theory of the dominant singularity, which has been used extensively in the literature, see for example [11,25]. The theory stipulates that for k sufficiently large and with z 0 the pole of W(z) with smallest modulus and with Note that z 0 will be positive, real-valued and strictly larger than 1, see, e.g., [11]. Let us take a closer look at W(z) and its subparts (52) and (60) to determine where the dominant pole z 0 can be found. The functions f i (x, z) are polynomials and, therefore, contain no poles. The x φ (φ = 1, . . . , M) are assumed to be single roots of g(x, z) and, therefore, g x x φ , z contains no zeros. Furthermore, x φ = 0 cannot give a pole, as g(0, z) = 0 has no solutions. Therefore, z 0 can only be a pole of S i x φ , a pole of T i l 1 x φ or a pole of T i,j 1 x φ .

Conjecture 1.
The dominant pole z 0 can only be found as a pole of G i 0 1 x φ .
From (17) it follows that G 1 0 1 x φ and G 2 0 1 x φ have the same poles. The dominant pole z 0 , therefore, appears in T i l 1 x φ and T i,j 1 x φ , for i = 1, 2 and j = 1, . . . , N i . It is found for a specific value of φ, which we will call ξ and we will denote the value of x ξ (z 0 ) as x 0 . Using (66), we can find an expression for w 0 : with u j,l * where furthermore and Following an application of L'Hôpital's Rule G i * 0 1 x 0 is then obtained from (17) by dividing the numerator of G i 0 1 x ξ (z) by the derivative with respect to z of its denominator and evaluating at z = z 0 : In order to evaluate dx ξ dz z=z 0 we recall that x ξ is a solution of Deriving both sides of the above equation with respect to z, working out for dx ξ dz and evaluating at z = z 0 , we find Note that z 0 does not necessarily exist. Indeed, if we consider the case where every slot contains at most 1 arrival, every arriving customer will experience an empty queue, and will be served in the slot following its slot of arrival. The delay will be 1 for all customers, i.e., W(z) = z.

Numerical Examples
In this section, we will illustrate the model with some numerical examples and validate the obtained results using simulation. First we compare the delay characteristics of two very similar queueing systems and then we look at the influence of the period lengths (while keeping their ratio constant).

Two Similar Queueing Systems
In this example, we compare two cases with the following input parameters: For the stochastic processes describing the numbers of servers available per slot, and the period lengths, we take the same distributions for both Case A and Case B. The expected numbers of available servers are s 1 = 4 and s 2 = 1.5, and the expected period lengths are r 1 = 10 and r 2 = 20. For the numbers of arrivals per slot we take Poisson distributions with intensities as given above. The chosen values mean that in Case A in the first state there are barely any arrivals and in the second state the system is overloaded, while in Case B this situation is reversed.
This specific choice for the parameters results in two queueing systems with an equal expected delay of 10.0 slots, which can be obtained using the method of [16] and using Little's Law [26], and which was also confirmed by simulation. Using the method of the current paper, it can be seen that despite having the same expected delay, and despite being two very similar queueing systems, the tail characteristics of the delay are different, as plotted in Figure 2. We observe that for Case B, there is a larger probability that the delay exceeds a large given value, i.e., the delay distribution of customers has a heavier tail in Case B as compared to Case A. In the figure the probability mass function of the delay obtained by simulation is also plotted, as a validation of the tail probabilities obtained via the method developed in this paper. For Case A, the results obtained by the method of this paper are within 5% of the results obtained by simulation for k > 8. For Case B, this is so for k > 17. The difference between the simulated and calculated results decreases for increasing k, e.g., for Case A the difference is under 1% for k > 15. The method of this paper thus clearly provides the tail probabilities of the delay with a good accuracy for k sufficiently large. This was verified in several other settings as well, with similar observations on the accuracy. It must be noted that the computational effort to obtain the results by simulation is much higher compared to the method of this paper.

Influence of Period Lengths
In a second example, we look at the influence of the period lengths on the average delay and 99th percentile. Throughout this subsection, we take the arrival process and the distribution of the period lengths the same for both states, namely a Poisson arrival process with intensity λ and a geometrical distribution for the period lengths with parameter 1 − 1 r (and thus an expected period length of r): For the server availability, we take a fixed number of four servers available per slot in the first state and a fixed number of two servers available per slot in the second state: In Figure 3, we plot the average delay and the 99th percentile of the delay in function of the expected period lengths r for λ = 2.4. The average delay is calculated from the average system content (as obtained by the method of [16]) by application of Little's Law [26]. The 99th percentile of the delay, i.e., the smallest k for which Prob[delay > k slots] < 0.01, is calculated by inversion of (65): As, on average, there are three servers available, the load is 80%, but in the second state, the system is temporarily overloaded. With increasing period lengths, the average delay increases linearly. This is to be expected as the system remains for longer periods in the overloaded state. The same trend is visible for the 99th percentile of the delay; however, the increase happens with a much steeper slope. This proves that valuable information can be obtained from a tail analysis of the delay. The results of Figure 3 were verified by simulation as well (not plotted as they are not distinguishable from the results obtained by the method of this paper).

Brief Summary for Implementation
In this subsection, we briefly indicate how the method of this paper can be implemented in order to obtain results such as for the numerical examples given earlier in this section, starting from the pgfs fully describing the queueing system (R i (z), S i (z) and C i (z), as defined in Section 2).

1.
Numerically obtain the solutions z p of 1 − R 1 (Y 1 (z))R 2 (Y 2 (z)) = 0 for |z| ≤ 1, where Y i (z) = C i (z)S i 1 z and with z 1 = 1. There are a total of M = m 1 r m 1 s + m 2 r m 2 s such solutions.
Numerically obtain the solutions x p of 4.
Find z 0 as the solution closest to 1 of the following polynomial Equation (for all possible x p ): g(x p , z) = 0 , with g(x, z) given in (36) and x 0 is then equal to the corresponding x p . 5.
Fill in the obtained values for z 0 and x 0 in the expression (67) for w 0 .

Conclusions
In this paper, we have studied the delay characteristics of a discrete-time multiserver queueing model. It is the extension of an earlier work where the system content was treated [16].
The queueing model considers two different system states. Each state is characterized by its own distributions for the number of arrivals and the number of available servers in a slot. Within a state, these numbers are independent and identically distributed random variables. The state transitions occur after stochastic period lengths, and each state has its own distribution for the sojourn time in that state. This setup allows to model a broad variety of queueing systems, where correlations can be introduced in the slot-to-slot arrivals and server availability.
In this paper, we have obtained the tail distribution of the delay for such a queueing model, using a generating functions approach and using the theory of the dominant singularity. Numerical examples have shown the accuracy of the obtained results, and the importance of the work. We have seen that tail characteristics of the delay of a customer in two separate queueing systems can be substantially different even when the systems are very similar and the expected delay is the same. Tail characteristics also provide a further in-depth understanding of the behavior of a queueing system.