Queuing System with Two Types of Customers and Dynamic Change of a Priority

: The use of priorities allows us to improve the quality of service of inhomogeneous customers in telecommunication networks, inventory and health-care systems. An important modern direction of research is to analyze systems in which priority of a customer can be changed during his/her stay in the system. We considered a single-server queuing system with a ﬁnite buffer, where two types of customers arrive according to a batch marked Markov arrival process. Type 1 customers have non-preemptive priority over type 2 customers. Low priority customers are able to receive high priority after the random amount of time. For each non-priority customer accepted into the buffer, a timer, which counts a random time having a phase type distribution, is switched-on. When the timer expires, the customer with some probability leaves the system unserved and with the complimentary probability gains the high priority. Such a type of queues is typical in many health-care systems, contact centers, perishable inventory, etc. We describe the behavior of the system by a multi-dimensional continuous-time Markov chain and calculate a number of the stationary performance measures of the system including the various loss probabilities as well as the distribution function of the waiting time of priority customers. The illustrative numerical examples giving insights into the system behavior are presented.


Introduction
Queuing theory is very useful for solving the problems of optimal sharing and scheduling restricted resources in many real world systems. The important branch of this theory is devoted to the consideration of queuing systems that are designed to provide service to heterogeneous flows of customers. These flows may have different value for the system and, therefore, more important flows may deserve a special treatment. This led to the idea to introduce certain priority classes and provide a privilege to the customers from the class with higher priority. Priorities are divided into preemptive and non-preemptive. For example, if the priority defines the choice of the type of the customer that will receive service, preemptive priority suggests immediate interruption of service of a customer if the customer from higher priority class arrives. Non-preemptive priority works only at service completion moments. Next service is provided to the customer from the queue which has the highest priority among waiting for service. Another kind of the priorities classification distinguishes the static and dynamic priorities. Static priorities are strictly fixed and decision making is defined only by these priorities without account of the current lengths of queues of different types of customers. Dynamic priorities take these lengths into account. Generally speaking, the dynamic priorities are more effective than the static ones. However, the field of their application is narrower because sometimes the queue lengths are not completely observable and management of control is expensive. Therefore, the static priorities are still popular in many real world systems.
The main disadvantage of the classical static priorities is their inflexibility and possible unfairness with respect to low-priority customers. If congestion occurs, the non-priority customers have very small chance to receive service within a reasonable amount of time while the priority customers are serviced quickly. To overcome this disadvantage, various improvements of the static priorities can be offered, e.g., restriction of too fast access of priority customers or mandatory service of a non-priority customer after service in turn of some fixed number of the priority customers. Another popular improvement consists of the possibility of increasing the priority of the customer during his/her staying in the queue. There are works where the customer accumulates the priority from some initial value, which depends on the priority of the customer, according to some linear or non-linear function (again, depending on the priority of the customer) during the stay of the customer in the system. Among these works we can mention papers [1][2][3][4][5]. Another bunch of the works assume that the increase of the priority occurs not deterministically according to some functional dependence, but randomly, after certain random amount of time. Because our paper makes the analogous assumption, to clarify the novelty of our model and results, we overview the related results in Table 1. In this table, the standard Kendall's denotations are used. In particular, the symbol M denotes the exponential distribution of the corresponding variables (inter-arrival and service times as well as the time until the increase of the priority), the symbol PH denotes the phase type distribution.
It is clear from this table that our paper surpasses all cited papers, except [8,9], where N priority classes are considered, whereas we considered only two classes; however, we allow batch arrivals of customers and obtain not just ergodicity condition (as it is done in [8,9]) but exactly compute the stationary distribution of the system states and waiting time for priority customers. No other paper from this table gives the exact formulas for these distributions. It is worth to note also that, except [8,9], all papers assume the exponential distribution of inter-arrival and service times as well as the time until the increase of the priority. We assume a much more general phase type (PH) distributions and the batch marked Markov arrival process (BMMAP). This is very important for potential real-world applications. One of popular applications of such kind of models is in the field of health-care. This application is mentioned in practically all cited above papers. For example, we considered queuing system suits for description of operation of the emergency department in a hospital, including an operation room and a team of necessary doctors and nurses. Patients, which suffered in an accident, are delivered to this hospital. There, they are subjected to triage. This triage is performed by a physician called sorting medic, which decides whether the patient needs very urgent treatment (and, consequently, receives high priority) or can wait for a while (receives low priority). During the waiting time, the state of the low priority patient can become worse and he/she will need urgent treatment (becomes high priority customer). It is worth noting that in the recent paper [13] devoted to analysis of patients managing in emergency departments, the following is stated: "disciplines with changing priorities have been studied in the literature from a queuing theory point of view, which requires assumptions rarely found in real emergency departments, such as homogeneity in the patient arrival pattern and only one service stage." Assumptions about the BMMAP arrival process and phase type service (probably consisting of an arbitrary finite number of sequentially stages) made in our paper a better fit for modeling real emergency departments. The phase type distribution is also much better than the exponential one, as it is suited for the description of the time until the patient will leave the hospital without help (e.g., transfers to another hospital or dies) or become a priority customer. The mode of the exponential distribution is equal to zero, which is hardly true for waiting of service. The model considered in this paper can also be used for description of operation of a contact center. As it is mentioned in the literature, phone calls have high priority, and requests sent by e-mail or a messenger have low priority; however, the customer who used a messenger for receiving information can make a phone call if his/her waiting for a response is too long. It is clarified in [14] that exponential assumptions may be inadequate and more general distributions or flows should be exploited for modeling a contact center, as it is done in our paper.
Our paper has the following structure. In Section 2, the queuing model we studied is described. The operation of the system is described by a multi-dimensional continuous-time Markov chain in Section 3. The infinitesimal generator of this Markov chain is presented there along with the brief proof of the probabilistic meaning of its blocks. The main difficulties in derivation of the expressions for these blocks and computer realization of their computations are caused by the assumption that the times until the change of a priority are not exponential, but a more complicated phase type distribution. To drastically reduce the state space of the multi-dimensional process describing the simultaneous behavior of underlying processes of such a distribution for all non-priority customers, we used and extended the known trick from [15,16]. In Section 4, the problem of computation of the stationary distribution of the states of the constructed Markov chain is briefly touched on and expressions for computation of the most important performance indicators of the system are derived, including expressions for the loss probabilities of an arbitrary customer, high priority customer, low priority customer and customer departed from the buffer without obtaining service. Expressions for computation of distribution functions of the waiting time of an arbitrary priority customer that is accepted into the system and of an arbitrary non-priority customer who becomes the priority customer are presented. These expressions allow us to exactly compute the probability that the waiting time will exceed an arbitrary given value. In turn, having the ability to compute this probability, many managerial problems can be solved, e.g., the choice of the satisfactory service rate (which is predefined by the used equipment and the staff of the team of surgeons, anesthetists, nurses, etc), necessary capacity of the buffer and strategy of rescheduling the patients to other hospitals, etc. Section 6 contains the results of the numerical examples which highlight the effects of variation of arrival rate and buffer capacity as well as correlation in the arrival process.

Model Description
We considered a single-server system with a buffer of capacity N and non-homogeneous input flow.
The structure of the system is presented in Figure 1. Customers of two types arrive to the system according to a BMMAP. The brief description of this BMMAP is as follows. Arrivals of the batches of customers occur under the control of an irreducible continuous-time Markov chain ν t , t ≥ 0, with finite state space {0, . . . , W}. This chain is called as an underlying process of the BMMAP. The chain ν t stays in the state ν during an exponentially distributed time with the parameter λ ν , ν = 0, . . . , W. After this time is over, with probability p 0 (ν, ν ) the chain transits into the state ν , ν = ν, without generating customers or with probability p k (ν, ν ) the chain transits into the state ν and the batch of k customers of lth type is generated, k ≥ 1, l = 1, 2. It is assumed that Parameters characterizing the BMMAP are stored in the square matrices D 0 , D (l) k , k ≥ 1 , l = 1, 2, of sizeW = W + 1. These matrices are defined by their entries as follows: Let θ be the vector of the stationary distribution of the chain ν t , t ≥ 0. The vector θ is calculated as the unique solution of the system of the linear algebraic equations θD(1) = 0, θe = 1. Here e is a column vector consisting of 1's and 0 is a row vector consisting of 0's.
The arrival rate of customers of lth type is calculated by the formula The total rate of customers in the BMMAP is equal to λ = 2 ∑ l=1 λ l . The arrival rate of batches of customers of lth type is calculated by the formula λ k e, l = 1, 2. The variance of the lengths of the intervals between the moments of arrival of batches of the lth type customers are calculated by The coefficient of variation of the lengths of the intervals between the arrival moments of batches of customers of the lth type is calculated by c The coefficient correlation of the lengths of two adjacent intervals between the arrival moments of batches of customers of the lth type is calculated by A more detailed description of a BMMAP and the batch Markov arrival process can be found, for example, in [17,18].
In our model, the type of a customer defines his/her priority. Type 1 customers have non-preemptive priority over type 2 customers. A batch of priority customers, which enters the system and meets the corresponding number of free places in the buffer, is queued in the buffer after the last priority customer staying in the buffer and ahead of all non-priority customers, if any. If the batch size is greater than the number of free places, then the part of the batch is accepted and the rest leaves the system forever (is lost). This means that we consider the so called partial admission discipline. A similar acceptance discipline is valid for an arriving batch of non-priority customers, only these customers are placed at the end of the common queue if there are free places in the buffer.
For every non-priority customer accepted into the buffer, a timer is switched-on. The timer counts the random time having PH distribution defined by an irreducible representation (γ, Γ). The time until the timer expiration (switching-off) is defined by the time until absorption of the continuous-time irreducible Markov chain r t , t ≥ 0, with the state space {1, . . . , R, R + 1}, where the state R + 1 is an absorbing one. The transition rates of the chain within the set of transient states {1, . . ., R} are defined by the entries of the sub-generator Γ and the rates of transition into the absorbing state R + 1 are defined by the entries of the vector Γ 0 = −Γe. At the moment of switching the timer on, the state of the process r t is randomly selected among the transient states according to the probability row vector γ. The average time until switching-off the timer is calculated by t 1 = γΓ −1 e. We denote τ = −t −1 1 . When the timer, which was switched-on for some non-priority customer, expires (switches-off), i.e., the corresponding Markov chain r t reaches the absorbing state R + 1, this customer leaves the system unserved with the probability p or gains the higher priority and moves at the end of the priority customers queue with the complimentary probability 1 − p. The priority customers are picked up from the buffer according to first in-first out discipline. If at the service completion epoch the priority customers are absent in the buffer, an arbitrary non-priority customer among those having the maximal value of the corresponding process r t is picked up for service. Such an assumption is quite realistic when the time counted by the timer has Erlangian distribution or generalized Erlangian distribution having various rates of the phases. In this case, in average, the maximal value of the underlying process r t have customers having longer stay in the buffer. This is why namely such a customer (or one of such customers if several customers have the maximal value) will be picked up for service.
The service time of any customer has a PH distribution with an irreducible representation (β, S) and underlying process m t , t ≥ 0. This process has the state space {1, . . . , M, M + 1}, where the state M + 1 is an absorbing one. The transition rates of the chain m t , t ≥ 0, within the set of transient states {1, . . . , M} are defined by the entries of the sub-generator S. The transition rates into the absorbing state (what implies service completion) are defined by the entries of the vector S 0 = −Se. The average service rate is calculated by µ = −(βS −1 e) −1 , the average service time is given by b It is worth noting that if we assume that only priority customers arrive to the system then the system under consideration transforms to the BMAP/PH/1/N type queuing system. In this case, the matrices D (2) k k ≥ 1, describing the input flow are equal to zero. The system BMAP/PH/1/N is a very special case of the system BMAP/SM/1/N with semi-Markovian service process that was investigated in [19] for partial admission strategy and in [20] for complete admission and complete rejection strategies.

Process of the System States
Let at the moment t • i t be the number of customers in the buffer, i t = 0, . . . , N; • j t be the number of non-priority customers in the buffer, j t = 0, . . . , i t ; • χ t = 0, if the server is idle; χ t = 1, if the server is busy; • m t be the state of the underlying process of the PH service on the busy server, m t = 1, . . . , M; • n (r) t be the number of non-priority customers staying in the buffer for which the timer is in the state r, n (r) = 1, . . . , j t , r = 1, . . . , R, • ν t be the state of underlying process of the BMMAP, ν t = 0, . . . , W.
The process of operation of the system is described by a regular irreducible continuous-time Markov chain ξ t , t ≥ 0, with state space Note that the number of states in the space Ω with the value i = 0 of the first component is and the number of states with the value i = 1, . . . , N of the first component is Then the cardinality of the state space Ω is equal to We suppose that the states of the chain ξ t , t ≥ 0, are enumerated in the direct lexicographic order of the components i t , j t , χ t , ν t , m t , and in the reverse lexicographic order of the components In the sequel, we use the following notation: I (O) is an identity (zero) matrix of an appropriate size. When it is necessary, we identify the size of a matrix with a suffix; diag {a l , l = 1, . . . , L } is a diagonal matrix with the diagonal entries a l ; diag − {A l , l = 0, . . . , L} is a sub-diagonal matrix with the sub-diagonal blocks A l ; ⊗ and ⊕ are the symbols of the Kronecker product and sum of matrices, see, e.g., [21], and δ m,n is Kronecker's symbol; a =W M.
Before proceeding with the construction of the generator of the Markov chain ξ t , t ≥ 0, we will consider the important problem of finding the transition rates of the process n t = (n (1) t defines the number of timers having the state r of the underlying process r t , r = 1, . . . , R.
To calculate these transition rates, we use the results for R independent Markov processes in parallel, obtained by V. Ramaswami and D. Lucantoni, see [15,16]. To use these results for our model, the transition probabilities or rates of the process n t .
• The matrix P j (γ) contains the transition probabilities of the process n t at the moments of an increase in the number of non-priority customers in the buffer from j to j + 1. • The matrix P i,j (γ) = P i (γ)P i+1 (γ) . . . P j−1 (γ) contains the probabilities of transition of the process n t at the moments of an increase in the number of non-priority customers in the buffer from i to j. • The matrix A j (l, Γ) contains the transition rates of the process n t in the state space of this process without increasing or decreasing the number of non-priority customers in the buffer (here j is the number of non-priority customers in the buffer, l is the total number of free places and non-priority customers in the buffer). • The matrix L k (l,Γ) contains the transition rates of the process n t , which leads to the expiration of the timer on one of the l − k non-priority customers in the buffer (here k is the number of free places in the buffer, l is the total number of free places and non-priority customers in the buffer).
In the following, we assume that A more detailed description of the matrices P j (γ), A j (l, Γ), L k (l,Γ) and algorithms for their calculation, can be found in [15,22].
Besides the matrices P j (γ), A j (l, Γ), L k (l,Γ) we need to use the matrix E j,j−1 , which defines the transition probabilities of the process n t when the current service is completed and there are no priority customers in the buffer (in the following we denote this service completion epoch as t * ) and one of j non-priority customers from the buffer moves to the server. The server is occupied by one of the non-priority customers whose timer is in the maximal (among all timers) phase r max and the number of non-priority customers in the buffer becomes equal to j − 1. Let the rows of the matrix E j,j−1 correspond to the possible states of the process n t at the moment t * − 0 and the columns correspond to the states of the process n t at the moment t * + 0. We assume that in both cases the states are ordered in the reverse lexicographic order. Consider the row of the matrix E j,j−1 corresponding the state (n (1) , n (2) , . . . , n (r max ) , 0, . . . , 0) of the process n t . If at the moment t * − 0 the process n t is in such a state, then at the moment t * − 0 it will transit to the state (n (1) , n (2) , . . . , n (r max ) − 1, 0, . . . , 0). This transition occurs with probability 1. To fix this transition, we set the entry (E j,j−1 ) (n (1) ,n (2) ,...,n (rmax ) ,0,...,0), (n (1) ,n (2) ,...,n (rmax ) −1,0,...,0) of the matrix E j,j−1 be equal to 1. The remaining entries of the row are set equal to zero. Using such an algorithm for forming the rows, we get the matrix E j,j−1 of size C R−1 j+R−1 × C R−1 j+R−2 . Let Q i,l be the matrix of transition rates of the chain ξ t , t ≥ 0, from the set of states corresponding to the value i t = i to the states corresponding to the value i t = l. Then the infinitesimal generator Q of the chain is formed as Q = (Q i,l ) i,l=0,...,N . The following statement is true. Lemma 1. The infinitesimal generator Q of the Markov chain ξ t , t ≥ 0, has the following block structure: Here, ∆ i , i = 1, . . . , N, are the diagonal matrices ensuring the equality Qe = 0 T .

Proof.
(1). The entries of the block Q i,i+k , i = 0, . . . . . . , N − 1, k = 1, . . . , N − i, define the transition rates of the Markov chain ξ t , t ≥ 0, that lead to the increase the number of customers in the buffer by k.
If i = 0, the transitions can occur as a result of the following events: • the batch of (k + 1) type 1 customers arrives to the idle system and one of the customers occupies the server. The rates of this event are defined by the matrix D k+1 ⊗ β, if k = 1, . . . , N − 1, and by the matrix • the batch of (k + 1) type 2 customers arrives to the idle system, one of the customers occupies the server and a timer is set for each of min{k, N} customers placed in the buffer. The rates of this event are defined by the matrix D (2) k+1 ⊗ β ⊗ P 0,k (γ), if k = 1, . . . , N − 1 and by the matrix • the batch of k type 1 customers arrives to the system when the buffer is empty while the server is busy. In this case, min{k, N} customers are placed in the buffer. The rates of this event are defined by the matrix D k ⊗ I M ⊗ P 0,k (γ), if k = 1, . . . , N − 1, and by the matrix The presence of zero blocks in the matrices Q 0,k is explained by the fact that simultaneous arrival of both priority and non-priority customers in the BMMAP over an infinitesimal time interval is possible only with zero probability.
Thus, we have explained the structure of the generator blocks Q i,i+k for i = 0. We have considered two cases: the server is idle and the server is busy. If i = 1, . . . , N − 1, the server is busy a priori. Because of this, the explanation of the structure of the blocks Q i,i+k for i = 0 is similar to the explanation given for the case when i = 0 and the server is busy.
(2). The entries of the block Q i,i−1 , i = 1, . . . , N, define the transition rates of the Markov chain ξ t , t ≥ 0, at the moments of the decrease the total number of customers in the buffer by one.
If i = 1, the transition occurs as a result of the following events: • the service of the current customer finishes and the only priority customer staying in the buffer occupies the server. The rates of this event are defined by the matrix IW ⊗ S 0 β. • the service of the current customer finishes and the only non-priority customer staying in the buffer occupies the server. At this moment, the timer set for this customer is switched-off. The rates of this event are defined by the matrix IW ⊗ S 0 β ⊗ e R . • the timer set for the only non-priority customer staying in the buffer expires while the server is still busy. In this case the non-priority customer with the probability p leaves the system forever. The rates of this event are defined by the matrix pI a ⊗ Γ 0 .
If i = 2, . . . , N, the transitions of the Markov chain ξ t , t ≥ 0, related to the decrease the total number of customers in the buffer by one occur as a result of following events: • the service of the current customer finishes and there is at least one priority customer in the buffer.
Then, the first priority customer goes to the server and the number of customers in the buffer decreases by one. The rates of this event are defined by the matrix diag{IW • the timer expires for one of non-priority customers and this customer leaves the buffer forever.
The rates of this event are defined by the matrices pI a ⊗ L N−i (N − i + j,Γ), j = 1, . . . , i. • the service of the current customer finishes and there are no priority customers in the buffer.
Then one of the non-priority customers whose timer is in the maximal (among all timers) phase r, r = 1, . . . , R, goes to the server and the number of non-priority customers in the buffer whose timer is in the phase r decreases by one. The rates of this event are defined by the matrix IW ⊗ S 0 β ⊗ E i,i−1 .
(3). The non-diagonal entries of the block Q i,i define the transition rates of the chain ξ t , that do not lead to the change of the number i of customers in the buffer. The case of i = 0 is not commented on here due its simplicity. In the case i = 1, . . . , N, the mentioned transitions occur as a result of the following events: • the underlying process of BMMAP makes a transition without the generation of customers or the PH service time underlying process makes a transition which does not lead to the service completion. The rates of these events are defined by the non-diagonal entries of matrix diag{(D 0 ⊕ S) ⊗ I C R−1 j+R−1 , j = 0, . . . , i}, if i = 1, . . . , N − 1 and by the non-diagonal entries of the • the underlying process of the PH timer set for one of j non-priority customers in the buffer makes a transition, which does not lead to the timer expiration. The rates of this event are defined by the non-diagonal entries of the matrix diag{I a ⊗ A j (N − i + j, Γ), j = 0, . . . , i}. • the timer expires for one of j non-priority customers in the buffer but the server is busy. Then, the mentioned customer with the probability 1 − p gains the higher priority and joins the tail of the priority customers queue. The rates of this event are defined by the entries of The diagonal entries of the block Q i,i , i = 0, . . . , N are negative and the modulus of each entry defines the rate of leaving the corresponding state of the Markov chain ξ t , t ≥ 0.
Lemma is proven.

Stationary Distribution: Stationary Performance Measures
Since the Markov chain ξ t is irreducible with a finite state space, it has the unique stationary distribution.
Let p i be the row vector of the steady-state probabilities of the states having the value i of the first component, i = 0, . . . , N. The entries of the vector p 0 of size K 0 give the steady-state probabilities that the buffer is empty, the server is idle or busy, the underlying process of the BMMAP is in any of the (W + 1) states and, if the server is busy, the service process is in any of the M phases. The entries of the vector p i of size K i , i = 1, . . . , N, give the steady-state probabilities that there are i customers in the buffer, the underlying process of the BMMAP is in any of the (W + 1) states, the service process is in any of the M phases and the timers process n t is in any of the i ∑ j=0 C R−1 j+R−1 states.
Denote as p = (p 0 , p i , . . . , p N ) the stationary probability vector of the chain ξ t . It is well known that this vector is the unique solution of the following system of linear algebraic equations: pQ = 0, pe = 1. (1) In case of small dimension, the system of Equation (1) can be solved on a computer by standard methods. However, with more or less large values of N, R, the order of this system becomes so large that it is not possible to solve this system directly, for example, using the method of the inverse matrix. In such a case, we use the algorithm that was elaborated in [23]. This algorithm is stable, takes into account the upper-Hessenberg structure of the generator Q and operates with the generator blocks Q i,l .
The sizes of these blocks are defined by the values K 0 =W(1 + M), and K i =W M i ∑ j=0 C R−1 j+R−1 , i = 1, . . . , N, whereas the size of whole system of Equation (1) is equal to Having calculated the stationary distribution p i , i = 0, . . . , N, we are able to calculate a number of system performance measures of interest. Below we give the expressions for these performance measures along with brief explanations of nontrivial formulas.
• Probability that the system is idle • Probability that the buffer is empty and the server is busy • Probability that there are i customers in the buffer, of which j customers are non-priority • Probability that there are i > 0 customers in the buffer • Mean number of customers in the buffer • Mean number of non-priority customers in the buffer • Mean number of priority customers in the buffer L (prior) = L bu f − L (non−prior) .
• Probability that an arbitrary customer will be lost due to lack of buffer space where I A brief explanation of Equation (2) is as follows. The expression in the braces is the rate of the output flow and λ is the input rate. Then, the ratio of these rates gives the probability that an arbitrary customer will be served, and the complimentary probability gives the desired probability P loss .
Note that the probability P loss can be calculated by an alternative equation obtained by considering the situation at the arrival time. This equation has the form where A brief explanation of Equation (3) is as follows. The expression in the braces gives the rate of customers from the input flow that are accepted into the system. Dividing this rate by the input rate λ, we obtain the probability that an arbitrary customer will not be lost due to the lack of buffer space. The complimentary probability gives the probability P loss .
Similarly calculated the probability of loss of customers of each type. • Probabilities that an arbitrary priority customer (l = 1) and an arbitrary non-priority customer (l = 2) will be lost due to lack of buffer space P (l) k e , l = 1, 2.
• Probability that an arbitrary non-priority customer will be lost due to impatience A brief explanation of Equation (4) is as follows. The expression gives the mean number of expirations of the timers per unit time. Each non-priority customer from the buffer, for which the timer expires, with the probability p leaves the system (is lost). The ratio of the rate of flow of thus lost non-priority customers to the input rate of non-priority customers, λ 2 , is loss . • Probability that an arbitrary non-priority customer accepted to the buffer will be lost due to impatienceP

Stationary Distribution of a Priority Customer Waiting Time
Let w (1) τ , τ ≥ 0, be the process of waiting time of an arbitrary priority customer accepted into the system at the moment τ and w (2) τ , τ ≥ 0, be the process of waiting time of a customer which changed the low priority to the high priority during the stay in the buffer starting from the moment of changing the priority. We assume that an arbitrary priority customer, which arrives in a batch of size k, is numerated as the ith in the batch with the probability 1 τ < t} be the distribution function of the process w (l) τ , l = 1, 2. Denote by W l (t) the corresponding stationary distribution functions of the waiting time, i.e., W l (t) = lim τ→∞ W (l) τ (t), l = 1, 2. It is seen from the definitions, that W l (t), l = 1, 2, are the conditional distribution functions. To find these functions, we focus on the calculation of the joint probabilities, which are defined as follows: W 1 (t) is the joint probability that an arbitrary priority customer will be accepted into the system and his/her waiting time will be less than t; W 2 (t) is the joint probability that an arbitrary non-priority customer will become the priority customer and after that his/her waiting time will not exceed value t.
The following theorem holds.
Theorem 1. The functionsW 1 (t) andW 2 (t) have the forms Here the square matrix A (n) of size M(n + 1) is defined as and the sub-vector p i (j) is the part of the vector p i corresponding to presence of j non-priority customers in the buffer and is defined as Proof. We first consider the derivation of Equation (5) for the functionW 1 (t). Using the equation of total probability, this function can be written in the following form: where A (n) To clarify Equation (7), we first explain the meaning of the matrices A (n) and t 0 e A (n) x A (n) 0 dx. Let us tag an arbitrary priority customer accepted to the system. Suppose that at the arrival moment of this customer, the server is busy and max{0, n − 1} priority customers are in the buffer. The waiting time T (n) of the tagged customer is the sum of the remaining service time of customer in the service and the service time of max{0, n − 1} priority customers from the buffer. Given the known distribution of the service phases at the arrival time of the tagged customer (let this distribution be given by the vector α of size M) the waiting time T (n) of this customer has a PH distribution with (n + 1)M phases defined by the irreducible representation (α, A (n) ), whereα = (α, 0 nM ). Then P{T (n) < t} =α (7) for the functionW 1 (t). The scalar

Now let us explain the meaning of the terms in Equation
is the probability that the system is empty at the arrival moment of the batch of k priority customers, the tagged priority customer arrives in this batch and receives the first position in the batch what implies that his/her waiting time is less than t for any t > 0.
As stated above, the waiting time for an arbitrary priority customer has a PH distribution. Taking into account this fact, we can see that the vector defines the initial phase distribution of the waiting time of the tagged priority customer conditional he/she enters the empty system in a batch of size k, is accepted into the system, and occupies the (l − 1)th place in the buffer, l = 2, . . . , min{N + 1, k}. Multiplying the indicated row vector by the column vector defines the initial state of underlying process of phase distribution of the waiting time of the tagged priority customer, which is dependent on whether or not he/she enters the system in the batch of size k when the buffer is empty and the server is busy, is accepted into the system and occupies the lth place in the buffer, l = 1, . . . , min{N, k}. Multiplying the indicated vector by the column vector dx, we obtain the probability that such a tagged customer has his/her waiting time less than t. The vector in double sum on the right-hand side of Equation (7) defines the initial phase distribution of the waiting time for the tagged priority customer, which enters the system in the batch of size k when i customers are in the buffer, i − j of which are priority ones, and occupies the lth place in buffer, l = i − j + 1, . . . , min{N − i, k}. Multiplying the indicated vector by column vector we obtain the probability that such a tagged priority customer has waiting time less than t. Now we use the equation for total probability by summing up the described terms on the right-hand side of Equation (7) over k and l. Then, on the right-hand side of Equation (7) we obtain the probability that the arbitrary tagged priority customer was accepted into the system and his/her waiting time is less than t. This is by definition the functionW 1 (t). Performing the calculation of integrals and some simple transformations in Equation (7), we obtain the desired Equation (5). Now we prove Equation (6) forW 2 (t). In the equation, the mth entry of the vector defines the mean number of expirations of the timers per unit of time conditional that i customers are staying in the buffer, j of which are non-priority ones and the service process is in the phase m.
Note that the valueγ is the total rate of the flow of expiration of timers on non-priority customers. Then, the mth component of the vector is the probability that at the moment of timer expiration on one of the non-priority customers, the service of the current customer is in the phase m and there are i customers in the buffer of j which are non-priority ones.
If a non-priority customer, whose timer has expired, becomes priority, then his/her waiting time T (i,j) consists of the residual service time of the customer in the service and the service time of (i − j − 1) of priority customers standing in the buffer in front of him/her. This time has a PH distribution with the representation (φ (i,j) , A (i−j) ), whereφ (i,j) = (φ (i,j) , 0 (i−j)M ). Then the distribution of the random time T (i,j) is calculated as follows: Multiplying the right-hand side of Equation (8) by the right-hand side of Equation (9), calculating the integrals, summing over the possible values of i, j and multiplying the resulting expression by 1 − p, we obtain Equation (6) for the functionW 2 (t).
Corollary 1. (i) The distribution function of the waiting time of an arbitrary priority customer accepted into the buffer is calculated by the equation (ii) The distribution function of the waiting time of a priority customer that initially was the non-priority customer is calculated by the equation Proof. Since the function W l (t) is a conditional distribution function, it is calculated by the division of the joint probabilityW l (t) by the probability of the conditionW l (∞), l = 1, 2.

Corollary 2.
The average waiting timew 1 of an arbitrary priority customer accepted into the buffer and the average waiting timew 2 of a non-priority customer after he/she became the priority customer, are calculated by the equationsw Proof. The proof follows from the equationsw l =

Numerical Examples
The goal of this section is to bring out the qualitative nature of the model under study through three illustrative numerical examples.
Example 1. In this example, we plotted graphs of the distribution functions W 1 (t) and W 2 (t) under different values of the service rate µ. We considered the following input data.
• N = 10, p = 0.4 • To define the BMMAP, we first define the matrices D 0 and D as follows The matrix D is split into two matrices D (1) and D (2) as It means that the arrival rate of batches of type 2 customers are nine times less than the arrival rate of batches of type 1 customers.
Next, we need to obtain the matrices D (1) k and D (2) k . We assume that the maximum batch size of type 1 (priority) customers is 5 and the distribution among batches of different sizes is carried out in accordance with the equation Further, we assume that the maximum batch size of type 2 (non-priority) customers is 2 and the distribution among batches of different sizes is carried out in accordance with the equation For this BMMAP λ = 8, λ 1 = 1.569656, λ 2 = 6.430344, λ It means that the time counted by the timer has the Erlangian distribution of order 2. The timer rate τ = 5 and the coefficient of variation c var = 0.7. • In this example, we considered three service processes, which differ in the service rate. In all these processes, the service time has the Erlangian distribution of order 2. The rates of the phases of this distribution are calculated as 8c, where c = 1, 2, 4. Therefore, the service rates are equal to 4, 8, 16, respectively.
When calculating the functions W 1 (t) and W 2 (t), we consider t ∈ [0.01, 4] and take 40 points uniformly distributed over the interval. In Table 2 we give the values of the functions W 1 (t) and W 2 (t) at 21 points. Table 2. Distribution functions W 1 (t) and W 2 (t).
As it is seen from Table 2, the functions W 1 (t) and W 2 (t) differ only slightly. This can be explained by the fact that the waiting time of an arbitrary arrived priority customer and the customer, which initially arrived as the non-priority one, may be different if the distributions of the number of customers at the moments of arrival of a priority customer and transfer of a non-priority customer to the class of priority customers would be different. However, as calculations in this example show, these distributions are close, and the waiting time distributions of two types of priority customers are not much different. Therefore, below in this example we considered only the function W 1 (t).  As expected, under the fixed value of µ the function W 1 (t) increases with increasing t and tends to 1 when t → ∞. It is also clear that the value of the function increases with increasing the service rate µ. Note also that the growth rate is greater for larger values of µ. Example 2. In this example, we investigate the dependence of the loss probabilities P loss , P (1) loss , P (2) loss , P (imp) loss on the buffer capacity N for the system with BMMAPs having the different coefficients of correlation.
These additional BMMAPs are also defined by certain matrices D 0 and D from which the matrices For this BMMAP, c (1) var = 3.087863. The service time has the Erlangian distribution of order 2 with parameter 20. The timer has the Erlangian distribution of order 2 with parameter 10. The probability that a non-priority customer leaves the system unserved when the timer expires is p = 0.4. Figures 3-6 show the dependence of the loss probabilities P (1) loss , P loss , P loss , P loss on the buffer capacity N for the systems with BMMAPs having different coefficients of correlation.    As expected, the loss probabilities of priority and non-priority customers, P (1) loss , P (2) loss , as well as the general loss probability, P loss , decrease with N increasing. It is also seen that the loss probabilities depend on the correlation in the input flow. Under the same value of N each of these probabilities increases when the correlation increases. The difference in the value of any of these probabilities for BMMAP with different coefficients of correlation may be very significant. This is especially evident if we compare the curves for BMMAPs with coefficients of correlation c decreases with increasing the buffer space N; therefore, larger fraction of non-priority customers is admitted in the system and forms the longer queue. Thus, the rate of the flow of customers leaving the system due to impatience increases. The graphs also confirm that the correlation in the input flow can significantly affect the value P (imp) loss , especially for large values of N. The higher correlation implies the higher probability of loss due to impatience.
In general, it can be concluded from Figures 3-6 that ignorance of the correlation in the input flow can negatively affect the accuracy of evaluating the performance of real systems and lead to too optimistic estimates of the performance measures. Example 3. In this example, we computed the following performance measures: mean number of priority customers in the buffer, L (prior) , mean number of all customers in the buffer, L, the loss probabilities P (1) loss , P 2) loss , P loss , P (imp) loss . We investigated the dependence of these performance measures on the input rate λ for the systems with BMMAPs having different coefficients of correlation.
We considered three BMMAPs from Example 2. For convenience, we denote them as BMMAP 1 , BMMAP 2 and BMMAP 3 , where the numbering is done in order of increasing of coefficients of correlation in the BMMAP. Distributions of the service time and timer are the same as in Example 2. The buffer capacity N = 10 and the probability that a non-priority customer leaves the system unserved when the timer expires is p = 0.4.  As expected, the values of L (prior) and L increase when the input rate λ increases. More interesting is to analyze relative location of the curves for different coefficients of correlation. In the region λ < 10, a curve for the higher correlation is located above a curve for the lower correlation. It is worth recalling that in the considered example the service rate is µ = 10, therefore, the region λ < 10 is associated with the region ρ = λ µ < 1. The relative location of the curves is changed when λ > 10 (ρ > 1). In this region, under the same value of λ the mean L (prior) and L are significantly less for the BMMAP 3 than for the BMMAP 1 and BMMAP 2 . Also note a relatively low rate of increase in the case of the BMMAP 3 . Thus, with a large correlation, we observe poor average buffer occupancy (due to the high loss probability). In order to investigate the behavior of L (prior) and L for BMMAP 3 in more detail, we examine the deviations of the number of priority customers and the total number of customers in the buffer from their average values. Figures 9 and 10 show the behavior of standard deviations σ and σ (prior) .  As seen in the figures, the values of σ and σ (prior) for the BMMAP 3 retain large values in a wide region λ > 10. Taking this into account, we can explain the behavior of the means L (prior) and L by the fact that, under a high correlation in the BMMAP, periods of time when customers arrive rarely (and the server starves) alternate with the periods when customers arrive frequently (and many customers are lost due to the full occupancy of the buffer). This implies the non-uniform filling of the buffer. On average, the buffer may be filled weakly, but there is the large deviation of the number of customers in the buffer from the average value.

Conclusions
We considered a priority single-server queue with a finite buffer and two types of customers. Customers of low priority may become high priority customers after a random time having PH distribution. This queuing system may be helpful for modeling the work of emergency department in a hospital, contact center, inventory with perishable foods, etc. An analysis of the system was implemented under more realistic situations than found in the majority of the existing literature, and making assumptions about the arrival process and distributions of the service time and time until the change of the priority. Exact algorithmic results are obtained for stationary distributions of the number of customers of two types in the system and waiting time of priority customers. Numerical illustrations are presented, which show the dependencies of important performance indicators of the system in arrival rate and capacity of the buffer. Significant effect of correlation in arrival process is shown what evidently motivates the choice of the BMMMAP as the model of arrival process, whereas the overwhelming majority of the existing research is devoted to the system with the stationary Poisson arrival process having zero correlation. The presented results are planned to be extended to the systems in which the increase of the priority can be dependent on the availability of some additional items (equipment, expendable materials, etc.), see, e.g., [24].