A Multi-Server Heterogeneous Queuing-Inventory System with Class-Dependent Inventory Access

: In this paper, we consider a queuing inventory system with heterogeneous customers of K types arriving according to a marked Markovian arrival process. Each class of customers differs by nature of the service they seek and different priorities are assigned for each class resulting in different levels of inventory admitted to exhaust for customers of each class. A single service node is provided for each class with exponential services having class-dependent service rates. All classes of customers are served from a single source of inventory replenished according to ( s , S ) policy with exponentially distributed lead time. Stability condition and steady state probabilities are obtained by matrix-analytic method. Some important performance measures are also derived. Inventory recycle time was analyzed in detail. Useful cost function and numerical illustrations are also given. The optimization problem is interesting and can be solved in similar real scenario.


Introduction
Queuing-inventory systems are in the focus of recent research due to practical applicability in many fields including social, biological and technical systems. Access to a finite consumable and refillable resource is a natural way of modeling interaction with retail shop customers, office visitors, hospital patients, and even packages in the telecommunication network. In such systems, various sophisticated models arise, including the models with random demand and/or number of customers served [1,2], random order grouping [3] or duplicate ordering from several facilities [4]. In operations research, queuing-inventory system is a natural way to model load leveraging techniques, such as the leaky bucket congestion avoidance scheme used in a wide range of systems, from large scale routers [5] to electric vehicle charging stations [6].
Queuing inventory systems with positive service time were first investigated in Reference [7], followed by the work of Reference [8], in which an optimal quantity of inventory to be ordered to minimize the cost rate was obtained. In queuing inventory framework with positive service time, customers' queue is formed even when some inventory is available. We point the reader to a detailed survey of inventory systems with positive service time given in Reference [9], which includes classical, retrial, and production inventory.
Many retailers and banks find it helpful to partition the customers into different categories (classes) according to specific characteristics and adopt an inventory management policy based on this differentiation strategy. In particular, long-term customers can be treated as high priority as compared to walk-in customers. Given a limited resource, the low priority customers may have to wait while some amount of resource is still available, reserved for customers with higher priority. The reservation may be made by imposing typical inventory levels for each class of customers according to their priority level. When the inventory comes below the level, customers from corresponding classes may have to wait until the inventory replenishment. Such a critical level policy is introduced and studied in Reference [10]. Different classes experience congestion with a single inventory; thus, customer sojourn times in a system are intrinsically correlated.
In most cases priorities are accompanied with heterogeneity of customer classes, e.g., in terms of service time distribution either in single-server [11,12] or in multi-server case [13][14][15]. As such, the arrival process also becomes heterogeneous, and the promising candidate is the so-called Marked Markovian Arrival Process (MMAP) used, e.g., in the works [11,14,16].
MMAP [K] is a generalization of Markov arrival processes (MAP) which have been studied and used extensively in queuing theory. MAP was introduced in Reference [17] to model non-Markovian point processes. While MAP is a useful tool to model point processes with one class of customers, MMAP[K] introduced by Neuts (see Reference [18]) is useful when multiple types of customers are present, while the model remains analytically tractable [19]. The basic characteristics of MMAP, such as peakedness of the arrival process, the first passage time to the arrival of an item of a specific type, and the behavior of the MMAP during that first passage, are analyzed in Reference [18].
In this paper, we analyze a multi-server queuing-inventory system with K classes of customers served from a single inventory which is managed according to the (s, S) with a positive lead time. Following Reference [11,14,16], we use MMAP[K] to model the arrival process. The servers are class-dependent, each server dedicated to one specific class of customers. Only the highest priority customers are allowed to wait in an infinite buffer. All other class customers can wait in respective finite buffers. The service for each class of customers is carried out with different exponential rates, and the inventory item is consumed at the end of service. A class-specific boundary level in the inventory is defined, causing customers of this specific class to wait for inventory replenishment when this boundary is down crossed. To the best of our knowledge, this model is new.
The structure of the paper is as follows. In Section 2, we give a detailed description of the model. The example of K = 2 is also given for better illustration of the model. In Section 3, an intuitive stability criterion for the system has been derived. Section 4 analyzes the steady state of the system and expresses a few important performance measures. In Section 5, a detailed analysis of the inventory recycle time has been carried out. Numerical illustrations are provided in Section 6, in which an optimization problem of practical importance has also been stated.

Model Description
We consider a multi-server queuing inventory system with heterogeneous customers. A K-server station provides service to K classes of customers. Arrivals occur according to the Marked Markovian Arrival Process (MMAP) driven by an irreducible continuous time Markov chain (CTMC) {Z(t)} t≥0 with finite state space W. Let |W | = W. The sojourn time in each state z ∈ W is distributed exponentially with rate σ z and d.f.
At sojourn time expiration epoch, with probability P (k) (z,ẑ) the process Z(t) moves from the state z toẑ, generating a class k customer arrival, z,ẑ ∈ W, k = 0, 1, ..., K (conventionally, no customer arrival is generated if k = 0). Note that, for any z ∈ W, Thus, the MMAP is characterized by a set of K + 1 transition rate square matrices D 0 , . . . , D K of order W defined as follows: where 1 k=0 is the indicator of a non-random condition k = 0. The matrices D k constitute a generator matrix D of the Markov process {Z(t)} t≥0 such that Recall that De = 0, which also follows from (1) and (2). Hereinafter, e (0, respectively) is the vector of ones (zeroes), and, if necessary, we designate the row vector with a transpose sign, e.g., 0 . As such, denoting by θ the stochastic invariant vector of D (i.e., θD = 0, that is, θ is the distribution of the corresponding Markov jump process governed by D), the class k arrival rate, λ k , equals λ k = θD k e.
We would like to note that the MMAP process being a versatile Markov process, is suitable for a wide range of applications due to variety of modeling features including correlated arrivals [20]. However, capturing sophisticated features, such as Long-Range Dependence (slow autocorrelation decay of the process that complicates application of the standard methods of performance estimation), may require infinite number of sources [21]. As such, it is necessary to balance the size W of the MMAP state space with practical capabilities of the model, including computational/storage capacity. Useful examples of MMAP processes may be found in Reference [19].
If the server k is busy, then the arriving class k customer can wait in a buffer space of capacity l k , where l 1 = ∞ and l k < ∞, k = 2, . . . , K. The class k customer, finding the respective buffer completely filled on arrival, leaves the system forever.
The inventory is divided into K segments, Class k customer spends an exponentially distributed, with rate µ k , time at the (classspecific) server k = 1, . . . , K. At service completion epoch, one item from a single common inventory is consumed by the customer. However, class k customers are served only if the inventory level exceeds s k−1 , k = 1, . . . , K. Thus, s is not only inventory replenishment boundary, but also the critical level, at or below which only the class 1 customers ( highest priority) are served. If the inventory level at service completion epoch is insufficient, the corresponding customer of class k repeats an independent service time with the same rate, µ k , until a service completion coincides with a sufficient inventory level.
The inventory is replenished under (s, S) policy in an exponentially distributed lead time with rate γ. That is, when the inventory hits level s, a request for replenishment is issued, and the inventory level S is restored after an independent exponentially distributed random time with rate γ. Finally, we note that the arrival process, service times and replenishment times are independent. The structure of the system is presented on Figure 1.
This gives a very special structure of the infinitesimal generator matrix of the considered process {ζ(t)} t≥0 . First, since the component N 1 (t) corresponds to the unbounded buffer, while other components of ζ(t) are finite, and due to the fact that the transitions of N 1 (t) are skip-free in both directions (i.e., it may be incremented/decremented by at most one), the process ζ(t) is the so-called Quasi-Birth-Death (QBD) process with level N 1 (t) and phase (N 2 (t), . . . , N K (t), I(t), Z(t)). Lexicographically ordered state space allows one to write the generator matrix in block-tridiagonal form where the matrix O is a zero block of corresponding dimension (we give the dimension explicitly if and when necessary). Define an integer-valued function of two arguments, i, j, such that 2 ≤ i ≤ j ≤ K: and define α(i, j) = 0 otherwise. Then, the blocks A i , i = 0, 1, 2, and A 1 are square matrices having size α(2, K)(S + 1)W.
The matrix A 0 consists of the rates of transitions corresponding to the arrival of class 1 customer, A 2 keeps the transition rates corresponding to departure of a class 1 customer, while A 1 is related to 2K remaining possible transitions, such that the level remains unaffected. It is quite straightforward to define A 0 , since, upon arrival of a class 1 customer, only the MMAP phase is switched; thus, where ⊗ is the Kronecker product and I is the identity matrix of corresponding dimension.
We will need the following notation: hereafter e k i:j is the (column) vector of dimension k ≥ 0 with ith to jth components equal 1, and zero otherwise, i ≤ j ≤ k (conventionally we take e k i:j ≡ 1 if k = 0). To shorten the notation, we use for any i ≤ k • e k i ≡ e k i:i (a single non-zero component at ith row, dimension k); • e k ≡ e k k:k (last non-zero component, dimension k); • e k ≡ e k 1:k (all non-zero components of dimension k). To define A 2 and for further use, we need to construct some auxiliary matrices. Define for j = 1, . . . , K the following square matrix of order S + 1: where the matrix L j has non-zero entries below main diagonal only for rows s j−1 + 2, . . . , S + 1, and the zero vectors are of size S. The block matrix L j corresponds to possible transitions of the inventory for class j customer, and is indexed from 0 to S. In particular, for class 1 customers the matrix L 1 has the lower diagonal of ones (there is no constraint for the inventory to be decremented). Since, upon departure of class 1 customer, the level, as well as the inventory, are decremented by one, the matrix A 2 has the following form: Now, to define A 1 , which contains the transition rates related to arrivals and departures of class j customers, j ≥ 2, inventory replenishment and MMAP phase change, we need auxiliary matrices of an increment/decrement in the corresponding (phase) component: is a square matrix of order l j + 1 having lower diagonal of ones, while is semi-upper diagonal. This asymmetry will be explained below. Using these constructions, it is rather straightforward to define the transition rates of all possible transitions constituting the matrix A 1 : Conventionally, in (10) and (11), we define the zero-size identity matrix I 0 = 1. Note that (10) corresponds to arrival of class j customer, (11) is the corresponding class departure, (12) is a replenishment (where in fact the corresponding vector product gives a matrix with only last column being non-zero), while (13) is the MMAP phase change. It is worth noting that asymmetry in N (+) j (non-zero last row), defined in (8), is used in (10) to indicate an arrival of a class j customer that is lost, while the MMAP phase is changed according to D j . The matrix ∆ is a diagonal matrix that guarantees (A 0 + A 1 + A 2 )e = 0; thus, Straightforward algebra allows to obtain ∆, the diagonal matrix of dimension α(2, K)(S + 1)W, from (14) in a closed form as follows: where δ (a) is the vector of transition rates due to departure of customers of classes 2, . . . , K, δ (d) contains transition rates due to a class 1 customer departure, and δ (b) is the vector of transition rates due to replenishment given as follows: It remains to define the block A (0) 1 corresponding to possible transitions of the model from within the states having zero class 1 customers. Note that such a matrix is very similar to the matrix A 1 , and the only difference is in the diagonal balancing matrix (14). Indeed, from the condition (A (0) The matrix A 1 is then defined as follows: where it follows from (14) and (18) We note that the definitions of subblocks may be rewritten in a more compact form using Kronecker sums. Moreover, by defining unbounded analogs of the matrices given in (8), the generator matrix itself may be rewritten similarly. However, we skip this possibility to keep parsimony of the notation.
To illustrate (3), we consider the case K = 2. Recall that class 1 customers have priority over class 2 customers. Class 1 customers are allowed to wait in an infinite buffer, whereas class 2 customers cannot enter the system if there are l 2 class 2 customers already in the system. Server 1 serves class 1 customers with rate µ 1 and server 2 serves class 2 customers with rate µ 2 . Customers are served with an inventory from a common source running according to (s, S) policy with exponential lead time (rate γ). Even if the server 2 is free, class 2 customers will be served only if the inventory level is at least s + 1.
We consider the CTMC Corresponding to each level n 1 , there will be (l 2 + 1)(S + 1)W phase states. The infinitesimal generator matrix of ζ t is given by (3), where the non zero blocks 1 , A 0 , A 1 , and A 2 are of size (l 2 + 1)(S + 1)W and have the following forms: where L 1 is a matrix of order S + 1 with a non-zero lower diagonal, defined as The subblocks A 1 , A 1 also have the block-tridiagonal structure with l 2 + 1 blocks over the main diagonal indexed by the number of class 2 customers: . . , 4, of order (S + 1)W, are given below.

A
(1) where L 2 has non-zero entries in the lower diagonal from s + 2nd row onward, It remains to define A (i) 1 , i = 1, . . . , 4. These matrices have semi-block-diagonal structure with main diagonal and last column containing S + 1, possibly non-zero, blocks, indexed by the number of inventory items available, each block being a square matrix of order W. Indeed, since the departures of class 1 customers are not possible from level 0, where, from s + 1st row onward, the diagonal elements are D 0 (only MMAP phase change is possible). Similarly, since A 1 corresponds to states with positive number of class 2 customers, Since A 1 corresponds to positive levels of the process ζ t , i.e., the number of class 1 customers is positive, S+1 is a square matrix of order S + 1 consisting of a zero column, zero row, and identity matrix as follows: Indeed, the matrices A 1 and A 1 correspond to positive number of class 1 customers; thus, their diagonals include the service rates of class 1 customers, except the case of the empty inventory. As a final note, the last diagonal blocks in matrices, A (1) 1 and A (3) 1 , correspond to arrivals of class 2 customers that are lost; hence,

Stability Condition
The necessary and sufficient condition for existence of the non-zero steady-state probability is the specific version of Foster ergodicity condition known as the Neuts ergodicity criterion [22], where the stochastic vector π is the solution of the system and Note that, due to the properties of A i , i = 0, 1, 2, the matrix A is a generator matrix of a finite state space CTMC giving the projection of the phase transition at high levels; thus, the vector π may be considered as the steady-state probability of the phase at high levels [22]. It now follows from (5), (7), and (9) that A has block-tridiagonal structure indexed by the number of class-2 customers as follows: The blocks R 1 , R 1 , and R i , i = 0, 1, 2, are square matrices, and their structure follows from (5), (7), and (9). Indeed, where ⊕ is the Kronecker sum defined as A ⊕ B = I ⊗ B + A ⊗ I; ∆ being a diagonal matrix that guarantees (R 0 + R 1 + R 2 )e = 0, and obvious convention ∑ K j=3 = 0 for K < 3 is used. It remains to note that, since R , while the matrix R 1 = R 1 + R 0 includes the lost arrivals of class-2 customers. Recall that the losses of class 2 customers appear here due to the specific indexing of the blocks of matrix A, and corresponding losses of classes 3, . . . , K are encoded in the components of matrix R 1 . It is worthwhile to note that, despite the expected independence of the MMAP component evolution on the system state, the changes in the phase occur upon arrival of customers of classes 2, . . . , K, both in the corresponding queue size and in MMAP phase; similar simultaneous changes occur upon departure of such customers (both counter and inventory). In the case K = 2, the subblocks of A have the following clear structure: Note that the Kronecker sums in the phase transition rate matrices appear due to the multidimensional structure of the phase state space. These sums highlight the independent changes of one of the components of the phase vector, and O is used if the corresponding component remains unchanged due to a transition.
To solve the system (21), the following numerically stable algorithm is suggested. Let the row vector π be presented in the form π = π 0 , π 1 , . . . , π l 2 . The vectors π m , m = 0, . . . , l 2 , are considered to have the following form: The matrices U m , m = 0, . . . , l 2 − 1, are obtained from the system (21) using (22), starting from the last column, i.e., calculated using the backward recursion, under the initial condition (following from the last column of the matrix A) Note that, since A is a generator matrix, R 1 is diagonally dominant and, hence, invertible. Finally, the vector π 0 is the unique solution to the system (obtained from the first column of A and the fact that π is stochastic) Now, let us consider the l.h.s. of (20). The vector π is the steady-state probability vector of the finite state space CTMC defined by the matrix A and having the state space E = {(n 2 , . . . , n K , i, z), n k ∈ {0, . . . , l k }, k = 2, . . . , K, i ∈ {0, . . . , S}, z ∈ W }.
However, it is possible to shrink the state space E into the following subsets defined for all z ∈ W: E z = {(n 2 , . . . , n K , i, z 0 ) ∈ E : z 0 = z}. Now, we show that, for each state from E z , the transition rate to the set Eẑ (i.e., the sum of corresponding transition rates to individual states) equals (D) zẑ . Indeed, to obtain this transtion rate, the matrix A needs to be multiplied by the matrix that would sum up the corresponding transition rates, that is, e α(2,K)(S+1) ⊗ I W . As such, using the fact that, for any matrices B i , i = 1, . . . , 4, the following equation holds good, it can be obtained from (5) that (27) It follows from (7) and (17) that Finally, it follows from (9), (15), (16), after some algebra, that Then, it follows from (27)-(29) that Thus, the subsets E z , z ∈ W, are the states of a CTMC defined by the matrix D. Now, we note that π A(e α(2,K)(S+1) ⊗ I W ) = π(e α(2,K)(S+1) ⊗ I)D; hence, θ = π(e α(2,K)(S+1) ⊗ I). Finally, from (5), obtain Now, taking into account (7), (22), (24), and using the property (26) for the unit vector e, the r.h.s. of the stability condition (20) becomes Finally, (20) becomes Note that the stability condition (30) has a nice interpretation, since λ 1 is the upward drift, while the sum in brackets is the mean downward drift of class 1 customers at high levels obtained by aggregation of the components phase probabilities vector π corresponding to the states with positive departure probability of class 1 customers.
Since the matrix Q defines a level-independent QBD, the soluton can be obtained in matrix-geometric form [22]. Indeed, splitting the vector q into finite vectors q 0 , q 1 , . . . by the (value of the) first coordinate, we assume that The matrix R is the so-called rate matrix, being the minimal non-negative solution of the matrix quadratic equation (which follows from (31) using the block-tridiagonal structure of Q and (32)), The boundary conditions for obtaining q 0 follow from the first block-column of Q and are [23] q 0 (A (0) However, in order to avoid numerical difficulties, it is recommended to use the alternative matrix quadratic equation for the substochastic matrix G being the minimal non-negative solution of the system obtain the R matrix by the known relation [23] and, finally, use (32) to obtain the vector q.
After obtaining the steady-state probability vector, it is straightforward to define the steady-state performance measures of interest. To do so, we use an auxiliary vector j (n) of size n + 1 containing the sequence (0, . . . , n), i.e., j (n) = (0, . . . , n).
Average number of class 1 customers in the system: Average number of class k = 2, . . . , K customers in the system: Average inventory size in the system: Replenishment rate, where δ (b) is given in (16), Class k loss rate, k = 2, . . . , K: We note that the average number of queued customers can be obtained by the corresponding formula for the average number of customers in the system by replacing i to (i − 1). Finally, the corresponding sojourn times of class k ≥ 1 customer are obtained by Little's law as follows: whereλ k is the effective arrival rate of class k, i.e.,λ 1 = λ 1 and

Analysis of Inventory Recycle Time
Starting with inventory level S, the time taken to hit the level S again is called inventory cycle time, say, Γ. The distribution of inventory cycle time depends on the number of customers of all classes in the system and the phase of the arrival process. However, we note that, since the replenishment happens only at the event when the inventory hits the level s, the inventory cycle is essentially the lead time plus the time it takes to reach s from S by the steps decreasing the inventory, made by the process ζ(t). It is clear that such a process is a time until absorption of the process, where the absorption happens at the level s. It is known that such a time can be modeled by a phase-type distribution (for details on this type of distributions, see Reference [24]), and below we obtain the parameters of such a distribution. Indeed, consider the inventory level I(t) = S. If N k (t) ≥ S − s k−1 , then the class k customers present in the system are capable of consuming all the inventory allowed for such a class, either before stopping service for this specific class at the boundary level s k−1 , or before hitting the absorbing inventory state I(t) = s, and in such case the time to absorption does not depend on the arrivals of class k customers after time t. Otherwise, arrivals of class k customers can be tracked until the condition N k (t) ≥ S − s k−1 is met (if this happens before absorption). Moreover, if the level N k (t) hits the value S − s k−1 from below, the departures of class k customers need not be tracked, but instead, only the inventory decreasing process should take into account the corresponding rate of the class k customer service, µ k . Thus, the time it takes I(t) to reach s (time to absorption) can be modeled by a finite state space absorbing CTMC, which is essentially a restriction of ζ(t) to the set This means that the time to reach inventory level s has a phase-type distribution PH(β, T), where β is the initial distribution of the restricted chain (N 1 (t), . . . ,N k (t),Î(t), Z(t)) at time 0, and T is the transition matrix which follows from Q and the aforementioned restrictions. In particular, it is straightforward to define the transition rate matrix T as follows.
Note that T is a block-tridiagonal finite matrix, with S − s + 1 blocks on the main diagonal, where the first block corresponds to states with N 1 (t) = 0, while the last block is for N 1 (t) = S − s, respectively. The rows corresponding to the last block have zeroes except the block on the main diagonal, since upon reaching the level N 1 (t) = S − s, neither the departures, nor the arrivals, of class 1 customers are tracked, but the rate µ 1 is taken into account in the value of the inventory decreasing rate. Moreover, the inventory states become in fact re-numbered so as to have S − s states starting from inventory level s + 1, numbered sequentially, that is, compared to the original inventory component and the absorption happens whenÎ(t) makes a transition to 0. The size of the blocks iŝ andα(i, j) = 0 otherwise. It is straightforward to see and where L j is obtained from the matrix L j defined in (6) by reducing the latter to rows and columns from s j−1 + 2 to S + 1, i.e., where the matrix J a,b removes the first a rows, 1 ≤ a ≤ b, that is, with an exception for L 1 defined as Let us restrict the matrices N where the S + 1-dimensional vector e S+1 is zero vector except the last component equal to one, while e S+1 is the S + 1-dimensional vector of ones. Now, we are ready to define the matrix B 1 as follows: Note that the diagonal matrix∆ holds the exit rates from states which do not lead to absorption, that is, transitions from inventory levelsÎ(t) = 2, . . . , S − s according to enumeration (43). As such, this matrix can be defined explicitly as follows: Similarly to (19), the matrix B Finally, we need to define the matrix B 1 corresponding to the boundary states. Since at the level (number of class 1 customers) S − s, neither arrivals nor departures of the class 1 customers are tracked, It remains to denote t 0 = −Te as the corresponding absorption rate vector, and note that the initial state probability vector should be taken so as to have initial inventory equal to S, that is, P(I(0) = S) = 1.
Note that the absorption happens if the transition is made from the inventory level i = 1 downward. In particular, this means that, since s 2 > s, the absorption happens only due to a transition caused by service completion of either class 1, or class 2 customer.
It is well known that time to absorption, say X, of an absorbing CTMC defined by a subgenerator T and initial state probability vector β has a phase-type distribution with mean [24] EX = β(−T) −1 e.
Thus, to define the mean inventory cycle time, EΓ, it only remains to convert the steady-state probability vector q obtained in (32) into the initial state probability vector for the corresponding phase-type distribution defined by the matrix (42).
We summarize our findings in the following Lemma. The vector q is zero elsewhere.
It remains to note that the expectation is taken so as to summarize the resulting times by the appropriate steady-state probabilities involving the inventory level S, and the term 1 γ is added to emphasize the exponentially distributed lead time.

Stability Condition Parametric Sensitivity
In this section we illustrate the sensitivity of the stability condition (30) on the parameters. In particular, we take K = 2, fix the MMAP arrival process, inventory size S and queue size for the second class customers, l 2 . We then vary the rates µ 1 , µ 2 and γ for several values of replenishment level s, ceteris paribus, and plot the resulting dependency. In the experiment we use the following default settings: S = 100, s = 7, l 2 = 15, µ 1 = 7, µ 2 = 15, γ = 10.
The MMAP arrival process with W = 2 is driven by the following matrices: Thus, the arrival rate of class 1 customers may be calculated as λ 1 ≈ 4.867.
In the first experiment we consider ρ defined in (30) to be the function of µ 2 ∈ [2, 15] for values s = 5, 6, 7. The resulting curves depict ρ s (µ 2 ) versus µ 2 on Figure 2 (top). While in absolute values the variability of ρ is rather small, a non-linear dependency is clearly visible, and the dependency on s is motivated by the relatively higher impact of the second class customers on the system load for smaller values of s. Counter-intuitively though, the load increases (slightly) with an increasing service rate of the second class customers.
To investigate this interesting effect, we consider the probability P(N 2 = i) for two boundary values, i = 0, l 2 , that is, the probability that second class customer queue is empty or full. We plot the corresponding estimates within the setup of the first experiment on Figure 3. It can be observed that, for smaller values of µ 2 (that is, for larger mean service times of class 2 customers), the queue is mostly overloaded (with high probability the queue is full), which causes relatively high loss of class 2 customers. In contrast, for µ 2 > 7, the probability of an empty class 2 queue overtakes the probability of a queue completely occupied, and the former is increasing, while the latter is decreasing with increasing µ 2 .
We note that, in the experiment, we used only s = 7 since, as numerical results show, the effect of s on empty/full probability of class 2 customers is negligible for s = 5, 6, 7.  In the second experiment, we fix µ 2 = 15 and consider ρ defined in (30) to be the function of γ ∈ [2, 10] for values s = 5, 6, 7. The resulting curves depict ρ s (γ) versus γ on Figure 2 (bottom). In this experiment, the non-linear dependency on γ follows the intuition: higher replenishment rate decreases the customer's queuing time and results in a lower system load.
Finally, we fix µ 2 = 15, γ = 10 and consider ρ defined in (30) to be the function of µ 1 ∈ [7,12]. However, since the dependency on s is rather weak, we take more contrast values s = 1 and s = 50. The resulting curves depict ρ s (µ 1 ) versus µ 1 on Figure 4. As expected, increasing µ 1 , and, hence, decreasing the service time of class 1 customers, causes a dramatic decrease of the system load. The additional load caused by a "lazy" replenishment at the level s = 1 is also visible.

Steady-State Performance Sensitivity
In this section, we illustrate the sensitivity of the performance measures (37)-(41) described in Section 4 to the management parameter γ, that is, the lead time intensity. To do so, we take K = 2 and slightly modify the parameters used in the previous section, so as to make computations more convenient. Namely, we take S = 50, s = 5, l 2 = 10, µ 1 = 7, µ 2 = 15.
We use the same MMAP defined in (49). We vary γ = 1, . . . , 10 and obtain the performance measures (37)-(41) for given γ, ceteris paribus. Finally, we depict the obtained numerical results. Figure 5 (top) describes the dependency of the mean number of class 1 and class 2 customers given in (37) and (38), on the replenishment rate γ. A decreasing pattern with increasing γ is caused by decreasing load ρ given in (30); see Figure 5 (bottom).     Finally, Figure 7 describes the dependency of the class 2 customer loss rate L 2 defined in (41), on γ = 1, . . . , 10. As expected, with increasing γ the customer loss rate is decreasing.

Total Cost Optimization
Based on the above performance measures, we obtain expected total cost per unit time in the considered system as follows: where the non-negative coefficients are defined on a per system|customer|item per time unit basis: • c I is the holding cost of inventory (per item per unit time), • c L k is the cost due to a class k customer loss (per system per unit time), • c R is the reorder cost (per system per unit time), • c w k is the waiting cost of class k customer (per customer per unit time), k ≤ K.
Using the same 2-class system with same parameters as in Section 6.2, that is, and MMAP defined in (49), we perform a numerical exploration of sensitivity of the cost function E TC given in (50) on the parameter c I . This is motivated as follows. Since the per-customer measures are decreasing, while the per-system measures are increasing, there should be a trade-off between keeping a high inventory level and compromising some second-class customer losses. This is regulated by the inventory holding cost parameter c I . For the experiment, we take c I = 1, 5, 10 and vary γ = 1, . . . , 10, as in the previous section. We depict the resulting curves on Figure 8 and observe clearly that the optimal (minimal) cost shifts to lower values of γ with increasing c I , which is intuitive.

Conclusions
We have analyzed a queuing inventory system with correlated arrivals of heterogeneous customers. The general case of K types of customers was investigated, where each class of customers is assigned some priority according to which the inventory access is managed. We illustrated the general system and a special case of K = 2. An intuitive stability condition was derived and the steady state probability vector was obtained. Key performance measures of the system were obtained. The inventory cycle time was analyzed and it was shown that the inventory cycle follows a phase type distribution.
The large scale models have computational difficulties due to a large state space (while the matrices are sparse), so we cannot rely on numerical investigation only. Thus, we performed numerical experiments only to highlight interesting features of the model, and an interesting cost optimization problem has been stated which could be studied in future for practical applications.
As future directions of investigation we would like to point out the following possibilities. First, it is relatively easy to incorporate customer impatience (which will cause additional flow of customers towards decreasing the level and the first K components of the phase in the corresponding QBD process) and balking upon arrival finding the server busy (which modifies the rates related to corresponding class customer arrivals). Phasetype service times and replenishment times can also be incorporated leaving the model mathematically tractable. At the same time, the inventory replenishment discipline can be modified into booking-type, where the inventory items are booked for specific customers upon arrival; thus, arriving customers are rejected if no items are available for booking. Comparison of this type of model with the one analyzed in the present paper is one of the promising directions for future research.
Thinking beyond simple extensions of the model, the retrial queues instead of classical queues can be incorporated, while the stability part most likely can be extended towards a regenerative input. Finally, it might be interesting to consider the inventory with common lifetime of items and priorities related to item "freshness".
It might be also interesting to consider the model in transient regime, according to possible applications in social systems. A few methods are available for this type of analysis of QBD processes [25][26][27]. However, this might require inverting Laplace transforms which might cause numerical instability. In this regard, we refer to a recent work, Reference [28], where a novel method of numerically effective Laplace transform inversion is presented.