Optimal Open-Loop Routing and Threshold-Based Allocation in TWO Parallel QUEUEING Systems with Heterogeneous Servers

: In this paper, we study the problem of optimal routing for the pair of two-server heterogeneous queues operating in parallel and subsequent optimal allocation of customers between the servers in each queue. Heterogeneity implies different servers in terms of speed of service. An open-loop control assumes the static resource allocation when a router has no information about the state of the system. We discuss here the algorithm to calculate the optimal routing policy based on specially constructed Markov-modulated Poisson processes. As an alternative static policy, we consider an optimal Bernoulli splitting which prescribes the optimal allocation probabilities. Then, we show that the optimal allocation policy between the servers within each queue is of threshold type with threshold levels depending on the queue length and phase of an arrival process. This dependence can be neglected by using a heuristic threshold policy. A number of illustrative examples show interesting properties of the systems operating under the introduced policies and their performance characteristics.


Introduction
The very rapidly spreading and upgrading of telecommunications and computer technology has led to combination and simultaneous maintenance of inter-generational systems. Various hybrid and heterogeneous service systems, where the servers are differentiated by their service rates, usage costs or reliability attributes, are being created to ensure proper quality and reliability of service while wishing to ensure energy efficiency and generally limit the amount of resources used. In many practical service applications it has become necessary to provide stochastic modeling by means of queueing systems consisting of individual queues and operating in parallel. Such systems often face the challenge of optimal routing between parallel queues which can be either dynamic (also called closed-loop policy) or static (open-loop policy). In the first case, the router dynamically receives information about the state of the system, while in the second case, the routing policy is based only on the information available at the initial point in time.
Most of papers dedicated to the parallel queueing systems deal with a dynamic routing. The homogeneous case, where each parallel queue is supplied with only one server and all of them are identical, was studied in [1]. It was shown that the shortest queue (SQ) policy is optimal for the average cost criterion. A similar heterogeneous server model was analyzed in [2]. The authors proved the result that the policy to use the faster server with a shorter queue minimizes the expected cost. In [3], a dynamic programming approach was used to find an optimal routing policy with the aim to minimize the expected total discounted cost. It was shown that the optimal policy has a switch structure. The optimal routing problems in finite-source parallel queueing system were presented in [4,5]. A classical routing problem in exponential system with K parallel heterogeneous servers was studied in [6] with respect to the SQ policy which, in general, eliminate the service rates and cost structure and to the shortest expected delay (SED) routing which takes these attributes into account. Systems where each of the parallel queues is supplied with a multi-server service area were considered in [7], where a static policy in form of the optimal Bernoulli splitting, and three alternative dynamic index policies were considered. Obviously, when system information is available to the router during the decision-making process, it can respond more flexibly to changing system states, which means lower values for the optimised mean cost function as a result. Unfortunately, in certain cases, the transmission of such information to the router is not always reliable and control errors can occur taking into account possible losses and delays. Alternatively an open-loop control policy can be used instead. Ref. [8] considered the problem of routing N arrivals to M queues in parallel without information about system states and proposed two policy iteration algorithms. The optimal routing for a static policy based on a routing words was investigated in [9]. The authors proposed some heuristics used to approximate the optimal policy.
The main aim of the paper is to study the open-loop control problem with an application to parallel queues, each of them has two heterogeneous servers operating at different service rates. In contrast to the previous research, this model is new one, since the optimal routing problem must be solved simultaneously both for the routing between the queues and for the allocation between the servers within each queue. Furthermore, the control policy for the routing between queues must be of an open-loop type, i.e., be independent of the state changing, and the control allocation policy for the servers should depend only on the number of customers in a certain queue. Our analysis of the proposed system includes the following contributions. We define the arrival process to each queue as a special case of a Markov-modulated Poisson process (MMPP) with l phases, where the transition within a set of the first k phases in the first queue and l − k phases in the second queue, 1 ≤ k ≤ l, accompany the arrival of a new customer. For the fixed parameters l and k we formulate the Markov decision problem to calculate the optimal control policy for the allocation of customers between the servers to minimize the long-run average cost per unit of time. Then, the model is optimized with respect to parameters k and l. We show that the dynamic-programming value function satisfies certain monotonicity properties indicating that the optimal allocation policy is of the threshold type with threshold levels depending on the queue length and the phase of the arrival process. Further, we propose a heuristic threshold policy depending only on the queue length and derive a matrix-analytic solution for the mean number of customers in the system. As an alternative model with a static policy we consider an optimal Bernoulli splitting policy and obtain a closed-form solution for the stationary state probabilities and the optimized functional.
The rest of the paper is organized as follows. Section 2 formally describes the queueing model where the optimal routing between the queues is formulated as an optimal number of phases in MMPP arrival process connected with arrivals and provides the MDP formulation for the optimal allocation between servers for a certain queue. Section 3 presents the results required to show the optimality of a threshold allocation policy and a corresponding matrix-analytic solution under a heuristic threshold policy. The Bernoulli splitting policy is discussed in Section 4. A brief review for the proposed results is given in Section 5.
For use in sequel, let e(n), e j (n) and I n denote, respectively, the n-dimensional unit vector, the unit vector with each element equal to zero except jth unity element, an identity matrix of dimension n. When there is no need to specify the dimension of these vectors we will omit the corresponding argument. The notation appearing in a vector or in a matrix stand for the matrix transpose. Denote further by H i,j the square matrix of the suitable dimension where only the element in the ith row and jth column is 1 and 0 elsewhere. The notation ⊗ will stand for the Kronecker product of two matrices. Denote by S j , j ≥ 1, the shift operators, i.e., S j x = x + e j and S n j x = x + ne j for n ≥ 2 applied to a vector x.

Routing with an Open-Loop Control and MDP Formulation
The queueing system shown in Figure 1 is studied. This system consists of two exponential parallel queues: Q 1 -Queue 1 and Q 2 -Queue 2. The arrival stream forms a Poisson process with a rate λ. Each queue is supplied with two heterogeneous servers operating without pre-emption, i.e., the customer is not able to change the server during a service time. The service rates of the queue Q i are µ 1i and µ 2i with µ 1i > µ 2i . We assume that the first queue has a faster service area as the second one, i.e., µ 11 > µ 12 and µ 21 > µ 22 . The Router 0 allocates the customers between the queues while the Routers 1 and 2 assign the customers between the servers of the corresponding queue. The main task consists both in optimal open-loop routing of customers to parallel queues and in their optimal allocation between heterogeneous servers within each queue. As it is known, see e.g., [10][11][12][13], in an ordinary queueing system with two heterogeneous servers operating without pre-emption, one common waiting line and a Poisson arrival stream the optimal allocation policy has a threshold structure, where the fastest server must be occupied whenever it is idle and slower server is used according to a specified threshold level for the queue length. We have to find out if the optimal policy for the allocation between the servers in the system under study exhibits similar structural properties for non-Poisson arrival stream. In this case we could have a static control not only for the routing between the queues but also for the servers. This section will look at the situation where separate queues receive applications according to the open-loop control when the information for the Router 0 is available at starting time point. We define the routing policy as a periodic binary sequence {0, 1} l with a period length l, where 1 means that the customer is sent to Q 1 and 0-the customer is sent to Q 2 . Note that for the routing policy f 0 in stationary regime only the number k ≤ l of 1s in a binary sequence of length l for one queue matters but not their specific locations. In other words, if k is the number of 1s for the routing to the first queue, then l − k is the number of 1s for the routing to the second queue. Denote by u(i) = (l − 2k)(i − 1) + k, i = 1, 2. For the arbitrary sequence a parallel queueing system Q i behaves as a G i /M i /2 queue with two heterogeneous servers. For a given routing policy f 0 = k the arrival process in the queue Q i can be formulated as a special case of a Markov-modulated Poisson process (MMPP). This arrival process is defined by two square matrices D 0i and D 1i of dimension l, Here, D i = D 0i + D 1i is a generator of the continuous-time Markov chain associated with an arrival process with a state space {1, 2, . . . , l}, i.e., D i e(l) = 0. The matrix D 0i contains transitions without arrivals while D 1i includes transition rates accompanying with a new arrival. The average arrival rateλ i = α i D 1i e(l) = u(i) l λ, where α i = 1 l e (l) is a solution of a linear system α i D i = 0 and α i e(l) = 1. The lag-1 correlation coefficient r i of the inter-arrival times in the queue Q i is computed by Further, in the paper we use the fact that in each queue Q 1 and Q 2 the faster server must always be used as was shown for Poisson arrivals in [11,12] and phase type arrival processes in [14,15]. Therefore, we collapse the state space to three dimensions avoiding a state specification for the faster server. Due to capacity limits and the fact that proof is similar to that in the references above, we skip it. Denote by Q i (t), S i (t), and N i (t), respectively the total number of waiting customers and at the fast server, the number of customers at the second server and the state of an arrival process at time t in a certain queue Q i . The Markov decision process is associated with a continuous-time Markov-chain with a state space given by Denote by q(x), d j (x) and n(x) the corresponding components of the vector state x ∈ E. The decision epochs consists of the moments when the change of an arrival state is accompanied with a real arrival which occurs if 1 ≤ n(x) ≤ k and service completions while the queue is non empty. The action space is A = {0, 1, 2} where as usual a = 0 means to put a customer to the queue and a = 0 means to assign a customer to the ath server and A(x) ⊆ A-the subsets of actions in state x ∈ E. The immediate cost c(x) is the number of customers in state x, i.e., c(x) = q(x) + d 1 (x) + d 2 (x). The optimization problem consists to find the optimal policy f = {k, f u(i) i , i = 1, 2} including the routing policy f 0 = k and the corresponding allocation policies f k 1 and f l−k 2 to minimize the long-run average number of customers in the systemN f = g f , where Consider the ith queue of the original system. The optimality equation of the MDP model for such a queue can be expressed using uniformization in the form where B is a dynamic-programming operator, v i : E → R is a relative value function, and The second and third terms in the right-hand side of (3) describe the changing of the phase in the arrival process, respectively, accompanied with an arrival of a new customer and without an arrival. The fourth and fifth term stand for service completions, respectively, at server 1 and 2. Note that the terms including v i (x) for q(x) = 0 and the term Tv i (S −1 1 x) for q(x) > 0 represent fictitious transitions in accordance with the uniformization procedure, as shown in [16]. In the latter case, we allow the action to be performed which is required for the proof of the monotonicity properties. However, as it was discussed in [12], the optimal policy in a system with decision actions at fictitious transition epochs remains the same as in the original system without such actions. In general case the MDP model is assumed to be countable infinite so the cost in a state can not be bounded. To prove the existence of the stationary average-cost optimal policy and convergence of the policy-iteration algorithm [16,17] in this case it is necessary to use the main theorem proposed in [18] by checking the Assumptions 1, 2, 3, and 3 * .
Further, we want to show that in each queue Q i there exists an optimal threshold policy that depends on the queue length and the state of an arrival process. To complete it, it is required to study some properties of the function v(x). Define a real-valued function v : E → R. We say that v ∈ F 1 if the following monotonicity inequalities hold:

Proposition 1.
If v ∈ F 1 , then for the operator B defined in (3) holds Bv ∈ F 1 .
Proof. The fact that the operator T, as well as B preserves the properties (4) and (5) can be proved in the same way as it has been done in [12,15] for the ordinary infinite population heterogeneous queueing system, therefore the details will be omitted.
Consider now the real valued function v ∈ F 1 which additionally belongs to a set F 2 when the monotonicity properties of increments 2v The superconvexity condition (6) confirms optimality of a threshold-based policy for the usage of second server. To prove it we need to use a supermodularity condition (7) together with a convexity condition (8), which can be obtained by summing (6) and (7).

Proposition 2.
If v ∈ F 2 , then for the operator B defined in (3) holds Bv ∈ F 2 .

Proof. For the operator B according to (3) we obtain
The last inequality holds due to the following arguments. The first term in the righthand side of (9) is obviously non-positive according the definition of the immediate cost c(x). Consider the second term with λ since for a = 1 we obtain the inequality (6) in state S m(x) 3 x and for a = 2 the term is equal to 0. The third term is obviously non-positive due to the condition (6) when replacing x by S m(x) 3 x. For the fourth term of (9) we have two cases, For the fifth term we have three cases, due the condition (8), and, if f (S 1 x) = 2, then according to (7). In case q( (7) and if f (S 1 x) = 2, then we get The condition (7) can be proved similarly to the above condition (6). Details are omitted to conserve space.

Proposition 3.
For the state x = (q, 0, n) in queue Q i there exists thresholds q 2,n,i ∈ N for the activation of slower servers such that for the optimal control action f Proof. The statement follows directly from a convergence of the policy iteration algorithm v i (x) and from the Proposition 2. Example 1. Consider the system with an arrival rate λ = 20 and l = 6 phases of the arrival process. The service rates in Q 1 are equal to: µ 11 = 20, µ 21 = 4 and in Q 2 : µ 12 = 10, µ 22 = 2. The optimal k for the queue Q 1 is 4 and l − k = 2 for the queue Q 2 . The table of optimal control actions f u(i) (x) for the queue Q i , i = 1, 2, are summarized, respectively, in Tables 1 and 2.

Remark 1.
The thresholds q 2,n,i for the queue Q i in general are not equal for different arrival phases n, 1 ≤ n ≤ l. Numerical results show that these thresholds can vary by at most 1. Hence, we can try to use instead one heuristic threshold level which is independent of n. It can be established on the basis of the heuristic solution obtained in [19] for the ordinary queueing system M/M/K, where we replace λ by a corresponding average arrival rateλ i , taking into account the fact that the number q 2,i here is a sum of customers in the queue and at server 1. An alternative heuristic solution can be obtain using such a solution obtained in [20], Both of these solutions are of roughly the same quality.

Example 2.
In case l = 3 and k = 2 for the queues Q 1 the blocks of the matrix Λ (q 2,1 ) are of the form
If the condition (14) is met, there are stationary state probabilities π x = lim t→∞ P[X i (t) = x].

Proposition 6.
The sub-vectors π qi of the stationary state probabilities satisfy the following relations where the matrices M q are defined by The sub-vector π q 2,i i is a unique solution of the system The matrix R is the minimal non-negative solution to the matrix equation Proof. The proof follows directly from the theory of matrix-geometric solutions [21] using a block forward elimination-backward substitution method for the quasi-birth-anddeath processes.
Proposition 7. The mean number of customersN i in the queue Q i is given bȳ Proof. The mean number of customers in the queueing system Q i satisfies the relation By expressing relation (20) in vector-matrix form we obtain the expression where the last geometric sum due to the properties of the matrix R converges to the value (R(I l − R) −1 + (q 2,i + 1)I l )(I l − R) −1 . The final substitution of (15) to the last relation completes the proof.
Thus instead of the policy-iteration algorithm we use the results of Propositions 6 and 7 in form of the vector-matrix relation (19) for the mean valuē to calculate optimal values k, l and q 2,i , i = 1, 2 and provide comparison analysis of an optimal policy with an alternative one.

Bernoulli Splitting Policy
As an alternative model with a heuristic state independent routing we consider a so-called Bernoulli splitting policy (BSP) model. According to this model, the total arrival rate λ is split into rates pλ for the system Q 1 and (1 − p)λ for the system Q 2 . Obviously, the optimal allocation policy for each parallel queue is of threshold type with corresponding threshold levels q 21 and q 22 , where 1 ≤ q 21 , q 22 < ∞. The optimization problem can be formulated then as follows: subject to 0 ≤ p ≤ 1. First of all, we derive the expression for the mean number of customers in heterogeneous queueing system M/M/2 with an arrival rate λ and thresholdbased policy q 2 . This model is obviously a special case of that model discussed in Section 2 for k = l = 1. The infinitesimal matrix of the Markov-chain describing the queueing system with two heterogeneous servers and threshold-based policy q 2 is of the same form as in (12) To calculate the stationary state probabilities, in principle, one can also use the results of Proposition 6 with R = λ µ 1 +µ 2 . However, for the given two-dimensional process the result can be obtained also directly by solving a system of balance difference equations. Proposition 8. The stationary probabilities for the threshold policy q 2 satisfy the relations where ρ = λ µ 1 +µ 2 and r 1 = λ

Proposition 9.
The mean number of customersN is of the form Proof. For the mean number of customers in the system under the threshold policy q 2 we obtainN (q + 1)π (q,1) .
By substituting expressions (22) after some algebra for geometric series we obtain the relation (33).
The results of Propositions 8 and 9 are characteristic for the Bernoulli splitting policy. Using (33) we solve optimization problem (21). As a heuristic solution for the threshold q 2 we use the solutions obtained by (10) and (11) for the optimal open-loop policy. For numerical experiments we replace in these expressionsλ i by λ((1 − 2p)(i − 1) + p) and then optimized numerically the functional

Numerical Analysis
In this section, we provide numerical examples which include a comparative analysis of calculations obtained by means of the policy iteration algorithm, the matrix-analytical method, and expressions obtained for the Bernoulli splitting model.
In a first example we compare the results on the behavior ofN f calculated numerically for the optimal threshold policy (OTP) where threshold levels depend on the phase of the arrival process and using analytic vector-matrix relations for the heuristic threshold policy (HTP) defined in (10). The arrival intensity λ is fixed at λ = 20 and the number of arrival phases l = 6. The Router 0 dispatches the customers according to the optimal open-loop control (OLC) with value k. We fix also heterogeneity factors for the queues Q 1 and Q 2 at level µ 11 /µ 21 = µ 12 /µ 22 = 5. The following cases are under study: • Case 1: µ 11 = 15, µ 21 = 3; • Case 2: µ 11 = 20, µ 21 = 4; • Case 3: µ 11 = 25, µ 21 = 5; • Case 4: µ 11 = 30, µ 21 = 6. Note that the main conclusions about behavior of the average number of customers in the system drawn from the proposed experiments at given cases are also valid for other values of service intensities and heterogeneity factors, provided that the total intensity in queue Q 1 is higher than in queue Q 2 .  In Figure 2, we display the mean valuesN OLP-OTP andN OLP-HTP as λ varies in the interval [1,29], the service rates µ 11 and µ 21 of the queue Q 1 vary according to specified cases, for the queue Q 2 we set µ 12 = 10, µ 22 = 2 in a figure labeled by (a) and µ 12 = 12, µ 22 = 2.4 in a figure labeled by (b). As to be expected, the total mean number of customers in the system increases monotonously with increasing values of the traffic intensity and decreasing of service rates. The curves in most cases are graphically indistinguishable in the displayed domain. Only in some cases of high load there is a slight difference in the curves and in general the optimal policy has a very little advantage over the heuristic one. Thus, for different system parameters the influence of individual states of the arrival process can be neglected, so that the Routers 1 and 2 needs only the information q 2,i .
The optimal open-loop control policy k for different system parameters are illustrated the following tables based on the results of the previous experiment. In Table 3, we take for the queue Q 2 the service rates µ 12 = 10 and µ 22 = 2 and, in Table 4, the service rates of Q 2 are equal to µ 12 = 12 and µ 22 = 2.4. The maximal possible number of arrival phases is as before fixed to l = 6. From these Tables, we may remark that in light-traffic case, e.g., when λ < µ 11 − µ 12 , the Router 0 uses only the queue Q 1 and the optimal policy k = l = 6. As λ increases, the incentive to make an assignment to the slower queue Q 2 is rising higher. When the difference in service rate between the two queues decreases, Router 0 starts sending more customers to the slower queue Q 2 at lower traffic intensity values. For example, if µ 11 = 15, µ 21 = 3, in a heavy traffic case, λ > 17, the optimal policy is equal to k = 4 when µ 12 = 10, µ 22 = 2, i.e., only 1/3 of customers must be sent to Q 2 , while k = 3 when µ 12 = 12, µ 22 = 2.4, i.e., 1/2 of customers must be dispatched to queue Q 2 .  It is interesting to show the influence of the number of the phases l of the arrival process. This is done in Figures 3 and 4, respectively, for the queue Q 1 and Q 2 , where in figures labeled by (a) we illustrate the mean number of customers in each queue operating under the HTP and by (b) a 1-lag correlation with l = 3, 6, 9. The arrival rate varies in the interval [1,29], the service rates in Q 1 and Q 2 are fixed at µ 11 = 20, µ 21 = 4, and µ 12 = 10, µ 22 = 2. We observe that the mean number of customersN OLP-HTP i in each queue i changes with jumps as λ increases. Jumps occur where policy k changes, as can be clearly seen by the behavior of the correlation function r i . Moreover, the functionN OLP-HTP 1 increases non-monotonically, although the functionN OLP-HTP =N OLP-HTP 1 +N OLP-HTP 2 has due to previous experiments a monotonic structure by increasing λ and the choice l = 3 for the original system and specified service rates seems to be the better one.
We can also compare the performance of the optimal open-loop policy (OLP) k and the Bernoulli-splitting policy (BSP). For the allocation of customers between the servers the heuristic threshold policy (HTP) is chosen. In Figure 5, we consider the mean valuesN f when λ increases, service rates of Q 1 change according to specified above cases, service rates of Q 2 are equal to µ 12 = 10, µ 22 = 2 in a figure labeled by (a) and µ 12 = 12, µ 22 = 2.4 in a figure labeled by (b). We see that the OLP outperforms the BSP despite the fact that under the OLP the arrival stream has correlated inter-arrival times, and as it is known, correlation in Markovian arrival processes often leads to a significant increase in the mean number of customers in the system. such an advantage is drastically reduced in a lighttraffic and heavy-traffic cases. The advantage of the OLP in performance metricN f can be up to more than 25% for certain values of system parameter values compared to the heuristic BSP.  (a) (b)

Conclusions
In this research, we provide the model to calculate the optimal open-loop routing policy and corresponding optimal allocation policies for the servers in two heterogeneous queues operating in parallel. It is show that the routing policy influences the arrival process of the specific queue. It is reflected in the fact that the optimal threshold policy to dispatch the customers between servers depends both on the number of waiting customers and on the phase of the arrival process. For the heuristic threshold policy a matrix-analytic solution is obtained. The quality of such a solution is compared with an alternative static policy defined by the optimal Bernoulli splitting. The optimal open-loop policy outperforms the probabilistic heuristic policy and advantage in a mean sojourn time can be more than 25%. This topic can be developed further to provide an exhaustive performance analysis, including not only the calculation of average characteristics, but also stationary waiting time distributions for a given control policy. A comparison of the proposed routing policy with an optimal dynamic closed-loop policy is also a topic for subsequent research.