1. Introduction
Many papers have been published on the optimal control of queueing systems. Often the authors assume that it is possible to control the service rate; see, for example, Laxmi et al. [1], Chen et al. [2] and Tian et al. [3]. In other papers, the aim is to control the arrival of customers into the system; see Wu et al. [4]. Sometimes, it is assumed that there are different types of customers; see Wen et al. [5] and Dudin et al. [6]. Another problem that is often considered is to determine the optimal number of servers in the system; see Asadzadeh et al. [7].
In this paper, we consider a single-server queueing model in discrete time. Orders arrive one at a time. We assume that the time needed for an order to arrive is a random variable $A$ that follows a geometric distribution with parameter $\lambda \in (0,1)$, so that
\[ P[A = k] = \lambda (1-\lambda)^{k-1} \quad \text{for } k = 1, 2, \ldots \]
Moreover, the service time is a random variable $S$ having a geometric distribution with parameter $\mu \in (0,1)$.
There are two competing companies. In equilibrium, orders arrive at company $i$ according to a random variable $A_i$ having a geometric distribution with parameter $\lambda_i$, for $i = 1, 2$.
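For readers who wish to experiment with the model, the geometric distributions above are directly available in NumPy; the following minimal sketch (with arbitrary parameter values, not those used later in the paper) checks the sample means against the theoretical means $1/\lambda$ and $1/\mu$.

```python
import numpy as np

rng = np.random.default_rng(0)
lam, mu = 0.3, 0.4  # arbitrary illustrative values of lambda and mu

# numpy's geometric law has exactly the pmf above:
# P[A = k] = lam * (1 - lam)**(k - 1), for k = 1, 2, ...
A = rng.geometric(lam, size=100_000)
S = rng.geometric(mu, size=100_000)

print(A.mean(), 1 / lam)  # sample mean of A vs. theoretical mean 1/lambda
print(S.mean(), 1 / mu)   # sample mean of S vs. theoretical mean 1/mu
```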
Instead of solving an optimal control problem, our aim in this paper is to present a stochastic dynamic game that can serve as a model for the behavior of companies competing for orders. We must, of course, make simplifying assumptions in order to obtain a mathematically tractable problem. However, we believe that the model is realistic enough to be useful.
Let $X_n$ be the number of orders being processed by the first company at time $n$, and let
\[ T(x) := \min\{ n > 0 : X_n = 0 \ \text{or} \ X_n = B \}, \]
where $X_0 = x \in \{1, \ldots, B-1\}$, for some fixed integer $B \ge 2$.
Now, suppose that each company can try to use some form of control in order to increase its share of the orders. More precisely, we assume that the first company uses the control $u_n \in \{0, 1\}$ at time $n$ to increase the arrival rate of its orders. Similarly, the second company uses $v_n \in \{0, 1\}$ at time $n$ to decrease the order arrival rate of its competitor, so that the probability that Company 1 receives an order in a given time slot becomes $\lambda_1 (1 + u_n - v_n)$ (we assume that $2\lambda_1 + \mu \le 1$).
Remark 1. Writing that $v_n = 1$ means that Company 2 is making some effort (for instance, by lowering its prices) in order to reduce to zero the rate at which Company 1 receives orders, unless this company also tries to increase or at least keep its share of the orders by choosing $u_n = 1$.
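The one-step dynamics implied by this control mechanism can be encoded as follows. This is only a sketch of our reading of the model: we assume that, in a given time slot, an order arrives with probability $\lambda_1(1 + u_n - v_n)$, a service completes with probability $\mu$, and the two events are treated as mutually exclusive.

```python
import numpy as np

rng = np.random.default_rng(1)

def step(x, u, v, lam1, mu):
    """One time step of Company 1's queue from state x >= 1.

    Assumed dynamics: an order arrives with probability lam1*(1 + u - v),
    a service completes with probability mu, and at most one of the two
    events occurs per slot (which requires 2*lam1 + mu <= 1).
    """
    p_up = lam1 * (1 + u - v)  # controlled arrival probability
    z = rng.random()
    if z < p_up:
        return x + 1           # a new order arrives
    if z < p_up + mu:
        return x - 1           # a service is completed
    return x                   # nothing happens in this slot
```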
We define the cost function
\[ J(x) := \sum_{n=0}^{T(x)-1} \left( q_1 u_n^2 - q_2 v_n^2 - r \right), \]
where $q_1$, $q_2$ and $r$ are positive constants. We look for the values $u_n^*$ and $v_n^*$ of the control variables that are such that the expected value of $J(x)$ is minimized with respect to $u_n$ and maximized with respect to $v_n$.
The idea behind the cost function is as follows: since the parameter $r$ is positive, if $r$ is large, then Company 1 wants to maximize the time it stays in business. To do this, it would like to use the control $u_n = 1$; however, this leads to quadratic control costs. Company 2, for its part, would like to use $v_n = 1$ in order to bankrupt Company 1 as quickly as possible, but this also entails quadratic control costs.
Problems in which the optimizers try to minimize or maximize the expected value of a certain cost function until a given random event occurs are known as homing problems. Whittle [8] considered the case when the optimizer controls an $n$-dimensional diffusion process until it leaves a given subset of $\mathbb{R}^n$. Rishel [9] also treated homing problems for $n$-dimensional diffusion processes; these processes were more realistic models for the wear of devices than those proposed by various authors.
The author has recently extended homing problems to the optimal control of queueing systems in continuous time; see [10,11,12]. In these three papers, the aim was to determine the optimal number of servers working at time $t$. He also published a paper [13] on a homing problem for a controlled random walk with two optimizers.
Next, we define the value function
\[ F(x) := \min_{u_n} \max_{v_n} E[J(x)]. \]
The function $F(x)$ is the expected cost incurred (which can sometimes be a reward) if both optimizers choose the optimal values of $u_n$ and $v_n$ between the initial time and time $T(x)$.
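For fixed stationary strategies, the expected value of $J(x)$ can be estimated by straightforward Monte Carlo simulation, which is useful for checking the dynamic programming results derived below. The sketch uses the one-step dynamics assumed above; all numerical values are illustrative, not those of the paper.

```python
import numpy as np

rng = np.random.default_rng(2)

def expected_cost(x0, u_of, v_of, lam1, mu, q1, q2, r, B, n_paths=20_000):
    """Monte Carlo estimate of E[J(x0)] when the controls in state k are
    the constants u_of[k] and v_of[k] (stationary feedback strategies)."""
    total = 0.0
    for _ in range(n_paths):
        x, cost = x0, 0.0
        while 0 < x < B:
            u, v = u_of[x], v_of[x]
            cost += q1 * u**2 - q2 * v**2 - r  # running cost of one slot
            p_up = lam1 * (1 + u - v)
            z = rng.random()
            if z < p_up:
                x += 1
            elif z < p_up + mu:
                x -= 1
        total += cost
    return total / n_paths

# Illustrative parameters; both companies idle (u = v = 0) in every state:
print(expected_cost(1, {1: 0, 2: 0}, {1: 0, 2: 0},
                    lam1=0.25, mu=0.4, q1=1.0, q2=1.0, r=0.5, B=3))
```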
In Section 2, the dynamic programming equation satisfied by the value function will be derived, and a particular problem will be solved explicitly. In Section 3, the problem formulation will be modified. We will then assume that the value of $u_n$ is known, and we will look for the value of $v_n$ that maximizes the expected value of a certain cost function. Concluding remarks will be made in Section 4.
2. Dynamic Programming Equation
We will derive the dynamic programming equation satisfied by $F(x)$. We have
\[ F(x) = \min_{u_n} \max_{v_n} E\!\left[ q_1 u_0^2 - q_2 v_0^2 - r + \sum_{n=1}^{T(x)-1} \left( q_1 u_n^2 - q_2 v_n^2 - r \right) \right]. \]
Then, making use of Bellman's principle of optimality (see [14]), we can write that
\[ F(x) = \min_{u_0} \max_{v_0} \left\{ q_1 u_0^2 - q_2 v_0^2 - r + E[F(X_1)] \right\}. \tag{7} \]
Indeed, whatever the two optimizers decide to do at time $n = 0$, the decisions they make from time $n = 1$ to time $T(x)$ must be optimal.
Remark 2. Equation (7) is valid because of our assumption that $A_i$, for $i = 1, 2$, and $S$ have a geometric distribution. Indeed, as is well known, this distribution possesses the memoryless property. If we assume instead that these random variables have any other (discrete) distribution, then we will have to take the past into account, rendering the optimization problem almost intractable.
Hence, we can state the following proposition.
Proposition 1. The value function $F(x)$ satisfies the dynamic programming equation (DPE)
\[ F(x) = \min_{u \in \{0,1\}} \max_{v \in \{0,1\}} \left\{ q_1 u^2 - q_2 v^2 - r + \lambda_1 (1+u-v)\, F(x+1) + \mu\, F(x-1) + \left[ 1 - \lambda_1 (1+u-v) - \mu \right] F(x) \right\} \]
for $x = 1, \ldots, B-1$. Moreover, we have the boundary condition $F(x) = 0$ if $x = 0$ or $x = B$.
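One simple way to solve this DPE numerically is to iterate the min–max operator until the values stabilize; since the process is absorbed at $0$ or $B$ in finite expected time under every strategy pair, the iteration settles to a fixed point in practice. A sketch, with purely illustrative parameter values:

```python
import numpy as np

lam1, mu, q1, q2, r, B = 0.25, 0.4, 1.0, 1.0, 0.5, 3  # illustrative values

F = np.zeros(B + 1)  # F[0] = F[B] = 0 are the boundary conditions
for _ in range(100_000):
    F_new = F.copy()
    for x in range(1, B):
        # min over u of the max over v of the right-hand side of the DPE
        F_new[x] = min(
            max(
                q1 * u**2 - q2 * v**2 - r
                + lam1 * (1 + u - v) * F[x + 1]
                + mu * F[x - 1]
                + (1 - lam1 * (1 + u - v) - mu) * F[x]
                for v in (0, 1)
            )
            for u in (0, 1)
        )
    if np.max(np.abs(F_new - F)) < 1e-12:
        break
    F = F_new

print(F[1:B])  # approximate values of F(1), ..., F(B-1)
```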
Remark 3. - (i)
We take for granted that each optimizer does not know what the other has decided to do. In Section 3, we will assume that Company 2 knows the decision made by Company 1. - (ii)
There are four possibilities for $(u_k, v_k)$: $(0,0)$, $(1,0)$, $(0,1)$ and $(1,1)$. If we solve the difference equation corresponding to each possible value of $(u_k, v_k)$, we actually obtain the value of the function if the optimizers choose the same value of $(u_k, v_k)$ for any $k$. Hence, we cannot obtain the value function and/or the optimal controls for any value of $k$ by comparing the four expressions for $F(x)$ obtained by solving the four difference equations.
- (iii)
We can write $(u_k, v_k)$ for the pair of controls used when $X_n = k$. The number of possible choices of $(u_k, v_k)$ for $k = 1, \ldots, B-1$ is equal to $4^{B-1}$. If we have the values of $(u_k, v_k)$ for $k = 1, \ldots, B-1$, we can solve a system of linear equations to obtain the corresponding values of $F(k)$ for any $k$. If $B$ is small, it is a simple matter to consider all the possible values of $(u_k, v_k)$ and compute the function $F(k)$ for $k = 1, \ldots, B-1$, as illustrated in the sketch below. We can then determine the optimal controls and the associated value function.
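The procedure of item (iii) can be automated: for each assignment of $(u_k, v_k)$ to the interior states, the DPE reduces to a linear system for $F(1), \ldots, F(B-1)$. Below is a sketch under the dynamics assumed above; the parameter values are again illustrative, and the case numbering produced here is our own convention, which need not match that of Table 1.

```python
import itertools
import numpy as np

def solve_profile(us, vs, lam1, mu, q1, q2, r, B):
    """Solve the linear system for F(1), ..., F(B-1) when the controls in
    state k are fixed at (us[k-1], vs[k-1]), with F(0) = F(B) = 0."""
    n = B - 1
    M, b = np.zeros((n, n)), np.zeros(n)
    for k in range(1, B):
        u, v = us[k - 1], vs[k - 1]
        lam_eff = lam1 * (1 + u - v)
        i = k - 1
        # (lam_eff + mu) F(k) - lam_eff F(k+1) - mu F(k-1)
        #     = q1 u^2 - q2 v^2 - r
        M[i, i] = lam_eff + mu
        if k + 1 < B:
            M[i, i + 1] = -lam_eff
        if k - 1 >= 1:
            M[i, i - 1] = -mu
        b[i] = q1 * u**2 - q2 * v**2 - r
    return np.linalg.solve(M, b)

B = 3
params = dict(lam1=0.25, mu=0.4, q1=1.0, q2=1.0, r=0.5, B=B)
for case, (us, vs) in enumerate(itertools.product(
        itertools.product((0, 1), repeat=B - 1),
        itertools.product((0, 1), repeat=B - 1)), start=1):
    print(case, us, vs, np.round(solve_profile(us, vs, **params), 3))
```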
In the following subsection, a particular problem will be solved explicitly.
An Example
Suppose that $B = 3$, so that $x \in \{1, 2\}$, and that fixed numerical values are assigned to $\lambda_1$ and $\mu$. Moreover, we take fixed values of the cost parameters $q_1$, $q_2$ and $r$. The values of $F(1)$ and $F(2)$ for the $4^2 = 16$ possible choices of $(u_1, v_1)$ and $(u_2, v_2)$ are presented in Table 1.
For example, if $(u_1, v_1)$ and $(u_2, v_2)$ are chosen as in Case no. 5, then the DPE yields one linear equation if $x = 1$ and another if $x = 2$. We must solve this system of two linear equations, whose solution is the pair $(F(1), F(2))$ reported in Table 1.
We see from Table 1 that the optimal strategy is to choose Case no. 14.
Remark 4. The four difference equations that must be solved, subject to the boundary conditions $F_i(0) = F_i(3) = 0$, are given below. We denote their solutions by $F_i(x)$, for $i = 1, \ldots, 4$.
- (1)
$(u_k, v_k) \equiv (0, 0)$:
\[ \lambda_1 F_1(x+1) - (\lambda_1 + \mu) F_1(x) + \mu F_1(x-1) = r. \]
The solution is easily found.
- (2)
$(u_k, v_k) \equiv (1, 0)$:
\[ 2\lambda_1 F_2(x+1) - (2\lambda_1 + \mu) F_2(x) + \mu F_2(x-1) = r - q_1. \]
- (3)
$(u_k, v_k) \equiv (0, 1)$:
Because the order arrival rate is then equal to zero, $X_n$ cannot reach the value 3 if $x = 1$ or 2. The solution that satisfies the boundary condition $F_3(0) = 0$ is
\[ F_3(x) = -\,\frac{(q_2 + r)\, x}{\mu}; \]
a numerical check of this formula is given after this remark.
- (4)
$(u_k, v_k) \equiv (1, 1)$:
\[ \lambda_1 F_4(x+1) - (\lambda_1 + \mu) F_4(x) + \mu F_4(x-1) = r + q_2 - q_1. \]
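As a quick consistency check of case (3), under our assumed dynamics and with hypothetical parameter values, the linear system with zero arrival rate indeed returns $F_3(x) = -x(q_2+r)/\mu$:

```python
import numpy as np

# Case (3): (u_k, v_k) = (0, 1) for every k, so the arrival rate is zero
# and the DPE reduces to mu*(F(x) - F(x-1)) = -(q2 + r).
mu, q2, r = 0.4, 1.0, 0.5  # hypothetical parameter values

# Linear system for F(1), F(2), with F(0) = 0 (the level 3 is never reached):
M = np.array([[mu, 0.0],
              [-mu, mu]])
b = np.array([-(q2 + r), -(q2 + r)])

print(np.linalg.solve(M, b))                 # solver result
print([-x * (q2 + r) / mu for x in (1, 2)])  # closed form -x(q2+r)/mu
```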
The functions $F_i(x)$, for $i = 1, \ldots, 4$, are shown in Figure 1. Moreover, the values of $F_i(x)$ for $x = 1, 2$ and $i = 1, \ldots, 4$ are presented in Table 2.
Notice that $F_1$ corresponds to Case no. 1, while $F_4$ corresponds to Case no. 16. We observe that none of the functions $F_i$ is the value function. However, one of them yields values which are quite close to those obtained with the value function. Therefore, if $B$ is large, so that the number of systems of linear equations to consider is also large, then a suboptimal solution can be obtained by assuming that $(u_k, v_k)$ will be the same for any $k$.
3. Optimal Control When $u_n$ Is Known
In this section, we assume that Company 2 knows the strategy of Company 1, and tries to maximize the expected value of the following cost function:
\[ J_2(x) := \sum_{n=0}^{T(x)-1} \left( -q_2 v_n^2 \right) + K\!\left( X_{T(x)} \right). \]
The terminal cost function $K$ is defined by
\[ K\!\left( X_{T(x)} \right) = \begin{cases} k_0 & \text{if } X_{T(x)} = 0, \\ k_1 & \text{if } X_{T(x)} = B, \end{cases} \]
where $k_0 > 0$ and $k_1 < 0$. The constant $B$ could be the maximum number of orders that Company 1 can process at the same time.
Suppose that Company 1 uses the control $u_n = 0$ for any $n$ and any $k$. Company 2 must decide whether to choose $v_n = 0$ or $v_n = 1$. We define the value function
\[ G(x) := \max_{v_n} E[J_2(x)]. \]
Proceeding as in the previous section, we can prove the following proposition.
Proposition 2. The value function $G(x)$ satisfies the dynamic programming equation
\[ G(x) = \max_{v \in \{0,1\}} \left\{ -q_2 v^2 + \lambda_1 (1-v)\, G(x+1) + \mu\, G(x-1) + \left[ 1 - \lambda_1 (1-v) - \mu \right] G(x) \right\}, \]
where $x = 1, \ldots, B-1$. Furthermore, the function $G$ is such that $G(0) = k_0$ and $G(B) = k_1$.
There are now $2^{B-1}$ possible strategies for Company 2. If $B$ is small, we can proceed as in the previous section and calculate the expected value of $J_2(x)$ for each possible strategy.
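A sketch of this computation, mirroring the enumeration used in Section 2 (with the dynamics assumed throughout, $u_n \equiv 0$, and purely illustrative parameter values):

```python
import itertools
import numpy as np

def company2_value(vs, lam1, mu, q2, k0, k1, B):
    """Solve for G(1), ..., G(B-1) when Company 2 uses v_k = vs[k-1] in
    state k and Company 1 uses u_n = 0; G(0) = k0 and G(B) = k1."""
    n = B - 1
    M, b = np.zeros((n, n)), np.zeros(n)
    for k in range(1, B):
        v = vs[k - 1]
        lam_eff = lam1 * (1 - v)  # u = 0
        i = k - 1
        # (lam_eff + mu) G(k) - lam_eff G(k+1) - mu G(k-1) = -q2 v^2
        M[i, i] = lam_eff + mu
        b[i] = -q2 * v**2
        if k + 1 < B:
            M[i, i + 1] = -lam_eff
        else:
            b[i] += lam_eff * k1  # boundary term G(B) = k1
        if k - 1 >= 1:
            M[i, i - 1] = -mu
        else:
            b[i] += mu * k0       # boundary term G(0) = k0
    return np.linalg.solve(M, b)

B = 5
# Rank the 2**(B-1) = 16 strategies by the value G(1) (an arbitrary
# choice of initial state for the comparison):
best = max(itertools.product((0, 1), repeat=B - 1),
           key=lambda vs: company2_value(vs, 0.25, 0.4, 1.0, 10.0, -10.0, B)[0])
print(best)
```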
Assume that $B = 5$ and that fixed numerical values are chosen for $\lambda_1$, $\mu$, $q_2$, $k_0$ and $k_1$. We present in Table 3 the value of $E[J_2(x)]$ for each of the $2^4 = 16$ possible strategies that Company 2 can choose. These values are obtained by solving the system of four linear equations given by the above DPE for $x = 1, \ldots, 4$, together with the boundary conditions $G(0) = k_0$ and $G(5) = k_1$. We conclude that the optimal strategy is the one that corresponds to Case no. 15 in Table 3.
Remark 5. As in the previous section, we can obtain at least a suboptimal solution which is close to the optimal one by solving the difference equations obtained by assuming first that $v_k \equiv 0$, namely
\[ \lambda_1 G_1(x+1) - (\lambda_1 + \mu) G_1(x) + \mu G_1(x-1) = 0, \]
and then that $v_k \equiv 1$:
\[ \mu G_2(x) - \mu G_2(x-1) = -q_2. \]
The first equation corresponds to Case no. 1, and the second one to Case no. 16. The solutions to these equations, subject to the appropriate boundary conditions, are easily obtained in the particular case considered above. Notice that the solution obtained when $v_k \equiv 0$ (Case no. 1) actually gives the minimum of the expected value of the cost function $J_2(x)$. Moreover, we see in Table 3 that the choice that corresponds to Case no. 11 almost yields the optimal solution that we are looking for.
4. Conclusions
In this paper, a homing problem for a queueing model in discrete time has been considered. The problem can be seen as a dynamic game because there are two optimizers with opposing objectives.
In Section 2, dynamic programming was used to derive the equation satisfied by the value function. From this equation, one can deduce the optimal values of the control variables. However, we have seen that, in order to do this, one has to solve a possibly large number of systems of linear equations, subject to the appropriate boundary conditions. Although solving each system is straightforward, repeating this procedure a large number of times can become tedious. We have also seen that it is possible to obtain a good suboptimal solution to our problem fairly quickly.
In Section 3, the problem formulation was modified. We assumed that the strategy of Company 1 was known, and we looked for the strategy that Company 2 should adopt to maximize the expected value of a certain cost function. We treated the case when the control variable $u_n$ is always equal to 0. However, the same type of analysis could be carried out for any choice of $u_n$. In particular, we could find the optimal strategy of Company 2 if $u_n \equiv 1$.
In theory, we could easily extend the problems considered to the case when each optimizer can choose between more than two possible values for the variable it controls. The calculations would, however, become quite complex. One could possibly use numerical simulations to determine the optimal solutions. Indeed, instead of solving a large number of difference equations, simulating the proposed model enables us to estimate the expected value of the cost function for each possible strategy, and thus to determine the optimal solution. Simulating geometric random variables is not a difficult task.
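Indeed, under the assumptions used throughout this paper, a full trajectory of the model only requires one uniform draw per time slot. A minimal sketch of such a simulation (with illustrative parameter values):

```python
import numpy as np

rng = np.random.default_rng(3)

def simulate_T(x, lam_eff, mu, B):
    """Simulate T(x), the first time X_n hits 0 or B, for a constant
    effective arrival probability lam_eff."""
    n = 0
    while 0 < x < B:
        z = rng.random()
        if z < lam_eff:
            x += 1
        elif z < lam_eff + mu:
            x -= 1
        n += 1
    return n

# Estimate E[T(2)] with illustrative parameter values:
print(np.mean([simulate_T(2, 0.25, 0.4, 5) for _ in range(10_000)]))
```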
Finally, it is also possible to consider optimal control problems for queueing models in continuous time with two optimizers.