Abstract
We consider two companies that are competing for orders. Let $X(n)$ denote the number of orders processed by the first company at time $n$, and let $T(k)$ be the first time that $X(n) = k_1$ or $X(n) = k_2$, given that $X(0) = k$. We assume that $\{X(n),\, n = 0, 1, \dots\}$ is a controlled discrete-time queueing system. Each company uses some control to increase its share of orders. The aim of the first company is to maximize the expected value of $T(k)$, while its competitor tries to minimize this expected value. The optimal solution is obtained by making use of dynamic programming. Particular problems are solved explicitly.
Keywords:
dynamic programming; difference equations; linear equations; first-passage time; homing problem
AMS Subject Classification:
Primary 93E20; Secondary 60K25
1. Introduction
Many papers have been published on the optimal control of queueing systems. Often the authors assume that it is possible to control the service rate; see, for example, Laxmi et al. [1], Chen et al. [2] and Tian et al. [3]. In other papers, the aim is to control the arrival of customers into the system; see Wu et al. [4]. Sometimes, it is assumed that there are different types of customers; see Wen et al. [5] and Dudin et al. [6]. Another problem that is often considered is to determine the optimal number of servers in the system; see Asadzadeh et al. [7].
In this paper, we consider a single-server queueing model in discrete time. Orders arrive one at a time. We assume that the time needed for an order to arrive is a random variable $A$ that follows a geometric distribution with parameter $p$, so that
$$P[A = j] = p\,(1-p)^{j-1} \quad \text{for } j = 1, 2, \dots$$
Moreover, the service time is a random variable $S$ having a geometric distribution with parameter $q$.
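For readers who wish to experiment numerically, the geometric law above can be checked with a few lines of Python. The parameter value below is illustrative (not taken from the text), and the pmf $P[A = j] = p(1-p)^{j-1}$, $j = 1, 2, \dots$, is the standard form assumed here:

```python
# Numerical check of the geometric interarrival law P[A = j] = p(1-p)^(j-1).
# The parameter value p = 0.3 is illustrative, not taken from the text.
p = 0.3

def geom_pmf(j: int) -> float:
    """Probability that the next order arrives at time step j (j = 1, 2, ...)."""
    return p * (1.0 - p) ** (j - 1)

# The pmf sums to 1 and the mean interarrival time is 1/p.
total = sum(geom_pmf(j) for j in range(1, 2000))
mean = sum(j * geom_pmf(j) for j in range(1, 2000))
print(round(total, 6), round(mean, 6))
```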
There are two competing companies. In equilibrium, orders arrive at each company according to a random variable $A_i$ having a geometric distribution with parameter $p_i$, for $i = 1, 2$.
Instead of solving an optimal control problem, our aim in this paper is to present a stochastic dynamic game that can serve as a model for the behavior of companies competing for orders. We must, of course, make simplifying assumptions in order to obtain a mathematically tractable problem. However, we believe that the model is realistic enough to be useful.
Let $X(n)$ be the number of orders being processed by the first company at time $n$, and let
$$T(k) := \min\{n > 0 : X(n) = k_1 \text{ or } X(n) = k_2 \mid X(0) = k\},$$
where $0 \le k_1 < k$ and $k < k_2$.
Now, suppose that each company can try to use some form of control in order to increase its share of the orders. More precisely, we assume that the first company uses the control $u_n$ at time $n$ to increase the arrival rate of its orders. Similarly, the second company uses $v_n$ at time $n$ to decrease the order arrival rate of its competitor.
Remark 1.
Writing that $v_n = 1$ means that Company 2 is making some effort (for instance, by lowering its prices) to reduce to zero the rate at which Company 1 receives orders, unless this company also tries to increase, or at least keep, its share of the orders.
Let $u_n \in \{0, 1\}$ and $v_n \in \{0, 1\}$ denote the controls chosen at time $n$. We define the cost function
$$J(k) := \sum_{n=0}^{T(k)-1} \left( \frac{1}{2}\, q_1 u_n^2 - \frac{1}{2}\, q_2 v_n^2 - r \right),$$
where $q_1$, $q_2$ and $r$ are positive constants. We look for the values $u_n^*$ and $v_n^*$ of the control variables that are such that the expected value of $J(k)$ is minimized by Company 1 and maximized by Company 2.
The idea behind the cost function $J(k)$ is as follows: since the parameter $r$ is positive, if $r$ is large, then Company 1 wants to maximize the time it stays in business. To do this, it would like to use the control $u_n = 1$; however, this leads to quadratic control costs. Company 2, for its part, would like to use $v_n = 1$ in order to bankrupt Company 1 as quickly as possible, but this also entails quadratic control costs.
Problems in which the optimizers try to minimize or maximize the expected value of a certain cost function until a given random event occurs are known as homing problems. Whittle [8] considered the case where the optimizer controls an $n$-dimensional diffusion process until it leaves a given subset of $\mathbb{R}^n$. Rishel [9] also treated homing problems for $n$-dimensional diffusion processes; these processes were more realistic models for the wear of devices than those proposed by various authors.
The author has recently extended homing problems to the optimal control of queueing systems in continuous time; see [10,11,12]. In these three papers, the aim was to determine the optimal number of servers working at time t. He also published a paper ([13]) on a homing problem for a controlled random walk with two optimizers.
Next, we define the value function
$$F(k) := \min_{\{u_n\}} \max_{\{v_n\}} E[J(k)].$$
The function $F(k)$ is the expected cost incurred (which can sometimes be a reward) if both optimizers choose the optimal values of $u_n$ and $v_n$ between the initial time $0$ and time $T(k) - 1$.
In Section 2, the dynamic programming equation satisfied by the value function will be derived, and a particular problem will be solved explicitly. In Section 3, the problem formulation will be modified. We will then assume that the value of $u_n$ is known, and we will look for the value of $v_n$ that maximizes the expected value of a certain cost function. Concluding remarks will be made in Section 4.
2. Dynamic Programming Equation
We will derive the dynamic programming equation satisfied by $F(k)$. We have
$$F(k) = \min_{\{u_n\}} \max_{\{v_n\}} E\left[ \frac{1}{2}\, q_1 u_0^2 - \frac{1}{2}\, q_2 v_0^2 - r + J_1 \right],$$
where $J_1$ denotes the cost incurred from time $1$ to time $T(k) - 1$. Then, making use of Bellman’s principle of optimality (see [14]), we can write that
$$F(k) = \min_{u_0} \max_{v_0} \left\{ \frac{1}{2}\, q_1 u_0^2 - \frac{1}{2}\, q_2 v_0^2 - r + E[F(X(1))] \right\}.$$
Indeed, whatever the two optimizers decide to do at time $0$, the decisions they make from time $1$ to time $T(k) - 1$ must be optimal.
Remark 2.
Equation (7) is valid because of our assumption that $A_i$, for $i = 1, 2$, and $S$ have a geometric distribution. Indeed, as is well known, this distribution possesses the memoryless property. If we assume instead that these random variables have any other (discrete) distribution, then we will have to take the past into account, rendering the optimization problem almost intractable.
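The memoryless property invoked in Remark 2 can be verified directly from the tail probabilities $P[A > n] = (1-p)^n$ of the geometric distribution; a minimal sketch, with an illustrative parameter value:

```python
# Memoryless property of the geometric distribution:
# P[A > m + n | A > m] = P[A > n], because P[A > n] = (1-p)^n.
p = 0.4  # illustrative value

def tail(n: int) -> float:
    """P[A > n] for a geometrically distributed A with parameter p."""
    return (1.0 - p) ** n

m, n = 5, 7
conditional = tail(m + n) / tail(m)  # P[A > m + n | A > m]
assert abs(conditional - tail(n)) < 1e-12  # equality up to rounding
print("memoryless property verified")
```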
Furthermore, we have
Hence, we can state the following proposition.
Proposition 1.
The value function $F(k)$ satisfies the dynamic programming equation (DPE)
$$F(k) = \min_{u \in \{0,1\}} \max_{v \in \{0,1\}} \left\{ \frac{1}{2}\, q_1 u^2 - \frac{1}{2}\, q_2 v^2 - r + \sum_{j} P[X(n+1) = j \mid X(n) = k;\, u, v]\, F(j) \right\}.$$
Moreover, we have the boundary condition $F(k) = 0$ if $k = k_1$ or $k = k_2$.
Remark 3.
- (i)
- We take for granted that each optimizer does not know what the other has decided to do. In Section 3, we will assume that Company 2 knows the decision made by Company 1.
- (ii)
- There are four possibilities for $(u_n, v_n)$: $(0,0)$, $(0,1)$, $(1,0)$ and $(1,1)$. If we solve the difference equation corresponding to each possible value of $(u_n, v_n)$, we actually obtain the value of the function $F(k)$ if the optimizers choose the same value of $(u_n, v_n)$ for any $k$. Hence, we cannot obtain the value function and/or the optimal controls for any value of $k$ by comparing the four expressions for $F(k)$ obtained by solving the four difference equations.
- (iii)
- We can write that $u_n = u(X(n))$ and $v_n = v(X(n))$. The number of possible pairs $(u(k), v(k))$ for $k = k_1 + 1, \dots, k_2 - 1$ is equal to $4^{k_2 - k_1 - 1}$. If we have the values of $(u(k), v(k))$ for $k = k_1 + 1, \dots, k_2 - 1$, we can solve a system of linear equations to obtain the corresponding values of $F(k)$ for any $k$. If $k_2 - k_1$ is small, it is a simple matter to consider all the possible values of $(u(k), v(k))$ and compute the function $F(k)$ for each choice. We can then determine the optimal controls and the associated value function.
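The enumeration procedure described in Remark 3 can be sketched in Python. Everything model-specific below is an assumption made for illustration only: we posit boundaries $k_1 = 0$, $k_2 = 3$, a hypothetical controlled arrival probability $a(u,v) = p_1(1 + u - v)$, a departure probability $s$, a running cost $\frac{1}{2}q_1 u^2 - \frac{1}{2}q_2 v^2 - r$ per step, and we compare strategies through the expected cost from the initial state $k = 1$; none of these choices is taken from the paper.

```python
# Enumeration of the stationary strategy profiles when the interior states are
# k = 1, 2 (boundaries k1 = 0, k2 = 3).  All model ingredients below are
# hypothetical: arrival probability a(u, v) = p1*(1 + u - v), departure
# probability s, per-step cost 0.5*q1*u**2 - 0.5*q2*v**2 - r.
from itertools import product

p1, s = 0.3, 0.3           # illustrative probabilities (a(u, v) + s <= 1)
q1, q2, r = 1.0, 1.0, 0.5  # illustrative cost constants

def arrival(u: int, v: int) -> float:
    return p1 * (1 + u - v)  # equals 0, p1 or 2*p1

def step_cost(u: int, v: int) -> float:
    return 0.5 * q1 * u**2 - 0.5 * q2 * v**2 - r

def expected_cost(ctrl):
    """Solve for F(1), F(2), with F(0) = F(3) = 0, when the stationary
    controls ctrl = ((u(1), v(1)), (u(2), v(2))) are used.  The equations are
    (a_k + s) F(k) - a_k F(k+1) - s F(k-1) = c_k, k = 1, 2 (Cramer's rule)."""
    (u1, v1), (u2, v2) = ctrl
    a1, a2 = arrival(u1, v1), arrival(u2, v2)
    c1, c2 = step_cost(u1, v1), step_cost(u2, v2)
    det = (a1 + s) * (a2 + s) - a1 * s
    f1 = (c1 * (a2 + s) + a1 * c2) / det
    f2 = ((a1 + s) * c2 + s * c1) / det
    return f1, f2

strategies = list(product([0, 1], repeat=2))  # (x(1), x(2)) for each company
# Game matrix of expected costs from X(0) = 1: rows = Company 1's strategies
# (it minimizes), columns = Company 2's strategies (it maximizes).
M = [[expected_cost(((u[0], v[0]), (u[1], v[1])))[0] for v in strategies]
     for u in strategies]

# Pure saddle points: neither company can improve by deviating unilaterally.
saddles = [(i, j) for i in range(4) for j in range(4)
           if M[i][j] == max(M[i]) and M[i][j] == min(row[j] for row in M)]
print(saddles)
```

With these conventions, Company 1 picks a row of the matrix `M` to minimize the expected cost and Company 2 picks a column to maximize it; a pure saddle point, when one exists, plays the role of the optimal pair of strategies.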
In the following subsection, a particular problem will be solved explicitly.
An Example
Suppose that $k_1 = 0$ and $k_2 = 3$, and choose particular values of the model parameters and of the constants $q_1$, $q_2$ and $r$. The values of $F(1)$ and $F(2)$ for the possible choices of $u(k)$ and $v(k)$ are presented in Table 1.
Table 1.
Values of $F(1)$ and $F(2)$ for the 16 possible choices of the control variables.
For example, in Case no. 5, we must solve the corresponding system of two linear equations, whose solution gives the values of $F(1)$ and $F(2)$.
We see that the optimal strategy is to choose Case no. 14; that is, we take the corresponding values of $u^*(k)$ and $v^*(k)$.
Remark 4.
The four difference equations that must be solved, subject to the boundary conditions $F_i(0) = F_i(3) = 0$, are given below. We denote their solutions by $F_i(k)$, for $i = 1, 2, 3, 4$.
- (1)
- :
We easily find the solution in this case.
- (2)
- :
We find that
- (3)
- :
In this case, $X(n)$ cannot reach the value 3 if $k = 1$ or 2. The solution that satisfies the boundary condition $F(0) = 0$ is
- (4)
- :
We have
The functions $F_i(k)$, for $i = 1, \dots, 4$, are shown in Figure 1. Moreover, the values of $F(k)$ for $k = 1$ and $k = 2$ are presented in Table 2.
Figure 1.
Functions (solid line), (dotted line) and (dashed line) for .
Table 2.
Values of $F(k)$ for $k = 1$ and $k = 2$.
Notice that $u_n = v_n \equiv 0$ corresponds to Case no. 1, while $u_n = v_n \equiv 1$ corresponds to Case no. 16. We observe that none of the functions $F_i(k)$ is the value function. However, one of them yields values which are quite close to those obtained with the value function. Therefore, if $k_2 - k_1$ is large, so that the number of equations to consider is also large, then a suboptimal solution can be obtained by assuming that $(u_n, v_n)$ will be the same for any $k$.
3. Optimal Control When $u_n$ Is Known
In this section, we assume that Company 2 knows the strategy of Company 1, and tries to maximize the expected value of the following cost function:
$$J_2(k) := \sum_{n=0}^{T(k)-1} \left( \frac{1}{2}\, q_1 u_n^2 - \frac{1}{2}\, q_2 v_n^2 - r \right) + K[X(T(k))].$$
The terminal cost function $K$ is defined by its values $K(0)$ and $K(c)$, where $c > 0$. The constant $c$ could be the maximum number of orders that Company 1 can process at the same time.
Suppose that Company 1 uses the same control $u_n$ for any $n$ and any $k$. Company 2 must decide whether to choose $v_n = 0$ or $v_n = 1$. We define the value function $G(k)$ as the largest expected cost that Company 2 can attain, given that $X(0) = k$.
Proceeding as in the previous section, we can prove the following proposition.
Proposition 2.
The value function $G(k)$ satisfies the dynamic programming equation
$$G(k) = \max_{v \in \{0,1\}} \left\{ \frac{1}{2}\, q_1 u^2 - \frac{1}{2}\, q_2 v^2 - r + \sum_{j} P[X(n+1) = j \mid X(n) = k;\, u, v]\, G(j) \right\},$$
where $k = 1, \dots, c - 1$. Furthermore, the function $G$ is such that $G(0) = K(0)$ and $G(c) = K(c)$.
There are now $2^{c-1}$ possible strategies for Company 2. If $c$ is small, we can proceed as in the previous section and calculate the expected value of the cost function for each possible strategy.
Assume given values of the model parameters and of the constants in the cost function. We present in Table 3 the value of $G(k)$ for each of the 16 possible strategies that Company 2 can choose. These values are obtained by solving the system of four linear equations for $k = 1, \dots, 4$, together with the boundary conditions $G(0) = K(0)$ and $G(5) = K(5)$. We conclude that the optimal strategy is the one that corresponds to Case no. 15; that is, Company 2 chooses the corresponding values of $v(k)$ for $k = 1, \dots, 4$.
Table 3.
Values of $G(k)$ for the 16 possible choices of the control variables $v(k)$.
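The computation just described (fixing Company 1's control and enumerating Company 2's stationary strategies, $2^{c-1}$ of them under an assumed capacity $c$) can be sketched as follows. The dynamics, the per-step payoff and the terminal rewards $K(0)$ and $K(c)$ below are hypothetical placeholders, not the paper's values:

```python
# Fixing Company 1's control and enumerating Company 2's stationary strategies.
# Hypothetical ingredients: capacity c = 5, terminal rewards K(0) and K(c),
# arrival probability p1*(1 + u - v), departure probability s, per-step payoff
# 0.5*q1*u**2 - 0.5*q2*v**2 - r (maximized by Company 2).
from itertools import product

p1, s = 0.3, 0.3
q1, q2, r = 1.0, 1.0, 0.5
c = 5                      # assumed absorbing levels: 0 and c
K = {0: 10.0, c: -10.0}    # hypothetical terminal rewards for Company 2
u_fixed = 1                # Company 1's (known) constant control

def solve(v):
    """Expected payoff G(k), k = 1, ..., c-1, for Company 2's strategy v,
    from the tridiagonal system (a_k + s) G(k) - a_k G(k+1) - s G(k-1) = b_k."""
    m = c - 1
    A = [[0.0] * m for _ in range(m)]
    b = [0.0] * m
    for i, k in enumerate(range(1, c)):
        a = p1 * (1 + u_fixed - v[k])
        A[i][i] = a + s
        b[i] = 0.5 * q1 * u_fixed**2 - 0.5 * q2 * v[k]**2 - r
        if k + 1 < c:
            A[i][i + 1] = -a
        else:
            b[i] += a * K[c]
        if k - 1 > 0:
            A[i][i - 1] = -s
        else:
            b[i] += s * K[0]
    # Thomas algorithm: forward elimination, then back substitution.
    for i in range(1, m):
        f = A[i][i - 1] / A[i - 1][i - 1]
        A[i][i] -= f * A[i - 1][i]
        b[i] -= f * b[i - 1]
    G = [0.0] * m
    G[-1] = b[-1] / A[-1][-1]
    for i in range(m - 2, -1, -1):
        G[i] = (b[i] - A[i][i + 1] * G[i + 1]) / A[i][i]
    return G

# Compare the 2**(c-1) strategies through the payoff from X(0) = 1.
best = max(product([0, 1], repeat=c - 1),
           key=lambda bits: solve(dict(zip(range(1, c), bits)))[0])
print(best)
```

Here the strategies are compared through the payoff from the single starting state $X(0) = 1$; one could equally tabulate $G(k)$ for every interior $k$, as in Table 3.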
Remark 5.
As in the previous section, we can obtain at least a suboptimal solution which is close to the optimal one by solving the difference equations obtained by assuming first that $v_n \equiv 0$, namely
and then that $v_n \equiv 1$:
The first equation corresponds to Case no. 1, and the second one to Case no. 16. We find that the solutions to these equations in the particular case considered above are, respectively,
and
See Figure 2.
Figure 2.
Functions (solid line) and (dotted line) for .
Notice that the solution obtained when $v_n \equiv 0$ (Case no. 1) actually gives the minimum of the expected value of the cost function. Moreover, we see in Table 3 that the choice that corresponds to Case no. 11 almost yields the optimal solution that we are looking for.
4. Conclusions
In this paper, a homing problem for a queueing model in discrete time has been considered. The problem can be seen as a dynamic game because there are two optimizers with opposing objectives.
In Section 2, dynamic programming was used to derive the equation satisfied by the value function. From this equation, one can deduce the optimal values of the control variables. However, we have seen that, in order to do this, one has to solve a possibly large number of systems of linear equations, subject to the appropriate boundary conditions. Although solving each system is straightforward, repeating this procedure a large number of times can become tedious. We have also seen that it is possible to obtain a good suboptimal solution to our problem fairly quickly.
In Section 3, the problem formulation was modified. We assumed that the strategy of Company 1 was known, and we looked for the strategy that Company 2 should adopt to maximize the expected value of a certain cost function. We treated the case when the control variable $u_n$ is always equal to the same value. However, the same type of analysis could be carried out for any choice of $u_n$, and the optimal strategy of Company 2 could then be determined in each case.
In theory, we could easily extend the problems considered to the case when each optimizer can choose between more than two possible values for the variable it controls. The calculations would, however, become quite complex. One could instead use numerical simulations to determine the optimal solutions. Indeed, rather than solving a large number of difference equations, we could simulate the proposed model and estimate the expected cost associated with each strategy. Simulating geometric random variables is not a difficult task.
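The simulation approach suggested here can be sketched in Python. As before, the controlled dynamics (arrival probability $p_1(1+u-v)$, departure probability $s$, absorption at $0$ or $3$) and the cost constants are hypothetical choices made purely for illustration:

```python
# Monte Carlo estimation of the expected cost of a fixed stationary strategy.
# Hypothetical dynamics: X gains an order w.p. p1*(1 + u - v), loses one
# w.p. s; each step costs 0.5*q1*u**2 - 0.5*q2*v**2 - r; absorption at 0 or 3.
import random

p1, s = 0.3, 0.3
q1, q2, r = 1.0, 1.0, 0.5
K1, K2 = 0, 3  # absorbing levels

def simulate_cost(u, v, k0=1, rng=random.Random(12345)):
    """One realization of the cost J(k0) for stationary controls u(k), v(k)."""
    k, cost = k0, 0.0
    while K1 < k < K2:
        uk, vk = u[k], v[k]
        cost += 0.5 * q1 * uk**2 - 0.5 * q2 * vk**2 - r
        a = p1 * (1 + uk - vk)
        x = rng.random()
        if x < a:
            k += 1
        elif x < a + s:
            k -= 1
    return cost

u = {1: 0, 2: 0}
v = {1: 0, 2: 0}
n = 20000
estimate = sum(simulate_cost(u, v) for _ in range(n)) / n
print(round(estimate, 3))  # should be close to -r times the mean exit time
```

Repeating this for each of the finitely many stationary strategies gives simulation-based estimates of the quantities that the linear systems provide exactly.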
Finally, it is also possible to consider optimal control problems for queueing models in continuous time with two optimizers.
Funding
This research was supported by the Natural Sciences and Engineering Research Council of Canada.
Data Availability Statement
No data was used for this research.
Acknowledgments
The author wishes to thank the anonymous reviewers of this paper for their constructive comments.
Conflicts of Interest
The author reports that there are no competing interests to declare.
References
- Laxmi, P.V.; Bhavani, E.G.; Jyothsna, K. Analysis of Markovian queueing system with second optional service operating under the triadic policy. OPSEARCH 2023, 60, 256–275. [Google Scholar] [CrossRef]
- Chen, G.; Liu, Z.; Xia, L. Event-based optimization of service rate control in retrial queues. J. Oper. Res. Soc. 2023, 74, 979–991. [Google Scholar] [CrossRef]
- Tian, R.; Su, S.; Zhang, Z.G. Equilibrium and social optimality in queues with service rate and customers’ joining decisions. Qual. Technol. Quant. Manag. 2024, 21, 1–34. [Google Scholar] [CrossRef]
- Wu, C.-H.; Yang, D.-Y.; Yong, C.-R. Performance evaluation and bi-objective optimization for F-policy queue with alternating service rates. J. Ind. Manag. Optim. 2023, 19, 3819–3839. [Google Scholar] [CrossRef]
- Wen, J.; Geng, N.; Xie, X. Optimal insertion of customers with waiting time targets. Comput. Oper. Res. 2020, 122, 105001. [Google Scholar] [CrossRef]
- Dudin, A.; Dudin, S.; Dudina, O. Analysis of a queueing system with mixed service discipline. Methodol. Comput. Appl. Prob. 2023, 25, 57. [Google Scholar] [CrossRef]
- Asadzadeh, S.; Akhavan, B.; Akhavan, B. Multi-objective optimization of Gas Station performance using response surface methodology. Int. J. Qual. Reliab. Manag. 2021, 38, 465–483. [Google Scholar] [CrossRef]
- Whittle, P. Optimization over Time; Wiley: Chichester, UK, 1982; Volume I. [Google Scholar]
- Rishel, R. Controlled wear process: Modeling optimal control. IEEE Trans. Automat. Control 1991, 36, 1100–1102. [Google Scholar] [CrossRef]
- Lefebvre, M. An optimal control problem for a modified M/G/k queueing system. In Proceedings of the Workshop on Intelligent Information Systems, Chişinǎu, Republic of Moldova, 19–21 October 2023; pp. 141–148. [Google Scholar]
- Lefebvre, M. Reducing the size of a waiting line optimally. WSEAS Trans. Syst. Control 2023, 18, 342–345. [Google Scholar] [CrossRef]
- Lefebvre, M.; Yaghoubi, R. Optimal control of a queueing system. 2024; Submitted for publication. [Google Scholar]
- Lefebvre, M. A discrete-time homing problem with two optimizers. Games 2023, 14, 68. [Google Scholar] [CrossRef]
- Bellman, R. Dynamic Programming; Princeton University Press: Princeton, NJ, USA, 1957. [Google Scholar]