Advances in Markovian Dynamic and Stochastic Optimization Models in Diverse Application Areas

A special issue of Mathematics (ISSN 2227-7390). This special issue belongs to the section "Probability and Statistics".

Deadline for manuscript submissions: closed (31 August 2023) | Viewed by 16682

Special Issue Editor


Guest Editor
Department of Statistics, Carlos III University of Madrid, 28903 Getafe (Madrid), Spain
Interests: operations research; dynamic and stochastic optimization

Special Issue Information

Dear Colleagues,

Markovian dynamic and stochastic optimization is an active research area concerning the design and analysis of optimal or nearly optimal policies for Markov decision models of stochastic systems evolving over time. Such models arise in a wide variety of application areas, including manufacturing, marketing, service operations, finance, call centers, and cloud service systems.

In this Special Issue, we shall collect recent theoretical and application-oriented advances regarding Markovian dynamic and stochastic optimization models in any application area. This includes the design and analysis of optimal and nearly optimal policies, performance analysis, large-scale systems, queueing systems, bandit models, and computational studies.

Prof. Dr. José Niño-Mora
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to the website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Mathematics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Markov decision processes
  • stochastic dynamic programming
  • optimal policies
  • optimal control
  • queueing systems
  • bandit models
  • reinforcement learning
  • machine learning
  • operations research
  • dynamic and stochastic optimization

Published Papers (5 papers)


Research


22 pages, 984 KiB  
Article
Equilibrium Analysis for Batch Service Queueing Systems with Strategic Choice of Batch Size
by Ayane Nakamura and Tuan Phung-Duc 
Mathematics 2023, 11(18), 3956; https://doi.org/10.3390/math11183956 - 18 Sep 2023
Cited by 1 | Viewed by 1043
Abstract
Various transportation services exist, such as ride-sharing or shared taxis, in which customers receive service in batches of flexible size and share fees. In this study, we conducted an equilibrium analysis of a variable batch service model in which customers who observe no waiting customers in an incomplete batch can strategically select a batch size to maximize their individual utility. We formulated this model as a three-dimensional Markov chain and created a book-type transition diagram. To consider the joining/balking dilemma of customers in this model, we proposed an efficient algorithm to construct a state space of necessary and sufficient size for the Markov chain, provided that all customers adopt the threshold-type equilibrium strategy. Moreover, assuming that a tagged customer observes i complete batches in the system upon arrival, we proved that the best batch size is non-decreasing in i if the reward for completing a batch service of size l is increasing in l; in other words, if the fee decreases as the batch becomes larger. We then derived several performance measures, such as throughput, social welfare, and the monopolist's revenue. In a numerical experiment, we compared the present variable batch service model with a regular batch service model in which customers are served in batches of constant size. It was demonstrated that the three performance measures can be optimized simultaneously in the variable batch service model, as long as the fee is set relatively high.
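
The threshold-type joining/balking behavior analyzed in the paper can be illustrated with a toy event-driven simulation. This is not the authors' model: the arrival rate, batch service rate, fixed batch size, and balking threshold below are all illustrative assumptions.

```python
import random

def simulate_batch_queue(lam=1.0, mu=0.5, batch_size=3, threshold=5,
                         horizon=10_000, seed=42):
    """Event-driven simulation of a batch-service queue with threshold
    joining: an arriving customer joins only if fewer than `threshold`
    customers are waiting (and balks otherwise). Service starts whenever
    at least `batch_size` customers are waiting; a batch service time is
    Exp(mu). Returns the estimated throughput (served customers per
    unit time)."""
    rng = random.Random(seed)
    t, waiting, served = 0.0, 0, 0
    in_service = False
    next_arrival = rng.expovariate(lam)
    service_end = float("inf")
    while t < horizon:
        if next_arrival < service_end:          # next event: arrival
            t = next_arrival
            if waiting < threshold:             # threshold joining rule
                waiting += 1
            next_arrival = t + rng.expovariate(lam)
        else:                                   # next event: batch completes
            t = service_end
            served += batch_size
            in_service = False
            service_end = float("inf")
        if not in_service and waiting >= batch_size:
            waiting -= batch_size               # load a full batch
            in_service = True
            service_end = t + rng.expovariate(mu)
    return served / t
```

Varying `threshold` in such a sketch shows the joining/balking trade-off: a larger threshold admits more customers but lengthens waits for late joiners.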

15 pages, 658 KiB  
Article
Pricing the Volatility Risk Premium with a Discrete Stochastic Volatility Model
by Petra Posedel Šimović and Azra Tafro
Mathematics 2021, 9(17), 2038; https://doi.org/10.3390/math9172038 - 25 Aug 2021
Cited by 4 | Viewed by 2440
Abstract
Investors' decisions on capital markets depend on their anticipation of, and preferences about, risk, and volatility is one of the most common measures of risk. This paper proposes a method of estimating the market price of volatility risk by incorporating both conditional heteroscedasticity and nonlinear effects in market returns, while accounting for asymmetric shocks. We develop a model that allows dynamic risk premiums for the underlying asset and for the volatility of the asset under the physical measure. Specifically, a nonlinear-in-mean time series model incorporating the asymmetric autoregressive conditional heteroscedasticity model with leverage (NGARCH) is adapted for modeling return dynamics. The local risk-neutral valuation relationship is used to model investors' preferences regarding volatility risk. The transition probabilities governing the evolution of the price of the underlying asset are adjusted for investors' attitude towards risk, presenting the asset returns as a function of the risk premium. Numerical studies on asset return data show the significance of market shocks and levels of asymmetry in pricing the volatility risk. Estimated premiums could be used in option pricing models, turning options markets into volatility trading markets, and in measuring reactions to market shocks.
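
The NGARCH(1,1) recursion at the core of such models can be sketched as follows. The parameter values and the stationarity-based initialization are illustrative assumptions, not estimates from the paper.

```python
import random

def simulate_ngarch(n=1000, mu=0.0, omega=1e-6, alpha=0.05,
                    beta=0.9, theta=0.5, seed=0):
    """Simulate returns r_t = mu + sqrt(h_t) * z_t, z_t ~ N(0, 1), under
    the NGARCH(1,1) variance recursion
        h_{t+1} = omega + alpha * h_t * (z_t - theta)**2 + beta * h_t.
    theta > 0 induces the leverage effect: a negative shock raises
    next-period variance more than a positive shock of equal size.
    Requires alpha * (1 + theta**2) + beta < 1 for stationarity."""
    rng = random.Random(seed)
    # initialize at the unconditional (stationary) variance
    h = omega / (1 - alpha * (1 + theta**2) - beta)
    returns, variances = [], []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        returns.append(mu + h**0.5 * z)
        variances.append(h)
        h = omega + alpha * h * (z - theta)**2 + beta * h
    return returns, variances
```

Fitting such a recursion to data and adjusting the transition probabilities under the local risk-neutral valuation relationship is where the paper's actual contribution lies; the sketch only shows the physical-measure dynamics.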

26 pages, 1558 KiB  
Article
Three-Stage Numerical Solution for Optimal Control of COVID-19
by Luis Vargas Tamayo, Vianney Mbazumutima, Christopher Thron and Léonard Todjihounde
Mathematics 2021, 9(15), 1777; https://doi.org/10.3390/math9151777 - 27 Jul 2021
Viewed by 2132
Abstract
In this paper, we present a three-stage algorithm for finding numerical solutions to optimal control problems. The algorithm first performs an exhaustive search through a discrete set of widely dispersed solutions that are representative of large subregions of the search space; it then uses the search results to initialize a Monte Carlo process that searches quasi-randomly for a best solution; finally, it uses a Newton-type iteration to converge to a solution that satisfies mathematical conditions of local optimality. We demonstrate our methodology on an epidemiological model of the coronavirus disease with testing and distancing controls applied over a period of 180 days to two different subpopulations (low-risk and high-risk), where model parameters are chosen to fit the city of Houston, Texas, USA. To enable the user to select their preferred trade-off between outcomes (number of deaths versus herd immunity), the objective function includes costs for deaths and non-immunity. Optimal strategies are estimated for a grid of (death cost) × (non-immunity cost) combinations in order to obtain a Pareto curve that represents optimal trade-offs. The levels of the four controls for the different Pareto-optimal solutions over the 180-day period are represented visually and their characteristics discussed. Three variants of the algorithm are run to determine the relative importance of the three stages in the optimization. Results from the three variants are fairly consistent, indicating that the solutions are robust. Results also show that the Monte Carlo stage plays an especially prominent role in the optimization, but that all three stages make significant contributions towards finding lower-cost, more effective control strategies.
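
The three-stage scheme described above can be sketched on a generic low-dimensional objective. This toy version, with a coarse grid, Gaussian Monte Carlo perturbations, and a finite-difference descent standing in for the Newton-type iteration, is an illustrative assumption, not the authors' implementation.

```python
import random

def three_stage_minimize(f, bounds, n_grid=5, n_mc=200, n_polish=20, seed=1):
    """Toy illustration of a three-stage minimization of f over the box
    `bounds` (a list of (lo, hi) pairs):
      1. exhaustive search over a coarse lattice of dispersed candidates,
      2. Monte Carlo search with shrinking Gaussian perturbations,
      3. a damped finite-difference descent to polish the best candidate."""
    rng = random.Random(seed)
    dim = len(bounds)

    # Stage 1: exhaustive search over a coarse axis-aligned grid.
    def grid_points():
        idx = [0] * dim
        while True:
            yield [bounds[d][0] + idx[d] * (bounds[d][1] - bounds[d][0])
                   / (n_grid - 1) for d in range(dim)]
            d = 0
            while d < dim:          # odometer-style index increment
                idx[d] += 1
                if idx[d] < n_grid:
                    break
                idx[d] = 0
                d += 1
            else:
                return
    best = min(grid_points(), key=f)

    # Stage 2: quasi-random local search around the incumbent.
    for k in range(n_mc):
        scale = 0.5 * (1 - k / n_mc)
        cand = [min(max(x + rng.gauss(0, scale * (hi - lo)), lo), hi)
                for x, (lo, hi) in zip(best, bounds)]
        if f(cand) < f(best):
            best = cand

    # Stage 3: finite-difference gradient descent with step halving.
    step, eps = 0.1, 1e-4
    for _ in range(n_polish):
        grad = []
        for d in range(dim):
            up, down = best[:], best[:]
            up[d] += eps
            down[d] -= eps
            grad.append((f(up) - f(down)) / (2 * eps))
        cand = [x - step * g for x, g in zip(best, grad)]
        if f(cand) < f(best):
            best = cand
        else:
            step *= 0.5
    return best, f(best)
```

On a smooth objective, stage 1 supplies a reasonable basin, stage 2 escapes poor local structure, and stage 3 sharpens the final iterate, mirroring the division of labor reported in the paper.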

Review


27 pages, 437 KiB  
Review
Markovian Restless Bandits and Index Policies: A Review
by José Niño-Mora
Mathematics 2023, 11(7), 1639; https://doi.org/10.3390/math11071639 - 28 Mar 2023
Cited by 3 | Viewed by 3790
Abstract
The restless multi-armed bandit problem is a paradigmatic modeling framework for optimal dynamic priority allocation in stochastic models across wide-ranging applications. It has been widely investigated and applied since its inception in a seminal paper by Whittle in the late 1980s. The problem has generated a vast and fast-growing literature, from which a significant sample is thematically organized and reviewed in this paper. While the main focus is on priority-index policies, due to their intuitive appeal, tractability, asymptotic optimality properties, and often strong empirical performance, other lines of work are also reviewed. Theoretical and algorithmic developments are discussed, along with diverse applications. The main goals are to highlight the remarkable breadth of work on the topic and to stimulate further research in the field.
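
The Whittle index underlying many of the priority-index policies reviewed here can, for a single indexable arm, be computed by bisection on the passive subsidy. The discount factor, subsidy bracket, and value-iteration settings below are illustrative assumptions, and the bisection presupposes indexability.

```python
def whittle_index(state, P_active, P_passive, r_active, r_passive,
                  beta=0.9, tol=1e-6):
    """Whittle index of `state` for one restless arm: the passive subsidy
    lam at which active and passive actions are equally attractive there.
    P_active/P_passive are row-stochastic transition matrices (lists of
    lists); r_active/r_passive are per-state reward lists; beta is the
    discount factor. Assumes the arm is indexable."""
    n = len(r_active)

    def q_diff(lam):
        # Value iteration on the lam-subsidized single-arm MDP.
        V = [0.0] * n
        for _ in range(2000):
            V_new = []
            for s in range(n):
                qa = r_active[s] + beta * sum(P_active[s][j] * V[j]
                                              for j in range(n))
                qp = (r_passive[s] + lam
                      + beta * sum(P_passive[s][j] * V[j] for j in range(n)))
                V_new.append(max(qa, qp))
            done = max(abs(a - b) for a, b in zip(V, V_new)) < tol
            V = V_new
            if done:
                break
        qa = r_active[state] + beta * sum(P_active[state][j] * V[j]
                                          for j in range(n))
        qp = (r_passive[state] + lam
              + beta * sum(P_passive[state][j] * V[j] for j in range(n)))
        return qa - qp

    lo, hi = -10.0, 10.0          # assumed bracket for the index
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if q_diff(mid) > 0:       # active still preferred: raise subsidy
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

A sanity check: when the transition law does not depend on the action, the index reduces to the active-minus-passive reward in the given state.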
22 pages, 463 KiB  
Review
Reinforcement Learning Approaches to Optimal Market Making
by Bruno Gašperov, Stjepan Begušić, Petra Posedel Šimović and Zvonko Kostanjčar
Mathematics 2021, 9(21), 2689; https://doi.org/10.3390/math9212689 - 22 Oct 2021
Cited by 7 | Viewed by 5879
Abstract
Market making is the process whereby a market participant, called a market maker, simultaneously and repeatedly posts limit orders on both sides of the limit order book of a security in order to both provide liquidity and generate profit. Optimal market making entails dynamically adjusting bid and ask prices in response to the market maker's current inventory level and market conditions, with the goal of maximizing a risk-adjusted return measure. This problem is naturally framed as a Markov decision process, a discrete-time stochastic (inventory) control process; reinforcement learning, a class of techniques for solving Markov decision processes by learning from observations, lends itself particularly well to it. Recent years have seen a very strong uptick in the popularity of such techniques in the field, fueled in part by a series of successes of deep reinforcement learning in other domains. The primary goal of this paper is to provide a comprehensive and up-to-date overview of the current state-of-the-art applications of (deep) reinforcement learning to optimal market making. The analysis indicates that reinforcement learning techniques deliver superior performance, in terms of risk-adjusted return, over more standard market-making strategies, which are typically derived from analytical models.
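
A tabular Q-learning agent on a stylized inventory MDP gives a minimal sketch of the framing described above. The fill probabilities, half-spread, and inventory penalty are toy assumptions, far simpler than the deep reinforcement learning methods the review surveys.

```python
import random

def q_learning_market_maker(episodes=3000, T=50, max_inv=5,
                            alpha=0.1, gamma=0.99, eps=0.1, seed=7):
    """Tabular Q-learning on a toy market-making MDP. State: current
    inventory in [-max_inv, max_inv]. Actions: skew quotes to favor
    buying (0), stay neutral (1), or favor selling (2). Reward: captured
    half-spread per fill minus a quadratic inventory penalty."""
    rng = random.Random(seed)
    actions = (0, 1, 2)
    # assumed (bid fill prob, ask fill prob) for each quoting action
    fill = {0: (0.8, 0.2), 1: (0.5, 0.5), 2: (0.2, 0.8)}
    Q = {(inv, a): 0.0
         for inv in range(-max_inv, max_inv + 1) for a in actions}

    def step(inv, a):
        p_buy, p_sell = fill[a]
        reward = 0.0
        if rng.random() < p_buy and inv < max_inv:    # bid hit: we buy
            inv += 1
            reward += 0.5
        if rng.random() < p_sell and inv > -max_inv:  # ask lifted: we sell
            inv -= 1
            reward += 0.5
        reward -= 0.01 * inv * inv                    # inventory risk cost
        return inv, reward

    for _ in range(episodes):
        inv = 0
        for _ in range(T):
            a = (rng.choice(actions) if rng.random() < eps
                 else max(actions, key=lambda x: Q[(inv, x)]))
            nxt, r = step(inv, a)
            best_next = max(Q[(nxt, x)] for x in actions)
            Q[(inv, a)] += alpha * (r + gamma * best_next - Q[(inv, a)])
            inv = nxt
    return Q
```

After training, the greedy policy skews quotes against the inventory, selling down long positions and buying back short ones, which is the qualitative behavior the surveyed methods learn in far richer state spaces.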
