Article

Stochastic Production Planning with Regime-Switching: Sensitivity Analysis, Optimal Control, and Numerical Implementation

by
Dragos-Patru Covei
Department of Applied Mathematics, The Bucharest University of Economic Studies, Piata Romana, No. 6, District 1, 010374 Bucharest, Romania
Axioms 2025, 14(7), 524; https://doi.org/10.3390/axioms14070524
Submission received: 15 June 2025 / Revised: 28 June 2025 / Accepted: 5 July 2025 / Published: 8 July 2025

Abstract

This study investigates a stochastic production planning problem with regime-switching parameters, inspired by economic cycles impacting production and inventory costs. The model considers $D \geq 1$ types of goods and employs a Markov chain to capture probabilistic regime transitions, coupled with a multidimensional Brownian motion representing stochastic demand dynamics. The production and inventory cost optimization problem is formulated as a quadratic cost functional, with the solution characterized by a regime-dependent system of elliptic partial differential equations (PDEs). Numerical solutions to the PDE system are computed using a monotone iteration algorithm, enabling quantitative analysis. Sensitivity analysis and model risk evaluation illustrate the effects of regime-dependent volatility, holding costs, and discount factors, revealing the conservative bias of regime-switching models when compared to static alternatives. Practical implications include optimizing production strategies under fluctuating economic conditions and exploring future extensions such as correlated Brownian dynamics, non-quadratic cost functions, and geometric inventory frameworks. In contrast to earlier studies that imposed static or overly simplified regime-switching assumptions, our work presents a fully integrated framework that combines optimal control theory, a regime-dependent system of elliptic PDEs, and comprehensive numerical and sensitivity analyses to capture the stochastic dynamics of production planning more accurately and to deliver actionable insights for modern manufacturing environments.
MSC:
90B05; 93E20; 60J60; 90C39; 49L20; 90C15

1. Introduction

Stochastic production planning lies at the intersection of optimal control, stochastic processes, and operations research. Over the last four decades, researchers have developed an array of methodologies to tackle the inherent uncertainty in production and inventory systems. The seminal work of Bensoussan et al. [1] established the foundations for production optimization under probabilistic constraints. Cadenillas et al. then introduced regime-switching dynamics to reflect business cycle effects on demand [2], and extended this framework to include production constraints [3]. Dong et al. [4] demonstrated the importance of regime shifts in microgrid energy management, while Gharbi and Kenne [5] focused on multi-product manufacturing environments.
Building on elliptic PDE techniques, Covei et al. [6] derived explicit radially symmetric solutions for infinite-horizon regime-switching problems, proving uniqueness and convexity. Subsequent works [7,8] refined numerical implementations and parabolic PDE analyses. More recently, Ghosh et al. [9] and Borhan et al. [10] applied switching diffusion controls to complex engineering systems, and Hu et al. [11] developed a $K$-convex multi-cycle supply network model with interdependent demand shocks.
Motivated by economic cycles that abruptly alter production and holding costs, we study a stochastic production planning model in which a $D$-dimensional ($D \geq 1$) Brownian motion $W_t$ captures continuous demand fluctuations, and an independent, finite-state, continuous-time homogeneous Markov chain $(\varepsilon_t)_{t \geq 0}$ with the generator
$$A = \begin{pmatrix} -a_1 & a_1 \\ a_2 & -a_2 \end{pmatrix}, \quad a_i > 0, \quad i = 1, 2,$$
models regime switches (e.g., growth vs. recession). The chain has stationary (time-homogeneous) transition rates, so that regime-switching events occur at exponential times independent of the Brownian paths.
Under each regime $i \in \{1, 2\}$, the cost and volatility parameters $(\alpha_i, \sigma_i, f_i)$ remain constant until the next jump of $\varepsilon_t$. This regime-switching structure reflects real-world scenarios in which economic cycles or policy shifts cause sudden parameter changes. The manager’s goal is to choose production rates $p(t, \varepsilon_t) \in \mathbb{R}^D$ to minimize
$$J(p) = E \int_0^{\tau} \left( |p(s, \varepsilon_s)|^2 + f_{\varepsilon_s}(y(s)) \right) e^{-\alpha_{\varepsilon_s} s} \, ds,$$
where $y(t)$ evolves by
$$dy(t) = p(t, \varepsilon_t) \, dt + \sigma_{\varepsilon_t} \, dW_t$$
and $\tau$ is the first exit time from a given inventory ball of radius $R > 0$.
Our contributions are threefold:
  • We derive the coupled elliptic Hamilton–Jacobi–Bellman (HJB) equations for the regime-dependent value functions $z_1, z_2$; prove the existence, uniqueness, and convexity of the solution via a logarithmic transform and monotone iteration; and obtain an explicit radially symmetric bound.
  • We clarify and exploit the independence and stationarity of the Markov chain transitions; regime changes occur at exponential times independently of Brownian noise, ensuring no mixed diffusion jump terms and greatly simplifying both analysis and numerics.
  • We implement a fully integrated numerical pipeline solving the transformed PDEs, recovering the optimal feedback control, and simulating the controlled SDE, thereby illustrating sensitivity and model risk analyses under realistic economic scenarios.
The remainder of this paper is organized as follows: Section 2 introduces the mathematical formulation of the model and its objectives. Section 3 explains the methodology, including the derivation of the HJB equations and the existence of a solution. Section 4 focuses on the optimal control of the problem at hand. Section 5 provides a discussion on sensitivity analysis, model examination, and visualization of the results. Section 6 proposes future research directions. Section 7 concludes with the final observations, and Appendix A provides a rigorous proof of the key technical result, a detailed presentation of the numerical algorithm implementing our main theorems, and the complete Python (version 3.13.1) code for these computations.

2. Theoretical Framework

In this section, we present the mathematical formulation of the stochastic production planning problem with regime-switching. The model incorporates random demand, regime-dependent parameters, and production controls, as described below.

2.1. Problem Formulation

This paper addresses a stochastic production planning problem involving $D \geq 1$ types of goods stored in inventory with the objective of minimizing production and inventory costs over time under regime-switching economic parameters. The overall criterion is based on a quadratic cost functional that represents the production and holding costs (adjusted for stochastic demand), with production ceasing when inventory levels exceed a given threshold $R > 0$. In our model, regimes are characterized by a Markov chain that captures probabilistic transitions between states, and a $D$-dimensional Brownian motion models stochastic fluctuations in demand.
The stochastic dynamics of the inventory levels are governed by
$$dy_i(t, \epsilon(t)) = p_i(t, \epsilon(t)) \, dt + \sigma_{\epsilon(t)} \, dw_i(t), \quad y_i(0, \epsilon(0)) = y_i^{0, \epsilon(0)}, \quad i = 1, \dots, D, \tag{1}$$
where $p_i$ is the deterministic production rate, $y_i(t, \epsilon(t))$ is an Itô process in $\mathbb{R}$, $\sigma_{\epsilon(t)}$ is the regime-dependent volatility, and $\epsilon(t)$ is a Markov chain representing economic regimes (here with two states, but the methodology extends to any finite number of states).
The cost functional is defined as
$$J(p_1, \dots, p_D) = E \int_0^{\tau} \left[ \sum_{i=1}^{D} p_i^2(t, \epsilon(t)) + f_{\epsilon(t)}(y(t, \epsilon(t))) \right] e^{-\alpha_{\epsilon(t)} t} \, dt,$$
subject to the dynamics in (1) and the stopping time $\tau$, which stops production when the inventory exceeds the threshold $R > 0$.

2.2. Regime-Switching and Dynamics

We consider a probability space $(\Omega, \mathcal{F}, P)$ together with a standard $\mathbb{R}^D$-valued Brownian motion
$$w = \{ w_t : t \geq 0 \}$$
and an observable finite-state continuous-time homogeneous Markov chain
$$\epsilon = \{ \epsilon_t : t \geq 0 \}.$$
We denote by $\mathbb{F} = \{ \mathcal{F}_t : t \geq 0 \}$ the $P$-augmentation of the filtration $\{ \mathcal{F}^{(w, \epsilon)}_t : t \geq 0 \}$ generated by the Brownian motion and the Markov chain, where
$$\mathcal{F}^{(w, \epsilon)}_t := \sigma(\{ w_s, \epsilon_s : 0 \leq s \leq t \}) \quad \text{for every } t \geq 0.$$
The manager of a firm wants to control the inventory of a given item. We assume a stochastic production environment driven by two sources of randomness:
  • Markov Chain: A continuous-time homogeneous Markov chain $\epsilon(t)$ with states $\{1, 2\}$ represents economic regimes. These regimes may correspond to scenarios such as economic growth ($\epsilon(t) = 1$) or recession ($\epsilon(t) = 2$).
  • Brownian Motion: A $D$-dimensional Brownian motion $w(t) = (w_1(t), \dots, w_D(t))$ models stochastic demand fluctuations in inventory levels.
We also assume that $\epsilon$ and $w$ are independent (so that the HJB system has no “mixed” terms; see [12]), and that the Markov chain $\epsilon$ has a strongly irreducible generator, which is given by
$$A = \begin{pmatrix} -a_1 & a_1 \\ a_2 & -a_2 \end{pmatrix} \quad (\text{transition rate matrix of the Markov chain } \epsilon(t)),$$
where $a_1 > 0$ and $a_2 > 0$. In this case,
if $p_t^{t} = E[\epsilon(t)] \in \mathbb{R}^2$, then $\dfrac{d p_t^{t}}{dt} = A \, \epsilon(t)$,
and $\epsilon_t$ is explicitly described by the integral form
$$\epsilon_t = \epsilon_0 + \int_0^t A \, \epsilon_s \, ds + M_t,$$
where $M_t$ is a martingale with respect to $\mathbb{F}$.
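As a minimal sketch of the two-state chain described above (not the code from Appendix A.3), the following Python snippet simulates regime paths with exponential holding times and computes the stationary distribution $(a_2, a_1)/(a_1 + a_2)$ of the generator $A$. The function names and the rate values are illustrative assumptions.

```python
import random

def stationary_distribution(a1, a2):
    """Stationary law (pi_1, pi_2) of A = [[-a1, a1], [a2, -a2]], i.e. pi A = 0."""
    total = a1 + a2
    return a2 / total, a1 / total

def simulate_regimes(a1, a2, horizon, start=1, seed=0):
    """Simulate the chain on [0, horizon]; returns [(jump_time, state), ...]."""
    rng = random.Random(seed)
    rates = {1: a1, 2: a2}
    t, state = 0.0, start
    path = [(t, state)]
    while True:
        # the holding time in regime i is exponential with mean 1 / a_i
        t += rng.expovariate(rates[state])
        if t >= horizon:
            return path
        state = 2 if state == 1 else 1
        path.append((t, state))

def time_fraction_in_regime(path, horizon, regime=1):
    """Fraction of [0, horizon] spent in the given regime along one path."""
    total = 0.0
    for (t0, s0), (t1, _) in zip(path, path[1:] + [(horizon, None)]):
        if s0 == regime:
            total += t1 - t0
    return total / horizon

# Illustrative rates (assumed values, not taken from the paper's tables)
a1, a2 = 0.5, 1.5
pi = stationary_distribution(a1, a2)
path = simulate_regimes(a1, a2, horizon=20000.0)
print(pi, time_fraction_in_regime(path, 20000.0))
```

Over a long horizon, the empirical time fraction spent in regime 1 should be close to $a_2/(a_1 + a_2)$, which gives a quick sanity check on any regime simulator used later in the pipeline.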

2.3. Inventory Dynamics and State Variables

Let $y_i(t)$ denote the inventory level of good $i$ at time $t$, adjusted for demand, and let $p_i(t, \epsilon(t))$ denote the production rate (control variable) for good $i$ at time $t$ under regime $\epsilon(t)$. The stochastic dynamics of the inventory are governed by
$$dy_i(t) = p_i(t, \epsilon(t)) \, dt + \sigma_{\epsilon(t)} \, dw_i(t), \quad y_i(0) = y_i^0, \quad i = 1, \dots, D, \tag{2}$$
where $\sigma_{\epsilon(t)}$ is the regime-dependent volatility, $w_i(t)$ is the $i$-th component of a $D$-dimensional Brownian motion, $\epsilon(t)$ is a Markov chain representing economic regimes, and $y_i^0$ denotes the initial inventory level of good $i$.

2.4. Objective Function

The objective of the stochastic production planning problem is to minimize the total expected cost incurred over time. These costs include both production costs and inventory holding costs, adjusted for regime-switching dynamics and exponential discounting. This is formalized through the following components.

2.4.1. Production Costs

The cost associated with the production rate $p_i(t, \epsilon(t))$ for good $i$ is quadratic and regime-dependent. The quadratic form ensures tractability in optimization and is expressed as
$$C_p(t) = \sum_{i=1}^{D} p_i^2(t, \epsilon(t)),$$
where $p_i(t, \epsilon(t))$ represents the net production rate (actual production minus demand).

2.4.2. Inventory Costs

The holding cost for storing the inventory is modeled as a convex function of the inventory levels. It accounts for regime-switching parameters and is given by
$$C_h(t) = f_{\epsilon(t)}(y(t, \epsilon(t))),$$
where $f_{\epsilon(t)}(\cdot)$ represents regime-dependent holding costs. The convexity of $f_{\epsilon(t)}(\cdot)$ reflects the increasing marginal cost of holding excess inventory.

2.4.3. Discount Factor

To account for the time value of money, the costs are exponentially discounted with a regime-dependent discount rate $\alpha_{\epsilon(t)}$. The discount factor ensures that costs incurred in the future are valued less than those incurred immediately.

2.4.4. Cost Functional

The factory aims to minimize production and inventory costs, subject to the stochastic dynamics (2) described above. The total cost functional—combining production costs, inventory costs, and exponential discounting—is given by
$$J(p_1, \dots, p_D) = E \int_0^{\tau} \left[ \sum_{i=1}^{D} p_i^2(t, \epsilon(t)) + f_{\epsilon(t)}(y(t, \epsilon(t))) \right] e^{-\alpha_{\epsilon(t)} t} \, dt,$$
where $p_i(t, \epsilon(t))$ is the production rate for good $i$ at time $t$ under regime $\epsilon(t)$; $p_i^2(t, \epsilon(t))$ is the quadratic production cost for good $i$; $\sum_{i=1}^{D} p_i^2(t, \epsilon(t))$ is the total regime-dependent production cost, modeled as a quadratic function of the production rates; $y(t, \epsilon(t))$ represents the inventory levels of goods, adjusted for demand; $f_{\epsilon(t)}(y(t, \epsilon(t)))$ represents the regime-dependent inventory holding costs (modeled as convex functions $f_1(x)$ and $f_2(x)$); and $\alpha_{\epsilon(t)}$ is the regime-dependent discount rate for exponential discounting.
The stopping time $\tau$ is defined as the moment when the inventory exceeds an exogenous threshold $R$, i.e.,
$$\tau = \inf \{ t > 0 : \| y(t, \epsilon(t)) \| \geq R \}, \quad y(t, \epsilon(t)) = (y_1(t, \epsilon(t)), \dots, y_D(t, \epsilon(t))),$$
where $\| \cdot \|$ stands for the Euclidean norm.

3. Optimization Problem

The primary objective of the stochastic production planning problem is to minimize the total expected cost, which comprises both production and inventory holding costs, subject to the constraints of stochastic inventory dynamics and regime-switching parameters. This optimization problem is formulated as follows.

3.1. Optimization Objective

The objective is to determine the optimal production rates $p_1(t, \epsilon(t)), \dots, p_D(t, \epsilon(t))$ that minimize the total cost functional $J$ while satisfying the stochastic inventory dynamics. Mathematically, this is expressed as
$$\inf_{p(t, \epsilon(t)) \in \mathbb{R}^D} J(p(t, \epsilon(t))), \quad p(t, \epsilon(t)) = (p_1(t, \epsilon(t)), \dots, p_D(t, \epsilon(t))),$$
subject to the inventory dynamics:
$$dy_i(t) = p_i(t, \epsilon(t)) \, dt + \sigma_{\epsilon(t)} \, dw_i(t), \quad y_i(0) = y_i^0, \quad i = 1, \dots, D.$$
The optimization problem is solved over a finite horizon, up to the stopping time $\tau$, and incorporates the effects of regime-switching. The constraints ensure that the optimization respects the stochastic nature of the inventory dynamics and the stopping criterion at $\tau$.

3.2. Hamilton–Jacobi–Bellman Equations

To solve the optimization problem, we employ the value function approach. The value function is defined as
$$z_i(x) = \inf_{p_1, \dots, p_D} E \left[ \int_0^{\tau} \left( \sum_{j=1}^{D} p_j^2(t, \epsilon(t)) + f_{\epsilon(t)}(y(t, \epsilon(t))) \right) e^{-\alpha_{\epsilon(t)} t} \, dt \;\Big|\; y(0) = x, \ \epsilon(0) = i \right].$$
The HJB equations for the value functions $z_1(x)$ and $z_2(x)$, corresponding to the two regimes $\epsilon(t) = 1$ and $\epsilon(t) = 2$, are given by
$$\begin{cases} -a_1 z_2 + (a_1 + \alpha_1) z_1 - \dfrac{\sigma_1^2}{2} \Delta z_1 - f_1(x) = -\dfrac{1}{4} |\nabla z_1|^2, & \text{for } x \in B_R, \\[2mm] -a_2 z_1 + (a_2 + \alpha_2) z_2 - \dfrac{\sigma_2^2}{2} \Delta z_2 - f_2(x) = -\dfrac{1}{4} |\nabla z_2|^2, & \text{for } x \in B_R, \end{cases} \tag{4}$$
with the following boundary conditions:
$$z_1(x) = z_2(x) = 0, \quad \text{for } x \in \partial B_R.$$
Here, $a_1, a_2, \alpha_1, \alpha_2, \sigma_1, \sigma_2$ are regime-dependent parameters, $\Delta z_i$ is the Laplacian of $z_i(x)$ (sum of second-order partial derivatives), $B_R$ is the open ball in $\mathbb{R}^D$ ($D \geq 1$) of radius $R > 0$, and $f_1(x)$ and $f_2(x)$ are the holding cost functions in regimes 1 and 2, respectively.

Assumptions

To ensure mathematical tractability, we impose the following assumptions:
  • $f_1(x)$ and $f_2(x)$ are continuous, convex functions satisfying $f_i(x) \leq M_i |x|^2$, $i = 1, 2$;
  • $\sigma_{\epsilon(t)} > 0$ and $\alpha_{\epsilon(t)} > 0$, ensuring non-degenerate stochastic dynamics;
  • Boundary conditions: $z_{\epsilon(t)} = 0$ when $t = \tau$, $i = 1, \dots, D$.
The hypotheses on $f_1$ and $f_2$ are chosen so that
$$z_i(y) \leq C (1 + |y|^2) \quad \text{for some } C > 0,$$
and so that the running cost $J(y, p)$ is convex in $(y, p)$. Together, these conditions guarantee that each value function $z_i$ is convex in $y$.
This formulation provides the mathematical foundation for deriving the Hamilton–Jacobi–Bellman equations and solving the optimization problem.
The computational goal is to approximate the value functions z 1 ( x ) and z 2 ( x ) using numerical techniques that guarantee convergence and stability.
The next section focuses on the methodology used to obtain the solutions.

3.3. Transformation and Simplification

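To simplify the PDE system (4), we apply the change of variables $z_j(x) = -2 \sigma_j^2 \ln u_j(x)$, $j = 1, 2$, which removes the gradient terms and transforms the PDE system into
$$\begin{cases} \Delta u_1(x) = u_1(x) \left[ \dfrac{1}{\sigma_1^4} f_1(x) + \dfrac{2 (a_1 + \alpha_1)}{\sigma_1^2} \ln u_1(x) - \dfrac{2 a_1 \sigma_2^2}{\sigma_1^4} \ln u_2(x) \right], & \text{for } x \in B_R, \\[2mm] \Delta u_2(x) = u_2(x) \left[ \dfrac{1}{\sigma_2^4} f_2(x) + \dfrac{2 (a_2 + \alpha_2)}{\sigma_2^2} \ln u_2(x) - \dfrac{2 a_2 \sigma_1^2}{\sigma_2^4} \ln u_1(x) \right], & \text{for } x \in B_R, \end{cases} \tag{5}$$
with the following boundary conditions:
$$u_1(x) = u_2(x) = 1, \quad x \in \partial B_R.$$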
This transformation reduces the complexity of the system and facilitates numerical computation.
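The cancellation of the gradient terms can be checked directly. The short computation below is a sketch that assumes the substitution $z_j = -2 \sigma_j^2 \ln u_j$ (so that $z_j = 0$ corresponds to $u_j = 1$) and the HJB equation for regime $j$ with right-hand side $-\tfrac{1}{4} |\nabla z_j|^2$.

```latex
\begin{aligned}
\nabla z_j &= -2\sigma_j^2 \, \frac{\nabla u_j}{u_j},
\qquad
\Delta z_j = -2\sigma_j^2 \left( \frac{\Delta u_j}{u_j} - \frac{|\nabla u_j|^2}{u_j^2} \right), \\
-\frac{\sigma_j^2}{2} \Delta z_j + \frac{1}{4} |\nabla z_j|^2
&= \sigma_j^4 \frac{\Delta u_j}{u_j}
 - \sigma_j^4 \frac{|\nabla u_j|^2}{u_j^2}
 + \sigma_j^4 \frac{|\nabla u_j|^2}{u_j^2}
 = \sigma_j^4 \frac{\Delta u_j}{u_j}.
\end{aligned}
```

For regime 1, the HJB equation then reads $\sigma_1^4 \, \Delta u_1 / u_1 = f_1 + a_1 z_2 - (a_1 + \alpha_1) z_1 = f_1 - 2 a_1 \sigma_2^2 \ln u_2 + 2 (a_1 + \alpha_1) \sigma_1^2 \ln u_1$, and multiplying by $u_1 / \sigma_1^4$ gives the first transformed equation; regime 2 is symmetric.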

3.4. Existence and Uniqueness of Solutions

Computing the solution requires specific parameters that must be determined numerically to high accuracy. In [8], we established only the existence of these parameters. To support our computational technique and the implementation of our main results, we therefore state the following practical lemma; its proof can be found in Appendix A.1.
Lemma 1.
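For any $a_1, \alpha_1, a_2, \alpha_2, \sigma_1, \sigma_2, M_1, M_2, R \in (0, \infty)$ and $D \in \mathbb{N}^*$, there exist unique $K_1, K_2 \in (-\infty, 0)$ such that
$$\begin{cases} 4 K_1^2 + \dfrac{2 (a_1 + \alpha_1) \sigma_1^2}{\sigma_1^4} K_1 - \dfrac{M_1}{\sigma_1^4} - \dfrac{2 a_1 \sigma_2^2}{\sigma_1^4} K_2 = 0, \\[2mm] 4 K_2^2 + \dfrac{2 (a_2 + \alpha_2) \sigma_2^2}{\sigma_2^4} K_2 - \dfrac{M_2}{\sigma_2^4} - \dfrac{2 a_2 \sigma_1^2}{\sigma_2^4} K_1 = 0, \\[2mm] -\dfrac{2 (a_1 + \alpha_1) R^2}{\sigma_1^2} K_1 - 2 D K_1 + \dfrac{2 a_1 \sigma_2^2 R^2}{\sigma_1^4} K_2 \geq 0, \\[2mm] -\dfrac{2 (a_2 + \alpha_2) R^2}{\sigma_2^2} K_2 - 2 D K_2 + \dfrac{2 a_2 \sigma_1^2 R^2}{\sigma_2^4} K_1 \geq 0. \end{cases} \tag{6}$$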
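To make the lemma concrete, the following Python sketch solves the two quadratic equations of the system for $(K_1, K_2)$ by a plain Newton iteration started from a negative guess, and then checks the two inequalities. Newton's method, the starting point, and the function names are assumptions made for illustration; they are not the procedure of Appendix A.1.

```python
def solve_K(a1, alpha1, sigma1, M1, a2, alpha2, sigma2, M2,
            guess=(-1.0, -1.0), tol=1e-12, max_iter=50):
    """Newton iteration for the two quadratic equations in (K1, K2)."""
    c1 = 2.0 * (a1 + alpha1) / sigma1 ** 2
    c2 = 2.0 * (a2 + alpha2) / sigma2 ** 2
    d1 = 2.0 * a1 * sigma2 ** 2 / sigma1 ** 4
    d2 = 2.0 * a2 * sigma1 ** 2 / sigma2 ** 4
    m1 = M1 / sigma1 ** 4
    m2 = M2 / sigma2 ** 4
    K1, K2 = guess
    for _ in range(max_iter):
        F1 = 4.0 * K1 ** 2 + c1 * K1 - m1 - d1 * K2
        F2 = 4.0 * K2 ** 2 + c2 * K2 - m2 - d2 * K1
        if max(abs(F1), abs(F2)) < tol:
            break
        # Jacobian of (F1, F2) with respect to (K1, K2)
        j11, j12 = 8.0 * K1 + c1, -d1
        j21, j22 = -d2, 8.0 * K2 + c2
        det = j11 * j22 - j12 * j21
        # solve J * delta = -F by Cramer's rule
        K1 += (-F1 * j22 + F2 * j12) / det
        K2 += (-F2 * j11 + F1 * j21) / det
    return K1, K2

def inequalities_hold(K1, K2, a1, alpha1, sigma1, a2, alpha2, sigma2, R, D):
    """Check the two sign conditions of the lemma for given (K1, K2)."""
    lhs1 = (-2.0 * (a1 + alpha1) * R ** 2 / sigma1 ** 2 * K1 - 2.0 * D * K1
            + 2.0 * a1 * sigma2 ** 2 * R ** 2 / sigma1 ** 4 * K2)
    lhs2 = (-2.0 * (a2 + alpha2) * R ** 2 / sigma2 ** 2 * K2 - 2.0 * D * K2
            + 2.0 * a2 * sigma1 ** 2 * R ** 2 / sigma2 ** 4 * K1)
    return lhs1 >= 0.0 and lhs2 >= 0.0

# Symmetric unit parameters (an assumed test case): the equations reduce to
# 4K^2 + 2K - 1 = 0, whose negative root is -(1 + sqrt(5)) / 4.
K1, K2 = solve_K(1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0)
print(K1, K2, inequalities_hold(K1, K2, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, R=1.0, D=1))
```

A negative starting guess matters here because the quadratic equations also admit roots outside $(-\infty, 0)$, and Newton's method converges to whichever root is nearby.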
We are now ready to adapt the proof of the theorem in [8], integrating the necessary steps to address the numerical implications.
Theorem 1.
Let $K_1, K_2 \in (-\infty, 0)$ be the unique solutions of the nonlinear system (6), and set
$$B_1(x) = -2 \sigma_1^2 K_1 (R^2 - |x|^2) \quad \text{and} \quad B_2(x) = -2 \sigma_2^2 K_2 (R^2 - |x|^2).$$
The system of Equation (4) has a unique positive convex solution $(z_1, z_2) \in [C^2(B_R) \cap C(\bar{B}_R)]^2$, with value functions $z_1$ and $z_2$ such that
$$z_1(x) \leq B_1(x) \quad \text{and} \quad z_2(x) \leq B_2(x), \quad \text{for all } x \in \bar{B}_R.$$
Proof. 
Our constructive approach aims to develop a computational scheme for numerical approximations of the solution. Since the system (4) is equivalent to (5), we will focus on the latter. The approach involves four key steps.
  • Step 1: Sub-solution and Super-solution Construction.
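The main problem reduces to constructing functions $(\underline{u}_1, \underline{u}_2)$ as sub-solutions (and $(\bar{u}_1, \bar{u}_2)$ as super-solutions) for the system (5) that satisfy the following inequalities:
$$\begin{cases} \Delta \underline{u}_1(x) \geq \underline{u}_1(x) \left[ \dfrac{1}{\sigma_1^4} f_1(x) + \dfrac{2 (a_1 + \alpha_1)}{\sigma_1^2} \ln \underline{u}_1(x) - \dfrac{2 a_1 \sigma_2^2}{\sigma_1^4} \ln \underline{u}_2(x) \right], & \text{for } x \in B_R, \\[2mm] \Delta \underline{u}_2(x) \geq \underline{u}_2(x) \left[ \dfrac{1}{\sigma_2^4} f_2(x) + \dfrac{2 (a_2 + \alpha_2)}{\sigma_2^2} \ln \underline{u}_2(x) - \dfrac{2 a_2 \sigma_1^2}{\sigma_2^4} \ln \underline{u}_1(x) \right], & \text{for } x \in B_R, \\[2mm] (\underline{u}_1(x), \underline{u}_2(x)) = (1, 1), & \text{for } x \in \partial B_R, \end{cases}$$
(and similarly for the super-solutions, with the inequalities reversed to $\leq$). For sub-solutions, choose
$$(\underline{u}_1(x), \underline{u}_2(x)) = \left( e^{K_1 (R^2 - |x|^2)}, e^{K_2 (R^2 - |x|^2)} \right),$$
where
$$K_1, K_2 \in (-\infty, 0)$$
are solutions of (6). For super-solutions, choose
$$(\bar{u}_1(x), \bar{u}_2(x)) = (1, 1).$$
Clearly,
$$\underline{u}_1(x) \leq \bar{u}_1(x) \quad \text{and} \quad \underline{u}_2(x) \leq \bar{u}_2(x), \quad \text{for any } x \in \bar{B}_R.$$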
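  • Step 2: Approximation Scheme.
Construct sequences $\{ (u_1^k, u_2^k) \}_{k \in \mathbb{N}}$ via monotone Picard iterations, starting with
$$(u_1^0, u_2^0) = (\bar{u}_1, \bar{u}_2), \quad x \in \bar{B}_R.$$
Define the iteration
$$\begin{cases} \Delta u_1^k + \Lambda_1 u_1^k = g_1(x, u_1^{k-1}, u_2^{k-1}) + \Lambda_1 u_1^{k-1}, & \text{for } x \in B_R, \\ \Delta u_2^k + \Lambda_2 u_2^k = g_2(x, u_1^{k-1}, u_2^{k-1}) + \Lambda_2 u_2^{k-1}, & \text{for } x \in B_R, \end{cases}$$
where, for
$$S_1 = e^{K_1 R^2}, \quad S_2 = e^{K_2 R^2}, \quad \text{and} \quad S = 1,$$
the functions
$$g_1 : \bar{B}_R \times [S_1, S] \times [S_2, S] \to \mathbb{R} \quad \text{and} \quad g_2 : \bar{B}_R \times [S_2, S] \times [S_1, S] \to \mathbb{R}$$
are defined by
$$\begin{aligned} g_1(x, t, s) &= \frac{1}{\sigma_1^4} f_1(x) \, t + \frac{2 (a_1 + \alpha_1)}{\sigma_1^2} \, t \ln t - \frac{2 a_1 \sigma_2^2}{\sigma_1^4} \, t \ln s, \\ g_2(x, t, s) &= \frac{1}{\sigma_2^4} f_2(x) \, s + \frac{2 (a_2 + \alpha_2)}{\sigma_2^2} \, s \ln s - \frac{2 a_2 \sigma_1^2}{\sigma_2^4} \, s \ln t. \end{aligned}$$
Since $g_1$ (respectively, $g_2$) is continuous with respect to the first variable in $\bar{B}_R$ and continuously differentiable with respect to the second and third variables in
$$[S_1, S] \times [S_2, S]$$
(respectively, in
$$[S_2, S] \times [S_1, S]),$$
we can choose
$$\Lambda_1, \Lambda_2 \in (-\infty, 0)$$
such that
$$\Lambda_1 \leq \frac{g_1(x, t_1, s) - g_1(x, t_2, s)}{t_2 - t_1} \quad \left( = -\frac{\partial g_1}{\partial t} \right),$$
respectively
$$\Lambda_2 \leq \frac{g_2(x, t, s_1) - g_2(x, t, s_2)}{s_2 - s_1} \quad \left( = -\frac{\partial g_2}{\partial s} \right),$$
for every $t_1, t_2$ with
$$\underline{u}_1 \leq t_2 < t_1 \leq \bar{u}_1, \quad \underline{u}_2 \leq s \leq \bar{u}_2 \quad \text{and} \quad x \in \bar{B}_R,$$
respectively, for every $s_1, s_2$ with
$$\underline{u}_2 \leq s_2 < s_1 \leq \bar{u}_2, \quad \underline{u}_1 \leq t \leq \bar{u}_1 \quad \text{and} \quad x \in \bar{B}_R,$$
to ensure monotonicity
$$(u_1^{k-1} \geq u_1^k, \ u_2^{k-1} \geq u_2^k) \Longrightarrow (u_1^k \geq u_1^{k+1}, \ u_2^k \geq u_2^{k+1}),$$
via mathematical induction and the maximum principle.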
  • Step 3: Convergence and Uniqueness.
The sequences $\{ (u_1^k, u_2^k) \}_{k \in \mathbb{N}}$ converge monotonically to bounded limits:
$$\lim_{k \to \infty} (u_1^k(x), u_2^k(x)) = (u_1(x), u_2(x)), \quad x \in \bar{B}_R.$$
Standard bootstrap arguments ensure that
$$(u_1^k, u_2^k) \to (u_1, u_2) \quad \text{in } [C^2(B_R) \cap C(\bar{B}_R)]^2,$$
and $(u_1, u_2)$ solves (5) with
$$\underline{u}_1(x) \leq u_1(x) \leq \bar{u}_1(x), \quad \underline{u}_2(x) \leq u_2(x) \leq \bar{u}_2(x), \quad x \in \bar{B}_R.$$
Uniqueness follows from the maximum principle, i.e., any two positive solutions $(u_1, u_2)$ and $(\tilde{u}_1, \tilde{u}_2)$ coincide.
  • Step 4: Convexity.
Since $J$ is convex in $(y, p)$, the state equation is affine in $p$, and $z_i$ grows at most quadratically and is continuous, it follows that the function $z_i$ is convex on $\mathbb{R}^D$ (see [8]).
   □
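The constructive steps of the proof can be sketched numerically in the case $D = 1$. The Python code below is an illustrative finite-difference version of the monotone Picard iteration, not the code of Appendix A.3: it assumes $f_i(x) = M_i x^2$, a uniform grid on $[-R, R]$, a Thomas-algorithm solver for the tridiagonal systems, and the conservative choice $\Lambda_i = -(M_i R^2 / \sigma_i^4 + 2 (a_i + \alpha_i) / \sigma_i^2 - 2 a_i \sigma_j^2 K_j R^2 / \sigma_i^4)$ with $j \neq i$, which bounds $-\partial g_i / \partial u_i$ from below on the range between the sub- and super-solutions.

```python
import math

def solve_tridiagonal(off, diag, rhs):
    """Thomas algorithm for a tridiagonal system with constant coefficients."""
    n = len(rhs)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = off / diag
    d[0] = rhs[0] / diag
    for i in range(1, n):
        denom = diag - off * c[i - 1]
        c[i] = off / denom
        d[i] = (rhs[i] - off * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def monotone_iteration(a, alpha, sigma, M, K, R=1.0, n=40, tol=1e-10, max_iter=5000):
    """a, alpha, sigma, M, K are pairs (regime 1, regime 2); D = 1, f_i(x) = M_i x^2."""
    h = 2.0 * R / n
    xs = [-R + j * h for j in range(n + 1)]
    c = [2.0 * (a[i] + alpha[i]) / sigma[i] ** 2 for i in (0, 1)]
    d = [2.0 * a[i] * sigma[1 - i] ** 2 / sigma[i] ** 4 for i in (0, 1)]
    # negative shift Lambda_i, bounded above by -dg_i/d(own) on [sub-solution, 1]
    lam = [-(M[i] * R * R / sigma[i] ** 4 + c[i] - d[i] * K[1 - i] * R * R)
           for i in (0, 1)]

    def g(i, x, own, other):
        return own * (M[i] * x * x / sigma[i] ** 4
                      + c[i] * math.log(own) - d[i] * math.log(other))

    u = [[1.0] * (n + 1), [1.0] * (n + 1)]  # start from the super-solution (1, 1)
    iteration = 0
    for iteration in range(1, max_iter + 1):
        new = []
        for i in (0, 1):
            own, other = u[i], u[1 - i]
            rhs = [g(i, xs[j], own[j], other[j]) + lam[i] * own[j]
                   for j in range(1, n)]
            # move the Dirichlet value u = 1 at both end points to the right-hand side
            rhs[0] -= 1.0 / h ** 2
            rhs[-1] -= 1.0 / h ** 2
            interior = solve_tridiagonal(1.0 / h ** 2, -2.0 / h ** 2 + lam[i], rhs)
            new.append([1.0] + interior + [1.0])
        change = max(abs(new[i][j] - u[i][j]) for i in (0, 1) for j in range(n + 1))
        u = new
        if change < tol:
            break
    return xs, u[0], u[1], iteration

# Symmetric unit parameters (assumed): K is the negative root of 4K^2 + 2K - 1 = 0
K_sym = -(1.0 + math.sqrt(5.0)) / 4.0
xs, u1, u2, n_iter = monotone_iteration((1.0, 1.0), (1.0, 1.0), (1.0, 1.0),
                                        (1.0, 1.0), (K_sym, K_sym))
z1 = [-2.0 * math.log(v) for v in u1]  # back-transform with sigma_1 = 1
print(n_iter, max(z1))
```

Both regimes are updated from the previous iterate (a Jacobi-style sweep), which mirrors the iteration in Step 2; the iterates decrease from the super-solution and stay above the sub-solution $e^{K_i (R^2 - x^2)}$.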

4. Optimal Control

The optimal control represents the production rate policy that minimizes the total expected cost functional. It is derived using the Hamilton–Jacobi–Bellman (HJB) equations and is directly related to the gradients of the value functions. Below, we delve deeper into its formulation and derivation.

4.1. Optimal Production Policy

The minimization over the control inside the HJB equations involves the gradient of the value function with respect to the inventory levels $y_i$, and this gradient defines the optimal control.
Hence, for each economic good $i = 1, \dots, D$, the optimal production rate $p_i^*(t, \epsilon(t))$ is given by
$$p_i^*(t, \epsilon(t)) = -\frac{1}{2} \frac{\partial z_{\epsilon(t)}}{\partial y_i},$$
where $z_{\epsilon(t)}$ is the value function corresponding to regime $\epsilon(t)$ and $\dfrac{\partial z_{\epsilon(t)}}{\partial y_i}$ denotes the partial derivative of the value function with respect to the inventory level of good $i$.
This result is obtained by solving the first-order optimality condition derived from the HJB equations.
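The first-order condition can be written out explicitly. The following worked step is a sketch that assumes the control enters the HJB equation of regime $i$ only through the pointwise minimization of $|p|^2 + p \cdot \nabla z_i(y)$ over $p \in \mathbb{R}^D$.

```latex
\min_{p \in \mathbb{R}^D} \left\{ |p|^2 + p \cdot \nabla z_i(y) \right\}
\;\Longrightarrow\;
2 p^* + \nabla z_i(y) = 0
\;\Longrightarrow\;
p^* = -\tfrac{1}{2} \nabla z_i(y),
\qquad
|p^*|^2 + p^* \cdot \nabla z_i(y) = -\tfrac{1}{4} |\nabla z_i(y)|^2 .
```

The minimal value $-\tfrac{1}{4} |\nabla z_i|^2$ is the quadratic gradient term that appears on the right-hand side of the HJB system.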

Economic Interpretation

The negative gradient of the value function implies that the optimal production rate decreases as the marginal cost of inventory increases. Intuitively, higher inventory levels (positive gradient) lead to a reduction in production to avoid excess costs, and lower inventory levels (negative gradient) necessitate an increase in production to meet anticipated demand.

4.2. Verification of Optimality

The verification of optimality establishes that the control p * ( t , ϵ ( t ) ) , derived from the Hamilton–Jacobi–Bellman (HJB) equations, is indeed the optimal control that minimizes the cost functional. This involves proving the supermartingale property of the value function for all admissible controls and the martingale property for the optimal control.

4.2.1. The Stochastic Process

To verify that $p_i^*(t, \epsilon(t))$ is indeed the optimal control, we use the supermartingale and martingale properties associated with the value function $z_{\epsilon(t)}(y)$. Let the stochastic process $Z^p(t)$ be defined as
$$Z^p(t) = -e^{-\alpha_{\epsilon(t)} t} z_{\epsilon(t)}(y(t, \epsilon(t))) - \int_0^t \left[ \sum_{i=1}^{D} p_i^2(s, \epsilon(s)) + f_{\epsilon(s)}(y(s, \epsilon(s))) \right] e^{-\alpha_{\epsilon(s)} s} \, ds, \tag{10}$$
where $z_{\epsilon(t)}(y)$ is the value function for regime $\epsilon(t)$, $p_i(t, \epsilon(t))$ are the production rates (control variables), $f_{\epsilon(t)}(y)$ is the holding cost function, and $\alpha_{\epsilon(t)}$ is the regime-dependent discount rate.
Using Itô’s lemma, the stochastic differential of $Z^p(t)$ satisfies
$$dZ^p(t) = e^{-\alpha_{\epsilon(t)} t} \left[ \alpha_{\epsilon(t)} z_{\epsilon(t)}(y) - \mathcal{L} z_{\epsilon(t)}(y) - \sum_{i=1}^{D} p_i^2(t, \epsilon(t)) - f_{\epsilon(t)}(y) \right] dt + dM(t),$$
where $M(t)$ is a martingale term, and $\mathcal{L} z_{\epsilon(t)}(y)$ denotes the generator of the controlled Markov-modulated diffusion applied to the value function (here, its second-order part is the Laplacian operator).

4.2.2. Supermartingale and Martingale Properties

For the admissible controls $p_i(t, \epsilon(t))$, $Z^p(t)$ is a supermartingale because $\mathcal{L} z_{\epsilon(t)}(y)$ satisfies the following HJB inequality:
$$\alpha_{\epsilon(t)} z_{\epsilon(t)}(y) - \mathcal{L} z_{\epsilon(t)}(y) - \sum_{i=1}^{D} p_i^2(t, \epsilon(t)) - f_{\epsilon(t)}(y) \leq 0.$$
For the optimal control $p^*(t, \epsilon(t))$, $Z^{p^*}(t)$ is a martingale because $\mathcal{L} z_{\epsilon(t)}(y)$ satisfies the equality condition:
$$\alpha_{\epsilon(t)} z_{\epsilon(t)}(y) - \mathcal{L} z_{\epsilon(t)}(y) - \sum_{i=1}^{D} \left( p_i^* \right)^2(t, \epsilon(t)) - f_{\epsilon(t)}(y) = 0.$$
The optimal control $p^*(t, \epsilon(t))$ ensures that $Z^{p^*}(t)$ is a martingale, while any other admissible control $p(t, \epsilon(t))$ results in $Z^p(t)$ being a supermartingale. This validates the optimality of $p^*(t, \epsilon(t))$.

4.2.3. Boundary Conditions and Optimality

The boundary condition
$$z_{\epsilon(t)}(y) = 0 \quad \text{for } y \in \partial B_R,$$
where $B_R$ is the ball of radius $R$, ensures that the value-function term in $Z^p(t)$ vanishes at the stopping time $\tau$. Thus, for $t \geq \tau$, the contribution to the cost functional ceases, confirming the proper termination of production when inventory exceeds the threshold $R$.
The optimality of p * ( t , ϵ ( t ) ) is formalized through the following theorem:
Theorem 2.
Let $Z^p(t)$ be the stochastic process defined in (10). The control $p^*(t, \epsilon(t))$, derived from the HJB equations, minimizes the cost functional $J(p_1, \dots, p_D)$ and satisfies
$$Z^{p^*}(0) = -z_{\epsilon(0)}(y(0)).$$
Proof. 
  • For $Z^p(t)$ under the admissible control $p(t, \epsilon(t))$,
    $$E[Z^p(\tau)] \leq Z^p(0),$$
    since $Z^p(t)$ is a supermartingale.
  • For $Z^{p^*}(t)$ under the optimal control $p^*(t, \epsilon(t))$,
    $$E[Z^{p^*}(\tau)] = Z^{p^*}(0),$$
    since $Z^{p^*}(t)$ is a martingale.
  • From the boundary condition $z_{\epsilon(t)}(y) = 0$ on $\partial B_R$, it follows that, under $p^*$,
    $$E \int_0^{\tau} \left[ \sum_{i=1}^{D} \left( p_i^* \right)^2(s, \epsilon(s)) + f_{\epsilon(s)}(y(s, \epsilon(s))) \right] e^{-\alpha_{\epsilon(s)} s} \, ds = z_{\epsilon(0)}(y(0)).$$
    Thus, the control $p^*(t, \epsilon(t))$ minimizes the cost functional $J(p_1, \dots, p_D)$ and satisfies the optimality condition.
   □

4.2.4. Theoretical Properties of the Optimal Control and Inventory Process

The theoretical properties of the optimal control are as follows: the optimal control $p_i^*(t, \epsilon(t))$ is Lipschitz continuous in $y$, ensuring stability in production rates under small changes in inventory levels; the control policy is adaptive, responding dynamically to regime changes governed by the Markov chain $\epsilon(t)$; and the quadratic nature of the cost functional guarantees uniqueness of the optimal control.
The inventory process is modeled by the stochastic differential equation (SDE)
$$dy(t) = p^*(t, \varepsilon(t)) \, dt + \sigma_{\varepsilon(t)} \, dW(t), \quad y(0) = x_0 \in \mathbb{R}^D,$$
where the economic regime $\varepsilon(t)$ takes values in $\{1, 2\}$ with the following transitions:
$$P(\varepsilon(t + dt) = 2 \mid \varepsilon(t) = 1) = a_1 \, dt \quad \text{and} \quad P(\varepsilon(t + dt) = 1 \mid \varepsilon(t) = 2) = a_2 \, dt.$$
In the simulation, one uses a time step $\Delta t$ and performs the following update:
$$y(t + \Delta t) = y(t) + p^*(y(t), \epsilon(t)) \, \Delta t + \sigma_{\epsilon(t)} \sqrt{\Delta t} \, \xi,$$
where $\xi$ is an independent standard normal random variable. The optimal control is obtained by interpolating the computed value function gradients:
$$p_i^*(y, \epsilon(t)) = -\frac{1}{2} \frac{\partial z_{\epsilon(t)}}{\partial y_i}, \quad (i = 1, \dots, D, \ \epsilon(t) \in \{1, 2\}),$$
where $z_{\epsilon(t) = 1} = z_1$ and $z_{\epsilon(t) = 2} = z_2$ solve the coupled HJB system in each regime.
The Euler–Maruyama simulation is executed until the stopping time
$$\tau = \inf \{ t > 0 : |y(t)| \geq R \},$$
at which point the production is halted.
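A minimal Python sketch of this Euler–Maruyama scheme for $D = 1$ is given below. It is illustrative only: the function `simulate_inventory` and its arguments are hypothetical names, and the gradient passed in the usage lines is a stand-in derived from the upper bound $B_i(x) = -2 \sigma_i^2 K_i (R^2 - x^2)$ of Theorem 1 with assumed values of $\sigma_i$, $a_i$, and $K_i$; in practice it would be replaced by an interpolation of the numerically computed $z_i$.

```python
import math
import random

def simulate_inventory(grad_z, sigma, a, y0=0.0, regime0=1, R=1.0,
                       dt=1e-3, max_steps=100_000, seed=0):
    """Euler-Maruyama path of dy = p* dt + sigma dW with two-state regime switching.

    grad_z(y, regime) returns dz_regime/dy; sigma and a are dicts keyed by regime.
    Returns a list of (time, inventory, regime) up to the exit from (-R, R).
    """
    rng = random.Random(seed)
    t, y, regime = 0.0, y0, regime0
    path = [(t, y, regime)]
    for _ in range(max_steps):
        if abs(y) >= R:  # stopping time tau reached: production is halted
            break
        p_star = -0.5 * grad_z(y, regime)  # feedback law p* = -(1/2) dz/dy
        y += p_star * dt + sigma[regime] * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        # regime i switches with probability a_i * dt over one time step
        if rng.random() < a[regime] * dt:
            regime = 2 if regime == 1 else 1
        t += dt
        path.append((t, y, regime))
    return path

# Assumed illustrative inputs (not taken from the paper's tables)
sigma = {1: 1.0, 2: 0.5}
a = {1: 0.5, 2: 1.5}
K = {1: -0.8, 2: -0.6}

def placeholder_gradient(y, regime):
    # derivative of the bound B_i(y) = -2 sigma_i^2 K_i (R^2 - y^2); stand-in only
    return 4.0 * sigma[regime] ** 2 * K[regime] * y

path = simulate_inventory(placeholder_gradient, sigma, a)
print(len(path), path[-1])
```

The switching test uses the first-order approximation $a_i \Delta t$ of the jump probability, so the time step should satisfy $a_i \Delta t \ll 1$.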

5. Sensitivity, Model Analysis, and Visualization

The proofs of the results in this section are detailed in reference [8] and are therefore omitted here. The numerical data are visualized in accordance with these results, illustrating the strength of the mathematical conclusions and of the numerical implementation.

5.1. Sensitivity Analysis

The sensitivity analysis shows the following impacts:
Theorem 3.
If $\alpha_1 = \alpha_2$ and $f_1(x) = f_2(x)$ for all $x \in B_R$, then higher volatility ($\sigma_1 > \sigma_2$) increases the value function:
$$z_1(x) \geq z_2(x) \quad \text{for all } x \in \bar{B}_R.$$
Theorem 4.
If $\sigma_1 = \sigma_2$ and $f_1(x) = f_2(x)$ for all $x \in B_R$, then higher discount rates ($\alpha_1 < \alpha_2$) decrease the value function:
$$z_1(x) \geq z_2(x), \quad \text{for all } x \in \bar{B}_R.$$
Theorem 5.
If $\alpha_1 = \alpha_2$ and $\sigma_1 = \sigma_2$, then higher holding costs ($f_1(x) > f_2(x)$) increase the value function:
$$z_1(x) \geq z_2(x), \quad \text{for all } x \in \bar{B}_R.$$

5.2. Model Comparisons

For models with and without regime-switching, we have
Theorem 6.
If $\sigma_1 > \sigma_2$, $\alpha_1 < \alpha_2$, and $f_1(x) > f_2(x)$ for all $x \in B_R$, then
$$\bar{z}_1(x) \geq z_1(x) \geq z_2(x) \geq \underline{z}_2(x), \quad \text{for all } x \in B_R,$$
where $\bar{z}_1(x)$ and $\underline{z}_2(x)$ correspond to the value functions of a model without regime-switching.

5.3. Visualization of the Solution in the Case D = 1

In this section, we connect our theoretical results (Theorems 3–6) with concrete numerical experiments in the case D = 1 . We first present a compact tabular summary of the sensitivity analysis statements, then illustrate the transformed solutions and value functions, and finally show the time dynamics of the optimal control and inventory. All plots were generated with the annotated Python code in Appendix A.3, which can be adapted to other parameter choices so long as the monotonicity and convergence assumptions remain valid.

5.3.1. Summary of Sensitivity Results

Table 1 collects the four main sensitivity comparisons of Sections 5.1 and 5.2, restating the impact of parameter changes on the regime-dependent value functions.
Next, we give a concise overview of the computational workflow.

5.3.2. Workflow Overview

  • Model specification ($\{ a_i, \alpha_i, M_i, \sigma_i, f_i \}$).
  • HJB PDE derivation and logarithmic transform $z_i \mapsto u_i$.
  • Monotone iteration (Picard) for $(u_1, u_2)$ on $\bar{B}_R$.
  • Back-transform to obtain $z_1, z_2$ and compute feedback law $p^*(y)$.
  • Simulate inventory SDE under optimal control and regime-switching.
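The back-transform and feedback steps of this workflow can be sketched in a few lines of Python for $D = 1$. The helper names `back_transform` and `feedback_control` are hypothetical, and the sketch assumes the relation $z_i = -2 \sigma_i^2 \ln u_i$ together with a uniform grid of spacing $h$; the closed-form $u$ used in the usage lines is only a test input, not a solution of the PDE system.

```python
import math

def back_transform(u, sigma):
    """Recover z = -2 sigma^2 ln(u) on the grid from the transformed variable u."""
    return [-2.0 * sigma ** 2 * math.log(v) for v in u]

def feedback_control(z, h):
    """p*(x_j) = -(1/2) dz/dx: central differences inside, one-sided at the ends."""
    n = len(z)
    p = [0.0] * n
    for j in range(1, n - 1):
        p[j] = -0.5 * (z[j + 1] - z[j - 1]) / (2.0 * h)
    p[0] = -0.5 * (z[1] - z[0]) / h
    p[-1] = -0.5 * (z[-1] - z[-2]) / h
    return p

# Test input (assumed): u = exp(K (R^2 - x^2)) with K = -0.5, sigma = 1, R = 1,
# for which z = 1 - x^2 and the interior feedback equals x.
R, n, K = 1.0, 20, -0.5
h = 2.0 * R / n
xs = [-R + j * h for j in range(n + 1)]
u = [math.exp(K * (R * R - x * x)) for x in xs]
z = back_transform(u, sigma=1.0)
p = feedback_control(z, h)
print(p[n // 2], p[n // 2 + 5])
```

Central differences are exact for quadratic profiles, which makes this closed-form input a convenient check before the routine is applied to the iterated solution.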
Figure 1 depicts the end-to-end computational pipeline:
We now present four detailed case studies, one per theorem, each with its own parameter table, plots, and discussion of practical relevance.

5.3.3. Case Study: Volatility Variation (Theorem 3)

To connect our theoretical model with practical applications (see [4]), we vary certain parameters that typically arise in real-world problems. Table 2 lists these inputs.
Figure 2a,b show the numerically computed transformed solutions $u_1, u_2$ (so that $z_i = -2 \sigma_i^2 \ln u_i$) alongside the regime-dependent value functions.
This choice satisfies the hypotheses of Theorem 3 and ensures that volatility differences dominate the ordering of z 1 and z 2 .
Figure 3a,b display a sample path under one realization of the regime-switching process.
Interpretation
In a microgrid context, $a_i$ and $\sigma_i$ model the regime-switching intensity and the renewable generation uncertainty, while $f_i(x) = x^2$ penalizes inventory deviations. Theorem 3 predicts $z_1 \geq z_2$ when $\sigma_1 > \sigma_2$, which is confirmed by both the static profiles and the dynamic simulation. The negative $K_i$ and $\Lambda_i$ ensure convexity and damping in the transformed PDEs (see Lemma 1 and Theorem 1).

5.3.4. Case Study: Discount Rate Variation (Theorem 4)

To illustrate Theorem 4, we choose parameters motivated by a multi-product, multi-machine manufacturing setting [5] (Table 3).
Figure 4a,b plot the transformed variables $u_1, u_2$ and the resulting value functions $z_1, z_2$ along a radial slice in $\bar{B}_4$.
Notice that $z_1 \geq z_2$ throughout, confirming the result in Theorem 4 (Figure 5).
Interpretation
The lower discount rate in regime 1 increases the present value of future costs, driving $z_1$ above $z_2$ in line with Theorem 4. The parameters $a_i$ and $\alpha_i$ reflect the distinct operational regimes and production dynamics, while the equal volatilities ($\sigma_1 = \sigma_2 = 1$) capture the uncertainty inherent in the system. The quadratic cost functions $f_i(x) = 2 x^2$ impose a steep penalty on deviations, thereby driving the system to minimize inventory surplus and backlogs. These values serve as damping and sensitivity factors in the PDE framework and validate the stability and accuracy of our numerical scheme.

5.3.5. Case Study: Holding Cost Variation (Theorem 5)

To validate Theorem 5, we adopt parameters for a flexible manufacturing system [9] (Table 4).
Figure 6a,b show the transformed solutions and value functions, demonstrating $z_1 \ge z_2$ when $f_1 > f_2$.
Next, we characterize the time evolution of the optimal production rate and inventory level under varying holding cost regimes (Figure 7).
Interpretation
The steeper penalty in regime 1 elevates $z_1$ over $z_2$, in full agreement with Theorem 5. These parameters model a production environment where the drift coefficients $a_i$ and switching intensities $\alpha_i$ reflect dynamic operational regimes, while $M_i$ and $\sigma_i$ capture the production capacities and uncertainties inherent in the system. The quadratic cost functions $f_1$ and $f_2$ impose steeper penalties for deviations in production levels, thereby promoting a robust control strategy.

5.3.6. Case Study 4: Automotive Manufacturing (Theorem 6)

We adopt a parameter set inspired by modern automotive production (e.g., companies like Dacia in Romania), where planning must react to volatile demand and supply chain risks.
This case study shows how to choose and interpret each model parameter in practice and how the two regimes arise in an automotive production context (e.g., the Dacia Logan at Mioveni, Romania).
Parameter Forms and Existence Conditions
To invoke Theorem 6, we must ensure
$$f_i(x) = M_i x^2, \quad \alpha_i > 0, \quad \sigma_i > 0, \quad a_i > 0.$$
Here
  • Quadratic holding cost: We model inventory holding costs by
    $f_i(x) = M_i x^2$.
    A quadratic form captures rising marginal costs (e.g., storage, insurance) and the scalar $M_i$ bounds the curvature.
  • Positive discount rates: $\alpha_i > 0$ ensures that future costs are appropriately down-weighted, guaranteeing a finite value function.
  • Non-degenerate volatility: $\sigma_i > 0$ models random demand or supply chain shocks in regime $i$.
  • Irreducible regimes: $a_i > 0$ are the Markov chain transition intensities, so the expected sojourn time in regime $i$ is $1/a_i$ time units.
Regime Interpretation
We assume two operational regimes driven by market and supply chain conditions:
  • Regime 1 (High Demand): $a_1 = 0.6$, on average 1.7 months until demand cools; $\sigma_1 = 1.0$, large volatility from rush orders; $\alpha_1 = 0.3$, lower discounting of near-term costs; $M_1 = 5$, steep holding cost penalty to curb overstocking.
  • Regime 2 (Low Demand): $a_2 = 0.9$, on average 1.1 months until demand rebounds; $\sigma_2 = 0.3$, stable production; $\alpha_2 = 0.8$, higher discounting of distant costs; $M_2 = 1$, mild inventory penalty to maintain minimal safety stock.
Interpreting the Transition Rate $a_1 = 0.6$
In our two-state Markov model, the time spent in regime 1 (high demand) is exponentially distributed with rate parameter $a_1$. By standard properties of the exponential law, the expected sojourn time in regime 1 is
$$E[\tau_1] = \frac{1}{a_1} = \frac{1}{0.6} \approx 1.67 \text{ months}.$$
Thus, once the system enters the high-demand regime, it will, on average, remain there for about 1.7 months before switching back to the low-demand regime. This interpretation allows practitioners to translate the abstract rate $a_1$ directly into a familiar planning horizon: about seven weeks of sustained peak conditions under regime 1.
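The same conversion can be checked numerically. The following is a minimal sketch, assuming the time unit is months as in the text; the helper names `mean_sojourn` and `stationary_two_state` are illustrative and are not taken from the article's code. It also reports the long-run share of time spent in each regime, which for a two-state chain follows from the balance relation $\pi_1 a_1 = \pi_2 a_2$.

```python
def mean_sojourn(rate: float) -> float:
    """Expected holding time of an exponential clock with the given rate."""
    if rate <= 0:
        raise ValueError("transition rate must be positive")
    return 1.0 / rate


def stationary_two_state(a1: float, a2: float) -> tuple[float, float]:
    """Long-run fractions of time in regimes 1 and 2 for exit rates a1, a2."""
    total = a1 + a2
    # Balance pi1 * a1 = pi2 * a2 together with pi1 + pi2 = 1.
    return a2 / total, a1 / total


a1, a2 = 0.6, 0.9  # transition rates of the automotive case study
tau1, tau2 = mean_sojourn(a1), mean_sojourn(a2)
pi1, pi2 = stationary_two_state(a1, a2)
print(f"E[tau_1] = {tau1:.2f} months, E[tau_2] = {tau2:.2f} months")
print(f"long-run share: regime 1 = {pi1:.2f}, regime 2 = {pi2:.2f}")
```

With these rates the high-demand regime lasts longer per visit, so it also accounts for the larger share of time in the long run.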
Table 5 displays the concrete numbers.
Numerical Implementation
  • Solve the transformed PDE system (5) on the ball $\{|y| \le R\}$ by monotone Picard iteration, initializing with the super-solution $u_i^{(0)}(y) = 1$.
  • Back-transform to obtain the following value functions:
    $$z_i(y) = 2\sigma_i^2 K_i \ln u_i(y), \quad i = 1, 2,$$
    where $K_i < 0$ are the unique roots from Lemma 1.
  • Compute the feedback control
    $$p_i^*(y, j) = -\frac{1}{2}\frac{\partial z_j}{\partial y_i}(y), \quad j \in \{1, 2\}.$$
  • (Optional) Validate by simulating $N = 1000$ Euler–Maruyama paths of
    $$dy(t) = p^*(y(t), \varepsilon(t))\,dt + \sigma_{\varepsilon(t)}\,dW(t)$$
    until $|y| \ge R$.
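The feedback step can be sketched in a few lines. This is a minimal illustration, assuming the one-dimensional case and the feedback law $p^* = -\frac{1}{2} z'$; the function name `feedback_control` and the quadratic profile used to exercise it are hypothetical stand-ins, not output of the article's solver.

```python
import numpy as np


def feedback_control(x: np.ndarray, z: np.ndarray) -> np.ndarray:
    """Approximate p*(x) = -0.5 * z'(x) on the grid x by finite differences."""
    return -0.5 * np.gradient(z, x, edge_order=2)


# Hypothetical value-function profile, used only to exercise the function:
# z(x) = c * (R^2 - x^2) vanishes at |x| = R, matching the boundary value u = 1.
R, c = 4.0, 0.7
x = np.linspace(-R, R, 201)
z = c * (R**2 - x**2)
p = feedback_control(x, z)  # analytically p*(x) = c * x for this profile
print(p[:3], p[-3:])
```

In practice `z` would be the back-transformed array produced by the monotone iteration, one array per regime, and the same call would be made for each.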
Convergence and Interpretation
Executing the Python script in Appendix A.3 yields the following figures.
Figure 8 and Figure 9 demonstrate the following:
  • Rapid, uniform convergence of $(u_i, z_i)$ at all grid nodes, as predicted by Theorem 6.
  • Regime-tailored controls: During high demand ($\sigma_1$ large; $\alpha_1$ small), the policy pre-emptively ramps production; during low demand, it throttles output to preserve cash.
  • Theoretical bounds: The computed $z_i$ lie strictly between the bounds that underlie the existence proof.
By following this parameter selection recipe and interpreting each $a_i$, $\alpha_i$, $M_i$, and $\sigma_i$ in business terms, a practitioner can instantiate Theorems 3–6 for their own two-regime production inventory problem.

5.3.7. Implementation Notes

The full Python code (Appendix A.3) is annotated step by step: each function call is commented to clarify mesh setup, PDE discretization, monotone iteration, and post-processing. For the above examples, the annotated Python routine prints the nonlinear solver settings and the convergence statistics compiled in Table 6.
The negative $K_i$ and $\Lambda_i$ guarantee convexity and damping in the transformed PDEs, while rapid global convergence confirms the method’s suitability for real-time production planning under uncertainty.

5.3.8. Discussion and Practical Insights

Across all four experiments, the empirical ordering $z_1 \ge z_2$ agrees with our theorems, which indicates that the numerical scheme is robust. In automotive production planning, rapidly changing regimes (peak/off peak, supply shocks) can be managed by our regime-switching feedback law $p^*(y)$, ensuring inventory targets are met with minimal cost.
Future work should extend to non-quadratic costs, correlated Brownian motions (see [12], for the resulting system), and higher-dimensional product portfolios.
Remark 1.
It is important to highlight that specific parameters require algorithmic application, each with an associated margin of error. In all the above considered scenarios, our theoretical results establish that
$$z_1(x) \le B_1(x) \quad \text{and} \quad z_2(x) \le B_2(x), \quad \text{for all } x \in \bar{B}_R,$$
where the inequalities serve as a foundational guideline. In cases where these conditions are violated, the initial data must undergo adjustments to ensure the value functions align with (11). With the parameters explicitly defined, updating the Python code becomes easier; we only need to adjust these values directly when model parameters change, rather than re-running iterative numerical solvers.

6. Some Future Directions

Building on our stochastic production planning framework with regime-switching, we outline four concise avenues for future work:
  • Alternative Convex Loss Functions. Replace the quadratic cost functional by other convex penalties (e.g., logarithmic or exponential) to better reflect industry-specific cost structures. Such non-quadratic losses introduce nonlinear terms into the HJB system, calling for novel analytical approximations or specialized numerical schemes.
  • Correlated Brownian Motions. Allow nonzero correlations among the Brownian drivers to model interdependencies across product demands. The ensuing mixed-derivative terms in the coupled PDEs increase both theoretical complexity and computational burden, and yield a richer and more realistic description of cross-good risk.
  • Geometric Inventory Dynamics. Model inventory levels via geometric Brownian motion to enforce non-negativity. The multiplicative noise and drift require a logarithmic change of variables and adaptive discretization (e.g., mesh refinement or implicit solvers) to maintain stability and accuracy.
  • Real-Time Regime Detection. Integrate machine learning techniques (e.g., online change point detection, hidden Markov model inference) to identify economic regime shifts on the fly. Coupling an ML-based detector with the control law promises faster adaptation and enhanced robustness to structural breaks.

7. Conclusions

The production planning problem is solved using a value function approach, where the optimal production policy is characterized by a system of elliptic PDEs. This paper aims to bridge the gap between theoretical modeling and practical implementation, providing robust tools for stochastic production planning under regime-switching parameters. The regime-switching framework provides actionable insights for managerial decision-making, policy analysis, and operational optimization.
The contributions of this study include a derivation of the Hamilton–Jacobi–Bellman (HJB) equations and their transformation into an elliptic PDE system; the development of a monotone iteration scheme to approximate solutions, enabling quantitative analysis of production policies; an investigation of the impacts of volatility, holding costs, and discount rates on the value functions; and a comparison of models with and without regime-switching, highlighting the conservative and balanced predictions of regime-switching models.
By adapting production strategies to economic cycles, minimizing costs, and mitigating risks, the model enhances practical applicability in industries such as automotive manufacturing, energy systems, and retail.

Funding

This research received no external funding.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data results are available on request.

Acknowledgments

I thank the reviewers for their support and for suggesting the reorganization of the paper, which significantly improved its clarity.

Conflicts of Interest

The author declares no conflicts of interest.

Appendix A

Appendix A.1. Proof of Lemma 1

The inequalities in (6) are quadratic in nature with respect to $K_1$ and $K_2$. We analyze the first equation as follows:
$$4K_1^2 + \frac{2(a_1 + \alpha_1)\sigma_1^2}{\sigma_1^4} K_1 - \frac{M_1}{\sigma_1^4} - \frac{2 a_1 \sigma_2^2}{\sigma_1^4} K_2 = 0.$$
This can be rewritten in standard quadratic form:
$$4K_1^2 + \frac{2(a_1 + \alpha_1)}{\sigma_1^2} K_1 + \left(-\frac{M_1}{\sigma_1^4} - \frac{2 a_1 \sigma_2^2}{\sigma_1^4} K_2\right) = 0.$$
The discriminant of this quadratic equation is non-negative for $K_1$:
$$\Delta_1 = \left(\frac{2(a_1 + \alpha_1)}{\sigma_1^2}\right)^2 - 4 \cdot 4 \cdot \left(-\frac{M_1}{\sigma_1^4} - \frac{2 a_1 \sigma_2^2}{\sigma_1^4} K_2\right) \ge 0,$$
and so, the equation has real solutions. A similar process applies to the second equation:
$$\Delta_2 = \left(\frac{2(a_2 + \alpha_2)}{\sigma_2^2}\right)^2 - 4 \cdot 4 \cdot \left(-\frac{M_2}{\sigma_2^4} - \frac{2 a_2 \sigma_1^2}{\sigma_2^4} K_1\right) \ge 0.$$
Let
$$K_1^* = \frac{-\frac{2(a_1 + \alpha_1)}{\sigma_1^2} - \sqrt{\Delta_1}}{8} \in (-\infty, 0), \quad K_2^* = \frac{-\frac{2(a_2 + \alpha_2)}{\sigma_2^2} - \sqrt{\Delta_2}}{8} \in (-\infty, 0).$$
Define
$$R_1(K_1) = 4K_1^2 + \frac{2(a_1 + \alpha_1)\sigma_1^2}{\sigma_1^4} K_1 - \frac{M_1}{\sigma_1^4} - \frac{2 a_1 \sigma_2^2}{\sigma_1^4} K_2^*,$$
and
$$R_2(K_2) = 4K_2^2 + \frac{2(a_2 + \alpha_2)\sigma_2^2}{\sigma_2^4} K_2 - \frac{M_2}{\sigma_2^4} - \frac{2 a_2 \sigma_1^2}{\sigma_2^4} K_1^*.$$
Observe that
$$R_1(0) = -\frac{M_1}{\sigma_1^4} - \frac{2 a_1 \sigma_2^2}{\sigma_1^4} K_2^* < 0, \quad R_2(0) = -\frac{M_2}{\sigma_2^4} - \frac{2 a_2 \sigma_1^2}{\sigma_2^4} K_1^* < 0.$$
On the other hand,
$$\lim_{K_1 \to -\infty} R_1(K_1) = +\infty, \quad \lim_{K_2 \to -\infty} R_2(K_2) = +\infty.$$
Thus, there exist $K_1, K_2 \in (-\infty, 0)$ such that the first and second equations of the system are satisfied. By the monotonicity of $R_1$ and $R_2$, the solutions for $K_1$ and $K_2$ are unique. It is evident that the third and fourth inequalities of (6) hold true for any $K_1, K_2 \in (-\infty, 0)$, and in particular, they are satisfied for the specifically chosen parameters.

Appendix A.2. Numerical Algorithms

For the case $D = 1$, we present the algorithm corresponding to the result in Lemma 1, designed for implementation in any programming language, specifically Python in our case. A Newton–Raphson solver locates the unique negative roots $K_1, K_2$ guaranteed by Lemma 1. Starting from an initial negative guess, it evaluates the nonlinear residual and its Jacobian, updates the iterate, and repeats until the residual norm falls below a prescribed tolerance. Once converged, the boundary multipliers $S_i = \exp(K_i R^2)$ are computed.
Algorithm 1 Newton–Raphson Solver for $K_1, K_2$ and $S_1, S_2$
Require: constants $a_i, \alpha_i, \sigma_i, M_i$ ($i = 1, 2$), $R$; initial guess $G = [-1, -1]^T$; tolerance $\epsilon > 0$.
Ensure: solution $x = [K_1, K_2]^T$ with $K_i < 0$, and $S_i = e^{K_i R^2}$.
1: Define residual: $H(x^{(k)}) = \begin{bmatrix} h_1(K_1^{(k)}, K_2^{(k)}) \\ h_2(K_1^{(k)}, K_2^{(k)}) \end{bmatrix}$
2: Define Jacobian: $J(x^{(k)}) = \begin{bmatrix} \partial h_1/\partial K_1 & \partial h_1/\partial K_2 \\ \partial h_2/\partial K_1 & \partial h_2/\partial K_2 \end{bmatrix}_{x^{(k)}}$
3: function NewtonRaphsonSolveK($G$, $\epsilon$)
4:     $x^{(0)} \leftarrow G$, $k \leftarrow 0$
5:     repeat
6:         evaluate $H(x^{(k)})$
7:         evaluate $J(x^{(k)})$
8:         $x^{(k+1)} \leftarrow x^{(k)} - J(x^{(k)})^{-1} H(x^{(k)})$
9:         $k \leftarrow k + 1$
10:    until $\|H(x^{(k)})\| < \epsilon$
11:    return $x^{(k)}$
12: end function
13: $(K_1, K_2) \leftarrow$ NewtonRaphsonSolveK($G$, $\epsilon$)
14: while $K_1 \ge 0$ or $K_2 \ge 0$ do
15:     print Non-negative $K_i$ encountered. Adjusting initial guess.
16:     $G \leftarrow G - [1, 1]^T$
17:     $(K_1, K_2) \leftarrow$ NewtonRaphsonSolveK($G$, $\epsilon$)
18: end while
19: $S_1 \leftarrow e^{K_1 R^2}$, $S_2 \leftarrow e^{K_2 R^2}$
20: return $K_1, K_2, S_1, S_2$
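Since the published Python listing is not reproduced in this text, the following is a minimal sketch of Algorithm 1, not the author's code. It assumes that the residuals $h_1, h_2$ are the two quadratic equations from the proof of Lemma 1, with $f_i(x) = M_i x^2$; the function names and the restart cap `max_restarts` are illustrative choices.

```python
import numpy as np


def residual(K, a, alpha, sigma, M):
    """Residuals (h1, h2) of the coupled quadratic equations for (K1, K2)."""
    K1, K2 = K
    h1 = (4 * K1**2 + 2 * (a[0] + alpha[0]) / sigma[0]**2 * K1
          - M[0] / sigma[0]**4 - 2 * a[0] * sigma[1]**2 / sigma[0]**4 * K2)
    h2 = (4 * K2**2 + 2 * (a[1] + alpha[1]) / sigma[1]**2 * K2
          - M[1] / sigma[1]**4 - 2 * a[1] * sigma[0]**2 / sigma[1]**4 * K1)
    return np.array([h1, h2])


def jacobian(K, a, alpha, sigma, M):
    """Analytic 2x2 Jacobian of the residual with respect to (K1, K2)."""
    K1, K2 = K
    return np.array([
        [8 * K1 + 2 * (a[0] + alpha[0]) / sigma[0]**2,
         -2 * a[0] * sigma[1]**2 / sigma[0]**4],
        [-2 * a[1] * sigma[0]**2 / sigma[1]**4,
         8 * K2 + 2 * (a[1] + alpha[1]) / sigma[1]**2],
    ])


def newton_solve_K(a, alpha, sigma, M, guess=(-1.0, -1.0), tol=1e-10,
                   max_iter=50, max_restarts=20):
    """Newton-Raphson with restarts until both roots are negative."""
    G = np.array(guess, dtype=float)
    for _ in range(max_restarts):
        x = G.copy()
        for _ in range(max_iter):
            H = residual(x, a, alpha, sigma, M)
            if np.linalg.norm(H) < tol:
                break
            x = x - np.linalg.solve(jacobian(x, a, alpha, sigma, M), H)
        converged = np.linalg.norm(residual(x, a, alpha, sigma, M)) < tol
        if converged and np.all(x < 0):
            return x
        G = G - 1.0  # shift the guess further into the negative quadrant
    raise RuntimeError("no negative root found")


# Parameters of the discount-rate experiment (Table 3), with R = 4.
a, alpha, sigma, M, R = (0.6, 0.9), (0.2, 0.9), (1.0, 1.0), (2.0, 2.0), 4.0
K = newton_solve_K(a, alpha, sigma, M)
S = np.exp(K * R**2)  # boundary multipliers S_i = exp(K_i R^2)
print("K =", K, "S =", S)
```

For these inputs the roots should land close to the magnitudes reported for Theorem 4 in Table 6, which is a useful sanity check on the assumed residual form.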
Next, for the case D = 1 , we present two algorithms corresponding to the result in Theorem 1, designed for implementation across different programming languages, particularly in Python, which we adopt in this work.
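The first algorithm computes the sensitivities $\Lambda_1$ and $\Lambda_2$ via partial derivatives: at each spatial node $x_i$, we discretize the admissible $t$ and $s$ ranges, evaluate the derivatives $\frac{\partial g_1}{\partial t}(x, t, s)$ and $\frac{\partial g_2}{\partial s}(x, t, s)$ over that grid, record their maxima, and then set
$$\Lambda_1 = -(1 + \delta) \max_{x_i, t, s} \frac{\partial g_1}{\partial t}(x_i, t, s), \quad \Lambda_2 = -(1 + \delta) \max_{x_i, t, s} \frac{\partial g_2}{\partial s}(x_i, t, s),$$
for some $\delta > 0$.
Algorithm 2 Compute $\Lambda_1$ and $\Lambda_2$ via Partial Derivatives
Require: Constants: $M_1, M_2, \sigma_1, \sigma_2, a_1, a_2, \alpha_1, \alpha_2, R, K_1, K_2$, and a safety factor $\delta > 0$.
Require: Functions:
$$g_1(x, t, s) = \frac{f_1(x)}{\sigma_1^4} t + \frac{2(a_1 + \alpha_1)}{\sigma_1^2} t \ln(t) - \frac{2 a_1 \sigma_2^2}{\sigma_1^4} t \ln(s),$$
$$g_2(x, t, s) = \frac{f_2(x)}{\sigma_2^4} s + \frac{2(a_2 + \alpha_2)}{\sigma_2^2} s \ln(s) - \frac{2 a_2 \sigma_1^2}{\sigma_2^4} s \ln(t).$$
1: Let $x_{\text{vals}} \leftarrow$ a uniform grid of $N_x$ points in $[-R, R]$ (e.g., $N_x = 200$).
2: Initialize $\Lambda_{1,\max} \leftarrow -\infty$ and $\Lambda_{2,\max} \leftarrow -\infty$.
3: for all $x_i \in x_{\text{vals}}$ do
4:     Compute the lower bounds: $t_{\text{low}} \leftarrow \exp\left(K_1 (R^2 - x_i^2)\right)$, $s_{\text{low}} \leftarrow \exp\left(K_2 (R^2 - x_i^2)\right)$.
5:     Define the intervals: $t_{\text{range}} \leftarrow \text{linspace}(t_{\text{low}}, 1, N_t)$, $s_{\text{range}} \leftarrow \text{linspace}(s_{\text{low}}, 1, N_s)$, where $N_t$ and $N_s$ are the number of discretization points (e.g., $N_t = N_s = 50$).
6:     for all $t \in t_{\text{range}}$ do
7:         for all $s \in s_{\text{range}}$ do
8:             Compute $v_1 \leftarrow \frac{\partial g_1}{\partial t}(x_i, t, s)$.
9:             Update $\Lambda_{1,\max} \leftarrow \max(\Lambda_{1,\max}, v_1)$.
10:            Compute $v_2 \leftarrow \frac{\partial g_2}{\partial s}(x_i, t, s)$.
11:            Update $\Lambda_{2,\max} \leftarrow \max(\Lambda_{2,\max}, v_2)$.
12:        end for
13:    end for
14: end for
15: Set: $\Lambda_1 \leftarrow -(1 + \delta) \Lambda_{1,\max}$, $\Lambda_2 \leftarrow -(1 + \delta) \Lambda_{2,\max}$.
16: Output: $\Lambda_1, \Lambda_2$.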
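A vectorized sketch of Algorithm 2 is given below, not as the author's implementation but as one way to realize the grid search. It assumes $f_i(x) = M_i x^2$, differentiates $g_1$ and $g_2$ analytically, and returns the stabilizers with a negative sign; the default $\delta = 0.05$ and the function name `compute_lambdas` are assumptions made for illustration.

```python
import numpy as np


def compute_lambdas(a, alpha, sigma, M, K, R, delta=0.05, nx=200, nt=50, ns=50):
    """Grid search for Lambda_1, Lambda_2 = -(1 + delta) * max of dg/du."""
    (a1, a2), (al1, al2), (s1, s2) = a, alpha, sigma
    (M1, M2), (K1, K2) = M, K
    lam1_max = lam2_max = -np.inf
    for x in np.linspace(-R, R, nx):
        # Admissible ranges between the sub-solution value and 1.
        t = np.linspace(np.exp(K1 * (R**2 - x**2)), 1.0, nt)[:, None]
        s = np.linspace(np.exp(K2 * (R**2 - x**2)), 1.0, ns)[None, :]
        # Analytic partial derivatives of g1 in t and of g2 in s.
        dg1_dt = (M1 * x**2 / s1**4 + 2 * (a1 + al1) / s1**2 * (np.log(t) + 1)
                  - 2 * a1 * s2**2 / s1**4 * np.log(s))
        dg2_ds = (M2 * x**2 / s2**4 + 2 * (a2 + al2) / s2**2 * (np.log(s) + 1)
                  - 2 * a2 * s1**2 / s2**4 * np.log(t))
        lam1_max = max(lam1_max, float(dg1_dt.max()))
        lam2_max = max(lam2_max, float(dg2_ds.max()))
    return -(1 + delta) * lam1_max, -(1 + delta) * lam2_max


# Discount-rate experiment (Table 3) with the rounded K values of Table 6.
lam1, lam2 = compute_lambdas(a=(0.6, 0.9), alpha=(0.2, 0.9), sigma=(1.0, 1.0),
                             M=(2.0, 2.0), K=(-0.66, -1.08), R=4.0)
print(f"Lambda_1 = {lam1:.2f}, Lambda_2 = {lam2:.2f}")
```

Broadcasting the $t$ column against the $s$ row replaces the two inner loops of the pseudocode with a single array evaluation per spatial node.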
The second algorithm performs successive linear-boundary-value solutions between a fixed analytic sub-solution and a trivial super-solution, enforcing at each step that the new iterate stays sandwiched monotonically, and stops when the maximum update falls below tolerance. Then, the inverse transform is applied to recover the original value function and its derivative. With $K$ and $S$ fixed, starting from the constant profile $u_1 = u_2 = 1$ on a uniform grid over $[-R, R]$, we sweep through all interior points at each step, replacing the current value by the unique solution of a small three-point linear problem (which incorporates the fixed stabilizers), and we flag a point as “done” once its update is below the tolerance. We repeat these passes until every interior node has converged, then apply the inverse logarithmic transform to recover the original value functions and their gradients.
Algorithm 3 Finite Difference Successive Approximation for $u_1$ and $u_2$
Require: Interval $[-R, R]$, number of points $N$; nonlinear sources $g_1, g_2$ as above; stabilizers $\Lambda_1, \Lambda_2 < 0$ (from Algorithm 2); tolerance tol $> 0$; maximum number of iterations.
Ensure: $\{u_1(x_i), u_2(x_i)\}_{i=1}^{N}$ approximating the PDE solution.
1: $dx \leftarrow \frac{2R}{N - 1}$
2: Initialize grid and iterates: $x_i \leftarrow -R + (i - 1)\,dx$, $u_j(x_i) \leftarrow 1$, $j = 1, 2$; $i = 1, \ldots, N$.
3: Mark boundary nodes converged: conv[1] $\leftarrow$ conv[$N$] $\leftarrow$ true
4: for $k = 1$ to the maximum number of iterations do
5:     Copy old iterates: $u_j^{\text{old}}(x_i) \leftarrow u_j(x_i)$ for $j = 1, 2$ and $i = 1, \ldots, N$
6:     Reset convergence flags: conv[$i$] $\leftarrow$ false for $i = 2, \ldots, N - 1$
7:     for $i = 2$ to $N - 1$ do
8:         $u_1(x_i) \leftarrow \dfrac{u_1^{\text{old}}(x_{i-1}) + u_1^{\text{old}}(x_{i+1}) - dx^2 \left[ g_1\!\left(x_i, u_1^{\text{old}}(x_i), u_2^{\text{old}}(x_i)\right) + \Lambda_1 u_1^{\text{old}}(x_i) \right]}{2 - \Lambda_1\, dx^2}$
9:         $u_2(x_i) \leftarrow \dfrac{u_2^{\text{old}}(x_{i-1}) + u_2^{\text{old}}(x_{i+1}) - dx^2 \left[ g_2\!\left(x_i, u_1^{\text{old}}(x_i), u_2^{\text{old}}(x_i)\right) + \Lambda_2 u_2^{\text{old}}(x_i) \right]}{2 - \Lambda_2\, dx^2}$
10:        Compute local changes: $\Delta_1 = |u_1(x_i) - u_1^{\text{old}}(x_i)|$, $\Delta_2 = |u_2(x_i) - u_2^{\text{old}}(x_i)|$
11:        if $\max(\Delta_1, \Delta_2) \le$ tol then    ▹ node has converged
12:            conv[$i$] $\leftarrow$ true
13:        end if
14:    end for
15:    Reimpose fixed boundary values: $u_j(x_1) \leftarrow 1$, $u_j(x_N) \leftarrow 1$, $j = 1, 2$
16:    if conv[$i$] = true for all $i = 2, \ldots, N - 1$ then
17:        print Global convergence at iteration $k$.
18:        break
19:    end if
20:    if $k$ equals the maximum number of iterations then
21:        print Warning: reached maximum iterations without full convergence.
22:    end if
23: end for
24: return $\{u_1(x_i), u_2(x_i)\}_{i=1}^{N}$
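The sweep can be sketched in vectorized form as follows. This is an illustration under stated assumptions rather than the published code: it assumes each stabilized linear step has the form $u'' + \Lambda u = g + \Lambda u^{\text{old}}$ with $\Lambda < 0$, it uses a single global convergence test instead of per-node flags, and the function name `successive_approximation` is hypothetical. Because the article's coupled sources cannot be checked against a closed form, the example exercises the routine on a linear test problem, $u'' = c\,u$ with $u(\pm R) = 1$, whose exact solution is $\cosh(\sqrt{c}\,x)/\cosh(\sqrt{c}\,R)$.

```python
import numpy as np


def successive_approximation(g1, g2, lam1, lam2, R, n=41, tol=1e-10,
                             max_iter=200_000):
    """Jacobi-type sweeps for u1'' = g1, u2'' = g2 with u = 1 at |x| = R."""
    x = np.linspace(-R, R, n)
    dx = x[1] - x[0]
    u1, u2 = np.ones(n), np.ones(n)  # trivial super-solution as start
    xi = x[1:-1]                     # interior nodes
    for k in range(1, max_iter + 1):
        o1, o2 = u1.copy(), u2.copy()
        # Right-hand sides g + Lambda * u_old evaluated at the old iterate.
        rhs1 = g1(xi, o1[1:-1], o2[1:-1]) + lam1 * o1[1:-1]
        rhs2 = g2(xi, o1[1:-1], o2[1:-1]) + lam2 * o2[1:-1]
        # Three-point update; boundary entries stay fixed at 1.
        u1[1:-1] = (o1[:-2] + o1[2:] - dx**2 * rhs1) / (2.0 - lam1 * dx**2)
        u2[1:-1] = (o2[:-2] + o2[2:] - dx**2 * rhs2) / (2.0 - lam2 * dx**2)
        change = max(np.abs(u1 - o1).max(), np.abs(u2 - o2).max())
        if change < tol:
            return x, u1, u2, k
    raise RuntimeError("maximum iterations reached without convergence")


# Linear, decoupled test problem with a known solution (not the article's g_i).
c, R = 4.0, 1.0
lam = -1.05 * c  # stabilizer: -(1 + delta) * max dg/du with delta = 0.05
x, u1, u2, iters = successive_approximation(
    lambda x, t, s: c * t, lambda x, t, s: c * s, lam, lam, R)
exact = np.cosh(np.sqrt(c) * x) / np.cosh(np.sqrt(c) * R)
err = np.abs(u1 - exact).max()
print(f"converged in {iters} sweeps, max error {err:.2e}")
```

For the regime-switching problem, `g1` and `g2` would be the coupled sources of Algorithm 2 and `lam1`, `lam2` the stabilizers it returns; the iterates start at 1 and decrease toward the solution.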
Finally, we present the algorithm for simulating the optimal control and inventory process, designed to be adaptable across multiple programming languages, with Python serving as the primary implementation framework in this study.
Here, we simulate the one-dimensional inventory under the pre-computed optimal feedback. Begin with $(t, y, r) = (0, 0, 1)$ and an empty trajectory. At each step, (1) update the regime $r$ by its jump rates, (2) interpolate the feedback law at the current $y$, (3) advance $(t, y)$ by one Euler–Maruyama increment, and (4) record the new $(t, y, r)$. Repeat until $y$ exits $[-R, R]$, then output the full list of $(t, y, r)$ triples, the exit time, and the total cost.
Algorithm 4 Simulate Inventory Dynamics via Euler–Maruyama with Regime-Switching
Require: Time step $dt$ (chosen sufficiently small), maximum simulation time $T_{\max}$, inventory threshold $R$, initial inventory $y = 0$, initial time $t = 0$, initial regime $r = 1$.
Ensure: An inventory trajectory $\{(t, y, r)\}$ recording the time, inventory level, and regime.
1: Initialize trajectory: traj $\leftarrow \{(t, y, r)\}$.
2: function OptimalProduction($y$, $r$)
3:     Find the index $i$ such that $x_{\text{grid}}[i]$ is closest to $y$    ▹ Using a pre-computed grid of inventory levels.
4:     if $r = 1$ then
5:         return $p_1^*(x_{\text{grid}}[i])$
6:     else
7:         return $p_2^*(x_{\text{grid}}[i])$
8:     end if
9: end function
10: function UpdateRegime($r$)
11:     Draw $u \sim \text{Uniform}(0, 1)$
12:     if $r = 1$ and $u < a_1\, dt$ then
13:         return 2
14:     else if $r = 2$ and $u < a_2\, dt$ then
15:         return 1
16:     else
17:         return $r$
18:     end if
19: end function
20: while $|y| < R$ and $t < T_{\max}$ do
21:     Update regime: $r \leftarrow$ UpdateRegime($r$)
22:     Compute optimal production: $p^* \leftarrow$ OptimalProduction($y$, $r$)
23:     Set volatility: $\sigma \leftarrow \sigma_1$ if $r = 1$, $\sigma \leftarrow \sigma_2$ if $r = 2$
24:     Sample $\eta \sim N(0, 1)$ and set $\Delta W = \sqrt{dt}\, \eta$
25:     Update inventory: $y \leftarrow y + p^*\, dt + \sigma\, \Delta W$
26:     Advance time: $t \leftarrow t + dt$
27:     Append $(t, y, r)$ to traj
28: end while
29: Output: The inventory trajectory traj
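The simulation loop can be sketched as below. This is a minimal illustration, assuming a first-order switching probability $a_r\,dt$ per step as in the pseudocode; the function name `simulate_inventory` and the linear feedback `toy_policy` are hypothetical placeholders, since the true $p_1^*, p_2^*$ come from the PDE solution and are not reproduced here.

```python
import numpy as np


def simulate_inventory(p_star, a, sigma, R, dt=0.01, t_max=20.0,
                       y0=0.0, r0=1, seed=0):
    """Euler-Maruyama path of the regime-switching inventory until exit."""
    rng = np.random.default_rng(seed)
    t, y, r = 0.0, y0, r0
    traj = [(t, y, r)]
    while abs(y) < R and t < t_max:
        # Regime update: leave regime r with probability a_r * dt.
        if rng.uniform() < a[r - 1] * dt:
            r = 2 if r == 1 else 1
        # One Euler-Maruyama increment under the current regime.
        dW = np.sqrt(dt) * rng.standard_normal()
        y = y + p_star(y, r) * dt + sigma[r - 1] * dW
        t = t + dt
        traj.append((t, y, r))
    return traj


def toy_policy(y, r):
    """Placeholder feedback law, linear in y; NOT the article's optimal control."""
    return 0.5 * y if r == 1 else 0.2 * y


traj = simulate_inventory(toy_policy, a=(0.6, 0.9), sigma=(1.0, 0.3), R=4.0)
t_end, y_end, r_end = traj[-1]
print(f"{len(traj)} points, final t = {t_end:.2f}, y = {y_end:.3f}, regime {r_end}")
```

To use a grid-based feedback as in Algorithm 4, `p_star` can look up the nearest grid node (or interpolate, for example with `np.interp`) in the arrays of $p_1^*$ and $p_2^*$.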

Appendix A.3. Python Code

[Python listing: reproduced as images in the published article (Axioms 14 00524 i001 to i006).]

References

  1. Bensoussan, A.; Sethi, S.P.; Vickson, R.; Derzko, N. Stochastic production planning with production constraints. SIAM J. Control. 1984, 22, 627–641.
  2. Cadenillas, A.; Lakner, P.; Pinedo, M. Optimal production management when demand depends on the business cycle. Oper. Res. 2013, 61, 1046–1062.
  3. Cadenillas, A.; Ferrari, G.; Schuhmann, P. Optimal production management when there is regime switching and production constraints. Ann. Oper. Res. 2024, 61, 1–33.
  4. Dong, J.; Malikopoulos, A.; Djouadi, S.M.; Kuruganti, T. Application of Optimal Production Control theory for Home Energy Management in a Micro Grid. In Proceedings of the 2016 American Control Conference (ACC), Boston, MA, USA, 6–8 July 2016; Volume 61, pp. 5014–5019.
  5. Gharbi, A.; Kenne, J.P. Optimal production control problem in stochastic multiple-product multiple-machine manufacturing systems. IIE Trans. 2003, 35, 941–952.
  6. Covei, D.-P.; Pirvu, T.A. An elliptic partial differential equation and its application. Appl. Math. Lett. 2020, 101, 106059.
  7. Covei, D.-P. On a parabolic partial differential equation and system modeling a production planning problem. Electron. Arch. 2022, 30, 1340–1353.
  8. Canepa, E.C.; Covei, D.-P.; Pirvu, T.A. Stochastic production planning with regime switching. J. Ind. Manag. 2023, 19, 1697–1713.
  9. Ghosh, M.K.; Arapostathis, A.; Marcus, S.I. Optimal Control of Switching Diffusions with Application to Flexible Manufacturing Systems. SIAM J. Control Optim. 1992, 31, 1183–1204.
  10. Borhan, J.R.M.; Miah, M.M.; Alsharif, F.; Kanan, M. Abundant Closed-Form Soliton Solutions to the Fractional Stochastic Kraenkel-Manna-Merle System with Bifurcation, Chaotic, Sensitivity, and Modulation Instability Analysis. Fractal Fract. 2024, 8, 327.
  11. Hu, C.; Bian, J.; Zhao, D.; He, L.; Dong, F. Optimal Dynamic Production Planning for Supply Network with Random External and Internal Demands. Mathematics 2024, 12, 2669.
  12. Covei, D.-P. Exact Solution for the Production Planning Problem with Several Regimes Switching over an Infinite Horizon Time. Mathematics 2023, 11, 4307.
Figure 1. Computational workflow: from model inputs to numerical approximation and simulation.
Figure 2. Numerical profiles of the change-of-variable solutions and corresponding $z_i$, illustrating $z_1 \ge z_2$ as predicted by Theorem 3. (a) Transformed PDE solutions $u_1(x)$ (blue) vs. $u_2(x)$ (orange), on $|x| \le 4$. (b) Value functions $z_1(x)$ (blue) and $z_2(x)$ (green) over the same radial slice.
Figure 3. Sample trajectory of the optimal feedback policy (a) and its induced inventory (b). Regime switches produce distinct control spikes and inventory adjustments. (a) Optimal production rate $p^*(t)$ in regimes 1 (red) and 2 (brown). (b) Inventory level $y(t)$ showing reflective behavior until the stopping boundary at $R = 4$.
Figure 4. Numerical profiles for Theorem 4 scenario: lower discount rate in regime 1 yields $z_1 \ge z_2$. (a) Transformed solutions $u_1(x)$ (blue) vs. $u_2(x)$ (orange). (b) Value functions $z_1(x)$ (blue) and $z_2(x)$ (green).
Figure 5. Time evolution of optimal control and inventory for the discount rate experiment. (a) Optimal production rate $p^*(t)$ in regimes 1 (red) and 2 (brown). (b) Inventory level $y(t)$ under the optimal policy.
Figure 6. Monotone iteration outputs for the Theorem 5 scenario, yielding $z_1 \ge z_2$. (a) $u_1(x)$ (blue) vs. $u_2(x)$ (orange) on $|x| \le 4$. (b) $z_1(x)$ (blue) and $z_2(x)$ (green).
Figure 7. Time dynamics of optimal production and inventory for differing holding cost regimes. (a) Optimal control $p^*(t)$ under $f_1 > f_2$. (b) Inventory-level trajectory $y(t)$.
Figure 8. Convergence of the monotone Picard scheme: (a) transformed variables $u_i$ and (b) back-transformed value functions $z_i$.
Figure 9. Iteration history: (a) production rate $p^*$ and (b) inventory level under Picard iteration.
Table 1. Effects of parameter variations on value functions $z_1, z_2$.
| Theorem | Parameter Change | Effect on $z_i(x)$, $x \in \bar{B}_R$ |
|---|---|---|
| 3 | $\sigma_1 > \sigma_2$ | Higher volatility raises $z_1 \ge z_2$. |
| 4 | $\alpha_1 < \alpha_2$ | Lower discount rate raises $z_1 \ge z_2$. |
| 5 | $f_1(x) > f_2(x)$ | Higher holding cost raises $z_1 \ge z_2$. |
| 6 | all above + switching limits | Yields $\bar{z}_1 \ge z_1 \ge z_2 \ge \underline{z}_2$. |

Table 2. Parameters for Theorem 3 experiment.
| Parameter | Regime 1 | Regime 2 |
|---|---|---|
| $a_i$ | 0.6 | 0.5 |
| $\alpha_i$ | 0.3 | 0.3 |
| $M_i$ | 1 | 1 |
| $\sigma_i$ | 0.9 | 0.2 |
| $f_i(x)$ | $x^2$ | $x^2$ |
| $R$ | 4 | 4 |

Table 3. Parameter set for Theorem 4 experiment.
| Parameter | Regime 1 | Regime 2 | Interpretation |
|---|---|---|---|
| $a_i$ | 0.6 | 0.9 | Regime-switching rates |
| $\alpha_i$ | 0.2 | 0.9 | Discount factors ($\alpha_1 < \alpha_2$) |
| $M_i$ | 2 | 2 | Bound on $f_i(x)$ |
| $\sigma_i$ | 1 | 1 | Common volatility |
| $f_i(x)$ | $2x^2$ | $2x^2$ | Quadratic holding cost |
| Domain | $x \in \bar{B}_4$ | $x \in \bar{B}_4$ | Inventory threshold $R = 4$ |

Table 4. Parameter set for Theorem 5 experiment.
| Parameter | Regime 1 | Regime 2 | Interpretation |
|---|---|---|---|
| $a_i$ | 0.6 | 0.9 | Transition rates |
| $\alpha_i$ | 0.3 | 0.3 | Common discount factor |
| $M_i$ | 5 | 4 | Bounds on $f_i(x)$ |
| $\sigma_i$ | 1 | 1 | Common volatility |
| $f_i(x)$ | $5x^2$ | $4x^2$ | Regime-dependent cost |
| Domain | $x \in \bar{B}_4$ | $x \in \bar{B}_4$ | Boundary $R = 4$ |

Table 5. Parameters for Theorem 6 experiment.
| Parameter | Regime 1 (High) | Regime 2 (Low) |
|---|---|---|
| Transition rate $a_i$ | 0.6 | 0.9 |
| Discount $\alpha_i$ | 0.3 | 0.8 |
| Holding cost bound $M_i$ | 5 | 1 |
| Volatility $\sigma_i$ | 1.0 | 0.3 |
| Holding cost $f_i(x)$ | $5x^2$ | $x^2$ |
| Inventory capacity $R$ | 4 units | 4 units |

Table 6. Nonlinear solver constants and convergence metrics for the four case studies (Theorems 3–6).
| Thm. | Parameter Variation | $K_1$ | $K_2$ | $\Lambda_1$ | $\Lambda_2$ | Picard Its. |
|---|---|---|---|---|---|---|
| 3 | Volatility: $\sigma_1 > \sigma_2$ | −0.71 | −14.51 | −27.93 | −10,541.99 | 1237 |
| 4 | Discount rate: $\alpha_1 < \alpha_2$ | −0.66 | −1.08 | −35.28 | −37.38 | 448 |
| 5 | Holding cost: $f_1(x) > f_2(x)$ | −1.21 | −1.03 | −85.89 | −69.72 | 1097 |
| 6 | Combined limits | −1.22 | −0.45 | −85.89 | −42.54 | 1176 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Covei, D.-P. Stochastic Production Planning with Regime-Switching: Sensitivity Analysis, Optimal Control, and Numerical Implementation. Axioms 2025, 14, 524. https://doi.org/10.3390/axioms14070524
