Mathematics
  • Article
  • Open Access

2 December 2025

Structural Properties and a Revised Value Iteration Algorithm for Dynamic Capacity Expansion and Reduction

1 NXP Semiconductors, 3501 Ed Bluestein Blvd, Austin, TX 78721, USA
2 Industrial and Systems Engineering Department, Interdisciplinary Research Center for Smart Mobility & Logistics, King Fahd University of Petroleum and Minerals, Dhahran 31261, Saudi Arabia
* Author to whom correspondence should be addressed.
This article belongs to the Special Issue Mathematical Methods of Operational Research and Data Analytics in Operations Planning and Scheduling

Abstract

This manuscript introduces a generalized Markov decision process (MDP) model for dynamic capacity planning under stochastic, time-nonhomogeneous demand, wherein system capacity may be flexibly increased or decreased throughout a finite planning horizon. The model includes investment, disinvestment, maintenance, operational, and shortage costs, in addition to a salvage value at the end of the planning horizon. Under mild, realistic cost conditions, we investigate the structural properties of the optimal policy and demonstrate its monotone structure. By leveraging these properties, we propose a revised value iteration algorithm that exploits the intrinsic structure of the problem, thereby achieving greater computational efficiency than traditional dynamic programming techniques. The proposed model is applicable across a range of sectors, including manufacturing systems, cloud-computing services, logistics systems, healthcare resource management, power capacity planning, and other intelligent infrastructures driven by Industry 4.0.

1. Introduction

The discipline of capacity planning is distinguished by a substantial body of academic research, encompassing a wide array of application domains, including (a) power generation and distribution; (b) production and supply chain logistics; (c) network systems such as electricity, radio, and mobile communications; and (d) hospital capacity planning. For an overview of this topic, the reader is referred to Ref. [1] along with the review articles listed therein.
There exists a broad array of modelling and solution methodologies for capacity planning, illustrating the heterogeneity and complexity inherent in real-world scenarios. These include (a) mathematical programming techniques [2], (b) stochastic and dynamic programming methods [3,4], (c) robust optimization [5], and (d) simulation-based optimization [6,7]. With the exception of stochastic dynamic programming methods, these techniques are normally applied to deterministic problems or employed in policy evaluation. The challenge of sequential decision-making under stochastic demand is most effectively modelled via stochastic dynamic programming. AlDurgam et al. [1] confirmed that sequential decision-making under uncertainty is considerably underrepresented in the literature in comparison with the aforementioned methodologies. A Markov decision process (MDP) is a discrete stochastic dynamic programming technique that is extensively employed to model the sequential decision-making-under-uncertainty scenarios frequently observed in capacity planning. The assumption of static demand can have significant ramifications, including, as thoroughly documented, capacity shortages or excess capacity [8].
Characterising the optimal policies of MDPs facilitates the practical application of intricate models by yielding simple decision rules [9], such as threshold or monotone policies, which can be easily executed by decision makers. Furthermore, these characterisations reduce the computational complexity of MDPs by limiting the solution space [10,11]. Recently, Krishnamurthy [12] demonstrated that in Markov decision processes, optimal policies increase with the state under conditions broader than supermodularity, such as sigmoidal rewards and non-supermodular transitions; that work established monotonicity for totally positive order-three transition matrices and concave value functions. Lee et al. [13] analysed monotone policy iteration in Markov decision processes, offering convergence and optimality conditions and a modified policy iteration algorithm. They assessed the effects of nineteen state-ordering rules, finding a time–quality trade-off: faster orderings excel in structured scenarios, while random orderings reduce the optimality gap at a higher computational cost.
Martínez-Costa et al. [14] observed that strategic capacity-planning models generally emphasize expansion while paying less attention to capacity reduction—a gap that is particularly salient in global markets with rapid product cycles. In practice, workforce and logistics planning require the ability to scale resources both up and down. This pattern extends to the MDP literature: several studies (Table 1) formulate capacity planning as an MDP and concentrate on expansion under uncertainty, and apart from application cases, few provide rigorous structural characterizations of the optimal policies. Wu and Chuang [15] studied a multigeneration capacity portfolio expansion model involving uncertainties in price, demand, and lifecycle; established a monotone expansion policy; and formulated a revised value iteration algorithm. Wu and Chuang [11] further extended their study to encompass multi-type (flexible versus dedicated) capacity with uncertainties in demand, price, and yield, utilising the structural properties in their previous work [15], and proposed a heuristic search algorithm that improves computational efficiency by approximately 30%. Lin et al. [16] explored expansion and allocation across multiple sites under Markovian demand within a finite-horizon dynamic programming framework embedded into linear programming, although no structural properties were delineated. Mishra et al. [17] investigated a two-period product line expansion/reduction problem characterized by price-dependent stochastic demand, resulting in a news-vendor-like optimal policy in quantile form along with an analytical solution. Serrato et al. [18] modelled the outsourcing of reverse-logistics capacity over a finite horizon with binomial returns and demonstrated a threshold policy through a backward induction algorithm.
Table 1. Summary of the most relevant MDP-based research articles.
Classically, Lindley recursions are associated with inventory and queueing models, yet the same reflected-random-walk dynamics govern capacity headroom under stochastic demand with discrete expand/contract actions. We therefore draw on three inventory studies as methodological antecedents. Recent research by Blancas-Rivera et al. [20] elucidates the conditions under which (s, S) policies are optimal, even in the presence of potentially unbounded costs. Their findings demonstrate that a subsequence of value-iteration minimizers converges to an optimal policy, substantiated through a numerical example. In the realm of periodic-review lost-sales models with positive lead times, van Jaarsveld and Arts [21] proposed the projected inventory-level (PIL) policy. This policy aims to maintain a fixed expected on-hand inventory upon arrival, and it has been analytically proven to dominate constant-order policies. The PIL policy achieves asymptotic optimality as the penalty for lost sales approaches infinity and, in scenarios with exponential demand, as the lead time becomes infinitely long. Their study also reports robust numerical performance. In a related investigation, Yuan et al. [22] establish that within lost-sales systems characterized by stochastic lead times, base-stock policies attain asymptotic optimality when the penalty for lost sales significantly surpasses holding costs, with the incurred costs converging to the system’s optimum.
The studies most pertinent to this research are those of references [17,18]. Serrato et al. [18] investigated outsourcing decisions constrained to expansion scenarios under the assumption of binomially distributed returns. Mishra et al. [17] explored a two-period model with price-dependent demand, which resulted in an analytical solution comparable to the newsvendor model. Distinctively, our model extends the model developed by Abduljaleel and AlDurgam [19] to accommodate both expansion and reduction decisions. We incorporate a general stochastic demand model that allows for time-nonhomogeneous demands and derive a monotone structure under realistic cost conditions. Utilising these properties, we develop a revised value iteration algorithm that significantly reduces computational effort in comparison with the conventional backward value iteration method.
The remainder of this paper is structured as follows. Section 2 provides a formal definition and MDP formulation of the problem at hand. Section 3 presents the structural properties of the model and their advantages with respect to computational efficiency. Based on the structural properties of the MDP model, a modified value iteration algorithm is given in Section 4, and illustrative numerical examples are presented in Section 5. Section 6 concludes the paper.

2. Problem Definition and MDP Model

In this section, we provide the problem definition, notations, and MDP model of the problem at hand.

2.1. Problem Definition

This study addresses a problem of multi-period capacity planning wherein a decision maker is tasked with meeting stochastic dynamic demands. The model assumes there is a Markovian demand process [23]; that is, the probability distribution of the demand in the next time period depends only on the demand realised in the current period. In any time period, the system’s capacity represents the maximum demand it can satisfy internally in that period. Any demand exceeding this capacity incurs a penalty due to emergency processing or expediting. In every time-period, there will be a maintenance cost for maintaining the overall system capacity and an operating cost for the utilised system capacity. At the end of each period, the decision maker decides how much to increase or decrease the system capacity. There is no backordering of demand; that is, at any point in time, if the demand is not met by the system’s capacity, it is processed externally, incurring a penalty cost. The optimal decision policy defines the optimal action in every time period for each possible system state, with the objective of minimising overall system costs over the entire planning horizon.
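The Markovian demand assumption can be made concrete with a small simulation; the transition matrix below is purely illustrative and not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical demand transition matrix for demand levels 0..3:
# row i gives P(d_{t+1} = j | d_t = i); each row sums to one.
P = np.array([
    [0.6, 0.3, 0.1, 0.0],
    [0.2, 0.5, 0.2, 0.1],
    [0.1, 0.3, 0.4, 0.2],
    [0.0, 0.2, 0.3, 0.5],
])
assert np.allclose(P.sum(axis=1), 1.0)

def sample_demand_path(P, d0, T, rng):
    """Simulate a Markovian demand path: d_{t+1} depends only on d_t."""
    path = [d0]
    for _ in range(T):
        path.append(int(rng.choice(len(P), p=P[path[-1]])))
    return path

print(sample_demand_path(P, d0=1, T=10, rng=rng))
```

A time-nonhomogeneous demand process, as allowed by the model, would simply use a period-indexed matrix $P_t$ in place of the single matrix above.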

2.2. Nomenclature

Next, the notations used to construct the MDP model are provided.
$a_t(CS_t, d_t)$ denotes the action taken at time $t$ when the system is in state $(CS_t, d_t)$. For simplicity, $a_t$ is used interchangeably with $a_t(CS_t, d_t)$. Positive values of $a_t$ represent increasing the capacity of the system, and negative values represent decreasing it.
$a_t^*(CS_t, d_t)$ denotes the optimal action taken at time $t$ when the system is in state $(CS_t, d_t)$. For simplicity, $a_t^*$ is used interchangeably with $a_t^*(CS_t, d_t)$.
$A_t(CS_t, d_t)$ denotes the action space at time $t$ when the system is in state $(CS_t, d_t)$. For simplicity, $A_t$ is used interchangeably with $A_t(CS_t, d_t)$.
$c_1$ denotes the cost of increasing the capacity of the system by one unit; this represents an investment cost ($ per unit of capacity).
$c_2$ denotes the cost of decreasing the capacity of the system by one unit; this represents a disinvestment cost/reward ($ per unit of capacity). $c_2$ can take negative values to represent rewards.
$c_3$ denotes the capacity maintenance cost ($ per unit of capacity available per time period).
$c_4$ denotes the operating cost, i.e., the cost incurred in meeting a unit of demand with the existing system capacity ($ per unit of capacity demanded per time period).
$c_5$ denotes the shortage cost, i.e., the penalty incurred per unit of demand exceeding the current system capacity ($ per unit short of capacity per time period).
$c_6$ denotes the cost of salvaging a unit of capacity of the system at the end of the planning horizon, $t = T$ ($ per unit of capacity). To represent rewards, $c_6$ can take negative values.
$CS_t$ denotes the system capacity at time $t$ (units of capacity).
$d_t$ denotes the demand for units of capacity at time $t$ (units of capacity).
$MC$ denotes the maximum possible capacity available to the decision maker at any point in time.
$p_{t+1}(CS_{t+1}, d_{t+1} \mid CS_t, d_t, a_t)$ denotes the conditional probability that the state at time $t+1$ will be $(CS_{t+1}, d_{t+1})$, given that the state and action taken at time $t$ were $(CS_t, d_t)$ and $a_t$.
$P(d_{t+1} \mid d_t)$ denotes the conditional probability that the demand at time $t+1$ will be $d_{t+1}$, given that the demand at time $t$ was $d_t$.
$R_t(CS_t, d_t, a_t)$ denotes the immediate cost incurred when taking action $a_t$ at time $t$ with the system in state $(CS_t, d_t)$. A positive value of $R_t(CS_t, d_t, a_t)$ means that $a_t$ resulted in an immediate cost, while a negative value represents immediate revenue.
$T$ denotes the length of the planning horizon.
$t$ denotes the time period/epoch, $t = 0, 1, 2, \ldots, T$.
$V_t(CS_t, d_t, a_t)$ denotes the value function when choosing action $a_t$ at time $t$ with the system in state $(CS_t, d_t)$ and subsequently following an optimal decision policy from time $t+1$.
$V_t^*(CS_t, d_t)$ denotes the optimal value function when following the optimal decision policy starting from time $t$ with the system in state $(CS_t, d_t)$.

2.3. An MDP Model for Multi-Period Capacity Planning

An MDP model consists of system states, a set of actions, transition probabilities between system states, and reward/cost functions that, in their basic form, depend on system state and the action taken [10]. We define these elements for our problem below.
System States: In any time period $t$ of the planning horizon $T$, the system state is fully described by two state variables, the system capacity and the demand experienced during the period: $(CS_t, d_t)$. At the beginning of the planning horizon, the system state is assumed to be $(CS_0, d_0)$.
In instances where component-specific details are irrelevant, such as in expressions involving the value function or action-value function, we denote the state succinctly as $s_t$. However, where the argument or proposition is contingent upon the ordering or structure of the individual components, such as in the theorems and proofs, we retain the explicit component notation, i.e., $(CS_t, d_t)$ for $s_t$ and $(CS_t + a_t, d_{t+1})$ for $s_{t+1}$.
Actions: At the end of every time period $t$, given a system state $s_t$, the decision maker takes an action $a_t(s_t)$ (or simply $a_t$) that changes the capacity of the system or keeps it the same via the 'do nothing' action. The action space $A_t(s_t)$ (or $A_t$ for simplicity) denotes the set of actions available to the decision maker when the system is in state $s_t$ at the end of period $t$, $a_t(s_t) \in A_t(s_t)$. If the capacity of the system ranges between $0$ and $MC$ (the maximum capacity of the system), the values of $a_t$ can be negative integers (a decrease in capacity), zero (the 'do nothing' action), or positive integers (an increase in system capacity).
At the end of time $t$, when the system is in state $s_t$, the set of actions available to the decision maker is defined as follows:
$$A_t(s_t) = \{-CS_t,\ -CS_t + 1,\ -CS_t + 2, \ldots, MC - CS_t\}.$$
For instance, if the decision maker chooses the action $-CS_t$, the new capacity of the system ($CS_{t+1} = CS_t + a_t$) becomes zero.
Transition Probabilities: This MDP element defines the conditional state transition probabilities. Assuming $CS_t$ and $d_t$ are independent, the probability of transitioning to a new system state $s_{t+1}$ at $t+1$, given the current system state $s_t$ and the action taken $a_t$, is expressed as
$$p_{t+1}(s_{t+1} \mid s_t, a_t) = \begin{cases} P(d_{t+1} \mid d_t) & \text{if } CS_{t+1} = CS_t + a_t, \\ 0 & \text{otherwise.} \end{cases} \quad (1)$$
Costs and model dynamics: Each MDP model should have a reward or cost element; the cost elements of the proposed MDP model are described below.
At the end of the planning horizon ($t = T$), the entire system capacity is salvaged. The resulting relations at $T$ are
$$V_T^*(s_T) = V_T(s_T, a_T) = R_T(s_T, a_T) = c_6 \, CS_T. \quad (2)$$
The terminal cost accounts for the cost of salvaging the system capacity at the end of the planning horizon. If there is a reward for salvaging the system, $c_6$ takes a negative value to reflect the reward in the cost function. It is important to clarify that $a_T$ (the action at the end of the planning horizon) is not really a decision variable, as the system simply salvages its entire capacity; we include $a_T$ only for the sake of model analysis.
At the end of any time period $t \neq T$, given the state of the system $s_t$ and the action $a_t$ carried out by the decision maker, a cost $R_t(s_t, a_t)$ is incurred. This is defined as
$$R_t(s_t, a_t) = c_1 (a_t)^+ + c_2 (-a_t)^+ + c_3 \, CS_t + c_4 \min(d_t, CS_t) + c_5 (d_t - CS_t)^+. \quad (3)$$
In Equation (3), the first term, $c_1 (a_t)^+$, accounts for the cost incurred in increasing the capacity of the system (investment), that is, whenever $a_t$ is positive. The second term, $c_2 (-a_t)^+$, represents the cost of any decrease in capacity (disinvestment), that is, whenever $a_t$ is negative. The third term, $c_3 \, CS_t$, represents the cost of maintaining the system capacity; the fourth term, $c_4 \min(d_t, CS_t)$, represents the operating cost in period $t$; and the fifth term, $c_5 (d_t - CS_t)^+$, accounts for the penalty incurred when the demand exceeds the system capacity in period $t$.
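As a concrete sketch, Equation (3) maps directly to code; the numeric parameter values below are illustrative assumptions, chosen to satisfy the cost conditions introduced later in Section 3:

```python
def immediate_cost(CS, d, a, c1, c2, c3, c4, c5):
    """One-period cost R_t(CS_t, d_t, a_t) of Equation (3)."""
    pos = lambda z: max(z, 0)
    return (c1 * pos(a)            # investment when a_t > 0
            + c2 * pos(-a)         # disinvestment when a_t < 0
            + c3 * CS              # maintaining the installed capacity
            + c4 * min(d, CS)      # operating the utilised capacity
            + c5 * pos(d - CS))    # penalty on unmet demand

# Illustrative parameters satisfying the cost conditions of Section 3:
# c1 >= max(0, -c2) and c5 > c3 + c4.
c1, c2, c3, c4, c5 = 50, 10, 2, 3, 8
assert c1 >= max(0, -c2) and c5 > c3 + c4
print(immediate_cost(CS=5, d=7, a=2, c1=c1, c2=c2, c3=c3, c4=c4, c5=c5))  # 141
```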
The decision maker establishes the decision policy, which involves selecting an action for each time period within the planning horizon, depending on the current state of the system. The policy is designed to minimize the total expected cost throughout the entire planning horizon. The optimal value function that gives the minimum expected cost, from period t to the end of the planning horizon when following an optimal decision policy, is given as follows:
$$V_t^*(s_t) = \min_{a_t \in A_t(s_t)} \Big[ R_t(s_t, a_t) + \sum_{d_{t+1}} p_{t+1}(s_{t+1} \mid s_t, a_t) \, V_{t+1}^*(s_{t+1}) \Big]. \quad (4)$$
Using Equation (1), this can also be rewritten as
$$V_t^*(s_t) = \min_{a_t \in A_t(s_t)} \Big[ R_t(s_t, a_t) + \sum_{d_{t+1}} P(d_{t+1} \mid d_t) \, V_{t+1}^*(CS_t + a_t, d_{t+1}) \Big]. \quad (5)$$
Hence, the optimal action $a_t^*(s_t)$ is expressed as follows:
$$a_t^*(s_t) = \arg\min_{a_t \in A_t(s_t)} \Big[ R_t(s_t, a_t) + \sum_{d_{t+1}} P(d_{t+1} \mid d_t) \, V_{t+1}^*(CS_t + a_t, d_{t+1}) \Big]. \quad (6)$$
The value function that yields the expected cost from period $t$ to the end of the planning horizon $T$, when carrying out action $a_t$ at the end of period $t$ and henceforth following the optimal decision policy, is expressed as $V_t(s_t, a_t)$:
$$V_t(s_t, a_t) = R_t(s_t, a_t) + \sum_{d_{t+1}} P(d_{t+1} \mid d_t) \, V_{t+1}^*(CS_t + a_t, d_{t+1}). \quad (7)$$
It should be noted that Equations (4)–(7) are derived directly based on the standard Bellman optimality equation.
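A minimal brute-force backward induction over Equations (4)–(7) can be sketched as follows; this is the standard baseline, not the revised algorithm of Section 4, and the small instance at the bottom is an illustrative assumption:

```python
import numpy as np

def backward_value_iteration(T, MC, P, R, c6):
    """Brute-force backward induction: V*_T(CS, d) = c6 * CS, and for t < T
    V*_t(CS, d) = min_a [ R(CS, d, a) + sum_d' P(d'|d) V*_{t+1}(CS + a, d') ].
    Returns value tables V[t][CS, d] and greedy action tables A[t][CS, d]."""
    x = P.shape[0] - 1                              # demand takes values 0..x
    V = np.zeros((T + 1, MC + 1, x + 1))
    A = np.zeros((T + 1, MC + 1, x + 1), dtype=int)
    V[T] = c6 * np.arange(MC + 1)[:, None]          # terminal salvage, Eq. (2)
    for t in range(T - 1, -1, -1):
        for CS in range(MC + 1):
            for d in range(x + 1):
                # feasible actions keep the new capacity in [0, MC], Eq. (1)
                q = [R(CS, d, a) + P[d] @ V[t + 1][CS + a]
                     for a in range(-CS, MC - CS + 1)]
                A[t][CS, d] = int(np.argmin(q)) - CS
                V[t][CS, d] = min(q)
    return V, A

# Illustrative instance (parameters are assumptions, not from the paper).
P = np.array([[0.6, 0.3, 0.1],
              [0.2, 0.5, 0.3],
              [0.1, 0.3, 0.6]])
R = lambda CS, d, a: (50 * max(a, 0) + 10 * max(-a, 0) + 2 * CS
                      + 3 * min(d, CS) + 8 * max(d - CS, 0))
V, A = backward_value_iteration(T=5, MC=2, P=P, R=R, c6=-1)
print(A[0])   # optimal first-period actions, one row per capacity level
```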

2.4. The Value Iteration Algorithm

The optimal decision policy for a finite-horizon MDP model can be determined using the backward value iteration algorithm [10]. Hence, for our model, if the demand takes integer values in $[0, x]$, then the number of iterations required is
$$T \times (x+1)^3. \quad (8)$$
For each epoch, there are $(x+1)^2$ possible states, and for each state, there are $x+1$ possible actions. In Section 3, we present some structural properties of our model and discuss their advantages in terms of computational effort.

3. A Structured Optimal Capacity-Planning Policy

The existence of a structured optimal policy has two main advantages: it reduces the computational effort required to determine the optimal policy, and it simplifies the application of the optimal policy. In terms of the cost elements, all the structural properties derived in this paper and proven in the Appendix A are based only on the following two realistic conditions:
$$c_1 \geq \max(0, -c_2),$$
$$c_5 > c_4 + c_3.$$
The first condition means that any revenue generated by decreasing system capacity (i.e., $-c_2$ when $c_2 < 0$) cannot exceed the unit cost of increasing system capacity. The second condition means that the shortage cost per unit of capacity is greater than the sum of the maintenance and operating costs per unit of capacity. Below, the different structural properties of the optimal policy are provided under the stated cost assumptions.
Theorem 1:
In any time period $t$, given an arbitrary state $(CS_t = CS_1, d_t)$, if the optimal action is to increase the system's capacity to level $g$ ($g > CS_1$), then for all states $(CS_t = CS_2, d_t)$ with $CS_1 \leq CS_2 \leq g$, increasing the system capacity to the same level $g$ is an optimal action. Mathematically, if $a_t^*(CS_t = CS_1, d_t) = g - CS_1$ and $g \geq CS_1$, then $a_t^*(CS_t = CS_2, d_t) = g - CS_2$ for all $CS_1 \leq CS_2 \leq g$.
Practical meaning:
Once a capacity target is justified, being further below that target cannot be optimal; that is, partial expansions should be avoided.
Proof: 
The proof of Theorem 1 is given in Appendix A at the end of the paper. □
Theorem 2:
In any time period $t$, given an arbitrary system state $(CS_t = CS_2, d_t)$, if the optimal action is to decrease the system capacity to level $e \geq 0$, then for all states $(CS_t = CS_1, d_t)$ with $e \leq CS_1 \leq CS_2$, decreasing the system capacity to the same level $e$ is an optimal action. Mathematically, if $a_t^*(CS_t = CS_2, d_t) = e - CS_2$ and $e \leq CS_2$, then $a_t^*(CS_t = CS_1, d_t) = e - CS_1$ for all $e \leq CS_1 \leq CS_2$.
Practical meaning:
Once a contraction target is justified, being further above that target cannot be optimal; that is, partial contractions should be avoided.
Proof: 
The proof of Theorem 2 is similar to that for Theorem 1, and it is given in Appendix A at the end of the paper. □
The following Lemma 1 is used to arrive at our Proposition 1, which is stated below.
Lemma 1:
If $g$ is a superadditive (subadditive) function on $X \times Y$ and, for each $x \in X$, $\min_{y \in Y} g(x, y)$ exists, then
$$f(x) = \min \Big\{ \arg\min_{y \in Y} g(x, y) \Big\}$$
is monotonically nonincreasing (nondecreasing) in $x$.
Proof. 
The proof of this Lemma is given by reference [10]. □
Lemma 2 is used in the proof for Proposition 1.
Lemma 2:
In any time period $t$, the function
$$c_4 \min(d_t, CS_{t-1} + a_{t-1}) + c_5 (d_t - CS_{t-1} - a_{t-1})^+$$
is superadditive in $CS_{t-1} \times a_{t-1}$ for every $d_t$.
Proof: 
The proof of this Lemma is straightforward, and it is given in the Appendix A. □
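Lemma 2 can also be spot-checked numerically: the function depends on capacity and action only through $u = CS_{t-1} + a_{t-1}$, with slope $c_4 - c_5 < 0$ for $u < d_t$ and slope $0$ afterwards, so it is convex in $u$ and hence superadditive. A grid check with illustrative constants (assumed values, with $c_5 > c_4$):

```python
# Spot-check Lemma 2 on a grid: h(C, a) = c4*min(d, C + a) + c5*max(d - C - a, 0)
# should satisfy h(C2, a2) - h(C2, a1) >= h(C1, a2) - h(C1, a1)
# whenever C2 >= C1 and a2 >= a1 (superadditivity in (C, a), for each d).
c4, c5 = 3, 8

def h(C, a, d):
    u = C + a                       # post-action capacity
    return c4 * min(d, u) + c5 * max(d - u, 0)

for d in range(6):
    for C1 in range(5):
        for C2 in range(C1, 5):
            for a1 in range(-C1, 3):
                for a2 in range(a1, 3):
                    assert (h(C2, a2, d) - h(C2, a1, d)
                            >= h(C1, a2, d) - h(C1, a1, d))
print("superadditivity holds on the grid")
```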
Theorem 3, which will be stated later, is based on Lemma 1 and Proposition 1.
Proposition 1:
In the given problem definition, for any time period $t$, $V_t(CS_t, d_t, a_t)$ is superadditive in $CS_t \times a_t$ for every $d_t$. That is, for any capacity levels $CS_2$ and $CS_1$ such that $CS_2 \geq CS_1$, and for any actions $a_{2t}$ and $a_{1t}$ in $A_t$ such that $a_{2t} \geq a_{1t}$,
$$V_t(CS_2, d_t, a_{2t}) - V_t(CS_2, d_t, a_{1t}) \geq V_t(CS_1, d_t, a_{2t}) - V_t(CS_1, d_t, a_{1t}).$$
Proof: 
The proof of this proposition is given in Appendix A. □
The main result of this paper, Theorem 3, is stated next.
Theorem 3:
For a given $d_t$, $a_t^*(CS_t, d_t)$ is nonincreasing in $CS_t$.
Practical meaning:
For a fixed period and demand state, the optimal adjustment is nonincreasing in current capacity (with a greater installed capacity, one should never expand more).
Proof: 
By rewriting the equations for $V_t^*(CS_t, d_t)$, $a_t^*(CS_t, d_t)$, and $V_t(CS_t, d_t, a_t)$ given in the previous section, we obtain
$$V_t^*(CS_t, d_t) = \min_{a_t \in A_t} \Big[ R_t(CS_t, d_t, a_t) + \sum_{d_{t+1}} P(d_{t+1} \mid d_t) \, V_{t+1}^*(CS_t + a_t, d_{t+1}) \Big],$$
$$a_t^*(CS_t, d_t) = \arg\min_{a_t \in A_t} \Big[ R_t(CS_t, d_t, a_t) + \sum_{d_{t+1}} P(d_{t+1} \mid d_t) \, V_{t+1}^*(CS_t + a_t, d_{t+1}) \Big],$$
$$V_t(CS_t, d_t, a_t) = R_t(CS_t, d_t, a_t) + \sum_{d_{t+1}} P(d_{t+1} \mid d_t) \, V_{t+1}^*(CS_t + a_t, d_{t+1}).$$
We use Lemma 1 to obtain the result of Theorem 3: if $V_t(CS_t, d_t, a_t)$ is superadditive in $CS_t \times a_t$ for every $d_t$, then Theorem 3 follows, and Proposition 1 establishes exactly this superadditivity. Through Theorem 3, the computational effort is greatly reduced.
Owing to Theorems 1–3, the number of iterations required to find an optimal decision policy is reduced. If the demand takes values in $[0, x]$, the maximum number of iterations is
$$T (3x + 1)(x + 1).$$
After searching over $x+1$ actions at each of the first and last capacity levels ($(x+1) \times 2(x+1)$ iterations), Theorems 1–3 are applied to determine the optimal action at the remaining capacity levels ($(x+1) \times (x-1)$ iterations). One can argue that this is a somewhat loose upper bound.
We can delineate additional structural properties of the defined problem. Theorem 4 helps in the search for the optimal action in the initial states.
Theorem 4:
In the defined problem, $V_t(CS_t, d_t, a_t)$ is convex in $a_t$ for every $CS_t$, $d_t$, and $t$.
Practical meaning:
For any period, demand state, and current capacity, the look-ahead cost as a function of the adjustment is convex.
Proof: 
The proof of this theorem is given in the Appendix A. □
Theorem 4 further reduces the number of iterations required in order to determine the optimal decision policy. This issue is addressed in Section 5.
We put forward Lemma 3 to prove Proposition 2, which is used for the purpose of proving Theorem 5.
Lemma 3:
The function
$$c_4 \min(d_t, CS_{t-1} + a_{t-1}) + c_5 (d_t - CS_{t-1} - a_{t-1})^+$$
is subadditive in $d_t \times a_{t-1}$ for every $CS_{t-1}$.
Proof: 
The proof is given in Appendix A. □
Proposition 2:
In the presented MDP model, if the demand transition matrix exhibits first-order stochastic dominance (that is, if the probability that the demand exceeds a given value in the next period does not decrease with the demand experienced in the current period), then $V_t(CS_t, d_t, a_t)$ is subadditive in $d_t \times a_t$ for every $CS_t$. That is, for any two demands $d_{2t}, d_{1t}$ such that $d_{2t} \geq d_{1t}$, and any actions $a_{2t}, a_{1t}$ in $A_t$ such that $a_{2t} \geq a_{1t}$,
$$V_t(CS_t, d_{2t}, a_{2t}) - V_t(CS_t, d_{2t}, a_{1t}) \leq V_t(CS_t, d_{1t}, a_{2t}) - V_t(CS_t, d_{1t}, a_{1t}).$$
Proof: 
The proof is given in Appendix A. □
Theorem 5:
If the demand transition follows first-order stochastic dominance, then the optimal action does not decrease with an increase in $d_t$ in the state definition. In other words, for a given $CS_t$, $a_t^*(CS_t, d_t)$ is nondecreasing in $d_t$. That is, for all $d_1, d_2$ in the demand range with $d_2 \geq d_1$, if $P(d_{t+1} > d \mid d_2) \geq P(d_{t+1} > d \mid d_1)$ for all $d$ and $t$, then $a_t^*(CS_t, d_2) \geq a_t^*(CS_t, d_1)$ for all $CS_t$ and $t$.
Practical meaning:
Under FOSD, higher-demand states never call for smaller moves; that is, in a higher-demand state, one should neither expand less nor contract more.
Proof: 
Because of Lemma 1, Theorem 5 follows if $V_t(CS_t, d_t, a_t)$ is subadditive in $d_t \times a_t$ for every $CS_t$; Proposition 2 establishes exactly this subadditivity. □
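The first-order stochastic dominance premise of Theorem 5 is easy to verify computationally: the tail probabilities $P(d_{t+1} > d \mid d_t)$ must be nondecreasing in $d_t$. A small check (both matrices are illustrative assumptions):

```python
import numpy as np

def has_fosd(P, tol=1e-12):
    """Check first-order stochastic dominance of a demand transition matrix:
    P(d_{t+1} > d | d_t = i) must be nondecreasing in the row index i."""
    tail = 1.0 - np.cumsum(P, axis=1)   # tail[i, d] = P(d_{t+1} > d | d_t = i)
    return bool(np.all(np.diff(tail, axis=0) >= -tol))

P_good = np.array([[0.6, 0.3, 0.1],
                   [0.3, 0.4, 0.3],
                   [0.1, 0.3, 0.6]])
P_bad = np.array([[0.1, 0.3, 0.6],
                  [0.3, 0.4, 0.3],
                  [0.6, 0.3, 0.1]])
print(has_fosd(P_good), has_fosd(P_bad))  # True False
```

If the check fails, the structural results of Theorems 1–4 and Algorithm 1 still apply; only Theorem 5 and Algorithm 2 require FOSD.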

4. A Revised Value Iteration Algorithm

This section presents a modified value iteration algorithm that further reduces the computational effort compared with the standard backward value iteration algorithm [10]. The number of iterations required to obtain the optimal policy is, at most,
$$T \times 2x(x + 1).$$
For a problem in which the demand takes values in $[0, x]$, a search is performed over $x+1$ actions in total across the first and last capacity levels ($(x+1) \times (x+1)$ iterations); then, Theorems 1, 2, 3, and 4 are applied to determine the optimal action at the remaining capacity levels ($(x+1) \times (x-1)$ iterations). Again, this is a loose upper bound. For large values of $x$, one can employ a modified golden-section search or a modified Fibonacci search (the latter applies to discrete functions) to reduce the steps/time required to determine the optimal policy.
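For a sense of scale, the three (loose) upper bounds can be compared on a hypothetical instance with $x = 10$ and $T = 10$:

```python
# Iteration-count bounds for demand in [0, x] over horizon T (illustrative x, T).
x, T = 10, 10
brute = T * (x + 1) ** 3                 # standard backward value iteration
structured = T * (3 * x + 1) * (x + 1)   # bound via Theorems 1-3
revised = T * 2 * x * (x + 1)            # bound for the revised algorithm
print(brute, structured, revised)        # 13310 3410 2200
```

On this instance the structured bounds cut the iteration count by roughly a factor of four to six relative to brute force.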
In Section 4.1, based on Theorems 1–4, we present a modified value iteration algorithm where we do not impose any condition on the demand transition matrix; hence, our model and its structural properties apply to the case of time-nonhomogeneous transitions. In addition, if the state transition matrix follows first-order stochastic dominance, we will obtain additional structural properties (Proposition 2 and Theorem 5); then, the algorithm in Section 4.2 applies.

4.1. Modified Value Iteration Algorithm 1

Next, we present a modified value iteration algorithm for the problem at hand. Additionally, we highlight the links between the proposed algorithm and the theorems presented in Section 3.
Algorithm 1: Pseudocode
Input: Horizon $T$; maximum system capacity $MC$; maximum demand max_demand; demand transition matrix $P(d_{t+1} \mid d_t)$; cost components $c_1, c_2, c_3, c_4, c_5, c_6$.
Output: Optimal action table $a_t^*(CS_t, d_t)$ and optimal value table $V_t^*(CS_t, d_t)$.

Initialize: Set $t = T$ and compute $V_T^*(CS_T, d_T) = c_6 \, CS_T$.
For $t = T-1$ down to $1$ do:
      For $d_t = 0$ to max_demand do:
      Find the optimal action $a_t^*(0, d_t)$ and value $V_t^*(0, d_t)$ for $CS_t = 0$: (convexity)
          Set $x_{opt} = 0$ and $V_{current} = V_t(0, d_t, 0)$
          For $x = 1$ to $MC$ do:
              If $V_t(0, d_t, x) \leq V_{current}$ then:
                   Set $x_{opt} = x$ and $V_{current} = V_t(0, d_t, x)$
              Else: break
          Set $a_t^*(0, d_t) = x_{opt}$ and $V_t^*(0, d_t) = V_{current}$
      For $CS_t = 1$ to $a_t^*(0, d_t)$ do: (Theorem 1)
          Set $a_t^*(CS_t, d_t) = a_t^*(0, d_t) - CS_t$
          Set $V_t^*(CS_t, d_t) = V_t(CS_t, d_t, a_t^*(CS_t, d_t))$
      Find the optimal action $a_t^*(MC, d_t)$ and value $V_t^*(MC, d_t)$ for $CS_t = MC$: (convexity)
          Set $y_{start} = a_t^*(0, d_t) - MC$
          Set $y_{opt} = y_{start}$ and $V_{current} = V_t(MC, d_t, y_{start})$
          For $y = y_{start} + 1$ to $0$ do:
              If $V_t(MC, d_t, y) \leq V_{current}$ then:
                   Set $y_{opt} = y$ and $V_{current} = V_t(MC, d_t, y)$
              Else: break
          Set $a_t^*(MC, d_t) = y_{opt}$ and $V_t^*(MC, d_t) = V_{current}$
      Set $L = MC + a_t^*(MC, d_t)$
      For $CS_t = L$ to $MC - 1$ do: (Theorem 2)
          Set $a_t^*(CS_t, d_t) = L - CS_t$
          Set $V_t^*(CS_t, d_t) = V_t(CS_t, d_t, a_t^*(CS_t, d_t))$
      For $CS_t = a_t^*(0, d_t) + 1$ to $L - 1$ do: (Theorem 3)
          Set $a_t^*(CS_t, d_t) = 0$
          Set $V_t^*(CS_t, d_t) = V_t(CS_t, d_t, 0)$
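The pseudocode above translates into the following sketch, assuming the two cost conditions of Section 3 hold; the helper names (`Q`, `revised_value_iteration`) and the test instance are our own illustrative assumptions, not from the paper:

```python
import numpy as np

def revised_value_iteration(T, MC, P, R, c6):
    """Sketch of Modified Value Iteration Algorithm 1. For each (t, d):
    scan actions upward from CS = 0 and stop at the first worsening
    (convexity, Theorem 4); copy the expansion target to the states below
    it (Theorem 1); scan at CS = MC for the contraction target (Theorem 2);
    'do nothing' in between (Theorem 3)."""
    x = P.shape[0] - 1
    V = np.zeros((T + 1, MC + 1, x + 1))
    A = np.zeros((T + 1, MC + 1, x + 1), dtype=int)
    V[T] = c6 * np.arange(MC + 1)[:, None]          # terminal salvage

    def Q(t, CS, d, a):                             # action value, Eq. (7)
        return R(CS, d, a) + P[d] @ V[t + 1][CS + a]

    for t in range(T - 1, 0, -1):
        for d in range(x + 1):
            # optimal action at CS = 0; g is the expansion target level
            a_opt, v_opt = 0, Q(t, 0, d, 0)
            for a in range(1, MC + 1):
                v = Q(t, 0, d, a)
                if v <= v_opt:
                    a_opt, v_opt = a, v
                else:
                    break                           # convexity: stop scanning
            A[t][0, d], V[t][0, d] = a_opt, v_opt
            g = a_opt
            for CS in range(1, g + 1):              # Theorem 1
                A[t][CS, d] = g - CS
                V[t][CS, d] = Q(t, CS, d, g - CS)
            # optimal action at CS = MC; L is the contraction target level
            y_opt, v_opt = g - MC, Q(t, MC, d, g - MC)
            for y in range(g - MC + 1, 1):
                v = Q(t, MC, d, y)
                if v <= v_opt:
                    y_opt, v_opt = y, v
                else:
                    break
            A[t][MC, d], V[t][MC, d] = y_opt, v_opt
            L = MC + y_opt
            for CS in range(L, MC):                 # Theorem 2
                A[t][CS, d] = L - CS
                V[t][CS, d] = Q(t, CS, d, L - CS)
            for CS in range(g + 1, L):              # Theorem 3
                A[t][CS, d] = 0
                V[t][CS, d] = Q(t, CS, d, 0)
    return V, A
```

On small instances satisfying the cost conditions, the values this sketch produces coincide with brute-force backward induction while evaluating far fewer actions per epoch.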

4.2. Modified Value Iteration Algorithm 2

In case the demand transition matrix $P(d_{t+1} \mid d_t)$ exhibits first-order stochastic dominance, we can further improve our modified value iteration algorithm, as follows.
Algorithm 2: Pseudocode
Input: Horizon T; maximum system capacity MC; maximum demand max_demand; demand transition matrix P(d_{t+1} | d_t); cost components c_1, c_2, c_3, c_4, c_5, c_6
Output: Optimal action table a_t(CS_t, d_t) and optimal value table V_t(CS_t, d_t)

Initialize: Set t = T and compute V_T(CS_T, d_T) = c_6 · CS_T.
For t = T − 1 down to 1 do:
    For d_t = 0 to max_demand do:
    Find optimal action a_t(0, d_t) and value V_t(0, d_t) (convexity & Theorem 5)
        If d_t = 0 then:
            Set x_start = 0
        Else: Set x_start = a_t(0, d_t − 1)
        Set x_opt = x_start and V_current = V_t(0, d_t, x_start)
        For x = x_start + 1 to MC do:
            If V_t(0, d_t, x) ≤ V_current then:
                Set x_opt = x and V_current = V_t(0, d_t, x)
            Else: break
        Set a_t(0, d_t) = x_opt and V_t(0, d_t) = V_current
    For CS_t = 1 to a_t(0, d_t) do: (Theorem 1)
        Set a_t(CS_t, d_t) = a_t(0, d_t) − CS_t
        Set V_t(CS_t, d_t) = V_t(CS_t, d_t, a_t(CS_t, d_t))
    Find optimal action a_t(CS_t, d_t) and value V_t(CS_t, d_t) for CS_t = MC (convexity)
        If d_t = 0 then:
            Set y_start = a_t(0, d_t) − MC
        Else:
            Set y_start = max(a_t(0, d_t) − MC, a_t(MC, d_t − 1))
        Set y_opt = y_start and V_current = V_t(MC, d_t, y_start)
        For y = y_start + 1 to 0 do:
            If V_t(MC, d_t, y) ≤ V_current then:
                Set y_opt = y and V_current = V_t(MC, d_t, y)
            Else: break
        Set a_t(MC, d_t) = y_opt and V_t(MC, d_t) = V_current
    Set L = MC + a_t(MC, d_t)
    For CS_t = L to MC − 1 do: (Theorem 2)
        Set a_t(CS_t, d_t) = L − CS_t
        Set V_t(CS_t, d_t) = V_t(CS_t, d_t, a_t(CS_t, d_t))
    For CS_t = a_t(0, d_t) + 1 to L − 1 do: (Theorem 3)
        Set a_t(CS_t, d_t) = 0
        Set V_t(CS_t, d_t) = V_t(CS_t, d_t, 0)
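The common engine of both algorithms is a one-dimensional scan that stops at the first increase (valid under convexity), plus, in Algorithm 2, a warm start across demand states (valid under FOSD, Theorem 5). A minimal sketch, with illustrative function names not taken from the paper:

```python
def argmin_convex_scan(f, start, stop):
    """Minimize a convex function f over the integers start..stop by
    scanning upward and stopping at the first strict increase; convexity
    guarantees the first local minimum is global."""
    best_x, best_v = start, f(start)
    for x in range(start + 1, stop + 1):
        v = f(x)
        if v <= best_v:
            best_x, best_v = x, v
        else:
            break                      # values can only grow from here
    return best_x, best_v

def boundary_actions_under_fosd(fs, mc):
    """Warm start across demand states (Theorem 5): fs[d] maps an action a
    to V_t(0, d, a).  Under FOSD the optimizer for demand d is a lower
    bound on the optimizer for demand d + 1, so each scan resumes where
    the previous one ended instead of restarting at 0."""
    actions, start = [], 0
    for f in fs:
        a, _ = argmin_convex_scan(f, start, mc)
        actions.append(a)
        start = a                      # monotone in d under FOSD
    return actions
```

With convex surrogate objectives whose minimizers increase in the demand index, the second routine visits each action value at most once across all demand states.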

4.3. Values of Structural Properties

To quantify the benefits of the proposed structural properties and the resulting modified value iteration algorithms, we conducted 100 numerical experiments. Each experiment modelled a system with a capacity range of 0 to 10 units over a planning horizon of 10 discrete periods. Demand distributions were generated to satisfy first-order stochastic dominance, and all other problem parameters were randomly sampled to ensure generality. The experimental setup is summarized as follows:
  • Using R, all draws use runif() and are then rounded to the nearest integer (banker's rounding, as implemented by R's round()):
    - c_1: uniform on [10, 100] → integer in {10, …, 100}.
    - c_2: uniform on [0, c_1 − 1] → integer in {0, …, c_1 − 1} (so c_2 < c_1).
    - c_3: uniform on [10, 100] → integer in {10, …, 100}.
    - c_4: uniform on [10, 100] → integer in {10, …, 100}.
    - c_5: uniform on [c_3 + c_4, 250] → integer in {c_3 + c_4, …, 250} (so c_5 always exceeds maintenance plus operating cost).
    - c_6 (salvage "cost"): −round(runif(…, 0, 6)) → integer in {−6, …, 0} (i.e., a non-positive per-unit term).
  • Transition matrix P_d (size 11 × 11, since CSM = 10):
    - Row 1: Draw n i.i.d. uniforms, normalize them to sum to 1, round each to 2 decimals, and set the last entry to the residual 1 − Σ_{j=1}^{n−1} p_j. Practically, the first n − 1 entries lie on a 0.01 grid in [0, 1]; the last entry is whatever makes the row sum to exactly 1.
    - Rows 2…n: Each row is constructed so that its CDF is no larger than the previous row's CDF (first-order stochastic dominance in that direction). For each column j < n,
      p_j ~ Uniform(0, prev_cdf[j] − Σ_{k<j} p_k),
      which is then truncated to 2 decimals via floor(100·p_j)/100. The last entry is again the residual 1 − Σ_{j=1}^{n−1} p_j, so each p_j lies on a 0.01 grid between 0 and its row-specific cap set by the previous row's CDF.
  • Number of experiments: 100 (independent re-draws of the above per experiment).
  • Capacity/demand state space: CSM = 10 → states {0, …, 10}.
  • Horizon: H = 10.
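The paper's R generator is not reproduced here, but the row-construction logic can be sketched in Python (the 0.01-grid truncation step is omitted for brevity):

```python
import random

def fosd_transition_matrix(n: int, seed: int = 0) -> list[list[float]]:
    """Python analogue of the sampling scheme described above: row 1 is a
    normalized uniform draw; each later row is drawn so that its CDF never
    exceeds the previous row's CDF, which is exactly the first-order
    stochastic dominance direction used in the experiments."""
    rng = random.Random(seed)
    rows = []
    # Row 1: normalized i.i.d. uniforms.
    u = [rng.random() for _ in range(n)]
    s = sum(u)
    rows.append([x / s for x in u])
    # Rows 2..n: cap each partial CDF by the previous row's CDF.
    for _ in range(1, n):
        prev = rows[-1]
        prev_cdf, cdf, row = 0.0, 0.0, []
        for j in range(n - 1):
            prev_cdf += prev[j]
            cap = max(prev_cdf - cdf, 0.0)   # keep new CDF <= previous CDF
            p = rng.uniform(0.0, cap)
            row.append(p)
            cdf += p
        row.append(1.0 - cdf)                # residual makes the row sum to 1
        rows.append(row)
    return rows
```

Each later row's CDF is capped by the previous row's CDF at every prefix, so successive rows shift probability mass toward higher demand states.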
In a brute-force approach, the total number of value-function evaluations required is substantial: (11 capacity states) × (11 demand states) × (11 actions per state) × (9 decision epochs) = 11,979 calculations.
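This count can be reproduced directly:

```python
# One Bellman evaluation per (capacity state, demand state, action, epoch).
n_capacity_states = 11   # CS_t in {0, ..., 10}
n_demand_states = 11     # d_t in {0, ..., 10}
n_actions = 11           # from any state, MC + 1 candidate target capacities
n_decision_epochs = 9    # t = T - 1, ..., 1 with T = 10

brute_force_evaluations = (n_capacity_states * n_demand_states
                           * n_actions * n_decision_epochs)
print(brute_force_evaluations)  # 11979
```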
In contrast to the brute-force approach, Figure 1 illustrates that the proposed modified value iteration algorithms, which leverage the structural properties of the MDP, substantially reduce the computational effort required to compute the optimal decision policies.
Figure 1. Step reduction with Algorithms 1 and 2 against standard backward value iteration.
  • Algorithm 1 operates without relying on any structural assumptions regarding the demand transition matrix. By leveraging the structural properties established in Theorem 1 (structuredness of capacity expansion decisions), Theorem 2 (structuredness of capacity reduction decisions), and Theorem 3 (monotonicity of optimal actions with respect to the capacity state), it achieves an average reduction of 82% in the required number of computational steps.
  • Algorithm 2 extends these results by leveraging additional structural properties. Beyond the preceding theorems, it incorporates Theorem 4, which establishes the monotonicity of optimal actions with respect to the demand state under first-order stochastic dominance of the demand transition probabilities. Exploiting this structure yields substantial gains in computational efficiency, with average reductions in computational effort exceeding 87%.
Analogous to threshold maintenance policies and base-stock policies in maintenance and production, structured policies make MDP-based capacity planning intuitive for decision makers. They turn state-by-state optimization into clear rules that managers can check and follow. In the context of the monotonicity results (Theorems 1 and 2), the system admits two targets: a level to expand to and a level to shrink to. As a result, one only needs to solve exactly at the capacity bounds; for any interior capacity, the optimal move is simply the gap to the relevant target. This collapses the search space, speeds up computation substantially (with about an 82% average reduction when employing Algorithm 1), and yields straightforward, defensible guidance on when and how much to adjust capacity.

5. Numerical Examples

This section presents two examples to demonstrate the structural properties of the model presented in this paper.
Example 1. 
Consider a problem where a system adjusts its capacity to fulfil demand. Any unit increase in capacity costs the system USD 13, while any unit reduction in capacity costs the system USD 4. The cost of maintaining a unit of system capacity per period is USD 12. The cost of using one unit of capacity to meet demand (operating cost) during a period is USD 25. The penalty incurred for each unit of demand exceeding the system capacity in a period is USD 65. The cost of salvaging a unit of capacity at the end of the planning horizon is −USD 3 (i.e., it represents revenue). We need to find an optimal decision policy for capacity planning over a planning horizon of 10 periods. The maximum capacity level of the system is 10 units, and capacity decisions are made at the end of every period.
Problem parameters:
c_1 = 13;  c_2 = 4;  c_3 = 12;  c_4 = 25;
c_5 = 65;  c_6 = −3;  T = 10;  MC = 10.
The demand transition matrix P(d_{t+1} | d_t) is

d_t \ d_{t+1}    0     1     2     3     4     5     6     7     8     9    10
 0             0.2   0.1   0.3   0     0.1   0     0     0     0     0     0.3
 1             0.1   0.2   0.2   0.4   0.1   0     0     0     0     0     0
 2             0.1   0.1   0.2   0     0.1   0.1   0     0     0.4   0     0
 3             0.1   0.1   0.2   0     0.2   0.1   0     0     0     0     0.3
 4             0.1   0.1   0     0     0.2   0.2   0     0     0     0.3   0.1
 5             0.1   0     0     0.2   0.2   0.2   0.1   0     0     0.2   0
 6             0     0.1   0.1   0.2   0.1   0.2   0.1   0.1   0.1   0     0
 7             0     0     0.1   0     0.1   0.1   0.1   0.1   0.3   0.2   0
 8             0     0     0     0.1   0     0.1   0.1   0.1   0.2   0.2   0.2
 9             0     0     0     0     0     0.1   0.1   0.2   0.2   0.2   0.2
10             0     0     0     0     0.1   0.1   0.1   0.1   0.2   0.2   0.2
The objective is to minimize the total cost over the planning horizon.
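Because Example 1 is fully specified, the backward induction can be reproduced and the structural claims checked numerically. The script below is an independent brute-force sketch (not the authors' code); following Equations (2) and (3), it assumes a per-period cost c_1·a^+ + c_2·(−a)^+ + c_3·CS + c_4·min(d, CS) + c_5·(d − CS)^+ and a terminal value c_6·CS:

```python
def solve(P, T=10, MC=10, c1=13, c2=4, c3=12, c4=25, c5=65, c6=-3):
    """Brute-force backward induction for the Example 1 MDP.
    P[d][d2] = P(d_{t+1} = d2 | d_t = d).
    Returns (policy, V) at t = 1, both indexed as table[d][cs]."""
    nd = len(P)
    V = [[c6 * cs for cs in range(MC + 1)] for _ in range(nd)]   # terminal V_T
    policy = None
    for t in range(T - 1, 0, -1):                                # t = T-1, ..., 1
        newV = [[0.0] * (MC + 1) for _ in range(nd)]
        policy = [[0] * (MC + 1) for _ in range(nd)]
        for d in range(nd):
            for cs in range(MC + 1):
                best_v = best_a = None
                for a in range(-cs, MC - cs + 1):                # feasible actions
                    r = (c1 * max(a, 0) + c2 * max(-a, 0) + c3 * cs
                         + c4 * min(d, cs) + c5 * max(d - cs, 0))
                    v = r + sum(P[d][d2] * V[d2][cs + a] for d2 in range(nd))
                    if best_v is None or v < best_v - 1e-12:
                        best_v, best_a = v, a
                newV[d][cs], policy[d][cs] = best_v, best_a
        V = newV
    return policy, V
```

Running it with the transition matrix above yields a t = 1 action table that is nonincreasing in the capacity state for every demand level, as Theorem 3 predicts.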
Optimal solution
The optimal solution to Example 1 is provided and demonstrated in Table 2, Table 3 and Table 4 and Figure 2 and Figure 3. The tables demonstrate that the optimal actions are monotonic for all levels of system capacity and do not change with the demand state, as shown in Figure 2.
Table 2. Example 1, optimal actions a t C S t , d t at t = 1, 2, 3.
Table 3. Example 1, optimal actions at t = 4 ( a 4 C S 4 , d 4 ).
Table 4. Example 1, optimal actions at t = 5 ( a 5 C S 5 , d 5 ).
Figure 2. Example 1, Graphical representation I of the optimal decision rule at t = 1 (Table 2).
The same pattern repeats for the optimal decision rules for periods 6, 7, and 8, which we omit. Figure 2 and Figure 3 demonstrate the optimal decision rules at t = 1.
Figure 3. Example 1, Graphical representation II of the optimal decision rule at t = 1. Capacity_0 is the action carried out at minimum capacity (CS = 0), and Capacity_10 is the action carried out at maximum capacity (CS = 10).
Upon analysing the optimal decision at different time epochs, we find that the result is in accordance with our listed theorems:
  • After determining a_t(CS_t = 0, d_t), we see that, with every unit increase in CS_t, a_t(CS_t, d_t) decreases by 1 unit until it reaches 0. This follows from Theorem 1.
  • After determining a_t(CS_t = MC, d_t), we see that, with every unit decrease in CS_t, a_t(CS_t, d_t) increases by 1 unit until it reaches 0. This follows from Theorem 2.
  • For a given d_t, a_t(CS_t, d_t) does not increase with CS_t. This follows from Theorem 3.
Theorems 1–3 allow a simpler representation of the policy in Figure 2, which is given in Figure 4. Essentially, knowing the optimal policy at the maximum and minimum capacity levels for a given demand defines the optimal policy for the remaining capacity levels.
Figure 4. Example 2, Graphical representation of optimal decision rule at t = 1.
Example 2.
In this example, we use the same problem parameters as in Example 1 but replace the demand transition matrix with one that satisfies first-order stochastic dominance, as given below.
d_t \ d_{t+1}    0     1     2     3     4     5     6     7     8     9    10
 0             0.1   0.1   0.1   0.1   0.1   0.1   0.1   0.1   0.1   0.1   0
 1             0     0.1   0.1   0.1   0.1   0.1   0.1   0.1   0.1   0.1   0.1
 2             0     0     0.2   0.1   0.1   0.1   0.1   0.1   0.1   0.1   0.1
 3             0     0     0.1   0.2   0.1   0.1   0.1   0.1   0.1   0.1   0.1
 4             0     0     0     0.2   0.1   0.2   0.1   0.1   0.1   0.1   0.1
 5             0     0     0     0     0.1   0.2   0.2   0.2   0.1   0.1   0.1
 6             0     0     0     0     0     0.2   0.2   0.3   0.1   0.1   0.1
 7             0     0     0     0     0     0.1   0.3   0.3   0.1   0.1   0.1
 8             0     0     0     0     0     0.1   0.2   0.3   0.1   0.2   0.1
 9             0     0     0     0     0     0.1   0.1   0.3   0.2   0.2   0.1
10             0     0     0     0     0     0.1   0.1   0.2   0.2   0.2   0.2
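The dominance claim can be verified mechanically; the helper below (illustrative, not from the paper) checks that each row's CDF never exceeds the previous row's:

```python
def is_fosd(P, tol=1e-9):
    """Check first-order stochastic dominance across successive rows of a
    transition matrix: each row's CDF must not exceed the previous row's
    CDF at any point (higher current demand shifts next-period demand up)."""
    n = len(P)
    for i in range(1, n):
        prev_cdf = cdf = 0.0
        for j in range(len(P[i])):
            prev_cdf += P[i - 1][j]
            cdf += P[i][j]
            if cdf > prev_cdf + tol:
                return False
    return True
```

Applied to the matrix above it returns True, whereas Example 1's matrix fails the check (e.g., the CDF of row d_t = 1 overtakes that of row d_t = 0 at d_{t+1} = 3).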
The resulting optimal solution at t = 1 is demonstrated in Figure 4 and Figure 5.
Figure 5. Example 2, Graphical representation II of optimal decision rule at t = 1.
Upon analysing the optimal decision at different time epochs, we find that the result is in accordance with our listed theorems:
  • Theorems 1–3 are upheld, as described in Example 1. In addition, since this problem's demand matrix satisfies first-order stochastic dominance, for a given CS_t, a_t(CS_t, d_t) does not decrease with d_t. This follows from Theorem 4. Figure 3 and Figure 5 demonstrate that the results become more structured when the conditions of Theorem 4 hold.

6. Summary

This study presents the formulation of a finite-horizon Markov decision process (MDP) for addressing capacity planning challenges characterized by time-variant Markovian demand. The approach integrates both capacity expansion and contraction within a generic cost structure. This research identifies two fundamental and practical cost conditions that facilitate the development of a simplified policy structure. Furthermore, when demand transitions adhere to first-order stochastic dominance, an additional property of monotonicity is established. We also leverage the property of convexity to enhance the value iteration process, thereby reducing computational effort.
In finite-horizon, time-varying contexts, optimal policies may appear counterintuitive. This study has established sufficient cost conditions (together with first-order stochastic dominance (FOSD) of the demand transitions, where applicable) under which these policies simplify to straightforward, verifiable rules: (i) state-contingent targets, whereby if expanding to level g is optimal at a certain capacity, the same target applies from any lower capacity, and, conversely, if contracting to level e is optimal, the same target applies from any higher capacity; (ii) monotonicity with respect to capacity, whereby the prescribed expansions do not grow as installed capacity increases; and (iii) monotonicity across demand states, whereby, under FOSD, higher demand states do not call for smaller adjustments. Such a policy is computationally efficient, facilitating routine and defensible decisions regarding capacity.
Finally, our study was conducted under the following simplifying assumptions: (i) transition probability matrices are known, (ii) capacity adjustment lead times are zero, and (iii) the system features a single, homogeneous capacity type. These assumptions facilitated structural characterization of the optimal policies but limit the external validity and practical generalizability of the results. Future research could relax these restrictions by (a) statistically estimating or learning the transition dynamics from data (through model-free reinforcement learning); (b) incorporating strictly positive and stochastic lead times, including ramp-up and ramp-down dynamics together with potentially nonconvex capacity adjustment costs; and (c) extending the model to multi-resource systems (e.g., multiple nonidentical resources) with explicit coupling constraints across resources. Moreover, empirical calibration using industry data and data-driven policy optimization or evaluation would allow a more rigorous assessment of the managerial relevance and performance of the proposed decision rules.

Author Contributions

Methodology, J.A. and M.M.A.; validation, J.A. and M.M.A.; formal analysis, J.A. and M.M.A.; writing—original draft, J.A. and M.M.A.; writing—review and editing, J.A. and M.M.A.; supervision, M.M.A. All authors have read and agreed to the published version of the manuscript.

Funding

The authors would like to acknowledge the support provided by the Deanship of Research at King Fahd University of Petroleum & Minerals (KFUPM).

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

This section lists the theorems, propositions, and lemmas of Section 4 along with their detailed proofs.
Proof of Theorem 1:
The proof of Theorem 1 is by contradiction. Conceptually, given an optimal action a_t(CS_t = CS_1, d_t) such that CS_1 + a_t = g, we obtain a set of relations (necessary optimality conditions) by acknowledging that V_t(CS_t = CS_1, d_t) = V_t(CS_t = CS_1, d_t, a_t) ≤ V_t(CS_t = CS_1, d_t, a_t′) for every alternative action a_t′ other than increasing the system's capacity to level g. Now, if in a state (CS_t = CS_2, d_t), where CS_1 ≤ CS_2 ≤ g, there existed a better action than increasing the system capacity to the same level g, we would obtain a second set of relations. However, it will be shown that this second set contradicts the first; hence, the optimal action is to reach capacity level g. Next, we provide the detailed proof of this theorem:
Figure A1. Theorem 1 capacity level illustration.
Graphically, Figure A1 represents a set of capacities (b, CS_1, e, CS_2, f, g, h), where b ≤ CS_1 ≤ e ≤ CS_2 ≤ f < g < h.
Given the optimal action a_t(CS_t = CS_1, d_t) = x* = g − CS_1 with g > CS_1 and referring to Figure A1, let us consider all possible alternative actions:
Let x⁻ be any action such that CS_1 + x⁻ = b; b ≤ CS_1.
Let x′ be any action such that CS_1 + x′ = e; CS_1 ≤ e ≤ CS_2.
Let x⁺ be any action such that CS_1 + x⁺ = f; CS_2 ≤ f < g.
Let x⁺⁺ be any action such that CS_1 + x⁺⁺ = h; g < h.
Based on this, since a_t(CS_1, d_t) = x* is the optimal action in state (CS_1, d_t), we are left with the following four relations (necessary optimality conditions on a_t):
(1) V_t(CS_t = CS_1, d_t) = V_t(CS_t = CS_1, d_t, x*) ≤ V_t(CS_t = CS_1, d_t, x⁻).
(2) V_t(CS_t = CS_1, d_t) = V_t(CS_t = CS_1, d_t, x*) ≤ V_t(CS_t = CS_1, d_t, x′).
(3) V_t(CS_t = CS_1, d_t) = V_t(CS_t = CS_1, d_t, x*) ≤ V_t(CS_t = CS_1, d_t, x⁺).
(4) V_t(CS_t = CS_1, d_t) = V_t(CS_t = CS_1, d_t, x*) ≤ V_t(CS_t = CS_1, d_t, x⁺⁺).
Relations 1–4 imply that, in any time period, the value function under the optimal action is always less than or equal to the value function under any other action.
Taking one relation at a time:
(1) V_t(CS_1, d_t, x*) ≤ V_t(CS_1, d_t, x⁻).
By substituting in Equation (7), using Equation (3), and simplifying, we get
c_1(g − CS_1) + c_2(b − CS_1) ≤ Σ_{d_{t+1}} P(d_{t+1} | d_t) [V_{t+1}(b, d_{t+1}) − V_{t+1}(g, d_{t+1})].   (A1)
By applying the same approach to the remaining relations, we get
(2) V_t(CS_1, d_t, x*) ≤ V_t(CS_1, d_t, x′), which yields
c_1(g − e) ≤ Σ_{d_{t+1}} P(d_{t+1} | d_t) [V_{t+1}(e, d_{t+1}) − V_{t+1}(g, d_{t+1})].   (A2)
(3) V_t(CS_1, d_t, x*) ≤ V_t(CS_1, d_t, x⁺), which yields
c_1(g − f) ≤ Σ_{d_{t+1}} P(d_{t+1} | d_t) [V_{t+1}(f, d_{t+1}) − V_{t+1}(g, d_{t+1})].   (A3)
(4) V_t(CS_1, d_t, x*) ≤ V_t(CS_1, d_t, x⁺⁺), which yields
c_1(g − h) ≤ Σ_{d_{t+1}} P(d_{t+1} | d_t) [V_{t+1}(h, d_{t+1}) − V_{t+1}(g, d_{t+1})].   (A4)
Given a capacity CS_2 such that CS_1 ≤ CS_2 ≤ g, let us consider all possible actions available in state (CS_t = CS_2, d_t) other than the action y* = g − CS_2:
Action Subset 1: Let y⁻ be any action such that CS_2 + y⁻ = b; b ≤ CS_1.
Action Subset 2: Let y′ be any action such that CS_2 + y′ = e; CS_1 ≤ e ≤ CS_2.
Action Subset 3: Let y⁺ be any action such that CS_2 + y⁺ = f; CS_2 ≤ f < g.
Action Subset 4: Let y⁺⁺ be any action such that CS_2 + y⁺⁺ = h; g < h.
Let y* be the action such that CS_2 + y* = g.
If in state (CS_t = CS_2, d_t) an action in Action Subset 1 yields a better value function than that corresponding to action y*, then
V_t(CS_2, d_t, y⁻) < V_t(CS_2, d_t, y*).
By writing the value function Equation (7) and using Equation (3), we get
c_1(g − CS_2) + c_2(b − CS_2) > Σ_{d_{t+1}} P(d_{t+1} | d_t) [V_{t+1}(b, d_{t+1}) − V_{t+1}(g, d_{t+1})].   (A5)
Since CS_2 ≥ CS_1, the left-hand side of Equation (A5) is no larger than the left-hand side of Equation (A1); hence, Equation (A5) contradicts Equation (A1). This implies that action y⁻ cannot be better than action y*.
Similarly, for Action Subsets 2, 3, and 4, we get the following relations:
c_1(g − CS_2) + c_2(e − CS_2) > Σ_{d_{t+1}} P(d_{t+1} | d_t) [V_{t+1}(e, d_{t+1}) − V_{t+1}(g, d_{t+1})].   (A6)
c_1(g − f) > Σ_{d_{t+1}} P(d_{t+1} | d_t) [V_{t+1}(f, d_{t+1}) − V_{t+1}(g, d_{t+1})].   (A7)
c_1(g − h) > Σ_{d_{t+1}} P(d_{t+1} | d_t) [V_{t+1}(h, d_{t+1}) − V_{t+1}(g, d_{t+1})].   (A8)
Equations (A6)–(A8) contradict Equations (A2)–(A4), respectively. Hence, actions y′, y⁺, and y⁺⁺ cannot be better than action y*.
Thus, y* is the optimal action at CS_2. This completes the proof of Theorem 1. □
Proof of Theorem 2:
Similar to Theorem 1, the proof of Theorem 2 is by contradiction. Conceptually, given an optimal action a_t(CS_t = CS_2, d_t) such that CS_2 + a_t = e, we obtain one set of relations (necessary optimality conditions) by acknowledging that V_t(CS_t = CS_2, d_t) = V_t(CS_t = CS_2, d_t, a_t) ≤ V_t(CS_t = CS_2, d_t, a_t′) for every alternative action a_t′. Now, if in a state (CS_t = CS_1, d_t), where e ≤ CS_1 < CS_2, there existed a better action than decreasing the system capacity to reach the same capacity level e, we would obtain a second set of relations. However, it will be shown that this set of relations contradicts the first set. Hence, the optimal action is to reach capacity level e. Next, we provide the detailed proof of this theorem.
Figure A2. Theorem 2 capacity level illustration.
Graphically, Figure A2 represents a set of capacities (b, e, f, CS_1, g, CS_2, h), where b < e < f ≤ CS_1 ≤ g ≤ CS_2 ≤ h.
Given the optimal action a_t(CS_t = CS_2, d_t) = y* = e − CS_2 with e < CS_2 and referring to Figure A2, let us consider all possible alternative actions:
Let y⁻ be any action such that CS_2 + y⁻ = b; b < e.
Let y′ be any action such that CS_2 + y′ = f; e < f ≤ CS_1.
Let y⁺ be any action such that CS_2 + y⁺ = g; CS_1 ≤ g ≤ CS_2.
Let y⁺⁺ be any action such that CS_2 + y⁺⁺ = h; CS_2 ≤ h.
Since a_t(CS_2, d_t) = y* is the optimal action in state (CS_2, d_t), we are left with the following relations (necessary optimality conditions on a_t):
(1) V_t(CS_2, d_t) = V_t(CS_2, d_t, y*) ≤ V_t(CS_2, d_t, y⁻).
(2) V_t(CS_2, d_t) = V_t(CS_2, d_t, y*) ≤ V_t(CS_2, d_t, y′).
(3) V_t(CS_2, d_t) = V_t(CS_2, d_t, y*) ≤ V_t(CS_2, d_t, y⁺).
(4) V_t(CS_2, d_t) = V_t(CS_2, d_t, y*) ≤ V_t(CS_2, d_t, y⁺⁺).
Relations 1–4 imply that, in any time period, the value function under the optimal action is always less than or equal to the value function under any other action.
We will employ one relation at a time:
(1) V_t(CS_2, d_t, y*) ≤ V_t(CS_2, d_t, y⁻).
By substituting in Equation (7), using Equation (3), and simplifying, we get
c_2(b − e) ≤ Σ_{d_{t+1}} P(d_{t+1} | d_t) [V_{t+1}(b, d_{t+1}) − V_{t+1}(e, d_{t+1})].   (A9)
(2) Doing the same with the remaining relations, V_t(CS_2, d_t, y*) ≤ V_t(CS_2, d_t, y′) yields
c_2(f − e) ≤ Σ_{d_{t+1}} P(d_{t+1} | d_t) [V_{t+1}(f, d_{t+1}) − V_{t+1}(e, d_{t+1})].   (A10)
(3) V_t(CS_2, d_t, y*) ≤ V_t(CS_2, d_t, y⁺) yields
c_2(g − e) ≤ Σ_{d_{t+1}} P(d_{t+1} | d_t) [V_{t+1}(g, d_{t+1}) − V_{t+1}(e, d_{t+1})].   (A11)
(4) V_t(CS_2, d_t, y*) ≤ V_t(CS_2, d_t, y⁺⁺) yields
c_2(CS_2 − e) − c_1(h − CS_2) ≤ Σ_{d_{t+1}} P(d_{t+1} | d_t) [V_{t+1}(h, d_{t+1}) − V_{t+1}(e, d_{t+1})].   (A12)
Given a capacity CS_1 such that e ≤ CS_1 < CS_2, let us consider all possible actions available in state (CS_t = CS_1, d_t):
Action Subset 1: Let x⁻ be any action such that CS_1 + x⁻ = b; b < e.
Action Subset 2: Let x′ be any action such that CS_1 + x′ = f; e < f ≤ CS_1.
Action Subset 3: Let x⁺ be any action such that CS_1 + x⁺ = g; CS_1 < g ≤ CS_2.
Action Subset 4: Let x⁺⁺ be any action such that CS_1 + x⁺⁺ = h; CS_2 < h.
Let x* be the action such that CS_1 + x* = e.
If in state (CS_t = CS_1, d_t) an action in Action Subset 1 yields a better value function than that corresponding to action x*, then
V_t(CS_1, d_t, x⁻) < V_t(CS_1, d_t, x*).
By writing the value function Equation (7) and using Equation (3), we get
c_2(b − e) > Σ_{d_{t+1}} P(d_{t+1} | d_t) [V_{t+1}(b, d_{t+1}) − V_{t+1}(e, d_{t+1})].   (A13)
We can see that Equation (A13) contradicts Equation (A9). This implies that action x⁻ cannot be better than action x*.
Similarly, we get the following relations for Action Subsets 2, 3, and 4:
c_2(f − e) > Σ_{d_{t+1}} P(d_{t+1} | d_t) [V_{t+1}(f, d_{t+1}) − V_{t+1}(e, d_{t+1})].   (A14)
c_2(CS_1 − e) − c_1(g − CS_1) > Σ_{d_{t+1}} P(d_{t+1} | d_t) [V_{t+1}(g, d_{t+1}) − V_{t+1}(e, d_{t+1})].   (A15)
c_2(CS_1 − e) − c_1(h − CS_1) > Σ_{d_{t+1}} P(d_{t+1} | d_t) [V_{t+1}(h, d_{t+1}) − V_{t+1}(e, d_{t+1})].   (A16)
Equations (A14)–(A16) are in contradiction to relations (A10)–(A12), respectively.
Hence, there cannot exist a better action at CS_1 than x*. This completes the proof of Theorem 2. □
Proof of Lemma 2:
Superadditivity of c_4·min(d_t, CS_{t−1} + a_{t−1}) + c_5·(d_t − CS_{t−1} − a_{t−1})^+ in (CS_{t−1} × a_{t−1}) for every d_t means that, for any a_{2,t−1} ≥ a_{1,t−1} in A_{t−1}, the difference
Δ(CS_{t−1}) = c_4 [min(d_t, CS_{t−1} + a_{2,t−1}) − min(d_t, CS_{t−1} + a_{1,t−1})] + c_5 [(d_t − CS_{t−1} − a_{2,t−1})^+ − (d_t − CS_{t−1} − a_{1,t−1})^+]
is nondecreasing in CS_{t−1}.
It can be observed that, as CS_{t−1} increases, Δ(CS_{t−1}) passes through at most three regions, in the following order:
(1) CS_{t−1} < d_t − a_{2,t−1};
(2) d_t − a_{2,t−1} ≤ CS_{t−1} < d_t − a_{1,t−1};
(3) CS_{t−1} ≥ d_t − a_{1,t−1}.
Region 1: CS_{t−1} < d_t − a_{2,t−1}. Here,
Δ(CS_{t−1}) = (c_5 − c_4)(a_{1,t−1} − a_{2,t−1}).
Note that, based on the cost-parameter assumption c_5 > c_4 + c_3, the relation c_5 > c_4 is implicit. Hence, in Region 1, Δ(CS_{t−1}) is a constant with a nonpositive value (strictly negative when a_{2,t−1} > a_{1,t−1}).
Region 2: d_t − a_{2,t−1} ≤ CS_{t−1} < d_t − a_{1,t−1}. Here,
Δ(CS_{t−1}) = (c_5 − c_4)(CS_{t−1} + a_{1,t−1} − d_t).
Since c_5 > c_4, Δ(CS_{t−1}) increases with an increase in CS_{t−1}; however, as CS_{t−1} < d_t − a_{1,t−1}, the value remains negative.
Region 3: CS_{t−1} ≥ d_t − a_{1,t−1}. Here, Δ(CS_{t−1}) = 0.
From the above analysis, Δ(CS_{t−1}) is nondecreasing in CS_{t−1}, i.e., c_4·min(d_t, CS_{t−1} + a_{t−1}) + c_5·(d_t − CS_{t−1} − a_{t−1})^+ is superadditive in (CS_{t−1} × a_{t−1}).
This completes the proof of Lemma 2. □
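The three-region argument can be spot-checked numerically. The helper below evaluates the difference studied in the proof for illustrative parameter values (any c_5 > c_4 and a_2 ≥ a_1 produce the same pattern):

```python
def lemma2_delta(cs, d, a1, a2, c4, c5):
    """The difference studied in Lemma 2:
    c4[min(d, cs+a2) - min(d, cs+a1)] + c5[(d-cs-a2)^+ - (d-cs-a1)^+]."""
    return (c4 * (min(d, cs + a2) - min(d, cs + a1))
            + c5 * (max(d - cs - a2, 0) - max(d - cs - a1, 0)))
```

With c_4 = 25, c_5 = 65, d = 8, a_1 = 1, a_2 = 4, the sequence over increasing cs starts at the Region-1 constant (c_5 − c_4)(a_1 − a_2) = −120, rises linearly through Region 2, and settles at 0 in Region 3.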
Proof of Proposition 1:
The proof of this proposition is established using mathematical induction, starting from the last time epoch T, where the capacity is to be salvaged, and working backwards, showing that the value function and its components are superadditive in (CS_t × a_t) for every d_t.
From Equation (2),
V_T(CS_T, d_T, a_T) = V_T(CS_T, d_T) = c_6 · CS_T.   (A17)
Since V_T(CS_T, d_T, a_{1T}) − V_T(CS_T, d_T, a_{2T}) = 0 for any CS_T and for any combination of a_{1T} and a_{2T}, V_T(CS_T, d_T, a_T) is superadditive in (CS_T × a_T) for every d_T. Note that, in practice, there is virtually no option in the last epoch but to salvage the capacity; we simply make this manipulation to lay out the proof.
Writing Equation (7),
V_t(CS_t, d_t, a_t) = R_t(CS_t, d_t, a_t) + Σ_{d_{t+1}} P(d_{t+1} | d_t) V_{t+1}(CS_t + a_t, d_{t+1}).
Considering the elements of this equation, R_t(CS_t, d_t, a_t) from Equation (3) is
R_t(CS_t, d_t, a_t) = c_1(a_t)^+ + c_2(−a_t)^+ + c_3·CS_t + c_4·min(d_t, CS_t) + c_5(d_t − CS_t)^+.
Because R_t(CS_t, d_t, a_{2t}) − R_t(CS_t, d_t, a_{1t}) is a constant independent of CS_t for any a_{2t}, a_{1t}, R_t(CS_t, d_t, a_t) is superadditive in (CS_t × a_t) for every d_t.
Hence, V_t(CS_t, d_t, a_t) will be superadditive in (CS_t × a_t) if Σ_{d_{t+1}} P(d_{t+1} | d_t) V_{t+1}(CS_t + a_t, d_{t+1}) is superadditive in (CS_t × a_t).
Moreover, if V_{t+1}(CS_t + a_t, d_{t+1}) is superadditive in (CS_t × a_t), then Σ_{d_{t+1}} P(d_{t+1} | d_t) V_{t+1}(CS_t + a_t, d_{t+1}) will be superadditive in (CS_t × a_t).
In conclusion, if V_{t+1}(CS_t + a_t, d_{t+1}) is superadditive in (CS_t × a_t), then V_t(CS_t, d_t, a_t) will be superadditive in (CS_t × a_t) for every d_t.
This will be shown to be true for our problem definition via induction.
Since V_T(CS_T, d_T) = c_6·CS_T, V_T(CS_{T−1} + a_{T−1}, d_T) is superadditive in (CS_{T−1} × a_{T−1}), because for any two actions a_{2,T−1} and a_{1,T−1} in A_{T−1} such that a_{2,T−1} ≥ a_{1,T−1}, we have
V_T(CS_{T−1} + a_{2,T−1}, d_T) − V_T(CS_{T−1} + a_{1,T−1}, d_T) = c_6(a_{2,T−1} − a_{1,T−1}),
and the term c_6(a_{2,T−1} − a_{1,T−1}) is independent of CS_{T−1}.
By employing Equation (7) for t = T − 1 and substituting Equations (3) and (A17), we acquire
V_{T−1}(CS_{T−1}, d_{T−1}, a_{T−1}) = c_1(a_{T−1})^+ + c_2(−a_{T−1})^+ + c_3·CS_{T−1} + c_4·min(d_{T−1}, CS_{T−1}) + c_5(d_{T−1} − CS_{T−1})^+ + c_6(CS_{T−1} + a_{T−1}).   (A18)
We can see that V_{T−1}(CS_{T−1}, d_{T−1}, a_{T−1}) is superadditive in (CS_{T−1} × a_{T−1}).
From Equation (A18), and based on the assumption on the cost parameters c_1 ≥ max(0, −c_2), there are three possible optimal actions at t = T − 1, depending on the magnitude of c_6 with respect to c_1 and c_2:
a_{T−1}(CS_{T−1}, d_{T−1}) = 0 if c_2 ≥ c_6 and c_1 ≥ −c_6; we thus get
V_{T−1}(CS_{T−1}, d_{T−1}) = (c_3 + c_6)·CS_{T−1} + c_4·min(d_{T−1}, CS_{T−1}) + c_5(d_{T−1} − CS_{T−1})^+.
a_{T−1}(CS_{T−1}, d_{T−1}) = −CS_{T−1} if c_2 ≤ c_6 and c_1 ≥ −c_6, so we have
V_{T−1}(CS_{T−1}, d_{T−1}) = (c_2 + c_3)·CS_{T−1} + c_4·min(d_{T−1}, CS_{T−1}) + c_5(d_{T−1} − CS_{T−1})^+.
a_{T−1}(CS_{T−1}, d_{T−1}) = MC − CS_{T−1} if c_1 ≤ −c_6, so we have
V_{T−1}(CS_{T−1}, d_{T−1}) = (c_1 + c_6)·MC + (c_3 − c_1)·CS_{T−1} + c_4·min(d_{T−1}, CS_{T−1}) + c_5(d_{T−1} − CS_{T−1})^+.
Next, we have the Bellman optimality equation for t = T − 2, which depends on V_{T−1}(·), as follows:
V_{T−2}(CS_{T−2}, d_{T−2}, a_{T−2}) = R_{T−2}(CS_{T−2}, d_{T−2}, a_{T−2}) + Σ_{d_{T−1}} P(d_{T−1} | d_{T−2}) V_{T−1}(CS_{T−2} + a_{T−2}, d_{T−1}).
As shown earlier, if V_{T−1}(CS_{T−2} + a_{T−2}, d_{T−1}) is superadditive in (CS_{T−2} × a_{T−2}), then V_{T−2}(CS_{T−2}, d_{T−2}, a_{T−2}) will be superadditive in (CS_{T−2} × a_{T−2}) for every d_{T−2}.
Let a_{2,T−2}, a_{1,T−2} be two actions such that a_{2,T−2} ≥ a_{1,T−2}. For the three cases above, the difference V_{T−1}(CS_{T−2} + a_{2,T−2}, d_{T−1}) − V_{T−1}(CS_{T−2} + a_{1,T−2}, d_{T−1}) equals
(c_3 + c_6)(a_{2,T−2} − a_{1,T−2}) + c_4 [min(d_{T−1}, CS_{T−2} + a_{2,T−2}) − min(d_{T−1}, CS_{T−2} + a_{1,T−2})] + c_5 [(d_{T−1} − CS_{T−2} − a_{2,T−2})^+ − (d_{T−1} − CS_{T−2} − a_{1,T−2})^+], if c_2 ≥ c_6 and c_1 ≥ −c_6;
(c_3 + c_2)(a_{2,T−2} − a_{1,T−2}) + [the same c_4 and c_5 terms], if c_2 ≤ c_6 and c_1 ≥ −c_6;
(c_3 − c_1)(a_{2,T−2} − a_{1,T−2}) + [the same c_4 and c_5 terms], if c_1 ≤ −c_6.   (A19)
The first parts in Equation (A19), namely (c_3 + c_6)(a_{2,T−2} − a_{1,T−2}), (c_3 + c_2)(a_{2,T−2} − a_{1,T−2}), and (c_3 − c_1)(a_{2,T−2} − a_{1,T−2}), are constants. Studying the second part of Equation (A19), c_4 [min(d_{T−1}, CS_{T−2} + a_{2,T−2}) − min(d_{T−1}, CS_{T−2} + a_{1,T−2})] + c_5 [(d_{T−1} − CS_{T−2} − a_{2,T−2})^+ − (d_{T−1} − CS_{T−2} − a_{1,T−2})^+], we find, based on Lemma 2, that it is nondecreasing in CS_{T−2}.
This means V_{T−1}(CS_{T−2} + a_{2,T−2}, d_{T−1}) − V_{T−1}(CS_{T−2} + a_{1,T−2}, d_{T−1}) is nondecreasing in CS_{T−2}, which makes V_{T−2}(CS_{T−2}, d_{T−2}, a_{T−2}) superadditive in (CS_{T−2} × a_{T−2}).
So far, the following results have been established:
1. V_{T−2}(CS_{T−2}, d_{T−2}, a_{T−2}), V_{T−1}(CS_{T−1}, d_{T−1}, a_{T−1}), and V_T(CS_T, d_T, a_T) are superadditive in (CS_{T−2} × a_{T−2}), (CS_{T−1} × a_{T−1}), and (CS_T × a_T), respectively.
2. V_{T−1}(CS_{T−2} + a_{T−2}, d_{T−1}) and V_T(CS_{T−1} + a_{T−1}, d_T) are superadditive in (CS_{T−2} × a_{T−2}) and (CS_{T−1} × a_{T−1}), respectively.
This gives the first step of the induction proof of Proposition 1. Next, assuming that V_{t+1}(CS_{t+1}, d_{t+1}, a_{t+1}) and V_{t+2}(CS_{t+1} + a_{t+1}, d_{t+2}) are both superadditive in (CS_{t+1} × a_{t+1}) for every d_{t+1}, we need to prove that V_{t+1}(CS_t + a_t, d_{t+1}) is superadditive in (CS_t × a_t), which in turn proves that V_t(CS_t, d_t, a_t) is superadditive in (CS_t × a_t).
That is, we need to check whether V_{t+1}(CS_t + a_{2t}, d_{t+1}) − V_{t+1}(CS_t + a_{1t}, d_{t+1}) is nondecreasing in CS_t for any a_{2t} ≥ a_{1t} and every d_{t+1}.
Let the optimal action at time epoch t + 1 in state (CS_{t+1} = 0, d_{t+1}) be K:
a_{t+1}(0, d_{t+1}) = K.
This implies that, when in a state (CS_{t+1}, d_{t+1}) with CS_{t+1} ≤ K, a_{t+1}(CS_{t+1}, d_{t+1}) + CS_{t+1} = K, based on Theorem 1.
And let the optimal action at time epoch t + 1 in state (CS_{t+1} = MC, d_{t+1}) be L − MC:
a_{t+1}(MC, d_{t+1}) = L − MC.
This implies that, when in a state (CS_{t+1}, d_{t+1}) with CS_{t+1} ≥ L, a_{t+1}(CS_{t+1}, d_{t+1}) + CS_{t+1} = L, based on Theorem 2. Moreover, Theorems 1 and 2 imply that L ≥ K.
Let K ≥ a_{2t} ≥ a_{1t} (the cases a_{2t} > K > a_{1t} and a_{2t} > a_{1t} > K are treated analogously).
To study the behaviour of V_{t+1}(CS_t + a_{2t}, d_{t+1}) − V_{t+1}(CS_t + a_{1t}, d_{t+1}) as CS_t increases, for a given d_{t+1}, we have two possibilities depending on the selection of a_{1t} and a_{2t}:
Possibility 1:
When K − a_{1t} ≤ L − a_{2t}.
Figure A3. Proposition 1, Possibility 1: CS_t regions.
As shown in Figure A3 for Possibility 1, the following regions are considered:
Region I:
In this region, CS_{t+1} ≤ K in both functions V_{t+1}(CS_t + a_{2t}, d_{t+1}) and V_{t+1}(CS_t + a_{1t}, d_{t+1}). This implies
max(0, −a_{1t}) ≤ CS_t ≤ K − a_{2t}.
Based on Theorem 1, the optimal action is to go to capacity K from both states (CS_t + a_{2t}, d_{t+1}) and (CS_t + a_{1t}, d_{t+1}); hence,
V_{t+1}(CS_t + a_{2t}, d_{t+1}) − V_{t+1}(CS_t + a_{1t}, d_{t+1}) = [c_1(K − CS_t − a_{2t}) + c_3(CS_t + a_{2t}) + c_4·min(d_{t+1}, CS_t + a_{2t}) + c_5(d_{t+1} − CS_t − a_{2t})^+] − [c_1(K − CS_t − a_{1t}) + c_3(CS_t + a_{1t}) + c_4·min(d_{t+1}, CS_t + a_{1t}) + c_5(d_{t+1} − CS_t − a_{1t})^+].
Simplifying,
V_{t+1}(CS_t + a_{2t}, d_{t+1}) − V_{t+1}(CS_t + a_{1t}, d_{t+1}) = (c_3 − c_1)(a_{2t} − a_{1t}) + c_4 [min(d_{t+1}, CS_t + a_{2t}) − min(d_{t+1}, CS_t + a_{1t})] + c_5 [(d_{t+1} − CS_t − a_{2t})^+ − (d_{t+1} − CS_t − a_{1t})^+].
Lemma 2 shows that c_4 [min(d_{t+1}, CS_t + a_{2t}) − min(d_{t+1}, CS_t + a_{1t})] + c_5 [(d_{t+1} − CS_t − a_{2t})^+ − (d_{t+1} − CS_t − a_{1t})^+] is nondecreasing in CS_t.
Hence, in Region I, V_{t+1}(CS_t + a_t, d_{t+1}) is superadditive in (CS_t × a_t).
Region II:
$CS_{t+1} \le K$ in the function $V_{t+1}(CS_t+a_{1t}, d_{t+1})$, and $K \le CS_{t+1} \le L$ in the function $V_{t+1}(CS_t+a_{2t}, d_{t+1})$.
This implies
$K - a_{2t} \le CS_t \le K - a_{1t}$.
Based on Theorems 1 and 2, the optimal action would be to go to capacity $K$ for states $(CS_t+a_{1t}, d_{t+1})$ and to choose action $a = 0$ for states $(CS_t+a_{2t}, d_{t+1})$. Hence,
$V_{t+1}(CS_t+a_{2t}, d_{t+1}) - V_{t+1}(CS_t+a_{1t}, d_{t+1}) = c_3(a_{2t}-a_{1t}) + c_4[\min(d_{t+1}, CS_t+a_{2t}) - \min(d_{t+1}, CS_t+a_{1t})] + c_5[(d_{t+1}-CS_t-a_{2t})^+ - (d_{t+1}-CS_t-a_{1t})^+] - c_1(K-CS_t-a_{1t}) + \sum_{d_{t+2}} P(d_{t+2}|d_{t+1})[V_{t+2}(CS_t+a_{2t}, d_{t+2}) - V_{t+2}(K, d_{t+2})]$. (A20)
We study Equation (A20) in parts:
1. $c_3(a_{2t}-a_{1t}) - \sum_{d_{t+2}} P(d_{t+2}|d_{t+1})\, V_{t+2}(K, d_{t+2}) - c_1(K-a_{1t})$ is fixed in this region.
2. Lemma 2 shows that $c_4[\min(d_{t+1}, CS_t+a_{2t}) - \min(d_{t+1}, CS_t+a_{1t})] + c_5[(d_{t+1}-CS_t-a_{2t})^+ - (d_{t+1}-CS_t-a_{1t})^+]$ is nondecreasing in $CS_t$.
3. The last part is as follows:
$c_1 CS_t + \sum_{d_{t+2}} P(d_{t+2}|d_{t+1})\, V_{t+2}(CS_t+a_{2t}, d_{t+2})$. (A21)
In this region, action $a = 0$ is optimal in state $(CS_t+a_{2t}, d_{t+1})$, so
$V_{t+1}(CS_t+a_{2t}, d_{t+1}, 0) \le V_{t+1}(CS_t+a_{2t}, d_{t+1}, 1)$.
This gives
$\sum_{d_{t+2}} P(d_{t+2}|d_{t+1})\, V_{t+2}(CS_t+a_{2t}+1, d_{t+2}) - \sum_{d_{t+2}} P(d_{t+2}|d_{t+1})\, V_{t+2}(CS_t+a_{2t}, d_{t+2}) \ge -c_1$. (A22)
Equation (A22) implies that for a unit increase in $CS_t$, the decrease in $\sum_{d_{t+2}} P(d_{t+2}|d_{t+1})\, V_{t+2}(CS_t+a_{2t}, d_{t+2})$ in Equation (A21) will not exceed $c_1$.
This shows that the last part, $c_1 CS_t + \sum_{d_{t+2}} P(d_{t+2}|d_{t+1})\, V_{t+2}(CS_t+a_{2t}, d_{t+2})$, is nondecreasing in $CS_t$.
Hence, in region II, $V_{t+1}(CS_t+a_t, d_{t+1})$ is superadditive in $CS_t \times a_t$.
Region III:
$K \le CS_{t+1} \le L$ in both the functions $V_{t+1}(CS_t+a_{2t}, d_{t+1})$ and $V_{t+1}(CS_t+a_{1t}, d_{t+1})$.
This implies
$K - a_{1t} \le CS_t \le L - a_{2t}$.
Since action $a = 0$ is optimal in both states,
$V_{t+1}(CS_t+a_{2t}, d_{t+1}) - V_{t+1}(CS_t+a_{1t}, d_{t+1}) = c_3(a_{2t}-a_{1t}) + c_4[\min(d_{t+1}, CS_t+a_{2t}) - \min(d_{t+1}, CS_t+a_{1t})] + c_5[(d_{t+1}-CS_t-a_{2t})^+ - (d_{t+1}-CS_t-a_{1t})^+] + \sum_{d_{t+2}} P(d_{t+2}|d_{t+1})[V_{t+2}(CS_t+a_{2t}, d_{t+2}) - V_{t+2}(CS_t+a_{1t}, d_{t+2})]$.
$c_3(a_{2t}-a_{1t})$ is a fixed value. Lemma 2 shows that $c_4[\min(d_{t+1}, CS_t+a_{2t}) - \min(d_{t+1}, CS_t+a_{1t})] + c_5[(d_{t+1}-CS_t-a_{2t})^+ - (d_{t+1}-CS_t-a_{1t})^+]$ is nondecreasing in $CS_t$. $V_{t+2}(CS_t+a_{2t}, d_{t+2}) - V_{t+2}(CS_t+a_{1t}, d_{t+2})$ is nondecreasing in $CS_t$ because $V_{t+2}(CS_t+a_t, d_{t+2})$ is superadditive in $CS_t \times a_t$.
Hence, in region III, $V_{t+1}(CS_t+a_t, d_{t+1})$ is superadditive in $CS_t \times a_t$.
Region IV:
$K \le CS_{t+1} \le L$ in the function $V_{t+1}(CS_t+a_{1t}, d_{t+1})$, and $CS_{t+1} \ge L$ in the function $V_{t+1}(CS_t+a_{2t}, d_{t+1})$.
This implies
$L - a_{2t} \le CS_t \le L - a_{1t}$.
$V_{t+1}(CS_t+a_{2t}, d_{t+1}) - V_{t+1}(CS_t+a_{1t}, d_{t+1}) = c_2(CS_t+a_{2t}-L) + c_3(a_{2t}-a_{1t}) + c_4[\min(d_{t+1}, CS_t+a_{2t}) - \min(d_{t+1}, CS_t+a_{1t})] + c_5[(d_{t+1}-CS_t-a_{2t})^+ - (d_{t+1}-CS_t-a_{1t})^+] + \sum_{d_{t+2}} P(d_{t+2}|d_{t+1})[V_{t+2}(L, d_{t+2}) - V_{t+2}(CS_t+a_{1t}, d_{t+2})]$. (A23)
Equation (A23) is studied in parts:
1. $c_3(a_{2t}-a_{1t}) + \sum_{d_{t+2}} P(d_{t+2}|d_{t+1})\, V_{t+2}(L, d_{t+2}) + c_2(a_{2t}-L)$ is fixed in this region.
2. Lemma 2 shows that $c_4[\min(d_{t+1}, CS_t+a_{2t}) - \min(d_{t+1}, CS_t+a_{1t})] + c_5[(d_{t+1}-CS_t-a_{2t})^+ - (d_{t+1}-CS_t-a_{1t})^+]$ is nondecreasing in $CS_t$.
3. The last part is as follows:
$c_2 CS_t - \sum_{d_{t+2}} P(d_{t+2}|d_{t+1})\, V_{t+2}(CS_t+a_{1t}, d_{t+2})$. (A24)
In this region, action $a = 0$ is optimal in state $(CS_t+a_{1t}, d_{t+1})$, so
$V_{t+1}(CS_t+a_{1t}, d_{t+1}, 0) \le V_{t+1}(CS_t+a_{1t}, d_{t+1}, -1)$.
This gives
$\sum_{d_{t+2}} P(d_{t+2}|d_{t+1})\, V_{t+2}(CS_t+a_{1t}-1, d_{t+2}) - \sum_{d_{t+2}} P(d_{t+2}|d_{t+1})\, V_{t+2}(CS_t+a_{1t}, d_{t+2}) \ge -c_2$. (A25)
Equation (A25) implies that for a unit increase in $CS_t$, the last part (Equation (A24)) is nondecreasing in $CS_t$.
Hence, in region IV, $V_{t+1}(CS_t+a_t, d_{t+1})$ is superadditive in $CS_t \times a_t$.
Region V:
$CS_{t+1} \ge L$ in both functions $V_{t+1}(CS_t+a_{1t}, d_{t+1})$ and $V_{t+1}(CS_t+a_{2t}, d_{t+1})$.
This implies
$CS_t \ge L - a_{1t}$.
$V_{t+1}(CS_t+a_{2t}, d_{t+1}) - V_{t+1}(CS_t+a_{1t}, d_{t+1}) = (c_2+c_3)(a_{2t}-a_{1t}) + c_4[\min(d_{t+1}, CS_t+a_{2t}) - \min(d_{t+1}, CS_t+a_{1t})] + c_5[(d_{t+1}-CS_t-a_{2t})^+ - (d_{t+1}-CS_t-a_{1t})^+]$.
$(c_2+c_3)(a_{2t}-a_{1t})$ does not change with an increase in $CS_t$. Lemma 2 shows that $c_4[\min(d_{t+1}, CS_t+a_{2t}) - \min(d_{t+1}, CS_t+a_{1t})] + c_5[(d_{t+1}-CS_t-a_{2t})^+ - (d_{t+1}-CS_t-a_{1t})^+]$ is nondecreasing in $CS_t$. Hence, in region V, $V_{t+1}(CS_t+a_t, d_{t+1})$ is superadditive in $CS_t \times a_t$.
This completes the proof for Possibility 1, when $K - a_{1t} \le L - a_{2t}$.
Now, we move on to Possibility 2, when $K - a_{1t} > L - a_{2t}$. The proof process is the same as in Possibility 1, which we just addressed.
Possibility 2:
When $K - a_{1t} > L - a_{2t}$
Figure A4. Proposition 1, Possibility 2—$CS_t$ regions.
As shown in Figure A4 for Possibility 2, the resulting regions are considered:
Region I:
$CS_{t+1} \le K$ in both functions $V_{t+1}(CS_t+a_{2t}, d_{t+1})$ and $V_{t+1}(CS_t+a_{1t}, d_{t+1})$.
This implies
$\max(0, -a_{1t}) \le CS_t \le K - a_{2t}$.
This is the same as region I of Possibility 1.
Hence, in region I, $V_{t+1}(CS_t+a_t, d_{t+1})$ is superadditive in $CS_t \times a_t$.
Region II:
$CS_{t+1} \le K$ in the function $V_{t+1}(CS_t+a_{1t}, d_{t+1})$, and $K \le CS_{t+1} \le L$ in the function $V_{t+1}(CS_t+a_{2t}, d_{t+1})$. This implies
$K - a_{2t} \le CS_t \le L - a_{2t}$.
The region is given the same consideration as region II of Possibility 1.
Hence, in region II, $V_{t+1}(CS_t+a_t, d_{t+1})$ is superadditive in $CS_t \times a_t$.
Region III:
$CS_{t+1} \le K$ in the function $V_{t+1}(CS_t+a_{1t}, d_{t+1})$, and $CS_{t+1} \ge L$ in the function $V_{t+1}(CS_t+a_{2t}, d_{t+1})$. This implies
$L - a_{2t} \le CS_t \le K - a_{1t}$.
$V_{t+1}(CS_t+a_{2t}, d_{t+1}) - V_{t+1}(CS_t+a_{1t}, d_{t+1}) = -c_1(K-CS_t-a_{1t}) + c_2(CS_t+a_{2t}-L) + c_3(a_{2t}-a_{1t}) + c_4[\min(d_{t+1}, CS_t+a_{2t}) - \min(d_{t+1}, CS_t+a_{1t})] + c_5[(d_{t+1}-CS_t-a_{2t})^+ - (d_{t+1}-CS_t-a_{1t})^+] + \sum_{d_{t+2}} P(d_{t+2}|d_{t+1})[V_{t+2}(L, d_{t+2}) - V_{t+2}(K, d_{t+2})]$. (A26)
We study Equation (A26) in parts:
1. $-c_1(K-a_{1t}) + c_2(a_{2t}-L) + c_3(a_{2t}-a_{1t}) + \sum_{d_{t+2}} P(d_{t+2}|d_{t+1})[V_{t+2}(L, d_{t+2}) - V_{t+2}(K, d_{t+2})]$ is a fixed value in this region upon an increase in $CS_t$.
2. Lemma 2 shows that $c_4[\min(d_{t+1}, CS_t+a_{2t}) - \min(d_{t+1}, CS_t+a_{1t})] + c_5[(d_{t+1}-CS_t-a_{2t})^+ - (d_{t+1}-CS_t-a_{1t})^+]$ is nondecreasing in $CS_t$.
3. $(c_1+c_2)CS_t$ is nondecreasing in $CS_t$, considering the assumption that $c_1 \ge \max(0, -c_2)$.
Hence, in region III, $V_{t+1}(CS_t+a_t, d_{t+1})$ is superadditive in $CS_t \times a_t$.
Region IV:
$K \le CS_{t+1} \le L$ in the function $V_{t+1}(CS_t+a_{1t}, d_{t+1})$, and $CS_{t+1} \ge L$ in the function $V_{t+1}(CS_t+a_{2t}, d_{t+1})$. This implies
$K - a_{1t} \le CS_t \le L - a_{1t}$.
The region is given the same consideration as region IV of Possibility 1.
Hence, in region IV, $V_{t+1}(CS_t+a_t, d_{t+1})$ is superadditive in $CS_t \times a_t$.
Region V:
The same consideration is given as for region V of Possibility 1. Hence, in region V, $V_{t+1}(CS_t+a_t, d_{t+1})$ is superadditive in $CS_t \times a_t$.
This completes the proof for Possibility 2.
We have thus shown that $V_{t+1}(CS_t+a_t, d_{t+1})$ is superadditive in $CS_t \times a_t$ in the given model if $V_{t+1}(CS_{t+1}, d_{t+1}, a_{t+1})$ and $V_{t+2}(CS_{t+1}+a_{t+1}, d_{t+2})$ are superadditive in $CS_{t+1} \times a_{t+1}$.
We have thus proved by induction that $V_t(CS_t, d_t, a_t)$ is superadditive in $CS_t \times a_t$ $\forall d_t$. □
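Proposition 1 lends itself to a quick numerical spot-check via backward induction on a small instance. Everything below (cost values, horizon, capacity bound, and the randomly generated demand transition matrix) is an illustrative assumption rather than data from the paper; the stage cost follows Equations (3) and (7), and the terminal value is taken as $c_6 \cdot CS_T$, with a negative $c_6$ playing the role of a salvage credit:

```python
import numpy as np

# Illustrative parameters (assumed); they satisfy c5 > c4 + c3 and c1 >= max(0, -c2).
c1, c2, c3, c4, c5 = 5.0, 2.0, 1.0, 2.0, 4.0
c6 = -1.5                           # terminal capacity coefficient (salvage credit; sign assumed)
MC, T, D = 6, 5, 4                  # max capacity, horizon, number of demand levels (assumed)
rng = np.random.default_rng(0)
P = rng.random((D, D))
P /= P.sum(axis=1, keepdims=True)   # arbitrary row-stochastic demand transition matrix

V = np.zeros((T + 1, MC + 1, D))    # V[t, CS, d] = optimal cost-to-go
V[T] = c6 * np.arange(MC + 1)[:, None]
for t in range(T - 1, -1, -1):
    for cs in range(MC + 1):
        for d in range(D):
            hold = c3 * cs + c4 * min(d, cs) + c5 * max(d - cs, 0)
            V[t, cs, d] = min(
                c1 * max(a, 0) + c2 * max(-a, 0) + hold + P[d] @ V[t + 1, cs + a]
                for a in range(-cs, MC - cs + 1))

# Superadditivity of V_{t+1}(CS_t + a_t, d_{t+1}) in CS_t x a_t is equivalent to
# y -> V_{t+1}(y, d_{t+1}) having nondecreasing first differences in the
# post-action capacity y, i.e., nonnegative second differences.
second_diffs = np.diff(np.diff(V, axis=1), axis=1)
is_superadditive = bool(np.all(second_diffs >= -1e-9))
```

In this run, the check holds for every epoch and demand state, matching the proposition; parameter sets violating the assumed cost conditions (e.g., $c_5 < c_4$) may break it.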
Proof of Theorem 4:
From Equations (3) and (7),
$V_t(CS_t, d_t, a_t) = c_1(a_t)^+ + c_2(a_t)^- + c_3 CS_t + c_4\min(d_t, CS_t) + c_5(d_t-CS_t)^+ + \sum_{d_{t+1}} P(d_{t+1}|d_t)\, V_{t+1}(CS_t+a_t, d_{t+1})$.
For $a_t < 0$, this gives
$V_t(CS_t, d_t, a_t+1) - V_t(CS_t, d_t, a_t) = -c_2 + \sum_{d_{t+1}} P(d_{t+1}|d_t)[V_{t+1}(CS_t+a_t+1, d_{t+1}) - V_{t+1}(CS_t+a_t, d_{t+1})]$,
and for $a_t \ge 0$, this gives
$V_t(CS_t, d_t, a_t+1) - V_t(CS_t, d_t, a_t) = c_1 + \sum_{d_{t+1}} P(d_{t+1}|d_t)[V_{t+1}(CS_t+a_t+1, d_{t+1}) - V_{t+1}(CS_t+a_t, d_{t+1})]$.
Considering the assumption $c_1 \ge \max(0, -c_2)$, $V_t(CS_t, d_t, a_t+1) - V_t(CS_t, d_t, a_t)$ is nondecreasing in $a_t$ if $V_{t+1}(CS_t+a_t+1, d_{t+1}) - V_{t+1}(CS_t+a_t, d_{t+1})$ is nondecreasing in $a_t$.
In the proof of Proposition 1, we showed that $V_{t+1}(CS_t+a_t, d_{t+1})$ is superadditive in $CS_t \times a_t$ $\forall d_{t+1}, t$.
Thus, for all $CS_2 \ge CS_1$ in $CS_t$ and all $a_2 \ge a_1$ in $a_t$, it follows that
$V_{t+1}(CS_2+a_2, d_{t+1}) - V_{t+1}(CS_2+a_1, d_{t+1}) \ge V_{t+1}(CS_1+a_2, d_{t+1}) - V_{t+1}(CS_1+a_1, d_{t+1})$.
Let $CS_1 = CS$, $CS_2 = CS+1$, $a_1 = a$, and $a_2 = a+1$. This yields
$V_{t+1}(CS+a+2, d_{t+1}) - V_{t+1}(CS+a+1, d_{t+1}) \ge V_{t+1}(CS+a+1, d_{t+1}) - V_{t+1}(CS+a, d_{t+1})$. (A27)
Equation (A27) implies that $V_{t+1}(CS+a+1, d_{t+1}) - V_{t+1}(CS+a, d_{t+1})$ is nondecreasing in $a$, thus completing the proof of Theorem 4. □
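Theorem 4 is what licenses a cheaper inner search in value iteration: because the differences $V_t(CS_t, d_t, a_t+1) - V_t(CS_t, d_t, a_t)$ are nondecreasing in $a_t$, scanning actions in increasing order can stop at the first strict increase. The sketch below contrasts that early-exit scan with an exhaustive one on the same hypothetical model as above (all parameters are assumptions, not the paper's data):

```python
import numpy as np

# Illustrative parameters (assumptions), consistent with the stated cost conditions.
c1, c2, c3, c4, c5 = 5.0, 2.0, 1.0, 2.0, 4.0
c6 = -1.5                           # terminal salvage coefficient (sign assumed)
MC, T, D = 6, 5, 4
rng = np.random.default_rng(1)
P = rng.random((D, D))
P /= P.sum(axis=1, keepdims=True)

def solve(early_exit):
    """Backward induction; optionally stop the action scan at the first increase."""
    V = np.zeros((T + 1, MC + 1, D))
    V[T] = c6 * np.arange(MC + 1)[:, None]
    for t in range(T - 1, -1, -1):
        for cs in range(MC + 1):
            for d in range(D):
                hold = c3 * cs + c4 * min(d, cs) + c5 * max(d - cs, 0)
                best = np.inf
                for a in range(-cs, MC - cs + 1):
                    q = (c1 * max(a, 0) + c2 * max(-a, 0) + hold
                         + P[d] @ V[t + 1, cs + a])
                    if early_exit and q > best:
                        break       # by Theorem 4, Q can only grow from here on
                    best = min(best, q)
                V[t, cs, d] = best
    return V

same_values = bool(np.allclose(solve(False), solve(True)))
```

The early-exit scan returns the same value function while evaluating fewer actions per state, which is the kind of saving the revised value iteration algorithm exploits.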
Proof of Lemma 3:
To state that $c_4\min(d_t, CS_{t-1}+a_{t-1}) + c_5(d_t-CS_{t-1}-a_{t-1})^+$ is subadditive in $d_t \times a_{t-1}$ $\forall CS_{t-1}$ means that, $\forall a_{2,t-1} \ge a_{1,t-1}$ in $a_{t-1}$,
$c_4[\min(d_t, CS_{t-1}+a_{2,t-1}) - \min(d_t, CS_{t-1}+a_{1,t-1})] + c_5[(d_t-CS_{t-1}-a_{2,t-1})^+ - (d_t-CS_{t-1}-a_{1,t-1})^+]$
is nonincreasing in $d_t$. This expression has three regions:
(1) $d_t \le CS_{t-1}+a_{1,t-1}$;
(2) $CS_{t-1}+a_{1,t-1} \le d_t \le CS_{t-1}+a_{2,t-1}$;
(3) $d_t \ge CS_{t-1}+a_{2,t-1}$.
Region 1: $d_t \le CS_{t-1}+a_{1,t-1}$. Here, the expression equals $0$.
Region 2: $CS_{t-1}+a_{1,t-1} \le d_t \le CS_{t-1}+a_{2,t-1}$. Here, the expression equals $(c_5-c_4)(CS_{t-1}+a_{1,t-1}-d_t)$. Note that the cost parameter assumption $c_5 > c_4 + c_3$ implies $c_5 > c_4$; hence, the expression decreases with $d_t$.
Region 3: $d_t \ge CS_{t-1}+a_{2,t-1}$. Here, the expression equals $(c_5-c_4)(a_{1,t-1}-a_{2,t-1})$, a constant.
Hence, the expression is nonincreasing in $d_t$. □
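The three-region argument can be verified directly. The snippet below evaluates the Lemma 3 expression over a demand grid for one hypothetical choice of $CS_{t-1}$, $a_{1,t-1}$, and $a_{2,t-1}$ (all values assumed for illustration, with $c_5 > c_4$) and confirms it is nonincreasing in $d_t$:

```python
# Direct check of Lemma 3's three-region argument (illustrative values).
c4, c5 = 2.0, 4.0                   # satisfy c5 > c4, as implied by c5 > c4 + c3
cs, a1, a2 = 3, 1, 5                # hypothetical CS_{t-1}, a_{1,t-1}, a_{2,t-1}; a2 >= a1

def pos(x):
    """(x)^+ = max(x, 0)."""
    return max(x, 0)

def lemma3_term(d):
    # c4*[min(d, cs+a2) - min(d, cs+a1)] + c5*[(d-cs-a2)^+ - (d-cs-a1)^+]
    return (c4 * (min(d, cs + a2) - min(d, cs + a1))
            + c5 * (pos(d - cs - a2) - pos(d - cs - a1)))

values = [lemma3_term(d) for d in range(15)]
nonincreasing = all(x >= y for x, y in zip(values, values[1:]))
```

The three regions are visible in `values`: the term is $0$ up to $d_t = CS_{t-1}+a_1 = 4$, falls with slope $c_4 - c_5 = -2$ until $d_t = CS_{t-1}+a_2 = 8$, and is constant at $(c_5-c_4)(a_1-a_2) = -8$ afterward.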
Proof of Proposition 2:
The proof of this proposition is quite similar to the proof of Proposition 1 and is also established using mathematical induction, starting from the last time epoch $T$, where the capacity is salvaged.
From Equation (2),
$V_T(CS_T, d_T, a_T) = V_T(CS_T, d_T) = c_6 \cdot CS_T$.
$V_T(CS_T, d_T, a_{1T}) - V_T(CS_T, d_T, a_{2T}) = 0$ for any $d_T$ and any combination of $a_{1T}$ and $a_{2T}$. Hence, we find that $V_T(CS_T, d_T, a_T)$ is subadditive in $d_T \times a_T$ $\forall CS_T$.
We write Equation (7) as follows:
$V_t(CS_t, d_t, a_t) = R_t(CS_t, d_t, a_t) + \sum_{d_{t+1}} P(d_{t+1}|d_t)\, V_{t+1}(CS_t+a_t, d_{t+1})$.
Considering the elements of this equation,
$R_t(CS_t, d_t, a_t) = c_1(a_t)^+ + c_2(a_t)^- + c_3 CS_t + c_4\min(d_t, CS_t) + c_5(d_t-CS_t)^+$.
$R_t(CS_t, d_t, a_t)$ is subadditive in $d_t \times a_t$ $\forall CS_t$ because
$R_t(CS_t, d_t, a_{2t}) - R_t(CS_t, d_t, a_{1t})$ is a constant independent of $d_t$ ($\forall a_{2t}, a_{1t}$).
Hence, $V_t(CS_t, d_t, a_t)$ will be subadditive in $d_t \times a_t$ if $\sum_{d_{t+1}} P(d_{t+1}|d_t)\, V_{t+1}(CS_t+a_t, d_{t+1})$ is subadditive in $d_t \times a_t$.
Moreover, since $P(d_{t+1}|d_t)$ exhibits first-order stochastic dominance, $\sum_{d_{t+1}} P(d_{t+1}|d_t)\, V_{t+1}(CS_t+a_t, d_{t+1})$ will be subadditive in $d_t \times a_t$ if $V_{t+1}(CS_t+a_t, d_{t+1})$ is subadditive in $d_{t+1} \times a_t$.
In conclusion, $V_t(CS_t, d_t, a_t)$ will be subadditive in $d_t \times a_t$ $\forall d_t$ if $V_{t+1}(CS_t+a_t, d_{t+1})$ is subadditive in $d_{t+1} \times a_t$.
This will be shown by induction.
Starting with the last period $t = T$: since $V_T(CS_T, d_T) = c_6 \cdot CS_T$, $V_T(CS_{T-1}+a_{T-1}, d_T)$ is subadditive in $d_T \times a_{T-1}$, as for $a_{2,T-1} \ge a_{1,T-1}$:
$V_T(CS_{T-1}+a_{2,T-1}, d_T) - V_T(CS_{T-1}+a_{1,T-1}, d_T) = c_6(a_{2,T-1}-a_{1,T-1})$,
which is independent of $d_T$.
From Equation (A18),
$V_{T-1}(CS_{T-1}, d_{T-1}, a_{T-1}) = c_1(a_{T-1})^+ + c_2(a_{T-1})^- + c_3 CS_{T-1} + c_4\min(d_{T-1}, CS_{T-1}) + c_5(d_{T-1}-CS_{T-1})^+ + c_6(CS_{T-1}+a_{T-1})$.
It is evident that $V_{T-1}(CS_{T-1}, d_{T-1}, a_{T-1})$ is subadditive in $d_{T-1} \times a_{T-1}$.
From Equation (A19), for $c_2 \ge c_6$ and $c_1 \ge -c_6$:
$V_{T-1}(CS_{T-2}+a_{2,T-2}, d_{T-1}) - V_{T-1}(CS_{T-2}+a_{1,T-2}, d_{T-1}) = (c_3+c_6)(a_{2,T-2}-a_{1,T-2}) + c_4[\min(d_{T-1}, CS_{T-2}+a_{2,T-2}) - \min(d_{T-1}, CS_{T-2}+a_{1,T-2})] + c_5[(d_{T-1}-CS_{T-2}-a_{2,T-2})^+ - (d_{T-1}-CS_{T-2}-a_{1,T-2})^+]$.
The first part of Equation (A19), $(c_3+c_6)(a_{2,T-2}-a_{1,T-2})$, is a constant. The second part, $c_4[\min(d_{T-1}, CS_{T-2}+a_{2,T-2}) - \min(d_{T-1}, CS_{T-2}+a_{1,T-2})] + c_5[(d_{T-1}-CS_{T-2}-a_{2,T-2})^+ - (d_{T-1}-CS_{T-2}-a_{1,T-2})^+]$, is nonincreasing in $d_{T-1}$ by Lemma 3. Similar arguments can be made for the remaining cases of the relations between $c_2$ and $c_6$ and between $c_1$ and $-c_6$; the proofs are straightforward and hence skipped here.
This means $V_{T-1}(CS_{T-2}+a_{2,T-2}, d_{T-1}) - V_{T-1}(CS_{T-2}+a_{1,T-2}, d_{T-1})$ is nonincreasing in $d_{T-1}$, which shows that $V_{T-2}(CS_{T-2}, d_{T-2}, a_{T-2})$ is subadditive in $d_{T-2} \times a_{T-2}$.
So far, the following results have been established:
1. $V_{T-2}(CS_{T-2}, d_{T-2}, a_{T-2})$, $V_{T-1}(CS_{T-1}, d_{T-1}, a_{T-1})$, and $V_T(CS_T, d_T, a_T)$ are subadditive in $d_{T-2} \times a_{T-2}$, $d_{T-1} \times a_{T-1}$, and $d_T \times a_T$, respectively.
2. $V_{T-1}(CS_{T-2}+a_{T-2}, d_{T-1})$ and $V_T(CS_{T-1}+a_{T-1}, d_T)$ are subadditive in $d_{T-1} \times a_{T-2}$ and $d_T \times a_{T-1}$, respectively.
This establishes the first step of the induction proof of the proposition. Next, assuming that $V_{t+1}(CS_{t+1}, d_{t+1}, a_{t+1})$ and $V_{t+2}(CS_{t+1}+a_{t+1}, d_{t+2})$ are subadditive in $d_{t+1} \times a_{t+1}$ and $d_{t+2} \times a_{t+1}$, respectively, $\forall CS_{t+1}$, we need to prove that $V_{t+1}(CS_t+a_t, d_{t+1})$ is subadditive in $d_{t+1} \times a_t$, which in turn proves that $V_t(CS_t, d_t, a_t)$ is subadditive in $d_t \times a_t$.
That is, we must check whether $V_{t+1}(CS_t+a_{2t}, d_{t+1}) - V_{t+1}(CS_t+a_{1t}, d_{t+1})$ is nonincreasing in $d_{t+1}$ for any $a_{2t} \ge a_{1t}$, $\forall CS_t$.
Let the optimal action in time epoch $t+1$ and state $(CS_{t+1}=0, d_{t+1})$ be $K$:
$a_{t+1}(0, d_{t+1}) = K$.
And let the optimal action in time epoch $t+1$ and state $(CS_{t+1}=MC, d_{t+1})$ be $L-MC$:
$a_{t+1}(MC, d_{t+1}) = L-MC$.
The understanding here is the same as explained for Proposition 1.
From Theorems 1 and 2, it is understood that $L \ge K$.
Let $K \ge a_{2t} \ge a_{1t}$ ($a_{2t} > K > a_{1t}$ and $a_{2t} > a_{1t} > K$ are special cases of this).
The following steps parallel those in the proof of Proposition 1; again, to study the behavior of $V_{t+1}(CS_t+a_{2t}, d_{t+1}) - V_{t+1}(CS_t+a_{1t}, d_{t+1})$ as $d_{t+1}$ increases for a given $CS_t$, we have two possibilities depending on the selection of $a_{1t}$ and $a_{2t}$:
Possibility 1:
When $K - a_{1t} \le L - a_{2t}$
Figure A5. Proposition 2, Possibility 1—$CS_t$ regions.
As shown in Figure A5 for Possibility 1, the resulting regions are considered:
Region I:
In this region, $CS_{t+1} \le K$ in both functions $V_{t+1}(CS_t+a_{2t}, d_{t+1})$ and $V_{t+1}(CS_t+a_{1t}, d_{t+1})$.
This implies
$\max(0, -a_{1t}) \le CS_t \le K - a_{2t}$.
Based on Theorem 1, the optimal action would be to go to capacity $K$ for both states $(CS_t+a_{2t}, d_{t+1})$ and $(CS_t+a_{1t}, d_{t+1})$; hence,
$V_{t+1}(CS_t+a_{2t}, d_{t+1}) - V_{t+1}(CS_t+a_{1t}, d_{t+1}) = [c_1(K-CS_t-a_{2t}) + c_3(CS_t+a_{2t}) + c_4\min(d_{t+1}, CS_t+a_{2t}) + c_5(d_{t+1}-CS_t-a_{2t})^+] - [c_1(K-CS_t-a_{1t}) + c_3(CS_t+a_{1t}) + c_4\min(d_{t+1}, CS_t+a_{1t}) + c_5(d_{t+1}-CS_t-a_{1t})^+]$.
Simplifying,
$V_{t+1}(CS_t+a_{2t}, d_{t+1}) - V_{t+1}(CS_t+a_{1t}, d_{t+1}) = (c_3-c_1)(a_{2t}-a_{1t}) + c_4[\min(d_{t+1}, CS_t+a_{2t}) - \min(d_{t+1}, CS_t+a_{1t})] + c_5[(d_{t+1}-CS_t-a_{2t})^+ - (d_{t+1}-CS_t-a_{1t})^+]$.
Lemma 3 shows that $c_4[\min(d_{t+1}, CS_t+a_{2t}) - \min(d_{t+1}, CS_t+a_{1t})] + c_5[(d_{t+1}-CS_t-a_{2t})^+ - (d_{t+1}-CS_t-a_{1t})^+]$ is nonincreasing in $d_{t+1}$.
Hence, in region I, $V_{t+1}(CS_t+a_t, d_{t+1})$ is subadditive in $d_{t+1} \times a_t$.
Region II:
$CS_{t+1} \le K$ in the function $V_{t+1}(CS_t+a_{1t}, d_{t+1})$, and $K \le CS_{t+1} \le L$ in the function $V_{t+1}(CS_t+a_{2t}, d_{t+1})$; this implies
$K - a_{2t} \le CS_t \le K - a_{1t}$.
Based on Theorems 1 and 2, the optimal action would be to go to capacity $K$ for states $(CS_t+a_{1t}, d_{t+1})$ and to choose action $a = 0$ for states $(CS_t+a_{2t}, d_{t+1})$:
$V_{t+1}(CS_t+a_{2t}, d_{t+1}) - V_{t+1}(CS_t+a_{1t}, d_{t+1}) = c_3(a_{2t}-a_{1t}) + c_4[\min(d_{t+1}, CS_t+a_{2t}) - \min(d_{t+1}, CS_t+a_{1t})] + c_5[(d_{t+1}-CS_t-a_{2t})^+ - (d_{t+1}-CS_t-a_{1t})^+] - c_1(K-CS_t-a_{1t}) + \sum_{d_{t+2}} P(d_{t+2}|d_{t+1})[V_{t+2}(CS_t+a_{2t}, d_{t+2}) - V_{t+2}(K, d_{t+2})]$. (A28)
We study Equation (A28) in parts:
1. $c_3(a_{2t}-a_{1t}) - c_1(K-CS_t-a_{1t})$ is fixed in this region (for a given $CS_t$).
2. Lemma 3 shows that $c_4[\min(d_{t+1}, CS_t+a_{2t}) - \min(d_{t+1}, CS_t+a_{1t})] + c_5[(d_{t+1}-CS_t-a_{2t})^+ - (d_{t+1}-CS_t-a_{1t})^+]$ is nonincreasing in $d_{t+1}$.
3. $\sum_{d_{t+2}} P(d_{t+2}|d_{t+1})[V_{t+2}(CS_t+a_{2t}, d_{t+2}) - V_{t+2}(K, d_{t+2})]$ is nonincreasing in $d_{t+1}$ because $P(d_{t+2}|d_{t+1})$ exhibits first-order stochastic dominance and $V_{t+2}(CS_t+a_t, d_{t+2})$ is subadditive in $d_{t+2} \times a_t$.
Hence, in region II, $V_{t+1}(CS_t+a_t, d_{t+1})$ is subadditive in $d_{t+1} \times a_t$.
Region III:
$K \le CS_{t+1} \le L$ in both the functions $V_{t+1}(CS_t+a_{2t}, d_{t+1})$ and $V_{t+1}(CS_t+a_{1t}, d_{t+1})$.
This implies
$K - a_{1t} \le CS_t \le L - a_{2t}$.
$V_{t+1}(CS_t+a_{2t}, d_{t+1}) - V_{t+1}(CS_t+a_{1t}, d_{t+1}) = c_3(a_{2t}-a_{1t}) + c_4[\min(d_{t+1}, CS_t+a_{2t}) - \min(d_{t+1}, CS_t+a_{1t})] + c_5[(d_{t+1}-CS_t-a_{2t})^+ - (d_{t+1}-CS_t-a_{1t})^+] + \sum_{d_{t+2}} P(d_{t+2}|d_{t+1})[V_{t+2}(CS_t+a_{2t}, d_{t+2}) - V_{t+2}(CS_t+a_{1t}, d_{t+2})]$.
$c_3(a_{2t}-a_{1t})$ is a fixed value. Lemma 3 shows that $c_4[\min(d_{t+1}, CS_t+a_{2t}) - \min(d_{t+1}, CS_t+a_{1t})] + c_5[(d_{t+1}-CS_t-a_{2t})^+ - (d_{t+1}-CS_t-a_{1t})^+]$ is nonincreasing in $d_{t+1}$. $\sum_{d_{t+2}} P(d_{t+2}|d_{t+1})[V_{t+2}(CS_t+a_{2t}, d_{t+2}) - V_{t+2}(CS_t+a_{1t}, d_{t+2})]$ is nonincreasing in $d_{t+1}$ because $P(d_{t+2}|d_{t+1})$ exhibits first-order stochastic dominance and $V_{t+2}(CS_t+a_t, d_{t+2})$ is subadditive in $d_{t+2} \times a_t$.
Hence, in region III, $V_{t+1}(CS_t+a_t, d_{t+1})$ is subadditive in $d_{t+1} \times a_t$.
Region IV:
$K \le CS_{t+1} \le L$ in the function $V_{t+1}(CS_t+a_{1t}, d_{t+1})$, and $CS_{t+1} \ge L$ in the function $V_{t+1}(CS_t+a_{2t}, d_{t+1})$.
This implies
$L - a_{2t} \le CS_t \le L - a_{1t}$.
$V_{t+1}(CS_t+a_{2t}, d_{t+1}) - V_{t+1}(CS_t+a_{1t}, d_{t+1}) = c_2(CS_t+a_{2t}-L) + c_3(a_{2t}-a_{1t}) + c_4[\min(d_{t+1}, CS_t+a_{2t}) - \min(d_{t+1}, CS_t+a_{1t})] + c_5[(d_{t+1}-CS_t-a_{2t})^+ - (d_{t+1}-CS_t-a_{1t})^+] + \sum_{d_{t+2}} P(d_{t+2}|d_{t+1})[V_{t+2}(L, d_{t+2}) - V_{t+2}(CS_t+a_{1t}, d_{t+2})]$. (A29)
We study Equation (A29) in parts:
1. $c_2(CS_t+a_{2t}-L) + c_3(a_{2t}-a_{1t})$ is fixed in this region (for a given $CS_t$).
2. Lemma 3 shows that $c_4[\min(d_{t+1}, CS_t+a_{2t}) - \min(d_{t+1}, CS_t+a_{1t})] + c_5[(d_{t+1}-CS_t-a_{2t})^+ - (d_{t+1}-CS_t-a_{1t})^+]$ is nonincreasing in $d_{t+1}$.
3. $\sum_{d_{t+2}} P(d_{t+2}|d_{t+1})[V_{t+2}(L, d_{t+2}) - V_{t+2}(CS_t+a_{1t}, d_{t+2})]$ is nonincreasing in $d_{t+1}$ because $P(d_{t+2}|d_{t+1})$ exhibits first-order stochastic dominance and $V_{t+2}(CS_t+a_t, d_{t+2})$ is subadditive in $d_{t+2} \times a_t$.
Hence, in region IV, $V_{t+1}(CS_t+a_t, d_{t+1})$ is subadditive in $d_{t+1} \times a_t$.
Region V:
$CS_{t+1} \ge L$ in both functions $V_{t+1}(CS_t+a_{1t}, d_{t+1})$ and $V_{t+1}(CS_t+a_{2t}, d_{t+1})$.
This implies
$CS_t \ge L - a_{1t}$.
$V_{t+1}(CS_t+a_{2t}, d_{t+1}) - V_{t+1}(CS_t+a_{1t}, d_{t+1}) = (c_2+c_3)(a_{2t}-a_{1t}) + c_4[\min(d_{t+1}, CS_t+a_{2t}) - \min(d_{t+1}, CS_t+a_{1t})] + c_5[(d_{t+1}-CS_t-a_{2t})^+ - (d_{t+1}-CS_t-a_{1t})^+]$.
$(c_2+c_3)(a_{2t}-a_{1t})$ does not change with $d_{t+1}$. Lemma 3 shows that $c_4[\min(d_{t+1}, CS_t+a_{2t}) - \min(d_{t+1}, CS_t+a_{1t})] + c_5[(d_{t+1}-CS_t-a_{2t})^+ - (d_{t+1}-CS_t-a_{1t})^+]$ is nonincreasing in $d_{t+1}$. Hence, in region V, $V_{t+1}(CS_t+a_t, d_{t+1})$ is subadditive in $d_{t+1} \times a_t$.
This completes the proof for Possibility 1, when $K - a_{1t} \le L - a_{2t}$.
Now we move on to Possibility 2, when $K - a_{1t} > L - a_{2t}$. The proof process is the same as in Possibility 1, which we just addressed. Again, the steps parallel the proof of Proposition 1.
Possibility 2:
When $K - a_{1t} > L - a_{2t}$.
Figure A6. Proposition 2, Possibility 2—$CS_t$ regions.
As shown in Figure A6 for Possibility 2, the resulting regions are considered:
Region I:
$CS_{t+1} \le K$ in both functions $V_{t+1}(CS_t+a_{2t}, d_{t+1})$ and $V_{t+1}(CS_t+a_{1t}, d_{t+1})$.
This implies
$\max(0, -a_{1t}) \le CS_t \le K - a_{2t}$.
This is the same as region I of Possibility 1.
Hence, in region I, $V_{t+1}(CS_t+a_t, d_{t+1})$ is subadditive in $d_{t+1} \times a_t$.
Region II:
$CS_{t+1} \le K$ in the function $V_{t+1}(CS_t+a_{1t}, d_{t+1})$, and $K \le CS_{t+1} \le L$ in the function $V_{t+1}(CS_t+a_{2t}, d_{t+1})$.
This implies
$K - a_{2t} \le CS_t \le L - a_{2t}$.
The region is given the same consideration as region II of Possibility 1.
Hence, in region II, $V_{t+1}(CS_t+a_t, d_{t+1})$ is subadditive in $d_{t+1} \times a_t$.
Region III:
$CS_{t+1} \le K$ in the function $V_{t+1}(CS_t+a_{1t}, d_{t+1})$, and $CS_{t+1} \ge L$ in the function $V_{t+1}(CS_t+a_{2t}, d_{t+1})$.
This implies
$L - a_{2t} \le CS_t \le K - a_{1t}$.
$V_{t+1}(CS_t+a_{2t}, d_{t+1}) - V_{t+1}(CS_t+a_{1t}, d_{t+1}) = -c_1(K-CS_t-a_{1t}) + c_2(CS_t+a_{2t}-L) + c_3(a_{2t}-a_{1t}) + c_4[\min(d_{t+1}, CS_t+a_{2t}) - \min(d_{t+1}, CS_t+a_{1t})] + c_5[(d_{t+1}-CS_t-a_{2t})^+ - (d_{t+1}-CS_t-a_{1t})^+] + \sum_{d_{t+2}} P(d_{t+2}|d_{t+1})[V_{t+2}(L, d_{t+2}) - V_{t+2}(K, d_{t+2})]$. (A30)
We study Equation (A30) in parts:
  • $-c_1(K-a_{1t}) + c_2(a_{2t}-L) + c_3(a_{2t}-a_{1t}) + (c_1+c_2)CS_t$ is a fixed value in this region (for a given $CS_t$).
  • Lemma 3 shows that $c_4[\min(d_{t+1}, CS_t+a_{2t}) - \min(d_{t+1}, CS_t+a_{1t})] + c_5[(d_{t+1}-CS_t-a_{2t})^+ - (d_{t+1}-CS_t-a_{1t})^+]$ is nonincreasing in $d_{t+1}$.
  • $\sum_{d_{t+2}} P(d_{t+2}|d_{t+1})[V_{t+2}(L, d_{t+2}) - V_{t+2}(K, d_{t+2})]$ is nonincreasing in $d_{t+1}$ because $P(d_{t+2}|d_{t+1})$ exhibits first-order stochastic dominance and $V_{t+2}(CS_{t+1}+a_{t+1}, d_{t+2})$ is subadditive in $d_{t+2} \times a_{t+1}$.
Hence, in region III, $V_{t+1}(CS_t+a_t, d_{t+1})$ is subadditive in $d_{t+1} \times a_t$.
Region IV:
$K \le CS_{t+1} \le L$ in the function $V_{t+1}(CS_t+a_{1t}, d_{t+1})$, and $CS_{t+1} \ge L$ in the function $V_{t+1}(CS_t+a_{2t}, d_{t+1})$.
This implies
$K - a_{1t} \le CS_t \le L - a_{1t}$.
The region is given the same consideration as region IV of Possibility 1.
Hence, in region IV, $V_{t+1}(CS_t+a_t, d_{t+1})$ is subadditive in $d_{t+1} \times a_t$.
Region V:
This is the same as region V of Possibility 1.
Hence, in region V, $V_{t+1}(CS_t+a_t, d_{t+1})$ is subadditive in $d_{t+1} \times a_t$. This completes the proof of Possibility 2.
We have now shown that, under first-order stochastic dominance of the demand transition matrix, if $V_{t+1}(CS_{t+1}, d_{t+1}, a_{t+1})$ is subadditive in $d_{t+1} \times a_{t+1}$ $\forall CS_{t+1}$ and $V_{t+2}(CS_{t+1}+a_{t+1}, d_{t+2})$ is subadditive in $d_{t+2} \times a_{t+1}$ $\forall CS_{t+1}$, then $V_{t+1}(CS_t+a_t, d_{t+1})$ is subadditive in $d_{t+1} \times a_t$, showing that $V_t(CS_t, d_t, a_t)$ is subadditive in $d_t \times a_t$.
This completes the proof of Proposition 2. □
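Proposition 2 also admits a small numerical companion check. Unlike the Proposition 1 sketch, the demand transition matrix here is explicitly chosen (an assumed example, not the paper's data) so that its rows are ordered by first-order stochastic dominance, which the code verifies before testing subadditivity of the capacity differences in the demand state:

```python
import numpy as np

# Illustrative parameters (assumptions), consistent with the conditions in the proofs.
c1, c2, c3, c4, c5 = 5.0, 2.0, 1.0, 2.0, 4.0
c6 = -1.5                           # terminal salvage coefficient (sign assumed)
MC, T = 6, 5
P = np.array([[0.6, 0.3, 0.1, 0.0],
              [0.3, 0.4, 0.2, 0.1],
              [0.1, 0.3, 0.4, 0.2],
              [0.0, 0.2, 0.3, 0.5]])      # rows shift mass upward as d grows
D = P.shape[0]

# First-order stochastic dominance: survival functions nondecreasing in the row index.
survival = 1.0 - np.cumsum(P, axis=1)
fosd = bool(np.all(np.diff(survival, axis=0) >= -1e-12))

V = np.zeros((T + 1, MC + 1, D))
V[T] = c6 * np.arange(MC + 1)[:, None]
for t in range(T - 1, -1, -1):
    for cs in range(MC + 1):
        for d in range(D):
            hold = c3 * cs + c4 * min(d, cs) + c5 * max(d - cs, 0)
            V[t, cs, d] = min(
                c1 * max(a, 0) + c2 * max(-a, 0) + hold + P[d] @ V[t + 1, cs + a]
                for a in range(-cs, MC - cs + 1))

# Subadditivity of V_{t+1}(CS_t + a_t, d_{t+1}) in d_{t+1} x a_t: the capacity
# differences V(y+1, d) - V(y, d) should be nonincreasing in the demand state d.
cap_diffs = np.diff(V, axis=1)
is_subadditive = bool(np.all(np.diff(cap_diffs, axis=2) <= 1e-9))
```

On this instance both checks pass; dropping the FOSD ordering of the rows removes the guarantee that the subadditivity check holds.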

References

  1. AlDurgam, M.M.; Tuffaha, F.M.; Abdel-Aal, M.A.M.; Almoghathawi, Y.; Saleh, H.H.; Saleh, Z. A Scientometric Analysis of the Capacity Expansion and Planning Research. IEEE Trans. Eng. Manag. 2024, 71, 6382–6405. [Google Scholar] [CrossRef]
  2. Fu, C.; Suo, R.; Li, L.; Guo, M.; Liu, J.; Xu, C. A Capacity Expansion Model of Hydrogen Energy Storage for Urban-Scale Power Systems: A Case Study in Shanghai. Energies 2025, 18, 5183. [Google Scholar] [CrossRef]
  3. Hole, J.; Philpott, A.B.; Dowson, O. Capacity Planning of Renewable Energy Systems Using Stochastic Dual Dynamic Programming. Eur. J. Oper. Res. 2025, 322, 573–588. [Google Scholar] [CrossRef]
  4. AlDurgam, M.M. An Integrated Inventory and Workforce Planning Markov Decision Process Model with a Variable Production Rate. IFAC-PapersOnLine 2019, 52, 2792–2797. [Google Scholar] [CrossRef]
  5. Conejo, A.J.; Hall, N.G.; Long, D.Z.; Zhang, R. Robust Capacity Planning for Project Management. Inf. J. Comput. 2021, ijoc.2020.1033. [Google Scholar] [CrossRef]
  6. Kováts, P.; Skapinyecz, R. A Combined Capacity Planning and Simulation Approach for the Optimization of AGV Systems in Complex Production Logistics Environments. Logistics 2024, 8, 121. [Google Scholar] [CrossRef]
  7. Teerasoponpong, S.; Sopadang, A. A Simulation-Optimization Approach for Adaptive Manufacturing Capacity Planning in Small and Medium-Sized Enterprises. Expert Syst. Appl. 2021, 168, 114451. [Google Scholar] [CrossRef]
  8. Chien, C.-F.; Wu, C.-H.; Chiang, Y.-S. Coordinated Capacity Migration and Expansion Planning for Semiconductor Manufacturing under Demand Uncertainties. Int. J. Prod. Econ. 2012, 135, 860–869. [Google Scholar] [CrossRef]
  9. Aldurgam, M.M. Dynamic Maintenance, Production and Inspection Policies, for a Single-Stage, Multi-State Production System. IEEE Access 2020, 8, 105645–105658. [Google Scholar] [CrossRef]
  10. Puterman, M.L. Markov Decision Processes: Discrete Stochastic Dynamic Programming, Wiley Series in Probability and Statistics; 1st ed.; Wiley: Hoboken, NJ, USA, 1994; ISBN 978-0-471-61977-2. [Google Scholar]
  11. Wu, C.-H.; Chuang, Y.-T. An Efficient Algorithm for Stochastic Capacity Portfolio Planning Problems. J. Intell. Manuf. 2012, 23, 2161–2170. [Google Scholar] [CrossRef]
  12. Krishnamurthy, V. Interval Dominance Based Structural Results for Markov Decision Process. Automatica 2023, 153, 111024. [Google Scholar] [CrossRef]
  13. Lee, S.J.; Gong, X.; Garcia, G.-G. Modified Monotone Policy Iteration for Interpretable Policies in Markov Decision Processes and the Impact of State Ordering Rules. Ann. Oper. Res. 2025, 347, 783–841. [Google Scholar] [CrossRef]
  14. Martínez-Costa, C.; Mas-Machuca, M.; Benedito, E.; Corominas, A. A Review of Mathematical Programming Models for Strategic Capacity Planning in Manufacturing. Int. J. Prod. Econ. 2014, 153, 66–85. [Google Scholar] [CrossRef]
  15. Wu, C.-H.; Chuang, Y.-T. An Innovative Approach for Strategic Capacity Portfolio Planning under Uncertainties. Eur. J. Oper. Res. 2010, 207, 1002–1013. [Google Scholar] [CrossRef]
  16. Lin, J.T.; Chen, T.-L.; Chu, H.-C. A Stochastic Dynamic Programming Approach for Multi-Site Capacity Planning in TFT-LCD Manufacturing under Demand Uncertainty. Int. J. Prod. Econ. 2014, 148, 21–36. [Google Scholar] [CrossRef]
  17. Mishra, B.K.; Prasad, A.; Srinivasan, D.; ElHafsi, M. Pricing and Capacity Planning for Product-Line Expansion and Reduction. Int. J. Prod. Res. 2017, 55, 5502–5519. [Google Scholar] [CrossRef]
  18. Serrato, M.A.; Ryan, S.M.; Gaytán, J. A Markov Decision Model to Evaluate Outsourcing in Reverse Logistics. Int. J. Prod. Res. 2007, 45, 4289–4315. [Google Scholar] [CrossRef]
  19. Abduljaleel, J.; AlDurgam, M. Structured Optimal Policy for a Capacity Expansion Model Using Markov Decision Processes. In Proceedings of the 2024 IEEE International Conference on Technology Management, Operations and Decisions (ICTMOD), Sharjah, United Arab Emirates, 4–6 November 2024; IEEE: Sharjah, United Arab Emirates, 2024; pp. 1–7. [Google Scholar]
  20. Blancas-Rivera, R.; Cruz-Suárez, H.; Portillo-Ramírez, G.; López-Ríos, R. (s, S) Inventory Policies for Stochastic Controlled System of Lindley-Type with Lost-Sales. MATH 2023, 8, 19546–19565. [Google Scholar] [CrossRef]
  21. Van Jaarsveld, W.; Arts, J. Projected Inventory-Level Policies for Lost Sales Inventory Systems: Asymptotic Optimality in Two Regimes. Oper. Res. 2024, 72, 1790–1805. [Google Scholar] [CrossRef]
  22. Yuan, S.; Lyu, J.; Xie, J.; Zhou, Y. Asymptotic Optimality of Base-Stock Policies for Lost-Sales Inventory Systems with Stochastic Lead Times. Oper. Res. Lett. 2024, 57, 107196. [Google Scholar] [CrossRef]
  23. Beyer, D.; Sethi, S.P.; Taksar, M. Inventory Models with Markovian Demands and Cost Functions of Polynomial Growth. J. Optim. Theory Appl. 1998, 98, 281–323. [Google Scholar] [CrossRef]