Power-Based Dynamic Programming for Cost-Optimal Battery Scheduling in Grid-Connected PV Microgrids Considering Time-of-Use Tariffs and Battery Degradation

Omar, Moien A.

doi:10.3390/app16115693

by

Moien A. Omar

Electrical Engineering Department, Faculty of Engineering, An-Najah National University, Nablus P400, Palestine

Appl. Sci. 2026, 16(11), 5693; https://doi.org/10.3390/app16115693 (registering DOI)

Submission received: 1 April 2026 / Revised: 15 May 2026 / Accepted: 4 June 2026 / Published: 5 June 2026

(This article belongs to the Special Issue Challenges and Opportunities of Microgrids)

Abstract

This paper presents a power-based dynamic programming (DP) method for day-ahead battery scheduling in a grid-connected photovoltaic (PV) microgrid under time-of-use (TOU) tariffs. The proposed formulation optimizes battery power directly, rather than SOC setpoints, so the dispatch is easier to apply in practical inverter control and remains computationally tractable over a 48 h horizon. The model includes battery degradation through a linear wear-cost term based on a 200 USD/kWh replacement cost, while also enforcing SOC and charging/discharging power limits. The case study uses a 250 kWh battery and evaluates two power limits, 0.1C and 0.2C, together with two degradation cases, 200 and 400 USD/kWh. The simulation considers two different operating days to test the controller under unequal renewable and demand conditions. Day 1 has stronger PV generation and lower load demand, whereas Day 2 has lower PV output and higher demand. Under the baseline 0.1C limit, DP reduces the net operating cost to 97.47 USD, compared with 122.95 USD for the TOU-aware rule-based benchmark. When the power limit increases to 0.2C, the net operating cost falls further to 78.35 USD because export revenue rises substantially. When the battery replacement cost doubles from 200 USD/kWh to 400 USD/kWh, the optimizer reduces cycling and the net operating cost increases to 129.21 USD. Overall, the results show that power-based DP provides a practical and transparent framework for balancing tariff arbitrage and battery preservation in grid-connected microgrids.

Keywords:

battery energy management; PV microgrid; dynamic programming; time-of-use tariff; battery wear cost; power discretization; grid-connected systems; net operating cost

1. Introduction

The increasing penetration of renewable energy sources (RESs), particularly photovoltaic (PV) systems, is transforming modern power systems toward a cleaner and more sustainable operation [1,2]. Nevertheless, the intermittent nature of solar generation introduces significant operational challenges related to power balance, grid stability, and energy management [3]. Meanwhile, battery energy storage systems (BESSs) have emerged as a key enabling technology for mitigating renewable intermittency, improving self-consumption, supporting peak shaving, and enhancing grid flexibility [4,5]. In grid-connected microgrids, battery storage systems help balance energy supply and demand while improving the reliability and economic performance of distributed energy resources (DERs) [6].

Time-of-use (TOU) electricity tariffs and feed-in tariff (FIT) schemes have further increased interest in PV-battery systems by encouraging consumers to shift energy consumption and maximize economic benefits [7,8]. Under TOU pricing, electricity prices vary throughout the day according to demand conditions, thereby creating opportunities for energy arbitrage using battery storage [9]. Consequently, effective battery scheduling strategies have become increasingly important for reducing operating costs and improving renewable energy utilization. In addition to economic objectives, battery scheduling must also consider operational constraints and degradation effects because excessive cycling and deep charge/discharge operations may significantly reduce battery lifetime [10,11].

Several energy management strategies (EMSs) have been proposed for PV-battery systems. Rule-based methods, including maximum self-consumption (MSC) and TOU-aware scheduling, are widely used because of their simplicity and ease of implementation [12,13]. Although these approaches are computationally efficient, they generally rely on predefined heuristics and therefore may not achieve globally optimal scheduling decisions under varying operating conditions [14]. On the other hand, advanced optimization techniques such as mixed-integer linear programming (MILP), model predictive control (MPC), genetic algorithms (GAs), and dynamic programming (DP) have been extensively investigated for optimal battery scheduling in microgrids [15,16,17,18].

Among these methods, dynamic programming has received considerable attention because of its suitability for multistage and time-coupled optimization problems [19,20]. DP recursively evaluates future operating costs while considering battery constraints, electricity tariffs, and renewable generation variability. Therefore, DP can effectively determine optimal charging and discharging schedules over finite operating horizons [21]. Previous studies have demonstrated the effectiveness of DP-based energy management strategies for reducing operating costs, improving self-consumption, and extending battery lifetime in grid-connected PV systems [22,23,24,25]. In [22], DP was employed to compare different operational strategies for a PV-battery office building, demonstrating improved economic performance compared with MSC and TOU-based methods. Similarly, predictive DP-based scheduling frameworks were proposed in [23] and [24] for minimizing electricity cost while considering battery degradation effects. Furthermore, stochastic DP approaches have been investigated to improve battery lifetime and operational efficiency under uncertain operating conditions [25].

Despite the advantages of DP-based scheduling, many existing studies discretize battery state of charge (SOC) as the primary optimization variable [26,27]. Although SOC-based formulations provide effective constraint handling, practical inverter implementation typically requires an additional conversion stage from SOC trajectories to battery power references. Meanwhile, many existing studies primarily focus on economic optimization without explicitly integrating battery wear cost into the scheduling process [28]. Furthermore, some advanced optimization methods such as MPC and MILP may become computationally intensive when applied to long scheduling horizons or high-resolution operational datasets [29,30].

To better position the proposed framework relative to existing approaches, Table 1 summarizes the main characteristics of common microgrid energy management strategies reported in the literature.

Among the existing optimization approaches, SOC-based DP formulations are widely used because of their ability to handle multistage optimization problems and operational constraints. However, practical inverter implementation generally requires an additional conversion stage from SOC trajectories to battery power references. In contrast, the proposed formulation directly discretizes battery power commands within the DP framework, thereby simplifying real-time implementation and avoiding additional SOC-to-power mapping layers.

Despite the extensive research on PV-battery energy management systems, there remains a need for practical scheduling approaches that simultaneously consider TOU pricing, battery degradation cost, operational constraints, and inverter-oriented implementation. Therefore, this paper proposes a power-step dynamic programming framework for optimal battery scheduling in grid-connected PV microgrids under TOU electricity tariffs. The proposed approach directly discretizes battery power rather than SOC states, thereby improving compatibility with practical inverter control implementation. Meanwhile, battery wear cost is incorporated into the optimization process to balance short-term economic benefits and long-term battery utilization. The proposed framework is evaluated under varying PV and load conditions over a representative 48 h scheduling horizon to investigate energy flow management, economic performance, battery utilization, and grid interaction behavior.

2. Methodology

The methodology develops a PV-battery scheduling model with a single cost objective that minimizes operating cost while accounting for battery degradation. The system model captures PV generation, load demand, battery dynamics, and grid exchange. The optimization then uses hourly power references over a 48 h horizon, which covers two full daily tariff cycles. This horizon allows the controller to respond to both within-day fluctuations and day-to-day changes in PV and load profiles. The case study also examines how different power limits and battery wear assumptions affect the solution, so the analysis can show both economic performance and operational sensitivity. The workflow is summarized in Figure 1.

2.1. Problem Formulation

The objective is to determine the optimal active power flow for charging and discharging the battery in a PV-battery microgrid. The optimization uses the load profile, PV output, and initial battery SOC to determine the battery power reference at each hour. In addition, the model accounts for the wear cost created by battery cycling, so the selected dispatch is not based only on immediate tariff savings. Instead, it reflects the combined effect of energy balance, TOU pricing, and battery preservation.

2.2. System Model Equations

The PV-battery microgrid system is represented by a set of equations that describe the relationship among PV generation, load demand, battery operation, and grid exchange. These equations provide the foundation for the DP scheduling model and ensure that the dispatch remains physically feasible.

2.2.1. Power Balance Equation

The power balance within the microgrid is expressed in Equation (1), where PV generation, load demand, battery power, and grid power must remain consistent at every time step.

P_{grid}(t) = P_{load}(t) - P_{PV}(t) + P_{b}(t)

(1)

where

P_load(t): Load demand at time t [kW],
P_PV(t): PV generation at time t [kW],
P_b(t): Battery power at time t [kW]; positive for charging, negative for discharging,
P_grid(t): Power exchanged with the grid at time t [kW]; positive for import, negative for export.

2.2.2. Grid Cost

The cost of grid interaction depends on both the amount of imported or exported energy and the TOU pricing schedule. The grid cost at time t, C_grid(t), is calculated using Equation (2). Because P_grid(t) is negative during export, export revenue enters as a negative cost.

C_{grid}(t) = \begin{cases}
P_{grid}(t) \cdot \Delta t \cdot \lambda_{on/imp}, & \text{if } P_{grid}(t) > 0,\ t \in T_{on} \\
P_{grid}(t) \cdot \Delta t \cdot \lambda_{off/imp}, & \text{if } P_{grid}(t) > 0,\ t \in T_{off} \\
P_{grid}(t) \cdot \Delta t \cdot \lambda_{on/exp}, & \text{if } P_{grid}(t) < 0,\ t \in T_{on} \\
P_{grid}(t) \cdot \Delta t \cdot \lambda_{off/exp}, & \text{if } P_{grid}(t) < 0,\ t \in T_{off}
\end{cases}

(2)

where

P_grid(t): Grid power at time t [kW]; positive for import, negative for export;
Δt: Time step duration [h] (set to 1 h in this study);
λ_on/imp = 0.45 USD/kWh: Cost of energy imported during on-peak periods;
λ_off/imp = 0.15 USD/kWh: Cost of energy imported during off-peak periods;
λ_on/exp = 0.40 USD/kWh: Revenue from energy exported during on-peak periods;
λ_off/exp = 0.10 USD/kWh: Revenue from energy exported during off-peak periods;
T_on: Set of on-peak time intervals (7–10 AM, 7–10 PM);
T_off: Set of off-peak time intervals (all other hours).
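
As a rough illustration of Equations (1) and (2), the following Python sketch evaluates the grid power and the corresponding TOU cost for a single time step. The function names, the dictionary layout, and the hour-labeling convention (hour labels 7 to 9 and 19 to 21 of each day treated as on-peak) are assumptions made for this example and are not specified in the paper.

```python
# Tariff values come from the definitions of Equation (2).
# Names and the hour-labeling convention are illustrative assumptions.
IMPORT_PRICE = {"on": 0.45, "off": 0.15}  # USD/kWh
EXPORT_PRICE = {"on": 0.40, "off": 0.10}  # USD/kWh


def is_on_peak(hour_of_day: int) -> bool:
    """Assumed convention: hour labels 7-9 and 19-21 are on-peak."""
    return 7 <= hour_of_day < 10 or 19 <= hour_of_day < 22


def grid_power(p_load: float, p_pv: float, p_b: float) -> float:
    """Equation (1): positive result is import, negative is export [kW]."""
    return p_load - p_pv + p_b


def grid_cost(p_grid: float, hour_of_day: int, dt: float = 1.0) -> float:
    """Equation (2): cost in USD; a negative value is export revenue."""
    period = "on" if is_on_peak(hour_of_day) else "off"
    price = IMPORT_PRICE[period] if p_grid > 0 else EXPORT_PRICE[period]
    return p_grid * dt * price
```

For example, a 30 kW load with no PV output and 23.9 kW of battery charging at an off-peak hour gives `grid_power(30.0, 0.0, 23.9)` of 53.9 kW, and `grid_cost(53.9, 6)` evaluates to roughly 8.09 USD for that hour.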

2.2.3. Battery Wear Cost

Battery degradation is represented as a wear cost per kilowatt-hour of energy throughput. The wear cost at each time step is given by Equation (3), and the per-kWh wear coefficient is derived from the battery replacement cost and the lifetime throughput, as shown in Equation (4). This approach provides a practical way to include aging effects in day-ahead scheduling without introducing excessive modeling complexity.

C_{wear}(t) = c_{wear} \cdot |P_{b}(t)| \cdot \Delta t

(3)

c_{wear} = \frac{C_{rep}}{2 \times N_{cycles} \times DOD_{ref}}

(4)

where

C_wear(t): Battery wear cost at time t [USD];
c_wear: Wear cost per kWh throughput [USD/kWh];
P_b(t): Battery power reference [kW];
Δt: Time step duration [h];
C_rep: Battery replacement cost [USD/kWh];
N_cycles: Rated cycle life at reference DOD [cycles];
DOD_ref: Reference depth of discharge at which N_cycles was determined [dimensionless].

The factor of 2 accounts for round-trip energy, meaning that one complete cycle includes both charging and discharging energy.

Note that DOD_ref = 0.5 is the manufacturer-specified reference depth of discharge used to define the rated cycle life. It is separate from the operational SOC limits used during grid-connected operation. For C_rep = 200 USD/kWh, the wear cost is 0.05 USD/kWh; when C_rep = 400 USD/kWh, it increases to 0.10 USD/kWh.
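
The arithmetic behind these two wear coefficients can be checked with a short Python sketch of Equations (3) and (4). The cycle life of 4000 cycles is the value given for the case study in Section 3.3; the function names are illustrative assumptions.

```python
def wear_cost_per_kwh(c_rep: float, n_cycles: float, dod_ref: float) -> float:
    """Equation (4): wear coefficient in USD per kWh of throughput."""
    return c_rep / (2.0 * n_cycles * dod_ref)


def step_wear_cost(p_b: float, c_wear: float, dt: float = 1.0) -> float:
    """Equation (3): wear cost in USD for one time step at battery power p_b [kW]."""
    return c_wear * abs(p_b) * dt


# Case-study values: 4000 cycles at DOD_ref = 0.5.
c_wear_200 = wear_cost_per_kwh(200.0, 4000, 0.5)  # about 0.05 USD/kWh
c_wear_400 = wear_cost_per_kwh(400.0, 4000, 0.5)  # about 0.10 USD/kWh
```

Under the 200 USD/kWh case, one hour of operation at 25 kW in either direction therefore carries a wear cost of about 1.25 USD.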

This linear throughput-based wear model is a practical first-order approximation for day-ahead scheduling. Although more detailed semi-empirical or rainflow-based models can capture nonlinear aging effects such as C-rate dependence, temperature influence, and SOC-window stress, they require extensive battery-specific data and add significant computational burden. Therefore, the results should be read as an economic trade-off between arbitrage benefit and battery aging under the assumed linear degradation law.

2.3. Optimization Framework

This study uses dynamic programming (DP) to optimize battery charging and discharging in a PV-battery microgrid over a 48 h horizon. The algorithm evaluates the full scheduling window while considering load profiles, PV generation, TOU pricing, and the initial battery SOC. Accordingly, it can compare the present value of each power decision with its future impact, which is essential for cost-aware battery scheduling.

2.3.1. Objective Function

The optimization is based on an economic objective function that minimizes the total operating cost, including both grid interaction cost and battery wear cost. The objective function is given in Equation (5).

\min \sum_{t=1}^{T} \left[ C_{grid}(t) + C_{wear}(t) \right]

(5)

where

C_grid(t): Grid interaction cost at time t [USD];
C_wear(t): Battery wear cost at time t [USD];
T: Total scheduling horizon [h] (48 h in this study).

Because both the grid interaction cost and the battery wear cost are expressed in USD, the objective can be minimized directly without additional weighting coefficients. This keeps the formulation dimensionally consistent and avoids the need for normalization or parameter tuning. As a result, the optimizer balances short-term tariff arbitrage against long-term degradation cost in a transparent way.

2.3.2. Constraints

To keep the system within realistic and safe operating limits, the optimization framework applies the following constraints.

SOC limits restrict the battery to its allowable operating range, as given in Equation (6).

SOC_{min} \leq SOC(t) \leq SOC_{max}

(6)

This prevents overcharging and over-discharging, both of which can reduce efficiency and accelerate degradation.

Maximum charging and discharging power limits ensure that the battery does not exceed its rated capability, as shown in Equation (7).

-P_{bat,max} \leq P_{b}(t) \leq P_{bat,max}

(7)

Here, P_bat,max denotes the maximum allowable charging or discharging power of the battery; the same limit is written as P_max in Equation (12).

2.4. Power-Step Discretization

Battery power decisions are discretized into practical power steps, as described in Equation (8).

P_{b}(t) \in \{-P_{bat,max}, \dots, 0, \dots, P_{bat,max}\}

(8)

Although battery power and SOC are coupled through the state transition equation, the proposed formulation treats SOC as the recursive state variable and battery power as the discretized decision variable. This choice offers a practical advantage over conventional SOC-discretized DP methods because grid-tied inverters are controlled through power commands, not SOC setpoints. Direct power optimization avoids the post-processing step required to convert SOC trajectories into power commands, and it reduces interpolation error, limit violations, and control-layer delay. By using 0.1 kW increments, the model remains aligned with inverter command resolution while ensuring that every decision satisfies the power constraint in Equation (7).
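
A minimal sketch of the decision set in Equation (8) is given below, assuming a symmetric grid of power levels around zero. The function name and the rounding used to suppress floating-point noise are illustrative choices, not part of the paper.

```python
def power_levels(p_bat_max: float, step: float = 0.1) -> list:
    """Return the discrete battery power commands in kW, from -p_bat_max to +p_bat_max."""
    n = int(round(p_bat_max / step))  # number of steps on each side of zero
    # Rounding removes floating-point noise such as 0.30000000000000004.
    return [round(k * step, 10) for k in range(-n, n + 1)]


levels_01c = power_levels(25.0)  # 0.1C case for the 250 kWh battery
levels_02c = power_levels(50.0)  # 0.2C case
```

With 0.1 kW increments, the 0.1C case yields 501 candidate commands per hour and the 0.2C case yields 1001, which indicates how the decision space grows with the converter rating.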

2.5. Dynamic Programming Algorithm

The dynamic programming (DP) algorithm determines the optimal battery power reference at each time step by recursively evaluating the cost-to-go function. The procedure is outlined below.

Initialization

The terminal cost at the end of the optimization horizon (t = T) is defined as follows:

J(T, P_{b}) = 0 \quad \forall P_{b} \in U_{T}

(9)

where U_T denotes the set of feasible battery power decisions at the terminal step. This boundary condition means that no additional operating cost is incurred beyond the end of the horizon.

For each earlier time step, the cost-to-go is evaluated backward in time using the recursion in Equation (10).

J(t, P_{b}) = \min \left[ C_{grid}(t) + C_{wear}(t) + J(t+1, P_{b}') \right]

(10)

where

J(t, P_b): Minimum cost from time t to the end of the horizon,
P_b': Feasible power decision at the next time step, determined by the state transition dynamics.

States and transitions

For each feasible power decision, the corresponding SOC is computed from the state transition equation in Equation (11) and checked against the allowable SOC bounds.

SOC(t+1) = \begin{cases}
SOC(t) + \frac{P_{b}(t) \times \eta_{c} \times \Delta t}{E_{batt}}, & \text{if } P_{b}(t) \geq 0 \text{ (charging)} \\
SOC(t) + \frac{P_{b}(t) \times \Delta t}{\eta_{d} \times E_{batt}}, & \text{if } P_{b}(t) < 0 \text{ (discharging)}
\end{cases}

(11)

where

SOC(t): State of charge at time t [dimensionless, 0–1];
E_batt: Battery energy capacity [kWh] (250 kWh);
P_b(t): Battery power reference [kW];
η_c = 0.9: Charging efficiency [dimensionless];
η_d = 0.9: Discharging efficiency [dimensionless];
Δt = 1 h: Time step duration.
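
The state transition in Equation (11) and the SOC bound check can be sketched in Python as follows. The default arguments reflect the case-study values (250 kWh, 0.9 efficiencies, 0.5 to 1.0 SOC window); the function names and the small numerical tolerance are assumptions for illustration.

```python
def soc_next(soc: float, p_b: float, e_batt: float = 250.0,
             eta_c: float = 0.9, eta_d: float = 0.9, dt: float = 1.0) -> float:
    """Equation (11): SOC after one step at battery power p_b [kW]."""
    if p_b >= 0:  # charging: only eta_c of the input energy is stored
        return soc + p_b * eta_c * dt / e_batt
    # discharging: more energy leaves the battery than reaches the terminals
    return soc + p_b * dt / (eta_d * e_batt)


def is_feasible(soc: float, p_b: float, soc_min: float = 0.5,
                soc_max: float = 1.0, tol: float = 1e-9) -> bool:
    """Check the resulting SOC against the limits of Equation (6)."""
    return soc_min - tol <= soc_next(soc, p_b) <= soc_max + tol
```

Charging at 25 kW for one hour from 0.5 p.u. raises SOC to 0.59 p.u., whereas discharging at 25 kW from 1.0 p.u. lowers it to about 0.889 p.u., so the two directions are not symmetric in SOC terms.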

Optimal power decision

At each step, the algorithm selects the power decision that minimizes the total future cost and stores the optimal path for later use.

The proposed power-step DP algorithm is intended for day-ahead scheduling with hourly updates. Under the case-study settings of a 48 h horizon, 1 h time steps, and 0.1 kW power discretization, the optimization completes in less than 2 s on a standard laptop with an Intel Core i7 processor and 16 GB of RAM. Computational effort grows linearly with the scheduling horizon and increases with the number of discretized power levels. This predictable scaling supports practical use in microgrid energy management systems.
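
The backward recursion and forward path recovery described in this section can be sketched as follows. This is not the author's implementation: the cost-to-go is tabulated on a uniform SOC grid with linear interpolation between grid points, which is an assumption, since the paper does not state how SOC is tracked between stages. The function names, the 51-point SOC grid, and the on-peak hour labels (7 to 9 and 19 to 21) are likewise hypothetical.

```python
ETA_C, ETA_D = 0.9, 0.9      # efficiencies, Equation (11)
E_BATT = 250.0               # battery capacity [kWh]
SOC_MIN, SOC_MAX = 0.5, 1.0  # grid-connected SOC window
C_WEAR = 0.05                # wear coefficient [USD/kWh], Equation (4)
DT = 1.0                     # time step [h]


def price(hour: int, p_grid: float) -> float:
    """TOU price in USD/kWh; hour labels 7-9 and 19-21 are assumed on-peak."""
    h = hour % 24
    on = 7 <= h < 10 or 19 <= h < 22
    if p_grid > 0:
        return 0.45 if on else 0.15
    return 0.40 if on else 0.10


def step_cost(hour: int, load: float, pv: float, p_b: float) -> float:
    """Grid cost plus wear cost for one step, Equations (1) to (3)."""
    p_grid = load - pv + p_b
    return p_grid * DT * price(hour, p_grid) + C_WEAR * abs(p_b) * DT


def soc_next(soc: float, p_b: float) -> float:
    """State transition, Equation (11)."""
    if p_b >= 0:
        return soc + p_b * ETA_C * DT / E_BATT
    return soc + p_b * DT / (ETA_D * E_BATT)


def interp(values: list, soc: float, n_grid: int) -> float:
    """Linear interpolation of a cost-to-go table on the uniform SOC grid."""
    x = (soc - SOC_MIN) / (SOC_MAX - SOC_MIN) * (n_grid - 1)
    x = min(max(x, 0.0), n_grid - 1.0)
    i = min(int(x), n_grid - 2)
    w = x - i
    return (1.0 - w) * values[i] + w * values[i + 1]


def solve_dp(load, pv, p_max, p_step, soc0, n_grid=51):
    """Return (estimated minimum cost, hourly battery power schedule)."""
    T = len(load)
    n = int(round(p_max / p_step))
    levels = [k * p_step for k in range(-n, n + 1)]  # Equation (8)
    grid = [SOC_MIN + j * (SOC_MAX - SOC_MIN) / (n_grid - 1) for j in range(n_grid)]
    J = [[0.0] * n_grid for _ in range(T + 1)]       # J[T] = 0, Equation (9)

    def best(t, soc):
        """Cheapest feasible (cost, power) pair at step t, Equation (10)."""
        options = []
        for p_b in levels:
            s1 = soc_next(soc, p_b)
            if SOC_MIN - 1e-9 <= s1 <= SOC_MAX + 1e-9:  # Equation (6)
                cost = step_cost(t + 1, load[t], pv[t], p_b)
                options.append((cost + interp(J[t + 1], s1, n_grid), p_b))
        return min(options)  # p_b = 0 is always feasible, so never empty

    for t in range(T - 1, -1, -1):  # backward pass
        J[t] = [best(t, s)[0] for s in grid]

    soc, schedule = soc0, []
    for t in range(T):              # forward pass recovers the power references
        _, p_b = best(t, soc)
        schedule.append(p_b)
        soc = min(max(soc_next(soc, p_b), SOC_MIN), SOC_MAX)
    return interp(J[0], soc0, n_grid), schedule
```

Calling `solve_dp(load, pv, p_max=25.0, p_step=0.1, soc0=0.5)` with two 48-element hourly profiles would mirror the case-study settings; a coarser `p_step` is convenient for quick experiments. Because of the interpolation, the returned cost is an estimate whose accuracy depends on the SOC grid resolution.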

2.6. TOU-Aware Rule-Based Strategy

To provide a meaningful benchmark that also accounts for tariff arbitrage, a TOU-aware rule-based heuristic is implemented. Unlike a conventional PV-following strategy, this baseline charges during off-peak periods and discharges during on-peak periods. The battery power reference P_b(t) is determined as follows:

P_{b}(t) = \begin{cases}
\min\left(P_{max}, \frac{[SOC_{max} - SOC(t)] \times E_{batt}}{\eta_{c} \cdot \Delta t}\right), & \text{if } t \in T_{off} \text{ (off-peak: charge)} \\
-\min\left(P_{max}, \frac{[SOC(t) - SOC_{min}] \times E_{batt} \times \eta_{d}}{\Delta t}\right), & \text{if } t \in T_{on} \text{ (on-peak: discharge)}
\end{cases}

(12)

where

SOC_min, SOC_max: Minimum and maximum allowable state of charge [dimensionless, 0–1];
P_max: Maximum battery power [kW].

The SOC is updated using Equation (11). This strategy removes the main weakness of conventional rule-based control by including TOU awareness, so it serves as a fair benchmark for evaluating the DP approach. The decision flow is shown in Figure 2.
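
Equation (12) translates almost directly into code. The sketch below is a hedged illustration using the case-study defaults (25 kW limit, 250 kWh capacity, 0.5 to 1.0 SOC window, 0.9 efficiencies); the function name and the boolean `on_peak` flag are assumptions for this example.

```python
def rule_based_power(soc: float, on_peak: bool, p_max: float = 25.0,
                     e_batt: float = 250.0, soc_min: float = 0.5,
                     soc_max: float = 1.0, eta_c: float = 0.9,
                     eta_d: float = 0.9, dt: float = 1.0) -> float:
    """Equation (12): battery power reference in kW (positive = charging)."""
    if on_peak:
        # Discharge at full power, limited by the energy above soc_min
        # that can actually be delivered after discharge losses.
        return -min(p_max, (soc - soc_min) * e_batt * eta_d / dt)
    # Charge at full power, limited by the input energy needed to reach
    # soc_max once charging losses are accounted for.
    return min(p_max, (soc_max - soc) * e_batt / (eta_c * dt))
```

At the minimum SOC of 0.5 p.u., the rule charges at the full 25 kW during off-peak hours and commands no discharge during on-peak hours. Near the bounds the SOC term becomes binding: at 0.52 p.u. during an on-peak hour the reference is about −4.5 kW.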

3. Case Study

The case study examines a PV microgrid with battery storage operating primarily in grid-connected mode. The system exchanges power with the grid during normal operation and can move to islanded mode during disturbances. Because grid interruptions are assumed to be infrequent in this study, the battery is mainly used in grid-connected mode to reduce operating cost, increase PV self-consumption, and improve overall system performance.

3.1. Microgrid Description

The microgrid operates in grid-connected mode, where the PV array, battery, and grid jointly supply the load, as shown in Figure 3. The PV system is rated at 50 kW and operates at its maximum power point to maximize generation. Meanwhile, the bidirectional battery inverter operates in power-control mode, allowing both charging and discharging so that energy flow can be coordinated with TOU prices. The battery has a capacity of 250 kWh and an absolute minimum SOC of 20%. During grid-connected operation, the minimum SOC is raised to 50%, so the 30% of capacity between these two limits is held in reserve for critical loads during islanded operation. This reserve constraint is independent of the DOD_ref = 0.5 value used in the wear-cost model.

3.2. Time-of-Use Pricing

The microgrid follows a TOU pricing structure in which electricity prices vary by time of day. The utility defines on-peak periods from 7 to 10 AM and 7 to 10 PM.

During on-peak hours, imported energy costs 0.45 USD/kWh, reflecting the higher demand and utility charge. During off-peak hours, the import cost falls to 0.15 USD/kWh, which encourages flexible consumption. Meanwhile, exported surplus energy is paid at 0.40 USD/kWh during on-peak hours and 0.10 USD/kWh during off-peak hours.

3.3. Battery Wear Cost Calculation

The battery wear cost coefficient is calculated from Equation (4). For the 250 kWh battery, a replacement cost of 200 USD/kWh, and a cycle life of 4000 cycles at DOD_ref = 0.5, the wear cost is 0.05 USD/kWh. If the replacement cost is doubled to 400 USD/kWh, the wear cost becomes 0.10 USD/kWh.

c_{wear} = \frac{200 \times 250}{2 \times 250 \times 4000 \times 0.5} = 0.05 \ \text{USD/kWh}

4. Simulation Results

The simulation compares a TOU-aware rule-based method with the proposed DP approach over a 48 h period that covers two distinct operating days. As shown in Figure 4, the first 24 h represent Day 1, while the second 24 h represent Day 2. The PV profile is intentionally different across the two days: Day 1 reaches a maximum of about 41 kW and produces 280 kWh, while Day 2 drops to very low output and produces 130 kWh. The load profile also changes substantially: Day 1 has a moderate demand pattern with a 15 kW peak and 200 kWh total consumption, whereas Day 2 reaches 40 kW and 620 kWh. Therefore, the full simulation contains 410 kWh of PV generation and 820 kWh of load demand, which creates a surplus condition on Day 1 and a clear deficit on Day 2.

This test design is useful because it evaluates the controllers under both surplus and deficit conditions rather than under a single uniform profile. Although some studies vary only one input at a time, testing PV and load together provides a more realistic assessment of control performance. Meanwhile, keeping the battery constraints fixed ensures that the performance differences mainly reflect the control logic rather than hardware changes. In this study, SOC is limited to between 50% and 100% of the 250 kWh battery capacity, the baseline power limit is 0.1C (25 kW), and an additional 0.2C (50 kW) case is used to study the effect of higher converter capability.

4.1. Results of TOU-Aware Rule-Based Strategy

The TOU-aware rule-based strategy applies predefined heuristic rules to manage battery charging and discharging, and it serves as a practical benchmark for the DP optimizer. The method reacts to the current PV output, load demand, SOC bounds, and fixed TOU windows. Figure 5 shows the resulting power dispatch over the 48 h horizon, where positive battery power indicates charging and negative power indicates discharging.

During the early off-peak hours, such as hours 1 to 5, the battery charges at the maximum allowable rate of 25 kW because both PV output and load demand are low. As solar generation increases on Day 1, excess PV power is sent to the battery after local demand is met. For example, at hour 13, PV output reaches 40.5 kW while the load is 15 kW, so the battery absorbs 17.6 kW and the remaining surplus is exported to the grid. During the evening on-peak window, the strategy commands full discharge at −25 kW to reduce grid imports and take advantage of the higher tariff. When PV output drops to zero at night, the grid supplies the full load, while the battery either charges or remains idle depending on the TOU period.

On Day 2, load demand increases sharply while PV availability remains limited. As a result, the grid supplies a larger share of the energy. At hour 31, for example, the battery discharges at −25 kW to support the 40 kW load, while the grid supplies the remaining 15 kW. This behavior shows that the rule-based controller reacts to the present power balance rather than anticipating future conditions. Although it follows the TOU schedule and keeps SOC within limits, it cannot shift charging decisions in advance, which reduces economic efficiency under low-PV, high-load conditions.

Figure 6 depicts the battery state-of-charge (SOC) trajectory over the 48 h simulation horizon, with shaded regions marking the designated on-peak TOU windows. Initially, the system begins at the minimum allowable SOC of 0.5 p.u. (hour 1). During off-peak intervals, the controller prioritizes battery charging, progressively increasing the SOC toward the upper bound of 1.0 p.u. Conversely, when on-peak periods commence, the strategy triggers discharge to offset grid imports, causing the SOC to decline. Notably, the discharge depth is modulated by real-time PV availability and load magnitude; for example, on Day 1, the SOC stabilizes around 0.67 p.u. during evening peak hours rather than reaching the absolute minimum, as concurrent PV generation partially satisfies local demand. Throughout the simulation, the SOC remains strictly within the prescribed 0.5–1.0 p.u. operational limits, demonstrating consistent constraint adherence.

Although the overall charge and discharge pattern follows TOU price signals, the SOC trajectory is governed mainly by the instantaneous power balance and hardware limits rather than by predictive optimization. The rule-based method responds to the current tariff status but does not anticipate future price changes or renewable variability. Consequently, the battery may reach full charge before the peak period ends or discharge too early when load is high and PV is low. This reactive behavior limits the ability of the strategy to achieve optimal energy arbitrage and provides a clear benchmark for the DP method.

4.2. Results of Dynamic Programming Approach

The DP optimizer computes a cost-minimizing dispatch schedule over the full 48 h horizon while explicitly including TOU tariffs and a battery replacement cost of 200 USD/kWh. As shown in Figure 7, the algorithm keeps the battery within the 0.1C limit, which corresponds to 25 kW for a 250 kWh system. Rather than relying on fixed heuristics, the DP method continuously adjusts battery power so that current energy needs and future economic opportunities are considered together.

During off-peak periods, the grid often supplies more power than the instantaneous load so that the battery can also be charged. For instance, at hour 6 of Day 2, the grid imports 53.9 kW to cover the 30 kW load and sends the remaining 23.9 kW to the battery. Conversely, when on-peak tariffs begin, the algorithm prioritizes battery discharge to reduce expensive imports or support profitable exports. At hour 9 on Day 1, for example, the battery discharges at −22.5 kW while PV generation reaches 28 kW, together supplying the 15 kW load and exporting surplus power to the grid. This coordinated dispatch shows how the DP method uses price differences to buy energy when it is cheap and release it when it is valuable.

The optimizer also accounts for cycle degradation cost, which prevents unnecessary power fluctuations that would reduce long-term economic value. As a result, battery power remains within the ±25 kW limit, while grid interaction shifts between import, export, and self-consumption according to PV availability, load level, and tariff schedule. Unlike the rule-based method, the DP strategy plans ahead and prepares the battery for later peak periods, which lowers total operating cost while still respecting all constraints.

Figure 8 shows the battery SOC trajectory under the DP strategy, with shaded regions marking the on-peak TOU windows. Throughout the 48 h horizon, SOC remains within the prescribed 0.5 to 1.0 p.u. range, and its evolution follows the tariff pattern. During on-peak intervals, the optimizer schedules controlled discharge to offset costly grid imports, which produces a gradual SOC decline. However, the 0.1C limit restricts the discharge rate, so the battery cannot fully eliminate grid dependence during high-demand periods. Conversely, off-peak windows trigger charging from either PV surplus or low-cost grid electricity, but the same power limit slows the recharge process and produces a gradual recovery rather than a rapid one.

Notably, the DP method shows forward-looking behavior across consecutive tariff cycles. After each on-peak discharge, the controller adjusts power flows during the following off-peak period so that enough energy is available for the next high-price window. This inter-peak management keeps the battery ready for future peak periods instead of depleting it too early. The shaded bands in Figure 8 make these TOU-aligned decisions easy to see. Overall, the SOC pattern confirms that the DP strategy combines tariff signals, power limits, and horizon planning to minimize cost while maintaining reliable operation.

4.3. Energy Outputs Comparison

Table 2 summarizes energy flows and battery utilization over the 48 h horizon. The DP method clearly reduces grid dependence relative to the TOU-aware rule-based baseline. Specifically, off-peak imports fall from 724.07 kWh to 551.00 kWh, which is a 24% reduction, because the optimizer avoids unnecessary charging when future PV generation or load conditions make that energy less useful. As a result, total grid imports drop from 794.07 kWh to 624.10 kWh, while on-peak imports remain similar at 70.00 kWh and 73.10 kWh. This shows that the two strategies both limit peak imports, but DP does so with a more selective charging pattern.

Regarding exports, the rule-based method sends more energy to the grid overall, with 192.41 kWh compared with 145.10 kWh for DP. The difference is especially clear during off-peak periods, where the baseline exports 67.41 kWh and DP exports only 24.10 kWh. Nevertheless, this higher export volume does not lead to better economics, because the rule-based controller exports whenever surplus appears, while the DP strategy reserves battery capacity for higher-value periods. Although on-peak exports are slightly higher in the baseline, the DP profile is more cost-effective because it accounts for battery wear.

The DP strategy also uses the battery more efficiently. Cumulative charging is reduced to 361.90 kWh, compared with 491.67 kWh for the rule-based method, and cumulative discharging is 292.90 kWh versus 300.00 kWh. Therefore, total battery losses fall from 79.17 kWh to 65.48 kWh. The key reason is that the DP optimizer weighs every charge and discharge action against the 200 USD/kWh replacement cost, so it avoids marginal cycling that would add wear without creating enough economic value.

4.4. Economic Output Comparison

Table 3 compares the economic performance of the two strategies over the 48 h horizon. The DP optimizer achieves a net operating cost of 97.47 USD, which is 20.7% lower than the TOU-aware rule-based baseline cost of 122.95 USD. This advantage remains even though export income is slightly lower under DP, at 50.81 USD versus 56.74 USD, because the algorithm focuses on reducing import cost and limiting unnecessary cycling. In fact, total import cost drops from 140.11 USD to 115.54 USD, which provides a savings of 24.57 USD through better timing of grid purchases.

Battery wear cost also falls from 39.58 USD to 32.74 USD under the DP strategy, which is a 17.3% reduction in wear-related expense. This improvement comes from the explicit inclusion of the 200 USD/kWh replacement cost in the objective function, which discourages marginal cycling that would accelerate capacity fade. By contrast, the rule-based controller cycles the battery without considering long-term wear, so its higher export income is more than offset by higher import cost and higher battery degradation cost.
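The cost figures in Table 3 can be reproduced from the energy totals in Table 2. The following is a minimal sketch, assuming the TOU rates reported in Section 4.5 (0.45/0.15 USD/kWh on-peak/off-peak import, 0.40/0.10 USD/kWh export) and the 0.05 USD/kWh wear rate quoted in Section 4.6; applying the wear rate to charged plus discharged energy is an assumption that happens to match the tabulated values. The function name is illustrative.

```python
# TOU tariffs in USD/kWh and wear rate for the 200 USD/kWh case, as quoted in the text.
IMPORT_PRICE = {"on": 0.45, "off": 0.15}
EXPORT_PRICE = {"on": 0.40, "off": 0.10}
WEAR_RATE = 0.05  # USD per kWh of battery throughput (assumed: charge + discharge)

def net_operating_cost(imp_on, imp_off, exp_on, exp_off, charged, discharged):
    """Return (import cost, export income, wear cost, net cost) in USD."""
    import_cost = imp_on * IMPORT_PRICE["on"] + imp_off * IMPORT_PRICE["off"]
    export_income = exp_on * EXPORT_PRICE["on"] + exp_off * EXPORT_PRICE["off"]
    wear_cost = WEAR_RATE * (charged + discharged)
    return import_cost, export_income, wear_cost, import_cost - export_income + wear_cost

# Energy totals in kWh from Table 2.
rule_based = net_operating_cost(70.00, 724.07, 125.00, 67.41, 491.67, 300.00)
dp = net_operating_cost(73.10, 551.00, 121.00, 24.10, 361.90, 292.90)

for name, result in (("rule-based", rule_based), ("DP", dp)):
    print(name, [round(x, 2) for x in result])
```

The net values agree with Table 3 to within rounding, which confirms that net operating cost is import cost minus export income plus wear cost.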

The economic advantage of DP comes from its full-horizon planning logic. Rather than reacting only to the current tariff, the algorithm evaluates the full 48 h cost profile and shifts imports to cheaper periods while reserving battery energy for high-value discharge opportunities. Accordingly, a charge or discharge action is scheduled only when it lowers net cost after accounting for conversion losses and degradation. The resulting 25.48 USD reduction in net operating cost shows that predictive optimization can deliver real financial benefit beyond simple tariff-following control.

It is also important to note that the lower import cost under DP comes with lower total imported energy, 624.10 kWh compared with 794.07 kWh for the baseline, but the saving is not spread evenly across tariff periods. On-peak imports are almost unchanged (73.10 kWh versus 70.00 kWh), so nearly all of the saving comes from the 173.07 kWh reduction in off-peak purchases that the baseline uses for low-value charging. More generally, cost minimization and energy minimization are not the same objective: what matters is when energy is bought, not only how much. The 0.2C case in Section 4.5 illustrates this, because total imports rise there while net operating cost falls.

4.5. Dynamic Programming with Increasing Pmax

To study the effect of converter sizing on optimization performance, the maximum charge and discharge power limit is increased to 0.2C, or 50 kW, which doubles the baseline operating envelope. As shown in Figure 9, the dispatch profile becomes more flexible, and the battery responds more strongly to both tariff signals and renewable fluctuations. During the Day 1 morning peak, the optimizer reaches the −50 kW limit, and during off-peak intervals it commands charging at +50 kW so that the battery can move energy more quickly between low-value and high-value periods.

This additional flexibility provides several advantages. First, the higher discharge rate during on-peak periods reduces grid dependence because the battery can cover a larger share of the load without expensive imports. Second, the increased charging capability improves PV self-consumption by absorbing short generation peaks that might otherwise be exported at low off-peak tariffs. Consequently, grid power becomes more sharply time-shifted, with imports concentrated in cheap periods and exports concentrated in valuable ones. The effect is especially clear on Day 2, where evening demand is high and PV output is low.

Although higher power ratings increase instantaneous stress on the battery, the DP method balances those gains against degradation cost and avoids unnecessary cycling. Therefore, the 0.2C case shows that a larger converter can improve economic performance when predictive control is used to place charge and discharge actions only where tariff differences justify them.

Figure 10 shows the battery SOC trajectory under the 0.2C limit, with shaded regions indicating on-peak intervals. Compared with the 0.1C baseline, the wider power envelope produces steeper SOC changes because the battery can move energy faster. During on-peak periods, the algorithm commands stronger discharge and produces a faster SOC decline, which reduces expensive imports more effectively. During off-peak periods, charging is also faster, so the battery can recover before the next peak window.

This stronger SOC response improves temporal arbitrage. By using the 50 kW capability, the DP controller limits exposure to high tariffs while capturing more low-cost grid energy and PV surplus. The steeper discharge slopes in the shaded bands show that the battery supplies a larger share of the instantaneous demand, which reduces peak-period imports. At the same time, the faster off-peak recharge keeps SOC within bounds and maintains readiness for the next peak window.

Overall, the 0.2C case shows that converter sizing can amplify the benefits of predictive optimization. Although the higher rating increases cycling intensity, the DP algorithm limits charge and discharge actions to intervals where the tariff gap is large enough to justify the wear cost. Accordingly, the SOC dynamics improve TOU alignment and system efficiency at the same time.

Table 4 shows how the larger power envelope changes energy flows. Increasing the battery power limit P_max to 0.2C makes the grid interaction profile more strongly time-shifted. Total grid imports rise from 624.10 kWh to 748.40 kWh because the optimizer deliberately buys more off-peak energy, increasing off-peak imports from 551.00 kWh to 740.80 kWh. At the same time, on-peak imports fall sharply from 73.10 kWh to 7.60 kWh, which confirms that the higher discharge capability displaces expensive grid energy during critical tariff windows.

Export behavior also changes. On-peak exports rise from 121.00 kWh to 209.40 kWh, which shows that the system can capture and monetize PV surplus more effectively when prices are high. Off-peak exports remain almost unchanged, moving from 24.10 kWh to 24.00 kWh, while total exports increase from 145.10 kWh to 233.40 kWh. This pattern shows that the controller prefers to store energy during low-value periods and release it when the tariff is more attractive.

Battery throughput increases significantly under the larger power limit. Charging energy rises from 361.9 kWh to 552.2 kWh, and discharging energy rises from 292.9 kWh to 447.2 kWh. As a result, conversion losses increase from 65.48 kWh to 99.94 kWh, which directly reflects the stronger charge–discharge activity. Although this adds wear, the DP optimizer still limits cycling to economically justified intervals, so the 0.2C case trades a moderate increase in losses for a substantial improvement in dispatch flexibility and tariff response.

Table 5 summarizes the economic effect of increasing the battery power limit. The 0.2C case reduces net operating cost to 78.35 USD, which is 19.6% lower than the 0.1C baseline cost of 97.47 USD. The improvement occurs even though total import cost changes only slightly, from 115.54 USD to 114.54 USD, which means the savings come mainly from higher export income rather than lower grid purchases. In particular, export income rises from 50.81 USD to 86.16 USD, a 69.6% increase, because the optimizer can shift more energy into high-value periods.

The higher power limit also increases battery wear cost from 32.74 USD to 49.97 USD, which reflects the larger throughput reported in Table 4. However, the additional 17.23 USD of wear expense is more than offset by the 35.35 USD gain in export income. Therefore, the 0.2C configuration achieves a lower overall cost even after accounting for faster cycling and higher degradation.
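The trade-off between the two power limits can be made explicit by decomposing the change in net operating cost into its three components. The sketch below uses only the Table 5 values; the dictionary keys and the `net_cost` helper are illustrative names.

```python
# Cost components in USD from Table 5 (200 USD/kWh replacement cost).
base_01c = {"import_cost": 115.54, "export_income": 50.81, "wear_cost": 32.74}
high_02c = {"import_cost": 114.54, "export_income": 86.16, "wear_cost": 49.97}

def net_cost(c):
    """Net operating cost = import cost - export income + wear cost."""
    return c["import_cost"] - c["export_income"] + c["wear_cost"]

# Component-wise change when moving from 0.1C to 0.2C.
delta = {k: round(high_02c[k] - base_01c[k], 2) for k in base_01c}
delta_net = round(net_cost(high_02c) - net_cost(base_01c), 2)

print(delta)
print("change in net operating cost:", delta_net)
```

The export gain enters the net cost with a negative sign, so it outweighs the combined effect of the small import saving and the extra wear.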

In this case, the 0.2C scenario accumulates about 1.8 equivalent full cycles over the 48 h horizon, compared with about 1.2 cycles for the 0.1C case. Although this is roughly a 50% increase in cycling intensity, the DP framework still restricts throughput to intervals where the tariff advantage is large enough to justify the wear cost. Accordingly, the 19.6% cost reduction shows that power oversizing can improve economic performance when it is paired with predictive optimization.
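The cycle counts quoted above are consistent with defining an equivalent full cycle as discharged energy divided by the 250 kWh nominal capacity. That definition is an assumption inferred from the tabulated values, not one stated in this section; other conventions (for example, half of total throughput) would give different numbers. A minimal sketch:

```python
CAPACITY_KWH = 250.0  # nominal battery capacity used in the case study

def equivalent_full_cycles(discharged_kwh, capacity_kwh=CAPACITY_KWH):
    """Assumed definition: discharged energy divided by nominal capacity."""
    return discharged_kwh / capacity_kwh

# Discharged energy in kWh from Table 4.
efc = {
    "0.1C": equivalent_full_cycles(292.9),
    "0.2C": equivalent_full_cycles(447.2),
}

for case, cycles in efc.items():
    print(f"{case}: {cycles:.2f} equivalent full cycles over 48 h")
```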

These results depend on system-specific factors such as battery capacity, round-trip efficiency, cycle life, and the tariff structure. In this study, the TOU rates of 0.45/0.15 USD/kWh (on-peak/off-peak) for import and 0.40/0.10 USD/kWh (on-peak/off-peak) for export represent a realistic high-renewable operating context. Nevertheless, the main conclusion remains general: the optimal charge and discharge limit should be selected by considering cost savings, degradation, and reliability together.

4.6. Dynamic Programming with Increasing Battery Cost

Figure 11 shows the dispatch response when the battery replacement cost increases from 200 USD/kWh to 400 USD/kWh. As the degradation penalty becomes larger, the DP optimizer reduces battery activity and produces much smaller charge and discharge swings over the 48 h horizon. Rather than pursuing aggressive arbitrage, the controller favors asset preservation, so grid power assumes a more direct role in balancing the load.

This change also affects renewable integration. Under the lower wear cost, excess PV energy is stored for later discharge or high-tariff export. With the higher degradation penalty, however, the optimizer prefers to serve the load immediately and avoids costly cycling. As a result, grid interaction becomes more direct, and the battery plays a smaller buffering role between PV generation and demand.

Overall, the higher wear cost shifts the trade-off between short-term tariff exploitation and long-term sustainability. By reducing battery utilization, the DP framework sacrifices some energy-shifting benefit in order to limit wear cost. This sensitivity test shows why accurate degradation modeling is important: if cycling cost is underestimated, the dispatch can become too aggressive and the economic result can be misleading.
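One way to see why a higher wear penalty suppresses arbitrage is to compute the margin of a single grid-charged cycle per kWh delivered on discharge. The sketch below is illustrative only: the 90% one-way efficiency is an assumed value that is not stated in this section, the wear rate is assumed to apply to both charged and discharged energy, and the function name `cycle_margin` is hypothetical. The tariffs and the 0.05 and 0.10 USD/kWh wear rates are those quoted in the paper.

```python
def cycle_margin(value_usd, buy_usd=0.15, wear_rate=0.05, eta_one_way=0.9):
    """Margin in USD per kWh delivered on discharge for energy bought off-peak.

    eta_one_way is an ASSUMED charge/discharge efficiency, not a value from the paper.
    """
    charged_kwh = 1.0 / (eta_one_way ** 2)          # grid energy needed per delivered kWh
    energy_cost = buy_usd * charged_kwh              # off-peak purchase cost
    wear_cost = wear_rate * (charged_kwh + 1.0)      # wear on charged plus discharged kWh
    return value_usd - energy_cost - wear_cost

# Value of a discharged kWh: avoided on-peak import (0.45) or on-peak export (0.40).
margins = {
    (wear, use): cycle_margin(value, wear_rate=wear)
    for wear in (0.05, 0.10)
    for use, value in (("offset on-peak import", 0.45), ("on-peak export", 0.40))
}

for (wear, use), margin in margins.items():
    print(f"wear {wear:.2f} USD/kWh, {use}: {margin:+.3f} USD/kWh")
```

Under these assumed numbers, both uses are profitable at 0.05 USD/kWh, whereas at 0.10 USD/kWh offsetting on-peak imports remains slightly profitable and grid-charged on-peak export turns slightly negative. Energy charged from PV surplus has a lower opportunity cost than the off-peak import price, so its margin would be higher than shown here.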

Figure 12 shows the battery SOC trajectory under the 400 USD/kWh degradation case. Compared with the 200 USD/kWh baseline, the SOC profile is much flatter because the optimizer restricts cycling. The battery no longer follows the tariff signal as aggressively, and instead remains within a narrower operating band. This moderation comes directly from the higher wear penalty, which raises the marginal cost of each cycle.

During on-peak intervals, the discharge slope is gentler, so the battery covers only part of the demand while the grid supplies the rest. During off-peak intervals, charging is also weaker and less frequent because the optimizer gives priority to preservation rather than to low-cost energy accumulation. Therefore, the system shifts from a storage-intensive strategy to a more conservative grid-following strategy.

As shown in Table 6, total imported energy falls from 624.1 kWh to 494.2 kWh, mainly because off-peak imports drop from 551.0 kWh to 420.5 kWh. This reduction reflects the optimizer’s decision to avoid aggressive off-peak charging when degradation cost becomes too high. Consequently, the battery has less energy available for later discharge, which limits its ability to offset expensive peak-period imports.

Export energy falls sharply as well. Total exports decrease from 145.1 kWh to 40.4 kWh, and on-peak exports drop from 121.0 kWh to just 16.3 kWh, which is an 86.5% reduction. Off-peak exports remain at 24.1 kWh, but the overall export profile shifts away from revenue generation and toward wear reduction.

Battery throughput also contracts significantly. Charged energy falls from 361.9 kWh to 231.4 kWh, which is a 36.1% reduction, and discharged energy falls from 292.9 kWh to 187.6 kWh. Conversion losses therefore decrease from 65.48 kWh to 41.9 kWh. This lower loss level confirms that reduced cycling saves energy, but it also lowers operational flexibility.

Table 7 summarizes the economic impact of doubling the battery replacement cost. Battery wear cost rises from 32.74 USD to 41.90 USD, even though charged energy falls by 36.1%, because the per-kWh degradation penalty increases from 0.05 to 0.10 USD/kWh. At the same time, total import cost drops from 115.54 USD to 96.24 USD because the optimizer buys less off-peak grid energy for charging: off-peak imports fall from 551.0 kWh to 420.5 kWh, while on-peak imports remain almost unchanged.
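The wear-cost figures follow from multiplying the per-kWh penalty by battery throughput. The sketch below assumes that the penalty scales linearly with replacement cost (0.05 USD/kWh at 200 USD/kWh, as quoted above) and that throughput means charged plus discharged energy; both assumptions match the values in Tables 6 and 7. The function name is illustrative.

```python
def wear_cost(replacement_usd_per_kwh, charged_kwh, discharged_kwh,
              base_replacement=200.0, base_rate=0.05):
    """Linear wear cost: per-kWh rate scaled by replacement cost, times throughput."""
    rate = base_rate * replacement_usd_per_kwh / base_replacement
    return rate * (charged_kwh + discharged_kwh)

# Charged and discharged energy in kWh from Table 6.
wear_200 = wear_cost(200.0, 361.9, 292.9)
wear_400 = wear_cost(400.0, 231.4, 187.6)

print(f"200 USD/kWh: {wear_200:.2f} USD")
print(f"400 USD/kWh: {wear_400:.2f} USD")
print(f"ratio: {wear_400 / wear_200:.2f}")
```

Doubling the rate while throughput falls by about 36% leaves wear cost roughly 28% higher, which is why the expense rises despite the reduced cycling.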

The largest effect is seen in export revenue, which falls from 50.81 USD to 8.93 USD, an 82.4% reduction. This happens because the optimizer stops cycling the battery for arbitrage and therefore gives up high-value export opportunities. Consequently, the net operating cost increases from 97.47 USD to 129.21 USD, which is a 32.6% rise. In this case, the 400 USD/kWh scenario uses only about 0.75 equivalent full cycles over the 48 h horizon, compared with 1.2 cycles in the baseline case, but the lower cycling does not offset the loss of revenue.

5. Conclusions

This paper presents a power-based dynamic programming (DP) framework for optimal battery scheduling in grid-connected photovoltaic microgrids under TOU tariffs. Unlike conventional SOC-discretization approaches, the proposed method directly optimizes battery power references, which makes the dispatch compatible with practical inverter control and keeps the day-ahead problem computationally manageable.

The comparative analysis shows that the DP optimizer achieves a net operating cost of 97.47 USD over the 48 h horizon, which is 20.7% lower than the TOU-aware rule-based baseline of 122.95 USD. This advantage appears even though battery cycling is more conservative, because the algorithm limits throughput to intervals where tariff differentials justify the degradation cost. Specifically, DP reduces off-peak grid imports by 24% while keeping on-peak exports close to the baseline (121.00 kWh versus 125.00 kWh).

Sensitivity analysis shows that the operating result depends strongly on converter capability and degradation cost. When the maximum charge and discharge power increases from 0.1C (25 kW) to 0.2C (50 kW), net operating cost falls by 19.6%, from 97.47 USD to 78.35 USD, because export revenue rises by 69.6% and outweighs the extra wear cost. Conversely, when the battery replacement cost doubles from 200 USD/kWh to 400 USD/kWh, cycling drops and net operating cost increases by 32.6%, from 97.47 USD to 129.21 USD, because the lost arbitrage revenue outweighs the savings from reduced throughput.

From a practical point of view, the power-reference formulation integrates naturally with standard grid-tied inverters and avoids the extra conversion step from SOC to power commands. Moreover, the backward-recursion DP structure has predictable computational complexity, so it can be re-run efficiently when forecasts or tariffs change. Nevertheless, the present study assumes perfect knowledge of PV and load profiles; future work should include stochastic forecasting and receding-horizon control to improve robustness under real-world uncertainty.
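To illustrate the backward-recursion structure, the following is a minimal sketch of a DP in which the decision variable is battery power in discrete steps. It is not the author's implementation: the SOC limits, initial SOC, efficiency, energy-grid resolution, zero terminal cost, and the snapping of the next state to the nearest grid point are all assumptions made for illustration, and the function name is hypothetical. The sign convention (positive power means charging) follows the text of Section 4.5.

```python
import numpy as np

def dp_battery_schedule(pv_kw, load_kw, c_imp, c_exp, cap_kwh=250.0, p_max_kw=25.0,
                        p_step_kw=5.0, soc_min=0.2, soc_max=0.9, soc0=0.5,
                        eta=0.9, wear_usd_per_kwh=0.05, dt_h=1.0, de_kwh=0.5):
    """Backward-recursion DP over battery power steps (all defaults are assumed values)."""
    pv_kw, load_kw = np.asarray(pv_kw, float), np.asarray(load_kw, float)
    T = len(load_kw)
    n_steps = int(round(p_max_kw / p_step_kw))
    actions = p_step_kw * np.arange(-n_steps, n_steps + 1)   # kW, includes 0
    e_min, e_max = soc_min * cap_kwh, soc_max * cap_kwh
    n_states = int(round((e_max - e_min) / de_kwh)) + 1
    energy = e_min + de_kwh * np.arange(n_states)            # stored-energy grid, kWh
    # Stored-energy change per action: charging stores eta*P, discharging draws P/eta.
    delta = np.where(actions >= 0, eta * actions, actions / eta) * dt_h
    e_next = energy[:, None] + delta[None, :]
    feasible = (e_next >= e_min - 1e-9) & (e_next <= e_max + 1e-9)
    # Snap the next state to the nearest grid point (an approximation).
    nxt = np.clip(np.rint((e_next - e_min) / de_kwh).astype(int), 0, n_states - 1)

    value = np.zeros(n_states)                               # zero terminal cost-to-go
    policy = np.zeros((T, n_states), dtype=int)
    for t in range(T - 1, -1, -1):
        grid = load_kw[t] - pv_kw[t] + actions               # kW, positive = import
        stage = (c_imp[t] * np.maximum(grid, 0) - c_exp[t] * np.maximum(-grid, 0)
                 + wear_usd_per_kwh * np.abs(actions)) * dt_h
        total = np.where(feasible, stage[None, :] + value[nxt], np.inf)
        policy[t] = np.argmin(total, axis=1)
        value = total[np.arange(n_states), policy[t]]

    # Forward pass: follow the optimal policy from the initial SOC.
    i = int(round((soc0 * cap_kwh - e_min) / de_kwh))
    cost, schedule = float(value[i]), []
    for t in range(T):
        a = policy[t, i]
        schedule.append(float(actions[a]))
        i = nxt[i, a]
    return schedule, cost

# Toy 6 h example (hypothetical data): constant load, no PV, 3 off-peak then 3 on-peak hours.
load, pv = [20.0] * 6, [0.0] * 6
c_imp = [0.15] * 3 + [0.45] * 3
c_exp = [0.10] * 3 + [0.40] * 3
schedule, cost = dp_battery_schedule(pv, load, c_imp, c_exp)
no_battery_cost = sum(c * l for c, l in zip(c_imp, load))
print("battery power schedule (kW):", schedule)
print(f"DP cost: {cost:.2f} USD, no-battery cost: {no_battery_cost:.2f} USD")
```

The work per time step is proportional to the number of energy states times the number of power steps, so the run time grows linearly with the horizon length. Because the terminal cost is zero in this sketch, the battery is free to end the horizon empty; a terminal SOC condition would change the schedule.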

In summary, the proposed DP framework offers a rigorous and practical solution for TOU-aware battery scheduling. By balancing short-term tariff exploitation with long-term battery preservation, it gives microgrid operators a transparent way to reduce operating cost while extending service life. As renewable penetration and time-varying pricing continue to expand, predictive and degradation-aware scheduling will remain an important tool for economically sustainable distributed energy systems.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable

Informed Consent Statement

Not applicable

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the author.

Conflicts of Interest

The author declares no conflicts of interest.

References

1. Omar, M.A. Assessing the economic impacts of net metering on residential solar photovoltaic adoption: A Palestinian case study. Int. J. Energy Res. 2025, 2025, 1370101.
2. Wei, C.; Bai, X.; Kim, T. Advanced control and optimization for complex energy systems. Complexity 2020, 2020, 5908102.
3. Omar, M.A. Control scheme of photovoltaic inverter for voltage improvement in isolated AC microgrids. Int. Rev. Electr. Eng. 2020, 15, 199–205.
4. Azuatalam, D.; Paridari, K.; Ma, Y.; Förstl, M.; Chapman, A.C.; Verbič, G. Energy management of small-scale PV-battery systems: A systematic review considering practical implementation, computational requirements, quality of input data and battery degradation. Renew. Sustain. Energy Rev. 2019, 112, 555–570.
5. Zhang, Y.; Ma, T.; Campana, P.E.; Yamaguchi, Y.; Dai, Y. A techno-economic sizing method for grid-connected household photovoltaic battery systems. Appl. Energy 2020, 269, 115106.
6. Braun, M.; Büdenbender, K.; Magnor, D.; Jossen, A. Photovoltaic self-consumption in Germany using lithium-ion storage to increase self-consumed photovoltaic energy. In Proceedings of the 24th European Photovoltaic Solar Energy Conference, Hamburg, Germany, 21–25 September 2009.
7. Talavera, D.; Muñoz-Rodriguez, F.; Jimenez-Castillo, G.; Rus-Casas, C. A new approach to sizing the photovoltaic generator in self-consumption systems based on cost competitiveness and maximizing direct self-consumption. Renew. Energy 2019, 130, 1021–1035.
8. Hernández, J.; Sanchez-Sutil, F.; Muñoz-Rodríguez, F. Design criteria for optimal sizing of a hybrid energy storage system in PV household-prosumers to maximize self-consumption and self-sufficiency. Energy 2019, 186, 115827.
9. Luthander, R.; Widén, J.; Nilsson, D.; Palm, J. Photovoltaic self-consumption in buildings: A review. Appl. Energy 2015, 142, 80–94.
10. Angenendt, G.; Zurmühlen, S.; Axelsen, H.; Sauer, D.U. Comparison of different operation strategies for PV battery home storage systems including forecast-based operation strategies. Appl. Energy 2018, 229, 884–899.
11. Omar, M.A. The Significance of Considering Battery Service-Lifetime for Correctly Sizing Hybrid PV–Diesel Energy Systems. Energies 2023, 17, 103.
12. Darghouth, N.R.; Wiser, R.H.; Barbose, G. Customer economics of residential photovoltaic systems: Sensitivities to changes in wholesale market design and rate structures. Renew. Sustain. Energy Rev. 2016, 54, 1459–1469.
13. Paliwal, N.K. A day-ahead optimal scheduling operation of battery energy storage with constraints in hybrid power systems. Procedia Comput. Sci. 2020, 167, 2140–2152.
14. Omar, M.A.; Hamdan, A. Control strategy of battery inverter for voltage profile improvement in low voltage networks with high PV penetration level. Int. J. Power Energy Convers. 2024, 15, 25–41.
15. Bahmani-Firouzi, B.; Azizipanah-Abarghooee, R. Optimal sizing of battery energy storage for microgrid operation management using an improved bat algorithm. Int. J. Electr. Power Energy Syst. 2014, 56, 42–54.
16. Liu, T.; Hu, X.; Hu, W.; Zou, Y. A heuristic planning reinforcement learning-based energy management strategy for power-split plug-in hybrid electric vehicles. IEEE Trans. Ind. Inform. 2019, 15, 6436–6445.
17. Liu, W.; Niu, S.; Xu, H. Optimal planning of battery energy storage considering reliability benefit and operation strategy in active distribution systems. J. Mod. Power Syst. Clean Energy 2017, 5, 177–186.
18. Shaterabadi, M.; Jirdehi, M.A.; Tabar, V.S.; Galvani, S. Advanced dynamic programming for optimal microgrid energy management under renewable energy intermittency. Renew. Energy 2025, 256, 124077.
19. Rohkamper, S.; Hellwig, M.; Ritschel, W. Energy optimization for electric vehicles using dynamic programming. In Proceedings of the International Conference on Research and Education in Mechatronics, Wolfenbüttel, Germany, 14–15 September 2017.
20. Surya, A.S.; Awater, P.; Marbun, M.P.; Hariyanto, N. Optimal allocation of photovoltaic systems in hybrid power systems using knapsack dynamic programming. In Proceedings of the IEEE Innovative Smart Grid Technologies Asia, Chengdu, China, 21–24 May 2019.
21. Huangfu, Y.; Tian, C.; Zhuo, S.; Xu, L.; Li, P.; Quan, S.; Zhang, Y.; Ma, R. An optimal energy management strategy with subsection bi-objective optimization dynamic programming for photovoltaic/battery/hydrogen hybrid energy systems. Int. J. Hydrogen Energy 2022, 48, 3154–3170.
22. Wang, X.; Ji, Y.; Wang, J.; Wang, Y.; Qi, L. Optimal energy management of microgrids based on multi-parameter dynamic programming. Int. J. Distrib. Sens. Netw. 2020, 16, 1550147720937141.
23. Zou, B.; Peng, J.; Li, S.; Li, Y.; Yan, J.; Yang, H. Comparative study of dynamic programming-based and rule-based operation strategies for grid-connected PV-battery systems in office buildings. Appl. Energy 2022, 305, 117875.
24. Riffonneau, Y.; Bacha, S.; Barruel, F.; Ploix, S. Optimal power flow management for grid-connected PV systems with batteries. IEEE Trans. Sustain. Energy 2011, 2, 309–320.
25. Bhoi, S.K.; Nayak, M.R. Optimal scheduling of battery storage with grid-tied PV systems for trade-off between consumer energy cost and storage health. Microprocess. Microsyst. 2020, 79, 103274.
26. Tran, D.; Khambadkone, A.M. Energy management for lifetime extension of energy storage systems in microgrid applications. IEEE Trans. Smart Grid 2013, 4, 1289–1296.
27. Qin, Y.; Hua, H.; Cao, J. Stochastic optimal control scheme for battery lifetime extension in islanded microgrids via a novel modeling approach. IEEE Trans. Smart Grid 2019, 10, 4467–4475.
28. Hu, J.; Shan, Y.; Xu, Y.; Guerrero, J.M. Model predictive control of microgrids—An overview. Renew. Sustain. Energy Rev. 2021, 136, 110422.
29. Sharma, P.; Mathur, H.D.; Mishra, P.; Bansal, R.C. A critical and comparative review of energy management strategies for microgrids. Appl. Energy 2022, 327, 120028.
30. Fagundes, T.A.; Fuzato, G.H.F.; Silva, L.J.R.; Alonso, A.M.d.S.; Vasquez, J.C.; Guerrero, J.M.; Machado, R.Q. Battery energy storage systems in microgrids: A review of SoC balancing and perspectives. IEEE Open J. Ind. Electron. Soc. 2024, 5, 961–992.

Figure 1. Workflow of the proposed methodology.

Figure 2. TOU-aware rule-based energy management strategy.

Figure 3. Grid-connected PV-battery microgrid configuration.

Figure 4. Hourly PV generation and load demand.

Figure 5. Power profile and grid interaction under TOU-aware rule-based strategy. The three-point discharge minima correspond to the 3 h on-peak TOU window (hours 7, 8, 9), where each integer hour index represents a 1 h interval.

Figure 6. Battery state of charge under TOU-aware rule-based control.

Figure 7. Power profile and grid interaction under dynamic programming.

Figure 8. Battery state of charge under dynamic programming (0.1C power limit). Shaded regions indicate on-peak TOU periods.

Figure 9. Power profile under DP with increased power limit.

Figure 10. Battery state of charge under dynamic programming with 0.2C power limit. Shaded regions indicate on-peak TOU periods.

Figure 11. Power profile under high battery cost scenario.

Figure 12. Battery state of charge under high battery cost scenario.

Table 1. Comparative summary of optimization methods for battery scheduling in grid-connected PV microgrids.

Method	Optimization Variable	Main Advantages	Main Limitations	Real-Time Applicability	References
Rule-based control	Heuristic rules	Simple implementation and low computational burden	Limited optimality and weak forecasting capability	High	[12,14]
SOC-based DP	Battery SOC	Global optimization capability and effective constraint handling	Requires SOC-to-power conversion for inverter implementation	Moderate	[20,26]
MPC	Power and SOC states	Effective handling of forecasts and operational constraints	High computational complexity and forecast dependency	High	[16,29]
MILP	Mixed variables	Accurate optimization under multiple constraints	Computationally intensive for real-time implementation	Moderate	[15,30]
Proposed power-step DP	Battery power	Direct inverter-oriented implementation and reduced conversion complexity	Requires power discretization	High	This work

Table 2. Comparison of energy imports, exports, and battery usage (0.1C P_max, 200 USD/kWh).

Metric	TOU-Aware Rule-Based	Dynamic Programming
On-Peak Import Energy (kWh)	70	73.1
Off-Peak Import Energy (kWh)	724.07	551
Total Import Energy (kWh)	794.07	624.1
On-Peak Export Energy (kWh)	125	121
Off-Peak Export Energy (kWh)	67.41	24.1
Total Export Energy (kWh)	192.41	145.1
Battery Charged Energy (kWh)	491.67	361.9
Battery Discharged Energy (kWh)	300	292.9
Total Battery Losses (kWh)	79.17	65.48

Table 3. Comparison of costs for the TOU-aware rule-based approach and the dynamic programming approach (DPA).

Metric	TOU-Aware Rule-Based	Dynamic Programming
Total Import Cost (USD)	140.11	115.54
Total Export Income (USD)	56.74	50.81
Battery Wear Cost (USD)	39.58	32.74
Net Operating Cost (USD)	122.95	97.47

Table 4. Energy outputs of dynamic programming approach with increasing power limit.

Metric	$P_{bat,max}$ 0.1C	$P_{bat,max}$ 0.2C
Import Energy On-Peak (kWh)	73.1	7.6
Import Energy Off-Peak (kWh)	551	740.8
Total Import Energy (kWh)	624.1	748.4
Export Energy On-Peak (kWh)	121	209.4
Export Energy Off-Peak (kWh)	24.1	24
Total Export Energy (kWh)	145.1	233.4
Energy Charged to Battery (kWh)	361.9	552.2
Energy Discharged from Battery (kWh)	292.9	447.2
Total Battery Losses (kWh)	65.48	99.94

Table 5. Cost outputs of DPA with increasing power limit (200 USD/kWh fixed).

Metric	$P_{bat,max}$ 0.1C	$P_{bat,max}$ 0.2C
Total import cost (USD)	115.54	114.54
Total export income (USD)	50.81	86.16
Battery wear cost (USD)	32.74	49.97
Optimal cost (USD)	97.47	78.35

Table 6. Energy outputs of dynamic programming approach with increasing battery cost.

Metric	200 USD/kWh	400 USD/kWh
Import Energy On-Peak (kWh)	73.1	73.7
Import Energy Off-Peak (kWh)	551	420.5
Total Import Energy (kWh)	624.1	494.2
Export Energy On-Peak (kWh)	121	16.3
Export Energy Off-Peak (kWh)	24.1	24.1
Total Export Energy (kWh)	145.1	40.4
Energy Charged to Battery (kWh)	361.9	231.4
Energy Discharged from Battery (kWh)	292.9	187.6
Total Battery Losses (kWh)	65.48	41.9

Table 7. Cost outputs of dynamic programming approach with increasing battery cost.

Metric	200 USD/kWh	400 USD/kWh
Total Import Cost (USD)	115.54	96.24
Total Export Income (USD)	50.81	8.93
Battery Wear Cost (USD)	32.74	41.9
Optimal Cost (USD)	97.47	129.21
