*Energies* **2019**, *12*(23), 4577; https://doi.org/10.3390/en12234577

Article

Data-Driven Distributionally Robust Stochastic Control of Energy Storage for Wind Power Ramp Management Using the Wasserstein Metric

Department of Electrical and Computer Engineering, Automation and Systems Research Institute, Seoul National University, Seoul 08826, Korea

Received: 7 November 2019 / Accepted: 26 November 2019 / Published: 1 December 2019

## Abstract

The integration of wind energy into the power grid is challenging because of its variability, which causes high ramp events that may threaten the reliability and efficiency of power systems. In this paper, we propose a novel distributionally robust solution to wind power ramp management using energy storage. The proposed storage operation strategy minimizes the expected ramp penalty under the worst-case wind power ramp distribution in the Wasserstein ambiguity set, a statistical ball centered at an empirical distribution obtained from historical data. Thus, the resulting distributionally robust control policy presents a robust ramp management performance even when the future wind power ramp distribution deviates from the empirical distribution, unlike the standard stochastic optimal control method. For a tractable numerical solution, a duality-based dynamic programming algorithm is designed with a piecewise linear approximation of the optimal value function. The performance and utility of the proposed method are demonstrated and analyzed through case studies using the wind power data in the Bonneville Power Administration area for the year 2018.

Keywords: energy storage operation; wind ramp rate; renewable integration; stochastic control; dynamic programming; distributionally robust optimization; linear programming

## 1. Introduction

To decarbonize the electric power grid, there have been growing efforts to utilize clean, renewable energy sources. The utilization of wind and solar power generation is challenging because these energy sources are uncertain, intermittent, and nondispatchable. In particular, as the penetration of wind power increases, fast-ramping generators must be called upon more frequently to balance supply and demand, or wind power production must be curtailed [1,2]. Such ancillary services and wind power curtailments will offset the economic and environmental benefits of wind energy.

One possible way to alleviate the negative impact of a growing wind power ramp rate is to utilize the flexibility that energy storage can offer. Energy storage devices can shift wind generation in time and thereby reduce its ramp rate [3,4]. For efficient charging/discharging operation of battery energy storage systems, a model predictive control approach was proposed in [5]. However, a certain amount of wind generation must be curtailed when using this method. In [6], the wind power ramp control problem using energy storage was formulated as a social welfare maximization problem. As the optimal solution to the problem requires information about future wind generation and demand, a suboptimal online algorithm was presented; however, the suboptimal approach suffers a performance loss. In [7], a storage control approach using two-stage stochastic optimization was proposed. This operational strategy utilizes a forecast of wind energy obtained by a Gaussian process. Another optimization-based method was developed by using ramp scenario forecasts [8]. The performance of both methods depends on the accuracy of the forecasts because the optimization problems in [7,8] use them directly. Arguably, the most popular method for efficient energy storage operation is stochastic optimal control [9,10,11,12]. The associated stochastic optimal control problems are solved by dynamic programming or its approximate version, which often reveals important structural properties of optimal strategies. Unfortunately, this method requires knowledge of the probability distribution of all the uncertainties, such as future wind power generation, and accurate distribution models are difficult to obtain in practice. Thus, the effectiveness of stochastic optimal control methods is limited when the wind power distribution at any given time deviates from the distribution estimated using historical data.

The methods mentioned above either require reliable information about future wind power generation or compromise the control performance. To overcome these limitations, we seek an efficient storage operation strategy for wind power ramp management when only an inaccurate probability distribution of wind power generation is available. This method is based on distributionally robust stochastic control, which minimizes the expected value of a given cost function in the face of the worst-case distribution drawn from a known set, called the ambiguity set [13,14,15,16,17,18,19]. In this work, the ambiguity set is chosen as the set of all probability distributions whose Wasserstein distance from an empirical distribution constructed from data is no greater than a certain threshold [20,21,22]. The proposed storage control strategy is robust against wind ramp distribution errors characterized by the Wasserstein ambiguity set. It is worth mentioning that some storage control techniques do not require the exact distribution of uncertainties [23,24]. However, unlike our method, these approaches do not aim to design a controller that is robust against distribution errors.

The contributions of this work can be summarized as follows. First, a novel storage operation strategy is proposed to provide robust ramp management performance even when the future wind power ramp distribution deviates from the empirical distribution obtained from historical data. Second, we develop a computationally tractable dynamic programming (DP) algorithm by using Kantorovich duality and a piecewise linear approximation of the optimal value function with a uniform convergence property. Thus, in each DP iteration, it suffices to solve linear programs for all grid points that discretize the state space. Third, the performance of the distributionally robust method is evaluated using the wind power generation data in the Bonneville Power Administration (BPA) control area and is compared with that of the standard stochastic optimal control method. Our simulation studies indicate that the proposed method reduces the ramp penalty by 4.82% on average compared to the standard stochastic optimal control method. We also examine how the ambiguity set size and the storage size affect the ramp management performance of the distributionally robust control method. This paper significantly expands a preliminary conference version [25] in several respects. The problem studied in this paper is wind power ramp management, while [25] considers a wind energy balancing problem. In addition, we use the Wasserstein ambiguity set and examine the effect of the set size, unlike [25], which employs a moment-based ambiguity set. (The performance of the moment-based approach in [25] depends on the reliability of moments estimated from wind power generation data; the proposed Wasserstein approach does not have this issue because it does not use information about moments.) Furthermore, a tractable dynamic programming solution is carefully developed in this work, using a linear programming approximation with a uniform convergence property.

The remainder of this paper is organized as follows. In Section 2, the Wasserstein distributionally robust storage control problem is formulated for wind power ramp management using historical data. Its dynamic programming solution with a piecewise linear approximation is proposed in Section 3. In Section 4, the performance and utility of the proposed method are demonstrated and analyzed using the wind power generation data in BPA for the year 2018.

## 2. Problem Formulation

#### 2.1. Energy Storage Model

Consider an energy storage device whose state of charge (SOC) ${x}_{t}^{s}$ evolves as

$${x}_{t+1}^{s}=\eta [{x}_{t}^{s}+({\alpha}^{c}{u}_{t}^{c}-{u}_{t}^{d})\Delta t],\quad t=0,1,\dots ,T-1,$$

where $\Delta t$ is the length of each time interval. Here, the control inputs ${u}_{t}^{c}$ and ${u}_{t}^{d}$ determine the amount of power (MW) by which the storage device is charged and discharged, respectively, at time t. The coefficients $\eta \in (0,1]$ and ${\alpha}^{c}\in (0,1]$ account for the dissipation loss and the charging inefficiency, respectively. Given the capacity of the storage device, ${x}_{t}^{s}\in [{x}_{min}^{s},{x}_{max}^{s}]$, the decision variable ${u}_{t}:=({u}_{t}^{c},{u}_{t}^{d})$ must satisfy the following constraints:

$$\begin{array}{c}0\le {u}_{t}^{c}\le \min\{({x}_{max}^{s}-{x}_{t}^{s})/({\alpha}^{c}\Delta t),{u}_{max}^{c}\},\\ 0\le {u}_{t}^{d}\le \min\{{x}_{t}^{s}/\Delta t,{u}_{max}^{d}\},\end{array}$$

where ${u}_{max}^{c}$ and ${u}_{max}^{d}$ denote the charging and discharging limit, respectively. If the storage device is connected to a bus, the amount of power drawn from the bus at time t is given by

$$h\left({u}_{t}\right):={u}_{t}^{c}-{\alpha}^{d}{u}_{t}^{d},$$

where ${\alpha}^{d}\in (0,1]$ represents the discharging efficiency. This energy storage model is illustrated in Figure 1.
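The storage model above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the default parameter values are borrowed from the case study in Section 4, $\Delta t$ is expressed in hours (an assumption, so that MW times hours gives MWh), and the lower SOC limit is taken to be zero, as the discharging bound implies.

```python
def storage_step(x_s, u_c, u_d, eta=0.99, alpha_c=0.9, dt=5 / 60):
    """SOC update: x_{t+1} = eta * (x_t + (alpha_c * u_c - u_d) * dt)."""
    return eta * (x_s + (alpha_c * u_c - u_d) * dt)


def control_bounds(x_s, x_max=10.0, u_c_max=10.0, u_d_max=10.0,
                   alpha_c=0.9, dt=5 / 60):
    """Upper bounds on (u_c, u_d); the lower bounds are both zero."""
    u_c_hi = min((x_max - x_s) / (alpha_c * dt), u_c_max)
    u_d_hi = min(x_s / dt, u_d_max)
    return u_c_hi, u_d_hi


def bus_power(u_c, u_d, alpha_d=0.9):
    """Power drawn from the bus: h(u) = u_c - alpha_d * u_d."""
    return u_c - alpha_d * u_d
```

Charging at the upper bound returned by `control_bounds` never pushes the SOC above `x_max`, because the bound is obtained by solving the undamped update for the charging power that exactly fills the device.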

#### 2.2. Wind Power Ramp Management

Let ${w}_{t}$ denote the amount of power generated by wind power plants in an area of interest at time t. We model the evolution of wind power production by

$${w}_{t+1}={w}_{t}+{\xi}_{t},$$

where ${\xi}_{t}$ is a random variable in $\Xi \subseteq \mathbb{R}$. Note that $({w}_{t+1}-{w}_{t})/\Delta t={\xi}_{t}/\Delta t$ represents the (average) ramp rate of the wind power production. We introduce a new state ${x}_{t}^{p}$, which evolves according to

$${x}_{t+1}^{p}=h\left({u}_{t}\right)+{\xi}_{t}.$$

Let ${x}_{t}:=({x}_{t}^{s},{x}_{t}^{p})\in {\mathbb{R}}^{2}$ be the augmented state, and $\mathcal{X}:=[{x}_{min}^{s},{x}_{max}^{s}]\times [{x}_{min}^{p},{x}_{max}^{p}]$ be the augmented state space.

Suppose the wind power plants and the energy storage device are located in the same area (possibly connected to the same bus). Their net power production in period t is given by

$${y}_{t}:={w}_{t}-h\left({u}_{t}\right),$$

as illustrated in Figure 1. Let

$${d}_{t}:={y}_{t}-{y}_{t-1}={\xi}_{t-1}-h\left({u}_{t}\right)+h\left({u}_{t-1}\right)={x}_{t}^{p}-h\left({u}_{t}\right).$$

Then, ${d}_{t}/\Delta t$ represents the ramp rate of net power production.

Let ${R}_{u}>0$ and ${R}_{d}>0$ denote the ramp-up limit and ramp-down limit (MW), respectively. To compute the ramp penalty when ${d}_{t}\ge 0$, consider the following cases:

- If $0\le {d}_{t}<{R}_{u}$, then the ramp penalty, denoted by ${r}_{t}({x}_{t},{u}_{t})$, is linear in ${d}_{t}$, i.e., $${r}_{t}({x}_{t},{u}_{t}):=p{d}_{t}.$$
- If ${d}_{t}\ge {R}_{u}$, then the amount exceeding the ramp-up limit is penalized with price ${p}_{u}>p$, i.e., $${r}_{t}({x}_{t},{u}_{t}):={p}_{u}({d}_{t}-{R}_{u})+p{R}_{u}.$$

When ${d}_{t}\le 0$, by symmetry, we have the following:

- If $-{R}_{d}<{d}_{t}\le 0$, then the ramp penalty is given by$${r}_{t}({x}_{t},{u}_{t}):=-p{d}_{t}.$$
- If ${d}_{t}\le -{R}_{d}$, then the amount below the ramp-down limit is penalized with price ${p}_{d}>p$, i.e.,$${r}_{t}({x}_{t},{u}_{t}):=-{p}_{d}({d}_{t}+{R}_{d})+p{R}_{d}.$$
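The four cases above can be written as a short Python function. The sketch below is illustrative only: the function names are hypothetical and the default prices and limits are the values used in the case study of Section 4. The second function expresses the same penalty as the maximum of four affine pieces, which is valid when ${p}_{u}\ge p$ and ${p}_{d}\ge p$.

```python
def ramp_penalty(d, p=0.005, p_u=1.0, p_d=1.0, R_u=2.5, R_d=2.5):
    """Piecewise linear penalty for a net-power ramp d (MW per interval)."""
    if d >= R_u:                      # ramp-up limit exceeded
        return p_u * (d - R_u) + p * R_u
    if d >= 0:                        # within the ramp-up limit
        return p * d
    if d > -R_d:                      # within the ramp-down limit
        return -p * d
    return -p_d * (d + R_d) + p * R_d  # ramp-down limit exceeded


def ramp_penalty_max(d, p=0.005, p_u=1.0, p_d=1.0, R_u=2.5, R_d=2.5):
    """Same penalty as the maximum of four affine functions of d."""
    return max(p * d,
               p_u * (d - R_u) + p * R_u,
               -p * d,
               -p_d * (d + R_d) + p * R_d)
```

The max-of-affine form is the one that makes the penalty easy to embed in a linear program through a single epigraph variable.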

Note that the ramp penalty depends on the storage control action ${u}_{t}$ as well as the new state ${x}_{t}^{p}$ through the ramp ${d}_{t}$ of net power production. Thus, the ramp penalty can be reduced by carefully controlling the storage device. The ramp management problem using energy storage can be formulated as

$$\underset{\pi \in \Pi}{\min}\;{\mathbb{E}}^{\pi}\left[\sum _{t=0}^{T-1}{r}_{t}({x}_{t},{u}_{t})\mid {x}_{0}=\mathit{x}\right],\tag{3}$$

where $\Pi $ is the set of admissible control policies,

$$\begin{array}{rl}\Pi :=\{\pi =({\pi}_{0},\dots ,{\pi}_{T-1})\mid & {u}_{t}={\pi}_{t}\left({x}_{t}\right)\\ & \in \mathcal{U}\left({x}_{t}\right):=[0,\min\{({x}_{max}^{s}-{x}_{t}^{s})/({\alpha}^{c}\Delta t),{u}_{max}^{c}\}]\times [0,\min\{{x}_{t}^{s}/\Delta t,{u}_{max}^{d}\}]\}.\end{array}$$

(To focus on the wind power ramp management capability of our method, we use this stylized control problem, neglecting the energy cost and the aging of batteries (in the case of battery energy storage systems). However, these additional factors can be incorporated by modifying the cost function.)

The stochastic optimal control problem (3) can be solved when the probability distribution ${\mu}_{t}$ of the wind ramp variable ${\xi}_{t}$ is available. Unfortunately, it is difficult to estimate ${\mu}_{t}$ accurately. Suppose that $T=288$ and $\Delta t=5$ min, so that solving this problem yields an optimal controller that operates for 24 h. To estimate the distribution of ${\xi}_{t}$, we may use its historical data for the past 15 days. However, this distribution may not be valid for the next 15 days. For example, as shown in Figure 2, there is a clear discrepancy between the empirical distribution of the wind power ramp rate at 12 PM in the BPA area obtained from the data for 1–15 April 2018 and that obtained from the data for 16–30 April 2018. If an optimal controller is constructed using distributional information that is invalid at test time, it may not perform well. Our distributionally robust method aims to resolve this issue, as we will see later.

#### 2.3. Ambiguity of Wind Ramp Distribution

One of the simplest ways to estimate the probability distribution of the wind ramp variable ${\xi}_{t}$ is to use the empirical distribution constructed from the (historical) data $\{{\widehat{\xi}}_{t}^{\left(1\right)},\dots ,{\widehat{\xi}}_{t}^{\left(N\right)}\}$:

$${\nu}_{t}:=\frac{1}{N}\sum _{n=1}^{N}{\delta}_{{\widehat{\xi}}_{t}^{\left(n\right)}},$$

where ${\delta}_{{\widehat{\xi}}_{t}^{\left(n\right)}}$ denotes the Dirac delta measure concentrated at ${\widehat{\xi}}_{t}^{\left(n\right)}$. However, this empirical distribution may not reflect the actual behavior of ${\xi}_{t}$, particularly at the (future) test time or operation time of energy storage. To characterize errors in the empirical distribution, we introduce the following set of probability distributions, called the ambiguity set:

$${\mathcal{D}}_{t}:=\{{\mu}_{t}\in \mathcal{P}(\Xi )\mid W({\mu}_{t},{\nu}_{t})\le \theta \},\tag{4}$$

where $\mathcal{P}(\Xi )$ is the set of probability distributions on $\Xi $. The ambiguity set is a statistical ball centered at the empirical distribution with radius $\theta >0$. Here, the distance between two probability distributions is measured by the Wasserstein metric of order 1,

$$W(\mu ,\nu ):=\underset{\kappa \in \mathcal{P}\left({\Xi}^{2}\right)}{\min}\left\{{\int}_{{\mathbb{R}}^{2}}|\xi -{\xi}^{\prime}|\,\kappa (\mathrm{d}\xi ,\mathrm{d}{\xi}^{\prime})\mid {\Pi}^{1}\kappa =\mu ,\;{\Pi}^{2}\kappa =\nu \right\},$$

where ${\Pi}^{i}\kappa $ denotes the $i$th marginal of $\kappa $. The Wasserstein distance between two probability distributions can be interpreted as the minimum cost of transporting mass from one to another using nonuniform perturbation. The true distribution of ${\xi}_{t}$ lies in the ambiguity set if a sufficiently large $\theta $ is chosen. We use the Wasserstein ambiguity set ${\mathcal{D}}_{t}$ to design a control policy that is robust against errors in the empirical distribution ${\nu}_{t}$ of the ramp variable ${\xi}_{t}$. The Wasserstein ambiguity sets have received much attention because the associated distributionally robust optimization (DRO) problems provide solutions with probabilistic out-of-sample performance guarantees and have equivalent tractable forms [20,21,22].
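To build intuition for the metric, the following sketch computes the order-1 Wasserstein distance in one easy special case: two empirical distributions on the real line with the same number of equally weighted samples. In that case the optimal transport plan matches sorted samples, so the distance is the mean absolute difference of the order statistics. The function name is hypothetical, and this special case is only an illustration; it is not needed by the control design itself.

```python
def wasserstein1_empirical(xs, ys):
    """W1 between two equal-size, equally weighted 1-D samples."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("samples must be non-empty and of equal size")
    # In 1-D the optimal coupling pairs the i-th smallest points.
    pairs = zip(sorted(xs), sorted(ys))
    return sum(abs(a - b) for a, b in pairs) / len(xs)
```

For example, shifting every sample by one unit, as in `wasserstein1_empirical([0, 1, 2], [1, 2, 3])`, gives a distance of 1.0. A check of this kind could be used to see whether a test-period sample lies within radius $\theta $ of the training sample.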

#### 2.4. Wasserstein Distributionally Robust Stochastic Control

Our goal is to construct a control policy that performs well even when the true distribution ${\mu}_{t}$ of the wind ramp variable ${\xi}_{t}$ deviates from its empirical distribution ${\nu}_{t}$ obtained from (historical) data. We take a game-theoretic approach. Consider a two-player zero-sum game in which Player I selects a storage control action to minimize the total ramp penalty while Player II, an adversary, chooses the probability distribution of ${\xi}_{t}$ from the Wasserstein ambiguity set ${\mathcal{D}}_{t}$ to maximize the same penalty. The strategy space for Player I is given by $\Pi $. The policy space ${\Gamma}_{\mathcal{D}}$ for Player II is chosen as

$${\Gamma}_{\mathcal{D}}:=\{\gamma :=({\gamma}_{0},\dots ,{\gamma}_{T-1})\mid {\mu}_{t}={\gamma}_{t}\left({x}_{t}\right)\in {\mathcal{D}}_{t}\},$$

where the subscript $\mathcal{D}$ is used to emphasize the fact that the adversary’s strategy space depends on the Wasserstein ambiguity set. The Wasserstein distributionally robust stochastic control problem for ramp management is then formulated as

$$\underset{\pi \in \Pi}{\min}\,\underset{\gamma \in {\Gamma}_{\mathcal{D}}}{\max}\;{\mathbb{E}}^{\pi ,\gamma}\left[\sum _{t=0}^{T-1}{r}_{t}({x}_{t},{u}_{t})\mid {x}_{0}=\mathit{x}\right].\tag{5}$$

An optimal distributionally robust policy ${\pi}^{\star}$ minimizes the worst-case ramp penalty under the most adversarial wind ramp distributions in the ambiguity set. Thus, ${\pi}^{\star}$ is robust against distribution errors characterized by the Wasserstein ambiguity set. The radius $\theta $ of the Wasserstein ball controls the robustness and conservativeness of control policy ${\pi}^{\star}$. A detailed discussion of tuning $\theta $ and theoretical properties of ${\pi}^{\star}$ can be found in [19].

## 3. Solution via Dynamic Programming

To solve the minimax stochastic control problem (5), we use dynamic programming in conjunction with modern distributionally robust optimization (DRO). To describe the evolution of the augmented state ${x}_{t}:=({x}_{t}^{s},{x}_{t}^{p})\in \mathcal{X}$, we use the following notation:

$${x}_{t+1}=f({x}_{t},{u}_{t},{\xi}_{t}),$$

where $f(\mathit{x},\mathit{u},\mathit{\xi}):=\left(\eta [{\mathit{x}}^{s}+({\alpha}^{c}{\mathit{u}}^{c}-{\mathit{u}}^{d})\Delta t],\;h\left(\mathit{u}\right)+\mathit{\xi}\right)$. Note that the ramp penalty is a convex piecewise linear function and can be expressed as

$$\begin{array}{rl}{r}_{t}({x}_{t},{u}_{t})&=\max\{p{d}_{t},\;{p}_{u}({d}_{t}-{R}_{u})+p{R}_{u},\;-p{d}_{t},\;-{p}_{d}({d}_{t}+{R}_{d})+p{R}_{d}\}\\ &=\max\{p({x}_{t}^{p}-h\left({u}_{t}\right)),\;{p}_{u}({x}_{t}^{p}-h\left({u}_{t}\right)-{R}_{u})+p{R}_{u},\;-p({x}_{t}^{p}-h\left({u}_{t}\right)),\;-{p}_{d}({x}_{t}^{p}-h\left({u}_{t}\right)+{R}_{d})+p{R}_{d}\}.\end{array}$$

#### 3.1. Bellman Equation

We define the (optimal) value function ${v}_{t}:\mathcal{X}\to \mathbb{R}$ of (5) as

$${v}_{t}\left(\mathit{x}\right):=\underset{\pi \in \Pi}{\min}\,\underset{\gamma \in {\Gamma}_{\mathcal{D}}}{\max}\;{\mathbb{E}}^{\pi ,\gamma}\left[\sum _{\tau =t}^{T-1}{r}_{\tau}({x}_{\tau},{u}_{\tau})\mid {x}_{t}=\mathit{x}\right],$$

which represents the minimum cost-to-go under the most adversarial distributions chosen from $\mathcal{D}$. By the dynamic programming principle, the value function satisfies the following Bellman equation:

$${v}_{t}\left(\mathit{x}\right)=\underset{\mathit{u}\in \mathcal{U}\left(\mathit{x}\right)}{\inf}\,\underset{\mathit{\mu}\in {\mathcal{D}}_{t}}{\sup}\;{\mathbb{E}}^{\mathit{\mu}}[{r}_{t}(\mathit{x},\mathit{u})+{v}_{t+1}\left(f(\mathit{x},\mathit{u},{\xi}_{t})\right)],\quad t=0,\dots ,T-1,$$

with ${v}_{T}\left(\mathit{x}\right)\equiv 0$. (More precisely, the dynamic programming principle holds under the measurable selection condition, which we assume throughout this paper. More technical details can be found in [26].) This Bellman equation may be solved backward in time, i.e., from T to 0. However, it requires an optimal solution to the minimax optimization problem, which is computationally challenging to solve. This is because the inner maximization problem over probability distributions is infinite dimensional.

#### 3.2. Tractable Reformulation

To alleviate the computational difficulty, we reformulate the Bellman equation into a tractable form by using Wasserstein DRO based on the Kantorovich duality principle [20,22]. Specifically, the right-hand side can be expressed in the following dual form:

$$\begin{array}{rl}{v}_{t}\left(\mathit{x}\right)=\underset{\mathit{u},\lambda ,z}{\inf}\quad & \theta \lambda +{r}_{t}(\mathit{x},\mathit{u})+\frac{1}{N}\sum _{n=1}^{N}{z}_{n}\\ \mathrm{s.t.}\quad & {v}_{t+1}\left(f(\mathit{x},\mathit{u},\mathit{\xi})\right)-\lambda |{\widehat{\xi}}_{t}^{\left(n\right)}-\mathit{\xi}|\le {z}_{n}\quad \forall \mathit{\xi}\in \Xi ,\;\forall n=1,\dots ,N\\ & 0\le {\mathit{u}}^{c}\le \min\{({x}_{max}^{s}-{\mathit{x}}^{s})/({\alpha}^{c}\Delta t),{u}_{max}^{c}\}\\ & 0\le {\mathit{u}}^{d}\le \min\{{\mathit{x}}^{s}/\Delta t,{u}_{max}^{d}\}\\ & \lambda \ge 0.\end{array}\tag{7}$$

Strong duality is shown to hold under a minor technical condition [22]. Note that the reformulated Bellman equation involves a single minimization problem instead of the minimax problem in the original one. This minimization problem is a convex optimization problem given any $\mathit{x}\in \mathcal{X}$ because ${r}_{t}$ and ${v}_{t+1}$ are convex and $f$ is affine. Furthermore, all the optimization variables $\mathit{u}$, $\lambda $, and $z$ are finite dimensional, unlike those of the original Bellman equation. However, the first inequality constraint must be satisfied for all $\mathit{\xi}$ in the support $\Xi $, which generally contains infinitely many points. Thus, the reformulated minimization problem is a convex semi-infinite program. This can be numerically solved by several existing algorithms, such as discretization and sampling-based methods (see [27,28,29] and the references therein).
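The inner worst-case expectation and its dual can be illustrated on a finite support. The sketch below is a simplified stand-in, not the authors' solver: it fixes the control, replaces ${v}_{t+1}(f(\mathit{x},\mathit{u},\cdot ))$ with a generic function `g`, and restricts $\Xi $ to a finite list of points (augmented with the samples so that the dual stays bounded). Because the dual objective is convex and piecewise linear in $\lambda $, it is minimized at $\lambda =0$ or at a breakpoint, so the code simply enumerates those candidates. All names are hypothetical.

```python
def worst_case_expectation(g, samples, support, theta):
    """inf over lam >= 0 of
    theta*lam + (1/N) * sum_n max_k [g(xi_k) - lam*|sample_n - xi_k|]."""
    pts = sorted(set(support) | set(samples))
    gv = [g(x) for x in pts]

    def phi(lam):
        total = 0.0
        for s in samples:
            total += max(v - lam * abs(s - x) for v, x in zip(gv, pts))
        return theta * lam + total / len(samples)

    # Candidate minimizers: lam = 0 and every positive intersection of
    # two affine pieces g_i - lam*c_i and g_j - lam*c_j for some sample.
    candidates = {0.0}
    for s in samples:
        for i in range(len(pts)):
            for j in range(i + 1, len(pts)):
                dc = abs(s - pts[i]) - abs(s - pts[j])
                if dc != 0:
                    lam = (gv[i] - gv[j]) / dc
                    if lam > 0:
                        candidates.add(lam)
    return min(phi(lam) for lam in candidates)
```

With a single sample at 0, support `[-1, 0, 1]`, and `g(x) = x*x`, the adversary may move at most a mass of $\theta $ to the points $\pm 1$, so the worst-case expectation equals $\theta $ for $\theta \le 1$. Setting $\theta =0$ recovers the empirical mean of `g`.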

#### 3.3. Controller Design Algorithm Using Linear Programming

By using the piecewise linear structure of the ramp penalty function, we can further rewrite the Bellman Equation (7), with a slack variable b, as

$$\begin{array}{rl}{v}_{t}\left(\mathit{x}\right)=\underset{\mathit{u},\lambda ,z,b}{\inf}\quad & \theta \lambda +b+\frac{1}{N}\sum _{n=1}^{N}{z}_{n}\\ \mathrm{s.t.}\quad & {v}_{t+1}\left(f(\mathit{x},\mathit{u},\mathit{\xi})\right)-\lambda |{\widehat{\xi}}_{t}^{\left(n\right)}-\mathit{\xi}|\le {z}_{n}\quad \forall \mathit{\xi}\in \Xi ,\;\forall n=1,\dots ,N\\ & p({\mathit{x}}^{p}-h\left(\mathit{u}\right))\le b\\ & {p}_{u}({\mathit{x}}^{p}-h\left(\mathit{u}\right)-{R}_{u})+p{R}_{u}\le b\\ & -p({\mathit{x}}^{p}-h\left(\mathit{u}\right))\le b\\ & -{p}_{d}({\mathit{x}}^{p}-h\left(\mathit{u}\right)+{R}_{d})+p{R}_{d}\le b\\ & 0\le {\mathit{u}}^{c}\le \min\{({x}_{max}^{s}-{\mathit{x}}^{s})/({\alpha}^{c}\Delta t),{u}_{max}^{c}\}\\ & 0\le {\mathit{u}}^{d}\le \min\{{\mathit{x}}^{s}/\Delta t,{u}_{max}^{d}\}\\ & \lambda \ge 0.\end{array}\tag{8}$$

This optimization problem has a linear objective function. Furthermore, all the constraint functions are affine except the first. Observing that $\mathit{u}\mapsto {v}_{t+1}\left(f(\mathit{x},\mathit{u},\mathit{\xi})\right)$ is a convex function, we approximate it as a convex piecewise linear function, i.e.,

$${v}_{t+1}(f(\mathit{x},\mathit{u},{\xi}^{\left[k\right]}))\approx \sum _{i,j=1}^{M}{\gamma}_{i,j,k}{v}_{t+1}({x}^{\left[i\right]},{y}^{\left[j\right]}),$$

where ${\{({x}^{\left[i\right]},{y}^{\left[j\right]})\}}_{i,j=1}^{M}$ is the set of rectangular grid points that discretize the state space and ${\left\{{\xi}^{\left[k\right]}\right\}}_{k=1}^{K}$ is the set of grid points that discretize the support $\Xi $. (Here, we assume that there are the same number of grid points on each axis. However, this assumption can easily be relaxed.) The uniform convergence property of this approximation scheme is shown in [30]. Here, the weight ${\gamma}_{i,j,k}$ represents the contribution of $({x}^{\left[i\right]},{y}^{\left[j\right]})$ to approximating $f(\mathit{x},\mathit{u},{\xi}^{\left[k\right]})$, and thus satisfies

$$f(\mathit{x},\mathit{u},{\xi}^{\left[k\right]})=\sum _{i,j=1}^{M}{\gamma}_{i,j,k}\left[\begin{array}{c}{x}^{\left[i\right]}\\ {y}^{\left[j\right]}\end{array}\right],\quad \sum _{i,j=1}^{M}{\gamma}_{i,j,k}=1,\quad {\gamma}_{i,j,k}\ge 0.$$
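To make the three conditions on the weights concrete, the sketch below computes bilinear interpolation weights on a rectangular grid. These weights are nonnegative, sum to one, and reproduce the query point, so they are one feasible choice of ${\gamma}_{i,j,k}$; in the optimization problem itself the weights are decision variables and need not coincide with this choice. The function name is hypothetical, and the query point is assumed to lie inside the grid.

```python
from bisect import bisect_right


def bilinear_weights(grid_x, grid_y, point):
    """Return {(i, j): weight} over the four corners of the cell that
    contains `point`; grids must be sorted and contain the point."""
    def locate(grid, value):
        # Index of the left end of the cell, and the relative position.
        i = min(max(bisect_right(grid, value) - 1, 0), len(grid) - 2)
        t = (value - grid[i]) / (grid[i + 1] - grid[i])
        return i, t

    i, tx = locate(grid_x, point[0])
    j, ty = locate(grid_y, point[1])
    return {
        (i, j): (1 - tx) * (1 - ty),
        (i + 1, j): tx * (1 - ty),
        (i, j + 1): (1 - tx) * ty,
        (i + 1, j + 1): tx * ty,
    }
```

On the $11\times 21$ grid used in the case study, the point $(2.5, 6)$ falls in the middle of a cell, so each of its four corners receives weight 0.25.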

With this piecewise linear approximation, the Bellman Equation (8) can be expressed as

$$\begin{array}{rl}{v}_{t}(\mathit{x})=\underset{\mathit{u},\lambda ,z,b,\gamma}{\inf}\quad & \theta \lambda +b+\frac{1}{N}\sum _{n=1}^{N}{z}_{n}\\ \mathrm{s.t.}\quad & \sum _{i,j=1}^{M}{\gamma}_{i,j,k}{v}_{t+1}({x}^{\left[i\right]},{y}^{\left[j\right]})-\lambda |{\widehat{\xi}}_{t}^{(n)}-{\xi}^{\left[k\right]}|\le {z}_{n}\quad \forall k=1,\dots ,K,\;\forall n=1,\dots ,N\\ & f(\mathit{x},\mathit{u},{\xi}^{\left[k\right]})=\sum _{i,j=1}^{M}{\gamma}_{i,j,k}\cdot ({x}^{\left[i\right]},{y}^{\left[j\right]})\quad \forall k=1,\dots ,K\\ & \sum _{i,j=1}^{M}{\gamma}_{i,j,k}=1\quad \forall k=1,\dots ,K\\ & {\gamma}_{i,j,k}\ge 0\quad \forall i,j=1,\dots ,M,\;\forall k=1,\dots ,K\\ & p({\mathit{x}}^{p}-h(\mathit{u}))\le b\\ & {p}_{u}({\mathit{x}}^{p}-h(\mathit{u})-{R}_{u})+p{R}_{u}\le b\\ & -p({\mathit{x}}^{p}-h(\mathit{u}))\le b\\ & -{p}_{d}({\mathit{x}}^{p}-h(\mathit{u})+{R}_{d})+p{R}_{d}\le b\\ & 0\le {\mathit{u}}^{c}\le \min\{({x}_{max}^{s}-{\mathit{x}}^{s})/({\alpha}^{c}\Delta t),{u}_{max}^{c}\}\\ & 0\le {\mathit{u}}^{d}\le \min\{{\mathit{x}}^{s}/\Delta t,{u}_{max}^{d}\}\\ & \lambda \ge 0.\end{array}\tag{9}$$

Given any $\mathit{x}\in \mathcal{X}$, this optimization problem is a linear program (LP) because f and h are affine functions. Thus, this optimization problem can be efficiently solved by using existing algorithms such as interior-point methods and simplex methods (see, e.g., [31,32,33] and the references therein).

Algorithm 1 describes the design procedure for a distributionally robust storage controller. It aims to solve the Bellman equation to compute the value function ${v}_{t}$ over a discretized state space. As specified in Line 7, the value function at a particular grid point is obtained by solving the LP (9). Note that an optimal ${\mathit{u}}^{\star}$ of this LP is assigned to be an optimal action at state $({x}^{\left[i\right]},{y}^{\left[j\right]})$, i.e., ${\pi}_{t}^{\star}({x}^{\left[i\right]},{y}^{\left[j\right]})={\mathit{u}}^{\star}$. An optimal control action at a nongrid point $\mathit{x}$ can also be computed by solving the LP with $\mathit{x}$. Thus, this algorithm does not require any explicit interpolation, which may introduce additional numerical errors. The input data $\{{\widehat{\xi}}_{t}^{\left(1\right)},\dots ,{\widehat{\xi}}_{t}^{\left(N\right)}\}$ can be considered as a training sample. In fact, the proposed method can be interpreted as an adversarial training of a storage controller to make it robust against distributional errors.

The distributionally robust control problem (5) is solved once using Algorithm 1 before the storage operation starts. In other words, the proposed controller design approach is an offline method unlike, for example, receding horizon control. Note also that the proposed reformulation method does not require the direct calculation of Wasserstein distances, which is #P-hard [34]. This is an advantage of Wasserstein DRO that enables us to obtain an optimal solution through the dual form (7) or (9) without explicitly computing Wasserstein distances.

Algorithm 1: Distributionally Robust Storage Controller Design via Linear Program (LP).
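The backward recursion described in the text can be sketched as follows. This is a minimal skeleton under stated assumptions rather than the authors' listing: `solve_lp` is a hypothetical callable that solves the LP (9) at state `x` for stage `t`, given the next-stage values on the grid, and returns the optimal value together with an optimal action.

```python
def design_controller(grid, T, solve_lp):
    """Backward dynamic programming over a discretized state space.

    grid     : iterable of hashable grid points (x_s, x_p)
    solve_lp : hypothetical callable (t, x, v_next) -> (value, action),
               where v_next maps each grid point to v_{t+1} at that point
    """
    grid = list(grid)
    values = [None] * (T + 1)
    policy = [None] * T
    values[T] = {x: 0.0 for x in grid}      # terminal condition v_T = 0
    for t in range(T - 1, -1, -1):          # t = T-1, ..., 0
        v_t, pi_t = {}, {}
        for x in grid:
            v_t[x], pi_t[x] = solve_lp(t, x, values[t + 1])
        values[t], policy[t] = v_t, pi_t
    return values, policy
```

The design is computed once, offline; at run time the stored table `policy[t]` is looked up at grid points, and for a nongrid state the same LP can be solved directly with that state.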

## 4. Case Studies

To demonstrate the performance and utility of the proposed distributionally robust ramp management method, we perform simulation studies using the wind power generation data in the BPA control area for the year 2018 [35]. The storage size is chosen as ${x}_{max}^{s}=10$ MWh with power rating ${u}_{max}^{c}={u}_{max}^{d}=10$ MW. The initial SOC of the storage is chosen to be half of its capacity, i.e., ${x}_{0}^{s}={x}_{max}^{s}/2$. The storage operation interval is $\Delta t=5$ min, and the number of time steps T is 288. Thus, the storage device manages the ramp rate for 24 h. The ramp-rate limit is chosen as $\pm 0.5$ MW/min by setting ${R}_{u}={R}_{d}=2.5$ MW. Additionally, the following parameters are used in the simulations: $\eta =0.99$, ${\alpha}^{c}={\alpha}^{d}=0.9$, $p=0.005$, ${p}_{u}={p}_{d}=1$, $\theta =0.1$, and $K=21$. The sample of the ramp variable ${\xi}_{t}$ is constructed for each month from the BPA wind generation data. Since outliers bias the trained controller, we clipped data points lying outside $[-120,120]$ MW to $\pm 120$ MW. The state space $\mathcal{X}=[0,10]\times [-120,120]$ was discretized with $11\times 21$ grid points, with grid spacing 1 MWh for the x-axis and 12 MW for the y-axis. All the numerical experiments were performed on a Mac with a 4.2 GHz Intel Core i7 and 64 GB RAM. The LP problem (9) was solved using CPLEX for MATLAB.
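The construction of the ramp sample with clipping can be sketched as below. The function name and the input format (a list of wind power readings in MW at 5-min resolution) are assumptions made for illustration; only the clipping range comes from the setup above.

```python
def ramp_samples(wind_mw, lo=-120.0, hi=120.0):
    """Ramp variable xi_t = w_{t+1} - w_t, clipped to [lo, hi] MW."""
    return [min(max(b - a, lo), hi)
            for a, b in zip(wind_mw, wind_mw[1:])]
```

A series of $n$ readings yields $n-1$ ramp values. The ramp limit of 2.5 MW per interval corresponds to 0.5 MW/min multiplied by the 5-min interval length.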

#### 4.1. Comparison with Stochastic Optimal Control

We first compare the performance of the proposed distributionally robust method with that of the standard stochastic optimal control method, which is the most popular approach to energy storage operation (e.g., [9,10,11]). The stochastic optimal controller is designed by solving (3) via dynamic programming. To evaluate the performance for all four seasons, the BPA wind data of January, April, July, and October are used. For each month, we split the data into training and test sets: the training data are chosen as the ramp data for the first 15 days, i.e., from day 1 to day 15, and the test data are selected as those for the next 15 days, i.e., from day 16 to day 30. We use three different training sample sizes, 5, 10, and 15, in this comparison study. The training data are chosen from day 15, backward in time. For example, in the case of sample size 5, the training data are selected for days 11–15.
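The selection of training days can be sketched as follows. Both function names and the data layout (`ramps_by_day` mapping a day of the month to that day's list of ramp values) are hypothetical, and the code assumes that each training day contributes one sample ${\widehat{\xi}}_{t}^{(n)}$ per time step.

```python
def training_days(sample_size, last_day=15):
    """Days used for training, counted backward from `last_day`."""
    return list(range(last_day - sample_size + 1, last_day + 1))


def samples_at_time(ramps_by_day, days, t):
    """Empirical sample for time step t: one ramp value per training day."""
    return [ramps_by_day[d][t] for d in days]
```

With sample size 5, `training_days(5)` returns days 11 to 15, which matches the example above.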

The performance of the two controllers is evaluated as the total ramp penalty for 24 h relative to the “no storage” case. In other words, the cost is evaluated as the ratio of the total ramp penalty with energy storage to that without storage. As shown in Figure 3, the proposed distributionally robust controller outperforms the standard stochastic optimal controller for all months and for all sample sizes. Specifically, the distributionally robust method reduces the ramp penalty by 4.82% on average compared to the stochastic optimal control method, as summarized in Table 1. This result indicates that the distributionally robust method consistently mitigates the issue arising from distribution errors for every season and sample size. The stochastic optimal controller sometimes performs worse than the “no storage” case. This is because the training set distribution is different from the test set distribution, i.e., the training set does not offer useful information about the behavior of wind power ramping in the near future. The stochastic optimal controller trusts such a misleading or uninformative training set distribution, while the distributionally robust controller does not. The proposed method actively takes into account potential distribution mismatches and makes control decisions robust against the distribution errors. The effect of the distributionally robust method on net power production is shown in Figure 4. The controlled storage smooths wind power fluctuations and thus reduces the ramp penalty.

When the sample size is too small, the data may provide too little information that is useful in decision making. On the other hand, as the sample size increases, old data are used for designing controllers. This addition of old data, which may be different from the future ramping behavior, may distort the training set distribution in an undesirable way. Thus, in both control cases, the performance is improved when increasing the sample size from 5 to 10 and is almost unaffected when increasing the sample size from 10 to 15.

#### 4.2. Effect of Ambiguity Set Size

A notable advantage of Wasserstein distributionally robust control is that it provides a nonasymptotic probabilistic guarantee on the out-of-sample performance, which is the control performance evaluated with unseen test samples drawn from the true distribution [19,20]. It is well known that the out-of-sample performance critically depends on the radius $\theta $ that controls the size of the ambiguity set (4). In our simulation studies, the ramp penalty computed with test samples is the measure of out-of-sample performance. We now examine how the radius affects the ramp management performance of the distributionally robust method. Figure 5 displays the effect of $\theta $ on the total ramp penalty relative to the “no storage” case, where the data of April are used for the test. As $\theta $ increases from 0.05, the relative ramp penalty initially decreases and then increases for $\theta $ greater than $0.2$. When the radius is too small, the resulting controller is not sufficiently robust to take into account the deviation of the test set distribution from the training set distribution. On the other hand, in the case of a large radius, decisions made by the distributionally robust storage controller are overly conservative. Thus, it is incapable of aggressively charging or discharging energy storage to minimize the ramp rate. According to the simulation result, the proposed controller with $\theta =0.1$–$0.2$ presents the best out-of-sample performance for wind power ramp management in the setting used for these simulations.
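The reason a larger radius makes the controller more conservative can be illustrated with a toy calculation. The sketch below is not the paper's controller or its ambiguity set (4). It assumes an order-1 Wasserstein ball on the real line with unbounded support and a hypothetical loss $|w|$, which is 1-Lipschitz. Under these assumptions the worst-case expected loss equals the empirical mean plus $\theta $, so the cost the controller guards against grows linearly with the radius. The ramp samples are made up for illustration.

```python
def empirical_mean_abs(samples):
    """Empirical expectation of the hypothetical loss |w|."""
    return sum(abs(w) for w in samples) / len(samples)


def worst_case_mean_abs(samples, theta):
    """Worst-case E|w| over an order-1 Wasserstein ball of radius theta
    (assumption: unbounded support, loss |w| with Lipschitz constant 1)."""
    if theta < 0:
        raise ValueError("theta must be nonnegative")
    return empirical_mean_abs(samples) + theta


def shifted_samples(samples, theta):
    """Move every sample a distance theta away from zero. The transport
    cost of this perturbation is at most theta, and it attains the bound."""
    return [w + theta if w >= 0 else w - theta for w in samples]


# Made-up ramp samples, for illustration only.
ramps = [-0.3, 0.1, 0.4, -0.2, 0.0]

if __name__ == "__main__":
    for theta in (0.05, 0.1, 0.2, 0.5):
        bound = worst_case_mean_abs(ramps, theta)
        attained = empirical_mean_abs(shifted_samples(ramps, theta))
        print(f"theta={theta}: worst case {bound:.3f}, attained {attained:.3f}")
```

With $\theta =0$ the worst case reduces to the empirical mean, which corresponds to trusting the training distribution as the standard stochastic optimal controller does.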

#### 4.3. Effect of Storage Size

To examine the impact of storage size, the total ramp penalties relative to the “no storage” case are computed for different sizes ${x}_{max}$ of energy storage with radius $\theta $ fixed at $0.1$, using the data of April. As shown in Figure 6, the ramp penalty decreases as the storage size increases up to 11 MWh in both the standard stochastic optimal control and distributionally robust control cases. This is because a bigger storage device provides greater operational flexibility, which can be utilized to mitigate the ramp rate of wind power generation. However, the benefit of such flexibility saturates around 11 MWh: beyond this size, the ramp penalty even increases slightly with storage size. This counterintuitive result is caused by the mismatch between the training and test sets. The controllers use the prior knowledge about wind power ramps obtained from the training set to fully utilize the flexibility provided by energy storage. However, such behaviors can be overly aggressive when the storage size is large. Thus, as the test set distribution deviates from the training set distribution, the aggressive storage operation produces undesirable ramp events.

## 5. Conclusions

We have proposed a distributionally robust storage control method for wind power ramp management using historical data that may not provide reliable information about future wind power ramp events. Our simulation studies using the BPA data indicate that the proposed storage operation strategy overcomes wind ramp distribution errors and delivers robust performance, unlike the standard stochastic optimal control method. The control performance depends on the Wasserstein ambiguity set size as well as the storage size, and there exist an optimal ambiguity set size and an optimal storage size given the trade-off between robustness and efficiency. The proposed distributionally robust storage control method can be extended in several interesting ways. First, an approximate dynamic programming algorithm can be designed to solve a large-scale distributionally robust control problem, taking into account transmission constraints in the network setting. Second, a more sophisticated model for wind power dynamics or even its forecast model can be used in conjunction with the distributionally robust method. Third, other types of services including frequency regulation and energy arbitrage can be considered by extending the proposed framework.

## Funding

This research was funded by the NSF under ECCS-1708906, the Creative-Pioneering Researchers Program through SNU, the Basic Research Lab Program through the National Research Foundation of Korea funded by the MSIT (2018R1A4A1059976), and Samsung Electronics.

## Conflicts of Interest

The author declares no conflict of interest.

## Abbreviations

The following abbreviations are used in this manuscript:

Abbreviation | Definition |
---|---|
BPA | Bonneville Power Administration |
DP | dynamic programming |
DRO | distributionally robust optimization |
LP | linear program (or linear programming) |
SOC | state of charge |

## References

- Wan, Y.H. Analysis of Wind Power Ramping Behavior in ERCOT; Technical Report NREL/TP-5500-49218; National Renewable Energy Lab: Golden, CO, USA, 2011.
- Huang, B.; Krishnan, V.; Hodge, B.M. Analyzing the impacts of variable renewable resources on California net-load ramp events. In Proceedings of the IEEE Power & Energy Society General Meeting, Portland, OR, USA, 5–10 August 2018; pp. 1–5.
- Yoshimoto, K.; Nanahara, T.; Koshimizu, G.; Uchida, Y. New control method for regulating state-of-charge of a battery in hybrid wind power/battery energy storage system. In Proceedings of the IEEE PES Power Systems Conference and Exposition, Atlanta, GA, USA, 29 October–1 November 2006; pp. 1244–1251.
- Tewari, S.; Mohan, N. Value of NAS energy storage toward integrating wind: Results from the wind to battery project. IEEE Trans. Power Syst. **2012**, 28, 532–541.
- Teleke, S.; Baran, M.E.; Bhattacharya, S.; Huang, A.Q. Optimal control of battery energy storage for wind farm dispatching. IEEE Trans. Energy Convers. **2010**, 25, 787–794.
- Xiang, L.; Ng, D.W.K.; Lee, W.; Schober, R. Optimal storage-aided wind generation integration considering ramping requirements. In Proceedings of the IEEE International Conference on Smart Grid Communications, Vancouver, BC, Canada, 21–24 October 2013; pp. 648–653.
- Lee, D.; Kim, J.; Baldick, R. Stochastic optimal control of the storage system to limit ramp rates of wind power output. IEEE Trans. Smart Grid **2013**, 4, 2256–2265.
- Gong, Y.; Jiang, Q.; Baldick, R. Ramp event forecast based wind power ramp control with energy storage system. IEEE Trans. Power Syst. **2016**, 31, 1831–1844.
- Kim, J.H.; Powell, W.B. Optimal energy commitments with storage and intermittent supply. Oper. Res. **2011**, 59, 1347–1360.
- Su, H.I.; El Gamal, A. Modeling and analysis of the role of energy storage for renewable integration: Power balancing. IEEE Trans. Power Syst. **2013**, 28, 4109–4117.
- Harsha, P.; Dahleh, M. Optimal management and sizing of energy storage under dynamic pricing for the efficient integration of renewable energy. IEEE Trans. Power Syst. **2015**, 30, 1164–1181.
- Salas, D.F.; Powell, W.B. Benchmarking a scalable approximate dynamic programming algorithm for stochastic control of grid-level energy storage. INFORMS J. Comput. **2017**, 30, 106–123.
- Petersen, I.R.; James, M.R.; Dupuis, P. Minimax optimal control of stochastic uncertain systems with relative entropy constraints. IEEE Trans. Autom. Control **2000**, 45, 398–412.
- Xu, H.; Mannor, S. Distributionally robust Markov decision processes. Math. Oper. Res. **2012**, 37, 288–300.
- Van Parys, B.P.G.; Kuhn, D.; Goulart, P.J.; Morari, M. Distributionally robust control of constrained stochastic systems. IEEE Trans. Autom. Control **2016**, 61, 430–442.
- Tzortzis, I.; Charalambous, C.D.; Charalambous, T. Infinite horizon average cost dynamic programming subject to total variation distance ambiguity. SIAM J. Control Optim. **2019**, 57, 2843–2872.
- Yang, I. A convex optimization approach to distributionally robust Markov decision processes with Wasserstein distance. IEEE Control Syst. Lett. **2017**, 1, 164–169.
- Yang, I. A dynamic game approach to distributionally robust safety specifications for stochastic systems. Automatica **2018**, 94, 94–101.
- Yang, I. Wasserstein distributionally robust stochastic control: A data-driven approach. arXiv **2018**, arXiv:1812.09808.
- Mohajerin Esfahani, P.; Kuhn, D. Data-driven distributionally robust optimization using the Wasserstein metric: Performance guarantees and tractable reformulations. Math. Program. **2018**, 171, 115–166.
- Zhao, C.; Guan, Y. Data-driven risk-averse stochastic optimization with Wasserstein metric. Oper. Res. Lett. **2018**, 46, 262–267.
- Gao, R.; Kleywegt, A.J. Distributionally robust stochastic optimization with Wasserstein distance. arXiv **2016**, arXiv:1604.02199.
- Huang, L.; Walrand, J.; Ramchandran, K. Optimal demand response with energy storage management. In Proceedings of the IEEE International Conference on Smart Grid Communications, Tainan, Taiwan, 5–8 November 2012; pp. 61–66.
- Qin, J.; Chow, Y.; Yang, J.; Rajagopal, R. Online modified greedy algorithm for storage control under uncertainty. IEEE Trans. Power Syst. **2016**, 31, 1729–1743.
- Samuelson, S.; Yang, I. Data-driven distributionally robust control of energy storage to manage wind power fluctuations. In Proceedings of the 1st IEEE Conference on Control Technology and Applications, Mauna Lani, HI, USA, 27–30 August 2017; pp. 199–204.
- Hernández-Lerma, O.; Lasserre, J.B. Discrete-Time Markov Control Processes: Basic Optimality Criteria; Springer: Berlin/Heidelberg, Germany, 2012.
- Reemtsen, R. Discretization methods for the solution of semi-infinite programming problems. J. Optim. Theory Appl. **1991**, 71, 85–103.
- Hettich, R.; Kortanek, K.O. Semi-infinite programming: Theory, methods, and applications. SIAM Rev. **1993**, 35, 380–429.
- López, M.; Still, G. Semi-infinite programming. Eur. J. Oper. Res. **2007**, 180, 491–518.
- Yang, I. A convex optimization approach to dynamic programming in continuous state and action spaces. arXiv **2018**, arXiv:1810.03847.
- Bertsimas, D.; Tsitsiklis, J.N. Introduction to Linear Optimization; Athena Scientific: Belmont, MA, USA, 1997.
- Dantzig, G.B. Linear Programming and Extensions; Princeton University Press: Princeton, NJ, USA, 1998.
- Vanderbei, R.J. Linear Programming; Springer: Berlin/Heidelberg, Germany, 2015.
- Kuhn, D.; Esfahani, P.M.; Nguyen, V.A.; Shafieezadeh-Abadeh, S. Wasserstein distributionally robust optimization: Theory and applications in machine learning. In Operations Research & Management Science in the Age of Analytics; INFORMS TutORials in Operations Research; INFORMS: Catonsville, MD, USA, 2019; pp. 130–166.
- BPA. Balancing Authority Load and Total Wind Generation. 2019. Available online: https://transmission.bpa.gov/business/operations/wind/ (accessed on 1 November 2019).

**Figure 2.** The empirical distribution of wind power ramp rate at 12PM in the Bonneville Power Administration (BPA) control area obtained by the data for (**a**) 1–15 April 2018 and (**b**) 16–30 April 2018.

**Figure 3.** Performance comparison between the standard stochastic optimal controller and the proposed distributionally robust (DR) controller for the sample size (**a**) $N=5$, (**b**) $N=10$, and (**c**) $N=15$. Here, the cost is defined as the ratio of the total ramp penalty with energy storage to that without storage.

**Figure 4.** Net power production (on 23 April 2018) with and without energy storage controlled by the distributionally robust method. Here, the net power is wind power generation minus power drawn by energy storage. (**b**) is the close-up of (**a**) for 12–6PM.

**Figure 5.** Total ramp penalty relative to the “no storage” case, depending on the radius $\theta $ of the Wasserstein ambiguity set. Note that the x-axis is log scale.

**Figure 6.** Total ramp penalty relative to the “no storage” case, depending on the size ${x}_{max}$ of energy storage.

Sample Size | 5 | 10 | 15 | Avg |
---|---|---|---|---|
Stochastic optimal control | 0.9939 | 0.9902 | 0.9970 | 0.9937 |
Distributionally robust control | 0.9612 | 0.9399 | 0.9363 | 0.9458 |
Cost saving | 3.29% | 5.08% | 6.09% | 4.82% |
© 2019 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).