A stochastic optimization method for energy storage sizing based on an expected value model

Abstract: Energy storage technologies have been rapidly evolving in recent years. Energy storage plays different roles in various scenarios. Electricity consumers are concerned with how to use the energy storage system (ESS) to reduce their electricity costs or increase their profits. In this paper, a stochastic optimization method for energy storage sizing based on an expected value model is proposed for consumers with photovoltaic (PV) generation. Firstly, the Gaussian mixture model clustering method is used to cluster the historical load and PV data and calculate the probability of each cluster. Secondly, the optimization model of total system profit is established. Finally, according to the expected value model, the optimal ESS power and capacity are determined. Two case studies are used to demonstrate the calculation of the optimal ESS capacity. The results obtained by the proposed method are compared with the results produced by the deterministic method. Through this analysis and comparison, the validity and superiority of the proposed method are verified. The profits obtained by the proposed method are 0.87% to 127.16% higher than those of the deterministic method.


Introduction
An energy storage system (ESS) can be adjusted flexibly at different time scales and can be regarded as a variable power source or a variable load [1,2]. Based on its response time, power density, and other characteristics, an ESS can play different roles to help improve various aspects of power quality, increase dispatchable PV power generation, and bring economic benefits to consumers [3][4][5]. However, ESS is currently relatively expensive, and it should not be installed without considering the profits it will accrue. Therefore, how to obtain the maximum profits with a minimum investment cost is one of the major problems when applying energy storage in practice.
The research focus of this paper is how to estimate the optimal sizing of ESS on the consumer side where a PV plant is also installed, in order to maximize the consumer's benefits. It is well-known that the optimal sizing of the ESS is closely related to its operational flexibility. For electricity consumers who are incentivized by electricity price signals, it is useful to investigate how we can use the energy storage system to change demand behaviors in order to maximize economic benefits, as well as to reduce the stress of a grid during the high peak demand period [6,7]. To achieve this, the optimal size of energy storage, the optimal operation and co-ordination of energy storage, local demand, and PV plants need to be fully investigated [8].
Optimal allocation and sizing of energy storage systems has been investigated thoroughly in the literature [9][10][11][12]. Reference [9] proposed a novel control scheme and found the boundary values of compensator gains. This method achieves tight DC-link voltage regulation with effective power management at the DC link. Reference [11] proposed an ESS sizing method which leads to a considerable reduction of the ESS. In [12], a distributed control strategy for state-of-charge balancing is proposed. It provides advantages in terms of reduced communication requirements and increased modularity. In [13], an energy storage system optimization configuration method is proposed to improve the reliability of consumer electricity supply when consumers lose their power supply from the distribution network. The application scenario includes consumers, PV, and energy storage. It does not take uncertainty into consideration, which has an important effect on the optimal result. In [14], based on the time-of-use (TOU) price on the consumer side, an optimal allocation of energy storage using advanced multi-pass dynamic programming and expert knowledge base rules is proposed. The purpose is to provide consumers with the greatest economic benefits, and two instances are analyzed via simulation. Reference [15] proposed a method of optimal allocation of the energy storage system in which the optimization problem was divided into two time scales. For the long-term sizing of the ESS, it does not include uncertainty or TOU pricing, but the optimization model is instructive. In the above literature, the energy storage system is optimally configured by a deterministic method. This approach neglects the uncertainty of load and PV, which is an essential element of stochastic optimization. Although these studies built some models, the results obtained by deterministic methods may not be the most economical, due to the uncertainty of load and PV.
In [16], an optimization algorithm based on a self-adapted evolutionary strategy and the Fischer-Burmeister algorithm was proposed to reduce the one-time investment and annual running costs. The investment costs of the energy storage system were calculated in different scenarios. Reference [17] considers the uncertainty of wind power and proposes a linear optimization model of energy storage to reduce the operating costs of the micro-grid and the investment costs of the energy storage system. However, PV power has different characteristics from wind power. In the distribution network, most of the distributed generation is PV rather than wind power. For the terminal electricity consumer, it is necessary to study the situation including PV, load, and TOU. In [18], a model predictive control method is proposed to optimize the ESS. Wind power is uncertain and its prediction has some errors; Reference [18] takes these features into account and directly solves the problem using stochastic optimization. Most studies of stochastic optimization for the sizing of ESS focus on wind generation uncertainty. With the increase of distributed PV and the development of the electricity market, a stochastic optimization method considering the uncertainties of PV and load will generate a more reasonable configuration result for the consumer.
The purpose of this paper is to propose a stochastic optimization method based on an expected value model for electricity consumers with a PV plant. Here, we assume that the ESS is not allowed to sell electricity to the grid. The ESS is used to provide consumers with some economic benefits. The main contributions of this paper are as follows: (1) A stochastic optimization method for ESS sizing based on the expected value model is proposed, which takes into account the uncertainty of load and PV in a year, and makes the optimization result more reasonable under different application scenarios; (2) The Gaussian mixture model is used to cluster the historical data of load and PV, which lays the foundation for establishing the expected value model; (3) The profit model for load, PV, and energy storage is established. This model takes into consideration the consumer's electricity costs, the profits of PV, the time-of-use price, energy storage investment and maintenance costs, and so on; (4) The expected value model taking power and capacity as variables of optimal sizing is established on the basis of (1)-(3), and a more reasonable optimization result than that of the deterministic method is obtained; (5) The profits of a sodium sulfur (NAS) battery, vanadium redox battery (VRB), polysulfide bromine battery (PSB), valve-regulated lead-acid (VRLA) battery, and lithium-ion (Li-ion) battery are compared. The profits of the expected value model and the deterministic scenario method are also compared.
The other parts of this paper are organized as follows: Section 2 formulates the proposed optimal model based on the expected value model; Section 3 presents the application of the proposed procedure to two cases; and conclusions are presented in Section 4.

Problem Formulations
Under expected constraints, the mathematical programming that maximizes the expected value of the objective function is called the expected value model. The expected value model is one of the common forms in mathematical programming, such as minimizing the expected costs, maximizing the expected profits, and so on [19]. The basic expression of the expected value model is

$$\max\; E[f(X, \xi)] \quad \text{s.t.} \quad E[g_i(X, \xi)] \le 0, \quad i = 1, 2, \ldots, m \qquad (1)$$

where $X$ is an n-dimensional decision vector, $\xi$ is a t-dimensional random vector, $f$ is the objective function, $g_i$ is a random constraint function, $E(\cdot)$ is the expected value, and $m$ is the number of constraint functions.
In a system that contains both consumers and PV, the ESS is installed. In order to minimize the consumer's costs over the whole life of the ESS, we propose a method to size the energy storage system using the expected value model. As shown in Figure 1, which is the overall architecture of the expected value model, the basic steps are as follows: (1) The clustering method of the Gaussian mixture model is used to divide the consumer load curve and PV generation curve into different scenarios. Moreover, the probabilities of different scenarios are calculated; (2) The total profit model involving the consumer, the PV, and the energy storage system is established to calculate the costs in the entire life cycle of energy storage in different scenarios of step (1). The profit model includes the initial investment costs, operation and maintenance costs, energy storage profits, PV, and load costs; (3) For different storage system configuration values, the expected values of consumer total profits are calculated. When the expected value is the maximum, the sizing of the energy storage system is the optimal result.


Clustering the Load Curve and the PV Curve
When using the expected value model to size the energy storage, the different random variables and their probabilities need to be determined. In this paper, different clusters are obtained by analyzing one year of daily historical data of consumer load and PV. The Gaussian mixture model (GMM) [20], sometimes called soft clustering, gives the probability that each sample point belongs to each cluster, instead of judging that it belongs to exactly one cluster. This method can divide the load and PV curves into different clusters and calculate the probability of each cluster, which is the essential parameter of the expected value model. Firstly, we assume that the data are generated by the GMM. Then, we only need to infer the probability distribution of the GMM from the data, and the K components of the GMM correspond to K clusters.
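To make the idea of soft clustering concrete, the following is a minimal sketch of expectation-maximization for a one-dimensional Gaussian mixture. It is an illustration under stated assumptions, not the implementation used in this paper: the paper clusters full daily curves, whereas this sketch clusters a single hypothetical feature per day, and the function names, the sample values, and the initial means are all made up for the example.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D normal distribution."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(data, init_means, n_iter=50):
    """Fit a 1-D Gaussian mixture by EM, starting from the given means."""
    k = len(init_means)
    means = list(init_means)
    variances = [1.0] * k
    weights = [1.0 / k] * k
    resp = []
    for _ in range(n_iter):
        # E-step: soft membership of every sample in every component.
        resp = []
        for x in data:
            dens = [w * gaussian_pdf(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)  # guard against a collapsing component
    return weights, means, variances, resp


# Hypothetical daily feature (for example, a normalized daily peak), two regimes.
samples = [1.0, 1.2, 0.9, 1.1, 5.0, 5.2, 4.8, 5.1, 4.9]
weights, means, variances, resp = fit_gmm_1d(samples, init_means=[0.0, 6.0])
print("cluster probabilities:", [round(w, 4) for w in weights])
```

The mixing weights returned by the fit play the role of the cluster probabilities, and each row of `resp` holds the membership probabilities of one sample.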
An important parameter for clustering using the Gaussian mixture model is the number of mixture components; that is, the number of clusters. In this paper, we use the Calinski-Harabasz index [21] to evaluate the number of clusters. The within-cluster variance is the sum of the squared distances between each point in a cluster and the center of that cluster. The between-cluster variance is the weighted sum of the squared distances between the center of each cluster and the center of the data set. The index is

$$I_k = \frac{S_B}{S_W} \times \frac{N - k}{k - 1} \qquad (2)$$

where $S_B$ is the overall between-cluster variance, $S_W$ is the overall within-cluster variance, $k$ is the number of clusters, and $N$ is the number of observations. The two variances are

$$S_B = \sum_{i=1}^{k} n_i \lVert m_i - m \rVert^2 \qquad (3)$$

$$S_W = \sum_{i=1}^{k} \sum_{x \in c_i} \lVert x - m_i \rVert^2 \qquad (4)$$

where $m_i$ is the centroid of cluster $i$, $m$ is the overall mean of the sample data, $n_i$ is the number of points in cluster $i$, $c_i$ is the $i$-th cluster, and $x$ is a data point. Well-defined clusters have a large $S_B$ and a small $S_W$. The larger the Calinski-Harabasz index $I_k$, the better the data partition. To determine the optimal number of clusters, we maximize $I_k$ with respect to $k$; the optimal number of clusters is the one with the highest Calinski-Harabasz index value.
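The index can be computed directly from its definition. The sketch below is a plain-Python illustration, assuming points are given as lists of floats and cluster labels as integers; the function name and the toy data are hypothetical and only serve to show that a well-separated partition scores higher than an arbitrary one.

```python
def calinski_harabasz(points, labels):
    """Calinski-Harabasz index for a labeled set of points."""
    n = len(points)
    dim = len(points[0])
    clusters = sorted(set(labels))
    k = len(clusters)
    overall = [sum(p[d] for p in points) / n for d in range(dim)]
    s_b = 0.0  # between-cluster variance
    s_w = 0.0  # within-cluster variance
    for c in clusters:
        members = [p for p, lab in zip(points, labels) if lab == c]
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        s_b += len(members) * sum((centroid[d] - overall[d]) ** 2 for d in range(dim))
        s_w += sum(sum((p[d] - centroid[d]) ** 2 for d in range(dim)) for p in members)
    return (s_b / (k - 1)) / (s_w / (n - k))


# Toy data: two compact groups far apart.
toy_points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
good_index = calinski_harabasz(toy_points, [0, 0, 0, 1, 1, 1])
bad_index = calinski_harabasz(toy_points, [0, 1, 0, 1, 0, 1])
print(good_index, bad_index)
```

In practice the index would be evaluated for each candidate number of clusters, and the number with the largest value would be kept.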
According to the above index, the number of clusters of the daily load and PV curves in a year can be determined. The load and PV are respectively divided into $n$ and $m$ clusters, and the corresponding probabilities are $p_n$ and $p_m$. In this paper, we assume that the load and PV are independent, so there are $n \times m$ scenarios with the probabilities $p_n \times p_m$.
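Under the independence assumption, the scenario probabilities are simply products of the cluster probabilities. The short sketch below illustrates this with the cluster probabilities reported for Case 1 later in the paper (four load clusters at 25% each and three PV clusters); the dictionary layout and the ordering of the scenarios are assumptions made for the example.

```python
from itertools import product

load_probs = [0.25, 0.25, 0.25, 0.25]   # four seasonal load clusters (Case 1)
pv_probs = [0.2793, 0.4525, 0.2682]     # three PV clusters (Case 1)

# One scenario per (load cluster, PV cluster) pair; independence lets the
# joint probability be the product of the two cluster probabilities.
scenarios = [
    {"load_cluster": i + 1, "pv_cluster": j + 1, "prob": p_load * p_pv}
    for (i, p_load), (j, p_pv) in product(enumerate(load_probs), enumerate(pv_probs))
]

total_prob = sum(s["prob"] for s in scenarios)
print(len(scenarios), round(total_prob, 6))
```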

The Optimization Model of Total Profits
In a system without energy storage, the consumer purchases electricity from the grid and uses the PV power, taking the price of electricity into account. The energy storage system can charge and discharge power. When the grid load is low, the electricity price is low, and the energy storage system can be charged during this period; during peak load, the energy storage system can provide power to the load, thereby reducing the costs of the consumer. At present, a consumer-side ESS is not allowed to sell power to the grid in China. Therefore, we assume that the energy storage system considered in this paper is not allowed to sell power to the grid and that the energy stored in the ESS is only used by the consumer. However, the PV power can be sold to the grid. An energy storage system requires an initial capital cost when installed, so it is reasonable to calculate the total profits for a consumer over the entire life of the energy storage. Therefore, the optimization model in this paper considers the investment of the energy storage system, the maintenance costs, the purchase costs of the consumer, and the costs saved by the energy storage. In reality, there are different kinds of time-of-use electricity prices. In this paper, we use a typical price scheme to explain the advantage of the stochastic optimization method. In the deterministic method for sizing the ESS, this total profit model is regarded as the optimal sizing model. In the proposed method, we calculate the total profits in different scenarios, which are the inputs of the expected value model.
The total costs without the ESS are the costs of purchasing power from the grid minus the profits of the PV power sold to the grid. When the ESS is installed, the initial investment costs of the energy storage system comprise power costs and capacity costs [22], that is, $C_P P_{ES} + C_E E_{ES}$.
The maintenance costs of the energy storage system follow [22]. After installing the energy storage system, the power bought by the consumer from the grid at time $t$ is non-negative and is determined by the load power, the ESS power, and the PV power used by the consumer. The operating costs of a combined system of load, energy storage, and PV are the costs of purchasing power from the grid minus the profits of the PV power sold to the grid. Here, $P_{ES}$ and $E_{ES}$ represent the rated power and energy of the ESS, respectively; $C_P$, $C_E$, and $C_M$ represent the prices per unit of power, energy, and maintenance, respectively; $P_{PV1}^{t}$ represents the PV power used by the consumer at time $t$; $P_{E}^{t}$ represents the power of the energy storage system, which is positive during charging and negative during discharging; $P_{L}^{t}$ is the load power; $\alpha$ and $\beta$ are the inflation rate and discount rate, respectively; and $T_1$ is the cycle life of the ESS in years.
The costs of energy storage and the costs of operation are considered over the full life cycle of the energy storage system. During this cycle, the impact of factors such as the inflation rate on the total costs needs to be taken into account. Therefore, in Equations (6), (8), and (10), the costs are multiplied by the coefficient $(1 + \alpha)/(1 + \beta)$. Equation (5) is the difference between the costs without the ESS and the costs with the ESS. If it is positive, the ESS brings some profit to the consumer and the consumer should install the ESS. If it is negative, installing the ESS makes the consumer lose money. Therefore, we can determine whether the consumer should install the ESS.
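The role of the coefficient and of the sign of the total profit can be sketched as follows. This is an illustration only: it assumes that the coefficient $(1 + \alpha)/(1 + \beta)$ is compounded once per year over the life of the ESS, which the text does not state explicitly, and the function names and the numbers in the example are hypothetical.

```python
def lifetime_cost(annual_cost, inflation, discount, years):
    """Sum a yearly cost over the ESS life, scaled by ((1+a)/(1+b))**year.

    Assumption: the coefficient is applied once per year (compounded).
    """
    ratio = (1.0 + inflation) / (1.0 + discount)
    return sum(annual_cost * ratio ** year for year in range(1, years + 1))


def total_profit(cost_without_ess, cost_with_ess):
    """Profit of installing the ESS: cost without it minus cost with it."""
    return cost_without_ess - cost_with_ess


def should_install(cost_without_ess, cost_with_ess):
    """The ESS is worth installing only if the total profit is positive."""
    return total_profit(cost_without_ess, cost_with_ess) > 0.0


# Hypothetical numbers: when inflation equals the discount rate the
# coefficient is 1, so ten years of a 100-unit cost sum to 1000 units.
flat = lifetime_cost(100.0, inflation=0.03, discount=0.03, years=10)
discounted = lifetime_cost(100.0, inflation=0.02, discount=0.05, years=10)
print(flat, round(discounted, 2), should_install(1000.0, 900.0))
```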
During operation, there are several constraints. The PV power sold to the grid and the PV power used by the consumer should satisfy the inequality in Equation (12). As mentioned above, the ESS is not allowed to sell power to the grid. Therefore, when the ESS discharges ($P_{E}^{t}$ is negative), the inequality in Equation (13) should be satisfied.
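The charge and discharge model of the ESS updates the state of charge from the charge or discharge power, the corresponding efficiency, and the time step. In the operation of the joint system, the energy storage system needs to meet the following constraints: the power must not exceed the maximum power of the ESS, $|P_{ES}^{t}| \le P_{ES,max}$, and the state of charge must stay within its limits, $S_{ES,min} \le S_{ES}^{t} \le S_{ES,max}$. Here, $P_{ES}^{t}$ is the energy storage output power at time $t$; $P_{PV2}^{t}$ is the PV power sold to the grid; $S_{ES}^{t}$ is the state of charge (SoC) of the energy storage at time $t$; $\eta_{ch}$ and $\eta_{disch}$ are the charge and discharge efficiency, respectively; $\Delta t$ is the duration of charge or discharge; $P_{ES,max}$ is the maximum power of the ESS; and $S_{ES,min}$ and $S_{ES,max}$ are the minimum and maximum SoC, respectively.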
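A common way to write such a charge and discharge model is sketched below. It is a hedged illustration, not the exact model of this paper: it assumes the sign convention used above (positive power charges the ESS), a state of charge expressed as a fraction of the rated capacity, and a no-export rule in which the discharge power may not exceed the load. The function names and the 0.9 efficiencies are hypothetical.

```python
def step_soc(soc, power_kw, dt_h, capacity_kwh, eta_ch, eta_disch):
    """Return the state of charge after one time step (power > 0 charges)."""
    if power_kw >= 0:
        # Charging: only a fraction eta_ch of the energy is stored.
        delta = eta_ch * power_kw * dt_h / capacity_kwh
    else:
        # Discharging: more energy leaves the store than reaches the load.
        delta = power_kw * dt_h / (eta_disch * capacity_kwh)
    return soc + delta


def is_feasible(soc, power_kw, load_kw, p_max_kw, soc_min, soc_max):
    """Check the power limit, the SoC window, and the assumed no-export rule."""
    within_power = abs(power_kw) <= p_max_kw
    within_soc = soc_min <= soc <= soc_max
    no_export = -power_kw <= load_kw  # discharge cannot exceed the load
    return within_power and within_soc and no_export


# One hour of charging and of discharging a 400 kW / 3200 kWh system.
charged = step_soc(0.5, 400.0, 1.0, 3200.0, eta_ch=0.9, eta_disch=0.9)
discharged = step_soc(0.5, -360.0, 1.0, 3200.0, eta_ch=0.9, eta_disch=0.9)
print(round(charged, 4), round(discharged, 4))
```

The 10% and 90% SoC limits used in the case studies would be passed as `soc_min` and `soc_max`.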
Given the power and capacity of the energy storage system, the above optimization model can be used to calculate the maximum profits brought by the ESS in every scenario over the cycle life of the energy storage. We use the Genetic Algorithm (GA) [23] to solve the objective function. Because it is not the focus of this paper, we do not describe its detailed steps. During the solution process, we set enough GA generations to ensure good accuracy.

Expected Value Model
In the first step, we obtain the clustering results and the probabilities of the load and PV curves in a year. Therefore, the clusters of the load and PV curves and their probabilities are known quantities. Regarding the power and capacity of the ESS as the variables, we can obtain the maximum total profits in every scenario, where each scenario is a combination of clusters with a known probability. The expected value is

$$E(P_{ES}, E_{ES}) = \sum_{n} \xi_n F_n(P_{ES}, E_{ES}) \qquad (16)$$

where $\xi_n$ and $F_n$ are the probability and the maximum total profits of the $n$-th scenario, respectively. The maximum of all the expected values can then be found, and the corresponding power and capacity are the best energy storage sizing.
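The search over candidate sizings can be sketched as a simple enumeration. The code below is a minimal illustration, not the procedure of this paper: in the paper, $F_n$ comes from solving the operation model with the GA, whereas here a hypothetical toy profit function stands in for it, and all names and numbers are assumptions made for the example.

```python
def expected_profit(probs, profits):
    """Probability-weighted sum of the scenario profits."""
    return sum(p * f for p, f in zip(probs, profits))


def best_sizing(candidates, probs, scenario_profit):
    """Return (expected profit, sizing) with the largest expected profit."""
    best = None
    for sizing in candidates:
        profits = [scenario_profit(sizing, s) for s in range(len(probs))]
        value = expected_profit(probs, profits)
        if best is None or value > best[0]:
            best = (value, sizing)
    return best


# Hypothetical stand-in for F_n: stored energy earns 2 units per kWh up to a
# scenario-dependent usable amount, and every installed kWh costs 1 unit.
usable_kwh = [2000.0, 3200.0, 4000.0]
scenario_probs = [0.3, 0.4, 0.3]


def toy_profit(sizing, scenario):
    _power_kw, capacity_kwh = sizing
    return 2.0 * min(capacity_kwh, usable_kwh[scenario]) - 1.0 * capacity_kwh


sizings = [(400.0, c) for c in (1600.0, 2400.0, 3200.0, 4000.0, 4800.0)]
best_value, best_choice = best_sizing(sizings, scenario_probs, toy_profit)
print(best_choice, round(best_value, 2))
```

Even in this toy setting the expected profit first rises and then falls with capacity, because capacity beyond what a scenario can use only adds cost.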

Case Studies
According to the optimization method of the expected value model described in Section 2, the Li-ion battery is taken as an example to carry out the simulation analysis. The parameters of the batteries [22] are shown in Table 1. $S_{ES,min}$ is 10% and $S_{ES,max}$ is 90%. $P_{ES,max}$ is the rated power of the ESS, $w$ is the energy efficiency, and $T$ is the battery life. As described in Section 2, the TOU price and the price of the PV power sold to the grid are considered in the model. For some consumers, the TOU price of Jiangsu Province is as shown in Table 2. In this paper, it is assumed that the daily TOU prices are fixed. The price of PV power sold by the consumer is 1 CNY/kWh, where CNY is Chinese Yuan.

Case 1
As shown in Figure 2, the consumer daily load data is from the literature [24]. The data is divided into four clusters corresponding to four seasons. It is assumed that the probability of each cluster is 25% and the maximum load is 1000 kW. The vertical axis is the load power. The horizontal axis is the time.
The PV data are the one-year PV power generation of a 300 kW PV power plant in a factory in Jiangsu Province. According to the clustering method in Section 2, the Calinski-Harabasz index of the GMM and of k-means can be seen in Figure 3. The vertical axis is the index and the horizontal axis is the number of clusters. The index reaches its maximum at the red point, where the clustering effect is the best. Here, the highest index of the GMM is higher than that of k-means.
The three clusters of PV curves obtained by Gaussian mixture model clustering are shown in Figure 4. The first, second, and third clusters have probabilities of 27.93%, 45.25%, and 26.82%, respectively, in one year. The results of the clustering serve as the basis for the optimization of the expected value model. Therefore, the results of the clustering, including the curves and probabilities, have an impact on the final calculation.
It is assumed that the PV and load curves are independent of each other in the following scenarios. The power rating of the energy storage system considered is 400 kW. As shown in Table 3, S1-S12 represent 12 scenarios. The first column is the capacity of the energy storage system in kWh. The last column is the expected value of profits. The table gives the profits generated by the ESS during the whole life cycle under different capacity values. When the capacity is 3200 kWh, the expected value of profits is the largest. Therefore, the best Li-ion capacity is 3200 kWh when the power rating of the ESS is 400 kW.
However, both the power and the capacity are variables in Equation (16). Therefore, we need to consider the impact of changes in the power as well as in the capacity. Figure 5 shows how the choice of power rating impacts the profits as the capacity of the ESS is changed. The profit curves are illustrated at 200 kW intervals. The vertical axis is the profit accrued by the ESS and the horizontal axis is the capacity of the ESS installed. Each curve first increases as the capacity increases and then decreases. Figure 5 shows that when the storage capacity exceeds an optimal size, the consumer cannot use the additional capacity, and it is essentially wasted. The highest point of each curve represents the highest profits for the corresponding power rating. When the power is 400 kW, the highest point is at 3200 kWh, which corresponds to the best expected value in Table 3. In fact, we calculated more results at 50 kW intervals, which show that the highest profits are obtained when the power and the capacity are 550 kW and 4400 kWh, respectively.
Figure 6 shows the operating curves of the load, PV, and ESS in scenario 1, where the load curve is cluster 1 in Figure 2 and the PV curve is cluster 1 in Figure 4. The Ppv1 curve is the PV power used by the consumer. The Ppv2 curve is the PV power sold to the grid. To obtain the maximum operating profits, from 1 to 8 o'clock, the ESS is charged and all PV power should be sold to the grid. From 8 to 12 and from 18 to 21 o'clock, the ESS is discharged, and at the same time, some proportion of the PV power is used by the consumer and the rest of it is sold to the grid. The ESS changes the utilization of the PV power and brings more profits to the consumer. It is worth noting that all the PV power should be used by the consumer under the scenario when there is no energy stored in the ESS between 9 and 13 o'clock.
The optimal results of different batteries are shown in Table 4. The unit of the data in the table is $10^5$ CNY. The results of the fourth column are calculated by the method presented in this paper. The profits of PSB are the highest. This means that, in this case, PSB is the best choice for the consumer. The results of the seventh column are calculated by a deterministic scenario, which does not consider multiple scenarios and their probabilities, but is just based on the price signal. It considers the average load and the average PV generation of one year. We can see that the profits are lower than the fourth column. However, this single deterministic scenario considering only one type of load or PV curve in one year may not be reliable. As shown in the last column of Table 4, when we use 12 scenarios to test the power and the capacity obtained in the fifth and sixth columns, the profits are lower than the fourth column. The profits in the fourth column are 12.71% to 68.58% more than the profits in the seventh column. Therefore, the deterministic single-scenario approach has its shortcomings and cannot guarantee more profits for consumers in real multiple scenarios.

Case 2
Case 2 uses one-year historical data of a factory in Jiangsu Province. The Calinski-Harabasz index shows that the optimal number of clusters for this factory is 2, as shown in Figure 7. The GMM method is better than the k-means method in this case; as noted above, a larger index indicates better clustering. The load curves of the two clusters obtained by Gaussian mixture model clustering are shown in Figure 8, where the probabilities of clusters 1 and 2 are 49.18% and 50.82%, respectively.
Assuming that the PV and load curves are independent of each other, there are six scenarios in total in this case, and simulation results are available for each scenario. When the power of the energy storage system is 400 kW, the simulation results in Table 5 are obtained. The table shows the consumer profits for different capacities and scenarios. The unit of the data in the table is $10^5$ CNY. The highest expected profits are achieved when the capacity is 3200 kWh.
As shown in Figure 9, we draw the profit curves at 200 kW intervals. Each curve initially increases to a maximum and then decreases, a trend similar to that in Figure 5. The highest point of the 400 kW curve corresponds to the optimum expected value in Table 5. It is observed that some power ratings exhibit a sharper peak in Figure 5. This is due to limitations in the discharge power: when the capacity of the ESS increases, the profits no longer increase along the original trend.
In this case, we also calculate the results for different batteries, as shown in Table 6. The unit of the data in the table is $10^5$ CNY. The PSB is again the best choice. The results of the seventh column are obtained from the scenario where the load and the PV curve are the average values of one year. As the last column of Table 6 shows, when we use six scenarios to test the power and the capacity, the profits are lower than the results of the expected value model. The profits in the fourth column are 0.87% to 127.16% more than the profits in the seventh column.
Figure 10 shows the operational curves when the power and capacity of the Li-ion battery are 450 kW and 3600 kWh, respectively. At 12 o'clock, the load power is small enough to limit the discharge power of the ESS, which accords with Equation (13). In each table of each case, the best choice of battery is revealed. The results of the deterministic scenario based on the average curve are better than the results of the expected value model. However, when we use different scenarios to test the optimal results of the deterministic scenario method, the profits are reduced. The method proposed in this paper takes more factors into account than the deterministic approach. These factors include the stochastic features of the consumer's load power, due to the consumer's electricity consumption habits, and of the PV power, due to changes of the weather or season. Therefore, the results are more credible.
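The gap between the profit computed on the average curve and the profit realized across scenarios can be reproduced with a toy example. The sketch below is purely illustrative and does not use any data from this paper: the profit function, the scenario probabilities, and the usable energy per scenario are hypothetical, chosen only to show the mechanism.

```python
def toy_profit(capacity_kwh, usable_kwh, revenue=3.0, cost=1.0):
    """Hypothetical profit: revenue on usable energy minus capacity cost."""
    return revenue * min(capacity_kwh, usable_kwh) - cost * capacity_kwh


probs = [0.5, 0.1, 0.4]
usable = [1600.0, 3200.0, 4800.0]   # hypothetical usable energy per scenario
candidates = [1600.0, 2400.0, 3200.0, 4000.0, 4800.0]


def expected(capacity_kwh):
    """Expected profit of one capacity over all scenarios."""
    return sum(p * toy_profit(capacity_kwh, u) for p, u in zip(probs, usable))


# Deterministic method: size the ESS on the single average scenario.
mean_usable = sum(p * u for p, u in zip(probs, usable))
det_capacity = max(candidates, key=lambda c: toy_profit(c, mean_usable))
det_claimed = toy_profit(det_capacity, mean_usable)   # profit on the average curve
det_realized = expected(det_capacity)                 # same sizing, all scenarios

# Expected value method: size the ESS on the probability-weighted profit.
sto_capacity = max(candidates, key=expected)
sto_expected = expected(sto_capacity)

print(det_capacity, round(det_claimed), round(det_realized))
print(sto_capacity, round(sto_expected))
```

In this toy setting the deterministic sizing reports the highest profit on the average curve, but once it is tested across the scenarios its profit falls below that of the sizing chosen by the expected value.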

Conclusions
In this paper, an energy storage sizing method based on the expected value model is proposed and applied to the consumer side with a PV system. Through the analysis and simulation of two cases, the following conclusions are drawn: (1) The optimal number of clusters and the resulting clusters of the historical load and PV data of a factory in Jiangsu Province are obtained by GMM. The clustering results form the basis of the expected value model, and different clustering results affect the final result; (2) The profit model of the consumer over the entire storage life-cycle is established. The profit model considers the load, PV, storage, and TOU. The results of the model differ markedly between scenarios; (3) The profits of different batteries are obtained with our method. In both cases, the best battery is the PSB. The efficiency of the PSB is lower than that of the other batteries, but its price is advantageous; (4) The result of the expected value model combines the results of multiple scenarios. Compared with the deterministic method, it summarizes the load and PV conditions more fully and produces more reliable results.
Future research will focus on the clustering of the PV and load data. Many clustering methods and evaluation indexes exist, and they can produce different clustering results depending on the data to be clustered. Different scenarios will in turn produce different profits. In the two cases, we use historical load and PV data to calculate the operating profits. As an improvement, the ESS could be operated according to forecast load and PV data.
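Since the choice of clustering method and evaluation index affects the scenarios, it may help to see how the Calinski-Harabasz index used in Case 2 scores a given clustering. The following is a minimal pure-Python sketch of the standard definition (between-cluster dispersion over within-cluster dispersion, each divided by its degrees of freedom). The function name and the toy data are illustrative assumptions; in practice the points would be daily load or PV curves and the labels would come from a clustering routine such as GMM or k-means.

```python
def calinski_harabasz(points, labels):
    """Calinski-Harabasz index of a labelled set of points.

    points: list of equal-length numeric vectors
    labels: cluster label of each point
    A larger value indicates better-separated, more compact clusters.
    """
    n = len(points)
    clusters = sorted(set(labels))
    k = len(clusters)
    if k < 2 or k >= n:
        raise ValueError("need at least 2 clusters and fewer clusters than points")
    dim = len(points[0])

    def mean(rows):
        return [sum(r[d] for r in rows) / len(rows) for d in range(dim)]

    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    overall = mean(points)
    between = 0.0
    within = 0.0
    for c in clusters:
        members = [p for p, lab in zip(points, labels) if lab == c]
        centroid = mean(members)
        between += len(members) * sqdist(centroid, overall)
        within += sum(sqdist(p, centroid) for p in members)

    if within == 0.0:
        return float("inf")  # every point coincides with its centroid
    return (between / (k - 1)) / (within / (n - k))


# Toy data: two tight groups far apart along the first coordinate.
toy_points = [[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]]
good = calinski_harabasz(toy_points, [0, 0, 1, 1])  # groups kept together
bad = calinski_harabasz(toy_points, [0, 1, 0, 1])   # groups split apart
print(good > bad)
```

Computing this index for each candidate number of clusters, and for each clustering method, gives the kind of comparison summarized in Figure 7.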
