Probabilistic Assessment of Hybrid Wind-PV Hosting Capacity in Distribution Systems

: In recent years, hybrid wind-photovoltaic (PV) systems are ﬂourishing due to their advantages in the utilization of renewable energy. However, the accurate assessment of the maximum integration of hybrid renewable generation is problematic because of the complex uncertainties of source and demand. To address this issue, we develop a stochastic framework for the quantiﬁcation of hybrid energy hosting capacity. In the proposed framework, historical data sets are adopted to represent the stochastic nature of production and demand. Moreover, extreme combinations of production and demand are introduced to avoid multiple load ﬂow calculations. The proposed framework is conducted in the IEEE 33-bus system to evaluate both single and hybrid energy hosting capacity. The results demonstrate that the stochastic framework can provide accurate evaluations of hosting capacity while signiﬁcantly reducing the computational burden. This study provides a comprehensive understanding of hybrid wind-PV hosting capacity and veriﬁes the excellent performance of the hybrid energy system in facilitating integration and energy utilization.


Introduction
Over the past decades, the utilization of renewable energy sources (RES) has grown dramatically due to concerns over the environment and carbon emissions [1]. Wind power and photovoltaic (PV) comprise a significant proportion of RES due to mature technologies, economic advantages, and abundant primary energy. The strategical deployment of RES in distribution systems as distributed generation (DG) can bring serval benefits [2,3], such as voltage profile improvement, power loss reduction, and reliability enhancement. However, the excessive penetration of DG can alter the normal operational behavior of distribution systems and cause various negative impacts, such as overvoltage [4,5], voltage unbalance [6], thermal overloading [7], and harmonic distortions [8,9]. The performances of systems becomes unacceptable when the amount of DG exceeds the hosting capacity [10]. For this reason, it is of paramount importance to system planners to determine the maximum amount of DG that can be connected.
Hosting capacity is defined as the amount of DG that can be connected without endangering the reliability or voltage quality of systems [11]. The idea of hosting capacity was introduced in 2004, and Math Bollen et al. [12] further developed this concept. Throughout the next few years, hosting capacity analysis was regarded as a transparent tool for the planning and design of distribution networks. Furthermore, based on the time-varying nature of generation and demand, Castelo de Oliveira et al. [13] emphasized that the hosting capacity needs to be analyzed throughout different periods, and developed the concept of dynamic hosting capacity. The analysis of dynamic hosting capacity requires time series data, e.g., forecasting the values of generation and demand [14,15]. In this context, the combined dynamic hosting capacity with forecasting data has the potential to predict and facilitate the operation of systems.
In addition to the development of the concept, abundant studies have been conducted using various techniques for the assessment and enhancement of hosting capacity. To provide guidelines for system planners and researchers on various assessment techniques, the authors in [16] presented a comprehensive overview of three different methods for quantifying PV hosting capacity, i.e., deterministic [17], stochastic [18,19], and time series methods [20]. The main differences between the three categories of methods are whether and how uncertainties are considered. Deterministic methods usually adopt the worst-case scenario (e.g., maximum production and minimum load) to evaluate the extreme impact of DG. Therefore, these methods may lead to conservative results. To realistically assess the states of systems, the stochastic and time series methods adopt probability distributions and time-series data, respectively, to represent uncertainties of source and demand.
The accurate analysis of hosting capacity involves, not only the variations of DG production and load demands, but also the uncertainties of DG location and rated power [16]. In this context, Smith et al. [21] developed a Monte Carlo simulation-based (MCS) stochastic method to provide a comprehensive assessment with various uncertainties. This stochastic method takes into account the uncertainties in the size and location of DG by generating thousands of potential DG deployment schemes. Then, load flow calculations are performed to address the stochastic nature associated with DG production and load consumption. For a specific deployment scheme, the hosting capacity result can be obtained if any technological limitations are violated. With the multiple potential DG deployment schemes, the final result is no longer one exact value but a probability distribution.
Based on the stochastic method in [21], some applications and variations have been developed. This method was conducted in [22] to estimate the PV hosting capacity of 16 utility distribution feeders. Compared with the 15% rule adopted by system planners [10] (i.e., the total DG rating should be lower than 15% of the feeder annual peak load), the stochastic analysis provides more realistic estimations of hosting capacity. Moreover, some studies [23,24] adopted the stochastic method to perform the sensitivity analysis of hosting capacity to the characteristics of the feeder (e.g., voltage class, peak load demand, and voltage regulation equipment) and PV generation (e.g., location and power factor). Dubey et al. [25] presented a comprehensive understanding of the PV hosting capacity problem by developing a mathematical formulation and an hourly stochastic analysis framework of hosting capacity assessment. Given that the analysis framework is stochastic, an approach was also established to quantify the accuracy of the obtained results. A simplified MCS-based method was developed in [26] to estimate the PV hosting capacity of 50000 real low-voltage systems from a distribution utility in Brazil. The obtained statistical results provide a risk-based guide to determine strategies to deal with increasing PV penetration. Instead of using load flow calculations, Abad et al. [27] proposed an optimization model of hosting capacity and embedded this model into the stochastic framework to address the variability of production and consumption.
The literature review in the realm of hosting capacity assessment illustrates that abundant methods have been developed for PV hosting capacity problems. These studies have contributed significantly to the understanding of the impact of different variables on PV hosting capacity. However, less research work has been done on the hybrid wind-PV hosting capacity assessment. Wind power and PV, to some extent, are complementary [28,29]. Therefore, the hybrid wind-PV system may have the potential to generate smoother production than single renewable generation. Some studies have reported the advantages of the hybrid wind-PV system in facilitating energy production [30] and enhancing power supply reliability [31]. In this context, if the complementarity can be captured through the hybrid deployment of wind power and PV, distribution systems could have the potential to accommodate more DG capacity. More importantly, hybrid renewable energy can provide more power for demand, which helps achieve the goal of reducing carbon emissions.
Since the output of wind power and PV have different characteristics [20], the methods for hybrid wind-PV hosting capacity assessment can be different compared to those methods for only PV. The most significant difference is how to consider the uncertainties of DG production. For PV generation, the maximum output usually happens at midday due to solar irradiation. Therefore, only critical periods, in terms of potential limit violations, are evaluated in some studies. For instance, the authors in [23,26] only considered periods 10 am-2 pm and 11 am-1 pm, respectively. The adoption of critical periods can significantly reduce the computational burden while ensuring the accuracy of assessments. However, it is difficult to determine such periods for wind power. Under such circumstances, the evaluation of hybrid wind-PV hosting capacity can be very time-consuming due to the need for a large number of load flow calculations to capture the potential technological violations.
Motivated by the research gap and challenges, this paper aims to present a comprehensive assessment of hybrid wind-PV hosting capacity in distribution systems. To achieve this goal, we develop a stochastic framework that takes into account the uncertainties of load demands, DG production, location, size, and type. Moreover, to avoid multiple load flow calculations, the extreme combinations of DG production and load demands are defined. The proposed framework is implemented to evaluate both single and hybrid DG hosting capacity. Compared with the comprehensive load flow method, both the efficiency and accuracy of the proposed framework are verified. The simulation results provide system planners with an understanding of hosting capacity and energy utilization of the hybrid wind-PV system from the probabilistic perspective.
The paper organization is as follows. Section 2 presents the modeling of time-varying DG production and load demands. Section 3 introduces the stochastic framework for DG hosting capacity assessment. Section 4 performs the proposed framework on the IEEE 33-bus system to evaluate single and hybrid DG hosting capacity. Section 5 presents the discussion on the main features and applications of the proposed framework. Finally, Section 6 summarizes the main conclusions of this study.

Time-Varying Renewables and Demand
In hosting capacity assessment, the time-varying characteristics of both generation and demand need to be considered in evaluating the operating state of the systems. In this context, some studies [32,33] adopted the historical data of generation and demand as inputs to perform time series analysis to identify potential constraint violations. In general, time series analysis requires historical data sets with a long time scale and high-resolution. On the one hand, long time scale historical data, such as one year or a few years [16], is needed to capture various combinations of production and demand. On the other hand, a relatively short time interval [34] (e.g., 1-min, 10-min, and 15-min) is critical for capturing the variabilities of these uncertainties. However, the adoption of time series analysis introduces a significant number of power flow calculations into assessments. For instance, a time series analysis on the data set of one year at the interval of 15-min requires 35,040 power flow simulations to cover the complete assessment. Therefore, the significant computational burden makes time series analysis laborious and intractable.
One of the solutions to improve the efficiency of host capacity assessment is to extract the historical data that has the most significant impact on system constraints. In this context, Ochoa et al. [35] proposed a two-step (i.e., discretization and aggregation steps) technique to allocate the historical data of wind power and load demand into a finite number of bins based on their joint probability of occurrence. Subsequently, Sun et al. [30] further developed this technique by addressing the "coincidence" of wind power, PV, and load.
To outline the procedure of the two-step technique, we adopt wind power and load demand as an example. In Figure 1a, the curves present a 10-day snapshot of 15-min wind power and demand data with the values normalized against respective peak values. The discretization process allocates the time-dependent wind power and demand data into a series of bins. For instance, when the width of the bins is set to 0.1 p.u, the data points 0.43 p.u (wind power) and 0.64 p.u (load demand) are allocated into bins (0.40 p.u, 0.50 p.u] and (0.60 p.u, 0.70 p.u], respectively. Then, the aggregation process groups the bins into multiple combinations, e.g., {(0.40 p.u, 0.50 p.u] and (0.60 p.u, 0.70 p.u]}, as shown as the two-dimensional mesh in Figure 1b, using the above processes to sweep all data points and allocate them into corresponding combinations. Since load demand is never below 0.40 p.u, only 52 out of 100 potential combinations have non-zero probabilities of occurrence. These combinations are labeled with " √ " in Figure 1c.
Sustainability 2020, 12, x FOR PEER REVIEW  4 of 19 p.u, only 52 out of 100 potential combinations have non-zero probabilities of occurrence. These combinations are labeled with "√" in Figure 1c. The discretization-aggregation technique allocates the time series data into a finite number of combinations, thus reducing the number of load flow calculations. More importantly, this technique can preserve the interrelationships between production and demand. However, two challenges still need to be addressed when applying this technique in the hosting capacity assessment. The first is to determine which combinations are most critical in driving the system constraints. Second, the selection of bin width is empirical. Therefore, it is imperative to evaluate the impact of different bin widths on hosting capacity results.
For the first challenge, consider the combinations labeled with the red "√" in Figure 1c, which are characterized by relatively high generation and low demand. If technical constraints are not violated in these combinations, they are unlikely to be violated in other combinations. Therefore, these combinations can represent the worst-case scenarios in determining the hosting capacity, and we define them as extreme combinations. For the second challenge, different bin widths need to be considered. Figure 2 presents the combinations of wind power and load demand when the bin width is set to 0.05 p.u. It is evident that the total number of potential combinations increases significantly compared with the results in Figure 1c, but the number of extreme combinations is almost the same. The adoption of a smaller bin width can generate more compact intervals to be evaluated. Therefore, the selection of bin width may affect the accuracy of hosting capacity assessment, and the quantitative analysis of this impact is presented in Section 4. The discretization-aggregation technique allocates the time series data into a finite number of combinations, thus reducing the number of load flow calculations. More importantly, this technique can preserve the interrelationships between production and demand. However, two challenges still need to be addressed when applying this technique in the hosting capacity assessment. The first is to determine which combinations are most critical in driving the system constraints. Second, the selection of bin width is empirical. Therefore, it is imperative to evaluate the impact of different bin widths on hosting capacity results.
For the first challenge, consider the combinations labeled with the red " √ " in Figure 1c, which are characterized by relatively high generation and low demand. If technical constraints are not violated in these combinations, they are unlikely to be violated in other combinations. Therefore, these combinations can represent the worst-case scenarios in determining the hosting capacity, and we define them as extreme combinations. For the second challenge, different bin widths need to be considered. Figure 2 presents the combinations of wind power and load demand when the bin width is set to 0.05 p.u. It is evident that the total number of potential combinations increases significantly compared with the results in Figure 1c, but the number of extreme combinations is almost the same. The adoption of a smaller bin width can generate more compact intervals to be evaluated. Therefore, the selection of bin width may affect the accuracy of hosting capacity assessment, and the quantitative analysis of this impact is presented in Section 4.

Framework for Hosting Capacity Assessment
In this section, we develop a stochastic framework for hosting capacity assessment. The determination of hosting capacity requires clear performance limits as criteria. Note that the most restrictive impact of DG integration is bus overvoltage [23,26]. This paper primarily considers the voltage limits to determine the hosting capacity. The proposed approach can also be extended to consider other criteria, such as conductor thermal capacity and transformer overload [26].

Definition of Variables
Two types of variables need to be considered in hosting capacity assessment. The first category of variables relates to the deployment of DG, including location, type, and the rated power of DG. The second category of variables includes DG production and load consumption. To consider the above variables, we define some terms as follows: 1) Location penetration (Loci): The definition of location penetration is the ratio of the number of selected DG locations NS to the number of all potential locations NP. This variable denotes the number of locations for the integration of DG. To fully understand the impact of location penetration on hosting capacity, this variable is increased by a 10% step from 10% to 100%. Let Locpen be the set of potential location penetration, i.e., Locpen = {10%, 20%,…, i×10%,…,100%}.
2) Deployment scheme of DG (Sij): For a specific bus penetration Loci, a deployment scheme Sij can be obtained by randomly selecting Loci × NP buses for DG integration. Let Si be the set of deployment schemes corresponding to Loci. Various potential deployment schemes can be generated using the MCS. Since the deployment of DG is stochastic, the trial number of MCS can affect the accuracy of probabilistic hosting capacity.
3) DG rated power (DGrated): The rated power of DG at each selected bus is assumed to be proportional to the corresponding peak load consumption. 4) Type of DG: This variable represents the combination of DG technologies. Two types of DG are considered in this paper, i.e., wind power and PV. As an example, a combination of DG types (50% wind, 50% PV) indicates that the half capacity of DG at the selected buses is wind power, and the other half is PV panels. 5) Load demands and DG production: A one-year historical data set with a 15-min time resolution is adopted to represent the stochastic nature of generation and consumption. The discretization-aggregation technique in Section 2 is applied to process the historical data and

Framework for Hosting Capacity Assessment
In this section, we develop a stochastic framework for hosting capacity assessment. The determination of hosting capacity requires clear performance limits as criteria. Note that the most restrictive impact of DG integration is bus overvoltage [23,26]. This paper primarily considers the voltage limits to determine the hosting capacity. The proposed approach can also be extended to consider other criteria, such as conductor thermal capacity and transformer overload [26].

Definition of Variables
Two types of variables need to be considered in hosting capacity assessment. The first category of variables relates to the deployment of DG, including location, type, and the rated power of DG. The second category of variables includes DG production and load consumption. To consider the above variables, we define some terms as follows: (1) Location penetration (Loc i ): The definition of location penetration is the ratio of the number of selected DG locations N S to the number of all potential locations N P . This variable denotes the number of locations for the integration of DG. To fully understand the impact of location penetration on hosting capacity, this variable is increased by a 10% step from 10% to 100%. Let Loc pen be the set of potential location penetration, i.e., Loc pen = {10%, 20%, . . . , i×10%, . . . ,100%}. (2) Deployment scheme of DG (S ij ): For a specific bus penetration Loc i , a deployment scheme S ij can be obtained by randomly selecting Loc i × N P buses for DG integration. Let S i be the set of deployment schemes corresponding to Loc i . Various potential deployment schemes can be generated using the MCS. Since the deployment of DG is stochastic, the trial number of MCS can affect the accuracy of probabilistic hosting capacity. (3) DG rated power (DG rated ): The rated power of DG at each selected bus is assumed to be proportional to the corresponding peak load consumption. (4) Type of DG: This variable represents the combination of DG technologies. Two types of DG are considered in this paper, i.e., wind power and PV. As an example, a combination of DG types (50% wind, 50% PV) indicates that the half capacity of DG at the selected buses is wind power, and the other half is PV panels. (5) Load demands and DG production: A one-year historical data set with a 15-min time resolution is adopted to represent the stochastic nature of generation and consumption.
The discretization-aggregation technique in Section 2 is applied to process the historical data and determine the extreme combinations. Since the selection of bin width is empirical, seven different

Stochastic Analysis Framework
In this section, the computational procedure of the proposed stochastic framework is developed. As shown in Figure 3, the proposed framework has three modules. Module 1 aims to generate various DG deployment schemes. Then, Module 2 conducts the impact analysis to obtain the hosting capacity results of different deployment schemes. Module 3 performs the statistical analysis of the obtained hosting capacity results.

Stochastic Analysis Framework
In this section, the computational procedure of the proposed stochastic framework is developed. As shown in Figure 3, the proposed framework has three modules. Module 1 aims to generate various DG deployment schemes. Then, Module 2 conducts the impact analysis to obtain the hosting capacity results of different deployment schemes. Module 3 performs the statistical analysis of the obtained hosting capacity results.

1) Deployment schemes of DG
This module addresses the generation of multiple DG deployment schemes. The variables involved in this module include DG location penetration, rated power, and type. The steps of Module 1 are as follows: Step 1: Determine the location penetration Loci ∈ Locpen, (i = 1, 2, …, 10); Step 2: Perform MCS to generate k DG deployment schemes with location penetration level Loci. The set of deployment schemes is represented as Si = {Si1, …, Sij, …,Sik}; Step 3: Determine the initial total hosting capacity (e.g., 1 MW) of the test system. For each deployment scheme, Sij, the initial DG rated power at each bus is allocated based on the corresponding peak load demand; Step 4: Determine the DG type at each selected bus (e.g., 50% wind, 50% PV). (1) Deployment schemes of DG This module addresses the generation of multiple DG deployment schemes. The variables involved in this module include DG location penetration, rated power, and type. The steps of Module 1 are as follows: Step 1: Determine the location penetration Loc i ∈ Loc pen , (i = 1, 2, . . . , 10); Step 2: Perform MCS to generate k DG deployment schemes with location penetration level Loc i . The set of deployment schemes is represented as Step 3: Determine the initial total hosting capacity (e.g., 1 MW) of the test system. For each deployment scheme, S ij , the initial DG rated power at each bus is allocated based on the corresponding peak load demand; Step 4: Determine the DG type at each selected bus (e.g., 50% wind, 50% PV).
(2) Assessment of hosting capacity This module aims to analyze the impact of DG production on the operating states of systems. The hosting capacity of each deployment scheme is determined by performing deterministic power flow calculations on extreme combinations. Therefore, the variable considered in this module is the bin width d. The steps of Module 2 are as follows: Step 5: Determine the bin width d, and obtain the extreme combinations to be evaluated; Step 6: For a specific deployment scheme, S ij , perform deterministic power flow calculations on the extreme combinations to estimate the upper bounds of voltages. For instance, if the extreme combination is {(0.65 p.u, 0.70 p.u], (0.45 p.u, 0.50 p.u]}, the upper bounds of voltages can be obtained with DG generation and load demand set to 0.70 p.u and 0.45 p.u, respectively; Step 7: If none of the bus voltages exceed the technical constraints, i.e., 1.05 p.u, the total hosting capacity is increased by a fixed step size (e.g., 0.01 MW) and the procedure is repeated from Step 6. Otherwise, the procedure is terminated and the hosting capacity result, HC j , is obtained for the test system with deployment scheme S ij ; Step 8: Repeat steps 6-7 to obtain the hosting capacity results for each deployment scheme in S i . The hosting capacity results are represented as HC = {HC 1 , . . . HC j , . . . ,HC k }.
(3) Analysis of hosting capacity results In Module 3, statistical analysis of the obtained hosting capacity results is performed. The results of this module are as follows: Histogram of hosting capacity: The histogram of hosting capacity can be obtained based on the results HC = {HC 1 , . . . HC j , . . . ,HC k }. As shown in Figure 3a, the results provide an understanding of the probabilistic hosting capacity of the test system with a specific location penetration.
Cumulative probability curve of hosting capacity: This curve can be obtained based on the histogram of hosting capacity. As shown in Figure 3b, the curve is presented in a descending trend. This curve helps system planners to estimate the probability of having a hosting capacity higher than a specific value.
Histogram of total generation: The total energy generation of DG is a valuable indicator for the utilization of renewable energy. Therefore, the histogram of energy generation in one year, as shown in Figure 3c, is calculated to provide a comparison of the energy utilization of different types of DG.

Numerical Results
In this section, numerical studies are carried out on the IEEE 33-bus distribution system to demonstrate the efficacy of the proposed framework. Firstly, brief introductions to the test system and historical data sets of load demand, wind, and PV production are presented. Then, two critical parameters (bin width and trail number) in the stochastic framework are determined through comparisons with the comprehensive assessment. Following that, detailed assessments of both single and hybrid DG hosting capacity are performed.

Data
The proposed framework is evaluated on the IEEE 33-bus distribution network, whose single line diagram is presented in Figure 4. We set the voltage of the grid supply point to 1.0 p.u, and the lower and upper bounds of voltage for each bus are 0.95 p.u and 1.05 p.u, respectively. The total peak demand for this test system is 3.715 MW. The detailed parameters and configurations of the test system are available in [36].
The historical data of load demand, wind speed, and solar irradiation are derived from a typical distribution system in Hubei province, China. The data set of one year at intervals of 15 min has a total of 35,040 data points. Load demand is normalized against its peak value. Wind speed and solar irradiation are processed and applied to their characteristic curves [37], respectively. Figure 5 presents the variations of load demand, wind power, and PV production in the whole year. It can be seen from the curves that both load demand and PV output have visible seasonal patterns. The load in summer is relatively lower than in winter, while PV output shows the opposite trend. However, the pattern of wind power is less clear than the load and PV output. In the test system, different buses are geographically close. Therefore, it is reasonable to assume that the production of the same type of DG follow the same time series curve. in summer is relatively lower than in winter, while PV output shows the opposite trend. However, the pattern of wind power is less clear than the load and PV output. In the test system, different buses are geographically close. Therefore, it is reasonable to assume that the production of the same type of DG follow the same time series curve.  in summer is relatively lower than in winter, while PV output shows the opposite trend. However, the pattern of wind power is less clear than the load and PV output. In the test system, different buses are geographically close. Therefore, it is reasonable to assume that the production of the same type of DG follow the same time series curve. The discretization-aggregation technique in Section 2 is adopted to establish the combinations of renewables and load demands. Figure 6 shows the coincident periods for wind power-load, PV-load, and wind power-PV when the bin width is selected as 0.05 p.u. For the 400 potential combinations of wind power-load, only 175 have non-zero probabilities of occurrence. Similarly, only 133 combinations are non-zero in the PV-load case. For the coincidence of renewable energies, there is a negative correlation between wind power and PV, and the correlation coefficient between two resources is −0.162. The PV output mainly depends on solar irradiation. As a result, solar power is negligible during nighttime. In contrast, wind power during daytime is typically less than at nighttime. The complementarity between wind and solar power has the potential to facilitate energy integration. The discretization-aggregation technique in Section 2 is adopted to establish the combinations of renewables and load demands. Figure 6 shows the coincident periods for wind power-load, PVload, and wind power-PV when the bin width is selected as 0.05 p.u. For the 400 potential combinations of wind power-load, only 175 have non-zero probabilities of occurrence. Similarly, only 133 combinations are non-zero in the PV-load case. For the coincidence of renewable energies, there is a negative correlation between wind power and PV, and the correlation coefficient between two resources is −0.162. The PV output mainly depends on solar irradiation. As a result, solar power is negligible during nighttime. In contrast, wind power during daytime is typically less than at nighttime. The complementarity between wind and solar power has the potential to facilitate energy integration.
In hosting capacity assessment, the occurrences of relatively high production and low demand are critical in driving in system constraints. Table 1 presents the extreme combinations to be evaluated for wind power and PV hosting capacity. For wind power-load and PV-load cases, only three and four extreme combinations need to be considered, respectively. Compared with the historical dataset, the introduction of extreme combinations can significantly reduce the number of load flow calculations.    In hosting capacity assessment, the occurrences of relatively high production and low demand are critical in driving in system constraints. Table 1 presents the extreme combinations to be evaluated for wind power and PV hosting capacity. For wind power-load and PV-load cases, only three and four extreme combinations need to be considered, respectively. Compared with the historical dataset, the introduction of extreme combinations can significantly reduce the number of load flow calculations.

Impacts of Simulation Parameters on Hosting Capacity
Before the evaluation of hosting capacity, two critical parameters should be determined. One of them is the bin width, which is used to determine the extreme combinations of generation and demand. This parameter may have an impact on the accuracy of hosting capacity. Another parameter is the trial number of MCS, which represents the number of DG deployment schemes. Since the analysis framework is stochastic, this parameter may affect the accuracy of the probability distribution of the obtained results.
Firstly, the selection of different bin widths is conducted, and only wind power is considered to simplify the analysis. The proposed stochastic framework is implemented with seven different bin widths, d, i.e., 0.1 p.u, 0.05 p.u, 0.025 p.u, 0.01 p.u, 0.005 p.u, 0.0025 p.u, and 0.001 p.u. Moreover, the comprehensive load flow method is adopted to render comparative results. The computational procedure of comprehensive assessment is similar to Module 2 of the proposed framework, i.e., increase the total DG capacity until the limitation is violated. Still, the difference is that the former method performs load flow calculations with all historical data of wind power and load to capture the violation of system constraints. Therefore, the results of the comprehensive load flow method can be treated as the exact values.
Taking a specific DG deployment as an example, the locations of wind power integration are Bus 2, 7, 24, and 33. Table 2 shows the number of extreme combinations and hosting capacity results under different bin widths. As a comparison, the result obtained by the comprehensive assessment is 10.87 MW, and the computational time is 894 s. The result obtained by the proposed framework with d = 0.1 p.u is 12.97% less than the exact value. That means the adoption of a relatively large bin width underestimates the hosting capacity and leads to a conservative result. In contrast, when d is less than 0.01 p.u, the relative errors are less than 1%. The accurate estimations demonstrate that the proposed extreme combination can provide a suitable alternative to the time series data in hosting capacity assessment when an appropriate bin width is selected. The reason for the conservative estimation of hosting capacity can be further explained through the time series results of bus voltage. Figure 7 presents the 1-day snapshot of the time series voltage curves at Bus 18. The blue curve denotes the accurate results obtained by the comprehensive load flow method. Three different d, i.e., 0.1 p.u, 0.01 p.u, and 0.001 p.u, are conducted as the comparative cases. For these cases, the corresponding results in Table 2, i.e., 9.46 MW, 10.77 MW, and 10.86 MW, are used to calculate the upper and lower bounds of bus voltage. As described in Section 3.2, the upper bounds of the voltages are calculated using the lower and upper bounds of load and wind power bins, respectively. The lower bounds can be calculated using the opposite rule. It is evident from the results that the accurate voltage curve is always contained by the range of upper and lower bounds in each case. Therefore, when the corresponding hosting capacity in each case is applied to the actual time series data, it can be guaranteed that no voltage violation occurs. The comparison of the upper bounds of the voltages in different cases can explain the conservatism of the obtained hosting capacity. When d is relatively large, the upper bound of voltages is overestimated, and the voltage violation can be observed with less DG capacity. In contrast, when d is appropriately selected, the interval between the upper and lower bounds of voltage is much narrower than those with larger bin widths. Under such circumstances, the hosting capacity can be accurately estimated. voltage violation can be observed with less DG capacity. In contrast, when △d is appropriately selected, the interval between the upper and lower bounds of voltage is much narrower than those with larger bin widths. Under such circumstances, the hosting capacity can be accurately estimated. On the comparison of computational burden, due to the increase of the number of extreme combinations, the computational time required by the proposed framework with smaller △d increases slightly, as shown in Table 2  In the test system, the number of candidate buses is 32, and each of them has an equal probability of DG integration. For location penetration of 20%, the total possible number of DG deployment schemes is more than 4 × 10 5 . The hosting capacity assessment for such a large number of potential DG deployments results in a significant computational burden. Therefore, MCS is adopted to simulate a relatively small number of scenarios to obtain approximated results. Moreover, the trial number of MCS needs to be determined to achieve a compromise between accuracy and computational burden. In this context, we adopt variance coefficient β [38] of the obtained hosting On the comparison of computational burden, due to the increase of the number of extreme combinations, the computational time required by the proposed framework with smaller d increases slightly, as shown in Table 2. Compared with the comprehensive assessment, the proposed framework can provide accurate estimations of hosting capacity with at least 400 times (894/2.23) speed up in efficiency.
In the test system, the number of candidate buses is 32, and each of them has an equal probability of DG integration. For location penetration of 20%, the total possible number of DG deployment schemes is more than 4 × 10 5 . The hosting capacity assessment for such a large number of potential DG deployments results in a significant computational burden. Therefore, MCS is adopted to simulate a relatively small number of scenarios to obtain approximated results. Moreover, the trial number of MCS needs to be determined to achieve a compromise between accuracy and computational burden. In this context, we adopt variance coefficient β [38] of the obtained hosting capacity results as the stopping criteria of MCS. That means the results of MCS are assumed to be accurate if β is less than a specific value.
The variance coefficient, β, should be appropriately selected to ensure the accuracy of the obtained results. Therefore, we consider two different values of β, i.e., 1% and 0.5%, to provide comparative results. Assume that there are four buses in each deployment scheme that are the candidate locations for DG integration. Under such circumstances, the total number of potential deployments can be obtained by the enumeration method (EM). The hosting capacity results of a total of 35,960 deployment scenarios are treated as accurate results. The mean and standard deviation (STD) values of hosting capacity obtained by EM are 6.66 MW and 2.45 MW. As a comparison, the results obtained by MCS are presented in Table 3. The trial number required by MCS with two different β values are 1300 and 5400, respectively. It can be found that the approximated results are more accurate when stopping criteria is relatively small. Figure 8 presents the comparisons between the probability distribution curves.
The results indicate that MCS can well approximate the accurate curves obtained by the EM with β set to 0.5%. Therefore, in the following case study, the stopping criterion of MCS is set as β ≤ 0.5%. The variance coefficient, β, should be appropriately selected to ensure the accuracy of the obtained results. Therefore, we consider two different values of β, i.e., 1% and 0.5%, to provide comparative results. Assume that there are four buses in each deployment scheme that are the candidate locations for DG integration. Under such circumstances, the total number of potential deployments can be obtained by the enumeration method (EM). The hosting capacity results of a total of 35,960 deployment scenarios are treated as accurate results. The mean and standard deviation (STD) values of hosting capacity obtained by EM are 6.66 MW and 2.45 MW. As a comparison, the results obtained by MCS are presented in Table 3. The trial number required by MCS with two different β values are 1300 and 5400, respectively. It can be found that the approximated results are more accurate when stopping criteria is relatively small. Figure 8 presents the comparisons between the probability distribution curves. The results indicate that MCS can well approximate the accurate curves obtained by the EM with β set to 0.5%. Therefore, in the following case study, the stopping criterion of MCS is set as β ≤ 0.5%.

Hosting Capacity Assessment of Single DG Technology
In this section, we perform the stochastic framework to assess the hosting capacity of single DG technology. Based on the analysis in Section 4.2, the parameters bin width, △d, and variance coefficient, β, are set to 0.001 p.u and 0.5%, respectively, to ensure the accuracy of the results. For each DG technology, location penetration is increased from 10% to 100%, with a step size of 10%. Table 4 presents the statistical characteristics of wind power and PV hosting capacity results (HCWP and HCPV) under location penetration 20%, 50%, and 80%. It is evident that the mean values of hosting capacity increase with higher location penetration, while the standard deviations present an opposite trend. The reasons for such results are as follows. Firstly, with increased location penetration, the installed DG capacity at each location reduces (the initial total capacity is assumed as 1 MW and increased with a fixed step). In this case, the impact of DG on the system is less severe, and the test system can accommodate more DG capacity. Secondly, higher location penetration alleviates the uncertainties of DG location. Therefore, the obtained hosting capacity results with different deployment schemes are less dispersed.

Hosting Capacity Assessment of Single DG Technology
In this section, we perform the stochastic framework to assess the hosting capacity of single DG technology. Based on the analysis in Section 4.2, the parameters bin width, d, and variance coefficient, β, are set to 0.001 p.u and 0.5%, respectively, to ensure the accuracy of the results. For each DG technology, location penetration is increased from 10% to 100%, with a step size of 10%. Table 4 presents the statistical characteristics of wind power and PV hosting capacity results (HC WP and HC PV ) under location penetration 20%, 50%, and 80%. It is evident that the mean values of hosting capacity increase with higher location penetration, while the standard deviations present an opposite trend. The reasons for such results are as follows. Firstly, with increased location penetration, the installed DG capacity at each location reduces (the initial total capacity is assumed as 1 MW and increased with a fixed step). In this case, the impact of DG on the system is less severe, and the test system can accommodate more DG capacity. Secondly, higher location penetration alleviates the uncertainties of DG location. Therefore, the obtained hosting capacity results with different deployment schemes are less dispersed. When comparing the hosting capacity of different DG types, it is worth mentioning that the test system has a higher PV hosting capacity than wind power at the same location penetration. For instance, the results for PV is 3.3% higher than that of wind power when location penetration is 50%. In general, the occurrences of high DG production and low demand are of great concern as they promote voltage rise. Therefore, the difference in wind power and PV hosting capacity can be explained using extreme combinations. For the case of wind power, most of the deployment schemes are constrained by the combination {(0.729 p.u, 0.730 p.u], (0.507 p.u, 0.508 p.u]}. While the corresponding combination for the PV case is {(0.749 p.u, 0.750 p.u], (0.590 p.u, 0.591 p.u]}. The former combination can cause a slightly higher voltage rise at the same DG capacity. In addition to the hosting capacity results, Table 5 presents the comparisons of mean values of total energy production from wind power and PV (E WP and E PV ). Although the hosting capacity for PV is slightly higher than wind power, total energy production from PV is less than 50% of wind. The reason for the difference between total energy production is the higher wind power capacity factor in this area. Total generation production can realistically reflect energy utilization. Therefore, in the planning of DG, not only the maximum DG capacity, but also the total generation production should be considered.  Figure 9 presents the probability distributions of wind power and PV hosting capacity under 50% location penetration. The results are shown using histograms and cumulative probability curves. It can be found that Gamma distribution can approximate both wind power and PV hosting capacity. Moreover, the obtained results with other location penetrations also follow the same rule. The histograms provide the estimations of extreme values of hosting capacity. For wind power hosting capacity, the obtained minimum and maximum values are 4.10 MW and 12.65 MW, respectively. As a comparison, the corresponding results for PV hosting capacity are 4.26 MW and 13.21 MW, respectively. The difference between the extreme values is partly due to the locations of DG. In general, the relatively high hosting capacity can be obtained with DG located near the first bus. Figure 9b,d presents the cumulative probability curves of hosting capacity. These curves provide system planners with a straightforward tool for estimating the probability of having a hosting capacity higher than a specific value. For example, the probability of having wind power hosting capacity higher than 6 MW is 0.9072, while in the PV case, the result is 0.9353.
Since the obtained hosting capacity results with different location penetration can be approximated by Gamma distributions, it is imperative to establish the relationship between location penetration and the parameters of Gamma distributions. The probability density function of Gamma distribution is given as follows: where Γ(·) is the Gamma function; parameters α and β are shape and scale parameters, respectively. The shape and scale parameters can be calculated as follows: where µ x and σ x are the expected value and standard deviation of variable x, respectively. Based on the hosting capacity results with different location penetrations, the curves of the parameter of Gamma distribution to location penetration can be obtained and fitted by the fifth-order polynomials, as shown in Figure 10. These curves can provide planners with useful information. For instance, when the location penetration is 35%, system planners can utilize the curve to obtain the shape and scale parameters and estimate the probability distribution of hosting capacity under the target location penetration. Moreover, once such curves are established for a specific distribution system, the probabilistic hosting capacity of the system with any location penetration can be obtained without simulating thousands of DG deployment schemes. where μx and σx are the expected value and standard deviation of variable x, respectively. Based on the hosting capacity results with different location penetrations, the curves of the parameter of Gamma distribution to location penetration can be obtained and fitted by the fifth-order polynomials, as shown in Figure 10. These curves can provide planners with useful information. For instance, when the location penetration is 35%, system planners can utilize the curve to obtain the shape and scale parameters and estimate the probability distribution of hosting capacity under the target location penetration. Moreover, once such curves are established for a specific distribution system, the probabilistic hosting capacity of the system with any location penetration can be obtained without simulating thousands of DG deployment schemes.  where μx and σx are the expected value and standard deviation of variable x, respectively. Based on the hosting capacity results with different location penetrations, the curves of the parameter of Gamma distribution to location penetration can be obtained and fitted by the fifth-order polynomials, as shown in Figure 10. These curves can provide planners with useful information. For instance, when the location penetration is 35%, system planners can utilize the curve to obtain the shape and scale parameters and estimate the probability distribution of hosting capacity under the target location penetration. Moreover, once such curves are established for a specific distribution system, the probabilistic hosting capacity of the system with any location penetration can be obtained without simulating thousands of DG deployment schemes.

Hosting Capacity Assessment of Hybrid Wind-PV
In this section, the proposed framework is conducted to estimate the hybrid wind power-PV hosting capacity. This section only considers the different ratios of wind power and PV under 50% location penetration to simplify the presentation.
Monte Carlo simulation is performed to generate various DG deployment schemes. For a specific DG deployment scenario, the percentage of wind power generation is increased from 0% to 100% with 5% increment steps. The proposed framework is performed to calculate the corresponding hosting capacity. Figure 11 presents the probability distributions of the obtained hosting capacity results with different wind power percentages. As the ratio of wind power increases, the transition of hybrid DG hosting capacity follows a two-stage process. In the first stage, the probability distributions of hosting capacity shift toward the right until the percentage of wind power reaches 75%. In the second stage, as the proportion of wind power continues to increase, the probability distribution curves of hosting capacity move to the left.
Sustainability 2020, 12, x FOR PEER REVIEW 15 of 19 In this section, the proposed framework is conducted to estimate the hybrid wind power-PV hosting capacity. This section only considers the different ratios of wind power and PV under 50% location penetration to simplify the presentation.
Monte Carlo simulation is performed to generate various DG deployment schemes. For a specific DG deployment scenario, the percentage of wind power generation is increased from 0% to 100% with 5% increment steps. The proposed framework is performed to calculate the corresponding hosting capacity. Figure 11 presents the probability distributions of the obtained hosting capacity results with different wind power percentages. As the ratio of wind power increases, the transition of hybrid DG hosting capacity follows a two-stage process. In the first stage, the probability distributions of hosting capacity shift toward the right until the percentage of wind power reaches 75%. In the second stage, as the proportion of wind power continues to increase, the probability distribution curves of hosting capacity move to the left.  Table 6 presents the mean values of hosting capacity and total energy production under two maximum situations. The hybrid wind power-PV hosting capacity reaches its maximum value when the percentage of wind power is 75%. In this case, the mean value of total hosting capacity is 9.39 MW, of which wind power and PV capacity are 7.04 MW and 2.35 MW, respectively. Compared with the results of single wind power and PV technologies, the total hosting capacity increases by about 1.21 (9.39/7.78) and 1.17 (9.39/8.05) times, respectively. The mean value of total energy production for one year is 17,999.26 MW·h, of which the energy production of wind power accounts for about 87%. The total energy production increases by about 1.04 and 2.20 times when compared with single wind power and PV, respectively. Another important result is the maximum energy production, and this value is reached when the wind power percentage is 85%. In this case, the mean values of hosting capacity and total energy production are 9.12 MW and 18,596.31 MW·h, respectively. As can be seen from the above results, wind power has a dominant role in both cases of maximum hosting capacity and energy production. However, PV is also essential as the complementarity generation to increase total hosting capacity and energy production. The results demonstrate that the complementarity between wind power and PV can promote the system to accommodate more DG capacity, thus increasing the utilization of renewable energy. Besides, the difference between the two maximum results demonstrates that hosting capacity and energy production do not reach the optimal values simultaneously. Therefore, considering the investment  Table 6 presents the mean values of hosting capacity and total energy production under two maximum situations. The hybrid wind power-PV hosting capacity reaches its maximum value when the percentage of wind power is 75%. In this case, the mean value of total hosting capacity is 9.39 MW, of which wind power and PV capacity are 7.04 MW and 2.35 MW, respectively. Compared with the results of single wind power and PV technologies, the total hosting capacity increases by about 1.21 (9.39/7.78) and 1.17 (9.39/8.05) times, respectively. The mean value of total energy production for one year is 17,999.26 MW·h, of which the energy production of wind power accounts for about 87%. The total energy production increases by about 1.04 and 2.20 times when compared with single wind power and PV, respectively. Another important result is the maximum energy production, and this value is reached when the wind power percentage is 85%. In this case, the mean values of hosting capacity and total energy production are 9.12 MW and 18,596.31 MW·h, respectively. As can be seen from the above results, wind power has a dominant role in both cases of maximum hosting capacity and energy production. However, PV is also essential as the complementarity generation to increase total hosting capacity and energy production. The results demonstrate that the complementarity between wind power and PV can promote the system to accommodate more DG capacity, thus increasing the utilization of renewable energy. Besides, the difference between the two maximum results demonstrates that hosting capacity and energy production do not reach the optimal values simultaneously. Therefore, considering the investment of different DG types, system planners and DG investors may need to resort to multi-objective optimization to achieve compromise solutions.
Similar to the hosting capacity of single DG technology, the probability distribution of hybrid wind power-PV hosting capacity with different wind power ratios can also be approximated by Gamma distribution. Moreover, the probability distribution curves of hosting capacity in Figure 11 have a similar shape. Using Equation (2) to calculate the parameters of Gamma distribution corresponding to each curve, the obtained mean and STD values of the shape parameters are 32.13 and 0.4, which indicates that the shape parameters of different curves are almost the same. In this context, only the relationship between wind power percentage and scale parameter needs to be established. Moreover, the change of scale parameters can represent the trend of hosting capacity. As can be seen from the curve in Figure 12, the system can accommodate relatively higher DG capacity when the wind power percentage is 65-75%. This result is consistent with the previous analysis in Table 6. This curve has two applications. Firstly, system planners can obtain the probability distributions of hosting capacity with different wind power and photovoltaic combinations based on the curve. Secondly, DG investors can adjust their investment plans for wind power and PV generation based on the curve so that they can maximize the DG capacity and obtain more revenue from energy production. of different DG types, system planners and DG investors may need to resort to multi-objective optimization to achieve compromise solutions. Similar to the hosting capacity of single DG technology, the probability distribution of hybrid wind power-PV hosting capacity with different wind power ratios can also be approximated by Gamma distribution. Moreover, the probability distribution curves of hosting capacity in Figure 11 have a similar shape. Using Equation (2) to calculate the parameters of Gamma distribution corresponding to each curve, the obtained mean and STD values of the shape parameters are 32.13 and 0.4, which indicates that the shape parameters of different curves are almost the same. In this context, only the relationship between wind power percentage and scale parameter needs to be established. Moreover, the change of scale parameters can represent the trend of hosting capacity. As can be seen from the curve in Figure 12, the system can accommodate relatively higher DG capacity when the wind power percentage is 65-75%. This result is consistent with the previous analysis in Table 6. This curve has two applications. Firstly, system planners can obtain the probability distributions of hosting capacity with different wind power and photovoltaic combinations based on the curve. Secondly, DG investors can adjust their investment plans for wind power and PV generation based on the curve so that they can maximize the DG capacity and obtain more revenue from energy production.

Discussion
In this section, we discuss the main features, applications, and future research of this study. Firstly, the proposed stochastic framework provides system planners with an accurate and efficient tool for hosting capacity assessment. The framework adopts the discretization-aggregation technique to deal with the time series data. Compared with the application of this technique in previous studies [30,35], this study concentrates more on the extreme combinations and the appropriate selection of bin width. Thanks to the introduction of extreme combinations, the number of power flow calculations required by the hosting capacity evaluation is significantly reduced. More importantly, the simulation results demonstrate the necessity of appropriate selection of bin width.
Secondly, this study presents the assessments of wind power, PV, and hybrid wind-PV hosting capacity. The simulation results indicate that the hybrid energy system has the potential to increase hosting capacity and energy production due to the complementarity of wind speed and solar irradiation. The performance of the hybrid wind-PV system in facilitating energy integration and utilization varies with different wind power and PV ratios. Compared with the studies that focused on PV hosting capacity [23][24][25][26], this study can contribute a better understanding of hybrid wind-PV hosting capacity from a probabilistic point of view.
Thirdly, the simulation results identify that Gamma distribution can approximate the probability distributions of both single and hybrid DG hosting capacity. These findings can provide system planners and DG investors with serval useful tools, such as the curves in Figures 10 and 12.

Discussion
In this section, we discuss the main features, applications, and future research of this study. Firstly, the proposed stochastic framework provides system planners with an accurate and efficient tool for hosting capacity assessment. The framework adopts the discretization-aggregation technique to deal with the time series data. Compared with the application of this technique in previous studies [30,35], this study concentrates more on the extreme combinations and the appropriate selection of bin width. Thanks to the introduction of extreme combinations, the number of power flow calculations required by the hosting capacity evaluation is significantly reduced. More importantly, the simulation results demonstrate the necessity of appropriate selection of bin width.
Secondly, this study presents the assessments of wind power, PV, and hybrid wind-PV hosting capacity. The simulation results indicate that the hybrid energy system has the potential to increase hosting capacity and energy production due to the complementarity of wind speed and solar irradiation. The performance of the hybrid wind-PV system in facilitating energy integration and utilization varies with different wind power and PV ratios. Compared with the studies that focused on PV hosting capacity [23][24][25][26], this study can contribute a better understanding of hybrid wind-PV hosting capacity from a probabilistic point of view.
Thirdly, the simulation results identify that Gamma distribution can approximate the probability distributions of both single and hybrid DG hosting capacity. These findings can provide system planners and DG investors with serval useful tools, such as the curves in Figures 10 and 12. Such results can be generally established for other networks. These curves can, not only help system planners assess the hosting capacity under different location penetrations, but also assists DG investors in making informed decisions on the investment plan of different DG types.
This study performs the hosting capacity assessment without considering active network managements (ANMs). ANMs, such as on-load tap changers [39] and energy storage [40,41], are verified to be effective ways for the enhancement of hosting capacity. However, due to the need for the knowledge of the succession of operating states (e.g., state of charge of energy storage), the assessment of enhanced hosting capacity can be more challenging when considering ANMs. Therefore, the establishment of a critical time series of DG production and load demand can be one of the future research works.

Conclusions
In this paper, a stochastic framework is developed to study the hybrid wind-PV system hosting capacity from a probabilistic view. To reduce the computational burden of assessments, we define the extreme combinations of DG productions and load demand based on historical data sets. The proposed framework is implemented in the IEEE 33-bus system to evaluate wind power, PV, and hybrid energy hosting capacity. The main findings of this study are as follows: (1) The accuracy of the proposed framework depends on the selection of bin width. Relatively large bin width can lead to a conservative estimation of hosting capacity. Therefore, to obtain accurate results, it is recommended that the bin width needs to be smaller than 0.01 p.u. (2) Due to the difference in energy resources, the system has slightly higher PV hosting capacity than wind power. Still, the total energy production of PV is much less than that of wind power. Moreover, the probability distributions of wind power and PV hosting capacity with different location penetration can be approximated by Gamma distribution. (3) Due to the complementarity between wind power and PV, the system can accommodate more DG capacity, thus promoting the utilization of renewable energy. However, the hosting capacity and energy production of the hybrid wind-PV system do not reach optimal values simultaneously. Therefore, the multi-objective optimization method can be used to achieve a compromise between DG capacity and energy production.
The main application of this study is to provide a useful tool for system planners to assess DG hosting capacity. This study can be extended to consider other low carbon technologies, such as electric vehicles and biomass-fueled gas engines. Also, the proposed combination of production and demand can be used in other energy management studies that require extensive power flow calculations, such as analysis of system losses and energy reductions. Regarding future research, more binding constraints, such as voltage unbalance, conductor thermal capacity, and transformer overload, can be incorporated to establish a more comprehensive hosting capacity assessment. Furthermore, a critical time series for DG production and load demand needs to be developed to evaluate the hosting capacity enhanced with ANMs.