Dynamic Risk Assessment of Voltage Violation in Distribution Networks with Distributed Generation

In response to the growing demands of economic and social development, the integration of distributed generation (DG) into distribution networks has increased significantly. This paper proposes a dynamic risk assessment method for voltage violations in distribution networks with DG. Firstly, considering the characteristics of random variables such as load and DG, a probability density function estimation method based on boundary kernel density estimation is proposed. This method accurately models the probability distributions of the random variables under different time and external environmental conditions, such as wind speed and global horizontal radiation. Secondly, to address correlated DG in the same region, an independent transformation method based on the Rosenblatt inverse transform is proposed, which enhances the accuracy of probabilistic load flow. Thirdly, a voltage violation severity index based on the utility function is proposed. This index, in combination with the probabilistic load flow results, facilitates the quantitative assessment of voltage violation risks. Finally, the accuracy of the proposed method is verified on the IEEE-33 system.


Introduction
To reduce the consumption of high-carbon-emission fossil fuels, the renewable energy sector has witnessed rapid growth. Photovoltaics (PVs) and wind turbines (WTs) have been extensively constructed and integrated into the distribution network in the form of distributed generation (DG), gradually transforming the conventional passive distribution network into a complex active distribution network [1][2][3].
Due to the significant influence of natural environmental factors such as solar radiation, PVs exhibit instability, marked by pronounced randomness and fluctuation [4]. Similarly, WTs are constrained by natural factors like wind speed, resulting in strong randomness and uncertainty [5]. The substantial integration of DG and the fluctuating loads introduce various sources of uncertainty into the distribution network, significantly affecting its stable operation. This elevates the risk of voltage violations at network nodes, posing a considerable challenge to voltage control throughout the entire grid.
Hence, scholars have embarked on research pertaining to voltage violation risk assessment in distribution networks with DG [6][7][8]. However, the majority of these risk assessment studies rely primarily on historical DG data to determine "static risks" of voltage violations at network nodes. They do not take into account load variations at different times or the influence of external environmental factors on DG. For instance, load levels are typically lower during the late night than in the afternoon, and DG output can differ significantly under varying environmental conditions. These factors can lead to substantial variations in the voltage violation risk at the same node at different times on the same day, or at the same time on different days. Consequently, it becomes imperative to propose a dynamic risk assessment methodology for voltage violations in distribution networks with DG, which can offer information for the coordination and control of network node voltages. Given the complex operating conditions in active distribution networks, deterministic load flow analysis methods often yield imprecise results, necessitating the use of probabilistic load flow (PLF) for analytical computations.
PLF analysis methods can be broadly categorized into three main types: Monte Carlo Simulation (MCS) [9,10], the Point Estimate Method (PEM) [11,12], and analytical methods [13]. MCS simulates a large number of random variable samples through sampling, resulting in high accuracy but consuming a significant amount of computation time. The PEM approximates the moments of the output variables based on the characteristics of the input random variables, but its computational efficiency is closely tied to the number of variables. Analytical methods rely on relationships among the input random variables to calculate the probabilistic characteristics of the output random variables.

The key challenge of analytical methods lies in handling complex convolution calculations. Given the intricacy of convolution operations, computational efficiency can be enhanced by substituting convolution calculations with cumulant operations.
Currently, there are three main issues with using the probabilistic load flow based on the cumulant method (PLFCM) for dynamic risk assessment of voltage violations in distribution networks with DG.
The first issue pertains to the accuracy of the probability models for random variables such as load and DG, which directly affects the results of load flow calculations. There are currently two main approaches to describing probability models for random variables: parameter-based methods [14] and non-parameter-based methods [15]. Parameter-based methods begin by assuming that the research subject follows a certain probability distribution based on empirical observations. They then calculate the relevant parameters from actual samples to obtain a complete probability density function (pdf). However, due to the randomness of DG and the volatility of load, common probability density forms such as the Beta distribution and the Weibull distribution often fail to accurately reflect the actual probability distribution, leading to significant errors. Non-parameter-based methods, on the other hand, do not require prior assumptions about the underlying probability distribution model or prior knowledge. Instead, they analyze the probability distribution directly from actual data. Kernel density estimation (KDE) is the most widely used non-parametric method [16]. However, a notable characteristic of random variables like load and DG is that they are bounded. Traditional KDE methods, when applied to bounded data, can exhibit boundary effects that subsequently impact the results of probabilistic load flow. Additionally, another prominent feature of DG is its susceptibility to external environmental factors. Wind speed and global horizontal radiation are the most important factors affecting the output of WTs and PVs, respectively. How to achieve dynamic probability density estimation for DG under the influence of these factors remains an unresolved challenge.
The second issue pertains to the prerequisite that input variables must be mutually independent for the application of PLFCM. Because renewable energy sources in the same or neighboring areas share consistent characteristics, unit outputs often exhibit similar trends. The existence of correlations between input variables makes it impractical to employ this method directly for probabilistic load flow, necessitating the transformation of correlated outputs into independent ones. Presently, two widely used methods for achieving this are the orthogonal inverse transform [17] and the Nataf inverse transform [18]. The orthogonal inverse transform is straightforward and widely applied, while the Nataf inverse transform uses the correlation coefficient matrix, taking into account changes in the equivalent correlation coefficients before and after the transformation. However, both of these methods rely on the Pearson correlation coefficient and cannot capture nonlinear relationships among DGs. Moreover, they require variables to adhere to specific distribution types to achieve higher accuracy. Furthermore, the most challenging aspect of the Nataf inverse transform is determining the equivalent correlation coefficients. Although some scholars [19] have provided empirical formulas for a number of equivalent correlation coefficients for the reference of other researchers, these cover only a limited range of common probability distributions. How to achieve a more precise and widely applicable independence transformation for random variables is an issue that requires resolution.
The third issue is the establishment of a reasonable risk assessment framework, which is a key factor in achieving voltage violation risk assessment in distribution networks with DG. Risk is a comprehensive measure that combines the probability of an event occurring with the severity of its consequences [20]. When conducting risk assessments, both aspects should be taken into consideration. Traditional risk assessments often focus solely on the probability of a voltage violation occurring, without considering the severity of its consequences. Voltage violation events with a low probability but significant impact should receive more attention than those with a higher probability but less significant consequences. How to accurately portray the severity of the losses resulting from risk events is a currently unresolved issue.
To address the aforementioned issues, this paper proposes a dynamic risk assessment method for voltage violations in distribution networks with DG. Firstly, a pdf estimation method based on boundary kernel density estimation (BKDE) is proposed to overcome the problem of errors at boundary points when handling bounded data. Moreover, considering the significant impact of wind speed and global horizontal radiation on DG, conditional density is introduced to enable dynamic probability density estimation of DG based on numerical weather prediction (NWP). Secondly, an independent transformation method based on the Rosenblatt inverse transform is proposed. This method accurately characterizes the correlations between DG variables, achieving an independent transformation of variables and laying the foundation for the subsequent PLFCM. Thirdly, a voltage violation severity index based on the utility function is proposed. Combining this index with the probability of voltage violations, an integrated risk assessment framework is constructed. This framework quantitatively analyzes the dynamic risk of voltage violations in distribution networks with DG at both the node and system levels. Finally, simulation tests on the IEEE-33 system validate the rationality and effectiveness of the proposed method.

Pdf Estimation Based on BKDE
KDE is a data-driven, non-parametric method for estimating the pdf and has the advantage of being unaffected by the choice of prior models [21]. It is commonly used for data fitting when it is challenging to directly obtain the underlying pdf. This method is suitable for analyzing the probability characteristics of load and DG. Hence, in this paper, the KDE method is employed for estimation.

Given a set of independently and identically distributed data, X_1, ..., X_N, with an unknown density function f(x), the KDE is calculated as follows:

f̂(x) = (1/(Nh)) Σ_{i=1}^{N} K((x − X_i)/h)   (1)

where K(·) represents the kernel function; N represents the sample size; h represents the bandwidth; x represents the point at which the kernel density estimate is being calculated; and X_i represents the ith sample. The choice of the kernel function can influence the results of KDE. Generally, the direction of the sample X_i from the density point x does not affect the estimation, and the closer X_i lies to x, the more weight it should be assigned. Therefore, a unimodal kernel function centered at 0 is typically chosen. In this paper, the Epanechnikov kernel function is used:

K(z) = (3/4)(1 − z²) I(|z| ≤ 1)   (2)

where I(·) represents the indicator function, which has a value of 1 when z satisfies the condition and 0 otherwise. Given z = (x − X_i)/h and substituting (2) into (1), we can obtain the complete formula for the KDE.
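As a concrete illustration of (1) and (2), the following Python sketch implements an Epanechnikov-kernel KDE; the data and function names are illustrative, not taken from the paper:

```python
import numpy as np

def epanechnikov(z):
    """Epanechnikov kernel K(z) = 0.75*(1 - z^2) for |z| <= 1, else 0."""
    return np.where(np.abs(z) <= 1, 0.75 * (1.0 - z**2), 0.0)

def kde(x, samples, h):
    """Kernel density estimate f_hat(x) = (1/(N*h)) * sum_i K((x - X_i)/h)."""
    x = np.atleast_1d(x)
    z = (x[:, None] - samples[None, :]) / h
    return epanechnikov(z).sum(axis=1) / (len(samples) * h)

rng = np.random.default_rng(0)
samples = rng.normal(loc=0.0, scale=1.0, size=5000)
# Evaluate near the mode of N(0, 1); the true density there is 1/sqrt(2*pi) ~ 0.3989
est = kde(0.0, samples, h=0.3)[0]
print(est)
```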

pdf Estimation for Load
Assuming there are N_L load nodes in the distribution network, the load data at time i, denoted as P^L_i, can be expressed as follows:

P^L_i = [P^L_{i1}, P^L_{i2}, ..., P^L_{iN_L}]   (3)

where P^L_{ik} represents the local load power measurement for node k at time i. Additionally, if historical load power data are available for a total of T times, the historical load data, P^L, can be described as follows:

P^L = [P^L_1; P^L_2; ...; P^L_T]   (4)

In (4), the data in each load column can be decomposed, based on daily patterns, into two components: basic data with a daily cycle and random fluctuation data. Therefore, when performing PLFCM, the pdf for node k at time t can be calculated using the power data set for different days at the same time t, denoted as P^{L,k}_t:

P^{L,k}_t = {P^L_{t,k}, P^L_{t+T_d,k}, P^L_{t+2T_d,k}, ...}   (5)

where T_d represents the number of monitoring sample points within one day. However, a significant characteristic of load is that it is bounded: its power consumption is always greater than zero. Therefore, when performing KDE, it is necessary to consider the impact of boundary effects.
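The per-time-slot sample set P^{L,k}_t can be extracted from the historical load matrix with simple strided indexing; the following sketch uses hypothetical dimensions (24 hourly points per day, 30 days, 33 nodes):

```python
import numpy as np

# Hypothetical historical load matrix: T rows (time steps), N_L columns (nodes).
# With hourly sampling, T_d = 24 monitoring points per day.
T_d, n_days, N_L = 24, 30, 33
rng = np.random.default_rng(1)
base = 1.0 + 0.5 * np.sin(2 * np.pi * np.arange(T_d) / T_d)       # daily-cycle base
P_L = np.abs(base[None, :, None] + 0.1 * rng.standard_normal((n_days, T_d, N_L)))
P_L = P_L.reshape(n_days * T_d, N_L)                               # shape (T, N_L)

def same_time_samples(P_L, t, k, T_d=24):
    """P^{L,k}_t: measurements of node k taken at time-of-day t on different days."""
    return P_L[t::T_d, k]

samples = same_time_samples(P_L, t=11, k=4)
print(samples.shape)  # one sample per day
```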
The bias of the KDE is as follows:

Bias[f̂(x)] = (h²/2) σ_K² f''(x) + o(h²)   (6)

where σ_K² = ∫ u² K(u) du, and K(u) is the kernel function shown in (2). This bias diminishes as the bandwidth h decreases, approaching 0 at a rate of h². However, when the pdf has a boundary at 0, the above expression is no longer applicable. Instead, it is replaced by:

Bias[f̂(x)] = (a_0(x) − 1) f(x) − h a_1(x) f'(x) + O(h²)   (7)

where a_i(x) = ∫_{−1}^{x/h} u^i K(u) du. When x ≥ h, a_0(x) = 1 and a_1(x) = 0, so the bias is no different from (6). However, when 0 ≤ x < h, both a_0(x) − 1 and a_1(x) are non-zero, implying that bias is always present at the boundary.

In this case, when another kernel function L(x) is used to estimate f(x) once again, we have:

Bias[f̂_L(x)] = (b_0(x) − 1) f(x) − h b_1(x) f'(x) + O(h²)   (8)

where b_0 and b_1 are defined like a_0 and a_1 in (7), except that (8) is specific to L(x). By forming a linear combination of the estimates underlying (7) and (8) so that the leading bias terms cancel, we can obtain:

f̂_B(x) = c_0(x) f̂_K(x) + c_1(x) f̂_L(x)   (9)

Equation (9) can be equivalently viewed as an estimation of f(x) using a new kernel function:

K_B(u) = c_0(x) K(u) + c_1(x) L(u)   (10)

In particular, if L(x) is taken as x·K(x), K_B(x) has the simple form:

K_B(u) = (a_2(x) − a_1(x) u) K(u) / (a_0(x) a_2(x) − a_1²(x))   (11)

When x ≥ h, the new kernel function K_B is the same as K, but when 0 ≤ x < h (i.e., at the boundary), it adjusts the original kernel function. When x < 0, the estimated value is taken as 0.

Based on (1), (2), and (11), we can derive the calculation formula for BKDE by substituting (5), which provides the pdf for node k at time t.
Taking an exponential distribution with a parameter of 1 as an example, the performance of BKDE is tested. One thousand random numbers are generated from this exponential distribution, and pdf estimation is performed using both KDE and BKDE. Figure 1 shows the results of these two estimation methods compared to the actual pdf. Observing Figure 1, it can be seen that, compared to the true density function, the results obtained by KDE exhibit a noticeable decrease at the boundary. In contrast, the results obtained by BKDE closely match the real situation, and the bias issues near the boundary are significantly improved.
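The exponential-distribution experiment can be sketched as follows, using the standard linear boundary-kernel form K_B(u) = (a_2 − a_1 u)K(u)/(a_0 a_2 − a_1²); the sample size here (20,000) is chosen for a stable comparison, and all names are illustrative:

```python
import numpy as np

def K(z):
    """Epanechnikov kernel."""
    return np.where(np.abs(z) <= 1, 0.75 * (1 - z**2), 0.0)

def a(i, rho):
    """a_i(x) = integral_{-1}^{rho} u^i K(u) du, in closed form for Epanechnikov."""
    rho = min(max(rho, -1.0), 1.0)
    if i == 0:
        return 0.75 * (rho - rho**3 / 3) + 0.5
    if i == 1:
        return 0.75 * (rho**2 / 2 - rho**4 / 4) - 3.0 / 16
    return 0.75 * (rho**3 / 3 - rho**5 / 5) + 0.1          # i == 2

def bkde(x, samples, h):
    """Boundary-corrected KDE using K_B(u) = (a2 - a1*u) K(u) / (a0*a2 - a1^2)."""
    rho = x / h
    a0, a1, a2 = a(0, rho), a(1, rho), a(2, rho)
    u = (x - samples) / h
    kb = (a2 - a1 * u) * K(u) / (a0 * a2 - a1**2)
    return kb.sum() / (len(samples) * h)

# Replicate the experiment: draws from Exp(1), whose true density at x = 0 is 1
rng = np.random.default_rng(2)
samples = rng.exponential(1.0, size=20000)
h = 0.2
kde0 = K((0.0 - samples) / h).sum() / (len(samples) * h)   # plain KDE at the boundary
bkde0 = bkde(0.0, samples, h)                              # boundary-corrected estimate
print(kde0, bkde0)   # plain KDE roughly halves the true value; BKDE stays near 1
```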


Conditional pdf Estimation for DG
For DG, the generated power is constrained between 0 and its maximum capacity. After normalization by the maximum capacity, its double boundary is [0, 1]. Therefore, when calculating the kernel function shown in (11), the parameters a_i(x) are:

a_i(x) = ∫_{max(−1, (x−1)/h)}^{min(1, x/h)} u^i K(u) du   (12)

In addition to its bounded nature, another significant characteristic of DG is its strong correlation with external factors. For instance, PVs might exhibit significant variations in power data at different times of one day due to differences in radiation. Therefore, the probability model for DG at time t can be characterized jointly by the conditional pdf and the predicted values of the conditional variables at time t.
The conditional pdf for DG at time t can be represented as:

f_P(p_t | y_t) = f_{P,Y}(p_t, y_t) / f_Y(y_t)   (13)

where p_t is the DG output at time t, y_t is the predicted value of the conditional variable at time t, f_P is the conditional pdf for DG, f_Y is the pdf for the conditional variable, and f_{P,Y} is the joint pdf.
Based on Equations (1), (2), and (11), the conditional pdf for DG can be estimated as:

f̂_P(p_t | y_t) = [Σ_{i=1}^{N} K_B((p_t − P_i)/h_1) K((y_t − Y_i)/h_2)] / [h_1 Σ_{i=1}^{N} K((y_t − Y_i)/h_2)]   (14)

where P_i represents historical samples of DG, Y_i represents the corresponding historical samples of the conditional variable, and h_1 and h_2 are bandwidth parameters. Different types of DG correspond to different conditional variables. In this paper, the conditional variable for PVs is global horizontal radiation, while for WTs, it is wind speed.
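A minimal sketch of the conditional density estimate, assuming a Nadaraya-Watson-style weighting of historical DG outputs by the closeness of their weather variable to the forecast (boundary correction omitted for brevity; the data are synthetic):

```python
import numpy as np

def K(z):
    return np.where(np.abs(z) <= 1, 0.75 * (1 - z**2), 0.0)

def conditional_kde(p, y, P_hist, Y_hist, h1, h2):
    """Conditional density estimate f(p | y): kernel-weight historical DG
    outputs by how close their weather variable Y_i is to the forecast y."""
    w = K((y - Y_hist) / h2)                  # weather-similarity weights
    num = (K((p - P_hist) / h1) * w).sum() / h1
    return num / w.sum()

# Hypothetical data: normalized DG output rises with the conditional variable
rng = np.random.default_rng(3)
Y_hist = rng.uniform(0, 10, size=5000)                     # e.g. wind speed, m/s
P_hist = np.clip(Y_hist / 10 + 0.05 * rng.standard_normal(5000), 0, 1)

# The density of output 0.8 is high under a high-wind forecast, near zero otherwise
f_high = conditional_kde(0.8, 8.0, P_hist, Y_hist, h1=0.1, h2=0.5)
f_low = conditional_kde(0.8, 2.0, P_hist, Y_hist, h1=0.1, h2=0.5)
print(f_high, f_low)
```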

PLFCM Based on the Rosenblatt Inverse Transform

Independent Transform Based on the Rosenblatt Inverse Transform

Joint pdf for DG

Due to the characteristics of wind and solar energy sources, DGs located in close geographical proximity exhibit some level of similarity in their output, showing spatial correlation. The higher the spatial correlation among DGs, the stronger the synchronization between different DGs, making PLFCM more challenging. Therefore, in order to establish a foundation for the subsequent Rosenblatt inverse transform and PLFCM, it is essential to accurately characterize the joint pdf of DG.
Assuming that the multivariate data P^D_{i1}, ..., P^D_{id} (i = 1, ..., N) are historical data from d different DGs (P^D_1, ..., P^D_d), their joint density function can be calculated through KDE using the following equation:

f̂(p_1, ..., p_d) = (1/N) Σ_{i=1}^{N} Π_{j=1}^{d} (1/h_j) K_j((p_j − P^D_{ij})/h_j)   (15)

where h_1, ..., h_d are bandwidth parameters, and K_j(·) is the kernel function corresponding to the variable P^D_j (j = 1, ..., d). Taking into account the boundary characteristics of DG and the external conditional factors, the joint pdf for DG at time t (Equation (16)) is obtained by replacing each K_j in (15) with the boundary kernel of (11) and conditioning on the predicted values of the external variables, as in (14).
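The product-kernel joint density estimate can be sketched as follows on synthetic data for two correlated PV plants (boundary correction again omitted for brevity):

```python
import numpy as np

def K(z):
    return np.where(np.abs(z) <= 1, 0.75 * (1 - z**2), 0.0)

def joint_kde(point, data, bandwidths):
    """Product-kernel joint density estimate for d correlated DG outputs:
    f_hat(p) = (1/N) * sum_i prod_j K((p_j - P_ij)/h_j) / h_j."""
    z = (np.asarray(point)[None, :] - data) / bandwidths      # shape (N, d)
    return np.prod(K(z) / bandwidths, axis=1).mean()

# Hypothetical correlated outputs of two nearby PV plants (normalized to [0, 1])
rng = np.random.default_rng(4)
common = rng.uniform(0.2, 0.8, size=4000)
data = np.column_stack([
    np.clip(common + 0.05 * rng.standard_normal(4000), 0, 1),
    np.clip(common + 0.05 * rng.standard_normal(4000), 0, 1),
])

h = np.array([0.08, 0.08])
on_diag = joint_kde([0.5, 0.5], data, h)    # consistent with the correlation
off_diag = joint_kde([0.2, 0.8], data, h)   # contradicts it
print(on_diag, off_diag)
```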

Independent Transform
The Rosenblatt transform can directly convert a set of correlated non-normal variables into a set of independent standard normal variables. According to the principle of equiprobability marginal transformation, it can be expressed as follows:

Φ(u_1) = F(p_1)
Φ(u_2) = F(p_2 | p_1)
...
Φ(u_d) = F(p_d | p_1, ..., p_{d−1})   (17)

where Φ(·) represents the cumulative distribution function (CDF) of the standard normal distribution, and F(·) represents a (conditional) CDF.

Taking the inverse of (17) yields the independent standard normal variables U^I, which can be expressed as:

u_k = Φ^{−1}(F(p_k | p_1, ..., p_{k−1})), k = 1, ..., d   (18)

Equation (18) is known as the Rosenblatt transform. It is not influenced by the distribution type or the correlation type and is considered a precise transformation method. The conditional CDFs can be obtained by integrating the joint pdf derived from (16).
Through its inverse transform, standard normal variables can be transformed into independent samples of DG. The basic steps are as follows:
(1) Generate m samples of each of the independent standard normal variables U^I_1, U^I_2, ..., U^I_n.
(2) Use (18) to obtain, by inversion, m samples of the first DG, denoted as u^C_1 (Equation (19)).
(3) Extend the procedure to all n DGs, conditioning each inversion on the samples already obtained (Equation (20)).

Linearization of the Load Flow Equation

Considering the random variation in node injection power, the polar form of the system power flow equation is Taylor-expanded at the base operating point. Retaining only the first-order term yields:

ΔX = J_0^{−1} ΔW = S_0 ΔW,  ΔZ = G_0 J_0^{−1} ΔW = T_0 ΔW   (21)

where ΔX, ΔW, and ΔZ represent the perturbations in node state variables, node injection power variables, and branch power variables, respectively. S_0 and T_0 are sensitivity matrices, J_0 is the Jacobian matrix obtained in the last iteration of the power flow calculation, and G_0 = (∂Z/∂X)|_{X=X_0}.
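The Rosenblatt inverse-transform sampling steps above can be sketched numerically for two correlated DGs, using empirical CDFs as a stand-in for integrating the joint pdf; everything here (kernel weights, toy data, bandwidth) is illustrative:

```python
import math
import numpy as np

def phi(u):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))

def inv_ecdf(q, sorted_samples):
    """Invert an empirical CDF by taking the q-quantile of the samples."""
    idx = min(int(q * len(sorted_samples)), len(sorted_samples) - 1)
    return sorted_samples[idx]

def rosenblatt_inverse(u1, u2, P1, P2, h=0.3):
    """Map independent standard normals (u1, u2) to a correlated pair (p1, p2):
    p1 from the marginal ECDF of P1, p2 from the kernel-weighted conditional
    ECDF of P2 given p1."""
    p1 = inv_ecdf(phi(u1), np.sort(P1))
    w = np.maximum(1.0 - ((p1 - P1) / h) ** 2, 0.0)     # Epanechnikov weights
    order = np.argsort(P2)
    cum = np.cumsum(w[order]) / w.sum()                  # conditional ECDF of P2 | p1
    p2 = P2[order][np.searchsorted(cum, phi(u2)).clip(0, len(P2) - 1)]
    return p1, p2

# Hypothetical correlated historical outputs of two DGs
rng = np.random.default_rng(5)
P1 = rng.uniform(0, 1, 20000)
P2 = np.clip(P1 + 0.1 * rng.standard_normal(20000), 0, 1)

# Step (1): independent standard normal samples; steps (2)-(3): inverse transform
samples = np.array([rosenblatt_inverse(rng.standard_normal(), rng.standard_normal(), P1, P2)
                    for _ in range(400)])
corr = np.corrcoef(samples[:, 0], samples[:, 1])[0, 1]
print(corr)   # the generated pairs reproduce the historical correlation
```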

Cumulant Computation
Cumulants can be calculated from origin moments of no higher order, and in this paper, only the first eight orders are considered. The relationship between the various orders of cumulants and origin moments, as well as the specific derivation process, can be found in [22].
For load, the origin moments of all orders can be calculated using the density function obtained from its historical data based on BKDE:

α_v = ∫ x^v f̂(x) dx   (22)

where α_v represents the vth order origin moment.
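The moment-to-cumulant conversion mentioned above can be sketched with the standard recursion κ_n = α_n − Σ_{k=1}^{n−1} C(n−1, k−1) κ_k α_{n−k}; the Exp(1) distribution, whose nth cumulant is (n−1)!, serves as a check:

```python
import math

def cumulants_from_moments(m):
    """Convert raw (origin) moments m[0]=alpha_1, m[1]=alpha_2, ... into
    cumulants via kappa_n = alpha_n - sum_{k=1}^{n-1} C(n-1, k-1) kappa_k alpha_{n-k}."""
    kappa = []
    for n in range(1, len(m) + 1):
        s = sum(math.comb(n - 1, k - 1) * kappa[k - 1] * m[n - k - 1]
                for k in range(1, n))
        kappa.append(m[n - 1] - s)
    return kappa

# Exponential(1): alpha_n = n!, and the first eight cumulants come out as (n-1)!
moments = [math.factorial(n) for n in range(1, 9)]
print(cumulants_from_moments(moments))  # [1, 1, 2, 6, 24, 120, 720, 5040]
```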
For DG variables with correlation, it is necessary to first generate mutually independent samples following the standard normal distribution. Then, using the Rosenblatt inverse transform, independent samples of the DG variables can be obtained. Based on these, the cumulants of each variable can be calculated using the relationship between origin moments and cumulants.
Once the cumulants of the node injection power changes, ΔW, are computed, the cumulants of the node state changes, ΔX, and of the branch power changes, ΔZ, for each order can be determined using the following equation:

ΔX^(k) = S_0^(k) ΔW^(k),  ΔZ^(k) = T_0^(k) ΔW^(k),  ΔW^(k) = ΔW_G^(k) + ΔW_L^(k)   (23)

where the superscript (k) denotes kth order cumulants, S_0^(k) and T_0^(k) are the matrices whose elements are the kth powers of the elements of S_0 and T_0, and ΔW_G^(k) and ΔW_L^(k) represent the kth order cumulants of the changes in DG injection power and load injection power, respectively.
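For independent inputs and a linearized map, kth-order cumulants propagate through the element-wise kth power of the sensitivity matrix; a minimal sketch with illustrative numbers:

```python
import numpy as np

# Sensitivity matrix S0 from the linearized load flow (illustrative 2x2 values)
S0 = np.array([[0.5, -0.2],
               [0.1, 0.4]])

# kth-order cumulants of two injection-power perturbations (k = 1..4);
# rows: order k, columns: injection variable
dW = np.array([[1.0, 2.0],
               [1.0, 3.0],
               [2.0, 1.0],
               [6.0, 4.0]])

# dX^(k) = S0^(k) dW^(k), where S0^(k) is the element-wise kth power of S0
dX = np.array([(S0 ** k) @ dW[k - 1] for k in range(1, 5)])
print(dX)
```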

Cornish-Fisher Series Expansion
Series expansion can approximate the cumulants of output variables as probability distributions. According to [23], the Cornish-Fisher series provides higher accuracy. Therefore, in this study, the CDF of the output variables is obtained using the Cornish-Fisher series.

The Cornish-Fisher series expansion entails first selecting a specific quantile, then computing the corresponding quantile of the state variable, and finally deriving the cumulative distribution of that variable. If α denotes a quantile level of the random variable X, x(α) can be expressed as follows:

x(α) = ζ(α) + (ζ²(α) − 1) g_3/6 + (ζ³(α) − 3ζ(α)) g_4/24 − (2ζ³(α) − 5ζ(α)) g_3²/36 + ⋯   (24)

where ζ(α) = Φ^{−1}(α), and g_v represents the vth order normalized cumulant. By utilizing the relationship x(α) = F^{−1}(α), the CDF F(x) of the output random variable X can be determined, thereby providing the probability of node voltage violation.
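A sketch of the expansion using its leading terms (skewness and kurtosis corrections); with all higher normalized cumulants set to zero it reduces to the normal quantile, which serves as a sanity check:

```python
import math

def cornish_fisher_quantile(alpha_q, mean, std, g3=0.0, g4=0.0):
    """Leading terms of the Cornish-Fisher expansion: map a quantile level to
    a value of X using its normalized skewness (g3) and kurtosis (g4)."""
    # zeta = inverse standard normal CDF; math has no erfinv, so use bisection
    lo, hi = -10.0, 10.0
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if 0.5 * (1 + math.erf(mid / math.sqrt(2))) < alpha_q:
            lo = mid
        else:
            hi = mid
    zeta = 0.5 * (lo + hi)
    x = (zeta
         + (zeta**2 - 1) * g3 / 6
         + (zeta**3 - 3 * zeta) * g4 / 24
         - (2 * zeta**3 - 5 * zeta) * g3**2 / 36)
    return mean + std * x

# With g3 = g4 = 0 the expansion reduces to the normal quantile:
q95 = cornish_fisher_quantile(0.95, mean=1.0, std=0.05)
print(round(q95, 4))   # 1.0 + 0.05 * 1.6449 = 1.0822
```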

The Voltage Violation Risk Assessment Metric Based on the Utility Function
This section provides a comprehensive assessment of node voltage violation risk in distribution networks with DG by considering both the probability of voltage violation and the severity of such violations. This approach allows for a quantitative evaluation at both the node and system levels, serving as the basis for coordinated voltage control.

The Probability of Voltage Violation
The voltage violation probability refers to the likelihood that a node voltage deviates from the permissible range, encompassing both overvoltage and undervoltage conditions. It can be determined from the PLFCM analysis described above:

r(V_i) = F(V_i^min) + 1 − F(V_i^max)   (25)

where r(V_i) represents the voltage violation probability for node i, F(V_i) represents the voltage CDF for node i, and V_i^min and V_i^max represent the lower and upper permissible voltage limits for node i, respectively.
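The violation probability r(V_i) = F(V_i^min) + 1 − F(V_i^max) is straightforward to evaluate once the CDF is known; a sketch with an illustrative normal voltage distribution:

```python
import math

def normal_cdf(x, mu, sigma):
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

def violation_probability(F, v_min, v_max):
    """r(V) = F(V_min) + 1 - F(V_max): undervoltage plus overvoltage mass."""
    return F(v_min) + 1 - F(v_max)

# Illustrative node: voltage ~ N(1.02, 0.02^2) p.u., limits [0.95, 1.05] p.u.
F = lambda v: normal_cdf(v, 1.02, 0.02)
r = violation_probability(F, 0.95, 1.05)
print(round(r, 4))   # about 0.067 for these illustrative numbers
```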

The Severity of Voltage Violation
Voltage violation severity reflects the level of harm caused to the system and equipment when the voltage deviates from permissible values. The traditional voltage violation severity function, denoted as S_e, is represented by linear functions. In reality, for many electrical devices the consequences become more severe as the voltage deviation increases. For instance, a slight voltage deviation from the permissible range might reduce product quality in manufacturing equipment, and as the deviation deepens, it can even compromise equipment safety, resulting in significant consequences such as equipment damage. Therefore, the severity assessment function should become more sensitive as the deviation grows. This paper employs the utility function to define the severity function of node voltage violation, where k represents the risk factor (a larger value indicates greater sensitivity to risk) and θ(V_i) represents the voltage deviation index.

The Comprehensive Assessment of Voltage Violation
Taking into account both the voltage violation probability and the severity of such violations, the comprehensive risk index for voltage violation of node i, R_i, and the system-level comprehensive risk index, R_s, can be defined accordingly, where n represents the number of system nodes, and f(V_i) represents the voltage pdf of node i, which can be derived from F(V_i) using numerical differentiation.
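The paper's exact severity and risk expressions are not reproduced in this excerpt; the following sketch therefore assumes an exponential utility S_e = (e^{kθ} − 1)/(e^k − 1) and a simple relative deviation index, and evaluates the node risk as the expectation of severity under the voltage pdf. It is a hedged stand-in, not the paper's definition:

```python
import math

def severity(theta, k=5.0):
    """Assumed utility-based severity: grows faster than linearly in theta."""
    return (math.exp(k * theta) - 1) / (math.exp(k) - 1)

def node_risk(f_pdf, v_min, v_max, k=5.0, n_grid=4000, lo=0.8, hi=1.2):
    """R_i = E[S_e]: integrate severity times the voltage pdf outside the
    limits (rectangle rule over an assumed [lo, hi] p.u. support)."""
    dv = (hi - lo) / n_grid
    risk = 0.0
    for j in range(n_grid):
        v = lo + (j + 0.5) * dv
        if v < v_min:
            theta = (v_min - v) / v_min          # assumed deviation index
        elif v > v_max:
            theta = (v - v_max) / v_max
        else:
            continue
        risk += severity(min(theta, 1.0), k) * f_pdf(v) * dv
    return risk

# Illustrative node voltage pdf: N(1.02, 0.02^2), limits [0.95, 1.05] p.u.
f = lambda v: math.exp(-0.5 * ((v - 1.02) / 0.02) ** 2) / (0.02 * math.sqrt(2 * math.pi))
risk_val = node_risk(f, 0.95, 1.05)
print(risk_val)
```

Because the severity stays well below 1 for small deviations, the risk index is much smaller than the bare violation probability, which is the intended weighting behavior.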


The Process of the Proposed Method
The method proposed in this paper enables dynamic risk assessment of voltage violation in distribution networks with DG, which is crucial for coordinated voltage control. The specific flow chart of the method is depicted in Figure 2 and comprises three main steps:

Step 1: Probability Density Function Estimation. Accurate pdfs are obtained through BKDE, utilizing historical data for both load and DG, while also accounting for the influence of external factors.

Step 2: Probabilistic Load Flow Calculation. Considering the correlation between DGs, independent DG variables are obtained through the Rosenblatt inverse transform. Based on these variables, PLFCM is applied to compute the probability distribution of node voltages.

Step 3: Risk Assessment. The severity of voltage violation at each node is considered through the application of the utility function; combined with the results of PLFCM, this yields dynamic risk assessment outcomes at both the node and system levels.


Case Study
Due to confidentiality requirements, real grid data are difficult to obtain, so the case study in this paper is conducted on an improved IEEE-33 system, as depicted in Figure 3. A WT with a capacity of 600 kW is connected to node 26, while two PVs are connected to node 33 and node 15 with capacities of 500 kW and 450 kW, respectively. The daytime DG penetration rate is approximately 30%. The output data for the WT and PVs are real, derived from a WT and PVs located in a region in northwest China, along with historical weather records providing the corresponding wind speed and global horizontal radiation data. To better simulate the probability density of the 24 h base load, the ratio of the load for each hour relative to the maximum load of that node is defined according to the solid line in Figure 4. In accordance with daily patterns, the load can be decomposed into two components: periodic base data and random fluctuation data. To reflect the load's volatility, the standard deviation of the load is set to 20%. Following the three-sigma rule, the load ratios at each time can be considered to fall within the shaded area of Figure 4. In this section, the accuracy of pdf modeling on real data using BKDE and KDE is first compared. Subsequently, the effectiveness of the proposed method for risk assessment is validated through simulations in two cases: abundant radiation and high wind speed resulting in higher DG output during the daytime, and no radiation at night coupled with low wind speed leading to reduced DG output.

The Performance of pdf Modeling
Section 2.1 compares the pdf modeling results of BKDE and KDE on simulation data through a simple example. Here, the modeling accuracy of the two is compared again using real DG data. Since the true pdf of the DG data is unknown, the relative error between the mean and variance of the modeled pdf and those of the real measured data is used as the comparison criterion. For the measured PV data, the nighttime periods during which the PV produces no power were removed. The results are presented in Table 1. Figures 5-7 show histograms of the real DG data as well as the pdfs calculated by BKDE and KDE.
The results from Figures 5-7 reveal that KDE exhibits a segment of the density curve near the boundary points that markedly contradicts the actual histogram trend, whereas BKDE yields results consistent with the actual trend. It can be affirmed that the results from BKDE are more in line with the actual data-based histogram than those of KDE.


The results from Figures 5-7 reveal that KDE exhibits a segment of the density curve near the boundary points that markedly contradicts the actual histogram trend, whereas BKDE yields results consistent with the actual trend. It can be affirmed that the results from BKDE are more in line with the actual data-based histogram than those of KDE.
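The boundary behavior can be illustrated with a minimal sketch. It contrasts a plain Gaussian KDE, which leaks probability mass past a physical boundary such as zero power output, with a boundary-corrected estimate. The reflection correction used here is a standard stand-in, not the paper's exact boundary kernel, and the exponential toy data, bandwidth `h`, and grid are illustrative.

```python
import numpy as np

def kde(x, data, h):
    """Standard Gaussian KDE; leaks probability mass below a physical boundary."""
    u = (x[:, None] - data[None, :]) / h
    return np.exp(-0.5 * u**2).sum(axis=1) / (len(data) * h * np.sqrt(2.0 * np.pi))

def boundary_kde(x, data, h, lower=0.0):
    """Boundary-corrected KDE via reflection: kernel mass that would fall
    below `lower` is folded back, so the density integrates to 1 on
    [lower, inf). Illustrative stand-in for the paper's BKDE."""
    reflected = 2.0 * lower - data        # mirror samples across the boundary
    dens = kde(x, data, h) + kde(x, reflected, h)
    dens[x < lower] = 0.0
    return dens

# Toy data bounded below by 0, like PV or WT output
rng = np.random.default_rng(0)
data = rng.exponential(scale=1.0, size=2000)
h = 0.2
x = np.linspace(0.0, 5.0, 200)
bk = boundary_kde(x, data, h)
```

Near the boundary the corrected estimate roughly doubles the plain KDE value, which is exactly the mass a plain KDE would otherwise spill into the infeasible region.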
The relative errors in Table 1 are computed as RE = |Value_c − Value_r| / |Value_r| × 100%, where Value_c represents the mean or variance of the modeled pdf, and Value_r represents the mean or variance of the real measured data.
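Under the assumption that the modeled pdf is available on a uniform grid, the comparison criterion can be sketched as follows; `pdf_moments` and the grid are illustrative helpers rather than the paper's code.

```python
import numpy as np

def pdf_moments(x, pdf):
    """Mean and variance of a density sampled on a uniform grid (Riemann sum)."""
    dx = x[1] - x[0]
    mean = np.sum(x * pdf) * dx
    var = np.sum((x - mean) ** 2 * pdf) * dx
    return mean, var

def relative_error(value_c, value_r):
    """RE(%) = |Value_c - Value_r| / |Value_r| * 100."""
    return abs(value_c - value_r) / abs(value_r) * 100.0
```

The moments of the BKDE and KDE densities are computed this way and compared against the sample mean and variance of the measured data to fill in Table 1.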

Case 1: High DG Output
Assume that, based on NWP, the wind speed forecast for the WT at 11 a.m. on a certain day is 10 m/s, and the global horizontal radiation at PV1 is 850 W/m², while at PV2 it is 900 W/m². Based on the BKDE, the pdfs and CDFs of WT, PV1, and PV2 can be obtained, as shown in Figure 8, along with the joint pdfs and joint CDFs between these variables, as illustrated in Figures 9-11. From these figures, it is evident that WT, PV1, and PV2 do not satisfy the condition of mutual independence. Therefore, the PLFCM cannot be performed directly, and an independent transformation is required.
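The idea of conditioning the output pdf on an NWP forecast can be sketched with a kernel conditional-density estimate. The Nadaraya-Watson form below is an illustrative stand-in for the paper's BKDE-based conditional modeling; the synthetic data, bandwidths `hx`/`hy`, and the linear output-versus-forecast relation are assumptions.

```python
import numpy as np

def gauss(u):
    return np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)

def conditional_pdf(y_grid, x_obs, y_obs, x0, hx, hy):
    """Kernel estimate of f(y | x = x0): the density of DG output y given a
    forecast external variable x (wind speed or radiation)."""
    w = gauss((x_obs - x0) / hx)          # weight samples near the forecast value
    w = w / w.sum()
    k = gauss((y_grid[:, None] - y_obs[None, :]) / hy) / hy
    return (k * w[None, :]).sum(axis=1)

# Synthetic example: output rises roughly linearly with the forecast variable
rng = np.random.default_rng(0)
x_obs = rng.uniform(0.0, 2.0, 4000)
y_obs = 2.0 * x_obs + rng.normal(0.0, 0.1, 4000)
y_grid = np.linspace(0.0, 5.0, 501)
dens = conditional_pdf(y_grid, x_obs, y_obs, x0=1.0, hx=0.05, hy=0.1)
```

Evaluating the estimate at the forecast value x0 yields a density concentrated around the output level historically observed under similar conditions, which is the role the conditional pdfs of WT, PV1, and PV2 play here.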
After performing the Rosenblatt inverse transformation, the correlated DG variables were transformed into independent samples. Tables 2 and 3 present the Pearson, Kendall, and Spearman correlation coefficients between WT, PV1, and PV2 before and after the transformation. The results indicate that, after the transformation, the absolute values of the correlation coefficients between the DG variables were largely reduced to below 0.1, significantly weakening their correlations and rendering them nearly independent. In this paper, the accuracy of the proposed method is validated against the MCS method for PLF. Additionally, a method based on the Orthogonal inverse transform is employed to calculate the PLF for comparison. To account for the correlation, the MCS method used the DG data that originally had correlations, before the independence transformation, and employed random sampling based on the joint pdf of the DG variables obtained previously.
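A minimal sketch of the Rosenblatt transform on a correlated Gaussian pair shows how the conditional-CDF mapping removes the dependence; the closed-form conditional CDF used here is specific to this toy case, whereas the paper estimates the required CDFs from the BKDE joint pdfs.

```python
import numpy as np
from math import erf

def phi(z):
    """Standard normal CDF, applied element-wise."""
    return np.array([0.5 * (1.0 + erf(v / np.sqrt(2.0))) for v in z])

rng = np.random.default_rng(1)
rho, n = 0.8, 20000

# Two correlated variables standing in for DG outputs in the same region
x1 = rng.standard_normal(n)
x2 = rho * x1 + np.sqrt(1.0 - rho ** 2) * rng.standard_normal(n)

# Rosenblatt transform: u1 = F1(x1), u2 = F_{2|1}(x2 | x1).
# For this Gaussian toy case the conditional CDF is known in closed form.
u1 = phi(x1)
u2 = phi((x2 - rho * x1) / np.sqrt(1.0 - rho ** 2))

before = np.corrcoef(x1, x2)[0, 1]
after = np.corrcoef(u1, u2)[0, 1]
```

The transformed pair (u1, u2) is a set of independent uniform variates, so its sample correlation collapses toward zero, mirroring the drop below 0.1 reported in Tables 2 and 3; applying the inverse transform maps independent samples back onto the correlated joint distribution.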
In total, the MCS method conducted 20,000 random samples, performed deterministic load flow, and obtained the pdf of the node voltages. Figure 12 illustrates the voltage pdf of the three DG-connected nodes and the last node, node 18. Due to space limitations, it is difficult to plot the pdf of all nodes under the three different methods. Therefore, in this paper, the results of the MCS method are used as the standard to calculate the relative errors of the mean and variance obtained by both the PLFCM with Rosenblatt inverse transform and the PLFCM with Orthogonal inverse transform. The results are shown in Table 4. Observing Figure 12, it can be noted that the results obtained from the PLFCM based on BKDE and the Rosenblatt inverse transformation are in good agreement with the outcomes of the MCS method. Node 26 is located in the middle of the radial distribution network and has a WT connection, which is why its voltage remains within the allowable range without violations. However, nodes 33, 15, and 18 are situated at the endpoints of the distribution network. Despite having DG support at these nodes or in their vicinity, voltage violations below the lower limit still occur due to the impedance losses in the transmission lines.
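For orientation, the MCS procedure can be sketched on a toy single-feeder voltage-drop model; the model, its parameters, and the load/DG distributions below are hypothetical stand-ins for the deterministic load flow runs on the IEEE-33 system.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 20000  # same sample count as the MCS benchmark

# Toy one-feeder voltage-drop model in per unit, V ≈ V0 - R * (P_load - P_dg).
# V0, R, and the load/DG distributions are hypothetical; the paper instead
# runs a full deterministic load flow for each sampled operating point.
V0, R = 1.0, 0.08
p_load = rng.normal(1.0, 0.15, n)                     # random load
p_dg = np.clip(rng.normal(0.4, 0.2, n), 0.0, None)    # non-negative DG output

v = V0 - R * (p_load - p_dg)      # one voltage sample per MCS draw
p_violation = np.mean(v < 0.95)   # violation probability vs. lower limit 0.95
```

The histogram of `v` is the empirical voltage pdf at the node, and the fraction of samples below the limit is the violation probability later combined with the severity index.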
Moreover, as can be seen from Table 4, the error of the PLFCM based on the Orthogonal inverse transform is significantly larger than that of the PLFCM based on the Rosenblatt inverse transform proposed in this paper. This is because the Orthogonal inverse transform is unable to characterize the nonlinear relationship between the DG variables.
The computation times of the three methods are shown in Table 5. Comparing them, the computation time of the PLFCM with Rosenblatt inverse transform and the PLFCM with Orthogonal inverse transform is greatly reduced compared with the MCS method, improving computational efficiency. Overall, the method proposed in this paper achieves high accuracy and prediction precision while greatly reducing the calculation time.
The corresponding risk indexes can be calculated from the node voltage violation probability combined with the severity of the violation. A risk factor of three is assigned to regular nodes, while a risk factor of four is assigned to DG-connected nodes. Table 6 presents the voltage violation probability and risk indexes for each node. Nodes with risk indexes above 0.01 are considered high risk, those between 0.0001 and 0.01 are categorized as medium risk, values below 0.0001 are classified as low risk, and a risk index of 0 indicates no risk. Figure 13 illustrates the risk zones of the system at the different levels at 11 a.m.
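The risk computation described above can be sketched as follows; the exponential utility-style severity function and its coefficient `k` are assumptions standing in for the paper's utility function, while the risk factors and level thresholds follow the text.

```python
import numpy as np

def severity(v, v_min=0.95, v_max=1.05, k=5.0):
    """Violation severity as a utility-style function of violation depth.
    The expm1(k * depth) form and the value of k are illustrative
    assumptions; the paper's exact utility function is not reproduced."""
    depth = np.maximum(v_min - v, 0.0) + np.maximum(v - v_max, 0.0)
    return np.expm1(k * depth)

def risk_index(v_samples, risk_factor=3.0):
    """Expected severity over the voltage samples (probability-weighted
    severity), scaled by the node risk factor (3 regular, 4 DG-connected)."""
    return risk_factor * float(np.mean(severity(np.asarray(v_samples))))

def risk_level(r):
    """Thresholds from the text: >0.01 high, 0.0001-0.01 medium, <0.0001 low."""
    if r == 0.0:
        return "no risk"
    if r < 1e-4:
        return "low"
    if r < 1e-2:
        return "medium"
    return "high"
```

Averaging the severity over the Monte Carlo voltage samples implicitly weights it by the violation probability, since non-violating samples contribute zero severity.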

Case 2: Low DG Output
Assume that, based on NWP, the wind speed forecast for the WT at 9 p.m. on a certain day is 6 m/s. Since it is night-time, the global horizontal radiation is zero, resulting in no power output from PV1 and PV2. Utilizing the BKDE, the pdf and CDF of the WT can be obtained, as shown in Figure 14.
As only one DG is generating power at this moment, there is no correlation among multiple DGs. Hence, the PLFCM can be performed directly, with no need for an independent transformation. Therefore, only two methods are used in this section for calculating the pdf of the voltage: the PLFCM and the MCS. The sampling number of the MCS is still 20,000. The pdfs of the voltage at the WT-connected node and the last node, node 18, are shown in Figure 15. Table 7 lists the relative errors of the mean and variance at each node, and Table 8 gives the computation time of both methods.

From Figure 15, it can be observed that despite the relatively low WT output, node 26, due to its location and reduced load, maintains its voltage within the normal range. However, the end node 18 experiences consistently low voltage levels during the night due to the lack of nearby PV power support. The voltage violation probability and risk indexes for each node at this time are presented in Table 9. Figure 16 illustrates the risk levels of the system at 9 p.m.

Conclusions
A dynamic risk assessment method for voltage violation in distribution networks with DG is proposed in this paper, whose main contributions are the following three points.
Firstly, a pdf modeling method based on BKDE and conditional density is proposed. This method provides a more accurate representation of stochastic variables with boundary constraints, such as load and DG. Simulation results on actual DG data indicate that the relative errors of the mean and variance using this method are approximately one-third of those obtained with KDE.
Secondly, to address the correlation between DG variables within the same region, an independent transformation method based on the Rosenblatt inverse transformation is proposed. This method is capable of reducing the absolute values of the correlation coefficients between random variables to below 0.1, laying the foundation for an accurate PLFCM. Compared to the PLFCM based on the Orthogonal inverse transform, the proposed method reduces the relative errors of the mean and variance by one to two orders of magnitude, with comparable computational times.
Finally, an index for the severity of voltage violation, based on a utility function, is proposed.This index, combined with the voltage violation probability, quantitatively characterizes the risk of voltage violation at nodes and provides essential information for coordinated voltage control.


Figure 1 .
Figure 1. Results of the two estimation methods with the actual pdf.

and comprises three main steps:
Entropy 2023, 25, x FOR PEER REVIEW

Figure 2 .
Figure 2. The flow chart of the proposed method.


Based on the three-sigma rule, it can be considered that the load ratios at each time fall within the shaded area of Figure 4.

Figure 4 .
Figure 4. Load ratio at different times.


Figure 5 .
Figure 5. Histograms of the WT and the pdfs calculated by BKDE and KDE.

Figure 6 .
Figure 6. Histograms of the PV1 and the pdfs calculated by BKDE and KDE.

Figure 7 .
Figure 7. Histograms of the PV2 and the pdfs calculated by BKDE and KDE.


Figure 9 .
Figure 9. The joint pdf and joint CDF between WT and PV1.


Figure 10 .
Figure 10. The joint pdf and joint CDF between WT and PV2.

Figure 8 .
Figure 8. The pdfs and CDFs of WT, PV1, and PV2.


Figure 11 .
Figure 11. The joint pdf and joint CDF between PV1 and PV2.


Figure 13 .
Figure 13. Risk level chart for system in case 1.


Figure 14 .
Figure 14. The pdf and CDF of WT.


Figure 16 .
Figure 16. Risk level chart for system in case 2.

Table 1 .
The relative errors of BKDE and KDE.


Table 4 .
The relative errors of PLFCM with Rosenblatt inverse transform and PLFCM with Orthogonal inverse transform.


Table 5 .
Computation time of three methods.

Table 6 .
The voltage violation probability and risk indexes for the nodes in case 1.


Table 7 .
The relative errors of PLFCM.

Table 8 .
Computation time of two methods.


Table 9 .
The voltage violation probability and risk indexes for the nodes in case 2.
