In this section, we first fix the notation and introduce some basic concepts from graph theory. The problem of interest is then formulated constructively. In particular, the optimal power flow problem of distribution networks under uncertainty is addressed through a distributionally robust chance-constrained formulation.
  2.1. Notations and Basic Graph Theory
The symbol $\mathbf{1}$ represents the column vector of all ones with appropriate dimension. We use calligraphic font, e.g., $\mathcal{X}$, to represent a set and boldface font, e.g., $\mathbf{x}$, to represent a vector. For a vector $\mathbf{x}$, its upper and lower bounds are denoted as $\overline{\mathbf{x}}$ and $\underline{\mathbf{x}}$, respectively. Let $\mathbb{P}$ be the true distribution and $\mathbb{E}_{\mathbb{P}}$ be the expectation under distribution $\mathbb{P}$. For a given support $\Xi$, $\mathcal{P}(\Xi)$ denotes the set of all probability distributions with support $\Xi$. For an arbitrary set $\mathcal{X}$, $|\mathcal{X}|$ denotes its cardinality. Throughout this article, a tilde over a letter, as in $\tilde{x}$, marks an uncertain variable.
Let $\mathcal{G} = (\mathcal{N}, \mathcal{E})$ be a graph, in which $\mathcal{N}$ and $\mathcal{E}$ are the sets of all nodes and edges, respectively. The symbol $\mathcal{N}^{+}$ represents the set of nodes other than the root node 0 in the tree graph, i.e., $\mathcal{N}^{+} = \mathcal{N} \setminus \{0\}$. The sets of parent and child nodes associated with each node $j \in \mathcal{N}$ are denoted by $\mathcal{A}_j$ and $\mathcal{C}_j$. In a tree graph, each nonroot node has a unique parent, and every node has a set of child nodes except the so-called “leaf” nodes, which have none.
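The parent and child sets introduced above can be illustrated with a short Python sketch. The function name `build_tree`, the edge-list input, and the five-node example feeder are hypothetical choices made for this illustration; they do not come from the article.

```python
from collections import defaultdict


def build_tree(edges, root=0):
    """Build parent/child maps for a radial network given as (parent, child) edges."""
    parent = {}
    children = defaultdict(list)
    for i, j in edges:
        if j in parent:
            # In a tree every nonroot node has exactly one parent.
            raise ValueError(f"node {j} has more than one parent")
        parent[j] = i
        children[i].append(j)
    nodes = {root} | set(parent) | set(children)
    nonroot = sorted(nodes - {root})
    # Leaf nodes are the nodes without any child.
    leaves = sorted(n for n in nodes if not children.get(n))
    return parent, dict(children), nonroot, leaves


# Hypothetical 5-bus feeder: 0 -> 1, 1 -> 2, 1 -> 3, 3 -> 4
edges = [(0, 1), (1, 2), (1, 3), (3, 4)]
parent, children, nonroot, leaves = build_tree(edges)
print(parent, children, nonroot, leaves)
```

In this example, the number of nodes (five) exceeds the number of edges (four) by one, as expected for a tree.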
  2.2. Network Representation and Power Flow Equation
Electric power systems are typically modeled as networks of buses interconnected by branches. Buses serve as physical points to which various electrical components, such as generators, loads, storage units, and shunt impedances, are connected. Branches, on the other hand, provide pathways for the flow of electrical current between buses. This article examines an N-bus radial distribution system, represented by a tree graph $\mathcal{G} = (\mathcal{N}, \mathcal{E})$, where buses correspond to nodes in the set $\mathcal{N}$, and a branch from bus i to bus j is denoted by edge $(i,j) \in \mathcal{E}$. It is well established that in a radial distribution network without isolated nodes, the number of nodes exceeds the number of branches by one, i.e., $|\mathcal{N}| = |\mathcal{E}| + 1$. Each branch $(i,j) \in \mathcal{E}$ is characterized by resistance $r_{ij}$, reactance $x_{ij}$, and an upper bound on apparent power flow $\overline{S}_{ij}$. Without loss of generality, the root node 0 is designated as the slack bus, commonly referred to as the substation, which connects the distribution side to the transmission side. Consequently, the set of nonroot buses is $\mathcal{N}^{+} = \mathcal{N} \setminus \{0\}$.
Each system bus $j \in \mathcal{N}$ is specified by its voltage magnitude $v_j$, voltage phase $\theta_j$, net active power $p_j$, and net reactive power $q_j$, such that
$$p_j = p_j^{g} - p_j^{d}, \qquad q_j = q_j^{g} - q_j^{d}, \tag{1}$$
        where $p_j^{d}$ and $q_j^{d}$ represent the net active and reactive power demand at the node, whereas $p_j^{g}$ and $q_j^{g}$ are the controllable active and reactive power injections for the buses associated with generators. Unless otherwise specified, it is assumed that the power flowing out from the reference bus 0 is known, together with $v_0 = 1$ p.u. and $\theta_0 = 0$, where $\theta_0$ is the voltage phase of the root bus 0. Obviously, the power flowing out from node $j$ is equal to zero if $\mathcal{C}_j = \emptyset$, and one has $p_j^{g} = q_j^{g} = 0$ if there is no generator attached to node $j$.
Figure 1 provides a concrete interpretation for the introduced notations, where 
.
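This article employs the branch flow model to characterize the AC power flow balance in radial distribution grids, as it demonstrates superior numerical stability compared with the bus injection model [31]. The branch flow equation is expressed as follows:
$$P_{ij} = \sum_{k \in \mathcal{C}_j} P_{jk} - p_j + r_{ij}\,\frac{P_{ij}^2 + Q_{ij}^2}{u_i}, \tag{2a}$$
$$Q_{ij} = \sum_{k \in \mathcal{C}_j} Q_{jk} - q_j + x_{ij}\,\frac{P_{ij}^2 + Q_{ij}^2}{u_i}, \tag{2b}$$
$$u_j = u_i - 2\left(r_{ij} P_{ij} + x_{ij} Q_{ij}\right) + \left(r_{ij}^2 + x_{ij}^2\right)\frac{P_{ij}^2 + Q_{ij}^2}{u_i}, \tag{2c}$$
        where $P_{ij}$ ($Q_{ij}$) represents the active (reactive) power flow of branch $(i,j) \in \mathcal{E}$, and $u_j$ denotes the square of the voltage magnitude, i.e., $u_j = v_j^2$ for $j \in \mathcal{N}$. It is worth noting that Equations (2a) and (2b) express the power balance at the nonroot nodes $j \in \mathcal{N}^{+}$, while Equation (2c) characterizes the voltage balance at the nonroot nodes $j \in \mathcal{N}^{+}$. It is well known that the branch flow model (2) fully characterizes the power flows for a balanced radial network [31].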
  2.3. Optimal Power Flow Problem of Distribution Networks
The OPF problem is a cornerstone in the planning and operation of distribution systems. Its primary objective is to minimize a specified function, typically encompassing generation costs and power losses, while ensuring that all operational constraints are satisfied within safe limits.
However, the nonlinearity inherent in the AC power flow model (2) introduces a nonconvex structure to the OPF problem, making it challenging to solve directly with guaranteed convergence. To address this, linear approximation or convex relaxation techniques are commonly employed. Among these, the DC power flow model has been widely used for linearizing OPF problems in transmission networks, where the resistance-to-reactance ratio of transmission lines is negligible and reactive power flows are supplied locally. Nevertheless, this DC linearization approach is unsuitable for distribution networks, as the resistance-to-reactance ratio in such systems is significantly larger.
In contrast to the DC-like linearization technique, we employ an AC optimal power flow formulation for radial distribution systems based on the so-called LinDistFlow model [32], which provides an alternative linear approximation of the branch flow model (2) by neglecting the active and reactive power losses while still accounting for apparent power flows and voltage magnitudes. Specifically, the LinDistFlow model has the following form:
$$P_{ij} = \sum_{k \in \mathcal{C}_j} P_{jk} - p_j, \tag{3a}$$
$$Q_{ij} = \sum_{k \in \mathcal{C}_j} Q_{jk} - q_j, \tag{3b}$$
$$u_j = u_i - 2\left(r_{ij} P_{ij} + x_{ij} Q_{ij}\right), \tag{3c}$$
        which is a linear form of Equation (2).
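As an illustration of the linearized model, the following Python sketch evaluates the lossless branch flows by a backward sweep and the squared voltages by a forward sweep on a small radial feeder. The function name `lindistflow`, the sign convention (net injection positive for generation, flows measured from parent to child), and all numerical data are assumptions made for this example and are not taken from the article.

```python
def lindistflow(edges, r, x, p, q, root=0, u_root=1.0):
    """Evaluate a lossless linear branch flow model on a tree.

    edges: (parent, child) pairs listed so that each edge (i, j)
           appears before any edge (j, k) leaving its child.
    r, x:  branch resistance / reactance, keyed by edge.
    p, q:  net active / reactive injections, keyed by nonroot node.
    """
    children = {}
    for i, j in edges:
        children.setdefault(i, []).append(j)

    P, Q = {}, {}
    # Backward sweep: the flow on (i, j) supplies node j and its subtree.
    for i, j in reversed(edges):
        P[(i, j)] = sum(P[(j, k)] for k in children.get(j, [])) - p[j]
        Q[(i, j)] = sum(Q[(j, k)] for k in children.get(j, [])) - q[j]

    # Forward sweep: squared voltage drops linearly along each branch.
    u = {root: u_root}
    for i, j in edges:
        u[j] = u[i] - 2.0 * (r[(i, j)] * P[(i, j)] + x[(i, j)] * Q[(i, j)])
    return P, Q, u


# Hypothetical 4-bus feeder in per unit; negative injections are net loads.
edges = [(0, 1), (1, 2), (1, 3)]
r = {e: 0.01 for e in edges}
x = {e: 0.02 for e in edges}
p = {1: -0.10, 2: -0.20, 3: 0.05}
q = {1: -0.05, 2: -0.10, 3: 0.00}
P, Q, u = lindistflow(edges, r, x, p, q)
print(P[(0, 1)], Q[(0, 1)], u)
```

Because losses are neglected, the flow leaving the root equals the negative sum of all nonroot net injections, which gives a quick sanity check on the sweep.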
To solve nonconvex OPF problems in a distributed manner, one approach is to convexify the OPF problem before applying distributed algorithms. A common technique for convexification is second-order cone programming (SOCP) relaxation. However, the relaxation is not always exact: optimality with respect to the original nonconvex problem is ensured primarily in simpler network configurations, such as IEEE benchmark systems and acyclic radial networks. Furthermore, SOCP solvers typically require significant computational resources, limiting their scalability for large transmission networks with thousands of buses and meshed topologies. As a computationally efficient alternative, the LinDistFlow model provides a linear, and hence convex, approximation of the power flow equations. Numerical simulations reported in [33] indicate that LinDistFlow is well suited for a wide range of distribution circuits, offering a viable solution with reduced complexity.
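To this end, the full OPF problem associated with the LinDistFlow model is given by
$$\min_{p^{g},\, q^{g}} \ \sum_{j \in \mathcal{N}} f_j\!\left(p_j^{g}\right) \tag{4a}$$
$$\text{s.t.} \quad \text{(1), (3a)–(3c)}, \tag{4b}$$
$$\underline{u}_j \le u_j \le \overline{u}_j, \quad j \in \mathcal{N}^{+}, \tag{4c}$$
$$\underline{p}_j^{g} \le p_j^{g} \le \overline{p}_j^{g}, \quad j \in \mathcal{N}^{+}, \tag{4d}$$
$$\underline{q}_j^{g} \le q_j^{g} \le \overline{q}_j^{g}, \quad j \in \mathcal{N}^{+}, \tag{4e}$$
$$P_{ij}^2 + Q_{ij}^2 \le \overline{S}_{ij}^2, \quad (i,j) \in \mathcal{E}, \tag{4f}$$
        where the quadratic form $f_j(p_j^{g}) = a_j (p_j^{g})^2 + b_j p_j^{g} + c_j$ is used in (4a) by giving the cost coefficients $a_j, b_j, c_j$ associated with the controllable generators at bus $j$. Moreover, the remaining inequality constraints (4c)–(4f) represent, respectively, engineering limits on the voltage magnitudes, power injections, and magnitudes of apparent power in distribution networks.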
  2.4. DRCC-OPF of AC Distribution Networks
As uncertainties from volatile renewable generation and fluctuating loads become increasingly prevalent in distribution feeders, the deterministic LinDistFlow-based OPF model (4) is no longer adequate to guide the planning and operation of distribution systems. For simplicity of exposition, we restrict our attention to uncertainties induced by renewable energy generation, since this case suffices to illustrate the proposed techniques while being technically much simpler than the general case. Nevertheless, the results in this article can be readily extended to uncertainties involving fluctuating load consumption and volatile energy reserve capacities.
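More specifically, the net power injections (1) in the presence of random active and reactive generations $\tilde{p}_j^{g}$ and $\tilde{q}_j^{g}$ become
$$\tilde{p}_j = \tilde{p}_j^{g} - p_j^{d}, \qquad \tilde{q}_j = \tilde{q}_j^{g} - q_j^{d}. \tag{5}$$
In order to describe explicitly the uncertainty propagation across the distribution grid, we separate out the unpredictable renewable energy production as follows:
$$\tilde{p}_j^{g} = p_j^{c} + \tilde{p}_j^{r}, \qquad \tilde{q}_j^{g} = q_j^{c} + \tilde{q}_j^{r}, \tag{6}$$
        where $p_j^{c}$, $q_j^{c}$ correspond to conventional active and reactive power generation, whereas $\tilde{p}_j^{r}$, $\tilde{q}_j^{r}$ represent volatile renewable feed-in such that
$$\tilde{p}_j^{r} = p_j^{r} + \tilde{\xi}_j^{p}, \qquad \tilde{q}_j^{r} = q_j^{r} + \tilde{\xi}_j^{q}, \tag{7}$$
        in which the active and reactive powers $p_j^{r}$, $q_j^{r}$ are the fixed forecast values of renewable generation, and the random forecast errors $\tilde{\xi}_j^{p}$, $\tilde{\xi}_j^{q}$ characterize the impact of uncertainties on node $j$. Note that, unlike [29], which considers only the random forecast errors of active power generation, the randomness of reactive power is also taken into account in this article.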
When accounting for uncertainty, controllable generating units are responsible for compensating the total deviations arising from forecast errors in renewable generation. Consequently, the random active and reactive power generations, $\tilde{p}_j^{g}$ and $\tilde{q}_j^{g}$, in (5), can be reformulated as
        where $\tilde{\boldsymbol{\xi}}^{p}$ and $\tilde{\boldsymbol{\xi}}^{q}$ are the vectors of active and reactive power prediction errors, respectively. Here, with a slight abuse of notation, $p_j^{g}$ and $q_j^{g}$ stand for the forecast active and reactive power injections from generators on node $j$. Moreover, the decision variable $\alpha_j$ in (8) determines to what extent the controllable generators on node $j$ are involved in compensating for renewable generation deviations, which satisfies $\alpha_j = 0$ if there is no generator on node $j$ and $\sum_{j} \alpha_j = 1$.
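The role of the participation factors can be sketched numerically. The Python snippet below assumes one common affine response rule, in which each controllable generator changes its output by its participation factor times the negative total forecast error; this specific rule, the function name `affine_response`, and the numbers are assumptions for illustration and may differ from the article's Equation (8).

```python
import math


def affine_response(p_g, alpha, xi):
    """Adjust scheduled generation to absorb the total forecast error.

    p_g:   scheduled (forecast) generation per node.
    alpha: participation factors per node; must sum to 1.
    xi:    forecast errors of renewable generation per node.
    Assumed rule: generator j moves by -alpha[j] * sum(xi).
    """
    if not math.isclose(sum(alpha), 1.0, abs_tol=1e-9):
        raise ValueError("participation factors must sum to 1")
    total_error = sum(xi)
    return [pg - a * total_error for pg, a in zip(p_g, alpha)]


# Hypothetical data: node 0 has no generator, so its factor is zero.
p_g = [0.0, 0.5, 0.3]
alpha = [0.0, 0.6, 0.4]
xi = [0.02, -0.05, 0.01]
p_tilde = affine_response(p_g, alpha, xi)
print(p_tilde)
```

Since the factors sum to one, the total change in controllable generation cancels the total forecast error exactly, whatever the realization of the errors.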
Next, it is essential to examine the expressions for edge power flows and nodal voltages influenced by the uncertainty in renewable power production. To aid in this derivation, we introduce two indicator matrices, 
 and 
, with components that satisfy the following conditions:
As a consequence of the notations in (9) and Formulas (5)–(7), the propagation of uncertainty stemming from fluctuating renewable generation across the distribution network can be captured by the stochastic LinDistFlow model in the following vector-based form:
        where $\tilde{\mathbf{P}}, \tilde{\mathbf{Q}}$ are column vectors collecting the random active and reactive power flows, $\tilde{P}_{ij}$ and $\tilde{Q}_{ij}$, across branch $(i,j)$, where $(i,j) \in \mathcal{E}$. The column vectors $\tilde{\mathbf{p}}, \tilde{\mathbf{q}}$ contain the elements $\tilde{p}_j, \tilde{q}_j$ for nonroot buses $j \in \mathcal{N}^{+}$. The vector $\tilde{\mathbf{u}}$ includes the random counterparts of the squared voltage magnitudes $\tilde{u}_j$ for all $j \in \mathcal{N}$. The constant matrices $\mathbf{R}, \mathbf{X}$ are diagonal, containing $r_{ij}$ and $x_{ij}$ for each branch $(i,j)$, where $(i,j) \in \mathcal{E}$. Clearly, when the prediction error in renewable power generation vanishes, i.e., $\tilde{\boldsymbol{\xi}}^{p} = \tilde{\boldsymbol{\xi}}^{q} = \mathbf{0}$, Equations (10a)–(10c) reduce to the power flow balance system (3a)–(3c) at the nominal operating point.
Thanks to Equations (10a)–(10c), the uncertain active and reactive edge power flows and squared voltage magnitudes with respect to (8) lead to
        where 
 is the remaining matrix formed by removing the first column of 
. Matrices 
 are deterministic counterparts of 
 in the absence of uncertainties. Notably, Equations (11a)–(11c) imply that the voltage and the power outflow at slack bus 0 remain constant whether or not uncertainties arise. More importantly, the system response models (8) and (11a)–(11c) are all linear with respect to the renewable power prediction deviations.
In addition to compensating for deviations of power production from the forecast values, the controllable generators must satisfy the inequality constraints (4c)–(4f) with a prescribed high probability, which leads to the following chance-constrained formulations:
        where $\epsilon_u, \epsilon_p, \epsilon_q, \epsilon_s \in (0,1)$, which typically take small values in practice, are referred to as the violation probabilities of the chance constraints with respect to squared voltage magnitudes, active and reactive power generations, and squared magnitudes of apparent powers, respectively.
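To make the meaning of a violation probability concrete, the following Python sketch estimates the empirical violation frequency of a two-sided limit from a finite set of sampled values and compares it with a tolerated level. This is only a sample-based check under an assumed data set; it is not the distributionally robust reformulation developed in this article, and the names `empirical_violation` and `satisfies_chance_constraint` are hypothetical.

```python
def empirical_violation(samples, lower, upper):
    """Fraction of samples that fall outside the interval [lower, upper]."""
    if not samples:
        raise ValueError("at least one sample is required")
    violations = sum(1 for s in samples if s < lower or s > upper)
    return violations / len(samples)


def satisfies_chance_constraint(samples, lower, upper, eps):
    """Check whether the empirical violation frequency is at most eps."""
    return empirical_violation(samples, lower, upper) <= eps


# Hypothetical sampled squared-voltage values (per unit) at one bus.
u_samples = [0.95, 0.97, 1.02, 1.06, 0.99, 1.01, 0.98, 1.00, 1.03, 0.96]
rate = empirical_violation(u_samples, lower=0.95, upper=1.05)
print(rate, satisfies_chance_constraint(u_samples, 0.95, 1.05, eps=0.1))
```

With few samples the empirical frequency is a noisy estimate of the true violation probability, which is one motivation for hedging against a whole set of candidate distributions.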
The stochastic OPF problem with chance constraints aims to find feasible solutions within a confidence region defined by specified probability levels. This requires comprehensive knowledge of the probability distribution $\mathbb{P}$ of the random variables $\tilde{\boldsymbol{\xi}}^{p}$ and $\tilde{\boldsymbol{\xi}}^{q}$, which is essential for the existence of optimal solutions. In practice, however, these precise distributions are typically unavailable. Instead, only a finite set of historical data is accessible, which provides partial but reliable probabilistic information. Consequently, the true distributions needed to formulate the expected cost and chance constraints cannot be recovered exactly from finite samples, which leads to ambiguity in these distributions.
To effectively manage the uncertainties inherent in probability distributions and assess their impact on power flows and voltage limits, this study emphasizes a data-driven distributionally robust chance-constrained optimal power flow (DRCC-OPF) model, formulated as follows:
		
        where 
 represents the decision variables, and the ambiguity set 
$\hat{\mathcal{P}}_N$ encompasses all admissible probability distributions inferred from the accessible historical data. The objective function (13a) is defined as the worst-case expectation of generation costs over the probability distributions $\mathbb{P}$ within the ambiguity set $\hat{\mathcal{P}}_N$. Constraint (13b) ensures that the participation factors of the controllable generators, which compensate for renewable generation deviations, sum to 1. The equality constraints (13c) enforce the balance of active and reactive power at the nominal operating point, where renewable power generation precisely matches the forecast values. Furthermore, the distributionally robust chance constraints (13d)–(13g) ensure the adequacy of voltage quality, active power, reactive power, and transmission line capacity, respectively.
The data-driven DRCC-OPF problem (13), incorporating the objective function (13a) and chance constraints (13d)–(13g), is designed to account for the worst-case probability distribution within the ambiguity set $\hat{\mathcal{P}}_N$. This approach ensures robust performance across the ambiguous distributions contained in $\hat{\mathcal{P}}_N$. The optimization framework utilizes upper and lower bounds, such as $\underline{u}_j$, $\overline{u}_j$, $\underline{p}_j^{g}$, $\overline{p}_j^{g}$, $\underline{q}_j^{g}$, $\overline{q}_j^{g}$, and $\overline{S}_{ij}$, alongside confidence probability factors $\epsilon_u$, $\epsilon_p$, $\epsilon_q$, and $\epsilon_s$, to quantify the reliability and performance levels of uncertain variables. Consequently, the primary objective of the optimization problem (13) is to identify a data-driven (sub-)optimal solution that achieves high reliability with minimal certification requirements.
  2.5. Wasserstein Metric and Ambiguity Set
The true probability distribution $\mathbb{P}$ of uncertain renewable energy generation deviations, denoted as $\tilde{\boldsymbol{\xi}}$, is not precisely known but is assumed to reside within an ambiguity set $\hat{\mathcal{P}}_N$ constructed from a finite set of historical data. This study employs the Wasserstein metric to define the ambiguity set, which encompasses all probability distributions that are close to a nominal or empirical distribution with respect to a predefined probability metric [15].
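Specifically, given a set of historical samples $\{\hat{\boldsymbol{\xi}}_k\}_{k=1}^{N}$, the empirical probability distribution is expressed as
$$\hat{\mathbb{P}}_N = \frac{1}{N} \sum_{k=1}^{N} \delta_{\hat{\boldsymbol{\xi}}_k},$$
        where $\delta_{\hat{\boldsymbol{\xi}}_k}$ represents the Dirac measure centered at $\hat{\boldsymbol{\xi}}_k$. This empirical distribution $\hat{\mathbb{P}}_N$ serves as an approximation of the true distribution $\mathbb{P}$. As the volume of historical data increases indefinitely, the “distance” between $\hat{\mathbb{P}}_N$ and $\mathbb{P}$ diminishes according to the prescribed probability metric, indicating convergence of $\hat{\mathbb{P}}_N$ to $\mathbb{P}$ as $N \to \infty$. The Wasserstein metric is utilized in this study to quantify the distance between $\hat{\mathbb{P}}_N$ and $\mathbb{P}$.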
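Definition 1 (Wasserstein Metric [34]). The Wasserstein metric between two arbitrary probability distributions $\mathbb{P}_1$ and $\mathbb{P}_2$ within the space $\mathcal{P}(\Xi)$ is defined as
$$W(\mathbb{P}_1, \mathbb{P}_2) = \inf_{\Pi} \left\{ \int_{\Xi^2} \left\| \boldsymbol{\xi}_1 - \boldsymbol{\xi}_2 \right\| \, \Pi(\mathrm{d}\boldsymbol{\xi}_1, \mathrm{d}\boldsymbol{\xi}_2) \right\}, \tag{14}$$
        where $\Pi$ denotes a joint distribution of the random variables $\boldsymbol{\xi}_1$ and $\boldsymbol{\xi}_2$, with $\mathbb{P}_1$ and $\mathbb{P}_2$ as their respective marginals. It is worth remarking that the Wasserstein metric in (14) is well defined for an arbitrary norm $\|\cdot\|$, but we employ the $\ell_1$-norm in this article because of its appealing advantages in numerical computation. Given $N$ historical samples and the set $\mathcal{P}(\Xi)$ of all probability distributions with support $\Xi$, the Wasserstein-metric-based ambiguity set of the true distribution can be described by the following formulation:
$$\hat{\mathcal{P}}_N = \left\{ \mathbb{P} \in \mathcal{P}(\Xi) : W(\mathbb{P}, \hat{\mathbb{P}}_N) \le r \right\}, \tag{15}$$
        which is also referred to as a Wasserstein ball with the empirical distribution $\hat{\mathbb{P}}_N$ as the center and $r$ as the radius.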
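For intuition, the Wasserstein distance can be computed in closed form in one special case: two empirical distributions on the real line with the same number of equally weighted samples. In that case the optimal coupling matches the sorted samples, so the distance is the mean absolute difference of the order statistics. The Python sketch below covers only this scalar, equal-size case; the function names and data are hypothetical, and a Wasserstein ball in general also contains non-empirical distributions that this check does not cover.

```python
def wasserstein_1d(xs, ys):
    """Type-1 Wasserstein distance between two equal-size scalar samples."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("samples must be non-empty and of equal size")
    # In one dimension the optimal transport plan pairs sorted samples.
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


def in_wasserstein_ball(candidate, center, radius):
    """Check whether an empirical distribution lies within the given radius."""
    return wasserstein_1d(candidate, center) <= radius


# Hypothetical forecast-error samples and a shifted copy of them.
center = [0.0, 1.0, 2.0]
shifted = [2.5, 0.5, 1.5]
distance = wasserstein_1d(center, shifted)
print(distance, in_wasserstein_ball(shifted, center, radius=0.6))
```

A pure shift of every sample by 0.5 gives a distance of exactly 0.5, so the shifted distribution lies inside a ball of radius 0.6 but outside one of radius 0.4.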
Obviously, the radius of the Wasserstein ball, which is known to shrink as the amount of historical data grows, plays an essential role in the performance of the data-driven DRCC-OPF problem (13). As shown in [35], given a confidence level $\beta$ and the diameter $B$ of the support $\Xi$, a possible radius can be obtained as
        which nevertheless leads to a conservative choice. In the recent work [29], the radius formula
        has been shown to be considerably less conservative and thus to provide desirable performance for the DRO in numerical tests. Here, the constant $C$ is obtained by solving the following optimization problem:
        where $\hat{\boldsymbol{\mu}}$ is the mean value of the historical samples. In particular, the radius given in (17) degenerates to the case of (16) while setting 
.
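The dependence of the radius on the sample size can be explored with a short Python sketch. It assumes a radius of the form scale × sqrt((2/N) ln(1/(1 − β))), a form commonly used for Wasserstein balls; this expression is an assumption for illustration and should be checked against Equations (16) and (17). The function name `wasserstein_radius` and the numbers are hypothetical, and the `scale` argument stands in for either the support diameter B or the data-dependent constant C.

```python
import math


def wasserstein_radius(scale, n_samples, beta):
    """Radius of the assumed form scale * sqrt((2 / N) * ln(1 / (1 - beta)))."""
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must lie strictly between 0 and 1")
    if scale < 0.0:
        raise ValueError("scale must be nonnegative")
    return scale * math.sqrt(2.0 / n_samples * math.log(1.0 / (1.0 - beta)))


# Hypothetical setting: unit scale, 95% confidence, growing data sets.
radii = {n: wasserstein_radius(1.0, n, beta=0.95) for n in (100, 400, 1600)}
for n, radius in radii.items():
    print(n, round(radius, 4))
```

Under this assumed form, quadrupling the number of samples halves the radius, and a smaller scale constant shrinks the ball proportionally, which is the sense in which a data-dependent constant can reduce conservatism.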