A Chance-Constrained Multistage Planning Method for Active Distribution Networks

Abstract: This paper introduces a multistage planning method for active distribution networks (ADNs) considering multiple alternatives. The uncertainties of load, wind and solar generation are taken into account and a chance constrained programming (CCP) model is developed to handle these uncertainties in the planning procedure. A method based on a k-means clustering technique is employed for the modelling of renewable generation and load demand. The proposed solution methodology, which is based on a genetic algorithm, considers multiple planning alternatives, such as the reinforcement of substations and distribution lines, the addition of new lines, and the placement of capacitors. It aims at minimizing the net present value of the total operation cost plus the total investment cost of the reinforcement and expansion plan. Active network management is incorporated into the planning method in order to exploit the control capabilities of the output power of the distributed generation units. To validate its effectiveness and performance, the proposed method is applied to a 24-bus distribution system.


Introduction
Distribution network planning (DNP) aims at determining the optimum location, capacity and time of the investments in new network components and equipment in order to minimize the total investment and operation cost and to ensure the safe operation of the network for a forecasted load growth demand during a planning period. Over recent years, due to the increasing penetration of renewable energy sources (RES), the installation of advanced metering infrastructure (AMI) to distribution networks and the development of information and communication technologies (ICT), distribution networks are being transformed from passive to active distribution networks (ADNs). In ADNs, the control of the active and reactive power output of the available distributed generation (DG) units is enabled to deal with technical challenges, such as line congestion and voltage rise issues [1]. Hence, the planning of ADNs, in which the control capabilities of the DG units are exploited, becomes a very challenging problem. Furthermore, the complexity of the DNP problem increases significantly, since the uncertainties of load demand and RES generation should be considered and appropriately modeled.
DNP problems can be formulated as mixed integer nonlinear programming (MINLP) problems. The solution methods of DNP problems can be divided into two categories: (1) static methods and (2) multistage methods [2]. The static methods do not calculate the time of the network investments and they assume that all investments are implemented at the start of the planning period. On the contrary, the multistage methods determine the time at which the network investments are necessary, providing the
(1) To introduce a CCP methodology for the planning of ADNs considering the uncertainties of load demand, wind and solar generation.
(2) To develop a multistage planning framework that calculates the optimal location, capacity and time of the investments in substations, distribution lines and capacitors, while examining the effect of the active DG management on the optimal solution of the DNP problem.
This paper is organized as follows: Section 2 describes the modelling of the stochastic behavior of load demand and renewable generation. The formulation of the proposed CCP DNP model is given in Section 3. Section 4 provides the solution method. The method is applied to a 24-bus distribution network and the results are analyzed in Section 5. In Section 6, conclusions are drawn.

Load and Renewable Generation Data Modeling
Distribution networks are, in general, planned in order to ensure the safe network operation under any possible loading conditions over the planning period. In distribution networks with no or low penetration of RES, in which power flows from the high voltage (HV) / medium voltage (MV) substation to the MV/low voltage (LV) substations, the reinforcement and expansion plan of a distribution network was determined based on a given forecast of the maximum load demand of each year of the planning period without taking into account the variability of the output power of RES. However, in distribution networks with high penetration of RES, the aforementioned approach is not adequate, since reverse power flows occur quite often during every year of the planning period, especially in periods of low demand and high RES generation, creating new technical challenges that need to be solved at the planning stage, such as line congestion and voltage rise issues [1].
Thus, a planning method for ADNs with high penetration of distributed energy resources (DERs), especially RES, has to take into account a proper dataset of load demand and RES generation during each stage of the planning period in order to determine the optimal reinforcement and expansion plan. It is important that the dataset of load demand and RES generation that will be incorporated into the planning process contains the possible combinations of load demand and RES generation. Furthermore, each set of this dataset should be different from the rest in order to avoid the unnecessary repetition of similar load-generation sets.
The first feature of this dataset is the quality. A high quality dataset for the planning of a distribution network needs to be created in a way that it adequately considers the uncertainties of load demand and RES generation during each stage of the planning period. The dataset's second feature is its proper size. A large dataset will increase significantly the computation time of the solution of the DNP problem, making it sometimes impractical to solve. However, a small dataset would deteriorate the dataset's quality.
This paper proposes a method that is based on a k-means clustering method [21] to generate probabilistic load-generation sets for each stage of the planning period in order to model accurately the load and renewable generation uncertainties. The k-means method is an established clustering method, which has been widely employed for the classification of large volumes of data [22]. The probabilistic load-generation sets are determined based on historical data of wind speed, solar irradiance and load demand variability in order to be used as the input data of the proposed multistage planning method of ADNs.

The k-Means Clustering Method
Data clustering is a process that allocates a set of individual data into smaller groups (clusters) such that (a) the individual data with similar features are grouped into the same cluster and (b) the dissimilarity of each created cluster compared to the rest is high enough to make each cluster distinct [23]. The k-means clustering method is a partitioning clustering technique that employs an iterative process to optimize the quality, i.e., the dissimilarity, of a predetermined number of distinct clusters [24][25][26]. It has been widely used for data management in power system related problems [29][30][31][32]. The k-means clustering method is initialized by specifying the number of clusters (k) and randomly selecting individual data from the whole dataset as the initial cluster centroids. The optimal number of clusters can be determined by the method presented in [27,28]. The k-means method allocates the remaining individual data between clusters so that the Euclidean distance of an individual data vector ($m_i$) from its corresponding cluster centroid ($c_k$) is minimized, as shown in (1). Then, the average of all the individual data that are members of a cluster is set as the new centroid of that cluster. Overall, the k-means method determines the clusters' centroids so that the sum of squared distances of each individual data vector from its corresponding cluster centroid is minimized. This iterative process is terminated when a convergence criterion is fulfilled, i.e., when there is no change in the clusters' centroids or when, after a certain number of iterations, the change in the clusters' centroids is below a threshold. After the data clustering is terminated, the occurrence probability of each cluster's centroid can be calculated according to (2).
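The iteration described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used in the paper: the function name `kmeans`, its parameters, and the empty-cluster handling are assumptions, and the occurrence probability is assumed to be the share of observations assigned to each cluster, $\Pr\{c_k\} = N_k / N$.

```python
import numpy as np

def kmeans(data, k, max_iter=100, tol=1e-9, seed=0):
    """Plain k-means sketch; returns centroids, labels and cluster probabilities."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    # Initialise the centroids with k distinct, randomly chosen observations.
    centroids = data[rng.choice(len(data), size=k, replace=False)]
    for _ in range(max_iter):
        # Euclidean distance of every observation m_i to every centroid c_k.
        dist = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # New centroid = mean of the members (an empty cluster keeps its centroid).
        new_centroids = np.array([
            data[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        shift = np.linalg.norm(new_centroids - centroids)
        centroids = new_centroids
        if shift <= tol:  # convergence: centroids no longer move
            break
    # Final assignment and assumed occurrence probability Pr{c_k} = N_k / N.
    dist = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)
    labels = dist.argmin(axis=1)
    probs = np.bincount(labels, minlength=k) / len(data)
    return centroids, labels, probs
```

In the planning context, each row of `data` would hold one hourly observation of (load, wind output, solar output) in per unit, so that every centroid is a candidate load-generation set.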

Probabilistic Load-Generation Sets
The incorporation of variable DG, such as RES, combined with the variation of load demand into the DNP problem, increases significantly the problem's complexity. Considering all possible scenarios of load and RES generation that can occur during every year of the planning period would make the solution process of the DNP problem almost impractical due to the high computational burden. To reduce the number of scenarios of load and RES generation that should be considered at every year of the planning period, while preserving the stochastic behavior and interrelationship between load demand and RES generation, a method based on the k-means clustering method is proposed. The output of the proposed method is a specific number of load-generation sets, where each set has its own occurrence probability and is characterized by a load state (level), an output power state of wind DG and an output power state of solar DG. The steps of the proposed method that determine these probabilistic load-generation sets are as follows:
Step 1: The success of a clustering method, like the k-means method, depends on the development of a wide-ranging dataset that adequately represents the majority of the distribution network operating states. Thus, the basis of the proposed method is the collection of historical data of load demand, wind speed and solar irradiance of the area of the distribution network that will be studied. For example, in Figure 1, the hourly data of load demand, wind speed and solar irradiance of one year for a distribution network [29] are presented.
Step 2: After the collection of historical data of load demand, wind speed and solar irradiance, these data are adapted in order to represent the yearly profile of load demand, the yearly generation profile of wind DG units and the yearly generation profile of solar DG units. The yearly profile of load demand is determined by dividing the hourly load demand by the maximum load of each year. Based on the hourly wind speed data, the generation profile of a wind DG unit can be determined using the wind turbine power curve as presented in Equation (3). Similarly, based on the hourly solar irradiance data, the generation profile of a solar DG unit can be determined using the PV power curve as given by Equation (4). Hence, the historical data of Figure 1 are transformed into the data presented in Figure 2.
Step 3: After processing the historical data, as described in Step 2, the k-means clustering method is applied and the input data are divided into K clusters. The centroids of the derived clusters represent the average behavior of the data that are included in each cluster. Thus, the clusters' centroids are considered as the probabilistic load-generation sets. Each cluster's centroid represents a load level, an output power state of wind DG and an output power state of solar DG, as shown in (5), and its occurrence probability is calculated according to Equation (2).
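The data preparation of Step 2 can be illustrated with the following sketch. The exact power curves of Equations (3) and (4) are not reproduced here, so the shapes below are assumptions: a wind curve with a linear ramp between cut-in and rated speed, and a PV curve that is linear in irradiance up to a standard level. The parameter names and default values (`v_ci`, `v_r`, `v_co`, `g_std`) are hypothetical.

```python
import numpy as np

def wind_power_pu(v, v_ci=3.0, v_r=12.0, v_co=25.0):
    """Per-unit wind DG output from wind speed in m/s (assumed linear ramp)."""
    v = np.asarray(v, dtype=float)
    ramp = (v - v_ci) / (v_r - v_ci)
    # Between cut-in and rated speed the output follows the ramp, otherwise zero...
    p = np.where((v >= v_ci) & (v < v_r), ramp, 0.0)
    # ...except between rated and cut-out speed, where the unit gives rated power.
    return np.where((v >= v_r) & (v <= v_co), 1.0, p)

def pv_power_pu(g, g_std=1000.0):
    """Per-unit PV output from irradiance in W/m^2 (assumed linear up to g_std)."""
    g = np.asarray(g, dtype=float)
    return np.clip(g / g_std, 0.0, 1.0)

def load_profile_pu(load):
    """Hourly load divided by the maximum load of the year."""
    load = np.asarray(load, dtype=float)
    return load / load.max()

def build_dataset(load, wind_speed, irradiance):
    """Stack the three per-unit profiles into an (N, 3) array for clustering."""
    return np.column_stack([
        load_profile_pu(load),
        wind_power_pu(wind_speed),
        pv_power_pu(irradiance),
    ])
```

The resulting array has one row per hour, which is the form expected by a clustering routine in Step 3.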

Problem Formulation
The long-term DNP problem is a complicated optimization problem that requires the consideration of many decision variables and many constraints related to distribution network operation as well as to the investment decisions. Furthermore, in this paper, the DNP problem addresses not only the load growth, but also the way in which the distribution network will accommodate high capacities of renewable DG. The DG units are assumed to be private investments and their type, installation location, installed capacity and installation time are predetermined. Moreover, it is assumed that all the necessary information and communication technology (ICT) infrastructure has already been installed in the distribution network, facilitating the coordinated control of the active and reactive power output of the DG units for the safe operation of the network.

Objective Function
The DNP problem is formulated as a MINLP problem that aims at minimizing the net present value of the total operation cost plus the total investment cost of the reinforcement and expansion plan of ADNs considering multiple planning alternatives, while ensuring the safe operation of the network during the whole planning period. The objective function (6) consists of two terms. The first term (INV) denotes the investment cost and it accounts for the net present value of the investment cost for: (i) HV/MV substation reinforcement, (ii) line reinforcement, i.e., reconductoring of existing lines, (iii) installation of new lines for the connection of future loads to the network and (iv) placement of capacitors. The binary variables $id^{SS}_{i,a,t}$, $id^{L}_{ij,b,t}$ and $id^{CB}_{i,c,t}$ are the investment variables and they are equal to 1 when an investment is decided at a part of the network at stage t of the planning period. For example, the variable $id^{L}_{ij,b,t}$ is equal to 1 when the installation of a type b conductor at line i-j is decided at stage t.
The second term (OPC) of the objective function denotes the distribution network's operational cost and it accounts for the total annual energy losses during the planning period. In (12), the power losses of each load-generation set k at stage t are weighted by the occurrence probability of the set and by the 8760 hours of the year, i.e., each set contributes the term $Ploss_{k,t} \cdot \Pr\{c_k\} \cdot 8760$.
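As a rough illustration of how the loss term of (12) can be evaluated, the sketch below computes the expected annual energy losses of one stage from the losses and probabilities of the load-generation sets, and then discounts a loss cost over the stages. Only the product of losses, probability and 8760 hours comes from the text; the energy price `cost_per_mwh`, the `discount_rate` and the end-of-year discounting convention are assumptions made for the example.

```python
def expected_annual_losses_mwh(ploss_mw, probs, hours=8760):
    """Expected annual energy losses: sum over k of Ploss_k * Pr{c_k} * 8760."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("occurrence probabilities must sum to 1")
    return hours * sum(p * pr for p, pr in zip(ploss_mw, probs))

def npv_loss_cost(ploss_by_stage, probs, cost_per_mwh, discount_rate):
    """Net present value of the loss cost over all stages (assumed discounting)."""
    total = 0.0
    for t, ploss_mw in enumerate(ploss_by_stage, start=1):
        annual_cost = cost_per_mwh * expected_annual_losses_mwh(ploss_mw, probs)
        total += annual_cost / (1.0 + discount_rate) ** t
    return total
```

For instance, two equally likely sets with losses of 0.1 MW and 0.2 MW give an expected loss of 0.15 MW, i.e., about 1314 MWh per year.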

Constraints
For the solution of the DNP problem, several constraints should be considered to represent the steady state operation of the distribution network. These constraints ensure that the distribution network will operate within its limits during the whole planning period, while the investment constraints are related to the investment decision variables.
To incorporate the investment decision variables ($id^{SS}_{i,a,t}$, $id^{L}_{ij,b,t}$ and $id^{CB}_{i,c,t}$) into the operational constraints, the auxiliary variables ($uz^{SS}_{i,a,t}$, $uz^{L}_{ij,b,t}$ and $uz^{CB}_{i,c,t}$) are employed and their relation to the corresponding investment variables is described by Equations (13)-(15). Constraints (13)-(15) represent the operation status of an investment decision, which is made at a certain stage of the planning period. For example, suppose that the reinforcement of line i-j with a type b conductor is decided at year 10 of the planning period, i.e., $id^{L}_{ij,b,10} = 1$. According to (14), $uz^{L}_{ij,b,t}$ will be equal to zero from year 1 to year 9, which means that the technical characteristics of line i-j for that period will be the same as its initial ones, whereas $uz^{L}_{ij,b,t}$ will be equal to 1 from year 10 to the end of the planning period, meaning that the technical characteristics of line i-j will change to those of a type b conductor.
The operational constraints considering the investment decisions are the following. The active and reactive power flow balance in every bus of the distribution network at the load-generation set k during the planning stage t is given by Equations (16) and (17), respectively. The active and reactive power flow of each line i-j at the load-generation set k during the planning stage t is calculated by (18) and (19), respectively, while the conductance and the susceptance of line i-j at the planning stage t are presented in (20) and (21), respectively. The voltage magnitude of bus i at the load-generation set k during the planning stage t should vary within specific limits, as shown in (22). The apparent power flow limit of line i-j at the planning stage t is given in (23).
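The relation between an investment variable and its operation status can be illustrated as follows. The sketch assumes that the auxiliary variable is the cumulative sum of the investment variable over the stages, which reproduces the year-10 example in the text but is not necessarily the exact form of Equations (13)-(15); it also assumes at most one investment per element and type. The function name is hypothetical.

```python
from itertools import accumulate

def operation_status(invest):
    """Map investment decisions per stage (0/1) to operation status per stage.

    invest[t - 1] is 1 if the investment is decided at stage t. The status is
    0 before the investment stage and 1 from that stage onwards.
    """
    if sum(invest) > 1:
        raise ValueError("at most one investment per element and type is assumed")
    return list(accumulate(invest))

# Reinforcement decided at year 10 of a 20-year planning period.
id_line = [0] * 9 + [1] + [0] * 10
uz_line = operation_status(id_line)
```

Here `uz_line` is zero for years 1 to 9 and one for years 10 to 20, matching the behavior described for line i-j.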
Furthermore, the apparent power that flows through the HV/MV substation at bus i at the load-generation set k during the planning stage t should be lower than the capacity of the type a HV/MV substation (24).
At every stage t of the planning period, the network configuration should be radial. To ensure the radial operation of the network, the distribution network is represented as a spanning tree [33], which is expressed by constraints (25)-(28). In a spanning tree, every node except the root (HV/MV substation bus) has exactly one parent. If line i-j is part of the spanning tree, then bus i is the parent of bus j ($rd_{ij,t} = 1$) or bus j is the parent of bus i ($rd_{ji,t} = 1$), according to (25) and (26). Every bus can have only one parent bus (27), except for the HV/MV substation bus, which is considered as the root of the spanning tree (28).
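The parent-child interpretation of the spanning tree can be checked with a short routine such as the one below. It is a sketch for a network with a single root (one HV/MV substation bus), which is a simplifying assumption; the function name and the representation of lines as bus pairs are hypothetical. The routine orients every line away from the root and reports the parent of each bus, or `None` if the topology is not radial.

```python
from collections import deque

def parent_assignment(lines, root):
    """Return {child: parent} for a radial network, or None if it is not radial."""
    buses = {bus for line in lines for bus in line} | {root}
    # A spanning tree over n buses has exactly n - 1 lines.
    if len(lines) != len(buses) - 1:
        return None
    adjacent = {bus: [] for bus in buses}
    for i, j in lines:
        adjacent[i].append(j)
        adjacent[j].append(i)
    parent, seen, queue = {}, {root}, deque([root])
    while queue:
        i = queue.popleft()
        for j in adjacent[i]:
            if j not in seen:
                seen.add(j)
                parent[j] = i  # corresponds to rd_ij = 1: bus i is the parent of bus j
                queue.append(j)
    # Every bus must be reachable from the root, otherwise the network is split.
    return parent if len(seen) == len(buses) else None
```

In a valid result every bus except the root appears exactly once as a key, which mirrors the one-parent rule, and the root never appears as a key.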
Line congestion and voltage rise are the most common issues of the distribution networks with high penetration of RES [1,12]. To deal with these issues, in this paper the control capabilities of the DG units in ADNs are exploited. More specifically, the control of the active and reactive power output is incorporated into the formulation of the DNP problem and the effect of the active network management on the solution of the DNP problem is examined.
Energies 2019, 12, x FOR PEER REVIEW 9 of 20
Based on the P-Q capability curve [34], which is shown in Figure 3, the reactive power that a DG unit can absorb from or inject into the distribution network at the load-generation set k during the planning stage t is as follows:
• If $0.2 \cdot P^{rated}_{DG,i} \le P_{dg,i,k,t} \le P^{rated}_{DG,i}$, then:
Furthermore, the curtailment of the active power output of the DG units is enabled, in case the output power of the DG unit is higher than 20% of its rated power. This can be described by the following constraints:
• If $0 \le P_{dg,i,k,t} \le 0.2 \cdot P^{rated}_{DG,i}$, then:
• If $0.2 \cdot P^{rated}_{DG,i} \le P_{dg,i,k,t} \le P^{rated}_{DG,i}$, then:
It should be noted that the curtailment of the active power output of the DG units is considered as a last resort and it should be limited, since it has financial implications for the owners of the DG units.

Chance Constrained DNP Formulation
When the uncertainties of load demand and renewable generation are considered in the solution of the DNP problem, investments for the reinforcement and expansion of the network may be overestimated because of the occasions of maximum RES generation and minimum load. Taking into account the low probability of occurrence of these occasions, it is a reasonable choice to accept a minor violation, for short time periods, of the voltage constraints (22) and of the thermal capacity of the distribution lines (23) [34][35][36][37]. Therefore, a chance constrained DNP formulation is proposed. The voltage limit constraint (22) is transformed into the chance constraint (35). According to (35), if the probability of the voltage limits violation is lower than a specified threshold ($\beta_V$), then it is considered that the network operates within its limits.
Similarly, the line capacity limit (23) is transformed into the chance constraint (36). According to (36), if the overload probability of the distribution line is lower than a specified threshold ($\beta_{SL}$), then it is considered that the network operates within its limits.
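With a finite number of probabilistic load-generation sets, the violation probability of a chance constraint can be evaluated by summing the occurrence probabilities of the sets in which the limit is violated. The sketch below assumes this evaluation for a single monitored quantity (e.g., the voltage of one bus or the loading of one line); the function names are hypothetical, and the comparison uses "less than or equal" so that a threshold of zero reproduces the deterministic constraint.

```python
def violation_probability(values, probs, lower, upper):
    """Sum of Pr{c_k} over the sets k whose value lies outside [lower, upper]."""
    return sum(pr for v, pr in zip(values, probs) if v < lower or v > upper)

def chance_constraint_ok(values, probs, lower, upper, beta):
    """True if the violation probability does not exceed the threshold beta."""
    return violation_probability(values, probs, lower, upper) <= beta

# Voltage of one bus (p.u.) in three load-generation sets with limits of +/-5%.
voltages = [1.00, 1.06, 0.98]
probabilities = [0.60, 0.04, 0.36]
accepted = chance_constraint_ok(voltages, probabilities, 0.95, 1.05, beta=0.05)
strict = chance_constraint_ok(voltages, probabilities, 0.95, 1.05, beta=0.0)
```

In this toy example the overvoltage occurs with probability 0.04, so the plan is accepted for a threshold of 5% but rejected when no violation is allowed.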

Solution Method
In this paper, a solution methodology that is based on a genetic algorithm (GA) is proposed to solve the DNP problem. GA is a meta-heuristic optimization algorithm involving iterative search procedures [38] and it has been widely used for the solution of mixed integer optimization problems [39]. Moreover, the GA has proved very powerful in solving various DNP problems [13,40-42].
The DNP problem is a MINLP problem due to the binary investment decision variables and the nonlinear power flow equations. The consideration of multiple planning alternatives, such as the reinforcement of existing network components (e.g., distribution lines, HV/MV substations) and the placement of new equipment (e.g., capacitors), results in a more cost-efficient planning solution, which handles more efficiently both capacity limit and voltage limit constraints. The proposed method is applied to distribution networks with high RES penetration and the control of the output power of the DG units is incorporated into the solution method. Furthermore, the proposed solution method is a multistage DNP method, which means that the time period of the installation of the new network investments is determined.
The proposed planning framework aims at calculating the investment decision variables ($id^{SS}_{i,a,t}$, $id^{L}_{ij,b,t}$ and $id^{CB}_{i,c,t}$). Certain features of the GA, such as the chromosome structure, the initial population, and the selection, crossover and mutation operators, are discussed in the following.

Input Data
The solution of the DNP problem requires a significant amount of input data. Apart from the techno-economic characteristics of the planning alternatives, in every reinforcement and expansion plan of a distribution network, a load growth rate should be considered, and the available candidate routes for the connection of future loads should be known in advance, as well as the installed DG capacity in every stage of the planning period. Furthermore, K probabilistic load-generation sets are considered, as described in Section 2, to represent the load demand variability and RES generation in every stage of the planning period. Hence, the investment decision variables should be calculated in such a way that they guarantee the safe operation of the network in every year of the planning period.

Chromosome Structure
The coding of the potential solutions is essential for the effective implementation of the GA. Every potential solution (chromosome) can be depicted as a four-part vector, in which the first part represents the decision for the reinforcement of the HV/MV substation, the second part represents the decision for the reinforcement of an existing distribution line, the third part represents the decision on the candidate routes for the connection of the future loads and the fourth part represents the installation bus of the capacitors. Each part of the chromosome is represented by a binary string as in common GAs. Figure 4 illustrates a simple example of the proposed encoding. Figure 4a presents the topology of a 6-bus distribution network in the reference year of the planning period. In Figure 4a, the dashed lines represent the candidate routes for the connection of the future load of bus 5 and bus 6; all distribution lines are assumed to be candidates for reinforcement; buses 2-4 are candidates for the installation of capacitors. Figure 4b illustrates the binary-coded candidate solution. The first part of the candidate solution of Figure 4b denotes the reinforcement of the substation at bus 1, since the corresponding gene is equal to 1. Similarly, the second part of the candidate solution of Figure 4b denotes that only line 2 is reinforced with the immediately next higher type of conductor, i.e., type 2, since the corresponding gene is equal to 1. The conductor types of lines 1 and 3 are the same as in the initial network topology (Figure 4a), since their corresponding genes are equal to zero. Moreover, the third part of Figure 4b denotes the addition of lines 4 and 6, while the fourth part denotes the installation of a capacitor at bus 4. Figure 4c illustrates the topology of the decoded candidate solution of Figure 4b.
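The four-part encoding can be illustrated with a small decoding routine. The sketch follows the description of Figure 4b, but the figure itself is not available here, so the gene order within each part and the set of candidate routes (assumed to be lines 4 to 7) are assumptions; the function and key names are hypothetical.

```python
def decode(chromosome, substations, lines, routes, capacitor_buses):
    """Split a binary chromosome into its four parts and list the selected items."""
    parts = (substations, lines, routes, capacitor_buses)
    if len(chromosome) != sum(len(part) for part in parts):
        raise ValueError("chromosome length does not match the candidate lists")
    names = ("substations", "reinforced_lines", "new_lines", "capacitors")
    decoded, start = {}, 0
    for name, labels in zip(names, parts):
        genes = chromosome[start:start + len(labels)]
        # A gene equal to 1 means that the corresponding alternative is selected.
        decoded[name] = [label for label, gene in zip(labels, genes) if gene == 1]
        start += len(labels)
    return decoded

# Candidate solution in the spirit of Figure 4b (candidate routes 4-7 assumed).
plan = decode(
    [1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1],
    substations=[1],
    lines=[1, 2, 3],
    routes=[4, 5, 6, 7],
    capacitor_buses=[2, 3, 4],
)
```

The decoded plan reinforces the substation at bus 1 and line 2, adds lines 4 and 6, and places a capacitor at bus 4.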

Chromosome Evaluation
Each chromosome is decoded using the procedure described in Section 4.2 and the fitness function (37) is used for its evaluation. To handle the violation of the constraints in the GA at every stage of the planning period, while searching for the optimal solution, a penalty function ($f_{pen,t}$) is incorporated into the fitness function, as given by (38). The penalty function (38) calculates the Euclidean distance ($Dcon_{k,m}$) from the upper or lower limit, in case constraint m is violated during load-generation set k [43]. To calculate (37), it is necessary to determine the time (installation period) of the investments given by the decoding of the chromosome, using the following steps:
Step 1: The investment decisions given by the decoding of the chromosome are stored in set {I} and it is assumed that the installation period of all elements of set {I} is t = 1.
Step 2: Select from set {I} an investment decision whose installation period is equal to t. Consider as network topology the network configuration without the selected investment decision, and solve the NLP problem with the fitness function (37) as objective function, subject to constraints (13)-(21) and (24)-(36). If the penalty function (38) is equal to zero, the distribution network operates within its specified limits and the selected investment decision is needed at a later stage of the planning period. Hence, the installation period of the selected investment decision is set to t + 1. This step is repeated until all available investment decisions of set {I} are examined for stage t.
Step 3: Set t = t + 1 and repeat Step 2 until t = T.
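Steps 1-3 can be sketched as the following postponement loop. The callable `is_feasible` is a hypothetical stand-in for solving the NLP problem at stage t and checking that the penalty function (38) is zero; the loop bounds are an assumption, chosen so that no investment is pushed beyond the last stage T.

```python
def schedule_investments(investments, n_stages, is_feasible):
    """Assign an installation period to each investment by postponing it
    while the network remains feasible without it."""
    period = {inv: 1 for inv in investments}  # Step 1: everything starts at t = 1
    for t in range(1, n_stages):  # Step 3: advance t (assumed to stop at T - 1)
        for inv in investments:  # Step 2: examine every investment due at stage t
            if period[inv] != t:
                continue
            # Network at stage t without the selected investment.
            active = {other for other in investments
                      if other != inv and period[other] <= t}
            if is_feasible(t, active):
                period[inv] = t + 1  # not needed yet, postpone by one stage
    return period

# Toy oracle: the line is needed from stage 3 onwards and the capacitor from stage 5.
needed_from = {"line": 3, "capacitor": 5}

def toy_feasible(t, active):
    return all(inv in active for inv, first in needed_from.items() if t >= first)

result = schedule_investments(["line", "capacitor"], n_stages=6,
                              is_feasible=toy_feasible)
```

With the toy oracle, both investments are postponed until the first stage at which the network can no longer operate without them.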

Initial Population
The effectiveness of the GA depends significantly on the selection of the initial population. To create diverse and high quality initial solutions for the GA, two procedures are followed. The first procedure aims at creating radial distribution networks. More specifically, the type of the existing substations and distribution lines is randomly selected. Furthermore, for the connection of a new load to the distribution network, first an available candidate route is randomly chosen and then its conductor type is randomly selected. The second procedure is based on Prim's algorithm, which is a minimum spanning tree algorithm, to create radial distribution networks. Then, the type of the substation capacity and the conductor type of the distribution lines are randomly selected. The first procedure is used to create 70% of the initial population, while the second procedure is employed to yield the remaining 30%.

Selection, Crossover, Mutation, and Next Generations
The initial population is evaluated and the genetic operators, i.e., selection, crossover and mutation, are employed next to create a new population. A stochastic tournament selection operator is employed for the selection of a population of size $N_{pop}$. Initially, two solutions $p_i$ and $p_j$ are randomly chosen from the population and the solution with the better fitness function value, i.e., with the lower total cost, is selected to be Parent 1. Parent 2 is selected with the same procedure. If Parent 1 and Parent 2 are different, the selection procedure is considered successful. In effect, the selection operator chooses two network topologies (Parent 1 and Parent 2) from the initial population in order to combine them and create a new network topology, i.e., a new candidate solution.
The combination of the two parent candidate solutions is performed by the crossover operator. In each parent chromosome a one-point crossover operator is employed and the cross-site is chosen randomly. Thus, the combination of the parent solutions yields two new candidate solutions, which contain features of both parent solutions. If the crossover operation is used in pc% of the mating pool population, the mutation operator will be used in the remaining (100 − pc)%. During the mutation operation, if the gene that is randomly chosen is equal to one, it will be set equal to zero and vice versa. For example, if the last gene of the fourth part of the candidate solution of Figure 4b is selected for mutation, there will be no capacitors in the network topology of the newly derived candidate solution. After crossover and mutation, a set with the best candidate solutions is selected, which forms the new population. The process is terminated after a maximum number of generations is reached.
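The three operators can be sketched as follows for chromosomes stored as lists of 0/1 genes. This is a generic illustration of tournament selection, one-point crossover and single-gene mutation under assumed details: the function names, the retry limit in `select_parents` and the use of a caller-supplied `cost` function are not taken from the paper.

```python
import random

def tournament(population, cost, rng):
    """Pick two random solutions and keep the one with the lower total cost."""
    a, b = rng.sample(population, 2)
    return a if cost(a) <= cost(b) else b

def select_parents(population, cost, rng, max_tries=100):
    """Repeat the tournament until two different parents are obtained."""
    for _ in range(max_tries):
        parent1 = tournament(population, cost, rng)
        parent2 = tournament(population, cost, rng)
        if parent1 != parent2:
            return parent1, parent2
    raise RuntimeError("could not select two different parents")

def one_point_crossover(parent1, parent2, rng):
    """Swap the tails of the two parents after a randomly chosen cross-site."""
    site = rng.randint(1, len(parent1) - 1)
    return parent1[:site] + parent2[site:], parent2[:site] + parent1[site:]

def mutate(chromosome, rng):
    """Flip one randomly chosen gene (1 becomes 0 and vice versa)."""
    position = rng.randrange(len(chromosome))
    child = list(chromosome)
    child[position] = 1 - child[position]
    return child
```

A fixed seed, e.g. `rng = random.Random(1)`, makes a run reproducible, which is convenient when comparing planning cases.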

Results and Discussion
To validate the performance of the proposed methodology, a modified 24-bus distribution test system [9] is used. The load data in the start of the planning period are presented in Table A1 of Appendix A. The 24-bus distribution network is a 20 kV network and it consists of 2 substations, 20 load buses and 34 branches, illustrated in Figure 5, where the dashed lines stand for the candidate routes for network expansion. The duration of the planning period is equal to 20 years. The future loads that are planned to be connected to the network, as well as the DG units that are planned to be installed are shown in Tables 1 and 2, respectively. The power factor of the future load is 0.90 lagging. All buses of the initial topology are candidate for the placement of capacitors. The costs associated with the planning procedure and techno-economic characteristics of the available planning alternatives are presented in Table 3. Voltage limits are ±5% of the nominal voltage and the capacity of the existing substations is 15 MVA. The hourly data of load demand, wind and solar generation of Figure 2, i.e., 3 × 8760 values, are grouped into 50 clusters. Three cases are considered as follows: • Case A: The DNP problem is solved considering that the DG units operate with unity power factor (i.e., there is no control of the output power of the DG units). Furthermore, the violation of the voltage and line capacity constraints is not allowed (i.e., β V = β SL = 0). • Case B: The DNP problem is solved considering the active management of the DG, while the violation of the voltage and line capacity constraints is not allowed (i.e., β V = β SL = 0). • Case C: The DNP problem is solved using the proposed multistage CCP planning method. The active management of DG is incorporated into the planning process and the minor violation of the voltage and line capacity constraints is allowed (β V = 5%, β SL = 10%).
The network topologies at the end of the planning period for Cases A-C are presented in Figure 6. Figure 7 presents the total cost (TC) of the reinforcement and expansion plan in Cases A-C, along with the investment (INV) and operational costs (OPC) in each case.
Case A: The net present value of the total cost of the reinforcement and expansion plan is 1440.17 k$ and the network topology at the end of the planning period is shown in Figure 6a. Due to the increase of renewable generation in the first two years of the planning period, distribution lines 21-2 and 22-8 are reinforced with a type 3 conductor at year 1 and year 2, respectively, of the planning period. To meet the load growth demand, distribution line 21-1 is reinforced with a type 3 conductor at year 3; distribution line 8-7 is reinforced with a type 3 conductor at year 6; and distribution line 21-1 is reinforced with a type 3 conductor at year 11. The capacity of the substations at bus 21 and bus 22 is increased to 25 MVA at year 13 in order to ensure that the substations will operate within their capacity limits during the whole planning period. Operation of the network within its voltage limits is ensured by the installation of two capacitors, at bus 10 at year 18 and at bus 14 at year 19.
Case B: Figure 6b presents the network topology at the end of the planning period considering the active management of the DG units. In this case, the net present value of the total cost is 1253.75 k$. The main difference between the network topologies of Case A and Case B is the deferral of the investments in the reinforcement of distribution lines 21-2 and 22-8 due to the active management of the DG units. More specifically, distribution line 22-8 is reinforced with a type 3 conductor at year 4 and distribution line 21-2 is reinforced with a type 3 conductor at year 11 in order to meet the load demand. Furthermore, distribution line 7-23, which is shorter than line 3-23, is added to connect the wind DG unit at bus 23 to the network. Moreover, as shown in Figure 7, the incorporation of the active management of the DG units into the planning process decreases the operational cost of the planning solution compared with that of Case A.
Case C: Considering the CCP planning framework, the network topology at the end of the planning period is illustrated in Figure 6c. In this case, the net present value of the total cost is 1141.50 k$. Due to the proposed CCP framework, the main differences between Cases B and C are the deferral of the investments in the reinforcement of distribution lines and in the placement of capacitors to later years of the planning period. More specifically, distribution line 21-1 is reinforced with a type 3 conductor at year 7; distribution lines 21-8 and 8-7 are reinforced with a type 3 conductor at year 8; distribution line 22-6 is reinforced with a type 3 conductor at year 14; distribution line 21-2 is reinforced with a type 3 conductor at year 15; no capacitors are installed at bus 10 and bus 14. As shown in Figure 7, the planning solution with the lowest total cost is obtained when both the active management of the DG units and the CCP framework are incorporated into the planning process (Case C). More specifically, the exploitation of the control capabilities of the DG units defers the network investments that are necessary for the connection of the DG units to the network, while the adoption of the CCP planning framework defers the network investments that are necessary due to the increase of the load demand. Hence, the total cost of the planning solution of Case C, which is the proposed one, is 21% lower than the cost of the planning solution of Case A and 9% lower than the cost of the planning solution of Case B.
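The reported savings follow directly from the net present values of the total cost in the three cases. The short calculation below reproduces them; only the three cost values are taken from the text, while the variable and function names are chosen here for illustration.

```python
# Net present value of the total cost in each case, in k$ (values from the text).
total_cost_k_usd = {"A": 1440.17, "B": 1253.75, "C": 1141.50}


def relative_saving(reference, proposed):
    """Percentage by which the proposed cost is lower than the reference cost."""
    return 100.0 * (reference - proposed) / reference


saving_vs_a = relative_saving(total_cost_k_usd["A"], total_cost_k_usd["C"])
saving_vs_b = relative_saving(total_cost_k_usd["B"], total_cost_k_usd["C"])

print(f"Case C vs Case A: {saving_vs_a:.1f}% lower")
print(f"Case C vs Case B: {saving_vs_b:.1f}% lower")
```

The savings evaluate to about 20.7% and 9.0%, which round to the 21% and 9% quoted above.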

Conclusions
This paper proposes a multistage planning method for ADNs. The uncertainties of load, wind and solar generation are modeled using a method based on the k-means clustering technique and they are incorporated into the optimization procedure. A chance-constrained DNP optimization model is introduced to efficiently handle these uncertainties. The objective of the proposed methodology is to minimize the operational cost and the investment costs for the reinforcement of substations and distribution lines, the addition of distribution lines and the placement of capacitors. A problem-specific GA is proposed for the solution of the DNP optimization problem. The effect of the active network management on the optimal solution of the DNP problem is investigated. To demonstrate the performance of the proposed planning methodology, a 24-bus distribution test system is used. In the test system, the active management of the active and reactive power output of the DG units defers network investments and reduces the total cost by about 13% compared with the traditional planning approach (Case B versus Case A), and the additional use of the CCP framework raises the reduction to 21%. Furthermore, the proposed CCP model provides valuable information about the relationship between the investment cost, the risk of distribution line overload and the risk of voltage limit violation.
Funding: This research has been co-financed by the European Union and Greek national funds through the Operational Program Competitiveness, Entrepreneurship and Innovation, under the call RESEARCH-CREATE-INNOVATE (project code: T1EDK-00450).

Conflicts of Interest

Nomenclature
Cost of type c capacitor.
C cd,b: Cost of type b conductor.
C SS,a: Cost of type a substation.
C Loss: Cost of losses.
Inf: Inflation rate.
Int: Interest rate.
l ij: Length of line i-j.
P d,i,k,t /Q d,i,k,t: Active/reactive load demand of bus i at the load-generation set k during stage t.
P dg,i,k,t: Active power of the distributed generation (DG) unit at bus i at the load-generation set k during stage t.
P rated DG,i,t: Rated active power of the DG unit at bus i during stage t.
P rated SDG,i,t: Rated active power of the solar DG unit at bus i during stage t.
P sdg,k: Solar DG at load-generation set k.
P rated WDG,i,t: Rated active power of the wind DG unit at bus i during stage t.
P wdg,k: Wind DG at load-generation set k.
Q CB,i: Nominal reactive power of the capacitor at bus i.
Conductance/susceptance of type b conductor.
S rated DG,i: Maximum apparent power of the DG unit at bus i.
S max L,b: Thermal limit of type b conductor.
S max SS,a: Capacity of type a substation.
s, s n: Solar irradiance, nominal solar irradiance.
V min /V max: Minimum/maximum voltage magnitude limits.
v, v n , v ci , v co: Wind speed, nominal wind speed, cut-in wind speed, cut-out wind speed.

C. Variables
CF i,k,t: Curtailment factor of the active power output of the DG unit at bus i at the load-generation set k during stage t.
P ij,k,t /Q ij,k,t: Active/reactive power flow of line i-j at the load-generation set k during stage t.
P loss,k,t: Power losses at the load-generation set k during stage t.
P SS,i,k,t /Q SS,i,k,t: Active/reactive power flow of substation at bus i at the load-generation set k during stage t.
Q dg,i,k,t: Reactive power of the DG unit of bus i at the load-generation set k during stage t.
Q sdg,i,k,t: Reactive power of the solar DG unit of bus i at the load-generation set k during stage t.
Q wdg,i,k,t: Reactive power of the wind DG unit of bus i at the load-generation set k during stage t.
V i,k,t: Voltage magnitude of bus i at the load-generation set k during stage t.
θ ij,k,t: Voltage angle difference between bus i and j at the load-generation set k during stage t.

D. Binary Variables
id SS i,a,t: Investment decision variable for the reinforcement of substation at bus i using type a substation at period t.
id L ij,b,t: Investment decision variable for the installation of line i-j using type b conductor at period t.
id CB i,c,t: Investment decision variable for the installation of type c capacitor at bus i at period t.
rd ij,t: Spanning tree variable. It is equal to 1 if bus i is the parent of bus j at period t; otherwise it is equal to 0.
uz SS i,a,t , uz L ij,b,t , uz CB i,c,t: Auxiliary variables associated with the investment decision variable for the reinforcement of substation at bus i, the installation of line i-j using type b conductor, and the installation of type c capacitor at bus i, respectively, at period t.

Table A1 presents the load of the 24-bus distribution network at the reference year.