Multivariable Regression Equivalent Model of Interconnected Active Distribution Networks Based on Boundary Measurement

: With the increasing complexity of the active distribution network (ADN) due to distributed generation (DG) integration, together with the electricity market evolution, the traditional ADN is divided into multiple areas to operate independently. Due to technical problems or business privacy, the internal network regional control center cannot grasp the changes of the external regional network in time. In order to accurately reﬂect the distribution network operation state, a multivariable regression equivalent model is proposed in this paper. Firstly, the external network is made equivalent to a multi-port Norton model. The multivariable linear regression model is then derived based on the equivalent distribution network, and the regression model variables are constructed using boundary node information collected by the measurement equipment. Finally, the maximum likelihood estimation (MLE) is used to estimate the parameters of the multivariable linear regression model. Furthermore, case studies demonstrate the e ﬀ ectiveness and robustness of the proposed method, and detailed information of external ADN is unnecessary, except for the boundary node information. The proposed method can also be applied for three-phase unbalanced ADN e ﬃ ciently.


Introduction
Due to the penetration of a large number of distributed generation (DG) in the distribution system, together with demand-side interaction enhancement, the traditional distribution network is gradually evolving into an active distribution network (ADN), and its topology and power flow are moving towards complexity and diversification [1]. In order to reduce the complexity of the distribution network model, together with the electricity market reformation, a single distribution network may be divided into interconnected distribution networks that are operated independently by different owners [2,3]. Due to technical issues or business privacy, information between regions cannot be exchanged in a timely and efficient manner. In order to give the regional management centers the required responsiveness, it is necessary to establish an equivalent method for the three-phase unbalanced interconnected distribution network. The multi-microgrid (MMG), defined as multiple interconnected small-scale power grid that is comprised of distributed energy resource systems, storage devices, and local demands, also has similar characteristics as the interconnected ADN [4,5]. Therefore, the established equivalent method of the interconnection ADN could also be applicable for MMG.
Various equivalent models have been proposed. The Ward equivalent model, REI equivalent model, and their expansion have been widely used [6][7][8][9][10]. However, detailed grid information of the external system is needed by these methods, which cannot make an equivalent for an unknown According to circuit principles, external networks connected with the internal network can be equivalent to multi-port Norton models, as shown in Figure 1. And the external Norton equivalent model consists of multiple current sources and admittance matrices [30]. U E and U B represent external node voltage and boundary node voltage matrix, respectively. Y EE and Y EB represent the external admittance matrix and the external-boundary admittance matrix, respectively. I E and I BE respectively represent the Norton equivalent current sources and the outputting current matrix from boundary nodes to external nodes. When the interconnected external grid is a three-phase system, the dimension of these matrixes would be expanded to three times that of the original matrix. Besides, the node number of external systems established by the equivalent model equals the number of boundary nodes. According to circuit principles, the network equations described by an admittance matrix of the power system can be represented as: The network is divided into the internal node set {N}, the boundary node set {B}, and the external node set {E}, then Equation (1) can be re-expressed as: The voltage sub-matrix U E corresponding to the external network in (2) can be eliminated through Gaussian elimination method, and the following linear equation is obtained: Equation (3) contains the information of the internal network and external unknown network. Specifically, Y BE Y EE −1 Y EB contains admittance information of the external network and I E contains the external network current information, which is unknown. Besides, the remaining parameters in the equation are known to be associated with the internal network, the following equation can be obtained according to (3): where Y EB is the mutual admittance matrix between boundary nodes and external nodes. Since the number of external nodes of the established multi-port Norton model is equal to the boundary node number, and the phase loss portion of the boundary node of the unbalanced system can be further deleted, Y EB is a nonsingular matrix. Therefore, (4) can be transformed into: then Equations (3) and (5) can be expressed as: Equation (7) is the mathematical equivalent model of the external grid with unknown parameters. It is vital to estimate the unknown parameters to obtain the real operation state.
The following equation is obtained through expanding (6): where b is the number of boundary nodes. In order to accurately estimate the unknown parameters in Equation (8), the following model is constructed according to the multivariable linear regression analysis method: .
Equation (9) is the multivariable linear regression model of the interconnected active distribution network. By solving the model, the external network equivalent parameters β and U E eq can be obtained to estimate the operating state of the system.

Algorithm for Regression Model
The algorithm for the multivariable linear regression model of the interconnected ADN is described in this section. Firstly, the boundary system data are collected by measuring equipment placed at boundary nodes. From the theoretical point of view, the equivalent parameters estimation method of the external network is then derived based on the maximum likelihood estimation. Finally, the estimation procedures are introduced.

Collection Boundary Nodes Information
Measuring equipment, such as PMUs and micro-PMUs, are placed at boundary nodes to synchronously collect the complex current flowing into the external ADN and the complex voltage of the boundary bus [31,32], as shown in Figure 2. The collected boundary information is transmitted to state estimator to constructed Equation (8).

Maximum Likelihood Estimation of Equivalent Parameters
Regression vertical outliers and poor leverage points can be suppressed by maximum likelihood estimation to achieve the highest statistical efficiency. When the system has a mean Gaussian distribution measurement noise, the Equation (9) can be expressed as follows: where U is the set of voltage values collected by the internal system measuring equipment, k is the current flowing into the external ADN, β is the constructed external network information matrix, which is called regression coefficient, and ε is the error vector. The specific form of ε is shown as follows: The density function of error is as follows: where the constant σ 2 represents the variance. The maximum likelihood function is the joint density function of ε 1 , ε 2 , · · · , ε n , or Π n i=1 f (ε i ), and its specific form is as follows: Bring Equation (10) into (13), the likelihood function can be transformed as follows: For a constant value σ, to maximize the likelihood function, the optimal estimation matrix β is required to minimizing the term (U − kβ) T (U − kβ). And the residual squared function S(β) is introduced: It is noted that β T k T U and U T kβ are the same matrix due to the symmetry of the system admittance matrix, and S(β) can be further expressed as the following equation: Since all columns of the regression variable k obtained according to multiple sets of power flow are not linearly combined with other columns, k is a nonsingular matrix. Hence, the estimated vector ofβ is as follows: The obtainedβ is the maximum likelihood estimation parameter matrix, and the detailed solution of the multivariable linear regression model can be found in [33]. The matrix U E eq can be obtained by bringing the estimated matrixβ into Equation (6). Bringβ and U E eq into Equation (7) to establish the multi-port Norton equivalent model of the external system. Thus, the operation state of the interconnected ADN can be effectively estimated accordingly.

Equivalent Procedures
The external network equivalent method proposed in this paper consists of two phases. In the first phase, the measuring devices are used to synchronously collect the boundary node data (complex voltage and complex current, see Figure 1). In the second phase, the multiple regression equivalent models are established and solved to obtain the external network equivalent parameters. The external equivalent network is constructed by using the equivalent parameters to reflect the real operating state of the system.
The proposed external network equivalent algorithm can be summarized by the following steps: Step 1: Set up the measuring devices to synchronously measure the complex voltage U B and complex current K flowing into the external network.
Step 2: The collected boundary information is used as the regression variable in Equation (8). In order to estimate the external network parameters more accurately, the number of regression variables collected can be referenced to [34].
Step 3: Standardize boundary data according to the following equation: x is the average of the collected data, and N is the number of collected data.
Step 4: A multivariate linear regression model is established based on Equation (9), and then the maximum likelihood estimation is used to solve the model based on Equations (10)- (17) to obtain external network equivalent parameters β and U E eq .
The external network equivalent procedures are shown in Figure 3.

Simulation Verification
In this section, Matlab and Opendss interactive simulations are performed on the personal computer with a 1.9 GHz A8 processor and 8 GB RAM to verify the accuracy of the proposed algorithm.
The method proposed in this paper named MEG is compared with three mainstream equivalent methods: (1) PQ equivalent: The external network is made equivalent to load according to the injected power of the boundary node. (2) PV equivalent: The external network is made equivalent to the generator according to the injected power of the boundary node. (3) COM equivalent: The external network is equivalent to the generator if the injected power of the boundary node is positive, otherwise, it is equivalent to load.

Test Systems
Simulation on the IEEE 69-node distribution test system and IEEE 123-node distribution test system are carried out to verify the effectiveness and accuracy of the proposed method. The two systems constructed are divided into an external network, boundary network, and internal network, and the node numbers are rearranged, as shown in Figures 4 and 5. The external network consists of various areas with DGs installed.
Case I: The modified IEEE 69-node system is a three-phase balanced system, node 1 is the balanced node, and the initial voltage is 12.66 kV. The three-phase power base value is 10 MVA, and the total load is 3802.19 + j2694.6 kVA [35].

Test Systems
Simulation on the IEEE 69-node distribution test system and IEEE 123-node distribution test system are carried out to verify the effectiveness and accuracy of the proposed method. The two systems constructed are divided into an external network, boundary network, and internal network, and the node numbers are rearranged, as shown in Figures 4 and 5. The external network consists of various areas with DGs installed.

Case II:
The modified IEEE 123-node system is a three-phase unbalanced system with a voltage level of 4.16 kV, the relaxation node voltage is set to 1.05 (per-unit value), the feeder capacity is 5.28 MW, the three-phase active power is 3490 kW, and the reactive power is 1925 kVar [36,37].

Scenarios Setting
The distribution system operator will adjust the operating condition by changing the structure of the distribution system accordingly [38]. Different scenarios are constructed to simulate the operation states of the distribution system based on the two cases above.

Case II:
The modified IEEE 123-node system is a three-phase unbalanced system with a voltage level of 4.16 kV, the relaxation node voltage is set to 1.05 (per-unit value), the feeder capacity is 5.28 MW, the three-phase active power is 3490 kW, and the reactive power is 1925 kVar [36,37].

Test Systems
Simulation on the IEEE 69-node distribution test system and IEEE 123-node distribution test system are carried out to verify the effectiveness and accuracy of the proposed method. The two systems constructed are divided into an external network, boundary network, and internal network, and the node numbers are rearranged, as shown in Figures 4 and 5. The external network consists of various areas with DGs installed.
Case I: The modified IEEE 69-node system is a three-phase balanced system, node 1 is the balanced node, and the initial voltage is 12.66 kV. The three-phase power base value is 10 MVA, and the total load is 3802.19 + j2694.6 kVA [35].

Case II:
The modified IEEE 123-node system is a three-phase unbalanced system with a voltage level of 4.16 kV, the relaxation node voltage is set to 1.05 (per-unit value), the feeder capacity is 5.28 MW, the three-phase active power is 3490 kW, and the reactive power is 1925 kVar [36,37].

Scenarios Setting
The distribution system operator will adjust the operating condition by changing the structure of the distribution system accordingly [38]. Different scenarios are constructed to simulate the operation states of the distribution system based on the two cases above.

Scenarios Setting
The distribution system operator will adjust the operating condition by changing the structure of the distribution system accordingly [38]. Different scenarios are constructed to simulate the operation states of the distribution system based on the two cases above.

Three Scenarios based on Case 1
Scenario 1: modified IEEE 69-node three-phase balanced distribution system with DG integration. Information of installed DGs are shown in Table 1  Scenario 2: Based on Scenario 1, the normal loop closing operation is performed, and the nodes connected by the loop closing interconnection switch are node 15 and node 69.
Scenario 3: modified IEEE 69-node distribution system with random fluctuation load. It is a dynamic scenario, and the load curve is shown in Figure 6.

Three Scenarios based on Case 2
Scenario 1: modified IEEE 123-node three-phase unbalance distribution system with DG integration. The information of installed DGs are shown in Table 2   Scenario 2: Based on Scenario 1, the normal loop closing operation is performed, and the nodes connected by the loop closing interconnection switch are node 56 and node 117. Scenario 3: This is a dynamic scenario based on Scenario 1 with fluctuation load. The load curve is the same as Case I.

Equivalent Error Indicators
Since the basic requirement of the distribution system is to provide reliable voltage support for the demand, the voltage related indicators are used in this paper to evaluate the accuracy of the equivalent methods. The average absolute error e avg and the maximum absolute error e max of the node voltage amplitude are selected as the indicators during the static simulation scenarios, and expressions are as follows: where V n is the voltage value of node n renumbered in the internal network under the original model, V n eq is the voltage value of node n under the equivalent model, and N is the internal network node set.
The lower error indicators e max and e avg represent higher equivalent accuracy. Under dynamic scenarios, the root mean square error of each node voltage before and after equivalent is used as the measuring indicator. The calculation formula is as follows: where m is the number of collection points within a certain period of time. Lower V RMSE value obtained under the load fluctuation scenario means higher equivalent accuracy.

Simulation Results of Case 1
4.4.1. Scenario 1 Figure 7a and b respectively show the internal system voltage amplitude and the absolute error of four equivalent modeling methods under Scenario 1 of Case 1. ORI depicts the voltage amplitude of the original system. As can be seen from the figure, four different equivalent models can reflect the voltage of each node of the original system to some extent, but the voltage calculated by the method proposed in this paper is closer to the actual value. Which indicates that the proposed equivalent model improves the accuracy dramatically.  Table 3 shows the equivalent errors values of e max and e avg of the four methods with DG and without DG. It can be seen from the table that the equivalent accuracy drops significantly due to the integration of DG, especially for the methods of PV, PQ, and COM. However, the values of e max and e avg calculated by the equivalent model proposed in this paper are still relatively small compared with the other three methods, which indicates the robustness and the equivalent accuracy of the proposed method. Hence, the equivalent model proposed in this paper reveals the real situation of the original system.  Figure 8a,b respectively show the internal system voltage amplitude (p.u.) and absolute error (%) as reflected by the equivalent model obtained by each method in Scenario 2. As can be seen from the figure, after the loop closing, the voltage of each branch connected to the loop closing interconnection switch has a significant drop. The equivalent model obtained by the above four methods can basically reflect the voltage of each node in the original system, but the internal node voltage calculated by the method proposed in this paper is closer to the actual value of the internal voltage of the system.  Table 4 compares the equivalent results e max and e avg reflected by the above four equivalent modeling methods in Scenario 2. As can be seen from the table, under the loop closing state, the internal system voltage e max and e avg calculated by the equivalent model proposed in this paper are smaller than those obtained by the other three methods, indicating that compared with the other three models, the model proposed in this paper has a higher equivalent accuracy and can more accurately reflect the voltage states of the internal system. Load fluctuations are introduced in this scenario to further illustrate the dynamic characteristics of the proposed equivalent model for interconnected ADN. Figure 9a shows the dynamic variation of the voltage amplitude of each node in the internal network of the original model. It can be seen that the voltage of each node fluctuates slightly with the fluctuation of the load, but can be maintained within a certain range under the action of the relaxed node. Figure 9b shows the absolute error of the internal system voltage obtained by the equivalent model proposed in this paper. It can be seen from Figure 9b that the equivalent accuracy of the proposed method can be guaranteed even in this dynamic scenario (the equivalent error is lower than 1%). It can also accurately reflect the variation of the internal system voltage.  Figure 10 shows the voltage mean square errors (V RMSE ) for four different models before and after equivalent. It can be seen from the figure that under Scenario 3, the V RMSE value (lower than 2.5%) obtained by the method proposed in this paper is lower than that of the other three methods (greater than 5%), indicating that the equivalent model proposed in this paper can reflect the actual voltage of the internal system under dynamic scenarios better and has better dynamic characteristics. In Scenario 1, The simulation results with different models are shown in Figure 11a-c, respectively. As we can see, the node voltages obtained by four methods are similar (some nodes lack a phase due to the three-phase unbalance characteristics). However, the voltage amplitude obtained by the method proposed in this paper is closer to the voltage using the original model, which indicates that the equivalent method proposed in this paper can reveal the actual voltage better than the other three methods, when large numbers of DGs are connected to the distribution networks. The results of e max and e avg obtained by four methods are shown in Table 5. It can be seen from the table that the absolute errors of each phase voltage obtained by the method proposed in this paper are the smallest compared with the other three methods, which indicates that the accuracy of the method proposed in this paper is higher.    Table 6 shows the e max and e avg of the above four methods in Scenario 2. It can be seen from the table that the equivalent accuracy of the four methods in the loop-opening state is significantly lower than that in Scenario 1. However, the absolute error value of voltage obtained by the method proposed in this paper is still the smallest compared with the other three methods, which indicates that the method proposed in this paper is more accurate in terms of equivalent than the other three methods in the original network loop-opening state, and has the best equivalent accuracy. The voltages of three-phase of the internal system using the original model are as shown in Figure 13a-c. As we can see, it shows obvious fluctuation with the load changing. However, the fluctuation is maintained within a certain range due to the existing compensation devices such as capacitors.  Figure 14 a-c shows the absolute error of the three-phase voltage of each node obtained by the equivalent model proposed in this paper. It can be seen from the figure that the maximum absolute error of three-phase voltage is lower than 2%, which indicates that the accuracy of the proposed equivalent model can be assured even under the dynamic scenario with fluctuation load.  Figure 15 shows the value of V RMSE using different equivalent models. It can be seen that the V RMSE of the internal nodes of each phase obtained by the method proposed is the smallest with dynamic fluctuation load, and the V RMSE of three-phase is less than 0.5%, which indicates that the equivalent accuracy of the method proposed in this paper is much better than the other three methods under dynamic scenarios. In addition, it can be seen from the figure that the number of mean square error abnormal values of the method proposed in this paper is also lower than the other three methods, further indicating that the dynamic performance of the method proposed in this paper is better than the other three methods.

Conclusions
A multivariable regression equivalent method for interconnected active distribution networks is proposed in this paper. A multi-port Norton model considering the coupling relationship of interconnected networks is developed and the multivariable regression model is derived. The parameters of the equivalent model are estimated according to the boundary nodes measuring data. Besides, IEEE 69-node three-phase balanced active distribution system and IEEE 123-node three-phase unbalanced active distribution system are used to demonstrate the effectiveness of the proposed model. The advantages of this method are as follows: (1) The proposed method can establish an external equivalent model for multi-area interconnected active distribution network to accurately reveal the actual operating state of the internal system. (2) The proposed method can be used for three-phase unbalanced distribution system efficiently, and exchange information between interconnected distribution networks is unnecessary. (3) The equivalent model parameters have certain dynamic characteristics. The model maintains a certain accuracy even with load fluctuations.