Active Distribution Network Fault Diagnosis Based on Improved Northern Goshawk Search Algorithm

Abstract: Timely and accurate fault location in active distribution networks is vital to ensuring reliable power grid operation. However, existing intelligent algorithms applied to fault location in active distribution networks suffer from slow convergence and low accuracy, hindering the construction of new power systems. In this paper, a new regional fault localization method based on an improved northern goshawk search algorithm is proposed. The quality of the initial population is improved by a chaotic initialization strategy. Meanwhile, the sine-cosine strategy and an adaptive Gaussian–Cauchy hybrid mutation perturbation strategy are introduced into the northern goshawk search algorithm; the perturbation operation disturbs individuals to increase population diversity, which helps the algorithm escape local optima. Finally, simulation verification was carried out on a multi-branch distribution network containing distributed power sources. Compared with traditional regional localization models, the proposed method achieves faster convergence and higher location accuracy under different fault locations and different distortion points.


Introduction
Distribution networks are located at the end of the power system and connect directly with users, so ensuring their operational stability is particularly important. Under the dual-carbon background, more and more distributed generation (DG) systems are being connected to distribution networks and, as a result, the direction of the system power flow is no longer unique. The traditional distribution network is transformed into a multi-directional, complex active distribution network (ADN) whose structure is more complicated, which increases the difficulty of fault location and poses great challenges to the stable operation of the ADN [1,2]. Thus, it is of great research significance to study fault localization methods suitable for ADNs [3][4][5][6][7][8][9].
Due to the access of DGs, the fault characteristics of ADNs are quite different from those of traditional distribution networks when a fault occurs [10]. The differences can be summarized as follows: (1) The location of the access points and the access capacity of the DGs affect the direction of the system power flow and the amplitude of the fault current [11]. (2) The output of each DG is uncertain, causing uncertainty in the fault transient process. (3) The low-voltage distribution network has more branches, and their line parameters are unevenly distributed, increasing the complexity of fault analysis [12]. Against the background of automated upgrades of distribution network equipment, research on fault localization methods based on the current information uploaded by feeder terminal units (FTUs) has become a hot spot [13].
Citation: Guo, Z.; Ji, X.; Wang, H.;
A variety of distribution network fault localization methods have been put forward, which can be classified according to the localization results into fault routing, fault ranging, and fault segment localization [14]. These methods are mainly realized with matrix algorithms and intelligent algorithms. The matrix algorithm [15] combines the distribution network topology with the current information uploaded by the FTU to generate a fault discrimination matrix and locates the fault section through matrix operations; the intelligent algorithm is based on the theory of the "minimum fault diagnostic set" and converts the fault section localization problem into a mathematical optimization problem, which can then be solved using intelligent optimization algorithms. The authors of [16] use the multiverse algorithm to locate faults in distribution networks, improving the algorithm by introducing an adaptive elite strategy and an adaptive mutation operation; although the method localizes faults well, it requires substantial computational resources and has limitations in treating some specific faults. The authors of [17] propose a localization method based on the vulture search algorithm, which improves the algorithm's optimization ability by introducing the crossover operator, the non-uniform mutation operator, and the somersault foraging strategy. However, its high computational cost in fault localization limits its application. The matrix algorithm and the chaotic binary particle swarm algorithm were used to solve the distribution network fault localization problem in [18], which establishes the causal association matrix and the criteria for zones and nodes based on the actual structure of the distribution network. Nevertheless, the convergence and stability of the algorithm need further consideration, since they determine whether the method can be effectively applied in ADNs. The authors of [19] apply an improved algorithm based on the quantum ant colony algorithm to solve the distribution network fault location problem; the authors of [20] verify that introducing the improved sine-cosine algorithm into the local development stage of the algorithm increases the population diversity at the late iteration stage, prevents the algorithm from falling into a local optimum, and effectively improves the algorithm's solution accuracy and convergence speed; the authors of [21] propose an improved differential evolution algorithm, self-adaptive differential evolution with Gaussian-Cauchy mutation (SDEGCM), which introduces two strategies, Gaussian-Cauchy mutation and parameter self-adaptation, to improve the performance of the algorithm; and the authors of [22] employ a Hunger Games search algorithm based on Gaussian-Cauchy mutation. However, the selection and adjustment of the parameters of these four algorithms have a large impact on the results, and sufficient parameter tuning and testing are needed for specific problems.
To address these problems, this paper proposes a zonal fault localization model for distribution networks based on improved northern goshawk optimization (INGO). A chaotic initialization strategy is used to improve the quality of the initial population, and the sine-cosine strategy and the adaptive Gaussian-Cauchy hybrid mutation perturbation strategy are introduced into the northern goshawk search algorithm (NGSA); the perturbation operations disturb individuals to improve population diversity, which helps the algorithm escape local optima. Finally, the effectiveness and reliability of the proposed method are verified by comparison with the northern goshawk optimization (NGO) algorithm, the gray wolf optimization (GWO) algorithm, and the whale optimization algorithm (WOA).

Distribution Network Zoning Models and Systems
Distribution system faults include short circuits, overloads, ground faults, and other fault types. Zonal fault localization in distribution networks means dividing the network into a series of zones and monitoring and analyzing the flow of electrical energy in each zone so that the positions where faults occur can be located quickly. This zonal approach improves the accuracy and efficiency of fault diagnosis and shortens the fault-processing time, supporting the stable operation of the power system.
Taking the dual-source distribution network shown in Figure 1 as an example, it is stipulated that the state of the end nodes in each region represents the state of the whole region. The first level of the hierarchical localization model uses an algorithm to locate the region where the fault occurs according to the regional states, and the second level locates the specific faulty zone within the region obtained from the first level. For example, suppose a fault occurs in zone 9 of the distribution network. The first level runs the regional localization algorithm on the state of each region and reports a fault in Region 3; the second level then runs the segment localization on the states of nodes 8 and 9 in Region 3 to determine the faulty zone.

Coding Methods
In order to accurately determine the direction and source of the fault current when a fault occurs in a zone, this paper defines the direction of current flow from the main power supply to the load side as the positive direction. $I_j$ denotes the uploaded value of the fault current at node $j$: $I_j = 1$ when the FTU detects a forward fault current, $I_j = -1$ when the FTU detects a reverse fault current, and $I_j = 0$ when the FTU does not detect a fault current, as shown in Equation (1).

$$
I_j =
\begin{cases}
1, & \text{positive fault current flowing at switch } j \\
0, & \text{no fault current flowing at switch } j \\
-1, & \text{reverse fault current flowing at switch } j
\end{cases}
\tag{1}
$$
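As a minimal illustration of the coding rule in Equation (1), the sketch below maps FTU readings to the three-valued code. The string labels `"forward"`, `"none"`, and `"reverse"` and the function name `encode_fault_current` are assumptions made for this example only; the paper does not specify how the FTU reports direction.

```python
def encode_fault_current(direction):
    """Map one FTU reading to the code of Equation (1).

    `direction` is a hypothetical label: "forward", "none", or "reverse".
    """
    codes = {"forward": 1, "none": 0, "reverse": -1}
    if direction not in codes:
        raise ValueError(f"unknown FTU reading: {direction!r}")
    return codes[direction]


# Hypothetical readings for four switching nodes.
readings = ["forward", "forward", "none", "reverse"]
encoded = [encode_fault_current(r) for r in readings]
print(encoded)
```

The resulting list plays the role of the uploaded current vector that is later compared with the switching function values.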

Construction of Switching Functions
The switching function converts the line fault state into switch fault information. Considering the switching of distributed power supplies, the switching function is defined by Equation (2) [19], where $I_j^*$ is the switching function value of node $j$, also known as the desired state value; $K_u$ and $K_d$ are the power-supply switching coefficients of the upstream and downstream regions of node $j$, respectively, which are set to 1 when the power supply is connected and to 0 otherwise; $\prod x_{j,s_u}$ is the logical OR of the states of all zones between node $j$ and each upstream power supply; $\prod x_{j,s_d}$ is the logical OR of the states of all zones between node $j$ and each downstream power supply; $\prod x_{j,u}$ and $\prod x_{j,d}$ are the logical OR of the states of the upstream and downstream zones of node $j$, respectively; $M_1$ and $M_2$ are the numbers of power sources upstream and downstream of node $j$, respectively; and $N_1$ and $N_2$ are the numbers of zones upstream and downstream of the node, respectively.

Objective Function and Switching Function
The principle of localization based on the FTU fault current information is to minimize the sum of the differences between the uploaded and actual values of the currents at each node. The smaller this sum, the higher the similarity between the solved fault situation and the actual fault situation. The objective function is defined by Equation (3) [19], where $\omega$ is the factor preventing misjudgment and $\omega\lvert x_j\rvert$ is the misjudgment-prevention term for node $j$. When the uploaded value of a zone is not equal to the actual value, the minimum of the objective may correspond to several different fault combinations; to make up for this shortcoming, the term $\omega\lvert x_j\rvert$ is introduced, with $\omega$ set to 0.5.
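To make the role of the misjudgment-prevention factor concrete, the following sketch evaluates an objective of the form commonly used with the "minimum fault diagnostic set" theory, $F(x) = \sum_j \lvert I_j - I_j^*(x)\rvert + \omega \sum_j \lvert x_j\rvert$. This form is an assumption made here for illustration. The switching function is also simplified to a single-source radial feeder, in which a node sees fault current if any zone at or downstream of it is faulty; this is not the multi-source Equation (2). The names `expected_currents` and `fitness` are hypothetical.

```python
from itertools import product

OMEGA = 0.5  # misjudgment-prevention factor, set to 0.5 as in the text


def expected_currents(x):
    """Assumed single-source radial feeder: node j expects fault current
    if any zone at or downstream of node j is faulty (logical OR)."""
    return [1 if any(x[j:]) else 0 for j in range(len(x))]


def fitness(x, uploaded):
    """Assumed objective: current mismatch plus OMEGA per zone declared faulty."""
    expected = expected_currents(x)
    mismatch = sum(abs(i - e) for i, e in zip(uploaded, expected))
    return mismatch + OMEGA * sum(x)


# Hypothetical case: fault in the third of five zones, no distortion.
uploaded = [1, 1, 1, 0, 0]
best = min(product([0, 1], repeat=5), key=lambda x: fitness(x, uploaded))
print(best, fitness(best, uploaded))
```

In this toy case, the candidate `(1, 0, 1, 0, 0)` reproduces the uploaded currents just as well as the true fault vector, and only the $\omega$ term separates them, which is the ambiguity the factor is meant to resolve.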
Taking Figure 1 as an example, when a fault occurs in zone X5 in Region 2, X = [X1, X2, …, X16] = [0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0], and the switching function values for node 1 and node 2 in Region 1 can be obtained from Equation (2). Similarly, the switching function values of the nodes in Region 3, Region 4, and Region 5 are 0, 0, and −1, respectively, with all nodes within each of these regions sharing the same value. It can be seen that the switching function values of the nodes in the faulty region are not equal, while the switching function values of the nodes in a non-faulty region are equal to that of the node at the end of that region. To further validate this partitioning basis, assuming that different zone faults and multiple faults occur in Region 3 and that there is no information distortion, the switching function values of all the nodes in the non-faulty regions can be calculated, as shown in Table 1.

Northern Goshawk Optimization
Northern goshawk optimization [20] (NGO) was proposed in 2022 by Mohammad Dehghani. It simulates the behavior of the northern goshawk during hunting, which includes prey identification and attack, pursuit, and escape. In NGO, the hunting process is divided into two stages: the exploration stage (prey identification and attack) and the exploitation stage (pursuit and escape). The mathematical model of the NGO for the two hunting stages can be summarized as follows:
(1) Exploration stage. In the first stage of hunting, the northern goshawk randomly selects a prey and then quickly attacks it. The behavior of the northern goshawk in this stage is described by Equations (6)-(8), where $F_{P_i}$ is the fitness value of the prey selected by the $i$-th northern goshawk; $r$ is a random number in [0,1]; $I$ takes the value 1 or 2; and $r$ and $I$ are random numbers that introduce stochastic behavior into the NGO search and update steps.
(2) Exploitation stage. When the northern goshawk starts to capture the prey, the prey tries to escape at the same time. While pursuing the escaping prey, the northern goshawk moves extremely fast and can capture the prey at any time and in any place. Assuming that the hunt takes place within an attack position of radius $R$, the second stage is described by Equations (9) and (10) [20].

$$R = 0.05\left(1 - \frac{t}{T}\right) \tag{10}$$

where $t$ is the current number of iterations; $T$ is the maximum number of iterations; $X_i^{new,P2}$ is the new state of the $i$-th northern goshawk in the pursuit stage; $x_{i,j}^{new,P2}$ is the new state of the $i$-th northern goshawk in the $j$-th dimension in the pursuit stage; and $F_i^{new,P2}$ is the fitness value in the new state. It can be seen that the NGO achieves parameter optimization by searching for the optimal penalty parameter c and kernel parameter g of the diagnostic model system, which could improve the classification accuracy. However, the following limitations still exist: (1) During the initialization of the population, the initial solutions are distributed randomly and unevenly, and the quality of the individuals varies, which can easily lead to a lack of population diversity and cause the potential optimal solution to be missed. (2) During prey escape in the second stage, the northern goshawk chases the prey at extraordinary speed, which can easily cause the algorithm to fall into a local optimum [20].
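The two stages can be sketched in a few lines of Python. The update rules below follow the formulation commonly given for NGO in the literature (attack a randomly chosen prey, then search within a shrinking radius, accepting a move only if it improves fitness); they are written from that general formulation, not recovered from Equations (6)-(10), so they should be read as an assumption. The radius coefficient 0.05 is taken from Equation (10); the test function `sphere`, the bounds, and all parameter defaults are hypothetical.

```python
import random


def sphere(x):
    """Toy objective used only for illustration: sum of squares."""
    return sum(v * v for v in x)


def ngo_minimize(f, dim=5, pop=20, iters=100, lo=-10.0, hi=10.0, seed=0):
    rng = random.Random(seed)

    def clip(v):
        return max(lo, min(hi, v))

    X = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(pop)]
    F = [f(x) for x in X]
    for t in range(1, iters + 1):
        R = 0.05 * (1 - t / iters)  # attack radius shrinks with iterations
        for i in range(pop):
            # Stage 1 (exploration): select a random prey k != i and attack.
            k = rng.choice([j for j in range(pop) if j != i])
            new = []
            for j in range(dim):
                r, I = rng.random(), rng.choice([1, 2])
                if F[k] < F[i]:
                    v = X[i][j] + r * (X[k][j] - I * X[i][j])
                else:
                    v = X[i][j] + r * (X[i][j] - X[k][j])
                new.append(clip(v))
            f_new = f(new)
            if f_new < F[i]:  # greedy acceptance
                X[i], F[i] = new, f_new
            # Stage 2 (exploitation): local search within radius R.
            new = [clip(v + R * (2 * rng.random() - 1) * v) for v in X[i]]
            f_new = f(new)
            if f_new < F[i]:
                X[i], F[i] = new, f_new
    b = min(range(pop), key=lambda i: F[i])
    return X[b], F[b]


best_x, best_f = ngo_minimize(sphere)
print(best_f)
```

Because every move is accepted only when it improves fitness, the best value never gets worse from one iteration to the next; the same greedy acceptance is also what makes the plain NGO prone to stalling in a local optimum.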

Northern Goshawk Optimization for Binary
A zone state can only take the value 0 or 1, so the position of the northern goshawk needs to be represented in binary form. The position of the northern goshawk is updated through the binary conversion given in [20].
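One common way to obtain binary positions from a real-valued optimizer is a sigmoid transfer function followed by a random threshold. The sketch below uses that approach as an assumption; it is not necessarily the conversion used in [20], and the names `sigmoid` and `binarize` are hypothetical.

```python
import math
import random


def sigmoid(v):
    """Squash a real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def binarize(position, rng):
    """Map a real-valued position to zone states in {0, 1}.

    Each component becomes 1 with probability sigmoid(component).
    """
    return [1 if rng.random() < sigmoid(v) else 0 for v in position]


rng = random.Random(1)
bits = binarize([-50.0, -0.3, 0.0, 0.8, 50.0], rng)
print(bits)
```

Strongly negative components map almost surely to 0 (zone healthy) and strongly positive ones to 1 (zone faulty), while components near zero remain close to a coin flip.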

The Improvement of NGO
In order to improve the optimization performance of the NGO algorithm for fault location in ADNs, the following improvements were carried out: (1) When an individual northern goshawk chooses another search area, the decision is made based on the information available at the previous stage. If the northern goshawks in the population are all trapped in a localized search, they will not be able to capture the prey accurately during the global search. To compensate for this deficiency, a sinusoidal chaotic mapping was introduced to promote a more uniform distribution of the northern goshawk population in the search space, which alleviates the problem of "premature convergence" to a certain extent. Meanwhile, the global detection ability of the algorithm is further enhanced by the crossover operation and the non-uniform mutation operator. The crossover operation swaps the positions of individuals in the northern goshawk population and recalculates the fitness value; when the fitness value of a new position is better than that of the previous northern goshawk, the previous one is replaced, which increases the diversity of the population after the iteration. The non-uniform mutation operator perturbs the positions of the northern goshawks, which increases the diversity of the population and thereby enlarges the search range and improves the search accuracy of the algorithm. When the non-uniform mutation strategy perturbs the positions, k dimensions are randomly selected for each northern goshawk to be perturbed. Whenever a new individual generated by the perturbation is better than the previous individual, the previous individual is replaced.
where $X^{t+1}$ denotes the position of the northern goshawk at the (t + 1)-th iteration; $X^{t}$ denotes the position of the northern goshawk at the t-th iteration; T denotes the maximum number of iterations, for example, T = 100; r is the search range of the northern goshawk population; and b = 2 is a system parameter that determines the degree of non-uniformity.
(2) The sine-cosine algorithm (SCA) [21] was introduced to avoid trapping in a local optimum. The oscillatory behavior of the sine-cosine model acting on the positions maintains the diversity of individuals, which benefits the global search capability of INGO.
For the basic sine-cosine algorithm, the step search factor $r_1 = a - at/T$ ($a$ is a constant, with $a = 1$ in this paper; $t$ is the number of iterations) decreases linearly, which is not conducive to further balancing the global search and local development abilities of the NGO. Thus, a new non-linearly decreasing search factor was defined, as shown in Equation (15). In the early stage, it has a larger weight and decreases more slowly, which improves the global optimization ability; in the later stage, it has a smaller weight and decreases more quickly, which strengthens the algorithm's local development and accelerates convergence to the optimal solution.
In Equation (15), the updated step search factor depends on the maximum number of iterations, the current number of iterations, and an adjustment factor, which is set to 1.
(3) The NGO algorithm is easily trapped in a local optimum during the later iterations, so the adaptive Gaussian-Cauchy hybrid mutation perturbation strategy [22] was introduced to enhance the algorithm's local exploitation and global search abilities, improving the probability of obtaining the optimal prey location. Since the result of the mutation perturbation is random, applying it to all individuals would inevitably increase the complexity of the algorithm. Thus, in this paper, the mutation perturbation is applied only to the optimal individual; the positions before and after the mutation are then compared, and the better one enters the next iteration. To increase the diversity of the individuals and expand the search range of the population, Equation (14) was defined.
The quantities in Equation (14) are the optimal position of individual X in the t-th iteration, the same position after the mixed Gaussian-Cauchy perturbation, the Gaussian mutation operator, and the Cauchy mutation operator. The two weight coefficients are $t/T_{\max}$ and $1 - t/T_{\max}$, which change linearly with the iteration number to keep the iteration balanced and smooth. As the algorithm iterates, the positions of most northern goshawk individuals no longer change much. In this situation, the Gaussian distribution term is used to perturb the population, which helps the algorithm jump out of the local optimum and, at the same time, overcome the interdimensional interference problem in high-dimensional space.
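Two of the improvements above can be sketched as follows, under clearly stated assumptions. For the chaotic initialization, the sine map $x \leftarrow \sin(\pi x)$ is assumed, since the exact map is not given in the text. For the hybrid perturbation, the best position is assumed to be scaled by $1 + \lambda_1 \cdot \mathrm{Gauss}(0,1) + \lambda_2 \cdot \mathrm{Cauchy}(0,1)$ with $\lambda_1 = t/T_{\max}$ and $\lambda_2 = 1 - t/T_{\max}$; which weight multiplies which operator is also an assumption. All function names are hypothetical.

```python
import math
import random


def sine_map_population(pop, dim, lo, hi, rng):
    """Chaotic initialization with the sine map x <- sin(pi * x) (assumed form)."""
    x = rng.uniform(0.1, 0.9)  # chaotic seed inside (0, 1)
    population = []
    for _ in range(pop):
        individual = []
        for _ in range(dim):
            x = math.sin(math.pi * x)           # stays within [0, 1]
            individual.append(lo + x * (hi - lo))
        population.append(individual)
    return population


def gauss_cauchy_perturb(best, f, t, t_max, rng):
    """Perturb only the best individual and keep the better of the two."""
    lam1 = t / t_max        # weight on the Gaussian term (pairing assumed)
    lam2 = 1 - t / t_max    # weight on the Cauchy term (pairing assumed)
    candidate = []
    for v in best:
        g = rng.gauss(0.0, 1.0)
        c = math.tan(math.pi * (rng.random() - 0.5))  # standard Cauchy sample
        candidate.append(v * (1 + lam1 * g + lam2 * c))
    return candidate if f(candidate) < f(best) else best


def sum_of_squares(x):
    return sum(v * v for v in x)


rng = random.Random(0)
population = sine_map_population(pop=4, dim=3, lo=-1.0, hi=1.0, rng=rng)
best = min(population, key=sum_of_squares)
improved = gauss_cauchy_perturb(best, sum_of_squares, t=5, t_max=10, rng=rng)
print(sum_of_squares(best), sum_of_squares(improved))
```

With this pairing, the heavy-tailed Cauchy term dominates early iterations and favors large jumps, while the Gaussian term dominates late iterations and favors fine local moves; the greedy comparison guarantees that the perturbation never makes the best individual worse.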

Fault Location Process
The flow chart of fault location based on the improved northern goshawk algorithm is shown in Figure 2, and the specific steps can be summarized as follows:
(1) Read the fault current status information of sectional switches, contact switches, circuit breakers, and other components detected by the FTU and upload it to the SCADA system of the master station. Based on this information, the actual fault current arrays of the switching nodes are generated according to the number of nodes.

Case Study Analysis
To verify the validity of the method proposed in this paper, a mathematical model of the IEEE 33-node ADN structure was built on the Matlab platform, as shown in Figure 3. X1-X33 represent the 33 feeder segments, 1-33 represent the 33 switching nodes, and K1-K3 represent the access switches of the distributed power sources. The access of distributed power supplies increases the complexity of fault location. During the simulation experiments, the distributed power supplies were connected to the ADN at random nodes, and the number of distributed power supplies was varied. Note that the data used by the fault diagnosis algorithms in this paper came from a local data center and control center in Jilin Province, China.

Simulation Test Analysis
Assume that a fault occurred in zone 7 in Region 6 in Figure 3, that all three distributed power sources were in operation, and that the fault current information was not distorted. Firstly, the fault current information uploaded by the FTU can be calculated according to Equation 0,0,0,0], and then the current information of the end nodes in each region can be extracted as [1,1,−1,0,−1,0,−1,−1,−1,−1,0]. Using the improved northern goshawk algorithm to search for faulty regions gives the result [0,0,0,0,0,0,1,0,0,0,0,0,0,0,0], from which it can be deduced that a fault occurred in one of the zones in Region 6. Then, the INGO algorithm was applied, and the fitness versus the number of iterations is shown in Figure 4. The localization result was obtained as [1,0,0,0], i.e., a fault in zone 7, which is consistent with the assumption. To better illustrate the fault tolerance of the proposed method, information distortion was further added to the faulty nodes. It is assumed that faults occurred in zone 9, zone 12, and zone 22 at the same time and that the state of node to 0, the state of node 18 changed from −1 to 0, and the state of node 32 changed from 0 to −1.
The convergence curve of fault location when node information was distorted is shown in Figure 5. Meanwhile, the state matrix [0,0,1,0,0,1,1,1,0,0,0,0,0] was obtained, which means that faults occurred in Region 3, Region 6, and Region 7, respectively. On this basis, the state values of the zones in Region 3, Region 6, and Region 7 can be further calculated, and the results [0,1], respectively. It can be reasonably deduced that faults occurred in zone 9, zone 12, and zone 22, which agrees with the assumed fault positions. From Figure 4, it can be seen that the convergence curve is a straight line after the first iteration, which means that the proposed method finds the optimal solution at the beginning of the iteration and then remains stable. Similarly, Figure 5 shows that the convergence curve is a straight line after the third iteration, indicating that the proposed method finds the optimal solution after two iterations. It can therefore be concluded that the INGO zonal localization model proposed in this paper accurately locates the faults in ADNs within the maximum number of iterations, both for the preset single fault without information distortion and for the triple faults with information distortion. Moreover, the method also has a clear advantage in terms of convergence speed.

Performance Comparison with Other Typical Algorithms
In this paper, the northern goshawk optimization (NGO) algorithm, the gray wolf optimization (GWO) algorithm, and the whale optimization algorithm (WOA) were chosen for the comparative experiment. Single-point and multi-point fault comparison simulations were conducted with different numbers and locations of distributed power sources connected to the distribution network. The positioning accuracy rate and the average number of generations to convergence were taken as the performance evaluation indexes. Since all of the above algorithms are stochastic optimization algorithms, each algorithm was run 20 times in the experiment, and the average values of the performance evaluation indexes were then calculated.
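The two evaluation indexes can be computed from repeated runs as sketched below. The run records are made-up placeholders used only to show the arithmetic; they are not results from this paper, and the function name `summarize_runs` is hypothetical.

```python
def summarize_runs(runs, true_faults):
    """Return (accuracy rate, average generations to convergence).

    `runs` is a list of (located_zones, convergence_generation) tuples,
    one per independent run of a stochastic algorithm.
    """
    hits = sum(1 for located, _ in runs if set(located) == set(true_faults))
    accuracy = hits / len(runs)
    mean_generations = sum(gen for _, gen in runs) / len(runs)
    return accuracy, mean_generations


# Placeholder data: four runs for a fault assumed to be in zone 9.
runs = [([9], 3), ([9], 5), ([8], 12), ([9], 4)]
accuracy, mean_generations = summarize_runs(runs, true_faults=[9])
print(accuracy, mean_generations)
```

A run counts as correct only when the located set of zones matches the true set exactly, so a partially correct multi-point result is treated as a miss under this definition.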

Multiple Points Containing Distortion Faults
Because the operating environment of the distribution network is complex and uncontrollable, the FTU equipment nodes are often exposed to harsh conditions, which may lead to data loss and data distortion when the detection equipment transmits fault current information. When a fault occurs in an actual active distribution network, the FTU device at a node may be unable to upload the corresponding fault information because of the fault, and false alarms, omissions, and misreporting may occur. In the simulation experiments, under both single-point and multi-point faults, distortion points were set in the fault current information uploaded by the FTUs, and the fault tolerance was analyzed for the distribution network containing distributed power supplies. For example, when [K1,K2,K3] = [0,0,0], a fault occurred in zone X11, and the single-point distortion position was node 8, the algorithm iteration comparison curves were obtained as shown in Figure 8a. Similarly, when [K1,K2,K3] = [1,0,0], a fault occurred in zone X9, and the multi-point distortion positions were node 6 and node 12, the algorithm iteration comparison curves were obtained as shown in Figure 8b; when [K1,K2,K3] = [0,1,1], faults occurred in zones X15 and X27, and the multi-point distortion positions were node 3 and node 33, the curves were obtained as shown in Figure 8c; when [K1,K2,K3] = [1,1,1], faults occurred in zones X13 and X25, and the multi-point distortion positions were node 5 and node 23, the curves were obtained as shown in Figure 8d. From Figures 6-8, it can be seen that the INGO algorithm has an obvious advantage over NGO, GWO, and WOA in terms of convergence speed, and INGO finds the optimal solution in about five iterations. Though the algorithms of WOA, SCA, and FFOA are able to search for the optimal solution in some cases, owing to their weak global optimization ability, average numbers of iterations of 6, 24, and 75 were needed, respectively, before finding the optimal solution. To rule out chance results, INGO, NGO, GWO, and WOA were each run another 100 times. The accuracy and average number of iterations of each algorithm were measured in different cases, and the results are shown in Table 4. Table 4 shows that as the complexity of the fault type increases, the average number of iterations of NGO, GWO, and WOA for fault location increases while the accuracy rate decreases. Among these three algorithms, NGO performed best in terms of the average number of iterations and the solution accuracy. Compared to these three algorithms, INGO reduces the dimensionality and improves both the computing speed and the accuracy of the results. For example, compared with NGO, INGO achieves an average increase of 4.5% in solution accuracy and an improvement of 31.9% in computing speed, which indicates considerable engineering application value.

(2) Initialize the INGO parameters, such as the population size, population dimensions (i.e., total number of nodes), variable range values, and maximum number of iterations.
(3) Initialize the binary northern goshawk population, in which each individual represents a set of fault operating states of the feeder line segments [23-26].
(4) Calculate the fitness value and update the positions of the northern goshawk individuals.
(5) Introduce the adaptive Gaussian-Cauchy hybrid mutation perturbation strategy and the sine-cosine strategy. Then, determine whether the maximum number of iterations has been reached and, if not, return to step (4).

(6) Determine the fault region and generate initialized individuals by exhaustive enumeration, then calculate the fitness value.
(7) Determine the fault zone, then check whether the fault region and the fault zone match. If they match, the fault region is determined, and the process ends; otherwise, return to step (6).

Figure 4. Convergence curve of fault location without distortion of information.

Figure 5. Convergence curve of fault location when node information was distorted.

Table 1. Switching function values for each node under different fault conditions.

Table 2. Single-point fault simulation example.

Table 3. Multi-point fault simulation example.

Table 4. Localization results of different algorithms.