RSSI-Power-Based Direction of Arrival Estimation of Partial Discharges in Substations

Abstract: The localization of partial discharges in air-insulated substations using ultra-high frequency technology is widely studied for power equipment early warning purposes. Ultra-high frequency partial discharge localization systems are usually based on the time difference of electromagnetic wave signals. However, the large size of the test equipment and the need for a high sampling rate and high time synchronization accuracy limit their practical application. To address this challenge, this paper proposes a power-based partial discharge direction of arrival method using a received signal strength indicator from an ultra-high frequency wireless sensor array. Furthermore, the Gaussian mixture model is used for noise suppression, and the Gaussian process classifier is used to identify line of sight received signal strength indicator data. Laboratory tests are performed, and the results show that the average direction of arrival error is less than 5°. The results verify the effectiveness of the proposed partial discharge localization system.


Introduction
Partial discharges (PDs) are one of the main causes of insulation deterioration and equipment breakdown in air-insulated substations (AIS) [1,2]. Thus, the monitoring and localization of partial discharges is an important topic in power systems. Much research has been done on the use of PD detection for condition monitoring and failure diagnosis of power equipment [3–6]. The authors in [2,7,8] used electromagnetic waves to locate PD sources and used the information for an early warning system in AIS. The localization methods are usually based on time of arrival (ToA) or time difference of arrival (TDoA) [9–11]. Some researchers have adopted array processing technology, e.g., multiple signal classification (MUSIC) [12], for better accuracy. However, these methods are mainly based on time-frequency analysis and therefore bear a high cost, because they require time synchronization with nanosecond accuracy and sampling rates of several gigahertz. Recently, some researchers have used received signal strength indicator (RSSI) methods to locate PD sources at lower cost [13–15]. The standard RSS and RSSI localization techniques are usually divided into two stages: an offline stage that collects RSSI values in the detection area, and an online stage that estimates the PD location by a scenario analysis method. The offline stage is needed to build a fingerprint map, so these techniques involve a heavier workload and are less feasible for practical application [16].
This paper presents a simple method for PD direction of arrival (DoA) estimation using ultra-high frequency (UHF) RSSI measurements. Firstly, a circular array composed of UHF wireless sensors is designed and used for PD DoA estimation. Then, the Gaussian mixture model (GMM) and the Gaussian process classifier (GPC) are used for noise reduction and line of sight (LOS) signal identification, respectively. Finally, the radiation characteristics of the designed UHF antenna are used to estimate the DoA by searching for the minimal RSSI value. To verify the proposed algorithm, a test is carried out in a high voltage (HV) test laboratory. The main advantage of the proposed method is that it achieves an accuracy similar to that of ToA or TDoA methods at a low hardware cost.
The rest of the paper is organized as follows: Section 2 introduces the hardware of the sensor and the principle of the RSSI power-based DoA estimation. Section 3 describes the process of the DoA estimation in detail. Section 4 presents the experimental results and Section 5 concludes the paper.

Wireless UHF Sensor
The UHF antenna we used is a printed circuit board (PCB) elliptical dipole antenna with double feed. A diagram and picture of the elliptical dipole antenna are shown in Figure 1a,b. The antenna's 2-D radiation pattern at 1100 MHz is shown in Figure 1c. Figure 1c shows that the antenna has its maximum sensitivity at 0° and its minimum at 90° and −90°. Furthermore, the sensitivity at 180° is somewhat increased. Considering the electromagnetic shielding effect of the metal framework, the sensor receives its maximum signal when the PD is in front of the sensor and its minimal signal when the PD is behind the sensor. For the designed wireless UHF sensors, the signal bandwidth is 300-1500 MHz and the A/D sampling rate is 2.7 MHz. Firstly, the UHF electromagnetic wave is received by the UHF antenna. Then, the envelope detection waveform is obtained after signal conditioning by a bandpass filter, amplifier and detector. Finally, the digital data are generated by A/D sampling and transmitted to a computer through a Wi-Fi module controlled by an MCU. The peak value of the envelope detection waveform is taken as the signal strength. The diagram and picture of the wireless UHF sensor are shown in Figure 2.

Preliminary Test of Radiation Pattern
The proposed DoA algorithm is based on the characteristic radiation pattern of the designed UHF sensor; thus, a preliminary test with four sensors was performed. The four sensors in the array are labeled S1, S2, S3 and S4. The azimuth angle difference between every two adjacent sensors is 90°, and the angles of the sensors in the array are taken as 0°, 90°, 180° and 270° for convenience. When a PD event occurs, the DoA estimation result is expressed as the angle difference between the DoA and the first sensor S1. In the preliminary test, the angle at which the PD occurs is varied from 0° to 360° at intervals of 30°. The preliminary test result is shown in Figure 3. The results in Figure 3 show that, in an ideal situation, each sensor receives its minimum value when a PD event occurs behind the sensor. The maximum value is received almost in front of the sensor, although with a small deviation from the expected angle. Apart from the maximum and minimum values, the received value of each sensor fluctuates randomly at other angles. Therefore, this paper searches for the minimal signal of the sensor array and uses it to find the direction of arrival of the PD. The angle of the direction of arrival is obtained by adding or subtracting 180° from the angle of the minimal signal.
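The minimum-search rule can be illustrated with a minimal sketch. The function name `coarse_doa` and the RSSI readings are hypothetical and chosen only for illustration; the sensor angles follow the four-sensor layout described above.

```python
def coarse_doa(rssi, sensor_angles_deg):
    """Coarse DoA in [0, 360): the direction opposite the weakest sensor.

    rssi and sensor_angles_deg hold one entry per sensor.
    """
    if len(rssi) != len(sensor_angles_deg):
        raise ValueError("need exactly one RSSI value per sensor angle")
    # Index of the sensor that received the minimal signal.
    weakest = min(range(len(rssi)), key=lambda i: rssi[i])
    # The PD is assumed to lie behind that sensor, i.e. 180 degrees away.
    return (sensor_angles_deg[weakest] + 180.0) % 360.0


# Hypothetical readings for S1..S4 (arbitrary units); S1 is the weakest.
angles = [0.0, 90.0, 180.0, 270.0]
readings = [0.42, 0.55, 0.91, 0.60]
print(coarse_doa(readings, angles))  # 180.0
```

With only four sensors the estimate can take just four values, which is why the array is later enlarged and the RSSI curve interpolated.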

LOS and NLOS Consideration
Due to the complex environment of substations, e.g., the large number of reflective surfaces of the installed power equipment, PD UHF signals are severely affected by shadowing and multipath effects. This situation is referred to as the non-line of sight (NLOS) condition and has a strong impact on the received RSSI data. Accordingly, the unobstructed scenario is called the line of sight (LOS) condition.
In order to investigate the influence of NLOS conditions on the received RSSI data, we performed experiments to collect UHF RSSI data with the wireless UHF sensors in a laboratory environment that contains many pieces of power equipment. We measured and collected the RSSI values along 20 m in both unobstructed space and the laboratory environment. The experimental process is as follows: the wireless UHF sensor is placed at a fixed point; the distance between the simulated PD source and the sensor is gradually changed along a straight line; an air discharge pulse according to EN/IEC 61000-4-2 is generated by a PD simulator named 'EM TEST DITO' [17]; and finally, the raw RSSI measurements at each position are collected. The parameters of the PD simulator are set as follows: an inception voltage of 2 kV, a rise time of 0.7 ns and a spectrum of 0-1.5 GHz.
We collected a total of 1540 data points under LOS conditions, reported in Figure 4a. Similarly, a total of 1540 data points were collected under NLOS conditions and are reported in Figure 4b. We can see that NLOS conditions make the signal amplitude more dispersed, and the attenuation trend is clearly different from that under LOS conditions. Therefore, NLOS conditions can seriously affect the accuracy of the signal amplitude measurement, and thus the accuracy of the PD source DoA.

Gaussian Mixture Model for Noise Suppression
According to the signal attenuation model [18], the received signal strength indicator at the receiving wireless sensor can be calculated as:

$$P(d) = P(d_0) - 10\gamma \log_{10}\left(\frac{d}{d_0}\right) + X_\sigma$$

where $d$ is the distance between the PD source and the sensor, $d_0$ is the reference distance, $P(d_0)$ is the corresponding RSSI, and $\gamma$ is the attenuation coefficient, which is usually known. $X_\sigma$ follows a Gaussian distribution with mean value 0 and variance $\sigma^2$, so the RSSI of each sensor can be regarded as Gaussian distributed. However, in a real situation, the received dataset is mixed with noisy data. GMM is used to pick out the main Gaussian distribution and remove the noisy data [19]. GMM can be applied to divide the dataset or vectors into $M$ Gaussian functions. The Gaussian mixture density is a weighted sum of $M$ component densities:

$$p(x \mid \lambda) = \sum_{i=1}^{M} p_i \, b_i(x)$$

where $x$ is a $D$-dimensional vector and $p_i$ is the mixture weight, which satisfies the constraint $\sum_{i=1}^{M} p_i = 1$. The density function $b_i(x)$ is written as:

$$b_i(x) = \frac{1}{(2\pi)^{D/2} |\Sigma_i|^{1/2}} \exp\left(-\frac{1}{2}(x - \mu_i)^{T} \Sigma_i^{-1} (x - \mu_i)\right)$$

where $\mu_i$ is the mean vector and $\Sigma_i$ is the covariance matrix, so the Gaussian mixture density is parameterized by the mean vectors, covariance matrices and mixture weights. These parameters are collectively represented by the notation:

$$\lambda = \{p_i, \mu_i, \Sigma_i\}, \quad i = 1, 2, \ldots, M$$

In this paper, diagonal covariance matrices are used, and the parameters of the GMM are estimated by maximum likelihood (ML) estimation, the most popular method for this purpose. For a set of $T$ training vectors $X = \{x_1, \ldots, x_T\}$, the GMM likelihood can be written as:

$$p(X \mid \lambda) = \prod_{t=1}^{T} p(x_t \mid \lambda)$$

The purpose of ML estimation is to find the target parameters that maximize the likelihood of the GMM:

$$\hat{\lambda} = \arg\max_{\lambda} \, p(X \mid \lambda)$$

The target parameters are obtained iteratively by the expectation maximization (EM) algorithm rather than by solving this nonlinear function directly. The EM algorithm begins with an initial parameter $\lambda$ and estimates a new parameter $\bar{\lambda}$ that meets the condition:

$$p(X \mid \bar{\lambda}) \geq p(X \mid \lambda)$$

Then the new parameter becomes the initial parameter for the next iteration, and the process is repeated until a convergence threshold is reached. After the above procedure, the main Gaussian distribution is obtained for the next step.
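As an illustration of this noise suppression step, the following minimal sketch fits a one-dimensional, two-component GMM with EM and keeps only the samples assigned to the component with the largest weight. It is a simplified stand-in, not the implementation used in the paper: the function names, the initialization, the number of components and the synthetic RSSI values are all assumptions made for this example.

```python
import math
import random


def gauss_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(data, n_components=2, n_iter=200, tol=1e-8):
    """Fit a 1-D Gaussian mixture by EM; returns (weights, means, variances)."""
    n = len(data)
    lo, hi = min(data), max(data)
    # Assumed initialization: means spread evenly over the data range.
    means = [lo + (k + 0.5) * (hi - lo) / n_components for k in range(n_components)]
    centre = sum(data) / n
    var0 = sum((x - centre) ** 2 for x in data) / n or 1e-6
    variances = [var0] * n_components
    weights = [1.0 / n_components] * n_components
    prev_ll = -math.inf
    for _ in range(n_iter):
        # E-step: responsibility of each component for each sample.
        resp, ll = [], 0.0
        for x in data:
            joint = [w * gauss_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)]
            total = sum(joint) or 1e-300
            ll += math.log(total)
            resp.append([j / total for j in joint])
        # M-step: re-estimate weights, means and variances.
        for k in range(n_components):
            nk = sum(r[k] for r in resp) or 1e-300
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var_k = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(var_k, 1e-6)  # floor avoids a collapsed component
        # Stop once the log-likelihood no longer changes noticeably.
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    return weights, means, variances


def keep_main_component(data, weights, means, variances):
    """Keep the samples whose most likely component is the heaviest one."""
    main = max(range(len(weights)), key=lambda k: weights[k])
    kept = []
    for x in data:
        joint = [w * gauss_pdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
        if max(range(len(joint)), key=lambda k: joint[k]) == main:
            kept.append(x)
    return kept


# Synthetic RSSI samples: a main cluster plus a smaller noisy cluster.
random.seed(0)
samples = ([random.gauss(-40.0, 1.0) for _ in range(300)]
           + [random.gauss(-55.0, 2.0) for _ in range(60)])
w, m, v = fit_gmm_1d(samples)
clean = keep_main_component(samples, w, m, v)
print(len(samples), len(clean))
```

Selecting the component with the largest mixture weight is one simple way to define the "main" Gaussian; other criteria, such as the component with the highest mean, are equally possible.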

Calibration and Normalization
In the selected dataset ψ': Each column vector holds the RSSI data of one sensor, and each row vector holds the RSSI data of all twelve sensors for one PD event. However, there are minor differences between the twelve sensors even when they face the same PD. Therefore, a calibration is carried out before the PD test by selecting one sensor as the base; in this paper, sensor S1 is used as the base. In the same situation, the twelve sensors' RSSI data for the same PD can be written as [F 1 , F 2 , . . . , F 12 ].
The calibration coefficient f i for each sensor relative to S1 is written as: The calibration for the selected dataset is calculated by: After calibration, each entry in the new matrix ψ is updated by normalization within each row vector: The calibration and normalization effectively reduce the DoA estimation error, by approximately 15 percent.
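A minimal sketch of the calibration and normalization steps is given below. The exact formulas are assumptions chosen for illustration: the coefficient is taken as the ratio of the base sensor's reading to each sensor's reading for the same PD, and each row is normalized to the range [0, 1] by min-max scaling. The function names are hypothetical.

```python
def calibration_coefficients(reference_readings, base_index=0):
    """Assumed form f_i = F_base / F_i, with S1 (index 0) as the base sensor."""
    base = reference_readings[base_index]
    return [base / f for f in reference_readings]


def calibrate(dataset, coeffs):
    """Scale every column (one sensor) by that sensor's coefficient."""
    return [[value * c for value, c in zip(row, coeffs)] for row in dataset]


def normalize_rows(dataset):
    """Assumed min-max normalization applied independently to each row."""
    normalized = []
    for row in dataset:
        lo, hi = min(row), max(row)
        span = hi - lo
        normalized.append([(v - lo) / span if span else 0.0 for v in row])
    return normalized


# Three sensors only, to keep the example short; values are made up.
reference = [2.0, 4.0, 1.0]            # readings for the same PD
coeffs = calibration_coefficients(reference)
print(coeffs)                           # [1.0, 0.5, 2.0]
print(calibrate([reference], coeffs))   # [[2.0, 2.0, 2.0]]
print(normalize_rows([[1.0, 3.0, 2.0]]))  # [[0.0, 1.0, 0.5]]
```

After calibration, the readings taken for the same PD become equal across sensors, which is the intended effect of referring every sensor to S1.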

LOS Identification by GPC
As addressed in Section 2.3, NLOS conditions affect the RSSI data significantly. Thus, a GPC is designed to determine the decision region of LOS/NLOS conditions. LOS/NLOS separation can be abstracted as a supervised classification problem of learning an input-output relationship from a training dataset. A machine learning technique named Gaussian process classification [20] provides a nonparametric Bayesian approach to learning the dependencies between the points of a dataset.
Supposing there are d sensors used for UHF monitoring and n samples obtained. Then, we define the training set D (x i , y i ) i = 1, 2, . . . n which contains a d-dimensional vector x and a d-dimensional class label vector y, y Firstly, the latent predictive distribution of training set is computed by: where X = {x i , i = 1, 2, . . . n} is a d × n matrix of aggregated input vectors x, and Y = y i , i = 1, 2, . . . n is also a d × n matrix of aggregated class label vector y, ε is an independent Gaussian random variable and ε ∼ N(0, σ 2 n ). In GPC, assuming that any combination of random variables obeys a Gaussian distribution with an average of 0, thus the prior distribution of Y is: where m = g(X, X) is a n × n symmetric positive definite covariance matrix, and the covariance function g is [21]: where p, σ 2 0 are hyper-parameters that can be optimized by maximizing the likelihood function log p(y X, (p, δ 2 )) .
Then, the joint prior distribution of training set D = {X, Y} and testing set D * = {X * , Y * } is: The mean value and variance of Y * are predicted by GPC: The approximate inference in this paper is done using the open source Gaussian process (GP) library in [20]. The decision regions for the training set in Figure 4 are learned by GPC and reported in Figure 5, where the higher probabilities correspond to LOS observations. We can see that the GPC differentiates effectively between LOS and NLOS conditions.

Interpolation
Each row vector in matrix ψ is a sample for DoA estimation. For each row vector, interpolation is used to find the DoA estimation result. Without interpolation, the DoA result is restricted to one of the twelve angles at which the sensors are placed, which means the resolution is 30°, as shown in Figure 6. The interpolation algorithm improves this resolution. First, the twelve sensors' data are assigned to the twelve sensor angles. In the Cartesian coordinate system, the twelve RSSI data are marked with twelve dots, and the interpolation algorithm connects adjacent points with a specific curve. In order to complete the curve over the angle range [330°, 360°], the first sensor's data are reused as the data of a thirteenth point, since the first sensor is adjacent not only to the second sensor but also to the twelfth sensor. In this paper, cubic spline interpolation and polynomial interpolation are each applied to the thirteen points. For the cubic spline interpolation, the curve function between two adjacent points $(x_j, y_j)$ and $(x_{j+1}, y_{j+1})$ can be written as:

$$S_j(x) = a_j + b_j (x - x_j) + c_j (x - x_j)^2 + d_j (x - x_j)^3$$

where $x \in (x_j, x_{j+1})$, $j = 1, 2, \ldots, 12$, and $a_j$, $b_j$, $c_j$, $d_j$ are the unknown coefficients. In order to solve the equation set, smoothness conditions and boundary conditions also need to be considered. The smoothness conditions at the interior points can be written as:

$$S_j(x_{j+1}) = S_{j+1}(x_{j+1}), \quad S'_j(x_{j+1}) = S'_{j+1}(x_{j+1}), \quad S''_j(x_{j+1}) = S''_{j+1}(x_{j+1}), \quad j = 1, 2, \ldots, 11$$

For the boundary conditions, we choose natural boundary conditions, which are written as:

$$S''_1(x_1) = 0, \quad S''_{12}(x_{13}) = 0$$

For the polynomial interpolation, the curve through the thirteen points can be written as:

$$y = a_0 + a_1 x + a_2 x^2 + \cdots + a_p x^p$$

where $a_0, a_1, \ldots, a_p$ are unknown coefficients. The least-squares method is used to solve for the coefficients of this high-order polynomial.
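The cubic spline step can be sketched as follows. This is an illustrative implementation under stated assumptions, not the authors' code: it builds a natural cubic spline through the thirteen points with the standard tridiagonal algorithm, finds the lowest point by a dense grid search (the 0.1° step is an arbitrary choice), and adds 180° to obtain the DoA. The function names and the synthetic cosine-shaped row are hypothetical.

```python
import math


def natural_cubic_spline(xs, ys):
    """Coefficients (a, b, c, d) per interval of a natural cubic spline.

    On [xs[j], xs[j+1]]: S_j(x) = a + b*t + c*t**2 + d*t**3, t = x - xs[j].
    """
    n = len(xs) - 1
    h = [xs[j + 1] - xs[j] for j in range(n)]
    alpha = [0.0] * (n + 1)
    for j in range(1, n):
        alpha[j] = (3.0 * (ys[j + 1] - ys[j]) / h[j]
                    - 3.0 * (ys[j] - ys[j - 1]) / h[j - 1])
    # Forward sweep of the tridiagonal system for the c coefficients.
    diag, mu, z = [1.0] * (n + 1), [0.0] * (n + 1), [0.0] * (n + 1)
    for j in range(1, n):
        diag[j] = 2.0 * (xs[j + 1] - xs[j - 1]) - h[j - 1] * mu[j - 1]
        mu[j] = h[j] / diag[j]
        z[j] = (alpha[j] - h[j - 1] * z[j - 1]) / diag[j]
    # Back substitution; c[0] = c[n] = 0 is the natural boundary condition.
    c = [0.0] * (n + 1)
    b, d = [0.0] * n, [0.0] * n
    for j in range(n - 1, -1, -1):
        c[j] = z[j] - mu[j] * c[j + 1]
        b[j] = (ys[j + 1] - ys[j]) / h[j] - h[j] * (c[j + 1] + 2.0 * c[j]) / 3.0
        d[j] = (c[j + 1] - c[j]) / (3.0 * h[j])
    return [(ys[j], b[j], c[j], d[j]) for j in range(n)]


def spline_minimum(xs, ys, step=0.1):
    """Grid search for the lowest point of the spline; returns (x, y)."""
    best_x, best_y = xs[0], ys[0]
    for j, (a, b, c, d) in enumerate(natural_cubic_spline(xs, ys)):
        n_steps = int(round((xs[j + 1] - xs[j]) / step))
        for k in range(n_steps + 1):
            t = k * step
            y = a + b * t + c * t * t + d * t ** 3
            if y < best_y:
                best_x, best_y = xs[j] + t, y
    return best_x, best_y


def doa_from_row(row, spacing_deg=30.0):
    """DoA from one row of normalized RSSI values (one per sensor)."""
    xs = [i * spacing_deg for i in range(len(row) + 1)]
    ys = list(row) + [row[0]]  # the first sensor doubles as the 13th point
    angle_min, _ = spline_minimum(xs, ys)
    return (angle_min + 180.0) % 360.0


# Synthetic row for twelve sensors with its true minimum placed at 57 degrees.
row = [1.0 - math.cos(math.radians(30.0 * i - 57.0)) for i in range(12)]
print(round(doa_from_row(row), 1))
```

The grid search keeps the sketch short; the minimum could also be located analytically per interval by solving the quadratic S'_j(x) = 0.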

Framework of RSSI-Power-Based DoA of PD
Based on the preliminary test and the preprocessing methods, the final idea for DoA estimation is to find the minimal signal in the received data. The number of sensors in the array is increased from four to twelve in order to improve the accuracy of the DoA estimation result. The diagram is presented in Figure 6. The twelve sensors S1, S2, . . . , S12 are placed in a circular frame, and the interval between two adjacent sensors is 30°, so the resolution of the sensor array is 30°. Several steps are used to improve the accuracy of the proposed method, and the corresponding flow chart is shown in Figure 7. First, GMM is used to filter out noisy data to reduce the DoA error. Second, GPC is adopted to identify LOS RSSI data for better DoA accuracy. Then, data interpolation is used to find the lowest point of the curve and improve the resolution. The angle of the lowest point, shifted by 180° as described in the preliminary test, gives the DoA of the PD.

Experimental Verification
Experiments were performed in an HV laboratory to verify the effectiveness of the proposed DoA estimation method. The twelve sensors were placed in order in a circular frame with twelve holes, one sensor per hole. The frame was mounted on a tripod. The setup is shown in Figure 8. The PD source was again the EM TEST DITO. In the experiments, the azimuth angles of the PD were set to 240° and 360° (or 0°), and the distances between the PD simulator and the sensor array were 5 m and 10 m.
In the first situation, the azimuth angle and the distance were 240° and 5 m, respectively. The results after applying the GMM, GPC and interpolation algorithms are shown in Figures 9 and 10. The thirteen RSSI data are plotted as thirteen black dots, and the blue vertical line marks the lowest point of the cubic spline interpolation. In Figure 9, the angle of the lowest point is 57°, and the DoA estimation result is 237°. The polynomial interpolation of the RSSI data is presented in Figure 10. The angle of the lowest point is 50°, so the DoA estimation result is 230° for polynomial interpolation.
In the second situation, the distance was increased to 10 m and the azimuth angle was kept at 240°. The DoA estimation results are shown in Figures 11 and 12. Using cubic spline interpolation, the angle of the lowest point in Figure 11 is 67° and the DoA estimation result is 247°. Using polynomial interpolation, the angle of the lowest point in Figure 12 is 64° and the DoA estimation result is 244°.
The details of the results for all three situations can be found in Table 1. In the first setup, the DoA estimation results for cubic spline interpolation and polynomial interpolation are 240.6° and 229.0°, respectively. The errors are 0.6° and 11.0°. Therefore, at a distance of 5 m, the method using cubic spline interpolation locates the PD source accurately and performs better than polynomial interpolation. In the second setup, the DoA estimation results for cubic spline interpolation and polynomial interpolation are 235.4° and 230.0°. The errors are 4.6° and 10.0°, respectively. The error for cubic spline interpolation becomes greater while the error for polynomial interpolation does not change significantly, and the error of the polynomial interpolation is again greater than that of the cubic spline interpolation.
In the third setup, the DoA estimation results for cubic spline interpolation and polynomial interpolation are 2.7° and 10.0°. The errors are 2.7° and 10.0°, respectively. Overall, the DoA estimation error of cubic spline interpolation is less than 6°, while the DoA estimation error of polynomial interpolation is approximately 10°. Moreover, using GPC can improve the localization accuracy further.
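Because azimuth angles wrap around at 360°, the errors quoted above are differences on a circle. The following sketch shows one way to compute them; the helper name `angular_error` is hypothetical, and the estimates are the values quoted in the text for the three setups.

```python
def angular_error(estimate_deg, true_deg):
    """Smallest absolute difference between two azimuth angles, in degrees."""
    diff = (estimate_deg - true_deg) % 360.0
    return min(diff, 360.0 - diff)


# (true angle, cubic spline estimate, polynomial estimate) per setup.
setups = [(240.0, 240.6, 229.0), (240.0, 235.4, 230.0), (360.0, 2.7, 10.0)]
spline_errors = [angular_error(s, t) for t, s, _ in setups]
poly_errors = [angular_error(p, t) for t, _, p in setups]
print(round(sum(spline_errors) / len(spline_errors), 2))
print(round(sum(poly_errors) / len(poly_errors), 2))
```

For these three setups the mean errors come out at roughly 2.6° for cubic spline interpolation and 10.3° for polynomial interpolation.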

Conclusions
This paper proposes a DoA estimation method for PD based on RSSI and a sensor array. Compared to TDoA or ToA techniques, the cost of the proposed method is reduced by an order of magnitude, because no signal acquisition system with a high sampling rate is needed. According to [20,21], the errors of DoA estimation using the TDoA or ToA techniques range from 2.4° to 7.2° [6,11]. Also, using RSSI fingerprint localization, the angle error is about 5° [15]. In the proposed method, the sensors in the sensor array collect the RSSI data of the PD. Then the RSSI data are processed by the GMM, GPC and interpolation algorithms, exploiting the specific radiation characteristics of the UHF antenna. The effectiveness of the method has been verified by experiments performed in an HV laboratory. The results show that the average error is less than 5° using the cubic spline interpolation algorithm. The accuracy is approximately the same as that of the time delay estimation methods, and the low cost and flexibility make our proposal a promising approach for fault early warning systems. Furthermore, the DoA estimation of multiple PD sources by the proposed method will be a topic of our future work.