Classification of Partial Discharge Sources in Ultra-High Frequency Using Signal Conditioning Circuit Phase-Resolved Partial Discharges and Machine Learning

Abstract: This work presents a methodology for the generation and classification of phase-resolved partial discharge (PRPD) patterns based on a printed UHF monopole antenna and a signal conditioning circuit that reduces hardware requirements. For this purpose, the envelope detection technique was applied. Test objects such as a hydrogenerator bar, dielectric discs with internal cavities in an oil cell, a potential transformer and tip–tip electrodes immersed in oil were used to generate partial discharge (PD) signals. The standard IEC 60270 (2000) method was used as a reference for detecting and classifying partial discharges. After the acquisition of the conditioned UHF signals, a digital threshold filtering technique was applied, and the peaks of the partial discharge envelope pulses were extracted. Feature selection techniques were used to choose the best features to train machine learning algorithms, such as multilayer perceptron, support vector machine and decision tree algorithms. Accuracies greater than 84% were achieved, revealing the classification potential of the methodology proposed in this work.


Introduction
The availability of assets in an electric power system (EPS) depends on the quality and reliability of electrical equipment during its operation. Internally located faults, even minor ones, can lead to the progressive deterioration of the equipment's insulation and the consequent withdrawal of equipment, resulting in asset repair costs, regulatory fines and other types of penalties imposed by the inspection agencies of the electricity sector [1].
Among the numerous methods for monitoring and diagnosing the operational condition of insulation systems, the measurement of partial discharge (PD) activity stands out, being considered a well-established solution for identifying potential defects in the insulation system of high-voltage equipment [2].
Partial discharges are, in general, consequences of local electrical stress on the equipment insulation, both internally and superficially. These discharges partially short-circuit the insulation between two conductors and are mostly accompanied by emissions of sound, light, heat, chemical reactions and electromagnetic waves [3]. Despite the low intensity of the discharges, the recurrent occurrence of the phenomenon on organic insulating materials means that continuous PD activity degrades the material until a partial discharge evolves into a complete disruption, resulting in equipment failure and consequent withdrawal from operation. Thus, since the phenomenon evolves gradually, verifying the occurrence and evolution of partial discharges becomes a robust tool for monitoring and diagnosing the operating conditions of assets, helping to raise the level of reliability of electric power concessionaires.
Electronics 2024, 13, 2399
The electrical method is deemed a benchmark method for the detection and classification of partial discharges. It is a calibratable method and, therefore, reproducible with relative reliability, and it is used in laboratory and insulating system commissioning tests. However, the standard postulates the need for a direct electrical connection between the acquisition system and the equipment to be tested, making this method quite invasive and sensitive to noise, which limits its practical application in the continuous monitoring of the operational state of equipment in substations. Consequently, the method is not widely practiced in the field, being generally restricted to laboratory tests.
Due to the limitations imposed by the electrical method and other alternative methods, researchers have developed a technique for PD detection based on electromagnetic waves in the UHF (ultra-high frequency) range, called the radiometric method. The radiometric method consists of acquiring the electromagnetic waves radiated by the current pulses of partial discharges using sensors (antennas) operating in the frequency range between 300 MHz and 3 GHz [12,13].
The great potential applicability of the radiometric method of partial discharge acquisition (RMPDA) lies in its capability to perform a comprehensive diagnosis of electrical equipment regarding the quality of its insulation in an efficient, reliable and non-invasive way, allowing defects to be detected, classified and located [10–13].
In addition, the employment of UHF sensors or antennas is effective for real-time monitoring of various levels and types of partial discharges in equipment, as they are immune to external noises, such as corona discharges [14].
However, the application of monitoring via the RMPDA method necessitates oscilloscopes with high bandwidths and sampling rates for better detailing of the partial discharge pulses and the consequent extraction of features for classification [15].
The replacement of oscilloscopes with high bandwidths and sampling rates by cheaper and more robust acquisition systems prepared for field acquisition is possible with the addition of a signal conditioning system based on envelope sensing circuitry [16]. Moreover, the signal-to-noise ratio tends to be improved, since a low noise amplifier (LNA) can be used to improve the sensitivity of the system.
The acquisition of envelopes or a UHF pulse does not provide significant information for the classification of PD sources. Thus, a technique like phase-resolved partial discharge (PRPD) analysis can contextualize and simplify the classification of defects. It is based only on the intensity and phase parameters of signals and histograms, which is especially useful if the acquisition of signals involves a reduced sampling method, such as envelope detectors [17,18].
With the evolution of signal processing techniques, artificial intelligence and digital signal filtering techniques have been increasingly applied for monitoring and diagnosis in electric systems [19]. Artificial neural networks, support vector machines and decision trees have been used as techniques for diagnosing equipment failures [20]. In addition, due to the large amount of data acquired and patterns generated, it is not feasible to perform the classification visually [21].
The detection of PD with commercial antennas and the demonstration of the differences in the envelope depending on the distance between the antenna and the PD source are presented in [16,22]. Moreover, Refs. [22,23] differentiated between sources of partial discharges using envelope information from various sources.
In the literature, the conventional use of PRPD patterns is primarily to assess the severity of PD, regarding the amplitude and evolution of discharge levels [24]. PRPD patterns obtained with a commercial UHF antenna and PD classifications using features extracted from the envelopes of these pulses were reported in [25]. However, that work did not perform classification with the PRPD patterns.
Therefore, it is essential to study the feasibility of generating PRPD graphs from the RMPDA with envelope detection, as well as the feasibility of classifying the different patterns detected. To this end, this work proposes the use of a printed monopole antenna (PMA) as a UHF sensor with envelope detection as a sampling rate reduction technique through signal conditioning for the detection of patterns that can assist in the classification of partial discharges.

Experimental Setup
For comparative purposes, the circuit defined by IEC 60270 [3] was added to allow an automatic and simultaneous acquisition by the radiometric method. The proposed additions consist of acquiring, through one channel of the oscilloscope, the sinusoidal high-voltage reference signal through the capacitive voltage divider and, in other channels, the signal from the LNA and the conditioned signal from the detection of the envelope. The proposed arrangement can be seen in Figure 1, and its photograph is shown in Figure 2.
As shown in Figures 1 and 2, the high alternating voltage was generated by a transformer with an adjustable secondary voltage (0–100 kV), obtained by varying the primary voltage (0–220 V). The high voltage was measured with a capacitive voltage divider in order to obtain and control the waveform. Noise coming from the voltage source was filtered by a 15 mH blocking coil. The IEC 60270 measuring branch consists of a 1000 pF coupling capacitor in series with an LDM-5 Doble Lemke measuring impedance and an LDS-6 digital partial discharge meter, connected to a laptop running the graphical interface of the Doble Lemke LDS-6 PD Measuring System program. Finally, the system was calibrated using an LDC-5 Doble Lemke calibrator.
Simultaneously with the electric method measurement, a measurement was carried out with the UHF sensor, a printed monopole antenna designed by [26], connected to a signal conditioning circuit. The oscilloscope was a Rohde & Schwarz RTA4004 (1 GHz and 5 GSa/s) configured for acquisition at 125 MSa/s on four channels, with 2.5 MSa per channel. Lastly, for PD generation, five test objects were used: a hydrogenerator bar with an applied nominal operating voltage of 6 kV; Phenolite dielectric discs with two and three internal cavities of 2 mm diameter, subjected to 16 kV and 18 kV, respectively, both designed and made by [27] and placed in a PD generation cell developed by [28]; a potential transformer at its 20 kV nominal voltage; and tip-to-tip electrodes immersed in oil, spaced 4 cm apart and subjected to 12 kV.
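The oscilloscope settings quoted above fix the duration of each record, which is worth checking against the mains period. The short calculation below is a sketch only: the 60 Hz mains frequency is an assumption, chosen because it is consistent with the 16.66 ms period used in the threshold filtering step, and the variable names are illustrative.

```python
# Acquisition window implied by the oscilloscope configuration.
SAMPLE_RATE = 125e6       # samples per second (125 MSa/s)
RECORD_LENGTH = 2.5e6     # samples per channel (2.5 MSa)
MAINS_FREQUENCY = 60.0    # Hz; assumption, not stated explicitly in the text

window_s = RECORD_LENGTH / SAMPLE_RATE    # duration of one record
cycles = window_s * MAINS_FREQUENCY       # mains cycles covered by one record
sample_period_ns = 1e9 / SAMPLE_RATE      # spacing between samples

print(f"window = {window_s * 1e3:.1f} ms, "
      f"{cycles:.2f} mains cycles, "
      f"{sample_period_ns:.0f} ns per sample")
```

Under these assumptions, each record spans slightly more than one cycle of the reference voltage.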

Signal Conditioning System
The signal conditioning circuit consisted of a high-pass filter, a low-noise amplifier and an envelope detector. The circuit developed for this work is presented in Figure 3. The high-pass filter had a cut-off frequency of 150 MHz. The amplifier was an Oockic 40 dB low noise amplifier operating between 30 MHz and 4 GHz. The envelope detector was a model based on the ADL5511 with a Schottky diode, with an operating range from DC up to 6 GHz and an envelope bandwidth of up to 130 MHz, as shown in [16].
For the assessment of the S21 parameter of the filter and the LNA, an ENA 5071C vector network analyzer (VNA) was used. For the measurement of the figures of merit, an MXA 9020A spectrum analyzer was used to obtain the noise figure, and an EXG N5172B signal generator was used together with the spectrum analyzer to obtain the 1 dB compression point (P1dB). The characterizations of the signal conditioning circuit are presented in Figures 4 and 5.
As shown in Figure 4, the filter has a transmission coefficient (S21) ranging from −1 dB to −4 dB between 150 MHz and 1.5 GHz. According to Figure 5, the LNA has S21 coefficients ranging from 41 to 37 dB in the evaluated range, in addition to a noise figure (NF) always below 5 dB, which justifies the designation LNA. The P1dB above 15 dBm ensures that the output signals have good amplitudes without major harmonic distortion in the amplified signal.

Computational Procedures
For the correct analysis of the phenomena, the acquisition and filtering of the signals must be performed.

Threshold Filtering
For each test object, the signals acquired by the electric method and by the conditioning system were processed using universal threshold filtering. The threshold was determined as a percentage of the signal peak for each test object before being applied to the database.
The filtering procedure consisted of segmenting the samples into periods of 16.66 ms (approximately one cycle at 60 Hz), separating the acquired signals into groups of ten periods and filtering each group together. The maximum peak value of each group was determined from the absolute values of the signals under analysis, and the threshold was then applied relative to it. The remaining samples were compared to the threshold; those with a lower or equal amplitude were discarded, and for those with a greater amplitude, the amplitude and phase pair was stored. It is worth mentioning that each test object has its own threshold, which also varies according to the acquisition. The flowchart of this filtering methodology is shown in Figure 6.
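The filtering steps described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name `threshold_filter`, the `ratio` parameter and the 60 Hz default are assumptions, and the phase is computed on the further assumption that the record starts at a zero crossing of the reference voltage.

```python
import numpy as np

def threshold_filter(signal, fs, ratio, mains_freq=60.0):
    """Keep samples whose absolute amplitude exceeds ratio * peak.

    signal     : samples of one group of periods (hypothetical input layout)
    fs         : sampling rate in Hz
    ratio      : threshold as a fraction of the group's absolute peak
    mains_freq : frequency of the reference voltage (assumed 60 Hz)

    Returns (phase_deg, amplitude) for the retained samples.
    """
    signal = np.asarray(signal, dtype=float)
    magnitude = np.abs(signal)
    threshold = ratio * magnitude.max()

    # Samples with amplitude lower than or equal to the threshold are discarded.
    keep = magnitude > threshold

    # Fold the time of each retained sample onto a single mains period.
    cycles = np.flatnonzero(keep) * mains_freq / fs
    phase_deg = 360.0 * (cycles % 1.0)
    return phase_deg, signal[keep]
```

For example, `threshold_filter(envelope, fs=125e6, ratio=0.3)` would return the amplitude and phase pairs above 30% of the peak; the 30% figure is purely illustrative, since the text states that each test object has its own threshold.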

PRPD Pattern Representation
The final part of the algorithm presented in the previous section is the generation of PRPD patterns, which is performed with all the groupings of 10 acquisition cycles of each test object. The amplitude and phase information of the pulses is known, allowing the pulses to be plotted on the corresponding phase of the reference sinusoidal voltage. Figure 7 shows the flowchart used to plot each envelope in the form of PRPD from a reference sinusoidal voltage. After the PD signals are filtered, the PRPD patterns are plotted. Next, as a step in data preparation, feature extraction is carried out.
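One way to place each pulse on the phase axis of the reference voltage is sketched below. It is an illustration under stated assumptions, not the procedure of Figure 7: the function `pulse_phases` is hypothetical, and phase zero is taken at the rising zero crossings of the sampled reference sinusoid.

```python
import numpy as np

def pulse_phases(reference, pulse_indices):
    """Phase in degrees of each pulse relative to the reference voltage.

    reference     : sampled reference sinusoid (same time base as the pulses)
    pulse_indices : sample indices at which retained pulses occur
    """
    reference = np.asarray(reference, dtype=float)

    # Sample indices where the reference crosses zero going upward (0 degrees).
    rising = np.flatnonzero((reference[:-1] < 0) & (reference[1:] >= 0)) + 1
    if rising.size < 2:
        raise ValueError("need at least two rising zero crossings")

    # Average number of samples per cycle of the reference voltage.
    period = np.mean(np.diff(rising))

    # Offset of each pulse from the first rising crossing, folded to one cycle.
    offsets = (np.asarray(pulse_indices) - rising[0]) % period
    return 360.0 * offsets / period
```

The resulting phases, paired with the pulse amplitudes, are the points of a PRPD scatter plot.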

Feature Extraction
After the signals are acquired, it is necessary to extract characteristics to serve as input for the classification methods used. A flowchart summarizing this step is presented in Figure 8. Because the number of filtered points varies in each new PRPD, these points alone cannot be used as classifier input parameters, so a fixed number of features was used as input. In all, 18 features were extracted.
In the construction of information from large databases, it is necessary to use relationships between data so that the particularities that characterize the phenomenon are evidenced. In this section, the statistical features that were used in the signal analysis are presented:

• Minimum: the element that represents the smallest value in a dataset;
• Maximum: the element that represents the largest value in a dataset;
• Number of elements: the amount of data in the set to be evaluated;
• Mean: given by the arithmetic mean of the elements belonging to the dataset;
• First quartile: the value of the dataset that delimits the 25% lowest values;
• Second quartile: also called median, it is the value of the dataset that separates the 50% smallest from the 50% highest values;
• Third quartile: the value of the dataset that delimits the 25% largest values;
• Asymmetry: the statistical parameter that measures the degree of deviation of the symmetry of a dataset from the normal distribution;
• Kurtosis: a parameter that measures the degree of flattening of the distribution of a set in relation to the normal distribution.
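The nine statistics listed above can be computed as in the sketch below. Two points are assumptions rather than statements from the text: that the 18 features arise from applying the nine statistics to both the phase and the amplitude of the PRPD points, and that kurtosis is reported as excess kurtosis (zero for a normal distribution). The function names are illustrative.

```python
import numpy as np

def statistical_features(values):
    """Return the nine statistics listed above for a one-dimensional dataset."""
    x = np.asarray(values, dtype=float)
    mean = x.mean()
    std = x.std()  # population standard deviation
    q1, q2, q3 = np.percentile(x, [25, 50, 75])

    centered = x - mean
    # Asymmetry (skewness) and excess kurtosis from central moments.
    skewness = np.mean(centered**3) / std**3 if std > 0 else 0.0
    kurtosis = np.mean(centered**4) / std**4 - 3.0 if std > 0 else 0.0

    return np.array([x.min(), x.max(), x.size, mean, q1, q2, q3,
                     skewness, kurtosis])

def prpd_features(phase_deg, amplitude):
    """Assumed layout: nine statistics of phase, then nine of amplitude."""
    return np.concatenate([statistical_features(phase_deg),
                           statistical_features(amplitude)])
```

Whatever the number of filtered points in a PRPD, this produces a vector of fixed length, which is what the classifiers require.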
In addition to the extraction, the best features were chosen using the SelectKBest algorithm of the Scikit-learn library for Python, which considers the mutual information between each feature and the class label and ranks the features from the most to the least relevant. It is worth noting that, due to variations in ambient noise, measurement saturation and measurements with no PD signal acquired, not all PRPDs are valid examples for training and testing classification models. An unbalanced database between objects was expected, with a greater number of samples available for those with the best signal-to-noise ratio. The use of classifiers for sources of partial discharges is presented in the following section.
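A minimal sketch of this selection step with Scikit-learn is given below. The data are synthetic placeholders, not the measurements of this work, and the value `k=6` is arbitrary; the text only states that SelectKBest was used with a mutual information criterion.

```python
from functools import partial

import numpy as np
from sklearn.feature_selection import SelectKBest, mutual_info_classif

rng = np.random.default_rng(0)

# Placeholder database: 200 PRPDs, 18 features, 5 test objects (synthetic).
y = rng.integers(0, 5, size=200)
X = rng.normal(size=(200, 18))
X[:, 3] += 3 * y  # make feature 3 informative so the ranking is visible

# Fix the random state of the estimator so the scores are reproducible.
score_func = partial(mutual_info_classif, random_state=0)
selector = SelectKBest(score_func=score_func, k=6)  # k is an arbitrary choice
X_best = selector.fit_transform(X, y)

# Feature indices ordered from the most to the least relevant.
ranking = np.argsort(selector.scores_)[::-1]
print(ranking[:6])
```

Varying `k` from 1 to 18 reproduces the sweep over the number of features described in the classification section.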

Classification Using Machine Learning Algorithms
The extracted features serve as input to the machine learning algorithms. That is, each PRPD to be used as data for classification is transformed into the domain of its features. The step-by-step process of using the classifiers is presented in Figure 9. First, the machine learning algorithm is selected: multilayer perceptron (MLP), support vector machine (SVM) or decision tree classifier (DTC). The MLP network was used because it is a well-established standard classifier. The SVM and DTC were chosen due to the ability of these classifiers to work with unbalanced databases, which is the case in this work. The number of features to be used from each of the PRPDs in the training and testing of the model is defined according to the aforementioned methodology, ranging from 1 to 18 statistical features. Next, the database is divided into 70% for training and 30% for testing.
For the MLP network, implemented with the Python Scikit-learn library, the topology used was a single hidden layer with 12 neurons. The number of input neurons in the network ranged from 1 to 18, tracking the number of features, and the number of output neurons was fixed at 5, each one representing one of the test objects evaluated. The optimization algorithm used was L-BFGS. Additionally, the activation function used was relu, to introduce non-linearity into the classifier [29], with a maximum number of epochs set at 2000 to complete the training of the model.
For the SVM classifier from Scikit-learn, the kernel function used was the radial basis function, or Gaussian kernel [30]. The number of inputs for this classifier ranged from 1 to 18, and the number of output classes was 5.
For the DTC, the criterion was set to gini and the splitter to best, with no limit on the maximum number of leaf nodes, which are the default settings of the Python Scikit-learn library, as in [31]. The number of inputs again ranged from 1 to 18, tracking the number of features. The number of possible outputs for this classifier was 5.
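The configurations described above map onto Scikit-learn estimators roughly as follows. This is a sketch under assumptions: only the parameters named in the text are set, everything else is left at the library defaults, and the helper names `build_classifiers` and `split_70_30` as well as the `seed` argument are illustrative. The text does not say whether the split was stratified, so a plain random split is used here.

```python
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

def build_classifiers(seed=0):
    """Return the three classifiers with the settings quoted in the text."""
    return {
        # One hidden layer of 12 neurons, relu activation, L-BFGS solver.
        # With L-BFGS, max_iter bounds the number of solver iterations.
        "MLP": MLPClassifier(hidden_layer_sizes=(12,), activation="relu",
                             solver="lbfgs", max_iter=2000,
                             random_state=seed),
        # Radial basis function (Gaussian) kernel.
        "SVM": SVC(kernel="rbf"),
        # Gini criterion, best splitter, unlimited leaf nodes (the defaults).
        "DTC": DecisionTreeClassifier(criterion="gini", splitter="best",
                                      max_leaf_nodes=None,
                                      random_state=seed),
    }

def split_70_30(X, y, seed=0):
    """Divide the database into 70% for training and 30% for testing."""
    return train_test_split(X, y, test_size=0.30, random_state=seed)
```

The number of inputs and outputs does not need to be set explicitly: Scikit-learn infers them from the shape of the feature matrix and from the distinct class labels.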
For the MLP network, implemented with the Python scikit-learn library, the topology used was a single hidden layer with 12 neurons. The number of input neurons in the network ranged from 1 to 18, tracking the number of features, and the number of output neurons was fixed at 5, each one representing one of the test objects evaluated. The optimization algorithm used was L-BFGS. The activation function used was ReLU, to introduce non-linearity into the classifier [29], with the maximum number of epochs set at 2000 to complete the training of the model.
For the SVM classifier from scikit-learn, the kernel function used was the radial basis function, or Gaussian kernel [30]. The number of inputs for this classifier ranged from 1 to 18, and the number of outputs was set to 5.
For the DTC, the criterion was set to gini and the splitter to best, with no limit on the maximum number of leaf nodes, which is the default mode of the Python scikit-learn library, as in [31]. The number of inputs again ranged from 1 to 18, tracking the number of features. The number of possible outputs for this classifier was set to 5.
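As a minimal sketch, the three classifiers described above could be instantiated in scikit-learn as follows. Only the hyperparameters stated in the text are set explicitly; leaving everything else at the library defaults is an assumption, and the helper name `build_classifiers` and the `random_state` argument are illustrative additions that do not come from the original work.

```python
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def build_classifiers(random_state=0):
    """Return the three classifiers with the settings stated in the text.

    random_state is an illustrative assumption added for reproducibility.
    """
    return {
        # Single hidden layer with 12 neurons, L-BFGS solver, ReLU activation,
        # at most 2000 training iterations.
        "MLP": MLPClassifier(hidden_layer_sizes=(12,), solver="lbfgs",
                             activation="relu", max_iter=2000,
                             random_state=random_state),
        # Radial basis function (Gaussian) kernel.
        "SVM": SVC(kernel="rbf"),
        # Gini criterion, best splitter, no limit on the number of leaf nodes.
        "DTC": DecisionTreeClassifier(criterion="gini", splitter="best",
                                      max_leaf_nodes=None,
                                      random_state=random_state),
    }
```

The input and output sizes are not set explicitly here, since scikit-learn infers them from the feature matrix and the labels when `fit` is called.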
In summary, two PRPD databases were evaluated for the classification of partial discharges, namely UHF signals conditioned by the envelope detection and amplification stage and signals acquired by the electric method (IEC 60270). Both were filtered by a universal threshold. This resulted in 108 scenarios in total: 2 acquisition methods, 1 filtering method, 18 feature counts and 3 classifiers. The classifiers' performance was evaluated by the average accuracy of the model over 10 executions for each scenario.
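The scenario count follows directly from the Cartesian product of the four factors. The short sketch below enumerates them; the string labels are illustrative names chosen here, not identifiers from the original work.

```python
from itertools import product

acquisition_methods = ["conditioned UHF", "IEC 60270"]
filtering_methods = ["universal threshold"]
feature_counts = range(1, 19)  # 1 to 18 features
classifiers = ["MLP", "SVM", "DTC"]

# Every combination of acquisition, filtering, feature count and classifier.
scenarios = list(product(acquisition_methods, filtering_methods,
                         feature_counts, classifiers))
print(len(scenarios))  # 108
```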

Signal Conditioning System
Figure 10 shows an amplified radiometric PD pulse obtained from the PT and its respective envelope in the time domain. Figure 11 presents the conditioning system's responses in the frequency domain. From the curves shown in Figure 5, it can be seen in Figure 11 that the partial discharge pulse has important components up to 900 MHz, while the envelope has a strong DC component and a spectrum whose amplitude drops with frequency, with the most energetic components lying below 50 MHz. Therefore, the highest relevant frequency is reduced by a factor of about 18, and the hardware requirements for signal acquisition are reduced accordingly.
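The factor of 18 is simply the ratio of the two upper frequencies. The sketch below also lists the corresponding Nyquist sampling rates as a rough indication of what the reduction means for the acquisition hardware; these rates are a theoretical lower bound added here for illustration and are not figures reported in this work.

```python
def bandwidth_reduction(f_pulse_hz, f_envelope_hz):
    """Ratio between the highest relevant frequencies before and after conditioning."""
    return f_pulse_hz / f_envelope_hz


def nyquist_rate(f_max_hz):
    """Minimum sampling rate (samples/s) for a signal band-limited to f_max_hz."""
    return 2.0 * f_max_hz


ratio = bandwidth_reduction(900e6, 50e6)
print(ratio)                 # 18.0
print(nyquist_rate(900e6))   # 1.8e9 samples/s for the raw UHF pulse
print(nyquist_rate(50e6))    # 1e8 samples/s for the envelope
```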
The signals acquired by the two techniques are compared in the unit common to both, the volt (or millivolt). All acquisitions were performed after checking the background noise, which was always kept below 15 pC according to the IEC 60270 method. The discharge levels captured for each test object are shown in Table 1.

Threshold Filtering Algorithm
The signals acquired by the two methods are displayed, along with the universal threshold filtering algorithm. Figure 12 shows the whole process, from the acquisition to the generation of PRPD patterns for dielectric disk 1. The sequence of figures in the left column corresponds to the data acquired with the partial discharge signal conditioning system, while those in the right column correspond to the data obtained with the measurement standardized by IEC 60270.
In Figure 12, panels (a) and (f) show the signals of 10 acquisitions recorded by the oscilloscope. The first filtering step consists of truncating these signals to one sinusoidal period in order to analyze only the equivalent of one power cycle, or approximately 16.67 ms at 60 Hz. This step is presented in (b) and (g). In the next step, (c) and (h), the absolute values of the acquisition under analysis are calculated and plotted. In (d) and (i), the thresholds calculated for each set of 10 acquisitions are presented; these thresholds are percentages of the highest peak value recorded. For the conditioned signals of the dielectric disk, a threshold of 30% was used, and for the signals acquired with the IEC 60270 method, the threshold was fixed at 15% for all objects under test. The difference between the amplitudes and threshold levels of the signals is due to the measurement principles. The electrical method has an electrical connection with the partial discharge generation system, which ensures greater sensitivity and signals with greater amplitudes and more prominent peaks. The radiometric method, in contrast, has losses associated with the propagation of waves in the air from the device under test to the UHF sensor. Finally, (e) and (j) show the PRPD patterns after signal filtering. Each dot represents the maximum peak of each envelope acquired.
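The filtering steps described above (truncation to one power cycle, absolute value, percentage threshold, one peak per pulse) can be sketched as follows. This is an illustrative reading of the procedure, not the authors' implementation: the function name, the handling of a single acquisition rather than a set of 10, and the assumption that the record starts at the zero crossing of the reference voltage (so that the sample index maps linearly to phase) are all hypothetical.

```python
def extract_pulse_peaks(samples, sample_rate_hz, threshold_fraction,
                        mains_hz=60.0):
    """Return (phase_deg, amplitude) for the peak of each pulse above threshold."""
    # Step 1: keep only the samples of one power cycle.
    n_cycle = int(round(sample_rate_hz / mains_hz))
    # Step 2: work with absolute values.
    cycle = [abs(v) for v in samples[:n_cycle]]
    if not cycle:
        return []
    # Step 3: threshold as a fraction of the highest peak recorded.
    threshold = threshold_fraction * max(cycle)
    peaks = []
    best = None  # (index, amplitude) of the largest sample in the current pulse
    for i, v in enumerate(cycle):
        if v > threshold:
            if best is None or v > best[1]:
                best = (i, v)
        elif best is not None:
            # The pulse has ended: store its maximum as one PRPD point.
            peaks.append((360.0 * best[0] / n_cycle, best[1]))
            best = None
    if best is not None:
        peaks.append((360.0 * best[0] / n_cycle, best[1]))
    return peaks
```

With `threshold_fraction=0.30` this corresponds to the 30% level quoted for the conditioned signals of the dielectric disk, and with `0.15` to the level used for the IEC 60270 signals.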
Not all acquired signals are used to generate valid PRPD patterns, since some acquisitions did not contain discharges above the noise level. A comparison between the numbers of valid patterns for each test object is presented in Table 2. Each filtered PRPD was formed by 10 acquisitions, so 250 measurements made up of 10 acquisitions each would form 250 valid PRPDs if all measurements exceeded the proposed threshold. For the data obtained with the electric method, 92.96% of the database was valid for the generation of PRPD patterns, with 1162 patterns. For the acquisition with the conditioning system, 90.72% of the database was valid. Comparing Tables 1 and 2, it is observed that the test objects with higher PD activity (hydrogenerator bar and PT) had all their patterns valid for attribute extraction and classification, since higher PD activity is less likely to mix with the background noise, always resulting in a successful filtering and pulse extraction process. For the test objects with lower PD activity (discs and oil discharges), the number of usable patterns is reduced when compared to the PT and bar, since lower PD activity is more likely to mix with the background noise. Through the application of the universal threshold technique, it is possible to avoid false positives, i.e., to avoid confusing background noise with PD activity, ensuring the robustness of the proposed methodology.
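The reported percentages can be checked with a short calculation. The total of 1250 candidate patterns per database (250 measurements for each of the 5 test objects) is an inference made here; it is consistent with the reported pattern counts and percentages but is not stated explicitly in the text.

```python
MEASUREMENTS_PER_OBJECT = 250
TEST_OBJECTS = 5
TOTAL_PATTERNS = MEASUREMENTS_PER_OBJECT * TEST_OBJECTS  # 1250 (inferred)


def valid_percentage(valid_patterns, total_patterns=TOTAL_PATTERNS):
    """Share of measurements that produced a valid PRPD pattern, in percent."""
    return round(100.0 * valid_patterns / total_patterns, 2)


print(valid_percentage(1162))  # 92.96 (electric method)
print(valid_percentage(1134))  # 90.72 (conditioning system)
```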

Feature Extraction
For the PRPD patterns generated, the differentiation between the objects under test can be better observed in the tables containing the features of the pulse peak distributions in phase, as illustrated in Tables 3 and 4. The + and − symbols refer to the positive and negative semicycles, respectively. Table 3 refers to the 1134 valid PRPD patterns of the signals acquired with the conditioning system, while Table 4 refers to the 1162 valid patterns acquired with the IEC 60270 method. Differences between the signals, depending on the acquisition technique, are observed in the quartiles of the pulse phases. However, in Tables 3 and 4, there are overlapping zones between the test objects. Because of this difficulty of visual separation, machine learning techniques were used to separate the different sources of partial discharges. For correct classification, the ML models must be given the best features.
For the conditioned UHF data, the interquartile distances between the phases of the pulses are greater than those of the IEC signals.
Machine learning algorithms cannot take PRPD patterns as input directly; the features associated with the phenomenon must first be arranged appropriately. The features used as inputs to the machine learning models were statistical and referred only to the phase of the PD pulse peaks. In addition, the data were normalized.
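A minimal sketch of how 18 phase-based statistical features (9 per semicycle) could be computed from the pulse-peak phases of one PRPD pattern is given below. The exact feature set is an assumption: the first quartile is included here to reach nine features per semicycle, the moment-based definitions of skewness and kurtosis are one common choice, and the function names are hypothetical. Normalization is omitted because the text does not say which scheme was used.

```python
import statistics


def semicycle_features(phases_deg):
    """Nine descriptive statistics of the pulse-peak phases in one semicycle."""
    n = len(phases_deg)
    if n < 2:
        raise ValueError("at least two pulses are needed")
    mean = statistics.fmean(phases_deg)
    q1, median, q3 = statistics.quantiles(phases_deg, n=4, method="inclusive")
    # Population central moments used for skewness and kurtosis.
    m2 = sum((p - mean) ** 2 for p in phases_deg) / n
    m3 = sum((p - mean) ** 3 for p in phases_deg) / n
    m4 = sum((p - mean) ** 4 for p in phases_deg) / n
    return {
        "count": n,
        "min": min(phases_deg),
        "max": max(phases_deg),
        "mean": mean,
        "median": median,
        "q1": q1,
        "q3": q3,
        "skewness": m3 / m2 ** 1.5 if m2 > 0 else 0.0,
        "kurtosis": m4 / m2 ** 2 if m2 > 0 else 0.0,
    }


def prpd_features(phases_deg):
    """Split phases into positive/negative semicycles and build 18 features."""
    positive = [p for p in phases_deg if 0.0 <= p < 180.0]
    negative = [p for p in phases_deg if 180.0 <= p < 360.0]
    features = {}
    for sign, data in (("+", positive), ("-", negative)):
        for name, value in semicycle_features(data).items():
            features[f"{name}{sign}"] = value
    return features
```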
Moreover, to obtain the best classification for the two databases resulting from the combinations of acquisition and filtering methods, the features were selected by the mutual information algorithm of the scikit-learn library. The summary and sequential ordering of the best features for each database are presented in Table 5. From Table 5, it is verified that, regardless of the acquisition technique, the seven most important features are those of the negative semicycle of the PRPD patterns. Therefore, features such as maximum, third quartile, mean, median, minimum, second quartile and number of pulses of the negative semicycle have the potential for representativeness and separability between the various patterns. On the other hand, the features referring to data distributions, such as skewness and kurtosis of both semicycles, were the least representative.
The classification algorithms were trained with the best normalized features, with the number of input features ranging from 1 to 18. The features were added in the order presented in Table 5.
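The ranking and incremental addition of features can be sketched with scikit-learn's `mutual_info_classif`. The helper names, the `random_state` value and the default estimator settings are assumptions made for illustration; the text only states that the mutual information algorithm of the library was used.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif


def rank_features(X, y, names, random_state=0):
    """Order features by decreasing mutual information with the class labels."""
    scores = mutual_info_classif(X, y, random_state=random_state)
    order = np.argsort(scores)[::-1]
    return [names[i] for i in order], order


def incremental_subsets(X, order):
    """Yield (k, X restricted to the k best features) for k = 1..n_features."""
    for k in range(1, len(order) + 1):
        yield k, X[:, order[:k]]
```

Each subset produced by `incremental_subsets` would then be used to train and test one classifier, which gives the 1 to 18 feature scenarios.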

Classification Using Machine Learning
The machine learning models were trained with 70% of the database and tested with the remaining 30%. Both sets were randomly selected and mutually exclusive. The performance of the classifiers, in terms of accuracy on the test database, is presented in Table 6.
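A random, mutually exclusive 70/30 split can be sketched with the standard library alone. The function name and the seed are illustrative; the rounding rule used to obtain the training-set size is also an assumption, since the text does not specify it.

```python
import random


def split_indices(n_samples, train_fraction=0.7, seed=None):
    """Shuffle sample indices and split them into disjoint train/test lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(train_fraction * n_samples)
    return indices[:n_train], indices[n_train:]


# Example with the 1134 valid patterns of the conditioned database.
train_idx, test_idx = split_indices(1134, seed=0)
print(len(train_idx), len(test_idx))  # 794 340
```

Repeating the split with different seeds and averaging the resulting test accuracies is one way to obtain an average over several executions.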
Analyzing the data presented in Table 6, it can be concluded that the best accuracies were obtained with the MLP. For the conditioned signals, the best accuracy on the test base was 84.5% with 14 features, whose confusion matrix is shown in Figure 13. For the data obtained with the IEC circuit, the best accuracy was 91.1%, obtained for the MLP with 12 features, whose confusion matrix is shown in Figure 14. In order of performance, the MLP is followed by the decision tree, with 83.6% and 90% for the conditioned and electric methods, respectively, and by the SVM, with 71.2% and 74.5%, also for the conditioned and electric methods, respectively.
As shown in Figure 13, higher accuracy rates were obtained in the classification of the hydrogenerator bar and the potential transformer, namely 97% and 94%, respectively. The greatest difficulty that the model encountered was in separating discs 1 and 2 and the oil discharge, with accuracies as low as 70% for these objects (70% for disc 1 and 79% for disc 2 and the oil discharge). The difficulty of classifying the dielectric discs has a physical explanation, since both are internal discharges in cavities, with only the number of cavities differing between the objects under test. This difficulty is reflected in the classification errors of 13%, 16% and 18% among the objects tested.
For Figure 14, it was verified that the method obtained 100% accuracy for the hydrogenerator bar, 96% for the potential transformer, 87% for the oil discharge, and 81% and 80% for dielectric discs 1 and 2, respectively. Again, the bottleneck was the separation between the two dielectric discs. This finding corroborates the difficulty of separating the two dielectric discs regardless of the acquisition method.
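Per-object rates such as those read from the confusion matrices can be computed as in the sketch below. It assumes that the percentage quoted for each object is the fraction of that object's test patterns that were classified correctly (row-normalized diagonal); the function names and the toy labels in the usage example are illustrative only and are not data from this work.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def per_class_accuracy(matrix):
    """Fraction of each true class that was predicted correctly."""
    return [row[i] / sum(row) if sum(row) else 0.0
            for i, row in enumerate(matrix)]


# Toy usage with made-up labels.
labels = ["bar", "disc 1"]
cm = confusion_matrix(["bar", "bar", "disc 1", "disc 1"],
                      ["bar", "bar", "disc 1", "bar"], labels)
print(per_class_accuracy(cm))  # [1.0, 0.5]
```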

Discussion and Conclusions
This research aimed to detect and classify different patterns of internal discharges through UHF pulse capture with a printed monopole antenna (PMA) and signal conditioning based on envelope detection. An 18-fold reduction in the frequencies of the signals acquired with this conditioning system was verified, from 900 MHz to 50 MHz.
Five test objects, all with internal discharges, were tested to obtain different profiles of partial discharges of the same nature, which was confirmed with the IEC 60270 method. These objects were a hydrogenerator bar, whose predominant defect is machine bar discharge, a type of internal discharge; dielectric discs 1 and 2, with internal discharge in controlled dielectric cavities; a potential transformer, also with internal discharges, but due to threshing in the epoxy insulation; and, finally, tip-to-tip electrodes, which cause discharge in oil.
Acquisitions by the radiometric method were performed with signals conditioned by the Occkic 40 dB V2A.0 LNA and the ADL 5511 envelope detector. A threshold filtering technique was applied to the signals, and the proportion of valid PRPDs per database was above 90%.

Figure 2. Picture of experimental setup for partial discharge generation/detection.

Figure 3. Picture of UHF partial discharge signal conditioning system.

Figure 5. Figures of merit of LNA.

Figure 7. PRPD pattern representation flowchart. After the PD signals are filtered, the PRPD patterns are plotted. Next, as a step in data preparation, feature extraction is carried out.

Figure 8. Flowchart of data engineering for machine learning training.

Figure 9. Flowchart of classification using machine learning.

Figure 10. Amplified PD pulse and its conditioned envelope.

Figure 11. Frequency domain response of PD pulse and its conditioning system. (a) From 0 to 1.25 GHz; (b) zoom in from 0 to 160 MHz.

Figure 12. Step-by-step representation of the universal threshold filtering technique: (a-e) filtering of the conditioned signals; (f-j) filtering of the signals acquired with the electric method. The reference voltage is shown in orange, the partial discharge signals in blue and the filtering threshold in red.

Figure 13. Confusion matrix of MLP with 14 features from conditioned dataset.

Figure 14. Confusion matrix of MLP with 14 features from IEC dataset.

Table 3. Features per semicycle of the filtered dataset of signals acquired with conditioning system.

Table 4. Features per semicycle of the filtered dataset of signals acquired with IEC 60270.

Table 5. Statistical features per semicycle ordered by importance for the two datasets.

Table 6. Test dataset accuracy of machine learning models.