Failure Prevention and Malfunction Localization in Underground Medium Voltage Cables

Abstract: A smart monitoring system capable of detecting and classifying the health conditions of MV (Medium Voltage) underground cables is presented in this work. Using the analysis technique proposed here, it is possible to prevent the occurrence of catastrophic failures in medium voltage underground lines, for which it is generally difficult to carry out maintenance operations and spot inspections. This prognostic method is based on Frequency Response Analysis (FRA) and can be used online during normal network operation, resulting in a minimally invasive tool. In order to obtain the results shown in the simulation section, it is necessary to develop a lumped equivalent circuit of the network branch under consideration. The standard π-model is used in this paper to analyse sections of a medium voltage cable and the parameter variations with temperature are used to classify the state of health of the line. In fact, the variation of the electrical parameters produces a corresponding variation in the frequency response. The proposed system is based on the use of a complex neural network with feedforward architecture. It processes the frequency response, allowing the classification of the cable conditions with an accuracy higher than 90%.


Introduction
Monitoring electrical infrastructures represents a fundamental activity for ensuring the continuity of operation of any industrial, commercial, or domestic activity. Concepts such as quality and stability of the electricity service have become central themes for the study and development of smart grids, opening new perspectives in the field of scientific research [1,2].
This work focuses on the issue of electricity service continuity, which depends on several factors such as, for example, the degradation level of the most stressed components. In this sense, the prognostic analysis of measurements carried out on the electrical network allows us to plan maintenance operations, preventing malfunctions from evolving into catastrophic failures [3]. The main subjects of this study are the underground medium voltage power lines, which are particularly widespread near urban centres and constitute the fundamental connection between the high voltage transmission lines and the low voltage distribution network. The primary substation represents the starting point for all MV networks in a specific area and allows the transformation from high voltage to medium voltage. Along the path of the MV network, several stretches of the line can be delimited by two medium-low voltage energy transformation stations. The main purpose of this work is to develop a monitoring system capable of classifying the health state of the cables belonging to these stretches. Furthermore, since MV lines are generally underground and difficult to inspect, it is very important to be able to locate malfunctions and intervene in the most critical regions. The prognostic method shown in this paper is based on the measurement of the equivalent admittance of a line stretch. A low voltage signal, with a frequency higher than the fundamental one, is injected into the line stretch to obtain the network function. The relevant frequency response depends on the electrical characteristics of the cable, which can undergo variations when a specific failure mechanism arises [4,5]. The line properties are generally described by means of a lumped equivalent circuit and the π-model is used in this work. Therefore, the line stretch between two secondary substations is represented by the cascade connection of several π-models and each circuit represents a single section of the cable.
The purpose of the prognostic method is to identify the cable section in the worst operating conditions and then locate the malfunction facilitating maintenance operations. In this way, it is possible to prevent catastrophic failures and intervene on the network, minimizing the interruption of the electricity service.
The most-used diagnosis techniques for electrical power networks are based on reflectometry in the time domain or on the measurement of "traveling waves" [6,7]. The former are able to locate, with good precision, a sharp change of the impedance value along the line, but they do not allow the identification of the progressive degradation of the infrastructure and present some difficulties for online use. As for systems based on the measurement of "traveling waves", it is certainly possible to identify the exact point where the short-circuit occurs, but it is not possible to prevent this catastrophic failure [8,9]. To achieve the objective of the work presented here, the relationships between the most common failure mechanisms and the electrical parameters present in the model are considered [10,11]. In particular, failure mechanisms such as the degradation of the insulation, the presence of water trees or damage to the sheath produce a conductance variation and an increase in the conductor temperature, with a consequent increase in its resistance. Since these fluctuations affect the frequency response, it is possible to exploit this information for the prognosis. An effective method for achieving this task is presented in this paper, based on the classification of the equivalent admittance value [5,12]. The procedure consists of the following four phases:

• Modelling of the network branch considered.
• Testability analysis of the equivalent circuit.
• Definition of the fault classes.
• Development of the classification system.
Due to the complex nature of the involved quantities, the classification system is designed using machine learning techniques such as a feedforward neural network with complex neurons. For the analysis of the network branch model, the SapWin software (Version 4.0 beta, Department of Information Engineering-University of Florence, Florence, Italy) [13] is used. It allows us to perform the symbolic analysis of the model providing the transfer function needed for the subsequent testability computation and for the identification of ambiguity groups.
The paper is organized as follows. A detailed description of the network segment modelling, the considered fault classes, testability analysis and the implementation of the complex neural network are presented in Section 2. The results of the simulations performed with the complex neural network are shown in Section 3. Finally, the discussion of results and the conclusions are reported in Section 4.

Materials and Methods
This section covers three main topics:
• Network segment modelling.
• Fault classes and testability analysis.
• Complex neural network.

Network Segment Modelling
The first step is the modelling of the network branch under consideration. In this work, 1 km of medium voltage underground line is considered; the main characteristics of the grid are summarized in Table 1. The line stretch model is split into several sections during the simulations in order to monitor the operating temperature of different segments of the cable and identify the one operating in the worst conditions. The rated current must be greater than the nominal one and the electrical characteristics depend on several factors, which can be grouped into three main categories:

• Type of materials used for the main elements (conductor, insulator, armor, screen).
• Characteristics of cable laying and arrangement.
• Environmental conditions.
All these features are used to calculate the electrical parameters of the π-model shown in Figure 1a. The entire cable stretch is obtained by cascading n models, as shown in Figure 1b. Of course, since the sections are virtual, there are no medium voltage joints at the ends of each section. The cases considered in this work present long sections (at least 200 m) which can include more than two junction regions. In order to increase the accuracy of the measurements and of the problem localization, the effect of the joints on high frequency signals must be taken into consideration in future developments. For the moment, it can be assumed that the change in impedance at the junction regions does not represent a significant effect compared to the change in resistance of a long cable stretch. With such long cable sections, the localization accuracy is low, since an insulation problem can be identified only if it produces a temperature increase along the entire length. To increase the localization accuracy, a larger number of sections will be considered and, consequently, the effect of reflections on high frequency signals will have to be taken into account.
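As a rough illustration of the cascade in Figure 1b, the following sketch chains the transmission (ABCD) matrices of n π-sections and evaluates the admittance seen from the sending end. It assumes the usual π-arrangement (series R + jωL, shunt G + jωC split equally between the two ends of each section) and an open far end unless a load admittance is given. The function names and the numerical values are illustrative assumptions and are not taken from Table 1.

```python
import math


def pi_section_abcd(r, l, c, g, length, freq):
    """ABCD matrix of one pi-section built from per-unit-length r, l, c, g."""
    w = 2 * math.pi * freq
    z = (r + 1j * w * l) * length   # series branch R + jwL of the section
    y = (g + 1j * w * c) * length   # total shunt admittance, half at each end
    a = 1 + z * y / 2
    return ((a, z), (y * (1 + z * y / 4), a))


def matmul2(m, n):
    """Product of two 2x2 matrices stored as nested tuples."""
    return (
        (m[0][0] * n[0][0] + m[0][1] * n[1][0], m[0][0] * n[0][1] + m[0][1] * n[1][1]),
        (m[1][0] * n[0][0] + m[1][1] * n[1][0], m[1][0] * n[0][1] + m[1][1] * n[1][1]),
    )


def input_admittance(sections, freq, y_load=0.0):
    """Sending-end admittance of cascaded sections (r, l, c, g, length)."""
    total = ((1, 0), (0, 1))
    for sec in sections:
        total = matmul2(total, pi_section_abcd(*sec, freq))
    (a, b), (c, d) = total
    # V1 = A*V2 + B*I2 and I1 = C*V2 + D*I2, with I2 = y_load * V2
    return (c + d * y_load) / (a + b * y_load)


# Hypothetical per-metre values for a 1 km stretch split into five 200 m sections.
section = (1.5e-4, 3.0e-7, 3.0e-10, 1.0e-9, 200.0)
y_in = input_admittance([section] * 5, freq=90e3)
print(abs(y_in), math.degrees(math.atan2(y_in.imag, y_in.real)))
```

Changing the r or g entry of a single section and recomputing `y_in` shows how a local parameter variation is reflected in the magnitude and phase measured at the start of the line.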

The R, L, C, G in Figure 1a are, respectively, the resistance, inductance, capacitance, and conductance of a cable section, per unit length. To obtain the final value of the electrical parameters belonging to the model it is necessary to multiply these quantities by the length of the section considered, which is indicated in Figure 1a by the term ∆l. Some of these quantities can be extracted from the manufacturers' datasheets, but they always refer to the main frequency (50 Hz) and to specific environmental and installation conditions. Therefore, the first step is the definition of an analytical procedure for calculating the electrical characteristics of the cable in any operating condition. It should be noted that, in this work, only three-phase lines consisting of three single-core cables are considered.

The resistivity of the conductor material represents the first considered quantity. The resistivity values at 20 °C of the cable materials are known [14,15]. The conductor resistance at 20 °C, without considering the laying conditions and network frequency, is

$$R_{20} = \frac{\rho_{20}}{S} \qquad (1)$$

where ρ_20 is the resistivity of the conductor material at 20 °C and the term S indicates the cross-sectional area of the conductor, which can be directly extracted from the datasheet or easily calculated knowing the geometry of the cable and its dimensions. In order to consider the influence of the laying conditions and the frequency of the electrical signal, it is necessary to introduce in Equation (1) the parameters Y_s and Y_p, which represent, respectively, the skin effect and the proximity effect. The obtained relationship is

$$R = R_{20}\,(1 + Y_s + Y_p) \qquad (2)$$

with

$$Y_s = \frac{x_s^4}{192 + 0.8\,x_s^4} \qquad (3)$$

and

$$x_s^2 = \frac{8\pi f}{R_{20}}\,10^{-7}\,K_s \qquad (4)$$

where the term K_s depends on the type of cable and can be determined using the standard IEC 60287-1-1 [16]. The term f shown in (4) represents the frequency of the electrical quantities.
The parameter Y_p is obtained using the following equations:

$$Y_p = \frac{x_p^4}{192 + 0.8\,x_p^4}\left(\frac{d_c}{D}\right)^2\left[0.312\left(\frac{d_c}{D}\right)^2 + \frac{1.18}{\dfrac{x_p^4}{192 + 0.8\,x_p^4} + 0.27}\right] \qquad (5)$$

$$x_p^2 = \frac{8\pi f}{R_{20}}\,10^{-7}\,K_p \qquad (6)$$

where K_p is a specific parameter of the cable in use and depends on the conductor material and its geometric characteristics. Since the MV cable considered in this work has a copper core made up of elementary round section wires, the value of K_p is equal to 0.8. The term d_c represents the diameter of the conductor and D indicates the "Equivalent Distance" between the cables of the three phases. The value of the term D depends on the layout of the cables: in the case of a trefoil placement, it corresponds to the external diameter of a single cable, while for a flat arrangement it is

$$D = \sqrt[3]{D_{ab}\,D_{bc}\,D_{ac}} \qquad (7)$$

where D_ab, D_bc and D_ac are the mutual distances between the three phases. Finally, the actual resistance value of the cable at any working temperature is

$$R_T = R\,[1 + \alpha\,(T - 20)] \qquad (8)$$

where α represents the thermal coefficient of the conductor material and T is the working temperature in °C. The proposed prognostic method uses signals with a frequency higher than that of the network and, consequently, this frequency must be used in Equation (4). In this way, the correct value of the cable resistance is obtained and errors in the evaluation of the frequency response are avoided. Formulas (1)-(8) are extracted from different scientific texts [14-16].

Concerning the other electrical parameters of the π-model, the variation due to the working frequency is not considered, but the effect due to the reciprocal positioning of the three phases is evaluated. For the calculation of the cable capacitance C, an equation frequently used in the literature is introduced [10,15]. This formula is

$$C = \frac{2\pi\,\varepsilon_0\,\varepsilon_r}{\ln(r_i / r_c)} \qquad (9)$$

where the term r_i represents the insulation radius, r_c the conductor radius, ε_0 the vacuum permittivity and ε_r the relative permittivity of the insulating material. The permittivity values for different types of insulating material can be obtained from [14].
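The chain from the 20 °C resistance to the resistance at a given frequency and temperature can be sketched as follows. This is only a minimal illustration under stated assumptions: the skin and proximity factors use the expressions of IEC 60287-1-1 for three single-core cables, the default K_s = 1 and all numerical values in the usage lines are assumed for the example, and K_p = 0.8 is the value quoted in the text. The IEC expressions are specified for a limited range of x_s and x_p (up to about 2.8), so results at much higher frequencies should be treated with caution.

```python
import math


def iec_factor(x_sq):
    """Return x^4 / (192 + 0.8 x^4), the term shared by Y_s and Y_p."""
    x4 = x_sq ** 2
    return x4 / (192.0 + 0.8 * x4)


def conductor_resistance(rho20, area, freq, temp_c, alpha, d_c, dist,
                         k_s=1.0, k_p=0.8):
    """Per-unit-length resistance (ohm/m) at frequency freq and temperature temp_c."""
    r20 = rho20 / area                               # resistance at 20 degC, no AC effects
    xs_sq = 8 * math.pi * freq / r20 * 1e-7 * k_s    # skin-effect argument
    xp_sq = 8 * math.pi * freq / r20 * 1e-7 * k_p    # proximity-effect argument
    y_s = iec_factor(xs_sq)
    f_p = iec_factor(xp_sq)
    ratio_sq = (d_c / dist) ** 2
    y_p = f_p * ratio_sq * (0.312 * ratio_sq + 1.18 / (f_p + 0.27))
    r_ac = r20 * (1 + y_s + y_p)                     # laying and frequency effects
    return r_ac * (1 + alpha * (temp_c - 20.0))      # temperature correction


# Assumed example: 95 mm^2 copper conductor, trefoil with D = 30 mm.
example = dict(rho20=1.7241e-8, area=95e-6, alpha=3.93e-3, d_c=11.5e-3, dist=30e-3)
print(conductor_resistance(freq=50.0, temp_c=20.0, **example))
print(conductor_resistance(freq=50.0, temp_c=90.0, **example))
```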
To calculate the characteristic conductance of the cable, it is necessary to introduce the term tan(δ), also called loss tangent, which is often used as an index of the insulation quality. The loss tangent can be obtained by measuring the phase difference between the voltage and current waveforms [17]. Therefore, the term tan(δ) can be expressed as

$$\tan(\delta) = \frac{I_R}{I_C} \qquad (10)$$

where the currents I_C and I_R are the charging component and the loss component of the total current I, respectively, obtained by the phasor method [17,18]. Since the equivalent circuit of the insulation system corresponds to the parallel connection between a capacitance and a voltage-dependent resistance R(V), the value of the term tan(δ) at a specific angular frequency ω and at a certain voltage V is obtained as

$$\tan(\delta) = \frac{1}{\omega\,C\,R(V)} \qquad (11)$$

Some diagnostic methods for medium voltage cables are based on the experimental measurement of the loss tangent using low frequency signals (less than 1 Hz) and a portion of the mains voltage [19]. The use of a reduced frequency makes it easier to measure tan(δ) and there are some standards, such as IEEE 400.2-2013, which allow the classification of the health status of the insulation based on the measured value. Therefore, the value of the model conductance is

$$G = \omega\,C\,\tan(\delta) \qquad (12)$$

In our case the frequency of 50 Hz and the rated voltage of the line are used for the tan(δ) assessment. In fact, the prognostic method must operate online and, consequently, the charging and leakage currents in the insulation are those caused by normal operating conditions. This means that the values of tan(δ) contained in the above standards cannot be used. However, as shown in [17,18], there is a direct dependence between temperature and tan(δ). Consequently, the conductance is also used as an indicator of the cable operating temperature. The formula used to calculate the parameter L is

$$L = \frac{\mu_0}{2\pi}\,\ln\!\left(\frac{D}{\vartheta}\right)$$

where the term ϑ represents the radius of the conductor multiplied by a constant that depends on the type of cable.
If the conductor is solid, this constant is 0.7788, whereas, if the conductor is hollow, it is equal to 1. Finally, it is necessary to introduce the mutual inductance term M. This parameter allows the computation of the overall reactance of the cable X as

$$X = \omega\left[L - M\,\frac{\omega^2 M^2}{R_S^2 + \omega^2 M^2}\right]$$

and it modifies the self-inductance and resistance values. The term in brackets corresponds to L_c, which is the self-inductance in the case of sheathed cables with short-circuited connection of the sheaths at both ends of the line. The term R_S represents the sheath resistance per unit length. Similarly, the conductor resistance is modified as follows:

$$R_T \;\rightarrow\; R_T + R_S\,\frac{\omega^2 M^2}{R_S^2 + \omega^2 M^2}$$

It should be noted that the phase transposition technique is used if the cables are very long or the sheath resistance is very low. This technique is called "Cross Bonding" connection and it allows loss reduction. In this case, the correction of the terms R_T and L is not necessary. Even if the sheaths are not short-circuited, the correction is not necessary and an induced voltage of 0.005 V/(km·A) is considered in the evaluation of the losses.
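The effect of sheaths short-circuited at both ends can be illustrated with a small helper. The expressions below are the ones commonly used for single-core cables with solidly bonded sheaths and are an assumption here, since they are not quoted from the references of this work; the function name and the numbers in the usage lines are hypothetical.

```python
import math


def sheath_corrected(r_t, l, m, r_s, freq):
    """Return (resistance, inductance) per unit length with bonded sheaths.

    r_t: conductor resistance, l: self-inductance, m: mutual inductance,
    r_s: sheath resistance (all per unit length), freq: frequency in Hz.
    """
    x_m = 2 * math.pi * freq * m
    k = x_m ** 2 / (r_s ** 2 + x_m ** 2)   # coupling factor between 0 and 1
    l_c = l - m * k                        # sheath currents reduce the inductance
    r_c = r_t + r_s * k                    # and add an equivalent loss resistance
    return r_c, l_c


# Assumed per-metre values, evaluated at 50 Hz and at an injected 90 kHz signal.
for f in (50.0, 90e3):
    print(f, sheath_corrected(r_t=2.0e-4, l=3.5e-7, m=2.5e-7, r_s=8.0e-4, freq=f))
```

With cross bonding, or with sheaths that are not short-circuited, the text indicates that this correction is not applied, which corresponds to using `r_t` and `l` unchanged.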
To clarify the meaning of the geometric terms used in the previous formulas, see Figure 2a.

A medium voltage cable with HEPR (Hard-Ethylene-Propylene-Rubber) insulation and copper sheath is used in this work. The geometric characteristics of the cable and the main information on its materials are shown in Table 2.

As previously stated, the calculation of the electrical components of the π-model depends on several characteristics such as the environmental conditions, the laying, and the distance of the phases. Furthermore, practical aspects such as the "Cross bonding" connection of the cables and the short circuit connection of the sheaths (CCTO) influence the values of the resistance and inductance, as described above. Table 3 summarizes all the information needed to calculate the electrical components of the cable taken into consideration.

It is now possible to calculate the nominal values of the electrical components. It should be observed that only the resistance formula (8) contains the temperature T. This temperature depends on many factors and, for its calculation, the thermal balance equation extracted from [14] is used. The main factors affecting the cable temperature are the environmental temperature and the amount of current flowing in it. As is well known, temperature variations change the resistance value. As mentioned above, insulation degradation and other malfunctions can change the conductance value and introduce a rise in temperature. Therefore, also in this case, a corresponding variation of the cable resistance is obtained. As regards the conductance variation, it is not easy to obtain numerical values referring to the operating conditions because each standard test considers offline cables [19]. However, a good approximation is to consider the same percentage variation with respect to the nominal value. In fact, standard tests such as IEEE 400.2-2013 define three levels of tan(δ) to classify the health state of the insulation at low frequency. The same percentage changes of tan(δ) can reasonably be used at 50 Hz [18].
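The shunt parameters can be sketched in the same spirit. The snippet evaluates the coaxial capacitance, the conductance G = ωC·tan(δ) at 50 Hz, and a tan(δ) value scaled by a percentage with respect to its nominal value. The geometry, the nominal tan(δ) and the percentage used in the usage lines are placeholders chosen for illustration; they are not the values of Table 2 or of IEEE 400.2-2013.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity, F/m


def capacitance_per_m(eps_r, r_i, r_c):
    """Capacitance per metre of a coaxial insulation layer."""
    return 2 * math.pi * EPS0 * eps_r / math.log(r_i / r_c)


def conductance_per_m(cap, tan_delta, freq=50.0):
    """Conductance per metre from G = omega * C * tan(delta)."""
    return 2 * math.pi * freq * cap * tan_delta


def scaled_tan_delta(tan_nominal, percent_change):
    """Apply a percentage change to the nominal tan(delta)."""
    return tan_nominal * (1 + percent_change / 100.0)


# Placeholder geometry and loss tangent (assumed values).
c = capacitance_per_m(eps_r=3.0, r_i=11.0e-3, r_c=5.75e-3)
g_nominal = conductance_per_m(c, tan_delta=2.0e-3)
g_degraded = conductance_per_m(c, scaled_tan_delta(2.0e-3, 50.0))
print(c, g_nominal, g_degraded)
```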

Fault Classes and Testability Analysis
According to these considerations, the monitoring of the health state of the cable can be achieved by basing the prognostic procedure on the detection of variations in frequency response caused by changes in resistance and conductance. In this work, three fault classes are used for each cable section and they correspond to specific intervals of temperature and tan(δ). A π-model is used for each cable section; it contains four electrical components, but only two, R and G, are considered variable terms depending on the operating conditions. As previously described, the electrical conductance G depends on tan(δ) and changes its value when insulation degradation occurs. This degradation produces an increase in the temperature of the cable and, consequently, an increase in resistance R.
The first fault class represents the nominal working condition of the cable. Protection devices generally tolerate a specific overcurrent value. This causes an acceptable overtemperature, which does not excessively change the cable performance. If the overcurrent exceeds the established limits, the protections intervene, interrupting the current flow. Overtemperature due to an insulation problem does not activate the protections and, consequently, must be identified to prevent the malfunction from causing a fault. In this sense, one of the fundamental aspects is to define the range of critical temperatures. The cable temperature must not go beyond the value T_max reported on the datasheet. If this temperature level is exceeded, the functional characteristics of the cable, such as the functionality of the insulation, sheath and other components, are not guaranteed by the manufacturer and structural failure of the network could occur. Therefore, the term "hard overtemperature" corresponds to unacceptable working conditions and represents the third fault class. The term "slight overtemperature" is used to indicate intermediate working conditions between nominal and critical ones. If there is no evident variation in the environmental characteristics or an increase in current such as to trigger the protections, it means that there is an insulation problem. This problem does not cause a real failure but represents an indication of a malfunction. One of the possible effects of this slight increase in temperature and the consequent increase in resistance could be the exceeding of the maximum limit for the voltage drop. The T_2 value, which represents the lower limit of the slight overtemperature range, can be chosen with respect to the characteristics of the line and the monitoring system to be implemented. For example, it is possible to choose T_2 as the temperature reached by the cable with the maximum current tolerated by the protections.

In this work, since the nominal phase current is much lower than the current carrying capacity of the cable, T_2 is chosen equal to the average value between T_nom and T_max. Figure 3a summarizes the fault classes based on temperature values. These temperature intervals must be translated into resistance ranges and, to achieve this, Formula (8) is used. Assuming that the variation of the cable temperature is a consequence of malfunctions, it is also possible to consider three intervals for the parameter tan(δ). The nominal value of this parameter for HEPR insulation is extracted from [15] and the corresponding fault classes are obtained by applying the same percentage variation as that reported in [19]. Consequently, three fault classes for the cable conductance can be calculated through Formula (12). Figure 3b represents all possible working conditions for the cable under test.
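The three temperature classes and their translation into resistance limits can be written compactly as in the sketch below. It assumes that T_2 is the mean of T_nom and T_max, as chosen in this work, and that the boundary values belong to the slight overtemperature class; that boundary convention, the function names and the temperatures in the usage lines are assumptions made for the example.

```python
def overtemperature_threshold(t_nom, t_max):
    """Lower limit T_2 of the slight overtemperature range (mean value)."""
    return 0.5 * (t_nom + t_max)


def temperature_class(temp_c, t_nom, t_max):
    """Map a section temperature in degC to one of the three fault classes."""
    t_2 = overtemperature_threshold(t_nom, t_max)
    if temp_c > t_max:
        return "hard overtemperature"
    if temp_c >= t_2:
        return "slight overtemperature"
    return "nominal"


def resistance_at(r_ref, alpha, temp_c):
    """Temperature-corrected resistance, with r_ref referred to 20 degC."""
    return r_ref * (1 + alpha * (temp_c - 20.0))


def resistance_limits(r_ref, alpha, t_nom, t_max):
    """Resistance values that separate the three classes."""
    t_2 = overtemperature_threshold(t_nom, t_max)
    return resistance_at(r_ref, alpha, t_2), resistance_at(r_ref, alpha, t_max)


# Assumed temperatures: T_nom = 40 degC and T_max = 90 degC.
for t in (50.0, 70.0, 95.0):
    print(t, temperature_class(t, t_nom=40.0, t_max=90.0))
print(resistance_limits(r_ref=2.0e-4, alpha=3.93e-3, t_nom=40.0, t_max=90.0))
```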
Energies 2021, 14, x FOR PEER REVIEW 8 of 24

Starting from these results, it is possible to observe that each cable section belonging to the considered network branch can operate in three different working conditions, which are called: nominal condition, slight overtemperature and hard overtemperature. The main goal of the monitoring system is the classification of the health state of the line by identifying the condition of each cable section. In this way, it is possible to locate the section operating in the worst conditions. The realization of the prognostic method requires measurements of the frequency response corresponding to the equivalent admittance of the network. This means that one or more signals with a frequency higher than the fundamental one are injected into the medium voltage line and the ratio between current and voltage must be evaluated.
The classifier used in the monitoring system must identify the cause of each variation in the line admittance. In other words, it must associate the variation of the R and G parameters of one or more sections to the measured frequency response.
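One simple way to obtain the ratio between current and voltage at an injected frequency from sampled waveforms is a single-bin discrete Fourier transform, sketched here. This is an illustrative measurement approach and not necessarily the one adopted in this work; the sampling rate, record length and signal amplitudes are assumed, and the record is taken to contain an integer number of periods.

```python
import cmath
import math


def phasor(samples, freq, fs):
    """Complex amplitude of the component at freq in a sampled record."""
    n = len(samples)
    acc = sum(x * cmath.exp(-2j * math.pi * freq * k / fs)
              for k, x in enumerate(samples))
    return 2 * acc / n


def measured_admittance(voltage, current, freq, fs):
    """Equivalent admittance as the ratio of current and voltage phasors."""
    return phasor(current, freq, fs) / phasor(voltage, freq, fs)


# Synthetic record: 50 kHz tone sampled at 1 MHz for exactly 50 periods.
fs, f0, n = 1.0e6, 50.0e3, 1000
v = [math.cos(2 * math.pi * f0 * k / fs) for k in range(n)]
i = [0.5 * math.cos(2 * math.pi * f0 * k / fs + math.pi / 4) for k in range(n)]
y = measured_admittance(v, i, f0, fs)
print(abs(y), cmath.phase(y))
```

The magnitude and phase of `y` are the two quantities that feed the classifier for each injected frequency.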
Since the power line is split into several sections (Figure 1b), it is necessary to understand if there are components producing the same variation in the frequency response. Two variable terms are considered for each cable section and, to obtain the classification of the working conditions, each of these pairs must introduce a different variation in the frequency response. To be sure of this fact, an analysis of the equivalent circuit, called "Testability analysis", is introduced. The testability concept is widely used in the field of analog circuits and many scientific articles propose different methods for its evaluation [20,21]. However, the main goal of testability analysis is the definition of ambiguity groups, categories of components whose variation introduces the same change in frequency response. In this work, the definition extracted from [22,23] is used, then a testability based on the fault equation system is introduced [24,25]. Therefore, the starting point for the testability assessment is constituted by the equations relevant to one or more transfer functions at different test points and at multiple frequencies. When the number of equations is at least equal to the number of electrical parameters considered unknown, testability is the rank of the corresponding Jacobian matrix. The unknown parameters correspond to the electrical components considered as variable terms, i.e., the quantities changing their value when a failure mechanism intervenes. The theoretical bases and the analytical method for calculating testability and ambiguity groups are reported in [23]. The maximum value of testability is equal to the total number of variable terms. When testability is less than the total number of variable terms, at least one ambiguity group can be detected, and this means that two or more components introduce the same variation in the frequency response. 
In this case, if the component variations are produced by the same failure mechanism, the problem can be identified but it cannot be located. If the component variations are caused by different mechanisms, identification and localization of the problems are not possible. As previously stated, the measurements of the frequency response must be obtained by injecting signals with different frequencies into medium voltage networks. In order to minimize the intrusiveness of the monitoring system, the equivalent admittance measured at the starting point of the line is used. This means that there is only one test point and, consequently, the previously described testability evaluation requires a number of frequencies that allows the construction of a system of failure equations in which the number of equations is equal to the number of unknown parameters. Since the prognostic method focuses on evaluating the resistance and conductance of each π-model shown in Figure 1b, there are two unknown components for each cable section. Furthermore, the equivalent admittance corresponds to two failure equations, one for its magnitude and one for its phase. Therefore, the number of required frequencies must equal the number of considered cable sections.
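The idea of testability as the rank of a Jacobian can be illustrated numerically. The sketch builds the fault equations (magnitude and phase of the input admittance at each frequency) for an open-ended cascade of π-sections, differentiates them by finite differences with respect to the R and G of each section, and estimates the rank by Gaussian elimination. It is a numerical stand-in for the symbolic procedure of [23] and SapWin, not a reproduction of it: the estimated rank depends on the tolerance, and all component values and frequencies are assumed.

```python
import cmath
import math


def input_admittance(params, fixed, freq):
    """Open-ended cascade; params = [R1, G1, R2, G2, ...] as lumped values."""
    l, c = fixed                      # lumped L and C, assumed equal in all sections
    w = 2 * math.pi * freq
    y_in = 0j                         # open far end
    for k in range(len(params) // 2 - 1, -1, -1):   # from far end to test point
        r, g = params[2 * k], params[2 * k + 1]
        y_half = (g + 1j * w * c) / 2
        y_in = y_half + 1 / (r + 1j * w * l + 1 / (y_half + y_in))
    return y_in


def jacobian(params, fixed, freqs, rel_step=1e-4):
    """Normalized sensitivities of |Y| and arg(Y) to each variable parameter."""
    rows = []
    for f in freqs:
        mag_row, phase_row = [], []
        for i, p in enumerate(params):
            hi, lo = list(params), list(params)
            hi[i], lo[i] = p * (1 + rel_step), p * (1 - rel_step)
            y_hi = input_admittance(hi, fixed, f)
            y_lo = input_admittance(lo, fixed, f)
            mag_row.append((abs(y_hi) - abs(y_lo)) / (2 * rel_step))
            phase_row.append((cmath.phase(y_hi) - cmath.phase(y_lo)) / (2 * rel_step))
        rows += [mag_row, phase_row]
    return rows


def rank(matrix, tol=1e-9):
    """Numerical rank by Gaussian elimination on row-normalized data."""
    rows = []
    for row in matrix:
        peak = max(abs(v) for v in row) or 1.0
        rows.append([v / peak for v in row])
    n_rows, n_cols = len(rows), len(rows[0])
    rk = 0
    for col in range(n_cols):
        if rk == n_rows:
            break
        pivot = max(range(rk, n_rows), key=lambda i: abs(rows[i][col]))
        if abs(rows[pivot][col]) <= tol:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(rk + 1, n_rows):
            factor = rows[i][col] / rows[rk][col]
            for j in range(col, n_cols):
                rows[i][j] -= factor * rows[rk][j]
        rk += 1
    return rk


# Two sections (four unknowns) need two frequencies: 2 equations per frequency.
params = [0.1, 1e-7, 0.1, 1e-7]       # assumed lumped R (ohm) and G (S) per section
fixed = (1.5e-4, 1.5e-7)              # assumed lumped L (H) and C (F) per section
print(rank(jacobian(params, fixed, [40e3, 90e3])))
print(rank(jacobian(params, fixed, [40e3, 40e3])))   # repeated frequency adds no information
```

Repeating the same frequency duplicates two rows of the Jacobian, so the second call cannot exceed rank 2; this mirrors the requirement that the number of distinct frequencies equal the number of sections.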
The choice of the signal frequencies used for the measurements is made by taking into consideration the available band. The CENELEC band used for "Power Line Communications" (PLC) represents the best solution [26,27]. In this way, the monitoring method can be applied to existing systems, adding the prognostic analysis to the transmission of information. For this purpose, a device able to inject low voltage signals within the band is used. A specific coupling system is designed based on the most commonly used PLC devices [28-30]. The circuit used to inject the high frequency signal contains four main elements: transmitter, impedance matching transformer, fourth order high pass filter and capacitive divider for medium voltage cables. In this work, the transmitter is modelled using a low voltage signal generator with a series resistance of 75 Ω called $R_{Tx}$. The resistance value corresponds to that used in the most common PLC standards. For example, the "MCD 80 communication system" produced by ABB uses a 75 Ω resistor and a high pass filter as presented in this paper. The transformer has a transformation ratio $n_{Zc}$, which allows the impedance matching between $R_{Tx}$ and the characteristic impedance of the network $Z_c$, as shown in [31]. The fourth order high pass filter is realized by setting the nominal value of the capacitive divider and the desired cut-off frequencies. The capacitive divider represents the last element of the coupling circuit and it is physically connected to the medium voltage network. The other passive components contained in the filter are chosen according to the cut-off frequencies and according to the scattering parameters required to optimize signal transmission [32,33]. In Figure 4, the high pass filter specifications are shown.

As shown in [26,27], PLC standards generally present a minimum attenuation level of 70 dB for frequencies lower than $f_s$. The value of the frequency $f_s$ is fixed at 2500 Hz in order to eliminate most of the harmonic content that could affect the signal transmission. The maximum attenuation level in the passband has been fixed at 0.2 dB, typical of capacitive coupling systems for high voltage PLC. The Butterworth approximation is used. In Figure 5, the structure of the coupling system is shown. For the evaluation of the line admittance, the transmission parameters of each section are used. Moreover, the equivalent circuit of the network must be completed by introducing line traps [34]. In this way, it is possible to avoid the short-circuit of the signal used for the measurements at the energy transformation stations.

Complex Neural Network
Once the network modelling and testability analysis have been carried out, it is necessary to create a dataset containing all the possible situations starting from the fault classes previously described. Since each network section can belong to one of the three failure conditions shown in Figure 3a, the values of the variable terms, resistances and conductances, are randomly selected with uniform distribution from a specific fault class for each section. Furthermore, multiple frequencies are used in order to improve the performance of the prognostic method, and this means that there are $N_f$ different measurements of the same sample obtained with $N_f$ different signals within the CENELEC band. Therefore, the input section of the dataset used during the training procedure is a matrix with $N_{rs}$ rows, in which each row contains one magnitude column and one phase column for each of the $N_f$ frequencies. For example, $M^{1}_{2,f_1}$ represents the second measure of magnitude corresponding to the first combination made at the frequency $f_1$, $\varphi^{1}_{2,f_1}$ represents the second measure of phase corresponding to the first combination made at the frequency $f_1$, $N_C$ is the number of all possible combinations and $N_{rs}$ is the total number of rows belonging to the dataset, which is equal to the product of the number of samples and the number of combinations. A MATLAB code is used to obtain the measurements of the frequency response during the simulation procedure.
The tool used to classify the health state of the cable sections is a neural network classifier based on a multilayer neural network with multi-valued neurons (MLMVN). It has good classification capability thanks to its generalization performance and compares favourably with other classifiers based on machine learning techniques. It is a feedforward multilayer neural network that uses the derivative-free learning algorithm shown in [35,36] during the training phase. Each neuron is a multi-valued neuron (MVN) with $n$ complex inputs ($X_1, \dots, X_n$) and a single output ($Y$) that belongs to the unit circle on the complex plane. The neural network is composed of discrete neurons; each neuron divides the complex plane into $k$ different sectors (depending on the number of classes), and the output of the activation function $P(z)$ is set to the lower border of the sector where the weighted sum $z$ falls ($z = W_1 X_1 + \dots + W_n X_n$). The discrete activation function is

$$P(z) = \varepsilon_k^{j} = e^{i 2\pi j / k}, \qquad \frac{2\pi j}{k} \le \arg(z) < \frac{2\pi (j+1)}{k} \tag{18}$$

where $j$ is one of the possible sectors, $k$ is the total number of sectors and $\arg(z)$ represents the argument of the weighted sum. Figure 6a,b show a graphical representation of the functioning of the complex neuron. The most efficient MVN learning algorithm is based on the error-correction learning rule. In the case of discrete neurons, the error of the output is represented by the difference between the complex number corresponding to the lower limit of the desired sector and that of the actual one. In a standard neural network with multiple layers, each output error obtained on the last layer is used for the weight adjustment through a backpropagation procedure. This learning rule allows the correction of the weights for each sample of the dataset $s$ ($s = 1, \dots, N_s$).
As shown in [37,38], the correction of the weights can be obtained through a derivative-free learning rule, and this is one of the most important advantages of using a complex neural network over other classifiers. This procedure can be applied step by step for each layer and each sample, or through an algorithm based on the linear least squares (LLS) method, which reduces the computational cost [39].
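The discrete activation function of a multi-valued neuron can be illustrated with a few lines of Python. This is a minimal sketch rather than the authors' MATLAB implementation; the function name `mvn_discrete_activation` is chosen here for illustration.

```python
import cmath
import math


def mvn_discrete_activation(z: complex, k: int) -> complex:
    """Return the point on the unit circle at the lower border of the
    sector (out of k equal sectors) that contains the weighted sum z."""
    # cmath.phase returns a value in [-pi, pi]; map it into [0, 2*pi).
    angle = cmath.phase(z) % (2 * math.pi)
    # Index j of the sector such that 2*pi*j/k <= arg(z) < 2*pi*(j+1)/k.
    j = int(angle // (2 * math.pi / k))
    # Guard against floating-point rounding right at 2*pi.
    j = min(j, k - 1)
    return cmath.exp(1j * 2 * math.pi * j / k)


# Example: with k = 4 sectors, a weighted sum in the second quadrant
# is mapped to the lower border of sector j = 1.
print(mvn_discrete_activation(-1 + 1j, 4))
```

With $k = 2$ the same function describes a binary neuron, whose two sectors are the upper and lower half-planes.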
The standard rule used to calculate the correction for each sample is

$$\Delta W_i^{k,m} = \frac{\alpha_{k,m}}{(n_{m-1}+1)\,\left|z_s^{k,m}\right|}\;\delta_s^{k,m}\;\bar{Y}_s^{i,m-1}$$

where $\Delta W_i^{k,m}$ is the correction for the $i$-th weight of the $k$-th neuron belonging to the layer $m$, $\alpha_{k,m}$ is the corresponding learning rate, $n_{m-1}$ is the number of inputs, equal to the number of outputs of the previous layer, $|z_s^{k,m}|$ is the magnitude of the weighted sum, $\delta_s^{k,m}$ is the output error obtained through the backpropagation method and $\bar{Y}_s^{i,m-1}$ is the complex conjugate of the input. In this way, it is possible to organize a very efficient batch learning algorithm based on the LLS method [37]. When using this algorithm, the output error is calculated for each neuron and each sample and saved in a specific matrix at the end of every training epoch, with one entry for each sample and each neuron. If the number of samples is greater than the number of inputs, an overdetermined system of equations is obtained and different techniques, such as the complex QR decomposition or the Singular Value Decomposition (SVD), can be used to calculate the corrections with the minimum error. The system can be written in a more compact form as

$$Y\,\Delta W^{k} = \delta^{k}$$

and the solution obtained with the LLS method satisfies this condition:

$$\Delta W^{k} = Y^{+}\,\delta^{k}, \qquad Y^{+} = \left(Y^{H} Y\right)^{-1} Y^{H}$$

where the superscript $k$ indicates the number of the neuron taken into account, $Y^{+}$ is the pseudo-inverse of the matrix $Y$ and $Y^{H}$ is its conjugate transpose. Moreover, the soft margin method is used to improve the classification rate, changing the target for each output to the bisector of the desired sector [40].
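The batch correction based on the LLS method can be sketched with NumPy. The example assumes that the inputs of a layer are collected in a complex matrix `Y` (one row per sample) and the errors of one neuron in a vector `delta`; these names, and the use of `numpy.linalg.lstsq`, are choices made here for illustration and not the authors' MATLAB code.

```python
import numpy as np


def lls_weight_correction(Y: np.ndarray, delta: np.ndarray) -> np.ndarray:
    """Least-squares solution dW of the overdetermined system Y @ dW = delta.

    Y     : complex matrix, one row per sample, one column per input
    delta : complex vector with the error of one neuron for each sample
    """
    # lstsq relies on an SVD-based solver and accepts complex data.
    dW, *_ = np.linalg.lstsq(Y, delta, rcond=None)
    return dW


# Synthetic check: errors generated from a known correction are recovered.
rng = np.random.default_rng(0)
Y = rng.normal(size=(50, 4)) + 1j * rng.normal(size=(50, 4))
dW_true = rng.normal(size=4) + 1j * rng.normal(size=4)
dW = lls_weight_correction(Y, Y @ dW_true)
print(np.allclose(dW, dW_true))
```

The same result is obtained by multiplying the error vector by the pseudo-inverse, `np.linalg.pinv(Y) @ delta`; a least-squares solver simply avoids forming the pseudo-inverse explicitly.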
The number of neurons belonging to the output layer depends on the number of considered cable sections. For each of them, two binary neurons are used to identify the health state. They have an activation function that maps the complex plane into two sectors, [0, π) and [π, 2π). The first neuron identifies a possible slight overtemperature; therefore, the first sector [0, π) is where the weighted sum $z$ falls if the cable section is safe and the second sector [π, 2π) is where the weighted sum $z$ falls if the cable section is affected by a slight overtemperature. The second neuron is responsible for identifying a strong overtemperature on the cable section. For this neuron, the first sector [0, π) is where the weighted sum $z$ falls if the cable section is not affected by a strong overtemperature and the second sector [π, 2π) is where the weighted sum $z$ falls if it is. Figure 7 shows the classification rule of the output layer for each pair of neurons. A binary code is used to describe the health state of each cable section. The first sector of each neuron, from 0 to π, corresponds to the number 0 while the second sector, from π to 2π, represents the number 1. This means that the nominal condition of a specific cable section is coded through the sequence 00, the condition of slight overtemperature is coded through the sequence 10, and the code 11 is used to describe the presence of a strong overtemperature. The sequence 01 is not used because it has no meaning. It is now possible to define the output section of the dataset used during the learning phase. Two columns for each cable section are added in (20) and the health state is shown by using the binary sequences described above. In the complete form of the dataset, for example, $Y^{N_s}_{2}$ is the second output of the section number $N_s$. Part of the dataset is used during the learning phase to calculate the error and modify the weights; this procedure is called the training phase.
The remainder is used to verify the classification results at the end of each training epoch; this procedure is called the test phase. A procedure called "Cross Validation" is used to process all the samples belonging to the dataset in both phases.
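The binary coding of the output layer can be made concrete with a short decoding sketch. The state labels and the function names below are illustrative assumptions; only the codes 00, 10 and 11 and the two sectors come from the text.

```python
import math

# Codes taken from the text: 00 nominal, 10 slight, 11 strong overtemperature.
CODES = {"nominal": (0, 0), "slight": (1, 0), "strong": (1, 1)}


def neuron_bit(arg_z: float) -> int:
    """Binary neuron output: 0 if arg(z) is in [0, pi), 1 if in [pi, 2*pi)."""
    return 0 if (arg_z % (2 * math.pi)) < math.pi else 1


def decode_section(arg_z1: float, arg_z2: float) -> str:
    """Map the arguments of the two weighted sums of a section to its state."""
    bits = (neuron_bit(arg_z1), neuron_bit(arg_z2))
    for state, code in CODES.items():
        if code == bits:
            return state
    return "invalid"  # the sequence 01 is not used


print(decode_section(4.0, 0.5))
```

Returning an explicit "invalid" label for the unused sequence 01 makes a misbehaving pair of neurons visible instead of silently mapping it to one of the three valid states.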

Results and Simulation Procedure
The above-described theoretical concepts are used to develop a specific simulation procedure. The main purpose is to verify the classification performance of the monitoring system applied to the network branch described in Table 1. The simulation methodology is organized as shown in [39,40] and provides the performance of the classifier in three different situations: three cable sections, four cable sections and five cable sections.
Once the cable branch with length of about 1 km has been divided into a certain number of sections, the corresponding π-models are calculated, and the complex neural network is trained to classify the state of health of each section. The main objective of the classifier is to define the exact combination of the working conditions of the sections under consideration. As shown in Section 2, the operating conditions are related to the working temperatures and the complex neural network must deduce the cable section temperature starting from the measurements of the frequency response. In this way, it is possible to detect the cable section affected by a malfunction and, consequently, organize the necessary maintenance operations. By using the previously mentioned three fault classes it is possible to prevent the most critical situations. In fact, the second fault class represents a slight overtemperature, which corresponds to an initial degradation of the insulation. Identifying this condition allows the detection of a malfunction in its early stage. The accuracy of the problem localization increases as the number of considered sections increases but, at the same time, the complexity of the classification also increases. Therefore, it is essential to find a compromise between localization accuracy and classification accuracy.
The first step consists of the calculation of the electrical characteristics referred to the unit of length according to the cable size and materials. Then, by dividing the network branch into a certain number of sections, it is possible to obtain the values of the electrical components belonging to the π-models. The maximum temperature of the cable extracted from the datasheet and the nominal one calculated with the rated current and environmental condition are used to define the numerical range of the fault classes. Therefore, once the equivalent lumped circuit of the network has been completed by introducing the coupling system shown in Section 2.2, it is necessary to evaluate the testability and define the ambiguity groups. Then, a dataset containing all possible combinations is created. To achieve this goal, the values of the electrical parameters that change with respect to the temperature are randomly chosen in each fault class. The resistance values are also modified according to the frequencies chosen for the measurements. Finally, all the electrical components are used to create the ABCD matrices and calculate the equivalent line admittance at each frequency. Once the dataset has been created, it is possible to train the complex neural network and verify the classification results. The main steps of the simulation procedure are summarized in the block diagram shown in Figure 8.
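The step that builds the ABCD matrices and the equivalent admittance can be sketched as follows. This is a simplified illustration under stated assumptions: each section is a nominal π-model with the shunt admittance split equally between its two ends, the coupling circuit and the line traps are omitted, the far end is closed on a hypothetical load admittance `y_load`, and the numerical parameter values are placeholders rather than the values of Table 4.

```python
import numpy as np


def pi_section_abcd(R, L, G, C, freq_hz):
    """ABCD matrix of a nominal pi-section (series Z, shunt Y/2 at each end)."""
    w = 2 * np.pi * freq_hz
    Z = R + 1j * w * L          # series impedance of the section
    Y = G + 1j * w * C          # total shunt admittance of the section
    return np.array([[1 + Z * Y / 2, Z],
                     [Y * (1 + Z * Y / 4), 1 + Z * Y / 2]])


def line_admittance(sections, freq_hz, y_load=0.0):
    """Input admittance of cascaded pi-sections closed on y_load."""
    total = np.eye(2, dtype=complex)
    for (R, L, G, C) in sections:
        total = total @ pi_section_abcd(R, L, G, C, freq_hz)
    A, B, C_abcd, D = total.ravel()
    # From V1 = A*V2 + B*I2, I1 = C*V2 + D*I2 and I2 = y_load*V2.
    return (C_abcd + D * y_load) / (A + B * y_load)


# Placeholder (R, L, G, C) values for three identical sections.
sections = [(0.05, 1.2e-4, 1e-8, 9e-8)] * 3
for f in (35e3, 148e3):
    y = line_admittance(sections, f)
    print(f, abs(y), np.angle(y))
```

The magnitude and phase printed for each frequency correspond to one pair of input columns of the dataset. The frequency dependence of the resistance mentioned in the text is not modelled in this sketch.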
In the following subsections the results obtained considering the cable characteristics shown in Tables 2 and 3 are presented.

Calculation of the Electrical Components
The first step of the simulation procedure is the definition of the electrical parameters of the π-model. Using Formulas (1)-(16), it is possible to calculate the characteristic quantities per unit length. Since one π-model is used for each cable section, these values must be multiplied by the length of the sections. Table 4 shows the general results referred to the nominal condition. The values of rated voltage, nominal current and frequency are extracted from Table 1 and the environmental temperature is considered equal to 25 °C. Consequently, the working temperature of the cable used to obtain these results is 35.2 °C.

Calculation of the Fault Classes
As previously explained, the line working temperature is the main index used to define the health state of the cable and it is used to create the fault classes. However, since the cable resistance depends on the frequency value, it is necessary to set the frequency of the signals used for the measurements. The bandwidth used is the CENELEC band, which corresponds to the range (9-148) kHz. In order to respect the values of the scattering parameters suggested in [27], the band actually used is (35-148) kHz. The simulation procedure involves two different techniques for frequency selection. The first of them consists of dividing the range into four parts and selecting the extremes as signal frequencies. The second method consists of selecting 100 frequencies within the range and then applying a Principal Component Analysis (PCA) to the dataset. The fault classes calculated with the first method are shown in Table 5, where each row corresponds to a specific frequency value. Rows 1 to 4 correspond to the values 35 kHz, 72.7 kHz, 110.3 kHz and 148 kHz, respectively. The maximum temperature for the cable taken into consideration can be extracted from its datasheet. This temperature is equal to 105 °C and it must be used in (8) to obtain the upper limit of the second fault class. The corresponding values of conductance are shown in Table 6 and, in this case, only one row is needed because the term tan(δ) is calculated at the fundamental frequency (50 Hz). The remaining electrical parameters are taken from Table 4, while the resistance and conductance values are randomly selected within these intervals, creating all possible combinations.
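The four frequencies listed above are equally spaced points between the two ends of the band actually used, which can be checked with a short sketch. The function name is illustrative, and the equal spacing of the 100 frequencies of the second method is an assumption, since the text does not state how they are distributed.

```python
import numpy as np


def equally_spaced_frequencies_khz(f_min=35.0, f_max=148.0, n_freqs=4):
    """Equally spaced frequencies (kHz) including both ends of the band."""
    return np.linspace(f_min, f_max, n_freqs)


# First method: the four frequencies used for Table 5.
print(np.round(equally_spaced_frequencies_khz(), 1))

# Second method (assumed equally spaced): 100 frequencies, later reduced by PCA.
many = equally_spaced_frequencies_khz(n_freqs=100)
print(many.size, many[0], many[-1])
```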

Testability Assessment
Testability analysis is necessary to understand if the resistances and conductances of each cable section can introduce the same variation in the frequency response. The symbolic analysis must be repeated for each situation: three, four and five cable sections. The fault classes previously described can be used if there are no ambiguity groups that contain resistances and conductances of different π-models. The testability index calculated as described in the second section of this paper is maximum for each situation.

Creation of the Dataset
The dataset includes 100 samples for each fault combination, and it is realized through the following procedure:

• An iterative method carries out the random selection of the electrical parameters belonging to the fault classes in order to obtain all possible combinations; for example, the first step in the case of three cable sections is the extraction of three resistance and conductance values from the first fault class to simulate the nominal condition of each section, which is coded as 00 00 00.
• The random selection described above is repeated 100 times for each frequency.
• All electrical components are used to create the equivalent ABCD matrix of the network.
• Magnitude and phase of the equivalent admittance are calculated for each frequency and each sample.
• The calculated values and the output codes are arranged as shown in (23).
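The procedure listed above can be sketched as a loop over all fault combinations. The sketch makes several assumptions that are not in the original: the fault-class intervals `r_ranges` and `g_ranges` are placeholders (not the values of Tables 5 and 6), a single resistance interval is used per class instead of one per frequency, and `measure` is a hypothetical callable standing in for the admittance calculation.

```python
import itertools
import numpy as np

# Output code of each fault class: 00 nominal, 10 slight, 11 strong.
CLASS_CODE = {0: (0, 0), 1: (1, 0), 2: (1, 1)}


def build_dataset(n_sections, r_ranges, g_ranges, measure, n_samples=100, seed=0):
    """Return a matrix whose rows are [features..., two output bits per section]."""
    rng = np.random.default_rng(seed)
    rows = []
    # Every section can be in one of three classes: 3**n_sections combinations.
    for combo in itertools.product(range(3), repeat=n_sections):
        target = [bit for c in combo for bit in CLASS_CODE[c]]
        for _ in range(n_samples):
            # Uniform random draw inside the fault class of each section.
            R = [rng.uniform(*r_ranges[c]) for c in combo]
            G = [rng.uniform(*g_ranges[c]) for c in combo]
            rows.append(list(measure(R, G)) + target)
    return np.array(rows, dtype=float)


# Placeholder intervals and a dummy measurement, for illustration only.
r_ranges = {0: (0.10, 0.12), 1: (0.12, 0.14), 2: (0.14, 0.16)}
g_ranges = {0: (1e-8, 2e-8), 1: (2e-8, 3e-8), 2: (3e-8, 4e-8)}
data = build_dataset(3, r_ranges, g_ranges, lambda R, G: [sum(R), sum(G)], n_samples=2)
print(data.shape)
```

In a full simulation, `measure` would return the magnitude and phase of the equivalent admittance at each test frequency, so that every row has two input columns per frequency followed by two output columns per section.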

Complex Neural Network Structure
The functioning of the complex neural network with multivalued neurons has been presented in the second section of this paper. Since binary neurons are used in the output layer, the output section of the dataset can be used directly as a target. This means that the binary codes represent the desired outputs and are used to calculate errors. The structure of the neural network presents three layers: input, hidden and output layer. The output layer contains two binary neurons for each section. The number of neurons belonging to the hidden layer and the other hyper-parameters are chosen through a heuristic approach and they are kept the same for each simulation. In particular, 120 neurons are used in the hidden layer and this guarantees an excellent generalization capability, verified during the test phase. In the first layer, there is an input for each frequency and this means that each pair of columns (magnitude and phase), belonging to the input section of the dataset, is transformed into a complex number and processed by the network.

Classification Results
The classification results reported in this subsection demonstrate the effectiveness of the prognostic procedure in defining the health status of the line branch. The main task of the intelligent monitoring system is the simultaneous classification of the working conditions of each cable section. Since there are three fault classes, the performance of the neural network is evaluated by referring to $3^{N_s}$ possible combinations, where $N_s$ represents the number of cable sections taken into consideration. This means that the difficulty of the classification task increases as the number of sections increases and, in this work, the maximum number of sections used is five, i.e., 243 combinations. The index used to evaluate the neural network performance is called the classification rate, and it is defined as the ratio between the number of correctly classified samples and the total number of samples. During the training procedure, 100 samples are used for each combination in order to define the weights of the complex neural network. Each of these samples is found along a row of the dataset matrix with the corresponding desired outputs as shown in (23). When a sample is correctly classified by the neural network, it means that the classifier recognizes the correct combination among all those possible. In this case, all neurons belonging to the last layer present the correct output and they show the real working condition of each cable section. If an output neuron does not assume the desired value, the classification result is incorrect, and the combination provided by the classifier is wrong. Therefore, in addition to the global classification rate, which represents the ratio between the number of correctly classified combinations and the total number of samples, it is possible to consider a specific classification rate for each neuron.
This index is focused on the outputs of the neurons belonging to the last layer and plays a fundamental role during the training procedure by identifying problems in the structure of the neural network. In this subsection, graphs of the general classification rate are shown and tables containing the specific classification rate for each output neuron are presented.
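The difference between the global and the specific classification rate can be shown with a small sketch. The function name and the toy data are illustrative; each row holds the binary outputs of all neurons of the last layer for one sample.

```python
import numpy as np


def classification_rates(predicted, desired):
    """Return (global rate, per-neuron rates) for binary output matrices."""
    predicted = np.asarray(predicted)
    desired = np.asarray(desired)
    correct = predicted == desired
    # A sample counts as correct only if every output neuron is correct.
    global_rate = correct.all(axis=1).mean()
    # Each neuron is also scored on its own column.
    per_neuron = correct.mean(axis=0)
    return global_rate, per_neuron


desired = [[0, 0, 1, 0], [1, 1, 0, 0], [1, 0, 1, 1], [0, 0, 0, 0]]
predicted = [[0, 0, 1, 0], [1, 1, 0, 1], [1, 0, 1, 1], [0, 0, 0, 0]]
print(classification_rates(predicted, desired))
```

In this toy case a single wrong bit makes one sample out of four incorrect, so the global rate can never exceed the lowest per-neuron rate.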
It is necessary to observe that the cross-validation method is used to organize the training procedure into two different phases: the training phase and the test phase. The first of these is carried out by randomly choosing 80% of the samples belonging to the dataset. In this phase, all output errors with respect to the desired values are calculated and used for the adjustment of the weights. Once this 80% of the dataset has been processed, the global classification rate of the training phase is calculated. Through the same procedure, it is possible to calculate the specific classification rate for each output neuron. The correction procedure is repeated until a specific error threshold is reached or the maximum number of allowed iterations is exceeded. At the end of each correction of the weights, the error committed on the remaining 20% of the dataset is calculated. This verification is called the test phase and, also in this case, the global and specific classification rates are calculated. The division of the dataset is carried out five times to process all the information available.
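The 80%/20% division repeated five times can be sketched as a five-fold split. The sketch assumes that the five test subsets are disjoint, which the text suggests but does not state explicitly; the function name is illustrative.

```python
import numpy as np


def five_fold_indices(n_rows, n_folds=5, seed=0):
    """Yield (train_idx, test_idx) pairs; each row is tested exactly once."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_rows), n_folds)
    for i in range(n_folds):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(n_folds) if j != i])
        yield train_idx, test_idx


# Three sections: 27 combinations x 100 samples = 2700 rows.
for train_idx, test_idx in five_fold_indices(2700):
    print(len(train_idx), len(test_idx))
```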
The results obtained during the test phase are considered more important than those obtained during the learning phase because they offer an assessment of the generalization capability of the neural network. In other words, these results allow the evaluation of the performance with data not used during the training phase. For this reason, all the graphs presented in this subsection have a circle on the curve of the results obtained during the test phase, which indicates the best classification rate. Furthermore, all the classification rate values shown in the tables refer to the test phase. The first simulation taken into consideration is focused on three cable sections. This means that the network branch with length of about 1 km is divided into three parts and each of them is characterized by a specific π-model. As mentioned before, two different methods can be used to obtain the dataset matrix. The first of these is to divide the bandwidth into four parts and use these frequencies to create the dataset. The second method consists of creating a dataset with 100 frequencies belonging to the range (35-148) kHz. Then, the principal component analysis is used to extract the main information content and reduce the number of columns in the dataset matrix.
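The reduction of the dataset columns through PCA can be sketched with an SVD. The example assumes that the information content is measured by the explained variance and that a 95% threshold is used; the function name and the synthetic data are illustrative.

```python
import numpy as np


def pca_reduce(X, retained=0.95):
    """Project X onto the fewest principal components whose cumulative
    explained variance reaches the requested fraction."""
    Xc = X - X.mean(axis=0)                      # centre each column
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = s**2 / np.sum(s**2)              # variance share per component
    n_comp = int(np.searchsorted(np.cumsum(explained), retained) + 1)
    n_comp = min(n_comp, len(s))                 # guard against rounding
    return Xc @ Vt[:n_comp].T, n_comp


# Synthetic data: 10 correlated columns generated from 2 latent factors.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2)) @ rng.normal(size=(2, 10))
Z, n_comp = pca_reduce(X)
print(Z.shape, n_comp)
```

Because measurements taken at neighbouring frequencies are strongly correlated, a small number of components can carry most of the variance, which is why the number of network inputs can be kept low.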
The classification results obtained using four frequencies are shown in Figure 9a, while the global classification rate obtained by applying the PCA on the dataset with 100 frequencies is shown in Figure 9b. The PCA technique is used to limit the number of inputs to the neural network. Therefore, most of the information content obtained with 100 frequencies is retained and there is no need to change the classifier structure.
The results shown in Figure 9a,b are extracted from a MATLAB (R2020b, MathWorks, Natick, MA, USA) application developed by the authors. This application calculates the number of errors made on each output neuron and the corresponding classification rate. Moreover, it allows the extraction of weights in order to implement the classifier on systems with online measurements. Table 7 summarizes the classification results: the global classification rate corresponds to the value indicated with the circle on the graphs of Figure 9a,b. All other results refer to the test phase. Comparing the results obtained in this situation, it can be observed that the performance of the neural network trained with the PCA technique is much better. This fact is due to the large information content of the original dataset containing 100 frequencies. The PCA allows the extraction of 95% of the information while reducing the number of network inputs. In this way it is possible to maintain the hyperparameters shown in Section 3.5 without increasing the number of neurons in the hidden layer.
The same simulation procedure can be applied to a network branch characterized by four cable sections. The classification results obtained in this case are shown in Figure 10a,b.
Figure 10. Classification results obtained for a network branch with four cable sections: (a) Global classification rate obtained during the test and training phase by using a dataset with four frequencies; (b) Global classification rate obtained during the test and training phase by applying the principal component analysis to a dataset with 100 frequencies.
In this situation it is possible to observe that the results without PCA are much worse than in the previous case. This fact is due to the higher number of combinations, which increases the complexity of the problem. Table 8 summarizes the results obtained for a network branch with four cable sections.
Finally, the global classification rate obtained in the case of a network branch divided into 5 sections is shown in Figure 11a,b.
Figure 11. Classification results obtained for a network branch with five cable sections: (a) Global classification rate obtained during the test and training phase by using a dataset with four frequencies; (b) Global classification rate obtained during the test and training phase by applying the principal component analysis to a dataset with 100 frequencies.
The results are summarized in Table 9 and, in this case, a great improvement in performance can be seen with the introduction of the PCA technique. This means that the loss of accuracy in results due to the large number of combinations can be avoided by using a large number of frequencies and applying a feature extraction technique. The excellent performance of the proposed classifier is confirmed by the results obtained in other case studies. For example, Table 10 shows the classification rate obtained for cables having different geometric characteristics and materials. Different laying conditions are also considered.
It should be noted that the results shown in Table 10 are obtained for three cable sections in different environmental conditions and with different installations: for the ARG7H1M1 cable presented in the first row of Table 10, an external installation is used with ambient temperature between 0 and 40 °C. In this case, a rated current of 200 A is considered. Cable ARG7H1M1 shown in the second row of Table 10 is tested under the same conditions as those presented in this paper. These conditions also apply to the RG7H1M1 cable with 200 A current.

Discussion
In this work, a new prognostic approach for the detection and localization of faults in medium voltage underground cables is presented. Fault classes are defined for a specific network branch, starting from the variations in resistance and conductance due to the degradation of the insulation and the increase in temperature. Online monitoring of resistance and conductance values is performed through a Frequency Response Analysis (FRA), in which a high frequency low voltage signal is injected into the underground cable to measure the equivalent admittance value. This network function is obtained in its symbolic form thanks to SapWin and then the testability analysis is performed. Specific datasets are created and adapted to the MLMVN structure. Three cases are presented, obtained by dividing the cable branch into 3, 4 and 5 sections, respectively. As shown in the results section, this neural network-based tool gives excellent results in terms of classification rate. Tables 7-9 show in the first row the global classification rates obtained with and without the use of principal component analysis. These results correspond to the ratio between the correctly classified samples and the total number of samples used in the test phase. Since these samples are not used during the training phase, the global classification rate allows the evaluation of the generalization capability of the neural network. In other words, it shows the performance of the classifier with new measurements and can be used to describe the online functioning of the complex neural network. Analyzing the global results obtained without PCA, it is possible to notice that the performance of the neural network decreases as the number of sections increases. This reduction of the accuracy level is caused by the greater complexity of the considered system.
In fact, increasing the number of sections increases the number of combinations to be classified: when the cable stretch is divided into three sections there are 27 possible combinations, with four sections there are 81, and with five sections there are 243. For an accurate evaluation of the performance, it should be noted that the results obtained for each output neuron without PCA are always higher than 94%. This means that the monitoring system guarantees excellent results in the classification of the health status of a single cable section, while the overall performance decreases if the number of sections is high. Note that the results without PCA are obtained using four frequencies in the range (35-148) kHz.
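The combination counts above are consistent with each section taking one of three health classes, so that a stretch with n sections has 3^n joint states. The following minimal sketch, given only for illustration, reproduces these counts and shows how a global classification rate (every section of a sample correct) can differ from the per-section rates. The function names and the toy labels are assumptions introduced here and are not taken from the paper.

```python
from itertools import product


def count_combinations(n_sections, n_classes=3):
    """Number of joint health states for a stretch split into n_sections.

    n_classes=3 is an assumption inferred from the counts 27, 81 and 243.
    """
    return n_classes ** n_sections


def enumerate_combinations(n_sections, n_classes=3):
    """List every joint state as a tuple with one class index per section."""
    return list(product(range(n_classes), repeat=n_sections))


def classification_rates(y_true, y_pred):
    """Return (global_rate, per_section_rates).

    y_true and y_pred are sequences of tuples, one class index per section.
    A sample counts towards the global rate only if all sections are correct.
    """
    n_samples = len(y_true)
    n_sections = len(y_true[0])
    per_section = [
        sum(t[j] == p[j] for t, p in zip(y_true, y_pred)) / n_samples
        for j in range(n_sections)
    ]
    global_rate = sum(
        tuple(t) == tuple(p) for t, p in zip(y_true, y_pred)
    ) / n_samples
    return global_rate, per_section


for n in (3, 4, 5):
    print(n, "sections:", count_combinations(n), "combinations")

# Toy labels (hypothetical): four test samples, three sections each.
y_true = [(0, 1, 2), (1, 1, 0), (2, 0, 0), (0, 0, 0)]
y_pred = [(0, 1, 2), (1, 1, 1), (2, 0, 0), (0, 2, 0)]
global_rate, per_section = classification_rates(y_true, y_pred)
print("global:", global_rate, "per section:", per_section)
```

In the toy example each section is classified correctly in at least 75% of the samples, yet only half of the samples are correct in every section, which illustrates why the global rate is the more demanding figure as the number of sections grows.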
Furthermore, analyzing the results obtained with the use of Principal Component Analysis, it is possible to note that the performance is excellent both from an overall point of view and for each single section. Here, the information obtained using 100 frequencies in the interval (35-148) kHz is post-processed to extract the greatest information content and reduce the number of columns of the dataset. Each simulation shows a global classification rate higher than 97%, which means that the monitoring system achieves excellent performance in the classification of the line health state. Even in the most complicated case, five cable sections with 243 combinations, the classifier is able to identify the correct combination among all those possible with an error level of less than 3%. Since the network branch taken into account is about 1 km long, the monitoring [...] Starting from these results, future developments will be aimed at increasing the localization accuracy for longer line branches. To improve the accuracy of the localization, it will be necessary to increase the number of cable sections but, in this way, the complexity of the classification will increase. For this reason, an iterative procedure will be studied that can improve the performance of the prognostic method.
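The column reduction described above can be illustrated with a minimal PCA sketch based on the singular value decomposition. It is not the processing chain used in this work: the dataset is synthetic, the features are assumed to be real-valued (for example, admittance magnitudes at 100 frequencies), and the number of samples, the number of retained components and the function name `pca_reduce` are hypothetical choices made only for the example.

```python
import numpy as np


def pca_reduce(X, n_components):
    """Project the rows of X onto the first n_components principal axes.

    Returns the reduced dataset and the fraction of variance explained
    by each retained component.
    """
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal axes, ordered by decreasing variance.
    _, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    explained = (S ** 2) / np.sum(S ** 2)
    scores = X_centered @ Vt[:n_components].T
    return scores, explained[:n_components]


# Synthetic stand-in for a dataset with one column per frequency
# (assumed sizes, not the paper's data).
rng = np.random.default_rng(0)
n_samples, n_freqs = 500, 100
latent = rng.normal(size=(n_samples, 3))   # three hidden driving factors
mixing = rng.normal(size=(3, n_freqs))     # how they spread over frequency
X = latent @ mixing + 0.01 * rng.normal(size=(n_samples, n_freqs))

scores, ratio = pca_reduce(X, n_components=4)
print("reduced shape:", scores.shape)
print("explained variance ratio:", np.round(ratio, 4))
```

Because the synthetic columns are driven by only three underlying factors, almost all of the variance is captured by the first three components. This mirrors the idea of retaining the information content of many frequencies in a much smaller number of classifier inputs.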