Machine Learning-Based Classification of Electrical Low Voltage Cable Degradation

Low voltage distribution networks were not traditionally designed to accommodate the large-scale integration of decentralized photovoltaic (PV) generation. In existing networks, the bidirectional power flows that result from changes in load demand and PV generation, together with the influence of ambient temperature, lead to voltage variations and increase the leakage current through the cable insulation. In this paper, a machine learning-based framework is implemented to identify cable degradation using data from deployed smart meters (SMs). Nodal voltage variations are assumed to be related to cable conditions (reduction of cable insulation thickness due to insulation wear) and to changes in client net demand. Various machine learning techniques are applied to classify nodal voltages according to the cable insulation conditions. Once trained on the comprehensive generated datasets, the implemented techniques can classify new network operating points into a healthy or degraded cable condition with high accuracy. The simulation results reveal that the logistic regression and decision tree algorithms lead to better predictions (with 97.9% and 99.9% accuracy, respectively) than the k-nearest neighbors algorithm (which reaches only 76.7%). The proposed framework offers promising perspectives for the early identification of LV cable conditions using SM measurements.


Introduction
Electrical low voltage (LV) distribution networks are the last stage of the electrical power network and supply many dispersed small-scale loads. These networks comprise equipment such as MV-LV (medium voltage-low voltage) transformer substations, overhead/underground lines, protection systems, etc. The radial topology is widely used in LV distribution networks, with a voltage level around 230 V. LV feeders are designed to feed a limited number of end users in order to reduce the influence of an interruption. Consequently, both LV level interruption problems and LV equipment physical state problems (such as cable ageing and deterioration) have received less attention.
The French standard NF C 15-100 (harmonized with the European standard HD 384) specifies that the insulating material of LV electrical cables must oppose the current all along the conductor [1]. In fact, the deteriorations of the insulation material can increase the discharge of leakage currents, which can create overcurrent and voltage variation issues and can decrease the efficient operation and safety of the network. In addition, LV distribution networks (initially designed for unidirectional power flows) are currently subject to the bidirectional power flows and frequent voltage variations arisen from the nance strategies, using historical data, would be more profitable than the currently used corrective maintenance.
In a recent research direction, machine learning (ML) techniques have been studied for fault detection in [21][22][23][24][25][26][27]. The study in [22] addresses the benefit of a machine learning framework for fault detection and classification in power systems. By analyzing the most used ML techniques (taking into account the fault types and the metrics used to evaluate those techniques), the authors have shown the benefits of supervised classifiers to reliably solve power system problems. In the same way, a part of the research in [23] was dedicated to fault diagnosis in LV networks by using a deep neural network approach. The results of this study allowed the authors to highlight the most influential parameters in the fault assessment process, such as the fault resistance. In the context of grid monitoring, the authors in [25] set up a power line modem (PLM)-based solution for the diagnostics of distribution network cables. By implementing various ML algorithms (combined with several preprocessing methods), the proposed approach ensures the use of the best algorithm for a given diagnostic procedure. The work follows a two-stage approach, from degradation detection to the assessment of ageing and localized degradation of XLPE-insulated cables. The key point of this approach is its reliance on access to the PLM database. The authors of [27] investigated the role of ML in the integrity analysis of subsea cables. From the design of a low frequency (LF) sonar system to the detection of the cable degradation stage through accelerated life cycle testing, their study provides a library of LF sonar responses depending on the cable types and conditions.
Regarding voltage issues in the distribution network, the researchers in [28] have worked on a centralized voltage control framework that takes into account the uncertainties related to the network working conditions and its physical parameters (dependency between temperature variation and line resistance; internal resistance of the transformer; and consideration of the shunt admittances of power lines by using a PI line model). The authors have implemented a fast decision-making method, which is cost-efficient since the deep reinforcement learning-based agent can automatically adapt its behavior under varying operating conditions. The above ML-based studies give relevant and acceptable accuracy results with good speed and a low calculation burden. However, they do not integrate the assessment of the electrical properties of the LV network cables associated with their growing insulation degradation. It is therefore interesting to investigate the integration of those ML tools in the LV cable condition assessment. Hence, this paper focuses on the implementation of a machine learning-based framework to identify the cable lines that present an insulation degradation, considering the voltage and net demand variation profiles of the distribution network.
The novelty of this study resides in its proposed machine learning-based framework to identify cable insulation wear, relying on nodal voltage and load demand variations. Through the extensive analysis of cable insulation thickness variations and load flow calculations, a synthetic database is built. Then, the observations in the dataset are classified using several predictors whose impacts are studied. The proposed approach is novel in that it uses data from already largely deployed smart meters. From an economic point of view, it is a cost-effective approach compared to the costly monitoring currently applied to HV transmission lines, where specific meters and communication systems are used (as implemented in France). In the LV distribution system, it is very expensive to deploy sensors and dedicated information and communication technologies in the entire electrical network. To tackle this challenge, this research project aims to take advantage of available data from smart meters and leverage ML capabilities in order to detect the soft (early-stage) degradation of cable insulation (regardless of the type of fault). With respect to the existing literature on fault detection in electrical networks, the main contribution of the current study lies in its proposed methodology, where the problem is approached by highlighting the relationships between the operating conditions of the network, its nodal voltages and the thickness variation of the cable insulation. The remainder of this paper is organized as follows: Section 2 expresses the motivation and objectives of this study. Sections 3 and 4 present the formulation of the insulation degradation problem and the way that the LV line is modelled in this work. In Section 5, the proposed methods of classification are introduced. Then, Section 6 presents the application cases, while Section 7 discusses the obtained results. Finally, in Section 8, the main conclusions are presented.

Motivation and Objectives
The degradation of the insulating material and its impact on the node voltages have been investigated in [14] through the variation of the electrical conductance of the cable insulation. A probabilistic framework was proposed to that end, combining Monte Carlo simulations and load flow computations. Assuming that the degradation degree of the insulation material is an uncertain variable, a scenario creation procedure based on Monte Carlo (MC) sampling was implemented to characterize this uncertainty. The load flow calculations then determine the nodal voltages in the generated scenarios. The framework developed in [14] provides insightful information about the statistical distribution of nodal voltage variations. Additionally, the probability of occurrence of voltage variations under various degrees of insulation wear was analyzed.
The current paper is a step further in this direction. The objective is to detect the cable insulation degradation from the network operating point. To do so, different machine learning techniques have been implemented, relying on the generated database consisting of nodal voltages (associated with the load and generation profiles) as well as the cable insulation conditions. In the training phase, these techniques learn the possible nodal voltages linked to each load and generation profile as well as to the cable insulation conditions. Then, in the test phase, relying only on the nodal voltages (associated with the load and generation data), they identify whether the network working point corresponds to normal conditions or whether there is cable insulation degradation in the tested network. This work thus paves the way to an effective and timely predictive maintenance of the LV distribution network, avoiding costly solutions for the distribution system operators (DSOs) as well as the customers.

Characterization of the Cables Insulation Degradation
Electrical cables are subject to mechanical damage, excessive heat, ageing of material, and electrical stress on a daily basis. These operating conditions cause degradation of the cable insulation material, and in extreme cases, the cable can totally or partially lose its insulation. As a consequence, the insulation impedance decreases, which generates a leakage current flowing through the cable to the ground. This impedance is composed of the ground resistance as well as the resistance of the degraded cable insulation. The remainder of this section focuses on calculating the resistance associated with the degraded insulation.
In a degraded cable, the leakage current flows radially outwards from the center towards the surface of the cable along its length. Let us therefore assume a cylindrical cable with a total radius R, a length L and a conductor radius r, so that the thickness of the insulating material is equal to R − r. Then, let us consider an elementary section of that cable with a radius x and an insulation material thickness dx (infinitesimally small layer of insulation) [29]. The elementary cylindrical section (of area 2πLx) has an insulation resistance given by:
$$R_{iso-dx} = \frac{\rho \, dx}{2 \pi L x} \tag{1}$$
where $R_{iso-dx}$ and $\rho$ are, respectively, the resistance and the resistivity coefficient of the insulation material. From Equation (1), the insulation resistance of the cable is calculated by integrating over the radius of the insulating material [14]:
$$R_{iso} = \int_{r}^{R} \frac{\rho}{2 \pi L x} \, dx = \frac{\rho}{2 \pi L} \ln\left(\frac{R}{r}\right) \tag{2}$$
The above equation gives a general formulation of the insulation resistance of an electrical cable. Then, by assuming that the cable loses a part of its insulation thickness due to degradation, the conductor radius r remains constant while the cable radius R reduces; this radius variation tends to decrease the insulation resistance value.
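The dependence of the insulation resistance on the remaining insulation thickness can be illustrated with a short numerical sketch. The code below evaluates the logarithmic expression obtained by integrating the elementary resistance between r and R. The function name, the resistivity value and the cable dimensions are hypothetical and chosen only for illustration; they are not taken from the study.

```python
import math


def insulation_resistance(rho, length, outer_radius, conductor_radius):
    """Radial insulation resistance (ohm) of a cylindrical cable.

    rho: insulation resistivity (ohm.m); length: cable length (m);
    outer_radius: total cable radius R (m); conductor_radius: radius r (m).
    """
    if not 0 < conductor_radius <= outer_radius:
        raise ValueError("expected 0 < conductor_radius <= outer_radius")
    return rho / (2 * math.pi * length) * math.log(outer_radius / conductor_radius)


# Hypothetical values, for illustration only.
rho = 1e12          # ohm.m
length = 100.0      # m
r = 0.005           # conductor radius (m)
healthy_R = 0.007   # total radius of the healthy cable (m)

for wear in (0.0, 0.25, 0.5, 0.75):
    # 'wear' is the fraction of the insulation thickness that has been lost.
    R = healthy_R - wear * (healthy_R - r)
    print(f"wear={wear:.2f}  R_iso={insulation_resistance(rho, length, R, r):.3e} ohm")
```

As the wear fraction grows, R approaches r and the computed resistance decreases towards zero, which is the trend described above.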

Model of a Healthy Line
A single-phase LV line (between two nodes), in healthy condition, is modelled by its longitudinal impedance. In this study, the shunt admittances (capacitive phenomenon) of the traditional PI model are neglected because of the short distances (short cable length between system nodes; see Section 6.1), as demonstrated in [18]. Therefore, the line impedance becomes a combination of the per-unit-length series resistance $R_i$ and reactance $X_i$:
$$Z_i = (R_i + jX_i) \cdot length_i$$
where $Z_i$ is the self-impedance of line i (between nodes i and i + 1), and $R_i$, $X_i$ and $length_i$ represent, respectively, the line resistance, the line reactance and the length of the line. Figure 1 shows the series model of the above LV electrical line.

Model of a Line with Damaged Insulation
To model the electrical line in the damaged insulation condition, the resistance variation ($R_{iso}$) due to the insulation degradation, established in Section 3, is incorporated in the above model, as in [14]. A variable shunt resistance, between the leakage point (named t in Figure 2) and the ground, models the current discharge through the electrical insulation material. Figure 2 shows the representation of this new electric path (series combination of the insulation resistance $R_{iso}$ and the ground resistance $R_g$) in the line model.
In Figure 2, $length_i$ is defined as the total length of the damaged line i, while the healthy part of this line is represented by $length_{ih}$. $length_{iw}$ is the length of the section starting from the leakage point to the next node.
From the model in Figure 2, three impedances ($Z_{at}$, $Z_{bt}$ and $Z_{ct}$) are defined according to the different parts of the star model [14], as given by Equations (5)-(7). To suit the chosen load flow calculation method (presented below in Section 5.1), the «T» line model shown in Figure 2 (star connection represented by the three impedances $Z_{at}$, $Z_{bt}$ and $Z_{ct}$) is converted to an equivalent delta connection circuit, represented in Figure 3.
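The star-to-delta conversion mentioned above can be sketched with the standard circuit-theory transformation. In the example below, the expressions used for the three star impedances are an assumption made for illustration (series impedance of the healthy part for the arm a-t, series impedance of the section after the leakage point for the arm b-t, and the series combination of insulation and ground resistance for the arm c-t); they are not necessarily those of Equations (5)-(7). All numerical values are hypothetical.

```python
def star_to_delta(z_at, z_bt, z_ct):
    """Convert three star impedances meeting at point t into delta impedances.

    Returns (z_ab, z_bc, z_ca) using the standard star-delta transformation.
    """
    numerator = z_at * z_bt + z_bt * z_ct + z_ct * z_at
    z_ab = numerator / z_ct  # delta branch opposite to the star arm c-t
    z_bc = numerator / z_at  # delta branch opposite to the star arm a-t
    z_ca = numerator / z_bt  # delta branch opposite to the star arm b-t
    return z_ab, z_bc, z_ca


# Hypothetical line data, for illustration only.
r_line, x_line = 0.32, 0.08         # per-unit-length resistance and reactance (ohm/km)
length_ih, length_iw = 0.03, 0.02   # km
r_iso, r_g = 500.0, 10.0            # insulation and ground resistance (ohm)

# Assumed star impedances (not taken from the paper's equations).
z_at = complex(r_line, x_line) * length_ih
z_bt = complex(r_line, x_line) * length_iw
z_ct = complex(r_iso + r_g, 0.0)

z_ab, z_bc, z_ca = star_to_delta(z_at, z_bt, z_ct)
print("Z_ab =", z_ab)
print("Z_bc =", z_bc)
print("Z_ca =", z_ca)
```

When the shunt arm is very large compared with the series arms (healthy insulation), the branch between the two nodes tends to the series impedance of the whole line, which is consistent with the healthy line model.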

Synthetic Creation of the Working Database
In the first stage, a working database is created from the cable thickness distribution and the smart meter (SM) measurement data (i.e., the load and the PV generation measured each quarter of an hour q). The SM inputs are used to obtain the net demand (ND):
$$ND_i = Load_i - PV_i$$
where $ND_i$, $PV_i$ and $Load_i$ are, respectively, the net demand, the PV production and the load demand at node i. Then a load flow is computed for each observation (each quarter q of each day) using the Newton-Raphson load flow (NRLF) technique, which is carried out to calculate the network nodal voltages. During the NRLF computation, the nodal powers are expressed as nonlinear algebraic equations. Taylor series are then used to linearize those equations, which links small variations in real and reactive powers to small variations in the nodal voltage angles and magnitudes through the Jacobian matrix:
$$\begin{bmatrix} \Delta P \\ \Delta Q \end{bmatrix} = \begin{bmatrix} \partial P / \partial \theta & \partial P / \partial V \\ \partial Q / \partial \theta & \partial Q / \partial V \end{bmatrix} \begin{bmatrix} \Delta \theta \\ \Delta V \end{bmatrix}$$
where the vectors ∆P and ∆Q represent the errors between the scheduled and calculated powers at the load buses. The vectors ∆θ and ∆V represent, respectively, the variations in the nodal voltage angles and magnitudes. The equations for calculating the elements of the Jacobian matrix (using the powers measured by the smart meters) are given in [30]. The obtained Jacobian matrix is used to update the network voltages. The ∆P and ∆Q vectors are then updated with the new voltages. For the next iteration, the Jacobian matrix elements are recalculated to obtain new network voltages, and so on, until the errors (i.e., the ∆P and ∆Q vectors) fall below a predefined value. This is what makes the NRLF technique an iterative procedure. The particularity of this process is that the load levels are imposed so as to obtain voltages of the same magnitude range as those obtained with a non-degraded cable. Figure 4 shows the flowchart of the synthetic creation of the knowledge database (the global flowchart of the proposed approach, including the classification process, is presented in Appendix A).
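The iterative procedure described above can be sketched on the smallest possible case: a slack bus feeding a single PQ node through a series impedance. This is only an illustrative sketch, not the load flow implementation used in the study; the function name, the per-unit values and the two-bus topology are assumptions. The net demand is taken as load minus PV generation, and the corresponding power injection at the node is its opposite.

```python
import math


def newton_raphson_two_bus(p_net, q_net, r_line, x_line,
                           v_slack=1.0, tol=1e-10, max_iter=20):
    """Newton-Raphson load flow for a slack bus feeding one PQ node.

    p_net, q_net: net active/reactive demand at the node (p.u., positive = consumption).
    Returns (voltage magnitude, voltage angle in rad, number of iterations).
    """
    y = 1 / complex(r_line, x_line)      # series admittance of the line
    g, b = y.real, y.imag
    p_spec, q_spec = -p_net, -q_net      # scheduled injections at the node
    v, theta = 1.0, 0.0                  # flat start
    for iteration in range(max_iter + 1):
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        # Calculated injections for the current voltage estimate.
        p_calc = v * v * g - v * v_slack * (g * cos_t + b * sin_t)
        q_calc = -v * v * b - v * v_slack * (g * sin_t - b * cos_t)
        d_p, d_q = p_spec - p_calc, q_spec - q_calc
        if max(abs(d_p), abs(d_q)) < tol:
            return v, theta, iteration
        # Jacobian elements (partial derivatives of P and Q).
        dp_dtheta = v * v_slack * (g * sin_t - b * cos_t)
        dp_dv = 2 * v * g - v_slack * (g * cos_t + b * sin_t)
        dq_dtheta = -v * v_slack * (g * cos_t + b * sin_t)
        dq_dv = -2 * v * b - v_slack * (g * sin_t - b * cos_t)
        # Solve the 2x2 linear system J * [d_theta, d_v] = [d_p, d_q] (Cramer's rule).
        det = dp_dtheta * dq_dv - dp_dv * dq_dtheta
        theta += (d_p * dq_dv - dp_dv * d_q) / det
        v += (dp_dtheta * d_q - dq_dtheta * d_p) / det
    raise RuntimeError("Newton-Raphson load flow did not converge")


# Hypothetical per-unit data for one quarter of an hour.
load, pv = 0.6, 0.1
net_demand = load - pv                   # net demand = load - PV generation
v, theta, n_iter = newton_raphson_two_bus(net_demand, 0.1, 0.02, 0.01)
print(f"|V| = {v:.5f} p.u., angle = {theta:.5f} rad, iterations = {n_iter}")
```

With a negative net demand (PV generation exceeding the load), the same function returns a node voltage above the slack voltage, which illustrates the effect of reverse power flows on the voltage profile.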

Labelling Data
For the evaluation of the cable state, two classes are defined and applied to each observation in the database (see Table 1). Class H is associated with cables without insulation wear, while class M is used to label cables presenting a certain degree of insulation wear.

Implemented Machine Learning Methods
This subsection focuses on the machine learning (ML) aspect of the developed tool. Supervised learning approaches are ML techniques based on input and output data (labeled data) and are employed here for classification. The objective is to automatically generate knowledge rules from a database containing "samples" of inputs and corresponding outputs so that, given new input data, the output variable can be predicted (as represented in Figure 5). Supervised learning approaches can be divided into two categories [31]:
• Classification methods, which assign the input observations to categorical groups and lead to the construction of predictive models for discrete responses.
• Regression methods, which describe the relationship between input variables (so-called predictors) and the outputs (through a mathematical function) and lead to the construction of predictive models for continuous responses.
In what follows, the supervised machine learning methods implemented in this work are discussed.

K-Nearest Neighbors Algorithm
The k-nearest neighbors (kNN) algorithm is a supervised ML algorithm that can be used in both classification and regression models. For classification purposes, kNN is a non-parametric method that supports non-linear solutions and can only provide labels as an output. By assuming a value k for the number of nearest neighbors, the kNN algorithm identifies the training observations N closest to the new prediction point x, as represented in Figure 6.
Each new observation x is compared to those that already exist in the input dataset by using a distance calculation (such as the Euclidean distance, the cosine of the angle formed by the two observations, etc.). Then, the class most represented among the k closest observations is assigned to x. The algorithm therefore requires knowing k, the number of neighbors to consider. To choose the right k, the kNN algorithm can be run several times with different values of k; the right k is the one that leads to the best performance (i.e., the lowest error and the best prediction accuracy).
Studies have shown that kNN is a simple but highly efficient and effective algorithm for solving real-life classification problems (such as the recommendation of movies on NETFLIX) [33,34]. In electrical engineering applications, kNN is mostly used for fault detection and classification, but also for power quality classification. The kNN algorithm also has the advantage of being a versatile method that is easy to understand and implement, with no need for initial assumptions. However, when the volume of data (number of observations and predictors) increases, the kNN algorithm tends to become slower. Even if there are more precise classification algorithms, kNN remains a simple first-choice algorithm to model a classification problem; it can achieve a high classification accuracy in problems with unknown distributions, while allowing familiarization with the available database. For this study, the Euclidean distance has been employed as the distance measure because of the ease of calculations and the possibility of checking results manually. Additionally, a limited number of neighbors (k = 5) has been applied.
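The kNN principle described above can be sketched in a few lines: Euclidean distances to all training observations, selection of the k = 5 closest ones, and a majority vote among their labels. The code is a minimal illustration, not the implementation used in the study; the function name, the two features (nodal voltage in V and net demand in kW) and the data values are hypothetical.

```python
import math
from collections import Counter


def knn_classify(train_x, train_y, query, k=5):
    """Return the majority label among the k training points closest to query."""
    # Euclidean distance between the query and every training observation.
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical observations: (nodal voltage in V, net demand in kW).
train_x = [
    (229.5, 1.0), (229.0, 2.0), (228.6, 3.0), (229.2, 1.5), (228.8, 2.5),  # healthy
    (227.0, 1.0), (226.4, 2.0), (225.9, 3.0), (226.8, 1.5), (226.1, 2.5),  # degraded
]
train_y = ["H"] * 5 + ["M"] * 5

print(knn_classify(train_x, train_y, (228.9, 2.2)))
print(knn_classify(train_x, train_y, (226.5, 2.0)))
```

Since the Euclidean distance mixes features with different units, the features would normally be scaled before applying kNN; this step is omitted here for brevity.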

Decision Tree
A decision tree (DT) is a supervised ML algorithm used in both regression and classification problems (usually called CART: classification and regression trees). For classification purposes, DT is a widely used non-parametric method based on a hierarchical representation where the end-nodes are the classes and the intermediate nodes are the tests on the properties of the observations (see Figure 7). In other words, building a decision tree is a recursive process, going from the properties (drawn by branches) to the conclusions about an observation (drawn by leaves).
A decision tree (DT) is a supervised ML algorithm used in both regression and classification problems (usually called CART: classification and regression trees). For classification purposes, DT is a widely used non-parametric method, which is based on a hierarchical representation where the end-nodes are the classification and the intermediate nodes are the tests on the properties of the observations (see Figure 7). In other words, building a decision tree is a recursive process, going from the properties (drawn by branches) to the conclusions about an observation (drawn by leaves). The decision tree starts with a root node (property of X1 in Figure 7) and branches toward possible outcomes. Each of those outcomes leads to additional nodes (property of X2 and X3), which also branch toward other outcomes. In other words, it is a visual representation of the decision-making directly related to the problem to be solved.
A decision tree is a commonly used and highly interpretable machine learning method. It is a reliable algorithm for separating a dataset (set of predictor variables) into several given classes while providing clear indications about the most relevant predictors. For classification problems, a DT algorithm does not need much computation and does not rely on functional assumptions (i.e., it is not affected by any non-linearity); however, it can build very complex trees and run into an overfitting problem. Additionally, the creation of optimal decision trees can be obstructed by the presence of dominant classes, and DT accuracy decreases when the ratio of the number of training examples to the number of classes is low. Decision trees are widely used algorithms that give high-quality results, although this quality mostly depends on the conditions of the data [35,36]. In electric power system applications, DT is used in load consumption prediction and load forecasting, preventive and corrective control, power systems security assessment, etc. [37]. The DT algorithm used in this study is an adjusted binary classification decision tree.
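To make the idea of a "test on a property" concrete, the sketch below searches for the single best binary test of the form x[j] <= t, which is one step of the recursive tree-building process. It assumes the Gini impurity as split criterion (the paper does not state which criterion was used), and the function names and toy observations are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for lab in labels:
        counts[lab] = counts.get(lab, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_split(X, y):
    """Return (feature_index, threshold, weighted_gini) of the best test x[j] <= t."""
    n = len(y)
    best = (None, None, gini(y))  # no split found yet: impurity of the parent node
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [lab for row, lab in zip(X, y) if row[j] <= t]
            right = [lab for row, lab in zip(X, y) if row[j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best[2]:
                best = (j, t, score)
    return best


# Toy, hypothetical normalized observations: [net demand, output voltage].
X = [[0.2, 0.90], [0.8, 0.95], [0.5, 0.92], [0.3, 0.60], [0.7, 0.55], [0.6, 0.50]]
y = ["H", "H", "H", "M", "M", "M"]

feature, threshold, impurity = best_split(X, y)
```

A full tree would apply `best_split` again to each resulting subset until the nodes are pure or a depth limit is reached; limiting that depth is one common way to contain the overfitting mentioned above.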

Logistic Regression
Logistic regression (LR) is a parametric model that supports linear solutions and can reach a high confidence level in its predictions. LR is a powerful algorithm for finding the boundary between two classes. Mathematically speaking, an LR algorithm uses regression to predict the probability (between 0 and 1) that a new observation x belongs to a given class y (see Figure 8).

Figure 8. Logistic regression representation [38].
A mathematical representation of LR is given here. Considering the two-class classification problem of this paper, an analogy can be made between the labels and the output classes as shown in Table 2. The output $h_\theta(x)$ of a logistic regression model (i.e., the probability of a new observation x to be classified into a class y) is bounded as below:

$$0 \le h_\theta(x) \le 1$$

For this classification problem, the probability value $h_\theta(x)$ can be calculated by using a sigmoid function g (an S-curve function that maps predictions to probabilities):

$$g(u) = \frac{1}{1 + e^{-u}}$$

Then $h_\theta(x)$ can be written as below:

$$h_\theta(x) = g(\theta^{T} x) = \frac{1}{1 + e^{-\theta^{T} x}}$$

where the input of the sigmoid function (u) is the weighted sum of the input predictors (x).
The key point is then to find the right values for the parameters θ (θ being a vector of the same size as the observation vector x) by solving a minimization problem:

$$\min_{\theta} J(\theta)$$

with

$$J(\theta) = \frac{1}{M} \sum_{i=1}^{M} \mathrm{cost}\left(h_\theta(x^{(i)}), y^{(i)}\right)$$

where J is a cost function, M is the total number of observations in the dataset and cost is the quadratic classification error, whose expression is given in [39]. Substituting this error term into J gives the cost function to be minimized.
The logistic regression method is the go-to method for binary classification problems (problems with two class values). LR is easy to implement, fast and very efficient to train. The LR algorithm gives good accuracy for simple datasets and the provided model coefficients can be interpreted as indicators of predictor importance. LR has the advantage of being less likely to lead to over-fitting, except in high dimensional datasets. Logistic regression methods are used, in electrical engineering, for electricity monitoring, visualization and prediction but also for fault detection in renewable energy production [40].
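The sigmoid mapping and the search for θ can be illustrated with a short sketch. It is not the paper's implementation: it assumes plain batch gradient descent on the usual cross-entropy (log-loss) cost, which may differ from the error term of [39], and the names `fit`, `predict_proba`, the learning rate, the number of epochs and the toy data are all hypothetical.

```python
import math


def sigmoid(u):
    """S-curve mapping any real number u to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-u))


def predict_proba(theta, x):
    """h_theta(x): sigmoid of the weighted sum of the predictors."""
    return sigmoid(sum(t * xi for t, xi in zip(theta, x)))


def fit(X, y, learning_rate=0.5, epochs=2000):
    """Batch gradient descent on the cross-entropy cost (an assumed choice)."""
    theta = [0.0] * len(X[0])
    m = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(theta)
        for xi, yi in zip(X, y):
            err = predict_proba(theta, xi) - yi
            for j, xij in enumerate(xi):
                grad[j] += err * xij / m
        theta = [t - learning_rate * g for t, g in zip(theta, grad)]
    return theta


# Toy data: first component is a constant 1 (intercept), second a hypothetical
# normalized predictor; label 0 and label 1 stand for the two classes.
X = [[1.0, 0.1], [1.0, 0.2], [1.0, 0.3], [1.0, 0.7], [1.0, 0.8], [1.0, 0.9]]
y = [0, 0, 0, 1, 1, 1]

theta = fit(X, y)
p_low = predict_proba(theta, [1.0, 0.1])
p_high = predict_proba(theta, [1.0, 0.9])
```

An observation is then assigned to the second class when its predicted probability exceeds 0.5, and the fitted components of θ indicate how strongly each predictor pushes the decision.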

Presentation of the Monitored Low-Voltage Distribution Network
The LV distribution network studied in this paper is presented in Figure 9. Having a radial topology, it consists of 18 nodes, each one (except node 1, i.e., the slack bus) connected to a customer (Ci) with photovoltaic panels (a so-called prosumer). The LV network is part of the distribution system of the town of Flobecq in Belgium [14,16], where each prosumer is equipped with a smart meter (SM). The SM simultaneously records, at each node and for each quarter of an hour, the PV generation, the injection and the consumption. Using those measured energy values, the system powers P (active power) and Q (reactive power) are calculated (Appendix B shows the associated lengths of the lines). For the sake of simplicity, the analysis of this paper is carried out on a portion of the network shown in Figure 9, located upstream of node 3. The input node (i.e., node 2) is connected to customer C1 while the output node (i.e., node 3) is connected to customer C2. The first node (i.e., node 1), connected to the secondary side of the transformer, is assumed to be at the 230 V reference value. In this study, a month of SM data is used to build the dataset. For each day, 96 measurements are made. The total number of observations is thus equal to 2880 measurements (i.e., 30 × 96). Those 2880 observations are created while ensuring uniformity of the two classes in the synthetic dataset. Table 3 shows how the cable states are distributed in the working database.
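The arithmetic behind the dataset size, and the conversion of a quarter-hourly energy reading into an average power, can be checked with a few lines. The unit (kWh) and the function name `energy_to_power_kw` are assumptions made for illustration; the paper does not detail how P and Q were derived from the readings.

```python
INTERVAL_H = 0.25                               # one SM reading per quarter of an hour
MEASUREMENTS_PER_DAY = int(24 / INTERVAL_H)     # readings per day
DAYS = 30                                       # one month of data

total_observations = DAYS * MEASUREMENTS_PER_DAY


def energy_to_power_kw(energy_kwh, interval_h=INTERVAL_H):
    """Average power over the interval, from the energy metered during it."""
    return energy_kwh / interval_h


# Hypothetical reading: 0.5 kWh consumed during one quarter of an hour.
avg_power_kw = energy_to_power_kw(0.5)
```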

Training and Validation Sets
Supervised machine learning algorithms consist of two phases: a training phase and a testing phase. During the training phase, the training samples and their class labels are stored in a subset. The algorithm uses this subset to learn and to create the right output from the data. While training, the algorithm modifies its parameters; in this phase, the algorithm is said to be learning. During the testing phase, the remaining observations from the original dataset are stored in a subset without the associated output. Then, a prediction is made on those samples to check how well the algorithm predicts the desired output.
To fit those two phases, the original dataset has been divided into two subsets: the training subset and the test subset. The training subset is used to train the algorithm and the test subset is used to make predictions for the validation of the resulting model. To select the observations in each data subset, a random logical selection was made. Tables 4 and 5 summarize the distribution of the data used in each classification algorithm.
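A random logical selection of this kind can be sketched as a shuffled boolean mask over the observations. The 70/30 proportion, the seed and the function name `random_split` are assumptions chosen for illustration; the proportions actually used are those reported in Tables 4 and 5.

```python
import random


def random_split(n_obs, train_fraction=0.7, seed=0):
    """Return a boolean mask: True marks a training observation, False a test one."""
    rng = random.Random(seed)              # fixed seed so the split is reproducible
    n_train = round(n_obs * train_fraction)
    mask = [True] * n_train + [False] * (n_obs - n_train)
    rng.shuffle(mask)
    return mask


mask = random_split(2880)
train_idx = [i for i, is_train in enumerate(mask) if is_train]
test_idx = [i for i, is_train in enumerate(mask) if not is_train]
```

Because every observation receives exactly one mask value, the two subsets are disjoint and together cover the whole dataset.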

As explained in Section 2, the main purpose of this work is to identify if the monitored cable section (i.e., the one between nodes 2 and 3) is either in the healthy working condition (class H) or has any insulation wear (class M). This classification will be made by various ML methods using an input dataset built from the provided smart meter data and computed nodal voltage variations. Figure 10 presents the flowchart of the implemented tool for solving that classification problem while Figures 11 and 12 show the specified classification process for each implemented algorithm.

Figure 11. Classification process specified to decision tree (DT) and k-nearest neighbor (kNN) algorithms.

Figure 12. Classification process specified to logistic regression (LR) algorithm.

Test Cases
In order to evaluate the performance of the proposed framework, two cases are considered as follows.

Case 1: Impact of the Net Demand and the Thickness Variation
The first application case will evaluate the impact of the net demand and the thickness variation on the model training and the prediction result. In this case, the net demand (ND) and the nodal voltage (V) of both the input node (named ND1 and V1) and output node (named ND2 and V2) are given to the classification input dataset. This helps the algorithm in its learning process. The algorithm will understand if any variation in the data is related to a cable degradation (based on the net demand/voltage level compromise) or to the client net demand.

Case 2: Impact of the Net Demand on the Prediction Result
The second application case will evaluate the impact of the net demand on the model training and the prediction result. In this scenario, only the nodal voltage (V) of both input node (V1) and output node (V2) are given to the classifier in the training subset. The idea is to evaluate if the algorithm can really distinguish between the effects of thickness variation independent of the net demand variation.
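The difference between the two cases comes down to which columns are handed to the classifier. The sketch below shows one hypothetical way to build the input matrix for each case; the dictionary layout, the helper name `build_inputs` and the numerical values are illustrative assumptions only.

```python
# Predictors supplied to the classifier in each test case.
CASE_FEATURES = {
    1: ("ND1", "V1", "ND2", "V2"),   # net demands and nodal voltages
    2: ("V1", "V2"),                 # nodal voltages only
}


def build_inputs(observations, case):
    """Keep only the predictors used in the given case, in a fixed column order."""
    columns = CASE_FEATURES[case]
    return [[obs[name] for name in columns] for obs in observations]


# Hypothetical observations (values chosen only to illustrate the layout).
observations = [
    {"ND1": 1.2, "V1": 231.0, "ND2": -0.4, "V2": 229.5},
    {"ND1": 0.3, "V1": 233.1, "ND2": 0.9, "V2": 230.2},
]

inputs_case1 = build_inputs(observations, 1)
inputs_case2 = build_inputs(observations, 2)
```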

Results and Discussion
A first investigation is carried out to find the nodal voltage variation range of the feeder in a healthy cable condition (knowing that the maximum ND is associated with the minimum voltage). The obtained values are limited to [210.19, 242.2734] V as shown in Figure 13a. In addition, Figure 13b presents the nodal voltages for a moderately degraded cable located in the line between nodes 2 and 3. It should be noted that the extreme degradation scenarios studied in [6] have not been considered in this work, since severe faults (extreme degradation scenarios) are easier to observe and detect. The interest of this study lies in detecting the cable at the beginning of the degradation process, which will be useful in managing cable maintenance and in anticipating the occurrence of severe faults or outages. Hence, the moderately degraded cable condition is linked to a soft fault degradation, which is not necessarily in breakage conditions but just introduces significant variations in the voltage profile.

In the boxplots of nodal voltage profiles shown in Figure 13, the red plus signs mark the outliers of the voltages in the created scenarios. The outliers in Figure 13a are related to the prosumers' ND variations while those in Figure 13b are due to the nonlinear equation of the insulation conductance (1/Riso) applied in the NRLF computation. An increase in the insulation conductance (1/Riso) can lead to the voltage drops shown by the outliers.
Tables 6 and 7 show the prediction results obtained by the studied classification techniques in case 1 and case 2. In Table 6, it can be observed that LR and DT demonstrate good accuracies in the prediction process in case 1 (model trained with ND variations and nodal voltage profiles) while kNN performs at a lower level. For case 2, where the net demand (ND) is missing, it can be seen that the implemented algorithms lose performance. Therefore, the ND is an important predictor (input variable) for the classifier, as well as the voltage profiles. This is due to its non-negligible impact on the nodal voltage variation range [6]. The constructed tree for the DT method is shown in Figure 14 (corresponding to case 1). It reveals that the normalized input ND profile (named x1) affects the prediction process, as does the normalized output voltage profile (named x4). As a result, LR and DT lead to predictions with high accuracies in case 1.
Table 8 gives the related training and prediction accuracy for each studied classification method. By comparing these results, it can be concluded that LR and DT are effective binary classification tools, while the kNN method leads to less accurate predictions.
Figure 15 represents the confusion matrix of the LR and DT methods for the first application case as a three-dimensional plot, in order to visualize the quality of the classifier outputs (i.e., to check whether the predictions match the real associated classes and so validate the prediction counts in Table 6). In Figure 15, the axes yPred and yvalid correspond, respectively, to the outputs of the classifier (the predictions) and to the known cable conditions (real classes from the original dataset). Only a few damaged cable conditions could not be predicted with either the LR or the DT algorithm (small blue block corresponding to 30 observations in Figure 15a and four observations in Figure 15b).
Figure 16 shows the ROC (receiver operating characteristic) diagram of the prediction, which plots the true positive rate (sensitivity) against the false positive rate (1 − specificity) of the classifier. It is the curve representation of the classifier's accuracy (in Table 8). Knowing that the closer the curve is to the 45-degree diagonal of the ROC space, the less accurate the prediction result, it can be concluded that kNN is clearly the least efficient algorithm in the studied application case.
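The quantities read from a confusion matrix and from one point of a ROC curve can be computed as in the sketch below. The labels, the choice of class M as the positive class and the toy predictions are assumptions made for illustration; they are not the results of Tables 6 to 8.

```python
def confusion_counts(y_valid, y_pred, positive="M"):
    """Count true/false positives and negatives, taking `positive` as the target class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_valid, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def rates(tp, fp, tn, fn):
    """Accuracy, true positive rate (sensitivity) and false positive rate.

    Assumes both classes are present, so that neither denominator is zero.
    """
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    tpr = tp / (tp + fn)
    fpr = fp / (fp + tn)          # equals 1 - specificity
    return accuracy, tpr, fpr


# Toy example: known cable conditions versus classifier outputs.
y_valid = ["M", "M", "M", "M", "H", "H", "H", "H"]
y_pred = ["M", "M", "M", "H", "H", "H", "H", "M"]

tp, fp, tn, fn = confusion_counts(y_valid, y_pred)
accuracy, tpr, fpr = rates(tp, fp, tn, fn)
```

Each (fpr, tpr) pair is one point of the ROC space; a classifier whose points lie on the diagonal (tpr equal to fpr) performs no better than random guessing.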
The conducted simulations on various degrees of insulation wear reveal interesting information about the added value of data-driven approaches for the cable condition assessment. In particular, this work demonstrates the ability of different classification algorithms to identify the LV network cable condition on the basis of only ND and voltage variations.
However, the presented work should not be directly extended to other practical applications or generalized, for two reasons. Firstly, the resistance of the insulation material is calculated (in Section 3) by considering some LV cable electrical properties that are specific to each manufacturer. Secondly, machine-learning techniques have been developed here for degradation detection in operating domains where the causes of the observed variations are difficult to interpret. Hence, to avoid a direct median separation in the observations, the input database has been built (in Section 5.1) by excluding the cases of extreme degradation scenarios (severe faults) because they are easily detected without any advanced techniques.

Conclusions
In this study, a machine learning-based framework is proposed for the identification of low voltage cable degradation due to insulation material wear. To this end, a probabilistic tool was first developed to generate scenarios for the uncertain nature and degree of the cable insulation degradation. Those scenarios were then associated with the load demand and PV generation variations and used to build the nodal voltage database by performing probabilistic load flow calculations. Different supervised learning methods were finally applied to the generated database. In the first (training) stage, the studied classification methods learned, from the given inputs, the associated cable condition status in order to be able to predict, in the second (test) stage, the cable condition corresponding to each given network operating point. The comparisons between the implemented classifiers show that logistic regression and decision tree approaches are powerful binary classification tools with 97.917% and 99.884% accuracy, respectively, while the k-nearest neighbors method could not provide accurate predictions. The conducted study reveals the added value of such a data-driven approach for the cable condition assessment.
The interest of this work is to set up a tool that can assist the distribution system operators (DSOs) in an effective and timely predictive maintenance of the LV distribution network, avoiding costly solutions. Indeed, the obtained result offers promising perspectives for the early detection of cable degradation by combining ML approaches, load demand profiles and smart meter (SM) measurements.
For future work, this research will extend the current model to a complete network, on the basis of cross nodal learning (learning between the models of each line section or cables in the network). The current study is the first step towards a global and generalized (e.g., by considering the type of cable as one of the classifier parameters) data-based early identification of electrical low voltage cable degradation due to insulation wear, using machine learning tools.
Author Contributions: All the authors have contributed equally for this research article, from the conceptualization, methodology, implementation, analysis, discussion, validation, writing to review and editing. All authors have read and agreed to the published version of the manuscript.
Funding: This research received no external funding.

Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.

Data Availability Statement:
We applied our proposed approach to the Smart Meter measurements database (and LV network technical information) which are part of the local DSO private property.

Acknowledgments:
The authors would like to thank ORES, Belgian Distribution System Operator, for providing them with the required data.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A
Figure A1. Global flowchart of the implemented process.