Recognizing VSC DC Cable Fault Types Using Bayesian Functional Data Depth

Abstract: Diagnostics of power and energy systems is an important matter. In this paper we present a new methodology for signal type recognition (for example, faulty/healthy or different types of faults). Our approach uses Bayesian functional data analysis with data depth distributions to detect differing signals. We present our approach for discrimination of pole-to-pole and pole-to-ground short circuits in VSC DC cables. We provide a detailed case study with Monte Carlo analysis. Our results show potential for applications in diagnostics under uncertainty.


Introduction
Effective and reliable monitoring and diagnostics of energy installations is of utmost importance, as they are an important part of the world's economy. Algorithms for fault detection and isolation allow extension of system lifetime, reduction in operation interruption and can lead to significant savings. The main difficulty in their development is that power installations have a high level of complexity, are usually nonlinear and are influenced by stochastic disturbances and parameter variations. Therefore, approaches based on first principles models are difficult or even impossible to use on a wider scale. That is why methods based on statistical models or machine learning are those most researched.
Typically, data-driven machine learning approaches provide complicated 'black-box' models, which are not transparent and are hard to interpret. These flaws are less prevalent in statistical approaches, which is one reason for their dominance in the field. Both groups, however, suffer from the typical situation that real data for system faults is extremely rare, and even if present it is often incomplete. That is why it is crucial to develop methods that can handle issues of non-representative or missing data. Bayesian methods are an emerging set of tools for solving many kinds of diagnostic problems [1][2][3].
Time series diagnostics are usually based on feature extraction, which reduces dimensionality by representing signals as vectors of numbers. Unfortunately, the usual approach to feature extraction is to compute typical statistical measures (such as mean, standard deviation, kurtosis, median, peak-to-peak) in both the time and frequency domains and hope that they contain enough information about the signal [4,5]. This negatively influences the reliability and efficiency of diagnostic models, as the adequacy of such features is very hard to verify. In transient diagnostics, authors (e.g., [6]) mostly use autoregressive moving average models (or their variants), which is a significant limitation both because of linearity and non-locality (it is difficult to capture localized behavior). Many cable fault location studies are based on simulation models and methods and use simulation tools.
Functional data analysis (FDA) is a group of methods for analysis of data in the form of functions, with a special focus on time series data. The main idea is to model the signal using a certain function basis; the coefficients of that basis representation can then be treated as a reduced-dimensionality representation of the signal. FDA is a mature field in the area of statistics, with focus on bases in functional spaces such as polynomials, wavelets and others. The maturity of the field can be observed in recent review papers [7] or special issues [8] in prestigious journals covering the field of statistics. We join FDA with the Bayesian approach in order to obtain probability distributions of basis coefficients, get generative models and model uncertainty.
In the case of cable fault monitoring, popular approaches are based on first principle model fitting. In the next section we provide a comprehensive overview of such methods. An important aspect shared by all those methods is that they are not focused on uncertainty. They are often considering point estimates or least squares fits. In this paper we want to provide a certain proof of concept for using statistical methods for fault detection and distinguishing types of faults. For this purpose we focus on a real problem of distinguishing between pole-to-ground and pole-to-pole short circuits. Pole-to-pole and pole-to-ground short circuits are the typical DC cable faults. These generally result in fast discharge of the DC-link capacitor through the DC circuit, leading to transient overcurrent, which can damage system components. In addition to fault type and DC system parameters (namely, capacitance C of the DC-link capacitor and cable distributed parameters Rx, Lx), transient response also depends on fault distance x and fault resistance Rf. VSC DC systems are helpless against these DC faults because IGBTs are blocked for self-protection during the fault, leaving freewheel diodes subject to overcurrent [9].
We propose a general algorithm for comparing signals with a reference that takes uncertainty into account and uses functional data analysis to provide dimensionality reduction. Uncertainty modeling is important to avoid ill-informed decisions.
Our main contributions are:
• Construction of a Bayesian spline model capturing measurement and parameter uncertainty,
• An algorithm for using Bayesian models to obtain data depth distributions allowing analysis of signal similarity,
• An extended case study using simulated voltage source converter (VSC) direct current (DC) cable fault data, focusing on pole-to-pole and pole-to-ground short circuits.
The rest of the paper is organized as follows. First we present a review of cable fault modeling techniques. Then we present methodologies behind data depth and spline modeling using Bayesian hierarchical linear model. Then we describe our computational system and thoroughly analyze current and voltage signals as an indicator, allowing distinguishing between fault types. We finish the paper with the discussion and conclusions section.

Review of Cable Fault Modeling Research
Pollution, physical damage, aging and environmental impact may cause cable faults with a variety of serious consequences. It is crucial to detect and locate a cable fault quickly and accurately, in particular in the case of aircraft cables. Because of this, cable system behavior under fault conditions needs to be studied to enable rapid interruption and isolation of damage. Therefore, many research papers deal with this problem.
In literature, papers present methods and algorithms detecting and locating faults based on simulations and also propose a theoretical analysis of different cable faults.
Many papers investigate VSC DC system by performing various types of computer simulations. Paper [9] deals with two-level VSC DC system response to DC cable faults and analytical expressions for characterizing DC fault overcurrent and voltages and identifying main DC fault characteristics. Yang et al. [10] investigate DC cable transient modeling issues for VSC based high voltage direct current (HVDC) transmission systems. Loume [11] focuses on the influence of cable modeling and grounding on DC fault current behavior in a HVDC point-to-point cable system. In [12] authors propose a transient simplified model with high-frequency components of the fault DC network reserved.
A different important problem is designing extensive networks so that cable failures do not cause further undesirable consequences.
Zhang et al. [13] present a sub-sea DC collection grid with robust control and protection scheme with the DC/DC converter. Network section interconnections are decoupled in the event of DC faults. Jovic et al. [14] consider building large DC grids as an interconnection of regional radial DC systems, which enables very simple and robust DC system protection. An interesting approach is also the one of Bapijaru et al. [15], who present cable models. Their purpose is the analysis of the faults' nature using signatures of measurements in offshore Multi-Modular Converter High Voltage DC (MMC-HVDC) systems.
Cable location studies rely mostly on simulation models, methods and tools. In particular, this concerns the underground power cables [16]. PSCAD™EMTP [17,18] and PROTEUS version 8.1 [19] are examples of the simulation tools used in cable diagnosis. Gjabhiye et al. [20] review various fault locating methods and highly computational methods for underground cables and provide design of fault location and remote indicators. In particular, the review covers various types of methods (A-Frame Method, Time Domain Reflectometry (TDR) and Bridge Method) and highlights the adverse effects of some of them and the matching of methods to the type of fault. The authors in [21] propose a new method for online monitoring of underground cables. The simulation tool PSCAD/EMTDC is popular for fault model verification. In paper [22] the authors present a model for detection of both pole-to-ground and pole-to-pole faults in cables used in photovoltaics (PV). In paper [23] the authors present a simulation model of sheath earth current generation and the relationship between sheath earth current and load current. This model simulates various faults of the sheath grounding system.
Diagnostic systems for high voltage cables based on sheath current are also an important issue. The sheath current is measured to locate a segment of transversely connected fault, which leads to reduced maintenance time. In paper [24] the authors present an equivalent circuit model of sheath current in a cross-connected cable system according to the single-wire laying type. The model considers the operating mode, cable parameters and line length. Paper [25] presents a model for cable over-sheath damage at alternating current (AC) voltage. The authors propose representing the characteristics as a combination of linear resistance and capacitance. The model is implemented in the Alternative Transient Program (ATP) by using the true nonlinear resistance model and a transient analysis of control systems (TACS) controlled switch.
The monitoring and diagnostic systems are designed also for Shielded Twisted Pair cables so that there is no interference between the twisted pair cables. The paper [26] discusses computer models of electrical wires in Shielded Twisted Pair cables and test procedures based on an Enhanced Time Domain Reflectometry technique. The diagnostic models for the coaxial cable are under consideration in papers [27,28]. In [27] faults of the cables are modeled as radiating apertures using the Bethe theory. In the paper [28], Shi et al. present the lossy transmission line model using time domain reflectometry and impedance spectroscopy for the extraction of parameters.
Few papers use predictive methods and data analysis to predict the occurrence of cable fault and the location of it. Such methods include Backpropagation (BP) neural network and Levenberg-Marquardt data-optimized method [29], recursive regression analysis and the amnesic factor regression analysis [30] and the method of data fusion based on Bayes estimate [31]. Successful applications of these methods include locating of cable faults and parameter estimation (e.g., resistance, inductance, conductance and capacitance).
In conclusion, our approach is a good supplement to existing methodology, as it is at least partially data driven, uncertainty focused and relatively simple. It fills certain gaps and has a potential for greater development.

Materials and Methods
In this section we introduce the main concepts on which our work is built. Firstly, we introduce the data depth function, which is a measure of similarity of a multivariate datapoint to a given probability distribution. Then we present the idea for construction of Bayesian functional data models with splines. Finally, we give the algorithm for comparing signals to a reference.

Data Depth
Data depth, as proposed by Tukey in 1975 [32], was considered a multivariate generalization of median, so a function that allows certain ordering of datapoints with respect to distribution. Data depth has multiple uses. Idris considered it for multivariate control charts for easier control of multidimensional processes [33]. Chenouri used it for improving the quality of nonparametric tests [34]. Nagy and Ferraty [35] use functional data analysis to represent discontinuous data. It was also used to analyze functional data [36,37]. By analyzing function curves, it is much easier to detect distant data. In addition, finding anomalies is easier by using the data depth derived from the functional mean.
We define the data depth as a measure of how central a point $x \in \mathbb{R}^d$ is with respect to a distribution function $F$. The outermost observations have lower depth values than the near-center data. These values are determined by the depth function. In this way, one can identify which data are anomalous throughout the process. First, let us start with the definition of the depth function.
Let the mapping $D(\cdot\,;\cdot) : \mathbb{R}^d \times \mathcal{F} \to \mathbb{R}$ be bounded, non-negative and meet the following assumptions, given here in their standard form [38]: (i) affine invariance, $D(Ax + b; F_{AX+b}) = D(x; F_X)$ for any nonsingular matrix $A$ and vector $b$; (ii) maximality at center, $D(c; F) = \sup_{x} D(x; F)$ for any $F$ having a center $c$; (iii) monotonicity relative to the deepest point, $D(x; F) \le D(c + \alpha (x - c); F)$ for $\alpha \in [0, 1]$; (iv) vanishing at infinity, $D(x; F) \to 0$ as $\|x\| \to \infty$. Then $D(\cdot\,; F)$ is called a statistical depth function [38]. There are multiple definitions fulfilling the depth function conditions. Tukey proposed the half space depth, known also as Tukey depth or location depth. In the following years, researchers proposed new types like Euclidean depth, $L_p$ depth, Mahalanobis depth, projection depth, Oja depth and many others [39,40]. Those definitions have various properties, but for practical use, the most important are computational complexity and scaling properties. Tukey's depth has complexity of $O(2^n)$, with $n$ being the number of dimensions, which makes it practically useless for high dimensional problems. Oja's depth uses n-dimensional volumes of simplexes and has a complexity of $O(n^4)$. Unfortunately, because of the curse of dimensionality, simplex volume in high-dimensional space can be infinitesimally small and get lost in rounding errors. Projection depth could be attractive but requires the solution of an n-dimensional convex programming problem, whose cost is hard to bound in the number of multiplications. The other mentioned depths have fewer problems; we obtained the best and most efficient results for Mahalanobis depth, which we define below.

Mahalanobis Depth
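Mahalanobis proposed a method of determining the distance between two points $x$ and $y$ in $\mathbb{R}^d$ in the following form
$$d_{\Sigma}(x, y) = \sqrt{(x - y)^{T} \Sigma^{-1} (x - y)}, \qquad (1)$$
where $\Sigma$ is a positive definite (covariance) matrix. Based on the Mahalanobis distance, the depth function can be defined as
$$MD(x; F) = \left[ 1 + (x - \mu(F))^{T} \Sigma(F)^{-1} (x - \mu(F)) \right]^{-1}, \qquad (2)$$
where $\Sigma(F)$ is the covariance matrix of $F$ and $\mu(F)$ is the mean.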
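As an illustration of the Mahalanobis depth, the following minimal sketch estimates the mean and covariance from a set of points and evaluates the depth of a new point as one over one plus the squared Mahalanobis distance. It uses only the Python standard library; the function names (`mean_and_cov`, `solve`, `mahalanobis_depth`) are hypothetical helpers introduced here and are not taken from the paper's repository.

```python
def mean_and_cov(samples):
    """Sample mean vector and unbiased covariance matrix of d-dimensional points."""
    n, d = len(samples), len(samples[0])
    mu = [sum(s[j] for s in samples) / n for j in range(d)]
    cov = [[sum((s[i] - mu[i]) * (s[j] - mu[j]) for s in samples) / (n - 1)
            for j in range(d)] for i in range(d)]
    return mu, cov


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def mahalanobis_depth(x, mu, cov):
    """Depth = 1 / (1 + squared Mahalanobis distance of x from mu)."""
    diff = [xi - mi for xi, mi in zip(x, mu)]
    z = solve(cov, diff)  # z = cov^{-1} (x - mu), without forming the inverse
    d2 = sum(di * zi for di, zi in zip(diff, z))
    return 1.0 / (1.0 + d2)


# Usage: points near the centre are deep (close to 1), distant points are shallow.
points = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
mu, cov = mean_and_cov(points)
print(mahalanobis_depth([1.0, 1.0], mu, cov), mahalanobis_depth([5.0, 5.0], mu, cov))
```

Solving the linear system instead of inverting the covariance matrix is a design choice of this sketch; for ill-conditioned covariance estimates a regularized or pseudo-inverse variant may be preferable.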

Bayesian Functional Spline Models
In this section we do not cover the main principles of Bayesian statistics; for more details we refer the reader to Gelman's book [41]. In essence, we create a joint probability distribution model for data and parameters.
We consider that our data generating process is given by
$$y_n \sim \mathrm{Normal}(\mu_n, \sigma), \qquad \mu_n = \sum_{m=1}^{M} \theta_m \varphi_m(t_n), \qquad (3)$$
where $y_n$ are our sampled measurements taken at times $t_n$, $n = 1, \ldots, N$, with normally distributed uncertainty given by $\sigma$. Functions $\varphi_m(t)$ are B-splines on the assigned $M$ knot grid. $\mu_n$ are transformed parameters of the model, as they correspond to the means of the fitted distribution for individual measurements. They could be avoided, as the actual model parameters are the coefficients $\theta_m$ of the B-spline combination, but they improve formula clarity. All $\theta_m$ are normally distributed,
$$\theta_m \sim \mathrm{Normal}(\mu_0, \sigma_0), \qquad m = 1, \ldots, M. \qquad (4)$$
Parameters $\mu_0$, $\sigma_0$ are not known and are inferred from data, making the proposed model a hierarchical one. Relations of the entire model are presented using Bayesian network plate notation in Figure 1. We can join the likelihood (3) and the hyper-prior (4) to construct a full model, assigning priors for all the parameters:
$$\begin{aligned} y_n &\sim \mathrm{Normal}(\mu_n, \sigma), \\ \mu_n &= \sum_{m=1}^{M} \theta_m \varphi_m(t_n), \\ \theta_m &\sim \mathrm{Normal}(\mu_0, \sigma_0), \\ \mu_0 &\sim \mathrm{Uniform}(a_{\mu}, b_{\mu}), \\ \sigma_0 &\sim \mathrm{Uniform}(a_{\sigma}, b_{\sigma}), \\ \sigma &\sim \mathrm{Exponential}(\lambda), \end{aligned} \qquad (5)$$
with fixed bounds $a_{\mu}, b_{\mu}, a_{\sigma}, b_{\sigma}$ and rate $\lambda$. Hyper-parameters were given uniform priors, as we have no justification for others, but they could just as well be dispersed normals. $\sigma$ was given an exponential prior, as it has a heavier tail than the half-normal, allowing extra flexibility. The proposed model allows sampling of parameters from the posterior and sampling from the posterior predictive distribution. The posterior predictive distribution is a useful tool for generating predicted data from inferred parameters. For the rest of the paper we will refer to (5) as 'model'.
Parameters θ m of the model are dimensionality reduction representation of the time series. Their number M is naturally smaller than number of samples N. However, those coefficients with basis functions allow for the representation of the signal up to the confidence interval.
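To make the generative reading of the spline model concrete, the sketch below evaluates a B-spline basis with the Cox-de Boor recursion and draws one synthetic signal by sampling coefficients from a normal distribution and adding normal measurement noise. It is only a forward simulation under assumed values, not the inference performed in the paper: the clamped knot layout, the unit time interval and the numbers passed to `simulate_signal` are illustrative assumptions, and all function names are hypothetical.

```python
import random


def clamped_knots(n_basis, degree, t0=0.0, t1=1.0):
    """Knot vector with repeated end knots, so the basis sums to 1 on [t0, t1]."""
    n_spans = n_basis - degree
    inner = [t0 + (t1 - t0) * i / n_spans for i in range(n_spans + 1)]
    return [t0] * degree + inner + [t1] * degree


def bspline_basis(t, knots, degree):
    """Values of all B-spline basis functions at t (Cox-de Boor recursion)."""
    n_int = len(knots) - 1
    # Degree 0: indicator functions of the half-open knot spans.
    b = [1.0 if knots[i] <= t < knots[i + 1] else 0.0 for i in range(n_int)]
    if t == knots[-1]:  # include the right end point in the last non-empty span
        last = max(i for i in range(n_int) if knots[i] < knots[i + 1])
        b[last] = 1.0
    for k in range(1, degree + 1):
        nb = []
        for i in range(len(knots) - k - 1):
            left = right = 0.0
            if knots[i + k] > knots[i]:
                left = (t - knots[i]) / (knots[i + k] - knots[i]) * b[i]
            if knots[i + k + 1] > knots[i + 1]:
                right = ((knots[i + k + 1] - t)
                         / (knots[i + k + 1] - knots[i + 1]) * b[i + 1])
            nb.append(left + right)
        b = nb
    return b  # length: len(knots) - degree - 1


def simulate_signal(times, knots, degree, mu0, sigma0, sigma, rng):
    """Draw theta_m ~ Normal(mu0, sigma0), then y_n ~ Normal(mu_n, sigma)."""
    n_basis = len(knots) - degree - 1
    theta = [rng.gauss(mu0, sigma0) for _ in range(n_basis)]
    mu = [sum(th * phi for th, phi in zip(theta, bspline_basis(t, knots, degree)))
          for t in times]
    y = [rng.gauss(m, sigma) for m in mu]
    return theta, mu, y


# Illustrative values only: 15 cubic B-splines, 200 samples on [0, 1].
knots = clamped_knots(n_basis=15, degree=3)
times = [i / 199 for i in range(200)]
theta, mu, y = simulate_signal(times, knots, 3, mu0=0.0, sigma0=1.0, sigma=0.05,
                               rng=random.Random(1))
```

Here 200 samples are represented by 15 coefficients, which is the dimensionality reduction discussed above.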

Algorithm for Fault Determination and Detection
Here we will propose a data-driven general Bayesian algorithm using data depths and Bayesian functional models to capture uncertainty and compare signals to reference.
The main use of data depth is as a statistic computed from data, similar to the median. Its usefulness diminishes if data is not available in large quantities. However, given a model of the data generating process, one can use the posterior predictive distribution to generate as many datapoints as needed and thus estimate data depth efficiently. This is the basis of our algorithm.
The algorithm consists of the following steps:
1. Create the reference. Given a set of reference signals (for example, healthy behavior), fit the model to them and obtain a set of samples of the coefficient sequences $(\theta_m)_{m=1}^{M}$.
2. Compute the Mahalanobis depth distribution. Using the obtained samples, compute the mean and covariance matrix and determine the depth of each of the samples. This will be our reference depth distribution. It can be summarized by a histogram if needed.
3. Create a model of the candidate. Fit the same model to the candidate signal and obtain a set of samples of the coefficient sequences $(\theta_m)_{m=1}^{M}$.
4. Compute the depth distribution of the candidate with respect to the reference. Using the mean and covariance of the reference set, compute the Mahalanobis depth of all the samples of the candidate. This gives a marginal probability distribution of the depth of the candidate with respect to the reference.
5. Analyze the overlap/distance. The marginal distribution of the depth allows us to verify whether the signal is 'shallow' with respect to the reference set or close to it. If there is an overlap, we can state that the similarity is strong. If there is a large gap, we can say that the signal is an 'outlier' with respect to the reference.
The main strength of the proposed approach is that it compensates for a small amount of data and for its limited quality. Because a model of the candidate is available, we can capture uncertainty coming from the disturbances and protect ourselves from accidentally stating that the signal is similar.
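The five steps above can be sketched in a few lines once posterior samples of the coefficients are available. In the sketch below the posterior draws are replaced by synthetic Gaussian draws (an assumption made purely so that the example runs without a sampler), and the overlap rule in step 5, which compares the largest candidate depth with the smallest reference depth, is one simple hypothetical criterion rather than the decision rule of the paper.

```python
import random


def mean_cov(draws):
    """Mean vector and unbiased covariance matrix of a list of draws."""
    n, d = len(draws), len(draws[0])
    mu = [sum(x[j] for x in draws) / n for j in range(d)]
    cov = [[sum((x[i] - mu[i]) * (x[j] - mu[j]) for x in draws) / (n - 1)
            for j in range(d)] for i in range(d)]
    return mu, cov


def solve(a, b):
    """Gaussian elimination with partial pivoting for a x = b."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def depth(x, mu, cov):
    """Mahalanobis depth of x with respect to (mu, cov)."""
    diff = [xi - mi for xi, mi in zip(x, mu)]
    return 1.0 / (1.0 + sum(d * z for d, z in zip(diff, solve(cov, diff))))


def depth_summary(draws, mu, cov):
    ds = [depth(x, mu, cov) for x in draws]
    return {"min": min(ds), "mean": sum(ds) / len(ds), "max": max(ds)}


def compare_to_reference(reference_draws, candidate_draws):
    mu, cov = mean_cov(reference_draws)               # steps 1-2: reference moments
    ref = depth_summary(reference_draws, mu, cov)     # step 2: reference depths
    cand = depth_summary(candidate_draws, mu, cov)    # steps 3-4: candidate depths
    overlaps = cand["max"] >= ref["min"]              # step 5: crude overlap check
    return ref, cand, overlaps


def fake_draws(center, rng, n=500, d=4, scale=0.1):
    """Stand-in for posterior samples of the spline coefficients (assumption)."""
    return [[rng.gauss(center, scale) for _ in range(d)] for _ in range(n)]


rng = random.Random(0)
reference = fake_draws(0.0, rng)
similar = fake_draws(0.0, rng)   # candidate resembling the reference
distant = fake_draws(2.0, rng)   # candidate far from the reference
for name, cand in (("similar", similar), ("distant", distant)):
    ref_s, cand_s, ok = compare_to_reference(reference, cand)
    print(name, "overlap" if ok else "gap", cand_s)
```

In practice the draws would come from the fitted model, and the comparison could use full histograms of the depths rather than their extremes.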

Results
In this section we present the results of our case study, in which we apply our algorithm to determine whether a fault in a VSC DC cable is a pole-to-pole or a pole-to-ground short circuit. First we provide a brief description of our data. Then we describe the computational system used. Finally, we go in depth into the analysis of current and voltage signals for their diagnostic use.

Considered Data
For our data we used results from the simulation models given by Mesas et al. [9] for the configurations presented in Figure 2. The models represent the initial 5 ms period after the fault occurrence. Using randomized parameters, we simulated 100 voltage-current pairs for both pole-to-pole and pole-to-ground faults. The parameters were normally distributed, with means equal to the values provided by Mesas et al. and standard deviations corresponding to 10% of those values. Simulated signals are presented in Figure 3.
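The parameter randomization described above can be sketched as follows. The nominal values in the dictionary are placeholders chosen only for illustration (they are not the values of Mesas et al.), the parameter names are hypothetical, and the circuit simulation that would turn each parameter set into a voltage-current pair is not shown.

```python
import random


def sample_parameters(nominal, rel_std=0.10, n_draws=100, seed=0):
    """Draw each parameter from Normal(nominal, rel_std * |nominal|)."""
    rng = random.Random(seed)
    return [{name: rng.gauss(value, rel_std * abs(value))
             for name, value in nominal.items()}
            for _ in range(n_draws)]


# Placeholder nominal values (assumptions, not taken from the cited study).
nominal = {"C": 1.0e-3, "R": 0.1, "L": 1.0e-3, "Rf": 0.5}
draws = sample_parameters(nominal)
print(len(draws), draws[0])
```

With a 10% relative standard deviation, negative draws are extremely unlikely, but a truncated distribution could be used if strict positivity of the physical parameters must be guaranteed.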

Computational Setup
For Bayesian computation, we have used Hamiltonian Monte Carlo (HMC, also known as hybrid Monte Carlo) algorithm. Currently, most advanced HMC software is Stan [42]. HMC algorithm is a type of Markov Chain Monte Carlo (MCMC) method. MCMC generates a Markov chain of samples. It generates them in a way that makes their limiting distribution converge to the desired probability distribution. MCMC methods are especially useful for Bayesian computation, as sampling from the posterior distribution is difficult. Using generated samples, we can estimate expected values of desired functions of random variables. Because of that, we can answer practically all relevant statistical questions.
HMC is a variant of the Metropolis-Hastings algorithm. Traditionally, the Metropolis-Hastings algorithm uses a Gaussian random walk proposal distribution. The algorithm accepts or rejects samples from the random walk depending on a computed acceptance probability. In HMC, we generate proposals of random variables (system states) through a Hamiltonian dynamics evolution. This evolution is simulated using a time-reversible and volume-preserving numerical integrator (a symplectic integrator). The HMC algorithm reduces the correlation between successive sampled states by proposing moves to distant states. Those states maintain a high probability of acceptance. This happens because symplectic integrators approximately conserve the energy of the simulated Hamiltonian dynamics. The reduced correlation in HMC means we need fewer Markov chain samples to get a desired level of Monte Carlo error when computing expectations.
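The mechanics described above can be illustrated with a minimal one-dimensional HMC sampler for a standard normal target. This is a textbook sketch with a fixed step size and trajectory length chosen by hand; it is not the adaptive sampler implemented in Stan, and the function names are hypothetical.

```python
import math
import random


def leapfrog(q, p, grad_u, step, n_steps):
    """Leapfrog (kick-drift-kick) integration of Hamiltonian dynamics."""
    p = p - 0.5 * step * grad_u(q)          # initial half step for momentum
    for i in range(n_steps):
        q = q + step * p                    # full step for position
        if i < n_steps - 1:
            p = p - step * grad_u(q)        # full step for momentum
    p = p - 0.5 * step * grad_u(q)          # final half step for momentum
    return q, p


def hmc(u, grad_u, q0, n_samples, step=0.2, n_steps=10, seed=0):
    """Sample from the density proportional to exp(-u(q)) for scalar q."""
    rng = random.Random(seed)
    q, samples = q0, []
    for _ in range(n_samples):
        p = rng.gauss(0.0, 1.0)                         # resample momentum
        q_new, p_new = leapfrog(q, p, grad_u, step, n_steps)
        h_old = u(q) + 0.5 * p * p                      # total energy before
        h_new = u(q_new) + 0.5 * p_new * p_new          # total energy after
        # Metropolis correction: accept with probability min(1, exp(h_old - h_new)).
        if math.log(1.0 - rng.random()) < h_old - h_new:
            q = q_new
        samples.append(q)
    return samples


# Standard normal target: u(q) = q^2 / 2, so grad u(q) = q.
draws = hmc(lambda q: 0.5 * q * q, lambda q: q, q0=3.0, n_samples=2000)
mean = sum(draws) / len(draws)
var = sum((d - mean) ** 2 for d in draws) / (len(draws) - 1)
print(mean, var)
```

Because the integrator nearly conserves the total energy, the difference `h_old - h_new` stays close to zero and most proposals are accepted even though they land far from the current state.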
Simulation of Hamiltonian dynamics might numerically destabilize for probability distributions with complicated geometry. This is an advantage of the method, because such complications usually mean problems in identifiability of parameters. Destabilization (known in the statistical field as "divergence") is a useful diagnostic that can suggest re-parametrization or other numerical adaptation of algorithm.
All the codes used for computation in this paper are available in the repository listed at the "data availability" section.
For model computation we decided to use a limited sample of 10 signals in order to increase the uncertainty and reduce averaging. We assigned a spline basis of 15 third-order polynomial splines, which are presented in Figure 4. As a reference we selected pole-to-pole fault data and compared them with pole-to-ground data. This choice had no influence, as the results with the roles reversed are very similar.

Analysis of Current Measurements
We started by analyzing the current signals. The reference set was created from 10 selected signals. Results of the fit are presented in Figure 5: the uncertainty of the combination coefficients is shown as error bars (Figure 5a), and the posterior predictive distribution is compared to quantiles of the data (Figure 5b). As we can see, uncertainty is underestimated in the middle of the signal. Some inference statistics are given in Table 1. Hyper-prior parameters of the current spline model are efficiently estimated. Their distributions are very narrow and centered in the neighbourhood of zero. The standard error of the current measurement is estimated at 0.05 kA, which is consistent with simulations. Probabilistic computation is efficient, as the Monte Carlo standard error (MCSE) is vanishingly small. The effective sample size of both the bulk of the distribution and its tails is at the level of 42% (the total number of samples is 4000), which is a reasonable result. Figure 6 shows the distributions of the remaining parameters.
In Figure 7 we present the application of the algorithm to two samples, one of a pole-to-pole and one of a pole-to-ground fault. The Bayesian model has efficiently captured the shape of the measurements (Figure 7a). The distribution of data depth (Figure 7b) shows good discrimination of the pole-to-ground fault from the reference depth; however, the isolated pole-to-pole fault is also much shallower than the reference. This suggests that current is a poor diagnostic indicator in this case, which is further highlighted in Table 2.
When analyzing the current signal in the context of the reference depth of the pole-to-pole fault (with a mean of $6.23 \times 10^{-2}$ and spreading from $1.993 \times 10^{-2}$ to $2.14 \times 10^{-1}$), we can see that it is not the best indicator. While pole-to-ground scenarios are clearly discriminated as 'shallow', with depths at the level of $10^{-5}$, the pole-to-pole current signals are also separated from the reference, not reaching depths greater than $10^{-3}$.
Table 2 legend: Exp. no.: experiment number; mean, min, max: mean, minimal and maximal value of the data depth of a candidate signal consisting of one type of fault (no unit).

Analysis of Voltage Measurements
We then continued our analysis with the voltage signal. As previously, we used 10 signals for the reference, taking the voltages corresponding to the previously included currents. We fitted the model in the same way, as illustrated in Figure 8. The fit is slightly better at the end of the signal, without overestimation. Inference statistics are given in Table 3. Hyper-prior parameters of the voltage spline model are also efficiently estimated, but with more difficulty. Their distributions are very narrow and centered in the neighbourhood of zero; however, all of them are positive. The standard error of the voltage measurement is estimated at 0.02 kV, which is consistent with simulations. Probabilistic computation is efficient, as the Monte Carlo standard error (MCSE) is vanishingly small. The effective sample size of both the bulk of the distribution and its tails is between 15% and 25% (the total number of samples is 4000), which is a reasonable result. The potential scale reduction factor $\hat{R}$ is reasonably close to 1, indicating good mixing of the Markov chains. Kernel density estimators (Figure 9) of (a) the hyper-parameters $\mu_0$, $\sigma_0$ and (b) the standard error of the voltage measurement also show reasonable sampling behavior. All the distributions are reasonably normal-like.
Figure 10 presents the application of the algorithm to two voltage samples, one of a pole-to-pole and one of a pole-to-ground fault. The Bayesian model has efficiently captured the shape of the measurements (Figure 10a). The distribution of data depth (Figure 10b) shows good discrimination of the pole-to-ground fault from the reference depth. The isolated pole-to-pole fault is also shallower than the reference but reasonably close to it. This suggests that voltage is a much better diagnostic indicator in this case, which is also supported by the analysis in Table 4.
When analyzing the voltage signal in the context of the reference depth of the pole-to-pole fault (with a mean of $6.223 \times 10^{-2}$ and spreading from $2.108 \times 10^{-2}$ to $2.135 \times 10^{-1}$), we can see that it is a much better indicator than current. Pole-to-ground scenarios are clearly discriminated as 'shallow', with depths at the level of $10^{-5}$, as in the previous case. The pole-to-pole voltage signals are also separated from the reference, but their depths sometimes even overlap the reference or are relatively close to it.
Table 4 legend: Exp. no.: experiment number; mean, min, max: mean, minimal and maximal value of the data depth of a candidate signal consisting of one type of fault (no unit).

Discussion and Conclusions
The results presented in this paper are at an early stage. There is a potential for diagnostic use of functional Bayesian models with data depth as a statistic. This application can find a place in energy systems, for example, in cable diagnostics. Certain aspects still require consideration. Perhaps a more varied reference set would provide better performance; this will be investigated. We have also chosen the spline basis arbitrarily. Proper selection of its order could be made, for example, using leave-one-out cross-validation or the Watanabe-Akaike Information Criterion. Those are aspects that we will continue to investigate. Finally, we want to investigate a method for joining both signals for diagnostics (a functional data fusion); this is also an open challenge.

Funding: This research was funded by AGH's Research University Excellence Initiative under project "Interpretable methods of process diagnosis using statistics and machine learning" and by AGH's subvention for scientific activity.

Data Availability Statement:
The data presented in this study and code for the analysis are openly available in the GitHub repository https://github.com/KAIR-ISZ/public_data (accessed on 16 September 2021).