Fault Detection and Isolation Methods in Subsea Observation Networks

Subsea observation networks have gradually become the main means of deep-sea exploration, but their reliability is strongly affected by the harsh undersea environment. This study focuses on the theory and experimental-platform verification of high-impedance and open-circuit fault detection for an underwater observation network. Deep learning is used to detect and predict faults during network operation. For the high-impedance and open-circuit fault detection of submarine cables, the entire system is modeled and simulated, and the voltage and current values of the operating nodes are collected under different fault types. A deep learning algorithm is trained under supervision on a large number of labeled data samples, and a fault location system model is built in the laboratory to verify the feasibility and superiority of the scheme. This paper also studies fault isolation in the observation network, focusing on the communication protocol and the design of the fault isolation system. Experimental results verify the effectiveness of the proposed algorithm for locating and predicting high-impedance and open-circuit faults, as well as the feasibility of the fault isolation system. The proposed methods thus improve the reliability of undersea observation network systems.


Introduction
Cabled submarine observation networks are underwater monitoring networks formed by connecting many monitoring terminals and instruments distributed in the ocean through photoelectric composite cables. These networks are connected to the onshore power grid and communication network, which extends land-based monitoring systems into the deep sea [1][2][3][4]. The reliable, safe, and stable power transmission of submarine observation networks is the key to ensuring that all underwater observation equipment can carry out scientific research and exploration normally [5]. However, subsea observation networks, which are composed of many underwater photoelectric composite cables, electrical connectors, connecting equipment, and observation terminals, operate in the deep sea and other harsh environments. Given their complex structure and harsh working environment, their power transmission systems are prone to failure, which affects the stability of power transmission. Submarine cable faults are an important factor affecting the reliability of submarine observation networks. Because the network transmits electric energy through a single line, in the case of a ground fault of a submarine cable most of the current supplied by the shore stations flows through the short-circuit point. The current then exceeds the threshold value, the constant-voltage power supply shuts down immediately, and none of the corresponding underwater junction boxes can start normally; these conditions should be avoided as much as possible [6,7].

Overview of the Subsea Observation Network
The overall system architecture of the observation network is shown in Figure 1. The observation network in this work is composed of a power supply in the shore station (SS) and equipment in the first, second, and terminal layers. The first layer mainly includes an underwater photoelectric composite cable, underwater branch units (BUs), independent cathodes of BUs, and primary junction boxes (PJBs) connected in parallel to the backbone cable; each PJB has a corresponding node cathode. The first layer not only provides energy for the entire system but also accounts mainly for the high maintenance and repair costs. The second and terminal layers mainly include the secondary junction boxes (SJBs) and various terminal instruments.
Each PJB is connected to the backbone cable of the submarine observation network through a BU. High-voltage energy is distributed step by step to each terminal instrument through the PJB and SJB. The second and terminal layers have their own energy and communication management systems, which can isolate faults and work independently under various operation failures [1,4].

System Structure
This paper mainly focuses on the fault diagnosis of submarine cables. Cable faults can be divided into high-impedance faults, open-circuit faults, and short-circuit faults. High-impedance and open-circuit faults are the common ones, and short-circuit faults develop from high-impedance faults: if a high-impedance fault is not found and repaired in time, the damage to the submarine cable spreads, the high-impedance fault turns into a short-circuit fault, and the whole system collapses. Therefore, monitoring high-impedance faults can greatly reduce the probability of short-circuit faults. The fault detection method proposed in this paper mainly targets high-impedance and open-circuit faults. The fault detection system monitors the voltage and current data of the junction box nodes, uploads the real-time signals through a microcontroller and a photoelectric modem to the shore-based power monitoring platform for data analysis and judgment, and decides, according to the diagnosis results, whether the power switching control system should start the fault isolation program. The related hardware scheme is shown in Figure 2.
The fault detection algorithm is the control core of the entire platform. A deep neural network (DNN) is used to improve the accuracy of fault diagnosis and location [20,21]. Given the randomness and uncertainty of cable fault locations, time limitations, and economic constraints, physically reproducing the large amount of fault data required for deep learning training is difficult. Therefore, the dataset is augmented through software modeling of the observation network. A standard DNN is trained by supervised learning on feature vectors $x^{(i)}$ = {voltage and current data of the power supply, voltage and current data of all junction boxes in a complex topology network} with the corresponding labels $y^{(i)}$ = {calibration value of the corresponding fault type}. The TensorFlow deep learning computing framework is used for DNN training, and the cost function is minimized to train the fault detection model. (TensorFlow is a symbolic mathematical system based on dataflow programming that is widely used to implement machine learning algorithms. It is developed and maintained by Google Brain, Google's artificial intelligence team in California, USA.) The specific fault detection flowchart is shown in Figure 3.
The fault detection procedure consists of the following steps:
(1) A system simulation model is built in the PSpice simulation software (PSpice is a product of Cadence, San Jose, CA, USA), and high-impedance faults are simulated at intervals by connecting a series of fault resistors at the fault point. Python is used to preprocess the simulation data (for example, normalizing the feature combinations) to obtain the training set and test set data of the DNN.
(2) The TensorFlow deep learning framework is used to build a neural network fault prediction model and to train it iteratively on the training set features. Meanwhile, the test set data are used for hyperparameter search and tuning to improve the model's accuracy and generalizability. The trained fault detection model is saved locally for use on real data.
(3) The voltage and current data of the shore station and junction boxes collected by the single-chip microcomputer are displayed in real time by the Qt host computer, and the collected data are saved locally in text/csv format. (Qt is a cross-platform C++ graphical user interface (GUI) application development framework whose development began in 1991; it is now developed by The Qt Company in Espoo, Finland. It can be used to develop both GUI programs and non-GUI programs, such as console tools and servers.)
(4) The host computer software calls the Python interpreter to run the deep learning script and applies to the collected data the same preprocessing steps that were applied to the training data. (Python, created by Guido van Rossum in Amsterdam, the Netherlands, has become one of the most popular programming languages.) The combined features are used as input to the trained deep learning model for prediction. The Qt host computer displays the prediction results on the graphical user interface in real time.
(5) If the system is predicted to have a potential high-impedance or open-circuit fault, the result can also be used as a drive signal for the fault isolation system to perform switching.
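Step (4) requires that live data pass through exactly the same preprocessing as the training data. The following is a minimal sketch of that idea, assuming z-score standardization; the function names `fit_scaler` and `transform` and the sample values are hypothetical and are not taken from the original scripts.

```python
from statistics import mean, pstdev

def fit_scaler(rows):
    """Return per-feature (mean, std) pairs computed on the training rows."""
    columns = list(zip(*rows))
    # A constant feature has zero spread; fall back to 1.0 to avoid dividing by zero.
    return [(mean(col), pstdev(col) or 1.0) for col in columns]

def transform(row, scaler):
    """Standardize one feature row with statistics learned from the training set."""
    return [(x - m) / s for x, (m, s) in zip(row, scaler)]

# Hypothetical training rows: [node voltage (V), node current (A)].
train_rows = [[370.0, 1.0], [380.0, 1.2], [390.0, 1.4]]
scaler = fit_scaler(train_rows)

# A newly collected sample is scaled with the training statistics, not its own.
live_sample = [380.0, 1.2]
scaled = transform(live_sample, scaler)
print(scaled)
```

Storing the scaler next to the saved model is one way to keep the host-computer script and the training script consistent.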

Supervised Learning Feature Engineering
The current mainstream circuit simulation and design software mainly includes Multisim (a Windows-based simulation tool developed by National Instruments (NI) Co., Ltd. in Austin, TX, USA), Saber (an electronic design automation software developed by Synopsys in Mountain View, CA, USA), and PSpice. PSpice has a friendly graphical interface, is simple to operate, and performs well in circuit simulation, so we choose PSpice to simulate the circuit and obtain the dataset. In this work, a submarine observation network model with a dual-shore-station network topology is established in PSpice, and the input voltage and current of each junction box are collected. After standardization preprocessing, the deep learning input layer data are obtained. Provided that the state features differ sufficiently from one another, the fault detection model generated by the iterative training of the deep learning algorithm can accurately fit and distinguish the different system states. The fault detection process is shown in Figure 4.

The network in PSpice (Figures 5 and 6) consists of dual shore-based power supplies (A and B), 10 underwater nodes (T1-T10), 12 BUs (BU1-BU12), and 14 cable sections (R1-R14); the submarine cable is simulated by a combination of parasitic resistance, inductance, and capacitance. The neural network fault diagnosis method proposed in this paper is suitable for both single shore station and dual shore station systems. A dual shore station system is more reliable than a single shore station system: when one shore power supply fails, the other can continue to provide power so that the system does not crash. However, because single shore station and dual shore station systems operate in different power states, a separate training set must be obtained and a separate model trained for each type of system.
Sensors 2020, 20, 5273
The lumped parameter model and the distributed parameter model are the two main transmission cable models [22]. The lumped parameter model is suitable for the steady-state calculation of a short-distance transmission system. The distributed parameter model is accurate but computationally intensive. When the transient process of long-distance transmission cables is considered, the lumped parameter cascade model shown in Figure 7 is generally used. Once the segment length of the transmission cable has been optimized, the lumped parameter cascade model can ensure both simulation accuracy and convergence speed. Ref. [4] proposed the following parameters for a transmission cable: resistance R = 1 Ω/km, inductance L = 0.37 mH/km, and capacitance C = 0.16 µF/km.
The simulation model mainly includes a shore-based power supply model, a transmission cable model, a BU model, and a junction box model. The shore-based power supply is regarded as a DC voltage source because of its sufficient power. The transmission cable model adopts the lumped parameter cascade model, and its specific parameters are determined according to the actual cable length. The BU model is determined according to the BU structure design. The junction box is a constant power load, and its specific model is determined according to [4].
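As an illustration of how the per-kilometre values from Ref. [4] translate into a lumped parameter cascade, the sketch below splits a cable into equal RLC sections. The 50 km cable length and 10 km segment length are hypothetical values chosen for the example, and `cascade_segments` is not a function from the original work.

```python
# Per-kilometre cable parameters quoted from Ref. [4].
R_PER_KM = 1.0        # ohm/km
L_PER_KM = 0.37e-3    # H/km
C_PER_KM = 0.16e-6    # F/km

def cascade_segments(cable_km, segment_km):
    """Split a cable into equal lumped RLC sections and return their element values."""
    count = max(1, round(cable_km / segment_km))
    length = cable_km / count
    section = {
        "R_ohm": R_PER_KM * length,
        "L_H": L_PER_KM * length,
        "C_F": C_PER_KM * length,
    }
    return [dict(section) for _ in range(count)]

# Hypothetical 50 km cable section modeled with 10 km segments.
sections = cascade_segments(cable_km=50.0, segment_km=10.0)
total_resistance = sum(s["R_ohm"] for s in sections)
print(len(sections), total_resistance)
```

Shorter segments approach the distributed parameter model at the price of more circuit elements, which is the trade-off behind optimizing the segment length.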
Given the influence of temperature, damage degree, and seawater chemical characteristics, the ground resistance of high-impedance faults varies. Previous statistical data indicate that the ground resistance of high-impedance faults is generally in the range of thousands of ohms to hundreds of thousands of ohms. A series of different fault resistance values can be simulated at the same fault point in the simulation software, and the open-circuit fault in the simulation is obtained by turning off the BU.
The probability of multiple faults occurring simultaneously in a submarine observation network is considerably small. All the fault data in the simulation are based on the assumption that the system has a single fault in the network. The characteristics and the corresponding calibration values of the supervised learning training data adopted are shown in Tables 1 and 2.
Table 1. Input features: 10 terminal voltage and current values; 2 shore station current values; the product of the 2 shore station current values.
Each training sample has an input feature vector of dimension $\mathbb{R}^{1\times 23}$, which consists of continuous values that can be collected. The frequent fluctuation of the power supply current when a fault occurs can be understood as a strong characteristic component for distinguishing fault types. Using the product of the two SS currents as an additional feature improves the discrimination of fault detection in the later model training. For each feature vector $x^{(i)}$, we manually calibrate the output label $y^{(i)}$, whose dimension is $\mathbb{R}^{1\times 28}$. If the first component of $y^{(i)}$ is 1, the system is in a normal working state; otherwise, the system has a high-impedance or open-circuit fault. The further classification components, beginning with $y^{(i)}_{2\sim 13}$, take 0/1 state values that mark the fault type. The last component is a continuous value that does not exceed the maximum length of the faulty cable section and indicates the high-impedance fault location.
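The feature and label layout can be sketched as follows. The ordering of the 23 inputs and the use of class index 0 for the normal state are assumptions made for illustration; the helper names `build_features` and `build_label` are hypothetical.

```python
def build_features(node_voltages, node_currents, ss_currents):
    """Assemble the 23-dimensional input: 10 voltages, 10 currents, 2 SS currents, 1 product."""
    if len(node_voltages) != 10 or len(node_currents) != 10 or len(ss_currents) != 2:
        raise ValueError("expected 10 node voltages, 10 node currents and 2 SS currents")
    product = ss_currents[0] * ss_currents[1]
    return list(node_voltages) + list(node_currents) + list(ss_currents) + [product]

def build_label(class_index, location_km=0.0, n_classes=27):
    """Assemble the 28-dimensional label: a 27-way one-hot state plus the fault location."""
    one_hot = [0.0] * n_classes
    one_hot[class_index] = 1.0   # assumption: index 0 marks the normal working state
    return one_hot + [location_km]

# Hypothetical measurements for one sample.
x = build_features([375.0] * 10, [1.2] * 10, [6.0, 5.0])
y_normal = build_label(0)
y_fault = build_label(14, location_km=12.5)
print(len(x), len(y_normal), x[-1])
```

Keeping the location in the last slot lets the classification and regression parts of the label be sliced apart by position.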

Training and Optimization of Fault Detection Model
The hidden layers of the network are implemented by the function add_layer shown in Algorithm A1 in Appendix A. The add_layer function defines the weight variable 'Weights' and the bias variable 'biases'. The two-dimensional weight tensor of the hidden layer is initialized from a normal distribution and the one-dimensional bias tensor is initialized to a constant, which eases the later training and convergence of the neural network. Then, the operation 'Wx_plus_b' is defined: the output of the previous layer (the input of this layer) is multiplied by the weight matrix, and the bias is added. Depending on the application scenario, a suitable activation function is chosen, outputs = activation_function(Wx_plus_b), to transform the result nonlinearly and increase the expressive capacity of the model. The output of this function is the output of the hidden layer and the input tensor of the next layer.
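The computation performed by such a layer can be sketched in plain Python. This is an analogue of the described behavior, not the TensorFlow code of Algorithm A1; the standard deviation of 1.0, the bias constant 0.1, and the `seed` argument are assumptions.

```python
import random

def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]

def add_layer(inputs, in_size, out_size, activation_function=None, seed=0):
    """Dense layer: normally initialized weights, constant biases, optional activation."""
    rng = random.Random(seed)
    weights = [[rng.gauss(0.0, 1.0) for _ in range(out_size)] for _ in range(in_size)]
    biases = [0.1] * out_size   # assumed constant initial value
    outputs = []
    for row in inputs:
        # Wx_plus_b: weighted sum of the layer inputs plus the bias of each unit.
        wx_plus_b = [
            sum(row[i] * weights[i][j] for i in range(in_size)) + biases[j]
            for j in range(out_size)
        ]
        if activation_function is None:
            outputs.append(wx_plus_b)
        else:
            outputs.append(activation_function(wx_plus_b))
    return outputs

# Two hypothetical samples with 23 features mapped to 22 hidden units.
batch = [[0.5] * 23, [1.0] * 23]
hidden = add_layer(batch, 23, 22, activation_function=relu)
print(len(hidden), len(hidden[0]))
```

Stacking several calls, each taking the previous output as its input, gives the multi-layer structure described in the text.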
Compared with other activation functions such as sigmoid and tanh, the relu activation function has a gradient that is constant at 1 whenever its input is greater than 0, which greatly accelerates model training. Because relu involves no exponential operation, each evaluation is also cheaper, which further shortens training. Therefore, the relu activation function is used for model training in this research; it trains well and meets the training requirements of the diagnostic model.
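The gradient behavior described here can be checked numerically. The short sketch below compares the derivatives of the three activation functions at a few positive inputs; the sample inputs are arbitrary.

```python
import math

def sigmoid_grad(x):
    """Derivative of the logistic sigmoid."""
    s = 1.0 / (1.0 + math.exp(-x))
    return s * (1.0 - s)

def tanh_grad(x):
    """Derivative of the hyperbolic tangent."""
    return 1.0 - math.tanh(x) ** 2

def relu_grad(x):
    """Derivative of relu (taken as 0 at x = 0)."""
    return 1.0 if x > 0 else 0.0

for x in (0.5, 5.0, 10.0):
    print(x, relu_grad(x), sigmoid_grad(x), tanh_grad(x))
```

For large positive inputs the sigmoid and tanh derivatives shrink toward zero while the relu derivative stays at 1, which is why deep stacks of saturating activations train slowly.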
Combined with the fault detection feature engineering designed in Section 3.2, two cost functions are defined in the output layer of the DNN, as shown in Algorithm A2. The cross-entropy cost function is mainly used to evaluate the error of the softmax classification problem [23]; it is calculated over the first 27 output features of $y^{(i)}$ to measure the similarity between the predicted and true values, which addresses fault classification. In its standard form, the cross-entropy loss over $m$ samples is

$$L_{\mathrm{ce}} = -\frac{1}{m}\sum_{i=1}^{m}\sum_{k=1}^{27} y^{(i)}_{k}\,\log \hat{y}^{(i)}_{k},$$

where $\hat{y}^{(i)}$ denotes the network prediction for sample $i$. The mean squared error of the last output feature of $y^{(i)}$ is calculated to measure the quality of the regression and to obtain the specific location of the predicted fault. In its standard form, the mean squared loss is

$$L_{\mathrm{mse}} = \frac{1}{m}\sum_{i=1}^{m}\left(y^{(i)}_{28}-\hat{y}^{(i)}_{28}\right)^{2}.$$

Finally, the two cost functions defined by feature engineering are weighted and summed to obtain the total cost function 'loss' of the system, which is minimized to train the model. In this paper, the Adam optimizer, which offers good overall performance, is used to train the model iteratively [24]. The execution function 'sess.run' then starts the adjustment of the model parameters, and the loss function value decreases as training proceeds.
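A minimal single-sample sketch of such a combined cost is given below: softmax cross-entropy over the first 27 outputs plus a weighted squared error on the last output. The weight `location_weight` and the small constant inside the logarithm are assumptions, since the weighting used in Algorithm A2 is not stated here.

```python
import math

def softmax(logits):
    """Numerically stable softmax."""
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def total_loss(outputs, label, location_weight=1.0, n_classes=27):
    """Cross-entropy on the class outputs plus weighted squared error on the location."""
    probs = softmax(outputs[:n_classes])
    cross_entropy = -sum(
        t * math.log(p + 1e-12) for t, p in zip(label[:n_classes], probs)
    )
    squared_error = (outputs[n_classes] - label[n_classes]) ** 2
    return cross_entropy + location_weight * squared_error

# Hypothetical raw network outputs: uniform class scores, location off by 2 km.
label = [1.0] + [0.0] * 26 + [10.0]
outputs = [0.0] * 27 + [12.0]
print(total_loss(outputs, label))
```

Averaging this quantity over a batch gives the scalar that an optimizer such as Adam would minimize.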
Deep learning is end-to-end learning, and no theory prescribes specific values for the neural network architecture or the algorithm hyperparameters. The hyperparameter search is therefore carried out iteratively, guided by experience, until the prediction accuracy and generalization of the model meet the requirements. The accuracy index is computed on the training dataset and measures the ability of the model to fit the data. The generalization index is computed on the test set, which is not used for training, and measures the model's ability to predict unseen samples. During model training, the parameters must be adjusted continuously to optimize the training process. The relevant hyperparameters and their initial values are shown in Table 3. The proposed fault detection neural network consists of four hidden layers. After tuning, the numbers of neural units in the four layers are set to 22, 20, 40, and 28, and the number of training iterations is 20,000. In total, 1064 sets of data are obtained from the PSpice simulation software as the training set of the neural network. Backpropagation (BP) is carried out with tf.train.AdamOptimizer, a built-in optimizer of TensorFlow, which calculates the partial derivative of the cost function with respect to every model parameter and updates each parameter with a step determined by the learning rate and the corresponding partial derivative. To check whether the optimization of the neural network is heading in the right direction, we record the total cost function (Loss) of the training (blue curve) and test (orange curve) sets at regular iteration intervals during training (Figure 8).
The difference between the training-set error and the Bayes error rate is defined as bias, and the difference between the test-set and training-set errors is defined as variance. (The Bayes error rate is the minimum error that any classifier can achieve with the existing feature set; it is also called the minimum error.) All parameters of the neural network are optimized to balance the bias and variance. According to Figure 8, training optimizes the fit to the training set, and the loss value of the training set is smaller than that of the test set, which has the same data distribution. However, at the end of training the loss value of the training set (blue curve) is still about 50 above 0, indicating that the complexity of the current model is insufficient to fit the supervised training data and that the model suffers from high bias.
In addition, the error between the loss value of the training and test sets increases with the number of training iterations. When the number of iterations is 16,000, the loss difference between the training and test sets is approximately 80. These data show the weakening generalizability of the fault detection model and its high variance problem.
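The two diagnostics can be written down directly from their definitions. The sketch below uses the loss values quoted above and takes the minimum achievable loss as 0; combining the training loss of about 50 with the gap of about 80 into a test loss of 130 is an assumption made only to have concrete numbers.

```python
def diagnose(train_loss, test_loss, bayes_loss=0.0):
    """Return (bias, variance) as defined by the gaps between the loss levels."""
    bias = train_loss - bayes_loss
    variance = test_loss - train_loss
    return bias, variance

# Approximate values read from the text; the test loss is an assumed combination.
bias, variance = diagnose(train_loss=50.0, test_loss=130.0)
print(bias, variance)
```

A large first value points toward a more expressive model, while a large second value points toward regularization or more data.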
The most common way to reduce variance is to add a penalty term to the loss function. Two penalty terms are in common use: L1 regularization ($\ell_1$-norm) and L2 regularization ($\ell_2$-norm) [25]. L1 regularization refers to the sum of the absolute values of the elements of the weight vector $w$ and is usually written $\|w\|_1$; L2 regularization refers to the square root of the sum of the squares of the elements of $w$ and is usually written $\|w\|_2$. When the cost function is minimized, the regularization term is added to it as a penalty so that the optimization also keeps the weight values of the model small. The net effect is equivalent to reducing the complexity of the model, which improves how well the fit generalizes. In addition to regularization, early stopping and dropout can be used to limit overfitting. Early stopping monitors the loss curve of the test set continuously and halts training once that loss starts to increase with the number of iterations. Dropout randomly drops neural units in each layer according to a given ratio, which effectively simplifies the network structure and thereby reduces overfitting. The fault detection model mainly uses an L2 regularization term and dropout to improve generalizability.
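Both techniques are short to express in code. The sketch below uses the squared L2 norm as the penalty, which is the common practical choice and an assumption about the implementation, together with inverted dropout; the coefficient `lam` and the keep probability are hypothetical values.

```python
import random

def l2_penalty(weight_matrices, lam):
    """Penalty term: lam times the sum of squared weights over all layers."""
    return lam * sum(w * w for matrix in weight_matrices for row in matrix for w in row)

def dropout(activations, keep_prob, rng):
    """Inverted dropout: zero units at random and rescale the survivors."""
    return [a / keep_prob if rng.random() < keep_prob else 0.0 for a in activations]

# Hypothetical weights of two small layers.
weights = [[[1.0, 2.0]], [[3.0]]]
penalty = l2_penalty(weights, lam=0.1)

rng = random.Random(0)
dropped = dropout([1.0] * 1000, keep_prob=0.5, rng=rng)
print(penalty, sum(1 for v in dropped if v > 0))
```

The penalty is added to the data loss during training, whereas dropout is applied only while training and is switched off (keep probability 1) at prediction time.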
To further reduce the bias between the predicted values of the training set and the label data, this work increases the number of hidden layers and the number of neural units in each layer to raise the complexity of the model. However, convergence slows as the network deepens. In severe cases, the gradients of the training parameters vanish throughout the network, and normal training cannot proceed. Batch normalization (BN) resolves this training stall. The technique extends the idea of normalizing the input layer data: it normalizes the input of each hidden layer to a distribution with mean β and standard deviation γ. In this way, the input of the activation function always falls in the region where the nonlinear function has a large gradient, and the vanishing gradient problem is avoided. BN is applied before the activation function shown in Algorithm A1. In its standard form, for a batch of $m$ inputs $x_i$ to a neural unit, the calculation is

$$\mu_B = \frac{1}{m}\sum_{i=1}^{m} x_i,\qquad \sigma_B^{2} = \frac{1}{m}\sum_{i=1}^{m}\left(x_i-\mu_B\right)^{2},$$

$$\hat{x}_i = \frac{x_i-\mu_B}{\sqrt{\sigma_B^{2}+\epsilon}},\qquad z_i = \gamma\,\hat{x}_i+\beta,$$

where $\epsilon$ is a small constant that prevents division by zero and $z_i$ is the value passed to the activation function.
To ensure that each layer of the network retains a certain nonlinearity, after standardizing the data BN rescales it by a learnable scale (γ) and offsets it by a learnable shift (β). The core idea of the algorithm is to force the data distribution of each layer of the neural network to change so that the input of the activation function stays out of the saturated region; the convergence of the neural network is thus greatly accelerated without reducing the nonlinear complexity of the system.
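The training-time forward pass of batch normalization can be sketched for a small batch as follows. This is the standard formulation written in plain Python, not the code of Algorithm A1; the value of `eps` and the sample batch are assumptions.

```python
import math
from statistics import mean, pvariance

def batch_norm(batch, gamma, beta, eps=1e-5):
    """Normalize each feature over the batch, then apply scale gamma and shift beta."""
    columns = list(zip(*batch))
    normalized_columns = []
    for col, g, b in zip(columns, gamma, beta):
        mu = mean(col)
        var = pvariance(col, mu)   # population variance over the batch
        normalized_columns.append(
            [g * (x - mu) / math.sqrt(var + eps) + b for x in col]
        )
    # Transpose back so that each row is again one sample.
    return [list(row) for row in zip(*normalized_columns)]

# Hypothetical batch of two samples with two features on very different scales.
batch = [[1.0, 10.0], [3.0, 30.0]]
out = batch_norm(batch, gamma=[1.0, 2.0], beta=[0.0, 5.0])
print(out)
```

After the transform each feature has batch mean equal to its shift and a spread set by its scale, regardless of the scale of the raw inputs.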
Because the mean and variance of a single mini-batch (fc_mean, fc_var) cannot represent the data distribution of the whole training set, the batch statistics must be smoothed before they are used to standardize the test data. In the TensorFlow framework, an exponentially weighted moving average ('ema') is used to track a running estimate of each variable, so that each update depends on the values over a recent period. Batch normalization of each hidden layer not only greatly improves the training speed but also improves the classification performance. With BN, a large learning rate can be used without the training oscillating or failing to converge, which makes it easier to train a complex neural network and, to a certain extent, helps prevent overfitting.
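The interplay between batch statistics and their smoothed counterparts can be sketched in a few lines of plain Python. The class below handles a single feature and is only an illustration: the class name, the decay value of 0.9, and the initial running statistics are assumptions, and it does not reproduce the TensorFlow code used in this work.

```python
import math


class BatchNorm1D:
    """Batch normalization of one feature with smoothed running statistics."""

    def __init__(self, gamma=1.0, beta=0.0, decay=0.9, eps=1e-5):
        self.gamma, self.beta = gamma, beta
        self.decay, self.eps = decay, eps
        self.running_mean, self.running_var = 0.0, 1.0

    def forward(self, batch, training):
        if training:
            # Statistics of the current mini-batch.
            mean = sum(batch) / len(batch)
            var = sum((x - mean) ** 2 for x in batch) / len(batch)
            # Exponentially weighted moving average of those statistics.
            d = self.decay
            self.running_mean = d * self.running_mean + (1 - d) * mean
            self.running_var = d * self.running_var + (1 - d) * var
        else:
            # Test data are standardized with the smoothed statistics.
            mean, var = self.running_mean, self.running_var
        scale = self.gamma / math.sqrt(var + self.eps)
        return [scale * (x - mean) + self.beta for x in batch]
```

During training each call normalizes with the statistics of the current mini-batch and nudges the running estimates; at test time only the running estimates are used, so a single sample can be normalized without a batch.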
After a series of parameter adjustments and algorithm optimizations, the system fault detection model is completed. The test results are shown in Figures 9 and 10. The blue curve represents the prediction accuracy on the training set, and the red curve represents the diagnosis results on the test set.

The figures indicate that the prediction accuracy and generalizability of the fault detection model improve after successive tuning of the number of network layers, the number of neural units per layer, the learning rate, batch normalization, and the dropout and L2 regularization values. Figure 9 shows the prediction accuracy of the multiclass softmax classification of fault types. The diagnostic accuracy on both the training and test sets gradually improves as training proceeds. The accuracy on the training set is always higher than that on the test set, and the gap between them narrows. When the number of iterations reaches 16,000, the prediction accuracy of the fault detection model for high-impedance and open-circuit fault types reaches approximately 91%.

Figure 10 shows the prediction error for the location of the high-impedance fault. The mean squared error of the fault location decreases as the number of training iterations increases, because the model keeps approaching the regularities of the training set. At first, the accuracy increases with the number of iterations. After a certain number of iterations (8000 in this simulation), however, the model has excess expressive capacity and begins to learn incidental features that hold only for the training set and do not apply to the test set. This is overfitting, and it leads to a decline in test accuracy. Therefore, in the overfitting regime, the test accuracy first increases and then decreases as the number of iterations increases. This problem can be mitigated by increasing the amount of training data. The deviation of the predicted fault location is approximately 4 km.
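The rise and subsequent fall of test accuracy described above is the situation that early stopping is designed for. The following sketch selects the best evaluation point from a recorded accuracy curve and stops once no improvement has been seen for a given number of evaluations. The function name, the `patience` parameter, and the accuracy values in the usage example are hypothetical; they are not taken from Figures 9 and 10.

```python
def best_checkpoint(test_accuracy, patience):
    """Return (index, accuracy) of the best evaluation, stopping once
    `patience` consecutive evaluations fail to improve on it."""
    best_idx, best_acc, stale = -1, float("-inf"), 0
    for idx, acc in enumerate(test_accuracy):
        if acc > best_acc:
            best_idx, best_acc, stale = idx, acc, 0
        else:
            stale += 1
            if stale >= patience:
                break  # accuracy has stopped improving; halt training
    return best_idx, best_acc


if __name__ == "__main__":
    # Hypothetical accuracy curve that rises, then declines.
    curve = [0.60, 0.75, 0.84, 0.88, 0.87, 0.86, 0.85, 0.90]
    print(best_checkpoint(curve, patience=3))
```

A small `patience` stops soon after the first decline, whereas a larger value tolerates temporary dips at the cost of more training iterations.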

Communication Protocol
To realize fault isolation, a communication protocol must be established first. An MCU (Microcontroller Unit) has three main serial hardware interfaces: UART (Universal Asynchronous Receiver/Transmitter), SPI (Serial Peripheral Interface), and I2C (Inter-Integrated Circuit). This paper selects UART according to the system requirements.
Based on the characteristics and application scenarios of the BUs, and inspired by the serial asynchronous communication protocol, a communication protocol suitable for fault isolation is proposed, as shown in Figure 11.

The data bits consist of two parts: the address bits and the command bits. The address bits are used to code and distinguish each BU. Their number is not fixed and can be adjusted according to the number of nodes in the whole observation network: with n address bits, up to about $2^n$ nodes can be controlled. The command field has 2 bits, which indicates that each BU can control two circuit breakers; '0' means the switch is turned on, and '1' means the switch is turned off. The parity bit is used to check that the entire data field has been received correctly. The stop bit '0' marks the end of a complete BU command, after which the BU waits for the next command. The BU switch operates only when the digital current signal conforms to the communication protocol, so whether the MCU can correctly sample the current signal is the key to the whole fault isolation system.
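The frame structure described above can be sketched as a small encoder. The field order (address, command, parity, stop) follows the description in the text, but the use of even parity over the address and command bits is an assumption, as are the function names; Figure 11 should be consulted for the authoritative layout.

```python
def address_width(node_count):
    """Smallest n such that 2**n >= node_count (at least one bit)."""
    if node_count < 1:
        raise ValueError("node_count must be positive")
    return max(1, (node_count - 1).bit_length())


def even_parity(bits):
    """Parity bit that makes the total number of '1' bits even (assumed)."""
    return str(bits.count("1") % 2)


def build_frame(address, command):
    """Assemble address | command | parity | stop as a bit string."""
    if set(address + command) - {"0", "1"}:
        raise ValueError("address and command must be bit strings")
    if len(command) != 2:
        raise ValueError("the command field has exactly 2 bits")
    data = address + command
    return data + even_parity(data) + "0"  # stop bit is '0'


if __name__ == "__main__":
    print(address_width(32))           # address bits needed for 32 nodes
    print(build_frame("10101", "10"))  # illustrative address and command
```

Because the address width is derived from the node count, the same encoder serves networks of different sizes without changing the command, parity, or stop fields.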

Optimal Communication Frequency
In the transmission line model, the higher the frequency of a sinusoidal signal, the more severe its attenuation and distortion. Therefore, when the current square wave reaches a given BU, its waveform is distorted to some extent, which introduces uncertainty at the BU sampling unit. When the signal frequency is high and the distortion is severe, a digital '1' may be sampled as '0' at the BU side. Such a bit error makes the whole communication uncontrollable.

Using the simulation model in Figure 6, the current waveforms at different distances are measured. The power supply terminal applies a step current signal of 50-100 mA. The simulation results in Figure 12 show that the farther the BU is from the shore-based power supply, the more severe the distortion of the current flowing through the BU sampling resistor. Here $T_t$ is the transition time, equal to the sum of the rise time and the delay time. From the simulation data, the algebraic relationship between the maximum transition time $T_t$ and the total length of the transmission line $L$ is determined with the MATLAB curve-fitting toolbox (MathWorks, Natick, MA, USA), using a second-order polynomial with 95% confidence bounds. To further increase the accuracy and reliability of sampling, this paper adopts ten-fold oversampling, $f_s = 10 \cdot f_c$, where $f_s$ is the sampling frequency and $f_c$ is the communication frequency.

The distortion analysis above considers only the influence of the parasitic parameters of the cable on the digital signal. In practice, other uncertain factors also disturb the digital current signal, and some of them are unavoidable. For example, the actual transition time $T_t$ is also increased by electric pulses from the environment or the power supply, estimation errors in the cable parasitic parameters, software simulation errors, curve-fitting errors, and the response time of electronic components.

To ensure sampling accuracy, the sampling frequency $f_s$ of the switch controller is therefore chosen with a margin over the transition time $T_t$, using a safety factor $S = 2$. When the transmission distance is within 2000 km, the best communication frequency of the digital current signal can be determined by Equation (8). This choice not only preserves the integrity of the signal and improves the reliability of the whole control process but also shortens the execution time of control instructions over the power line.
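The chain from cable length to sampling frequency can be illustrated numerically. The sketch below assumes that each bit must last at least $S \cdot T_t$, so that $f_c = 1 / (S \cdot T_t)$; this rule is an interpretation of the safety factor rather than a reproduction of Equation (8). The polynomial coefficients are placeholders chosen only to make the example run and are not the fitted values of this work.

```python
# Placeholder coefficients (a, b, c) of T_t(L) = a*L**2 + b*L + c, with L in km
# and T_t in seconds. These are NOT the fitted values from the paper.
PLACEHOLDER_COEFFS = (1e-8, 1e-5, 1e-3)


def transition_time(length_km, coeffs):
    """Evaluate the second-order polynomial T_t(L), in seconds."""
    a, b, c = coeffs
    return a * length_km ** 2 + b * length_km + c


def communication_frequency(length_km, coeffs, safety_factor=2.0):
    """Assumed rule: one bit period lasts at least safety_factor * T_t."""
    return 1.0 / (safety_factor * transition_time(length_km, coeffs))


def sampling_frequency(f_c, oversampling=10):
    """Ten-fold oversampling, f_s = 10 * f_c."""
    return oversampling * f_c


if __name__ == "__main__":
    f_c = communication_frequency(2000, PLACEHOLDER_COEFFS)
    print(f_c, sampling_frequency(f_c))
```

Under this rule a longer cable, and hence a longer transition time, lowers the admissible communication frequency, which is consistent with the distortion trend reported in Figure 12.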

Control of BU
The hardware of the BU, which is controlled by the digital current signal, mainly comprises a power supply and signal acquisition module, a control module, and a driver module, as shown in Figure 13.

The SS computer communicates with the SS power supply through the Modbus serial communication protocol, and the power supply sends out the digital current signal at the chosen communication frequency. The digital signal reaches the BU through the photoelectric composite cable. Each BU is connected to the backbone cable in series with a voltage regulator and a precision sampling resistor. The voltage regulator serves as the power supply module of the whole BU. The signal amplifier converts the current signal into a voltage signal and amplifies it to a level that the MCU can process and identify. The analog voltage is compared with the reference voltage of the AD conversion chip to generate a signal suitable for the MCU. The MCU samples this signal at a fixed frequency, combines the samples into bits, and decodes them according to the communication protocol. The BU switch operates only when the signal conforms to the communication protocol.
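How the MCU might combine several samples into one bit can be sketched as follows. Majority voting and the 75 mA decision threshold (midway between the 50 mA and 100 mA levels used in the experiments) are assumptions made for this illustration, and the function name is hypothetical; the firmware of the BU is not reproduced here.

```python
def samples_to_bits(samples_mA, samples_per_bit=5, threshold_mA=75.0):
    """Group current samples per bit and decide each bit by majority vote."""
    if len(samples_mA) % samples_per_bit:
        raise ValueError("sample count must be a multiple of samples_per_bit")
    bits = []
    for i in range(0, len(samples_mA), samples_per_bit):
        group = samples_mA[i:i + samples_per_bit]
        # Count the samples that lie above the decision threshold.
        ones = sum(1 for s in group if s > threshold_mA)
        bits.append("1" if 2 * ones > len(group) else "0")
    return "".join(bits)


if __name__ == "__main__":
    # Two bits with distorted edges: the first sample of each bit still
    # carries the level of the previous bit.
    samples = [62, 98, 101, 99, 97, 90, 55, 49, 51, 50]
    print(samples_to_bits(samples))
```

Voting over several samples per bit makes the decision tolerant of the slow edges caused by cable distortion, since a minority of samples taken during a transition cannot flip the result.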

Experiment of Fault Detection
Under laboratory conditions, the system model is limited by the voltage level of the SS power supply, the cable length, the cost of experimental equipment, and other constraints. In the experiment, an observation network with a single power supply and a tree topology is built to verify the proposed fault detection model. The experimental platform is shown in Figure 14.
In this laboratory experiment, the voltage level, cable length, and load size are scaled down from those of the practical observation network according to similarity theory. On this basis, the shore station power supply is reduced from 10 kV to 300 V. The BUs and junction boxes, which are the loads of the system, are replaced by 50 ohm cement resistors, and one load is placed every 60 km. The transmission cable is represented by a cable model. A real observation network may have many nodes, but because of the limited scale of the experiment, only three nodes are set up to verify the theory. Voltage and current sampling is arranged at the input of each node, and the collected data are uploaded to the operating system at the SS through the network port of the STM32 lower-level computer. The structure of the laboratory fault detection experiment is shown in Figure 15.

Table 4. Experimental results of fault detection (columns: Test Set, Calibration/Prediction).

Four typical cases are extracted from the experimental test sequence for analysis. The system characteristic state vector of No. 1 is $[0,0,0,0,1,0,0,20]^T$, which means that the first section of the backbone cable has a high-impedance fault (Table 2). Similarly, No. 4 is the normal operating state of the system. The states are sufficiently distinct in their features for the deep learning model to detect faults accurately. This paper is aimed at the fault detection of submarine cables. Large disturbances usually come from a power supply or junction box fault. The output ports of the junction box and the shore-based power supply are equipped with over-current and over-voltage protection. The proposed method diagnoses cable faults on the premise that no large disturbance occurs. The output voltage of the DC power supply used in the submarine observation network is noise suppressed, so the noise is small relative to the useful signal and has little influence on the diagnostic results. The experimental results show that the noise present does not affect the fault diagnosis. In addition, if a sensor fails, the accuracy of the proposed fault diagnosis algorithm is greatly reduced; the sensor must then be repaired and the fault diagnosis algorithm run again.

Experiment of Fault Isolation
In the fault isolation experiment, the SS power supply sends digital current signals at different frequencies to observe whether the BU can reliably switch on and off. The experimental topology is shown in Figure 16. In the experiment, the BU is installed at the 200 km point of the cable model. The main difference from the practical system is that a simulated cable is used in place of a real cable. The programmable power supply and BU controller are the same as in the practical fault isolation system. The '0' and '1' data current levels are 50 mA and 100 mA, respectively.

In Figure 17, the MCU program starts when the current of the I-Cable is interrupted at the falling edge. The current of the backbone cable is sampled five times every 100 ms, and the address bits of the BU are read as '10101'. The command bits are '10', which indicates that the turning-on coil is to be driven. The parity bit '0' meets the check requirements. The stop bit is '0', which marks the end of the isolation control procedure. All input current signals are thus verified against the communication protocol, and the BU sends control signals to drive the turning-on coil. Figure 18 is similar to Figure 17, except that the command bits change to '01' to drive the turning-off coil. In Figure 19, the parity bit is '1', which does not conform to the communication protocol, so the BU does not respond. In Figure 20, the address bits are '10110', which do not match the BU address in the experiment, so no misoperation occurs.

In the experimental environment, the reliability of the BU is tested by having the programmable power supply periodically send the control commands that turn the switch on and off. A simple counting function is implemented in software. Digital current waveforms are sent alternately every 20 s, for an accumulated operation time of 6 h. The actual switching actions are consistent with expectations, and no error occurs. In addition, because the BU operates in a closed chamber where air circulation and heat dissipation are relatively poor, a thermal imager is used to monitor the heating of the BU continuously during the durability test. The measurement results are shown in Figure 21. The figure shows that the main heat source on the circuit board is the voltage stabilizing tube. Under long-term operation at a current of 100 mA, the maximum temperature of this component is 52 °C, which is well within its design range and supports the soundness and reliability of the BU design.
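The four cases of Figures 17 to 20 can be replayed against a small model of the BU-side check. The sketch assumes the frame layout address | command | parity | stop and even parity over the address and command bits, which is consistent with the parity bit '0' accepted for address '10101' and command '10'. The command bits used for the Figure 19 and Figure 20 cases are not stated in the text and are chosen here for illustration only; the function name and return strings are hypothetical.

```python
BU_ADDRESS = "10101"  # BU address quoted in the experiment


def bu_response(frame, own_address=BU_ADDRESS):
    """Return the action a BU takes for a received bit string."""
    n = len(own_address)
    if len(frame) != n + 4 or set(frame) - {"0", "1"}:
        return "ignore"
    address, command = frame[:n], frame[n:n + 2]
    parity, stop = frame[n + 2], frame[n + 3]
    if stop != "0":
        return "ignore"
    # Even parity over the address and command bits (assumed scheme).
    if (address + command).count("1") % 2 != int(parity):
        return "ignore"
    if address != own_address:
        return "ignore"  # command is addressed to another BU
    actions = {"10": "drive turning-on coil", "01": "drive turning-off coil"}
    return actions.get(command, "ignore")


if __name__ == "__main__":
    for frame in ("101011000", "101010100", "101011010", "101101000"):
        print(frame, "->", bu_response(frame))
```

Rejecting a frame on any single failed check (length, stop bit, parity, or address) is what keeps a corrupted or misaddressed command from operating a switch.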

Conclusions
As single-line power supply systems grow in scale, the reliability of submarine observation networks gradually decreases, and the failure of any node or cable can lead to the collapse of the entire network. To improve the reliability of submarine observation networks, the fault diagnosis and location of transmission lines are studied. On the basis of the voltage and current characteristics of the power supply and underwater nodes, a method of fault-type diagnosis and location based on a DNN is proposed. To verify the feasibility of applying deep learning to fault diagnosis and location, a complex network model with a double-SS topology is adopted in the theoretical research stage. The supervised learning features of this model are designed in the TensorFlow deep learning framework, and a DNN is built whose training is accelerated by feature preprocessing, the stochastic gradient descent algorithm, and batch normalization. L2 regularization, dropout, and other deep learning training strategies are added to prevent overfitting and improve the generalizability of the trained model. For fault isolation, a communication protocol suited to the power supply system of the observation network is proposed, and an experimental platform is built according to this protocol. Experimental results verify the effectiveness of the proposed algorithm for the location and prediction of high-impedance and open-circuit faults, and the feasibility of the fault isolation system has also been verified. Moreover, the proposed methods greatly improve the reliability of undersea observation network systems.