Locating Faults in Thyristor-Based LCC-HVDC Transmission Lines Using Single End Measurements and Boosting Ensemble

Abstract: Most fault location methods for high voltage direct current (HVDC) transmission lines require signals from both ends. It is difficult to estimate the fault location if the recorded signal is incorrect due to communication problems. Hence, a robust method is required which can locate faults with minimum error. In this work, faults in HVDC transmission lines are located using boosting ensembles based on single terminal direct current (DC) signals. The signals are processed to obtain input features that vary with the fault distance. These input features are the maximum of the half cycle current signal after the fault and the minimum of the half cycle voltage signal after the fault, both taken from the root mean square of the DC signals. The input features are given to a boosting ensemble for estimating the location of the fault. The boosting ensemble method attempts to correct the errors of the previous models and finds the output by combining all models. The boosting ensemble method has also been compared with the decision tree method and the bagging-based ensemble method. Fault locations are estimated using the three methods and compared to obtain an optimal method. The boosting ensemble method performs better than the other methods in locating the faults. It has also been validated for varying fault resistance, smoothing reactors, boundary faults, pole to ground faults and pole to pole faults. One advantage of the method is that no communication link is needed. Another advantage is that it allows a reach setting of up to 99.9% and does not exhibit the problem of over-fitting. In addition, the percentage error in locating faults is within 1% and the realization cost is low. The proposed method can be implemented in HVDC transmission lines effectively as an alternative that overcomes the drawbacks of traveling wave methods.


Introduction
Owing to the current growth in energy demand, bulk power transmission is essential. One option is to implement HVDC transmission lines, as they are cost effective and emerging in the power sector. The advantage of HVDC transmission lines over alternating current (AC) lines is that they can carry power over long distances. However, locating faults in HVDC lines is not an easy task due to the long transmission line length, whose impedance is very small, as well as severe environmental conditions. Most fault location schemes for HVDC transmission lines use traveling waves, which have drawbacks such as the need for communication links and measurements at both ends. Hence, it is essential to design a scheme that locates faults with minimum error in all operating conditions.

To predict fault location, a traveling-wave method based on current data has been proposed in [1]. In [2], a fault direction comparison method using both end measurements based on traveling waves has been proposed. For a forward fault, the amplitude of the first backward traveling wave is greater than that of the forward traveling wave, and this property is used for direction comparison. In [3], fault locations in DC lines are predicted using the Prony algorithm. In [4], a traveling wave-based protection has been suggested, but the sampling rate is very high, making it complicated to implement. A traveling wave-based directional protection has been proposed for DC transmission lines in [5]. In [6], a distance protection method with traveling waves based on the wavelet transform has been suggested. The drawback of the above-mentioned methods [1][2][3][4][5][6] is that they are based on traveling waves, in which different positions have different characteristics.
In [7], a method has been proposed for multi-terminal systems using electromagnetic time reversal. In [8], a method using fault currents from two ends has been proposed. Faults are identified using the fault component current at both converter stations, but this method cannot detect high impedance faults. A current differential protection method which removes the effects of capacitor current has been suggested in [9]. The sum of the currents calculated with each station's data has been used to detect faults. However, the methods suggested in [7][8][9] need data from both ends and a communication link. A current differential protection scheme for current source converter (CSC) based HVDC transmission lines has been proposed in [10]. However, this method takes more time to operate in order to avoid the impact of capacitive current. In [11], an impedance-frequency characteristic is used for fault identification. In this approach, transient energy within a specific frequency band has been used to identify internal faults, but it does not address the estimation of fault location. In [12], a wavelet-based approach has been suggested for fault detection, pole identification and fault location, but a drawback of the method is that it cannot estimate the location under varying fault resistance.
In recent studies, artificial intelligence-based techniques have also been used for locating faults. In [13], a support vector machine (SVM) based method has been proposed to locate the fault. In [14], a deep learning approach called long short-term memory network has been used. In [15], a discrete wavelet transform (DWT) and SVM based technique has been proposed for fault location, but it uses a very high sampling frequency and the percentage error reaches up to 7%. In [16], a deep learning approach has been proposed that can directly learn the fault conditions based on features. It has been tested on a modular multilevel converter based four-terminal high-voltage direct current system. In [17], a fault location method for a modular multilevel converter (MMC)-based HVDC grid based on dynamic state estimation and gradient descent uses a data window of 5 ms after the occurrence of the fault. In [18], a novel fault location approach based on characteristic-harmonic measured impedance (CHMI) has been proposed. In this method, the faulted branch is identified by the amplitude ratio of the two CHMI. In [19], a primary protection scheme based on current limiting reactor power and a backup protection scheme based on the variation tendency of the current have been proposed. In [20], a backup protection method for bipolar HVDC lines using two-end synchronized sampled currents has been proposed. It can be applied to detect and recognize the faulted poles without long time delays. In [21], internal faults are partitioned into four zones along the transmission line. Then, the polarities and arrival times of traveling waves are used for zone partition.
The research gap from the above discussion can be outlined as follows:
• Most of the existing methods are traveling wave-based. In a traveling wave method, estimating the fault location is sometimes difficult because the recorded signal is not strong;
• Another problem with traveling wave methods is that if one end does not record the signal correctly, the location of the fault cannot be estimated;
• As traveling wave methods operate online, a problem in the communication link between terminals also prevents the location from being estimated;
• Another drawback of traveling wave methods is that if the obstacle in the wave front path is irregular, the method provides a wrong result for the location of the fault.
In these scenarios, the maintenance crew has to patrol the line, which is not economical and takes time, ultimately affecting uninterrupted power flow.
Considering the drawbacks of the existing methods, a scheme for locating faults has been suggested based on single terminal signals using a boosting ensemble learning method. An advantage of the proposed method compared to other methods is that it does not need a communication link and is simple, reliable and robust. Compared with the existing methods, which require communication links leading to high cost and low reliability, the proposed method uses signals from one terminal. The proposed method does not need a complex algorithm to improve the location accuracy. Compared to a traveling wave method, which needs a high sampling frequency, the proposed method provides a better result with a low sampling frequency. Hence the proposed method is robust and consistent. The manuscript is organized as follows: Section 2 contains the materials and methods, Section 3 contains the results, Section 4 contains the discussion, Section 5 contains the implementation of the proposed method and Section 6 contains the conclusion.

Materials and Methods
The fault line can be located offline in DC lines. An effective fault location method is very important because it helps to send the maintenance crew to the fault location which will reduce the repair time and maintenance costs. The proposed fault location estimation scheme has various steps, as shown in Figure 1. It is based on a training module which is designed using three methods, namely decision tree regression (DTR), bagging ensemble regression (BGER) and the boosting ensemble regression (BTER) technique, to correlate input features with fault locations. DC current and voltage signals are recorded at the rectifier end using a sampling rate of 1 kHz. The previously trained module takes the input data and estimates the fault location.

System under Study
Bipolar LCC-HVDC transmission lines have various advantages and are therefore implemented more frequently. The power transmission line studied is 1100 km long and is designed using MATLAB/SIMULINK [22]. As HVDC lines are long, fault occurrence is more common. The power system network used in this study is shown in Figure 2.

Feature Selection
After designing the bipolar LCC-HVDC line, it is simulated for various fault cases. The signals are recorded at the rectifier end with a sampling frequency of 1 kHz. The signals of various cases are studied extensively. After trying various features, the optimal features that can estimate the fault location correctly are obtained. The steps required to obtain the features are described below.
i. Obtain the signals from the rectifier end;
ii. Obtain the root mean square (rms) value of the signals;
iii. Take half cycle samples after the occurrence of the fault;
iv. Find the maximum of the current and the minimum of the voltage from the half cycle samples; these are the input features given to the fault location estimation module.
Step i and step ii are shown in Figure 3 for a pole 1 to ground (P1G) fault at 150 km. Figure 3a-d show the current of pole 1, current of pole 2, voltage of pole 1 and voltage of pole 2, respectively. Figure 3e-h show the rms current of pole 1, rms current of pole 2, rms voltage of pole 1 and rms voltage of pole 2, respectively. It is observed that the half cycle of samples after the occurrence of a fault varies with the fault location.
Step iii is shown in Figure 4, i.e., half cycle signals obtained for a P1G fault at various locations. Figure 4a shows that the half cycle current is different for each location. From Figure 4b it can be observed that the half cycle voltage signals are different for each location.
Step iii is also shown in Figure 5, i.e., half cycle rms signals obtained during a P1G fault at 1080 km for various fault resistances. From Figure 5a it can be observed that the half cycle current signals are different for each fault resistance. From Figure 5b it can be observed that the half cycle voltage signals are different for each fault resistance. Figures 4 and 5 show the correlation between the signals obtained after the fault and the fault distance. So, these features or any variation of these features can be used for fault location estimation.
Electronics 2022, 11, x FOR PEER REVIEW
Step iv is shown in Figure 6, i.e., input features obtained for various locations (21 km to 1079 km in steps of 2 km) of both poles during a P1G fault. From Figure 6a it can be seen that the current features of pole 1 vary according to the location of the fault, as the fault is P1G. From Figure 6c it can be observed that the voltage features of pole 1 also vary according to the location of the fault, as the fault is P1G. Pole 2 current and voltage features are shown in Figure 6b and 6d, respectively.
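Steps i to iv can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the rms is taken over a sliding one-cycle window, a 50 Hz reference cycle is assumed (20 samples per cycle at 1 kHz, so a half cycle is 10 samples), the fault inception index is assumed to be known, and the function names and per-unit step signals are hypothetical.

```python
import math

def running_rms(signal, window):
    """RMS over a sliding window of up to `window` samples ending at each index."""
    out = []
    for k in range(len(signal)):
        start = max(0, k - window + 1)
        chunk = signal[start:k + 1]
        out.append(math.sqrt(sum(v * v for v in chunk) / len(chunk)))
    return out

def extract_features(current, voltage, fault_index, fs_hz=1000, f0_hz=50):
    """Return (max rms current, min rms voltage) over the half cycle after the fault."""
    # Assumption: 50 Hz reference cycle, i.e. 20 samples per cycle at 1 kHz.
    samples_per_cycle = int(round(fs_hz / f0_hz))
    half_cycle = samples_per_cycle // 2
    i_rms = running_rms(current, samples_per_cycle)      # step ii
    v_rms = running_rms(voltage, samples_per_cycle)
    i_half = i_rms[fault_index:fault_index + half_cycle]  # step iii
    v_half = v_rms[fault_index:fault_index + half_cycle]
    return max(i_half), min(v_half)                       # step iv

# Hypothetical per-unit signals of one pole: the fault starts at sample 50,
# the current rises and the voltage collapses.
current = [1.0] * 50 + [2.5] * 50
voltage = [1.0] * 50 + [0.4] * 50
i_feat, v_feat = extract_features(current, voltage, fault_index=50)
print(i_feat, v_feat)
```

For the bipolar line, the same function would be applied to the signals of pole 1 and pole 2, giving four input features per fault case.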

Design of Proposed Method
Machine learning is one of the solutions when a conventional or hard computing method fails to deliver a solution for a problem. Machine learning provides an approximate solution to uncertain problems with partial truths. Fault location estimation in transmission lines fits into the area where machine learning can be a solution. Among the various types of regression machine learning methods, some, such as DTR, BGER and BTER, have many advantages over other methods.
Learning methods such as DTR, BGER and BTER first learn from example samples and then predict the outcome when unknown samples are given. Learning based methods have two phases, the training phase and the testing phase. To design the training modules, appropriate features are required. The features given to the training modules for locating the faults are discussed in the above section. Various fault cases are used to design the training network and to test it. The cases used for designing the training network cover 4 fault types (P1G, pole 2 to ground (P2G), pole to pole to ground (P1P2G) and pole to pole (P1P2)), 1099 locations, 40 fault resistances and 5 smoothing reactors. The test cases are cases that are not given in training. The fault location estimation module is designed using three methods, namely DTR, BGER and BTER, which are discussed in the subsections below.

Design of DTR Based Method
DTR is part of the supervised learning algorithms that can solve regression problems. The main aim of DTR is to train and create a model that can predict different values using decision rules. A set of nodes and edges forms the basic structure of a decision tree. There is one root node followed by the internal and leaf nodes. In a decision tree, the foremost challenge lies in attribute selection, or identifying the attribute that would represent the root node and the nodes at each level [23]. The choice of strategic split affects the performance of the tree. Entropy for more than one class is represented in (2). Information gain for an attribute (A) is represented in (3). Intrinsic information about an attribute is given in (4). Gain Ratio is defined in (5), Gini Index is calculated in (6) and the average Gini Index can be written as in (7). The entropy is
$$E(S) = -\sum_{i=1}^{c} p_i \log_2 p_i \qquad (2)$$
where S is the current state, c is the number of classes and p_i is the probability of an event i of the state S.
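The split criteria named above can be illustrated with their standard textbook forms. The sketch below is not taken from the paper: the function names, the representation of a split as a list of label groups, and the toy "near"/"far" labels are assumptions made for illustration only.

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy E(S) = -sum_i p_i log2 p_i over the classes present in `labels`."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gini(labels):
    """Gini index 1 - sum_i p_i^2."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def information_gain(labels, groups):
    """Entropy of the parent minus the size-weighted entropy of the child groups."""
    n = len(labels)
    return entropy(labels) - sum(len(g) / n * entropy(g) for g in groups)

def gain_ratio(labels, groups):
    """Information gain divided by the intrinsic information of the split."""
    n = len(labels)
    intrinsic = -sum(len(g) / n * math.log2(len(g) / n) for g in groups if g)
    return information_gain(labels, groups) / intrinsic if intrinsic > 0 else 0.0

# Hypothetical toy example: a split that separates the two classes perfectly.
labels = ["near", "near", "far", "far"]
split = [["near", "near"], ["far", "far"]]
print(entropy(labels), gini(labels), information_gain(labels, split), gain_ratio(labels, split))
```

A perfect split of two balanced classes removes all of the parent entropy, which is why the information gain equals the parent entropy in this toy case.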
Variance reduction is represented as
$$\text{Variance} = \frac{\sum (X - \bar{X})^2}{n} \qquad (8)$$
where X is the actual value, $\bar{X}$ is the mean and n is the number of values. Mathematically, Chi-Square is defined as
$$\chi^2 = \sum \frac{(O - E)^2}{E} \qquad (9)$$
where O is the observed score and E is the expected score.
The first phase of the DTR method is to design the DTR based training module. The input features and the target, i.e., the fault location corresponding to the input features, are given to the DTR based training network. While designing the training module, it is important to select the optimal values of the parameters that are required to locate faults correctly. The choice of strategic split affects the performance of the decision tree. After various trials, it has been found that the Gini index method gives better results than all other methods. Tree pruning was carried out after designing the DTR, and it has been found that the highest level of the tree is 659. The optimal parameters used for designing the DTR based training network are shown in Table 1. The DTR based training network is tested and the results are discussed in the next section.

Design of BGER Based Method
With the gradual growth and improvement in the field of machine learning, newer methods have come into the spotlight. Among them, one framework that has garnered substantial attention is the ensemble learning framework. The priority of achieving better results has led to the notion of merging models, and recent trends show a renewed interest in the combination of models, mainly in the area of machine learning [23]. Although various works show that ensembles function better for classification problems, this does not prevent them from being used in regression problems [24]. The ensemble learning framework has given rise to new techniques such as boosting and bagging. Generally, these methods vary in how they handle the training data used for learning the base model. Parameter settings also differ in the training of the base model, so that the various ensemble components bring diversity to the framework [25]. Since their successful implementation, ensemble systems have been used to address other machine learning problems such as error correction, feature selection, class imbalanced data, confidence estimation and so on [26].
There are different aggregation methods in ensemble learning, and bagging is one of them, which is used in a predictive model to reduce variance. The bagging ensemble regression (BGER) method is a learning-based method that performs regression by constructing multiple trees at the time of training and returning the mean prediction as output. The problem of over-fitting of the training dataset in decision trees is handled in BGER. Here multiple decision trees are constructed while keeping the variance in control. Deep decision trees over-fit the training dataset by learning extremely irregular patterns, thus having low bias and large variance. Although the bias increases to a very small extent, the high variance is ultimately reduced in BGER, which increases the performance of the end model. In BGER, the stability and accuracy are improved by bagging the learning trees in the training algorithm. Let X = x_1, ..., x_n be the predictor variables and Y = y_1, ..., y_n be the target variables of the training dataset. Bagging is repeated B times: each time, a random sample with replacement of the training dataset is selected and a tree f_b is fitted to the selected sample [27,28]. When the training is completed, values of unseen samples X' are predicted by taking the average of the predictions of every individual regression tree on X', as shown in (10):
$$\hat{f}(X') = \frac{1}{B} \sum_{b=1}^{B} f_b(X') \qquad (10)$$
Here the feature set used in this work is X (current of pole 1, current of pole 2, voltage of pole 1 and voltage of pole 2) and the approximate fault location is the response variable. By bootstrapping, the trees are de-correlated by showing them different training datasets, thus decreasing the variance without increasing the bias, which ultimately boosts the performance of the end model. Figure 7 shows a simple BGER algorithm diagrammatically.
The BGER based method is designed using the input features. In the training or learning process of BGER, a random subset of the p features is used at every candidate split, where p is the total number of features. The parameters used for training the fault location modules with random forest include the number of trees, the quality of split, the impurity split, etc. The error in location has been checked for a varying number of trees. The optimal number of trees chosen to design the training module is 300. The parameters of the best estimator used for training are 300 trees, MSE as the function used to measure the quality of a split, and a minimum impurity split (threshold used for early stopping in tree growth) of $10^{-7}$. The optimal parameters used for designing the BGER based training network are shown in Table 1. The BGER based training network is tested and the results are discussed in the next section.
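The bagging procedure, bootstrap sampling followed by averaging the predictions of the individual trees, can be sketched as follows. This is a minimal illustration rather than the configuration used in the paper: each base learner is a one-split regression tree (a stump) instead of a deep tree, no random feature subset is drawn at the splits, and the two-feature toy data and all function names are hypothetical.

```python
import random

def fit_stump(X, y):
    """Fit a one-split regression tree by minimising the squared error."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [yi for row, yi in zip(X, y) if row[j] <= t]
            right = [yi for row, yi in zip(X, y) if row[j] > t]
            if not left or not right:
                continue
            ml, mr = sum(left) / len(left), sum(right) / len(right)
            sse = sum((v - ml) ** 2 for v in left) + sum((v - mr) ** 2 for v in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, ml, mr)
    if best is None:  # all sampled rows identical: fall back to the mean
        mean = sum(y) / len(y)
        return lambda row: mean
    _, j, t, ml, mr = best
    return lambda row: ml if row[j] <= t else mr

def fit_bagging(X, y, n_estimators=25, seed=0):
    """Train B = n_estimators stumps, each on a bootstrap sample of (X, y)."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        models.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return models

def predict_bagging(models, row):
    """Average the predictions of all base learners, as in Equation (10)."""
    return sum(m(row) for m in models) / len(models)

# Hypothetical toy data: two features that vary with a fault location in km.
X = [[i / 10, 1 - i / 20] for i in range(20)]
y = [50.0 * i for i in range(20)]
models = fit_bagging(X, y)
print(predict_bagging(models, X[0]), predict_bagging(models, X[19]))
```

Because every stump sees a different bootstrap sample, the individual thresholds differ, and averaging them smooths the piecewise-constant prediction of a single tree.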

Design of BTER-based Method
The BTER method helps in converting a group of weak learners into an aggregated strong model. A closely related method that makes the association between boosting and optimization techniques explicit is the gradient boosting machine. The principal step is ensemble generation, in which the set of models is generated. This is followed by the second step, called ensemble pruning, in which redundant models generated in the first step are eliminated or pruned. The final step, ensemble integration, combines the models. This approach is generally used later for obtaining the prediction results of other new problems [23]. Figure 8 shows the simple architecture of the BTER method. The final model can be represented as a sum of all the base learners or predictors, denoted as $ba_n(x)$:
$$f(x) = \sum_{n=1}^{N} ba_n(x) \qquad (11)$$
In forward stage-wise additive modeling, $f(x) \leftarrow 0$ and for n = 1, 2, ..., N,
$$ba_n \leftarrow \arg\min_{ba_n} \sum_{(x,\hat{y})} L\big(\hat{y}, f(x) + ba_n(x)\big) \qquad (12)$$
$$f(x) \leftarrow f(x) + ba_n(x) \qquad (13)$$
The idea is to decrease the loss function L; unless the least square loss is used, the optimization does not reduce to a least square problem.
In gradient boosting, $f_0(x) \leftarrow 0$ and for n = 1, 2, ..., N and for all $(x, \hat{y})$,
$$\delta_n(x) \leftarrow -\frac{\partial L\big(\hat{y}, f_{n-1}(x)\big)}{\partial f_{n-1}(x)} \qquad (14)$$
$$ba_n \leftarrow \arg\min_{ba_n} \sum_{(x,\hat{y})} \big(\delta_n(x) - ba_n(x)\big)^2 \qquad (15)$$
$$f_n(x) \leftarrow f_{n-1}(x) + v\, ba_n(x) \qquad (16)$$
where $\delta_n(x)$ is the negative gradient of the loss at stage n and v is the learning rate.
In this work, the ensemble learning method has been used to locate faults. In the BTER-based method, LSBoost has been used because it has shown better performance in locating faults. A weak learner in the form of a decision tree has been used to create the ensemble. The minimum value of the leaf size has been kept as five, which is the default value for regression. This helps in building deep trees that are ideal for the prognostic power of the ensemble. Another important parameter that is necessary at every split is the number of predictors. This is randomly selected for each split and the default value for regression is one-third of the predictors present. The number of learning cycles has been varied before fixing it at 400 for the trained ensemble module. In addition, the learning rate used in the trained module has been fixed at one. The optimal parameters used for designing the BTER-based training network are shown in Table 1. The BTER-based training network is tested and the results are discussed in the next section.
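Least-squares boosting can be illustrated with a short sketch: each round fits a weak learner to the current residuals (the negative gradient of the squared loss) and adds a scaled copy of it to the running model. This is a simplified illustration and not the LSBoost routine used in the paper: the weak learner is a single-feature stump, and the 50 rounds, the learning rate of 0.5, the toy data and the function names are assumptions chosen for brevity (the paper reports 400 learning cycles and a learning rate of one).

```python
def fit_stump(x, r):
    """One-split regression tree on a single feature, fitted to targets r.

    Assumes x contains at least two distinct values.
    """
    best = None
    for t in sorted(set(x))[:-1]:  # skip the maximum so both sides are non-empty
        left = [ri for xi, ri in zip(x, r) if xi <= t]
        right = [ri for xi, ri in zip(x, r) if xi > t]
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        sse = sum((v - ml) ** 2 for v in left) + sum((v - mr) ** 2 for v in right)
        if best is None or sse < best[0]:
            best = (sse, t, ml, mr)
    _, t, ml, mr = best
    return lambda xi: ml if xi <= t else mr

def fit_ls_boost(x, y, n_rounds=50, learning_rate=0.5):
    """Boost stumps on the residuals of the squared loss, starting from f_0(x) = 0."""
    learners, mse_history = [], []
    pred = [0.0] * len(y)
    for _ in range(n_rounds):
        residual = [yi - pi for yi, pi in zip(y, pred)]  # negative gradient of (y - f)^2 / 2
        h = fit_stump(x, residual)
        learners.append(h)
        pred = [pi + learning_rate * h(xi) for xi, pi in zip(x, pred)]
        mse_history.append(sum((yi - pi) ** 2 for yi, pi in zip(y, pred)) / len(y))
    return {"learners": learners, "learning_rate": learning_rate, "mse": mse_history}

def predict_ls_boost(model, xi):
    """Sum the scaled contributions of all weak learners."""
    return sum(model["learning_rate"] * h(xi) for h in model["learners"])

# Hypothetical toy data: one feature that grows with the fault location in km.
x = [float(i) for i in range(20)]
y = [50.0 * xi for xi in x]
model = fit_ls_boost(x, y)
print(model["mse"][0], model["mse"][-1])
```

The training error recorded in `model["mse"]` shrinks as rounds are added, which reflects how each new learner corrects the errors left by the previous ones.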

Implementation of Proposed Method
The proposed method is a fault location estimation method that can be implemented offline, as shown in Figure 9. It consists of two parts: creating a trained fault location module, and testing faulted signals against that module when a fault occurs to estimate the fault location. The training module is designed by simulating the power system network and recording the signals of the fault cases. The signals are then processed and features are extracted. The extracted features are given to the BTER method along with the target (the known fault location) to build the training module. The parameters of the BTER method are varied and the optimal training module is selected based on accuracy. This training module is stored and used later when a fault occurs in the system. When a fault occurs in the HVDC transmission line, the signals are measured at the rectifier end and imported to the computer where the training module resides. The signals are then processed and the input features are extracted. The input features are tested against the trained BTER module and the fault location is estimated. The repair crew can then be sent to that location to repair the line.
Figure 9. Implementation of proposed method.
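The two-stage workflow described above (offline training, then testing a new fault record against the stored module) can be sketched as follows. This is a minimal, self-contained illustration of least-squares boosting with one-split regression trees (stumps), not the BTER configuration used in this work. The feature model, the 800 km line length, the number of rounds and the learning rate are all hypothetical values chosen only to make the example run.

```python
from typing import List, Tuple

Features = Tuple[float, float]           # (post-fault current peak, post-fault voltage minimum)
Stump = Tuple[int, float, float, float]  # (feature index, threshold, left value, right value)


def synthetic_features(distance_km: float) -> Features:
    """Hypothetical monotonic feature model; NOT derived from the paper's simulations."""
    current_peak = 5.0 / (1.0 + distance_km / 200.0)
    voltage_min = 0.2 + 0.0008 * distance_km
    return (current_peak, voltage_min)


def fit_stump(X: List[Features], residuals: List[float]) -> Stump:
    """Pick the single split that minimizes the squared error of the residuals."""
    mean_res = sum(residuals) / len(residuals)
    best_sse = float("inf")
    best: Stump = (0, float("inf"), mean_res, mean_res)  # fallback: no split found
    for j in range(len(X[0])):
        values = sorted(set(x[j] for x in X))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0
            left = [r for x, r in zip(X, residuals) if x[j] <= thr]
            right = [r for x, r in zip(X, residuals) if x[j] > thr]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if sse < best_sse:
                best_sse, best = sse, (j, thr, lm, rm)
    return best


def stump_output(stump: Stump, x: Features) -> float:
    j, thr, left_value, right_value = stump
    return left_value if x[j] <= thr else right_value


def train_boosting(X: List[Features], y: List[float], n_rounds: int = 100,
                   learning_rate: float = 0.1):
    """Offline stage: each new stump is fitted to the errors left by the previous ones."""
    base = sum(y) / len(y)
    predictions = [base] * len(y)
    stumps: List[Stump] = []
    for _ in range(n_rounds):
        residuals = [t - p for t, p in zip(y, predictions)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * stump_output(stump, x)
                       for p, x in zip(predictions, X)]
    return base, learning_rate, stumps


def estimate_location(model, x: Features) -> float:
    """Online stage: combine the outputs of all stumps for one new fault record."""
    base, learning_rate, stumps = model
    return base + learning_rate * sum(stump_output(s, x) for s in stumps)


def rmse(errors: List[float]) -> float:
    return (sum(e * e for e in errors) / len(errors)) ** 0.5


# Offline: simulated fault cases every 20 km on a hypothetical 800 km line.
train_distances = [float(d) for d in range(20, 800, 20)]
X_train = [synthetic_features(d) for d in train_distances]
model = train_boosting(X_train, train_distances)

# Compare the boosted model against the constant (mean) predictor it started from.
baseline_rmse = rmse([d - model[0] for d in train_distances])
boosted_rmse = rmse([d - estimate_location(model, x)
                     for d, x in zip(train_distances, X_train)])

# Online: a new record (hypothetical fault at 410 km) is tested against the stored model.
estimated_km = estimate_location(model, synthetic_features(410.0))
print(f"baseline RMSE {baseline_rmse:.1f} km, boosted RMSE {boosted_rmse:.1f} km")
print(f"estimated location for the new record: {estimated_km:.1f} km")
```

Keeping the trained model as a plain tuple mirrors the idea of storing the training module once and reusing it whenever a fault is recorded; in practice the stored object would be whatever the chosen library serializes.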

Results
To analyze its accuracy, the proposed method is validated with various fault cases. The fault cases are generated by varying different system parameters and are then tested. The estimated fault locations obtained from testing are analyzed. The percentage error is calculated using (17).
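As an illustration of a percentage error calculation, the sketch below assumes the definition commonly used in fault location studies: the absolute difference between the actual and estimated location, normalized by the total line length. The function name and the 800 km line length are hypothetical and are not taken from this work.

```python
def percentage_error(actual_km: float, estimated_km: float, line_length_km: float) -> float:
    """Absolute location error as a percentage of total line length (assumed definition)."""
    if line_length_km <= 0:
        raise ValueError("line_length_km must be positive")
    return 100.0 * abs(actual_km - estimated_km) / line_length_km


# Hypothetical case: a fault at 400 km estimated at 403 km on an 800 km line.
example_error = percentage_error(400.0, 403.0, 800.0)
print(f"{example_error:.3f} %")
```

Under this assumed definition, a 3 km location error on an 800 km line corresponds to 0.375%.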

Fault Resistance
When a fault occurs, the fault current may pass through a high resistance medium, and the fault resistance is unknown. The fault location algorithm should therefore locate the fault irrespective of the fault resistance. Table 2 shows the results for faults occurring at different levels of resistance. The DTR-based method has more error in location estimation than the other two methods, and the BTER method has less error than the BGER method. Across the three methods, the BTER method has a maximum error of up to 3 km, while the other two methods may have a maximum error of up to 10 km during high resistance faults. Figure 10 shows the estimated fault location for a P1G fault with a fault resistance of 100 Ω. Hence, the BTER method is better than the BGER and DTR methods for locating faults in HVDC transmission lines.

Far End and Near End Faults
Conventional fault location schemes are not capable of locating faults near a boundary. Faults may occur near the relay location or at the far end of the line. A few results for boundary faults are given in Table 3, which shows that the locations of near and far boundary faults are estimated correctly. The BTER method has less error in locating the fault than the other two methods, and its percentage error is within 1%. It can be concluded that the BTER method is unaffected by boundary faults.

Smoothing Reactors
Smoothing reactors are connected in series with the converters to smooth the DC currents. As the fault location algorithm in the proposed work uses current signals, smoothing reactors may affect its accuracy. Table 4 shows the results of all three methods with the percentage error. The error is within 1% for all three methods, but the BTER method has less error than the other two.

Sampling Frequency
Sampling frequency affects the accuracy of fault location in DC transmission lines. In this work, the proposed method has been tried with various sampling frequencies to choose the optimal one for location estimation. Figure 11 shows the performance of all three methods in locating faults at various sampling frequencies. Figure 11a shows the performance of the DTR-based method: fewer fault cases fall in the <1% error range at 500 Hz than at the other sampling frequencies, while all the other sampling frequencies yield almost the same number of fault cases in the <1% error range. Hence, the lowest of these sampling frequencies, i.e., 1 kHz, should be chosen, as it is easier to realize. The same behavior was observed for the BGER and BTER methods, as shown in Figure 11b and Figure 11c, respectively.
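To illustrate why the sampling frequency matters for the extracted features, the sketch below counts how many samples fall inside a half cycle post-fault window and takes the current maximum and voltage minimum over that window. It assumes a 50 Hz nominal frequency (so a half cycle is 10 ms) and signals that have already been converted to RMS values. The function names and all sample values are hypothetical.

```python
from typing import Sequence, Tuple


def half_cycle_samples(sampling_hz: float, nominal_hz: float = 50.0) -> int:
    """Number of samples in one half cycle of the nominal frequency (assumed 50 Hz)."""
    return max(1, int(round(sampling_hz / (2.0 * nominal_hz))))


def extract_features(current_rms: Sequence[float], voltage_rms: Sequence[float],
                     fault_index: int, sampling_hz: float) -> Tuple[float, float]:
    """Maximum current and minimum voltage over the half cycle following the fault."""
    n = half_cycle_samples(sampling_hz)
    window = slice(fault_index, fault_index + n)
    return max(current_rms[window]), min(voltage_rms[window])


# Window length at a few candidate sampling frequencies.
for fs in (500, 1000, 2000, 5000):
    print(f"{fs} Hz -> {half_cycle_samples(fs)} samples per half cycle")

# Hypothetical 1 kHz record with the fault starting at sample 5.
current = [1.0] * 5 + [1.5, 2.4, 3.1, 2.8, 2.2, 1.9, 1.7, 1.6, 1.5, 1.5] + [1.4] * 5
voltage = [1.0] * 5 + [0.8, 0.5, 0.3, 0.35, 0.4, 0.45, 0.5, 0.55, 0.6, 0.6] + [0.65] * 5
peak_current, min_voltage = extract_features(current, voltage, fault_index=5, sampling_hz=1000)
print(peak_current, min_voltage)
```

Under these assumptions a 500 Hz record leaves only five samples in the window, so the true peak or dip is more likely to fall between samples, which is one plausible reason for reduced accuracy at that rate.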

Single Pole to Ground Fault
The most commonly occurring faults are pole to ground faults. The method has been checked with varying fault locations during P1G and P2G faults. Table 5 shows the results of the tested fault cases for all three methods; the percentage error in the location of the tested fault cases is within 1%. Figure 12 shows the performance of all three methods. The BTER method has less error in locating the fault than the DTR and BGER methods. Hence, the proposed method estimates the location of single pole to ground faults accurately.

Double Pole to Ground Fault
In DC lines, double pole to ground faults occur less frequently but still need to be located when they occur. Table 6 shows the results of different cases during double pole to ground faults. The results show that the percentage error is within 1%. The BTER method has less error in locating faults than the BGER and DTR methods. Hence, the proposed fault location estimation method is capable of locating P1P2G faults accurately.

Pole to Pole Fault
In DC lines, pole to pole faults are not common but are still a possibility. The proposed method has been tested during pole to pole (P1P2) faults. Table 7 shows the results of different tested fault cases for P1P2 faults. The percentage error in location is within 1% for all three methods. However, the BTER method has better results in locating the fault than the other two methods. Hence, the fault location estimation method locates P1P2 faults accurately.

Discussion
In most real HVDC power networks, fault location is estimated using traveling waves. In the traveling wave method, the fault location is computed from the time taken by the reflected and refracted wave fronts multiplied by the wave propagation speed, which is close to the speed of light on overhead lines. However, this method suffers from serious drawbacks, as a smooth earth surface cannot always be assumed. The earth surface is not uniform at all locations, so the time taken by the reflected wave front cannot be assumed to be proportional to the distance. In addition, the severity of the fault is not the same if the faulted conductor has more bending or irregularity during a ground fault; in this instance too, the time taken by the reflected wave front will differ. For various fault impedances at the same location, the reflected wave front may have more secondary wave fronts. Hence, the propagation time to reach the measuring station cannot be the same. The refracted wave front time is also considered for fault location, but this wave front may not reach the measuring station: if the conductor experiences a fault and the medium is different, the wave front will travel in the different dielectric medium it strikes and cannot be refracted back into the medium from which it came. So, estimating the fault location from this parameter leads to inconsistency.
A reflected wave front formed after a fault, followed by each new wave front that is formed, acts as a secondary source of disturbances, and this process continues successively. However, some wavelets will be coherent and some incoherent, and the resultant wave interference will be destructive or constructive depending on the fault; the sampling of data on these parameters is neglected. In diffraction, when waves strike the earth or an obstacle in the path, they bend near the corner and a fringe pattern is formed, resulting in new wave fronts. The path difference under these circumstances will not be the same, so high-rate sampling of data is not reliable. The surface of the earth and the nature of the obstacle are very irregular, and under these circumstances a conventional method does not determine the fault location accurately.
Various AI methods have been implemented in transmission line fault location, such as artificial neural network [29,30], support vector machine [29], k-nearest neighbor [31], decision tree [29], random forest (bagging ensemble) [30], boosting ensemble [30], deep learning [14] and fuzzy [12]. Artificial neural networks have no specific rule for determining the structure of the network; a suitable network is found through trial and error. The support vector machine algorithm is not suitable for large data sets, and it does not perform well when the data set is noisy. The accuracy of k-nearest neighbor depends on the quality of the data, and its prediction stage might be slow for large data sets. It requires a large amount of memory to store all the training data and can be computationally expensive. In a decision tree, small changes in the data cause large changes in the structure of the tree, which causes instability, so it may suffer from over-fitting. The over-fitting problem of the decision tree has been addressed by using random forest and boosting ensembles. Deep learning methods have no specific rule for determining the structure of the network and have a high hardware implementation cost. Although the fuzzy-based method is simple, it is unable to locate high resistance faults. Although ensemble learning has been applied to AC transmission lines, this does not guarantee that it will work for DC transmission lines, as AC and DC signals are different. Hence, the ensemble learning method has been used in this work for locating DC transmission line faults, considering its simplicity and robustness. All the methods implemented in this work have been compared in terms of the percentage error obtained. A total of nearly 10,000 fault cases are used for testing the fault location estimation algorithms designed for HVDC transmission lines. The percentage error and the percentage of fault cases in each error range have been calculated and compared, as shown in Figure 13.
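The per-range comparison described above (the percentage of fault cases falling in each error range) can be sketched as follows. The range edges of 1% and 2% and the sample error values are hypothetical and are not the ranges or results of Figure 13.

```python
from typing import Dict, Sequence


def share_per_error_range(errors_percent: Sequence[float],
                          edges: Sequence[float]) -> Dict[str, float]:
    """Percentage of fault cases whose error falls in each range defined by `edges`."""
    labels = [f"<{edges[0]}%"]
    labels += [f"{lo}-{hi}%" for lo, hi in zip(edges, edges[1:])]
    labels.append(f">={edges[-1]}%")
    counts = [0] * len(labels)
    for err in errors_percent:
        # The bin index equals the number of edges the error has reached or passed.
        idx = sum(1 for edge in edges if err >= edge)
        counts[idx] += 1
    total = len(errors_percent)
    return {label: 100.0 * count / total for label, count in zip(labels, counts)}


# Hypothetical percentage errors for five tested fault cases of one method.
sample_errors = [0.1, 0.4, 0.9, 1.5, 2.5]
shares = share_per_error_range(sample_errors, edges=[1.0, 2.0])
print(shares)
```

Running the same function on the error lists of each method gives directly comparable shares per range, which is the kind of summary a grouped bar chart would display.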
The BTER method is more accurate in locating faults than the BGER- and DTR-based methods. Hence, the BTER-based method can be used in HVDC transmission lines for locating faults correctly. Compared with conventional methods, the method suggested in this work is superior for fault location and has many advantages. In particular, the suggested BTER method uses extracted features that are not time dependent. It depends on the pattern formed after a fault and on averaging the maximum intensity signal and minimum intensity signal that result from the interference of the different waves formed. The methods used in this work are learning methods, which are capable of adapting to various scenarios. Hence, the BTER method can locate faults effectively in both natural and adverse conditions.
In recent years, various methods have been proposed for locating faults. Table 8 compares them based on different criteria. In [1], a multiple signal classification method has been used to obtain the natural frequency, and the traveling-wave velocity and reflection coefficient of the natural frequency have been used to locate faults. However, the sampling rate is very high, which is not easy to realize. In [3], the natural frequency of the VSC-HVDC transmission line is related to the fault distance and used to locate the fault, but this also needs a high sampling rate. In [5], the ratio coefficient of wave velocities between different frequencies and a digital band-pass filter have been used. A drawback of this method is that it requires a communication link and a high sampling rate. In [7], both end measurements of the faulted line have been time-reversed and further processed to identify the fault. This also requires a communication link. A drawback of the method suggested in [12] is that it is unable to locate high resistance faults.
In [13], the rectifier side AC RMS voltage and the DC voltage and current on both poles have been used to detect the fault. However, the effect of fault resistance has not been studied in that work. In [14], faults are detected, poles are identified, and faults are located using one end measurement, but this has a high hardware implementation cost. Some of the methods use traveling waves for fault location estimation. A high sampling frequency, which is difficult to implement, is the drawback of the traveling wave method, while the proposed method works with a sampling frequency of only 1 kHz. Some methods are not capable of locating faults with high fault resistance. Some overlooked the reach setting, which is also important. Some are designed for monopolar transmission lines and may or may not work for bipolar transmission lines. Table 8 shows that the proposed method uses only one end measurement, has a low sampling frequency, has a higher reach setting, and locates high resistance faults. Hence, the proposed BTER-based method is more suitable for fault location estimation.

Conclusions
A novel method based on BTER has been proposed for locating faults in DC lines. The BTER-based method has been tested with the CIGRE benchmark model using various fault cases. The results are promising, and the scheme can be implemented in real systems. Considering the growing amount of data in recent years, the proposed method will be a good tool for locating faults. The BTER method uses post-fault samples, a low sampling rate, and signals from one end only. The proposed BTER-based fault location estimation method has many advantages: i. The fault location error is within 1%; ii. The reach setting is 99.9%; iii. The sampling frequency is 1 kHz; iv. No communication link is needed; v. The method is simple, robust and reliable.
The proposed fault location estimation method is an offline procedure. Hence, it can be implemented easily with a PC setup once the measurements from the substation are directed to the PC. The signal processing can be carried out in any available open source software, such as Python. After the features are obtained, the BTER-based fault location method can be implemented using the scikit-learn library in Python to estimate the location of faults.
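A minimal sketch of such a scikit-learn implementation is given below. `GradientBoostingRegressor` is used as one boosted-tree regressor available in scikit-learn; whether it matches the BTER settings of this work is an assumption. The synthetic feature model, the 800 km line length and the hyperparameters are hypothetical stand-ins for the simulated fault cases.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Hypothetical training set: fault distances every 5 km on an assumed 800 km line,
# with two synthetic features standing in for the extracted current and voltage features.
distances_km = np.arange(5.0, 800.0, 5.0)
features = np.column_stack([
    5.0 / (1.0 + distances_km / 200.0),  # stand-in for the post-fault current maximum
    0.2 + 0.0008 * distances_km,         # stand-in for the post-fault voltage minimum
])

# Boosted regression trees; these hyperparameters are illustrative, not tuned values.
model = GradientBoostingRegressor(
    n_estimators=200, learning_rate=0.1, max_depth=3, random_state=0
)
model.fit(features, distances_km)

# Estimate the location for a new record (hypothetical fault at 402.5 km).
new_distance = 402.5
new_record = np.array([[5.0 / (1.0 + new_distance / 200.0), 0.2 + 0.0008 * new_distance]])
estimated_km = float(model.predict(new_record)[0])
print(f"estimated fault location: {estimated_km:.1f} km")
```

In an actual deployment the feature matrix would come from the processed simulation records, and the fitted model would be saved once and reloaded on the PC whenever a new fault record arrives.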

Conflicts of Interest:
The authors declare no conflict of interest.