Effective Electricity Theft Detection in Power Distribution Grids Using an Adaptive Neuro Fuzzy Inference System

Abstract: Electric power grids are a crucial infrastructure for the proper operation of any country and must be protected from various threats. Detection of illegal electricity consumption is a crucial issue for distribution system operators (DSOs). Minimizing non-technical losses is a challenging task for the smooth operation of the electrical power system, in order to increase the revenue of the electricity provider and of the nation and to enhance the reliability of the electrical power grid. The widespread adoption of smart meters enables a large volume of electricity consumption data to be collected, and new artificial intelligence technologies can take advantage of these data to solve the problem of power theft more efficiently. In this study, a robust artificial intelligence algorithm, the adaptive neuro fuzzy inference system (ANFIS), which has applications in many areas, is presented in brief and applied to achieve more effective detection of electric power theft. To the best of our knowledge, there are no studies yet that involve the application of ANFIS for the detection of power theft. It is shown that the proposed technique, if applied properly, can achieve very high success rates in various cases of fraudulent activities originating from unauthorized energy


Introduction
Nowadays, societies are increasingly dependent on electricity due to the depletion of fossil fuels and the revolutionary shift to electric mobility. Electricity losses occur naturally during the operation of the electrical network, but a vast amount of the losses is caused by electricity theft, mainly in the power distribution network. Detection of power theft is therefore an important issue worldwide, in order to preserve the reliability and the profitable operation of electricity distribution grids. Although power theft percentages are very small in industrialized countries, in absolute terms they correspond to a considerable amount of electrical energy that is not billed [1][2][3].
Technical losses are an inherent consequence of the operation of any electricity distribution network and arise as power flows through equipment such as cables, overhead lines and transformers. Technical losses are also related to low power quality caused by voltage variations, frequency fluctuations and power fluctuations, which in turn result from short-term demand events, long-term demand events or seasonal power variations in general [4].
Non-technical power grid losses (NTLs) are defined as the energy that is distributed but not billed, mainly due to illegal actions external to the power system and to conditions that technical losses

• The adaptive neuro fuzzy inference system (ANFIS) is proposed and applied for the first time to power theft detection in a local low-voltage power distribution network;
• Thirteen different scenarios that could occur in the real world are established and presented analytically, in order to justify their importance for the proper operation of the power distribution network, and they are used in the simulated case studies discussed in the following sections;
• High success rates in power theft detection were achieved for most of these realistic power theft scenarios;
• The ANFIS is proposed and implemented, and it achieved great success in classifying residential energy consumption patterns as legal or illegal.
The rest of this study is organized as follows: Section 2 (the proposed machine learning model framework) describes in brief the ANFIS algorithm used, presents the adopted power theft scenarios and the steps followed for the ANFIS algorithm applications in the simulated case studies for the theft

Layer 1: Every node $i$ in this layer is an adaptive node with a node function defined as follows:

$$O_{1,i} = \mu_{A_i}(x), \quad i = 1, 2; \qquad O_{1,i} = \mu_{B_{i-2}}(y), \quad i = 3, 4 \tag{1}$$

where $x$ (or $y$) is the input to node $i$ and $A_i$ (or $B_{i-2}$) is a linguistic variable associated with this node. Here the membership function for $A_i$ can be any appropriate parameterized membership function [32].

Layer 2: Every node in this layer is a fixed node labeled Π, whose output is the product of all the incoming signals. It represents the firing strength of a rule or, in other words, performs the fuzzy AND operation:

$$O_{2,i} = w_i = \mu_{A_i}(x)\,\mu_{B_i}(y), \quad i = 1, 2 \tag{2}$$

Layer 3: Every node in this layer is a fixed node labeled N. Node $i$ calculates the ratio of the $i$-th rule's firing strength to the sum of all rules' firing strengths. The output of this layer is called the normalized firing strength:

$$O_{3,i} = \bar{w}_i = \frac{w_i}{w_1 + w_2}, \quad i = 1, 2 \tag{3}$$

Layer 4: Every node $i$ in this layer is an adaptive node with a node function defined as:

$$O_{4,i} = \bar{w}_i f_i = \bar{w}_i \,(p_i x + q_i y + r_i) \tag{4}$$

where $\bar{w}_i$ is the normalized firing strength from layer 3 and $(p_i, q_i, r_i)$ is the parameter set of this node. The parameters in this layer are referred to as consequent parameters.

Layer 5: The single node in this layer is a fixed node, which computes the overall output as the summation of all the incoming signals:

$$O_{5} = \sum_i \bar{w}_i f_i = \frac{\sum_i w_i f_i}{\sum_i w_i} \tag{5}$$

The training procedure of the network is carried out in two phases. During the forward propagation of the signals (from the input to the output), the premise parameters, i.e., the membership function parameters of layer 1, remain unchanged and the least-squares method (LSM) estimates the consequent parameters $(p_i, q_i, r_i)$. During back propagation (from the output to the input), the consequent parameters are held fixed and the premise parameters are updated using the gradient descent method (GDM) [30].
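The five layers and the forward half of the hybrid training can be illustrated with a minimal sketch. It assumes a two-input, two-rule first-order Sugeno system with Gaussian membership functions; the Gaussian shape, the function names `gaussian_mf`, `anfis_forward` and `fit_consequents`, and the data layout are illustrative assumptions, not the configuration used in this study.

```python
import numpy as np

def gaussian_mf(v, center, sigma):
    """Gaussian membership function (an assumed choice; any parameterized MF could be used)."""
    return np.exp(-0.5 * ((v - center) / sigma) ** 2)

def anfis_forward(x, y, premise, consequent):
    """Forward pass of a two-input, two-rule first-order Sugeno ANFIS.

    premise: dict with keys "A" and "B", each a list of two (center, sigma) pairs.
    consequent: array of shape (2, 3) holding (p_i, q_i, r_i) for each rule.
    Returns (overall output, normalized firing strengths).
    """
    # Layer 1: membership degrees of x in A_1, A_2 and of y in B_1, B_2
    mu_a = np.array([gaussian_mf(x, c, s) for c, s in premise["A"]])
    mu_b = np.array([gaussian_mf(y, c, s) for c, s in premise["B"]])
    # Layer 2: firing strengths w_i (fuzzy AND implemented as a product)
    w = mu_a * mu_b
    # Layer 3: normalized firing strengths
    w_bar = w / w.sum()
    # Layer 4: rule outputs f_i = p_i x + q_i y + r_i, weighted by w_bar_i
    f = np.asarray(consequent) @ np.array([x, y, 1.0])
    # Layer 5: overall output is the sum of the incoming signals
    return float(np.sum(w_bar * f)), w_bar

def fit_consequents(X, Y, targets, premise):
    """Least-squares estimate of the consequent parameters with premise parameters fixed."""
    rows = []
    for x, y in zip(X, Y):
        _, w_bar = anfis_forward(x, y, premise, np.zeros((2, 3)))
        # The output is linear in (p_i, q_i, r_i), so each sample gives one linear equation
        rows.append(np.concatenate([w_bar[i] * np.array([x, y, 1.0]) for i in range(2)]))
    theta, *_ = np.linalg.lstsq(np.array(rows), np.asarray(targets, dtype=float), rcond=None)
    return theta.reshape(2, 3)
```

Because the network output is linear in the consequent parameters once the membership functions are fixed, the forward phase reduces to an ordinary least-squares problem; the gradient descent update of the premise parameters is omitted from this sketch.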

Power Theft Scenarios
The consumers' electricity consumption pattern may differ from the legal consumption due to several factors. These factors can be temporary, periodic or permanent consumption changes, which are related mainly to power theft occurrences. For the purposes of the present research, three basic power theft scenarios with respect to legal consumption (Normal) were considered, as shown in Figure 2, which are very close to reality [25]. These are: (a) consumers with a smart meter stealing a part of their overall electricity consumption (Partial theft), possibly through a bypass before the smart meter; (b) consumers with abruptly increased consumption of electricity (Overload), possibly due to illegal activity or power delivery to a building without legal authorization for power supply; (c) consumers with a smart meter stealing a part of the supplied electricity during specific hours of the day in which a high demand for electric power occurs (Periodic theft).

Furthermore, different percentages of power theft for the three basic scenarios and their combinations were taken into account, because different types of power theft are recorded in an electricity provider's database [25,33,34]. The percentages for partial power theft ($PT_i$) range from 10% to 90% of the overall consumption. For abruptly increased consumption ($O_k$) the percentages range from 20% to 100% of the overall consumption. For partial power theft at specific times of the day ($PD_m$), two scenarios were considered, with 80% power theft in time periods during which the consumers have a large amount of consumption. These periods are 12:00 to 15:00 and 20:00 to 22:00 for $PD_1$, and 12:00 to 15:00 for $PD_2$. The scenario ($M_n$) is a combination of the above power theft scenarios, as summarized in Table 1 below.
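The three basic scenarios can be sketched as transformations of a half-hourly consumption series. The exact formulas used in the study are not given in the text, so the multiplicative forms below, the function names and the assumption that each day starts at 00:00 with 48 readings are all illustrative.

```python
import numpy as np

SLOTS_PER_DAY = 48  # 30-min readings per day, matching the logging interval of the dataset

def partial_theft(load, fraction):
    """Partial theft: the meter records only (1 - fraction) of the true consumption (assumed form)."""
    return np.asarray(load, dtype=float) * (1.0 - fraction)

def overload(load, fraction):
    """Overload: consumption abruptly increased by `fraction` (assumed multiplicative form)."""
    return np.asarray(load, dtype=float) * (1.0 + fraction)

def periodic_theft(load, windows, fraction=0.8):
    """Periodic theft: reduce recorded consumption by `fraction` inside daily hour windows.

    load: 1-D array whose length is a multiple of SLOTS_PER_DAY, starting at 00:00.
    windows: list of (start_hour, end_hour) pairs, end exclusive.
    """
    # Work on a copy shaped (days, slots) so the original series is left untouched
    days = np.asarray(load, dtype=float).reshape(-1, SLOTS_PER_DAY).copy()
    for start, end in windows:
        days[:, int(start * 2):int(end * 2)] *= (1.0 - fraction)
    return days.reshape(-1)

# Daily windows described in the text for the two periodic scenarios
PD1_WINDOWS = [(12, 15), (20, 22)]
PD2_WINDOWS = [(12, 15)]
```

For example, `periodic_theft(load, PD1_WINDOWS)` scales the ten half-hour slots between 12:00 and 15:00 and between 20:00 and 22:00 of every day to 20% of their original value, while a mixed scenario could be sketched by composing these functions.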

Electricity Consumption Data Preprocess and ANFIS Configuration
The dataset used is based on the real smart metering data of approximately 5000 Irish households monitored for one and a half years [35]. After data preprocessing, 3273 consumers were chosen for the experiments. The consumption pattern ($C_p$) of each consumer consists of the electricity consumption data logged at 30 min intervals. This dataset resulted from an electricity consumer behavior trial scheduled and performed by the Irish Commission for Energy Regulation (CER). For this reason, and without loss of generality, these energy consumption data were initially considered to contain no power theft incidents [25,36].
For the purposes of the present research, a random selection of 1000 consumers from the total of 3273 was considered as illegal consumers, and consequently the power theft scenarios were applied to their electricity consumption data in order to generate the power theft cases listed in Table 1 above. As a result, thirteen datasets were constructed, one for each of the power theft scenarios of Table 1.

The criterion for selecting the appropriate ANFIS parameters was the minimization of the misclassification error. Trial-and-error runs over the type and the number of membership functions (MFs) were performed in order to select the optimal parameters for each applied ANFIS model $j$, $j = 1, 2, \ldots, 13$, corresponding to each scenario of Table 1; the selected parameters are shown in Table 2.

Because 10-fold cross validation is used for the validation of the ANFIS classifier, the consumer dataset is randomly divided into 10 parts [37]. Thus, 10 random subdatasets of the total 3273 consumers are created, with approximately 327 consumers in each subdataset.

Energies 2020, 13, 3110

In Figure 3 the procedure followed for the detection of fraudulent consumers is presented in the form of a block diagram. For the whole consumption record (one and a half years) of each of the 3273 consumers, a number of common classification features are calculated and tested. These features are the mean, the median, the skewness, the entropy, the variance, the standard deviation, the kurtosis, the energy and the load factor of the data, the calculation formulas of which are presented analytically in Table 3 below.
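As an illustration of the feature extraction step, the nine candidate features can be computed from one consumer's series as sketched below. The definitions in Table 3 are not reproduced in this text, so the formulas here are common textbook choices and should be read as assumptions: entropy as the Shannon entropy of a histogram, energy as the sum of squared readings, and load factor as mean over peak. The function name `consumption_features` and the `bins` parameter are hypothetical.

```python
import numpy as np

def consumption_features(load, bins=10):
    """Compute candidate classification features for one consumer's consumption series.

    Assumes a non-constant series (a constant series has zero standard deviation,
    which makes skewness and kurtosis undefined).
    """
    load = np.asarray(load, dtype=float)
    mean = load.mean()
    std = load.std()          # population standard deviation
    centered = load - mean
    # Shannon entropy of the empirical distribution of readings (assumed definition)
    counts, _ = np.histogram(load, bins=bins)
    p = counts[counts > 0] / counts.sum()
    return {
        "mean": mean,
        "median": np.median(load),
        "variance": load.var(),
        "std": std,
        "skewness": np.mean(centered ** 3) / std ** 3,
        "kurtosis": np.mean(centered ** 4) / std ** 4,
        "entropy": -np.sum(p * np.log2(p)),
        "energy": np.sum(load ** 2),       # assumed: sum of squared readings
        "load_factor": mean / load.max(),  # assumed: average demand over peak demand
    }
```

Stacking the resulting dictionaries for all consumers gives a feature matrix with one row per consumer, which is the kind of input a feature-ranking step would operate on.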
Table 3. Definition of the classification features necessary to be extracted from the electricity consumption data. (Columns: Features, Definition.)

In order to select the appropriate number and type of features that maximize the classification performance, the well-known neighborhood component analysis (NCA) is used [32]. Applying feature selection and ranking the features by importance yields four top-scoring features as inputs, i.e., the mean, the median, the entropy and the load factor of the data, as shown in Figure 4.

Afterwards, the datasets for each power theft scenario, divided randomly into 10 subdata matrices (using the aforementioned 10-fold cross validation method), were fed into each ANFIS$_j$ model for the analysis and the classification of the consumers as legal or illegal.
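The random division into 10 subdatasets can be sketched as follows. This is a plain shuffled k-fold split written with NumPy; the function name `ten_fold_indices` and the fixed seed are illustrative assumptions, and whether the study stratified the folds by class is not stated in the text.

```python
import numpy as np

def ten_fold_indices(n_consumers, n_folds=10, seed=0):
    """Yield (train_idx, test_idx) pairs for a shuffled k-fold split of consumer indices."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(n_consumers)
    # array_split tolerates sizes that are not a multiple of n_folds
    folds = np.array_split(shuffled, n_folds)
    for k in range(n_folds):
        test_idx = folds[k]
        train_idx = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        yield train_idx, test_idx
```

With 3273 consumers the folds cannot all be equal, so the split produces subdatasets of 327 or 328 consumers, consistent with the approximate figure of 327 quoted above. Each fold serves once as the test set while the remaining nine are used for training.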

Performance and Discussion
In this section, the performance of the conducted simulations of the thirteen power theft scenarios of Table 1 is discussed in order to assess the robustness of the proposed ANFIS algorithm.

[Figures 5-7]

The classification performance metrics used for the evaluation of the results are the accuracy (ACC), the F1 score, the precision or positive predictive value (PPV), the recall or true positive rate (TPR), the specificity and the area under curve (AUC), which are defined by Equations (6)-(11):

$$\mathrm{ACC} = \frac{TP + TN}{TP + TN + FP + FN} \tag{6}$$

$$F_1 = \frac{2 \cdot \mathrm{PPV} \cdot \mathrm{TPR}}{\mathrm{PPV} + \mathrm{TPR}} \tag{7}$$

$$\mathrm{PPV} = \frac{TP}{TP + FP} \tag{8}$$

$$\mathrm{TPR} = \frac{TP}{TP + FN} \tag{9}$$

$$\mathrm{Specificity} = \frac{TN}{TN + FP} \tag{10}$$

$$\mathrm{AUC} = \int_0^1 \mathrm{TPR} \; d(\mathrm{FPR}), \qquad \mathrm{FPR} = \frac{FP}{FP + TN} \tag{11}$$

where TP, TN, FP and FN denote the numbers of true positives, true negatives, false positives and false negatives, respectively.

In Figures 8-11 the evaluation metrics (accuracy, F1, precision, recall) are shown graphically for the cases of partial power theft of the overall consumption ($PT_i$ scenarios of Table 1) and of overload power theft for different power theft percentages ($O_k$ scenarios of Table 1).

From the above figures it is clear that the percentages of power theft detection are generally very high. Moreover, it is worth noting that the ANFIS algorithm has equally good results for mixed scenarios ($M_n$ scenarios of Table 1) and for periodic power theft scenarios ($PD_m$ scenarios of Table 1), as shown in Figure 12 and Table 4. In Figure 12 and in Table 4 the AUC metric and the remaining performance metrics are presented for the power theft incidents of Table 1, in comparison with the results of the support-vector machine (SVM) [38] and of the radial-basis-function neural network (RBF) [39], respectively. In order to obtain comparable results, the SVM and RBF classifiers are trained and evaluated with exactly the same dataset, the same power theft scenarios and the same 10-fold cross validation technique.

In more detail, it can be observed from Table 4 and Figure 12 that for almost all the power theft scenarios the applied ANFIS method performs better than the other two widely used algorithms, i.e., the SVM and the RBF. In particular, the proposed ANFIS method outperforms the other algorithms in cases of low power theft percentage, such as $PT_1$ and $O_1$. In mixed scenarios such as $M_2$ and $M_3$, its lower performance relative to the other two is due to an increase in false positive (FP) incidents. On the other hand, the proposed method performs better on the recall metric due to its low false negative (FN) rate. Table 5 presents the training root mean square error (RMSE) for all cases of power theft incidents after running the ANFIS model for each case and shows that the ANFIS model was trained successfully.

Table 6 presents the main characteristics and the best performances of previous approaches, many of which share characteristics with the proposed method, such as the database (Irish [35]) and some similar power theft scenarios. In the present work, more power theft scenarios than those examined in [16,17,25,33,34,36], representing additional realistic power theft cases, are studied.
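The evaluation metrics discussed in this section can be computed from confusion-matrix counts as in the following sketch. The function names and the example counts used below are hypothetical and do not come from the reported results; the AUC is computed here through its pairwise-ranking interpretation, which is one standard way to evaluate it and is not necessarily the form used in the study.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision (PPV), recall (TPR), specificity and F1 from confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "specificity": tn / (tn + fp),
        "f1": 2 * precision * recall / (precision + recall),
    }

def pairwise_auc(pos_scores, neg_scores):
    """AUC as the probability that a positive sample is scored above a negative one.

    Ties count as one half. This is quadratic in the number of samples,
    which is acceptable for a small illustration.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))
```

As a hypothetical example, `classification_metrics(tp=80, tn=90, fp=10, fn=20)` gives an accuracy of 0.85, a recall of 0.8 and a specificity of 0.9. It also makes the remark about mixed scenarios concrete: raising `fp` lowers precision, specificity and accuracy while leaving recall unchanged.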

Conclusions
This study presents an artificial intelligence (AI) method for efficient power theft detection based on real smart electricity meter data, where the most significant classification features are fed into the adaptive neuro fuzzy inference system classifier. Numerous simulations were performed and thirteen different incidents of power theft, drawn from real-world experience, were studied. Real smart-electricity-metering data from Irish households were used and high AUC scores were achieved. In addition to the AUC metric, other classification metrics such as ACC, F1 score, precision, recall and specificity were used for the further evaluation of the proposed method. Additionally, a comparison with other extensively used classification methods for power theft detection, such as the SVM and the RBF, was performed using exactly the same power theft scenarios and the same Irish smart electricity meter data, thereby verifying the efficiency and superiority of the ANFIS structure and algorithm in effective power theft detection.
In conclusion, the proposed ANFIS structure and algorithm gave very encouraging results for the successful detection of power theft in power distribution grids. Almost every power theft scenario was considered herein. In future research, more power theft scenarios could be investigated to take into consideration consumers with a high demand for electricity, such as commercial and industrial consumers. Moreover, the interconnection of renewable energy sources (RES) as distributed generation (i.e., photovoltaics, wind turbines, small hydro, etc.) connected to the power distribution grid could crucially affect the power theft phenomenon and should be investigated.