Application of Comprehensive Evaluation of Line Loss Lean Management Based on Big-Data-Driven Paradigm

Li, Bin; Tan, Yuxiang; Guo, Qingqing; Wang, Weihuan

doi:10.3390/su151512074


Guangxi Key Laboratory of Power System Optimization and Energy-Saving Technology, School of Electrical Engineering, Guangxi University, Nanning 530004, China


Sustainability 2023, 15(15), 12074; https://doi.org/10.3390/su151512074

Submission received: 5 July 2023 / Revised: 3 August 2023 / Accepted: 4 August 2023 / Published: 7 August 2023

(This article belongs to the Special Issue Sustainable Power Systems and Optimization Volume II)


Abstract

Effective line loss management necessitates a model-driven evaluation method to assess its efficiency level thoroughly. This paper introduces a “model-driven + data-driven” approach based on collective intelligence theory to address the limitations of individual evaluation methods in conventional line loss assessments. Initially, eight different evaluation methods are used to form collective intelligence to evaluate the line loss management of power grid enterprises and generate a comprehensive dataset. Then, a random forest model is trained and evaluated on this dataset, with the Spearman rank correlation coefficient as the test metric, to assess the power grid enterprise’s line loss management level. Combining model-driven and data-driven methods, this integrated approach efficiently leverages the informational value of indicator data while thoroughly considering the causal and associative attributes within the dataset. Based on data from 61 municipal grid enterprises, both the comparison of multiple AI methods and correlation tests of results verify the superiority of the proposed method.

Keywords:

line loss management; evaluation system; random forest algorithm; big-data-driven; artificial intelligence; collective intelligence

1. Introduction

In the 2021 Chinese National People’s Congress, the Chinese government’s work report for the first time emphasized the importance of “Carbon Neutral, Emission Peaking” as a key task. China has set ambitious goals, aiming to reach the peak of carbon emissions by 2030 and achieve “carbon neutrality” by 2060. In light of the significance of reducing losses, power grid enterprises have made energy saving and emission reduction their top priorities [1].

Line loss is a comprehensive reflection of the power supply quality of power grid enterprises. However, line loss management is a systematic project with large data and strong comprehensiveness. At present, power grid enterprises have a set of established methods and implementation measures for line loss management and evaluation, but due to uneven management levels, imperfect management mechanisms, and single evaluation methods in the evaluation system, the management and implementation of line loss are still not in place. Therefore, power grid enterprises need a more effective line loss management evaluation system to guide loss reduction.

At present, comprehensive evaluation has been widely used in power systems. Yang et al. [2] employed the Analytic Hierarchy Process (AHP) to evaluate the energy efficiency level of distribution networks based on expert experience. Goh et al. [3] proposed the Fuzzy-AHP approach to determine the weight of load nodes in a system’s load profile and select control strategies for different load levels. Zhao et al. [4], aiming to minimize subjective bias, integrated the subjective evaluation values obtained from the entropy method using the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) method to evaluate the national electricity development status comprehensively. Zhang et al. [5] utilized the coefficient of variation method and ideal approximation sorting method to evaluate wind farms comprehensively. To evaluate power quality objectively and reasonably, Li et al. [6] combined the improved AHP with the entropy weight method and introduced attribute recognition. Additionally, contingency theory has been employed to improve the AHP and enable comprehensive evaluations of power quality and wind power grid connection technology [7,8]. The main focus of these works [2,3,4,5,6,7,8] was to establish a suitable evaluation index system and develop an objective weighting scheme to address the evaluation challenges. However, these investigations mainly employ either a single algorithm or a combination of two algorithms for calculation. This narrow approach may result in biased evaluation outcomes, making it challenging to effectively integrate subjectivity and objectivity in the evaluation process.

The theory of comprehensive evaluation has been applied in line loss management. For instance, in [9], a combination of subjective and objective assignments using the AHP method and sensitivity weighting is utilized to evaluate line loss management in rural network systems. In [10], a principal component analysis (PCA) method is proposed to comprehensively evaluate the impact of distributed generation on voltage and line loss. Considering the limitations of a single evaluation method, some researchers employed multiple algorithms to assess the level of line loss management. The study in [11] employs seven different evaluation algorithms to assess municipal power companies’ line loss management systems, providing a comprehensive evaluation of the level of line loss management. Similarly, in [12], which focuses on line loss management at the county level, multiple indicators are used to measure and track the progress of line loss management, which allows for a more comprehensive assessment of the effectiveness of line loss management efforts. Researchers use multiple evaluation algorithms to capture different aspects and dimensions of line loss management, providing a well-rounded evaluation considering various factors and perspectives. Nonetheless, scholars predominantly rely on the traditional model-driven research paradigm when evaluating line loss management, emphasizing a “cause and effect” perspective. However, in the current digital era, traditional management practices have transitioned to digital management, with decision-making increasingly based on data analysis. Numerous decision scenarios require a comprehensive understanding of causality and correlation. Hence, considering the application of data-driven methods in evaluating comprehensive line loss management is imperative.

Artificial intelligence has witnessed rapid development in the electrical industry in recent years. This technology has the potential to revolutionize grid operations by enabling intelligent evaluations and management based on integrated big data [13]. The advancement of artificial intelligence has led to the emergence of data-driven methods, particularly machine learning, which offer new possibilities for constructing comprehensive evaluation systems. As computing power continues to improve, intelligent algorithms have been successfully applied in evaluation systems, replacing traditional complex evaluation processes with a more reliable model-based learning approach.

For instance, a study by Yao et al. [14] proposes an approach based on gradient-boosting decision trees for line loss rate prediction. This model leverages the power of machine learning to provide accurate and reliable forecasts in line loss management. Another study [15] applies an enhanced online random forest model to assess online voltage reliability, offering a dependable and precise evaluation method. These examples demonstrate how artificial intelligence techniques, specifically data-driven methods such as machine learning algorithms, have been successfully utilized in line loss management evaluation. By utilizing advanced technologies, the development of more robust and effective integrated assessment systems can be achieved, resulting in improved reliability and efficiency of grid operations.

Collective intelligence refers to the idea that multiple participants independently provide their evaluation opinions on a particular subject and then aggregate those opinions. The resulting evaluation is often more accurate than the opinions of any individual participant alone [16]. This concept has garnered interest from businesses seeking to foster collaborative innovation [17] and researchers aiming to address systemic challenges such as climate change [18]. These studies demonstrate that groups can exhibit intelligence and, in certain cases, can be as intelligent as experts when they offer independent viewpoints on a given matter.

The collective intelligence paradigm focuses on leveraging the intelligence of groups of individuals to enhance productivity and facilitate better decision-making compared to isolated individuals [19]. Studies such as Yi’s [20] have provided further insights into the methods and factors that contribute to the stability of labeled behaviors. In that study, the relationship between the level of label distribution on a single resource and the number of annotators was quantified. The researchers found that a typical stability level of label distribution can be achieved when there are around 300–400 annotators. Another study conducted by Robu et al. [21] used the Kullback–Leibler (KL) distance to measure the degree of randomization in label distribution. This measurement was utilized to establish a metric for determining the dynamics of label distribution, which characterizes the group intelligence level. Collective intelligence recognizes the value of collaboration and the diverse perspectives within a group, enabling more comprehensive and effective problem-solving processes. By harnessing it, organizations and researchers can draw on the knowledge of many individuals, leading to improved outcomes and decision-making.

Traditional models face challenges in examining numerous variable combinations to establish correlations within the context of big data. Moreover, these models tend to focus on data causality, potentially overlooking implicit information and resulting in low explanatory power. On the other hand, traditional comprehensive evaluation models often rely on a single algorithm or a combination of two algorithms for evaluating line loss management. However, this approach tends to be one-sided, reflecting only the perspective of the chosen algorithm(s). As a result, the evaluation results may lack scientific rigor and fail to provide a comprehensive measurement of the level of line loss management. Based on the provided background, this paper proposes a comprehensive evaluation method for lean line loss management using a big-data-driven paradigm.

The proposed method combines data-driven and model-driven approaches to construct a big-data-driven paradigm. It involves utilizing eight evaluation methods with different focuses to create collective intelligence and establish an effective evaluation model. The model-driven paradigm measures are employed to resolve causal relationships among variables, while data-driven paradigm association mining methods are used to discover associations among variables. Subsequently, the random forest algorithm is applied to build a situation prediction model using the collective-intelligence-based results as training samples. This paper evaluates line loss management based on the data from 61 grid companies in five provinces in southern China. The study contributes to the field in the following ways:

This paper proposes a research method combining “model-driven” and “data-driven” approaches. The data-driven approach is utilized to uncover causal and correlative relationships within the data, while the model-driven method is employed to evaluate the data and generate a comprehensive dataset.
The comprehensive evaluation model developed in this study incorporates the theory of group intelligence. By integrating objective information such as indicator conflicts and information entropy as well as leveraging the collective expertise of experts in relevant fields, the evaluation results obtained provide a scientific assessment of line loss management.
The random forest algorithm is employed to evaluate line loss management in this study. The trained model enables accurate evaluation of the management score for enterprises using indicator characteristics specific to municipal power enterprises, which offers scientific guidance for implementing lean management strategies to minimize line losses.

The rest of the paper is organized as follows: Section 2 introduces the background of big data and explains the implications of lean line loss management. Section 3 provides a brief description and organization of the research model. Section 4 briefly discusses the process of constructing the data-driven evaluation model and describes the validation methods for the evaluation results. Section 5 examines the prediction model results and provides an analysis. Based on the analyses in Section 5, conclusions are drawn in Section 6.

2. Paradigm and Characteristics of Lean Line Loss Management in the Era of Big Data

2.1. Paradigm and Characteristics of the Era of Big Data

The new paradigm in the era of big data can be observed from the “making” perspective and the “using” perspective. In the comprehensive evaluation of line loss management, big data is considered a component of information technology (IT) encompassing data and systems.

From the “making” perspective, the primary focus lies in big data analysis, which involves organizing line loss data and conducting line loss index analysis, among other tasks. Additionally, attention is given to constructing big data systems for line loss evaluation and integrating comprehensive evaluation methods.

On the other hand, from the “using” perspective, key areas of concern include the behavior of utilizing big data. This encompasses establishing line loss management evaluation indices and constructing models for evaluating their effectiveness. Furthermore, big data plays a crucial role in enabling innovation in line loss management, such as predicting line loss situations and implementing closed-loop management strategies.

In traditional models, as the combination of variables increases and the original model’s explanatory power diminishes, the introduction of new variables becomes necessary. These variables are often latent, unpredictable, or currently unavailable. This is where data-driven methods and techniques come into play. Data awareness, technological capabilities, and their integration become the core competencies in research and application within the realm of big data. They form the fundamental elements of the big-data-driven paradigm.

Another significant implication of the big-data-driven paradigm is big-data enablement, which refers to the value creation driven by big-data capabilities. Big-data enablement focuses on developing big-data capabilities to uncover new models and opportunities. This, in turn, drives service and model innovation, ultimately leading to the creation of enterprise value.

The big-data-driven paradigm framework can be examined through the lens of the aforementioned perspectives: the technology methods and the enabling innovation (as shown in Figure 1).

2.2. Overview and Requirements of the Comprehensive Evaluation System for Line Loss Lean Management

With the rise of the big data era and the increasing complexity of power grid operations, traditional evaluation systems can no longer meet the needs of power grid enterprises. Therefore, the concept of lean management needs to be introduced. In the context of line loss management, “lean” refers to accurately and comprehensively measuring the level of line loss management, identifying weaknesses promptly, and investing resources accurately. One aspect of “lean” management is cost reduction, which involves minimizing inputs and resource consumption. The goal is to achieve greater output benefits while conserving resources. These benefits include economic gains and social benefits, both in the short term and the long term. Ultimately, lean management supports the sustainable and rapid development of the enterprise. Power grid enterprises can enhance their line loss management practices by adopting lean management principles and leveraging big data. Through accurate measurement, identification of weaknesses, and efficient resource allocation, they can optimize their operations, reduce losses, and achieve long-term success.

The demand for big data also necessitates the development of a comprehensive evaluation system for line loss lean management. Within the field of line loss lean management evaluation, the utilization of big data requires consideration of both cause–effect relationships and correlations. Without these elements, making effective decisions to guide reduction becomes challenging. Consequently, enhancing foresight and risk insight in evaluating line loss lean management becomes imperative.

The construction of a comprehensive evaluation system for lean line loss management involves several key steps, including data standardization, establishment of the line loss management indicators, development of an effective evaluation model, creation of a situation prediction model, and closing the loop of line loss management.

The application of the lean management evaluation system holds significant importance for both industrial and service-oriented enterprises. It offers valuable assistance by eliminating inefficient labor within the production management process to achieve optimal long-term benefits. Therefore, in the era of big data, power grid enterprises should leverage artificial intelligence technology to make informed decisions and provide guidance for reducing losses, ultimately maximizing long-term advantages.

3. Artificial Intelligence Technology

3.1. Artificial Intelligence Methods

In the big-data-driven paradigm, numerous innovative technological approaches have emerged in response to the challenges posed by traditional models. Additionally, the development of new digital infrastructures has facilitated the implementation of artificial intelligence methods in power systems [22]. Figure 2 provides a schematic illustration of the analytical process of the AI approach.

Artificial intelligence is applied to evaluate lean management of power grid line loss, based on Figure 2. Initially, actual sample data are collected and processed through dimensionality reduction, clustering, and missing-value supplementation. Subsequently, evaluation model construction, parameter adjustment, and results derivation are performed using machine learning or deep learning methods. This study selects the random forest algorithm as the machine learning method, as it can efficiently train the dataset and generate relatively precise outcomes. Random forest swiftly and accurately analyzes substantial line loss management index data, delivering timely scientific outcomes, which is especially valuable for time-sensitive enterprises such as power grids.

3.2. Data Processing

Data processing is required once the study population is established and the sample set is generated. This processing step aims to prevent individual data points with significantly deviating values from influencing the results. Clustering is performed to uncover and explore potential differences and associations within the data. In this paper, the AP (Affinity Propagation) clustering algorithm is selected to cluster the samples and eliminate values that deviate excessively from other data points.

The AP clustering algorithm is a technique that considers the information exchange between observations. Unlike other clustering algorithms, it does not require the pre-determination of cluster centers. Instead, it considers all sample points as potential centers and automatically determines the location and number of cluster centers through iterative calculations. The algorithm searches for suitable clustering centers in each iteration, continuously refining the clustering assignment until convergence is achieved.

R(i, k) is defined as the measure of suitability for x_{k} to serve as a clustering center for x_{i}, indicating the extent to which x_{i} considers x_{k} appropriate as its clustering center. To determine the appropriate clustering center x_{k}, the AP algorithm continuously gathers evidence in the form of R(i, k) and A(i, k) from the data sample. The iterative formulas for R(i, k) and A(i, k) are as follows:

R(i, k) = S(i, k) - \max_{j \neq k} \{A(i, j) + S(i, j)\}

(1)

A(i, k) = \min \{0, R(k, k) + \sum_{j \notin \{i, k\}} \max \{0, R(j, k)\}\}

(2)

Here, S(i, k) denotes the similarity between x_{i} and x_{k}, and A(i, k) measures how appropriate it is for x_{i} to choose x_{k} as its clustering center, given the support that x_{k} receives from the other points.

The AP clustering process is a continuous-loop iteration based on Equations (1) and (2) to update the evidence. By iterative competition, AP clustering can obtain the optimal clustering centers and the class profiles of each sample point.
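As an illustration only, one round of the evidence updates in Equations (1) and (2) might be sketched as follows. The sample points, the negative squared distance used as similarity, and the preference value of -10 on the diagonal are hypothetical choices, and the diagonal rule for A(k, k) is taken from the standard formulation of affinity propagation rather than from the text. A practical implementation would also repeat these updates, usually with damping, until the cluster centers stop changing.

```python
def update_responsibility(S, A):
    """Equation (1): R(i,k) = S(i,k) - max_{j != k} {A(i,j) + S(i,j)}."""
    n = len(S)
    R = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for k in range(n):
            # Strongest competing candidate center for sample i.
            strongest_rival = max(A[i][j] + S[i][j] for j in range(n) if j != k)
            R[i][k] = S[i][k] - strongest_rival
    return R


def update_availability(R):
    """Equation (2) for i != k; the diagonal uses the self-availability rule."""
    n = len(R)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for k in range(n):
            # Positive evidence that other samples give to candidate k.
            support = sum(max(0.0, R[j][k]) for j in range(n) if j not in (i, k))
            if i == k:
                A[i][k] = support  # assumption: standard AP rule, not in the text
            else:
                A[i][k] = min(0.0, R[k][k] + support)
    return A


# Hypothetical 1-D samples. Similarity is the negative squared distance,
# and the diagonal holds an assumed preference of -10.
points = [0.0, 1.0, 10.0]
S = [[-10.0 if i == k else -(a - b) ** 2 for k, b in enumerate(points)]
     for i, a in enumerate(points)]
A0 = [[0.0] * 3 for _ in range(3)]
R1 = update_responsibility(S, A0)
A1 = update_availability(R1)
```

In this toy case the isolated point at 10 obtains a large positive self-responsibility R(2, 2) after the first update, which is the kind of evidence that leads AP to treat a strongly deviating sample as its own cluster.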

3.3. Construction of Data-Driven Model

Random forest offers several advantages over traditional networks, including a simple model structure, high accuracy, fast training speed, and robustness against overfitting and noise interference. Hence, random forest is applied as the data-driven method to construct an empirical model for evaluating line loss lean management.

Random forest [23] is built on the decision tree, a well-established and classical data mining method that automatically generates decision rules. A decision tree uses a partitioning approach to address classification or regression problems. During training, the tree evaluates the information gain from feature partitioning at each stage, starting from the top and progressing downwards, in order to select the optimal feature for partitioning the dataset. Subsequently, the resulting subproblems are recursively processed. The data instances are assigned to different branches within the tree structure, and this process iterates until the recursive stopping condition is met.

Each path within a decision tree model, starting from the root node and extending to the leaf nodes, represents a classification rule. Consequently, a decision tree can be viewed as a collection of classification rules that enable the model to make predictions on unlabeled data.

In the case of classification trees, the Gini impurity is a commonly used metric to evaluate the effectiveness of branching in the tree model. This metric measures the “impurity” of the system by assessing the likelihood that a randomly selected sub-item from the dataset will be incorrectly assigned to another category. The formula for calculating the Gini impurity is defined by Equation (3).

G (P) = \sum_{i = 1}^{n} P_{i} (1 - P_{i}) = 1 - \sum_{i = 1}^{n} (P_{i}^{2})

(3)

Here, P_{i} denotes the proportion of samples at the current node that belong to class i. The value of G(P) lies within [0, 1], where values closer to zero indicate a more favorable classification outcome.
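A minimal sketch of Equation (3), assuming the node is represented simply as a list of class labels (the function name and the sample labels are illustrative):

```python
from collections import Counter


def gini_impurity(labels):
    """G(P) = 1 - sum_i P_i^2, with P_i the share of class i at the node."""
    n = len(labels)
    counts = Counter(labels)
    return 1.0 - sum((count / n) ** 2 for count in counts.values())


# A node split evenly between two classes versus a pure node.
mixed_node = gini_impurity(["high", "high", "low", "low"])
pure_node = gini_impurity(["high", "high", "high", "high"])
```

A pure node gives an impurity of zero, which is why a split that drives the child nodes toward zero is preferred.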

In the case of regression trees, the splitting point is determined using a variance metric. This involves calculating the variance of each leaf node and then summing and weighting all the variances. The splitting method with the lowest variance value is ultimately chosen. The calculation formula for variance is given by Equation (4).

V = \frac{\sum_{i = 1}^{n} {(x_{i} - \bar{x})}^{2}}{n}

(4)

Here, \bar{x} represents the average value of the input variable data in the regression tree, and V denotes the weighted variance associated with that particular input variable. Each x_{i} corresponds to an individual data point within the dataset, and n signifies the total number of data points.
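The splitting rule described above can be sketched as follows, assuming a single numeric feature, candidate cut-points placed midway between neighboring feature values, and child variances weighted by the number of samples in each child. These details and all names are illustrative assumptions, not the exact procedure of the paper.

```python
def variance(values):
    """Equation (4): mean squared deviation from the average."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n


def best_split(feature, target):
    """Return (threshold, weighted variance) of the lowest-variance split."""
    pairs = sorted(zip(feature, target))
    n = len(pairs)
    best_threshold, best_score = None, float("inf")
    for cut in range(1, n):
        if pairs[cut - 1][0] == pairs[cut][0]:
            continue  # no threshold can separate equal feature values
        left = [t for _, t in pairs[:cut]]
        right = [t for _, t in pairs[cut:]]
        # Sum of child variances, weighted by child size.
        score = (len(left) * variance(left) + len(right) * variance(right)) / n
        if score < best_score:
            best_threshold = (pairs[cut - 1][0] + pairs[cut][0]) / 2
            best_score = score
    return best_threshold, best_score


# Hypothetical indicator values and evaluation scores.
threshold, score = best_split([1, 2, 3, 10, 11, 12], [1, 1, 1, 5, 5, 5])
```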

4. Construction of Evaluation Algorithms Based on the Big-Data-Driven Paradigm

4.1. The Data-Driven Model Based on Collective Intelligence Theory

Numerous recent studies have demonstrated that groups exhibit collective intelligence, resulting in more accurate decision-making compared to individuals. The practical application of collective intelligence theory should adhere to four key principles: independence, decentralization, plurality, and integration [24]. In the evaluation of line loss lean management within power grid enterprises, primary organizations focus on utilizing model-driven methods to assess the level of line loss lean management. By employing model-driven methods, the fundamental nature of the problem can be identified, allowing for the development of new theories. These methods consider the research problem holistically and describe the characteristics of the research object using specific mechanism models or relevant rules. In alignment with the objectives of this paper, the collective intelligence of this study should also adhere to the following four principles:

Independence: Each model-driven method should express its own viewpoint independently regarding the perceived level of line loss management within the grid enterprise. The viewpoints of other model-driven methods should not influence its assessment of the perceived level of line loss management.
Decentralization: Each model-driven approach should be able to focus and apply its knowledge when providing an opinion on the perceived lean level of line loss management.
Plurality: Each model-driven approach possesses unique knowledge regarding the integrated level of line loss management perception, resulting in diverse information among different approaches.
Integration: A unified decision integration mechanism should be employed to aggregate the evaluation opinions of various model-driven methods regarding the comprehensive level of line loss lean management, thereby deriving a collective opinion.

This study aims to adhere to the criteria above by selecting model-driven methods and decision-integration mechanisms. The evaluation algorithm is then constructed based on the big-data-driven paradigm. The following eight evaluation methods have been chosen: analytic hierarchy process (AHP) [25], entropy weight method [26], TOPSIS method [27], weighted rank-sum ratio [28], coefficient of variation method [29], CRITIC weight method [30], cosine value method [31], and gray correlation analysis method [32]. These methods are utilized to construct an effect evaluation model. Furthermore, collective intelligence is integrated using the random forest approach. Based on collective intelligence theory, the data-driven model is illustrated in Figure 3.

As shown in Figure 3, in this study, the relevant raw data within the evaluation index system are collected, and the original line loss index data are quantified based on specific scoring requirements, resulting in the extraction of new feature index data. Subsequently, the evaluation process employs the model-driven approach for scoring, wherein the feature indicator data and model-driven evaluation scores are utilized as a training set for the AI model. The trained AI model is then employed to make predictions and determine the final evaluation results. The integrated model does not simply average the results of individual methods but instead combines the strengths of each evaluation method. By preserving the unique information contributed by each method and incorporating the collective intelligence of the group, the final result becomes objective and realistic.

4.2. Flow of Random Forest

In the context of lean management decisions for line loss reduction, a well-integrated sample set is created through clustering to ensure a high degree of consensus among different groups. This sample set is then used as the training data for the random forest algorithm.

Random forest is an ensemble learning algorithm that utilizes decision trees as its base learners. The base learners are combined together using the bagging method. Each decision tree is trained on a randomly selected subset of the training data, with replacement. This process is repeated for multiple decision trees, and their predictions are aggregated to obtain the final prediction.

Suppose the original dataset D consists of samples with M input features and a classification label. The random forest algorithm combines independently trained decision trees to form a forest. The construction process of each tree can be viewed as partitioning the data space. Specifically, before constructing the decision trees, the bootstrap method is applied to draw K training datasets from the original dataset D. Each decision tree is built using the classification and regression tree (CART) method. At each tree node, m features are randomly selected from the M input features as the candidate split feature set. The optimal split feature and cut-point are determined by minimizing the mean square error (MSE). This partitioning process is repeated until a stopping condition is met.

After training K decision trees using the bootstrap sample sets, they are combined into a random forest model, denoted as \{t_{i}, i = 1, 2, \dots, K\}. When a test sample x is input into the model, the corresponding per-tree results \{t_{1}(x), t_{2}(x), \dots, t_{K}(x)\} are obtained.

The random forest algorithm aggregates the results from each decision tree to make the final prediction. In this study, a simple averaging method is employed to obtain the final regression result T(x) = \frac{1}{K} \sum_{i = 1}^{K} t_{i}(x), in which the predicted values from all decision trees are averaged. The final output represents the grid enterprise’s line loss lean management score.

The flowchart of the random forest algorithm is shown in Figure 4.
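The bagging flow described in this section can be sketched in outline as follows. This is not the implementation used in the study: `fit_mean_predictor` is a hypothetical stand-in for CART training (it ignores the features entirely), the rows of indicator features and scores are invented, and only the bootstrap sampling, random feature selection, and averaging steps are shown.

```python
import random


def bootstrap_sample(data, rng):
    """Draw len(data) rows from data with replacement."""
    return [rng.choice(data) for _ in data]


def candidate_features(n_features, n_candidates, rng):
    """Pick m of the M feature indices at random for one node split."""
    return rng.sample(range(n_features), n_candidates)


def train_forest(data, fit_tree, n_trees, seed=0):
    """Fit K trees, each on its own bootstrap sample of the data."""
    rng = random.Random(seed)
    return [fit_tree(bootstrap_sample(data, rng)) for _ in range(n_trees)]


def forest_predict(trees, x):
    """T(x) = (1/K) * sum of t_i(x): simple average of the tree outputs."""
    return sum(tree(x) for tree in trees) / len(trees)


def fit_mean_predictor(sample):
    """Hypothetical stand-in for CART: always predicts the sample mean score."""
    mean_score = sum(score for _, score in sample) / len(sample)
    return lambda x: mean_score


# Hypothetical rows of (indicator features, evaluation score).
rows = [([0.2, 0.7], 60.0), ([0.5, 0.4], 75.0), ([0.9, 0.1], 90.0)]
forest = train_forest(rows, fit_mean_predictor, n_trees=5)
score = forest_predict(forest, [0.4, 0.5])
```

In practice an established library implementation of random forest regression would replace the stand-in tree learner; the sketch only makes the roles of K, m, and M concrete.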

4.3. Evaluation Criteria for Data-Driven Models

The data-driven model is evaluated and verified by comparing its predicted values with the observed values. The observed value refers to the line loss index data of the city network enterprise, and the predicted value is the output of the random forest model. The criteria are the Mean Absolute Error (MAE), the Mean Absolute Percentage Error (MAPE), and the accuracy A_{i} [33]. The calculation formulas are as follows:

MAE = \frac{1}{n} \sum_{i = 1}^{n} |\hat{y}_{i} - y_{i}|

(5)

MAPE = \frac{100\%}{n} \sum_{i = 1}^{n} \left| \frac{\hat{y}_{i} - y_{i}}{y_{i}} \right|

(6)

A_{i} = \left[ 1 - \sqrt{\frac{1}{n} \sum_{i = 1}^{n} \left( \frac{\hat{y}_{i} - y_{i}}{y_{i}} \right)^{2}} \right] \times 100\%

(7)

Here, n represents the total number of predicted elements, while \hat{y}_{i} and y_{i} refer to the predicted value and the actual evaluation value of the ith sampling point, respectively. In line loss lean management evaluation training, lower values of MAE and MAPE indicate better prediction performance, whereas a higher accuracy value A_{i} signifies greater precision.
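Equations (5) to (7) translate directly into code. The sketch below assumes that every actual value is nonzero, since both MAPE and the accuracy divide by it; the function names and sample scores are illustrative.

```python
import math


def mae(y_pred, y_true):
    """Equation (5): mean absolute error."""
    return sum(abs(p - t) for p, t in zip(y_pred, y_true)) / len(y_true)


def mape(y_pred, y_true):
    """Equation (6): mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((p - t) / t) for p, t in zip(y_pred, y_true)) / len(y_true)


def accuracy(y_pred, y_true):
    """Equation (7): 1 minus the root mean squared relative error, in percent."""
    mean_sq = sum(((p - t) / t) ** 2 for p, t in zip(y_pred, y_true)) / len(y_true)
    return (1.0 - math.sqrt(mean_sq)) * 100.0


# Hypothetical predicted and actual evaluation scores.
predicted = [110.0, 90.0]
actual = [100.0, 100.0]
results = (mae(predicted, actual), mape(predicted, actual), accuracy(predicted, actual))
```

With both predictions off by 10% of the actual score, the three metrics come out at approximately 10, 10%, and 90%.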

4.4. Test Methods of Comprehensive Evaluation

4.4.1. Preliminary Verification of Comprehensive Evaluation

When several comprehensive evaluation methods are employed, the degree of agreement among their results is commonly measured using the Kendall coefficient of concordance. The formula for calculating the correlation degree G is as follows:

G = \frac{12 \sum_{i = 1}^{n} \left( \sum_{j = 1}^{m} r_{i j} - \frac{1}{n} \sum_{i = 1}^{n} \sum_{j = 1}^{m} r_{i j} \right)^{2}}{m^{2} (n^{3} - n) - m \sum_{i = 1}^{m} \sum_{j = 1}^{m_{i}} (n_{i j}^{3} - n_{i j})}

(8)

Here, m indicates the number of evaluation methods, n is the number of evaluated objects, and r_{i j} is the order value of the ith object under the jth evaluation method. In the tie-correction term of the denominator, i indexes the evaluation methods: m_{i} is the number of repeated order values in the evaluation conclusion of the ith evaluation method of line loss management, and n_{i j} is the number of times the jth repeated order value occurs in that conclusion.

Subsequently, the significance check of G should be conducted using the following formula:

{X_{n}}^{2} = m (n - 1) G

(9)

The statistic approximately follows a chi-square distribution with n − 1 degrees of freedom. For a given significance level α, it is compared with the critical value ${X_{α}}^{2} (n - 1)$ and must satisfy:

{X_{n}}^{2} \geq {X_{α}}^{2} (n - 1)

(10)

Meeting the condition in Equation (10) implies significant consistency among the multiple evaluation methods. It not only reflects the information within the initial evaluation conclusions but also captures the characteristics of the multiple evaluation methods.
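The preliminary verification can be sketched as follows. This is an illustration rather than the authors' implementation: it assumes the standard tie-corrected form of Kendall's coefficient of concordance, in which each group of t tied ranks contributes t³ − t, and it assumes that tied objects receive mid-ranks. The function names and the sample rank lists are hypothetical.

```python
from collections import Counter


def kendall_w(ranks):
    """Kendall's coefficient of concordance with tie correction.

    ranks[j][i] is the rank that method j assigns to object i
    (mid-ranks are assumed for tied objects).
    """
    m = len(ranks)        # number of evaluation methods
    n = len(ranks[0])     # number of evaluated objects
    rank_sums = [sum(method[i] for method in ranks) for i in range(n)]
    mean_sum = sum(rank_sums) / n
    s = sum((rs - mean_sum) ** 2 for rs in rank_sums)
    # Each group of t tied ranks within one method contributes t^3 - t;
    # untied ranks have t = 1 and contribute nothing.
    ties = sum(t ** 3 - t for method in ranks for t in Counter(method).values())
    return 12.0 * s / (m ** 2 * (n ** 3 - n) - m * ties)


def chi_square_stat(w, m, n):
    """Significance statistic m(n - 1)G of Equation (9)."""
    return m * (n - 1) * w


# Hypothetical example: three methods ranking four objects identically.
example_ranks = [[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]
w_example = kendall_w(example_ranks)
chi2_example = chi_square_stat(w_example, m=3, n=4)
```

The resulting statistic is then compared with a chi-square critical value for n − 1 degrees of freedom, which has to be taken from a table or a statistics library because the sketch deliberately uses only the standard library.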

4.4.2. Post Verification of Comprehensive Evaluation

When multiple combination evaluation methods are employed, selection of the most reasonable method can be determined by assessing the consistency among the conclusions drawn from these methods. In order to test this consistency, the Spearman rank correlation coefficient method is utilized, and the calculation is as follows:

ρ_{j k} = 1 - \frac{6 \sum_{i = 1}^{n} d_{i}^{2}}{n (n^{2} - 1)}

(11)

In this equation, $d_{i}$ represents the rank difference between the two evaluation methods for the ith evaluation object, and $ρ_{j k}$ represents the Spearman rank correlation coefficient between the kth combination method and the jth comprehensive evaluation method. A higher value of $ρ_{j k}$ indicates a stronger correlation between the ranking results of the two methods. The variable n represents the number of schemes, which in this paper is the number of municipal network enterprises participating in the evaluation.
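Equation (11) can be illustrated with a short sketch that works on two rank lists. The function name and the five-object rankings are illustrative assumptions. The rank-difference formula is exact only when neither ranking contains ties.

```python
def spearman_rho(rank_a, rank_b):
    """Spearman rank correlation from rank differences, Equation (11).

    Both inputs are rankings of the same n objects without ties.
    """
    n = len(rank_a)
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1.0 - 6.0 * d_squared / (n * (n ** 2 - 1))


# Hypothetical rankings of five objects by two methods.
method_j = [1, 2, 3, 4, 5]
method_k = [2, 1, 3, 5, 4]

rho_similar = spearman_rho(method_j, method_k)           # two adjacent swaps
rho_reversed = spearman_rho(method_j, [5, 4, 3, 2, 1])   # fully reversed order
```

Two adjacent swaps leave the coefficient close to 1, while a fully reversed ranking gives the minimum value of −1.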

When the number of schemes n is less than 10, the test statistics can be calculated using the equation:

ρ_{k} = \frac{1}{m} \sum_{j = 1}^{m} ρ_{j k}

(12)

Here, $ρ_{k}$ represents the average correlation degree between the kth combination evaluation method and the original m comprehensive evaluation methods.

However, when the number of schemes n exceeds 10, specifically in this article, where n is 61, the test statistics can be determined using the following formula:

t_{k} = ρ_{k} \sqrt{\frac{n - 2}{1 - ρ_{k}^{2}}}

(13)

In this case, the value of $ρ_{k}$ is calculated using the same method as in Equation (12). This test statistic provides a measure of the overall consistency between the ranking results of the kth combination method and the multiple comprehensive evaluation methods.
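A minimal sketch of the post-verification statistics follows. It averages the Spearman coefficients over the m original methods, as the text describes for Equation (12), and then applies Equation (13). The function names and the three coefficients are illustrative assumptions; only n = 61 comes from the paper.

```python
import math


def mean_rho(rhos):
    """Average Spearman coefficient over the m original methods, Equation (12)."""
    return sum(rhos) / len(rhos)


def t_statistic(rho_k, n):
    """Test statistic of Equation (13) for n evaluated schemes."""
    return rho_k * math.sqrt((n - 2) / (1.0 - rho_k ** 2))


# Hypothetical coefficients between one combination method and three original methods.
rho_values = [0.90, 0.80, 0.85]
rho_k = mean_rho(rho_values)
t_k = t_statistic(rho_k, n=61)
```

The statistic grows with both the average coefficient and the number of schemes, so a given level of agreement is easier to establish as significant when more enterprises are ranked.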

Connecting all of the processes above gives the big-data-driven evaluation process for lean line loss management illustrated in Figure 5. First, the evaluation system is established using a data-driven approach to determine the indicators it requires. The collection of line-loss-related data and information is targeted by analyzing the characteristics of the relevant power grid structure together with the strategic requirements of contemporary power grid enterprise development. Second, a comprehensive evaluation and analysis is conducted. This study adopts a joint evaluation model combining the “model-driven” and “data-driven” approaches for the lean evaluation of grid line loss management, a choice based on the four principles of collective intelligence theory and on the current context of the big-data era. An eight-method collective intelligence model is selected to evaluate line loss management using the model-driven approach. The evaluation results are then used to train a random forest model, which is compared with the LightGBM algorithm and other artificial intelligence algorithms to validate the rationality and effectiveness of the outcomes.

5. Results and Discussion

5.1. Data Introduction and Experimental Platform

In this paper, the actual line loss index data of 61 local and municipal power grids in a southern network area for the years 2015 and 2016 were utilized as the simulation dataset. The line loss index data of 61 municipal network enterprises were evaluated using several methods, including analytical hierarchy process (AHP), entropy weight method, TOPSIS method, weighted rank-sum ratio, coefficient of variation method, CRITIC weight method, cosine value method, and gray correlation analysis method. All evaluation results were then scored and quantified on a scale of 0–100.

The training and testing set for the random forest model consisted of the data from 2015, while the data from 2016 were used as the validation set. These 61 power grids are situated in five southern China provinces: Guangdong, Guangxi, Yunnan, Guizhou, and Hainan. The geographical locations of these power grids are depicted as the yellow area in Figure 6.

The hardware platform used in this paper consists of an Intel Core i5-7500 CPU and Intel(R) HD Graphics 630 GPU. The software code was developed using Python, and the random forest algorithm was implemented by invoking the random forest machine learning package.

5.2. Construction of Evaluation System and Data Processing

To make the evaluation system conform to the actual situation of line loss management, this paper refers to the relevant standards and schemes for line loss management in power grid enterprises and constructs the evaluation system from four dimensions that cover the whole process of line loss management: “planning and reducing”, “management loss”, “running loss”, and “technical loss”. These dimensions cover grid structure, marketing management, power grid operation, and equipment configuration. The system fully considers municipal power grids’ actual operation and management level, covers multiple voltage levels, and involves various daily business management scopes. The specific evaluation index system is shown in Table 1.

The original line loss management data collected were scored based on the indicators listed in Table 1. All the original line loss management data were then transformed into scores ranging from 0 to 100. Subsequently, the collective intelligence theory was employed to evaluate each city and obtain evaluation results. In order to eliminate values that deviated significantly from other evaluation scores, the AP clustering algorithm was selected for clustering the evaluation scores.

5.3. Evaluation Accuracy Results

The random forest model is employed to train the evaluation results in the proposed comprehensive evaluation method for collective-intelligence-based line loss lean management. Following simulation analysis, the values of the test samples are compared with the comprehensive evaluation value, depicted in Figure 7, which represents the average value determined by the model-driven algorithm previously discussed.

Figure 7 demonstrates the close resemblance between the predicted values of the test samples and the true values. The predicted values are the evaluation scores of municipal power grids obtained from the random forest algorithm, and the true values are the average evaluation scores of municipal power grids derived from collective intelligence. The vertical axis shows the predicted and true scores divided by 100, and the horizontal axis denotes the sequence number of each power grid enterprise. With an accuracy of 91.07%, the predicted values are reliable evaluation results.

As shown in Table 2, the test set yields an MAE of 5.40% and MAPE of 0.070%, which signifies the high accuracy of the random forest model. Utilizing the random forest algorithm, the training model adeptly combines the assessment perspectives from diverse model-driven methods regarding the comprehensive aspect of line loss lean management into a collective evaluation.

5.4. Comparative Analysis of Evaluation Methods

The validation set for the random forest prediction model was drawn from the 2016 dataset. The results obtained from all the evaluation methods mentioned were compared with those of the method proposed in this paper. The rankings, which assess the level of comprehensive line loss lean management in grid enterprises using nine different methods, can be found in Appendix A.

To demonstrate the rationality and validity of the effect evaluation model constructed using collective intelligence in this paper, the ranking of this proposed method was compared and analyzed alongside the other eight methods. Furthermore, to compare the management gaps among different enterprises, MM and SW, as well as LPS and YL, were selected as examples to identify the weak points in line loss management across these companies. The comparison results are illustrated in Figure 8 and Figure 9, where the horizontal axis represents the name of each line loss indicator for the respective enterprise. The indicators and their meanings are listed in Table 1 (e.g., GH1 represents the standard rate of power supply radius of the power grid). The vertical axis shows the scores of each indicator for different firms.

Figure 8 illustrates that in MM’s planning dimension, the standard rate of power supply radius of the power grid and the capacity load ratio of the main network, as well as in the technical dimension, the station electricity consumption rate and energy-saving transformer ratio, have low scores. On the other hand, in the management dimension, the proportion of old and low electric energy meters, the power consumption ratio of automatic meters, calculation and measurement of fault error rate, and the rate of abnormal line loss solvation all received higher scores. AHP assigns too much weight to the management dimension based on subjective expert experience, resulting in a higher final ranking for MM. This approach overlooks the management shortcomings in the planning and technical dimensions. Therefore, introducing an evaluation method that considers the objective information quantity of indicators, as proposed in this paper, is more reasonable.

In comparison with the entropy weight method, the ranking differences between SW and WZ were nine and seven, respectively, while the ranking differences for other power grid enterprises were no more than six. The entropy weight method only considers the objective weights of indicators and focuses on their variability. In contrast, the method proposed in this paper considers both the objective weights and the comparative strengths and conflicts between indicators. It combines the advantages of objective weighting methods and AHP, which takes into account the subjective intentions of decision-makers and the collective intelligence of experts. This allows for a more comprehensive evaluation that incorporates expert knowledge and experience. The rankings of this paper’s method were also compared with the TOPSIS method, weighted rank-sum ratio, coefficient of variation method, and CRITIC weight method. Since these are all objective weighting methods, the comparisons are similar to that of the entropy weight method and will not be repeated here.

Finally, since the cosine value method and the gray correlation method both measure differences in the degree of association between factors, the gray correlation method is taken as the representative and compared with the ranking of this paper’s method. As can be seen from Appendix A, there are also some differences between the two methods: ST, JY, YX, CX, DH, YL, and LPS are ranked 11, 15, 10, 13, 12, 16, and 19 places apart, respectively. For LPS, the enterprise with the largest difference, Figure 9 shows that the planning and technical dimension scores are close to 90. The gray correlation analysis method sets the best values of the available data as the reference series and calculates the similarity between LPS and this reference series, so LPS ranks high under the gray correlation method. However, the weights of the management and operation dimensions in the evaluation index system are slightly larger than those of the planning and technical dimensions, and LPS does not perform well enough on some indicators with greater weights, such as calculation and measurement of fault error rate, the rate of abnormal line loss, comprehensive voltage rate, and the qualified rate of bus power imbalances. The similarity with the reference series alone therefore cannot accurately measure the comprehensive level of line loss lean management in this enterprise. Conversely, YL is ranked 16 places higher by the method in this paper than by the gray correlation method because, although YL has a slightly lower score of 76.93 in the technical dimension, it scores 85.24 in the management dimension and scores higher on several key indicators. The method in this paper fully adopts the advantages of AHP and the objective weighting methods: the objective information between the indicators is measured fairly, while the expert group’s experience and opinions are also used for evaluation.
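The rank gaps quoted above can be checked against Table A1 with a few lines of code. The rank pairs are copied from the “Gray Correlation Analysis Method” and “Methodology of This Paper” columns of Appendix A; the variable names are illustrative.

```python
# (rank under gray correlation analysis, rank under this paper's method), from Table A1
table_a1_ranks = {
    "ST": (16, 27),
    "JY": (11, 26),
    "YX": (44, 34),
    "CX": (19, 32),
    "DH": (30, 42),
    "YL": (45, 29),
    "LPS": (22, 41),
}

# Absolute number of places between the two rankings for each enterprise.
rank_gaps = {name: abs(gray - ours) for name, (gray, ours) in table_a1_ranks.items()}

# Enterprises that this paper's method places higher (smaller rank number) than gray correlation.
ranked_higher_here = [name for name, (gray, ours) in table_a1_ranks.items() if ours < gray]

largest_gap = max(rank_gaps, key=rank_gaps.get)
```

The largest gap belongs to LPS, which the gray correlation method ranks higher, while YL and YX are the two listed enterprises that the method of this paper ranks higher.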

In conclusion, the method presented in this paper provides an objective assessment of power grid enterprises’ line loss lean management by effectively evaluating various aspects to reflect the gap between these enterprises and the optimal benchmark. As a result, it offers targeted guidance for power grid enterprises seeking to reduce line losses.

5.5. Comparative Analysis of AI Models

The effectiveness of the proposed method in practical application is further illustrated through a comparison with three other artificial intelligence models, namely LightGBM, deep forest, and the broad learning system. The analysis focuses on the 2016 line loss management index data and conducts a correlation test for each evaluation method. Applying Equations (8) and (9) to the eight comprehensive evaluation methods gives a Kendall coefficient of G = 0.946 and ${X_{n}}^{2}$ = 453.841. With Equation (10) satisfied, the values of G and ${X_{n}}^{2}$ indicate strong correlation and compatibility among the eight comprehensive evaluation methods employed, highlighting their ability to effectively reflect line loss management information.
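The reported statistic can be reproduced approximately from the rounded coefficient. In the sketch below, m = 8, n = 61, and G = 0.946 come from the text. The significance level α = 0.05 and the critical value 79.08 for 60 degrees of freedom are assumptions taken from a standard chi-square table, since the paper does not state the level used for this test.

```python
m = 8        # comprehensive evaluation methods
n = 61       # municipal grid enterprises
G = 0.946    # Kendall coefficient as reported (rounded to three decimals)

# Equation (9): m(n - 1)G
chi2_stat = m * (n - 1) * G

# Assumed critical value: chi-square table, alpha = 0.05, n - 1 = 60 degrees of freedom.
CHI2_CRIT_ALPHA_005_DF60 = 79.08

passes_consistency_test = chi2_stat >= CHI2_CRIT_ALPHA_005_DF60
```

The rounded G gives about 454.08, close to the reported 453.841; the small gap is consistent with G having been rounded before publication. Either value exceeds the assumed critical value by a wide margin.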

Subsequently, the evaluation scores of the eight evaluation methods are clustered using the same method as described earlier, and the data undergo a similar clustering process. In total, 455 new sample datasets for the year 2016 and 52 sample datasets for the year 2017 are acquired. All 446 sample datasets related to line loss management from the year 2015 are first used as the training set for the random forest, Bayesian-optimized LightGBM, deep forest, and broad learning system models. The four AI models are then evaluated using all 507 datasets from 2016 and 2017 as the test set.

The evaluation results of each grid company are obtained, as shown in Figure 10, and the ranking results are shown in Appendix A.

In Figure 10, the horizontal axis shows the abbreviation of the name of each power grid enterprise, and the vertical axis shows the evaluation scores of each enterprise derived from the four different AI methods. The figure illustrates that the four models yield different evaluation values for the same power grid enterprise but display similar overall trends. To compare the ranking results among these four models, this study employs the widely used consistency test method from the field of comprehensive evaluation and calculates the Spearman rank correlation coefficient for the post-verification of comprehensive evaluation. The Spearman rank correlation coefficients, representing the similarity between the ranking results of each artificial intelligence model and the original comprehensive evaluation methods, are calculated using Equation (11) and are presented in Table 3.

The $ρ_{k}$ values are recalculated using Table 3 and Equation (12). Subsequently, the corresponding t-statistic is derived. The calculation results are presented in Table 4.

The t-statistics for the four AI models are 14.659, 14.371, 13.625, and 10.122, respectively. These values are greater than the critical value $t_{0.05} (61) = 1.671$ at a significance level of 0.05, which suggests that all four AI algorithms pass the consistency test. From Table 3 and Table 4, it is evident that the random forest algorithm exhibits the highest average Spearman rank correlation coefficient and the highest t-statistic among the four artificial intelligence methods. This finding suggests that the data-driven approach employing the random forest algorithm yields the most reasonable and effective evaluation results. Consequently, the evaluation outcomes derived from this algorithm can be employed to assess the line loss lean management of power grid enterprises.

In summary, this paper’s approach incorporates the advantages of the entropy weight method, AHP, and gray correlation analysis. It comprehensively evaluates objective information by considering indicators’ comparison strength, conflicts, and information entropy and by extracting collective intelligence from decision-making experts. Furthermore, it effectively captures the gap between an enterprise and the optimal benchmark. Consequently, it enhances the objectivity of comprehensive evaluation results for line loss lean management in power grid enterprises and offers targeted guidance for reducing line losses.

6. Conclusions

In this study, the primary objective was to comprehensively evaluate line loss lean management in municipal grid companies. To achieve this, a big-data-driven paradigm was employed in constructing a prediction model and conducting a comprehensive evaluation of line loss lean management across 61 municipal grid companies. From the case analysis, the following conclusions can be drawn:

The comprehensive evaluation model developed in this study incorporates the theory of collective intelligence. This model effectively considers both objective information, such as the conflicts and information entropy among indicators, and the subjective intention of decision-makers to capitalize on the wealth of practical experience and accumulated knowledge of experts in the relevant fields by tapping into their collective intelligence. As a result, the evaluation results obtained are objective, comprehensive, and accurate.
Research on line loss management makes innovative use of artificial intelligence techniques. Specifically, the random forest algorithm was applied in this study to evaluate line loss management. Additionally, a comparative analysis was conducted using various artificial intelligence algorithms. The evaluation results obtained from the trained model are highly credible, providing valuable guidance to enterprises seeking to minimize losses. The findings of this study demonstrate that the approach yields more scientific and reasonable results for comprehensively evaluating line loss lean management in municipal network enterprises, thereby introducing a fresh perspective to line loss management evaluation.
The evaluation model developed in this paper allows for the application of sensitivity analysis to suggest directions for improving line loss management in future research. Moreover, by utilizing situational awareness, optimized line loss evaluation results can be predicted, providing effective guidance for reducing losses. Together, these form a complete closed-loop management system that provides comprehensive guidance for line loss management.

In summary, this research significantly contributes to the evaluation of line loss lean management in municipal grid companies. It introduces a comprehensive evaluation model that seamlessly integrates the theory of collective intelligence, artificial intelligence algorithms, and a big-data-driven paradigm. The evaluation results obtained through this model are not only objective, accurate, and scientifically grounded but also provide valuable insights for enhancing line loss management practices.

Author Contributions

Conceptualization, B.L. and Y.T.; methodology, Y.T.; software, Y.T.; validation, B.L., Y.T., Q.G. and W.W.; formal analysis, Y.T.; investigation, Y.T.; resources, B.L.; data curation, B.L.; writing—original draft preparation, Y.T.; writing—review and editing, Y.T.; visualization, Y.T.; supervision, B.L.; project administration, Q.G. and W.W.; funding acquisition, B.L. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the Natural Science Foundation of Guangxi Province under grant 2020GXNSFAA297117.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data used in this study were obtained from China Southern Power Grid and have been licensed for sharing with other researchers upon request. Restrictions may apply to the availability of these data, which were used under license for this study and are not publicly available. Data are, however, available from the authors upon reasonable request and with permission from China Southern Power Grid.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Table A1. Comparison of ranking results of each evaluation method.

Name of Electric Power Enterprise	Analytic Hierarchy Process	Entropy Weight Method	TOPSIS Method	Weighted Rank-Sum Ratio	Coefficient of Variation Method	CRITIC Weight Method	Cosine Value Method	Gray Correlation Analysis Method	Methodology of This Paper
CZ	21	11	14	20	10	10	10	10	12
DG	2	1	1	1	1	1	1	1	1
FS	5	6	7	3	3	3	3	3	3
HY	15	8	6	9	9	8	11	13	9
HZ	8	3	3	7	5	5	6	6	5
MM	10	26	29	21	25	26	25	28	24
MZ	34	25	26	30	28	28	33	27	28
QY	7	14	27	12	12	11	9	8	11
ST	26	33	43	23	26	27	22	16	27
SW	41	22	20	38	27	25	35	30	31
SG	12	12	11	19	14	14	17	21	15
YJ	11	5	2	6	7	7	7	9	7
YF	14	15	18	14	15	15	14	18	13
ZJ	19	19	14	15	19	19	23	24	18
ZS	1	4	5	2	2	2	2	2	2
ZH	4	2	4	5	4	4	5	5	4
JY	17	30	39	29	24	24	16	11	26
ZQ	3	7	16	4	6	6	4	4	6
JM	6	9	8	8	8	9	8	7	8
KM	16	32	32	27	31	31	28	28	30
QJ	53	48	42	47	48	49	49	55	49
HH	38	17	9	28	23	22	29	35	23
YX	30	29	25	35	36	39	40	44	34
CX	35	34	29	31	32	30	27	19	32
DL	52	31	28	34	37	35	43	39	37
ZT	20	38	36	41	35	34	31	38	38
PE	51	52	54	54	51	51	47	49	52
LC	44	51	48	57	52	52	53	51	51
XSBN	42	42	37	43	45	44	45	47	44
DH	22	41	47	42	41	41	36	30	42
NJ	56	57	52	59	58	58	58	52	58
DQ	60	60	60	61	60	60	60	61	60
RL	59	59	59	58	59	59	59	59	59
WSDL	58	53	51	56	54	53	56	54	57
LJ	49	44	41	50	46	48	50	50	48
BS	36	39	32	33	39	40	37	35	36
BH	40	16	12	24	20	18	24	23	22
CZ	9	18	19	16	16	17	15	16	17
FCG	18	13	10	22	13	13	13	20	16
GG	46	49	44	48	49	50	48	57	50
GL	23	24	24	11	22	23	19	25	19
HC	32	37	35	36	34	36	32	35	35
LB	24	10	12	13	11	12	12	14	10
LZ	29	27	20	25	29	29	26	32	25
NN	25	20	17	17	18	20	20	25	20
QZ	28	36	31	32	33	33	30	33	33
WZ	13	21	20	10	17	21	18	15	14
YL	31	28	23	26	30	32	34	45	29
AS	47	35	34	39	38	37	41	43	39
BJ	61	61	61	60	61	61	61	60	61
DY	27	23	38	18	21	16	21	12	21
GY	37	47	49	49	44	45	42	46	45
KL	50	55	55	55	56	56	54	56	55
LPS	33	46	53	37	42	42	38	22	41
TR	45	50	50	45	50	47	46	39	46
XY	57	56	57	52	57	57	57	53	56
ZY	48	54	56	51	55	55	55	58	53
HK	39	40	39	44	40	38	39	34	40
SY	43	43	46	40	43	43	44	39	43
DZ	54	45	45	46	47	46	51	48	47
QH	55	58	58	53	53	54	52	42	54

References

1. Zhu, Y.; Wang, X.; Ye, Q.; Duan, C. Comprehensive evaluation method for tidal current power generation device. J. Mod. Power Syst. Clean Energy 2016, 4, 702–708.
2. Yang, X.; Li, H.; Yin, Z.; Jiang, L.; Meng, J.; Jiang, Z. Energy efficiency index system for distribution network based on analytic hierarchy process. Autom. Electr. Power Syst. 2013, 37, 146–150.
3. Goh, H.; Kok, B. Application of analytic hierarchy process (AHP) in load shedding scheme for electrical power system. In Proceedings of the 2010 9th International Conference on Environment and Electrical Engineering, Prague, Czech Republic, 16–19 May 2010; pp. 365–368.
4. Zhao, D.; Li, C.; Wang, Q.; Yuan, J. Comprehensive evaluation of national electric power development based on cloud model and entropy method and TOPSIS: A case study in 11 countries. J. Clean. Prod. 2020, 277, 123190.
5. Zhang, W.; Gu, X. Comprehensive evaluation of wind farms using variation coefficient method and technique for order preference by similarity to ideal solution. Power Syst. Technol. 2014, 38, 2741–2746.
6. Jun, L.; Jiguang, L.; Jiangang, Y.; Tangbing, L. Application of attribute recognition and G1-entropy method in evaluation of power quality. Power Syst. Technol. 2009, 33, 56–61.
7. Wang, Y.; Zhang, X.; Yan, J. Weight-varying gray cloud model based comprehensive evaluation on technical performance of overall grid-integration of wind farm. Power Syst. Technol. 2013, 37, 3546–3551.
8. Li, H.; Guo, S.; Tang, H.; Li, C. Comprehensive evaluation on power quality based on improved matter-element extension model with variable weight. Power Syst. Technol. 2013, 37, 653–659.
9. Yu, J.; Chen, Y.; Zhao, J.; Yan, L. Comprehensive evaluation system of rural area network line loss management. In Proceedings of the 2018 Chinese Control and Decision Conference (CCDC), Shenyang, China, 9–11 June 2018; pp. 6464–6469.
10. Wang, J.; Gao, H.; Zou, G.; Wu, Z. Comprehensive evaluation of impacts of distributed generation on voltage and line loss in distribution network. In Proceedings of the 2015 5th International Conference on Electric Utility Deregulation and Restructuring and Power Technologies (DRPT), Changsha, China, 26–29 November 2015; pp. 2063–2067.
11. Li, B.; Yan, K.; Lu, Y.D.; Lu, F. Lean Management of Line Loss in Municipal Power Grid. In Proceedings of the 2018 2nd IEEE Conference on Energy Internet and Energy System Integration (EI2), Beijing, China, 20–22 October 2018; pp. 1–6.
12. Bin, L.; Zhufeng, L.; Liujun, H.; Daiyuan, B. Comprehensive Evaluation Benchmark Study on Line Loss Management of County Power Grids. Power Syst. Technol. J. 2016, 40, 3871–3880.
13. Madan, S.; Bollinger, K. Applications of artificial intelligence in power systems. Electr. Power Syst. Res. 1997, 41, 117–131.
14. Yao, M.; Zhu, Y.; Li, J.; Wei, H.; He, P. Research on predicting line loss rate in low voltage distribution network based on gradient boosting decision tree. Energies 2019, 12, 2522.
15. Su, H.Y.; Liu, T.Y. Enhanced-Online-Random-Forest Model for Static Voltage Stability Assessment Using Wide Area Measurements. IEEE Trans. Power Syst. 2018, 33, 6696–6704.
16. Leimeister, J.M. Collective intelligence. Bus. Inf. Syst. Eng. 2010, 2, 245–248.
17. Gloor, P.; Cooper, S. The new principles of a swarm business. MIT Sloan Manag. Rev. 2007, 48, 81.
18. Malone, T.W.; Klein, M. Harnessing collective intelligence to address global climate change. Innov. Technol. Gov. Glob. 2007, 2, 15–26.
19. Gregg, D.G. Designing for Collective Intelligence. Commun. ACM 2010, 53, 134–138.
20. Yi, K. Harnessing collective intelligence in social tagging using Delicious. J. Am. Soc. Inf. Sci. Technol. 2012, 63, 2488–2502.
21. Robu, V.; Halpin, H.; Shepherd, H. Emergence of consensus and shared vocabularies in collaborative tagging systems. ACM Trans. Web (TWEB) 2009, 3, 1–34.
22. Boza, P.; Evgeniou, T. Artificial intelligence to support the integration of variable renewable energy sources to the power system. Appl. Energy 2021, 290, 116754.
23. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
24. Lévy, P. From social computing to reflexive collective intelligence: The IEML research program. Inf. Sci. 2010, 180, 71–94.
25. Vaidya, O.S.; Kumar, S. Analytic hierarchy process: An overview of applications. Eur. J. Oper. Res. 2006, 169, 1–29.
26. Li, X.; Wang, K.; Liu, L.; Xin, J.; Yang, H.; Gao, C. Application of the entropy weight and TOPSIS method in safety evaluation of coal mines. Procedia Eng. 2011, 26, 2085–2091.
27. Behzadian, M.; Otaghsara, S.K.; Yazdani, M.; Ignatius, J. A state-of the-art survey of TOPSIS applications. Expert Syst. Appl. 2012, 39, 13051–13069.
28. Lu, H.; Zhu, C.; Cao, X.; Hsu, Y. The sustainability evaluation of masks based on the integrated rank sum ratio and entropy weight method. Sustainability 2022, 14, 5706.
29. Faber, D.; Korn, H. Applicability of the coefficient of variation method for analyzing synaptic plasticity. Biophys. J. 1991, 60, 1288–1294.
30. Zafar, S.; Alamgir, Z.; Rehman, M. An effective blockchain evaluation system based on entropy-CRITIC weight method and MCDM techniques. Peer-to-Peer Netw. Appl. 2021, 14, 3110–3123.
31. Kou, G.; Lin, C. A cosine maximization method for the priority vector derivation in AHP. Eur. J. Oper. Res. 2014, 235, 225–232.
32. Liu, G.; Yu, J. Gray correlation analysis and prediction models of living refuse generation in Shanghai city. Waste Manag. 2007, 27, 345–351.
33. Sherafatpour, Z.; Roozbahani, A.; Hasani, Y. Agricultural water allocation by integration of hydro-economic modeling with Bayesian networks and random forest approaches. Water Resour. Manag. 2019, 33, 2277–2299.

Figure 1. Research block diagram of big-data-driven paradigm.

Figure 2. Artificial intelligence method to analyze the process diagram.

Figure 3. Data-driven model based on collective intelligence theory.

Figure 4. The flowchart of the random forest algorithm.

Figure 5. Big-data-driven line loss lean management evaluation process.

Figure 6. The geographic location of the 61 power grid enterprises.

Figure 7. Comparison diagram of predicted values and comprehensive evaluation values of test samples.

Figure 8. Index scores of MM and SW.

Figure 9. Index scores of LPS and YL.

Figure 10. Comparison of multi-model evaluation results.

Table 1. The index system of municipal power grid enterprise line loss management.

Dimension	Index
Plan	Standard rate of power supply radius of power grid (GH1)
	Capacity load ratio of main network (GH2)
	Proportion rate of reactive power allocation in substations (GH3)
	Reactive power compensation rate of 10kV distribution transformer (GH4)
Manage	The proportion of old and low electric energy meters (GL1)
	Power consumption ratio of automatic meters (GL2)
	Calculation and measurement of fault error rate (GL3)
	The rate of abnormal line loss solvation (GL4)
	Four types of terminal data acquisition integrity (GL5)
	The rate of abnormal line loss (GL6)
Work	Transformer status ratio (YX1)
	Line state ratio (YX2)
	Comprehensive voltage rate (YX3)
	Power factor rate (YX4)
	The qualified rate of bus power imbalances (YX5)
Technique	The ratio of main transformer with load regulating (JS1)
	Availability of reactive power compensator in substations (JS2)
	Station electricity consumption rate (JS3)
	Energy-saving transformer ratio (JS4)
	High-loss transformer ratio (JS5)

Table 2. Evaluation accuracy procedures.

Sample	Accuracy (%)	MAE	MAPE
Test Sets	91.07	5.40	0.070

Table 3. Table of Spearman rank correlation coefficients.

Spearman Rank Correlation Coefficient	Random Forest	LightGBM	Broad Learning System	Deep Forest
Analytic hierarchy process	0.921	0.909	0.822	0.819
Entropy weight method	0.835	0.831	0.830	0.754
TOPSIS method	0.916	0.905	0.915	0.813
Weighted rank-sum ratio	0.874	0.879	0.886	0.783
Coefficient of variation method	0.882	0.877	0.876	0.796
CRITIC weight method	0.863	0.861	0.858	0.782
Cosine value method	0.898	0.901	0.901	0.80
Gray correlation analysis method	0.866	0.865	0.860	0.786

Table 4. Post-verification calculation results.

	Random Forest	LightGBM	Broad Learning System	Deep Forest
$\rho_k$	0.885	0.881	0.869	0.795
t-statistic	14.659	14.371	13.625	10.122
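
To make the quantities in Tables 3 and 4 concrete, the sketch below computes a Spearman rank correlation coefficient and the usual t-statistic for testing its significance. It assumes the classical no-ties formula $\rho = 1 - 6\sum d_i^2 / (n(n^2-1))$ and $t = \rho\sqrt{(n-2)/(1-\rho^2)}$; the paper's own definitions appear in Section 4.4 and may differ in detail, and the helper names and toy rankings here are hypothetical.

```python
import math


def ranks(values):
    """Rank values from 1 (smallest) to n; assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(x, y):
    """Spearman rank correlation via the sum of squared rank differences."""
    n = len(x)
    rank_x, rank_y = ranks(x), ranks(y)
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * d_squared / (n * (n * n - 1))


def t_statistic(rho, n):
    """t-statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return rho * math.sqrt((n - 2) / (1 - rho ** 2))


# Hypothetical scores of five enterprises under two evaluation methods.
method_a = [1.0, 2.0, 3.0, 4.0, 5.0]
method_b = [2.0, 1.0, 4.0, 3.0, 5.0]

rho = spearman_rho(method_a, method_b)
t_value = t_statistic(rho, len(method_a))
print(f"rho = {rho:.3f}, t = {t_value:.3f}")
```

If the sample size is taken to be the 61 enterprises, this formula gives a t-statistic of roughly 14.6 for $\rho = 0.885$, close to but not identical with the tabulated 14.659; the gap may come from rounding of the reported coefficient or from a slightly different test formulation.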


© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

