1. Introduction
In today’s international context, the business environment (BE), as the basis of the active economy, affects the operation of enterprises and the development of the world economy through many aspects such as electricity [1], taxation [2], and the protection of property rights [3]. As electricity is one of the main energy sources in modern society, the electricity-optimized business ecosystem (EOBE) has become an important part of BE and is closely related to the international economy [4,5]. The access to electricity index (AEI) is an important criterion with which to evaluate EOBE, and its accurate application is an effective means of optimizing EOBE. The AEI has become a research hotspot in this field because it allows the level of EOBE to be evaluated from multiple dimensions and the reasons for its changes to be analyzed.
Numerous scholars have conducted extensive research on improving the AEI. In 2001, the World Bank published the Doing Business Evaluation Indicator System, the first set of indicators to measure and evaluate the business environment provided to companies in a region or country. Since electricity has become one of the necessary energy sources for enterprises’ daily operations, the World Bank incorporated the index of “access to electricity” into its Index System of Doing Business in 2010, mainly considering the efficiency of enterprises’ access to electricity supply and the stability of power supply [6,7,8]. Maryam Doroodi et al. pointed out that, in addition to the dimension of power acquisition efficiency at the enterprise level, the environmental impact caused by power supply should also be considered [9]. Qiang Christine Zhenwei et al. also contributed to the development of EOBE by assessing the quality of power regulations [10]. In 2024, the World Bank launched Business Ready (B-READY), which builds on the Doing Business indicator system and takes into account three dimensions: regulatory framework, public services, and overall efficiency [11].
To study the nature of AEI changes, many scholars have carried out extensive research on root cause tracing methods for target variables. Zhu et al. conducted an in-depth analysis of the dominant influences on good production by calculating gray correlation, maximum mutual information, feature importance, and SHapley Additive exPlanations (SHAP) values. This provides a promising method for the prediction and contribution analysis of multifactorial influences [12]. In their study of shale brittleness assessment methods, Wang et al. introduced SHAP value analysis for quantitatively assessing factor impacts, providing a ranking of feature contributions and visualizing contribution trends [13]. Wang et al. found, by studying the flow of electricity from generation to the power system, that the generation and reliability of electricity supply are affected by climatic conditions [14]. Liang Weihui et al. simulated the experimental results under different feature combinations by controlling feature variables and accurately quantified the contribution degree of each feature [15]. All of the above studies combine a prediction model with contribution analysis to analyze the reasons for changes in the target value. In practical applications, however, the accuracy of contribution degree analysis is affected by many complicated factors.
Developing a suitable prediction model is the key to ensuring the accuracy of prediction and contribution analysis. Emily Royal et al. used neural networks for the time series forecasting of electricity loads and found that electricity loads are growing at a high rate and are affected by seasonal factors [16]. Li et al. used the Extreme Gradient Boosting (XGBoost) model optimized by the Northern Goshawk Optimization (NGO) algorithm to forecast electricity load, which significantly reduced the mean absolute percentage error and improved the coefficient of determination [17]. Fan et al. developed a novel short-term load forecasting model that combines several machine learning methods such as support vector regression (SVR), grey catastrophe, and random forest (RF) modeling. This improves the forecasting accuracy and reduces the forecasting randomness [18]. Wu et al. proposed a Bayesian network-based method for predicting blast furnace gas generation, which carries out the prediction on the training event set. The method is constructed in two dimensions, interval generation and interval time, and effectively improves the prediction accuracy [19]. The above studies focus on methods for improving prediction accuracy, which in turn can increase the accuracy of contribution analysis results. However, in practical applications, prediction accuracy is not the only factor that contribution analysis should take into account.
An effective feature extraction method also plays a crucial role in the results of contribution analysis. Kan et al. extracted the wind speed distribution features of actual wind farms through data-mining methods and improved the accuracy of identifying the effective operating conditions of wind turbines under different wind speed intervals [20]. Wang et al. used multiscale data mining for three-stage feature screening and then increased prediction accuracy by combining the features through convolutional neural network (CNN) modeling [21]. Yu et al. investigated a fast control method for distributed photovoltaic countercurrent prevention based on multi-dimensional data mining, which illustrates the idea of extracting features using data mining [22]. Most of the above studies screen features using data mining, which reflects the influence of experimental data on prediction results well. However, because application scenarios differ, the specific feature extraction method should be adapted to the application objectives.
In general, the studies reviewed above improved the AEI in various ways to adapt it to the local EOBE and selected appropriate contribution quantification methods to conduct root cause tracing research on target variables. To further promote the development of EOBE, more factors affecting EOBE need to be considered, and appropriate evaluation criteria should be specified. In addition, improving the accuracy of contribution quantification by optimizing the prediction model and the feature screening method yields a reasonable contribution analysis model, which has a positive impact on the development of root cause tracing techniques. It is also worth noting that the current AEI has certain limitations when evaluating EOBE in different regions: the level of the local EOBE cannot be evaluated in a targeted manner, so the results of subsequent root cause tracing research on AEI changes are inaccurate.
Inspired by this, and working on the basis of the traditional AEI, which focuses only on enterprises’ gain power efficiency and power reliability, this paper adds considerations of the soundness and rationality of the regulatory framework and of the level of power public services. It then proposes an EOBE evaluation model that considers three dimensions: power regulatory quality, public service level, and enterprises’ gain power efficiency. In addition, a feature extraction method targeting the influencing factors behind AEI changes is proposed, and the main causes of EOBE changes are studied. This provides a specific case and theoretical framework for EOBE optimization. The contributions of this paper can be summarized as follows:
The EOBE evaluation model is constructed, considering three dimensions: power regulatory quality, public service level, and enterprises’ gain power efficiency. This model includes the subjective evaluation of the soundness of the regulatory framework and the level of power public services in EOBE, breaking through the limitation of the traditional AEI that only assesses EOBE from an objective perspective. This model can measure and evaluate the local EOBE level simultaneously from both subjective and objective aspects.
This EOBE evaluation model also takes into account the correlation between EOBE and the environment. It has corresponding indicators to assess whether the government considers the sustainable development of the environment in the development of EOBE. In this way, EOBE can be accurately evaluated from the three perspectives of society, environment, and enterprise.
A root cause tracing method is proposed for AEI changes. This is based on data mining to collect the multi-dimensional data of sample materials, extract multi-dimensional features, and analyze the influence and mechanism of each feature on AEI changes by the SVR model and SHAP values. Thus, the method can trace the root causes of AEI changes.
The remainder of this paper is organized as follows: Section 2 constructs a new EOBE evaluation model. Section 3 introduces the theoretical basis and main process of the AEI change root cause tracing model. Section 4 investigates the root causes of AEI changes. Finally, Section 5 presents a summary of the main findings.
3. Root Cause Tracing Model of AEI Change
3.1. Apriori Algorithm
In order to establish the factors affecting AEI changes and the mechanism by which they act, this paper conducts root cause tracing experiments on AEI changes. To increase the accuracy of these experiments, this paper uses association rule mining, a data-mining method, to extract the coupling features between AEI indicators. The association rule algorithm is introduced below.
The association rule algorithm is a data-mining technique. The purpose of association rules is to discover the correlation between the frequency of the occurrence of certain items and the frequency of the occurrence of other items through the relationships between items in a data set.
Suppose that there is a transaction set $T = \{t_1, t_2, \dots, t_n\}$, where $t_i$ denotes a transaction in the transaction set $T$, consisting of several items. If $X$ and $Y$ are itemsets that occur in the transactions of $T$, the association rule is usually expressed in the form $X \Rightarrow Y$. When mining association rules using association rule algorithms, three metrics, support, confidence, and lift, are usually used to evaluate the effectiveness of the rules [24,25].
The formula for calculating support is shown in Equation (18):
$$\mathrm{Support}(X \Rightarrow Y) = P(X \cup Y) = \frac{|\{t \in T : X \cup Y \subseteq t\}|}{|T|} \tag{18}$$
where $P(X \cup Y)$ is the probability of the simultaneous occurrence of $X$ and $Y$ and is represented as the ratio of the number of times both itemsets $X$ and $Y$ appear simultaneously in the transaction set $T$ to the total number of transactions in the transaction set $T$.
The formula for calculating confidence is shown in Equation (19):
$$\mathrm{Confidence}(X \Rightarrow Y) = P(Y \mid X) = \frac{P(X \cup Y)}{P(X)} \tag{19}$$
where $P(Y \mid X)$ is the probability that $Y$ occurs conditional on the occurrence of $X$. It is represented as the ratio of the probability of $X$ and $Y$ appearing together to the probability of $X$ appearing alone.
The formula for calculating lift is shown in Equation (20):
$$\mathrm{Lift}(X \Rightarrow Y) = \frac{P(Y \mid X)}{P(Y)} \tag{20}$$
where $\mathrm{Lift}(X \Rightarrow Y)$ is the ratio of $P(Y \mid X)$ to $P(Y)$. When $\mathrm{Lift} > 1$, it means that $X$ and $Y$ are positively correlated and $X \Rightarrow Y$ is a strong association rule; when $\mathrm{Lift} = 1$, it means that $X$ and $Y$ are independent of each other; when $\mathrm{Lift} < 1$, it means that $X$ and $Y$ are negatively correlated or, in the extreme case, mutually exclusive.
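As a small worked illustration of the three metrics, the following sketch computes support, confidence, and lift on a toy transaction set. The item labels and the transactions are hypothetical and are not taken from the data of this study; the function names are chosen for illustration only.

```python
# Toy transaction set (hypothetical items, not data from the study).
transactions = [
    {"A14", "B19"},
    {"A14", "B19", "C17"},
    {"A16", "B19"},
    {"A14", "C17"},
    {"A17", "B18"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(x, y, transactions):
    """P(Y | X) = Support(X u Y) / Support(X)."""
    return support(set(x) | set(y), transactions) / support(x, transactions)

def lift(x, y, transactions):
    """Confidence(X => Y) / Support(Y)."""
    return confidence(x, y, transactions) / support(y, transactions)

s = support({"A14", "B19"}, transactions)       # 2 of 5 transactions
c = confidence({"A14"}, {"B19"}, transactions)  # 2 of the 3 containing A14
l = lift({"A14"}, {"B19"}, transactions)        # (2/3) / (3/5)
```

In this toy example the lift is slightly above 1, so the two items would be read as weakly positively correlated.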
With reference to existing association rule mining algorithms, the Apriori algorithm is selected in this paper to mine the strong association rules in the AEI scores of the experimental region and thereby extract the coupling features. The Apriori association rule algorithm is divided into two parts, frequent itemset generation and association rule generation [26,27].
The steps of the Apriori algorithm are as follows:
Step 1: Read the data, scan the data set, and collect all the items in the data set to generate the candidate 1-itemset $C_1$.
Step 2: Prune $C_1$, keeping the items that meet the min_support condition, to generate the frequent 1-itemset $L_1$.
Step 3: According to the property that the subsets of frequent itemsets must be frequent itemsets and the supersets of infrequent itemsets must not be frequent itemsets, the items in the frequent (k−1)-itemset $L_{k-1}$ are combined to generate the candidate k-itemset $C_k$.
Step 4: Prune $C_k$, keeping the itemsets that meet the min_support condition, to generate the frequent k-itemset $L_k$.
Step 5: Repeat Steps 3 and 4 until the generated frequent itemset is the empty set.
Step 6: Mine association rules from the frequent itemsets based on the min_support and min_confidence requirements.
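The six steps above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions rather than the implementation used in this study: the function names `apriori` and `association_rules`, the toy transactions, and the thresholds in the usage lines are all hypothetical.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {frozenset(itemset): support} for all frequent itemsets."""
    n = len(transactions)
    transactions = [frozenset(t) for t in transactions]

    def frequent(candidates):
        # Pruning step: keep candidates whose support reaches min_support.
        out = {}
        for c in candidates:
            s = sum(c <= t for t in transactions) / n
            if s >= min_support:
                out[c] = s
        return out

    # Steps 1-2: candidate 1-itemsets C_1 -> frequent 1-itemsets L_1.
    c1 = {frozenset([item]) for t in transactions for item in t}
    level = frequent(c1)
    result = dict(level)
    k = 2
    # Steps 3-5: join L_{k-1} into C_k, prune, repeat until L_k is empty.
    while level:
        prev = set(level)
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Every (k-1)-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in prev for s in combinations(c, k - 1))}
        level = frequent(candidates)
        result.update(level)
        k += 1
    return result

def association_rules(freq, min_confidence):
    """Step 6: derive rules X => Y as (X, Y, support, confidence, lift)."""
    rules = []
    for itemset, supp in freq.items():
        for r in range(1, len(itemset)):
            for x in map(frozenset, combinations(itemset, r)):
                y = itemset - x
                conf = supp / freq[x]
                if conf >= min_confidence:
                    rules.append((set(x), set(y), supp, conf, conf / freq[y]))
    return rules

# Hypothetical usage on toy data (not the study's data or thresholds).
transactions = [{"A14", "B19"}, {"A14", "B19", "C17"}, {"A16", "B19"},
                {"A14", "C17"}, {"A17", "B18"}]
freq = apriori(transactions, min_support=0.4)
rules = association_rules(freq, min_confidence=0.5)
```

The rule-generation step can look up `freq[x]` and `freq[y]` directly because every subset of a frequent itemset is itself frequent and therefore already stored.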
3.2. Support Vector Regression Prediction Model
High-precision prediction models are necessary for accurately quantifying the contribution degree of features. The support vector regression (SVR) model uses kernel functions to capture the nonlinear relationships in sample data while limiting overfitting. It also handles high-dimensional, small-sample data well [28]. Because the experimental data set is small, the SVR model is well suited to this study. The Decision Tree (DT) model has a low requirement for sample size and is robust to missing values, which it can handle through alternative splitting [29]. Therefore, the DT model copes well with volatile data, which helps ensure the stability and prediction accuracy of the model. The feedforward neural network (FNN) model automatically learns complex nonlinear relationships through multiple layers of neurons and activation functions [30]. Theoretically, it can approximate any function and achieve high accuracy with sufficient data. Because the data in this study are limited, the FNN model cannot fully realize this accuracy potential. However, an FNN learns feature interactions automatically and adjusts the feature weights according to the trend of the data, which can improve the prediction accuracy of the model.
Since SVR, DT, and FNN are all applicable to this study, this paper applies all three types of prediction model in order to maximize prediction accuracy. The accuracies of the three models are compared after Bayesian optimization. Finally, according to the comparison results, the SVR model is selected to carry out the AEI root cause tracing research.
SVR is a regression analysis method based on the Support Vector Machine that aims to predict the value of a continuous variable by finding an optimal hyperplane. The formula for the optimal hyperplane is shown in Equation (21):
$$f(x) = \omega^{T} \varphi(x) + b \tag{21}$$
where $f(x)$ is the regression hyperplane, $\omega$ is the weight parameter vector of the regression hyperplane, $\varphi(x)$ is the mapping function, and $b$ is the bias term.
To find a function that predicts the target value while keeping the error between every training data point and its predicted value within a certain tolerance, a parameter $\varepsilon$ (called the “tolerance error”) is introduced, i.e., an error within $\varepsilon$ is considered to be reasonable.
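For reference, the role of the tolerance parameter can be made explicit with the standard $\varepsilon$-insensitive formulation of SVR from the general literature. This formulation is not reproduced from the present study: the penalty coefficient $C$ and the slack variables $\xi_i$, $\xi_i^{*}$ are symbols introduced here for illustration, while $\omega$, $f$, and $\varepsilon$ follow the definitions above.

```latex
\min_{\omega,\, b,\, \xi,\, \xi^{*}} \ \frac{1}{2}\lVert \omega \rVert^{2} + C \sum_{i=1}^{n} \left( \xi_i + \xi_i^{*} \right)
\quad \text{s.t.} \quad
\begin{cases}
y_i - f(x_i) \le \varepsilon + \xi_i, \\
f(x_i) - y_i \le \varepsilon + \xi_i^{*}, \\
\xi_i \ge 0,\ \xi_i^{*} \ge 0, \quad i = 1, \dots, n.
\end{cases}
```

Under this formulation, training points whose error lies within $\varepsilon$ incur no penalty, and only deviations beyond the tolerance are penalized through the slack variables.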
The prediction accuracy of the model is measured by the mean square error (MSE) and the mean absolute error (MAE) [31]. The formula for MSE is shown in Equation (22):
$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2 \tag{22}$$
The formula for MAE is shown in Equation (23):
$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right| \tag{23}$$
In Equations (22) and (23), $n$ is the total number of samples, $y_i$ is the true value, and $\hat{y}_i$ is the predicted value.
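The two error measures can be computed directly from their definitions, as in the following sketch. The score values used here are hypothetical and serve only to illustrate the arithmetic.

```python
def mse(y_true, y_pred):
    """Mean square error: average of squared differences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mae(y_true, y_pred):
    """Mean absolute error: average of absolute differences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical indicator scores (not results from the study).
y_true = [90.0, 95.0, 85.0, 100.0]
y_pred = [92.0, 94.0, 88.0, 100.0]

mse_value = mse(y_true, y_pred)  # (4 + 1 + 9 + 0) / 4
mae_value = mae(y_true, y_pred)  # (2 + 1 + 3 + 0) / 4
```

Because MSE squares each error, the single 3-point miss dominates it, whereas MAE weights all errors linearly.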
3.3. SHAP Value
In order to realize the root cause tracing study of AEI changes, this paper selects the SHAP value and prediction model to analyze the contribution degree of extracted features, so as to study the main influencing factors of AEI changes.
The SHAP value allows us to interpret the predictions of a machine learning model based on the Shapley value concept in cooperative game theory. It uses the Shapley value to assign a value to the contribution of each feature and thereby explain the model’s predictions [32]. The core idea of SHAP value calculation is to consider all possible combinations of features and evaluate the contribution of each feature in all possible combinations.
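The idea of evaluating each feature over all possible feature combinations can be illustrated by a brute-force Shapley computation. The sketch below is not the procedure used in this study: the toy model, the baseline vector, and the convention of representing an "absent" feature by its baseline value are all assumptions made for illustration (practical SHAP implementations approximate absent features from background data instead).

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values of `predict` at `x`; absent features use `baseline`."""
    m = len(x)

    def value(subset):
        # Model output when only the features in `subset` take their real values.
        z = [x[j] if j in subset else baseline[j] for j in range(m)]
        return predict(z)

    phi = []
    for j in range(m):
        others = [k for k in range(m) if k != j]
        total = 0.0
        for size in range(m):
            # Shapley weight for subsets of this size.
            weight = factorial(size) * factorial(m - size - 1) / factorial(m)
            for s in combinations(others, size):
                total += weight * (value(set(s) | {j}) - value(set(s)))
        phi.append(total)
    return phi

# Hypothetical model with an interaction between features 1 and 2.
def model(z):
    return 2.0 * z[0] + z[1] * z[2]

x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(model, x, baseline)
```

In this example the contributions sum to the difference between the prediction at `x` and the prediction at the baseline, and the interaction term is shared equally between the two features involved. The enumeration grows exponentially with the number of features, so it is only practical for small feature sets.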
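Suppose that the $i$-th sample of the problem under study is $x_i$, the $j$-th feature of the $i$-th sample is $x_{ij}$, the model’s predicted value for the sample is $y_i$, and the base value of the entire model (usually the mean value of the target variable over all samples) is $y_{\mathrm{base}}$. Then, the SHAP values satisfy Equation (24):
$$y_i = y_{\mathrm{base}} + \sum_{j=1}^{M} \phi_{ij} \tag{24}$$
where $\phi_{ij}$ is the SHAP value of $x_{ij}$ and $M$ is the total number of features.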
The SHAP value of a feature is its weighted average marginal contribution, where a marginal contribution is the difference between the model’s prediction after the feature is added to a subset of features and the prediction before it is added. The SHAP value is calculated as shown in Equation (25):
$$\phi_{j} = \sum_{S \subseteq \{1, \dots, M\} \setminus \{j\}} \frac{|S|!\,(M - |S| - 1)!}{M!} \left[ f_x(S \cup \{j\}) - f_x(S) \right] \tag{25}$$
where, for a given sample, $j$ represents the subscript of the feature in the data set, $M$ is the total number of feature variables, $S$ is a subset of the set $\{1, \dots, M\} \setminus \{j\}$, $|S|$ is the total number of elements in $S$, $f_x(S \cup \{j\})$ represents the predicted value of the model when only the features in $S \cup \{j\}$ are included, and $f_x(S)$ represents the predicted value of the model when only the features in $S$ are included; the difference between the two is the marginal contribution of the $j$-th feature variable under subset $S$.
3.4. The Model Overall Operation Flow
The overall flow of this root cause tracing model of AEI change is shown in Figure 1. The “a priori” section refers to the literature [27], while the rest is original. The output data of the “EOBE evaluation model” section are the scores of each indicator. The data in the “Feature” section are derived from Section 4.2, “Eigenmatrix Construction”. First, the AEI score is calculated using the constructed EOBE evaluation model. Then, the Apriori algorithm is used to mine the association rules among indicators and extract the coupling features, and the feature matrix is constructed by combining these with the time series features and policy features. Next, the feature values and target values are fed to the SVR, DT, and FNN models, and their prediction accuracies are compared after Bayesian optimization. Finally, the optimal prediction model is used to calculate the SHAP values and analyze the feature contribution degrees.
4. Case Analysis
4.1. Calculation of the Access to Electricity Index Score
4.1.1. Data Sources
The original data used in this study are the data needed to calculate the scores of the EOBE evaluation model; they were obtained from the official websites of the relevant departments. The raw data for the calculation of the AEI come from the Guangdong Energy Bureau, the National Bureau of Statistics, and the official website of the China Southern Power Grid, and they cover the period 2021–2024 for a region of Guangdong Province, as required by the improved AEI system of this paper. The types of raw data required for the calculation of the AEI are summarized in Table 2. The parameter definitions in the table are all from Section 2.2, entitled “AEI calculation Method”. The data of the first six indicator parameters are derived from the results of the distributed questionnaire survey, while the data of the last three indicator parameters are from the business environment module of the online platform of China Southern Power Grid. The raw data for the calculation of the AEI consist of two overall types, expert judgment and actual values. Among them, the data required for indicators
–
are of the expert judgment type. These are assessed and judged by experts, using the information on the official website of the data source, and assigned values to the original data; the data required for indicators
–
are of the actual value type, which are the official and publicized real data collected from the official website of the data source.
4.1.2. Calculation and Analysis of Indicator Scores
Working according to the improved AEI formula, the indicator score calculation model is constructed in MATLAB R2021a, and the raw data collected and organized for indicator calculation are used as inputs to obtain the AEI score of a region in Guangdong Province, as shown in
Figure 2,
Figure 3 and
Figure 4 (the data in
Figure 2,
Figure 3 and
Figure 4 are all calculated from the data in
Table 2 through the calculation method in
Section 2.2).
As can be seen from
Figure 2, during 2021–2024, the “access to electricity” score of the experimental area fluctuates around 95 points, with the maximum fluctuation range not exceeding 5 points. Among the primary indicators, the scores of “power regulatory quality” and “public service level” remained at 100 points over the four years, indicating that the experimental area has always been at the forefront of EOBE in these two respects. The score of the primary indicator “Enterprises’ gain power efficiency” was lower than the total score, indicating that there is still considerable room for improvement in enterprises’ gain power efficiency. As can be seen from Figure 3, the score curves of the two primary indicator groups, power regulatory quality and public service level, both stay at 100 points, indicating that both are superior and stable in the experimental area. As can be seen from Figure 4, the scores of the secondary indicators in the “Enterprises’ gain power efficiency” group fluctuated over the four years but showed an overall upward trend. A possible cause is that the objects evaluated by these indicators are affected by market economy changes, climate instability, and other factors, which results in large fluctuations in the scores of the “Enterprises’ gain power efficiency” indicators.
4.2. Eigenmatrix Construction
The level of access to electricity is usually affected by national policies, market economic development trends, climate change, and the internal coupling factors of the AEI. Therefore, in order to trace the root causes of changes in the level of access to electricity, the features of AEI changes are divided into time series features, coupling features, and policy features. The time series features reflect the influence of the market economy’s development trend, climate change, and other factors on the change in the level of access to electricity, and the influence differs by year and by month. The coupling features reflect the influence of the internal coupling factors of the AEI, and the policy features reflect the influence of national policy, national development plans, and other factors. Finally, the feature matrix of the experimental data set is constructed from the time series features, coupling features, and policy features.
4.2.1. Time Series Feature Extraction
In this paper, the AEI values are evaluated on a monthly cycle. The indicator scores of the same month form one group of data, which reflects the impact of time on the level of access to electricity. The year and month of each group of data are extracted as its time series features.
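A minimal sketch of this extraction is given below. It assumes one record per calendar month over 2021–2024, which gives 48 groups of data; the record layout and the helper name `time_features` are hypothetical.

```python
from datetime import date

def time_features(d):
    """Return the year and month of one monthly record as its time series features."""
    return {"year": d.year, "month": d.month}

# Assumed layout: one record per month from January 2021 to December 2024.
months = [date(2021 + i // 12, i % 12 + 1, 1) for i in range(48)]
features = [time_features(d) for d in months]
```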
4.2.2. Coupling Feature Extraction
In this paper, the Apriori algorithm is used to mine the experimental data, and the coupling features of each group of data are extracted according to the mined association rules. First, the experimental data were categorized. Since the total score of each indicator was set at 100 points, the categorization interval was set at 5, and the scores of each indicator were divided into 20 categories, which were assigned numbers in combination with the type of the indicator. For example, the number A12 indicates that the score for indicator A at that point in time is between 60 and 65.
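The categorization can be sketched as a simple binning function. The numbering convention below is an assumption inferred from the A12 example (category number equals the score divided by 5, rounded down); placing a full score of 100 in the top category, 19, is likewise an assumption, and the helper name `score_to_item` is hypothetical.

```python
def score_to_item(indicator, score, width=5, n_bins=20):
    """Map an indicator score in [0, 100] to an item label such as 'A12'."""
    if not 0 <= score <= 100:
        raise ValueError("score must lie between 0 and 100")
    # Assumed convention: bin k covers [k * width, (k + 1) * width);
    # a score of exactly 100 is placed in the top bin.
    k = min(int(score // width), n_bins - 1)
    return f"{indicator}{k}"

label_mid = score_to_item("A", 62.5)
label_top = score_to_item("A", 100)
```

Each month's set of labels then forms one transaction for association rule mining.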
Since the data sample used in this paper is small, the support and confidence values observed in trial mining were used to choose the support and confidence thresholds for this data set. After three attempts at mining, it was found that a min_support of 10% and a min_confidence of 50% are the best combination for mining two-item association rules, so these thresholds were adopted. Using the Apriori algorithm, the strong association rules that satisfy the minimum support and minimum confidence conditions were obtained from the database of access to electricity indicator scores: Combination 1 (Outage Time 14 and Outage Frequency 19), Combination 2 (Outage Time 14 and Electricity Service Cost 17), Combination 3 (Outage Time 16 and Outage Frequency 19), Combination 4 (Outage Time 16 and Electricity Service Cost 17), Combination 5 (Outage Time 17 and Outage Frequency 19), Combination 6 (Outage Time 17 and Electricity Service Cost 18), Combination 7 (Outage Time 19 and Outage Frequency 19), Combination 8 (Outage Frequency 18 and Electricity Service Cost 17), and Combination 9 (Outage Frequency 19 and Electricity Service Cost 18). These strong association rules (support ≥ 10%, confidence ≥ 50%), which describe the internal coupling between pairs of access to electricity indicators, are shown in Table 3 (the data in Table 3 are calculated from the data in Figure 4 through the model in the “a priori” section of Figure 1).
The cause (antecedent) and result (consequent) of each strong association rule are used as the judgment condition: if, in the access to electricity indicator scores of a certain month, the cause item of the rule meets the condition and the score of the result item is less than or equal to that in the judgment condition, the month is regarded as conforming to the rule. According to this judgment method, each group of data in the AEI score data set is checked in turn against the strong association rules in Table 1, the number of strong association rules with which the group complies is taken as its coupling feature, and the coupling features of the experimental data set are finally formed.
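One possible reading of this judgment method is sketched below. It assumes that each rule is stored as a (cause indicator, cause category, result indicator, result category) tuple, that "meeting the condition" means the cause category matches exactly, and that the result is compared by category number. The direction of the two example rules and the monthly values are hypothetical.

```python
def coupling_feature(month_bins, rules):
    """Count the strong association rules that one month's categories conform to."""
    count = 0
    for cause, cause_bin, result, result_bin in rules:
        # Assumed test: the cause category matches the rule and the result
        # category does not exceed the one stated in the rule.
        if month_bins[cause] == cause_bin and month_bins[result] <= result_bin:
            count += 1
    return count

# Two combinations from the text, with an assumed cause -> result direction.
rules = [
    ("outage_time", 14, "outage_frequency", 19),
    ("outage_time", 14, "service_cost", 17),
]
# Hypothetical categories for one month.
month_bins = {"outage_time": 14, "outage_frequency": 19, "service_cost": 18}
n_conforming = coupling_feature(month_bins, rules)
```

Here the month conforms to the first rule only, because its service cost category exceeds the one stated in the second rule.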
4.2.3. Policy Feature Extraction
According to the evaluation content and evaluation objectives of each indicator, experts’ suggestions were collected to select and establish the search keywords for each indicator, and keyword searches were then carried out on the official website of the National Energy Administration to collect and organize the policy documents related to each indicator over the past four years. After correlation testing and sorting, the number of items of government information released per month during 2021–2024 that relate to each indicator is taken as the policy feature of this data set.
4.3. Comparison of Predictive Models
The indicator scores in the dimension of enterprises’ gain power efficiency are highly volatile and dynamic, so a prediction model is used to predict the scores of the three secondary indicators of this dimension in the region: average customer outage time, average customer outage frequency, and average customer electricity service cost. To ensure prediction accuracy, three different kinds of prediction model, namely SVR, DT, and FNN, are compared. First, Bayesian optimization is used to obtain the best hyperparameter combination for each model. The optimized models are then trained and used to predict the scores of the three indicators, and their prediction accuracies are compared so as to select the best prediction model for the root cause tracing experiment of each indicator.
The training set accounts for 75% of the data and the test set for 25%. The prediction models were trained on the training set with the root MSE and the MAE as the objective functions to be minimized, and Bayesian optimization was used to tune the model hyperparameters. The optimal hyperparameter combination determined for each model is shown in Table 4 (the parameters in Table 4 are the model parameters of the three models SVR, DT, and FNN after Bayesian optimization).
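The 75%/25% partition can be sketched as follows. Whether the samples were shuffled or split chronologically is not stated, so the seeded random shuffle used here is an assumption, as are the sample count of 48 and the helper name `train_test_split`.

```python
import random

def train_test_split(samples, train_fraction=0.75, seed=0):
    """Shuffle a copy of `samples` and split it into (train, test)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # assumed: random, reproducible split
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

samples = list(range(48))  # hypothetical: one index per monthly record
train, test = train_test_split(samples)
```

With 48 monthly records this gives 36 training samples and 12 test samples.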
The comparative results of the prediction experiments using the different prediction models for the three indicators of average customer outage time, average customer outage frequency, and average customer electricity service cost are shown in Figure 5 and Figure 6 (Figure 5 and Figure 6 show the comparison of the prediction results and prediction accuracy of the three models SVR, DT, and FNN, constructed using the parameters in Table 4, with the data in Figure 4 as the data set). The experimental results show that the best prediction model for all three indicators is the SVR model. A possible explanation is that SVR captures complex data patterns by finding the best hyperplane for regression in a high-dimensional feature space. SVR generalizes better on high-dimensional, small data sets and is able to capture the trends in the training data without overfitting to them. In contrast, DT and FNN overfit on small data sets, which results in lower prediction accuracy.
4.4. Contribution Analysis
The SVR model is constructed using the best combination of hyperparameters after Bayesian optimization, and the SHAP method is used to calculate the feature contributions for the three secondary indicators (average customer outage time, average customer outage frequency, and average customer electricity service cost). The distributions of the influence of the features on the target value of each indicator are visualized in Figure 7, Figure 8 and Figure 9 (Figure 7, Figure 8 and Figure 9 respectively show the contribution degrees of each feature for the three indicators, calculated by the contribution degree analysis model constructed based on the parameters in Table 4 and the method of Section 3.3, “SHAP Value”). In general, the influences on the changes in the AEI are, in descending order, the time series features, the policy features, and the coupling features.
The results of the comparison of SHAP values of each feature in Figure 7 show that the influence of each feature on the average customer outage time indicator is, in descending order, the time series feature (year), the time series feature (month), the policy feature, and the coupling feature. The reason is, first, that the average customer outage time is mainly affected by the professional level of maintenance personnel and the level of power supply infrastructure, both of which change over time. Moreover, a year spans a longer period than a month and has a greater influence on these factors, so the influence of the time series feature (year) is slightly greater than that of the time series feature (month). Second, as the country is committed to improving the professionalism of maintenance personnel and upgrading power equipment, the many relevant policies that have been introduced also have a non-negligible impact on the change in the average customer outage time. Outage time is also affected by outage frequency, but this influence is smaller than that of the other features.
The influence mechanism of each feature on average customer outage time was then analyzed in depth. First, the influence of the year on the indicator is almost always positive; as the years increase, however, this positive influence gradually weakens, which may indicate that the incentive effect of time tends toward saturation. Second, the month has a positive effect in the first half of the year and a negative effect in the second half, which may indicate that climate and other conditions in the first half of the year lead to shorter outage times. Third, policies have a promoting effect on the indicator as a whole, and the greater the policy intensity, the more obvious the promotion effect. Finally, when the coupling between indicators is strong, it has a promoting effect on the indicator, whereas when it is weak, it has almost no effect or even a slightly negative effect.
The comparison of the SHAP values of each feature in Figure 8 shows that the influence of each feature on the average customer outage frequency indicator is, in descending order, the temporal feature (month), the policy feature, the temporal feature (year), and the coupling feature. The reasons are as follows. First, air humidity and temperature vary across months and seasons. At high temperatures, both the power load and the probability of power failures increase, which significantly affects the frequency of power outages. Second, the state attaches great importance to the reliability of power supply and, in response to the current situation of the industry, releases relevant policies and development and investment plans in a timely manner to promote the construction of power grids and optimize the operation of the power system, which significantly reduces the frequency of power outages. At the same time, the power system optimization and infrastructure renewal that take place over the years also affect the frequency of power outages. Finally, the effect of the coupling feature, which arises from the correlation between outage time and outage frequency, is smaller than that of the other features.
The influence mechanism of each feature on the average customer outage frequency was analyzed in depth. First, over the course of the year from January to December, the impact of the month on the indicator gradually decreases in the second half of the year, which may indicate that the power load is larger in the second half of the year and that the climate is more likely to cause power outages. Second, the influences of policy and year on the indicator are almost always positive incentive effects. The difference is that the incentive effect of policy grows with the number of relevant policies, whereas the year has a relatively uniform positive incentive effect overall.
The comparison of the SHAP values of each feature in Figure 9 shows that the influence of each feature on the electricity service cost indicator is, in descending order, the temporal feature (year), the policy feature, the temporal feature (month), and the coupling feature, with the influence of the coupling feature being negligible. The reason is that the cost of electricity service is mainly influenced by market economic development trends and national policy regulation. In addition, the average electricity service cost of customers is largely independent of the other indicators, so it is minimally affected by the coupling feature.
The influence mechanism of each feature on the electricity service cost indicator was analyzed in depth. First, as the years pass, the impact of the year on the indicator gradually changes from negative to positive, and its magnitude gradually increases. This may indicate that, over time, people’s electricity service cost as a proportion of disposable income is gradually decreasing and that people’s consumption level is continually rising. Second, the greater the number of relevant policies, the stronger the incentive effect on the indicator. If the number of policies is too small, it may have a negative impact on the indicator. This may indicate that the indicator is significantly affected by macro-control policies. Finally, the influence of the month shifts gradually from negative to positive from January to December. This may indicate that the demand for electricity is large in the first half of the year, so that electricity charges account for a relatively large proportion of residents’ disposable income.
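The feature rankings discussed for Figures 7 to 9 rest on SHAP values, which build on Shapley values from cooperative game theory. The following minimal sketch illustrates the idea by computing exact Shapley values through enumeration of all feature orderings and then ranking features by their mean absolute contribution. Everything in it is an assumption made for illustration: the function names, the feature names, the linear `toy_model` with arbitrary weights (a stand-in, not the SVR model fitted in this paper), the zero baseline, and the two sample points. It does not reproduce the data or the rankings reported above.

```python
from itertools import permutations
from statistics import mean

# Hypothetical feature names mirroring the four feature groups in the text.
FEATURES = ["year", "month", "policy", "coupling"]


def toy_model(x):
    """Illustrative linear predictor with arbitrary weights (not the paper's SVR)."""
    return (0.8 * x["year"] + 0.5 * x["month"]
            + 0.3 * x["policy"] + 0.1 * x["coupling"])


def shapley_values(model, sample, baseline):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {f: 0.0 for f in FEATURES}
    orderings = list(permutations(FEATURES))
    for order in orderings:
        current = dict(baseline)      # start from the reference input
        previous = model(current)
        for f in order:
            current[f] = sample[f]    # switch feature f to its actual value
            updated = model(current)
            totals[f] += updated - previous
            previous = updated
    return {f: totals[f] / len(orderings) for f in FEATURES}


def rank_features(model, samples, baseline):
    """Rank features by mean absolute Shapley value across samples."""
    per_sample = [shapley_values(model, s, baseline) for s in samples]
    importance = {f: mean(abs(sv[f]) for sv in per_sample) for f in FEATURES}
    ranking = sorted(importance, key=importance.get, reverse=True)
    return ranking, importance


if __name__ == "__main__":
    # Made-up standardized inputs, used only to exercise the functions.
    baseline = {f: 0.0 for f in FEATURES}
    samples = [
        {"year": 1.0, "month": -1.0, "policy": 1.0, "coupling": 2.0},
        {"year": -1.0, "month": 1.0, "policy": -1.0, "coupling": -2.0},
    ]
    ranking, importance = rank_features(toy_model, samples, baseline)
    for name in ranking:
        print(name, round(importance[name], 3))
```

Exhaustive enumeration is only practical for a handful of features, which is the case here with four feature groups. The sign of each per-sample value corresponds to the positive or negative influence described in the text, while the mean absolute value gives the ranking.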
4.5. Comparative Discussion of Similar Studies
4.5.1. Similar Studies
Reference [33] studied the electricity demand in Jiangsu Province and the seasonal and temporal influences on electricity prices using data-mining technology. The study shows that electricity demand fluctuates greatly in summer and winter, when electricity consumption is also relatively high. As a result, electricity charges in these two seasons are also higher.
Reference [34] predicted the electricity demand from 2010 to 2050 by mining and analyzing electricity demand data from before 2010, and analyzed the changes in electricity demand. The results show that global demand for electricity is constantly increasing while fossil energy is constantly decreasing. The introduction of renewable energy power generation technology can meet global electricity demand. However, distributed power generation is strongly affected by the environment and has weak stability, which may reduce the reliability of power supply.
4.5.2. Comparative Discussions
This study and reference [33] are both based on power data from 2024, but the research locations differ. For electricity charges in winter, the results of the two studies are consistent, both showing higher charges in winter. Both sets of results indicate that changes in electricity charges are seasonal but show an overall upward trend within one year. The difference is that the summer electricity charges found in this paper are higher than those reported in reference [33]. Analysis shows that this is caused by the difference in electricity consumption arising from the temperature difference between the two places.
This study and reference [34] are based on power data from 2024 and 2010, respectively. Both sets of results show continuous growth in electricity demand and, correspondingly, rising electricity charges and increasing difficulty in maintaining a reliable power supply. All these results indicate that the efficiency of enterprises in obtaining electricity is affected by temporal characteristics.
This paper and the two similar studies above reveal the changing characteristics and trends of the indicator group for the efficiency of enterprises’ access to electricity. Based on these results, reasonable predictions can be made for electricity charges, power outage time, and power outage frequency, so that decisions on possible future EOBE changes can be made in advance and the sound development of EOBE can be promoted.
4.5.3. Policy Action Recommendations
Based on the results of this study, we put forward the following policy recommendations:
The government can adjust electricity prices according to the seasonality of electricity consumption. This will prevent users’ monthly electricity bills from taking up too large a proportion of their income and will optimize the local EOBE.
With new energy being integrated into the grid, the government can plan a reasonable energy storage configuration, make better use of new energy for power generation, and ensure that power enterprises can meet the electricity demands of power users as far as possible. This can minimize power outages caused by insufficient public power supply.
5. Conclusions
This paper studies an improved AEI system and proposes a root cause tracing method based on data mining and contribution analysis. The main conclusions are summarized as follows:
Based on the World Bank’s B-READY system, the AEI system has been improved by taking into account the quality of electric power regulations, the level of public service, and the efficiency of enterprises’ access to electricity, which expands the approach to constructing indicators for evaluating the EOBE.
Applying the Apriori algorithm to mine association rules and visualizing the results helps to reveal the mutual coupling relationships between the indicators and their influence mechanisms.
By comparing the prediction accuracy of the three models SVR, DT, and FNN, the optimal prediction model is constructed using SVR, and contribution analysis is performed on it. The main factors affecting AEI changes are analyzed; their degree of influence and their influence mechanisms differ across indicators. In general, ordered by influence from high to low, the features are time series features, policy features, and coupling features. The study of the influence mechanism of each feature on AEI shows that the change in AEI is positively related to the year and the number of related policies, and that the average customer outage time indicator is positively incentivized by the coupling feature. As the year progresses from January to December, the average customer outage time and the average customer outage frequency decrease, while the cost of electricity service increases.
The proposed improved AEI system provides a more scientific and reasonable basis for the accurate evaluation of EOBE. The results of the root cause tracing experiment provide direction and targets for EOBE optimization.
This paper improved AEI to enable it to evaluate EOBE more accurately, and the root cause tracing study of the changes in AEI also achieved good results. However, given constantly changing social demands and the background of policy reform and technological development, the improved AEI and the root cause tracing method proposed in this study still have certain limitations. Although the improved AEI adds a subjective assessment of EOBE, the corresponding indicators rely on enterprise questionnaires, are vulnerable to the cognitive bias of respondents, and lack objective data support. In addition, the evaluation criteria of the objective-level indicators are overly static, making it difficult to reflect the impact of policy adjustments or emergencies dynamically. Meanwhile, as the factors become more complex and the number of samples increases, the root cause tracing method also needs continuous optimization of its feature extraction approach and a finer subdivision of the influencing factors.
In view of these limitations, we will continue to optimize the research on AEI. The subjective-level evaluation indicators should be supported by evaluation criteria based on objective data. More assessment details should be added to the objective-level indicators, such as how the assessment standards should be adjusted in the face of policy adjustments and unexpected events. In this way, AEI can evaluate EOBE more comprehensively and accurately. Based on data mining and SHAP values, we will also explore more scientific feature extraction and root cause tracing methods to improve the accuracy and rigor of the analysis of the factors influencing AEI changes and their influence mechanisms, and to provide a more reliable case basis and reference suggestions for EOBE optimization.