Nonlinear Dynamics and Spatial Correlation Pattern of the Digital Economy on Energy Efficiency: Evidence from Ensemble Learning and Spatio-Temporal Graph Neural Network

Cao, Rui; Zhang, Chenjun; Zhao, Xiangyang; Deng, Yanan

doi:10.3390/en19092223


¹ School of Economics and Management, Jiangsu University of Science and Technology, Zhenjiang 212000, China

² School of Business, Renmin University of China, Beijing 100872, China

³ School of Business, Hohai University, Nanjing 211100, China

* Author to whom correspondence should be addressed.

Energies 2026, 19(9), 2223; https://doi.org/10.3390/en19092223

Submission received: 23 March 2026 / Revised: 27 April 2026 / Accepted: 1 May 2026 / Published: 4 May 2026

(This article belongs to the Special Issue Economic and Technological Advances Shaping the Energy Transition)


Abstract

Achieving synergy between the digital economy and energy efficiency is pivotal for realizing high-quality development under the “Dual Carbon” targets. However, traditional econometric methods struggle to capture the complex nonlinear and spatio-temporal dependencies inherent in this relationship. To address this issue, this study develops a two-stage framework using Chinese provincial panel data. It combines LightGBM/CatBoost and SHAP for critical factor identification, and employs STGNN for capturing nonlinear and spatial correlation patterns, to systematically decode the driving mechanisms of the digital economy on energy efficiency. The results reveal three key findings: (1) Complex Nonlinearity: The impact manifests in distinct U-shaped, inverted U-shaped, and weak correlation patterns, accompanied by significant spatial clustering. (2) Structural Heterogeneity: The dimensions of the digital economy show differential associations with energy efficiency. Industrial digitization and infrastructure are associated with more direct improvements in efficiency, whereas digital industrialization functions primarily through indirect technological supply. (3) Spatial Correlation Pattern: Higher levels of digital development correspond to higher local energy efficiency and are linked to positive predicted adjustments in neighboring regions, with notable regional heterogeneity. Combining machine learning-based feature selection with deep learning-based spatiotemporal modeling provides a scientific basis for formulating location-specific digital economy strategies and coordinated energy-saving policies.

Keywords:

digital economy; energy efficiency; nonlinear impact; spatial correlation pattern; machine learning

1. Introduction

Energy, as the driving force and primary engine of economic and social development [1], critically shapes a region’s potential for sustainable development and modernization [2]. Despite concerted global efforts to combat climate change, fossil fuels continue to dominate the energy consumption structure in most economies. This high-carbon energy utilization pattern not only accelerates resource depletion but also results in substantially elevated emissions, posing a severe threat to the ecological environment [2,3,4]. As the world’s leading energy consumer [5], China accounted for 26.5% of the world’s total energy consumption in 2023, with an annual growth rate of 6.6% [6]. Coal consumption constituted 55.3% of total energy consumption, while clean energy consumption represented 26.4%. This underscores an urgent need to improve energy efficiency and curb over-dependence on fossil fuels. The International Energy Agency (IEA) continues to emphasize that energy efficiency is the cornerstone for achieving global climate goals and building an energy security system, as well as an effective pathway to fulfilling the Paris Agreement [7]. Consequently, enhancing energy efficiency and facilitating a green, low-carbon transition have become shared strategic objectives for nations worldwide. Particularly for China, under the dual carbon goals, exploring pathways to enhance energy efficiency has become an urgent priority [3,8,9,10].

At this critical juncture, the wave of the digital economy has emerged as a transformative force. It brings new opportunities for high-quality social development [11] and opens fresh avenues for resolving energy challenges [12]. According to the Global Digital Economy Development Research Report (2024), the digital economy demonstrates strong resilience and vitality. It has become a key growth driver for the global economy, with its development momentum continuously strengthening. In major economies like the United States, China, Germany, and Japan, the digital economy is expanding rapidly and now accounts for approximately 60% of their GDP. As the core vehicle for new-quality productive forces, the digital economy is fundamentally transforming the operation of energy systems through technological innovation and industrial restructuring. Consequently, it is widely regarded as an effective tool for conserving energy, reducing emissions [13] and improving energy efficiency [14]. Therefore, systematically elucidating the mechanisms through which the digital economy impacts energy efficiency, thus identifying its key driving pathways, has become a critical issue of significant strategic importance.

Existing studies have confirmed the positive role of the digital economy in improving energy efficiency, primarily from the perspectives of green finance [15], industrial structure [16], technological progress [5], and resource allocation [1]. While this body of work provides a foundation for understanding their relationship, it has notable limitations. Most research employs traditional econometric models, which struggle to capture the complex, dynamic, and nonlinear connections between the digital economy and energy efficiency. Consequently, the underlying mechanisms of influence have not been fully elucidated.

Based on this, this study constructs a two-stage analytical framework that integrates feature engineering, spatio-temporal graph neural networks (STGNN), and counterfactual simulation. Focusing on 30 Chinese provinces, we quantify the contribution of different dimensions of the digital economy to energy efficiency, clarify the mechanisms through which the digital economy influences energy efficiency, and assess its spatial correlation patterns. The primary contributions of this study are threefold. (1) We establish a comprehensive digital economy evaluation system that encompasses three dimensions: infrastructure development, industrial digitization, and digital industrialization. An integrated LightGBM-CatBoost machine learning approach is introduced, and SHAP analysis is applied to identify the key drivers from multi-dimensional feature data. (2) To overcome the shortcomings of conventional spatial matrices, we develop a novel three-dimensional dynamic spatial adjacency matrix that incorporates geographic proximity, economic distance, and similarity in digital economic development levels. Based on this matrix, we establish a spatio-temporal graph structure. (3) Moving beyond the linear assumptions of traditional models, this study employs an STGNN model to capture the complex relationship between the digital economy and energy efficiency. It reveals the nonlinear patterns of influence and spatiotemporal evolution, providing scientific decision-making support for differentiated strategies.

2. Literature Review

As a key engine of global economic growth, the digital economy significantly contributes to stimulating consumption [17], reducing carbon emissions and pollution [18], and fostering inclusive green growth [19]. Its interactive effects with energy systems have consequently become a frontier research topic of common interest to both academia and policymakers. Although the existing literature has explored this nexus, limitations in measurement methods, model assumptions, and analytical depth leave room for advancement. Systematic breakthroughs are urgently needed across three dimensions: comprehensive evaluation, spatial interconnections, and nonlinear mechanisms.

Accurately measuring and comprehensively evaluating the digital economy remains a central focus in academic research [20]. Early studies predominantly employed single indicators. This approach is simple and intuitive, but it struggles to fully capture the multidimensional and complex nature of the digital economy. Subsequently, the composite index method has become the mainstream approach in this field [21,22,23]. The Organisation for Economic Co-operation and Development (OECD) has established a three-dimensional framework encompassing infrastructure, digital transformation, and social impact. Similarly, the China Academy of Information and Communications Technology has released an evaluation system for “digital industrialization, industrial digitalization, and digital governance,” providing crucial reference for subsequent research. However, most comprehensive indicator systems have failed to achieve data-driven identification of feature importance from a predictive perspective.

In the field of feature selection and variable screening, traditional econometrics often relies on theoretical priors or stepwise regression methods, making it susceptible to subjective biases and multicollinearity issues. In recent years, machine learning methods have provided novel approaches for high-dimensional feature selection. Models like random forests and gradient-boosted trees have demonstrated superior performance by evaluating feature importance through the calculation of splitting gains in decision trees [14]. Explainable Artificial Intelligence (XAI) is a critical branch within the field of artificial intelligence. As machine learning technologies advance, AI models commonly face the black-box dilemma: they cannot provide interpretable information that reveals the underlying logic behind their outputs, which hinders the extraction of value from data analysis. Zhou et al. demonstrated the significant practical utility of XAI in digital finance and consumption upgrades [24], while Sun et al. highlighted SHAP as a crucial tool for analyzing influencing factors [25]. The application of XAI therefore offers new possibilities for identifying the key digital economy characteristics that affect energy efficiency (EE).

Accurately characterizing spatial relationships is another critical but underdeveloped aspect of regional energy efficiency research. Early studies primarily relied on a single geographic adjacency matrix, which could only capture physical spatial proximity while neglecting spatial dependencies in non-geographic dimensions such as economics and technology. Subsequent research expanded to include economic distance matrices [26] and spatial economic geography nesting matrices [2]. However, these frameworks still do not account for “digital development similarity,” a key relational dimension in the digital economy era. Consequently, they remain incapable of reflecting technological spillovers and synergistic effects within the digital economy. Spatio-temporal graph models provide a promising alternative for modeling such complex dependencies. In economics, graph neural networks have been applied to market price forecasting [27] and industrial classification [28]. Nevertheless, research integrating geographical, economic, and digital three-dimensional proximity into dynamic graph structures for analyzing spatial correlation patterns in the digital economy remains unexplored.

The existing literature broadly acknowledges that the digital economy profoundly impacts energy efficiency via technological innovation, industrial upgrading, and optimized resource allocation, with its underlying mechanisms being complex and multidimensional. On one hand, the digital economy serves as a key driver for enhancing energy efficiency [21]. It directly enhances production-side energy utilization through information technology advances [16] and infrastructure development [4]. Concurrently, digital industry agglomeration [9] and the leadership of green finance [15] propel industrial restructuring toward low-carbon, high-efficiency transformation and upgrading, thereby elevating overall energy efficiency [29]. On the other hand, the digital economy’s high energy consumption characteristics have created new energy pressures. This partially offsets its energy-saving potential and can lead to complex outcomes such as the “rebound effect” [30,31]. As Liu et al. noted, while digital technological innovations can significantly reduce carbon emissions, they also generate energy rebound effects that partially diminish the impact of emission reductions [32].

Although research on the relationship between DE and EE has made progress, most existing studies rely on conventional methodological frameworks. Common approaches include using econometric models to test their basic correlation [4,33], DEA to calculate energy efficiency [15], mediation models to examine transmission pathways [22,34], and PSTR models to reveal their nonlinear characteristics [23]. For example, Gao et al. employed a dynamic panel data model to validate the impact of digitalization on green total factor energy efficiency (GTFEE) [16]. Notably, the spatial Durbin model has emerged as the mainstream approach for investigating the spatial spillover effects of the DE on EE [2]. However, econometric models possess limited capacity to capture complex nonlinear relationships, making it challenging to accurately represent the dynamic and nonlinear nature of the relationship between the digital economy and energy efficiency.

To overcome the limitations of traditional models, some cutting-edge research has begun incorporating advanced computational methods and machine learning models. This includes integrating network technology and artificial intelligence with conventional econometric models to propose novel green computing approaches [35]; employing random forest, XGBoost regression, and backpropagation neural networks to identify key factors influencing energy intensity [14]; and utilizing artificial neural networks to investigate the nonlinear relationship between the digital economy and energy productivity [25]. Meanwhile, recent research has achieved breakthroughs in explainable artificial intelligence. Jiao et al. coupled SHAP analysis with machine learning models to evaluate the synergistic effect of renewable energy and the digital economy on energy intensity, yielding fresh insights into complex variable interactions [14]. Yu et al. demonstrated that the interpretability of SHAP models holds significant value for analyzing variable relationships [36].

STGNN is a deep learning model that combines graph neural networks with various temporal learning networks to capture dynamic features across spatial and temporal dimensions [37]. Owing to its strength in modeling complex spatial dependencies and temporal evolution simultaneously, STGNN has been widely applied in recent years in diverse domains including traffic flow prediction [38], energy forecasting [39], and economic indicator assessment [40]. However, research applying STGNN to dissect the “digital economy–energy efficiency” mechanism remains scarce, particularly lacking in-depth deconstruction of the influencing pathways.

In summary, this study achieves systematic innovation in the following ways. First, it constructs a three-dimensional comprehensive evaluation system for the digital economy, integrating the model advantages of LightGBM and CatBoost while combining the SHAP interpretability framework to enable key feature selection. Second, breaking free from the constraints of traditional spatial matrices, we construct a three-dimensional dynamic adjacency matrix that integrates geographic, economic, and digital similarity. This serves as the foundation for building a spatio-temporal graph structure that more closely aligns with reality. Finally, we introduce a spatio-temporal graph neural network model to capture the spatiotemporal effects of the digital economy on energy efficiency. This approach addresses the limitations of traditional econometric and machine learning methods in the context of this study.

3. Variables and Data

3.1. Core Variable Definition and Calculation Method

3.1.1. Dependent Variable: Energy Efficiency (EE)

Traditional energy efficiency calculations primarily focus on the economic output generated by energy inputs, using energy consumption per ten thousand yuan of GDP as the metric for energy efficiency. This approach overlooks the allocation efficiency of factors such as capital, labor, and technology. Therefore, this study adopts green total factor energy efficiency as the measure and calculates it with the super-efficiency SBM model with undesired outputs [29,41]. The indicator system is shown in Table 1.

Assume the research sample comprises $N$ decision-making units (DMUs), each with $m$ types of inputs, denoted as $X = [x_{1}, x_{2}, \dots, x_{N}] \in R^{m \times N}$; furthermore, there are $s_{1}$ types of desired outputs, denoted as $Y = [y_{1}, y_{2}, \dots, y_{N}] \in R^{s_{1} \times N}$, and $s_{2}$ types of undesired outputs, denoted as $B = [b_{1}, b_{2}, \dots, b_{N}] \in R^{s_{2} \times N}$.

The production possibility set is defined as follows:

p (x) = \{(y_{r}, b_{t}) \mid x \text{ can produce } (y_{r}, b_{t}), \ 0 \leq y_{r} \leq Y λ, \ 0 \leq b_{t} \leq B λ, \ λ \geq 0\}

The objective function and constraints are as follows:

\min ρ = \frac{1 - \frac{1}{m} \sum_{i = 1}^{m} s_{i}^{-} / x_{ik}}{1 + \frac{1}{s_{1} + s_{2}} \left( \sum_{r = 1}^{s_{1}} s_{r}^{+} / y_{rk} + \sum_{t = 1}^{s_{2}} s_{t}^{b} / b_{tk} \right)}

(1)

\text{s.t.} \left\{ \begin{array}{l} x_{ik} = \sum_{j = 1}^{N} x_{ij} λ_{j} + s_{i}^{-} \quad (i = 1, \dots, m), \\ y_{rk} = \sum_{j = 1}^{N} y_{rj} λ_{j} - s_{r}^{+} \quad (r = 1, \dots, s_{1}), \\ b_{tk} = \sum_{j = 1}^{N} b_{tj} λ_{j} + s_{t}^{b} \quad (t = 1, \dots, s_{2}), \\ λ_{j}, s_{i}^{-}, s_{r}^{+}, s_{t}^{b} \geq 0 \end{array} \right.

Here, $ρ$ represents the green total factor energy efficiency value. When $ρ \geq 1$, the province lies on the green production frontier with no energy efficiency loss; when $ρ < 1$, energy efficiency loss exists, with smaller values indicating more severe losses. $s_{i}^{-}$ is the input slack variable, indicating excess inputs; $s_{r}^{+}$ is the desired output slack variable, indicating insufficient desired outputs; $s_{t}^{b}$ is the undesired output slack variable, indicating excess undesired outputs; and $λ_{j}$ is the intensity variable.
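To make the objective in Equation (1) concrete, the following minimal Python sketch evaluates $ρ$ for a single DMU once the slack values are known. It does not solve the optimization problem itself; the function name `sbm_efficiency` and all numbers are hypothetical and chosen only for illustration.

```python
def sbm_efficiency(x_k, y_k, b_k, s_minus, s_plus, s_bad):
    """Evaluate the ratio in Eq. (1) for one DMU k, given its slack values.

    x_k, y_k, b_k : inputs, desired outputs, undesired outputs of DMU k
    s_minus, s_plus, s_bad : the corresponding slack values (assumed known)
    """
    m, s1, s2 = len(x_k), len(y_k), len(b_k)
    # Numerator: 1 minus the average relative input excess.
    input_term = sum(s / x for s, x in zip(s_minus, x_k)) / m
    # Denominator: 1 plus the average relative output shortfall / bad-output excess.
    output_term = (
        sum(s / y for s, y in zip(s_plus, y_k))
        + sum(s / b for s, b in zip(s_bad, b_k))
    ) / (s1 + s2)
    return (1.0 - input_term) / (1.0 + output_term)


# Hypothetical province: 2 inputs, 1 desired output, 1 undesired output.
rho = sbm_efficiency(
    x_k=[10.0, 20.0], y_k=[5.0], b_k=[4.0],
    s_minus=[1.0, 4.0], s_plus=[0.5], s_bad=[1.0],
)
print(round(rho, 4))  # (1 - 0.15) / (1 + 0.175)
```

With all slacks equal to zero the function returns 1, which corresponds to a unit with no measured inefficiency under this ratio.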

3.1.2. Explanatory Variable: Digital Economy Development Level (DE)

Drawing upon the existing literature [21,22,23,29] and the Global Digital Economy Competitiveness Development Report, this study employs an entropy-weighted approach to construct a comprehensive evaluation framework applicable at the provincial level. The framework systematically assesses three core dimensions—infrastructure development, digital industrialization, and industrial digitalization—to provide a holistic and scientific evaluation of the digital economy’s development. Such an approach avoids the limitations of single-dimensional evaluations. The resulting integrated evaluation indicators for provincial-level assessment are presented in Table 2.

To eliminate the effects caused by different units of measurement, the raw data must first undergo positive normalization. The comprehensive digital economy index for Province i in Year t is calculated using a weighted sum formula:

{DE}_{i} (t) = \sum_{k = 1}^{n} w_{k} \cdot Z_{i, k} (t)

(2)

The higher the value of ${DE}_{i} (t) \in [0, 1]$, the more developed the digital economy. $w_{k}$ represents the weight of the $k$th indicator obtained through the entropy weight method, while $Z_{i, k} (t)$ denotes the standardized data corresponding to the $k$th indicator.
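The weighting step can be sketched in a few lines of Python. This is an illustrative implementation of the textbook entropy weight method applied to Equation (2), not the authors' code: it assumes all indicators are positively oriented, uses min-max normalization, and runs on a made-up 4 × 2 data matrix. The function name `entropy_weight_index` is hypothetical.

```python
import math


def entropy_weight_index(raw):
    """raw[i][k]: value of positive indicator k for observation i.

    Returns (weights, index), where index[i] is the weighted sum in Eq. (2).
    Assumes every indicator column has at least two distinct values.
    """
    n_obs, n_ind = len(raw), len(raw[0])
    cols = list(zip(*raw))
    # Min-max normalisation of each positive indicator to [0, 1].
    z = [
        [(raw[i][k] - min(cols[k])) / (max(cols[k]) - min(cols[k]))
         for k in range(n_ind)]
        for i in range(n_obs)
    ]
    divergence = []
    for k in range(n_ind):
        total = sum(z[i][k] for i in range(n_obs))
        shares = [z[i][k] / total for i in range(n_obs)]
        # Information entropy of indicator k (terms with zero share contribute 0).
        entropy = -sum(p * math.log(p) for p in shares if p > 0) / math.log(n_obs)
        divergence.append(1.0 - entropy)
    # Indicators whose values are more unevenly spread receive larger weights.
    weights = [d / sum(divergence) for d in divergence]
    index = [sum(w * v for w, v in zip(weights, row)) for row in z]
    return weights, index


# Made-up data: 4 observations, 2 indicators.
raw = [[1.0, 10.0], [2.0, 10.5], [3.0, 30.0], [4.0, 11.0]]
weights, index = entropy_weight_index(raw)
print(weights, index)
```

Because the weights are non-negative and sum to one, and each normalized value lies in [0, 1], the resulting index also lies in [0, 1], as stated for ${DE}_{i}(t)$.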

3.1.3. Control Variables

Based on the existing literature [4,8], we considered numerous factors anticipated to influence energy efficiency as control variables, thereby enabling more precise identification of the net effects of the digital economy.

These variables include: (a) Economic development level, measured by per capita GDP (base year = 2000). (b) Industrial structure, expressed as the ratio of tertiary to secondary industry value added. (c) Educational attainment, proxied by the number of undergraduate (or higher) students per 10,000 people, reflecting talent reserves. (d) Urbanization level, represented by the proportion of permanent urban residents to the total population at year-end. (e) Population density, measured by the number of people per unit area, reflecting potential energy demand. (f) Degree of openness to the outside world, measured by the ratio of regional goods imports and exports to GDP, reflecting a region’s competitiveness and openness.

3.2. Data Sources and Descriptive Statistics

This study uses 30 provincial-level administrative regions in China as its research sample (Hong Kong, Macao, Taiwan regions, and the Tibet Autonomous Region are temporarily excluded due to severe deficiencies in digital economy data). The time span is set from 2011 to 2023. Data primarily originates from the China Statistical Yearbook, China Energy Statistical Yearbook, China Digital Economy Development Report, and provincial/municipal statistical yearbooks.

Table 3 presents descriptive statistics for all variables, with a sample size of 390 for each. The dependent variable, EE, displays notable positive skewness and high kurtosis. Most provinces’ energy efficiency levels cluster near the mean, while only a few provinces demonstrate exceptionally high energy efficiency, placing them in a leading position. The relatively small standard deviation suggests a concentrated distribution, yet the substantial range indicates pronounced regional disparities in green development. Economically and technologically advanced eastern provinces have far surpassed the production frontier, demonstrating outstanding energy efficiency performance. The distribution characteristics of the core explanatory variable DE and its three-dimensional indicators are highly consistent, with most exhibiting significant positive skewness and kurtosis. Certain basic resource indicators, such as broadband penetration rate and mobile phone penetration rate, display relatively symmetric flat-peak distributions. This differentiated resource distribution corroborates the regional heterogeneity characterized by “high concentration of core digital resources alongside broadly accessible foundational resources.” Most control variables also exhibit strong positive skewness and kurtosis. Only educational attainment and urbanization levels show relatively balanced distributions without extreme outliers.

Based on the raw data, variables in subsequent analyses were standardized to eliminate differences in units of measurement and enhance comparability among indicators.

4. Research Methods

4.1. Research Framework and Basic Assumptions

This study constructs a two-stage analytical framework aimed at systematically exploring the relationship between the digital economy and energy efficiency. In the first stage, an ensemble learning method combining LightGBM and CatBoost, supplemented by SHAP analysis, is employed to identify key drivers. In the second stage, a spatiotemporal graph neural network (STGNN) model is constructed based on a three-dimensional dynamic adjacency matrix to capture the nonlinear patterns and spatial propagation characteristics of the selected features. The framework proceeds in four main steps: data preparation, model construction, predictive analysis, and mechanism deconstruction, as illustrated in Figure 1.

Based on existing theoretical research, the following fundamental assumptions are proposed.

This study investigates the impact of the digital economy on energy efficiency. Within the context of sustainable development, these two domains are closely interlinked, making it essential to analyze them within an integrated framework [42]. Theoretically, the digital economy influences energy efficiency through multiple pathways. First, digital infrastructure such as the Industrial Internet enables real-time monitoring and precise scheduling of energy systems, thereby directly reducing transmission and usage losses [43,44]. Second, new models such as e-commerce and platform economies optimize industrial and supply chains [45], thereby promoting industrial restructuring and enhancing resource allocation efficiency to reduce energy consumption per unit of output [5]. Third, the clustering of digital industries, supported by big data and artificial intelligence, can reduce carbon emissions and make energy use more efficient [20,46]. Based on these pathways, this study hypothesizes that the digital economy exerts a fundamentally positive driving effect on energy efficiency.

However, this impact is not simply linear. In the early stages of digital economic development, high energy inputs in areas such as data centers and digital infrastructure may weaken its energy-saving effects. Once digital economic development crosses a “threshold,” it will amplify its positive influence on energy efficiency [5,47]. Concurrently, Jevons’ Paradox suggests that improvements in efficiency may reduce the effective cost of energy, potentially stimulating greater consumption and creating a rebound effect [30,32]. Furthermore, the interaction between different dimensions of the digital economy and regional endowments may also give rise to U-shaped, inverted U-shaped, or more complex relationships. Consequently, the overall influence of the digital economy on energy efficiency is characterized by complex nonlinearity.

H1:

The development of the digital economy exerts a positive impact on regional energy efficiency in China, and this impact exhibits complex nonlinear characteristics.

The digital economy transcends geographical boundaries and facilitates cross-regional flows of production factors, generating notable spillover effects on neighboring provinces [6]. Constrained by shared environmental regulations and influenced by similar levels of economic development, neighboring provinces often engage in intense competition [48]. Simultaneously, the development of the digital economy has significantly reduced information barriers, accelerating the flow and learning of green technologies, emission reduction experiences, and management knowledge across regions [26]. Furthermore, developed regions, leveraging their first-mover advantages, readily become demonstration hubs for digital technologies and production methods. Through spillover effects, they drive neighboring areas to achieve energy efficiency leaps by introducing technologies and replicating models [9,49].

H2:

The impact of the DE on EE exhibits significant spatial spillover effects, meaning that the development of the digital economy in a given region influences the energy efficiency of neighboring regions through spatial connections.

The digital economy is a multidimensional composite system, and the impact of different dimensions on energy efficiency also exhibits structural differences. Digital infrastructure holds significant potential for cost reduction and investment incentives [50], with its promotion of energy efficiency primarily achieved indirectly through other economic activities and characterized by network effect barriers [4]. The impact of the digital economy on industrial structure is primarily manifested in industrial digitization and digital industrialization [51]. On one hand, industrial digitization represents the core manifestation of the deep integration between the digital economy and the real economy [52]. Leveraging their high versatility and pervasiveness, digital technologies such as cloud computing, the Internet of Things, and 5G fundamentally transform traditional industries’ production methods and resource allocation, enhancing production efficiency with more direct and pronounced effects [53,54]. On the other hand, digital industrialization serves as the foundation of the digital economy, transforming data into a production factor [6]. This process drives industrial restructuring and integration, giving rise to new digital industries and business models [55]. It is characterized by long-term development, with contribution pathways that are more diverse and complex.

H3:

The impact of different dimensions of the digital economy on energy efficiency exhibits heterogeneity, with varying contributions and distinct pathways of influence.

4.2. Feature Engineering

This study adopts an ensemble modeling strategy that combines the feature importance scores from two structurally distinct machine learning models, thereby assessing the contribution of each feature to the generalization performance of energy efficiency.

4.2.1. LightGBM

The LightGBM model was employed to assess the importance of all initial features, including indicators across various dimensions of the digital economy, the level of digital economic development, and control variables. LightGBM utilizes Gradient-based One-Side Sampling (GOSS) and Exclusive Feature Bundling (EFB) to efficiently handle high-dimensional features.

Feature importance $Imp (f_{i})$ is calculated as the sum of loss reductions from its splits across all decision trees, where $Δ L_{n}$ denotes the reduction in the loss function after node $n$ splits, and $T_{t}$ is the node set of the $t$-th decision tree.

Imp (f_{i}) = \sum_{t} \sum_{n \in T_{t}} Δ L_{n} \cdot I (n \text{ splits on } f_{i})

(3)
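As a small illustration of Equation (3), the sketch below accumulates split gains over a toy ensemble. Each tree is represented simply as a list of (feature, loss reduction) pairs, one per split node; the feature names and gain values are invented for the example and do not come from the fitted model.

```python
from collections import defaultdict


def split_gain_importance(trees):
    """trees: list of trees, each a list of (feature_name, loss_reduction) split nodes."""
    importance = defaultdict(float)
    for nodes in trees:                        # outer sum over trees t
        for feature, loss_reduction in nodes:  # inner sum over nodes n in T_t
            # The indicator in Eq. (3) selects only nodes that split on this feature.
            importance[feature] += loss_reduction
    return dict(importance)


# Hypothetical two-tree ensemble.
toy_trees = [
    [("broadband", 0.40), ("gdp_per_capita", 0.25), ("broadband", 0.10)],
    [("gdp_per_capita", 0.30), ("urbanization", 0.05)],
]
imp = split_gain_importance(toy_trees)
print(imp)
```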

The model uses Gradient Boosted Decision Trees (GBDT) as its boosting framework. The maximum number of leaf nodes per tree is set to 31, the learning rate to 0.05, the feature sampling rate to 0.9, and the sample sampling rate to 0.8. Additionally, a Bagging operation is performed every 5 iterations. The maximum number of iterations is set to 500, and an early stopping strategy is employed: if the RMSE on the validation set does not decrease for 50 consecutive rounds, training is terminated early. To ensure the reproducibility of the results, the random seed is fixed at 42.
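For readers who wish to reproduce a comparable setup, the settings described above map onto LightGBM's Python API roughly as follows. This is a sketch under assumptions: the `regression` objective, the helper name `train_lightgbm`, and the data arguments are not specified in the text and are placeholders.

```python
# Hyperparameters as described in the text; "objective" is an assumption
# (energy efficiency is a continuous target).
LGB_PARAMS = {
    "objective": "regression",
    "boosting_type": "gbdt",
    "metric": "rmse",
    "num_leaves": 31,
    "learning_rate": 0.05,
    "feature_fraction": 0.9,   # feature sampling rate
    "bagging_fraction": 0.8,   # sample sampling rate
    "bagging_freq": 5,         # perform bagging every 5 iterations
    "seed": 42,
}


def train_lightgbm(X_train, y_train, X_valid, y_valid):
    """Hypothetical helper: fit the model and return gain-based importances."""
    import lightgbm as lgb  # imported lazily so the config can be inspected without it

    train_set = lgb.Dataset(X_train, label=y_train)
    valid_set = lgb.Dataset(X_valid, label=y_valid, reference=train_set)
    booster = lgb.train(
        LGB_PARAMS,
        train_set,
        num_boost_round=500,                                 # maximum iterations
        valid_sets=[valid_set],
        callbacks=[lgb.early_stopping(stopping_rounds=50)],  # stop after 50 stale rounds
    )
    return booster.feature_importance(importance_type="gain")
```

Requesting `importance_type="gain"` returns the total split gain per feature, which is the quantity closest to Equation (3).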

4.2.2. CatBoost

CatBoost, a machine learning framework based on Gradient Boosted Decision Trees (GBDT), offers distinct advantages in processing categorical features and mitigating overfitting. It can capture the nonlinear relationship between the digital economy and energy efficiency while automatically uncovering potential influence pathways through feature interaction terms.

Imp (f_{i}) = \sum_{t} \sum_{n \in T_{t}} Gain (n) \cdot \frac{Count (n, f_{i})}{Count (n)}

(4)

$Gain (n)$ represents the split gain of node $n$. $Count (n, f_{i})$ denotes the number of times feature $f_{i}$ is used in node $n$.

The maximum number of iterations for the CatBoost model was set to 500, the learning rate to 0.05, and the depth of each tree to 6. An L2 regularization coefficient of 3.0 was applied to leaf nodes to impose a penalty on leaf weights and reduce model complexity. An early stopping mechanism was employed: training was halted if the validation set loss failed to improve for 50 consecutive rounds. The random seed was fixed at 42 to ensure the reproducibility of the experiments.
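A comparable CatBoost configuration could look like the sketch below. The RMSE loss function, the helper name `train_catboost`, and the data arguments are assumptions for illustration; the text only states that a validation loss was monitored.

```python
# Hyperparameters as described in the text; "loss_function" is an assumption.
CATBOOST_PARAMS = {
    "iterations": 500,
    "learning_rate": 0.05,
    "depth": 6,
    "l2_leaf_reg": 3.0,
    "loss_function": "RMSE",
    "random_seed": 42,
    "verbose": False,
}


def train_catboost(X_train, y_train, X_valid, y_valid):
    """Hypothetical helper: fit the model and return its feature importances."""
    from catboost import CatBoostRegressor  # imported lazily

    model = CatBoostRegressor(**CATBOOST_PARAMS)
    model.fit(
        X_train,
        y_train,
        eval_set=(X_valid, y_valid),
        early_stopping_rounds=50,  # halt after 50 rounds without improvement
    )
    # Note: the library's default importance type may differ from Eq. (4).
    return model.get_feature_importance()
```

Since the two libraries report importances on different scales, some form of rescaling would be needed before the scores are combined; the rescaling rule is not stated in this section.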

4.3. SHAP

To quantify the contribution and impact of input features on predictive models, this study introduces SHAP to provide consistent and interpretable feature contribution metrics for machine learning models. Let the set of model input variables be $F = \{f_{1}, f_{2}, \dots, f_{m}\}$. The energy efficiency forecast value $\check{Y}_{i, t}$ for province i in year t is decomposed as $\phi_{0} + \sum_{k = 1}^{m} \phi_{i, t, k}$. Here, $\phi_{0}$ denotes the mean of the full sample prediction, while $\phi_{i, t, k}$ represents the SHAP value of the kth variable for that observation, reflecting its contribution to the deviation of the prediction from the baseline.

The SHAP value for variable $f_{k}$ is obtained by the weighted average of the marginal contributions across all variable subsets:

\phi_{i, t, k} = w_{i, t} \cdot \sum_{S \subseteq F \setminus \{f_{k}\}} \frac{|S|! \cdot (|F| - |S| - 1)!}{|F|!} \cdot [\check{Y} (S \cup \{f_{k}\}) - \check{Y} (S)]

(5)

Here, S denotes a subset of variables excluding $f_{k}$, and $\check{Y} (S)$ represents the energy efficiency predicted solely by the subset S.
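
The subset enumeration behind the SHAP decomposition can be illustrated with a small exact computation. This is a minimal sketch under stated assumptions: `toy_prediction` is a hypothetical three-feature value function with invented effect sizes, the sample weight $w_{i,t}$ is taken as 1, and real SHAP implementations for tree ensembles use faster algorithms than brute-force enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, features):
    """Exact Shapley values: average marginal contribution over all subsets."""
    m = len(features)
    phi = {}
    for k in features:
        others = [f for f in features if f != k]
        total = 0.0
        for size in range(m):
            # Weight |S|! (|F| - |S| - 1)! / |F|! for subsets of this size
            weight = factorial(size) * factorial(m - size - 1) / factorial(m)
            for subset in combinations(others, size):
                s = frozenset(subset)
                total += weight * (value_fn(s | {k}) - value_fn(s))
        phi[k] = total
    return phi


def toy_prediction(subset):
    """Hypothetical model output when only the features in `subset` are known."""
    effects = {"DE": 0.10, "EDU": 0.04, "GDP": -0.02}  # invented numbers
    value = 1.0 + sum(effects[f] for f in subset)
    if "DE" in subset and "EDU" in subset:
        value += 0.02  # interaction, shared equally by DE and EDU
    return value


phi = shapley_values(toy_prediction, ["DE", "EDU", "GDP"])
baseline = toy_prediction(frozenset())
full = toy_prediction(frozenset(["DE", "EDU", "GDP"]))
```

In this toy case the contributions sum to the gap between the full prediction and the baseline, which is the additivity property that the decomposition relies on.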

4.4. Construction of Spatiotemporal Graph Models

Traditional econometric models often rely on simple geographic adjacency when addressing spatial effects, making it difficult to capture the complex network relationships between regions (such as those formed by industrial chains and data flows) in the digital economy era. To overcome this limitation, this study constructs a weighted, multidimensional spatiotemporal graph $G_{t} = (V, E_{t}, A_{t})$ to provide realistic input foundations for the STGNN model [2].

4.4.1. Comprehensive Spatial Adjacency Matrix

The edges $e_{ij} \in E$ represent spatial dependencies between regions. As the one-dimensional static edges of traditional models cannot accommodate the complex interconnections of the digital economy, this study constructs a three-dimensional weighted adjacency matrix integrating geographic, economic, and digital dimensions.

(1): Geographic Proximity Matrix $W_{geo}^{ij}$

Geographical proximity forms the foundation of regional connections. We construct a binary adjacency matrix to characterize geographical contiguity.

W_{geo}^{ij} = \begin{cases} 1, & \text{if region } i \text{ and region } j \text{ share a common border} \\ 0, & \text{otherwise} \end{cases}

(6a)

(2): Economic Distance Matrix $W_{eco}^{ij}$

The economic distance matrix serves as a key component for depicting the intensity of economic ties between regions in the digital economy era. Spatial proximity is represented by the inverse of geographic distance, and economic disparity is captured by an exponential decay function. The two terms are fused multiplicatively to highlight their combined effect, so that each matrix element is calculated as follows:

{W^{'}}_{eco}^{ij} (t) = \begin{cases} \frac{1}{dist (i, j)} \times \exp (- \frac{|{GDP}_{i} (t) - {GDP}_{j} (t)|}{σ}), & \text{if } i \neq j \\ 0, & \text{if } i = j \end{cases}

(6b)

Here, ${GDP}_{i} (t)$ and ${GDP}_{j} (t)$ represent the per capita GDP of provinces i and j in year t, respectively; $dist (i, j)$ denotes the spherical distance between the provincial capitals of provinces i and j; and σ is the exponential decay parameter, calculated as the median of the absolute differences in per capita GDP between all pairs of provinces across all years.

For computational convenience, the matrix is row-normalized, where N represents the total number of provinces.

W_{eco}^{ij} (t) = \frac{{W^{'}}_{eco}^{ij} (t)}{\sum_{k = 1}^{N} {W^{'}}_{eco}^{ik} (t)}

(6c)
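
The construction of the economic distance matrix can be sketched as follows. The coordinates and per capita GDP figures are hypothetical values for three provinces, the function names are illustrative, and σ is computed here from a single year for brevity, whereas the text takes the median across all years.

```python
from math import asin, cos, exp, radians, sin, sqrt
from statistics import median

EARTH_RADIUS_KM = 6371.0


def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (p[0], p[1], q[0], q[1]))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def economic_weights(coords, gdp, sigma):
    """Inverse distance times exponential GDP-gap decay, then row normalization."""
    n = len(gdp)
    raw = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:  # diagonal stays zero
                decay = exp(-abs(gdp[i] - gdp[j]) / sigma)
                raw[i][j] = decay / haversine_km(coords[i], coords[j])
    return [[v / sum(row) for v in row] for row in raw]


# Hypothetical inputs: approximate capital coordinates and invented GDP values
coords = [(39.90, 116.40), (31.23, 121.47), (23.13, 113.26)]
gdp = [9.0, 8.5, 6.0]
gaps = [abs(gdp[i] - gdp[j]) for i in range(3) for j in range(i + 1, 3)]
sigma = median(gaps)
w_eco = economic_weights(coords, gdp, sigma)
```

Because the matrix is row-normalized, it is generally not symmetric: the weight that province i places on j need not equal the weight that j places on i.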

(3): Digital Economy Development Level Matrix $W_{dig}^{ij}$

To capture the spatial correlation between the digital economy and energy efficiency more directly, and to mitigate endogeneity bias arising from simultaneous bidirectional causality between the two, this study replaces the current-period digital economy development level with its value from one period prior. Development similarity among provinces is measured with a Gaussian kernel function, which defines the node connection weights and yields a spatial similarity matrix of digital economy development levels with greater exogeneity.

For any two provinces i and j, their similarity $S_{dig} (i, j, t)$ in year t is calculated as follows:

S_{dig} (i, j, t) = \exp (- \frac{{({DE}_{i} (t - 1) - {DE}_{j} (t - 1))}^{2}}{2 σ^{2}})

(6d)

σ is the scale parameter of the Gaussian kernel function, which is set as the median of the Euclidean distances between all sample pairs in this paper.

After the similarity matrix of digital economy development levels is constructed, its diagonal elements are set to zero and the matrix is then row-normalized:

W_{dig}^{ij} (t) = \frac{S_{dig} (i, j, t)}{\sum_{k = 1}^{N} S_{dig} (i, k, t)}

(6e)
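
A minimal sketch of the lagged Gaussian-kernel similarity and its row normalization is shown below. The DE values are hypothetical, `digital_weights` is an illustrative name, and for a scalar index the Euclidean distance between two provinces reduces to the absolute difference.

```python
from math import exp
from statistics import median


def digital_weights(de_lagged, sigma):
    """Gaussian-kernel similarity on lagged DE, zero diagonal, row-normalized."""
    n = len(de_lagged)
    sim = [
        [
            0.0 if i == j
            else exp(-((de_lagged[i] - de_lagged[j]) ** 2) / (2 * sigma ** 2))
            for j in range(n)
        ]
        for i in range(n)
    ]
    return [[v / sum(row) for v in row] for row in sim]


# Hypothetical DE index values for four provinces in year t - 1
de_prev = [0.10, 0.12, 0.40, 0.35]
pair_dists = [abs(a - b) for idx, a in enumerate(de_prev) for b in de_prev[idx + 1:]]
sigma = median(pair_dists)  # scale parameter: median pairwise distance
w_dig = digital_weights(de_prev, sigma)
```

Provinces at similar development levels receive larger mutual weights, regardless of whether they are geographically adjacent.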

The weighted sum is used to construct a comprehensive spatial adjacency matrix, with the formula as follows:

W = α \cdot W_{geo}^{ij} + β \cdot W_{eco}^{ij} + γ \cdot W_{dig}^{ij}

(6f)

where the weights α, β, and γ represent the relative importance of geographical proximity, economic distance, and digital economy development level, respectively, in shaping spatial dependence.

Following the approach of Theodoropoulos et al. [56,57,58], in the absence of prior information, the geographic, economic, and digital economy adjacency matrices are combined linearly with equal weights of 1/3 each to obtain the composite spatial adjacency matrix W. Equal weighting is also regarded as the most transparent robustness strategy.

4.4.2. Construction of the Node Feature Matrix

Node features serve as information carriers for model learning and prediction. In this study, each time point t corresponds to a cross-sectional feature matrix containing the core feature set derived from feature engineering. This study selected the following seven features as model inputs: level of digital economic development (DE), GDP per capita (GDP), industrial structure (IS), level of education (EDU), urbanization rate (URBAN), population density (DENSITY), and degree of openness (OPEN). The target variable to be predicted is energy efficiency (EE). This matrix has dimensions N × F, where element $A_{i, f}^{t}$ represents the value of the fth feature for the ith province in the tth year.

A^{t} = {[A_{1}^{t}, A_{2}^{t}, \dots, A_{N}^{t}]}^{T} \in ℝ^{N \times F}

(7)

By stacking the feature matrices at each time point, we obtain the time series feature matrix $χ \in ℝ^{N \times F \times T}$.

4.5. STGNN Prediction Framework

To more precisely capture the spatiotemporal dependencies between the digital economy and regional energy efficiency, this study employs the STGNN as the core model. By alternating spatial graph convolutions with temporal recurrent encoding, the model jointly extracts the spatial and temporal evolutionary patterns of regional energy efficiency.

4.5.1. Data Preprocessing and Partitioning

For the three right-skewed variables—GDP, population density, and openness to the outside world—a log(1 + x) transformation is applied to mitigate issues of extreme values and heteroscedasticity; the remaining variables retain their original economic meaning and are not subjected to additional logarithmic transformations. For outliers, this study does not employ aggressive intervention methods such as winsorization, truncation, or manual replacement; instead, it uses a combination of logarithmic transformation, standardization, and subsequent regularized training to mitigate the impact of extreme observations on parameter estimation.

To eliminate the impact of differences in the units of measurement on model training, all node features and the target variable (EE) are standardized using Z-scores.

{\tilde{A}}_{i, f}^{t} = \frac{A_{i, f}^{t} - μ_{f}}{σ_{f}}

(8)

To prevent leakage of future information, the mean and standard deviation required for standardization are calculated solely from the training set. The validation and test sets are transformed directly using the training-set statistics.
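
The leakage-free standardization can be sketched as follows. The numbers are hypothetical, and the use of the population standard deviation is an assumption, since the text does not state which estimator is used.

```python
from statistics import mean, pstdev


def fit_scaler(train_values):
    """Estimate mean and standard deviation from the training years only."""
    return mean(train_values), pstdev(train_values)


def transform(values, mu, sigma):
    """Apply the training-set statistics to any split."""
    return [(v - mu) / sigma for v in values]


# Hypothetical values of one feature
train = [1.0, 2.0, 3.0]
test = [4.0]

mu, sigma = fit_scaler(train)      # statistics never see validation or test data
z_train = transform(train, mu, sigma)
z_test = transform(test, mu, sigma)
```

A test value outside the training range simply maps to a large z-score; the scaler is not refitted to absorb it.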

To capture time-dependent relationships, the sliding window method is used to convert spatiotemporal sequence data into supervised learning samples. The historical window length is set to T_in = 3, and the prediction step size is set to τ = 1. For any given starting year t, the input sample consists of the feature sequence and adjacency matrix sequence for years t to t + 2, and the corresponding prediction targets are the EE values for each province in year t + 3, one step beyond the end of the window.

Specifically, the training set covers the years 2011–2019 and is used for model parameter learning; the validation set covers 2020–2021 and is used for hyperparameter tuning, early stopping decisions, and model selection; and the test set covers 2022–2023 and is used solely for the final evaluation of the model’s generalization performance, without being involved in any training or hyperparameter selection.
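
The windowing and chronological partitioning can be sketched as below. Assigning each sample to a split by its target year is an assumption made for illustration; the text specifies the year ranges of the three sets but not how windows that straddle a boundary are handled.

```python
T_IN, HORIZON = 3, 1  # history length and prediction step from the text


def make_windows(years, t_in=T_IN, horizon=HORIZON):
    """Return (input_years, target_year) pairs from consecutive annual data."""
    samples = []
    for start in range(len(years) - t_in - horizon + 1):
        inputs = years[start:start + t_in]
        target = years[start + t_in + horizon - 1]
        samples.append((inputs, target))
    return samples


def split_of(target_year):
    """Assumed rule: a sample belongs to the split that contains its target year."""
    if target_year <= 2019:
        return "train"
    if target_year <= 2021:
        return "valid"
    return "test"


years = list(range(2011, 2024))
samples = make_windows(years)

counts = {"train": 0, "valid": 0, "test": 0}
for _, target in samples:
    counts[split_of(target)] += 1
```

Under this rule the first sample uses 2011 to 2013 to predict 2014, and the 13 annual cross-sections yield ten samples in total.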

4.5.2. Model Architecture

(1): Feature Encoder

To enhance the nonlinear representational capacity of the original features, map low-dimensional input features to a unified high-dimensional latent space, and provide standardized input for subsequent spatial and temporal feature extraction, this study employs a two-layer fully connected neural network (MLP) to construct a feature encoder, which independently encodes the node features at each time step.

For the τth time step of any sample, the node feature matrix ${\tilde{A}}_{τ} \in ℝ^{N \times F}$ is processed by the encoder to produce a hidden representation $H_{τ} \in ℝ^{N \times d_{hid}}$, which serves as the input to the subsequent spatial graph convolution layer.

\begin{aligned} H_{τ}^{(1)} &= ReLU ({\tilde{A}}_{τ} W_{1} + b_{1}) \\ H_{τ} &= Dropout (ReLU (H_{τ}^{(1)} W_{2} + b_{2}), p = p_{drop}) \end{aligned}

(9)

Here, W₁ and W₂ are trainable weight matrices, b₁ and b₂ are the corresponding bias vectors, and $d_{hid} = 32$. For low-data-volume scenarios, a low-dimensional design is adopted to avoid overfitting caused by parameter redundancy. Dropout is a stochastic regularization operation that randomly discards neuron outputs with a probability of $p_{drop} = 0.05$.

This encoder maps features independently for each time step and each province, thereby preserving cross-sectional heterogeneity among provinces while ensuring consistency in the encoding rules across the temporal dimension, thus laying a unified foundation for subsequent spatiotemporal feature extraction.

(2): Spatial Graph Convolution Layer

To capture complex spatial dependencies between regions, this study employs a first-order neighbor aggregation graph convolution operation to aggregate spatial information from the encoded node features at each time step, thereby extracting inter-regional correlation features.

For the encoded features $H_{τ}$ at the τth time step, the spatial graph convolution is computed as follows:

\begin{aligned} \tilde{W} &= {\tilde{D}}^{- 1 / 2} (W_{com} + I) {\tilde{D}}^{- 1 / 2} \\ H_{τ}^{(spa)} &= ReLU (\tilde{W} H_{τ} W_{spa} + b_{spa}) \end{aligned}

(10)

Here, $W_{com}$ represents the comprehensive adjacency matrix, I denotes the identity matrix (which adds a self-loop so that information about a node’s own characteristics is preserved), and $\tilde{D}$ is the degree matrix corresponding to $W_{com} + I$, with ${\tilde{D}}_{ii} = \sum_{j = 1}^{N} {(W_{com} + I)}_{ij}$. Symmetric normalization addresses the issue of degree imbalance in graph convolutions, thereby preventing economically developed provinces with numerous adjacencies from unduly dominating the feature aggregation process.
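
A dependency-free sketch of a first-order graph convolution with self-loops and symmetric degree normalization (inverse square root of the degree matrix on both sides) is given below. The three-node path graph, the one-dimensional features, and the function names are hypothetical; a practical implementation would use a tensor library.

```python
from math import sqrt


def normalize_adjacency(w):
    """Add self-loops, then scale entry (i, j) by 1 / sqrt(d_i * d_j)."""
    n = len(w)
    a = [[w[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    d_inv_sqrt = [1.0 / sqrt(sum(row)) for row in a]
    return [[d_inv_sqrt[i] * a[i][j] * d_inv_sqrt[j] for j in range(n)] for i in range(n)]


def matmul(x, y):
    """Plain list-of-lists matrix product."""
    return [
        [sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
        for i in range(len(x))
    ]


def graph_conv(w_norm, h, w_spa, b_spa):
    """ReLU of (normalized adjacency x node features x weights + bias)."""
    z = matmul(matmul(w_norm, h), w_spa)
    return [[max(0.0, z[i][j] + b_spa[j]) for j in range(len(b_spa))] for i in range(len(z))]


# Hypothetical 3-node path graph with one feature per node
w_com = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
w_norm = normalize_adjacency(w_com)
h = [[1.0], [2.0], [3.0]]
out = graph_conv(w_norm, h, w_spa=[[1.0]], b_spa=[0.0])
```

The middle node has the largest degree, so its contribution to each neighbor is scaled down, which is the balancing effect that the normalization is meant to achieve.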

(3): Temporal Encoding Layer

The feature sequences output by the spatial graph convolutional network are fed into a Gated Recurrent Unit (GRU) network to capture dynamic patterns of evolution over time. The update rules for the GRU are as follows:

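\begin{cases} Z_{t} = σ (W_{Z} [h_{t - 1}, x_{t}] + b_{Z}) \\ r_{t} = σ (W_{r} [h_{t - 1}, x_{t}] + b_{r}) \\ {\tilde{h}}_{t} = \tanh (W_{h} [r_{t} ⊙ h_{t - 1}, x_{t}] + b_{h}) \\ h_{t} = (1 - Z_{t}) ⊙ h_{t - 1} + Z_{t} ⊙ {\tilde{h}}_{t} \end{cases}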

(11)

In practice, the GRU processes each node independently: spatial features are reshaped into a shape of (N × batch, T_in, d_hid), and the GRU is then applied to the time series of each node. The hidden state dimension is maintained at d_hid, and the inter-layer dropout probability is set to p = 0.05. We use the output of the GRU’s final time step as the temporal representation of the entire historical window.
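
The per-node recurrence can be illustrated with a scalar GRU cell. This sketch assumes the standard reset gate and tanh candidate state, a one-dimensional hidden state, and hypothetical parameter values; the model described above uses a hidden dimension of 32.

```python
from math import exp, tanh


def sigmoid(v):
    return 1.0 / (1.0 + exp(-v))


def gru_step(h_prev, x, p):
    """One GRU update for a scalar hidden state and a scalar input."""
    z = sigmoid(p["wz_h"] * h_prev + p["wz_x"] * x + p["bz"])  # update gate
    r = sigmoid(p["wr_h"] * h_prev + p["wr_x"] * x + p["br"])  # reset gate
    h_cand = tanh(p["wh_h"] * (r * h_prev) + p["wh_x"] * x + p["bh"])
    return (1.0 - z) * h_prev + z * h_cand


def encode_window(xs, p, h0=0.0):
    """Run the cell over a node's history and return the final hidden state."""
    h = h0
    for x in xs:
        h = gru_step(h, x, p)
    return h


# Hypothetical parameters; bz is varied to show the effect of the update gate
params = {"wz_h": 0.5, "wz_x": 0.5, "bz": 0.0,
          "wr_h": 0.5, "wr_x": 0.5, "br": 0.0,
          "wh_h": 1.0, "wh_x": 1.0, "bh": 0.0}
closed = dict(params, bz=-50.0)  # update gate near 0: previous state is kept
h_kept = gru_step(0.3, 1.0, closed)
h_final = encode_window([0.2, -0.1, 0.4], params)  # window of T_in = 3 steps
```

When the update gate is close to zero the previous state passes through almost unchanged, which is how the cell can retain information across the historical window.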

(4): Prediction Output Layer

Finally, the final hidden state of the GRU is fed into a two-layer fully connected prediction network, which progressively reduces the dimension from 32 to 16 and then to 1, outputting predicted energy efficiency values for each province. The prediction layer also includes ReLU activation and Dropout regularization.

4.5.3. Model Training Strategies

(1): Loss Function and Optimizer

We use the mean squared error (MSE) as the training loss function:

L_{MSE} = \frac{1}{B \cdot N} \sum_{b = 1}^{B} \sum_{i = 1}^{N} {({\hat{Y}}_{b, i} - Y_{b, i})}^{2}

(12)

Here, B represents the batch size, which is set to B = 2 in this study; in practice, parameter updates for each epoch are based on the entire training dataset, resulting in minimal gradient noise.

The random seed is 42, and the Adam optimizer is used, whose parameter update rules combine momentum and adaptive learning rates. The initial learning rate is η = 0.0005, the weight decay coefficient is λ = 1 × 10⁻⁶, and the gradient clipping threshold is set to 1.0 to prevent gradient explosion. The learning rate employs the ReduceLROnPlateau strategy: when the validation set loss does not decrease for 40 consecutive epochs, the learning rate is multiplied by a decay factor of 0.5. A minimum learning rate threshold is set to prevent training stagnation.

(2): Early Stopping and Model Selection

To prevent overfitting, an early stopping mechanism is implemented. At the end of each epoch, the R² value on the validation set is calculated. If the validation R² does not exceed the historical best value for 40 consecutive epochs, training is terminated and the model is restored to the parameters that yielded the highest validation R². The maximum number of training epochs is set to 400.
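
The early stopping rule can be sketched as a small training-loop wrapper. The callbacks `run_epoch`, `get_state`, and `set_state` are hypothetical stand-ins for a real training step and for saving and restoring model parameters, and the validation curve is simulated.

```python
import copy


def train_with_early_stopping(run_epoch, get_state, set_state,
                              max_epochs=400, patience=40):
    """Stop after `patience` epochs without a new best validation R2."""
    best_r2, best_state, best_epoch, bad_epochs = float("-inf"), None, -1, 0
    epochs_run = 0
    for epoch in range(max_epochs):
        val_r2 = run_epoch(epoch)
        epochs_run = epoch + 1
        if val_r2 > best_r2:
            best_r2, best_epoch, bad_epochs = val_r2, epoch, 0
            best_state = copy.deepcopy(get_state())  # snapshot the best parameters
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                break
    if best_state is not None:
        set_state(best_state)  # restore the best checkpoint
    return best_r2, best_epoch, epochs_run


# Simulated run: validation R2 peaks at epoch 30 and declines afterwards
state = {"epoch": -1}


def run_epoch(epoch):
    state["epoch"] = epoch
    return 0.5 - ((epoch - 30) / 100) ** 2


best_r2, best_epoch, epochs_run = train_with_early_stopping(
    run_epoch, lambda: state, state.update
)
```

In the simulated run, training halts well before the 400-epoch cap and the stored parameters are those from the best validation epoch rather than the last one.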

(3): Evaluation Metrics

Model performance was evaluated using the coefficient of determination (R²), root mean square error (RMSE), and mean absolute error (MAE). All metrics were calculated after the standardized values had been converted back to their original units. Table 4 shows a comparison of the STGNN model’s performance on the training, validation, and test sets.
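
The three metrics can be computed as in the following sketch, in which standardized predictions are first mapped back to original units. The observed values and the scaler statistics are hypothetical.

```python
from math import sqrt


def inverse_z(values, mu, sigma):
    """Undo the Z-score transform using the training-set statistics."""
    return [v * sigma + mu for v in values]


def regression_metrics(y_true, y_pred):
    n = len(y_true)
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "r2": 1.0 - ss_res / ss_tot,  # negative when worse than predicting the mean
        "rmse": sqrt(ss_res / n),
        "mae": sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n,
    }


# Hypothetical EE observations and standardized model outputs
y_true = [1.0, 1.1, 0.9, 1.2]
z_pred = [0.0, 0.0, 0.0, 1.0]
y_pred = inverse_z(z_pred, mu=1.0, sigma=0.1)
metrics = regression_metrics(y_true, y_pred)
```

Because R² compares the residual sum of squares with that of a constant mean predictor, it can fall below zero on out-of-sample data.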

Neither the validation set nor the test set showed a significant deterioration in R², RMSE, or MAE relative to the training set; in fact, the test set performed slightly better than the training set, indicating that the model does not suffer from severe overfitting and demonstrates good generalization ability. This suggests that the triple regularization strategy of Dropout, L2 weight decay, and early stopping, together with the chosen parameter settings, is reasonable for low-data-volume scenarios.

Table 5 shows a comparison of the performance of STGNN and the baseline model on the out-of-sample test set. The R² value for the test set of the linear regression model was negative, indicating that the linear assumption completely fails to capture the complex relationship between the digital economy and energy efficiency; its predictive performance was even worse than that of simply using historical averages. Nonlinear ensemble models, such as ExtraTrees and HistGBDT, also performed poorly, suggesting that even with the introduction of nonlinear fitting capabilities, models still struggle to extract meaningful signals when temporal dynamics and spatial dependencies are ignored. The results show that the R² of the STGNN model on the test set is significantly higher than that of traditional econometric models and machine learning models, demonstrating its advantage in capturing spatiotemporal nonlinear relationships.

It is worth noting that, although the R² value is moderate, the STGNN consistently outperforms all baseline models in out-of-sample predictions. Therefore, while the model fails to account for the majority of the variance, it remains valuable for capturing spatiotemporal nonlinear dependencies, exploring conditional spatial associations through counterfactual simulations, and classifying nonlinear response patterns across provinces.

4.5.4. Sensitivity Analysis of Spatial Weights

To assess the robustness of the core findings, this study further reconstructed the global adjacency matrix under three alternative weighting schemes and refitted the STGNN model under each of them. The results are shown in Table 6 below.

Under all four weighting schemes (the equal-weight baseline and the three alternatives), the R² values of the STGNN model on the test set remained stable, with fluctuations in both RMSE and MAE not exceeding 3%, indicating no significant performance degradation. This suggests that the STGNN model is insensitive to the configuration of the spatial weighting matrix and exhibits good robustness.

The equal-weighting baseline scheme achieved the highest R² on the test set and the best fitting performance, demonstrating that this unbiased setting can more comprehensively capture spatial correlation information across the three dimensions of geography, economy, and digital data. This further validates the rationality of the main regression baseline setting adopted in this study.

Furthermore, to reduce the subjective bias that may arise from manually setting weights, this paper supplements the comprehensive weighting formula with learnable weights. The weights α, β, and γ are parameterized through softmax normalization and learned automatically by the model, and they are estimated over five training runs with different random seeds. The results are shown in Table 7 below.

The learned weights averaged α = 0.3277, β = 0.3326, and γ = 0.3398 across the runs, which are very close to the equal-weight benchmark of 1/3 each. This further supports equal weighting as a robust scheme.
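
The softmax parameterization of the mixing weights can be sketched as follows. The logits and the 2 × 2 matrices are hypothetical; in a trained model the logits would be updated by gradient descent together with the other parameters.

```python
from math import exp


def softmax(logits):
    """Map unconstrained logits to positive weights that sum to one."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def combine(w_geo, w_eco, w_dig, logits):
    """Weighted sum of the three adjacency matrices with softmax weights."""
    alpha, beta, gamma = softmax(logits)
    n = len(w_geo)
    return [
        [alpha * w_geo[i][j] + beta * w_eco[i][j] + gamma * w_dig[i][j] for j in range(n)]
        for i in range(n)
    ]


# Zero logits reproduce the equal-weight baseline of 1/3 each
equal = softmax([0.0, 0.0, 0.0])
learned = softmax([-0.02, 0.0, 0.02])  # hypothetical learned logits

w_geo = [[0.0, 1.0], [1.0, 0.0]]
w_eco = [[0.0, 1.0], [1.0, 0.0]]
w_dig = [[0.0, 1.0], [1.0, 0.0]]
w_com = combine(w_geo, w_eco, w_dig, [0.0, 0.0, 0.0])
```

The softmax keeps the three weights positive and summing to one, so the learned scheme remains directly comparable with the fixed-weight alternatives.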

5. Results Analysis

5.1. Spatio-Temporal Evolution Characteristics of DE and EE

China’s 30 provinces are grouped into 7 major regions: North China: Beijing, Tianjin, Hebei, Shanxi, Inner Mongolia; Northeast China: Liaoning, Jilin, Heilongjiang; East China: Shanghai, Jiangsu, Zhejiang, Anhui, Fujian, Jiangxi, Shandong; Central China: Henan, Hubei, Hunan; South China: Guangdong, Guangxi, Hainan; Southwest China: Chongqing, Sichuan, Guizhou, Yunnan; and Northwest China: Shaanxi, Gansu, Qinghai, Ningxia, Xinjiang.

Figure A1 in Appendix A depicts the evolving trends of the digital economy (DE) and energy efficiency (EE) across China’s 30 provinces, grouped into 7 major regions, from 2011 to 2023. The results reveal a clear upward trajectory in DE across all provinces over this period, suggesting that its development momentum is strengthening. Notably, eastern provinces show markedly higher growth rates and absolute levels of DE than central and western regions. In contrast, EE does not follow a uniform, unidirectional trend nationwide. While showing an overall positive trajectory, it is characterized by pronounced interannual fluctuations, with significant divergence in EE trends across different regions and time periods. Visually, the long-term rise in DE does not correspond to a linear, synchronous change in EE at the provincial level; instead, the two series alternate between phases of synchronization and deviation, with notable differences across regions. The specific associational patterns between DE and EE will be systematically verified and deconstructed through the model analysis in subsequent sections.

Figure 2 and Figure 3 present a comparative analysis of energy efficiency and digital economic development levels in four specific years between 2011 and 2023, revealing an overall upward shift in the value ranges over time.

The higher the Digital Economy Development Index (DE) value, the more advanced the development level. The evolution of China’s provincial digital economy development from 2011 to 2023 reveals an overall steady upward trend, marked regional disparities, and a gradual expansion of high-value clusters.

In 2013, the overall level remained relatively low, with over 80% of provinces concentrated in the low-value range (represented by deep and light green areas). Only a few developed coastal provinces in eastern China fell into the high-value category. With the release of the Broadband China Strategy and Implementation Plan, the nationwide broadband speed-up, cost-reduction, and coverage initiative was launched. The digital economy was in its nascent stage across the country, with resources highly concentrated in the eastern coastal regions. Central and western areas remained largely at low levels, resulting in pronounced regional disparities.

In 2017, after the digital economy was elevated to a national strategy, its overall level increased, serving as a core engine for high-quality economic development, while high-value regions expanded locally. The lowest range increased from 0.03 to 0.06, though development levels remained concentrated in the low-to-medium range. Beijing, Guangzhou, Jiangsu, Zhejiang, and Shanghai accelerated their digital industry deployments, expanding the high-value DE range to 0.33–0.39.

In 2020, six provinces and municipalities, namely Hebei (Xiong’an New Area), Zhejiang, Fujian, Guangdong, Chongqing, and Sichuan, were designated as National Digital Economy Innovation and Development Pilot Zones, which spurred a significant rise in the overall level and accelerated the diffusion of digital resources toward central regions.

In 2023, the East Data West Computing initiative accelerated the transfer of digital resources to central and western regions while mandating the deep integration of digital technologies with green development. The minimum DE index for central and western regions stabilized at 0.08, with more provinces entering the 0.21–0.34 range. Meanwhile, eastern regions benefited from the Digital China master plan, achieving a high DE index exceeding 0.75 and solidifying a stable pattern of “eastern leadership and central-western catch-up.”

From 2011 to 2023, China’s overall energy efficiency has steadily converged toward the “green production frontier,” while regional disparities have narrowed. High-efficiency zones expanded from scattered areas in the eastern region in 2013 to gradually encompass core areas in central and western China. Their spatial distribution pattern closely mirrors the evolution trajectory of DE.

In 2013, energy efficiency exhibited substantial spatial disparities, with most provinces experiencing efficiency losses. Only a few eastern provinces reached frontier efficiency (ρ ≥ 1). Deep green, light green, and yellow regions (ρ < 1) covered most western and central provinces, indicating energy efficiency losses characterized by excessive energy inputs and higher pollution emissions. Among these, western regions (such as Xinjiang) fell in the 0.91–0.95 range, reflecting relatively severe losses. The Air Pollution Prevention and Control Action Plan drove emissions reductions in energy-intensive industries across eastern China. A few provinces, including Shandong and Jiangsu, entered the high-value range (red, 1.06–1.1), revealing an uneven pattern of “eastern leadership with central-western lag.”

In 2017, the front-running areas expanded, but localized energy efficiency losses intensified. In several northwestern regions, efficiency losses worsened relative to 2013, accompanied by more pronounced energy waste and pollution emissions. Orange and red zones (ρ > 1) expanded from eastern to central core provinces, with more regions achieving synergistic efficiency in “energy–economy–environment” and green production models beginning to spread.

Following the introduction of the dual-carbon goals in 2020, efficiency losses were substantially mitigated, moving the country closer to the green frontier. Most central provinces achieved minimal energy efficiency losses, with energy inputs and pollution emissions nearing optimal levels.

The 14th Five-Year Plan for Green Industrial Development (2023) further accelerated the green and low-carbon transformation of China’s industrial sector. By 2023, the country had largely reached the green production frontier, with a few eastern provinces in the ultra-high value range, exhibiting a balanced pattern characterized by “High level overall with local leadership.”

5.2. Feature Contribution Based on SHAP

5.2.1. Feature Importance Analysis

Figure 4 displays a heatmap of Pearson correlation coefficients, reflecting the strength of linear associations between variables. EE shows positive correlations with most variables and a weak negative correlation with the length of long-distance optical cable lines (J4). Overall, the correlation strengths are modest, with most coefficients ranging between 0.2 and 0.35. Among the digital-economy-related variables, 11 show correlation coefficients with EE exceeding 0.28. Notably, S5 (0.3510), DE (0.3476), and its squared term (DE², 0.3412) rank in the top three, all indicating moderate positive correlations. The positive correlation between the DE composite index and EE (r = 0.3476) indicates that provinces with higher digital economy development levels tend, on average, to show higher energy efficiency. Meanwhile, the comparable magnitude of the correlation coefficient for the squared term (DE²) hints at potential nonlinearity in the bivariate relationship. The most prominent factors within the sub-dimensions are S5, J2, and C3, suggesting that the scale of the digital industry, infrastructure coverage, and corporate digital participation are the digital economy facets most closely associated with EE in terms of linear covariation. All control variables display relatively weak correlations with EE. Among these, industrial structure (0.2116) shows the strongest correlation, which is consistent with the pattern that a shift toward a service-oriented industrial structure is often accompanied by improved energy efficiency. The degree of openness to foreign investment (0.0689) exhibits the lowest correlation. This finding suggests that, in this sample, the introduction of foreign investment did not produce the expected “green technology spillover” effect.

Table 8 reports the feature importance rankings derived from the LightGBM (LGB) and CatBoost (CB) models. Among the top 10 most important features, nine belong to the digital economy dimension. Specifically, three are from the industrial digitalization (C) dimension, four from the digital industrialization (S) dimension, and two from the digital infrastructure (J) dimension. This indicates that, within the predictive framework of these models, digital economy features exhibit substantially higher relevance to EE prediction than conventional development indicators. Features such as C3, EDU, S5, and J2 rank consistently high in both models, demonstrating stable predictive utility across different ensemble architectures. The importance of control variables, however, varied considerably. Educational attainment (EDU) ranked second overall and performed exceptionally well in both models, which is consistent with the interpretation that highly skilled talent plays an important role in energy management. Notably, economic development level (GDP) ranked last in importance. Against the backdrop of the digital economy, GDP growth alone shows a weaker association with energy efficiency improvements. Instead, the quality of economic development—particularly the depth and breadth of digital transformation—has taken center stage.

The two models yield divergent results in feature importance analysis. The mobile phone facility scale (J5) ranks 3rd in importance in the CatBoost model but 23rd in the LightGBM model. This discrepancy can be attributed to J5’s high correlation with other digital infrastructure features. CatBoost mitigates multicollinearity through adaptive weight adjustment, isolating J5’s independent contribution. Conversely, LightGBM’s histogram-based feature merging process may weaken this independent contribution, diluting its effect into more strongly correlated features and consequently lowering J5’s importance ranking. As a core ICT component, the energy-saving effect of mobile infrastructure is highly contingent on interactions with other variables, forming a complex, nonlinear relationship. In the short term, increased base station construction directly raises energy consumption. Over the long term, indirect energy-saving effects gradually outweigh direct energy consumption, creating a dynamic pattern of “short-term negative impact, long-term positive impact.” Meanwhile, the importance of the overall level of digital economic development (DE) and its squared term (DE²) is lower than that of specific digitalization indicators. This indicates that the digital economy influences EE mainly through sector-specific digitalization, rather than through a simple linear association with the overall digital economy level.

5.2.2. SHAP Contribution Factors

Figure 5 presents the SHAP summary plot derived from the combined LightGBM and CatBoost ensemble models, illustrating the magnitude and direction of each feature’s contribution to the predicted value of energy efficiency. The plot further reveals the nonlinear and heterogeneous patterns of these predictive contributions across different feature value ranges. The horizontal axis represents the SHAP value: a positive value indicates that the feature contributes positively to the model’s prediction, while a negative value indicates a downward contribution. The vertical axis ranks features by their overall impact on the model (top = highest combined SHAP value across both models). Colors represent the magnitude of the feature’s value: red indicates a high value, while blue indicates a low value. It is important to note that SHAP scores quantify how each feature influences the model’s output, rather than measuring its causal effect on actual energy efficiency.

In the integrated model, the core characteristics of the digital economy are positively correlated with EE predictions. Among these factors, the proportion of enterprises engaged in e-commerce activities (C3) emerges as the feature with the highest predictive contribution. Most SHAP values fall above zero, with higher values (red) indicating a more concentrated positive contribution. In the model’s predictions, higher e-commerce adoption by enterprises is positively associated with energy efficiency, suggesting that industrial digitization is a highly relevant dimension for predicting EE. The Internet broadband penetration rate (J2) also shows a high overall SHAP contribution. When J2 values are high (red), the SHAP values are mostly positive; conversely, low values (blue) are associated with negative SHAP contributions. This pattern is consistent with the presence of threshold characteristics in the relationship between J2 and EE: digital technology advances may translate into positive EE associations only when broadband penetration reaches a certain level. Below that level, inadequate digital infrastructure is associated with weaker or even negative predicted effects on EE. The SHAP values for the number of legal entities in the information services sector (S5) are largely positive, suggesting that digital–industry clustering facilitates technology diffusion and application, thereby extending the reach of energy conservation and emission reduction.

Compared to the characteristics of the digital economy, the SHAP values for other control variables are scattered, contribute little, and lack stability. The SHAP values for educational attainment (EDU) also exhibit a broad distribution in the negative range. This phenomenon does not imply that education itself inhibits EE; rather, it is consistent with the interpretation that there may be a mismatch between the existing talent pool and the practical application of digital and energy-saving technologies within the model’s predictions. Educational resources show limited direct association with energy efficiency gains, necessitating integration with digital economy scenarios to achieve meaningful results.

Figure 6 illustrates the feature contribution decomposition of the predicted energy efficiency values for different samples, based on the magnitude of each feature value and the direction of its contribution. The horizontal axis represents the predicted EE values. The Base Value (model baseline) is 1.007, corresponding to the sample average. Red indicates high-value features that increase predicted values, while blue indicates low-value features that decrease predicted values. The length of each color block represents the degree of contribution. This study primarily selected three typical sample types for analysis: high predicted values, median values, and low predicted values.

The low-efficiency sample (prediction f(x) = 0.79) falls well below the baseline. Here, digital economy features take relatively low values and are associated with predominantly negative SHAP contributions. Infrastructure-related features (J5, J2, J4, etc.) appear blue, exerting a pronounced negative pull on the predicted value. This pattern suggests that, within the fitted ensemble model, lower levels of digital infrastructure coverage tend to co-occur with lower predicted energy efficiency. The medium-efficiency sample (f(x) = 1.00) exhibits an equilibrium pattern dominated by digital economy characteristics. Telecommunications services per capita (S1) emerges as the primary positive driver, highlighting the pivotal role of digital industrialization in this range. Concurrently, the digital financial inclusion index (C4) and proportion of enterprises engaged in e-commerce activities (C3) made positive contributions to the predicted values, consistent with the view that the industrial digitization dimension is closely linked to higher predicted energy efficiency. Educational attainment (EDU) and broadband penetration (J2) show positive but weak contributions, reflecting diminishing marginal returns on these inputs. High-efficiency samples (f(x) = 1.15) exhibit significantly higher predicted values than the baseline, with high digital economy characteristics driving the positive contribution. The proportion of enterprises with e-commerce activities (C3) appears in red, indicating that this high-value range is associated with a significant upward revision of the predicted EE value. When core elements of the digital economy reach advanced levels, they can optimize supply chains and production processes, which is linked to improved energy efficiency. This represents the key to how high-efficiency samples achieve “low energy consumption, low pollution, and high output.”
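The decomposition read from Figure 6 relies on the additivity property of SHAP values: a sample's prediction equals the base value plus the sum of its feature contributions. The sketch below illustrates this for a low-efficiency sample. Only the base value (1.007) and the prediction (0.79) come from the text; the individual contribution values and the function name `shap_prediction` are hypothetical and chosen purely for illustration.

```python
import math

def shap_prediction(base_value, contributions):
    """Rebuild a prediction from its SHAP decomposition (base value + sum of contributions)."""
    return base_value + sum(contributions.values())

base_value = 1.007  # model baseline reported for Figure 6

# Hypothetical SHAP contributions for a low-efficiency sample (NOT taken from the paper).
low_sample = {"J5": -0.080, "J2": -0.065, "J4": -0.040, "C3": -0.022, "EDU": -0.010}

prediction = shap_prediction(base_value, low_sample)

# Rank features by the absolute size of their contribution, largest first.
ranking = sorted(low_sample, key=lambda name: abs(low_sample[name]), reverse=True)

print(f"f(x) = {prediction:.2f}; largest pull: {ranking[0]}")
```

Under these illustrative numbers the negative infrastructure contributions move the prediction from the baseline down to about 0.79, which is how the blue blocks in a force plot are read.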

5.3. The Nonlinear Relationship Between DE and EE

5.3.1. Spatial Distribution Characteristics and Typical Provinces

This study uses a trained STGNN model to extract the nonlinear curves of DE versus EE for each province. A rolling window approach is employed. With the DE value from the last period of the window as the x-axis variable and the corresponding EE value from the next period of the window as the y-axis variable, a provincial DE-EE time-series scatter plot is constructed, and ordinary least squares (OLS) is used to estimate a quadratic polynomial fitting equation.
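One possible reading of this pairing step is sketched below: for each window position, the DE value in the window's last period is matched with the EE value one period later. The function name `lagged_pairs` and the `window` parameter are assumptions, since the window length is not stated here, and the sketch works on plain lists rather than on STGNN outputs.

```python
def lagged_pairs(de, ee, window):
    """Pair DE at the last period of each rolling window with EE in the following period.

    de, ee : equal-length annual series for one province
    window : rolling-window length (hypothetical; not specified in the text)
    """
    if len(de) != len(ee):
        raise ValueError("DE and EE series must have the same length")
    # The window ending at index t covers t-window+1..t, so t starts at window-1.
    # EE is taken at t+1, so t stops at len(de)-2.
    return [(de[t], ee[t + 1]) for t in range(window - 1, len(de) - 1)]

# Toy series (illustrative values only).
de_series = [0.10, 0.20, 0.30, 0.40, 0.50]
ee_series = [1.00, 1.10, 1.20, 1.30, 1.40]
pairs = lagged_pairs(de_series, ee_series, window=3)
print(pairs)
```

The resulting (DE, EE) pairs are the points to which the quadratic polynomial is then fitted.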

The impact of the digital economy on energy efficiency can be categorized into three nonlinear patterns: U-shaped effects, inverted U-shaped effects, and weakly correlated patterns.

Specifically, if the coefficient of the quadratic term (a) is significantly positive at the 5% statistical significance level, the adjusted R² of the fitted equation is ≥0.20, and the amplitude of the curve’s response is no less than 0.8 times the standard deviation of the fitted values, this indicates a U-shaped pattern. When a is significantly negative at the 5% statistical level, the adjusted R² of the fitted equation is ≥0.20, and the amplitude of the curve’s response is no less than 0.8 times the standard deviation of the fitted values, this indicates an inverted U-shaped pattern. All other cases are classified as weak correlations. Given the limited effective sample size of the annual provincial panel data and the tendency for boundary observations to cause significant fluctuations in the location of inflection points, this study does not treat “the inflection point falling within the sample observation range” as a strict exclusion criterion. Instead, it is used only as supplementary explanatory information for pattern characteristics to avoid result biases caused by overclassification.
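The classification rule can be sketched in plain Python as follows. This is a minimal illustration, not the authors' code. Several choices are assumptions: the 5% test is approximated by comparing the t-statistic of the quadratic coefficient with a critical value `t_crit` (2.365 is the two-sided 5% value for 7 degrees of freedom and should be adjusted to the actual sample size), "amplitude" is read as the range of the fitted values, and a single `amp_ratio` is used for both signs. All function and parameter names are hypothetical.

```python
import math
import statistics

def _invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def classify_pattern(de, ee, t_crit=2.365, min_adj_r2=0.20, amp_ratio=0.8):
    """Fit EE = a*DE^2 + b*DE + c by OLS and label the curve shape."""
    n = len(de)
    rows = [[x * x, x, 1.0] for x in de]  # design matrix columns: DE^2, DE, 1
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, ee)) for i in range(3)]
    inv = _invert(xtx)
    beta = [sum(inv[i][j] * xty[j] for j in range(3)) for i in range(3)]
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]

    mean_y = sum(ee) / n
    ssr = sum((y - f) ** 2 for y, f in zip(ee, fitted))  # residual sum of squares
    sst = sum((y - mean_y) ** 2 for y in ee)             # total sum of squares
    adj_r2 = 1.0 - (ssr / (n - 3)) / (sst / (n - 1))

    # Standard error and t-statistic of the quadratic coefficient a = beta[0].
    se_a = math.sqrt(ssr / (n - 3) * inv[0][0])
    t_a = beta[0] / se_a if se_a > 0 else math.copysign(math.inf, beta[0])

    amplitude = max(fitted) - min(fitted)  # assumed definition of "amplitude"
    if adj_r2 < min_adj_r2 or amplitude < amp_ratio * statistics.stdev(fitted):
        return "weak"
    if t_a > t_crit:
        return "U-shaped"
    if t_a < -t_crit:
        return "inverted U-shaped"
    return "weak"
```

A production analysis would more likely use a statistics library to obtain exact p-values; the hand-rolled solver here only keeps the sketch free of external dependencies.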

Figure 7 illustrates the spatial distribution of the impact types of the DE on EE at the provincial level nationwide, revealing pronounced regional clustering patterns. The southeastern region exhibits a dominant U-shaped pattern, while the central and western regions concentrate in an inverted U-shaped distribution.

The U-shaped pattern (dark green) is concentrated in China’s eastern coastal regions, the Yangtze River Delta, the Pearl River Delta, and the core industrial belt of Central China. These areas exhibit a high degree of alignment between China’s digital economy and its industrial foundation. Their economies tend to be centered on high-end manufacturing or modern services. The fitted response curves indicate that, during the early stages of DE development, higher DE levels are initially associated with relatively lower predicted EE. The infrastructure investments and equipment upgrades led to short-term increases in energy consumption. However, once a critical threshold is surpassed, the synergistic effects of industrial digitalization and the integration of core digital economy models are unleashed. This drives cost reduction and efficiency gains, leading to sustained improvements in energy efficiency. This inflection in the response curve suggests a transition of digital technologies from an “initial energy-investment phase” to an “energy-saving effect release phase.” Meanwhile, industrial structure upgrading and green technological innovation serve as key drivers of the U-shaped effect. Provinces in this category show significantly higher shares of high-end manufacturing and greater innovation technology investment compared to other provincial types. This suggests that industrial restructuring and green technological innovation are factors closely linked to energy efficiency. In addition, provinces exhibiting a U-shaped pattern tend to cluster spatially, suggesting that geographical proximity may help facilitate technology sharing and synergies.

The inverted U-shaped pattern (light green) is observed in 12 provinces, primarily located in the Northeast’s old industrial bases, the energy-dependent regions of the Northwest, and resource-rich provinces in the Southwest. These provinces have industrial structures dominated by energy extraction and traditional heavy chemical industries. In the fitted response curves for these provinces, the relationship between the level of digitalization (DE) and predicted energy efficiency (EE) typically rises initially and then declines. This pattern is consistent with the potential impact of the energy rebound effect and constraints on capacity expansion. Specifically, in these resource-rich provinces, as digital transformation deepens, the initial gains in energy efficiency resulting from the adoption of digital technologies may be partially offset by the expansion of energy-intensive production capacity. The resulting increase in energy consumption outweighed the energy-saving effects of digital technologies, leading to a net decline in EE. Meanwhile, the relatively low level of human capital also serves as a key factor undermining the long-term energy-saving effects in these provinces. In Anhui and Zhejiang, the pattern differs slightly: initial growth is followed by a stabilization phase. At moderate levels of digitalization, supply chain optimization is closely linked to improvements in energy efficiency; however, at higher levels, the marginal relationship between further increases in digitalization and improvements in energy efficiency weakens, leading to a plateau in energy efficiency rather than a sustained upward trend.

The weak-correlation pattern (light blue) is observed only in Guangxi, Hainan, Qinghai, and Ningxia. These provinces exhibit a relatively monolithic industrial structure, with lagging indicators for industrial digitization and digital industrialization. They have achieved only preliminary coverage of digital technologies, exerting minimal influence on energy efficiency. The association between EE and DE is weak, indicating a lack of industrial-level digital transformation. Consequently, energy-saving effects are unstable, owing to the insufficient depth of DE intervention in EE. Inadequate investment in digital factors and lack of industrial adaptability result in insignificant correlation.

Figure 8 presents the STGNN quadratic fitting results for DE and EE across six representative provinces (Xinjiang, Inner Mongolia, Tianjin, Qinghai, Jiangsu, and Sichuan), illustrating the distinct conditional associations between DE and predicted EE that emerge under different industrial characteristics and levels of digital integration. Inner Mongolia, a typical energy-dependent province with a large coal and thermal power sector, displays an inverted U-shaped pattern in which EE initially rises and then stabilizes, not yet reaching the threshold for decline. Digital technologies have been accompanied by a shift from expanding traditional energy production capacity to promoting energy efficiency in the new energy sector. However, due to the industry’s high dependence on energy, there remains a risk of declining energy efficiency in the future. Xinjiang exhibits pronounced inverted-U characteristics, with digital technologies driving large-scale expansion of the energy sector and amplifying energy consumption in energy-intensive industries. In Qinghai, DE has long remained below 0.15, and R&D investment is only about 2% of the national average. The province faces a shortage of digital elements, while its pillar industries lack sufficient scope for digital transformation, rendering them unable to exert an effective influence on EE. Sichuan Province exhibits a strong positive correlation driven by a U-shaped pattern. As the level of digital economic development increases, predicted energy efficiency shows a steady upward trend. The deepening integration of digital technologies with the real economy allows the digital economy to promote efficient energy use through optimized resource allocation and enhanced production efficiency. Tianjin and Jiangsu, both high-end manufacturing regions, display a U-shaped relationship between DE and predicted EE, suggesting that digital elements are linked to tangible EE gains in the model’s predictions. During the early stages of DE development, both regions invested heavily in energy to build industrial internet infrastructure and upgrade production equipment, leading to a short-term decline in energy efficiency. Following the implementation of industrial digital transformation, energy savings were achieved in high-energy-consuming industries.

5.3.2. Stability Testing

To verify that the aforementioned nonlinear classification results were not merely random outputs caused by random model initialization, this study kept the dataset partitioning, model architecture, and training process constant, and the STGNN model was trained repeatedly using 30 different random seeds. After each training run, the nonlinear patterns for each province were re-evaluated according to a uniform set of rules, and the frequency and stability of the classification results for each province were statistically analyzed. The key statistical results are presented in Table 9.

The results show that the average number of identified patterns for the U-shaped and inverted U-shaped models were 13.70 and 13.87, respectively, accounting for a combined 91.9% of the total sample. The core conclusion—that a significant nonlinear relationship exists between the digital economy and energy efficiency—is highly robust and does not depend on specific random initialization conditions. It should be noted that classification results still exhibit some fluctuation at the provincial level, with provinces oscillating between the U-shaped and inverted U-shaped patterns. This is primarily due to the transitional characteristics at the boundaries of the two patterns: for provinces where the digital economy is near an inflection point, the sign of the quadratic term coefficient in the DE-EE curve is highly sensitive to model initialization.
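The seed-stability summary can be reproduced in outline as follows. This is a minimal sketch assuming each training run yields a dictionary that maps a province to its pattern label; the function names and the toy labels are hypothetical. The last lines recompute the combined share from the two averages reported above.

```python
from collections import Counter

def stability_summary(runs):
    """For each province, return its most frequent label and the share of runs agreeing with it."""
    summary = {}
    for province in runs[0]:
        counts = Counter(run[province] for run in runs)
        label, freq = counts.most_common(1)[0]
        summary[province] = (label, freq / len(runs))
    return summary

def mean_count(runs, label):
    """Average number of provinces assigned `label` per run."""
    return sum(sum(1 for v in run.values() if v == label) for run in runs) / len(runs)

# Toy example with three seeds and two provinces (illustrative labels only).
toy_runs = [
    {"P1": "U", "P2": "inverted U"},
    {"P1": "U", "P2": "U"},
    {"P1": "U", "P2": "inverted U"},
]
toy_summary = stability_summary(toy_runs)

# Combined nonlinear share implied by the reported averages over 30 provinces.
nonlinear_share = round((13.70 + 13.87) / 30 * 100, 1)
print(toy_summary, nonlinear_share)
```

A province such as P2 in the toy data, whose label flips in one of three runs, corresponds to the boundary cases described above.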

5.4. Spatial Correlation Patterns: Model-Implied Associations from Counterfactual Simulations

5.4.1. Counterfactual Simulation

This section conducts counterfactual simulation analyses using the trained STGNN model, with Jiangsu and Guangdong provinces as the target regions. These two provinces are at the forefront of China’s digital economy development, serving as the core growth poles of the Yangtze River Delta and Pearl River Delta, respectively. They boast well-developed digital infrastructure and rank among the nation’s leaders in terms of the depth of industrial digitization, making them representative examples of the radiating and driving effects of digital economic development.

While holding all other characteristics constant, we apply an exogenous positive shock solely to the level of digital economic development in the target provinces and use the STGNN model to predict the counterfactual values of energy efficiency following the shock. Following mainstream macroeconomic conventions, we set the baseline positive DE shock to +0.5 standard deviations (SD). The post-shock DE value falls within the observed sample range (with a maximum of 0.75) and corresponds to roughly 2 to 3 years of growth at the national average rate under current policies. Section 5.4.2 reports sensitivity tests of ±0.25 standard deviations and ±0.75 standard deviations.
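The shock procedure can be sketched as below. The trained STGNN is not reproduced here; `toy_predict` is a deliberately simple linear stand-in (own DE plus the mean DE of neighbors) used only to make the example runnable. The province labels, coefficients, adjacency, and function names are all hypothetical.

```python
def counterfactual_deltas(predict, de, target, sd, k=0.5):
    """Shock the target's DE by k*sd, hold everything else fixed, return predicted EE changes."""
    baseline = predict(de)
    shocked = dict(de)                     # copy so the original inputs stay untouched
    shocked[target] = de[target] + k * sd
    counterfactual = predict(shocked)
    return {p: counterfactual[p] - baseline[p] for p in de}

def spatial_association_ratio(deltas, target, neighbors):
    """Mean predicted change in neighbors divided by the target's own predicted change."""
    mean_neighbor = sum(deltas[p] for p in neighbors) / len(neighbors)
    return mean_neighbor / deltas[target]

# Toy stand-in for the trained model (NOT the STGNN).
NEIGHBORS = {"A": ["B", "C"], "B": ["A"], "C": ["A"]}

def toy_predict(de):
    return {
        p: 1.0 + 0.10 * de[p] + 0.04 * sum(de[q] for q in NEIGHBORS[p]) / len(NEIGHBORS[p])
        for p in de
    }

de_levels = {"A": 0.4, "B": 0.3, "C": 0.2}
deltas = counterfactual_deltas(toy_predict, de_levels, target="A", sd=0.2, k=0.5)
ratio = spatial_association_ratio(deltas, "A", NEIGHBORS["A"])
print(deltas, ratio)
```

Calling `counterfactual_deltas` with other values of `k` gives the corresponding sensitivity runs; with a nonlinear model the ratio would generally vary with the shock size, unlike in this linear toy.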

The counterfactual simulation results for Jiangsu Province after a 0.5 standard deviation increase in DE are shown in Table 10. Overall, the findings suggest a pattern of pronounced direct associations, spatial propagation in model predictions, and clear geographic attenuation. Benefiting from its large-scale digital economy and advanced infrastructure, Jiangsu possesses robust capabilities for technology diffusion and industrial linkage effects. Key quantitative indicators show that the simulated increase in DE is correlated with a positive adjustment in Jiangsu Province’s projected EE (0.00112 units, accounting for approximately 0.10% of the baseline value). At the same time, through spatial linkages, this adjustment spreads to neighboring provinces, leading to positive projected changes in their energy efficiency (EE) values. The average absolute increase in the predicted EE across four neighboring provinces was 0.000455 units, representing a relative increase of 0.045%. The magnitude of the neighboring-province response reached about 40.6% of the adjustment predicted for Jiangsu. In terms of response magnitude rankings, Anhui > Shandong > Zhejiang > Shanghai. This pattern aligns closely with regional economic interconnections. Anhui and Shandong exhibit the strongest predicted responses, followed closely by Zhejiang. As a top-tier city, Shanghai possesses a high baseline in digital economy and energy efficiency, resulting in a weaker predicted marginal association from Jiangsu’s simulated increase. In contrast, Anhui and Shandong remain in the mid-stage of digital transformation, making them show stronger conditional associations with Jiangsu’s radiating influence.

Table 11 presents the counterfactual simulation results for Guangdong Province following a 0.5 standard deviation increase in DE, yielding a pattern generally consistent with that observed for Jiangsu. The simulated growth in DE for Guangdong Province is associated with a positive adjustment in its projected energy efficiency—an adjustment that is slightly smaller than the projected value for Jiangsu. Nevertheless, the model predicts that this adjustment will still have a significant ripple effect on neighboring provinces. Multiple neighboring provinces benefit synergistically, achieving an average energy efficiency (EE) improvement of 0.00040 units, representing a relative increase of 0.038%. This model-implied spatial association ratio in Guangdong Province accounts for 43.36% of its own impact, slightly exceeding that of Jiangsu Province. This is consistent with Guangdong’s status as the hub of the digital economy in South China and its radiating influence. Unlike Jiangsu and its neighboring provinces, which primarily focus on manufacturing collaboration, Guangdong and its neighboring provinces leverage their diversified industrial structures to build regional industrial collaboration networks by harnessing complementary advantages. The extent of the impact on neighboring provinces is closely linked to the level of connectivity in factor markets and the development of digital infrastructure. Specifically, the projected changes in Hunan and Jiangxi provinces are relatively moderate, which is consistent with the high proportion of traditional industries in their economies.
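As a quick consistency check on the figures quoted for the two provinces, the ratio for Jiangsu can be recomputed from its reported changes, and the own-province change for Guangdong can be backed out from its reported neighbor mean and ratio. The backed-out Guangdong value is an inference from rounded numbers, not a figure reported in the text, and the variable names are illustrative.

```python
# Reported values for Jiangsu (+0.5 SD shock).
jiangsu_own = 0.00112             # predicted EE change in Jiangsu
jiangsu_neighbor_mean = 0.000455  # mean predicted EE change in its four neighbors
jiangsu_ratio = jiangsu_neighbor_mean / jiangsu_own

# Reported values for Guangdong (+0.5 SD shock).
guangdong_neighbor_mean = 0.00040
guangdong_ratio = 0.4336

# Own-province change implied by the two reported Guangdong figures (approximate).
guangdong_own_implied = guangdong_neighbor_mean / guangdong_ratio

print(f"Jiangsu ratio: {jiangsu_ratio:.1%}")
print(f"Guangdong implied own change: {guangdong_own_implied:.5f}")
```

The recomputed Jiangsu ratio matches the stated value of about 40.6%, and the implied Guangdong change is below Jiangsu's 0.00112, in line with the statement that Guangdong's adjustment is slightly smaller.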

The spatial network diagrams (Figures 9 and 10) further illustrate the spatial propagation patterns reflected by the model’s predictions. Connecting edges between target provinces and neighboring provinces are thicker and darker in color, indicating that the model predicts stronger cross-province adjustments for geographically adjacent units than for non-neighboring provinces. At the same time, the magnitude of the predicted changes diminishes as spatial distance increases, a pattern consistent with the law of spatial decay. Geographical proximity is associated with lower marginal costs of digital technology diffusion and information transmission, and is further linked to stronger predicted effects through pathways such as industrial synergy and cross-regional factor mobility. In contrast, the model shows that remote provinces exhibit a weaker response to the simulated digital economy growth in the core provinces, which is consistent with transaction costs and barriers to factor mobility caused by geographical distance.

5.4.2. Robustness Testing for Multi-Gradient Shocks

To verify that the baseline findings are not dependent on the magnitude of a specific shock, this study further conducts multi-gradient shock simulations using 0.25 and 0.75 standard deviations; the results are shown in Table 12.

Under the full-gradient shock scenario, both the absolute change and relative increase in model-predicted EE within Jiangsu and Guangdong provinces expanded in tandem with the increase in the magnitude of the DE shock, with no reversal in direction observed. The simulated DE increase in both provinces shows a positive model-implied spatial association with the predicted EE of neighboring provinces, with the model-implied spatial association ratio remaining stable within the range of 39–45%. No negative siphoning effect was observed in the model predictions, and the core conclusion regarding the spatial associational pattern is not affected by the magnitude of the shock.

5.5. Heterogeneous Impacts of Different Aspects of DE on EE

Figure 11 and Figure 12 illustrate the importance distribution of the “digital economy dimension” (including digital industrialization, industrial digitalization, digital infrastructure, and core digital economy) and the “traditional development dimension” (including industrial structure, economic development level, education level, urbanization level, degree of external development, and population density) for energy efficiency across different provinces. This visually reflects the heterogeneity in the impacts of different dimensions on energy efficiency.

China’s regional economy is characterized by a development gradient decreasing from east to west, accompanied by significant disparities in digital economic advancement. Coastal developed regions, represented by Shanghai, Jiangsu, Guangdong, Beijing, and Shandong, have achieved significantly higher levels of digital economic development compared to other provinces, consistent with their economic foundations, policy support, and locational advantages. In these regions, the digital economy shows a stronger association with energy efficiency in the model’s predictions. In central and western provinces such as Sichuan and Chongqing, digital infrastructure exhibits higher predictive importance than dimensions such as industrial digitization and digital industrialization. Digital infrastructure—represented by broadband networks and data centers—is more prominently associated with the enabling efficiency of the digital economy in these regions, suggesting a supporting role in the model. However, deep applications of industrial digitization remain relatively scarce, and the clustering effects of core digital economy industries have not yet fully materialized, leaving overall potential largely untapped. Constrained by geographical and economic conditions, western provinces such as Yunnan, Ningxia, Guangxi, and Xinjiang score below 0.3 across all dimensions of the digital economy. Their performance shows lower values compared to traditional indicators like industrial structure and urbanization levels. Their development is characterized by a continued reliance on traditional manufacturing, and the digital economy in these underdeveloped regions shows a weak conditional association with energy efficiency in the model’s predictions.

Different dimensions show different associations with energy efficiency, which is consistent with the idea of differentiated pathways of various elements within the digital economy in relation to energy efficiency. The potential for energy conservation and efficiency improvement is more pronounced when specific sub-dimensions of the digital economy align with the local industrial foundation. For provinces relying on high-end manufacturing and modern services, the deep integration of industrial digitalization is associated with overcoming the energy input constraints during the early stages of the digital economy. This pattern corresponds to their energy efficiency curves shifting beyond the inflection point, followed by a trajectory of sustained improvement. The predicted pathways of digital infrastructure are primarily evident in the early stages of formative influence. Energy efficiency gradually improves as digital infrastructure matures, but has not yet entered a phase of rapid advancement. As exemplified by Guangdong Province, digital industrialization is associated with energy efficiency through optimizing technologies to reduce consumption within the digital sector itself, thereby indirectly influencing regional efficiency. This mechanism shows stronger associations only in provinces where the digital industry has achieved a substantial scale.

6. Discussion, Conclusions and Suggestions

6.1. Discussion

Based on provincial-level panel data nationwide and the STGNN fitting method, this study reveals three nonlinear patterns of the digital economy’s impact on energy efficiency: U-shaped, inverted U-shaped, and weak correlation. It further identifies regional clustering characteristics—U-shaped in the southeast and inverted U-shaped in the northwest—along with the heterogeneity of digital factor allocation and spatial relationships.

The three nonlinear patterns identified in this study represent theoretical extensions and morphological variations of the EKC theory within the digital economy era. The U-shaped pattern exhibits a “decline followed by recovery” characteristic, mirroring the “rise followed by decline” pattern of the EKC. This finding aligns with scholarly research confirming that “the digital economy more readily accelerates the arrival of the carbon emission reduction inflection point” [59,60]. The inverted U-shaped pattern is consistent with the “rebound effect” observed in expanding digital industries [30], where digital-enabled production growth offsets energy-saving gains, reflecting management challenges due to imbalanced resource allocation. The weak correlation pattern underscores that technological empowerment requires dual conditions: a robust digital foundation and suitability for industrial scenarios. As a new type of production factor, the digital economy’s impact on energy efficiency fundamentally reflects differences in factor allocation across various dimensions. These variations generate distinct influence pathways driven by differing marginal output efficiencies, aligning with the core logic of factor allocation theory [2,6,61]. Given the complexity of the digital economy’s impact on energy efficiency, a single metric cannot capture its full scope, making structural deconstruction essential.

It should be noted that the choice of the STGNN model in this study is driven by clear methodological necessity. As the mainstream method in the field of spatial econometrics, the traditional Spatial Durbin Model (SDM) is based on three fundamental assumptions: linear parameterization, serial stationarity, and spatial homogeneity [62,63,64]. Consequently, it struggles to capture the nonlinear, dynamic, and multidimensional spatial structures driven by the digital economy [65,66]. Regional spatial relationships in the digital economy era constitute complex network interactions involving multiple overlapping dimensions. Spillover effects between regions exhibit significant heterogeneity, which contradicts the SDM’s core assumption of “homogeneous spatial spillover coefficients.” The preliminary results from this study indicate that the SDM based on a single geographic adjacency matrix fails to yield statistically significant spatial spillover coefficients; meanwhile, the SDM based on a three-dimensional weighted matrix suffers from multicollinearity among variables, resulting in coefficient signs that contradict theoretical expectations and insufficient statistical significance. This further confirms that linear spatial econometric models are ill-suited to the research subjects and data characteristics of this study. Therefore, this study employs a nonlinear spatio-temporal graph neural network to conduct spatial correlation analysis, serving as a supplement and optimization to traditional spatial econometric methods in the context of digital economy research.

Although the test-set R² of the STGNN model is approximately 0.25, this level of predictive performance is reasonable and informative for provincial panel analysis in energy economics, where outcomes are jointly determined by numerous economic, technological, institutional, and regional factors. A value of 0.25 indicates that the model captures meaningful and stable predictive signals from digital economy development and spatial interactions, rather than merely fitting noise. Importantly, this modeling framework is a predictive rather than a causal identification strategy. It aims to uncover robust spatio-temporal patterns, nonlinear relationships, and model-implied spatial associations rather than to estimate strict causal effects. Thus, the results provide reliable empirical evidence for understanding energy efficiency dynamics under digital transformation, while causal interpretation requires further identification strategies.

Compared with the existing literature, this study aligns with prior findings while also introducing significant breakthroughs. Constrained by model specifications, many traditional studies typically identify only a single inverted U-shaped or U-shaped relationship. In contrast, the deep learning framework employed in this study reveals the coexistence of three distinct patterns. Traditional research often faces challenges in mitigating multicollinearity and accounting for complex interaction effects. By employing feature engineering and counterfactual simulation, this study can clearly isolate the independent contributions and interactions of each dimension while holding other factors constant. This approach offers a more reliable identification of dimensional heterogeneity and spatial correlation.

Although this study makes advances in methodology and findings, it still has limitations. (1) The reliance on provincial-level panel data prevents disaggregation to the prefecture-level city or industry level, thereby obscuring variations in digital factor allocation within cities or across specific sectors. (2) This study covers only 13 years of data from 30 provinces, representing a relatively small sample size that may impact the model’s generalization ability. (3) Furthermore, the analysis of digital economy heterogeneity relies on dimensions such as “digital industrialization and industrial digitalization,” without incorporating specific subcategories of digital technologies, which may undermine the precision of the underlying mechanisms. (4) The STGNN framework adopted in this study is a predictive rather than causal identification strategy. The model-implied spatial associations from STGNN-based counterfactual simulations only quantify the associational relationship between the two variables and do not constitute a strict econometric causal effect estimate.

6.2. Conclusions

This study analyzes the relationship between the level of digital economic development and energy efficiency, characterizing the nonlinear association between DE and EE. It tests three hypotheses, with the core conclusions as follows:

(1): The digital economy shows a positive association with energy efficiency, with the fitted relationship exhibiting three types of nonlinear patterns and pronounced spatial clustering characteristics.
(2): The contributions of different digital economy dimensions to predicted energy efficiency vary. Digital infrastructure serves as a foundational element. Industrial digitalization corresponds to the most substantial direct improvements in model predictions by optimizing production processes. Digital industrialization is associated with indirect contributions via technological innovation and knowledge diffusion.
(3): The digital economy displays significant spatial associations with energy efficiency and, through complex networks, a positive conditional correlation emerges in neighboring regions. Moreover, these spatial patterns vary across provinces.

6.3. Suggestions

Based on the above research findings, the following policy implications are suggested to better support the synergistic development of the digital economy and the green transition of energy:

(1): Pursue regionally differentiated digital economy development strategies. In provinces exhibiting a U-shaped fitted pattern, policy support may be directed toward the deep integration of digital technology and energy systems, so that development moves past the threshold sooner and energy-saving effects strengthen. In provinces with an inverted U-shaped pattern, upgrading data centers and optimizing industrial layouts could help limit the energy-intensive expansion associated with rebound effects. In weakly correlated provinces, priority could be given to inclusive digital infrastructure, with local industrial strengths leveraged to cultivate application scenarios and solidify the foundation for energy efficiency.
(2): Emphasize targeted measures across digital economy dimensions. With industrial digitization as the central focus, a unified governance platform can deepen collaboration, and big data and IoT can facilitate the digital transformation of traditional industries. For infrastructure, improving sharing mechanisms and advancing the “East Data, West Computing” project can help optimize the distribution of computing power. Within digital industrialization, increased R&D investment in energy-saving technologies, particularly smart energy management, may strengthen technological supply capabilities.
(3): Foster a new pattern of coordinated development for the digital economy. Strengthening core nodes as hubs for digital innovation and green technology diffusion, and enhancing connectivity and data sharing with surrounding regions, can support inter-city cooperation, improved public services, and accelerated digital transformation in underdeveloped areas.

Author Contributions

Investigation, C.Z.; writing—original draft preparation, R.C. and Y.D.; writing—review and editing, C.Z.; methodology, X.Z.; formal analysis, C.Z.; visualization, R.C.; data curation, R.C.; resources, C.Z.; conceptualization, X.Z.; funding acquisition, C.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the National Social Science Fund of China (No. 23BGL188).

Institutional Review Board Statement

This research did not involve human participants or their data.

Data Availability Statement

The data that support the findings of this paper are available on the website of the National Bureau of Statistics of China (https://www.stats.gov.cn/ (accessed on 28 December 2025)). Data are available from the authors upon reasonable request and with permission.

Conflicts of Interest

The authors have no relevant relationships to disclose that could be considered potential conflicts of interest.

Appendix A

Figure A1. DE and EE Dynamic Changes in 30 Provinces: Evidence from China (2011–2023).

Figure 1. Comprehensive Analysis Framework. Note: EE = Energy Efficiency; DE = Digital Economy; J = Infrastructure Construction; C = Industrial Digitization; S = Digital Industrialization; GDP = economic level; IS = industrial structure; EDU = education level; URBAN = urbanization level; DENSITY = population density; OPEN = degree of openness; Imp(fi)gb = importance of the i-th feature estimated by the LightGBM model; Imp(fi)cb = importance of the i-th feature estimated by the CatBoost model; Weco = Economic Distance Matrix; Wgeo = Geographic Proximity Matrix; Wdig = Digital Economy Development Level Matrix; Wt = comprehensive spatial adjacency matrix; GCL and GRU = graph convolution layer and gated recurrent unit, the core components of the STGNN model.

Figure 2. Dynamic Evolution of Provincial Digital Economy Development Level in China. Note: All maps in this article are based on standard maps downloaded from the National Geographic Information Public Service Platform, with the map review number GS (2024) 0650. The base maps have not been modified.

Figure 3. Dynamic Evolution of Provincial Energy Efficiency in China.

Figure 4. Heatmap of Pearson correlation coefficients between feature variables and target variable EE.

Figure 5. Comprehensive SHAP Feature Contributions of LightGBM and CatBoost (Top 20).

Figure 6. Decision power map of the combined SHAP values for three representative samples.

Figure 7. Spatial distribution of the impact types of DE on EE at the provincial level.

Figure 8. Typical Provinces: Nonlinear Relationship between DE and EE.

Figure 9. Predictive Spatial Network Following a Counterfactual Increase in Jiangsu’s Digital Economy (0.5 SD).

Figure 10. Predictive Spatial Network Following a Counterfactual Increase in Guangdong’s Digital Economy (0.5 SD).

Figure 11. Radar Chart of Provincial Dimensional Importance (1–15).

Figure 12. Radar Chart of Provincial Dimensional Importance (16–30).

Table 1. Input–Output Indicator System for Green Total Factor Energy Efficiency.

Variable Type	Name	Specific Indicator	Unit	Explanation
Input Variable	Capital Input (x1)	Capital Stock	CNY 100 million	Fixed capital stock estimated via the perpetual inventory method, deflated to constant prices with 2000 as the base year.
	Labor Input (x2)	Year-end Number of Employed Persons	10,000 persons	Total year-end employed population in each province, including urban units, private enterprises, self-employed individuals, and rural workers.
	Energy Input (x3)	Total Energy Consumption	10,000 tons of standard coal	Total primary energy consumption converted to standard coal equivalents using standard conversion coefficients.
Desirable Output	Economic Output (y1)	Regional GDP	CNY 100 million	Gross regional product deflated to constant prices with 2000 as the base year, representing the desirable economic output.
Undesirable Output	Pollution Emission (b1)	Industrial Wastewater Discharge	10,000 tons	Total industrial wastewater discharge, serving as a proxy for water pollution.
	Pollution Emission (b2)	Industrial Waste Gas (SO₂) Emissions	10,000 tons	Industrial sulfur dioxide emissions, a key air pollutant under stringent control within the dual carbon goals framework.
	Pollution Emission (b3)	General Industrial Solid Waste	10,000 tons	Total amount of general industrial solid waste generated, reflecting solid waste discharge intensity from industrial activities.

Table 2. Comprehensive Evaluation Index System for the Digital Economy.

Primary Indicator	Secondary Indicator	Unit	Symbol	Weight	Explanation
Infrastructure Construction	Number of Registered Domain Names	10,000 units	J1	0.0776	Reflects regional internet resource endowment and network entity activity
	Internet Broadband Access Rate	%	J2	0.0193	Number of broadband access ports divided by the permanent resident population
	Internet Broadband Penetration Rate	%	J3	0.0187	Number of fixed broadband subscribers divided by resident population.
	Long-distance Optical Cable Length	10,000 km	J4	0.0197	Total length of long-distance optical fiber cables laid within the province
	Scale of Mobile Phone Facilities	10,000 subscribers	J5	0.0235	Capacity of mobile telephone exchanges; i.e., the maximum number of subscribers the network can support.
	Number of Webpages	10,000 units	J6	0.1259	Reflects information supply and digital service activity.
Industrial Digitization	Value Added of Secondary and Tertiary Industries	CNY 100 million	C1	0.0384	Sum of value added of secondary and tertiary sectors (constant prices), excluding primary industry.
	Number of Websites per 100 Enterprises	units/100 enterprises	C2	0.0068	Reflects the basic level of enterprise online presence.
	Proportion of Enterprises with E-commerce Transactions	%	C3	0.0190	Share of enterprises above designated size that engage in e-commerce transactions.
	Digital Financial Inclusion Index	Index	C4	0.0137	Reflects the depth of coverage and usage of digital financial services
	E-commerce Sales Volume	CNY 100 million	C5	0.0837	Reflects digital supply–demand matching efficiency and consumption-side digitization
	R&D Expenditure of Industrial Enterprises above Designated Size	CNY 100 million	C6	0.0694	Measures innovation input intensity of industrial firms, a core driver for upgrading industrial digitization toward intelligentization
	Total Express Delivery Volume	10,000 pieces	C7	0.1304	Reflects the synergy between e-commerce and logistics digitization
Digital Industrialization	Per Capita Telecommunications Service Volume	CNY 100 million/10,000 persons	S1	0.0712	Measures the per capita output of telecommunication services, reflecting the direct economic contribution of the core digital sector
	Percentage of Employment in Information and Software Industries	%	S2	0.0505	Urban employed persons in information transmission, software, and IT services as a share of total urban employment
	Number of Domestic Patent Grants	units	S3	0.0811	Number of domestic patents (invention, utility model, and design) granted by CNIPA, attributed by applicant address
	Number of Domestic Patent Applications Accepted	units	S4	0.0718	Number of domestic patent applications accepted by CNIPA.
	Number of Corporate Units in Information Transmission, Software and IT Service Industry	units	S5	0.0666	Number of legal entities in information transmission, software, and IT services, by place of operation
	Mobile Phone Penetration Rate	units/100 persons	S6	0.0126	Reflects the prevalence of mobile terminals

Table 3. Descriptive statistics for all variables.

Type	Name	Mean	Std. Dev.	Max	Min	Skewness	Kurtosis
Dependent Variable	EE	1.03	0.08	1.60	0.79	2.04	10.29
Independent Variable	DE	0.14	0.12	0.75	0.01	2.17	5.86
	J1	98.69	142.31	882.49	1.11	2.68	7.85
	J2	0.53	0.25	1.11	0.10	0.10	−0.98
	J3	0.26	0.12	0.56	0.05	0.33	−0.95
	J4	3.23	1.88	12.54	0.09	0.87	3.38
	J5	7809.40	5090.11	24,521.60	649.00	1.23	1.41
	J6	827,831.24	1,864,598.21	14,090,056.70	9.62	4.49	23.23
	C1	26,096.36	23,121.88	130,132.50	1217.70	1.91	4.34
	C2	48.42	11.43	93.00	15.00	−0.01	0.50
	C3	8.13	4.45	24.70	0.40	0.46	0.59
	C4	254.53	110.68	498.28	18.33	−0.33	−0.69
	C5	5003.57	7893.35	53,154.07	0.92	3.14	11.43
	C6	4,199,274.49	5,670,805.14	34,266,367.00	57,760.00	2.66	8.13
	C7	172,549.66	411,895.22	3,456,729.00	244.47	5.01	28.94
	S1	0.28	0.30	1.48	0.06	1.86	2.51
	S2	0.02	0.02	0.14	0.01	4.00	18.16
	S3	76,241.20	119,739.17	872,209.00	502.00	3.49	15.10
	S4	117,989.95	165,491.14	993,480.00	732.00	2.84	9.29
	S5	27,722.81	35,603.61	192,060.00	373.00	2.38	6.20
	S6	104.64	25.53	189.46	52.04	0.74	1.36
Control Variable	Economic Level	12,918.29	8283.85	49,352.14	5125.50	2.49	6.26
	Industrial Structure	1.38	0.76	5.69	0.53	3.26	12.23
	Education Level	0.02	0.01	0.04	0.01	0.70	1.02
	Urbanization Level	0.61	0.12	0.90	0.35	0.63	0.18
	Population Density	474.44	705.82	3925.87	7.86	3.80	15.48
	Degree of Openness	0.27	0.28	1.46	0.01	1.87	3.29

Table 4. Performance of the STGNN Model.

Dataset	R²	RMSE	MAE
Training set	0.2465	0.0825	0.0567
Validation set	0.2341	0.0788	0.0551
Test set	0.2477	0.0751	0.0445

Table 5. Comparison of Test Set Performance Between STGNN and Baseline Models.

Model	R²	RMSE	MAE
Ridge Regression	−0.3624	0.1130	0.0707
Random Forest	0.0975	0.0920	0.0572
ExtraTrees	−0.0409	0.0988	0.0548
HistGBDT	0.0162	0.0960	0.0599
STGNN	0.2477	0.0751	0.0445
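
For readers comparing Tables 4 and 5, the following minimal sketch shows how the three reported metrics are conventionally computed and why R² can fall below zero, as it does for two of the baselines in Table 5. The function name and the sample values are hypothetical and are not taken from the study's data; the sketch assumes the conventional definition R² = 1 − SS_res/SS_tot.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R^2, RMSE, and MAE for paired observations and predictions."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)             # total sum of squares
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n,
    }


# Hypothetical EE observations (illustrative only, not from the paper).
y_true = [1.00, 1.05, 0.95, 1.10]

# Predicting the sample mean everywhere gives R^2 = 0 by construction.
mean_pred = [sum(y_true) / len(y_true)] * len(y_true)
mean_metrics = regression_metrics(y_true, mean_pred)

# Predictions that fit worse than the mean give a negative R^2.
poor_pred = [1.10, 0.95, 1.05, 1.00]
poor_metrics = regression_metrics(y_true, poor_pred)

print(mean_metrics)
print(poor_metrics)
```

Under this definition, a negative test-set R² indicates that a model predicts worse than a constant equal to the test-set mean, which is one way to read the gap between the weaker baselines and the STGNN.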

Table 6. Comparison of the Predictive Performance of the STGNN Model Under Different Weighting Schemes.

Weighting Scheme	Geographic Weight α	Economic Weight β	Digital Weight γ	R²	RMSE	MAE
Equal Weights	1/3	1/3	1/3	0.2477	0.0751	0.0445
Geography First	0.50	0.25	0.25	0.1865	0.0780	0.0461
Economy First	0.25	0.50	0.25	0.2196	0.0764	0.0441
Digital First	0.25	0.25	0.50	0.2143	0.0767	0.0441
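
The weighting schemes in Table 6 can be illustrated with a short sketch. It assumes that the comprehensive matrix is a linear blend of the geographic, economic, and digital matrices with weights α, β, and γ that sum to one; this functional form is an assumption made for illustration. The function name and the 3 × 3 matrices are hypothetical, and only the weight triples come from Table 6.

```python
def combine_weight_matrices(w_geo, w_eco, w_dig, alpha, beta, gamma):
    """Blend three n x n spatial weight matrices with weights summing to 1."""
    if abs(alpha + beta + gamma - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    n = len(w_geo)
    return [
        [alpha * w_geo[i][j] + beta * w_eco[i][j] + gamma * w_dig[i][j]
         for j in range(n)]
        for i in range(n)
    ]


# Hypothetical row-normalized 3 x 3 matrices for three provinces.
W_GEO = [[0.0, 1.0, 0.0], [0.5, 0.0, 0.5], [0.0, 1.0, 0.0]]
W_ECO = [[0.0, 0.2, 0.8], [0.3, 0.0, 0.7], [0.6, 0.4, 0.0]]
W_DIG = [[0.0, 0.5, 0.5], [0.9, 0.0, 0.1], [0.5, 0.5, 0.0]]

# Weighting schemes as listed in Table 6: (alpha, beta, gamma).
SCHEMES = {
    "equal": (1 / 3, 1 / 3, 1 / 3),
    "geography_first": (0.50, 0.25, 0.25),
    "economy_first": (0.25, 0.50, 0.25),
    "digital_first": (0.25, 0.25, 0.50),
}

combined = {
    name: combine_weight_matrices(W_GEO, W_ECO, W_DIG, *weights)
    for name, weights in SCHEMES.items()
}
print(combined["geography_first"][0])
```

Because the weights are convex, each row of the blended matrix still sums to one whenever the three input matrices are row-normalized, so changing the scheme shifts the relative emphasis among neighbors without changing the overall scale.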

Table 7. Results of trainable weights: Average and Optimal Estimates.

Weight Parameters	Geographic Weight α	Economic Weight β	Digital Weight γ	Test Set R²
Average of repeated training	0.3277	0.3326	0.3398	0.2103
Estimates of the optimal model	0.3270	0.3290	0.3440	0.2450

Table 8. Importance ranking of all features in different machine learning models.

Feature	Lgb_Importance	Lgb_Rank	Cb_Importance	Cb_Rank	Combined_Importance	Combined_Rank
C3	0.5547	3	12.3226	1	0.1164	1
EDU	0.6484	1	9.6170	2	0.1121	2
S5	0.6031	2	5.6289	4	0.0877	3
J2	0.4332	4	5.6001	5	0.0708	4
C7	0.4024	5	2.7111	16	0.0533	5
S1	0.3047	6	2.6292	17	0.0432	6
S6	0.1228	13	4.6636	7	0.0354	7
C4	0.2368	7	2.2733	20	0.0348	8
S3	0.1974	9	2.9156	14	0.0341	9
J5	0.0424	23	5.6485	3	0.0324	10
J3	0.0436	22	5.4005	6	0.0313	11
S2	0.1421	12	3.2392	12	0.0302	12
OPEN	0.1154	14	3.6569	9	0.0297	13
DENSITY	0.1758	10	2.4211	19	0.0295	14
C2	0.2166	8	1.6132	25	0.0295	15
DE	0.0601	21	3.7804	8	0.0248	16
C1	0.0783	18	3.3759	11	0.0246	17
J6	0.0749	20	3.4391	10	0.0246	18
URBAN	0.0939	17	2.8335	15	0.0234	19
J4	0.1451	11	1.7797	24	0.0232	20
IS	0.1123	15	1.9363	23	0.0208	21
C5	0.0772	19	2.2624	21	0.0189	22
S4	0.1030	16	1.2769	27	0.0166	23
C6	0.0416	24	2.4729	18	0.0165	24
DE2	0.0040	27	3.0957	13	0.0159	25
J1	0.0227	25	2.0981	22	0.0127	26
GDP	0.0102	26	1.3083	26	0.0075	27
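
The Combined_Importance column in Table 8 appears consistent with rescaling each model's importances to sum to one and then averaging the two shares. This reading is inferred from the tabulated numbers and is not a restatement of the authors' procedure; the function name below is hypothetical. The sketch recomputes the column from the LightGBM and CatBoost values in the table.

```python
# (LightGBM importance, CatBoost importance) per feature, copied from Table 8.
TABLE_8 = {
    "C3": (0.5547, 12.3226), "EDU": (0.6484, 9.6170), "S5": (0.6031, 5.6289),
    "J2": (0.4332, 5.6001), "C7": (0.4024, 2.7111), "S1": (0.3047, 2.6292),
    "S6": (0.1228, 4.6636), "C4": (0.2368, 2.2733), "S3": (0.1974, 2.9156),
    "J5": (0.0424, 5.6485), "J3": (0.0436, 5.4005), "S2": (0.1421, 3.2392),
    "OPEN": (0.1154, 3.6569), "DENSITY": (0.1758, 2.4211), "C2": (0.2166, 1.6132),
    "DE": (0.0601, 3.7804), "C1": (0.0783, 3.3759), "J6": (0.0749, 3.4391),
    "URBAN": (0.0939, 2.8335), "J4": (0.1451, 1.7797), "IS": (0.1123, 1.9363),
    "C5": (0.0772, 2.2624), "S4": (0.1030, 1.2769), "C6": (0.0416, 2.4729),
    "DE2": (0.0040, 3.0957), "J1": (0.0227, 2.0981), "GDP": (0.0102, 1.3083),
}


def combined_importance(table):
    """Average of the two models' importances after scaling each to sum to 1."""
    lgb_total = sum(lgb for lgb, _ in table.values())
    cb_total = sum(cb for _, cb in table.values())
    return {
        feature: 0.5 * (lgb / lgb_total + cb / cb_total)
        for feature, (lgb, cb) in table.items()
    }


combined = combined_importance(TABLE_8)
ranking = sorted(combined, key=combined.get, reverse=True)

for feature in ranking[:5]:
    print(f"{feature}: {combined[feature]:.4f}")
```

Scaling before averaging matters here because the two models report importance on very different scales: the CatBoost values sum to roughly 100, while the LightGBM values sum to roughly 5.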

Table 9. Stability Statistics of Nonlinear Pattern Classification across 30 Repeated STGNN Training Runs.

Nonlinear Pattern	Average Number of Provinces Identified	Std. Dev.	95% Confidence Interval	Average Confidence by Main Category
U-shaped	13.70	3.6211	[8.73, 21.55]	87.08%
Inverted U-shaped	13.87	3.8175	[9.21, 19.27]	82.14%
Weakly correlated	2.43	2.2846	[0.00, 7.00]	88.32%
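
As a rough illustration of how a fitted DE–EE curve might be assigned to one of the three categories in Table 9, the sketch below fits a quadratic by least squares and labels the curve by the sign of its curvature. This is a generic heuristic offered under stated assumptions, not the classification rule used in the study; the function names, the tolerance `tol`, and the synthetic curves are all hypothetical.

```python
def fit_quadratic(x, y):
    """Least-squares fit of y = a + b*x + c*x**2; returns (a, b, c)."""
    # Normal equations for the design columns [1, x, x^2], as an augmented matrix.
    s = [sum(xi ** k for xi in x) for k in range(5)]
    t = [sum(yi * xi ** k for xi, yi in zip(x, y)) for k in range(3)]
    m = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))


def classify_pattern(x, y, tol=1e-3):
    """Label a curve by the sign of its fitted curvature (tol is hypothetical)."""
    _, _, c = fit_quadratic(x, y)
    if c > tol:
        return "U-shaped"
    if c < -tol:
        return "inverted U-shaped"
    return "weakly correlated"


# Hypothetical DE levels and three synthetic EE response curves.
de = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
u_curve = [1.0 + (d - 0.25) ** 2 for d in de]
inverted_curve = [1.0 - (d - 0.25) ** 2 for d in de]
flat_curve = [1.0 for _ in de]

labels = [classify_pattern(de, curve) for curve in (u_curve, inverted_curve, flat_curve)]
print(labels)
```

Repeating such a labeling over many training runs and counting the provinces in each category would yield summary statistics of the kind reported in Table 9. The fit requires at least three distinct DE values per province.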

Table 10. Model Counterfactual Simulation Results Conditional on a 0.5 Standard Deviation Increase in Jiangsu’s DE.

Province	Province Type	EE Change	EE Change Std	Relative Change Percent	Rank
Jiangsu	Target	0.00112	0.00065	0.1028%	1
Anhui	Neighbor	0.00051	0.00031	0.0478%	2
Shandong	Neighbor	0.00045	0.00027	0.0444%	3
Zhejiang	Neighbor	0.00044	0.00029	0.0442%	4
Shanghai	Neighbor	0.00042	0.00025	0.0427%	5
Guangdong	Other	0.00036	0.00021	0.0345%	6
Fujian	Other	0.00031	0.00018	0.0295%	7
Hainan	Other	0.00029	0.00017	0.0281%	8
Henan	Other	0.00024	0.00014	0.0265%	9
Sichuan	Other	0.00027	0.00016	0.0260%	10
Hunan	Other	0.00026	0.00015	0.0252%	11
Chongqing	Other	0.00026	0.00015	0.0250%	12
Hubei	Other	0.00025	0.00015	0.0245%	13
Jilin	Other	0.00025	0.00015	0.0240%	14
Beijing	Other	0.00027	0.00020	0.0238%	15
Shaanxi	Other	0.00025	0.00014	0.0236%	16
Gansu	Other	0.00024	0.00014	0.0235%	17
Xinjiang	Other	0.00024	0.00015	0.0230%	18
Heilongjiang	Other	0.00024	0.00014	0.0228%	19
Liaoning	Other	0.00024	0.00014	0.0225%	20
Guangxi	Other	0.00023	0.00013	0.0220%	21
Yunnan	Other	0.00023	0.00013	0.0218%	22
Tianjin	Other	0.00023	0.00013	0.0215%	23
Shanxi	Other	0.00023	0.00013	0.0212%	24
Jiangxi	Other	0.00023	0.00013	0.0210%	25
Hebei	Other	0.00022	0.00013	0.0208%	26
Inner Mongolia	Other	0.00022	0.00012	0.0205%	27
Guizhou	Other	0.00021	0.00012	0.0202%	28
Ningxia	Other	0.00019	0.00012	0.0200%	29
Qinghai	Other	0.00020	0.00012	0.0198%	30

Note: All values are differences between the model's counterfactual and baseline simulations; they should be interpreted as conditional associations implied by the model rather than causal effects. Column definitions: EE Change = absolute predicted change in EE (original units); EE Change Std = standard deviation across time samples; Relative Change Percent = (EE change/baseline EE) × 100%, where baseline EE is the predicted EE for the target province without the DE shock, measured in original (non-standardized) units.
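
The relative-change definition in the note can be checked with a short calculation. The sketch below applies the formula and also inverts it to recover the baseline EE implied by a few rows of Table 10. The function names are hypothetical, and because the tabulated changes are rounded, the recovered baselines are only approximate.

```python
def relative_change_percent(change, baseline):
    """Relative change as defined in the table note: (change / baseline) * 100."""
    return change / baseline * 100.0


def implied_baseline(change, relative_percent):
    """Invert the definition to recover the baseline EE implied by a table row."""
    return change / (relative_percent / 100.0)


# (EE Change, Relative Change Percent) for three rows of Table 10.
ROWS = {
    "Jiangsu": (0.00112, 0.1028),
    "Anhui": (0.00051, 0.0478),
    "Shandong": (0.00045, 0.0444),
}

baselines = {name: implied_baseline(chg, pct) for name, (chg, pct) in ROWS.items()}
for name, base in baselines.items():
    print(f"{name}: implied baseline EE of about {base:.3f}")
```

The back-calculated baselines lie close to the sample mean EE of 1.03 reported in Table 3, which suggests that the predicted changes are small relative to the level of EE.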

Table 11. Model Counterfactual Simulation Results Conditional on a 0.5 Standard Deviation Increase in Guangdong’s DE.

Province	Province Type	EE Change	EE Change Std	Relative Change Percent	Rank
Guangdong	Target	0.00095	0.00061	0.0867%	1
Hainan	Neighbor	0.00044	0.00040	0.0415%	2
Fujian	Neighbor	0.00043	0.00041	0.0408%	3
Guangxi	Neighbor	0.00040	0.00033	0.0390%	4
Hunan	Neighbor	0.00037	0.00032	0.0362%	5
Jiangxi	Neighbor	0.00036	0.00030	0.0345%	6
Jiangsu	Other	0.00028	0.00026	0.0260%	7
Beijing	Other	0.00029	0.00029	0.0258%	8
Anhui	Other	0.00024	0.00020	0.0234%	9
Chongqing	Other	0.00023	0.00020	0.0218%	10
Zhejiang	Other	0.00023	0.00022	0.0217%	11
Shandong	Other	0.00023	0.00020	0.0216%	12
Henan	Other	0.00022	0.00019	0.0216%	13
Sichuan	Other	0.00022	0.00018	0.0216%	14
Shanghai	Other	0.00022	0.00019	0.0214%	15
Hubei	Other	0.00022	0.00019	0.0213%	16
Ningxia	Other	0.00022	0.00019	0.0209%	17
Tianjin	Other	0.00022	0.00016	0.0209%	18
Yunnan	Other	0.00021	0.00016	0.0204%	19
Xinjiang	Other	0.00021	0.00016	0.0204%	20
Liaoning	Other	0.00021	0.00017	0.0201%	21
Heilongjiang	Other	0.00021	0.00016	0.0201%	22
Shanxi	Other	0.00020	0.00015	0.0198%	23
Guizhou	Other	0.00020	0.00015	0.0197%	24
Qinghai	Other	0.00020	0.00018	0.0196%	25
Inner Mongolia	Other	0.00020	0.00015	0.0191%	26
Hebei	Other	0.00020	0.00018	0.0185%	27
Jilin	Other	0.00020	0.00019	0.0183%	28
Shaanxi	Other	0.00018	0.00013	0.0172%	29
Gansu	Other	0.00018	0.00018	0.0169%	30

Note: All values are differences between the model's counterfactual and baseline simulations; they should be interpreted as conditional associations implied by the model rather than causal effects. Column definitions: EE Change = absolute predicted change in EE (original units); EE Change Std = standard deviation across time samples; Relative Change Percent = (EE change/baseline EE) × 100%, where baseline EE is the predicted EE for the target province without the DE shock, measured in original (non-standardized) units.

Table 12. Counterfactual simulation results for DE shocks of varying magnitudes.

Target Province	Shock Magnitude	Absolute Change in the Province’s EE	Relative Increase in the Province’s EE	Average EE Change in Neighboring Provinces	Model-Implied Spatial Association Ratio
Jiangsu	+0.25 SD	0.00072	0.0671%	0.000285	39.78%
Jiangsu	+0.50 SD	0.00112	0.1028%	0.000455	40.63%
Jiangsu	+0.75 SD	0.00161	0.1412%	0.000679	42.18%
Guangdong	+0.25 SD	0.00059	0.0454%	0.000246	41.89%
Guangdong	+0.50 SD	0.00095	0.0867%	0.000412	43.36%
Guangdong	+0.75 SD	0.00125	0.1123%	0.000564	45.21%

Note: Model-implied spatial association ratio = average EE change in neighboring provinces/absolute EE change within the target province; it reflects the strength of the model-implied response in surrounding regions relative to the response in the province receiving the DE shock.
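
The ratio defined in the note can be reproduced from the values already tabulated. The sketch below takes the +0.50 SD Jiangsu shock, averages the EE changes of the four provinces labeled "Neighbor" in Table 10, and divides by Jiangsu's own change. The function name is hypothetical; the inputs are the rounded figures from Table 10, so the result matches the corresponding entry in Table 12 only up to rounding.

```python
def spatial_association_ratio(target_change, neighbor_changes):
    """Average neighbor EE change divided by the target's absolute EE change."""
    avg_neighbor = sum(neighbor_changes) / len(neighbor_changes)
    return avg_neighbor / abs(target_change)


# Table 10, +0.50 SD shock to Jiangsu.
jiangsu_change = 0.00112
# Neighbor rows: Anhui, Shandong, Zhejiang, Shanghai.
jiangsu_neighbors = [0.00051, 0.00045, 0.00044, 0.00042]

ratio = spatial_association_ratio(jiangsu_change, jiangsu_neighbors)
print(f"Average neighbor change: {sum(jiangsu_neighbors) / len(jiangsu_neighbors):.6f}")
print(f"Ratio: {ratio * 100:.2f}%")
```

The computed average neighbor change is 0.000455 and the ratio is close to the 40.63% listed for this shock in Table 12. A ratio below 100% means that the model-implied response in neighboring provinces is smaller than the response in the shocked province itself.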

Share and Cite

MDPI and ACS Style

Cao, R.; Zhang, C.; Zhao, X.; Deng, Y. Nonlinear Dynamics and Spatial Correlation Pattern of the Digital Economy on Energy Efficiency: Evidence from Ensemble Learning and Spatio-Temporal Graph Neural Network. Energies 2026, 19, 2223. https://doi.org/10.3390/en19092223
