3.2.1. Model Selection
This paper aims to develop an urban carbon-emission driving mechanism and prediction model under conditions of limited data availability. Due to the relatively small dataset, a time-series–oriented training/validation partition (rolling validation) is employed, and lagged features are incorporated. A rigorous comparison is conducted across four models, LightGBM, LightGBM-TPE, XGBoost, and XGBoost-TPE, to determine the optimal estimator. Ensemble-learning approaches are adopted to simulate the nonlinear relationships between 77 features and Shanghai’s urban carbon footprint. Recent studies have demonstrated the advantages of ensemble methods for carbon-emissions prediction [34,35].
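As a minimal sketch of how a lagged feature such as X76_lag2 could be constructed from an annual series, the following shifts each value forward by a fixed number of years. The helper name `add_lag` and the numeric values are hypothetical and are not taken from the paper's dataset.

```python
from typing import Dict, List, Optional


def add_lag(series_by_year: Dict[int, float], lag: int) -> Dict[int, Optional[float]]:
    """Return a series whose value in year t is the original value in year t - lag."""
    return {year: series_by_year.get(year - lag) for year in sorted(series_by_year)}


# Hypothetical values for illustration only (not from the paper's data).
x76 = {2010: 0.52, 2011: 0.50, 2012: 0.47, 2013: 0.45, 2014: 0.44}
x76_lag2 = add_lag(x76, lag=2)

# Years without a lagged value cannot be used for training and are dropped.
usable_years: List[int] = [y for y, v in x76_lag2.items() if v is not None]
```

Each lag removes the earliest years from the usable sample, which matters when the series is short.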
To address the time-series characteristics and multidimensional driving mechanisms of urban carbon emissions, this study constructs a predictive modelling framework based on Gradient Boosting Decision Trees (GBDTs), selecting LightGBM and XGBoost as the core algorithms to systematically compare performance differences and parameter optimisation effects in sequence forecasting. Compared to traditional linear regression or neural network models, GBDT-based methods offer a strong nonlinear representation capacity, enhanced interpretability, and robustness, making them well-suited for modelling multidimensional heterogeneous variables and nonstationary temporal data.
XGBoost (Extreme Gradient Boosting) demonstrates superior overall performance in this study. Its core advantages include the following: (1) the use of second-order gradient information for loss optimization, enabling more stable convergence and higher predictive accuracy; (2) built-in regularization terms (L1 and L2) that effectively suppress overfitting and enhance cross-period generalization; (3) support for temporally ordered incremental training, which facilitates dynamic model updating under the rolling-validation framework; and (4) improved robustness to feature fluctuations through combined column and row subsampling, making it particularly suitable for forecasting annual carbon-emission series. Considering the significant temporal dependence and trend characteristics of carbon-emission data, this study does not employ randomly shuffled cross-validation; instead, it uses Rolling Time-series Validation, which predicts subsequent years using data from earlier years. This approach follows the authentic predictive logic of “using the past to forecast the future” and provides a more reliable assessment of temporal stability and generalisation capacity.
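The rolling scheme described above can be sketched as follows. This is an illustration under stated assumptions: the paper does not say whether the training window expands or slides, so an expanding window is assumed, and `min_train`, `horizon`, and the year range are hypothetical.

```python
from typing import Iterator, List, Tuple


def rolling_origin_splits(
    years: List[int], min_train: int, horizon: int = 1
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_years, test_years) pairs in which training always precedes testing."""
    ordered = sorted(years)
    for end in range(min_train, len(ordered) - horizon + 1):
        # Expanding window (assumption): all years before `end` are used for training.
        yield ordered[:end], ordered[end:end + horizon]


# Hypothetical year range for illustration.
splits = list(rolling_origin_splits(list(range(2010, 2016)), min_train=3))
for train_years, test_years in splits:
    print(train_years, "->", test_years)
```

Because every test year lies strictly after its training years, no future information enters the fit, unlike randomly shuffled cross-validation.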
Figure 5 displays the prediction results under rolling time-series validation, illustrating the relationship between observed and predicted carbon-emission values for the four models (LightGBM, LightGBM-TPE, XGBoost, XGBoost-TPE). The horizontal axis represents observed emissions, while the vertical axis denotes predicted emissions. The dashed line indicates the ideal fit line (y = x), and the red line shows the linear regression fit, reflecting the direction and magnitude of prediction deviations.
The rolling time-series validation results indicate that XGBoost achieves the best performance among all models, with the baseline version attaining an R² of 0.66 and an RMSE of 989, demonstrating the highest predictive accuracy and trend consistency under the rolling validation framework. In contrast, the LightGBM and LightGBM-TPE models exhibit systematic underestimation in certain years, with larger overall fitting deviations, resulting in R² values of −3 and 0.29, respectively. Although hyperparameter tuning improves the accuracy of LightGBM (R² rises from −3 to 0.29), LightGBM-TPE remains insufficient to capture the year-to-year variations in carbon emissions effectively.
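Negative R² values such as those reported here arise whenever the out-of-sample squared error exceeds that of simply predicting the mean of the observed values. The sketch below uses the standard definitions of RMSE and R²; the numbers are hypothetical and chosen only to show how a constant prediction offset drives R² below zero.

```python
import math
from typing import Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical values: predictions follow the trend but sit 30 units too high.
observed = [100.0, 110.0, 120.0]
offset_pred = [130.0, 140.0, 150.0]

print(rmse(observed, offset_pred))      # error equals the constant offset
print(r2_score(observed, offset_pred))  # negative: worse than predicting the mean
```

With few test years, the total variance of the observed values is small, so even a moderate systematic offset can produce a strongly negative R².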
Notably, the XGBoost-TPE model exhibits pronounced overfitting under the current hyperparameter configuration. In high-emission years, its predicted values substantially exceed the observed values, resulting in a negative coefficient of determination (R² = −6.31) and a large root mean square error (RMSE = 4584). This indicates that, in time-series settings characterised by limited sample size and marked interannual variability, the excessively high learning rates and deeper tree structures of the tuned configuration impair cross-period generalisation: the model tends to overfit the training period and struggles to maintain performance in rolling forecasts for future years.
Accordingly, the TPE-optimised variant is not adopted as the final forecasting model in this study. Instead, the baseline XGBoost model, which demonstrates greater structural stability and stronger cross-period generalisation ability, is selected. Overall, the baseline XGBoost model achieves the most favourable balance between fitting accuracy and generalisation robustness, and can track the year-to-year fluctuations in carbon emissions over the period 2015–2023 with relatively high accuracy.
To further verify the model’s stability and prediction error characteristics, this paper conducts a statistical analysis of the XGBoost model’s prediction residuals using the Rolling Time-series Validation framework, as shown in Figure 6.
Figure 6a illustrates the relationship between prediction residuals and predicted values. Most residuals fall within the ±1000 range and exhibit an essentially random distribution, without any systematic deviation associated with increasing predicted values. This indicates that the model’s prediction bias remains stable across different carbon-emission levels.
Figure 6b presents the histogram of residuals, showing a mean residual of 411.3 and a standard deviation of 929.0. The distribution is approximately symmetric, suggesting the absence of significant systematic overestimation or underestimation; only slight positive skewness appears in a few years, corresponding to abnormal fluctuations during rapid emission-growth phases.
Figure 6c depicts the relationship between relative error and actual values, with the vast majority of samples concentrated within ±5%, demonstrating strong annual-scale predictive stability and robustness.
Overall, the XGBoost model achieves an R² of 0.66 and an RMSE of 989.05 under rolling validation. The residual mean is small relative to the residual spread, and the distribution is approximately symmetric, indicating that the model effectively captures temporal variation patterns and nonlinear relationships in carbon-emission dynamics. Prediction errors primarily stem from year-to-year fluctuations rather than structural bias, reflecting strong cross-period generalisation ability and robustness. These results verify the reliability of the rolling-validation framework for time-series carbon-emission prediction and further confirm the stability and accuracy advantages of the XGBoost model when handling high-dimensional, multivariate, and nonlinear driving-factor data.
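The relative weight of mean offset and dispersion in the reported error can be checked with the standard decomposition of the mean squared error. Here e_i denotes the residual for sample i and ē the residual mean (notation introduced for this illustration). The figures below are the reported values of 411.3 and 989.05. The identity is exact when the standard deviation uses a 1/n denominator, so the share should be read as approximate.

```latex
\[
\mathrm{RMSE}^2 \;=\; \frac{1}{n}\sum_{i=1}^{n} e_i^2
\;=\; \bar{e}^{\,2} \;+\; \frac{1}{n}\sum_{i=1}^{n}\left(e_i - \bar{e}\right)^2 ,
\qquad
\frac{\bar{e}^{\,2}}{\mathrm{RMSE}^2} \;\approx\; \frac{411.3^2}{989.05^2} \;\approx\; 0.17 .
\]
```

Under these assumptions, roughly one sixth of the mean squared error would be attributable to the mean offset and the remainder to dispersion around it.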
3.2.2. SHAP-Based Influencing Factors
To characterise the marginal contributions of different driving factors to Shanghai’s annual carbon emissions, this study applies SHAP (SHapley Additive exPlanations) decomposition to the year-by-year out-of-sample (OOS) predictions of the XGBoost model within a strict rolling time-validation framework. Rolling OOS SHAP avoids information leakage caused by random data splitting and therefore more accurately reflects marginal effects under a “using the past to explain the future” setting. At the same time, SHAP values represent local attribution rather than causal effects. In particular, under conditions of strong multicollinearity, the model may allocate similar contributions across correlated features, leading to substitution effects among SHAP values. In other words, when two or more variables exhibit highly synchronous variation in a statistical sense, SHAP may be unable to precisely disentangle their independent marginal effects and instead tends to reflect their joint contribution. Consequently, SHAP values capture the internal attribution structure of the predictive model rather than structural causal effects in a strict sense.
Urban carbon emissions inherently exhibit substantial structural overlap. For example, population scale, investment activities, and infrastructure expansion often evolve simultaneously with economic growth and spatial development. As a result, interactions and overlapping contributions among variables are unavoidable in model interpretation, and their direction and magnitude are influenced by sample distribution and multicollinearity. For scale-related variables such as population, investment, and infrastructure, relatively large mean absolute SHAP values may therefore partly reflect their extensive indirect association networks rather than purely independent effects.
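The substitution effect among correlated features can be illustrated with exact Shapley values on a toy model. This is a sketch under stated assumptions: it enumerates coalitions against a single baseline point rather than using the tree-specific SHAP algorithm, and the toy model, feature values, and baseline are hypothetical. The first two inputs are perfect duplicates, and the model is assumed to have split its weight evenly between them.

```python
from itertools import combinations
from math import factorial
from typing import Callable, List, Sequence


def shapley_values(
    model: Callable[[Sequence[float]], float],
    x: Sequence[float],
    baseline: Sequence[float],
) -> List[float]:
    """Exact Shapley values of model(x) relative to model(baseline)."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                # Features in the coalition take their observed value; the rest stay at baseline.
                without_i = [x[j] if j in subset else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                phi[i] += weight * (model(with_i) - model(without_i))
    return phi


def toy_model(v: Sequence[float]) -> float:
    # Hypothetical model: weight shared equally by two duplicated features.
    return 2.0 * v[0] + 2.0 * v[1] + 1.0 * v[2]


x = [3.0, 3.0, 2.0]          # features 0 and 1 always move together
baseline = [1.0, 1.0, 1.0]
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

The two duplicated features receive identical attributions, so neither value alone measures an independent effect. Only their sum reflects the joint contribution.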
Figure 7 illustrates how SHAP values quantify the incremental contribution of a single variable relative to the baseline prediction, conditional on the remaining features. The sign of the value indicates its enhancing or suppressing direction, and the absolute magnitude represents its influence strength. For each year, the absolute SHAP values (mean |SHAP|) of OOS predictions are averaged and used as the cross-period robust importance measure, from which the top 20 dominant factors are identified. In Figure 7, red points represent higher feature values, while blue points represent lower values; the horizontal axis denotes SHAP values (i.e., the direction and magnitude of the feature’s impact on carbon emissions). Points located further to the right indicate that the feature increases the predicted value, whereas points further to the left indicate a suppressing effect; greater dispersion implies higher variability in the feature’s influence.
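The importance measure described above can be sketched as a simple aggregation: absolute out-of-sample SHAP values are averaged across years and the features are ranked by that mean. The function name and the SHAP values below are hypothetical, and the sketch assumes every feature has a value in every year.

```python
from typing import Dict, List, Tuple


def rank_by_mean_abs_shap(
    shap_by_year: Dict[int, Dict[str, float]], top_k: int
) -> List[Tuple[str, float]]:
    """Rank features by the mean of |SHAP| across out-of-sample years."""
    totals: Dict[str, float] = {}
    for year_values in shap_by_year.values():
        for feature, value in year_values.items():
            totals[feature] = totals.get(feature, 0.0) + abs(value)
    n_years = len(shap_by_year)
    means = {feature: total / n_years for feature, total in totals.items()}
    return sorted(means.items(), key=lambda item: item[1], reverse=True)[:top_k]


# Hypothetical out-of-sample SHAP values for two forecast years.
shap_by_year = {
    2021: {"X3": 10.0, "X26": -1.0, "X74": 4.0},
    2022: {"X3": -6.0, "X26": 1.0, "X74": 2.0},
}
ranking = rank_by_mean_abs_shap(shap_by_year, top_k=2)
print(ranking)
```

Averaging absolute values rather than signed values prevents contributions of opposite sign in different years from cancelling, which is why the measure reflects influence strength rather than direction.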
Overall, the factors affecting Shanghai’s carbon emissions can be grouped into four major categories: population scale, industrial energy efficiency, investment structure, and infrastructure. (1) Population scale (X3 total population, X4 permanent population, X5 urban permanent population) shows the most prominent weights and dominates all other factors; (2) Industrial structure and energy efficiency (X18 secondary-industry employment, X16 share of secondary industry, X76_lag2 lagged electricity consumption per unit GDP) exhibit suppressing or moderating effects at different stages; (3) Investment and price/demand variables (X28 industrial investment, X30 share of industrial investment, X26 retail price index, X27 consumer price index, X28_lag2 lagged industrial investment) display positive stimulation effects in current or lagged years; (4) Infrastructure/energy structure (X50 urban road area, X57 water-supply production capacity, X74 coal share) generally contribute positively, while public infrastructure factors (X8 number of physicians, X12 number of schools, X34 investment in public administration and social security, X56 per capita urban green space) have relatively minor effects on carbon emissions.
- (1) Scale effect: the dominant role of population-related variables.
X3 total population (mean |SHAP| ≈ 3956) ranks first, far exceeding all other factors, indicating that changes in population stock exert the strongest marginal upward effect on annual emissions; X4 permanent population and X5 urban permanent population reinforce this conclusion. The rolling OOS SHAP beeswarm decomposition (Figure 8) shows that high-value population samples (red points) cluster predominantly in the positive region, with SHAP values mostly above zero, suggesting that population expansion significantly elevates annual emissions through channels such as household energy consumption, travel demand, and infrastructure construction; in short, a larger population is associated with higher predicted emissions. Conversely, low-population samples (blue points) lie near zero, reflecting lower and less volatile emission levels in years with smaller population sizes.
- (2) Industrial energy efficiency: the sustained upward effect of industrial investment share.
X28 industrial investment and X30 share of industrial investment in total fixed-asset investment both rank within the top fifteen. Their high-value samples (red points) are clearly positioned on the right side of the SHAP axis, indicating that an increasing proportion of industrial investment raises current-year carbon emissions; conversely, low-value samples (blue points) cluster on the left side, suggesting that a slowdown in industrial investment corresponds to a decline in emissions. The out-of-sample SHAP values of X20 tertiary-industry employment share are predominantly negative for high-value samples, indicating that a higher share of the service sector suppresses current-year emissions. X28_lag2 (industrial investment lagged by two periods) continues to exert a positive contribution, suggesting that the production capacity and capital stock generated by investment not only affect emissions in the current year but also extend their energy-consumption effects into subsequent years.
- (3) Investment structure: supplementary effects of price and demand-side indicators.
X18 secondary-industry employment and X16 share of secondary industry both enter the top twenty. Figure 8 shows that their SHAP values span both sides of zero, indicating heterogeneous effects on carbon emissions: when the secondary-industry scale expands without sufficient energy-efficiency improvement, the effect is positive; however, as the industrial structure shifts toward technology-intensive and low-energy-consumption sectors, the effect weakens or becomes negative. The SHAP values of X76_lag2, electricity consumption per unit GDP lagged by two periods, are predominantly negative, indicating that years with higher energy efficiency (lower electricity intensity) yield delayed mitigation benefits in the subsequent two years. X26 retail price index and X27 consumer price index are included in the top twenty but exhibit relatively weak influence. Their SHAP scatter points fluctuate slightly around zero in both directions, indicating that price changes primarily serve as auxiliary signals of demand cycles and investment rhythms. The impact of price variation on carbon emissions is therefore bidirectional but limited in magnitude, with effects transmitted mainly through indirect consumption and investment pathways.
- (4) Infrastructure: indirect “energy spillover” under rigid service provision.
X50 urban road area and X57 water-supply production capacity mostly show positive effects on Shanghai’s carbon emissions, indicating that expansions in municipal and water-utility infrastructure lead to increased operational energy use and electricity consumption, thereby generating positive marginal contributions to emissions. This likely reflects the additional electricity and chemical reagent requirements of water supply systems. These results suggest that expansions in public infrastructure tend to exert upward pressure on carbon emissions. The energy-structure variable X74 (coal consumption share) also displays consistently positive SHAP values across the plot, highlighting the long-term emission-increasing effect of fossil-energy dependence. Although X66 (the number of gas users) also tends to show positive values, the SHAP magnitudes are small, indicating a limited marginal impact. Within public-infrastructure indicators, X8 number of physicians, X12 number of schools, and X34 investment in public administration and social security show generally small SHAP values, with slight rightward skew in some years, suggesting that these variables function more as proxies for urban-development stages and public-service expansion, contributing mild, pro-cyclical positive marginal effects. This indicates that education and social-service investments reflect broader development stages rather than direct emission drivers, though they may produce pro-cyclical increases during expansion periods. The X56 per capita urban green-space area shows a weak average influence, with scatter points near zero or slightly negative, suggesting that green-space expansion may exert limited mitigation effects through pathways such as microclimate improvement and optimised travel patterns.
3.2.3. Nonlinear Threshold Effects of Individual Factors
To further clarify how different factors influence Shanghai’s carbon emissions, SHAP dependence plots are generated (Figure 8). These plots intuitively illustrate the specific contribution of each variable to Shanghai’s carbon emissions across its full value range. The x-axis represents the feature value, while the right y-axis shows the SHAP value, which quantifies the contribution of the corresponding feature to carbon emissions in each sample. The LOWESS (locally weighted scatterplot smoothing) trend line displays the average change in SHAP values, and the surrounding light-grey shaded area indicates the 95% confidence interval (CI). The left y-axis corresponds to a vertical bar chart (distribution), showing the sample distribution of each variable in the dataset. Vertical dashed lines denote critical thresholds at which the direction of influence shifts.
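The smoothing and threshold steps can be sketched as follows. The paper does not state how its thresholds or confidence bands were computed, so this is an illustration under assumptions: a basic tricube-weighted local linear smoother stands in for LOWESS (without robustness iterations or a confidence band), and a threshold is taken to be the feature value at which the smoothed SHAP curve crosses zero. Turning points in slope, which the text also discusses, are not detected by this sketch. All data are hypothetical.

```python
from typing import List, Sequence


def lowess(x: Sequence[float], y: Sequence[float], frac: float = 0.5) -> List[float]:
    """Tricube-weighted local linear fit evaluated at each x (x sorted ascending)."""
    n = len(x)
    k = max(2, min(n, int(round(frac * n))))
    fitted = []
    for x0 in x:
        # Bandwidth: distance to the k-th nearest point (guard against zero).
        h = sorted(abs(xi - x0) for xi in x)[k - 1] or 1e-12
        w = [(1.0 - min(abs(xi - x0) / h, 1.0) ** 3) ** 3 for xi in x]
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 0.0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


def sign_change_thresholds(x: Sequence[float], curve: Sequence[float]) -> List[float]:
    """Feature values where the curve crosses zero, by linear interpolation."""
    thresholds = []
    for x0, y0, x1, y1 in zip(x, curve, x[1:], curve[1:]):
        if y0 * y1 < 0.0:
            thresholds.append(x0 + (x1 - x0) * (-y0) / (y1 - y0))
    return thresholds


# Hypothetical dependence data: SHAP rises linearly and changes sign at 4.5.
feature_values = [float(v) for v in range(11)]
shap_values = [v - 4.5 for v in feature_values]
smooth = lowess(feature_values, shap_values)
thresholds = sign_change_thresholds(feature_values, smooth)
print(thresholds)
```

With few annual observations, the location of such a crossing is sensitive to the smoothing span, so thresholds obtained this way are best read as approximate.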
A multidimensional analysis of these influencing factors reveals the following insights:
Figure 8a–c,e,h,l shows that population-related variables exhibit pronounced stage-dependent effects on carbon emissions. Overall, population expansion follows a three-phase pathway (scale effect, intensification effect, and congestion rebound), reflecting the nonlinear responses of urban systems across different development stages.
First, the SHAP curves for total population (X3) and resident population (X4) both display an “increase–decrease” pattern. The total population (X3) reaches its peak SHAP value at approximately 14.6 million, indicating the strongest positive contribution to carbon emissions. As population size further approaches the model-identified threshold (around 15.16 million), SHAP values shift from increasing to gradually declining. This suggests that population growth initially drives emissions through scale effects whereas, at higher density levels, infrastructure sharing and improvements in energy-use efficiency begin to generate intensification effects that slow emission growth. Similarly, the resident population (X4) exhibits a comparable turning point at around 24.5 million, beyond which the SHAP curve flattens and then declines, indicating notable marginal efficiency gains in urban operations at high population levels.
This transition reflects a structural shift from intensification effects to congestion effects and is consistent with the policy orientation of Shanghai’s 14th Five-Year Plan, which emphasises reducing population density in central urban areas, improving urban functions and the living environment, optimising spatial structure, and promoting population redistribution. These measures aim to guide central districts toward high-quality development while enhancing the carrying capacity of suburban areas. When SHAP values turn from negative to positive, congestion effects emerge, indicating that, beyond a certain density threshold, per capita carbon emissions begin to rise. This pattern aligns with Shanghai’s recent policies on relocating non-essential functions from the city centre and promoting a multi-centre urban structure. In high-density areas lacking sufficient ecological space and public infrastructure, building energy consumption, traffic pressure, and overall energy demand tend to increase.
Second, registered population density (X7) exhibits a clear positive turning point at approximately 3600 persons per km², where SHAP values shift from a declining to an increasing trend. This indicates that, once urban density exceeds this threshold, per capita carbon emissions begin to rise rather than fall. This structural transition from intensification effects to congestion effects is highly consistent with the objectives outlined in the Shanghai Master Plan (2017–2035) [36], which calls for strict control of population size in megacities, continuous optimisation of population structure and spatial distribution, adjustment of population density and per capita construction land, and improvements in urban liveability. The plan also advocates policies on land use, employment, and housing supply to alleviate excessive population concentration in central areas while increasing population and employment density and spatial efficiency in new towns and suburban districts.
Within the moderate-density range (approximately 2000–3600 persons per km²), compact urban form and efficient public transport systems can reduce per capita energy consumption, reflecting the energy-efficiency dividend of compact cities. However, when density exceeds this optimal range, the combined effects of concentrated building energy demand, deteriorating ventilation and green-space conditions, and traffic congestion lead to rising energy use and carbon emissions. Therefore, urban density enhancement should operate within an optimal interval, beyond which measures such as spatial decentralisation, building energy-efficiency upgrades, and the integration of ecological spaces are required to avoid a reversal toward “high-density, high-emission” outcomes.
Figure 8e,h further shows that the urban resident population (X5) and urbanisation level (X6) exhibit interlinked two-stage effects on carbon emissions. At the scale level, the SHAP curve for X5 shows an overall downward trend. When the urban population is below approximately 22.26 million, SHAP values remain strongly positive, indicating that population concentration and urban expansion significantly promote carbon emissions in the early stage. Once population size approaches this threshold, SHAP values decline sharply and stabilise, suggesting that improvements in infrastructure, energy efficiency, and shared public services generate a clear intensification-driven mitigation effect. This demonstrates that urban population growth does not drive emissions linearly but instead exhibits diminishing marginal effects and efficiency turning points beyond a certain scale.
At the structural level, the urbanisation rate (X6) shows a negative turning point beyond 88.2%, where SHAP values gradually shift from a declining to an increasing trend. This indicates that, once urbanisation reaches an extremely high level, its marginal effect on carbon emissions changes from suppressive to mildly promotive. This pattern reflects a carbon rebound effect at advanced stages of urbanisation. During the intermediate-to-high urbanisation phase (approximately 75–88%), energy structure optimisation, improved public transport systems, and industrial service-oriented transformation contribute to reductions in carbon intensity. However, when urbanisation enters a saturated range (>88%), high-energy-consumption lifestyles, urban spatial expansion, and energy-intensive service-sector agglomeration lead to a slight rebound in total emissions.
Accordingly, 88.2% can be regarded as a critical threshold marking the transition from “structural dividends” to “energy rebound” in the urbanisation process. At this stage, mitigation strategies should shift from infrastructure expansion toward greener lifestyles and consumption patterns, including the promotion of green mobility, energy-efficient buildings, and shared public services, to offset the energy-demand rebound associated with ultra-high urbanisation levels.
- (2) Industrial Dimension
Figure 8d,q illustrates the effects of industry-related variables on carbon emissions. First, the SHAP curve for employment in the secondary sector (X18) exhibits a clear turning point at approximately 4.4 million workers. Below this threshold, SHAP values remain consistently negative, indicating that, at low-to-moderate levels of industrial employment, industrial expansion is accompanied by technological upgrading and improvements in energy-use efficiency, thereby exerting a suppressive effect on carbon emissions. However, once industrial employment exceeds this critical level, SHAP values rapidly turn positive and increase, suggesting that output expansion and spillover energy demand associated with additional employment become the dominant forces, resulting in a pronounced positive contribution to carbon emissions. This transition from negative to positive SHAP values reveals a nonlinear threshold effect of industrial employment scale: when employment remains below the threshold, efficiency gains derived from technological upgrading and process optimisation can be further consolidated; when employment approaches or exceeds the threshold, it becomes necessary to simultaneously implement energy-consumption caps, promote clean-energy substitution, and encourage shifts toward higher value-added activities to prevent emission rebound driven by further scale expansion.
The mitigation effect observed at lower employment levels reflects the benefits of industrial renewal, technological transformation, and capacity optimisation, whereas the positive contribution beyond the threshold indicates that continued expansion of industrial employment leads to energy-demand spillovers. This pattern is highly consistent with the industrial upgrading direction emphasised in the Shanghai Master Plan (2017–2035), which highlights accelerating the transformation and upgrading of manufacturing, promoting high-end and service-oriented manufacturing, gradually phasing out labour-intensive low-end manufacturing, and increasing employment in high-technology sectors.
The SHAP curve for the share of the secondary industry (X16) displays a critical threshold at approximately 29.8%. Below this level, carbon emissions increase with a rising industrial share, representing a typical “industrialisation-driven” phase. Once the share exceeds 29.8%, SHAP values shift from positive to negative and gradually stabilise, indicating that structural optimisation and energy-efficiency improvements increasingly dominate changes in carbon emissions. This result suggests that, when the secondary-industry share surpasses roughly 30%, the regional economy begins to transition from energy-intensive manufacturing toward higher value-added industries and services, forming a progressive mitigation pathway characterised by scale, efficiency, and structural upgrading.
This trend is consistent with Shanghai’s transition toward a post-industrial, service-oriented development stage. The Outline of the 14th Five-Year Plan for Economic and Social Development and the Long-Range Objectives through 2035 emphasise the building of Shanghai into a service-economy hub, promoting the integration of service outsourcing with advanced manufacturing and producer services, and accelerating the formation of high-end industrial clusters driven by strategic emerging industries and digital transformation of traditional sectors. The data-driven results of the model thus provide empirical evidence of the endogenous link between industrial-structure optimisation and carbon-emission dynamics.
- (3) Economic Dimension
Figure 8g,i,j,m,o depicts the dependence of carbon emissions on economic variables, revealing an overall nonlinear pattern characterised by “increase–turning point–stabilisation.” This structure reflects the dynamic transition of economic development from expansion-driven growth toward structural optimisation.
Both industrial investment (X28) and its two-period lag (X28_lag2) exert significant influences on Shanghai’s carbon emissions, highlighting the asymmetry between the immediate and lagged effects of investment. The SHAP curve for contemporaneous industrial investment (X28) reaches a threshold at approximately RMB 126.8 billion. Below this level, SHAP values increase rapidly, indicating a strong short-term emission-raising effect associated with investment expansion. Once investment exceeds this threshold, SHAP values level off or slightly decline, suggesting that diminishing marginal returns to capital and technological upgrading gradually emerge, thereby weakening the marginal contribution of investment to carbon emissions. This pattern is consistent with Shanghai’s policy orientation during the 14th Five-Year Plan period, which emphasises expanding investment while optimising its structure.
By contrast, the two-period lagged investment variable (X28_lag2) exhibits a clear sign reversal at approximately RMB 140 billion, where SHAP values decline sharply from positive to negative. This indicates that, after a two-year accumulation period, earlier investments begin to generate structural mitigation effects through improved capacity utilisation and equipment upgrading. This “positive construction-period effect with negative lagged feedback” reveals the dual role of industrial investment: while it raises carbon emissions in the short term, it can contribute to reductions in carbon intensity over the longer term through technology diffusion and the formation of greener productive capacity. This mechanism provides empirical support for policy measures advocated in the Outline of the 14th Five-Year Plan for Economic and Social Development and the Long-Range Objectives through 2035, which emphasise promoting green investment, developing green finance, and leveraging the demonstration role of national green development funds.
These findings underscore the critical importance of the temporal structure of investment for carbon-emission dynamics. When capital is primarily directed toward energy-intensive manufacturing, lagged effects continue to manifest as persistently high emissions. In contrast, when investment is oriented toward energy-efficiency retrofits and low-carbon infrastructure, the lagged SHAP values become negative, indicating that “green investment payback” generates a pronounced emission-mitigation effect over time.
Finally, the SHAP curve for the share of industrial investment in fixed-asset investment (X30) reveals a threshold at approximately 15.6%. Below this level, SHAP values are significantly positive, suggesting that, when industrial investment accounts for a relatively small share of total investment, incremental capital flows mainly into industrial production, thereby driving emission growth. Once the share exceeds this threshold, SHAP values decline rapidly and approach zero, indicating that the fixed-asset investment structure becomes more diversified, with an increasing proportion allocated to the tertiary sector and service industries. As a result, the carbon-intensity effect dominated by industrial expansion is gradually weakened. This pattern highlights the critical role of the “deindustrialisation–servitisation” transition in achieving regional carbon-emission mitigation.
- (4) Public Service Dimension
Figure 8i–k,p,r indicates that public-service-related variables exert significant threshold and stage-dependent effects on carbon emissions, reflecting nonlinear feedback mechanisms among consumption behaviour, fiscal investment, and the public service system.
Both the retail price index (X26) and the consumer price index (X27) exhibit pronounced two-stage effects (see Figure 8i,m). Taking X26 as an example, when the price index remains below approximately 100.8, SHAP values are negative, suggesting that, at low price levels, suppressed market demand and insufficient production activity lead to relatively low carbon emissions. As the index approaches the threshold, SHAP values rapidly turn positive and increase, indicating that the expansion of consumption and production drives higher emissions through increased energy use in transportation, manufacturing, and related sectors. Similarly, the SHAP curve for the consumer price index (X27) turns positive at around 102, following a pattern broadly consistent with X26. This further confirms that, when price levels are low, consumption and production activities have not fully recovered, and carbon emissions remain weakly responsive; once prices rise into the critical range, consumption-driven expansion significantly amplifies carbon emissions.
Investment in public administration, social security, and social organisations (X34) displays a typical “low-level stability–mid-level increase–high-level stabilisation” pattern in its SHAP curve (Figure 8). When investment remains below approximately RMB 4.9 × 10⁶, SHAP values are close to zero, indicating that limited public-sector investment has a negligible impact on carbon emissions. Once investment exceeds this threshold, SHAP values rise sharply, reflecting short-term increases in energy consumption and emissions associated with public infrastructure construction and the expansion of social security systems. However, when investment further increases beyond approximately RMB 0.95 × 10⁷, the SHAP curve levels off, suggesting that high levels of public investment enter a “stable-effect” phase in which additional capital generates diminishing marginal emission impacts. This pattern reveals the dual role of public investment: while it creates emission spillovers through construction activities in the short term, it can indirectly promote carbon mitigation in the longer term by improving service efficiency and facilitating the development of green infrastructure.
Public service provision variables further exhibit a similar “increase–turning point–stabilisation” pattern. The number of physicians (X8) shows a negative-to-positive turning point at approximately 157,000. At low levels (<150,000), SHAP values are relatively high, reflecting the high energy intensity associated with concentrated operation of large hospitals under conditions of medical resource scarcity. As the number of physicians approaches the threshold, improved resource distribution and service balance enhance operational efficiency, reducing SHAP values to their minimum and indicating an energy-efficiency threshold in healthcare system expansion. When the number of physicians continues to increase beyond high levels (>180,000), SHAP values rise slightly again, suggesting that excessive concentration and facility redundancy may introduce new operational energy burdens. This pattern is highly consistent with the policy objectives outlined in the Outline of the 14th Five-Year Plan for Economic and Social Development and the Long-Range Objectives through 2035, which emphasises strengthening emergency public health capacity and promoting the expansion and balanced allocation of high-quality medical resources, while also revealing—at the model level—the potential risk of energy rebound associated with overconcentration.
Similarly, the number of schools (X12) exhibits a threshold at approximately 1678 institutions, beyond which SHAP values gradually increase. When the number of schools exceeds roughly 1725, SHAP values turn negative and continue to decline, indicating that the expansion and spatial balancing of educational resources significantly reduce carbon emissions at the urban scale and generate a clear structural mitigation effect. Overall, the expansion of educational and medical resources follows a dynamic pattern of “initial promotion–mid-term suppression–long-term stabilisation,” suggesting the existence of an optimal range for public service density. Moderate expansion enhances urban energy efficiency and residents’ quality of life, whereas excessive construction may lead to energy-consumption rebound effects.
- (5) Environmental Dimension
Figure 8f,i,t indicates that improvements in environmental infrastructure generate significant marginal mitigation effects once development reaches medium-to-high levels. First, the SHAP curve for urban road area (X50) exhibits a positive threshold at approximately 1.5 × 10⁴ m², where SHAP values increase rapidly. This suggests that, during the early stage of transport infrastructure development, road expansion is accompanied by substantial material inputs and construction-related energy consumption, thereby exerting a pronounced positive effect on carbon emissions. However, as the road area further expands to around 2.4 × 10⁴ m², SHAP values reach a peak and then begin to decline. This pattern indicates that, as the road network becomes more complete, the combined effects of congestion alleviation, improved commuting efficiency, and optimisation of travel structure gradually slow the growth rate of transport-related emissions. This turning point reflects a typical transition from “high emissions during the construction phase” to “stabilised emissions during the operational phase.” The observed pattern is highly consistent with the planning objectives of the Shanghai Master Plan (2017–2035), which emphasises the development of a more sustainable and resilient eco-city and the construction of a green and low-carbon infrastructure system.
The SHAP curve for water supply production capacity (X57) shows a clear positive activation threshold at approximately 10.88 million tonnes per day (Figure 8i). Below this level, SHAP values remain close to zero, indicating that, at smaller system scales, water supply infrastructure has a limited impact on carbon emissions. Once production capacity exceeds this threshold, SHAP values turn positive and continue to rise, suggesting that the expansion of basic public service infrastructure enters a high-energy-consumption phase. In this stage, processes such as water treatment plant operation, pumping energy demand, and pressure maintenance in transmission and distribution networks substantially increase energy use, thereby intensifying carbon emissions. This trend reveals a typical “scale expansion–energy consumption escalation” mechanism: while expanding water supply facilities beyond the threshold improves public service provision, it is also associated with a relatively high operational carbon cost.
Finally, per capita park green space (X56) exhibits persistently negative SHAP values once it exceeds approximately 8.45 m² per person, indicating a significant reduction in carbon emissions as green space availability increases. This pattern reflects the positive role of urban ecological space in carbon sequestration and thermal environment regulation. Specifically, the expansion of green space and increased vegetation cover enhance urban carbon sinks and reduce energy demand, thereby generating a pronounced “ecological suppression effect” on carbon emissions.
3.2.4. Multi-Factor Interaction Analysis
This study further evaluates the interactions among multiple influencing factors to uncover the complex dynamic effects of different variable combinations on Shanghai’s carbon emissions.
Figure 9 presents the interaction strength of various feature combinations. Given the large number of possible interactions, only the top 20 are displayed, and the six strongest pairs are analysed in detail. The results show that combinations involving total population, permanent population, household-registered population density, secondary-industry employment, urban permanent population, and urbanisation level exhibit the most substantial interactive effects on Shanghai’s carbon emissions.
For each annual model generated through rolling training, the pairwise SHAP interaction values (interactions = True) are computed using that year’s OOS samples. The absolute values of these interactions are treated as interaction strength, and their averages across all OOS years are calculated to produce an “interaction strength matrix.” Interaction strength can be interpreted as the magnitude of the joint effect of two features on carbon-emission prediction, beyond their individual contributions. Larger values indicate more substantial interaction effects.
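The aggregation described above can be sketched as follows. This is an illustration under stated assumptions rather than the study's implementation: it assumes that the pairwise SHAP interaction values have already been computed for each out-of-sample year as one p × p matrix per sample, that absolute values are first averaged within a year and then across years, and that each unordered pair is ranked by its upper-triangle entry. The function names, feature labels, and numbers are hypothetical.

```python
def interaction_strength(per_year):
    """Average absolute interaction values within each year, then across years.

    per_year: list over OOS years; each element is a list of p x p matrices
    (nested lists), one matrix per out-of-sample observation.
    """
    p = len(per_year[0][0])
    strength = [[0.0] * p for _ in range(p)]
    for samples in per_year:
        for a in range(p):
            for b in range(p):
                year_mean = sum(abs(m[a][b]) for m in samples) / len(samples)
                strength[a][b] += year_mean / len(per_year)
    return strength


def top_pairs(strength, names, k=20):
    """Rank unordered feature pairs (upper triangle) by interaction strength."""
    p = len(names)
    pairs = [(strength[a][b], names[a], names[b])
             for a in range(p) for b in range(a + 1, p)]
    return sorted(pairs, reverse=True)[:k]


# Synthetic example: two OOS years, three features with placeholder names.
year_1 = [[[0.0, 1.0, -2.0], [1.0, 0.0, 0.5], [-2.0, 0.5, 0.0]]]
year_2 = [[[0.0, -3.0, 0.0], [-3.0, 0.0, 1.0], [0.0, 1.0, 0.0]],
          [[0.0, 1.0, 0.0], [1.0, 0.0, -1.0], [0.0, -1.0, 0.0]]]
matrix = interaction_strength([year_1, year_2])
ranking = top_pairs(matrix, ["A", "B", "C"], k=3)
print(ranking)
```

Taking absolute values before averaging prevents positive and negative interaction contributions from cancelling, which is why the result measures strength rather than direction.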
Figure 9 displays the top 20 feature pairs with the highest averaged interaction strength, enabling the identification of “significant co-movement” pathways that affect Shanghai’s carbon emissions.
To further verify the degree of synergistic influence among different factor combinations under higher-order nonlinear conditions, this study adopts Friedman’s H-statistic to quantify the contribution of interaction terms [37]. A comparative modelling system is constructed using a shallow model (Baseline, max_depth = 4) and a deep model (Deep, max_depth = 8). A larger H value indicates a more substantial interaction effect between two variables.
Specifically, based on both the shallow (baseline, max_depth = 4) and deep (max_depth = 8) XGBoost models, the partial dependence surface $PD_{jk}(x_j, x_k)$ for each pair of features, as well as the single-variable partial dependence curves $PD_j(x_j)$ and $PD_k(x_k)$, are computed.
Friedman’s H is defined as follows:
$$H_{jk}^{2} = \frac{\sum_{i=1}^{n}\left[PD_{jk}\left(x_j^{(i)}, x_k^{(i)}\right) - PD_j\left(x_j^{(i)}\right) - PD_k\left(x_k^{(i)}\right)\right]^{2}}{\sum_{i=1}^{n} PD_{jk}^{2}\left(x_j^{(i)}, x_k^{(i)}\right)}$$
where $PD_{jk}(x_j, x_k)$ denotes the two-dimensional partial dependence surface, representing the model’s average predicted response when features $x_j$ and $x_k$ vary simultaneously; $PD_j(x_j)$ is the one-dimensional partial dependence function of $x_j$, describing the average effect on the prediction when only feature $x_j$ varies independently; and $PD_k(x_k)$ is the corresponding one-dimensional partial dependence function of $x_k$, reflecting the independent contribution of feature $x_k$. The sums run over the $n$ evaluation samples, and all partial dependence functions are mean-centred. A higher H value indicates stronger interaction effects between the two variables.
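The amplification effect is measured by ΔH, the difference between the H value of the deep model and that of the baseline model. When ΔH > 0, the interaction strength of the feature pair is amplified under higher-order nonlinear modelling, exhibiting a super-additive “1 + 1 > 2” effect.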
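To make the calculation concrete, the following minimal sketch computes a pairwise H-statistic in the standard Friedman form, using brute-force partial dependence evaluated at the observed data points, and then takes ΔH as the deep-model value minus the shallow-model value. It is an illustration only: the two "models" are hypothetical closed-form functions standing in for the shallow and deep XGBoost models, the data are a synthetic grid, and the function names are not taken from the study.

```python
from itertools import product


def partial_dependence(model, X, cols, values):
    """Mean prediction over X with the features in `cols` fixed to `values`."""
    total = 0.0
    for row in X:
        row = list(row)
        for c, v in zip(cols, values):
            row[c] = v
        total += model(row)
    return total / len(X)


def centre(values):
    """Subtract the mean so that each partial dependence function is centred."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def friedman_h(model, X, j, k):
    """Pairwise H-statistic for features j and k, evaluated at the rows of X."""
    pd_j = centre([partial_dependence(model, X, (j,), (r[j],)) for r in X])
    pd_k = centre([partial_dependence(model, X, (k,), (r[k],)) for r in X])
    pd_jk = centre([partial_dependence(model, X, (j, k), (r[j], r[k])) for r in X])
    num = sum((ab - a - b) ** 2 for ab, a, b in zip(pd_jk, pd_j, pd_k))
    den = sum(ab ** 2 for ab in pd_jk)
    return (num / den) ** 0.5 if den > 0 else 0.0


# Synthetic 3 x 3 grid of two features (not the study's data).
X = list(product([-1.0, 0.0, 1.0], repeat=2))


def shallow(x):
    # Stand-in for the baseline model: weak interaction term.
    return 2 * x[0] + 3 * x[1] + 0.1 * x[0] * x[1]


def deep(x):
    # Stand-in for the deep model: stronger interaction term.
    return 2 * x[0] + 3 * x[1] + 0.5 * x[0] * x[1]


h_base = friedman_h(shallow, X, 0, 1)
h_deep = friedman_h(deep, X, 0, 1)
delta_h = h_deep - h_base  # positive when the deep model amplifies the interaction
print(h_base, h_deep, delta_h)
```

A purely additive function yields H = 0 and a pure product of two centred features yields H = 1, which gives two convenient sanity checks for any implementation. The brute-force evaluation costs on the order of n² model calls per pair, so in practice the partial dependence is often computed on a subsample or a grid.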
Figure 9 presents the top 20 feature pairs ranked by interaction amplification effects (ΔH). The bar length represents the absolute magnitude of ΔH, and the colour gradient, from red (strong amplification) to blue (weak amplification), indicates the relative strength of amplification. The results show that specific feature pairs (e.g., X3 × X30, X4 × X50) exhibit markedly strengthened interactions in the deep model, revealing strong synergistic behaviour under nonlinear conditions.
The pair X4 × X5 (permanent population × urban permanent population) exhibits the largest ΔH (0.0249), with an amplification factor of 2.36, indicating that the nonlinear coupling between population scale and urbanisation rate is significantly intensified in the higher-order model. In other words, the influence of population urbanisation on carbon-emission dynamics is reflected not only in linear growth effects, but also in compounded amplification through the joint processes of population agglomeration and urban expansion.
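As a consistency check, the two figures reported for X4 × X5 can be related to the underlying H values. The following short derivation takes ΔH as the deep-minus-baseline difference and assumes that the amplification factor $r$ denotes the ratio $H_{\text{deep}} / H_{\text{base}}$; the latter is an assumption, since the text does not define the factor explicitly.

$$H_{\text{base}} = \frac{\Delta H}{r - 1} = \frac{0.0249}{2.36 - 1} \approx 0.0183, \qquad H_{\text{deep}} = r \, H_{\text{base}} \approx 2.36 \times 0.0183 \approx 0.0432$$

Under this reading, both H values are small in absolute terms, so the amplification describes a relative strengthening of a modest interaction rather than a dominant joint effect.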
The interaction X3 × X57 (total population × water-supply production capacity) yields a ΔH value of 0.0162, indicating a pronounced nonlinearity between population size and infrastructure capacity. As the population increases, the energy consumption and carbon emissions associated with expanding water-supply systems do not grow linearly but follow an accelerated trajectory. The interaction X3 × X30 (total population × industrial investment share) has a ΔH value of 0.0131, indicating that population growth and industrial structure expansion exert a strong amplification effect. This suggests that, during rapid industrialisation, the combined intensification of population-driven demand and industrial energy consumption significantly elevates emissions. The interaction X3 × X28_lag2 (total population × lagged industrial investment) also displays strong amplification (ΔH = 0.0085), implying that investment behaviour has a sustained temporal influence on the nonlinear response of the carbon-emission system.
Other interactions, such as X3 × X26 (total population × retail price index) and X4 × X18 (permanent population × secondary-industry employment), also exhibit positive amplification, indicating substantial multi-factor coupling among population urbanisation, consumption structure, and industrial labour structure within the carbon-emission formation mechanism. In contrast, some energy-related pairs, such as X3 × X74 (total population × coal-consumption share) and X3 × X50 (total population × urban road area), show ΔH < 0, meaning that their interactions weaken in the deep model and become more additive. This may reflect the declining marginal amplification effects of high-carbon energy consumption and the diminishing marginal influence of transportation infrastructure development.
Overall, approximately 70% of variable pairs exhibit positive amplification in the deep model, suggesting that multidimensional socio-economic factors display widespread synergistic amplification under nonlinear frameworks. In particular, the nonlinear coupling among population, urbanisation, industrial structure, and infrastructure systems constitutes a key mechanism that shapes the spatial differentiation and temporal evolution of carbon emissions.
Figure 9 provides a comprehensive illustration of the extent to which multiple factors interactively influence carbon-emission intensity. Taking X3 × X30 as an example, the horizontal axis represents the value range of the primary feature, total population (X3), while the vertical axis represents the SHAP interaction effect between the total population and the share of industrial investment in total fixed-asset investment (X30), indicating the joint impact of these factors on carbon emissions. The colour of each scatter point corresponds to the magnitude of the industrial-investment share: red represents higher industrial-investment levels (X30 > 23.25), while blue represents lower levels (X30 ≤ 23.25).
In the low industrial-investment stage (blue regression line), the marginal effect of population growth on carbon emissions is weak and tends to stabilise. In contrast, under the high industrial-investment stage (red regression line), as the population increases from approximately 1.35 × 10⁷ to 1.42 × 10⁷, the SHAP interaction values rise rapidly. This indicates that the combination of population expansion and industrial-investment growth significantly intensifies carbon emissions.
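The two regression lines can be understood as separate straight-line fits to the points on either side of the colour threshold. The sketch below illustrates this reading with ordinary least squares; it assumes a simple split at a fixed threshold (the value 23.25 is taken from the text, but how it was chosen is not stated), and the function names and data points are hypothetical and synthetic.

```python
def ols_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_by_group(primary, colour, interaction, threshold):
    """Fit one line per group, splitting on the colour feature at `threshold`."""
    groups = {"low": ([], []), "high": ([], [])}
    for x, c, s in zip(primary, colour, interaction):
        key = "high" if c > threshold else "low"
        groups[key][0].append(x)
        groups[key][1].append(s)
    return {key: ols_line(xs, ys) for key, (xs, ys) in groups.items()}


# Synthetic points: flat response in the low group, rising in the high group.
primary = [1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0]
colour = [10.0, 10.0, 10.0, 10.0, 30.0, 30.0, 30.0, 30.0]
interaction = [0.1, 0.1, 0.1, 0.1, 0.0, 1.0, 2.0, 3.0]
fits = fit_by_group(primary, colour, interaction, threshold=23.25)
print(fits)
```

A clearly larger slope in the high group than in the low group is the pattern the text describes as amplification; comparing the two slopes directly avoids relying on visual inspection of the scatter plot alone.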
Such a phenomenon reveals a marked amplification effect, indicating that, when a large population scale and high industrial capital input coexist, carbon emissions are significantly magnified. This reflects a synergistic driving mechanism between industrial-structure expansion and population growth (Figure 10).
- (1) Interaction between population and infrastructure (X4 × X5)
Figure 10a, representing the interaction between permanent population and urban permanent population, shows the highest ΔH value (0.0249), indicating the most substantial structural amplification effect. As total population increases, a rising level of urbanisation significantly enhances the marginal impact on carbon-emission intensity: in the early stage, where population size is relatively small, urban agglomeration effects are not yet fully formed, and the emission pressure induced by population growth remains modest; however, once the permanent population exceeds 21.47 million, the rising proportion of urban residents markedly amplifies energy demand and lifestyle-related emissions. This result demonstrates that the coupled variation of total population and urban population concentration is a major driver of accelerated emission growth, reflecting how population spatial agglomeration magnifies pressure on energy systems and transportation activities. It represents a typical “population–urbanisation amplification effect,” in which high population density combined with rapid urbanisation substantially increases energy consumption and lifestyle-related emissions.
- (2) Interaction between population and residential energy use (X3 × X57, X3 × X26)
The ΔH values of X3 × X57 (total population × water-supply production capacity; Figure 10b) and X3 × X26 (total population × retail price index) are 0.0162 and 0.00127, respectively, both demonstrating consumption-driven amplification effects.
For the interaction between the total population (X3) and water-supply production capacity (X57), the marginal effect of population growth on carbon emissions is significantly higher under low water-supply capacity (X57 ≤ 11.525 million tons/day), indicating that substantial energy inputs are associated with early-stage infrastructure expansion. In the high-capacity range (X57 > 11.525 million tons/day), as the system approaches saturation, the interaction effect stabilises or slightly declines. This indicates a nonlinear pattern in the impact of infrastructure development on emissions across stages, transitioning from a driving phase to a dampening phase.
For the interaction between total population (X3) and retail price index (X26) (Figure 10e), when price levels are low (X26 ≤ 101.10), the influence of population growth on emissions is limited. In contrast, during high-price periods (X26 > 101.10), the interaction SHAP values increase sharply as the population expands, indicating that higher consumption levels, combined with population growth, jointly drive increases in emissions. This “living-standard amplification effect” suggests that population growth indirectly strengthens emission intensity through consumption upgrading and increased energy demand.
- (3) Interaction between population and industrial development (X3 × X30, X3 × X28_lag2, X4 × X18)
This group of variables reflects the dynamics of emissions along the “population–industrialisation” pathway. The ΔH values of X3 × X30 (total population × industrial investment share; Figure 10c) and X3 × X28_lag2 (total population × lagged industrial investment; Figure 10d) are 0.0131 and 0.00854, respectively, both indicating a clear pattern of synergistic amplification. As the population expands and fixed-asset investment accelerates, the energy demand of the industrial sector intensifies further in the lagged stage, reinforcing emission growth. For X4 × X18 (permanent population × secondary-industry employment; Figure 10f), the ΔH value is 0.00107, suggesting that industrial employment has a positive amplification effect under a high-population context.
In the interaction between total population (X3) and lagged industrial investment (X28_lag2), population growth has relatively stable impacts on emissions in periods with low lagged investment (X28_lag2 ≤ 1179.34). In contrast, in periods with higher lagged investment (X28_lag2 > 1179.34), the interaction SHAP values rise sharply, indicating that historical investment activity continues to amplify current emissions. This reveals the time-lag effect of industrial capital accumulation and early investment expansion, which, when coupled with continued population growth, further strengthens carbon-emission levels.
In the interaction between the permanent population (X4) and secondary-industry employment (X18), under low industrial employment conditions (X18 ≤ 4.375 million), the effect of population growth on emissions is limited. However, when industrial employment exceeds this threshold (X18 > 4.375 million), the interaction effect strengthens rapidly as the population expands. This indicates that the combination of industrial labour concentration and population agglomeration significantly amplifies carbon emissions, reflecting a deeper structural coupling between demographic expansion and industrial composition.
Overall, when the population exceeds approximately 14 million and urban employment is concentrated in industrial sectors, the emission response becomes markedly steeper, demonstrating the mutually reinforcing effects of population agglomeration and industrial capital accumulation. This confirms that population size acts as an “amplifier” of carbon emissions: when high population levels coincide with high industrial investment intensity, emissions exhibit an escalating response. In contrast, interaction terms involving efficiency and structural-optimisation variables (e.g., lagged electricity consumption intensity and secondary industry share) exhibit “offsetting amplification,” forming endogenous stabilising mechanisms. This suggests that urban carbon-emission dynamics are not driven by a single factor but instead emerge from the nonlinear synergies among population, investment, industrial structure, and energy efficiency.
Taken together, the amplification patterns revealed by Friedman’s ΔH demonstrate nonlinear interactions between “population factors” and the three categories of “infrastructure–consumption–industrial development.” Population size is not only a direct driver of carbon emissions but also an amplifier of infrastructure expansion and industrial growth. When urbanisation rate, water-supply capacity, and industrial investment rise concurrently with population growth, the system’s marginal amplification effect is substantially enhanced, suggesting that linear single-variable estimates may underestimate the actual compound impacts.