This is an early access version, the complete PDF, HTML, and XML versions will be available soon.
Article

Optimizing Spatial Scales for Evaluating High-Resolution CO2 Fossil Fuel Emissions: Multi-Source Data and Machine Learning Approach

Yujun Fang, Rong Li and Jun Cao

1 Faculty of Resources and Environmental Science, Hubei University, Wuhan 430062, China
2 Hubei Key Laboratory of Regional Development and Environmental Response, Hubei University, Wuhan 430062, China
3 School of Architecture and Engineering, Wuhan City Polytechnic, Wuhan 430064, China
* Author to whom correspondence should be addressed.
Sustainability 2025, 17(20), 9009; https://doi.org/10.3390/su17209009 (registering DOI)
Submission received: 2 September 2025 / Revised: 4 October 2025 / Accepted: 6 October 2025 / Published: 11 October 2025

Abstract

High-resolution CO2 fossil fuel emission data are critical for developing targeted mitigation policies. Top–down methods, a key approach for estimating the spatial distribution of CO2 emissions, typically rely on spatial proxies to disaggregate administrative-level emissions to finer spatial scales. However, conventional linear regression models may fail to capture complex non-linear relationships between proxies and emissions. Furthermore, methods relying on nighttime light data are often inadequate for representing emissions in industrial and rural zones. To address these limitations, this study developed a multiple-proxy framework integrating nighttime light, points of interest (POIs), population, road network, and impervious surface area data. Seven machine learning algorithms (Extra-Trees, Random Forest, XGBoost, CatBoost, Gradient Boosting Decision Trees, LightGBM, and Support Vector Regression) were applied to estimate high-resolution CO2 fossil fuel emissions. The evaluation showed that the multiple-proxy Extra-Trees model significantly outperformed the single-proxy nighttime light linear regression model at the county scale, achieving R² = 0.96 (RMSE = 0.52 MtCO2) in cross-validation and R² = 0.92 (RMSE = 0.54 MtCO2) on the independent test set. Feature importance analysis identified nighttime light brightness (40.70%) and heavy industrial density (21.11%) as the most important spatial proxies. The proposed approach also showed strong spatial consistency with the Multi-resolution Emission Inventory for China, with correlation coefficients of 0.82–0.84. This study demonstrates that integrating local multiple-proxy data with machine learning corrects spatial biases inherent in traditional top–down approaches, establishing a transferable framework for high-resolution emissions mapping.
Keywords: CO2 emissions; machine learning; multi-source data; spatial distribution; Hubei Province
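
The top–down idea summarized in the abstract, in which an administrative-level total is distributed over finer cells according to proxy weights and the result is scored with R² and RMSE, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the even-split fallback, and all numbers are hypothetical; in the study the weights would presumably come from the trained multiple-proxy model rather than being set by hand.

```python
from math import sqrt


def disaggregate(county_total, proxy_weights):
    """Split one county total across grid cells in proportion to proxy weights."""
    weight_sum = sum(proxy_weights)
    if weight_sum <= 0:
        # No proxy signal at all: fall back to an even split
        # (an assumption of this sketch, not taken from the paper).
        return [county_total / len(proxy_weights)] * len(proxy_weights)
    return [county_total * w / weight_sum for w in proxy_weights]


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return sqrt(sum(squared) / len(observed))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical county emitting 10 MtCO2, with four grid cells whose
# proxy weights are purely illustrative.
cells = disaggregate(10.0, [4.0, 1.0, 0.0, 5.0])
print(cells)
```

Proportional allocation conserves the county total by construction, so any error in the fine-scale map comes from the weights themselves, which is why the choice of proxies and model matters.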

Share and Cite

MDPI and ACS Style

Fang, Y.; Li, R.; Cao, J. Optimizing Spatial Scales for Evaluating High-Resolution CO2 Fossil Fuel Emissions: Multi-Source Data and Machine Learning Approach. Sustainability 2025, 17, 9009. https://doi.org/10.3390/su17209009

AMA Style

Fang Y, Li R, Cao J. Optimizing Spatial Scales for Evaluating High-Resolution CO2 Fossil Fuel Emissions: Multi-Source Data and Machine Learning Approach. Sustainability. 2025; 17(20):9009. https://doi.org/10.3390/su17209009

Chicago/Turabian Style

Fang, Yujun, Rong Li, and Jun Cao. 2025. "Optimizing Spatial Scales for Evaluating High-Resolution CO2 Fossil Fuel Emissions: Multi-Source Data and Machine Learning Approach" Sustainability 17, no. 20: 9009. https://doi.org/10.3390/su17209009

APA Style

Fang, Y., Li, R., & Cao, J. (2025). Optimizing Spatial Scales for Evaluating High-Resolution CO2 Fossil Fuel Emissions: Multi-Source Data and Machine Learning Approach. Sustainability, 17(20), 9009. https://doi.org/10.3390/su17209009
