Article
Peer-Review Record

An Explainable Machine Learning Method for Neighborhood-Level Traffic Emissions Prediction: Insights from Ningbo, China

Sustainability 2025, 17(23), 10819; https://doi.org/10.3390/su172310819 (registering DOI)
by Yizhe Huang 1,2, Cunzhuo Liu 1,2, Yikang Fan 1,2, Jun Zhao 1,2, Chuanli Zhang 1,2, Yiwei Cao 1,2, Yibin Zhang 1,2 and Shuichao Zhang 1,2,*
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3:
Reviewer 4:
Submission received: 27 October 2025 / Revised: 15 November 2025 / Accepted: 18 November 2025 / Published: 2 December 2025
(This article belongs to the Section Sustainable Transportation)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

This paper proposes a method based on interpretable machine learning to predict traffic carbon emissions at the community level, and combines field measurement data with open-source data to explore the nonlinear impact of environmental factors on carbon emissions. The research design is reasonable and the method is innovative, especially the combination of XGBoost and SHAP for feature interpretation. However, the manuscript still has room for improvement in the following aspects:

(1) In the literature review section, it is recommended to add recent research on interpretable machine learning in traffic carbon emission forecasting, and to cover models that incorporate geographic information and spatiotemporal dynamics, so as to strengthen the theoretical basis and currency of the research.

(2) In the methods section, in addition to introducing the XGBoost and SHAP models, it is recommended to explain specific steps such as data preprocessing, feature engineering, and model parameter tuning in more detail, especially how data on different spatial scales are made consistent and how missing values are handled, so as to enhance the reproducibility of the method.

(3) The XGBoost model used in this paper performs well on static data, but the urban transportation system and the built environment change dynamically. It is suggested to discuss how well the model adapts to new data and new regions, and whether incremental learning or online learning mechanisms should be introduced to cope with the changes in data distribution brought about by urban development.

(4) Although the paper analyzes the feature contributions in different functional areas through SHAP, it does not explore in depth the impact of spatial autocorrelation or spatial dependence on the model results. It is recommended to further verify the robustness of the results using spatial econometric methods.

(5) It is recommended to clearly explain the limitations of this study in the conclusion, such as the limited spatiotemporal coverage of the data sources and the model's insufficient ability to predict emissions under extreme weather or special events such as holidays, and to discuss the impact of these limitations on the conclusions of the study.

(6) Policy implications: This article has strong practical significance, but the policy suggestions are relatively brief. It is suggested to add a section dedicated to policy implications that puts forward specific urban planning and traffic management suggestions based on the research findings, such as how to optimize the layout of bus stops and how to adopt differentiated emission reduction strategies in old city districts and high-tech areas, so as to enhance the practical value of the research.
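As an illustration of the spatial dependence check suggested in comment (4), the following is a minimal sketch of global Moran's I applied to model residuals. It is not taken from the manuscript: the function name `morans_i`, the four-unit chain of neighborhoods, and the residual values are all hypothetical, and a real analysis would build the weights matrix from the actual neighborhood geometry and add a significance test.

```python
def morans_i(values, weights):
    """Global Moran's I for one value per spatial unit.

    values  -- list of numbers (e.g. model residuals), one per unit
    weights -- square list-of-lists spatial weights matrix, weights[i][j] > 0
               when units i and j are treated as neighbors
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    # Cross-product of deviations over all neighbor pairs.
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / w_sum) * (num / den)


# Hypothetical example: four neighborhoods arranged in a chain 0-1-2-3,
# with binary contiguity weights (assumed layout, not the study area).
chain_weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

smooth_residuals = [1.0, 2.0, 3.0, 4.0]        # similar values sit together
alternating_residuals = [1.0, -1.0, 1.0, -1.0]  # neighbors always differ

print("smooth:", morans_i(smooth_residuals, chain_weights))
print("alternating:", morans_i(alternating_residuals, chain_weights))
```

Values well above zero indicate that similar residuals cluster in space, which would suggest that the model leaves spatial structure unexplained; values near or below zero point the other way.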

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

This study employs explainable machine learning (XGBoost + SHAP) to predict neighborhood-level traffic-related carbon emissions in Ningbo. The topic is timely and holds strong empirical value. However, to enhance the academic and practical rigor of the paper, improvements are needed in data representativeness, statistical validation, and integration with recent research.

  1. The introduction provides a general overview of transportation-related carbon emissions but lacks statistical evidence demonstrating the trends in Ningbo or other major Chinese cities (e.g., annual CO₂ emissions from road transport, vehicle growth rate, or modal share changes). Such quantitative indicators would help strengthen the urgency and relevance of the research.
  2. The literature review is heavily focused on studies published between 2019 and 2023, without adequately reflecting SHAP-based urban transport and XAI research published in 2024–2025. Relevant recent studies should be incorporated to contextualize this work within emerging methodological frameworks: Party Politics in Transport Policy with a Large Language Model; eXplainable DEA for Evaluating Public Transport Origin-Destination Pairs
  3. The paper reports only the RMSE value, which limits the evaluation of model performance. Additional metrics such as MAE and R², along with k-fold cross-validation, should be included to assess model stability and generalization capability. Furthermore, a quantitative comparison between SHAP and GAM interpretations is necessary. Although normalization is mentioned, the specific scaling method (e.g., min–max vs. z-score) and the train/test data split procedure are not clearly described, undermining reproducibility.
  4. Field measurements were collected from only three areas, which reduces spatial representativeness. Expanding validation data by vehicle type (e.g., passenger cars, buses) or time of day, or incorporating bootstrapped uncertainty estimation, would substantially improve the reliability and robustness of the results.
  5. While the SHAP interpretation results are informative, they fall short of causal inference. The analysis should go beyond descriptive correlations by examining variable interactions (interaction terms) or comparing results with structural models such as SEM to deepen interpretive insight and theoretical contribution.
  6. Figures 2–6 present important results but lack sufficient explanation of colors and legends, making interpretation difficult. Each graph should explicitly indicate variable units and the direction of influence (positive/negative), and the main patterns should be summarized quantitatively in the text to enhance clarity and analytical transparency.
  7. XGBoost, SHAP, and GAM are used in parallel, but the theoretical rationale for their selection and complementary roles is not well justified. The reasoning that they are used “to capture nonlinear relationships” is not persuasive enough. The statistical nature and intended purpose of each method (prediction vs. interpretation) should be clearly distinguished, and the superiority of the XGBoost–SHAP combination over alternatives (e.g., LightGBM–LIME, CatBoost–ALE plot) should be explicitly discussed: Please refer…Understanding Gender Gap in Bike-Sharing Services via XAI; Impact of road transport system on groundwater quality inferred from XAI
  8. The performance of XGBoost critically depends on hyperparameter settings (e.g., learning rate, max_depth, subsample), yet these are not reported in the manuscript. Presenting results based on a single configuration raises the possibility of overfitting or underfitting. Bayesian optimization or grid search procedures should be implemented to identify optimal parameter combinations and ensure robust performance evaluation.
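
The evaluation requested in comment 3 (RMSE, MAE, and R² under k-fold cross-validation) can be sketched as follows. This is only an illustration under stated assumptions: a one-variable least-squares line stands in for the authors' XGBoost model so that the example needs no third-party packages, the data are synthetic, and all function names (`rmse`, `mae`, `r2`, `kfold_indices`, `fit_line`, `cross_validate`) are hypothetical.

```python
import math
import random


def rmse(y_true, y_pred):
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def kfold_indices(n, k, seed=0):
    """Shuffle 0..n-1 and deal the indices into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept (stand-in model)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def cross_validate(x, y, k=5, seed=0):
    """Fit on k-1 folds, score on the held-out fold, repeat for every fold."""
    scores = []
    for test_idx in kfold_indices(len(x), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(x)) if i not in held_out]
        slope, intercept = fit_line([x[i] for i in train_idx],
                                    [y[i] for i in train_idx])
        y_true = [y[i] for i in test_idx]
        y_pred = [slope * x[i] + intercept for i in test_idx]
        scores.append({"rmse": rmse(y_true, y_pred),
                       "mae": mae(y_true, y_pred),
                       "r2": r2(y_true, y_pred)})
    return scores


# Synthetic data (assumption): a linear trend with a small deterministic wobble.
x = [float(i) for i in range(20)]
y = [2.0 * xi + 1.0 + ((i * 7) % 5 - 2) * 0.1 for i, xi in enumerate(x)]

scores = cross_validate(x, y, k=5)
for fold, s in enumerate(scores, 1):
    print(f"fold {fold}: RMSE={s['rmse']:.3f} MAE={s['mae']:.3f} R2={s['r2']:.3f}")
```

Reporting the spread of each metric across folds, rather than a single RMSE, is what allows stability and generalization to be judged; the same loop structure applies when the stand-in model is replaced by a tuned gradient-boosting model.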

 

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

The manuscript “An Explainable Machine Learning Method for Neighborhood-level Traffic Emissions Prediction: Insights from Ningbo, China” presents an interesting topic that lies within the scope of Sustainability. The use of interpretable machine learning models (XGBoost and SHAP) for urban CO₂ emission prediction has potential scientific value. However, the current version of the paper contains substantial methodological, structural, and formatting weaknesses that must be addressed before the manuscript can be considered for publication.

Comments:

1. The Introduction section provides a general overview of the topic but fails to clearly define the research gap and novelty of this study compared with prior literature. The authors should explicitly state how their approach advances existing knowledge and what specific questions it addresses that previous research has not.

2. Data Transparency and Reproducibility. The Research Data section lacks sufficient detail for replication. Although several data sources (e.g., Amap POI, Ningbo Data Open Platform, EDGAR) are mentioned, their accessibility, collection dates, and preprocessing steps are not described.

3. Multiple figures and tables contain incomplete references. Figures 1, 2, 5 and 6 are particularly unclear and of poor visual quality – the data and numeric labels are blurred and difficult to read. All figures must be reproduced at higher resolution (minimum 300 dpi) and formatted according to MDPI graphical standards.

4. Methodological Rigor and Validation. The justification for selecting XGBoost and SHAP over other interpretable algorithms is not provided. The model validation process is insufficiently explained; only RMSE is reported without contextual units or additional performance metrics such as MAE or R². The validation dataset is too limited to ensure robustness.

5. Missing Citations and Reference Quality. The text lacks in-text references in square brackets (e.g., [1], [2]) corresponding to the reference list. This creates inconsistencies between the citations and bibliography, which violates the MDPI referencing style. Furthermore, several references in the list are incomplete – missing page numbers, journal titles, publishers, or publication years. Examples include: Ref. [1]: Programme, U.N.E. Emissions Gap Report 2024. – no publisher or location; Ref. [4]: Rakha, H.A.; Farag, M.; Foroutan, H. – lacks page numbers and full journal title; Ref. [6]: Ji, T.; Li, K.; Sun, Q.; Duan, Z. – missing volume or page range; Ref. [7]: Liu, B.; Li, F.; Hou, Y.; Antonio Biancardo, S.; Ma, X. – incomplete citation format (missing issue number); Refs. [10], [11], [14] – missing volume, issue, or page range details. All references must be verified and formatted according to MDPI’s Vancouver citation style, including DOI numbers where available. Moreover, the literature review includes only 16 references, which appears highly insufficient and unconvincing for a journal of the level and academic standards of MDPI.

6. The Results section primarily reports model outputs without critical interpretation or comparison with previous studies. The Discussion should be expanded to link findings with broader implications for sustainable transport and urban emission policy.

7. The Conclusions section is incomplete and too brief – only two short paragraphs that do not effectively summarize the full scope of the study. It should be entirely rewritten to: summarize the key findings; discuss practical and theoretical implications; identify limitations more explicitly; propose future research directions or perspectives for further development of the authors’ model.

8. Language and Formatting. The manuscript requires a full English language review. Grammar, syntax, and article usage (“the”, “a”) should be carefully checked. Some sentences are overly long and lack academic precision. Formatting of equations and figure captions must follow MDPI standards.

Recommendations. Given the above concerns – particularly the absence of in-text citations, incomplete references, unclear figures, and underdeveloped conclusions – the manuscript requires major revision. At this stage, it should be rejected and resubmitted only after substantial revision and restructuring. The study has potential merit, but to meet Sustainability’s publication standards, the authors must significantly improve data transparency, visual quality, methodological justification, and the overall academic structure of the paper.

Comments on the Quality of English Language

In addition, the list of authors seems to come from the same group (organization and Zhang - 3 persons, as a family), which looks unusual and may raise questions about the diversity of contributions. The authors should clearly explain each person’s role according to MDPI authorship guidelines.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 4 Report

Comments and Suggestions for Authors

The manuscript addresses a relevant and timely research problem related to the analysis and prediction of traffic-related CO₂ emissions in urban environments, which is directly connected to sustainable development goals, urban air quality, and mobility planning. The application of interpretable machine learning models (XGBoost combined with SHAP analysis), complemented by the Generalized Additive Model (GAM) to capture and visualize nonlinear effects, represents a contemporary and methodologically well-grounded approach. The study’s particular contribution lies in combining open emission datasets with field measurements collected across three types of urban functional zones, which strengthens the empirical credibility of the results and enhances their practical applicability in real-world city planning scenarios.

The manuscript effectively demonstrates that built environment features, transportation infrastructure characteristics, and functional land-use patterns influence CO₂ emissions in distinct and nonlinear ways across different neighborhood types. The findings therefore hold direct value for urban planners, mobility strategists, and policy designers, especially in the context of targeted emission reduction strategies at the local (neighborhood) scale.

Overall, the manuscript is clearly structured, the results are well interpreted, and the conclusions are relevant. However, several sections would benefit from clarification and refinement to improve precision, readability, and communicative clarity.

Comments and Suggestions for Revision

  1. Clearly articulate the contributions at the end of the Introduction.
    A short subsection listing 3–4 key contributions in explicit bullet-point form would improve clarity and distinguish the study from previous work (e.g., combination of open CO₂ datasets with field measurements, interpretability of model outputs, functional-zone-based comparison).
  2. Add a brief outline of the manuscript structure.
    A standard guiding sentence at the end of the Introduction (e.g., “The remainder of the paper is organized as follows…”) would enhance the readability.
  3. Clarify the temporal alignment of emission data and field measurements.
    As the EDGAR dataset is annual while field measurements are taken in July, it would be helpful to briefly explain why the datasets are considered temporally compatible and how potential seasonal effects are addressed or minimized.
  4. Briefly justify the choice of XGBoost compared to alternative models.
    Adding one sentence referencing prior evidence of XGBoost’s stable predictive performance and suitability for SHAP-based interpretation would be sufficient.
    (No additional experiments are required.)
  5. Expand figure captions for SHAP and GAM outputs.
    Captions should include short explanations of color gradients and value scales to make the figures self-explanatory without returning to the text.
  6. Restructure the Conclusion into two concise parts:
    • A clearly summarized list of the key findings (4–6 statements), and
    • A distinct final paragraph discussing study limitations and future research directions (e.g., validation in additional cities, expanded field data collection).
  7. Minor language polishing.
    A light linguistic revision is recommended to improve sentence flow, especially in transitions between passive and active voice. The meaning of the text can remain unchanged.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 2

Reviewer 2 Report

Comments and Suggestions for Authors

I'm happy with the responses.

Reviewer 3 Report

Comments and Suggestions for Authors

I thank the authors for the detailed explanations and for considering the comments. I agree that the article should be accepted for publication in this corrected form. I wish the authors a speedy publication and fresh creative inspiration for future work.
