Article
Peer-Review Record

Estimating Surface NO2 in Mexico City Using Sentinel-5P and Machine Learning

Atmosphere 2026, 17(1), 37; https://doi.org/10.3390/atmos17010037 (registering DOI)
by Yolanda Rosenda Monzón Herrera 1, Mayrén Polanco Gaytán 2, Raúl Teodoro Aquino Santos 3,*, Lakshmi Babu Saheer 4, Oliver Mendoza-Cano 1 and Rafael Julio Macedo-Barragán 5
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 26 November 2025 / Revised: 18 December 2025 / Accepted: 22 December 2025 / Published: 26 December 2025
(This article belongs to the Section Atmospheric Techniques, Instruments, and Modeling)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

The manuscript presents a machine-learning-based downscaling framework to estimate surface NO₂ in Mexico City by integrating Sentinel‑5P, ERA5, and RAMA observations, and compares its performance with a recent multimodal deep learning model (AQNet). The main strengths are the clear applied focus on an air‑quality–relevant pollutant, the use of dense local monitoring data, a systematic evaluation of model skill, and an explicit treatment of uncertainty in the conversion chain and prediction errors. The work demonstrates that a carefully configured Random Forest model can outperform more complex deep-learning architectures when high‑quality, region‑specific data and preprocessing are available, which is an important and timely message for the community.

The weaknesses concern primarily structure, focus, and methodological transparency rather than the core idea. The Materials and Methods section is very long and partly reads like a didactic exposition (e.g., step‑by‑step derivations of well-known gas-law relations and very detailed sensitivity descriptions), which could be condensed and better organized around the key methodological decisions and assumptions. Some critical aspects of the modelling design are underexplained or only briefly justified, such as the exclusive choice of Random Forest as “baseline”, the hyperparameter configuration, the rationale for using a fixed 7 km column height, and the representativeness of a single year (2024) for model evaluation. In the Results and Discussion, there is a tendency to repeat similar performance numbers and qualitative statements across several subsections, while the interpretation of spatial and temporal NO₂ patterns, as well as the implications for policy and health, remain relatively brief compared with the methodological detail. Finally, the comparison with AQNet relies on different spatial domains and data contexts, and this limitation should be more explicitly and critically discussed to avoid over‑generalization of the superiority of the proposed approach.

Author Response

Dear Editor and Reviewers,

We sincerely thank the reviewers for their careful evaluation of our manuscript entitled “Estimating Surface NO₂ in Mexico City Using Sentinel-5P and Machine Learning.” We appreciate the constructive and insightful comments, which have significantly contributed to improving the clarity, structure, and methodological transparency of the final version of the article.

Below, we provide a point-by-point response addressing each evaluation item and comment raised by the reviewers. All revisions have been fully incorporated into the manuscript and are reflected in the current version submitted for reassessment.

General Evaluation Criteria

1. Does the introduction provide sufficient background and include all relevant references?

Reviewer assessment: Can be improved

Response:
We agree with the reviewer’s assessment. The Introduction has been substantially strengthened in the revised manuscript. Specifically, we expanded the theoretical framing of machine-learning–based statistical downscaling in atmospheric science, clarifying its relevance for bridging satellite tropospheric NO₂ columns and surface-level concentrations in complex urban environments. The revised Introduction now explicitly emphasizes (i) the limitations of satellite-only products, (ii) the role of meteorology and boundary-layer dynamics, and (iii) the methodological significance of machine learning for urban air-quality assessment, particularly in megacities such as Mexico City. All additions rely on references already included in the manuscript.

2. Is the research design appropriate?

Reviewer assessment: Yes

Response:
We thank the reviewer for this positive evaluation. The overall research design was retained, as it was deemed appropriate. Minor clarifications were added to explicitly frame the study as a methodological case study for 2024, avoiding any implication of long-term climatological generalization.

3. Are the methods adequately described?

Reviewer assessment: Must be improved

Response:
We fully acknowledge this concern and have implemented substantial revisions to the Materials and Methods section. In the final manuscript:

  • The section has been reorganized and streamlined to focus on key methodological decisions rather than didactic exposition.

  • Step-by-step derivations of well-known gas-law relationships were condensed, with unnecessary repetition removed.

  • Critical modelling choices are now explicitly justified, including:

    • the selection of Random Forest as a regionally optimized baseline model;

    • the hyperparameter configuration and cross-validation strategy (GroupKFold);

    • the assumption of a 7 km effective tropospheric column height, now clearly framed as a baseline scenario and complemented by sensitivity analyses;

    • the use of a single year (2024), explicitly discussed as appropriate for methodological validation rather than long-term trend analysis.

  • Sensitivity, ablation, and uncertainty analyses are now better integrated and clearly motivated as robustness assessments.

These changes significantly improve methodological transparency while preserving reproducibility.

4. Are the results clearly presented?

Reviewer assessment: Can be improved

Response:
We agree and have substantially improved the structure and clarity of the Results section. Specifically:

  • Redundant repetition of performance metrics across subsections was eliminated.

  • The Results now follow a clear analytical progression:
    data description → Random Forest performance → satellite–ground comparison → statistical downscaling gains → ablation and uncertainty analysis.

  • Descriptive figures (e.g., station distribution maps) were moved to the Materials and Methods section, ensuring that the Results focus strictly on analytical outcomes.

  • Interpretative statements were reduced in Results and expanded in the Discussion to avoid overlap.

5. Are the conclusions supported by the results?

Reviewer assessment: Yes

Response:
We appreciate this positive assessment. The Conclusions section remains fully supported by the Results and has been lightly refined to align with the reorganized Discussion and to avoid over-generalization.

6. Are all figures and tables clear and well-presented?

Reviewer assessment: Yes

Response:
Thank you for this assessment. In addition, figures were repositioned where necessary to comply with standard journal conventions (e.g., study-area figures placed in Materials and Methods). Captions were revised for clarity and completeness.

Comments and Suggestions for Authors

Manuscript structure and focus

Reviewer comment:
The manuscript’s weaknesses concern structure, focus, and methodological transparency rather than the core idea.

Response:
We fully agree. The manuscript was systematically restructured to improve narrative flow and focus. Methodological detail was condensed, while the Results and Discussion were strengthened to emphasize interpretation, spatial–temporal patterns, and applied relevance.

Interpretation and policy relevance

Reviewer comment:
Interpretation of spatial and temporal NO₂ patterns and implications for policy and health are relatively brief.

Response:
The Discussion section has been significantly expanded to interpret spatial gradients, temporal variability, and meteorological controls on NO₂. A dedicated subsection now addresses implications for urban air-quality monitoring, exposure assessment, and policy support, particularly for megacities with complex terrain and limited monitoring infrastructure.

Comparison with AQNet and risk of over-generalization

Reviewer comment:
The comparison with AQNet relies on different spatial domains and data contexts and should be discussed more critically.

Response:
This point has been explicitly addressed. The comparison with AQNet is now clearly framed as context-dependent, emphasizing differences in spatial domain, data density, and training strategy. The revised Discussion explicitly states that the superior performance of the Random Forest model should not be generalized beyond similar data-rich urban environments, avoiding any implication of universal superiority.

Closing Remarks

We believe that the revised manuscript addresses all reviewer comments and significantly improves clarity, structure, and methodological rigor while preserving the original scientific contribution. We are grateful to the reviewers for their thoughtful feedback, which has strengthened the manuscript.

We hope that the revised version is now suitable for publication in Atmosphere and we remain at your disposal for any further clarification.

Kind regards,

Yolanda Monzón 

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

The present work could be reconsidered after addressing the following comments:

  1. How were the hyperparameters for the Random Forest model tuned? Was cross-validation sufficient to prevent overfitting, especially given the relatively small dataset containing 3,246 records?
  2. The study relies on RAMA stations for ground truth. Could the authors discuss potential biases resulting from the uneven spatial distribution of these stations across Mexico City?
  3. Have the missing data periods (e.g., due to cloud cover or station downtime) been systematically analyzed to ensure they do not bias the study's results?
  4. The reported annual mean NO₂ concentrations (≈42 µg/m³) appear to be relatively high. How do these values compare with WHO guidelines and local regulatory thresholds? Including this context would enhance the public health relevance of the study.
  5. The comparison with AQNet is compelling, but it's important to note that the datasets are from different regions (Europe vs. Mexico City). Could the authors clarify whether AQNet was adapted to the Mexico City dataset or simply benchmarked against published results from Europe?
  6. What are the prospects for extending this methodology to other Mexican cities that have less dense monitoring networks?

Minor Comments:

  • Figures 1–4 are informative but could be improved by enhancing the resolution and providing clearer labeling (e.g., including axis titles and units).
  • Correct typographical errors (e.g., “Ramdom Forest”, “TTemporal & Spatial Alignment”).
  • Please add section "author contributions" (conceptualization, methodology etc.).
  • In the "References" section the authors should add DOIs for all papers.
  • References must be numbered in order of appearance in the text (including citations in tables and legends).
  • Not all references are cited within the manuscript: 26, 32, 33, 35, 42
  • Please avoid using the lumped citation like: [1,7,8,11], [4-7,11,12,17-21,24,27-29,39-41], [18,39-41], [1,2,7,12], [5,6,18,19,39-41], [18,39-41], [1,2,7,34], [11,24,27-29], [24,27-29], [18,39-41], [11,24,27-29], [18,39-41], [11,24,27-29], [8,11,27,28], [5,6,19,20,39,40], [18,24,27,30], [10,11,27,31], [5,6,19,20,39,40], [8,27,31,41]

Author Response

Dear Editor and Reviewers,

We thank the reviewers for their detailed evaluation of our manuscript and for the constructive comments provided. We appreciate the opportunity to further improve the quality, clarity, and rigor of the study. Below, we address each comment point by point, indicating how the concerns have been resolved in the revised manuscript.

General Evaluation Criteria

1. Does the introduction provide sufficient background and include all relevant references?

Reviewer assessment: Can be improved

Response:
The Introduction has been revised to improve conceptual clarity and completeness. In particular, we expanded the discussion on (i) the limitations of satellite-derived tropospheric NO₂ columns, (ii) the role of boundary-layer dynamics and meteorology in shaping surface concentrations, and (iii) the relevance of machine-learning–based statistical downscaling for urban air-quality assessment. These revisions strengthen the theoretical background without altering the scope of the study.

2. Is the research design appropriate?

Reviewer assessment: Can be improved

Response:
The research design has been clarified to emphasize that this work is a methodological case study for the year 2024, rather than a climatological analysis. Additional text was added to explain the rationale for focusing on a single year with dense observational coverage, framing the design as appropriate for validating the downscaling framework while avoiding over-generalization.

3. Are the methods adequately described?

Reviewer assessment: Can be improved

Response:
The Materials and Methods section has been substantially refined to improve transparency and conciseness. Specifically:

  • Hyperparameter choices for the Random Forest model are now explicitly reported (number of trees, minimum samples per leaf, validation strategy).

  • The use of GroupKFold cross-validation grouped by monitoring station is clearly described and justified as a measure to prevent spatial overfitting, particularly given the dataset size (3,246 records).

  • The assumption of a 7 km effective tropospheric column height is explicitly framed as a baseline scenario and complemented by a sensitivity analysis exploring alternative heights.

These revisions ensure that all methodological decisions are clearly justified and reproducible.
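For concreteness, the arithmetic of the baseline scenario can be sketched in a few lines. The sketch assumes, as the baseline does, that the tropospheric NO₂ column (mol/m²) is uniformly mixed over an effective height H; the column value below is hypothetical and only illustrates the mechanics, not the full conversion chain in the manuscript.

```python
# Illustrative conversion of a tropospheric NO2 column (mol/m^2) to a
# surface mass concentration (ug/m^3) under a uniform-mixing assumption
# over an effective column height H. Sketch only, with a hypothetical
# column value; not the manuscript's full conversion chain.

M_NO2 = 46.0055  # molar mass of NO2, g/mol


def column_to_surface(column_mol_m2: float, height_m: float = 7_000.0) -> float:
    """Return the implied surface NO2 concentration in ug/m^3."""
    mol_per_m3 = column_mol_m2 / height_m     # mean molar density, mol/m^3
    return mol_per_m3 * M_NO2 * 1e6           # g/m^3 -> ug/m^3


# Hypothetical column of 1.5e-4 mol/m^2 over the baseline 7 km height:
print(round(column_to_surface(1.5e-4), 2))    # prints 0.99
```

Because the implied concentration scales inversely with H, halving the assumed height doubles the result, which is exactly the sensitivity that the alternative-height analysis exercises.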

4. Are the results clearly presented?

Reviewer assessment: Can be improved

Response:
The Results section has been reorganized to reduce redundancy and improve logical flow. Repeated reporting of performance metrics across subsections was minimized, and the presentation now follows a consistent progression from data description to model performance, validation, ablation, and uncertainty analysis.

5. Are the conclusions supported by the results?

Reviewer assessment: Can be improved

Response:
The Conclusions section was revised to more explicitly reflect the results presented, particularly regarding uncertainty propagation, methodological limitations, and the context-dependent nature of the comparison with deep-learning models. Over-generalized statements were avoided.

6. Are all figures and tables clear and well-presented?

Reviewer assessment: Must be improved

Response:
All figures were reviewed and improved. Specifically:

  • Figures describing the study area and methodology were relocated to the Materials and Methods section.

  • Figure resolution was increased and labeling improved, including axis titles and units where applicable.

  • Figure captions were revised for clarity and completeness.

Major Comments

1. Hyperparameter tuning and overfitting prevention

Comment:
How were the hyperparameters for the Random Forest model tuned, and was cross-validation sufficient?

Response:
Hyperparameters were selected based on prior studies and empirical testing to balance model flexibility and generalization. The Random Forest model used 500 trees and a minimum of five samples per leaf, which reduces overfitting. Importantly, GroupKFold cross-validation grouped by monitoring station was applied, ensuring that no station appeared simultaneously in training and validation sets. This spatially independent validation strategy is appropriate for the dataset size and mitigates overfitting risk.
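The leakage-prevention property of the station-grouped split can be illustrated with a minimal, dependency-free sketch of the GroupKFold idea; the station codes and records below are placeholders, not the exact RAMA subset or code used in the study.

```python
# Illustrative station-grouped k-fold split (the idea behind
# scikit-learn's GroupKFold): records from a given monitoring station
# never appear in both the training and the validation fold.
# Station codes and records below are hypothetical.

def group_kfold(records, n_splits=5):
    """Yield (train, valid) record lists, split by station group."""
    stations = sorted({r["station"] for r in records})
    folds = [stations[i::n_splits] for i in range(n_splits)]
    for held_out in folds:
        valid = [r for r in records if r["station"] in held_out]
        train = [r for r in records if r["station"] not in held_out]
        yield train, valid


records = [{"station": s, "no2": i}
           for i, s in enumerate(["MER", "PED", "TLA", "UIZ", "XAL"] * 4)]

for train, valid in group_kfold(records, n_splits=5):
    train_st = {r["station"] for r in train}
    valid_st = {r["station"] for r in valid}
    assert not train_st & valid_st  # no station leaks across the split
```

A random (ungrouped) split would let a station contribute records to both sides, inflating apparent skill through spatial autocorrelation; grouping by station is what makes the validation spatially independent.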

2. Potential bias due to uneven RAMA station distribution

Comment:
Could uneven station distribution bias the results?

Response:
This issue is now explicitly discussed in the Methods and Discussion sections. While RAMA stations are denser in central urban areas, the study leverages this configuration to capture emission gradients typical of Mexico City. We acknowledge that peripheral areas are less densely sampled and discuss this as a limitation, emphasizing that uncertainty propagation and spatial predictors help mitigate—but not fully eliminate—this bias.

3. Impact of missing data periods

Comment:
Have missing data periods been systematically analyzed?

Response:
Yes. Quality filtering (QA > 0.75, cloud fraction < 20%) and temporal aggregation were applied consistently across all datasets. Days with missing satellite observations or station downtime were excluded prior to model training. The revised manuscript clarifies that the final dataset (3,246 records) represents only temporally aligned observations, reducing the risk of systematic bias due to missing data.
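As an illustration of this alignment step, the filter can be sketched over co-registered daily records; the thresholds (QA > 0.75, cloud fraction < 20%) follow the text above, while the field names and sample values are hypothetical.

```python
# Illustrative quality filter for co-registered daily records: a day is
# kept only when the satellite retrieval passes QA screening and the
# ground station reported data. Field names and values are hypothetical.

def keep(rec):
    return (rec["qa"] > 0.75
            and rec["cloud_fraction"] < 0.20
            and rec["station_no2"] is not None)


records = [
    {"qa": 0.92, "cloud_fraction": 0.05, "station_no2": 41.8},
    {"qa": 0.60, "cloud_fraction": 0.05, "station_no2": 39.2},  # low QA
    {"qa": 0.88, "cloud_fraction": 0.35, "station_no2": 44.1},  # too cloudy
    {"qa": 0.90, "cloud_fraction": 0.10, "station_no2": None},  # downtime
]

aligned = [r for r in records if keep(r)]
print(len(aligned))  # only the first record survives
```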

4. Public health relevance of reported NO₂ concentrations

Comment:
How do the reported concentrations compare with WHO guidelines and local thresholds?

Response:
The Discussion section now explicitly contextualizes the annual mean NO₂ concentrations (~42 µg/m³) relative to WHO Air Quality Guidelines and Mexican regulatory thresholds. This comparison highlights the public health relevance of the findings and underscores the importance of high-resolution exposure assessment in Mexico City.

5. Comparison with AQNet

Comment:
Was AQNet adapted to Mexico City or benchmarked against published results?

Response:
This point has been clarified. AQNet was not retrained or adapted to the Mexico City dataset; instead, its published European performance metrics were used for contextual benchmarking. The Discussion now explicitly acknowledges differences in spatial domain, data density, and training context, and cautions against over-generalization of comparative performance.

6. Extension to other Mexican cities

Comment:
What are the prospects for applying this method elsewhere?

Response:
The Conclusions and Discussion sections now address this explicitly. We note that the framework is transferable to other Mexican cities, but performance will depend on data availability. In cities with sparse monitoring, satellite and meteorological predictors may still provide valuable estimates, though with higher uncertainty.

Minor Comments

Response:
All minor comments have been addressed:

  • Figures 1–4 were improved in resolution and labeling.

  • Typographical errors (e.g., “Ramdom Forest”, formatting inconsistencies) were corrected.

  • An Author Contributions section was added.

  • DOIs were added to references where available.

  • References were renumbered to follow the order of appearance, including tables and figure captions.

  • All references listed are now cited in the manuscript.

  • Lumped citations were revised and distributed more selectively throughout the text.

We believe that the manuscript has been substantially improved in response to the reviewers’ comments. The revisions enhance methodological transparency, presentation clarity, and public health relevance while maintaining the original scientific contribution. We sincerely thank the reviewers for their valuable feedback and hope that the revised manuscript is now suitable for publication in Atmosphere.

Kind regards,
Yolanda Monzón 

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

This study proposes a machine learning-based downscaling framework for estimating high-resolution surface NO₂ concentrations in Mexico City by integrating satellite observations, meteorological variables, and ground-based measurements—a topic of clear relevance to atmospheric research. The manuscript is considered suitable for publication pending the following revisions:

  1. The introduction should better emphasize the significance of applying machine learning methods in atmospheric studies.
  2. The subsection titled “Statistical Comparison Between Satellite Data and Ground-Based Observations” reads more like results and should either be moved to the results section or integrated accordingly.
  3. Figure 9, which depicts the study area, should be presented in the materials and methods section rather than in the results.
  4. In Section 3.3, the statistical analysis of NO₂ mass concentrations could be enhanced by also illustrating their variations and potential influencing factors.
  5. While the research topic is compelling, the manuscript’s organization could be improved. For instance, the discussion section lacks depth and cohesion; merging the results and discussion could help articulate the findings more clearly and fluently. (If possible).

Author Response

Dear Editor and Reviewers,

We thank the reviewers for their careful evaluation of our manuscript entitled “Estimating Surface NOâ‚‚ in Mexico City Using Sentinel-5P and Machine Learning.” We appreciate the constructive feedback and the positive assessment of the study’s relevance. Below, we provide a point-by-point response detailing how each comment has been addressed in the revised manuscript.

General Evaluation Criteria

1. Does the introduction provide sufficient background and include all relevant references?

Reviewer assessment: Can be improved

Response:
We agree with this assessment. The Introduction has been revised and expanded to more clearly emphasize the significance of machine learning methods in atmospheric studies. In particular, we strengthened the discussion on the limitations of satellite-only NO₂ products, the role of nonlinear interactions between emissions and meteorology, and the capacity of machine-learning–based statistical downscaling to bridge the gap between tropospheric columns and surface concentrations in complex urban environments such as Mexico City. These additions enhance the conceptual framing while maintaining the original scope and references.

2. Is the research design appropriate?

Reviewer assessment: Yes

Response:
We thank the reviewer for this positive evaluation. The research design was retained, as it was considered appropriate. Minor clarifications were added to explicitly frame the study as a methodological case study for 2024, avoiding any implication of long-term climatological generalization.

3. Are the methods adequately described?

Reviewer assessment: Yes

Response:
We appreciate this assessment. Although the Methods were deemed adequate, we implemented minor refinements to improve readability and transparency, including clearer justification of model configuration, validation strategy, and uncertainty treatment.

4. Are the results clearly presented?

Reviewer assessment: Can be improved

Response:
We agree with this comment and have reorganized the Results section to improve clarity and coherence. Redundant repetition of performance metrics across subsections was reduced, and the presentation now follows a more logical sequence from data description to model performance, validation, and uncertainty analysis.

5. Are the conclusions supported by the results?

Reviewer assessment: Yes

Response:
We thank the reviewer for this positive assessment. The Conclusions section remains fully supported by the Results and has been lightly refined to align with the revised Discussion.

6. Are all figures and tables clear and well-presented?

Reviewer assessment: Yes

Response:
We appreciate this evaluation. Additionally, figures were reviewed to ensure consistency with journal conventions, and descriptive figures were relocated where necessary.

Specific Comments and Revisions

1. Emphasizing the significance of machine learning in the Introduction

Comment:
The introduction should better emphasize the significance of applying machine learning methods in atmospheric studies.

Response:
This comment has been fully addressed. The Introduction now includes an explicit paragraph highlighting how machine-learning approaches enable the integration of heterogeneous datasets, capture nonlinear atmospheric processes, and improve urban-scale air-quality estimation beyond traditional statistical or physical models.

2. Placement of the subsection “Statistical Comparison Between Satellite Data and Ground-Based Observations”

Comment:
This subsection reads more like results and should be moved or integrated accordingly.

Response:
We agree. This subsection is now fully integrated within the Results section, where it is presented as a validation component rather than a methodological description. Interpretative elements were shifted to the Discussion to maintain a clear separation between results and interpretation.

3. Placement of Figure 9

Comment:
Figure 9 should be presented in the Materials and Methods section rather than in the Results.

Response:
Figure 9 has been relocated to the Materials and Methods section, where it now provides spatial and contextual information about the study area and data domain. This change aligns the manuscript with standard structural conventions.

4. Enhancement of statistical analysis in Section 3.3

Comment:
The statistical analysis of NO₂ mass concentrations could be enhanced by illustrating variability and influencing factors.

Response:
Section 3.3 was expanded to explicitly describe temporal variability, seasonal modulation, and meteorological influences (e.g., boundary-layer height and wind conditions) on NO₂ concentrations. This addition strengthens the interpretation of the statistical results without introducing new data.

5. Overall manuscript organization and depth of discussion

Comment:
The discussion lacks depth and cohesion; merging results and discussion could help (if possible).

Response:
We addressed this concern by strengthening and expanding the Discussion section, improving cohesion and interpretative depth. While Results and Discussion remain formally separated to preserve clarity, the revised Discussion now more clearly synthesizes the findings, links them to existing literature, and articulates their implications for urban air-quality assessment and policy.

We believe that the revisions fully address the reviewer’s comments and have substantially improved the manuscript’s clarity, organization, and scientific framing. We thank the reviewer for their constructive suggestions and hope that the revised manuscript is now suitable for publication in Atmosphere.

Kind regards,

Yolanda Monzón 

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

The authors have significantly improved the paper. They have justified their methodological choices, particularly the decision to employ Random Forest rather than more complex deep-learning architectures. The expanded discussion addressing implications for air quality management and potential methodology transfer to other cities is especially commendable.

Reviewer 3 Report

Comments and Suggestions for Authors

The paper can be accepted in the present form!
