Estimation of PM2.5 Concentrations Using Fusion 3 km AOD of Two-Stage Models in Beijing–Tianjin–Hebei, China
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
The authors have presented a very interesting work with important contributions. The text is well structured, which makes the paper easy to read. I am leaving just a few minor comments to help improve the quality of the work.
A space should be left between the numerical value and the percent symbol.
In Figure 3, I suggest that the authors modify the x-axis to display the names of the months instead of the numbers from 1 to 12.
The tables from number 3 onward are not cited in the text.
The authors report the RMSE and MAE errors as percentage values. Shouldn't they be presented in the same unit as the measurand?
Table 9 appears somewhat disorganized. Would it be possible to improve the arrangement of the values in it?
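As an illustration of the comment on error units, the following minimal sketch computes RMSE and MAE so that both come out in the unit of the measurand (here µg/m3, written as ug/m3 in the code). The function names and the sample values are hypothetical and are not taken from the manuscript.

```python
import math

def rmse(observed, predicted):
    """Root-mean-square error, in the same unit as the inputs."""
    squared = [(p - o) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def mae(observed, predicted):
    """Mean absolute error, in the same unit as the inputs."""
    return sum(abs(p - o) for o, p in zip(observed, predicted)) / len(observed)

# Hypothetical daily PM2.5 values in ug/m3 (illustrative only, not from the manuscript).
observed = [35.0, 80.0, 120.0, 60.0]
predicted = [40.0, 70.0, 110.0, 65.0]

print(f"RMSE = {rmse(observed, predicted):.1f} ug/m3")
print(f"MAE  = {mae(observed, predicted):.1f} ug/m3")
```

Because both metrics are averages of differences between concentrations, neither is a percentage unless it is explicitly normalized, for example by the mean observed concentration.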
Author Response
Please see the attachment.
Author Response File:
Author Response.docx
Reviewer 2 Report
Comments and Suggestions for Authors
Article: "Estimation of Daily PM2.5 Concentrations Using Fusion 3 km AOD of Two-Stage Models in Beijing–Tianjin–Hebei, China". The presented study is devoted to the important problem of assessing atmospheric air pollution using modern machine learning methods and satellite data. The study analyzes a hybrid model (XGBoost + GTWR) for estimating daily concentrations of PM2.5 in one of the most ecologically stressed regions of China, Beijing–Tianjin–Hebei, using satellite aerosol optical depth (AOD) data with a spatial resolution of 3 km. The main results demonstrate the high accuracy of the proposed model in reconstructing PM2.5 concentrations (R2 = 0.95). The relevance of the work is beyond doubt, given the severity of atmospheric air pollution in the urbanized regions of China. Nevertheless, there are several critical comments; once they are addressed, the manuscript can be published in the journal.
1. The abstract states that the highest level of atmospheric pollution was observed in winter and amounted to 100.3 µg/m3 in the city of Handan. However, Figure 5 shows that this is the highest seasonal average value of PM2.5. I believe that the abstract should be corrected accordingly so as not to mislead readers.
2. To improve reproducibility and allow a better evaluation of the manuscript, the authors should describe the cross-validation process in more detail, for example, how the spatial and temporal independence of the folds was ensured.
3. Section 3.3 uses R2, RMSE, and MAE to validate the model but does not take into account the mean deviation (bias), which is important for understanding the direction of the model error.
4. In Section 4, seasonal variations of PM2.5 are explained by general phrases such as "national measures" and "weather conditions". I suggest that the authors deepen the discussion of the physical causes of the seasonal variability of PM2.5, supporting it with data (on boundary layer height, humidity, precipitation, temperature, changes in emissions, number of forest fires, etc.) or references to the literature [https://doi.org/10.3390/fire7040150, doi.org/10.3390/su17083585, https://doi.org/10.3390/ijerph181910191, https://doi.org/10.3390/app14188327].
5. In Figure 1, the dark green color, which covers a significant part of the territory, is hard to read.
6. Figure 3 does not show labels for the average monthly concentrations of PM2.5.
7. Figures 5, 6, and 7: in the figure captions, PM2.5 is written without a subscript.
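Comments 2 and 3 can be made concrete with a short sketch. The code below is an illustration under stated assumptions, not the authors' procedure: `station_folds` is a hypothetical helper that keeps all samples from one monitoring station in the same fold (one possible way to obtain spatially independent folds), and `mean_bias` is the average of predicted minus observed values. Temporal independence could be handled analogously by grouping on dates instead of station identifiers.

```python
from collections import defaultdict

def mean_bias(observed, predicted):
    """Mean of (predicted - observed); positive means overestimation."""
    return sum(p - o for o, p in zip(observed, predicted)) / len(observed)

def station_folds(station_ids, n_folds):
    """Return sample indices per fold so that each station falls in exactly one fold."""
    unique_stations = sorted(set(station_ids))
    # Assign stations to folds round-robin (a simple, assumed assignment rule).
    fold_of = {s: i % n_folds for i, s in enumerate(unique_stations)}
    folds = defaultdict(list)
    for idx, station in enumerate(station_ids):
        folds[fold_of[station]].append(idx)
    return [folds[k] for k in range(n_folds)]

# Hypothetical sample-to-station mapping (illustrative only).
station_ids = ["S1", "S1", "S2", "S2", "S3", "S3", "S4"]
for k, test_idx in enumerate(station_folds(station_ids, 2)):
    held_out = set(test_idx)
    train_idx = [i for i in range(len(station_ids)) if i not in held_out]
    # No station contributes samples to both the training and the test set.
    print(f"fold {k}: train={train_idx} test={test_idx}")
```

With this kind of split, the validation score reflects performance at locations the model has not seen, which a purely random sample-based split does not guarantee.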
Author Response
Please see the attachment.
Author Response File:
Author Response.docx
Reviewer 3 Report
Comments and Suggestions for Authors
This study proposes a two-stage modeling framework integrating eXtreme Gradient Boosting (XGBoost) with geographically and temporally weighted regression (GTWR) to predict daily PM2.5 concentrations at a 3 km resolution in Beijing–Tianjin–Hebei, China.
The introduction section is approximately 3 pages long, with large paragraphs that are difficult to read. The authors should consider splitting the large paragraphs into more concise ones. The majority of the introduction describes different machine learning methods without relating them to this research, so it does not add any value to this work.
The following part is redundant and should be removed:
"The remainder of this paper is structured as follows: Section 2 introduces the study area, data sources, and preprocessing methods..."
Figure 1 shows the geographical location, air quality monitoring stations, and annual mean distribution of PM2.5 concentration in the study area. However, it is unclear whether the air quality monitoring stations are background or roadside stations, and the authors should clarify this.
Some of the sentences are written in the first person (such as "We"), which should be avoided in scientific writing.
Table 1 shows the selection of monthly model parameters in 2020, and Table 2 shows the selection of quarterly model parameters in 2020. Can the authors explain why the variables are different between months and seasons?
The authors state that "We obtained data for the entire year of 2020 for the Beijing–Tianjin–Hebei region". The observation time seems very short for an air quality study, which may have a negative impact on the results. A normal observation time would be 5 years or more for this kind of study. Can the authors explain why only 1 year of data is considered?
Table 7 shows the monthly comparison of evaluation indices for various models for 2020. January appears to have the highest R2 and also the highest RMSE and MAE for all models, which is quite unusual. Can the authors explain the reason behind this unusual trend?
Figure 7 shows the spatial distribution of predicted PM2.5 concentrations in the Beijing–Tianjin–Hebei region for each quarter and month in 2020. The arrangement of this figure seems unusual. Please separate the seasons from the months for better clarity and readability.
The conclusion is short, and the limitations of this study are not discussed in this section. Also, the future direction of this research is not stated and should be discussed in detail in the conclusion section.
Author Response
Please see the attachment.
Round 2
Reviewer 2 Report
Comments and Suggestions for Authors
Thank you for your time. I have no further comments.
Author Response
Thank you for your time and for the positive feedback.
We appreciate your confirmation that no further comments remain.
All suggested changes have been incorporated into the revised manuscript.
Reviewer 3 Report
Comments and Suggestions for Authors
The changes made to this manuscript are not shown in the PDF file, which makes it difficult to assess whether they are appropriate.
In addition, it seems that the authors did not address some of the concerns raised by the reviewer in the previous report. The introduction section is still very long, at 2.5 pages, and the paragraphs are unusually long, which makes it difficult to read.
Figure 1 still does not show the types of monitoring stations, which is unacceptable.
Also, the authors' explanation for the short observation time (only 1 year of data) for an air quality study remains unconvincing and does not align with other literature. The RMSE and MAE remain high, and the authors' explanation for this remains questionable.
The biggest issue lies in the discussion and conclusion sections. It is not acceptable to use bullet points in these sections. Please integrate the points into paragraphs and present them in an easy-to-understand way.
Author Response
Please see the attachment. We use highlighting and red text to mark the revised sections of the paper: yellow highlighting for the first round of revisions and red text for the second round of revisions.
Author Response File:
Author Response.pdf
