A Proposed Deep Learning Framework for Air Quality Forecasts, Combining Localized Particle Concentration Measurements and Meteorological Data
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
Overall evaluation:
(1) Main contribution:
The authors use a sliding-window feedforward model and a Gated Recurrent Unit (GRU)-based network to investigate correlations between PMs and meteorological parameters to make long-term air quality forecasts in Ioannina, Greece. At the beginning of their paper, they promise significant improvements in predictive accuracy, generalisation, and robustness to noise.
The structure of the paper is logical and coherent, thus permitting the reader to follow the argument smoothly. The introduction does well in placing the problem in context with an explanation of the relevance of the research.
(2) Defects of the paper:
What is the aim of this paper? Is it to forecast air quality, to test new methods and models, or to compare different approaches to solving a problem? If other works are compared, a comparative table should be inserted to support the analysis.
(3) Minor mistakes:
Figures 2, 3 and 4 are difficult to read. The authors might find another solution to present the information from those tables.
(4) Questions:
a) How did you obtain the dataset? Who provided the dataset?
You used data from the National Meteorological Agency. Why didn't you use the data provided by the National Environmental Agency? Did you use data for PMs from a low-cost sensor network? Which network? The name of this network should be mentioned in your paper. Can you say something about calibrating those sensors used to measure PM concentrations? Or about the method used to make measurements? Usually, low-cost sensors use the laser scattering method, but the gravimetric method is recognised as the standard. How many low-cost sensors did you use? Where are they located? How did you choose them? The frequency of data provided by low-cost sensors is very good (half a minute to one minute). Why do you have a frequency of one hour?
b) Which are the novelty elements of this paper?
c) What types of sensors are included in the IoT?
d) Why did you choose to study PM1 and PM4? The WHO or EEA makes no recommendations for these PM concentrations.
e) The authors mentioned that they used multiple data sources, but in reality, they used only two sources (IoT and the national meteorological agency). Is it right?
f) The authors mentioned using 26,000 complete measurements (from January 2021 to December 2023). From my calculations, 24 (hours) × 365 (days) × 3 (years) = 78,840 measurements. This means that the authors used only 1/3 of the data. Why?
(5) Innovation elements:
The authors applied two types of models (edge computing model - Slide NN and cloud computing model - GRU), and even in the abstract, they said they would also use hybrid models. What results did you obtain using the hybrid model?
Please emphasise the novelty elements by comparing your results with other papers on the same subject.
The authors said that both models showed significant improvements in predictive accuracy, generalisation, and robustness to noise, but did not say compared to which models.
Overall recommendation: The paper can be accepted after major revisions, with the identified issues addressed. The authors are encouraged to consider the feedback provided and make the necessary adjustments to improve the clarity, significance, and objectivity of their paper.
Author Response
(1) Main contribution:
The authors use a sliding-window feedforward model and a Gated Recurrent Unit (GRU)-based network to investigate correlations between PMs and meteorological parameters to make long-term air quality forecasts in Ioannina, Greece. At the beginning of their paper, they promise significant improvements in predictive accuracy, generalisation, and robustness to noise.
The structure of the paper is logical and coherent, thus permitting the reader to follow the argument smoothly. The introduction does well in placing the problem in context with an explanation of the relevance of the research.
Response: Thank you for taking the time to review our manuscript. Here, we present our responses and amendments made following your comments.
Comment 1: (2) Defects of the paper:
What is the aim of this paper? Is it to forecast air quality, to test new methods and models, or to compare different approaches to solving a problem? If other works are compared, a comparative table should be inserted to support the analysis.
Response: The main aim of this paper is now stated in a paragraph of the Introduction section (lines 146-154), and a comparative table of the strongest models of this study is included in the Discussion section (Table 10), together with text explaining its contents (lines 832-860).
Comment 2: (3) Minor mistakes:
Figures 2, 3 and 4 are difficult to read. The authors might find another solution to present the information from those tables.
Response: Figures 2, 3, and 4 have been amended and are now much easier to read.
Comment 3: (4) Questions:
- a) How did you obtain the dataset? Who provided the dataset?
You used data from the National Meteorological Agency. Why didn't you use the data provided by the National Environmental Agency? Did you use data for PMs from a low-cost sensor network? Which network? The name of this network should be mentioned in your paper. Can you say something about calibrating those sensors used to measure PM concentrations? Or about the method used to make measurements? Usually, low-cost sensors use the laser scattering method, but the gravimetric method is recognised as the standard. How many low-cost sensors did you use? Where are they located? How did you choose them? The frequency of data provided by low-cost sensors is very good (half a minute to one minute). Why do you have a frequency of one hour?
Response: A comprehensive description of the origin of the data and of its elements, which answers the questions above, has been added at the beginning of subsection 2.3, Data Collection and Preprocessing.
Comment 4: b) Which are the novelty elements of this paper?
Response: A Discussion section has been added that highlights the novel elements of this paper (cloud–edge computing Deep Learning models) and the models' applicability in different environments.
Comment 5: c) What types of sensors are included in the IoT?
Response: The types of devices (stations) used in the IoT have also been included and appropriately discussed at the beginning of subsection 2.3, Data Collection and Preprocessing (lines 321-350).
Comment 6: d) Why did you choose to study PM1 and PM4? The WHO or EEA makes no recommendations for these PM concentrations.
Response: Appropriate paragraphs have been added to the Introduction section (see lines 68-93) supporting our choice of studying PM1 and PM4.
Comment 7: e) The authors mentioned that they used multiple data sources, but in reality, they used only two sources (IoT and the national meteorological agency). Is it right?
Response: The necessary changes have been made: the sources of the data extracted for model training are now discussed at the beginning of Section 2.3, Data Collection and Preprocessing (first and last paragraphs).
Comment 8: f) The authors mentioned using 26,000 complete measurements (from January 2021 to December 2023). From my calculations, 24 (hours) × 365 (days) × 3 (years) = 78,840 measurements. This means that the authors used only 1/3 of the data. Why?
Response: The exact number of measurements and the exact dates have been corrected and can be found in Section 2.3 (line 351 for the dates and lines 321-324 for the measurements). However, we did not use only a third of the data: the calculation gives 24 × 365 × 3 = 26,280 hourly values, interpolated to minute intervals.
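The arithmetic in this exchange can be checked directly. The short sketch below counts the hourly slots in the period quoted in the comment (January 2021 to December 2023, none of which is a leap year) and compares the count with both figures. The exact start and end instants are assumptions, since the corrected dates are given only in the manuscript.

```python
from datetime import datetime

# Assumed period boundaries, taken from the reviewer's wording
# ("January 2021 to December 2023"); the manuscript's corrected dates may differ.
start = datetime(2021, 1, 1)
end = datetime(2024, 1, 1)  # exclusive upper bound

# Number of whole hours in the period.
hours = int((end - start).total_seconds() // 3600)

# The product written out in the comment and in the response.
by_formula = 24 * 365 * 3

print("hourly slots in period:", hours)
print("24 * 365 * 3          :", by_formula)
print("matches 78,840?       :", by_formula == 78_840)
```

Under these assumptions the count of hourly slots agrees with the product 24 × 365 × 3, which supports the figure given in the response rather than the one in the comment.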
Comment 9: (5) Innovation elements:
The authors applied two types of models: an edge computing model (Slide NN) and a cloud computing model (GRU). Even in the abstract, they stated that they would also use hybrid models. What results did you obtain using the hybrid model?
Response: The abstract has been amended to state clearly what the hybrid model is. The hybrid model is the one discussed in the cloud computing scenario (Scenario II). It is a hybrid GRU-NN architecture in which, for larger numbers of internal GRU cells (>64), the GRU connects to a sub-network with decreasing neuron counts. The findings for this model are detailed in the experimental results section (Section 3.4) and elaborated further in the newly added Discussion section, where the benefits of hybridization are examined and the approach is shown to outperform the other candidate models.
Comment 10: Please emphasise the novelty elements by comparing your results with other papers on the same subject.
Response: A comparative review of the results of this study has been added to the Discussion section of this paper (lines 861-891).
Comment 11: The authors said that both models showed significant improvements in predictive accuracy, generalisation, and robustness to noise, but did not say compared to which models.
Response: The advantages of each model, as well as its improvements and achievements relative to other similar models, are now described in several places in the Discussion section (lines 793-814).
Overall recommendation: The paper can be accepted after major revisions, with the identified issues addressed. The authors are encouraged to consider the feedback provided and make the necessary adjustments to improve the clarity, significance, and objectivity of their paper.
Response: Thank you for your time and effort in reviewing our manuscript, providing your overall recommendation, and offering your comments. We have made the necessary amendments based on your observations. A PDF document with track changes from our previous manuscript version is also attached.
Author Response File: Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
The topic is relevant. The literature review nicely sets the scene for Greece. Suggestion: the literature considered could also cover the international situation, as the problem is global. This could broaden the applicability of the paper. The selection of models and training seems appropriate. The figures should be referred to and explained in the text. "Standardisation as the crucial step" should be explained in more detail. Data from one city were used. Would the results differ if applied to cities like Athens? If not, what accuracy would you expect? What is the potential of the model for transfer to other cities? What would be the restrictions on doing so? Ioannina is quite small and might have specifics. Are the results good enough to be used, e.g., by the municipality to broadcast recommendations, to reduce traffic?
Author Response
Response: Thank you for your time and effort in reviewing our manuscript. Below, we present our responses and the amendments made in response to your comments.
Comment 1: The topic is relevant. Literature review nicely sets the scene for Greece. Suggestion: the focus of literature considered could include the situation internationally, as the problem is more global. This could enhance the application of the paper.
Response: A broader global scope of the situation has been incorporated, with relevant paragraphs added to emphasize a more comprehensive perspective in the literature. These can be found in the Introduction section, just below the literature review of Greece (lines 49-67).
Comment 2: Selection of models and training seems appropriate. The figures should be referred to from the text and explained in the text.
Response: Every figure has now been introduced and referenced from the text when needed, as well as explained in the text right above it.
Comment 3: "Standardisation as the crucial step" should be explained in more detail.
Response: The claim, "Standardizing the output proved to be a crucial step" has been further analyzed with an explanatory text and a table (lines 404-428 and Table 4).
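Standardizing the model output and mapping predictions back to physical units can be sketched as follows. This is a minimal illustration assuming z-score standardization with statistics computed from the training targets only. The function names and the PM values are hypothetical and are not taken from the manuscript.

```python
from statistics import mean, pstdev

def fit_standardizer(train_targets):
    """Return (mean, std) of the training targets; hypothetical helper."""
    mu = mean(train_targets)
    sigma = pstdev(train_targets)
    if sigma == 0:
        raise ValueError("targets are constant; cannot standardize")
    return mu, sigma

def standardize(values, mu, sigma):
    """Map values to zero mean and unit variance using the given statistics."""
    return [(v - mu) / sigma for v in values]

def destandardize(values, mu, sigma):
    """Map standardized values (e.g. model predictions) back to original units."""
    return [v * sigma + mu for v in values]

# Hypothetical PM concentrations used as training targets.
train_pm = [12.0, 18.0, 30.0, 24.0, 16.0]

mu, sigma = fit_standardizer(train_pm)
z = standardize(train_pm, mu, sigma)
restored = destandardize(z, mu, sigma)

print("mean, std of targets:", mu, sigma)
print("standardized targets:", z)
```

Errors computed on standardized outputs are in units of standard deviations, so predictions would normally be passed through `destandardize` before reporting RMSE or MAE in physical units.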
Comment 4: Data from one city were used. Would the results differ if applied to cities like Athens? If not, what accuracy would you expect? What is the potential of the model for transfer to other cities? What would be the restrictions on doing so? Ioannina is quite small and might have specifics. Are the results good enough to be used, e.g., by the municipality to broadcast recommendations, to reduce traffic?
Response: An appropriate Discussion section has been added to emphasize the proposed models' experimental findings and applicability. Our response to your comment is in the last paragraph of this section.
English Comment: The English could be improved to more clearly express the research.
Response: Appropriate amendments have been made throughout our manuscript to improve the English in which this research is presented. A PDF document with track changes from our previous manuscript version is also attached.
Author Response File: Author Response.pdf
Reviewer 3 Report
Comments and Suggestions for Authors
1 Hyperparameters such as the learning rate (e.g., 0.0008), epoch numbers (400 vs. 50), and batch size (32) appear arbitrarily chosen. Provide rationale for these selections, ideally supported by ablation or sensitivity analysis to ensure optimality and reproducibility.
2 Results rely solely on RMSE, MSE, and MAE metrics without confidence intervals or significance testing. You should include standard deviations or error bars over multiple runs and/or statistical tests (e.g., t-tests) to validate the superiority of GRU models.
3 SlideNN was trained for 400 epochs compared to 25–50 epochs for GRUs, which may indicate overfitting. Present training vs. validation loss curves for each model and comment on overfitting behavior.
4 Although normalization is discussed, the prevention of data leakage (especially during temporal slicing) is not sufficiently elaborated.
5 All experiments are based on data from Ioannina only. This limits generalizability. At minimum, include a discussion on how transferable the model is to other cities with different pollution profiles.
6 The hybrid GRU-NN model is said to outperform others in cloud settings, but the individual contributions of GRU and NN components are not separated. Include an ablation study comparing GRU-only vs. GRU-NN architectures to validate the added value of hybridization.
Author Response
Response: Thank you for taking the time and effort to review our manuscript. Below, we present our responses and the amendments made in response to your comments. A PDF document with track changes from our previous manuscript version is also attached.
Comment 1: Hyperparameters such as the learning rate (e.g., 0.0008), epoch numbers (400 vs. 50), and batch size (32) appear arbitrarily chosen. Provide rationale for these selections, ideally supported by ablation or sensitivity analysis to ensure optimality and reproducibility.
Response: A detailed explanation about the reasoning behind the choice of every hyperparameter has been included in various parts of this paper, specifically:
- For the learning rate, which has also been corrected to 0.0001: subsection 2.4, lines 511-524
- For the epochs: slideNN, subsection 3.2, lines 596-607; GRU models, lines 642-645
- For the batch size: Discussion section, lines 815-824
Comment 2: Results rely solely on RMSE, MSE, and MAE metrics without confidence intervals or significance testing. You should include standard deviations or error bars over multiple runs and/or statistical tests (e.g., t-tests) to validate the superiority of GRU models.
Response: The Metrics section has been updated, adding t-tests and Cohen's d effect measures. Tables 5, 6, and 8 have been updated, and a descriptive comparative paragraph has been added under each Table indicating significance testing of the models' results.
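For readers unfamiliar with the two quantities named in this response, the sketch below computes a t statistic and Cohen's d for the errors of two models over repeated runs. It assumes Welch's unequal-variance form of the t-test and the pooled-standard-deviation form of Cohen's d; the manuscript may use different variants. The RMSE values are hypothetical and serve only to make the example runnable.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic and its approximate degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

def cohens_d(a, b):
    """Cohen's d using the pooled sample standard deviation."""
    na, nb = len(a), len(b)
    pooled = sqrt(((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2))
    return (mean(a) - mean(b)) / pooled

# Hypothetical RMSE values from five repeated runs of two models.
rmse_model_a = [4.1, 4.3, 4.0, 4.2, 4.4]
rmse_model_b = [3.6, 3.8, 3.5, 3.7, 3.9]

t_stat, df = welch_t(rmse_model_a, rmse_model_b)
d = cohens_d(rmse_model_a, rmse_model_b)

print("t statistic:", t_stat)
print("degrees of freedom:", df)
print("Cohen's d:", d)
```

The t statistic indicates whether the difference in mean error is distinguishable from run-to-run noise, while Cohen's d expresses how large that difference is in units of the pooled standard deviation. A p-value additionally requires the t distribution; `scipy.stats.ttest_ind(a, b, equal_var=False)` is one way to obtain it.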
Comment 3: SlideNN was trained for 400 epochs compared to 25–50 epochs for GRUs, which may indicate overfitting. Present training vs. validation loss curves for each model and comment on overfitting behavior.
Response: The instance of overfitting for the slideNN model has now been addressed and can be found in the Experimental Results section of Scenario I. Figures containing training vs. validation loss curves have also been included for each model, and discussion of overfitting behavior has been added for each model as well.
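The kind of check that training and validation loss curves make possible can be sketched as follows. The example locates the epoch with the lowest validation loss and the epoch at which a patience-based early-stopping rule would trigger. The loss values, the patience of 3 epochs, and the function names are all hypothetical; they do not reproduce the curves or settings of the manuscript.

```python
def best_epoch(val_loss):
    """0-based index of the epoch with the lowest validation loss."""
    return min(range(len(val_loss)), key=val_loss.__getitem__)

def early_stop_epoch(val_loss, patience=3):
    """First epoch at which validation loss has failed to improve for
    `patience` consecutive epochs, or None if that never happens."""
    best = float("inf")
    waited = 0
    for epoch, loss in enumerate(val_loss):
        if loss < best:
            best, waited = loss, 0
        else:
            waited += 1
            if waited >= patience:
                return epoch
    return None

# Hypothetical curves: training loss keeps falling while validation
# loss turns upward, the usual signature of overfitting.
train_loss = [1.00, 0.70, 0.50, 0.40, 0.33, 0.28, 0.24, 0.21]
val_loss = [1.05, 0.80, 0.62, 0.55, 0.56, 0.58, 0.61, 0.65]

best = best_epoch(val_loss)
stop = early_stop_epoch(val_loss, patience=3)
final_gap = val_loss[-1] - train_loss[-1]

print("best validation epoch:", best)
print("early stopping would trigger at epoch:", stop)
print("final validation-training gap:", round(final_gap, 2))
```

A widening gap after the best validation epoch suggests that further epochs mainly fit noise in the training set, which is the behaviour the reviewer asked the authors to rule out for the 400-epoch run.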
Comment 4: Although normalization is discussed, the prevention of data leakage (especially during temporal slicing) is not sufficiently elaborated.
Response: An appropriate paragraph was added to the Data Collection and Preprocessing subsection (2.3.1, lines 377-385) to discuss the prevention of data leakage, particularly during temporal slicing.
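One common way to avoid leakage with sliding windows is sketched below: split the series chronologically first, compute the standardization statistics on the training portion only, and build windows separately inside each split so that no window straddles a boundary. This is an illustrative procedure under those assumptions, not necessarily the one described in the manuscript; the split sizes, window length, and function names are hypothetical.

```python
from statistics import mean, pstdev

def chronological_split(series, n_train, n_val):
    """Split in time order: earliest samples for training, latest for testing."""
    train = series[:n_train]
    val = series[n_train:n_train + n_val]
    test = series[n_train + n_val:]
    return train, val, test

def scale(values, mu, sigma):
    """Standardize values with statistics supplied by the caller."""
    return [(v - mu) / sigma for v in values]

def make_windows(series, window, horizon=1):
    """Build (inputs, target) pairs; the target lies `horizon` steps
    after the last value of each input window."""
    pairs = []
    for s in range(len(series) - window - horizon + 1):
        inputs = series[s:s + window]
        target = series[s + window + horizon - 1]
        pairs.append((inputs, target))
    return pairs

# Hypothetical hourly series; real data would be PM and weather readings.
series = [float(v) for v in range(20)]
train, val, test = chronological_split(series, n_train=12, n_val=4)

# Statistics come from the training split only, then are reused unchanged.
mu, sigma = mean(train), pstdev(train)

train_pairs = make_windows(scale(train, mu, sigma), window=3)
val_pairs = make_windows(scale(val, mu, sigma), window=3)
test_pairs = make_windows(scale(test, mu, sigma), window=3)

print(len(train_pairs), len(val_pairs), len(test_pairs))
```

Shuffling before splitting, or fitting the scaler on the full series, would let information from later timestamps reach the training set, which is the leakage the reviewer's comment refers to.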
Comment 5: All experiments are based on data from Ioannina only. This limits generalizability. At minimum, include a discussion on how transferable the model is to other cities with different pollution profiles.
Response: An appropriate Discussion section has been added to emphasise the proposed models' experimental findings and applicability. Our response to your comment is in the last paragraph of this section.
Comment 6: The hybrid GRU-NN model is said to outperform others in cloud settings, but the individual contributions of GRU and NN components are not separated. Include an ablation study comparing GRU-only vs. GRU-NN architectures to validate the added value of hybridization.
Response: The added value of hybridisation has been explained by analysing the individual benefits of each component. The comparison of the GRU-only model with the hybrid architecture has also been expanded with new paragraphs in the Discussion section (lines 825-844).
Author Response File: Author Response.pdf
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
I am satisfied with the improvements brought by the authors to the paper.
The authors have considered the recommendations, adding information where necessary, including updates to the bibliography and improving the quality of some figures for the reader's sake.
There are a few minor errors regarding references (the journal is not written in abbreviation form) and the colour used to indicate some figures (blue instead of black) in the text. Still, I believe the authors will correct them before publication.
The recommendation is for publishing.
Author Response
I am satisfied with the improvements brought by the authors to the paper.
The authors have considered the recommendations, adding information where necessary, including updates to the bibliography and improving the quality of some figures for the reader's sake.
Response: Thank you for taking the time to review our manuscript.
Comment 1: There are a few minor errors regarding references (the journal is not written in abbreviation form) and the colour used to indicate some figures (blue instead of black) in the text. Still, I believe the authors will correct them before publication.
Response: The bibliography has been reviewed, and amendments have been made. Figures 1 and 5 have been amended (the blue text color has been changed to black, and issues with the blue background color have been resolved).