Article
Peer-Review Record

Towards Digitalization for Air Pollution Detection: Forecasting Information System of the Environmental Monitoring

Sustainability 2025, 17(9), 3760; https://doi.org/10.3390/su17093760
by Kyrylo Vadurin 1, Andrii Perekrest 1, Volodymyr Bakharev 1, Vira Shendryk 2,*, Yuliia Parfenenko 2 and Sergii Shendryk 3
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 19 March 2025 / Revised: 17 April 2025 / Accepted: 18 April 2025 / Published: 22 April 2025
(This article belongs to the Special Issue Environmental Pollution and Impacts on Human Health)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

The manuscript should incorporate a complete analysis of the state of the art in air quality modelling, considering current recommendations, legislation, tools, etc. Please also clarify the units and how the information is presented in the manuscript, and provide a complete performance evaluation analysis.

Comments for author File: Comments.pdf

Author Response

1. Summary

Thank you very much for taking the time to review this manuscript. Please find the detailed responses below; the corresponding revisions and corrections are highlighted in yellow in the re-submitted files.

2. Point-by-point response to Comments and Suggestions for Authors

Comments 1: Suggestion: include references to the new Ambient Air Quality Directive (AAQD), Directive (EU) 2024/2881.

Response 1: Thank you for pointing this out. A reference to the directive was added.

Comments 2: This is not true in the case of air quality (AQ). In Europe specifically, there are many air quality forecast systems, and many of them have good representativeness and accuracy. The authors do not explain anything about FAIRMODE, CAMS, etc.

Response 2: The text was revised to add the context of using the information system in financially underserved municipalities in Ukraine.

Comments 3: Does GIS allow us to forecast, or does GIS help us represent the outputs of modelling tools? Please clarify.

Response 3: Changes to the GIS description were made.

Comments 4: The authors should refer to data-based models and to one specific type of them, machine learning, after first indicating that both deterministic models and data-based models exist.

Response 4: Information about machine learning models was added.

Comments 5: Please distinguish references between applications for the atmosphere and applications for pollutants.

Response 5: An extended description of the content of the original sources is provided.

Comments 6: If the application is air pollution, I don't understand why the authors discuss hydrodynamic models, climate models, etc. Please focus on the aim of the research.

Response 6: Clarifications regarding the information system have been provided.

Comments 7: How is the interpolation done, and what is it based on? What parameters do you consider when interpolating pollutant concentrations?

Response 7: Clarification regarding interpolation has been provided.

Comments 8: In the methodology, the authors do not explain the selected episode, the study area and its characteristics, or the features of the air quality data (sensors, location, pollutant species, data uncertainty, etc.). All of this should be added to the section.

Response 8: Contextual information about the data used for the study was added to line 481.

Comments 9: Please change the sentence. Do not use "he/she"; "administrator" is better.

Response 9: The administrator is indicated instead of a pronoun.

Comments 10: How is legislation taken into account in the system? When is this information introduced? Please clarify.

Response 10: Information on considering legislation in the system has been provided.

Comments 11: What is dust concentration? PM10? PM2.5? Units? What is MPC?

Response 11: An explanation about dust, its concentration, and the calculation of MPC has been added.

Comments 12: How representative is this image? Air quality is affected mainly by meteorological conditions and emission sources. It is not good practice to apply interpolation without considering the emission sources in the region. This image is only a mathematical exercise in which the pollutant and the units are not correctly defined.

Response 12: The image has been clarified, and its description has been added.

Comments 13: Which is the threshold value? Based on what legislation?

Response 13: The graph is plotted in multiples of the maximum permissible concentration (MPC) defined by current legislation, so the limit line passes through 1. A multiple of the MPC is the ratio obtained by dividing the measured value of the pollutant in mg/m3 by the standard that it should not exceed.
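The normalization described in Response 13 can be sketched in a few lines of Python. This is only an illustration: the function names, the MPC value of 0.5 mg/m3, and the readings are hypothetical placeholders, not values from the manuscript or from legislation.

```python
def mpc_multiple(concentration_mg_m3: float, mpc_mg_m3: float) -> float:
    """Express a concentration as a dimensionless multiple of the MPC."""
    if mpc_mg_m3 <= 0:
        raise ValueError("MPC must be positive")
    return concentration_mg_m3 / mpc_mg_m3


def exceeds_limit(concentration_mg_m3: float, mpc_mg_m3: float) -> bool:
    """True when the value lies above the limit line, i.e. the multiple is > 1."""
    return mpc_multiple(concentration_mg_m3, mpc_mg_m3) > 1.0


# Hypothetical values for illustration only (not taken from the manuscript).
ASSUMED_MPC = 0.5            # mg/m3, placeholder standard
readings = [0.25, 0.5, 1.0]  # mg/m3, placeholder measurements

multiples = [mpc_multiple(c, ASSUMED_MPC) for c in readings]
flags = [exceeds_limit(c, ASSUMED_MPC) for c in readings]
```

Because every pollutant is divided by its own standard, series for different substances share one axis and one limit line at 1.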

Comments 14: Could you explain the seasonality of the results? From a meteorological point of view, it is very difficult to understand.

Response 14: The first version of the graph was built from synthetic values during system testing. Figure 16 has been replaced with readings from a real stationary monitoring post: Post No. 4, Shevchenko St., 22/30, Kremenchuk. The new graph shows a certain seasonality, and a downward trend in dust pollution during 2019-2024 is also clearly visible, most likely due to the partial closure of local industrial enterprises.

Comments 15: This method is not recommended for validating AQ forecasts. Please apply the methods recommended by, for example, FAIRMODE. If you develop an air quality forecasting system, a complete performance evaluation analysis is required.

Response 15: Part of the information about the correlation matrix near Figure 18 has been removed. Correlation was not used to validate the forecast results. For validation, the mean squared error (MSE) was used as the main indicator; it guided the optimization and the selection, among the candidate approaches, of the one providing the highest forecast accuracy.
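For reference, the MSE over n held-out forecast points is conventionally defined as below. The symbols are notation assumed here, not taken from the manuscript: y_t is the observed value and \hat{y}_t is the forecast for the same time step.

```latex
\mathrm{MSE} = \frac{1}{n} \sum_{t=1}^{n} \left( y_t - \hat{y}_t \right)^2
```

A lower MSE indicates a forecast closer to the observations, so choosing the approach with the smallest MSE corresponds to choosing the highest forecast accuracy in this sense.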

Comments 16: Please define the acronyms and add a section listing them.

Response 16: An explanation of MSE was added.

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

The study proposes a forecasting system for air pollution and introduces the associated data treatment approaches. However, the research line is not clearly shown, and some mistakes need to be corrected. Several details are listed below.

1. The abstract should be rephrased and supplemented with the results and conclusions of the research.

2. The format of the equations and related elements should be double-checked.

3. Provide a table of the models used and their related functions.

4. The analysis results focus on dust concentration. What is the data source? Explain how the analysis differs for other environmental impact indices.

5. The manuscript is not well written. The subtitles in the current version are difficult to follow and should be revised.

6. Provide a basic research line figure.

Comments on the Quality of English Language

Can be understood.

Author Response

1. Summary

Thank you very much for taking the time to review this manuscript. Please find the detailed responses below; the corresponding revisions and corrections are highlighted in green in the re-submitted files.

2. Point-by-point response to Comments and Suggestions for Authors

Comments 1: The abstract should be rephrased and supplemented with the results and conclusions of the research.

Response 1: The abstract has been revised to focus on the results and conclusions of the study.

Comments 2: The format of the equations and related elements should be double-checked.

Response 2: The format of the formulas was checked.

Comments 3: Provide a table of the models used and their related functions.

Response 3: A description of the data and a table of the MSE values obtained with the BATS and ARIMA methods when forecasting various substances were added. The conclusion drawn from these results is given below the table. The passage starting at line 923 was changed, because the earlier results were based on forecasting from 01-01-2022, which at this stage of the work was found to be insufficient for accurate retrospective analysis.

Comments 4: The analysis results focus on dust concentration. What is the data source? Explain how the analysis differs for other environmental impact indices.

Response 4: Explanations of figures and information about data sources were added.

Comments 5: The manuscript is not well written. The subtitles in the current version are difficult to follow and should be revised.

Response 5: Some of the subtitles have been changed.

Comments 6: Provide a basic research line figure.

Response 6: The image describing the general structure of the study has been changed.

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

This manuscript presents a prototype of the Environmental Monitoring Information System, which predicts, in automated mode, the basic characteristics of atmospheric air pollution based on available measurements and advanced machine learning methodology, spatial analysis and visualization algorithms. There is no doubt that the topic is highly relevant, as air pollution remains one of the biggest ecological challenges. The authors rightly believe that the task of spatio-temporal prediction of air pollution is rather typical and ideally should be automated and standardized so that it can be solved for different locations where observational data series exist. They suggest using standard statistical models such as ARIMA and BATS for this purpose. The concept of an information system that would cover the whole chain from data processing to forecasting and recommendations is very interesting and deserves attention from both researchers and decision makers working in the field of environmental protection and ecology. The manuscript contains a detailed explanation of the proposed concept and gives an example of the application of the system. However, in my opinion, it is still raw and should be substantially modified and rewritten for publication.

The strongest part of the paper is the concept of an information system, including its mathematical basis. At the same time, more attention should be paid to the other aspects that are mentioned. Obviously, the authors use some data on air composition to illustrate the possibilities of the system. At least basic information about these data should be clearly stated in Section 2. Correspondingly, the illustrations in Section 3 should contain clear information about what is presented in the figures and described in the text.

Many questions arise when reading the parts on air pollution. It should be explained why exactly this set of key pollutants has been chosen (dust, sulphur and nitrogen dioxides, formaldehyde) or (if this isn't the case) to represent them in a more general way. For example, can ozone or carbon monoxide be used as key pollutants? Is it in principle preferable to use dust instead of particulate matter PM (which is a more general and regularly observed parameter)? Is the system available for any time resolution (e.g. most modern analysers provide measurements with time resolution within 1 min)? Should such high resolution data be averaged by the system or processed in its original form? Accordingly, which MPC values - single, hourly or daily - are used in the system to make analyses, forecasts and recommendations?

In Section 4, the authors compare the accuracy of forecasts made using ARIMA and BATS models, without providing any initial information about the results of the calculations. This is completely pointless. They should at least be included in the table with basic statistics indicating the robustness of the forecast calculations.

Finally, I strongly recommend proofreading the manuscript to improve the English. Also, some parts of the Introduction and Results contain too detailed descriptions of rather obvious things that are redundant for a scientific paper and can be shortened.

More specific comments are given below

Remarks:

Abstract - The abstract is too general (especially the first half) and lacks essential information. It should be modified.

line 51-54 - repeating "engaged" in 1 sentence

line 73 - What nitrogen do you mean? N2? NOx? Is it specifically dust or particulate matter in general? The statement sounds very shallow.

lines 73-75 controversial conclusion - I would recommend to omit it. Besides it contains repeating of "leads to" in 1 sentence

Introduction - generally it contains rather basic or trivial statements which can be shortened or even omitted. The first 3 passages need to have more references to support at least the most important statements.

lines 175-176 - it is better to explain abbreviations like IDW and MPC at first mention.

Figure 1 -

- As I understand it, the 1st box (Environmental Monitoring and Forecasting System) contains the title of the whole system. Then it needs to be shown as a title without an outgoing arrow. I would place it as the name of the figure.

- Why is the imperative used in most of the gray boxes and above them? I would replace all of them with nouns (for instance, selection, application, prediction, analysis, etc.). All stages and operations should be written in a single style

lines 228-229 - Why has such a list of key pollutants been chosen?

lines 232-233 - Which MPCs (single, hourly, daily...) are taken into account?

line 248 - I would use 'sites' instead of 'points' here and elsewhere in the manuscript when "point" means "station". Why are only background points (sites with lowest pollution if I understood correctly) recommended?

line 257 - What is "addresses" here? Postal addresses or coordinates?

line 266 - It seems to me that it is better to express "Exceedance Count" as a variable. What is GDK?

line 287 - extra dot after 1. It is more logical to place the description of Gj at point 2, as it appears there.

line 291 - How (in what units) is time going to be expressed?

line 297 - "On" should be replaced with "at"

line 301 - It seems that this sentence should be deleted.

line 303 - Does "limit concentration" mean MPC or something else?

line 309 - "якщо" is likely to be replaced by "if"

line 336 - This sub-title sounds weird. I would recommend "Methods for predicting changes in concentrations of air pollutants" or something like that

line 394-395 - The sentence "For each of the selected pairs of variables, graphs are constructed that allow us to evaluate their relationship" looks bad and needs to be rewritten

line 437-461 - These 2 passages should probably be moved to Section 2 "Materials and Methods".

line 449 - I suggest replacing "time slice" with "time period"

line 460 - What is UML modeling?

Figure 2 - As an option, I would replace the imperative with the third person of the verb (predicts, generates, uploads, etc.). This also applies to some of the later figures (5-11). It would probably be good (also for other figures) to have different avatars for different types of users.

Figure 3 - What is GDK (in figure and in the text)? How does "Exceedance" trigger "ObservationPost"? What does the 1 next to all the arrows mean?

line 582 - the phrase "when making smart decisions" can be omitted.

line 602-643 - In my opinion, it would be useful to gather all libraries and tools that are recommended for different purposes in one table.

Section 3.2 "Air pollution data analysis using information system" - There is a complete lack of information on the data used (name of the monitoring network, substances measured, time period and resolution). Even if the data are only used as an example, such information should be provided in section 2.

Figure 13 - The resolution of this figure is very low. At least the main settlements should be visible.

Figure 14 - What is "m." before Kremenchuk? I think it can be omitted.

line 690 - Again, I don't like this subtitle. It might be better something like "Analysis of air pollution MPC exceedance data".

Figure 15 - Is this the specific example? Then details should be shown

Figure 16 - Is this the specific example? Then details should be shown. What is the definition of trend in your context? Why does seasonality not correlate with the seasons? What does the term "seasonality" mean in this case?

line 721 - I don't think that statistics of MPC exceedances in the air together with graphs are enough to give a "holistic picture of the environmental situation".

Figure 17 - What is the point of showing a forecast without comparing it with real observations? Why not run the experiment with the previously obtained data series to demonstrate the forecast quality? What does the horizontal dotted line mean?

Figure 18 - As I see it, the picture shows the real situation near Poltava. This should be reflected in the caption. According to the scale on the right, the CO concentration in the area varies within 0.002% MPC. That is even less than the accuracy of a standard gas analyzer. Please check this.

Figure 19 - Please, put the details about figure into the caption. It should be like "Visualization of forecasted spatial distribution on example of ..."

line 818 - extra "p" after ARIMA

lines 811-825 - It is completely unclear where the values in question have been taken from, so there can be no understanding of what you are talking about. The discussion seems pointless. The experiment with BATS and ARIMA models as well as its results in the form of tables and/or graphs should be clearly described and presented before the discussion. The "Discussion" section should be entirely rewritten.

Comments on the Quality of English Language

I strongly recommend proofreading the manuscript to improve the English. Also, some parts of the Introduction and Results contain too detailed descriptions of rather obvious things that are redundant for a scientific paper and can be shortened.

Author Response

Response to Reviewer 3 Comments

1. Summary

Thank you very much for taking the time to review this manuscript. Please find the detailed responses below; the corresponding revisions and corrections are highlighted in turquoise in the re-submitted files.

2. Point-by-point response to Comments and Suggestions for Authors

Comments 1: The strongest part of the paper is the concept of an information system, including its mathematical basis. At the same time, it is necessary to pay more attention to other aspects when they are mentioned. Obviously, the authors use some data on air composition to illustrate the possibilities of the system. At least basic information about these data should be clearly stated in Section 2.

Response 1: The origin of the data is described in section 2.6.

Comments 2: Correspondingly, the illustrations in Section 3 should contain clear information about what is presented in the figures and described in the text.

Response 2: The descriptions of the figures have been supplemented, and the figures have been corrected.

Comments 3: It should be explained why exactly this set of key pollutants has been chosen (dust, sulphur and nitrogen dioxides, formaldehyde) or (if this isn't the case) to represent them in a more general way.

Response 3: An analysis of pollutant levels relative to the MPC showed that these pollutants regularly exceed the established standards, so it was decided to focus on them in order to predict future exceedances and allow local authorities to react in accordance with the forecasts.

Comments 4: For example, can ozone or carbon monoxide be used as key pollutants?

Response 4: Substances such as ozone or carbon monoxide can be used as key pollutants if they are regularly recorded exceeding the established standards or show a stable upward trend. The system can forecast any substance available in the retrospective data, but since the study assesses the state of a specific location, attention is focused on the selected set of substances.

Comments 5: Is it in principle preferable to use dust instead of particulate matter PM (which is a more general and regularly observed parameter)?

Response 5: This issue is quite complex for this work. The system is intended to further integrate automatic reference monitoring stations manufactured by Vaisala (models 420 and 530), in which particulate matter measurements are designated as PM and divided into PM10, PM2.5, and PM1, in units of μg/m3; the stations themselves report data at 1-minute intervals given a stable connection to the server. In municipal monitoring carried out by laboratories, substances are measured in accordance with Ukrainian environmental legislation; the main unit of measurement is mg/m3, and the measurement frequency is determined by the plan (once a day, or upon request). For this study, the municipal institution provided monthly averages in mg/m3. During the observation period, the laboratories did not separate particles into PM10, PM2.5, and PM1; they determined total solid particles without categorization.

Comments 6: Is the system available for any time resolution (e.g. most modern analysers provide measurements with time resolution within 1 min)?

Response 6: The system can currently accept data of any resolution, but it automatically averages the data to monthly values and fills missing months by linear interpolation between the two closest existing values. These limitations stem from the input data of the initial municipal monitoring, which was carried out by laboratories and submitted as monthly averages. The program modules are currently being improved so that the data resolution can be selected when working with data from automatic stations.
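The preprocessing described in Response 6 (averaging to monthly values, then linear interpolation between the two closest known months) could look roughly like the sketch below. It uses only the Python standard library; the function names and the sample readings are hypothetical and do not come from the authors' system.

```python
from collections import defaultdict
from datetime import date


def monthly_means(samples):
    """Average (date, value) samples into {(year, month): mean}."""
    buckets = defaultdict(list)
    for day, value in samples:
        buckets[(day.year, day.month)].append(value)
    return {key: sum(vals) / len(vals) for key, vals in buckets.items()}


def month_index(key):
    """Map (year, month) to a running month number so gaps can be measured."""
    year, month = key
    return year * 12 + (month - 1)


def fill_missing_months(means):
    """Linearly interpolate months missing between neighbouring known months."""
    known = sorted(means, key=month_index)
    filled = dict(means)
    for left, right in zip(known, known[1:]):
        gap = month_index(right) - month_index(left)
        for step in range(1, gap):
            idx = month_index(left) + step
            key = (idx // 12, idx % 12 + 1)
            weight = step / gap
            filled[key] = means[left] + weight * (means[right] - means[left])
    return dict(sorted(filled.items(), key=lambda item: month_index(item[0])))


# Hypothetical readings in mg/m3: two in January, none in February or March.
samples = [
    (date(2024, 1, 5), 0.2),
    (date(2024, 1, 20), 0.4),
    (date(2024, 4, 10), 0.6),
]
means = monthly_means(samples)
filled = fill_missing_months(means)
```

In this example January averages to about 0.3, and the missing February and March values fall on the straight line between January and April.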

Comments 7: Should such high resolution data be averaged by the system or processed in its original form?

Response 7: Currently, the system works with data in its native format, i.e., with the monthly average data provided by municipal laboratories. When Vaisala reference station data are integrated, the values are automatically averaged to monthly values. In the future, this part will be improved to provide short-term forecasts based on neural networks, predicting the readings of automatic stations at high temporal resolution.

Comments 8: Accordingly, which MPC values - single, hourly or daily - are used in the system to make analyses, forecasts and recommendations?

Response 8: The system uses daily MPC values.

Comments 9: In Section 4, the authors compare the accuracy of forecasts made using ARIMA and BATS models, without providing any initial information about the results of the calculations. This is completely pointless. They should at least be included in the table with basic statistics indicating the robustness of the forecast calculations.

Response 9: In Section 4, a table comparing the MSE of the ARIMA and BATS models has been added. The forecast was conducted using data from stationary posts in Kremenchuk from 2008 to 2024 with a forecast horizon of 5 months. The forecast horizon can be set by the administrator. Next to the graphs, the operator receives the expected MSE for each forecast in text form.
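A holdout comparison of the kind described in Response 9 can be sketched as follows. The two forecasters are deliberately simple stand-ins (a historical mean and a seasonal-naive rule), not ARIMA or BATS, and the series is synthetic; only the 5-month horizon and the use of MSE come from the response.

```python
def mse(actual, predicted):
    """Mean squared error between two equally long sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mean_forecast(history, horizon):
    """Stand-in model: repeat the historical mean."""
    level = sum(history) / len(history)
    return [level] * horizon


def seasonal_naive_forecast(history, horizon, period=12):
    """Stand-in model: repeat the value observed one season earlier."""
    start = len(history) - period
    return [history[start + (h % period)] for h in range(horizon)]


def backtest(series, models, horizon=5):
    """Hold out the last `horizon` points and score each model on them."""
    train, test = series[:-horizon], series[-horizon:]
    return {name: mse(test, model(train, horizon)) for name, model in models.items()}


# Synthetic monthly series with a perfect 12-month cycle (illustration only).
series = list(range(1, 13)) * 3
models = {"mean": mean_forecast, "seasonal_naive": seasonal_naive_forecast}

scores = backtest(series, models, horizon=5)
best = min(scores, key=scores.get)
```

On this synthetic series the seasonal-naive stand-in reproduces the held-out months exactly, so it is selected; with real data, the same loop would rank whichever models are plugged in.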

Comments 10: Abstract - Abstract is too general (especially first half) and lacks essential information.

Response 10: The abstract has been clarified.

Comments 11: line 51-54 - repeating "engaged" in 1 sentence

Response 11: The sentence has been changed

Comments 12: line 73 - What nitrogen do you mean? N2? NOx? Is it specifically dust or particulate matter in general? The statement sounds very shallow.

Response 12: It has been corrected.

Comments 13: lines 73-75 controversial conclusion - I would recommend to omit it. Besides it contains repeating of "leads to" in 1 sentence

Response 13: This was considered

Comments 14: Introduction - generally it contains rather basic or trivial statements which can be shortened or even omitted. The first 3 passages need to have more references to support at least the most important statements.

Response 14: A reference was added in the introduction.

Comments 15: lines 175-176 - it is better to explain abbreviations like IDW and MPC at first mention.

Response 15: Explanations of acronyms have been added.

Comments 16: Figure 1:

- As I understand it, the 1st box (Environmental Monitoring and Forecasting System) contains the title of the whole system. Then it needs to be shown as a title without an outgoing arrow. I would place it as the name of the figure.

- Why is the imperative used in most of the gray boxes and above them? I would replace all of them with nouns (for instance, selection, application, prediction, analysis, etc.). All stages and operations should be written in a single style

Response 16: Figure 1 has been modified.

Comments 17: lines 228-229 - Why has such a list of key pollutants been chosen?

Response 17: An explanation of the selection of pollutants for prediction was added.

Comments 18: lines 232-233 - Which MPCs (single, hourly, daily...) are taken into account?

Response 18: Daily MPC are taken into account.

Comments 19: line 248 - I would use 'sites' instead of 'points' here and elsewhere in the manuscript when "point" means "station". Why are only background points (sites with lowest pollution if I understood correctly) recommended?

Response 19: This approach is based on the methodology from the doctoral dissertation of V.S. Bakharev on determining background pollution points in urban agglomerations. Once the background pollution point is determined, data from more polluted stations, located where industrial facilities are concentrated, can be compared with it. The excess of a pollutant's concentration at such a station over the background value, together with approximate information about the emission standards of enterprises, makes it possible to identify the source of pollution.

Comments 20: line 257 - What is "addresses" here? Postal addresses or coordinates?

Response 20: In this case, we are talking about postal addresses where stationary monitoring posts are located. The system provides a separate parameter responsible for coordinates, which can also be used for sorting and filtering data. As for the integration of data of automatic systems, coordinates are mainly used for their analysis and grouping.

Comments 21: line 266 - It seems to me that it is better to express "Exceedance Count" as a variable. What is GDK?

Response 21: The transliterated abbreviation GDK has been replaced with MPC.

Comments 22: line 287 - extra dot after 1. It is more logical to place the description of Gj at point 2, as it appears there.

Response 22: The extra dot has been removed.

Comments 23: line 291 - How (in what units) is time going to be expressed?

Response 23: A possible list of time units supported by the system is indicated.

Comments 24: line 297 - "On" should be replaced with "at"

Response 24: It was replaced.

Comments 25: line 301 - It seems that this sentence should be deleted.

Response 25: It was deleted.

Comments 26: line 303 - Does "limit concentration" mean MPC or something else?

Response 26: MPC is specified.

Comments 27: line 309 - "якщо" is likely to be replaced by "if"

Response 27: It was replaced.

Comments 28: This sub-title sounds weird. I would recommend "Methods for predicting changes in concentrations of air pollutants" or something like that

Response 28: It was changed.

Comments 29: The sentence "For each of the selected pairs of variables, graphs are constructed that allow us to evaluate their relationship" looks bad and needs to be rewritten

Response 29: It was rewritten.

Comments 30: These 2 passages should probably be moved to Section 2 "Materials and Methods".

Response 30: The passage has been moved to section 2.

Comments 31: I suggest replacing "time slice" with "time period"

Response 31: It was changed.

Comments 32: line 460 - What is UML modeling?

Response 32: In this case, the concept of how the system works was represented in the form of UML diagrams. The sentence has been removed.

Comments 33: Figure 2 - As an option, I would replace the imperative with the third person of the verb (predicts, generates, uploads, etc.). This also applies to some of the later figures (5-11). It would probably be good (also for other figures) to have different avatars for different types of users.

Response 33: Expressions have been changed.

Comments 34: Figure 3 - What is GDK (in figure and in the text)? How does "Exceedance" trigger "ObservationPost"? What does the 1 next to all the arrows mean?

Response 34: An Exceedance triggers an Observation Post through a relationship labeled “triggers”. This means that when a certain pollution level is exceeded, it causes that event to be recorded or captured in the Observation Post.

The number “1” next to all arrows indicates the multiplicity of the relationship. In this context:

The number “1” next to the arrow from “Exceedance” to “Observation” means that each entry in the “Observation” (each observation event) is associated with one specific “Exceedance”. That is, each event of a certain level being exceeded is captured in the “Observation” data.

The number “1” next to the arrow from “Pollutant” to “Exceedance” means that each “Exceedance” is caused by one specific “Pollutant”.

Next to the arrow from "Observation" to "Pollutant", the number "1" means that each "Pollutant" can be associated with many (*) records in "Observation" (i.e., the same pollutant can be recorded at different points in time).

Next to the arrow from "MPC" to "Pollutant", the number "1" means that each "Pollutant" is defined by one specific "MPC" value (maximum permissible concentration).

Next to the arrow from "Observation" to "City", the number "1" means that each "Observation" occurs in one specific "City".

Next to the arrow from "Observation" to "Address" the number "1" means that each "Observation" occurs at one specific "Address".
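One possible reading of the multiplicities listed in Response 34 is sketched below with Python dataclasses. The class names follow the diagram entities mentioned in the response; all field names, the simplification of "City" and "Address" to plain strings, and the sample values are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class MPC:
    value_mg_m3: float


@dataclass(frozen=True)
class Pollutant:
    name: str
    mpc: MPC  # each Pollutant is defined by exactly one MPC value


@dataclass(frozen=True)
class Observation:
    pollutant: Pollutant  # many Observations may reference the same Pollutant
    city: str             # each Observation occurs in one City
    address: str          # each Observation occurs at one Address
    measured_on: date
    value_mg_m3: float


@dataclass(frozen=True)
class Exceedance:
    observation: Observation  # the record in which the exceedance was captured
    pollutant: Pollutant      # each Exceedance is caused by one Pollutant
    multiple_of_mpc: float


def detect_exceedance(obs: Observation) -> Optional[Exceedance]:
    """Return an Exceedance when the observed value is above the pollutant's MPC."""
    multiple = obs.value_mg_m3 / obs.pollutant.mpc.value_mg_m3
    if multiple > 1.0:
        return Exceedance(obs, obs.pollutant, multiple)
    return None


# Hypothetical data for illustration only.
dust = Pollutant("dust", MPC(0.5))
high = Observation(dust, "Kremenchuk", "Post No. 4", date(2024, 1, 1), 1.0)
low = Observation(dust, "Kremenchuk", "Post No. 4", date(2024, 2, 1), 0.25)

exceedance = detect_exceedance(high)
no_exceedance = detect_exceedance(low)
```

Holding a single object reference per field is how the "1" multiplicity appears in code, while the "many" side is simply any number of Observation records pointing at the same Pollutant.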

Comments 35: line 582 - the phrase "when making smart decisions" can be omitted.

Response 35: The phrase has been removed.

Comments 36: line 602-643 - In my opinion, it would be useful to gather all libraries and tools that are recommended for different purposes in one table.

Response 36: The tools and their purpose are provided in Table 1.

Comments 37: Section 3.2 "Air pollution data analysis using information system" - There is a complete lack of information on the data used (name of the monitoring network, substances measured, time period and resolution). Even if the data are only used as an example, such information should be provided in section 2.

Response 37: Information was added in section 2.6.

Comments 38: Figure 13 - The resolution of this figure is very low. At least the main settlements should be visible.

Response 38: Figure 13 has been replaced.

Comments 39: Figure 14 - What is "m." before Kremenchuk? I think it can be omitted.

Response 39: Figure 14 has been replaced in accordance with the updated visualization of the exceedance graph in the system. Here, colored bars show the pollution level at various stationary posts in the range from 2019-01-01 to 2024-09-01, and the maximum permissible concentration line runs through 1.

Comments 40: line 690 - Again, I don't like this subtitle. Something like "Analysis of air pollution MPC exceedance data" might be better.

Response 40: The title has been changed.

Comments 41: Figure 15 - Is this the specific example? Then details should be shown

Response 41: Previously, a specific example of a correlation matrix of averaged values of all indicators of all stations from 2019-01-01 to 2024-08-01 was provided. Now, a matrix with correlation values for a specific laboratory for the same period has been added.

Comments 42: Figure 16 - Is this the specific example? Then details should be shown. What is the definition of trend in your context? Why does seasonality not correlate with the seasons? What does the term "seasonality" mean in this case?

Response 42: Description of Figure 16 was added.

Comments 43: line 721 - I don't think that statistics of MPC exceedances in the air together with graphs are enough to give a "holistic picture of the environmental situation".

Response 43: Clarifications regarding other factors affecting the environment have been added.

Comments 44: Figure 17 - What is the point of showing a forecast without comparing it with real observations? Why not run the experiment with the previously obtained data series to demonstrate the forecast quality? What does the horizontal dotted line mean?

Response 44: The calculated MSE values for air pollution forecasts are shown in Table 2. During development, the system had a function that visualized the forecast for a given horizon next to the retrospective data, displaying the forecast values from which the MSE was calculated. That function is now outdated: only the forecast beyond the retrospective data is visualized, together with the numerical values of the expected MSE.

Comments 45: Figure 18 - As I see it, the picture shows the real situation near Poltava. This should be reflected in the caption. According to the scale on the right, the CO concentration in the area varies within 0.002% MPC. It's even less than the accuracy of a standard gas analyzer. Please check this.

Response 45: Figures 18 and 19 have been swapped. A description was added.

Comments 46: Figure 19 - Please, put the details about figure into the caption. It should be like "Visualization of forecasted spatial distribution on example of ..."

Response 46: Figures 18 and 19 have been swapped. A description was added.

Comments 47: line 818 - extra "p" after ARIMA

Response 47: The extra "p" was removed.

Comments 48: lines 811-825 - It is completely unclear where the values in question have been taken from, so there can be no understanding of what you are talking about. The discussion seems pointless. The experiment with BATS and ARIMA models as well as its results in the form of tables and/or graphs should be clearly described and presented before the discussion. The "Discussion" section should be entirely rewritten.

Response 48: The discussion section has been rewritten.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

The effort made by the authors is appreciated. The clarifications and changes they have made greatly improve the quality of the manuscript. No more comments on my part.

Best regards

Author Response

Thank you very much for taking the time to review this manuscript. 

Reviewer 2 Report

Comments and Suggestions for Authors

The revised version is better, but several evident mistakes still need to be checked.

1) Line 28-30: the future works plan is not recommended to explain here.

2) Line 69-87, 150-169: the supplemented revisions have no cited reference at all.

3) Line 1132-1138: the future plan should move to the back of the Conclusion part.

4) The marked places used so many colors. It is not easy to tell their separate function apart. Of course, this is not the problem with the manuscript itself.

Author Response

Comments 1: Line 28-30: the future works plan is not recommended to explain here.

Response 1: The lines with plans for future work have been removed from the abstract.

Comments 2: Line 69-87, 150-169: the supplemented revisions have no cited reference at all.

Response 2: There are indeed no references in lines 69-87, since these statements were formulated in focus-group discussions among the authors of the article, representatives of municipal enterprises, and city authorities. During the focus group meetings, problems are identified collectively, and a task statement is formed with a focus on implementing solutions as quickly as possible using currently available resources. The results of such meetings are generated for internal use and are not available for reference.

In lines 150-169, links to the relevant references have been added.

Comments 3: Line 1132-1138: the future plan should move to the back of the Conclusion part.

Response 3: The plan for future research has been moved to the end of conclusions.

Comments 4: The marked places used so many colors. It is not easy to tell their separate function apart. Of course, this is not the problem with the manuscript itself.

Response 4: Thank you for your valid comment. We are working on improving the system interface, and special attention will be paid to the color identification of individual objects on spatial pollution maps.

As for editing the article, individual colors are used to show individual changes made according to the recommendations of reviewers. Given the number of changes, this technique may not have been very successful; we will work on making the changes easier to identify.

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

The authors have done an impressive job of improving the manuscript, but it still has significant shortcomings (see remarks below). In addition, some of the previous remarks were left unanswered. I would also try to shorten some of the new parts (especially in Section 3) because the text is really difficult to read.

Remarks

line 180 - Since the decoding of ARIMA has already been done in the previous section, it can be omitted here.

Figure 1 - The caption is not good because it repeats the name of the 1st block in the figure. It should probably be something like "The general structure of the information system".



line 472 - Do MPC and MAC mean the same thing? Only 1 term should be used in the paper

Figure 2 - The caption should contain complete information about the figure. I recommend "Graphical representation of annual dust concentration in the air for the city of Kremenchuk during 2007-2024"

lines 726-740 - It should be an introductory sentence that dust includes both PM10 and PM2.5. Otherwise, it looks strange that PM10 and PM2.5 are explained here when they were not mentioned before.

lines 794-804 - Since the correlation between most species is too small to analyse, I would shorten and generalize this part.

Figure 16 - What does trend mean here? Trend is usually expressed as a straight line showing the direction and the rate of change of a variable. Please give a definition. Why do the trends for dust look completely different on the first and second panels? Again, the complete information about all panels (it's better to mark them as a, b, c, and d), location (Kremenchuk) and time (2007-2024) should be given in the caption.

Figure 17 - Again, provide the complete information in the caption. Then it is not necessary to repeat it in the text here and for other figures. You can simply leave the essential references to the figure without an introductory sentence. I insist that all the forecasts (except those for Post #2) look odd. Even the ones that are not straight lines obviously do not reflect seasonal changes and are far from reality. I would exclude this figure if you cannot provide a retrospective forecast.

Figure 19 - I do not see any responses or changes in Figure 19 regarding my comment about CO.

Table 2. - There is no need to write "..in multiples of MPC" on every line. You can point this out in the caption or in the remark after the table. It is better to use the symbols MSEBATS and MSEARIMA with subscripts BATS and ARIMA.



Author Response

Comments 1: line 180 - Since the decoding of ARIMA has already been done in the previous section, it can be omitted here.

Response 1: Repetitive explanation removed.

Comments 2: Figure 1 - The caption is not good because it repeats the name of the 1st block in the figure. It should probably be something like "The general structure of the information system."

Response 2: The figure caption has been changed according to the recommendations.

Comments 3: line 472 - Do MPC and MAC mean the same thing? Only 1 term should be used in the paper

Response 3: MAC replaced by MPC.

Comments 4: Figure 2 - The caption should contain complete information about the figure. I recommend "Graphical representation of annual dust concentration in the air for the city of Kremenchuk during 2007-2024"

Response 4: The caption for Figure 12 has been changed in accordance with the recommendations.

Comments 5: lines 726-740 - It should be an introductory sentence that dust includes both PM10 and PM2.5. Otherwise, it looks strange that PM10 and PM2.5 are explained here when they were not mentioned before.

Response 5: The explanation of MPC was moved, and information about dust composition was added before the definition of PM10 and PM2.5.

Comments 6: lines 794-804 - Since the correlation between most species is too small to analyse, I would shorten and generalize this part.

Response 6: Added a paragraph with a summary instead of two paragraphs.

Comments 7: Figure 16 - What does trend mean here? Trend is usually expressed as a straight line showing the direction and the rate of change of a variable. Please give a definition. Why do the trends for dust look completely different on the first and second panels? Again, the complete information about all panels (it's better to mark them as a, b, c, and d), location (Kremenchuk) and time (2007-2024) should be given in the caption.

Response 7: Yes, "trend" is often depicted as a simple straight line (linear regression) showing a constant rate of change.

However, in time series analysis, especially when using decomposition methods like the one shown here (which separates a time series into trend, seasonality, and random fluctuations), the "trend" component represents the underlying long-term movement or direction in the data after smoothing out short-term fluctuations like seasonality and random noise.

This trend component does not have to be a straight line. It can be non-linear, capturing gradual changes, shifts in direction, and long-term cycles in the data that aren't part of the regular seasonal pattern. It's often calculated using methods like moving averages or more sophisticated smoothing techniques (e.g., LOESS - Locally Estimated Scatterplot Smoothing) which result in the curved line seen here.

In the context of Figure 16, the "Trend" line (orange) represents the estimated long-term progression of Dust concentrations (in multiples of MPC) over the period 2007-2024, with seasonal variations and random noise filtered out.

Looking closely, the orange "Trend" line in Panel (a) and the orange "Trend" line in Panel (b) are actually the exact same line.

The apparent difference arises because:

- In Panel (a), the trend line is overlaid on the highly variable "Monthly average" data (blue line). The visual comparison with the fluctuating blue line might make the smoother orange trend seem less distinct or appear differently relative to the peaks and troughs of the monthly data.

- In Panel (b), the trend line is shown in isolation. Without the distraction of the monthly data, the shape and variations of the long-term trend itself are much clearer.

The figure caption has been extended. References to panel letters were added to the text description.
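As a hedged illustration of why a decomposition trend need not be a straight line, the sketch below applies a classical additive decomposition with a centered moving average to a synthetic monthly series. It is not the method used for Figure 16: the moving-average smoother, the 12-month period, and the synthetic data are all assumptions chosen only to make the idea concrete.

```python
import math


def centered_moving_average(series, period=12):
    """Centered 2 x period moving average (even period); None where undefined."""
    half = period // 2
    trend = [None] * len(series)
    for t in range(half, len(series) - half):
        window = series[t - half : t + half + 1]
        # The two end points get half weight so the window spans exactly one period.
        total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        trend[t] = total / period
    return trend


def seasonal_component(series, trend, period=12):
    """Mean detrended value per position in the cycle, shifted to sum to zero."""
    buckets = [[] for _ in range(period)]
    for t, (y, tr) in enumerate(zip(series, trend)):
        if tr is not None:
            buckets[t % period].append(y - tr)
    means = [sum(b) / len(b) for b in buckets]
    offset = sum(means) / period
    return [m - offset for m in means]


# Synthetic series: a curved (non-linear) trend plus a 12-month sinusoidal cycle.
months = range(120)
true_trend = [1.0 + 0.0005 * (t - 60) ** 2 for t in months]
series = [tr + 0.3 * math.sin(2 * math.pi * t / 12) for t, tr in zip(months, true_trend)]

trend = centered_moving_average(series)
seasonal = seasonal_component(series, trend)
```

Here the recovered `trend` follows the curved `true_trend` closely rather than a straight line, while `seasonal` isolates the repeating 12-month pattern.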

Comments 8: Figure 17 - Again, provide the complete information in the caption. Then it is not necessary to repeat it in the text here and for other figures. You can simply leave the essential references to the figure without an introductory sentence. I insist that all the forecasts (except those for Post #2) look odd. Even the ones that are not straight lines obviously do not reflect seasonal changes and are far from reality. I would exclude this figure if you cannot provide a retrospective forecast.

Response 8: The caption of the figure has been changed.

While we understand your concern that some forecasts might appear visually "odd" or less representative of strong seasonality compared to the historical data, we believe the figure provides a valuable contribution for several reasons:

- The dashed lines represent the direct output of ARIMA/BATS time series models applied to the historical data (2007-2024) for each specific monitoring station. These models attempt to capture underlying trends, seasonality, and autoregressive patterns mathematically. The resulting forecast shapes are reflections of what these established algorithms determined from the input data.

- Contrary to the impression that only Post #2 shows seasonality, the forecasts for Post #4 (Pink Dashed) and Post #5 (Brown Dashed) do exhibit cyclical patterns, as clearly visible in the figure and mentioned in our description ("Notably, while the forecasts for Post Nos. 2, 4, and 5 exhibit expected seasonal variations..."). These variations might be less pronounced than some historical peaks, which can occur when models average out extreme volatility or if recent data trends show dampened seasonality.

- You mentioned the need for a retrospective forecast. The provided MSE values (0.95, 0.62, 0.78, 0.55) are precisely the result of such an evaluation. MSE is calculated by comparing the model's predictions against actual historical data points (i.e., hindcasting or a form of retrospective analysis) during the model training/validation phase. These values quantify the average squared difference between predicted and actual concentrations historically, thus providing a standard measure of the model's past performance and expected future error margin for each station.
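The retrospective evaluation described in the last point can be sketched as a hold-out hindcast: the final part of the historical series is withheld, forecast from the earlier part, and compared with what was actually observed. The sketch below is an assumption-laden illustration, not the authors' procedure: a seasonal-naive forecast stands in for the ARIMA/BATS models, and the function names are hypothetical.

```python
def mean_squared_error(actual, predicted):
    """Average squared difference between observed and predicted values."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def seasonal_naive_forecast(history, horizon, period=12):
    """Repeat the last observed cycle; a simple stand-in for a fitted model."""
    if len(history) < period:
        raise ValueError("history must cover at least one full period")
    last_cycle = history[-period:]
    return [last_cycle[h % period] for h in range(horizon)]


def hindcast_mse(series, holdout, period=12):
    """Withhold the last `holdout` points, forecast them, and score the forecast."""
    train, test = series[:-holdout], series[-holdout:]
    forecast = seasonal_naive_forecast(train, holdout, period)
    return mean_squared_error(test, forecast)
```

Plotting the withheld observations against the hindcast over the same dates would give the visual comparison the reviewer asks for, alongside the single MSE number.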

Comments 9: Figure 19 - I do not see any responses or changes in Figure 19 regarding my comment about CO.

Response 9: The color scale is placed next to the graph because it changes dynamically in the system. It separately displays both the points obtained when interpolating data across the entire area and the results of filtering by a given number of the highest or lowest values in the entire interpolated sample.

Regarding the color scale, Figure 19 shows the 10% of filtered interpolated points with the lowest pollutant concentration. The spatial interpolation is determined from the distances between points whose predicted concentrations are based on retrospective data, and the concentrations at the known points do not differ much, so the color gradation of pollution changes only slightly.

It should also be noted that a basic interpolation library in Python is used, in which points included in the contour circle have the closest possible concentration.

The left figure shows filtered points whose values are determined based on the calculation of pollution indicators within the contours, and the right one shows the selected part of the spatial interpolation using only the contours.
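To illustrate why distance-based interpolation between stations with similar values produces a narrow color range, the sketch below uses inverse distance weighting (IDW). IDW is an assumption made for illustration; the response does not name the interpolation library or method actually used, and the station coordinates and values are made up.

```python
import math


def idw(x, y, stations, power=2.0):
    """Inverse-distance-weighted estimate at (x, y) from (sx, sy, value) stations."""
    numerator = denominator = 0.0
    for sx, sy, value in stations:
        distance = math.hypot(x - sx, y - sy)
        if distance == 0.0:
            return value  # exactly at a station: return its value
        weight = 1.0 / distance ** power
        numerator += weight * value
        denominator += weight
    return numerator / denominator


# Hypothetical stations: (x, y, concentration in multiples of MPC).
stations = [(0.0, 0.0, 0.200), (10.0, 0.0, 0.201), (5.0, 8.0, 0.202)]

# Interpolate on a coarse grid covering the stations.
grid = [idw(float(gx), float(gy), stations) for gx in range(11) for gy in range(9)]
```

Because each IDW estimate is a weighted average of the station values, every value in `grid` stays between the smallest and largest station value, so inputs spanning 0.200 to 0.202 can never yield a wider scale.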

Comments 10: Table 2. - There is no need to write "..in multiples of MPC" on every line. You can point this out in the caption or in the remark after the table. It is better to use the symbols MSEBATS and MSEARIMA with subscripts BATS and ARIMA.

Response 10: The table caption and column names have been changed.

Author Response File: Author Response.pdf

Round 3

Reviewer 3 Report

Comments and Suggestions for Authors

I believe that the authors have done a great deal of work to correct all the remarks, and I have no additional remarks at this time.

The only thing that has not been fixed - although I think it should be fixed - is Figure 19 (see my corresponding remarks in the two previous reviews). Perhaps the authors did not understand me, so I will try to make it clear using an example. For instance, we take the WHO recommendations for CO hourly MPC 30 mkg/m3 (it is completely unimportant which values we consider - I just took this for better illustration).  Then we multiply this value by the extreme values from the right scale in Figure 19 (0.200 and 0.202) and get 6.00 and 6.06 mkg/m3 as the scale range. But the difference between the extremes is so small that it looks completely unrealistic and compromises the whole paper. That's why I insist on dealing with this problem.
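The reviewer's arithmetic can be reproduced in a few lines. The MPC of 30 and the scale limits 0.200 and 0.202 are taken from the comment above; the function name is hypothetical, and the unit is whatever unit the chosen MPC is expressed in.

```python
def scale_range(mpc, low_multiple, high_multiple):
    """Convert a color scale given in multiples of MPC to absolute concentrations."""
    low = mpc * low_multiple
    high = mpc * high_multiple
    spread = high - low
    return low, high, spread, spread / low


low, high, spread, relative = scale_range(30.0, 0.200, 0.202)
print(f"{low:.2f} to {high:.2f}: spread {spread:.2f} ({relative:.1%} of the minimum)")
```

The whole color scale therefore covers a spread of about 0.06 in absolute terms, or roughly 1% of the lowest value shown, which is the point behind the reviewer's concern.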

Author Response

The authors are very grateful to the esteemed reviewer for his time and insightful comments, which allowed us to significantly improve our article. Based on current capabilities, to address this shortcoming, the color scale from Figure 19 was removed, and its caption was replaced.
