Article
Peer-Review Record

Machine Learning-Based Modeling of Air Temperature in the Complex Environment of Yerevan City, Armenia

Remote Sens. 2023, 15(11), 2795; https://doi.org/10.3390/rs15112795
by Garegin Tepanosyan 1, Shushanik Asmaryan 1, Vahagn Muradyan 1, Rima Avetisyan 1, Azatuhi Hovsepyan 1, Anahit Khlghatyan 1, Grigor Ayvazyan 1 and Fabio Dell’Acqua 2,*
Reviewer 2:
Reviewer 3: Anonymous
Submission received: 1 April 2023 / Revised: 22 May 2023 / Accepted: 24 May 2023 / Published: 27 May 2023

Round 1

Reviewer 1 Report

In the paper “Machine-learning based modelling of air temperature in the complex environment of Yerevan city, Armenia” the authors use a machine learning method to predict surface air temperature based on remote sensing data. The work is relevant, especially in complex terrain and with insufficient density of station data.

The authors present the use of a large number of variables (30) as one of the advantages of the work.

However, there are actually 15 variables, and the remaining 15 are the standard deviations of the main variables.

Why include both the mean series of the variables and their standard deviations in the model? To increase the number of variables? The original series of any variable already includes variability.

 

The discussion section is underdeveloped. Please expand it, especially since the introduction describes the previously obtained results at length.

 

A few small remarks

Spell out the abbreviations at their first mention in the text:

Line 16 – PLSR

Line 53 – RS

Line 103 - PLSR

Author Response

Reviewer 1

  1. In the paper “Machine-learning based modelling of air temperature in the complex environment of Yerevan city, Armenia” the authors use a machine learning method to predict surface air temperature based on remote sensing data. The work is relevant, especially in complex terrain and with insufficient density of station data.

Re: Thank you for the comment and the appreciation expressed.

 

  1. The authors present the use of a large number of variables (30) as one of the advantages of the work. However, there are actually 15 variables, and the remaining 15 are the standard deviations of the main variables. Why include both the mean series of the variables and their standard deviations in the model? To increase the number of variables? The original series of any variable already includes variability.

Re: Thank you for the comment. This was an experiment to test our hypothesis that some derivatives of the main variables may be valuable and influence the outcome of the estimation. If we look at the VIP scores of the variables for the different circular zones (100 m, 200 m, 300 m, 400 m and 1000 m), some of the “derivative” variables (e.g. LST_SD, IBI-SABI_SD, NDWI_SD, NDVI_SD, etc.) obtained VIP scores greater than 1 and thus turned out to be relevant to the estimation process (see Figure 3 and lines 300-304). Our group is currently continuing this experiment by extending the list of the main variables and their derivatives with in situ weather data (solar radiation, wind direction, dew point, precipitation, etc.).
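
As an illustration of the “derivative” variables discussed in this reply, the sketch below computes the mean and standard deviation of a raster variable inside circular zones of the five radii mentioned above. It is a minimal sketch, not the authors' code: the function name `buffer_stats`, the synthetic LST grid, the 30 m pixel size and the station position are all assumptions made for the example.

```python
import numpy as np

def buffer_stats(raster, row, col, radius_m, pixel_size_m):
    """Mean and standard deviation of raster values inside a circular zone.

    (row, col) is the pixel holding the station; both the helper and its
    arguments are hypothetical and only serve this illustration.
    """
    r_pix = radius_m / pixel_size_m
    rows, cols = np.ogrid[:raster.shape[0], :raster.shape[1]]
    # Boolean mask of pixels whose centre lies within the radius
    mask = (rows - row) ** 2 + (cols - col) ** 2 <= r_pix ** 2
    values = raster[mask]
    return float(np.nanmean(values)), float(np.nanstd(values))

# Synthetic land surface temperature grid (assumed 30 m pixels)
rng = np.random.default_rng(0)
lst = 30.0 + 5.0 * rng.standard_normal((200, 200))

radii_m = (100, 200, 300, 400, 1000)
stats = {r: buffer_stats(lst, 100, 100, r, 30.0) for r in radii_m}

for r, (mean, sd) in stats.items():
    # The mean plays the role of a main variable (e.g. LST),
    # the standard deviation that of its derivative (e.g. LST_SD)
    print(f"{r:>5} m  mean={mean:.2f}  sd={sd:.2f}")
```

The standard deviation here describes spatial variability inside the zone around a station, which is a different quantity from the variability of the mean series over time.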

 

  1. The discussion section is underdeveloped. Please expand it, especially since the introduction describes the previously obtained results at length.

Re: Thank you for the comment. The discussion was revised and updated (see lines 311-323).

 

  1. A few small remarks. Spell out the abbreviations at their first mention in the text: line 16 – PLSR, line 53 – RS and line 103 – PLSR.

Re: Thank you for the comment. Please see lines 15, 52 and 105, respectively.

 

 


Reviewer 2 Report

The number of decimal places in Table 2 should be reduced to two.

Predictor variables in Figure 3 should be sorted in descending order (not alphabetically). That would increase overall readability.

 

Author Response

Reviewer 2

 

  1. The number of decimal places in Table 2 should be reduced to two. Predictor variables in Figure 3 should be sorted in descending order (not alphabetically). That would increase overall readability.

Re: Thank you for the comment. Please see the corrections in Table 2 and Figure 3.

Reviewer 3 Report

This manuscript presents a straightforward way to estimate air temperature from remote sensing data. I recommend a major revision before it is considered for publication.

 

1.     The research area map should also show the global location of the research area.

2.     Weather station information. Only outlier selection was introduced, but readers may be interested in the observation instruments, frequency, parameters, etc.

3.     I don’t quite get Table 2. I assume the authors hope to do a sensitivity analysis for all the input data, but Figure 3, which presents feature importance, seems sufficient for this.

4.     Figure 3 and Table 3 also concern me. The task would be better accomplished with the grid search of the sklearn package.

5.     Figure 4 should also present the sample size.

 

The main concern I have is that the research needs more novelty. It looks more like a course report. I strongly suggest that the authors withdraw the submission as an “Article” and resubmit it to Remote Sensing as a “Technical Note.”

Author Response

  1. This manuscript presents a straightforward way to estimate air temperature from remote sensing data. I recommend a major revision before it is considered for publication.

Re: Thank you for the comment. We revised the paper and made some updates, especially in the discussion (see lines 311-323).

 

 

  1. The research area map should also show the global location of the research area.

Re: Thank you for the comment. Figure 1 has been corrected accordingly.

 

  1. Weather station information. Only outlier selection was introduced, but readers may be interested in the observation instruments, frequency, parameters, etc.

Re: Thank you for the comment. Unfortunately, we have no more information about the observation instruments than we have already shared: the stations belong to the “Hydrometeorology and monitoring center” State Non-Commercial Organization (SNCO) of the Ministry of Environment of Armenia and provide daily data. Some of the weather stations are going to be modernized. We also have to state that these data are not free. The stations also record precipitation, wind direction and intensity, dew point, solar radiation, etc. Our group has purchased other parameters as well (e.g. solar radiation, wind direction, dew point, precipitation), which are currently being processed for inclusion in the air temperature modeling using other ML models (RF, SVM, MLP, etc.). For now, we have focused only on the in situ measured air temperature in the estimation process. Note that we used only those in situ measurements that correspond to the data acquisition dates.

 

  1. I don’t quite get Table 2. I assume the authors hope to do a sensitivity analysis for all the input data, but Figure 3, which presents feature importance, seems sufficient for this.

Re: Thank you for pointing out an unclear contribution of the manuscript. Table 2 and Figure 3 seem to convey a similar message, but they actually provide different information and are both necessary. The purpose of Table 2 is to assess the general dependence of Tair on all the considered variables. Figure 3 is intended to show how the impact of the different variables changes with the size of the considered buffer zone, and specifically to highlight how larger buffer zones cause other variables to emerge as impacting factors. VIP coefficients, however, tend to flatten all values of the less impactful variables; removing Table 2 would erase information on environmental variables that contribute less substantially, but still relevantly, to the estimation.

 

 

  1. Figure 3 and Table 3 also concern me. The task would be better accomplished with the grid search of the sklearn package.

Re: We regret that we were not aware of the sklearn package, which could have sped up our work. However, we followed the same procedure, except that we did it manually; the results should be the same. Regarding Figure 3 vs. Table 3, the purpose of the latter is to highlight that larger buffer areas make the prediction of Tair more uncertain, which is not evident from Figure 3 alone.

 

  1. Figure 4 should also present the sample size.

Re: Thank you for the comment. If we understood it correctly, the request was for the sample size, which we have added to the caption of Figure 4.

 

  1. The main concern I have is that the research needs more novelty. It looks more like a course report. I strongly suggest that the authors withdraw the submission as an “Article” and resubmit it to Remote Sensing as a “Technical Note.”

Re: We thank you for your suggestion, yet we respectfully disagree. We believe there are some points that support the novelty of this research. We have 30 predictor variables, including not only the main variables commonly used in this type of problem but also their derivatives (e.g. LST_SD, IBI-SABI_SD, NDWI_SD, NDVI_SD, etc.). Moreover, some of the derivatives became influential according to the VIP scores for specific sizes of the local circular neighbourhood (100 m, 200 m, 300 m, 400 m and 1000 m; see Figure 3). None of the studies we reviewed took this approach, including:

 

    1. Ho, H.C.; Knudby, A.; Xu, Y.; Hodul, M.; Aminipouri, M. A Comparison of Urban Heat Islands Mapped Using Skin Temperature, Air Temperature, and Apparent Temperature (Humidex), for the Greater Vancouver Area. Science of The Total Environment 2016, 544, 929–938, doi:10.1016/j.scitotenv.2015.12.021.
    2. Meyer, H.; Pebesma, E. Predicting into Unknown Space? Estimating the Area of Applicability of Spatial Prediction Models. Methods in Ecology and Evolution 2021, 12, 1620–1633, doi:10.1111/2041-210X.13650.
    3. Xu, Y.; Shen, Y. Reconstruction of the Land Surface Temperature Time Series Using Harmonic Analysis. Computers & Geosciences 2013, 61, 126–132, doi:10.1016/j.cageo.2013.08.009.
    4. Noi, P.T.; Degener, J.; Kappas, M. Comparison of Multiple Linear Regression, Cubist Regression, and Random Forest Algorithms to Estimate Daily Air Surface Temperature from Dynamic Combinations of MODIS LST Data. Remote Sensing 2017, 9, 398, doi:10.3390/rs9050398.
    5. Otgonbayar, M.; Atzberger, C.; Mattiuzzi, M.; Erdenedalai, A. Estimation of Climatologies of Average Monthly Air Temperature over Mongolia Using MODIS Land Surface Temperature (LST) Time Series and Machine Learning Techniques. Remote Sensing 2019, 11, 2588, doi:10.3390/rs11212588.

 

A second important point is that, besides the most influential variable, i.e. LST, the list of considered variables includes several based on remote sensing (spectral indices), such as NDVI, NDWI and IBI-SAVI; the last one (IBI-SAVI) has seen very limited application in urban areas, although it was developed as a tool for built-up area recognition [Xu, H. A New Index for Delineating Builtup Land Features in Satellite Imagery. International Journal of Remote Sensing 2008, 29, 4269–4276, doi:10.1080/01431160802039957]. In a sense, we picked up and “revised” the IBI-SAVI, which proved to be one of the most influential variables for all neighbourhood sizes, consistently ranking among the top four.

As a third important point, the study is new in terms of the complexity of the terrain of the considered area. For the first time, the feasibility of estimating urban Tair from remote sensing data alone is assessed in a complex elevation pattern such as that of the Yerevan urban area and its surroundings. We are continuing this experiment by extending the list of the main variables and their derivatives with in situ weather data (solar radiation, wind direction, dew point, precipitation, etc.).



Round 2

Reviewer 1 Report

The authors have made changes to the article and responded to the comments.

Author Response

We thank the reviewer for approving our modifications to the manuscript and for the time spent in reviewing it.

Reviewer 3 Report

The authors addressed my suggestions No. 1 & 5, but there are seemingly no revisions for No. 2, 3 & 4 concerning Tables 2 and 3, Figure 3, and the weather station information.

 

Other than the above, I am okay with the research being published in Remote Sensing as a Tech Note, given its novelty and workload.

 

However, I would not stand in the way if the editor decides to accept it as an Article.

Author Response

Author's Notes to Reviewers’ comments

 

The authors addressed my suggestions No. 1 & 5, but there are seemingly no revisions for No. 2, 3 & 4 concerning Tables 2 and 3, Figure 3, and the weather station information.

Other than the above, I am okay with the research being published in Remote Sensing as a Tech Note, given its novelty and workload.

However, I would not stand in the way if the editor decides to accept it as an Article.

 

Re: We thank the reviewer for his/her favorable comment. We understand that some of the previous issues are not considered satisfactorily addressed. We have taken them up again and reply to each of them individually below.

Regarding the publication in the form of a Tech Note in place of an Article, we are convinced that an Article would be a fairer placement considering the novelty points that we have explained in our previous reply, including:

  • We have used not only the variables frequently used for this type of study, but a broader selection, and also their derivatives;
  • We have also investigated the spatial dimension and found that the importance of the variables changes with the size of the neighborhood, suggesting that these variables have different typical spatial scales of action.

None of these points were investigated in previous studies to the best of our knowledge.

 

Individual points discussed below

 

  1. Weather station information. Only outlier selection was introduced, but readers may be interested in the observation instruments, frequency, parameters, etc.

 

Re: Thank you for your comment. We revised the paper and added some information about weather stations and measured parameters (See lines 232-238).


  1. I don’t quite get Table 2. I assume the authors hope to do a sensitivity analysis for all the input data, but Figure 3, which presents feature importance, seems sufficient for this.

Re: Thank you for the comment. However, we respectfully disagree that we should limit ourselves to VIP score estimation to show the sensitivity to the input variables. Pearson correlation and VIP estimation are two independent and different approaches to sensitivity analysis of the input variables, and each serves to corroborate the output of the other. An analogous approach can be found in the following paper, which is cited in our manuscript.

Otgonbayar, M.; Atzberger, C.; Mattiuzzi, M.; Erdenedalai, A. Estimation of Climatologies of Average Monthly Air Temperature over Mongolia Using MODIS Land Surface Temperature (LST) Time Series and Machine Learning Techniques. Remote Sensing 2019, 11, 2588, doi:10.3390/rs11212588.

So, we believe it makes sense to leave Table 2, which makes the correlation visually more immediately understandable; plus, as we highlighted in the previous answers, removing Table 2 would erase information on environmental variables that contribute less substantially but still in a relevant manner to the estimation.

  1. Figure 3 and Table 3 also concern me. The task would be better accomplished with the grid search of the sklearn package.

Re: Thank you for the comment. We fully accept your suggestion to apply the sklearn package, and we will definitely do so in the continuation of the study. For this paper, however, we kindly ask you to agree to proceed with the proposed Figure 3 and Table 3. In the above-mentioned reference, Otgonbayar’s team also applied this approach manually, though for different groups rather than for our differently sized circular zones. In our case, we still think that the purpose of Table 3 is to highlight that larger buffer areas make the prediction of Tair more uncertain, which is not evident from Figure 3 alone.

 
