Article
Peer-Review Record

Daily Spatial Distribution of Apparent Temperature Comfort Zone in China Based on Heat Index

Remote Sens. 2022, 14(19), 4999; https://doi.org/10.3390/rs14194999
by Zhengkun Wang 1,2, An Zhang 2,3,* and Meiling Liu 1
Reviewer 1: Anonymous
Reviewer 2:
Submission received: 20 August 2022 / Revised: 29 September 2022 / Accepted: 4 October 2022 / Published: 8 October 2022

Round 1

Reviewer 1 Report (Previous Reviewer 2)

Thanks for considering my comments and revising the manuscript. The revised manuscript has been improved. I have no other comments now.

Author Response

Response 1: Thank you for your advice; it was very helpful to us. We have revised the manuscript based on the comments from the other reviewers.

Author Response File: Author Response.docx

Reviewer 2 Report (New Reviewer)

In general, I think that this study presents a useful step forward for the community and that it builds on the present state of knowledge. I have a couple of concerns with the presented methodology that should be addressed. In particular, more details about the model development and optimization need to be clarified. My major comments and questions are as follows:

  • What is the uniqueness of the proposed algorithm, the multivariate stepwise regression model, and its potential impact compared with other recently established state-of-the-art techniques? The authors should explain this aspect in the introduction section. Otherwise, readers cannot see the importance or uniqueness of the proposed method over other techniques.
  • You need to rewrite the abstract with a proper summary. Also check the sentence structure and grammar errors.
  • You need to introduce state-of-the-art ML techniques such as RF/BT and compare them with multivariate stepwise regression models in the introduction.
  • Why did you choose features such as vegetation, meteorology, etc.? How about soil moisture and DEM information?
  • Can you explain the high-resolution features and information in your application?
  • How did you choose the essential tuned hyperparameters for your method? You should provide a table of hyperparameter settings. What is the fundamental theory behind this information?
  • In the discussion section, you should discuss your results vs previous research.

Author Response

Point 1: What is the uniqueness of the proposed algorithm, the multivariate stepwise regression model, and its potential impact compared with other recently established state-of-the-art techniques? The authors should explain this aspect in the introduction section. Otherwise, readers cannot see the importance or uniqueness of the proposed method over other techniques.

 

Response 1: Thank you very much for pointing out this detail. Most of the studies on apparent temperature use empirical models that rely on low-resolution climate variables. Some of these studies are based only on meteorological station data, making it difficult to obtain high-resolution apparent temperatures at locations far away from meteorological stations and thereby posing a challenge for human comfort studies. Therefore, in addition to these climate factors, we added non-climate variables with potential effects on human comfort, such as NDVI, NDWI, nighttime lights (NTL), and DEM, all of which have high spatial resolution, and screened the variables through stepwise regression to build predictive regression models of the heat index with high temporal and spatial resolution. We further explained the reason for using a stepwise regression model in the introduction section. Machine learning techniques such as RF/BT have applications in calculating the importance of variables based on the interactions between variables. Multiple stepwise regression models are established based on the importance and relevance of independent variables to the heat index; this is a common method to eliminate multicollinearity. Because multicollinearity was present among the independent variables in our study, the multiple stepwise regression model was used to address it. We therefore calculated the 1 km daily heat index by building a stepwise regression model.

 

Point 2: You need to rewrite the abstract with a proper summary. Also check the sentence structure and grammar errors.

 

Response 2: Thank you very much for your suggestions. We have revised the summary:

Abstract: Apparent temperature (AT) is used to evaluate human comfort and is of great importance for studies on the effects of environmental factors on human health. This study used the daytime heat index (HI) calculated by national surface meteorological stations in China as the AT dependent variable, with August 2020 as an example. The daytime fifth generation European Centre for Medium-Range Weather Forecasts (ECMWF) atmospheric reanalysis of the global climate (ERA5) data and multi-source data extracted from the stations were used as the independent variables. Due to multicollinearity among the independent variables, we implemented a multiple stepwise regression model and developed a daily near-surface 1 km HI estimation model. The correlation analysis showed that the coefficient of determination (R2) was 0.89; the mean absolute error (MAE) was 1.49°C, and the root mean square error (RMSE) was 2.08°C. We also used 10-fold cross-validation to calculate the error between the parameters and the predicted values. The R2 of the model was 0.96, the MAE was 1.80°C, and the RMSE was 2.40°C. In this month, the mean daily daytime HI ranged from −7.32°C to 34.12°C. According to the Universal Thermal Climate Index (UTCI), the areas with more than 20 days for one month of heat stress were largely distributed in the desert areas of Northwest China and the coastal areas in Southeast China, accounting for 29.98% of the total land area of China. This study improves the spatial resolution and accuracy of HI prediction, thus providing a scientific reference for studying urban residential environments and the urban heat island effect.
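For readers who want to see how the accuracy figures quoted in the abstract (R2, MAE, RMSE, 10-fold cross-validation) are typically obtained, the following is a minimal sketch. It assumes an ordinary least-squares linear model fitted with NumPy; the function names `regression_metrics` and `kfold_cv_linear` are hypothetical and are not taken from the manuscript, whose actual implementation is not shown in this record.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return R2, MAE and RMSE for paired observations and predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    resid = y_true - y_pred
    ss_res = np.sum(resid ** 2)                      # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "mae": np.mean(np.abs(resid)),
        "rmse": np.sqrt(np.mean(resid ** 2)),
    }

def kfold_cv_linear(X, y, k=10, seed=0):
    """k-fold cross-validation of an OLS model (assumed model form).

    Every sample is predicted exactly once by a model that was fitted
    without it; the metrics are computed on those held-out predictions.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    A = np.column_stack([np.ones(n), X])             # add intercept column
    folds = np.array_split(np.random.default_rng(seed).permutation(n), k)
    y_pred = np.empty(n)
    for test_idx in folds:
        train = np.ones(n, dtype=bool)
        train[test_idx] = False
        beta, *_ = np.linalg.lstsq(A[train], y[train], rcond=None)
        y_pred[test_idx] = A[test_idx] @ beta
    return regression_metrics(y, y_pred)
```

Because the cross-validated metrics are computed only on held-out samples, they are usually somewhat worse than the in-sample fit, which is why both sets of numbers are worth reporting.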

 

Point 3: You need to introduce state-of-the-art ML techniques such as RF/BT and compare them with multivariate stepwise regression models in the introduction.

 

Response 3: Thank you very much for your suggestion. Machine learning techniques such as RF/BT have applications in calculating the importance of variables, and these can be selected based on the interactions between variables. Multiple stepwise regression models were established based on the importance and relevance of independent variables to the heat index; this is a common method to eliminate multicollinearity [Zhu, X.; Hou, C.; Xu, K.; Liu, Y. Establishment of agricultural drought loss models: A comparison of statistical methods. Ecological Indicators 2020, 112, 106084, Doi: https://doi.org/10.1016/j.ecolind.2020.106084.]. Due to the multicollinearity among the variables selected in this paper, we established a multivariate stepwise regression model to screen the independent variables and build a multivariate linear regression model for HI prediction. We have added this explanation to the introduction section.

 

Point 4: Why did you choose features such as vegetation, meteorology, etc.? How about soil moisture and DEM information?

 

Response 4: Thank you very much for your questions. Air temperature is the main influencing factor on the HI, and its primary source is the solar radiation absorbed by the surface. Therefore, we prioritized air temperature. Research has shown that living in areas with less vegetation and higher surface temperatures is related to greater numbers of deaths due to high temperatures. The NDVI and NDWI have strong negative effects on the land surface temperature (LST), which can restrict the urban heat island effect and reduce the harm of heat stress. Therefore, we added the NDVI, NDWI, and LST to the heat index calculation. Previous empirical formulas also considered wind speed, dew point temperature, and atmospheric pressure, and thus we added them to the calculation. Elevation (DEM) is also relevant to the human body, and because the spatial resolution of the raw DEM data was 30 m, we expected it to help improve the spatial resolution of the heat index. However, the model results showed that the contribution of DEM was less significant, so it was discarded in the final heat index calculation. The meteorological factors already include the dew point temperature, which can be used as a proxy for air humidity. Soil moisture has its most direct effect on vegetation and a smaller effect on human comfort. Therefore, we did not consider the effect of soil moisture on human comfort.
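The statement that dew point temperature can stand in for air humidity can be illustrated with the Magnus approximation, which converts air temperature and dew point into relative humidity. This is a sketch under stated assumptions: the coefficients 17.625 and 243.04 °C are one commonly used parameter set, the function name is hypothetical, and the record does not say which conversion, if any, the authors used.

```python
import math

def relative_humidity_from_dewpoint(temp_c, dewpoint_c, a=17.625, b=243.04):
    """Relative humidity (%) from air temperature and dew point in deg C.

    Uses the Magnus approximation for saturation vapour pressure; the
    coefficients a and b are an assumed, commonly used parameter set.
    """
    def saturation_vapour_pressure(t):
        # hPa; the leading constant cancels in the ratio below
        return 6.1094 * math.exp(a * t / (b + t))

    return 100.0 * saturation_vapour_pressure(dewpoint_c) / saturation_vapour_pressure(temp_c)
```

When the dew point equals the air temperature the result is 100%, and the value falls as the gap between the two widens.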

 

Point 5: Can you explain the high-resolution features and information in your application?

 

Response 5: Thank you very much for your questions. In the calculation of the heat index, we considered air temperature, dew point temperature, wind speed, land surface temperature, and atmospheric pressure from ERA5 (with a low spatial resolution of about 0.1° ≈ 11.3 km), together with multi-source data including NTL, DEM, NDVI, and NDWI. All data except DEM were daily data, so this application had high temporal resolution. The spatial resolution of nighttime lights, NDVI, and NDWI was about 500 m, and the spatial resolution of DEM was 30 m. Their inclusion improved the spatial resolution of the heat index. Therefore, the heat index prediction model that we built has high temporal and spatial resolution.

 

Point 6: How did you choose the essential tuned hyperparameters for your method? You should provide a table of hyperparameter settings. What is the fundamental theory behind this information?

 

Response 6: Thank you very much for your questions. We used a forward selection method to establish a stepwise regression model with the LST, TEMP, DEW, WS, ATM, NDVI, NDWI, NTL, and DEM data. Judged by the R2, AIC, and VIF, the results showed that LST and TEMP, as well as DEM and ATM, were strongly correlated. Considering how widely each variable is used and its data resolution, we discarded LST and ATM. NDVI and DEM contributed less to the model calculation process, so we finally selected TEMP, DEW, WS, NTL, and NDWI to construct the multiple regression model. All significance tests satisfied p < 0.05. Table 1 shows the model calculation results. We have added this table to Section 3.1.1.

 

Table 1. Results of the multiple stepwise regression model.

Step  Variables*                        R2    AIC       VIF   p-value
1     +TEMP                             0.86  52004.48  1.00  < 0.05
2     +TEMP, +DEW                       0.88  50013.95  1.52  < 0.05
3     +TEMP, +DEW, -WS                  0.89  49628.11  1.65  < 0.05
4     +TEMP, +DEW, -WS, -NDWI           0.89  49322.25  1.80  < 0.05
5     +TEMP, +DEW, -WS, -NDWI, +NTL     0.89  49144.96  1.85  < 0.05

* + and - indicate the direction of each variable's contribution to the model.
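The forward selection procedure summarized in Table 1 can be sketched as follows. This is an illustration only, not the authors' code: it assumes the Gaussian least-squares form of the AIC, n·ln(RSS/n) + 2k, which may differ from the values in the table by an additive constant, and it assumes a VIF limit of 10, a common rule of thumb that the response does not state. All function names are hypothetical.

```python
import numpy as np

def ols_rss(X, y):
    """Residual sum of squares of an OLS fit with intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return float(resid @ resid)

def aic_gaussian(X, y):
    """AIC up to an additive constant, assuming Gaussian errors."""
    n = len(y)
    k = X.shape[1] + 1                      # slopes plus intercept
    return n * np.log(ols_rss(X, y) / n) + 2 * k

def max_vif(X):
    """Largest variance inflation factor among the columns of X."""
    if X.shape[1] < 2:
        return 1.0
    vifs = []
    for j in range(X.shape[1]):
        target = X[:, j]
        rss = ols_rss(np.delete(X, j, axis=1), target)
        tss = float(np.sum((target - target.mean()) ** 2))
        vifs.append(tss / rss if rss > 0 else np.inf)   # equals 1 / (1 - R_j^2)
    return max(vifs)

def forward_stepwise(X, y, names, vif_limit=10.0):
    """Add one predictor at a time while AIC drops and VIF stays acceptable."""
    n = len(y)
    selected, remaining = [], list(range(X.shape[1]))
    best_aic = aic_gaussian(np.empty((n, 0)), y)        # intercept-only model
    while remaining:
        candidates = []
        for j in remaining:
            Xc = X[:, selected + [j]]
            if max_vif(Xc) <= vif_limit:                # skip collinear additions
                candidates.append((aic_gaussian(Xc, y), j))
        if not candidates:
            break
        aic, j = min(candidates)
        if aic >= best_aic:                             # no further improvement
            break
        selected.append(j)
        remaining.remove(j)
        best_aic = aic
    return [names[j] for j in selected], best_aic
```

With this stopping rule, a predictor that is nearly a copy of one already in the model (as the response reports for LST and TEMP) is excluded by the VIF check even though it would fit the response variable well on its own.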

 

Point 7: In the discussion section, you should discuss your results vs previous research.

 

Response 7: Thank you very much for your suggestion. Yin investigated the application of the HI to human comfort [28]. Similarly, we used the meteorological station data to control and verify the accuracy of the HI during the HI calculation process. We improved the spatial continuity and resolution of the HI by adding multi-source data. The calculated MAE of the daily 1 km HI was 1.49°C, and the RMSE was 2.08°C; thus, the accuracy was significantly improved, and the distribution characteristics of the HI were also consistent. Ren calculated the thermal comfort of 183 cities in China from 1990 to 2016 using daily meteorological station data [57]. Gobo also predicted future changes in climate comfort by kriging interpolation based on daily meteorological station data [58]. However, the spatial resolution of their results was relatively low. Ge used the ERA-Interim atmospheric reanalysis data to calculate the average UTCI from 1985 to 2014 on annual and seasonal scales to assess the thermal environment in China. The results revealed that the summer heat stress area accounted for 20.25% of the total area in China, and that this would be larger in July and August [56]. We have modified Section 4.1 accordingly.

Author Response File: Author Response.docx

Round 2

Reviewer 2 Report (New Reviewer)

The authors significantly improved the manuscript by addressing all the comments.

This manuscript is a resubmission of an earlier submission. The following is a list of the peer review reports and author responses from that submission.


Round 1

Reviewer 1 Report

Main concerns on the aims of the paper:

  • Typically, the heat index, apparent temperature, and similar indices are used to identify critical exposure to thermal discomfort that may have serious health effects. They are usually defined for extreme situations (e.g. heat waves) using variables such as maximum and diurnal temperature, mean diurnal humidity, and mean diurnal wind. For this reason, I am very doubtful about the usefulness of an index computed over a monthly averaging period. In this way, all the extreme situations are smoothed out and may no longer be discernible for health effects.
  • To compute AT where there are no ground stations, the authors propose 12 polynomial functions (one for each month) that involve 10 variables. I think a much simpler way is to compute AT using equation (1) or equation (2) with the meteorological variables extracted directly from the ERA5 reanalysis. Note that the ERA5 reanalysis also takes ground elevation into account. What is the advantage of the proposed approach?
  • Some of the explanatory variables used to estimate AT are very debatable. For example, why would human perception of environmental heat depend differently on longitudinal wind speed and on latitudinal wind speed? NDWI, the normalized difference water index, may refer to vegetation water content; how does this quantity influence human perception of heat?
  • The paper does not consider the thermal comfort indices provided by the ERA5 Copernicus Data Store (https://cds.climate.copernicus.eu/cdsapp#!/dataset/derived-utci-historical?tab=overview). Why should the user prefer the index presented in this paper?

The methodology and the discussion of the results of this paper are poorly done. Some examples:

  • ERA5 meteorological data are used in place of ground measurements at locations where the latter are not available (there are no stations). The authors say that, to justify this replacement, the two data sets have been compared and that there is good agreement. This important analysis is not reported, apart from a few words (rows 148-150) and the scatter plot in figure 2, which shows only the comparison of temperature.
  • Equations (8) and (9): the AIC and VIF indexes are very poorly presented. Most readers would have to consult a statistics book to understand them. At least give a bibliographic reference and a short explanation of how to interpret their values.
  • Figure 4: Usually we start from a block representing the source of the data, e.g. “China national meteorological station”, and use an arrow to point to a block that lists the data extracted, e.g. “Average temperature” and “Relative humidity”; here it is the opposite. It seems that the data generated the stations. The upper left data block also contains ERA5 data, but all the data are indicated (blue block) as “Multi source remote sensing data” (most of them are not produced by remote sensing). The meaning of the two conditions “N” and “Y” placed on the same flow chart branch is not clear. In my opinion this flow chart needs to be redone.
  • Table 2 reports the results of the multiple regressions, but very few numerical results are discussed. For example: a) why is AT correlated with latitude wind speed in August, with longitude wind speed in September, and with no wind variable in October? b) why is vegetation (NDVI) important only in March, April and August? If I remember correctly, an AIC value is produced for each model (i.e. each explanatory variable left out), and at the end they are compared in order to eliminate the redundant variables. Here only the “best” result is presented; what information can we get from this single value?
  • I think that, for readers interested in the health effects of heat, it is very important to appreciate the value of using AT instead of air temperature. For example, it would be very interesting to compare the China map of AT with that of 2 m air temperature.
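As background to the comment on equations (8) and (9), the standard textbook definitions of the two indices are given below. The manuscript's own equations are not reproduced in this record, so the notation here is an assumption: k is the number of estimated parameters, \hat{L} the maximized likelihood, n the sample size, RSS the residual sum of squares, and R_j^2 the coefficient of determination obtained when predictor j is regressed on all the other predictors.

```latex
% Akaike information criterion (general form, and least-squares form
% under Gaussian errors, where C is a constant independent of the model)
\mathrm{AIC} = 2k - 2\ln\hat{L},
\qquad
\mathrm{AIC} = n \ln\!\left(\frac{\mathrm{RSS}}{n}\right) + 2k + C

% Variance inflation factor of predictor j
\mathrm{VIF}_j = \frac{1}{1 - R_j^{2}}
```

Among candidate models fitted to the same data, the one with the lower AIC is preferred. A VIF of 1 indicates that a predictor is uncorrelated with the others, and values above roughly 5 to 10 are commonly taken as a sign of problematic multicollinearity.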

 

There is a lot of confusion and disorganization in the manuscript. Some examples:

  • Section 2.3 is entitled “Remote Sensing Data”, but it also presents ERA5 reanalysis data and some comparisons between ground measurements and ERA5 data. Moreover, it is not clear which data are obtained from MODIS and which are extracted from ERA5 (rows 131-135).
  • Moreover, Section 2.3 also contains data elaboration and discussion (rows 148-150, figure 2).
  • Row 146: “all data were resampled … by bilinear interpolation”. I don’t think that GDM data (30 m resolution) are downscaled using interpolation (interpolation is used to upscale).
  • Figure 3 is not relevant for this study.
  • Page 9: We read that “The sample set consisting of the LST, TEMP, DEW, WS_U, WS_V, ATM, NDVI, NDWI, NTL, and DEM was screened”, but LST and ATM are not mentioned in Table 2.
  • Section 2.6: “HI”. Here it is stated that the HI calculation formula is equation (1) (there is no bibliographic reference), so what is the meaning of equations (2), (3) and (4)? Which equation is used in this work? Row 209 must be broken at the “If” statement; moreover, it is “If T >= 80”, not “If HI >= 80”.
  • Equation (12) is wrong. This is not the formulation of RMSE.
  • Equation (5): I think that 99.9% of readers are able to convert degrees Fahrenheit to degrees Celsius; they probably learned it in high school.
  • Figure 1, caption: Replace “Evaluation” with “Elevation”.
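Since the manuscript's equations (1) to (5) are not reproduced in this record, the following sketch shows one widely used formulation of the heat index for orientation: the procedure published by the US National Weather Service, which combines a simple formula, the Rothfusz regression, and two humidity adjustments. Whether the manuscript uses exactly this form, and whether its 80 °F switch is applied to the temperature or to the preliminary index value, is an assumption that cannot be checked here; the function names are hypothetical.

```python
import math

def heat_index_f(temp_f, rh):
    """Heat index in deg F from air temperature (deg F) and relative humidity (%).

    Follows the NWS description: a simple formula averaged with the
    temperature is used first, and the Rothfusz regression replaces it
    when that preliminary value reaches 80 deg F.
    """
    simple = 0.5 * (temp_f + 61.0 + (temp_f - 68.0) * 1.2 + rh * 0.094)
    hi = 0.5 * (simple + temp_f)
    if hi < 80.0:
        return hi

    T, R = temp_f, rh
    hi = (-42.379 + 2.04901523 * T + 10.14333127 * R
          - 0.22475541 * T * R - 6.83783e-3 * T * T - 5.481717e-2 * R * R
          + 1.22874e-3 * T * T * R + 8.5282e-4 * T * R * R
          - 1.99e-6 * T * T * R * R)

    # Adjustments for very dry and very humid conditions
    if R < 13.0 and 80.0 <= T <= 112.0:
        hi -= ((13.0 - R) / 4.0) * math.sqrt((17.0 - abs(T - 95.0)) / 17.0)
    elif R > 85.0 and 80.0 <= T <= 87.0:
        hi += ((R - 85.0) / 10.0) * ((87.0 - T) / 5.0)
    return hi

def heat_index_c(temp_c, rh):
    """Same index with input and output in deg C."""
    temp_f = temp_c * 9.0 / 5.0 + 32.0
    return (heat_index_f(temp_f, rh) - 32.0) * 5.0 / 9.0
```

For example, `heat_index_f(90, 70)` evaluates the full regression, while `heat_index_f(70, 50)` stays on the simple branch because the preliminary value is below 80 °F.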

Reviewer 2 Report

This paper built a linear regression model to predict the heat index using ERA5 reanalysis data and remote sensing data, and then analyzed population exposure to a divided comfort zone calculated with the heat index. The generated data and analyzed patterns might provide some useful information for human health studies. However, the novelty of this study is not well stated and may not be suitable for this journal, and the manuscript is not well prepared. I have the following specified comments on this manuscript.

 

  1. Editing of the English language and style is needed, as some of the sentences are hard to read. For example, “in China 2020” in the title of this manuscript.
  2. Line 13. It would be better to also state the research gaps in previous studies before stating the methods and experiments.
  3. Line 14. Please give the full name of “ECMWF-ERA5”.
  4. Line 15. Please give the specific types of remote sensing data used in this study after the phrase “remote sensing data”.
  5. Line 20. Why not use “HI” directly rather than “AT”?
  6. Line 25-26. The authors stated that “This study accounts for the shortcomings of low spatial resolution of the atmospheric reanalysis data”, but in my opinion, this is not correct, as the generated results actually rely on the atmospheric reanalysis data.
  7. Line 47-48. An expression like “domestic studies” is not suitable for an international journal.
  8. Line 61-62. “Although people spend most of their time indoors and under general weather conditions, we can give priority to temperature and humidity [23].” This sentence is not logically clear.
  9. Line 68-69. “but the spatial resolution of the results was not very good.” It would be better to use words like “high” or “low” to describe the spatial resolution of data. This also applies to similar expressions below, for example in Line 114.
  10. Line 119. There are around 1000 stations in China. The authors should try to collect more data from the China meteorological data service center (http://data.cma.cn/en), as 176 stations are too sparse for the whole of China.
  11. Line 129-139. Please state clearly the temporal resolution of the remote sensing data (monthly or annual mean).
  12. Line 131. Is “LST at 2m” a remote sensing-based variable or a reanalysis-based variable?
  13. Line 132. Please explain why the authors use dew point temperature at 10 m rather than at 2 m. Is there any supporting literature?
  14. Line 182. Please check the source of the remote sensing data. They should not be obtained from meteorological stations.
  15. Line 233-234. I would suggest the authors use the 10-fold cross-validation method for model validation.
  16. Equation (7). Just use AT instead of Ts.
  17. Figure 5. The first line of this figure should be deleted.

Reviewer 3 Report

Dear authors,

I appreciated reading your work. It is a timely discussion and nicely described.

It would be helpful to update some of the references and mention relevant work that has been done recently in this field, for example the work by Ariane Middel.

The map resolutions and configuration can be improved; for example, please see figure 1, where the legend is misplaced.

 

Best,

 
