Article
Peer-Review Record

Can Neural Networks Forecast Open Field Burning of Crop Residue in Regions with Anthropogenic Management and Control? A Case Study in Northeastern China

Remote Sens. 2021, 13(19), 3988; https://doi.org/10.3390/rs13193988
by Bing Bai 1, Hongmei Zhao 1,*, Sumei Zhang 2, Xuelei Zhang 1 and Yabin Du 1,3
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 23 July 2021 / Revised: 30 September 2021 / Accepted: 1 October 2021 / Published: 5 October 2021

Round 1

Reviewer 1 Report

Report on manuscript

Can neural networks forecast open field burning of crop residue in regions with anthropogenic management and control? A case study in Northeastern China
 
by

Bing Bai et al

First of all I want to apologize for handing in this review so late.
 
Second, I would like to anticipate my final vote: I recommend publication after a thorough check of English grammar.

The manuscript consists of 4 sections. The introduction is well written and reflects the state of the art in the field. The references and discussion of the machine learning methods could be improved, e.g. by referencing (and maybe using in future work) modern methods like networks with adapted structure (CNN, GAN, U-Net methods) or physics-based methods like symbolic regression. ML has evolved a lot: back propagation is inherent to basically all NNs, nomenclature has changed, and many methods are subsumed under deep learning. The modern discussion is rather about which network best fits the concrete application. References for the methods are found in the updated https://www.deeplearningbook.org/ and, on symbolic regression, in https://link.aps.org/doi/10.1103/PhysRevE.94.012214 or https://doi.org/10.1007/978-1-4419-1626-6_5.

The idea and analysis of the paper, however, are good. For the field it is timely.
The description of method and data is sufficient, and I am happy to note that it is not too long.
For the concrete implementation, I can recommend, if the group will continue in this direction, migrating step by step to other tools. This is a remark, not a criticism. SPSS is relatively slow and fixed in its capabilities, whereas Python notebooks are abundant and offer flexibility.

The results are well described. Here, I could imagine that an overall summary of Tables 1-5 would help the reader understand the progression and the differences between models and assumptions. A graphical representation might also be beneficial.

The discussion is balanced as far as I can judge.

 

Author Response

We want to thank the reviewer for the constructive comments on our manuscript. Revisions have been made (changes are highlighted with tracking in the revised manuscript). Below we provide our detailed responses to the comments:

 

Point 1: The manuscript consists of 4 sections. The introduction is well written and reflects the state of the art in the field. The references and discussion of the machine learning methods could be improved, e.g. by referencing (and maybe using in future work) modern methods like networks with adapted structure (CNN, GAN, U-Net methods) or physics-based methods like symbolic regression. ML has evolved a lot: back propagation is inherent to basically all NNs, nomenclature has changed, and many methods are subsumed under deep learning. The modern discussion is rather about which network best fits the concrete application.

References for the methods are found in the updated https://www.deeplearningbook.org/  and on symbolic regression in https://link.aps.org/doi/10.1103/PhysRevE.94.012214 or https://doi.org/10.1007/978-1-4419-1626-6_5. 


 

Response 1: We added more discussion of and references for machine learning methods, such as CNN, GAN, U-Net and symbolic regression, in the Discussion and References sections (lines 490-500, revised manuscript).

 

Point 2: The idea and analysis of the paper, however, are good. For the field it is timely.

The description of method and data is sufficient, and I am happy to note that it is not too long.

For the concrete implementation, I can recommend, if the group will continue in this direction, migrating step by step to other tools. This is a remark, not a criticism. SPSS is relatively slow and fixed in its capabilities, whereas Python notebooks are abundant and offer flexibility.

 

Response 2: Thank you for the comments. We added a comparison of the SPSS and Python tools, and will consider using Python to continue this research (lines 500-503, revised manuscript).

 

Point 3: The results are well described. Here, I could imagine that an overall summary of Tables 1-5 would help the reader understand the progression and the differences between models and assumptions. A graphical representation might also be beneficial.

 

Response 3: Table 7 was added as an overall summary in the revised manuscript (line 478, revised manuscript).

 

 

Yours sincerely,

Bing Bai

 

Key Lab of Wetland Ecology and Environment,

Northeast Institute of Geography and Agroecology,

Chinese Academy of Sciences

Author Response File: Author Response.doc

Reviewer 2 Report

The authors propose an enhanced model to forecast agricultural fires by taking into account different factors such as anthropogenic management and control policies when training a supervised neural network. 

Very detailed descriptions of the considered datasets and of the motivation of this study are presented. I would say that, although the final results were not outstanding, as the authors point out, this work adds value to the existing literature by considering different possible input scenarios.

I have only some minor comments:

  1. As the authors correctly point out, having enough "samplings" for the training stage is a very important step. Relative to the considered framework, would it be possible to apply some data-augmentation technique? It is not clear from the text whether the authors already used this strategy or plan to adopt it in the future.
  2. When the authors compute the "correlation", to which coefficient do they refer? For example, is it Pearson? Is it Kendall? Please add some details.
  3. In the analysis of the experiments, the authors refer to the "overall accuracy" of the verification step and to the "accuracy" of the whole model. Regarding this last metric, it is not clear to me how this "accuracy" is computed by simply looking at the results reported in the Tables.
  4.  When optimizing your network, what kind of loss function are you considering? Is it a softmax loss, a contrastive loss, an Euclidean loss, etc etc ... ?
  5. As the authors noticed, sometimes providing a suitable manipulation of the input factors, i.e. the soil difference in 24 h, can result in an "enhancement" of the model. Therefore, I would suggest opening a discussion, perhaps in the conclusion chapter, of possible further developments, adding some literature and hence some different techniques that could be investigated to manipulate the input data. Here I simply list the techniques I am most familiar with, but the authors can feel free to add even more references in this respect. Kauth–Thomas transform:
  • E. P. Crist and R. C. Cicone, "A physically-based transformation of Thematic Mapper data – The TM tasseled cap," IEEE Trans. Geosci. Remote Sens., vol. 22, no. 2, pp. 256–263, Mar. 1984.
  • J. Morisette and S. Khorram, "An introduction to using generalized linear models to enhance satellite-based change detection," in Proc. IGARSS, vol. 4, 1997, pp. 1769–1771.

A logarithmic transformation  of the input settings: 

  • Celik, T. (2009). Multiscale change detection in multitemporal satellite images. IEEE Geoscience and Remote Sensing Letters, 6(4), 820-824.

Last chapter, a linear combination of simple differences and averaged ratios: 

  • Falini, A., Tamborrino, C., Castellano, G., Mazzia, F., Mininni, R. M., Appice, A., & Malerba, D. (2020, July). Novel Reconstruction Errors for Saliency Detection in Hyperspectral Images. In International Conference on Machine Learning, Optimization, and Data Science (pp. 113-124). Springer, Cham.

Author Response

We want to thank the reviewer for the constructive comments on our manuscript. Revisions have been made (changes are highlighted with tracking in the revised manuscript). Below we provide our detailed responses to the comments:

 

Point 1: As the authors correctly point out, having enough "samplings" for the training stage is a very important step. Relative to the considered framework, would it be possible to apply some data-augmentation technique? It is not clear from the text whether the authors already used this strategy or plan to adopt it in the future.


 

Response 1: In this study, we chose a study period that provided enough samples for training, so no data-augmentation technique was used. That is a good idea, and we will consider it in our future work (lines 487-490, revised manuscript).

 

Point 2: When the authors compute the "correlation", to which coefficient do they refer? For example, is it Pearson? Is it Kendall? Please add some details.

 

Response 2: We apologize for the confusion. The SPSS Modeler software indicates the importance of the input variables, not the correlation. We revised "correlation" to "importance" and added the algorithm in the revised manuscript (lines 368-370, revised manuscript).

 

Point 3: In the analysis of the experiments, the authors refer to the "overall accuracy" of the verification step and to the "accuracy" of the whole model. Regarding this last metric, it is not clear to me how this "accuracy" is computed by simply looking at the results reported in the Tables.

 

Response 3: The accuracy is calculated by comparing the model's forecasted value (the forecast result from the BPNN) for each case with the observed outcome (fire points observed by MODIS). We added more detailed information in the revised manuscript (lines 219-225, revised manuscript).
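For illustration, a minimal sketch of how such an accuracy could be computed, assuming binary fire/no-fire labels per grid cell (the function name and data below are hypothetical; the actual computation was done inside SPSS Modeler):

```python
# Hypothetical sketch: overall accuracy as the fraction of grid cells where
# the BPNN forecast matches the MODIS-observed fire outcome (1 = fire point).
def overall_accuracy(forecast, observed):
    matches = sum(f == o for f, o in zip(forecast, observed))
    return matches / len(observed)

# Example: 4 of 5 cells agree with the observations.
print(overall_accuracy([1, 0, 1, 1, 0], [1, 0, 0, 1, 0]))  # 0.8
```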

 

Point 4: When optimizing your network, what kind of loss function are you considering? Is it a softmax loss, a contrastive loss, an Euclidean loss, etc etc ... ?

 

Response 4: The neural network model uses ordinary least squares, taking the residual sum of squares as the loss function (lines 226-227, revised manuscript).
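As a sketch, the residual sum of squares loss amounts to the following (a hypothetical standalone helper; SPSS Modeler computes this internally during training):

```python
# Residual sum of squares: the training loss minimized by the BPNN.
def rss(predicted, target):
    return sum((p - t) ** 2 for p, t in zip(predicted, target))

# (0.9-1.0)^2 + (0.2-0.0)^2 + (0.8-1.0)^2 = 0.01 + 0.04 + 0.04
print(round(rss([0.9, 0.2, 0.8], [1.0, 0.0, 1.0]), 2))  # 0.09
```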

 

Point 5: As the authors noticed, sometimes providing a suitable manipulation of the input factors, i.e. the soil difference in 24 h, can result in an "enhancement" of the model. Therefore, I would suggest opening a discussion, perhaps in the conclusion chapter, of possible further developments, adding some literature and hence some different techniques that could be investigated to manipulate the input data.

 

Response 5: Thank you for your suggestion. We carefully read the papers you recommended, added some content about data-augmentation techniques in Section 4.2, and will consider adopting them in the future. We also added the references in the revised manuscript (lines 487-490, revised manuscript).

 

Yours sincerely,

Bing Bai

 

Key Lab of Wetland Ecology and Environment,

Northeast Institute of Geography and Agroecology,

Chinese Academy of Sciences

Author Response File: Author Response.doc

Reviewer 3 Report

General comments: The authors investigate the accuracy of using an ANN based on MODIS data to forecast open burning fires before and after the implementation of control policies in Jilin Province (China). A comparison between using only meteorological parameters and adding anthropogenic factors and soil moisture was analysed. The issue is of some significance within the scope of the journal, but I missed a further development of the methodology, as well as the application of goodness-of-fit tests and measures of sensitivity and specificity to the experiments carried out. I think the research lacks robustness in the results and conclusions shown. Furthermore, the number of tables and figures is scarce, hindering adequate understanding of the discussion section.

Specific comments:

Line 16: were forecasted

Line 35: "these pollutants affect climate change..."--->always? in which concentration?

Line 38: ... fossil fuel consumption in rural regions has increased, ...--->How much? Determine the period too.

Line 93:  1240,000 km2 -->write correctly the figure.

Line 104: lower than the national average--->mention the quantity

Lines 114-115: Consequently, the spatiotemporal distribution of crop residue burning has also changed.--->It would be good to add a graphic of variation and metrics that quantify this evolution.

Lines 118-119: Why not adding a legend?

Are there no data related to the size of open burning fires?

Line 152: kriging method-->add type and characteristics.

Line 182: index variable---> add a table with variables and statistics

Line 183: ensemble of 78694 members--->indicate what "members" mean

Lines 184-185: A neural network consists of ....--> add explanatory graphic

Line 204: haven´t--> have not

Table 1: Proportion of training and verifying/forecasting samples---> why so different?

Indicate in Methods: goodness-of-fit, sensitivity, specificity and confusion matrix

Line 237: selected 80% of the daily data to train the model and reserved the remaining 20% --->In order to improve the robustness and stability of results and to reduce bias, try other partitions and compare the results.

Lines 249-250: the larger the amount of training data, the better the learning performance of the neural network. ---> the authors do not substantiate that claim with different combinations of the dataset.

Line 257: 3.1.2. Optimization of the forecasting model in Northeastern China--->This should be moved before Fig2 in methodology section. Reformulate paper

Line 271:  The correlation of the input factors given---> provide a table with  the percentage of correlation.

Lines 310-311:  After a lot of data training, the accuracy--->specify

Line 319. acceptable---> why? Previous hypothesis should be shown.

Table 5: the proportions 60/40 are OK??

Line 327:  Various methods have been--->which?

Figs. 3 and 4: Indicate the letter on the figure (a, b, c and d)

Lines 418-420: Generally, the supervision of administrative boundaries is not strict, making the neural network increase the probability of the fire points at the boundary in the learning process. --->Explain why.

Lines 440-450: it is more a summary than conclusions. Consider reducing.

References: The journal name is not abbreviated in all cases. Follow the journal's rules.

 

 

Author Response

We want to thank the reviewer for the constructive comments on our manuscript. Revisions have been made (changes are highlighted with tracking in the revised manuscript). Below we provide our detailed responses to the comments:

 

General comments:

The authors investigate the accuracy of using an ANN based on MODIS data to forecast open burning fires before and after the implementation of control policies in Jilin Province (China). A comparison between using only meteorological parameters and adding anthropogenic factors and soil moisture was analysed. The issue is of some significance within the scope of the journal, but I missed a further development of the methodology, as well as the application of goodness-of-fit tests and measures of sensitivity and specificity to the experiments carried out. I think the research lacks robustness in the results and conclusions shown. Furthermore, the number of tables and figures is scarce, hindering adequate understanding of the discussion section.

 

Response: Thank you for your comments. In this study, we first considered the natural factors (meteorological factors, soil moisture content and harvesting time) to forecast the fire points of crop residue burning. Then the anthropogenic management and control policies (i.e., the straw open burning prohibition areas) were added to the forecast. We added more detailed information about the data calculation, methodology description and results discussion in the revised manuscript. Furthermore, a figure of the fire point distribution in 2013-2020 and a table giving an overall summary of the models and forecasting accuracy were added to make the paper easier to understand (throughout the revised manuscript).

Specific comments:

Point 1: Line 16: were forecasted. 


 

Response 1: We changed “forecast” to “forecasted” (line 16, revised manuscript).

 

Point 2: Line 35: "these pollutants affect climate change..."--->always? in which concentration?

 

Response 2: When these pollutants occur in high concentrations, they affect the climate and pose a great challenge for regional air quality. We improved the description of this part in the revised manuscript (lines 36-37, revised manuscript).

 

Point 3: Line 38: ... fossil fuel consumption in rural regions has increased, ...--->How much? Determine the period too.

 

Response 3: As of 2018, fossil fuels accounted for 80% of all energy demand in China. We added this description in the revised manuscript (lines 41-42, revised manuscript).

 

Point 4: Line 93:  1240,000 km2 -->write correctly the figure.

 

Response 4: We changed “1240,000” to “1240000” (line 99, revised manuscript).

 

Point 5: Line 104: lower than the national average--->mention the quantity.

 

Response 5: We added the quantity in the revised manuscript (lines 110-111, revised manuscript).

 

Point 6: Lines 114-115: Consequently, the spatiotemporal distribution of crop residue burning has also changed.--->It would be good to add a graphic of variation and metrics that quantify this evolution. 


 

Response 6: We added “Figure 2. Spatial distribution of fire points in Northeastern China by MODIS observations from 2013-2020” to indicate the temporal and spatial variation of fire points (lines 126-127, revised manuscript).

 

Point 7: Lines 118-119: Why not adding a legend? Are there no data related size of open burning fires?

 

Response 7: We added a legend to Figure 1. Meanwhile, we added a graphic of the spatiotemporal distribution of fire points in 2013-2020 (Figure 2) (lines 123-127, revised manuscript).

 

Point 8: Line 152: kriging method-->add type and characteristics.

 

Response 8: The kriging method is the ordinary kriging method, and its semivariogram model is circular. We added this content in the revised manuscript (lines 146-148, revised manuscript).

 

Point 9: Line 182: index variable---> add a table with variables and statistics

 

Response 9: Table 1 summarizes the relevant variables and statistical data of this study. We also added an explanation of each index variable used in the neural network (line 192, revised manuscript).

 

Point 10: Line 183: ensemble of 78694 members--->indicate what "members" mean?

 

Response 10: Because the farmland in Northeastern China can be divided into 78694 units with a spatial resolution of 3 km × 3 km, we constructed a BPNN ensemble of 78694 members to parameterize the relationships between agricultural fire points and environmental variables. We added this explanation in the revised manuscript (lines 193-194, revised manuscript).

 

Point 11: Lines 184-185: A neural network consists of ....--> add explanatory graphic. 


 

Response 11: We drew a flow chart of the BPNN method used in this study, shown in Figure 3, and also described it in words in the revised manuscript (lines 237-247, revised manuscript).

 

Point 12: Line 204: haven´t--> have not

 

Response 12: We revised this (lines 215-218, revised manuscript).

 

Point 13: Table 1: Proportion of training and verifying/forecasting samples---> why so different? Indicate in Methods: goodnes-of-fit, sensitivity, specifity and confusion matrix.

 

Response 13: In the first scenario, to achieve the highest forecasting accuracy, we set the proportion of training to verifying/forecasting samples to 8:2. In Scenario 2, the number of fire points in 2018-2020 made it difficult to maintain an 8:2 proportion, so we used the actual observed fire points for training and verifying/forecasting. We added an explanation of this in the revised manuscript (lines 237-247, revised manuscript).

 

Point 14: Line 237: selected 80% of the daily data to train the model and reserved the remaining 20% --->In order to improve the robustness and stability of results and to reduce bias, try other partitions and compare the results.

 

Response 14: According to previous research results, after testing 10 different combinations of modeling and verification data, the forecasting accuracy was highest when the proportion of training to verifying/forecasting samples was 8:2, and the model constructed by the neural network was stable and feasible. We also added an explanation of this method in the revised manuscript (lines 263-267, revised manuscript).
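The partition comparison described above can be sketched as follows (a hypothetical helper with placeholder data; the actual experiments were run in SPSS Modeler):

```python
import random

# Hypothetical sketch of comparing train/verify partitions: split the sample
# set at several ratios, train a model on each training subset, and compare
# the verification accuracies across ratios.
def partition(samples, train_frac, seed=0):
    rng = random.Random(seed)      # fixed seed for a reproducible split
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_frac)
    return shuffled[:cut], shuffled[cut:]

data = list(range(1000))  # placeholder for the daily fire-point samples
for frac in (0.6, 0.7, 0.8, 0.9):
    train, verify = partition(data, frac)
    # ... train the BPNN on `train`, evaluate on `verify` here ...
    print(frac, len(train), len(verify))
```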

 

Point 15: Lines 249-250: the larger the amount of training data, the better the learning performance of the neural network. ---> the authors do not substantiate that claim in different combinations in dataset.

 

Response 15: Here we compared the studies in Northeastern China and the Songnen Plain. In Northeastern China, we selected 38856 fire points as training data and the forecasting accuracy was 73.67%, while in the Songnen Plain we selected 32642 fire points as training data and the forecasting accuracy was 69.1%. We therefore suggest that, within a certain sample range, the larger the amount of training data, the better the learning performance of the neural network. This result is consistent with other studies. We improved the description of this part in the revised manuscript (lines 277-282, revised manuscript).

 

Point 16: Line 257: 3.1.2. Optimization of the forecasting model in Northeastern China--->This should be moved before Fig2 in methodology section. Reformulate paper. 


 

Response 16: We added some content of the optimization of the forecasting model in Northeastern China in Section 2.4 (lines 237-241, revised manuscript). 

 

Point 17: Line 271: The correlation of the input factors given---> provide a table with the percentage of correlation.

 

Response 17: After double-checking, the SPSS Modeler software indicates the importance of the input variables, not the correlation. We changed the description of this part and added the importance scores in Table 6 (lines 368-374, revised manuscript).

 

Point 18: Lines 310-311:  After a lot of data training, the accuracy--->specify.

 

Response 18: We defined this quantity in the revised manuscript (line 344, revised manuscript).

 

Point 19: Line 319. acceptable---> why? Previous hypothesis should be shown. Table 5: the proportions 60/40 are OK?

 

Response 19: We added a hypothesis to the Introduction: if the final forecasting accuracy reaches more than 60%, the model is considered acceptable; thus the 60/40 proportions are acceptable (lines 92-93, revised manuscript).

 

Point 20: Line 327: Various methods have been--->which?

 

Response 20: The methods are mathematical statistics, the Pearson correlation coefficient and the Spearman correlation coefficient (lines 366-367, revised manuscript).
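For reference, both coefficients can be computed with plain Python (a sketch only; in practice libraries such as SciPy provide `pearsonr` and `spearmanr`). The Spearman sketch below assumes no tied values:

```python
from statistics import mean

def pearson(x, y):
    # Pearson correlation: strength of the linear association between x and y.
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

def spearman(x, y):
    # Spearman correlation: Pearson correlation of the ranks (no ties assumed).
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0.0] * len(v)
        for rank, i in enumerate(order):
            r[i] = rank + 1.0
        return r
    return pearson(ranks(x), ranks(y))

print(pearson([1, 2, 3, 4], [2, 4, 6, 8]))     # 1.0 (perfectly linear)
print(spearman([1, 2, 3, 4], [1, 8, 27, 64]))  # 1.0 (perfectly monotonic)
```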

 

Point 21: Figs. 3 and 4: Indicate the letter on the figure (a, b, c and d). 


 

Response 21: We added the letter on the revised figures.

 

Point 22: Lines 418-420: Generally, the supervision of administrative boundaries is not strict, 418 making the neural network increase the probability of the fire points at the boundary in the learning process. --->Explain why.

 

Response 22: Generally, as the supervision of administrative boundaries is not strict, farmers carry out open field burning of crop residue without government monitoring when weather conditions permit. This causes the neural network to increase the probability of fire points at the boundary during the learning process (lines 461-463, revised manuscript).

 

Point 23: Lines 440-450: it is more a summary than conclusions. Consider reducing.

 

Response 23: We reduced some statements in the conclusion (lines 506-515, revised manuscript).

 

Point 24: References: The journal name is not abbreviated in all cases. Follow the journal's rules.

 

Response 24: We revised the references (lines 545, 563, 576, 584, 595, revised manuscript).

 

 

 

Yours sincerely,

Bing Bai

 

Key Lab of Wetland Ecology and Environment,

Northeast Institute of Geography and Agroecology,

Chinese Academy of Sciences

Author Response File: Author Response.doc

Round 2

Reviewer 3 Report

Major suggestions: The authors have made important changes in the manuscript, but I think the models do not provide enough robustness with only the Pearson and Spearman correlation coefficients. Sensitivity, specificity and accuracy assessment is still lacking. Furthermore, the authors refer to previous research results related to the partition carried out, but they do not provide them or a reference to justify them.

Minor comments: All the journals in references should be abbreviated. See: https://www.mdpi.com/journal/atmosphere/instructions

Author Response

We want to thank the reviewer for the constructive comments on our manuscript. Revisions have been made (changes are highlighted with tracking in the revised manuscript). Below we provide our detailed responses to the comments:

 

Major suggestions:

The authors have made important changes in the manuscript, but I think the models do not provide enough robustness with only the Pearson and Spearman correlation coefficients. Sensitivity, specificity and accuracy assessment is still lacking. Furthermore, the authors refer to previous research results related to the partition carried out, but they do not provide them or a reference to justify them.

 

Response: Thank you for your comments. In this study we analyzed the importance of the input variables and discussed the importance scores and correlation coefficients. The importance of the input variables was quantified automatically when the model was built with the SPSS Modeler software; the importance score was calculated from the variance of the predictive error. The Pearson and Spearman correlation coefficients were mentioned in other studies. Both can characterize the relationship between the input data and the outcomes.

We added a new section on the sensitivity, specificity, accuracy and AUC assessment to the discussion part of the revised manuscript (4.1. Analysis of sensitivity, specificity, accuracy and AUC). ROC curves were constructed for each BPNN model in this study (Fig. 4); the ROC curve plots sensitivity against specificity for each scenario. When only natural factors were considered, the sensitivity and specificity were 70.02% and 68.78%, and the AUC value was 0.836, indicating that the forecasting model performed very well. When the anthropogenic management and control policy was added, the sensitivity and specificity decreased to 60.88% and 55.11%, and the AUC value was 0.615 (more than 0.5), which suggests the results were still acceptable. More detailed information about the model evaluation was added in the methodology and discussion parts.
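The two quantities named above can be illustrated with a small sketch (a hypothetical helper computing sensitivity and specificity from binary forecasts; the AUC reported in the manuscript comes from the full ROC curve, which additionally requires the forecast probabilities):

```python
# Sensitivity = TP / (TP + FN): fraction of observed fire points forecast as fire.
# Specificity = TN / (TN + FP): fraction of non-fire cells forecast as non-fire.
def sensitivity_specificity(forecast, observed):
    tp = sum(f == 1 and o == 1 for f, o in zip(forecast, observed))
    fn = sum(f == 0 and o == 1 for f, o in zip(forecast, observed))
    tn = sum(f == 0 and o == 0 for f, o in zip(forecast, observed))
    fp = sum(f == 1 and o == 0 for f, o in zip(forecast, observed))
    return tp / (tp + fn), tn / (tn + fp)

# Example: 2 of 3 fire points caught, 2 of 3 non-fire cells correct.
sens, spec = sensitivity_specificity([1, 1, 0, 0, 1, 0], [1, 0, 0, 1, 1, 0])
print(sens, spec)
```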

Furthermore, references to the previous studies were added in the revised manuscript.

 

Minor comments:

All the journals in references should be abbreviated. See: https://www.mdpi.com/journal/atmosphere/instructions. 


 

Response: We abbreviated all journal names in the references.

 

 

 

Yours sincerely,

Bing Bai

 

Key Lab of Wetland Ecology and Environment,

Northeast Institute of Geography and Agroecology,

Chinese Academy of Sciences

Author Response File: Author Response.doc
