Article
Peer-Review Record

Development of a Multi-Scale Tomato Yield Prediction Model in Azerbaijan Using Spectral Indices from Sentinel-2 Imagery

Remote Sens. 2022, 14(17), 4202; https://doi.org/10.3390/rs14174202
by Vasilis Psiroukis 1,*, Nicoleta Darra 1, Aikaterini Kasimati 1, Pavel Trojacek 2, Gunay Hasanli 3 and Spyros Fountas 1
Submission received: 20 May 2022 / Revised: 15 August 2022 / Accepted: 19 August 2022 / Published: 26 August 2022
(This article belongs to the Special Issue Remote Sensing of Agro-Ecosystems)

Round 1

Reviewer 1 Report

In this study, a crop-specific yield prediction model was constructed using stepwise linear regression based on Sentinel-2 NDVI data in tomato fields in the Khachmaz Region, Azerbaijan. Khachmaz's test plots were used to test and refine the model, which could be useful. The main problems are as follows:

1) In Figure 1, the location of the test area is scattered and not obvious. The location of the test area is not clear in the 1:8000000 view; is it the white part? No explanation is given. I don't think this is a good overview of the pilot fields.

2) It can be seen from Figure 2 that there are many test blocks, but they are scattered. Was this a standard selection or a random selection? Is the test area shown in Figure 2 included in the test area shown in Figure 1? If so, which part of the test area in Figure 1 does it belong to?

3) As for Data Collection & Preprocessing, since Khachmaz had high cloud cover throughout June 2021 and the first usable image mosaic was taken in early July, what is the reason for choosing summer 2021 (June to August) as the research period? Are the June numbers meaningless?

4) Is Figure 3 the research area? Why does it differ from Figure 1 and Figure 2?

5) As for Data Collection & Preprocessing, the authors divide the test area into three grades (Low, Medium and High) based on NDVI. What is the basis for this division, and what is the NDVI value range of each grade?

6) The selection time of cumulative NDVI is random, which is not conducive to application and promotion.

Author Response

We are grateful for your insightful input and well-thought-out comments, which we are certain have helped us re-submit a manuscript of higher quality. Below are the detailed answers to all review comments (underlined):

In Figure 1, the location of the test area is scattered and not obvious. The location of the test area is not clear in the 1:8000000 view; is it the white part? No explanation is given. I don't think this is a good overview of the pilot fields.

We thank the reviewer for this comment. Indeed, visualizing the experimental area in a “report format” was a challenging task, since it both consisted of a large number of fields and covered a large area (as several fields were isolated/far from each other). In an attempt to address the comment of the reviewer, the map caption has been updated to help readers better understand the study area.

It can be seen from Figure 2 that there are many test blocks, but they are scattered. Was this a standard selection or a random selection? Is the test area shown in Figure 2 included in the test area shown in Figure 1? If so, which part of the test area in Figure 1 does it belong to?

Yes, the selection was made by Ministry officers, who approached the tomato producers from the Khachmaz region whom they could contact before the start of this year’s growing season. The fields that appear in this image (Figure 2) naturally belong to the experimental fields of our study, as stated in the caption.

As for Data Collection & Preprocessing, since Khachmaz had high cloud cover throughout June 2021 and the first usable image mosaic was taken in early July, what is the reason for choosing summer 2021 (June to August) as the research period? Are the June numbers meaningless?

This is correct, and it is also the reason that the first data collection (a mosaic of high quality/low cloud coverage) used this year was from the start of July (04/07/2021), as stated in the section describing the data collection dates. June was mentioned as part of the study period because 1) the normal cultivation season spans this period, so next year the replicated process may start in June if cloud coverage allows it, and 2) June was also used for initial preparations that had no connection to satellite imagery (i.e., field identification and mapping, contact with farmers, etc.).

Is Figure 3 the research area? Why does it differ from Figure 1 and Figure 2?

Figure 3 shows the orthomosaic generated from the four (4) images that fully covered the experimental area. Naturally, this mosaic covers significantly more area than the study area, but it must be generated as an initial step, before we extract the data for our fields of interest. Moreover, the area of this image can be recognized as the same area presented in Figure 1 (b), as they are the same location at a different scale.

As for Data Collection & Preprocessing, the authors divide the test area into three grades (Low, Medium and High) based on NDVI. What is the basis for this division, and what is the NDVI value range of each grade?

As stated in this section, a 3-class quantile classification across all experimental fields was used for their classification, both on a pixel-level (10x10 m resolution) and field level (after the mean values were calculated for each field).
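A minimal sketch of what a 3-class quantile (tertile) classification can look like is given here for illustration. It is not the authors' pipeline: the function names, the use of Python's `statistics.quantiles` with the "inclusive" method, and all NDVI values are assumptions made up for the example. It also shows that the class thresholds are recomputed for each acquisition date, so the same NDVI value can fall into different classes on different dates.

```python
from statistics import quantiles


def tertile_thresholds(values):
    """Return the two cut points that split values into three equal-count classes."""
    low_cut, high_cut = quantiles(values, n=3, method="inclusive")
    return low_cut, high_cut


def classify(value, thresholds):
    """Assign Low / Medium / High relative to the given cut points."""
    low_cut, high_cut = thresholds
    if value <= low_cut:
        return "Low"
    if value <= high_cut:
        return "Medium"
    return "High"


# Hypothetical field-mean NDVI values for two acquisition dates (not from the study).
ndvi_by_date = {
    "date_1": [0.32, 0.41, 0.45, 0.50, 0.55, 0.61, 0.66],
    "date_2": [0.48, 0.57, 0.62, 0.68, 0.71, 0.77, 0.82],
}

# The thresholds are relative to all fields on that date, so they shift over time.
for date, values in ndvi_by_date.items():
    cuts = tertile_thresholds(values)
    print(date, [round(c, 3) for c in cuts], classify(0.58, cuts))
```

Because the classes are relative, they rank a field against the other fields on the same date rather than against a fixed NDVI scale.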

The selection time of cumulative NDVI is random, which is not conducive to application and promotion.

The selection of the cumulative ranges is not random. We used a simple “area calculation” approach, which requires adding consecutive trapezoids on the 2D plane to calculate the total area under the curve. With two consecutive data points (dates) creating a trapezoid, we have 4 trapezoids for each field (and for each VI, as we have now integrated more VIs in our manuscript after the revision process) per season. Therefore, the potential combinations that can be used (considering, of course, that we do not skip a data collection date and integrate, e.g., only [1-3-5]), with a minimum of three (3) trapezoids used per case as a prerequisite, were the 5 that we used in our study.

Reviewer 2 Report

The content of the manuscript concerns the analysis of the effectiveness of the model proposed by the authors for the processing of optical satellite data into information on the condition of tomato cultivation in the Azerbaijan region. The work is prepared carefully, the content is interesting. The activity of the authors reported in the manuscript consisted in linking the tomato production efficiency in a selected area with the satellite data obtained for this area. The authors then used this link to predict production efficiency in a wider area. The model has been shown to be highly effective. However, the authors point to the need for its further improvement in analyses for other arable areas and over a longer period of time.

Author Response

The content of the manuscript concerns the analysis of the effectiveness of the model proposed by the authors for the processing of optical satellite data into information on the condition of tomato cultivation in the Azerbaijan region. The work is prepared carefully, the content is interesting. The activity of the authors reported in the manuscript consisted in linking the tomato production efficiency in a selected area with the satellite data obtained for this area. The authors then used this link to predict production efficiency in a wider area. The model has been shown to be highly effective. However, the authors point to the need for its further improvement in analyses for other arable areas and over a longer period of time.

We are grateful to the reviewer for evaluating our manuscript and we deeply thank them for their kind comments and positive input. We wholeheartedly hope that our future research mentioned in this manuscript will also appear equally interesting to you in future publications.

Reviewer 3 Report

The paper proposes a system for determining the productivity of tomato fields in Azerbaijan based on the NDVI parameter from Sentinel-2 imagery.

The paper is essentially well written, complete and clear. However, the scientific interest is limited because, as the authors themselves state, the NDVI parameter for such purposes has been used for decades. Essentially, the proposed paper lacks scientific and practical innovativeness.

 

Footnotes:

Figure 3: Insert colourbars to highlight the relationship between the colours and parameter values shown in the maps.

Line 229: Specify and justify the threshold values of the quantiles used for classification.

Figure 4: Add to the legend the quantile ranges corresponding to the three classes.

Line 281: Specify and clearly list the five dates corresponding to the five images used.

Line 316: The sentence seems to speak of all possible intervals that can be constructed on the basis of the dates considered. However, only intervals 1-5, 1-4, 2-5, 1-3, 3-5 were considered. Explain that a subset of the possible intervals was considered, justify the choice and explain the nomenclature used.

Table 1: Define "Coef. of variation" as the ratio between standard deviation and mean.

Table 1: Modify the column headings by making it more explicit that the numbering used refers to the dates considered.

Table 2: In the caption specify that the intervals are not all the possible ones but all the considered ones.

Line 329 and Table 3: There is a disagreement as to what was considered to assess the correlation. In the text reference is made to NDVI values, in the caption to quantiles.

Table 3: It is necessary to describe the statistical significance of the correlations by indicating the corresponding p-value. It is required to enter the corresponding p-value for each correlation, or alternatively, enter in the caption the correlation value corresponding to p-value = 0.05. For example, the value 0.1553 is not statistically significant.

Table 4: Same consideration as table 3 on p-value.

Table 5: Same consideration as table 3 and 4 on p-value.

Figure 6: Change the header of the figures by clearly stating the date considered. Make the regression parameters more readable. Reduce the number of significant figures of the parameters.

Figure 7-9: Make the regression parameters more readable. Reduce the number of significant figures of the parameters.

Tables 1 to 5: Reduce the number of significant figures.

Paragraph "Regional -National upscale": Explain more clearly how the upscale was performed: were the linear models applied to each pixel of the fields considered? Which of the proposed linear models was applied? It would also be interesting to report the prediction obtained with each of the proposed models. Given the simplicity of the (linear) models, it would also be appropriate to assess the uncertainty in the estimates made, to be compared with the discrepancy between models and reference values.

Author Response

We are grateful for your insightful comments, which we are certain have not only helped us improve the quality of our manuscript, but also expand its scope and enhance its novelty. Below are the detailed answers to all review comments (underlined):

The paper proposes a system for determining the productivity of tomato fields in Azerbaijan based on the NDVI parameter from Sentinel-2 imagery. The paper is essentially well written, complete and clear. However, the scientific interest is limited because, as the authors themselves state, the NDVI parameter for such purposes has been used for decades. Essentially, the proposed paper lacks scientific and practical innovativeness.

We thank the reviewer for his feedback. To address these concerns, we have further expanded the aim of this paper and added five (5) new Vegetation Indices to the analysis pipeline, which are now presented in all relevant sections of the manuscript.

Figure 3: Insert colourbars to highlight the relationship between the colours and parameter values shown in the maps.

We are grateful to the reviewer for this comment. We have added the corresponding colourbar to the image.

Line 229: Specify and justify the threshold values of the quantiles used for classification.

We thank the reviewer for his comment. We agree that providing a map with no numeric legend values is rarely good practice; in our case, however, the exact threshold values are volatile for each VI and date, as all fields are taken into consideration. The reason we included this image was therefore not to show the temporal evolution of the absolute values of a single VI, but rather of the relative ones, which was the initial aim of the implemented quantile classification, as it provided a much “simpler” piece of information to the farmers. A single legend with values can of course be added for each specific date (as the quantile thresholds vary between dates), but we strongly believe that it would be redundant, as it would provide information for the entire layer (all fields) rather than the field itself, which is information that we have focused on at the start of our “Results” section, where all single-date descriptive statistics are presented in detail.

Figure 4: Add to the legend the quantile ranges corresponding to the three classes.

Please see our reply to the previous comment, as the same thing also applies here.

Line 281: Specify and clearly list the five dates corresponding to the five images used.

We thank the reviewer for this comment, as it helped us notice that the data collection dates were not explained clearly, nor as early in the text as we intended. To this end, an earlier paragraph, placed before the data collection dates are presented in the form of the five (5) sample NDVI maps, has been updated accordingly (Line 234).

Line 316: The sentence seems to speak of all possible intervals that can be constructed on the basis of the dates considered. However, only intervals 1-5, 1-4, 2-5, 1-3, 3-5 were considered. Explain that a subset of the possible intervals was considered, justify the choice and explain the nomenclature used.

We used a simple “area calculation” approach, which requires adding consecutive trapezoids on the 2D plane to calculate the total area under the curve. With two consecutive data points (dates) creating a trapezoid, we have 4 trapezoids for each field (and for each VI, as we have now integrated more VIs in our manuscript after the revision process) per season. Therefore, all the potential combinations that can be used (again, considering only consecutive trapezoids, and not skipping dates with available data, e.g., integrating measurements [1-3-5] only), with a minimum of three trapezoids used per case as a prerequisite, were the 5 that we used in our study.
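The trapezoid summation described in this reply can be sketched as follows. This is an illustrative reading of the approach, not the authors' code: the function name, the acquisition days and the NDVI values are invented for the example, and only the five interval labels (1-5, 1-4, 2-5, 1-3, 3-5) are taken from the reviewer's comment.

```python
def trapezoid_area(days, values, start, end):
    """Area under the VI curve between 1-based acquisition indices start..end.

    One trapezoid is added per pair of consecutive dates, so no date is skipped.
    """
    area = 0.0
    for k in range(start - 1, end - 1):
        width = days[k + 1] - days[k]
        area += 0.5 * (values[k] + values[k + 1]) * width
    return area


# Hypothetical acquisition days (days since the first image) and field-mean NDVI.
days = [0, 10, 20, 30, 40]
ndvi = [0.40, 0.55, 0.70, 0.65, 0.50]

# Intervals as listed in the reviewer's comment: (first, last) acquisition index.
intervals = [(1, 5), (1, 4), (2, 5), (1, 3), (3, 5)]

for start, end in intervals:
    area = trapezoid_area(days, ndvi, start, end)
    print(f"{start}-{end}: cumulative NDVI = {area:.2f}")
```

Since each interval is a sum of adjacent trapezoids, the areas are additive: the 1-3 and 3-5 values together equal the 1-5 value.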

Table 1: Define "Coef. of variation" as the ratio between standard deviation and mean.

Addressed.
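For reference, the definition the reviewer asks for (standard deviation divided by mean) can be written as a short sketch. The sample values are invented, and the use of the sample (n − 1) standard deviation is an assumption, since the record does not say which form the authors used.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (dimensionless)."""
    return stdev(values) / mean(values)


# Hypothetical field-mean NDVI values for one acquisition date.
ndvi = [0.40, 0.50, 0.60]
cv = coefficient_of_variation(ndvi)
print(f"CV = {cv:.2f} ({cv:.0%})")
```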

Table 1: Modify the column headings by making it more explicit that the numbering used refers to the dates considered.

Most tables have been modified and formatted accordingly to accommodate the new data and findings we have added to the manuscript.

Table 2: In the caption specify that the intervals are not all the possible ones but all the considered ones.

Please see an earlier reply regarding the cumulative ranges and the idea behind using the consecutive trapezoids.

Line 329 and Table 3: There is a disagreement as to what was considered to assess the correlation. In the text reference is made to NDVI values, in the caption to quantiles.

We deeply thank the reviewer for this comment, as the quantiles were indeed the ones used, and the paragraph above the table has been updated accordingly.

Table 3: It is necessary to describe the statistical significance of the correlations by indicating the corresponding p-value. It is required to enter the corresponding p-value for each correlation, or alternatively, enter in the caption the correlation value corresponding to p-value = 0.05. For example, the value 0.1553 is not statistically significant.

Table 4: Same consideration as table 3 on p-value.

Table 5: Same consideration as table 3 and 4 on p-value.

Addressed for all 3 tables.
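The reviewer's two options (a p-value per correlation, or the correlation value that corresponds to p = 0.05) can be illustrated with a rough sketch. It uses Fisher's z-transformation with a normal approximation, which is a stand-in chosen here because it needs only the standard library; it is not necessarily the test the authors applied. The sample size `n_fields` is hypothetical, as the record does not state the number of fields.

```python
from math import atanh, sqrt, tanh
from statistics import NormalDist


def approx_p_value(r, n):
    """Two-sided p-value for H0: correlation = 0, via Fisher's z approximation."""
    z = atanh(r) * sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def critical_r(n, alpha=0.05):
    """Smallest |r| reaching significance level alpha under the same approximation."""
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return tanh(z_crit / sqrt(n - 3))


n_fields = 50  # hypothetical sample size; not taken from the manuscript

# 0.1553 is the correlation value the reviewer cites as an example.
print(f"p-value for r = 0.1553: {approx_p_value(0.1553, n_fields):.2f}")
print(f"|r| needed for p = 0.05: {critical_r(n_fields):.2f}")
```

The threshold depends strongly on the sample size, which is why a single critical value in the caption only works when every correlation in the table is computed from the same number of fields.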

Figure 6: Change the header of the figures by clearly stating the date considered. Make the regression parameters more readable. Reduce the number of significant figures of the parameters.

Figure 7-9: Make the regression parameters more readable. Reduce the number of significant figures of the parameters.

Addressed for all figures.

Tables 1 to 5: Reduce the number of significant figures.

Addressed for both the original 5 tables, and also all the new ones added during the revision process.

Paragraph "Regional -National upscale": Explain more clearly how the upscale was performed: were the linear models applied to each pixel of the fields considered? Which of the proposed linear models was applied? It would also be interesting to report the prediction obtained with each of the proposed models. Given the simplicity of the (linear) models, it would also be appropriate to assess the uncertainty in the estimates made, to be compared with the discrepancy between models and reference values.

We thank the reviewer for this very insightful comment. As a matter of fact, the section on the prediction upscale is of great interest to us, and we are already preparing to update our methodology for this year’s iteration, which will of course be presented in a future publication. Of the information requested, some already exists within the manuscript (namely, the prediction values are presented in Table 19), while the remaining points made by the reviewer (VI values considered and model used, along with the method of prediction generation) have been added in lines 459-462.
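The record does not describe how the upscale or its uncertainty was computed. As a hedged illustration of the reviewer's suggestion only, the sketch below fits a simple linear model to invented NDVI and yield values, applies it to other fields, and reports the residual standard error as a crude uncertainty measure. The function name `fit_line` and every number are assumptions for the example.

```python
from math import sqrt


def fit_line(x, y):
    """Ordinary least squares fit y = intercept + slope * x.

    Also returns the residual standard error (n - 2 degrees of freedom).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    rse = sqrt(sum(e * e for e in residuals) / (n - 2))
    return intercept, slope, rse


# Hypothetical field-mean NDVI and reported yield (t/ha) for pilot fields.
ndvi = [0.45, 0.52, 0.58, 0.63, 0.70, 0.76]
yield_t_ha = [38.0, 44.0, 47.0, 55.0, 58.0, 66.0]

intercept, slope, rse = fit_line(ndvi, yield_t_ha)

# Apply the fitted line to fields outside the pilot set (again hypothetical).
for field_ndvi in (0.50, 0.68):
    predicted = intercept + slope * field_ndvi
    print(f"NDVI {field_ndvi:.2f}: {predicted:.1f} t/ha (residual SE {rse:.1f} t/ha)")
```

Comparing the residual standard error with the gap between predicted and reference yields is one simple way to judge whether a discrepancy is larger than the model's own scatter.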

Round 2

Reviewer 1 Report

1. I can hardly distinguish the spatial relationship between Figure 1, Figure 2 and Figure 3.

2. The 3-class quantile classification is not an appropriate approach. The NDVI value range of each grade will change from image to image.

Author Response

1. I can hardly distinguish the spatial relationship between Figure 1, Figure 2 and Figure 3.

We have updated the maps in Figure 1 to better show the connection between the different Figures (namely Figure 3, as Figure 2 is an example of a few field boundaries). We are now confident that they are clearer and more distinguishable, and readers should not have difficulty identifying the study area.

2. The 3-class quantile classification is not an appropriate approach. The NDVI value range of each grade will change from image to image.

We thank the reviewer for their comment. We agree, naturally, and we understand this concern, as the exact threshold values of the NDVI maps are volatile for each VI and date, since all fields are taken into consideration in our approach. The reason we selected the quantile classification approach was therefore not to show the temporal evolution of the absolute values of a single VI, but rather of the relative ones, as it provided a much “simpler” piece of information to the farmers, who could compare and assess the status of their fields throughout the growing season. Please consider that a critical part of the presented project is user accessibility, and how easily farmers can interpret the generated results is of major importance for the project’s success. Finally, for the purposes of this manuscript, a single legend with numeric class thresholds could of course have been added for each specific date (as the quantile thresholds vary between dates, resulting in 5 legends in total in Figure 4), but we strongly believe that it would be redundant, as it would only provide information for the entire layer / all fields rather than the presented field itself (as the reviewer correctly stated in their comment), which is information that we have heavily focused on at the start of our “Results” section, where all single-date descriptive statistics are presented in detail.

Reviewer 3 Report

The paper improved a lot after the revision. Please indicate the p-values with at least one significant digit or in the form of, for example, 3*10^-4.

Author Response

The paper improved a lot after the revision. Please indicate the p-values with at least one significant digit or in the form of, for example, 3*10^-4.

We are deeply pleased to know that we have succeeded in meeting the expectations of the reviewer, and we are grateful for their positive input on the revised version of the manuscript. Regarding the comment on statistical values, all respective tables have been updated accordingly.
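As an illustration of the requested notation (a mantissa times a power of ten, with one significant digit), a small formatting helper might look like the sketch below. The function name and the sample p-values are hypothetical and are not taken from the manuscript's tables.

```python
def format_p_value(p):
    """Format p with one significant digit as mantissa*10^exponent."""
    mantissa, exponent = f"{p:.0e}".split("e")
    return f"{mantissa}*10^{int(exponent)}"


# Hypothetical p-values, chosen only to show the output format.
for p in (0.0003, 0.0421, 0.00000712):
    print(format_p_value(p))
```

This avoids the situation where a small p-value printed with a fixed number of decimals collapses to 0.00 and loses all information.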

Round 3

Reviewer 1 Report

The authors have made serious revisions to the manuscript, but some content has not been revised; we suggest that the manuscript be published after revision.

1. The regression models in Table 18 and Figure 8 are different; please check them. The same applies to Table 16 and Figure 6.

2. In Table 17 and Table 18, the best VI is GNDVI; why is the NDVI model used to estimate yield?

3. Compared with Manuscript v3, the PVI equations have been updated, but Table 12 and Table 13 have not been updated.

Author Response

Once again, we are deeply grateful to the reviewer for going through the updated document and providing us with their valuable inputs. Our replies and respective adjustments on the manuscript can be found below:

1. The regression models in Table 18 and Figure 8 are different; please check them. The same applies to Table 16 and Figure 6.

There were minor differences (in the second decimal place) in some values, apparently originating from different rounding performed automatically by the statistical software. We have updated them, and the regression equations in the tables and the charts are now identical.

2. In Table 17 and Table 18, the best VI is GNDVI; why is the NDVI model used to estimate yield?

This is indeed an interesting aspect of our manuscript. Initially, the main focus of this work was to present the results of the EU-funded project mentioned in our paper. However, during the revision process, the scope of the paper has been expanded, and the analysis has included multiple new Indices, which were not initially part of the project, where only NDVI was used for various reasons (with its popularity among the research community and therefore its ease of results' reusability and existing literature being a few of them). As a result, we have tried to preserve this initial scope of the paper, which focuses on the results of the EU project (and therefore NDVI), without diminishing the results of our newly implemented analysis.

3. Compared with Manuscript v3, the PVI equations have been updated, but Table 12 and Table 13 have not been updated.

This is something that we also had to double-check ourselves before submitting the 3rd version of the manuscript, to make sure that our results were correct. The PVI rasters (like all Index rasters) were generated using a predetermined equation from the Index Database (in the case of PVI: https://www.indexdatabase.de/db/i-single.php?id=64), and not the formula that we had mistakenly written in the older versions of the manuscript. Therefore, the index formula used in our analysis was the correct one, and the mistake was only in the equation written in the table.


Reviewer 3 Report

Apart from some problems with too many non-significant digits in the numbers in the tables, it seems OK.

Round 4

Reviewer 1 Report

There are no further suggestions.
