Soya Yield Prediction on a Within-Field Scale Using Machine Learning Models Trained on Sentinel-2 and Soil Data
Round 1
Reviewer 1 Report
Comments on “Pixel-based soya yield prediction using HPC system and machine learning models trained on satellite and soil data”
Major comments
The research topic is important, but the method, datasets, motivation, and major contribution of this work are not elaborated well. Using machine learning to predict crop yield from remote sensing images and field observations is a good idea, but this manuscript is not suitable for publication. Much work remains to improve it. Besides, the paper has no conclusion.
First, the authors should provide more information and background about current progress in crop yield prediction, present the knowledge gap and scientific question, and explain the major purpose and objectives of this work. The introduction section should be reorganized. In general, the background and a review of previous studies should come first, followed by the existing issues, the objectives, and the approach of this study.
Second, the authors explained how to integrate the yield data and the remote sensing grid, but the data processing description does not mention the spatial resolution of the remote sensing images and the soil grid datasets. Model inputs include remote sensing and soil data, so they need to be brought to the same resolution.
Third, the authors did not mention which bands of the remote sensing images they used in this study. They just mentioned 12 bands. These bands have different spatial resolutions, such as 10 m or 60 m.
The last and most important question concerns the method. The authors used images from three years as 36 features, together with yield data, to train the model. Do the yield data also come from three years? It is not reasonable to train the model using data from different years. The training data should be consistent.
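Editorial illustration of the second comment above (bringing soil and satellite inputs to one resolution): a minimal sketch, assuming the coarser soil grid is an exact integer multiple of the satellite pixel size and that nearest-neighbour replication is acceptable. The function name and the factor are hypothetical; the manuscript's actual resampling procedure is not described in this record.

```python
def upsample_nearest(grid, factor):
    """Replicate each cell of a coarse 2-D grid into a factor x factor block.

    Assumption: the coarse cell size is an exact multiple of the fine cell
    size and both grids share the same origin.
    """
    fine = []
    for row in grid:
        # Repeat every value `factor` times along the row...
        fine_row = [value for value in row for _ in range(factor)]
        # ...and repeat the widened row `factor` times down the column.
        fine.extend(list(fine_row) for _ in range(factor))
    return fine


# Hypothetical 1 x 2 soil grid, replicated onto a grid twice as fine.
soil_coarse = [[6.1, 6.4]]
soil_fine = upsample_nearest(soil_coarse, 2)
```

Nearest-neighbour replication adds no new information; it only aligns the coarse soil value with each satellite pixel that falls inside the same soil cell.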
Specific comments
- Line 8: Please revise this sentence. Do the authors mean three seasons of each year or one season in each of three years? It is not clear to readers.
- Line 16: Keywords. Please add “remote sensing”.
- Tables: Please revise the table format according to journal requirements.
- Line 121: The authors may consider applying cross-validation for model training and evaluation.
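Editorial illustration of the cross-validation suggestion above: a minimal sketch of shuffled k-fold cross-validation in plain Python. The helper names (`k_fold_indices`, `cross_validate`, `fit_mean`, `predict_mean`) and the toy data are hypothetical, and the mean predictor only stands in for whichever regression model is evaluated.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs for shuffled k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Striding assigns samples so that fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


def cross_validate(fit, predict, X, y, k=5, seed=0):
    """Return the RMSE of each fold for a model given by fit/predict callables."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(X), k, seed):
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        squared = [(predict(model, X[i]) - y[i]) ** 2 for i in test_idx]
        scores.append((sum(squared) / len(squared)) ** 0.5)
    return scores


# Toy baseline (assumption): always predict the mean yield of the training fold.
def fit_mean(X_train, y_train):
    return sum(y_train) / len(y_train)


def predict_mean(model, x):
    return model


X_toy = [[float(i)] for i in range(10)]
y_toy = [float(i) for i in range(10)]
fold_rmse = cross_validate(fit_mean, predict_mean, X_toy, y_toy, k=5)
```

Reporting the mean and spread of the per-fold scores gives a more stable estimate than a single train/test split. Note that pixels from the same field are spatially correlated, so a random pixel-level split may be optimistic.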
Author Response
Dear Sir/Madam,
We thank you for these relevant comments on points that need to be clarified in the revised version of the manuscript. Please see the attached file with our answers.
Best regards,
Branislav Pejak
Author Response File: Author Response.pdf
Reviewer 2 Report
- Introduction
The introduction is very shallow. The authors do not reference significant past findings on yield prediction using satellite remote sensing. Please provide an extensive review of past methods.
Similarly, the objectives of the paper are not clearly defined.
3. Results
In terms of results, the statistics shown in Table 3 are not sufficiently convincing to justify using these algorithms for yield prediction. The current state of existing research is still not clear in the manuscript. Did the authors find a good model? Did they make a comparison with existing research or methods?
Please enlarge the scope of your study and try to find an optimal solution by comparing with some existing state-of-the-art methods. Choosing some random machine learning models and producing results is trivial.
At this stage, I recommend major revisions, in the hope that the revised version will provide better insight. I also recommend that the authors provide their trained model on GitHub so that others can not only test it but also use it in their own research to further improve or test the methods.
Good luck
Author Response
Dear Sir/Madam,
We truly thank you for raising these important points. We have tried to address the shortcomings of the article in this short time and present you with all changes and improvements. Please see the attached file with our answers.
Best regards,
Branislav Pejak
Author Response File: Author Response.pdf
Reviewer 3 Report
The aim of this paper is to evaluate pixel-based yield regression models from optical satellite imagery and soil parameters, using a series of production data from soybean plots from three different seasons, captured directly from the harvester's monitor, as well as from the ISRIC (International Soil Reference and Information Centre) SoilGrids database.
However, the manuscript is not yet ready for publication, but it has much promise if the authors are willing to undertake some revisions.
Line 35. The authors state that "Based on the pre-season yield prediction, soybean varieties are chosen [10,11]". In the development of the work, the varieties used to model the yield have not been detailed. Are all the plots in the 3 seasons sown with the same variety, and if so, which is it? In case they are different varieties, estimates should have been carried out for each of them.
Lines 76-78 the authors state: “Due to the proximity of fields, which were all located in the Upper Austria region, we considered the fields to have been influenced by the same weather conditions, and therefore, weather variables were not considered.” The data analysed are from three different seasons, so presumably the meteorological variables will not be the same. Would it not have been more appropriate to incorporate them so that the models could explain the interannual variations in production?
Line 115. 2.4 Machine Learning.
From the balance of available data defined in lines 116 and 117 it follows that the data have been divided into campaigns (47 variables = 3 images x 12 bands (36), + 11 soil data). This justifies the need to use meteorological data for each year.
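Editorial illustration of the variable count above: a minimal sketch of how one pixel's 47-element feature vector could be assembled from 3 images of 12 bands each plus 11 soil values. The counts come from the reviewer's comment; the function name, argument layout, and placeholder values are hypothetical.

```python
N_IMAGES = 3   # satellite images per pixel, as stated in the comment
N_BANDS = 12   # bands per image, as stated in the comment
N_SOIL = 11    # soil variables, as stated in the comment


def build_feature_vector(bands_per_image, soil_values):
    """Concatenate per-image band values and soil values for one pixel."""
    if len(bands_per_image) != N_IMAGES:
        raise ValueError(f"expected {N_IMAGES} images, got {len(bands_per_image)}")
    if len(soil_values) != N_SOIL:
        raise ValueError(f"expected {N_SOIL} soil values, got {len(soil_values)}")
    features = []
    for bands in bands_per_image:
        if len(bands) != N_BANDS:
            raise ValueError(f"expected {N_BANDS} bands, got {len(bands)}")
        features.extend(bands)
    features.extend(soil_values)
    return features


# Placeholder values only; real inputs would be reflectances and soil properties.
example = build_feature_vector([[0.1] * N_BANDS] * N_IMAGES, [1.0] * N_SOIL)
```

The resulting length is 3 x 12 + 11 = 47, which matches the number of variables quoted in the comment.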
The pixel size taken was 10 x 10 m, but the Sentinel-2 bands are 10 x 10 m (4 bands), 20 x 20 m (6 bands) and 60 x 60 m (3 bands), the latter being, in my opinion, not very useful for this type of study. I understand that it would have been more correct to eliminate the 60 m bands and resample all the others to 20 m. This could have given more reliable results. It would also have been interesting to use spectral indices calculated from the original bands, such as NDVI.
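Editorial illustration of the two suggestions above: a minimal sketch, assuming 10 m bands are aggregated to 20 m by averaging 2 x 2 blocks and that NDVI is computed from the near-infrared and red bands (B8 and B4 on Sentinel-2) as (NIR - Red) / (NIR + Red). The function names and reflectance values are hypothetical, and block averaging is only one of several possible resampling choices.

```python
def downsample_mean(grid, factor):
    """Aggregate a 2-D grid to a coarser one by averaging factor x factor blocks."""
    rows, cols = len(grid), len(grid[0])
    if rows % factor or cols % factor:
        raise ValueError("grid dimensions must be divisible by factor")
    coarse = []
    for r in range(0, rows, factor):
        coarse_row = []
        for c in range(0, cols, factor):
            block = [grid[r + dr][c + dc]
                     for dr in range(factor) for dc in range(factor)]
            coarse_row.append(sum(block) / len(block))
        coarse.append(coarse_row)
    return coarse


def ndvi(nir, red):
    """Per-pixel NDVI; None marks pixels where NIR + Red is zero."""
    return [
        [(n - r) / (n + r) if (n + r) != 0 else None
         for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir, red)
    ]


# Hypothetical 2 x 2 reflectance patches at 10 m resolution.
red_10m = [[0.10, 0.10], [0.10, 0.10]]
nir_10m = [[0.50, 0.50], [0.50, 0.50]]

# Aggregate both bands to a single 20 m pixel, then compute NDVI there.
ndvi_20m = ndvi(downsample_mean(nir_10m, 2), downsample_mean(red_10m, 2))
```

Computing the index after aggregation keeps all inputs on one grid. Computing it at 10 m and averaging afterwards is an alternative that gives slightly different values, because NDVI is a nonlinear function of the bands.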
It would be necessary to carry out the study for each of the campaigns individually, as well as to do it jointly for all the years.
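Editorial illustration of the per-campaign and joint analyses suggested above: a minimal sketch that groups sample indices by season, so that a model can be fitted per campaign, and builds leave-one-season-out splits for the joint case. The season labels and function names are hypothetical; the actual seasons are not named in this record.

```python
def per_season_indices(seasons):
    """Map each season label to the indices of its samples."""
    groups = {}
    for i, season in enumerate(seasons):
        groups.setdefault(season, []).append(i)
    return groups


def leave_one_season_out(seasons):
    """Yield (held_out_season, train_idx, test_idx) for each distinct season."""
    for held_out in sorted(set(seasons)):
        train_idx = [i for i, s in enumerate(seasons) if s != held_out]
        test_idx = [i for i, s in enumerate(seasons) if s == held_out]
        yield held_out, train_idx, test_idx


# Hypothetical season label for each pixel sample.
sample_seasons = ["s1", "s1", "s2", "s2", "s2", "s3"]

by_season = per_season_indices(sample_seasons)          # per-campaign subsets
joint_splits = list(leave_one_season_out(sample_seasons))  # joint evaluation
```

Holding out an entire season tests whether a jointly trained model transfers to a year it has not seen, which is where interannual weather differences would show up.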
Finally, there is no "conclusions" section in the document where the results of the research are clearly and synthetically explained. A document of this kind should set them out in concrete terms.
Author Response
Dear Sir/Madam,
We gratefully thank you for your outstanding comments and suggestions about our work and manuscript. We address every comment carefully and explain the corresponding changes in the manuscript. Please see the attached file with our answers.
Best regards,
Branislav Pejak
Author Response File: Author Response.pdf
Reviewer 4 Report
Dear Authors,
in your manuscript you have presented an interesting topic on the evaluation of pixel-based yield regression models from optical satellite images and soil parameters. The optical images used in this study were from ESA's Sentinel-2 satellites. The study was conducted in Austria, and the data used for modeling came from 3 growing seasons.
The work was submitted as a scientific article. According to the rules of all MDPI journals, the results section should be followed by a detailed discussion of the research results that compares the results obtained with those known in the literature. Additionally, you should specifically cite the sources on which you base your conclusions. Your manuscript really lacks this part, which should demonstrate the ability to draw valuable conclusions in scientific work. The dogma of the discussion of results can be used by agricultural practitioners. Without this part (prepared decently) the manuscript cannot be published.
The work is valuable. I suggest resubmitting it after creating a solid "discussion of results" part.
Author Response
Dear Sir/Madam,
We sincerely thank you for your outstanding comments and suggestions about our work and manuscript. We addressed every comment carefully and explained the corresponding changes in the manuscript. Please see the attached file with our answers.
Best regards,
Branislav Pejak
Author Response File: Author Response.pdf
Round 2
Reviewer 1 Report
Dear all,
The authors have made great progress in the revised version. I suggest that it can be accepted now.
Best,
Author Response
Dear Reviewer 1,
Thank you for taking the time to review our work. We really appreciate your comments and suggestions to improve our research.
Best regards,
Branislav Pejak in the name of all authors.
Reviewer 2 Report
The revised version has improved and qualifies for publication. The authors have addressed all the comments and suggestions.
Author Response
Dear Reviewer 2,
Thank you for taking the time to review our work, and for your comments and suggestions, which allowed us to greatly improve the quality of the manuscript.
Best regards,
Branislav Pejak in the name of all authors
Reviewer 3 Report
The authors have made the corrections proposed in the review phase. Therefore, the paper is ready for publication.
Author Response
Dear Reviewer 3,
Thank you for taking the time to review our work. We greatly appreciate your comments and suggestions.
Best regards,
Branislav Pejak in the name of all authors.
Reviewer 4 Report
The authors still did not comply with the reviewer's recommendations: they did not create a discussion of the results.
Author Response
Dear Reviewer 4,
Thank you for taking the time to review our work. Your feedback is highly appreciated and will help us to improve our manuscript. We have elaborated our discussion of the results in terms of performance and comparison with other similar approaches. Please find it in the “Results” section. We also compared our results with a similar solution and explained the advantages and disadvantages of our approach.
Best regards,
Branislav Pejak in the name of all authors