Data Type and Data Sources for Agricultural Big Data and Machine Learning
Round 1
Reviewer 1 Report
The authors provided a broad review of the effects of machine learning and big data in agricultural applications. Sufficient background information was given on machine learning categories, data sources and types, as well as flow charts and architectures. The detailed discussion of how machine learning algorithms were applied to a specific type of data (structured/semi-structured/unstructured) gives the audience a broad, high-level view of the applications in agricultural big data. Overall, this review is well structured and easy to follow. Importantly, the literature examples included are appropriate and mostly drawn from very recent studies, reflecting the latest work in the field. As a result, I would recommend publication in the current form.
Author Response
Dear Reviewer
We appreciate the time you have taken to review our article. We have responded to your comments and updated the paper as requested. Please find attached a letter with the responses and a description of the adjustments made.
Yours sincerely
Ania Cravero
Author Response File: Author Response.pdf
Reviewer 2 Report
This manuscript reviewed 33 papers in the field of agriculture. The need for conducting such a review, the aim of this review, and the reason for selecting these papers are not clear. Furthermore, based on the current abstract, the review results in a trivial conclusion. Moreover, the relationship between this article and the scope of the “Sustainability” journal is not clear. The text should also be checked, as it is not written clearly. As a reviewer, I suggest rejection. Further comments are presented in the following:
1. Line 3: How does “drought” “increase diseases”? It is not a fact. You need to elaborate on it in the text. Also, for this type of general statement, it is better to use the simple present tense instead of the past tense.
2. Line 4: How is “Agricultural Big Data” a “high-performance computing technology”?
3. Line 5: Write clearly. What do you mean by “agricultural processes”? Name the processes in the text.
4. Line 6: What is the relationship between “understanding agricultural processes” and “rapid progress” of “social networks”? It is not clear.
5. Lines 6-7: What do you mean by “rapid progress in” “enabling the collection and analysis of data volumes”?
6. Lines 7-8: How are “temperature and humidity” data structured and “data from spreadsheets and information repositories” semi-structured? It seems to be an arbitrary and more importantly inaccurate classification.
7. Lines 9-10: What do you mean by “This study provides insight into …. its main challenges, and trends”?
8. General comment: This is a review paper, which should result in an overview. What is the outcome of this study? According to the current abstract, this review paper shows that “the primary data sources are Databases, Sensors, Cameras, GPS, and Remote Sensing, which capture data stored in Platforms such as Hadoop, Cloud Computing, and Google Earth Engine.” Does this conclusion really need a review article?
Author Response
Dear Reviewer
We appreciate the time you have taken to review our article. We have responded to your comments and updated the paper as requested. Please find attached a letter with the responses and a description of the adjustments made.
Yours sincerely
Ania Cravero
Author Response File: Author Response.pdf
Reviewer 3 Report
(1) The authors' own contribution should be highlighted more clearly in the abstract.
(2) The parameters in the expressions should be given and explained.
(3) The method in the context of the proposed work should be written in detail.
(4) The main contributions of this paper should be further summarized and clearly demonstrated.
(5) Section II is too long; please shorten it.
(6) In Section III (Methodology), the description is not clear. Please revise it.
(7) The literature review in this paper is poor. I hope that the authors can add some new references to improve it. For example: https://doi.org/10.1016/j.ins.2022.08.115; https://doi.org/10.3390/agriculture12060793; https://doi.org/10.1109/JSTARS.2021.3059451; and so on.
(8) They must also better explain the metrics considered and motivate their case.
Author Response
Dear Reviewer
We appreciate the time you have taken to review our article. We have responded to your comments and updated the paper as requested. Please find attached a letter with the responses and a description of the adjustments made.
Yours sincerely
Ania Cravero
Author Response File: Author Response.pdf
Reviewer 4 Report
The article addresses a very important topic and is extremely interesting. Next are a few minor remarks:
line 7,8,9: the authors state:
This data can be structured, such as temperature and humidity; semi-structured, such as data from spreadsheets and information repositories; and unstructured, such as data from files like PDF, TIFF, and satellite images
This sentence mixes up different concepts: for instance, temperature is the semantic aspect, that is, the meaning of the data. The data type is typically numeric (e.g., float). The data source can be a txt file, a spreadsheet, or a DB.
For an article whose main focus is data, being very precise about these differences is important.
line 10: the authors state:
It analyzed 33 papers collected through PRISMA.
PRISMA is a systematic review protocol, not a tool. Is this expression correct?
lines 328,329,330,331: these lines are in Spanish and should be translated.
lines 342,343,344,345: these lines are in Spanish and should be translated.
lines 572,573,574,575,576: these lines are in Spanish and should be translated.
line 318: writing error 'TThe study'
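The distinction drawn in the first remark (meaning, data type, data source) can be illustrated with a minimal Python sketch. The station name, column names, and the value 21.5 are hypothetical and do not come from the manuscript; the point is only that the same temperature reading keeps its meaning and its numeric type regardless of whether it is read from a text file or a database.

```python
import csv
import io
import sqlite3

# Hypothetical sample: one temperature reading (degrees Celsius) for one station.
# Semantics: "temperature"; data type: float; data source: varies below.
CSV_TEXT = "station,temperature_c\nA1,21.5\n"


def read_from_csv(text: str) -> float:
    """Data source: CSV text, as exported from a spreadsheet."""
    row = next(csv.DictReader(io.StringIO(text)))
    # CSV stores everything as text, so the numeric type is restored here.
    return float(row["temperature_c"])


def read_from_db() -> float:
    """Data source: a relational database (in-memory SQLite for illustration)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE readings (station TEXT, temperature_c REAL)")
    conn.execute("INSERT INTO readings VALUES ('A1', 21.5)")
    (value,) = conn.execute("SELECT temperature_c FROM readings").fetchone()
    conn.close()
    return value


csv_value = read_from_csv(CSV_TEXT)
db_value = read_from_db()
```

Both functions return the same float for the same quantity, which is why labelling "temperature" itself as structured or unstructured conflates the variable with its storage.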
Author Response
Dear Reviewer
We appreciate the time you have taken to review our article. We have responded to your comments and updated the paper as requested. Please find attached a letter with the responses and a description of the adjustments made.
Yours sincerely
Ania Cravero
Author Response File: Author Response.pdf
Reviewer 5 Report
1- The advantages of the topic may be written more clearly, i.e., as a single paragraph or point-by-point.
2- There is an inconsistency in the number of papers selected.
- According to Figure 11, 33 papers were selected.
- According to Figure 12, 32 papers were selected.
- According to the text, 35 papers were selected.
"The 35 selected articles describe various problems in the domain of agriculture that are possible to solve through Big Data and ML."
Which one is correct?
3- There is only one sentence related to Figure 6. It may be explained in more detail.
4- "Ensemble learning" is one of the most important, popular, and powerful recent techniques. However, the manuscript does not mention it.
5- Some abbreviations are used in the text without giving their expansion.
For example; HDFS
The authors should state what these abbreviations stand for.
6- Figure 12. Selected papers by year.
The most recent selected paper was published in 2020.
I suggest that the authors include more recent papers (especially those published in 2021 and 2022).
7- In the text, reference numbers should be placed in square brackets [].
For example,
"Semlali et al. use"
"Semlali et al. [22] use"
"Another example is Alex et al., who develop"
"Another example is Alex et al. [23], who develop"
"Eberendu et al. explain that 80 percent"
"Eberendu et al. [31] explain that 80 percent"
8- "Finally, used the Hadoop ecosystem to store and process the data analyzed with ML. Figure 1 depicts the complete process."
I think it should be Figure 3 instead of Figure 1.
Author Response
Dear Reviewer
We appreciate the time you have taken to review our article. We have responded to your comments and updated the paper as requested. Please find attached a letter with the responses and a description of the adjustments made.
Yours sincerely
Ania Cravero
Author Response File: Author Response.pdf
Round 2
Reviewer 2 Report
Unfortunately, the manuscript was not modified in a way that addresses the comments raised. I agree with the need to conduct a study on this topic. As an example, Lin et al. (2018), one of the papers suggested by the authors in their response file, reviewed 1,319 papers from 40 social sciences journals. As another example, Fassnacht et al. (2014) selected 113 out of the 474 results returned by their keyword search. However, this study only considers 43 papers, which is clearly far fewer than the number of related papers published on the topic. A review paper is expected to consider as many papers as possible in order to derive a suitable perspective on the literature based on the state of the art. As a result of considering too few papers in this so-called “review” paper, trivial findings have been reported at the end of the revised abstract. As a reviewer, I do not see any contribution in this article. I also recommend that the authors select a more suitable journal for submission after revising their work.
References:
Fassnacht, F. E., Hartig, F., Latifi, H., Berger, C., Hernández, J., Corvalán, P., & Koch, B. (2014). Importance of sample size, data type and prediction method for remote sensing-based estimations of aboveground forest biomass. Remote Sensing of Environment, 154, 102-114.
Lin, C. S., & Wang, Y. Y. (2018). Data type and data source preferences for six social sciences subjects in quantitative data reuses. Proceedings of the Association for Information Science and Technology, 55(1), 867-868.
Previous comments with no satisfactory response:
1. Line 3: How does “drought” “increase diseases”? It is not a fact. You need to elaborate on it in the text. Also, for this type of general statement, it is better to use the simple present tense instead of the past tense (the response is not satisfactory).
2. Line 4: How is “Agricultural Big Data” a “high-performance computing technology” (the response is not satisfactory)?
3. Lines 7-8: How are “temperature and humidity” data structured and “data from spreadsheets and information repositories” semi-structured? It seems to be an arbitrary and more importantly inaccurate classification (the response is not satisfactory).
4. Lines 9-10: What do you mean by “This study provides insight into …. its main challenges, and trends” (the response is not satisfactory)?
5. General comment: This is a review paper, which should result in an overview. What is the outcome of this study? According to the current abstract, this review paper shows that “the primary data sources are Databases, Sensors, Cameras, GPS, and Remote Sensing, which capture data stored in Platforms such as Hadoop, Cloud Computing, and Google Earth Engine.” Does this conclusion really need a review article?
Author Response
Dear Reviewer
Attached is a file with the responses to your comments.
Thank you for your time.
Author Response File: Author Response.pdf
Reviewer 3 Report
This paper can be accepted now.
Author Response
Dear Reviewer
Attached is a file with the responses to your comments.
Thank you for your time.
Author Response File: Author Response.pdf
Reviewer 5 Report
The authors revised the manuscript adequately according to the reviewers' comments.
The manuscript is now clearer and of higher quality.
I have no further comments.
I suggest accepting it for publication in its present form.
Author Response
Dear Reviewer
Attached is a file with the responses to your comments.
Thank you for your time.
Author Response File: Author Response.pdf
Round 3
Reviewer 2 Report
The authors’ responses to my previous comments were not satisfactory:
Comment #1: In “Computers and Electronics in Agriculture” alone, there are certainly more than 43 papers on the topic, let alone in other journals. For instance, I searched for “Big Data and Machine Learning” in that journal and got 469 results. Although not all 469 results may be closely related to this topic, a similar search in other related journals should lead to more than 400 related papers, not 43. Thus, it is obvious that 43 papers do not amount to a comprehensive review of the current state of the art. The authors are advised to search the literature instead of providing excuses for not including more of the related papers that are available. A review paper is expected to cover the majority of the papers already published on the topic.
Comment #2: Remove this sentence from the abstract: “Machine Learning and Agricultural Big Data are high-performance computing technologies”.
Comment #3: In your article, the definition of structured vs. semi-structured data is not clear. I suggest that the authors stick to the established definitions of these terms instead of making up new ones. According to the literature, “Structured data represents data in a flat table. In contrast, semi-structured data can contain hierarchies of nested information”. If we accept this definition, any variable can be formatted as structured or semi-structured data. So, the classification does not depend on the type of the variable, as stated in your article (Line 10). Please correct it.
Comment #4: The conclusions obtained by this review paper are marginal and do not provide “insight into the data types used in Agricultural Big Data, its main challenges, and trends”, as stated in the abstract. Further analysis of the available articles seems to be required to reach plausible conclusions.
New comments:
Comment #5: Add a figure showing on a world map where the studies were conducted.
Comment #6: Figure 4 and Figure 5 should be omitted because they have no added value.
Comment #7: What are the eligibility criteria for selecting papers in Figure 10? Also, why did you choose only 43 papers out of 521 papers in Figure 10? The text also needs to contain this information.
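The point made in Comment #3, that the same variable can be laid out as structured or semi-structured data, can be sketched in a few lines of Python. This is only an illustration under assumed names: the station, sensor labels, timestamps, and values are hypothetical and are not taken from the manuscript.

```python
import csv
import io
import json

# Hypothetical readings in a flat table (structured layout): one row per reading.
FLAT_CSV = (
    "station,sensor,timestamp,value\n"
    "A1,temperature,2022-01-01T00:00,21.5\n"
    "A1,humidity,2022-01-01T00:00,60.0\n"
)


def flat_to_nested(text: str) -> dict:
    """Regroup flat rows into a station -> sensor -> readings hierarchy."""
    nested: dict = {}
    for row in csv.DictReader(io.StringIO(text)):
        station = nested.setdefault(row["station"], {})
        station.setdefault(row["sensor"], []).append(
            {"timestamp": row["timestamp"], "value": float(row["value"])}
        )
    return nested


def nested_to_flat(nested: dict) -> list:
    """Flatten the hierarchy back into (station, sensor, timestamp, value) rows."""
    rows = []
    for station, sensors in nested.items():
        for sensor, readings in sensors.items():
            for reading in readings:
                rows.append(
                    (station, sensor, reading["timestamp"], reading["value"])
                )
    return rows


nested = flat_to_nested(FLAT_CSV)            # semi-structured (nested) layout
nested_json = json.dumps(nested, indent=2)   # the same data serialized as JSON
flat_rows = nested_to_flat(nested)           # back to a flat, structured layout
```

The temperature and humidity values are identical in both layouts; only the container changes, which is why the classification attaches to the format rather than to the variable.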