Peer-Review Record

Data-Efficient Sowing Position Estimation for Agricultural Robots Combining Image Analysis and Expert Knowledge

Agriculture 2025, 15(14), 1536; https://doi.org/10.3390/agriculture15141536
by Shuntaro Aotake 1,2,*, Takuya Otani 3, Masatoshi Funabashi 1,4 and Atsuo Takanishi 2
Submission received: 14 May 2025 / Revised: 10 July 2025 / Accepted: 14 July 2025 / Published: 16 July 2025

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

Review of the article: ‘Data-efficient sowing position estimation for agricultural robots combining image analysis and expert knowledge’.

The reviewed article presents an innovative framework aimed at automating sowing tasks in highly complex and biologically diverse agricultural environments, specifically in synecological farming systems. By integrating RGB image analysis with expert labelling and machine learning (Random Forests), the authors propose an energy-efficient approach to predicting optimal sowing positions and quantities. The optimisation of the learning process by minimising the number of training cases is an extremely important aspect of this work. In my opinion, this approach will soon become key to cooperation with ML systems, if only because of the energy savings it achieves.

I have presented my detailed comments below.

I must admit that, despite the very innovative approach to the topic of cooperation between machine learning algorithms and human perception, several elements of the work require significant improvement or clarification. Why did the authors use the Random Forest (RF) method? I would appreciate a more detailed explanation. Personally, I believe that existing models using convolutional neural networks (CNNs), or hybrid models trained in a few-shot setting, outperform RF systems in visual object recognition. In my own research, I use YOLO methods, which I find very flexible for object recognition. The paper lacks a critical comparison with other solutions, such as CNNs.

Was the developed algorithm validated? All robot movements and sowing operations are only simulated. Was the developed system tested under real conditions?

The manuscript mentions the use of 218 image features from ISOM, but does not provide a detailed description of them.

I realise that this is difficult given the volume of the manuscript, but it would be worthwhile to group them according to common characteristics (texture, spatial relationships), as this would significantly increase the chances of replicating the results, which is important given the limited access to the ISOM implementation.

I have some reservations about the fit of the models. Do they overfit? For model one, the difference between the training score (0.86) and the evaluation score (0.23) is very large. Are you not concerned about the model's lack of flexibility and generalisability? Please explain in your paper why such overfitting was considered acceptable. I am also missing information on whether the robot presented in the manuscript is completely autonomous. To clarify, please state to what extent the operator intervenes in the sowing maps.

In summary, the reviewed manuscript presents very interesting research results, but in my opinion it requires significant supplementation: validation under real conditions, a description of the ISOM features, and broader comparisons of the developed models.

Author Response

We apologize for the delay in submitting the revised manuscript. Thank you for your very insightful comments. We have carefully reviewed the points you raised and made revisions accordingly.

In particular, regarding your comments on overfitting, we re-trained the models and applied strict checks, such as excluding models where the difference between the training score and the evaluation score was 0.13 or greater. We finally adopted a group of models where the difference between the training and evaluation scores was within 0.06.
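For illustration, a minimal sketch of such a selection gate (our own example, not the authors' code; the thresholds come from this response, while the data split, candidate parameter sets, and scoring metric are assumptions):

```python
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

def select_models(candidate_params, X, y, reject_gap=0.13, adopt_gap=0.06):
    """Train one Random Forest per hypothetical parameter set and keep
    only those whose train/evaluation score gap stays within the band."""
    X_tr, X_ev, y_tr, y_ev = train_test_split(X, y, random_state=0)
    adopted = []
    for params in candidate_params:
        model = RandomForestRegressor(random_state=0, **params).fit(X_tr, y_tr)
        gap = model.score(X_tr, y_tr) - model.score(X_ev, y_ev)
        if gap >= reject_gap:
            continue  # strict inspection: gap of 0.13 or more, excluded
        if gap <= adopt_gap:
            adopted.append((params, gap))  # adopted: gap within 0.06
    return adopted
```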

Additionally, regarding the details of the 218-dimensional features, we have added a summary within the paper and created supplementary materials detailing the specifics.

The revised sections within the paper are indicated in red text.

Author Response File: Author Response.docx

Reviewer 2 Report

Comments and Suggestions for Authors

1. Section 2.1 is placed inside Section 1 (Introduction).
2. A "tool" is mentioned in Figure 1. What is the "tool"? What is its function? What is its structure? How does it work?
3. Figure 3: add photographs of the physical device.
4. Figure 4: Is the seed ball made by an industrialized production method? What is its compressive resistance? What about its sphericity and uniformity? Can it meet the extrusion requirements of the operating mechanism in Figure 3b?
5. Lines 376-420: The descriptions in items 1-4 are overly complex and share similarities. It is suggested to merge and simplify them.
6. Line 384: grammatical error.
7. In Figure 12, near coordinate (20, 50), a sowing point is located on a leaf. Does this meet the requirements for the sowing position? Does this indicate problems with your location annotation, training, or recognition results? Moreover, this should make it impossible to complete the sowing operation correctly.
8. In line 517, was the influence of different datasets on the recognition accuracy tested? Will a larger dataset improve accuracy?
9. Section 3.5: Add on-site test photos to enhance the credibility of the verification experiments.
10. The references are too outdated. Ensure that at least one fifth of the references are from the last five years.

Author Response

Thank you very much for your very thorough and detailed review. We appreciate it greatly.

We reply below to each numbered comment.

Since the Journal Office has urged us to respond quickly, we are sending our replies first; please allow a moment while we revise the manuscript.

---

1. Section 2.1 is placed inside Section 1 (Introduction).

A: Thank you for the remark. Revised.

 

2.  "tool" is mentioned in Figure 1.  What is "tool"?  What is the function?  What structure?  How does it work? 

A : "tool" is meant to be "task tools". We developed the interface that allows multiple task devices can be flexibly replaced depending on the task. We've developed a harvesting tool, a pruning tool, and a sowing tool. 

The detail is explained in the following paper.

https://www.mdpi.com/2077-0472/13/1/18

We will add a brief introduction about this interface and tools.

 

3. Figure 3: add photographs of the physical device.

A: We are asking the co-author and will add photographs if possible.

 

4. Figure 4: Is the seed ball made by an industrialized production method? What is its compressive resistance? What about its sphericity and uniformity? Can it meet the extrusion requirements of the operating mechanism in Figure 3b?

A: We asked the co-author for details. The seed ball is not yet made by an industrialized production method, but we developed the first protocol intended for industrialization. We have added the details of the seed ball.

 

5. Lines 376-420: The descriptions in items 1-4 are overly complex and share similarities. It is suggested to merge and simplify them.

A: Thank you. Revised.

 

6. Line 384: grammatical error.

A: Thank you. Revised.

 

7. In Figure 12, near coordinate (20, 50), a sowing point is located on a leaf. Does this meet the requirements for the sowing position? Does this indicate problems with your location annotation, training, or recognition results? Moreover, this should make it impossible to complete the sowing operation correctly.

A:

The sowing point on the leaf that you pointed out is correct: position annotations can legitimately be placed on top of leaves, and this is especially true of annotators with a high degree of proficiency. The reason is that when a plant grows and spreads its leaves, the area below becomes shaded and the vegetation there weakens. If that plant dies and no seeding has been done underneath it, the vegetation will keep weakening over time, or a niche will form for only certain vigorous plants. Therefore, in addition to simply sowing uncovered areas, sowing under leaves where the vegetation is weakening is a reasonable approach for forming highly diverse vegetation. Furthermore, since newly germinated seedlings are very fragile and die if constantly exposed to direct sunlight, shaded or semi-shaded positions under vegetation are suitable for seeding. In addition, although this paper only verified the low-coverage situation, the model design can accommodate sowing under plant leaves when coverage is high.

As for whether the sowing operation can be completed correctly: the recognition process uses the depth data to detect leaves that lie above ground level and outputs the correct sowing height. Regarding sowing control, if the leaves are soft, or if the target coordinates do not overlap with a trunk, we assume the current interface can be used without problems. However, if the leaves are hard, if the coordinates overlap with a trunk, or if the vegetation is more complex, it may not be possible to complete the sowing operation correctly. For those cases, the interface would need to be improved to brush leaves aside, and the control would need to approach the sowing point from the periphery.

We will add an explanation of these points to the paper.
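For concreteness, a minimal sketch of this kind of depth check (our own illustration, not the implemented system; the downward-looking camera geometry, the crude ground-plane estimate, and the 3 cm margin are all assumptions):

```python
import numpy as np

def sowing_height(depth_map: np.ndarray, row: int, col: int,
                  leaf_margin_m: float = 0.03) -> tuple[bool, float]:
    """Report whether a candidate sowing pixel lies on foliage above
    ground level, and the height offset to use for sowing."""
    # Crude ground estimate: the deepest ~5% of the map (the camera looks
    # down, so larger depth means closer to the ground).
    ground_depth = np.percentile(depth_map, 95)
    height = float(ground_depth - depth_map[row, col])
    return height > leaf_margin_m, height
```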

 

8. In line 517, was the influence of different datasets on the recognition accuracy tested? Will a larger dataset improve accuracy?

A: We have not tested the impact of different datasets on recognition accuracy. We believe that a larger dataset would, of course, improve accuracy, but the key point is not the improvement in accuracy so much as the gain in generalization ability needed to respond appropriately to other situations. Furthermore, while we acknowledge that the dataset in this paper is small, we believe that the strength of this model is the speed and flexibility of its improvement cycle, which can quickly adapt to new situations through human labeling.
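Should such a test be needed later, a standard way to probe the dataset-size question is a learning curve; a sketch under assumed data (the synthetic features here merely stand in for the 218-dimensional ISOM features and are not real sowing data):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import learning_curve

# Synthetic stand-in for labeled sowing data with 218-dimensional features.
X, y = make_regression(n_samples=150, n_features=218, random_state=0)

# Score the model at increasing training-set sizes under 5-fold CV.
sizes, train_scores, val_scores = learning_curve(
    RandomForestRegressor(random_state=0), X, y,
    train_sizes=np.linspace(0.2, 1.0, 5), cv=5)

for n, tr, va in zip(sizes, train_scores.mean(axis=1), val_scores.mean(axis=1)):
    print(f"n={n}: train R^2={tr:.2f}, validation R^2={va:.2f}")
```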

 

9. Section 3.5: Add on-site test photos to enhance the credibility of the verification experiments.

A: In this paper, verification extends only to simulation; field testing was not performed.

 

10. The references are too outdated. Ensure that at least one fifth of the references are from the last five years.

A: Thank you. Revised.

Round 2

Reviewer 2 Report

Comments and Suggestions for Authors

Well done.

Author Response

Thank you very much for your cooperation.
