Article
Peer-Review Record

LGVM-YOLOv8n: A Lightweight Apple Instance Segmentation Model for Standard Orchard Environments

Agriculture 2025, 15(12), 1238; https://doi.org/10.3390/agriculture15121238
by Wenkai Han 1, Tao Li 2, Zhengwei Guo 2, Tao Wu 2, Wenlei Huang 2, Qingchun Feng 2 and Liping Chen 1,2,*
Reviewer 1: Anonymous
Reviewer 3: Anonymous
Submission received: 23 April 2025 / Revised: 23 May 2025 / Accepted: 30 May 2025 / Published: 6 June 2025
(This article belongs to the Section Artificial Intelligence and Digital Agriculture)

Round 1

Reviewer 1 Report (New Reviewer)

Comments and Suggestions for Authors

How does your methodology differ from that of other papers using YOLO?

Use the past tense where applicable, e.g., lines 25, 299, 425, 440, 444 (check the whole text).

Line 36: why are those critical?

Lines 70-73: cite references for those examples.

Line 85: how is "real-time" quantified?

Line 176: why was that area chosen? How important is the crop to the region? How large is the area? Why was that cultivar selected for the study? Please also specify the weather conditions in the field during data acquisition.

Figure 1: did sunlight affect the quality of the data/images?

Figure 2: was there a standard distance from the cameras to the fruits?

Line 208: how do you ensure there is no overfitting? As I understand it, the frames all come from the same field conditions.

Figure 4: manual annotation is used here, but the introduction (line 49) describes it as a disadvantage. How can the model be built, or its performance improved, without manual annotation?

Equations: cite the references, where applicable.

Equations 6 and 7: were they applied to the same dataset?

Table 1: why were those configurations adopted?

Method: what should be adapted for real agricultural conditions?

Line 575: power consumption and thermal performance are not demonstrated in the results.

Line 587 onward: this belongs in the introduction.

Conclusion: focus on answering the objective of the paper.

Are references 21 and 22 by co-authors?
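
The overfitting concern raised above for line 208 (frames taken under the same field conditions) is often addressed by splitting data by acquisition group rather than by individual frame. The following is a minimal sketch of such a split, not the authors' procedure. The function name `group_split`, the group key (for example an orchard row or recording session), and the 20% test fraction are all assumptions for illustration.

```python
import random
from collections import defaultdict


def group_split(items, group_of, test_fraction=0.2, seed=0):
    """Split items so that no group appears in both train and test.

    items: any sequence of samples (hypothetical frame records here).
    group_of: function mapping a sample to its group key, e.g. a row
        or session identifier (an assumed grouping, not from the paper).
    """
    groups = defaultdict(list)
    for item in items:
        groups[group_of(item)].append(item)

    # Shuffle group keys deterministically, then hold out whole groups.
    keys = sorted(groups)
    random.Random(seed).shuffle(keys)
    n_test = max(1, round(len(keys) * test_fraction))
    test_keys = set(keys[:n_test])

    train = [x for k in keys if k not in test_keys for x in groups[k]]
    test = [x for k in keys if k in test_keys for x in groups[k]]
    return train, test


# Usage: 5 hypothetical rows with 4 frames each.
frames = [(f"row{r}", i) for r in range(5) for i in range(4)]
train, test = group_split(frames, lambda f: f[0])
```

Holding out whole groups means near-duplicate consecutive frames cannot leak from training into testing, which is the usual source of optimistic scores when all frames share one field condition.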

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report (New Reviewer)

Comments and Suggestions for Authors

I find the work noteworthy, as it covers practical implementation, the use of field sensors, and shows strong results. It is rare to see studies aimed at real-world applications.

My only suggestion is to consider publishing the dataset, as this is a practice that should be more widely adopted to openly support the scientific community. Additionally, the paper lacks a section outlining the individual contributions of the authors.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 3 Report (New Reviewer)

Comments and Suggestions for Authors

Manuscript Evaluation

Abstract

This study proposes LGVM-YOLOv8n, a lightweight instance segmentation model for harvesting robots, based on YOLOv8n-seg, with enhancements that balance accuracy and efficiency. It incorporates three innovations: GSConv (which improves feature interaction and reduces cost), VoVGSCSP (which optimizes small object detection), and MPDIoU (which increases accuracy in locating occluded fruits). Tests show reduced computational cost, lighter model weight, and faster inference. The study's strengths include a well-developed contextualization, coherent structural organization, and a detailed methodology description, providing clarity and robustness to the proposal. The way the study is conducted—from theoretical foundations to practical testing—demonstrates scientific consistency and strong integration among the article’s components. Furthermore, the results are relevant and promising for the field of automated harvesting, indicating significant advances in the use of lightweight and efficient models in complex agricultural environments, with practical application potential in embedded robotic systems.
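
To make the MPDIoU term mentioned in this summary concrete, the following is a minimal sketch of the metric as it is commonly defined in the MPDIoU literature: IoU minus the squared distances between the two boxes' top-left corners and bottom-right corners, each normalized by the squared diagonal of the input image. The manuscript's own formulation is not reproduced in this record, so the function names, the `(x1, y1, x2, y2)` box layout, and the use of image width and height as normalizers are assumptions for illustration.

```python
def mpdiou(box_a, box_b, img_w, img_h):
    """MPDIoU between two axis-aligned boxes given as (x1, y1, x2, y2).

    img_w, img_h: width and height of the input image, used to
    normalize the corner distances (assumed convention).
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Plain IoU: intersection area over union area.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union if union > 0 else 0.0

    # Squared distances between matching corners.
    d1_sq = (ax1 - bx1) ** 2 + (ay1 - by1) ** 2  # top-left corners
    d2_sq = (ax2 - bx2) ** 2 + (ay2 - by2) ** 2  # bottom-right corners
    diag_sq = img_w ** 2 + img_h ** 2

    return iou - d1_sq / diag_sq - d2_sq / diag_sq


def mpdiou_loss(box_a, box_b, img_w, img_h):
    """Loss form: zero when the two boxes coincide exactly."""
    return 1.0 - mpdiou(box_a, box_b, img_w, img_h)


# Usage: two overlapping 2x2 boxes in a 10x10 image.
score = mpdiou((0, 0, 2, 2), (1, 1, 3, 3), img_w=10, img_h=10)
```

Because the corner-distance penalty stays non-zero even when two boxes do not overlap, the loss still gives a useful signal for poorly localized predictions, which is the property usually cited for occluded targets.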

General Concept Comments

The Introduction presents a comprehensive and well-contextualized review of the state of the art in instance segmentation applied to automated fruit harvesting, with a specific focus on apples. The text clearly outlines the evolution of approaches, from methods based on handcrafted features to modern deep learning solutions, appropriately highlighting the advances, limitations, and existing gaps in each generation of techniques. Furthermore, the review effectively addresses the main challenges encountered in real orchard environments, such as occlusions, lighting variations, and computational constraints typical of embedded systems. The rationale for the study is well constructed, although the specific research objectives could be stated more clearly to better guide the reader regarding the intended contribution.

The Materials and Methods section provides a clear, technically grounded, and logically structured description, covering everything from data acquisition to model deployment on an edge computing platform. The use of a real robotic platform under variable environmental conditions lends practical applicability to the study, and the categorization of fruits based on accessibility is a methodological choice well suited to the demands of automated harvesting systems. The data augmentation part is well described, though it could be improved by providing more detailed information about the parameters of the applied transformations. The integration of figures within the text is appropriate and contributes to the understanding of the experimental setup. Overall, the description is robust and compatible with the technical scope of the article.

The LGVM-YOLOv8n model description, along with the other structures and strategies involved in the approach, is technically sound and well aligned with the application goals in complex agricultural environments with hardware constraints. The modifications made to the original YOLOv8n architecture—such as replacing modules with lighter versions (GSConv and VoVGSCSP) and adopting a new loss function—are coherently presented, taking into account the need to balance accuracy and computational efficiency for real-time applications. The explanation of the components is clear, well-founded, and highlights the expected gains in performance and generalization capability. The controls used seem appropriate for the scope of the proposal and are described with the necessary level of detail to ensure reproducibility. However, it would be more appropriate for the initial information currently presented in the “Experimental Results” section—especially those related to tools, frameworks, and software used—to be relocated to the “Materials and Methods” section, with the proper references, to maintain clearer and more standardized methodological organization.

The Results section is well structured and effectively fulfills its role in the scientific communication of the findings. The authors present the results clearly and in an organized manner, using tables and figures consistently to illustrate the model's performance under different scenarios and experimental conditions. The number of visual elements is appropriate and their distribution throughout the text is balanced, which contributes to a smooth reading experience and facilitates data comprehension. Graphs and comparative charts are employed with satisfactory graphic quality, accompanied by informative captions that assist in the accurate interpretation of the information. This visual presentation not only reinforces the arguments discussed but also highlights the practical impact of the architectural modifications implemented in the model, especially in the context of automated harvesting in complex agricultural settings. However, it is noted that part of the introductory content in this section—regarding the description of methods, tools, and experimental configurations—would be more appropriately placed in the “Materials and Methods” section to maintain the logical organization of the manuscript and avoid overlap between sections. Nonetheless, the way the results are communicated demonstrates attention to clarity, methodological rigor, and alignment with the study’s objectives, which strengthens the reliability and robustness of the research.

The Discussion section is appropriately focused on analyzing the main results, addressing them with good breadth and depth. The authors highlight both the strengths and the limitations of the proposed model in a balanced manner, demonstrating a critical understanding of the impacts and practical implications of their approach. The arguments are well developed and supported by relevant and up-to-date references, which lends credibility to the interpretations presented. Furthermore, the discussion is coherently connected to the study’s objectives and the existing literature, contextualizing the findings within the current landscape of instance segmentation in agricultural environments. The authors also suggest directions for future research, pointing out possible improvements in the model’s architecture and in data collection under different environmental conditions, which demonstrates a forward-looking vision and potential for continuity and refinement of the research line. This critical and well-founded approach positively contributes to the scientific maturity of the work.

The conclusions of the research are well established, focusing on the study’s main findings and maintaining a clear alignment with the previously defined objectives. The section adequately fulfills its role by revisiting the central points of the investigation without overstepping the boundaries of the presented data, which contributes to the consistency and scientific integrity of the work.

Specific Comments

  • At the end of the introduction, the authors describe their main contributions. It is recommended that this content be relocated to another section of the manuscript, such as the “Discussion” or “Conclusions,” where this type of synthesis is more commonly expected and better contextualized.
  • It is suggested that the research objectives be stated clearly and directly in the “Introduction” section. This will help guide the reader from the outset regarding the focus and purpose of the study.
  • Part of the information presented at the beginning of section “4. Experimental Results” pertains to methods, tools, and experimental configurations, including the software used. It would be more appropriate for these details to be moved to the “Materials and Methods” section, with proper citations of the tools and frameworks employed.
  • Some paragraphs are excessively long, which may hinder the readability of the text. In particular, it is recommended to split the paragraph spanning lines 493 to 527 in order to make the narrative clearer and more accessible to the reader.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report (New Reviewer)

Comments and Suggestions for Authors

Avoid self-citation (references 21 and 22).

This manuscript is a resubmission of an earlier submission. The following is a list of the peer review reports and author responses from that submission.


Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

This manuscript (agriculture-3502467) presents LGVM-YOLOv8n, a lightweight instance segmentation model optimised for autonomous fruit harvesting, which enhances accuracy and efficiency through GSConv, VoVGSCSP, and MPDIoU. It achieves improved speed, reduced computational cost, and superior segmentation in complex orchard conditions. The manuscript is not suitable for publication. Firstly, the authors have not adequately described the introduction to provide proper context for the reader, nor have they included the necessary technical information and explanations. There are no hypotheses to be tested in the manuscript. The materials and methods section is confusing and lacks appropriate logical flow and coherence. The results section merely compiles descriptions of a few figures, and there is no dedicated discussion section to interpret and compare the findings with existing literature. Key questions remain unanswered: What improvements were made? What challenges were encountered in applying the models? Regarding the YOLO models, where are the applied codes? What statistical methods were used for comparisons? Additionally, there are no references following the introduction section. The conclusions need to clearly highlight what was improved and how the study advances compared to existing models. References must also be carefully checked. The figures require better quality, and the axes (x and y) need to be reformulated with proper descriptions and units.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

The article is interesting and well written. As far as the quality of the presentation is concerned, some changes are necessary (as indicated in the specific comments). I ask the authors to change the section titles both so as not to confuse potential readers and (in the case of the Results and Discussions sections) to solve a problem related to the presence along with the results, of personal opinions and comments that would otherwise require a new separate section.

Comments for author File: Comments.pdf

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

It is important to design and develop lightweight detection models for real-time applications, especially robotic applications on edge computing devices, where processing power and response time make lightweight models particularly important. The following comments are intended to improve the quality of your article.

  1. Line 13: please state what LGVM stands for.
  2. Line 88 (RFA, DFP), line 90 (MSOAR), line 91 (ECA): please state what these abbreviations stand for; this will help new readers follow the introduction more clearly.
  3. Line 138: please state the camera model.
  4. In Figure 1, the two cameras are oriented vertically, so the images are captured vertically. Is there a specific reason for this arrangement? If so, please state it.
  5. In Figure 2, you highlight apples that are occluded by leaves and branches and therefore cannot be harvested robotically or are difficult to harvest. Is there a specific reason? Why did you choose 90% visible fruit surface as the threshold for harvestable apples? Why not use only apples that are 100% visible, with no obstacles?
  6. In Figure 8, please add w and h.
  7. To complement Tables 3 and 4, could you add a comparison image of segmentation results from YOLOv8n-seg and your model (with around 10 apples in the image)? This would help readers appreciate your model's accuracy.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 4 Report

Comments and Suggestions for Authors

The article "LGVM-YOLOv8n: A Lightweight Apple Instance Segmentation Model for Standard Orchard Environments" presents an innovative approach for apple segmentation in orchards using an optimized model based on YOLOv8n-seg. The research emphasizes improvements in computational efficiency and segmentation accuracy by integrating the GSConv, VoVGSCSP, and MPDIoU loss function modules. The study is well-structured and provides quantitative results demonstrating the superiority of the proposed model compared to existing approaches. The model architecture is detailed, and the chosen evaluation metrics (precision, recall, mAP, GFLOPS, inference time, etc.) are appropriate for validating performance. The experimental approach is robust, including component ablation studies, comparisons with other models, and validation on an edge computing platform (Jetson TX2). The results indicate substantial improvements in inference speed and computational complexity reduction, making the model more viable for real-time applications. However, a few aspects could be improved:

1) While the model outperforms others in inference speed and computational efficiency, there is a slight drop in accuracy in some scenarios. This is acceptable for practical applications, but the article could further explore how this trade-off affects usability in different environmental conditions.

2) The ablation experiments demonstrate the importance of the added modules, but the analysis could be more detailed to clarify how each module impacts specific performance metrics.

The article makes excellent use of the existing literature, citing relevant and up-to-date works on fruit segmentation, deep learning, and lightweight modeling techniques. The review covers traditional segmentation methods as well as the latest deep learning-based approaches, including comparisons between semantic and instance segmentation models. However, some references could be expanded:

1) The citations related to edge computing (Edge AI) are relevant, but the discussion section could address implementation challenges on more resource-constrained hardware, considering variables like power consumption and available memory.

2) The comparison with competing models such as Mask R-CNN, YOLOv5n-seg, and Yolact is well-conducted, but there is a lack of references to emerging alternative techniques, such as Transformer-based segmentation.

3) A more in-depth discussion on other possible applications of the model, such as agricultural monitoring, plant growth analysis, and disease detection, could enhance the study’s impact.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

I thank the authors for their comments. The manuscript (agriculture-3502467) has improved in certain aspects, but it is still not suitable for publication. The "Materials and Methods" section continues to lack detailed descriptions and appropriate citations. In the "Results" section, which ought to be concise and objective, there remain insufficient explicit references to the actual findings, and a discussion-like tone persists. Additionally, appropriate statistical analyses comparing the various models presented are absent, including tables and figures lacking rigorous statistical treatment. Merely reporting percentages is inadequate; it is essential to justify this methodological decision.
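
One way to go beyond reporting percentages, as requested above, is a paired bootstrap confidence interval on the per-image score difference between two models evaluated on the same test images. The following is a minimal sketch under that assumption, not a method taken from the manuscript. The function name `paired_bootstrap_ci`, the per-image score lists, and the resampling settings are hypothetical.

```python
import random


def paired_bootstrap_ci(scores_a, scores_b, n_resamples=2000,
                        alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean paired difference (a - b).

    scores_a, scores_b: per-image metric values for two models on the
    same images, in the same order (assumed inputs).
    Returns (mean_difference, lower_bound, upper_bound).
    """
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("score lists must be non-empty and equal length")

    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)

    # Resample images with replacement and record each resample's mean.
    means = []
    for _ in range(n_resamples):
        sample = [diffs[rng.randrange(n)] for _ in range(n)]
        means.append(sum(sample) / n)
    means.sort()

    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return sum(diffs) / n, lower, upper


# Usage with made-up per-image scores for two models.
mean_diff, lower, upper = paired_bootstrap_ci(
    [0.81, 0.77, 0.90, 0.68], [0.79, 0.78, 0.85, 0.66]
)
```

If the resulting interval excludes zero, the difference between the two models is unlikely to be explained by the choice of test images alone; an interval that straddles zero suggests the reported gap may not be meaningful.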

The "Discussion" section continues to lack analytical depth, an issue previously highlighted, and noticeably lacks bibliographical references. For instance, I was unable to identify any citation in this section, raising the question as to whether related studies exist.

The figure captions remain inadequate, failing to clearly explain the content of the images. Although I personally understand the meaning of these figures, it is necessary to question whether readers will share this understanding.

A scientific manuscript must meaningfully contribute to science; otherwise, it amounts to little more than a descriptive report.

Author Response

Please see the attachment

Author Response File: Author Response.pdf
