GTDR-YOLOv12: Optimizing YOLO for Efficient and Accurate Weed Detection in Agriculture
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
The design and development of weeding robotic systems have high potential for sustainable agricultural systems. The following comments are offered for the improvement of your article.
- In the abstract, indicate the novel application of your study. Did you apply your model in real-time weed detection?
- The introduction section needs to be revised; you need to mention what is the novel application of your study or the objectives in the last paragraph.
- The methodology and discussion do not reflect the application of this study.
- Can this model be used in edge devices?
- Many studies are being conducted to change the basic architecture of detection networks, but few are applied in the agricultural field; at the least, the study should demonstrate the improvement in a practical application.
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
Dear Authors,
The manuscript “GTDR-YOLOv12: Optimizing YOLO for Efficient and Accurate Weed Detection in Agriculture” addresses a current and relevant topic in modern agriculture: efficient weed detection. Weed infestation poses a significant problem to agricultural productivity, competing with crops for vital resources and leading to substantial yield losses, as well as increasing dependence on chemical herbicides, raising production costs, and posing environmental risks. Conventional management approaches, such as manual removal and broad-spectrum chemical spraying, are increasingly considered unsustainable in large-scale contexts. In this scenario, the use of deep learning-based methods for weed detection and differentiation between crops and weeds has gained prominence for its ability to offer superior accuracy and enable targeted intervention strategies. The need to improve the robustness of models in complex field conditions while maintaining real-time performance on limited hardware is a considerable barrier, making your research timely and crucial for the advancement of precision agriculture. Your study contributes precisely to this gap.
The following are suggestions for improving the manuscript:
1) Abstract: Reinforce the urgency of the need for efficient weed detection, connecting it directly to agricultural losses and sustainability, before presenting the solution. The F1 score is mentioned twice. It should be mentioned only once.
2) Lines 28 to 32: Connect the use of the developed solution to the current context.
3) Although the topic is intrinsically current, it is possible to emphasize in chapter 1 the urgency in the context of modern precision agriculture and the growing adoption of autonomous platforms.
4) Line 107: There is an inconsistency in the name of the proposed model. In line 107, the model is referred to as “GTA-YOLOv12,” while the title of the article, the abstract, and several other sections call it “GTDR-YOLOv12.” Please review.
5) Overall, the introduction is informative, but there may be opportunities for greater conciseness and more impactful phrasing.
6) Present the objectives of the study more directly in Chapter 1.
7) Although you already mention that the selected dataset is “suitable for evaluating detection performance in scenarios involving densely distributed weeds, small target sizes, and low visual contrast between vegetation and the ground background,” you could reinforce the link between how these specific characteristics of the dataset represent the crucial challenges that GTDR-YOLOv12 aims to overcome.
8) Lines 154 to 156: Briefly explain the reason for choosing each type of data augmentation.
9) Contextualize the experimental platform configuration (hardware and software: CPU, GPU, VRAM, PyTorch and Python versions).
10) Consider adding a transition sentence at the end of Chapter 2 that links data preparation and environment configuration to the next modeling steps.
11) Elaborate on the justification for the limitations of YOLOv12 presented in Section 3.1. Although the limitations are listed, the section could explain how each of these general limitations manifests itself directly in the specific components of YOLOv12 that GTDR-YOLOv12 aims to replace.
12) Briefly explain how the combination of Ghost Convolution and DyReLU allows the model to capture “finer features” or “subtle textures” essential for detecting “small instances of weeds.”
13) Explain the practical benefit of how TDAM helps the model “select and amplify” the most relevant features.
14) Reinforce the direct connection between Lookahead and the model's ability to handle inter-class ambiguity and limited annotation.
15) An additional crucial metric would be inference time on real edge device hardware platforms (e.g., NVIDIA Jetson, embedded CPUs, microcontrollers with AI accelerators). This would provide direct validation of the “deployment feasibility” and suitability of the model for the intended operating environment.
16) Another interesting metric to calculate would be the average IoU values for all correct detections. This directly quantifies how well the boundaries are located.
17) Table 3/4: Highlight the best results in each column in bold.
18) It would be interesting to include the inference time for each ablation model variant (GTDR-YOLOv12 with and without each module) on a relevant edge hardware platform.
19) Although not a commonly used standard metric in object detection studies, quantifying energy consumption (per inference or per unit of time) for ablation variants would provide a more complete view of the model's “architectural efficiency” and “lightweight design.”
20) Identify patterns in residual errors (e.g., missed detections in extreme lighting conditions, false positives in specific types of non-weed foliage, or challenges with tough weed species). This would not only strengthen the analysis but also point to future research directions for the model.
21) Chapter 5 could be substantially enriched by comparing the results of the study with those of other authors cited in the text. Chapter 6 could be incorporated into Chapter 5.
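To illustrate the metric suggested in point 16, the average IoU over all correct detections could be computed along the following lines. This is a minimal sketch, not code from the manuscript: the `(x1, y1, x2, y2)` box format, the pre-matched prediction/ground-truth pairs, and the 0.5 matching threshold are all assumptions for illustration.

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two axis-aligned boxes in (x1, y1, x2, y2) format."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def mean_iou_correct(pred_gt_pairs, thresh=0.5):
    """Average IoU over matched (prediction, ground-truth) pairs that count as
    correct detections, i.e. pairs whose IoU reaches the matching threshold."""
    ious = [iou(p, g) for p, g in pred_gt_pairs]
    correct = [v for v in ious if v >= thresh]
    return sum(correct) / len(correct) if correct else 0.0
```

Reporting this value alongside mAP would directly quantify localization quality, since mAP alone only records whether a detection cleared the threshold, not how tightly its box fits.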
I conclude my review by congratulating you on the study you have conducted.
Respectfully,
Comments for author File: Comments.pdf
Regarding the writing, please review the punctuation, especially the commas. Pay attention to long paragraphs, as they tend to confuse the reader. When introducing acronyms for the first time, their meaning should be presented. Please check the comments in the digital file.
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 3 Report
Comments and Suggestions for Authors
The study presents a robust enhancement over YOLOv12 for weed detection by introducing the GTDR-YOLOv12 framework, which integrates novel architectural components such as GDR-Conv (Ghost Convolution + DyReLU) and GTDR-C3 (which includes the Task-Dependent Attention Mechanism). This topic is highly relevant and aligns well with the scope of the Agronomy journal. However, some concerns regarding your manuscript need to be addressed, as follows:
1 Some references (e.g., [19], [21], [23]) are incomplete or missing. Additionally, further discussion could have been provided comparing non-YOLO architectures or transformers in agriculture for a broader perspective.
2 Although the proposed GTDR-YOLOv12 framework demonstrates promising improvements in accuracy and efficiency, several important limitations remain unaddressed. First, the study lacks any hardware deployment testing; the model's performance on embedded systems or edge devices is not evaluated, which is critical for real-world agricultural applications. Moreover, the paper does not assess the cross-domain robustness of the model, as it does not explore generalization across different crops, regions, or varying environmental conditions. This omission raises concerns about the scalability of the approach. Additionally, the model is trained and tested solely for binary classification (weed vs. crop), which significantly restricts its applicability in more diverse agricultural settings where multi-class detection may be required. Furthermore, the paper lacks statistical rigor, as there are no error bars, confidence intervals, or statistical significance tests accompanying the reported metrics, which limits the reliability of the results. Lastly, while architecture could potentially benefit from transfer learning or domain adaptation using pre-trained weights, this strategy is not explored, missing an opportunity to improve convergence speed and generalizability.
3 Performance under different lighting conditions or occlusion levels could have been further explored quantitatively.
4 The GitHub repository or access to code and weights would significantly strengthen reproducibility claims.
5 There are a few grammatical inconsistencies throughout the manuscript that should be addressed to improve the overall readability and professionalism of the text. For example, in line 38, the phrase “Recent advances has demonstrated” contains a subject-verb agreement error and should be corrected to “Recent advances have demonstrated.” Additionally, both the abstract and the introduction contain several overly long sentences.
6 The study lacks control tests on degraded images or highly cluttered scenes, the use of one dataset limits the ability to generalize findings, and finally, error analysis of false positives/negatives is not included.
7 The abstract accurately reflects the core contributions and results. Still, its readability can be improved by splitting long sentences and specifying key terms early.
Comments on the Quality of English Language
The English could be improved to more clearly express the research.
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 4 Report
Comments and Suggestions for Authors
This paper proposes GTDR-YOLOv12, an improved, lightweight object detection model built on YOLOv12, aimed at accurate weed detection in complex agricultural settings. The authors introduce the GDR-Conv module (Ghost convolution + DyReLU) and GTDR-C3 blocks with attention mechanisms (TDAM). They demonstrate gains in precision, recall, mAP, and efficiency (GFLOPs, parameter count) versus YOLOv12 and other YOLO variants. However, several critical weaknesses and clarifications remain before this work is ready for strong journal acceptance.
- Authors should explicitly clarify how their integration strategy differs from prior work using attention, Ghost modules, or dynamic activation in YOLO variants.
- The model only distinguishes “weed” vs “crop,” ignoring species-level weed classification. In practice, precision agriculture needs species-specific spraying.
- The ablation tables show % improvements but lack standard deviations or multiple training runs.
- Non-YOLO lightweight detection frameworks should also be included in the article for comparison.
- Add more diverse field scenarios, especially failure cases. This will make the paper more robust and honest.
- The references do not appear to follow the journal’s required formatting guidelines. Please ensure that all citations are revised to match the journal’s specified style.
- The paper does not clearly highlight the research gap in the context of previous work. The authors should explicitly state what specific limitation in existing studies their approach addresses to better justify the contribution of this work.
- The paper mentions 100 epochs, the augmentation types, and the optimizer, but gives no learning-rate schedule details, no batch size, no mention of early stopping or regularization, and no convergence plots. Please add all of these details to the article.
- The claim is that the model is "lightweight" and suitable for "real-time deployment," but no test is actually run on such hardware; it would be better to validate this claim with hardware measurements.
- Add a section discussing qualitative error analysis vs human annotation quality.
- Clearly justify why Lookahead is chosen over other stabilizing optimizers, include a small comparison table if possible.
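The real-time claim questioned above could be supported with even a simple, hardware-agnostic latency harness such as the following. This is only a sketch: `infer` stands in for the model's forward pass, the warm-up and run counts are arbitrary choices, and on a GPU one would additionally synchronize the device before reading the clock.

```python
import statistics
import time

def benchmark(infer, sample, warmup=10, runs=100):
    """Measure per-image inference latency of a callable `infer(sample)`.

    Warm-up iterations amortize one-time costs (JIT compilation, cache fills)
    before timing begins; mean and 95th-percentile latencies are reported in ms.
    """
    for _ in range(warmup):
        infer(sample)
    times_ms = []
    for _ in range(runs):
        t0 = time.perf_counter()
        infer(sample)
        times_ms.append((time.perf_counter() - t0) * 1000.0)
    times_ms.sort()
    return {
        "mean_ms": statistics.mean(times_ms),
        "p95_ms": times_ms[max(0, int(0.95 * runs) - 1)],
    }
```

Reporting mean and tail latency on the target device (rather than only GFLOPs and parameter counts) would let readers judge deployability directly.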
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 5 Report
Comments and Suggestions for Authors
1. What is the full form of GTDR (Page 3, Line 114)?
2. In the abstract, it would also be helpful to state the implications derived from the study results.
3. Please explain how the study results can be applied to other research as well, and summarize the contribution to policies.
4. In conclusion, please add policy implications and future research related to the study results.
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
GTDR-YOLOv12: Optimizing YOLO for Efficient and Accurate Weed Detection in Agriculture
- Please arrange the introduction in order to give proper background information without filling in many details.
- Still, regarding the application: you have taken the data (weed images) from an external source and then relate it to your application. Is there any inconsistency? Please explain.
- Please arrange the flowcharts nicely and highlight your new work.
- Figures 11 and 12: When designing your experiment to test your algorithm, please use distinctive field images so readers can understand the difference (e.g., crops grown in rows, two weeks after cultivation).
- When plants grow, shape and size differ with time. How does your network address these issues? At what stage can your algorithm be applied for weed detection?
Author Response
Thank you once again for your time and valuable feedback throughout the review process. We appreciate your careful evaluation and are grateful for your comments, which have helped improve the quality of our manuscript.
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
Dear Authors,
Thank you for sending me the new version of the manuscript and the cover letter detailing the revisions made. I note that you have implemented the vast majority of my suggestions in an explicit and detailed manner in the second version of the manuscript. The revisions, as documented in the cover letter, contributed significantly to improving the conciseness and clarity of expression, the direct presentation of the study objectives, the connection between the characteristics of the dataset and the challenges overcome, and the contextualization of the experimental platform. The additional analyses, such as the average IoU values for all correct detections and the residual error patterns, together with the incorporation of Chapter 6 into Chapter 5 for a richer discussion, have enriched the work, making it more robust and impactful.
Regarding the suggestion to quantify energy consumption, I understand the hardware limitation mentioned. However, I reiterate that this is a crucial metric to provide a more complete picture of the “architectural efficiency” and “lightweight design” of the model, especially when considering the “feasibility of implementation” on edge device hardware platforms such as the NVIDIA Jetson AGX Xavier. It is encouraging to know that the authors are considering this metric for future work, which demonstrates the continued importance of this variable in the context of real-time agricultural applications.
Respectfully,
Author Response
Thank you very much for your thoughtful and constructive feedback on the revised manuscript and cover letter. We are truly grateful for your kind words regarding the improvements made, particularly your recognition of the enhanced clarity, structure, and contextualization of the study. Your detailed suggestions have been instrumental in elevating the quality and impact of our work.
We also sincerely appreciate your emphasis on the importance of energy consumption metrics. While current hardware limitations prevented us from including empirical energy measurements in this version, we fully agree with your perspective on the relevance of this metric in evaluating architectural efficiency and practical deployability on edge devices such as the NVIDIA Jetson AGX Xavier. As noted, we are actively exploring this direction and intend to integrate such measurements in future extensions of our research.
Thank you once again for your valuable insights and for helping us improve the robustness and relevance of our contribution.
Reviewer 3 Report
Comments and Suggestions for Authors
No additional comments
Author Response
Thank you once again for your time and valuable feedback throughout the review process. We appreciate your careful evaluation and are grateful for your comments, which have helped improve the quality of our manuscript.