RS Transformer: A Two-Stage Region Proposal Using Swin Transformer for Few-Shot Pest Detection in Automated Agricultural Monitoring Systems
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
1. Though the concept of two-stream/stage with Swin Transformer is known to the scientific community recently, it is indeed innovative and novel to apply it to pest detection. In principle, I recommend the acceptance of the manuscript. However, I have following suggestions to improve the quality of the paper. The compliance to these points by the author will take it to another level and make it more widely acceptable (and hence citable!) in the scientific community.
2. There are many pest datasets available, including on Kaggle. Why did the authors not use those or other gold-standard datasets? Designing own algorithm and presenting the results on own dataset is generally not acceptable evidence of the robustness of the proposed algorithm.
3. Authors need to make a consistent use of terms presently inconsistently used like ‘F1Score’, ‘F1 Score’, ‘F1’, F1 with superscript, etc.
4. ?period
5. Authors should report the precision, as well as also mention that they have reported the precision.
6. Authors should report the accuracy, as well as also mention that they have reported the accuracy.
7. For the formula of the F1-score, authors should explicitly mention the legend that P and R represent Precision and Recall, respectively.
8. Either there should be no specific formatting or the authors should specifically mention the use/rationale of italics, bold, etc. for both the table headers/sub-headers as well as the values in the table.
9. What is the purpose of including the sentence ‘Civilization starts from me to create a civilized city’??????? in the caption of Table 8. It has nothing to do with the content of this manuscript.
10. What was the resolution of the camera used for picturing the pests? This should be mentioned.
11. Ideally, geo-tagged photos captured by the camera should be presented.
12. All figures must be consistently cited in the manuscript. As of now, they are cited using interchangeable words like ‘Figure’ and ‘Fig.’.
13. Table 1 is not found cited in the manuscript. All the figures and tables must be cited in the manuscript body. Each table and each figure should be checked for compliance with this.
14. ‘…we gathered 512 produced pests…’ This needs to be rephrased as the pests were not produced but their images were produced.
15. It is highly recommended that the authors present a ‘table’ of comparison of the proposed work with similar research works and state-of-the-art research studied as part of the literature review. This will go a long way in emphasizing the research gaps and highlighting the specific contributions of the proposed work. Irrespective of the presented discussion, this should be done in terms of contrast and comparison of approach as well as the contrast and comparison of the results.
16. It is recommended that the authors include a table on the data metadata. This table should include at least the information like how many original images were captured, how many images were sourced from other datasets, how many synthetically produced images were used, how many primary images were there, and how many secondary images were there.
17. The corresponding long forms should accompany all the ‘first usages’ of abbreviations (in the abstract and the remaining manuscript). At the same location, the words of the long form should be suitably written in Title Case. Either the style of ‘long form followed by the abbreviation’ (preferably) or the ‘abbreviation followed by the long form’ should be consistently used throughout the manuscript. After the abbreviation has been defined in the first instance, the subsequent text of the manuscript should not unnecessarily mention the abbreviation and long-form again, and rather only the abbreviation should be used.
18. It is recommended that the authors recheck and reconsider the X-axis labels of Fig. 15 and try including some meaningful labels.
19. All the figures with graphical content need to have captions for ‘both’ axes in addition to the labels for the axes.
20. There are many references in the References section that do not have information like volume, issue, DOI, year, or even name of the journal and page numbers of the article. In the absence of this information, it is challenging for the reader to refer the concerned said paper.
21. It would also be appreciable if the authors can think of and mention the other areas where the proposed concept could be applied.
Comments on the Quality of English Language
Acceptable usage of English language.
Author Response
Original Article Title: RS Transformer: A two-stage region proposal using the Swin Transformer for few-shot pest detection in automated agricultural monitoring systems
To: Applied Sciences
Re: Response to reviewers 1
Dear Editor,
Thank you for your approval of our manuscript and for granting permission for it to be published in Applied Sciences.
In response to your request, we have carefully reviewed your suggestions to enhance the quality of the paper, and we are committed to implementing them to make our work more widely acceptable and citable within the scientific community. We have already purchased MDPI's English editing service to modify the English grammar of our article and make it more fluent.
We are uploading
(a) our point-by-point response to the comments (below) (response to editor),
(b) an updated manuscript with highlighting to indicate changes,
(c) manuscript in PDF format.
Best regards,
Tengyue Wu
Reviewer#1, Concern # 1: Though the concept of two-stream/stage with Swin Transformer is known to the scientific community recently, it is indeed innovative and novel to apply it to pest detection. In principle, I recommend the acceptance of the manuscript. However, I have following suggestions to improve the quality of the paper. The compliance to these points by the author will take it to another level and make it more widely acceptable (and hence citable!) in the scientific community.
Author response: Thank you for considering the manuscript for publication. We appreciate your positive feedback on our application of the RS Transformer for pest detection. In response to your request, we have carefully reviewed your suggestions to enhance the quality of the paper, and we are committed to implementing them to make our work more widely acceptable and citable within the scientific community.
Author action:
Concern # 2: There are many pest datasets available, including on Kaggle. Why did the authors not use those or other gold-standard datasets? Designing own algorithm and presenting the results on own dataset is generally not acceptable evidence of the robustness of the proposed algorithm.
Author response: We acknowledge the availability of various pest datasets, including those found on platforms like Kaggle, but we chose to build our own dataset for two reasons. First, few high-quality public datasets are available, and most of those on Kaggle consist of unlabeled pest pictures; some photos have low clarity and inconsistent resolution. We therefore combined the existing datasets with our own images to contribute to the detection of agricultural pests. Second, with the help of Professor Lu from China Agricultural University, we were able to photograph pests in villages near Beijing, communicate with local farmers to understand their agricultural needs, discuss the problem of pest detection in the field, and test the model, so that our research is grounded in field work.
Author action:
Concern # 3: Authors need to make a consistent use of terms presently inconsistently used like ‘F1Score’, ‘F1 Score’, ‘F1’, F1 with superscript, etc.
Author response: We apologize for the inconsistency in the usage of terms like "F1Score," "F1 Score," "F1," and "F1" with superscript. We appreciate your suggestion to make consistent use of these terms throughout the paper.
In our revision, we will ensure consistent terminology for the F1 score.
Author action:
Concern # 4: ?period.
Author response: Based on your suggestion, we have decided to purchase the editing services offered by MDPI to help us improve the grammar and expression of our paper.
Author action:
Concern # 5: Authors should report the precision, as well as also mention that they have reported the precision.
Author response: We appreciate your suggestion and recognize the importance of providing comprehensive evaluation metrics in our research.
Author action:
We have highlighted the changes in the revised PDF manuscript in yellow for ease of reading.
Abstract Lines 351(Table 5).
Concern # 6: Authors should report the accuracy, as well as also mention that they have reported the accuracy.
Author response: We appreciate your suggestion and recognize the importance of providing comprehensive evaluation metrics in our research.
Author action:
Abstract Lines 351(Table 5), Line 321.
Concern # 7: For the formula of the F1-score, authors should explicitly mention the legend that P and R represent Precision and Recall, respectively.
Author response: Thank you for your valuable feedback. Specifically, we have explicitly mentioned the legend for Precision and Recall in the formula of the F1 score, as you recommended.
Author action:
Abstract Lines 331.
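For reference, the standard definition of the F1 score that this concern refers to can be written as below. The manuscript's exact notation is not reproduced in this record, so the symbols TP, FP, and FN (true positives, false positives, and false negatives) are assumptions introduced here for illustration; P and R denote Precision and Recall, as the reviewer requested.

```latex
P = \frac{TP}{TP + FP}, \qquad
R = \frac{TP}{TP + FN}, \qquad
F_1 = \frac{2 \, P \, R}{P + R}
```

The F1 score is the harmonic mean of P and R, so it is high only when both Precision and Recall are high.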
Concern # 8: Either there should be no specific formatting or the authors should specifically mention the use/rationale of italics, bold, etc. for both the table headers/sub-headers as well as the values in the table.
Author response: In response to this feedback, we have carefully considered the formatting of the tables and have made the necessary revisions. We removed any specific formatting, such as italics or bold, from the table headers, sub-headers, and values to ensure a consistent and standardized presentation. We have also purchased the editing services offered by MDPI to help us improve the grammar and expression of our paper.
Author action:
Abstract Lines 336(Table 4), 351(Table 5), 381(Table 6), 393 (Table 7), 399(Table 8), 400(Table 9), 401(Table 10), 402(Table 11).
Concern # 9: What is the purpose of including the sentence ‘Civilization starts from me to create a civilized city’??????? in the caption of Table 8. It has nothing to do with the content of this manuscript.
Author response: We apologize for the inclusion of the sentence "Civilization starts from me to create a civilized city" in the caption. This sentence was mistakenly copied and translated from another document and has no relevance to the content of our manuscript.
Author action:
We have deleted it.
Concern # 10: What was the resolution of the camera used for picturing the pests? This should be mentioned.
Author response: We apologize for the oversight in not including this information in the manuscript. In response to your suggestion, we have revised the manuscript to explicitly mention the resolution of the camera used for capturing the pest images.
Author action:
Abstract Lines 124-125.
Concern # 11: Ideally, geo-tagged photos captured by the camera should be presented.
Author response: Thank you for your valuable feedback regarding geo-tagged photos. We appreciate your suggestion and understand the significance of this information in providing a comprehensive understanding of the study location. Unfortunately, the camera used for capturing the photos in our study did not store the geolocation information. However, we have taken note of the shooting locations and recorded them in the manuscript on line 122, where readers can refer to the specific details of the study site. We apologize for any inconvenience caused by the absence of geo-tagged photos and assure you that we have made efforts to address this limitation by providing alternative means of indicating the study locations within the text.
Author action:
Concern # 12: All figures must be consistently cited in the manuscript. As of now, they are cited using interchangeable words like ‘Figure’ and ‘Fig.’.
Author response: In response to your suggestion, we have thoroughly reviewed the manuscript and made the necessary revisions to ensure consistency in citing figures. We have updated all references to figures throughout the manuscript, using the term 'Figure' consistently.
Author action:
Concern # 13: Table 1 is not found cited in the manuscript. All the figures and tables must be cited in the manuscript body. Each table and each figure should be checked for compliance with this.
Author response: In response to your suggestion, we have carefully reviewed the manuscript and made the necessary revisions to ensure that all figures and tables are properly cited in the text. We have added the appropriate citation for Table 1 in the manuscript body.
Author action:
Abstract Lines 335.
Concern # 14: ‘…we gathered 512 produced pests…’ This needs to be rephrased as the pests were not produced but their images were produced.
Author response: In response to your suggestion, we have revised the sentence to accurately reflect the information. The revised sentence now reads as follows: "After carefully eliminating the last few false positives, we gathered a dataset of 512 pest images."
Author action:
Abstract Lines 171.
Concern # 15: It is highly recommended that the authors present a ‘table’ of comparison of the proposed work with similar research works and state-of-the-art research studied as part of the literature review. This will go a long way in emphasizing the research gaps and highlighting the specific contributions of the proposed work. Irrespective of the presented discussion, this should be done in terms of contrast and comparison of approach as well as the contrast and comparison of the results.
Author response: In response to your recommendation, we have thoroughly revised the manuscript and incorporated a table that presents a comprehensive comparison of our proposed work with relevant research studies and state-of-the-art literature.
Author action:
Abstract Lines 106 Table 1.
Concern # 16: It is recommended that the authors include a table on the data metadata. This table should include at least the information like how many original images were captured, how many images were sourced from other datasets, how many synthetically produced images were used, how many primary images were there, and how many secondary images were there.
Author response: In response to your recommendation, we have thoroughly revised the manuscript and have included a table that presents the metadata of our dataset.
Author action:
Abstract Lines 80 Table 2, Lines 193 Table 3.
Concern # 17: The corresponding long forms should accompany all the ‘first usages’ of abbreviations (in the abstract and the remaining manuscript). At the same location, the words of the long form should be suitably written in Title Case. Either the style of ‘long form followed by the abbreviation’ (preferably) or the ‘abbreviation followed by the long form’ should be consistently used throughout the manuscript. After the abbreviation has been defined in the first instance, the subsequent text of the manuscript should not unnecessarily mention the abbreviation and long-form again, and rather only the abbreviation should be used.
Author response:
In response to your recommendation, we have thoroughly reviewed the manuscript and made the necessary revisions. We have ensured that all abbreviations used in the abstract and the remaining manuscript are accompanied by their corresponding long forms in their first usage.
Author action:
Abstract Lines 327.
Concern # 18: It is recommended that the authors recheck and reconsider the X-axis labels of Fig. 15 and try including some meaningful labels.
Author response: Thank you for your feedback regarding the X-axis labels of Figure 15. We appreciate your suggestion and understand the importance of providing meaningful labels for clear understanding. Upon careful reconsideration, we acknowledge that the current X-axis labels in Figure 15 may not adequately convey the intended message. The purpose of the graph was to illustrate the variations in numerical differences under different percentages. However, we acknowledge that the desired clarity and impact were not achieved with the current representation.
Author action: In light of this, we have decided to remove Figure 15 from the revised version of the manuscript. We believe that by eliminating this figure, we can ensure the overall coherence and quality of the graphical representations in our research.
Concern # 19: All the figures with graphical content need to have captions for ‘both’ axes in addition to the labels for the axes.
Author response: Thank you for your valuable feedback regarding the inclusion of captions for both axes in the figures with graphical content. We appreciate your suggestion and recognize the importance of providing comprehensive information for clear interpretation of the figures.
Author action:
Abstract Lines 379(Figure 13), 411(Figure 15).
Concern # 20: There are many references in the References section that do not have information like volume, issue, DOI, year, or even name of the journal and page numbers of the article. In the absence of this information, it is challenging for the reader to refer the concerned said paper.
Author response: We apologize for the incomplete references in the References section. It is indeed challenging for readers to refer to the mentioned papers without sufficient information. To address this issue, we have updated the paper and included the necessary details such as volume, issue, DOI, year, and page numbers for the referenced articles.
Author action:
Concern # 21: It would also be appreciable if the authors can think of and mention the other areas where the proposed concept could be applied.
Author response: Thank you for your valuable feedback. We have listened to your suggestions and added areas of application, and hope to have more in-depth research in the near future.
Author action:
Abstract Lines 511-516.
Author Response File: Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
The manuscript holds promise but necessitates further refinement. Authors should concentrate on fulfilling essential requirements and enhancing specific sections through revision.
- Include the most salient findings in the abstract section.
- There is no need to emphasize performance metrics in concluding the abstract.
- Remove the semicolon in line 37.
- In lines 64, 86, and 93, omit the years (e.g., "in 2015," "in 2020," and "in 2021").
- For lines 69-78, replace numerical indicators like (1), (2) with words such as 'first,' 'second,' etc.
- The primary objectives of the study are unclear; consider revising lines 97-106 for clarity.
- Lines 116-119 should be rephrased to avoid using numerical indicators like (1), (2), (3), (4), etc.
- In line 125, omit the phrase "in 2022."
- Lines 162-165 require rephrasing for improved clarity.
- Lines 170-179 would benefit from more cohesive sentence structures, rather than fragmenting the statements.
- In line 186, under section 2.3.1 Swin Transformer backbone, add a brief paragraph mentioning Figure 6.
- Standardize the usage of 'Figure' or 'Fig.' throughout the paper, as observed in lines 213 and 223.
- Line 235 mentions "224" as the size of the ImageNet image used for pre-training, but the value is not cited elsewhere; please revise.
- Relocate the sections "Experiment Setup," "Evaluation Indicator," and "Experimental Baselines" to the Materials and Methods.
- The sentence in line 378 needs improvement.
- Include comparisons between your study and prior works to substantiate your findings.
- Elaborate on your results in the discussion section and validate them by referencing related studies.
- Consider adding confusion matrices for the proposed models to enhance the paper's rigor.
Overall, the manuscript requires significant revision, both in terms of structure and language quality. However, the core idea shows potential. A major revision is needed before the paper is ready for publication.
Extensive editing of English language required
Author Response
Original Article Title: RS Transformer: A two-stage region proposal using the Swin Transformer for few-shot pest detection in automated agricultural monitoring systems
To: Applied Sciences
Re: Response to reviewers 2
Dear Editor,
Thank you for your approval of our manuscript and for granting permission for it to be published in Applied Sciences.
In response to your request, we have carefully reviewed your suggestions to enhance the quality of the paper, and we are committed to implementing them to make our work more widely acceptable and citable within the scientific community. We have already purchased MDPI's English editing service to modify the English grammar of our article and make it more fluent.
We are uploading
(a) our point-by-point response to the comments (below) (response to editor),
(b) an updated manuscript with highlighting to indicate changes,
(c) manuscript in PDF format.
Best regards,
Tengyue Wu
Editor#2, Concern # 1: Include the most salient findings in the abstract section.
Author response: Thank you for your feedback on our paper. We apologize for the shortcomings of the abstract section and have revised it.
Author action:
We have highlighted the changes in the revised PDF manuscript in yellow for ease of reading.
Abstract Lines 20-36.
Editor#2, Concern # 2: There is no need to emphasize performance metrics in concluding the abstract.
Author response: We apologize for emphasizing the performance metrics at the end of the abstract and have removed them.
Author action:
Abstract Lines 20-36.
Editor#2, Concern # 3: Remove the semicolon in line 37.
Author response: We apologize for the stray semicolon and have removed it.
Author action: We have modified the relevant content.
Editor#2, Concern # 4: In lines 64, 86, and 93, omit the years (e.g., "in 2015," "in 2020," and "in 2021").
Author response: We apologize for the extra year information and have deleted it.
Author action:
Abstract Lines 62,87,94.
Editor#2, Concern # 5: For lines 69-78, replace numerical indicators like (1), (2) with words such as 'first,' 'second,' etc.
Author response: We appreciate your suggestion, and we have revised the manuscript accordingly. The revised section now incorporates descriptive words instead of numerical indicators.
Author action:
Abstract Lines 73,75.
Editor#2, Concern # 6: The primary objectives of the study are unclear; consider revising lines 97-106 for clarity.
Author response: We have carefully considered your comment regarding the clarity of the primary objectives of the study, specifically lines 97-106. We appreciate your suggestion and have revised the section to improve its clarity.
Author action:
Abstract Lines 83-101.
Editor#2, Concern #7: Lines 116-119 should be rephrased to avoid using numerical indicators like (1), (2), (3), (4), etc.
Author response: We sincerely appreciate your suggestion to rephrase lines 116-119 to avoid the use of numerical indicators. After careful consideration, we have removed the numerical indicators.
Author action:
Abstract Lines 128-130.
Editor#2, Concern #8: In line 125, omit the phrase "in 2022."
Author response: We apologize for the extra year information and have deleted it.
Author action:
Abstract Lines 135.
Editor#2, Concern #9: Lines 162-165 require rephrasing for improved clarity.
Author response: We sincerely appreciate your suggestion to rephrase lines 162-165 for improved clarity. After careful consideration, we have made the necessary revisions to enhance the clarity and comprehension of the paragraph. We have also purchased the editing services offered by MDPI to help us improve the grammar and expression of our paper.
Author action:
Abstract Lines 152-163.
Editor#2, Concern #10: Lines 170-179 would benefit from more cohesive sentence structures, rather than fragmenting the statements.
Author response: Based on your feedback, we have thoroughly revised the section to create a more coherent flow of information.
Author action:
Abstract Lines 199-216.
Editor#2, Concern #11: In line 186, under section 2.3.1 Swin Transformer backbone, add a brief paragraph mentioning Figure 6.
Author response: We sincerely appreciate your suggestion to include a brief paragraph mentioning Figure 6 in line 186, under section 2.3.1 Swin Transformer backbone. Following your recommendation, we have made the necessary revisions to provide a clearer reference to the relevant Figure 6.
Author action:
Abstract Lines 224-228.
Editor#2, Concern #12: Standardize the usage of 'Figure' or 'Fig.' throughout the paper, as observed in lines 213 and 223.
Author response: In response to your suggestion, we have thoroughly reviewed the manuscript and made the necessary revisions to ensure consistency in citing figures. We have updated all references to figures throughout the manuscript, using the term 'Figure' consistently.
Author action:
Editor#2, Concern #13: Line 235 mentions "224" as the size of the ImageNet image used for pre-training, but the value is not cited elsewhere; please revise.
Author response: We apologize for the error. Upon reviewing our manuscript, we have identified that the correct value should be 299 pixels instead of 224 pixels. We appreciate your diligence in catching this mistake.
Author action:
Abstract Lines 266.
Editor#2, Concern #14: Relocate the sections "Experiment Setup," "Evaluation Indicator," and "Experimental Baselines" to the Materials and Methods.
Author response: We appreciate your suggestion to relocate the sections "Experiment Setup," "Evaluation Indicator," and "Experimental Baselines" to the Materials and Methods section. After careful consideration, we agree that it would be more appropriate to include these sections in the Materials and Methods section for better organization and clarity.
Author action:
Abstract Lines 312-336.
Editor#2, Concern #15: The sentence in line 378 needs improvement.
Author response: Based on your suggestion, we have decided to purchase the editing services offered by MDPI to help us improve the grammar and expression of our paper.
Author action:
Editor#2, Concern #16: Include comparisons between your study and prior works to substantiate your findings.
Author response: Based on your suggestion, we have added a comparison results summary section and Table 11.
Author action:
Abstract Lines 427-446.
Editor#2, Concern #17: Elaborate on your results in the discussion section and validate them by referencing related studies.
Author response: We have carefully considered your recommendation and have incorporated it into the revised version of the paper.
Author action:
Abstract Lines 456-491.
Editor#2, Concern #18: Consider adding confusion matrices for the proposed models to enhance the paper's rigor.
Author response: We greatly appreciate your suggestion to enhance the rigor of our paper by including confusion matrices for the proposed models. We have carefully considered your recommendation and have incorporated it into the revised version of the paper.
Author action:
Abstract Lines 353-362.
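As an illustration of how the metrics discussed in this review (accuracy, precision, recall, and F1) relate to a confusion matrix, the following minimal Python sketch derives them from raw counts. It is not the authors' evaluation code: the function name `per_class_metrics` and the example counts are hypothetical, and the sketch treats the task as plain classification, whereas a detection pipeline would first have to match predicted boxes to ground-truth boxes before filling in the matrix.

```python
from typing import Dict, List


def per_class_metrics(cm: List[List[int]]) -> Dict[str, object]:
    """Derive accuracy and per-class precision, recall, and F1 from a
    square confusion matrix where cm[i][j] counts samples whose true
    class is i and whose predicted class is j."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(n))
    precision, recall, f1 = [], [], []
    for k in range(n):
        tp = cm[k][k]
        fp = sum(cm[i][k] for i in range(n)) - tp  # predicted k, true class differs
        fn = sum(cm[k]) - tp                       # true class k, predicted otherwise
        p = tp / (tp + fp) if tp + fp else 0.0
        r = tp / (tp + fn) if tp + fn else 0.0
        precision.append(p)
        recall.append(r)
        f1.append(2 * p * r / (p + r) if p + r else 0.0)
    return {
        "accuracy": correct / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical two-class counts, for illustration only (not data from the manuscript).
example = [[8, 2],
           [1, 9]]
metrics = per_class_metrics(example)
```

Reporting the per-class values alongside the overall accuracy makes it visible when a model performs well on common classes but poorly on rare ones, which a single aggregate number can hide.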
Author Response File: Author Response.pdf
Reviewer 3 Report
Comments and Suggestions for Authors
Currently, the topics presented in the paper are discussed by many authors, so in my opinion the article is interesting and generally fits well within the scope of the journal. In their study, the authors constructed a pest dataset by acquiring domain agnostic images from the Internet, resizing them to a standardized format. In addition, they used diffusion models to generate complementary data. To complement convolutional neural networks with efficient global information integration and discriminative feature representation, the authors proposed RS Transformer, an innovative model that combines elements such as Region Proposal Network, Swin Transformer and ROI Align. In addition, they introduced a randomly generated stable diffusion dataset to augment the availability of high-quality pest datasets.
Due to the existence of a few editorial errors, I would like to ask you to read the whole work and correct its content.
One-sentence subchapters should not appear in a professional publication.
It is worth sharing the applications on GitHub so that other readers can use the presented programming projects to test and admire the new solution.
In my opinion, in its current version, the article needs minor adjustments to make the presented research results even more.
Author Response
Original Article Title: RS Transformer: A two-stage region proposal using the Swin Transformer for few-shot pest detection in automated agricultural monitoring systems
To: Applied Sciences
Re: Response to reviewers 3
Dear Editor,
Thank you for taking the time to review our paper. We appreciate your positive feedback and your opinion that the article is interesting and aligns well with the scope of the journal. We are glad that you found our study valuable.
We acknowledge your feedback regarding the existence of a few editorial errors in the manuscript. We have thoroughly reviewed the entire work and made the necessary corrections to address these issues. We apologize for any inconvenience caused and appreciate your attention to detail.
We also agree with your assessment that the article may benefit from minor adjustments to further enhance the presentation of the research results. We have taken this into account and made the necessary adjustments to improve the clarity and effectiveness of our findings.
Once again, we would like to express our gratitude for your valuable feedback and suggestions. We believe that your input has significantly contributed to the improvement of our paper. We hope that the revised version meets your expectations and satisfies the requirements of the journal.
Thank you for your time and consideration.
Best regards,
Tengyue Wu
Author Response File: Author Response.pdf
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
The suggestions of the previous review round have been well-considered by the authors.
Reviewer 2 Report
Comments and Suggestions for Authors
The manuscript has been significantly improved and is ready for publication. Thank you!
Comments on the Quality of English Language
Minor editing of English language required.