Article
Peer-Review Record

Study on Predicting Blueberry Hardness from Images for Adjusting Mechanical Gripper Force

Agriculture 2025, 15(6), 603; https://doi.org/10.3390/agriculture15060603
by Hao Yin 1,2, Wenxin Li 1, Han Wang 3, Yuhuan Li 1, Jiang Liu 1 and Baogang Li 1,4,*
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 22 February 2025 / Revised: 5 March 2025 / Accepted: 10 March 2025 / Published: 11 March 2025

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

Keywords: avoid repeating words from the title.
line 39: is it the most recent data from reference [2]? also, use dollar values when applicable.
Reference 5: is it from a co-author? the same for reference [2]

line 13: "new" and "intelligent" are subjective. the same for line 139.
the same for line 23 "simple and efficient".
line 23: why two-parameter?
line 25: how much "increased"? which accuracy?
line 73: "and colleagues"?
Introduction: explain more clearly what the research gap of the paper is and state its objective.
line 149: "are" or "were"? use past tense; review for all text.
line 154: why was that region chosen? what is the economic relevance of the crop for the region? were all images from the same crop variety and phenological stage?
Figure 2: is this methodology only applicable to greenhouse conditions, or could it be applied at a large scale?
line 192: cite a reference for that number.
line 244: include more details about prediction models (e.g.: number of iterations, etc.). What is the kind of input data? Is there any pre-processing of the images?
Equations: cite the references, when applicable.
Figure 7 should be an equation?
About machine vision: include more details about camera settings, etc. What is the computational capacity required for it? was there any influence of environment, such as luminosity, on data readings?
line 391: "We also recorded the system’s predicted data" why? what is the relevance from it?
line 393: figure 3?
Figures 9-10: how many samples were evaluated? is this result from testing or training? include the metrics for each comparison.
Figure 11: the lines are difficult to interpret. if relevant for the results, split it into more figures.
Table 2: is it the resolution with 2 decimals? mainly for diameter.
line 500: is there any numbers/comparison to confirm it?
line 511: what should be the acceptable error threshold for robotics applications? is it possible to measure it or obtain it from the literature?
line 521: is there any statistical test to confirm "not significant"?
Table 4: would it be better for reader understanding to show a graph?
Discussion: improve it throughout the text; there are too few references for a scientific paper.
Conclusion: focus on answering the proposal of the paper; use past tense (review for all text).

Comments on the Quality of English Language

Improve sentences and punctuation.

Author Response

Dear reviewer:

Thank you for taking the time to thoroughly review this paper and provide valuable suggestions. Your professional insights and constructive feedback have helped us further improve the quality and rigor of the article. We have carefully considered all your suggestions and made the necessary revisions. We sincerely appreciate the time and effort you have dedicated to this review! Here are our responses to your comments and questions.

 

Comments 1: Keywords: avoid repeating words from the title.

Response 1: Thank you very much for your comments, which have greatly helped improve our work. Based on your suggestions, we have adjusted the keywords. In the original manuscript, the keyword “blueberry” was repetitive with the title. After careful consideration, we have replaced it with “fruit hardness” and rearranged the keyword order. The revised order starts with “fruit hardness”, followed by “Non-destructive harvesting”, “Machine vision”, “Predictive modeling”, and “ChOA-RBF” to present the research focus more clearly.

The location is on page 1, line 37.

 

Comments 2: line 39: is it the most recent data from reference [2]? also, use dollar values when applicable. Reference 5: is it from a co-author? the same for reference [2]

Response 2: Thank you for pointing out the issue regarding the planting area and yield data. The original data was not the latest survey data, and outdated information lacked authenticity and timeliness. Therefore, we re-examined relevant literature, updated the planting area and yield data with the most recent information, and added corresponding references. The picking cost data in line 39 was obtained from joint calculations with the plantation but was not supported by published references. Considering its lack of verification, we have deleted this part.

Additionally, references 2 and 5 were not contributed by any co-author, and there is no cooperation between us and the authors of these references. During the review process, we found that reference 2 was outdated and reference 5 had low relevance. Therefore, we replaced reference 2 with a more recent reference and removed reference 5.

The specific location is on page 2, line 44.

The original text is [According to the latest data from the International Blueberry Organization, China’s blueberry planting area has reached 77,641 hectares, with a total output of 525,300 tons, making it the largest blueberry producer in the world in terms of acreage [2].].

The revised version is [According to the 2024 report by the International Blueberry Organization, China's blueberry cultivation area reached 84,420 hectares, with a production of 563.46 kilotons, ranking first in the world [2].].

The deleted content is [Market surveys show that fresh blueberries are priced at £30 per kilogram, while labor costs for traditional hand-picking can exceed £2.50 per kilogram, accounting for more than 40% of the total cost, with a daily harvest rate of only about 40 kilograms per person.].

 

Comments 3: line 13: "new" and "intelligent" are subjective. the same for line 139. the same for line 23 "simple and efficient".

Response 3: Thank you very much for your comments. In line 13, we intended to express that the fruit hardness, which cannot be directly obtained from images, is predicted based on the fruit's size parameters through image recognition. The predicted “hardness” is used to adjust the gripping force of the mechanical claw to avoid fruit damage. This non-destructive picking method has not been mentioned in previous studies, so we described it as “novel”. The entire process is executed automatically by the visual recognition device, from image capture and parameter recognition to gripping force adjustment, without human intervention, so we described it as “intelligent”. However, we acknowledge that our original description was unclear, so we have rewritten this part on page 1 line 15-19. The description in line 139 was incorrect, and the term “intelligent” was inappropriate. We have revised this section on page 4 line 164-166.

The two-parameter model excludes the weight parameter compared to the three-parameter model, making its structure simpler and improving computational efficiency. In Section 3.3 and Figure 13, we provided a comparison of runtime and accuracy between the two models. Although the two-parameter model has slightly lower accuracy, its convergence time is significantly shorter, which supports our description of it as “simple and efficient”.

The original text is [This paper proposes a new intelligent method using machine vision to automatically identify physical characteristics of blueberries, such as diameter and thickness, to predict fruit hardness, thereby enabling real-time adjustment of the gripper force of mechanical claws.].

The revised version is [This paper proposes an intelligent recognition and prediction method based on machine vision. The method uses image recognition technology to extract the physical characteristics of blueberries, such as diameter and thickness, and estimates fruit hardness in real-time through a predictive model. The gripping force of the mechanical claw is dynamically adjusted to ensure non-destructive harvesting.].

The original text is [To improve the detection accuracy and reduce misjudgments due to occlusion, an efficient intelligent dual-parameter method is proposed, which estimates fruit hardness based solely on diameter and thickness.].

The revised version is [To simplify the model structure, improve prediction speed, and eliminate the weight parameter that cannot be directly obtained from images, a more efficient two-parameter method is proposed.].

 

Comments 4: line 23: why two-parameter?

Response 4: Thank you very much for your comments. The two-parameter model uses diameter and thickness to predict fruit hardness, while the three-parameter model includes weight. The description and method of the two-parameter model are detailed in Section 3.2.1 on page 17 line 527-528.

The specific description is [Therefore, this study adopts the "two-parameter method", which uses only the fruit's diameter and thickness to predict and verify blueberry hardness.].

 

Comments 5: line 25: how much "increased"? which accuracy?

Response 5: Thank you for your question. The two-parameter model simplifies the structure by excluding the weight parameter, resulting in faster convergence. In the simulation and comparison in Section 3.2.2, the two-parameter model reduces the number of convergence steps by 71, shortens the computation time by one-third, and increases the error by only 0.36% compared to the three-parameter model. This explanation was missing in the original manuscript, and we have supplemented it in both the Abstract and Section 3.2.2 on page 1 line 29-32 and page 18 line 566-569.

The original text is [This method significantly increased the iteration speed of the algorithm while maintaining accuracy.].

The revised version is [Although the two-parameter method increases the prediction error by 0.36% compared to the three-parameter method, it reduces the number of convergence steps by 71 and shortens the computation time by one-third, significantly improving iteration speed.].

The original text is [The prediction error of the two-parameter method was within 8%, indicating that although the two-parameter method slightly lags behind the three-parameter method in statistical metrics, its prediction error does not significantly affect the results in practical applications.].

The revised version is [The results show that the prediction error of the two-parameter method remains within 8%. Although it is 0.36% higher than that of the three-parameter method, the impact on practical applications is negligible, while the speed improvement of the algorithm significantly outweighs the slight increase in error.].

 

Comments 6: line 73: "and colleagues"?

Response 6: Thank you very much for your comments. This mistake was caused by our carelessness, and we sincerely apologize for the confusion. We have corrected the error.

The revised version is [More recently, Oh and colleagues integrated sensory attributes with tool measurements to identify and predict blueberry hardness [15], enhancing the selection of desired blueberry textures through mechanical parameter estimation.].

 

Comments 7: Introduction: explain more clearly what the research gap of the paper is and state its objective.

Response 7: Thank you very much for your comment. A better introduction is crucial for a paper as it highlights the existing research problems, outlines future improvement goals, and introduces the research objectives and methods. Therefore, we have revised and supplemented the introduction, with the specific location on page 3 line 96-104 and line 136-146.

The specific description is [In addition, deep learning models have achieved remarkable results in image correction and object detection. Xu et al. proposed a specular highlight removal method based on the generative adversarial network (GAN). The method uses an attention mechanism to generate highlight intensity masks, which remove highlights while preserving image details, providing an effective preprocessing method for feature extraction and analysis [20]. Zhuang et al. developed a dual-image and dual-local contrast measure (DDLCM) algorithm, which enhances target saliency through local feature enhancement, significantly improving the recognition accuracy of small and weak targets in complex backgrounds [21].].

[Machine learning techniques can establish nonlinear mapping relationships between multiple parameters, showing high flexibility and accuracy in prediction tasks. In recent years, the deep integration of visual recognition and machine learning has promoted the development of intelligent perception. These two technologies complement each other in feature extraction and prediction modeling. Ma et al. combined the canopy reflectance model of row crops with backward propagation neural networks (BPNNs) to estimate fractional vegetation cover (FVC), demonstrating the effectiveness of visual recognition and machine learning algorithms in agricultural parameter estimation [30]. Cai et al. used the YOLACT model to identify ice block images and applied image processing algorithms to estimate fracture features, proving that deep learning combined with image analysis can effectively extract physical features of objects [31]. Ma et al. proposed a pixel dichotomy coupled with a linear kernel-driven model (PDKDM) to estimate FVC in drought areas through random sampling, verifying the feasibility of non-contact biological parameter estimation based on visual feature extraction and modeling [32].].

 

Comments 8: line 149: "are" or "were"? use past tense; review for all text.

Response 8: Thank you for your question. This was indeed a mistake in our writing, and we apologize for the error. We have made the correction and thoroughly checked the entire manuscript for similar mistakes. The specific locations include page 4, line 174, and page 5, line 185, among others.

 

Comments 9: line 154: why was that region chosen? what is the economic relevance of the crop for the region? were all images from the same crop variety and phenological stage?

Response 9: Thank you for your question. The discussion on the region and blueberry variety is indeed very important. Regarding the planting regions, blueberry cultivation in China is mainly distributed in Shandong, Jilin, Liaoning, and Yunnan provinces. In northern China, blueberries are mostly grown in greenhouses, while in southern regions, they are planted in open fields. With advancements in greenhouse technology, various blueberry varieties are now grown across different regions. The constant temperature environment of greenhouses makes them an increasingly popular cultivation method, especially for high-value crops like blueberries, which require year-round supply. Therefore, we chose to conduct our study on blueberries grown in northern greenhouse plantations. The plantation mentioned in our manuscript (latitude 119°89', longitude 35°75') is a representative northern site. Blueberries provide significant economic returns to northern regions, while rabbit-eye blueberries are widely cultivated in southern China due to their taste and high yield. However, high picking costs and the short harvesting window lead to considerable economic losses, which were briefly mentioned in the introduction. Based on your suggestion, we have added further discussion on this topic on page 5 line 178-186.

The image information is described in Section 2.4, including the recognition device and shooting angles. The tested variety is rabbit-eye blueberries, commonly grown in northern greenhouses. The data collection took place from June to July and September to October 2024. This information was included in the experimental validation section (Section 3.3) but lacked further explanation of the image collection process. We have now added this content on page 12 line 404-407.

The original text is [This project focuses on rabbit-eye blueberries grown in greenhouses in northern China, with the experimental area located at Qingdao Wolin Blueberry Agricultural Co., Ltd. (latitude 119°89', longitude 35°75').].

The revised version is [China's blueberry cultivation is mainly distributed in Shandong, Jilin, Liaoning, and Yunnan provinces. In northern China, blueberries are primarily grown in greenhouses, while in southern regions, they are cultivated in open fields. With the advancement of greenhouse technology and its constant temperature advantages, greenhouses have be-come an increasingly common cultivation method. Consistent with previous studies [33], we selected greenhouse-grown blueberries in northern China for our study, choosing the rabbit-eye blueberry variety, which is widely grown across both northern and southern China, as the research subject. The experiments and data collection were conducted at Qingdao Wolin Blueberry Agricultural Co., Ltd. (latitude 119°89', longitude 35°75').].

The original text is [During the experimental validation, we used visual equipment to collect blueberry fruit data in different experimental greenhouses, recognizing mature fruits from different angles (front view, side view, and bottom view) under the same lighting conditions, as shown in Figure 8].

The revised version is [During the experimental validation process, we used visual recognition equipment to collect blueberry fruit data from different greenhouses and seasons. Under consistent lighting conditions, images of mature fruits were captured from different angles (front view, side view, and bottom view), as shown in Figure 8.].

 

Comments 10: Figure 2: is this methodology only applicable to greenhouse conditions, or could it be applied at a large scale?

Response 10: Thank you very much for your question. Our current research focuses on rabbit-eye blueberries grown in northern greenhouses. Since most blueberries in northern China are cultivated in greenhouses, the experiments and data collection were all conducted under greenhouse conditions. However, the predominant blueberry variety in southern China is also rabbit-eye blueberries. Due to the consistency in variety, we believe the method has general applicability. In future research, we will collect data from field-grown blueberries in southern China to conduct comparative analysis on regional differences.

 

Comments 11: line 192: cite a reference for that number.

Response 11: Thank you for your question. We indeed lacked relevant literature references in this section. Based on your suggestion, we have reviewed studies on multi-parameter correlation coefficients, carefully read them, and cited them in the corresponding section of our manuscript. The specific location is on page 45.

 

Comments 12: line 244: include more details about prediction models (e.g.: number of iterations, etc.). What is the kind of input data? Is there any pre-processing of the images?

Response 12: Thank you very much for your question. The parameters of the optimized RBF model need to be continuously adjusted by the ChOA model. However, the specific parameters of the ChOA model were indeed missing in our original text. We have now added this information, which can be found on page 10 line 351-353. The model input consists of the physical feature parameters of blueberries, while the output is the fruit hardness. The three-parameter model uses diameter, thickness, and weight as input parameters, while the two-parameter model uses only diameter and thickness as input parameters, with hardness as the output. The input parameters for model prediction are extracted through image recognition during actual operation. The images are enhanced through different angles and varying exposure rates. This information is described in Section 2.4, on page 12 line 411-414.

The additional content is [The ChOA optimized model was set with a maximum iteration step of 300, population size of 50, default aggressiveness factor of 1.0, and repulsion factor of 2.0.].
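To make the optimization setup more concrete, a minimal, self-contained sketch is given below. It trains a small Gaussian-RBF regressor on the blueberry feature parameters (diameter, thickness, and weight as inputs; hardness as output) and tunes the kernel width with a generic population-based search using the stated population size (50) and iteration limit (300). This is only an illustrative stand-in: it is not the authors' ChOA implementation (the aggressiveness and repulsion factors are not modelled), and the data, search bounds, and number of RBF centres are placeholders.

```python
import numpy as np

def rbf_features(X, centers, width):
    """Gaussian RBF activations for each sample/centre pair."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * width ** 2))

def fit_output_weights(X, y, centers, width):
    """Linear output layer of the RBF network, solved by least squares."""
    return np.linalg.lstsq(rbf_features(X, centers, width), y, rcond=None)[0]

def predict(X, centers, width, w):
    return rbf_features(X, centers, width) @ w

def tune_rbf_width(X_tr, y_tr, X_te, y_te, pop_size=50, max_iter=300, n_centers=12, seed=0):
    """Population-based search over the RBF width (a generic stand-in for ChOA)."""
    rng = np.random.default_rng(seed)
    centers = X_tr[rng.choice(len(X_tr), n_centers, replace=False)]
    candidates = rng.uniform(0.05, 5.0, pop_size)           # candidate kernel widths
    best_width, best_rmse = candidates[0], np.inf
    for _ in range(max_iter):
        for width in candidates:
            w = fit_output_weights(X_tr, y_tr, centers, width)
            rmse = np.sqrt(np.mean((predict(X_te, centers, width, w) - y_te) ** 2))
            if rmse < best_rmse:
                best_rmse, best_width = rmse, width
        # pull the population toward the best solution found so far
        candidates = np.clip(best_width + rng.normal(0.0, 0.2, pop_size), 1e-3, None)
    return centers, best_width, best_rmse

# Placeholder data: columns are diameter (mm), thickness (mm), weight (g); target is hardness.
rng = np.random.default_rng(1)
X = rng.uniform([10.0, 8.0, 1.0], [20.0, 16.0, 3.5], size=(400, 3))
y = 0.3 * X[:, 0] + 0.2 * X[:, 1] + rng.normal(0.0, 0.2, 400)   # synthetic stand-in for measured hardness
centers, width, rmse = tune_rbf_width(X[:300], y[:300], X[300:], y[300:])
print(f"selected RBF width: {width:.3f}, held-out RMSE: {rmse:.3f}")
```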

Relevant description is [To ensure the richness of the dataset, we applied data augmentation methods such as rotation, scaling, contrast enhancement, and noise addition in random combinations to the collected images. Through data augmentation, we obtained 4210 images, which were used as training images for the maturity model recognition and for parameter extraction. In this study, we classified blueberries into three categories: mature, semi-mature, and immature.].

 

Comments 13: Equations: cite the references, when applicable.

Response 13: Thank you for your question. The optimization algorithm of ChOA has been cited in the corresponding section, with the reference included. The description of the relevant equations for RBF has also been cited, with the specific location on page 9 line 328.

 

Comments 14: Figure 7 should be an equation?

Response 14: Thank you for your question. Initially, we described the evaluation metrics using equations. Considering the importance of the evaluation metrics, we converted them into an image for better presentation. After receiving your question, we have reverted to the equation format. The specific location is on page 11 line 382.
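For reference, the standard definitions of the evaluation metrics mentioned later in this record (RMSE, MAE, and MBE) are reproduced below; these are the conventional forms and are only assumed to correspond to the metrics restored as an equation in place of the former Figure 7:

$$\mathrm{RMSE}=\sqrt{\tfrac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i-y_i\right)^2},\qquad \mathrm{MAE}=\tfrac{1}{n}\sum_{i=1}^{n}\left|\hat{y}_i-y_i\right|,\qquad \mathrm{MBE}=\tfrac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i-y_i\right),$$

where $\hat{y}_i$ is the predicted hardness, $y_i$ the measured hardness, and $n$ the number of samples.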

 

Comments 15: About machine vision: include more details about camera settings, etc. What is the computational capacity required for it? was there any influence of environment, such as luminosity, on data readings?

Response 15: Thank you very much for your question. The key focus of this paper is to address the data processing and prediction process. We did not provide extensive details on the camera settings and computing power, as this content is more oriented toward visual research, which is presented in another paper. The influence of environmental factors (such as brightness) has been discussed in the pure visual study. After algorithm adjustments, optimizations, and extensive dataset training, this influence is minimal. We included descriptions of the meaningful aspects for this paper, such as image processing and experimental conditions, which are detailed on page 12 line 402-414.

Relevant description is [The data was collected at the Wolin Blueberry Base in Qingdao, using a Canon camera for shooting. The recognition device was the Canon camera (R6 MARK II) used by the research institute. During the experimental validation process, we used visual recognition equipment to collect blueberry fruit data from different greenhouses and seasons. Under consistent lighting conditions, images of mature fruits were captured from different angles (front view, side view, and bottom view), as shown in Figure 7.

To ensure the richness of the dataset, we applied data augmentation methods such as rotation, scaling, contrast enhancement, and noise addition in random combinations to the collected images. Through data augmentation, we obtained 4210 images, which were used as training images for the maturity model recognition and for parameter extraction.].
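As an illustration of the augmentation pipeline described above (rotation, scaling, contrast enhancement, and noise addition applied in random combinations), a minimal sketch using torchvision is shown below; the parameter values and output image size are placeholders, not the settings used in the study.

```python
import torch
from torchvision import transforms

def add_gaussian_noise(img, std=0.02):
    # img is a float tensor in [0, 1]; clamping keeps pixel values valid after adding noise
    return torch.clamp(img + torch.randn_like(img) * std, 0.0, 1.0)

# Each transform is randomized, so repeated calls yield varied combinations of augmentations.
augment = transforms.Compose([
    transforms.RandomRotation(degrees=30),                   # rotation
    transforms.RandomResizedCrop(224, scale=(0.8, 1.0)),     # scaling (output size is a placeholder)
    transforms.ColorJitter(brightness=0.2, contrast=0.3),    # contrast / exposure variation
    transforms.ToTensor(),
    transforms.Lambda(add_gaussian_noise),                   # noise addition
])
# usage: augmented = augment(pil_image)  # pil_image is a PIL.Image of a blueberry photo
```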

 

Comments 16: line 391: "We also recorded the system’s predicted data" why? what is the relevance from it?

Response 16: Thank you very much for your question. The unclear expression in this part was caused by our oversight. What we intended to express is that the relevant parameters are extracted through image recognition and used as model inputs. The model then performs prediction simulation and records the model output results (system’s predicted data). We appreciate your valuable suggestion and have revised the text accordingly. The modification can be found on page 12 line 419-421.

The original text is [The relevant parameters were extracted from the recognition images, such as the fruit diameter and thickness, which were used as input samples for real-time fruit hardness prediction. We also recorded the system’s predicted data.].

The revised version is [Fruit diameter, thickness, and other relevant parameters were extracted from the images and used as inputs for real-time fruit hardness prediction. The model's prediction data was recorded after operation.].

 

Comments 17: line 393: figure 3?

Response 17: Thank you very much for your question, and we sincerely apologize for our carelessness. This issue occurred because the image was originally described in a previous section during the early draft, but the section position was adjusted in subsequent revisions. While we updated the image title, we neglected to update the text description. We sincerely apologize for this oversight and have made the necessary corrections. The revised content is on page 12 line 422.

 

Comments 18: Figures 9-10: how many samples were evaluated? is this result from testing or training? include the metrics for each comparison.

Response 18: Thank you for your question. We evaluated 400 data samples and divided the dataset into training and testing sets in a 3:1 ratio. This is described at the beginning of section 3.1.1, on page 13 line 429-430.

Figures 8 and 9 show the training and testing results of the six models, respectively. Due to our inaccurate expression, we mistakenly referred to the testing set as the validation set in the manuscript. We have corrected this error throughout the text. The performance metrics of each model are described in Table 1 on page 14 line 469.
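For clarity, a minimal sketch of the split described above (400 samples, shuffled, training and testing sets in roughly a 3:1 ratio) is given below; the file name and random seed are placeholders, and test_size=0.2 would give the 4:1 ratio quoted from the revised manuscript instead.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical file: one row per fruit with columns diameter, thickness, weight, measured hardness
data = np.loadtxt("blueberry_samples.csv", delimiter=",", skiprows=1)
X, y = data[:, :3], data[:, 3]

# 3:1 training/testing split as stated in the response
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, shuffle=True, random_state=42)
print(len(X_train), "training samples,", len(X_test), "testing samples")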

 

Comments 19: Figure 11: the lines are difficult to interpret. if relevant for the results, split it into more figures.

Response 19: Thank you for your question. The excessive number of comparison models made the lines in the original Figure 11 cluttered and difficult to interpret. However, images can more intuitively illustrate the differences between models, their improvement ranges, and error gaps. Therefore, we redrew Figure 11 by splitting it into three subfigures. Each subfigure compares one model with and without ChOA optimization, demonstrating the advantages of the ChOA algorithm, while the overall figure shows the differences between models. This is presented on page 15, as Figure 10.

 

Comments 20: Table 2: is it the resolution with 2 decimals? mainly for diameter.

Response 20: Thank you for your question. Here, we intended to state that all parameters, including diameter, thickness, weight, and hardness, are rounded to two decimal places. Since the model's prediction results often include more decimal places, we standardized all data to two decimal places. This explanation is provided on page 15 line 488-489.

The specific description is [To ensure the consistency of data accuracy, all values are rounded to two decimal places.].

 

Comments 21: line 500: is there any numbers/comparison to confirm it?

Response 21: Thank you for your question. The two-parameter method excludes the weight parameter, which cannot be directly extracted from images. Weight needs to be calculated based on diameter and thickness, adding computational complexity to the robot's operation. In Figure 12a, the two-parameter method reduced the number of iterations by one-third compared to the three-parameter method. Figure 12b shows that the two-parameter method had only 0.36% more error, which is minimal and negligible. Therefore, we stated in the manuscript that the two-parameter method improves efficiency, simplifies the harvesting robot's operation, and maintains accuracy and reliability. The specific location is on page 18 line 566-569.

The specific description is [The results show that the prediction error of the two-parameter method remains within 8%. Although it is 0.36% higher than that of the three-parameter method, the impact on practical applications is negligible, while the speed improvement of the algorithm significantly outweighs the slight increase in error.].

 

Comments 22: line 511: what should be the acceptable error threshold for robotics applications? is it possible to measure it or obtain it from the literature?

Response 22: Thank you for your question. We conducted a detailed discussion on the robot's acceptable error threshold, which was set based on gripping force and the safety factor in section 2.3.4. We assumed a safety factor of 1.3 to ensure the mechanical claw's grip strength avoids fruit slippage while preventing damage. This content is described on page 12 line 394-399.

The specific description is [Where Fc is the gripping force of the mechanical claw, F is the real-time predicted fruit hardness, and s is the safety coefficient, set at 1.3 in this study. The selection of the safety coefficient depends on various factors, including the fruit's physical properties, the prediction error in hardness, and the need to avoid fruit damage. By appropriately adjusting the safety coefficient, the gripping force can be optimized, ensuring the integrity of the fruit while improving the efficiency and accuracy of the mechanical harvesting process.].
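The equation itself is not reproduced in the excerpt above (the quote begins with "Where Fc is..."), so the few lines below only sketch one plausible reading, in which the commanded grip force is kept a factor s below the predicted hardness. This relation is an assumption made for illustration only and should be checked against Section 2.3.4 of the manuscript.

```python
def gripping_force(predicted_hardness, safety_coefficient=1.3):
    """Illustrative assumption only: keep the commanded grip force a safety margin
    below the real-time predicted hardness so the claw holds the fruit firmly
    without crushing it. The exact relation is defined in the manuscript (Section 2.3.4)."""
    return predicted_hardness / safety_coefficient

# Example: a fruit with a predicted hardness of 3.9 N would be gripped with about 3.0 N.
print(round(gripping_force(3.9), 2))
```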

 

Comments 23: line 521: is there any statistical test to confirm "not significant"?

Response 23: Thank you for your question. The comparison of MAE and MBE is shown in Table 3, but our primary evaluation metric is RMSE. The RMSE of the two-parameter method increased by approximately 0.1. The original statement that this increase is "not significant" was incorrect, and we have corrected it. MAE and MBE only show slight increases but do not fully reflect the model's overall performance. The final comparison is explained through the error data in Figure 12b, combined with the table parameters.

The original text is [However, the changes in MAE and MBE were minimal, indicating that the error differences between the two methods are not significant.].

The revised version is [The results show that the MAE and MBE increased by 0.1 and 0.01, respectively, indicating a slight decline in the performance of the two-parameter model compared to the three-parameter model.].

 

Comments 24: Table 4: would it be better for reader understanding to show a graph?

Response 24: Thank you for your question. Table 4 contains multiple parameters, including diameter, thickness, actual hardness, predicted hardness, error, and relative error. While figures can better visualize comparative results, it is difficult to represent such extensive data graphically. If presented in graphical form, diameter and thickness parameters would need to be omitted. To maintain data authenticity and visibility, we chose to present the results in tabular form, listing each sample's results and calculating errors and relative errors, which still reveal the differences between experimental and simulation results.

 

Comments 25: Discussion: improve it throughout the text; there are too few references for a scientific paper.

Response 25: Thank you for your question. The "Discussion" section is essential in a scientific paper, as it summarizes key points, discusses encountered issues, and outlines future technological improvements, potential applications, and scientific evaluations. Therefore, we have added this section and revised the conclusion accordingly. The updated content is on page 20 line 610-646, and we have also added more references throughout the manuscript.

The deleted content is [With the rise in labor costs and the advancement of agricultural technology, mechanized crop harvesting is bound to become the future trend of agriculture. However, the development of blueberry harvesting robots has been slow due to the fruit's clustered growth and soft texture. Additionally, blueberries have low fruit hardness and fragile skin, making them easily damaged by conventional mechanical claws. This necessitates the flexible design and precise control of mechanical claws. This study addresses the challenge of non-destructive harvesting for soft fruits by proposing a method that extracts blueberry features through a visual recognition device and predicts fruit hardness. Based on the predicted hardness, the mechanical claw's gripping force is dynamically adjusted to prevent fruit damage and achieve precise, non-destructive harvesting. We first trained and simulated ChOA optimized RBF, BP, and RF models to compare their differences and detection accuracy. To further verify the simulation's accuracy, we conducted orthogonal experiments. The results revealed the low correlation of the weight parameter and its complexity in extraction, prompting us to propose the simpler and faster two-parameter model, which estimates fruit hardness using only diameter and thickness.

Subsequent experimental validation and evaluation demonstrated that the two-parameter model improves execution speed while maintaining prediction accuracy compared to the three-parameter model. However, we also observed that the recognition device and prediction accuracy still require further optimization. As visual algorithms advance, recognition efficiency and accuracy can be continuously improved. Moreover, large amounts of data revealed the presence of positive and negative prediction errors. Positive errors indicate that the gripping force may crush the fruit, while negative errors suggest insufficient gripping force, potentially causing fruit slippage. We addressed this issue by introducing a safety factor in our study. In practical applications, the use of flexible materials and curved mechanical claw designs could further enhance harvesting efficiency and accuracy. These aspects will be considered in our future design and development.

In this study, we focused on rabbit-eye blueberries, a representative variety widely grown in northern greenhouses and southern open fields. In future extensions, we plan to incorporate different varieties and planting environments as parameters to optimize our model. This will enable the model to automatically adjust to various harvesting conditions across different varieties and regions. The proposed method could also be applied to other soft fruits with fragile skins, such as waxberries and peaches. By simulating human decision-making, the mechanical claw dynamically adjusts gripping force based on external features and image details, preventing damage. This image-based fruit hardness prediction method provides a new approach to non-destructive harvesting and offers a theoretical foundation for the dynamic adjustment of gripping force in harvesting robots.].

 

Comments 26: Conclusion: focus on answering the proposal of the paper; use past tense (review for all text).

Response 26: Thank you for your question. The Conclusion section initially lacked proper review and categorization. Since we did not include a Discussion section previously, some content that did not belong in the Conclusion was mistakenly placed there. Based on our previous response and revision, we have added a Discussion section and refined the Conclusion.

 

Thank you again for your thorough review of our manuscript and your valuable feedback. Your guidance and suggestions have greatly helped us improve our work and writing skills. We are very grateful for your hard work and expertise.

Author Response File: Author Response.docx

Reviewer 2 Report

Comments and Suggestions for Authors

The proposed method, combining visual recognition for predicting fruit hardness and adjusting mechanical gripper force accordingly, presents a novel approach for non-destructive harvesting. The focus on improving the efficiency of blueberry harvesting through automation is timely, given the increasing demand for labor-saving and non-damaging methods. The use of the Chimpanzee Optimization Algorithm (ChOA) in optimizing the model is a unique contribution, and the comparison with other models such as RBF, BP, and RF provides strong evidence for its effectiveness.

More details on the parameter settings of the optimization process (such as population size, iteration limits, etc.) would help readers understand the robustness of the proposed model. The transition from the full-featured model (using diameter, thickness, and weight) to the simplified two-parameter method (using just diameter and thickness) is interesting, but it would be beneficial to elaborate more on the trade-offs between these methods and why the two-parameter method is more suitable for practical applications.

The dataset used for training and testing the models is not described in detail. Information on the number of samples, diversity of blueberry types (e.g., size, ripeness, etc.), and experimental conditions would be helpful for evaluating the generalizability of the approach. It would also be useful to mention if there were any potential biases in image acquisition, such as variations in lighting conditions or image resolution, and how these might impact the prediction accuracy.

The paper mentions that both positive and negative prediction errors were found. It would be beneficial to further explain how these errors might impact the practical application of the model and whether the negative errors (where the model underestimates hardness) might pose a higher risk for fruit damage. The suggestion to use flexible materials and curved designs for the mechanical claw is an interesting one. It could be helpful to briefly explore how these materials might improve the overall performance of the harvesting process, especially in relation to different types of fruits with varying textures.

Some good work could be discussed in intro or literature like a) Estimating Fractional Vegetation Cover of Row Crops from High Spatial Resolution Image b) A Pixel Dichotomy Coupled Linear Kernel-Driven Model for Estimating Fractional Vegetation Cover in Arid Areas From High-Spatial-Resolution Images c) Infrared Weak Target Detection in Dual Images and Dual Areas d) Highlight Removal from A Single Grayscale Image Using Attentive GAN e) Broken ice circumferential crack estimation via image techniques

The two-parameter method for predicting blueberry hardness using diameter and thickness is particularly valuable for real-time applications, given its faster execution speed and simplicity. Further, a discussion on the cost-effectiveness of the system, including the cost of the recognition device and gripper integration, would provide useful insights for its commercial viability. The paper could expand on potential challenges in deploying the system in large-scale commercial blueberry farms. For instance, how would the system handle different environmental conditions (e.g., weather, soil types) or variations in blueberry quality across harvest seasons? 

The paper would benefit from clearer and more informative figures and tables. A visual representation of the ChOA-RBF model and a flowchart of the experimental setup could aid in understanding the workflow. Tables summarizing key experimental results, including performance comparisons between models, would be useful. 

Comments on the Quality of English Language

In some parts, the technical terms (e.g., ChOA-RBF model, orthogonal experiments) are used without sufficient explanation for readers unfamiliar with these concepts. Providing a brief explanation or referencing earlier work would improve accessibility.

Some sentences could be restructured for readability. For example, "The drawbacks of traditional hand-picking methods—such as high costs, low efficiency, damage, waste, and seasonal labor demands—have driven up the cost of blueberries" could be revised to avoid a long list of issues in a single sentence.

Author Response

Dear reviewer:

Thank you for taking the time to thoroughly review this paper and provide valuable suggestions. Your professional insights and constructive feedback have helped us further improve the quality and rigor of the article. We have carefully considered all your suggestions and made the necessary revisions. We sincerely appreciate the time and effort you have dedicated to this review! Here are our responses to your comments and questions.

 

Comments 1: The proposed method, combining visual recognition for predicting fruit hardness and adjusting mechanical gripper force accordingly, presents a novel approach for non-destructive harvesting. The focus on improving the efficiency of blueberry harvesting through automation is timely, given the increasing demand for labor-saving and non-damaging methods. The use of the Chimpanzee Optimization Algorithm (ChOA) in optimizing the model is a unique contribution, and the comparison with other models such as RBF, BP, and RF provides strong evidence for its effectiveness.

Response 1: We appreciate your recognition and comments. With rising labor costs and advancements in agricultural technology, mechanized crop harvesting is undoubtedly the future trend in agriculture. This paper addresses the challenge of non-destructive harvesting for soft-textured fruits by proposing a method that uses a vision-based system to extract blueberry features and predict fruit hardness. The predicted hardness is then used to dynamically adjust the mechanical claw’s gripping force, preventing damage and enabling precise, non-destructive harvesting. We believe this image-based hardness prediction method provides new insights for the non-destructive harvesting of soft fruits and lays a theoretical foundation for dynamically adjusting the gripping force of harvesting robots. The effectiveness of the Chimp Optimization Algorithm (ChOA) in optimizing the predictive model is evident, as shown in the comparison of different models in the previous Figure 11 (now Figure 10).

 

Comments 2: More details on the parameter settings of the optimization process (such as population size, iteration limits, etc.) would help readers understand the robustness of the proposed model. The transition from the full-featured model (using diameter, thickness, and weight) to the simplified two-parameter method (using just diameter and thickness) is interesting, but it would be beneficial to elaborate more on the trade-offs between these methods and why the two-parameter method is more suitable for practical applications.

Response 2: Thank you for your question. Our paper lacked detailed explanations of model parameters. Based on your suggestion, we have added more details, which can be found on page 10 line 351-352.

The additional content is [The ChOA-optimized model was set with a maximum iteration step of 300, population size of 50, default aggressiveness factor of 1.0, and repulsion factor of 2.0.].

The transition from the three-parameter method to the two-parameter method was made after repeated comparisons. First, as shown in the correlation heatmap in section 2.2.1, diameter and thickness had a stronger correlation with hardness, while weight had a correlation coefficient of 0.72. The correlation of weight with hardness was lower than that of diameter and thickness, but this part was not previously explained in detail. We have now revised and supplemented this section. Second, our orthogonal experiment data showed that fruits with the same weight but different diameters and thicknesses had significant differences in hardness, further confirming that weight had minimal influence. Additionally, since weight cannot be directly extracted from images and requires further estimation, we attempted the two-parameter method and found it to be effective. We then conducted simulations and experimental comparisons between the two-parameter and three-parameter methods, and in the results analysis, we explained their differences. We included iterative simulation graphs and experimental error graphs, along with quantitative explanations. These updates can be found on page 6 line 228-230.

The original text is [Notably, the diameter and thickness exhibit a stronger correlation with hardness, suggesting that larger fruits generally have higher hardness.].

The revised version is [Notably, the correlation between diameter and thickness with hardness is stronger, while the correlation between weight and hardness is 0.72, the weakest among the three parameters. This indicates that fruit hardness is more closely related to size.].
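To illustrate how such correlations are typically computed from the measured fruit parameters, a brief sketch follows; the file and column names are placeholders, and the resulting values naturally depend on the actual measurements (the paper reports 0.72 for weight versus hardness).

```python
import pandas as pd

# Hypothetical file: one row per fruit with measured diameter (mm), thickness (mm), weight (g), hardness (N)
df = pd.read_csv("blueberry_measurements.csv")

corr = df[["diameter", "thickness", "weight", "hardness"]].corr(method="pearson")
print(corr["hardness"])   # correlation of each physical parameter with hardness
```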

 

Comments 3: The dataset used for training and testing the models is not described in detail. Information on the number of samples, diversity of blueberry types (e.g., size, ripeness, etc.), and experimental conditions would be helpful for evaluating the generalizability of the approach. It would also be useful to mention if there were any potential biases in image acquisition, such as variations in lighting conditions or image resolution, and how these might impact the prediction accuracy.

Response 3: Thank you very much for your question. The terminology used in our manuscript, referring to the training set and validation set, caused a significant misunderstanding. We have now corrected this description throughout the manuscript. The validation set mentioned in the paper corresponds to the test set as referred to in your question. We divided 400 sets of data into training and test sets at a ratio of 3:1. This description has been updated on page 12 line 429-432.

The specific description is [In this study, the 400 experimental data points were randomly shuffled and divided into a training set and a testing set at a 4:1 ratio. During model prediction, the fruit diameter, thickness, and weight from the training set were used as input samples and fed into the prediction model.].

Regarding the diversity of blueberry varieties, we provided a partial description in Section 2.1 Experimental Field and more detailed information in Section 2.4 Visual Recognition and Experimental Design. This section also covers details on illumination conditions, fruit size, maturity level, and image processing methods. However, since this study primarily focuses on the relationship between visually extracted data and the dynamic adjustment of mechanical gripping force, further in-depth visual recognition research is part of another study. In this paper, we only described the visual setup and experimental process. Additionally, in the Discussion section, we mentioned that image recognition errors might cause deviations in results, but since this research is highly complex, we will address it in a separate paper.

The specific description is [The data was collected at the Wolin Blueberry Base in Qingdao, using a Canon camera for shooting. The recognition device was the Canon camera (R6 MARK II) used by the research institute. During the experimental validation process, we used visual recognition equipment to collect blueberry fruit data from different greenhouses and seasons. Under consistent lighting conditions, images of mature fruits were captured from different angles (front view, side view, and bottom view), as shown in Figure 8.

To ensure the richness of the dataset, we applied data augmentation methods such as rotation, scaling, contrast enhancement, and noise addition in random combinations to the collected images. Through data augmentation, we obtained 4210 images, which were used as training images for the maturity model recognition and for parameter extraction. In this study, we classified blueberries into three categories: mature, semi-mature, and immature. Mature blueberries are dark purple, large in size, and have a deep overall color. Considering real harvesting situations (taste, sales, transportation, texture), we selected fruits with a maturity score of 0.9-0.95 (fully ripe fruits have a score of 1.0) for image recognition.].

 

Comments 4: The paper mentions that both positive and negative prediction errors were found. It would be beneficial to further explain how these errors might impact the practical application of the model and whether the negative errors (where the model underestimates hardness) might pose a higher risk for fruit damage. The suggestion to use flexible materials and curved designs for the mechanical claw is an interesting one. It could be helpful to briefly explore how these materials might improve the overall performance of the harvesting process, especially in relation to different types of fruits with varying textures.

Response 4: Thank you very much for your valuable suggestion, which aligns with the request from Reviewer 1 to provide further discussions. Therefore, we added a Discussion section describing more interesting findings and future improvements. The updated content can be found on page 20, line 610-646. The gripping force for different fruit textures was set based on the safety factor, considering the significant gap between the force required to hold the fruit and the force that would cause damage.

The additional content is [Subsequent experimental validation and evaluation demonstrated that the two-parameter model improves execution speed while maintaining prediction accuracy compared to the three-parameter model. However, we also observed that the recognition device and prediction accuracy still require further optimization. As visual algorithms advance, recognition efficiency and accuracy can be continuously improved. Moreover, large amounts of data revealed the presence of positive and negative prediction errors. Positive errors indicate that the gripping force may crush the fruit, while negative errors suggest insufficient gripping force, potentially causing fruit slippage. We addressed this issue by introducing a safety factor in our study. In practical applications, the use of flexible materials and curved mechanical claw designs could further enhance harvesting efficiency and accuracy. These aspects will be considered in our future design and development.

In this study, we focused on rabbit-eye blueberries, a representative variety widely grown in northern greenhouses and southern open fields. In future extensions, we plan to incorporate different varieties and planting environments as parameters to optimize our model. This will enable the model to automatically adjust to various harvesting conditions across different varieties and regions. The proposed method could also be applied to other soft fruits with fragile skins, such as waxberries and peaches. By simulating human decision-making, the mechanical claw dynamically adjusts gripping force based on external features and image details, preventing damage. This image-based fruit hardness prediction method provides a new approach to non-destructive harvesting and offers a theoretical foundation for the dynamic adjustment of gripping force in harvesting robots.].

 

Comments 5: Some good work could be discussed in intro or literature like

a) Estimating Fractional Vegetation Cover of Row Crops from High Spatial Resolution Image; b) A Pixel Dichotomy Coupled Linear Kernel-Driven Model for Estimating Fractional Vegetation Cover in Arid Areas From High-Spatial-Resolution Images; c) Infrared Weak Target Detection in Dual Images and Dual Areas; d) Highlight Removal from A Single Grayscale Image Using Attentive GAN; e) Broken ice circumferential crack estimation via image techniques

Response 5: We sincerely appreciate the high-quality references you provided. After carefully studying these papers, we found them very helpful. In response to Reviewer 1's suggestion regarding the limited scientific references in our paper, we cited the recommended references accordingly. The updated references can be found on page 3 line 96-104 and line 133-146.

The additional content is [In addition, deep learning models have achieved remarkable results in image correction and object detection. Xu et al. proposed a specular highlight removal method based on the generative adversarial network (GAN). The method uses an attention mechanism to generate highlight intensity masks, which remove highlights while preserving image details, providing an effective preprocessing method for feature extraction and analysis [20]. Zhuang et al. developed a dual-image and dual-local contrast measure (DDLCM) algorithm, which enhances target saliency through local feature enhancement, significantly improving the recognition accuracy of small and weak targets in complex backgrounds [21].].

[Machine learning techniques can establish nonlinear mapping relationships between multiple parameters, showing high flexibility and accuracy in prediction tasks. In recent years, the deep integration of visual recognition and machine learning has promoted the development of intelligent perception. These two technologies complement each other in feature extraction and prediction modeling. Ma et al. combined the canopy reflectance model of row crops with backward propagation neural networks (BPNNs) to estimate fractional vegetation cover (FVC), demonstrating the effectiveness of visual recognition and machine learning algorithms in agricultural parameter estimation [30]. Cai et al. used the YOLACT model to identify ice block images and applied image processing algorithms to estimate fracture features, proving that deep learning combined with image analysis can effectively extract physical features of objects [31]. Ma et al. proposed a pixel dichotomy coupled with a linear kernel-driven model (PDKDM) to estimate FVC in drought areas through random sampling, verifying the feasibility of non-contact biological parameter estimation based on visual feature extraction and modeling [32].].

 

Comments 6: The two-parameter method for predicting blueberry hardness using diameter and thickness is particularly valuable for real-time applications, given its faster execution speed and simplicity. Further, a discussion on the cost-effectiveness of the system, including the cost of the recognition device and gripper integration, would provide useful insights for its commercial viability. The paper could expand on potential challenges in deploying the system in large-scale commercial blueberry farms. For instance, how would the system handle different environmental conditions (e.g., weather, soil types) or variations in blueberry quality across harvest seasons?

Response 6: Thank you very much for your recognition and questions. We have not yet discussed the cost considerations because the current stage of our work focuses on design and development. However, we agree that cost is an essential factor for future commercial applications. Regarding environmental conditions, our current research targets blueberries grown in northern greenhouse environments, where constant temperature, no wind, and sufficient lighting provide favorable conditions. Therefore, we have not yet conducted in-depth studies on harsh environmental conditions. Similarly, soil type is another key factor, as different soil conditions could affect fruit properties. In future work, we plan to expand our experiments to cover different regions, environments, and plant varieties across the country, collecting data to further improve the model.

The selection of blueberry samples was explained in the Experimental Design section. The fruits were collected from different seasons and greenhouses to increase the diversity of the dataset. Additionally, we clarified that green and overripe fruits were not considered due to their low commercial value, unsuitability for harvesting, and short shelf life. The selected fruits were just-ripe hard blueberries with a maturity index of 0.9–0.95, which we believe is an appropriate choice for this study. The explanation is detailed on page 12 line 415-419.

The specific description is [In this study, we classified blueberries into three categories: mature, semi-mature, and immature. Mature blueberries are dark purple, large in size, and have a deep overall color. Considering real harvesting situations (taste, sales, transportation, texture), we selected fruits with a maturity score of 0.9-0.95 (fully ripe fruits have a score of 1.0) for image recognition.].

 

Comments 7: The paper would benefit from clearer and more informative figures and tables. A visual representation of the ChOA-RBF model and a flowchart of the experimental setup could aid in understanding the workflow. Tables summarizing key experimental results, including performance comparisons between models, would be useful.

Response 7: Thank you very much for your recognition and suggestions. Due to the poor line clarity in some of the original images, they were difficult to interpret. We have revised key figures into three sub-figures, each comparing the results of models with and without ChOA optimization. This layout allows readers to visually observe the optimization effect within each sub-figure and better understand the differences between models. The revised figure is now Figure 10. Additionally, we corrected some data presentation issues that could have caused misunderstandings. The exact location is on page 17, line 548-551 and page 18, line 566-569.

The original text is [However, the changes in MAE and MBE were minimal, indicating that the error differences between the two methods are not significant.].

The revised version is [The results show that the MAE and MBE increased by 0.1 and 0.01, respectively, indicating a slight decline in the performance of the two-parameter model compared to the three-parameter model.].

The original text is [This method significantly increased the iteration speed of the algorithm while maintaining accuracy.].

The revised version is [Although the two-parameter method increases the prediction error by 0.36% compared to the three-parameter method, it reduces the number of convergence steps by 71 and shortens the computation time by one-third, significantly improving iteration speed.].

 

Comments 8: In some parts, the technical terms (e.g., ChOA-RBF model, orthogonal experiments) are used without sufficient explanation for readers unfamiliar with these concepts. Providing a brief explanation or referencing earlier work would improve accessibility.

Response 8: Thank you very much for your question. This oversight was caused by our negligence. As you mentioned, technical terms require clear explanations and descriptions. Based on your suggestion, we carefully checked all technical abbreviations throughout the manuscript. We expanded explanations where no proper definitions were provided and added references to previous related work in appropriate sections. The exact location is on page 1, line 22 and page 4, line 162 et al.

 

Comments 9: Some sentences could be restructured for readability. For example, "The drawbacks of traditional hand-picking methods—such as high costs, low efficiency, damage, waste, and seasonal labor demands—have driven up the cost of blueberries" could be revised to avoid a long list of issues in a single sentence.

Response 9: Thank you very much for your question. This error occurred during the translation process, and we have now corrected it. Additionally, we performed a thorough check of the entire manuscript to ensure consistency and accuracy of all terminology.

The original text is [The drawbacks of traditional hand-picking methods—such as high costs, low efficiency, damage, waste, and seasonal labor demands—have driven up the cost of blueberries [3], severely limiting the growth of the blueberry industry [4].].

The revised version is [Traditional manual picking methods have disadvantages such as high costs, low efficiency, fruit loss, and reliance on seasonal labor. These factors increase the cost of blueberry harvesting [3] and severely limit the development of the blueberry industry [4].].

 

Thank you again for your thorough review of our manuscript and for providing your valuable feedback. Your guidance and suggestions have greatly helped us improve our work and writing skills. We are very grateful for your hard work and professional expertise.

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

The authors made improvements to the text.
