Peer-Review Record

Non-Destructive Detection of External Defects in Potatoes Using Hyperspectral Imaging and Machine Learning

Agriculture 2025, 15(6), 573; https://doi.org/10.3390/agriculture15060573

by Ping Zhao*, Xiaojian Wang, Qing Zhao, Qingbing Xu, Yiru Sun and Xiaofeng Ning

Reviewer 1: Anonymous

Reviewer 2: Anonymous

Reviewer 3: Pulakesh Das

Submission received: 28 January 2025 / Revised: 26 February 2025 / Accepted: 6 March 2025 / Published: 7 March 2025

(This article belongs to the Special Issue Agricultural Products Processing and Quality Detection)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

Abstract

Line 11: why did the authors decide to directly use such a complex and expensive imaging technique as HSI to detect external defects? Did they first consider the possibility of using traditional RGB imaging?

Lines 15-16: I should say "to distinguish between healthy and damaged potatoes with different external defects".

Line 17: PCA is known to be a common (or maybe the most common) explorative and data reduction (compression) technique rather than a way to obtain a "quantitative model". So, why and how did the authors consider PCA that way?

Lines 19-20: how could authors say that about black and green potatoes if they just considered red skin potatoes as the research object? If "black" and "green" is instead referred to the skin, so authors are talking about two specific types of defects, the word "skin" should be added before the word "potatoes".

Introduction

Lines 73-75: it is not clear if this statement describes something that was done in this paper by the authors or if it is related to something done by other authors in previous works.

Materials and Methods

Section 2.1: it is not so clear if Jianping Town, Jianping County, Liaoning Province is the same planting base or if they are 3 different planting bases. In this last case, were these 180 samples selected for each planting base or did authors consider a total of 180 samples?

Section 2.2: please also specify the acquisition mode (point, line or area scan) of the HSI analyser.

Section 2.4: expressions like "Take black potato as an example..." or "Extract the spectral data of the region of interest..." should be avoided and revised.

Section 2.5: the authors say nothing about the performance indicators that were considered to express the model accuracy and reliability. Please add something about this key aspect.

Results

Section 3.3: the text from line 217 to line 254 pertains to the descriptive part of modelling techniques and performance indicators that should be given in the corresponding section of the M&M part.

Moreover, a brief description of the performance indicators (R², RMSE, RPD), together with their formula, should be given in the manuscript, even if (I think) this information is already available in reference [22].
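For readers who want the three indicators named in this comment spelled out, here is a minimal sketch of how R², RMSE and RPD are commonly computed. It assumes the usual textbook definitions (RPD taken as the sample standard deviation of the reference values divided by RMSE); the manuscript's exact formulas may differ, and the numbers are made up for illustration.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return (R², RMSE, RPD) for paired reference and predicted values."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    # Residual and total sums of squares
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    # Assumed RPD definition: sample SD of reference values over RMSE
    sd = math.sqrt(ss_tot / (n - 1))
    rpd = sd / rmse
    return r2, rmse, rpd

# Illustrative values only; not data from the study.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]
r2, rmse, rpd = regression_metrics(y_true, y_pred)
print(f"R2={r2:.3f}  RMSE={rmse:.3f}  RPD={rpd:.2f}")
```

Because RMSE carries the unit of the predicted quantity, the same function applied to calibration and prediction sets would give RMSEC and RMSEP respectively.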

Lines 243-246: what is said here need a little bit more of details. So, for the data modelling step, authors considered black and green skin potato as a unique class and, likewise, scab, mechanical damage and broken skin were grouped as a unique class. Is it right? Moreover, why did authors say again "Then, the quantitative prediction model was established..." if the previously mentioned techniques (SVM, PLSR, PCA, and LSSVM) were from the beginning used to develop quantitative models?

Line 247: did authors say "three quantitative analysis models" referring to the three abovementioned groups? Moreover, since we are talking about quantitative models, which is the response (Y) parameter to be predicted in all of them? If it is a reference value obtained by a reference analytical technique, this part should be for sure integrated (obviously not here but already as a M&M dedicated section). Furthermore, knowing what is the measurement unit of the predicted parameter/s (Y) is vital to understand the size of the model error values (RMSE) which actually refers to the same measurement unit.

Line 257: this aspect keeps on being not so clear to me: are we dealing with qualitative (discriminant) or quantitative analysis models?

Lines 265-268: this aspect of using quantitative prediction model to establish the qualitative discrimination model is not clear definitely. Why and how was it performed? Which were these discrimination models? I kindly ask authors to provide a suitable description of this task.

Lines 283-291: this is a descriptive part on how PCA algorithm works. Therefore, it should be moved above into the appropriate M&M section.

Line 286: did authors mean "original variable number 336" or "336 original variables"? In the first case, which is this specific variable and why is it so important?

Figure 8: one of the most important pieces of information is missing: which one of the modelling methods (i.e., SVM, PLSR, PCA, LSSVM) Fig. 8(a), 8(b) and 8(c) refer to?

Section 3.4; Lines 300-315: again, this is a descriptive part on how SPA feature selection algorithm works that should be introduced as an appropriate M&M section.

Figure 9: for both x and y axis, authors must report clearly the parameter to which they are referred to and its corresponding measurement unit.

Section 3.5; Line 329: so, which was the aim of this task? Discriminating between healthy and defective potatoes? And, among the defective ones, to distinguish the type of defect? This is absolutely not clear to the reader since authors never have discussed it before. Actually, from the subsequent text, it seems that, again, three discriminative models were built separately for healthy, green-black skin and scab-mechanical damage-broken. I can understand the meaning of distinguishing among each specific type of defect for the defective potatoes, but I cannot understand which different classes were considered for the healthy potato model.

In my mind, I would imagine a unique discrimination model (for each considered modelling method) with all the healthy and defective samples (3 classes), in order to investigate how many sound samples were actually predicted as defective, falling in which group of defects (green-black skin or scab-mechanical damage-broken), and vice-versa.

Section 3.6; Lines 353-362: again, this is a descriptive part on the meaning of confusion matrix representation. Therefore, the authors should consider moving it above into an appropriate M&M section.

Referring to lines 352-353, why just 30 samples per group were selected to show the results into the confusion matrix?

Figure 11: what about the results obtained using LDA? Why not considering directly to express the correctly and wrongly assigned samples as percentage also inside the confusion matrix?
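The suggestion to show percentages inside the confusion matrix can be sketched as follows. This is an illustrative example only: the class names follow the three groups discussed in this report, and the labels are invented rather than taken from the study.

```python
def confusion_matrix(true_labels, pred_labels, classes):
    """Count samples per (true class, predicted class) pair."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, pred_labels):
        matrix[index[t]][index[p]] += 1
    return matrix

def row_percentages(matrix):
    """Convert each row of counts to percentages of that true class."""
    result = []
    for row in matrix:
        total = sum(row)
        result.append([100.0 * v / total if total else 0.0 for v in row])
    return result

classes = ["healthy", "black-green skin", "scab-damage-broken"]
# Made-up labels for illustration; not the study's results.
true_labels = (["healthy"] * 4 + ["black-green skin"] * 4
               + ["scab-damage-broken"] * 4)
pred_labels = (["healthy"] * 4
               + ["black-green skin"] * 3 + ["scab-damage-broken"]
               + ["scab-damage-broken"] * 3 + ["healthy"])

counts = confusion_matrix(true_labels, pred_labels, classes)
percent = row_percentages(counts)
for name, c_row, p_row in zip(classes, counts, percent):
    cells = "  ".join(f"{c} ({p:.0f}%)" for c, p in zip(c_row, p_row))
    print(f"{name:>20}: {cells}")
```

Each cell then reports both the count and the share of its true class, so correctly and wrongly assigned samples can be read from a single matrix.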

Discussion

Discussion: even if, as authors say, nothing is available yet for red skin potatoes, something more should be provided about the already existing studies performed on the yellow skin potatoes. More in details, are there some works in literature that already focused on the differentiation between healthy and defective potatoes? Or on a particular kind of potato's skin defect? If so, these studies should be included into the References and, in this section, authors should discuss them briefly.

Line 387: "link" does not sound to be the most appropriate term.

Conclusion

Lines 414-423: this is not a conclusion statement, but a repetition of the results already shown and discussed in the proper section. The same consideration can be made for Lines 426-433.

Line 442: I should say "study" rather than "experiment".

Author Response

Comments 1: [Line 11: why did the authors decide to directly use such a complex and expensive imaging technique as HSI to detect external defects? Did they first consider the possibility of using traditional RGB imaging?]

Response 1: Thank you for pointing this out. We chose hyperspectral imaging (HSI) to detect external defects in potatoes for two reasons. First, RGB imaging is more easily affected by illumination conditions than HSI, so its accuracy is hard to guarantee in actual production. Second, HSI can detect not only external defects but also internal defects, which we will study next, with the eventual aim of applying the method to intelligent sorting equipment.

Comments 2: [Lines 15-16: I should say "to distinguish between healthy and damaged potatoes with different external defects".

preprocess the hyperspectral images of potato with healthy and different external defects. ]

Response 2: Thank you for your careful review. We did not adopt your suggestion because the pretreatment methods themselves cannot distinguish between healthy and defective potatoes; however, we improved the sentence, and the change is marked in red. [Page 1, L12-L15.]

Comments 3: [Line 17: PCA is known to be a common (or maybe the most common) explorative and data reduction (compression) technique rather than a way to obtain a "quantitative model". So, why and how did the authors consider PCA that way?]

Response 3: Thank you for pointing this out. We used PCA only for principal component analysis. To establish the quantitative models, we used SVM, PLSR, PCR, and LSSVM. I apologize for the confusion caused by my miswriting "PCR" as "PCA" in the original manuscript. I have corrected this in the revision and marked the changes in red. [Page 1, L16.]
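The distinction drawn here between PCA and PCR can be illustrated with a small sketch. PCA alone only compresses the predictors into scores. PCR then regresses a response on those scores. The example below is a simplified, single-component version written for illustration (power iteration stands in for a full eigendecomposition); the data and the response variable are hypothetical, and this is not the authors' implementation.

```python
def mean_center(rows):
    """Subtract the column means; return the centered rows and the means."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows], means

def first_loading(centered, iterations=100):
    """Leading PCA loading vector via power iteration on X^T X."""
    w = [1.0] * len(centered[0])
    for _ in range(iterations):
        scores = [sum(v * wi for v, wi in zip(row, w)) for row in centered]
        w = [sum(s * row[j] for s, row in zip(scores, centered))
             for j in range(len(w))]
        norm = sum(v * v for v in w) ** 0.5
        w = [v / norm for v in w]
    return w

def fit_pcr(X, y):
    """One-component PCR: PCA scores first, then least squares on the scores."""
    centered, x_means = mean_center(X)
    y_mean = sum(y) / len(y)
    w = first_loading(centered)                      # PCA step (no y involved)
    scores = [sum(v * wi for v, wi in zip(row, w)) for row in centered]
    slope = (sum(s * (yi - y_mean) for s, yi in zip(scores, y))
             / sum(s * s for s in scores))           # regression step

    def predict(row):
        score = sum((v - m) * wi for v, m, wi in zip(row, x_means, w))
        return y_mean + slope * score
    return predict

# Hypothetical two-band "spectra" and reference values, for illustration only.
X = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
y = [2.0, 4.0, 6.0, 8.0]
predict = fit_pcr(X, y)
print(round(predict([5.0, 10.0]), 6))
```

In practice several components would be retained and their number chosen by cross-validation. The sketch keeps one component only to make the two steps visible.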

Comments 4: [Lines 19-20: how could authors say that about black and green potatoes if they just considered red skin potatoes as the research object? If "black" and "green" is instead referred to the skin, so authors are talking about two specific types of defects, the word "skin" should be added before the word "potatoes".]

Response 4: Thank you for pointing this out. We agree with this comment. Therefore, we have described both green and black skin defective potatoes as green skin potatoes and black skin potatoes in the whole article. Namely, the word "skin" was added before the word "potatoes".

Comments 5: [Lines 73-75: it is not clear if this statement describes something that was done in this paper by the authors or if it is related to something done by other authors in previous works.]

Response 5: Thank you for pointing this out. We are very sorry that this was an incorrect statement in the original manuscript. We wanted to express the limitations of these existing methods. In fact, these limitations were found in our preliminary experiments, and have also been mentioned in other literature.

In line with your comments and those of the other reviewers, in the Introduction we have improved the structure, the transitions between studies, the justification of the method and the definition of the hypothesis, and clarified the significance of this research. The references were also supplemented and updated. The changes can be found in the revised manuscript, marked in red. [Page 2, L65-L74.]

Comments 6: [Section 2.1: it is not so clear if Jianping Town, Jianping County, Liaoning Province is the same planting base or if they are 3 different planting bases. In this last case, were these 180 samples selected for each planting base or did authors consider a total of 180 samples?]

Response 6: Thank you for pointing this out. There is a single potato planting base; “Jianping Town, Jianping County, Liaoning Province, China” is its location. We considered a total of 180 samples. I have modified this part and added the latitude and longitude of the planting base in the revised manuscript to avoid ambiguity. [Page 3, L118-L119.]

Comments 7: [Section 2.2: please also specify the acquisition mode (point, line or area scan) of the HSI analyser.]

Response 7: Thank you for pointing this out. We agree with this comment. Therefore, I have included the relevant content in the revised manuscript. [Page 3, L137-L140.]

Comments 8: [Section 2.4: expressions like "Take black potato as an example..." or "Extract the spectral data of the region of interest..." should be avoided and revised.]

Response 8: Thank you for pointing this out. We agree with this comment. Therefore, I have deleted "Take black potato as an example...", and modified the other content in the revised manuscript. [Page 5, L186-187.]

Comments 9: [Section 2.5: the authors say nothing about the performance indicators that were considered to express the model accuracy and reliability. Please add something about this key aspect.]

Response 9: Thank you for pointing this out. We agree with this comment. Based on the comments of the three reviews, we made in-depth consideration and re-modified the structure of Section 2 Materials and Methods, and Section 2.5 was removed in the revised manuscript. We have supplemented the content related to performance indicators of the model in Section 2.4 of revised manuscript, the changes marked in red can be found in the revised manuscript. [Page 8, L275-L287.]

Comments 10: [Section 3.3: the text from line 217 to line 254 pertains to the descriptive part of modelling techniques and performance indicators that should be given in the corresponding section of the M&M part.

Moreover, a brief description of the performance indicators (R2, RMSE, RPD), together with their formula, should be given in the manuscript, even if (I think) this information is already available in reference [22].]

Response 10: Thank you for pointing this out. We agree with this comment. Therefore, I moved the methods included Section 3.3 in original manuscript to Section 2.4 in the revised manuscript. The changes marked in red can be found in the revised manuscript. [Page 8, L281-L287.]

Comments 11: [Lines 243-246: what is said here need a little bit more of details. So, for the data modelling step, authors considered black and green skin potato as a unique class and, likewise, scab, mechanical damage and broken skin were grouped as a unique class. Is it right? Moreover, why did authors say again "Then, the quantitative prediction model was established..." if the previously mentioned techniques (SVM, PLSR, PCA, and LSSVM) were from the beginning used to develop quantitative models?]

Response 11: Thank you for pointing this out. We are very sorry for unclear description.

Yes, that is right. Based on spectral feature similarity and model simplification requirements, we grouped black and green potatoes into one class and scab, mechanical damage, and broken potatoes into another class, and established quantitative models using SVM, PLSR, PCR, and LSSVM. This part is described in detail in Section 2.4.2. [Page 7, L230-L287.]

Comments 12: [Line 247: did authors say "three quantitative analysis models" referring to the three above mentioned groups? Moreover, since we are talking about quantitative models, which is the response (Y) parameter to be predicted in all of them? If it is a reference value obtained by a reference analytical technique, this part should be for sure integrated (obviously not here but already as a M&M dedicated section). Furthermore, knowing what is the measurement unit of the predicted parameter/s (Y) is vital to understand the size of the model error values (RMSE) which actually refers to the same measurement unit.]

Response 12: Thank you for your kind and detailed suggestion. The "three quantitative analysis models" refer to healthy potatoes, black-green skin potatoes and scab-mechanical damage-broken skin potatoes. The response parameters of the quantitative model are R², RMSEC, RMSEP, and RPD, which I explained in Section 2.4 of the revised manuscript. [Page 8, L275-L287.]

Comments 13: [Line 257: this aspect keeps on being not so clear to me: are we dealing with qualitative (discriminant) or quantitative analysis models?]

Response 13: We are very sorry for unclear description.

I have improved this part in the section 2.4 and section 3.2 to avoid ambiguity. [ Page 6, L201-L206 & Page 11, L358-L362.]

Comments 14: [Lines 265-268: this aspect of using quantitative prediction model to establish the qualitative discrimination model is not clear definitely. Why and how was it performed? Which were these discrimination models? I kindly ask authors to provide a suitable description of this task.]

Response 14: Thank you for pointing this out.

After careful consideration, we realized that this part is not the result but the method. We moved it to section 2.4, and rewrote the results and analysis. [Page 12, L365-L372.]

In addition, we give a suitable description in Section 2.4. “Quantitative prediction model” and “quantitative model” mean the same thing, and we now use “quantitative model” uniformly throughout the article. [ Page 7-8, L230-L287 & Page 8-11, L288-L310.]

Comments 15: [Lines 283-291: this is a descriptive part on how PCA algorithm works. Therefore, it should be moved above into the appropriate M&M section.]

Response 15: Thank you for pointing this out. We agree with this comment. Therefore, we moved this part to Section 2.4. [ Page 7, L268-L263.]

Comments 16: [Line 286: did authors mean "original variable number 336" or "336 original variables"? In the first case, which is this specific variable and why is it so important?]

Response 16: Thank you for pointing this out. This is a typo on my part, the number "336" does not exist and I have corrected the error in the text. This should be the line number copied here during the modification process. I apologize for causing you trouble in reviewing the manuscript.

Comments 17: [Figure 8: one of the most important pieces of information is missing: which one of the modelling methods (i.e., SVM, PLSR, PCA, LSSVM) Fig. 8(a), 8(b) and 8(c) refer to?]

Response 17: Thank you for pointing this out. Figure 8 shows the scatterplot of variance for the principal component analysis (PCA) of the three types of potatoes. To better describe this set of figures, I changed the title of Figure 8 and added an analysis of Figure 8 in the revised manuscript. (a) healthy potatoes; (b) black-green skin potatoes; (c) scab-mechanical damage-broken skin potatoes. [Page 14, L403-L404.]

Comments 18: [Section 3.4; Lines 300-315: again, this is a descriptive part on how SPA feature selection algorithm works that should be introduced as an appropriate M&M section.]

Response 18: Thank you for pointing this out. We agree with this comment. Therefore, I moved them to section2.4, and improved them. [ Page 8, L297-L302.]

Comments 19: [Figure 9: for both x and y axis, authors must report clearly the parameter to which they are referred to and its corresponding measurement unit.]

Response 19: Thank you for pointing this out. I changed Figure 9: the horizontal axis is wavelength, and the vertical axis is the variable index value. The variable index value is the specific position of each spectral wavelength in the spectral distribution matrix; it only reflects a mathematical projection relationship and has no direct physical significance, so it has no measurement unit. In the figure, the wavelengths marked by small rectangles are the selected characteristic wavelengths. [Page 15, L425-L437.]

Comments 20: [Section 3.5; Line 329: so, which was the aim of this task? Discriminating between healthy and defective potatoes? And, among the defective ones, to distinguish the type of defect? This is absolutely not clear to the reader since authors never have discussed it before. Actually, from the subsequent text, it seems that, again, three discriminative models were built separately for healthy, green-black skin and scab-mechanical damage-broken. I can understand the meaning of distinguishing among each specific type of defect for the defective potatoes, but I cannot understand which different classes were considered for the healthy potato model.]

Response 20: Thank you for pointing this out. Your understanding is correct: the qualitative model was created to distinguish between three types of potatoes, namely healthy potatoes, green-dark potatoes, and scab-mechanical damage-broken skin potatoes. The model was trained to determine whether a potato was healthy or had one of the defect types. For healthy potatoes, we do not consider further classification. We supplemented this part in Section 2.4 of the revised version. [Page 6, L200-L207]

In fact, through comparative analysis, we concluded that the KNN model has the best universality, and we took it as a single discrimination model covering all the healthy and defective samples (3 classes) to investigate how many samples were predicted to be defective and with which defect, which is consistent with what you had in mind. We failed to express this clearly, which made the text hard to follow. Thank you for your understanding and for giving us the opportunity to make a revision. [Page 16, L463-L489]
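The single three-class discrimination model described in this response can be sketched with a plain k-nearest-neighbours classifier. The snippet is an illustration under stated assumptions: the two-value feature vectors are invented stand-ins for reflectance at selected wavelengths, Euclidean distance and k = 3 are arbitrary choices, and none of this reproduces the authors' actual model or data.

```python
from collections import Counter

def knn_predict(train_X, train_y, sample, k=3):
    """Predict a class label by majority vote among the k nearest samples."""
    # Squared Euclidean distance from the new sample to every training sample
    distances = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, sample)), label)
        for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Hypothetical features for the three classes; not real measurements.
train_X = [
    [0.10, 0.12], [0.11, 0.10], [0.09, 0.11],   # healthy
    [0.50, 0.52], [0.51, 0.49], [0.49, 0.50],   # black-green skin
    [0.90, 0.88], [0.91, 0.90], [0.89, 0.91],   # scab-damage-broken
]
train_y = (["healthy"] * 3 + ["black-green skin"] * 3
           + ["scab-damage-broken"] * 3)

for sample in ([0.12, 0.11], [0.52, 0.50], [0.88, 0.90]):
    print(sample, "->", knn_predict(train_X, train_y, sample))
```

With one model over all three classes, every misclassified healthy sample is assigned to a specific defect group, which is the information the reviewer asked for.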

Comments 21: [Section 3.6; Lines 353-362: again, this is a descriptive part on the meaning of confusion matrix representation. Therefore, the authors should consider moving it above into an appropriate M&M section.]

Response 21: Thank you for pointing this out. We agree with this comment. Therefore, I moved them to the section2.4. [ Page 9, L312-L320.]

Comments 22: [Referring to lines 352-353, why just 30 samples per group were selected to show the results into the confusion matrix?]

Response 22: Thank you for pointing this out. Only 30 samples per group were shown in the confusion matrix, mainly because only 30 healthy potato samples were available. The reasons are as follows:

After the training set and verification set of the model were allocated, only 30 healthy potatoes were left for experimental verification. Any potatoes added later would not have been freshly harvested, so the sample characteristics would have been inconsistent, leading to greater error. To ensure the rigor of the experiment and the fairness of the results, the number of samples in the other categories was also set to 30. This approach avoids the impact of data imbalance on model evaluation and improves the reliability and scientific soundness of the results. We believe that 30 potatoes per group is sufficient to be illustrative.

Comments 23: [Figure 11: what about the results obtained using LDA? Why not considering directly to express the correctly and wrongly assigned samples as percentage also inside the confusion matrix?]

Response 23: We greatly appreciate the reviewer's rigor. The results in Section 3.3 of the revised manuscript showed that LDA was inferior to CATR, KNN and BPNN in universality across the three types of potatoes, so LDA was eliminated. At the same time, we improved the related content. [ Page 15, L458-L459 & Page 16, L470-L471.]

Figure 11(d) shows the percentages. We think the four figures express the results clearly; placing the percentages inside the confusion matrices would make them harder to read.

Comments 24: [Discussion: even if, as authors say, nothing is available yet for red skin potatoes, something more should be provided about the already existing studies performed on the yellow skin potatoes. More in details, are there some works in literature that already focused on the differentiation between healthy and defective potatoes? Or on a particular kind of potato's skin defect? If so, these studies should be included into the References and, in this section, authors should discuss them briefly.]

Response 24: Thank you for your detailed and valuable suggestions. We have done in-depth thinking, rewrote the discussion according to your suggestions, the detail was shown in Section Discussion. [Page 17-18, L490-L532.].

Furthermore, in Section Introduction, we supplemented existing studies performed on the yellow skin potatoes and corresponding literature [reference 2,3,4,7,14 and 15].

Comments 25: [Line 387: "link" does not sound to be the most appropriate term.]

Response 25: Thank you for pointing this out. We agree with this comment. We rewrote the discussion, and “link” is no longer used.

Comments 26: [Lines 414-423: this is not a conclusion statement, but a repetition of the results already shown and discussed in the proper section. The same consideration can be made for Lines 426-433.]

Response 26: Thank you for your detailed and valuable suggestions. We have done in-depth thinking, and rewrote the discussions and conclusions. The detail was shown in section conclusions. [Page 18-19, L533-L559.]

Comments 28: [Line 442: I should say "study" rather than "experiment".]

Response 28: Thank you for pointing this out. We agree with this comment. Therefore, I made a change in the article. [Page 19, L556.]

Reviewer 2 Report

Comments and Suggestions for Authors

The authors present a very interesting manuscript in which they use a non-destructive detection method to identify external defects in potatoes using hyperspectral imaging and machine learning algorithms. Spectral preprocessing (SG, SNV, MSC), feature selection (SPA), and quantitative (PCA, SVM) and qualitative (BPNN, KNN) modeling techniques were compared. Results indicated that the PCA-SG model achieved X% accuracy on healthy potatoes, while SVM was the most stable on defects such as black and green spots. For scab and mechanical damage, KNN provided the best classification with an accuracy of Y%. These findings demonstrate the feasibility of using hyperspectral imaging to improve quality inspection in the potato industry, although further studies are required to optimize model generalization.

The title makes it clear that the study deals with the detection of external defects in potatoes and that the technique used is near-infrared hyperspectral imaging (NIR-HSI). The focus of the study is immediately understood and is aligned with the trend of improving nondestructive detection methods in the food and agricultural industry. However, the phrase “based on near-infrared hyperspectral imaging” is rather long-winded. The word hyperspectral already suggests the use of near-infrared in many agricultural applications, so it could be shortened. Also, the authors do not mention the specific techniques they used to analyze the data. Since the study uses PCA, SVM, KNN, and neural networks (BPNN), the title could refer to classification or machine learning models, which would make it more informative. For example, and this is just a suggestion, the title might be more aptly titled “Nondestructive detection of external defects in potatoes using hyperspectral imaging and machine learning.”

The abstract is indeed a summary, includes the key findings, and is of adequate length. However, the authors could improve it by starting with a clearer introductory sentence about the problem. In addition, they mention SG, SNV, MSC, SPA, PCA, SVM, BPNN, KNN without explaining what they represent or why they were chosen. For a reader unfamiliar with these methods, it may be difficult to follow. Therefore, I recommend that you include a brief explanation of the methods in layman's terms. It is mentioned that some models are more effective, but no numerical values are presented to support the comparison. This reduces the credibility of the abstract, as a reader should see key data before reading the full article. Therefore, the authors should include the most salient quantitative results. Finally, regarding the abstract, there is no mention of whether there were challenges in classifying certain defects or whether the technique has restrictions. Therefore, I recommend ending with a sentence mentioning the limitations.
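As an example of the kind of plain explanation requested here, one of the listed preprocessing methods, standard normal variate (SNV), can be sketched in a few lines. SNV rescales each spectrum by its own mean and standard deviation, which removes additive offsets and multiplicative scaling between samples. The spectrum values below are invented, and the use of the sample standard deviation (n - 1) is an assumption.

```python
def snv(spectrum):
    """Standard normal variate: center and scale one spectrum by its own statistics."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    sd = (sum((v - mean) ** 2 for v in spectrum) / (n - 1)) ** 0.5
    return [(v - mean) / sd for v in spectrum]

# Illustrative reflectance values; not measured data.
raw = [0.20, 0.30, 0.40, 0.50]
# Same spectral shape with a multiplicative and an additive distortion
distorted = [2.0 * v + 0.1 for v in raw]

print([round(v, 3) for v in snv(raw)])
print([round(v, 3) for v in snv(distorted)])
```

Both printed spectra coincide, which shows why SNV is used to reduce scatter-related differences before modelling.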

The keywords will be used for indexing purposes, and these keywords do not help index the work well because they include terms already contained in the title, such as ‘hyperspectral imaging’, ‘potato’ and ‘External defect’. These keywords should be replaced.

The introduction is clear and presents the context of the problem well, but needs improvement in structure, transition between studies, justification of the method and definition of the hypothesis. The authors should include a clear hypothesis, improve the transition between the existing literature and the rationale for using HSI as the best alternative, better structure the literature review by grouping studies into clearer sections, better explain the expected impact of the research on practical applications, and verify and update references, ensuring that recent studies are included.

The literature review provides an initial context on the detection of defects in potatoes using hyperspectral technology. However, the review is currently based on a limited number of references (only 7), which does not allow for a comprehensive coverage of the state of the art. It is recommended that the review be expanded to include more recent and relevant studies in high impact journals. In addition, most of the references come from Chinese authors and journals (6 out of 7), which may result in a geographically restricted view of the field. To ensure a more balanced and representative review, it would be convenient to include studies published in international journals including studies from Europe, USA, Latin America, etc., where significant advances in hyperspectral spectroscopy applied to potato quality have been reported. It is also suggested to include more recent work (last 5 years) in machine learning applied to hyperspectral imaging, as this field has evolved rapidly. Expanding the review with these studies would strengthen the theoretical basis of the work and facilitate its comparison with other recent research.

It is striking that of the 25 citations used in the manuscript, only three refer to potato, when it is a product with which many researchers have used HSI, as can be seen in this systematic review of the use of HSI in potato https://doi.org/10.1007/s11540-024-09702-7

The Materials and Methods section is well-structured and provides a detailed description of the equipment used, hyperspectral data acquisition, and the applied analysis models. However, several aspects need improvement to ensure the reproducibility and scientific rigor of the study.

- Justification of methods and parameters: It is recommended to include an explanation of why specific statistical models (SVM, PLSR, LSSVM, PCA, etc.) were chosen and the criteria used to select key parameters, such as the region of interest (ROI) size and spectrometer settings.

- Experiment repeatability: There is no mention of whether the analyses were conducted at different times or if system variability was assessed. Including information about repeatability would enhance the reliability of the results.

- Selection of the region of interest (ROI): The 25×25 pixel ROI was selected manually, but there is no explanation of how this size was determined or whether other alternatives were considered. More details are recommended, as ROI selection can impact model accuracy.

- Potential sources of error: Factors such as spectral noise, lighting variability, or potato moisture content are not discussed, even though they may influence the results. Adding a short section on these sources of error and how they were mitigated is recommended.

- In line 101 there is a mistake: HIS should be HSI. The meaning of this abbreviation is not indicated, and it has not been included in the list of abbreviations. In addition, the abbreviation is not used in the rest of the article, where 'Hyperspectral image' is repeated too many times.
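The ROI point in the list above concerns how a 25×25 pixel region is turned into one spectrum per sample. A minimal sketch of that averaging step is given below. It assumes the hypercube is indexed as cube[row][column][band] and that the ROI is a square defined by its top-left corner; the function name, the indexing order and the synthetic data are all hypothetical.

```python
def roi_mean_spectrum(cube, top, left, size=25):
    """Average the spectra of all pixels in a size x size region of interest."""
    n_bands = len(cube[0][0])
    totals = [0.0] * n_bands
    for r in range(top, top + size):
        for c in range(left, left + size):
            for b, value in enumerate(cube[r][c]):
                totals[b] += value
    return [t / (size * size) for t in totals]

# Synthetic 30 x 30 image with 4 bands; each pixel's value depends only on the band.
cube = [[[float(b + 1) for b in range(4)] for _ in range(30)] for _ in range(30)]
spectrum = roi_mean_spectrum(cube, top=2, left=3)
print(spectrum)
```

Repeating the call with different values of size on the same sample would be one simple way to report how sensitive the extracted spectrum is to the ROI choice.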

The Results section provides valuable insights into the application of hyperspectral imaging for detecting potato defects. However, several aspects require significant revision and improvement to enhance the clarity and impact of the study:

- The section contains excessive methodological explanations, such as variable selection, PCA theory, and SPA fundamentals. These should be moved to the Materials and Methods section to maintain clarity.

- Excessive numerical values without adequate interpretation: Numerous R², RMSEC, RMSEP, and RPD values are reported without proper analysis or comparison with previous studies. The recommendation is to reduce the amount of numerical data and focus on highlighting only the most relevant results.

- The section would benefit from better structuring to improve readability. A more logical organization could include: 1) Spectral preprocessing 2) Quantitative models (including error evaluation) 3) Qualitative models (confusion matrices and classification accuracy) 4) Comparison and analysis of results.

- Some figures do not contribute significantly to the discussion and could be removed or merged. Additionally, it is recommended to include residual plots and prediction vs. actual curves to better illustrate model performance.

The discussion primarily repeats information from the results section rather than providing an in-depth interpretation of the findings. Instead of summarizing the numerical outcomes, this section should relate the results to previous studies, emphasizing the study’s significance within the broader scientific context. It is essential to include references to prior studies on HSI applications in agricultural defect detection to assess the novelty and quality of the work. Additionally, it lacks a critical evaluation of potential limitations, such as misclassification errors, environmental factors influencing spectral measurements, and the need for validation in real-world conditions. There is no discussion on potential overfitting, particularly in cases where R² values are excessively high (>0.99). Moreover, possible error sources in defect classification and limitations of the chosen approach should be addressed.

The conclusions rely too heavily on listing numerical results without explaining their broader implications. While R², RMSEC, and RMSEP values are useful, they should be contextualized to clarify what they mean in practical terms. The study also fails to explicitly state its contribution to the scientific community—does it significantly improve existing methods or address a specific gap? To strengthen the conclusions, the authors should refine this section to highlight the study’s impact, avoid unnecessary repetition of numerical data, and propose future research avenues, such as testing on other potato varieties or integrating additional detection technologies.

The references have several limitations. The number of references is limited, and most sources come from Chinese publications or journals that are not easily accessible to the international scientific community. To strengthen the study, it is recommended to include key review studies and previous research on hyperspectral imaging applied to defect detection in agricultural products, especially potatoes. Additionally, some references have future publication dates (2024-2025), raising concerns about their availability. Formatting errors were also identified in some citations (e.g., references 8 and 21). In reference 8 the authors have given the first names of the authors and not the surnames. The correct form would be:

Morales, A.; Horstrand, P.; Guerra, R.; Leon, R.; Ortega, S.; Díaz, M.; Melián, J.M.; López, S.; López, J.F.; Callico, G.M.; et al. Laboratory Hyperspectral Image Acquisition System Setup and Validation. Sensors 2022, 22, 2159. https://doi.org/10.3390/s22062159

Reference 19 is wrong: the names of the authors are not included; instead, the names of the institutions appear. The correct form would be:

Morais, C.L.M.; Santos, M.C.D.; Lima, K.M.G.; Martin, F.L. Improving data splitting for classification applications in spectrochemical analyses employing a random-mutation Kennard-Stone algorithm approach, Bioinformatics, 2019, 35(24), 5257–5263, https://doi.org/10.1093/bioinformatics/btz421

A thorough review of the references section is recommended to ensure accuracy and relevance.

Author Response

Comments 1: [The title makes it clear that the study deals with the detection of external defects in potatoes and that the technique used is near-infrared hyperspectral imaging (NIR-HSI). The focus of the study is immediately understood and is aligned with the trend of improving nondestructive detection methods in the food and agricultural industry. However, the phrase “based on near-infrared hyperspectral imaging” is rather long-winded. The word hyperspectral already suggests the use of near-infrared in many agricultural applications, so it could be shortened. Also, the authors do not mention the specific techniques they used to analyze the data. Since the study uses PCA, SVM, KNN, and neural networks (BPNN), the title could refer to classification or machine learning models, which would make it more informative. For example, and this is just a suggestion, the title might be more aptly titled “Nondestructive detection of external defects in potatoes using hyperspectral imaging and machine learning.”]

Response 1: Thank you for pointing this out. We agree that the suggested title better describes the content and characteristics of the article. Therefore, we revised the title according to your advice. [Page 1, L2-L3.]

Comments 2: [The abstract is indeed a summary, includes the key findings, and is of adequate length. However, the authors could improve it by starting with a clearer introductory sentence about the problem. In addition, they mention SG, SNV, MSC, SPA, PCA, SVM, BPNN, KNN without explaining what they represent or why they were chosen. For a reader unfamiliar with these methods, it may be difficult to follow. Therefore, I recommend that you include a brief explanation of the methods in layman's terms. It is mentioned that some models are more effective, but no numerical values are presented to support the comparison. This reduces the credibility of the abstract, as a reader should see key data before reading the full article. Therefore, the authors should include the most salient quantitative results. Finally, regarding the abstract, there is no mention of whether there were challenges in classifying certain defects or whether the technique has restrictions. Therefore, I recommend ending with a sentence mentioning the limitations.]

Response 2: Thanks for your kind and comprehensive suggestions. We agree with this comment. We have revised and improved the abstract according to your advice. The changes are marked in red in the abstract of the revised manuscript. [Page 1, L8-L32.]

Comments 3: [The keywords will be used for indexing purposes, and these keywords do not help index the work well since they include terms or expressions that are already contained in the title, such as ‘hyperspectral imaging’, ‘potato’, and ‘External defect’. These keywords should be replaced.]

Response 3: Thanks for your kind suggestions. We have replaced the keywords with terms that are useful for indexing and are not included in the title. The changes can be found in the revised manuscript. [Page 1, L33-L34.]

Comments 4: [The introduction is clear and presents the context of the problem well, but needs improvement in structure, transition between studies, justification of the method and definition of the hypothesis. The authors should include a clear hypothesis, improve the transition between the existing literature and the rationale for using HSI as the best alternative, better structure the literature review by grouping studies into clearer sections, better explain the expected impact of the research on practical applications, and verify and update references, ensuring that recent studies are included.]

Response 4: Thanks for your kind and comprehensive suggestions. We have considered them carefully. In the introduction, we have improved the structure, the transitions between studies, the justification of the method, and the definition of the hypothesis, and clarified the significance of this research. At the same time, the references were supplemented and updated. The changes are marked in red in the revised manuscript. [Page 1-3, L36-L115 & Page 20, L579-L625.]

Comments 5: [The literature review provides an initial context on the detection of defects in potatoes using hyperspectral technology. However, the review is currently based on a limited number of references (only 7), which does not allow for a comprehensive coverage of the state of the art. It is recommended that the review be expanded to include more recent and relevant studies in high impact journals. In addition, most of the references come from Chinese authors and journals (6 out of 7), which may result in a geographically restricted view of the field. To ensure a more balanced and representative review, it would be convenient to include studies published in international journals including studies from Europe, USA, Latin America, etc., where significant advances in hyperspectral spectroscopy applied to potato quality have been reported. It is also suggested to include more recent work (last 5 years) in machine learning applied to hyperspectral imaging, as this field has evolved rapidly. Expanding the review with these studies would strengthen the theoretical basis of the work and facilitate its comparison with other recent research.]

Response 5: Thank you very much for your detailed suggestions. In the introduction, we have added references, especially recent studies from international and high-impact journals from Europe, USA, Latin America, etc. Recent work on the application of machine learning to hyperspectral imaging was also added. The changes can be found in the Abstract and References sections.

Comments 6: [It is striking that of the 25 citations used in the manuscript, only three refer to potato, when it is a product with which many researchers have used HSI, as can be seen in this systematic review of the use of HSI in potato https://doi.org/10.1007/s11540-024-09702-7]

Response 6: Thank you for pointing this out. We agree with this comment. Therefore, I added references about the application of HSI in potatoes, including https://doi.org/10.1007/s11540-024-09702-7.

Comments 7: [The Materials and Methods section is well-structured and provides a detailed description of the equipment used, hyperspectral data acquisition, and the applied analysis models. However, several aspects need improvement to ensure the reproducibility and scientific rigor of the study.]

Response 7: Our point-by-point responses are given in 7-1 to 7-5.

Comments 7-1: [- Justification of methods and parameters: It is recommended to include an explanation of why specific statistical models (SVM, PLSR, LSSVM, PCA, etc.) were chosen and the criteria used to select key parameters, such as the region of interest (ROI) size and spectrometer settings.]

Response 7-1: Thank you for pointing this out. We agree with this comment. Therefore, I have added the corresponding explanations in Section 2.3 and Section 2.4. [Page 5, L171-L177 & Page 7-8, L264-L287.]

Comments 7-2: [- Experiment repeatability: There is no mention of whether the analyses were conducted at different times or if system variability was assessed. Including information about repeatability would enhance the reliability of the results.]

Response 7-2: Thank you for pointing this out. We agree with this comment. In fact, we conducted repeated experiments, and we are sorry for not describing this clearly. Therefore, we have added the corresponding content to Section 2.3. [Page 5, L180-L182.]

Comments 7-3: [- Selection of the region of interest (ROI): The 25×25 pixel ROI was selected manually, but there is no explanation of how this size was determined or whether other alternatives were considered. More details are recommended, as ROI selection can impact model accuracy.]

Response 7-3: Thanks for your kind suggestion. We have added a description of the selection criterion for the 25 pixel × 25 pixel ROI in Section 2.3. [Page 5, L182-L186.]

Comments 7-4: [- Potential sources of error: Factors such as spectral noise, lighting variability, or potato moisture content are not discussed, even though they may influence the results. Adding a short section on these sources of error and how they were mitigated is recommended.]

Response 7-4: Thanks for your kind suggestion. These potential sources of error were considered, and we are sorry for not describing them clearly. Taking the structure of the article into account, we have added the corresponding content to Section 2.2. [Page 4, L131-L132, L142-L143, L146-L147.]

Comments 7-5: [- In line 101 there is a mistake: HIS should be HSI. The meaning of this abbreviation is not indicated, and it has not been included in the list of abbreviations. In addition, the abbreviation is not used in the rest of the article, so 'Hyperspectral image' is repeated too many times.]

Response 7-5: Thank you for pointing this out. We agree with this comment. We are very sorry for the mistake. Therefore, we have corrected HIS to HSI and added it to the list of abbreviations. In addition, we now use “Hyperspectral image” consistently throughout the article. The details are shown in Section 2.2. [Page 4, L140.]

Comments 8: [The Results section provides valuable insights into the application of hyperspectral imaging for detecting potato defects. However, several aspects require significant revision and improvement to enhance the clarity and impact of the study:]

Response 8: Our point-by-point responses are given in 8-1 to 8-4.

Comments 8-1: [- The section contains excessive methodological explanations, such as variable selection, PCA theory, and SPA fundamentals. These should be moved to the Materials and Methods section to maintain clarity.]

Response 8-1: Thank you for pointing this out. We agree with this comment. Therefore, we moved the methodological explanations from the Results section to Section 2.4, and integrated and improved them. [Page 7, L268-L273 & Page 8, L297-L302.]

Comments 8-2: [- Excessive numerical values without adequate interpretation: Numerous R², RMSEC, RMSEP, and RPD values are reported without proper analysis or comparison with previous studies. The recommendation is to reduce the amount of numerical data and focus on highlighting only the most relevant results.]

Response 8-2: Thanks for your kind suggestions. We have added the corresponding analysis and comparison with previous studies for each of the 3 classes, and removed some unimportant data. The changes can be found in Section 3.2. [Page 11, L358-L362 & Page 12, L365-L372 & Page 13, L375-L383.]

Comments 8-3: [- The section would benefit from better structuring to improve readability. A more logical organization could include: 1) Spectral preprocessing 2) Quantitative models (including error evaluation) 3) Qualitative models (confusion matrices and classification accuracy) 4) Comparison and analysis of results.]

Response 8-3: Thanks for your kind suggestions. We have reorganized the structure of this section according to your suggestions and our own consideration, and improved the content. The details can be found in Section 3. [Page 9-16, L322, L353, L411, L463.]

Comments 8-4: [- Some figures do not contribute significantly to the discussion and could be removed or merged. Additionally, it is recommended to include residual plots and prediction vs. actual curves to better illustrate model performance.]

Response 8-4: We greatly appreciate the reviewer’s rigor. After comprehensively considering the comments of all three reviewers, we did not remove the figures. Additionally, we are sorry that we did not understand which residual plots are meant, or how and where they should be included. We think Figure 8 includes prediction vs. actual curves, which can better illustrate model performance.

Comments 9: [The discussion primarily repeats information from the results section rather than providing an in-depth interpretation of the findings. Instead of summarizing the numerical outcomes, this section should relate the results to previous studies, emphasizing the study’s significance within the broader scientific context. It is essential to include references to prior studies on HSI applications in agricultural defect detection to assess the novelty and quality of the work. Additionally, it lacks a critical evaluation of potential limitations, such as misclassification errors, environmental factors influencing spectral measurements, and the need for validation in real-world conditions. There is no discussion on potential overfitting, particularly in cases where R² values are excessively high (>0.99). Moreover, possible error sources in defect classification and limitations of the chosen approach should be addressed.]

Response 9: Thank you for your detailed and valuable suggestions. We have considered them carefully and rewritten the discussion. The details are shown in the Discussion section. [Page 17-18, L490-L532.]

Comments 10: [The conclusions rely too heavily on listing numerical results without explaining their broader implications. While R², RMSEC, and RMSEP values are useful, they should be contextualized to clarify what they mean in practical terms. The study also fails to explicitly state its contribution to the scientific community—does it significantly improve existing methods or address a specific gap? To strengthen the conclusions, the authors should refine this section to highlight the study’s impact, avoid unnecessary repetition of numerical data, and propose future research avenues, such as testing on other potato varieties or integrating additional detection technologies.]

Response 10: Thank you for your detailed and valuable suggestions. We have considered them carefully and rewritten the conclusions. The details are shown in the Conclusions section. [Page 18-19, L533-L559.]

Comments 11: [The references have several limitations. The number of references is limited, and most sources come from Chinese publications or journals that are not easily accessible to the international scientific community. To strengthen the study, it is recommended to include key review studies and previous research on hyperspectral imaging applied to defect detection in agricultural products, especially potatoes. Additionally, some references have future publication dates (2024-2025), raising concerns about their availability. Formatting errors were also identified in some citations (e.g., references 8 and 21). In reference 8 the authors have given the first names of the authors and not the surnames. The correct form would be:

Morales, A.; Horstrand, P.; Guerra, R.; Leon, R.; Ortega, S.; Díaz, M.; Melián, J.M.; López, S.; López, J.F.; Callico, G.M.; et al. Laboratory Hyperspectral Image Acquisition System Setup and Validation. Sensors 2022, 22, 2159. https://doi.org/10.3390/s22062159

Reference 19 is wrong: the names of the authors are not included; instead, the names of the institutions appear. The correct form would be:

Morais, C.L.M.; Santos, M.C.D.; Lima, K.M.G.; Martin, F.L. Improving data splitting for classification applications in spectrochemical analyses employing a random-mutation Kennard-Stone algorithm approach, Bioinformatics, 2019, 35(24), 5257–5263, https://doi.org/10.1093/bioinformatics/btz421

A thorough review of the references section is recommended to ensure accuracy and relevance.]

Response 11: Thank you for your detailed and valuable suggestions. We have added relevant references and removed the references with future publication dates. In addition, we have carefully and comprehensively reviewed all references to ensure their relevance, standardization, and timeliness. The changes marked in red can be found in the References section. [Page 21, L626-L628, L651-L653.]

We are very sorry that we could not find the formatting errors in reference 21 of the original manuscript (reference 30 in the revised manuscript).

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

The Abstract section is too poor and needs major revision.

The Introduction section is short and doesn’t contain important information on the differences in spectral and spatial characteristics of healthy and unhealthy or damaged fruits or vegetables. Need major revision. Please add how machine learning models can improve accuracy in differentiating objects from traditional methods.

A significant portion of Result section included methodologies, which must be shifted to the suitable Methodology sections.

It is unclear what the quantitative and qualitative models are here. What quantity are the authors trying to predict? How was it measured through laboratory tests?

The entire methodology is too vague. Many algorithms and models were applied, which can’t be remembered even after several readings. There is no consistency or clear flow in the data processing methodology.

The discussion part is too short and doesn’t have significant information or comparison with other studies. The effect of various conditions on the spectral signal are not deeply analyzed.

It is unclear how the authors verified the outcome of the study without destructive testing. Potatoes are cheap, so the authors could have applied destructive sampling to match the outside and inside conditions, followed by verification with hyperspectral data analysis.

Specific comments

Please explain why the mentioned defects are harmful to socio-economy: “five exterior defects: scab disease, black skin, broken skin, green skin, and mechanical damage”?

“In order to achieve the accuracy and speed of potato, improve the limitations of spectral technology in the spatial positioning detection of potato external defects, and the shortcomings of machine vision recognition technology such as long feedback time, a non-destructive detection method of potato external defects based on hyperspectral technology was explored.” – poor English and long sentence. Please rewrite and break sentences for more clarity.

“SG, SNV, MSC, SPA, PCA, SVM, BPNN, KNN” – add full form in acronym’s first appearance

“Quantitative model was established by PCA and SVM. The qualitative model is established by BPNN, KNN and other algorithms.” – what is meant by quantitative and qualitative models? Use simple terms in the abstract due to space limitations, where complex terms can’t be explained.

“The results showed that the PCA quantitative model after SG-SNV pretreatment was the best for healthy potatoes.” – best for what? Poor English language. Similarly, “In black and green potatoes, SVM quantitative model is the most stable and reliable.” It is not making any sense.

“Potato staple development . . .” – incorrect phrase. Staple food or staple crop is fine. But potato staple is not a usual phrase.

“at home and abroad” – replace with in China or other countries. Home doesn’t make sense here.

“In summary, non-destructive testing of fruit and vegetable quality mainly adopts spectral and visual methods. Near-infrared spectroscopy technology can quickly and non-destructively detect the quality of agricultural products in real time; however, it can only detect a certain region, and the quality of agricultural products in different spatial locations presents limited changes [5].” – please explain in detail why NIR bands are more useful than visual spectral bands for fruit and vegetable quality assessment. What changes are seen in fruits and vegetables when quality degrades?

“Although near-infrared spectroscopy is commonly used, it can only detect the quality of agricultural products at different spatial locations. However, its ability to detect damaged potatoes has certain limitations.” – please mention what are the limitations and sources (citations).

“This research group used machine vision recognition technology to conduct pre-tests, and the results showed that this technology had a high recognition rate for deformed potatoes but a poor recognition rate for external defects, such as potato skin defects and scabs.” – this is part of Results or Discussion section, should be moved to suitable section.

“Hyperspectral imaging technology integrates traditional digital imaging and spectral analysis technologies, with the advantages of high resolution and many bands; thus, it can not only obtain spatial image information of the measured object but also the spectral information of each pixel [6].” - Hyperspectral imaging is not linked to spatial resolution. It only links to higher spectral resolution.

“The limitations of near-infrared spectroscopy (NIR) to detect potato external defects in spatial position and the long feedback time of machine vision recognition algorithm are improved.” – please mention what the limitations are. Please expand on what is meant by long feedback time here.

2.2. Main instrument and equipment – please mention the number of spectral bands.

Section: 2.3. Hyperspectral image acquisition and correction – instead of using ‘should be’, use ‘was or were’, as the authors already applied. Similarly, in section ‘2.4. Hyperspectral data extraction’.

Figure 2: binary image creation from the original image is incorrect as it removed a significant portion of the object. It should ideally remove the background and retain the entire object (potato here).

Why were 25 pixels × 25 pixels chosen? What were the selection criteria for the location?

L139-140: What were the criteria for removing the noisy bands? What was the source of, or reason for, the noisy bands?
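For context, averaging the spectra inside a square ROI, as the manuscript's 25 × 25 pixel window does, can be sketched as below. This is a minimal illustration and not the authors' code: the function name `roi_mean_spectrum`, the nested-list layout `cube[row][col][band]`, and the `top`/`left` position arguments are all assumptions made for the example.

```python
def roi_mean_spectrum(cube, top, left, size=25):
    """Average the spectra of a size x size pixel window.

    `cube` is assumed to be indexed as cube[row][col][band];
    `top` and `left` give the upper-left corner of the window.
    """
    pixels = [
        cube[r][c]
        for r in range(top, top + size)
        for c in range(left, left + size)
    ]
    n_bands = len(pixels[0])
    # One mean reflectance value per band across all ROI pixels.
    return [sum(p[b] for p in pixels) / len(pixels) for b in range(n_bands)]
```

Changing `size` or the window position in such a sketch is one simple way to check how sensitive the mean spectrum is to the ROI choice.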

Figure: add the legend of right-hand side figure (spectral curves).

“Hyperspectral technology has been used in agriculture, environment, medicine, and other fields because of its high-resolution detection of fine matter characteristics. Before establishing a hyperspectral data model, the hyperspectral data should be preprocessed [11]” – Please shift these lines to the Introduction section.

“In this test, convolution smoothing (Savitzky–Golay, SG), multiple scattering correction (MSC), standard normal transformation (SNV), and normalization algorithms were used to preprocess the spectral data [12]. Such processing was performed to remove noise, background interference, and other interference factors in spectral data to improve the signal-to-noise ratio, accuracy, quality, and availability.” – please mention which tool was used for which correction. What is meant by ‘availability’ here?

L159 and L168: ‘A quantitative prediction model’ and ‘A qualitative discrimination model’ – please clearly mention what are quantitative and qualitative models here and why these were used. Why different models were used for quantitative and qualitative analyses. How these models were chosen.

“The hyperspectral data of potato were extracted using ENVI5.6 and imported into EXCEL for editing. The full-band spectral curve of potato hyperspectrum was obtained by MATLAB. In the preview spectrogram, it can be found that the clutter noise is large at 400-549nm and 921-1100nm, so the band 550-920nm is selected for analysis and processing. The original average spectral curve in this band range is shown in Figure 4.” – Figure 3 shows a different wavelength range, whereas Figure 4 has a different range. Citation of Figure 4 is wrong here. I guess it would be Figure 3. How were the clutter noise decided and what criteria was applied to remove these bands?

“To solve the problems of noise, distortion, and interference in spectral data, improve the quality and reliability of spectral data, and increase the accuracy of spectral analysis and quantitative prediction model establishment, spectral data preprocessing was performed after obtaining the potato hyperspectral data.” – this information is already mentioned in the methodology section. Therefore, this line can be removed. Entire 3.2. Spectral data preprocessing is part of Methodology section.

L192: What is X10.4. and Normalize algorithms? Please mention these in the methodology section.

L194-195: “The SG is essentially a weighted average method that can effectively eliminate high-frequency noise in the spectral curve [16].” – what was the kernel size used here?
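To illustrate why the kernel size matters, here is a minimal sketch of Savitzky-Golay smoothing with one specific kernel. The 5-point window and quadratic polynomial are assumptions chosen for illustration (the corresponding coefficients are the standard ones, -3, 12, 17, 12, -3 over 35); the manuscript's actual settings are not given in this record, and the function name `sg_smooth` is hypothetical.

```python
# Savitzky-Golay coefficients for a 5-point window and a quadratic fit.
# Window length and polynomial order are assumptions for this example.
SG_5_QUADRATIC = [-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35]


def sg_smooth(spectrum, kernel=SG_5_QUADRATIC):
    """Return a smoothed copy of `spectrum`; edge points are left unchanged."""
    half = len(kernel) // 2
    smoothed = list(spectrum)
    for i in range(half, len(spectrum) - half):
        window = spectrum[i - half:i + half + 1]
        # Weighted average of the window using the SG coefficients.
        smoothed[i] = sum(c * v for c, v in zip(kernel, window))
    return smoothed
```

A wider window smooths more aggressively but can flatten narrow absorption features, which is why the kernel size is worth reporting.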

L196-198: “MSC and SNV are used to eliminate the influence of scattering caused by uneven particle distribution and different particle sizes on the spectrum and to reduce the interference of uncertain factors such as the scattering effect and instrument response.” – what was the impact of uneven particle distribution and different particle sizes on spectra and how it was decided?
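As a reference point for this question, the two scatter corrections can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, each spectrum is a plain list of reflectance values, and MSC uses the mean spectrum of the set as its reference.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own mean and SD.

    Uses the sample standard deviation (n - 1); a constant spectrum would
    give a zero SD and is not handled here.
    """
    m = mean(spectrum)
    s = stdev(spectrum)
    return [(v - m) / s for v in spectrum]


def msc(spectra):
    """Multiplicative scatter correction against the mean spectrum of the set."""
    n_bands = len(spectra[0])
    reference = [mean(s[j] for s in spectra) for j in range(n_bands)]
    ref_mean = mean(reference)
    ref_ss = sum((r - ref_mean) ** 2 for r in reference)
    corrected = []
    for s in spectra:
        s_mean = mean(s)
        # Least-squares fit: s ~ intercept + slope * reference
        slope = sum((r - ref_mean) * (v - s_mean) for r, v in zip(reference, s)) / ref_ss
        intercept = s_mean - slope * ref_mean
        # Remove the fitted additive and multiplicative scatter terms.
        corrected.append([(v - intercept) / slope for v in s])
    return corrected
```

SNV works on each spectrum independently, whereas MSC depends on the whole set through the reference spectrum, so the two can give different results on the same data.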

“The function of Normalize is to standardize and normalize the data, unify the range of data, eliminate the impact of data dimensions, and make data indicators comparable.” – what was before and after data ranges?

“Although similar to SNV, this method is different because it averages the row rather than the column of the spectrum [17].” – how many samples (kernel size) were used for averaging?

“In the first step of the KS algorithm. all samples were treated as training sets, the Euclidean distance of the entire sample set was calculated, and the two samples with the largest Euclidean distance were selected as the training set. In the second step, the distance between the remaining and selected samples was calculated. Samples with the shortest distance were selected as the training set. After all the remaining samples were calculated, the sample corresponding to the longest distance among the shortest distances was selected as the training set. In the third step, the second step was repeated until the number of samples selected was equal to the number determined in advance.” – if I understood it correctly, this method creates grouping of the input spectra based on the required classes as defined in the beginning. It sounds like identifying the best sample spectra for each class. Why the authors didn’t apply pixel purity index (PPI) kind of approach for purest spectra identification?
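The Kennard-Stone steps quoted above can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name `kennard_stone` and the list-of-lists input are assumptions, and Euclidean distance is used as in the quoted description. The sketch selects a training subset that spans the spectral space; it does not group spectra by class.

```python
import math


def kennard_stone(samples, n_select):
    """Return indices of `n_select` samples chosen by Kennard-Stone selection."""
    n = len(samples)
    if not 2 <= n_select <= n:
        raise ValueError("n_select must be between 2 and the number of samples")
    dist = [[math.dist(a, b) for b in samples] for a in samples]

    # Step 1: start with the two samples that are farthest apart.
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda pair: dist[pair[0]][pair[1]],
    )
    selected = [first, second]
    remaining = [i for i in range(n) if i not in (first, second)]

    # Steps 2-3: repeatedly add the remaining sample whose nearest selected
    # neighbour is farthest away (the maximum of the minimum distances).
    while len(selected) < n_select:
        best = max(remaining, key=lambda i: min(dist[i][j] for j in selected))
        selected.append(best)
        remaining.remove(best)
    return selected
```

The samples left unselected would then form the prediction (test) set.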

L243 – L254: these are part of Methodology section.

Figure 5, 6, and 7, and Table 1: Origin or Original?

“Three quantitative analysis models of the potato spectrum were developed” – What was the quantity the authors are trying to predict here? How these were measured through laboratory test?

L283-L296: shift lines to Methodology as required, retaining the results. Mention how many bands in there before and after PCA and what criteria was used in PCA to select the bands. What is the actual measurement index here? How these values were obtained and predicted? It looks like the current approach can identify the different potato categories with multi-spectral images, i.e., with a lower number of spectral data.

L295-296: Figure 8 doesn’t show any spectrum as mentioned here.

L300: what is Characteristic spectral data and why these are required?

While PCA was applied for dimensionality reduction, then why 3.4. Spectral data feature wavelength extraction were used?

L30-382: How did the authors ensure that there was no damage inside, even partially?

Comments on the Quality of English Language

Scientific writing is missing in many places. Needs revision.

Author Response

Comments 1: [The Abstract section is too poor and needs major revision.]

Response 1: Thanks for your kind and comprehensive suggestions. We agree with this comment. We have revised and improved the abstract according to your advice. The changes are marked in red in the abstract of the revised manuscript. [Page 1, L8-L32.]

Comments 2: [The Introduction section is short and doesn’t contain important information on the differences in spectral and spatial characteristics of healthy and unhealthy or damaged fruits or vegetables. Need major revision. Please add how machine learning models can improve accuracy in differentiating objects from traditional methods.]

Response 2: Thank you for pointing this out. We agree with this comment. Therefore, I revised the Introduction. The revised Introduction expands the literature review, adds important information on the differences in spectral and spatial characteristics between healthy and damaged fruits or vegetables, and explains how machine learning models can improve the accuracy of distinguishing objects compared with traditional methods. [Page 1-3, L36-L115.]

Comments 3: [A significant portion of Result section included methodologies, which must be shifted to the suitable Methodology sections.]

Response 3: Thank you for pointing this out. We agree with this comment. Therefore, the methodological material that was in the Results section has been relocated to the Methods section, so that each section serves its intended purpose clearly and cohesively. The Results section now focuses exclusively on presenting the findings and outcomes of our study. [Page 3-9, L116-L320.]

Comments 4: [It is unclear what the quantitative and qualitative models are here. What quantity are the authors trying to predict? How was it measured in laboratory tests?]

Response 4: Thank you for pointing this out.

The qualitative model in this study refers to the classification of spectral data using different machine learning models to determine which of the three categories (healthy potatoes, black-green skin potatoes, and scab-mechanical damage-broken skin potatoes) a sample belongs to. The quantitative model refers to the quantitative evaluation results obtained when different prediction models are applied to spectral data pre-processed in different ways. By comparing the results of the different preprocessing algorithms combined with four quantitative models, the optimal preprocessing algorithm for the hyperspectral data is determined. [Page 7, L231-L235 & Page 8, L289-L295.]

The quantities reported are four indicators of the prediction accuracy for the spectral data (R², RMSEC, RMSEP, RPD). These quantitative indicators are very important for selecting the optimal spectral data preprocessing algorithm. [Page 8, L275-L280.]

The quantitative indicators (R², RMSEC, RMSEP, RPD) are calculated automatically by the software according to the data analysis formulas. The qualitative label is based on visual inspection of each sample's appearance to determine whether it is a healthy potato, a black-green skin potato, or a scab-mechanical damage-broken skin potato. The results of model recognition are then compared with the results of manual judgment to determine the accuracy of the qualitative analysis.
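For readers unfamiliar with the four indicators named in this response, the following Python sketch shows how they are commonly computed in chemometrics. It is not the authors' software (the record states only that the values were calculated automatically). The function names and the toy numbers are illustrative assumptions, and RPD is taken here as the standard deviation of the reference values divided by the RMSE of prediction, which is the usual convention.

```python
import math
import statistics


def rmse(reference, predicted):
    """Root mean square error between reference and predicted values."""
    squared = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(reference, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_ref = statistics.fmean(reference)
    ss_res = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    return 1.0 - ss_res / ss_tot


def rpd(reference, predicted):
    """Ratio of performance to deviation: SD of reference / RMSEP."""
    return statistics.stdev(reference) / rmse(reference, predicted)


# Toy values (hypothetical, not taken from the manuscript).
cal_ref, cal_pred = [1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8]
val_ref, val_pred = [1.5, 2.5, 3.5], [1.4, 2.7, 3.4]

rmsec = rmse(cal_ref, cal_pred)  # error on the calibration set
rmsep = rmse(val_ref, val_pred)  # error on the prediction set
r2_prediction = r_squared(val_ref, val_pred)
rpd_prediction = rpd(val_ref, val_pred)
```

RMSEC and RMSEP use the same formula and differ only in which sample set they are evaluated on, which is why a large gap between them is often read as a sign of overfitting.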

Comments 5: [The entire methodology is too vague. Many algorithms and models were applied, which can’t be remembered even after several readings. There is no consistency or clear flow in the data processing methodology.]

Response 5: Thank you for pointing this out. We agree with this comment. Therefore, we reorganized the Methods section and revised the text. To detect the external defects of potatoes, this study first preprocessed the original hyperspectral data using a variety of algorithms. Second, the optimal preprocessing algorithm for the hyperspectral data was determined by comparing the results of the different preprocessing algorithms combined with four quantitative models. Then, the characteristic bands were extracted from the hyperspectral data produced by the best preprocessing algorithm, and the qualitative defect detection model was established using the hyperspectral data of the characteristic bands. Finally, a general method for the detection of external defects in potatoes was obtained through experiments. [Page 6, L200-L207.]

Comments 6: [The discussion part is too short and doesn’t have significant information or comparison with other studies. The effect of various conditions on the spectral signal are not deeply analyzed.]

Response 6: Thank you for your detailed and valuable suggestions. We have reconsidered the discussion in depth and rewritten it. The details are given in the Discussion section.

[Page 17-18, L490-L532.]

Comments 7: [It is unclear how the authors verified the outcome of the study without destructive testing. Potatoes are cheap, so the authors could have applied destructive sampling to match the outside and inside conditions, followed by verification with hyperspectral data analysis.]

Response 7: Thank you for pointing this out. This research is part of the project "Research and System Development of intelligent sorting method for Potato cellaring", which aims to screen potatoes before cellaring to ensure their quality once stored, so destructive sampling cannot be used. Hyperspectral imaging and machine learning are used for non-destructive testing of external potato defects.

Comments 8: [Please explain why the mentioned defects are harmful to socio-economy: “five exterior defects: scab disease, black skin, broken skin, green skin, and mechanical damage”?]

Response 8: Thank you for pointing this out. The external defects of cellaring potatoes will lead to increased storage costs and food safety problems, resulting in economic losses and waste of resources. Sorting potatoes before they enter the cellar is crucial, and by eliminating defective potatoes, it can effectively reduce economic losses, ensure food safety, improve resource efficiency, and maintain market confidence and supply chain stability, thus having a positive socio-economic impact. [Page 2, L43-L46.]

Comments 9: [“In order to achieve the accuracy and speed of potato, improve the limitations of spectral technology in the spatial positioning detection of potato external defects, and the shortcomings of machine vision recognition technology such as long feedback time, a non-destructive detection method of potato external defects based on hyperspectral technology was explored.” – poor English and long sentence. Please rewrite and break sentences for more clarity.]

Response 9: Thank you for your kind and comprehensive suggestions. We agree with this comment. We have revised the abstract according to your advice. The changes are marked in red in the abstract of the revised manuscript.

[For potato external defects detection, ordinary spectral technology has restrictions in detail detection and processing accuracy, machine vision method has restrictions of long feedback time. To realize accurate and rapid external defects detection for red skin potatoes, a non-destructive detection method using hyperspectral imaging and machine learning model was explored in this study. Page 1, L8-L12.]

Comments 10: [“SG, SNV, MSC, SPA, PCA, SVM, BPNN, KNN” – add full form in acronym’s first appearance]

Response 10: Thank you for pointing this out. We agree with this comment. Therefore, I added the full form when the acronym first appeared.

[ Page 1, L12-L22.]

Comments 11: [“Quantitative model was established by PCA and SVM. The qualitative model is established by BPNN, KNN and other algorithms.” – what is meant by quantitative and qualitative models? Use simple terms in the abstract due to space limitations, where complex terms can’t be explained.]

Response 11: Thank you for pointing this out. We agree with this comment. Therefore, I explain quantitative and qualitative models in simple terms in the abstract.

[quantitative models for find the most suitable preprocessing algorithm. Page 1, L18.]

[the qualitative models were established to detect the external defects of potatoes Page 1, L20-L21.]

Comments 12: [“The results showed that the PCA quantitative model after SG-SNV pretreatment was the best for healthy potatoes.” – best for what? Poor English language. Similarly, “In black and green potatoes, SVM quantitative model is the most stable and reliable.” It is not making any sense.]

Response 12: Thank you for your kind and comprehensive suggestions. We agree with this comment. We have revised the abstract according to your advice. The changes are marked in red in the abstract of the revised manuscript. [Page 1, L23-L28.]

Comments 13: [“Potato staple development . . .” – incorrect phrase. Staple food or staple crop is fine. But potato staple is not a usual phrase.]

Response 13: Thank you for pointing this out. We agree with this comment. Therefore, I corrected the sentence. [Page 2, L41-L43.]

Comments 14: [“at home and abroad” – replace with in China or other countries. Home doesn’t make sense here.]

Response 14: Thank you for pointing this out. To keep the Introduction of the revised version logically coherent, I deleted this sentence.

Comments 15: [“In summary, non-destructive testing of fruit and vegetable quality mainly adopts spectral and visual methods. Near-infrared spectroscopy technology can quickly and non-destructively detect the quality of agricultural products in real time; however, it can only detect a certain region, and the quality of agricultural products in different spatial locations presents limited changes [5].” – please explain in detail why NIR bands are more useful than visual spectral bands for fruit and vegetable quality assessment. What changes are seen in fruits and vegetables when quality degrades?]

Response 15: Thank you for pointing this out. To keep the Introduction of the revised manuscript logically coherent, I have deleted this passage. Healthy and damaged fruits and vegetables differ significantly in their spectral characteristics, and these differences can be detected and distinguished by hyperspectral imaging techniques. Healthy fruits and vegetables show high reflectivity and stable absorption in the spectrum. In damaged fruits and vegetables, the spectral reflectance decreases and the absorption characteristics change. These differences provide an important basis for non-destructive testing of fruit and vegetable quality.

Comments 16: [“Although near-infrared spectroscopy is commonly used, it can only detect the quality of agricultural products at different spatial locations. However, its ability to detect damaged potatoes has certain limitations.” – please mention what are the limitations and sources (citations).]

Response 16: Thank you for pointing this out. We agree with this comment. In the new revised version, we have made changes to the introduction of the original manuscript and added the limitations of ordinary NIR spectroscopy. [ Ordinary spectrum technology has only spectral information but no image information. This limitation results in lower collection efficiency and identification accuracy. It is applicable to local measurement, but it is difficult to apply to complex actual scenes of multiple external defects of potato. Page 2, L66-L70.]

Comments 17: [“This research group used machine vision recognition technology to conduct pre-tests, and the results showed that this technology had a high recognition rate for deformed potatoes but a poor recognition rate for external defects, such as potato skin defects and scabs.” – this is part of Results or Discussion section, should be moved to suitable section.]

Response 17: Thank you for pointing this out. We think that this part of the text is out of place in the revised manuscript. Therefore, I removed this section in the modified version.

Comments 18: [“Hyperspectral imaging technology integrates traditional digital imaging and spectral analysis technologies, with the advantages of high resolution and many bands; thus, it can not only obtain spatial image information of the measured object but also the spectral information of each pixel [6].” - Hyperspectral imaging is not linked to spatial resolution. It only links to higher spectral resolution.]

Response 18: Thank you for pointing this out. We agree with this comment. Therefore, I corrected the sentence. [Page 3, L105-L108.]

Comments 19: [“The limitations of near-infrared spectroscopy (NIR) to detect potato external defects in spatial position and the long feedback time of machine vision recognition algorithm are improved.” – please mention what the limitations are. Please expand on what is meant by long feedback time here.]

Response 19: Thank you for pointing this out. We agree with this comment. In the new revised version, we have changed the introduction of the original manuscript and added the limitations of ordinary near-infrared spectroscopy and machine vision recognition technology. [Page 2, L65-L74.]

Comments 20: [2.2. Main instrument and equipment – please mention the number of spectral bands.]

Response 20: Thank you for pointing this out. We agree with this comment. Therefore, I added this section to the text. The number of spectral bands is 472. [Page 4, L133-L134.]

Comments 21: [Section: 2.3. Hyperspectral image acquisition and correction – instead of using ‘should be’, use ‘was or were’, as the authors already applied. Similarly, in section ‘2.4. Hyperspectral data extraction’. ]

Response 21: Thank you for pointing this out. We agree with this comment. Therefore, I made changes in the text. [Page 4-5, L141-L170.]

Comments 22: [Figure 2: binary image creation from the original image is incorrect as it removed a significant portion of the object. It should ideally remove the background and retain the entire object (potato here).]

Response 22: Thank you for pointing this out. We agree with this comment. Therefore, I changed this figure. [Figure 2,Page 5, L179.]

Comments 23: [Why 25 pixels ×25 pixels chosen? What was the selection criteria for the location?]

Response 23: Thank you for pointing this out. We agree with this comment. Therefore, I added a selection criterion of 25 pixels by 25 pixels in the text.[Page 5, L184-L186.]

Comments 24: [L139-140: What was the criteria to remove the noisy bands? What was the source or reason of the noisy bands?]

Response 24: Thank you for pointing this out. We agree with this comment. Therefore, I have added to the text the criteria for removing noise bands and the sources of noise. [Page 6, L190-L196.]

Comments 25: [Figure: add the legend of right-hand side figure (spectral curves).]

Response 25: Thank you for pointing this out. We agree with this comment. Therefore, I changed the image to add the legend for the figure to the right of Figure 3. [Figure 3,Page 6, L198.]

Comments 26: [“Hyperspectral technology has been used in agriculture, environment, medicine, and other fields because of its high-resolution detection of fine matter characteristics. Before establishing a hyperspectral data model, the hyperspectral data should be preprocessed [11]” – Please shift these lines to Introduction section.]

Response 26: Thank you for pointing this out. To keep the Introduction of the revised version logically coherent, I deleted this sentence.

Comments 27: [“In this test, convolution smoothing (Savitzky–Golay, SG), multiple scattering correction (MSC), standard normal transformation (SNV), and normalization algorithms were used to preprocess the spectral data [12]. Such processing was performed to remove noise, background interference, and other interference factors in spectral data to improve the signal-to-noise ratio, accuracy, quality, and availability.” – please mention which tool was used for which correction. What is meant by ‘availability’ here?]

Response 27: Thank you for pointing this out. We agree with this comment. Therefore, I have made changes in the text, detailing the usefulness of each method and explaining what "availability" means. The "availability" here refers to the higher quality of the pre-processed hyperspectral data, which is more suitable for subsequent modeling and analysis, thus improving the reliability and consistency of the experimental results. [Page 6, L211-L218 & Page 8, L227-L229.]

Comments 28: [L159 and L168: ‘A quantitative prediction model’ and ‘A qualitative discrimination model’ – please clearly mention what are quantitative and qualitative models here and why these were used. Why different models were used for quantitative and qualitative analyses. How these models were chosen.]

Response 28: Thank you for pointing this out. The definitions of these two types of models are similar to those in Response 4.

SVM, PLSR, PCR, LSSVM were used to establish quantitative models. These models were chosen because they are well suited to the regression task, guaranteeing the accuracy and reliability of the model through cross-validation and parameter optimization. Qualitative models were established using BPNN, CART, KNN and LDA to divide potatoes into discrete classes. These models were chosen because they are effective for classification tasks, where the goal is to assign data points to specific classes based on spectral characteristics.

The use of different quantitative and qualitative models is conducive to comparative analysis, so as to find the most suitable method to detect potato external defects. We chose these models by testing and comparing them repeatedly.

Comments 29: [“The hyperspectral data of potato were extracted using ENVI5.6 and imported into EXCEL for editing. The full-band spectral curve of potato hyperspectrum was obtained by MATLAB. In the preview spectrogram, it can be found that the clutter noise is large at 400-549nm and 921-1100nm, so the band 550-920nm is selected for analysis and processing. The original average spectral curve in this band range is shown in Figure 4.” – Figure 3 shows a different wavelength range, whereas Figure 4 has a different range. Citation of Figure 4 is wrong here. I guess it would be Figure 3. How were the clutter noise decided and what criteria was applied to remove these bands?]

Response 29: Thank you for pointing this out. Because of equipment noise (thermal noise, current noise), the clutter noise is large in the wavelength ranges of 400-549 nm and 921-1100 nm, resulting in unstable spectral curves. Therefore, the relatively stable spectral data in the 550-920 nm band were selected for analysis and processing. Figure 4 shows the average spectral curves of six types of potatoes after noise removal.

Comments 30: [“To solve the problems of noise, distortion, and interference in spectral data, improve the quality and reliability of spectral data, and increase the accuracy of spectral analysis and quantitative prediction model establishment, spectral data preprocessing was performed after obtaining the potato hyperspectral data.” – this information is already mentioned in the methodology section. Therefore, this line can be removed. Entire 3.2. Spectral data preprocessing is part of Methodology section.]

Response 30: Thank you for pointing this out. I have moved the preprocessing methods to the methods section. [Page 6, L208-L229.]

Since the spectral data can only be pre-processed after extraction, I kept the description of the pre-processing procedure in Section 3.1 of the revised version. [Page 9-11, L334-L352.]

Comments 31: [L192: What is X10.4. and Normalize algorithms? Please mention these in the methodology section.]

Response 31: Thank you for pointing this out. We agree with this comment. Therefore, I have moved the description of the algorithm to the Methods section. The normalization algorithm was used to eliminate spectral intensity differences caused by instrument response or environmental changes, making the data more suitable for comparison and analysis. Unscrambler X10.4 is the software used for the preprocessing. [Page 6, L216-L218.]

Comments 32: [L194-195: “The SG is essentially a weighted average method that can effectively eliminate high-frequency noise in the spectral curve [16].” – what was the kernel size used here?]

Response 32: Thank you for pointing this out. We agree with this comment. Therefore, I made changes in the text. The kernel size used by the SG algorithm in this study is 11. [Page 9, L338-L339.]
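To make the stated kernel size concrete, here is a minimal pure-Python sketch of Savitzky–Golay smoothing with an 11-point window. The record gives only the window length, so the quadratic/cubic polynomial order assumed here, the function names, and the choice to leave the five points at each end unsmoothed are illustrative assumptions, not the authors' settings. Libraries such as SciPy provide an equivalent `savgol_filter` function.

```python
def savgol_weights(window):
    """Smoothing weights for a quadratic/cubic Savitzky-Golay filter.

    Closed form for an odd window length m and offsets i = -k..k:
    w_i = 3 * (3*m**2 - 7 - 20*i**2) / (4 * m * (m**2 - 4)).
    """
    if window % 2 == 0 or window < 5:
        raise ValueError("window must be an odd integer >= 5")
    half = window // 2
    denom = 4 * window * (window ** 2 - 4)
    return [3 * (3 * window ** 2 - 7 - 20 * i ** 2) / denom
            for i in range(-half, half + 1)]


def savgol_smooth(spectrum, window=11):
    """Smooth one spectrum; points within window // 2 of either end are kept as-is."""
    weights = savgol_weights(window)
    half = window // 2
    smoothed = list(spectrum)
    for centre in range(half, len(spectrum) - half):
        segment = spectrum[centre - half:centre + half + 1]
        smoothed[centre] = sum(w * x for w, x in zip(weights, segment))
    return smoothed


# Toy reflectance curve: a flat 0.50 level with alternating +/-0.01 noise.
noisy = [0.50 + 0.01 * ((-1) ** i) for i in range(30)]
smooth = savgol_smooth(noisy, window=11)
```

Because the weights come from a local polynomial fit rather than a plain moving average, slowly varying spectral features pass through unchanged while point-to-point noise is strongly attenuated.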

Comments 33: [L196-198: “MSC and SNV are used to eliminate the influence of scattering caused by uneven particle distribution and different particle sizes on the spectrum and to reduce the interference of uncertain factors such as the scattering effect and instrument response.” – what was the impact of uneven particle distribution and different particle sizes on spectra and how it was decided?]

Response 33: Thank you for pointing this out. We agree with this comment. Therefore, I made changes in the text. When the particle distribution is not uniform or the particle size difference is large, the propagation path and scattering intensity of light in the sample will change, resulting in spectral reflectance fluctuations, especially in the short-wave region. This scattering effect can mask the true spectral characteristics of the sample, increasing noise and uncertainty. [Page 6, L220-L224.]
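The scatter corrections discussed in this exchange can be sketched in a few lines of Python. This is a generic illustration of SNV and MSC as they are usually defined, not the Unscrambler implementation used by the authors. The function names and toy spectra are assumptions, and MSC is assumed to use the mean spectrum as its reference.

```python
import statistics


def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own mean and SD."""
    mean = statistics.fmean(spectrum)
    sd = statistics.stdev(spectrum)
    return [(x - mean) / sd for x in spectrum]


def msc(spectra):
    """Multiplicative scatter correction against the mean spectrum.

    Each spectrum x is modelled as x = offset + slope * reference and
    corrected to (x - offset) / slope.
    """
    n_bands = len(spectra[0])
    reference = [statistics.fmean([s[j] for s in spectra]) for j in range(n_bands)]
    ref_mean = statistics.fmean(reference)
    ref_ss = sum((r - ref_mean) ** 2 for r in reference)
    corrected = []
    for spectrum in spectra:
        mean = statistics.fmean(spectrum)
        slope = sum((r - ref_mean) * (x - mean)
                    for r, x in zip(reference, spectrum)) / ref_ss
        offset = mean - slope * ref_mean
        corrected.append([(x - offset) / slope for x in spectrum])
    return corrected


# Toy data (hypothetical): one underlying spectrum seen with three
# different scatter-induced gains and baseline shifts.
base = [0.20, 0.30, 0.50, 0.40, 0.35]
spectra = [[gain * b + shift for b in base]
           for gain, shift in [(0.8, 0.05), (1.0, 0.0), (1.2, -0.05)]]
snv_spectra = [snv(s) for s in spectra]
msc_spectra = msc(spectra)
```

In this toy case the three spectra differ only by gain and offset, so both corrections collapse them onto a single curve, which is the behaviour the response attributes to scatter removal.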

Comments 34: [“The function of Normalize is to standardize and normalize the data, unify the range of data, eliminate the impact of data dimensions, and make data indicators comparable.” – what was before and after data ranges?]

Response 34: Thank you for pointing this out. The range of the data before standardization depends on the distribution and scale of the original data, and the ranges of different features can vary greatly. After standardization, the data have a mean of 0 and a standard deviation of 1. To keep the article logically coherent, I made some changes in the Methods section. [Page 6, L216-L218.]

Comments 35: [“Although similar to SNV, this method is different because it averages the row rather than the column of the spectrum [17].” – how many samples (kernel size) were used for averaging?]

Response 35: Thank you for pointing this out. The kernel size used by the Normalize algorithm in this study is 5. In order to keep the article logical, I have deleted this sentence from the methods section.

Comments 36: [“In the first step of the KS algorithm, all samples were treated as training sets, the Euclidean distance of the entire sample set was calculated, and the two samples with the largest Euclidean distance were selected as the training set. In the second step, the distance between the remaining and selected samples was calculated. Samples with the shortest distance were selected as the training set. After all the remaining samples were calculated, the sample corresponding to the longest distance among the shortest distances was selected as the training set. In the third step, the second step was repeated until the number of samples selected was equal to the number determined in advance.” – if I understood it correctly, this method groups the input spectra according to the classes defined at the beginning. It sounds like identifying the best sample spectra for each class. Why didn’t the authors apply a pixel purity index (PPI) type of approach to identify the purest spectra?]

Response 36: Thank you for pointing this out. We did not use PPI because our task focuses more on sample representativeness and coverage than on end-member extraction or mixed-pixel analysis. The KS algorithm was chosen because it better meets the modeling requirements.
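The three steps quoted in Comment 36 correspond to the Kennard–Stone (KS) max-min selection rule, which picks samples for coverage of the spectral space and does not use class labels. The following Python sketch illustrates that rule under stated assumptions: the function name, the two-band toy "spectra", and the use of the unselected samples as a validation set are hypothetical and not taken from the manuscript.

```python
import math


def kennard_stone(samples, n_select):
    """Return indices of n_select samples chosen by the Kennard-Stone rule."""
    if not 2 <= n_select <= len(samples):
        raise ValueError("n_select must be between 2 and the number of samples")
    n = len(samples)
    dist = [[math.dist(samples[i], samples[j]) for j in range(n)] for i in range(n)]

    # Step 1: start with the two samples that are farthest apart.
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda pair: dist[pair[0]][pair[1]],
    )
    selected = [first, second]
    remaining = [i for i in range(n) if i not in selected]

    # Steps 2-3: repeatedly add the remaining sample whose distance to its
    # nearest already-selected sample is largest (max-min criterion).
    while len(selected) < n_select:
        best = max(remaining, key=lambda i: min(dist[i][s] for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy two-band "spectra" (hypothetical values).
points = [[0.0, 0.0], [1.0, 1.0], [4.0, 5.0], [9.0, 9.0], [10.0, 10.0]]
selected = kennard_stone(points, 3)
validation = [i for i in range(len(points)) if i not in selected]
```

The rule first takes the two extremes and then the sample that best fills the largest gap, so the training set spans the range of the data rather than clustering around typical samples.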

Comments 37: [L243 – L254: these are part of Methodology section.]

Response 37: Thank you for pointing this out. We agree with this comment. Therefore, I moved this to the methods section. [Page 7-8, L265-L287.]

Comments 38: [Figures 5, 6, and 7, and Table 1: Origin or Original?]

Response 38: Thank you for pointing this out. We agree with this comment. "Original" should be used. Therefore, I made changes in the text. [Figures 5, 6, and 7, and Tables 1, 2, and 3. Page 10-13, L347-L389.]

Comments 39: [“Three quantitative analysis models of the potato spectrum were developed” – What quantity are the authors trying to predict here? How was it measured in laboratory tests?]

Response 39: Thank you for pointing this out. The quantitative indicators (R², RMSEC, RMSEP, RPD) are calculated automatically by the software according to the data analysis formulas. The qualitative label is based on visual inspection of each sample's appearance to determine whether it is a healthy potato, a black-green skin potato, or a scab-mechanical damage-broken skin potato. The results of model recognition are then compared with the results of manual judgment to determine the accuracy of the qualitative analysis. The input data for the models are the spectral data measured with the hyperspectral imager in the laboratory.

Comments 40: [L283-L296: shift these lines to Methodology as required, retaining the results. Mention how many bands there were before and after PCA and what criteria were used in PCA to select the bands. What is the actual measurement index here? How were these values obtained and predicted? It looks like the current approach can identify the different potato categories with multi-spectral images, i.e., with a lower number of spectral bands.]

Response 40: Thank you for pointing this out. We agree with this comment. Therefore, I moved the methodological part of this section to the Methods section and kept the results. There were 288 bands before PCA, and these were reduced to 2 principal components after PCA. The selection criterion in PCA is variance maximization with eigenvalue ordering: the principal component with the largest variance is retained first. The actual measurement index is the spectral reflectance, which is obtained with the hyperspectral imager. [Page 7, L268-L273 & Page 14, L390-L402.]
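The criterion described in this response (keep the components with the largest eigenvalues, i.e. the largest variance) can be illustrated with a small self-contained Python sketch. It uses power iteration with deflation on the band covariance matrix so that it runs without external libraries. In practice a numerical library would normally be used. The function names, the four-band toy data, and the two latent factors are hypothetical, and this is not the authors' procedure or data.

```python
import math
import random


def covariance_matrix(rows):
    """Sample covariance matrix of the columns (bands) of rows (samples)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centred = [[r[j] - means[j] for j in range(p)] for r in rows]
    return [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(p)]
            for i in range(p)]


def leading_eigenpair(matrix, iterations=500, seed=0):
    """Largest eigenvalue and its eigenvector, found by power iteration."""
    p = len(matrix)
    rng = random.Random(seed)
    vec = [rng.random() + 0.1 for _ in range(p)]
    for _ in range(iterations):
        prod = [sum(matrix[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in prod))
        if norm == 0.0:  # no variance left to explain
            return 0.0, vec
        vec = [x / norm for x in prod]
    prod = [sum(matrix[i][j] * vec[j] for j in range(p)) for i in range(p)]
    return sum(v * x for v, x in zip(vec, prod)), vec


def explained_variance_ratio(rows, n_components=2):
    """Fraction of the total variance carried by each leading component."""
    cov = covariance_matrix(rows)
    p = len(cov)
    total = sum(cov[i][i] for i in range(p))
    ratios = []
    for _ in range(n_components):
        value, vec = leading_eigenpair(cov)
        ratios.append(value / total)
        # Deflate: remove the component just found before searching again.
        cov = [[cov[i][j] - value * vec[i] * vec[j] for j in range(p)]
               for i in range(p)]
    return ratios


# Toy data (hypothetical): 12 samples, 4 bands, driven by two latent factors.
load1, load2 = [1.0, 0.9, 0.8, 0.7], [0.2, -0.1, 0.1, -0.2]
rows = [[i * a + ((-1) ** i) * b for a, b in zip(load1, load2)] for i in range(12)]
ratios = explained_variance_ratio(rows, n_components=2)
```

The cumulative explained-variance ratio is the usual basis for deciding how many components to keep. In this toy case two components capture essentially all of the variance because the data were built from two factors.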

Comments 41: [L295-296: Figure 8 doesn’t show any spectrum as mentioned here.]

Response 41: Thank you for pointing this out. We agree with this comment. Therefore, I changed the title of Figure 8. “The scatterplot of variance for PCA of three types of potatoes” [Figure 8,Page 14, L403-404.]

Comments 42: [L300: what is “characteristic spectral data”, and why is it required?]

Response 42: Thank you for pointing this out. Characteristic spectral data refers to the characteristic pattern of light absorbed, emitted, or scattered by a substance over a specific wavelength range, usually measured by a spectrometer. These data reflect the chemical composition, structure and physical properties of the substance. The characteristic spectral data combined with qualitative model can be used to identify potato external defects.

Comments 43: [If PCA was applied for dimensionality reduction, why was Section 3.4 (spectral data feature wavelength extraction) also used?]

Response 43: Thank you for pointing this out. The SPA method was used to extract the characteristic wavelength of spectral data.

Comments 44: [L30-382: How did the authors ensure that there was no damage inside, even partially?]

Response 44: Thank you for pointing this out. This paper studies the external defects of potatoes, so we cannot be certain that the potatoes have no internal damage.

Author Response File: Author Response.pdf

Round 2

Reviewer 2 Report

Comments and Suggestions for Authors

The authors have substantially improved the previous manuscript and, in its current form, I consider it suitable for publication. It is a very interesting manuscript in which a non-destructive detection method is used to identify external defects in potatoes using hyperspectral imaging and machine learning algorithms. Spectral preprocessing (SG, SNV, MSC), feature selection (SPA), and quantitative (PCA, SVM) and qualitative (BPNN, KNN) modeling techniques were compared. Results indicated that the PCA-SG model achieved X% accuracy on healthy potatoes, while SVM was the most stable on defects such as black and green spots. For scab and mechanical damage, KNN provided the best classification with an accuracy of Y%. These results demonstrate the feasibility of using hyperspectral imaging to improve quality inspection in the potato industry, although further studies are required to optimize model generalization.

Reviewer 3 Report

Comments and Suggestions for Authors

The suggestions are incorporated. The manuscript can be accepted with minor changes in the English language.
