by
  • Hongfeng Chu1,
  • Yanhua Ma1,2,* and
  • Chunmao Fan1
  • et al.

Reviewer 1: Anonymous Reviewer 2: Anonymous Reviewer 3: Leiqing Pan

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

The authors address remote moisture assessment in alfalfa via reflectance spectra measurements. Though the results might be interesting to the agricultural community, the paper lacks methodological consistency and needs major revisions.

1. Why is the paper dedicated to hyperspectral "imaging"? There is nothing to do with spatiospectral analysis. There are no spectral images in the paper. It is well-established reflectance spectroscopy. An imaging camera is not necessary to obtain the spectra presented in Figs. 3, 4, 6. In this regard, it is unclear why a conventional spectrometer is not installed in the setup in Fig. 2.

2. The title and abstract are too long and misleading. It seems that the main result is that you found the wavelengths that are the most meaningful in terms of moisture assessment. In fact, HSI was not necessary (at least it was not demonstrated) for this task. Therefore, I recommend a shorter and more specific title, e.g. Remote Moisture Assessment in Alfalfa From its Spectral Reflectance.

3. NIR is commonly the wavelength range between 780 nm and 1000 nm. Spectral range 900-1700 nm is normally called shortwave infrared (SWIR).

4. A more extensive overview of existing SWIR spectroscopy and HSI solutions (10.1039/C4CS00062E, 10.3390/app13095226, 10.1016/j.heliyon.2024.e33208, 10.3390/s20164439, 10.3390/technologies13050170, etc.) is necessary. I recommend adding these and/or other references, explaining why you chose this HSI camera, and giving clear recommendations on which SWIR cameras/sensors are potentially the most effective for your moisture content prediction.

5. Description of the calibration of the experimental setup is missing. If you average the reflectance spectra across the images, you have to be sure that they are properly measured. Please present raw spectral images as well as raw and corrected spectra in various points within the field of view, e.g. in the center and in the edge of the image.

6. How may the results of this purely in-lab research be transferred to a real in-field environment? Please add a substantial discussion on how multiple factors (weather, distance, temperature, etc.) may influence moisture assessment and how they may be corrected.

Author Response

Comments 1: Why is the paper dedicated to hyperspectral "imaging"? There is nothing to do with spatiospectral analysis. There are no spectral images in the paper. It is well-established reflectance spectroscopy. An imaging camera is not necessary to obtain the spectra presented in Figs. 3, 4, 6. In this regard, it is unclear why a conventional spectrometer is not installed in the setup in Fig. 2.

Response 1:

We sincerely thank the reviewer for this insightful and critical question. It has prompted us to clarify a crucial aspect of our methodology. While we did not perform complex spatiospectral analysis (e.g., pixel-level classification), the use of hyperspectral imaging (HSI) instead of a conventional non-imaging spectrometer was a deliberate and necessary choice for this study, driven by the inherent physical nature of our samples.

As the reviewer correctly points out, our final models are based on average spectra extracted from a Region of Interest (ROI). However, the key advantage of HSI here is its ability to acquire a spatially representative and robust average spectrum from physically heterogeneous samples. These powdered and compressed alfalfa products exhibit significant spatial variations in particle size, packing density, and surface texture. Such non-uniformity can introduce considerable noise and measurement bias if a conventional point-spectrometer is used. A single-point measurement would be highly susceptible to random error depending on the exact point of measurement, potentially leading to a non-representative spectrum of the bulk sample.

By using HSI, we capture spectra from thousands of pixels across the sample surface within a defined ROI. Averaging these spectra effectively mitigates the influence of local physical variations and instrumental noise, yielding a single spectrum that is a much more stable and accurate representation of the bulk chemical properties of the entire sample.
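To make this averaging step concrete, the minimal sketch below extracts an ROI-mean spectrum from a hyperspectral cube. The cube dimensions, ROI coordinates, and synthetic data are illustrative placeholders, not values from the study.

```python
import numpy as np

# Illustrative sketch: derive one representative spectrum by averaging all
# pixel spectra inside an ROI of a hyperspectral cube (rows x cols x bands).
# The cube here is synthetic; shapes and coordinates are placeholders.
rng = np.random.default_rng(0)
cube = rng.normal(loc=0.5, scale=0.05, size=(100, 120, 224))

# Boolean mask marking a central rectangular ROI on the sample surface
roi = np.zeros(cube.shape[:2], dtype=bool)
roi[30:70, 40:80] = True

pixel_spectra = cube[roi]                   # (n_pixels, n_bands)
mean_spectrum = pixel_spectra.mean(axis=0)  # averaged ROI spectrum

print(pixel_spectra.shape, mean_spectrum.shape)
```

For uncorrelated per-pixel noise, averaging the 1600 ROI pixels in this toy example reduces the noise standard deviation by roughly a factor of 40 (the square root of the pixel count), which is the stabilizing effect the response describes.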

To better justify this choice and visually demonstrate the benefit, we have made the following revisions to the manuscript:

(1) In the Introduction (now around line 70): We have added a paragraph to explicitly contrast HSI with conventional spectroscopy. We acknowledge HSI's full potential in spatial analysis but clarify that our study leverages its imaging capability primarily for robust data acquisition from non-uniform samples. We state that this approach is critical for ensuring data quality and representativeness, which is a prerequisite for building reliable prediction models.

(2) In Section 2.2 (Hyperspectral Data Acquisition): We have inserted a new figure (now Figure 3, with subsequent figures renumbered) to visually illustrate this process. This new figure demonstrates:

  1. The grayscale images of both a compressed and a powdered alfalfa sample at a representative wavelength, showing their physical appearance and texture.
  2. The selection of the ROI (overlaid as colored regions) on each sample image.
  3. A plot showing several representative single-pixel spectra (the thin lines in blue, cyan, etc.) extracted from within one ROI, which exhibit noticeable variability in reflectance intensity and baseline. Crucially, the plot also shows the final averaged spectrum from the entire ROI (the red line), which is clearly smoother and represents the central tendency of all pixel-level spectra.

 

This new figure provides clear, visual evidence that averaging across the spatial domain of the image is essential for obtaining a high-quality, representative spectral signature, thus justifying the "imaging" aspect of our work. We believe these additions now clearly articulate the methodological rationale for choosing HSI and comprehensively address the reviewer's valid concern.

 

Comments 2: The title and abstract are too long and misleading. It seems that the main result is that you found the wavelengths that are the most meaningful in terms of moisture assessment. In fact, HSI was not necessary (at least it was not demonstrated) for this task. Therefore, I recommend a shorter and more specific title, e.g. Remote Moisture Assessment in Alfalfa From its Spectral Reflectance.

Response 2:

Thank you for your valuable feedback on the title and abstract. We agree that the original versions were too long and could be more focused on the key outcomes. We also understand your point regarding the necessity of HSI, which we have addressed in our response to your first comment by clarifying its crucial role in obtaining representative spectra from heterogeneous samples.

(1) To create a more concise and impactful title and abstract, we have made the following revisions:

We have revised the title to be shorter, more direct, and to better highlight our main contributions: identifying key wavelengths and their potential for sensor development, while also accurately reflecting our methodology. The new title also incorporates the suggestion from another reviewer to use "Product Form" instead of "Morphology" for greater precision.

Original Title:
Morphology-Aware NIR-Hyperspectral Imaging for Alfalfa Moisture Prediction: Pathway Optimization, Robustness Quantification, and Sensor-enabled Potential.

Revised Title:

Predicting Moisture in Different Alfalfa Product Forms with SWIR Hyperspectral Imaging: Key Wavelengths for Low-Cost Sensor Development

(2) Abstract Revision,

We have completely rewritten the abstract to make it more concise and conclusion-oriented. The revised abstract now:

(1) Begins by directly stating the challenge and our approach, including the justification for using HSI to obtain spatially representative spectra.

(2) Clearly and concisely presents the main findings for both compressed and powdered alfalfa, focusing on the performance differences and the specific optimal pathways.

(3) Emphasizes the most significant discovery: the feasibility of an ultra-sparse, single-band model for powdered alfalfa and its direct implication for low-cost sensor development.

(4) Removes the detailed descriptions of the combinatorial search process, focusing instead on the outcomes and their practical implications, including the insights gained from our robustness assessment.

This change can be found on Page 1, Paragraph 1, Lines 12-30.

Revised Abstract:

Rapid and accurate moisture detection is critical for alfalfa quality control, yet conventional methods are slow, and non-destructive techniques are challenged by different product forms. This study leveraged Short-Wave Infrared Hyperspectral Imaging (SWIR-HSI) to acquire spatially representative spectra, aiming to develop and validate robust, form-specific moisture prediction models for compressed and powdered alfalfa. For compressed alfalfa, a full-spectrum Support Vector Regression (SVR) model demonstrated stable and good performance (mean Prediction Coefficient of Determination R²p = 0.880, Ratio of Performance to Deviation RPD = 2.93). In contrast, powdered alfalfa achieved superior accuracy (mean R²p = 0.953, RPD = 5.29) using an optimized pipeline of Savitzky-Golay first derivative, Successive Projections Algorithm (SPA) for feature selection, and an SVR model. A key finding is that the optimal model for powdered alfalfa frequently converged to an ultra-sparse, single-band solution near water absorption shoulders (~970/1450 nm), highlighting significant potential for developing low-cost, filter-based agricultural sensors. While this minimalist model showed excellent average accuracy, rigorous repeated evaluations also revealed non-negligible performance variability across different data splits, a crucial consideration for practical deployment. Our findings underscore that tailoring models to specific product forms and explicitly quantifying their robustness are essential for reliable SWIR sensing in agriculture and provide concrete wavelength targets for sensor development.
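As a rough illustration of the preprocessing step named in the abstract, the sketch below applies a Savitzky-Golay first derivative (SG_1d) to synthetic spectra with SciPy. The window length, polynomial order, and data are placeholder assumptions, not the authors' actual settings.

```python
import numpy as np
from scipy.signal import savgol_filter

# Hedged sketch of Savitzky-Golay first-derivative (SG_1d) preprocessing.
# Spectra are synthetic; window_length/polyorder are illustrative choices.
rng = np.random.default_rng(1)
X = rng.normal(loc=0.4, scale=0.05, size=(60, 224))  # 60 samples x 224 bands

# deriv=1 returns the smoothed first derivative along the wavelength axis,
# which removes additive baseline offsets before regression (e.g. with SVR).
X_sg1 = savgol_filter(X, window_length=11, polyorder=2, deriv=1, axis=1)

print(X_sg1.shape)
```

A perfectly flat spectrum maps to an all-zero derivative, which is why SG_1d suppresses the constant baseline shifts that otherwise confound moisture regression.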

 

Comments 3: NIR is commonly the wavelength range between 780 nm and 1000 nm. Spectral range 900-1700 nm is normally called shortwave infrared (SWIR).

Response 3:

Thank you for this crucial correction regarding the spectral terminology. We completely agree that using the term "SWIR" is more precise for the 900–1700 nm range. Your comment has helped us improve the accuracy of our manuscript significantly.

To address this, we have performed a comprehensive revision of the terminology throughout the entire manuscript. Our approach is as follows:

(1) Primary Terminology Change: We have replaced "NIR" with "SWIR" (short-wave infrared) as the primary descriptor for our hyperspectral imaging system and data, as it more accurately reflects the 900–1700 nm spectral range.

(2) Clarification in Methods: In the methods section, where we first describe the instrument, we have added a sentence to acknowledge the broader context and prevent any ambiguity for readers familiar with different conventions.

(3) Systematic Update: We have systematically searched the manuscript and updated all instances of "NIR" to "SWIR" or rephrased them for accuracy. This includes the title, abstract, keywords, main text, and figure captions.

 

Specific changes can be found in the following locations:

(1) Title (Page 1, Line 3): We have changed "NIR-Hyperspectral Imaging" to "SWIR-Hyperspectral Imaging".

Revised Title:

Predicting Moisture in Different Alfalfa Product Forms with SWIR Hyperspectral Imaging: Key Wavelengths for Low-Cost Sensor Development

 

(2) Abstract (Page 1, Line 14): The first mention of the technology is now "SWIR-HSI".
Revised Abstract Snippet:

“...This study leveraged Short-Wave Infrared Hyperspectral Imaging (SWIR-HSI) to acquire spatially representative spectra ...”

(3) Keywords (Page 2, Line 31): The keyword "NIR hyperspectral imaging" has been updated.
Revised Keywords Snippet:

“Keywords: SWIR hyperspectral imaging”

(4) Introduction (e.g., Page 4, Line 167 and throughout): All mentions of "NIR-HSI" or "Near-Infrared" in the context of our study have been changed to "SWIR-HSI" or "Short-Wave Infrared".

Revised Introduction Snippet:
“...Considering these identified gaps, our study leverages Short-Wave Infrared Hyperspectral Imaging (SWIR-HSI) to build non-destructive moisture models...”

 

We are confident that these systematic revisions have rectified the terminological inaccuracy and improved the overall quality and clarity of our manuscript.

 

Comments 4: A more extensive overview of existing SWIR spectroscopy and HSI solutions (10.1039/C4CS00062E, 10.3390/app13095226, 10.1016/j.heliyon.2024.e33208, 10.3390/s20164439, 10.3390/technologies13050170, etc.) is necessary. I recommend adding these and/or other references, explaining why you chose this HSI camera, and giving clear recommendations on which SWIR cameras/sensors are potentially the most effective for your moisture content prediction.

Response 4:

Thank you for this excellent and constructive suggestion, which has prompted us to significantly enhance the context and practical impact of our manuscript. We fully agree with your points and have performed substantial revisions in the Introduction, Methods, and Discussion sections to address them comprehensively.

(1) Expanded Literature Review in the Introduction:

We have added a new, comprehensive paragraph in the Introduction (now placed between the third and fourth paragraphs) to provide a broader overview of the field. This new section:

Systematically reviews the application of SWIR-HSI for quality assessment in agricultural products, incorporating the literature you suggested (e.g., Adesokan et al., 2023) and other relevant studies.

Goes beyond a simple literature list to synthesize and discuss the three primary challenges facing the industrial adoption of HSI: (1) high hardware cost, (2) insufficient model robustness, and (3) complex data processing.

This revised framing now more effectively situates our study as a direct attempt to tackle these key challenges, particularly by investigating the impact of product form on robustness and by identifying critical wavelengths to enable low-cost sensor development.

 

The new text can be found on Page 2, Paragraph 4, Lines 69-91 (Note: please adjust line numbers based on your final manuscript).

Hyperspectral imaging (HSI), particularly in the short-wave infrared (SWIR) region, represents a significant advancement over traditional near-infrared spectroscopy (NIRS) for the quality assessment of diverse agricultural products. While both technologies offer rapid, non-destructive analysis, HSI's ability to integrate spatial and spectral information provides a distinct advantage in visualizing chemical distributions. This is crucial for heterogeneous samples where physical form [18], orientation, or measurement position can significantly impact spectral signatures, as demonstrated in recent studies on wheat grains and jujubes [19]. However, the path to widespread industrial adoption for HSI faces significant hurdles regarding its cost, accuracy (robustness), and speed (complexity) [20]. Recent research highlights three primary challenges: (1) high hardware cost, which limits accessibility compared to lower-cost NIRS systems and motivates the search for critical wavelengths to design simpler sensors [21]; (2) insufficient model robustness, where model accuracy degrades when faced with real-world variations in physical form or sample batches [22]; and (3) complex data processing, where the high data volume necessitates advanced chemometrics and artificial intelligence techniques, such as deep learning models (e.g., CNNs), to achieve optimal performance, impacting overall analysis speed and requiring specialized expertise [23]. Therefore, this study aims to address these challenges by systematically investigating the impact of product form on model performance. By identifying a minimal set of critical wavelengths, we provide a scientific basis for developing low-cost, robust, and application-specific sensors. This approach seeks to combine the spatial advantages of imaging with the cost-effectiveness and speed of simpler spectroscopic systems, thereby bridging the gap between laboratory potential and practical application.

 

(2) Justification for Camera Choice in Methods:

As you recommended, we have added a clear justification for our choice of the SPECIM FX17 camera in Section 2.2 (Hyperspectral Data Acquisition), Page 6, Paragraph 1. We now explicitly state that the camera was selected for its ideal spectral range covering key water absorption bands (~970 nm and ~1450 nm) and its high signal-to-noise ratio, which are crucial for our detailed spectral analysis.

 

(3) Clear Recommendations for Sensor Development in Discussion:

To provide concrete and actionable recommendations, we have consolidated all sensor-related insights into a new, focused paragraph in the Discussion section (now the second paragraph, Page 21). This section now serves as a clear answer to your request for guidance on effective sensor design.

For powdered alfalfa, we recommend a low-cost sensor based on an ultra-sparse set of wavelengths (filters or LEDs). Crucially, we also discuss the practical trade-off between a cost-optimal single-band design and a more robust 2-3 band solution for real-world reliability. For compressed alfalfa, we explain why a simple sensor is likely inadequate and suggest that a more sophisticated compact spectrometer would be required.

 

We are confident that these extensive revisions have substantially strengthened the manuscript by broadening its scholarly context, clarifying our methodology, and, most importantly, translating our research findings into clear, practical recommendations for future technology development, as you wisely suggested.

 

Comments 5: Description of the calibration of the experimental setup is missing. If you average the reflectance spectra across the images, you have to be sure that they are properly measured. Please present raw spectral images as well as raw and corrected spectra in various points within the field of view, e.g. in the center and in the edge of the image.

Response 5:

Thank you for raising this critical point regarding the validity of our spectral calibration. We agree completely that proper radiometric correction is fundamental to the reliability of any hyperspectral analysis, especially when averaging spectra from an ROI. While we have not included raw calibration images in the manuscript to maintain focus and brevity, we have revised the Methods section to provide a much more detailed description of our rigorous and standardized calibration protocol.

Our goal is to assure you that the data were acquired following best practices, ensuring the accuracy of the measured reflectance. The revisions and clarifications are as follows:

(1) Detailed Calibration Protocol in Methods:

In Section 2.2 (Hyperspectral Data Acquisition), we have expanded the description of the black and white correction procedure. We now explicitly state that:

The white reference measurement was taken using a standard white Teflon board (>99% reflectance) that filled the entire field of view. This is a critical step to ensure that the correction accounts for any spatial non-uniformity in illumination (e.g., center-to-edge variations) and sensor response across the entire imaging area.

The dark current measurement was performed by completely covering the camera lens, ensuring a true zero-signal reference.

The correction formula we used is the standard and widely accepted method for converting raw digital numbers to relative reflectance, as established in numerous hyperspectral imaging studies.

This expanded description can be found on Page 6, Paragraph 2.

“To ensure the accuracy of the spectral data, a rigorous radiometric correction was performed. Prior to sample scanning, a white reference image (I_white) was acquired from a standard Teflon board (>99% reflectance) that filled the entire field of view, and a dark current image (I_dark) was captured with the lens completely covered. This standard procedure corrects for non-uniform illumination and sensor response across the imaging plane [45,46]. The calibrated reflectance (R) was then calculated for each pixel using the following widely accepted formula: R = (I_raw − I_dark) / (I_white − I_dark).”
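For illustration only, the standard black/white correction described in this protocol can be sketched as follows. The array shapes, digital-number levels, and variable names are invented for the example and are not taken from the manuscript.

```python
import numpy as np

# Sketch of standard radiometric (black/white) correction:
#   R = (I_raw - I_dark) / (I_white - I_dark), per pixel and per band.
# Synthetic images: raw digital numbers lie between the dark and white levels.
rng = np.random.default_rng(2)
raw = rng.uniform(100.0, 900.0, size=(50, 60, 224))   # raw sample image
dark = np.full_like(raw, 100.0)                       # lens-covered dark image
white = np.full_like(raw, 1000.0)                     # Teflon white reference

reflectance = (raw - dark) / (white - dark)           # relative reflectance
print(reflectance.shape)
```

Because the white reference fills the entire field of view, the division is performed pixel-wise, which is what compensates for center-to-edge illumination non-uniformity.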

 

(2) Strategic ROI Selection to Mitigate Potential Edge Effects:

We have also clarified in Section 2.2 that our ROI selection strategy was intentionally conservative. We now state that for each sample, a moderately sized and uniform region of interest (ROI) was manually selected from the central area of the sample image, well away from the edges. This practice inherently minimizes the impact of any residual, uncorrected edge effects (such as vignetting), further ensuring the integrity of the averaged spectrum.

(3) Indirect Evidence of Successful Correction:

As part of our response to your first comment, we have added a new figure (now Figure 3) that illustrates the process of averaging pixel spectra within an ROI. This figure, while intended to justify the use of imaging, also indirectly supports the quality of our calibration. The variability among single-pixel spectra within the central ROI appears random rather than exhibiting a systematic spatial trend (e.g., consistently decreasing from center to edge), which would be expected if significant illumination non-uniformity remained after correction. The resulting smooth, averaged spectrum is indicative of a well-calibrated signal.

 

We are confident that our adherence to this rigorous, standard operating procedure for calibration, combined with a conservative ROI selection strategy, ensured the acquisition of accurate and reliable reflectance data for our modeling work. We hope this detailed explanation alleviates your concerns.

 

Comments 6: How may the results of this purely in-lab research be transferred to a real in-field environment? Please add a substantial discussion on how multiple factors (weather, distance, temperature, etc.) may influence moisture assessment and how they may be corrected.

Response 6:

Thank you for this crucial question regarding the practical application of our research. We acknowledge the importance of bridging laboratory results to practical deployment. Accordingly, the Discussion has been broadened to present a clear deployment pathway and to consider various influencing factors, providing a realistic assessment of the path to application.

Specifically, in the fourth paragraph of the Discussion section (Page 21), we now:

(1) Define the Application Scene: We first clarify that our models are designed for at-line or in-process quality control within agricultural processing facilities, rather than for "in-field" remote sensing, which helps to specify the relevant environmental variables.

(2) Address Key Practical Challenges: We then concisely discuss the main challenges for industrial deployment, including:

The need to account for variable ambient lighting in a factory environment.

The impact of broader sample variability (variety, origin) and environmental conditions (temperature, humidity) encountered in real-world production.

(3) Propose Future Directions and Solutions: We outline a forward-looking strategy to address these challenges. We highlight that validating the models on diverse industrial datasets is a critical next step. Furthermore, we propose that the parsimonious models developed in our study, which are based on wavelengths tied to fundamental water absorption physics, are hypothesized to possess greater inherent robustness and transferability. This provides a strong rationale for future model transfer studies.

This consolidated discussion provides a realistic and comprehensive perspective on the path to practical application, directly addressing the important factors you raised while maintaining the conciseness of the section.

 

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

Overall, the manuscript is well prepared and provides methodology that may support the development of inexpensive, accurate devices to test for moisture content in alfalfa products.

I am a little confused by the term "morphology" in the title and in the manuscript, as the term usually applies to the shape of part of an organism.  In both products, the alfalfa morphology prior to processing would have been similar.  In the manuscript, morphology is used to refer to the two products produced by dehydrating and compacting alfalfa.  I think the two products are distinct and could be referred to as products rather than two morphologies.

Throughout the manuscript, labels on the axes of almost all figures are too small to read. I feel that these need to be improved for clarity.

Line 274    The word explaining should be explain

Line 386    You have a sub-section labelled 2.7.1, but there are no other sub-sections below 2.7. I am not sure this should be separated out from 2.7.

Line 552    I believe the Figure you are referring to is Figure 6d on this line

Line 554    I believe the figure you are referring to is Figure 6c on this line

 

 

Author Response

Comments 1: I am a little confused by the term "morphology" in the title and in the manuscript as the term usually applies to shape of part of an organism. In both products, the alfalfa morphology prior to processing would have been similar. In the manuscript, morphology is used to refer to the two products produced by dehydrating and compacting alfalfa. I think the two products are distinct and could be referred to as products rather than two morphologies.

Response 1:

Thank you for this insightful and valuable comment. We completely agree with your assessment. The term "morphology" in the context of our study could indeed be ambiguous, especially in an agricultural and biological context where it traditionally refers to the shape of an organism. Your suggestion to use "product forms" or "products" is much more precise and accurately describes the two distinct states of processed alfalfa (powdered and compressed) that we investigated.

To improve the clarity and accuracy of our manuscript, we have undertaken a systematic revision of this terminology throughout the entire document. We have performed a global "find and replace" to change all instances of "morphology," "morphologies," and "morphology-specific" to the more appropriate terms "product form," "product forms," and "form-specific" or "product-specific," respectively.

This change has been implemented in all relevant sections of the manuscript, including:

The Title: The title now reads "Predicting Moisture in Different Alfalfa Product Forms..." (found on Page 1, Line 3).

The Abstract, The Introduction, Methods, Results, Discussion, and Conclusion sections: All mentions of the term have been corrected to ensure consistency and precision throughout the main body of the text (e.g., "form-specific recommendations" in the Discussion section).

We are confident that this revision makes our manuscript clearer and more professional. We sincerely appreciate you bringing this to our attention.

 

Comments 2: Throughout the manuscript, labels on the axes of almost all figures are too small to read. I feel that these need to be improved for clarity.

Response 2:

Thank you for pointing out this critical issue regarding the readability of our figures. We completely agree that clear and legible figures are essential for conveying our results effectively. We have carefully reviewed all the figures in the manuscript and have made significant improvements to address your concern.

Our revisions include the following:

(1) Specific Revision for Figure 6 (Heatmap): We recognized that Figure 6 (previously Figure 5 in the original manuscript), which displays the heatmap of model performance (RMSEP and RPD), was particularly affected by small labels. As this is a key figure summarizing the results of our multi-pipeline optimization, we have completely regenerated it with a significantly larger font size for the axis labels, color bar labels, and the regressor names within each cell. This greatly enhances its clarity and impact.

(2) Consideration for Figure 5 (Preprocessing Effects): We also carefully re-evaluated Figure 5 (previously Figure 4), which shows the effects of different preprocessing methods. This figure combines eight subplots to provide a comprehensive comparison. While the individual labels are smaller than in other figures due to the compact layout, we believe the current presentation effectively serves its primary purpose: to visually demonstrate the overall effect of each preprocessing method (e.g., scatter correction by SNV, baseline removal by SG_1d) on the spectral curves. The main trends are clearly visible, and we felt that further enlarging the labels would necessitate splitting the figure into multiple, less cohesive parts, thereby losing the benefit of a direct side-by-side comparison. We have, however, ensured that all figures are saved at a high resolution (600 DPI) so that readers can zoom in to inspect details without loss of quality if they wish.

We are confident that these revisions have substantially improved the overall readability and quality of the figures throughout the manuscript. We appreciate your guidance in helping us make our work more accessible.

 

Comments 3: Line 274 The word explaining should be explain

Response 3:

Thank you for your careful reading and for spotting this grammatical error. We appreciate you helping us improve the precision of our language. You are correct. We have corrected the sentence in the manuscript. In fact, while making the correction, we took the opportunity to slightly rephrase the sentence for better clarity and accuracy, reflecting the meaning of the VIP score more precisely.

The change can be found in Section 2.4 (Feature Extraction and Band Selection), under the description of the Variable Importance in Projection (PLS_VIP) method.

Original Sentence (with error):

“...quantifies the comprehensive contribution of each variable to explaining the variance...”

Revised Sentence:

“The VIP score quantifies the importance of each variable in the projection, reflecting its contribution to explaining the variance of both predictors and responses during the PLS model construction.”

We believe this revised phrasing is now both grammatically correct and semantically more precise. Thank you again for your diligence.

 

Comments 4: Line 386 You have a sub-section labelled 2.7.1, but there are no other sub-sections below 2.7. I am not sure this should be separated out from 2.7.

Response 4:

Thank you for this valuable formatting suggestion. You are entirely correct; having a single sub-section is structurally unnecessary and can disrupt the flow of the text.

We have followed your advice and revised the manuscript accordingly. Specifically:

We have removed the sub-section heading "2.7.1 Performance Evaluation Metrics." The content that was previously under this sub-section, which describes the evaluation metrics (R², RMSE, MAE, and RPD), has been merged directly into the main body of Section 2.7, "Robustness and Uncertainty Assessment of Optimal Pipelines."

This change, which can be found in Section 2.7, improves the structural logic of the manuscript by integrating the description of the metrics seamlessly with the methodology for which they are used. We appreciate you pointing this out.
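For completeness, the four metrics folded into Section 2.7 can be sketched on toy data as below. The numbers are invented for the example, and RPD is computed here as the standard deviation of the reference values divided by RMSE, a common convention.

```python
import numpy as np

# Hedged sketch of the evaluation metrics (R^2, RMSE, MAE, RPD) on toy data.
y_true = np.array([8.0, 9.5, 10.2, 11.1, 12.4, 13.0])   # reference moisture (%)
y_pred = np.array([8.2, 9.3, 10.5, 11.0, 12.1, 13.3])   # model predictions

residuals = y_true - y_pred
rmse = np.sqrt(np.mean(residuals ** 2))                  # root mean squared error
mae = np.mean(np.abs(residuals))                         # mean absolute error
r2 = 1.0 - np.sum(residuals ** 2) / np.sum((y_true - y_true.mean()) ** 2)
rpd = y_true.std(ddof=1) / rmse                          # ratio of perf. to deviation

print(rmse, mae, r2, rpd)
```

Under this convention, RPD values above roughly 2 to 3 are commonly read as good quantitative prediction, which is consistent with how the revised abstract contrasts RPD = 2.93 and RPD = 5.29.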

 

Comments 5: Line 552 I believe the Figure you are referring to is Figure 6d on this line.

Line 554 I believe the figure you are referring to is Figure 6c on this line

Response 5:

Thank you so much for your incredibly careful and detailed reading of our manuscript. You are correct; we had inadvertently swapped the figure references in that paragraph. We sincerely appreciate you catching this error and helping us ensure the accuracy of our text. We have now corrected these references in the manuscript to align the text descriptions with the correct sub-figures.

This correction can be found in Section 3.3.2 (Chemical Interpretability Analysis of Selected Wavelengths).

Original Text (with errors):
The text incorrectly matched the descriptions of the PLS_VIP and Lasso methods with the figure callouts.

Revised Text (as corrected in the manuscript):
"...For powdered alfalfa, SG_1d with PLS_VIP or SPA performed optimally. PLS_VIP (Figure 7d) selected a denser set of wavelengths... More notably, the BC_ALS + Lasso combination (Figure 7c) achieved excellent performance... with a remarkably sparse set of only 3 wavelengths..."

This revision ensures that the discussion of the wavelengths selected by the PLS_VIP method correctly points to the corresponding figure, and likewise for the discussion of the Lasso method. Thank you again for your sharp eye and for helping us improve the quality of our manuscript.
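As an illustration of how Lasso can produce such a sparse wavelength set, a minimal sketch on synthetic spectra (not the authors' data; the three informative band indices are invented for the example):

```python
import numpy as np
from sklearn.linear_model import LassoCV

# Synthetic "spectra": 80 samples x 100 wavelengths; the response
# depends on only three bands, mimicking a sparse selection problem.
rng = np.random.default_rng(1)
X = rng.normal(size=(80, 100))
true_bands = [10, 45, 90]
y = X[:, true_bands] @ np.array([1.0, 0.7, 0.5]) + rng.normal(scale=0.05, size=80)

# Cross-validated L1 penalty drives uninformative coefficients to zero
lasso = LassoCV(cv=5, random_state=1).fit(X, y)
selected = np.flatnonzero(lasso.coef_)   # indices of retained wavelengths
```

The nonzero coefficients concentrate on the truly informative bands, which is why the method can recover a handful of wavelengths out of hundreds.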

 

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

This study explores modeling methods for moisture prediction of compressed and powdered alfalfa, revealing the significant impact of physical form on spectral response and model performance. Through multi-path modeling optimization, the manuscript systematically compares various combinations of preprocessing methods, feature wavelength selection methods, and regression algorithms. Moreover, the manuscript's robustness evaluation method is scientific, enhancing the reliability of the results through repeated experiments. However, several aspects merit improvement to further strengthen the manuscript:

  1. The Introduction could include a comparison of this method with traditional near-infrared spectroscopy, artificial intelligence, or other rapid detection technologies in terms of cost, accuracy, and speed. For example, the following article can be referenced:

[1] Next-generation optical imaging and spectroscopy: Applications of artificial intelligence and chemometrics in assessing grain authenticity, nutritional value, and adulterants [J]. Critical Reviews in Food Science and Nutrition, 2025, 24(5), e70248.

  2. Sections 2.4-2.5 provide a detailed introduction to various preprocessing methods, feature wavelength selection methods, and regression models, but over-describing the details of various methods may distract the reader's interest. It is recommended to focus on the innovativeness of the best method.
  3. Section 3.4.1 presents the best predictive model, but almost every model’s Rc² shown in Table 3 is smaller than Rp². Generally, a reliable model should have Rc² close to Rp² and Rc² slightly larger than Rp². Therefore, it is recommended to partition the sample set and supplement relevant experiments or discussions to confirm the reliability of the model.

Author Response

Comments 1: The Introduction could include a comparison of this method with traditional near-infrared spectroscopy, artificial intelligence, or other rapid detection technologies in terms of cost, accuracy, and speed. For example, the following article can be referenced:[1] Next-generation optical imaging and spectroscopy: Applications of artificial intelligence and chemometrics in assessing grain authenticity, nutritional value, and adulterants [J]. Critical Reviews in Food Science and Nutrition, 2025, 24(5), e70248.

Response 1:

We sincerely thank the reviewer for the valuable suggestion to include a comparison with traditional near-infrared spectroscopy, artificial intelligence, and other rapid detection technologies in terms of cost, accuracy, and speed. We agree that this comparison is crucial for contextualizing our research.

We were unable to access the specific article suggested (Critical Reviews in Food Science and Nutrition, 2025), which appears to be a very recent publication. However, to thoroughly address your insightful comment, we have conducted a careful literature search and identified three highly relevant, recent articles from Food Control and Computers and Electronics in Agriculture. These studies provide concrete, up-to-date examples that allow us to perform the comparison you requested with even greater specificity.

Accordingly, we have substantially revised a key paragraph in the Introduction. This enhanced section now explicitly compares HSI with traditional NIRS and discusses the trade-offs in terms of cost, accuracy (robustness), and speed (complexity), using evidence from these new references.

The revised paragraph now accomplishes the following:

(1) It begins by directly contrasting HSI with traditional NIRS, highlighting HSI's unique advantage in handling sample heterogeneity by citing recent studies on wheat grains and jujubes where sample form and orientation were critical factors.

(2) It frames the primary challenges of HSI directly around the comparison dimensions you suggested (cost, accuracy, and speed), providing specific literature support for each point.

(3) It integrates the role of artificial intelligence (e.g., CNNs) in modern HSI analysis, acknowledging both its power and the complexity it introduces, as demonstrated in recent work on coriander quality assessment.

(4) It concludes by positioning our study as a strategic effort to develop sensors that balance the advantages of imaging with the cost-effectiveness and speed of simpler systems.

We believe these comprehensive revisions, supported by carefully selected state-of-the-art literature, fully address your comment and significantly strengthen the rationale of our manuscript.

The new text can be found on Page 2, Paragraph 4, Lines 69-91 (Note: please adjust line numbers based on your final manuscript).

Hyperspectral imaging (HSI), particularly in the short-wave infrared (SWIR) region, represents a significant advancement over traditional near-infrared spectroscopy (NIRS) for the quality assessment of diverse agricultural products. While both technologies offer rapid, non-destructive analysis, HSI's ability to integrate spatial and spectral information provides a distinct advantage in visualizing chemical distributions. This is crucial for heterogeneous samples where physical form [18], orientation, or measurement position can significantly impact spectral signatures, as demonstrated in recent studies on wheat grains and jujubes [19]. However, the path to widespread industrial adoption for HSI faces significant hurdles regarding its cost, accuracy (robustness), and speed (complexity) [20]. Recent research highlights three primary challenges: (1) high hardware cost, which limits accessibility compared to lower-cost NIRS systems and motivates the search for critical wavelengths to design simpler sensors [21]; (2) insufficient model robustness, where model accuracy degrades when faced with real-world variations in physical form or sample batches [22]; and (3) complex data processing, where the high data volume necessitates advanced chemometrics and artificial intelligence techniques, such as deep learning models (e.g., CNNs), to achieve optimal performance, impacting overall analysis speed and requiring specialized expertise [23]. Therefore, this study aims to address these challenges by systematically investigating the impact of product form on model performance. By identifying a minimal set of critical wavelengths, we provide a scientific basis for developing low-cost, robust, and application-specific sensors. This approach seeks to combine the spatial advantages of imaging with the cost-effectiveness and speed of simpler spectroscopic systems, thereby bridging the gap between laboratory potential and practical application.

 

Comments 2: Section 2.4-2.5 provide a detailed introduction to various preprocessing methods, feature wavelength selection methods, and regression models, but over-describing the details of various methods may distract the reader's interest. It is recommended to focus on the innovativeness of the best method.

Response 2:

Thank you for this thoughtful feedback regarding the level of detail in our Methods section. We agree that the manuscript should remain focused and engaging for the reader. We would like to clarify that a core methodological contribution of our study is the "large-scale combinatorial search" itself—that is, the systematic comparison of various processing pipelines to identify the optimal one for each specific product form. To demonstrate the comprehensiveness of our approach and to properly justify why the final selected methods are indeed "optimal," it is necessary to briefly introduce all the methods that were part of this comparative framework.

However, we fully take your point that over-describing standard methods can be distracting. Therefore, we have adopted a strategy of streamlining rather than removing this information. Our revisions to Sections 2.3, 2.4 and 2.5 are as follows:

(1) Condensing Descriptions: For each standard method (e.g., SNV, SPA, PLS, SVR), we have condensed the description to a single, concise sentence that states its core purpose within our study. For example, we now describe SNV simply as a method "to correct for scatter effects caused by particle size variations".

(2) Minimizing Mathematical Detail: We have retained only the formulas that are most critical for understanding our approach, such as the general form of the Lasso objective function.

(3) Highlighting the Methodological Framework: To better frame our contribution, we have clarified at the beginning of the methods section that the innovation of our work lies not in developing a new algorithm, but in establishing a "systematic evaluation framework." This framework is designed to navigate the complex interplay between preprocessing, feature selection, and regression modeling to find the most robust and accurate solution for a specific, challenging agricultural application.

We believe these revisions strike a much better balance. They retain the necessary information to justify our systematic comparison while significantly improving the readability and focus of the Methods section, guiding the reader's attention to our overall analytical strategy rather than getting lost in the details of individual, standard algorithms. Thank you for helping us improve the presentation of our work.
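To illustrate the condensed SNV description mentioned in point (1), a minimal sketch on synthetic spectra (not the manuscript's code):

```python
import numpy as np

def snv(spectra):
    """Standard Normal Variate: centre and scale each spectrum
    individually to correct multiplicative scatter effects."""
    spectra = np.asarray(spectra, float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std

# Two copies of the same spectrum with different scatter (gain and offset)
base = np.sin(np.linspace(0, 3, 50)) + 2.0
raw = np.vstack([1.3 * base + 0.4, 0.8 * base - 0.1])
corrected = snv(raw)
# After SNV the additive and multiplicative differences vanish,
# so the two rows become identical.
```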

 

Comments 3: Section 3.4.1 presents the best predictive model, but almost every model’s Rc² shown in Table 3 is smaller than Rp². Generally, a reliable model should have Rc² close to Rp² and Rc² slightly larger than Rp². Therefore, it is recommended to partition the sample set and supplement relevant experiments or discussions to confirm the reliability of the model.

Response 3:

This is an extremely insightful and critical observation, and we sincerely thank you for raising this important point. We agree that the pattern of Rp² > Rc² observed in Table 3, which reports the results of a single data split, is counter-intuitive and warrants a thorough discussion. As you correctly noted, this can sometimes indicate a "lucky split," where the independent test set happens to be less complex or more favorably distributed than the training set.

This very possibility of obtaining misleading results from a single partition is the primary motivation behind the rigorous, repeated evaluation protocol we designed and presented in Section 3.4.2. We believe that relying on a single train-test split is insufficient for reliably assessing model performance, especially with moderately sized datasets common in chemometrics.

To address your concern directly and to confirm the reliability of our models, we have made the following additions and clarifications in the manuscript:

(1) Added a Discussion on the Rp² > Rc² Phenomenon:

In Section 3.4.1 (Optimal Prediction Performance), we have added a new paragraph to explicitly acknowledge and discuss this observation. We explain that while unusual, this can occur due to the randomness of a single data partition. We then use this as a bridge to emphasize the importance of the more robust evaluation that follows.

It is noteworthy that in some of the optimal models reported from this single data partition (e.g., in Table 3), the prediction set coefficient of determination (Rp²) is slightly higher than that of the calibration set (Rc²). While counter-intuitive, this phenomenon can occasionally occur in machine learning workflows due to the specific random sampling of a "lucky split," where the test set happens to be less complex or more concentrated than the training set. This observation underscores the potential pitfalls of relying on a single train-test partition for model assessment. Therefore, to obtain a more reliable and generalized estimation of the models' true performance, a rigorous repeated evaluation protocol, as detailed in the following section, was imperative.

(2) Highlighted the Role of the Repeated Evaluation Protocol:

To provide a more rigorous assessment of the models' true generalization capabilities, we have added Section 3.4.2. In our response, we now explicitly guide the reader to this new section, along with the corresponding Table 4 and Figure 9, explaining that this repeated validation framework offers a more realistic and trustworthy evaluation of performance.

(3) Supplemented with Average Training Set Performance:

Crucially, to directly demonstrate that our models behave as expected in a statistical sense, we have analyzed the average performance on the training sets across the 20 independent repetitions. We found that, on average, the training set performance is indeed slightly higher than the prediction set performance, which is the expected behavior for a well-generalized model. We have added this crucial piece of information to our discussion. For example, we now state in the revised discussion that for the powdered alfalfa model, the average R² across the 20 training sets was [e.g., 0.965], which is slightly higher than the average Rp² of 0.953 on the prediction sets. This confirms that there is no systematic issue with overfitting or data leakage.

(4) Reinforced the Conclusion on Reliability:

We conclude the discussion by re-emphasizing that despite the potential anomaly of a single split (Table 3), the results from our comprehensive repeated evaluation (Table 4)—such as the high mean RPD of 5.29 for powdered alfalfa—provide strong, statistically sound evidence for the effectiveness and reliability of our optimal models.

 

In summary, we did not simply re-partition the sample set for one more experiment. Instead, we have leveraged our existing, more powerful experimental design (the 20 repetitions) to provide a robust statistical answer to your valid concern. We believe this approach not only confirms the reliability of our models but also highlights the importance of such rigorous validation protocols in chemometric studies. Thank you for prompting us to make this crucial clarification.

 

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

After revision, the manuscript became much clearer. The minor revision that has to be addressed is the design of the figures. Please check all of them carefully and change the captions to make them more meaningful, i.e.
- Fig. 3: what do "endmember", "unknown" and "Data Value" mean? What do the colors of the lines mean? Please label the ROIs.

- Figs. 4 and 5: please exclude the sign "ceaned spectra". What do the colors of the lines mean?

 - Fig. 7: where are "chemical absorption peaks"? To illustrate the selected spectral bands, please show the vertical lines instead of red dots.

- Fig. 8: what do "predicted" and "actual" mean?

- etc.