Peer-Review Record

Detection of Nutrients and Contaminants in the Agri-Food Industry Evaluating the Probabilities of False Compliance and False Non-Compliance Through PLS Models and NIR Spectroscopy

Appl. Sci. 2025, 15(9), 4808; https://doi.org/10.3390/app15094808

by David Castro-Reigía^1,2, Iker García^2, Silvia Sanllorente^1, María Cruz Ortiz^1 and Luis A. Sarabia^3,*

Reviewer 1: Anonymous

Reviewer 2: Anonymous

Reviewer 3: Anonymous

Submission received: 14 March 2025 / Revised: 23 April 2025 / Accepted: 24 April 2025 / Published: 26 April 2025

(This article belongs to the Special Issue Innovative Technologies in Food Detection—2nd Edition)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

Line 167: There is no citation for reference 41 in the manuscript. Please carefully check the content of the reference section.
Line 259: The specific parameters of the cross-validation method are not described. Please further elaborate on the details.
Table 2: It is recommended to provide RMSEP (Root Mean Square Error of Prediction) to reflect the model’s predictive ability, and to explain terms such as RMSEC.
Figure 5: The annotation of the CDβ value corresponding to r=3 in the figure is not clearly matched with the data in Table 4. This issue also exists in Figures 6, 7, and 8. It is suggested to mark the data in a more conspicuous way.
Table 4: The statement “CDβ for diflufenican in olives: 1.20 mg kg⁻¹” indicates that the detection capability of NIR-PLS (1.20 mg kg⁻¹) is higher than the maximum residue level permitted by regulations (0.6 mg kg⁻¹). This means that the detection capability of the method presented in this paper is not sufficient to meet the European Union’s detection standards. Therefore, please add an explanation of this aspect in the conclusion and provide directions for future optimization.
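
As an illustration of the Table 2 comment above, the difference between RMSEC and RMSEP can be sketched as follows. This is a minimal sketch with hypothetical fat contents that are not taken from the manuscript. It divides by the number of samples, whereas some chemometrics packages correct RMSEC for the degrees of freedom used by the model, so the authors' values may have been computed differently.

```python
import math


def rmse(reference, predicted):
    """Root mean square error between reference values and model predictions."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - r) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical fat contents (%), invented for illustration only.
cal_ref, cal_pred = [1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.1, 3.9]
test_ref, test_pred = [1.5, 2.5, 3.5], [1.7, 2.3, 3.8]

rmsec = rmse(cal_ref, cal_pred)    # error on the samples used to fit the model
rmsep = rmse(test_ref, test_pred)  # error on samples the model never saw
print(f"RMSEC = {rmsec:.3f}, RMSEP = {rmsep:.3f}")
```

RMSEC measures how well the model fits its own calibration samples, while RMSEP is computed on an independent test set and is therefore the figure that reflects predictive ability.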

Comments on the Quality of English Language

Line 34-35: The term “repercusses” in the introduction should be “repercussions.” Please correct the spelling error.
Line 63-64: “Patial Least Squares” should be “Partial Least Squares.” Please correct the spelling error.
Other language problems need to be corrected.

Author Response

"Please see the attachment."

Author Response File: Author Response.docx

Reviewer 2 Report

Comments and Suggestions for Authors

The article explores the application of NIR spectroscopy combined with PLS models for detecting nutrients and contaminants in agri-food products. It focuses on evaluating the capability of detection and discrimination, addressing probabilities of false compliance and false non-compliance, a novel addition to standard analytical methods. The methodology is applied to diverse food matrices, like milk, flour, yogurt, and olives, with real-world examples illustrating detection limits for substances such as fat and agrochemicals.

While the study successfully showcases the versatility of NIR-PLS in food quality control, it lacks clarity in presenting actionable conclusions for practical implementation in industries. Additionally, more attention could have been paid to discussing potential limitations, such as cost or equipment accessibility for smaller-scale operations. The dense technical language and complex statistical details may alienate non-specialist readers, limiting its broader applicability despite its scientific robustness.

Abstract:

The abstract is clear in presenting the relevance of NIR spectroscopy coupled with PLS models in food quality assessment, but it could be more structured and concise. While it highlights the novel evaluation of false compliance and non-compliance probabilities, the abstract would benefit from a clearer linkage between its objectives, methodologies, and key findings. For example, the specific results, such as detection capabilities for fat or agrochemicals, seem detailed but lack context regarding their broader significance. Refining the flow to emphasize the practical implications for the agri-food industry and reducing redundancy in technical specifics would improve readability and impact.

Introduction:

Line 34-40: This paragraph introduces the importance of food quality and NIR spectroscopy but could benefit from stronger clarity and precision. The phrase "a raising piece of food control" feels ambiguous and could be rephrased to something like "an increasingly critical aspect of food control." Additionally, mentioning the connection to human health and the economy is impactful, but the statement is too broad. Examples or statistics, such as the economic impact of food adulteration, would provide more grounding. Add specific examples or data to emphasize the significance of food quality control.
Line 41-62: These paragraphs provide a useful overview but are dense and lack flow. The bibliographic search results are informative but need better integration into the narrative. The statistics should be summarized concisely, and the relevance of the cited figures to the main argument should be clarified. Streamline the discussion of bibliographic research results and explicitly tie them to the importance of NIR spectroscopy in the agri-food industry.
Line 63-90: This paragraph is effective in discussing PLS models but could improve accessibility by reducing technical jargon or briefly explaining terms like "accuracy line" and "capability of detection." The sentence structure feels overly complex, which might alienate non-specialist readers. Simplify sentence structures and briefly define technical terms to make the content more accessible to a wider audience.
Line 91-103: This outlines the scope of the study but lacks focus on its broader implications. While the technical methodology is well described, a brief mention of its potential benefits for industry stakeholders or consumers would add value. Highlight the practical relevance of the methodology, such as its potential to improve compliance and reduce errors in food quality testing.

Materials and Methods:

Line 109-119: It provides a good overview of the experimental setup and food matrices but at the same time it feels cluttered and lacks a clear structure. The mention of different sampling conditions across industries is informative but could be streamlined. Additionally, the specific configuration details for the NIR reflectance measurements (e.g., wavelength range, spectral resolution) are irrelevant without the reasoning for the selection of these configurations. Highlight the adaptability of the methodology across industries without excessive technical detail. Specify why the variability in sampling conditions is relevant to the study's reliability or scope.
Line 123-125: This paragraph transitions well into the validation process, emphasizing the importance of certified reference methods. However, it could benefit from a stronger justification for the choice of these reference methods and the role of accredited laboratories in ensuring data accuracy. Include a brief rationale for why the certified reference methods were chosen for each food matrix. Discuss how using accredited laboratories contributes to the study's credibility and robustness.
Line 130-135: This paragraph provides valuable information about the tools and software used, but it is overly dense with technical jargon. Simplify the description of AONIR and MATLAB tools while focusing on their significance to the study. Briefly explain the relevance of programs like DETARCHI to the research without overwhelming technical specifics. Highlight how these tools enhance precision or efficiency in real-time monitoring.
Line 137-148: This passage could be improved by succinctly defining CCα and CCβ before introducing their multivariate counterparts, CDα and CDβ. Adding real-world applications for these detection figures in agri-food processes would enhance relevance. Integrating schematic representations earlier in the text would also help academically skilled readers to grasp these concepts easily.

Subsection 2.2.1. Decision limit and capability of detection at x0=0 or for a permitted limit, x0=PL with multivariate signals:

This subsection is overly complex, containing excessive mathematical details and lacking focus from the end-user perspective. It should be more like storytelling by incorporating relevant examples to engage the reader.

Line 151-154: The paragraph introduces the framework effectively, but the new figures of merit, CD-alpha, and CD-beta, are not clearly distinguished from the established CC-alpha and CC-beta for univariate calibrations. Additionally, the transition to multivariate extensions could be made smoother. Provide a concise definition for CD-alpha and CD-beta immediately after introducing them to enhance clarity. Use a comparative statement to highlight the differences between these terms and their univariate counterparts.
Line 155-166: This paragraph explains the regulatory definitions well but becomes dense due to the inclusion of multiple PL scenarios without concrete examples. The relationship between the mathematical framework and its practical relevance could also be clarified. Break down the scenarios (e.g., prohibited substances vs. authorized substances with PL) into subsections or bullet points. Add a brief example for each case to illustrate how these definitions are applied in real-world contexts.
Line 167-170: The explanation of generalizing CC-alpha and CC-beta to CD-alpha and CD-beta is detailed but dense with mathematical notation. The narrative could benefit from a more reader-friendly flow, especially since academically inclined readers may still prefer intuitive connections between concepts. Begin with a sentence summarizing why such generalizations are necessary. Integrate visual aids, such as a flowchart or schematic representation, to elucidate how CD-alpha and CD-beta are computed for multivariate calibrations.
Line 171-173: The hypothesis test framework is articulated but lacks concrete examples to anchor the theory. Terms such as “accuracy line” and “one-tailed hypothesis test” may require more context for seamless understanding. Include a specific example of a food parameter (e.g., fat content in milk) to demonstrate the one-tailed test process. Briefly describe the significance of the accuracy line in connecting the PLS predictions to real concentrations.
Line 174-188: The explanation of Equation (2) and related terms is thorough but presented in a dense manner, which might overwhelm readers. Moreover, the derivation of the non-centrality parameter is not tied to its practical implications. Simplify the explanation of Equation (2) by breaking it into smaller steps. Provide a brief interpretation of the non-centrality parameter and its influence on the hypothesis test outcomes.
Line 189-194: The paragraph effectively references Figure 2 but does not fully utilize the schematic to enhance understanding. Readers may need more context to interpret the visual representation. Clearly describe how Figure 2 illustrates the relationship between the accuracy line and the hypothesis test. Emphasize the practical interpretation of CD-alpha and CD-beta values in the context of minimizing false compliance and non-compliance risks. Same for the explanation of Figure 3.
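
To make the one-tailed test discussed in this subsection concrete, the decision limit and the capability of detection can be sketched as below. This is a simplified sketch, not the manuscript's Equation (2). It replaces the non-centrality parameter of the non-central t distribution with its large-sample normal approximation, and it assumes a leverage-type factor of the usual univariate form applied to the accuracy line. The function names, the calibration concentrations, the slope and the residual standard deviation are all hypothetical.

```python
import math
from statistics import NormalDist


def w_factor(x_cal, x0, replicates):
    """Leverage-type factor for a concentration estimated at x0 (assumed form)."""
    n = len(x_cal)
    mean = sum(x_cal) / n
    sxx = sum((x - mean) ** 2 for x in x_cal)
    return math.sqrt(1 / replicates + 1 / n + (x0 - mean) ** 2 / sxx)


def detection_limits(x_cal, x0, s_yx, slope, alpha=0.05, beta=0.05, replicates=1):
    """Return (CD_alpha, CD_beta) at x0 using a normal approximation."""
    if slope <= 0:
        raise ValueError("slope of the accuracy line must be positive")
    sd_x = w_factor(x_cal, x0, replicates) * s_yx / slope
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha)  # controls false non-compliance
    z_beta = z.inv_cdf(1 - beta)    # controls false compliance
    cd_alpha = x0 + z_alpha * sd_x
    cd_beta = x0 + (z_alpha + z_beta) * sd_x
    return cd_alpha, cd_beta


# Hypothetical calibration design and accuracy-line statistics.
x_cal = [0.0, 0.5, 1.0, 1.5, 2.0]
cd_a1, cd_b1 = detection_limits(x_cal, x0=0.0, s_yx=0.1, slope=1.0, replicates=1)
cd_a3, cd_b3 = detection_limits(x_cal, x0=0.0, s_yx=0.1, slope=1.0, replicates=3)
print(f"r=1: CD_alpha={cd_a1:.3f}, CD_beta={cd_b1:.3f}")
print(f"r=3: CD_alpha={cd_a3:.3f}, CD_beta={cd_b3:.3f}")
```

Under these assumptions, a predicted value above CD_alpha leads to a declaration of non-compliance, and CD_beta is the smallest true concentration that is detected with probability 1 - beta. Increasing the number of replicates shrinks both values.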

Subsection 2.2.2. Capability of discrimination:

Line 219-227: This paragraph provides a foundational understanding of sensitivity but lacks a clear connection between sensitivity and the capability of discrimination. The concept of "multivariate sensitivity" could be clarified for better contextual understanding. Introduce the distinction between sensitivity and discrimination early in the paragraph. Provide a brief real-world example (e.g., detecting trace agrochemicals) to relate sensitivity to practical analytical challenges.
Line 228-230: The paragraph effectively sets up the hypothesis test for compliance versus non-compliance but is text-heavy with mathematical notation. This may make it difficult for readers to follow. Use a flowchart or schematic to visually represent the two-tailed hypothesis test. Highlight the practical implications of false compliance and non-compliance in real-world scenarios.
Line 231-232: This paragraph clearly explains the bifurcation of decision limits but feels repetitive due to the extensive inclusion of equations. There is little focus on broader significance. Minimize repetition by summarizing key equations and directing readers to the relevant figure (Figure 4). Elaborate on how decision limits impact analytical decision-making in quality control processes.
Line 237-242: The paragraph effectively references Figure 4, but the explanation assumes a high degree of familiarity with the schematic's elements. The connection between Figure 4 and the text could be reinforced. Provide a step-by-step walkthrough of Figure 4 to connect its elements (e.g., CD-alpha-1, CD-alpha-upper) to the text. Emphasize how the schematic simplifies the two-tailed hypothesis test for readers.
Line 254-263: While the sequential steps for building the PLS model are detailed, it could benefit from a more concise structure. Repetition (e.g., references to outliers and residual thresholds) makes it harder to follow. Include a brief rationale for the preprocessing techniques used (e.g., why SNV or derivatives are applied) and explain their impact on calibration performance (a good reference would be https://doi.org/10.1016/j.trac.2009.07.007). Highlight the relevance of cross-validation and the identification of outliers to ensure robustness. Briefly explain how removing outliers improves model accuracy and provides a clear transition to the next step in the procedure.
Line 264-267: The paragraph effectively explains the construction of the accuracy line but does not emphasize its significance in the overall context. The mention of Table 3 feels abrupt and could be better integrated. Highlight the purpose of the accuracy line (e.g., linking predicted and true concentrations to validate the model). Add a sentence describing how the data in Table 3 supports the process.
Line 268-270: This paragraph concludes the procedure but lacks depth in explaining the practical implications of calculating CC-alpha and CC-beta. It also assumes the reader is already familiar with their significance in real-world applications. Provide a sentence summarizing why these calculations are important for food quality assurance or regulatory compliance. Suggest linking the results to examples from the study (e.g., specific analytes or matrices).
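
The rationale requested above for SNV preprocessing (comment on lines 254-263) can be illustrated with a short sketch. Standard normal variate centres and scales each spectrum individually, so additive offsets and multiplicative scaling between spectra, which in NIR reflectance are typically attributed to scatter, are removed before calibration. The absorbance values below are hypothetical, and the sample standard deviation (n - 1 denominator) is an assumption about the convention used.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: centre and scale a single spectrum."""
    m = mean(spectrum)
    s = stdev(spectrum)  # sample standard deviation (n - 1 denominator)
    if s == 0:
        raise ValueError("a constant spectrum cannot be scaled")
    return [(a - m) / s for a in spectrum]


# Hypothetical absorbances; the second spectrum is the first one with a
# multiplicative and an additive distortion applied.
raw = [0.10, 0.20, 0.40, 0.30]
distorted = [2.0 * a + 0.5 for a in raw]

snv_raw = snv(raw)
snv_distorted = snv(distorted)
print([round(v, 3) for v in snv_raw])
print([round(v, 3) for v in snv_distorted])
```

Both printed spectra coincide, which is the property that makes SNV useful before PLS calibration: two measurements of the same sample that differ only by offset and scale become identical.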

Results:

Line 276-281: This paragraph introduces the calibration process, but it lacks explicit emphasis on the importance of preprocessing methods like SNV and derivatives. Briefly explain how preprocessing steps influence signal quality and the reliability of the PLS calibration. Add context on why mean centering is crucial for multivariate analyses. Consider adding a plot of the spectral data before and after preprocessing, with a discussion of how the data improved. Include justification for the selection of the number of LVs. Explain why additional LVs are included in the context of your data (what kind of additional variability, why it is 3 for one analyte and 7 for another). Do those additional LVs cover additional inherent properties of analytes or some process variations during data acquisition? Also, there is no mention of wavelength selection in PLS model calibration. Add sufficient details to reproduce the results in Table 2 from the spectral data.
Line 286-290: The paragraph effectively describes the invariance of decision limits and detection capabilities but could better connect this concept to the methodology's practical implications. The reference to Table 2 is abrupt and could use a smoother integration. Provide a concise explanation of how invariance strengthens the analytical reliability in real-world scenarios. Discuss briefly how the calibration approach enhances consistency across various food matrices.
Line 290-294: This paragraph discusses the accuracy lines with sufficient technical depth, but it could better contextualize the significance of p-values in Table 3. The impact of regression statistics on confidence intervals is underexplored. Explain how significant p-values validate the accuracy line and emphasize their importance in ensuring robust compliance measures. Relate these statistics to broader industrial applications.
Table 3: The analytes' ranges of values are not mentioned. The Syx data cannot be interpreted without them: a Syx of 0.29 is far more acceptable for an analyte range of 10-20% than for one of 1-2%. These ranges appear in Table 4, but they should be given somewhere before Table 3 (perhaps in Table 1). Include parity plots of predicted vs. actual analyte values, either in the main text or in the supplement.
Line 302-309: Emphasize the practical importance of distinguishing between false compliance and false non-compliance in regulatory scenarios. Strengthen the linkage between the analytical sensitivity and its implications for regulatory assurance. Include a brief mention of how the methodology can be adapted for diverse matrices beyond the agri-food industry.
Line 310-317: The connection between the Operative Characteristic Curves and real-world applications is underexplored. Elaborate on how the curves demonstrate improved reliability in distinguishing analyte concentrations. Include specific examples to connect the results to practical industry applications, such as detecting agrochemical residues.
Line 317-322: The technical explanation could be enriched with broader implications. Explain why the increased replicates improve accuracy, and discuss the cost-benefit trade-offs of performing additional replicates. A brief mention of how this applies to large-scale industrial processes would enhance the practical context.
Line 323-329: The narrative is repetitive and does not emphasize the significance of the interval or probabilities in practical terms. Focus on the importance of the calculated intervals in ensuring compliance and reducing regulatory risks. Highlight potential limitations of the methodology, such as sensitivity to matrix variability.
Line 333-345: Include a discussion on challenges in calibrating minimum limits across varying matrices. Compare the results for different matrices in Table 4 and discuss the potential challenges in ensuring reproducibility and consistency across food industries.
Line 349-362: Discuss the broader utility of detecting contaminants at zero level (PL = 0). The differentiation between false positives and false negatives in such cases could be clearer. Expand on the significance of zero-level detection for consumer safety and regulatory compliance. Discuss how this capability can be extended to other industries requiring high sensitivity, like pharmaceuticals.
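
The comments above on the operating characteristic curves and on the effect of replicates (lines 310-322) can be illustrated with a small sketch. It evaluates the probability of false compliance as a function of the true concentration, again under a normal approximation rather than the non-central t distribution. The permitted limit of 0.6 mg kg⁻¹ is the value quoted by Reviewer 1 for diflufenican. The two standard deviations are hypothetical and merely stand for a less precise and a more precise measurement, for instance with fewer or more replicates.

```python
from statistics import NormalDist


def false_compliance_probability(x_true, x0, sd_x, alpha=0.05):
    """P(sample declared compliant | true concentration x_true), normal approx."""
    if sd_x <= 0:
        raise ValueError("sd_x must be positive")
    z = NormalDist()
    decision_limit = x0 + z.inv_cdf(1 - alpha) * sd_x
    return z.cdf((decision_limit - x_true) / sd_x)


permitted_limit = 0.6  # mg/kg, value quoted in the Reviewer 1 report
for sd in (0.2, 0.1):  # hypothetical standard deviations of the estimate
    curve = [
        (x, false_compliance_probability(x, permitted_limit, sd))
        for x in (0.6, 0.8, 1.0, 1.2)
    ]
    print(sd, [(x, round(p, 3)) for x, p in curve])
```

Each printed row is one operating characteristic curve sampled at four concentrations. At the permitted limit itself the probability equals 1 - alpha by construction, and it falls faster with concentration when the standard deviation of the estimate is smaller.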

Discussions:

Line 370-377: While the paragraph highlights the relevance of NIR spectroscopy and PLS models, the connection between these methodologies and the novel contributions of the current study could be clearer. The distinction between existing approaches (e.g., LOD/LOQ) and the introduced concepts of false compliance and non-compliance needs greater emphasis. Reframe the discussion to explicitly outline the innovative aspects of the proposed methodology. Include examples from the study (e.g., diflufenican detection) to illustrate practical implications more concretely.
Line 378-403: This introduces the methodological novelty well, but the explanation regarding hypothesis tests could benefit from simplification. Dense technical language may hinder accessibility, even for a fairly expert audience in the agri-food industry. Use a concise summary of the hypothesis test types (e.g., one-tailed versus two-tailed) and focus on how they enhance the reliability of compliance assessments. Avoid redundancy when describing intervals and density functions.
Line 404-415: This paragraph effectively contextualizes the study's findings but lacks focus on broader industry implications. The discussion on NIR signals and food matrix variability could be expanded. Elaborate on how variability across food matrices affects the application of the methodology. Link these findings to potential benefits or limitations in agri-food quality control.
Line 416-432: This introduces the influence of food matrix characteristics on results but is overly technical, making it challenging to grasp its practical relevance. The cost-analysis discussion is introduced but remains superficial. Simplify the explanation of analytical sensitivity (alpha/beta) and relate it to real-world challenges like cost-efficiency or scalability. Expand on how these findings could inform future agri-food process optimization.
Line 433-439: The paragraph effectively introduces the economic implications of false compliance and non-compliance but needs more depth in describing decision-making frameworks. Discuss how the probabilistic approach can guide industry stakeholders in balancing cost and risk. Provide a specific example illustrating the calculation of expected costs using the proposed methodology.
Line 441-446: This paragraph outlines the importance of calibration sample distribution but could emphasize the challenges of balancing experimental effort and analytical quality. The mention of Wx factors is technical but lacks an introductory explanation for clarity. Briefly explain the role of Wx factors in calibration and their impact on CDα and CDβ values. Highlight practical strategies to optimize calibration design while minimizing costs, such as adaptive sample selection techniques.
Line 446-451: The paragraph mentions the cost and time required for calibration sample preparation but does not offer concrete solutions to address these challenges. It assumes familiarity with off-line and at-line sampling without sufficient context. Propose specific methods, such as leveraging synthetic data or in-line calibration techniques, to reduce experimental effort. Discuss the trade-offs between experimental costs and calibration quality.
Line 451-454: Predictive maintenance is an important concept, but its implications for long-term NIR system efficiency are underexplored. The connection between cost reduction and maintaining calibration accuracy could be clarified. Include examples of predictive maintenance strategies, such as periodic recalibration or automated signal monitoring. Explain how these approaches ensure consistent performance over time while keeping costs low.
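
The request above for a worked expected-cost example (comment on lines 433-439) could be met along the following lines. This is only a sketch of a generic expected-value calculation and not the cost model of the manuscript. The proportion of non-compliant samples, the two unit costs and the function name are all hypothetical, and beta is treated as a single fixed probability rather than a function of concentration.

```python
def expected_cost(p_noncompliant, alpha, beta,
                  cost_false_noncompliance, cost_false_compliance):
    """Expected cost of decision errors per analysed sample."""
    for p in (p_noncompliant, alpha, beta):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    # A compliant sample wrongly rejected (probability alpha).
    cost_alpha = (1.0 - p_noncompliant) * alpha * cost_false_noncompliance
    # A non-compliant sample wrongly accepted (probability beta).
    cost_beta = p_noncompliant * beta * cost_false_compliance
    return cost_alpha + cost_beta


# Hypothetical scenario: 2% of samples are truly non-compliant, rejecting a
# good lot costs 10 units and releasing a bad lot costs 500 units.
balanced = expected_cost(0.02, alpha=0.05, beta=0.05,
                         cost_false_noncompliance=10.0,
                         cost_false_compliance=500.0)
lenient = expected_cost(0.02, alpha=0.01, beta=0.20,
                        cost_false_noncompliance=10.0,
                        cost_false_compliance=500.0)
print(f"balanced: {balanced:.2f}, lenient: {lenient:.2f}")
```

Comparing scenarios in this way shows how a stakeholder might trade a lower probability of false non-compliance against a higher probability of false compliance once the two errors are priced.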

Author Response

"Please see the attachment."

Author Response File: Author Response.docx

Reviewer 3 Report

Comments and Suggestions for Authors

The study demonstrates the quantitative determination of specific analytes using PLS models based on NIR spectra across various relevant food matrices (butter, flour, milk, yogurt, oil, olives). After developing sufficiently effective PLS models, hypothesis testing is applied to statistically calculate the threshold values at which the probability of false positive or false negative results can be reduced to the desired level. Only some minor revisions need to be addressed to clarify some questions.

The 6 nm is the pixel resolution. The spectral resolution is likely to be approx. 12 nm, since at least three consecutive wavelengths are needed to differentiate two peaks.
How does the FTIR spectroscopic method work for the determination of fat and protein content in the case of milk? Is it a gold standard?
What was the format of NIR spectra from the AONIR instrument to import data into the PLS Toolbox?
What was the applied cross-validation method for PLS models?
In Table 2, the RMSE and RMSECV values for diflufenican and piretrin content are not in percentage but rather in mg kg⁻¹.
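
Regarding the question above about the cross-validation method, the kind of detail being requested can be illustrated with one common splitting scheme, venetian blinds, in which every k-th sample is held out in turn. This is an assumption chosen for illustration. The report does not state which scheme the authors used, and the function name and sample counts are hypothetical.

```python
def venetian_blinds(n_samples, n_splits):
    """Return (train_indices, test_indices) pairs, one pair per split."""
    if not 2 <= n_splits <= n_samples:
        raise ValueError("need 2 <= n_splits <= n_samples")
    folds = []
    for k in range(n_splits):
        # Every n_splits-th sample, starting at position k, is held out.
        test = list(range(k, n_samples, n_splits))
        train = [i for i in range(n_samples) if i % n_splits != k]
        folds.append((train, test))
    return folds


folds = venetian_blinds(n_samples=10, n_splits=3)
for train, test in folds:
    print("held out:", test)
```

Reporting the scheme, the number of splits and whether replicate spectra of the same sample were kept together is what allows RMSECV values to be reproduced.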

Typos:

Line 52 Yogurth → Yogurt

Line 63 Patial → Partial

Line 505 ISO 21543:2006 → ISO 21543:2020

Author Response

"Please see the attachment."

Author Response File: Author Response.docx

Round 2

Reviewer 2 Report

Comments and Suggestions for Authors

The revised version (v2) is improved in terms of readability by addressing the comments in the previous version.

There is a minor typo that needs to be fixed. At line 204, the residual standard deviation (sigma_hat) symbol is typed twice.

Author Response

Comment 1: The revised version (v2) is improved in terms of readability by addressing the comments in the previous version.

There is a minor typo that needs to be fixed. At line 204, the residual standard deviation (sigma_hat) symbol is typed twice.

Response 1: Thank you very much for pointing out this typo. It has been fixed in the new version.
