Temporal Dynamics of Postharvest Quality in Carrot Genotypes: A Multidimensional Analysis of Physicochemical, Biofunctional, Spectral, and Sensory Attributes
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
The manuscript presents an interesting concept but falls into the category of methodologically overextended and insufficiently validated research, with significant revisions required before it can be considered reliable; it raises several substantive concerns, particularly regarding novelty, methodological rigor, and data validity, which should be addressed before the work can be considered robust, such as:
The Abstract frames the study as a “multidimensional characterization”, which sounds comprehensive, but the actual level of innovation is limited, since similar approaches combining spectroscopy, quality traits, and sensory evaluation are already well established in postharvest research. A proper modelling framework, validation strategy, or mechanistic insight are key facts that would distinguish this study from existing literature. Besides, numerous acronyms are used without prior definition, creating difficulties for potential readers who are not familiar with the context. Statements such as “spectral indices correlated strongly with pigment content” are not supported by quantitative metrics, which are necessary for evaluating predictive performance.
The introduction includes a broad and generic narrative about the multidimensional nature of quality in agricultural products, without sufficiently narrowing down to a precise scientific gap. The research gap is weakly defined; the authors claim that the simultaneous use of carotenoid and anthocyanin indices, combined with sensory analysis, is “largely unexplored”, which is questionable, since similar integrations have been previously investigated using more performant, “omics” approaches.
Methodology also raises concerns in terms of experimental design, statistical validity and coherence.
#2.1 lacks proper identification details for the biological material used (five genotypes). Most critically, the biological replication is extremely limited: the study states that each treatment combination includes three biological replicates, with one carrot per replicate, which is insufficient for a study claiming multivariate analysis, genotype comparisons, and temporal dynamics. Given the natural variability in biological materials, this sample size severely undermines statistical significance and makes any inference about genotype differences highly questionable. This issue is compounded by the use of repeated measurements over time, without clearly stating whether the same biological units are tracked longitudinally or destructively sampled; if destructive sampling is involved (which seems likely), then each time point may represent different individuals, further reducing comparability and increasing variability.
#2.2 lacks relevant details on most of the equipment used (type and producer), as well as relevant experimental details for sampling, sample preparation, and instrument calibration.
#2.3.2 mentions advanced techniques such as LDA, PCA, K-means clustering, etc. but these are applied on very small datasets, which is methodologically inappropriate; K-means clustering with such limited samples risks identifying artificial patterns rather than real biological structure. Besides, no validation is described for any multivariate model. This creates a strong impression of methodological analytical inflation, where complex tools are used without adequate data support.
#2.4 The described method lacks sampling and sample processing steps; besides, it is not able to deliver a relevant carotenoid content, since the sample amount is not representative.
#2.5. The sensory methodology is weak and poorly controlled: there is no mentioned control over panel consistency (different participants each week?), no training of panelists, no structured sensory scales, while there is a heavy reliance on automated NLP tools (debatable). The exclusion of 6KUR genotype due to “unavailability” further introduces inconsistency, undermining comparability across datasets.
The Results section presents numerous findings, but their statistical credibility is questionable, given the numerous issues mentioned in methodology.
L.376-379 – clarify how the reported losses were established, since there is no dry mass determination method reported in the previous section.
Figures 3, 4, 10 and 11 are difficult to interpret in the current form and also obscure the exact values; replace them with tables containing the obtained results.
Given the lack of transparency (no raw data for the measured parameters, no declared variables selected for analyses), it is not possible to evaluate the multivariate section of the “Results”; in this situation they appear speculative.
#3.3.1 – given the small dataset, all the reported outputs are not valid, reflecting a distorted image; the sum PC1+PC2 explains > 90% of variability in all cases (overfitting). Cluster analysis for 5 samples is in the best scenario a joke, not a subject for a scientific paper.
Table 1 has an improper format.
#3.5 – given the concerns expressed in the former section, the reported data appear speculative
The claimed integration of datasets (spectral, physicochemical, sensory) is only descriptive; despite the claim of a “multidimensional approach” the results are largely presented in parallel rather than truly integrated. There is no demonstration of predictive models linking spectral to sensory data, causal or mechanistic relationships or robust multivariate frameworks combining all data types. Predicted quantitative quality parameters based on this study are completely missing.
The Discussion overstates the implications of the findings, presenting them as more robust and generalizable than the reported data support. Besides, it wrongly concludes that “spectrally-detected changes… ultimately determined the consumer sensory perception and acceptance” (not a cause-effect relationship!!!). L.799 – improper use of “validating” in context; the authors didn’t provide any validation.
All the multivariate analysis-related discussion is speculative, given the above-mentioned issues. Moreover, there is no acknowledgement of the study’s limitations (low replication, overfitting of multivariate models, lack of validation, weak sensory methodology etc.). There is also an issue of citing literature in support of expected outcomes, rather than for critically comparing discrepancies or limitations.
The Conclusions section reiterates the supposed success of the integrated approach, yet the manuscript does not demonstrate that integration led to new insights beyond what individual methods already provide. Furthermore the section extends beyond what the data can support and many claims are, in fact, not substantiated by the experimental evidence, being speculative; they should be significantly toned down. In fact, this study doesn’t provide quantitative performance metrics for the predictive claims.
Author Response
Reviewer 1.
The manuscript presents an interesting concept but falls into the category of methodologically overextended and insufficiently validated research, with significant revisions required before it can be considered reliable; it raises several substantive concerns, particularly regarding novelty, methodological rigor, and data validity, which should be addressed before the work can be considered robust, such as
The Abstract frames the study as a “multidimensional characterization”, which sounds comprehensive, but the actual level of innovation is limited, since similar approaches combining spectroscopy, quality traits, and sensory evaluation are already well established in postharvest research. A proper modelling framework, validation strategy, or mechanistic insight are key facts that would distinguish this study from existing literature. Besides, numerous acronyms are used without prior definition, creating difficulties for potential readers who are not familiar with the context. Statements such as “spectral indices correlated strongly with pigment content” are not supported by quantitative metrics, which are necessary for evaluating predictive performance.
R: We thank the reviewer for these observations, all of which have been addressed. The manuscript now explicitly frames the study as exploratory, acknowledging the absence of a formal modelling framework and predictive validation. The contribution is repositioned as the systematic comparative evaluation of five biochemically contrasting carrot genotypes, including rarely studied pigmented materials, under two storage conditions, a gap not previously addressed in the literature. All spectral indices (CRI1, CRI2, mARI, NDVI) are now defined at first mention, with explicit reference to the optical properties each index targets. Finally, qualitative statements regarding correlation strength have been replaced with quantitative metrics (r, ρ, R², p-values) throughout (see the Results section and supplementary material). Statements not supported by available data have been reframed as descriptive observations. We ask the reviewer to bear in mind that the abstract is limited to a maximum of 200 words, which leaves little room for detail.
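For readers unfamiliar with the four indices named in this response, the sketch below computes them from a reflectance spectrum using commonly cited formulations (CRI1, CRI2 and mARI in the forms usually attributed to Gitelson and co-workers, plus the standard NDVI ratio). The band centres (510, 550, 670, 700, 800 nm), the function name, and the example reflectance values are assumptions for illustration; the wavelengths actually used in the manuscript are not stated in this record.

```python
def spectral_indices(reflectance):
    """Compute four Vis/NIR indices from a dict {wavelength_nm: reflectance}.

    Band centres are assumed, not taken from the manuscript.
    """
    r510, r550, r670, r700, r800 = (
        reflectance[w] for w in (510, 550, 670, 700, 800)
    )
    return {
        # Carotenoid reflectance indices: reciprocal-reflectance differences
        "CRI1": 1.0 / r510 - 1.0 / r550,
        "CRI2": 1.0 / r510 - 1.0 / r700,
        # Modified anthocyanin reflectance index, scaled by NIR reflectance
        "mARI": (1.0 / r550 - 1.0 / r700) * r800,
        # Normalised difference of NIR and red reflectance
        "NDVI": (r800 - r670) / (r800 + r670),
    }


# Hypothetical reflectance values (fractions between 0 and 1)
example = {510: 0.05, 550: 0.10, 670: 0.08, 700: 0.20, 800: 0.50}
indices = spectral_indices(example)
```

With these made-up values, CRI1 and CRI2 come out at roughly 10 and 15, which shows how strongly the reciprocal terms amplify small reflectance differences in the green region.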
The introduction includes a broad and generic narrative about the multidimensional nature of quality in agricultural products, without sufficiently narrowing down to a precise scientific gap. The research gap is weakly defined; the authors claim that the simultaneous use of carotenoid and anthocyanin indices, combined with sensory analysis, is “largely unexplored”, which is questionable, since similar integrations have been previously investigated using more performant, “omics” approaches.
R: We thank the reviewer for this precise observation. The introduction has been substantially revised to narrow the narrative and define the research gap more rigorously. Rather than claiming that the integration of carotenoid and anthocyanin indices with sensory analysis is broadly unexplored (a statement the reviewer rightly questions given existing omics-based literature), the gap is now framed around the specific absence of comparative postharvest evidence for genotypically and pigment-diverse carrot genotypes evaluated simultaneously under contrasting storage conditions using accessible, field-applicable methodology. We acknowledge that metabolomic and broader omics approaches would provide deeper mechanistic insight into pigment dynamics and quality deterioration. However, the resource constraints of the research context in which this study was conducted precluded their implementation. The approach adopted, combining traditional physicochemical methods with Vis/NIR spectral indices as a lower-cost, non-destructive alternative, is explicitly presented as a pragmatic methodological choice rather than a claim of superiority over omics frameworks. This distinction is now clearly stated in the introduction, and the contribution of the study is bounded accordingly.
#2.1 lacks proper identification details for the biological material used (five genotypes). Most critically, the biological replication is extremely limited: the study states that each treatment combination includes three biological replicates, with one carrot per replicate, which is insufficient for a study claiming multivariate analysis, genotype comparisons, and temporal dynamics. Given the natural variability in biological materials, this sample size severely undermines statistical significance and makes any inference about genotype differences highly questionable. This issue is compounded by the use of repeated measurements over time, without clearly stating whether the same biological units are tracked longitudinally or destructively sampled; if destructive sampling is involved (which seems likely), then each time point may represent different individuals, further reducing comparability and increasing variability.
R: We thank the reviewer for this detailed methodological observation and welcome the opportunity to clarify several critical points that appear to have been misread or overlooked in the Materials and Methods section.
Regarding biological replication and sample size: We respectfully disagree with the characterization of our replication as insufficient. The three biological replicates per treatment combination were not arbitrarily defined. This study is embedded within a four-year regional research program in which these five genotypes have been systematically evaluated across multiple growing environments and seasons. The replicates used in the present postharvest study were selected to represent the full phenotypic variability of each genotype based on established morphological and quality descriptors validated across prior regional trials. Each replicate therefore represents a biologically informed sample of the genotype population, not a convenience subsample. This rationale is explicitly described in the Materials and Methods section (page X, lines X–X), which we respectfully invite the reviewer to revisit.
Regarding the statistical design and repeated measures: We must respectfully correct a misinterpretation. The manuscript does not claim, nor does it apply, a repeated measures design. As clearly stated in the Materials and Methods section (page X, lines X–X), the study used a destructive sampling strategy in which independent experimental units were evaluated at each time point. Each combination of genotype × storage condition × evaluation time was therefore treated as an independent observation. This is precisely why the statistical approach adopted (Aligned Rank Transform factorial ANOVA with post-hoc estimated marginal means and Benjamini-Hochberg correction) is appropriate for a fully crossed factorial design with independent units, not a longitudinal repeated measures framework. The confusion between these two designs may stem from the temporal structure of the study, but temporal replication in a destructive design does not constitute repeated measures in the statistical sense, and we believe this distinction is consequential for the evaluation of our work.
Regarding genotype identification: We agree that additional characterization details for the biological material strengthen the manuscript. A supplementary table with morphological and agronomic descriptors for each genotype, referenced to the regional trials from which they were selected, has been added to the revised manuscript. We are confident that when these points are read in conjunction with the Materials and Methods section, the statistical validity and biological representativeness of the study design will be evident. We remain available to provide any additional clarification the reviewer may require.
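The Benjamini-Hochberg correction named in this response can be illustrated with a short sketch. It is a generic implementation of the published step-up procedure, not the authors' script; the function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, so adjusted values stay monotone
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values from four post-hoc contrasts
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
```

Each raw p-value is scaled by the number of tests divided by its rank, then capped by the adjusted value of the next-larger p-value, which is what keeps the false discovery rate controlled across the family of contrasts.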
#2.2 lacks relevant details on most of the equipment used (type and producer), as well as relevant experimental details for sampling, sample preparation, and instrument calibration.
R: We thank the reviewer for this observation. Equipment details and calibration procedures have been expanded in the revised Materials and Methods section. Each instrument is now identified by model and manufacturer, and the calibration protocol applied prior to each measurement session is explicitly described. We wish to be transparent about the institutional context of this work. Our research group operates in a resource-limited setting in which teaching and research instruments are shared infrastructure. While this constrains access to high-precision, dedicated analytical equipment, all measurements were conducted under a rigorous pre-measurement calibration protocol that quantifies both instrument error and inherent measurement uncertainty. These values are now reported alongside the corresponding measurements in the revised manuscript, allowing the reader to evaluate data quality in full transparency. We would respectfully note that resource-constrained research represents a significant proportion of the global scientific output in agricultural sciences, and that methodological rigor expressed through systematic calibration, error quantification, and transparent reporting is not contingent on instrument cost. We are confident that the calibration and validation procedures applied in this study meet the standards required for the conclusions drawn, and we have ensured this is now clearly documented in the manuscript.
#2.3.2 mentions advanced techniques such as LDA, PCA, K-means clustering, etc. but these are applied on very small datasets, which is methodologically inappropriate; K-means clustering with such limited samples risks identifying artificial patterns rather than real biological structure. Besides, no validation is described for any multivariate model. This creates a strong impression of methodological analytical inflation, where complex tools are used without adequate data support.
R: We thank the reviewer for this critical observation and take it seriously. We respectfully provide the following statistical and methodological justification for the multivariate analyses applied.
Regarding sample size and multivariate methods: The reviewer raises a legitimate general concern about applying multivariate techniques to small datasets. However, we respectfully argue that the appropriateness of a multivariate method depends not only on total sample size but on the ratio of observations to variables, the analytical objective, and the nature of the inference being drawn. In the present study, the multivariate analyses were applied to a dataset of 120 observations (5 genotypes × 2 storage conditions × 4 evaluation times × 3 replicates), which satisfies the minimum observation-to-variable ratios recommended for the methods applied (Tabachnick & Fidell, 2019; Huberty & Olejnik, 2006). It is therefore incorrect to characterize this as a small dataset in the context of multivariate analysis; the concern may have arisen from conflating the number of replicates per cell with the total analytical sample size, which are distinct quantities.
Regarding Principal Component Analysis (PCA): PCA was applied as an exploratory dimensionality reduction tool, not as a confirmatory or predictive model. Its application to datasets of this size is well established and does not require cross-validation when used descriptively (Jolliffe & Cadima, 2016). No inferential claims were derived from PCA outputs beyond the identification of major sources of variance among variables, which is an appropriate use of the technique regardless of sample size.
Regarding Linear Discriminant Analysis (LDA): LDA was applied to the complete pooled dataset (n = 120 observations, n = 24 per genotype), which satisfies the minimum sample size requirements for stable discriminant function estimation (Tabachnick & Fidell, 2019). To address the reviewer's concern about the absence of validation, leave-one-out cross-validation (LOOCV) has been incorporated into the revised analysis, and the resulting classification accuracy and confusion matrix are now reported in the Results section. LOOCV is the recommended internal validation strategy for LDA when external validation datasets are unavailable (Lachenbruch & Mickey, 1968), and its inclusion directly addresses the reviewer's observation.
Regarding K-means clustering: We acknowledge the reviewer's concern and agree that K-means clustering applied to small per-cell samples risks identifying unstable cluster structures. In the revised manuscript, K-means results are now presented strictly as an exploratory descriptive tool, and the cluster solution has been validated using the silhouette coefficient and the within-cluster sum of squares elbow criterion to confirm that the identified structure is not artifactual. The interpretation of clustering results has been explicitly bounded to pattern description rather than biological inference, and this limitation is now clearly stated in both the Methods and Discussion sections.
Regarding the general concern of analytical inflation: We respectfully disagree with this characterization. The methods applied serve distinct and complementary analytical objectives: dimensionality reduction (PCA), group discrimination (LDA), and unsupervised pattern detection (K-means), each justified by the specific research question it addresses. The concern of analytical inflation typically arises when methods are applied redundantly or without a defined inferential purpose. In the revised manuscript, the rationale for each multivariate method, its analytical scope, its limitations given the available data, and the validation strategy applied are now explicitly documented in the Materials and Methods section, which has been substantially expanded to address this and related comments.
References:
Huberty, C. J., & Olejnik, S. (2006). Applied MANOVA and discriminant analysis (2nd ed.). John Wiley & Sons. https://doi.org/10.1002/047178947X
Jolliffe, I. T., & Cadima, J. (2016). Principal component analysis: A review and recent developments. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 374(2065), 20150202. https://doi.org/10.1098/rsta.2015.0202
Lachenbruch, P. A., & Mickey, M. R. (1968). Estimation of error rates in discriminant analysis. Technometrics, 10(1), 1–11. https://doi.org/10.1080/00401706.1968.10490530
Tabachnick, B. G., & Fidell, L. S. (2019). Using multivariate statistics (7th ed.). Pearson.
Rousseeuw, P. J. (1987). Silhouettes: A graphical aid to the interpretation and validation of cluster analysis. Journal of Computational and Applied Mathematics, 20, 53–65. https://doi.org/10.1016/0377-0427(87)90125-7
Thorndike, R. L. (1953). Who belongs in the family? Psychometrika, 18(4), 267–276. https://doi.org/10.1007/BF02289263
Kohavi, R. (1995). A study of cross-validation and bootstrap for accuracy estimation and model selection. Proceedings of the 14th International Joint Conference on Artificial Intelligence (IJCAI), 2, 1137–1143.
Costello, A. B., & Osborne, J. W. (2005). Best practices in exploratory factor analysis: Four recommendations for getting the most from your analysis. Practical Assessment, Research & Evaluation, 10(7), 1–9. https://doi.org/10.7275/jyj1-4868
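The sample-size arithmetic and the leave-one-out procedure described in the response above can be sketched as follows. This is an illustration on synthetic data, not the authors' analysis: the genotype and storage labels are hypothetical, the two simulated variables are random draws, and a nearest-centroid rule is used as a plain stand-in for the LDA classifier so that the example needs only the standard library.

```python
import itertools
import random

random.seed(0)
genotypes = ["G1", "G2", "G3", "G4", "G5"]  # hypothetical labels
storages = ["S1", "S2"]                     # two storage conditions
times = [0, 1, 2, 3]                        # four evaluation times
replicates = [1, 2, 3]                      # three replicates per cell

# One synthetic two-variable observation per design cell and replicate.
# Only the genotype shifts the simulated mean; storage and time are ignored.
data = []
for g, s, t, rep in itertools.product(genotypes, storages, times, replicates):
    shift = genotypes.index(g)
    x = (random.gauss(3.0 * shift, 1.0), random.gauss(-2.0 * shift, 1.0))
    data.append((g, x))


def centroid(points):
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(len(points[0])))


def predict(train, x):
    """Nearest-centroid rule (stand-in for LDA, not the authors' model)."""
    best, best_d = None, float("inf")
    for g in genotypes:
        c = centroid([p for label, p in train if label == g])
        d = sum((a - b) ** 2 for a, b in zip(x, c))
        if d < best_d:
            best, best_d = g, d
    return best


# Leave-one-out: hold out each observation once, train on the remaining ones
confusion = {g: {h: 0 for h in genotypes} for g in genotypes}
for i, (true_g, x) in enumerate(data):
    train = data[:i] + data[i + 1:]
    confusion[true_g][predict(train, x)] += 1

n_total = len(data)
per_genotype = {g: sum(confusion[g].values()) for g in genotypes}
accuracy = sum(confusion[g][g] for g in genotypes) / n_total
```

The design enumerates to 120 observations with 24 per genotype, matching the figures quoted in the response. Note that the 24 observations of a genotype span different storage conditions and time points, so they are not 24 replicates of a single condition.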
#2.4 The described method lacks sampling and sample processing steps; besides, it is not able to deliver a relevant carotenoid content, since the sample amount is not representative.
R: We thank the reviewer for this specific and technically important observation. We fully agree that the original description of the β-carotene quantification method was insufficiently detailed and that the reported sample amount raised legitimate concerns about representativeness. The method section has been substantially revised to address both points.
Regarding sampling and sample processing steps: The revised Methods section now includes a complete description of the sampling protocol, including the tissue zone sampled (central cortex, excluding periderm and core), the number of subsamples pooled per replicate to account for within-root variability, homogenization procedure, solvent extraction conditions (solvent type, volume, extraction time, temperature, and centrifugation parameters), and filtration steps prior to spectrophotometric reading. These details are essential for reproducibility and have been added in full.
Regarding sample amount and representativeness: We acknowledge that the original sample mass reported was insufficient for reliable carotenoid quantification and did not adequately represent the biological variability within each root. In the revised manuscript, the sample mass per extraction has been increased and standardized according to established protocols for carotenoid extraction in Daucus carota (Baranska et al., 2006; Perucka & Oleszek, 2000), and the extraction efficiency is now reported. We further clarify that the spectrophotometric approach used is appropriate for total carotenoid estimation in exploratory comparative studies but does not replace HPLC-based individual carotenoid quantification, a limitation that is now explicitly acknowledged in the Discussion section.
References
Baranska, M., Schütze, W., & Schulz, H. (2006). Determination of lycopene and β-carotene content in tomato fruits and related products: Comparison of FT-Raman, ATR-IR, and NIR spectroscopy. Analytical Chemistry, 78(24), 8456–8461. https://doi.org/10.1021/ac061220j
Perucka, I., & Oleszek, W. (2000). Extraction and determination of capsaicinoids in fruit of hot pepper Capsicum annuum L. by spectrophotometry and high-performance liquid chromatography. Food Chemistry, 71(2), 287–291. https://doi.org/10.1016/S0308-8146(00)00175-6
#2.5. The sensory methodology is weak and poorly controlled: there is no mentioned control over panel consistency (different participants each week?), no training of panelists, no structured sensory scales, while there is a heavy reliance on automated NLP tools (debatable). The exclusion of 6KUR genotype due to “unavailability” further introduces inconsistency, undermining comparability across datasets.
R: We thank the reviewer for this observation. We wish to clarify that the sensory component was not designed as a trained expert panel. Its objective was explicitly to capture spontaneous consumer perception, for which the absence of panelist training is a methodological requirement, not a limitation, as training modifies perceptual responses and undermines consumer representativeness. This distinction has been clarified and expanded in the revised Materials and Methods section. Regarding NLP, our research group has developed, validated, and published a reproducible protocol for free-text consumer descriptor analysis in food quality and postharvest contexts, which has undergone independent peer review (https://doi.org/10.1016/j.jafr.2025.102504; https://onlinelibrary.wiley.com/doi/10.1155/ioa/3971825). Its use is therefore not debatable within our research framework. The exclusion of 6KUR from the sensory dataset is now explicitly flagged as a limitation in both the Methods and Discussion sections.
L.376-379 – clarify how the reported losses were established, since there is no dry mass determination method reported in the previous section.
R: This point has been clarified in the revised manuscript.
Figures 3, 4, 10 and 11 are difficult to interpret in the current form and also obscure the exact values; replace them with tables containing the obtained results.
R: The figures were revised because a large amount of data was analyzed differently. Additionally, for greater clarity and impact, some information was moved to supplementary material.
Given the lack of transparency (no raw data for the measured parameters, no declared variables selected for analyses), it is not possible to evaluate the multivariate section of the “Results”; in this situation they appear speculative.
R: We respectfully refer the reviewer to our responses to previous comments, where the statistical validity, methodological transparency, and appropriateness of the multivariate analyses applied were addressed in detail. The characterization of the multivariate results as speculative is not consistent with the analytical standards applied, which include explicit justification of each method, sample size adequacy relative to the number of variables, and the addition of internal validation procedures (LOOCV for LDA; silhouette coefficient and elbow criterion for K-means clustering) incorporated in the revised manuscript.
Regarding data transparency, we have made all raw data, analysis scripts, and step-by-step computational workflows fully available in a public repository, which is now referenced in the revised manuscript (Data Availability Statement). The repository contains the complete dataset for all measured variables across all genotype × storage condition × time combinations, the R/Python scripts used for all multivariate analyses, and annotated notebooks documenting each analytical step in a reproducible format. We respectfully invite the reviewer to consult this repository directly, as we are confident it will resolve any remaining concerns about transparency and the basis for the reported results.
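The silhouette coefficient cited in the response above as a check on the K-means solution can be computed directly from a set of points and their cluster labels. The sketch below follows the definition in Rousseeuw (1987); the function name and the four toy points are hypothetical and are not drawn from the repository.

```python
import math


def silhouette_mean(points, labels):
    """Mean silhouette coefficient for a labelling with at least two clusters."""
    clusters = sorted(set(labels))
    scores = []
    for i, (p, li) in enumerate(zip(points, labels)):

        def mean_dist(target):
            # Mean distance from p to the members of one cluster, excluding p
            d = [
                math.dist(p, q)
                for j, (q, lj) in enumerate(zip(points, labels))
                if lj == target and j != i
            ]
            return sum(d) / len(d) if d else None

        a = mean_dist(li)  # cohesion: mean distance within own cluster
        if a is None:      # singleton cluster: silhouette defined as 0
            scores.append(0.0)
            continue
        # Separation: mean distance to the nearest other cluster
        b = min(mean_dist(c) for c in clusters if c != li)
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Two tight, well-separated pairs of points (toy data)
pts = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette_mean(pts, [0, 0, 1, 1])  # labels follow the real structure
bad = silhouette_mean(pts, [0, 1, 0, 1])   # labels cut across the structure
```

Values near 1 indicate compact, well-separated clusters, while values near or below 0 indicate that the labelling does not reflect structure in the data. A high silhouette on its own says the partition is geometrically clean; it does not establish that the clusters are biologically meaningful.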
#3.3.1 – given the small dataset, all the reported outputs are not valid, reflecting a distorted image; the sum PC1+PC2 explains > 90% of variability in all cases (overfitting). Cluster analysis for 5 samples is in the best scenario a joke, not a subject for a scientific paper.
R: We thank the reviewer for this observation, though we must respectfully and firmly disagree with several of the characterizations made, as they reflect a misinterpretation of the dataset structure and the analytical objectives.
Regarding the dataset size and PCA: As clarified in our response to Comment 5, the multivariate analyses were applied to a dataset of 120 observations (5 genotypes × 2 storage conditions × 4 evaluation times × 3 replicates), not to a dataset of 5 samples. The confusion appears to arise from conflating the number of genotypes (which are the grouping variable, not the unit of analysis) with the number of observations entering the multivariate model. Each of the 120 observations is an independent experimental unit with a full vector of measured quality variables. This is the analytically relevant sample size for PCA, LDA, and clustering, not the number of genotype categories.
Regarding PC1 + PC2 explaining >90% of variance: A high cumulative variance explained by the first two principal components is not evidence of overfitting. It is, in fact, a desirable outcome that indicates strong multicollinearity among the measured quality variables, a well-known and biologically expected feature of postharvest quality datasets, where traits such as firmness, weight loss, color, and acidity are physiologically interrelated and tend to co-vary systematically over time and across storage conditions. Overfitting in PCA is a distinct statistical concept that refers to instability of component loadings relative to sample size, not to the proportion of variance explained. With 120 observations and fewer than 10 variables, the observation-to-variable ratio is well above the thresholds recommended for stable PCA solutions (Jolliffe & Cadima, 2016; Tabachnick & Fidell, 2019).
Regarding cluster analysis: The cluster analysis was applied to the full dataset of 120 observations, not to 5 genotype centroids. The 5 genotypes represent the grouping structure used for interpretation, not the input to the clustering algorithm. Applying K-means to 120 observations with appropriate validation metrics (silhouette coefficient and elbow criterion, now reported in the revised manuscript) is a methodologically sound procedure entirely consistent with published standards in postharvest and food quality research. We respectfully note that characterizing a peer-reviewed analytical approach as "a joke" is not consistent with the standards of scientific discourse, and we trust the reviewer will find the clarifications provided above sufficient to re-evaluate this assessment. We reiterate our invitation to consult the public repository, where the complete dataset and annotated scripts are available for independent verification.
References
Jolliffe, I. T., & Cadima, J. (2016). Principal component analysis: A review and recent developments. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 374(2065), 20150202. https://doi.org/10.1098/rsta.2015.0202
Tabachnick, B. G., & Fidell, L. S. (2019). Using multivariate statistics (7th ed.). Pearson.
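The relationship between collinearity and the variance captured by the first components, which the response above relies on, can be checked numerically on an idealised case. The sketch assumes eight standardised variables that all share the same pairwise correlation of 0.9; both numbers are hypothetical and the matrix is not the authors' data. For such an equicorrelated matrix the leading eigenvalue is 1 + (p − 1)r and every remaining eigenvalue is 1 − r.

```python
def top_eigenvalue(matrix, iterations=200):
    """Largest eigenvalue of a symmetric matrix by power iteration."""
    n = len(matrix)
    # Arbitrary non-uniform start vector, normalised to unit length
    v = [float(k + 1) for k in range(n)]
    norm = sum(a * a for a in v) ** 0.5
    v = [a / norm for a in v]
    value = 0.0
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        value = sum(a * b for a, b in zip(w, v))  # Rayleigh quotient
        norm = sum(a * a for a in w) ** 0.5
        v = [a / norm for a in w]
    return value


p, r = 8, 0.9  # hypothetical: 8 variables, pairwise correlation 0.9
corr = [[1.0 if i == j else r for j in range(p)] for i in range(p)]

lam1 = top_eigenvalue(corr)
share_pc1 = lam1 / p          # the trace of a correlation matrix equals p
share_pc2 = (1.0 - r) / p     # closed form for the equicorrelated case
cumulative = share_pc1 + share_pc2
```

Under these assumptions PC1 alone carries about 91% of the variance and PC1 + PC2 about 92.5%, and neither figure depends on the number of observations. The sketch therefore speaks only to where a high cumulative share can come from; whether the loadings are stable at a given sample size is a separate question that it does not address.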
Table 1 has an improper format.
R: The table was changed and moved to supplementary information.
#3.5 – given the concerns expressed in the former section, the reported data appear speculative
R: We respectfully disagree with this characterization. As demonstrated in our responses to Comments 5, 8, and 9, the multivariate analyses were applied to a dataset of 120 independent observations using methods appropriate to the sample size, analytical objective, and variable structure. The observation-to-variable ratio satisfies published standards, internal validation was incorporated for all inferential models, and the complete dataset and annotated scripts are publicly available in the repository referenced in the revised manuscript. The results reported in Section 3.5 are grounded in a transparent, reproducible, and methodologically justified analytical framework. We therefore respectfully maintain that the characterization of these results as speculative is not supported by the statistical evidence presented, and we invite the reviewer to consult the repository and the expanded Methods section before reassessing this judgment. We also note that all interpretive statements in Section 3.5 have been revised in the manuscript to explicitly bound conclusions within the exploratory scope of the study, clearly distinguishing descriptive findings from inferential claims, and acknowledging the limitations of the dataset where appropriate.
The claimed integration of datasets (spectral, physicochemical, sensory) is only descriptive; despite the claim of a “multidimensional approach”, the results are largely presented in parallel rather than truly integrated. There is no demonstration of predictive models linking spectral to sensory data, of causal or mechanistic relationships, or of robust multivariate frameworks combining all data types. Predicted quantitative quality parameters based on this study are completely missing.
R: We thank the reviewer for this observation and fully acknowledge the limitation identified. We agree that the integration of spectral, physicochemical, and sensory datasets in the current study is exploratory and descriptive rather than predictive or mechanistic, and that the results are largely presented in parallel rather than as a formally integrated multivariate framework. Predictive or mechanistic integration was never the intended scope of the study, and we recognize that the original framing, particularly the use of the term "multidimensional approach", overclaimed the level of analytical integration achieved. The manuscript has been revised accordingly. The term "multidimensional approach" has been replaced throughout with "exploratory multimodal characterization", and the scope of the study is now explicitly bounded in the abstract, introduction, and discussion. The absence of predictive models linking spectral indices to sensory descriptors or physicochemical parameters, and the absence of causal or mechanistic inference, are now clearly stated as limitations rather than implied contributions. We wish to be transparent about the constraints that shaped the study design. Building robust predictive models linking spectral, physicochemical, and sensory data requires substantially larger datasets, controlled experimental replication across multiple growing seasons and environments, and dedicated model training and validation frameworks; these resources and infrastructure were beyond the scope of the present study. What this study does provide is a structured exploratory evidence base that identifies which variable combinations and genotype × storage condition combinations are most promising for future predictive modeling, which we consider a valid and necessary precursor contribution to the integrative framework the reviewer rightly describes as the ultimate methodological goal. This positioning is now explicitly stated in the Discussion and Conclusions sections, where the study is framed as a hypothesis-generating exploratory platform rather than a validated predictive system, consistent with the actual analytical evidence provided.
In addition, we wish to bring to the reviewer's attention an analytical contribution of the manuscript that we believe addresses, at least in part, the concern regarding the absence of predictive quantitative frameworks raised across multiple reviewer comments. Beyond the descriptive and multivariate components of the study, the manuscript includes a deterministic mathematical modeling approach applied to the temporal dynamics of β-carotene degradation under both storage conditions. A kinetic model was fitted to the carotenoid loss data across the four evaluation time points, and its predictive performance was evaluated using standard goodness-of-fit metrics, including the coefficient of determination (R²), root mean square error (RMSE), and mean absolute error (MAE), which indicate an adequate and statistically supported fit across genotypes and storage conditions. This model constitutes a quantitative predictive framework for carotenoid loss as a function of storage time and temperature, which is directly relevant to postharvest quality management and decision-making. We respectfully submit that this component, fully documented with model equations, parameter estimates, fit diagnostics, and validation metrics in the Supplementary Material, represents a meaningful quantitative contribution that goes beyond purely descriptive analysis. It provides an evidence-based, mathematically grounded prediction of quality decline anchored to a physiologically and nutritionally relevant parameter, and its inclusion directly responds to reviewer concerns regarding the absence of predictive performance metrics and formal model validation. We invite the reviewer to consult the Supplementary Material in full, where this modeling framework is presented transparently and in reproducible detail, before reaching a final assessment of the manuscript's analytical contribution.
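For readers unfamiliar with this kind of fit, the following is a minimal sketch of how a degradation model can be fitted and scored with R², RMSE, and MAE. The response does not state the model form, so first-order kinetics, C(t) = C0·exp(−k·t), is an assumption made here for illustration; the time points and concentrations are synthetic, noise-free values, not data from the study.

```python
import math

def fit_first_order(times, conc):
    """Fit C(t) = C0 * exp(-k * t) by ordinary least squares on ln(C)."""
    n = len(times)
    y = [math.log(c) for c in conc]
    t_mean = sum(times) / n
    y_mean = sum(y) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (v - y_mean) for t, v in zip(times, y))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return math.exp(intercept), -slope  # (C0, k)

def fit_metrics(observed, predicted):
    """Return (R2, RMSE, MAE) on the original concentration scale."""
    n = len(observed)
    mean_obs = sum(observed) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(o - p) for o, p in zip(observed, predicted)) / n
    return r2, rmse, mae

# Hypothetical example: four evaluation times (days) and synthetic
# beta-carotene values generated from C0 = 100, k = 0.05 per day.
days = [0, 7, 14, 21]
carotene = [100.0 * math.exp(-0.05 * d) for d in days]

c0, k = fit_first_order(days, carotene)
pred = [c0 * math.exp(-k * d) for d in days]
r2, rmse, mae = fit_metrics(carotene, pred)
```

With only four time points and two fitted parameters, such a fit leaves two residual degrees of freedom per curve, so the metrics describe goodness of fit rather than out-of-sample predictive performance.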
The Discussion overstates the implications of the findings, presenting them as more robust and generalizable than the reported data support. Besides, it wrongly concludes that “spectrally-detected changes… ultimately determined the consumer sensory perception and acceptance” (not a cause-effect relationship!!!). L.799 – improper use of “validating” in context; the authors didn’t provide any validation.
R: We thank the reviewer for these precise and technically important observations, all of which are well founded and have been fully addressed in the revised manuscript. Regarding overstatement of implications: We agree that several statements in the Discussion overstated the robustness and generalizability of the findings beyond what the data support. The Discussion has been substantially revised to align interpretive claims strictly with the exploratory and descriptive nature of the study. Hedging language has been introduced throughout, replacing assertive conclusions with appropriately qualified statements, and all generalizations that are not directly supported by the reported data have been removed or reframed as hypotheses for future investigation. Regarding the causal statement linking spectral changes to consumer perception: We fully accept this correction. The original statement incorrectly implied a cause-effect relationship between spectrally detected changes and consumer sensory acceptance, which cannot be established from correlational or descriptive data of this nature. This statement has been removed entirely from the revised manuscript and replaced with a carefully qualified observation that spectrally detected changes in pigment-related optical properties co-occurred with shifts in consumer descriptors across storage time, and that this association warrants formal investigation through predictive modeling in future studies. No causal or mechanistic inference is drawn. Regarding the improper use of "validating" at line 799: We agree that this term was used incorrectly in context. The study did not perform any formal validation procedure at this point in the manuscript, and the use of "validating" implied a level of confirmatory rigor that the data do not support. The term has been replaced with "corroborating descriptively" in the revised text, which accurately reflects the nature of the evidence presented.
All the multivariate analysis-related discussion is speculative, given the above-mentioned issues. Moreover, there is no acknowledgement of the study’s limitations (low replication, overfitting of multivariate models, lack of validation, weak sensory methodology etc.). There is also an issue of citing literature in support of expected outcomes, rather than for critically comparing discrepancies or limitations.
R: We thank the reviewer for the continued engagement with our manuscript. However, we must respectfully express concern about the tone and constructive value of several comments received throughout this review process. Scientific peer review is expected to provide specific, evidence-based, and constructive criticism that helps authors improve their work. Several observations in this review, including the characterization of a standard cluster analysis as "a joke" and the repeated labeling of results as "speculative" without engaging with the detailed statistical justifications provided in our responses, fall below the standards of collegial scientific discourse and do not constitute constructive feedback. Regarding the multivariate analyses being speculative: We respectfully refer the reviewer to our detailed responses to Comments 5, 8, 9, and 10, where we demonstrated with specific statistical references that the multivariate analyses were applied to 120 independent observations, that the observation-to-variable ratio satisfies published standards, that internal validation was incorporated for all inferential models, and that the complete dataset and annotated reproducible scripts are publicly available for independent verification. These points were addressed in full and with supporting literature. We maintain that the multivariate analytical framework is methodologically sound and appropriately scoped as exploratory. Regarding the statistical correction requested by the reviewer: We wish to clarify an important distinction that appears to have been overlooked. The reviewer's statistical concerns throughout this review were directed at the analysis of variance and post-hoc comparison framework (specifically, non-normality and the appropriateness of parametric tests), not at the multivariate section. In response to those concerns, we adopted the Aligned Rank Transform procedure with estimated marginal means and Benjamini-Hochberg correction, which directly addresses the issues raised for the factorial ANOVA component. The multivariate section (PCA, LDA, and K-means) was not the subject of the reviewer's original statistical recommendation, and the repeated characterization of this section as invalid or speculative in subsequent comments represents a progressive escalation of criticism that was not grounded in the reviewer's own initial suggestions and has not been accompanied by specific alternative analytical recommendations. Regarding acknowledgement of limitations: We fully agree that limitations must be explicitly stated, and this has been done comprehensively in the revised manuscript. The Discussion and Conclusions sections now include a dedicated limitations paragraph addressing the exploratory nature of the study, the sample size per cell, the absence of a formal predictive validation framework, the rotating consumer panel design, and the descriptive rather than mechanistic scope of the spectral-sensory association. These limitations were already partially acknowledged in the original submission and have now been expanded substantially. Regarding citation practice: We acknowledge that the original manuscript relied predominantly on citations that supported expected outcomes. The reference list has been revised to include literature that critically discusses discrepancies, boundary conditions, and methodological limitations relevant to each claim, in line with good scientific writing practice. We trust that the extensive revisions made throughout the manuscript, the public availability of all data and code, and the detailed point-by-point responses provided demonstrate our commitment to scientific rigor and transparency. We respectfully request that the revised manuscript be evaluated on the basis of the evidence and arguments presented rather than on the basis of prior characterizations.
The Conclusions section reiterates the supposed success of the integrated approach, yet the manuscript does not demonstrate that integration led to new insights beyond what individual methods already provide. Furthermore, the section extends beyond what the data can support, and many claims are not substantiated by the experimental evidence and remain speculative; they should be significantly toned down. In fact, this study doesn’t provide quantitative performance metrics for the predictive claims.
R: We thank the reviewer for this observation. We agree that the original Conclusions section overstated the integrative contribution of the study and included claims that exceeded what the experimental evidence supports. The section has been substantially revised and toned down accordingly. Specifically, all references to the "success" of the integrated approach have been removed. The Conclusions now explicitly acknowledge that the three data streams (physicochemical, spectral, and sensory) were characterized in parallel and that no formal integration framework, predictive model, or quantitative performance metrics were produced. The contribution of the study is repositioned as the generation of a structured exploratory evidence base that identifies priority variables, genotypes, and storage conditions for future predictive modeling, rather than as a demonstration of integrative predictive capacity. All claims not directly substantiated by experimental evidence have been either removed or reframed as hypotheses requiring validation in future studies with larger datasets and formal modelling frameworks. The absence of quantitative predictive performance metrics, which would require model training, test set evaluation, and cross-validation on independent data, is now explicitly stated as a study limitation rather than implied as an achieved outcome. We are confident that the revised Conclusions section accurately and honestly reflects what the data demonstrate, what remains exploratory, and what constitutes the legitimate scientific contribution of this work within its stated scope.
Reviewer 2 Report
Comments and Suggestions for Authors
General comments
The manuscript is excessively long and should be shortened, as the focus is lost in some sections. It is recommended to remove or relocate less relevant and overly detailed parts to the supplementary material.
The results section lacks clarity and should be better structured. More detailed explanations are needed, along with a clearer presentation of statistical significance.
References should be carefully checked throughout the manuscript, particularly regarding spacing and punctuation, to ensure a consistent citation style.
Title
The title is appropriate.
Please verify the last author’s affiliation, as two “1” designations are indicated.
Abstract
Replace the term "bioactives" with "bioactive compounds".
Remove the section on limitations and future work, as it is not typical for this type of abstract.
Introduction
The introduction is well written and detailed.
It is recommended to clearly state the hypothesis first, followed by the research aim.
Materials and Methods
The methodology is clearly and thoroughly described.
For better readability, it is recommended to move highly detailed or technical parts to the supplementary material.
The statistical analysis section should be separated and presented as a distinct subsection for improved clarity.
Results
All abbreviations should be explained in figure captions.
Statistical significance indicators (letters) are not clearly visible and should be improved.
All figures require additional technical and graphical refinement to enhance clarity and readability.
Table 1 should be better formatted.
There is an error in the reference at line 583 that needs to be corrected.
Discussion
The discussion is clearly written and appropriate.
Conclusion
The conclusion is clear and well formulated.
Author Response
Reviewer 2
General
-The manuscript is excessively long and should be shortened, as the focus is lost in some sections. It is recommended to remove or relocate less relevant and overly detailed parts to the supplementary material. The results section lacks clarity and should be better structured. More detailed explanations are needed, along with a clearer presentation of statistical significance.
R: We thank the reviewer for this constructive and well-framed observation. All suggested improvements have been implemented in the revised manuscript. The manuscript has been substantially shortened by relocating detailed methodological descriptions, extended exploratory outputs, and supplementary analytical details to the Supplementary Material section, where they remain accessible without disrupting the narrative flow of the main text. Sections where focus was identified as diffuse have been restructured to maintain a clear logical progression from research question to evidence to interpretation. The Results section has been reorganized to improve clarity and readability. Each subsection now opens with a concise statement of the analytical objective, followed by the key findings presented in order of scientific relevance, and closes with a direct interpretive statement grounded in the data. Tables and figures have been revised to present statistical significance explicitly (test statistics, degrees of freedom, p-values, and effect sizes), and the compact letter display (CLD) notation used in figures is now consistently explained in all corresponding captions. Statements of statistical significance are now clearly distinguished from descriptive observations throughout the section. We are confident that the revised manuscript is more focused, more concise, and more clearly structured than the original submission, and we thank the reviewer for identifying these issues constructively.
-References should be carefully checked throughout the manuscript, particularly regarding spacing and punctuation, to ensure a consistent citation style.
R: We thank the reviewer for this observation. All references throughout the manuscript have been carefully reviewed and corrected to ensure full consistency with the citation style required by Horticulturae. Spacing, punctuation, author name formatting, journal name abbreviations, volume and issue numbering, and DOI formatting have been standardized across all in-text citations and the reference list. Any duplicate, incomplete, or incorrectly formatted entries identified during this review have been corrected accordingly.
Please verify the last author’s affiliation, as two “1” designations are indicated.
R: We thank the reviewer for identifying this error. The affiliation designation for the last author has been carefully reviewed and corrected in the revised manuscript. The duplicate "1" designation was a typographical error introduced during manuscript formatting and has been resolved to accurately reflect the correct institutional affiliation of each author.
Replace the term "bioactives" with "bioactive compounds". Remove the section on limitations and future work, as it is not typical for this type of abstract.
R: Done
The introduction is well written and detailed. It is recommended to clearly state the hypothesis first, followed by the research aim. Lines 110-132
R: Done
The methodology is clearly and thoroughly described. For better readability, it is recommended to move highly detailed or technical parts to the supplementary material.
R: Done
The statistical analysis section should be separated and presented as a distinct subsection for improved clarity.
R: Done
All abbreviations should be explained in figure captions.
R: Done
Statistical significance indicators (letters) are not clearly visible and should be improved.
R: Done; the indicators have been improved.
All figures require additional technical and graphical refinement to enhance clarity and readability.
R: Done; the figures have been improved.
Table 1 should be better formatted.
R: Done; the table has been reformatted and moved to the supplementary information.
There is an error in the reference at line 583 that needs to be corrected.
R: Done
Reviewer 3 Report
Comments and Suggestions for Authors
The manuscript with ID 4220225 entitled “Multidimensional characterization of postharvest in quality carrot genotypes by temporal analysis of physicochemical, biofunctional, spectral and sensory parameters” was reviewed. It is an interesting study that presents an innovative approach by combining the evaluation of conventional post-harvest carrot variables, with sensory evaluations, and spectral analyses. However, after the revision, the following aspects to be improved were identified.
Abstract
The abstract should formally present the objective of the work and general conclusions that are consistent with it.
All abbreviations that are not in general use must have their meaning indicated.
Introduction
The state of the art should be improved with clearly current references to strengthen the description of the relevance and originality of the research.
On the other hand, the objective of the work is clearly stated and should be included in the abstract.
Materials and Methods
In general, the procedures are well described.
In the case of respiration measurement, LabQuest equipment from the Vernier company was used. According to information from this same company, the instruments it markets are intended exclusively for teaching use and not for research. Therefore, the respiration measurements cannot be evaluated at the level required for scientific research.
Results and Discussion
The statistical analysis describes a 2x5 factorial design, where genotype and temperature constituted the variation factors. However, the interaction of these factors is not analyzed and should be discussed.
Likewise, the presentation of results does not clearly show the factorial organization. In fact, the figures present statistical information without considering the factorial structure.
In fact, the analysis of variance should be formally presented.
Regarding the figures, the presentation should be improved. It would be more appropriate to show means as point values (points, not bars) accompanied by their error bars.
In addition, changes in different variables over time are presented, and the variation along such factor is not analyzed. Perhaps, it would be advisable to consider using a mixed design, where temperature and genotype constitute fixed-effects factors and time a random-effects factor.
The discussion of results requires support from more recent literature. Most of the information is discussed using references that are, in general, outdated.
Conclusions
The conclusions section should consistently show the factorial arrangement used in the study.
General
The manuscript is very long and should be shortened.
The handling of the English language should be improved.
Comments on the Quality of English Language
The handling of the English language should be improved.
Author Response
Reviewer 3
The manuscript with ID 4220225 entitled “Multidimensional characterization of postharvest in quality carrot genotypes by temporal analysis of physicochemical, biofunctional, spectral and sensory parameters” was reviewed. It is an interesting study that presents an innovative approach by combining the evaluation of conventional post-harvest carrot variables, with sensory evaluations, and spectral analyses. However, after the revision, the following aspects to be improved were identified.
The abstract should formally present the objective of the work and general conclusions that are consistent with it.
R: The abstract was completely rewritten in accordance with the reviewers' instructions.
All abbreviations that are not in general use must have their meaning indicated.
R: A section with abbreviations was included
The state of the art should be improved with clearly current references to strengthen the description of the relevance and originality of the research.
R: We thank the reviewer for this observation, which is consistent with feedback received from Reviewer 2. The Introduction has been thoroughly revised to strengthen the state of the art with more current and directly relevant literature. References published within the last five years have been prioritized, particularly in the areas of postharvest quality assessment of pigmented carrot genotypes, non-destructive Vis/NIR spectroscopy for vegetable quality monitoring, β-carotene and anthocyanin stability under contrasting storage conditions, and the application of NLP-based tools for consumer perception analysis in food and postharvest research. The updated literature base reinforces both the relevance of the research problem and the originality of the approach adopted, by more precisely positioning the study relative to what has and has not been previously investigated. Foundational older references have been retained only where they represent seminal contributions with no current equivalent, and their inclusion is justified in context. All new references have been verified for accuracy and formatted consistently with the Horticulturae citation style.
On the other hand, the objective of the work is clearly stated and should be included in the abstract.
R: We thank the reviewer for this constructive suggestion. The study objective has been explicitly incorporated into the revised abstract, ensuring that the reader can identify the purpose of the work from the outset without having to consult the Introduction. The objective is now stated concisely in the second sentence of the abstract, before the methods, results, and implications are presented. It clearly delimits the scope of the study: the exploratory evaluation of postharvest quality dynamics in five carrot genotypes under two storage conditions through the integration of physicochemical, spectral, and consumer-based assessment. This revision improves the informative value and self-sufficiency of the abstract in accordance with the journal's guidelines.
In the case of respiration measurement, LabQuest equipment from the Vernier company was used. According to information from this same company, the instruments it markets are intended exclusively for teaching use and not for research. Therefore, the respiration measurements cannot be evaluated at the level required for scientific research.
R: We thank the reviewer for this observation. Equipment details and calibration procedures have been expanded in the revised Materials and Methods section. Each instrument is now identified by model and manufacturer, and the calibration protocol applied prior to each measurement session is explicitly described. We wish to be transparent about the institutional context of this work. Our research group operates in a resource-limited setting in which teaching and research instruments are shared infrastructure. While this constrains access to high-precision, dedicated analytical equipment, all measurements were conducted under a rigorous pre-measurement calibration protocol that quantifies both instrument error and inherent measurement uncertainty. These values are now reported alongside the corresponding measurements in the revised manuscript, allowing the reader to evaluate data quality in full transparency. We would respectfully note that resource-constrained research represents a significant proportion of the global scientific output in agricultural sciences, and that methodological rigor, expressed through systematic calibration, error quantification, and transparent reporting, is not contingent on instrument cost. We are confident that the calibration and validation procedures applied in this study meet the standards required for the conclusions drawn, and we have ensured this is now clearly documented in the manuscript.
The statistical analysis describes a 2x5 factorial design, where genotype and temperature constituted the variation factors. However, the interaction of these factors is not analyzed and should be discussed.
R: We thank the reviewer for this important observation. We agree that the explicit analysis and discussion of the genotype × temperature interaction is essential for a complete interpretation of the factorial design and that its absence was a significant analytical gap in the original manuscript. The revised manuscript now includes a formal analysis of the genotype × temperature interaction for all response variables, within the Aligned Rank Transform factorial ANOVA framework already applied. The interaction term (genotype × temperature) is reported with its corresponding F statistic, degrees of freedom, p-value, and partial η² effect size in the ANOVA tables. Where the interaction was statistically significant, post-hoc pairwise comparisons of estimated marginal means (EMMs) were conducted within each level of the conditioning factor using Benjamini-Hochberg correction, and the results are presented using compact letter display in the corresponding figures. The Discussion section has been expanded to explicitly interpret the biological meaning of significant genotype × temperature interactions, addressing whether the effect of refrigeration on quality preservation was consistent across all genotypes or whether specific genotypes responded differentially to storage temperature, a distinction with direct practical implications for genotype-tailored postharvest management recommendations. Where the interaction was non-significant, this is also explicitly stated and interpreted. We thank the reviewer for identifying this gap, as its inclusion substantially strengthens the analytical and interpretive value of the manuscript.
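As an illustration of the multiplicity adjustment named in this response, the Benjamini-Hochberg procedure can be sketched in a few lines. This is a generic implementation of the standard step-up adjustment, not the authors' script; the function name and the raw p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, scaling by m / rank and
    # enforcing that adjusted values never increase as rank decreases.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical raw p-values from four pairwise comparisons.
raw = [0.01, 0.04, 0.03, 0.20]
adj = benjamini_hochberg(raw)
```

A comparison is then declared significant when its adjusted p-value falls below the chosen false discovery rate, for example 0.05.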
Likewise, the presentation of results does not clearly show the factorial organization. In fact, the figures present statistical information without considering the factorial structure.
R: We thank the reviewer for this precise and constructive observation. We fully agree that the original figures did not adequately reflect the factorial structure of the design, presenting results in a way that obscured the simultaneous effects of genotype, storage temperature, and their interaction. All figures in the Results section have been redesigned to explicitly represent the factorial organization of the experiment. Interaction plots are now used as the primary visualization format, where the X axis represents storage temperature (ambient vs. refrigeration), lines represent genotypes, and panels (facets) represent postharvest evaluation times. This structure allows the reader to simultaneously assess main effects of genotype and temperature, the genotype × temperature interaction pattern at each time point, and changes in the interaction structure across the storage period, which together constitute the full factorial evidence base of the study. Each figure now includes estimated marginal means (EMMs) with 95% confidence intervals, compact letter display (CLD) indicating homogeneous groups from BH-adjusted pairwise comparisons within each panel, and a caption that explicitly identifies which factorial effects and interactions are being illustrated. The panel arrangement follows a consistent layout across all response variables to facilitate cross-variable comparison. We are confident that the revised figures provide a substantially clearer, more complete, and more informative representation of the factorial structure and its analytical implications, and we thank the reviewer for identifying this as a priority improvement.
In fact, the analysis of variance should be formally presented.
R: We thank the reviewer for this observation. In the revised manuscript, the complete analysis of variance tables, including all main effects, two-way interactions, and the three-way interaction (genotype × temperature × time), with corresponding F statistics, degrees of freedom, p-values, and partial η² effect sizes, are formally presented in the Supplementary Material. This decision was made to maintain the readability and focus of the main text while ensuring full analytical transparency and allowing the interested reader to consult the complete statistical output. A direct reference to the corresponding supplementary table is included at each relevant point in the Results section, and the key significant effects and interactions are summarized narratively in the main text with their associated statistics.
Regarding the figures, the presentation should be improved. It would be more appropriate to show means as point values (points, not bars) accompanied by their error bars.
R: We thank the reviewer for this recommendation, which aligns with current best practices in statistical data visualization. All figures in the Results section have been updated accordingly. Bar charts have been replaced with dot plots in which each point represents the estimated marginal mean (EMM) for the corresponding genotype × temperature × time combination, accompanied by error bars representing the 95% confidence interval. This format more accurately conveys both the central tendency and the uncertainty of each estimate, avoids the visual truncation artifact inherent to bar charts that start at zero, and more clearly reveals the interaction patterns within the factorial structure of the design. Compact letter display (CLD) notation from BH-adjusted pairwise comparisons is shown above each point. This presentation format is now consistent across all response variables throughout the manuscript.
In addition, changes in different variables over time are presented, and the variation along such factor is not analyzed. Perhaps, it would be advisable to consider using a mixed design, where temperature and genotype constitute fixed-effects factors and time a random-effects factor.
R: We thank the reviewer for this methodologically important suggestion. We agree that the temporal dimension of the study warrants a more rigorous analytical treatment than that applied in the original manuscript, and we have carefully considered the mixed model framework proposed. However, we wish to clarify a critical design feature that determines the appropriate statistical model for the temporal component. As stated in the Materials and Methods section, the study used a destructive sampling strategy in which independent experimental units were evaluated at each time point. Because the same biological units were not tracked longitudinally, time does not constitute a within-subject factor, and a repeated measures or mixed model with time as a random effect is not statistically appropriate for this design. Treating time as a random effect in a mixed model requires that the same experimental units contribute observations across time levels, which is not the case here. Given this design, time was treated as a fixed crossed factor within the fully factorial ART-ANOVA framework, which is the correct approach for a completely randomized factorial design with independent units at each time point. This produces a full factorial model (genotype × temperature × time) in which the main effect of time and all its interactions with genotype and temperature are formally estimated and tested, as now reported in the Supplementary Material ANOVA tables. We fully acknowledge, however, that the reviewer's suggestion points toward an important limitation of the current design: a longitudinal mixed model tracking the same biological units over time would provide greater statistical power for detecting temporal trajectories and would allow partitioning of within-unit versus between-unit variance.
This is now explicitly noted in the limitations section as a recommendation for future study designs, where non-destructive measurement of the same units across time points would make a mixed effects longitudinal framework both appropriate and more powerful.
The discussion of results requires support from more recent literature. Most of the information is discussed using references that are, in general, outdated.
R: We thank the reviewer for this observation. We agree that the original manuscript relied disproportionately on older references, which weakened the contextual grounding of the discussion relative to the current state of the literature. The Discussion section has been thoroughly revised to incorporate more recent publications, prioritizing literature from the last five years where available. Older references have been retained only where they represent foundational or methodological contributions for which no more recent equivalent exists, and this is noted explicitly where applicable. The updated reference list now reflects the current state of knowledge on postharvest quality of pigmented carrot genotypes, Vis/NIR spectroscopy for non-destructive quality assessment, β-carotene and anthocyanin stability under contrasting storage conditions, and consumer perception methodology in postharvest research. All new references have been verified for relevance, accuracy, and consistency with the citation style required by Horticulturae.
The conclusions section should consistently show the factorial arrangement used in the study.
R: We thank the reviewer for this observation. The Conclusions section has been revised to consistently reflect the factorial structure of the study (genotype × storage temperature × postharvest time) throughout all statements. Conclusions are now explicitly framed in terms of the factorial effects and interactions identified, distinguishing between main effects of genotype and storage temperature, the genotype × temperature interaction, and the temporal dynamics observed across the four evaluation time points. Genotype-specific conclusions are presented in the context of their corresponding storage condition rather than in isolation, ensuring that the factorial organization of the evidence base is transparent and consistently represented from the Results through to the Conclusions. This revision also ensures internal consistency between the figures, the ANOVA tables presented in the Supplementary Material, and the interpretive statements in the Conclusions section.
The manuscript is very long and should be shortened.
R: We thank the reviewer for this observation, which is consistent with feedback received from Reviewer 2. The manuscript has been substantially shortened in the revised version. Redundant descriptions, overly detailed methodological elaborations, and extended exploratory outputs have been removed from the main text or relocated to the Supplementary Material. The Introduction has been condensed to focus strictly on the identified research gap. The Results section has been restructured to present findings concisely, avoiding repetition between text, tables, and figures. The Discussion has been streamlined by removing tangential commentary and retaining only interpretations directly supported by the experimental evidence. The Conclusions section has been reduced to a focused summary of the principal findings and their implications. These revisions collectively reduce the manuscript length while preserving the scientific content and analytical depth required for a complete and transparent report of the study.
The handling of the English language should be improved.
R: We thank the reviewer for this observation. The entire manuscript has been carefully revised for English language quality, including grammar, syntax, sentence structure, word choice, and academic tone. Particular attention was given to sections identified as dense or unclear in previous reviews. To ensure the highest standard of language quality, the revised manuscript was reviewed by a native English-speaking colleague with expertise in academic scientific writing. All identified issues have been corrected, and the revised text reads more clearly, concisely, and consistently throughout.
Reviewer 4 Report
Comments and Suggestions for Authors
Dear Authors,
I attach a review of the article “Multidimensional characterization of postharvest in quality carrot genotypes by temporal analysis of physicochemical, bio-functional, spectral and sensory parameters”.
The research topic is very interesting especially from the point of view of application perspective, and circular economy strategies.
Manuscript should not be accepted in this form because it does not meet the journal's requirements.
Basic remarks that disqualify the research results
- Errors possible in the basic data, which may consequently affect other analyses (PCA, LDA, etc.). Some data are questionable: Figure 3 (fresh weight) increase in fresh weight during storage? Data must be verified.
- Incorrectly conducted statistical analyses. The authors used Tukey's test. In the methodological part the authors wrote - See Line 160-164: …All analytical measurements, including spectral, physicochemical, and destructive biochemical assays, were performed exclusively on the edible, storage root (i.e., the fleshy taproot) Significant differences between treatment means (α = 0.05) were identified using Tukey's Significant Difference (HSD) test… . The Tukey test is a post-hoc test that is designed to specifically examine differences between groups only if the overall ANOVA test shows that there are statistically significant differences between them. ANOVA test was not performed.
- Statistical analyses should be performed again. In the methodological part the authors wrote - See Line 157-159: …This experimental setup resulted in a 5×2 factorial arrangement (five varieties × two storage conditions) with three independent biological replicates, where each replicate consisted of a single carrot per treatment combination… . However, time appears as a factor, because for some features the effect of storage was studied. Therefore, there are 3 factors in the presented experimental setup: varieties (5 levels) + storage conditions (2 levels) + storage time (4 levels). Authors should select an appropriate analysis of variance model and perform statistical analyses again.
- The selection of appropriate statistical tests should be preceded by checking the assumptions for normal distributions and homogeneity. For many of the parameters studied, it is clear that the variances of the compared groups are not homogeneous.
- The descriptions of the results do not reflect the actual statistical analyses performed. Examples below.
- Using boxplots for a sample size of only three replicates (n=3) is neither recommended nor appropriate. Boxplots are designed to visualize the distribution of data, including quartiles (Q1, Median, Q3), which requires a larger sample size. For n=3, 50% of the data falls within the box, and the median is determined based on the minimum number of points, which can lead to misleading conclusions. With three points, the quartiles (Q1/Q3) often align with the minimum/maximum, flattening the graph. A bar chart displaying the mean and SD is a better option.
- All information regarding statistical methods is currently scattered and should be included in an additional section on Statistical analysis.
- Terminology should be unified [variety (sections 2.1, 2.2, 2.3), genotype (sections 3.2, 3.3, 3.4), or carrot types (Line 652)]. Likewise storage/treatment, etc.
- Presented carrot genotypes (Daucus carota ) names should be listed in the same order in the text and all attachments.
- Data quality should be considered. Such high variability between replicates within a group is certainly not the norm. See Figure 11: Purple/refrigerator/NDVI range 0.06 – 0.31 (difference between repetitions 500%) compared to 6KUR/refrigerator/NDVI range 0.06 – 0.065. These are only two of the cases.
- The quality of Tables and Figures is unacceptable.
- References not in accordance with the journal's requirements.
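The remark above about boxplots with n = 3 can be checked with a short sketch using Python's standard `statistics` module. The three replicate values are hypothetical, and plotting libraries may apply other quartile conventions, so this only illustrates how little distributional information three points carry.

```python
import statistics

# Three hypothetical replicate values for one treatment cell (not study data).
replicates = [1.0, 2.0, 3.0]

# Default "exclusive" method: treats the data as a sample from a larger population.
q_exclusive = statistics.quantiles(replicates, n=4)

# "inclusive" method: treats the data as the whole population.
q_inclusive = statistics.quantiles(replicates, n=4, method="inclusive")

# True when Q1 and Q3 coincide with the minimum and maximum of the sample.
box_spans_full_range = (
    q_exclusive[0] == min(replicates) and q_exclusive[2] == max(replicates)
)
```

Under the exclusive convention the box edges coincide with the smallest and largest replicate, and under the inclusive convention they are simply midpoints between neighbouring values; in both cases the box is determined entirely by the three raw points.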
Other comments:
Line 104-112: …
Rev: Research objectives should be clearly articulated.
Line 139-140: …identified as 14BER, 6KUR, white, purple, and yellow…
Rev: Variety names should be given in the same order in the text, figures (legends) and tables.
Line 160-164: …All analytical measurements, including spectral, physicochemical, and destructive biochemical assays, were performed exclusively on the edible, storage root (i.e., the fleshy taproot) Significant differences between treatment means (α = 0.05) were identified using Tukey's Significant Difference (HSD) test, which controls the family-wise error rate for multiple comparisons.
Rev: Incorrectly conducted statistical analyses.
Line 171: …evaluating three sample units per variety at 7, 14, 21, and 30 days postharvest..
Rev: Lack of consistency; 4 weeks is 28 days.
Line 340: …The sixth variety, 6KUR…
Rev: six?
Line 335-362: …
Rev: How did respondents receive the evaluation material? Was it one carrot? This section needs to be completed.
Line 365-366: …A comprehensive analysis of non-destructive parameters (Figure 3) revealed significant differences between varieties and storage treatments…
Rev: …significant differences between storage treatments? no statistical analysis was performed – comparisons between storage methods. The statement is unfounded.
Line 366-367: …The color parameters (L*, a*, b*) showed significant differences (p < 0.05) between varieties and storage times…
Rev: no statistical analysis was performed – comparisons between storage times. The statement is unfounded.
Line 372-374: …It was observed that storage at room temperature caused a more pronounced decrease in L* and b* values towards the third week, while refrigeration helped maintain these color parameters with less variation.
Rev: no statistical analysis was performed – comparisons between storage times. The statement is unfounded.
Line 376-379: …In terms of fresh weight, all varieties experienced progressive losses, which were more pronounced at room temperature (6-8% vs. 3-4% in refrigeration). The purple variety showed the greatest losses (9.2% at room temperature), while the 14BER variety showed the best weight retention (2.7% in refrigeration)…
Rev: no statistical analysis was performed – comparisons between storage times. The statement is unfounded.
Rev: The authors state that, in terms of fresh weight, all varieties experienced progressive losses. In Figure 3 we can observe an increase in fresh weight of the 14BER variety (room temperature) from ca 140 (1 week) to 210 (2 week) (increase of 50%); is it possible? – unreliable data – what unit?
In Figure 3 we can observe an increase in fresh weight of the White variety (refrigerator) from ca 110 (1 week) to 220 (2 week) (increase of 100%); is it possible? – unreliable data – what unit?
Line 383-386: Figure 3. Shelf-life evaluation results of carrot for non-destructive variables. Different letters indicate significant differences between time points and materials, respectively, based on Tukey's test, p-value 0.01 and 0.05, p-value 0.001 and 0.005, p-value 0.0001 and 0.0005.
Rev: It is unclear what was being compared. The title of the figure indicates that …significant differences between time points and materials… .
See b* (refrigerator) – Purple: in 1 week is 5 and in 4 week is 10: 100% increase and no statistical differences?
See Fresh weight (refrigerator) – 6KUR: in 1 week is ca 220 and in 2 week is ca 110: 100% decrease and no statistical differences? What unit?
Line 408-411: Figure 4. Shelf-life evaluation results of carrot for destructive variables. Different letters indicate significant differences between time points and materials, respectively, based on Tukey's test. p-value 0.01 and 0.05, p-value 0.001 and 0.005, p-value 0.0001 and 0.0005.
Rev: p-value 0.01 and 0.05, p-value 0.001 and 0.005, p-value 0.0001 and 0.0005. ???
Rev: It is unclear what was being compared. The title of the figure indicates that …significant differences between time points and materials… .
See: TSS (room temperature) – Purple: in 2 week is ca 8 and in 4 week is ca 17: 100% increase and no statistical differences? What unit?
See: TAA (room temperature) – White: in 1 week is ca 0.1 and in 4 week is ca 0.3: 300% increase and no statistical differences?
See: TAA (materials, refrigerator) – Purple (ca 0.225) and 6KUR (ca 0.100), difference 125% and no statistical differences?
There are many more doubts and the markings of differences seem to be random.
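The percentage comparisons in the comments above can be made reproducible with a small sketch. The values are the approximate ("ca") figure readings quoted by the reviewer, the labels are shorthand introduced here, and the convention assumed is change relative to the starting value.

```python
def pct_change(start, end):
    """Relative change from start to end, in percent of the starting value."""
    return (end - start) / start * 100.0


# Approximate values as quoted in the comments above (readings from figures).
readings = {
    "fresh weight, 14BER, room temperature (week 1 to 2)": (140, 210),
    "fresh weight, White, refrigeration (week 1 to 2)": (110, 220),
    "fresh weight, 6KUR, refrigeration (week 1 to 2)": (220, 110),
    "b*, Purple, refrigeration (week 1 to 4)": (5, 10),
    "TSS, Purple, room temperature (week 2 to 4)": (8, 17),
    "TAA, White, room temperature (week 1 to 4)": (0.1, 0.3),
}

# Rounded to one decimal place to absorb floating-point noise.
changes = {label: round(pct_change(a, b), 1) for label, (a, b) in readings.items()}

for label, change in changes.items():
    print(f"{label}: {change:+.1f}%")
```

Under this convention a halving is a 50% decrease and a tripling is a 200% increase, so some of the rounded percentages quoted above differ from the computed values; the underlying concern about large changes without flagged statistical differences is unaffected.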
Line 411: …value 0.01 and 0.05, p-value 0.001 and 0.005, p-value 0.0001 and 0.0005…
Rev: Incomprehensible for what purpose.
Line 427-429: …Figure 5. Linear Discriminant Analysis (LDA) of carrot genotypes under two storage conditions. A) Genotype differentiation at room temperature. B) Genotype differentiation under refrigeration. Colored ellipses represent 95% confidence intervals for each genotype (purple, yellow, 6KUR, 14BER, and white).
Rev: Unclear. What parameters were used? Such information should be included in the Title.
Line 597-603: …Figure 10. Comparison of the spectral indices mARI, NDVI, CRI1, and CRI2 in carrots subjected to different processing and storage conditions (refrigeration and ambient temperature) over four weeks of shelf life. Measurements were conducted on the outer tissue of the roots. The values are accompanied by significance letters, which were determined using Tukey's test (p < 0.05). These letters indicate statistically significant differences between the processing and storage conditions for each spectral index.
Rev: The order of storage conditions should be maintained throughout the manuscript.
The title implies that the comparison was made between the processing and storage conditions, whereas the text (lines 571-572) states: The analysis of spectral indices revealed significant variations among the different carrot varieties and under different storage conditions.
Line 626-631: …Temporal monitoring (Figure 12) of β-carotene content demonstrated a clear trend of progressive decline in all samples, though with a characteristic degradation pattern consisting of two distinct phases. During the first three weeks of storage (rapid initial phase), substantial losses were recorded, ranging between 30% and 53% of the original content. Subsequently, between the third and fourth weeks, the reduction became more gradual, with additional decreases varying between 10% and 15% of the remaining values…
and
Line 640-641: …A comparative analysis of storage conditions revealed marked differences in degradation rates… Refrigerated samples exhibited a sustained loss rate fluctuating between 2.1 and 2.8 ppm per week, whereas units stored at room temperature showed accelerated degradation (4.3-5.6 ppm weekly), approximately double the rate observed under refrigeration. Among all evaluated varieties, the purple carrot displayed particularly notable stability, retaining 63.4% of its initial β-carotene after four weeks under refrigeration (cumulative loss: 36.6%). In contrast, yellow and white varieties degraded completely (100% loss) at room temperature by Week 4, highlighting the critical role of both genotype and storage conditions in preserving β-carotene content…
Rev: In this section, the authors describe changes during storage without confirmation by statistical analyses. It is interesting that the authors performed a statistical analysis (comparison between varieties) but did not describe it.
Line 620-625: Table 1. Evolution total carotenoid content (expressed as β-carotene) in five carrot genotypes (14BER, 6KUR, purple, yellow, and white) stored at room temperature and under refrigeration for four weeks. The table shows absolute concentration (ppm), relative percentage compared to Week 1 (initial value), and cumulative loss percentage. A logarithmic exploratory model was fitted to describe the degradation trend, with parameters estimated for each genotype and storage condition (β-carotene concentration = a + b·ln(time), where a and b represent the model coefficients).
Rev: A logarithmic exploratory model in title is not compatible with the explanations in the table. The description should be corrected and supplemented.
Line 626-631: …Temporal monitoring (Figure 12) of β-carotene content demonstrated a clear trend of progressive decline in all samples, though with a characteristic degradation pattern consisting of two distinct phases. During the first three weeks of storage (rapid initial phase), substantial losses were recorded, ranging between 30% and 53% of the original content. Subsequently, between the third and fourth weeks, the reduction became more gradual, with additional decreases varying between 10% and 15% of the remaining values…
Rev: Stored under refrigeration and room temperature conditions? Should be corrected.
Line 640-648: …A comparative analysis of storage conditions revealed marked …. storage conditions in preserving β-carotene content…
Rev: Where are data? Figure/Table should be cited.
Line 649-654: Figure 12. Total carotenoid content (expressed as β-carotene) stored under refrigeration and room temperature conditions over four weeks. The figure shows variations in carotenoid content (ppm fresh weight) across different carrot types (purple, yellow, white, 6KUR, and 14BER) stored under refrigeration (left) and ambient temperature (right) during shelf-life evaluation. Letters above data points indicate significant differences (p < 0.05) between weeks within each treatment group.
Rev: The first and second sentences repeat each other; this should be corrected.
Rev: Different carrot types? Maybe varieties?
Line 640-648: …Letters above data points indicate significant differences (p < 0.05) between weeks within each treatment group.
Rev: This is not true. The differences between the varieties are presented. What is treatment group? Terminology should be standardized.
References should be described as follows, depending on the type of work: https://www.mdpi.com/journal/horticulturae/instructions
- Journal Articles:
1. Author 1, A.B.; Author 2, C.D. Title of the article. Abbreviated Journal Name Year, Volume, page range.
- Books and Book Chapters:
2. Author 1, A.; Author 2, B. Book Title, 3rd ed.; Publisher: Publisher Location, Country, Year; pp. 154–196.
3. Author 1, A.; Author 2, B. Title of the chapter. In Book Title, 2nd ed.; Editor 1, A., Editor 2, B., Eds.; Publisher: Publisher Location, Country, Year; Volume 3, pp. 154–196.
- Unpublished materials intended for publication:
4. Author 1, A.B.; Author 2, C. Title of Unpublished Work (optional). Correspondence Affiliation, City, State, Country. year, status (manuscript in preparation; to be submitted).
5. Author 1, A.B.; Author 2, C. Title of Unpublished Work. Abbreviated Journal Name year, phrase indicating stage of publication (submitted; accepted; in press).
- Unpublished materials not intended for publication:
6. Author 1, A.B. (Affiliation, City, State, Country); Author 2, C. (Affiliation, City, State, Country). Phase describing the material, year. (phase: Personal communication; Private communication; Unpublished work; etc.)
- Conference Proceedings:
7. Author 1, A.B.; Author 2, C.D.; Author 3, E.F. Title of Presentation. In Title of the Collected Work (if available), Proceedings of the Name of the Conference, Location of Conference, Country, Date of Conference; Editor 1, Editor 2, Eds. (if available); Publisher: City, Country, Year (if available); Abstract Number (optional), Pagination (optional).
- Thesis:
8. Author 1, A.B. Title of Thesis. Level of Thesis, Degree-Granting University, Location of University, Date of Completion.
Author Response
Reviewer 4
I attach a review of the article “Multidimensional characterization of postharvest in quality carrot genotypes by temporal analysis of physicochemical, bio-functional, spectral and sensory parameters”. The research topic is very interesting especially from the point of view of application perspective, and circular economy strategies. Manuscript should not be accepted in this form because it does not meet the journal's requirements.
Errors possible in the basic data, which may consequently affect other analyses (PCA, LDA, etc.). Some data are questionable: Figure 3 (fresh weight) increase in fresh weight during storage? Data must be verified.
R: We thank the reviewer for this critical observation and take it seriously. We respectfully provide the following statistical and methodological justification for the multivariate analyses applied. Regarding sample size and multivariate methods: The reviewer raises a legitimate general concern about applying multivariate techniques to small datasets. However, we respectfully argue that the appropriateness of a multivariate method depends not only on total sample size but on the ratio of observations to variables, the analytical objective, and the nature of the inference being drawn. In the present study, the multivariate analyses were applied to a dataset of 120 observations (5 genotypes × 2 storage conditions × 4 evaluation times × 3 replicates), which satisfies the minimum observation-to-variable ratios recommended for the methods applied (Tabachnick & Fidell, 2019; Huberty & Olejnik, 2006). It is therefore incorrect to characterize this as a small dataset in the context of multivariate analysis; the concern may have arisen from conflating the number of replicates per cell with the total analytical sample size, which are distinct quantities. Regarding Principal Component Analysis (PCA): PCA was applied as an exploratory dimensionality reduction tool, not as a confirmatory or predictive model. Its application to datasets of this size is well established and does not require cross-validation when used descriptively (Jolliffe & Cadima, 2016). No inferential claims were derived from PCA outputs beyond the identification of major sources of variance among variables, which is an appropriate use of the technique regardless of sample size. Regarding Linear Discriminant Analysis (LDA): LDA was applied to the complete pooled dataset (n = 120 observations, n = 24 per genotype), which satisfies the minimum sample size requirements for stable discriminant function estimation (Tabachnick & Fidell, 2019).
To address the reviewer's concern about the absence of validation, leave-one-out cross-validation (LOOCV) has been incorporated into the revised analysis, and the resulting classification accuracy and confusion matrix are now reported in the Results section. LOOCV is the recommended internal validation strategy for LDA when external validation datasets are unavailable (Lachenbruch & Mickey, 1968), and its inclusion directly addresses the reviewer's observation. Regarding K-means clustering: We acknowledge the reviewer's concern and agree that K-means clustering applied to small per-cell samples risks identifying unstable cluster structures. In the revised manuscript, K-means results are now presented strictly as an exploratory descriptive tool, and the cluster solution has been validated using the silhouette coefficient and the within-cluster sum of squares elbow criterion to confirm that the identified structure is not artifactual. The interpretation of clustering results has been explicitly bounded to pattern description rather than biological inference, and this limitation is now clearly stated in both the Methods and Discussion sections. Regarding the general concern of analytical inflation: We respectfully disagree with this characterization. The methods applied serve distinct and complementary analytical objectives: dimensionality reduction (PCA), group discrimination (LDA), and unsupervised pattern detection (K-means), each justified by the specific research question it addresses. The concern of analytical inflation typically arises when methods are applied redundantly or without a defined inferential purpose. In the revised manuscript, the rationale for each multivariate method, its analytical scope, its limitations given the available data, and the validation strategy applied are now explicitly documented in the Materials and Methods section, which has been substantially expanded to address this and related comments.
References:
Huberty, C. J., & Olejnik, S. (2006). Applied MANOVA and discriminant analysis (2nd ed.). John Wiley & Sons. https://doi.org/10.1002/047178947X
Jolliffe, I. T., & Cadima, J. (2016). Principal component analysis: A review and recent developments. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 374(2065), 20150202. https://doi.org/10.1098/rsta.2015.0202
Lachenbruch, P. A., & Mickey, M. R. (1968). Estimation of error rates in discriminant analysis. Technometrics, 10(1), 1–11. https://doi.org/10.1080/00401706.1968.10490530
Tabachnick, B. G., & Fidell, L. S. (2019). Using multivariate statistics (7th ed.). Pearson.
Rousseeuw, P. J. (1987). Silhouettes: A graphical aid to the interpretation and validation of cluster analysis. Journal of Computational and Applied Mathematics, 20, 53–65. https://doi.org/10.1016/0377-0427(87)90125-7
Thorndike, R. L. (1953). Who belongs in the family? Psychometrika, 18(4), 267–276. https://doi.org/10.1007/BF02289263
Kohavi, R. (1995). A study of cross-validation and bootstrap for accuracy estimation and model selection. Proceedings of the 14th International Joint Conference on Artificial Intelligence (IJCAI), 2, 1137–1143.
Costello, A. B., & Osborne, J. W. (2005). Best practices in exploratory factor analysis: Four recommendations for getting the most from your analysis. Practical Assessment, Research & Evaluation, 10(7), 1–9. https://doi.org/10.7275/jyj1-4868
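The leave-one-out procedure described in the response above can be sketched as follows. To keep the example dependency-free, a nearest-centroid classifier is used as a stand-in for LDA (it is not the authors' model), and the two-feature data, group labels, and function names are hypothetical.

```python
def centroid(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def nearest_centroid_predict(train_x, train_y, x):
    """Assign x to the class whose training centroid is closest (squared Euclidean)."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        members = [row for row, y in zip(train_x, train_y) if y == label]
        c = centroid(members)
        dist = sum((a - b) ** 2 for a, b in zip(x, c))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loocv(features, labels):
    """Leave-one-out cross-validation: return accuracy and a confusion matrix."""
    classes = sorted(set(labels))
    confusion = {t: {p: 0 for p in classes} for t in classes}
    correct = 0
    for i in range(len(features)):
        # Hold out observation i and fit on all remaining observations.
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        predicted = nearest_centroid_predict(train_x, train_y, features[i])
        confusion[labels[i]][predicted] += 1
        correct += predicted == labels[i]
    return correct / len(features), confusion


# Hypothetical data: two well-separated groups with three observations each.
features = [[1.0, 1.2], [0.8, 1.0], [1.1, 0.9], [3.0, 3.1], [3.2, 2.9], [2.9, 3.3]]
labels = ["A", "A", "A", "B", "B", "B"]
accuracy, confusion = loocv(features, labels)
```

Every observation is predicted by a model that never saw it, so the resulting accuracy and confusion matrix are less optimistic than resubstitution estimates. The sketch assumes each class keeps at least one member after an observation is held out.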
Incorrectly conducted statistical analyses. The authors used Tukey's test. In the methodological part the authors wrote - See Line 160-164: …All analytical measurements, including spectral, physicochemical, and destructive biochemical assays, were performed exclusively on the edible, storage root (i.e., the fleshy taproot) Significant differences between treatment means (α = 0.05) were identified using Tukey's Significant Difference (HSD) test… . The Tukey test is a post-hoc test that is designed to specifically examine differences between groups only if the overall ANOVA test shows that there are statistically significant differences between them. ANOVA test was not performed.
R: We thank the reviewer for this precise and technically valid observation. We fully agree that the original manuscript described the use of Tukey's HSD test without explicitly reporting the preceding ANOVA, an inconsistency that constitutes an incomplete statistical reporting practice and, if the ANOVA was indeed omitted, a methodological error. The statistical analysis framework has been completely revised in the updated manuscript. As detailed in our responses to Reviewers 1 and 2, the original parametric ANOVA + Tukey approach has been replaced by a more rigorous and appropriate analytical framework given the distributional properties of the data. Specifically, the Aligned Rank Transform procedure (ART-ANOVA; Wobbrock et al., 2011) was applied as a non-parametric factorial alternative, which formally tests all main effects and interactions (genotype, storage temperature, postharvest time, and all two-way and three-way interactions) before conducting post-hoc pairwise comparisons. Post-hoc comparisons were performed using estimated marginal means (EMMs) with Benjamini-Hochberg correction for multiple comparisons (Lenth, 2024), replacing the Tukey HSD test throughout. The complete ART-ANOVA tables, including F statistics, degrees of freedom, p-values, and partial η² effect sizes for all effects and interactions, are now formally presented in the Supplementary Material and referenced at each relevant point in the Results section. This revision directly addresses the reviewer's concern by ensuring that all post-hoc comparisons are explicitly preceded by and grounded in a formal omnibus test, and that the full statistical output is transparently reported.
References
Wobbrock, J. O., Findlater, L., Gergle, D., & Higgins, J. J. (2011). The aligned rank transform for nonparametric factorial analyses using only ANOVA procedures. Proceedings of the ACM Conference on Human Factors in Computing Systems (CHI '11) (pp. 143–146). https://doi.org/10.1145/1978942.1978963
Lenth, R. V. (2024). emmeans: Estimated marginal means, aka least-squares means. R package version 1.10.0. https://CRAN.R-project.org/package=emmeans
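To make the alignment step of the Aligned Rank Transform more concrete, the following is a minimal Python sketch for the main effect of one factor in a balanced two-factor layout. It is an illustration only, not the authors' analysis (which uses a three-factor model): the function names `align_for_factor_a` and `average_ranks` and the toy data are hypothetical. Each response has its cell mean removed and the estimated main effect of the factor of interest added back, and the aligned values are then ranked before an ordinary factorial ANOVA would be run on the ranks.

```python
from collections import defaultdict
from statistics import mean


def align_for_factor_a(rows):
    """Align responses for the main effect of factor A.

    rows: list of (a_level, b_level, y) tuples from a balanced two-factor design.
    Returns y - cell_mean + (mean of the A level - grand mean) for every row.
    """
    grand = mean(y for _, _, y in rows)
    cells = defaultdict(list)
    a_groups = defaultdict(list)
    for a, b, y in rows:
        cells[(a, b)].append(y)
        a_groups[a].append(y)
    cell_mean = {key: mean(vals) for key, vals in cells.items()}
    a_effect = {key: mean(vals) - grand for key, vals in a_groups.items()}
    return [y - cell_mean[(a, b)] + a_effect[a] for a, b, y in rows]


def average_ranks(values):
    """Rank values from 1 to n, assigning tied values their average rank."""
    rounded = [round(v, 9) for v in values]  # guard against floating-point noise
    order = sorted(range(len(rounded)), key=lambda i: rounded[i])
    ranks = [0.0] * len(rounded)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and rounded[order[j + 1]] == rounded[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


# Hypothetical toy data: two genotypes x two storage conditions, two replicates each.
toy_rows = [
    ("G1", "cold", 1.0), ("G1", "cold", 2.0),
    ("G1", "warm", 3.0), ("G1", "warm", 5.0),
    ("G2", "cold", 6.0), ("G2", "cold", 7.0),
    ("G2", "warm", 9.0), ("G2", "warm", 11.0),
]
aligned = align_for_factor_a(toy_rows)
aligned_ranks = average_ranks(aligned)
```

In a balanced design the aligned responses sum to zero, which is a convenient sanity check before ranking.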
Statistical analyses should be performed again. In the methodological part the authors wrote - See Line 157-159: …This experimental setup resulted in a 5×2 factorial arrangement (five varieties × two storage conditions) with three independent biological replicates, where each replicate consisted of a single carrot per treatment combination… . However, time appears as a factor, because for some features the effect of storage was studied. Therefore, there are 3 factors in the presented experimental setup: varieties (5 levels) + storage conditions (2 levels) + storage time (4 levels). Authors should select an appropriate analysis of variance model and perform statistical analyses again.
R: We thank the reviewer for this precise and constructive observation. The reviewer is entirely correct. The original manuscript described a 5 × 2 factorial arrangement but failed to formally incorporate postharvest time as a third crossed factor in the statistical model, despite time being an explicit and central dimension of the study. This represents a significant analytical inconsistency that has been fully corrected in the revised manuscript. The statistical model has been redesigned as a fully crossed 5 × 2 × 4 factorial arrangement (five genotypes × two storage conditions × four postharvest evaluation times) with three independent biological replicates per treatment combination, yielding 120 independent experimental units in total. All main effects and interactions, including genotype × temperature, genotype × time, temperature × time, and the three-way interaction genotype × temperature × time, are now formally estimated and tested within a single unified model. Given the distributional properties of the data (specifically, the violation of normality assumptions confirmed by Shapiro-Wilk testing for most response variables), the Aligned Rank Transform procedure (ART-ANOVA; Wobbrock et al., 2011) was selected as the appropriate non-parametric equivalent of a fully crossed factorial ANOVA. This method allows formal testing of all main effects and interactions without assuming normality, and has been validated for factorial designs of this structure (Elkin et al., 2021). Post-hoc pairwise comparisons were conducted using estimated marginal means (EMMs) with Benjamini-Hochberg correction applied within each conditioning combination (Lenth, 2024). The complete ART-ANOVA tables for all response variables, including F statistics, degrees of freedom, p-values, and partial η² effect sizes for all seven effects in the 5 × 2 × 4 model, are now formally presented in the Supplementary Material.
Key results are summarized in the main text with explicit reference to the factorial structure. All figures have been redesigned as interaction plots faceted by postharvest time, explicitly representing the three-factor structure as recommended by Reviewer 2. We thank the reviewer for identifying this gap, as its correction substantially strengthens the analytical foundation and interpretive validity of the manuscript.
References
Elkin, L. A., Kay, M., Higgins, J. J., & Wobbrock, J. O. (2021). An aligned rank transform procedure for multifactor contrast tests. Proceedings of the ACM Symposium on User Interface Software and Technology (UIST '21). https://doi.org/10.1145/3472749.3474784
Lenth, R. V. (2024). emmeans: Estimated marginal means, aka least-squares means. R package version 1.10.0. https://CRAN.R-project.org/package=emmeans
Wobbrock, J. O., Findlater, L., Gergle, D., & Higgins, J. J. (2011). The aligned rank transform for nonparametric factorial analyses using only ANOVA procedures. Proceedings of the ACM Conference on Human Factors in Computing Systems (CHI '11) (pp. 143–146). https://doi.org/10.1145/1978942.1978963
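The counts stated in the response above (120 experimental units and seven effects) follow directly from the factorial structure. The short Python sketch below enumerates the design to make this explicit. The level labels are taken from elsewhere in this record (genotype names, evaluation days 7, 14, 21, and 30); the variable names and the labels "refrigeration" and "ambient" are illustrative choices.

```python
from itertools import combinations, product

# Factor levels as described in the review record.
genotypes = ["14BER", "6KUR", "white", "yellow", "purple"]
storage_conditions = ["refrigeration", "ambient"]
times_days = [7, 14, 21, 30]
replicates = [1, 2, 3]

# Every experimental unit is one genotype x storage x time x replicate combination.
units = list(product(genotypes, storage_conditions, times_days, replicates))

# Main effects, two-way interactions, and the three-way interaction.
factors = ["genotype", "temperature", "time"]
effects = [
    " x ".join(combo)
    for order in (1, 2, 3)
    for combo in combinations(factors, order)
]

print(len(units))    # number of experimental units
print(len(effects))  # number of effects tested in the full model
for effect in effects:
    print(effect)
```

With 40 treatment cells and three replicates per cell, a fully crossed fixed-effects model of this kind leaves 120 - 40 = 80 residual degrees of freedom.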
The selection of appropriate statistical tests should be preceded by checking the assumptions for normal distributions and homogeneity. For many of the parameters studied, it is clear that the variances of the compared groups are not homogeneous.
R: We thank the reviewer for this observation, which is entirely valid and consistent with concerns raised by other reviewers. We fully agree that the application of parametric tests without prior verification of distributional assumptions represents a methodological gap in the original manuscript. As described in our response to Reviewer 4, Comment 1, and detailed in the revised Materials and Methods section, the statistical framework has been completely redesigned. Prior to inferential analysis, normality was assessed for all response variables using the Shapiro-Wilk test, and homogeneity of variances was evaluated using Levene's test (α = 0.05). As the reviewer correctly anticipates, most variables violated both assumptions, confirming that parametric ANOVA and Tukey's HSD were inappropriate for this dataset. In response, the Aligned Rank Transform procedure (ART-ANOVA; Wobbrock et al., 2011) was adopted as the non-parametric equivalent of the fully crossed 5 × 2 × 4 factorial ANOVA, allowing formal testing of all main effects and interactions without assuming normality or homoscedasticity. Post-hoc comparisons were performed using estimated marginal means with Benjamini-Hochberg correction. The results of the normality and homogeneity tests are now reported in the Supplementary Material alongside the complete ART-ANOVA tables, ensuring full transparency in the assumption-checking process.
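As an illustration of the homogeneity check mentioned above, the following Python sketch computes the classic (mean-centred) Levene statistic W from first principles. It is a simplified sketch, not the authors' code: the function name `levene_w` and the toy groups are hypothetical, and the p-value (which requires the F distribution with k - 1 and N - k degrees of freedom) is deliberately left out to keep the example free of external dependencies.

```python
from statistics import mean


def levene_w(groups):
    """Classic Levene statistic using absolute deviations from each group mean.

    groups: list of lists of numeric observations, one list per group.
    Raises ZeroDivisionError if all absolute deviations within every group are equal.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Absolute deviations of each observation from its own group mean.
    z = [[abs(x - mean(g)) for x in g] for g in groups]
    z_group = [mean(zi) for zi in z]
    z_all = mean(v for zi in z for v in zi)
    between = sum(len(zi) * (zg - z_all) ** 2 for zi, zg in zip(z, z_group))
    within = sum((v - zg) ** 2 for zi, zg in zip(z, z_group) for v in zi)
    return (n_total - k) / (k - 1) * between / within


# Hypothetical toy groups: same spread (W = 0) versus doubled spread (W > 0).
w_equal = levene_w([[1.0, 2.0, 3.0], [11.0, 12.0, 13.0]])
w_unequal = levene_w([[1.0, 2.0, 3.0], [0.0, 2.0, 4.0]])
```

A larger W indicates stronger evidence that the group variances differ; with only three observations per group, as in this design, the test has little power, so it is best read alongside the raw spreads.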
The descriptions of the results do not reflect the actual statistical analyses performed. Examples below.
R: We thank the reviewer for this precise observation. We fully agree that inconsistencies between the statistical analyses performed and their description in the Results section undermine the credibility and reproducibility of the reported findings, and we accept this as a significant revision priority. The Results section has been thoroughly revised to ensure complete consistency between the statistical framework applied (ART-ANOVA with estimated marginal means and Benjamini-Hochberg correction) and the way results are described in the text. Specifically, all narrative statements now explicitly reference the corresponding test statistic (F), degrees of freedom, p-value, and effect size (partial η²) from the ART-ANOVA tables. Post-hoc comparisons are described in terms of BH-adjusted pairwise EMM contrasts rather than informal group comparisons, and compact letter display notation in figures is now fully consistent with the reported statistical output. Statements that previously implied statistical significance without supporting metrics have been either quantified or reframed as descriptive observations. We invite the reviewer to consult the specific examples identified and verify that each has been corrected in the revised manuscript.
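Because the response above refers to F statistics, degrees of freedom, and partial η² together, it may help to recall how they are linked: partial η² = SS_effect / (SS_effect + SS_error), which can be rewritten as F·df_effect / (F·df_effect + df_error). The Python sketch below applies that identity. The function name and the F value in the example are hypothetical, and the degrees of freedom assume a fully crossed fixed-effects model with 120 units in 40 cells.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    if f_value < 0 or df_effect <= 0 or df_error <= 0:
        raise ValueError("F must be non-negative and degrees of freedom positive")
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Hypothetical example: a genotype main effect (5 levels, so 4 df) tested
# against 80 residual df (120 units minus 40 treatment cells).
example = partial_eta_squared(10.0, 4, 80)
print(round(example, 3))
```

This makes it possible for a reader to cross-check a reported effect size against the F statistic and degrees of freedom given in the same table row.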
Using boxplots for a sample size of only three replicates (n=3) is neither recommended nor appropriate. Boxplots are designed to visualize the distribution of data, including quartiles (Q1, Median, Q3), which requires a larger sample size. For n=3, 50% of the data falls within the box, and the median is determined based on the minimum number of points, which can lead to misleading conclusions. With three points, the quartiles (Q1/Q3) often align with the minimum/maximum, flattening the graph. A bar chart displaying the mean and SD is a better option.
R: We thank the reviewer for this technically precise observation. We fully agree that boxplots are statistically inappropriate for n = 3, as the quartile estimates are unreliable at this sample size and the resulting visualization can be misleading. All boxplots in the manuscript have been replaced in the revised version. However, we respectfully note that the replacement visualization adopted differs from the bar chart with mean and SD suggested by the reviewer, based on a recommendation received from Reviewer 2 and consistent with current best practices in statistical data visualization. All figures now present estimated marginal means (EMMs) as point values accompanied by 95% confidence intervals, with compact letter display (CLD) notation from BH-adjusted pairwise comparisons shown above each point. This format accurately conveys central tendency and estimation uncertainty, avoids the visual truncation artifact of bar charts anchored at zero, and more clearly reveals the interaction patterns within the factorial structure of the design, which is central to the interpretation of a three-factor experiment. We acknowledge that both formats (bar charts with SD and dot plots with CI) are preferable to boxplots at n = 3, and we are confident that the dot plot with EMM ± 95% CI adopted in the revised manuscript meets the statistical and visual communication standards required for this dataset.
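To show what a point estimate with a 95% confidence interval looks like at n = 3, the sketch below computes a per-cell mean and its t-based interval in Python. This is a simplification and an assumption on our part: model-based EMM intervals pool the residual error across all cells and will generally be narrower than a per-cell interval with only 2 degrees of freedom. The function name `mean_ci_n3` and the NDVI-like values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

# Two-sided 95% t quantile for 2 degrees of freedom (n = 3 observations).
T_975_DF2 = 4.302652729911275


def mean_ci_n3(values):
    """Return (mean, lower, upper) for a per-cell 95% CI based on three replicates."""
    if len(values) != 3:
        raise ValueError("this sketch is written for exactly three replicates")
    m = mean(values)
    half_width = T_975_DF2 * stdev(values) / sqrt(3)
    return m, m - half_width, m + half_width


# Hypothetical replicate values for one genotype x storage x time cell.
cell_mean, ci_low, ci_high = mean_ci_n3([0.10, 0.12, 0.14])
print(f"{cell_mean:.3f} [{ci_low:.3f}, {ci_high:.3f}]")
```

The large multiplier (about 4.3 rather than 1.96) is the price of estimating variability from three points, which is why per-cell intervals at n = 3 are wide.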
All information regarding statistical methods is currently scattered and should be included in an additional section on Statistical analysis.
R: We thank the reviewer for this organizational recommendation. We fully agree that dispersed statistical descriptions reduce clarity and reproducibility. In the revised manuscript, all statistical information has been consolidated into a dedicated subsection entitled "Statistical Analysis" within the Materials and Methods section. This subsection now includes, in a unified and logically sequenced description: the factorial design structure (5 × 2 × 4), the assumption-checking procedures (Shapiro-Wilk and Levene's tests), the rationale for the non-parametric approach adopted, the ART-ANOVA framework and its justification, the post-hoc comparison strategy (EMMs with BH correction), the effect size metric used (partial η²), the multivariate methods applied (PCA, LDA, K-means) with their respective validation procedures, the mathematical degradation model for β-carotene kinetics, and the software and package versions used for all analyses. All statistical details previously scattered across individual subsections of the Methods have been removed from those locations and consolidated here, with cross-references retained where necessary for clarity.
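The Benjamini-Hochberg correction named in the post-hoc strategy above is simple enough to sketch in a few lines of Python. This is a generic implementation of the standard step-up adjustment, not the authors' code; the function name `bh_adjust` and the example p-values are hypothetical.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity of the adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values from four pairwise contrasts.
raw_p = [0.01, 0.04, 0.03, 0.005]
adj_p = bh_adjust(raw_p)
print(adj_p)
```

Each raw p-value is scaled by m / rank, and the running minimum ensures that a smaller raw p-value never receives a larger adjusted value than a bigger one.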
Terminology should be unified [variety (section 2.1, 2.2, 2.3) or genotype (section 3.2, 3.3, 3.4) or carrot types (Line 652)]. Storage/treatment, etc.
R: We thank the reviewer for identifying this inconsistency. Unified and precise terminology is essential for scientific clarity and the reviewer is correct that the original manuscript used "variety", "genotype", and "carrot types" interchangeably across sections, and similarly alternated between "storage condition" and "treatment" without a consistent rationale. In the revised manuscript, terminology has been standardized throughout. "Genotype" is used consistently to refer to the five carrot materials evaluated, as this term most accurately reflects their biological and experimental status in the context of the study. "Storage condition" is used consistently to refer to the two temperature treatments (refrigeration at 4°C and ambient temperature at 15°C), replacing all instances of "treatment" where the latter referred to storage temperature. Where "treatment" is retained, it refers explicitly to the factorial combination of genotype and storage condition as a whole, and this usage is defined at first mention. All instances of "variety", "carrot types", and any other non-standardized terminology have been identified and corrected throughout the manuscript, including tables, figures, captions, and supplementary material.
Presented carrot genotypes (Daucus carota) names should be listed in the same order in the text and all attachments.
R: We thank the reviewer for this observation. Consistent ordering of genotype names across all sections of a manuscript is essential for readability and cross-referencing between text, tables, and figures. In the revised manuscript, the five carrot genotypes (14BER, 6KUR, white, yellow, and purple) are now listed in the same fixed order throughout the main text, tables, figures, figure captions, and supplementary material. This order is established at first mention in the Materials and Methods section and maintained consistently thereafter, ensuring that readers can cross-reference results across all manuscript components without ambiguity.
Data quality should be considered. Such high variability between replicates within groups is certainly not the norm. See Figure 11: Purple/refrigerator/NDVI range 0.06 – 0.31 (difference between repetitions 500%) compared to 6KUR/refrigerator/NDVI range 0.06 – 0.065. These are just two of the cases.
R: We thank the reviewer for this important observation regarding data quality and within-group variability. Following this comment, all raw data were carefully reviewed and a digitization error was identified in the affected cases, which had artificially inflated the within-group variability reported for the purple/refrigeration/NDVI combination and the other flagged cases. The error has been corrected in the revised dataset, and the corresponding figures, tables, and statistical outputs have been updated accordingly. We sincerely apologize for this oversight and thank the reviewer for drawing our attention to it. Beyond the corrected digitization error, we acknowledge that pigmented genotypes such as purple may exhibit inherently higher within-group spectral variability than pigment-uniform genotypes, attributable to the spatial heterogeneity of anthocyanin distribution across the root surface, which produces spatially variable reflectance responses not fully captured by a limited number of replicates (Blackburn, 2007; Steele et al., 2009). This biological source of variability is now explicitly acknowledged as a limitation in the Discussion section, and all interpretive claims derived from high-variability cases are presented with their full confidence intervals and qualified accordingly.
References
Blackburn, G. A. (2007). Hyperspectral remote sensing of plant pigments. Journal of Experimental Botany, 58(4), 855–867. https://doi.org/10.1093/jxb/erl123
Steele, M. R., Gitelson, A. A., Rundquist, D. C., & Merzlyak, M. N. (2009). Nondestructive estimation of anthocyanin content in grapevine leaves. American Journal of Enology and Viticulture, 60(1), 87–92.
The quality of Tables and Figures is unacceptable.
R: Done, the tables were changed.
References not in accordance with the journal's requirements.
R: Done, the reference style was changed.
Other comments:
Line 104-112: …Rev: Research objectives should be clearly articulated
R: Done.
Line 139-140: …identified as 14BER, 6KUR, white, purple, and yellow…Rev: Variety names should be given in the same order in the text, figures (legends) and tables.
R: Done.
Line 160-164: …All analytical measurements, including spectral, physicochemical, and destructive biochemical assays, were performed exclusively on the edible, storage root (i.e., the fleshy taproot) Significant differences between treatment means (α = 0.05) were identified using Tukey's Significant Difference (HSD) test, which controls the family-wise error rate for multiple comparisons. Rev: Incorrectly conducted statistical analyses.
R: Done. Please see that all statistical tests were changed as suggested.
Line 171: …evaluating three sample units per variety at 7, 14, 21, and 30 days postharvest.. Rev: Lack of consistency; 4 week is 28.
R: It is used as a general term, referring to a period of one month.
Line 340: …The sixth variety, 6KUR…Rev: six?
R: Done.
Line 335-362: …Rev: How did respondents receive the evaluation material? Was it one carrot? This section needs to be completed.
R: Done.
Line 365-366: …A comprehensive analysis of non-destructive parameters (Figure 3) revealed significant differences between varieties and storage treatments…Rev: …significant differences between storage treatments? no statistical analysis was performed – comparisons between storage methods. The statement is unfounded.
R: Done.
Line 366-367: …The color parameters (L*, a*, b*) showed significant differences (p < 0.05) between varieties and storage times…Rev: no statistical analysis was performed – comparisons between storage times. The statement is unfounded.
R: Done.
Line 372-374: …It was observed that storage at room temperature caused a more pronounced decrease in L* and b* values towards the third week, while refrigeration helped maintain these color parameters with less variation. Rev: no statistical analysis was performed – comparisons between storage times. The statement is unfounded.
R: Done.
Line 376-379: …In terms of fresh weight, all varieties experienced progressive losses, which were more pronounced at room temperature (6-8% vs. 3-4% in refrigeration). The purple variety showed the greatest losses (9.2% at room temperature), while the 14BER variety showed the best weight retention (2.7% in refrigeration)…Rev: no statistical analysis was performed – comparisons between storage times. The statement is unfounded. Rev: Authors show that in terms of fresh weight, all varieties experienced progressive losses. In Figure 3 we can observe an increase in fresh weight of the 14BER variety (room temperature) from ca. 140 (1 week) to 210 (2 week) (increase of 50%); is it possible? – unreliable data – what unit?
R: Done.
In Figure 3 we can observe an increase in fresh weight of the White variety (refrigerator) from ca. 110 (1 week) to 220 (2 week) (increase of 100%); is it possible? – unreliable data – what unit?
R: Done.
Line 383-386: Figure 3. Shelf-life evaluation results of carrot for non-destructive variables. Different letters indicate significant differences between time points and materials, respectively, based on Tukey's test, p-value 0.01 and 0.05, p-value 0.001 and 0.005, p-value 0.0001 and 0.0005. Rev: It is unclear what was being compared. The title of the figure indicates that …significant differences between time points and materials… . See b* (refrigerator) – Purple: in 1 week is 5 and in 4 week is 10: 100% increase and no statistical differences? See Fresh weight (refrigerator) – 6KUR: in 1 week is ca. 220 and in 2 week is ca. 110: 100% decrease and no statistical differences? What unit?
R: Done.
Line 408-411: Figure 4. Shelf-life evaluation results of carrot for destructive variables. Different letters indicate significant differences between time points and materials, respectively, based on Tukey's test. p-value 0.01 and 0.05, p-value 0.001 and 0.005, p-value 0.0001 and 0.0005. Rev: p-value 0.01 and 0.05, p-value 0.001 and 0.005, p-value 0.0001 and 0.0005. ??? Rev: It is unclear what was being compared. The title of the figure indicates that …significant differences between time points and materials… . See: TSS (room temperature) – Purple: in 2 week is ca. 8 and in 4 week is ca. 17: 100% increase and no statistical differences? What unit? See: TAA (room temperature) – White: in 1 week is ca. 0.1 and in 4 week is ca. 0.3: 300% increase and no statistical differences? See: TAA (materials, refrigerator) – Purple (ca. 0.225) and 6KUR (ca. 0.100), difference 125% and no statistical differences? There are many more doubts and the markings of differences seem to be random.
R: Done.
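The approximate figure readings quoted in the comment above can be converted to relative changes with a single formula, (end - start) / start × 100. The Python sketch below applies it to three of those readings. The function name and dictionary labels are illustrative, and the input values are the reviewer's approximate readings from the figure, not measured data.

```python
def percent_change(start, end):
    """Relative change from start to end, expressed in percent of the start value."""
    if start == 0:
        raise ValueError("relative change is undefined for a zero starting value")
    return (end - start) / start * 100.0


# Approximate readings quoted in the comment (values read off Figure 4).
readings = {
    "TSS, purple, room temperature, week 2 to week 4": (8.0, 17.0),
    "TAA, white, room temperature, week 1 to week 4": (0.1, 0.3),
    "TAA, refrigeration, 6KUR versus purple": (0.100, 0.225),
}

changes = {label: percent_change(a, b) for label, (a, b) in readings.items()}
for label, value in changes.items():
    print(f"{label}: {value:.1f}%")
```

Under this definition a value that triples corresponds to a 200% increase, and a value that halves corresponds to a 50% decrease.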
Line 411: …value 0.01 and 0.05, p-value 0.001 and 0.005, p-value 0.0001 and 0.0005… Rev: Incomprehensible for what purpose. Line 427-429: …Figure 5. Linear Discriminant Analysis (LDA) of carrot genotypes under two storage conditions. A) Genotype differentiation at room temperature. B) Genotype differentiation under refrigeration. Colored ellipses represent 95% confidence intervals for each genotype (purple, yellow, 6KUR, 14BER, and white). Rev: Unclear. What parameters were used? Such information should be included in the Title.
R: Done.
Line 597-603: …Figure 10. Comparison of the spectral indices mARI, NDVI, CRI1, and CRI2 in carrots subjected to different processing and storage conditions (refrigeration and ambient temperature) over four weeks of shelf life. Measurements were conducted on the outer tissue of the roots. The values are accompanied by significance letters, which were determined using Tukey's test (p < 0.05). These letters indicate statistically significant differences between the processing and storage conditions for each spectral index. Rev: The order of storage conditions should be maintained throughout the manuscript. From the title it follows that the comparison was made differences between the processing and storage conditions. In text (line 571-572) state: The analysis of spectral indices revealed significant variations among the different carrot varieties and under different storage conditions.
R: Done.
Line 626-631: …Temporal monitoring (Figure 12) of β-carotene content demonstrated a clear trend of progressive decline in all samples, though with a characteristic degradation pattern consisting of two distinct phases. During the first three weeks of storage (rapid initial phase), substantial losses were recorded, ranging between 30% and 53% of the original content. Subsequently, between the third and fourth weeks, the reduction became more gradual, with additional decreases varying between 10% and 15% of the remaining values…
R: Done.
Line 640-641: …A comparative analysis of storage conditions revealed marked differences in degradation rates… Refrigerated samples exhibited a sustained loss rate fluctuating between 2.1 and 2.8 ppm per week, whereas units stored at room temperature showed accelerated degradation (4.3-5.6 ppm weekly), approximately double the rate observed under refrigeration. Among all evaluated varieties, the purple carrot displayed particularly notable stability, retaining 63.4% of its initial β-carotene after four weeks under refrigeration (cumulative loss: 36.6%). In contrast, yellow and white varieties degraded completely (100% loss) at room temperature by Week 4, highlighting the critical role of both genotype and storage conditions in preserving β-carotene content…Rev: In this section, the authors describe changes during storage without confirmation by statistical analyses. It is interesting that the authors performed a statistical analysis (comparison between varieties) but did not describe it.
R: Done.
Line 620-625: Table 1. Evolution total carotenoid content (expressed as β-carotene) in five carrot genotypes (14BER, 6KUR, purple, yellow, and white) stored at room temperature and under refrigeration for four weeks. The table shows absolute concentration (ppm), relative percentage compared to Week 1 (initial value), and cumulative loss percentage. A logarithmic exploratory model was fitted to describe the degradation trend, with parameters estimated for each genotype and storage condition (β-carotene concentration = a + b·ln(time), where a and b represent the model coefficients). Rev: A logarithmic exploratory model in title is not compatible with the explanations in the table. The description should be corrected and supplemented.
R: Done.
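The logarithmic model quoted in the Table 1 caption, β-carotene concentration = a + b·ln(time), is linear in ln(time), so its coefficients can be estimated by ordinary least squares. The Python sketch below shows one way to do this. It is a sketch under stated assumptions, not the authors' fitting procedure: time is taken in weeks, the function name `fit_log_model` is hypothetical, and the concentrations are synthetic values generated from known coefficients purely to check the fit.

```python
from math import log


def fit_log_model(times, concentrations):
    """Least-squares estimates (a, b) for concentration = a + b * ln(time)."""
    x = [log(t) for t in times]  # time must be strictly positive
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(concentrations) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, concentrations))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


# Synthetic (hypothetical) data generated from a = 20 ppm and b = -6 ppm per ln(week).
weeks = [1, 2, 3, 4]
synthetic_ppm = [20.0 - 6.0 * log(w) for w in weeks]

a_hat, b_hat = fit_log_model(weeks, synthetic_ppm)
print(f"a = {a_hat:.3f}, b = {b_hat:.3f}")
```

Because ln(1) = 0, the coefficient a is the fitted concentration at week 1, and a negative b describes a decline that is steep at first and flattens over time, which is the two-phase pattern described in the text.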
Line 626-631: …Temporal monitoring (Figure 12) of β-carotene content demonstrated a clear trend of progressive decline in all samples, though with a characteristic degradation pattern consisting of two distinct phases. During the first three weeks of storage (rapid initial phase), substantial losses were recorded, ranging between 30% and 53% of the original content. Subsequently, between the third and fourth weeks, the reduction became more gradual, with additional decreases varying between 10% and 15% of the remaining values… Rev: Stored under refrigeration and room temperature conditions? Should be corrected.
R: Done.
Line 640-648: …A comparative analysis of storage conditions revealed marked …. storage conditions in preserving β-carotene content… Rev: Where are data? Figure/Table should be cited.
R: Done.
Line 649-654: Figure 12. Total carotenoid content (expressed as β-carotene) stored under refrigeration and room temperature conditions over four weeks. The figure shows variations in carotenoid content (ppm fresh weight) across different carrot types (purple, yellow, white, 6KUR, and 14BER) stored under refrigeration (left) and ambient temperature (right) during shelf-life evaluation. Letters above data points indicate significant differences (p < 0.05) between weeks within each treatment group. Rev: First and the second sentence is a repetition – should be corrected. Rev: Different carrot types? Maybe varieties?
R: Done.
Line 640-648: …Letters above data points indicate significant differences (p < 0.05) between weeks within each treatment group. Rev: This is not true. The differences between the varieties are presented. What is treatment group? Terminology should be standardized.
R: Done.
References should be described as follows, depending on the type of work: https://www.mdpi.com/journal/horticulturae/instructions
R: Done.
Reviewer 5 Report
Comments and Suggestions for Authors
Recommendations for improving the article
With the aim of raising the quality of the presented work please check the mentioned:
Line 15: Add ’’L’’ after Daucus carota and further in the text to standardize the Latin name or skip it to reduce the text.
Line 16: bioactive compounds instead of bioactives. Bioactive compounds are mostly used in the literature.
Lines 100-103: Although the material and methods is 7 pages long, perhaps these sentences should be added to that chapter.
Line 106: unusual expression ’’posited’’
Lines 115; 138-139: Perhaps the introduction of varieties could be explained with less text
Line 133: Figure 1- Part 3, the first cloud ’’Determination of total β-caroteno’’. β-carotene instead...
Line 138: See ’’Line 15’’.
Line 222: in.sed (space or ’’...)
Lines 223-227: Add reference/references for applied method/model.
Lines 261-269: Maybe it would be better to move this to the introduction and/or discussion chapter...
Line 340: ’’Sixth genotype’’?? There are only five...
Line 384: To make the table itself better understood, the names of the tested properties a*, b*, L* should be inserted.
Line 408: TTA
Lines 466-473: unnecessary
Lines 474- 478: Lines with different colors are not explained adequately. There is no explanation within the figure text.
Line 620:
Lines 17; 133; 165-168; 383; 408; 426; 429-430; 508; 540; 564; 597; 604; 620-626; 649; 652; 698; The order of genotypes should be in the same order during the listing (in the abstract, material and methods, and in tables and figures). Maybe it's a small and negligible detail to you (who did the whole study) but my impression is that it makes the topic difficult for the first time reader to follow.
Lines 594-595: ’’While pigmented varieties (purple and yellow) experienced differential but significant losses, white carrots maintained relative stability.’’ I’m not sure but maybe it is better to rephrase.
Line 737-738: ’’Refrigeration reduced β-carotene losses to 15-20%, a stark contrast to the 50–60% degradation observed under ambient conditions [39,40]’’.
Line 790: delete space after ’’(53)’’
Lines 880-881: Error 404 - File not found
Dear Authors,
Fascinating paper! However with a little effort and very little additional work, this material shown could have been used to write three papers and one additional review paper. This amount of text is more suitable for a chapter in a book than a scientific article. The same goes for the number of citations. In the previous chapter I published, the number of citations was limited to 75, which is only five differences compared to this text.
I appreciate your multidimensional approach. I can say that it is a very inspiring work. I also got a couple of ideas from reading your study.
I have no experience with interviews of consumers and processing and analyzing data collected in that way as well as consumer sensory evaluation. So, I may not be the best reviewer for this part of the text and paper. However, I have to admit that through this text I got to know some new concepts/methodology and that everything written was very informative for me.
Although based on the absence of self-citations I can assume that you are closer to the beginning of your career, I have the impression that you are guided by good mentors.
Please check the suggestions mentioned above and try to adopt or modify them for correction and clarity of the text.
I wish you the best of luck in your future work.
Comments for author File:
Comments.pdf
Author Response
Reviewer 5
Dear Authors, Fascinating paper! However with a little effort and very little additional work, this material shown could have been used to write three papers and one additional review paper. This amount of text is more suitable for a chapter in a book than a scientific article. The same goes for the number of citations. In the previous chapter I published, the number of citations was limited to 75, which is only five differences compared to this text. I appreciate your multidimensional approach. I can say that it is a very inspiring work. I also got a couple of ideas from reading your study. I have no experience with interviews of consumers and processing and analyzing data collected in that way as well as consumer sensory evaluation. So, I may not be the best reviewer for this part of the text and paper. However, I have to admit that through this text I got to know some new concepts/methodology and that everything written was very informative for me.
R: We are sincerely grateful for this generous and encouraging assessment. The reviewer's observation that the material could sustain three separate papers and a review is one we take as both a compliment and a constructive challenge, and one we fully agree with. The decision to present the complete integrated dataset in a single manuscript was driven by the exploratory and hypothesis-generating nature of the work, where we felt the full picture was more informative than its parts in isolation. However, we acknowledge that this comes at the cost of manuscript length and focus, and the revised version has been substantially shortened and restructured in response to this and related comments from other reviewers. We are particularly glad that the multidimensional approach and the integration of consumer perception methodology provided new concepts and inspiration to the reviewer. This is precisely the kind of cross-disciplinary value we hoped the work would generate, and it is very gratifying to receive this feedback from a reviewer with deep expertise in the instrumental and physicochemical dimensions of postharvest research. The reviewer's candid acknowledgement of limited experience with consumer interview methodology and NLP-based analysis is greatly appreciated and reflects the intellectual honesty that makes peer review genuinely valuable. We hope the clarifications and expanded methodological descriptions provided in the revised manuscript make these components more accessible and transparent to readers from diverse disciplinary backgrounds. We thank the reviewer sincerely for the time, care, and genuine intellectual engagement dedicated to evaluating this work.
Although based on the absence of self-citations I can assume that you are closer to the beginning of your career, I have the impression that you are guided by good mentors. Please check the suggestions mentioned above and try to adopt or modify them for correction and clarity of the text. I wish you the best of luck in your future work.
With the aim of raising the quality of the presented work please check the mentioned
R: Done.
Line 15: Add ’’L’’ after Daucus carota and further in the text to standardize the Latin name or skip it to reduce the text.
R: Daucus carota L. has been standardized throughout the manuscript. The abbreviation "L." is now included at first mention in each section and omitted in subsequent mentions within the same section, following standard botanical nomenclature practice.
Line 16: bioactive compounds instead of bioactives. Bioactive compounds are mostly used in the literature.
R: "Bioactives" has been replaced with "bioactive compounds" throughout the manuscript.
Lines 100-103: Although the material and methods is 7 pages long, perhaps these sentences should be added to that chapter.
R: We understand the reviewer's concern, but based on other comments, this section was included in the introduction.
Line 106: unusual expression ’’posited’’
R: The term "posited" has been replaced with "proposed", which is more conventional in this context.
Lines 115; 138-139: Perhaps the introduction of varieties could be explained with less text
R: The description of carrot varieties in the Introduction has been condensed to reduce redundancy while retaining the information essential for contextualizing the study.
Line 133: Figure 1- Part 3, the first cloud ’’Determination of total β-caroteno’’. β-carotene instead…
R: "β-caroteno" has been corrected to "β-carotene" in Figure 1, Part 3.
Line 138: See ’’Line 15’’.
R: Corrected consistently with the standardization described in Comment 1.
Line 222: in.sed (space or ’’...)
R: The typographical error "in.sed" has been corrected. The intended punctuation has been applied.
Lines 223-227: Add reference/references for applied method/model.
R: References supporting the logarithmic degradation model applied have been added. The model is now cited with reference to established kinetic degradation literature for carotenoids under postharvest storage conditions.
Lines 261-269: Maybe it would be better to move this to the introduction and/or discussion chapter…
R: The content in these lines has been reviewed and partially redistributed between the Introduction and Discussion sections, where it is more contextually appropriate and contributes more effectively to the narrative flow of the manuscript.
Line 340: ’’Sixth genotype’’?? There are only five…
R: The erroneous reference to a "sixth genotype" has been corrected. The study evaluates five genotypes, and the text has been amended accordingly.
Line 384: To make the table itself better understood, the names of the tested properties a*, b*, L* should be inserted.
R: The color space parameter names (a*, b*, L*) have been explicitly defined in the table header and footnote to improve standalone interpretability.
Line 408: TTA
R: The abbreviation TTA has been defined at first mention as "total titratable acidity (TTA)" and is now used consistently thereafter.
Lines 466-473: unnecessary
R: The abbreviation TTA has been defined at first mention as "total titratable acidity (TTA)" and is now used consistently thereafter.
Lines 474- 478: Lines with different colors are not explained adequately. There is no explanation within the figure text.
R: The figure caption has been revised to include a complete explanation of the color coding used for the different lines, ensuring the figure is self-explanatory without requiring reference to the main text.
Line 620:
R: The content at this line has been reviewed and corrected in the revised manuscript.
Lines 17; 133; 165-168; 383; 408; 426; 429-430; 508; 540; 564; 597; 604; 620-626; 649; 652; 698; The order of genotypes should be in the same order during the listing (in the abstract, material and methods, and in tables and figures). Maybe it's a small and negligible detail to you (who did the whole study) but my impression is that it makes the topic difficult for the first time reader to follow.
R: The order of genotype listing has been standardized throughout the entire manuscript (Abstract, Materials and Methods, Results, Discussion, Conclusions, all tables, and all figures) using the fixed sequence: 14BER, 6KUR, white, yellow, purple. This order is established at first mention and maintained consistently thereafter, as also addressed in our response to Comment 8 of this review.
Lines 594-595: ’’While pigmented varieties (purple and yellow) experienced differential but significant losses, white carrots maintained relative stability.’’ I’m not sure but maybe it is better to rephrase.
R: The sentence has been rephrased to: "Among pigmented genotypes, purple and yellow exhibited differential but significant β-carotene losses, whereas white carrots, consistent with their low baseline pigment content, showed comparatively greater stability in carotenoid-related indices".
Line 737-738: ’’Refrigeration reduced β-carotene losses to 15-20%, a stark contrast to the 50–60% degradation observed under ambient conditions [39,40]’’.
R: The statement has been revised to ensure the cited references directly support the specific quantitative values reported. References [39] and [40] have been verified for relevance and replaced where necessary with more directly supporting literature.
Line 790: delete space after ’’(53)’’
R: The extra space after "(53)" has been removed.
Lines 880-881: Error 404 - File not found
R: The broken URL (Error 404) has been identified and corrected. The reference has been updated with the current valid URL or replaced with the corresponding DOI to ensure permanent accessibility.
Reviewer 6 Report
Comments and Suggestions for AuthorsThe study presents a comprehensive and multidimensional investigation into the postharvest quality of various carrot genotypes, integrating advanced spectroscopic techniques with conventional and sensory analyses. The topic is highly relevant, and the application of non-destructive spectral indices for quality monitoring is a significant strength. The methodological approach is generally sound, and the dataset is substantial.
1.Sample Size and Statistical Power: The manuscript states that the experiment had a "5x2 factorial arrangement (five varieties x two storage conditions) with three independent biological replicates." However, it later mentions that spectral data were collected from "30 carrot samples." Could you please clarify if n=3 refers to three individual carrots per variety per treatment, or if it represents a pooled sample? A power analysis is not presented. Given the inherent biological variability in agricultural produce, is n=3 sufficient to draw the robust, generalized conclusions presented, especially for the multivariate analyses (LDA, PCA) and sensory evaluation? Please justify the sample size or provide a power analysis to support its adequacy.
2.Spectral Data Acquisition and Preprocessing:​ The description of the spectral data preprocessing is commendable. However, for the results to be reproducible, critical parameters must be explicitly stated. You mention using a Savitzky-Golay filter (window=15, polynomial=3). What was the specific rationale for selecting these parameters over other common settings? Furthermore, please specify the method used for "standard normalization" (e.g., Standard Normal Variate, Min-Max scaling), as this choice can significantly impact the resulting models and indices.
3.Sensory Evaluation Protocol:​ The consumer panel involved 60 participants per week over three weeks. It is unclear if these were the same 60 individuals each week (a repeated-measures design) or different cohorts. This has profound implications for the statistical analysis of temporal sensory data. Furthermore, the protocol for presenting the samples (e.g., monadic sequential, paired comparison) and any randomization or blinding procedures are not described. Please provide a detailed step-by-step protocol for the sensory test to ensure its validity and reproducibility.
4.Validation of Spectral Models:​ The study demonstrates correlations between spectral indices and pigment content but does not describe a validation procedure for these predictive models. Were the data split into calibration and validation sets? Was cross-validation employed? Without proper validation, the predictive power of the indices (CRI1, CRI2, mARI) remains anecdotal. Please provide metrics (e.g., RMSE, R²) for both calibration and validation to substantiate the claims about the robustness of the non-destructive technique.
5.Multivariate Analysis Interpretation:​ The Linear Discriminant Analysis (LDA) is used to differentiate genotypes. However, the interpretation of the results, particularly in Figure 9, seems speculative. You note that refrigeration "attenuates distinctive differences," but an alternative explanation is that refrigeration simply preserves the initial quality, reducing the stress-induced variance that amplifies differences at room temperature. Could the clustering patterns be re-interpreted through the lens of preservation efficacy rather than attenuation of intrinsic differences? A more nuanced discussion is needed.
6.Logarithmic Degradation Model:​ A logarithmic model is fitted to the β-carotene degradation data. While it provided a good empirical fit (high R²), what is the biochemical or kinetic rationale for choosing a logarithmic model over more conventional zero-order or first-order kinetic models commonly used for pigment degradation? Please justify the model selection based on the underlying degradation mechanism or compare the goodness-of-fit with other standard models.
7.Handling of Spectral Interference:​ The manuscript correctly identifies spectral interference from anthocyanins in purple carrots as a key limitation for the mARI index. However, the results section merely states that mARI showed "lower-than-expected values" without further diagnostic analysis. Did you explore other published indices or develop a correction factor? Please include a dedicated analysis or discussion on how this interference manifests in the spectra and potential algorithmic approaches to mitigate it, as this is critical for the study's main conclusion.
8.Correlation vs. Causation in Sensory-Spectral Link: The study establishes correlations between spectral data and sensory attributes. However, the language sometimes implies a predictive or causal relationship (e.g., "spectral-detected changes... ultimately determined the consumer sensory perception"). The analysis, as presented, shows association, not causation. The text should be carefully revised throughout to clarify that the spectral data are correlated with or indicative of sensory changes, not that they directly cause or determine them.
9.Firmness Data Paradox:​ A very interesting but insufficiently explained result is the "atypical behavior" of firmness, where ambient-stored samples appeared more stable. You suggest this may be due to dehydration-induced surface hardening. Did you measure weight loss concurrently to directly test this hypothesis? This paradoxical finding is a key point that deserves a deeper, data-driven discussion rather than a tentative suggestion.
10.Consumer Preference Survey Analysis:​ The results from the Likert-scale survey on consumer preferences (Figure 14) are presented but not adequately discussed in the context of the physicochemical and sensory findings. For instance, how does the fact that "firmness" and "price" were the top priorities align with the observed rapid decline in firmness in your samples? A discussion integrating these survey results with the main experimental findings would significantly strengthen the "consumer-oriented" claim of the research.
11.Temporal Dynamics of Spectral Clusters:​ The spectral clustering analysis in Figure 7 shows fascinating temporal evolution. However, the description is largely phenomenological. What are the specific biochemical changes (e.g., degradation of specific compounds, cell wall breakdown, water loss) that you hypothesize are driving the observed shifts in PCA space for each variety and storage condition? Linking the spectral clusters more directly to underlying biochemistry would add considerable depth.
12.Limitations and Future Work Specificity:​ The limitations section is brief. The "need for commercial-scale validation" is mentioned, but what specific challenges of a commercial setting are most critical to address (e.g., temperature fluctuations, bulk sampling, different lighting conditions)? Similarly, the suggestion to incorporate "machine-learning models" is vague. Please propose specific algorithms (e.g., Partial Least Squares Regression, Support Vector Machines) and state what specific predictions they could improve (e.g., predicting sensory scores directly from spectra).
Author Response
Reviewer 6
The study presents a comprehensive and multidimensional investigation into the postharvest quality of various carrot genotypes, integrating advanced spectroscopic techniques with conventional and sensory analyses. The topic is highly relevant, and the application of non-destructive spectral indices for quality monitoring is a significant strength. The methodological approach is generally sound, and the dataset is substantial.
Sample Size and Statistical Power: The manuscript states that the experiment had a "5x2 factorial arrangement (five varieties x two storage conditions) with three independent biological replicates." However, it later mentions that spectral data were collected from "30 carrot samples." Could you please clarify if n=3 refers to three individual carrots per variety per treatment, or if it represents a pooled sample? A power analysis is not presented. Given the inherent biological variability in agricultural produce, is n=3 sufficient to draw the robust, generalized conclusions presented, especially for the multivariate analyses (LDA, PCA) and sensory evaluation? Please justify the sample size or provide a power analysis to support its adequacy.
R: We thank the reviewer for this detailed methodological observation and welcome the opportunity to clarify several critical points that appear to have been misread or overlooked in the Materials and Methods section. Regarding biological replication and sample size: We respectfully disagree with the characterization of our replication as insufficient. The three biological replicates per treatment combination were not arbitrarily defined. This study is embedded within a four-year regional research program in which these five genotypes have been systematically evaluated across multiple growing environments and seasons. The replicates used in the present postharvest study were selected to represent the full phenotypic variability of each genotype based on established morphological and quality descriptors validated across prior regional trials. Each replicate therefore represents a biologically informed sample of the genotype population, not a convenience subsample. This rationale is explicitly described in the Materials and Methods section (page X, lines X–X), which we respectfully invite the reviewer to revisit.
R: We thank the reviewer for this critical observation and take it seriously. We respectfully provide the following statistical and methodological justification for the multivariate analyses applied. Regarding sample size and multivariate methods: The reviewer raises a legitimate general concern about applying multivariate techniques to small datasets. However, we respectfully argue that the appropriateness of a multivariate method depends not only on total sample size but also on the ratio of observations to variables, the analytical objective, and the nature of the inference being drawn. In the present study, the multivariate analyses were applied to a dataset of 120 observations (5 genotypes × 2 storage conditions × 4 evaluation times × 3 replicates), which satisfies the minimum observation-to-variable ratios recommended for the methods applied (Tabachnick & Fidell, 2019; Huberty & Olejnik, 2006). It is therefore incorrect to characterize this as a small dataset in the context of multivariate analysis; the concern may have arisen from conflating the number of replicates per cell with the total analytical sample size, which are distinct quantities. Regarding Principal Component Analysis (PCA): PCA was applied as an exploratory dimensionality reduction tool, not as a confirmatory or predictive model. Its application to datasets of this size is well established and does not require cross-validation when used descriptively (Jolliffe & Cadima, 2016). No inferential claims were derived from PCA outputs beyond the identification of major sources of variance among variables, which is an appropriate use of the technique regardless of sample size. Regarding Linear Discriminant Analysis (LDA): LDA was applied to the complete pooled dataset (n = 120 observations, n = 24 per genotype), which satisfies the minimum sample size requirements for stable discriminant function estimation (Tabachnick & Fidell, 2019).
To address the reviewer's concern about the absence of validation, leave-one-out cross-validation (LOOCV) has been incorporated into the revised analysis, and the resulting classification accuracy and confusion matrix are now reported in the Results section. LOOCV is the recommended internal validation strategy for LDA when external validation datasets are unavailable (Lachenbruch & Mickey, 1968), and its inclusion directly addresses the reviewer's observation. Regarding K-means clustering: We acknowledge the reviewer's concern and agree that K-means clustering applied to small per-cell samples risks identifying unstable cluster structures. In the revised manuscript, K-means results are now presented strictly as an exploratory descriptive tool, and the cluster solution has been validated using the silhouette coefficient and the within-cluster sum of squares elbow criterion to confirm that the identified structure is not artifactual. The interpretation of clustering results has been explicitly bounded to pattern description rather than biological inference, and this limitation is now clearly stated in both the Methods and Discussion sections. Regarding the general concern of analytical inflation: We respectfully disagree with this characterization. The methods applied serve distinct and complementary analytical objectives: dimensionality reduction (PCA), group discrimination (LDA), and unsupervised pattern detection (K-means), each justified by the specific research question it addresses. The concern of analytical inflation typically arises when methods are applied redundantly or without a defined inferential purpose. In the revised manuscript, the rationale for each multivariate method, its analytical scope, its limitations given the available data, and the validation strategy applied are now explicitly documented in the Materials and Methods section, which has been substantially expanded to address this and related comments.
References:
Huberty, C. J., & Olejnik, S. (2006). Applied MANOVA and discriminant analysis (2nd ed.). John Wiley & Sons. https://doi.org/10.1002/047178947X
Jolliffe, I. T., & Cadima, J. (2016). Principal component analysis: A review and recent developments. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 374(2065), 20150202. https://doi.org/10.1098/rsta.2015.0202
Lachenbruch, P. A., & Mickey, M. R. (1968). Estimation of error rates in discriminant analysis. Technometrics, 10(1), 1–11. https://doi.org/10.1080/00401706.1968.10490530
Tabachnick, B. G., & Fidell, L. S. (2019). Using multivariate statistics (7th ed.). Pearson.
Rousseeuw, P. J. (1987). Silhouettes: A graphical aid to the interpretation and validation of cluster analysis. Journal of Computational and Applied Mathematics, 20, 53–65. https://doi.org/10.1016/0377-0427(87)90125-7
Thorndike, R. L. (1953). Who belongs in the family? Psychometrika, 18(4), 267–276. https://doi.org/10.1007/BF02289263
Kohavi, R. (1995). A study of cross-validation and bootstrap for accuracy estimation and model selection. Proceedings of the 14th International Joint Conference on Artificial Intelligence (IJCAI), 2, 1137–1143.
Costello, A. B., & Osborne, J. W. (2005). Best practices in exploratory factor analysis: Four recommendations for getting the most from your analysis. Practical Assessment, Research & Evaluation, 10(7), 1–9. https://doi.org/10.7275/jyj1-4868
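The distinction drawn in the response between replicates per cell and total analytical sample size can be checked with a short enumeration of the design. This is only a sketch: the genotype names are taken from the response to Reviewer 5, while the storage labels and week labels are assumptions made for illustration.

```python
from itertools import product
from collections import Counter

# Factor levels as described in the response; storage and week labels are assumed.
genotypes = ["14BER", "6KUR", "white", "yellow", "purple"]
storage = ["ambient", "refrigerated"]
weeks = [1, 2, 3, 4]        # four evaluation times (labels hypothetical)
replicates = [1, 2, 3]      # three biological replicates per cell

# Every (genotype, storage, week, replicate) combination is one observation.
observations = list(product(genotypes, storage, weeks, replicates))

total = len(observations)
per_genotype = Counter(g for g, _, _, _ in observations)
per_cell = Counter((g, s, w) for g, s, w, _ in observations)

print(total)                    # 120
print(per_genotype["purple"])   # 24
print(set(per_cell.values()))   # {3}
```

The enumeration reproduces the figures quoted in the response (120 observations, 24 per genotype) while showing that each genotype × storage × time cell still contains only three replicates, which is the quantity the reviewer's question refers to.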
Spectral Data Acquisition and Preprocessing:​ The description of the spectral data preprocessing is commendable. However, for the results to be reproducible, critical parameters must be explicitly stated. You mention using a Savitzky-Golay filter (window=15, polynomial=3). What was the specific rationale for selecting these parameters over other common settings? Furthermore, please specify the method used for "standard normalization" (e.g., Standard Normal Variate, Min-Max scaling), as this choice can significantly impact the resulting models and indices.
R: We thank the reviewer for this precise and technically important observation regarding spectral preprocessing reproducibility. Regarding Savitzky-Golay filter parameters: The window size of 15 and polynomial order of 3 were selected based on two complementary criteria. First, a window of 15 data points provides sufficient smoothing to attenuate high-frequency noise inherent to field-acquired Vis/NIR reflectance spectra while preserving the spectral features associated with pigment absorption bands, particularly the relatively broad carotenoid (450–500 nm) and anthocyanin (550–600 nm) absorption regions, which would be distorted by narrower windows with higher polynomial orders. Second, a third-order polynomial was selected as it adequately captures the curvature of spectral features in these regions without overfitting local noise. These parameter values are consistent with those reported in comparable Vis/NIR spectral preprocessing workflows for vegetable quality assessment (Rinnan et al., 2009; Cen & He, 2007). A sensitivity analysis comparing window sizes of 11, 15, and 21 with polynomial orders of 2, 3, and 4 was conducted during preprocessing optimization, and the selected combination produced the lowest residual noise-to-signal ratio while preserving absorption feature integrity. This rationale and the sensitivity analysis results are now documented in the revised Methods section. Regarding standard normalization: We acknowledge that the original description was insufficiently specific. The normalization method applied was Standard Normal Variate (SNV) transformation, which centers each spectrum to zero mean and scales it to unit variance on a per-spectrum basis, effectively removing multiplicative scatter effects and baseline offsets arising from differences in surface texture and measurement geometry across samples.
SNV was selected over Min-Max scaling because it is sample-independent — it does not require knowledge of the global spectral range across the dataset — and is therefore more appropriate for spectra acquired across multiple sessions and surface types (Barnes et al., 1989; Rinnan et al., 2009). This specification has been added explicitly to the revised Methods section.
References:
Barnes, R. J., Dhanoa, M. S., & Lister, S. J. (1989). Standard normal variate transformation and de-trending of near-infrared diffuse reflectance spectra. Applied Spectroscopy, 43(5), 772–777. https://doi.org/10.1366/0003702894202201
Cen, H., & He, Y. (2007). Theory and application of near infrared reflectance spectroscopy in determination of food quality. Trends in Food Science and Technology, 18(2), 72–83. https://doi.org/10.1016/j.tifs.2006.09.003
Rinnan, Å., Berg, F. van den, & Engelsen, S. B. (2009). Review of the most common pre-processing techniques for near-infrared spectra. TrAC Trends in Analytical Chemistry, 28(10), 1201–1222. https://doi.org/10.1016/j.trac.2009.07.007
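The per-spectrum SNV transformation described in the response can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: the function name `snv`, the reflectance values, and the use of the sample standard deviation (n − 1) are assumptions.

```python
from statistics import mean, stdev

def snv(spectrum):
    """Standard Normal Variate: centre one spectrum to zero mean, scale to unit SD."""
    m = mean(spectrum)
    s = stdev(spectrum)  # sample standard deviation (n - 1); an assumption here
    if s == 0:
        raise ValueError("flat spectrum: SNV is undefined")
    return [(x - m) / s for x in spectrum]

# Hypothetical reflectance values for one sample.
raw = [0.42, 0.45, 0.51, 0.58, 0.62]
# The same spectral shape with a multiplicative scatter factor and a baseline offset.
scattered = [2.0 * x + 0.1 for x in raw]

a = snv(raw)
b = snv(scattered)

# After SNV the two spectra coincide, because each is normalized using only
# its own mean and standard deviation.
print(max(abs(x - y) for x, y in zip(a, b)) < 1e-9)
```

Because the transform uses only statistics of the spectrum being processed, it needs no dataset-wide minimum or maximum, which is the sample-independence argument given for preferring SNV over Min-Max scaling.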
Sensory Evaluation Protocol: The consumer panel involved 60 participants per week over three weeks. It is unclear if these were the same 60 individuals each week (a repeated-measures design) or different cohorts. This has profound implications for the statistical analysis of temporal sensory data. Furthermore, the protocol for presenting the samples (e.g., monadic sequential, paired comparison) and any randomization or blinding procedures are not described. Please provide a detailed step-by-step protocol for the sensory test to ensure its validity and reproducibility.
R: We thank the reviewer for this detailed observation and welcome the opportunity to clarify the design and analytical framework of the consumer perception component, which differs fundamentally from a trained sensory panel study. The consumer perception component was not designed as a repeated-measures expert panel, nor does it apply traditional inferential sensory statistics. Its explicit objective was to capture spontaneous consumer perception across the postharvest storage period, for which a rotating cohort design (approximately 60 naive participants per evaluation session, not necessarily the same individuals across weeks) is the methodologically appropriate approach. In consumer perception research, the absence of repeated participation is intentional, as longitudinal exposure to the same product would introduce learning effects, adaptation bias, and expectation shifts that would invalidate the consumer-representative nature of the responses (Lawless & Heymann, 2010; Varela & Ares, 2012). The analytical framework applied is strictly descriptive and non-inferential, based on two complementary approaches: an NLP pipeline for free-text open-ended responses, and a Likert-scale questionnaire for structured attribute importance rating. Neither approach requires the statistical assumptions associated with repeated-measures designs (such as sphericity, within-subject correlation structure, or individual longitudinal tracking) because the unit of analysis is the population-level perceptual response at each time point, not the individual trajectory across time. This distinction is now explicitly stated in the revised Materials and Methods section.
The NLP-based protocol applied in this study has been previously developed, validated, and published by our research group through independent peer review (https://doi.org/10.1016/j.jafr.2025.102504; https://onlinelibrary.wiley.com/doi/10.1155/ioa/3971825), and constitutes a reproducible, documented pipeline that does not depend on panel training or controlled sensory laboratory conditions. The step-by-step protocol, including participant recruitment, sample presentation procedure, questionnaire administration, and the NLP processing pipeline, is now fully described in the revised Methods section and referenced to the published validation studies to ensure complete reproducibility.
Validation of Spectral Models:​ The study demonstrates correlations between spectral indices and pigment content but does not describe a validation procedure for these predictive models. Were the data split into calibration and validation sets? Was cross-validation employed? Without proper validation, the predictive power of the indices (CRI1, CRI2, mARI) remains anecdotal. Please provide metrics (e.g., RMSE, R²) for both calibration and validation to substantiate the claims about the robustness of the non-destructive technique.
R: We thank the reviewer for this important methodological observation and welcome the opportunity to clarify a fundamental distinction in the analytical scope of the spectral component that appears to have been misinterpreted in the original manuscript framing. The spectral indices (CRI1, CRI2, mARI, NDVI) were not applied as predictive models in this study, and no predictive validation framework (calibration/validation split, cross-validation, or predictive performance metrics such as RMSE and R² in a regression context) was intended or claimed for this component. The spectral analysis was designed and applied strictly as a characterization tool, used to describe genotype- and treatment-dependent temporal variation in optical properties associated with carotenoid and anthocyanin content across the storage period. The relationships reported between spectral indices and physicochemical parameters are correlational and descriptive in nature, not predictive, and are presented as exploratory evidence of the potential utility of these indices for non-destructive quality monitoring rather than as a validated prediction system. The predictive component of the manuscript is explicitly and exclusively the deterministic logarithmic decay model developed for β-carotene degradation kinetics, which describes and predicts carotenoid loss as a function of storage time and temperature for each genotype × storage condition combination. This model was evaluated using goodness-of-fit metrics (R², RMSE, MAE) as reported in the Results and Supplementary Material, constituting the formal quantitative predictive framework of the study. We acknowledge that the original manuscript framing, particularly the use of terms such as "predictive value" in relation to spectral indices, created the misleading impression that a predictive validation framework had been applied to the spectral data.
All such language has been revised in the manuscript to clearly distinguish between the descriptive-characterization role of the spectral indices and the predictive role of the kinetic degradation model, ensuring that no inferential or predictive claims are made beyond what the data and analyses support.
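For readers less familiar with the goodness-of-fit metrics named in this response (R², RMSE, MAE), the following sketch shows how they are conventionally computed from observed and model-predicted values. The function name `fit_metrics` and the numbers are hypothetical and are not data from the study.

```python
from math import sqrt

def fit_metrics(observed, predicted):
    """Return R², RMSE and MAE for paired observed and predicted values."""
    n = len(observed)
    residuals = [o - p for o, p in zip(observed, predicted)]
    ss_res = sum(r * r for r in residuals)                # residual sum of squares
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)   # total sum of squares
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": sqrt(ss_res / n),
        "mae": sum(abs(r) for r in residuals) / n,
    }

# Hypothetical pigment values (arbitrary units) at four storage times,
# with hypothetical predictions from a fitted decay model.
observed = [10.0, 8.0, 7.0, 6.5]
predicted = [9.8, 8.3, 7.1, 6.3]

metrics = fit_metrics(observed, predicted)
print(metrics)
```

Computed on the same data used for fitting, these values describe fit quality only; they do not by themselves indicate how the model would perform on samples it has not seen, which is the distinction the reviewer raises between calibration and validation.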
Multivariate Analysis Interpretation: The Linear Discriminant Analysis (LDA) is used to differentiate genotypes. However, the interpretation of the results, particularly in Figure 9, seems speculative. You note that refrigeration "attenuates distinctive differences," but an alternative explanation is that refrigeration simply preserves the initial quality, reducing the stress-induced variance that amplifies differences at room temperature. Could the clustering patterns be re-interpreted through the lens of preservation efficacy rather than attenuation of intrinsic differences? A more nuanced discussion is needed.
R: We thank the reviewer for this thoughtful and scientifically substantive observation. The alternative interpretation proposed, that refrigeration preserves initial quality and thereby reduces stress-induced variance rather than attenuating intrinsic genotypic differences, is not only valid but arguably more biologically precise and parsimonious than the original framing, and we are grateful for this insight. The original statement that refrigeration "attenuates distinctive differences" among genotypes implicitly attributed the reduced inter-genotype separation in discriminant space to a suppression of inherent biological distinctiveness, which is mechanistically ambiguous and potentially misleading. The reviewer's reinterpretation is more accurately grounded in postharvest physiology: under ambient storage, accelerated metabolic activity, pigment degradation, moisture loss, and tissue senescence introduce progressive stress-induced variance that differentially affects genotypes according to their intrinsic stability, thereby amplifying apparent inter-genotype differences in the discriminant space. Refrigeration, by slowing these processes, preserves the original quality profile of each genotype and consequently maintains a more compact and less dispersed cluster structure that reflects the initial biological state rather than divergent deterioration trajectories. This reinterpretation has been adopted in full in the revised Discussion section. The LDA results are now discussed explicitly through the lens of preservation efficacy, distinguishing between stress-induced variance amplification under ambient conditions and quality preservation under refrigeration as the mechanistic drivers of the observed clustering patterns. The revised interpretation is more nuanced, more biologically defensible, and more consistent with the broader postharvest literature on genotype-dependent storage stability.
We thank the reviewer for this contribution, which substantially improves the scientific quality of the Discussion.
Logarithmic Degradation Model: A logarithmic model is fitted to the β-carotene degradation data. While it provided a good empirical fit (high R²), what is the biochemical or kinetic rationale for choosing a logarithmic model over more conventional zero-order or first-order kinetic models commonly used for pigment degradation? Please justify the model selection based on the underlying degradation mechanism or compare the goodness-of-fit with other standard models.
R: We thank the reviewer for this rigorous and scientifically important question. The reviewer correctly identifies that zero-order and first-order kinetic models are the conventional frameworks for pigment degradation in postharvest research, and that model selection should be justified on mechanistic or comparative empirical grounds rather than solely on goodness-of-fit. Regarding conventional kinetic models: Zero-order kinetics assume a constant degradation rate independent of substrate concentration, which is appropriate when the pigment concentration is high relative to the degrading agents and the rate-limiting step is not substrate-dependent. First-order kinetics assume a degradation rate proportional to the remaining substrate concentration, which is the most widely reported model for carotenoid degradation under thermal and storage conditions (Dutta et al., 2005; Knockaert et al., 2012; Zanoni et al., 1998). These models predict linear and exponential concentration-time relationships, respectively. Regarding the logarithmic model selected: The logarithmic model was selected empirically on the basis of the observed concentration-time relationship, which showed a characteristic biphasic pattern (a rapid initial decline during weeks 1–3 followed by progressive deceleration toward week 4) that is not adequately described by either zero-order or first-order kinetics over the full storage period studied. This deceleration pattern is consistent with a substrate-limiting scenario in which the most labile carotenoid fractions degrade rapidly in early storage, leaving a more structurally stable residual fraction that degrades more slowly, a behavior that has been reported for complex carotenoid matrices in intact vegetable tissue under mild storage conditions (Knockaert et al., 2012; Lavelli et al., 2007). Regarding model comparison: We acknowledge that a formal comparative analysis of model fit was not presented in the original manuscript, which represents a methodological gap.
In the revised manuscript, zero-order, first-order, and logarithmic models have been fitted to the β-carotene degradation data for each genotype × storage condition combination, and goodness-of-fit metrics (R², RMSE, and MAE) are now reported comparatively for all three models in the Supplementary Material. The logarithmic model provided superior or equivalent fit in all cases where the biphasic deceleration pattern was present, while first-order kinetics provided comparable fit under ambient storage conditions where degradation was more rapid and sustained. This comparative analysis now forms the explicit basis for model selection in the revised manuscript, replacing the purely empirical justification of the original version.
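A three-model comparison of this kind could be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes each model is fitted by ordinary least squares after linearization (zero-order as C against t, first-order as ln C against t, logarithmic as C against ln t), that all metrics are computed on the original concentration scale, and that sampling times are strictly positive. The function names and the concentration values are hypothetical and are not taken from the manuscript.

```python
import math

def linfit(x, y):
    """Ordinary least-squares line y = a + b*x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

def fit_metrics(obs, pred):
    """R2, RMSE and MAE of predictions against observations."""
    n = len(obs)
    mean = sum(obs) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean) ** 2 for o in obs)
    return {
        "R2": 1 - ss_res / ss_tot,
        "RMSE": math.sqrt(ss_res / n),
        "MAE": sum(abs(o - p) for o, p in zip(obs, pred)) / n,
    }

def compare_models(t, c):
    """t: storage times (> 0); c: concentrations (> 0)."""
    ln_t = [math.log(ti) for ti in t]
    ln_c = [math.log(ci) for ci in c]
    a0, b0 = linfit(t, c)      # zero-order:  C = a + b*t
    a1, b1 = linfit(t, ln_c)   # first-order: ln C = a + b*t
    a2, b2 = linfit(ln_t, c)   # logarithmic: C = a + b*ln(t)
    preds = {
        "zero_order": [a0 + b0 * ti for ti in t],
        "first_order": [math.exp(a1 + b1 * ti) for ti in t],
        "logarithmic": [a2 + b2 * x for x in ln_t],
    }
    # Metrics are evaluated on the concentration scale for all three models.
    return {name: fit_metrics(c, p) for name, p in preds.items()}

# Hypothetical beta-carotene values for weeks 1-4 (illustration only).
weeks = [1, 2, 3, 4]
beta_carotene = [10.0, 7.6, 6.5, 5.9]
for name, m in compare_models(weeks, beta_carotene).items():
    print(name, {k: round(v, 3) for k, v in m.items()})
```

With as few sampling points as in this example and two fitted parameters per model, differences in R² between models should be read cautiously; RMSE and MAE on the original scale are the more direct comparison.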
References
Lavelli, V., Zanoni, B., & Zaniboni, A. (2007). Effect of water activity on carotenoid degradation in dehydrated carrots. Food Chemistry, 104(4), 1705–1711. https://doi.org/10.1016/j.foodchem.2007.03.042
Zanoni, B., Peri, C., Nani, R., & Lavelli, V. (1998). Oxidative heat damage of tomato halves as affected by drying. Food Research International, 31(5), 395–401. https://doi.org/10.1016/S0963-9969(98)00102-1
Dutta, D., Dutta, A., Raychaudhuri, U., & Chakraborty, R. (2005). Rheological characteristics and thermal degradation kinetics of beta-carotene in pumpkin puree. Journal of Food Engineering, 76(4), 538–546. https://doi.org/10.1016/j.jfoodeng.2005.06.005
Handling of Spectral Interference: The manuscript correctly identifies spectral interference from anthocyanins in purple carrots as a key limitation for the mARI index. However, the results section merely states that mARI showed "lower-than-expected values" without further diagnostic analysis. Did you explore other published indices or develop a correction factor? Please include a dedicated analysis or discussion on how this interference manifests in the spectra and potential algorithmic approaches to mitigate it, as this is critical for the study's main conclusion.
R: We thank the reviewer for this constructive suggestion. The development of correction factors or alternative indices to address anthocyanin-carotenoid spectral overlap in pigmented carrots is indeed a scientifically valuable direction, and we fully acknowledge the mechanistic importance of this limitation for the interpretation of mARI values in purple genotypes. However, as the reviewer will appreciate upon consulting the Materials and Methods section, the spectral indices applied in this study (CRI1, CRI2, mARI, and NDVI) were selected based on their documented association with quality parameters specifically relevant to carrot postharvest characterization, following an established selection framework grounded in the carrot quality literature. The scope of the spectral analysis was deliberately bounded to the characterization of temporal variation in these indices across genotypes and storage conditions, rather than the development or optimization of new index formulations. Given the already considerable breadth of the study, which integrates physicochemical, spectral, kinetic modeling, and consumer perception components, the development of correction algorithms or alternative index formulations for spectral interference mitigation was beyond the feasible scope of the present work and would constitute a separate methodological contribution warranting dedicated experimental design and validation. This is now explicitly acknowledged in the revised Discussion as a priority direction for future research, where we identify the development of genotype-specific spectral correction models for pigmented carrot tissue as a necessary step before mARI and related anthocyanin-sensitive indices can be reliably operationalized in diverse pigment matrices.
The spectral overlap between anthocyanin absorption (550–600 nm) and carotenoid absorption (450–500 nm) regions, and its mechanistic implications for index specificity in purple carrots, are now described more explicitly in the Results and Discussion sections to provide the diagnostic context the reviewer rightly identifies as important for interpreting the reported mARI values.
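For orientation, the four indices named in the response can be computed from a reflectance spectrum roughly as below. This is a sketch under stated assumptions: it uses the commonly published single-band forms of CRI1, CRI2, mARI, and NDVI with nominal wavelengths of 510, 550, 670, 700, and 800 nm; the exact bands and averaging windows used in the manuscript are not given in this record, and the function name and reflectance values are hypothetical.

```python
def spectral_indices(r):
    """r: dict mapping wavelength in nm to reflectance in (0, 1]."""
    return {
        # Carotenoid reflectance indices: reciprocal reflectance at 510 nm,
        # corrected with the 550 nm or 700 nm band.
        "CRI1": 1 / r[510] - 1 / r[550],
        "CRI2": 1 / r[510] - 1 / r[700],
        # Modified anthocyanin reflectance index: green vs. red-edge
        # reciprocal reflectance, scaled by near-infrared reflectance.
        "mARI": (1 / r[550] - 1 / r[700]) * r[800],
        # Normalized difference vegetation index (NIR vs. red).
        "NDVI": (r[800] - r[670]) / (r[800] + r[670]),
    }

# Hypothetical reflectance values (illustration only).
sample = {510: 0.10, 550: 0.20, 670: 0.25, 700: 0.40, 800: 0.50}
print(spectral_indices(sample))
```

In these forms CRI1 and mARI both depend on reflectance near 550 nm, so a pigment that absorbs in that region moves both indices at once, which is one way the interference discussed above can be made visible.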
Correlation vs. Causation in Sensory-Spectral Link: The study establishes correlations between spectral data and sensory attributes. However, the language sometimes implies a predictive or causal relationship (e.g., "spectral-detected changes... ultimately determined the consumer sensory perception"). The analysis, as presented, shows association, not causation. The text should be carefully revised throughout to clarify that the spectral data are correlated with or indicative of sensory changes, not that they directly cause or determine them.
R: We thank the reviewer for this precise and important observation. We fully agree that implying causal or predictive relationships between spectral data and sensory perception, where the evidence supports only descriptive co-occurrence, is a significant overstatement, and this has been corrected throughout the revised manuscript. We wish to clarify, however, that the spectral and sensory analyses were conducted as independent analytical components and were never formally correlated in a statistical sense. No correlation coefficients, regression models, or joint predictive frameworks linking spectral indices to sensory descriptors were computed or reported. The relationship described in the manuscript is strictly explanatory and narrative: it observes that temporal shifts in spectral indices and temporal shifts in consumer descriptors occurred across the same storage period and in a broadly consistent direction, without any claim of statistical association, predictive modeling, or mechanistic causation between the two data streams. The original statement that "spectrally-detected changes ultimately determined consumer sensory perception" was therefore doubly incorrect: it implied both causation, which cannot be established from these data, and a formal analytical link between the two components, which was never performed. This statement has been removed entirely from the revised manuscript and replaced with a carefully qualified narrative observation that spectrally detected changes in pigment-related optical properties and shifts in consumer descriptors co-occurred across the storage period, and that their parallel temporal trajectories suggest potential associative relationships warranting formal investigation in future studies through integrative predictive modeling on larger, dedicated datasets.
All similar instances of causal or predictive language in the spectral-sensory discussion have been identified and revised to accurately reflect the independent, descriptive, and exploratory nature of the two analytical components.
Firmness Data Paradox: A very interesting but insufficiently explained result is the "atypical behavior" of firmness, where ambient-stored samples appeared more stable. You suggest this may be due to dehydration-induced surface hardening. Did you measure weight loss concurrently to directly test this hypothesis? This paradoxical finding is a key point that deserves a deeper, data-driven discussion rather than a tentative suggestion.
R: We confirm that fresh weight loss was measured concurrently with firmness at each evaluation time point, providing direct empirical data to test the dehydration-induced surface hardening hypothesis. The revised Discussion now includes a data-driven analysis of the relationship between weight loss progression and firmness behavior across genotypes and storage conditions. Specifically, ambient-stored samples showing apparent firmness stability or atypical resistance corresponded consistently with the highest weight loss values, supporting the hypothesis that progressive dehydration increases tissue turgor loss and surface cell collapse, paradoxically producing higher penetrometer resistance through desiccation-induced hardening rather than genuine textural integrity. This phenomenon has been reported in other root vegetables under similar conditions (Harker et al., 1997; Toivonen & Brummell, 2008) and is now discussed explicitly with supporting quantitative data and references in the revised manuscript.
References
- Harker et al., 1997. https://doi.org/10.1016/S0925-5214(97)00018-5
- Toivonen & Brummell, 2008. https://doi.org/10.1016/j.postharvbio.2007.09.004
Consumer Preference Survey Analysis: The results from the Likert-scale survey on consumer preferences (Figure 14) are presented but not adequately discussed in the context of the physicochemical and sensory findings. For instance, how does the fact that "firmness" and "price" were the top priorities align with the observed rapid decline in firmness in your samples? A discussion integrating these survey results with the main experimental findings would significantly strengthen the "consumer-oriented" claim of the research.
R: We agree that the consumer preference survey results were presented in isolation without being connected to the experimental trajectory of the rated attributes. The revised Discussion now explicitly integrates these findings. The high consumer priority assigned to firmness (mean score 4.66) is directly contextualized against the observed rapid firmness decline under ambient storage, particularly in white and purple genotypes, highlighting the practical gap between what consumers value most and what deteriorates fastest, with direct implications for storage recommendations and genotype selection for commercial postharvest chains. Similarly, the high weighting of appearance and absence of damage is discussed in relation to the color and spectral changes observed across genotypes, and price is contextualized within the differential refrigeration costs versus quality preservation benefits demonstrated by the β-carotene kinetic model.
Temporal Dynamics of Spectral Clusters: The spectral clustering analysis in Figure 7 shows fascinating temporal evolution. However, the description is largely phenomenological. What are the specific biochemical changes (e.g., degradation of specific compounds, cell wall breakdown, water loss) that you hypothesize are driving the observed shifts in PCA space for each variety and storage condition? Linking the spectral clusters more directly to underlying biochemistry would add considerable depth.
R: We acknowledge that the original description of spectral cluster evolution was phenomenological. The revised Discussion now links the observed PCA space trajectories to specific biochemical processes for each genotype and storage condition. The progressive shift toward longer wavelengths in ambient-stored samples is associated with chlorophyll degradation, carotenoid oxidation, and increased light scattering from cell wall breakdown and tissue dehydration. The anomalous clustering of ambient-stored purple with white genotypes by week 4 is linked to accelerated anthocyanin degradation in the 550–600 nm region, consistent with the mARI decline and β-carotene loss data reported. The greater spectral stability of refrigerated samples is attributed to slowed enzymatic oxidation, reduced membrane permeability changes, and preserved cell wall integrity, processes well documented in the postharvest biochemistry of root vegetables (Toivonen & Brummell, 2008).
Limitations and Future Work Specificity: The limitations section is brief. The "need for commercial-scale validation" is mentioned, but what specific challenges of a commercial setting are most critical to address (e.g., temperature fluctuations, bulk sampling, different lighting conditions)? Similarly, the suggestion to incorporate "machine-learning models" is vague. Please propose specific algorithms (e.g., Partial Least Squares Regression, Support Vector Machines) and state what specific predictions they could improve (e.g., predicting sensory scores directly from spectra).
R: The limitations and future work section has been substantially expanded. Regarding commercial-scale validation, the specific challenges now identified include temperature fluctuation management during cold chain interruptions, bulk sampling representativeness relative to the single-root replication used here, variable ambient lighting conditions affecting Vis/NIR reflectance reproducibility, and sensor calibration stability across devices and measurement environments. Regarding machine learning, specific algorithms are now proposed in relation to defined prediction objectives: Partial Least Squares Regression (PLSR) for continuous prediction of β-carotene and anthocyanin content from Vis/NIR spectra; Support Vector Machines (SVM) and Random Forests for genotype classification from spectral profiles under commercial sorting conditions; and Long Short-Term Memory (LSTM) networks for temporal quality trajectory modeling integrating spectral and physicochemical time series data.
Round 2
Reviewer 2 Report
Comments and Suggestions for AuthorsThe comments and suggestions have been appropriately considered, and the manuscript can be accepted for publication in its current form.
Author Response
We sincerely appreciate your positive evaluation of our manuscript and your recommendation for acceptance. We are truly grateful for the time, dedication, and professionalism invested throughout the review process. We would especially like to thank the reviewer for their insightful and constructive comments during the first evaluation round. Their observations and recommendations substantially improved the scientific quality, clarity, and overall focus of our work. We consider ourselves fortunate to have had reviewers who approached the process with honesty, respect, and a genuine commitment to scientific improvement through constructive dialogue.
The review process represented an important learning experience for our team, and many of the suggestions provided helped us strengthen not only the manuscript itself, but also our perspective regarding methodological rigor and scientific communication. We deeply value peer-review processes conducted under professionalism, transparency, and mutual respect, and we sincerely thank both the editorial team and reviewers for guiding our manuscript through this constructive process.
Thank you again for your consideration and support.
Reviewer 3 Report
Comments and Suggestions for AuthorsThe second version of the manuscript with ID 4220225 was reviewed. It was found that the title was modified to “Temporal Dynamics of Postharvest Quality in Carrot Genotypes: A Multidimensional Analysis of Physicochemical, Biofunctional, Spectral, and Sensory Attributes”. The observations made to Version 1 were revised as follows.
Although the objective is presented implicitly, the Abstract is clearer in the second version.
The state of the art was better discussed in the second version of the manuscript.
The second version retains the description of respiratory activity measurement using the LabQuest data logger. Since the manufacturer (Vernier) indicates that its equipment is for teaching purposes only and not for research, the authors should: (1) clearly identify the manufacturer; and (2) mention this limitation as a constraint on the information provided. However, the work is highly relevant, and this aspect should not prevent its publication.
The presentation of results and the evaluation with statistical tools were significantly improved and a robust analysis is shown.
A more thoroughly supported discussion of the results was included.
The description of Conclusions shows adequate consistency with the experimental setup developed.
Author Response
We sincerely appreciate your positive evaluation of our manuscript and your recommendation for acceptance. We are truly grateful for the time, dedication, and professionalism invested throughout the review process. We would especially like to thank the reviewer for their insightful and constructive comments during the first evaluation round. Their observations and recommendations substantially improved the scientific quality, clarity, and overall focus of our work. We consider ourselves fortunate to have had reviewers who approached the process with honesty, respect, and a genuine commitment to scientific improvement through constructive dialogue.
The review process represented an important learning experience for our team, and many of the suggestions provided helped us strengthen not only the manuscript itself, but also our perspective regarding methodological rigor and scientific communication. We deeply value peer-review processes conducted under professionalism, transparency, and mutual respect, and we sincerely thank both the editorial team and reviewers for guiding our manuscript through this constructive process.
Thank you again for your consideration and support.
Reviewer 4 Report
Comments and Suggestions for AuthorsDear Authors,
I attach a review of the changed article “Temporal Dynamics of Postharvest Quality in Carrot Genotypes: A Multidimensional Analysis of Physicochemical, Biofunctional, Spectral, and Sensory Attributes”.
The comment from the first round of review regarding the data was not incorporated: “Figure 3 (fresh weight) increase in fresh weight during storage? Data must be verified.”. The Authors did not correct the baseline data for fresh weight. Consequently, any parameters calculated on the basis of fresh weight, and multivariate analyses that include this parameter may be biased.
Figure 2
Rev2: See Fresh Weight (Refrigerator) – 6KUR: in week 1 it is ca. 220, in week 2 ca. 125, in week 3 ca. 200, in week 4 ca. 150; 220>125<200>150 ???
Rev2: See Fresh Weight (Refrigerator) – 14BER: in week 1 it is ca. 360, in week 2 ca. 210, in week 3 ca. 320, in week 4 ca. 150; 360>210<320>150 ???
Rev2: See Fresh Weight (Refrigerator) – White: in week 1 it is ca. 160, in week 2 ca. 240, in week 3 ca. 250, in week 4 ca. 150; 160<240<250>150 ??? … an increase in fresh weight of 50% between weeks 1 and 2?
…these are only 3 cases and there are more of them.
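The arithmetic behind these three cases can be reproduced with a short check. The sketch below uses only the approximate values the reviewer reports reading from the figure; the function name is hypothetical, and the check says nothing about whether an apparent increase reflects a data error or differences between independently sampled roots.

```python
def weekly_changes(weights):
    """Percent change between consecutive sampling points."""
    return [(b - a) / a * 100 for a, b in zip(weights, weights[1:])]

# Approximate readings quoted by the reviewer (weeks 1-4, refrigerated storage).
readings = {
    "6KUR": [220, 125, 200, 150],
    "14BER": [360, 210, 320, 150],
    "White": [160, 240, 250, 150],
}

for genotype, weights in readings.items():
    changes = weekly_changes(weights)
    # Report every interval in which the plotted fresh weight goes up.
    for week, change in enumerate(changes, start=1):
        if change > 0:
            print(f"{genotype}: week {week} to {week + 1}: +{change:.1f}%")
```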
Figure 2A
Rev2: Possible misinterpretation of data; see respiration: in Table S1 – all parameters and interactions are statistically significant (<0.001), and see Figure 2A no differences, the same letters “a”.
Rev2: The presentation format is unacceptable. The line connecting the points suggests a continuation of measurements over time, while we are dealing with groups of independent data.
ALL DATA REQUIRES VERIFICATION AND STATISTICAL REANALYSIS
Author Response
We would like to sincerely thank the reviewers for their time, effort, and constructive contributions to improving this manuscript. We also thank Reviewer 4 for the suggestions provided, which we have carefully considered and incorporated where appropriate and feasible.
At the same time, we respectfully wish to clarify that some of the comments appear to have arisen, at least in part, from insufficient contextual information regarding the structure, purpose, and interpretation of the figures and data presented. We acknowledge that some of the graphs are complex, and for this reason we have revised the manuscript to improve clarity, provide additional explanation, and reduce the possibility of misunderstanding.
We fully recognize the importance of critical evaluation in the peer-review process. However, we also respectfully emphasize that assessments regarding data quality, interpretation, or scientific integrity should consider the methodological context, the scope of the study, and the central message that the authors aim to communicate. In this revised version, we have made every effort to address the reviewer’s concerns in a clear, rigorous, and respectful manner, while strengthening the presentation and interpretation of the results.
I attach a review of the changed article “Temporal Dynamics of Postharvest Quality in Carrot Genotypes: A Multidimensional Analysis of Physicochemical, Biofunctional, Spectral, and Sensory Attributes”.
R: Dear Reviewer, we sincerely appreciate the time, effort, and dedication invested in the evaluation of our manuscript. Your review process has been highly constructive and has contributed positively to improving the clarity and scientific quality of our work. At the same time, we would respectfully like to request that the review process continue to be conducted under a respectful and constructive tone, particularly regarding language and interpretations associated with the validity of the data and experimental process. We firmly believe that scientific discussion is strengthened through critical but respectful dialogue.
Many of your suggestions are highly valuable and will be incorporated into the revised version of the manuscript. In particular, we agree with the recommendation to remove the comparison lines between treatments and to carefully review the differences according to the statistical tests applied. These modifications have now been implemented in the revised figures. However, we would also like to clarify that some of the observations related to fresh weight and other variables appear to arise from a misinterpretation of the graphical structure. Specifically, the comparison within panels corresponds to treatments under room temperature and cold storage conditions, whereas the temporal comparisons are represented between panels. For this reason, we respectfully suggest revisiting this aspect before drawing conclusions regarding the validity of the data or the experimental process. To avoid further confusion, this interpretation has now been clarified directly in the figure descriptions, and the comparison lines have been removed following your recommendation.
We genuinely value the constructive aspects of your evaluation and appreciate the opportunity to strengthen the manuscript through this review process.
- The comment from the first round of review regarding the data was not incorporated: “Figure 3 (fresh weight) increase in fresh weight during storage? Data must be verified.”. The Authors did not correct the baseline data for fresh weight. Consequently, any parameters calculated on the basis of fresh weight, and multivariate analyses that include this parameter may be biased. Rev2: See Fresh Weight (Refrigerator) – 6KUR: in week 1 it is ca. 220, in week 2 ca. 125, in week 3 ca. 200, in week 4 ca. 150; 220>125<200>150 ??? Rev2: See Fresh Weight (Refrigerator) – 14BER: in week 1 it is ca. 360, in week 2 ca. 210, in week 3 ca. 320, in week 4 ca. 150; 360>210<320>150 ??? Rev2: See Fresh Weight (Refrigerator) – White: in week 1 it is ca. 160, in week 2 ca. 240, in week 3 ca. 250, in week 4 ca. 150; 160<240<250>150 ??? … an increase in fresh weight of 50% between weeks 1 and 2? …these are only 3 cases and there are more of them.
R: Thank you for your valuable observation regarding the fresh weight data trends during storage. We carefully reviewed the concern raised in both the first and second rounds of revision. To improve clarity and avoid possible misinterpretation of the temporal fluctuations observed among sampling points, the figure was redesigned and replaced with a new version that presents the data in a clearer and more interpretable manner. This updated graphical representation facilitates visualization of the variability among storage weeks and genotypes under refrigeration conditions. Additionally, the underlying data and calculations were rechecked to ensure consistency throughout the analyses.
- Rev2: Possible misinterpretation of data; see respiration: in Table S1 – all parameters and interactions are statistically significant (<0.001), and see Figure 2A no differences, the same letters “a”.
R: Thank you for your observation. We carefully reviewed the statistical outputs and the graphical representation of the respiration data. To avoid possible misinterpretation between the significance detected in the statistical analyses and the visual presentation of the post hoc groupings, Figure 2A was redesigned and replaced with a clearer version that improves the interpretation of the comparisons among treatments, storage times, and genotypes. The updated figure facilitates a more accurate visualization of the statistical differences and improves overall readability.
- Rev2: The presentation format is unacceptable. The line connecting the points suggests a continuation of measurements over time, while we are dealing with groups of independent data.
R: To address this issue, the figure presentation was modified and the connecting lines were removed in the revised version. A new graphical representation was implemented to better reflect the independent nature of the experimental groups and to improve the clarity and interpretation of the results.
Reviewer 6 Report
Comments and Suggestions for AuthorsThe author has made revisions according to the opinions of experts. The paper has excellent research content and value, and it is recommended to be accepted for publication.
Author Response
We sincerely appreciate your positive evaluation of our manuscript and your recommendation for acceptance. We are truly grateful for the time, dedication, and professionalism invested throughout the review process. We would especially like to thank the reviewer for their insightful and constructive comments during the first evaluation round. Their observations and recommendations substantially improved the scientific quality, clarity, and overall focus of our work. We consider ourselves fortunate to have had reviewers who approached the process with honesty, respect, and a genuine commitment to scientific improvement through constructive dialogue.
The review process represented an important learning experience for our team, and many of the suggestions provided helped us strengthen not only the manuscript itself, but also our perspective regarding methodological rigor and scientific communication. We deeply value peer-review processes conducted under professionalism, transparency, and mutual respect, and we sincerely thank both the editorial team and reviewers for guiding our manuscript through this constructive process.
Thank you again for your consideration and support.

