Machine Learning Prediction Models of Beneficial and Toxicological Effects of Zinc Oxide Nanoparticles in Rat Feed
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
This paper can be accepted after addressing the following minor concerns; no further confirmation from me is needed:
1. The nanoparticle size given in the paper (line 263) is 6000 nm = 6 µm. I don't think particles of this size can be called nanoparticles. If it is a typographical error, please correct it; if the particles really are that size, please call them "particles" or "micron-sized particles" instead of nanoparticles.
2. In Table 4, Table 5, and Figure 4, the groups are labeled "3,1 ZnO NPs", "6,2 ZnO NPs", etc. I think it is better to use a dot instead of a comma (e.g., 3.1 instead of 3,1) throughout the tables and Figure 4, or to explain the notation somewhere, because the comma confuses readers.
3. From line 465 to 474, and in other places where you refer to equation numbers, such as line 645, "Weighted important score (4)", please write "equation" before the number, e.g., "Weighted important score eq. (4)" or "equation (4)". That will be clearer and easier to read, because there are many tables, figures, and data.
4. Is there a specific reason for the dose pattern of 1.55 mg/kg, 3.1 mg/kg, and 6.2 mg/kg followed by a sudden jump to 150 mg/kg? Was it based on previous studies that showed interesting results at these values, or was it the authors' intuition?
5. Line 460, "group V (150 ZnO NPs)": please keep the group naming consistent. If you mention the unit of dose, do so throughout; here it is better to write "group V (150 mg/kg ZnO NPs)".
Author Response
Comments 1: The nanoparticle size given in the paper (line 263) is 6000 nm = 6 µm. I don't think particles of this size can be called nanoparticles. If it is a typographical error, please correct it; if the particles really are that size, please call them "particles" or "micron-sized particles" instead of nanoparticles.
Response 1: Thank you for pointing this out. Due to a technical error, we indicated an incorrect nanoparticle size in the article. The correct size is 329 nm.
Comments 2: In Table 4, Table 5, and Figure 4, the groups are labeled "3,1 ZnO NPs", "6,2 ZnO NPs", etc. I think it is better to use a dot instead of a comma (e.g., 3.1 instead of 3,1) throughout the tables and Figure 4, or to explain the notation somewhere, because the comma confuses readers.
Response 2: The naming of groups in the tables and figures has been fixed.
Comments 3: From line 465 to 474, and in other places where you refer to equation numbers, such as line 645, "Weighted important score (4)", please write "equation" before the number, e.g., "Weighted important score eq. (4)" or "equation (4)". That will be clearer and easier to read, because there are many tables, figures, and data.
Response 3: Equation references have been added to the text.
Comments 4: Is there a specific reason for the dose pattern of 1.55 mg/kg, 3.1 mg/kg, and 6.2 mg/kg followed by a sudden jump to 150 mg/kg? Was it based on previous studies that showed interesting results at these values, or was it the authors' intuition?
Response 4: The experimental data in their current form were provided by the biological center; the authors of the paper had no influence over the choice of dosages.
Comments 5: Line 460, "group V (150 ZnO NPs)": please keep the group naming consistent. If you mention the unit of dose, do so throughout; here it is better to write "group V (150 mg/kg ZnO NPs)".
Response 5: Thank you for pointing this out. Group naming has been fixed in the text and figures.
Reviewer 2 Report
Comments and Suggestions for Authors
This manuscript investigates the dose-dependent effects of ZnO nanoparticles (NPs) on elemental homeostasis in male Wistar rats. The core strength of this paper is its innovative approach to overcoming the "small data" problem in experimental biology. Using GANs to augment a limited dataset, the authors develop and evaluate machine learning models to predict the concentrations of essential elements, toxic elements, proteins, and enzymes in the blood across a range of ZnO NP dosages. The study also introduces two novel metrics, a Weighted Importance Score and an Integral Indicator of Efficiency and Safety (IIES), to identify an optimal nanoparticle dosage (3.1 mg/kg). This is a well-conceived and timely study that tackles a significant problem in nanotoxicology: data scarcity. However, there are several major points that need to be addressed before the manuscript can be considered for publication. Below are my comments:
- The major concern I have is the justification and limitations of synthetic data. While the authors use SDMetrics to evaluate the quality of the synthetic data, more discussion is needed to further justify the use of synthetic data. The KDE plots in Figure 1 clearly show that the distributions for several toxic and partially essential elements (for example, As, B, Cd, V) in the synthetic data do not match the original data well. This discrepancy must be explicitly acknowledged and discussed. Some discussion on how this might impact the reliability of the downstream predictive models for these specific elements is also helpful.
- The implicit assumption that a GAN can generate biologically plausible data needs a more robust defense. There is a risk that the model learns spurious correlations from the small initial dataset and amplifies them in the synthetic data, leading to predictions that are statistically plausible but biologically meaningless. I was wondering if the authors could comment on this.
- In Equation 8, the weights for efficacy, balance, and toxicity are given without justification. Since they critically influence the final IIES score and the resulting "optimal dose", the authors should discuss how they were determined. Including a sensitivity analysis showing how the optimal dose changes with different weightings could be helpful.
- Similarly, in Equation 14, the weights in the final loss are given without justification. I was wondering if the authors could explain how they decided to emphasize certain losses over others.
- In the discussion, the authors mentioned that the predictive models for toxic elements performed poorly, yielding "template-like" results. I was wondering if the authors could put this as the central part of the discussion instead of just briefly mentioning it. There are many things that can be discussed for this key finding. For example, what are the implications of this failure? Does it suggest that the toxic response at these low levels is stochastic and inherently unpredictable with this type of model? Or is it a failure of the synthetic data to capture the true dynamics?
- It would be highly beneficial to visualize the model's performance by overlaying the original experimental data points onto the prediction curves in Figures 6 and 7. By doing so, the readers can directly assess how well the model fits the ground truth.
- In the experimental design, the authors jump from dosages 6.2 mg/kg to 150 mg/kg. This huge gap makes it extremely challenging for any model to learn the dose-response relationship in this wide, unobserved range. I was wondering if the authors could give more details to justify the choice.
Author Response
Comments 1: The major concern I have is the justification and limitations of synthetic data. While the authors use SDMetrics to evaluate the quality of the synthetic data, more discussion is needed to further justify the use of synthetic data. The KDE plots in Figure 1 clearly show that the distributions for several toxic and partially essential elements (for example, As, B, Cd, V) in the synthetic data do not match the original data well. This discrepancy must be explicitly acknowledged and discussed. Some discussion on how this might impact the reliability of the downstream predictive models for these specific elements is also helpful.
Response 1: The distribution of certain elements in the synthetic dataset deviates from that in the original dataset, which we attribute to the sensitivity threshold of the laboratory instrumentation. Our experiments show that accurate forecasting is particularly challenging for elements whose values are very close to zero or were reported in the original experiments in categorical form (e.g., "<0.0001"). For Al, whose concentration values are sufficiently high, the model shows better predictive results than for elements whose concentration is close to zero.
Comments 2: The implicit assumption that a GAN can generate biologically plausible data needs a more robust defense. There is a risk that the model learns spurious correlations from the small initial dataset and amplifies them in the synthetic data, leading to predictions that are statistically plausible but biologically meaningless. I was wondering if the authors could comment on this.
Response 2: We appreciate the reviewer’s insightful observation regarding the potential limitations of GAN-based synthetic data generation in preserving biological plausibility, particularly when trained on small datasets.
To mitigate this risk and enhance the biological credibility of the generated data:
- We rigorously evaluated the similarity between the original and synthetic datasets using the SDMetrics tool and KDE plots;
- During training, we incorporated domain-informed constraints through a custom loss function to prevent the generation of physiologically implausible values;
- The utility of the augmented dataset was assessed through the performance of two predictive models (FCNN and KernelRidge).
Therefore, we position the synthetic data as a tool for statistical augmentation under the assumption of distributional stability, not as a substitute for mechanistic modeling.
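The per-element similarity check described in the list above can be illustrated with a minimal sketch. It computes the two-sample Kolmogorov-Smirnov statistic (the largest gap between two empirical distribution functions), which is one common way to quantify how far a synthetic column drifts from the original one. Whether this particular statistic was the one reported through SDMetrics is an assumption, and the sample values are placeholders, not data from the study.

```python
from bisect import bisect_right


def ks_statistic(original, synthetic):
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the largest absolute gap between the two empirical CDFs:
    0.0 means identical samples, 1.0 means fully separated samples.
    """
    a, b = sorted(original), sorted(synthetic)
    gap = 0.0
    # The gap can only change at observed values, so checking those is enough.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


# Placeholder concentrations (hypothetical), NOT measurements from the study.
original = [0.10, 0.12, 0.11, 0.13, 0.09]
close_match = [0.10, 0.11, 0.12, 0.12, 0.10]
too_wide = [0.02, 0.30, 0.45, 0.01, 0.60]

print("close match:", ks_statistic(original, close_match))
print("too wide:   ", ks_statistic(original, too_wide))
```

A larger statistic for a given element flags a synthetic distribution that is wider or shifted relative to the measured one, which is the situation the reviewers point to for the near-zero toxic elements.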
Comments 3: In Equation 8, the weights for efficacy, balance, and toxicity are given without justification. Since they critically influence the final IIES score and the resulting "optimal dose", the authors should discuss how they were determined. Including a sensitivity analysis showing how the optimal dose changes with different weightings could be helpful.
Response 3: The weighting coefficients employed in the IIES score, as defined in Equation (8), were not derived through computational optimization within the scope of this study. Instead, they were established by the authors based on expert judgment to reflect the relative significance of each evaluation criterion in relation to the study's objectives. The assigned weights are as follows: wt = 2, wb = 1, and we = 2. This weighting scheme is justified by the prioritization of environmental safety – emphasizing toxicity minimization – and process efficiency in zinc recovery, which are considered more critical than functional performance indicators in the context of the present analysis.
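The sensitivity analysis requested by the reviewer could look roughly like the sketch below. Equation (8) is not reproduced in this record, so the functional form used here (a normalized weighted sum that rewards efficacy and balance and penalizes toxicity) is purely an assumption, as are the function names and the component scores, which are placeholders rather than values from the manuscript.

```python
from itertools import product


def iies(efficacy, balance, toxicity, w_e, w_b, w_t):
    """Hypothetical stand-in for Equation (8): higher is better."""
    return (w_e * efficacy + w_b * balance - w_t * toxicity) / (w_e + w_b + w_t)


def best_dose(scores, w_e, w_b, w_t):
    """Return the dose whose (efficacy, balance, toxicity) triple maximizes the index."""
    return max(scores, key=lambda dose: iies(*scores[dose], w_e, w_b, w_t))


# Placeholder component scores per dose (mg/kg); NOT values from the manuscript.
scores = {
    1.55: (0.40, 0.70, 0.10),
    3.1: (0.70, 0.80, 0.20),
    6.2: (0.75, 0.60, 0.45),
    150: (0.90, 0.30, 0.95),
}

# Sweep every weight over a small grid and count how often each dose wins.
wins = {}
for w_e, w_b, w_t in product([1, 2, 3], repeat=3):
    dose = best_dose(scores, w_e, w_b, w_t)
    wins[dose] = wins.get(dose, 0) + 1

print("selected dose at w_e=2, w_b=1, w_t=2:", best_dose(scores, 2, 1, 2))
print("wins per dose across the grid:", wins)
```

If the selected dose stays the same across most of the grid, the conclusion is robust to the expert-chosen weights; if it flips easily, the weights need a stronger justification. The output of this sketch says nothing about the study itself, since the inputs are invented.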
Comments 4: Similarly, in Equation 14, the weights in the final loss are given without justification. I was wondering if the authors could explain how they decided to emphasize certain losses over others.
Response 4: Due to the extreme sparsity of experimentally validated concentrations (only a few data points across the 1–150 mg/kg range), we used a 2:1 ratio, which reflects a hierarchical severity in terms of scientific interpretability and model utility. This aligns with principles in constrained regression and physical modeling, where penalizing violations of physical constraints is prioritized over other forms of inaccuracy.
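To make the 2:1 weighting concrete, here is a minimal sketch of a two-term loss. Equation (14) is not shown in this record, so both the choice of terms (a non-negativity penalty on predicted concentrations plus a mean squared error) and the assignment of the weight 2 to the constraint term are assumptions made for illustration only.

```python
def combined_loss(predicted, observed, w_constraint=2.0, w_fit=1.0):
    """Weighted sum of a constraint penalty and a data-fit term (assumed form)."""
    n = len(predicted)
    # Data-fit term: ordinary mean squared error.
    fit = sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n
    # Constraint term: squared size of any negative (physically impossible) prediction.
    constraint = sum(min(p, 0.0) ** 2 for p in predicted) / n
    return w_constraint * constraint + w_fit * fit


# A prediction of -1.0 against a target of 0.0 is charged for both terms,
# while +1.0 is charged for the fit term only.
print(combined_loss([-1.0], [0.0]))
print(combined_loss([1.0], [0.0]))
```

With these weights, an error that also breaks the constraint costs three times as much as an error of the same size that stays in the admissible range, which is one way to read the "hierarchical severity" in the response.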
Comments 5: In the discussion, the authors mentioned that the predictive models for toxic elements performed poorly, yielding "template-like" results. I was wondering if the authors could put this as the central part of the discussion instead of just briefly mentioning it. There are many things that can be discussed for this key finding. For example, what are the implications of this failure? Does it suggest that the toxic response at these low levels is stochastic and inherently unpredictable with this type of model? Or is it a failure of the synthetic data to capture the true dynamics?
Response 5: Comparative analysis of two predictive modeling approaches – fully connected neural networks (FCNN) and Kernel Ridge regression – reveals that accurate forecasting is particularly challenging for elements predominantly belonging to the toxic group, likely due to measurement limitations and, as a consequence, the inability of synthetic data to capture the true dynamics. However, using the proposed methodology, robust predictive models were successfully developed for most other elements whose concentrations were measured with higher precision.
Comments 6: It would be highly beneficial to visualize the model's performance by overlaying the original experimental data points onto the prediction curves in Figures 6 and 7. By doing so, the readers can directly assess how well the model fits the ground truth.
Response 6: Thank you for pointing this out. We conducted additional experiments and added original data points in Figures 6 and 7 for better visual clarity.
Comments 7: In the experimental design, the authors jump from dosages 6.2 mg/kg to 150 mg/kg. This huge gap makes it extremely challenging for any model to learn the dose-response relationship in this wide, unobserved range. I was wondering if the authors could give more details to justify the choice.
Response 7: The experimental data in their current form were provided by the biological center; the authors of the paper had no influence over the choice of dosages.
Reviewer 3 Report
Comments and Suggestions for Authors
Recommendation
Comments:
The manuscript describes the prediction of beneficial and toxicological effects of ZnO in rat feed with the aid of data augmentation methods.
I recommend the article for publication upon addressing the comments listed here:
- For the chord diagram, the resolution of the figure can be improved. Another improvement would be to use the same color for the same element in all six sub-plots; this would reinforce consistency across the plots.
- I seem to be missing the meaning of the x-axis in Figures 6 and 7. Is it the number of data points, or does it have a physical meaning, such as time (making the plots time profiles) or physical penetration depth? For all these plots, the axes should have units; currently they are unitless. The meaning of the red dotted line is also not obvious. To me, it seems to mark the location of the maximum of the element content; is it necessary to highlight it? It should also be explained in the figure caption.
- For the synthetic data points, it seems the authors have glossed over the fact that close to half of the elements’ distributions do not match those of the original dataset. More important is the percentage difference between the synthetic and original (experimental) datasets. For the toxic elements, even though the numbers are very small because they are microelements, extra caution should be taken to ensure that the synthetic dataset strictly follows the original dataset. For example, Cd, Co, Mn, Li, and Hg have very small values, but their synthetic data have a much wider distribution that may well be more than 100% larger than that of the original data. I hope the authors can look into the matter and, if possible, repeat the synthetic data generation with extra care.
Author Response
Comments 1: For the chord diagram, the resolution of the figure can be improved. Another improvement would be to use the same color for the same element in all six sub-plots; this would reinforce consistency across the plots.
Response 1: The blurriness is caused by conversion of the file to PDF format. All figures were prepared in high quality and inserted into the text, and all figures were also uploaded separately in a .zip file. Chord diagrams from the holoviews library are initially interactive HTML files, and the colors of the elements in the diagrams are assigned automatically.
Comments 2: I seem to be missing the meaning of the x-axis in Figures 6 and 7. Is it the number of data points, or does it have a physical meaning, such as time (making the plots time profiles) or physical penetration depth? For all these plots, the axes should have units; currently they are unitless. The meaning of the red dotted line is also not obvious. To me, it seems to mark the location of the maximum of the element content; is it necessary to highlight it? It should also be explained in the figure caption.
Response 2: Thank you for pointing this out. The x-axis shows the concentration of nanoparticles in the range from 1 to 150 mg/kg. The red dotted line corresponds to the highest predicted concentration. We also conducted additional experiments and added the original data points to Figures 6 and 7 for better visual clarity. The figures and captions have been corrected.
Comments 3: For the synthetic data points, it seems the authors have glossed over the fact that close to half of the elements’ distributions do not match those of the original dataset. More important is the percentage difference between the synthetic and original (experimental) datasets. For the toxic elements, even though the numbers are very small because they are microelements, extra caution should be taken to ensure that the synthetic dataset strictly follows the original dataset. For example, Cd, Co, Mn, Li, and Hg have very small values, but their synthetic data have a much wider distribution that may well be more than 100% larger than that of the original data. I hope the authors can look into the matter and, if possible, repeat the synthetic data generation with extra care.
Response 3: The distribution of certain elements in the synthetic dataset deviates from that in the original dataset, attributable to the sensitivity threshold of the laboratory instrumentation. Comparative analysis of two predictive modeling approaches – fully connected neural networks (FCNN) and Kernel Ridge regression – reveals that accurate forecasting is particularly challenging for elements predominantly belonging to the toxic group, likely due to measurement limitations. However, using the proposed methodology, robust predictive models were successfully developed for most other elements whose concentrations were measured with higher precision.
Reviewer 4 Report
Comments and Suggestions for Authors
This manuscript presents a GAN-based data augmentation method and a machine learning algorithm based on fully connected neural networks and ridge regression models. The following questions and comments should be addressed:
- Line 339-340. How are the weights in the multi-objective optimization determined?
- Figure 1: some of the original/synthetic distributions differ significantly. The authors should comment on whether this will impact the validity of the prediction results for those element groups where the distributions differ significantly.
- Line 384. Why use a penalty formulation to enforce this? This could be easily enforced rigorously by proper choice of activation functions.
- Equation 14, how are the weights for different loss terms determined?
- Table 6, a systematic hyper-parameter optimization should be performed to identify the best hyperparameter to use for each of the two models.
- Figure 6. The meaning of this figure is very unclear. The authors must provide much more explanation on the meaning of the curves presented and the meaning of the point indicated by the red dashed line to provide clarity of the results.
Author Response
Comments 1: Line 339-340. How are the weights in the multi-objective optimization determined?
Response 1: The weighting coefficients employed in the IIES score, as defined in Equation (8), were not derived through computational optimization within the scope of this study. Instead, they were established by the authors based on expert judgment to reflect the relative significance of each evaluation criterion in relation to the study's objectives. The assigned weights are as follows: wt = 2, wb = 1, and we = 2. This weighting scheme is justified by the prioritization of environmental safety – emphasizing toxicity minimization – and process efficiency in zinc recovery, which are considered more critical than functional performance indicators in the context of the present analysis.
Comments 2: Figure 1: some of the original/synthetic distributions differ significantly. The authors should comment on whether this will impact the validity of the prediction results for those element groups where the distributions differ significantly.
Response 2: The distribution of certain elements in the synthetic dataset deviates from that in the original dataset, attributable to the sensitivity threshold of the laboratory instrumentation. Comparative analysis of two predictive modeling approaches – fully connected neural networks and Kernel Ridge regression – reveals that accurate forecasting is particularly challenging for elements predominantly belonging to the toxic group, likely due to measurement limitations. However, using the proposed methodology, robust predictive models were successfully developed for most other elements whose concentrations were measured with higher precision.
Comments 3: Line 384. Why use a penalty formulation to enforce this? This could be easily enforced rigorously by proper choice of activation functions.
Response 3: We performed a comparison of two predictive modeling approaches – fully connected neural networks and Kernel Ridge regression – under the same conditions using the same custom loss function in both cases.
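The alternative raised in the reviewer's comment, enforcing a constraint through the output activation instead of a penalty term, can be sketched as follows. The constraint at line 384 of the manuscript is not quoted in this record, so this example assumes it is non-negativity of predicted concentrations; softplus is used here as one standard activation with a non-negative range, not as the authors' choice.

```python
import math


def softplus(z):
    """Map any real number to a non-negative one: log(1 + exp(z)).

    Written in the numerically stable form max(z, 0) + log1p(exp(-|z|)).
    """
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


# Raw (unconstrained) outputs of a hypothetical final layer.
raw_outputs = [-5.0, -0.5, 0.0, 2.0, 40.0]
concentrations = [softplus(z) for z in raw_outputs]
print(concentrations)
```

Because every value passes through the activation, no prediction can fall below zero regardless of the training outcome, whereas a penalty only discourages violations. A penalty, on the other hand, applies unchanged to models without an output layer, such as kernel ridge regression, which may be why a single loss was used for both models.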
Comments 4: Equation 14, how are the weights for different loss terms determined?
Response 4: Due to the extreme sparsity of experimentally validated concentrations (only a few data points across the 1–150 mg/kg range), we used a 2:1 ratio, which reflects a hierarchical severity in terms of scientific interpretability and model utility. This aligns with principles in constrained regression and physical modeling, where penalizing violations of physical constraints is prioritized over other forms of inaccuracy.
Comments 5: Table 6, a systematic hyper-parameter optimization should be performed to identify the best hyperparameter to use for each of the two models.
Response 5: While comprehensive hyper-parameter optimization was not performed, our focus was on comparing the impact of activation functions in FCNNs and kernel types in KernelRidge under standardized settings. Other hyper-parameters were held constant at commonly used default values to ensure interpretability and reproducibility. This approach allows for a controlled comparison of core modeling choices, aligning with the study’s objective.
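For reference, the systematic search the reviewer asks for can be as simple as a grid search scored by leave-one-out cross-validation, which suits very small datasets. The sketch below uses a one-dimensional linear ridge model without intercept as a stand-in; it is not the authors' FCNN or kernel ridge model, and the data, function names, and grid are hypothetical.

```python
def fit_ridge_slope(xs, ys, alpha):
    """Closed-form 1-D ridge regression without intercept: w = sum(xy) / (sum(x^2) + alpha)."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + alpha)


def loo_error(xs, ys, alpha):
    """Mean squared leave-one-out prediction error for a given penalty alpha."""
    total = 0.0
    for i in range(len(xs)):
        # Hold out point i, fit on the rest, score on the held-out point.
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        w = fit_ridge_slope(train_x, train_y, alpha)
        total += (ys[i] - w * xs[i]) ** 2
    return total / len(xs)


def grid_search(xs, ys, alphas):
    """Return the alpha with the lowest leave-one-out error."""
    return min(alphas, key=lambda a: loo_error(xs, ys, a))


# Placeholder data lying exactly on y = 2x; NOT measurements from the study.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
print("selected alpha:", grid_search(xs, ys, alphas=[0.0, 0.1, 1.0, 10.0]))
```

The same loop structure carries over to other hyper-parameters (kernel type, activation function, layer width) by replacing the fitting function and extending the grid.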
Comments 6: Figure 6. The meaning of this figure is very unclear. The authors must provide much more explanation on the meaning of the curves presented and the meaning of the point indicated by the red dashed line to provide clarity of the results.
Response 6: Thank you for pointing this out. The x-axis shows the concentration of nanoparticles in the range from 1 to 150 mg/kg. The red dotted line corresponds to the highest predicted concentration. We also conducted additional experiments and added the original data points to Figures 6 and 7 for better visual clarity. The figures and captions have been corrected.
Round 2
Reviewer 2 Report
Comments and Suggestions for Authors
The authors have addressed my concerns, and the manuscript is ready for publication.