Open Access Article

Peer-Review Record

Validation of Computational Software for Criticality Safety Analysis of Spent Nuclear Fuel Systems

J. Nucl. Eng. 2026, 7(1), 21; https://doi.org/10.3390/jne7010021

by Matej Sikl and Radim Vocka

Reviewer 1: Anonymous

Reviewer 2: Waclaw Gudowski

Reviewer 3: Mathieu Hursin

Submission received: 23 January 2026 / Revised: 24 February 2026 / Accepted: 10 March 2026 / Published: 17 March 2026

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

The manuscript entitled “Validation of Computational Software for Criticality Safety Analysis of Spent Nuclear Fuel Systems” by Matej Sikl and Radim Vocka presents numerical experiments in the field of criticality safety analysis. Standard computational codes are applied and their results are compared with data derived from operational measurements, such as zero-power criticality tests in commercial nuclear power plants.

While such comparisons may be of practical interest to professionals directly involved in specific safety analyses (e.g., within nuclear safety and reliability divisions or operating organizations), the present work does not appear to provide substantial new scientific insight to the broader field of criticality assessment. The manuscript reads more as a technical work report than as a truly useful contribution advancing methodology, theory, or validation practice in a meaningful way.

Moreover, similar preliminary considerations were already published by one of the co-authors (Sikl) in Acta Polytechnica CTU Proceedings more than seven years ago (DOI: https://doi.org/10.14311/APP.2018.19.0030) and again four years later (DOI: 10.14311/APP.2022.37.0054). These earlier works are not cited in the present manuscript despite their clear thematic relevance, which is somewhat surprising. Nor have they been cited by others.

Although the manuscript is written in clear and fluent English (AI makes it possible - the English in the report from 2018 is poorer), its scientific depth remains limited. The analysis largely confirms expected results obtained with established tools, without offering new methodological developments, uncertainty quantification advances, or broader validation insights that would be of interest to a scientifically oriented readership.

As an example, Figure 1 illustrates a color-coded image of a VVER-440 core. However, the figure lacks a color scale and does not convey quantitative information beyond what is widely available in the literature. This is symptomatic of the overall presentation, which remains largely descriptive rather than analytically substantive.

In its current form, the manuscript appears more suitable for presentation at a technical conference than for publication as a peer-reviewed scientific article.

Typos: a period is missing at the end of line 68. Calculates --> calculate in line 330.

For these reasons, I recommend rejection of the manuscript.

Author Response

Thank you very much for your review of the manuscript. Most of the suggested changes have been incorporated and can be seen in the attached manuscript revision; the changes are highlighted in red. In the following text I will try to address all of your comments.

Comments 1: While such comparisons may be of practical interest to professionals directly involved in specific safety analyses (e.g., within nuclear safety and reliability divisions or operating organizations), the present work does not appear to provide substantial new scientific insight to the broader field of criticality assessment. The manuscript reads more as a technical work report than as a truly useful contribution advancing methodology, theory, or validation practice in a meaningful way.

Response 1: I understand your point of view regarding the lack of new methodological developments. The methods used in our methodology have been widely known for more than 10 years. However, we believe the scientific benefit lies in two main objectives fulfilled in the manuscript. The first is the complexity of the calculations performed. While it has been stated that this sort of comparison can be performed, I have seen no recent study that uses realistic models of recent reactor critical states for similarity comparison. We utilized data from complex reactor critical states and storage systems, and I believe we demonstrated the viability of such a similarity assessment.

The second important objective, in my view, is the definition of the methodology itself. In previously published papers, many possibilities on how to use similarity for comparison are mentioned, along with how they differ under specific circumstances. Our methodology is more robust; specific limits were described, individual values entering the subcriticality assessment were identified and quantified, and example values were provided to demonstrate the magnitude of individual parameters.

Regarding previous publications: I have been involved in similarity and subcriticality assessment studies for the last several years. Both of the mentioned publications are only proceedings from student conferences. The older one addressed similarity and the influence of system parameters, but these comparisons were calculated using theoretical systems. The newer one showed only preliminary results from the first reactor critical experiments, but their number was not high enough for any assessment; these comparisons were intended to confirm the possibility of using reactor critical states as validation experiments. Later, as described in the currently proposed article, we expanded the reactor criticals database, performed calculations with greater complexity and in more detail, and finalized and optimized the calculation of the similarity profiles. Only after that was it possible to define the methodology and present it in the current article.

Comments 2: As an example, Figure 1 illustrates a color-coded image of a VVER-440 core. However, the figure lacks a color scale and does not convey quantitative information beyond what is widely available in the literature. This is symptomatic of the overall presentation, which remains largely descriptive rather than analytically substantive.

Response 2: I have added a color scale for the most relevant model components to the figure captions. A full color legend would be too large and difficult to read (specifically for the core model in Fig. 1). I agree with your comment that the figure (and the model descriptions in the text) does not offer new information. Unfortunately, this is due to the fact that the majority of the data used are confidential data of the Czech NPP operator. As stated in the manuscript, it is not possible to make all data publicly available. However, we have at least demonstrated that the effort involved in creating reactor critical models is worthwhile for such assessments.

Thank you once again for your review and for your valuable comments.
Best regards
Matej Sikl

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

This paper, although quite interesting, does not formulate an ultimate objective. I suggest that the authors at least state what is wrong with the "conservative use fresh fuel" approach to subcriticality assessment. Is it too conservative? Can a more precise method optimize the storage of used (not spent) fuel, for example by packing more FA elements into the same volume? Or is it just a refinement of the scientific method, which is in my opinion a fully justified "driver"?

The methodology is interesting and the results equally so. However, I would also like to see the tables reformatted. Tables 1 and 2 are single-row tables whose meaning is not very clear. Reformat them into a row-and-column format; the columns can simply be the numbers of the burnup batches or whatever is appropriate. Tables 3 and 4 should also be presented more clearly: divide each table into a "pool" part and a "cask" part, and do not repeat "pools" and "casks" many times. It looks like a paste from raw "python" (or whatever) output. Figure 5 is also difficult to digest. What does "reactor criticals" mean? What is "cask_c44E12_14997"? Are there any specific reasons for the very bumpy spectrum over 1 MeV, or is it simply a calculational artifact?

And last but not least, a comment on the evolution of subcriticality over time would be valuable. The natural decay of used fuel generates significant changes in its composition. How will subcriticality evolve, e.g., over a perspective of thousands of years?

Author Response

Comments 1: This paper, although quite interesting, does not formulate an ultimate objective. I suggest that the authors at least state what is wrong with the "conservative use fresh fuel" approach to subcriticality assessment. Is it too conservative? Can a more precise method optimize the storage of used (not spent) fuel, for example by packing more FA elements into the same volume? Or is it just a refinement of the scientific method, which is in my opinion a fully justified "driver"?

Response 1: Thank you for pointing this out. In fact, the study addresses a combination of these objectives. In previous projects conducted at the Department of Reactor Physics, when assessing the subcriticality of storage systems at Czech NPPs, the margin between the regulatory limits and the calculated multiplication factor was found to be minimal, or even non-existent, as initial fuel enrichment increased. We believe that our methodology can effectively resolve this lack of margin. Moreover, I believe it is possible to safely address subcriticality even when taking spent fuel composition into account; it is a matter of appropriate computational code validation. I added a paragraph to the introduction section to explain the objectives more specifically.

Comments 2: The methodology is interesting and the results equally so. However, I would also like to see the tables reformatted. Tables 1 and 2 are single-row tables whose meaning is not very clear. Reformat them into a row-and-column format; the columns can simply be the numbers of the burnup batches or whatever is appropriate. Tables 3 and 4 should also be presented more clearly: divide each table into a "pool" part and a "cask" part, and do not repeat "pools" and "casks" many times. It looks like a paste from raw "python" (or whatever) output. Figure 5 is also difficult to digest. What does "reactor criticals" mean? What is "cask_c44E12_14997"? Are there any specific reasons for the very bumpy spectrum over 1 MeV, or is it simply a calculational artifact?

Response 2: I would like to thank you for these suggestions. I have reformatted the tables to improve readability. The Figure 5 legend and caption were modified, and the y-axis values were changed to display the flux per unit lethargy, which smoothed the graph lines.

Comments 3: And last but not least, a comment on the evolution of subcriticality over time would be valuable. The natural decay of used fuel generates significant changes in its composition. How will subcriticality evolve, e.g., over a perspective of thousands of years?

Response 3: I agree that this is an interesting topic. Sadly, I cannot provide a reliable answer, as I do not have data on long-term natural decay and its impact on fuel composition. I believe these could be prepared with SCALE TRITON or ORIGEN-ARP, but I do not have them at present. For the purposes of this paper, we focused only on the time point of storage cask loading, as is usual in criticality assessments of storage systems.

Thank you once again for your review, your valuable comments and advice, and for an interesting objective for further studies regarding long-term similarity.
Best regards
Matej Sikl

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

The manuscript is generally well written and addresses an interesting and relevant topic within the field. The content is supported by appropriate references and technical arguments. However, several important issues must be addressed before the work can be considered for publication. I recommend major revision.

Below are detailed comments and requests for clarification.

Similarity is expressed in terms of nuclear data uncertainties… What about the other aspects of the model (technological parameters, such as material composition and geometry)? They will affect keff, possibly more than nuclear data… Can we believe the uncertainty bounds derived in the paper if only nuclear data uncertainties are considered?
LR-0 benchmarks are considered because they are representative of the increased pitch between assemblies found in the CASTOR cask… Yet LR-0 is composed of fresh fuel, so it is not straightforward why they should have a high c_k. How are the isotopes not present in the experiment considered, for example Pu-239 and Pu-240? Are their sensitivities considered in the c_k formula? If only U-238/U-235 sensitivities are considered, then it is conceivable that high c_k values are obtained… Can the authors provide a breakdown of the contributions to c_k by isotope?
On L286, even though a reference is provided, the authors should state what kinds of effects are considered within this « modeling uncertainty ».
In the last paragraph of Section 3, why would a similar value be added to the conservative fresh fuel pool approach? 0.05 needs to be added in the best-estimate case, not in the conservative case… Please clarify; this paragraph is not really understandable as written.
Section 4 should be a conclusion, not a discussion.
Update Figure 5 to show the flux per unit lethargy; this will remove the fluctuations in the Maxwellian spectra (10^-8 to 10^-6 eV).
In Section 2.2.3, please explain what nominal filling of the CASTOR casks is considered and why non-borated water is considered in this specific case… Does it correspond to degraded (accidental water ingress) conditions? When the cask loading is described, the description in lines 150-151 corresponds to the fresh state of the assembly, correct? At the burnup listed in Table 1, there is no Gd left and the fuel composition will be very different… Please clarify in the text.
Page 5, L166-167: it is not clear how the reduction is performed; please elaborate.
It is unclear from the text what the burnup of each individual assembly considered in the CASTOR and Spent Fuel Pool (SFP) models is: are all assemblies of the same burnup/composition, or are distributions of individual assemblies with unique burnup/composition considered? In the latter case, how is the distribution picked to be conservative, e.g., the one that maximizes criticality?
L181: when computational uncertainty is mentioned, we are talking about Monte Carlo statistics, correct? Given the size of the SFP, did you check the Shannon entropy to make sure that the fission source of the Monte Carlo calculation is actually converged?
How is an sdf-formatted file produced from a Serpent calculation? I assume that Serpent is used to model the CASTOR and SFP models in Section 2.3. Please clarify.
In Section 2.3, point #2, please explain what conservative and best-estimate models are in the context of this manuscript.
Have the manuscript proofread by a native speaker… Articles are missing in Section 2.3, for example: In the end, the k_final value is compared with the safety limit for the storage system.
In Table 3, it would be interesting to know what c_k values are obtained for the cask.

Author Response

Thank you very much for the time taken to review this manuscript. The review was truly comprehensive and provided many valuable insights. Most of the suggested changes have been incorporated and can be seen in the attached manuscript revision; the changes are highlighted in red. In the following text, I will address your comments point by point.

Comments 1: Similarity is expressed in terms of nuclear data uncertainties… What about the other aspects of the model (technological parameters, such as material composition and geometry)? They will affect keff, possibly more than nuclear data… Can we believe the uncertainty bounds derived in the paper if only nuclear data uncertainties are considered?
Response 1: It is true that a core feature of our methodology is similarity based on nuclear data uncertainties. However, this does not mean that we disregard other sources of uncertainty. The methodology explicitly defines an application model uncertainty, sigma k_c^a, which should be determined when preparing an application model and should include the uncertainties you mentioned.

Comments 2: LR-0 benchmarks are considered because they are representative of the increased pitch between assemblies found in the CASTOR cask… Yet LR-0 is composed of fresh fuel, so it is not straightforward why they should have a high c_k. How are the isotopes not present in the experiment considered, for example Pu-239 and Pu-240? Are their sensitivities considered in the c_k formula? If only U-238/U-235 sensitivities are considered, then it is conceivable that high c_k values are obtained… Can the authors provide a breakdown of the contributions to c_k by isotope?
Response 2: This is an excellent question. I studied the reasons for this similarity; they cannot be summarized easily, but I will describe three main observations.
Firstly, yes, the LR-0 cases are fresh fuel experiments, and it can be seen from Table 4 that the number of sufficiently similar LR-0 experiments decreases significantly with increasing burnup of the fuel contained in the cask (see the comparison with pools and reactor critical experiments in Table 3). This corresponds to the changes in isotopic composition.
Secondly, I have analyzed the isotope contributions to the c_k values as requested. The sample application case was a cask containing fuel with an average burnup of approximately 25 GWd/tU. When it is compared to a random reactor critical experiment, the contributions are U-235 0.3, U-238 0.18, Pu-239 0.14 (c_k is 0.69). When the experiment is a random LR-0, the contributions are U-235 0.49, U-238 0.27, Pu-239 0.003 (c_k is 0.82). The low Pu contribution for the fresh fuel LR-0 experiment is consistent with expectations, and its lesser influence on similarity is discussed below.
Thirdly, to investigate the high similarity between the cask and the LR-0 experiments (at least for low and medium burnup levels), I compared the similarity, analyzed the neutron spectra shown in Figure 5, and then examined the sensitivity profiles of important isotopes using Fulcrum. Although this cannot easily be visualized as proof (because of the large number of profiles I examined), the sensitivity profiles of the LR-0 experiments overlap with the cask profiles better than the reactor critical profiles do. I believe this effect is connected to the different neutron spectra, which differ because of the FA pitch, and that it explains the higher c_k similarity.
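
The isotope breakdown quoted above can be illustrated with a minimal sketch. It assumes the commonly used definition of the similarity index, c_k = (S_a^T C S_e) / (sigma_a * sigma_e), where S_a and S_e are the keff sensitivity vectors of the application and the experiment and C is the nuclear data covariance matrix. The function names, the row-wise attribution of contributions to nuclides, and the toy numbers are illustrative assumptions; they are not taken from the manuscript or from any code's output.

```python
import math


def quad_form(x, cov, y):
    """Return x^T * cov * y for plain Python lists."""
    return sum(x[i] * cov[i][j] * y[j]
               for i in range(len(x)) for j in range(len(y)))


def ck_with_contributions(s_app, s_exp, cov, labels):
    """Compute c_k and a per-nuclide breakdown that sums to c_k.

    s_app, s_exp : sensitivity coefficients (application, experiment)
    cov          : relative covariance matrix of the nuclear data
    labels       : nuclide label for each sensitivity entry
    The breakdown attributes each row of S_a^T C S_e to the nuclide of
    that row, which is one possible convention (an assumption here).
    """
    sigma_app = math.sqrt(quad_form(s_app, cov, s_app))
    sigma_exp = math.sqrt(quad_form(s_exp, cov, s_exp))
    norm = sigma_app * sigma_exp
    contributions = {}
    for i, label in enumerate(labels):
        row = s_app[i] * sum(cov[i][j] * s_exp[j] for j in range(len(s_exp)))
        contributions[label] = contributions.get(label, 0.0) + row / norm
    return sum(contributions.values()), contributions


# Toy data (hypothetical): one entry per nuclide, diagonal covariance.
labels = ["U-235", "U-238", "Pu-239"]
cov = [[4e-4, 0.0, 0.0],
       [0.0, 1e-4, 0.0],
       [0.0, 0.0, 9e-4]]
s_cask = [0.30, -0.20, 0.10]    # burned fuel: Pu-239 is present
s_fresh = [0.35, -0.25, 0.00]   # fresh fuel experiment: no Pu-239

ck, parts = ck_with_contributions(s_cask, s_fresh, cov, labels)
print(f"c_k = {ck:.3f}")
for name, value in parts.items():
    print(f"  {name}: {value:.3f}")
```

In this toy case the Pu-239 contribution is exactly zero because the fresh fuel experiment has no Pu-239 sensitivity, yet the overall c_k can remain high when the uranium terms dominate both uncertainty budgets.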

Comments 3: On L286, even though a reference is provided, the authors should state what kinds of effects are considered within this « modeling uncertainty ».
Response 3: I added this information: the modeling uncertainty was derived from a sensitivity analysis of manufacturing tolerances and of uncertainties in the boron content of the borated steel.

Comments 4: In the last paragraph of Section 3, why would a similar value be added to the conservative fresh fuel pool approach? 0.05 needs to be added in the best-estimate case, not in the conservative case… Please clarify; this paragraph is not really understandable as written.
Response 4: Thank you for pointing this out. I agree that this paragraph required a clearer explanation. It has been modified in the revised manuscript.

Comments 5: Section 4 should be a conclusion, not a discussion.
Response 5: I agree with you and I would expect the same, but in "Instructions for Authors" (https://www.mdpi.com/journal/jne/instructions), a discussion section is required, while the conclusion section is optional, so I followed these instructions. I will try to contact the editors about the section name and content.

Comments 6: Update Figure 5 to show the flux per unit lethargy; this will remove the fluctuations in the Maxwellian spectra (10^-8 to 10^-6 eV).
Response 6: Thank you for your advice; I have updated Figure 5.
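
For readers unfamiliar with the conversion, a minimal sketch follows. It assumes the standard definition in which the group-integrated flux is divided by the lethargy width of the group, ln(E_upper / E_lower); the function name and the sample energy grid are hypothetical and unrelated to the group structure used in the manuscript. Whether uneven bin widths were the cause of the fluctuations in the original figure is also an assumption.

```python
import math


def flux_per_unit_lethargy(group_edges, group_flux):
    """Divide each group-integrated flux by its lethargy width.

    group_edges : ascending energy boundaries, len(group_flux) + 1 values
    group_flux  : flux integrated over each energy group
    """
    if len(group_edges) != len(group_flux) + 1:
        raise ValueError("need exactly one more edge than flux values")
    result = []
    for lower, upper, phi in zip(group_edges[:-1], group_edges[1:], group_flux):
        if lower <= 0.0 or upper <= lower:
            raise ValueError("edges must be positive and strictly ascending")
        result.append(phi / math.log(upper / lower))
    return result


# A pure 1/E spectrum integrated over unevenly sized groups: the raw group
# fluxes jump with the bin width, the per-lethargy values do not.
edges = [1e-8, 1e-7, 5e-7, 1e-6]
raw = [math.log(hi / lo) for lo, hi in zip(edges[:-1], edges[1:])]
print("group-integrated:", raw)
print("per unit lethargy:", flux_per_unit_lethargy(edges, raw))
```

The point of the example is that a tally plotted as group-integrated flux inherits the irregularity of the energy grid, whereas the per-lethargy representation shows only the physical shape of the spectrum.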

Comments 7: In Section 2.2.3, please explain what nominal filling of the CASTOR casks is considered and why non-borated water is considered in this specific case… Does it correspond to degraded (accidental water ingress) conditions? When the cask loading is described, the description in lines 150-151 corresponds to the fresh state of the assembly, correct? At the burnup listed in Table 1, there is no Gd left and the fuel composition will be very different… Please clarify in the text.
Response 7: Flooding conditions or a potential loss of boric acid in the moderator were considered as worst cases, as is usual in accident scenarios. You are right that the initial state of the fuel assembly was described. Finally, Gd is depleted and its effect becomes negligible at approximately 12,000 MWd/tU. Each of these points has been modified in or added to the text; thank you for pointing them out.

Comments 8: Page 5, L166-167: it is not clear how the reduction is performed; please elaborate.
Response 8: I added a brief description of how the reduction is performed.

Comments 9: It is unclear from the text what the burnup of each individual assembly considered in the CASTOR and Spent Fuel Pool (SFP) models is: are all assemblies of the same burnup/composition, or are distributions of individual assemblies with unique burnup/composition considered? In the latter case, how is the distribution picked to be conservative, e.g., the one that maximizes criticality?
Response 9: Thank you for pointing this out. In all storage system model cases, it is assumed that all loaded fuel assemblies are identical. The text in the paper has been modified to state this explicitly.

Comments 10: L181: when computational uncertainty is mentioned, we are talking about Monte Carlo statistics, correct? Given the size of the SFP, did you check the Shannon entropy to make sure that the fission source of the Monte Carlo calculation is actually converged?
Response 10: Yes, that is correct. I added that information to the text. Regarding convergence: we checked the source convergence of several calculations based on the entropy to ensure that the source population was set correctly. Furthermore, the sensitivity data are calculated in Serpent in two parallel independent runs. When processing the sensitivity data for the .sdf file, we automatically check that the keff values of these two runs do not differ too much.
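
The two checks mentioned in this response can be sketched as follows. This is a minimal illustration, not the authors' script and not a Serpent interface: the function names, the window length, the relative tolerance and the 3-sigma acceptance band are all assumptions chosen for the example.

```python
import math


def shannon_entropy(bin_counts):
    """Shannon entropy (in bits) of a fission source binned on a spatial mesh."""
    total = sum(bin_counts)
    if total <= 0:
        raise ValueError("source must contain at least one fission site")
    entropy = 0.0
    for count in bin_counts:
        if count > 0:
            p = count / total
            entropy -= p * math.log2(p)
    return entropy


def entropy_converged(entropies, window=20, rel_tol=0.01):
    """Heuristic: the mean entropy of the last `window` cycles must match
    the mean of the preceding `window` cycles within `rel_tol`."""
    if len(entropies) < 2 * window:
        return False
    recent = sum(entropies[-window:]) / window
    previous = sum(entropies[-2 * window:-window]) / window
    return abs(recent - previous) <= rel_tol * abs(recent)


def keff_runs_agree(k1, sigma1, k2, sigma2, n_sigma=3.0):
    """True if two independent keff estimates agree within n_sigma
    combined standard deviations."""
    combined = math.sqrt(sigma1 ** 2 + sigma2 ** 2)
    return abs(k1 - k2) <= n_sigma * combined


# Hypothetical per-cycle entropy history: a transient followed by a plateau.
history = [1.0 + 0.05 * i for i in range(20)] + [2.0] * 40
print("source converged:", entropy_converged(history))
print("runs agree:", keff_runs_agree(1.00010, 0.00005, 1.00015, 0.00005))
```

A stationary entropy is a necessary rather than a sufficient sign of source convergence, which is presumably why an independent second run is used as an additional cross-check.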

Comments 11: How is an sdf-formatted file produced from a Serpent calculation? I assume that Serpent is used to model the CASTOR and SFP models in Section 2.3. Please clarify.
Response 11: We developed a short script that collects the sensitivity data from the Serpent output and creates the .sdf file. And you are right, both CASTOR and SFP are modeled and calculated using the Serpent 2 code, as are the reactor critical models. All of this information was added to the paper text.

Comments 12: In Section 2.3, point #2, please explain what conservative and best-estimate models are in the context of this manuscript.
Response 12: I added a short explanation. The conservative model is characterized by the most reactive geometry and the minimum possible amount of absorbers in storage construction materials within the model specification range.

Comments 13: Have the manuscript proofread by a native speaker… Articles are missing in Section 2.3.
Response 13: The methodology section 2.3 (and the text overall) has been reviewed and corrected to improve clarity and technical accuracy.

Comments 14: In Table 3, it would be interesting to know what c_k values are obtained for the cask.
Response 14: Thank you for this suggestion. I have added a maximum c_k value column to display the similarity between the application and the experiments.

I would like to express my sincere thanks for your thorough review and for the time you invested in providing such helpful suggestions.
Best regards
Matej Sikl

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

The authors have addressed a few points, for example by improving the rather sparse explanations accompanying some of the figures. It is still my opinion that the approaches presented in the manuscript are likely to be of limited use to the broader scientific community. As already mentioned, in my view the paper reads more like a technical report, albeit one based on the authors’ extensive and long-standing experience and on a comprehensive analysis that cannot easily be reproduced within the framework of a publication.

Since the manuscript is written in a clear and comprehensible manner, I suggest that it be published and that the readership be left to decide what level of impact or resonance it ultimately deserves.

Reviewer 3 Report

Comments and Suggestions for Authors

The comments were addressed in a satisfactory manner. The paper can be published as it is now.
