Applying Machine Learning Algorithms to Classify Digitized Special Nuclear Material Obtained from Scintillation Detectors

Kokkiligadda, Sai Kiran; Barker, Cathleen; Gunger, Emily; Johnson, Jalen; Turner, Brice; Enqvist, Andreas

doi:10.3390/jne6030031

by Sai Kiran Kokkiligadda ¹,*, Cathleen Barker ¹,²,*, Emily Gunger ¹, Jalen Johnson ¹, Brice Turner ¹ and Andreas Enqvist ¹,*

¹ Department of Material Sciences and Engineering, University of Florida, Gainesville, FL 32611, USA

² Department of Physics and Nuclear Engineering, United States Military Academy, West Point, NY 10996, USA

* Authors to whom correspondence should be addressed.

J. Nucl. Eng. 2025, 6(3), 31; https://doi.org/10.3390/jne6030031

Submission received: 9 June 2025 / Revised: 19 July 2025 / Accepted: 5 August 2025 / Published: 11 August 2025

Abstract

The capability to discriminate among nuclear fuel properties is essential for a successful nuclear safeguard and security program. Accurate nuclear material identification is hindered by challenges such as differing levels of enrichment, weak radiation signals in the case of fresh nuclear fuel, and complex self-shielding effects. This study explores the application of supervised machine learning algorithms to digitized radiation detector data for classifying signatures of special nuclear materials. Three scintillation detectors, an EJ-309 liquid scintillator, a CLYC crystal scintillator, and an EJ-276 plastic scintillator, were used to measure gamma-ray and neutron data from special nuclear material at the National Criticality Experiments Research Center (NCERC) at the Nevada National Security Site (NNSS), Nevada, USA. Radiation detector pulse data was extracted from the collected digitized data and applied to three separate supervised learning models: Random Forest, XGBoost, and a feedforward Deep Neural Network, chosen for their widespread use and distinct data ingest and processing analytics. Through model refinement, such as adding an additional parameter feature, an accuracy of greater than 95% was achieved. Analysis of model parameter feature importance revealed that Countrate, the overall rate of gamma-ray and neutron events for each detector, was the most influential parameter and essential to include for improved classification. Initial model versions not including the Countrate parameter feature failed to classify. Supervised learning models allow measured gamma-ray and neutron pulse data to be used to develop effective identification and discrimination between material compositions of different fuel assemblies. 
The study demonstrated that traditional pulse shape parameters alone were insufficient for discriminating between special nuclear materials; the addition of Countrate markedly improved model accuracy but all models were heavily dependent on this specific feature, thus illustrating the need for alternative, more distinct parameter features. The machine learning development framework captured in this study will be beneficial for future applications in discriminating between different fuel enrichments and additives such as burnable poisons.

Keywords:

scintillation detectors; machine learning; fresh nuclear fuel

1. Introduction

Nuclear safeguards rely on extensive and accurate material counting and visual inspection methods. These processes can be slow and prone to error; as such, research and development has concentrated on applying detector techniques to aid in nuclear material safeguards. Nuclear fuel, particularly fresh nuclear fuel, remains a challenge to accurately identify and verify. Fresh nuclear fuel’s relatively low radiation signature, shielding effects, and additives such as burnable poisons greatly impact the ability to accurately identify its composition. However, machine learning techniques may aid in accurately discriminating between nuclear fuel assemblies based on the composition of the material and the location of the detector system.

Passive nondestructive assay (NDA) techniques are commonly used for special nuclear material assay, in which detectors are positioned to passively receive interactions from the material of interest [1]. However, passive NDA may require long measurements and results may require further lengthy analysis. Machine learning provides the nuclear community with the ability to rapidly process complex data files and discern between spent or fresh nuclear fuels [2,3,4]. These machine learning methods have demonstrated the possibility of near real-time safeguard analysis. Supervised learning models have shown promise in their ability to provide effective pattern matching as long as the best-suited parameter features are inputted into the algorithm. As this research will show, selecting the right parameter features to effectively improve model accuracy while also preventing reliance on a single dominant variable is a complex, iterative process.

The primary goal of this research was to map pulse features to discrete position and configuration labels for special nuclear material; in machine learning terms, this is a classification task. Classification algorithms are well-suited to this problem because they can identify patterns within different datasets and group observations into distinct categories. By leveraging features extracted from raw pulse data, machine learning models can be used to identify position and material configurations of the three SNM types commonly found in fresh or used nuclear fuel. Although the scope of this study focused on evaluating model performance as a function of detector and SNM position, the result of this study establishes a machine learning model framework in which different and more discriminative parameter features could be applied for future applications and studies.

Three supervised learning techniques were evaluated for this research: Random Forest, XGBoost, and Neural Networks. They were selected for their broad use, ease of adaptability to analyzing digitized data, and their variations in data analysis. Overall, these three machine learning models provide a robust assessment of suitability for this study. In addition, this study applied seven unique parameter features for each detector’s data sets. Six of these parameter features evaluated pulse shape features intrinsic to the measured SNM while the seventh looked at combined gamma-ray and neutron countrate for each detector which is influenced by extrinsic measurement conditions.

This study provides an adaptable framework for model development aimed at classifying special nuclear materials. By identifying and implementing stronger intrinsic features such as neutron-to-gamma ray information or time-dependent pulse shape parameter features, the models could be improved for more robust nuclear safeguard needs. This research could be applied to situations such as distinguishing LEU and HEU fuel, differing enrichments of fresh nuclear fuel, or identifying challenging scenarios such as inclusion of burnable poisons.

This article has five distinct sections, including the introduction. Section 2, Materials and Methods, details the measurement design conducted at the Nevada National Security Site. Section 3, Machine Learning Model Development Workflow, explains the model development framework used for this study, which can be modified for all follow-on work. Section 4, Results and Discussion, presents an in-depth analysis of each model for this study. Finally, Section 5, Conclusions, highlights the four major conclusions from this study and possible future work and study considerations.

2. Materials and Methods

The National Criticality Experiments Research Center (NCERC), located at the Nevada National Security Site, houses unique special nuclear materials that can be used for an assortment of measurements [5,6]. The Defense Nuclear Nonproliferation (NA-22) university consortia host teams from universities to conduct measurement campaigns at NCERC utilizing special nuclear materials. During the summer of 2024, the University of Florida was one of several teams to utilize the center’s capabilities for measurements.

In order to simulate some of the unique signatures found in nuclear fuel materials, three special nuclear materials were chosen and arranged in different configurations: highly enriched uranium from Rocky Flats hemishells (RF), weapons-grade plutonium from the Beryllium-Reflected Plutonium (BeRP) ball, and a neptunium (²³⁷Np) sphere. These are all heavy isotopes and materials commonly found in fresh and used nuclear fuel.

The experiment was designed to emulate the challenges of discerning two signatures simultaneously, while also measuring data from multiple positions. Three configurations were measured, each with two of the special nuclear materials situated 122 cm apart. The detector system measured at three different positions for each configuration (Figure 1). The three positions were chosen in order to evaluate the accuracy of discernment between special nuclear material types regardless of detector location.

Configuration 1 (C1) consisted of RF hemishells and BeRP ball, configuration 2 (C2) consisted of ²³⁷Np sphere and BeRP ball, and configuration 3 (C3) consisted of ²³⁷Np sphere and RF hemishells (Table 1). Position 1 (p1) arranged the detectors in front of the far left SNM, position 2 (p2) arranged the detectors between both SNMs, and position 3 (p3) arranged the detectors in front of the far right SNM.

The detector system consisted of three detectors arranged in a vertical configuration on a rolling cart so they could easily be moved between positions (Figure 2). From top to bottom, an Eljen-309 (EJ-309) liquid scintillator, a Cs₂LiYCl₆:Ce³⁺ (CLYC) crystal scintillator, and an Eljen-276 (EJ-276) plastic scintillator were attached to a single rod and angled such that there was equal distance between each detector front and the special nuclear material (Figure 3). These three detectors were chosen for data collection for initial machine learning model development due to their commercial availability and versatility. All three detectors are capable of measuring and discerning between gamma-ray and neutron events. Per manufacturer specifications, the EJ-309 liquid scintillator has a scintillation efficiency of 12,300 photons per 1 MeV electron while the EJ-276 plastic scintillator has a scintillation efficiency of 8000 photons per 1 MeV electron [7,8]. The CLYC detector has been characterized as detecting 2000 photons per MeV [9]. The detectors also represent three different scintillation materials, which allowed this study to assess possible identification and classification limitations tied to scintillation material. The EJ-276 and EJ-309 detectors used both have a 3 in × 3 in volume while the CLYC detector used has a 1 in × 1 in volume.

Pulses were measured using a STRUCK SIS3316 16-channel digitizer, manufactured by Struck Innovative Systeme GmbH in Hamburg, Germany, and raw pulse data was processed and analyzed with MATLAB R2024a, version 6. MATLAB was chosen for its availability for this research. The authors wrote and designed scripts to analyze raw data and calculate each parameter of interest for each measured pulse for each detector. Data was measured in sets for each position and each configuration, where an average of 10,000 CLYC pulses, 28,000 EJ-276 pulses, and 50,000 EJ-309 pulses were stored per set. It took about 30 s to measure and transfer each set of data. The time needed to measure this number of pulses varied per configuration and per position; the number of pulses was chosen to make the best use of the available measurement time while still obtaining statistically significant measurement data for each desired configuration. When the detectors were closer to the ²³⁷Np, for example, the time it took to reach the predetermined number of pulses for a set of data was much shorter than the time it took to reach the same number for a different SNM. Over 2 million pulses were measured by each detector for each position and each configuration.

The STRUCK digitizer allows for raw pulse information to be post-processed as needed. Supervised learning models require a large input data set to train the algorithm to classify accordingly. For this research, we wanted to evaluate multiple supervised learning models for their effectiveness at discerning both position and configuration. Measuring a large quantity of data for each position and each configuration ensured there was a sufficient amount of data available to train the different supervised learning models.

3. Machine Learning Model Development Workflow

A simple procedural schematic was developed to summarize the steps towards developing models for analysis and evaluation (Figure 4). Figure 4 outlines five primary steps that were used to develop each model for this study: (1) Raw Data Collection, (2) Preprocessing, (3) Model Building, (4) Model Training, and (5) Evaluation and Classification. Details of each step are listed in this section.

3.1. Raw Data Collection

Gamma-ray and neutron discrimination using pulse shape analysis is a critical tool for analyzing the energy spectra of different nuclear materials. As previously detailed in Section 2, the raw waveform data used for this study was collected during experimental measurements at the Nevada National Security Site (NNSS). Pulse shape discrimination techniques were adapted and applied to all three detectors and used as a starting base for determining which inputs would be beneficial for the supervised algorithms [10,11]. Unlike previous articles that applied these techniques to separate neutron and gamma-ray events, this study used all pulses in the data. The pulse shape techniques were used to calculate pulse parameter features that could be used to discriminate between SNM.

3.2. Preprocessing

A 3000-sample waveform (12 µs at 250 MHz sampling rate) was used for the CLYC pulses and a 500-sample waveform length was used for the pulses from the EJ-309 and EJ-276 detectors. Six pulse shape parameters were optimized and calculated for each detector’s pulses: the total integration of the pulses (Qtotal), the integration of the pulse during its fall (Qtail), the ratio between the two integration lengths (Qratio), the peak amplitude of the pulse (Amp), the difference of the peaks of the tail (Jitter), and the logarithmic sum of the square of the tail amplitudes (Dflog) [10,11]. These parameters served as the initial inputs for each of the machine learning models. After analyzing initial results, it was decided to add a seventh parameter feature: Countrate. Countrate is calculated by totaling the number of pulses measured, regardless of gamma-ray or neutron type, and dividing it by the measurement time. The STRUCK digitizer allows for raw time data to be included in the raw pulse data. Therefore, the measurement duration can be extracted from the data regardless of fraction of data used for training or which detector is used to accurately calculate the Countrate. For this study, Countrate was calculated for each set of detector data.
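
The feature definitions above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' MATLAB scripts: the baseline window, the tail start offset, and the choice of Qratio as tail divided by total are assumptions, and Jitter and Dflog are omitted because their exact definitions are given only in [10,11]. The helper names are hypothetical.

```python
import numpy as np

SAMPLING_RATE_HZ = 250e6  # digitizer sampling rate stated in the text


def waveform_duration_us(n_samples, sampling_rate_hz=SAMPLING_RATE_HZ):
    """Length of a digitized waveform in microseconds."""
    return n_samples / sampling_rate_hz * 1e6


def pulse_features(waveform, baseline_samples=20, tail_offset=10):
    """Return Qtotal, Qtail, Qratio and Amp for one positive-going pulse.

    baseline_samples and tail_offset are hypothetical settings, not the
    optimized values used in the study.
    """
    w = np.asarray(waveform, dtype=float)
    w = w - w[:baseline_samples].mean()           # subtract pre-trigger baseline
    peak = int(np.argmax(w))
    amp = float(w[peak])                          # Amp: peak amplitude
    q_total = float(w.sum())                      # Qtotal: integral of the full window
    q_tail = float(w[peak + tail_offset:].sum())  # Qtail: integral of the falling tail
    # Qratio is assumed here to be tail over total.
    q_ratio = q_tail / q_total if q_total != 0 else 0.0
    return {"Qtotal": q_total, "Qtail": q_tail, "Qratio": q_ratio, "Amp": amp}


def countrate(n_pulses, measurement_time_s):
    """Countrate: all recorded pulses divided by the measurement time."""
    return n_pulses / measurement_time_s


# Synthetic pulse for illustration only (not measured data).
pulse = np.zeros(100)
pulse[30] = 10.0
pulse[31:51] = 5.0
features = pulse_features(pulse)
```

With these definitions, `waveform_duration_us(3000)` reproduces the 12 µs CLYC window quoted above, and a single Countrate value would be attached to every pulse in a measurement set.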

3.3. Model Building

After the raw data was collected and preprocessed, the dataset, which included the calculated pulse features, was divided into training (60%), validation (20%), and testing (20%) subsets for each model. This ensured balanced class representation. Machine learning models were developed using Python version 3.12; standard machine learning libraries were used for all three models in this study. The Random Forest model was implemented using “RandomForestClassifier,” the XGBoost model was implemented using the official XGBoost Python package, and the Neural Network model was developed using the TensorFlow framework. All models were run on the University of Florida’s HiPerGator supercomputer, a high-performance cluster that provides rapid data analysis.
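
One common way to obtain a 60/20/20 split with balanced classes is two successive stratified splits. The sketch below uses scikit-learn's `train_test_split`; whether the study stratified in this way is an assumption, and the synthetic array, the nine class labels, and the helper name `split_60_20_20` are hypothetical stand-ins.

```python
import numpy as np
from sklearn.model_selection import train_test_split


def split_60_20_20(X, y, seed=42):
    """Stratified 60/20/20 split into training, validation and test subsets."""
    # First hold out 40% of the pulses, preserving class proportions.
    X_train, X_rest, y_train, y_rest = train_test_split(
        X, y, test_size=0.4, stratify=y, random_state=seed
    )
    # Then split the held-out 40% in half: 20% validation, 20% test.
    X_val, X_test, y_val, y_test = train_test_split(
        X_rest, y_rest, test_size=0.5, stratify=y_rest, random_state=seed
    )
    return (X_train, y_train), (X_val, y_val), (X_test, y_test)


# Synthetic stand-in data: 900 pulses, 7 features, 9 hypothetical class labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(900, 7))
y = np.repeat(np.arange(9), 100)
train, val, test = split_60_20_20(X, y)
```

Stratifying both splits keeps every class represented in the same proportion in all three subsets, which matters when some measurement sets contain far more pulses than others.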

Pulse features were applied to develop three different machine learning models capable of classifying the configurations based on position and material composition. The results from the three machine learning models were compared. Each model was built with the same dataset, which included the six initial pulse features and then reevaluated with the additional Countrate parameter.

3.4. Model Training

Three supervised models were built and then trained using the study’s measured dataset. Models were trained before evaluation and classification.

Random Forest: Random Forest is an ensemble learning method that constructs multiple decision trees and aggregates their predictions to improve classification accuracy [12]. To classify special nuclear material configurations based on the extracted features, a random forest (RF) model was implemented using the cuML library for GPU-accelerated machine learning. This allowed for efficient training and evaluation of the model on a large dataset. The dataset was prepared by combining multiple CSV files containing pulse features and countrates for each detector, configuration, and position. The random forest model was implemented using the cuRF class from the cuML library. One hundred decision trees were used to form the forest and the depth of each tree was limited to 10 to avoid overfitting while maintaining interpretability. At each split, 80% of the features were randomly selected, ensuring diverse decision boundaries across trees. Reproducibility was ensured by using a fixed random seed, and one stream was used for parallel processing. The model was trained on the training dataset using GPU acceleration for fast computation.

XGBoost: XGBoost is a gradient boosting model and can handle complex data distributions [13]. The XGBoost model was implemented using the XGBClassifier class from the XGBoost library, which provides a highly efficient and scalable gradient boosting framework. For this study, the hist tree method was used because it is highly efficient for large datasets and supports GPU acceleration. The model was configured to run on a GPU using the CUDA backend, ensuring faster training and prediction times. The total number of boosting rounds was set to 50 while the maximum depth of individual trees was set to 6. The learning rate was set to 0.1 to control the weight updates during boosting iterations, balancing both the model’s learning speed and performance. The random state was fixed to 42 to ensure reproducibility.

Neural Network: The neural network model used a feedforward architecture [14]. It consisted of an input layer with the same dimensionality as the feature set, followed by a first dense layer with 128 neurons and Rectified Linear Unit (ReLU) activation, a dropout layer with a dropout rate of 20% to prevent overfitting, a second dense layer with 64 neurons and ReLU activation, and another dropout layer with the same parameters as the first. A dense output layer followed, with neurons equal to the number of configurations and a softmax activation function for multi-class classification. The model was compiled using the Adam optimizer. Training was conducted for 5 epochs with a batch size of 32, and validation was performed using a holdout validation set.
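
As an illustration of the Random Forest settings described above, the following sketch configures scikit-learn's CPU `RandomForestClassifier` with the same tree count, depth limit, and per-split feature fraction. It is a stand-in for the GPU-based cuML model used in the study, not a reproduction of it; the synthetic data, the three class labels, and the seed value 42 are assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in data: 7 features, 3 hypothetical classes.
rng = np.random.default_rng(0)
n_per_class, n_classes = 200, 3
y = np.repeat(np.arange(n_classes), n_per_class)
X = rng.normal(size=(y.size, 7))
X[:, 6] += 5.0 * y  # make the last column class dependent

forest = RandomForestClassifier(
    n_estimators=100,   # one hundred trees, as stated in the text
    max_depth=10,       # depth limit stated in the text
    max_features=0.8,   # 80% of the features considered at each split
    random_state=42,    # fixed seed for reproducibility (value assumed)
)
forest.fit(X, y)
train_accuracy = forest.score(X, y)
```

The XGBoost and neural network settings listed above would map onto `XGBClassifier` and a Keras sequential model in a similar way; they are left out here to keep the sketch free of GPU-specific dependencies.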

3.5. Evaluation and Classification

The purpose of the final step is to assess model accuracy and address possible areas of improvement. For this study, when the initial six parameter features, Qtotal, Qtail, Qratio, Amp, Jitter, and Dflog, were used with the machine learning models, the classification accuracy for all three machine learning models was low, averaging just 18.86%. The models struggled to classify the configurations, regardless of detector position, likely due to the limited set of features and the inherent complexity of the data. The initial low accuracy results led to the models being restructured to include Countrate as the seventh parameter feature for each data set. This model refinement led to improved model accuracy of greater than 95%.

During initial assessment of the feasibility of Countrate for this study, it was found that it was an easily distinguishable parameter; however, there was some feature overlap for C2, p1 and C3, p1 for all three detectors. It was anticipated that these configurations and positions would have the most model inaccuracies.

For this study, gamma-ray and neutron data were not separated prior to model training; pulse shape parameters were calculated for all pulses without discrimination. Future studies will consider whether there are improvements by discriminating between gamma-ray or neutron only data.

4. Results and Discussion

After collecting the data from NNSS, analysis on the data was conducted at the University of Florida. Data was assessed for each detector, for each configuration, and for each position for all seven parameter features of interest. This data was then inputted into the three machine learning models that were refined specifically for this research.

Each model’s accuracy was evaluated and compared by detector and was analyzed based on two categories: the initial six parameter features (Qtotal, Qtail, Qratio, Amp, Dflog, and Jitter) and then the seven parameter features (the original six with the addition of the Countrate parameter feature). In this study, model accuracy represents how well a machine learning model correctly classifies the configuration and position of nuclear material based on the input pulse parameter features.

Model accuracy and feature importance were the two criteria primarily considered during assessment of each model’s performance. Additionally, models were evaluated with data from all three detectors combined in order to assess whether multi-detector systems have improved results. Lastly, a classification report was developed to assess the accuracy of the XGBoost model specifically. The classification report includes key performance metrics (precision, recall, and F1-score) for each configuration and position and provides a detailed assessment of the model’s performance and ability.

4.1. Initial Results Using Six Pulse Features

Initially, only six pulse parameter features were utilized for each detector. The individual accuracy of random forest, XGBoost, and neural networks for the three detectors using the six initial pulse features are presented in Table 2.

All three models had a low classification accuracy when only the six pulse features were utilized. The CLYC detector did have slightly higher accuracy across all models. This modest difference is considered relevant given the overall low accuracy range using six parameter features; even small gains may indicate underlying detector-driven advantages. This could be attributed to the detector’s higher spectroscopic accuracy compared to the other two organic scintillator materials. The CLYC detector produces pulse shapes that are long and distinctive, which may enable better machine learning classification. Although the XGBoost model performed slightly better than the other two models, the differences between models were less than 1%, which suggests the models were comparable.

The initial six pulse features were evaluated for their importance to the random forest model for each detector. This was done to determine the relative contribution of each pulse feature for each detector for the random forest model, specifically (Table 3).

The most important feature for all three detectors was Qtotal. This demonstrates the strength in analyzing pulses based on the total pulse integration which correlates to total amount of energy deposited and is the most commonly used pulse feature for radiation spectroscopy.

Peak amplitude was the second most important feature for the EJ-309 and EJ-276 detectors. This makes sense because both detectors have a short pulse length with well-measured amplitude-voltage while the CLYC has a much lower peak amplitude with significant influence of sampling noise. The importance of Dflog to the CLYC detector demonstrates the CLYC’s reliance on tail pulse analytics.

The remaining three features, Qratio, Jitter, and Qtail, had relatively minimal importance to the random forest model. This demonstrates that additional pulse or detector features are needed to improve the model’s accuracy.

Separately, the data in each configuration and position was combined for all three detectors for each of the three models. This was to ascertain if there were improved results using a multi-detector approach as opposed to only using data from one detector per model.

Surprisingly, combining the three detectors did not improve the models’ success rate. Instead, it decreased their accuracy, and the combined result was worse than simply averaging the three detectors’ accuracies. The models’ accuracies ranged from 17.04–17.11%, with XGBoost having the highest combined accuracy. This suggests that the models over-rely on EJ-276 and EJ-309 pulse data. Both EJ-276 and EJ-309 have a higher detection rate than the CLYC detector due to the large difference in detector volume. This could be alleviated by using a CLYC detector with a larger detection volume, or by placing an artificial limit on the number of pulses used from each detector rather than using data from each detector collected for the same duration of time.
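
The pulse-limit idea mentioned above could be realized by randomly subsampling each detector's pulses to a common cap before combining them. The sketch below is one possible approach and was not part of the study; the function name `cap_pulses_per_detector` and the pulse counts are hypothetical.

```python
import numpy as np


def cap_pulses_per_detector(detector_ids, max_per_detector, seed=0):
    """Return sorted row indices keeping at most max_per_detector pulses per detector."""
    rng = np.random.default_rng(seed)
    detector_ids = np.asarray(detector_ids)
    keep = []
    for det in np.unique(detector_ids):
        idx = np.flatnonzero(detector_ids == det)
        if idx.size > max_per_detector:
            # Randomly subsample detectors that exceed the cap.
            idx = rng.choice(idx, size=max_per_detector, replace=False)
        keep.append(idx)
    return np.sort(np.concatenate(keep))


# Hypothetical pulse counts that mimic an imbalance between detectors.
ids = np.array(["CLYC"] * 100 + ["EJ-276"] * 280 + ["EJ-309"] * 500)
kept = cap_pulses_per_detector(ids, max_per_detector=100)
```

After capping, each detector contributes the same number of rows to the combined training set, so no single detector dominates the fit purely through its higher detection rate.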

4.2. Evaluation by Set Using Six Pulse Features Plus Countrate

The three models were further evaluated by using each set’s Countrate value in addition to the six pulse parameter features. The overall average accuracy for each configuration and position for each detector and model is shown in Table 4.

The models performed within 1% accuracy of each other. This is beneficial for future studies because, of the three models, Random Forest has a much lower computational cost. EJ-309 had the best overall accuracy regardless of model, with EJ-276 a close second. On average, CLYC accuracy was 5 percentage points lower than that of the other detectors regardless of model.

Countrate’s inclusion had a clear, dominant influence on model performance; it substantially increased classification accuracy for all detectors and models. The size of this improvement suggests that the revised models rely almost exclusively on variation in Countrate. Countrate’s influence is further demonstrated in Table 5, which lists the feature importance of all seven pulse parameter features for each detector with the random forest model.

As shown in Table 5, Countrate had a 40 to 200 fold increase in relative importance compared to Qtotal for each detector. The decreased to negligible impact of the remaining initial pulse features indicates the limited influence of pulse shape-based metrics with the inclusion of temporal detection data. Countrate is predominantly determined by the detector position and radiological source strength of the nuclear material.
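
The effect of one dominant feature can be reproduced on synthetic data. In the sketch below, six stand-in pulse-shape columns carry no class information while a Countrate stand-in is nearly constant within each class, which loosely mimics one Countrate value per measurement set. All data, class counts, and hyperparameters other than the tree count and depth are assumptions; the resulting numbers are not the study's.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

FEATURES = ["Qtotal", "Qtail", "Qratio", "Amp", "Jitter", "Dflog", "Countrate"]

rng = np.random.default_rng(1)
n_classes, n_per_class = 4, 300
y = np.repeat(np.arange(n_classes), n_per_class)
# Six pulse-shape stand-ins drawn from the same distribution for every class.
X = rng.normal(size=(y.size, 7))
# Countrate stand-in: one nearly constant value per class.
X[:, 6] = 100.0 * (y + 1) + rng.normal(scale=1.0, size=y.size)

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)


def fit_and_score(columns):
    """Train on the selected feature columns and return the model and test accuracy."""
    model = RandomForestClassifier(n_estimators=100, max_depth=10, random_state=0)
    model.fit(X_tr[:, columns], y_tr)
    return model, model.score(X_te[:, columns], y_te)


_, acc_six = fit_and_score(list(range(6)))              # pulse-shape stand-ins only
model_seven, acc_seven = fit_and_score(list(range(7)))  # with Countrate stand-in
importance = dict(zip(FEATURES, model_seven.feature_importances_))
```

In this toy setting the six-feature model stays near chance while the seven-feature model is close to perfect, and nearly all of the impurity-based importance lands on the Countrate column. The same pattern in real data is a signal that the classifier is keying on measurement geometry and source strength rather than on intrinsic pulse shape.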

Although the models relied primarily on the Countrate feature, the intent of using machine learning in this study was not just to identify the most dominant parameter feature, but also to examine the flexibility and applicability of multiple machine learning models for complex classification scenarios in order to develop a framework for incorporating additional, revised parameter features in future studies. These results serve as a performance baseline and demonstrate that the approach to applying machine learning is feasible; future work could examine revised parameters based on other time-dependent features as opposed to the current seven parameters evaluated.

The data for each of the detectors was combined with the addition of Countrate in order to determine an overall model accuracy (Table 6). The combined multi-detector accuracy improved upon the CLYC detector’s individual accuracy by 2–3%; however, it did not surpass the individual detector accuracy for either the EJ-276 or the EJ-309 detector. Although the Neural Network model was slightly more accurate than the XGBoost model, the difference of <0.5% is probably not statistically significant. Nonetheless, model selection may still have some benefit depending on different study objectives, different detector combinations, and new parameter features; as such, further evaluation in future work should be considered.

A classification report was developed to evaluate the validation and test datasets for the combined detector analysis for each configuration and position for the XGBoost model. The classification report includes three calculated values: precision, recall, and F1-score [16]. The results for precision, recall, and F1-score are summarized in Table 7. Precision is the fraction of predictions for a class that are correct, while recall is the fraction of true members of a class that the model correctly identifies. Lastly, F1-score is the harmonic mean of the precision and recall scores. This analysis was conducted to identify whether there were discrepancies in accuracy based on position of detector or configurations of SNM. The results in Table 7 suggest that accuracy is highest for configurations with the most distinct Countrate characteristics relative to other configurations.
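
For reference, the three metrics can be computed per class from true positives (TP), false positives (FP), and false negatives (FN). The sketch below uses made-up labels in the style of the configuration and position names; the labels and predictions are illustrative only and are not results from the study.

```python
def per_class_metrics(y_true, y_pred, label):
    """Precision, recall and F1-score for one class label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0  # correct share of predictions
    recall = tp / (tp + fn) if tp + fn else 0.0     # recovered share of true members
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )  # harmonic mean of precision and recall
    return precision, recall, f1


# Illustrative labels only: one C2p1 pulse is misclassified as C3p1.
y_true = ["C1p1", "C1p1", "C2p1", "C2p1", "C3p1", "C3p1"]
y_pred = ["C1p1", "C1p1", "C2p1", "C3p1", "C3p1", "C3p1"]
metrics_c2 = per_class_metrics(y_true, y_pred, "C2p1")
metrics_c3 = per_class_metrics(y_true, y_pred, "C3p1")
```

In this example the single confusion lowers recall for C2p1 (a true C2p1 pulse was missed) and lowers precision for C3p1 (one of its predictions was wrong), which is how overlap between two classes shows up in a classification report.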

XGBoost performed extremely well, if not perfect, regardless of configuration or position. Configuration C1 had the best overall performance with the highest precision, recall, and F1-scores (Table 7). However, due to the models’ overreliance on Countrate, it is difficult to infer how well the models were able to discriminate between intrinsic signatures of highly enriched uranium or plutonium signatures. Instead, these results highlight a weakness in the current parameter features chosen for the models, specifically the overuse of Countrate, and the need for further model refinement by incorporating alternative pulse-shape and time-dependent parameters.

Configurations with ²³⁷Np (C2 and C3) were consistently less accurate than C1, though all three models still performed well. The decrease in accuracy for those configurations is likely due to the high gamma-ray emission rate from ²³⁷Np. ²³⁷Np has a robust gamma-ray output; the strength of the source can saturate detectors and reduce classification reliability. During measurements at the Nevada National Security Site, shielding for ²³⁷Np gamma-rays was considered to mitigate this but was not implemented in order to best emulate measurement conditions at a nuclear power plant or fuel manufacturing facility. This may have contributed to the difference in performance based on configuration.

Overall, inclusion of the Countrate feature significantly improved the accuracy of each model. This demonstrates that it is possible to distinguish between configurations and positions for different SNM with current measurement conditions. These techniques could be further adapted to distinguish between different fuel enrichments, such as LEU versus HEU, or fresh nuclear fuel with different enrichment percentages or burnable poison content. Further examination and incorporation of additional discriminative features outside of Countrate, such as neutron-to-gamma discrimination characteristics or time-dependent pulse metrics, should improve model robustness and decrease overreliance on one dominant parameter feature.

5. Conclusions

The paper evaluated the feasibility of using machine learning models to distinguish between configurations of several types of special nuclear materials from multiple position locations and using three detector types. Specifically, the models were trained on pulse and detector feature data as opposed to gamma-ray spectrum data as commonly used. Several key conclusions were drawn from the results that could be applied to future work in the nuclear community for machine learning algorithm development.

1.: Traditional pulse shape parameters were insufficient for accurately distinguishing between SNM configurations, regardless of detector type.

Pulse shape parameters were used as initial features for the three machine learning models because they are well suited to pulse-shape discrimination between gamma-rays and neutrons. However, pulse shape information on its own fails to capture the position-dependent characteristics of the detector and SNM locations. Identifying and developing the best parameter features is critical for a model’s accuracy. Pulse shape parameters are a good starting point, and which of them are most effective will vary with detector type, but additional parameter features are needed to improve the models. Notably, this also holds for detectors capable of recording both gamma-ray and neutron pulse data, as used here.

2.: Inclusion of Countrate, which incorporates time, was the leading factor in improving model accuracy.

Countrate was the most effective metric for rapidly improving the accuracy of all three models, regardless of detector. Countrate reflects both the position of the detectors and the strength of the measured special nuclear material. Future parameters that expand the use of time information may further improve model strength.

3.: With the inclusion of the seven parameters, all three models were highly accurate and comparable to each other, regardless of detector type.

Because all models reached an accuracy of 94% or greater with all detectors included, other considerations can be weighed when evaluating a model’s applicability to future work. Both XGBoost and the neural network had accuracies above 96% (Table 6). However, both use significantly more GPU processing power than random forest. The random forest model had an accuracy of 94.74% (Table 6) but required fewer GPUs. Using fewer GPU resources while maintaining high accuracy makes the random forest model more appealing than the other two models for special nuclear material classification.

4.: Models could improve with inclusion of multi-detector data, though this increases the ML processing cost.

Combining detector data yielded strong performance overall, with model accuracies ranging from 94% to 96%. The drawback was a significant increase in the number of GPUs needed to run each model. Revising the number or types of parameter features used may decrease the ML processing cost while still obtaining strong model accuracy.

In summary, each machine learning model and each detector demonstrated the ability to distinguish special nuclear material types from each location. The results of this study are promising; these models and the data they ingest can be further refined and applied towards identifying and distinguishing other nuclear materials such as fresh or used nuclear fuel. Future work will aim to conduct measurements on fresh fuel assemblies with different enrichments and additives such as burnable poisons. Additionally, follow-on studies will explore new intrinsic and time-based parameter features and their impact on model performance. The machine learning model development framework examined in this study provides a baseline for future work to build on.

Author Contributions

Conceptualization, C.B., A.E., E.G., J.J., S.K.K. and B.T.; methodology, C.B. and S.K.K.; software, C.B., A.E. and S.K.K.; validation, C.B. and S.K.K.; formal analysis, C.B. and S.K.K.; investigation, C.B. and E.G.; resources, C.B. and J.J.; data curation, C.B. and S.K.K.; writing—original draft preparation, C.B. and S.K.K.; writing—review and editing, C.B., A.E., E.G. and S.K.K.; visualization, C.B.; supervision, C.B. and A.E.; project administration, C.B. and A.E.; funding acquisition, A.E. All authors have read and agreed to the published version of the manuscript.

Funding

This work was performed at the National Criticality Experiments Research Center, operated for the DOE Nuclear Criticality Safety Program, funded and managed by the National Nuclear Security Administration for the Department of Energy. This work was funded in part by the Consortium for Monitoring, Technology, and Verification under the Department of Energy National Nuclear Security Administration award number DE-NA0003920.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding authors.

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

References

1. Geist, W.H.; Santi, P.A.; Swinhoe, M.T. Nondestructive Assay of Nuclear Materials for Safeguards and Security; Springer: Cham, Switzerland, 2024.
2. Ebiwonjumi, B.; Cherezov, A.; Dzianisau, S.; Lee, D. Machine Learning of LWR Spent Nuclear Fuel Assembly Decay Heat Measurements. Nucl. Eng. Technol. 2021, 53, 3563–3579.
3. Awe, D. Comparative Analysis of Surrogate Models for the Dissolution of Spent Nuclear Fuel. Master’s Thesis, East Tennessee State University, Johnson City, TN, USA, 2024.
4. Grape, S.; Brange, E.; Elter, Z.; Balkestahl, L.P. Determination of spent nuclear fuel parameters using modelled signatures from non-destructive assay and Random Forest regression. Nucl. Instrum. Methods Phys. Res. A 2020, 969, 163979.
5. Thompson, N.W.; Maldonado, A.; Cutler, T.E.; Trellue, H.R.; Amundson, K.M.; Rao Dasari, V.; Goda, J.M.; Grove, T.J.; Hayes, D.K.; Hutchinson, J.D.; et al. The National Criticality Experiments Research Center and its role in support of advanced reactor design. Front. Energy Res. 2023, 10, 1082389.
6. Goda, J.M.; Grove, T.J.; Hayes, D.K.; Myers, W.L.; Sanchez, R.G. Nuclear Criticality Experimental Research Center (NCERC) Overview; Los Alamos National Laboratory: Los Alamos, NM, USA, 2017.
7. Eljen Technology. Pulse Shape Discrimination EJ-301 and EJ-309; Eljen Technology: Sweetwater, TX, USA, 2025; Available online: https://eljentechnology.com/products/liquid-scintillators/ej-301-ej-309 (accessed on 1 June 2025).
8. Eljen Technology. Pulse Shape Discrimination EJ-276D and EJ-276G; Eljen Technology: Sweetwater, TX, USA, 2025; Available online: https://eljentechnology.com/products/plastic-scintillators/ej-276 (accessed on 1 June 2025).
9. Wen, X.; Enqvist, A. Measuring the scintillation decay time for different energy deposited by gamma-rays and neutrons in a Cs₂LiYCl₆:Ce³⁺ detector. Nucl. Instrum. Methods Phys. Res. A 2017, 853, 9–15.
10. Gamage, K.A.A.; Joyce, M.J.; Hawkes, N.P. A comparison of four different digital algorithms for pulse-shape discrimination in fast scintillators. Nucl. Instrum. Methods Phys. Res. A 2011, 642, 78–83.
11. Barker, C.; Turner, B.; Enqvist, A. PSD Performance using CLYC Scintillation Detector for Special Nuclear Materials Measurement. In Proceedings of the Institute for Nuclear Materials Management Annual Meeting, Portland, OR, USA, 21–25 July 2024.
12. Rigatti, S.J. Random Forest. J. Insur. Med. 2017, 47, 31–39.
13. Mitchell, R.; Frank, E. Accelerating the XGBoost algorithm using GPU computing. PeerJ Comput. Sci. 2017, 3, e127.
14. Zhang, G.P. Neural Networks for Classification: A Survey. IEEE Trans. Syst. Man Cybern. 2000, 30, 451–462.
15. Barker, C.B. Fresh Nuclear Fuel Verification System Optimized Using Neutron and Gamma Signatures. Ph.D. Thesis, University of Florida, Gainesville, FL, USA, 2025.
16. Sokolova, M.; Lapalme, G. A systematic analysis of performance measures for classification tasks. Inf. Process. Manag. 2009, 45, 427–437.

Figure 1. A top-down view of a sample configuration. There were three configurations in which three different SNMs were altered at the two SNM locations. The detector system measured data at three positions for each configuration of SNM material. The measured distances remained the same throughout the experiment.

Figure 2. The detector system was assembled on a moving cart so that it could be easily rolled from position to position. This image shows the detector system situated at position 2, centered between the two SNM canisters. The black canisters contained the SNM that was changed for each configuration. Photo provided by NCERC personnel.

Figure 3. The three detectors were arranged vertically with the EJ-276 detector on top, the CLYC detector in the middle, and the EJ-309 detector on the bottom. These detectors remained in this arrangement for all measurements. Photo provided by NCERC personnel.

Figure 4. Procedural schematic for machine learning model development. Raw Data Collection occurred at the NNSS while the remaining steps (Preprocessing, Model Building, Model Training, and Evaluation & Classification) occurred at the University of Florida.

Table 1. Summary of the configuration and position labels, along with the special nuclear materials present at each SNM location.

Configuration	Detector Position	SNM #1	SNM #2
C1	p1	RF Hemishells	BeRP Ball
C1	p2	RF Hemishells	BeRP Ball
C1	p3	RF Hemishells	BeRP Ball
C2	p1	²³⁷Np	BeRP Ball
C2	p2	²³⁷Np	BeRP Ball
C2	p3	²³⁷Np	BeRP Ball
C3	p1	²³⁷Np	RF Hemishells
C3	p2	²³⁷Np	RF Hemishells
C3	p3	²³⁷Np	RF Hemishells

Table 2. Overall accuracy for each model, for each detector; six parameter features were used. Data courtesy of [15].

Model	CLYC Accuracy (%)	EJ-309 Accuracy (%)	EJ-276 Accuracy (%)
Random Forest	21.26	17.29	16.98
XGBoost	21.40	17.31	17.01
Neural Network	21.47	17.09	16.92
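To put the Table 2 accuracies in context, they can be compared with the accuracy expected from random guessing. The short sketch below assumes the classification target is one of nine equally likely configuration-position classes (three configurations times three positions, as listed in Table 1); class balance is an assumption made here, not something stated with the table.

```python
# Chance-level baseline for a balanced multi-class problem.
# Assumption: nine equally likely configuration-position classes (Table 1).
n_classes = 3 * 3  # three configurations x three detector positions
chance_pct = 100.0 / n_classes

# Accuracies (%) copied from Table 2.
table2 = {
    "Random Forest": {"CLYC": 21.26, "EJ-309": 17.29, "EJ-276": 16.98},
    "XGBoost": {"CLYC": 21.40, "EJ-309": 17.31, "EJ-276": 17.01},
    "Neural Network": {"CLYC": 21.47, "EJ-309": 17.09, "EJ-276": 16.92},
}

print(f"chance level: {chance_pct:.2f}%")
for model, per_detector in table2.items():
    for detector, acc in per_detector.items():
        ratio = acc / chance_pct
        print(f"{model:15s} {detector:7s} {acc:5.2f}%  ({ratio:.2f}x chance)")
```

Under that assumption, chance is roughly 11%, so every six-feature model scores less than twice the chance level.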

Table 3. Random forest feature importance for the CLYC, EJ-309, and EJ-276 detectors; six parameter features were used.

Feature	CLYC Importance (%)	EJ-309 Importance (%)	EJ-276 Importance (%)
Qtotal	46.01	58.5	54.2
Qtail	4.9	2.7	1.5
Qratio	2.3	4.3	4.7
Amp	18.4	23.5	28.8
Dflog	20.3	7.9	8.6
Jitter	8	2.9	2.1

Table 4. Overall accuracy for each model, for each detector; seven parameter features, including Countrate, were used. Data courtesy of [15].

Model	CLYC Accuracy (%)	EJ-309 Accuracy (%)	EJ-276 Accuracy (%)
Random Forest	92.53	97.39	96.57
XGBoost	93.64	98.06	97.42
Neural Network	93.58	97.89	97.21

Table 5. Random forest feature importance for the CLYC, EJ-309, and EJ-276 detectors; seven parameter features, including Countrate, were used.

Feature	CLYC Importance (%)	EJ-309 Importance (%)	EJ-276 Importance (%)
Countrate	96.50	99.35	99.28
Qtotal	2.33	0.49	0.55
Dflog	0.54	0.00	0.00
Amp	0.48	0.15	0.16
Qtail	0.07	0.00	0.00
Jitter	0.05	0.00	0.00
Qratio	0.02	0.01	0.00
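The dominance of Countrate in Table 5 can be quantified with a small arithmetic check. The sketch below copies the tabulated importances and sums what remains for the six pulse features per detector; the variable names are illustrative only.

```python
# Feature importances (%) copied from Table 5, ordered CLYC, EJ-309, EJ-276.
detectors = ("CLYC", "EJ-309", "EJ-276")
table5 = {
    "Countrate": (96.50, 99.35, 99.28),
    "Qtotal": (2.33, 0.49, 0.55),
    "Dflog": (0.54, 0.00, 0.00),
    "Amp": (0.48, 0.15, 0.16),
    "Qtail": (0.07, 0.00, 0.00),
    "Jitter": (0.05, 0.00, 0.00),
    "Qratio": (0.02, 0.01, 0.00),
}

pulse_share = {}  # combined importance of the six pulse features
totals = {}       # column sums, expected to be ~100 up to rounding
for i, det in enumerate(detectors):
    totals[det] = sum(values[i] for values in table5.values())
    pulse_share[det] = sum(
        values[i] for name, values in table5.items() if name != "Countrate"
    )
    print(f"{det}: pulse features {pulse_share[det]:.2f}%, total {totals[det]:.2f}%")
```

The six pulse features together account for about 3.5% of the importance for CLYC and under 1% for EJ-309 and EJ-276, consistent with the overreliance on Countrate noted in the discussion.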

Table 6. Multi-detector combined data accuracy for each model; seven parameter features, including Countrate, were used.

Model	Accuracy (%)
Random Forest	94.74
XGBoost	96.40
Neural Network	96.82

Table 7. Classification report for each configuration and position for the combined, multi-detector XGBoost model. Data courtesy of [15].

Configuration	Position	SNM #1	SNM #2	Precision	Recall	F1-Score
C1	P1	RF Hemishells	BeRP Ball	1.00	1.00	1.00
C1	P2	RF Hemishells	BeRP Ball	0.99	1.00	0.99
C1	P3	RF Hemishells	BeRP Ball	0.99	0.99	0.99
C2	P1	²³⁷Np	BeRP Ball	0.93	0.94	0.94
C2	P2	²³⁷Np	BeRP Ball	0.99	0.99	0.99
C2	P3	²³⁷Np	BeRP Ball	0.94	0.92	0.93
C3	P1	²³⁷Np	RF Hemishells	0.93	0.91	0.92
C3	P2	²³⁷Np	RF Hemishells	0.93	0.95	0.94
C3	P3	²³⁷Np	RF Hemishells	0.99	0.98	0.99
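For reference, the F1-score is the harmonic mean of precision and recall, F1 = 2PR/(P + R). The sketch below recomputes it from the rounded precision and recall values in Table 7 and compares the result with the tabulated F1-score. Because the table reports two decimal places, small differences of up to about 0.01 are expected; the tolerance used here is an assumption chosen to cover that rounding.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 if both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# (configuration, position, precision, recall, reported F1) from Table 7.
rows = [
    ("C1", "P1", 1.00, 1.00, 1.00),
    ("C1", "P2", 0.99, 1.00, 0.99),
    ("C1", "P3", 0.99, 0.99, 0.99),
    ("C2", "P1", 0.93, 0.94, 0.94),
    ("C2", "P2", 0.99, 0.99, 0.99),
    ("C2", "P3", 0.94, 0.92, 0.93),
    ("C3", "P1", 0.93, 0.91, 0.92),
    ("C3", "P2", 0.93, 0.95, 0.94),
    ("C3", "P3", 0.99, 0.98, 0.99),
]

# Largest gap between the recomputed and the reported F1-score.
max_gap = max(abs(f1_score(p, r) - f1) for _, _, p, r, f1 in rows)

# Unweighted (macro) average of the reported F1-scores over the nine classes.
macro_f1 = sum(f1 for *_, f1 in rows) / len(rows)

for cfg, pos, p, r, f1 in rows:
    print(f"{cfg} {pos}: recomputed {f1_score(p, r):.3f}, reported {f1:.2f}")
print(f"max gap {max_gap:.4f}, macro F1 {macro_f1:.4f}")
```

The recomputed values agree with the table to within rounding, and the macro-averaged F1-score of about 0.966 is in line with the 96.40% XGBoost accuracy in Table 6.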

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
