1. Introduction
Radioisotopes play a vital role across a broad range of disciplines, including industry, healthcare, and scientific research, with their application areas expanding steadily. Each year, more than 50 million nuclear medicine procedures are performed worldwide, and the global demand for radioisotopes continues to increase [1]. In the medical field, the growing use of radioisotopes for both diagnostic and therapeutic purposes has driven the development of innovative applications involving multiple isotopes with distinct nuclear properties.
Particle accelerators, particularly cyclotrons, have historically played a central role in the production of medical radioisotopes. Although their use declined for a period in favor of reactor-based production, accelerator-based methods have regained strategic importance in recent years [2]. As clinical demand rises and populations age, the need for a reliable isotope supply has begun to exceed the capacity of conventional reactor-based systems, thereby increasing reliance on accelerator technologies, especially proton accelerators [3].
Nuclear reaction codes such as ALICE, EMPIRE, GEANT4, MCNPX, and TALYS have been widely used to calculate reaction cross sections and optimize production routes for medically relevant isotopes. However, the accuracy of these theoretical models strongly depends on nuclear input parameters such as level density, gamma-ray strength functions, and optical model potentials, many of which are associated with considerable uncertainties. As a result, noticeable discrepancies often arise between calculated excitation functions and experimental measurements, particularly for reactions involving complex particle emissions or near-threshold energies [4,5,6,7,8,9].
To overcome these limitations and improve predictive performance, recent research has increasingly focused on integrating artificial intelligence (AI) tools, especially artificial neural networks (ANNs), into nuclear physics workflows. ANN models trained on experimental databases such as EXFOR have demonstrated strong capability in estimating reaction cross sections, level density parameters, and resonance widths with improved accuracy compared to classical approaches [10,11,12]. For example, ANN-based predictions for 124I production via the 124Te(p,n)124I reaction have been shown to reproduce experimental data more accurately than TALYS and EMPIRE calculations, while also enabling reliable extrapolation into unmeasured energy regions [13]. Nuclear medicine has witnessed rapid advancements driven by innovations in radiopharmaceuticals, simulation techniques, and AI, which have significantly enhanced diagnostic precision and therapeutic design [14]. Alongside these developments, improvements in nuclear reaction modeling and excitation function evaluations have strengthened the production of medically important radionuclides such as 123I and 127Xe [15].
Iodine-124 (124I) is a positron-emitting radioisotope with a physical half-life of approximately 4.2 days, making it particularly well suited for immunoPET applications requiring extended imaging timeframes. Its relatively long half-life provides a significant advantage over conventional PET isotopes such as 18F (t1/2 = 110 min) and 68Ga (t1/2 = 68 min), especially for imaging large biomolecules, including monoclonal antibodies and peptides, which exhibit slow in vivo pharmacokinetics [16].
Antibodies radiolabeled with 124I have demonstrated strong potential in high-resolution PET imaging of various malignancies, including differentiated thyroid cancer, gliomas, neuroendocrine tumors, and lymphomas [17,18]. Moreover, because its chemical behavior is identical to that of 131I, 124I is particularly attractive for theranostic applications, enabling accurate pre-therapeutic dosimetry, treatment planning, and personalized therapy monitoring [19].
Recent studies have further shown that AI-based approaches, particularly ANN and Bayesian-optimized deep learning models, provide effective alternatives to conventional nuclear reaction codes for modeling radionuclide production. In the case of 124I, these models have achieved high-accuracy predictions of proton-induced reaction cross sections, often outperforming classical codes such as TALYS while offering enhanced model interpretability through explainable AI techniques such as SHAP analysis [20]. Furthermore, ANN-based surrogate models trained on evaluated nuclear data libraries have demonstrated strong generalization capability across proton- and alpha-induced reactions relevant to medical radionuclide production, including iodine-related systems [21,22]. These approaches enable rapid cross-section estimation over broad energy ranges, facilitating beam energy optimization and reducing experimental trial-and-error efforts in cyclotron-based 124I production.
Despite its clinical importance, the efficient and reliable production of high-purity 124I remains challenging. The increasing clinical demand underscores the need for improved nuclear data, optimized production routes, and systematic evaluation of theoretical and AI-based modeling approaches. Therefore, in the present study, proton-induced 124I production is investigated using TALYS, EMPIRE, and ANN-based models. Theoretical cross-section predictions are benchmarked against available experimental data, and reaction yield and activation analyses are performed to assess the feasibility and scalability of 124I production under medically relevant conditions.
2. Materials and Methods
In this study, reaction cross sections were calculated using the TALYS and EMPIRE nuclear reaction codes. The calculations were performed employing the Two-Component Exciton (TCE), Geometry-Dependent Hybrid (GDH), Exciton, and Hybrid Monte Carlo Simulation (HMS) models.
TALYS is an open-source nuclear reaction code developed by Koning and collaborators to enable theoretical modeling of nuclear reactions. The current version allows nuclear reaction calculations for a wide range of projectile particles (neutrons, protons, deuterons, tritons, alpha particles, 3He, and gamma rays) incident on target nuclei with mass numbers of 12 and above over a broad energy range. In TALYS calculations, optical model potentials, level density models, direct reaction mechanisms, compound nucleus models, and pre-equilibrium models are employed. The default approaches include the two-component exciton model for pre-equilibrium processes and the Hauser–Feshbach formalism for compound nucleus reactions [23].
EMPIRE is a nuclear reaction code capable of performing theoretical calculations and comparative analyses over a wide incident energy range. It supports various projectile particles, including neutrons, protons, deuterons, tritons, alpha particles, 3He, photons, and heavy ions. In EMPIRE calculations, optical model potentials, level density models, multi-step direct and compound reaction models, DWBA, the PCROSS exciton model, and the HMS approach are utilized. In this work, PCROSS Exciton and HMS models were used for pre-equilibrium calculations, while the default Generalized Superfluid Model (GSM) was adopted for level density calculations.
During ANN training, the Levenberg–Marquardt (LM) algorithm, a Newton-based optimization method, was employed because of its fast convergence and computational efficiency. The LM method has few adjustable parameters and, because it approximates the Hessian matrix from the Jacobian of the network errors, relies only on first-order partial derivatives.
In the LM algorithm, network weights are optimized through a backpropagation-based update scheme. The neuron model used in this study employs the sigmoid activation function, which is well suited for weight learning. Each neuron consists of n inputs (x1 … xn), associated weights (w1 … wn) and a single output. The neuron output is obtained by applying the activation function to the weighted sum of the inputs. The sigmoid activation function was used in both network layers.
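Due to the characteristics of the sigmoid function, input and output data were scaled to the [0, 1] interval before being presented to the network, as expressed in Equation (1):
\[ x_{\mathrm{scaled}} = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \tag{1} \]
where x_min and x_max denote the minimum and maximum values of the variable being scaled.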
In the LM algorithm, the weight update at the k-th iteration is given by Equation (2):
\[ \Delta w_k = -\left(J^{T} J + \lambda I\right)^{-1} J^{T} e \tag{2} \]
where Δw_k represents the change in network weights, J is the Jacobian matrix composed of first-order derivatives of the network error vector (e) with respect to the weights, λ is the Marquardt parameter, and I denotes the identity matrix. During training, if the mean squared error decreases, λ is multiplied by a decay factor (typically 0.1); otherwise, it is divided by the same factor. The initial value of λ was set to 0.01 in this study.
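The update rule and the λ schedule described above can be sketched in a few lines of Python. This is a minimal illustration rather than the MATLAB routine used in the study: the function name `levenberg_marquardt`, the stopping tolerance `tol`, and the straight-line toy problem with an analytic Jacobian are all assumptions introduced here. Only the initial λ of 0.01 and the factor of 0.1 come from the text.

```python
import numpy as np

def levenberg_marquardt(residual, jacobian, w0, lam=0.01, factor=0.1,
                        max_iter=100, tol=1e-12):
    """Minimise the mean squared error with the LM update rule.

    residual(w) returns the error vector e; jacobian(w) returns de/dw.
    lam=0.01 and factor=0.1 follow the text; tol and max_iter are assumed.
    """
    w = np.asarray(w0, dtype=float)
    e = residual(w)
    mse = np.mean(e ** 2)
    for _ in range(max_iter):
        J = jacobian(w)
        # Delta w = -(J^T J + lam * I)^(-1) J^T e
        step = -np.linalg.solve(J.T @ J + lam * np.eye(w.size), J.T @ e)
        e_new = residual(w + step)
        mse_new = np.mean(e_new ** 2)
        if mse_new < mse:
            # Error decreased: accept the step and shrink lam
            improvement = mse - mse_new
            w, e, mse = w + step, e_new, mse_new
            lam *= factor
            if improvement < tol:
                break
        else:
            # Error did not decrease: reject the step and enlarge lam
            lam /= factor
    return w, mse, lam

# Toy problem (hypothetical data): recover intercept 2 and slope 3 of a line
x = np.linspace(0.0, 1.0, 20)
y = 2.0 + 3.0 * x
residual = lambda w: w[0] + w[1] * x - y
jacobian = lambda w: np.column_stack([np.ones_like(x), x])

w_fit, mse_fit, lam_fit = levenberg_marquardt(residual, jacobian, [0.0, 0.0])
print(w_fit, mse_fit, lam_fit)
```

Small λ makes the step close to a Gauss–Newton step, while large λ shortens it toward a gradient-descent step, which is why λ is reduced after a successful iteration and increased after a failed one.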
For activation function configurations involving sigmoid–linear mappings, data with zero mean and unit standard deviation provide a more suitable representation. Therefore, a flexible preprocessing scheme incorporating both normalization strategies was implemented to ensure appropriate scaling depending on the selected activation function.
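A preprocessing scheme of this kind might look like the following Python sketch. The function names, the `output_activation` switch, and the sample energies are hypothetical; the text only states that [0, 1] scaling was used with sigmoid outputs and zero-mean, unit-variance scaling with sigmoid–linear configurations.

```python
import numpy as np

def minmax_scale(x):
    """Scale values to the [0, 1] interval; also return (min, max) for inversion."""
    x = np.asarray(x, dtype=float)
    lo, hi = x.min(), x.max()
    return (x - lo) / (hi - lo), (lo, hi)

def zscore_scale(x):
    """Scale values to zero mean and unit standard deviation."""
    x = np.asarray(x, dtype=float)
    mu, sigma = x.mean(), x.std()
    return (x - mu) / sigma, (mu, sigma)

def preprocess(x, output_activation):
    """Pick the scaling from the output-layer activation (assumed switch)."""
    if output_activation == "sigmoid":
        return minmax_scale(x)
    if output_activation == "linear":
        return zscore_scale(x)
    raise ValueError(f"unknown activation: {output_activation}")

# Illustrative proton energies in MeV (not data from the study)
energies = np.array([5.0, 10.0, 15.0, 20.0, 25.0])
scaled01, (e_min, e_max) = preprocess(energies, "sigmoid")
scaled_z, (e_mean, e_std) = preprocess(energies, "linear")
print(scaled01, scaled_z)
```

Returning the scaling parameters alongside the scaled data makes it straightforward to map network predictions back to physical units.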
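The Jacobian matrix is a critical component of the LM algorithm and was computed using the finite difference method. The Jacobian elements with respect to the weights are defined in Equation (3):
\[ J_{ij} = \frac{\partial e_i}{\partial w_j} \tag{3} \]
where e_i is the i-th component of the network error vector and w_j is the j-th weight.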
To reduce computational cost while maintaining sufficient accuracy, the forward difference approach was adopted instead of the central difference method.
The statistical assessment of the ANN model's performance relies on the mean squared error (MSE) and the root mean squared error (RMSE). Performance functions measure the cumulative discrepancy between the network outputs and the target values. This discrepancy indicates how closely the network fits the training set and drives the weight updates. In feed-forward networks, the conventional performance function is MSE [24,25]. In this study, the data were divided into three subsets: training (70%), validation (15%), and test (15%) [26]. The production cross-section data for the 124Te(p,n)124I and 126Te(p,3n)124I reactions were taken from the existing literature. The ANN was first trained and validated to establish the final weights. Regression analysis shows a good fit between the output (y) and target (x) variables, confirming the algorithm's suitability for this purpose.
The ANN was employed to predict continuous output values as a function of proton energy. It consists of an input layer with a single neuron corresponding to the proton energy, one hidden layer comprising ten neurons, and an output layer with a single neuron. The architecture was determined through preliminary tests aimed at balancing model complexity and generalization capability. A single hidden layer was preferred because, in theory, it can approximate nonlinear continuous functions, and because it minimizes the risk of overfitting given the limited size of the available experimental datasets. The number of hidden neurons (n = 10) was optimized empirically by monitoring validation error trends; increasing the number of neurons beyond this value did not lead to significant performance improvements but increased variance. The Levenberg–Marquardt algorithm was chosen for its fast convergence and proven robustness in small to medium-scale regression problems commonly encountered in nuclear data modeling.
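The 70/15/15 partition and the two error metrics can be illustrated with a short Python sketch. The random assignment of points, the fixed seed, and the dataset size of 40 are assumptions made for the example; the text does not state how points were assigned to each subset.

```python
import numpy as np

def split_indices(n, fractions=(0.70, 0.15, 0.15), seed=0):
    """Randomly partition n sample indices into train/validation/test sets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n)
    n_train = int(round(fractions[0] * n))
    n_val = int(round(fractions[1] * n))
    train = order[:n_train]
    val = order[n_train:n_train + n_val]
    test = order[n_train + n_val:]  # remainder goes to the test set
    return train, val, test

def mse(pred, target):
    """Mean squared error between predictions and targets."""
    pred, target = np.asarray(pred, float), np.asarray(target, float)
    return float(np.mean((pred - target) ** 2))

def rmse(pred, target):
    """Root mean squared error, in the same units as the targets."""
    return float(np.sqrt(mse(pred, target)))

# Hypothetical dataset size, for illustration only
train_idx, val_idx, test_idx = split_indices(40)
print(len(train_idx), len(val_idx), len(test_idx))
```

RMSE is often quoted next to MSE because it carries the units of the cross section itself, which makes it easier to compare with experimental uncertainties.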
The input data are first processed in the hidden layer through a weighted summation followed by the addition of a bias term, after which a nonlinear activation function is applied. This nonlinear transformation enables the network to capture complex and nonlinearly varying relationships inherent in the dataset. The resulting hidden-layer outputs are subsequently propagated to the output layer, where another weighted linear combination with an associated bias term is performed to generate the final prediction.
Since the objective of the model is to estimate continuous numerical values, a linear activation function is applied at the output layer, making the proposed architecture suitable for regression-based problems. The selected network configuration provides an effective balance between computational efficiency and sufficient representational capacity, ensuring reliable performance while meeting the requirements of the present study.
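As an illustration of the 1-10-1 architecture with a sigmoid hidden layer and a linear output, the sketch below implements the forward pass and a forward-difference Jacobian of the error vector with respect to all 31 parameters. The parameter packing order, the step size `h`, the random weights, and the placeholder inputs are assumptions for the example and are not taken from the study.

```python
import numpy as np

N_HIDDEN = 10  # hidden neurons, as stated in the text

def unpack(w):
    """Split the flat parameter vector (assumed ordering) into layer parameters."""
    w1 = w[0:N_HIDDEN]                  # input -> hidden weights
    b1 = w[N_HIDDEN:2 * N_HIDDEN]       # hidden biases
    w2 = w[2 * N_HIDDEN:3 * N_HIDDEN]   # hidden -> output weights
    b2 = w[3 * N_HIDDEN]                # output bias
    return w1, b1, w2, b2

def forward(w, energy):
    """Sigmoid hidden layer followed by a linear output neuron."""
    w1, b1, w2, b2 = unpack(w)
    z = np.outer(energy, w1) + b1       # weighted sum plus bias, shape (N, 10)
    hidden = 1.0 / (1.0 + np.exp(-z))   # sigmoid activation
    return hidden @ w2 + b2             # linear output, shape (N,)

def errors(w, energy, target):
    """Network error vector e = prediction - target."""
    return forward(w, energy) - target

def forward_difference_jacobian(w, energy, target, h=1e-6):
    """Approximate J[i, j] = de_i/dw_j with one extra evaluation per weight."""
    e0 = errors(w, energy, target)
    J = np.empty((e0.size, w.size))
    for j in range(w.size):
        w_step = w.copy()
        w_step[j] += h
        J[:, j] = (errors(w_step, energy, target) - e0) / h
    return J

rng = np.random.default_rng(0)
w = rng.normal(size=3 * N_HIDDEN + 1)   # 31 parameters in total
energy = np.linspace(0.0, 1.0, 8)       # scaled proton energies (placeholder)
target = np.zeros_like(energy)          # placeholder targets
J = forward_difference_jacobian(w, energy, target)
print(J.shape)
```

The forward difference needs one additional network evaluation per parameter, whereas the central difference needs two, which is the cost saving referred to in the text.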
All ANN calculations were carried out using the MATLAB R2014a software environment, which is provided under a campus license by Antalya Bilim University.
4. Discussion
The present study provides a comprehensive comparative evaluation of proton-induced production routes of the positron-emitting radionuclide iodine-124 using conventional nuclear reaction codes (TALYS and EMPIRE) and ANN-based modeling.
For the 124Te(p,n)124I reaction, the ANN predictions reproduce the overall trend of the experimental excitation function; however, the level of agreement is clearly energy-dependent. Noticeable deviations are observed in specific regions, particularly near the reaction threshold, where experimental data are sparse and rapid variations occur. Although global statistical metrics such as correlation coefficients and MSE/RMSE values indicate a competitive overall performance, these metrics may mask local discrepancies at certain energies. In contrast, physics-based models such as TALYS and EMPIRE, which explicitly incorporate reaction mechanisms, can provide improved agreement in near-threshold regions under appropriate parameter choices. Consequently, the ANN model demonstrates superior performance mainly in selected energy intervals, rather than uniformly across the full excitation function. Similar observations have been reported by Siddik [13] and Üncü and Özdoğan [21], who demonstrated that ANN-based models outperform conventional reaction codes when benchmarked against experimental cross sections for medically relevant reactions.
A comparable trend was observed for the 126Te(p,3n)124I reaction. Although TALYS and EMPIRE calculations captured the location of the excitation function maximum in the 25–30 MeV energy range, the ANN model again yielded the lowest error metrics, indicating a closer match to the experimental data. Among the theoretical approaches, EMPIRE with the HMS model provided slightly improved agreement relative to TALYS. Nevertheless, the ANN approach consistently offered the best overall predictive performance, underscoring its capability to learn complex nonlinear relationships directly from experimental datasets.
The performance of ANN-based modeling observed in this study aligns well with recent literature emphasizing the growing role of artificial intelligence in nuclear data evaluation. Tang [20] reported that Bayesian-optimized deep learning models achieved higher predictive accuracy than TALYS for proton-induced medical isotope production reactions, while also enabling model interpretability through SHAP-based analyses. Similarly, Hamid et al. [22] demonstrated that supervised machine learning techniques, particularly random forest algorithms trained on experimental EXFOR data, can successfully generate reliable proton- and alpha-induced nuclear reaction cross sections for medically relevant radionuclide production. By benchmarking their predictions against evaluated nuclear data libraries such as ENDF/B-VII.0, they showed that data-driven regression models are capable of reproducing excitation function trends with reasonable accuracy across multiple reaction channels. These findings provide strong support for the present study, in which an ANN-based framework is employed to further enhance predictive accuracy and to systematically evaluate proton-induced 124I production routes under medically relevant conditions.
From a practical production perspective, the activity and yield calculations further confirm the feasibility of both investigated production routes under clinically relevant irradiation conditions. For the 124Te(p,n) reaction, the calculated activity of approximately 2.1 GBq following a 1 h irradiation at 11.9 MeV and 0.1 mA beam current corresponds to a production yield of about 21 GBq/mAh. In contrast, the 126Te(p,3n) reaction yields substantially higher activity (≈8.5 GBq) and production yield (≈85 GBq/mAh) under higher-energy irradiation.
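The relation between the quoted activities and yields can be checked with simple arithmetic: the yield is the activity divided by the integrated beam charge (current times irradiation time). The Python sketch below uses the values quoted above for the (p,n) route; for the (p,3n) route, the same 0.1 mA and 1 h are assumed, since that is what the quoted 8.5 GBq and 85 GBq/mAh imply. The function names are hypothetical. Dividing by charge presumes near-linear activity build-up, which is reasonable here because 1 h is short compared with the approximately 4.2 day half-life.

```python
import math

def yield_per_mAh(activity_gbq, current_mA, time_h):
    """Production yield in GBq/mAh: activity divided by integrated beam charge."""
    return activity_gbq / (current_mA * time_h)

def saturation_factor(time_h, half_life_h):
    """Fraction of saturation activity reached after irradiating for time_h."""
    decay_const = math.log(2) / half_life_h
    return 1.0 - math.exp(-decay_const * time_h)

# (p,n) route: values quoted in the text
y_pn = yield_per_mAh(2.1, 0.1, 1.0)

# (p,3n) route: 0.1 mA and 1 h are ASSUMED, inferred from 8.5 GBq vs 85 GBq/mAh
y_p3n = yield_per_mAh(8.5, 0.1, 1.0)

# Build-up check for 124I with a half-life of about 4.2 days
half_life_h = 4.2 * 24.0
sf = saturation_factor(1.0, half_life_h)
linear_estimate = math.log(2) / half_life_h * 1.0

print(y_pn, y_p3n, sf, linear_estimate)
```

For longer irradiations the saturation factor departs from the linear estimate, so a yield expressed per mAh would then overstate the activity actually obtained.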
Figure 7 and Figure 8 highlight the practical advantages of the 126Te(p,3n)124I route in terms of achievable activity and production yield. Although this route requires higher proton energies, the substantially increased yield may offset operational constraints in medium- and high-energy medical cyclotrons. Conversely, the (p,n) reaction remains attractive for low-energy facilities due to reduced activation of unwanted radionuclidic impurities.
Despite their strong performance, ANN-based approaches also exhibit inherent limitations. To assess the robustness of the ANN predictions, the dataset was divided into independent training, validation, and test subsets, ensuring that model performance was not dominated by memorization effects. The consistency of regression coefficients across these subsets indicates stable learning behavior. It should be noted, however, that ANN predictions remain data-driven: their predictive capability depends critically on the availability and quality of experimental data, and their reliability decreases outside the energy ranges sufficiently represented in the training dataset. Extrapolation beyond experimentally explored proton energies should therefore be approached with caution. Moreover, although explainable AI techniques have improved model transparency [20], ANN models generally lack the direct physical interpretability of reaction codes such as TALYS and EMPIRE. Consequently, hybrid strategies that combine AI-based surrogate models with physics-informed constraints or nuclear reaction theory may represent a promising direction for future research.
In summary, the findings of this study demonstrate that ANN-based modeling provides a robust and efficient complementary tool for nuclear reaction analysis and optimization of 124I production. By significantly reducing prediction errors and enabling rapid cross-section estimation over broad energy ranges, AI-driven approaches can support improved beam energy selection, irradiation planning, and production efficiency in medical cyclotron facilities. These advantages are expected to become increasingly important as the clinical demand for high-purity 124I continues to grow, particularly in immunoPET and theranostic applications.
5. Conclusions
In this study, the proton-induced production of iodine-124 was systematically investigated using conventional nuclear reaction codes (TALYS and EMPIRE) alongside an ANN-based modeling approach. The calculated excitation functions for the 124Te(p,n)124I and 126Te(p,3n)124I reactions were benchmarked against available experimental data, and production yield and activity analyses were performed under medically relevant irradiation conditions.
The results demonstrate that while TALYS and EMPIRE are capable of reproducing the general trends of the experimental excitation functions, their predictive accuracy is limited by uncertainties in nuclear input parameters, particularly in near-threshold and high-energy regions. In contrast, the ANN-based model consistently achieved superior agreement with experimental data, as evidenced by significantly lower MSE and RMSE values and high correlation coefficients for both investigated reaction channels. These findings confirm the ability of ANN models to effectively capture the nonlinear relationships between proton energy and reaction cross sections without requiring detailed physical parameterization.
From a production perspective, the calculated activity and yield values indicate that both reaction routes are feasible for clinical-scale 124I production, with the 126Te(p,3n) reaction offering higher yields at the expense of increased cyclotron energy requirements. The ANN-based predictions provide a practical advantage by enabling rapid and reliable cross-section estimation across broad energy ranges, thereby supporting optimized beam energy selection and irradiation planning in medical cyclotron facilities.
This study highlights the potential of ANN-based approaches as complementary tools to conventional nuclear reaction modeling for medical radioisotope production. By improving predictive accuracy and reducing reliance on extensive experimental trial-and-error, artificial intelligence–driven methods can contribute to more efficient, cost-effective, and scalable production of iodine-124. Future work may focus on integrating physics-informed constraints, expanding training datasets, and extending the approach to other clinically important radionuclides to further enhance the robustness and generalizability of AI-assisted nuclear data modeling.