1. Introduction
During the last decade, machine learning has become a cornerstone in numerous scientific and industrial domains, providing powerful tools for fast and efficient data processing. To name a few examples, these applications include highly accurate skin cancer recognition [1], complex protein folding predictions [2], and advanced financial strategies like deep hedging, which uses machine learning to optimize hedging strategies under realistic market conditions [3].
In recent years, machine learning has also become instrumental in particle physics experiments [4,5,6]. These experiments, typically conducted at particle accelerators, involve colliding heavy atomic nuclei—such as gold or lead—at nearly the speed of light, creating an extremely hot or dense fireball to study the properties of nuclear matter under extreme conditions [7,8,9]. To search for rare probes, some of these experiments have trended toward higher collision rates [10,11], driving the need for sophisticated data processing techniques capable of handling massive datasets on high-performance computing resources. Machine learning methods are increasingly used to address challenges such as particle identification and event classification in these data-intensive environments.
The future heavy-ion experiment Compressed Baryonic Matter (CBM) at the Facility for Antiproton and Ion Research (FAIR) [12] in Darmstadt, Germany, is such an experiment, targeting a collision rate of up to 10 MHz [13]. This unprecedented rate of collisions will generate vast volumes of about 1 TByte of data per second [14], demanding real-time event reconstruction and analysis capabilities that go beyond traditional methods, as it is simply not feasible to store all the data produced [15]. To meet these requirements, CBM will rely on high-performance computing infrastructure combined with highly efficient algorithms for tasks such as online event reconstruction [16] and event selection [17,18], identifying events of particular scientific interest to retain for further analysis. Newly developed machine learning applications will complement the existing algorithm packages by increasing reconstruction efficiency and improving the quality of physics measurements [19,20,21], enabling more precise insights into the conditions created in heavy-ion collisions.
One of the core packages of CBM, the Kalman Filter (KF) Particle Finder [22], was developed for the online reconstruction and selection of short-lived particles. Due to their instability, these particles decay before or inside the tracking system and therefore cannot be directly detected or reconstructed [23]. To reconstruct them, their decay products, known as daughter particles, are analyzed to infer the original particle, referred to as the mother particle. During reconstruction, when a decay is identified, all potential mother particles are created to ensure that no real particle signal is inadvertently discarded due to an incorrect hypothesis. However, this method introduces noise, the so-called background, which contains both incorrectly reconstructed particles (referred to as combinatorial background or ghosts) and real particles that have been assigned the wrong particle type (physical background). The KF Particle Finder is part of the First-Level Event Selection (FLES) package [17], which selects the most interesting events for the current research online.
Previous work has investigated machine learning techniques to enhance particle reconstruction quality and background suppression. Boosted Decision Trees have been employed to automatically optimize cut parameters for particle selection [24], while neural networks have been integrated into the KF Particle Finder’s particle competition to reduce the mutual background between short-lived particle candidates through particle classification [25], demonstrating promising improvements in reconstruction performance.
In this work, the KF Particle Finder is extended by a deep-learning-based approach to investigate the capabilities of improving the signal/background (S/B) ratio for Λ particles by classifying the Λ candidates into signal and background using neural networks. For this task, the neural network package Artificial Neural Networks for First-Level Event Selection (ANN4FLES) [26] (p. 161) is used, which is primarily developed for use within the CBM experiment. Improving the reconstruction quality of short-lived particles, such as the Λ, is essential not only for extracting rare physical signals in the high-rate environment of CBM but also for enabling a more accurate interpretation and analysis of the underlying physical processes.
2. Materials and Methods
The investigated Λ particle is a neutral baryon consisting of an up quark, a down quark, and a strange quark. Strange quarks are significant because their presence is expected to indicate deconfined matter, such as the Quark–Gluon Plasma [27]. Additionally, Λ particles are abundantly produced within the energy range of the CBM experiment. Since the Λ is an electrically neutral particle with a short lifetime [28] (p. 81), it must be reconstructed from its decay products. The Λ particle decays predominantly into a proton (p) and a negatively charged pion (π⁻), with a branching ratio of approximately 64% [28] (p. 81). This common decay mode with two detectable daughter particles, along with the Λ particle’s abundant production and relevance in heavy-ion physics, makes it a suitable candidate for initial performance tests of neural networks in classification tasks, such as the one in this work.
In CBM, the FLES package is utilized for online full event reconstruction, covering everything from combining detector measurements for track reconstruction to identifying particles, decays, and decay chains. Once the tracks have been reconstructed, they are classified as either primary or secondary in the KF Particle Finder [22] (p. 88). Primary tracks are those that, when extrapolated to the primary vertex (the collision point), are statistically likely to originate from it. In contrast, secondary tracks are those produced at the decay point of another particle, known as the secondary vertex. Tracks are further grouped by the charge of the associated particle (negative or positive) [22] (p. 88). To reconstruct short-lived particles, potential daughter tracks are extrapolated to locate a possible secondary vertex. When forming potential mother particle candidates, secondary tracks of opposite charge are combined, such as one positive and one negative track, which are potential candidates for the reconstruction of the Λ decay. Once a secondary vertex is identified, all potential mother particle candidates for that decay are reconstructed. However, as only one true mother particle decays at a given secondary vertex, reconstructing all possible candidates introduces noise (referred to as background), which can reduce the quality of subsequent physics analyses. In addition to real particles that are misclassified (physical background), the background also includes non-existent particles that lack physical meaning (combinatorial background). The observation that daughter particles can also act as mother particles after decaying into their own daughter particles implies a hierarchy of decays, where reconstructed particles at one level might serve as components for reconstructing higher-level particles. This layered reconstruction increases the complexity of the analysis, as each new mother particle hypothesis at one level introduces potential combinations that add to the background, further challenging the extraction of true physical signals.
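Combining two daughter tracks into a mother candidate rests on standard four-momentum arithmetic; the following is a minimal sketch with illustrative names, assuming proton and pion mass hypotheses for a Λ candidate (this is not the KF Particle Finder code itself).

```python
import math

def invariant_mass(p1, p2, m1, m2):
    """Invariant mass of a mother candidate built from two daughter tracks.

    p1, p2 are the daughters' momentum vectors (px, py, pz) in GeV/c;
    m1, m2 are the assumed daughter mass hypotheses in GeV/c^2
    (e.g. proton and pi- for a Lambda candidate).
    """
    # Energies from the relativistic energy-momentum relation.
    e1 = math.sqrt(m1 ** 2 + sum(c * c for c in p1))
    e2 = math.sqrt(m2 ** 2 + sum(c * c for c in p2))
    e = e1 + e2
    # Summed momentum of the candidate.
    px, py, pz = (a + b for a, b in zip(p1, p2))
    return math.sqrt(e * e - (px * px + py * py + pz * pz))

# Sanity check: two daughters at rest reproduce the sum of the rest masses.
m = invariant_mass((0, 0, 0), (0, 0, 0), 0.938, 0.140)
```

A real candidate would use the fitted track momenta at the reconstructed decay vertex; the toy values here only illustrate the arithmetic.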
To mitigate this issue and reduce background, statistical cuts are applied to discard unlikely mother particle candidates that are incorrectly reconstructed. Additionally, particle competition methods can be employed to further minimize the background introduced during this process. In this work, particle competition is not enabled.
2.1. Selection of Input Features for the Deep Learning Approach
In the KF Particle Finder, various cuts are applied to determine whether reconstructed particles align with the assumptions of the underlying particle model. If a particle is statistically unlikely to match the model, it is excluded. However, since particles are typically assigned to multiple model hypotheses initially, this exclusion does not necessarily remove the particle entirely from the analysis.
In this work, ten input features, five of which are χ² values used for cuts, are used for the neural-network-based approach. One of them is χ²_prim, which determines whether a particle’s track is classified as a primary or secondary track. This distinction is particularly important for daughter tracks, as they should be secondary tracks originating from a particle decay (here, of the Λ) and not point directly to the primary vertex. Pairs of secondary tracks often overlap in time and space, suggesting they may have originated from a common mother particle. To identify such mother particle candidates, χ²_geo evaluates the quality of the intersection point of the daughters’ tracks, indicating a likely point of decay. The third χ² value, χ²_topo, defined as the sum of χ²_prim and χ²_geo, captures the overall topology of the particle. In addition to these χ² values for the Λ particle, other features are used as input for the classification model. These include the decay length normalized by its error, the particle mass, the transverse momentum (momentum perpendicular to the beam axis), and the transverse momenta and χ² values of the mother particle candidate’s daughter particles. The complete list of input features is shown in Table 1.
To support the subsequent application of the deep learning model, it is crucial to avoid prematurely discarding particles that may be misclassified as background for not meeting the cut criteria. Therefore, the thresholds for χ²_prim, χ²_geo, and χ²_topo were relaxed to retain more particles. In the KF Particle Finder, the threshold value of χ²_prim is, unlike the other two threshold values that are directly hard-coded, calculated by a function that converts a given probability of incorrectly identifying a primary track as a secondary track into a corresponding threshold value. Here, this probability was relaxed as well. All resulting cut thresholds are shown in Table 2. This provides the network with greater room for decision-making, ensuring the cuts do not preemptively reject particles that the model might classify as signal.
2.2. Simulated Data and Deep Learning Model for Particle Classification
Since the CBM experiment is still under construction, applying the algorithms to real data is not yet feasible. However, well-established simulation frameworks for heavy-ion physics are widely accepted and utilized. One such tool is the Ultra-relativistic Quantum Molecular Dynamics (UrQMD) framework [29], which generates raw heavy-ion collision data across a wide range of energies. These simulated datasets offer the advantage of known ground truth, making them ideal for supervised learning approaches. The data used in this work originate from CBM repositories, which provide a collection of validated datasets specifically prepared for algorithm development and testing. The CBM event reconstruction pipeline utilizes the Monte Carlo (MC) information from the simulation by comparing algorithmic results with MC information at each phase. For instance, in the KF Particle Finder package, one attempts to match reconstructed particles with their respective MC particles. A reconstructed particle is called a signal if it is reconstructed and matched with the respective particle of the same type from the MC data. Physical background refers to reconstructed particles that match an MC particle but of a different type; thus, this background is usually produced by other types of particles with a similar topology. Combinatorial background (or ghosts) are reconstructed particles that do not match any MC particle and, therefore, lack physical meaning.
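The three categories defined above can be summarized in a small helper; the function name and string labels are illustrative and not part of the CBM software.

```python
def label_candidate(reco_type, matched_mc_type):
    """Label a reconstructed candidate by comparison with Monte Carlo truth.

    reco_type is the particle hypothesis of the reconstructed candidate;
    matched_mc_type is the type of the matched MC particle, or None if no
    MC particle could be matched at all.
    """
    if matched_mc_type is None:
        return "ghost"                  # combinatorial background
    if matched_mc_type == reco_type:
        return "signal"                 # correct type, correct match
    return "physical background"        # real particle of a different type
```

In supervised training, these labels provide the ground truth for the signal/background classification described below.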
The deep learning model, a multi-layer perceptron [30] constructed using ANN4FLES, is composed of an input layer with 10 neurons for the input features, four hidden layers with 64 neurons each, and an output layer with 2 neurons for the classification of signal and background. LeakyReLU with a small negative slope serves as the activation function for the hidden layers, while Softmax is applied in the output layer. The model is trained using categorical cross-entropy loss and the Adam [31] weight optimization algorithm with its learning rate and moment parameters (β₁, β₂, and ε). This model was selected through manual tuning. Initial tests with smaller models showed comparable results, suggesting a trade-off between classification performance and runtime. Automated hyperparameter optimization was not yet applied, as the chosen configuration performed well in our initial tests. The model architecture can be seen in Figure 1.
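For illustration, the forward pass of such a network can be sketched with NumPy. The layer sizes follow the description above (10, four hidden layers of 64, and 2 outputs); the random weights and the 0.01 LeakyReLU slope are placeholder assumptions, since the trained ANN4FLES parameters are not given here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Layer sizes as described in the text: 10 input features,
# four hidden layers of 64 neurons, 2 output classes.
sizes = [10, 64, 64, 64, 64, 2]

# Randomly initialized weights stand in for trained parameters.
weights = [rng.normal(0, 0.1, (n_in, n_out))
           for n_in, n_out in zip(sizes, sizes[1:])]
biases = [np.zeros(n_out) for n_out in sizes[1:]]

def leaky_relu(x, slope=0.01):
    # 0.01 is an assumed slope; the exact value is not reproduced here.
    return np.where(x > 0, x, slope * x)

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def forward(features):
    """Map a batch of 10-dimensional feature vectors to signal/background scores."""
    h = features
    for w, b in zip(weights[:-1], biases[:-1]):
        h = leaky_relu(h @ w + b)
    return softmax(h @ weights[-1] + biases[-1])

probs = forward(rng.normal(size=(5, 10)))  # 5 dummy candidates
```

Each row of `probs` is a probability pair (signal, background), which is what the categorical cross-entropy loss operates on during training.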
3. Results
For the results presented in this work, UrQMD-simulated central gold–gold collisions in the CBM energy range were utilized. The training and validation datasets were artificially balanced to contain 50% signal and 50% background, ensuring even representation during training. A total of 787,000 Λ candidates were used, with an 80:20 split for training and validation, respectively. The training was performed over 30 epochs with a batch size of 100 samples. The deep learning model achieved a classification accuracy of 98.76% on the validation set, which was used to select a well-performing model. The progression of the training process over 30 epochs is shown in Figure 2.
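The balancing and 80:20 split described above can be sketched as follows; the function and the toy data are illustrative, not the actual CBM preprocessing code.

```python
import random

def balanced_split(signal, background, train_frac=0.8, seed=42):
    """Build a 50/50 signal/background dataset and split it 80:20.

    The larger class is downsampled so both classes contribute equally,
    mirroring the artificial balancing described in the text.
    """
    rng = random.Random(seed)
    n = min(len(signal), len(background))
    sig = rng.sample(signal, n)
    bkg = rng.sample(background, n)
    data = [(x, 1) for x in sig] + [(x, 0) for x in bkg]
    rng.shuffle(data)
    cut = int(train_frac * len(data))
    return data[:cut], data[cut:]

# Toy example: 1000 signal candidates, 5000 background candidates.
train, val = balanced_split(list(range(1000)), list(range(5000)))
```

Balancing matters because the raw candidate sample is heavily background-dominated, as the evaluation numbers later in the text show.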
Following the training phase, the pre-trained model was integrated into the KF Particle Finder to classify particles within the CBM reconstruction pipeline. This classification was performed on 46,000 new collisions. The subsequent results presented here are derived directly from the KF Particle Finder package [22].
In Figure 3, the mass distribution in the range from 1.06 GeV/c² to 1.2 GeV/c² for all reconstructed Λ particles is shown, consisting of signal, physical background, and ghosts. The actual Λ particles are concentrated around the mass region of 1.115 GeV/c², corresponding to the expected Λ mass. While background particles may also appear near the Λ mass region, the majority are located in the surrounding areas. As shown, the overall number of reconstructed Λ particles is reduced by the deep learning approach (red) compared to the default method (blue), with the reduction occurring primarily in regions dominated by background.
True positives are those Λ particles that are correctly reconstructed and identified, also known as signal. The histograms of the true positives of the default approach and the deep learning approach can be seen in Figure 4a. In the mass range of 1.1 GeV/c² to 1.135 GeV/c², the deep learning approach reconstructed 179,891 true positives, 1.36% fewer true Λ particles than the default approach, which reconstructed 182,364 true Λ particles. Thus, slightly more Λ particles were incorrectly classified as background in comparison to the default approach based on pre-defined cuts.
The total background, here the false positives, can be seen in Figure 4b. With 25,761 particles incorrectly classified as Λ, the deep learning approach produces 95.94% less total background than the default approach, which incorrectly classified 633,771 particles as Λ.
Since there are two types of background with different implications for physics analysis, it is important to analyze both separately. In Figure 4c, the results for the physical background can be seen. Both approaches encountered difficulties in removing the background in the anticipated mass region of the Λ particle. Nevertheless, the deep learning approach performed better at reducing this background than the standard method. In the remaining regions, the reduction of background is more evident; here, the deep learning approach demonstrated a markedly superior reduction compared to the default approach. Overall, within the range of 1.06 GeV/c² to 1.2 GeV/c², the standard approach misclassified 34,746 of the observed particles, while the deep learning approach misclassified only 8314.
Since ghosts are particles that do not exist, these incorrectly created particles are of no use for physics analysis. It would be desirable to prevent any ghost particles from being misclassified as Λ. The proposed approach significantly reduces these reconstructed particles, as shown by the data presented in Figure 4d. While a small peak remains visible within the Λ mass range, it has been significantly constrained. Overall, the combinatorial background was reduced from 599,025 ghosts to 17,447 ghosts, achieving a reduction factor of 34.33.
In total, within and beyond the shown mass ranges, 2758 signal particles were incorrectly classified as background, while 2,006,910 background particles were correctly identified as such. Therefore, the number of rejected signal particles remains within a manageable range.
Given the absolute numbers from the evaluation, the dataset’s imbalance—containing significantly more background than signal—is clearly evident. In such cases, accuracy alone can be misleading. Therefore, to assess the performance of the deep learning approach more reliably, additional classification metrics are presented in Figure 5, including the confusion matrix of the evaluation results. On the test dataset, within the KF Particle Finder, the accuracy dropped slightly compared to the validation set (from 0.9876 to 0.9859). The results show a precision of 0.9852, indicating that the vast majority of particles identified as Λ were correctly classified, with less than 2% of the selected candidates being misclassified background. The recall of 0.8651 reflects the model’s ability to recover most true Λ particles. However, it also implies that some Λ particles were misclassified as background, leading to their omission from the physics analysis. The F1-score of 0.9212 demonstrates a good balance between precision and recall, and the high specificity of 0.9986 confirms that background particles were effectively rejected, highlighting the method’s strength in minimizing false positives, as desired.
In Table 3, the S/B ratio and significance of both the default approach and the deep learning approach are presented. In each case, the signal (S) and background (B) are determined using the integrals of two function fits—one for the signal and one for the background—performed with the CbmRoot framework [32]. The S/B ratio quantifies the proportion of signal to background, while the significance reflects the confidence in distinguishing the signal of a particle type from the background. A significance of 1 indicates no distinction, while a significance of 5 marks the threshold for identifying a real particle signal, ensuring it is unlikely to be a statistical fluctuation and typically resulting in a clear peak.
The deep learning approach improved the S/B ratio significantly, from 3.49 to 38.28—an increase by a factor of 10.97. While these values depend on the functions used for fitting the signal and background, the improvement in the S/B ratio is evident. Additionally, the significance increased slightly, from 11.38 to 12.95—a factor of 1.14. In both approaches, the significance remains well above the threshold. However, the increased significance demonstrates that the deep learning approach enhances the distinguishability of the particle signal.
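For reference, the two quantities can be computed as follows; the significance formula S/sqrt(S+B) is a common convention in such analyses and is assumed here for illustration, since the text does not state the exact definition used.

```python
import math

def s_over_b(signal, background):
    """Signal-to-background ratio from the fitted signal and background yields."""
    return signal / background

def significance(signal, background):
    # A common definition in heavy-ion analyses; assumed, not taken
    # from the paper, which does not spell out its formula.
    return signal / math.sqrt(signal + background)

# Toy yields, for illustration only.
ratio = s_over_b(100, 25)
sig = significance(90, 10)
```

Under this definition, suppressing background improves the S/B ratio much faster than the significance, consistent with the large S/B gain and the modest significance gain reported above.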
4. Discussion
The results of this work demonstrate a significant reduction in background while retaining most of the particle signal, as shown in Figure 5. This is specifically true when comparing against the Monte Carlo information of the simulated data. However, the mass is typically used as the final observable in physics analyses. Including it—or variables strongly correlated with it—within the classifier may bias the mass spectrum, artificially sharpening the signal peak and suppressing the sidebands. Although the metrics indicate good performance of the neural network, this can compromise the reliability of background estimation and signal extraction, particularly when applying the model to experimental data, and therefore requires thorough examination.
To investigate this effect, we conducted tests using the training and validation datasets, excluding the mass and other kinematically correlated variables (specifically, the transverse momenta of the Λ and its daughters). The classifier still achieved strong results. When only the mass was removed, the model reached an accuracy of 0.9745, with a precision and F1-score of 0.9746 and 0.9745, respectively. When both the mass and all transverse momenta were excluded—leaving only cut-based features such as the primary vertex and track quality variables—the model still achieved an accuracy of 0.9444, with precision and F1-score exceeding 0.94. These results confirm that comparable signal–background discrimination is achievable even without relying on mass-correlated features, thus preserving the integrity of the mass as an unbiased observable in subsequent physics analyses of experimental data.
It should be emphasized that the results presented in this work were obtained using simulated central collisions, which represent idealized “perfect” scenarios. Future investigations should evaluate the performance of the proposed approach in simulated minimum bias collisions, where less ideal interactions and a wider variety of collision geometries are present. This will provide a more realistic assessment of the method’s robustness and its applicability to a broader range of collision conditions—a crucial step toward its use with experimental data. Furthermore, the presented deep learning approach was only trained, validated, and tested on simulated data. While these simulations are typically well studied and assumed to reflect real data reasonably well, the method must be thoroughly validated before being applied to experimental data.
The strong suppression of combinatorial background highlights the potential of machine learning in this domain. However, physical background arises from real particles mistakenly identified as Λ particles. Completely removing such background would negatively impact the quality of the physics analysis for other particles. To address this, two extensions of the approach are proposed. Firstly, the analysis should include other particles of interest, especially those that typically interact or overlap with Λ particles, contributing to the mutual physical background. Secondly, the deep learning approach could be expanded to classify not only signal and background for the Λ but also for multiple particle species. An extended neural network could potentially recognize the physical background of one particle as a valid signal for another, ensuring a more comprehensive and balanced reconstruction process. With such an approach, the physical background might be reduced without compromising the reconstruction quality of other particles, while effectively eliminating combinatorial background with no physical significance.
5. Conclusions
Instead of relying solely on statistical cuts, this work employed a deep learning approach using a deep multi-layer perceptron for signal and background classification. The model achieved a high accuracy of 0.9876 on the validation set, which was used to select the best-performing neural network.
Integrated into the KF Particle Finder package for a final test, the neural network effectively reduced reconstructed background, particularly eliminating ghosts (combinatorial background) that lack physical meaning. Importantly, the particle signal was minimally affected. The model achieved an accuracy of 0.9859, a precision of 0.9852, a recall of 0.8651, an F1-score of 0.9212, and a specificity of 0.9986, indicating strong overall performance in classifying signal and background. However, reducing the physical background and rejecting these particles has implications for other particles and should be carefully investigated in future works.
Overall, this deep learning method significantly improved the S/B ratio by a factor of 10.97 (from 3.49 to 38.28) and also increased the significance by a factor of 1.14 (from 11.38 to 12.95). These advancements highlight the potential of machine learning to improve particle reconstruction and signal clarity, paving the way for more refined analyses in heavy-ion physics. Future work on extending the model to other particle types, integrating inter-particle interactions, and assessing performance under diverse collision conditions will further enhance its utility and impact.