Electronic Population Reconstruction from Strong-Field-Modified Absorption Spectra with a Convolutional Neural Network

Optics 2024, 5

Abstract: We simulate ultrafast electronic transitions in an atom and the corresponding absorption line changes with a numerical few-level model, similar to previous work. In addition, a convolutional neural network (CNN) is employed for the first time to predict electronic state populations from the simulated modifications of the absorption lines. We use a two-level and a four-level system, as well as a variety of laser-pulse peak intensities and detunings, to cover different common scenarios of light–matter interaction. As a first step towards the future use of CNNs on experimental absorption data, we apply two different noise levels to the simulated input absorption data.


Introduction
The development of attosecond laser sources [1][2][3] not only allows for capturing electronic motion in atoms, molecules and solids on their natural time scales, but has also expanded the spectral regime of ultrafast pulses into the ultraviolet (UV) and extreme-ultraviolet (XUV) ranges. Besides the pioneering work on high-order harmonic generation (HHG) from near-infrared optical pulses [1][2][3], free-electron lasers (FELs) can also generate XUV laser pulses [4], which have recently become available with attosecond pulse durations [5,6]. One of the crucial benefits of FEL pulses is their high peak intensity, which allows for all-XUV pump-probe experiments [7] and strong-field effects in the XUV range on ultrashort time scales, such as Rabi cycling [8] or absorption line shape modifications [9][10][11][12]. Although properties such as the temporal dipole moment can be reconstructed under certain circumstances [13], the (intra-pulse) time-dependent state populations have not yet been directly reconstructed. This kind of inversion problem is common in a broad variety of quantum-based experiments, where phase information is typically lost and inversions are non-trivial, if possible at all. Still, electronic state populations and coherences are crucial for understanding effects such as Rabi cycling [8] or X-ray lasing [14] in atoms or charge transfer in molecules [15][16][17].
In this work, we simulate strong-field-induced line shape modifications with a numerical few-level system, similar to previous work [9][10][11][12]. Furthermore, we implement a CNN that reconstructs the electronic populations during the driving pulse from the simulated strong-field-modified absorption spectra of a two-level system. To allow the CNN to predict populations from experimental data in the future, we also introduce different noise levels into the absorption spectra before training the CNN. To extend our model towards more complex electron dynamics involving several states, we additionally apply the CNN to a four-level system, where a coherent wavepacket of three excited states is initiated.

Few-Level Systems for Simulations of Absorption Line Shape Changes
To simulate strong-field light–matter interaction and the resulting absorption changes, we consider a few-level system and solve the Schrödinger equation numerically, as described in detail in our previous work [11]. First, we describe a generic system of two electronic states in an atom interacting with a laser pulse, cf. Figure 1a,b, with the Hamiltonian given in Equation (1). The diagonal matrix elements contain the eigenstate energies of the ground (g) and excited (e) states, E_g = 0 a.u. and E_e = 0.2932 a.u., and the inverse lifetime of the excited state, Γ_e = 0.002 a.u. (a.u. denotes atomic units). Here, i is the imaginary unit and ħ is the reduced Planck constant. The off-diagonal matrix elements describe the dipole coupling between the two states, where we choose d_ge = d_eg = 1 a.u. as the dipole constant and ε(t) is the electric field of the laser pulse. We use a Gaussian pulse defined in the spectral domain, ε(ω), with a spectral width of σ = 0.02314 a.u., corresponding to a 2.5 fs pulse duration, and centered at the resonance transition of the two states. To additionally produce a detuned dataset, we shift the excited-state energy by 0.0568 a.u. towards lower photon energies without changing the laser pulse parameters. The time-dependent Schrödinger equation is solved for each time step t between −2.5 fs and 2.5 fs relative to the pulse peak intensity in steps of 0.025 fs.
The two resulting time-dependent populations of the bare states, P_g,e(t) = |c_g,e(t)|^2, are shown in Figure 1c,d for the resonant and detuned pulses, respectively. Here, c_g,e(t) are the coefficients of the general quantum state of the system, |Ψ(t)⟩ = (c_g(t), c_e(t))^T, in the basis of the two bare states. With the help of the state coefficients, we can calculate the dipole response of the two-level system, d(t) = d_ge · c_g(t) c_e*(t) + c.c., see Figure 1e, and its Fourier transform, d(ω). In the case of strong-coupling dynamics, the resulting energy-level shifts lead to a phase shift of the temporal dipole response since the laser pulse is (much) shorter than the lifetime of the excited state [34]. This leads to line shape changes in the optical density, OD(ω), which we calculate from the incoming and outgoing fields:

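$$\mathrm{OD}(\omega) = -\log_{10}\left(\frac{\left|\varepsilon(\omega) + i\,\eta\,d(\omega)\right|^{2}}{\left|\varepsilon(\omega)\right|^{2}}\right) \tag{2}$$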
Here, η = 10^−4 is a numerical constant, proportional to the particle density in an absorption experiment. We illustrate the change in the resonance line shape in OD(ω) from the weak-field case in Figure 1f to the strongly driven case for resonant and detuned pulses, shown in Figure 1g,h, respectively. OD(ω) is the input for the CNN, see Figure 1i, which is used to reconstruct the populations, as described in the next section.
Figure 1. … (f–h) A spectroscopic measurement of these signals leads to a resonance line in the optical density for a weak (ε_0 = 0.7 a.u.) and resonant pulse (f), a strongly coupling (ε_0 = 6.1 a.u.) and resonant pulse (g) or a strongly coupling (ε_0 = 6.1 a.u.) and detuned pulse (h). The natural line shape (f) is modified due to the strong driving fields (g,h), which we use to train the CNN (i) to predict the state populations of the two states (c,d).
Regarding the four-level system, we replace the Hamiltonian with a 4 × 4 matrix containing the four eigenenergies of the ground state and the three excited states (e1, e2, e3): E_g = 0 a.u., E_e1 = 0.2676 a.u., E_e2 = 0.2932 a.u. and E_e3 = 0.315 a.u. For all three excited states, we use the same inverse lifetime and dipole constant as for the two-level system. Dipole couplings between any two excited states are excluded by setting all other off-diagonal matrix elements to zero. The laser pulse is defined as above and resonant with the second excited state, thus also being equally red- and blue-detuned with respect to the two other excited states.
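The two-level propagation described in this section can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' code: the pulse is written directly in the time domain with envelope exp(−(σt)²/2) and a cosine carrier, the decay enters the Hamiltonian as −iΓ_e/2 on the excited-state diagonal, and a fourth-order Runge-Kutta integrator with sub-steps is used. The function names and the parameter `e0` (a time-domain peak field, which is not the same quantity as the field strength ε_0 quoted in the text) are hypothetical.

```python
import math

FS = 41.341374575751  # one femtosecond expressed in atomic units of time


def field(t, e0, omega, sigma):
    """Time-domain field: Gaussian envelope times cosine carrier (assumed form)."""
    return e0 * math.exp(-0.5 * (sigma * t) ** 2) * math.cos(omega * t)


def rk4_step(f, t, c, dt):
    """One classical Runge-Kutta step for a tuple of complex amplitudes."""
    def shifted(base, slope, h):
        return tuple(b + h * s for b, s in zip(base, slope))
    k1 = f(t, c)
    k2 = f(t + dt / 2, shifted(c, k1, dt / 2))
    k3 = f(t + dt / 2, shifted(c, k2, dt / 2))
    k4 = f(t + dt, shifted(c, k3, dt))
    return tuple(x + dt / 6 * (a + 2 * b + 2 * g + d)
                 for x, a, b, g, d in zip(c, k1, k2, k3, k4))


def simulate(e0, detuning=0.0, omega=0.2932, sigma=0.02314, gamma=0.002,
             dipole=1.0, n_out=200, step_fs=0.025, substeps=10):
    """Return (times in fs, P_g, P_e) on a 200-point grid starting at -2.5 fs."""
    e_exc = omega - detuning          # detuned case: excited state shifted down
    h_ee = e_exc - 0.5j * gamma       # ASSUMPTION: decay enters as -i*Gamma/2

    def rhs(t, c):
        # i dc/dt = H(t) c with hbar = 1 and E_g = 0
        cg, ce = c
        v = dipole * field(t, e0, omega, sigma)
        return (-1j * v * ce, -1j * (v * cg + h_ee * ce))

    t = -2.5 * FS
    dt = step_fs * FS / substeps
    c = (1.0 + 0j, 0j)                # all population starts in the ground state
    times, p_g, p_e = [], [], []
    for _ in range(n_out):
        times.append(t / FS)
        p_g.append(abs(c[0]) ** 2)
        p_e.append(abs(c[1]) ** 2)
        for _ in range(substeps):
            c = rk4_step(rhs, t, c, dt)
            t += dt
    return times, p_g, p_e
```

With these assumptions, a peak field near `e0 = 0.029` corresponds to a pulse area of roughly π on resonance, so `simulate(0.029)` transfers most of the population, while `simulate(0.029, detuning=0.0568)` transfers noticeably less. The dipole response and OD(ω) are not computed here because they require propagating well beyond the pulse.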

Convolutional Neural Network for State Population Reconstruction

Convolutional Neural Network Architecture
The inputs for the CNN are the OD spectra, sampled with 300 data points symmetrically around the resonance position. The output variables of the CNN are the electronic state populations, each sampled with 200 points on a time grid from −2.5 fs to 2.5 fs centered on the pulse. For the two-level system, this results in a prediction of 2 × 200 output variables based on 300 input variables. To achieve such a high-dimensional output (relative to the input), the CNN architecture is constructed with several layers, containing 279,932 trainable parameters in total. The layer-by-layer structure of the CNN, which follows the methodologies described in [30,33], is depicted in Figure 2.
The CNN architecture consists of five blocks, each comprising two convolutional layers with a convolution window (kernel) of size 3 and stride 1, where the stride is the step size by which the convolution window moves across the input data. Each convolutional layer is followed by the Rectified Linear Unit (ReLU) activation function, defined as ReLU(x) = max(0, x). The convolutions are calculated without padding, so the array length shrinks by 2 after each convolutional layer. In each block, the two convolutional layers are followed by a max-pooling layer with kernel size 2 and stride 2, which segments the arrays into pooling regions of size 2 and thereby halves the array length after each block. To allow the CNN to capture more complex patterns, the number of filters per convolutional layer is doubled between consecutive blocks, from 8 filters in the first block to 128 filters in the last block. In the final block, the max-pooling layer is replaced by a dense, i.e., fully connected, layer (with ReLU activation) consisting of 100 neurons. The final output layer is also a dense layer (with linear activation) with an output size of 2 × 200, representing the two time-dependent populations of the two-level system.
Figure 2. Overview of the convolutional neural network structure (for the two-level system). On the left, the sequence of layers is shown. On the right, the (output) array sizes for each layer are given.
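The layer arithmetic described above can be checked with a short, framework-free sketch. It assumes a single input channel, a bias term in every layer, pooling that rounds odd lengths down, and a flattening step before the first dense layer; none of these details are stated explicitly in the text, and the function name `cnn_summary` is hypothetical.

```python
def cnn_summary(n_in, n_out, filters=(8, 16, 32, 64, 128), kernel=3, dense_units=100):
    """Return (final array length, flattened size, trainable parameters)."""
    length, channels, params = n_in, 1, 0            # ASSUMPTION: one input channel
    for block, n_filters in enumerate(filters):
        for _ in range(2):                           # two convolutional layers per block
            params += kernel * channels * n_filters + n_filters  # weights + biases
            length -= kernel - 1                     # no padding: length shrinks by 2
            channels = n_filters
        if block < len(filters) - 1:                 # the last block has no pooling
            length //= 2                             # max pooling, size 2, stride 2
    flattened = length * channels                    # ASSUMPTION: flatten before dense
    params += flattened * dense_units + dense_units  # dense layer with 100 neurons
    params += dense_units * n_out + n_out            # linear output layer
    return length, flattened, params


two_level = cnn_summary(300, 2 * 200)    # 300 OD points -> 2 x 200 populations
four_level = cnn_summary(350, 4 * 200)   # 350 OD points -> 4 x 200 populations
```

Under these assumptions the totals come out as 279,932 parameters for the two-level network and 358,732 for the four-level network, matching the figures quoted in the text, which suggests the assumed details are at least consistent with the described architecture.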

Training Dataset Properties
For the training phase, we generated a dataset of 10,000 samples through repeated simulations of the two-level model, randomly sampling the peak field strength of the driving laser pulse, ε_0, on a logarithmic scale from 0.1 a.u. to 10 a.u. The random sampling improves consistency, promotes comparability and serves to minimize randomness as a confounding factor during the training process. This dataset combines the resonant and detuned cases introduced in Section 2.1. We allocate 6400 of these samples for training, 1600 for validation during training and 2000 for subsequent testing. Exactly one half of each (sub)set is calculated with the resonant setting and the other half with the detuned setting. Furthermore, we generate two more datasets by adding Gaussian noise to the OD spectra. The two noise levels, 1% and 3%, are given as the standard deviation of the Gaussian-sampled noise. We chose these noise levels based on our experimental observations, for example, detecting absorption changes at a 2% level clearly above a smaller noise level in [35].
For the four-level system, as described in Section 2.1, we use nearly the same architecture as for the two-level system. Since there are twice as many output variables, i.e., state populations, as before (4 × 200), we also increase the resolution of the OD to 350 points to maintain a comparable predictive resolution. This results in a total network size of 358,732 trainable parameters for the four-level system.
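The sampling, splitting and noise steps described above can be sketched as follows. This is an illustration only: the helper names are hypothetical, the split is built per case (3200/800/1000 resonant plus the same number detuned) so that every subset is exactly half resonant and half detuned, and the noise level is assumed to be relative to the largest absolute OD value of each spectrum, which the text does not specify.

```python
import math
import random


def sample_field_strengths(n, rng, low=0.1, high=10.0):
    """Draw n peak field strengths uniformly on a logarithmic scale."""
    lo, hi = math.log10(low), math.log10(high)
    return [10.0 ** rng.uniform(lo, hi) for _ in range(n)]


def build_splits(per_case=(3200, 800, 1000), seed=0):
    """Return train/val/test lists of (detuned, field strength) pairs."""
    rng = random.Random(seed)
    names = ("train", "val", "test")
    splits = {name: [] for name in names}
    for detuned in (False, True):              # resonant half, then detuned half
        for name, count in zip(names, per_case):
            strengths = sample_field_strengths(count, rng)
            splits[name] += [(detuned, e0) for e0 in strengths]
    for name in names:
        rng.shuffle(splits[name])              # mix resonant and detuned samples
    return splits


def add_gaussian_noise(spectrum, level, rng):
    """Add Gaussian noise with standard deviation level * max|OD| (ASSUMPTION)."""
    scale = level * max(abs(v) for v in spectrum)
    return [v + rng.gauss(0.0, scale) for v in spectrum]
```

Each (detuned, field strength) pair would then be passed to the few-level simulation to obtain the OD spectrum and the populations, and `add_gaussian_noise(od, 0.01, rng)` or `add_gaussian_noise(od, 0.03, rng)` would produce the two noisy variants.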

Training Process
To train the CNN, we use the Adam optimizer [36], a widely used optimization algorithm for training deep neural networks. Both the choice of the optimizer and the initial learning rate of 10^−3 are heuristic, given that the Adam optimizer inherently adapts the learning rate during training. For this regression task, the mean-squared error (MSE) is used as the loss function. This metric quantifies the average squared difference between predicted and actual values and thus measures how closely the reconstructed populations match the simulated ones:
$$\mathrm{MSE} = \frac{1}{n_{\mathrm{samples}}} \sum_{i=1}^{n_{\mathrm{samples}}} \left(y_i - \hat{y}_i\right)^{2} \tag{3}$$
Here, ŷ_i are the (simulated) input values and y_i are the values predicted by the CNN. The sum runs over all n_samples data points, including all populations, time steps and laser field strengths. By training the CNN on datasets of different sizes and quantifying the losses with the MSE, as shown in Figure 3a, we observe that the loss has converged for the scenarios with 1% and 3% noise once our training size of 6400 is reached. The loss for the noise-free set still decreases within the reported number of samples, as expected for a noise-free scenario, since machine precision has not been reached. Our goal is a CNN that can ultimately predict populations from (noisy) experimental data; thus, we did not increase the dataset size for the noise-free scenario.

Further, we also use the mean absolute error (MAE) as an additional metric, given as follows:
$$\mathrm{MAE} = \frac{1}{n_{\mathrm{samples}}} \sum_{i=1}^{n_{\mathrm{samples}}} \left|y_i - \hat{y}_i\right| \tag{4}$$
This choice is motivated by the fact that our model outputs, the populations, are constrained to the range from zero to one. Consequently, the MAE allows for an intuitive understanding of the CNN's performance because it quantifies the average error over all data points and can be read as a percentage, e.g., MAE = 0.01 corresponds to an average error of 1% over all data points. We train the two- and four-level CNNs for 5000 epochs each, with early stopping after 1000 epochs if the loss function has already converged. For each epoch, we randomly divide the training set into 100 batches of size 64, on which the CNN is trained iteratively, rather than processing the full training set of 6400 samples at once. After training, only the best-performing model is saved and selected for subsequent testing. As in the two-level case, we evaluate the four-level model with the MSE for different training data sizes, as shown in Figure 3b. For the two cases containing noise, the loss converges for our data size of 3200, whereas the noise-free case has not reached convergence yet.
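The two metrics and the batching scheme described above can be written down compactly. The sketch below is framework-free and illustrative: the function and class names are hypothetical, and the early-stopping rule is interpreted as "stop once the validation loss has not improved for a given number of epochs (the patience)", which is one common reading of the description and not necessarily the criterion used by the authors.

```python
import random


def mse(predicted, simulated):
    """Mean-squared error over all data points."""
    return sum((y - t) ** 2 for y, t in zip(predicted, simulated)) / len(predicted)


def mae(predicted, simulated):
    """Mean absolute error; for populations in [0, 1] it reads as a percentage."""
    return sum(abs(y - t) for y, t in zip(predicted, simulated)) / len(predicted)


def minibatches(n_samples, batch_size, rng):
    """Randomly partition sample indices into batches, redrawn every epoch."""
    indices = list(range(n_samples))
    rng.shuffle(indices)
    return [indices[i:i + batch_size] for i in range(0, n_samples, batch_size)]


class EarlyStopping:
    """Track the best validation loss and signal when to stop (ASSUMED rule)."""

    def __init__(self, patience):
        self.patience = patience
        self.best_loss = float("inf")
        self.best_epoch = -1

    def update(self, epoch, val_loss):
        """Record the loss; return True if training should stop."""
        if val_loss < self.best_loss:
            self.best_loss = val_loss     # this is where the best model would be saved
            self.best_epoch = epoch
        return epoch - self.best_epoch >= self.patience
```

With the numbers from the text, `minibatches(6400, 64, rng)` yields 100 batches per epoch, and an `EarlyStopping(patience=1000)` instance would be updated once per epoch inside a loop of at most 5000 epochs.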

Line Shape Changes and Population Reconstruction for the Two-Level System
In this section, we first show the results of the simulated absorption spectra (Figure 4) and discuss them with respect to previous findings [11,12]. Afterwards, the results of our novel approach to reconstruct the time-dependent electronic state populations from the absorption spectra are presented and discussed (Figures 5 and 6). In Figure 4a, the field-strength-dependent absorption changes continuously in amplitude while staying Lorentzian and thus symmetric. As shown in our previous work [11,12], dipole phase shifts cancel out for exactly resonant driving pulses, which explains the symmetric line shape. For electric field strengths of 3.3 a.u. and 6.3 a.u., the resonant OD switches sign due to π-phase jumps in the Rabi cycle of the population coefficients [12]. In contrast, the detuned pulses can change the asymmetry of the resonance line by inducing dipole phase shifts [11,12], as shown in Figure 4b, making it Fano-like [37]. For electric field strengths around 5 a.u. to 6 a.u., Fano-like line shapes emerge, which exhibit negative OD. A more detailed discussion of these line shape changes and how they are connected to the electronic state energies and coefficients can be found in [11,12].
In Figure 5, we illustrate that the CNN is capable of reconstructing the populations of the ground (Figure 5d) and excited states (Figure 5e), including the three different noise levels, for a single field strength of ε_0 = 6.1 a.u. of a detuned driving pulse. The OD spectrum is shown in Figure 5a without noise, with a 1% noise level in Figure 5b and a 3% noise level in Figure 5c. For both states, the ground state in Figure 5d and the excited state in Figure 5e, the reconstructed populations for the noise-free case (blue) are in near-perfect agreement with the simulated populations (black). Even when introducing 1% noise into the OD, the CNN reconstructs the state populations excellently (shown in green). In contrast, when the noise level is increased to 3%, only the slow overall shapes of the reconstructed populations (orange) are predicted reasonably well, whereas the faster dynamics are no longer accurate.
To show that the CNN can reconstruct the electronic populations in general for the complete dataset, we compare the simulated population of the excited state (Figure 6a,b) to the reconstructed population (Figure 6c,d) as a function of the field strength of the driving pulse in Figure 6. We look at the excited-state population only because the sum of the ground-state and excited-state populations is equal to one for all time steps in the absence of further loss channels. For the CNN reconstruction, we chose the 1% noise level in the OD based on the results obtained for a single electric field strength, as discussed in Figure 5. For the resonant driving pulse, the population is reconstructed excellently for most field strengths in Figure 6c compared to the simulated population in Figure 6a. Only for field strengths from 6.1 a.u. to 6.6 a.u. does the reconstruction differ from the simulation, as discussed below. In the detuned case, the population is reconstructed accurately for most field strengths in Figure 6d …
Figure 6. … (c,d) for the 1% noise level on the ODs. In all panels, the populations are close to zero for low field strengths (ε_0 ≤ 1 a.u.). For higher field strengths (ε_0 > 1 a.u.), the population first increases significantly and then begins to oscillate when ε_0 > 7 a.u. As is well known from Rabi oscillations [38], the population transfer is significantly reduced for detuned pulses; thus, the maximum excited-state population for the detuned case (b,d), P_e^max ≈ 0.5, is smaller than for the resonant case (a,c), where P_e^max = 1. We do not use the rotating wave approximation; thus, the populations oscillate with 2ω_r [38] during the interaction with the pulse, which is twice the frequency of the resonance transition.
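The reduced population transfer for detuned pulses noted in the caption above can be made quantitative in the textbook limit of a two-level system driven by a field of constant amplitude, treated within the rotating wave approximation and without decay. This is an idealisation for orientation only: the simulations in this work keep the counter-rotating terms, include the excited-state lifetime and use a Gaussian pulse envelope. The symbols Ω (resonant Rabi frequency, proportional to the dipole constant times the time-domain field amplitude) and Δ (detuning between laser and transition frequency) are introduced here and do not appear in the original text.

```latex
P_e(t) = \frac{\Omega^{2}}{\Omega^{2} + \Delta^{2}}
         \sin^{2}\!\left(\frac{\sqrt{\Omega^{2} + \Delta^{2}}}{2}\, t\right),
\qquad
P_e^{\max} = \frac{\Omega^{2}}{\Omega^{2} + \Delta^{2}} \le 1 .
```

In this limit, complete inversion (P_e^max = 1) is reached only for Δ = 0, and the maximum drops to 0.5 when |Δ| = Ω, which is qualitatively in line with the smaller maximum population reported for the detuned case.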
For a quantitative comparison of the reconstructed with the simulated populations, the MSEs (Equation (3)) and MAEs (Equation (4)) are presented in Table 1 for all three noise levels. As expected from the above findings, the errors increase with the noise. All error values are a few percent or less, confirming the excellent agreement between the reconstructed and simulated populations for most field strengths, with the exceptions mentioned above. This demonstrates that the CNN is capable of reconstructing the populations of two electronic states from absorption line changes in all cases where the absorption signal is larger than the noise.

Line Shape Changes and Population Reconstruction for the Four-Level System
To investigate how more complex electronic population dynamics can be reconstructed with our CNN, we simulate a four-level system, as described in Section 2.1.
The driving pulse, resonant with the central excited state, excites a coherent wavepacket across the three states, where the energy spacing between the excited states is smaller than the spectral bandwidth of the pulse. The resulting absorption spectra for three different field strengths are shown in Figure 7. With increasing field strength of the pulse, the amplitudes of all three resonances are reduced because the ground-state population is reduced. As in the two-level system, the line shape of the resonant excited state stays symmetric, whereas the line shapes of the two other excited states become more asymmetric with increasing field strength. As shown in [11], for equal red and blue detuning of the driving pulse, their line shapes become Fano-like with mirrored asymmetries. Small deviations from this mirror symmetry come from numerical errors due to the discrete spectral grid. The corresponding population dynamics of the four states are shown in Figure 8.
The CNN can reconstruct all four state populations for all three field strengths when no noise is added to the OD, as the comparison with the simulated populations (depicted with black markers) shows; see Figure 8a,d,g. With noise levels of 1% and 3%, the populations can be reconstructed as well, but only for low and intermediate field strengths; see Figure 8b,c,e,f. For the highest field strength and a 1% noise level, shown in Figure 8h, the reconstructed populations show slow dynamics similar to those of the simulated populations, but the local minima and maxima of the population transfer are reduced in amplitude compared to the simulations. When the noise is increased to 3% for the highest field strength, as shown in Figure 8i, many features of the predicted population dynamics do not agree with the simulated populations: the fast oscillations are missing, the number of Rabi cycles is reduced and, after the pulse is over (t > 1 fs), the populations of the detuned states, e1 (green) and e2 (red), are larger than the resonant excited-state population (orange), in contrast to the simulated populations. Yet these deviations are only found for the highest field strengths in the 3% noise case.
Overall, the CNN reconstructs the four electronic state populations mostly accurately, as quantified by the mean errors in Table 2. The mean errors are obtained by averaging over the populations of all field strengths; thus, the deviations for the highest field strengths contribute only marginally. As in the two-level system, the errors increase by one or more orders of magnitude when the noise level is increased. Comparing the overall performance to the two-level system and the error values in Table 1, the CNN reconstructs the four-level populations slightly better than the two-level populations, which might be due to the larger amount of information provided in the OD spectra.
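For reference, the structure of the four-level Hamiltonian described in Section 2.1 can be written out explicitly. The sketch below is an assumption-laden illustration: the function name is hypothetical, the decay is taken to enter as −iΓ/2 on the excited-state diagonal, and the coupling is taken as +d·ε(t) (the overall sign does not affect the populations).

```python
def four_level_hamiltonian(field_value, energies=(0.0, 0.2676, 0.2932, 0.315),
                           gamma=0.002, dipole=1.0):
    """Build the 4 x 4 Hamiltonian (atomic units) for one instantaneous field value."""
    n = len(energies)
    h = [[0j for _ in range(n)] for _ in range(n)]
    for k, energy in enumerate(energies):
        # Ground state (k = 0) is stable; excited states decay (ASSUMED -i*Gamma/2).
        h[k][k] = complex(energy, 0.0 if k == 0 else -0.5 * gamma)
    for k in range(1, n):
        # Only ground-to-excited couplings; excited-excited elements stay zero.
        h[0][k] = h[k][0] = complex(dipole * field_value, 0.0)
    return h


# Detunings of the outer excited states from the pulse centered on e2.
detunings = [energy - 0.2932 for energy in (0.2676, 0.315)]
```

With the energies listed in the text, the two outer states lie 0.0256 a.u. below and 0.0218 a.u. above the resonant state.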
Optics 2024, 5, FOR PEER REVIEW 9 driving pulse, their line shapes become Fano-like with mirrored asymmetries.Small deviations from this asymmetry come from numerical errors due to the discrete spectral grid.The corresponding population dynamics of the four states are shown in Figure 8.

Conclusions and Outlook
In summary, we have shown that a CNN can be used to reconstruct time-dependent electronic state populations from simulated OD spectra for two different scenarios of laser pulse excitation: the excitation of an individual electronic state (in the two-level system) and the launching of an electronic wave packet consisting of three excited states (in the four-level system). We have demonstrated this for driving-pulse electric field strengths spanning two orders of magnitude, tuning continuously from the weak-field to the strong-coupling case. For the two-level system, we have further shown that reconstruction is possible for two different pulse detunings. Furthermore, by including two different noise levels in the input spectra, we have found that a 1% noise level does not change the CNN reconstructions significantly, whereas an increase to a 3% noise level causes the CNN predictions to deviate more significantly from the simulated populations for the highest driving pulse field strengths. With regard to the pulse intensities and dipole couplings chosen here, we thus identify the 3% noise level as an upper limit.

In the future, for utilizing the CNN for experimental strong-field-driven absorption spectra, we suggest a dataset combining theoretical simulations and experimental weak-field absorption measurements (where the populations are nearly unchanged and could be calculated with perturbation theory), which is conceptually similar to previous works [21,23–25,27,28,30,33]. In the XUV spectral regime, strong-coupling experiments have been performed with FEL pulses based on self-amplified spontaneous emission (SASE) [4] in an autoionizing state [9]. In these cases, not only the peak intensity but also the spectral structure, central photon energy and pulse duration vary from shot to shot. The models and datasets discussed here should thus be expanded to higher dimensions by also training the CNN on this extended parameter space. To that end, the measurements should account for these different input parameters instead of averaging over them, in order to provide large enough datasets. As a possible benefit, the CNN might be capable of learning and predicting FEL pulse parameters in parallel with the populations, or it could be combined with other neural networks (NNs) trained for FEL pulses [20,24] to achieve this. Alternatively, with more stable and coherent seeded FEL pulses [39], the simulation and CNN presented here could already be sufficient to predict the electronic state populations, but such experiments have not been performed yet. Furthermore, combinations with noise reduction NNs [26] might be helpful for even more precise predictions.

As an outlook, our simulated two-level dynamics reveal absorption changes that in principle allow for a novel method of light amplification (when the OD becomes negative), even without population inversion (cf. Figure 5). In previous work, light amplification was achieved by population inversion [14], stimulated Raman scattering [40,41], phase shifts through mechanical displacement [42] or, in the case of amplification without population inversion, by including additional states/ionization continua or light fields [34,43–52]. In most cases, the electron populations play a key role. Thus, we expect that our approach of a few-level-based simulation and CNN will also help in the future to investigate different light amplification mechanisms. In addition, the population dynamics of the coherent wavepacket excitation in the four-level system illustrate how intra-pulse electronic population transfer leads to absorption changes when more than a single resonance in an atom is involved. Going one step further by exciting or ionizing (several) electronic states in molecules could ultimately lead to ultrafast charge transfer dynamics [15,17,53], where we expect a CNN to provide predictions of electronic populations during the pulse duration (which might influence subsequent charge transfer and even slower molecular structural dynamics), perhaps in combination with corresponding CNNs [27][28][29][30]. Overall, strong-field-modified absorption spectra can be used to investigate electronic dynamics in atoms and molecules, which, in turn, can also be used to shape and modify the driving pulses themselves, such as their amplification in selected spectral regions. In the future, we expect ML in general and CNNs in particular to provide new insights into the ultrafast interplay of UV, XUV and X-ray laser pulses with atoms or molecules.
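For the weak-field measurements suggested above, the populations could be estimated with first-order time-dependent perturbation theory. As a sketch under stated assumptions (atomic units, a single excited state at resonance frequency \omega_r, an interaction -\mu E(t) with a dipole matrix element \mu that is a symbol introduced here, and the atom initially in its ground state), the textbook result reads:

```latex
% First-order amplitude of the excited state (interaction picture, atomic units)
c_e^{(1)}(t) = i\,\mu \int_{-\infty}^{t} E(t')\, e^{i\omega_r t'}\, \mathrm{d}t' ,
\qquad
P_e(t) \approx \left| c_e^{(1)}(t) \right|^2 ,
\qquad
P_g(t) \approx 1 - P_e(t) .
```

This approximation is only meaningful as long as P_e(t) remains much smaller than 1, which is the weak-field regime referred to in the text.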

Figure 1. Conceptual overview of the population reconstruction from absorption changes in a two-level system. (a,b) A laser pulse (violet) excites a two-level system from its electronic ground state (blue) to an excited state (green), either resonantly (a) or with a small detuning (b). (c) Time-dependent populations of the ground (blue) and excited state (green) for the resonant excitation are simulated with the numerical model. (d) Same as (c) but for the detuned case. (e) The excitation of the two-level system leads to a dipole response (blue) interfering with the incoming laser pulse (violet). (f-h) A spectroscopic measurement of these signals leads to a resonance line in the optical density for a weak (ε₀ = 0.7 a.u.) and resonant pulse (f), a strongly coupling (ε₀ = 6.1 a.u.) and resonant pulse (g) or a strongly coupling (ε₀ = 6.1 a.u.) and detuned pulse (h). The natural line shape (f) is modified due to the strong driving fields (g,h), which we use to train the CNN (i) to predict the state populations of the two states (c,d).

Figure 2. Overview of the convolutional neural network structure (for the two-level system). On the left, the sequence of layers is shown. On the right, the (output) array sizes for each layer are given.

Figure 3. Loss (MSE) as a function of training dataset size for (a) the two-level system and (b) the four-level system. In both panels, the models trained on data without noise are shown in blue, the 1% noise cases are shown in green and the 3% noise cases are shown in orange.

Figure 4. Field-strength-dependent OD(ω, ε₀) for (a) resonant and (b) detuned driving pulses of the two-level system. The field strength axis is logarithmic to cover the two orders of magnitude of field strength changes. For the resonant case (a), the line shape stays symmetric. In contrast, the line shape becomes asymmetric for the detuned case (b), as discussed in more detail in the text.

Figure 5. Population reconstructions for different noise levels. The same absorption lineout for a detuned driving pulse with a peak field strength of ε₀ = 6.1 a.u. is shown for different noise levels: (a) no noise, (b) 1% noise and (c) 3% noise. Simulated populations (black line) are compared to the CNN reconstructions (blue for no noise, green for 1% noise and orange for 3% noise) of the excited (d) and ground states (e).
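One possible way to emulate such noise levels on a simulated OD lineout is sketched below. It assumes, purely for illustration, zero-mean Gaussian noise whose standard deviation is the stated percentage of the maximum absolute OD; the noise model used for the figure may differ, and the function name `add_relative_noise` and the Lorentzian test line are hypothetical.

```python
import random


def add_relative_noise(od, level, seed=None):
    """Return a noisy copy of the OD lineout `od`.

    `level` is the relative noise level (0.01 for 1%, 0.03 for 3%).
    Assumption: Gaussian noise with standard deviation level * max|OD|.
    """
    rng = random.Random(seed)
    scale = level * max(abs(v) for v in od)
    return [v + rng.gauss(0.0, scale) for v in od]


if __name__ == "__main__":
    # Hypothetical Lorentzian absorption line on an arbitrary frequency grid.
    grid = [i * 0.01 - 5.0 for i in range(1001)]
    od_clean = [1.0 / (1.0 + x * x) for x in grid]
    od_1pct = add_relative_noise(od_clean, 0.01, seed=1)
    od_3pct = add_relative_noise(od_clean, 0.03, seed=1)
    print(max(abs(a - b) for a, b in zip(od_1pct, od_clean)))
    print(max(abs(a - b) for a, b in zip(od_3pct, od_clean)))
```

Fixing the seed keeps the noisy lineouts reproducible, which is convenient when the same spectrum is compared across noise levels.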

Figure 6. Field-strength-dependent simulated populations of the excited state for (a) resonant and (b) detuned driving pulses and respective CNN reconstructions (c,d) for the 1% noise level on the ODs. In all panels, the populations are close to zero for low field strengths (ε₀ ≤ 1 a.u.). For higher field strengths (ε₀ > 1 a.u.), the population first increases significantly and then begins to oscillate up and down for ε₀ > 7 a.u. As is well known from Rabi oscillations [38], the population transfer is significantly reduced for detuned pulses; thus, the maximum excited state population for the detuned case (b,d) of Pe,max ≈ 0.5 is smaller than for the resonant case (a,c), where Pe,max = 1. We do not use the rotating wave approximation; thus, the populations oscillate with 2ωr [38] during the interaction with the pulse, which is twice the frequency of the resonance transition.
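The reduced population transfer for detuned driving can be illustrated with a minimal two-level sketch that, like the simulation described here, does not use the rotating wave approximation. It is not the numerical model behind the figure: it assumes a constant-amplitude cosine field instead of a pulse, atomic units, and illustrative parameter values (`omega_r`, `coupling` and the detuning are hypothetical). The Schrödinger equation is integrated with a fixed-step fourth-order Runge-Kutta scheme.

```python
import math


def max_excited_population(omega_drive, omega_r=1.0, coupling=0.05,
                           t_end=100.0, dt=0.01):
    """Maximum excited-state population of a driven two-level system.

    Hamiltonian (atomic units): H = [[0, v(t)], [v(t), omega_r]] with
    v(t) = -coupling * cos(omega_drive * t); `coupling` stands for the
    product of dipole matrix element and field strength.
    """
    def deriv(t, cg, ce):
        v = -coupling * math.cos(omega_drive * t)
        # i dc/dt = H c  ->  dc/dt = -i H c
        return -1j * v * ce, -1j * (v * cg + omega_r * ce)

    cg, ce = 1.0 + 0.0j, 0.0 + 0.0j  # start in the ground state
    t, p_max = 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        k1 = deriv(t, cg, ce)
        k2 = deriv(t + dt / 2, cg + dt / 2 * k1[0], ce + dt / 2 * k1[1])
        k3 = deriv(t + dt / 2, cg + dt / 2 * k2[0], ce + dt / 2 * k2[1])
        k4 = deriv(t + dt, cg + dt * k3[0], ce + dt * k3[1])
        cg += dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        ce += dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        t += dt
        p_max = max(p_max, abs(ce) ** 2)
    return p_max


if __name__ == "__main__":
    print("resonant:", max_excited_population(omega_drive=1.0))
    print("detuned: ", max_excited_population(omega_drive=0.95))
```

With the detuning chosen equal to the coupling, the rotating-wave estimate for the maximum transfer is coupling² / (coupling² + detuning²) = 0.5; the full integration additionally shows the small fast ripples that arise without the rotating wave approximation.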

compared to Figure 6b. For field strengths of 4.5 a.u. to 4.6 a.u., the reconstruction is significantly different from the simulation. Looking at the corresponding input ODs in Figure 4a,b reveals that for field strengths of 3.3 a.u. and 6.5 a.u. in the resonant case (a) and 4.5 a.u. in the detuned case (b), the ODs are nearly flat and close to zero. Due to this ambiguity, the CNN cannot distinguish between the three cases where the OD vanishes, which is why the training and reconstruction with the CNN fail in these specific cases.

Figure 7. Absorption of the four-level system for different field strengths. The OD is shown for a weak field strength, ε₀ = 0.1 a.u. (grey), an intermediate field strength, ε₀ = 1.2 a.u. (green), and a high field strength, ε₀ = 9.1 a.u. (blue), of the driving pulse. For the high field strength, the OD is multiplied by a factor of 4 for better visibility.

Figure 8. CNN-reconstructed populations of the ground (blue), first excited (green), second excited (orange) and third excited states (red) in the four-level system for (a-c) low (ε₀ = 0.14 a.u.), (d-f) intermediate (ε₀ = 1.23 a.u.) and (g-i) high field strengths (ε₀ = 9.06 a.u.) of the pulse, as well as for (a,d,g) no noise, (b,e,h) 1% and (c,f,i) 3% noise. In (a,d,g), the simulated populations are shown with black markers. Similar to the two-level case, the populations stay nearly unchanged for low field strengths but are significantly transferred for higher field strengths and undergo several Rabi oscillations (and faster 2ωr oscillations) for the highest field strength. In all panels, the populations of the two detuned excited states, e1 (green) and e3 (red), are nearly the same, whereas the population of the resonant excited state, e2 (orange), shows clearly different temporal behavior.

Table 1. MSE and MAE of the population reconstruction for the two-level system.

Table 2. MSE and MAE of the population reconstruction for the four-level system.
