Bayesian Analysis of Femtosecond Pump-Probe Photoelectron-Photoion Coincidence Spectra with Fluctuating Laser Intensities

This paper employs Bayesian probability theory for analyzing data generated in femtosecond pump-probe photoelectron-photoion coincidence (PEPICO) experiments. These experiments allow investigating ultrafast dynamical processes in photoexcited molecules. Bayesian probability theory is consistently applied to data analysis problems occurring in these types of experiments such as background subtraction and false coincidences. We previously demonstrated that the Bayesian formalism has many advantages, amongst which are compensation of false coincidences, no overestimation of pump-only contributions, significantly increased signal-to-noise ratio, and applicability to any experimental situation and noise statistics. Most importantly, by accounting for false coincidences, our approach allows running experiments at higher ionization rates, resulting in an appreciable reduction of data acquisition times. In addition to our previous paper, we include fluctuating laser intensities, of which the straightforward implementation highlights yet another advantage of the Bayesian formalism. Our method is thoroughly scrutinized by challenging mock data, where we find a minor impact of laser fluctuations on false coincidences, yet a noteworthy influence on background subtraction. We apply our algorithm to data obtained in experiments and discuss the impact of laser fluctuations on the data analysis.


Introduction
Coincidence measurements are a widely used and powerful experimental technique in physics and chemistry. Photoelectron-photoion coincidence (PEPICO) spectroscopy utilizes not only information obtained from the detection of electrons and ions, but also the fact that they stem from the very same ionization event [1][2][3][4][5][6]. Frequently used in photoionization studies of gas phase molecules or clusters, this technique allows for conclusions about the ionization process such as the disentanglement of competing intramolecular relaxation channels [4,7,8,9] or multiple species [10], a depth of insight that cannot be achieved without assigning electrons to the ions from which they originate. Thus, the success of PEPICO relies on the recorded pairs being unambiguous, energy-resolved for the electrons, and mass-resolved for the cations. Yet, the correct pairwise assignment (true coincidence) may be affected by certain experimental conditions: If a laser pulse triggers several simultaneous ionization events in different neutral molecules, the assignment of correlated electron-ion pairs is impaired, which causes so-called false coincidences [11]. Three outcomes are possible: (1) The number of detected electrons or the number of detected ions differs from one; in this case the event is rejected. (2) Exactly one electron and one ion are detected, and both originate from the same molecule (true coincidence). (3) Exactly one electron and one ion are detected, but they originate from different molecules (false coincidence). Let this be illustrated using the example of exactly two ionization events. If two electrons and/or two ions are detected, the measurement is simply discarded since no unambiguous assignment can be made. Yet, with imperfect detectors, there is a non-negligible probability of the following event: The electron from Molecule 1 is detected, the electron from Molecule 2 is not detected; Cation 1 is not detected; Cation 2 is detected.
Hence, the experimentalist sees a false coincidence, where Electron 1 is wrongly assigned to Cation 2. Obviously, false coincidences only arise if both detectors are not perfect, i.e., the detection probabilities are less than unity, and are thus to some extent present in any such experiment. An easy way out is to work with low ionization rates, for the price of either a bad signal-to-noise ratio or time-consuming measurements. In principle, detector noise or ionization events not caused by the laser pulse might also lead to false coincidences, but, for the situation at hand, are sufficiently low to be neglected.
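The two-event example can be made concrete by enumerating every detection pattern. The following is a minimal sketch, assuming that each electron and each ion is detected independently with probabilities `xi_e` and `xi_i`; the function name and the outcome labels are illustrative and not taken from the published software.

```python
from itertools import product

def two_event_outcomes(xi_e, xi_i):
    """Classify all detection patterns for exactly two ionization events."""
    probs = {"true": 0.0, "false": 0.0, "rejected": 0.0}
    # Each particle (electron 1, ion 1, electron 2, ion 2) is detected (1) or missed (0).
    for e1, i1, e2, i2 in product((0, 1), repeat=4):
        p = 1.0
        for hit, xi in ((e1, xi_e), (i1, xi_i), (e2, xi_e), (i2, xi_i)):
            p *= xi if hit else 1.0 - xi
        if e1 + e2 != 1 or i1 + i2 != 1:
            probs["rejected"] += p   # not exactly one electron and one ion
        elif (e1 and i1) or (e2 and i2):
            probs["true"] += p       # electron and ion from the same molecule
        else:
            probs["false"] += p      # electron and ion from different molecules
    return probs

outcomes = two_event_outcomes(xi_e=0.5, xi_i=0.5)
print(outcomes)
```

Under these assumptions, true and false single coincidences are equally likely when exactly two molecules are ionized, and the false share vanishes as soon as one of the two detection probabilities equals one.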
Time-resolved studies are typically carried out as pump-probe experiments [10,12] as depicted in Figure 1. Excitation by a laser pulse, commonly referred to as the pump pulse, triggers dynamical processes in the molecule, after which a time-delayed second laser pulse, commonly referred to as the probe pulse, ionizes the molecule. The transient change of the photoelectron and photoion signals associated with the excited states, as a function of the time delay, provides insight into the underlying processes. Unfortunately, pump and/or probe pulses on their own can ionize the molecule as well, leading to signals that are referred to as pump-only and probe-only in the following. This background signal is superimposed on the excited state signal, and in many experimental situations, the pump-only and/or the probe-only signals significantly contribute to the pump-probe signal, e.g., if multiphoton transitions are applied for pumping or probing, or if high photon energies are used for probing [13]. In order to extract the excited state transients, the pump-only and/or the probe-only signals are measured separately and usually subtracted from the pump-probe signals, which increases the noise if the background and pump-probe spectra overlap energetically. The application of the Bayesian formalism to background subtraction alone was already presented for astrophysical applications [14,15] and for particle-induced X-ray emission spectroscopy (PIXE) [16][17][18][19][20][21]. Yet, here, it has to be considered that the pump-only, the probe-only, and the pump-probe measurements have different ionization rates. Since the statistics of the coincidences depend on the ionization rates, the statistics of the separate pump-only and probe-only measurements differ from those in the pump-probe measurement, and simple subtraction is no longer an unbiased estimator [22].
Although it is feasible to distinguish between true and false coincidences by pure experimental finesse, such as in cold target recoil ion momentum spectroscopy (COLTRIMS) [23,24], it demands quite a technical and financial effort and is entirely impossible for time-of-flight detection, as used in the presented experiment. Covariance mapping, which is based on the calculation of the covariance for the photoelectron and mass spectra measured with each laser shot [25][26][27], does not guarantee that the reconstructed spectrum is positive, is restricted to Poisson processes, and leads to systematic deviations in other scenarios [26,27]. Further limitations are outlined in [25].
Figure 1. Utterly simplified sketch of a time-resolved photoionization study carried out with a pump-probe setup and a time-of-flight spectrometer. A commercial Ti:sapphire laser system delivers pulses of 800 nm in center wavelength and 25 fs in temporal length at a repetition rate of 3 kHz. The delay stage is used to control the length of the optical path, and hence the time delay. The energy level diagram shows how the electron kinetic energy, given the energy of the states and the photons, identifies the state the system was in at the moment of ionization. A detailed description of the setup can be found in our previous publications [7,28].
We recently presented a Bayesian approach to PEPICO, which treats both coincidences and background subtraction on the same footing [22,29]. In this work, we extend our theory to the prevalent experimental situation of unstable laser intensity, i.e., ionization rates fluctuating from pulse to pulse. We provide our software, including introductory examples, at https://github.com/fslab-tugraz/PEPICOBayes/. The experiment described in Section 2 is treated with the Bayesian formalism developed in Section 3. It will be tested by some challenging mock data in Section 4 and applied to real experimental data in Section 5.

Experiment
The analyzed experiment is of the type depicted in Figure 1 and described in detail in previous publications [7,28]. To apply our method, we choose our specimen and excitation-ionization-scheme according to Figure 2, since we expect the effects described above to be of particular importance in this scenario. Acetone molecules are excited by a three-photon transition to high-lying Rydberg states and ionized in the extraction region of a time-of-flight spectrometer, which measures both the electron kinetic energies and the ion masses [4,7]. The electron and ion flight times are then analyzed by a coincidence algorithm to produce photoelectron spectra corresponding to either an intact parent acetone ion or a fragmented acetyl ion. The concept and the necessity of time-resolved PEPICO (TR-PEPICO) in this context are further elucidated in Animations A1 and A2 in the Supplementary Material.
The addressed excited state lies energetically close to the ionization continuum, resulting in a certain probability of four-photon ionization from the ground state caused by the pump pulse alone (measurement α and Channel 1), leading to a background signal. In a separate pump-probe measurement (measurement β), the pump process is the same as in the pump-only case. The population is generated in the excited states, which in turn is ionized by a time-delayed probe pulse. Due to the low laser intensity of the probe pulse, ground state molecules are not ionized by the probe pulse alone, i.e., there is no probe-only background. Consequently, the measured pump-probe spectrum consists of pump-only ionization events (Channel 1) and pump-probe ionization events (Channel 2). Cations, produced in both channels, dependent on the ionization path, can either be stable and detected as parent ions or undergo fragmentation into neutral and ionic fragments. Coincidence detection of electrons and ions allows obtaining separate electron spectra for each ion, i.e., parent and fragment. The excited state of the molecule at the moment of ionization is identified by the measured electron kinetic energy in combination with the energy of the ionizing photon and knowledge of the vertical ionization energy of the excited state. In addition to the information of species and electronic state that is ionized, the related ion mass of the PEPICO spectrum provides insight into the fragmentation behavior. For example, the assignment of the photoelectron kinetic energy to an excited electronic state of the unfragmented molecule and coincidence detection of an ion fragment show that the molecule was intact at the moment of ionization and that fragmentation must have occurred afterwards. Moreover, the population in the excited state can decay to energetically lower states quite quickly, e.g., on a femtosecond timescale [4,7,28]. 
It is due to this decay that the Channel 2 signal can become significantly smaller than the Channel 1 background, in particular for long delay times, causing a poor signal-to-noise ratio.

Preliminary Considerations
We now introduce our notation and develop the Bayesian algorithm for analyzing the data generated in the experiment described in Sections 1 and 2. We consider the following standard setup consisting of two experiments on the same target: pump-only and pump-probe, denoted by α and β, respectively. Each experiment consists of N p measurements. A measurement of the α experiment is performed with exactly one pump pulse, while a measurement of the β experiment comprises exactly one pump pulse and one probe pulse. During one measurement, two types of elementary coincidence events are detected, either a molecule is ionized from its ground state (referred to as Channel 1) or from its excited state (Channel 2). The latter is only possible in a pump-probe measurement (β). We assume that the number m j of ionization events in channel j ∈ {1, 2} is Poisson distributed with some mean ionization rate λ j .
For the sake of readability, we suppress the index j in the following considerations. Furthermore, we assume that in each experiment, characterized by a defined delay time between pump and probe pulse, λ is independent of the occupation of the states, which means that we neglect population depletion effects. We additionally presume the laser intensity, and thus the ionization rate λ, to fluctuate from pulse to pulse. Mikosch and Patchkovskii [26,27] proposed to describe λ with a Gaussian Probability Density Function (PDF). We use a Γ-distribution instead, since: (1) a Gaussian PDF for λ is inconsistent with the fact that λ ≥ 0, while the Γ-distribution includes this constraint naturally; (2) it turns out to be quite convenient mathematically, while the effect of this choice on the result is deemed negligible, which will become apparent later on. We parametrize the Γ-distribution, p Γ (λ|λ̄, σ), by its expectation value λ̄ and variance σ². All results match the findings of our previous paper [22] in the limit σ → 0.
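As an illustration of this parametrization, a Γ-distribution with a given mean and standard deviation corresponds to shape = mean²/σ² and scale = σ²/mean. The sketch below draws pulse-to-pulse rates with Python's standard library; the function names and the sample size are illustrative choices, not part of the published software.

```python
import random

def gamma_shape_scale(mean, sigma):
    """Convert (mean, standard deviation) to the shape/scale form of a Gamma PDF."""
    shape = mean**2 / sigma**2
    scale = sigma**2 / mean
    return shape, scale

def draw_rates(mean, sigma, n, seed=0):
    """Draw n non-negative ionization rates, one per laser pulse."""
    rng = random.Random(seed)
    shape, scale = gamma_shape_scale(mean, sigma)
    # random.gammavariate takes (shape, scale); its mean is shape * scale.
    return [rng.gammavariate(shape, scale) for _ in range(n)]

rates = draw_rates(mean=1.5, sigma=0.5, n=50_000)
sample_mean = sum(rates) / len(rates)
sample_var = sum((r - sample_mean) ** 2 for r in rates) / len(rates)
print(f"sample mean {sample_mean:.3f}, sample variance {sample_var:.3f}")
```

All drawn rates are non-negative by construction, which is the property a Gaussian description lacks.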
In one elementary event, the involved molecule can have mass M µ and the emitted electron energy E ν . For brevity, we will refer to this particular event as (µν). The ion masses and the electron energies are discretized, µ, ν ∈ N, due to the finite resolution of the time-of-flight spectrometer. We will also use the symbol ρ ∈ {α, β} when we refer to the measurements/experiments α or β, the symbol j for Channel 1 or 2, and x for the combination of both sets, i.e., x ∈ {1, 2, α, β}. Given that an elementary event happens in channel j ∈ {1, 2}, the probability that it corresponds to (µν) is denoted by: q (j) µν := P(E ν , M µ |j, I), where I denotes the conditional complex. The probabilities are properly normalized: ∑ µν q (j) µν = 1. In the pump-only measurement (α), all molecules are in their respective ground state; therefore, only Channel 1 is allowed and: We now introduce the spectrum: q̃ with a subtle distinction of q (1) in the explicit propositions. q (1) is conditioned on (µν) being a true single coincidence, whereas q̃ (1) µν is conditioned on (µν) being a single coincidence, either true or false, and hence also on π. The latter is what is actually observed in a pump-only PEPICO measurement, while the former is what is desired. It will become apparent in the following how this additional condition distorts our statistics. We summarize all unknown parameters in the variable π := {λ̄ 1 , σ 1 , λ̄ 2 , σ 2 , ξ i , ξ e }. λ̄ 1 , σ 1 , λ̄ 2 , and σ 2 describe the fluctuations of λ 1 and λ 2 according to Γ-distributions, and ξ i and ξ e are the detection probabilities of ions and electrons, respectively. The probability of detecting an electron with energy E ν and an ion with mass M µ in a single coincidence measurement is: .ν λ 2 with the marginals q (j) µν and the detection failure probabilities ξ̄ e = 1 − ξ e and ξ̄ i = 1 − ξ i . In the second line, we use that P(E ν , M µ , SC|q, λ 1 , I) was already derived in Appendix 1 of [22]. The appearing integrals are of the type: and solved in Appendix A.
Note that the PDF describing the laser fluctuations enters the whole algorithm only within integrals of the above type. Thus, the theory can easily be adapted to different descriptions of the fluctuations by merely exchanging those integrals. The spectrum including false coincidences, q̃ µν , is then given by: The false coincidences are represented by the term κ 1 q .ν , and therefore κ 1 is a measure of the amount of false coincidences.
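The modularity described here can be illustrated with a small sketch. It assumes that the integrals in question are averages of λⁿ·exp(−aλ) over the fluctuation PDF; this form is our assumption for illustration, since the explicit expression is not reproduced in this text. For a Γ-distribution such averages have a closed form, while any other PDF could be handled by the generic numerical routine. All function names are hypothetical.

```python
import math

def gamma_pdf(lam, mean, sigma):
    """Gamma PDF parametrized by its mean and standard deviation (lam > 0)."""
    shape, rate = mean**2 / sigma**2, mean / sigma**2
    return math.exp(shape * math.log(rate) - math.lgamma(shape)
                    + (shape - 1.0) * math.log(lam) - rate * lam)

def gamma_integral(n, a, mean, sigma):
    """Closed form of the average of lam**n * exp(-a * lam) over the Gamma PDF."""
    shape, rate = mean**2 / sigma**2, mean / sigma**2
    return math.exp(math.lgamma(shape + n) - math.lgamma(shape)
                    + shape * math.log(rate) - (shape + n) * math.log(rate + a))

def numeric_integral(n, a, pdf, upper=40.0, steps=100_000):
    """Midpoint rule for the same average; works for any PDF on lam >= 0."""
    h = upper / steps
    total = 0.0
    for k in range(steps):
        lam = (k + 0.5) * h
        total += lam**n * math.exp(-a * lam) * pdf(lam)
    return total * h

closed = gamma_integral(1, 0.75, mean=1.5, sigma=0.5)
numeric = numeric_integral(1, 0.75, lambda lam: gamma_pdf(lam, 1.5, 0.5))
print(closed, numeric)
```

Swapping the fluctuation model then amounts to passing a different `pdf` to `numeric_integral`, at the cost of losing the closed form.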
Distinguishing the case where the electron and ion are measured in coincidence and stem from the same (i = j) or a different (i ≠ j, or ¬i = j) channel, we obtain in Appendix B: In summary, we have: q̃ .ν + α ν q with the parameters: where γ .. = ∑ µν γ µν and: Up to now, we have defined all variables and dependencies we need for the derivation of the desired posterior distribution presented in the following sections.

The Posterior PDF
We use Bayes' theorem to determine the posterior probability of the parameters we want to estimate, where capital P shall denote discrete and lower case p continuous distributions. The dataset D 2 counts how many measurements lead to the detection of N e electrons and N i ions during the experiments α and β, respectively. In this case, it is expedient to use all detected events, not just single coincidences.
The appearing transformation of the Dirac distribution, with S = ∑ µν q (2) µν , is derived in Appendix C, and the derivation of the Jacobian determinant, is shown in Appendix D. N µ and N ν denote the total number of bins for ions (µ) and electrons (ν), respectively. Putting everything together, Equation (11) becomes: The second term in Equation (10), p(q (1) |π, I), is obtained by setting Ω = 1 and replacing experiment β by α and Channel 2 by 1; hence, the prior is fully determined.

The Likelihood
In this paper, we only use the single coincidence events D 1 for estimating the spectra q (1) and q (2) given the parameters π. This is justified by the fact that especially these events include relevant information about the spectrum, because the case of detecting more than one electron-ion pair does not allow one to link an electron to the ion it originates from, and the Bayesian approach would be different. The dataset D 2 will be used to determine the unknown parameters π. Therefore, the likelihood splits into: P(D 1 , D 2 |q (1) , q (2) , π, I) = P(D 1 |q (1) , q (2) , π, I) P(D 2 |π, I) .
Marginalizing over q̃ µν introduces the multinomial distribution: according to Appendix 2 of [22]. We now turn to the second term in Equation (14). P N e ,N i is already derived in Appendix 4 in [22] and has the general form: Inserting in Equation (15) produces: P

Remarks on the Posterior Sampling
In the previous sections, we fully determined the posterior p(q (1) , q (2) , π|D 1 , D 2 , I). Ultimately, we are interested in the spectrum q (2) , or rather its probability distribution p(q (2) |D 1 , D 2 , I), which is obtained by integrating out q (1) and π in the posterior. For performing this integration and, more generally, computing expectation values, variances, and covariances of the posterior, we resort to numerical techniques, specifically posterior sampling. A suitable technique for sampling from this PDF is Markov Chain Monte Carlo (MCMC), which is based on a Markov chain that has the desired distribution as its equilibrium distribution. The technique is standard in Bayesian probability theory (see [30], especially Chapter 30 and the references therein). To generate the Markov chain, we used the Metropolis-Hastings algorithm with local updates in q (1) , q (2) , and π. We chose the step size for the update of each parameter separately to achieve an acceptance probability of ca. 50%. We discarded the first 20% of the Markov chain as a means of thermalization and tested the convergence to the equilibrium distribution with several initial states. Autocorrelations were checked with a binning analysis.
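The sampling recipe described above can be sketched generically. The example below is not the published implementation: it replaces the PEPICO posterior by a toy two-parameter Gaussian target (`log_target`), uses uniform proposals, and tunes the step sizes with a simple multiplicative rule, all of which are assumptions made for illustration. It shows local updates, per-parameter step sizes tuned towards roughly 50% acceptance, a 20% thermalization cut, and a binning analysis.

```python
import math
import random

def log_target(theta):
    """Toy stand-in for a log-posterior: independent Gaussians of width 1 and 0.1."""
    return -0.5 * theta[0] ** 2 - 0.5 * (theta[1] / 0.1) ** 2

def sweep_chain(log_p, theta, steps, n_sweeps, rng):
    """Metropolis-Hastings with local (one parameter at a time) uniform proposals."""
    theta, current = list(theta), log_p(theta)
    chain, accepted = [], [0] * len(theta)
    for _ in range(n_sweeps):
        for k in range(len(theta)):
            proposal = list(theta)
            proposal[k] += rng.uniform(-steps[k], steps[k])
            new = log_p(proposal)
            if rng.random() < math.exp(min(0.0, new - current)):
                theta, current = proposal, new
                accepted[k] += 1
        chain.append(list(theta))
    return chain, [count / n_sweeps for count in accepted]

def tune_steps(log_p, theta, steps, rng, rounds=30, pilot=500, target=0.5):
    """Rescale each step size separately until its acceptance rate is near the target."""
    steps = list(steps)
    for _ in range(rounds):
        chain, rates = sweep_chain(log_p, theta, steps, pilot, rng)
        theta = chain[-1]
        steps = [s * math.exp(r - target) for s, r in zip(steps, rates)]
    return theta, steps

def binned_error(samples, bin_size):
    """Standard error of the mean estimated from bin averages (binning analysis)."""
    n_bins = len(samples) // bin_size
    means = [sum(samples[b * bin_size:(b + 1) * bin_size]) / bin_size
             for b in range(n_bins)]
    grand = sum(means) / n_bins
    return math.sqrt(sum((m - grand) ** 2 for m in means) / (n_bins - 1) / n_bins)

rng = random.Random(42)
start, steps = tune_steps(log_target, [0.0, 0.0], [1.0, 1.0], rng)
chain, rates = sweep_chain(log_target, start, steps, 20_000, rng)
kept = chain[len(chain) // 5:]          # discard the first 20% as thermalization
x0 = [state[0] for state in kept]
mean_x0 = sum(x0) / len(x0)
error_naive = binned_error(x0, 1)       # ignores correlations between samples
error_binned = binned_error(x0, 200)    # bins longer than the correlation time
print(rates, mean_x0, error_naive, error_binned)
```

In such a sketch, a binned error that clearly exceeds the naive one indicates correlated samples; the bin size is increased until the binned error stops growing.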

Mock Data Analysis
In this section, we demonstrate the performance of our algorithm including λ-fluctuations. There are two disturbing influences in the reconstruction of the spectra: false coincidences and pump-only background. We study these influences separately. First, we investigate the influence of the λ-fluctuations on the false coincidences. In the second part, we scrutinize our algorithm in the presence of a challenging pump-only background and λ-fluctuations.

False Coincidences
In the simulation, we have two different ion masses µ ∈ {p, f}, called parent (p) and fragment (f). We use the test spectra related to the ones reported in [22,26]. The electronic spectrum of the parent has a step-like form, and the electronic spectrum of the fragment consists of Gaussian peaks; see Figure 3. These challenging test spectra exhibit a strongly-varying parent/fragment-ratio as a function of the electron energy. Since we are firstly interested in the effect of λ-fluctuations on false coincidences, we suppress the background in our simulations (λ̄ 1 → 0 and σ 1 → 0). Before studying the simulation results in detail, we investigate the influence of λ-fluctuations on false coincidences by analyzing Equation (7). Since we have only signal contributions (q (2) ), Equation (7) reduces to Equation (4) with the superscripts (2) and 2 instead of (1) and 1, respectively. The equation describes the connection between the measured spectrum q̃ (2) , including false coincidences, and the true spectrum q (2) . κ 2 gives the weight between false and true coincidences, This equation shows that the influence of λ-fluctuations on false coincidences is negligible in two cases: first, if the relative fluctuations σ_2²/λ̄_2² are small; and second, if the relation (1 − ξ̄ e ξ̄ i ) λ̄ 2 ≈ 1 holds, in which case the influence is negligible even for large relative fluctuations.
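Both regimes can be reproduced with a short calculation. The sketch below is our own reading of the model, not the paper's expression for κ 2: it assumes that, for a fixed rate λ, true single coincidences carry the weight λ·exp(−aλ) and false ones the weight ξ̄ e ξ̄ i λ²·exp(−aλ) with a = 1 − ξ̄ e ξ̄ i, and it averages both over the Γ-distribution. The resulting ratio may differ from κ 2 in normalization; the function name is hypothetical.

```python
def false_coincidence_weight(mean, sigma, xi_e, xi_i):
    """Ratio of false to true single coincidences, averaged over Gamma-distributed rates.

    Returns the ratio and the factor by which fluctuations change it
    relative to a constant rate equal to `mean`.
    """
    miss = (1.0 - xi_e) * (1.0 - xi_i)   # neither particle of a pair is detected
    a = 1.0 - miss
    rel = sigma**2 / mean**2             # squared relative fluctuation
    correction = (1.0 + rel) / (1.0 + a * mean * rel)
    return miss * mean * correction, correction

for mean in (1.5, 0.5):
    ratio, correction = false_coincidence_weight(mean, sigma=0.5, xi_e=0.5, xi_i=0.5)
    print(f"mean rate {mean}: false/true = {ratio:.4f}, correction = {correction:.4f}")
```

In this sketch the correction factor tends to one for small relative fluctuations and equals one whenever a times the mean rate is one, irrespective of the fluctuation strength.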
In our mock data analysis, we choose the parameters ξ i = ξ e = 0.5, N p = 10^7, σ 2 = 0.5, and λ̄ 2 ∈ {1.5, 0.5}. Now, we turn to the parameter estimates summarized in Table 1. The algorithm reported in [22], which does not include λ-fluctuations, provides estimates for λ 2 , ξ i , and ξ e with about 5% (for λ̄ 2 = 1.5) and 13% (for λ̄ 2 = 0.5) relative deviation from the real values. The differences are caused by the assumption of a constant λ 2 without any statistical noise, which does not hold here, since we generated our mock data by drawing λ 2 from the Γ-distribution with the parameters λ̄ 2 and σ 2 . As expected, λ 2 estimated by the algorithm ignoring λ-fluctuations lies within λ̄ 2 ± σ 2 . The algorithm reported in this paper includes the λ-fluctuations and is therefore able to give reliable estimates for all parameters.
Table 1. Estimated parameters λ̄ 2 , σ 2 , ξ i , and ξ e . In the lines showing the results of the algorithm presented in [22], λ 2 is shown instead of λ̄ 2 . Each value denotes the mean and standard deviation of the parameter's distribution.
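The generation of such mock data can be sketched as follows. This is a simplified illustration rather than the published simulation: it ignores the spectra and only counts true and false single coincidences, it uses far fewer pulses than the 10^7 quoted above to keep the run time short, and the helper names are hypothetical.

```python
import math
import random

def poisson(rng, lam):
    """Knuth's multiplication method; adequate for the small rates used here."""
    limit, k, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def simulate(mean, sigma, xi_e, xi_i, n_pulses, seed=1):
    """Count true and false single coincidences for Gamma-distributed rates."""
    rng = random.Random(seed)
    shape, scale = mean**2 / sigma**2, sigma**2 / mean
    true_sc = false_sc = 0
    for _ in range(n_pulses):
        lam = rng.gammavariate(shape, scale)   # rate of this laser pulse
        m = poisson(rng, lam)                  # number of ionized molecules
        electrons = [k for k in range(m) if rng.random() < xi_e]
        ions = [k for k in range(m) if rng.random() < xi_i]
        if len(electrons) == 1 and len(ions) == 1:
            if electrons[0] == ions[0]:
                true_sc += 1                   # same molecule
            else:
                false_sc += 1                  # different molecules
    return true_sc, false_sc

true_sc, false_sc = simulate(mean=1.5, sigma=0.5, xi_e=0.5, xi_i=0.5, n_pulses=200_000)
print(true_sc, false_sc, false_sc / true_sc)
```

The printed ratio of false to true single coincidences gives a direct impression of how strongly an uncorrected spectrum would be contaminated at these settings.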
Altogether, the influence of λ-fluctuations on false coincidences in the reconstructed spectra is small in the experimentally-relevant parameter regime, which is in accordance with Mikosch et al. [26]. Still, including λ-fluctuations is important in the parameter estimation.

Background Subtraction
To analyze the influence of λ-fluctuations on the background subtraction, we use the same spectra as in Section 4.1. Now, the step-like spectrum is chosen as the signal and the spectrum consisting of Gaussian peaks as the background. We restrict ourselves to one ion mass to exclude the influence of false coincidences stemming from two ion masses. If we neglect false coincidences, the prefactor (8) is: This factor represents the ratio between the background (1) and signal (2). The equation shows that the corrections cancel if λ̄ 1 = λ̄ 2 and σ 1 = σ 2 ; see Figure 4a,b. If σ_1²/λ̄_1 < σ_2²/λ̄_2 (Figure 4c), the algorithm without λ-fluctuations (blue line) underestimates the background, leading to peaks in every step. In the opposite case, σ_1²/λ̄_1 > σ_2²/λ̄_2 (Figure 4d), the algorithm without λ-fluctuations (blue line) overestimates the background, leading to notches in every step. The algorithm including λ-fluctuations (green line) is able to reconstruct the signal correctly in all scenarios.
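The direction of these deviations can be reproduced with a simplified model. The sketch below is not the paper's Equation (8): it assumes independent Γ-distributed rates in both channels, neglects false coincidences, weights true single coincidences of a channel by λ·exp(−aλ) with a = 1 − ξ̄ e ξ̄ i, and uses the plain ratio of the mean rates as a stand-in for the algorithm without λ-fluctuations. All names are hypothetical.

```python
def effective_rate(mean, sigma, a):
    """Average rate as weighted by true single coincidences (Gamma-distributed rate)."""
    return mean / (1.0 + a * sigma**2 / mean)

def background_to_signal(mean1, sigma1, mean2, sigma2, xi_e, xi_i):
    """Background (Channel 1) to signal (Channel 2) ratio with and without fluctuations."""
    a = 1.0 - (1.0 - xi_e) * (1.0 - xi_i)
    with_fluct = effective_rate(mean1, sigma1, a) / effective_rate(mean2, sigma2, a)
    without_fluct = mean1 / mean2
    return with_fluct, without_fluct

cases = {
    "equal fluctuations": (1.0, 0.5, 1.0, 0.5),
    "weaker in background": (1.0, 0.2, 1.0, 0.8),
    "stronger in background": (1.0, 0.8, 1.0, 0.2),
}
for label, (m1, s1, m2, s2) in cases.items():
    with_fluct, without_fluct = background_to_signal(m1, s1, m2, s2, 0.5, 0.5)
    print(f"{label}: with = {with_fluct:.3f}, without = {without_fluct:.3f}")
```

Within this model the corrections cancel for equal variance-to-mean ratios, the background weight is underestimated when the background fluctuates less than the signal, and it is overestimated in the opposite case.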
Similar to the mock data analysis in the previous section, not including λ-fluctuations leads to deviations in the parameter estimation in all of the simulated test cases; see Table 2. As expected, λ j estimated by the algorithm ignoring λ-fluctuations lies within λ̄ j ± σ j . Including λ-fluctuations in the parameter estimation becomes important especially for high σ 1 or σ 2 . In summary, the influence of

Application to Experimental Data
With our setup for time-resolved PEPICO studies on gas phase molecules (see Figure 2), we were not able to produce stable λ-fluctuations with a sufficiently large σ to demonstrate a systematic influence on the reconstructed spectra. The output of our commercial laser system (Coherent Vitara oscillator and Legend Elite Duo amplifier) turned out to be too stable for this purpose. We tried to simulate a less stable setup by operating the Optical Parametric Amplifier (OPA) at intensities below the saturation threshold, which induces λ-fluctuations. In this setting, however, the pulse energy also undergoes temporal drifts on time scales of the measurement, which are different from the statistical fluctuations and not accounted for in our analysis as a setup would not be operated in this regime.
Nevertheless, by using the algorithm including the λ-fluctuations, we find that the fluctuations were on the order of σ = O(0.001) in the experimentally-relevant parameter regime. The parameters estimated by the different algorithms are listed in Table 3. Finally, we note that less stable systems are expected to suffer from significant λ-fluctuations, which have to be accounted for in the reconstruction process of the spectrum. This will in particular be the case if multiple nonlinear optical processes like in the OPA are used in the pump path of a pump-probe setup. The fluctuations get even more prominent if multiphoton processes are used for the excitation process, because these processes depend nonlinearly on the pulse power.
Table 3. Estimated parameters λ̄ 1 , λ̄ 2 , σ 1 , σ 2 , ξ i , and ξ e . Line 1 (2) contains the parameter estimations performed with the algorithm without (with) λ-fluctuations. In the line showing the results of the algorithm ignoring λ-fluctuations, λ j is shown instead of λ̄ j . Each value denotes the mean and standard deviation of the parameter's distribution.

Conclusions
We used Bayesian probability theory to analyze data obtained from pump-probe photoionization experiments with photoelectron-photoion coincidence detection. We extended the algorithm developed previously in [22] by including fluctuations of the laser intensity as a random variable λ. Based on challenging mock data, we have demonstrated the reliability of the developed algorithm. In accordance with Mikosch et al. [26], the influence of λ-fluctuations on false coincidences was small in a certain parameter regime. We derived the condition (1 − ξ̄ e ξ̄ i ) λ̄ ≈ 1 for negligible influence of λ-fluctuations on false coincidences, even at high relative fluctuations. ξ̄ e and ξ̄ i denote the complements of the detection probabilities of electrons (ξ e ) and ions (ξ i ), respectively. In the case of the pump-only background (Channel 1) and the signal (Channel 2) contained in the pump-probe measurements, neglecting λ-fluctuations underestimates the background signal if σ_1²/λ̄_1 < σ_2²/λ̄_2. In the opposite case, the background signal is overestimated. In both scenarios, including λ-fluctuations is important for estimating the parameters λ̄ 1 , λ̄ 2 , σ 1 , σ 2 , ξ i , and ξ e correctly. In our application to the experimental data, we find that the relative laser fluctuations in the experimental setup are fortunately too small to see effects on the reconstructed spectra. With the developed algorithm, we were able to determine the experimental λ-fluctuations to be on the order of σ = O(0.001). Compared to conventional subtraction of the pump-only spectrum from the pump-probe spectrum, the Bayesian approach provides several important advantages: (i) It results in a significant increase of the signal-to-noise ratio. (ii) It does not overestimate the pump-only contribution and never leads to negative spectra because the relative weight of the pump-only contribution is self-consistently determined. 
(iii) Spectral signatures based on false coincidences are eliminated, allowing for higher signal rates. (iv) It includes consistently all prior knowledge, such as positivity, and (v) a confidence interval is obtained for the estimated spectrum. (vi) It is applicable to any experimental situation and noise statistics, as demonstrated in this paper for the case of λ-fluctuations.
In the second scenario, the electron and the ion stem from different channels (i ≠ j, or ¬i = j), and therefore, only false coincidences are possible, e.g., the probability of detecting one electron from channel i and one ion from channel ¬i is the product: P(E ν , M µ , SC i,¬i |q, π, I) = P(E ν |q (i) , π, I)P(M µ |q (¬i) , π, I) .

Appendix C. Transformation of the Dirac Distribution
For the derivation of the transformation of the Dirac distribution in Equation (12), we need the relation between S̃ and S. This is given by: (1 + 2κ 2 (Ω − 1))S + κ 2 S 2 + γ (β) ..
The argument of the Dirac distribution, δ(S − 1), has a unique zero at S = 1. Considered as a function of S, we therefore have:

Appendix D. The Jacobian Determinant
In this section, we derive the Jacobian determinant needed for the transformation in Equation (11).
The final result is presented in Equation (13).