1. Introduction
Recently we described the results of a pump-probe experiment, in which the lifetimes of doubly excited states of neon dimers were measured [1]. The dimers were excited by absorption of two EUV photons from the Free Electron Laser (FEL) FERMI-1 [2] and probed via ionization by a UV laser pulse.
The excited dimers decayed by Interatomic Coulombic Decay to stable dimer cations Ne$_2^+$, which were detected by a time-of-flight (TOF) mass spectrometer. Ionization by the UV pulse led to a repulsive state of the dimer, which dissociated, so that the yield of dimer ions was reduced and the yield of Ne$^+$ increased. The dimer sample used in that work was very dilute, with a large atomic background, as it was produced in a supersonic expansion of neon gas with a yield of about 1%. The data were therefore noisy, and a substantial effort was needed to analyze and filter them. In this paper, we describe the methods used for that analysis.
The data consisted of a number of temporal scans, each lasting approximately 30 min, in which the ion-TOF signal was measured as a function of the delay between pump and probe pulses, and these scans had poor signal-to-noise ratios. Simply averaging them did not give good results, probably because some of them suffered from poor FEL conditions or poor alignment of the pump and probe pulses. One should reject those scans that do not contribute significant signal, while avoiding human bias. A possible approach is to exploit the difference between signal and noise with respect to auto- and cross-correlation. This is an instance of the more general concept of the "matched filter", i.e., the linear filter maximizing the signal-to-noise ratio (SNR) of a measured noisy sample [3]. Under the assumption of white noise, the matched filter is the complex-conjugate time-reversal of the signal sought, i.e., application of the matched filter returns the autocorrelation of the signal [4]. The concept is widely applied in signal processing, where one is primarily interested in extracting a burst of periodic signal from a noisy sample. As far as peak detection is concerned, a popular field of application is chromatography, but the general results obtained there apply equally well to our case; Ref. [5] explicitly discusses the use of a matched filter to determine the amplitude and time shift of the peak being sought. We consider the use of auto- and cross-correlation for three purposes:
validate the scans to be included or excluded from averaging
determine the weight with which each scan should enter the average
determine, if desired, by how much to temporally shift a scan prior to averaging
2. Results
For a set of delays $t_i$, $i = 1, \dots, n$, let us consider two delay scans $s_0(t_i)$ and $s_1(t_i)$, each consisting of a sequence of Ne ion-TOF signals, reduced as explained in Section 4, and padded with zeros outside of the delay range scanned. We will consider $s_0$ our reference scan, which for convenience is assumed to be noise-free.
The definition of cross-correlation without normalization, $C_{01}(k)$, is:
$$C_{01}(k) = \sum_i s_0(t_i)\, s_1(t_{i+k}) \tag{1}$$
Note that $C_{01}$ is linear in the two sequences (scaling either one by a factor $a$ scales $C_{01}$ by the same factor), and that $C_{00}$ is the autocorrelation of $s_0$, which has a maximum for $k = 0$:
$$C_{00}(0) = \sum_i s_0(t_i)^2 \geq |C_{00}(k)| \tag{2}$$
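The unnormalized cross-correlation defined above can be sketched with NumPy as follows. This is a minimal illustration, not the analysis code used for the experiment: the function name `cross_correlation`, the synthetic gaussian peak, and the shift of 3 points are assumptions made for the example, and both scans are assumed to have the same length.

```python
import numpy as np

def cross_correlation(s0, s1):
    """Unnormalized cross-correlation C01(k) = sum_i s0[i] * s1[i + k].

    Both scans are treated as zero outside the scanned range and are
    assumed to have the same length n.
    Returns the shifts k = -(n-1) ... (n-1) and the values C01(k).
    """
    s0 = np.asarray(s0, dtype=float)
    s1 = np.asarray(s1, dtype=float)
    n = len(s0)
    # np.correlate(a, v, "full") returns sum_i a[i + k] * v[i]
    # for k = -(len(v) - 1) ... (len(a) - 1)
    c = np.correlate(s1, s0, mode="full")
    k = np.arange(-(n - 1), n)
    return k, c

# Synthetic example (hypothetical data): a gaussian peak as reference scan,
# and a second scan that is twice as large and shifted by 3 points.
t = np.arange(50)
s0 = np.exp(-0.5 * ((t - 20) / 4.0) ** 2)
s1 = 2.0 * np.roll(s0, 3)  # the wrapped-around tail is negligibly small

k, c01 = cross_correlation(s0, s1)
_, c00 = cross_correlation(s0, s0)

shift_auto = k[np.argmax(c00)]   # position of the autocorrelation maximum
shift_cross = k[np.argmax(c01)]  # position of the cross-correlation maximum
```

In this example the autocorrelation peaks at zero shift, while the cross-correlation peaks at the shift applied to the second scan and is twice as high.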
We now assume that any scan $\nu$ consisting of pure noise has zero (negligible) cross-correlation with any other scan $s$ (including itself except, obviously, at zero-shift), thus, with $C_{xy}(k)$ denoting the cross-correlation of two scans $x$ and $y$ at shift $k$:
$$C_{\nu\nu}(k) = \sigma^2\, \delta_{k0} \tag{3}$$
$$C_{s\nu}(k) = 0 \tag{4}$$
with $\delta_{k0}$ the Kronecker delta; we use Equation (3) to define a variance $\sigma^2$. Equations (3) and (4) strictly hold for white noise and infinite n; in a real-life situation we can expect that noise correlation just decays much faster than signal correlation: thus all scans whose autocorrelation is sharply peaked are probably pure noise, and will also have poor cross-correlation with the reference scan.
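The white-noise assumption can be checked numerically on synthetic data. The sketch below uses a hypothetical sequence of 2000 normally distributed points (an assumption, not experimental data) and compares the zero-shift autocorrelation with the largest value at any other shift.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
noise = rng.normal(0.0, 1.0, n)  # synthetic white noise, unit standard deviation

# Unnormalized autocorrelation for shifts -(n-1) ... (n-1)
acf = np.correlate(noise, noise, mode="full")
zero = n - 1                     # index of zero shift in the "full" output

peak = acf[zero]                 # sum of squares, about n times the per-point variance
off_peak = np.delete(acf, zero)  # all non-zero shifts
ratio = np.max(np.abs(off_peak)) / peak
```

For a finite sequence the off-peak values are small but not zero: they shrink relative to the peak roughly as the inverse square root of the number of points, which is why the relations hold strictly only for infinite n.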
Any noisy scan $s_1$ can be written in terms of the reference scan $s_0$ and a pure-noise sequence $\nu$ as: $s_1(t_i) = a\, s_0(t_i) + \nu(t_i)$, with $a$ a real number. In a worse-case scenario, $s_1$ may be shifted (for simplicity by an integer index r), that is: $s_1(t_i) = a\, s_0(t_{i-r}) + \nu(t_i)$. Then:
$$C_{01}(k) = a\, C_{00}(k - r) \tag{5}$$
note that $C_{01}$ is proportional to $C_{00}$ but shifted by r as expected; this is the argument invoked in Ref. [5] to associate the position of the maximum of the cross-correlation function with the shift of the peak being sought. In our work [1] it was not necessary to include a shift; while Equation (5) could be used to estimate the signal amplitude $a$, it is preferable to use the autocorrelation instead:
$$C_{11}(k) = a^2\, C_{00}(k) + \sigma^2\, \delta_{k0} \tag{6}$$
(note that the shift by r cancels here). Equation (6) tells us in particular that all autocorrelation sequences should be the same except for a scale factor $a^2$ and a sharp noise peak at $k = 0$. Visual inspection shows that this is true for a number of scans which we consider good; other scans exhibit a narrower, or structured, sometimes negative, autocorrelation, probably indicating correlated noise, i.e., a drift during the measurement; the rest only exhibit the noise peak, indicating no signal at all (Figure 1). From now on we will consider $C_{00}$ to be noiseless and, as empirically found, satisfactorily approximated by a gaussian; we derive the width of the gaussian from our best scan (Figure 1).
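The separation of the autocorrelation into a broad signal part and a sharp noise peak can be illustrated as follows. This is a sketch under stated assumptions, not the authors' fitting procedure: the gaussian width is taken as known and fixed, so that the only free parameter is the height, which is obtained by linear least squares over all non-zero shifts. The function names, the synthetic reference peak, the amplitude 0.5, and the noise level are all hypothetical.

```python
import numpy as np

def autocorrelation(s):
    """Unnormalized autocorrelation for shifts -(n-1) ... (n-1)."""
    s = np.asarray(s, dtype=float)
    n = len(s)
    return np.arange(-(n - 1), n), np.correlate(s, s, mode="full")

def split_signal_noise(s, width):
    """Fit A * exp(-k^2 / (2 width^2)) to the autocorrelation, excluding k = 0.

    Returns (A, noise_peak): A estimates the signal part at zero shift,
    noise_peak is the excess of the measured zero-shift value over A.
    """
    k, acf = autocorrelation(s)
    g = np.exp(-0.5 * (k / width) ** 2)
    mask = k != 0  # the zero-shift point carries the noise peak, so leave it out
    amp = np.dot(g[mask], acf[mask]) / np.dot(g[mask], g[mask])  # linear least squares
    noise_peak = acf[k == 0][0] - amp
    return amp, noise_peak

# Synthetic example (hypothetical data)
rng = np.random.default_rng(0)
t = np.arange(200)
s0 = np.exp(-0.5 * ((t - 100) / 5.0) ** 2)  # noise-free reference, standard deviation 5
a = 0.5
s1 = a * s0 + rng.normal(0.0, 0.05, t.size)  # scaled reference plus white noise

# The autocorrelation of a gaussian of standard deviation w is a gaussian
# of standard deviation w * sqrt(2).
width = 5.0 * np.sqrt(2.0)
amp, noise_peak = split_signal_noise(s1, width)
```

Here `amp` should come out close to the square of the amplitude times the zero-shift autocorrelation of the reference, and `noise_peak` collects what remains at zero shift.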
Our choice of a gaussian is purely empirical, and our method does not critically depend on it: to the extent that all good scans have the same shape of the autocorrelation function, Equation (6), the best fit parameters will be the same for all good scans, except for a scale factor for the height. What is important is that the fitting function provides a reasonable approximation of the zero-shift value of the signal term in Equation (6), i.e., of the peak value of the signal component of the autocorrelation.
It is nevertheless instructive to discuss some limiting cases for the shape of the autocorrelation function: we begin by noting that the expected shape of a pump-probe signal such as that of our experiment is an exponential decay (we ignore the possible complication brought on by the presence of more than one decay constant) convolved with the instrumental resolution (in our case a gaussian, coming from the finite duration of the FEL and UV pulses). The respective autocorrelation functions are an exponential and a gaussian. For the case of non-negligible instrumental resolution the autocorrelation will resemble a gaussian near zero shift, but will have broader wings, decaying exponentially rather than as a gaussian. An early truncation of the scan will clip the wings of the autocorrelation function. In our case an early truncation does somewhat contribute to determining the shape of the autocorrelation function, but does so equally for all scans (the pump-probe delay was scanned in reverse, and point index
in
Figure 1 consistently corresponds to the maximum value of the delay).
Let us finally consider two possible sources of artefacts, namely a constant, or a linearly drifting, baseline (remembering that the scans are padded with zeros outside of the delay range scanned). This would primarily contribute a slowly decaying component to the autocorrelation signal (of width comparable to that of the scan itself). Let us however note that because the scans are acquired by rapid double-background subtraction (Equation (9) and Figure 4), we expect a complete baseline cancellation. Finally, a drifting signal would alter the shape of the peak and consequently of its autocorrelation function: assuming an even drift that causes a loss of the optimal experimental conditions (temporal or spatial overlap; quality of the focus; resonance wavelength), one can speculate that its main effect would be a reduction of the measured width.
3. Discussion
Given a set of scans $s_j$, their weighted average is
$$\bar{s}(t_i) = \frac{\sum_j w_j\, s_j(t_i)}{\sum_j w_j} \tag{7}$$
$\bar{s}$ is the quantity reported in our work [1] (note, for the sake of exactness, that in Figure 2a therein the unity baseline was not subtracted); we want to determine which scans to include in the average, and their weights $w_j$.
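A weighted average of scans, with rejected scans simply given zero weight, can be sketched as below. The function name `weighted_average`, the array layout (one row per scan), and the numbers are assumptions made for the illustration; the normalization by the sum of the weights is the usual convention for a weighted average.

```python
import numpy as np

def weighted_average(scans, weights):
    """Weighted average over scans; a weight of zero excludes a scan.

    scans   : array of shape (n_scans, n_delays)
    weights : array of shape (n_scans,), not all zero
    """
    scans = np.asarray(scans, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return np.average(scans, axis=0, weights=weights)

# Hypothetical example: three scans of two delay points each;
# the third scan is rejected by giving it zero weight.
scans = [[1.0, 2.0],
         [3.0, 4.0],
         [100.0, 100.0]]
weights = [1.0, 3.0, 0.0]
avg = weighted_average(scans, weights)
```

Treating rejection as a zero weight keeps selection and weighting in a single step, so the same routine serves both purposes.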
From Equation (6) and Figure 1b we can estimate the signal-to-noise ratio as:
A simple analysis of its trend over the course of the experiment reveals some regularities that we exploit to qualitatively classify scans, and to define a quantitative criterion that we adopt to accept or reject them. When the scans are ordered in the sequence in which they were acquired (Figure 2a), no obvious trend is visible for the SNR, but one does note a large number of scans with
: they are either those for which a gaussian fit of the autocorrelation was not successful, or those which have been excluded
a priori (e.g., because the scan was aborted). The ordering of the scans by increasing SNR (Figure 2b) reveals a gap between scans with
and
. We cannot find an obvious reason for this gap; because this was one of the first resonant two-photon experiments performed at FERMI we can speculate a threshold behavior of some of the less-controlled parameters of the FEL (peak intensity; second harmonic content). In any case, we decided to use the condition
as a discriminant to include a scan in the average.
Let us now come to the weight with which each accepted scan enters the average. We show in
Appendix A that the weight which maximizes the
SNR of
is
; in Ref. [1] we used
, which gives a slightly worse result, although the difference is not significant. We presume that the latter fact depends on the limited number of samples, the predominance of a few of them (see Figure 2a), and the low dispersion of the
. In both cases we observe an improvement of the signal-to-noise ratio of
by a factor ≈2.25 relative to the best single scan.
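For orientation, the standard textbook argument for SNR-optimal weights can be sketched as follows; it is not necessarily the derivation of Appendix A. The symbols are assumptions introduced here: each accepted scan $j$ is taken to be a scaled copy of a common noise-free shape with amplitude $a_j$, plus independent white noise of per-point variance $\sigma_j^2$. A weighted sum with weights $w_j$ then has signal amplitude $\sum_j w_j a_j$ and noise variance $\sum_j w_j^2 \sigma_j^2$, so that

$$\mathrm{SNR}^2 = \frac{\left(\sum_j w_j a_j\right)^2}{\sum_j w_j^2 \sigma_j^2} = \frac{\left(\sum_j (w_j \sigma_j)\,(a_j/\sigma_j)\right)^2}{\sum_j w_j^2 \sigma_j^2} \leq \sum_j \frac{a_j^2}{\sigma_j^2}$$

by the Cauchy-Schwarz inequality, with equality when $w_j \sigma_j \propto a_j / \sigma_j$, that is

$$w_j \propto \frac{a_j}{\sigma_j^2}.$$

Under these assumptions the squared SNR of the optimally weighted average is the sum of the squared SNRs of the individual scans, and the overall normalization of the weights does not affect the result.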
Unfortunately, at the time of the experiment we were not anticipating the need for this test, so we are not able to perform further checks on which parameters and events most affected our experiment. Likewise, our primary goal was to identify a simple unbiased method of data selection, and we do not attempt a systematic analysis of its merits and limitations, for which we refer the reader to the vast existing literature ([6,7] and references therein); we simply note that the method is most easily applicable to the case of white noise and single or well-separated peaks, although generalizations to other noise [6] or multiple peaks [8] have been discussed. Let us finally note that because the autocorrelation is the Fourier Transform of the power spectrum (Wiener-Khinchin Theorem), the same determination of the signal-to-noise ratio could have been accomplished by Fourier Transform of each scan (in the frequency domain, the white-noise component appears as a constant).
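The equivalence between the autocorrelation and the Fourier Transform of the power spectrum can be verified numerically. The sketch below uses a hypothetical synthetic scan (a gaussian peak plus white noise) and compares the direct zero-padded autocorrelation with the one obtained through the FFT; the padding length is an implementation choice needed to avoid wrap-around.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 128
t = np.arange(n)
# Synthetic scan (hypothetical data): gaussian peak plus white noise
scan = np.exp(-0.5 * ((t - 40) / 6.0) ** 2) + rng.normal(0.0, 0.05, n)

# Direct autocorrelation of the zero-padded scan, shifts -(n-1) ... (n-1)
acf_direct = np.correlate(scan, scan, mode="full")

# Same quantity from the power spectrum. The FFT implies a circular
# correlation, so the scan is padded to at least 2n - 1 points.
m = 2 * n
power = np.abs(np.fft.rfft(scan, m)) ** 2  # power spectrum of the padded scan
acf_circ = np.fft.irfft(power, m)          # circular shifts 0 ... m-1
# Reorder: the last n-1 points hold the negative shifts -(n-1) ... -1
acf_fft = np.concatenate((acf_circ[-(n - 1):], acf_circ[:n]))

max_diff = np.max(np.abs(acf_direct - acf_fft))
```

In `power`, the white-noise contribution shows up as an approximately flat floor under the low-frequency signal component, which is the frequency-domain counterpart of the sharp zero-shift peak in the autocorrelation.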
Despite its simplicity and limitations, we believe that our test can be further characterized and profitably applied in future experiments. Let us note that the original and most immediate application of the method addresses a situation common to the beginning of many experiments, namely the need to make a weak signal visible when the expected signal shape is not precisely known in advance.