Analysis Strategies for MHz XPCS at the European XFEL

Abstract: The nanometer length scale holds precious information on several dynamical processes that develop from picoseconds to seconds. In the past decades, X-ray scattering techniques have been developed to probe the dynamics at such length scales on either ultrafast (sub-nanosecond) or slow ((milli-)second) time scales. With the start of operation of the European XFEL, thanks to the MHz repetition rate of its X-ray pulses, even the intermediate µs range has become accessible. Measuring dynamics on such fast time scales requires the development of new technologies such as the Adaptive Gain Integrating Pixel Detector (AGIPD). µs-XPCS is a promising technique to answer many scientific questions, especially for soft condensed matter systems, but the procedure to obtain reliable results is still not straightforward and requires several additional steps compared to experiments at synchrotron storage rings. Here we discuss challenges and possible solutions for performing XPCS experiments with the AGIPD at the European XFEL. We present our data pipeline and benchmark the results obtained at the MID instrument with a well-known sample composed of silica nanoparticles dispersed in water.


Free-electron laser facilities in the hard X-ray regime (XFELs) bear the potential for studying molecular dynamics utilizing time-domain methods such as X-ray photon correlation spectroscopy (XPCS) and the related technique X-ray speckle visibility spectroscopy (XSVS). Both techniques are based on coherent X-rays and enable probing dynamics between femtoseconds and several hours. These techniques have been developed at synchrotron radiation sources since the 1990s [1-4]. Applications cover a broad range of materials and scientific questions, such as diffusion dynamics in soft matter, glass transition and gelation, as well as domain-wall dynamics; see [5] for a recent overview.

In XPCS experiments the sample dynamics are studied by acquiring series of coherent diffraction patterns, so-called speckle patterns. The intensity fluctuations of the speckles reflect the change of the spatial arrangement of the sample, where the length scale is selected by choosing a particular momentum transfer $q \equiv |\mathbf{q}| = \frac{4\pi}{\lambda}\sin(\theta/2)$ (with wavelength $\lambda$ and scattering angle $\theta$). The intensities $I(q, t)$ and $I(q, t + \Delta t)$ at two different times with a lag time of $\Delta t$ are recorded to calculate intensity-intensity correlation functions

$$g_2(q, \Delta t) = \frac{\langle I(q, t)\, I(q, t + \Delta t) \rangle}{\langle I(q, t) \rangle^{2}}. \qquad (1)$$

The average denoted by $\langle \dots \rangle$ is performed over both detector pixels with equivalent q-values and all times $t$. The correlation function $g_2$ can be expressed in terms of the intermediate scattering function $f(q, \Delta t)$ as

$$g_2(q, \Delta t) = 1 + \beta\, |f(q, \Delta t)|^{2},$$

with the speckle contrast $\beta$ that is determined by the coherence properties of the beamline [5]. While XPCS at storage-ring sources typically covers dynamics in the range between hours and (sub-)millisecond time scales, its main applications at XFEL facilities are fast time scales in the femto- to nanosecond domain [6-8], using either double-pulse approaches via split-and-delay devices or modification of the X-ray pulse length between a few and about 100 fs.
The apparent gap of time scales between nano- and milliseconds originates from the limitations of the time resolution of two-dimensional [...]

In this work we present the XPCS data pipeline that we developed and explain the various steps that are required to perform an XPCS experiment with the AGIPD at the European XFEL instruments. We distinguish between two different aspects of the data treatment. First, we explain how the raw data should be calibrated to provide the best possible data quality of single speckle images. Second, we discuss how the standard XPCS analysis (cf. Eq. (1)) needs to be modified to correct detector artifacts.

Finally, we compare different data calibration and analysis methods to benchmark the different approaches. A summary of our data pipeline is sketched in Figure 1. A detailed explanation of the individual steps follows in the next sections.

Figure 1. Summary of the data pipeline. Data collection, calibration, and photonization (see section 3.1). Computation of correlation matrices and cross-correlation matrices (see section 3.2). Subtraction of the average cross-correlation matrix from the correlation matrix results in the corrected correlation matrix. Calculating the time average of the corrected two-time correlation matrix yields the autocorrelation functions. Here the x-axis is the delay time.

As a model system with well-known dynamic properties, we use silica nanoparticles dispersed in water. The samples were produced such that their intrinsic dynamics match the MHz time scale of the European XFEL. The colloidal silica nanoparticles were synthesized with a modified Stöber method [23]. We used a particle concentration of 2.5 wt%, chosen to ensure a system in which particle-particle interactions are negligible while still providing speckle patterns with sufficient intensity. Monodisperse dilute colloids are extremely useful for calibration purposes because their intermediate scattering function is described by a simple exponential relaxation, $f(q, \Delta t) = \exp(-\Gamma(q)\, \Delta t)$, where $\Gamma(q) = D_0 q^{2}$ and $D_0$ is the Stokes-Einstein diffusion constant.
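To get a feeling for the time scales implied by the simple exponential relaxation of a dilute colloid, the following minimal sketch evaluates the Stokes-Einstein relation $D_0 = k_B T / (6 \pi \eta R)$ and the resulting $\Gamma(q) = D_0 q^{2}$. The particle radius is taken from the value reported in the text; the temperature, the water viscosity, the chosen q-value, and the assumption that the hydrodynamic radius equals the fitted radius are illustrative assumptions only and not values from the experiment.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)


def stokes_einstein_d0(radius_m, temperature_k, viscosity_pa_s):
    """Free diffusion coefficient D0 = k_B T / (6 pi eta R) in m^2/s."""
    return K_B * temperature_k / (6.0 * math.pi * viscosity_pa_s * radius_m)


def relaxation_rate(d0, q_inv_m):
    """Relaxation rate Gamma(q) = D0 q^2 in 1/s."""
    return d0 * q_inv_m ** 2


def g2_minus_one(delay_s, gamma, contrast):
    """beta * |f|^2 with f = exp(-Gamma * dt), i.e. beta * exp(-2 Gamma dt)."""
    return contrast * math.exp(-2.0 * gamma * delay_s)


# Illustrative inputs (only the radius is taken from the text).
radius = 33.65e-9      # m, particle radius quoted in the text
temperature = 293.15   # K, assumed room temperature
viscosity = 1.0e-3     # Pa s, approximate viscosity of water near 20 C
q = 0.1e9              # 1/m, i.e. 0.1 nm^-1 (assumed example value)

d0 = stokes_einstein_d0(radius, temperature, viscosity)
gamma = relaxation_rate(d0, q)

print(f"D0    = {d0:.3e} m^2/s")
print(f"Gamma = {gamma:.3e} 1/s, 1/Gamma = {1e6 / gamma:.1f} us")
```

With these assumed inputs the relaxation time $1/\Gamma$ falls in the microsecond range, which is why such a sample is a convenient benchmark for MHz pulse trains.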

The colloidal dispersions were filled into thin-walled quartz capillaries with an outer diameter of 1.5 mm that were sealed afterwards and placed in a specifically designed sample holder. The experiment was performed in air at the MID instrument. A detailed description of the instrument can be found in [20].

The intensity of the X-ray pulses was measured on a single-pulse basis with a gas monitor placed upstream in the beamline. The beam size at the sample position was about 10 µm × 10 µm, obtained by focusing the beam with compound refractive lenses (CRLs). The X-ray flux was controlled with stacks of chemical vapour deposited (CVD) diamond windows. For the data reported here the total thicknesses ranged from 4.7 mm to 2.7 mm.

Taking the beamline and air transmission into account, we found that the X-ray intensity measured by the gas monitor is a factor of $4.1 \times 10^{3}$ larger than the intensity on the sample. This factor depends on the actual setup and was measured to determine the incoming intensity accurately.

The AGIPD installed at MID is composed of four quadrants, each consisting of [...]

For the sake of visual clarity, we limit the figure to six images. In principle, 352 images per train can be acquired with the AGIPD. The data are stored in individual memory cells for each X-ray pulse that arrives at the detector. ∆t is the delay time between successive pulses. In the sketch, the first six memory cells of one pixel are shown. The intensity is measured in analogue-to-digital units (ADUs) and then converted to the number of photons (photonization, see Figure 3).
A precise calibration of the AGIPD geometry is achieved with the aid of a previous [...]

Taking into account the different resolutions, the average radius is found to be $R_0 = (33.6 \pm 0.1)$ nm and $R_0 = (33.7 \pm 0.1)$ nm for the Eiger 4M and AGIPD, respectively. The form factor of spherical particles with radius $R_0 = 33.65$ nm is shown in black.
A measurement (at XFEL also denoted a run) gathers data over a certain time [...] identification number associated with each recorded pattern [25] or using the software extra_data developed by XFEL [26].

Strictly speaking, Eq. (1) only holds for stationary dynamics, i.e., when the dynamics depend only on the delay time and not on the absolute time during the measurement. To measure time-dependent sample dynamics and catch time-dependent detector artifacts, we calculate two-time correlation functions (TTCs) [5,27-29]: [...] where $\langle \dots \rangle_{sp}$ denotes the ensemble average over speckles in the same q-ROI and [...]
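As an illustration of how a two-time correlation matrix can be evaluated for one pulse train, the sketch below uses NumPy. It assumes the normalized-fluctuation form $\langle \delta I(t_1)\, \delta I(t_2) \rangle_{sp} / (\langle I(t_1) \rangle_{sp} \langle I(t_2) \rangle_{sp})$, chosen to mirror the cross-correlation matrices defined later in the text; the exact definition used in the pipeline may differ. The function names and the synthetic Poisson data are hypothetical.

```python
import numpy as np


def two_time_correlation(intensity):
    """Two-time correlation matrix for one q-ROI and one train.

    intensity: array of shape (n_times, n_pixels), one row per pulse.
    Returns TTC[t1, t2] = <dI(t1) dI(t2)>_sp / (<I(t1)>_sp <I(t2)>_sp),
    where <...>_sp is the average over the pixels of the ROI.
    Assumes every frame has a non-zero mean intensity.
    """
    intensity = np.asarray(intensity, dtype=float)
    mean = intensity.mean(axis=1)          # <I(t)>_sp for each pulse
    fluct = intensity - mean[:, None]      # dI(t) = I(t) - <I(t)>_sp
    n_pix = intensity.shape[1]
    numerator = fluct @ fluct.T / n_pix    # <dI(t1) dI(t2)>_sp
    return numerator / np.outer(mean, mean)


def g2_from_ttc(ttc):
    """Average the TTC along lines of constant lag t2 - t1 = k, k = 1..n-1."""
    n = ttc.shape[0]
    return np.array([np.diagonal(ttc, offset=k).mean() for k in range(1, n)])


# Synthetic example: 6 pulses, 500 pixels of uncorrelated Poisson noise.
rng = np.random.default_rng(0)
frames = rng.poisson(lam=2.0, size=(6, 500))
ttc = two_time_correlation(frames)
g2_minus_one = g2_from_ttc(ttc)
print(ttc.shape, g2_minus_one.shape)
```

Averaging along the off-diagonals is only meaningful when the dynamics are stationary within the train; inspecting the full matrix first is what allows time-dependent artifacts to be spotted.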

In this section we explain the steps of our XPCS data pipeline in detail, starting with the [...] quickly fails as soon as the intensity decreases. Moreover, the information that can be obtained from the TTC is tainted by several spurious features, as can be seen in Figure 4 (a). The dynamics in that q-region are expected to be described by a fast exponential [...]

XFEL provides a simple way of calibrating the raw data, including many steps in addition to the dark subtraction [ref]. A dataset to which the XFEL calibration was applied will be referred to as processed data in the following. Processing of AGIPD data has also [...] ADUs. Normalizing the data by the single-photon value and defining the width of the bins, we can identify how many photons were recorded in every pixel and memory cell.

The size of the bin will determine how selective our photonization procedure will be.

Depending on the photon energy, which defines the separation of the photon events in [...]
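The photonization idea described above (normalize by the single-photon value, then assign each pixel to a photon bin of a chosen width) can be sketched as follows. This is only a minimal illustration: the single-photon value of 70 ADU, the bin half-width, and the choice to set values that fall outside every bin to zero photons are assumptions made for the example, not parameters or rules taken from the pipeline.

```python
import numpy as np


def photonize(adu, adu_per_photon, half_width=0.5):
    """Convert calibrated ADU values to integer photon counts.

    adu:            array of pixel values in ADU (any shape).
    adu_per_photon: ADU value of a single photon (assumed known).
    half_width:     half-width of the acceptance bin around each integer
                    photon number, in units of one photon. Smaller values
                    make the procedure more selective.
    Values outside all bins, and non-positive values, are set to 0
    (an assumption of this sketch).
    """
    scaled = np.asarray(adu, dtype=float) / adu_per_photon
    nearest = np.rint(scaled)                        # closest photon number
    inside = np.abs(scaled - nearest) <= half_width  # within the bin?
    photons = np.where(inside & (nearest > 0), nearest, 0)
    return photons.astype(int)


# Hypothetical pixel values in ADU for a single memory cell.
adu = np.array([2.0, 68.0, 95.0, 140.0, -10.0])
photons = photonize(adu, adu_per_photon=70.0, half_width=0.25)
print(photons)
```

A narrow bin rejects ambiguous values that lie between two photon peaks at the cost of discarding some genuine events, which is the trade-off the bin size controls.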

Corrections on the dynamical quantities
Despite the correction of single patterns, we have not yet reached an optimal result, as the value of the correlation at large lag times, visible also from the $g_2(t) - 1$ in Figure 4 (d) (green triangles), is still well above 0. This additional baseline originates from the static variance of the recorded intensity in the investigated q-ROI. It can have several origins, such as the natural change of the intensity as a function of q, small misalignments of the modules, or other artifacts which have yet to be clearly identified. However, this non-zero baseline is not the only concern: in low-intensity regions the data are much more exposed to the appearance of "hot pixels" or other similar defects, which usually affect the TTC with strongly correlated streaks or blocks. Figure 5 (a) reports an example of such artefacts, with a typical block of 32 memory cells that shows a very high, and unphysical, correlation. To clarify, this does not mean that all those memory cells are affected, but only those memory cells of a sub-group of the pixels in the investigated q-ROI. On conventional detectors it would be possible to identify and mask such pixels, but in the present case such effects can also be observed on different memory cells in different runs, suggesting a variability which is not yet completely under control. Luckily, there are methods to tackle this problem thanks to the cross-correlation matrices introduced in the previous section. In fact, by correlating neighbouring trains, a precise estimate of such a baseline can be obtained, because illumination and detector conditions are very similar but the speckle patterns are completely uncorrelated. For a given train $tr_n$, two cross-correlation matrices are then calculated as

$$X_{-1} = \frac{\langle \delta I(tr_{n-1}, t_1)\, \delta I(tr_n, t_2) \rangle}{\langle I(tr_{n-1}, t_1) \rangle \langle I(tr_n, t_2) \rangle}$$

between $tr_{n-1}$ and $tr_n$, and

$$X_{+1} = \frac{\langle \delta I(tr_n, t_1)\, \delta I(tr_{n+1}, t_2) \rangle}{\langle I(tr_n, t_1) \rangle \langle I(tr_{n+1}, t_2) \rangle}$$

between $tr_{n+1}$ and $tr_n$.
The cross-correlations are then made symmetric with $XT_{-1} = (X_{-1} + X_{-1}^{T})/2$, and finally the correction matrix is defined by [...] Finally, simply subtracting $XC$ from the TTC corrects the baseline and some of the more common artefacts, as can be seen in the corrected TTC shown in Figure 5 [...]
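The baseline correction with neighbouring trains can be sketched in a few lines of NumPy. The cross-correlation and the symmetrization follow the expressions given in the text. The definition of the correction matrix $XC$ is not reproduced here, so this sketch assumes it is the mean of the two symmetrized cross-correlation matrices, in line with the "average cross-correlation matrix" mentioned in the pipeline summary. Function names and test data are hypothetical.

```python
import numpy as np


def cross_correlation(train_a, train_b):
    """X[t1, t2] = <dI_a(t1) dI_b(t2)> / (<I_a(t1)> <I_b(t2)>).

    train_a, train_b: arrays of shape (n_times, n_pixels) for the same
    q-ROI; averages run over the pixels. Passing the same train twice
    gives the two-time correlation matrix of that train.
    """
    a = np.asarray(train_a, dtype=float)
    b = np.asarray(train_b, dtype=float)
    mean_a, mean_b = a.mean(axis=1), b.mean(axis=1)
    da = a - mean_a[:, None]
    db = b - mean_b[:, None]
    return (da @ db.T) / a.shape[1] / np.outer(mean_a, mean_b)


def symmetrize(x):
    """XT = (X + X^T) / 2."""
    return 0.5 * (x + x.T)


def corrected_ttc(prev_train, train, next_train):
    """TTC of `train` minus the correction matrix XC."""
    ttc = cross_correlation(train, train)
    xt_minus = symmetrize(cross_correlation(prev_train, train))
    xt_plus = symmetrize(cross_correlation(train, next_train))
    xc = 0.5 * (xt_minus + xt_plus)  # ASSUMPTION: XC is the plain average
    return ttc - xc


# Purely static pattern in three consecutive trains: the whole TTC is
# baseline, so the corrected matrix should vanish.
pattern = np.array([1.0, 3.0, 2.0, 6.0])
static_train = np.tile(pattern, (5, 1))
residual = corrected_ttc(static_train, static_train, static_train)
print(np.abs(residual).max())
```

Because speckles in neighbouring trains are uncorrelated, anything that survives in the cross-correlation is attributed to static intensity variations or detector effects, and subtracting it leaves the sample dynamics.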

The situation presented in the previous paragraph is a straightforward example of the advantages that come with the "photonization" step that eliminates correlated noisy [...]

Figure 6. Practical example of the advantages offered by the "photonization" of the speckle patterns. In (a) the $g_2(t) - 1$ obtained from the photonized patterns are reported along with their respective exponential fits (black lines). For comparison, (b) reports the $g_2(t) - 1$ from the same run but without the photonization step. In (c) the relaxation rates obtained with simple exponential fits to the data are shown. The data points used for the fit of $D_0$ are marked as filled symbols, the black continuous line is the fitted function, and the black dash-dotted line is the extrapolation to the whole q-range.
[...] the separation of the photon peaks and the accuracy of the photonization. Another [...] the beamline or the sample and the intensity of the speckle patterns will be weaker for the same fluxes. Furthermore, the speckle size will be smaller, which will reduce the contrast and the signal-to-noise ratio. In general, for very low intensity experiments, any method that lets the AGIPD operate with precise single-photon resolution, whether an even more refined calibration pipeline or different experimental conditions, should be pursued.

With this paper we showed the key aspects of sequential XPCS experiments that make use of the unique pulse structure of the European XFEL. We outlined the unique