The SFX datasets described here were collected from lysozyme and myoglobin microcrystals at the SPB/SFX instrument [8] of EuXFEL, using XFEL pulse rates of up to 1.129 MHz. We used bunch patterning [9] to collect data effectively at 90 Hz rather than at the 10 Hz train repetition rate. Each X-ray pulse train was divided into 9 “wagons” separated by ~13–14 µs. This interval is significantly longer than the 4–5 µs needed for a jet flowing at 40–50 m s−1 to replace the ~200 µm free-standing jet segment (from the nozzle to the interaction region) destroyed by the last XFEL pulse of the preceding wagon. Thus, between wagons, the full length of the jet was replenished, along with any possibly damaged crystals. Further upstream of the free-standing jet, in the GDVN nozzle meniscus region, the liquid diameter increases from the ~5 µm jet diameter to the 75 µm diameter of the sample capillary. As a shock traverses this meniscus, energy conservation ensures that the associated pressure jump diminishes rapidly with penetration into the meniscus. Moreover, the pressure jump has already decreased significantly before the shock reaches the meniscus, because it damps exponentially with travel distance within the jet, as shown by Blaj et al. [3]. Consequently, samples within the meniscus at the time of shock wave passage can be considered damage-free. By this criterion, a separation of 13–14 µs between consecutive X-ray pulses ensured that data were collected from samples undamaged by shock waves. Each wagon consisted of four consecutive pulses separated by 0.886 µs (~1.129 MHz repetition rate), followed by a fifth pulse after 1.772 µs (~0.564 MHz) (Figure 1).
In total, 10,087,200 diffraction images were collected: 3,766,500 from lysozyme and 6,320,700 from myoglobin crystals. Of those images, 491,120 (13.0%) were hits for lysozyme and 425,819 (6.7%) for myoglobin. The final number of indexed diffraction patterns is 315,157 (64.2% of hits) for lysozyme and 194,034 (45.6% of hits) for myoglobin, with the resolution limit of the Monte Carlo integrated data being 1.8 Å in both cases. We deposited the images containing hits of lysozyme and myoglobin microcrystals in the Coherent X-ray Imaging Data Bank (CXIDB) [24] under CXIDB ID 144 (http://cxidb.org/id-144.html). In addition to the deposited diffraction data, we provide experiment metadata, such as the pulse energy and images of the sample jet, for further analysis.
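The timing argument and the hit and indexing fractions quoted above can be checked with a few lines of arithmetic. The following Python sketch uses only numbers given in the text; the variable and function names are illustrative, and reading the 90 Hz figure as 9 wagons per train times 10 trains per second is an assumption of this sketch.

```python
# Sanity check of the timing and counting figures quoted in the text.
# Every input is a number from the text; all names are illustrative.

JET_SEGMENT_M = 200e-6                         # ~200 µm free-standing jet segment
JET_SPEEDS_M_S = (40.0, 50.0)                  # jet velocity range
PULSE_GAPS_US = (0.886, 0.886, 0.886, 1.772)   # spacings between the 5 pulses of a wagon
WAGONS_PER_TRAIN = 9
TRAIN_RATE_HZ = 10


def replenish_time_us(length_m, speed_m_s):
    """Time (µs) for the jet to advance by length_m at speed_m_s."""
    return length_m / speed_m_s * 1e6


def rate_mhz(gap_us):
    """Repetition rate (MHz) corresponding to a pulse spacing given in µs."""
    return 1.0 / gap_us


def percent(part, whole):
    """Fraction part/whole expressed in percent."""
    return 100.0 * part / whole


replenish_us = [replenish_time_us(JET_SEGMENT_M, v) for v in JET_SPEEDS_M_S]
wagon_span_us = sum(PULSE_GAPS_US)             # first to fifth pulse of one wagon
# Assumption: "90 Hz" means 9 wagons per train at 10 trains per second.
effective_rate_hz = WAGONS_PER_TRAIN * TRAIN_RATE_HZ

counts = {
    "lysozyme": {"images": 3_766_500, "hits": 491_120, "indexed": 315_157},
    "myoglobin": {"images": 6_320_700, "hits": 425_819, "indexed": 194_034},
}
total_images = sum(c["images"] for c in counts.values())

print("replenishment time (µs):", [round(t, 2) for t in replenish_us])
print("wagon span (µs):", round(wagon_span_us, 3))
print("effective rate (Hz):", effective_rate_hz)
for name, c in counts.items():
    print(name,
          "hit rate %:", round(percent(c["hits"], c["images"]), 1),
          "indexing rate %:", round(percent(c["indexed"], c["hits"]), 1))
```

The replenishment time of 4–5 µs is well below the ~13–14 µs gap between wagons, and the indexing percentages reproduce the quoted values when computed relative to the number of hits.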
3.1. Analysis of all Crystal Hits
As in previous work [4], we first merged and sorted SFX patterns according to their position within X-ray wagons, then compared “shock-free” and “shocked” datasets. With an increasing number of preceding pulses, and thus increasing shock exposure, the resolution appeared unchanged (indicating no damage, Figure 2a), yet the indexing rate appeared to decrease (indicating damage, Figure 2b). This apparent contradiction prompted a detailed examination of all metadata, which revealed numerous intertwined issues.
Experimental properties (pulse energy, detector behavior, etc.) were found to vary systematically pulse-by-pulse within wagons, as well as wagon-by-wagon within trains. The higher pulse energies of the last pulses in each wagon (Figure 2c) result in an increase in the mean intensity of a diffraction image, which in turn leads to an increased number of diffraction peaks (Figure 2e). At the same time, however, the fraction of images containing an unusually large number of diffraction peaks increases, which translates into an increase in the number of diffraction images with substantial noise (Figure 2d,e) and thereby lowers the indexing rate (Figure 2b). The apparent decrease of the indexing rate along the wagon is therefore an artefact generated by the intertwining of these dependencies.
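Sorting patterns by pulse position, and checking whether a trend in indexing rate tracks pulse energy or peak count, can be illustrated with a small grouping routine. This is a minimal sketch and not the analysis code used for this work: the `Hit` record, its field names, and the toy values are hypothetical, chosen only to show how the quantities can be tabulated side by side per pulse position.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Hit:
    """One crystal hit (hypothetical record layout)."""
    wagon: int              # wagon number within the train, 1..9
    pulse_in_wagon: int     # pulse position within the wagon, 1..5
    pulse_energy_mj: float  # pulse energy for this shot
    n_peaks: int            # number of peaks found in the image
    indexed: bool           # whether the pattern could be indexed


def per_position_summary(hits):
    """Tabulate indexing rate, mean pulse energy and mean peak count per pulse position."""
    groups = defaultdict(list)
    for h in hits:
        groups[h.pulse_in_wagon].append(h)
    summary = {}
    for pos in sorted(groups):
        g = groups[pos]
        summary[pos] = {
            "n_hits": len(g),
            "indexing_rate": sum(h.indexed for h in g) / len(g),
            "mean_pulse_energy_mj": mean(h.pulse_energy_mj for h in g),
            "mean_n_peaks": mean(h.n_peaks for h in g),
        }
    return summary


# Toy data (made up): the fifth pulse has higher energy, and one of its images
# carries an unusually large number of peaks and fails to index.
toy_hits = [
    Hit(wagon=1, pulse_in_wagon=1, pulse_energy_mj=0.8, n_peaks=40, indexed=True),
    Hit(wagon=1, pulse_in_wagon=1, pulse_energy_mj=0.8, n_peaks=45, indexed=True),
    Hit(wagon=1, pulse_in_wagon=5, pulse_energy_mj=1.2, n_peaks=60, indexed=True),
    Hit(wagon=1, pulse_in_wagon=5, pulse_energy_mj=1.2, n_peaks=300, indexed=False),
]
toy_summary = per_position_summary(toy_hits)
for pos, row in toy_summary.items():
    print(pos, row)
```

Tabulating the metadata next to the indexing rate makes it possible to see whether a drop along the wagon coincides with a change in pulse energy or peak count before it is attributed to shock damage.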
3.2. Analysis of Hits in Continuous Jets
When analyzing the femtosecond images of the jet, we noted dramatic large-scale motions of the jet (Figure 3). Jet instabilities have been known for a long time, but they were believed to occur on a relatively long timescale: when the jet was observed during data collection, it seemed to “jump” from time to time, requiring adjustments to ensure jet/X-ray overlap. Unexpectedly, on top of these slow motions, very fast jet movements also occur, as evidenced by our femtosecond jet imaging. This means that one cannot assume that, for any given shot, the jet was also intersected by the previous X-ray pulse. Hence, it is not clear whether a given crystal actually experienced a shock wave launched by a previous pulse, since this requires a continuous liquid jet column between the shots. A rigorous shock investigation therefore first requires an analysis of whether a continuous jet existed between subsequent XFEL exposures. We examined this by analyzing the intensity of the water ring in the diffraction images.
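One way to turn the water-ring criterion into a per-shot flag is to threshold the solvent scattering intensity and then count, for each pulse, how many earlier pulses in the same wagon also hit the jet. The sketch below assumes that a water-ring intensity per pulse has already been extracted from the images; the threshold value, the normalization by pulse energy, and all names are illustrative assumptions rather than the procedure used here.

```python
def jet_hit_flags(ring_intensities, pulse_energies, threshold):
    """Flag pulses whose energy-normalized water-ring intensity exceeds threshold.

    Assumption: dividing by pulse energy (which must be positive) removes the
    expected growth of solvent scattering with pulse energy.
    """
    return [i / e > threshold for i, e in zip(ring_intensities, pulse_energies)]


def previous_jet_hits(flags):
    """For each pulse in a wagon, count the earlier pulses that hit the jet."""
    counts = []
    seen = 0
    for hit in flags:
        counts.append(seen)
        seen += int(hit)
    return counts


def preceded_by_jet_hit(flags):
    """True for a pulse if the immediately preceding pulse hit the jet."""
    return [False] + list(flags[:-1])


# Toy wagon of five pulses (made-up numbers): the third pulse misses the jet.
toy_ring = [120.0, 115.0, 8.0, 130.0, 140.0]
toy_energy = [1.0, 1.0, 1.0, 1.0, 1.0]
toy_flags = jet_hit_flags(toy_ring, toy_energy, threshold=50.0)

print(toy_flags)
print(previous_jet_hits(toy_flags))
print(preceded_by_jet_hit(toy_flags))
```

In this toy wagon the fourth pulse hits the jet, but its immediate predecessor did not, so a crystal probed by the fourth pulse would not be counted as exposed to a shock from the third.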
We observed an increase in solvent scattering intensity with pulse energy, as expected. Moreover, we detected a correlation between the pulse energy at a given wagon position and the number of previous pulses also hitting the jet (Figure 4a). This is indicative of a better overlap/alignment of X-rays and jet, possibly due to higher beam stability. The increase in pulse energy in turn leads to an improvement in the mean resolution as a function of the number of shocks experienced (Figure 4b). To compensate for this effect, we selected from the indexed data only the cases where all five X-ray pulses in the wagon hit the jet, so that only data of comparable jet/X-ray alignment quality were included. Within this subset, we compared data only within wagons, as shown by way of example for wagon 1 in Figure 4c,d. The latter restriction excludes effects from the dramatic change in pulse energy over the whole pulse train (Figure 2c). For these cases, the indexing rate (Figure 4c) and mean crystal diffraction resolution (Figure 4d) show no dependence on pulse position. Only after carefully incorporating these additional constraints could we conclude that the data gave no evidence of shock-induced damage under our operating conditions. The corresponding data statistics are shown in Table 1.
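The selection described in this paragraph can be sketched as a two-step filter: keep only wagons in which all five pulses hit the jet, then compare the indexing rate by pulse position within a single wagon number. The record layout (`train`, `wagon`, `pulse`, `jet_hit`, `crystal_hit`, `indexed`) and the toy data are hypothetical; the sketch only illustrates the logic and is not the code used for the analysis.

```python
from collections import defaultdict

PULSES_PER_WAGON = 5


def wagons_with_full_jet_overlap(shots):
    """Return the (train, wagon) keys in which every pulse hit the jet."""
    jet_hits = defaultdict(set)
    for s in shots:
        if s["jet_hit"]:
            jet_hits[(s["train"], s["wagon"])].add(s["pulse"])
    return {key for key, pulses in jet_hits.items()
            if len(pulses) == PULSES_PER_WAGON}


def indexing_rate_by_pulse(shots, wagon):
    """Indexing rate per pulse position for one wagon number, restricted to
    wagons in which all five pulses hit the jet."""
    keep = wagons_with_full_jet_overlap(shots)
    n_hits = defaultdict(int)
    n_indexed = defaultdict(int)
    for s in shots:
        if s["wagon"] != wagon or (s["train"], s["wagon"]) not in keep:
            continue
        if s["crystal_hit"]:
            n_hits[s["pulse"]] += 1
            n_indexed[s["pulse"]] += int(s["indexed"])
    return {p: n_indexed[p] / n_hits[p] for p in sorted(n_hits)}


def make_wagon(train, wagon, jet, crystal, indexed):
    """Build five toy shot records from per-pulse flag lists."""
    return [
        {"train": train, "wagon": wagon, "pulse": p + 1,
         "jet_hit": jet[p], "crystal_hit": crystal[p], "indexed": indexed[p]}
        for p in range(PULSES_PER_WAGON)
    ]


# Toy data (made up). Train 1: all five pulses hit the jet.
# Train 2: the third pulse misses the jet, so the whole wagon is discarded.
toy_shots = (
    make_wagon(1, 1, [True] * 5,
               [True, False, False, False, True],
               [True, False, False, False, True])
    + make_wagon(2, 1, [True, True, False, True, True],
                 [False, False, False, False, True],
                 [False, False, False, False, False])
)
toy_rates = indexing_rate_by_pulse(toy_shots, wagon=1)
print(toy_rates)
```

Without the filter, the non-indexed hit from train 2 would lower the apparent indexing rate of the fifth pulse even though that wagon had no continuous jet overlap.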
The quality of diffraction data from protein crystals, including SFX data [4], is often judged by the strength of the anomalous signal of sulfur atoms. To this end, it is common to refine the protein structure and show an anomalous electron density map around methionine residues or disulfide bridges [4]. Because of the extensive data filtering and selection, there are <1000 indexed images per pulse position, which is fewer than is typically used for structure determination. However, the values of the established data quality indicators of the diffraction intensities (Table 1) are within the expected range, particularly in view of the relatively low number of merged images, which would not allow the Monte Carlo integration to fully converge. Moreover, the intensities of each individual dataset show a high correlation (~90%) with the intensities derived from the images of the entire wagon, which does contain sufficient images for Monte Carlo convergence.
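Comparing a per-pulse dataset with the whole-wagon dataset amounts to correlating merged intensities over their common reflections. The text does not specify the correlation measure; the sketch below assumes a Pearson correlation coefficient over reflections matched by Miller index, with hypothetical dictionaries mapping (h, k, l) to merged intensity and made-up toy values.

```python
from math import sqrt


def intensity_correlation(subset, reference):
    """Pearson correlation of merged intensities over common reflections.

    Both arguments map a Miller index tuple (h, k, l) to a merged intensity.
    """
    common = sorted(set(subset) & set(reference))
    if len(common) < 2:
        raise ValueError("need at least two common reflections")
    x = [subset[hkl] for hkl in common]
    y = [reference[hkl] for hkl in common]
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("intensities have zero variance")
    return sxy / sqrt(sxx * syy)


# Toy data (made up): the subset lacks one reflection and is on a different
# scale, which does not affect the Pearson coefficient.
toy_wagon = {(1, 0, 0): 100.0, (1, 1, 0): 250.0, (2, 0, 0): 40.0, (2, 1, 1): 400.0}
toy_pulse = {(1, 0, 0): 200.0, (1, 1, 0): 500.0, (2, 0, 0): 80.0}
toy_cc = intensity_correlation(toy_pulse, toy_wagon)
print(round(toy_cc, 6))
```

Because the coefficient is insensitive to overall scale, datasets merged from different numbers of images can be compared without first placing them on a common scale.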