Towards Realistic Simulations of Macromolecules Irradiated under the Conditions of Coherent Diffraction Imaging with an X-ray Free-Electron Laser

Biological samples are highly radiation sensitive. The rapid progress of their radiation damage prevents accurate structure determination of single macromolecular assemblies in standard diffraction experiments. However, computer simulations of the damage formation have shown that the radiation tolerance might be extended at very high intensities with ultrafast imaging such as is possible with the presently developed and operating x-ray free-electron lasers. Recent experiments with free-electron lasers on nanocrystals have demonstrated proof of the imaging principle at resolutions down to 1:6 Angstroms. However, there are still many physical and technical problems to be clarified on the way to imaging of single biomolecules at atomic resolution. In particular, theoretical simulations try to address an important question: How does the radiation damage progressing within an imaged single object limit the structural information about this object recorded in its diffraction image during a 3D imaging experiment? This information is crucial for adjusting pulse parameters during imaging so that high-resolution diffraction patterns can be obtained. Further, dynamics simulations should be used to verify the accuracy of the structure reconstruction performed from the experimental data. This is an important issue as the experimentally recorded diffraction signal is recorded from radiation-damaged samples. It also contains various kinds of background. In contrast, the currently used reconstruction algorithms assume perfectly coherent scattering patterns with shot noise only. In this review paper, we discuss the most important processes and effects relevant for imaging-related simulations that are not yet fully understood, or omitted in the irradiation description. We give estimates for their contribution to the overall radiation damage. In this way we can identify unsolved issues and challenges for simulations of x-ray irradiated single molecules relevant for imaging studies. They should be addressed during further development of these simulation tools.

Abstract: Biological samples are highly radiation sensitive. The rapid progress of their radiation damage prevents accurate structure determination of single macromolecular assemblies in standard diffraction experiments. However, computer simulations of the damage formation have shown that the radiation tolerance might be extended at very high intensities with ultrafast imaging such as is possible with the presently developed and operating x-ray free-electron lasers. Recent experiments with free-electron lasers on nanocrystals have demonstrated proof of the imaging principle at resolutions down to 1.6 Angstroms. However, there are still many physical and technical problems to be clarified on the way to imaging of single biomolecules at atomic resolution. In particular, theoretical simulations try to address an important question: How does the radiation damage progressing within an imaged single object limit the structural information about this object recorded in its diffraction image during a 3D imaging experiment? This information is crucial for adjusting pulse parameters during imaging so that high-resolution diffraction patterns can be obtained. Further, dynamics simulations should be used to verify the accuracy of the structure reconstruction performed from the experimental data. This is an important issue as the experimentally recorded diffraction signal is recorded from radiation-damaged samples.
It also contains various kinds of background. In contrast, the currently used reconstruction algorithms assume perfectly coherent scattering patterns with shot noise only. In this review paper, we discuss the most important processes and effects relevant for imaging-related simulations that are not yet fully understood, or omitted in the irradiation description. We give estimates for their contribution to the overall radiation damage. In this way we can identify unsolved issues and challenges for simulations of x-ray irradiated single molecules relevant for imaging studies. They should be addressed during further development of these simulation tools.
Keywords: free-electron laser; coherent diffraction imaging; simulation

Introduction
X-ray free-electron lasers (XFELs) are expected to open up new opportunities for structural studies of biological systems. Due to the low radiation tolerance of biological samples an accurate structure determination of single macromolecular assemblies is not possible with standard diffraction experiments. However, computer simulations of the damage formation have shown [1][2][3][4][5] that the radiation tolerance limit might be "extended" at very high dose intensities (10 11 -10 13 photons of 12 keV energy focused to 100 nm spot) with ultrafast imaging (pulse duration of 10 fs or shorter) such as is possible with the presently developed and operating XFELs (LCLS, SACLA, European XFEL) [6][7][8]. This new barrier of radiation tolerance indicates the possibility of recording images of single biological particles at high resolution without the need to amplify scattered radiation through Bragg reflections. This application of XFELs can have a tremendous impact on structural studies at both the molecular and cellular level with profound implications for biology and medicine. Recent experiments performed at FLASH [9,10] have demonstrated proof of the imaging principle, as well as the LCLS experiment with single mimivirus particle [11]. The LCLS measurements with nanocrystals [12][13][14][15][16] have demonstrated imaging at high resolution, down to 1.6 Ångstroms.
There are still many physical and technical problems that have to be clarified on the way to imaging of single biomolecules at atomic resolution [17,18]. During a single-shot experiment a deterioration of the sample is inevitable. Therefore one can collect enough information only by recording many 2D diffraction patterns measured with differently oriented, but otherwise identical replicas of the target. In what follows we restrict our considerations to such non-periodic reproducible objects which are the ultimate goal of the coherent diffraction imaging studies in structural biology. Theoretical simulations, e.g., [1,19,20], try to estimate how the radiation damage progressing within an imaged single object restricts the structural information about this object recorded in its diffraction image. Available simulation techniques and tools have been improved significantly since the very first simulation describing the evolution of irradiated lysozyme [1]. However, there are still unsolved or not fully understood issues in the description of irradiated samples such as: (i) contribution of inelastic scattering background to the total scattering signal; (ii) realistic particle size; (iii) effect of interparticle correlations; (iv) effect of pulse duration for pulses below 10 fs; (v) strongly non-equilibrium evolution of electron distribution during femtosecond pulse; and (vi) effect of chemical environment and plasma screening on ionization dynamics. Below we discuss their effect on the overall radiation damage, providing relevant estimates. Their omission or inaccurate treatment may lead to an incorrect estimation of the radiation damage, which, in turn, may hinder achieving the optimal pulse parameters for the imaging. Further, dynamics simulations should be used to verify the accuracy of the structure reconstruction performed from the experimental data. This is an important issue as the experimentally recorded diffraction signal is recorded from radiation-damaged samples [19]. It also contains various kinds of background [21]. In contrast, the currently used reconstruction algorithms assume perfectly coherent scattering patterns with shot noise only.

Contribution of Inelastic Scattering Background to the Total Scattering Signal
Assuming the coherence time of the pulse to be short compared to the timescales of the processes occurring within the irradiated sample, the total number of photons scattered at the momentum transfer q during a single XFEL shot becomes proportional to the incoherently summed signal intensities recorded at instantaneous snapshots of the system: The function h(t) describes the temporal evolution of the x-ray flux, which is ensemble-averaged over XFEL shots. The intensity I( q, t) is the instantaneous scattering intensity. It separates as follows [22]: where I el ( q, t), I inel ( q, t) are the instantaneous scattering intensities, scattered elastically and inelastically at time t, respectively. Correspondingly, the total scattered signal, I( q), separates into: The elastically scattered intensity is related to the Fourier transform F ( q, t) of the electronic density of the system, n( r, t), and reads I el ( q, t) = |F ( q, t)| 2 = d 3 r d 3 r n( r, t) n( r , t) e i q·( r− r ) The treatment of the inelastic scattering contribution from neutral ground state atoms is known, and well documented in the literature. Tabulated values of inelastic (incoherent) scattering functions for all elements can be found, e.g., in [23]. Inelastic x-ray scattering from an ion in an arbitrary electronic state can be computed with ab-initio methods, as it has been done, e.g., in [24] with the XATOM code [25]. In experiments that have already been done at LCLS, the effect of inelastic scattering was negligible. When the sample is a nanocrystal [15], the strong coherent Bragg peaks dominate over inelastic scattering. Inelastic scattering is also negligible at low resolution experiments on single objects [11] when intensity at small scattering vectors is collected. However, the effect of inelastic scattering should be taken into account when planning atomic-resolution imaging of non-periodic samples. In [24,26] we showed that in this case the inelastic scattering from bound and unbound electrons can have a significant impact on the measured intensities: it contributes to the background that reduces the contrast of the recorded image. This effect is more pronounced at larger momentum transfers, i.e., at high resolution [26]. Dedicated theoretical calculations [24] show that the contribution of inelastic scattering from a carbon cluster to the total scattering signal at a resolution of 1.5 Å is high: ∼ 40%-50% at the imaging performed at a photon energy of 12 keV and pulse fluences of 10 11 -10 13 photons per pulse [6], focused to a 100 nm spot. This yields a significant background signal.
Here we also note that a rigorous derivation of Equation (1) does not yet exist. Because of time-dependent processes progressing during the x-ray pulse (e.g., photoionization) the electronic system is in a non-stationary state [27,28]. Generalization of the differential scattering probability from the stationary case to the non-stationary case (Equation (1) in [27]) by assigning an additional degree of freedom (time) to the electronic density is not justified in general case. A detailed discussion on this issue is presented in [27][28][29]. They contain some known "special cases", in which the scattering signal can be accurately described with Equation (1). In particular, in [29] Vrakking and Elsaesser discuss imaging of a sample containing a mixture of excited atoms and ground-state atoms. If ground-state and low-excited atoms are in majority, imaging signal "recovers" electronic density dependence from Equation (1). For nanocrystal imaging, this implies that if a sample is weakly damaged during the imaging pulse, Equation (1) can be applied. Such analysis was done, e.g., in [12]. However, the correct description of formation of scattering pattern under the conditions of coherent diffraction imaging with XFELs in general case remains yet another challenge for theory.

Realistic Particle Size
We can now formulate the first condition on sample size for a realistic imaging simulation. The simulated particle has to be large enough to obtain a sufficiently high signal from elastic scattering with sufficiently low background noise. Otherwise, its structural reconstruction at the desired resolution may not be possible. The originally developed two-step methods for pattern classification and orientation [30][31][32][33] require on the order of 0.1 photons per speckle at a desired resolution [34]. In [20] we estimated that in order to fulfill this condition under realistic pulse conditions (∼ 1 Å resolution with 10 12 incident 12 keV photons focused on a ∼100 nm spot), the scattering power of the sample has to be equivalent to that of an amorphous carbon sphere of ∼50 nm radius. Novel reconstruction algorithms [35][36][37][38][39] may enable 3D reconstruction from unoriented 2D images of a much lower signal-to-noise ratio. These algorithms are under development and their convergence properties are still being studied. These methods also require many images of the same object to be analyzed. This condition can be experimentally demanding. Also, it is still an open question how the reconstruction algorithms can deal with patterns containing different sorts of noise, in particular, the background signal from inelastically scattered photons. The second condition on sample size is given by the size of the objects of interest for XFEL imaging studies. We mean here large non-periodic reproducible samples which structure cannot be investigated with other experimental techniques.
To conclude this part, the simulations of high-resolution single-particle imaging should consider particles large enough: (i) so as to obtain the signal-to-noise ratio appropriate for the available reconstruction method; and (ii) which structure cannot be investigated with other experimental techniques. The theoretical analysis has to take into account the inelastic scattering contribution which is non-negligible for non-periodic samples. Also, the effect of other background noises that contribute to the imaging signal should be treated. Realistic studies of single molecule imaging should account for these effects [40].

Effect of Interparticle Correlations
Another important issue is the choice of simulation method to follow the evolution of an XFEL irradiated macromolecule. In order to investigate the effect of radiation damage, detailed modeling and understanding of ionization dynamics are needed. A continuum approach [41,42] is an efficient way to model radiation damage within large samples. However, it is not straightforward to obtain accurate information about imaging from this approach, as continuum models in use describe dynamical properties of electrons and ions, using average single-particle densities [26], and typically neglect two-particle correlations. The elastically scattered signal intensity that one can construct from the average density obtained from the continuum model is where the average denotes the average over the measured realizations (R) of the sample evolution, i.e., the ensemble average. To compare, the real scattering signal in Equation (4) depends on two particle correlations during individual realization. In [26] we have studied in detail the effect of two-particle electron-electron correlations on x-ray scattering patterns, obtained from systems under conditions similar to those expected during XFEL imaging experiments at atomic resolution. We have shown that to a large extent these correlations can be neglected, and the resulting simple estimate of the scattering intensity can describe the elastically scattered intensity with a good accuracy. This scheme can be applied when the reconstruction is performed from an ensemble-averaged pattern and also for the reconstruction methods such as the EMC algorithm [39] which are based on single-shot images. In the latter case, a simulation method which follows individual realizations of the system (shot-to-shot) with correlations included intrinsically would be a more preferable choice. Molecular dynamics (MD) modeling is an example of such method [1,[43][44][45]. It also accounts for the granularity of ions. In this way the gradual degradation of atomic scattering factors and eventual atomic displacements can be conveniently followed during the simulations. Molecular dynamics approaches used for imaging studies simulate trajectories of classical particles (atoms, ions and electrons). Currently, only with the classical MD approach one can achieve computational efficiency enabling imaging studies of large molecules (> 10 4 particles). However, interparticle bonding and interparticle scattering are quantum processes. One can account for them in part by: (i) using quantum-mechanical transition rates for scattering processes (atomic cross sections); and (ii) introducing modified interatomic potentials and force fields to describe interparticle bonding. However, this is only an approximate treatment. Especially for low energy particles [45] quantum effects do play a role and cannot be treated accurately in such an approximate way. Therefore, we view the classical approximation as the main limitation of the MD approach.

Effect of Pulse Duration for Pulses below 10 fs
Let us mention that pulse propagation effects (reduction of pulse intensity) within a macromolecule can be neglected due to the long attenuation length for hard x-ray photons, e.g., ∼ 3500 µm in carbon (ρ = 2.2 g/cm 3 ) for a 12 keV photon. However, the dependence of dynamics within the irradiated sample on the temporal pulse shape is not fully understood. If imaging is performed with laser intensities far below the direct (non-sequential) multiphoton absorption regime and with pulses longer than the excitation and relaxation times of photoinduced processes, one can expect that the final ionization state of the sample, i.e., the distribution of ion charges within the irradiated sample should predominantly depend on the pulse fluence, and not on the temporal characteristics of the femtosecond pulse.
But during imaging with short XFEL pulses the timescales of some radiation-induced processes within the sample become comparable to the pulse duration. According to our earlier studies on radiation damage [20,46], the damage within the irradiated sample during coherent diffraction imaging can be reduced to electronic damage, if sufficiently short pulses are used. Although charging, and later, expansion of an irradiated molecule are not homogeneous processes, pulses of duration of 10 fs or shorter allow to eliminate the effect of atomic displacements driven by repulsive forces between ions during the pulse [20]. Also, Auger processes have timescales in light elements of 10.7 fs (C), 7.1 fs (N), and 4.9 fs (O) [47]. This implies that after reducing the pulse duration below 10 fs, fewer Auger electrons will be emitted during the pulse. As Auger electrons and their secondaries are the main contributors to the electronic radiation damage, one can achieve a suppression of the total radiation damage by reducing Auger electron emission [46]. In this way, for imaging pulses of duration ≤ 10 fs the dynamics within the sample during the pulse, thus affecting the formation of the diffraction pattern, starts to depend on the pulse duration.
An interesting question also arises, how the pulse duration influences the total number of photons scattered elastically and inelastically from the sample. If the imaging is performed with laser intensities low enough, one can expect that the total scattered signal should depend only on the pulse fluence, independently of the temporal characteristics of the pulse. However, suppression of the sample damage achieved with a shorter x-ray pulse would result in a higher scattering power of the sample [46]. Detailed studies of the effect of the pulse duration on the number of scattered photons can be found in [46].

Strongly Non-Equilibrium Evolution of Electron Distribution during Femtosecond Pulse
Short pulses of duration < 10 fs impose an additional requirement on the simulation tools. At such short timescales, high-energy photoelectrons and Auger electrons emitted as a result of sample irradiation do not thermalize, i.e., their non-equilibrium evolution has to be followed [48][49][50]. Enforcing instantaneous thermalization (Maxwell-Boltzmann distribution) on the free electrons artificially increases the number of low energy electrons in the sample. As a result, the number of secondary ionizations is overestimated, as well as the predicted radiation damage. Therefore the usage of codes based on hydrodynamic approximations for imaging studies has to be restricted only to cases of long exposure [51]. For comparison, we estimated that within a neutral bulk material of atomic content corresponding to a typical protein [51] (H 50 C 30 N 9 O 10 S 1 of 1.35 g/cm 3 ) the full thermalization of a single 12 keV photoelectron takes ∼ 140 fs, and the thermalization of single KLL Auger electrons (∼300-500 eV for C, O, N and ∼ 2000 eV for S) takes 5-15 fs. After these times one can assume a fully thermal (Maxwell-Boltzmann) distribution of the free electrons in the sample. Figure 1 shows the average number of secondary electrons created during impact ionizations by a primary (photo-or Auger) electron as a function of time. The calculations were performed using impact ionization cross sections from [52]. Parameters for photoinduced processes were taken from [47]. Please note that the above results show cascading times of a single electron within an infinite-size neutral medium. In case of intense x-ray irradiation, the presence of ions and the electron-electron interaction (efficient at high free-electron density) would reduce the thermalization time. Therefore, the above estimates for the thermalization time are the upper limits. As the results above were obtained for a bulk protein, the question arises how much those electrons would contribute to the damage within a sample of a finite size. In previous simulations [1] it was assumed that the fastest electrons can leave a macromolecule without causing any impact ionization. Below we show the effective inelastic mean free path (IMFP) of an electron calculated for an infinitely extended, neutral protein of the atomic content listed above ( Figure 2). As mentioned earlier, the imaged macromolecule has to be large enough to scatter enough photons (as, e.g., the carbon cluster of 50 nm radius discussed in [20]). From Figure 2 we find that the IMFP of 12 keV electron is ∼ 130 Å, mean free paths for electrons of impact energy 300-2000 eV are between 10 and 30 Å. This implies that within a 100 nm large molecule all these electrons would be able to ionize further electrons collisionally, i.e., they would contribute to the radiation damage. However, if we make the imaging pulse short enough, the scattering events happen predominantly after the pulse is over. For instance, it was shown in [46] that with 0.1 fs long pulses of 12 keV photons the probability of impact ionization per photoelectron can be reduced to 20%. These results and the presented plot then indicate another possible path to reduce the electronic damage within the irradiated sample: through imaging with high-energy photons and sub-femtosecond pulses. Concerning direct photoabsorption processes, in [24] we found that inelastic scattering gives a significant background signal to imaging at high resolution. However, the background decreases with the increasing photon energy at high pulse fluences. Already, the use of photon energies of 12 keV and higher improves the signal quality at high resolution. Again, using high-energy photons seems to be more advantageous for imaging.

Effect of Chemical Environment and Plasma Screening on Ionization Dynamics
Finally, we list some physical processes occurring after x-ray irradiation which so far have not been fully treated. The first one is the effect of the chemical bonds on the x-ray absorption and electronic damage within the irradiated sample. A frequent simplifying assumption made is that the ionization within the imaged sample triggered by high-intensity radiation progresses very fast, and the chemical bonds within the molecule break very early in the exposure. The system can then be treated as being composed of ions and electrons, and atomic cross sections and rates can accurately describe ionization processes. As the experimental and theoretical data on C 60 [43] confirm, this assumption is justified for molecules irradiated with high-fluence x-ray pulses.
However, at moderate pulse fluences the damage progress may be much slower, and chemical bonding can "survive" the laser shot to a large extent. As in non-periodic samples there is no signal amplification through Bragg reflection, the imaging signal becomes sensitive to the structure of individual scattering centers. The presence of a partially perturbed chemical environment affects the scattering signal from bound electrons. Hence, any rigorous attempt to determine electronic structure within an irradiated molecule should evaluate the effect of the radiation-perturbed chemical environment on the electronic structure of the molecule. So far, in molecular-dynamics based codes the classical force field method has been a crude approximation to account in part for chemical bond effects. An accurate treatment of electronic damage in the presence of a changing chemical environment (e.g., charge migration effects, Auger molecular decays) still remains a challenge for theory.
Second, imaging simulations frequently neglect possible electronic excitation and deexcitation processes, trying to reduce in this way the overall computational costs. For instance, the electronic inelastic MFP is often interpreted as the ionization mean free path: each inelastic event is then supposed to end up with a collisional ionization. In this approximation possible inelastic energy losses of electrons due to excitation processes are neglected. Including excitations could also affect the overall damage due to the additional channel of energy loss within the system.
Finally, the question can be posed if the presence of free electrons in the sample and, in particular, the charge screening induced by them can significantly influence the ionization process. It is well known that the charged environment within a dense plasma leads to the phenomenon of ionization potential depression (IPD) for ions embedded in the plasma. Accurate predictions of the IPD effect are of crucial importance for modeling atomic processes occurring within dense plasmas. Several theoretical models have been developed to describe the IPD effect, with frequently discrepant predictions. Only recently, first experiments on the screening effect of solid-density plasma on atoms embedded in the plasma have been performed at LCLS [53][54][55]. Their results were found to be in disagreement with the widely-used IPD model by Stewart and Pyatt [56]. This controversy showed a strong need for a rigorous and consistent theoretical approach to calculate the IPD effect. In [57] we proposed such an approach: an ab initio two-step Hartree-Fock-Slater model. With this parameter-free model we could accurately describe the experimental Al data and test the accuracy of standard IPD models. Here, with this model we calculate the shifts of K-shell levels in free-electron-screened ions, as K-shell ionization is the most probable photoionization channel for 12 keV photons. We also estimate the valence-shell shifts, as electron impact ionization excites predominantly valence-shell electrons from an atom. We consider a broad span of electron temperatures between 10 eV and a few hundred eV and assume charge quasi-neutrality. The estimated K-shell shifts for C, C +1 and C +2 ions -which give the dominant contribution to the scattering signal-are ∼ 12-27 eV, ∼ 30-40 eV and 50-64 eV respectively. They are not large when compared to the unscreened K-shell binding energy and the high impact energy of the incoming photon, 12 keV. Therefore, the plasma-induced shifts should not lead to a significant increase of the photoinduced ionization of the sample. Similarly, the shifts of the valence energy levels are 1-2 eV for C, 11-20 eV for C +1 and 35-40 eV for C +2 ions. The shifts could lead to an increased ionization of the sample by impact electrons only in case the impact electrons have an energy comparable with the shifts. As we showed above, electron cascades triggered by energetic photoelectrons or Auger electrons need a time of ≥ 10 fs for cascading. Additionally, Auger emission can be delayed by up to 10 fs with respect to the photoionization event. This implies that for imaging pulses below 10 fs the number of low energy electrons within the sample will be low, so the effect of plasma-induced shifts on the damage by secondary electrons can be neglected with a good accuracy. Let us emphasize that this calculation based on the average atom model assumed local thermal equilibrium for electrons. As we mentioned above, for 10 fs long or shorter pulses, the electron distribution is non-equilibrium. A rigorous approach would require to include the non-equilibrium distribution in the calculations of atomic level shifts. Such a calculation scheme should be available soon within the XATOM package through an on-the-fly connection to a molecular dynamics simulation tool.
However, the problem not addressed yet in the context of atomic-resolution imaging is the x-ray scattering and x-ray absorption in plasma. Its correct treatment should take into account both the shifts of energy levels and the change of corresponding cross sections due to the dense free-electron environment. Such a rigorous approach to the scattering from an ionized biomolecule in the presence of plasma electrons is missing, and it requires dedicated ab-initio studies. We therefore add it into the list of challenges.

Summary
To sum up, in this review paper, we have discussed various processes and effects relevant for imaging-related simulations that have been neglected so far in the irradiation description, or are not yet fully understood. We recall briefly the main points: (i) contribution of inelastic scattering background to the total scattering signal; (ii) realistic particle size; (iii) effect of interparticle correlations; (iv) effect of pulse duration for pulses below 10 fs; (v) strongly non-equilibrium evolution of electron distribution during femtosecond pulse; and (vi) effect of chemical environment and plasma screening on ionization dynamics. We estimated the contribution of the effects to the overall radiation damage. In this way we could identify the unsolved issues and challenges for single-particle simulations, important for a correct description of imaging experiments. We expect that quantitative studies exploring the ideas described here will be performed in the near future.
Nikita Medvedev contributed to the manuscript with discussions, calculations and editing of the manuscript.
Vikrant Saxena contributed to the manuscript with discussions, calculations and editing of the manuscript.
Sang-Kil Son contributed to the manuscript with discussions and editing of the manuscript. He is a co-author of papers [24,25,43,44,46,57].