Towards Ab-Initio Simulations of Crystalline Defects at the Exascale Using Spectral Quadrature Density Functional Theory

: Defects in crystalline solids play a crucial role in determining properties of materials at the nano, meso- and macroscales, such as the coalescence of vacancies at the nanoscale to form voids and prismatic dislocation loops or diffusion and segregation of solutes to nucleate precipitates, phase transitions in magnetic materials via disorder and doping. First principles Density Functional Theory (DFT) simulations can provide a detailed understanding of these phenomena. However, the number of atoms needed to correctly simulate these systems is often beyond the reach of many widely used DFT codes. The aim of this article is to discuss recent advances in ﬁrst principles modeling of crystal defects using the spectral quadrature method. The spectral quadrature method is linear scaling with respect to the number of atoms, permits spatial coarse-graining, and is capable of simulating non-periodic systems embedded in a bulk environment, which allows the application of appropriate boundary conditions for simulations of crystalline defects. In this article, we discuss the state-of-the-art in ab-initio modeling of large metallic systems of the order of several thousand atoms that are suitable for utilizing exascale computing resourses.


Introduction
Crystal defects, often present in small concentrations, are crucial in determining macroscopic properties of materials. Vacancies are fundamental to creep, spall and radiation aging, even when they are present in parts per million. Solutes, present in parts per hundred, are responsible for strengthening by interacting with the motion of dislocations. Solutes can also cluster to nucleate a precipitate. Dislocations are the primary carriers of plasticity in metals even when their density is as small as 10 −8 per atomic row. Defects not only influence structural properties but also affect electronic and magnetic properties as well. For example, p-type and n-type semiconductors are primarily achieved by dopants. Doping and site mixing by magnetic ions in magnetic compounds can bring about magnetic phase transitions. Defects affect the macroscopic properties of crystals. This is because they simultaneously couple the chemical effects of the defect core, the discrete effects of the lattice and the long range effects of the elastic fields. Thus, to understand defects at physically relevant concentrations, we need to simultaneously study the details near the core and the long range elastic fields.
Density Functional Theory (DFT), developed by Hohenberg, Kohn and Sham [1,2], is one of the most successful first principles methods for predicting, understanding and developing insights into a wide range of material behavior. The phenomenal success and popularity of DFT is because of its excellent predictive power with high accuracy to cost ratio compared to other theories that solve for the electronic structure. In spite of this, the efficient solution of Kohn-Sham equations is computationally daunting, as the computational complexity of DFT calculations scale cubically with the number of atoms N (i.e., O(N 3 )). This restricts the range of physical systems that can be investigated. A number of methods have been proposed that reduce the prefactor by subspace projection [3][4][5][6].
Other methods focus on overcoming the cubic scaling bottleneck which can scale subquadratic [7] linear [8,9] and sub-linear [10] with the number of atoms. Methods that solve for orbitals achieve linear scaling by localizing the orbitals in real-space. However this approach only works for insulating or semiconducting systems. Other linear scaling methods do not solve the orbitals, and instead are based on the exponential decay of the density matrix and expand the density matrix using polynomials followed by truncation of off-diagonal components. The primary issue with this type of approach is the complicated interprocessor communication arising from distributed matrix-matrix products with changing sparsity patterns.
The Spectral Quadrature method is an alternative linear scaling formulation [8,[10][11][12][13][14][15] suited for first-principles simulations of metallic as well as insulating and semiconducting systems. The key idea of this formulation is to write the electron density and energies as integrals over the spectrum of the linearized Hamiltonian, and then approximate these integrals by quadrature rules. The order of the quadrature is independent of system size, thereby allowing the evaluation of the electron density and energy with O(1) effort at each spatial point. The number of spatial points scale linearly with system size, and therefore the method scales linearly. This method has other attractive features as well. First, it allows the application of non-zero Dirichlet boundary conditions on the electronic fields [15]. This is specifically attractive for studying crystalline defects, where periodic boundary conditions can introduce artificial interactions. Second, the local nature of the spectral quadrature calculation allows variable resolution and coarse graining [8,10,14]. Specifically, fine resolution is maintained where necessary and sub-grid sampling is employed in regions of uniform deformation. See Figure 1 for different modeling schemes used for studying defects.  [16][17][18]. (d) shows the specified Dirichlet boundary conditions in the spectral quadrature method. In (d), the inner box is the simulation domain, and the region outside is used to specify pre-computed electronic fields. (e) shows the spatial coarse-grained atomic mesh [14]. The region close to the defect has full atomic resolution, and the resolution is gradually decreased with distance increasing from the defect core.
The spectral quadrature method and the coarse grained approximation have been applied to the study of crystal defects [10,14,15], and high temperature simulations of materials [13,19,20]. In this paper, we present an overview of the spectral quadrature formulation and discuss results on defects in magnesium.

Spectral Quadrature
In Kohn-Sham Density Functional Theory, the ground state energy (E 0 ) of a system with atomic positions R is given by minimizing the Kohn-Sham energy functional E over the set of orbitals {ψ n }.
The orbitals {ψ n } are orthonormal, which give rise to the typical cubic scaling of Kohn-Sham DFT. As a consequence of their orthonormality, the orbitals are long ranged, oscillatory and do not decay in metallic systems. Several linear scaling approaches have been developed by localizing or truncating the orbitals in real space. Though these approaches are limited to insulating and semiconducting systems, they do not work for metallic systems i.e., absence of a band-gap. An alternate route is to work with the density operator formulation of DFT. The density operator (γ) is where g( n ) are the occupancies of the n th orbital with energy n . The Kohn-Sham energy functional can be expressed in terms of the density operator, and the ground state energy is then given by minimizing this functional over the density operator.
The diagonal of the density operator is the electron density (i.e., ρ = diag(γ)). The solution to Equation (3) is given by the following fixed-point problem: where H is the Kohn-Sham Hamiltonian, µ is the Fermi level, and σ = K B T is the electronic smearing. The right side of Equation (3) can be expanded as a polynomial of H, the discrete version of H. Linear scaling methods based on the polynomial expansion of H are known as Fermi Operator Expansion (FOE), make use of the sparsity, off-diagonal decay of the density operator, and ignore the entries of matrix-matrix products which are lower than a certain tolerance. In large-scale simulations using the FOE method, H is distributed across several computational cores, and the global matrix-matrix multiplications with changing sparsity patterns can give rise to complicated parallel communication.
The Spectral Quadrature method [8,[10][11][12][13][14][15] is an alternate linear scaling formulation of DFT where the essential ground state quantities are expressed as integrals over the spectrum of the Hamiltonian.
where σ(H) is the spectrum of H, and µ η,η is the spectral measure of H contracted with η.
where K is the order of the quadrature w η k are the spectral weights and λ η k are the spectral nodes. The quadrature order is independent of the system size, hence each evaluation of Equation (6) is O(1). The summation (Equation (6)) is evaluated at every discrete point in real space and the number of such discrete points scale linearly with the number of atoms N, and therefore the scaling of evaluating the ground state is linear.
In spectral Gauss quadrature, the weights {w η k } K k=1 are fixed apriori, and the spectral nodes {λ k η} K k=1 are treated as unknowns. The spectral weights and nodes at any grid point q can be calculated by the Lanczos iteration, where a q k+1 = (Hv  (7)) is the most computationally expensive part of the entire computation and can be accelerated by offloading onto Graphics Processing Units (GPUs).
After the spectral weights and spectral nodes are evaluated at a grid point q using the Lanczos scheme (Equation (7)), the electron density at the grid point is given by The electron density, and similarly other electronic fields, can be calculated in a linear scaling fashion using the spectral quadrature technique. The prefactor can be reduced by noting that the growth of vectors {v q k } K k=1 are restricted within a sphere of a finite radius-typically a few lattice spacings-around the grid point q. Therefore, the discrete Hamiltonian in Equation (7), can be spatially truncated beyond a certain distance. This insight allows us to employ local sparse algebra routines instead of distributed sparse algebra, decreases computation time, and storage.

Spatial Coarse Graining
The Spectral Quadrature Method can be used for large scale first principles simulations of materials. Typical calculations with the Spectral quadrature method involve several hundred to tens of thousands of atoms. Further controlled approximations can be introduced which can push the number of atoms even higher using the idea of spatial coarse-graining [8,10,14,21]. This is specifically advantageous for simulating extended defects such as dislocations. To understand heuristics involved in the approximations, first consider an isolated defect. The electronic fields and the atomic displacements are complex in the vicinity of the defect. Moving far away from the defect, the atomic displacements decay in either a polynomial of logarithmic manner, and the electronic fields far away from the defect are almost locally periodic. In the coarse grained formulation, the nature of the decay of the elastic and the electronic fields are exploited in the following manner. First, the position of all atoms is approximated by tracking only a subset P RA of Representative Atoms (RAs) which are fully dense in the vicinity of the defect core and is increasingly sparse with increase in distance from the defect. The positions of atoms which are not in P RA are evaluated by linear interpolation, which provides a coarse grained representation of the atomistic fields. Next, for the electronic fields, the discrete Hamiltonian is defined uniformly on a spatial finite difference mesh P f . However, the electronic fields are not evaluated at all points in P f . Instead, a different representation is employed based on the observation that sufficiently far away from the defect, the atom positions are locally periodic, where the periodicity is related to the local strain, and the electronic fields are almost locally periodic. Due to this, the electronic fields can be written as the sum of a corrector and a locally periodic predictor in the following way: where ρ q 0 is the piecewise periodic predictor and ρ q c is the corrector at a mesh point q ∈ P f . The predictor can be calculated by using a periodic DFT calculation with the lattice parameters subjected to local strain. To calculate the corrector, the electron density is first calculated at a select few points p ∈ P c using Spectral quadrature, and the difference between the electron density and predictor is interpolated onto the points in P f : where the P c is a subset of P f . Once the electronic fields are calculated, the total energy can be computed using cluster summation technique [22]. Figure 2a-c shows the features of defects and the associated atomic and electronic mesh for a one dimensional system. Note that, because the electron density is calculated only on a select few points in P c , the scaling is sub-linear with respect to the total number of atoms in the domain. Figure 2d shows the representative plot of computation time versus number of atoms for cubic, linear and sub-linear scaling methods.
In the coarse-grained formulation, a key computational expense comes from repeated on-the-fly calculation of the predictor electronic fields for every local strain field. This significant overhead can be reduced by employing a neural-network which maps the local strain to the electronic fields. We refer the reader to the recent work by Teh, Ghosh and Bhattacharya [23] for constructing such a map using machine learning.

Bulk Properties of Elements
The Spectral Quadrature Method was used to calculate the bulk properties of materials. Higher order finite differences were used to represent the discrete Hamiltonian operator. The finite difference mesh spacing and tolerances were chosen such that the energies and forces are converged to within chemical accuracies. Table 1

Defects in Magnesium
Defects in crystalline materials give rise to several interesting phenomena such as creep, spall, plasticity, radiation aging, phase transitions. Defects are present in very small concentrations which make their accurate study challenging. Defect can also interact with other defects in crystalline solids. Vacancies can coalesce to nucleate voids, solutes cluster to nucleate a precipitate, solutes bind to vacancies for diffusion, solutes interact with dislocations for strengthening.
We consider divacancy, solute-vacancy and solute-solute pairs in magnesium lattice. Figure 3 shows the six nearest neighbor positions on a HCP lattice where the first defect is placed at the position marked 0 and the other defect occupies positions marked 1 through 6. These denote the first six nearest neighbor configurations in increasing order of their distances from the defect at position 0. Let E vv , E sv and E ss denote the energies of divacancy, solute-vacancy and solute-solute pairs, respectively in a magnesium supercell with M lattice sites. The formation energies of the divacancy pair is: where E per f ect is the energy of a magnesium atom in the perfect crystalline state. The formation energy of the solute-vacancy pair is: where E solute is the energy of a solute atom in in the perfect crystalline state. The formation energy of the solute-solute pair is: A negative formation energy implies that the defect formation is energetically favorable. The binding energy of defect pair can be calculated as the difference between the sum of formation energies of the individual isolated defects and the defect pair.
where E f v and E f s are the formation energies of vacancies and solutes respectively. A positive binding energy implies that the defect pair is more stable than the individual isolated defects. In Figure 4, the top-panel shows the formation energies of (a) divacancy, (b) Al solutevacancy and (c) Al solute-solute pairs, respectively with increasing cell size from 64 to 1152 atoms. From these plots we see that the formation energy of defect pairs strongly depend on the cell size. Furthermore, the solute-solute formation energy calculated at cell sizes less than 200 atoms is negative but is positive at larger cell sizes. The bottom panel of Figure 4 shows the contours of the difference in electron density of the defect pairs in the second nearest neighbor configuration. We see the depletion of charge around vacancies and an accumulation of charge around the solute atoms. Table 2 shows the formation and binding energies of isolated defects and defect pairs calculated using the spectral quadrature method. The formation energy of a monovacancy and Al solute is positive. Further the formation energies of divacancy, solute-vacancy and solute-solute pairs are also positive. Interestingly, the formation energy increases from ∼1 eV to ∼26 eV when the system is changed from a divacancy to an Al solutevacancy pair, but decreases to ∼1 eV for solute-solute pair. Thus, the formation energy follows a non-linear trend with increasing solute number. The equilibrium concentration of spontaneously formed vacancies and divacancies depend on the formation energy and temperature as C = exp(−E f /k B T). As the formation energies of divacancies are greater than monovacancies, the spontaneous formation of monovacancies is favored over the spontaneous formation of divacancies. Additionally, the binding energies of divacancies are positive, implying two monovacancies can coalesce to form a divacancy. Again from Table 2, the binding energies of solute-vacancy and solute-solute pairs are also positive. The solute-vacancy binding energy is also greater than the divacancy binding energy, which implies that solute and vacancy can strongly bind, which aids the self diffusion of solute atoms. As the solute-solute binding energy is also large, these solutes atoms can bind to each other, form clusters and nucleate precipitates [15]. (d-f) shows the contour plots of electron density difference for defect pairs in second nearest neighbor configuration. A cell size of 1000 atoms is needed to calculate the formation energy of defect pairs. The electron density difference around a divacancy is shown in (d), around a Al solute-vacancy pair in (e) and around a Al solute pair in (f). In these plots, the electron density is depleted around a vacancy and elevated near the Al solute.
The Spectral Quadrature Method can be used to simulate even larger systems, containing several tens of thousands of atoms. Figure 5 shows the electron density contours around a prismatic dislocation loop in Magnesium calculated using 24,539 atoms, and the prismatic loop consists of a cluster of 37 vacancies. The cell sizes used in these calculations are beyond the scope of most widely popular DFT codes. Calculations with coarse-grained spectral quadrature reveal the preference towards forming large stable vacancy clusters and preference towards forming large prismatic dislocation loops by clustering of vacancies, as observed in nanoporous Mg alloys and experiments [14]. Further optimizing the Lanczos iteration into Graphics Processing Units (GPUs) and utilizing exascale resources (e.g., Frontier) will significantly increase the size of systems to several hundred thousand atoms.
We finally close by commenting on the applicability of the above method to problems in the mechanical behavior of materials. It is of interest to calculate the influence of defects on the mechanical moduli of a crystal. Simulating processes involving material flow [28] and additive manufacturing [29] require accurate estimation of effective moduli of materials under evolving defects such as voids and precipitate growth. Micromechanics approaches dating back to the work of Eshelby [30], Hashin and Shtrikman [31] provide upper and lower bounds on the moduli. Traditional DFT methods have been used to calculate the moduli under high defect concentrations [32]. Therefore, it will be interesting to use the spectral quadrature method on exascale computing resources to calculate the influence of voids and precipitates at high or realistic concentrations from ab-initio.

Conclusions
In this article, we discuss recent advances in first principles simulations of bulk materials and crystalline defects using the Spectral Quadrature Computational method. This framework is capable of simulating non-periodic systems embedded in a bulk environment, which allows the application of correct boundary conditions for simulations of crystalline defects. Furthermore, due to the local representation of the electronic fields, this method is also amenable to spatial coarse graining. We have presented studies of defect clusters in magnesium using cell sizes ranging from a few tens to several tens of thousands of atoms. These calculations are typically beyond the reach of the most widely used DFT codes. Due to the local nature of all calculations, the method is amenable to spatial coarse-graining which can be further accelerated by utilizing machine learning models. Furthermore, the local computations can be transferred onto Graphics Processing Units (GPUs), which significantly reduces the solution times. These features make the formulation suitable for efficiently utilizing the exascale supercomputing frameworks.