Systematic Quantum Cluster Typical Medium Method For the Study of Localization in Strongly Disordered Electronic Systems

Great progress has been made in the last several years towards understanding the properties of disordered electronic systems. In part, this is made possible by recent advances in quantum effective medium methods which enable the study of disorder and electron-electron interactions on equal footing. They include dynamical mean field theory and the coherent potential approximation, and their cluster extension, the dynamical cluster approximation. Despite their successes, these methods do not enable the first-principles study of the strongly disordered regime, including the effects of electronic localization. The main focus of this review is the recently developed typical medium dynamical cluster approximation for disordered electronic systems. This method has been constructed to capture disorder-induced localization, and is based on a mapping of a lattice onto a quantum cluster embedded in an effective typical medium, which is determined self-consistently. Here we provide an overview of various recent applications of the typical medium dynamical cluster approximation to a variety of models and systems, including single- and multi-band Anderson models, and models with local and off-diagonal disorder. We then present the application of the method to realistic systems in the framework of density functional theory.

The metal-to-insulator transition (MIT) is one of the most spectacular effects in condensed matter physics and materials science. The dramatic change in electrical properties of materials undergoing such a transition is exploited in electronic devices that are components of data storage and memory technology 1,2. It is generally recognized that the underlying mechanisms of MITs are the interplay of electron correlation effects (Mott type) and disorder effects (Anderson type) [3][4][5][6][7]. Recent developments in many-body physics make it possible to study these phenomena on equal footing rather than having to disentangle the two.
The purpose of this review is to bring together the various developments and applications of such a new method, namely the Typical Medium Dynamical Cluster Approach (TMDCA) [8][9][10][11][12] , for investigating interacting disordered quantum systems.
The organization of this article is as follows: Sec. II is dedicated to a few basic aspects of modeling disorder in solids.
The central focus of this review is the typical medium theories of Anderson localization, which are discussed in Sec. V.
We show how this method is used to study disorder-induced electron localization. Starting from the single-site typical medium theory, we present its natural cluster extension, discussing several algorithms for the self-consistent embedding of periodic clusters fulfilling the original symmetries of the lattice in addition to other desirable properties. We present details of how this method can be used to incorporate the full chemical complexity of various systems, including off-diagonal disorder and multi-band nature, along with the interplay of disorder and electron-electron interactions.
In Sec. VI we discuss how the developed typical medium methods can be practically applied to real materials. This is done in a three-step process in which DFT results are used to generate an effective disordered Hamiltonian, which is passed to the typical medium cluster/single-site solver to compute spectral densities and estimate the degree of localization. Sec. VII reviews the application of the TMDCA from single-band three-dimensional models to more complex cases such as off-diagonal disorder, multi-orbital cases and electronic interactions. Finally, the concluding remarks are presented in Sec. VIII.

II. BACKGROUND: ELECTRON LOCALIZATION IN DISORDERED MEDIUM
Disorder is a common feature of many materials and often plays a key role in changing and controlling their properties.
As a ubiquitous feature of real systems, it can arise in varying degrees in the crystalline host for a number of reasons. As shown in Figure 1, disorder may range from a few impurities or defects in otherwise perfect crystals (vacancies, dislocations, interstitial atoms, etc.), to chemical substitutions in alloys, to random arrangements of electron spins or glassy systems.
One of the most important effects of disorder is that it can induce spatial localization of electrons and lead to a metal-insulator transition, which is known as Anderson localization.
Anderson predicted 25 that in a disordered medium, electrons scattered off randomly distributed impurities can become localized in certain regions of space due to interference between multiple-scattering paths.
Besides being a fundamental solid-state physics phenomenon, Anderson localization has profound consequences for many functional properties of materials. For example, the substitution of P or B for Si may be used to dope holes or particles into ... The transition is first order, with a finite-temperature second-order terminus. In Anderson's picture, localization is a quantum phase transition driven by disorder. Despite more than five decades of intense research 37,38, a completely satisfactory picture of Anderson localization does not exist, especially when applied to real materials.
Several standard, computationally exact numerical techniques, including exact diagonalization, the transfer matrix method [39][40][41], and the kernel polynomial method 42, have been developed. They are extensively applied to study the Anderson model (a tight-binding model with a random local potential). While these are very robust methods for the Anderson model, their application to real modern materials is highly non-trivial. This is due to the computational difficulty in treating simultaneously the effects of multiple orbitals and complex real disorder potentials (Figure 2) for large system sizes. In particular, it is very challenging to include the electron-electron interaction. Practical calculations are limited to rather small systems. Also, the effects of the long-range disorder potentials which occur in real materials, such as semiconductors, are completely absent. This, perhaps, is not surprising, as direct numerical calculations on interacting systems, even in the clean limit, often come with various challenges. Reliable calculations on system sizes large enough to infer the behavior in the thermodynamic limit are largely restricted to specific cases, such as one-dimensional systems or special fillings at which the fermionic minus sign problem in quantum Monte Carlo calculations is mitigated.
During the past two decades or so, several effective medium mean field methods have been developed as an alternative to direct numerical methods. For example, for systems with strong electron-electron interactions, the Dynamical Mean Field Theory (DMFT) [13][14][15][16][17][18][19] constitutes a major development in the field of computational many-body systems and materials science. The DMFT shares many similarities with the Coherent Potential Approximation (CPA) for disordered systems 20,21. Conceptually, in both these methods, the lattice problem is approximated by a single-site problem in a fluctuating local dynamical field (the effective medium). The fluctuating environment due to the lattice is replaced by the local energy fluctuation, and the dynamical field is determined by the condition that the local Green's function is equal to the (in the CPA, disorder-averaged) Green's function of the single-site problem 43. DMFT has been extensively used on strongly correlated models, such as the Hubbard model 17, the periodic Anderson model 44, and the Holstein model 45 48. This is also the subject of the present review on a cluster development in the form of the typical medium DCA.
There are a number of excellent extensive research papers, reviews, and books covering different aspects of DMFT/CPA/DFT. These include Ref. 18,19 on DMFT aspects, Ref. 20,21 concerning CPA, Wannier-function-based methods [74][75][76] to extract a tight-binding Hamiltonian from the DFT calculation, multiple scattering theory 77 , and the combined LDA+DMFT approach 78 , to enumerate just a few.
Although these methods allow the study of various phenomena resulting from the interplay of disorder and interaction, they fail to capture disorder-driven localization. As we will discuss in detail in the sections below, the fundamental obstacle in tackling Anderson localization is the lack of a proper order parameter. Once the order parameter is identified as the typical density of states (Sec. II B), it can be incorporated into a self-consistency loop, leading to the Typical Medium Theory 9. This was subsequently extended to clusters incorporating ideas of the DCA. This theory came to be known as the Typical Medium Dynamical Cluster Approximation (TMDCA) and is the major focus of the current review. In addition to being able to capture Anderson localization properly, the TMDCA also allows the study of the interplay between disorder and interaction in both the weak and strong coupling limits. Thus, it provides a new basis for studying the Mott and Anderson transitions on equal footing. Like any cluster extension, the TMDCA also inherits a dependence on the system size (i.e. the number of sites in the cluster, $N_c$). In analogy with the DCA, $1/N_c$ can be treated as a small parameter, so a systematic improvement of the approximation can be achieved by increasing the cluster size. In addition, in contrast to direct numerical methods, the major strength of the TMDCA lies in its flexibility to handle complex long-range impurities and multi-orbital systems, which are unavoidable features of many realistic disordered systems (Figure 3). This review collects the recent results of the TMDCA applied to the Anderson model and its extensions, and to real materials.

A. Anderson localization
Strong disorder may have dramatic effects upon the metallic state 38 : the extended states that are spread over the entire system become exponentially localized, centered at one position in the material. In the most extreme limit, this is obviously true. Consider for example a single orbital that is shifted in energy so that it falls below (or above) the continuum in the density of states (DOS). Clearly, such a state cannot hybridize with other states since there are none at the same energy. Thus, any electron on this orbital is localized, via this (deep) trapped states mechanism, and the electronic DOS at this energy will be a delta function. Of course this is an extreme limit. Even in the weak disorder limit, the resistivity of ideal metallic conductors decreases with lowering temperature. In reality, at very low temperatures, the resistivity saturates to a residual value. This is due to the imperfections in the formation of the crystal. If the disorder is not too strong, the perfect crystal still remains a good approximation.
The imperfections can be considered as the scattering centers for the current-carrying electrons. Hence, the scattering processes between the electrons and defects lead to the reduction in the conduction of electrons. The assumption of a single coupling parameter leads to the development of the scaling theory for the conductance.
It is based on the assumption that the conductances at different length scales (say $L$ and $L'$) are related by the scaling relation $g(L') = f(L'/L, g(L))$. In the continuum limit it can be written as $d\ln g(L)/d\ln L = \beta(g(L))$. The $\beta$ function can be estimated from the small and large $g$ limits. From these results, Abrahams, Anderson, Licciardello, and Ramakrishnan concluded that there is no true metallic behavior in two dimensions, but a mobility edge exists in three dimensions 88. The validity of the scaling theory gained further support after the discovery of the absence of a $\ln^2 L$ term in perturbation theory 89. The connection between the mobility edge and the critical properties of disordered spin models was realized in the 1970s 90. In a series of papers, Wegner proposed that the Anderson transition can be described in terms of a non-linear sigma model [91][92][93]. Multifractality of the critical eigenstates was first proposed within the context of the sigma model 92,94. All three Dyson symmetry classes were studied. Hikami, Larkin, and Nagaoka found that the symplectic class corresponds to systems with spin-orbit coupling, which can induce delocalization in two dimensions 95. In 1982, Efetov showed that techniques from supersymmetry can be employed to reformulate the mapping to a non-linear sigma model with both commuting and anticommuting variables 96. Many of the recent efforts in studying Anderson localization focus on the critical properties within an effective field theory, the non-linear sigma model, in different representations: fermionic, bosonic, and supersymmetric 6. While these works provide answers to important questions, such as the existence of mobility edges for different symmetry classes in different dimensions, they are not able to provide non-universal or off-critical quantities, such as the critical disorder strength, the correlation length, and the correction to the conductivity in the metallic phase.
An important development to address these issues is the self-consistent theory proposed by Vollhardt and Wölfle 97,98. It has been shown that the results from this theory also obey the scaling hypothesis 99. More recent studies focus on classifying the criticality according to the local symmetry. Ten different symmetry classes based on the local symmetry have been identified, generalizing the three Dyson classes to include the Nambu space 100.
The renormalization group study of the sigma model has been carried out for different classes and dimensions 6. The importance of the topology of the sigma model target space has been studied extensively in recent works 6,101,102.
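The one-parameter scaling argument of this subsection can be illustrated with a small numerical sketch. The interpolation formula used for $\beta(g)$ below is an assumption chosen only because it reproduces the two standard limits ($\beta \to d-2$ for large $g$, $\beta \to \ln g$ for small $g$); it is not the form used in the works cited above, and the function names are hypothetical.

```python
import math

def beta(g, d):
    """Toy scaling function (an illustrative assumption, not from the text).

    Limits: beta -> (d - 2) - 1/g for g >> 1 (Ohmic regime),
            beta -> (d - 2) + ln(g) for g << 1 (exponential localization).
    """
    return (d - 2) - math.log(1.0 + 1.0 / g)

def critical_conductance(d, lo=1e-6, hi=1e6):
    """Bisection on a logarithmic scale for beta(g_c, d) = 0.

    Returns None when beta does not change sign, i.e. no mobility edge.
    """
    if beta(lo, d) * beta(hi, d) > 0:
        return None
    for _ in range(200):
        mid = math.sqrt(lo * hi)  # geometric midpoint
        if beta(lo, d) * beta(mid, d) <= 0:
            hi = mid
        else:
            lo = mid
    return math.sqrt(lo * hi)

for d in (1, 2, 3):
    print(d, critical_conductance(d))
```

Within this toy form, $\beta$ stays negative for $d = 1$ and $d = 2$, so the flow always runs towards the insulator, while for $d = 3$ an unstable fixed point separates the metallic and insulating flows.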

B. Order parameter of Anderson localization
As we discussed in the previous section, effective medium theories have been used to study Anderson localization. However, progress has been hampered, partly due to ambiguity in identifying an appropriate order parameter for Anderson localization that allows for a clear distinction between localized and extended states 9.
An order parameter function was suggested about three decades ago, in the study of Anderson localization on the Bethe lattice 103,104. It has been shown that the parameter is closely related to the distribution of on-site Green's functions, in particular the local density of states 105. Recently, following the work of Dobrosavljević et al. 9, there has been tremendous progress along these lines, with the local typical density of states identified as the order parameter.
To demonstrate how the local density of states and its typical (most probable) value can be utilized as an order parameter for Anderson localization, we consider a thought experiment. We imagine dividing the system up into blocks, as illustrated in Figure 4. Later, when we construct our quantum cluster theory of localization, each of the blocks should be thought of as a cluster, and we construct the system by periodically stacking the blocks. We make two controllable approximations. So what does this mean in terms of the local electronic density of states (LDOS) that is measured, e.g., via STM at one site in the system, and the average DOS (ADOS) measured, e.g., via tunneling (or just by averaging the LDOS)? In Figure 5 we calculate the ADOS and TDOS for a simple (Anderson) single-band model on a cubic lattice with near-neighbor hopping $t$ (bare bandwidth $12t = 3$ to establish an energy unit) and with a random local potential $V_i$ at site $i$ drawn from a "box" distribution of width $2W$, with $P(V_i) = \frac{1}{2W}\Theta(W - |V_i|)$. As can be seen from Figure 5, as we increase the disorder strength $W$, the global average DOS (dashed lines) always favors the metallic state (with a finite DOS at the Fermi level $\omega = 0$) and it is a smooth (not critical) function even above the transition. In contrast to the global average DOS, the local density of states (solid lines), which measures the amplitude of the electron wave function at a given site, undergoes significant qualitative changes as the disorder strength $W$ increases, and eventually becomes a set of discrete delta-like functions as the transition is approached. This must mean that the probability distributions of the local DOS for a metal and for an insulator are also very different. This is illustrated in Figure 6. In particular, the most probable ... An alternative confirmation is also possible. Early on, Anderson realized that the distribution of the density of states in a strongly disordered metal would be strongly skewed towards smaller values.
More recently, this distribution has been demonstrated to be log-normal. Perhaps the strongest demonstration of this fact is that the DOS near the transition has a log-normal distribution (Figure 7) over 10 orders of magnitude 106.
Furthermore, one may also show that the typical value of a log-normal distribution can be approximated by the geometric average which is particularly easy to calculate and can serve as an order parameter 9,106 .
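The difference between the arithmetic and geometric averages can be made concrete with synthetic data. The sketch below draws stand-in "LDOS" samples from a log-normal distribution (the sample sizes, parameters, and function name are illustrative assumptions, not results from the cited works) and compares the two averages as the distribution broadens.

```python
import numpy as np

def average_and_typical(ldos):
    """Return the arithmetic mean and the geometric mean of LDOS samples."""
    ldos = np.asarray(ldos, dtype=float)
    average = ldos.mean()                    # arithmetic average (ADOS-like)
    typical = np.exp(np.log(ldos).mean())    # geometric average (TDOS-like)
    return average, typical

rng = np.random.default_rng(0)
mu = -1.0  # mean of ln(rho); exp(mu) is the median of the distribution
for sigma in (0.2, 1.0, 3.0):
    samples = rng.lognormal(mean=mu, sigma=sigma, size=200_000)
    avg, typ = average_and_typical(samples)
    print(f"sigma={sigma}: average={avg:.4f}, typical={typ:.4f}")
```

For a broad distribution the arithmetic average is dominated by rare large values, whereas the geometric average stays close to $e^{\mu}$, which is why the latter tracks what happens at a typical site.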

C. On the role of interactions: Thomas-Fermi screening
Thus far, we have ignored the role of interactions in our discussion. Surely the strongest such effect is screening. In fact, its impact is so large that it is often cited as the reason why a sea of electrons acts as if the electrons are non-interacting, or free, despite the fact that the average Coulomb interaction is as large as or larger than the kinetic energy in many metals [107][108][109].
As an introduction to the effect of screening on electronic correlations, consider the effect of a charged defect in a conductor 110. Assume that the defect is a cation, so that in the vicinity of the defect the electrostatic potential and the electronic charge density are reduced. We will model the electronic density of states in this material with the DOS of free electrons trapped in a box potential; we can think of this reduction in the local charge density in terms of raising the DOS parabola near the defect (cf. Figure 8). This will cause the free electronic charge to flow away from the defect. We will treat the screening as a perturbation to the free electron picture, so we assume that the electronic density is just given by an integral over the DOS, which we will model with an infinite square well potential with a bare density of states $g(E) = \frac{1}{2\pi^2}\left(\frac{2m}{\hbar^2}\right)^{3/2} E^{1/2}$, with the Fermi energy $E_F = \frac{\hbar^2}{2m}\left(3\pi^2 n\right)^{2/3}$. If $|e\delta U| \ll E_F$, then we can find the electron density by integrating the bare DOS shifted by the change in potential $+e\delta U$ (cf. Figure 8).
The change in the electrostatic potential is obtained by solving the Poisson equation.
The solution is a screened potential that decays over the length $1/\lambda = r_{TF}$, which is known as the Thomas-Fermi screening length.
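To give a sense of scale, the screening length can be evaluated numerically. The sketch below assumes the standard textbook SI form $\lambda^2 = e^2 g(E_F)/\epsilon_0 = 3ne^2/(2\epsilon_0 E_F)$ for a free-electron gas, which is not written out in the text above; the function names and the carrier density used for copper are illustrative assumptions.

```python
import math

HBAR = 1.054571817e-34       # reduced Planck constant, J s
M_E = 9.1093837015e-31       # electron mass, kg
E_CHARGE = 1.602176634e-19   # elementary charge, C
EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m

def fermi_energy(n):
    """Free-electron Fermi energy (J) for carrier density n (m^-3)."""
    return HBAR**2 / (2.0 * M_E) * (3.0 * math.pi**2 * n) ** (2.0 / 3.0)

def thomas_fermi_length(n):
    """Thomas-Fermi screening length (m), assuming lambda^2 = 3 n e^2 / (2 eps0 E_F)."""
    lam_sq = 3.0 * n * E_CHARGE**2 / (2.0 * EPS0 * fermi_energy(n))
    return 1.0 / math.sqrt(lam_sq)

n_cu = 8.47e28  # approximate conduction electron density of Cu, m^-3 (illustrative)
print(f"E_F  = {fermi_energy(n_cu) / E_CHARGE:.2f} eV")
print(f"r_TF = {thomas_fermi_length(n_cu) * 1e10:.2f} Angstrom")
```

Because $E_F \propto n^{2/3}$, this form gives $r_{TF} \propto n^{-1/6}$: reducing the carrier density by a factor of 64 only doubles the screening length.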

D. The Mott transition
Consider further an electron bound to an ion in Cu or some other metal. As shown in Figure 9, as the screening length decreases, the bound states rise up in energy. In a weak metal, in which the valence state is barely free, a reduction in the number of carriers (electrons) will increase the screening length, since $r_{TF} \sim n^{-1/6}$.
This will extend the range of the potential, causing it to trap or bind more states, making the one free valence state bound.
where $a_0$ is the Bohr radius. Despite the fact that electronic interactions are only incorporated in the extremely weak coupling limit, Thomas ...
Here we provide a brief overview of some of the popular numerical methods proposed for the study of disordered lattice models, including the transfer matrix, kernel polynomial, and exact diagonalization methods. These methods will be used to benchmark and verify our quantum cluster method.
We will outline the main steps of these methods, highlighting their advantages and limitations, particularly for applying to materials with disorder.

A. Transfer matrix method
The transfer matrix method (TMM) is used extensively on various disorder problems [39][40][41]. Unlike brute force diagonalization methods, the TMM can handle rather large system sizes. When combined with finite-size scaling, this method is very robust for detecting the localization transition and its corresponding exponents. Most of the accurate estimates of critical disorder and correlation length exponents for disorder models in the literature are based on this method 40,41.
The simplifying assumption of the TMM is that the system can be decomposed into many slices, and each slice only connects to its adjacent slices. Precisely for this reason, the TMM is not ideal for models with long-range hopping, or long-range disorder potentials or interactions. We can understand the computational scaling of the TMM by a simple 3D example without an explicit interaction. We assume the system has a width and height equal to $M$ for each slice of an $N$-slice cuboid, forming a "bar" of length $N$. The Hamiltonian can be decomposed into terms $H_i$, which describe the Hamiltonian for slice $i$, and terms $H_{i,i+1}$, which contain the coupling between slices $i$ and $i+1$. The Schrödinger equation can be written in terms of $\psi_i$, a vector with $M^2$ components which represents the wavefunction of slice $i$. This may be reinterpreted as an iterative equation defined by a transfer matrix $T_i$. The goal of the transfer matrix method is to calculate the localization length, $\lambda_M(E)$, for a system with linear size $M$ at energy $E$, from the product of $N$ transfer matrices, $\tau_N$. The Lyapunov exponents, $\alpha$, of the matrix $\tau_N$ are given by the logarithm of its eigenvalues, $Y$, in the limit $N \to \infty$, $\alpha = \lim_{N\to\infty} \ln Y / N$. The smallest exponent corresponds to the slowest exponential decay of the wavefunction and thus can be identified with the inverse of the localization length, $\lambda_M(E) = 1/\alpha_{min}$. Since the repeated multiplication of $T_i$ is numerically unstable, periodic reorthogonalization is needed in the numerical implementation [39][40][41]. For the 3D Anderson model, the reorthogonalization is done about every 10 multiplications. This is the major bottleneck of the TMM, as reorthogonalization scales as the third power of the matrix size.
Therefore, the method in general scales as M 3 .
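The structure of a TMM calculation can be sketched for a small quasi-one-dimensional case. The example below is an illustration under stated assumptions, not the implementation used in the cited works: it treats a two-dimensional strip of width $M$ (rather than the 3D bar discussed above) with open transverse boundaries, inter-slice hopping $-t$, box disorder on $[-W, W]$, and QR reorthogonalization every `reorth` steps. The function and parameter names are hypothetical.

```python
import numpy as np

def tmm_localization_length(M, W, E=0.0, n_slices=20000, reorth=10, t=1.0, seed=0):
    """Localization length of a disordered strip of width M via transfer matrices."""
    rng = np.random.default_rng(seed)
    hop = -t * (np.eye(M, k=1) + np.eye(M, k=-1))   # hopping inside one slice
    identity = np.eye(M)
    zero = np.zeros((M, M))
    Q = np.eye(2 * M, M)          # M orthonormal start vectors (psi_1, psi_0)
    log_r = np.zeros(M)
    for i in range(1, n_slices + 1):
        H_i = hop + np.diag(rng.uniform(-W, W, size=M))
        # Slice equation: psi_{i+1} = (H_i - E) psi_i / t - psi_{i-1}
        T = np.block([[(H_i - E * identity) / t, -identity],
                      [identity, zero]])
        Q = T @ Q
        if i % reorth == 0:
            # Reorthogonalize and accumulate the growth of each direction
            Q, R = np.linalg.qr(Q)
            log_r += np.log(np.abs(np.diag(R)))
    n_used = (n_slices // reorth) * reorth
    lyapunov = log_r / n_used     # the M non-negative Lyapunov exponents
    return 1.0 / lyapunov.min()   # slowest decay sets the localization length

if __name__ == "__main__":
    for W in (3.0, 6.0):
        print(W, tmm_localization_length(M=4, W=W, n_slices=4000))
```

Only the $M$ growing directions are propagated, so the smallest accumulated exponent is the smallest positive Lyapunov exponent. In a production calculation the result would be repeated for several widths $M$ and analyzed by finite-size scaling.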
where $g_n$ is the kernel function, $\mu_n$ is the expansion coefficient, and $T_n$ is the Chebyshev polynomial. Jackson's kernel is usually used for $g_n$ 139. The expansion coefficient is given as ... [142][143][144].
The feature common to these methods is the Krylov subspace, $K$, generated by repeatedly multiplying a matrix, $H$, on an initial trial vector, $\psi_t$. As all the vectors generated converge towards the eigenvector with the lowest eigenvalue, the basis set that is generated is ill-conditioned for large $j$.
The solution is to orthogonalize the basis at each step of the iteration via the Gram-Schmidt process. In essence, the ... The inverse of the Hamiltonian with a shifted spectrum is generally not known. Then, instead of expanding the basis in the Krylov subspace, the Jacobi-Davidson method (JDM) is often employed 147. It expands the basis $(u_0, u_1, u_2, \cdots)$ using the Jacobi orthogonal component correction, where $(u_j, \theta_j)$ and $(u_j + \delta, \theta_j + \epsilon)$ are the approximate and the exact eigenvector and eigenvalue pairs, respectively. Upon ... as a pre-conditioner 150.
The scaling of this method seems to be strongly dependent on the Hamiltonian. It tends to be more efficient for matrices which are diagonally dominant, but much less so when off-diagonal matrix elements are large. This is probably due to the difficulty of obtaining a good approximation of the inverse based on the incomplete LU factorization used as a preconditioner.
Exact diagonalization methods provide an accurate variational approximation for the eigenvalues and eigenvectors of the Hamiltonian, thus allowing the calculation of quantities such as multifractal spectrum and entanglement spectrum which are difficult to obtain from other approaches 151,152 . On the other hand, Krylov subspace methods are not a good option for calculating the density of states as only one, or a few, eigenstates are targeted at each calculation. A self-consistent treatment of the interaction, even at a single particle level, would also be rather challenging. Clearly, the major obstacle for applying it to systems with an explicit interaction is again the exponential growth of the matrix size with respect to the system size.
While these numerical methods can provide very accurate results for the models which are non-interacting, single band, and with local or short-ranged disorder, applying them to chemically specific calculations is a major challenge. None of these conditions is satisfied for realistic models of materials with disorder. In this case, the complexity of these methods increases drastically and obtaining accurate results for sufficiently large system sizes to perform a finite size scaling analysis is often impossible. This highlights the importance, or perhaps necessity, of the coarse grained methods described below.
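Of the direct methods reviewed in this section, the kernel polynomial expansion is compact enough to sketch in a few lines. The example below is a minimal illustration under stated assumptions: a dense one-dimensional Anderson chain with periodic boundaries, a Gershgorin bound for rescaling the spectrum into $(-1, 1)$, a stochastic trace with random $\pm 1$ vectors, and the Jackson kernel in the form commonly quoted in the kernel polynomial literature. All function names and parameter values are hypothetical.

```python
import numpy as np

def anderson_chain(L, W, t=1.0, seed=0):
    """Dense 1D Anderson Hamiltonian with box disorder on [-W, W]."""
    rng = np.random.default_rng(seed)
    H = -t * (np.eye(L, k=1) + np.eye(L, k=-1))
    H[0, -1] = H[-1, 0] = -t                     # periodic boundary
    H += np.diag(rng.uniform(-W, W, size=L))
    return H

def jackson_kernel(n_moments):
    """Jackson damping factors g_n for n = 0 .. n_moments - 1."""
    n = np.arange(n_moments)
    q = np.pi / (n_moments + 1)
    return ((n_moments - n + 1) * np.cos(q * n)
            + np.sin(q * n) / np.tan(q)) / (n_moments + 1)

def kpm_dos(H, n_moments=200, n_vectors=20, n_energies=800, seed=0):
    """Density of states from Chebyshev moments mu_n = Tr T_n(H_scaled) / dim."""
    rng = np.random.default_rng(seed)
    dim = H.shape[0]
    bound = 1.01 * np.max(np.sum(np.abs(H), axis=1))   # Gershgorin bound
    Hs = H / bound
    R = rng.choice([-1.0, 1.0], size=(dim, n_vectors))  # random probe vectors
    mu = np.zeros(n_moments)
    v_prev, v_curr = R, Hs @ R
    mu[0] = np.sum(R * v_prev)
    mu[1] = np.sum(R * v_curr)
    for n in range(2, n_moments):
        # Chebyshev recursion: T_n = 2 x T_{n-1} - T_{n-2}
        v_prev, v_curr = v_curr, 2.0 * (Hs @ v_curr) - v_prev
        mu[n] = np.sum(R * v_curr)
    mu /= dim * n_vectors
    theta = np.pi * (np.arange(n_energies) + 0.5) / n_energies
    x = np.cos(theta)                                   # Chebyshev nodes
    coeff = jackson_kernel(n_moments) * mu
    coeff[1:] *= 2.0
    series = np.cos(np.outer(theta, np.arange(n_moments))) @ coeff
    rho = series / (np.pi * np.sqrt(1.0 - x**2)) / bound
    return x * bound, rho

if __name__ == "__main__":
    energies, rho = kpm_dos(anderson_chain(300, 1.0), n_moments=100, n_vectors=10)
    print(energies[np.argmax(rho)], rho.max())
```

The cost is dominated by matrix-vector products, so with a sparse Hamiltonian the same recursion scales linearly with the system size; a dense matrix is used here only to keep the sketch self-contained.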

IV. COARSE GRAINED METHODS
In this section and its subsections, we discuss coarse-graining as a unifying concept behind quantum cluster theories such as the CPA and DMFT, as well as their cluster extension, the DCA, which preserve the translational invariance of the original lattice problem. All quantum cluster theories are defined by their mapping of the lattice to a self-consistently embedded cluster problem, and the mapping from the cluster back to the lattice. The map from the lattice to the cluster in these quantum cluster methods may be obtained when the coarse-graining approximation is used to simplify the momentum sums implicit in the irreducible Feynman diagrams of the lattice problem (see subsection IV A).
As discussed in Secs. IV B and IV C this approximation is equivalent to the neglect of momentum conservation at the internal vertices, which is exact in the limit of infinite dimensions, and systematically restored in the DCA. The resulting diagrams are identical to those of a finite-sized cluster embedded in a self-consistently determined dynamical host. The cluster problem is then defined by the coarse-grained interaction and bare Green's function of the cluster. The mapping from the cluster back to the lattice is motivated in Sec. IV C 2 by the observation that irreducible or compact diagrammatic quantities are much better approximated on the cluster than their reducible counterparts. This mapping may also be obtained by optimizing the lattice free energy, as discussed in Sec IV C 3.

A. A few fundamentals
In this section, we will introduce two central paradigms in the physics of many-body systems: the Anderson and Hubbard models of disordered and interacting electrons on a lattice, respectively. We will then use perturbation theory to prove and demonstrate some fundamental ideas.
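Consider an Anderson model with diagonal disorder, described by the Hamiltonian $H = -t\sum_{\langle ij\rangle,\sigma}\left(c^\dagger_{i,\sigma}c_{j,\sigma} + \mathrm{h.c.}\right) + \sum_{i\sigma} V_i n_{i,\sigma}$, where $c^\dagger_{i,\sigma}$ creates a quasiparticle on site $i$ with spin $\sigma$, and $n_{i,\sigma} = c^\dagger_{i,\sigma}c_{i,\sigma}$. The disorder occurs in the local orbital energies $V_i$, which we assume are independent quenched random variables distributed according to some specified probability distribution $P(V)$.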
The effect of the disorder potential $\sum_{i\sigma} V_i n_{i,\sigma}$ can be described using standard diagrammatic perturbation theory (although we will eventually sum to all orders). It may be rewritten in reciprocal space as $\frac{1}{N}\sum_{i,k,k',\sigma} V_i\, c^\dagger_{k\sigma} c_{k'\sigma}\, e^{i r_i\cdot(k - k')}$.
Figure 11. The first few graphs in the irreducible self energy of a diagonally disordered system. Each • represents the scattering of a state k from sites (marked X) with a local disorder potential distributed according to some specified probability distribution $P(V)$. The numbers label the k states of the fully-dressed Green's functions, represented by solid lines with arrows.
The corresponding irreducible (skeleton) contributions to the self energy may be represented diagrammatically 77 and the first few are displayed in Figure 11. Here each • represents the scattering of an electronic Bloch state from a local disorder potential at some site X. The dashed lines connect scattering events that involve the same local potential. In each graph, the sums over the sites are restricted so that the different X's correspond to different sites. Consider the second diagram, whose contribution is written in terms of $G(k)$, the disorder-averaged single-particle Green's function for state $k$. The average over the distribution of scattering potentials, $\langle V_i^3\rangle = \langle V^3\rangle$, is independent of the position $i$ in the lattice. After summation over the remaining labels, this reduces to an expression that involves only $G(r = 0)$, the local Green's function. Thus the second diagram's contribution to the self energy involves only local correlations. Since the internal momentum labels always cancel in the exponential, the same is true for all non-crossing diagrams shown in the top half of Figure 11.
Only the diagrams with crossing dashed lines have non-local contributions. Consider the fourth-order diagrams such as those shown on the bottom left and upper right of Figure 11.
During the disorder averaging, we generate potential terms $\langle V^4\rangle$ when the scattering occurs from the same local potential (i.e. the third diagram) or $\langle V^2\rangle^2$ when the scattering occurs from different sites, as in the fourth diagram. When the latter diagram is evaluated, to avoid overcounting, we need to subtract a term proportional to $\langle V^2\rangle^2$ but corresponding to scattering from the same site. This term is needed to account for the fact that the fourth diagram should really only be evaluated for sites $i \neq j$. For example, upon evaluating the disorder average, the fourth diagram yields two terms (Eq. 22). Momentum conservation is restored by the sum over $i$ and $j$, i.e. over all possible locations of the two scatterers. It is reflected by the Laue functions, $\Lambda = N\delta_{k+\cdots}$, within the sums. Since the first term in Eq. 22 involves convolutions of $G(k)$, it reflects non-local correlations. Local contributions such as the second term in Eq. 22 can be combined with the contributions from the corresponding local diagrams, such as the third diagram in Figure 11, by replacing $\langle V^4\rangle$ in the latter by the cumulant $\langle V^4\rangle - \langle V^2\rangle^2$. Given the fact that different X's must correspond to different sites, it is easy to see that all crossing diagrams must involve non-local correlations. The developed formalism also works for interacting systems.
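Again we will use perturbation theory to illustrate some of these ideas, now for the Hubbard model, $H = -t\sum_{\langle ij\rangle,\sigma}\left(c^\dagger_{i\sigma}c_{j\sigma} + \mathrm{h.c.}\right) + U\sum_i n_{i\uparrow}n_{i\downarrow}$, where $c^\dagger_{j\sigma}$ ($c_{j\sigma}$) creates (destroys) an electron at site $j$ with spin $\sigma$, and $n_{i\sigma} = c^\dagger_{i\sigma}c_{i\sigma}$ stands for the particle number at a given site $i$. The first term describes the hopping of electrons between nearest-neighbor sites $i$ and $j$, and the $U$ term describes the interaction between two electrons once they meet at a given site $i$.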
As for the disordered case described above, the effect of the local Hubbard U potential can be described using standard diagrammatic perturbation theory. The first few diagrams for the single-particle Green's function are shown in Figure 12.
Very similar arguments to those employed above may be used to show that the first self energy correction to the Green's function is local, whereas some of the higher order graphs reflect non-local contributions.
In this section, we will show that the DMFT and CPA share a common interpretation as coarse-graining approximations in which the propagators used to calculate the self energy $\Sigma$ and its functional derivatives are coarse-grained over the entire Brillouin zone. Müller-Hartmann 14,15 showed that it is possible to completely neglect momentum conservation, so that this coarse-graining becomes exact in the limit of infinite dimensions. For simple models like the Hubbard and Anderson models, the properties of the bare vertex are completely characterized by the Laue function $\Lambda$, which expresses the momentum conservation at each vertex. In a conventional diagrammatic approach, $\Lambda(k_1,k_2,k_3,k_4) = \sum_r e^{i r\cdot(k_1+k_2-k_3-k_4)} = N\delta_{k_1+k_2,k_3+k_4}$, where $k_1$ and $k_2$ ($k_3$ and $k_4$) are the momenta entering (leaving) each vertex through its Green's function legs. However, as the dimensionality $D \to \infty$, Müller-Hartmann showed that the Laue function reduces to $\Lambda_{D\to\infty}(k_1,k_2,k_3,k_4) = 1 + \mathcal{O}(1/D)$ 14. The DMFT/CPA assumes the same Laue function, $\Lambda = 1$.
[Figure caption] These graphs may be obtained from the full set of graphs shown in Figure 11 by replacing each graphical element (Green's function and impurity scattering lines) with its local analog, coarse-grained through the entire first Brillouin zone.
More generally, for an electron scattering from an interaction (boson) pictured in Figure 13, the corresponding Laue function is likewise set to 1. Thus, the conservation of momentum at internal vertices is neglected. We may then freely sum over the internal momentum labels of each Green's function leg and interaction, which collapses the momentum-dependent contributions and leaves only local terms.
These arguments may then be applied to the self energy Σ, which becomes a local (momentum-independent) function. These diagrams are shown in Figure 14.
It is easy to show that this reduction in the number and complexity of the graphs is fully equivalent to the neglect of momentum conservation at each internal vertex. This is accomplished by setting each Laue function within the sum (e.g., in Eq. 22) to 1. We may then freely sum over the internal momenta, leaving only local propagators. All non-local self energy contributions (crossing diagrams) must then vanish. For example, consider again the fourth graph at the bottom of Figure 11.
If we replace the Laue function $N\,\delta_{k_1+k_4,\,k_5+k_3} \to 1$ in Eq. 22, then the two contributions cancel and this diagram vanishes.
Thus an alternate definition of the CPA, in terms of the Laue functions $\Lambda$, is $\Lambda_{\rm CPA} = 1$; i.e., the CPA is equivalent to the neglect of momentum conservation at all internal vertices of the disorder-averaged irreducible graphs. The same definition applies to the DMFT for the Hubbard model. This will be shown below in the context of a generating functional based derivation.
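Now it is easy to see that both DMFT and CPA exploit the locality of the self energy $\Sigma(\omega)$ in their construction. As a result, the two algorithms are very similar: both map the lattice problem onto an impurity embedded in an effective medium, described by a local self energy $\Sigma(\omega)$ which is determined self-consistently. The perturbative series for the self energy $\Sigma$ in the DMFT/CPA are identical to those of the corresponding impurity model, so that conventional impurity solvers may be used. However, since most impurity solvers can be viewed as methods that sum all the graphs, not just the skeleton ones, it is necessary to exclude $\Sigma(\omega)$ from the bare local propagator $\mathcal{G}(\omega)$ input to the impurity solver in order to avoid overcounting the local self energy corrections [17]. This is typically done via the Dyson equation, $\mathcal{G}(\omega)^{-1} = \bar{G}(\omega)^{-1} + \Sigma(\omega)$.

A generalized algorithm constructed for such local approximations is the following (see Figure 15):
(i) An initial guess for $\Sigma(\omega)$ is chosen (usually from perturbation theory).
(ii) $\Sigma(\omega)$ is used to calculate the corresponding coarse-grained local Green's function $\bar{G}(\omega) = \frac{1}{N}\sum_{k} G(k,\omega)$, where $G(k,\omega)$ is the lattice Green's function dressed with $\Sigma(\omega)$.
(iii) Starting from $\bar{G}(\omega)$ and the $\Sigma(\omega)$ used in the second step, the host Green's function $\mathcal{G}(\omega)^{-1} = \bar{G}(\omega)^{-1} + \Sigma(\omega)$ is calculated. It serves as the bare Green's function of the impurity model.
(iv) Starting from $\mathcal{G}(\omega)$, the impurity problem is solved for the local Green's function $G_{\rm imp}(\omega)$.
(v) A new estimate $\Sigma(\omega) = \mathcal{G}(\omega)^{-1} - G_{\rm imp}(\omega)^{-1}$ is formed and used in step (ii). The procedure is repeated until $\Sigma(\omega)$ converges.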
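The self-consistency loop described above can be sketched numerically for the CPA. The example below is a minimal illustration under stated assumptions, not the implementation used in the works reviewed here: the Brillouin-zone sum is replaced by an integral over an assumed semicircular bare density of states of half-bandwidth 1, the site energies are assumed to follow a box distribution of width `W`, and a small broadening `eta` is added to the frequency. The function name `cpa_box_disorder` and all parameter values are hypothetical.

```python
import numpy as np

def cpa_box_disorder(W, n_omega=301, n_eps=1001, n_dis=200,
                     eta=0.02, n_iter=300, tol=1e-8):
    """CPA loop for box disorder of width W (all parameters are illustrative)."""
    omega = np.linspace(-3.0, 3.0, n_omega)
    z = omega + 1j * eta
    # Assumed bare DOS: semicircle on [-1, 1], used as normalized quadrature weights
    eps = np.linspace(-1.0, 1.0, n_eps)
    weight = np.sqrt(1.0 - eps**2)
    weight /= weight.sum()
    # Midpoint samples of the site energy V on [-W/2, W/2]
    V = (np.arange(n_dis) + 0.5) / n_dis * W - 0.5 * W

    sigma = np.zeros(n_omega, dtype=complex)      # initial guess for the self energy
    for _ in range(n_iter):
        # Coarse-grained local Green's function
        g_bar = (weight / (z[:, None] - sigma[:, None] - eps)).sum(axis=1)
        # Host propagator: local self energy removed via the Dyson equation
        g_host_inv = 1.0 / g_bar + sigma
        # Impurity "solver": exact disorder average of the single-site problem
        g_imp = (1.0 / (g_host_inv[:, None] - V)).mean(axis=1)
        # New self energy estimate
        sigma_new = g_host_inv - 1.0 / g_imp
        converged = np.max(np.abs(sigma_new - sigma)) < tol
        sigma = sigma_new
        if converged:
            break
    return omega, sigma, g_imp
```

For instance, `omega, sigma, g_imp = cpa_box_disorder(1.0)` returns a self energy with a non-positive imaginary part, and `-g_imp.imag / np.pi` is the disorder-averaged density of states on the chosen frequency window.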

C. The Dynamical cluster approximation
In this section, we will review the dynamical cluster approximation (DCA) formalism [23,24,46,156]. We motivate the fundamental idea of the DCA, which is coarse-graining, and then use it to define the relationship between the cluster and lattice at the one- and two-particle level. The coarse-graining transformation is set by averaging the function within each cell, as illustrated in Figure 17. For an arbitrary function $f(k)$ (with $k = K + \tilde{k}$), this corresponds to

$$\bar{f}(K) = \frac{N_c}{N}\sum_{\tilde{k}} f(K + \tilde{k}), \qquad (28)$$

where $\tilde{k}$ labels the wavenumbers within the coarse-graining cell surrounding $K$. According to Nyquist's sampling theorem [157], to reproduce the function $f$ at lengths $\lesssim L/2$, we only need to sample the reciprocal space at intervals of $\Delta k \approx 2\pi/L$. Eq. 28 may be interpreted as the sum of $N/N_c$ such samplings.
Knowledge of $f$ on a finer scale in momentum than $\Delta k$ is unnecessary, and may be discarded to reduce the complexity of the problem. For example, convolutions of periodic functions $f$ may be approximated as

$$\frac{1}{N}\sum_{k} f(k)\,f(k+q) \approx \frac{1}{N_c}\sum_{K} \bar{f}(K)\,\bar{f}(K+Q), \qquad (29)$$

where $Q = M(q)$. Eq. 29 is an approximation in which we first average the function over a set of $D$-dimensional cells and then perform a sum over the cells. This reduces the numerical complexity from order $N$ to order $N_c$ floating point operations. We saw above that when we completely neglect momentum conservation by first coarse-graining the interactions and Green's functions over the entire first Brillouin zone, the diagrams corresponding to non-local corrections vanish, leaving the reduced set of local diagrams which constitute the CPA, illustrated in Figure 14. The resulting approximation shares the limitations of a local approximation described above, including the neglect of non-local correlations.
The DCA systematically incorporates such neglected non-local correlations by systematically restoring the momentum conservation at the internal vertices of the self energy $\Sigma$. To this end, the Brillouin zone is divided into $N_c = L_c^D$ cells of size $\Delta k = 2\pi/L_c$ (cf. Figure 16 for $N_c = 8$). Each cell is represented by a cluster momentum $K$ in the center of the cell. We require that momentum conservation is (partially) observed for momentum transfers between cells, i.e., for momentum transfers larger than $\Delta k$, but neglected for momentum transfers within a cell, i.e., less than $\Delta k$. This requirement can be established by using the Laue function [24]

$$\Lambda_{\rm DCA}(k_1,k_2,k_3,k_4) = N_c\,\delta_{M(k_1)+M(k_2),\,M(k_3)+M(k_4)},$$

where $M(k)$ is a function which maps $k$ onto the momentum label $K$ of the cell containing $k$ (see Figure 16).

In the DCA, the cluster self energy is constructed from the coarse-grained average of the single-particle Green's function within the cell centered on the cluster momenta. This is illustrated for a fourth-order term in the self energy shown in Figure 18. Each internal leg $G(k)$ in a diagram is replaced by the coarse-grained Green's function

$$\bar{G}(K) = \frac{N_c}{N}\sum_{\tilde{k}} G(K + \tilde{k}),$$

and each interaction in the diagram is replaced by the coarse-grained interaction

$$\bar{V}(K) = \frac{N_c}{N}\sum_{\tilde{k}} V(K + \tilde{k}),$$

where $N$ is the number of points of the lattice and $N_c$ is the number of cluster momenta $K$. Provided that the propagators are sufficiently weakly momentum dependent, this is a good approximation. If $N_c$ is chosen to be small, the cluster problem can be solved using conventional techniques such as QMC. This averaging process also establishes a relationship between the systems of size $N$ and $N_c$. When $N_c = N$, a finite size simulation is recovered, so there are no mean-field embedding effects.
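The cell map $M(k)$, the coarse-graining average, and the DCA Laue function can be made concrete on a one-dimensional lattice. The following is a minimal sketch under stated assumptions: momenta are labeled by integers $j$ with $k_j = 2\pi j/N$, the cell centers are $K_n = 2\pi n/N_c$, $N$ is divisible by $N_c$, and the function names `cell_index`, `coarse_grain`, and `laue_dca` are hypothetical.

```python
import numpy as np

def cell_index(j, n, n_c):
    """M(k): label of the cell containing k_j = 2*pi*j/n.

    Integer arithmetic avoids floating-point ties at the cell boundaries.
    """
    return ((np.asarray(j) * n_c + n // 2) // n) % n_c

def coarse_grain(values, n_c):
    """f_bar(K) = (N_c/N) * sum over the cell of f(K + k~), for real f on the k grid."""
    values = np.asarray(values, dtype=float)
    n = values.size
    cells = cell_index(np.arange(n), n, n_c)
    return np.bincount(cells, weights=values, minlength=n_c) * n_c / n

def laue_dca(j1, j2, j3, j4, n, n_c):
    """DCA Laue function: N_c if the cell labels conserve momentum, else 0.

    j1, j2 label incoming momenta and j3, j4 outgoing momenta.
    """
    m = cell_index(np.array([j1, j2, j3, j4]), n, n_c)
    return n_c if (m[0] + m[1] - m[2] - m[3]) % n_c == 0 else 0
```

With `n_c = 1` the Laue function is always 1 (the local, DMFT/CPA-like limit), while with `n_c = n` it reduces to $N\delta_{k_1+k_2,k_3+k_4}$ and `coarse_grain` returns the input unchanged, which mirrors the two limits discussed in the text.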
b. Map from the cluster back to the lattice

Once the cluster problem is solved, we use its solution to approximate the lattice problem. This may be done in a number of ways, and it is not a priori clear which way is optimal. At the single-particle level, we could, e.g., calculate the cluster single-particle Green's function and use it to approximate the lattice result, $G_l(k,\omega) \approx G_c(M(k),\omega)$. Or, at the other extreme, we could calculate the self energy on the cluster, use it to first approximate the lattice result, $\Sigma_l(k,\omega) \approx \Sigma_c(M(k),\omega)$, and then use the Dyson equation $G_l(k,\omega) = \left[G_{l0}(k,\omega)^{-1} - \Sigma_c(M(k),\omega)\right]^{-1}$, with $G_{l0}$ the bare lattice Green's function. The second way is far better. We will motivate this mapping with more rigor in the next part, where we calculate and optimize the free energy, but here we offer a physically intuitive motivation.

FIG. 19. Path-integral interpretation of the screening of a propagating particle. The single-particle lattice Green's function, $G_l$, describes the quantum phase and amplitude the particle accumulates along its path as it propagates from space-time location 0 to $x$. It is poorly approximated by the cluster Green's function from a small cluster calculation, $G_l \approx G_c$, especially when $x$ exceeds the linear cluster size $L_c$. Its self energy, which describes generally short-ranged screening processes of range $r$, is well approximated by a small cluster calculation, $\Sigma_l \approx \Sigma_c$, especially when the cluster size $L_c$ is greater than the screening length. As discussed in Sec. II, this screening length $r_{TF} \approx r$ may be less than an Angstrom for a good metal.

So, rather than directly approximating the lattice Green's function by the cluster Green's function, the cluster self energy is used to approximate the lattice self energy in a Dyson equation for the lattice Green's function,

$$G_l(k,\omega) = \frac{1}{G_{l0}(k,\omega)^{-1} - \Sigma_c(M(k),\omega)},$$

where $G_{l0}$ is the bare lattice Green's function.
Physically, this is justified by the fact that irreducible terms like the self energy are short ranged, while reducible quantities like $G$ must be able to reflect the long length and time scale physics. This is illustrated in Figure 19. As the particle propagates from the origin to space-time location $x$, the quantum phase and amplitude it accumulates is described by the single-particle Green's function $G(x)$. Consequently, if $x$ is larger than the size of the DCA cluster, then $G(x)$ is poorly approximated by the cluster Green's function. However, the self energy $\Sigma$ describes the many-body processes that produce the screening cloud surrounding the particle. As we saw in Sec. II C, these distances are typically very short, on the order of an Angstrom or less, so the lattice self energy is often well approximated by the cluster quantity.

DCA: a generating functional derivation
Finally, in this section, we will derive the DCA for the Hubbard model using the Baym generating functional formalism. The generating functional $\Phi$ is the collection of all compact closed graphs that may be constructed from the fully dressed single-particle Green's function and the bare interaction. Starting from the generating functional, it is straightforward to generate the diagrams of the fully irreducible self energy and of the irreducible vertex function needed in the calculation of the phase diagram. Note that in terms of Feynman graphs, each functional derivative $\delta/\delta G_\sigma$ is equivalent to breaking a single Green's function line. So, the self energy $\Sigma_\sigma$ is obtained from a functional derivative of $\Phi$, $\Sigma_\sigma = \delta\Phi/\delta G_\sigma$, and the irreducible vertices from $\Gamma_{\sigma\sigma'} = \delta\Sigma_\sigma/\delta G_{\sigma'}$. Since it also yields the free energy, Baym's formalism is quite useful for proving a few essentials.

a. Map from the lattice to the cluster

To derive the DCA, we first apply the DCA coarse-graining procedure to the diagrams in the generating functional $\Phi(G, U)$. In the DCA, we obtain an approximate $\Phi_c$ by applying the DCA Laue function to the internal vertices of the lattice $\Phi_l$. This is illustrated for the second order term in Figure 20. The corresponding term in the self energy, $\Sigma^{(2)}$, is obtained from a functional derivative of $\Phi^{(2)}$, $\Sigma^{(2)}_\sigma = \delta\Phi^{(2)}/\delta G_\sigma$, and the irreducible vertices from $\Gamma^{(2)}_{\sigma\sigma'} = \delta\Sigma^{(2)}_\sigma/\delta G_{\sigma'}$. This is illustrated for the second order self energy in Figure 21.
Above, we justified these approximations in wavenumber space; however, one may also make a real-space argument. In high spatial dimensions $D$, one may show [13,14] that $G(r,\tau)$ falls off exponentially quickly with increasing $r$, $G(r,\tau) \sim t^{r} \propto D^{-r/2}$, while the interaction remains local. Thus, when $D = \infty$ all non-local graphs vanish. In finite $D$, due to causality, we may expect the Green's functions to fall off exponentially for large time displacements, whereas the decay of the quasiparticle ensures that they also fall off exponentially with large spatial displacements. So, one may safely assume that longer range graphs are "smaller" in magnitude. Now, consider a non-local correction to the local approximation where only graphs constructed from $G(r = 0, \tau)$ enter.
The first such graph would be when all vertices are at r = 0 apart from one which is on a near neighbor to r = 0, which we will label as r = 1. We allow G(r = 1)/G(r = 0) to be the "small" parameter. It is easy to see that the first non-local correction to Φ is fourth order in G(r = 1)/G(r = 0).
Likewise, the first such corrections to the self energy are third order while those for the Green's function itself are first order in G(r = 1)/G(r = 0). Thus, the approximation where lattice quantities are approximated by cluster quantities, is much better for the self energy than for the Green's function.
Thus, the most accurate approximation is to replace the lattice generating functional with the cluster result, $\Phi_l \approx \Phi_c$, to approximate the lattice self energy by the cluster result, $\Sigma_l(k) \approx \Sigma_c(K)$, and to use it in the lattice Dyson equation to form the lattice single-particle Green's function.
Summarizing, the map from the lattice to the cluster is accomplished by replacing $G(k)$ by $\bar{G}(K)$ and the interaction $V(k)$ by $\bar{V}(K)$ in the diagrams for the generating functional.
These are precisely the generating functional, self energy and vertex diagrams of a finite size cluster with a bare Hamiltonian defined by $\mathcal{G}$, and an interaction determined by the bare coarse-grained $\bar{V}(K)$. In this mapping from the lattice to the cluster, the complexity of the problem has been greatly reduced, since this cluster problem may often be solved exactly and with multiple methods, including quantum Monte Carlo [158].

b. Map from the cluster back to the lattice

We may accomplish the mapping from the cluster back to the lattice problem by optimizing the lattice estimate of the free energy. The corresponding DCA estimate for the free energy is

$$F_{\rm DCA} = -k_B T\left(\Phi_c - \mathrm{Tr}\left[\Sigma_\sigma G_\sigma\right] - \mathrm{Tr}\ln\left[-G_\sigma\right]\right),$$

where $\Phi_c$ is the cluster generating functional. The trace indicates summation over frequency, momentum and spin.
We may prove that the corresponding optimal estimates of the lattice self energy and irreducible lattice vertices are the corresponding cluster quantities. $F_{\rm DCA}$ is stationary with respect to $G_\sigma$,

$$-\frac{1}{k_B T}\frac{\delta F_{\rm DCA}}{\delta G_\sigma(k)} = \Sigma^c_\sigma(M(k)) - \Sigma_\sigma(k) = 0,$$

which means that $\Sigma_l(k) = \Sigma_c(M(k))$ is the proper approximation for the lattice self energy corresponding to $\Phi_c$. The corresponding lattice single-particle propagator is then given by $G_l(k,\omega) = \left[G_{l0}(k,\omega)^{-1} - \Sigma_c(M(k),\omega)\right]^{-1}$, with $G_{l0}$ the bare lattice Green's function. A similar procedure is used to construct the two-particle quantities needed to determine the phase diagram or the nature of the dominant fluctuations that can eventually destroy the quasiparticle. This procedure is a generalization of the method of calculating response functions in the DMFT [17,159].
In the DCA, the introduction of the momentum dependence in the self energy allows one to detect some precursors to transitions which are absent in the DMFT; but for the actual determination of the nature of the instability, one needs to compute the response functions. These susceptibilities are thermodynamically defined as second derivatives of the free energy with respect to external fields. $\Phi_c(G)$ and $\Sigma^c_\sigma$, and hence $F_{\rm DCA}$, depend on these fields only through $G_\sigma$ and $G^0_\sigma$. Following Baym [160,161], it is easy to verify that the approximation $\Gamma_{\sigma\sigma'} \approx \Gamma^c_{\sigma\sigma'} \equiv \delta\Sigma^c_\sigma/\delta G_{\sigma'}$ yields the same estimate that would be obtained from the second derivative of $F_{\rm DCA}$ with respect to the applied field.
For example, the first derivative of the free energy with respect to a spatially homogeneous external magnetic field $h$ is the magnetization $m$, and the susceptibility $\chi = \partial m/\partial h$ is given by the second derivative. We substitute $G_\sigma = \left(G^{0\,-1}_\sigma - \Sigma^c_\sigma\right)^{-1}$ and evaluate the derivative. If we identify $\chi_{\sigma,\sigma'} = \sigma'\,\delta G_\sigma/\delta h$ and $\chi^0_\sigma = G^2_\sigma$, collect all of the terms within both traces, and sum over the cell momenta $\tilde{k}$, we obtain the two-particle Dyson equation for the coarse-grained susceptibility, which has the schematic matrix form $\bar{\chi} = \bar{\chi}^0 + \bar{\chi}^0\,\Gamma^c\,\bar{\chi}$. We see again that it is the irreducible quantity, this time the irreducible vertex function $\Gamma$, for which cluster and lattice correspond.
Summarizing, the mapping from the cluster back to the lattice problem is accomplished by approximating the lattice generating functional by the cluster result $\Phi_c$; optimizing the resulting free energy with respect to its functional derivatives then yields $\Sigma_l \approx \Sigma_c$ and $\Gamma_l \approx \Gamma_c$.

In this section, we address these difficulties associated with the construction of a mean field theory with no known nontrivial exact limiting solution. Our approach will be to construct a theory which inherits the desirable properties of the DMFT/CPA and DCA in the weak disorder limit, while also incorporating the TDOS order parameter into the mean field host, ensuring that the method is also able to capture localization phenomena. The natural way to improve upon the local TMT is to construct a cluster extension which satisfies the constraints mentioned in Sec. II A, which when rephrased in terms of clusters are:

1. We approximate the coupling of the clusters to their lattice environment at the single-particle level (akin to the Fermi golden rule), neglecting two-particle and higher processes. This coupling is proportional to the square of a matrix element between the cluster and its host, times an appropriate DOS which describes the states available on the surrounding clusters.
2. Since on average each cluster is equivalent to all the others, this DOS will also be proportional to some appropriate cluster density of states. And, since the distribution of the DOS is highly skewed, the typical DOS is quite different than the average DOS. The typical cluster DOS, which is clearly more representative of the local environment, will be used to define the effective medium. Betts [169][170][171] .
All proposed algorithms are fully causal. The first two algorithms discussed below may be shown to be causal with a proof involving two conformal maps 8,24 . This proof is not applicable to the multiband methods; however, we have not observed any causality violations in the iteration of the resulting equations.
All of the algorithms recover the DCA in the weak disorder limit, whereas they do not all recover the TMT when N c = 1.
There appears to be a trade-off between this and maintaining the independence of the scatterings at different energies. Each of the algorithms becomes equivalent to a finite size simulation when $N = N_c$, so they all recover the exact result in this limit, and the thermodynamic limit for large $N$. On the other hand, the injunction against self averaging in item 9 is a bit subtle. Moreover, because there is no known nontrivial limit in which the formalism becomes exact, the defining Ansatz for this formalism is not uniquely defined. In consideration of this, we will be guided by the desirable properties listed above.
We found two Ansatze which satisfy most of these desirable properties.
• Ansatz 1

$$\rho^c_{typ}(K,\omega) = \exp\left(\frac{1}{N_c}\sum_{i=1}^{N_c}\left\langle \ln\rho^c_{ii}(\omega,V)\right\rangle\right)\left\langle\frac{\rho^c(K,\omega,V)}{\frac{1}{N_c}\sum_{i}\rho^c_{ii}(\omega,V)}\right\rangle \qquad (44)$$

When the cluster size $N_c = 1$, this Ansatz [10] recovers the local TMT with $\rho_{typ}(\omega) = e^{\langle\ln\rho(\omega,V)\rangle}$. For weak disorder, the TMDCA recovers the average DCA results, with $\rho_{typ}(K,\omega) \approx \langle\rho(K,\omega,V)\rangle$. And in the limit of $N_c \to \infty$, the TMDCA becomes exact. Hence, between these limits, Ansatz 1 of the TMDCA systematically incorporates non-local correlations into the local TMT. Since this Ansatz uses the TDOS, to get the typical cluster Green's function $G^c_{typ}(K,\omega)$ we use a Hilbert transformation,

$$G^c_{typ}(K,\omega) = \int d\omega'\,\frac{\rho^c_{typ}(K,\omega')}{\omega - \omega'}, \qquad (45)$$

with $\rho^c(K,\omega,V) = -\frac{1}{\pi}\mathrm{Im}\,G^c(K,\omega,V)$.

• Ansatz 2

While Ansatz 1 works rather well for simple single-band models with local and non-local disorder, we find that it can suffer from numerical instabilities when applied to complex first-principles effective Hamiltonians with many orbitals and non-local disorder potentials. Such numerical instabilities arise due to the Hilbert transformation which is used to calculate the Green's function from the typical density of states $\rho^c_{typ}(K,\omega)$. To avoid such numerical instabilities, we constructed the following Ansatz 2 [173], where we calculate $G^c_{typ}(K,\omega)$ directly as

$$G^c_{typ}(K,\omega) = \exp\left(\frac{1}{N_c}\sum_{i=1}^{N_c}\left\langle \ln\rho^c_{ii}(\omega,V)\right\rangle\right)\left\langle\frac{G^c(K,\omega,V)}{\frac{1}{N_c}\sum_{i}\rho^c_{ii}(\omega,V)}\right\rangle. \qquad (46)$$

Ansatz 2 again incorporates the typical value of the local density of states. The resulting formalism again becomes exact in the limit of $N_c \to \infty$, promotes numerical stability of the algorithm, and converges quickly with cluster size. As noted in Table V B, it does not reproduce the TMT when $N_c = 1$. This is due to the lack of a limit where the formalism is exact, so that the Ansatz may be uniquely defined.
These two Ansatze will be used below as paradigms for the development of Ansatze for more realistic systems and will be referred to as Ansatz 1 and 2, respectively.
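The two Ansatze can be compared on a small cluster. The sketch below is illustrative only and rests on several assumptions that are not taken from the text: a one-dimensional ring cluster, a hypothetical coarse-grained dispersion `eps_bar`, a constant hypothetical hybridization `delta`, a box distribution of site energies of width `W`, and a single frequency `omega`. It evaluates the prefactor-times-linear-average structure described above, in which the geometric average of the local density of states multiplies a linearly averaged, locally normalized cluster quantity. The function name `typical_cluster_averages` is hypothetical.

```python
import numpy as np

def typical_cluster_averages(W, n_c=4, omega=0.0, n_samples=2000, seed=0):
    """Typical (Ansatz 1 and 2) and average cluster quantities at one frequency."""
    rng = np.random.default_rng(seed)
    R = np.arange(n_c)                                # cluster sites on a ring
    K = 2.0 * np.pi * np.arange(n_c) / n_c            # cluster momenta
    phase = np.exp(1j * np.outer(R, K))               # phase[n, k] = exp(i K_k R_n)
    # Hypothetical cluster-excluded host: G_host(K)^-1 = omega - eps_bar(K) - delta
    eps_bar = -0.5 * np.cos(K)
    delta = -0.1j
    host_inv_K = omega - eps_bar - delta
    host_inv = (phase * host_inv_K) @ phase.conj().T / n_c   # real-space matrix

    # Disorder configurations and the corresponding cluster Green's functions
    V = rng.uniform(-0.5 * W, 0.5 * W, size=(n_samples, n_c))
    G = np.linalg.inv(host_inv[None, :, :] - V[:, :, None] * np.eye(n_c))
    G_K = np.einsum('nk,snm,mk->sk', phase.conj(), G, phase) / n_c
    rho_K = -G_K.imag / np.pi                          # rho(K, omega, V)
    rho_ii = -np.diagonal(G, axis1=1, axis2=2).imag / np.pi
    rho_loc = rho_ii.mean(axis=1)                      # (1/N_c) sum_i rho_ii

    prefactor = np.exp(np.mean(np.log(rho_ii)))        # geometric (typical) average
    rho_typ = prefactor * np.mean(rho_K / rho_loc[:, None], axis=0)   # Ansatz 1
    g_typ_2 = prefactor * np.mean(G_K / rho_loc[:, None], axis=0)     # Ansatz 2
    rho_ave = rho_K.mean(axis=0)                       # conventional average
    return rho_typ, rho_ave, g_typ_2, prefactor
```

In this sketch the imaginary part of the Ansatz 2 Green's function reproduces the Ansatz 1 typical density of states, so the two differ only in how the real part is obtained (directly, or via a Hilbert transform). For `n_c = 1` the typical density reduces to the geometric average alone, and for very small `W` the typical and average densities coincide, in line with the limits stated in the text.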
The main modification of the DCA self-consistency loop for the TMDCA involves the calculation of the cluster typical Green's function $G^c_{typ}(K,\omega)$ using Eq. 44 and Eq. 45, or Eq. 46. The typical Green's function is then used to complete the self-consistency loop. A schematic diagram of the TMDCA self-consistency loop is shown in Figure 23. The TMDCA iterative procedure is described as follows:

1. We start with a guess for the cluster self energy $\Sigma(K,\omega)$, usually set to zero.
2. Then we calculate the coarse-grained cluster Green's function $\bar{G}(K,\omega)$ as $\bar{G}(K,\omega) = \frac{N_c}{N}\sum_{\tilde{k}}\left[G_{l0}(K+\tilde{k},\omega)^{-1} - \Sigma(K,\omega)\right]^{-1}$, with $G_{l0}$ the bare lattice Green's function.
3. The cluster problem is now set up by calculating the cluster-excluded Green's function $\mathcal{G}(K,\omega)$ as $\mathcal{G}(K,\omega) = \left[\bar{G}(K,\omega)^{-1} + \Sigma(K,\omega)\right]^{-1}$.
4. Since the cluster problem is solved in real space, we then Fourier transform $\mathcal{G}(K,\omega)$ to real space: $\mathcal{G}_{n,m}(\omega) = \frac{1}{N_c}\sum_{K}\mathcal{G}(K,\omega)\,e^{iK\cdot(R_n - R_m)}$.
5. We solve the cluster problem using, e.g., a random sampling simulation. Here, we stochastically generate random configurations of the disorder potential $V$. For each disorder configuration, we construct the new fully dressed cluster Green's function as $G^c(V) = \left(\mathcal{G}^{-1} - V\right)^{-1}$. We then calculate the disorder-averaged, typical cluster Green's function $G^c_{typ}(K,\omega)$ via the Hilbert transform using Eq. 45 for Ansatz 1, or directly from Eq. 46 if we use Ansatz 2.
6. With the cluster problem solved, we use the obtained typical cluster Green's function $G^c_{typ}(K,\omega)$ to obtain a new estimate for the cluster self energy, $\Sigma(K,\omega) = \mathcal{G}(K,\omega)^{-1} - G^c_{typ}(K,\omega)^{-1}$.
7. We repeat this procedure starting from step 2, until $\Sigma(K,\omega)$ converges to the desired accuracy.
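The iterative procedure above can be illustrated in its simplest limit, $N_c = 1$, where it reduces to the local TMT. The sketch below is not the authors' code and makes several simplifying assumptions: a Bethe lattice (semicircular bare density of states of half-bandwidth `D`), for which the self-consistency closes through the hybridization function as $\Delta(\omega) = (D^2/4)\,G_{typ}(\omega)$ rather than through the self energy; a box distribution of site energies of width `W`; a broadened Hilbert-transform kernel with width `eta`; and simple linear mixing. The function name `tmt_bethe` and all parameter values are hypothetical.

```python
import numpy as np

def tmt_bethe(W, D=1.0, n_omega=801, n_dis=400, eta=0.02, n_iter=150, mix=0.5):
    """Local typical medium loop (N_c = 1) on a Bethe lattice; illustrative only."""
    omega = np.linspace(-4.0, 4.0, n_omega)
    d_omega = omega[1] - omega[0]
    # Broadened Hilbert transform: G(w_i) = sum_j kernel[i, j] * rho(w_j)
    kernel = d_omega / (omega[:, None] - omega[None, :] + 1j * eta)
    # Midpoint samples of the site energy V on [-W/2, W/2]
    V = (np.arange(n_dis) + 0.5) / n_dis * W - 0.5 * W
    # Initial guess: clean semicircular density of states
    rho_typ = 2.0 / (np.pi * D**2) * np.sqrt(np.clip(D**2 - omega**2, 0.0, None))
    for _ in range(n_iter):
        g_typ = kernel @ rho_typ                  # typical Green's function
        delta = 0.25 * D**2 * g_typ               # Bethe-lattice hybridization
        # Single-site problem for every frequency and every site energy
        g_imp = 1.0 / (omega[:, None] + 1j * eta - V[None, :] - delta[:, None])
        rho = -g_imp.imag / np.pi                 # local DOS rho(omega, V) > 0
        rho_geo = np.exp(np.log(rho).mean(axis=1))    # geometric (typical) average
        rho_typ = mix * rho_geo + (1.0 - mix) * rho_typ
    return omega, rho_geo, rho.mean(axis=1)       # grid, typical DOS, average DOS
```

Running `tmt_bethe` for a weak and a strong disorder width shows the qualitative behavior the typical medium is designed to capture: the typical density of states at the band center is strongly suppressed at large `W`, while the arithmetically averaged density of states remains finite. The finite broadening `eta` keeps the typical density from vanishing exactly.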
We note that instead of using the self energy in the self-consistency, one can also use the hybridization function $\Delta(K,\omega)$. Both procedures are observed to converge to the same solution.

To illustrate these ideas, we will employ a simple binary alloy model with random nearest-neighbor hoppings. Each site may be one of two types, A and B, with a random diagonal potential depending on the type, $V_A$ and $V_B$, and hoppings between nearest neighbors $i$ and $j$, $t_{ij}$, are introduced as

$$t_{ij} = \begin{cases} t^{AA}, & i \in A,\ j \in A \\ t^{BB}, & i \in B,\ j \in B \\ t^{AB}, & i \in A,\ j \in B \\ t^{BA}, & i \in B,\ j \in A \end{cases}$$

with all others being zero. The hopping depends on the type of ion occupying sites $i$ and $j$. We will assume that the alloy is ...

Since physically the Green's function describes the amplitude and phase the particle accumulates as it propagates, we can expect, i.e., ...

In momentum space, if there is only nearest-neighbor hopping between all ions as in our simple example, the bare dispersion can be written as (the under-bar denotes matrices)

$$\underline{\epsilon}_k = \begin{pmatrix} t^{AA} & t^{AB} \\ t^{BA} & t^{BB} \end{pmatrix}\epsilon_k,$$

where in three dimensions for our simple model $\epsilon_k = -2t\left(\cos k_x + \cos k_y + \cos k_z\right)$ with $4t = 1$, which sets our unit of energy, and $t^{AA}$, $t^{BB}$, $t^{AB}$, and $t^{BA}$ are unitless prefactors. Using this, we may define a bare lattice propagator, and a corresponding diagrammatic perturbation theory for the lattice single-particle propagator $G(k,\omega)$. As done in previous sections, the DCA is constructed by applying the DCA Laue function at the internal vertices. This causes all the Green's functions and vertices to be replaced by their coarse-grained counterparts. The remaining details of the DCA formalism for off-diagonal disorder may then be defined by following the same procedures discussed in Sec. IV C.
To define the mean-field coupling between the cluster and its host, we introduce a DCA hybridization matrix $\underline{\Delta}$, which is related to the cluster Green's function through a $2\times2$ matrix equation. We then average over the translations and point group operations of the cluster to restore the expected symmetries of a disorder-averaged system. Our goal is to calculate the average $G_c(X - X')$ for each link $X - X'$. This may be done by assigning the components according to the occupancy of the cluster sites $I$ and $J$, with the other components being zero (for any disorder configuration, only 1/4 of the $G_c^{\alpha\beta}(X - X')$ are non-zero).
Once the average cluster Green's function $G_c$ is obtained, we can get the cluster self energy $\Sigma(K,\omega)$ or the hybridization function matrix $\underline{\Delta}(K,\omega)$ using the Dyson equation.
We then close the loop on the DCA algorithm by calculating the coarse-grained lattice Green's function (Eq. 57). A new estimate of the hybridization function is then formed, which may be used to define a new cluster problem, and so on. This procedure continues until $\underline{\Delta}$ converges.
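The assignment of the cluster Green's function to its species-resolved components can be sketched for a single disorder configuration. The example below is a minimal illustration under assumptions that are not taken from the text: an isolated four-site ring (no host), hypothetical values for the site potentials `V` and the unitless hopping prefactors `t_ab`, and a complex frequency `z`. The function names `cluster_green` and `beb_components` are hypothetical.

```python
import numpy as np

def cluster_green(z, species, V=(-0.4, 0.4), t_ab=((1.0, 0.6), (0.6, 0.3)), t=0.25):
    """Green's function of a binary-alloy ring (n >= 3 sites) for one configuration.

    species[i] is 0 for an A site and 1 for a B site; values are illustrative.
    """
    species = np.asarray(species)
    n = species.size
    H = np.diag(np.asarray(V, dtype=float)[species])      # diagonal disorder
    for i in range(n):
        j = (i + 1) % n
        # Off-diagonal disorder: hopping depends on the species at both ends
        H[i, j] = H[j, i] = -t * t_ab[species[i]][species[j]]
    return np.linalg.inv(z * np.eye(n) - H)

def beb_components(G, species):
    """Split G_IJ into G^{ab}_IJ = G_IJ if site I is a and site J is b, else 0."""
    species = np.asarray(species)
    out = np.zeros((2, 2) + G.shape, dtype=complex)
    for a in (0, 1):
        for b in (0, 1):
            mask = np.outer(species == a, species == b)
            out[a, b] = np.where(mask, G, 0.0)
    return out
```

For any configuration, exactly one of the four components is non-zero for each pair of sites, and the four components sum back to the full cluster Green's function. Averaging `beb_components` over many configurations (and over cluster symmetries) would give the disorder-averaged $2\times2$ matrix referred to in the text.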

TMDCA with off-diagonal disorder
In this section, we will discuss the modifications needed for the above DCA off-diagonal disorder formalism in order to incorporate the typical medium analysis [175]. In the presence of off-diagonal disorder, following BEB, the typical density of states becomes a $2\times2$ matrix, which we define as

$$\rho^{\alpha\beta}_{typ}(K,\omega) = \exp\left(\frac{1}{N_c}\sum_{i=1}^{N_c}\left\langle\ln\rho_{ii}(\omega)\right\rangle\right)\left\langle\frac{-\frac{1}{\pi}\mathrm{Im}\,G_c^{\alpha\beta}(K,\omega)}{\frac{1}{N_c}\sum_{i}\rho_{ii}(\omega)}\right\rangle, \qquad \alpha,\beta \in \{A, B\}. \qquad (58)$$

Here the scalar prefactor depicts the local typical (geometrically averaged) density of states, while the matrix elements are linearly averaged over the disorder. Also notice that the cluster Green's function $(G_c)_{IJ}$ and its components $G_c^{AA}$, $G_c^{BB}$ and $G_c^{AB}$ are defined in the same way as in Eqs. 52-56 above.
For $N_c = 1$ with only diagonal disorder ($t^{AA} = t^{BB} = t^{AB} = t^{BA}$), the above procedure reduces to the local TMT scheme. In this case, the diagonal elements of the matrix in Eq. 58 will contribute $c_A$ and $c_B$, respectively, with the off-diagonal elements being zero (for $N_c = 1$ the off-diagonal terms vanish because a given site can only be either A or B).
Hence, the typical density reduces to the local scalar prefactor only, which has exactly the same form as in the local TMT scheme.
Another limit of the proposed Ansatz for the typical density of states of Eq. 58 is obtained at small disorder. In this case, the TMDCA reduces to the DCA for off-diagonal disorder, as the geometrically averaged local prefactor is canceled by the linearly averaged local term in the denominator of Eq. 58.
Once the first Ansatz is used to calculate the typical spectra $\rho^{\alpha\beta}_{typ}$, the typical Green's function $G^c_{typ}(K,\omega)$ is obtained by performing a Hilbert transform for each component,

$$G^{\alpha\beta}_{typ}(K,\omega) = \int d\omega'\,\frac{\rho^{\alpha\beta}_{typ}(K,\omega')}{\omega - \omega'}. \qquad (59)$$

Once the disorder-averaged cluster Green's function $G^c_{typ}(K,\omega)$ is obtained from Eq. 59, the self-consistency steps are the same as in the procedure for the off-diagonal disorder DCA. That is, we calculate the coarse-grained lattice Green's function $\bar{G}(K,\omega)$ using Eq. 57. Then, we use the obtained $\bar{G}(K,\omega)$ to update the hybridization function with the effective medium as $\Delta_{new} = \Delta_{old} + G^c_{typ}(K,\omega)^{-1} - \bar{G}(K,\omega)^{-1}$, which is used to construct a new input to the cluster problem. The procedure is repeated until numerical convergence is reached.

D. TMDCA for multi-orbital systems
Since realistic materials also have multiple orbitals, the TMDCA formalism has been generalized to multi-orbital systems at the simple model level [12] as well as for realistic materials ...

Here, as before, $V_i$ describes the random disorder potential, and $U$ is the strength of the electron-electron interaction between electrons at site $i$.

Electron
where Σ Int (U ) is a thermodynamically averaged self energy matrix that may be derived through a real-space, real-

Second order perturbation theory
In order to understand the effect of weak interactions on the critical disorder strength, as well as to investigate their effect on the mobility edge, we have incorporated a straight second order perturbation theory (SOPT) in the cluster momentum space into the TMDCA formalism [11]. In the constructed SOPT formalism, the interacting self energy $\Sigma^{Int}$ is obtained from the first and second order perturbation theory contributions (shown in Figure 25). Here the first term is the static Hartree correction $\Sigma^{H} = U\tilde{n}_I/2$. The second term is the non-local second order contribution, constructed from $\tilde{\mathcal{G}}(i\omega_n, V, U)$, the Hartree-corrected host Green's function, which includes the shift $U\tilde{n}_I/2$; the cluster Green's function then follows from the Dyson equation. Although the above expression (Eq. 68) appears to imply that we evaluate the self energy on the Matsubara frequency axis, this is not really so. We use the spectral representation of the propagators within a Hilbert transform to get a real-frequency expression for the imaginary part of the self energy (for more details, see the Appendix of [11]). Further, the real part of the self energy is obtained through a Kramers-Kronig transform.
Once the cluster self energy due to the electron-electron interaction, $\Sigma^{Int}$, is obtained via Eq. 67, we then use Eq. 66 to get the interaction-corrected cluster Green's function for the given disorder configuration $V$. This is then used to calculate the typical density of states with Ansatz 1 of Eq. 44, with $\rho^c(K,\omega,V,U) = -\frac{1}{\pi}\mathrm{Im}\,G^c(K,\omega,V,U)$. The other parts of the TMDCA algorithm, namely the disorder averaging, coarse-graining, etc., remain exactly the same as in the non-interacting case described above in Section V. A second order (in $U$) self energy evaluated on the full cluster, either in real or momentum space, is capable of incorporating non-local dynamical effects. However, by construction, such a cluster solver is only valid for weakly interacting systems. If the system is strongly renormalized close to a metal-insulator transition due to the reduction in $\Delta$, then this method might break down, since the assumption of weak coupling is not valid for large $U/\Delta$.
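The Kramers-Kronig step, which recovers the real part of a retarded self energy from its imaginary part, can be sketched on a uniform real-frequency grid. This is a generic numerical illustration, not the scheme of the cited reference: it assumes the relation $\mathrm{Re}\,f(\omega) = \frac{1}{\pi}P\!\int d\omega'\,\mathrm{Im}\,f(\omega')/(\omega' - \omega)$ for a function analytic in the upper half plane, and evaluates the principal value by skipping the singular grid point and adding the derivative term that the skipped point contributes. The function name `kramers_kronig_real` is hypothetical.

```python
import numpy as np

def kramers_kronig_real(omega, im_part):
    """Real part of a retarded function from its imaginary part (uniform grid)."""
    omega = np.asarray(omega, dtype=float)
    im_part = np.asarray(im_part, dtype=float)
    h = omega[1] - omega[0]
    diff = omega[None, :] - omega[:, None]     # diff[i, j] = w_j - w_i
    np.fill_diagonal(diff, np.inf)             # drop the singular j = i term
    principal = (im_part[None, :] / diff).sum(axis=1) * h
    # The omitted j = i term of the regularized integrand is h * d(Im f)/dw
    principal += h * np.gradient(im_part, h)
    return principal / np.pi
```

As a check, for the single-pole function $f(\omega) = 1/(\omega + i\gamma)$ the routine reproduces $\mathrm{Re}\,f(\omega) = \omega/(\omega^2 + \gamma^2)$ away from the edges of the grid, up to small truncation errors from the finite frequency window.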

Stat DMFT approach
The SOPT method described above is applicable only in the weakly interacting regime. Unfortunately, for the strong coupling regime there are very few cluster solvers available for disordered interacting electron systems. The two most extensively used solvers that are numerically exact and capable of treating a wide range of energy and length scales are quantum Monte Carlo methods [183,184] and exact diagonalization [19,142-144].
Quantum Monte Carlo methods have been extended to clusters [46,158]. However, since the typical averaging has to be performed on the real-frequency spectral function, the ill-posed step of analytic continuation is required for every disorder configuration and in every TMDCA iteration, rendering them unusable here. Alternatively, exact diagonalization may be used, but as is well known, the cluster sizes that can be treated are very modest, and the associated computational expense is quite substantial. At present, the only fully non-local cluster solver available that is computationally feasible and yields a real-frequency self energy is a straight perturbation theory.
Thus, one has to resort to approximate cluster solvers, especially for investigating the strong coupling regime. Such a solver may be constructed by combining a non-perturbative real-frequency single-site solver and statistical DMFT [180]. The former must be capable of treating the moment formation and Kondo physics characteristic of the strong coupling regime.
It must also properly incorporate the eventual many-body screening of the local moment leading to a singlet ground state. The resulting formalism is then able to capture these local dynamical correlations due to $U$, while treating the cor...

Since, within stat-DMFT, the hybridization is site-dependent, rather than a single Kondo scale for the entire system, a distribution of Kondo scales, $P(T_K)$, is obtained. The form of such a distribution and its consequences for the properties of the disordered system have been extensively investigated using slave-boson methods and phenomenological arguments [185-187]. It has been seen in these studies that typical medium theory based calculations yield a Kondo scale distribution $P(T_K)$ exhibiting a long tail at higher Kondo scales, while diverging at a specific, lower bound scale. This is determined by the solution of the impurity problem in the particle-hole symmetric limit [188]. Extensions to statistical DMFT combined with the slave-boson solver yield a $P(T_K)$ that also has a long tail at larger $T_K$, but is not divergent at lower scales [125].
Instead, it is highly skewed, has a maximum at a specific scale, and has either a vanishing or a finite intercept depending on whether the disorder is below or above a critical disorder value. Such a distribution with a finite intercept has been shown to be a sufficient condition for the system to ex... distribution, as well as a physical self energy which encompasses disorder and interaction effects on an equal footing. Some of these results are reviewed in Sec. VII D.

F. Two-particle calculations
Up to this point, the theory has focused on the calculation of single particle quantities, i.e., the TDOS to capture the localization transition. However, most experimental measurements are described by two-particle Green's functions, including transport, most X-ray and neutron scattering, NMR, etc.
Therefore, the TMDCA has also been extended to include the description of two-particle quantities including vertex corrections 191 in a similar fashion as that in the CPA and DCA 46,158 .
In conventional mean-field theories such as the CPA and DCA, the order parameters are constructed from the lattice Green's function defined as where M (k) = K maps an arbitrary wave number k to the closest DCA cluster K and Σ(M (k), ω) is the self energy calculated on the cluster. If the order parameter is local, the order parameters may also be constructed from the cluster single-particle Green's function .
For example, consider the magnetization m. Since the expressions for m depend on the field h through the Green's function and through the dependence of Σ and ∆ on G, calculating the susceptibility dm/dh|_{h=0} from the cluster Green's function requires both δΣ/δG and δ∆/δG.
The former is the irreducible vertex function, but the lack of information on δ∆/δG prevents us from using this representation for the extended states. For the localized states, however, ∆ vanishes, so that δ∆/δG is not needed and the cluster Green's function can be used.
Since the scattering events at different ω are completely independent, we can avoid using δ∆/δG for the extended states by introducing a mixed representation that uses the lattice Green's function for the extended states and the cluster Green's function for the localized states, with ω_e denoting the mobility edge energy. Physically, this is more meaningful than the use of either of the formulas in Eqs. 70 and 71 alone. Below the mobility edge, ω < ω_e, all of the states are extended, and they may be described as states with a dispersion ε_k renormalized by Σ. Above the mobility edge, ω > ω_e, however, the electrons are localized to the cluster with ∆_σ(K, ω) = 0, so that δ∆_σ(K, ω)/δh = 0. These states cannot be described as extended states with a renormalized dispersion, so the usual interpretation fails, and it is much better to think in terms of states localized to the cluster, described by the cluster Green's function for frequencies above the localization edge. This leads to the main difference between the typical analysis of the two-particle quantities and the conventional CPA and DCA: for the states above the mobility edge, the TMDCA average cluster Green's function G^c_σ(K, ω) is used to construct the two-particle susceptibility matrix. Based on this, and on the observation that at convergence G^c = Ḡ, so that δ/δG^c_σ = δ/δḠ_σ = δ/δG^l_σ, the Bethe-Salpeter equation (Eq. 77) can be derived with the Green's function G^p of the mixed representation, where χ^{p0}_{σσ} = (G^p_σ(k, ω))^2. This equation may be described diagrammatically as in Figure 27. Again, the lattice momentum sums on k̃, where k = M(k) + k̃, render the direct solution of Eq. 77 intractable. Fortunately, since the irreducible vertex function above depends only on the momentum cell centers K, this equation may be coarse-grained by summing over the k̃, k̃′, ... labels. Written schematically, with the spin labels and prefactors suppressed, the corresponding coarse-grained Bethe-Salpeter equation becomes χ̄ = χ̄^{p0} + χ̄^{p0} Γ χ̄. The susceptibility corresponding to different physical quantities can be constructed through the two-particle Green's function.
For instance, the charge susceptibility can be constructed in this way; it is also used to calculate the DC conductivity at zero temperature for the single-band Anderson model, with results shown in Sec. VII C. In this typical analysis, the inclusion of the vertex corrections follows the same procedure as that described in 46,158 .

Figure 27. Bethe-Salpeter equation relating the two-particle Green's function χ and the irreducible vertex Γ. Here k, k′ and q represent momentum indices, ω and ν represent frequency indices (fermionic and bosonic, respectively), and the spin indices are suppressed. Note that for the disordered systems considered here, the scatterings are elastic and thus the energy is conserved along any fermionic Green's function line. Therefore, we only need two frequency indices to represent the frequency degree of freedom of the system.
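The structure of the coarse-grained Bethe-Salpeter equation and of the mixed representation can be illustrated with a minimal numerical sketch. It assumes that, at fixed frequency, the bare bubble and the irreducible vertex are plain matrices in the cluster momenta (K, K′); the function names, the four-momentum cluster and all numbers below are hypothetical and serve only to show the linear algebra, not the TMDCA implementation itself.

```python
import numpy as np

def mixed_green(g_lattice, g_cluster, omega, omega_e):
    """Use the lattice G for extended states (omega < omega_e) and the
    cluster G for localized states (omega >= omega_e), as in the text."""
    return np.where(omega < omega_e, g_lattice, g_cluster)

def solve_bethe_salpeter(chi0, gamma):
    """Solve chi = chi0 + chi0 @ gamma @ chi for chi (matrices in K, K')."""
    identity = np.eye(chi0.shape[0])
    return np.linalg.solve(identity - chi0 @ gamma, chi0)

# Hypothetical numbers for a four-momentum cluster at one frequency.
rng = np.random.default_rng(1)
chi0 = np.diag([0.8, 0.5, 0.5, 0.3])       # made-up coarse-grained bare bubble
gamma = 0.1 * rng.standard_normal((4, 4))
gamma = 0.5 * (gamma + gamma.T)            # made-up symmetric toy vertex
chi = solve_bethe_salpeter(chi0, gamma)

# Largest violation of the Bethe-Salpeter equation by the solution.
residual = np.abs(chi - (chi0 + chi0 @ gamma @ chi)).max()

# Mixed representation on a toy frequency grid with a mobility edge at 1.0.
omega = np.array([-1.0, 0.0, 2.0])
g_p = mixed_green(np.full(3, 1.0 + 0j), np.full(3, 2.0 + 0j), omega, omega_e=1.0)
```

Solving the linear system directly avoids forming an explicit inverse of 1 − χ̄^{p0} Γ, which is the usual choice when the vertex is not small.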

VI. METHODOLOGY FOR FIRST-PRINCIPLES STUDIES OF LOCALIZATION
There are two general methods which may be used to study localization from first principles. The first is a component-based approach wherein the calculation is split into three basic components, as depicted in Figure 28 and described in Secs. VI A and VI B below. Here, the DFT and TMDCA calculations are performed separately, connected by a second step in which a tight-binding model is extracted from the DFT, to be solved in the third, TMDCA step. The first two steps of this process are quite mature, allowing researchers to focus on the third step, as we have done thus far in this review.
Alternatively, in the integrated approach, the coarse-graining ideas behind the DCA, the typical medium analysis, and multiple-scattering-theory-based DFT are integrated together to form a fully self-consistent treatment of the problem.
This multiple-scattering formalism has been developed 192 , but as it has not yet been implemented in a real materials calculation, it is beyond the scope of this review.
In this section we focus on the component-based approach illustrated in Figure 28. Specifically, the first sub-section describes how to extract low-energy effective models of disordered materials using the Effective Disordered Hamiltonian Method (EDHM) 193 . The second sub-section describes how these models with real material parameters are inserted into the effective medium solver, in this case the TMDCA framework.

... at (x_i, x_j), etc. We have found that for many materials it is already highly accurate to retain only the single impurity potentials and neglect the higher order corrections 193,[199][200][201][202] .
Furthermore, we are typically interested in very dilute impurity concentrations for which Anderson and Mott localization take place. In this limit it is unlikely that multi-impurity corrections to the Hamiltonian need to be taken into account.
We emphasize that keeping only the single impurity potentials in Eq. 80 does not mean that multi-impurity scattering is neglected. At this point we are only deriving the low-energy Hamiltonian, which can, in principle, be solved by exact diagonalization, and such a solution takes multi-impurity scattering into account exactly to all orders.
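The distinction between truncating the Hamiltonian and truncating the scattering can be made concrete with a toy sketch. It assumes a one-dimensional, single-orbital ring as the host and a made-up finite-range single-impurity potential; none of the parameters or function names below come from the EDHM itself. The effective Hamiltonian is a linear superposition of single-impurity potentials, yet its exact diagonalization contains scattering off all impurities to all orders.

```python
import numpy as np

def host_hamiltonian(n_sites, t=1.0):
    """Clean 1D ring with nearest-neighbour hopping -t (toy host)."""
    hop = -t * np.eye(n_sites, k=1)
    hop[-1, 0] = -t                      # periodic boundary condition
    return hop + hop.T

def single_impurity_potential(n_sites, j, onsite=2.0, neighbor_shift=0.3,
                              hop_change=0.1):
    """Made-up finite-range potential induced by one impurity at site j."""
    v = np.zeros((n_sites, n_sites))
    v[j, j] += onsite                    # diagonal term on the impurity site
    for step in (-1, 1):
        nb = (j + step) % n_sites
        v[nb, nb] += neighbor_shift      # diagonal term extending to neighbours
        v[j, nb] += hop_change           # modified impurity-host hopping
        v[nb, j] += hop_change
    return v

def effective_hamiltonian(n_sites, impurity_sites):
    """Host plus a linear superposition of single-impurity potentials."""
    h = host_hamiltonian(n_sites)
    for j in impurity_sites:
        h = h + single_impurity_potential(n_sites, j)
    return h

# Three impurities, two of them adjacent so that their potentials overlap.
h_eff = effective_hamiltonian(32, impurity_sites=[3, 4, 17])
energies = np.linalg.eigvalsh(h_eff)     # exact spectrum of this configuration
```

Higher-order corrections in the cluster expansion would add terms that depend on pairs of impurity positions; they are deliberately absent here.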
In practice, the EDHM consists of three steps. In addition to chemical disorder it is also possible to take into account the influence of magnetic disorder by mapping the DFT onto a generalized spin-fermion model as we describe below. This is relevant for dilute magnetic semiconductors in which a strongly interacting impurity is embedded into a weakly interacting host.
In practice, the generalized spin-fermion model is derived as follows. First we perform spin-density functional theory (us- xj ↓ corresponding to the impurity at site x_j. In the generalized spin-fermion model, the impurity potential incorporates the effect of the strong Coulomb repulsion at the impurity site. As usual, c_{inσ} (c†_{inσ}) annihilates (creates) an electron with spin σ in unit cell r_i in the n-th host orbital, and τ_{σσ′} and S_j are the Pauli matrices and the spin-vector operator, respectively. The non-magnetic and magnetic co- Here we note that the impurity potential involves three spatial points labelled by i, i′ and j, meaning that if we place an impurity at site j, the processes from site i to i′ will be modified. We have recently performed such a derivation for Ga_{1−x}Mn_xN to resolve a long-standing debate on the valence state of Mn 206 .
The main advantage of this approach compared to deriving a multi-orbital Hubbard model 207 is that by treating the impurity spins classically one can avoid the fermion sign problem 208 and thus greatly reduce the computational expense of including interactions in the typical medium dynamical cluster approximation.
Recently, we also generalized the EDHM to include the treatment of phonons 209 . Rather than making a cluster expansion of the Wannier function based Hamiltonian of the electrons, a cluster expansion can be made in the force constant matrices of the phonons. This opens the way for studying disorder induced localization of phonons from first principles.

B. From the EDHM to TMDCA
In order to incorporate the EDHM into the TMDCA, we first need to convert the parameters derived from the EDHM into the form of the multi-orbital Anderson model used in the TMDCA. Moreover, since the derived impurity potentials are usually quite long-ranged, an appropriate coarse-graining procedure is needed to map the effective impurity potential from the lattice to the DCA cluster (cf. Sec. IV C). In the following, we outline the procedure for these two steps.

a. Extraction of the impurity potential
We start from the effective EDHM Hamiltonian, which consists of the Hamiltonian of the pure host material, with i, i′ corresponding to the site indices and n, n′ corresponding to the orbital indices, and the term V, defined in Eq. 81, which contains the impurity potential induced by the impurity located at site j. Since the impurity potential induced on neighboring sites has the same form for each impurity, the parameters in Eq. (81) can be rewritten in terms of the position relative to the impurity. Since the spin-independent and spin-dependent parameters have similar structures, we only show the transformation for the spin-independent parameter. The spin-dependent component can be inferred by analogy.
To investigate the structure of the impurity potential, we first look at the terms induced by a single impurity located at the origin, V^0, obtained by letting j = 0 in Eq. 81, and split it into three parts. The first term is diagonal disorder, which in general extends over a finite region around the origin. The second term is the off-diagonal disorder associated with hopping between the impurity site and a host site. The disorder induced by this term can be properly described in the Blackman formalism 174 . The last term is the off-diagonal disorder associated with hopping between two host sites that is induced by an impurity located on a site other than these two host sites. Because of this feature, the disorder caused by this term cannot be described properly in the original Blackman formalism, so a slight modification is made to include these terms in our calculation.
To extend the Blackman formalism, we first write H_eff for a specific disorder configuration, with impurities labeled by j. In Eq. 88 the first term is independent of the disorder configuration. The third term depends on the disorder con- ... j = i or i′, so in the Blackman formalism the hopping term W^{nn′}_{i,i′,σ} can be written as a 2 by 2 block matrix. Here, we use an underscore to denote the 2 by 2 matrices in the Blackman formalism and an overbar to denote quantities that are coarse-grained. The first two terms are configuration independent and translationally invariant in the Blackman formalism, so we can combine them into a single term, W^{1,nn′}_{i,i′,σ}, and identify the remaining term as W^{2,nn′}_{i,i′,σ}. Note that W^{2,nn′}_{i,i′,σ}, which is related to the last term of Eq. 85, is not translationally invariant even in the Blackman formalism and cannot be described in the original Blackman method, so a slight modification is made to account for these terms in DCA/TMDCA calculations.
b. Coarse-graining the impurity potential
Then, W^{nn′}_{i,i′,σ} is coarse-grained in the DCA cluster with periodic boundary conditions to obtain the cluster parameters. Since W^{1,nn′}_{i,i′,σ} is translationally invariant, it can be coarse-grained easily, in the same manner as the regular kinetic energy terms. W^{2,nn′}_{i,i′,σ}, however, still depends on the disorder configuration and is not translationally invariant, so it needs to be coarse-grained differently. The diagonal disorder component from Eq. (87) also includes an extended contribution, T^{nn′}_{jii} = T^{nn′}_{i−j,i−j}, which needs to be coarse-grained as well. The coarse-grained version of Eq. (87) then contains the diagonal disorder potential in the cluster. Since t^{nn′}_{IIσ} is local and translationally invariant, it is not modified by coarse graining, so we set it to the corresponding local term with indices nn′, 0σ. For the spin-dependent part, the same procedure can be carried out completely by analogy.
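The coarse-graining of a translationally invariant term "in the same manner as the regular kinetic energy terms" can be sketched for the simplest possible case. The example below assumes a one-dimensional, single-orbital dispersion and averages it over each DCA cell centred on a cluster momentum K; the function name, the grid size and the dispersion are illustrative assumptions, and the configuration-dependent term W^2 is not covered by this sketch.

```python
import numpy as np

def coarse_grain_1d(dispersion, n_cluster, n_fine=2000):
    """Average dispersion(k) over each DCA cell centred on K = 2*pi*n/N_c."""
    cluster_k = 2.0 * np.pi * np.arange(n_cluster) / n_cluster
    half_width = np.pi / n_cluster
    # Midpoint grid of k-tilde inside one cell, symmetric about zero.
    k_tilde = (np.arange(n_fine) + 0.5) / n_fine * 2.0 * half_width - half_width
    averaged = np.array([dispersion(kc + k_tilde).mean() for kc in cluster_k])
    return cluster_k, averaged

t = 1.0
cluster_k, eps_bar = coarse_grain_1d(lambda k: -2.0 * t * np.cos(k), n_cluster=4)

# Analytic cell average of -2t cos(k) for comparison.
cell = np.pi / 4
expected = -2.0 * t * np.cos(cluster_k) * np.sin(cell) / cell
```

The coarse-grained dispersion keeps the cosine form but with a reduced amplitude, which is the usual bandwidth renormalization on a finite cluster.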
The resulting W − ω (disorder-energy) phase diagram is shown in Figure 31. Here, we show the mobility edge trajecto-

B. Results for models with more realistic parameters
In this section we apply the typical medium analysis to more complex disordered systems, including those with off-diagonal disorder, multiple orbitals, and interactions. We continue by showing the application of the TMDCA to the calculation of two-particle quantities and by exploring the effect of interactions. Finally, we discuss the simulation of selected high-temperature superconductors and dilute magnetic semiconductors. To illustrate the method, we return to our simple model of an AB binary alloy. Comparing the mobility edge boundaries for N_c > 1 in Figure 33 with those obtained using the TMM, we find very good agreement. This again confirms the validity of our generalized TMDCA.

Multiple orbitals
The multi-orbital TMDCA with the Ansatz defined in Eq. 60 and Eq. 62 has been tested for a 3D Anderson model with two degenerate bands (denoted by a and b), so that both the nearest-neighbor hopping and the disorder potential are 2 × 2 matrices in the band basis. The intra-band hopping is set as t^{aa} = t^{bb} = 1, with finite inter-band hopping t^{ab}. The local inter-band disorder V^{ab}_i is set to zero, considering the two bands to be orthogonal to each other, so that the randomness comes only from the local intra-band disorder potential V.

C. Results for two-particle calculations
The typical analysis has been applied to the single-band Anderson model to calculate the DC conductivity 191 . As shown in Figure 38, the DC conductivity vanishes in the region where the TDOS is zero. This is expected: when the TDOS is zero, all states are localized on the cluster, the hybridization function also vanishes, and all clusters are isolated.

The convergence of the critical disorder strength W_c with the cluster size N_c is also studied. Figure 39 shows the DC conductivity at zero chemical potential as a function of disor- (Figure 39), but the deviation from the KPM calculations is in the correct direction, given that the KPM is a finite-size approximation and the conductivity vanishes near the critical disorder strength.
D. Results for interacting models

Results from SOPT
As discussed in the introduction, the interplay between disorder and interactions can be quite subtle and counter- Beyond U ∼ 1.0, deviations begin to appear, and the SOPT does not remain reliable.
One of the main results of this study was the absence of a sharp mobility edge separating the localized from the delocalized spectrum if the chemical potential is at or beyond the mobility edge of the corresponding non-interacting system. We show the result for both p-h symmetric and away from p-h symmetry cases in Figure 40. In Figure ?? the typ- band edges.
We also found that the width of the mobility edge depends on the location of the chemical potential 10 . Interestingly, we also found a dip in the density of states at the chemical potential, akin to a pseudogap, at disorder values very close to the critical disorder. Since this is the weak-coupling regime, this pseudogap could be a precursor of the Efros-Shklovskii Coulomb gap 177 . However, the present model has purely local interactions, while the Coulomb gap is found for long-range interactions, which have not been explored yet.

Results from Stat-DMFT
The role of strong interactions is also of great interest. Unfortunately, the second order perturbation theory based cluster solver is, naturally, restricted to the weakly interacting regime. Hence, to investigate the interplay of disorder and interactions in the strong coupling regime, we developed a real-space cluster solver based on statistical DMFT coupled with an impurity solver, namely the local moment approach, that is capable of capturing local Kondo physics in a nonperturbative way.
Since, within stat-DMFT, the hybridization is different for each site, the Kondo scale, T_K, acquires a highly non-trivial and skewed distribution, P(T_K), as shown in Figure 42. The right panel clearly shows that the frequency dependence is Fermi-liquid-like (ω^2) at low frequencies and crosses over to |ω|^α, with a disorder-dependent α < 2, at higher frequencies.
The crossover scale, ω_c(W), decreases with increasing W, lead- As the schematic suggests, a quantum critical point at W_c, identified by the vanishing of the crossover scale, separates a Fermi liquid phase from a second phase, which we simply call Phase-2. This second phase could not be identified within the TMDCA calculations, but can be speculated to be some kind of quantum spin liquid. It was also argued in that work that the quantum criticality cannot be of a local type or of a Hertz-Millis-Moriya type, and hence has to be of a new type.

E. Results of the first-principles studies of localization
The combined method EDHM+TMDCA (described in Sec. VI) has so far been applied to study localization from first principles in two types of functional materials: superconductors 12 and diluted magnetic semiconductors 173 . Due to its ability to access systems with multiple orbitals and complicated disorder potentials, it provides a powerful approach to study localization caused by the impurities in these functional materials in an unbiased and material-specific way.
1. Application to K_xFe_{2−y}Se_2
For example, among the iron-based superconductors, K_xFe_{2−y}Se_2 has been studied intensely because of its unique properties. It has a relatively high T_c of 31 Kelvin 217 and an exotic type of antiferromagnetic order. It was the first iron-based superconductor found to have only electron pockets and no hole pockets. Moreover, K_xFe_{2−y}Se_2 is strongly disordered due to a significant amount of Fe vacancies, and it is the only iron-based superconductor whose parent compound is an antiferromagnetic insulator instead of an antiferromagnetic metal 218 . Like other iron-based superconductors, it is quasi-two-dimensional, which makes it more sensitive to disorder. This raises the question of whether it can be an Anderson insulator. Because the strong disorder makes the precise number of electrons in K_xFe_{2−y}Se_2 difficult to quantify, we consider two extreme cases with fillings of 6.0 and 6.5 electrons per Fe. The true electron concentration should fall between these cases. As shown in Figure 45, the calculated ADOS (from the DCA) and TDOS indicate that, despite the strong Fe vacancy disorder and the low dimensionality, very few states in the Fe bands are Anderson localized for either filling.
Since those states reside far away from the Fermi level, it can be concluded that K_xFe_{2−y}Se_2 is not an Anderson insulator.

Studying localization of the impurity band is not only important for the transport properties, but is also essential for understanding the magnetic exchange mechanism in these materials.
When the carriers in the impurity band are localized, itinerant mechanisms of magnetism, such as double exchange, are ruled out in favor of other mechanisms such as superexchange 219 . Among the DMS materials, (Ga,Mn)N is of particular interest since Dietl 220 predicted its Curie temperature to be above room temperature. However, until now, this prediction remains far from being fulfilled, as various experiments lead to controversial conclusions concerning the ferromagnetism 221-225 .
To enhance the understanding of magnetism in (Ga,Mn)N, we have studied localization in this material from first principles. Figure 46 shows the calculated ADOS and TDOS of the minority band for various Mn concentrations. For Mn impurity concentrations below 10% (the compositional limit of (Ga,Mn)N), the chemical potential always sits above the mobility edge, indicating that the material is insulating due to localization. Moreover, when the Mn concentration is below 3%, the TDOS of the impurity band vanishes completely. The impurity band is then completely localized, which supports the dominance of the ferromagnetic superexchange mechanism over the double exchange mechanism at low concentrations.

Figure 46. ... showing that the impurity band is completely localized for x ≤ 0.03. The chemical potential is set to zero and is denoted by the dashed line. Inset: zoom of the DOS and TDOS around the chemical potential. Reprinted from 173 .
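The criterion used above, namely that states exist at a given energy while the typical DOS has vanished there, can be expressed as a short sketch. It assumes that the ADOS and TDOS are available on a common frequency grid; the threshold `tol`, the function names and the synthetic curves are hypothetical choices made for illustration and are not data from the (Ga,Mn)N calculation.

```python
import numpy as np

def localized_mask(ados, tdos, tol=1e-3):
    """True where states exist (ADOS > tol) but the typical DOS has vanished."""
    return (ados > tol) & (tdos < tol)

def is_fermi_level_localized(omegas, ados, tdos, mu=0.0, tol=1e-3):
    """Check whether the grid point closest to the chemical potential is localized."""
    idx = np.argmin(np.abs(omegas - mu))
    return bool(localized_mask(ados, tdos, tol)[idx])

# Synthetic curves: a Gaussian ADOS and a TDOS that vanishes for |omega| >= 1.
omegas = np.linspace(-3.0, 3.0, 601)
ados = np.exp(-omegas**2 / 2.0) / np.sqrt(2.0 * np.pi)
tdos = np.where(np.abs(omegas) < 1.0, ados, 0.0)

metal_like = is_fermi_level_localized(omegas, ados, tdos, mu=0.0)
insulator_like = is_fermi_level_localized(omegas, ados, tdos, mu=1.5)
```

In a real calculation the TDOS approaches zero continuously near the mobility edge, so the result depends on the chosen threshold and should be checked against cluster-size convergence.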

VIII. CONCLUSIONS
Over the past couple of decades, dynamical mean field theory and its generalization, the DCA, have become a major paradigm in the computational study of strongly correlated systems. They provide a new framework for the study of strong interactions. Interesting phenomena such as the metal-Mott insulator transition can be studied in a controllable fashion.
A glaring shortcoming of the CPA (a DMFT analog for disordered systems) is its limited ability to treat strong disorder.
The Anderson insulator due to disorder is completely absent not only due to the local nature of the method but also be- ization in a material-specific way. We also discuss the calculation of two-particle response functions, such as the conductivity, which can be directly measured in experiments.
A prominent advantage of the TMDCA is that it can include electronic interactions and treat disorder and interactions on an equal footing. Since the TMDCA uses a geometric average of the local DOS for the self-consistency, it requires a real-frequency cluster solver that provides reliable spectra for each disorder configuration. A general real-frequency cluster solver that covers the whole range of electronic interaction strengths would greatly improve the ability of the TMDCA to study the interplay between disorder and correlation effects.
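The difference between the geometric (typical) and arithmetic (average) local DOS can be illustrated with a minimal sketch. It assumes a one-dimensional Anderson chain with box disorder, solved by exact diagonalization with a Lorentzian broadening `eta`; this is a local, single-site illustration of the two averages only, not the cluster Ansatz or the self-consistency loop of the TMDCA, and all names and parameters are hypothetical.

```python
import numpy as np

def anderson_chain(n_sites, disorder_width, rng, t=1.0):
    """1D Anderson ring with box disorder V_i in [-W/2, W/2]."""
    onsite = rng.uniform(-disorder_width / 2, disorder_width / 2, n_sites)
    hop = -t * np.eye(n_sites, k=1)
    hop[-1, 0] = -t                                  # periodic boundary
    return np.diag(onsite) + hop + hop.T

def local_dos(hamiltonian, omegas, eta=0.05):
    """LDOS rho_i(w) = -Im G_ii(w + i*eta) / pi from exact diagonalization."""
    evals, evecs = np.linalg.eigh(hamiltonian)
    weights = np.abs(evecs) ** 2                     # |<i|n>|^2, shape (site, state)
    lorentz = (eta / np.pi) / ((omegas[:, None] - evals[None, :]) ** 2 + eta ** 2)
    return lorentz @ weights.T                       # shape (omega, site)

rng = np.random.default_rng(0)
omegas = np.linspace(-4.0, 4.0, 81)

# Collect the LDOS over sites and over 20 disorder configurations at W = 4.
samples = [local_dos(anderson_chain(64, 4.0, rng), omegas) for _ in range(20)]
ldos = np.concatenate(samples, axis=1)

ados = ldos.mean(axis=1)                             # arithmetic average
tdos = np.exp(np.log(ldos).mean(axis=1))             # geometric average
```

The geometric average never exceeds the arithmetic one, and the two coincide only when the local DOS is the same on every site, which is why the typical DOS is sensitive to the strong site-to-site fluctuations that accompany localization.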
We presented two calculations for the Anderson-Hubbard model using two perturbation-based cluster solvers, which are suitable for weak and strong interactions, respectively.
Most significantly, we show that in the limit of strong disorder and weak interactions, treated perturbatively, the phenomena of 3D localization, including a mobility edge, remain intact. However, the metal-insulator transition is pushed to larger disorder values by the local interactions. We also study the limit of strong disorder and strong interactions, capable of producing moment formation and screening, with a non-perturbative local approximation. Here, we find that the Anderson localization quantum phase transition is accompanied by a quantum-critical fan in the energy-disorder phase diagram.
The TMDCA has been successfully combined with the Den-