An Overview of Self-Consistent Field Calculations Within Finite Basis Sets

Lehtola, Susi; Blockhuys, Frank; Van Alsenoy, Christian

doi:10.3390/molecules25051218

Open AccessArticle

An Overview of Self-Consistent Field Calculations Within Finite Basis Sets^†

by

Susi Lehtola

^1,*

,

Frank Blockhuys

² and

Christian Van Alsenoy

^2,*

¹

Department of Chemistry, University of Helsinki, P.O. Box 55 (A. I. Virtasen aukio 1) Helsinki, Finland

²

Department of Chemistry, University of Antwerp, Groenenborgerlaan 171, 2020 Antwerpen, Belgium

^*

Authors to whom correspondence should be addressed.

^†

The authors dedicate this paper to Prof. Paul Geerlings of the Free University of Brussels-VUB on the occasion of his 70th birthday.

Molecules 2020, 25(5), 1218; https://doi.org/10.3390/molecules25051218

Submission received: 26 December 2019 / Revised: 12 February 2020 / Accepted: 17 February 2020 / Published: 8 March 2020

(This article belongs to the Special Issue Electron Density Analysis Tools)

Download Versions Notes

Abstract

A uniform derivation of the self-consistent field equations in a finite basis set is presented. Both restricted and unrestricted Hartree–Fock (HF) theory as well as various density functional approximations are considered. The unitary invariance of the HF and density functional models is discussed, paving the way for the use of localized molecular orbitals. The self-consistent field equations are derived in a non-orthogonal basis set, and their solution is discussed also in the presence of linear dependencies in the basis. It is argued why iterative diagonalization of the Kohn–Sham–Fock matrix leads to the minimization of the total energy. Alternative methods for the solution of the self-consistent field equations via direct minimization as well as stability analysis are briefly discussed. Explicit expressions are given for the contributions to the Kohn–Sham–Fock matrix up to meta-GGA functionals. Range-separated hybrids and non-local correlation functionals are summarily reviewed.

Keywords:

self-consistent field theory; hartree-fock; density functional theory

1. Introduction

Electronic structure calculations have become a cornerstone of modern-day research in chemistry and materials physics, allowing in silico modeling of chemical reactions and the first principles design of novel catalysts [1]. Electronic structure calculations on molecular systems most often employ the linear combination of atomic orbitals (LCAO) approach, where the molecular orbitals (MOs) are expanded in terms of atomic orbitals (AOs). Several possible alternatives for the form of the AOs are commonly used—Gaussian-type orbitals (GTOs), Slater-type orbitals (STOs), as well as numerical atomic orbitals (NAOs); see [2] for details. LCAO electronic structure calculations involve a variational minimization of the total energy with respect to the AO expansion coefficients of the MOs. Importantly, the formalism used in the LCAO approach is not restricted to AOs which are atom-centered basis functions; it can also be used, e.g., in combination with fully numerical basis functions as in the finite element approach, as has been recently demonstrated in [3,4]. Once the energy has been minimized and the corresponding wave function has been obtained, it is possible to compute a number of properties either directly from the electronic wave function (e.g., electron densities, orbital energies, molecular dipole moment), or from its response to external perturbations (nuclear magnetic shieldings, vibrational frequencies, etc.).

The mathematical foundations for spin-restricted Hartree–Fock (HF) theory within the LCAO approach were laid out independently by Roothaan and Hall [5,6]. In their seminal papers, Roothaan and Hall derived matrix equations that can be conveniently implemented on a computer as an iterative procedure. As will be seen later in Section 6, the Roothaan–Hall equations turn out to yield a generalized eigenvalue problem

F C = S C E

in the non-orthogonal AO basis set, which had been solved some years before by Löwdin in the context of Heitler–London theory [7].

Subsequently to the work by Roothaan and Hall, Pople and Nesbet [8] and Berthier [9] independently published the corresponding equations for an unrestricted (open-shell) HF description by an analogous scheme, without providing an explicit derivation. The Pople–Nesbet–Berthier equations assume a form similar to the Roothaan–Hall equations—constituting a coupled set of general eigenvalue equations—as will also be seen later on in the manuscript (Section 6). Restricted open-shell HF was then described by Roothaan [10]; restricted open-shell calculations will not be considered in the present work as they have been extensively reviewed by Krebs in [11] to which we refer for further details.

Density functional theory [12,13] (DFT; see also [14,15]) became popular in chemistry through the efforts of Pople and coworkers in making the method generally available to quantum chemists [16] and showing that atomization energies from DFT may agree well with experiment [17,18]. Also DFT turns out to yield self-consistent field (SCF) equations that assume the same form as in HF but with a different expression for the Fock matrix

F

. Pople and coworkers reported the equations necessary for solving SCF for DFT in the LCAO context up to generalized gradient approximation (GGA) functionals in [16]; an analogous derivation was also presented by Kobayashi et al in [19]. The self-consistent implementation of meta-GGA functionals was later described by Neumann, Nobes and Handy in [20]. Density functional calculations sometimes include also non-local correlation contributions; self-consistent LCAO implementations thereof have been reported by Vydrov and coworkers [21,22,23,24].

Despite the progress in and widespread success of DFT, to our knowledge, a uniform derivation of the SCF equations for HF and DFT including all the necessary expressions for the elements of the Kohn–Sham–Fock matrix up to the level of meta-GGA functionals has, up to now, not been explicitly published in the literature. This has likely contributed to the lack of complete support for meta-GGA functionals in popular quantum chemistry programs; for instance, PSI4 [25] and PySCF [26] lack support for meta-GGAs that depend on the Laplacian of the density such as the Becke–Roussel exchange functional [27], for example. This paper, therefore, presents such a derivation, yielding expressions of the DFT contributions to the Kohn–Sham–Fock matrix up to the level of meta-GGA functionals in a consistent way, facilitating the implementation of DFT in new programs.

The present derivation also has an obvious educational value. Indeed, in what follows, HF and various flavors of DFT belonging to different rungs of Jacob’s Ladder [28]—the local spin density approximation (LDA), the GGA and meta-GGA approximations—will be explicitly described in a uniform notation, making the similarities and dissimilarities between the approaches crystal clear. Facilitated by the uniform derivation, we will discuss key issues and features in the HF and DFT methodologies that arise from the mathematical formulation.

First, the basis set expansion of the molecular orbitals and the electron density is written out in Section 2. Then, the energy expression for HF and DFT is presented in Section 3, with a brief explanation of their physical content. The HF and DFT energy is shown to be invariant to rotations of the occupied and of the virtual orbitals in Section 4, allowing the construction of localized orbitals. The possibilities and drawbacks of spin-restricted calculations are discussed in Section 5. The finite-basis SCF equations are derived as generalized eigenvalue equations in Section 6. It is shown that the general eigenvalue equations can be reduced into normal eigenvalue equations by a transformation to an orthonormal basis in Section 7, and that linear dependencies in the basis can be eliminated on the way. The reason why the solution of the SCF equations amounts to a minimization of the total energy is rationalized in Section 8. Direct minimization methods are briefly introduced and stability analysis discussed in Section 9. The SCF method and direct minimization are contrasted in Section 10. Finally, the contributions to the Kohn–Sham–Fock matrix arising from various-rung DFT functionals are listed in Section 11. The article concludes with a brief summary and discussion in Section 12. Atomic units are used throughout the text.

2. Basis Set Expansion

In the HF and DFT approaches, the electronic wave function is written as a Slater determinant, in which the electrons occupy a set of MOs

φ (\vec{r})

. The MOs are expanded in terms of normalized expansion functions

χ (\vec{r})

, which are typically AOs. Both

φ

and

χ

, as well as the LCAO coefficients

C_{μ i}

, are typically chosen to be real in the lack of magnetic interactions that would generally make the Hamiltonian operator complex. Note, however, that the use of complex coefficients has been shown to be sometimes beneficial to describe challenging systems even in the lack of magnetic fields in Hartree–Fock (as reviewed in [29]) or DFT (as shown in [30]); complex instabilities may also arise in specialized methods beyond the SCF level, see [31] for instance. The expansion functions are generally not orthonormal

\int d \vec{r} χ_{μ} (\vec{r}) χ_{ν} (\vec{r}) = S_{μ ν} \neq δ_{μ ν}

where

δ_{μ ν}

is the Kronecker delta:

δ_{μ ν} = 1

if

μ = ν

and

δ_{μ ν} = 0

otherwise. Greek letters,

μ, ν, λ, σ, η

,

ζ

and

θ

will be used to identify the expansion functions

χ (\vec{r})

, whereas Roman letters, i, j and k will be used to identify the MOs

φ

. The

α

(spin-up) and

β

(spin-down) MOs are expanded separately as

\begin{matrix} φ_{i}^{α} (\vec{r}) = \sum_{μ = 1}^{M} C_{μ i}^{α} χ_{μ} (\vec{r}) \end{matrix}

(1)

\begin{matrix} φ_{i}^{β} (\vec{r}) = \sum_{μ = 1}^{M} C_{μ i}^{β} χ_{μ} (\vec{r}) \end{matrix}

(2)

Both the

α

and

β

MOs are orthonormal to themselves

\int d \vec{r} φ_{i}^{α} (\vec{r}) φ_{j}^{α} (\vec{r}) = δ_{i j} and \int d \vec{r} φ_{i}^{β} (\vec{r}) φ_{j}^{β} (\vec{r}) = δ_{i j}

or equivalently within the basis set

{(C^{α})}^{T} S C^{α} = 1 and {(C^{β})}^{T} S C^{β} = 1

However, the

α

orbitals are generally not orthonormal to the

β

orbitals:

\int d \vec{r} φ_{i}^{α} (\vec{r}) φ_{j}^{β} (\vec{r}) \neq δ_{i j}

or

{(C^{α})}^{T} S C^{β} \neq 1

(3)

The electron density plays a pivotal role in quantum chemistry. In line with chemistry literature,

ρ (\vec{r})

will be used to denote the electron density at the point

\vec{r}

in contrast to the physics notation

n (\vec{r})

which is customary in the DFT literature. The total electron density is formed from the

α

and

β

densities,

ρ_{α}

and

ρ_{β}

, as

ρ (\vec{r}) = ρ_{α} (\vec{r}) + ρ_{β} (\vec{r})

. The spin-

σ

electron density can be evaluated as

ρ_{σ} (\vec{r}) = \sum_{i = 1}^{N_{σ}} {|φ_{i}^{σ} (\vec{r})|}^{2} = \sum_{i = 1}^{N_{σ}} \sum_{μ ν} C_{μ i}^{σ} C_{ν i}^{σ} χ_{μ} (\vec{r}) χ_{ν} (\vec{r}) = \sum_{μ ν} P_{μ ν}^{σ} χ_{μ} (\vec{r}) χ_{ν} (\vec{r})

(4)

in which

N_{σ}

is the number spin-

σ

electrons in the system, and the sums over basis functions

\sum_{μ}

and

\sum_{ν}

indicate

\sum_{μ = 1}^{M}

and

\sum_{ν = 1}^{M}

, respectively; this convention for easier readability of sums over basis functions is used throughout the rest of this work.

The density matrix

P^{σ}

has been defined in Equation (4) as

P_{μ ν}^{σ} = \sum_{i = 1}^{N_{σ}} C_{μ i}^{σ} C_{ν i}^{σ}

(5)

As is evident from the form of Equation (5), the density matrices are symmetric,

P_{μ ν}^{σ} = P_{ν μ}^{σ}

. As was already mentioned above, the total electron density is obtained from the sum of the

α

and

β

densities. Correspondingly, a total density matrix is given by

P = P^{α} + P^{β}

(6)

from which the total density can be evaluated using a relation analogous to Equation (4).

3. Energy Expression

The starting point for the derivation is the non-relativistic energy expression [5,8,13,16],

E = \sum_{μ ν} P_{μ ν} H_{μ ν} + \frac{1}{2} \sum_{μ ν λ σ} P_{μ ν} P_{λ σ} (μ ν | λ σ) - \frac{a}{2} \sum_{μ ν λ σ} (P_{μ λ}^{α} P_{ν σ}^{α} + P_{μ λ}^{β} P_{ν σ}^{β}) (μ ν | λ σ) + b \int f (\vec{r}) d \vec{r}

(7)

where the electron repulsion integral

(μ ν | λ σ)

is defined as

(μ ν | λ σ) = \int d {\vec{r}}_{1} \int d {\vec{r}}_{2} χ_{μ} (\vec{r_{1}}) χ_{ν} (\vec{r_{1}}) \frac{1}{r_{12}} χ_{λ} (\vec{r_{2}}) χ_{σ} (\vec{r_{2}})

(8)

and a and b are constants that define the fraction of HF exchange and the weight of the density functional approximation, respectively. The choice

a = 1

and

b = 0

corresponds to HF, whereas

a = 0

and

b = 1

yields a “pure” density functional without exact exchange such as the Perdew–Burke–Ernzerhof functional [32]. The choice

a \neq 0

and

b \neq 0

is the most general one, which corresponds to a hybrid functional [33] that are popular in quantum chemistry; perhaps the most famous example being the historical B3LYP functional [34].

The first term in Equation (7), which will be referred to as

E^{H}

, describes the kinetic energy of the electrons and the Coulombic attraction of the N nuclei in the system, with the matrix elements

H_{μ ν} = \int d r χ_{μ} (\vec{r}) (- \frac{1}{2} \nabla^{2} + \sum_{N} \frac{Z_{N}}{r_{N}}) χ_{ν} (\vec{r})

(9)

The one-electron operator in Equation (9) is commonly known as the core Hamiltonian, and the resulting

E^{H}

is the dominating contribution to the total energy.

However, the core Hamiltonian lacks electronic interactions. These are described by the second and third terms in Equation (7), which describe the classical Coulomb and the quantum mechanical “exchange” energy, and are referred to as

E^{J}

and

E^{K}

, respectively. The

E^{J}

contribution to the total energy can be straightforwardly derived from the expression for the Coulomb repulsion between the electrons described by the electron density

ρ (\vec{r})

E^{J} = \frac{1}{2} \int d \vec{r_{1}} \int d \vec{r_{2}} ρ (\vec{r_{1}}) \frac{1}{r_{12}} ρ (\vec{r_{2}})

whereas the expression for the exchange energy contribution

E^{K}

can be obtained, for instance, using Slater’s rules for a HF wave function (

a = 1

).

The final term in Equation (7), referred to as

E^{XC}

, describes the DFT exchange-correlation contributions which, alike

E^{J}

and

E^{K}

, arise from electronic interactions. The exchange-correlation term is commonly written as

E^{XC} = \int d \vec{r} f (\vec{r}) = \int d \vec{r} ρ (\vec{r}) ϵ^{XC} (\vec{r})

where

ϵ^{XC}

is the exchange-correlation energy density per electron. Usually

f (\vec{r})

is a function of the electron density

ρ (\vec{r})

; it may also depend on the derivatives of

ρ (\vec{r})

and the kinetic energy density

τ (\vec{r})

, depending on which rung of Jacob’s Ladder [28] is used to the describe the exchange-correlation effects. The various rungs are discussed in Section 11.

4. Unitary Invariance

The

P^{α}

and

P^{β}

matrices turn out to be invariant to rotations of the occupied orbitals among themselves. Rotating the occupied subset of the molecular orbitals

φ

by a orthogonal matrix

U

defines a new set of occupied orbitals

φ_{i}^{α^{'}} = \sum_{k = 1}^{N_{α}} φ_{k}^{α} U_{k i}

the MO coefficients of which can be obtained as

C_{μ i}^{α^{'}} = \sum_{k = 1}^{N_{α}} C_{μ k}^{α} U_{k i}

This can also be written in matrix notation as

C^{α^{'}} = C^{α} U or C^{α^{'}} U^{T} = C^{α}

(10)

The invariance to rotations in the occupied-occupied block is easy to prove, as

P_{μ ν}^{α^{'}} = \sum_{i = 1}^{N_{α}} C_{μ i}^{α^{'}} C_{ν i}^{α^{'}} = \sum_{i k l = 1}^{N_{α}} U_{i k} C_{μ k}^{α} U_{i l} C_{ν l}^{α} = \sum_{k l = 1}^{N_{α}} δ_{k l} C_{μ k}^{α} C_{ν l}^{α} = \sum_{k = 1}^{N_{α}} C_{μ k}^{α} C_{ν k}^{α} = P_{μ ν}^{α}

(11)

where we have used the orthogonality of

U

,

U^{T} U = 1 = U U^{T}

.

The invariance to rotations in the occupied-occupied block can be used to fashion localized orbitals, for instance using an unitary optimization procedure [35]. Although localized orbitals are not strictly speaking observables—due to which several localization criteria have been suggested in the literature [36,37,38,39,40]—they have been shown to offer an effective way to study chemical reactions with ab initio calculations [41,42,43].

In addition to the occupied orbitals, in general there are also a number of unoccupied orbitals, which are commonly known as virtual orbitals. The number of virtual orbitals in any given calculation depends on the size of the basis set: the bigger the basis is, the more virtual orbitals there are. Because the virtual orbitals do not enter into the density matrix, the HF and DFT energy expression, Equation (7), is also invariant to rotations in the virtual-virtual block, similarly allowing their localization. However, as will be seen below in Section 9, the energy can be changed by mixing virtual orbitals into the occupied orbitals [44]. This approach provides another way to optimize the orbitals directly with, e.g., a gradient descent method, such as the geometric direct minimization method described in [45].

5. Spin-Restriction vs. Unrestriction

The molecular orbitals are obtained from the requirement that they minimize the total energy according to Equation (7). However, one must first choose the used spin formalism. The general choice is to use different spatial orbitals for the

α

and

β

electrons, in which case a spin-unrestricted approach is obtained. The unrestricted approach is often used even in systems in which there are an equal number of alpha and beta electrons,

N_{α} = N_{β}

: although the spin-restricted and unrestricted descriptions often reproduce matching results for such systems near the equilibrium, only the unrestricted formalism is able to break bonds in general. The reason for this is that when molecules are stretched past the Coulson–Fischer point [46], the optimal orbitals spontaneously break spin symmetry, which can only be described in the unrestricted formalism. At variance, in the spin-restricted case the electrons occupy a common set of

N = N_{α} = N_{β} = (N_{α} + N_{β}) / 2

spatial orbitals. The limitation of the spatial orbitals to be the same for both spins,

C_{μ i} = C_{μ i}^{α} = C_{μ i}^{β}

, yields less variational freedom, and prevents the correct dissociation of e.g., the H

_{2}

molecule. As a flip side, the spin-restricted formalism affords computational savings over the unrestricted approach. The spin-restricted density matrices, Equation (5), reduce to

P_{μ ν}^{α} = P_{μ ν}^{β} = \frac{1}{2} P_{μ ν} = \sum_{i}^{N} C_{μ i} C_{ν i}

(12)

meaning, e.g., that the

α

and

β

exchange terms in Equation (7) coincide and can be simplified.

Spin-restriction is also possible in the case in which

N_{α} \neq N_{β}

. In this case, a restricted open-shell method is obtained. Restricted open-shell methods are more involved than the spin-restricted and spin-unrestricted methods discussed in the present work. Restricted open-shell methods have been extensively discussed in [11] to which we refer for further discussion.

6. Self-Consistent Field Equations

Having chosen to use either spin-restricted or spin-unrestricted orbitals, one can proceed to the minimization of the energy expression in Equation (7). The energy expression depends only on the

α

and

β

density matrices

P^{α}

and

P^{β}

and their sum

P

. The density matrices, in turn, are determined by the lowest

N_{α}

and

N_{β}

molecular orbitals according to Equation (5). Because the energy expression in Equation (7) thus only depends on the density matrices

P^{α}

and

P^{β}

, it is expedient to use the chain rule to write, e.g.,

\frac{\partial}{\partial C_{θ k}^{α}} = \sum_{η ζ} \frac{\partial P_{η ζ}^{α}}{\partial C_{θ k}^{α}} \frac{\partial}{\partial P_{η ζ}^{α}}

(13)

where the partial derivative of the density matrix element

P_{η ζ}^{α}

is

\begin{array}{l} \frac{\partial P_{η ζ}^{α}}{\partial C_{θ k}^{α}} & = & \frac{\partial}{\partial C_{θ k}^{α}} \sum_{i = 1}^{N_{α}} C_{η i}^{α} C_{ζ i}^{α} \\ = & \sum_{i = 1}^{N_{α}} δ_{θ η} δ_{k i} C_{ζ i}^{α} + \sum_{i = 1}^{N_{α}} C_{η i}^{α} δ_{θ ζ} δ_{k i} \\ = & δ_{θ η} C_{ζ k}^{α} + C_{η k}^{α} δ_{θ ζ} \end{array}

(14)

The

β

orbital derivative of the total density matrix has the same form as Equation (14), where all

α

are replaced with

β

. Note that these findings hold even when the same orbitals are used for both

α

and

β

in a spin-restricted formalism, since the

α

and

β

orbitals are formally independent.

Due to the chain rule, Equation (13), all we need are the density matrix derivatives of the energy expression. We only have to calculate the derivatives of the energy expression of Equation (7) for one spin, as the energy expression is symmetric with respect to the

α

and

β

densities. It does not matter which spin we choose to be “up”; the expressions for the other spin will follow by symmetry by interchanging

α \leftrightarrow β

. The first term of Equation (7) yields simply

\frac{\partial E^{H}}{\partial P_{η ζ}^{α}} = \frac{\partial}{\partial P_{η ζ}^{α}} \sum_{μ ν} P_{μ ν} H_{μ ν} = H_{η ζ}

(15)

Next, taking the partial derivative with respect to

P_{η ζ}^{α}

of the Coulomb and exchange terms in Equation (7) results in

\begin{matrix} \frac{\partial E^{J}}{\partial P_{η ζ}^{α}} & = & \sum_{μ ν} (P_{μ ν}^{α} + P_{μ ν}^{β}) (μ ν | η ζ) = \sum_{μ ν} P_{μ ν} (μ ν | η ζ) = J_{η ζ} \end{matrix}

(16)

where

J

is known as the Coulomb matrix, and

\begin{matrix} \frac{\partial E^{K}}{\partial P_{η ζ}^{α}} & = & - a \sum_{μ ν} P_{μ ν}^{α} (μ η | ν ζ) = - a K_{η ζ}^{α} \end{matrix}

(17)

where

K^{α}

is the spin-

α

exchange matrix, respectively. The Coulomb and exchange matrices can be used to rewrite the energy expression in Equation (7) as

E = \sum_{μ ν} P_{μ ν} H_{μ ν} + \frac{1}{2} \sum_{μ ν} P_{μ ν} J_{μ ν} - \frac{a}{2} \sum_{μ ν} (P_{μ ν}^{α} K_{μ ν}^{α} + P_{μ ν}^{β} K_{μ ν}^{β}) + b \int f (\vec{r}) d \vec{r}

(18)

Note that in contrast to the Coulomb and exact exchange terms, the exchange-correlation term does not undergo simplifications, because the exchange-correlation term is not quadratic in the density matrix, as will be seen later in Section 11. For the time being, we will denote the partial derivative of

E^{XC}

with respect to

P_{η ζ}^{α}

as

\frac{\partial E^{XC}}{\partial P_{η ζ}^{α}} = b K_{η ζ}^{XC; α}

(19)

as the full expressions for

K^{XC; α}

will be presented in Section 11. Now, collecting the partial derivatives in Equations (15)–(19) gives us the density matrix derivatives of the energy expression as

\frac{\partial E}{\partial P_{η ζ}^{σ}} = H_{η ζ} + J_{η ζ} - a K_{η ζ}^{σ} + b K_{η ζ}^{XC; σ} = F_{η ζ}^{σ}

(20)

where we have identified the Kohn–Sham–Fock matrices

F^{σ}

, where

σ

denotes

α

or

β

. Because the density matrices defined by Equation (5) are symmetric, also the Fock matrices are symmetric,

F_{η ζ}^{σ} = F_{ζ η}^{σ}

. Note that since the Fock matrices only depend on the density matrices, also they are invariant to occupied-occupied and virtual-virtual rotations,

F^{σ^{'}} = F^{σ}

.

Naïvely, one would obtain the orbital derivative of the full energy expression in Equation (7) with Equations (13), (14) and (20) and then set it to zero to yield an equation for the unknown expansion coefficients

C^{α}

. However, the molecular orbitals cannot be varied freely: one must make sure that the orbitals stay orthonormal during the variation, as otherwise the Pauli exclusion principle would be violated. For instance, the orthonormality condition for the

α

electrons is

\int φ_{i}^{α} (\vec{r}) φ_{j}^{α} (\vec{r}) d r = δ_{i j}

(21)

The way to enforce these conditions is to use Lagrangian multipliers

ε_{i j}

. That is, instead of the bare energy expression E, we will optimize the Lagrangian

L = E - \sum_{i j} ε_{i j}^{α} [\int d \vec{r} φ_{i}^{α} (\vec{r}) φ_{j}^{α} (\vec{r}) - δ_{i j}] - \sum_{i j} ε_{i j}^{β} [\int d \vec{r} φ_{i}^{β} (\vec{r}) φ_{j}^{β} (\vec{r}) - δ_{i j}]

(22)

where the sums over i and j run over all orbitals; that is, both the occupied and the virtual ones. We can see from Equation (22) that the matrices of Lagrangian multipliers

ε^{α}

and

ε^{β}

can be chosen to be symmetric. For instance, if

ε^{α}

contained a symmetric part

ε_{s}^{α}

and an antisymmetric part

ε_{a}^{α}

,

ε^{α} = ε_{s}^{α} + ε_{a}^{α}

, the contribution from the antisymmetric part would vanish because it is multiplied with the orbital overlap that is symmetric.

Next, we can calculate

\partial L / \partial C_{θ k}^{α}

, where

\partial E / \partial C_{θ k}^{α}

is given by Equations (13), (14), (20) and the derivative of the constraint term is given by

\begin{matrix} \frac{\partial}{\partial C_{θ k}^{α}} \sum_{i j} ε_{i j}^{α} [\int d \vec{r} φ_{i}^{α} (\vec{r}) φ_{j}^{α} (\vec{r}) - δ_{i j}] & = & \sum_{i j} ε_{i j}^{α} \frac{\partial}{\partial C_{θ k}^{α}} \sum_{η ζ} [C_{η i}^{α} S_{η ζ} C_{ζ j}^{α} - δ_{i j}] \\ = & \sum_{i j} ε_{i j}^{α} \sum_{η ζ} [δ_{η θ} δ_{k i} C_{ζ j}^{α} + C_{η i}^{α} δ_{ζ θ} δ_{k j}] S_{η ζ} \\ = & \sum_{η} \sum_{i} ε_{k i}^{α} C_{η i}^{α} S_{θ η} + \sum_{η} \sum_{i} ε_{i k}^{α} C_{η i}^{α} S_{η θ} \end{matrix}

(23)

where on the third line dummy summation indices have been renamed from j to i and

ζ

to

η

. The derivative can be evaluated as

\begin{matrix} \frac{\partial L}{\partial C_{θ k}^{α}} & = & \sum_{η ζ} \frac{\partial E}{\partial P_{η ζ}^{α}} \frac{\partial P_{η ζ}^{α}}{\partial C_{θ k}^{α}} - \sum_{i j} ε_{i j}^{α} \frac{\partial}{\partial C_{θ k}^{α}} \int d \vec{r} φ_{i}^{α} (\vec{r}) φ_{j}^{α} (\vec{r}) \\ = & \sum_{η ζ} F_{η ζ}^{α} [δ_{θ η} C_{ζ k}^{α} + C_{η k}^{α} δ_{θ ζ}] - \sum_{ζ} \sum_{i} ε_{k i}^{α} C_{ζ i}^{α} S_{θ ζ} - \sum_{η} \sum_{i} ε_{i k}^{α} C_{η i}^{α} S_{η θ} \\ = & 2 \sum_{η} F_{θ η}^{α} C_{η k}^{α} - 2 \sum_{η} \sum_{i} ε_{k i}^{α} S_{θ η} C_{η i}^{α} \end{matrix}

(24)

because

F

and

ε

are symmetric, and dummy summation indices can be renamed.

The optimal orbitals satisfy the stationary condition

\partial L / \partial C_{θ k}^{α} = 0

from which

\begin{matrix} \sum_{η} F_{θ η}^{α} C_{η k}^{α} = \sum_{η} \sum_{i} S_{θ η} C_{η i}^{α} ε_{i k}^{α} \end{matrix}

(25)

Equation (25) can thus be written in matrix form as

F^{α} C^{α} = S C^{α} E^{α}

(26)

where

E^{α}

is the (symmetric) matrix of Lagrangian multipliers.

Because

E^{α}

is symmetric, it can be diagonalized and it has real eigenvalues. Let us now assume that

U^{α}

is an orthogonal matrix that diagonalizes

E^{α}

E_{i j}^{α} \to E_{i j}^{α^{'}} = \sum_{k l} U_{k i}^{α} E_{k l}^{α} U_{l j}^{α} = δ_{i j} ε_{i}^{α}

(27)

where

ε^{α}

are the eigenvalues. Re-expressing the orbital coefficients

C^{α}

in terms of a new set of orbitals rotated by

U^{α}

with Equation (10),

C^{α} = C^{α^{'}} {(U^{α})}^{T}

, rewrites Equation (26) in the form

F^{α} C^{α^{'}} {(U^{α})}^{T} = S C^{α^{'}} {(U^{α})}^{T} E^{α}

that can be multiplied from the right by

U^{α}

producing

F^{α} C^{α^{'}} = S C^{α^{'}} E^{α^{'}}

(28)

where, according to Equation (27),

E^{α^{'}} = {(U^{α})}^{T} E^{α} U^{α}

is a diagonal matrix with elements

ε_{i}^{α}

.

Equation (28) is almost what we want—an equation in the rotated basis that looks like Equation (26) with a diagonal

E^{α^{'}}

—but one problem remains: the Kohn–Sham–Fock matrix is still the one corresponding to the original orbitals

C^{α}

instead of the transformed orbitals

C^{α^{'}}

, while the orbital rotation by

U

that takes us from

C^{α}

to

C^{α^{'}}

may lead to a different Kohn–Sham–Fock matrix

F^{α^{'}} \neq F^{α}

. However, if we choose the form of

U

such that the occupied-virtual (ov) and virtual-occupied (vo) blocks vanish

U = (\begin{matrix} U_{oo} & U_{ov} \\ U_{vo} & U_{vv} \end{matrix}) = (\begin{matrix} U_{oo} & 0 \\ 0 & U_{vv} \end{matrix})

(29)

then

U

only rotates occupied orbitals with occupied orbitals and virtual orbitals with virtual orbitals, meaning that the orbital rotation does not change the density matrix given in Equation (11). Then, the Fock matrix corresponding to the rotated orbitals coincides with that of the original orbitals,

F^{'} = F

, completing the proof that

E^{α}

can be chosen to be diagonal. (Occupied-virtual rotations

U_{ov}

or

U_{ov}

, discussed in more detail in Section 9, are in fact here forbidden: the SCF equations were derived with the assumption that the energy is stationary, but this condition would instantly be violated by such rotations.)

We have thus obtained the Berthier–Pople–Nesbet [8,9,13,16] equations for the orbital coefficients

\{\begin{matrix} F^{α} C^{α} = S C^{α} E^{α} \\ F^{β} C^{β} = S C^{β} E^{β} \end{matrix}

(30)

where the primes have become unnecessary and have been omitted for simplicity. The elements of the Kohn–Sham–Fock matrices

F^{α}

and

F^{β}

are given by Equation (20), and the orbital energy matrices

E^{α}

and

E^{β}

are diagonal. In the spin-restricted case [5,16] the

α

and

β

molecular orbitals coincide, leading to identical density matrices

P^{α}

and

P^{β}

, and identical Fock matrices

F^{α}

and

F^{β}

. In this case, the SCF equations simplify to the Roothaan–Hall form

F C = S C E

(31)

which was already mentioned in the Introduction.

7. Solution of Self-Consistent Field Equations

The Roothaan–Hall and Berthier–Pople–Nesbet expressions take the form of a generalized eigenvalue equation. The conventional way to solve these equations is to re-express the (unknown) orbital coefficients in terms of a matrix

X

as

C = X \tilde{C}

(32)

Inserting Equation (32) into the Roothaan–Hall equation, Equation (31), yields

F X \tilde{C} = S X \tilde{C} E

which can be multiplied from the left with

X^{T}

to yield

X^{T} F X \tilde{C} = X^{T} S X \tilde{C} E

This means that the orbital transform of Equation (32) yields a new generalized eigenvalue equation

\tilde{F} \tilde{C} = \tilde{S} \tilde{C} E

(33)

where

\tilde{F} = X^{T} F X

and

\tilde{S} = X^{T} S X

. Now, if we choose

X

in such a way that

\tilde{S} = 1

, Equation (33) reduces to a normal eigenvalue equation

\tilde{F} \tilde{C} = \tilde{C} E

(34)

which can be solved with standard techniques. Then, the wanted orbital coefficients

C

can be calculated from

\tilde{C}

using Equation (32).

If the basis set is well-conditioned, the matrix

X

can be chosen as

X = V Λ^{- 1 / 2} V^{T}

(35)

where

V

and

Λ

are the eigenvectors and eigenvalues of

S

S = V Λ V^{T}

(36)

This procedure is known as symmetric orthogonalization [7].

However, if a large LCAO basis is used, the atomic orbital basis functions centered on different atoms may generate linear dependencies in the basis, making the basis set expansion ambiguous. These linear dependencies can be removed with the “canonical” orthonormalization procedure [47], in which

X = V^{'} {Λ^{'}}^{- 1 / 2}

(37)

where only those eigenvectors

V_{i}

with large enough eigenvalues

λ_{i} \geq τ

are included. The threshold

τ

is typically of the order of

τ = 10^{- 7} \dots 10^{- 5}

, and its value may have a noticeable effect on, e.g., the absolute energies that result from a SCF calculation; relative energies, however, should be less sensitive to

τ

. If no eigenvalues fall under the threshold

τ

, the symmetric and canonical orthogonalization approaches become equivalent for the purposes of SCF calculations in the case of a well-conditioned basis set: both yield an orthonormal basis of the same size, which will yield the same variational ground state energy.

Unnormalized basis sets can also be handled easily by the orthogonalization procedure. Although in principle it is not necessary to normalize the individual basis functions before obtaining an orthonormal basis by Equations (35) and (37), computer linear algebra packages may fail to find the eigenvalues and eigenvectors in a reliable fashion if the basis functions have pronouncedly different norms. Moreover, missing normalization of the basis set affects the eigenvalues, which has repercussions for canonical orthogonalization. These issues can be circumvented by normalizing the overlap matrix

S \to S^{'} = n S n

where

n_{i j} = S_{i i}^{- 1 / 2} δ_{i j}

before using Equations (35) and (37) [3,4]. The orthogonalizer for the unnormalized basis set is obtained as

X \to n X

; it is easy to see that this satisfies the necessary condition

X^{T} S X = 1

even though the symmetricity of

X

for the case of Equation (35) will be lost.

Even if

S

has been properly normalized, the use of the symmetric or canonical orthogonalization procedures still requires that the diagonalization of

S

is numerically stable. However, whenever a large number of linear dependencies exists in the basis set (e.g., a large number of diffuse functions are used or two nuclei are close together),

S

may become so ill-conditioned it cannot be accurately diagonalized. In such cases it is possible to reduce the size of the basis set without losing a significant amount of accuracy by an automatic procedure, see [48,49] for details.

8. Why Does the Self-Consistent Field Method Minimize the Energy?

The SCF equations, Equation (30) or Equation (31), offer a way to solve for the molecular orbitals described by

C

from a Kohn–Sham–Fock matrix

F

by finding its eigenvectors from Equation (34). However, the Kohn–Sham–Fock matrix depends on the density matrices, which are built from the molecular orbitals according to the Aufbau principle. In the SCF procedure, one tries to find a self-consistent solution:

C

yields

F

, whose eigenvectors are

C

. The procedure starts from an initial guess for the orbitals

C^{σ}

or the density matrices

P^{σ}

, which have been recently reviewed and benchmarked in [50] to which we refer for further details.

Why does the self-consistent field procedure—diagonalizing

F^{σ}

to update the orbital coefficients

C^{σ}

—correspond to minimization of the Hartree–Fock/Kohn–Sham energy? For simplicity, let us examine the case of HF theory. The energy expression, Equation (18), can be written in this case (

a = 1

,

b = 0

) as

E = Tr P H + \frac{1}{2} Tr P J - \frac{1}{2} Tr P^{α} K^{α} - \frac{1}{2} Tr P^{β} K^{β}

(38)

The Fock matrix elements, Equation (20), are given by

\begin{matrix} F^{α} & = & H + J - K^{α} \end{matrix}

(39)

\begin{matrix} F^{β} & = & H + J - K^{β} \end{matrix}

(40)

Equation (38) can be rewritten with Equations (39) and (40) as

E = \frac{1}{2} Tr P^{α} (H + F^{α}) + \frac{1}{2} Tr P^{β} (H + F^{β})

(41)

Expanding the density matrices using Equation (5) we see that Equation (41) can be written as

E = \frac{1}{2} \sum_{i = 1}^{N_{α}} (h_{i i}^{α} + f_{i i}^{α}) + \frac{1}{2} \sum_{i = 1}^{N_{β}} (h_{i i}^{β} + f_{i i}^{β})

(42)

where the core Hamiltonian and Fock matrices have been written in the molecular orbital basis,

h^{σ} = {(C^{σ})}^{T} H C^{σ}

and

f^{σ} = {(C^{σ})}^{T} F^{σ} C^{σ}

.

If one were to start the calculation from the core guess, then

\sum_{i} h_{i i}^{α}

and

\sum_{i} h_{i i}^{β}

would be minimized. However, as discussed in [50], this is a horrible choice as it completely disregards electronic repulsion effects, meaning that the

\sum_{i} f_{i i}^{α}

and

\sum_{i} f_{i i}^{β}

terms are far from optimal. The Roothaan step—obtaining new molecular orbitals by diagonalization of the Fock matrix—results in a minimization of the

\sum_{i} f_{i i}^{α}

and

\sum_{i} f_{i i}^{β}

terms, as after diagonalization only the lowest orbitals become populated and the sum thus runs only over the lowest eigenvalues

f_{i i}^{σ}

. After the update,

\sum_{i} h_{i i}^{α}

and

\sum_{i} h_{i i}^{β}

no longer yield their lowest possible values. However, the increase in the value of

\sum_{i} h_{i i}^{α} + \sum_{i} h_{i i}^{β}

should be much smaller than the decrease in the value of

\sum_{i} f_{i i}^{α} + \sum_{i} f_{i i}^{β}

, as the Fock matrices

f^{α}

and

f^{β}

also contain the core Hamiltonian. It is thus seen that Roothaan’s self-consistent field method, that is, the iterative diagonalization of the Fock matrix minimizes the energy.

However, the minimization is only valid for a fixed potential

f^{σ}

in which the electrons are moving. When the orbitals are changed—as happens when

f

is made diagonal and its lowest eigenvectors occupied—a new Fock matrix

F

must be built and a new

f

constructed: the potential also changes with the electron density. If the orbitals were far from their optimal values,

P

and therefore

F

may change quite radically by the orbital update. This means that even though

f

was made diagonal in the previous iteration, it is no longer diagonally dominant after it has been updated. Indeed, the straightforward iterative diagonalization procedure often fails to converge for all but the simplest systems, because the density tends to undergo large oscillations in the naïve self-consistency cycle. To make the method usable, the convergence of the fixed-point problem of finding a

C

that generates

F

that generates

C

must be stabilized or accelerated in some way. This can be achieved, e.g., by damping [51,52], level shifts [53,54,55], or extrapolation [56,57,58,59]. Fractional occupations can also be used in the initial iterations to aid convergence [60].

The argument for why density functional calculations converge similarly to HF with the iterative Roothaan procedure is somewhat less obvious, because unlike HF the exchange-correlation functional is not generally quadratic in the density. However, the total energy expression is approximately quadratic also in DFT when one is sufficiently close to an extremal point, as is easily seen by a Taylor expansion of Equation (7). In practice the iterative procedure works well also for DFT, whose contributions to the Kohn–Sham–Fock matrix we will discuss in Section 11.

9. Direct Minimization of the Energy

Instead of solving the orbitals from the SCF equations, which were obtained in Section 6 from the stationary condition for the energy under the constraint of orthonormal orbitals, the orbitals can also be optimized by a direct minimization of the energy. As was discussed in Section 4, the energy expression of Equation (7) is invariant to occupied-occupied and virtual-virtual rotations. This means that if we have

o^{σ}

occupied orbitals and

v^{σ}

virtual orbitals from some initial guess (see possible choices in [50]) for spin

σ

, we can consider the energy as a function of a set of

o^{σ} v^{σ}

rotation angles [44] by examining a rotation of the orbitals via Equation (10) by an orthogonal matrix

U^{σ} (θ^{σ}) = exp (\begin{matrix} 0 & θ^{σ} \\ - {(θ^{σ})}^{T} & 0 \end{matrix})

(43)

where

θ^{σ}

is an

o^{σ} \times v^{σ}

matrix containing the rotation angles. The rotation matrix determined by Equation (43) reduces to an identity matrix for vanishing rotation parameters,

θ^{σ} = 0

. Because the rotation matrix of Equation (43) is orthogonal, it automatically preserves the orthonormality of the orbitals, and special tricks i.e. Lagrangian multipliers are not needed to enforce this behavior.

The change in the density matrix is given by

\frac{\partial P_{η ζ}^{σ}}{\partial θ_{i a}^{σ}} = \sum_{k = 1}^{N_{σ}} [\frac{\partial C_{η k}^{σ}}{\partial θ_{i a}^{σ}} C_{ζ k}^{σ} + C_{η k}^{σ} \frac{\partial C_{ζ k}^{σ}}{\partial θ_{i a}^{σ}}]

(44)

How do the orbital coefficients change? Remembering that the first

N_{σ}

orbitals are occupied, and the rest are virtual, we can write

C^{σ} = (\begin{matrix} C_{o}^{σ} & C_{v}^{σ}) \end{matrix}

. After an infinitesimal rotation

θ \approx 0

, the occupied orbitals change into

C_{o}^{σ^{'}} \approx C_{o}^{σ} - C_{v}^{σ} {(θ^{σ})}^{T}

, that is

C_{η k}^{σ^{'}} = C_{η k}^{σ} - \sum_{b virtual} C_{η b}^{σ} θ_{k b}^{σ}

(45)

from which

\partial C_{η k}^{σ} / \partial θ_{i a}^{σ} = - C_{η a}^{σ} δ_{i k}

. Now the gradient of the energy with respect to rotation of the current set of orbitals can be obtained as

\frac{\partial E}{\partial θ_{i a}^{σ}} = \sum_{η ζ} \frac{\partial E}{\partial P_{η ζ}^{σ}} \frac{\partial P_{η ζ}^{σ}}{\partial θ_{i a}^{σ}} = \sum_{η ζ} F_{η ζ}^{σ} \sum_{k = 1}^{N_{σ}} [\frac{\partial C_{η k}^{σ}}{\partial θ_{i a}^{σ}} C_{ζ k}^{σ} + C_{η k}^{σ} \frac{\partial C_{ζ k}^{σ}}{\partial θ_{i a}^{σ}}] = - \sum_{η ζ} F_{η ζ}^{σ} [C_{η a}^{σ} C_{ζ i}^{σ} + C_{η i}^{σ} C_{η a}^{σ}] = - 2 f_{i a}^{σ}

(46)

where

f^{σ} = {(C^{σ})}^{T} F^{σ} C^{σ}

is the Fock matrix in the MO basis. Direct minimization of Equation (7) can then be pursued using Equation (46) with, e.g., gradient descent methods. However, a proper preconditioning of the search direction is essential in order for the algorithm to be usable; see, e.g., the geometric direct minimization method described in [45]. Many other direct minimization methods for the HF or DFT energy have also been proposed, and we refer the interested reader to the vast existing literature that cannot be comprehensively cited here.

10. SCF vs. Direct Minimization

Having described two alternative ways for solving the orbitals, we can discuss their advantages and disadvantages. The self-consistent field method is hard to beat for systems where convergence is straightforward: a suitably stabilized and accelerated SCF procedure often converges within 10 to 20 iterations when a suitable initial guess (see [50]) has been provided. However, when the gap between the highest occupied and lowest unoccupied orbital is small, which commonly occurs in, e.g., first-row transition metal complexes, the SCF procedure may become extremely slow to converge, oscillate between two or more solutions, converge to a higher-lying solution, or even to a saddle-point solution. Namely, it is critically important to realize that even if the orbital gradient vanishes, or equivalently, that the SCF equations are fulfilled, this does not mean that the energy expression Equation (7) truly has been minimized. Because there are typically several occupied as well as virtual orbitals, the minimization problem involves a large number of degrees of freedom. In multivariate calculus, a vanishing gradient only means that the orbitals correspond to some kind of extremum of the energy: a local minimum, a saddle-point solution, or even a local maximum, although the lattermost is highly improbable in SCF calculations.

In contrast to the sometimes erratic behavior of the SCF method, direct minimization based on orbital rotations is guaranteed to converge onto an extremal point

f_{i a} = 0

per the theory of numerical analysis; this is of great worth when studying systems with complicated electronic structures for which conventional SCF algorithms fail. However, more predictable convergence does not come for free: the downside of direct minimization methods is that they carry a higher computational cost due to, e.g., the use of line searches in the orbital optimization. Direct minimization methods can also be formulated at the second order, yielding more robust convergence to a local minimum at the cost of more computational resources per iteration, see, e.g., [61,62,63]. Because direct minimization methods are based on an explicit rotation of the orbitals, they are able to always follow the same solution at variance to SCF methods where the orbital occupations are typically reset at every iteration according to the Aufbau principle. Because of this, direct minimization can lead to a solution where the Aufbau rule is violated, that is, the highest occupied orbital lies higher in energy than the lowest unoccupied orbital. Direct minimization methods can also be straightforwardly applied in more complicated electronic structure theories than self-consistent field theory. Such methods may especially include explicit dependence on the molecular orbitals, as discussed by one of the present authors in [31,64,65] for the Perdew–Zunger self-interaction correction [66] which depends explicitly on the

N_{σ}

occupied orbitals, and [67] for the perfect quadruples [68] and perfect hextuples [69] models that also depend on the

N_{σ}

corresponding virtual orbitals.

In order to check the character of the extremum found by the SCF procedure or a direct minimization method, it is necessary to continue the analysis to second-order changes in the energy with respect to the orbital rotations by finding the lowest eigenvalue of the Hessian matrix: if it is negative, rotating the orbitals in the direction of the corresponding eigenvector will result in a further decrease of the energy. Whenever post-HF calculations are performed, or benchmark-quality values are sought at the SCF level, stability analysis [70,71] should be used to guarantee that the wave function indeed corresponds to a local minimum. Alternatively, trust-region methods [72,73,74] can be employed to ensure that the orbitals converge onto a true local minimum.

As always in the minimization of multivariate functions, locating the global minimum is difficult, and typically the best one can hope for is to find a local minimum. Some systems permit several local electronic minima: for instance, charge transfer complexes may allow both a neutral (X ⋯ Y) as well as an ionic (X

^{+} \dots

Y

^{-}

) solution. Finding such physically motivated solutions is often straightforward by suitable manipulations of the initial guess, for instance, by constructing guesses via the superposition of atomic potentials [75,76] with the correct atomic charges. Sometimes it may also be interesting to locate saddle-point solutions, which have physical interpretations as excited states. Specific excited states can be explored within the SCF approach by replacing Aufbau population of the orbitals with overlap criteria [77,78] or with direct minimization by replacing the energy with the square of the gradient [79]; for instance, such an approach has been recently shown to predict highly accurate core spectra [80]. The full space of SCF solutions can be explored via, e.g., meta-dynamics [81].

11. Density Functional Contributions to Kohn–Sham–Fock Matrix

In Section 6 we derived expressions for the Kohn–Sham–Fock matrix elements for all but the density functional contribution

E^{XC} = b \int f (\vec{r}) d \vec{r}

(47)

which we will consider next. Hundreds of density functionals

f (\vec{r})

of various forms have been published in the literature in the recent decades [82], and offering a comprehensive selection thereof poses a considerable challenge to quantum chemistry software developers. This problem is further exacerbated by the need to keep track with the several new functionals still being published every year. Moreover, the density functionals

f (\vec{r})

typically carry extremely complicated functional forms, making their correct implementation painstaking work. The implementation is made even more difficult by the need to compute the first derivatives of

f (\vec{r})

for the SCF procedure, as well as several higher-order ones for, e.g., the calculation of various properties.

Fortunately, these challenges have been obviated by freely available, portable standard implementations such as LibXC [83] and XCFun [84]. The LibXC software package strives to implement all DFT functionals published in the literature, and provides a uniform interface to ∼500 functionals of various forms. At present, LibXC is used by ∼30 electronic structure programs based on various numerical representations that range from basis set approaches (Gaussian-type orbitals, Slater-type orbitals, numerical atomic orbitals, finite elements, plane waves) to finite difference procedures. New functionals only have to be added once to LibXC, meaning the library is easily kept up to date, after which they become available to all programs that support the corresponding rung of functionals on Jacob’s ladder [28]. Next, we will derive the equations necessary to implement the various rungs’ functionals in the variational basis set approach.

11.1. LDA Functionals

The simplest density functional approximations (DFAs), belonging to the first rung of Jacob’s Ladder [28], are generally referred to as local (spin) density approximations (LDAs). These are functions of only the electron density [13]

f (\vec{r}) = f (ρ_{α} (\vec{r}), ρ_{β} (\vec{r}))

(48)

such as the LDA exchange functional [85,86]

f (\vec{r}) = - \frac{3}{2} {(\frac{3}{4 π})}^{\frac{1}{3}} (ρ_{α}^{4 / 3} (\vec{r}) + ρ_{β}^{4 / 3} (\vec{r}))

Assuming f has the form of Equation (48), the resulting contribution to the Kohn–Sham–Fock matrix

F_{μ ν}^{α; XC} = \partial E^{XC} / \partial P_{μ ν}^{α}

can be evaluated using Equation (5) for the densities at point

\vec{r}

ρ_{σ} (\vec{r}) = \sum_{μ ν} P_{μ ν}^{σ} χ_{μ} (\vec{r}) χ_{ν} (\vec{r})

(49)

as [16]

F_{μ ν}^{α; LDA} = \frac{\partial E^{XC}}{\partial P_{μ ν}^{α}} = b \int d \vec{r} \frac{\partial f (\vec{r})}{\partial ρ_{α}} \frac{\partial ρ_{α}}{\partial P_{μ ν}^{α}} = b \int d \vec{r} \frac{\partial f (\vec{r})}{\partial ρ_{α}} χ_{μ} (\vec{r}) χ_{ν} (\vec{r})

(50)

with

F^{β; LDA}

having an analogous expression. Note that if the integral is evaluated using numerical quadrature,

F_{μ ν}^{α; LDA} \approx b \sum_{i} w_{i} \frac{\partial f ({\vec{r}}_{i})}{\partial ρ_{α}} χ_{μ} (\vec{r_{i}}) χ_{ν} (\vec{r_{i}})

(51)

Becke’s multigrid approach [87] and further developments thereof being the standard approach in LCAO programs, the expression of Equation (51) can be most efficiently formulated with matrix products. Storing the values of the basis functions at the quadrature points as a matrix

X_{μ i} = χ_{μ} ({\vec{r}}_{i})

and defining a scaled version thereof as

Y_{μ i}^{α} = w_{i} \frac{\partial f ({\vec{r}}_{i})}{\partial ρ_{α}} χ_{μ} ({\vec{r}}_{i})

the Fock matrix contribution can be evaluated as simply as

F^{α; LDA} = b X {(Y^{α})}^{T}

which is orders of magnitude faster than a simple for loop based algorithm.

11.2. GGA Functionals

The second rung of Jacob’s Ladder [28] is referred to as the Generalised Gradient Approximation [88] (GGA). Density functional approximations on this rung also depend on the derivatives of the density

f (\vec{r}) = f (ρ_{α}, ρ_{β}, γ_{α α}, γ_{α β}, γ_{β β})

via the reduced gradients

γ_{α α} = \nabla ρ_{α} \cdot \nabla ρ_{α}

,

γ_{α β} = \nabla ρ_{α} \cdot \nabla ρ_{β}

, and

γ_{β β} = \nabla ρ_{β} \cdot \nabla ρ_{β}

, with the gradient of the density

\nabla ρ_{σ}

being determined by

\nabla ρ_{σ} = \sum_{μ ν} P_{μ ν}^{σ} \nabla [χ_{μ} (\vec{r}) χ_{ν} (\vec{r})] = \sum_{μ ν} P_{μ ν}^{σ} [χ_{ν} (\vec{r}) \nabla χ_{μ} (\vec{r}) + χ_{μ} (\vec{r}) \nabla χ_{ν} (\vec{r})]

The GGA contribution to the Fock matrix is given by [16]

\begin{matrix} F_{μ ν}^{α; GGA} & = & \frac{\partial E^{XC}}{\partial P_{μ ν}^{α}} \\ = & \int d \vec{r} [\frac{\partial f (\vec{r})}{\partial ρ_{α}} \frac{\partial ρ_{α}}{\partial P_{μ ν}^{α}} + \frac{\partial f (\vec{r})}{\partial γ_{α α}} \frac{\partial γ_{α α}}{\partial \nabla ρ_{α}} \cdot \frac{\partial \nabla ρ_{α}}{\partial P_{μ ν}^{α}} + \frac{\partial f (\vec{r})}{\partial γ_{α β}} \frac{\partial γ_{α β}}{\partial \nabla ρ_{α}} \cdot \frac{\partial \nabla ρ_{α}}{\partial P_{μ ν}^{α}}] = F_{μ ν}^{α; LDA} \\ + & \int d \vec{r} (2 \frac{\partial f (\vec{r})}{\partial γ_{α α}} \nabla ρ_{α} (\vec{r}) + \frac{\partial f (\vec{r})}{\partial γ_{α β}} \nabla ρ_{β} (\vec{r})) \cdot [χ_{ν} (\vec{r}) \nabla χ_{μ} (\vec{r}) + χ_{μ} (\vec{r}) \nabla χ_{ν} (\vec{r})] \end{matrix}

(52)

The

β

expression can be obtained by switching

α

and

β

in Equation (52). In the restricted case,

E^{XC} = \int d \vec{r} f (ρ, γ)

with

γ = \nabla ρ \cdot \nabla ρ

, which leads to the DFT contributions to

F^{GGA}

given by the simpler expression

F_{μ ν}^{GGA} = \int d \vec{r} [\frac{\partial f (\vec{r})}{\partial ρ} χ_{μ} (\vec{r}) χ_{ν} (\vec{r}) + 2 \frac{\partial f (\vec{r})}{\partial γ} \nabla ρ (\vec{r}) \cdot [χ_{ν} (\vec{r}) \nabla χ_{μ} (\vec{r}) + χ_{μ} (\vec{r}) \nabla χ_{ν} (\vec{r})]]

(53)

Practical implementations of Equations (52) and (53) can again be formulated using matrix products.

11.3. Meta-GGA Functionals

On the third rung on Jacob’s Ladder [28] are the meta-GGA (mGGA) approximations

f (\vec{r}) = f (ρ_{α}, ρ_{β}, γ_{α α}, γ_{α β}, γ_{β β}, τ_{α}, τ_{β}, \nabla^{2} ρ_{α}, \nabla^{2} ρ_{β})

in which

τ_{σ}

and

\nabla^{2} ρ_{σ}

are obtained as

\begin{matrix} τ_{σ} & = & \frac{1}{2} \sum_{i = 1}^{N_{σ}} {| \nabla φ_{i} (\vec{r}) |}^{2} = \frac{1}{2} \sum_{i = 1}^{N_{σ}} \nabla φ_{i} (\vec{r}) \cdot \nabla φ_{i} (\vec{r}) = \frac{1}{2} \sum_{i = 1}^{N_{σ}} \sum_{μ ν} C_{μ i}^{σ} C_{ν i}^{σ} \nabla χ_{μ} (\vec{r}) \cdot \nabla χ_{ν} (\vec{r}) \\ = & \frac{1}{2} \sum_{μ ν} P_{μ ν}^{σ} \nabla χ_{μ} (\vec{r}) \cdot \nabla χ_{ν} (\vec{r}) \end{matrix}

(54)

\begin{matrix} \nabla^{2} ρ_{σ} & = & \sum_{μ ν} P_{μ ν}^{σ} \nabla^{2} [χ_{μ} (\vec{r}) χ_{ν} (\vec{r})] \\ = & \sum_{μ ν} P_{μ ν}^{σ} [χ_{μ} (\vec{r}) \nabla^{2} χ_{ν} (\vec{r}) + 2 \nabla χ_{μ} (\vec{r}) \cdot \nabla χ_{ν} (\vec{r}) + χ_{ν} (\vec{r}) \nabla^{2} χ_{μ} (\vec{r})] \\ = & \sum_{μ ν} P_{μ ν}^{σ} [χ_{μ} (\vec{r}) \nabla^{2} χ_{ν} (\vec{r}) + χ_{ν} (\vec{r}) \nabla^{2} χ_{μ} (\vec{r})] + 4 τ_{σ} \end{matrix}

(55)

The meta-GGA contributions to the Kohn–Sham–Fock matrix are straightforwardly obtained as [20]

\begin{matrix} F_{μ ν}^{α; mGGA} & = & \frac{\partial E^{XC}}{\partial P_{μ ν}^{α}} \\ = & F_{μ ν}^{α; GGA} + \int d \vec{r} [\frac{\partial f (\vec{r})}{\partial τ_{α}} \frac{\partial τ_{α}}{\partial P_{μ ν}^{α}} + \frac{\partial f (\vec{r})}{\partial \nabla^{2} ρ_{α} (\vec{r})} \frac{\partial \nabla^{2} ρ_{α} (\vec{r})}{\partial P_{μ ν}^{α}}] \\ = & F_{μ ν}^{α; GGA} + \int d \vec{r} [(\frac{1}{2} \frac{\partial f (\vec{r})}{\partial τ_{α}} + 2 \frac{\partial f (\vec{r})}{\partial \nabla^{2} ρ_{α} (\vec{r})}) \nabla χ_{μ} (\vec{r}) \cdot \nabla χ_{ν} (\vec{r}) \\ + \frac{\partial f (\vec{r})}{\partial \nabla^{2} ρ_{α} (\vec{r})} [χ_{μ} (\vec{r}) \nabla^{2} χ_{ν} (\vec{r}) + χ_{ν} (\vec{r}) \nabla^{2} χ_{μ} (\vec{r})]] \end{matrix}

(56)

which can again be expressed in terms of matrix products to achieve faster quadrature. The expressions remain formally the same in the restricted case, but the quantities correspond to the total electron density.

11.4. Range-Separated Hybrid Functionals

As was mentioned before, the use of non-zero values for the constants a and b in Equation (7) allows the inclusion of exact exchange effects in a DFT calculation. Such functionals represent the fourth rung of Jacob’s Ladder [28], and are generally referred to as hybrids. A further development on hybrid functionals are range-separated hybrids [89,90], in which the interelectronic interaction is divided into a short-range (sr) and a long-range (lr) part with a resolution of the identity

\frac{1}{r_{12}} = \frac{ϕ^{sr} (r_{12})}{r_{12}} + \frac{ϕ^{lr} (r_{12})}{r_{12}}

(57)

where

ϕ^{sr} (r_{12}) + ϕ^{lr} (r_{12}) = 1

. The rationale for range separation is that since density functional approximations for the exchange are based only on local information about the density, they fail to reproduce accurate estimates for, e.g., charge transfer processes. Separating the interaction by range per Equation (57) leads to a hybrid exchange functional that has four contributions

E^{X} = a^{sr} E^{sr-HF} + a^{lr} E^{lr-HF} + b^{sr} E^{sr-DFT} + b^{lr} E^{lr-DFT} = a^{sr} E^{sr-HF} + a^{lr} E^{lr-HF} + E^{DFT}

(58)

where we have stressed that since the DFT contributions are evaluated based only on the density (and possibly its derivatives),

b^{sr} E^{sr-DFT} + b^{lr} E^{lr-DFT}

is nothing but a definition of a new density functional. In contrast, the HF contributions to the energy and the Kohn–Sham–Fock matrix have to be evaluated separately with range-separated ERIs

{(μ ν | λ σ)}^{sr / lr} = \int d {\vec{r}}_{1} \int d {\vec{r}}_{2} χ_{μ} (\vec{r_{1}}) χ_{ν} (\vec{r_{1}}) \frac{ϕ^{sr / lr} (r_{12})}{r_{12}} χ_{λ} (\vec{r_{2}}) χ_{σ} (\vec{r_{2}})

(59)

Several kinds of range-separation kernels

ϕ^{sr / lr} (r)

have been proposed; however, the error function based kernel

ϕ^{sr} (r; ω) = erfc (ω r)

,

ϕ^{lr} (r; ω) = \erf (ω r)

, where

ω

is the range-separation parameter, is by far the most commonly used one because it is exceedingly simple to implement in codes employing a plane-wave or Gaussian basis set [91,92]. The error function kernel is used, for instance, in the Heyd–Scuseria–Ernzerhof (HSE) functionals for solid-state calculations [91,93], as well as in the

ω

B97M-V [94] functional that is discussed below in Section 11.5. Some functionals based on Yukawa kernels,

ϕ^{sr} (r; ω) = e^{- λ r}

,

ϕ^{lr} (r; ω) = 1 - e^{- λ r}

, have also been published and are available in LibXC, for instance. It is important to check that the range-separation kernel used in the density functional implementation matches the one used in the computation of the range-separated ERIs in Equation (59), as the results will be incorrect otherwise.

11.5. Non-Local Correlation

Dispersion effects, i.e., van der Waals interactions, can be modeled in an ab initio DFT setting with non-local correlation functionals [95]

E^{nlc} = \int d {\vec{r}}_{1} d {\vec{r}}_{2} ρ ({\vec{r}}_{1}) Φ_{0} ({\vec{r}}_{1}, {\vec{r}}_{2}) ρ ({\vec{r}}_{2})

(60)

Because the non-local correlation energy term depends explicitly on the electron density, it also needs to be included in the SCF procedure, in principle. In contrast, empirical dispersion corrections such as Grimme’s various DFT-D approaches [96,97,98] do not depend on the electron density, and are added only as an ad hoc correction onto the electronic energy.

Perhaps the most accurate rung-3 and rung-4 functionals currently available [99,100,101], the pure B97M-V [102] mGGA as well as the range-separated

ω

B97M-V [94] hybrid mGGA, respectively, are built on top of [103] the VV10 non-local correlation functional [23] which is controlled by two adjustable parameters, b and C, which have been trained alongside the density functional in B97M-V and

ω

B97M-V. The results of a recent benchmark study suggest that the VV10 contributions on densities and orbital energies are negligible, and that sufficiently accurate energetics may be obtained by a one-shot evaluation of

E^{nlc}

in a post-SCF fashion [99]. Still, a rigorous minimization of the energy requires considering the effects of the non-local correlation on the wave function. Although Equation (60) does not appear to fit on the rungs of functionals discussed above, the VV10 kernel turns out to yield a GGA-type contribution to the Kohn–Sham–Fock matrix as discussed in [23], to which we refer for further details.

12. Summary and Discussion

We have presented an overview of self-consistent field calculations within a variational basis set formalism, and discussed the solution of the self-consistent field equations arising from Hartree–Fock as well as various levels of density functional approximations using either the traditional fixed-point equations or direct minimization, as well as various conceptual and numerical issues arising in their implementation. No assumptions have been made on the underlying basis set in the present work: the self-consistent field formalism is the same regardless of the form of the basis functions, which can be chosen to be, e.g., Gaussian-type orbitals (GTOs), Slater-type orbitals (STOs), numerical atomic orbitals (NAOs), or finite element shape functions. The basis set is only reflected in the molecular integrals, that is, the matrix elements of the core Hamiltonian and the two-electron integrals. The various choices for the basis set have different advantages and disadvantages, including the evaluation of the molecular integrals; see [2] for discussion. Instead, the main restriction in the present work is the implicit assumption that the basis set is compact enough so that the

N \times N

density and Fock matrices are small enough to allow

O (N^{2})

dense matrix storage and

O (N^{3})

diagonalization. Although the present overview has focused on non-relativistic calculations on molecules, the discussion for relativistic calculations as well as crystalline systems is analogous to a large degree. We hope that the present, consistent and thorough derivation and analysis will be useful for reference as well as teaching purposes, and that the results presented herein will lead to a wider availability of density functionals in electronic structure programs.

Author Contributions

S.L. derived the equations in Section 11. C.V.A. and F.B. derived the equations in Section 2, Section 3, Section 4, Section 5 and Section 6 and wrote an initial version of the manuscript. S.L. wrote the final version of the manuscript. All authors have read and agreed to the published version of the manuscript.

Funding

This work has been supported by the Academy of Finland (Suomen Akatemia) through project number 311149.

Conflicts of Interest

The authors declare no conflict of interest.

Sample Availability: Not available.

References

Poree, C.; Schoenebeck, F. A Holy Grail in Chemistry: Computational Catalyst Design: Feasible or Fiction? Acc. Chem. Res. 2017, 50, 605–608. [Google Scholar] [CrossRef]
Lehtola, S. A review on non-relativistic, fully numerical electronic structure calculations on atoms and diatomic molecules. Int. J. Quantum Chem. 2019, 119, e25968. [Google Scholar] [CrossRef]
Lehtola, S. Fully numerical Hartree–Fock and density functional calculations. I. Atoms. Int. J. Quantum Chem. 2019, 119, e25945. [Google Scholar] [CrossRef]
Lehtola, S. Fully numerical Hartree–Fock and density functional calculations. II. Diatomic molecules. Int. J. Quantum Chem. 2019, 119, e25944. [Google Scholar] [CrossRef]
Roothaan, C. New Developments in Molecular Orbital Theory. Rev. Mod. Phys. 1951, 23, 69–89. [Google Scholar] [CrossRef]
Hall, G.G. The Molecular Orbital Theory of Chemical Valency. VIII. A Method of Calculating Ionization Potentials. Proc. R. Soc. A Math. Phys. Eng. Sci. 1951, 205, 541–552. [Google Scholar] [CrossRef]
Löwdin, P.O. On the Nonorthogonality Problem. Adv. Quantum Chem. 1970, 5, 185–199. [Google Scholar] [CrossRef]
Pople, J.A.; Nesbet, R.K. Self-Consistent Orbitals for Radicals. J. Chem. Phys. 1954, 22, 571. [Google Scholar] [CrossRef]
Berthier, G. Configurations électroniques incomplètes. Partie I. La Méthode du Champ Moléculaire Self-Consistent et l’Etude des Etats à Couches Incomplètes. J. Chim. Phys. 1954, 51, 363–371. [Google Scholar] [CrossRef]
Roothaan, C. Self-Consistent Field Theory for Open Shells of Electronic Systems. Rev. Mod. Phys. 1960, 32, 179–185. [Google Scholar] [CrossRef]
Krebs, S. A review on the derivation of the spin-Restricted Hartree–Fock (RHF) Self-Consistent Field (SCF) equations for open-shell systems. Description of different methods to handle the off-diagonal Lagrangian multipliers coupling closed and open shells. Comput. Phys. Commun. 1999, 116, 137–277. [Google Scholar] [CrossRef]
Hohenberg, P.; Kohn, W. Inhomogeneous Electron Gas. Phys. Rev. 1964, 136, B864–B871. [Google Scholar] [CrossRef]
Kohn, W.; Sham, L.J. Self-Consistent Equations Including Exchange and Correlation Effects. Phys. Rev. 1965, 140, A1133–A1138. [Google Scholar] [CrossRef]
Baerends, E.J. Perspective on “Self-consistent equations including exchange and correlation effects”. In Theoretical Chemistry Accounts; Cramer, C.J., Truhlar, D.G., Eds.; Springer: Berlin/Heidelberg, Germany, 2000; pp. 265–269. [Google Scholar] [CrossRef]
Geerlings, P.; De Proft, F.; Langenaeker, W. Conceptual Density Functional Theory. Chem. Rev. 2003, 103, 1793–1874. [Google Scholar] [CrossRef] [PubMed]
Pople, J.A.; Gill, P.M.W.; Johnson, B.G. Kohn–Sham density-functional theory within a finite basis set. Chem. Phys. Lett. 1992, 199, 557–560. [Google Scholar] [CrossRef]
Johnson, B.G.; Gill, P.M.W.; Pople, J.A. The performance of a family of density functional methods. J. Chem. Phys. 1993, 98, 5612. [Google Scholar] [CrossRef]
Johnson, B.G. Erratum: The performance of a family of density functional methods. J. Chem. Phys. 1994, 101, 9202. [Google Scholar] [CrossRef]
Kobayashi, K.; Kurita, N.; Kumahora, H.; Tago, K. Bond-energy calculations of Cu₂, Ag₂, and CuAg with the generalized gradient approximation. Phys. Rev. A 1991, 43, 5810–5813. [Google Scholar] [CrossRef]
Neumann, R.; Nobes, R.H.; Handy, N.C. Exchange functionals and potentials. Mol. Phys. 1996, 87, 1–36. [Google Scholar] [CrossRef]
Vydrov, O.A.; Wu, Q.; Van Voorhis, T. Self-consistent implementation of a nonlocal van der Waals density functional with a Gaussian basis set. J. Chem. Phys. 2008, 129. [Google Scholar] [CrossRef]
Vydrov, O.A.; Van Voorhis, T. Improving the accuracy of the nonlocal van der Waals density functional with minimal empiricism. J. Chem. Phys. 2009, 130, 104105. [Google Scholar] [CrossRef] [PubMed]
Vydrov, O.A.; Van Voorhis, T. Nonlocal van der Waals density functional: The simpler the better. J. Chem. Phys. 2010, 133, 244103. [Google Scholar] [CrossRef] [PubMed]
Vydrov, O.A.; Van Voorhis, T. Implementation and assessment of a simple nonlocal van der Waals density functional. J. Chem. Phys. 2010, 132. [Google Scholar] [CrossRef] [PubMed]
Parrish, R.M.; Burns, L.A.; Smith, D.G.A.; Simmonett, A.C.; DePrince, A.E.; Hohenstein, E.G.; Bozkaya, U.; Sokolov, A.Y.; Di Remigio, R.; Richard, R.M.; et al. Psi4 1.1: An Open-Source Electronic Structure Program Emphasizing Automation, Advanced Libraries, and Interoperability. J. Chem. Theory Comput. 2017, 13, 3185–3197. [Google Scholar] [CrossRef] [PubMed]
Sun, Q.; Berkelbach, T.C.; Blunt, N.S.; Booth, G.H.; Guo, S.; Li, Z.; Liu, J.; McClain, J.D.; Sayfutyarova, E.R.; Sharma, S.; et al. PySCF: The Python-based simulations of chemistry framework. Wiley Interdiscip. Rev. Comput. Mol. Sci. 2018, 8, e1340. [Google Scholar] [CrossRef]
Becke, A.D.; Roussel, M.R. Exchange holes in inhomogeneous systems: A coordinate-space model. Phys. Rev. A 1989, 39, 3761–3767. [Google Scholar] [CrossRef]
Perdew, J.P. Jacob’s ladder of density functional approximations for the exchange-correlation energy. AIP Conf. Proc. 2001, 577, 1–20. [Google Scholar] [CrossRef]
Small, D.W.; Sundstrom, E.J.; Head-Gordon, M. Restricted Hartree Fock using complex-valued orbitals: A long-known but neglected tool in electronic structure theory. J. Chem. Phys. 2015, 142, 024104. [Google Scholar] [CrossRef]
Lee, J.; Bertels, L.W.; Small, D.W.; Head-Gordon, M. Kohn–Sham Density Functional Theory with Complex, Spin-Restricted Orbitals: Accessing a New Class of Densities without the Symmetry Dilemma. Phys. Rev. Lett. 2019, 123, 113001. [Google Scholar] [CrossRef]
Lehtola, S.; Head-Gordon, M.; Jónsson, H. Complex Orbitals, Multiple Local Minima, and Symmetry Breaking in Perdew–Zunger Self-Interaction Corrected Density Functional Theory Calculations. J. Chem. Theory Comput. 2016, 12, 3195–3207. [Google Scholar] [CrossRef]
Perdew, J.P.; Burke, K.; Ernzerhof, M. Generalized Gradient Approximation Made Simple. Phys. Rev. Lett. 1996, 77, 3865–3868. [Google Scholar] [CrossRef] [PubMed]
Becke, A.D. A new mixing of Hartree–Fock and local density-functional theories. J. Chem. Phys. 1993, 98, 1372–1377. [Google Scholar] [CrossRef]
Stephens, P.J.; Devlin, F.J.; Chabalowski, C.F.; Frisch, M.J. Ab Initio Calculation of Vibrational Absorption and Circular Dichroism Spectra Using Density Functional Force Fields. J. Phys. Chem. 1994, 98, 11623–11627. [Google Scholar] [CrossRef]
Lehtola, S.; Jónsson, H. Unitary Optimization of Localized Molecular Orbitals. J. Chem. Theory Comput. 2013, 9, 5365–5372. [Google Scholar] [CrossRef]
Foster, J.; Boys, S. Canonical Configurational Interaction Procedure. Rev. Mod. Phys. 1960, 32, 300–302. [Google Scholar] [CrossRef]
Edmiston, C.; Ruedenberg, K. Localized Atomic and Molecular Orbitals. Rev. Mod. Phys. 1963, 35, 457–464. [Google Scholar] [CrossRef]
Pipek, J.; Mezey, P.G. A fast intrinsic localization procedure applicable for ab initio and semiempirical linear combination of atomic orbital wave functions. J. Chem. Phys. 1989, 90, 4916. [Google Scholar] [CrossRef]
Høyvik, I.; Jansik, B.; Jørgensen, P. Orbital localization using fourth central moment minimization. J. Chem. Phys. 2012, 137, 224114. [Google Scholar] [CrossRef]
Lehtola, S.; Jónsson, H. Pipek–Mezey Orbital Localization Using Various Partial Charge Estimates. J. Chem. Theory Comput. 2014, 10, 642–649. [Google Scholar] [CrossRef]
Knizia, G.; Klein, J.E.M.N. Electron Flow in Reaction Mechanisms-Revealed from First Principles. Angew. Chemie Int. Ed. 2015, 54, 5518–5522. [Google Scholar] [CrossRef]
Liu, Y.; Kilby, P.; Frankcombe, T.J.; Schmidt, T.W. Calculating curly arrows from ab initio wavefunctions. Nat. Commun. 2018, 9, 1436. [Google Scholar] [CrossRef] [PubMed]
Klein, J.E.M.N.; Knizia, G.; Rzepa, H.S. Epoxidation of Alkenes by Peracids: From Textbook Mechanisms to a Quantum Mechanically Derived Curly-Arrow Depiction. ChemistryOpen 2019, 8, 1244–1250. [Google Scholar] [CrossRef] [PubMed]
Head-Gordon, M.; Pople, J.A. Optimization of wave function and geometry in the finite basis Hartree–Fock method. J. Phys. Chem. 1988, 92, 3063–3069. [Google Scholar] [CrossRef]
Van Voorhis, T.; Head-Gordon, M. A geometric approach to direct minimization. Mol. Phys. 2002, 100, 1713–1721. [Google Scholar] [CrossRef]
Coulson, C.A.; Fischer, I. XXXIV. Notes on the molecular orbital treatment of the hydrogen molecule. Lond. Edinb. Dublin Philos. Mag. J. Sci. 1949, 40, 386–393. [Google Scholar] [CrossRef]
Löwdin, P.O. Quantum theory of cohesive properties of solids. Adv. Phys. 1956, 5, 1–171. [Google Scholar] [CrossRef]
Lehtola, S. Curing basis set overcompleteness with pivoted Cholesky decompositions. J. Chem. Phys. 2019, 151, 241102. [Google Scholar] [CrossRef]
Lehtola, S. Accurate reproduction of strongly repulsive interatomic potentials. Phys. Rev. A. 2020, 101, 032504. [Google Scholar] [CrossRef]
Lehtola, S. Assessment of Initial Guesses for Self-Consistent Field Calculations. Superposition of Atomic Potentials: Simple yet Efficient. J. Chem. Theory Comput. 2019, 15, 1593–1604. [Google Scholar] [CrossRef]
Karlström, G. Dynamical damping based on energy minimization for use ab initio molecular orbital SCF calculations. Chem. Phys. Lett. 1979, 67, 348–350. [Google Scholar] [CrossRef]
Cancès, E.; Le Bris, C. Can we outperform the DIIS approach for electronic structure calculations? Int. J. Quantum Chem. 2000, 79, 82–90. [Google Scholar] [CrossRef]
Saunders, V.R.; Hillier, I.H. A “Level-Shifting” method for converging closed shell Hartree–Fock wave functions. Int. J. Quantum Chem. 1973, 7, 699–705. [Google Scholar] [CrossRef]
Mitin, A.V. The dynamic "level shift" method for improving the convergence of the SCF procedure. J. Comput. Chem. 1988, 9, 107–110. [Google Scholar] [CrossRef]
Dömötör, G.; Bán, M.I. Dynamic level-shifting. Comput. Chem. 1989, 13, 53–57. [Google Scholar] [CrossRef]
Pulay, P. Convergence acceleration of iterative sequences. The case of SCF iteration. Chem. Phys. Lett. 1980, 73, 393–398. [Google Scholar] [CrossRef]
Pulay, P. Improved SCF convergence acceleration. J. Comput. Chem. 1982, 3, 556–560. [Google Scholar] [CrossRef]
Kudin, K.N.; Scuseria, G.E.; Cancès, E. A black-box self-consistent field convergence algorithm: One step closer. J. Chem. Phys. 2002, 116, 8255. [Google Scholar] [CrossRef]
Hu, X.; Yang, W. Accelerating self-consistent field convergence with the augmented Roothaan–Hall energy function. J. Chem. Phys. 2010, 132, 054109. [Google Scholar] [CrossRef]
Rabuck, A.D.; Scuseria, G.E. Improving self-consistent field convergence by varying occupation numbers. J. Chem. Phys. 1999, 110, 695–700. [Google Scholar] [CrossRef]
Douady, J.; Ellinger, Y.; Subra, R.; Levy, B. A quadratically convergent SCF procedure. Comput. Phys. Commun. 1979, 17, 23–25. [Google Scholar] [CrossRef]
Douady, J.; Ellinger, Y.; Subra, R.; Levy, B. Exponential transformation of molecular orbitals: A quadratically convergent SCF procedure. I. General formulation and application to closed-shell ground states. J. Chem. Phys. 1980, 72, 1452. [Google Scholar] [CrossRef]
Head-Gordon, M.; Pople, J.A.; Frisch, M.J. Quadratically convergent simultaneous optimization of wavefunction and geometry. Int. J. Quantum Chem. 1989, 36, 291–303. [Google Scholar] [CrossRef]
Lehtola, S.; Jónsson, H. Variational, Self-Consistent Implementation of the Perdew–Zunger Self-Interaction Correction with Complex Optimal Orbitals. J. Chem. Theory Comput. 2014, 10, 5324–5337. [Google Scholar] [CrossRef] [PubMed]
Lehtola, S.; Jónsson, H. Correction to Variational, Self-Consistent Implementation of the Perdew–Zunger Self-Interaction Correction with Complex Optimal Orbitals. J. Chem. Theory Comput. 2015, 11, 5052–5053. [Google Scholar] [CrossRef][Green Version]
Perdew, J.P.; Zunger, A. Self-interaction correction to density-functional approximations for many-electron systems. Phys. Rev. B 1981, 23, 5048–5079. [Google Scholar] [CrossRef]
Lehtola, S.; Parkhill, J.; Head-Gordon, M. Orbital optimisation in the perfect pairing hierarchy: Applications to full-valence calculations on linear polyacenes. Mol. Phys. 2018, 116, 547–560. [Google Scholar] [CrossRef]
Parkhill, J.A.; Lawler, K.; Head-Gordon, M. The perfect quadruples model for electron correlation in a valence active space. J. Chem. Phys. 2009, 130, 084101. [Google Scholar] [CrossRef]
Parkhill, J.A.; Head-Gordon, M. A tractable and accurate electronic structure method for static correlations: The perfect hextuples model. J. Chem. Phys. 2010, 133, 024103. [Google Scholar] [CrossRef]
Seeger, R.; Pople, J.A. Self-consistent molecular orbital methods. XVIII. Constraints and stability in Hartree–Fock theory. J. Chem. Phys. 1977, 66, 3045–3050. [Google Scholar] [CrossRef]
Bauernschmitt, R.; Ahlrichs, R. Stability analysis for solutions of the closed shell Kohn–Sham equation. J. Chem. Phys. 1996, 104, 9047. [Google Scholar] [CrossRef]
Francisco, J.B.; Martínez, J.M.; Martínez, L. Globally convergent trust-region methods for self-consistent field electronic structure calculations. J. Chem. Phys. 2004, 121, 10863–10878. [Google Scholar] [CrossRef] [PubMed]
Thøgersen, L.; Olsen, J.; Yeager, D.; Jørgensen, P.; Sałek, P.; Helgaker, T. The trust-region self-consistent field method: Towards a black-box optimization in Hartree–Fock and Kohn–Sham theories. J. Chem. Phys. 2004, 121, 16–27. [Google Scholar] [CrossRef] [PubMed]
Thøgersen, L.; Olsen, J.; Köhn, A.; Jørgensen, P.; Sałek, P.; Helgaker, T. The trust-region self-consistent field method in Kohn–Sham density-functional theory. J. Chem. Phys. 2005, 123, 074103. [Google Scholar] [CrossRef] [PubMed]
Almlöf, J.; Faegri, K.; Korsell, K. Principles for a direct SCF approach to LCAO-MO ab-initio calculations. J. Comput. Chem. 1982, 3, 385–399. [Google Scholar] [CrossRef]
Van Lenthe, J.H.; Zwaans, R.; Van Dam, H.J.J.; Guest, M.F. Starting SCF calculations by superposition of atomic densities. J. Comput. Chem. 2006, 27, 926–932. [Google Scholar] [CrossRef]
Gilbert, A.T.B.; Besley, N.A.; Gill, P.M.W. Self-Consistent Field Calculations of Excited States Using the Maximum Overlap Method (MOM). J. Phys. Chem. A 2008, 112, 13164–13171. [Google Scholar] [CrossRef]
Barca, G.M.J.; Gilbert, A.T.B.; Gill, P.M.W. Simple Models for Difficult Electronic Excitations. J. Chem. Theory Comput. 2018, 14, 1501–1509. [Google Scholar] [CrossRef]
Hait, D.; Head-Gordon, M. Excited state orbital optimization via minimizing the square of the gradient: General approach and application to singly and doubly excited states via density functional theory. J. Chem. Theory Comput. 2020. [Google Scholar] [CrossRef]
Hait, D.; Head-Gordon, M. Highly Accurate Prediction of Core Spectra of Molecules at Density Functional Theory Cost: Attaining Sub-electronvolt Error from a Restricted Open-Shell Kohn–Sham Approach. J. Phys. Chem. Lett. 2020, 11, 775–786. [Google Scholar] [CrossRef]
Thom, A.; Head-Gordon, M. Locating Multiple Self-Consistent Field Solutions: An Approach Inspired by Metadynamics. Phys. Rev. Lett. 2008, 101, 193001. [Google Scholar] [CrossRef]
Mardirossian, N.; Head-Gordon, M. Thirty years of density functional theory in computational chemistry: An overview and extensive assessment of 200 density functionals. Mol. Phys. 2017, 115, 2315–2372. [Google Scholar] [CrossRef]
Lehtola, S.; Steigemann, C.; Oliveira, M.J.T.; Marques, M.A.L. Recent developments in LIBXC—A comprehensive library of functionals for density functional theory. SoftwareX 2018, 7, 1–5. [Google Scholar] [CrossRef]
Ekström, U.; Visscher, L.; Bast, R.; Thorvaldsen, A.J.; Ruud, K. Arbitrary-Order Density Functional Response Theory from Automatic Differentiation. J. Chem. Theory Comput. 2010, 6, 1971–1980. [Google Scholar] [CrossRef] [PubMed]
Bloch, F. Bemerkung zur Elektronentheorie des Ferromagnetismus und der elektrischen Leitfähigkeit. Z. Phys. 1929, 57, 545–555. [Google Scholar] [CrossRef]
Dirac, P.A.M. Note on Exchange Phenomena in the Thomas Atom. Math. Proc. Cambridge Philos. Soc. 1930, 26, 376–385. [Google Scholar] [CrossRef]
Becke, A.D. Density-functional exchange-energy approximation with correct asymptotic behavior. Phys. Rev. A 1988, 38, 3098–3100. [Google Scholar] [CrossRef]
Perdew, J.P.; Yue, W. Accurate and simple density functional for the electronic exchange energy: Generalized gradient approximation. Phys. Rev. B 1986, 33, 8800–8802. [Google Scholar] [CrossRef]
Savin, A.; Flad, H.J. Density functionals for the Yukawa electron-electron interaction. Int. J. Quantum Chem. 1995, 56, 327–332. [Google Scholar] [CrossRef]
Leininger, T.; Stoll, H.; Werner, H.J.; Savin, A. Combining long-range configuration interaction with short-range density functionals. Chem. Phys. Lett. 1997, 275, 151–160. [Google Scholar] [CrossRef]
Heyd, J.; Scuseria, G.E.; Ernzerhof, M. Hybrid functionals based on a screened Coulomb potential. J. Chem. Phys. 2003, 118, 8207. [Google Scholar] [CrossRef]
Ahlrichs, R. A simple algebraic derivation of the Obara-Saika scheme for general two-electron interaction potentials. Phys. Chem. Chem. Phys. 2006, 8, 3072–3077. [Google Scholar] [CrossRef] [PubMed]
Heyd, J.; Scuseria, G.E.; Ernzerhof, M. Erratum: “Hybrid functionals based on a screened Coulomb potential”. J. Chem. Phys. 2006, 124, 219906. [Google Scholar] [CrossRef]
Mardirossian, N.; Head-Gordon, M. ωB97M-V: A combinatorially optimized, range-separated hybrid, meta-GGA density functional with VV10 nonlocal correlation. J. Chem. Phys. 2016, 144, 214110. [Google Scholar] [CrossRef] [PubMed]
Berland, K.; Cooper, V.R.; Lee, K.; Schröder, E.; Thonhauser, T.; Hyldgaard, P.; Lundqvist, B.I. van der Waals forces in density functional theory: A review of the vdW-DF method. Rep. Prog. Phys. 2015, 78, 066501. [Google Scholar] [CrossRef] [PubMed]
Grimme, S. Accurate description of van der Waals complexes by density functional theory including empirical corrections. J. Comput. Chem. 2004, 25, 1463–1473. [Google Scholar] [CrossRef] [PubMed]
Grimme, S.; Antony, J.; Ehrlich, S.; Krieg, H. A consistent and accurate ab initio parametrization of density functional dispersion correction (DFT-D) for the 94 elements H-Pu. J. Chem. Phys. 2010, 132, 154104. [Google Scholar] [CrossRef] [PubMed]
Caldeweyher, E.; Bannwarth, C.; Grimme, S. Extension of the D3 dispersion coefficient model. J. Chem. Phys. 2017, 147, 034112. [Google Scholar] [CrossRef]
Najibi, A.; Goerigk, L. The Nonlocal Kernel in van der Waals Density Functionals as an Additive Correction: An Extensive Analysis with Special Emphasis on the B97M-V and ωB97M-V Approaches. J. Chem. Theory Comput. 2018, 14, 5725–5738. [Google Scholar] [CrossRef]
Santra, G.; Sylvetsky, N.; Martin, J.M.L. Minimally Empirical Double-Hybrid Functionals Trained against the GMTKN55 Database: revDSD-PBEP86-D4, revDOD-PBE-D4, and DOD-SCAN-D4. J. Phys. Chem. A 2019, 123, 5129–5143. [Google Scholar] [CrossRef]
Iron, M.A.; Janes, T. Evaluating Transition Metal Barrier Heights with the Latest Density Functional Theory Exchange–Correlation Functionals: The MOBH35 Benchmark Database. J. Phys. Chem. A 2019, 123, 3761–3781. [Google Scholar] [CrossRef]
Mardirossian, N.; Head-Gordon, M. Mapping the genome of meta-generalized gradient approximation density functionals: The search for B97M-V. J. Chem. Phys. 2015, 142, 074111. [Google Scholar] [CrossRef] [PubMed]
Calbo, J.; Ortí, E.; Sancho-García, J.C.; Aragó, J. The Nonlocal Correlation Density Functional VV10. Annu. Rep. Comput. Chem. 2015, 11, 37–102. [Google Scholar] [CrossRef]

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Lehtola, S.; Blockhuys, F.; Van Alsenoy, C. An Overview of Self-Consistent Field Calculations Within Finite Basis Sets. Molecules 2020, 25, 1218. https://doi.org/10.3390/molecules25051218

AMA Style

Lehtola S, Blockhuys F, Van Alsenoy C. An Overview of Self-Consistent Field Calculations Within Finite Basis Sets. Molecules. 2020; 25(5):1218. https://doi.org/10.3390/molecules25051218

Chicago/Turabian Style

Lehtola, Susi, Frank Blockhuys, and Christian Van Alsenoy. 2020. "An Overview of Self-Consistent Field Calculations Within Finite Basis Sets" Molecules 25, no. 5: 1218. https://doi.org/10.3390/molecules25051218

APA Style

Lehtola, S., Blockhuys, F., & Van Alsenoy, C. (2020). An Overview of Self-Consistent Field Calculations Within Finite Basis Sets. Molecules, 25(5), 1218. https://doi.org/10.3390/molecules25051218

Article Menu

An Overview of Self-Consistent Field Calculations Within Finite Basis Sets^†

Abstract

1. Introduction

2. Basis Set Expansion

3. Energy Expression

4. Unitary Invariance

5. Spin-Restriction vs. Unrestriction

6. Self-Consistent Field Equations

7. Solution of Self-Consistent Field Equations

8. Why Does the Self-Consistent Field Method Minimize the Energy?

9. Direct Minimization of the Energy

10. SCF vs. Direct Minimization

11. Density Functional Contributions to Kohn–Sham–Fock Matrix

11.1. LDA Functionals

11.2. GGA Functionals

11.3. Meta-GGA Functionals

11.4. Range-Separated Hybrid Functionals

11.5. Non-Local Correlation

12. Summary and Discussion

Author Contributions

Funding

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI

Article Menu

An Overview of Self-Consistent Field Calculations Within Finite Basis Sets †

Abstract

1. Introduction

2. Basis Set Expansion

3. Energy Expression

4. Unitary Invariance

5. Spin-Restriction vs. Unrestriction

6. Self-Consistent Field Equations

7. Solution of Self-Consistent Field Equations

8. Why Does the Self-Consistent Field Method Minimize the Energy?

9. Direct Minimization of the Energy

10. SCF vs. Direct Minimization

11. Density Functional Contributions to Kohn–Sham–Fock Matrix

11.1. LDA Functionals

11.2. GGA Functionals

11.3. Meta-GGA Functionals

11.4. Range-Separated Hybrid Functionals

11.5. Non-Local Correlation

12. Summary and Discussion

Author Contributions

Funding

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI

An Overview of Self-Consistent Field Calculations Within Finite Basis Sets^†