Next Article in Journal / Special Issue
The Role of the Reduced Laplacian Renormalization in the Kinetic Energy Functional Development
Previous Article in Journal
Field Programmable Gate Array Applications—A Scientometric Review
Previous Article in Special Issue
Emerging DFT Methods and Their Importance for Challenging Molecular Systems with Orbital Degeneracy
 
 
Font Type:
Arial Georgia Verdana
Font Size:
Aa Aa Aa
Line Spacing:
Column Width:
Background:
Article

Efficient Evaluation of Molecular Electrostatic Potential in Large Systems

Departamento de Química Física Aplicada, Facultad de Ciencias, Universidad Autónoma de Madrid, 28049 Madrid, Spain
*
Author to whom correspondence should be addressed.
Computation 2019, 7(4), 64; https://doi.org/10.3390/computation7040064
Submission received: 23 October 2019 / Revised: 8 November 2019 / Accepted: 9 November 2019 / Published: 12 November 2019
(This article belongs to the Special Issue New Advances in Density Functional Theory and Its Application)

Abstract

:
An algorithm for the efficient computation of molecular electrostatic potential is reported. It is based on the partition/expansion of density into (pseudo) atomic fragments with the method of Deformed Atoms in Molecules, which allows to compute the potential as a sum of atomic contributions. These contributions are expressed as a series of irregular spherical harmonics times effective multipole moments and inverse multipole moments, including short-range terms. The problem is split into two steps. The first one consists of the partition/expansion of density accompanied by the computation of multipole moments, and its cost depends on the size of the basis set used in the computation of electron density within the Linear Combination of Atomic Orbitals framework. The second one is the actual computation of the electrostatic potential from the quantities calculated in the first step, and its cost depends on the number of computation points. For a precision in the electrostatic potential of six decimal figures, the algorithm leads to a dramatic reduction of the computation time with respect to the calculation from electron density matrix and integrals involving basis set functions.

1. Introduction

Despite of its undeniable success in the description of chemical structure and reactivity, the language of chemistry is largely grounded on concepts with loose theoretical foundations. Concepts like bond, atomic charges, orbitals, hybridization, resonance, delocalization, hyperconjugation and many others cannot be directly measured in physical experiments or expressed more precisely in the jargon of quantum mechanics, they are not observables.
The proposal of G.N Lewis [1], in the early twenties of last century, to explain all the previous work made on the study of the molecular structure was recognized from the very beginning as an outstanding contribution to the field. Because of the weak arguments that supported his model, his work was promptly followed by attempts to overcome its theoretical shortcomings by providing a more rigorous foundation based on the recently discovered quantum mechanics. In these attempts, many concepts were proposed to underpin Lewis ideas by pioneers like Coulson, Pauling, Hückel and Mulliken, just to mention some of the most conspicuous figures. Their early efforts to accommodate the complex and sometimes elusive chemical facts in the frame provided by the new emerging physics were driven in part by the available computational capabilities, that were very reduced at the time.
Thus, what at first was an admirable exercise of imagination in its origin has led to a burden of new concepts in an attempt to grasp the increasing knowledge on the complexity of the chemical world. The effect of this multiplication of concepts has been keenly illustrated by P. Politzer in the case of the interpretation of a well established concept in chemistry as is hydrogen bonding. Numerous interpretations have been given depending on the circumstances and the authors’ preferences. As Politzer humorously quotes, hydrogen bonding “has been dissected into classical and nonclassical, proper and improper (immoral?), blue-shifted and red-shifted, dihydrogen, anti-hydrogen (rebellious?), resonance-assisted, polarization-assisted, and more” [2].
Without wishing to understate the usefulness and achievements of the somehow imprecise concepts in which chemical language is currently built, it is worth recalling that there are two physically well grounded observables that are most useful in describing the realm of chemistry: the electron density and the electrostatic potential.
In the Born–Oppenheimer envisioning of molecular structure [3], molecular electron density (MED) can be regarded as a mathematical description of the electron cloud surrounding the nuclei in the molecule. As it is a function defined in 3D space, it can be displayed and easily connected with intuition, and faithful approximations can be obtained with current computational methods at a low or moderate cost, depending on the system’s size. Although the importance of MED was realized very early, it remained long time as an auxiliary quantity for computing other properties like atom and bond charges [4] or to reinforce the analysis based on wave function and its related concepts in the Linear Combination of Atomic Orbitals (LCAO) framework, such as orbitals, hybridization and the like. The recognition of its usefulness as a powerful tool for analysis of molecular structure had to wait to R. Bader’s work [5,6], who changed the focus considering MED as a very central property for chemical interpretation, as an alternative to the well stablished orbital analysis. Since then, the topological analysis of MED has unceasingly gained popularity among theoretical chemists, and currently, it has become a well established tool that is customarily used in the interpretation of the chemical structure and reactivity.
The case of molecular electrostatic potential (MESP) is somehow comparable to that of MED, with the added difficulty of the cost of its faithful computation in large systems. Former works on the application of MESP to chemical facts [7,8,9] highlighted the possibilities of MESP in the interpretation of reactivity. The analysis of MESP in molecular surfaces and the exploration of the relations between its values with thermochemical properties by Politzer and Murray [10,11,12] have proved to be useful tools for systematization and prediction of such properties. Besides, the introduction of the concepts of sigma and pi holes by these authors has been very valuable in the interpretation of molecular interactions between closed shell systems [13,14]. Furthermore, the outstanding contributions by S. Gadre et al [15,16,17,18,19,20,21] on the topological analysis of MESP have added new insight on the structure and properties of molecules and molecular aggregates. However, albeit this analysis of MESP topology is breaking through in current research, its application to large systems is being hindered by the cost of computing accurate values of MESP and its derivatives. Thus, the introduction of more efficient procedures like that proposed herein has greatly facilitated this type of studies, and several works using the new algorithms have appeared in recent literature [22,23,24,25].
In this work, we report details on the algorithms used for the efficient calculation of MESP in large systems, based on the Deformed Atoms in Molecules (DAM) partition/expansion [26] of MED. This procedure allows us to split the problem in two separate steps, one dealing with the MED partition/expansion itself, and another corresponding to the actual computation of MESP [27]. As it will be shown, the cost of the first step is independent of the number of MESP computation points, and the second is independent of the number of basis set functions in the calculation. This separation is crucial for the high performance of the method.
The article is organized as follows. In the next section, the theoretical foundations of the method are outlined, only essential equations are given and the remaining ones are collected in the supplementary file (SF) accompanying this work. The third section is devoted to the description of the algorithms’ implementation, and is split in two subsections collecting the algorithms for the two steps of the procedure. Again, subordinate equations and technical details are addressed in SF. Several results on performance and accuracy are presented in the fourth section and, finally, significant conclusions are drawn from these results.

2. Method

The algorithm for the efficient evaluation of MESP reported here is based on DAM [26] partition of electron density. In this partition the molecular density is expressed as a sum of atomic contributions:
ρ ( r ) = A ρ A ( r A )
where the densities of the (pseudo)atomic fragments, ρ A ( r A ) , are obtained with a least-deformation criterion based on the fast convergence of the long-range multipole expansion of the electrostatic potential [28]. In practice, in the Linear Combination of Atomic Orbitals framework (LCAO), in which DAM is formulated, this fast convergence is achieved by assigning the one-center distributions, i.e., products of pairs of basis functions centered at the same point χ a A ( r A ) χ a A ( r A ) , to their pertaining atoms, A, and partitioning the two-center distributions, χ a A ( r A ) χ b B ( r B ) , between their respective centers:
χ a A ( r A ) χ b B ( r B ) = d a b A ( r A ) + d a b B ( r B )
In this work, we will deal with basis functions, χ , consisting of Gaussian contracted functions (CGTO), which are most used in molecular calculations.
Contracted Gaussian functions, χ L M ( C G T O ) , are linear combinations of primitive Gaussian functions, g L M ( ξ , r ) :
χ L M ( C G T O ) ( r ) = i = 1 N G c i g L M , i ( r )
where N G is the number of primitive functions (length of the contraction), index i runs over the primitive functions in the contraction, and the expansion coefficients, c i , are chosen so that the radial part of the contraction remains normalized. Primitive Gaussians on their side are defined as:
g L M ( ξ , r ) = N ξ r ( G T O ) r L e ξ r 2 N L M Ω z L M ( θ , ϕ )
and the same angular part is taken for all the primitives in a given contraction. Thus, the contracted functions can be written as:
χ L M ( C G T O ) ( r ) = i = 1 N G c i N ξ i r ( G T O ) e ξ i r 2 r L N L M Ω z L M ( θ , ϕ )
where N ξ i r and N L M Ω are radial and angular normalization constants defined by:
N ξ r ( G T O ) = 2 L + 1 ξ L ( 2 L + 1 ) ! ! ( 2 ξ ) 3 π 1 / 4
N L M Ω = ( 2 L + 1 ) ( L | M | ) ! 2 ( 1 + δ M 0 ) π ( L + | M | ) ! 1 / 2
and z L M ( θ , ϕ ) are unnormalized real spherical harmonics given by:
z L M ( θ , ϕ ) = ( 1 ) M P L | M | ( cos θ ) cos M ϕ ( M 0 ) sin | M | ϕ ( M < 0 )
where P L | M | ( z ) are the corresponding associated Legendre functions (see Equation 8.751.1 at [29]).
For two-center densities, g 00 A ( ξ A , r A ) g 00 B ( ξ B , r B ) , consisting of pairs of spherical primitive Gaussians ( L = 0 ) centered at two different points, A and B, it has been proved [28] that the best convergence in the long-range potential is achieved by assigning the whole density to the center with higher exponent ξ or, in case of equal exponents, by assigning one half to each center. This result reminds the nearest-site algorithm reported by A. Stone and M. Alderton [30], but without considering expansions at the bond center.
When applying this criterion to densities made of pairs of spherical contracted functions, the products of primitives become assigned to one center or another depending on their exponents, yielding a partition that can be regarded as a more realistic version of the Mulliken partition (see Figure 1).
For distributions consisting of nonspherical functions ( L 0 ), the partition is applied to the products of brackets of Equation (5), and the remaining terms are translated to the centers as described below.
The next step in the procedure is to expand the densities of the atomic fragments thus obtained as a series of radial factors times unnormalized real regular solid harmonics (hereafter regular harmonics, for short), namely:
ρ A ( r A ) = l = 0 m = l l z l m ( r A ) ρ l m A ( r A )
where the regular harmonics, z l m ( r ) , are related to unnormalized real spherical harmonics by:
z l m ( r ) = r l z l m ( θ , ϕ )
The partition of density given in Equation (1) combined with the expansion (9) allows us to write the electrostatic potential as [27]:
V ( r ) = A = 1 N ζ A r A l = 0 m = l l z l m ( r A ) Q l m A ( r A ) r A 2 l + 1 + q l m A ( r A )
in terms of atomic nuclei charges, ζ A , effective multipoles, Q l m ( r ) , and inverse multipoles, q l m ( r ) :
Q l m A ( r ) = ( l | m | ) ! ( l + | m | ) ! r < r d r z l m ( r ) ρ A ( r ) = 4 π 2 l + 1 0 r d r r 2 l + 2 ρ l m A ( r )
q l m A ( r ) = ( l | m | ) ! ( l + | m | ) ! r > r d r z l m ( r ) r 2 l + 1 ρ A ( r ) = 4 π 2 l + 1 r d r r ρ l m A ( r )
In this way, the molecular electrostatic potential results in a sum of atomic contributions and the short/long-range separation can be carried out at the atomic level. Thus in the long-range region, the effective multipoles can be accurately replaced by point multipoles:
Q l m A = lim r Q l m A ( r ) = 4 π 2 l + 1 0 d r r 2 l + 2 ρ l m A ( r )
and the inverse multipoles do vanish: lim r q l m A ( r ) = 0 .
As it will be shown below, this is a most useful feature when dealing with large systems because a huge fraction of the atomic contributions to MESP can be computed from only the long range terms even in the regions where molecular short-range potential is necessary. This apparent paradox comes from the fact that, in these systems, the molecular short-range at a given point usually involves a reduced number of atomic short-range contributions coming from atoms in the neighborhood of the point, the remaining contributions being of long-range type.

3. The Algorithm

The algorithm for the evaluation of the electrostatic potential in large molecular systems, according to the method described in the previous section, consists of two main steps, which are executed in sequence. First, a partition of the molecular density must be carried out with the DAM procedure followed by the expansion of the atomic fragments and the computation of effective and inverse multipoles. Second, once the partition/expansion has been made, the electrostatic potential is computed in the desired points with the aid of Equation (11).
It is worth noticing that the first step depends on the size of the basis set used in the computation of molecular density, but it is independent of the number of points where the MESP has to be computed. On the other hand, the second step depends on the number of points for computation, but it is independent of the basis set size.
This decoupling of processes depending on the system size (number of basis functions) from those depending on the number of MESP computation points, combined with the short/long-range separation at atomic level, makes the procedure reported here most efficient to deal with large systems, provided that the MESP is to be evaluated in a not too reduced number of points.

3.1. Algorithm for Molecular Density Partition/Expansion

According to DAM partition criterion, the density of atomic fragment at A is given by:
ρ A ( r ) = a , a A χ a A ( r A ) χ a A ( r A ) + B A a a b B d a b A ( r A )
where, for two-center CGTO distributions:
d a b A = i N G a j N G b Θ ( ξ i a ξ j b ) c i c j g i a ( ξ i a , r A ) g j b ( ξ j b , r B )
Θ ( x ) being the step function:
Θ ( x ) = 0 x < 0 1 / 2 x = 0 1 x > 0
As our purpose is to express the fragments d a b A in coordinates referred to center A and the second primitive in each term of Equation (16) is centered at B, it is necessary to translate the functions g j b to center A. The translation of the exponential factor in g j b can be made in terms of Bessel I functions as proposed by Kaufmann and Baumeister [31]. Working in an aligned frame, with A placed at the origin and B lying on the z axis, i.e., R A = ( 0 , 0 , 0 ) , R B = ( 0 , 0 , R B ) , the translation formula reads:
e ξ j b | r R B | 2 = e ξ j b ( r 2 + R B 2 ) l = 0 ( 2 l + 1 ) π 4 ξ j b r R B I l + 1 / 2 ( 2 ξ j b r R B ) P l ( cos θ )
where R B | R B | , I l + 1 / 2 ( z ) are the corresponding Bessel functions (see [29] 8.467) and P l ( z ) are the Legendre polynomials (ibid 8.91).
On the other hand, the remaining factor, which is in essence the regular harmonic z L B M B ( r ) (apart from a constant factor), can be translated to center A by well known formulas [32]. In the aligned frame, the formula reads:
z l m ( r R ) = k = | m | l l + | m | k + | m | ( R ) l k z k m ( r )
Once the functions are referred to center A, the one-center products of regular harmonics, appearing both in the one-center and in the translated two-center distributions, are decomposed in terms of regular harmonics as described in section, Decomposition of products of regular harmonics, in SF. The final radial factors are identified as the quantities that multiply the corresponding regular harmonics resultant from the decomposition.
In practice, the algorithm proceeds as follows:
  • The interval 0 r 20 bohr is partitioned in n i n t e r v subintervals with boundaries corresponding to previously selected values of r that will be noted as λ i , i = 0 , 1 , n i n t e r v (currently n i n t e r v = 30 , see SF for details).
  • For each interval, the variable r is mapped onto the interval [ 1 , 1 ] according to:
    t 2 r λ i 1 λ i λ i 1 1
    and a set of values of t is chosen as the zeroes of the Chebyshev T polynomial of order n (currently n = 30 ) given by (see [33] 22.16.4):
    t k = cos [ π ( k 1 / 2 ) / n ] k = 1 , n
  • For each center, A, of the system, one-center distributions are expanded as:
    χ a A ( r A ) χ a A ( r A ) = l m f l m L a , M a ; L a , M a ( r ) z l m ( r )
    As described in section, Expansion of one-center distributions, of SF. The radial factors are evaluated in the tabulation points r j , i = ( t j + 1 ) ( λ i λ i 1 ) / 2 + λ i 1 , multiplied by the ρ a a element of the density matrix, and accumulated.
  • Likewise, for each center, A, a loop over all the remaining centers B A is performed. In this loop, for each center B, all the fragments d a b A coming from two-center distributions with one function at A and the other at B, and attributted to center A, are expanded in an aligned frame as a series of regular solid harmonics times radial factors
    d a b A = l m f l m L a , M a ; L b , M b ( r ) z l m ( r )
    as described in section, One-center expansion of two-center fragments, of SF. The radial factors are evaluated in the tabulation points r j , i = ( t j + 1 ) ( λ i λ i 1 ) / 2 + λ i 1 , multiplied by the ρ ˜ a b element of the density matrix, which has been previously rotated to the aligned frame (that is what the tilde means) and accumulated in the aligned frame. Next, the locally accumulated radial factors (i.e., for fixed B) are rotated back to the molecular frame, and the resultant radial factors are further accumulated together with those coming from the one-center distributions and with the radial factors of other pairs of centers to yield the full radial factor ρ l m A ( r A ) of Equation (9). Details on rotations of both density matrix and radial factors are given in section, Rotations, in SF.
  • The tabulations of the ρ l m A ( r A ) radial factors are used to decide whether they are negligible or not and, for non-negligible factors, to carry out a numerical projection on Chebyshev T polynomials of variable t in each interval r [ λ i 1 , λ i ] . This projection yields the corresponding piecewise expansion of ρ l m A ( r A ) . Details of this expansion are given in section Expansion of atomic radial factors in SF. Thus, the final expansion in the i-th interval takes the form:
    ρ l m i , A ( r A ) e ζ i r A k = 0 n i c k ( i ) ( l , m ) T k ( t ) , r A [ λ i 1 , λ i ]
    where t is a function of r A , as defined in (20), and the exponential factor is introduced when ρ 00 A ( r A ) (leading term in expansion (9)) decays steeply in the interval (see SF for details); otherwise, ζ i = 0 is taken. The number of polynomials taken in the expansion at the i-th interval, n i , is determined on the fly by analyzing the convergence of the projections. The expansion coefficients, c k ( i ) ( l , m ) , of non-negligible factors are stored in a buffer. An array with a set of suitable pointers to address the coefficients is also generated and stored.
  • Once the radial factors of expansion have been piecewise expanded, they are used to compute the auxiliary partial integrals:
    Q l m A ( λ i 1 , r j , i ) 4 π 2 l + 1 λ i 1 r j , i d r r 2 l + 2 ρ l m ( r ) , r j , i [ λ i 1 , λ i ]
    q l m A ( λ i , r j , i ) 4 π 2 l + 1 r j , i λ i d r r ρ l m ( r ) , r j , i [ λ i 1 , λ i ]
    in the same tabulation points, r j , i , as used for the density, as well as the auxiliary constants:
    Q l m A ( λ i ) 4 π 2 l + 1 λ i 1 λ i d r r 2 l + 2 ρ l m ( r )
    and
    q l m A ( λ i ) 4 π 2 l + 1 λ i 1 λ i d r r ρ l m ( r )
    Details are given in section, Effective multipoles from density expansion, in SF.
  • The tabulations of Q l m A ( r j , i ) and q l m A ( r j , i ) are used to project these partial integrals onto Chebyshev T polynomials in the same intervals as used for the radial factors of density. In this case, no exponential factor is necessary:
    Q l m A ( λ i , r ) k = 0 n i b k ( i ) ( l , m ) T k ( t ) , r [ λ i 1 , λ i ]
    q l m A ( λ i , r ) k = 0 n i d k ( i ) ( l , m ) T k ( t ) , r [ λ i 1 , λ i ]
    The numbers of polynomials in the intervals, n i , are the same as in the corresponding radial factors. In this way, the pointers defined for addressing density expansion coefficients, c k ( i ) ( l , m ) , can be used also for coefficients b k ( i ) ( l , m ) and d k ( i ) ( l , m ) of Equations (29) and (30).
  • Atomic point multipoles of Equation (14) are obtained by:
    Q l m A = i = 1 n i n t e r v Q l m A ( λ i )
    where the sum runs over the intervals.
  • Molecular geometry and data corresponding to the tabulation of radial factors are stored in an external binary file with extension damqt, ready to be used for computation of DAM expansion of density. In particular, the following information is stored: number of atoms, number of basis functions and number of shell functions, atomic number and Cartesian coordinates of nuclei, basis set, length of expansion (9) ( l m a x ), and for each center A: pointers to expansion coefficients of radial factors, fitting exponents, ζ i , and expansion coefficients, c k ( i ) ( l , m ) .
  • Atomic multipole moments Q l m A , auxiliary quantities Q l m A ( λ i ) and q l m A ( λ i ) , and expansion coefficients b k ( i ) ( l , m ) and d k ( i ) ( l , m ) are stored in another external binary file with extension dmqtv. Since the pointers to b k and d k are the same as those used for a k by construction, they do not require to be stored again.

3.2. Algorithm for Electrostatic Potential Expansion

Once the files containing the information on MED partition/expansion and the auxiliary quantities for MESP have been generated, MESP can be computed at any desired points using this information. This step is completely independent of the first one, so that the computation can be made as many times as necessary and in different sets of points without requiring repetition of the partition/expansion process.
To compute the MESP, the following algorithm is employed:
  • MED partition/expansion data stored in file damqt are read and stored in memory.
  • MESP auxiliary data are read from dmqtv, stored in memory and used for computing further auxiliary quantities. In particular, partial accumulated sums:
    Q l m i , A = k = 0 i Q l m A ( λ k ) i = 1 , 2 , n i n t e r v
    and
    q l m i , A = k = i n i n t e r v q l m A ( λ k ) i = 1 , 2 , n i n t e r v
    are computed and stored too.
  • A double loop over atoms (outer) and tabulation intervals (inner) is performed to determine the length of expansion (11) in each interval and the long-range radius for the atom. This radius is chosen as the lower limit of the interval i, λ i 1 , for which q l m i , A is lower than a user defined long-range threshold.
  • Next, MESP is computed, running over the atoms, with Equation (11). For points placed in the long-range region, l r , of atom A, the contribution to MESP, V l r A ( r A ) , is computed in terms of the corresponding atomic point multipoles Q l m A as:
    V l r A ( r A ) = ζ A r A l = 0 m = l l z l m ( r A ) Q l m A r A 2 l + 1
    For points placed in the short-range region, s r , the contribution is computed by means of:
    V s r A ( r A ) = ζ A r A l = 0 m = l l z l m ( r A ) Q l m A ( r A ) r A 2 l + 1 + q l m A ( r A )
    and the quantities Q l m A ( r A ) and q l m A ( r A ) are obtained in terms of Q l m i , A and q l m i + 1 , A of Equations (32) and (33), i being the index of the interval such that λ i r A < λ i + 1 , plus the expansions (29) and (30) for the integrals in the interval [ λ i , r A ] and [ r A , λ i + 1 ] , respectively.
    In all cases, the regular solid harmonics are fast and accurately computed by recursion, as described in section, Recurrence relations of regular solid harmonics, of SF. In the short-range case, eqs (29) and (30) are evaluated with the coefficients b k and d k previously retrieved from file dmqtv and stored in memory, and with the Chebyshev polynomials computed by recursion, as shown in section, Recurrence relations of Chebyshev polynomials, of SF.
  • If MESP derivatives are wanted, they can be computed together with MESP and using the same auxiliary quantities [34]. The procedure is quoted in section, Computing MESP derivatives, in SF.
  • Data on basis set and density matrix are only necessary if computation of MESP in terms of nuclear attraction integrals and density matrix, without DAM partition/expansion, is required. As this is an expensive procedure, its usage should be restricted to those cases in which a reference is necessary for testing the accuracy of the algorithm reported here.

4. Results

To test the performance of the method, we have started by analizing the accuracy of the results attained with the current algorithm. For this purpose, we have computed MESP values for benzene molecule in a set of equally spaced points corresponding to a 129 × 129 × 129 grid in the octant defined by: x , y , z [ 0 , 20 ] (length in bohr). These results have been compared with the MESP exact values, V e x a c t , computed using the electron density matrix and the integrals involving basis set functions:
V e x a c t ( r C ) = r = 1 n b a s i s s = 1 r ( 2 δ r s ) ρ r s d r χ r ( r ) χ s ( r ) | r r C |
where ρ r s are the elements of the density matrix, and n b a s i s stands for the number of basis functions, χ r , in the LCAO calculation.
The results of this comparison are reported in Table 1 for four different lengths in expansion (9), namely: l m a x = 5 , 10 , 15 , 20 . The highest absolute error, Δ m a x , and the root mean square error, r m s e , in the grid points significantly decrease with the length of multipole expansion, and they suggest that an expansion with l m a x = 10 is sufficiently precise for most practical applications. Nevertheless, higher precision can be attained for more demanding applications like topological analysis at a very moderate cost.
Remarkably, for expansions with l m a x = 10 or higher, a great amount of points are computed with a high number of accurate decimal figures (twelve or more) as is shown in the last row of Table 1. This can be explained because short-range contributions in most points of the grid (those lying outside the molecular volume) are very small or even negligible, and precision is greatly determined by the convergence of the long-range expansion, i.e., the number of terms in (9), the accuracy in the radial factors playing a minor role.
In this regard, it is interesting to check how the algorithm performs in those points in which short-range terms are important. For this purpose, the number of points with a number of decimal accurate figures is plotted in Figure 2 for the four expansions. As it can be seen, the number of points addressed in the curve corresponding to l m a x = 5 is significantly greater than in the remaining curves. This is due to the insufficient convergence of the long-range MESP in this case, as mentioned before. In the remaining curves, the convergence in the long-range MESP has been achieved to a great extent, and the precision is mainly determined by the short-range terms. As the curves show, also in these cases the precision readily increases with the length of the expansion in two senses: a steady augment of the number of correct decimal figures in the least precise points is observed, and a significant decrease in the number of points for a given precision occurs. Raw data used for the figure can be found in section, Precision of MESP calculation, in SF.
The computation wall-clock time with Equation (36) was 5600 seconds, and 2.9, 6.6, 16.2 and 28.0 s for the respective expansions. On the other hand, DAM partition/expansion step with l m a x = 20 took only 2.2 s for benzene density computed with Dunning’s cc-pVDZ basis set [35]. These times were measured on a laptop with processor Intel(R) Core(TM) i7-6700HQ CPU @ 2.60 GHz, running an MPI parallel version of the codes compiled with gfortran, and using 4 processors. Although the calculation was made at HF level, it is important to stress that the computational cost of the full algorithm is independent of the computational level of the molecular density. This is so because the partition/expansion step only depends on the number of elements of the density matrix (i.e., the square of the basis set size), and the MESP computation only depends on the length of expansion (9) and the number of atoms in the system.
It is fairly evident that the algorithm based on DAM yields results that can be sufficient for most practical purposes at a computational time that, in this case, is between two and three orders of magnitude lower than the time required for computation from density matrix and integrals. This is so in a rather small system in which the short-range atomic contributions in the selected grid are about 20% of the total contributions, for a long-range threshold equal to 10 7 taken in cases of l m a x = 5 , 10 , and about 40% for a threshold of 10 8 taken in cases of l m a x = 15 , 20 . As it will be shown below, for really large systems, the fraction of short-range contributions in equivalent grids is much smaller, and the gain in the computational cost increases with the system size with respect to the computation by means of Equation (36). A further test on MESP surface extrema on a density isosurface has been included in the supplementary files.
Once the validity of the results has been established from the point of view of accuracy, we have analyzed the performance of the algorithm in large systems. In Table 2 we collect the results of MESP calculations in a set of molecules ranging from a small system like benzene (12 atoms, 222 basis functions) to two large ones: CC-MMIM BF 4 , consisting of a three circumcoronene slices with two pairs of MMIM BF 4 ionic liquid molecules embedded between the circumcoronene sheets (617 atoms) and a DNA fragment with 750 atoms. In the last two, the electron densities correspond to PM7 calculations [36] (only basis valence functions).
With these results, we have analyzed the dependence of the computational cost of the two steps involved. Three different expansions have been used ( l m a x = 10 , 15 , 20 ), except in CC-MMIM FB 4 and DNA fragment. In these cases, due to the ZDO aproximation involved in PM7 method, only valence one-center distributions contribute to densities. Consequenty, the partition/expansion is a very fast process and terms in MED and MESP expansions with l m a x higher than twice the highest L value in the basis set are zero.
As expected, the computational time of the partition/expansion step in small systems increases with the square of the number of basis functions, and at a lower rate when dealing with large systems, in which cases the dependence tends to be linear. The increment of the cost with respect to the length of expansion (11) is also smaller than the predicted l m a x 2 dependence. In the supplementary files, a little movie showing the MESP of the DNA fragment over its molecular surface defined as MED isosurface with ρ = 0.001 au is included. Red color means positive MESP values, blue color, negative values.
Finally, in Table 3 we analyze the performance of MPI parallelization of the codes corresponding to both steps of the algorithm. Calculations have been carried out on a polyhidroxilated circumcoronene system with 360 centers, with a basis set consisting of 7560 contracted GTOs. The expansion of the atomic fragments has been made up to l m a x = 20 , and the MESP has been computed with this expansion on a 129 × 129 × 129 cubic grid (2,146,689 points) within an interval x , y , z [ 38 , 38 ] . The wall clock time, the average time per processor and the standard deviation are provided for the DAM partition/expansion and for the MESP tabulation.
In both steps, the computational time scales very well with the number of processors. In the partition/expansion step, a linear fit of the clock wall time vs. the reciprocal of the number of processors gives a regression parameter of R 2 = 0.9998 , and the same value is obtained for the fit of the average time. In case of MESP tabulation, the scaling is likewise, with R 2 = 0.9982 and R 2 = 0.9990 , respectively.
The standard deviations show that the time is more evenly shared among processors in the partition/expansion than in the tabulation, but in both cases the time distribution is satisfactory. Furthermore, in the second step the standard deviation is affected by the fact that the main processor spends more time than the remaining ones, due to the workload associated to the gathering of tabulations accomplished by ancillary processors, which are stored in external files, and to tidying tasks.

5. Conclusions

A method for the efficient computation of molecular electrostatic potential has been reported. The method splits the problem in two steps. The first one consists of the partition of the molecular electron density in terms of (pseudo)atomic densities, which are further expanded as a truncated series of spherical harmonics times radial factors. The second step corresponds to the computation of the electrostatic potential from these expansions. The computational cost of the first step depends on the size of the basis set used in the LCAO computation and the length of the expansions, and it is independent of the number of points in which the potential is computed. The cost of the second step is independent of the basis set size and only depends on the expansions length and the number of tabulation points. The result is an algorithm that reduces the computational time around two orders of magnitude when compared with the calculation from electron density matrix and integrals involving the basis set, for a precision of 6 decimal figures. The algorithm can be very efficiently parallelized with MPI, showing a good sharing of computational cost among processors in both steps. The algorithm has been implemented in DAMQT package for the analysis of molecular electron density and related properties. A recent version of the package can be downloaded freely from: https://data.mendeley.com/datasets/7mwfftd2x4/2.

Supplementary Materials

Supplementary materials can be found at https://www.mdpi.com/2079-3197/7/4/64/s1.

Author Contributions

Conceptualization: R.L., I.E., J.M.G.d.l.V., G.R.; Methodology: R.L., G.R., I.E.; Software: R.L., I.E., F.M., G.R.; Supervision: J.M.G.d.l.V.; Validation: F.M.; Writing: R.L., J.M.G.d.l.V.

Funding

This research received no external funding.

Acknowledgments

We wish to thank Noel Ferro for supplying us the Catechin data for testing MESP on a density isosurface.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
CGTOContracted Gaussian type orbital
CPUCentral processing unit
GTOGaussian type orbital (primitive)
DAMMethod of deformed atoms in molecules
DAMQTPackage for the analysis of electron density and related properties
LCAOLinear combination of atomic orbitals
MEDMolecular electron density
MESPMolecular electrostatic potential
MMIMDimethyl imidazolium
MPIMessage passing interface
ZDOZero differential overlap

References

  1. Lewis, G.N. The atom and the molecule. J. Am. Chem. Soc. 1916, 38, 762–785. [Google Scholar] [CrossRef]
  2. Politzer, P.; Murray, J.S. An Overview of Strengths and Directionalities of Noncovalent Interactions: σ-Holes and π-Holes. Crystals 2019, 9, 165. [Google Scholar] [CrossRef]
  3. Born, M.; Oppenheimer, R. Zur Quantentheorie der Molekeln. Ann. Phys. 1927, 389, 457–484. [Google Scholar] [CrossRef]
  4. Mulliken, R.S. Electronic Population Analysis on LCAO–MO Molecular Wave Functions. J. Chem. Phys. 1955, 23, 1833–1840. [Google Scholar] [CrossRef]
  5. Bader, R.F.W.; Henneker, W.H. Molecular Charge Distributions and Chemical Binding. J. Chem. Phys. 1967, 46, 3341–3363. [Google Scholar] [CrossRef]
  6. Bader, R.F.W. Atoms in Molecules: A Quantum Theory; Clarendon Press: Oxford, UK, 1990. [Google Scholar]
  7. Scrocco, E.; Tomasi, J. The electrostatic molecular potential as a tool for the interpretation of molecular properties. In New Concepts II; Springer: Berlin/Heidelberg, Germany, 1973; Volume 42, pp. 95–170. [Google Scholar] [CrossRef]
  8. Scrocco, E.; Tomasi, J. Electronic molecular structure, reactivity and intermolecular forces: An heuristic interpretation by means of electrostatic molecular potentials. Advan. Quantum Chem. 1978, 11, 116–193. [Google Scholar]
  9. Sjoberg, P.; Politzer, P. Use of the Electrostatic Potential at the Molecular Surface to Interpret and Predict Nucleophilic Processes. J. Phys. Chem. 1990, 94, 3959–3961. [Google Scholar] [CrossRef]
  10. Jane S, M.; Lane, P.; Brinck, T.; Politzer, P. Relationships between Computed Molecular Properties and Solute-Solvent Interactions in Supercritical Solutions. J. Phys. Chem. 1993, 97, 5144–5148. [Google Scholar]
  11. Murray, J.S.; Lane, P.; Brinck, T.; Paulsen, K.; Grice, M.E.; Politzer, P. Relationships of Critical Constants and Boiling Points to Computed Molecular Surface Properties. J. Phys. Chem. 1993, 97, 9369–9373. [Google Scholar] [CrossRef]
  12. Murray, J.S.; Peralta-Inga, Z.; Politzer, P.; Ekanayake, K.; Lebreton, P. Computational Characterization of Nucleotide Bases: Molecular Surface Electrostatic Potentials and Local Ionization Energies, and Local Polarization Energies. Int. J. Quantum Chem. 2001, 83, 245–254. [Google Scholar] [CrossRef]
  13. Murray, J.S.; Paulsen, K.; Politzer, P. Molecular surface electrostatic potentials in the analysis of non-hydrogen-bonding noncovalent interactions. Proc. Indian Acad. Sci. 1994, 106, 267–275. [Google Scholar]
  14. Clark, T.; Hennemann, M.; Murray, J.; Politzer, P. Halogen bonding: The σ-hole. J. Mol. Model. 2007, 13, 291–296. [Google Scholar] [CrossRef] [PubMed]
  15. Pathak, R.K.; Gadre, S.R. Maximal and minimal characteristics of molecular electrostatic potentials. J. Chem. Phys. 1990, 93, 1770–1773. [Google Scholar] [CrossRef]
  16. Gadre, S.R.; Pathak, R.K. Nonexistence of local maxima in nolecular electrostatic potential maps. Proc. Ind. Acad. Sci. (Chem. Sci.) 1990, 102, 189–192. [Google Scholar]
  17. Gadre, S.R.; Shrivastava, I.H. Shapes and sizes of molecular anions via topographical of electrostatic potential analysis. J. Chem. Phys. 1991, 94, 4384–4390. [Google Scholar] [CrossRef]
  18. Gadre, S.R.; Kulkarni, S.A.; Suresh, C.H.; Shrivastava, I.H. Basis set dependence of the molecular electrostatic potential topography. A case study of substituted benzenes. Chem. Phys. Lett. 1995, 239, 273–281. [Google Scholar] [CrossRef]
  19. Roy, D.; Balanarayan, P.; Gadre, S.R. An appraisal of Poincaré–Hopf relation and application to topography of molecular electrostatic potentials. J. Chem. Phys. 2008, 129, 174103. [Google Scholar] [CrossRef] [PubMed]
  20. Kumar, A.; Gadre, S.R. Exploring the Gradient Paths and Zero Flux Surfaces of Molecular Electrostatic Potential. J. Chem. Theory Comput. 2016, 12, 1705–1713. [Google Scholar] [CrossRef] [PubMed]
  21. Kumar, A.; Gadre, S.R. Molecular Electrostatic Potential Based Atoms in Molecules: Shielding Effects and Reactivity Patterns. Aust. J. Chem. 2016, 69, 975–982. [Google Scholar] [CrossRef]
  22. Alkorta, I.; Martín-Fernández, C.; Montero-Campillo, M.M.; Elguero, J. Hydrogen-Bonding Acceptor Character of Be3, the Beryllium Three- Membered Ring. J. Phys. Chem. 2018, 122, 1472–1478. [Google Scholar] [CrossRef] [PubMed]
  23. Alkorta, I.; Elguero, J.; Bene, J.E.D. Complexes of O=C=S with Nitrogen Bases: Chalcogen Bonds, Tetrel Bonds, and Other Secondary Interactions. Chem. Phys. Chem. 2018, 19, 1886–1894. [Google Scholar] [CrossRef] [PubMed]
  24. Tahir, M.N.; Ashfaq, M.; de la Torre, A.F.; Caballero, J.; W.Hernández-Rodríguez, E.; Ali, A. Rationalizing the stability and interactions of 2,4-diamino-5-(4-chlorophenyl)-6-ethylpyrimidin-1-ium 2-hydroxy-3, 5-dinitrobenzoate salt. J. Mol. Struct. 2019, 1193, 185–194. [Google Scholar] [CrossRef]
  25. Kote, S.R.; Albuquerque, J.V.; Shirsat, R.N.; Phadke, R.V.; Kohapare, J.T.; Dhoble, S.S. TLC Detection and Theoretical Structure Elucidation of Nitrogen Containing Compounds with Cobalt Thiocyanate. Anal. Chem. Lett. 2019, 9, 453–462. [Google Scholar] [CrossRef]
  26. Rico, J.F.; López, R.; Ema, I.; Ramírez, G.; Ludeña, E. Analytical method for the representation of atoms-in-molecules densities. J. Comput. Chem. 2004, 25, 1355–1363. [Google Scholar] [CrossRef] [PubMed]
  27. Rico, J.F.; López, R.; Ema, I.; Ramírez, G. Electrostatic potentials and fields from density expansions of deformed atoms in molecules. J. Comput. Chem. 2004, 25, 1347–1354. [Google Scholar] [CrossRef] [PubMed]
  28. Rico, J.F.; López, R.; Ramírez, G. Analysis of the molecular density. J. Chem. Phys. 1999, 110, 4213–4220. [Google Scholar] [CrossRef]
  29. Gradshteyn, I.S.; Ryzhik, I.M. Table of Integrals, Series and Products, 4th ed.; Academic Press: New York, NY, USA, 1980. [Google Scholar]
  30. Stone, A.; Alderton, M. Distributed Multipole Analysis. Mol. Phys. 1985, 56, 1047–1064. [Google Scholar] [CrossRef]
  31. Kaufmann, K.; Baumeister, W. Single-centre expansion of Gaussian basis functions and the angular decomposition of their overlap integrals. J. Phys. B At. Mol. Opt. Phys. 1989, 22, 1–12. [Google Scholar] [CrossRef]
  32. Rico, J.F.; López, R.; Ema, I.; Ramírez, G. Translation of Real Solid Spherical Harmonics. Int. J. Quantum Chem. 2013, 113, 1544–1548. [Google Scholar] [CrossRef]
  33. Abramowitz, M.; Stegun, I. Handbook of Mathematical Functions, 9th printing ed.; Dover: Long Island, NY, USA, 1970. [Google Scholar]
  34. López, R.; Rico, J.F.; Ramírez, G.; Ema, I.; Zorrilla, D.; Kumar, A.; Yeole, S.D.; Gadre, S.R. Topology of molecular electron density and electrostatic potential with DAMQT. Comput. Phys. Commun. 2017, 214, 207–215. [Google Scholar] [CrossRef]
  35. Dunning, T.H., Jr. Gaussian basis sets for use in correlated molecular calculations. I. The atoms boron through neon and hydrogen. J. Chem. Phys. 1989, 90, 1007–1023. [Google Scholar] [CrossRef]
  36. Stewart, J.J.P. Optimization of parameters for semiempirical methods VI: More modifications to the NDDO approximations and re-optimization of parameters. J. Mol. Model. 2013, 19, 1–32. [Google Scholar] [CrossRef] [PubMed] [Green Version]
Figure 1. Partition of two-center electron densities, χ A χ B , into atomic fragments d a b A , d a b B : (a) DAM, (b) Mulliken.
Figure 1. Partition of two-center electron densities, χ A χ B , into atomic fragments d a b A , d a b B : (a) DAM, (b) Mulliken.
Computation 07 00064 g001
Figure 2. MESP precision.
Figure 2. MESP precision.
Computation 07 00064 g002
Table 1. Precision of MESP computed with the current algorithm.
Table 1. Precision of MESP computed with the current algorithm.
l m a x 5101520
Δ m a x a 0.189 × 10 2 0.305 × 10 4 0.954 × 10 6 0.119 × 10 6
rmse b 0.379 × 10 4 0.264 × 10 6 0.274 × 10 8 0.164 × 10 9
High prec c 6601,239,1421,996,5692,097,390
a Highest absolute error. b Root mean square error. c Number of grid points with more than eleven accurate decimal figures.
Table 2. Computational cost in seconds of MESP computation with the current algorithm a . Percentage of long-range contributions to MESP for a threshold of 10 8
Table 2. Computational cost in seconds of MESP computation with the current algorithm a . Percentage of long-range contributions to MESP for a threshold of 10 8
MoleculeBenzeneLiposidomycinDBCOx2 b CC-MMIM BF 4 c DNA Fragment
n. atoms1271360617750
n. basis d 2221157756017152250
partition/expansion time e2.2617801.1 g 2.0 h
box size f 408476112156
time MESP ( l m a x = 10 )736230195 g 254 h
time MESP ( l m a x = 15 )1665435
time MESP ( l m a x = 20 )43105617
% long-range h 60959399.599.99
a MESP computed on a 129 × 129 × 192 grid (2,146,689 points) of a cubic box. b Dimer of polyhidroxilated circumcoronene. c Three sheets of circumcoronene with two pairs of MMIM BF 4 molecules embedded between the sheets. PM7 calculation. d Number of basis functions. e Time in DAM partition/expansion with l m a x = 20 . f Length in bohr of the cubic box edges. g Due to ZDO approximation, only terms up to l m a x = 2 do not vanish. h Due to ZDO approximation, only terms up to l m a x = 4 do not vanish.
Table 3. Performance of the algorithm parallelization. All times in seconds.
Table 3. Performance of the algorithm parallelization. All times in seconds.
N. Procs a Partition/ExpansionMESP Tabulation b
Wall Clock c Average d Std. Dev. eWall Clock c Average d Std. Dev. e
1216520482048
2112511240.29929894
46706690.855252026
65315300.840636632
84664650.533529324
104024000.927223721
a Number of MPI processes on a system with an Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz with 10 cores. b  Computation on a 129 × 129 × 129 grid (2,146,689 points). c Highest time among processors. d Average time per processor. e Standard deviation.

Share and Cite

MDPI and ACS Style

Lopez, R.; Martinez, F.; Ema, I.; Garcia de la Vega, J.M.; Ramirez, G. Efficient Evaluation of Molecular Electrostatic Potential in Large Systems. Computation 2019, 7, 64. https://doi.org/10.3390/computation7040064

AMA Style

Lopez R, Martinez F, Ema I, Garcia de la Vega JM, Ramirez G. Efficient Evaluation of Molecular Electrostatic Potential in Large Systems. Computation. 2019; 7(4):64. https://doi.org/10.3390/computation7040064

Chicago/Turabian Style

Lopez, Rafael, Frank Martinez, Ignacio Ema, Jose Manuel Garcia de la Vega, and Guillermo Ramirez. 2019. "Efficient Evaluation of Molecular Electrostatic Potential in Large Systems" Computation 7, no. 4: 64. https://doi.org/10.3390/computation7040064

APA Style

Lopez, R., Martinez, F., Ema, I., Garcia de la Vega, J. M., & Ramirez, G. (2019). Efficient Evaluation of Molecular Electrostatic Potential in Large Systems. Computation, 7(4), 64. https://doi.org/10.3390/computation7040064

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Metrics

Back to TopTop