Generation of Basis Sets for Accurate Molecular Calculations: Application to Helium Atom and Dimer

: A new approach for basis set generation is reported and tested in helium atom and dimer. The basis sets thus computed, named sigma, range from DZ to 5Z and consist of the same composition as Dunning basis sets but with a different treatment of contractions. The performance of the sigma sets is analyzed for energy and other properties of He atom and He dimer, and the results are compared with those obtained with Dunning and ANO basis sets. The sigma basis sets and their extended versions up to triple augmented provide better energy values than Dunning basis sets of the same composition, and similar values to those attained with the currently available ANO. Extrapolation to complete basis set of correlation energy is compared between the sigma basis sets and those of Dunning, showing the better performance of the former in this respect.


Introduction
The study of weak van der Waals (vdW) interactions has always been one of the most challenging applications of theoretical calculations of electron structure. Thus, methods based on Kohn-Sham (KS) density functional theory (DFT) have shown limitations for including weak and long-range interactions in the exchange-correlation term of the KS equation. In particular, standard functionals of DFT fail to explain these interactions because the stabilization is determined by dispersion interactions, and is not explained by these functionals. Exchange-correlation potentials, derived from local and semi-local models, often exhibit artifacts when applied to systems with large non-local correlation effects. Nevertheless, there is a continuous effort to include vdW interactions within the framework of KS theory [1], and remarkable progress on corrections to this fact has been made, such as the exchange-hole dipole moment (XDM) model [2,3].
In the long range, vdW interactions are dominated by dispersion and permanent multipole moment interaction and include superposition and exchange contributions. The behavior of an exchange functional in the region of small density and large density gradients plays a very important role. From a topological point of view, the presence of vdW interactions can be identified by reduced density gradient (RDG) analysis. In this respect, a novel procedure for studying van der Waals interactions has been developed as an extension of Bader's QTAIM in combination with RDG analysis in which a volumetric source function is used for describing the atomic composition of vdW interactions [4].
The above considerations reinforce that the well known fact is that to reliably study vdW interactions, not only large basis sets (BS) are needed, but also fine electron correlation treatments are required to reproduce experimental data. In particular, noble gases have been the focus of much research as a starting point for the study of rare gas dimer interactions [5], a criterion for the performance of basis sets and correlation methods in vdW studies.
Among the systems linked by vdW dispersion interactions, helium occupies an outstanding position. The properties of gaseous helium are close to those of the ideal gas because the interactions between helium atoms are extremely weak, and its behavior at very low temperatures makes helium a paramount system on its own, with unusual physicochemical properties under these conditions. Furthermore, vdW interaction in helium dimer, He 2 , is a touchstone for testing the capabilities of the most precise theoretical procedures available [6], as its existence is due to purely electronic correlation effects.
The accurate computation of the He 2 potential energy curve is a big challenge, and a great amount of work has been devoted to this task. Ab initio studies on He 2 interactions have been published by Van Mourik et al., who report dimer calculations using different methods and computational levels (MPn, CCSD(T), FCI, . . . ) with large basis sets, including polarization [7][8][9]. In the late 20th and 21st centuries, many other authors have published works on He 2 , using various sets of a consistent basis for correlation and a high level of theory: SAPT [10], CCSD(T) [11], r12-MR-ACPF [12][13][14], MRCI [15], CCSD(T) [16], Monte Carlo [17], Compton profiles [18] and Gaussian geminal theory [19]. The most accurate results were obtained with BS supplemented with an additional set of bond functions [20], obtaining better results when compared with larger BS without bond functions.
Because of the high accuracy intended and the feeble interaction involved, the question about the full elimination of basis set superposition error (BSSE) in the calculations of He 2 energy has been a matter of discussion [21], which still remains today. This problem comes from the difficulty of saturating the dispersion energy in calculations with conventional basis sets, and it is especially relevant in weakly interacting species. A good representation of the dispersion energy requires polarization functions with small exponents, and extrapolation methods from the raw energies without Counterpoise (CP) correction have been proposed to reduce the BSSE [22][23][24]. In particular, extrapolation to complete basis set (CBS) limit for the helium dimer was studied by A. Varandas [22,25], who carried out calculations with size-consistent methods such as Hartree-Fock (HF) and full configuration interaction (FCI) using Dunning BS.
As a consequence of this extraordinary effort, the accuracy achieved is certainly impressive and, according to the best estimations, the potential energy curve of He 2 has a minimum at a distance R e = 2.9676 Å, with an energy of −34.82 µE h with respect to the limit of separated atoms [20,26]. This minimum in the potential energy curve is remarkably shallow and has been found to admit only one vibrational state, the mean distance between nuclei in this state being ca. ten times the equilibrium distance [27].
In this work, we report a way of constructing new basis sets for molecular calculations which overcome the variational performance of the existing ones of equivalent composition. Furthermore, we apply the procedure to the development of basis sets, hereafter named sigma (σBS), for helium atoms and use them for the study of helium atom and dimer.
The article is organized as follows. In Section 2, the procedure for developing sigma BS is explained, and the contraction scheme and composition are described. Supplementary Materials are included, which contain the link for the sigma basis sets for He. In Section 3, a brief summary of computational procedures, methods, bases and programs is given. The results on the He atom and the He 2 dimer are reported in Section 4, in which the the precision of the total and dissociation energies as well as the equilibrium distances is discussed. Results on the CBS extrapolation of correlation energies are included in a subsection of Section 4. Finally, conclusions are drawn from these results in Section 5.

Criteria for Basis Set Optimization and Contraction Scheme: Size and Composition
The σBS consists of linear combinations (contractions) of radial primitive Gaussians aimed at providing highly accurate energy values at different computational levels. In the construction of σBS, we have exploited our previous experience in the development of exponential type BS [28][29][30][31]. Guided by this experience, we have decided to design σBS with the characteristic that, if a given primitive contains a spherical harmonic of quantum number l = L, all the primitives with the same exponent and l < L are also present in the set. For instance, if there is a d function with exponent α i , then s and p functions with this exponent will appear in the BS. This characteristic was thought to reduce the computation cost of integrals involving primitives, although most standard packages are usually not prepared to profit from it. Another feature of σBS is that all primitives in a given shell, i.e., with the same angular part, participate in all contractions of the same shell. The combination of both features makes it possible for the number of primitives in the contractions can be increased, and the quality of the BS functions is thus improved, without penalizing the computational cost. In summary, contractions in σBS are built from the same set of exponentials combined with different angular functions. Furthermore, whereas polarization functions of Dunning BS consist of single Gaussian functions (one primitive per function without contraction), in sigma basis sets they are true contractions, yielding significant improvements in the results on energy.
As a rule of thumb, the number of primitives included in each shell of polarization functions is equal to the number of contractions in the shell plus two. As mentioned before, the radial parts of the primitives used in the polarization functions are also present in the functions of the core shells. The choice of primitives for polarization functions, among those of core shells functions, is not obvious and must be accomplished by optimizing the exponents together for both types of functions. It is also noticeable that, albeit not specifically intended, the exponents of the primitives thus obtained almost follow an even tempered sequence, with slight variations and covering a wide range of values.
To simplify the notation, XZ will be used in the following as an abbreviation of cc-pVXZ, and aXZ of aug-cc-pcVXZ families. The equivalent Atomic Natural Orbitals (ANO) and σBS will be denoted as anoXZ, aanoXZ, σXZ and aσXZ, respectively. In each of the six families, we have considered basis sets ranging from X = 2 (DZ) up to 5 (5Z).
The optimization of helium σBS follows the general lines of Dunning's procedure which relies on the minimization of CISD energy in He atoms. Starting with a given set of exponents in the primitives, the contractions are constructed in a stepwise way. In the case of He, the (1s) contraction of the σDZ is determined by minimizing the HF energy for the ground state. Next, the CISD is used to add a new shell and one more contraction per shell, but keeping unchanged the contraction previously optimized. These two steps yield the σDZ basis set. This procedure is repeated, changing the primitive exponents until their optimum values are obtained.
To build the σTZ, we proceed as in the case of the σDZ. After optimizing the 2(s) + 1(p) contractions, a new shell and a new contraction per shell, 1(s) + 1(p) + 1(d), are added and optimized with CISD of the He atom. The scheme for all σBS follows the procedure described but repeating the steps of contraction/optimization as many times as required according to the BS level. Proceeding in this way, and taking into account that the number of primitives is increased according to the rule mentioned, σBS tend to saturate one-electron space per shell, yielding energies as close as possible to the best attainable values according to their size.
In general, these σBS give energy values for He atoms lower than those of Dunning BS and similar to those of ANO BS, as we will see the next section. Unlike in Dunning and ANO BS, in the case of the augmented σBS (aσBS), the CISD energy of the dimer at the best available equilibrium distance (BED) R e = 2.9676 Å [26] is considered for the optimization of σBS exponents.

Computational Details
As it is well known, basis sets augmented with polarization and diffuse functions can adequately reproduce weak dispersion interactions [9,25,32]. Bearing this in mind, to study He 2 interactions in a systematic way, we have carried out calculations using atom-centered basis sets. In particular, in this work we use correlation consistent basis sets developed by Dunning et al. [32][33][34][35] as a reference for testing the performance of σBS reported herein. Dunning BS are widely used, and they are especially well suited for our purposes because they incorporate polarization functions and, in the case of augmented versions, diffuse functions. In addition, it has been shown that correlation-consistent basis sets doubly augmented with diffuse functions can be used to nearly saturate the radial contribution to the dispersion energy in rare gas dimers [8,9,36]. Furthermore, these basis sets are grouped in families whose members are ranked in size (and quality), and this facilitates the extrapolation of results to the CBS limit.
For testing the accuracy of energy values attained with the σBS, we have also used the ANO BS, which have proved to be able to exhaust the capabilities of the underlying Gaussian expansion basis (to minimize the contraction error) and provide a highly accurate reference [37,38].
Electronic structure calculations of He and He 2 have been carried out at HF, CISD and FCI levels, using MOLPRO suite [39].

Sigma Basis Sets vs. Dunning and ANO Basis Sets
The composition of the basis sets for the He atom is detailed in Table 1, in which the numbers of exponentials, primitives and contractions are quoted. Notice that the composition is the same for the three families, the only difference being the number of primitives, smaller in Dunning BS and similar in ANO and σBS. On the other hand, the number of exponentials is smaller in σBS than in the other two. In Table 2, some properties computed at HF and FCI levels are reported. As can be appreciated, FCI energies computed with σBS are lower than those computed with their equivalent partners of the other two families. In the case of He 2 , energies have been calculated at the equilibrium distance optimized at each computational level, and no result is displayed in the case of HF calculations with ANO or σBS because no minimum is found in these cases. Dissociation energies of He 2 , D e have been computed as the difference between the energies of the separated atoms and that in the minimum of the curve, R e . In the case of Dunning BS, a shallow minimum is obtained at the HF level at a distance that increases with the BS size. As mentioned above, this minimum does not appear in HF calculations with the two other families, suggesting that the improvements in the core zone with respect to those of Dunning should prevent the presence of a minimum at the HF level.
The convergence of D e and R e towards the currently best available values (D e = 34.82 µE h and R e = 2.9676 Å) [26] has been analyzed in FCI calculations, and the results are depicted in Figures 1 and 2. In Figure 1, D e is plotted vs. the BS size for the three types of BS, including the augmented versions. It is observed that non-augmented BS have a slow convergence towards the exact result, and they are far away from it even at the 5Z level. Augmented ANO and σBS improve the results slightly, but the convergence is still far from the reference. On the other hand, Dunning augmented BS, although not optimized for CISD energy of He 2 at R e , yield astonishing precise depth values of the well.
In the case of equilibrium distance, displayed in Figure 2, the behavior is quite similar to that of D e , with Dunning augmented BS giving a very good agreement even with the smallest set (aDZ).

Multiple Augmented Basis Sets
Given the excellent performance of Dunning BS with regard to the dissociation energy of He 2 , and in an attempt to understand why this is not so in the case of σBS, we decided to explore the performance of multiple augmented basis sets in both families.
We developed double and triple augmented σBS but, unlike in Dunning BS, which were designed to improve the polarizability of the He atom, we followed the methodology based only on energy. Thus, we obtained double and triple augmented σBS (daσXZ and taσXZ) to compare with the corresponding Dunning BS (daXZ and taXZ, for short). These multiply augmented BS are described in Table 3. In analogy with Table 1, the number of exponentials, primitives and contracted functions are given, and the same comments on the composition, size and characteristics mentioned before also hold for the new multiple augmented σBS. Results obtained with multiple augmented BS are summarized in Table 4, whose structure is identical to that of Table 2. Regarding the atomic calculations at HF and FCI levels, it is observed again that energies computed with σBS are always better than those of the equivalent Dunning sets. In fact, taσXZ, albeit not directly intended, yields a good agreement with the HF limit (−2.861679995 E h ) [40,41] even for the smallest set of the series, with an error ca. 10 −6 E h in all cases. With respect to the dimer, Dunning BS (daXZ and taXZ) again give minima at the HF level, the well depth being greater than that obtained with the smaller XZ and aXZ, and the equilibrium distances range from 3.2 to 4.0 Å. Again, the σBS (daσXZ and taσXZ) give no minimum. In all cases, the FCI energies for the diatom tend to the exact value −5.80748357 E h , estimated from the best available values for the He energy (2.903724377 . . . E h ) [42] and for the well depth (34.82 µE h ) [26]. Notice that energy values computed with daσXZ y taσXZ are significantly lower than those obtained with the equivalent Dunning BS.
Regarding the dissociation energy, Figure 3 shows that in Dunning BS, the inclusion of new polarization functions in the sequence aXZ, daXZ, taXZ tends to worsen the results, whereas in σBS, a regular improvement in the value of D e is observed. In the case of equilibrium distance, R e , Dunning BS produce results closer to the exact than those of σBS, but the results attained with the latter also exhibit a regular approach to the right value, as shown in Figure 4.

Extrapolation to CBS
The performance of the basis sets in extrapolations to CBS has been analyzed in the case of correlation energy both for the atom and for the dimer at BED. Although there are many extrapolation schemes, we have taken one due to Helgaker [43]: The correlation energy of the atom has been calculated as the difference between the exact value [42] and the HF limit energy [40] and that of the dimer with the HF value at BED and adding the energy of the well (34.82 µE h ).
Correlation energy values thus computed plus the extrapolated CBS energies obtained for the Dunning and σBS are collected in Table 5 and depicted in Figures 5 and 6. The best available correlation energies appear at the top of Table 5, with values of 0.0420444 E h and 0.0841528 E h for the atom and the dimer, respectively. To facilitate comparisons, figures coincident with reference values are displayed in bold type.  The better performance of σBS over Dunning BS is evident. CBS extrapolation yields a gain of at least two figures in the former, while scarcely one figure is gained in the latter. The case of σaXZ is specially noticeable, as three figures are gained in the CBS extrapolation both in the atom and in the diatom. Furthermore, CBS extrapolation in Dunning BS lies below the exact value, whereas it converges to the right value in the case of σBS. This behavior is clearly visible in Figures 7 and 8

Conclusions
A new scheme for developing basis sets is reported and applied to helium atom and dimer. The σBS thus developed are configured by combining the same set of exponentials with different angular functions and saturating the corresponding one-electron spaces. The new family ranges from DZ to 5Z, with the same composition as Dunning BS, and also includes, simply, double and triple augmented versions. Energy values for helium atom and diatom computed with the σBS are lower than those obtained with their partners in Dunning BS, and similar to ANO BS, both at HF and FCI levels.
The analysis of the energy of He 2 reveals the presence of a minimum in the curve at the HF level in the case of Dunning BS that does not appear with σ or ANO BS. In the case of FCI, all calculations yield a well whose depth tends to the reference value as the BS quality improves. Augmented σ BS display a good convergence which, in the case of multiple augmented BS, is even more regular than in the corresponding Dunning BS.
CBS extrapolation of the correlation energy of He 2 at BED has also been examined, proving that the results with σBS are clearly superior to those attained with Dunning BS. Especially remarkable is the result with the aσXZ, which shows an agreement of five decimal figures with the best available result. Furthermore, it can be noticed that the CBS extrapolation with Dunning BS falls below the exact value, while the sigma converges to the right value.

Abbreviations
The following abbreviations are used in this manuscript: