1. Introduction
Electron correlation energy lies at the heart of quantum chemistry [
1,
2]. However, the computational cost of high-level
post-Hartree–Fock methods rises steeply with system size. There is therefore a pressing need for lower-scaling, cost-efficient alternatives that apply across broad classes of systems. In recent years, the information-theoretic approach (ITA) [
3,
4,
5,
6] has emerged as a promising framework for understanding and predicting the electron correlation energy from the perspective of information theory. By treating the electron density as a continuous probability distribution, ITA introduces a set of descriptors, such as Shannon entropy [
7] and Fisher information [
8], that encode global and local features of the electron density distribution. These quantities are inherently basis-agnostic and physically interpretable, providing a new lens through which quantum chemical problems can be approached.
Continuing our previous work, which employed simple, physics-inspired, density-based ITA quantities to evaluate response properties [
9,
10,
11,
12,
13] (such as molecular polarizability and NMR chemical shielding constant) and energetics of elongated hydrogen chains [
14], in this work, we aim to predict the
post-Hartree–Fock (see
Figure 1) electron correlation energies of various molecular clusters and linear or quasi-linear organic polymers with increasing cluster size and polymer length. The shared set of physically motivated ITA quantities includes Shannon entropy (
SS) [
7], Fisher information (
IF) [
8], Ghosh, Berkowitz, and Parr entropy (
SGBP) [
15], Onicescu information energy (
E2 and
E3) [
16], relative Rényi entropy (
and
) [
16], relative Shannon entropy (
IG) [
17] and relative Fisher information (
G1,
G2, and
G3) [
18]. The definitions of these 11 quantities can be found in
Section 4. The Shannon entropy characterizes the global delocalization of the electron density, reflecting how uniformly electrons are distributed throughout space. The Fisher information quantifies local inhomogeneity, serving as a measure of the sharpness or localization of density features such as bonding regions or lone pairs. The Kullback–Leibler divergence (relative entropy) measures the distinguishability between two densities, quantifying the difference in electronic structure between two systems or states. The systems studied here include (i) 24 octane isomers (see
Figure 2) [
11]; (ii) polymeric structures (see
Figure 3), namely polyyne, polyene, all-
trans-polymethineimine, and acene [
11]; (iii) molecular clusters (see
Figure 4), such as metallic Be
n and Mg
n [
19,
20], covalent S
n [
21,
22], hydrogen-bonded protonated water clusters H
+(H
2O)
n [
23], and dispersion-bound carbon dioxide (CO
2)
n [
24], and benzene clusters (C
6H
6)
n [
25]. We construct strong linear relationships between the low-cost Hartree–Fock [
26] ITA quantities and the electron correlation energies from
post-Hartree–Fock methods, such as MP2 or RI-MP2 [
27,
28,
29], CCSD [
30,
31], and CCSD(T) [
32]. Note that MP2 is used here mainly as a proof of concept, and that Hartree–Fock can simply be replaced with any approximate functional of density functional theory (DFT) [
33,
34].
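As a concrete illustration of the two simplest descriptors named above, the following minimal sketch evaluates the Shannon entropy and Fisher information numerically. It assumes the commonly used forms S_S = −∫ρ ln ρ dr and I_F = ∫|∇ρ|²/ρ dr, which may differ in normalization from the definitions in Section 4. The hydrogen 1s density, the radial grid, and all function names are hypothetical choices made here for illustration, not part of the workflow used in this work.

```python
import numpy as np

def radial_integral(f, r):
    """Integrate a spherically symmetric f(r) over all space:
    4*pi * int f(r) r^2 dr, using the trapezoid rule."""
    g = 4.0 * np.pi * r**2 * f
    return float(np.sum(0.5 * (g[1:] + g[:-1]) * np.diff(r)))

def shannon_entropy(rho, r):
    """S_S = -int rho ln(rho) d^3r for a spherical density (assumed form)."""
    return -radial_integral(rho * np.log(rho), r)

def fisher_information(rho, r):
    """I_F = int |grad rho|^2 / rho d^3r (assumed form).
    For a spherical density, |grad rho| = |d rho / dr|."""
    drho = np.gradient(rho, r)  # finite-difference radial derivative
    return radial_integral(drho**2 / rho, r)

# Hypothetical test density: hydrogen 1s, rho(r) = exp(-2r)/pi (atomic units),
# chosen only because both integrals are known in closed form.
r = np.linspace(1e-6, 30.0, 200001)
rho = np.exp(-2.0 * r) / np.pi

S_S = shannon_entropy(rho, r)
I_F = fisher_information(rho, r)
print(f"S_S = {S_S:.4f}  (closed form: 3 + ln(pi) = {3 + np.log(np.pi):.4f})")
print(f"I_F = {I_F:.4f}  (closed form: 4)")
```

For a molecular Hartree–Fock density the same integrals would be evaluated on a three-dimensional quadrature grid rather than a radial one; the closed-form atomic case is used here only so that the numbers can be checked by hand.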
By examining trends across increasing cluster size and polymer length, we assess the transferability, scalability, and physical insights provided by ITA features in capturing electron correlation. Our findings not only highlight the feasibility of ITA-driven correlation energy prediction but also reveal the descriptors that most strongly govern correlation effects in extended systems. These results suggest that ITA may serve as a promising direction for developing efficient, interpretable, and physically grounded models in quantum machine learning and electronic structure theory.
2. Results
To validate the accuracy of the linear-regression-based LR(ITA) method, we chose 24 octane isomers, as shown in
Figure 2. MP2, CCSD, and CCSD(T) are used to generate the electron correlation energies, and the ITA quantities are obtained at the Hartree–Fock level with the same 6-311++G(d,p) basis set. More details can be found in the
Supplementary Materials (Table S1).
Table 1 shows the linear relationships and RMSDs between the LR(ITA)-predicted and calculated electron correlation energies. There are no substantial differences among the R2 (and RMSD) values for MP2, CCSD, and CCSD(T). IF is slightly better than SGBP and substantially better than SS, which reflects the highly localized nature of the density in alkanes. For SS, IF, and SGBP, the RMSDs are <2.0 mH, indicating that LR(ITA) is accurate enough to predict the electron correlation energies. Because CCSD and CCSD(T) are too computationally intensive for the larger systems considered below, only MP2 is used hereafter as a proof of concept.
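The LR(ITA) procedure described above amounts to a one-variable least-squares fit followed by two summary statistics. A minimal sketch is given below, assuming that R2 is the coefficient of determination of the fit and that the RMSD is evaluated on the fitted points. The data are synthetic placeholders generated on the spot (they are not the octane values of Table S1), and the function names are hypothetical.

```python
import numpy as np

def fit_lr(q, e_corr):
    """Least-squares fit of e_corr ~ slope * q + intercept."""
    slope, intercept = np.polyfit(q, e_corr, 1)
    return float(slope), float(intercept)

def r_squared(e_ref, e_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    ss_res = np.sum((e_ref - e_pred) ** 2)
    ss_tot = np.sum((e_ref - np.mean(e_ref)) ** 2)
    return float(1.0 - ss_res / ss_tot)

def rmsd(e_ref, e_pred):
    """Root-mean-square deviation between reference and predicted values."""
    return float(np.sqrt(np.mean((e_ref - e_pred) ** 2)))

# Synthetic placeholder data for 24 "isomers": one ITA descriptor per molecule
# and a correlation energy (hartree) that is linear in it plus small noise.
rng = np.random.default_rng(0)
q = rng.uniform(95.0, 100.0, size=24)
e_corr = -0.012 * q - 0.05 + rng.normal(0.0, 1.0e-3, size=24)

slope, intercept = fit_lr(q, e_corr)
e_pred = slope * q + intercept
print(f"slope = {slope:.5f}, intercept = {intercept:.5f}")
print(f"R2 = {r_squared(e_corr, e_pred):.4f}, "
      f"RMSD = {1000.0 * rmsd(e_corr, e_pred):.2f} mH")
```

In practice `q` would hold one Hartree–Fock ITA quantity per molecule and `e_corr` the corresponding MP2, CCSD, or CCSD(T) correlation energies, with one such fit per descriptor.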
In
Table 2,
Table 3,
Table 4 and
Table 5, we have collected the linear correlation coefficients (
R2 = 1.000) and RMSDs (root mean squared deviations) between the calculated correlation energies at the MP2/6-311++G(d,p) level and those predicted based on the ITA quantities at the HF/6-311++G(d,p) level for polyyne, polyene, all-
trans-polymethineimine, and acene, respectively. More details can be found in
Tables S2–S5. Some ITA quantities are not tabulated in the text mainly because of inferior accuracy, for example,
G2 in
Table 2,
G2 and
IG in
Table 3, and
G1,
G2, and
IG in
Table 4. Clearly, R2 is close to 1 for most ITA quantities. More strikingly, based on the linear regression (LR) equations of the ITA quantities, the predicted electron correlation energies deviate from the calculated ones by only ~1.5 mH for polyyne, ~3.0 mH for polyene, and <4.0 mH for all-trans-polymethineimine. For acene, the RMSDs are larger, at ~10–11 mH, but still reasonably satisfactory. These results collectively reveal that ITA quantities are indeed good descriptors of electron correlation for linear or quasi-linear polymeric systems with delocalized electronic structures. For the more challenging acenes, a single ITA quantity fails to capture enough information about their more delocalized electronic structures.
Table 6,
Table 7 and
Table 8 list the linear correlation coefficients (
R2) and RMSDs between the calculated correlation energies at the MP2/6-311++G(d,p) level and those predicted from the ITA quantities at the HF/6-311++G(d,p) level for neutral metallic Be
n, Mg
n, and covalent S
n systems, respectively. More details can be found in
Tables S6–S11. One can see that strong correlations exist (
R2 > 0.990) between the ITA quantities and the MP2 correlation energies, indicating that these quantities are extensive in nature. However, the predicted electron correlation energies deviate considerably from the calculated ones, by ~28–37 mH for Be
n, ~17–33 mH for Mg
n, and ~26–42 mH for S
n, respectively. These results collectively show that for three-dimensional metallic clusters, Be
n and Mg
n, and covalent S
n, a single ITA quantity fails to capture enough information to predict quantitatively the electron correlation energies of such complex systems.
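That R2 > 0.990 can coexist with RMSDs of tens of mH follows from how the two statistics are related for an in-sample least-squares fit. The short derivation below assumes that R2 is the coefficient of determination and that the RMSD is evaluated on the same N fitted points. The symbols SS_res, SS_tot, and σ_E are introduced here for illustration only, and the numerical value of σ_E is hypothetical.

```latex
\begin{align}
R^2 &= 1 - \frac{SS_\mathrm{res}}{SS_\mathrm{tot}}, &
SS_\mathrm{res} &= \sum_{i=1}^{N} \bigl(E_i - \hat{E}_i\bigr)^2, &
SS_\mathrm{tot} &= \sum_{i=1}^{N} \bigl(E_i - \bar{E}\bigr)^2, \\
\mathrm{RMSD} &= \sqrt{\frac{SS_\mathrm{res}}{N}}, &
\sigma_E &= \sqrt{\frac{SS_\mathrm{tot}}{N}}, &
\mathrm{RMSD} &= \sigma_E \sqrt{1 - R^2}.
\end{align}
% Hypothetical numbers: if the reference correlation energies spread over
% \sigma_E = 1 hartree, an RMSD of 30 mH still corresponds to
% R^2 = 1 - (0.030)^2 = 0.9991.
```

Because the correlation energy is extensive, σ_E grows with the range of cluster sizes in the fit, so R2 alone says little about the absolute error; the RMSD is the more informative measure for these systems.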
Table 9 lists the linear correlation coefficients (
R2) and RMSDs between the calculated correlation energies at the MP2/6-311++G(d,p) level and those predicted from the ITA quantities at the HF/6-311++G(d,p) level for hydrogen-bonded protonated water clusters. The corresponding regression slopes and intercepts are provided in
Table S12. Note that the individual ITA values and MP2 correlation energies are not tabulated because the dataset contains 1480 structures. One can see that strong correlations exist (
R2 = 1.000) between 8 of the 11 ITA quantities and the MP2 correlation energies, indicating that these quantities are extensive in nature. The RMSDs range from 2.1 (
and
) to 9.3 (
) mH, indicating that ITA quantities are good descriptors of the
post-Hartree–Fock electron correlation energies of hydrogen-bonded systems.
Finally, we turn to two dispersion-bound clusters, (CO
2)
n and (C
6H
6)
n.
Table 10 gives the strong correlations (
R2 = 1.000) and RMSDs between the RI-MP2 correlation energies and the Hartree–Fock ITA quantities, both obtained with the 6-311++G(d,p) basis set, for (CO
2)
n (
n = 4−40). More details can be found in
Table S13. The RMSDs vary from 6.3 (
and
) to 10.8 (
) to 14.6 (
) mH. For (C
6H
6)
n (
n = 4−14) clusters, we have calculated the linear correlations (
R2 = 1.000) and RMSDs between the MP2/6-311++G(d,p) electron correlation energies and HF/6-311++G(d,p) ITA quantities, as collected in
Table 11. More details can be found in
Tables S14 and S15. The RMSDs range from 2.8 (
) to 6.9 (
) to 10.7 (
) mH. The RMSD results collectively suggest that 8 of the 11 ITA quantities are reasonably good descriptors of the
post-Hartree–Fock electron correlation energies of dispersion-bound clusters.
To further illustrate the extrapolative capability of the LR(ITA) method, we consider larger (C
6H
6)
n (
n = 15−30) clusters. In addition, because conventional MP2/6-311++G(d,p) calculations are too computationally intensive for these sizes, we employ the generalized energy-based fragmentation (GEBF) method [
35,
36,
37,
38] to obtain the MP2-level electron correlation energies as references. Finally, because the linear regression based on the ITA quantity
G3 has the smallest RMSD, we choose LR(
G3) to make predictions of electron correlation energies of benzene clusters. More details can be found in
Tables S15 and S16.
Figure 5 shows a comparison of the LR(G
3)-predicted and GEBF-calculated MP2 electron correlation energies for benzene clusters (C
6H
6)
n (
n = 15−30). The RMSD between the LR(G
3)-predicted and GEBF-calculated data is 8.6 mH, indicating that the LR(ITA) method performs comparably to the linear-scaling GEBF method. Note that the
R2 and RMSD values in
Figure 5 characterize the prediction quality for an extrapolated set, and thus differ from the regression statistics in the previous tables, which summarize fits within the training set. In addition, we have found that when subsystem wavefunctions (and thus subsystem electron densities and ITA quantities) are used to obtain the subsystem electron correlation energies, the final total electron correlation energies of GEBF-LR(G
3) deviate from GEBF by 40.0 mH in terms of RMSD, as shown in
Figure 5 and
Table S17. This indicates that combining the ITA quantities with a fragment-based method (GEBF in our case) is not a good choice for predicting the electron correlation energy. One possible reason is error accumulation, as opposed to the error cancellation on which the success of GEBF relies. To verify this point, we have plotted the deviations of LR(G3) and GEBF-LR(G3) from the GEBF reference values against the cluster size in Figure S1. The deviation of LR(G3) merely fluctuates to some degree, whereas that of GEBF-LR(G3) grows with the cluster size.
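The distinction drawn above between in-sample regression statistics and the quality of an extrapolated prediction can be sketched as follows. The model is fitted on small clusters only and then scored separately inside and outside the fitted size range. All numbers are synthetic placeholders (they are not the benzene-cluster data of Tables S15 and S16), and `fit_and_extrapolate` is a hypothetical helper named here for illustration.

```python
import numpy as np

def rmsd(ref, pred):
    """Root-mean-square deviation between two arrays."""
    ref, pred = np.asarray(ref), np.asarray(pred)
    return float(np.sqrt(np.mean((ref - pred) ** 2)))

def fit_and_extrapolate(n, q, e, n_train_max):
    """Fit e ~ a*q + b on clusters with n <= n_train_max, then return
    (RMSD on the fitted clusters, RMSD on the larger, unseen clusters)."""
    n, q, e = map(np.asarray, (n, q, e))
    train = n <= n_train_max
    a, b = np.polyfit(q[train], e[train], 1)
    pred = a * q + b
    return rmsd(e[train], pred[train]), rmsd(e[~train], pred[~train])

# Synthetic placeholder data: an extensive descriptor q that grows with the
# cluster size n, and a reference energy (hartree) linear in q plus noise.
rng = np.random.default_rng(1)
n = np.arange(4, 31)
q = 10.0 * n + rng.normal(0.0, 0.05, size=n.size)
e = -0.08 * q + 0.02 + rng.normal(0.0, 2.0e-3, size=n.size)

rmsd_fit, rmsd_extrap = fit_and_extrapolate(n, q, e, n_train_max=14)
print(f"in-sample RMSD (n = 4-14):      {1000.0 * rmsd_fit:.2f} mH")
print(f"extrapolated RMSD (n = 15-30):  {1000.0 * rmsd_extrap:.2f} mH")
```

Reporting the two numbers separately keeps the fit quality from being mistaken for predictive accuracy, since any small error in the fitted slope is amplified as the descriptor moves away from the training range.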
3. Discussion
Accurately and efficiently predicting the post-Hartree–Fock electron correlation energy at relatively low cost is an active research area in quantum chemistry. Starting from Hartree–Fock molecular orbitals, there are two typical approaches. One is to calculate the local electron correlation energy, whose early development is due to Pulay and Sæbø [
39,
40,
41]; the other is to predict the correlation energy with the aid of deep learning (DL) [
42,
43,
44,
45,
46,
47,
48,
49,
50,
51]. Our proposed LR(ITA) method can be viewed as a special, particularly simple flavor of the DL approach. An inherent drawback of local correlation methods is that they require orbital localization [
52,
53]. DL-driven methods encounter the same problem. Our LR(ITA) method, by contrast, requires only the molecular orbitals (and thus the electron density), without any further manipulation. Very recently, we demonstrated the good accuracy of LR(ITA) and its variant DL(ITA). With LR(ITA), one can even predict the FCI-level electron correlation energy, computed with the density matrix renormalization group (DMRG) [
54,
55] algorithm as a solver for the elongated hydrogen chain [
14], and the RMSD is only a few mH. Moreover, with DL(ITA), where a total of 11 ITA quantities are used as input [
13], we have predicted the DLPNO-MP2 (Domain-Based Local Pair Natural Orbital MP2) [
56] electron correlation energy for a database of >90,000 real organic molecules, with an RMSD of about 6.8 mH. In addition, LR(ITA) is not tied to any particular post-Hartree–Fock electronic structure method; MP2 is used here as a proof of concept. We have thus shown that LR(ITA), which is architecturally and conceptually simple, is numerically a good protocol for predicting the electron correlation energies of various systems. Note that the predictive power of LR(ITA) is best for chemically similar systems, whereas extrapolation across chemically distinct sets should be performed with caution. Moreover, while the LR(ITA) model generally maintains a strong linear correlation for geometries close to equilibrium, its predictive accuracy can decrease for significantly distorted geometries. This is because the ITA descriptors are computed from the Hartree–Fock electron density, which changes with geometry, whereas the linear regression coefficients are fitted to equilibrium structures.
So far, we have mainly focused on MP2. It would be valuable to carry out more extensive benchmarking against (i) CCSD(T) for larger or more complex systems and (ii) more challenging cases where both dynamic and static correlation effects may be significant, such as polyyne, polyene, and acene with large n.
Admittedly, the use of LR(ITA) to predict the electron correlation energy accurately and efficiently is still in its infancy. On the one hand, for three-dimensional systems, the RMSD values between the predicted and computed MP2 correlation energies are unacceptably large, even though there is still a strong linearity between the ITA quantities and the MP2 correlation energy. Could more sophisticated, higher-order ITA quantities capture additional electronic structure information, analogous to the “rungs” of Jacob’s ladder in DFT? If so, developing and testing a hierarchy of ITA quantities could improve the predictive power of LR(ITA) for complex three-dimensional systems.
On the other hand, we will implement a new concept, the “ITL-DL Loop”. The idea behind it is simple: low-tier (such as semiempirical PM7 [
57] or even promolecular [
58,
59]) electron densities are used as input for ITA quantities, and DL is introduced to obtain high-tier (such as DFT) electron densities. From the newly generated electron densities, ITA quantities are obtained and used as input for another DL model, either classical or quantum, to predict the electron correlation energies or physicochemical properties of molecules. Moreover, extending the ITA-based method to quantities that reflect the response of the electronic energy to nuclear displacement is another potential direction. Work along these lines is in progress, and the results will be presented elsewhere.