Predicting the Post-Hartree-Fock Electron Correlation Energy of Complex Systems with the Information-Theoretic Approach

Wang, Ping; Hu, Dongxiong; Lu, Linling; Zhao, Yilin; Chen, Jingbo; Ayers, Paul W.; Liu, Shubin; Zhao, Dongbo

doi:10.3390/molecules30173500

by Ping Wang ¹,†, Dongxiong Hu ²,†, Linling Lu ³,⁴,†, Yilin Zhao ⁵, Jingbo Chen ³,⁴, Paul W. Ayers ⁵,*, Shubin Liu ⁶,⁷,* and Dongbo Zhao ³,⁴,*

¹ Key Laboratory of High Performance Scientific Computation, School of Science, Xihua University, Chengdu 610039, China

² School of Basic Medical Sciences, Yunnan University of Chinese Medicine, Kunming 650500, China

³ Key Laboratory of Medicinal Chemistry for Natural Resource, Ministry of Education, Institute of Biomedical Research, School of Chemical Science and Technology, Yunnan University, Kunming 650500, China

⁴ Yunnan Key Laboratory of Research Development for Natural Products, School of Pharmacy, Yunnan University, Kunming 650500, China

⁵ Department of Chemistry and Chemical Biology, McMaster University, Hamilton, ON L8S 4M1, Canada

⁶ Research Computing Center, University of North Carolina, Chapel Hill, NC 27599, USA

⁷ Department of Chemistry, University of North Carolina, Chapel Hill, NC 27599, USA

* Authors to whom correspondence should be addressed.

† These authors contributed equally to this work.

Molecules 2025, 30(17), 3500; https://doi.org/10.3390/molecules30173500

Submission received: 27 July 2025 / Revised: 19 August 2025 / Accepted: 25 August 2025 / Published: 26 August 2025

Abstract

Employing some simple physics-inspired density-based information-theoretic approach (ITA) quantities to predict the electron correlation energies remains an open challenge. In this work, we expand the scope of the LR(ITA) (LR means linear regression) protocol to more complex systems, including (i) 24 octane isomers; (ii) polymeric structures, polyyne, polyene, all-trans-polymethineimine, and acene; (iii) molecular clusters, such as metallic Be_n and Mg_n, covalent S_n, hydrogen-bonded protonated water clusters H⁺(H₂O)_n, and dispersion-bound carbon dioxide (CO₂)_n, and benzene (C₆H₆)_n clusters. With LR(ITA), one can simply predict the post-Hartree-Fock (such as MP2 and coupled cluster) electron correlation energies at the cost of Hartree-Fock calculations, even with chemical accuracy. For large molecular clusters, we employ the linear-scaling generalized energy-based fragmentation (GEBF) method to gauge the accuracy of LR(ITA). Employing benzene clusters as an illustration, the LR(ITA) method shows similar accuracy to that of GEBF. Overall, we have verified that ITA quantities can be used to predict the post-Hartree-Fock electron correlation energies of various complex systems.

Keywords: electron correlation energy; information-theoretic approach; generalized energy-based fragmentation (GEBF); molecular clusters; polymers

1. Introduction

Electron correlation energy lies at the heart of quantum chemistry [1,2]. However, the computational cost of high-level post-Hartree–Fock methods skyrockets with system size. In this context, there is a pressing need for alternative lower-scaling cost-efficient methods across broad classes of systems. In recent years, the information-theoretic approach (ITA) [3,4,5,6] has emerged as a promising framework for understanding and predicting the electron correlation energy from the perspective of information theory. By treating the electron density as a continuous probability distribution, ITA introduces a set of descriptors, such as Shannon entropy [7] and Fisher information [8], that encode global and local features of the electron density distribution. These quantities are inherently basis-agnostic and physically interpretable, providing a new lens through which quantum chemical problems can be approached.

Continuing our previous work, in which simple physics-inspired density-based ITA quantities were employed to describe response properties [9,10,11,12,13] (such as molecular polarizability and NMR chemical shielding constants) and the energetics of elongated hydrogen chains [14], we aim in this work to predict the post-Hartree–Fock (see Figure 1) electron correlation energies of various molecular clusters and linear or quasi-linear organic polymers with increasing cluster size and polymer length. The shared set of physically motivated ITA quantities includes Shannon entropy (S_S) [7], Fisher information (I_F) [8], Ghosh, Berkowitz, and Parr entropy (S_GBP) [15], Onicescu information energy (E₂ and E₃) [16], relative Rényi entropy (R₂^r and R₃^r) [16], relative Shannon entropy (I_G) [17], and relative Fisher information (G₁, G₂, and G₃) [18]. The definitions of these 11 quantities can be found in Section 4. The Shannon entropy characterizes the global delocalization of the electron density, reflecting how uniformly electrons are distributed throughout space. The Fisher information quantifies local inhomogeneity, serving as a measure of the sharpness or localization of density features such as bonding regions or lone pairs. The Kullback–Leibler divergence (relative entropy) measures the distinguishability between two densities, providing a quantification of the difference in electronic structure between two systems/states. The systems studied include (i) 24 octane isomers (see Figure 2) [11]; (ii) polymeric structures (see Figure 3), namely polyyne, polyene, all-trans-polymethineimine, and acene [11]; and (iii) molecular clusters (see Figure 4), such as metallic Be_n and Mg_n [19,20], covalent S_n [21,22], hydrogen-bonded protonated water clusters H⁺(H₂O)_n [23], and dispersion-bound carbon dioxide (CO₂)_n [24] and benzene (C₆H₆)_n [25] clusters. We construct strong linear relationships between the low-cost Hartree–Fock [26] ITA quantities and the electron correlation energies from post-Hartree–Fock methods, such as MP2 or RI-MP2 [27,28,29], CCSD [30,31], and CCSD(T) [32]. It is worth noting that MP2 is used here mainly as a proof-of-concept, and that Hartree–Fock can simply be replaced with any approximate functional of density functional theory (DFT) [33,34].

By examining trends across increasing cluster size and polymer length, we assess the transferability, scalability, and physical insights provided by ITA features in capturing electron correlation. Our findings highlight not only the feasibility of ITA-driven correlation energy prediction but also reveal key descriptors that most strongly govern correlation effects in extended systems. These results suggest that ITA may serve as a promising direction for developing efficient, interpretable, and physically grounded models in quantum machine learning and electronic structure theory.

2. Results

To validate the accuracy of the LR(ITA) method, we chose a total of 24 octane isomers, as shown in Figure 2. MP2, CCSD, and CCSD(T) are used to generate the electron correlation energies, and the ITA quantities are obtained at the Hartree-Fock level with the same 6-311++G(d,p) basis set. More details can be found in the Supplementary Materials (Table S1). Table 1 shows the linear relationships and RMSDs between the LR(ITA)-predicted and calculated electron correlation energies. There are no substantial differences among the R² (and RMSD) values for MP2, CCSD, and CCSD(T). I_F is slightly better than S_GBP and substantially better than S_S, which reflects the highly localized nature of the density in alkanes. For S_S, I_F, and S_GBP, the RMSDs are <2.0 mH, indicating that LR(ITA) should be accurate enough to predict the electron correlation energies. Because CCSD and CCSD(T) are too computationally intensive to be tractable for the larger systems, only MP2 is used hereafter as a proof-of-concept.
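The LR(ITA) fit described above amounts to an ordinary least-squares line between one ITA quantity and the reference correlation energy, followed by R² and RMSD statistics. The following is a minimal sketch of that procedure, not the authors' actual script; the function name `fit_lr_ita` and the synthetic numbers are assumptions made purely for illustration and are not data from this work.

```python
import numpy as np

def fit_lr_ita(ita, e_corr):
    """Fit e_corr ~ slope * ita + intercept and return (slope, intercept, R^2, RMSD)."""
    ita = np.asarray(ita, dtype=float)
    e_corr = np.asarray(e_corr, dtype=float)
    slope, intercept = np.polyfit(ita, e_corr, 1)      # degree-1 least squares
    residual = e_corr - (slope * ita + intercept)
    rmsd = float(np.sqrt(np.mean(residual**2)))
    ss_res = float(np.sum(residual**2))
    ss_tot = float(np.sum((e_corr - e_corr.mean())**2))
    return float(slope), float(intercept), 1.0 - ss_res / ss_tot, rmsd

# Synthetic placeholder data (hypothetical, in Hartree): a linear trend plus small noise.
rng = np.random.default_rng(0)
ita_values = np.linspace(100.0, 400.0, 12)
e_corr_ref = -0.004 * ita_values - 0.05 + rng.normal(0.0, 1.0e-3, ita_values.size)

slope, intercept, r2, rmsd = fit_lr_ita(ita_values, e_corr_ref)
print(f"slope = {slope:.6f}, intercept = {intercept:.6f}")
print(f"R^2 = {r2:.6f}, RMSD = {1000.0 * rmsd:.3f} mH")
```

In practice, `ita_values` would hold one Hartree-Fock ITA quantity per structure and `e_corr_ref` the corresponding post-Hartree-Fock correlation energies.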

In Tables 2–5, we have collected the linear correlation coefficients (R² = 1.000) and RMSDs (root mean squared deviations) between the calculated correlation energies at the MP2/6-311++G(d,p) level and those predicted based on the ITA quantities at the HF/6-311++G(d,p) level for polyyne, polyene, all-trans-polymethineimine, and acene, respectively. More details can be found in Tables S2–S5. Some ITA quantities are not tabulated in the text, mainly because of their inferior accuracy: G₂ in Table 2; G₂ and I_G in Table 3; and G₁, G₂, and I_G in Table 4. R² is close to 1 for most ITA quantities. More strikingly, based on the linear regression (LR) equations of the ITA quantities, the predicted electron correlation energies deviate from the calculated ones by only ~1.5 mH for polyyne, ~3.0 mH for polyene, and <4.0 mH for all-trans-polymethineimine. For acene, the RMSDs are larger, ~10–11 mH, but still reasonably satisfactory. These results collectively reveal that ITA quantities are indeed good descriptors of electron correlation energies for linear or quasi-linear polymeric systems with delocalized electronic structures. For the more challenging acenes, a single ITA quantity fails to capture a sufficient amount of information about the more delocalized electronic structures.

Shown in Tables 6–8 are the linear correlation coefficients (R²) and RMSDs (root mean squared deviations) between the calculated correlation energies at the MP2/6-311++G(d,p) level and those predicted based on the ITA quantities at the HF/6-311++G(d,p) level for neutral metallic Be_n, Mg_n, and covalent S_n systems, respectively. More details can be found in Tables S6–S11. One can see that strong correlations exist (R² > 0.990) between the ITA quantities and the MP2 correlation energies, indicating that they are extensive in nature. However, the predicted electron correlation energies deviate substantially from the calculated ones, by ~28–37 mH for Be_n, ~17–33 mH for Mg_n, and ~26–42 mH for S_n. These results collectively show that, for three-dimensional metallic clusters Be_n and Mg_n and for covalent S_n, a single ITA quantity fails to quantitatively capture enough information about the electron correlation energies of complex systems.

Shown in Table 9 are the linear correlation coefficients (R²) and RMSDs (root mean squared deviations) between the calculated correlation energies at the MP2/6-311++G(d,p) level and those predicted based on the ITA quantities at the HF/6-311++G(d,p) level for hydrogen-bonded protonated water clusters. The corresponding regression slopes and intercepts are provided in Table S12. Of note, the ITA quantities and the MP2 correlation energies are not shown, mainly because the dataset has a total of 1480 structures. One can see that strong correlations exist (R² = 1.000) between 8 of the 11 ITA quantities and the MP2 correlation energies, indicating that they are extensive in nature. The RMSDs range from 2.1 (E₂ and E₃) to 9.3 (G₃) mH, indicating that ITA quantities are good descriptors of the post-Hartree-Fock electron correlation energies of hydrogen-bonded systems.

Finally, we turn to two dispersion-bound clusters, (CO₂)_n and (C₆H₆)_n. Table 10 gives the strong correlations (R² = 1.000) and RMSDs between the RI-MP2 correlation energies and the Hartree–Fock ITA quantities with the same 6-311++G(d,p) basis set for (CO₂)_n (n = 4−40). More details can be found in Table S13. The RMSDs vary from 6.3 (E₂ and E₃) to 10.8 (G₃) to 14.6 (S_S) mH. For (C₆H₆)_n (n = 4−14) clusters, we have calculated the linear correlations (R² = 1.000) and RMSDs between the MP2/6-311++G(d,p) electron correlation energies and the HF/6-311++G(d,p) ITA quantities, as collected in Table 11. More details can be found in Tables S14 and S15. The RMSDs range from 2.8 (G₃) to 6.9 (E₃) to 10.7 (S_S) mH. The RMSD results collectively suggest that 8 of the 11 ITA quantities are reasonably good descriptors of the post-Hartree-Fock electron correlation energies of dispersion-bound clusters.

To further illustrate the extrapolative capability of the LR(ITA) method, we consider larger (C₆H₆)_n (n = 15−30) clusters. As conventional MP2/6-311++G(d,p) calculations are too computationally intensive for these sizes, we employ GEBF [35,36,37,38] to obtain the MP2-level electron correlation energies as a reference. Because the linear regression based on the ITA quantity G₃ has the smallest RMSD, we choose LR(G₃) to predict the electron correlation energies of the benzene clusters. More details can be found in Tables S15 and S16. Figure 5 shows a comparison of the LR(G₃)-predicted and GEBF-calculated MP2 electron correlation energies for benzene clusters (C₆H₆)_n (n = 15−30). The RMSD between the LR(G₃)-predicted and GEBF-calculated data is 8.6 mH, indicating that the LR(ITA) method performs comparably to the linear-scaling GEBF method. Of note, the R² and RMSD values in Figure 5 characterize the prediction quality for an extrapolated set, which differs from the regression statistics in the previous tables that summarize fits within the training set. In addition, we have found that when subsystem wavefunctions (and thus subsystem electron densities and ITA quantities) are used to obtain the subsystem electron correlation energies, the final total electron correlation energies of GEBF-LR(G₃) deviate from those of GEBF by 40.0 mH in terms of RMSD, as shown in Figure 5 and Table S17. This indicates that it is not a good choice to combine the ITA quantities with a fragment-based method (GEBF in our case) for predicting the electron correlation energy. One possible reason is that errors accumulate rather than cancel, whereas the great success of GEBF relies on error cancellation. To further verify this point, we have plotted the deviations of LR(G₃) and GEBF-LR(G₃) from GEBF against the cluster size in Figure S1. The deviation of LR(G₃) merely fluctuates to some degree, whereas that of GEBF-LR(G₃) grows with the cluster size.

3. Discussion

To accurately and efficiently predict the post-Hartree-Fock electron correlation energy at a relatively low cost is a hot area in the community of quantum chemistry. Starting from Hartree-Fock molecular orbitals, there exist two typical methods. One is to calculate the local electron correlation energy, whose early development is due to Pulay and Sæbø [39,40,41]; the other is to predict the correlation energy with the aid of deep learning (DL) [42,43,44,45,46,47,48,49,50,51]. Our proposed LR(ITA) method is a special flavor of DL. Suffice it to note that an inherent drawback of local correlation methods is that they perform orbital localization [52,53]. This problem is also encountered by the DL-driven method. For our LR(ITA) method, only the molecular orbitals (thus, the electron density) are required without any manipulation. Very recently, we have showcased the good accuracy of LR(ITA) and its variant DL(ITA). With LR(ITA), one can even predict the FCI-level electron correlation with the DMRG (density matrix renormalization group) [54,55] algorithm as a solver for the elongated hydrogen chain [14], and the RMSD is only a few mH. Moreover, with DL(ITA), where a total of 11 ITA quantities are used as input [13], we have predicted the DLPNO-MP2 (Domain-Based Local Pair Natural Orbital MP2) [56] electron correlation energy for a database of >90 K real organic molecules, and the RMSD is about 6.8 mH. In addition, LR(ITA) is not limited to any post-Hartree-Fock electronic structure methods; MP2 is used here as a proof-of-concept. Thus, we have showcased that LR(ITA) is designed with architectural and conceptual simplicity and is numerically shown to be a good protocol to predict the electron correlation energies of various systems. Of note, the predictive power of LR(ITA) is best for chemically similar systems, whereas extrapolation across chemically distinct sets should be performed with caution. 
In addition, while the LR(ITA) model generally maintains a strong linear correlation for geometries close to equilibrium, its predictive accuracy can decrease for significantly distorted geometries. This is because the ITA descriptors are computed from the Hartree-Fock electron density, which changes with geometry, whereas the linear regression coefficients are fitted to equilibrium structures.

Up to now, we have mainly focused on MP2. It would be valuable to carry out more extensive benchmarking against (i) CCSD(T) for larger or more complex systems and (ii) more challenging cases where both dynamic and static correlation effects may be significant, such as polyyne, polyene, and acene with large n.

Admittedly, using LR(ITA) to accurately and efficiently predict the electron correlation energy is still in its infancy. On the one hand, for three-dimensional systems, the RMSD values between the predicted and computed MP2 correlation energies are unacceptably large, even though there is still a strong linearity between the ITA quantities and the MP2 correlation energy. Would it be possible that more sophisticated, higher-order ITA quantities could capture additional electronic structure information, analogous to the “rungs” of Jacob’s ladder in DFT? If so, developing and testing a hierarchy of ITA quantities could potentially improve the predictive power of LR(ITA) for complex three-dimensional systems.

On the other hand, we will implement a new concept, the “ITL-DL Loop”. The idea behind it is simple: low-tier (such as semiempirical PM7 [57] or even promolecular [58,59]) electron densities are used as input for ITA quantities, and DL is introduced to obtain high-tier (such as DFT) electron densities. Based on the newly generated electron densities, ITA quantities are obtained and used as input for another DL model, either classical or quantum, to predict the electron correlation energies or physicochemical properties of molecules. Moreover, extending the ITA-based method to quantities reflecting the response of the electronic energy to nuclear displacements is another potential direction. Work along these lines is in progress, and the results will be presented elsewhere.

4. Materials and Methods

4.1. Information-Theoretic Approach Quantities

Although density functional theory (DFT) [33,34] and information theory (IT) [3,4] are two totally different areas, they have been combined, with the electron density distribution as a seamless link, and this combination has seen many successes over more than 40 years [15,60,61,62,63,64,65,66,67,68,69,70,71]. In this work, we outline some well-established ITA quantities. First and foremost, Shannon entropy S_S [7] and Fisher information I_F [8] are two foundational quantities in information theory. They are defined in Equations (1) and (2), respectively.

S_{S} = -\int \rho(\mathbf{r}) \ln \rho(\mathbf{r}) \, d\mathbf{r} \qquad (1)

I_{F} = \int \frac{|\nabla \rho(\mathbf{r})|^{2}}{\rho(\mathbf{r})} \, d\mathbf{r} \qquad (2)

where ρ(r) is the electron density and ∇ρ(r) is the density gradient. Physically, S_S characterizes the spatial delocalization of the electron density, while I_F reflects its sharpness or localization. Of note, S_S and I_F are not mutually exclusive but are always intercorrelated [72,73].
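As a sanity check on Equations (1) and (2), both integrals can be evaluated by simple radial quadrature for a hydrogen-like 1s density, ρ(r) = e^(−2r)/π in atomic units, for which the closed forms are S_S = 3 + ln π and I_F = 4. The sketch below is illustrative only: the model density, the grid, and the function names are our assumptions, and production values in this work come from Multiwfn rather than from code like this.

```python
import numpy as np

def sphere_integral(f, r):
    """Trapezoidal estimate of the all-space integral of a spherically symmetric f(r)."""
    g = 4.0 * np.pi * r**2 * f
    return float(np.sum(0.5 * (g[1:] + g[:-1]) * np.diff(r)))

def shannon_entropy(rho, r):
    """Equation (1): S_S = -integral of rho * ln(rho)."""
    return -sphere_integral(rho * np.log(rho), r)

def fisher_information(rho, r):
    """Equation (2): I_F = integral of |grad rho|^2 / rho (radial gradient only)."""
    drho = np.gradient(rho, r)          # finite-difference d(rho)/dr
    return sphere_integral(drho**2 / rho, r)

# Hydrogen-like 1s model density in atomic units (an illustrative assumption).
r = np.linspace(0.0, 40.0, 40001)
rho = np.exp(-2.0 * r) / np.pi

n_elec = sphere_integral(rho, r)
s_s = shannon_entropy(rho, r)
i_f = fisher_information(rho, r)
print(f"N = {n_elec:.6f}")
print(f"S_S = {s_s:.6f} (analytic {3.0 + np.log(np.pi):.6f})")
print(f"I_F = {i_f:.6f} (analytic 4.000000)")
```

The diffuse tail dominates S_S through the ln ρ factor, whereas I_F weights regions where the density changes rapidly, which is the delocalization versus localization contrast described above.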

Beyond the total electron density, additional quantities such as the kinetic-energy density can be incorporated into the formulation of the information-theoretic approach (ITA). Utilizing both the electron density and the kinetic-energy density, Ghosh, Berkowitz, and Parr introduced an entropy functional known as S_GBP [15],

S_{GBP} = -\int \frac{3}{2} k \rho(\mathbf{r}) \left[ c + \ln \frac{t(\mathbf{r}; \rho)}{t_{TF}(\mathbf{r}; \rho)} \right] d\mathbf{r} \qquad (3)

where t(r; ρ) and t_TF(r; ρ) represent the non-interacting and Thomas–Fermi (TF) kinetic energy densities, respectively. The constants are defined as follows: k is the Boltzmann constant, c = (5/3) + ln(4πc_K/3), and c_K = (3/10)(3π²)^(2/3). The non-interacting kinetic energy density t(r; ρ) integrates to give the total kinetic energy T_S,

\int t(\mathbf{r}; \rho) \, d\mathbf{r} = T_{S} \qquad (4)

It can be computed from the canonical orbital densities as

t(\mathbf{r}; \rho) = \sum_{i} \frac{1}{8} \frac{\nabla \rho_{i} \cdot \nabla \rho_{i}}{\rho_{i}} - \frac{1}{8} \nabla^{2} \rho \qquad (5)

while the Thomas–Fermi expression is given by

t_{TF}(\mathbf{r}; \rho) = c_{K} \rho^{5/3}(\mathbf{r}) \qquad (6)

It is important to note that the kinetic-energy density may take different forms depending on context [74,75,76,77,78,79,80,81]. Nonetheless, S_GBP satisfies the maximum-entropy principle from a rigorous mathematical viewpoint [15].
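As a quick consistency check of Equations (4)–(6), consider a hydrogen-like 1s density in atomic units. This one-electron model is our own illustration, not one of the systems studied in this work; with a single occupied orbital, ρ_i = ρ in Equation (5), and the expected kinetic energy is T_S = 1/2 Hartree.

```latex
\rho(\mathbf{r}) = \frac{e^{-2r}}{\pi}, \qquad
\frac{|\nabla\rho|^{2}}{\rho} = 4\rho, \qquad
\nabla^{2}\rho = \rho'' + \frac{2}{r}\rho' = 4\rho - \frac{4\rho}{r}

t(\mathbf{r};\rho) = \frac{1}{8}\,(4\rho) - \frac{1}{8}\left(4\rho - \frac{4\rho}{r}\right) = \frac{\rho}{2r}

\int t(\mathbf{r};\rho)\,d\mathbf{r}
  = \frac{1}{2}\int_{0}^{\infty}\frac{e^{-2r}}{\pi r}\,4\pi r^{2}\,dr
  = 2\int_{0}^{\infty} r\,e^{-2r}\,dr = \frac{1}{2} = T_{S}

c_{K} = \frac{3}{10}\,(3\pi^{2})^{2/3} \approx 2.871
```

The Laplacian term in Equation (5) integrates to zero over all space, so it changes the local form of t(r; ρ) without changing T_S.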

Expanding further, several ITA descriptors have been proposed to characterize chemical reactivity. Within the framework of conceptual density functional theory (CDFT) [82,83,84,85], other well-established ITA quantities have been proposed, including the Onicescu information energy of order n [16],

E_{n} = \frac{1}{n - 1} \int \rho^{n}(\mathbf{r}) \, d\mathbf{r} \qquad (7)

the relative Rényi entropy of order n [16],

R_{n}^{r} = \frac{1}{1 - n} \log_{10} \left[ \int \frac{\rho^{n}(\mathbf{r})}{\rho_{0}^{n - 1}(\mathbf{r})} \, d\mathbf{r} \right] \qquad (8)

and the relative Shannon entropy, or information gain (I_G) [17], also called the Kullback−Leibler divergence,

I_{G} = \int \rho(\mathbf{r}) \ln \frac{\rho(\mathbf{r})}{\rho_{0}(\mathbf{r})} \, d\mathbf{r} \qquad (9)

E₂ and E₃ (Equation (7)) were introduced to define a finer measure of dispersion distribution than S_S. In Equations (8) and (9), ρ₀(r) is a reference-state density, and both ρ₀(r) and ρ(r) are normalized to the total number of electrons of a molecule.
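Equations (7)–(9) can be illustrated numerically with two one-electron Slater-type model densities, ρ with exponent ζ = 1.0 and a reference ρ₀ with ζ₀ = 0.8. Both densities, the exponents, and all function names below are hypothetical choices made for this sketch; in this work the reference state is a neutral-atom ROHF density and the quantities are computed with Multiwfn.

```python
import numpy as np

def sphere_integral(f, r):
    """Trapezoidal estimate of the all-space integral of a spherically symmetric f(r)."""
    g = 4.0 * np.pi * r**2 * f
    return float(np.sum(0.5 * (g[1:] + g[:-1]) * np.diff(r)))

def slater_density(r, zeta):
    """One-electron model density (zeta^3 / pi) * exp(-2 * zeta * r)."""
    return zeta**3 / np.pi * np.exp(-2.0 * zeta * r)

def onicescu_energy(rho, r, n):
    """Equation (7): E_n = 1/(n-1) * integral of rho^n."""
    return sphere_integral(rho**n, r) / (n - 1)

def relative_renyi(rho, rho0, r, n):
    """Equation (8): R_n^r = 1/(1-n) * log10( integral of rho^n / rho0^(n-1) )."""
    return float(np.log10(sphere_integral(rho**n / rho0**(n - 1), r))) / (1 - n)

def information_gain(rho, rho0, r):
    """Equation (9): I_G = integral of rho * ln(rho / rho0)."""
    return sphere_integral(rho * np.log(rho / rho0), r)

r = np.linspace(0.0, 40.0, 40001)
rho = slater_density(r, 1.0)     # "molecular" density stand-in
rho0 = slater_density(r, 0.8)    # reference density stand-in

e2 = onicescu_energy(rho, r, 2)
e3 = onicescu_energy(rho, r, 3)
r2_rel = relative_renyi(rho, rho0, r, 2)
r3_rel = relative_renyi(rho, rho0, r, 3)
i_g = information_gain(rho, rho0, r)
print(f"E2 = {e2:.6f}, E3 = {e3:.6f}")
print(f"R2^r = {r2_rel:.6f}, R3^r = {r3_rel:.6f}, I_G = {i_g:.6f}")
```

For these model densities the integrals have closed forms, for example E₂ = 1/(8π) and I_G = 3 ln(ζ/ζ₀) − 3(1 − ζ₀/ζ), which makes the sketch easy to verify. Because both densities carry the same number of electrons, I_G is non-negative and vanishes only when ρ = ρ₀.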

More recently [18], one of the present authors introduced three ITA descriptors, G₁, G₂, and G₃, applicable at both atomic and molecular levels, as follows:

G_{1} = \sum_{A} \int \nabla^{2} \rho_{A}(\mathbf{r}) \frac{\rho_{A}(\mathbf{r})}{\rho_{A}^{0}(\mathbf{r})} \, d\mathbf{r} \qquad (10)

G_{2} = \sum_{A} \int \rho_{A}(\mathbf{r}) \left[ \frac{\nabla^{2} \rho_{A}(\mathbf{r})}{\rho_{A}(\mathbf{r})} - \frac{\nabla^{2} \rho_{A}^{0}(\mathbf{r})}{\rho_{A}^{0}(\mathbf{r})} \right] d\mathbf{r} \qquad (11)

G_{3} = \sum_{A} \int \rho_{A}(\mathbf{r}) \left[ \nabla \ln \frac{\rho_{A}(\mathbf{r})}{\rho_{A}^{0}(\mathbf{r})} \right]^{2} d\mathbf{r} \qquad (12)

Finally, to partition the electron density into atomic contributions within a molecule, the Hirshfeld stockholder approach [86,87] is frequently adopted. It is defined as follows:

\rho_{A}(\mathbf{r}) = \omega_{A}(\mathbf{r}) \rho(\mathbf{r}) = \frac{\rho_{A}^{0}(\mathbf{r} - \mathbf{R}_{A})}{\sum_{B} \rho_{B}^{0}(\mathbf{r} - \mathbf{R}_{B})} \rho(\mathbf{r}) \qquad (13)

Here, ρ_A(r) is the atomic (Hirshfeld) density, ω_A(r) is the weight or “sharing” function, and ρ_B⁰(r − R_B) represents the reference (typically spherically averaged) atomic density centered at R_B. The denominator is known as the promolecular density. The stockholder method naturally aligns with ITA due to its information-theoretic foundation. Alternative partitioning schemes include Becke’s fuzzy atom method [88] and Bader’s atoms-in-molecules (AIM) approach based on zero-flux surfaces [58]. A summary of our recent work in this direction is available in Ref. [89].
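The stockholder partition of Equation (13) can be sketched on a set of sample points for a toy diatomic. Everything below is an illustrative assumption: the Slater-type pro-atoms, the geometry, and the stand-in "molecular" density are hypothetical, whereas real calculations use spherically averaged ab initio atomic densities.

```python
import numpy as np

def slater_proatom(points, center, zeta, n_elec):
    """Spherical model pro-atom density n_elec * (zeta^3/pi) * exp(-2*zeta*|r - R|)."""
    dist = np.linalg.norm(points - center, axis=1)
    return n_elec * zeta**3 / np.pi * np.exp(-2.0 * zeta * dist)

def hirshfeld_partition(rho, proatoms):
    """Equation (13): weights w_A = rho_A^0 / sum_B rho_B^0 and rho_A = w_A * rho."""
    proatoms = np.asarray(proatoms, dtype=float)   # shape (n_atoms, n_points)
    promolecule = proatoms.sum(axis=0)             # denominator of Equation (13)
    weights = proatoms / promolecule
    return weights, weights * rho

# Toy diatomic: two one-electron pro-atoms 1.4 bohr apart (hypothetical geometry).
rng = np.random.default_rng(1)
points = rng.uniform(-3.0, 3.0, size=(500, 3))
centers = np.array([[0.0, 0.0, -0.7], [0.0, 0.0, 0.7]])
proatoms = [slater_proatom(points, c, 1.0, 1.0) for c in centers]

# Stand-in molecular density: promolecule with a mild bond-region enhancement.
rho_mol = np.sum(proatoms, axis=0) * (1.0 + 0.1 * np.exp(-np.sum(points**2, axis=1)))

weights, rho_atoms = hirshfeld_partition(rho_mol, proatoms)
print("max |sum_A w_A - 1| =", float(np.max(np.abs(weights.sum(axis=0) - 1.0))))
```

By construction the weights sum to one at every point, so the atomic densities add up exactly to the molecular density; this is what allows atom-resolved quantities such as G₁, G₂, and G₃ to be summed over atoms.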

4.2. An Outline of GEBF

In the generalized energy-based fragmentation (GEBF) method [35,36,37,38], the total energy of a large system, such as a macromolecule or molecular aggregate, is expressed as a linear combination of the energies of smaller embedded subsystems, as given in Equation (14).

E_{tot} = \sum_{m} C_{m} \tilde{E}_{m} - \left[ \left( \sum_{m} C_{m} \right) - 1 \right] \sum_{A} \sum_{B > A} \frac{Q_{A} Q_{B}}{R_{AB}} \qquad (14)

Here, Ẽ_m and C_m stand for the total energy and the coefficient of the mth subsystem, respectively; Q_A is the atomic charge on atom A, and R_AB is the interatomic distance between atoms A and B.
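Equation (14) is straightforward to evaluate once the subsystem energies, coefficients, atomic charges, and coordinates are available. The sketch below assumes atomic units and uses made-up inputs; the function name `gebf_total_energy` is hypothetical and is not part of the LSQC package.

```python
import numpy as np

def gebf_total_energy(coeffs, sub_energies, charges, coords):
    """Evaluate Equation (14) in atomic units.

    coeffs, sub_energies: one entry per subsystem m (C_m and the embedded energy).
    charges, coords: atomic charges Q_A and Cartesian positions (n_atoms, 3) in bohr.
    """
    coeffs = np.asarray(coeffs, dtype=float)
    sub_energies = np.asarray(sub_energies, dtype=float)
    charges = np.asarray(charges, dtype=float)
    coords = np.asarray(coords, dtype=float)

    # Pairwise Coulomb sum over A < B of Q_A * Q_B / R_AB.
    a, b = np.triu_indices(len(charges), k=1)
    r_ab = np.linalg.norm(coords[a] - coords[b], axis=1)
    coulomb = float(np.sum(charges[a] * charges[b] / r_ab))

    return float(np.dot(coeffs, sub_energies)) - (coeffs.sum() - 1.0) * coulomb

# Made-up example: three subsystems whose coefficients sum to 1,
# so the charge-charge correction term vanishes.
energy = gebf_total_energy(
    coeffs=[1.0, 1.0, -1.0],
    sub_energies=[-10.0, -12.0, -5.0],
    charges=[0.4, -0.4],
    coords=[[0.0, 0.0, 0.0], [0.0, 0.0, 2.0]],
)
print(f"E_tot = {energy:.6f} Hartree")
```

The second term removes the charge-charge interactions that are counted more than once when the coefficients do not sum to one.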

The general procedure for performing GEBF calculations involves several steps. Employing a molecular cluster of benzene (C₆H₆) as illustrated in Figure 4f, each benzene molecule is treated as a fragment. Primitive subsystems are then constructed centered at each fragment, defined by a distance threshold (ζ). These primitive subsystems are assigned coefficients C_m = +1. Due to the spatial overlap among primitive subsystems, smaller derivative subsystems are generated. The coefficients of these derivative subsystems are determined automatically using the principle of inclusion and exclusion, ensuring proper energy accounting. Another parameter, γ_max, representing the maximum number of fragments allowed in a subsystem, is introduced to control subsystem size.
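The inclusion-exclusion step described above can be sketched as follows. This is a brute-force illustration under our own assumptions: subsystems are represented only by sets of fragment indices, all intersections of primitive subsystems are enumerated (which scales exponentially and is suitable only for small examples), and the γ_max constraint is ignored. It is not the algorithm implemented in LSQC.

```python
from collections import defaultdict
from itertools import combinations

def inclusion_exclusion_coefficients(primitives):
    """Map each subsystem (a frozenset of fragment indices) to its coefficient C_m.

    Primitive subsystems enter with +1; every k-fold intersection contributes
    (-1)^(k+1), so that each fragment is counted exactly once overall.
    """
    unique = list({frozenset(p) for p in primitives})   # drop duplicate primitives
    coeffs = defaultdict(int)
    for k in range(1, len(unique) + 1):
        sign = 1 if k % 2 == 1 else -1
        for combo in combinations(unique, k):
            common = frozenset.intersection(*combo)
            if common:                                  # skip empty overlaps
                coeffs[common] += sign
    return {s: c for s, c in coeffs.items() if c != 0}

# Hypothetical chain of five fragments with three overlapping primitive subsystems.
primitives = [{0, 1, 2}, {1, 2, 3}, {2, 3, 4}]
coeffs = inclusion_exclusion_coefficients(primitives)
for subsystem, c in sorted(coeffs.items(), key=lambda item: sorted(item[0])):
    print(sorted(subsystem), c)

# Each fragment should be counted once in total.
counts = {f: sum(c for s, c in coeffs.items() if f in s) for f in range(5)}
print(counts)
```

In this example the derivative subsystems {1, 2} and {2, 3} receive a coefficient of −1, while the contributions to {2} cancel, so no calculation is needed for it.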

All quantum chemical calculations for the subsystems are carried out using the GEBF method as implemented in the LSQC 3.0 (low scaling quantum chemistry) package [90]. In this work, the two key GEBF parameters (ζ, γ_max) are set to be (4.0, 6).

4.3. Computational Details

A total of 24 octane isomers; metallic clusters Be_n (n = 3 to 25) and Mg_n (n = 3 to 20 and 28); (CO₂)_n (n = 4 to 40); organic clusters (C₆H₆)_n (n = 4 to 30); covalent S_n (n = 2 to 18); and the polymeric structures (see Figure 3) of polyyne, polyene, all-trans-polymethineimine, and acene were taken from our previous publication. The protonated water clusters [(H₂O)_n(H₃O)]⁺ were taken from Ref. [23]. For cluster sizes n = 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, and 20, there are 74, 79, 113, 119, 108, 140, 121, 138, 114, 125, and 143 structures, respectively.

Molecular wavefunctions for all the systems were obtained at the HF/6-311++G(d,p) level. The Multiwfn 3.8 [91,92] program was utilized to calculate all ITA quantities by using the Gaussian 16 checkpoint or wavefunction file as the input. The stockholder Hirshfeld partition scheme of atoms in molecules was employed when atomic contributions were concerned. The reference-state density was the neutral atom calculated at the restricted open-shell ROHF/6-311++G(d,p) level. CCSD and CCSD(T) calculations for octane isomers were performed with the Gaussian 16 [93] package. For RI-MP2 calculations, Hartree-Fock (HF) orbitals from the Gaussian 16 calculations were then transformed into the ORCA [94] format by using the MOKIT [95] program (version 1.2.7rc9). The frozen core formalism [96,97] was used throughout this work, unless otherwise stated.

5. Conclusions

To summarize, in this work, we have applied the information-theoretic approach (ITA) quantities to appreciate the post-Hartree-Fock (such as MP2 or RI-MP2) correlation energies for various molecular clusters and polymeric systems with both localized and delocalized electronic structures. We have found that for linear or quasi-linear polymeric systems, such as polyyne and polyene, the predicted results based on the Hartree-Fock ITA quantities are in excellent agreement with the calculated MP2 correlation energies. For other systems, such as hydrogen-bonded protonated water clusters and dispersion-bound carbon dioxide and benzene clusters, satisfactory results can be obtained with the LR(ITA) protocol. For metallic Be_n and Mg_n, as well as covalent S_n, one can still obtain reasonable results. In addition, for relatively larger benzene clusters, we compare the LR(ITA) results with those from the GEBF method, and similar accuracy is observed. Our results collectively showcase that LR(ITA) is a promising method as a cost-efficient tool in predicting the electron correlation energy.

Supplementary Materials

The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/molecules30173500/s1, Tables S1–S17: Hartree-Fock ITA quantities, the electron correlation energies and the total energies, and linear regression coefficients and correlation coefficients. Figure S1. The correlation energy differences between LR(G3)-, and GEBF-LR(G3)-predicted values as referenced to those of GEBF versus the cluster size.

Author Contributions

Conceptualization, J.C., S.L., P.W.A., and D.Z.; data curation, P.W.,D.H., L.L., Y.Z., and D.Z.; formal analysis, P.W., D.H., Y.Z., and D.Z.; funding acquisition, P.W.A. and D.Z.; project administration, S.L., P.W.A., and D.Z.; supervision, S.L., P.W.A., and D.Z.; writing—original draft, J.C., S.L., P.W.A., and D.Z.; writing—reviewing and editing, J.C., S.L., P.W.A., and D.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This work is supported by the National Natural Science Foundation of China (grant No. 22203071 and 22361051), the High-Level Talent Special Support Plan, the China Scholarship Council, NSERC, Canada Research Chairs, and the Digital Research Alliance of Canada.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are contained within the article.

Acknowledgments

Part of the computations were performed on the high-performance computers of the Advanced Computing Center of Yunnan University.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Szabo, A.; Ostlund, N.S. Modern Quantum Chemistry: Introduction to Advanced Electronic Structure Theory; Dover Publications: Garden City, NY, USA, 1996.
2. Tew, D.P.; Klopper, W.; Helgaker, T. Electron correlation: The many-body problem at the heart of chemistry. J. Comput. Chem. 2007, 28, 1307–1320.
3. Nalewajski, R.F.; Parr, R.G. Information theory, atoms in molecules, and molecular similarity. Proc. Natl. Acad. Sci. USA 2000, 97, 8879–8882.
4. Ayers, P.W. Information Theory, the Shape Function, and the Hirshfeld Atom. Theor. Chem. Acc. 2006, 115, 370–378.
5. Zhao, Y.; Zhao, D.; Rong, C.; Liu, S.; Ayers, P.W. Information Theory Meets Quantum Chemistry: A Review and Perspective. Entropy 2025, 27, 644.
6. Zhao, Y.; Zhao, D.; Rong, C.; Liu, S.; Ayers, P.W. Extending the information-theoretic approach from the (one) electron density to the pair density. J. Chem. Phys. 2025, 162, 244108.
7. Shannon, C.E. A mathematical theory of communication. Bell Syst. Tech. J. 1948, 27, 379–423.
8. Fisher, R.A. Theory of statistical estimation. Math. Proc. Camb. Philos. Soc. 1925, 22, 700–725.
9. Zhao, D.; Liu, S.; Chen, D. A Density Functional Theory and Information-Theoretic Approach Study of Interaction Energy and Polarizability for Base Pairs and Peptides. Pharmaceuticals 2022, 15, 938.
10. Zhao, D.; He, X.; Ayers, P.W.; Liu, S. Excited-State Polarizabilities: A Combined Density Functional Theory and Information-Theoretic Approach Study. Molecules 2023, 28, 2576.
11. Zhao, D.; Zhao, Y.; He, X.; Ayers, P.W.; Liu, S. Efficient and accurate density-based prediction of macromolecular polarizabilities. Phys. Chem. Chem. Phys. 2023, 25, 2131–2141.
12. Zhao, D.; Zhao, Y.; Xu, E.; Liu, W.; Ayers, P.W.; Liu, S.; Chen, D. Fragment-Based Deep Learning for Simultaneous Prediction of Polarizabilities and NMR Shieldings of Macromolecules and Their Aggregates. J. Chem. Theory Comput. 2024, 20, 2655–2665.
13. Yuan, Y.; Zhao, Y.; Lu, L.; Wang, J.; Chen, J.; Ayers, P.W.; Liu, S.; Zhao, D. Multiproperty Deep Learning of the Correlation Energy of Electrons and the Physicochemical Properties of Molecules. J. Chem. Theory Comput. 2025, 21, 5997–6006.
14. Zhao, Y.; Richer, M.; Ayers, P.W.; Liu, S.; Zhao, D. Can the FCI Energies/Properties be Predicted with HF/DFT Densities? J. Chem. Sci. 2025, accepted.
15. Ghosh, S.K.; Berkowitz, M.; Parr, R.G. Transcription of ground-state density-functional theory into a local thermodynamics. Proc. Natl. Acad. Sci. USA 1984, 81, 8028–8031.
16. Liu, S.; Rong, C.; Wu, Z.; Lu, T. Rényi entropy, Tsallis entropy and Onicescu information energy in density functional reactivity theory. Acta Phys.-Chim. Sin. 2015, 31, 2057–2063.
17. Kullback, S. Information Theory and Statistics; Dover Publications: Mineola, NY, USA, 1997.
18. Liu, S. Identity for Kullback-Leibler divergence in density functional reactivity theory. J. Chem. Phys. 2019, 151, 141103.
19. Abyaz, B.; Mahdavifar, Z.; Schreckenbach, G.; Gao, Y. Prediction of beryllium clusters (Be_n; n = 3–25) from first principles. Phys. Chem. Chem. Phys. 2021, 23, 19716–19728.
20. Duanmu, K.; Friedrich, J.; Truhlar, D.G. Thermodynamics of Metal Nanoparticles: Energies and Enthalpies of Formation of Magnesium Clusters and Nanoparticles as Large as 1.3 nm. J. Phys. Chem. C 2016, 120, 26110–26118.
21. Raghavachari, K. Structures and stabilities of sulfur clusters. J. Chem. Phys. 1990, 93, 5862–5874.
22. Jones, R.O.; Ballone, P. Density functional and Monte Carlo studies of sulfur. I. Structure and bonding in S_n rings and chains (n = 2–18). J. Chem. Phys. 2003, 118, 9257–9265.
23. Ng, W.-P.; Zhang, Z.; Yang, J. Accurate Neural Network Fine-Tuning Approach for Transferable Ab Initio Energy Prediction across Varying Molecular and Crystalline Scales. J. Chem. Theory Comput. 2025, 21, 1602–1614.
24. Takeuchi, H. Geometry Optimization of Carbon Dioxide Clusters (CO₂)_n for 4 ≤ n ≤ 40. J. Phys. Chem. A 2008, 112, 7492–7497.
25. Takeuchi, H. Structural Features of Small Benzene Clusters (C₆H₆)_n (n ≤ 30) As Investigated with the All-Atom OPLS Potential. J. Phys. Chem. A 2012, 116, 10172–10181.
26. Roothaan, C.C.J. New Developments in Molecular Orbital Theory. Rev. Mod. Phys. 1951, 23, 69–89.
27. Møller, C.; Plesset, M.S. Note on an Approximation Treatment for Many-Electron Systems. Phys. Rev. 1934, 46, 618–622.
28. Weigend, F.; Ahlrichs, R. Efficient use of the correlation consistent basis sets in resolution of the identity MP2 calculations. Chem. Phys. Lett. 1997, 294, 143–152.
Weigend, F.; Häser, M.; Patzelt, H.; Ahlrichs, R. RI-MP2: Optimized auxiliary basis sets and demonstration of efficiency. Chem. Phys. Lett. 1998, 294, 143–152. [Google Scholar] [CrossRef]
Bartlett, R.J.; Watts, J.D. The coupled-cluster single and double excitation model for the ground-state correlation energy. Chem. Phys. Lett. 1989, 155, 133–140. [Google Scholar] [CrossRef]
Čížek, J.; Paldus, J. Coupled-cluster method with singles and doubles for closed-shell systems. Int. J. Quantum Chem. 1971, 5, 359–379. [Google Scholar]
Purvis, G.D.; Bartlett, R.J. A full coupled-cluster singles and doubles model: The inclusion of disconnected triples. J. Chem. Phys. 1982, 76, 1910–1918. [Google Scholar] [CrossRef]
Parr, R.G.; Yang, W. Density Functional Theory of Atoms and Molecules; Oxford University Press: Oxford, UK, 1989. [Google Scholar]
Teale, A.M.; Helgaker, T.; Savin, A.; Adamo, C.; Aradi, B.; Arbuznikov, A.V.; Ayers, P.W.; Baerends, E.J.; Barone, V.; Calaminici, P.; et al. DFT exchange: Sharing perspectives on the workhorse of quantum chemistry and materials science. Phys. Chem. Chem. Phys. 2022, 24, 28700–28781. [Google Scholar] [CrossRef]
Li, S.; Li, W.; Fang, T. An Efficient Fragment-Based Approach for Predicting the Ground-State Energies and Structures of Large Molecules. J. Am. Chem. Soc. 2005, 127, 7215–7226. [Google Scholar] [CrossRef]
Li, W.; Li, S.; Jiang, Y. Generalized Energy-Based Fragmentation Approach for Computing the Ground-State Energies and Properties of Large Molecules. J. Phys. Chem. A 2007, 111, 2193–2199. [Google Scholar] [CrossRef]
Li, S.; Li, W.; Ma, J. Generalized Energy-Based Fragmentation Approach and Its Applications to Macromolecules and Molecular Aggregates. Acc. Chem. Res. 2014, 47, 2712–2720. [Google Scholar] [CrossRef] [PubMed]
Li, W.; Dong, H.; Ma, J.; Li, S. Structures and Spectroscopic Properties of Large Molecules and Condensed-Phase Systems Predicted by Generalized Energy-Based Fragmentation Approach. Acc. Chem. Res. 2021, 54, 169–181. [Google Scholar] [CrossRef] [PubMed]
Pulay, P. Localizability of dynamic electron correlation. Chem. Phys. Lett. 1983, 100, 151–154. [Google Scholar] [CrossRef]
Sæbø, S.; Pulay, P. Local configuration interaction: An efficient approach for larger molecules. Chem. Phys. Lett. 1985, 113, 13–18. [Google Scholar] [CrossRef]
Sæbø, S.; Pulay, P. Local Treatment of Electron Correlation. Annu. Rev. Phys. Chem. 1993, 44, 213–236. [Google Scholar] [CrossRef]
Welborn, M.; Cheng, L.; Miller, T.F., III. Transferability in Machine Learning for Electronic Structure via the Molecular Orbital Basis. J. Chem. Theory Comput. 2018, 14, 4772–4779. [Google Scholar] [CrossRef]
Cheng, L.; Welborn, M.; Christensen, A.S.; Miller, T.F., III. A universal density matrix functional from molecular orbital-based machine learning: Transferability across organic molecules. J. Chem. Phys. 2019, 150, 131103. [Google Scholar] [CrossRef]
Cheng, L.; Kovachki, N.B.; Welborn, M.; Miller, T.F., III. Regression Clustering for Improved Accuracy and Training Costs with Molecular-Orbital-Based Machine Learning. J. Chem. Theory Comput. 2019, 15, 6668–6677. [Google Scholar] [CrossRef]
Imamura, Y.; Takahashi, A.; Nakai, H. Grid-based energy density analysis: Implementation and assessment. J. Chem. Phys. 2007, 126, 034103. [Google Scholar] [CrossRef]
Nudejima, T.; Ikabata, Y.; Seino, J.; Yoshikawa, T.; Nakai, H. Machine-learned electron correlation model based on correlation energy density at complete basis set limit. J. Chem. Phys. 2019, 151, 024104. [Google Scholar] [CrossRef] [PubMed]
Han, R.; Luber, S. Fast Estimation of Møller–Plesset Correlation Energies Based on Atomic Contributions. J. Phys. Chem. Lett. 2021, 12, 5324–5331. [Google Scholar] [CrossRef] [PubMed]
Han, R.; Rodríguez-Mayorga, M.; Luber, S. A Machine Learning Approach for MP2 Correlation Energies and Its Application to Organic Compounds. J. Chem. Theory Comput. 2021, 17, 777–790. [Google Scholar] [CrossRef] [PubMed]
Ng, W.-P.; Liang, Q.; Yang, J. Low-Data Deep Quantum Chemical Learning for Accurate MP2 and Coupled-Cluster Correlations. J. Chem. Theory Comput. 2023, 19, 5439–5449. [Google Scholar] [CrossRef]
Townsend, J.; Vogiatzis, K.D. Transferable MP2-Based Machine Learning for Accurate Coupled-Cluster Energies. J. Chem. Theory Comput. 2020, 16, 7453–7461. [Google Scholar] [CrossRef]
McGibbon, R.T.; Taube, A.G.; Donchev, A.G.; Siva, K.; Hernández, F.; Hargus, C.; Law, K.-H.; Klepeis, J.L.; Shaw, D.E. Improving the accuracy of Møller-Plesset perturbation theory with neural networks. J. Chem. Phys. 2017, 147, 161725. [Google Scholar] [CrossRef]
Boys, S.F. Construction of Some Molecular Orbitals to Be Approximately Invariant for Changes from One Molecule to Another. Rev. Mod. Phys. 1960, 32, 296–299. [Google Scholar] [CrossRef]
Edmiston, C.; Ruedenberg, K. Localized Atomic and Molecular Orbitals. Rev. Mod. Phys. 1963, 35, 457–464. [Google Scholar] [CrossRef]
White, S.R. Density matrix formulation for quantum renormalization groups. Phys. Rev. Lett. 1992, 69, 2863–2866. [Google Scholar] [CrossRef]
White, S.R. Density-matrix algorithms for quantum renormalization groups. Phys. Rev. B 1993, 48, 10345–10356. [Google Scholar] [CrossRef]
Riplinger, T.; Neese, F. An efficient and near linear scaling pair natural orbital based local coupled cluster method. J. Chem. Phys. 2013, 138, 034106. [Google Scholar] [CrossRef]
Stewart, J.J.P. Optimization of parameters for semiempirical methods V: Modification of NDDO approximations and application to 70 elements. J. Mol. Model. 2013, 19, 1173–1213. [Google Scholar] [CrossRef] [PubMed]
Bader, R.F.W. Atoms in Molecules: A Quantum Theory; Clarendon Press: Oxford, UK, 1990. [Google Scholar]
Clementi, E.; Raimondi, D.L. Atomic Screening Constants from SCF Functions. J. Chem. Phys. 1963, 38, 2686–2689. [Google Scholar] [CrossRef]
Gadre, S.R.; Sears, S.B.; Chakravorty, S.J.; Bendale, R.D. Some novel characteristics of atomic information entropies. Phys. Rev. A 1985, 32, 2602–2606. [Google Scholar] [CrossRef] [PubMed]
Sears, S.B.; Gadre, S.R. An information theoretic synthesis and analysis of Compton profiles. J. Chem. Phys. 1981, 75, 4626–4635. [Google Scholar] [CrossRef]
Sears, S.B.; Parr, R.G.; Dinur, U. On the Quantum-Mechanical Kinetic Energy as a Measure of the Information in a Distribution. Isr. J. Chem. 1980, 19, 165–173. [Google Scholar] [CrossRef]
Nagy, Á. Fisher information in density functional theory. J. Chem. Phys. 2003, 119, 9401–9405. [Google Scholar] [CrossRef]
Nagy, Á.; Parr, R.G. Information entropy as a measure of the quality of an approximate electronic wave function. Int. J. Quantum Chem. 1996, 58, 323–327. [Google Scholar] [CrossRef]
Morrison, R.C.; Parr, R.G. Approximate density matrices and Husimi functions using the maximum entropy formulation with constraints. Int. J. Quantum Chem. 1991, 39, 823–837. [Google Scholar] [CrossRef]
Ramírez, J.C.; Pérez, J.M.H.; Sagar, R.P.; Esquivel, R.O.; Hô, M.; Smith, V.H., Jr. Amount of information present in the one-particle density matrix and the charge density. Phys. Rev. A 1998, 58, 3507–3515. [Google Scholar] [CrossRef]
Hô, M.; Weaver, D.F.; Smith, V.H., Jr.; Sagar, R.P.; Esquivel, R.O. Calculating the logarithmic mean excitation energy from the Shannon information entropy of the electronic charge density. Phys. Rev. A 1998, 57, 4512–4517. [Google Scholar] [CrossRef]
Hô, M.; Smith, V.H., Jr.; Weaver, D.F.; Gatti, C.; Sagar, R.P.; Esquivel, R.O. Molecular similarity based on information entropies and distances. J. Chem. Phys. 1998, 108, 5469–5475. [Google Scholar] [CrossRef]
Hô, M.; Sagar, R.P.; Weaver, D.F.; Smith, V.H., Jr. An investigation of the dependence of Shannon-information entropies and distance measures on molecular-geometry. Int. J. Quantum Chem. 1995, S29, 109–115. [Google Scholar] [CrossRef]
Hô, M.; Sagar, R.P.; Smith, V.H., Jr.; Esquivel, R.O. Atomic information entropies beyond the Hartree-Fock limit. J. Phys. B At. Mol. Opt. Phys. 1994, 27, 5149–5157. [Google Scholar] [CrossRef]
Hô, M.; Sagar, R.P.; Pérez-Jordá, J.M.; Smith, V.H., Jr.; Esquivel, R.O. A numerical study of molecular information entropies. Chem. Phys. Lett. 1994, 219, 15–20. [Google Scholar] [CrossRef]
Nagy, Á.; Liu, S. Local wave-vector, Shannon and Fisher information. Phys. Lett. A 2008, 372, 1654–1656. [Google Scholar] [CrossRef]
Liu, S. On the relationship between densities of Shannon entropy and Fisher information for atoms and molecules. J. Chem. Phys. 2007, 126, 191107. [Google Scholar] [CrossRef]
Bader, R.F.W.; Preston, H.J.T. The kinetic energy of molecular charge distributions and molecular stability. Int. J. Quantum Chem. 1969, 3, 327–347. [Google Scholar] [CrossRef]
Tal, Y.; Bader, R.F.W. Studies of the energy density functional approach. I. Kinetic energy. Int. J. Quantum Chem. 1978, 14, 153–168. [Google Scholar] [CrossRef]
Cohen, L. Local kinetic energy in quantum mechanics. J. Chem. Phys. 1979, 70, 788–789. [Google Scholar] [CrossRef]
Cohen, L. Representable local kinetic energy. J. Chem. Phys. 1984, 80, 4277–4279. [Google Scholar] [CrossRef]
Yang, Z.; Liu, S.; Wang, Y.A. Uniqueness and Asymptotic Behavior of the Local Kinetic Energy. Chem. Phys. Lett. 1996, 258, 30–36. [Google Scholar] [CrossRef]
Ayers, P.W.; Parr, R.G.; Nagy, Á. Local kinetic energy and local temperature in the density-functional theory of electronic structure. Int. J. Quantum Chem. 2002, 90, 309–326. [Google Scholar] [CrossRef]
Anderson, J.S.M.; Ayers, P.W.; Hernandez, J.I.R. How Ambiguous Is the Local Kinetic Energy? J. Phys. Chem. A 2010, 114, 8884–8895. [Google Scholar] [CrossRef]
Berkowitz, M. Exponential approximation for the density matrix and the Wigner distribution. Chem. Phys. Lett. 1986, 129, 486–488. [Google Scholar] [CrossRef]
Geerlings, P.; De Proft, F.; Langenaeker, W. Conceptual Density Functional Theory. Chem. Rev. 2003, 103, 1793–1874. [Google Scholar] [CrossRef]
Johnson, P.A.; Bartolotti, L.J.; Ayers, P.W.; Fievez, T.; Geerlings, P. Charge Density and Chemical Reactivity: A Unified View from Conceptual DFT. In Modern Charge Density Analysis; Gatti, C., Macchi, P., Eds.; Springer: New York, NY, USA, 2012. [Google Scholar]
Liu, S. Conceptual Density Functional Theory and Some Recent Developments. Acta Phys. -Chim. Sin. 2009, 25, 590–600. [Google Scholar]
Geerlings, P.; Chamorro, E.; Chattaraj, P.K.; De Proft, F.; Gázquez, J.L.; Liu, S.; Morell, C.; Toro-Labbé, A.; Vela, A.; Ayers, P.W. Conceptual density functional theory: Status, prospects, issues. Theor. Chem. Acc. 2020, 139, 36. [Google Scholar] [CrossRef]
Hirshfeld, F.L. Bonded-atom fragments for describing molecular charge densities. Theor. Chim. Acta 1977, 44, 129–138. [Google Scholar] [CrossRef]
Heidar-Zadeh, F.; Ayers, P.W.; Verstraelen, T.; Vinogradov, I.; Vohringer-Martinez, E.; Bultinck, P. Information-Theoretic Ap-proaches to Atoms-in-Molecules: Hirshfeld Family of Partitioning Schemes. J. Phys. Chem. A 2018, 122, 4219–4245. [Google Scholar] [CrossRef] [PubMed]
Becke, A.D. A multicenter numerical integration scheme for polyatomic molecules. J. Chem. Phys. 1988, 88, 2547–2553. [Google Scholar] [CrossRef]
Rong, C.; Wang, B.; Zhao, D.; Liu, S. Information-Theoretic approach in density functional theory and its recent applications to chemical problems. Wiley Interdiscip. Rev. Comput. Mol. Sci. 2020, 10, e1461. [Google Scholar] [CrossRef]
Li, W.; Chen, C.; Zhao, D.; Li, S. LSQC: Low scaling quantum chemistry program. Int. J. Quantum Chem. 2015, 115, 641–646. [Google Scholar] [CrossRef]
Lu, T.; Chen, F. Multiwfn: A multifunctional wavefunction analyzer. J. Comput. Chem. 2012, 33, 580–592. [Google Scholar] [CrossRef]
Lu, T. A comprehensive electron wavefunction analysis toolbox for chemists, Multiwfn. J. Chem. Phys. 2024, 161, 082503. [Google Scholar] [CrossRef]
Frisch, M.J.; Trucks, G.W.; Schlegel, H.B.; Scuseria, G.E.; Robb, M.A.; Cheeseman, J.R.; Scalmani, G.; Barone, V.; Petersson, G.A.; Nakatsuji, H.; et al. Gaussian 16 Rev. C.01; Gaussian, Inc.: Wallingford, CT, USA, 2016. [Google Scholar]
Zou, J. Molecular Orbital Kit (MOKIT). Available online: https://gitlab.com/jxzou/mokit (accessed on 20 May 2025).
Neese, F. Software update: The ORCA program system—Version 5.0. Wiley Interdiscip. Rev. Comput. Mol. Sci. 2022, 12, e1606. [Google Scholar] [CrossRef]
Pulay, P.; Saebø, S. Orbital-invariant formulation and second-order gradient evaluation in Møller–Plesset perturbation theory. Theor. Chim. Acta 1986, 69, 357–368. [Google Scholar] [CrossRef]
Hampel, C.; Peterson, K.; Werner, H. A comparison of the efficiency and accuracy of different electron correlation methods for large molecules: Quasi-Newton, MP2, CCSD, and CCSD(T). Chem. Phys. Lett. 1992, 190, 1–12. [Google Scholar] [CrossRef]

Figure 1. Comparison of (a) the conventional MP2 method, which takes Hartree-Fock orbitals as input, and (b) the linear regression LR(ITA) models used in this work, which take density-based information-theoretic approach (ITA) quantities as input. MP2 serves here only as a proof of concept.

Figure 2. The 24 octane isomers, linear and branched, studied in this work.

Figure 3. Some representative polymeric structures used in this work, including (a) polyyne, (b) polyene, (c) all-trans-polymethineimine, and (d) acene.

Figure 4. Representative cluster structures used in this work: (a) Be_n, (b) Mg_n, (c) S_n, (d) [H⁺(H₂O)_n], (e) (CO₂)_n, and (f) (C₆H₆)_n.

Figure 5. Comparison of the LR(G₃)-, GEBF-LR(G₃)-predicted and GEBF-calculated MP2-level electron correlation energies for benzene clusters (C₆H₆)_n (n = 15–30). The regression equation is trained on smaller benzene clusters (C₆H₆)_n (n = 4–14). RMSD: root mean squared deviation. Note that the R² and RMSD values here gauge the prediction quality of an extrapolated set, differing from the regression statistics in previous tables that summarize fits within the training set.

Table 1. Strong linear correlations (R²) and RMSD ^a (in mH) between the calculated ^b and predicted correlation energies based on the ITA quantities ^c for octane isomers.

ITA	Method	Slope	Intercept	R²	RMSD
$S_{S}$	MP2	0.03673221	−4.47037893	0.878	1.9
	CCSD	0.02760739	−3.77240773	0.897	1.3
	CCSD(T)	0.03224137	−4.22658251	0.893	1.5
$I_{F}$	MP2	0.01016369	−21.9076991	0.987	0.6
	CCSD	0.00756499	−16.7278042	0.989	0.4
	CCSD(T)	0.00885171	−19.3909815	0.988	0.5
$S_{GBP}$	MP2	0.03958034	−18.81389475	0.964	1.0
	CCSD	0.02958941	−14.48237993	0.974	0.6
	CCSD(T)	0.03459737	−16.75258592	0.972	0.8

^a RMSD: root mean squared deviation. ^b The basis set 6-311++G(d,p) was used. ^c HF/6-311++G(d,p).
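The coefficients in Table 1 can be applied as in the minimal sketch below. It assumes that each model has the form E_corr ≈ slope × ITA + intercept with energies in a.u., which this section does not state explicitly. The function name, the dictionary layout, and the example Fisher information value are hypothetical and serve only as illustration.

```python
# Slope and intercept copied from Table 1 (octane isomers),
# keyed by (ITA quantity, correlated method).
TABLE1 = {
    ("S_S", "MP2"): (0.03673221, -4.47037893),
    ("S_S", "CCSD"): (0.02760739, -3.77240773),
    ("S_S", "CCSD(T)"): (0.03224137, -4.22658251),
    ("I_F", "MP2"): (0.01016369, -21.9076991),
    ("I_F", "CCSD"): (0.00756499, -16.7278042),
    ("I_F", "CCSD(T)"): (0.00885171, -19.3909815),
    ("S_GBP", "MP2"): (0.03958034, -18.81389475),
    ("S_GBP", "CCSD"): (0.02958941, -14.48237993),
    ("S_GBP", "CCSD(T)"): (0.03459737, -16.75258592),
}


def predict_correlation_energy(ita_value, ita="I_F", method="MP2"):
    """Assumed model form: E_corr = slope * ITA + intercept (a.u.)."""
    slope, intercept = TABLE1[(ita, method)]
    return slope * ita_value + intercept


# Hypothetical Fisher information of one isomer, as it would be
# obtained from an HF/6-311++G(d,p) density (not a value from the paper).
example_fisher = 2030.0
e_corr = predict_correlation_energy(example_fisher, "I_F", "MP2")
print(f"Predicted MP2 correlation energy: {e_corr:.4f} a.u.")
```

Only a Hartree-Fock density is needed to evaluate the ITA input, so the lookup itself adds no cost beyond the Hartree-Fock step.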

Table 2. Strong linear relationships (R²) and RMSD ^a between the calculated ^b (in the last column) and predicted correlation energies based on the ITA quantities ^c for polyyne. RMSD is in mH, and others are in a.u.

n	$S_{S}$	$I_{F}$ /10³	$S_{GBP}$ /10³	$E_{2}$	$E_{3}$ /10³	$R_{2}^{r}$	$R_{3}^{r}$	$G_{1}$	$G_{3}$	$I_{G}$	$\epsilon_{MP2}$
1	17.116	0.503	0.096	63.341	2.251	14.478	15.411	−6.702	13.889	0.253	−0.2718
2	27.503	0.996	0.178	126.454	4.498	26.687	28.049	−11.822	26.724	0.357	−0.5284
3	37.877	1.489	0.260	189.565	6.744	38.891	40.680	−16.946	39.589	0.458	−0.7879
4	48.238	1.982	0.342	252.682	8.991	51.093	53.301	−22.064	52.468	0.556	−1.0485
5	58.604	2.475	0.425	315.797	11.238	63.292	65.918	−27.186	65.335	0.654	−1.3100
6	68.968	2.968	0.507	378.914	13.485	75.491	78.532	−32.303	78.206	0.751	−1.5715
7	79.331	3.461	0.589	442.032	15.731	87.690	91.146	−37.422	91.079	0.849	−1.8332
8	89.696	3.954	0.671	505.147	17.978	99.888	103.759	−42.541	103.952	0.946	−2.0952
9	100.063	4.447	0.753	568.264	20.225	112.086	116.372	−47.659	116.821	1.043	−2.3570
10	110.435	4.940	0.835	631.378	22.472	124.284	128.984	−52.780	129.686	1.139	−2.6192
30	317.730	14.800	2.478	1893.708	67.408	368.246	381.243	−155.141	387.180	3.076	−7.8579
R²	1.000	1.000	1.000	1.000	1.000	1.000	1.000	1.000	1.000	1.000	
RMSD	1.5	1.3	1.3	1.2	1.2	1.3	1.5	1.4	0.9	2.9
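The regression statistics reported for polyyne can be explored with the following sketch, which fits the calculated MP2 correlation energy against the Shannon entropy column of Table 2 by ordinary least squares. It assumes that all eleven rows enter the fit; the section does not say which rows the authors used, so the printed statistics need not match the tabulated ones. The helper names are hypothetical. The last step holds out n = 30 to mimic the extrapolation test described for benzene clusters in Figure 5.

```python
from math import sqrt

# Table 2 (polyyne): Shannon entropy S_S and calculated MP2
# correlation energy (a.u.) for n = 1-10 and n = 30.
s_shannon = [17.116, 27.503, 37.877, 48.238, 58.604, 68.968,
             79.331, 89.696, 100.063, 110.435, 317.730]
e_mp2 = [-0.2718, -0.5284, -0.7879, -1.0485, -1.3100, -1.5715,
         -1.8332, -2.0952, -2.3570, -2.6192, -7.8579]


def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r2_and_rmsd(x, y, slope, intercept):
    """Coefficient of determination and root mean squared deviation."""
    pred = [slope * xi + intercept for xi in x]
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    mean_y = sum(y) / len(y)
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot, sqrt(ss_res / len(y))


# Fit on all rows (assumption) and report the statistics.
slope, intercept = fit_line(s_shannon, e_mp2)
r2, rmsd = r2_and_rmsd(s_shannon, e_mp2, slope, intercept)
print(f"R^2 = {r2:.4f}, RMSD = {1000 * rmsd:.1f} mH")

# Hold out n = 30: train on n = 1-10 only, then extrapolate.
slope10, intercept10 = fit_line(s_shannon[:10], e_mp2[:10])
pred_30 = slope10 * s_shannon[-1] + intercept10
print(f"n = 30: predicted {pred_30:.4f} a.u., calculated {e_mp2[-1]:.4f} a.u.")
```

Swapping `s_shannon` for any other ITA column of Table 2 gives the corresponding single-descriptor model.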