Comparison of Two Efficient Methods for Calculating Partition Functions

In the long-time pursuit of the solution to calculating the partition function (or free energy) of condensed matter, Monte-Carlo-based nested sampling should be the state-of-the-art method, and very recently, we established a direct integral approach that works at least four orders faster. In present work, the above two methods were applied to solid argon at temperatures up to 300 K. The derived internal energy and pressure were compared with the molecular dynamics simulation as well as experimental measurements, showing that the calculation precision of our approach is about 10 times higher than that of the nested sampling method.


Introduction
The birth of statistical physics laid a solid foundation to predict thermodynamic properties of macroscopic condensed matters. Phase transitions [1-3], protein folding [4] and the optimal conditions for novel material growth could be predicted theoretically as long as the partition function (PF) or free energy can be evaluated [5]. Nevertheless, solutions to the PF have been a long standing problem [6] and attempts were reluctantly turned to the help of molecular simulations [7]. With precedent efforts made in calculating the relative free energy, e.g., Gibbs ensemble Monte Carlo (MC) [8] sampling and thermodynamic integration [9], more attentions have been paid to density of states (DOSs) for computing the absolute PF [10][11][12][13]. The Bayesian-statistics-based nested sampling (NS) may be the state-of-the-art one [14,15], which aims at uniformly sampling a series of fixed fractions partitioned by potential energies in the configurational space to calculate DOS and has been applied in several systems [1, [16][17][18][19][20][21][22][23][24][25].
Very recently, we put forward a direct integral approach (DIA) to calculate the PF of condensed matters [26] and the high accuracy has been proved by molecular dynamics (MD) simulations of condensed copper using tight-binding potential [26], graphene and γ-graphyne materials using Brenner potential [27]. Based on our reinterpretation of the original sense of the integral, it was shown that DIA works at least four-order faster than NS [26]. On the other hand, it has not yet been confirmed whether the DIA has improved the computational precision of precedent MC methods. In this work, we carried out detailed analysis of DIA and NS in terms of the computational precision, and performed MD simulations to test the precision of internal energy and equations of state derived from the PF. It should be pointed out that the tests with MD simulations, instead of experimental data, are the most rigorous because same interatomic potentials can be used in calculations of the PF and MD simulations, which have been proved to be capable of producing very accurate results for various systems [28][29][30]. If the results derived from PF are only compared with experimental measurements, just as in most previous works, it would yet be doubted that the method for calculating the PF is accurate or not even if the agreements are excellent since it would be very likely that a deficient algorithm combined with an inappropriate empirical potential accidentally gives rise to an outcome close to the experiment.
The paper is organized as follows. In Section 2, NS and DIA were briefly formulated, and in Section 3, we first discussed the relationship between efficiency and accuracy of NS, and then performed MD simulations of solid argon to test the computational precision of DIA and NS, showing that DIA has a much higher precision than NS. In addition, we found that NS works badly for the highly-condensed systems while DIA has no such a problem. A comparison with experimental data of solid argon along the melting line was presented as well, which further validates that DIA is more accurate than NS.

Methods
PF is defined as a summation over the probabilities of all the microstates, and for a canonical ensemble consisting of N particles confined in volume V at temperature T, it reads where Λ is the thermal wavelength, β = 1/k B T with k B the Boltzmann constant, q 3N = {q 1 , q 2 . . . , q 3N } the Cartesian coordinates of particles and U(q 3N ) the potential energy. The 3N-dimensional integral on the right hand of Equation (1) is solely related to the microscopic states in configurational space, the so-called configurational integral (CI),

Nested Sampling
In Equation (2), microstates in configurational space are expressed in terms of coordinates of particles. From another point of view, we may also label the microstates by their corresponding potential energy and the integral can be rewritten in terms of the DOS of potential energy.
To introduce this, Equation (2) is re-expressed in reduced coordinates(s = q/L) as where L is the length of the cubic simulation box and V is the volume. The integral in Equation (3) represents the ensemble average of Boltzmann factors and denoted as Q ex . Since the probability of finding potential energy U is proportional to the DOS, Ω(U) = 1 0 δ(U − U(s N ))ds N , Q ex can be expressed as [16,20] The strategy of NS is to partition the configurational space into a series of energy-decrease subdivisions numbered by m. For the mth subspace with upper energy limit U m , a fixed number of configurations (L) with each energy ε i < U m are generated by MC method and ordered in a sequence as ε 1 < ε 2 < . . . < ε L . The lower energy boundary U m+1 , which is the upper one for the (m + 1)th subspace, is set to be the energy of a fixed fraction α of current subspace, as U m+1 = ε I with I = αL. By the NS algorithm, Equation (4) can be simplified as [22] where U m is an averaged energy of the mth subspace and ω m stands for the percentile of the mth phase space volume. It is obvious that ω m = α m − α m+1 , and after nth iteration when the convergence condition is reached, CI is evaluated as where U m = (U m + U m+1 )/2 which is the average of two consecutive median energies of each sampled partition to approximate an averaged energy of the mth subspace [20,22]. According to E = − ∂ ln Z ∂β , the internal energy of the N-particle system is calculated by For determining the pressure by P = 1 β ∂ ln Z ∂V , another CI for the system with a volume of V + ∆V should be calculated and P is obtained by

Direct Integral Approach
Consider Equation (2) and let the set Q 3N = {Q 1 , Q 2 . . . Q 3N } be the coordinates of particles in the state of the lowest potential energy U 0 , we may introduce a function as where q i = q i − Q i . By inserting Equation (9) into Equation (2), we obtain According to our very recent work [26], the integral can be solved as where L i represents the effective length on the ith degree of freedom and is defined as For homogeneous systems with certain geometric symmetry, such as perfect one-component crystals, all the particles are equivalent and U felt by one particle moving along q x may be the same as the one along q y (or q z ). In such a case, Equation (11) turns into where L is determined by Equation (12). Otherwise, it is needed to calculate the effective length, L x , L y , L z respectively, and Equation (11) turns into and, E and P are thus evaluated as

Comparisons and Discussion
The tested models are face-centered-cubic (FCC) solid Ar systems consisting of 500 or 4000 atoms confined in a cubic box with different sizes, and, NS and DIA were applied to calculate internal energy E and pressure P at different temperatures to be compared with MD simulations. The interatomic potential for solid Ar was characterized by the commonly used pairwise 12-6 Lennard-Jones (L-J) potential [31], where r ij is the distance between atoms i and j, = 117.05 (K), σ = 3.4 Å and the cutoff distance is r cut = 12.0 Å. The MD simulations with periodic boundary condition applied were performed by the Large-scale Atomic/Molecular Massively Parallel Simulator software package [32] with a time step of 0.1 fs. The Nose-Hoover constant-temperature algorithm [33] was used to produce a canonical ensemble at temperature T. The pressure was computed by the virial theorem for two-body potentials [34,35], the algorithm of which has already been integrated in the software package. The system was allowed to relax 20 ps at first and then continued to run for another 50 ps, during which averages of E and P were recorded in every 10 fs.
To implement NS, it should be at first to select appropriate values of α and L, which cooperatively balance the computational efficiency and precision of Q [Equation (6)]. Apparently, the larger the values of α and L are, the higher the calculation precision is, but the slower the computation speed is. Although the initial choice of α made by Pártay et al. [16] is L/(L + 1), successive works [17,18,21,22] have showed that a smaller value of α = 1/2 is sufficient enough to guarantee the calculation precision and enables NS to be applicable to systems consisting of up to several hundred atoms, of which the computational cost is too expensive for NS with α = L/(L + 1). Therefore, α = 1/2 was adopted in this work. Cares should be also paid to the value of L because, besides the factors of efficiency and systematic errors mentioned above, fluctuations of the calculated results in NS simulations closely depend on L for a fixed α [22]. We performed NS with four different numbers of configuration (L = 45,000, 60,000, 75,000, 90,000) to calculate the pressures of the solid Ar system consisting of 500 atoms with a density of 1.83 g/cm 3 at different temperatures, where the well-built cage model for solid systems [18] was used. For each L, we ran the NS simulations 15 times to produce the averaged value of pressure which was compared with MD simulations to see the relationship between the deviations and L. Figure 1 (data in supplementary materials), the pressures obtained by the NS are gradually approaching to those of the MD simulations as L increases and the corresponding fluctuations of NS is relatively larger with the smallest L. On the other hand, it should be noted that the fluctuations does not monotonically decrease with the increase of L, which was also observed in previous works [17]. The fluctuations for L = 60,000 and 90,000 are almost the same, which are about 30% smaller than those with L = 75,000, though the pressures with L = 90,000 are slightly closer to the MD simulations. Considering that the computational time with L = 90,000 is twice as much as that with L = 60,000, we chose L = 60,000 in the following work and conducted the NS simulations at each (N, V, T) conditions for 15 times to calculate the averaged values of internal energy by Equation (7) and pressure by Equation (8), where the volume difference ∆V was made by changing the length of the box by 1% because our calculations showed that smaller volume difference would produce very unphysical results. Relatively, systematic parameters are much fewer for implementation of DIA. For the solid Ar system, the atoms were placed right at the FCC sites to produce U 0 , and U (0 . . . q i . . . 0) in Equation (12) was obtained by moving the center atoms along its Z-axis ([100] direction) by 2Å while the coordinates of its X-axis, Y-axis, and of all the other particles were kept fixed. 2 × 10 4 potential energies were recorded to calculate the L by Equation (12), and, the internal energy and pressure were subsequently calculated by Equations (15) and (16), where the volume difference was made by changing the length of the box by 10 −3 %. For the argon system of 500 atoms with different densities (1.83, 2.13, 2.43 and 2.98 g/cm 3 ) at temperatures from 25 K to 300 K, E and P obtained by DIA and the NS are shown in Figure 2, where the corresponding quantities of E MD and P MD of MD simulations are also presented as comparisons (data in supplementary materials). For the systems with a density of 1.83 g/cm 3 and 2.13 g/cm 3 , the averaged relative difference of internal energy, RDE (= | E−E MD E MD |), of DIA is less than 4.1%, which is about four times smaller than that, 16.6%, of NS [36]. As the density increases up to 2.43 g/cm 3 and 2.98 g/cm 3 , the averaged RDE of DIA decrease to 5.51% and 0.48% respectively, while the averaged RDE of the NS climbs up to 36.44% for the density of 2.43 g/cm 3 and the NS fails to work for the system with density of 2.98 g/cm 3 . As to precision of the pressure, the averaged relative difference, RDP (= | P−P MD P MD |), of DIA is 2.48%, 1.69%, 0.17% and 0.63% for the densities of 1.83, 2.13, 2.43 and 2.98 g/cm 3 respectively, while the corresponding RDP of the NS is 10.22%, 9.18%, 4.54% and ∞.

As shown in
The above comparisons show that the calculation precision of DIA is much higher than that of the NS. Furthermore, DIA works better with increase of the density while the NS can hardly work when the density is higher than 2.98 g/cm 3 . The difficulty should be attributed to numerical calculations of Equation (6), where the factor (α m − α m+1 ) approaches to zero (α = 1/2) as m approaches to larger number, meanwhile, the factor e −β(U m +U m+1 )/2 increases quickly when U m < 0, which is the common case for the Ar systems with lower density and the product ((α m − α m+1 ) · e −β(U m +U m+1 )/2 ) is not too large (or small) for the 16 bit number of computer to describe. However, when the density is large enough that the U m > 0, both (α m − α m+1 ) and (e −β(U m +U m+1 )/2 ) approach to zero as m getting larger, and the product ((α m − α m+1 ) · e −β(U m +U m+1 )/2 ) gets to be so small (but not exactly "0") that the output of computer is exact "0", which makes the denominator in Equation (7) be zero easily. For this reason, we failed to apply the NS to calculate E and P of the Ar system with a density of 2.98 g/cm 3 . A larger value of α might be helpful while the computational efficiency would be slowed down. By contrast, DIA has no such a problem because the largest part of the potential energy, U 0 of the MSS, has been extracted in Equations (9) and (10), and the left part U is small enough to guarantee the precision of the integral for high density systems. The lower precision for the NS calculating the pressure can be understood as follows. The pressure is determined by Equation (8), where the volume difference ∆V should be set as small as possible to achieve high precision. However, the integral Q of Equation (6) is not very sensitive to the small changes of the volume V because of the random characteristic of MC simulations, leading to large fluctuations of Q(V + ∆V) − Q(V) for each running of the MC simulation. Our calculations showed that the large fluctuations would produce unphysical pressures when the ∆V is smaller than 10 −4 %V, which corresponds to the length of the cubic box changed by 1% adopted in our calculations. In DIA for calculating the pressure [Equation (16)], the involved quantities U 0 and L determined by Equation (12) are all sensitive to volume of the system, so the volume difference in Equation (16) can be set much smaller. We tried several values of the box length difference in the range of 10 −1 %-10 −6 % and confirmed that the obtained pressures converges at the volume difference of 10 −13 % (10 −3 % box length difference).
The computational efficiency of the NS and DIA depends on the number of the total potential calculation. For the NS running, the MC algorithm has to work 6 × 10 3 -8 × 10 3 times each producing 60,000 configurations to reach the convergence, so 3.6 × 10 8 -4.8 × 10 8 times of potential energy calculations must be performed to produce the U m in Equation (6). Because of the fluctuations, the NS was run 15 times for a given system to produce the averaged results, thus the number of the total potential calculations is larger than 5.4 × 10 9 , which is about five orders of magnitude larger than the one, 2 × 10 4 , for running DIA in the same system.
Because of the ultra-high efficiency, DIA was applied to calculate the internal energy and pressure of solid argon composed of 4000 atoms, on which the NS costs too much computer hours and we have to give up the calculations, and we performed MD simulations to give comparisons. As shown in Figure 3, both E and P obtained by DIA coincide well with MD simulations where both RDE and RDP of DIA are almost the same as those calculated in the 500-atom system (data in Supplementary materials).
Finally, a comparison was made of DIA and the NS with experimental data of solid Ar along melting line [37]. Considering the lower efficiency of the NS, the simulated system for both DIA and NS consists of 500 atoms and the computational procedures are the same as described above. As shown in Figure 4, the internal energy and pressure obtained by DIA are significantly better than those of the NS. The averaged relative deviation of internal energy and pressure to the experimental data is 5.34% and 5.5% for DIA, which are about 6 times smaller than the ones, 39.12% and 28.72%, for the NS.

Conclusions
In summary, by comparisons with MD simulations as well as experimental data, we confirmed that the accuracy of DIA outperforms the NS. The precision of DIA is about four times higher than that of NS for low-density systems and about one order higher in high-density situations. We also analyzed the intrinsic deficiency of NS in calculations of systems under highly condensed situations. Since the efficiency of DIA is at least five orders faster than that of the NS at the same time, DIA paves a better way to investigate thermodynamic properties of condensed matters, especially the ones with high density under extreme conditions.