Modeling Tracer Diffusion Coefficients of Any Type of Solutes in Polar and Non-Polar Dense Solvents

In this work, a simple two-parameters correlation based on the Rice and Gray, Lennard-Jones, and Stockmayer theories was devised for the calculation of binary diffusion coefficients (D12) of any type of solutes at infinite dilution in polar and non-polar solvents. This equation can be relevant for systems with polar solvents, since most models in the literature fail when strong intermolecular forces predominate in solution. The new correlation embodies the Stockmayer potential without requiring the dipole moments of any component, which significantly enlarges its application. It was validated with the largest D12 database of polar and non-polar dense systems, with 8812 data points (NDP) spanning 553 systems, of which 133 have water as solvent (NDP = 1266), 89 contain polar solvents excluding water (NDP = 1405), 177 have supercritical carbon dioxide (SC-CO2) as solvent (NDP = 5028), and 154 have non-polar or weakly polar solvents excluding SC-CO2 (NDP = 1113). Overall, the model achieved an average deviation of only 3.43%, with accurate and unbiased behavior even for polar systems.


Introduction
Diffusion coefficients, namely binary diffusivities (D 12 ), are of major importance for the chemical and related industries since it is a transport property required for the accurate design and simulation of processes limited by mass transfer kinetics [1]. The experimental measurement of D 12 is expensive in terms of equipment, chemicals, time, and the necessary apparatuses are frequently limited by their intrinsic range of operation that cannot encompass wide intervals of temperature and pressure. Hence, accurate models applicable over a wide range of temperature, pressure, and solvent and solute types (both in terms of size, molecular weight, chemical nature, and polarity) are of utmost importance.
For nonpolar and weekly polar solvent systems, which may or may not include polar solutes, there are already several published models providing good results. In 2021 Zêzere et al. [2] compared various models using a database containing 6180 data points of weakly and non-polar systems, and expressed their performances in terms of the wellknown average absolute relative deviation (AARD): where NDP is the number of data points, and the superscripts calc and exp stand for calculated and experimental, respectively. The hydrodynamic Wilke-Chang equation [3], one of the simplest D 12 models, achieved good predictive results with AARD = 15.64%. Additionally, the predictive hybrid free-volume model of Tracer Liu-Silva-Macedo (TLSM) [4,5] attained an equivalent AARD of 16.84%, while their derived 1-parameter TLSM d and TLSM en correlations achieved average deviations of 4.53% and 4.55%, respectively, which represent a significant improvement over the seminal equation. Finally, the 2-parameters free-volume model of Dymond-Hildebrand-Batschinski (DHB) [6,7] achieved AARD = 4.23%; it is the simplest equation in the literature, and can be linearized to obtain parameters from experimental data.
On the other hand, in the case of polar solvents the situation is quite different with fewer models being applicable due to the increased complexity of the intermolecular interactions, mainly hydrogen bonding [8,9]. For instance, the predictive Wilke-Chang and Tyn-Calus equations [10] reached high deviations (AARD = 30.05% and 28.96%, respectively) when tested with 1994 data points from polar systems [11]. Concerning correlations, the DHB model (2 parameters, AARD = 6.02%) and the Rice and Gray approach of Magalhães et al. [11] (2 parameters, AARD = 4.27%) obtained good results for the same database [11]. Nonetheless, the last correlation requires the knowledge of the dipole moments of both solute and solvent which are frequently non-available or very complex to estimate and, unlike this essay, has been validated with a much smaller database (3463 data points/211 binary systems, from which 1994 data points/141 systems are for polar solvents). Finally, a machine learning approach based on the gradient boost model was developed by Aniceto et al. [12] achieving reliable results (AARD = 5.07%) especially in comparison with the previously mentioned equations.
Various empirical equations are available and can be applied to polar and non-polar solvent systems, namely those proposed by Magalhães et al. [13] and others sparsely tested in the literature [14][15][16][17][18]. Notwithstanding their suitability for D 12 correlation and the good results achieved (e.g., Magalhães et al. [13] obtained AARD = 2.8% for 8219 data points/539 systems), their scope of application is difficult to assess and they are generally tested with few data points (with exception for the work of Magalhães et al.).
In this work an upgraded Rice and Gray-based correlation for the calculation of binary diffusivities is presented and, with the objective to extend its applicability to polar solvent systems, a combined Stockmayer and Lennard-Jones (LJ) potential term is embodied. Unlike a previous approach in the field [11], the proposed model does not require the solvent molar volume at normal boiling point and the dipole moments of both solute and solvent which expands its applicability to any type of solutes in supercritical fluids, liquids, and dense gases involving nonpolar, weakly polar, and polar solvents. The model is tested with the largest database compiled until now comprehending 8812 data points (NDP) from 553 systems, from which 2671 points are from systems with a polar solvent, including water, and the remaining 6141 points are from nonpolar or weakly polar solvent systems where supercritical carbon dioxide (SC-CO 2 ) is included. In relation to the previous work on polar systems [11] this essay includes a much larger database for validation, namely, 5349 (=8812-3463) points of which 677 (=2671-1994) are specific for polar solvents.

Rice and Gray D 12 Correlation
The D 12 model is based on the general Einstein equation [19,20], which relates diffusivity (D) with absolute temperature (T) and a non-null friction coefficient (ξ): where k B is the Boltzmann constant (1.380658 × 10 −16 g cm 2 s −2 K), ξ is the sum of a hard core (ξ H ) and an attractive (ξ S ) contribution, and ξ S embodies a Lennard-Jones (LJ) (ξ S,S ) and a polar (ξ S,P ) term. The ξ H coefficient is expressed by modifying the Enskog equation [21,22] with a correction term F [1]: where ρ n is the number density, σ is the molecule diameter, m is its mass, g(σ) is the radial distribution function at contact, and F is the hard sphere correction factor. The ξ S,S is given by the recipe employed by Ruckenstein and Liu [23], and ξ S,P is proposed embodying the method of Brokaw [24]: where T * is the reduced temperature and δ the Stockmayer parameter [24]. The sum of ξ S,S and ξ S,P is now expressed in terms of a fitting coefficient B = 0.4 + δ 2 : In this way, the Stockmayer parameter is eliminated from Equation (6) and the dipole moments are no longer required for the δ calculation, in opposition to the original work of Magalhães et al. [11]. This approach expands the model applicability, since dipole moments are not frequently available and their accurate estimation is not easy. This is especially true in systems with complex molecules, such as, for instance, astaxanthin and quercetin [25,26] that, in the case of the latter, exhibits 48 possible stable conformers with dipole moments varying from 0.35 to 9.87 Debye under vacuum [27].
The unary quantities of the previous expressions should be now replaced by the corresponding binary ones, giving rise to the core equation: where subscripts 1 and 2 denote solvent and solute, respectively, ρ n,1 is the number density of the solvent, σ 12,eff is the binary effective hard sphere diameter (where the Ben-Amotz-Herschbach expression [28] adopted for the effective calculation), m 12 is the reduced mass of the system, g(σ eff,12 ) is the pair radial distribution function at contact [29], F 12 is the hard sphere correction factor for D 12 [30] and T * 12 is the binary reduced temperature. For the calculation of the above-mentioned properties it is further required to know the critical volume (V c ), the critical temperature (T c ), and the mass (m) of both solute and solvent, and the mass density of the solvent (ρ 1 ) at the desired temperature and pressure. Knowing the properties of the pure compounds one can estimate the previously mentioned properties by recurring to Equations (8)- (19), where σ LJ,j is the LJ diameter of component j [23], ε LJ,j /k B is the LJ energy parameter of j [23], σ LJ,12 and ε LJ,12 /k B are the binary LJ parameters, T * j is the reduced temperature of j, σ eff,j is the effective hard sphere diameter of component j calculated by the Ben-Amotz-Herschbach expression [28], ρ * n,1 is the reduced number density of the solvent, ϕ is the packing fraction of the solvent and F 11 is the correction factor for the self-diffusion coefficient of the solvent [23].
Besides the parameter B 12 explicit in Equation (7), a second fitting constant is now introduced in the model, namely, the binary interaction parameter k 12 embodied in the Lennard-Jones diameter combining rule (Equation (10)).

Parameters Optimization and Model Assessment
The optimization of the D 12 parameters (k 12 and B 12 ) was performed using the Nelder-Mead simplex algorithm described in Lagarias et al. [31] and codified in the fminsearch function of Matlab R2017b [32], adopting AARD (Equation (1)) as objective function. The model performance was assessed in terms of AARD (%) and average relative deviation (ARD, %, Equation (20)). A flowchart of the optimization procedure can be observed in Figure 1.
In general, the fitting results achieved were good. However, some systems/data points were excluded from the database because the data were outside the domain of application of the equations used to estimate F 11 and/or F 12 delivering negative values without physical meaning. This was the case for carbon disulfide [161], methanol [ [160] in acetonitrile, and for 9,10-dimethylanthracene in n-hexane at 298.15 K and 3500 bar [149]. Nevertheless, the exclusion of these points corresponds to only 0.52% of the eligible compiled data.

Results and Discussion
The model performance was evaluated in terms of AARD and ARD. The global results achieved by the model and the results for each subset of systems are presented in Table 1. The individual deviations obtained for each system are provided in Supplementary Material (Tables S2-S5). Overall, the D 12 model offers excellent results with very low AARD and ARD values as evidenced in Table 1 and shown graphically in Figure 2. Altogether, for the 553 studied systems the global AARD is 3.43% and the global ARD is 0.13%. For the four subsets analyzed the AARD values are within 3.23% and 3.96%. As for the individual systems, the top three with the highest deviations correspond to carbon tetrabromide (CBr 4 ) in n-hexane [149], pyrene (C 16 H 10 ) in n-hexane [149], and potassium chloride (KCl) in water [97,100,117], with AARD values of 22.48%, 21.89% and 15.26%, respectively.
Generally, no bias was observed in any subset of systems as demonstrated by the ARD values of −0.36%, −0.31%, 0.16%, and 0.35% for the polar, non-polar, water, and SC-CO 2 systems, respectively. These observations are further highlighted in Figure 1 where one confirms that only three systems exhibit ARD values above 10% or below −10%. Interestingly, these systems are the same as those mentioned above (in the AARD analysis) and present ARD values of −16.94%, −18.07% and −11.15%. The calculated vs. experimental D 12 results illustrated in Figure 3 corroborate the previous AARD and ARD analysis, evidencing the very good performance of the D 12 model. The higher deviations observed for the SC-CO 2 subset are further analyzed in Figure 4, where the relative deviations (RD) against the reduced number density of solvent, ρ * n,1 , are plotted. As illustrated, the higher deviations tend to be found closer and below the critical density where accurate D 12 measurements are frequently difficult to carry out as well as their modeling; hence, such deviations may be expected in advance.  The experimental and calculated D 12 vs. ρ * n,1 is presented in Figure 5 for various solutes and solvents at fixed temperature and pressure. In general, the results are accurately described by the new D 12 model. For the polar systems shown in Figure 5a (i.e., ethanol/propane (A) at 298.15 K, 323.15 K, and 348.15 K [217], ethyl acetate/quercetin (B) at 1 bar [25], and ethanol/palladium(II) acetylacetonate (C) at 1 bar [121]) an excellent representation can be observed especially for systems B and C. Regarding the subset of water as solvent (see Figure 5b) , and α-cyclodextrin (G) [95], all at P = 1 bar, clearly emphasizes the wide range of applicability of the model. For the subset of non-polar and weakly polar solvents excluding SC-CO 2 (see Figure 5c), it is observed that the model predicts D 12 for diverse systems and conditions, including n-octane/xenon (H) at 1 bar [125], n-hexane/ferrocene (I) at 313.15 K [122], and cyclohexane/acetone (J) at 1 bar and 160 bar [118]. Finally, Figure 5d illustrates the results for SC-CO 2 as solvent, namely for ethanol (L) at 313.21 K [190] and eicosapentaenoic acid (K) at 313.15 K and 333.15 K [227], respectively. Notwithstanding the good results achieved for most systems, higher deviations are observed for ethanol at low CO 2 density which correspond to D 12 measurements close to the critical point (see previous comment). Ultimately, it may be concluded that the D 12 model has a wide range of applicability enabling the calculation of diffusion coefficients of simple (e.g., monoatomic species such as xenon) and complex (e.g., α-cyclodextrin and ionic liquids such as [Bmim][PF6]) solutes in polar or non-polar dense solvents, under isothermal or isobaric conditions. A program that allows the estimation of diffusivities using the present model can be found in the Supplementary Material along with instructions on its use.

Conclusions
In this work, an upgraded Rice and Gray-based correlation is proposed for the calculation of binary diffusivities of solutes in polar and non-polar solvents. The model embodies a combined and simplified Stockmeyer and Lennard-Jones potential contribution in D 12 but does not require the dipole moments of either solute or solvent. Moreover, the solvent normal boiling point and solvent molar volume at normal boiling point are also not necessary. Hence, the applicability of the new correlation is significantly extended in terms of chemical species.
The model validation was accomplished using the literature data for 553 polar and non-polar systems (8812 data points) involving diverse solvents and solutes. Overall, the model achieves very low deviations, with AARD = 3.43% and ARD = 0.13%, suggesting a consistent unbiased behavior. The database was divided into four subsets according to the nature of the solvent, namely polar solvents (excluding water), water, and non-polar and weakly polar solvents (excluding SC-CO 2 ), and SC-CO 2 . Excellent results were obtained for all subsets with the exception of the low-density SC-CO 2 region, i.e., near the critical point.

Supplementary Materials:
The following supporting information can be downloaded at: https: //www.mdpi.com/article/10.3390/ma15186416/s1: Software; Table S1. Pure component data, compound name, chemical formula, CAS number, molecular weight, M, critical temperature, T c , and volume, V c ; Table S2. Polar solvents systems studied, number of data point (NDP), reduced number density range (ρ * n,1 range), absolute temperature range (T range), binary interaction parameters (k 12 and B 12 ), average absolute relative deviations (AARD), and source of the diffusion data; Table S3. Water solvent systems studied, number of data point (NDP), reduced number density range (ρ * n,1 range), absolute temperature range (T range), binary interaction parameters (k 12 and B 12 ), average absolute relative deviations (AARD), and source of the diffusion data; Table S4. Non-polar solvent systems studied, number of data point (NDP), reduced number density range (ρ * n,1 range), absolute temperature range (T range), binary interaction parameters (k 12 and B 12 ), average absolute relative deviations (AARD), and source of the diffusion data; Table S5. SC-CO 2 solvent systems studied, number of data point (NDP), reduced number density range (ρ * n,1 range), absolute temperature range (T range), binary interaction parameters (k 12 and B 12 ), average absolute relative deviations (AARD), and source of the diffusion data.