Application of the Consonance Solvent Concept for Accurate Prediction of Buckminster Solubility in 180 Net Solvents using COSMO-RS Approach

The default COSMO-RS (Conductor like Screening Model for Real Solvents) approach is incapable of accurate computation of C60 solubility in net solvents. Additionally, there is no adequate selection of single or multiple reference solvent, which can be applied to the whole population of 180 solvents for improving prediction of mole fraction at saturated conditions. This failure cannot be addressed to inaccurate data of the Buckminster fusion, although they pose a challenge for experimental measurement due to intense sublimation of C60 at elevated temperatures and the possibility of solvates precipitation. However, taking advantage of the richness of experimental data of fullerene solubility, it is possible to identify the source of errors expressed in terms of fluidization affinity. Classification of solvents according to the value of this fluidization term allowed for formulation of a consonance solvents approach, which enables accurate prediction of C60 solubility using the single reference solvent method.


Introduction
Buckminster fullerene is a highly symmetric all carbon molecule, in which structure is represented by truncated icosahedron with a cage-like fused-rings made of twenty hexagons and twelve pentagons.The sixty carbon atoms placed at each vertex of each polygon are covalently bonded along each polygon edge.Fullerene C60 belongs to a broad class of Goldberg polyhedron-like carbon allotropes occurring in the form of spheres, ellipsoids or tubes.It was first generated in 1984 using a laser induced carbon vaporization in a supersonic helium beam [1].The unique structure of C60 results [2] in its unparalleled unique physicochemical properties [3], which were very welcome in many industries taking advantage of nanomaterials as for example biomedicine [4,5], optics, electronics and cosmetics [6].Unfortunately, low solubility of C60 in many organic solvents [7] stands for the major cost of the production from soot [8], since extraction by organic solvents is the first step for obtaining fullerenes-rich fractions further separated using HPLC [9].Moreover, strong tendency of precipitation in the form of solvates [10][11][12][13][14][15][16][17] makes it difficult to preserve purity of the solid.This is why modeling of C60 solubility attracted so much attention and resulted in a variety of theoretical approaches, among which the best predications come so far from non-linear modeling via machine learning [18].Other approaches taking advantage of quantitative structure-property relationships (QSPR) [19][20][21], the multiple linear regression (MLR) [19,22], partial least square regression (PLS) [23], support vector machines (SVMs) [22] and neural networks (NNs) [20] have been reported for predicting the solubility of C60 fullerenes in different organic solvents.Although these models offer quite an acceptable estimate suitable for screening of new solvents, they all rely on the sets of molecular descriptors characterizing solute-solvents properties.Very often these descriptors have no simple physical meaning and the predictive models can seriously suffer from the over-fitting problem in the training phase.
The different philosophy relies on the calculating of bulk phase equilibria from the first principles using well-established quantum chemistry approaches.Among these methods COSMO-RS (Conductor like Screening Model for Real Solvents) [24][25][26] offers an attractive alternative to linear and non-linear chemometrics approaches.Although it was originally formulated for computing thermodynamic properties of bulk liquid systems [27], it was extended also for treatment of interphases [28,29].Particularly solids solubility [30] in organic solvents and their mixtures was the subject of significant interests [31][32][33][34][35][36][37].Although the obtained results are of relatively poor or at most medium accuracy, they still offer valuable insight into the solid-liquid equilibrium phenomena.It is however important to emphasize the uniqueness of solids solubility modelling as requiring a proper thermodynamic formulation.Some basic, but important remarks are provided in the forthcoming paragraph.
Thermodynamic conditions for equilibrated mixtures of a solid-liquid saturated solution are characterized by equal values of chemical potential of pure solute solid phase, µ solid i , and of the solute in the saturated solution, µ l i .Formally this can be expressed as follows: where µ o,l i µ o,solid i are the reference state chemical potentials, γ l i x l i is an activity coefficient at the given value of mole fraction of solute in saturated solution, x l i , R is the universal gas constant and T denotes temperature.The thermodynamic reference state for the solute in the solution is often defined as the pure compound in a hypothetical super-cooled melt state at given temperature.On the other hand, in chemical practice often the activity of the solid phase is set to unity, which means that the reference state for the solid phase is the solid phase itself.Then, the solute activity in the saturated solution represents simply the difference between values of chemical potentials of chosen thermodynamic reference states, which means that ln a l i x l i = ln γ l i x l i •x l i = Hence, activity of the solid phase, which obviously is different from unity, is exclusively solute related and in any saturated solution has exactly the same value provided that the same solid form is preserved.This of course does not mean that solubility represented by the mole fraction of a given solute is the same in different solvents because the activity coefficients are local properties related to concentration and solvent type, a l i x l i .Of course, experimental determination of solubility allows for quantification of activities if activity coefficients are known.Particularly, in the case of ideal solutions solubility equals the activity of the solute and can be deduced just from calorimetric measurements.In the case of real solvents varying solubility simply stands for deviations from the Raoult law and such non-ideality is accounted by the values of activity coefficients.Due to the fact that chemical potential by definition is a molar value of Gibbs free energy at a given temperature and pressure, Equation (3) can be rewritten in terms of pure solute properties leading to the following formula: where ∆G is the value of a hypothetical partial molar Gibbs free energy of the melt at the system temperature and pressure.This value is temperature dependent and becomes zero for the pure solute at its melting point.Knowledge of the temperature related change of heat capacities of solid and melt solute ∆C , allows for computing entropic and enthalpic contributions to the values of fusion Gibbs free energy by the following basic definitions: ∆H where T m represents melting temperature of pure solute.Combining Equations ( 4) and ( 5) leads to the following thermodynamically rigorous relationship (if no phase transition occurs between different solid states): Unfortunately, direct application of this equation is impossible due to the lack of experimental data.With contemporary thermo-analytical techniques the temperature related values of isobaric heat capacity of the solid can be determined below the melting temperature.Additionally C l p i (T), characterizing the solute melt state, can be measured above the melting temperature, if the compound does not decompose.Unfortunately, the super-cooled melt system used as a reference state is in practice experimentally inaccessible far away from the melting temperature, as it is the case in the instance of typical solubility measurements.Hence, to overcome this difficulty several simplifications of Equation (7) were proposed enabling for extrapolation of heat capacities.First of all, it is possible to assume that ∆C f us p i (T) is small in comparison to the other terms and is omitted by imposing a zero value.This means that the enthalpy of fusion is independent of the temperature.This seems to be reasonable for temperatures not too remote from the melting point [38,39].On the other hand, there are strong suggestions that in general this represents serious oversimplification [40][41][42] since change of the heat capacity upon phase transition is very often significant.Hence, the alternative proposed by Hildebrand and Scott [43,44] assumes that ∆C f us p i (T) is constant and can be approximated by the entropy of fusion at the melting temperature.In turn, this value can be approximated by the ratio of fusion enthalpy and system temperature.Although this assumption is not exceptionally accurate, it has been shown [41] that the true value of ∆C f us p i (T) is generally much closer to ∆S f us i (T) than to zero.The importance of this assumption was validated by Moller [45] reporting a mean difference of 10% between these two simplifications.However, a more systematic study on complex substances did not confirm this notion since no significant influence has been found on the overall solubility predictions [46] in relation to assumptions of ∆C f us p i (T) model.There are also different semi-empirical models developed for characteristics of solid-liquid heat capacities enabling much higher precision of solute activity estimation and thermodynamically rigorous extrapolation over a broad range of temperatures [47,48].
On the other hand, Equation ( 7) plays an important role in the area of theoretical solubility predictions and since it defines the problem with two unknowns coming from different sources all the above comments are valid also in this context.Many theoretical models were developed for estimating activity coefficients based on which solubility can be deduced.Among many theoretical approaches the COSMO-RS theory offers a very attractive way of chemical potential characteristics in bulk systems.Foundations of this theory are available in original papers [24][25][26][27], so details will be omitted here.It is suffice to remind that the values of the chemical potentials of compounds in the solvents are calculated by an iterative solution of the following equation: where σ represents the charge density of molecular segment, E(σ,σ') is the sum of electrostatic and hydrogen bonding contributions to the total energy of the system computed as pair-wise additive interactions of molecular surface segments of charge density σ.The molecular surface is defined by the molecule polarization after immersing in dielectric continuum or conductor.This solvation model [25] results in a so called σ-profile, P i (σ), representing simply a histogram of surfaces of a given charge density.The most important information of the molecular polarity and intermolecular interactions is directly encoded in σ-profile.The electrostatics (ES), hydrogen bonding (HB) and van der Waals interactions of contacting surface pieces of charge density σ and σ' are computed based on the pair-wise additivity assumption.The detailed definitions are of less importance here, except for the fact that they comprise scaling factors and adjustable model parameters fitted to a large number of thermodynamic data [25,26].After integrating the segments chemical potentials over the surface of the i-th compound in the multicomponent system, the value of total chemical potential is computed by inclusion of a combinatorial term: The first component plays the role of the residual part and the latter accounts for size and shape effects of solute and solvent [24,27].Since the above equation offers a general way of computing chemical potential of any system, it is directly applicable to liquids solubility prediction.However, for solid solutes a generalization was proposed by accounting also the fusion contribution: This solubility equation is solved iteratively until self-consistency criterion is met by re-computing values of the chemical potentials dependent on the mole fraction of solute in solution at the current iteration.This leads to systematic improvement of the initial guess of solubility representing solubility at infinite dilution in given solvent or solvents mixture.
The solubility in Equation ( 10) can be used also in the reversed manner for Gibbs free energy of fusion computations.Based on COMSO-RS derived chemical potential and experimental solubility fusion data are directly available.This value is to be treated as apparent Gibbs free energy of fusion, which knowledge, along with computed values of chemical potential, is sufficient for exact reproduction of the experimental solubility.
The main advantage of using COSMO calculations for calculating phase equilibria is that the results are obtained from the first principles.However, application of this approach for solid solutes solubility prediction, apart from its own shortcomings of estimation of chemical potentials, is also prone to fusion related inaccuracies as discussed above.The goal of this paper is three-fold and inherently associated with Equation ( 10) First of all, the critical evaluation of quality of predicted values of C60 solubility in 180 net solvents is undertaken.Then, the origin of the observed inaccuracies is discussed in the light of the values of Gibbs free energy of fusion derived from solubility data.Decomposing these data into meaningful contributions provides insight into the source of observed discrepancies between predicted and computed data.Finally, classification of solvents using the consonance solvents approach is proposed and solubility data predicted based on this idea are confronted against experimental data.

Solubility Dataset
Solubility data of C60 in a variety of solvents were compiled by different authors for modeling purposes [18,19,21,49,50].In few cases, some discrepancies were found and averaged values were used in this paper.In Tables 1-3 all experimental values are provided.

Thermodynamic Data of C60 Fusion
So far the fusion of fullerene C60 has not been observed at any temperature and existing estimate values do not offer consensus.For example, the following data were reported: T m > 950 K [51], T m ≈ 1200 K [52] or T m ≈ 2023 K [53].These temperatures are rather debatable since C60 sublimates at temperature between 700 and 945 K [54], depending on the experimental protocol used for the measurements.However, the enthalpy of fusion can be obtained from the available experimental enthalpy of sublimation and an estimated enthalpy of vaporization of the liquid fullerene.This leads to different estimates depending on the assumed values [49].Here, all solubility computations rely on the values of fusion provided by Kulkarni et al. [55].Hence, 25.77 kJ/mol was used for approximate fusion enthalpy as calculated from crystal energy in toluene assessed using solubility parameters approach [55].Instead of melting temperature the sublimation one is assumed here, T m = T sub = 750 K [56].These values were successfully used for C60 solubility prediction in solvent mixtures with an aid of Wohl's equation and Scatchard-Hildebrand theory [57].

COSMO-RS Computations
The structures of solute and solvent molecules were represented by sets of relevant conformations generated using COSMOconf 4.2 and Turbomole 7.0 as the default engine for geometries optimization.The obtained conformers used further for characteristics of bulk systems had their geometries fully optimized using BP functional and TZVP basis set.All structures were generated both in the gas phase and including environment effects via the COSMO-RS [25] solvation model.For solubility computations TZVPD-FINE level was used, which corresponds to single point calculation with TZVPD basis set and the same density functional based on previously generated geometries.The BP-TZVPD-FINE_19.01 parameterization was used for all physicochemical properties computations as implemented in COSMOtherm [Version 19.0.1 (Revision 5259)].

Results and Discussion
The starting point of this project was the frustrating observation of extremely high deviations of values computed using the COSMO-RS approach compared to experimental solubility data of fullerene C60 dissolved in 180 net solvents.This is documented in Figure 1 by the distribution marked with black crosses representing results of default computations done in COSMOtherm on fine level.Practically, there is no correlation between predicted and measured values (R 2 = 0.364).Although the mean absolute percentage error, MAPE = 23% seems to be acceptable, only half of the systems Symmetry 2019, 11, 828 9 of 20 is characterized with absolute percentage error, APE, less than 20%.In the case of cyclopentane the percentage error, PE, exceeds 55% and for pyrrolidine reaches -73%.These values characterize deviations expressed in the decadal logarithm.This means in practice one order of magnitude underor overestimation of mole fractions of C60 in these solvents.For the majority of solvents the solubility of C60 is significantly overestimated by COSMO-RS without any detectable trend related to molecular structure of the solvents.For some alcohols, as for example 1,4-Butanediol, computed C60 solubility perfectly match experimental ones, APE = 1.0%.For other solvents of this type, as for example methanol, a significant discrepancy with APE = 32.2% is observed.Generally poorer performance is noticed for less polar systems containing aliphatic chains and better for systems being able to form hydrogen bonds and with high contributions of electrostatic interactions.For detailed inspection of the origin of this situation the reversed solubility computations were performed for all systems and apparent Gibbs free energy of fusion were generated.

Computations of C60 fusion from solubility measurements
As it was mentioned in the introduction that the procedure of solubility computation in COSMO-RS can be reversed for estimating adequate values of fusion Gibbs free energy based Equation 10 allowing for straightforward determination of Gfus values.This means that if a proper value of such fusion affinity is available normal solubility computations using COSMOtherm would reproduce exactly experimental data.This option relies simply on the reference solubility approach in which chemical potential of the solute is determined based on experimental concentration of saturated solution and computed values of the equilibrium activity of the solute    (   ) .The computed values characterizing Buckminster fusion for the set of available 180 solubility values are presented in Figure 2. Presented distribution of the apparent Gibbs free energy of fusion univocally documents that this value is very strongly dependent on the nature of the solvent.For generalization of the discussion, instead of absolute values of solubility-derived fusion, the relative value with respect to calorimetrically measured data will be used and termed here as fluidization contribution, FLUID =∆ 60, ,− − ∆ 60 , , where ∆ 60 , equals 3.71 kcal/mol at room temperature [55].
The obtained result presented in Figure 2 is rather unexpected since in an ideal situation FLUID should be at least constant if not equal to zero for any solvent at constant temperature and pressure.Evidently this is not the case for fullerene C60 since the deviations from calorimetric fusion value range from -3.4 kcal/mol for nitromethane up to +4.9 kcal/mol in the case of cyclopentane.Data for

Computations of C60 Fusion from Solubility Measurements
As it was mentioned in the introduction that the procedure of solubility computation in COSMO-RS can be reversed for estimating adequate values of fusion Gibbs free energy based Equation ( 10) allowing for straightforward determination of ∆G fus values.This means that if a proper value of such fusion affinity is available normal solubility computations using COSMOtherm would reproduce exactly experimental data.This option relies simply on the reference solubility approach in which chemical potential of the solute is determined based on experimental concentration of saturated solution and computed values of the equilibrium activity of the solute a l i x sat i .The computed values characterizing Buckminster fusion for the set of available 180 solubility values are presented in Figure 2. Presented distribution of the apparent Gibbs free energy of fusion univocally documents that this value is very strongly dependent on the nature of the solvent.For generalization of the discussion, instead of absolute values of solubility-derived fusion, the relative value with respect to calorimetrically measured data will be used and termed here as fluidization contribution, FLUID = ∆G where ∆G f us,cal C60 equals 3.71 kcal/mol at room temperature [55].The obtained result presented in Figure 2 is rather unexpected since in an ideal situation FLUID should be at least constant if not equal to zero for any solvent at constant temperature and pressure.Evidently this is not the case for fullerene C60 since the deviations from calorimetric fusion value range from -3.4 kcal/mol for nitromethane up to +4.9 kcal/mol in the case of cyclopentane.Data for other few extreme cases are also provided in Figure 2.Such pronounced variability of apparent ∆G fus cannot be addressed solely to improper representation of the heat capacities.Additionally, problems with measurements of melting temperature and heat of fusion of C60 cannot justify such strong irregularities.According to equation 9 the relationship between log( 60  ) and Gfus is linear and if inaccuracy of the fusion value equals 1 kcal/mol than the error of solubility expressed in units of decadal logarithm of mole fraction at room temperature will be equal to RTln(1) ≈ 0.74.This means that for proper estimation of solubility a very high accuracy of Gfus is required exceeding the so called chemical accuracy, which is of the order of 1 kcal/mol or 0.03 eV [58].However, in the case of C60 the solvent imposed alterations are much higher, variable and not negligible.For 103 solvents FLUID exceeds 0.5 kcal/mol and for 76 is lower than -0.5 kcal/mol suggesting that C60 is to be considered as a rather non-ideal agent in the majority of solvents despite low concentrations at saturated conditions.In fact, this is supposed to be the reason of very poor C60 solubility predictions by COSMO-RS, as documented in Figure 1.This means that activities coming from the COSMO-RS approach are inaccurately computed not accounting for all necessary contributions characterizing saturated solutions.The span of solubility values is very broad encompassing solvents in which C60 is practically insoluble as for example water (log( 60  = −12.5)and modestly miscible as it is in the case of 1-phenylnaphthalene log( 60  ) = −1.9.
It is not surprising that fluidization terms for cyclopentane and nitromethane have opposite signs due to differences in electronic structure affecting intermolecular interactions in bulk systems.Generally, in the case of non-polar structures as alkanes and cycloalkanes, the fluidization contribution is positive indicating higher affinity of solute to such solvents, what results in overestimation of solubility in the case of using to low Gfus values.To the contrary, highly polar and rich in lone pairs systems, as for example nitromethane or heteroaromatic compounds, are characterized by negative values of FLUID term.Hence, the affinity of solute toward such solvents is lower than expected from uncorrected calorimetric values of Gfus and computed solubility will be According to equation 9 the relationship between log(x sat C60 ) and ∆G fus is linear and if inaccuracy of the fusion value equals 1 kcal/mol than the error of solubility expressed in units of decadal logarithm of mole fraction at room temperature will be equal to RTln(1) ≈ 0.74.This means that for proper estimation of solubility a very high accuracy of ∆G fus is required exceeding the so called chemical accuracy, which is of the order of 1 kcal/mol or 0.03 eV [58].However, in the case of C60 the solvent imposed alterations are much higher, variable and not negligible.For 103 solvents FLUID exceeds 0.5 kcal/mol and for 76 is lower than -0.5 kcal/mol suggesting that C60 is to be considered as a rather non-ideal agent in the majority of solvents despite low concentrations at saturated conditions.In fact, this is supposed to be the reason of very poor C60 solubility predictions by COSMO-RS, as documented in Figure 1.This means that activities coming from the COSMO-RS approach are inaccurately computed not accounting for all necessary contributions characterizing saturated solutions.The span of solubility values is very broad encompassing solvents in which C60 is practically insoluble as for example water (log(x sat C60 = −12.5)and modestly miscible as it is in the case of 1-phenylnaphthalene log(x sat C60 ) = −1.9.It is not surprising that fluidization terms for cyclopentane and nitromethane have opposite signs due to differences in electronic structure affecting intermolecular interactions in bulk systems.Generally, in the case of non-polar structures as alkanes and cycloalkanes, the fluidization contribution is positive indicating higher affinity of solute to such solvents, what results in overestimation of solubility in the case of using to low ∆G fus values.To the contrary, highly polar and rich in lone pairs systems, as for example nitromethane or heteroaromatic compounds, are characterized by negative values of FLUID term.Hence, the affinity of solute toward such solvents is lower than expected from uncorrected calorimetric values of G fus and computed solubility will be underestimated.It is a bit surprising that methanol and also other alcohols were found in the same group as some alkanes or halogenated alkanes but this should be addressed probably to different sources of FLUID value and cancelation of such contributions as electrostatic, hydrogen boding and dispersion.
Before proceeding any further it is necessary to comment on the common phenomenon observed in the case of systems in which high affinity of solute is observed toward solvent molecules.In such cases, it is quite reasonable to expect that new solid forms can precipitate in the saturated solution in the form stable solvates.This in turn would give rise to a different activity for the solid phase due to altered thermodynamic properties of solid phase of different composition.Such multicomponent solids are well known in medicinal chemistry.It was already recognized that variations in the composition of the contents of the stomach/gut may affect the recrystallization kinetics between different solid-state forms of a drug [59,60].This can be addressed not only to polymorphism but more often to solvates as for example it is the case for carbamazepine [61][62][63] and its cocrystals [60].This aspect is also relevant to C60 solid-liquid equilibria in organic solvents.Unfortunately, in reports providing the values of solubility there are not sufficient information about solid phase present in the equilibrium with the saturated solutions.Although no direct clues about C60 solvates can be inferred from solubility measurements this aspect was not overlooked and attracted serious attention by many investigators [10][11][12][13][14][15][16][17][64][65][66][67][68].For example, the dissolution properties of C60 in aromatic solvents were studied using the thermos-analytical approach [10,12] and many solid solvates were identified.This of course is relevant to the data provided in Figure 2 since documented dramatic influence of the thermodynamic properties of solid solvates might be addressed to this phenomenon of C60.

Reference Solvent Computations
The remedy on the lack of G fus is supposed to be the so called reference solubility computation.It is simply an attempt of computing solubility in one solvent based on information of solubility in another.In practice, this is exactly the same procedure as the one used for computing the FLUID term but two solvents are declared in the input files.Since it is not known a priori which reference solvent is the most appropriate for given solvent all possible combinations were considered here.Hence, reference solubility was computed for all 180 × 179 = 32,220 solvents pairs.In each case one solvent was used as a reference one and solubility was computed for the other solvent.Based on obtained results MAPE was computed for each reference solvent.The best result of these computations was provided in Figure 1 as distribution marked with grey diamonds.It happened that the selection of either ethylbenzene, toluene, bromooctane, cis-decahydronaphthalene, propylene glycol, propargyl bromide, 1,5-Pentanediol or n-butylbenzene leads to similar quality of solubility prediction for the rest solvents.Unfortunately, this approach is unsatisfactory as evidenced by the plot collected by Figure 1, where ethylbenzene served as the single reference solvent system.Obviously, the diversity of solvents in the data set prevents from finding one optimal reference system.Hence, this option does not provide any sensible increase of overall accuracy of solubility predictions if a single set of reference solvents is applied to the whole data set.It is also worth mentioning that the reference solvent method cannot help in the cases if solid states in equilibrium with saturated solutions comprise solvates.Since, there is a vast gross of evidences [10][11][12][13][14]16,17,[64][65][66]68] that Buckminster can interact with many solvents, it is quite clear that reference solubility cannot be used neither for serious improvement of predicted solubility data nor for treating cases of unknown fusion characteristics.

Consonance Solvents Classification
Fortunately, the situation is not as hopeless as it might seem from results presented so far since some interesting trends can be revealed after sorting data obtained based on the single reference solvent computations.The 180 × 180 matrix comprising APE values of every pair solvent-reference solvent was sorted systematically by rows and columns with criterion of ascending values of APE.After sorting rows, the columns were rearranged accordingly for keeping the same order of rows and columns.Finally, this resulted in a kind of heat map presented in Figure 3.This operation allows for grouping solvents according to their ability of playing the role of good reference solvent for other solvents found in the near neighborhood in Figure 3. Existence of such regions, encompassing solvents with high similarity in terms of FLUID values, is termed here as the consonance solvent approach.This means that on the map provided in Figure 3, for each case, it is possible to find suitable reference solvent providing quite accurate values of computed C60 solubility.Restricting the expected accuracy to the fine criterion for which APE < 10% of solubility prediction of every pair solvent-reference solvent results in rectangle regions marked with thinner black lines in Figure 3.Alternatively, reducing accuracy to APE < 15% leads to a much lower number of sections marked with bold black squares.In the former case, it is necessary to know thirteen solubility values for proper computation of solubility of the rest of the population.If the modest criterion is accepted only five measurements are indispensable for the same purpose.It happens that solvents in the middle of each class or subclass are the most suited reference solvents.However, three cases are out of this classification for which it is hardly possible to find suitable reference solvent.These are water, cyclopentane and nitromethane.These solvents are treated by COMOS-RS on different manner compared to the rest of the population that the consonance solvent approach is unable to improve solubility predictions.Particularly, water cannot be used as reference solvent due to ultra-low solubility of C60.Consequently, reference solubility predicts complete miscibility in majority cases if water is used as a reference system.Additionally, none of the organic solvents seem to be an appropriate reference for water with one exception.Using nitromethane as a reference solvent leads to 3.8% error of estimated C60 solubility in water.This is probably an accidental artefact.Cyclopentane and nitromethane are those solvents, which were found to be characterized by extreme values of fluidization term, which already anticipated their uniqueness.
The effectiveness of proposed classification is demonstrated in Figure 1 by a series marked with black circles.The mean absolute error is as low as 4.3% and the regression coefficient reached R 2 = 0.977 indicating very high accuracy of computed values.Reducing accuracy to the modest level results in an increase of the overall error up to 8.4% with reduced linearity of the trend to R 2 = 0.904.The coefficient of the linear regression equation provided in Figure 1 suggests that predicted solubility values are still slightly overestimated compared to measurements.The detailed classification with identified consonant solvents is provided in Tables 1-3.
According to the modest criterion, the first class of consonance solvents comprises 49 solvents.This operation allows for grouping solvents according to their ability of playing the role of good reference solvent for other solvents found in the near neighborhood in Figure 3. Existence of such regions, encompassing solvents with high similarity in terms of FLUID values, is termed here as the consonance solvent approach.This means that on the map provided in Figure 3, for each case, it is possible to find suitable reference solvent providing quite accurate values of computed C60 solubility.Restricting the expected accuracy to the fine criterion for which APE < 10% of solubility prediction of every pair solvent-reference solvent results in rectangle regions marked with thinner black lines in Figure 3.Alternatively, reducing accuracy to APE < 15% leads to a much lower number of sections marked with bold black squares.In the former case, it is necessary to know thirteen solubility values for proper computation of solubility of the rest of the population.If the modest criterion is accepted only five measurements are indispensable for the same purpose.It happens that solvents in the middle of each class or subclass are the most suited reference solvents.However, three cases are out of this classification for which it is hardly possible to find suitable reference solvent.These are water, cyclopentane and nitromethane.These solvents are treated by COMOS-RS on different manner compared to the rest of the population that the consonance solvent approach is unable to improve solubility predictions.Particularly, water cannot be used as reference solvent due to ultra-low solubility of C60.Consequently, reference solubility predicts complete miscibility in majority cases if water is used as a reference system.Additionally, none of the organic solvents seem to be an appropriate reference for water with one exception.Using nitromethane as a reference solvent leads to 3.8% error of estimated C60 solubility in water.This is probably an accidental artefact.Cyclopentane and nitromethane are those solvents, which were found to be characterized by extreme values of fluidization term, which already anticipated their uniqueness.
The effectiveness of proposed classification is demonstrated in Figure 1 by a series marked with black circles.The mean absolute error is as low as 4.3% and the regression coefficient reached R 2 = 0.977 indicating very high accuracy of computed values.Reducing accuracy to the modest level results in an increase of the overall error up to 8.4% with reduced linearity of the trend to R 2 = 0.904.The coefficient of the linear regression equation provided in Figure 1 suggests that predicted solubility values are still slightly overestimated compared to measurements.The detailed classification with identified consonant solvents is provided in Tables 1-3.
According to the modest criterion, the first class of consonance solvents comprises 49 solvents.Alkanes and halogenated alkanes were mainly included here.However, surprisingly, also some light alcohols as methanol and ethanol were found within this class.Hexane is suggested as the best reference solvent for this class offering accuracy as good as MAPE = 5.4%.There are however many other solvents, which can be treated as consonance ones with similar applicability.For example, such commonly used solvents as tetrahydrofuran, acetone or propan-2-ol are promising substituents for hexane in reference solubility computations of C60 solubility.As it was enumerated in Table 1, for further increase of solubility predictions this class can be subdivided into three subclasses.This increases the accuracy by a factor of two, reducing MAPE to 2.1%.Hence, very accurate predictions of C60 solubility in all solvents collected in Table 1 are possible provided that three experimental measured values are available one for each subclass A1, A2 or A3.Interestingly, class A is characterized by positive and highest values of fluidization term being in the range from 1.36 kcal/mol for thiophene and 4.02 kcal/mol for pentane.
Similarly, collection of further 60 solvents provided in Table 2 can also be classified according to the consonance rule.Although the best reference solvent for class B is nonan-1-ol but also such solvents as 1,2-dichloroethane, 1,2-dibromoethane, iodobenzene, butanoic acid or decan-1-ol offer similar accuracy.The average percentage error is slightly higher in comparison with class A and is equal to 7.5%.As it is presented in Figure 2 class B is characterized by smaller values of fluidization term being within the range of 0.35 kcal/mol for 1,4-dimethylbenzene and 1.84 kcal/mol for pentan-3-ol.Class B also can be subdivided into three subclasses for increase of accuracy of computed C60 solubility.
Finally, in Table 3 the classification of the remaining 68 solvents is provided.Here are the solvents characterized either by close to zero values of fluidization term or negative values.This class is more heterogeneous compared to the previous two and consists of a larger selection of 59 solvents augmented with two small sets grouping three (class C') or six (class D) solvents.It has been found that 1,5,9-cyclododecatriene is the best reference solvent for class C offering 10.3% accuracy expressed in term of MAPE value.Alternatively, other solvents offer similar accuracy as for example 1-bromooctadecane, thiophenol, 1-bromo-2-chloro-benzene, 1,5,9-cyclododecatriene, 1,2-dichlorobenzene, benzyl bromide or 1,2-dibromocyclohexane.Class C', which merges with class C4 in case of fine classification, comprises solvents for which selection of piperidine seems to be the best reference solvent.Finally, selection of tetralin is suggested as consonance solvent for the final class.

Multiple Reference Solvent Computations
Results presented above tempted for testing if any benefit will result from solubility computations based on information of solubility in several solvents.This option, referred to as the multiple reference solvent approach, was extensively tested here for C60 solubility.There were several conducted trials with a varying number of solvents starting from two and ending on the maximum possible number in the current version of COSMOtherm.Combinations of suggested consonance solvents were used and also randomly selected ones but no significant benefit was achieved as a result of all these attempts.The best results found during this stage were provided in Figure 1 as distribution marked with open grey circles.The conclusion is quite simple and univocal.No sets of multiple reference solvents were found for which the quality of prediction would be comparable to the single consonance solvent approach.

Decomposition of Fluidization Term
In the context of the mentioned variability of fluidization term, a closer inspection of the factors contributing to its values seems to be valuable.Fluidization encompasses all contributions, which are inherently associated with dissolution but not included in the value of chemical potential of solute in the saturated solution computed by COSMO-RS.This might be related to improper accounting for entropy by the combinatory term.In addition, the residual portion of chemical potential might be the source of non-negligible FLUID values due to the inaccurate characteristics of some solvent specific aspects of solubilization including mixing, solvation and solvent structure alterations imposed by solute cavitation in the presence of the solute.Indeed, the incoherence of the COSMO-RS theory in the current formulation was already noticed in the case of strongly interacting species which resulted in formulation of the COMSO-RS-DARE approach [69].This extension of the original method was introduced for proper evaluation of activity coefficients for the binary mixtures of carboxylic acids and non-polar components.Recently the author demonstrated the effectiveness of this approach for ethenzamide solubility predictions [70].
The quantification of fluidization contributions to the values of fusion Gibbs free energy is indispensable for a proper prediction of solids.To the author's best knowledge, this aspect was not raised so far, mainly because of the fact that there is no other substance, which solubility was so extensively studied as fullerene C60.In this case, conducting measurements in 180 solvents with such high chemical diversity, covering all types of solvents enables in depth analysis of the discussed aspect.For further exploring of the feature of the fluidization term, it seems to be valuable splitting it into combinatory contribution and residual one.Additionally, the latter term can be further decomposed into three major energetic contributions used by the COSMO-RS approach for chemical potential computations.
where ∆G MSF f luid is the misfit part of the fluidization term coming from electrostatics interactions, ∆G HB f luid stands for hydrogen bonding contribution and ∆G vdW f luid accounts for all non-specific interactions expressed as van der Waals interactions.All four contributions can be estimated by scaling capabilities of COMSOtherm and enforcing corresponding scaling factors to be equal to zero.The result of separations of these contributions is provided in Figure 4.
First of all, the provided series show a broad trend suggesting that higher solubility is generally associated with lower values of fluidization term and such contributions as vdW and RES.On the contrary, the stronger the electrostatics the lower the solubility of C60.The remaining two contributions, namely HB and COMBI have much smaller values.Moreover, the origin of the observed variation of FLUID comes from the vdW term.It is somewhat compensated with misfit values for those solvents with higher solubility and consequently COMS-RS performs slightly better in such situations.Contrary to this, low solubility systems pose a real challenge to COSMO-RS.
where ∆   is the misfit part of the fluidization term coming from electrostatics interactions, ∆   stands for hydrogen bonding contribution and ∆   accounts for all non-specific interactions expressed as van der Waals interactions.All four contributions can be estimated by scaling capabilities of COMSOtherm and enforcing corresponding scaling factors to be equal to zero.The result of separations of these contributions is provided in Figure 4. First of all, the provided series show a broad trend suggesting that higher solubility is generally associated with lower values of fluidization term and such contributions as vdW and RES.On the contrary, the stronger the electrostatics the lower the solubility of C60.The remaining two contributions, namely HB and COMBI have much smaller values.Moreover, the origin of the observed variation of FLUID comes from the vdW term.It is somewhat compensated with misfit values for those solvents with higher solubility and consequently COMS-RS performs slightly better in such situations.Contrary to this, low solubility systems pose a real challenge to COSMO-RS.
Finally, these conclusions can be also supported by statistical analysis of the distributions by performing the principal component analysis (PCA) on covariance matrix.This step reveals that all contributions involved in Equation 11 can be reduced just to two statistically significant factors encompassing 98.5% of the total variance.Values of loadings provided in Figure 5 suggest that the whole fluidization term is mainly determined by its residual part and these two contributions are highly correlated with PC1 with correlation coefficient as high as -0.99 and -0.98, respectively.The major contribution to the residual part comes from dispersion and electrostatics influences RES in much lesser extent.It is really interesting to see vdW and MSF loading to PC2 not only because of the Finally, these conclusions can be also supported by statistical analysis of the distributions by performing the principal component analysis (PCA) on covariance matrix.This step reveals that all contributions involved in Equation ( 11) can be reduced just to two statistically significant factors encompassing 98.5% of the total variance.Values of loadings provided in Figure 5 suggest that the whole fluidization term is mainly determined by its residual part and these two contributions are highly correlated with PC1 with correlation coefficient as high as -0.99 and -0.98, respectively.The major contribution to the residual part comes from dispersion and electrostatics influences RES in much lesser extent.It is really interesting to see vdW and MSF loading to PC2 not only because of the fact that they predominantly define this component, but also for their opposite signs of their contributions.This explains why so chemically diverse compounds were grouped into the same classes of consonance solvents.Indeed, increase of dispersion is compensated by electrostatic contribution to fluidization term and vice versa.Due to small values of both combinatorial term and hydrogen boding their overall importance is marginal for considered data set what was already documented in Figure 4.
fact that they predominantly define this component, but also for their opposite signs of their contributions.This explains why so chemically diverse compounds were grouped into the same classes of consonance solvents.Indeed, increase of dispersion is compensated by electrostatic contribution to fluidization term and vice versa.Due to small values of both combinatorial term and hydrogen boding their overall importance is marginal for considered data set what was already documented in Figure 4.

Conclusions
Solubility of Buckminster dissolved in 180 net organic solvents was predicted using the COSMO-RS approach.This universal methodology, taking advantage of the first principle quantum chemistry computations augmented with statistical thermodynamics, allows in principle for characteristics of any bulk liquid systems or their interfaces.Formally, application to solid solubility requires data characterizing the fusion process.It was documented that from the perspective of the COSMO-RS approach it is indispensable to distinguish calorimetric contributions to Gibbs free energy of fusion from the fluidization term resulting from inappropriate characteristics of probably both residual and combinatorial parts of chemical potential.The higher the FLUID the lower the experimental solubility, which is also associated with a significant increase of error of computed solubility.
These negative conclusions came from performed computations stimulated for seeking some solution to this discouraging result.This goal has been achieved by proposal of classification of solvents into groups with similar values of fluidization term.It was possible to find sets of solvents used in the reference solvent computations protocol that lead to very accurate prediction of C60 solubility.However, it is not possible to define any set of reference solvents suitable for computation of the solubility for the whole set of solvents.As the rule of thumb solvents within interval of 1 kcal/mol of fluidization term can be mutually exchanged either as the solvent or the reference.
It is of course debatable if the observations made in this paper are of general nature or hold just for Buckminster dissolvent in net organic solvents.What is certain, however, is that observations were possible because of the richness of the experimental data available in the literature and further exploration is worth the effort.
Funding: This research received no external funding.

Conclusions
Solubility of Buckminster dissolved in 180 net organic solvents was predicted using the COSMO-RS approach.This universal methodology, taking advantage of the first principle quantum chemistry computations augmented with statistical thermodynamics, allows in principle for characteristics of any bulk liquid systems or their interfaces.Formally, application to solid solubility requires data characterizing the fusion process.It was documented that from the perspective of the COSMO-RS approach it is indispensable to distinguish calorimetric contributions to Gibbs free energy of fusion from the fluidization term resulting from inappropriate characteristics of probably both residual and combinatorial parts of chemical potential.The higher the FLUID the lower the experimental solubility, which is also associated with a significant increase of error of computed solubility.
These negative conclusions came from performed computations stimulated for seeking some solution to this discouraging result.This goal has been achieved by proposal of classification of solvents into groups with similar values of fluidization term.It was possible to find sets of solvents used in the reference solvent computations protocol that lead to very accurate prediction of C60 solubility.However, it is not possible to define any set of reference solvents suitable for computation of the solubility for the whole set of solvents.As the rule of thumb solvents within interval of 1 kcal/mol of fluidization term can be mutually exchanged either as the solvent or the reference.
It is of course debatable if the observations made in this paper are of general nature or hold just for Buckminster dissolvent in net organic solvents.What is certain, however, is that observations were possible because of the richness of the experimental data available in the literature and further exploration is worth the effort.

21 Figure 1 .
Figure 1.The correspondence between computed and experimental values of fullerene C60 solubility in 180 net organic solvents.Statistics of regression curve denotes accuracy of consonance solvent selection according to fine criterion.

Figure 1 .
Figure 1.The correspondence between computed and experimental values of fullerene C60 solubility in 180 net organic solvents.Statistics of regression curve denotes accuracy of consonance solvent selection according to fine criterion.

Figure 2 .
Figure 2. Correspondence between percentage error and values of fluidization term of Buckminster fusion derived based on experimental solubility data and calorimetric measurements (FLUID = ∆ 60, ,− − ∆ 60 , ).Data points are colored according to solvents consonance classified using fine criterion.Codes represent subclasses enlisted in Tables I-III.

Figure 2 .
Figure 2. Correspondence between percentage error and values of fluidization term of Buckminster fusion derived based on experimental solubility data and calorimetric measurements (FLUID = ∆G f us,COSMO−RS C60,sat − ∆G f us,cal C60 ).Data points are colored according to solvents consonance classified using fine criterion.Codes represent subclasses enlisted in Tables I-III.

Symmetry 2019 , 21 Figure 3 .
Figure 3. Graphical representation of consonance reference approach.Rows and columns ordered with increasing percentage error values were colored using green (low PE values) to red (high PE values) spectrum.Squares define consonance regions according to fine (PE < 10%) and modest (PE < 15%) criterions.The order of solvents is the same as provided in Tables 1-3.

Figure 3 .
Figure 3. Graphical representation of consonance reference approach.Rows and columns ordered with increasing percentage error values were colored using green (low PE values) to red (high PE values) spectrum.Squares define consonance regions according to fine (PE < 10%) and modest (PE < 15%) criterions.The order of solvents is the same as provided in Tables 1-3.

Figure 4 .
Figure 4. Relationship between different fluidization contributions and experimental solubility of C60 in studied solvents.MSF denotes the misfit part of fluidization term coming from electrostatics interactions (∆   ), HB stands for hydrogen bonding contributions (∆   ), vdW accounts for all non-specific interactions expressed as van der Waals interactions (∆   H), RES represents the residual part of the chemical potential by summing of these three terms, COMBI is the combinatorial part of chemical potentials and FLUID is the sum of all contributions.

Figure 4 .
Figure 4. Relationship between different fluidization contributions and experimental solubility of C60 in studied solvents.MSF denotes the misfit part of fluidization term coming from electrostatics interactions (∆G MSF f luid ), HB stands for hydrogen bonding contributions (∆G HB f luid ), vdW accounts for all non-specific interactions expressed as van der Waals interactions (∆G vdW f luid H), RES represents the residual part of the chemical potential by summing of these three terms, COMBI is the combinatorial part of chemical potentials and FLUID is the sum of all contributions.

Figure 5 .
Figure 5. Results of principal component analysis of FLUID and its components including the residual part of the chemical potential (RES), combinatorial term (COMB), contributions coming from electrostatic (MSF), hydrogen boding (HB) and van der Waals (vdW) interactions.

Figure 5 .
Figure 5. Results of principal component analysis of FLUID and its components including the residual part of the chemical potential (RES), combinatorial term (COMB), contributions coming from electrostatic (MSF), hydrogen boding (HB) and van der Waals (vdW) interactions.

Table 1 .
The list of type A of consonance reference solvents suggested according to modest and fine criterions of solvents classification.APE (absolute percentage error) quantifies of solubility computations if the given solvent is used as a reference solvent within the consonance group.Best solvent within each group is marked in bold face.Fluidization term (FLUID) is given in kcal/mol.

Table 2 .
The list of type B of consonance reference solvents suggested according to modest and fine criterions of solvents classification.Notation is the same as in Table1.

Table 3 .
The list of type B of consonance reference solvent suggested according to modest and fine criterions of solvents classification.Notation is the same as in Table1.