Symmetry
  • Article
  • Open Access

24 October 2020

Fuzzy Divisive Hierarchical Clustering of Solvents According to Their Experimentally and Theoretically Predicted Descriptors

1
Department of Inorganic Chemistry, Faculty of Chemistry and Pharmacy, University of Sofia, 1 James Bourchier Blvd., Sofia 1164, Bulgaria
2
Faculty of Chemistry and Chemical Engineering, Babes-Bolyai University, 400084 Cluj-Napoca, Romania
3
Department of Analytical Chemistry, Chemical Faculty, Gdańsk University of Technology (GUT), 11/12 G. Narutowicza St., 80-233 Gdańsk, Poland
4
Department of Analytical Chemistry, Faculty of Chemistry and Pharmacy, University of Sofia, 1 James Bourchier Blvd., Sofia 1164, Bulgaria
This article belongs to the Special Issue Chemometrics in Assessing Molecular Structures and Properties

Abstract

The present study describes a simple procedure for separating a large group of solvents, 259 in total, into patterns of similarity on the basis of 15 specific descriptors (experimentally found and theoretically predicted physicochemical parameters). Solvent data are usually characterized by high variability, different molecular symmetry, and spatial orientation. Chemometric methods can be used to extract and explore accurately the information contained in such data. To this end, advanced fuzzy divisive hierarchical-clustering methods were applied in the present study to a large group of solvents using specific descriptors. The fuzzy divisive hierarchical associative-clustering algorithm provides not only a fuzzy partition of the solvents investigated, but also a fuzzy partition of the descriptors considered. In this way, it is possible to identify the most specific descriptors (in terms of highest, lowest, or intermediate values) for each fuzzy partition (group) of solvents. Additionally, the partitioning performed could be interpreted with respect to molecular symmetry. The chemometric approach used for this goal is the fuzzy c-means method, a semi-supervised clustering procedure. The advantage of such a clustering process is the opportunity to separate the solvents into similarity patterns with a certain degree of membership of each solvent in a given pattern, as well as to consider possible membership of the same object (solvent) in another cluster. Partitioning based on a hybrid approach combining theoretical molecular descriptors and experimentally obtained ones permits a more straightforward separation into groups of similarity and an acceptable interpretation. It was shown that an important link between the objects’ groups of similarity and the similarity groups of variables is achieved.
Ten classes of solvents are interpreted depending on their specific descriptors; one of the classes includes a single object and could be interpreted as an outlier. Setting the results of this research in a broader perspective, it has been shown that the fuzzy clustering approach provides a useful tool for partitioning by the variables related to the main physicochemical properties of the solvents. It becomes possible to offer a simple guide for solvent recognition based on theoretically calculated or experimentally found descriptors related to the physicochemical properties of the solvents.

1. Introduction

The large number of different solvents used in many important chemical processes and technologies needs special attention, since their properties depend on a range of specific chemical and physical parameters such as melting and boiling point, water solubility, polarity, vapor pressure, density, viscosity, toxicity, and many others.
Solvents can be classified by one of four basic criteria: solvent power (solubility, polarity, acidity/basicity, and related properties/parameters), evaporation rate/boiling point, chemical structure, and hazard classification. The latter identifies both physical hazards (e.g., flash point, flammability, or reactivity) and toxicity. The partitioning based on chemical structure uses three groups: hydrocarbons, oxygenated solvents, and chlorinated solvents [1,2,3].
Parker [1] divides them into: protic, aprotic, and inert according to the dipolarity of the solvent molecules and their ability to act as hydrogen bond donors. One disadvantage of a classification scheme such as this is that the groups are not restraining.
Partitioning of solvents based on physicochemical properties has proved to be a significant and challenging problem [2,3,4,5,6]. Of special interest is a recent study [7] that introduces a new solvent similarity index to aid in finding the most suitable solvent for a specific purpose. The solvent similarity index was calculated for 261 pure solvents at 298 K, and the solvents were classified according to their solvation properties. Pushkarova et al. [8] used the Taft-Kamlet-Abboud solvatochromic polarity functions as empirical characteristics of solvent-solute interactions. The practice of solvatochromic probing is growing rapidly, but classification of media based on these values can be difficult. That paper focuses on artificial neural networks (ANN) for the classification of solvents on the basis of their solvatochromic characteristics. The influence of data variation on the stability of the classification was also studied.
In the study of Gramatica et al. [9], a neural network approach was used for solvent separation. In general, many other chemometric methods, such as regression analysis, factor analysis, or partial least squares regression, have contributed to proper solvent selection for practical needs [3,4,5].
Bradley et al. [10] used the Abraham general solvation model to predict the solvent coefficients for all organic solvents. The models were used to propose sustainable solvent replacements for commonly used solvents.
Recent efforts are concentrated on the application of chemometric strategies as suitable tools for classification of solvents (as objects of the analysis) characterized by many properly selected variables (chemical, structural, and physicochemical descriptors) [11,12,13,14,15]. The majority of the methodologies, such as cluster analysis, principal components and factor analysis, artificial neural networks, partial least squares regression, and discriminant analysis, are well developed and widely used for classification, interpretation, and modeling purposes. A limited number of applications are related to fuzzy analysis [16,17].
Fuzzy clustering and partitioning also find application in solvent characterization [18].
Fuzzy clustering analysis offers unique opportunities for decomposing a large data set into a fixed number of similarity groups or clusters. Classical cluster analysis (hierarchical or non-hierarchical) could achieve similar results, but the strong advantage of the fuzzy partitioning strategy is that it does not assign a given object (or variable) to a single group of similarity; instead, it calculates a membership function for each object. Thus, a single object can be attributed to more than one cluster. This makes the interpretation more flexible, since the specific distribution of objects among clusters can be considered together with the respective degrees of membership. It eliminates the ambiguity in interpretation that arises from the often unavoidable overlapping of clusters.
The major goal of the present study is to achieve a reliable partitioning of a large number of solvents with broad practical use by application of fuzzy partitioning methodology.
In this study, fuzzy divisive hierarchical clustering and the powerful fuzzy divisive hierarchical associative-clustering method, which offers an excellent possibility to associate each fuzzy partition of samples with a fuzzy set of characteristics (descriptors), were successfully applied for the characterization of 259 solvents according to their 15 specific descriptors (experimentally found and theoretically predicted). What is quite new is the partitioning of solvents and their association with different descriptors with high, moderate, and low values. The obtained results clearly demonstrate the efficiency and information power of the advanced fuzzy clustering method in solvent characterization and clustering.

2. Materials and Methods

2.1. Fuzzy Clustering Methods

The application of fuzzy logic to various scientific and technical goals has been discussed for decades [19]. This approach differs from classical hard clustering, where each object of the data set is assigned to exactly one cluster: an object either belongs to a defined cluster or is outside it. Applying fuzzy theory to the problem of finding similarity between objects of interest leads to the conclusion that a particular object can belong simultaneously to more than one cluster, but with different degrees of membership (DOMs) between 0 and 1 [20,21]. In one of the possible approaches, so-called fuzzy c-means clustering (FCM), each cluster is represented by a cluster prototype [22,23] with a respective center, which contains information about the size and the shape of the cluster. The degrees of membership are computed from the distances of the data point to the cluster centers. These distances determine the value of the DOM as well as the cluster properties and shape (point, line, etc.) [24].
There are different algorithms in fuzzy clustering applications, the most used being the binary divisive algorithm and the generalized fuzzy c-means algorithm (GFCM). The fuzzy methods briefly described above and the corresponding software have been described in detail and applied in previous papers [25,26,27,28,29,30].
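The way degrees of membership follow from distances to the cluster centers can be illustrated with a minimal sketch of the textbook fuzzy c-means iteration. This is not the software used in the study (the GFCM variant and its settings are not reproduced here); the function name `fuzzy_c_means`, the fuzzifier `m = 2`, the random initialization, and the small distance floor are all assumptions made for illustration.

```python
import math
import random

def fuzzy_c_means(points, n_clusters=2, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Textbook fuzzy c-means; returns (centers, memberships).

    memberships[k][c] is the DOM of point k in cluster c; each row sums to 1.
    Intended for the common choice m = 2 (m must be greater than 1).
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Assumed initialization: random memberships, normalised per point.
    u = []
    for _ in range(n):
        row = [rng.uniform(0.05, 0.95) for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(n_iter):
        # Prototype update: mean of the points weighted by DOM**m.
        centers = []
        for c in range(n_clusters):
            w = [u[k][c] ** m for k in range(n)]
            total = sum(w)
            centers.append([sum(w[k] * points[k][j] for k in range(n)) / total
                            for j in range(dim)])
        # Membership update: closer prototypes receive a higher DOM.
        shift = 0.0
        for k in range(n):
            # A small floor avoids division by zero when a point sits on a center.
            d = [max(math.dist(points[k], centers[c]), 1e-12)
                 for c in range(n_clusters)]
            inv = [dc ** (-2.0 / (m - 1.0)) for dc in d]
            total = sum(inv)
            new_row = [v / total for v in inv]
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, u[k])))
            u[k] = new_row
        if shift < tol:
            break
    return centers, u

# Usage on made-up one-dimensional data with two obvious groups.
demo_points = [[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]]
demo_centers, demo_doms = fuzzy_c_means(demo_points, n_clusters=2)
for point, doms in zip(demo_points, demo_doms):
    print(point, [round(v, 3) for v in doms])
```

Unlike a hard assignment, every object keeps a DOM in each cluster, so borderline objects can be recognized by memberships that are spread across several clusters.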

2.2. Data Set

The dataset consists of 269 solvents. Each solvent was described by 15 variables (molecular descriptors and experimentally obtained properties) shown below in Table 1.
Table 1. Molecular descriptors and experimentally obtained properties.
In the present study, the following set of subprograms implemented in the EPI Suite™ version 4.10 were used: MPBPWIN™, WATERNT™, HENRYWIN™, KOAWIN™, KOWWIN™, and BCFBAF™.
The MPBPWIN™ module in EPI Suite™ was used to predict the melting point (MP), boiling point (BP), and vapor pressure (VP). MPBPWIN™ estimates the melting point by two methods: (1) the Joback method (a group contribution method); (2) the Gold and Ogle method, MP = 0.5839 * BP (in K). The boiling point is estimated by an adaptation of the Stein and Brown (1994) method, which is also a group contribution method. Vapor pressure is estimated by three methods: (1) the Antoine method, (2) the modified Grain method, and (3) the Mackay method. WATERNT™ estimates water solubility directly using a “fragment constant” method similar to that used in the KOWWIN™ program.
The Henry’s law constant (air/water partition coefficient) is estimated by the subprogram HENRYWIN™ using both the group contribution and the bond contribution methods.
The KOAWIN™ program estimates the logarithm of the octanol–air partition coefficient (KOA) of an organic compound from the compound’s octanol–water partition coefficient (KOW) and Henry’s law constant (HLC). For KOAWIN™, only a chemical structure is needed for the estimation of KOA; structures are entered as SMILES codes (Simplified Molecular Input Line Entry System). KOA can be predicted from the octanol–water partition coefficient (KOW) and Henry’s law constant (H) by the following equation:
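As a small worked example of the Gold and Ogle relation quoted above, the sketch below applies MP = 0.5839 * BP with both temperatures in kelvin. The function names are hypothetical and this is not the MPBPWIN™ code, which combines several estimates; it only shows the arithmetic of this one relation.

```python
def celsius_to_kelvin(t_celsius: float) -> float:
    """Convert a temperature from degrees Celsius to kelvin."""
    return t_celsius + 273.15

def gold_ogle_melting_point_k(boiling_point_k: float) -> float:
    """Estimate the melting point (K) from the boiling point (K): MP = 0.5839 * BP."""
    if boiling_point_k <= 0.0:
        raise ValueError("boiling point must be a positive absolute temperature")
    return 0.5839 * boiling_point_k

# Usage: a hypothetical liquid boiling at 100 degrees Celsius.
bp_k = celsius_to_kelvin(100.0)
print(round(gold_ogle_melting_point_k(bp_k), 2))
```

Because the relation is a simple proportionality, it must be applied to absolute temperatures; applying it to Celsius values would give a different and meaningless result.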
KOA = KOW(RT)/H
where R is the ideal gas constant and T is the absolute temperature. KOA and KOW are unitless values. H/RT is the unitless Henry’s law constant, also known as the air–water partition coefficient (KAW).
Therefore, the equation to estimate KOA is:
KOA = KOW/KAW
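The two relations above can be combined into a short calculation on the logarithmic scale, log KOA = log KOW - log KAW with KAW = H/RT. The sketch below assumes that H is given in atm·m³/mol, so the gas constant is taken as 8.205736e-5 atm·m³/(mol·K); the function name `log_koa` and the default temperature of 298.15 K are illustrative choices, not part of KOAWIN™.

```python
import math

# Gas constant in atm·m^3/(mol·K); assumes H is expressed in atm·m^3/mol.
R_ATM_M3 = 8.205736e-5

def log_koa(log_kow: float, henry_atm_m3_per_mol: float,
            temperature_k: float = 298.15) -> float:
    """Return log10(KOA) from log10(KOW) and Henry's law constant H.

    Uses KAW = H / (R * T) and KOA = KOW / KAW.
    """
    if henry_atm_m3_per_mol <= 0.0 or temperature_k <= 0.0:
        raise ValueError("H and T must be positive")
    kaw = henry_atm_m3_per_mol / (R_ATM_M3 * temperature_k)
    return log_kow - math.log10(kaw)

# Usage with made-up inputs: log KOW = 3 and H chosen so that KAW = 0.01.
h_example = 0.01 * R_ATM_M3 * 298.15
print(round(log_koa(3.0, h_example), 3))
```

If H is available in other units (for example Pa·m³/mol), the gas constant has to be expressed in the matching units before the ratio H/RT is formed.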
The KOWWIN™ program predicts the octanol–water partition coefficient. The prediction in KOWWIN™ is based on a “fragment constant” methodology, in which the starting structure is divided into fragments that are then evaluated.
The comparison with the available experimental data shows a high level of correlation. In this way, missing data in the large data set can be replaced by calculated values.

3. Results and Discussion

3.1. Fuzzy Divisive Hierarchical Clustering of Descriptors

The fuzzy clustering of the variables (15 in total) aims to check the following:
  • Whether the experimental values of the respective variables conform with the calculated ones (i.e., whether they fall within the same fuzzy cluster with a high membership function);
  • Whether the partitioning procedure can determine stable groups of similarity between the variables with high DOM;
  • Which descriptors could be informative for classification of the solvents of interest.
In the supplemental information section (Supplement T1) the fuzzy partitioning results for 15 variables are presented. In total, 28 groups are considered. The summary of the final partitioning is shown below:
  • A1—only HLc is included (a typical outlier)
  • A2—MPe MPc BPe BPc Dens WSe WSc VPe VPc HLe logKOWe LogKOWc logKOAc logBCF (the rest of the variables show a high level of similarity with a distinct difference from HLc).
In the next steps of fuzzy partitioning respective groups of similarity based on DOM will be sought.
  • A21—MPe MPc BPe BPc Dens VPe VPc HLe logKOWe LogKOWc logKOAc logBCF
  • A22—WSe WSc
In this partitioning stage, the experimentally found and theoretically calculated values of water solubility are extracted as a group of similarity different from the rest of variables in subgroup A21.
  • A211—BPe BPc
  • A212—MPe MPc Dens VPe VPc HLe logKOWe LogKOWc logKOAc logBCF
  • A2111—BPe
  • A2112—BPc
  • A2121—Dens
At this level of fuzzy partitioning, one finds separation between the experimental and calculated values of boiling point and density. According to the DOM values, the differences are small and the similarity between these three variables is significant.
  • A2122—MPe MPc VPe VPc HLe logKOWe LogKOWc logKOAc logBCF
  • A21221—VPe VPc HLe logKOWe LogKOWc logKOAc logBCF
  • A21222—MPe MPc
  • A212211—VPe VPc
Two other groups of similarity are separated at this stage: melting point (experimental and theoretical values) and vapor pressure (experimental and theoretical values).
  • A212212—HLe logKOWe LogKOWc logKOAc logBCF
  • A2122111—VPc
  • A2122112—VPe
  • A2122121—logBCF
This stage of fuzzy partitioning reveals a slight difference between vapor pressure (theoretical and experimental values), and a more specific role of logBCF as compared to the stable group of logKOW (experimental and theoretical values) and logKOAc (calculated values).
  • A2122122—HLe logKOWe LogKOWc logKOAc
  • A21221221—HLe logKOWe LogKOWc
  • A21221222—logKOAc
  • A212212211—LogKOWc
  • A212212212—HLe logKOWe
  • A2122122121—logKOWe
  • A2122122122—HLe
  • A212221—MPe
  • A212222—MPc
  • A221—WSc
  • A222—WSe
The fuzzy partitioning carried out for 15 variables characterizing a set of solvents revealed the following fuzzy linkage of the variables:
  • Very good agreement between experimentally determined and theoretically calculated values of the variables characterizing the solvents; this means that if experimental values for some solvents are missing, calculated substitutes can be successfully used for classification and interpretation goals;
  • HLc was defined as a typical outlier;
  • The group of variables characterizing the distribution between different media (important for determining toxicity properties) is very compact;
  • The parameters characterizing physicochemical properties (MP, BP, WS, and VP) show various types of similarity with the other parameters: water solubility is the most distant from the rest of the parameters, followed by BP and MP; density is closest to BP; logBCF is slightly different from the rest of the “toxicity estimates.”
Additional material can be found in the Supplementary Materials.
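The labelling scheme of the partition history above (A1, A2, A21, A22, and so on) can be reproduced with a minimal sketch of a binary divisive procedure: each group is split into two fuzzy clusters, every member goes to the cluster with its higher DOM, and the children take the parent label with "1" or "2" appended. This is only an assumed outline; the stopping rule (`min_size`, `max_depth`), the two-cluster fuzzy c-means helper `fcm_two`, and the hard assignment by maximal DOM are illustrative choices, not the authors' implementation.

```python
import math
import random

def fcm_two(points, m=2.0, n_iter=200, tol=1e-8, seed=0):
    """Two-cluster fuzzy c-means; returns one [dom_1, dom_2] pair per point."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    u = [[p, 1.0 - p] for p in (rng.uniform(0.05, 0.95) for _ in range(n))]
    for _ in range(n_iter):
        centers = []
        for c in range(2):
            w = [u[k][c] ** m for k in range(n)]
            total = sum(w)
            centers.append([sum(w[k] * points[k][j] for k in range(n)) / total
                            for j in range(dim)])
        shift = 0.0
        for k in range(n):
            # Floor on the distance avoids division by zero for coincident points.
            d = [max(math.dist(points[k], centers[c]), 1e-12) for c in range(2)]
            inv = [dc ** (-2.0 / (m - 1.0)) for dc in d]
            new_row = [v / sum(inv) for v in inv]
            shift = max(shift, abs(new_row[0] - u[k][0]))
            u[k] = new_row
        if shift < tol:
            break
    return u

def divisive_partition(points, names, label="A", min_size=1, max_depth=12):
    """Recursively split into two fuzzy clusters; returns {leaf_label: [names]}."""
    if len(points) <= min_size or max_depth == 0:
        return {label: list(names)}
    u = fcm_two(points)
    first = [k for k in range(len(points)) if u[k][0] >= u[k][1]]
    second = [k for k in range(len(points)) if u[k][0] < u[k][1]]
    if not first or not second:  # no usable split: keep the group as a leaf
        return {label: list(names)}
    leaves = {}
    for suffix, idx in (("1", first), ("2", second)):
        leaves.update(divisive_partition([points[k] for k in idx],
                                         [names[k] for k in idx],
                                         label + suffix, min_size, max_depth - 1))
    return leaves

# Usage on made-up data: four hypothetical variables described by one value each.
demo = divisive_partition([[0.0], [0.1], [10.0], [10.1]], ["v1", "v2", "v3", "v4"])
for leaf_label in sorted(demo):
    print(leaf_label, demo[leaf_label])
```

Reading a label from left to right gives the path through the hierarchy, so two variables that share a long label prefix stayed together through many splitting steps.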

3.2. Fuzzy Divisive Hierarchical Clustering of Solvents

To compare the partitions, and the similarity and differences of the investigated solvents, we have to analyze both the characteristics of the prototypes corresponding to the partitions hierarchy obtained by applying fuzzy divisive hierarchical clustering and DOMs of solvents corresponding to all fuzzy partitions. The results presented in Table 2 clearly illustrate the most specific characteristics of each fuzzy partition and their similarity and differences.
Table 2. Final fuzzy partitioning.
The initial two clusters A1 and A2 indicate that one typical outlier is present in the list of solvents—perfluorooctane, whose properties are completely different from those of the other 268 solvents. The further divisive fuzzy clustering indicates the level of the membership function of each solvent into each of the next groups included (22 in total).
Next, Table 2 shows the final fuzzy partitioning with the prototypes of the partitions, ranked solvents for each group and the range of DOM.

3.3. Fuzzy Divisive Hierarchical Associative-Clustering of Solvents and Descriptors

To compare the partitions and the similarities and differences of the solvents, we have to analyze the DOMs corresponding to all fuzzy partitions for both the samples and the characteristics (descriptors). The results obtained by applying the fuzzy divisive hierarchical associative-clustering method to the descriptor data, i.e., the fuzzy partitioning of the solvents with the descriptors related to each fuzzy partition (cluster), are presented in Table 3. By carefully analyzing the fuzzy partitions at each level (partition history/hierarchy) in parallel with the descriptor data, the following remarks can be made.
Table 3. The fuzzy partitioning of the solvents and variables (descriptors).
For the final goal, fuzzy partitioning of the objects (solvents) was performed using 10 variables (only the experimentally found ones) (Table 4).
Table 4. Solvents included in each class and the respective class descriptor.
The fuzzy partitioning performed reveals the following classes of solvents:
  • Class 1 (WSe): 9 19 38 52 53 54 102 117 157 162 179 202 217 223 259 57 72 83 96 104 107 128
Iodoethane Diethyl glutarate 1,1-dichloroethane Dimethyl phthalate quinoline 2,4-dimethyl-3-pentanone n-butyl acetate 1,2-dichloroethane Glycerol-1,3-Dibutyl ether m-cresol Dimethyl adipate Glycerol-1,2,3-triethyl ether diethyl carbonate Caprylic acid diethanolamide DMEU 1-hexanol 4-methyl-2-pentanone Butylacetate 1-chloropropane 2-chloropropane 2-Methyltetrahydrofuran Diethyl succinate.
Mainly chlorinated solvents and similar ones, except for diethyl carbonate and butyl acetate (CHLORINATED SOLVENTS CLASS), with major descriptor WSe (experimental water solubility value; calculated water solubility gives the same separation).
  • Class 2 (Dens): 144 145 184 200 218 227 231 233 234
Bromoethane Di-isopropyl ether Benzonitrile Acetophenone Isobutyl acetate Di-n-propyl ether Carbon disulfide Benzaldehyde Chloroform.
Nonpolar and volatile solvents, except for isobutyl acetate (NON-POLAR AND VOLATILE SOLVENTS CLASS), with major descriptor Dens (density).
  • Class 3 (VPe): 7 40 63 79 131 146 155 166 205 206 207 209 215 243 247 250 255 257 258 269
Benzene 1,1,1-trichloroethane Dichloromethane Methyl formate Diethyl ether 1,1-dichloroethylene Oleyl alcohol Fluorobenzene Tetraethylene glycol Ricinoleic acid Triethylene glycol Butyl stearate Methyl benzoate 1,1,2,2-tetrachloroethane 1,8-Cineole gamma-Valerolactone Nopol alpha-Terpineol beta-Terpineol PolyEthyleneGlycol 200.
This is a mixture of polar solvents—acids and esters with non-polar ones such as benzene or 1,1,1-TCA (POLAR AND NON-POLAR SOLVENTS MIXED CLASS separated mainly by descriptor vapor pressure experimental values; the calculated value gives the same results).
  • Class 4 (MPe): 81 101 106 185 216 240 260 268
Glycerol triacetate Oleic acid Menthanol Diisooctylsuccinat Dioctylsuccinate Isosorbide dioctanoate DPMU PolyEthyleneGlycol 600.
POLAR SOLVENTS CLASS I, defined by the descriptor melting point.
  • Class 5 (logKOA): 11 21 22 87 95 119
2-Pyrrolidone Sulfolane Propylene carbonate N-methylacetamide Glycerol Water
POLAR SOLVENTS CLASS II defined by descriptor logKOA.
  • Class 6 (logKOWe): 10 93 98 126 220 221 244
Phenetole Diisobutyl adipate Geranyl acetate Menthanyl acetate Trichloroethylene Pentyl acetate 1-Octanol.
The group consists of polar solvents, except for trichloroethylene: POLAR SOLVENTS CLASS III, defined by logKOW (experimental or theoretical).
  • Class 7 (logBCF): 8 16 25 27 59 69 70 71 92 105 118 127 132 134 239
Ethyl myristate Isoamyle acetate Butyl laurate Methyl abietate 2,6-dimethyl-4-heptanone N,N-dimethylaniline Nitrobenzene Benzyl benzoate Methyl stearate Methyl myristate Dimethyl 2-methylglutarate 1,1,3,3-tetramethyl urea Diethyl phthalate Diethyl adipate Glycerol-1,2,3-tributyl ether.
The group of polar ones, except for nitrobenzene—POLAR SOLVENTS CLASS IV defined by logBCF.
  • Class 8 (HLe): 2 4 36 78 82 90 99 103 116 120 125 130 229
Ethyl laurate Glycerol-1,2-dibutyl ether Acetyltributyl citrate Diisoamylsuccinate N,N-Diethylolcapramide Dibenzyl ether Butyl palmitate Methyl linolenate Methyl ricinoleate Methyl laurate Ethyl benzoate Dibutyl sebacate Anisole.
The group of HIGH MOLECULAR WEIGHT POLAR SOLVENTS, defined by HLe (experimental).
  • Class 9 (BPe): 5 6 13 15 17 18 23 24 26 34 35 37 42 43 44 46 47 48 49 50 55 60 61 62 64 66 68 73 74 76 80 88 89 97 108 109 110 111 112 113 114 115 123 124 129 133 135 140 188 267
1 3 12 14 28 29 30 31 32 33 39 41 45 51 56 58 65 67 75 77 84 85 86 91 94 100 121 122 136 137 138 139 141 142 143 147 148 149 150 151 152 153 154 156 158 159 160 161 163 164 165 167 168 169 170 171 172 173 174 175 176 177 178 180 181 182 183 186 187 189 190 191 192 193 194 195 196 197 198 199 201 203 204 208 210 211 212 213 214 219 222
224 225 226 228 230 232 235 236 237 238 241 242 245 246 248 249 251 252 253 254 256 261 262 263 264 265 266
This group is quite large. Most of the solvents are polar except for: carbon tetrachloride, xylenes, and bromobenzene.
Cyclohexanol Isododecane Di-n-butyl acetate 1-chlorobutane Glycerol-2-methyl monoether m-dichlorobenzene p-Cymene Methyl palmitate Isopropylacetate chlorobenzene Isopropyl palmitate 2,6-dimethylpyridine 1-bromobutane Butyl myristate Furfurylic alcohol 1-2,4-dimethylpyridine Dihydromyrcenol 3-Hydroxypropionic acid Benzyl alcohol Cyclohexanone 1,3-Dioxan-5-ol Diisobutyl succinate Glycerol-2-ethyl monoether Toluen Methyl Linoleate.
N,N-Dimethyldecanamide N-methylformamide Cyclopentane Propylene glycol Iodobenzene Glycerol-1,3-dimethyl ether Piperidine o-xylene Aniline Diisobutyl glutarate Tetrahydrofurfurylic alcohol 3-Methoxy-3-methyl-1-butanol p-xylene cis-decaline Dimethylisosorbide mesitylene Glycerol-1,2-dimethyl ether Isopropyl myristate d-Limonene 1,3-Dioxolane-4-methanol Propionic acid N-decane Carbon tetrachloride Cyclopentyl methyl ether N-pentane Triethylamine Propyl formate Ethanol Ethyl acetate 1-Butanol 4-picoline 3-methyl-2-butanone n-Propyl acetate propionitrile Dimethyl sulfoxide 1,3-Dioxolane Cyclohexane Formamide Diethylamine Iso-octane Glycerol-1,2,3-trimethyl ether Dimethyl succinate 1,3-Propanediol Propylene glycol Butyronitrile N,N-dimethylformamide Ethyl formate β-Pinene 2,2,2-trifluoroethanol 3-pentanone Pyridine 2-pentanone n-heptane 3-Butyl-1-methylimidazolium tetrafluoroborate 1-Decanol N-methyl-pyrrolidin-2-one α-Pinene 1,2-dimethoxyethane 2-methoxyethanol Methyl oleate Decamethylcyclopentasiloxane Diethylene glycol Glycerol-2-butyl monoether Tributylamine 1-pentanol EthylHexyllactate Nitromethane.
Tert-butyl alcohol 1,4-dioxane Glycerol-1-ethyl monoether Cyclohexene N,N-dimethylacetamide Ethyl palmitate 5-(Hydroxymethyl)furfural 2-butanone 2-methyl-2-butanol styrene Methyl acetate Pyrrolidine N,N-Dimethyloctanamide Glycerol carbonate Acetone 2-aminoethanol tert-butyl methyl ether Acetylacetone 3-picoline Dipropyleneglycol.
2-pentanol n-butylamine Diphenyl ether 2-propanol Ethylene glycol Ethyl linolenate Methanol Cyclopentyl methyl ether Nitroethane Phenol Isobutyl alcohol Ethylenediamine.
β-Farnesen Tetrachloroethylene Tetrahydrofuran 3-pentanol Methyl 5-(dimethylamino) 2-methyl-oxopentanoate 2,4,6-trimethylpyridine Glycerol-1,3-diethyl ether 2-butanol Acetic anhydride Ethyl linoleate trifluoroacetic acid n-hexane Ethyl lactate Cyclopentanon o-dichlorobenzene 3,3-dimethyl-2-butanone Dimethyl glutarate 1-propanol Glycerol-1-methyl monoether n-octane m-xylene Bromobenzene Choline acetate Ethyl oleate Acetic acid.
Acetonitrile Glycerol-1,2,3-tributyl ether morpholine 3-methyl-1-butanol Acetone 1,4-Cineole Terpineol acetate 2-Furfuraldehyde beta-Myrcene Terpinolene Cyclademol Glycofurol (n = 2) Solketal HMPTA DEGDEE DEGDME Ethyl propionate TEGDME Dimetylsulfoxide.
  • Class 10 (HLc): Outlier Perfluorooctane 20
The solvents underlined above do not fit the logic of the similarity classes and appear odd rather than reasonable as members of the respective class (polar, non-polar, or volatile solvents determined by specific variables). A careful check of the position of these 12 solvents in the fuzzy partitioning groups indicates that all of them have a quite low maximal value of DOM as determined by fuzzy analysis (this value is shown next to the name of the solvent).
The few exceptions found (only 9 out of 259 solvents), namely (diethyl carbonate, benzene, nitrobenzene); (o-, m-, p-xylene); and (carbon tetrachloride, bromobenzene, trichloroethylene), result from their low maximal DOM, so their position in one group of similarity is not stable and they could be considered either as members of the group with low probability, or as members of a different class.
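The check described above, assigning each solvent to the class with its maximal DOM and treating a low maximal DOM as a sign of an unstable position, can be sketched as follows. The function name `assign_with_stability`, the threshold of 0.5, and the membership values are invented for illustration; they are not taken from the study's tables.

```python
def assign_with_stability(doms, threshold=0.5):
    """Map each object to (best_class, max_dom, is_stable).

    doms: {object_name: {class_name: degree_of_membership}}
    threshold: assumed cut-off below which the assignment is flagged as unstable.
    """
    result = {}
    for name, memberships in doms.items():
        best = max(memberships, key=memberships.get)
        top = memberships[best]
        result[name] = (best, top, top >= threshold)
    return result

# Invented DOM values for two hypothetical solvents (illustration only).
example = {
    "solvent_a": {"class_1": 0.82, "class_2": 0.10, "class_3": 0.08},
    "solvent_b": {"class_1": 0.36, "class_2": 0.34, "class_3": 0.30},
}
for name, (cls, dom, stable) in assign_with_stability(example).items():
    print(name, cls, dom, "stable" if stable else "low maximal DOM")
```

An object such as the second one, whose memberships are nearly equal across classes, would be a candidate for membership in a different class, which is how the exceptions listed above are interpreted.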
Table 5 summarizes the results for the classes obtained.
Table 5. Defined classes of solvents with the descriptors.
The table could be used as a practical guide for selection of type of solvents based on their physicochemical properties.

4. Conclusions

The fuzzy hierarchical clustering of a large group of solvents into 10 classes of similarity made it possible to find patterns of chemicals with specific properties, separated by important descriptors. The fuzzy partitioning method applied helped in finding relationships between solvents of various nature (polar, non-polar, volatile, etc.) and the physicochemical variables used. Additionally, the chemometric analysis has shown that if data for specific descriptors are missing, they can be calculated theoretically with a very high level of approximation to the experimentally observed and established physicochemical indicators.
Thus, the present study offers a simple methodological approach to the complex problem of solvent partitioning.
In order to understand the similarities and differences of various solvents, fuzzy divisive hierarchical clustering and fuzzy divisive hierarchical associative-clustering were successfully applied. The fuzzy partition hierarchy of solvents and associated descriptors made it possible to identify partitions (groups) of solvents with more or less similar characteristics in terms of highest, lowest, or intermediate values of the descriptors considered.

Supplementary Materials

The following are available online at https://www.mdpi.com/2073-8994/12/11/1763/s1.

Author Contributions

Conceptualization, M.N.; methodology, C.S., M.N.; software, C.S., M.N.; validation, M.N., V.S.; formal analysis, M.N., M.T., C.S., V.S.; investigation, M.N.; resources, M.N.; data curation, C.S., M.N.; writing—original draft preparation, M.N., C.S., M.T., V.S.; writing—review and editing, M.N., V.S.; visualization, M.N. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the project “Information and Communication Technologies for a Single Digital Market in Science, Education and Security” of the Scientific Research Center, grant number NIS-3317, and by National roadmaps for research infrastructures (RIs), grant number NIS-3318. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

Acknowledgments

The author M.N. is grateful for the additional support by the project “Information and Communication Technologies for a Single Digital Market in Science, Education and Security” of the Scientific Research Center, NIS-3317 and National roadmaps for research infrastructures (RIs) grant number [NIS-3318].

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Parker, A. Protic-dipolar aprotic solvent effects on rates of bimolecular reactions. Chem. Rev. 1967, 69, 1–32.
  2. Tobiszewski, M.; Nedyalkova, M.; Madurga, M.; Pena-Pereira, F.; Namieśnik, J.; Simeonov, V. Pre-selection and assessment of green organic solvents by clustering chemometric tools. Ecotoxicol. Environ. Saf. 2018, 147, 292–298.
  3. Katritzky, A.; Fara, D.; Kuanar, M.; Hur, E.; Karelson, M. The Classification of Solvents by Combining Classical QSPR Methodology with Principal Component Analysis. J. Phys. Chem. A 2005, 109, 10323–10341.
  4. Molnar, M.; Komar, M.; Brahmbhatt, H.; Babić, J.; Jokić, S.; Rastija, V. Deep Eutectic Solvents as Convenient Media for Synthesis of Novel Coumarinyl Schiff Bases and Their QSAR Studies. Molecules 2017, 22, 1482.
  5. Chastrette, M.; Rajzmann, M.; Chanon, M.; Purcell, K. Approach to a general classification of solvents using a multivariate statistical treatment of quantitative solvent parameters. J. Am. Chem. Soc. 1985, 107, 1–11.
  6. Laurence, C.; Legros, J.; Chantzis, A.; Planchat, A.; Jacquemin, D. A Database of Dispersion-Induction DI, Electrostatic ES, and Hydrogen Bonding α1 and β1 Solvent Parameters and Some Applications to the Multiparameter Correlation Analysis of Solvent Effects. J. Phys. Chem. B 2015, 119, 3174–3184.
  7. Driver, M.; Hunter, C. Solvent similarity index. PCCP 2020, 22, 11967–11975.
  8. Pushkarova, Y.; Kholin, Y. The classification of solvents based on solvatochromic characteristics: The choice of optimal parameters for artificial neural networks. Cent. Eur. J. Chem. 2010, 10, 1318–1327.
  9. Sahigara, F.; Ballabio, D.; Todeschini, R.; Consonni, V. Defining a novel k-nearest neighbors approach to assess the applicability domain of a QSAR model for reliable predictions. J. Cheminform. 2013, 5, 27.
  10. Bradley, J.-C.; Abraham, M.H.; Acree, W.E.; Lang, A.S.I.D. Predicting Abraham model solvent coefficients. Chem. Cent. J. 2015, 12, 2–10.
  11. Johnson, A.R.; Vitha, M.F. Chromatographic Selectivity Triangles. J. Chromatogr. A 2011, 1218, 556–586.
  12. Katritzky, A.R.; Tamm, T.; Wang, Y.; Sild, S.; Karelson, M. A Unified Treatment of Solvent Properties. J. Chem. Inf. Comput. Sci. 1999, 39, 692–698.
  13. Poole, C.F.; Karunasekara, T. Solvent Classification for Chromatography and Extraction. J. Planar Chromatogr. 2012, 25, 190–199.
  14. Lesellier, E. Σpider Diagram: A Universal and Versatile Approach for System Comparison and Classification: Application to Solvent Properties. J. Chromatogr. A 2015, 1389, 49–64.
  15. Wypych, G. Handbook of Solvents, 2nd ed.; Chem Tec Publishing: Toronto, ON, Canada, 2001.
  16. Sarbu, C.; Pop, H.F. Fuzzy Soft-Computing Methods and Their Applications in Chemistry. In Reviews in Computational Chemistry; Lipkowitz, K.B., Larter, R., Cundari, T.R., Eds.; Wiley-VCH: Hoboken, NJ, USA, 2004; Chapter 5; pp. 249–332.
  17. Halgamuge, S.K.; Wang, L. (Eds.) Classification and Clustering for Knowledge Discovery; Springer: Berlin/Heidelberg, Germany, 2005.
  18. Guidea, A.; Sarbu, C. Fuzzy characterization and classification of solvents according to their polarity and selectivity. A comparison with the Snyder approach. J. Liq. Chromatogr. Relat. Technol. 2020, 43, 336–343.
  19. Kaufman, L.; Rousseeuw, P.J. Finding Groups in Data: An Introduction to Cluster Analysis; John Wiley & Sons: New York, NY, USA, 2009.
  20. Zadeh, L.A. Fuzzy sets. Inf. Control 1965, 8, 338–353.
  21. Rouvray, D. Fuzzy Logic in Chemistry; Academic Press: San Diego, CA, USA, 1997; p. 364.
  22. Sarbu, C.; Pop, H.F. Fuzzy Soft-Computing Methods and Their Applications in Chemistry. Rev. Comput. Chem. 2004, 20, 249–332.
  23. Bezdek, J.C. Pattern Recognition with Fuzzy Objective Function Algorithms; Plenum Press: New York, NY, USA, 1987; p. 272.
  24. Hoppner, F.; Klawonn, R.K.; Kruse, R.; Runkler, T. Fuzzy Cluster Analysis: Methods for Classification, Data Analysis and Image Recognition; John Wiley & Sons: Chichester, UK, 1999; p. 300.
  25. Sârbu, C.; Zehl, K.; Einax, J.W. Fuzzy divisive hierarchical clustering of soil data using Gustafson-Kessel algorithm. Chemom. Intell. Lab. Syst. 2007, 87, 121–129.
  26. Sârbu, C.; Moţ, A.C. Ecosystem discrimination and fingerprinting of Romanian propolis by hierarchical fuzzy clustering and image analysis of TLC patterns. Talanta 2011, 85, 1112–1117.
  27. Pop, H.; Dumitrescu, D.; Sârbu, C. A Study of Roman Pottery (terra sigillata) Using Hierarchical Fuzzy Clustering. Anal. Chim. Acta 1995, 310, 269–279.
  28. Pop, H.; Sârbu, C. The fuzzy hierarchical cross-clustering algorithm. Improvements and comparative study. J. Chem. Inf. Comput. Sci. 1997, 37, 510–516.
  29. Dumitrescu, D.; Pop, H.; Sârbu, C. Fuzzy Hierarchical Cross-Classification of Greek Muds. J. Chem. Inf. Comput. Sci. 1995, 35, 851–857.
  30. Sârbu, C.; Horovitz, O.; Pop, H. A Fuzzy Cross-Classification of Chemical Elements, Based on Their Physical, Chemical and Structural Features. J. Chem. Inf. Comput. Sci. 1996, 36, 1098–1108.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
