Fuzzy Divisive Hierarchical Associative-Clustering Applied to Different Varieties of White Wines According to Their Multi-Elemental Profiles

Wine data are usually characterized by high variability, in terms of compounds and concentration ranges. Chemometric methods can be efficiently used to extract and exploit the meaningful information contained in such data. Therefore, the fuzzy divisive hierarchical associative-clustering (FDHAC) method was efficiently applied in this study, for the classification of several varieties of Romanian white wines, using the elemental profile (concentrations of 30 elements analyzed by ICP-MS). The investigated wines were produced in four different geographical areas of Romania (Transylvania, Moldova, Muntenia and Oltenia). The FDHAC algorithm provided not only a fuzzy partition of the investigated white wines, but also a fuzzy partition of considered characteristics. Furthermore, this method is unique because it allows a 3D bi-plot representation of membership degrees corresponding to wine samples and elements. In this way, it was possible to identify the most specific elements (in terms of highest, smallest or intermediate concentration values) to each fuzzy partition (group) of wine samples. The chemical elements that appeared to be more powerful for the differentiation of the wines produced in different Romanian areas were: K, Rb, P, Ca, B, Na.


Introduction
Wine is one of the most consumed beverages in the world and is therefore of great economic and social interest. Wine quality is directly related to geographical origin, grape variety and technological processes. The wide variety of products results from different grape cultivars, vintages, geographic origins or winemaking techniques [1,2]. The geographic origin, grape variety and vintage of a specific wine are determined mainly by economic factors, and in turn, these factors determine the quality and price of wines from different regions, as well as the issue of trademark and consumer protection [1][2][3]. Wine composition is determined by several factors, such as grape variety, soil influences and viticultural practices (fertilizer or pesticide treatments), climate, and winemaking processes (yeast culture, aging, storage, quality and hygiene of winery facilities) [4]. In recent years, many studies have focused on food and beverage fingerprinting and authentication, and one of the most important criteria followed is the recognition of the geographical origin of a certain product [5]. In this regard, an effective approach is to combine the matrix characteristics, obtained using different analytical methods, with multivariate statistical methods.
The mineral composition of wine is mainly influenced by the soil composition of the vineyard, agricultural practices (i.e., fertilizers, pesticides) and the environment (pollution). Other important influences are the capacity of the grape variety to absorb minerals from the soil and the various steps of winemaking, from grape to final product [14,16,17]. Climatic changes may only affect the fungicide treatments, especially the level of Cu in grapes [18]. The elemental profile of wine is very often employed in the assessment of its geographical origin, as well as to determine its quality and safety, in terms of the maximum residue limits established by control authorities.
Many reported studies deal with the assessment of wine authenticity according to mineral profiles using distinct techniques such as flame atomic absorption spectroscopy (FAAS), graphite furnace atomic absorption spectroscopy (GFAAS), voltammetry, capillary electrophoresis, inductively coupled plasma mass spectrometry (ICP-MS) and inductively coupled plasma optical emission spectrometry (ICP-OES), assisted in many cases by chemometrics [19][20][21]. Of these, ICP techniques are the most extensively employed for geographical traceability of food products, because of their proven low detection limits, multi-element determinations, and wide dynamic ranges. To make the best use of the results, these techniques are often combined with different chemometric techniques for different food matrices, such as honey [22], juices [23], wines [24], dairy products [25,26] or vegetables [27,28].
In this study, the fuzzy divisive hierarchical associative-clustering method, which gives an excellent overview to associate each fuzzy partition of samples to a fuzzy set of characteristics (specific chemical elements), was successfully applied for modelling 65 Romanian white wines according to their elemental profile. The novelty of this study is represented by the partitioning of white wines and their association with different chemical elements with high, moderate and low concentrations. The obtained results clearly demonstrated the efficiency and information power offered by the advanced fuzzy clustering method in wine characterization and authentication.

Descriptive Statistics
The elemental data used in this study were obtained for 65 Romanian white wines (Table S1). Important differences among the concentrations of the investigated elements are readily observed (Table 1). The highest concentrations (expressed in µg·L⁻¹), in decreasing order, were obtained for K (238898.5), P, Mg, Ca, Na, and Rb (1237.1), moderate concentrations were identified for Mn (676.8), Sr, Zn, Al, Ba, and Cu (81.5), while the lowest concentrations corresponded to Pb (12.4), Li, Sc, Ce, Cs, Pd, Bi, Co, As, Be, Ga, U, Tl, Ag, Au, Sb, and In (0.2).

Fuzzy Divisive Hierarchical Associative-Clustering
For comparison of similarities and differences among white wine sample partitions, the DOMs (degrees of memberships) corresponding to all fuzzy partitions for both the samples and characteristics (elements concentrations) were analyzed. The results obtained by applying the fuzzy divisive hierarchical associative-clustering method using the elemental data are presented in Table 2.
By analyzing the fuzzy partitions at each level (partition history/hierarchy) along with the elemental data considered, the following observations can be made. At the first partition level, the white wines (65 samples) are separated into two fuzzy partitions, A1 and A2. The degrees of membership (DOMs) of the wines included in partition A1 are in the range 0.9836-0.7428 (98.36-74.28%), and between 0.9998 and 0.6449 (99.98-64.49%) in the case of A2. Most of the wines assigned to partition A1 belong to the Sauvignon Blanc cultivar from different areas (4 from Moldova, 2 from Oltenia and 1 from Muntenia), except one sample (Italian Riesling) from Oltenia. Potassium is the single element in this group associated with a very high DOM (0.9990). The concentration of K is the highest for all seven wines and is quite different from the rest of the investigated samples. Other authors found a link between the elemental composition of wines and soil, with K having the maximum concentration in analyzed soil samples, among investigated macroelements [29]. At the second level, only partition A2 is divided, resulting in partitions A21 and A22, with DOMs between 0.9867 and 0.5665 for A21 and from 0.9884 to 0.6115 for A22. The elements assigned to partition A21 according to their DOMs (0.9090-0.7406) are Mg, Ca and P. The concentrations of these elements are very high, but are comparable and close for the samples assigned to this group (the majority from Moldova); only the vintage and cultivar are different. The elements that were grouped in this partition are applied in different areas, suggesting similar agricultural practices, especially related to the use of fertilizers. The application of multiple element fertilizers (calcium, magnesium, iron, manganese, copper, zinc and boron) might affect both the yield of the wine grapes and the content of tannins, anthocyanins or total phenols [30].
In contrast, the A22 partition includes only samples from Transylvania, Muntenia and Oltenia, and all the elements are associated with very high DOMs (0.9996-0.9817). The sample group A221 contains samples originating mainly from the Transylvania area, and among the associated variables are Sr, Li, Mn, Au and Ag. Our previously reported papers [25][26][27] found similar geographical markers for other food products (raw or processed) grown in Transylvania (milk, cheese, potatoes), a fact that is confirmed by our results. Moreover, the content of Au and Ag could reflect wine from Transylvania, because there are some important gold mining areas there that contain important reserves of the above-mentioned elements, and, through natural leaching processes, these elements are spread over the surrounding area, ending up in the food products.
The partition A21 is divided into the final partitions A211, A2121 and A2122. The majority of the samples assigned to A211 are from Moldova and the associated element is P, with a relatively high DOM (0.7405). A possible explanation lies in the agricultural practices undertaken, in which phosphorus-based fertilizers are a key component for healthy plant development.
The partition A2121 contains only two samples from two different areas (Moldova and Muntenia) and has only Mg as an associated element. The common feature of this partition is the year 2012, which can be linked to the similar meteorological conditions in the two geographical areas.
The partition A2122 includes samples from three different areas (Moldova, Oltenia and Transylvania) and has Ca as a specific element (DOM = 0.7837). The elements associated with partition A222 with very high DOMs are Na (0.9598) and B (0.9097); this partition includes mainly samples from Muntenia and some samples from Oltenia and Transylvania. The partition A2211 contains only samples from Transylvania with various DOMs (0.8888-0.4492), and the element associated with a very high DOM (0.9422) is Rb. It seems that Rb, among the other previously mentioned geographical markers, is very characteristic of the Transylvania area [28].
The partition A2212 includes many samples from Transylvania and some from Oltenia, but with different DOMs (0.8945-0.2854), and it has many associated elements that have, in many cases, a low concentration. The partition A22121 is more or less similar to A2212, but the partition A22122 is more interesting because it includes only samples from Transylvania and the most characteristic elements are Zn (0.9364), Sr (0.8097) and Mn (0.4689). Sr is an acknowledged marker for geographical differentiation, while Mn content was shown by our previously reported studies [27] to be an important marker/discriminator for Transylvanian food products. The partition A221211 includes the majority of the samples corresponding to A22121 except two wines (1 from Transylvania and 1 from Oltenia assigned to A221212), and also the majority of elements with the lowest concentration, all with different DOMs. The elements assigned with relatively high DOMs to A221212 are Al, Ba and Cu. The last two partitions, A2221 and A2222, contain samples from different areas, with very different DOMs; B is associated with partition A2221 and Na with A2222. The DOMs of these two elements are very high (0.9093 and 0.9591) and the majority of samples included in partition A2221 are from Muntenia. All of the above statements are very well supported by the 3D bi-plot of DOMs corresponding to the different fuzzy partitions, as illustrated in Figure 1a.

ICP-MS Analysis
For sample dilution and preparation of standards, ultrapure deionized water (18 MΩ·cm) from a Milli-Q analytical reagent grade water purification system (Millipore) was employed. For wine digestion, ultrapure-grade nitric acid (69%, Merck) was used. The wine samples were prepared according to the following procedure [31]: 2.5 mL of ultrapure nitric acid were added to 2.5 mL of wine in a tightly closed Teflon receptacle. Six devices were inserted into stainless steel cylinders placed between two flanges, for pressure resistance. The whole system was put in an oven at 200 °C for 12 h. A colorless solution resulted, which was made up to 50 mL with ultrapure water. Thus, the wine sample was diluted 1:20 v/v and directly analyzed by ICP-MS.
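As a quick check of the dilution stated above, the arithmetic can be written out. The symbols $DF$, $C_{\text{wine}}$ and $C_{\text{measured}}$ are labels introduced only for this illustration, and the back-calculation assumes that the digest is transferred quantitatively.

$$DF = \frac{V_{\text{final}}}{V_{\text{wine}}} = \frac{50\ \text{mL}}{2.5\ \text{mL}} = 20, \qquad C_{\text{wine}} = DF \cdot C_{\text{measured}} = 20\, C_{\text{measured}}$$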
Element determinations were carried out with a PerkinElmer ELAN DRC (e) ICP-MS apparatus, equipped with a Meinhard nebulizer and a silica cyclonic spray chamber, with continuous nebulization. For the quantitative method, calibration standards for multi-element determinations were prepared by successive dilution of four high-purity ICP Multi-Element Calibration Standards (PerkinElmer Life and Analytical Sciences): Standard 3 (10 µg·mL⁻¹ Al, As, Ba, Be, Bi, Ca, Cd, Co, Cr, Cs, Cu, Fe, Ga, In, K, Li, Mg, Mn, Ni, Pb, Rb, Se, Na, Ag, Sr, Tl, V, U, Zn); Standard 2 (10 µg·mL⁻¹ Ce, Sc); Standard 4 (10 µg·mL⁻¹ Au, Pd, Sb) and Standard 5 (10 µg·mL⁻¹ B and P). For each sample analysis, three replicates were performed. The precision, expressed as relative standard deviation, was under 5%. Accuracy was assessed by recovery tests carried out on a wine sample spiked with a 2.5 µg·L⁻¹ standard solution. The recovery was in the range of 85-110%.
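The two quality figures quoted above (relative standard deviation of replicates and spike recovery) can be illustrated with a minimal Python sketch. The numbers are hypothetical and are not values from this study; the use of the sample standard deviation (n − 1) is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, stdev


def relative_std_dev(replicates):
    """Relative standard deviation (%) of replicate readings (sample SD, n - 1)."""
    return 100.0 * stdev(replicates) / mean(replicates)


def spike_recovery(spiked, unspiked, added):
    """Recovery (%) = (spiked result - unspiked result) / amount added * 100."""
    return 100.0 * (spiked - unspiked) / added


# Hypothetical readings in µg/L, chosen only for illustration
replicates = [81.0, 82.5, 80.4]          # three replicates of one sample
rsd = relative_std_dev(replicates)
rec = spike_recovery(spiked=14.7, unspiked=12.4, added=2.5)
print(f"RSD = {rsd:.2f} %, recovery = {rec:.1f} %")
```

With these illustrative inputs both figures fall inside the acceptance ranges reported in the text (RSD under 5%, recovery within 85-110%).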

Fuzzy Clustering Methods
Fuzzy analysis and fuzzy logic represent useful and powerful tools in analytical chemistry and other scientific and technical fields [32][33][34][35]. Clustering and classification methods are useful since they allow meaningful generalizations to be made about large data sets by recognizing general patterns among them [36]. Many algorithms aim to partition a given data set into hard clusters, c-means algorithms being the most widely used. Hard c-means methods execute a sharp clustering, in which each object (sample) is either assigned to a cluster or not, so that the membership of an object in a specific cluster takes the value 0 or 1. The application of a fuzzy algorithm in a clustering approach causes this cluster membership to become a relative one; consequently, an object can belong to several clusters at the same time, but with different degrees of membership (DOMs) between 0 and 1. The DOMs with which a given data point belongs to the different clusters are computed from the distances of the data point to the cluster centers (prototypes). The closer a data point is to the center of a cluster, the higher its degree of membership in this cluster.
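To make the link between distances and DOMs concrete, the sketch below uses the membership formula of the standard fuzzy c-means algorithm, u_i = 1 / Σ_k (d_i / d_k)^(2/(m−1)). This is an illustration only: the fuzzifier m = 2 and the toy coordinates are assumptions, and the generalized variant used in this study may differ in detail.

```python
import math


def fcm_memberships(point, centers, m=2.0):
    """Degrees of membership of one point in each cluster (standard fuzzy c-means)."""
    dists = [math.dist(point, c) for c in centers]
    # A point lying exactly on a prototype belongs fully to that cluster.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((di / dk) ** power for dk in dists) for di in dists]


# Hypothetical prototypes and data point (assumed values)
centers = [(0.0, 0.0), (4.0, 0.0)]
doms = fcm_memberships((1.0, 0.0), centers)
print([round(u, 3) for u in doms])
```

The memberships of a point sum to 1, and the cluster whose center is closer receives the larger DOM, as described above.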
Clustering methods can be divided into two types: partitional and hierarchical. Partitional clustering methods try to directly decompose a data set into a fixed number of disjoint clusters (semi-supervised), while the divisive hierarchical clustering algorithm is an unsupervised procedure and may be used to obtain the cluster structure of the data set when the number of clusters is unknown.
Using the generalized fuzzy c-means algorithm (GFCM) [37], one can determine a binary fuzzy partition {A1, A2} of the data set X. If this partition describes real clusters, it is denoted P1 = {A1, A2}.
Using the GFCM algorithm for two subclusters (n = 2), one can determine a binary fuzzy partition for each Ai of P1. If this partition of Ai describes real clusters, these clusters will be attached to a new fuzzy partition, P2. Otherwise, Ai will remain undivided: it will be marked and allocated to the partition P2 as it is. The unmarked cluster members of P2 will follow the same procedure. The divisive procedure will stop when all the clusters of the current partition Pk are marked, that is, when there are no more real clusters. This procedure is a divisive one and makes it possible to build a fuzzy hierarchy [38][39][40][41].
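The divisive procedure described above can be sketched in a few lines of Python. This is a simplified stand-in, not the GFCM implementation of [37]: the binary split is a two-class fuzzy c-means whose memberships are scaled by the parent's memberships, and the test for "real clusters" (prototype separation above `min_sep` and fuzzy cardinality above `min_card`) is a hypothetical criterion chosen for illustration, since the text does not specify one. All function names, thresholds and toy data are assumptions.

```python
import math


def _center(points, u, m, fallback):
    """Prototype = mean of the points weighted by membership**m."""
    w = [ui ** m for ui in u]
    total = sum(w)
    if total == 0.0:
        return fallback
    return tuple(sum(wi * p[k] for wi, p in zip(w, points)) / total
                 for k in range(len(points[0])))


def split_in_two(points, parent, m=2.0, iters=100):
    """Binary fuzzy split of the fuzzy set `parent`; a1[j] + a2[j] equals parent[j]."""
    idx = range(len(points))
    # Deterministic start: strongest member, and the membership-weighted farthest point.
    first = max(idx, key=lambda j: parent[j])
    far = max(idx, key=lambda j: parent[j] * math.dist(points[j], points[first]))
    centers = [points[first], points[far]]
    power = 2.0 / (m - 1.0)
    a1, a2 = list(parent), [0.0] * len(points)
    for _ in range(iters):
        a1, a2 = [], []
        for x, c in zip(points, parent):
            d1, d2 = math.dist(x, centers[0]), math.dist(x, centers[1])
            if d1 == d2:
                share = 0.5                       # also covers d1 == d2 == 0
            elif d1 == 0.0 or d2 == 0.0:
                share = 1.0 if d1 == 0.0 else 0.0
            else:
                share = 1.0 / (1.0 + (d1 / d2) ** power)
            a1.append(c * share)
            a2.append(c * (1.0 - share))
        centers = [_center(points, a, m, old) for a, old in zip((a1, a2), centers)]
    return a1, a2, centers


def divide(points, parent=None, name="A", min_sep=3.0, min_card=1.5,
           depth=0, leaves=None):
    """Split recursively; a cluster that fails the (assumed) test is marked as a leaf."""
    parent = [1.0] * len(points) if parent is None else parent
    leaves = {} if leaves is None else leaves
    a1, a2, (c1, c2) = split_in_two(points, parent)
    real = (math.dist(c1, c2) > min_sep
            and sum(a1) >= min_card and sum(a2) >= min_card)
    if real and depth < 6:                        # depth cap guards against runaway splits
        divide(points, a1, name + "1", min_sep, min_card, depth + 1, leaves)
        divide(points, a2, name + "2", min_sep, min_card, depth + 1, leaves)
    else:
        leaves[name] = parent                     # marked: stays undivided
    return leaves


# Hypothetical toy data: three well-separated groups of three points each
points = [(0, 0), (0, 1), (1, 0), (10, 0), (10, 1), (11, 0), (30, 0), (30, 1), (31, 0)]
leaves = divide(points)
for leaf, dom in sorted(leaves.items()):
    print(leaf, [round(u, 2) for u in dom])
```

Partition names are built by appending 1 or 2 at each level, mirroring the A1, A2, A21, A22 labelling used in the results. Because each split shares out the parent's memberships, the DOMs of every object over the final partitions sum to 1.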
Considering now P, a fuzzy partition of the fuzzy set C of objects, and Q, a fuzzy partition of the fuzzy set D of characteristics (variables), the problem of fuzzy divisive hierarchical associative-clustering is to determine the pair (P, Q) that optimizes a certain criterion function [33][34][35]. By starting with an initial partition $P_0$ of C and an initial partition $Q_0$ of D, a new partition $P_1$ will be obtained. The pair $(P_1, Q_0)$ allows the determination of a new partition $Q_1$ for characteristics. The algorithm consists of producing a sequence $(P_k, Q_k)$ of pairs of partitions, starting from the initial pair $(P_0, Q_0)$, in the following steps:

$$(P_k, Q_k) \rightarrow (P_{k+1}, Q_k) \tag{1}$$

$$(P_{k+1}, Q_k) \rightarrow (P_{k+1}, Q_{k+1})$$

The rationale of the divisive hierarchical associative-clustering method essentially involves splitting the sets X and Y into two subclasses. The obtained classes are further divided into two subclasses, and so on. The two hierarchies may be represented by the same tree, having a pair (C, D) in each node (level of partition), where C is a fuzzy set of objects and D is a fuzzy set of characteristics. As a first step, the fuzzy partitions of the classes C and D must be simultaneously determined (as a particular case, the binary fuzzy partitions), so that the two partitions are highly correlated. With the GFCM algorithm, a fuzzy partition $P = \{A_1, \ldots, A_n\}$ of the class C will be determined using the original characteristics. In order to classify the characteristics, the algorithm computes their values for the classes $A_i$, $i = 1, \ldots, c$. The value $y_i^k$ of the characteristic $k$ with respect to the class $A_i$ is defined as:

$$y_i^k = \sum_{j} A_i(x_j)\, x_j^k, \qquad i = 1, \ldots, c; \quad k = 1, \ldots, d$$

where $A_i(x_j)$ is the membership degree of object $x_j$ in the class $A_i$, $x_j^k$ is the value of characteristic $k$ for that object, and the sum runs over all objects.
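As an illustration of the last step, the sketch below computes the value of each characteristic with respect to each class, taking it as the sum over objects of the membership degree A_i(x_j) multiplied by the object's value x_j^k for that characteristic. The function name and the toy numbers are hypothetical; the data are not taken from this study.

```python
def class_values(memberships, data):
    """Return y[i][k] = sum over objects j of A_i(x_j) * x_j^k."""
    n_char = len(data[0])
    return [[sum(a_j * x_j[k] for a_j, x_j in zip(a_i, data)) for k in range(n_char)]
            for a_i in memberships]


# Hypothetical toy data: 3 objects (rows) x 2 characteristics (columns)
data = [[10.0, 1.0],
        [8.0, 2.0],
        [1.0, 9.0]]
# Membership degrees of the 3 objects in 2 fuzzy classes (assumed values)
memberships = [[0.9, 0.8, 0.1],   # A_1(x_j)
               [0.1, 0.2, 0.9]]   # A_2(x_j)

y = class_values(memberships, data)
for i, row in enumerate(y, start=1):
    print(f"class A{i}:", [round(v, 2) for v in row])
```

In this toy case the first characteristic takes its larger value in class A1 and the second in class A2, which is the kind of information used to associate characteristics with partitions of objects.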

Conclusions
In order to understand the distribution pattern of chemical elements in various Romanian white wine samples, collected from four geographical areas, fuzzy divisive hierarchical associative-clustering was successfully applied. The fuzzy partition hierarchy of samples and associated chemical elements allowed us to identify partitions (groups) of wine samples with more or less similar characteristics in terms of highest, lowest or intermediate concentration values. In addition, the 3D bi-plot representation of DOMs corresponding to different fuzzy partitions offers the possibility of visualizing the relationship among samples and specific elements. Some elements appeared to be more specific markers of the geographical origin of wines (K, Rb, P, Ca, B, Na). For Transylvanian wines, some specific markers were highlighted, namely Sr, Li, Ag, Au and Rb, while for wines originating from Moldova, P was the most representative characteristic. It was possible to determine the vintage in one case, 2012, through Mg content. Apart from the above-mentioned differentiations, similarities in agricultural practices were also noticed among the analyzed wine samples.

Conflicts of Interest:
The authors declare no conflict of interest.