Chemometric Evaluation of the Link between Acute Toxicity, Health Issues and Physicochemical Properties of Silver Nanoparticles

Abstract: The objective of the present study is to address some developments in the field of statistical modeling of a complex system, such as the responses of nanoparticles in environmental media. An important problem that still needs to be studied and interpreted is the relationship between physicochemical parameters of the nanoparticles (primary size, primary hydrodynamic diameter, zeta potential, etc.) and the respective toxicity values. This holds true especially for silver nanoparticle systems because of their known bactericidal effect and wide distribution in practice. The present study deals with data for physicochemical and toxicity parameters of 94 different silver nanoparticle systems in order to reveal specific relations between physicochemical properties and acute toxicity readings using multivariate statistical methods. The search for these specific relationships is the novel element of the study, which is directed toward developing a model that describes the relationship between physicochemical properties and toxicity of silver NPs based on a data set gathered from the literature. It is shown that the systems studied can be divided into four patterns (clusters) of similarity depending not only on the physicochemical indicators related to particle size but also on their acute toxicity. The acute toxicity is strongly correlated to the zeta potential of the particles if the whole data set is considered.


Introduction
With the increasing interest in and application of nanosystems as advanced materials, questions about their environmental and health impact are growing accordingly. The principal question that needs to be answered is whether there is a connection between the specific size of the nanomaterials and their influence on the ecological and health status of nature and humans.
Nanoparticle pollution is currently an evolving problem for aquatic and soil environments. Nanoparticles can enter the environment after release from a vast number of uses in commercial and pharmaceutical products through disposal, weathering, application of sewage sludge or with sewage effluent waters. Soil environments, as well as aquatic ones, are expected to collect increasing amounts of nanoparticle pollution. The concern is how nanoparticle pollution, because of its antibacterial action, will disturb processes mediated at the bacterial level. Silver nanoparticles (AgNPs) are frequently used for these antibacterial properties. The major factors controlling nanoparticle toxicity to bacteria are size, charge, shape, number of nanoparticles per cell, aggregation, nature of the capping agents and interaction with UV light. Numerous modes in a [...] of bigger NPs. Although the latter conclusion is initially based on one test organism, it may lead to an explanation for the size-dependent biological effects of AgNPs [1][2][3].
The major goal of the present study is to reveal patterns of similarity arising from the specific relationships between the different nanoparticle systems collected from literature sources, on the one hand, and between the estimators of structural or ecotoxicity properties, on the other. There are studies dealing with the relationship between particle size parameters and particle properties, but quite few use acute toxicity as an additional estimator of the nanoparticle applications and options within a common data set subjected to multivariate statistical analysis.

Chemometric Methods
Chemometric tools have been applied frequently, from analytical and pharmaceutical method optimization and environmental issues to spectroscopic data analysis. The use of chemometrics for the treatment of different data sets provides a valuable tool for objective decision-making.
The intelligent data analysis in this study uses cluster analysis (hierarchical and non-hierarchical modes) and principal components analysis (Varimax rotation mode) for data classification and interpretation. These multivariate methods are well-known and well-developed chemometric approaches for classification, modeling and interpretation of experimental data sets.
Hierarchical cluster analysis (joining tree mode) is an unsupervised pattern recognition method whose aim is to construct patterns of similarity either between objects of interest described by various features (variables) or between the different features related to the objects. The major steps in performing hierarchical clustering are: data standardization (by z-transform) in order to eliminate differences in the variable dimensions; determination of a similarity measure between the objects of interest (normally squared Euclidean distances); linkage between the objects (among many options, Ward's method is often preferred); and, finally, assessment of the significance of the clusters (e.g., by Sneath's criterion) [4]. The graphical output of the analysis is a tree-like plot called a dendrogram. Unsupervised pattern recognition is often described as a spontaneous classification method: the data set is not trained beforehand to meet a specific requirement, such as classification into a preliminarily chosen number of similarity groups.
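The first three steps listed above (z-transform, squared Euclidean distances, Ward linkage) can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the STATISTICA procedure used in the study: the function names and the toy data are hypothetical, the merge search is deliberately naive, and the assessment of cluster significance by Sneath's criterion is not included.

```python
from statistics import mean, stdev

def z_transform(rows):
    """Standardize each column to zero mean and unit (sample) standard deviation."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [stdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]

def sq_euclidean(a, b):
    """Squared Euclidean distance between two objects."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(points):
    """Column-wise mean of a list of objects."""
    return [mean(c) for c in zip(*points)]

def ward_clusters(rows, k):
    """Merge clusters by Ward's criterion until k clusters remain."""
    z = z_transform(rows)
    clusters = [[i] for i in range(len(z))]  # start: every object is its own cluster
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                a, b = clusters[i], clusters[j]
                ca = centroid([z[p] for p in a])
                cb = centroid([z[p] for p in b])
                # Ward: increase in within-cluster sum of squares caused by the merge
                cost = len(a) * len(b) / (len(a) + len(b)) * sq_euclidean(ca, cb)
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]

# Hypothetical toy data (NOT from the study): columns are size in nm, zeta potential in mV
toy = [[10.0, -30.0], [12.0, -28.0], [11.0, -31.0],
       [80.0, -5.0], [85.0, -7.0], [82.0, -6.0]]
groups = ward_clusters(toy, k=2)
print(groups)
```

Cutting the merge sequence at a chosen number of clusters corresponds to cutting the dendrogram at a given height; a full implementation would also record the merge costs in order to draw the tree.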
Another clustering option is offered by non-hierarchical clustering (e.g., K-means methods), which is treated here as a supervised technique because it requires the division of the studied objects into an a priori given number of clusters (determined by some practical or theoretical reasons). K-means is a clustering algorithm that divides observations into k clusters. Since the number of clusters can be predetermined, it can easily be used in classification, where the data are divided into a number of clusters equal to or greater than the number of classes. This approach is suitable for solving classification problems and resembles other traditional chemometric classification methods such as partial least squares discriminant analysis (PLS-DA).
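The K-means idea can be illustrated with a minimal sketch of Lloyd's algorithm, one common way of computing a K-means partition. The deterministic start (the first k objects serve as initial centroids), the function names and the toy data are assumptions made for illustration only; the study itself relied on the K-means routine of STATISTICA, whose initialization may differ.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two objects."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def k_means(points, k, n_iter=100):
    """Lloyd's algorithm; the first k points are used as starting centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(n_iter):
        # Assignment step: every object joins its nearest centroid
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments no longer change: converged
        labels = new_labels
        # Update step: every centroid moves to the mean of its members
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its old centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical, already standardized toy data (NOT from the study)
points = [[-1.0, -1.2], [1.0, 0.9], [-0.8, -1.0],
          [1.2, 1.1], [-1.1, -0.7], [0.9, 1.3]]
labels, centroids = k_means(points, k=2)
print(labels)
```

The number of clusters k is the a priori hypothesis mentioned above: the algorithm never changes k, it only finds the partition into k groups that it converges to from the chosen start.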
Principal component analysis (PCA) is probably the most widespread multivariate chemometric technique and is a typical display method (also known as eigenvector analysis, eigenvector decomposition or Karhunen-Loève expansion). It reveals the "hidden" structure of the data set and helps to explain the influence of latent factors on the data distribution. PCA is performed on the covariance matrix when the data are centered or on the correlation matrix when the data are standardized. PCA transforms the original data matrix into a product of two matrices, one of which contains the information about the objects and the other about the variables [5].
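In commonly used chemometric notation, the decomposition described above can be written as follows. The symbols are assumptions introduced here for illustration and do not appear in the original text: X is the standardized n × m data matrix, T holds the scores (objects), P the loadings (variables), E the residuals when fewer than m components are retained, R the correlation matrix and λ its eigenvalues. The columns of P are the first eigenvectors in V.

```latex
\mathbf{X} = \mathbf{T}\,\mathbf{P}^{\mathsf{T}} + \mathbf{E},
\qquad
\mathbf{R} = \frac{1}{n-1}\,\mathbf{X}^{\mathsf{T}}\mathbf{X}
           = \mathbf{V}\,\boldsymbol{\Lambda}\,\mathbf{V}^{\mathsf{T}},
\qquad
\text{explained variance of PC}_a = \frac{\lambda_a}{\sum_{j=1}^{m}\lambda_j}
```

For standardized data the eigenvalues sum to m, so a statement such as "two latent factors explain nearly 65% of the total variance" corresponds to the first two eigenvalues summing to about 0.65 m.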
The results of PCA are interpreted by visualization of the component scores and loadings. The score plot shows the linear projection of the objects that represents the main part of the total variance of the data (in the plot PC1 vs. PC2). Other projection plots are also available (e.g., PC1 vs. PC3 or PC2 vs. PC3), but they represent a smaller percentage of the explained total variance of the system under consideration. The correlation and importance of the feature variables are judged from the factor loadings plots.
All statistical analyses were carried out using STATISTICA 8.0 statistical software.

Results
The input data set was standardized by the z-transform procedure in order to eliminate the parameter dimension impact on the classification and interpretation results.
The study started with the application of PCA in order to reveal the specific data set structure and to identify the latent factors responsible for this structure.
As already mentioned, one of the goals of this study was to find relationships between the features used for description of the nanoparticle systems, so PCA was also used to search for similarity between features. Two latent factors were identified, which explain nearly 65% of the total variance. Table 2 presents the factor loadings (the variables with the highest loadings are marked in bold), which give a reliable explanation of the similarity between features. As seen, there was a close relationship between three of the physicochemical parameters, primary size (PS), primary hydrodynamic diameter (PHD) and hydrodynamic diameter in the test media (DLS), which have high factor loadings on PC1. This confirms that these three parameters were simultaneously responsible for the structural identity of the nanoparticle systems under consideration and could be (again simultaneously) a discriminating factor for every single nanoparticle composition. The conditional name of PC1 could be the "structural" factor.
The acute toxicity and the zeta potential were negatively correlated to the rest of the variables, with high negative factor loadings on PC2. This means that these two estimators were a different, discriminating type of variable for the nanoparticle compositions studied. The conditional name of PC2 could be the "toxicity" factor.
These results were completely confirmed by non-hierarchical clustering with the K-means method. As previously mentioned, non-hierarchical clustering is a supervised pattern recognition method and allows a priori hypotheses about data classification to be accepted or rejected. In the present case the a priori hypothesis states that the variables involved are separated into two classes of similarity related to the structural and toxic properties of the AgNPs. The K-means analysis gives the following two groups of variable similarity: members of cluster 1 (zeta potential, acute toxicity); members of cluster 2 (primary size, primary hydrodynamic diameter, hydrodynamic diameter in the test media). This result confirms the PCA latent factor identification with a "toxic" and a "structural" cluster.
Symmetry 2019, 11, x FOR PEER REVIEW 5 of 13
Figure 1 presents the plot of average values of each of the clusters formed by non-hierarchical clustering for each of the 94 AgNP systems studied. It is seen that the "toxicity" impact becomes more significant for systems with numbers higher than 60 (dominantly with algae, mammalian cells and bacteria as test organisms for toxicity testing), high zeta potential values and lower primary size. The "structural" impact was significantly expressed for systems with very high primary size and primary hydrodynamic diameter. This confirmed previous conclusions about toxicity as a function of the primary size of AgNPs [3].
In the next step of the chemometric analysis, it was of substantial interest to find similarity patterns between the nanosystems under consideration. Figure 2 presents a hierarchical dendrogram for clustering of all nanosystems. Hierarchical clustering allows a spontaneous formation of similarity groups within the objects of interest. Four major clusters were formed, as indicated in the dendrogram.
The hierarchical clustering spontaneously created four major clusters (K1 to K4). This classification of the nanosystems was confirmed by non-hierarchical clustering. We used the hypothesis that all systems should be divided into four clusters (patterns) depending on the specificities of the systems: physicochemical features (all size measures PS, PHD and DLS), zeta potential (ZP), acute toxicity (AT) and a group of outliers. K1 was the cluster whose members were typical outliers. The input data set showed, for instance, that object 70 had the highest acute toxicity level indicated by a bacterial assay, the highest value for hydrodynamic diameter and a very high negative zeta potential. The group of objects 10, 11 and 13 was characterized by very high values of the hydrodynamic diameter. These last three nanosystems were tested for acute toxicity with crustacean organisms (Daphnia magna).
K2 included 26 members whose acute toxicity was determined either by crustacean or by bacterial assays. The cluster was characterized by the lowest values of the particle size parameters and of the toxicity readings.
Cluster K3 consisted of 23 members with relatively high values of the size parameters, but with respect to toxicity it resembled the average values of clusters K2 and K4. There was no specificity to the bioassays applied for measuring acute toxicity (dominantly crustacean and bacteria; it should be kept in mind that the majority of the acute toxicity measurements were performed with these two bioassay types).
The last cluster, K4, included the largest number of objects (41 in total). It was characterized by the presence of nanoparticles with the highest zeta potential. The acute toxicity was relatively high and was determined by various test organisms (crustacean, bacteria, algae, yeast, mammalian cells and fungi). It could be assumed that there was no specificity for the test organisms.
Table 3 shows the average values of each variable for each of the clusters of objects identified by hierarchical cluster analysis. In general, the cluster of outliers K1 is characterized by the highest values of PS, PHD, DLS and AT and a relatively low negative value of ZP.
Cluster K4, the biggest one, showed the lowest average acute toxicity, related to the lowest levels of PS, PHD, ZP and, to some extent, DLS. The other two clusters had intermediate values.
It is interesting to note that K1 toxicity was determined dominantly with the crustacean (Daphnia magna), while K4 used almost all test organisms for toxicity readings. No specificity with respect to the test organisms was found in K2 and K3. It could be assumed that the bioassay mode was not a specific or discriminating factor.
The non-hierarchical clustering of the AgNPs followed the hypothesis of a division of the 94 systems into four patterns of similarity depending probably on the structural and toxicity parameters. Indeed, four clusters were calculated having the same members as in the case of hierarchical clustering. This number of clusters proved to be optimal with respect to the minimal spread of distances between the clusters (in comparison with other options for non-hierarchical clustering).
In order to determine specific discriminating parameters for each cluster, a plot of means (Figure 3) is presented.
For the small cluster of objects K1, maximum values of PS, PHD and DLS were observed, with low levels of ZP and AT that did not differ much from the values of the other clusters. It could be assumed that the cluster represents relatively large nanoparticles with high primary hydrodynamic diameter and hydrodynamic diameter in test media. Therefore, we could conditionally name this pattern of silver nanoparticle systems "coarse nanosystems" with relatively high acute toxicity determined by Daphnia magna.
K4 was the largest cluster, characterized by the lowest primary size, primary hydrodynamic diameter and hydrodynamic diameter in test media as well as by the lowest acute toxicity. This pattern could be conditionally named the "finest nanoparticle systems".
The other two clusters had intermediate properties but possessed discriminators that allowed good separation into two independent patterns. K3 was very similar to K1 with one significant exception, a low hydrodynamic diameter in test media, so that it formed the pattern of "coarse nanoparticle systems with low hydrodynamic diameter in test media". K2 was characterized by the highest values of the zeta potential and a relatively good resemblance to the structural parameters of the "fine nanoparticle systems", and its conditional name could be "fine nanoparticle systems with high zeta potential" (it should be kept in mind that the ZP values were negative and "high" should be understood as "highest negative value").
Since there was no specificity to the test organisms used for acute toxicity determination and the toxicity differences were relatively small, the separation of the whole set of silver nanoparticle systems into four patterns was the result of different structural variations (primary size, primary hydrodynamic diameter, low hydrodynamic diameter in test media and zeta potential).

Relationship between Size Parameters and Toxicity
As shown in Table 3, the four clusters were well separated by their particle size and hydrodynamic parameters, on the one side, and by the acute toxicity, on the other. The ZP was not a satisfactory discriminant factor for the whole system under consideration.
The results from Table 4 could be used for regression modeling of the dependence of the acute toxicity on the different size parameters of the AgNPs, for the four identified patterns of similarity.
The linear regression found for the dependence of AT on the primary size parameter (Figure 4) is a confirmation of what other studies have already found on that issue [4].

Multiple Regression Results
We used multiple linear regression (MLR) to obtain a quick estimate of the relation between toxicity and the structural properties of the nanoparticles (Table 4). The data obtained (Table 5) were statistically analyzed using analysis of variance (ANOVA) and expressed as the mean with standard error. The obtained F-value was compared with the corresponding critical value (p = 0.05). A value of p < 0.05 was considered statistically significant. The multivariate regression analysis indicated that the model obtained was not satisfactory for the assessment of AT. Obviously, the relationship between AT and the physicochemical parameters of the nanoparticles is of a more complex and non-linear character. That is why we tried to estimate the impact of each single parameter on AT. The results are commented on below.
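For reference, the overall F-test of a multiple linear regression with an intercept is usually written as below. The notation is an assumption introduced here and is not taken from the original tables: n is the number of objects, p the number of predictors, SSR and SSE the regression and residual sums of squares, and R² the coefficient of determination.

```latex
F = \frac{\mathrm{SSR}/p}{\mathrm{SSE}/(n-p-1)}
  = \frac{R^{2}/p}{\left(1-R^{2}\right)/(n-p-1)},
\qquad
\text{model significant at } \alpha = 0.05 \text{ if } F > F_{0.05;\,p,\,n-p-1}
```

A significant F-value only indicates that the predictors jointly explain more variance than the mean alone; a model can pass this test and still describe the response poorly when the underlying dependence is non-linear.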
In the present study an effort is made to find relations between toxicity and size parameters, and not only the primary size (PS) of the AgNPs. It is evident that the different patterns found by the classification are size dependent, and so is the link between AT and PHD or between AT and DLS. However, this dependence is no longer linear but typically nonlinear (logarithmic or polynomial). This is an indication of a different mechanism of the toxic effects, which has to be additionally studied by specifically designed experiments including the size parameters as input factors and AT as an output factor.
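A simple way to check whether a logarithmic dependence describes the data better than a linear one is to fit the same least-squares line twice, once on the raw size values and once on their logarithms, and to compare the coefficients of determination. The sketch below assumes this approach; the function name and the numbers are hypothetical (the values are generated from an exact logarithmic law and are not readings from the data set).

```python
import math

def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b, r_squared)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return a, b, 1.0 - ss_res / ss_tot

# Hypothetical values generated from AT = 5 - 2*ln(size); NOT measured data
size = [5.0, 10.0, 20.0, 40.0, 80.0, 160.0]
at = [5.0 - 2.0 * math.log(s) for s in size]

# Linear model: AT = a + b*size
_, _, r2_linear = fit_line(size, at)
# Logarithmic model: AT = a + b*ln(size), fitted as a line in ln(size)
a_log, b_log, r2_log = fit_line([math.log(s) for s in size], at)

print(round(r2_linear, 3), round(r2_log, 3))
```

A polynomial model can be compared in the same way, but it needs more than one predictor column (size, size squared, and so on) and therefore a multiple regression solver.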
Our data analysis indicated that the relation AT/ZP is not very indicative of the toxic impact of the AgNPs studied.
Table 6 summarizes, on a semi-qualitative level, the relationship between acute toxicity and size parameters for the four identified patterns of similarity of silver nanoparticle systems. Since no specificity with respect to the bioassays used was found, no special classification for them was offered. An effort was made to validate the suggested models by a leave-one-out cross-validation approach. In all tested cases the correct classification was obtained. This is probably to be expected for our data set, but we did not have additional experimentally synthesized objects for model validation.
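The leave-one-out idea can be sketched as follows: each object is removed in turn, the classification rule is rebuilt from the remaining objects, and the removed object is then classified. The text does not state which classification rule was validated, so the nearest-centroid rule used below is purely an assumption, as are the function names, the class labels and the toy data.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two objects."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_centroid(train_x, train_y, query):
    """Assign the query object to the class with the closest training centroid."""
    best_label, best_d = None, float("inf")
    for label in sorted(set(train_y)):
        members = [x for x, y in zip(train_x, train_y) if y == label]
        center = [sum(col) / len(members) for col in zip(*members)]
        d = sq_dist(center, query)
        if d < best_d:
            best_label, best_d = label, d
    return best_label

def loo_accuracy(xs, ys):
    """Leave one object out, rebuild the rule on the rest, classify the held-out object."""
    hits = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        hits += int(nearest_centroid(train_x, train_y, xs[i]) == ys[i])
    return hits / len(xs)

# Hypothetical standardized toy data with two well-separated classes (NOT from the study)
xs = [[-1.0, -1.2], [-0.8, -1.0], [-1.1, -0.7],
      [1.0, 0.9], [1.2, 1.1], [0.9, 1.3]]
ys = ["class_a", "class_a", "class_a", "class_b", "class_b", "class_b"]
accuracy = loo_accuracy(xs, ys)
print(accuracy)
```

With well-separated groups every held-out object is reassigned to its own class, which is the situation described above; an independent set of newly synthesized objects would still be a stricter test.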

Conclusions
The multivariate statistical analysis applied to the data set of silver nanoparticles, using hierarchical and non-hierarchical cluster analysis and principal components analysis, made it possible to identify four different patterns of similarity among all 94 silver nanoparticle systems depending on their physical and toxicity properties. For the group of four clusters, regression models were constructed showing the relationship between the AgNP size parameters and the acute toxicity measured by various bioassays. Relatively adequate models were found for the links between AT and PS, as well as between AT and DLS and PHD. Based on the regression models and the multivariate statistical analysis, a semi-qualitative classification is offered, which tries to order the toxicity readings with respect to the size parameters of the silver nanoparticles. The validity of the classification was checked by a leave-one-out cross-validation procedure. The practical importance of the present study is its predictive ability in relating readings of acute toxicity of the nanoparticles to their major physicochemical parameters. We are aware that the relations found do not explain the mechanism of toxicity of the nanoparticles, but they provide practically important hints about the dependence of toxicity on particle size.