
Molecules 2018, 23(12), 3076; https://doi.org/10.3390/molecules23123076

Article
Skin Permeation of Solutes from Metalworking Fluids to Build Prediction Models and Test A Partition Theory
1 Department of Statistics, North Carolina State University, Raleigh, NC 27695-8203, USA
2 Wells Fargo and Company, Charlotte, NC 28202-0901, USA
3 Center for Chemical Toxicology Research & Pharmacokinetics, Department of Population Health and Pathobiology, College of Veterinary Medicine, North Carolina State University, 1060 William Moore Dr., Raleigh, NC 27607, USA
* Author to whom correspondence should be addressed.
Academic Editors: Constantinos K. Zacharis and Paraskevas D. Tzanavaras
Received: 27 October 2018 / Accepted: 23 November 2018 / Published: 24 November 2018

Abstract

Permeation of chemical solutes through skin can create major health issues. Using the membrane-coated fiber (MCF) as a solid phase membrane extraction (SPME) approach to simulate skin permeation, we obtained partition coefficients for 37 solutes under 90 treatment combinations that could broadly represent formulations that could be associated with occupational skin exposure. These formulations were designed to mimic fluids in the metalworking process, and they are defined in this manuscript using: one of mineral oil, polyethylene glycol-200, soluble oil, synthetic oil, or semi-synthetic oil; at a concentration of 0.05 or 0.5 or 5 percent; with solute concentration of 0.01, 0.05, 0.1, 0.5, 1, or 5 ppm. A single linear free-energy relationship (LFER) model was shown to be inadequate, but extensions that account for experimental conditions provide important improvements in estimating solute partitioning from selected formulations into the MCF. The benefit of the Expanded Nested-Solute-Concentration LFER model over the Expanded Crossed-Factors LFER model is only revealed through a careful leave-one-solute-out cross-validation that properly addresses the existence of replicates to avoid an overly optimistic view of predictive power. Finally, the partition theory that accompanies the MCF approach is thoroughly tested and found to not be supported under complex experimental settings that mimic occupational exposure in the metalworking industry.
Keywords:
leave-one-solute-out (LOSO) cross-validation; leave-one-out (LOO) cross-validation; linear free-energy relationship (LFER) model; membrane-coated fiber (MCF) approach; partition coefficient; quantitative structure-activity relationship (QSAR); metalworking fluid

1. Introduction

The assessment of skin permeation of chemical solutes can be used to inform scientific research and regulatory agencies in the risk management of chemical solutes that may be of concern, especially for occupational exposures [1,2,3]. For example, in the metalworking industry, certain performance-enhancing solutes such as corrosion inhibitors, emulsifiers, and biocides/preservatives are often added to metalworking fluids (MWF). Contact with industrial fluids containing some or all of these performance additives can sometimes cause skin irritation or even more harmful consequences [4,5,6,7]. Thus, it is of interest to study the ability of the added solutes to permeate skin, in the hope of finding less permeable solutes that can be used in metalworking fluids.
Unfortunately, conducting skin absorption studies of the many industrial chemicals and many formulations can be very expensive, and many efforts have been made to mimic the skin using synthetic membranes [8,9,10,11,12,13]. Xia et al. [14] proposed an intriguing technique, called the membrane-coated fiber (MCF) assay approach, to simulate the different molecular interactions in skin permeation by different types of materials. In this approach, an MCF is used as the absorption membrane to determine partition coefficients, namely the ratio of the concentration of solute partitioning to the MCF relative to the concentration of solute not partitioning to the MCF. The partition coefficient is a measurement of the strength of molecular interaction that governs percutaneous absorption processes. Assuming that the MCF adequately represents skin absorption, larger values of partition coefficients suggest greater levels of absorption of the solute into skin, translating to possible health implications during the metalworking processes.
To relate the dermal permeability of a solute to the solute’s chemical structure or properties, it is very common practice to develop and study a relevant quantitative structure-activity relationship (QSAR) model, as classically demonstrated by [15] and [16] and more recently in studies more relevant to this paper [17,18,19]. Many commonly used QSAR models are linear regression models that use the biological activity (partition coefficients, permeation coefficients, etc.) as the response variable and the molecular descriptors as predictors. The linear free-energy relationship (LFER) model of [20] is a particular type of QSAR model that is widely used in modeling results from dermal permeability studies. The LFER model is easy to use and interpret. However, when experimental conditions are complex, a simple LFER model may not be able to appropriately account for the observed variability, leading to a model with poor fit statistics and low predictive power. Xu et al. [19] expanded the LFER model to account for the heterogeneity introduced by experimental factors by defining one set of partial slopes for each experimental condition. This model proved to be useful, improving both the model fit statistics and predictive power. This article pursues extensions of the LFER model that are in the spirit of [19], but we are able to obtain further improvements in model performance by incorporating additional features observed in the current study. The critical role played by the model assessment criterion $Q^2_{LOSO}$ is also reviewed. The resulting model provides interpretations that are useful for identifying solutes whose chemical structures are consistent with low predicted levels of skin permeability.
An attractive feature of the MCF approach of [14] is their proposed partition theory, namely that the partition coefficient of a solute from a formulation is not affected by the starting concentration of that solute in the formulation. This theory, if realized, can lead to simplified analysis even in the most complex of experimental conditions. By applying an expanded LFER model, we are able to test this theory that could not otherwise be tested.
Earlier efforts by Xia et al. (2007) [13] demonstrated the use of an MCF array to simulate skin permeability in simple binary mixtures. The present paper instead combines the MCF with molecular structure parameters in the LFER models described above to estimate how several real-world formulations, at various concentrations, affect the partitioning of 37 solutes, at different concentrations, into the MCF, which serves as a surrogate for skin permeability.

2. Results and Discussion

2.1. Data Summaries

Formulations are designed to mimic fluids used in the metalworking process. For this article, a formulation refers to: a particular metalworking fluid (MWF), at a particular MWF concentration, spiked with a solute at a particular concentration. Formulations are spiked with trace levels of solutes in such a way that the chemistry of the MWF is not altered.
In this study, we considered 37 solutes (see Table 1) and five solvatochromic descriptors believed to be most relevant to the solvation process during permeation [16,20]. These descriptors represent different characteristics of compounds involved in the solvation process, specified as follows. E is the solute excess molar refractivity, S is the solute dipolarity/polarizability, A is the overall hydrogen bond acidity, B is the overall hydrogen bond basicity, and V is the McGowan characteristic volume. For most solutes, V can be calculated directly, E can be obtained from experiment or calculated, but A, B, and S must be experimentally derived.
We varied the three other factors to create a formulation: the MWF, MWF concentration, and solute concentration. Five MWFs were considered: mineral oil (MO), polyethylene glycol-200 (PEG), soluble oil (SO), synthetic oil (SYN), and semi-synthetic oil (SSYN). MWF concentrations were at three levels: 0.05 percent, 0.5 percent, and 5 percent. Six solute concentrations were considered: 0.01, 0.05, 0.1, 0.5, 1, and 5 ppm. As a result, there were 5 × 3 × 6 = 90 treatment combinations, as displayed in Table A1 in Appendix A.
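The 5 × 3 × 6 design can be enumerated directly. The sketch below is illustrative only: Table A1 is not reproduced here, so the ordering (MWF varying slowest, solute concentration varying fastest) is an assumption, chosen because it agrees with the treatment combination numbers quoted in this article (for example, combination 5 is mineral oil at 0.05 percent with 1 ppm solute). The names `MWFS`, `MWF_PCT`, `SOLUTE_PPM`, and `combos` are hypothetical.

```python
from itertools import product

MWFS = ["MO", "PEG", "SO", "SYN", "SSYN"]       # five metalworking fluids
MWF_PCT = [0.05, 0.5, 5.0]                      # MWF concentration, percent
SOLUTE_PPM = [0.01, 0.05, 0.1, 0.5, 1.0, 5.0]   # solute concentration, ppm

# Assumed ordering: MWF varies slowest, solute concentration varies fastest.
combos = {
    number: combo
    for number, combo in enumerate(product(MWFS, MWF_PCT, SOLUTE_PPM), start=1)
}

print(len(combos))                      # number of treatment combinations
for number in (5, 17, 52):
    print(number, combos[number])       # (MWF, MWF percent, solute ppm)
```

Under this assumed ordering, the lookups for combinations 5, 17, and 52 return the same conditions that the text attaches to those numbers.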
The study was designed to obtain partition coefficients, $K_{MCF/mix}$, for all 37 solutes, under each of the 90 treatment combinations, using three replicates. Unfortunately, for a variety of reasons (e.g., lack of detection in gas chromatography, records outside the calibration range, etc.), not all replicates were recordable, and some treatment combinations yielded no replicates at all for a particular solute. Fitting the QSAR model does not require replicates because of the structure provided by the model, and all collected data inform the fitting process. Having the full set of replicates would likely result in smaller measures of variability, and hence greater power to make inference beyond what could be done here, but the missing replicates have not impeded the ability to conduct statistical analysis and model building. Of the maximum possible 37 × 90 × 3 = 9990 observations, we actually generated 4646 partition coefficients.
Summary statistics are displayed in Table 2 for all variables, based on the complete dataset of 4646 observations. Partition coefficients range from 0.015 to 1279 (−1.820 to 3.107 on the base 10 logarithm scale). To get a more detailed view of the range of values for partition coefficients, Figure 1 shows boxplots of $\log K_{MCF/mix}$ grouped by solute concentration. It is somewhat surprising that the smallest partition coefficients are associated with higher concentrations of solute present in the formulation; we return to this observation later in the article.

2.2. Insufficiency of the LFER Model

Abraham and Martins [20] proposed the general linear free-energy relationship (LFER) model to study dermal absorption:
$$SP = \beta_0 + \beta_1 E + \beta_2 S + \beta_3 A + \beta_4 B + \beta_5 V,$$
where $SP$ is the property of interest for the solutes (such as $\log K_p$, $\log P$, etc.). Given data, the coefficients in the LFER model are determined by multiple linear regression. These coefficients are also commonly denoted as c, e, s, a, b, and v; we used $\beta_0$, $\beta_1$, $\beta_2$, $\beta_3$, $\beta_4$, and $\beta_5$ as this is more common in the literature of multiple linear regression. In this article, the logarithm of the partition coefficient, $\log K_{MCF/mix}$, is the property of interest. The resulting LFER model is shown in Equation (1):
$$\log K_{MCF/mix} = \beta_0 + \beta_1 E + \beta_2 S + \beta_3 A + \beta_4 B + \beta_5 V. \tag{1}$$
While the LFER model in Equation (1) is simple and easy to interpret, it is not always sufficient, especially for large datasets under complicated experimental conditions. Equation (1) suggests that the expected value of $\log K_{MCF/mix}$ is a function of only E, S, A, B, and V. However, as is clearly demonstrated in Figure 1, $\log K_{MCF/mix}$ decreases as solute concentration increases, suggesting that solute concentration should likely be included as a predictor in Equation (1); we return to this observation below.
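To make the regression step concrete, the following minimal sketch estimates the six coefficients of Equation (1) by ordinary least squares, solving the normal equations in plain Python. It is not the software used for this study. The descriptor values and the "true" coefficients are synthetic and hypothetical (they are not taken from Table 1 or Table 3), and the helper names `solve` and `fit_lfer` are introduced only for illustration.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def fit_lfer(descriptors, log_k):
    """Least-squares estimates [b0, ..., b5] for log K = b0 + b1 E + b2 S + b3 A + b4 B + b5 V."""
    X = [[1.0] + list(row) for row in descriptors]     # prepend the intercept column
    p = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(X, log_k)) for i in range(p)]
    return solve(xtx, xty)

# Synthetic, noise-free data: 12 hypothetical solutes with made-up (E, S, A, B, V) values.
rng = random.Random(0)
descriptors = [[rng.uniform(0.0, 2.0) for _ in range(5)] for _ in range(12)]
true_beta = [0.5, 0.6, -1.0, -0.8, -2.5, 3.0]          # hypothetical b0..b5
log_k = [true_beta[0] + sum(b * x for b, x in zip(true_beta[1:], row))
         for row in descriptors]

estimates = fit_lfer(descriptors, log_k)
print([round(b, 6) for b in estimates])
```

Because the synthetic responses contain no noise, the estimates should reproduce the hypothetical coefficients up to floating-point error. With real data, a numerically more robust routine (for example, a QR-based least-squares solver) would normally be preferred over the normal equations.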
Focusing for the moment on the LFER model, Equation (1) was separately applied to data from each of the 90 treatment combinations, resulting in 90 separate estimated models. If all 90 estimated models essentially coincide, then the LFER model that only accounts for E, S, A, B, and V, and does not adjust for experimental conditions, is sufficient. To investigate this, Table 3 presents details on three of the 90 estimated models; details include estimated coefficients, their standard errors, and associated 95 percent confidence intervals. Estimated models are shown for: treatment combination 5, with mineral oil at 0.05 percent and solute concentration 1 ppm; treatment combination 17, with mineral oil at five percent and solute concentration 1 ppm; and treatment combination 52, with soluble oil at five percent and solute concentration 0.5 ppm.
The estimated models in Table 3 did not coincide. Consider, for example, the coefficient $\beta_1$ corresponding to E. For treatment combination 5, the 95 percent confidence interval consists of only positive values (0.89 to 2.65), suggesting that $\log K_{MCF/mix}$ is expected to increase as excess molar refractivity increases. On the other hand, the 95 percent confidence interval consists of only negative values (−1.35 to −0.48) for treatment combination 17, suggesting that $\log K_{MCF/mix}$ is expected to decrease as excess molar refractivity increases. These conflicting interpretations are not isolated. Figure 2 graphs the 95 percent confidence intervals for coefficient $\beta_1$ corresponding to E from all 90 treatment combinations, and these intervals clearly do not coincide. Moreover, similar results hold for all coefficients, as demonstrated in Table 3.

2.3. Improvement by Expanded LFER Models

Xu et al. [19] demonstrate insufficiency of the LFER model for accounting for experimental conditions defined by four MWFs. They extend the LFER model by allowing for different sets of estimated coefficients for each of the four MWFs, all while using a single model. They obtained substantial improvements in predictive power of the Extended LFER model compared to the (single) LFER model. Hoping to achieve similar levels of improvement as [19], we also fitted an Extended LFER model that allows for different sets of estimated coefficients for each of the 90 treatment combinations, while using a single model, as follows:
$$
\begin{aligned}
\log K_{MCF/mix,ijkl} ={}& F_{1jkl} C_{i1kl} W_{ij1l} \left(\beta_{0111} + \beta_{1111} E_l + \beta_{2111} S_l + \beta_{3111} A_l + \beta_{4111} B_l + \beta_{5111} V_l\right) \\
&+ F_{1jkl} C_{i1kl} W_{ij2l} \left(\beta_{0112} + \beta_{1112} E_l + \beta_{2112} S_l + \beta_{3112} A_l + \beta_{4112} B_l + \beta_{5112} V_l\right) \\
&+ \cdots \\
&+ F_{5jkl} C_{i3kl} W_{ij6l} \left(\beta_{0536} + \beta_{1536} E_l + \beta_{2536} S_l + \beta_{3536} A_l + \beta_{4536} B_l + \beta_{5536} V_l\right),
\end{aligned} \tag{2}
$$
where $\log K_{MCF/mix,ijkl}$ is the $l$th observation from MWF $i$ ($i = 1$ for MO, $i = 2$ for PEG, $i = 3$ for SO, $i = 4$ for SYN, and $i = 5$ for SSYN), MWF concentration $j$ ($j = 1$ for 0.05, $j = 2$ for 0.5, and $j = 3$ for 5 percent), and solute concentration $k$ ($k = 1$ for 0.01, $k = 2$ for 0.05, $k = 3$ for 0.1, $k = 4$ for 0.5, $k = 5$ for 1, and $k = 6$ for 5 ppm). In Equation (2), $\beta_{dijk}$ denotes the coefficient for descriptor $d$ (with $d = 0$ for the intercept, $d = 1$ for E, $d = 2$ for S, $d = 3$ for A, $d = 4$ for B, and $d = 5$ for V) corresponding to MWF $i$, MWF concentration $j$, and solute concentration $k$. For example, $\beta_{1111}$ is the partial slope for descriptor E under treatment combination 1, with mineral oil at 0.05 percent and solute concentration 0.01 ppm. Three “dummy variables” $F_{ijkl}$, $C_{ijkl}$, and $W_{ijkl}$ are defined to indicate treatment combinations; these variables take value zero or one according to the levels of MWF, MWF concentration, and solute concentration. $F_{ijkl} = 1$ if the observation comes from MWF $i$, otherwise $F_{ijkl} = 0$; $C_{ijkl} = 1$ if the observation comes from MWF concentration $j$, otherwise $C_{ijkl} = 0$; and $W_{ijkl} = 1$ if the observation comes from solute concentration $k$, otherwise $W_{ijkl} = 0$.
The model in Equation (2) is quite large, having a maximum of 90 intercepts (one for each treatment combination) and 5 × 90 = 450 partial slopes (slopes corresponding to each of E, S, A, B, and V for each treatment combination). For any given observation, Equation (2) activates only a single set of coefficients because the product $F_{ijkl} C_{ijkl} W_{ijkl}$ will only be nonzero for a single treatment combination. For example, if the observation is in treatment combination 2 (mineral oil at concentration 0.05 percent with solute concentration 0.05 ppm), then $F_{1jkl} C_{i1kl} W_{ij2l} = 1$ and all other $F_{ijkl} C_{ijkl} W_{ijkl} = 0$, thus activating only $\beta_{0112} + \beta_{1112} E_l + \beta_{2112} S_l + \beta_{3112} A_l + \beta_{4112} B_l + \beta_{5112} V_l$ in Equation (2). Since Equation (2) is based on multiplying the dummy variables, we refer to it as the Expanded Crossed-Factors LFER model.
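The way the dummy-variable product selects exactly one coefficient set can be sketched in a few lines. This is an illustration of the mechanism in Equation (2), not the fitting code used in the study. The function names (`indicator`, `active_cells`, `predict`) and the coefficient values in `beta` are hypothetical.

```python
from itertools import product

N_MWF, N_CONC, N_SOLUTE = 5, 3, 6   # levels of MWF, MWF concentration, solute concentration

def indicator(level, n_levels):
    """One-hot dummy variables for a 1-based factor level."""
    return [1 if level == h else 0 for h in range(1, n_levels + 1)]

def active_cells(i, j, k):
    """Cells (i', j', k') whose product F*C*W equals 1 for an observation at (i, j, k)."""
    F, C, W = indicator(i, N_MWF), indicator(j, N_CONC), indicator(k, N_SOLUTE)
    cells = product(range(1, N_MWF + 1), range(1, N_CONC + 1), range(1, N_SOLUTE + 1))
    return [(a, b, c) for a, b, c in cells if F[a - 1] * C[b - 1] * W[c - 1] == 1]

def predict(beta, i, j, k, descriptors):
    """Evaluate Equation (2); beta maps (i, j, k) to [b0, b1, ..., b5]."""
    (cell,) = active_cells(i, j, k)          # exactly one cell survives the product
    b = beta[cell]
    return b[0] + sum(coef * x for coef, x in zip(b[1:], descriptors))

# Hypothetical coefficients: all zero except treatment combination 2, cell (1, 1, 2).
beta = {cell: [0.0] * 6
        for cell in product(range(1, 6), range(1, 4), range(1, 7))}
beta[(1, 1, 2)] = [1.0, 0.5, 0.0, 0.0, 0.0, 0.0]

print(active_cells(1, 1, 2))
print(predict(beta, 1, 1, 2, [2.0, 0.0, 0.0, 0.0, 0.0]))
```

In practice the same effect is usually obtained by building interaction columns in a design matrix, so that a single least-squares fit estimates all 540 coefficients at once.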
Table 4 shows regression statistics of fitting the Expanded Crossed-Factors LFER model of Equation (2). Regression statistics are also shown for the (single) LFER model of Equation (1), and another model to be described later. The improvements in $r^2$, Adj-$r^2$, $Q^2_{LOO}$, and $Q^2_{LOSO}$ are quite noticeable in favor of the Expanded Crossed-Factors LFER model over the LFER model. While $r^2$ and Adj-$r^2$ are widely known, $Q^2_{LOO}$ and $Q^2_{LOSO}$ may be less familiar. Both $Q^2_{LOO}$ and $Q^2_{LOSO}$ are designed to measure predictive ability of a model, but [19] demonstrate the advantage of $Q^2_{LOSO}$ over $Q^2_{LOO}$ for the current context. Leave-one-out (LOO) cross-validation is employed in both, meaning models are fit after reducing the dataset, then the resulting fit is used to make predictions on the portion of the data that was left out. The difference is that $Q^2_{LOSO}$ leaves out an entire solute at a time, whereas $Q^2_{LOO}$ omits a single row from the dataset. If only a single row is removed from the dataset, we are left with the possibility that a single replicate of a solute in a particular formulation may be removed, but the other two replicates remain in the dataset. The result is that the model is fit with almost full knowledge of the solute in question, and the consequence is that we are misled about the quality of the model for fitting “new, unseen” solutes. By removing every instance of a solute, $Q^2_{LOSO}$ provides a better assessment of the quality of the model for predicting new, unseen solutes. Large values are desirable for both $Q^2_{LOO}$ and $Q^2_{LOSO}$, but the extra demands placed on $Q^2_{LOSO}$ usually result in smaller values of $Q^2_{LOSO}$ compared to $Q^2_{LOO}$, in much the same way that Adj-$r^2$ is often smaller than $r^2$. (It is important to note that $Q^2_{LOSO}$ in this article is equivalent to $Q^2_{LOOadj}$ in [19]. We prefer the simpler “LOSO” as it more clearly explains the difference from “LOO”.)
$Q^2_{LOO}$ is calculated as
$$Q^2_{LOO} = 1 - \frac{\sum_{l=1}^{n} \left(y_l - \hat{y}_{l,-l}\right)^2}{\sum_{l=1}^{n} \left(y_l - \bar{y}\right)^2}, \tag{3}$$
where $y_l$ is the $l$th observed response of $\log K_{MCF/mix}$, $\hat{y}_{l,-l}$ is the leave-one-out prediction of the $l$th observation based on the model fit without the $l$th observation, and $\bar{y}$ is the average of all the observed responses. $Q^2_{LOSO}$, designed by [19] to handle pseudo or real replicates in leave-one-out cross-validation for proper assessment of predictive power, is defined as:
$$Q^2_{LOSO} = 1 - \frac{\sum_{s=1}^{37} \sum_{l=1}^{n_s} \left(y_{sl} - \hat{y}_{sl,-s}\right)^2}{\sum_{s=1}^{37} \sum_{l=1}^{n_s} \left(y_{sl} - \bar{y}\right)^2}, \tag{4}$$
where $y_{sl}$ is the $l$th observation of the $s$th solute, $\bar{y}$ is the average of all the observed responses, and $\hat{y}_{sl,-s}$ is the predicted value of $y_{sl}$ based on the model fit from leaving out all the observations belonging to the $s$th solute.
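The difference between the two criteria can be illustrated with a deliberately small example. The sketch below uses a one-descriptor straight-line model in place of the full LFER model, and six hypothetical solutes, each with three identical replicates; none of the numbers come from this study. Identical replicates are an extreme case chosen so that the optimism of leave-one-out is easy to see: when one replicate is held out, its two copies remain in the training data.

```python
def fit_line(xs, ys):
    """Closed-form simple linear regression; returns (intercept, slope)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - slope * mx, slope

def q2(rows, leave_out_solute):
    """Cross-validated Q^2 for rows of (solute, x, y).

    leave_out_solute=False omits one row at a time (LOO);
    leave_out_solute=True omits every row of the solute being predicted (LOSO).
    """
    ybar = sum(r[2] for r in rows) / len(rows)
    press = 0.0
    for idx, (solute, x, y) in enumerate(rows):
        if leave_out_solute:
            train = [r for r in rows if r[0] != solute]
        else:
            train = [r for n, r in enumerate(rows) if n != idx]
        a, b = fit_line([r[1] for r in train], [r[2] for r in train])
        press += (y - (a + b * x)) ** 2
    total = sum((r[2] - ybar) ** 2 for r in rows)
    return 1.0 - press / total

# Hypothetical solutes: one descriptor value and one response, replicated three times.
solutes = {"s1": (0.5, 0.9), "s2": (1.0, 2.3), "s3": (1.5, 2.8),
           "s4": (2.0, 4.4), "s5": (2.5, 4.7), "s6": (3.0, 6.2)}
rows = [(name, x, y) for name, (x, y) in solutes.items() for _ in range(3)]

q2_loo = q2(rows, leave_out_solute=False)
q2_loso = q2(rows, leave_out_solute=True)
print(round(q2_loo, 4), round(q2_loso, 4))
```

For this toy dataset the leave-one-solute-out value is smaller than the leave-one-out value, mirroring the relationship between the two criteria described above.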
While $Q^2_{LOSO}$ showed improvement of the Expanded Crossed-Factors LFER model over the LFER model, the value of 0.68 is not impressive and indicates some deficiency of the model. One possible reason may be overfitting. With so many regression parameters, this model seems to fit the data too closely, so that the idiosyncrasies of the data are captured instead of the general trends. The problem with overfitting is that when the model is applied to a new dataset, it cannot predict the new data well, as indicated by the weak value of $Q^2_{LOSO}$. This motivates us to look for an alternative model, which not only accounts for the heterogeneity introduced by different experimental conditions, but is also simpler and more predictive. The LFER model may be expanded in a variety of ways that accommodate experimental conditions, and the goal is to identify the simplest adequate expansion. As previously mentioned, the Expanded Crossed-Factors LFER model of Equation (2) is quite large, and we wondered whether it could be simplified.
Figure 1 tells us that partition coefficients decrease as the solute concentration increases. This suggests that there may be a quantifiable relationship between $\log K_{MCF/mix}$ and solute concentration. However, Figure 1 is the overall effect of solute concentration, not accounting for the effect of MWF or MWF concentration. Thus, a more detailed visualization is desired. Figure 3 depicts the trend of $\log K_{MCF/mix}$ over solute concentration in all 15 combinations of MWF and MWF concentration. It shows a similar trend as in Figure 1, for each of the 15 combinations of MWF and MWF concentration. Figure 3 suggests that instead of viewing solute concentration as a third factor crossed with MWF and MWF concentration, we can take it as a (numerically) nested factor within each of the combinations of MWF and MWF concentration. In other words, for each combination of MWF and MWF concentration, allow a different partial slope for solute concentration. By doing this, we place a structure within each MWF × MWF concentration condition, and may be able to see how $\log K_{MCF/mix}$ changes as a function of solute concentration.
We propose a new Expanded Nested-Solute-Concentration LFER model as in Equation (5):
$$
\begin{aligned}
\log K_{MCF/mix,ijl} ={}& F_{1jl} C_{i1l} \left(\beta_{011} + \beta_{111} E_l + \beta_{211} S_l + \beta_{311} A_l + \beta_{411} B_l + \beta_{511} V_l + \beta_{611} t_l\right) \\
&+ F_{1jl} C_{i2l} \left(\beta_{012} + \beta_{112} E_l + \beta_{212} S_l + \beta_{312} A_l + \beta_{412} B_l + \beta_{512} V_l + \beta_{612} t_l\right) \\
&+ \cdots \\
&+ F_{5jl} C_{i3l} \left(\beta_{053} + \beta_{153} E_l + \beta_{253} S_l + \beta_{353} A_l + \beta_{453} B_l + \beta_{553} V_l + \beta_{653} t_l\right),
\end{aligned} \tag{5}
$$
where $\log K_{MCF/mix,ijl}$ is the $l$th observation from MWF $i$ and MWF concentration $j$, $t_l$ is the logarithm (base 10) of solute concentration of the $l$th observation, and $\beta_{dij}$ is the regression coefficient of descriptor $d$ ($d = 0$ for intercept, $d = 1$ for E, $d = 2$ for S, $d = 3$ for A, $d = 4$ for B, $d = 5$ for V, and $d = 6$ for logarithm of solute concentration), for MWF $i$ and MWF concentration $j$. We take the logarithm of solute concentration as it is common practice and it linearizes the relationship. This model is relatively small, with a maximum of 15 × 7 = 105 coefficients to be estimated, compared to a maximum of 540 for the model in Equation (2).
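One way to see where the 105 coefficients come from is to write out the design-matrix row implied by Equation (5) for a single observation: seven entries (intercept, E, S, A, B, V, and the base 10 logarithm of solute concentration) placed in the block belonging to that observation's MWF and MWF concentration, with zeros elsewhere. The sketch below is illustrative; the function name `design_row`, the block ordering, and the descriptor values in the example call are assumptions, not part of the published analysis.

```python
import math

N_MWF, N_CONC = 5, 3
TERMS = 7   # intercept, E, S, A, B, V, log10(solute concentration)

def design_row(i, j, descriptors, solute_ppm):
    """Design row for Equation (5): the (i, j) block holds [1, E, S, A, B, V, t], the rest is zero."""
    row = [0.0] * (N_MWF * N_CONC * TERMS)
    start = ((i - 1) * N_CONC + (j - 1)) * TERMS   # assumed block order: MWF, then MWF concentration
    row[start:start + TERMS] = [1.0, *descriptors, math.log10(solute_ppm)]
    return row

# Hypothetical solute in mineral oil (i = 1) at 0.5 percent (j = 2), spiked at 0.1 ppm.
row = design_row(1, 2, [0.8, 0.9, 0.3, 0.4, 1.2], 0.1)

print(len(row))                               # coefficients in the nested model
print(N_MWF * N_CONC * 6 * (6))               # coefficients in Equation (2): 90 cells x 6 terms
print([v for v in row if v != 0.0])           # the seven active entries
```

Stacking one such row per observation gives a matrix that any least-squares routine can fit in a single pass.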
Regression statistics are shown in Table 4, and it is clear that the Expanded Nested-Solute-Concentration LFER model of Equation (5) is at least as good as the Expanded Crossed-Factors LFER model of Equation (2), because it has comparable or larger values for all regression statistics. However, the Expanded Nested-Solute-Concentration LFER model of Equation (5) has a tremendous advantage in that: (1) it is much smaller, and so more amenable to interpretation; and (2) it is more predictive as indicated by a much larger value for $Q^2_{LOSO}$.
Figure 4 plots observed versus predicted $\log K_{MCF/mix}$ values for both the LFER and Expanded Nested-Solute-Concentration LFER models. The tighter grouping around the line for the latter model is yet another demonstration of that model’s better predictive power.

2.4. Model Interpretation

We now interpret the estimated Expanded Nested-Solute-Concentration LFER model of Equation (5).
There are 15 rows in Equation (5), each representing the regression function for one combination of MWF/MWF concentration. For example, row one is for MWF mineral oil at concentration 0.05 percent, while row 15 is for MWF semi-synthetic oil at concentration five percent. Each row has a set of partial slopes that vary among the different combinations of MWF/MWF concentration. The estimates and associated standard errors of all partial slopes are shown in Table A2 in Appendix A.
To show how the partial slopes vary, in Figure 5 we plot 95 percent confidence intervals for each partial slope corresponding to E, S, A, B, V and log solute concentration across all 15 combinations of MWF/MWF concentration. The 95 percent confidence intervals are shown as vertical lines with two bars at the ends. A horizontal reference line at zero is also shown. There are some interesting trends seen in Figure 5.
For example, in Figure 5a, the partial slope of E generally decreases as MWF concentration increases within each MWF. In mineral oil, the effect (sign of $\beta_1$) of E (solute excess molar refractivity) even changes as MWF concentration increases. To be specific, using mineral oil at a concentration of 0.05 percent, if we increase solute excess molar refractivity and other predictors are held fixed, then the partition coefficient is expected to increase (the 95 percent confidence interval lies above the reference line). On the other hand, using mineral oil at the higher concentration of five percent, if we increase solute excess molar refractivity, then we expect the partition coefficient to decrease (the 95 percent confidence interval lies below the reference line).
In Figure 5b, the partial slope of S generally increases as MWF concentration increases within mineral oil, soluble oil, and semi-synthetic oil, but partial slopes show no significant change as MWF concentration increases within polyethylene glycol-200 and synthetic oil. In general, S (solute dipolarity/polarizability) has an inverse relationship with expected partition coefficient, meaning that as S increases we expect a decrease in partition coefficient.
Figure 5c suggests increased levels of hydrogen bond acidity A are associated with decreased partition coefficients. However, the pattern of decrease changes according to the concentration of MWF. For example, in both mineral oil and soluble oil, higher MWF concentrations result in a smaller decrease in partition coefficients. Figure 5d indicates that increased levels of hydrogen bond basicity B generally lead to decreased partition coefficients.
Figure 5e says larger molecules tend to have larger partition coefficients. In soluble oil, synthetic oil and semi-synthetic oil, the effect of molecule size V gets smaller as MWF concentration increases, resulting in less dramatic effect of molecule size on partition coefficients.
Figure 5f suggests that higher concentrations of solute generally result in lower partition coefficients. In both mineral oil and soluble oil, higher MWF concentrations result in stronger inverse relationships.

2.5. Validation of Partition Theory

2.5.1. Implication of Partition Theory

According to [14], it is assumed that the amount of solute extracted from the MCF, $n_0$, is proportional to the solute concentration, $C_0$, where the proportionality constant is not affected by $C_0$. Based on this assumption, we obtain $n_0 = p C_0$, where $p$ is the proportionality constant and $0 \le p \le 1$. Applying this relationship to partition coefficients, we obtain:
$$K_{MCF/mix} = \frac{n_0 V_d}{V_m \left(C_0 V_d - n_0\right)} = \frac{p C_0 V_d}{V_m \left(C_0 V_d - p C_0\right)} = \frac{p V_d}{V_m \left(V_d - p\right)}. \tag{6}$$
Equation (6) suggests that $K_{MCF/mix}$ is independent of $C_0$, which suggests that irrespective of the solute concentration, the partition coefficient remains the same. This so-called “partition theory”, if true, has practical meaning in the metalworking industry as it would indicate that increasing solute concentration has no impact on skin permeation ability of the solute. For example, higher concentrations of biocides might be preferred to extend preservation of fluids, while there is no detrimental effect of increasing the biocide’s ability to permeate skin. As described in more detail in the methods section, the MCF consists of a PDMS coating that is 100 µm thick and 1 cm long on an inert silica fiber. Solute partitioning into this membrane is dependent on the many chemical-chemical interactions quantified by our Expanded LFER models. However, the membrane volume ($V_m$) suggests that this may be a limitation with increasing solute concentration. It was, therefore, interesting to see if this partition theory is supported by our data.
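The algebra in Equation (6) can be checked numerically. In the sketch below, the values of `p`, `v_d`, and `v_m` are arbitrary placeholders chosen only to exercise the formula (they are not measurements from this study), and the function name `partition_coefficient` is hypothetical. Under the proportionality assumption $n_0 = p C_0$, the computed coefficient should be the same at every solute concentration.

```python
import math

def partition_coefficient(n0, c0, v_d, v_m):
    """First form of Equation (6): K = n0 * V_d / (V_m * (C0 * V_d - n0))."""
    return (n0 * v_d) / (v_m * (c0 * v_d - n0))

# Arbitrary illustrative values, not taken from the experiments.
p, v_d, v_m = 0.4, 9.0, 0.0006

closed_form = p * v_d / (v_m * (v_d - p))      # last form of Equation (6)
values = [partition_coefficient(p * c0, c0, v_d, v_m)
          for c0 in (0.01, 0.05, 0.1, 0.5, 1.0, 5.0)]

for k in values:
    print(math.isclose(k, closed_form, rel_tol=1e-9))
```

The check only confirms that the theory is internally consistent: if the extracted amount is strictly proportional to $C_0$, then $C_0$ cancels. Whether real data behave this way is the empirical question addressed next.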

2.5.2. Violation from Experimental Data

Assume the Expanded Nested-Solute-Concentration LFER model of Equation (5). To test whether the partition theory holds, we simply tested whether the coefficients corresponding to any solute concentration terms are different from zero. If all coefficients corresponding to solute concentration terms equal zero in Equation (5), then $\log K_{MCF/mix}$ will not change as solute concentration changes. More specifically, we test the following null hypothesis:
$H_0$: $\beta_{6ij} = 0$ for all $i = 1, 2, 3, 4, 5$ and $j = 1, 2, 3$.
The resulting p-value of less than 0.0001 allows us to strongly conclude that the solute concentration term for at least one combination of MWF/MWF concentration is significantly different from zero. In fact, the individual p-values for testing each $\beta_{6ij} = 0$ show that the solute concentration effect is significantly different from zero for 12 of the 15 combinations; nonsignificance is obtained only in MO/0.05, PEG/5 and SYN/0.05. These results are consistent with Figure 5f, where confidence intervals contain zero only for mineral oil at concentration 0.05, polyethylene glycol-200 at concentration 5 and synthetic oil at concentration 0.05.
Hoping to find that the partition theory holds true in either low or high solute concentrations, we considered subsets of data that contain only some of the solute concentrations. Detailed results are given in Table 5 of testing the null hypothesis that the partition theory holds for a number of different subsets of solute concentrations. For example, does the partition theory hold when considering only observations with solute concentrations less than or equal to 1 ppm? The answer is provided by row two of Table 5: with a p-value of less than 0.0001, the partition theory does not hold for solute concentrations less than or equal to 1 ppm, with violations happening in eight of the 15 combinations. In fact, the partition theory is violated in all subsets of solute concentrations.

3. Materials and Methods

Our experiments were based on the MCF approach proposed in [14]. Only a single MCF was used, namely PDMS (polydimethylsiloxane). In the current study, solutes were dissolved into a particular formulation, then an MCF was placed in the vial to allow the solute to partition from the solute-spiked formulation into the MCF over a period of one to four hours; see Figure 6. Gas chromatography and mass spectrometry were then used to extract or desorb the solute from the MCF, and the amount extracted was recorded.

3.1. Solvent/Solute Preparation

Three industry-generic metalworking fluid (MWF) formulations (soluble oil, synthetic fluid, and semi-synthetic fluid) were kindly supplied by Cimcool Industrial Products LLC (Cincinnati, OH, USA). The precise composition of each of these three formulations is proprietary information. In general, soluble oil concentrates contained approximately 58% mineral oil along with various other performance additives such as sulfonates and ethanolamines; semi-synthetic fluid concentrates contained about 15% mineral oil along with other additives such as sulfonates and ethanolamines; and synthetic fluid concentrates contained no mineral oil but contained various carboxylic acid salts, ethanolamines, ethyleneglycols, and plant seed oils. This is typical of many commercial MWF formulations that fall into these three categories. In addition to these three MWFs, two laboratory-prepared surrogate formulations, mineral oil and PEG-200 (Aldrich, St. Louis, MO, USA), were prepared volumetrically in 0.05%, 0.5%, and 5.0% formulations in ultrapure water (Pure Water Solutions, Hillsborough, NC, USA). Each of these formulations was then spiked with a set of 37 solutes (Table 1) at six concentrations in the range 0.01–5.0 µg/mL. These solutes were chosen to represent a wide variety of physicochemical properties. All solutes were of the highest purity available for purchase (Sigma Aldrich, Milwaukee, WI, USA). The 37 solutes were also prepared in acetone in a 2000 µg/mL stock solution. Experimental solutions were prepared fresh and all samples were kept at ambient temperature prior to analysis by SPME/GC-MS. Liquid GC-MS injections of the same 37 solutes prepared in acetone (0.01–10.00 µg/mL) were run daily, as well as blank liquid (acetone) and SPME (prepared solvent without addition of the 37 solutes) injections.

3.2. SPME/GC-MS Analysis

SPME absorption and injection were performed with a CTC Analytics Combi-Pal autoinjector (Varian Inc., Walnut Creek, CA, USA) outfitted with a 100 µm polydimethylsiloxane SPME unit (Supelco Analytical, Bellefonte, PA, USA). A 9 mL sample was first agitated in a 37 °C heating block for 5 min; the SPME MCF (Figure 6) was then inserted and exposed for 30 min at 37 °C with constant agitation. SPME and liquid (0.5 µL) injections were introduced into a Varian 1079 injector (Varian Inc., Walnut Creek, CA, USA) at 280 °C in splitless mode for 5 min; at 5.5 min the split was turned on to 100%. A pressure pulse of 21.0 psi was applied for the first 30 s. Column flow was maintained at a constant 1.0 mL/min using helium as the carrier gas (National Welders, Raleigh, NC, USA). The Varian CP-3800 GC oven (Varian Inc., Walnut Creek, CA, USA) was programmed to hold at 40 °C for the first minute, followed by a 20 °C/min ramp to 90 °C (3.5 min); the ramp then slowed to 2.5 °C/min until 127 °C was reached (18.30 min), increased to 40 °C/min until 250 °C, where the temperature was held for 2.0 min (23.38 min), and continued at 40 °C/min to 280 °C, where it was held for 5.0 min (29.13 min). The Saturn 2200-MS (Varian Inc., Walnut Creek, CA, USA) was programmed to run in full scan mode (40–300 m/z) after the first 3.0 min. Individual solute peaks were identified and quantified by the Star v6.5 software (Varian Inc., Walnut Creek, CA, USA) using retention times and known quantitation ions, as identified and confirmed during initial method development. The sensitivity was set at 0.01 µg/mL because the solute concentrations ranged from 0.01 to 5.0 µg/mL. Importantly, no residues were detected in a second injection after each first test injection, which indicated negligible carryover under the optimum desorption conditions.
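The elapsed times quoted for the oven program can be checked with simple arithmetic. The following is a minimal sketch in Python, not part of the original method; the step list is transcribed from the paragraph above, and the helper name `cumulative_times` is purely illustrative.

```python
# Oven program as (target temperature in °C, ramp rate in °C/min, hold in min).
# A ramp rate of None marks the starting temperature, which is only held.
program = [
    (40.0, None, 1.0),
    (90.0, 20.0, 0.0),
    (127.0, 2.5, 0.0),
    (250.0, 40.0, 2.0),
    (280.0, 40.0, 5.0),
]


def cumulative_times(steps):
    """Return the elapsed time (min) at the end of each program step."""
    elapsed = 0.0
    current = steps[0][0]
    times = []
    for target, rate, hold in steps:
        if rate is not None:
            elapsed += abs(target - current) / rate  # time spent ramping
        elapsed += hold                              # time spent holding
        current = target
        times.append(elapsed)
    return times


times = cumulative_times(program)
for (target, _, _), t in zip(program, times):
    print(f"{target:5.0f} °C step ends at {t:7.3f} min")
```

The computed end points (3.5, 18.3, 23.375, and 29.125 min) agree with the reported 3.5, 18.30, 23.38, and 29.13 min after rounding.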
The differential ability of a solute to dissolve into the MCF or remain in the formulation was measured using a partition ratio (coefficient) $K_{MCF/mix}$ between the equilibrium concentration of the solute in the MCF and the equilibrium concentration of the solute in the formulation. $K_{MCF/mix}$ was calculated, following [14], as:
$$K_{MCF/mix} = \frac{C_p^e}{C_m^e} = \frac{n_0/V_m}{C_0 - n_0/V_d} = \frac{n_0 V_d}{V_m \left(C_0 V_d - n_0\right)}$$
where $n_0$ is the amount (in µg) of solute extracted from the MCF, $V_m$ is the volume (in mL) of the MCF, $V_d$ is the volume (in mL) of formulation placed in the vial, $C_0$ is the initial solute concentration (in µg/mL), $C_p^e = n_0/V_m$ is the equilibrium concentration of solute in the MCF, and $C_m^e = C_0 - n_0/V_d$ is the equilibrium concentration of solute in the formulation.
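As a worked illustration of the partition ratio defined above, the sketch below evaluates it in Python. The function name and the numerical inputs are assumptions for illustration only: the 9 mL sample volume comes from the text, whereas the coating volume and extracted amount are hypothetical values, not measurements from this study.

```python
import math


def partition_coefficient(n0_ug, v_m_ml, v_d_ml, c0_ug_per_ml):
    """K_MCF/mix = (n0 / V_m) / (C0 - n0 / V_d)."""
    if v_m_ml <= 0 or v_d_ml <= 0:
        raise ValueError("volumes must be positive")
    c_pe = n0_ug / v_m_ml                 # equilibrium concentration in the MCF
    c_me = c0_ug_per_ml - n0_ug / v_d_ml  # equilibrium concentration left in the formulation
    if c_me <= 0:
        raise ValueError("extracted amount exceeds the amount initially present")
    return c_pe / c_me


# Hypothetical inputs: 0.05 µg extracted into an assumed 0.0006 mL coating
# from 9 mL of formulation spiked at 1 µg/mL.
k = partition_coefficient(n0_ug=0.05, v_m_ml=0.0006, v_d_ml=9.0, c0_ug_per_ml=1.0)
print(f"K = {k:.2f}, log10 K = {math.log10(k):.3f}")
```

The guard on `c_me` reflects the fact that the ratio is only meaningful when less solute is extracted than was initially present in the vial.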
ADME Boxes 4.95, commercial software from ACD/Labs [21], was used to obtain the E, S, A, B, and V descriptors for all 37 solutes used in the experiment.

4. Summary and Conclusions

The partition theory of [14] does not appear to hold for the current study, as evidenced by Figure 1, Figure 3, and Table 5. It is probable that there is a finite number of binding sites available in the coating of the fiber (i.e., in the MCF). As the solute concentration increases, the percentage of the solute that absorbs and/or adsorbs to the membrane coating decreases due to this finite number of binding sites.
Notwithstanding the complications that arise from violations of the partition theory, our Expanded LFER models are able to adequately capture the variability of partition coefficients as a function of solute properties and experimental conditions. The Expanded Crossed-Factors LFER model based on [19] is a vast improvement over the single LFER model, while the Expanded Nested-Solute-Concentration LFER model developed in this article is even more refined, more predictive, and offers simple interpretations. Table 3, Table 4, Figure 2, and Figure 4 provide strong evidence that the simple LFER model is not adequate in the presence of complicated experimental conditions.
Proper assessment of model prediction ability is demonstrated with $Q^2_{LOSO}$ (previously $Q^2_{LOOadj}$ in [19]), and this measure is contrasted with $Q^2_{LOO}$ and the more familiar $r^2$ and Adj-$r^2$. The leave-one-solute-out strategy allows assessment to occur based on completely unseen solutes.
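The difference between leave-one-out and leave-one-solute-out assessment can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' implementation: it uses one common definition, $Q^2 = 1 - \mathrm{PRESS}/SS_{total}$, an ordinary least-squares fit, and synthetic data with replicates; the function name `q2_grouped` and all simulated values are hypothetical.

```python
import numpy as np


def q2_grouped(X, y, groups):
    """Cross-validated Q^2 = 1 - PRESS / SS_total, holding out one group at a time."""
    design = np.column_stack([np.ones(len(y)), X])  # prepend an intercept column
    press = 0.0
    for g in np.unique(groups):
        held_out = groups == g
        beta, *_ = np.linalg.lstsq(design[~held_out], y[~held_out], rcond=None)
        press += np.sum((y[held_out] - design[held_out] @ beta) ** 2)
    return 1.0 - press / np.sum((y - y.mean()) ** 2)


# Synthetic data (hypothetical, not the study's): 12 solutes x 3 replicates,
# two descriptors per solute, plus a solute-specific offset the model cannot see.
rng = np.random.default_rng(0)
n_solutes, n_reps = 12, 3
solute = np.repeat(np.arange(n_solutes), n_reps)
descriptors = rng.normal(size=(n_solutes, 2))
offset = rng.normal(scale=0.5, size=n_solutes)
X = descriptors[solute]
noise = rng.normal(scale=0.05, size=len(solute))
y = 1.0 + X @ np.array([0.8, -1.2]) + offset[solute] + noise

q2_loo = q2_grouped(X, y, np.arange(len(y)))  # each observation is its own group
q2_loso = q2_grouped(X, y, solute)            # all replicates of a solute leave together
print(f"Q2_LOO  = {q2_loo:.3f}")
print(f"Q2_LOSO = {q2_loso:.3f}")
```

With plain leave-one-out, the remaining replicates of the held-out observation's solute stay in the training set, so the prediction error is understated; removing all replicates of a solute at once evaluates the model on a solute it has never seen.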

Author Contributions

J.M.H.-O. and G.X. conducted all predictive modeling and assessment; data were collected in R.E.B.'s lab; and J.M.H.-O., G.X., and R.E.B. wrote, read, and approved the paper.

Funding

This work was supported by National Institute of Occupational Safety and Health grant NIOSH R01-OH-03669.

Acknowledgments

We thank Cimcool Fluid Technology (Cincinnati, OH, USA) for contributing the generic MWFs (synthetic, semi-synthetic, and soluble oil) for this study. We also thank James Brooks for contributions to an earlier version of this manuscript and Beth Barlow for collecting the data.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

This appendix contains tables that describe experimental results.
Table A1. Ninety treatment combinations with observation counts: MO for mineral oil, PEG-200 for polyethyleneglycol-200; SO for soluble oil; SYN for synthetic oil, and SSYN for semi-synthetic oil. Ideally, each treatment combination would include 37 × 3 = 111 observed partition coefficients, but missing data issues result in N observations, with N specified in the table.
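| Treatment Combination | MWF | MWF Conc. (%) | Solute Conc. (ppm) | N |
|---|---|---|---|---|
| 1 | MO | 0.05 | 0.01 | 53 |
| 2 | MO | 0.05 | 0.05 | 65 |
| 3 | MO | 0.05 | 0.1 | 65 |
| 4 | MO | 0.05 | 0.5 | 43 |
| 5 | MO | 0.05 | 1 | 39 |
| 6 | MO | 0.05 | 5 | 0 |
| 7 | MO | 0.5 | 0.01 | 62 |
| 8 | MO | 0.5 | 0.05 | 67 |
| 9 | MO | 0.5 | 0.1 | 76 |
| 10 | MO | 0.5 | 0.5 | 80 |
| 11 | MO | 0.5 | 1 | 60 |
| 12 | MO | 0.5 | 5 | 34 |
| 13 | MO | 5 | 0.01 | 53 |
| 14 | MO | 5 | 0.05 | 57 |
| 15 | MO | 5 | 0.1 | 66 |
| 16 | MO | 5 | 0.5 | 85 |
| 17 | MO | 5 | 1 | 86 |
| 18 | MO | 5 | 5 | 50 |
| 19 | PEG | 0.05 | 0.01 | 52 |
| 20 | PEG | 0.05 | 0.05 | 65 |
| 21 | PEG | 0.05 | 0.1 | 68 |
| 22 | PEG | 0.05 | 0.5 | 51 |
| 23 | PEG | 0.05 | 1 | 43 |
| 24 | PEG | 0.05 | 5 | 34 |
| 25 | PEG | 0.5 | 0.01 | 50 |
| 26 | PEG | 0.5 | 0.05 | 56 |
| 27 | PEG | 0.5 | 0.1 | 59 |
| 28 | PEG | 0.5 | 0.5 | 40 |
| 29 | PEG | 0.5 | 1 | 34 |
| 30 | PEG | 0.5 | 5 | 22 |
| 31 | PEG | 5 | 0.01 | 35 |
| 32 | PEG | 5 | 0.05 | 50 |
| 33 | PEG | 5 | 0.1 | 50 |
| 34 | PEG | 5 | 0.5 | 47 |
| 35 | PEG | 5 | 1 | 36 |
| 36 | PEG | 5 | 5 | 28 |
| 37 | SO | 0.05 | 0.01 | 59 |
| 38 | SO | 0.05 | 0.05 | 68 |
| 39 | SO | 0.05 | 0.1 | 47 |
| 40 | SO | 0.05 | 0.5 | 38 |
| 41 | SO | 0.05 | 1 | 30 |
| 42 | SO | 0.05 | 5 | 5 |
| 43 | SO | 0.5 | 0.01 | 51 |
| 44 | SO | 0.5 | 0.05 | 71 |
| 45 | SO | 0.5 | 0.1 | 70 |
| 46 | SO | 0.5 | 0.5 | 79 |
| 47 | SO | 0.5 | 1 | 56 |
| 48 | SO | 0.5 | 5 | 14 |
| 49 | SO | 5 | 0.01 | 43 |
| 50 | SO | 5 | 0.05 | 48 |
| 51 | SO | 5 | 0.1 | 54 |
| 52 | SO | 5 | 0.5 | 74 |
| 53 | SO | 5 | 1 | 73 |
| 54 | SO | 5 | 5 | 76 |
| 55 | SYN | 0.05 | 0.01 | 43 |
| 56 | SYN | 0.05 | 0.05 | 55 |
| 57 | SYN | 0.05 | 0.1 | 59 |
| 58 | SYN | 0.05 | 0.5 | 42 |
| 59 | SYN | 0.05 | 1 | 41 |
| 60 | SYN | 0.05 | 5 | 16 |
| 61 | SYN | 0.5 | 0.01 | 57 |
| 62 | SYN | 0.5 | 0.05 | 67 |
| 63 | SYN | 0.5 | 0.1 | 73 |
| 64 | SYN | 0.5 | 0.5 | 46 |
| 65 | SYN | 0.5 | 1 | 40 |
| 66 | SYN | 0.5 | 5 | 17 |
| 67 | SYN | 5 | 0.01 | 47 |
| 68 | SYN | 5 | 0.05 | 63 |
| 69 | SYN | 5 | 0.1 | 62 |
| 70 | SYN | 5 | 0.5 | 62 |
| 71 | SYN | 5 | 1 | 53 |
| 72 | SYN | 5 | 5 | 34 |
| 73 | SSYN | 0.05 | 0.01 | 58 |
| 74 | SSYN | 0.05 | 0.05 | 65 |
| 75 | SSYN | 0.05 | 0.1 | 68 |
| 76 | SSYN | 0.05 | 0.5 | 47 |
| 77 | SSYN | 0.05 | 1 | 40 |
| 78 | SSYN | 0.05 | 5 | 20 |
| 79 | SSYN | 0.5 | 0.01 | 50 |
| 80 | SSYN | 0.5 | 0.05 | 65 |
| 81 | SSYN | 0.5 | 0.1 | 68 |
| 82 | SSYN | 0.5 | 0.5 | 63 |
| 83 | SSYN | 0.5 | 1 | 32 |
| 84 | SSYN | 0.5 | 5 | 19 |
| 85 | SSYN | 5 | 0.01 | 42 |
| 86 | SSYN | 5 | 0.05 | 57 |
| 87 | SSYN | 5 | 0.1 | 60 |
| 88 | SSYN | 5 | 0.5 | 70 |
| 89 | SSYN | 5 | 1 | 74 |
| 90 | SSYN | 5 | 5 | 54 |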
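The numbering of the 90 treatment combinations in Table A1 follows a full crossing of the three factors, with solute concentration varying fastest. A minimal Python sketch of that enumeration is given below; the variable names are illustrative and not part of the original analysis.

```python
from itertools import product

MWFS = ["MO", "PEG", "SO", "SYN", "SSYN"]
MWF_CONC_PCT = [0.05, 0.5, 5]
SOLUTE_CONC_PPM = [0.01, 0.05, 0.1, 0.5, 1, 5]

# product() varies its last argument fastest, which matches the table's row order.
treatments = dict(enumerate(product(MWFS, MWF_CONC_PCT, SOLUTE_CONC_PPM), start=1))

print(len(treatments))                 # number of treatment combinations
for number in (5, 17, 52):             # the three combinations examined in Table 3
    print(number, treatments[number])

ideal_n = 37 * 3                       # 37 solutes x 3 replicates per combination
print(ideal_n)
```

Such a lookup makes it easy to translate a treatment number, such as those used in Table 3, back into its MWF, MWF concentration, and solute concentration.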
Table A2. Estimation details for the Expanded Nested-Solute-Concentration LFER model of Equation (5). Estimated coefficients (with standard errors in parentheses) are given that correspond to the 15 combinations of MWF/MWF concentrations of the nested model. The logarithm of solute concentration is denoted as t.
| MWF/MWF Conc. | $\beta_{0ij}$ (Intercept) | $\beta_{1ij}$ (for E) | $\beta_{2ij}$ (for S) | $\beta_{3ij}$ (for A) | $\beta_{4ij}$ (for B) | $\beta_{5ij}$ (for V) | $\beta_{6ij}$ (for t) |
|---|---|---|---|---|---|---|---|
| MO/0.05 | 1.15 (0.15) | 0.9 (0.11) | −1.59 (0.10) | −1.95 (0.08) | −1.83 (0.18) | 1.95 (0.15) | 0.04 (0.03) |
| MO/0.5 | 0.10 (0.11) | 0.04 (0.09) | −0.59 (0.07) | −1.45 (0.06) | −2.85 (0.13) | 2.67 (0.11) | −0.06 (0.02) |
| MO/5 | −0.86 (0.10) | −0.27 (0.1) | 0.04 (0.07) | −1.17 (0.05) | −1.91 (0.12) | 2.55 (0.10) | −0.25 (0.02) |
| PEG/0.05 | −0.12 (0.11) | 0.81 (0.11) | −0.96 (0.10) | −1.69 (0.07) | −2.79 (0.19) | 2.54 (0.14) | −0.09 (0.02) |
| PEG/0.5 | −0.10 (0.11) | 0.93 (0.14) | −0.96 (0.11) | −1.90 (0.09) | −2.23 (0.19) | 2.39 (0.14) | −0.13 (0.02) |
| PEG/5 | 0.15 (0.15) | 0.3 (0.14) | −0.97 (0.10) | −1.99 (0.11) | −2.88 (0.18) | 2.58 (0.16) | −0.02 (0.02) |
| SO/0.05 | 0.30 (0.13) | 0.66 (0.11) | −1.01 (0.11) | −1.63 (0.09) | −2.14 (0.21) | 2.27 (0.17) | −0.11 (0.03) |
| SO/0.5 | 0.69 (0.11) | −0.13 (0.10) | −0.02 (0.10) | −0.79 (0.07) | −1.82 (0.19) | 1.34 (0.13) | −0.19 (0.02) |
| SO/5 | 0.74 (0.12) | 0.02 (0.09) | −0.34 (0.09) | −0.76 (0.08) | −0.77 (0.18) | 0.44 (0.14) | −0.27 (0.02) |
| SYN/0.05 | 0.03 (0.12) | 1.09 (0.13) | −1.36 (0.11) | −2.11 (0.09) | −2.46 (0.20) | 2.66 (0.14) | −0.01 (0.03) |
| SYN/0.5 | 0.53 (0.11) | 0.59 (0.11) | −1.17 (0.10) | −1.88 (0.07) | −2.48 (0.20) | 2.39 (0.14) | −0.09 (0.02) |
| SYN/5 | 1.55 (0.11) | 0.30 (0.10) | −1.03 (0.09) | −2.08 (0.07) | −1.41 (0.17) | 1.06 (0.12) | −0.07 (0.02) |
| SSYN/0.05 | 0.53 (0.11) | 1.08 (0.11) | −1.78 (0.10) | −1.77 (0.07) | −2.24 (0.20) | 2.38 (0.14) | −0.09 (0.02) |
| SSYN/0.5 | 0.90 (0.12) | 0.30 (0.11) | −0.90 (0.10) | −1.88 (0.09) | −2.27 (0.20) | 1.74 (0.14) | −0.06 (0.02) |
| SSYN/5 | 1.03 (0.11) | −0.10 (0.09) | −0.44 (0.09) | −1.78 (0.08) | −1.11 (0.17) | 0.81 (0.13) | −0.10 (0.02) |
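To show how one row of Table A2 might be used for prediction, the sketch below combines the MO/0.05 coefficients with the Table 1 descriptors for toluene. It assumes, based on the table caption, that the fitted model is additive in the intercept, the five descriptors, and t; it further assumes that t is the base-10 logarithm of solute concentration in ppm, which the caption does not state. The function name is hypothetical.

```python
import math

# Estimated coefficients for MO/0.05, copied from Table A2.
MO_005 = {"intercept": 1.15, "E": 0.9, "S": -1.59, "A": -1.95,
          "B": -1.83, "V": 1.95, "t": 0.04}

# Descriptor values for toluene, copied from Table 1.
TOLUENE = {"E": 0.60, "S": 0.52, "A": 0.0, "B": 0.14, "V": 0.8573}


def predict_log_k(coef, descriptors, solute_conc_ppm):
    """Predicted log K_MCF/mix for one solute in one MWF/MWF-concentration setting."""
    t = math.log10(solute_conc_ppm)  # ASSUMPTION: base-10 log of concentration in ppm
    linear = sum(coef[name] * descriptors[name] for name in "ESABV")
    return coef["intercept"] + linear + coef["t"] * t


for conc in (0.01, 1.0):
    print(conc, round(predict_log_k(MO_005, TOLUENE, conc), 3))
```

Because the coefficient on t differs across the 15 rows, the same solute can show a different concentration dependence in each MWF/MWF-concentration setting.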

References

  1. McDougal, J.N.; Boeniger, M.F. Methods for assessing risks of dermal exposures in the workplace. Crit. Rev. Toxicol. 2002, 32, 291–327.
  2. Semple, S. Dermal Exposure to Chemicals in the Workplace: Just How Important Is Skin Absorption? Occup. Environ. Med. 2004, 61, 376–382.
  3. U.S. EPA Risk Assessment: Guidance for Superfund Volume I: Human Health Evaluation Manual (Part E, Supplemental Guidance for Dermal Risk Assessment). Available online: https://www.google.com.tw/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&ved=2ahUKEwjm5Len6uneAhUIV7wKHXKLCX0QFjAAegQICRAC&url=https%3A%2F%2Fwww.epa.gov%2Fsites%2Fproduction%2Ffiles%2F2015-09%2Fdocuments%2Fpart_e_final_revision_10-03-07.pdf&usg=AOvVaw0icHqYHCaEqHYIwkv7WVCg (accessed on 23 November 2018).
  4. Eisen, E.A.; Tolbert, P.E.; Monson, R.R.; Smith, T.J. Mortality Studies of Machining Fluid Exposure in the Automobile Industry I: A Standardized Mortality Ratio Analysis. Am. J. Ind. Med. 1992, 22, 809–824.
  5. Gordon, T. Metalworking Fluid-The Toxicity of a Complex Mixture. J. Toxicol. Environ. Heal. Part A 2004, 67, 209–219.
  6. Mehta, A.J.; Malloy, E.J.; Applebaum, K.M.; Schwartz, J.; Christiani, D.C.; Eisen, E.A. Reduced lung cancer mortality and exposure to synthetic fluids and biocide in the auto manufacturing industry. Scand. J. Work. Environ. Health 2010, 36, 499–508.
  7. Monteiro-Riviere, N.A.; Inman, A.O.; Barlow, B.M.; Baynes, R.E. Dermatotoxicity of Cutting Fluid Mixtures: In Vitro and In Vivo Studies. Cutan. Ocul. Toxicol. 2006, 25, 235–247.
  8. Feldstein, M.M.; Raigorodskii, I.M.; Iordanskii, A.L.; Hadgraft, J. Modeling of percutaneous drug transport in vitro using skin-imitating Carbosil membrane. J. Control. Release 1998, 52, 25–40.
  9. Moeckly, D.M.; Matheson, L.E. The development of a predictive method for the estimation of flux through polydimethylsiloxane membranes: I. Identification of critical variables for a series of substituted benzenes. Int. J. Pharm. 1991, 77, 151–162.
  10. Matheson, L.E.; Vayumhasuwan, P.; Moeckly, D.M. The development of a predictive method for the estimation of flux through polydimethylsiloxane membranes. II. Derivation of a diffusion parameter and its application to multisubstituted benzenes. Int. J. Pharm. 1991, 77, 163–168.
  11. Baynes, R.E.; Xia, X.-R.; Barlow, B.M.; Riviere, J.E. Partitioning Behavior of Aromatic Components in Jet Fuel into Diverse Membrane-coated Fibers. J. Toxicol. Environ. Heal. Part A 2007, 70, 1879–1887.
  12. Baynes, R.E.; Xia, X.R.; Imran, M.; Riviere, J.E. Quantification of Chemical Mixture Interactions Modulating Dermal Absorption Using a Multiple Membrane Fiber Array. Chem. Res. Toxicol. 2008, 21, 591–599.
  13. Xia, X.-R.; Baynes, R.E.; Monteiro-Riviere, N.A.; Riviere, J.E. An experimentally based approach for predicting skin permeability of chemicals and drugs using a membrane-coated fiber array. Toxicol. Appl. Pharmacol. 2007, 221, 320–328.
  14. Xia, X.; Baynes, R.E.; Monteiro-Riviere, N.A.; Leidy, R.B.; Shea, D.; Riviere, J.E. A Novel in-Vitro Technique for Studying Percutaneous Permeation with a Membrane-Coated Fiber and Gas Chromatography/Mass Spectrometry: Part I. Performances of the Technique and Determination of the Permeation Rates and Partition Coefficients of Chemical M. Pharm. Res. 2003, 20, 275–282.
  15. Potts, R.O.; Guy, R.H. Predicting Skin Permeability. Pharm. Res. 1992, 9, 663–669.
  16. Abraham, M.H. Scales of solute hydrogen-bonding: Their construction and application to physicochemical and biochemical processes. Chem. Soc. Rev. 1993, 22, 73–83.
  17. Vijay, V.; White, E.M.; Kaminski, M.D.; Riviere, J.E.; Baynes, R.E. Dermal Permeation of Biocides and Aromatic Chemicals in Three Generic Formulations of Metalworking Fluids. J. Toxicol. Environ. Heal. Part A 2009, 72, 832–841.
  18. Xu, G.; Hughes-Oliver, J.M.; Brooks, J.D.; Yeatts, J.L.; Baynes, R.E. Selection of appropriate training and validation set chemicals for modelling dermal permeability by U-optimal design. SAR QSAR Environ. Res. 2013, 24, 135–156.
  19. Xu, G.; Hughes-Oliver, J.M.; Brooks, J.D.; Baynes, R.E. Predicting skin permeability from complex chemical mixtures: Incorporation of an expanded QSAR model. SAR QSAR Environ. Res. 2013, 24, 711–731.
  20. Abraham, M.H.; Martins, F. Human Skin Permeation and Partition: General Linear Free-Energy Relationship Analyses. J. Pharm. Sci. 2004, 93, 1508–1523.
  21. Advanced Chemistry Development Inc. ACD/ADME BOXES, Version 4.95. Available online: www.acdlabs.com (accessed on 23 November 2018).
Sample Availability: Samples of the compounds are not available from the authors.
Figure 1. Boxplots of $\log K_{MCF/mix}$ across different solute concentrations. Thick horizontal lines are the medians, the means are shown as +, boxes contain the middle half of the data, and dotted lines extend to the minimum and maximum.
Figure 2. Estimated $\beta_1$ coefficients (circles) for molecular descriptor E from fitting separate LFER models across the 90 treatment combinations. Ninety-five percent (95%) confidence intervals are also shown, as vertical lines with two bars at the ends.
Figure 3. Boxplots of $\log K_{MCF/mix}$ across different solute concentrations in each of the 15 combinations of MWF and MWF concentration. Within each of the 15 panels, boxplots are shown for solute concentrations of 0.01, 0.05, 0.1, 0.5, 1, and 5 ppm. The 15 combinations (MWF/MWF concentration), from left to right, are: MO/0.05, MO/0.5, MO/5, PEG/0.05, PEG/0.5, PEG/5, SO/0.05, SO/0.5, SO/5, SYN/0.05, SYN/0.5, SYN/5, SSYN/0.05, SSYN/0.5, and SSYN/5.
Figure 4. Observed versus predicted $\log K_{MCF/mix}$ for (a) the LFER model of Equation (1) and (b) the Expanded Nested-Solute-Concentration LFER model of Equation (5). Tightness around the line is indicative of a more predictive model.
Figure 5. Estimated partial slopes (circles) corresponding to (a) E, (b) S, (c) A, (d) B, (e) V, and (f) log solute concentration from fitting the Expanded Nested-Solute-Concentration LFER model of Equation (5), for all 15 combinations of MWF/MWF concentration. Ninety-five percent (95%) confidence intervals are also shown, as vertical lines with two bars at the ends.
Figure 6. Membrane-coated fiber (MCF) and experimental setup. MWF = metalworking fluid; the polymer coating, which is part of the MCF, is 100 µm thick polydimethylsiloxane (PDMS).
Table 1. Set of 37 solutes, and their descriptor values, used in this study.
| Solute | Solute Name | E | S | A | B | V |
|---|---|---|---|---|---|---|
| 1 | Toluene | 0.60 | 0.52 | 0 | 0.14 | 0.8573 |
| 2 | Chloro-benzene | 0.72 | 0.65 | 0 | 0.07 | 0.8388 |
| 3 | Ethylbenzene | 0.61 | 0.51 | 0 | 0.15 | 0.9982 |
| 4 | p-Xylene | 0.61 | 0.52 | 0 | 0.16 | 0.9982 |
| 5 | Bromo-benzene | 0.88 | 0.73 | 0 | 0.09 | 0.8914 |
| 6 | Propyl-benzene | 0.60 | 0.50 | 0 | 0.15 | 1.1391 |
| 7 | 1-Chloro-4-methyl-benzene | 0.71 | 0.74 | 0 | 0.05 | 0.9797 |
| 8 | Phenol | 0.81 | 0.89 | 0.60 | 0.30 | 0.7751 |
| 9 | Benzonitrile | 0.74 | 1.11 | 0 | 0.33 | 0.8711 |
| 10 | 4-Fluoro-phenol | 0.67 | 0.97 | 0.63 | 0.23 | 0.7927 |
| 11 | Benzyl alcohol | 0.80 | 0.87 | 0.39 | 0.56 | 0.9160 |
| 12 | Iodo-benzene | 1.19 | 0.82 | 0 | 0.12 | 0.9746 |
| 13 | Phenyl ester acetic acid | 0.66 | 1.13 | 0 | 0.54 | 1.0726 |
| 14 | 2-Chloro-acetophenone | 1.02 | 1.59 | 0 | 0.41 | 1.1363 |
| 15 | Phenol, 4-methyl- | 0.82 | 0.87 | 0.57 | 0.31 | 0.9160 |
| 16 | Nitro-Benzene | 0.87 | 1.11 | 0 | 0.28 | 0.8906 |
| 17 | Methyl ester benzoic acid | 0.73 | 0.85 | 0 | 0.46 | 1.0726 |
| 18 | 1-chloro-4-methoxy-benzene | 0.84 | 0.86 | 0 | 0.24 | 1.0384 |
| 19 | Phenylethyl alcohol | 0.81 | 0.86 | 0.31 | 0.65 | 1.0569 |
| 20 | 3-Methylbenzyl alcohol | 0.82 | 0.90 | 0.39 | 0.59 | 1.0569 |
| 21 | 4-Ethyl-phenol | 0.80 | 0.90 | 0.55 | 0.36 | 1.0569 |
| 22 | 3,5-Dimethyl-phenol | 0.82 | 0.84 | 0.57 | 0.36 | 1.0569 |
| 23 | Ethyl ester benzoic acid | 0.69 | 0.85 | 0 | 0.46 | 1.2135 |
| 24 | 2-Methyl-methyl ester benzoic acid | 0.77 | 0.87 | 0 | 0.43 | 1.2135 |
| 25 | Naphthalene | 1.34 | 0.92 | 0 | 0.20 | 1.0854 |
| 26 | 3-Chloro-phenol | 0.91 | 1.06 | 0.69 | 0.15 | 0.8975 |
| 27 | p-Chloroaniline | 1.06 | 1.13 | 0.30 | 0.31 | 0.9386 |
| 28 | 1-methyl-4-nitro-benzene | 0.87 | 1.11 | 0 | 0.28 | 1.0315 |
| 29 | 1-(4-Chlorophenyl)-ethanone | 0.96 | 1.09 | 0 | 0.44 | 1.1363 |
| 30 | 3-Bromo-phenol | 1.06 | 1.13 | 0.70 | 0.16 | 0.9501 |
| 31 | 4-Chloro-3-methyl-phenol | 0.92 | 1.02 | 0.67 | 0.22 | 1.0384 |
| 32 | 1-Methyl-naphthalene | 1.34 | 0.92 | 0 | 0.20 | 1.2263 |
| 33 | Biphenyl | 1.36 | 0.99 | 0 | 0.26 | 1.3242 |
| 34 | Chloroxylenol | 0.93 | 0.96 | 0.64 | 0.21 | 1.1793 |
| 35 | 4-(1,1-Dimethylpropyl)-phenol | 0.79 | 0.80 | 0.50 | 0.44 | 1.4796 |
| 36 | o-Hydroxybiphenyl | 1.55 | 1.40 | 0.56 | 0.49 | 1.3829 |
| 37 | Clorophene | 1.53 | 1.42 | 0.67 | 0.47 | 1.6462 |
Table 2. Summary statistics for all variables, based on the complete dataset of 4646 observations.
| Variable | Minimum | Lower Quartile | Mean | Median | Upper Quartile | Maximum | Std Dev |
|---|---|---|---|---|---|---|---|
| $\log K_{MCF/mix}$ | −1.820 | 0.841 | 1.329 | 1.380 | 1.879 | 3.107 | 0.719 |
| E | 0.600 | 0.710 | 0.862 | 0.800 | 0.960 | 1.550 | 0.225 |
| S | 0.500 | 0.800 | 0.928 | 0.900 | 1.110 | 1.590 | 0.266 |
| A | 0.000 | 0.000 | 0.120 | 0.000 | 0.000 | 0.700 | 0.232 |
| B | 0.050 | 0.150 | 0.293 | 0.280 | 0.440 | 0.650 | 0.146 |
| V | 0.775 | 0.939 | 1.058 | 1.038 | 1.136 | 1.646 | 0.170 |
Table 3. Results from fitting separate LFER models (Equation (1)) for each of three treatment combinations (T).
| T | | $\beta_0$ (Intercept) | $\beta_1$ (for E) | $\beta_2$ (for S) | $\beta_3$ (for A) | $\beta_4$ (for B) | $\beta_5$ (for V) |
|---|---|---|---|---|---|---|---|
| 5 | est (se) | 0.21 (0.45) | 1.77 (0.43) | −1.59 (0.27) | −1.87 (0.18) | −0.50 (0.42) | 1.61 (0.32) |
| 5 | ci | (−0.71, 1.12) | (0.89, 2.65) | (−2.14, −1.03) | (−2.23, −1.51) | (−1.36, 0.36) | (0.97, 2.25) |
| 17 | est (se) | −0.61 (0.27) | −0.91 (0.22) | 0.42 (0.18) | −1.14 (0.12) | −2.00 (0.24) | 2.47 (0.25) |
| 17 | ci | (−1.15, −0.07) | (−1.35, −0.48) | (0.07, 0.77) | (−1.37, −0.91) | (−2.48, −1.53) | (1.97, 2.97) |
| 52 | est (se) | 1.50 (0.31) | −0.03 (0.23) | −0.36 (0.25) | −0.42 (0.20) | −0.98 (0.48) | −0.14 (0.34) |
| 52 | ci | (0.89, 2.11) | (−0.50, 0.44) | (−0.87, 0.15) | (−0.83, −0.02) | (−1.94, −0.02) | (−0.81, 0.53) |
For each of the intercept, E, S, A, B, and V, the table provides: the estimated coefficient (est), the associated standard error (se), and a 95 percent confidence interval (ci) for the coefficient. With large differences in estimated coefficients for different treatment combinations, these estimated models indicate a clear dependency on treatment combinations.
Table 4. Fit statistics of a single LFER model (1), the Expanded Crossed-Factors LFER model (2) and the Expanded Nested-Solute-Concentration LFER model (5).
| Regression Statistics | LFER Model (1) | Expanded Crossed-Factors LFER Model (2) | Expanded Nested-Solute-Concentration LFER Model (5) |
|---|---|---|---|
| $r^2$ | 0.60 | 0.90 | 0.88 |
| Adj-$r^2$ | 0.60 | 0.89 | 0.87 |
| $Q^2_{LOO}$ | 0.60 | 0.87 | 0.87 |
| $Q^2_{LOSO}$ | 0.57 | 0.68 | 0.80 |
Table 5. Testing the null hypothesis that the partition theory holds for a number of different subsets of solute concentrations.
| Solute Concentrations | p-Value for H0 | Insignificant Conditions | Significant Conditions |
|---|---|---|---|
| All | < 0.0001 | MO/0.05(265), PEG/5(246), SYN/0.05(256), SSYN/0.5(297) | MO/0.5(379), MO/5(397), PEG/0.05(313), PEG/0.5(261), SO/0.05(247), SO/0.5(341), SO/5(368), SYN/0.5(300), SYN/5(321), SSYN/0.05(298), SSYN/5(357) |
| 0.01, 0.05, 0.1, 0.5, 1 | < 0.0001 | MO/0.05(265), MO/0.5(345), PEG/0.05(279), PEG/5(218), SYN/0.05(240), SYN/5(287), SSYN/0.5(278) | MO/5(347), PEG/0.5(239), SO/0.05(242), SO/0.5(327), SO/5(292), SYN/0.5(283), SSYN/0.05(278), SSYN/5(303) |
| 0.01, 0.05, 0.1, 0.5 | < 0.0001 | MO/0.05(226), MO/0.5(285), PEG/0.05(236), PEG/0.5(205), PEG/5(182), SO/0.05(212), SYN/0.05(199), SYN/0.5(243), SYN/5(234), SSYN/0.05(238), SSYN/0.5(246) | MO/5(261), SO/0.5(271), SO/5(219), SSYN/5(229) |
| 0.01, 0.05, 0.1 | < 0.0001 | MO/0.05(183), PEG/0.05(185), PEG/0.5(165), PEG/5(135), SO/0.05(174), SYN/0.05(157), SYN/0.5(197), SYN/5(172), SSYN/0.05(191), SSYN/0.5(183) | MO/0.5(205), MO/5(176), SO/0.5(192), SO/5(145), SSYN/5(159) |
| 0.05, 0.1, 0.5 | < 0.0001 | MO/0.05(173), MO/0.5(223), MO/5(208), PEG/0.05(184), PEG/0.5(155), PEG/5(147), SO/0.05(153), SYN/0.05(156), SYN/0.5(186), SYN/5(187), SSYN/0.05(180), SSYN/0.5(196), SSYN/5(187) | SO/0.5(141), SO/5(102) |
| 0.01, 0.05 | < 0.0001 | MO/0.05(118), PEG/0.05(117), PEG/5(85), SO/0.05(127), SO/0.5(122), SYN/0.05(98), SYN/0.5(124), SYN/5(110), SSYN/0.05(123), SSYN/0.5(115), SSYN/5(99) | MO/0.5(129), MO/5(110), PEG/0.5(106), SO/5(91) |
| 0.05, 0.1 | < 0.0001 | MO/0.05(130), MO/0.5(143), MO/5(123), PEG/0.5(115), PEG/5(100), SO/0.05(115), SO/5(102), SYN/0.05(114), SYN/0.5(140), SYN/5(125), SSYN/0.05(133), SSYN/0.5(133), SSYN/5(117) | PEG/0.05(133), SO/0.5(141) |
The subset of solute concentrations is shown in the first column, with p-value given in the second column. MWF/MWF concentrations that support the partition theory (meaning their individual p-values are larger than 0.05/15, where division by 15 is to adjust for multiple testing) are shown in the third column (with sample sizes in parentheses). MWF/MWF concentrations that violate the partition theory are shown in the last column (with sample sizes in parentheses). The partition theory is violated in every subset, with the greatest support for the partition theory being achieved when limiting solute concentration to 0.05 or 0.1 or 0.5 ppm as the largest subset.
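The multiple-testing adjustment described in the note to Table 5 amounts to comparing each individual p-value with 0.05/15. A minimal Python sketch is shown below; the p-values are hypothetical placeholders for illustration only and are not results from this study.

```python
ALPHA = 0.05
N_CONDITIONS = 15                   # MWF/MWF-concentration combinations tested
threshold = ALPHA / N_CONDITIONS    # Bonferroni-adjusted cutoff, about 0.0033

# Hypothetical p-values, for illustration only; not taken from the study.
p_values = {"MO/0.05": 0.21, "MO/0.5": 0.0004, "SO/5": 1e-8, "PEG/5": 0.004}

# Conditions whose p-value exceeds the cutoff are treated as supporting the theory.
supports = sorted(c for c, p in p_values.items() if p > threshold)
violates = sorted(c for c, p in p_values.items() if p <= threshold)

print(f"threshold = {threshold:.4f}")
print("insignificant (theory supported):", supports)
print("significant (theory violated):  ", violates)
```

Note that a p-value such as 0.004 would be significant at the unadjusted 0.05 level but not at the adjusted cutoff, which is why the adjustment matters when 15 conditions are tested together.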