The Internal Relation between Quantum Chemical Descriptors and Empirical Constants of Polychlorinated Compounds

Quantum chemical descriptors and empirical parameters are two different types of chemical parameters that play fundamental roles in chemical reactivity and model development. However, previous studies have lacked detail regarding the relationship between quantum chemical descriptors and empirical constants. We selected polychlorinated biphenyls (PCBs) as model compounds to investigate the intrinsic correlation between 16 quantum chemical descriptors and Hammett constants. The results exhibited extremely high linearity for ∑σ⁺o,m,p with Qxx/yy/zz, α, and EHOMO based on meta-position grouping. Polychlorinated dibenzodioxin (PCDD) and polychlorinated naphthalene (PCN) congeners, as two independent classes of compounds, validated the reliability of the relationship. The meta-substituent grouping method between ∑σ⁺o,m,p and α was successfully used to predict the rate constant (k) for •OH oxidation of PCBs, as well as the octanol/water partition coefficient (logKOW) and aqueous solubility (−logSW) of PCDDs, and the predictions exhibited excellent agreement with experimental measurements. Revealing the intrinsic correlation between empirical constants and quantum chemical descriptors can lead to simpler and more efficient models for predicting the environmental behavior and chemical properties of compounds.


Introduction
Computational chemistry is a mathematical description of chemistry and an effective tool for investigating the kinetics and rate constants of chemical reactions, developing predictive models, and calculating molecular properties to obtain quantum chemical descriptors [1]. Quantum chemical descriptors play a fundamental role in chemistry, environmental protection, pharmaceutical science, and health research [2], as they identify the correlations between chemical structures and properties (i.e., quantitative structure−activity relationship, QSAR) [3][4][5][6]. Computational chemistry software can provide a large amount of geometrical, electrostatic, and quantum information about molecules. Thus, many descriptors reflect the properties of molecules and can provide insight into the chemical nature of compounds under given reaction conditions [7][8][9]. For example, the descriptors ELUMO (energy of the lowest unoccupied molecular orbital) and EHOMO (energy of the highest occupied molecular orbital) reflect the molecular orbital energies [1,10,11], which play an important role in governing many chemical reactions and determining molecular electronic transitions [1,12]. EHOMO and ELUMO are directly related to the ionization potential and electron affinity, and characterize the susceptibility of molecules toward attack by electrophiles and nucleophiles, respectively [1]. Another descriptor, polarizability, depends on the electron distribution of the entire molecule, determines the dynamical response to external fields, and provides insight into the internal structure of a molecule [13,14].
Molecular descriptors derived from quantum chemical calculations have been widely used for the prediction and interpretation of quantitative aspects of organic reactions [1,15-18]. For example, Luo et al. [19] investigated the UV direct photolysis of ibuprofen and sulfamethoxazole based on experimental measurements, and further explained the mechanism using the ELUMO−EHOMO descriptor. A small ELUMO−EHOMO gap corresponded to a lower excitation energy and a higher quantum yield, which accounted for the high photolysis rate [19]. Xiao et al. developed a model that provided a robust predictive tool for estimating the removal of emerging micropollutants by SO4•−-mediated processes [20]. More importantly, a QSAR model combining quantum chemical descriptors and experimental data can, in some cases, predict unobserved chemical phenomena. Although quantum chemical descriptors can provide a more accurate and detailed description of electronic effects than empirical methods, descriptors calculated at a higher level of theory are still difficult to obtain [21,22]. Descriptor calculation is therefore an expensive and difficult process, which limits high-efficiency prediction at the screening level [21][22][23]. The other type of parameter, the empirical parameter, is determined by experiments carried out under the same experimental constraints and controls and with a common understanding of measurement. An empirical parameter captures a similar effect on the properties or reactivity of each compound in a series of structurally related compounds [24]; examples include the acid dissociation constant (pKa), the octanol/water partition coefficient (logKOW), and substituent constants. As an important empirical constant, the Hammett substituent constant (σ) has provided insight into the relationship between reactivity and chemical structure for compounds containing aromatic rings [25].
Although this type of constant is considered accurate, simple, and computationally cheap, it still neglects isomers and the steric effects that exert a great influence on chemical activity [23,26]. In contrast, the QSAR model reflects the structure and chemical reactivity of the molecule and has advantages over an empirical constant model [14,20,27]. For instance, Russell et al. [28] revealed that Henry's law constant could be approximated as a linear function of factors related to bulk, lipophilicity, and polarity based on 63 molecular structures. Overall, both the QSAR model and empirical constants reflect the structure-activity relationship of compounds [29], so there may be a connection between quantum chemical descriptors and empirical constants. The intrinsic relationship between them still needs to be revealed, and combining the advantages of both to develop an efficient, accurate, and simple model is a meaningful goal. However, reports on the relationship between quantum chemical descriptors and empirical constants are rare. Santiago et al. [25] developed a mathematical modeling approach to incorporate steric effects in Hammett-type correlations. They found a strong correlation between the Hammett values of para-substituents and natural bond orbital (NBO) charges (R² = 0.96), so the Hammett values can be used as an alternative to NBO charges [25]. Our previous study attempted to capture the relationship between polarizability (α, a quantum chemical descriptor) and the Hammett constant (σ, an empirical constant) for polychlorinated compounds [21]. It was based on two good models for predicting the kinetics of gas-phase •OH oxidation of PCBs (k values), one using ∑σ⁺o,m,p (logk = −11.6 − 1.39 × ∑σ⁺o,m,p [30]) and one using α (lnk = −0.054 × α − 19.49 [14]). However, the findings were haphazard and limited.
Revealing the relationships hidden between quantum chemical descriptors and empirical constants will greatly improve the efficiency and accuracy of prediction models. More importantly, the revealed relationship will disclose the intrinsic correlation between structure and experimental observations. In this study, we selected a class of polychlorinated compounds, polychlorinated biphenyls (PCBs), as model compounds because they comprise 210 compounds with similar structures and multiple substitution positions, in order to investigate the relationships between 16 quantum chemical descriptors and Hammett constants. Two other classes of polychlorinated compounds, polychlorinated dibenzodioxins (PCDDs) and polychlorinated naphthalenes (PCNs), were selected to validate the obtained relationship. Revealing the intrinsic correlation between empirical constants and quantum chemical descriptors can provide a simpler and more efficient method with great application potential for model development. The results will help develop fast and tractable models for predicting the environmental behavior and chemical properties of polychlorinated compounds.

Reveal Relationships between Quantum Descriptors and Hammett Constants
The Hammett constant (including σ, σ⁺, and σ⁻) reflects the electronic nature and position of the substituent [31]. In PCBs, Cl atoms can be substituted at multiple ortho-, meta-, and para-positions ( Figure Table S1) [6,31]. Sixteen quantum chemical descriptors (Table S2) were obtained from the optimized results.
For the 210 PCB congeners, the relationships of the 16 quantum chemical descriptors with the ∑σo,m,p, ∑σ⁺o,m,p, and ∑σ⁻o,m,p values were listed in Figure S2, Figure 1 and Figure S3 which caused the points of ∑σ⁻o,m,p and the descriptors to concentrate into 15 approximations and hid more discrepant information. Thus, Qxx/yy/zz, α, and EHOMO, which showed high correlations with ∑σ⁺o,m,p, were selected for further analysis in this study.

Figure 1. Relationships of 16 quantum chemical descriptors (a-r) and ∑σ⁺o,m,p for polychlorinated biphenyls (PCBs) congeners. Panels (a) and (b) refer to our previous study [21].

Mechanistic Interpretation of Internal Relationship
Further investigation showed that Qxx/yy/zz, α, and EHOMO all displayed highly linear correlations with ∑σ⁺o,m,p (Figure 2 and Figure S4). All of the PCB congeners are classified into five clusters according to the number of Cl atoms substituted at the meta-position (Nm-Cl) on the rings. As shown in Figure 2, there are 21, 48, 72, 48, and 21 congeners in the groups with Nm-Cl ranging from 0 to 4, respectively. In each meta-position cluster, Qxx/yy/zz and EHOMO values decrease with increasing ∑σ⁺o,m,p, while α values increase with increasing ∑σ⁺o,m,p. The R² values for Qxx/yy/zz and α with ∑σ⁺o,m,p were 0.836~0.935 and 0.987~0.994, respectively, which were higher than the R² for EHOMO with ∑σ⁺o,m,p (0.759~0.824). The high R² values (>0.7) indicated that meta-position chlorination of PCB congeners plays a crucial role in the relationship between ∑σ⁺o,m,p and Qxx/yy/zz, α, and EHOMO. Furthermore, the trends in Qxx/yy/zz, α, and EHOMO values also supported that the meta-position determines the reactivity of PCBs. The highly linear correlation illustrated that the simple ∑σ⁺o,m,p can be used to explain or substitute for complex quantum chemical descriptors, such as Qxx/yy/zz, α, and EHOMO, based on meta-substituent grouping. Their corresponding fitted linear equations are listed in Table 1.
To gain insight into these connections, the intrinsic characters of Qxx/yy/zz, α, and EHOMO need to be further examined. ∑σ⁺o,m,p is an empirical value reflecting the electronic nature and position of the substituent [31]. The quadrupole moment (Qxx, Qyy, Qzz) reflects the distribution of the molecular charge along the x-, y-, and z-coordinates, or the degree of departure from spherical symmetry [32]. The polarizability (α) is defined as the ratio of the induced dipole moment of a molecule to the electric field that produces it [33]; it is an important electronic descriptor that reflects the electron distribution in the molecule [34] and is well correlated with the overall reactivity of the molecule [13,35,36]. EHOMO characterizes the susceptibility of a molecule toward attack by electrophiles. A molecule with a higher EHOMO is more reactive toward attack by strong electrophiles [14,21]. Some investigations have shown that Qxx/yy/zz, α, and EHOMO are associated with many chemical activities and properties. For instance, Kim and Mhin et al. suggested that the change in the polarity of the quadrupole moment (Qxx/yy/zz) was related to the reduction of the repulsive interaction, which played a vital role in governing the geometry of aromatics [37,38]. Zeng et al. reported that the quadrupole moments (Qyy and Qzz) were successfully used to develop a model for predicting the n-octanol/water partition coefficients (logKOW) and aqueous solubility (−logSW) of PCDDs [39,40]. In addition, our previous study developed a QSAR model to predict the •OH degradation of PCBs based on the single descriptor α, which played an important role in determining the reaction rate (k) [14]. Luo et al. suggested that the more polarizable (α) the molecule, the more easily an approaching electrophile (or nucleophile) can distort the electron density of the aromatic molecule, increasing the rate of reaction [21]. For the EHOMO descriptor, Yan et al. developed a QSAR model for the gas-phase •OH oxidation of multiring hydrocarbons based on partial least squares regression [41]. They reported that EHOMO was the most suitable descriptor for model development and that a higher EHOMO value corresponds to higher reactivity. Thus, ∑σ⁺o,m,p can be considered the intuitive experimental manifestation of structural descriptors such as Qxx/yy/zz, α, and EHOMO.
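The five meta-position clusters reported above (21, 48, 72, 48, and 21 congeners for Nm-Cl = 0 to 4) can be checked by direct enumeration. The following is a minimal sketch, assuming each ring carries Cl flags at positions 2 to 6, that flipping a ring (2↔6, 3↔5) or swapping the two rings gives the same compound, and that unsubstituted biphenyl is counted together with the 209 chlorinated congeners. The function names are illustrative and not taken from the study.

```python
from collections import Counter
from itertools import product

# A ring pattern is a tuple of 0/1 Cl flags for positions (2, 3, 4, 5, 6);
# position 1 is the biphenyl junction and is never chlorinated.

def flip(ring):
    """Mirror one ring about its 1-4 axis: 2<->6 and 3<->5."""
    return ring[::-1]

def canonical(ring_a, ring_b):
    """Smallest representative among the 8 equivalent labelings."""
    variants = []
    for a in (ring_a, flip(ring_a)):
        for b in (ring_b, flip(ring_b)):
            variants.append((a, b))
            variants.append((b, a))  # the two rings are interchangeable
    return min(variants)

def meta_count(ring_a, ring_b):
    """Number of Cl atoms at the meta positions (3, 5, 3', 5')."""
    return ring_a[1] + ring_a[3] + ring_b[1] + ring_b[3]

def congeners_by_meta_count():
    """Count symmetry-unique substitution patterns per meta-Cl count."""
    rings = list(product((0, 1), repeat=5))
    unique = {canonical(a, b) for a in rings for b in rings}
    return Counter(meta_count(a, b) for a, b in unique)

counts = congeners_by_meta_count()
print(sorted(counts.items()))
```

The same grouping key (the meta-Cl count) is what separates the parallel regression lines discussed in this section.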
Because Qxx/yy/zz, α, and EHOMO are all highly correlated with ∑σ⁺o,m,p, there may be collinearity among Qxx/yy/zz, α, and EHOMO. Figure 3 shows that α was highly correlated with Qxx and EHOMO, with corresponding R² values of 0.94 and 0.81, respectively. Meanwhile, α and Qyy/Qzz were similarly highly correlated (in Figure S5, the R² values were 0.92 and 0.95, respectively). The quadrupole moment (Qxx, Qyy, Qzz) and α reflect the electron behavior and the homogeneity in the electronic properties of the molecule. The α, as one of the molecular electrostatic descriptors [14,20], is the principal factor determining the structure-activity relationship, even though EHOMO represents the electron-donating power of the molecule [42]. Yang et al. suggested that EHOMO reflected only a single aspect of the molecule, while α incorporated a number of molecular features [14].
The reasons for the good performance of aromatic meta-substituent grouping in relating quantum chemical descriptors to ∑σ⁺o,m,p need to be further discussed. First, for Cl substituents the σ⁺m value (0.4) is larger than σ⁺p (0.11) and σ⁺o (0.073), which probably gives the meta-position its dominant role and produces the high correlation [21]. Another important reason is that Cl atoms substituted on aromatic rings withdraw electrons through the σ-bond, which decreases the ring electron density at the Cl-substituted site [9,14]. Cl atoms substituted at the meta-position can pull electrons from the aromatic ring, resulting in decreased electron density and suppressed HOMO distribution [14,21]. The HOMO distribution is the most direct reflection of the changes in electron distribution. Figure S6 shows the HOMO distribution for different numbers of Cl atoms at the meta-position (Nm-Cl) and at the ortho-position (No-Cl). Luo et al. investigated the change in HOMO distribution with Cl atoms at the meta-position [21]. The HOMO distribution on the Cl atoms at the meta-position increased independently of the increasing number of Cl atoms at the meta-position. However, with the increasing number of Cl atoms at the meta-position, the HOMO distribution at the 1-, 2-, 6-, 1′-, 2′-, and 6′-positions of the biphenyl rings (PCB15, PCB28, PCB100, and PCB155) was easily distorted and varied greatly. Regarding Cl atoms increasing at the meta- and ortho-positions, Cl atoms at the meta-position (PCB15, PCB37, PCB81, PCB126, and PCB169) do not change the HOMO distribution at the biphenyl junction. However, once Cl atoms are added at the ortho-position (PCB66, PCB123, PCB167, and PCB189), the HOMO distributions at the biphenyl junction and the meta-position are greatly influenced and obviously changed. These observations help explain why the meta-position plays an important role in the highly linear correlation of α with ∑σ⁺o,m,p. However, the HOMO distribution is easily distorted and varies greatly when Cl atoms are substituted at other positions [21].
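As a worked illustration of the quantities discussed above, the sketch below computes ∑σ⁺o,m,p and Nm-Cl for a PCB from its chlorinated ring positions. It assumes that ∑σ⁺o,m,p is the simple additive sum of the Cl constants quoted in the text (σ⁺o = 0.073, σ⁺m = 0.4, σ⁺p = 0.11) over all substituted positions; the helper names are hypothetical.

```python
# Cl substituent constants (sigma+) quoted in the text for each position class.
SIGMA_PLUS_CL = {"ortho": 0.073, "meta": 0.4, "para": 0.11}

# Position class of ring positions 2-6 (the same mapping holds for 2'-6').
POSITION_CLASS = {2: "ortho", 3: "meta", 4: "para", 5: "meta", 6: "ortho"}

def sum_sigma_plus(ring_a, ring_b):
    """Additive sum of sigma+ over all Cl positions on both rings (assumed form)."""
    positions = list(ring_a) + list(ring_b)
    return sum(SIGMA_PLUS_CL[POSITION_CLASS[p]] for p in positions)

def meta_cl_count(ring_a, ring_b):
    """Number of Cl atoms at meta positions, i.e. the grouping variable."""
    positions = list(ring_a) + list(ring_b)
    return sum(1 for p in positions if POSITION_CLASS[p] == "meta")

# PCB126 is 3,3',4,4',5-pentachlorobiphenyl: Cl at 3,4,5 on one ring, 3',4' on the other.
pcb126 = ((3, 4, 5), (3, 4))
print(meta_cl_count(*pcb126), round(sum_sigma_plus(*pcb126), 3))
```

Because σ⁺m is several times larger than σ⁺o and σ⁺p, each additional meta Cl shifts ∑σ⁺o,m,p by a comparatively large step, which is consistent with the clusters being indexed by Nm-Cl.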

Application in Similar Compounds
The aromatic meta-substituent grouping method was suitable for PCB congeners; however, a good method should also be applicable to other similar compounds. Thus, to confirm the method, we examined the relationships between ∑σ⁺o,m,p and the quantum chemical descriptors (Qxx/yy/zz, α, and EHOMO) with Cl atoms substituted at the meta-position for PCDD and PCN congeners (Figures S7 and S8). For PCDDs and PCNs, which have only 25 different ∑σ⁺o,m,p values, the trends in the relationships of ∑σ⁺o,m,p with Qxx/yy/zz, α, and EHOMO were similar to those of the PCBs based on aromatic meta-substituent grouping. For PCDDs, the parallel lines exhibited extremely high linearity for ∑σ⁺o,m,p with α (R² = 0.994~0.999) and EHOMO (R² = 0.982~0.999), while the linearity for ∑σ⁺o,m,p with Qxx/yy/zz was acceptable (R² = 0.712~0.999) (Table S8). For PCNs, the parallel lines for Qxx/yy/zz and α also exhibited extremely high linearity (R² = 0.883~0.999), whereas the linearity for EHOMO was weak (R² = 0.438~0.677) (Table S9). The reason may be that the HOMO distribution is more severely disturbed by the naphthalene ring structure than by the biphenyl structure of PCBs or the dibenzodioxin structure of PCDDs, especially at the alpha positions of the naphthalene ring. The overall trends of Qxx/yy/zz, α, and EHOMO with ∑σ⁺o,m,p are likewise correlated with the number of Cl atoms substituted at the meta-position. The validation results for PCNs and PCDDs support extending the application domain of the meta-substituent grouping method to these aromatic compounds.
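The grouped fits described above (one line per meta-Cl count, each with its own R²) can be sketched as follows. This is a minimal illustration of ordinary least squares applied within each group, not the authors' SPSS workflow; the record layout `(n_meta_cl, sum_sigma_plus, descriptor_value)` and the function names are assumptions, and real inputs would come from the supplementary tables.

```python
from collections import defaultdict

def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x; returns (slope, intercept, r2).

    Needs at least two distinct x values and non-constant y values.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1.0 - ss_res / ss_tot

def fit_by_meta_group(records):
    """Fit one line per meta-Cl count.

    records: iterable of (n_meta_cl, sum_sigma_plus, descriptor_value).
    Returns {n_meta_cl: (slope, intercept, r2)}.
    """
    groups = defaultdict(list)
    for n_meta, sigma, value in records:
        groups[n_meta].append((sigma, value))
    fits = {}
    for n_meta, points in sorted(groups.items()):
        xs, ys = zip(*points)
        fits[n_meta] = fit_line(xs, ys)
    return fits
```

Fitting each group separately is what allows the lines to share a similar slope while differing in intercept, which is the "parallel lines" pattern reported for PCBs, PCDDs, and PCNs.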

Application in Prediction Model
Our previous study was the first to predict the k values of gas-phase •OH degradation of PCBs based on the QSAR model and α [21]. The observed lnk values for •OH oxidation of PCB congeners (used as validation data) are listed in Table S6. The predicted k values were in excellent agreement with experimental measurements (validation coefficient Q² = 0.825, standard deviation ∆lnk = −0.430~0.626, and average standard deviation ∆lnk = −0.03), and the method exhibited greater predictive power and convenience than the QSAR model based on the single descriptor α (Table 2). We also developed meta-substituent grouping models to predict the logKOW and −logSW of PCDDs based on the existing quantum chemical descriptor models (logKOW = 0.03345 × α + 0.39092 and −logSW = 0.0693 × α − 3.6425) (Table 2) and observed values (Table S7) [43][44][45]. The standard deviations ∆logKOW and ∆(−logSW) ranged from −0.15 to 0.92 and from −0.25 to −1.45, and their averages were 0.45 and −0.92, respectively. The Q² values between the predicted and observed values were 0.954 and 0.981, respectively. All of the models had p < 0.01. These statistical diagnostics demonstrated that the predicted values of logKOW and −logSW were very accurate, which indicated that combining the empirical Hammett constant and quantum chemical descriptors based on meta-substituent grouping offers fast and tractable prediction power and great application potential for model development.
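The prediction chain described above can be sketched in two steps: estimate α from ∑σ⁺o,m,p using the fitted line for the relevant meta-Cl group, then insert α into the descriptor models quoted in the text. In this sketch the α-based equations are copied from the text, while `slope` and `intercept` stand for group-specific coefficients from the supplementary tables that are not reproduced here and must be supplied by the user; the function names are hypothetical.

```python
def alpha_from_sigma(sum_sigma_plus, slope, intercept):
    """Estimate polarizability from the Hammett sum with a group-specific line.

    slope and intercept are user-supplied (assumed to come from the fitted
    equations for the compound's meta-Cl group); no values are built in.
    """
    return slope * sum_sigma_plus + intercept

def log_kow_from_alpha(alpha):
    """logKOW model quoted in the text: logKOW = 0.03345 * alpha + 0.39092."""
    return 0.03345 * alpha + 0.39092

def neg_log_sw_from_alpha(alpha):
    """-logSW model quoted in the text: -logSW = 0.0693 * alpha - 3.6425."""
    return 0.0693 * alpha - 3.6425

def predict_pcdd_properties(sum_sigma_plus, slope, intercept):
    """Chain the two steps: Hammett sum -> alpha -> logKOW and -logSW."""
    alpha = alpha_from_sigma(sum_sigma_plus, slope, intercept)
    return {
        "alpha": alpha,
        "logKow": log_kow_from_alpha(alpha),
        "neg_logSw": neg_log_sw_from_alpha(alpha),
    }
```

The design point is that the expensive quantum chemical step is replaced by a lookup of the meta-Cl group and a single multiplication, so α must be expressed in the same units used when the quoted models were fitted.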

Quantum Chemical Descriptors Calculation
The structures of 210 PCB, 76 PCDD, and 76 PCN congeners (Tables S3-S5) were created with GaussView 5.0 [47]. First, the global minimum-energy structures were obtained with the Spartan'10 program [48] using the MMFF (Merck Molecular Force Field) method [49,50]. Then, the geometries were further optimized in the gas phase with Gaussian 09 (Revision C.01) [51] using the mPW1PW91 (modified Perdew-Wang exchange and Perdew-Wang 91) hybrid density functional [52,53] in combination with the MIDIX+ basis set [54,55]. The MIDIX+ basis set has been reported to have a good performance-to-cost ratio for geometrical, orbital energy, and electrostatic calculations on aromatic compounds [14,21,56]. All of the optimized structures were local minima on the potential energy surfaces with positive vibrational frequencies. Sixteen quantum chemical descriptors were obtained from the optimized results: the molecular dipole moment (µ), the energy of the highest occupied molecular orbital (EHOMO), the energy of the lowest unoccupied molecular orbital (ELUMO), the energies of the second HOMO and LUMO (EHOMO−1 and ELUMO+1), the gap between ELUMO and EHOMO (ELUMO−EHOMO), polarizability (α), electron affinity (EA), ionization potential (IP), the quadrupole moment tensor along the x/y/z axes (Qxx/Qyy/Qzz), softness (S), electronegativity (ζ), hardness (η), and the electrophilicity index (ω). The descriptors and their formulas are described in detail in Table S2.
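Several of the descriptors listed above are commonly derived from the frontier orbital energies alone. The sketch below uses widely used conceptual-DFT conventions (IP ≈ −EHOMO and EA ≈ −ELUMO, η = (IP − EA)/2, electronegativity = (IP + EA)/2, S = 1/(2η), ω = electronegativity²/(2η)). These formulas are assumptions for illustration; the definitions actually used in this study are those in Table S2, which is not reproduced here and may differ, for example in the factor of 2 in S.

```python
def frontier_orbital_descriptors(e_homo, e_lumo):
    """Derive orbital-based descriptors from E_HOMO and E_LUMO (same energy unit).

    Conventions are assumed, not taken from Table S2 of the study.
    """
    ip = -e_homo                      # ionization potential (Koopmans-type estimate)
    ea = -e_lumo                      # electron affinity (Koopmans-type estimate)
    gap = e_lumo - e_homo             # E_LUMO - E_HOMO
    hardness = (ip - ea) / 2.0        # eta
    electronegativity = (ip + ea) / 2.0
    softness = 1.0 / (2.0 * hardness)
    electrophilicity = electronegativity ** 2 / (2.0 * hardness)  # omega
    return {
        "IP": ip,
        "EA": ea,
        "gap": gap,
        "hardness": hardness,
        "electronegativity": electronegativity,
        "softness": softness,
        "electrophilicity": electrophilicity,
    }
```

Because all seven quantities are functions of only two orbital energies, they are strongly interdependent, which is one reason to check descriptors for collinearity before regression.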

Model Development
Multilinear regression (MLR) analysis [57] was used to develop the meta-substituent grouping models in this study. We selected the compounds with experimental measurements (26 lnk, 17 logKOW, and 15 logSW values) to validate the predictive power of the meta-substituent relationship (Tables S6 and S7). The standard deviation and the average standard deviation of the predicted values represent the error between the experimental and predicted values. The determination coefficient R² measures how well the model reproduces the observed values, and the validation coefficient Q² reflects the correlation between predicted and observed values. Q² was calculated as follows:
$$Q^2 = 1 - \frac{\sum_{i} (y_i - \hat{y}_i)^2}{\sum_{i} (y_i - \bar{y})^2}$$
where $y_i$ and $\hat{y}_i$ are the observed and predicted values, respectively, and $\bar{y}$ is the average of the predicted values. High R² and Q² values indicate a model with robust performance and good predictive power, respectively. In addition, R² and Q² values > 0.7 indicate a method with good predictive performance. The statistical analyses were conducted using SPSS software version 17.0 [58].
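The validation coefficient can be computed in a few lines. The sketch below assumes the usual form Q² = 1 − Σ(yᵢ − ŷᵢ)² / Σ(yᵢ − ȳ)². By default it takes ȳ as the mean of the observed values, which is the common convention; since the text describes ȳ as the average of the predicted values, a different reference mean can be passed explicitly through the hypothetical `y_bar` parameter.

```python
def q_squared(observed, predicted, y_bar=None):
    """Validation coefficient Q^2 = 1 - SS_res / SS_tot.

    observed, predicted: equal-length sequences of numbers.
    y_bar: reference mean for SS_tot; defaults to the mean of `observed`
           (an assumption, pass another value to follow a different convention).
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if y_bar is None:
        y_bar = sum(observed) / len(observed)
    ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(observed, predicted))
    ss_tot = sum((y - y_bar) ** 2 for y in observed)
    return 1.0 - ss_res / ss_tot

# Illustrative inputs only, not data from the study.
print(round(q_squared([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8]), 3))
```

A value close to 1 means the prediction errors are small relative to the spread of the data, which is how the threshold of 0.7 in the text should be read.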

Conclusions
In this study, we selected PCBs as model compounds to investigate the relationships between 16 quantum chemical descriptors and Hammett constants in order to reveal their intrinsic correlation. By systematically analyzing the relationships between the 16 quantum chemical descriptors and the Hammett sums (∑σ⁺o,m,p, ∑σo,m,p, and ∑σ⁻o,m,p) for PCB congeners, a very good correlation of ∑σ⁺o,m,p with Qxx/yy/zz, α, and EHOMO based on meta-position grouping was observed. PCDDs and PCNs, as two independent classes of compounds, validated the reliability of the relationship in aromatic compounds based on meta-substituent grouping. Furthermore, the meta-substituent grouping method between ∑σ⁺o,m,p and quantum chemical descriptors was successfully applied to predict lnk values for •OH oxidation of PCBs, as well as the logKOW and −logSW of PCDDs, and the predictions exhibited excellent agreement with experimental measurements. The results indicated that combining empirical constants and quantum chemical descriptors based on meta-substituent grouping is a useful tool for predicting the environmental behavior and chemical properties of compounds.