Catalytic Performance of Cycloalkyl-Fused Aryliminopyridyl Nickel Complexes toward Ethylene Polymerization by QSPR Modeling

: Quantitative structure–property relationship (QSPR) modeling is performed to investigate the role of cycloalkyl-fused rings on the catalytic performance of 46 aryliminopyridyl nickel precatalysts. The catalytic activities for nickel complexes in ethylene polymerization are well-predicted by the obtained 2D-QSPR model, exploring the main contribution from the charge distribution of negatively charged atoms. Comparatively, 3D-QSPR models show better predictive and validation capabilities than that of 2D-QSPR for both catalytic activity ( Act .) and the molecular weight of the product ( M w ). Three-dimensional contour maps illustrate the predominant effect of a steric field on both catalytic properties; smaller sizes of cycloalkyl-fused rings are favorable to Act .y, whereas they are unfavorable to M w . This study may provide assistance in the design of a new nickel complex with high catalytic performance.


Introduction
Since the pioneering report of -diimino-nickel precatalysts toward ethylene oligo-/polymerization in the middle of 1990s [1], numerous developments have been witnessed in the design and synthesis of nickel complex catalysts, showing their impact on a wide range of catalytic properties, such as catalyst activity, molecular weight, and the distribution of the produced oligomer/polymer [2,3]. In recent years, our group and others reported aryliminopyridyl nickel precatalysts with the incorporation of cycloalkyl-fused rings as compatible N,N-ligand frames. The fused rings rang from small sizes, like cyclopentyl and cyclohexyl, to bigger sizes, such as cycloheptyl and cyclooctyl, as in Scheme 1. All of these derivatives exhibit good activities in ethylene oligo-/polymerization and produce polyethylenes of generally low molecular weight [4,5]. Their specific properties show dependence on the fused ring size and the electronic and steric effects of the substituents. Even though the empirical information on variation trends can be obtained and provide useful guidance for experiments, it is more meaningful to explain the principle underlying the improved catalytic performance through a modeling method at the molecular level.    Scheme 1. Structures of the initial data set for 46 nickel complex precatalysts along with their catalytic activities and molecular weight of produced polyethylene under the optimum conditions. a , 10 6 g·mol −1 ·h −1 , b , g·mol −1 .
Quantitative structure-activity/property relationship (QSA/PR) modeling is an approach to build the quantitative relationship between the structure of a catalyst and its activity and/or property, through the key structural descriptors within the catalyst. This method has been mainly proposed for the field of drug design and has already proven a powerful tool. However, its application in the polyolefin catalyst is relatively limited. Regarding the field of polyolefin catalysts, QSPR has been performed to predict the catalytic activities and molecular weight of polyethylene by the metallocene complexes [6,7], bis(imino)pyridine iron complex systems [8], and aryliminopyridyl metal complexes [9,10]. Furthermore, by using the chemically meaningful descriptors, such as buried volume, the free energy of reactions are predicted for ansa-metallocenes catalysts towards propene polymerization and ethene/1-hexene copolymerization [11][12][13][14]. Recently, QSPR modeling was applied to investigate the role of cycloalkyl-fused rings within bis(imino)pyridine (Fe, Co, Cr) complexes on their catalytic performance [15]. The obtained results indicate the dominant role played by the conjugated degree in complexes on the catalytic activities.
However, this finding is not applicable for nickel complexes. As shown in Scheme 1, the introduction of a benzene ring in the aryliminopyridyl framework (category D) does not enhance the catalytic activities; on the contrary, their activities are lower by about ten times than that the analogues without a benzene ring (category C). It means that, although the ligands of the complexes are similar, the effects of a cycloalkyl-fused ring on the catalytic performance are varied, affected by the change of different central metal atoms and the framework of ligands. Therefore, it is necessary to further investigate the nickel system to detect the role of the cycloalkyl-fused ring on the corresponding catalytic performance. In this work, we subsequently apply both 2D-and 3D-QSPR methods to the nickel analogue complexes to explore the variation mechanism caused by the different sizes of cycloalkyl-fused rings. 2D-QSPR modeling obtains the predictive model for catalytic activities. Through the analysis of selected descriptors, it was found that the charge distribution of net charged atom plays the dominant role, which is different from the results of cycloalkyl-fused aryliminopyridyl Fe/Co analogues, where the dominant role is played by the number of aromatic bonds. Meanwhile, the 3D-QSPR modeling reveals the important effect of steric fields in the models of both catalytic activity and molecular weight, which is in agreement with previous works [10,15]. An exception to this agreement is that the favorable areas are different. It indicates that, depending on the analogue complexes with different central metal atoms, the cycloalkyl-fused ring play different roles in the catalytic performance.

2D-QSPR Modeling
In order to build the quantitative relationship between the structure of a complex and its catalytic performance, the descriptors are first calculated and selected for the model. Then, four parameters including correlation coefficient (R 2 ), the root mean square error of the fitted data (RMSEF), the cross-validation coefficient (Q 2 ), and the root mean squared error of cross-validation (RMSEV) are calculated and listed Table 1 to validate the stability and predictive power of the model. Additionally, the value of A means the number of components. As shown in Table 1, based on the VIP-PLS analysis, the number of descriptors gradually decreased from 63 to 22 and 12 to 5. An acceptable 2D-QSPR model should have higher R 2 and Q 2 values, as well as lower values of RMSEV/RMSEF under a smaller number of descriptors. Therefore, a set of 12 descriptors is selected as the optimum number of independent variables. The triangle matrix is plotted to show the correlation between each pair of descriptors, indicating their independent nature as in Figure 1. The specific values of each descriptor are summarized in Table S1 for each Ni complex. To build and validate the QSPR model, the training and test sets are selected from the whole data by the Kennard-Stone method [16]. As given in Table S2, the combination of 29 complexes as a training set and 10 complexes as a test set is regarded as optimum. The correlation coefficients (R 2 ) for the training set and test set are 0.892 and 0.814, respectively. The cross-validation coefficient (Q 2 ) is 0.697, indicating the acceptable prediction and validation capabilities of the model. Comparatively, the R 2 values are not as high as in previous reports [9,10,15]. The reason might be the selection of model complexes, which not only vary in the sizes of the cycloalkyl-fused rings, but also vary in the presence of alkyl substituents on the fused ring, such as categories D and E in Scheme 1. The comparisons of experimental and predicted catalytic activities are plotted in Figure 2. The specific values of predicted catalytic activities for each complex are given in Table S3.  Exp. Activity (10 6 gmol 1 h 1 ) In order to interpret the obtained model, detailed information on the selected 12 descriptors is listed in Table 2. Besides the R 2 values of each descriptor in the fitting model, the contribution values to catalytic activities are also calculated and listed together. The descriptors were ranked from higher to lower according to their contribution values to catalytic activity. The explanations focus on the descriptors exhibiting higher contributions. It is clear that among all the selected descriptors, the minimum partial charges on a C atom (No.2), present the biggest contribution to the catalytic activities, with the value of 29.17%. This electrostatic descriptor describes the characteristics of the charge distribution of C atoms, showing that the minimum charge values on carbon atoms present largely positive correlations with catalytic activity. Following the highest contribution to activity, the second and third highest contributors are descriptors No.3 and No.9, with the values of 17.38% and 12.74%, respectively. These two descriptors are classified as the charged partial surface areas (CPSA) descriptors [17,18], in which the No.3 descriptor is calculated by Equation (1): where is the partial charge for the a th negatively charged atom, is the surface area contribution of this a th atom, and TMSA is the total molecular solvent-accessible surface area. For the model complexes, the negatively charged atoms usually refer to halogen, nitrogen, and part of the carbon atoms. Therefore, this descriptor roughly reflects the charge distribution of these negatively charged atoms. Similar to descriptor No.2, descriptor No.3 presents a positive contribution to catalytic activities as well, due to the fact that the minimum partial charge on the C atom is usually negative.
The No.9 descriptor is a parameter to describe the area-weighted surface charge of H-bonding donor H atoms, which is defined by Equation (2): where SD is the solvent-accessible surface area of H-bonding donor H atoms, qD is the partial charge on H-bonding donor H atoms, and Stot is the total solvent-accessible molecular surface area. This descriptor shows the charge distribution on the H atoms and the interaction of hydrogen bonding. Different from the No.2 and No.3 descriptors, the No.9 descriptor has a negative correlation with catalytic activities, due to the generally positive charge of the H atoms. These three electrostatic descriptors, reflecting the charge distribution on negatively charged atoms and H-bonding donor H atoms, provide around 60% of the contribution to catalytic activities, showing the dominant role of the net charge of negatively charged atoms on catalytic activities.
For the remaining descriptors, their explanations are briefly provided in the Supporting Materials.

3D-QSPR Modeling
Different from the 2D-QSPR method, 3D-type builds the quantitative relationship based on the steric fields and electrostatic fields of ligands at the three-dimensional lattice. After modeling, the calculated values of the catalytic activity and molecular weight of the product are obtained and compared with the experimental data, as shown in Figure 3. The training and test sets are the same as those used in the 2D type. The detailed predicted values of the catalytic activity and molecular weight of the product for each complex are given in Tables S4 and S5, respectively. Regarding the model of catalytic activity, the correlation coefficient values for the training set (r 2 ) and test set (rt 2 ) are 0.830 and 0.889, respectively, as shown in Figure 3a. Meanwhile, the cross-validation coefficient (q 2 ) value is 0.692, indicating the good predictive and validated capabilities of the model. The contributions for steric and electrostatic fields are calculated with values of 87.3% and 12.7%, respectively, indicating the predominant role of steric fields on catalytic activity. As to the model of molecular weight, quantitative results are obtained as in Figure 3b. The correlation coefficients (r 2 , rt 2 ) for the training and test sets are 0.950 and 0.893, respectively, and the q 2 value is 0.787. Comparatively, the predictive and validated powers are higher than that of the model for catalytic activity. The decisive effect on molecular weight comes from the steric field, with an overwhelming contribution of 89.1%.
Since the dominant role is played by the steric effect in the models of both catalytic activity and the molecular weight of the product, only the 3D contour maps of steric fields are shown in Figure 4. The iso-surface values are the result of the multiplication of the standard deviation (StDev) of energy by the PLS coefficient (Coeff), providing an approximate location to explain the selective areas about its structure-performance relationship. Regarding the 3D contour map of the steric field for catalytic activity, as shown in Figure 4a, the green areas represent the positive effect of the bulky group, meaning the introduction of a bulky group in this region enhances the catalytic activity. Yellow areas, on the contrary, represent the negative effect of the bulky group. Clearly, we can see that the green areas are mainly located around the halogen atoms, and the yellow areas are surrounded on the upper side of the cycloalkyl-fused ring. This indicates that bigger, bulky steric substituents surrounding halogen atoms are favorable for catalytic activity, whereas a larger size of cycloalkyl-fused ring and/or bigger hindrance groups on the fused ring are unfavorable factors. As in Scheme 1, complexes in category D have propyl groups as substituents on the cyclohexane-fused ring and aryl-fused rings as substituents on pyridine; as a result, the catalytic activities for this category are lower than that of category C. This can be explained as: bigger steric groups on the frame work decrease the freedom degree of a complex and make its molecule rigid. As a result, the insertion reaction between the ethylene monomer and the central metal is hindered during coordinated polymerization, leading to the decrease in catalytic activity. The opposite of the results for the catalytic activities, the 3D contour map of steric fields for molecular weight show a completely different trend. It is clear in Figure 4b that green areas are located at the upper side of the cycloalkyl-fused ring, indicating that larger cycloalkyl-fused members and/or bulky substituents on the fused ring are favorable factors. This is because the highly crowded aniline group reduces the space and the rotation freedom surrounding the active site so that the monomer insertion can only occur from one side. Then, the -H elimination is inhibited, resulting in highly linear polymers with a high molecular weight.
Comparing the 2D-and the 3D-QSPR, it is clear that the 3D type provides better predictive and validation capabilities for each catalytic property, due to the more exact and detailed information of spatial descriptors provided by the 3D type. Additionally, for the models of both catalytic activity and molecular weight, 3D results show that the steric field plays a dominant role. The favorable areas with respect to catalytic activity usually turn out to be the unfavorable areas for the molecular weight of the product.

QSPR Modeling
Based on the methodology discussed in previous reports [9,10,15], a brief introduction of 2D-and 3D-QSPR is described as follows. Regarding the 2D type, firstly, the geometry is optimized for each complex by density functional theory (DFT) calculations through the Dmol3 program. [25] GGA-type BP exchange-correlation functional [28][29][30] and double-numerical basis sets with polarization functions (DNP) basis sets [31,32] are chosen to calculate the geometry and energy combined with effective core potentials [33,34]. The comparisons of the optimized structures with the experimental crystal data are listed in Table S7, showing the reasonable optimized structures with lower standardized deviation values for the bond lengths and the bond angles. In order to use the CODESSA Software [35] to calculate the descriptors for each complex, the input file needs to be converted into a Gaussian type (*.log). Therefore, the single-point energy (SPE) calculation is carried out using the Gaussian 09 program package [36] under BLYP/6-31G* level based on a previous study [9]. The natural bond orbital (NBO) is used for electron distribution analysis [37,38].
The descriptors are then calculated by the CODESSA software, generating around 350 descriptors for each complex. As in previous studies [39][40][41][42], there are seven self-defined descriptors which show good correlations with catalytic activity. Therefore, these descriptors are also calculated for each complex as shown in Table S8. Adding these seven self-defined descriptors into the pool of descriptors, the total number of descriptors is around 357. This huge number of descriptors is first pre-screened by the heuristic method [43], leaving 99 descriptors for each complex. At this point, the values of the correlation coefficients (R 2 ) are calculated and show lower values below 0.6 for the model of the molecular weight of the product. Therefore, the modeling is proceeded only for the properties of catalytic activity. The obtained descriptors are cross-validated for the optimum values of the cross-validation coefficient (Q 2 ) and the root mean squared error of cross-validation (RMSEV) using the k-fold method [44]. Descriptors are then further re-duced with a k-fold value of 5, by the variable importance of projection (VIP) [45,46] method incorporated in the partial least-squares (PLS) toolbox [47,48]. The average of the squared VIP scores equals 1, thus the "greater than one rule" is commonly used as a criterion for variable selection, meaning that only the descriptors with an importance value over 1 are selected. This pruning is repeated cyclically to gradually eliminate the descriptors.
To decide on the optimal number of descriptors, four parameters expressing the stability and predictive power of the model are employed. The correlation coefficient (R 2 ) is the square correlation coefficient between the experimental and predicted values of the activity, and the root mean square error of the fitted data (RMSEF) illustrates the difference between the predicted and experimental activities, which are defined as Equations (3) and (4), respectively.
where ŷ and are the predicted and experimental activity, respectively, is the averaged value of the experimental activity, and n is the number of complexes. The five-fold cross-validation result is defined as the cross-validation coefficient (Q 2 ) in Equation (5), and the root mean squared error of cross-validation (RMSEV) is calculated by Equation (6): where ŷ and are the predicted and experimental activity in the validation set, respectively, is the averaged value of experimental activity for the complexes across all validation sets, and n is the number of complexes. A 2D-QSPR model is considered to be predictive, provided it has R 2 and Q 2 values close to 1.0, as well as values of RMSEF/RMSEV close to 0. Later on, the data set is divided into a training set and a test set to build and validate the model, respectively.
3D-QSPR relates the catalytic properties of the complexes to their 3D descriptors, including steric and electrostatic types, which are calculated by comparative molecular field analysis (CoMFA) [49,50]. First, a database of all optimized structures is built, and according to the alignment rule, a common substructure is selected as the alignment structure, as shown in Scheme 2. After the alignment, structures are organized with the associated data into molecular spread sheets. A cube of dimension 20 Å × 20 Å × 20 Å is selected, in which molecular descriptors are calculated in a 3D grid at a step size of 2 Å [49]. Scheme 2. The common framework selected from the complexes for the alignment of structures.
In the CoMFA method, the steric and electrostatic interactions between a probe atom, usually a carbon atom of sp 3 type corresponding to atom C.3 in the Tripos force field, and complex structures are calculated at each 3D grid point. As to the electrostatic field, coulomb interactions are calculated with the charge values assigned by the MMFF94 force field. Lennard-Jones potentials are used to assess steric fields by the interaction of van der Waals. The calculated 3D descriptors are not independent variables, thus PLS analysis has been performed to extract the value of the principle components. In order to investigate the predictive power of the models and to obtain the optimal number of principal components, leave-one-out (LOO) cross-validation is carried out to provide the result of the cross-validation coefficient (q 2 ). Afterwards, non-cross-validated analysis (without any validation) is performed to get the value of predictive correlation coefficient (r 2 ). Herein, q 2 and r 2 are described in Equations (7) and (8), respectively.
where ŷ , , , and n are same as in Equation (4). A good 3D-QSPR model should have values of q 2 over 0.5 and r 2 over 0.6.

Conclusions
In the present work, QSPR modeling is performed to investigate the catalytic performance of cycloalkyl-fused aryliminopyridyl nickel precatalysts toward ethylene polymerization, including the catalytic activity and molecular weight of the product. The 2D-QSPR model shows acceptable predictive and validation capabilities for catalytic activity, indicating that the net charge of negatively charged atoms within complexes has a large positive contribution to catalytic activity. Meanwhile, 3D-QSPR analysis obtains good models not only for the catalytic activity, but also for the molecular weight of the product. A 3D contour map indicates the dominant role of steric fields for the both two performances. Larger steric environments around cycloalkyl-fused rings have an unfavorable effect on catalytic activity. On the contrary, these negative effect areas become positive effect areas in the model of the molecular weight of the product. This study explores the structural factors determining the catalytic performance of nickel precatalysts, providing potential assistance for the design of new nickel complexes with desirable properties.

Supplementary
Materials: The following are available online at www.mdpi.com/2073-4344/11/8/920/s1: Table S1: The calculated values of the final 12 descriptors for each complex; Table S2: The modeling results of the different combinations of training and test sets including the R 2 , Q 2 , Rt 2 , RMSEF, and RMSEV values for 39 complexes; Table S3: The values of the experimental and predicted catalytic activities for the training set of 29 complexes and the test set of 10 complexes from the 2D-QSAR analysis; Table S4: The values of the experimental and predicted catalytic activity for the training and test sets from the 3D-QSPR analysis; Table S5: The values of the experimental and predicted molecular weight for the training and test sets from the 3D-QSPR analysis; Table S6: Experimental values of catalytic activities, molecular weights, and melting temperatures of products under the optimum reaction conditions; Table S7: The calculated bond lengths and bond angles by DFT compared with experimental crystal structures for complex 8 along with the values of the standard deviation (); and Table S8

Data Availability Statement:
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Conflicts of Interest:
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.