On Use of the Variable Zagreb $^{v}M_2$ Index in QSPR: Boiling Points of Benzenoid Hydrocarbons

The variable Zagreb $^{v}M_2$ index is introduced and applied to the structure-boiling point modeling of benzenoid hydrocarbons. The linear model obtained (standard error of estimate for the fit model $S_{fit}$ = 6.8 °C) is much better than the corresponding model based on the original Zagreb $M_2$ index ($S_{fit}$ = 16.4 °C). Surprisingly, the model based on the variable vertex-connectivity index ($S_{fit}$ = 6.8 °C) is comparable to the model based on the $^{v}M_2$ index. A comparative study with models based on the vertex-connectivity index, the edge-connectivity index and several distance indices favours the models based on the variable Zagreb $^{v}M_2$ index and the variable vertex-connectivity index. However, multivariate regression with two, three and four descriptors gives improved models. The best of these is a four-descriptor model that does not include the $^{v}M_2$ index ($S_{fit}$ = 5 °C), though the four-descriptor model containing the $^{v}M_2$ index is only slightly inferior ($S_{fit}$ = 5.3 °C).


Introduction
The concept of variable molecular descriptors was proposed as an alternative way of characterizing heteroatoms in molecules [1,2], but also as a way to assess structural differences such as the relative role of carbon atoms in the acyclic and cyclic parts of alkylcycloalkanes [3]. The idea behind variable molecular descriptors is that the variables are determined during the regression so that the standard error of estimate for the studied property is as small as possible.
Several molecular descriptors have already been tested in their variable forms in QSPR and QSAR [4][5][6][7][8][9][10][11][12]. Here we report the use of the variable Zagreb $^{v}M_2$ index in the structure-boiling point modeling of benzenoid hydrocarbons. We selected benzenoid hydrocarbons because several structure-boiling point models of these compounds have already been published [13,14]. This allowed us to carry out a comparative study of the model based on the $^{v}M_2$ index against the models based on the standard vertex-connectivity index, the variable vertex-connectivity index, the edge-connectivity index and several distance indices.
Since the Zagreb index in its original form was derived using graph-theoretical concepts and terminology [15], we will use these in the present report. Graphs will be generated from molecules in the usual way by replacing atoms with vertices and bonds with edges [16]. The graphs that we use represent only the carbon skeletons of benzenoid hydrocarbons. Therefore, benzenoid hydrocarbons in this report will be presented as various arrangements of hexagons in the plane.

The Zagreb $M_2$ Index and Its Variable Form $^{v}M_2$
Originally, the Zagreb $M_2$ index, together with the Zagreb $M_1$ index, appeared in the topological formula for the total π-electron energy of conjugated molecules [17]:

$$M_2 = \sum_{\text{edges } i\text{-}j} d(i)\,d(j) \qquad (1)$$

where d(i) is the degree of vertex i and d(i)d(j) is the weight of edge i-j. This index was first used as a branching index [18] and later as a useful molecular descriptor in various forms in QSPR and QSAR studies [19][20][21][22][23].
The easiest way to introduce the variable Zagreb $^{v}M_2$ index is by means of an example. For this purpose we will use a graph G representing the carbon skeleton of naphthalene (see Figure 1).
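The definition above, M_2 as the sum of d(i)d(j) over all edges i-j, can be checked on a small graph. The following is a minimal sketch in Python; the vertex numbering of naphthalene and the function names are illustrative choices, not taken from the paper.

```python
from collections import Counter

# Carbon skeleton of naphthalene as an edge list (vertex labels are arbitrary):
# vertices 0..9 form the perimeter and the extra edge (4, 9) is the central bond.
NAPHTHALENE_EDGES = [(k, (k + 1) % 10) for k in range(10)] + [(4, 9)]


def vertex_degrees(edges):
    """Return a mapping vertex -> degree for an undirected edge list."""
    degrees = Counter()
    for i, j in edges:
        degrees[i] += 1
        degrees[j] += 1
    return degrees


def zagreb_m2(edges):
    """Zagreb M2 index: sum of d(i) * d(j) over all edges i-j."""
    degrees = vertex_degrees(edges)
    return sum(degrees[i] * degrees[j] for i, j in edges)


def edge_type_counts(edges):
    """Count edges by the (sorted) degree pair of their end vertices."""
    degrees = vertex_degrees(edges)
    return Counter(tuple(sorted((degrees[i], degrees[j]))) for i, j in edges)


print(zagreb_m2(NAPHTHALENE_EDGES))         # 57
print(edge_type_counts(NAPHTHALENE_EDGES))  # six (2,2), four (2,3), one (3,3) edge
```

The edge-type counts 6, 4 and 1 coincide with the coefficients quoted for naphthalene later in this section.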

Figure 1. Graph G representing the carbon skeleton of naphthalene
It is known from the chemistry of benzenoid hydrocarbons [24,25] that carbon atoms with two adjacent carbon atoms possess different characteristics than carbon atoms with three adjacent carbon atoms. We therefore assess their relative roles by differentiating these two groups with variable parameters. Using the graph-theoretical approach, the difference between the two groups of carbon atoms is expressed by means of the degrees of the corresponding vertices plus the variables. Hence, the vertex-degree of a carbon atom adjacent to two other carbon atoms is taken to be:

$$d_2 = 2 + x \qquad (2)$$

and likewise the vertex-degree of a carbon atom adjacent to three other carbon atoms is given by:

$$d_3 = 3 + y \qquad (3)$$

Putting (2) and (3) into (1), one obtains the following Zagreb $^{v}M_2$ index for naphthalene as a function of the variables x and y:

$${}^{v}M_2 = 6(2+x)^2 + 4(2+x)(3+y) + (3+y)^2 \qquad (4)$$

In general, the variable Zagreb $^{v}M_{2i}$ index of a benzenoid hydrocarbon i can be given as:

$${}^{v}M_{2i} = c_{1i}(2+x)^2 + c_{2i}(2+x)(3+y) + c_{3i}(3+y)^2 \qquad (5)$$

where $c_{1i}$, $c_{2i}$ and $c_{3i}$ are the numbers of edges joining two vertices of degree 2, a vertex of degree 2 with a vertex of degree 3, and two vertices of degree 3, respectively.

We denote the boiling point of a benzenoid hydrocarbon i (i=1,…,21) by $bp_i$. Thus, for naphthalene (i=1), $bp_1$ = 218 °C and $c_{11}$, $c_{21}$ and $c_{31}$ are 6, 4 and 1, respectively. In Table 1 we give the $bp_i$, $c_{1i}$, $c_{2i}$ and $c_{3i}$ values for the 21 benzenoid hydrocarbons whose graphs are given in Figure 2. Experimental values of the boiling points of the considered benzenoid hydrocarbons are taken from Randić [13].

For x=0 and y=0, expression (5) reduces to:

$$M_{2i} = 4c_{1i} + 6c_{2i} + 9c_{3i} \qquad (6)$$

which is the formula for computing the original Zagreb $M_2$ index of a given benzenoid i. It should also be noted that the variable connectivity index $^{v}\chi_i$ is related to the Zagreb $^{v}M_{2i}$ index through the same set of coefficients:

$${}^{v}\chi_i = \frac{c_{1i}}{2+x} + \frac{c_{2i}}{\sqrt{(2+x)(3+y)}} + \frac{c_{3i}}{3+y} \qquad (7)$$

For x=0 and y=0, eq. (7) reduces to the formula for computing the vertex-connectivity index of a benzenoid hydrocarbon i:

$$\chi_i = \frac{c_{1i}}{2} + \frac{c_{2i}}{\sqrt{6}} + \frac{c_{3i}}{3} \qquad (8)$$
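As a worked illustration of the general expression, the sketch below evaluates the variable index from the three coefficients of a benzenoid. It assumes that the variable vertex degrees are 2 + x and 3 + y and that c1, c2 and c3 weight the terms (2+x)^2, (2+x)(3+y) and (3+y)^2; the function name is hypothetical.

```python
def variable_zagreb_m2(c1, c2, c3, x=0.0, y=0.0):
    """Variable Zagreb index from the edge-type coefficients.

    Assumed form: c1*(2+x)**2 + c2*(2+x)*(3+y) + c3*(3+y)**2,
    where 2+x and 3+y are the variable degrees of the two vertex types.
    """
    d2 = 2.0 + x  # variable degree of a carbon bonded to two carbons
    d3 = 3.0 + y  # variable degree of a carbon bonded to three carbons
    return c1 * d2 * d2 + c2 * d2 * d3 + c3 * d3 * d3


NAPHTHALENE = (6, 4, 1)  # c1, c2, c3 quoted in the text for naphthalene

# x = y = 0 recovers the original Zagreb M2 index of naphthalene.
print(variable_zagreb_m2(*NAPHTHALENE))  # 57.0

# Value at x = 0.0, y = -1.2, the pair reported as optimal later in the paper.
print(round(variable_zagreb_m2(*NAPHTHALENE, x=0.0, y=-1.2), 2))  # 41.64
```

Working from the coefficients rather than from the full graph keeps the evaluation cheap, which matters when the index has to be recomputed for many (x, y) pairs.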
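The relation between the two indices can be made concrete with a short sketch. It assumes the usual connectivity-index weighting, in which each edge contributes the inverse square root of the product of its (variable) end-vertex degrees, so that the three edge types contribute 1/(2+x), 1/sqrt((2+x)(3+y)) and 1/(3+y). The function name is hypothetical, and the last lines assume that the linear model quoted later for x = 0.0, y = 0.5 has the form bp = -11.5 + 46.27 times the variable connectivity index.

```python
import math


def variable_connectivity(c1, c2, c3, x=0.0, y=0.0):
    """Variable vertex-connectivity index from the edge-type coefficients.

    Assumed form: c1/(2+x) + c2/sqrt((2+x)*(3+y)) + c3/(3+y).
    """
    d2 = 2.0 + x
    d3 = 3.0 + y
    if d2 <= 0.0 or d3 <= 0.0:
        raise ValueError("variable degrees 2+x and 3+y must be positive")
    return c1 / d2 + c2 / math.sqrt(d2 * d3) + c3 / d3


NAPHTHALENE = (6, 4, 1)  # c1, c2, c3 quoted in the text for naphthalene

# x = y = 0 gives the ordinary vertex-connectivity index of naphthalene.
print(round(variable_connectivity(*NAPHTHALENE), 4))  # 4.9663

# Illustrative prediction under the assumed form of the fitted linear model.
chi_opt = variable_connectivity(*NAPHTHALENE, x=0.0, y=0.5)
print(round(-11.5 + 46.27 * chi_opt, 1))
```

Under these assumptions the estimate for naphthalene comes out near 210 °C, against the experimental 218 °C quoted in the text.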

Results and Discussion
To find the optimal variable Zagreb $^{v}M_2$ index, x was varied between -2 and 2 and y between -3 and 3, both in steps of 0.1. These ranges are imposed by the vertex degrees (2 and 3) that occur in benzenoid graphs. In the non-optimized Zagreb index ($M_2$), both x and y are equal to 0.0. We want to see whether there are values of x and y near the non-optimized values (0.0, 0.0) for which the standard error of estimate of the structure-boiling point model reaches a minimum. For each pair (x, y) in the given range, the coefficients $a_0$ and $a_1$ in the linear regression model:

$$bp_i = a_0 + a_1\,{}^{v}M_{2i} \qquad (9)$$

were computed using the least-squares fitting procedure as implemented in the CROMRsel program [26][27][28].
The quality of the models is expressed by fitted (descriptive) statistical parameters: the correlation coefficient $R_{fit}$, the standard error of estimate $S_{fit}$ and the Fisher value F. $S_{fit}$ was computed with both N and N-I-1 in the denominator, giving $S_{fit}(N)$ and $S_{fit}(N-I-1)$, where N is the number of considered benzenoid hydrocarbons and I is the number of descriptors used in the model. In addition, the models were internally cross-validated using the leave-one-out method. Statistical parameters for the cross-validated models are denoted $R_{cv}$ and $S_{cv}$.
In Figure 3, we give the scatter plot of $S_{fit}(N)$ against x for the optimum value of y (-1.2), and in Figure 4 the scatter plot of $S_{fit}(N)$ against y for the optimum value of x (0.0). The results for the fitted models were supported by the results for the cross-validated models (see Figures 5 and 6).
In Figure 7, we give plots of the experimental against the calculated boiling points for the fit and cross-validated models. In Figure 8 we give the scatter plots of the fit residuals against the fitted values and of the cross-validated residuals against the cross-validated boiling points, respectively.
We also derived the structure-boiling point model using the variable vertex-connectivity index $^{v}\chi$ and the CROMRsel procedure. The following linear model is obtained for the optimum parameters x (0.0) and y (0.5):

$$bp = -11.5\,(\pm 8.2) + 46.27\,(\pm 0.74)\,{}^{v}\chi \qquad (12)$$

Both models, (10) and (12), possess practically identical statistical parameters. This is an unexpected result, since the vertex-connectivity index [29] is superior to the Zagreb $M_2$ index in building QSPR models, though there are indications [22] that the various variable forms of these two indices lead to models of the same quality in terms of statistical parameters. The explanation is that eqs. (5) and (7) are both built from the same coefficients $c_{1i}$, $c_{2i}$ and $c_{3i}$ and differ only in the constant factors that multiply them.
We also considered the following linear model:

$$bp_i = a_0 + a_1 c_{1i} + a_2 c_{2i} + a_3 c_{3i}$$

where $c_{1i}$, $c_{2i}$ and $c_{3i}$ are taken from Table 1. The resulting model (14) possesses fit statistical parameters $R_{fit}$ and $S_{fit}(N)$ almost identical to those of models (10) and (12). These parameters would be exactly the same if x and y were obtained more accurately. Because of the greater number of descriptors used in model (14), its $S_{fit}(N-I-1)$, $S_{cv}(N)$, $S_{cv}(N-I-1)$, $R_{cv}$ and F values are somewhat worse.
Both models (19) and (20) are fitted models; they were not cross-validated.
We also carried out multivariate regression with two, three and four descriptors using the CROMRsel procedure, and the fitted models obtained were cross-validated. The descriptors we considered were $^{v}M_2$, χ, ε and three distance indices (the Wiener sum index WS, ws and the detour index ω). The WS index is a Wiener-like index obtained from the quotient matrix D/∆ [35], and ω is equal to the half-sum of the elements of the detour matrix [32,36]. We considered the χ, ε, WS, ws and ω indices because they have been used in previous structure-boiling point studies of benzenoid hydrocarbons [13,14]. Below we give the best models obtained, followed by the best models containing the $^{v}M_2$ index (in the case of the two-descriptor model, the best model itself contains the $^{v}M_2$ index):
Model (27) possesses the lowest values of the standard errors of estimate, but the linear models based on the variable Zagreb $M_2$ index (10) and the variable connectivity index (18) are also very good models, with standard errors of estimate for the fit and cross-validated models of 6.8 (7.2) °C and 8.0 (8.4) °C, respectively.
We also considered the intercorrelation between the descriptors used in building models (24)-(28). The intercorrelation matrix reflecting the pairwise linear correlation between $^{v}M_2$, ε, WS, ws and ω, computed for the 21 benzenoids, is given in Table 2. The degree of intercorrelation is appraised by the correlation coefficient R. Pairs of indices with R ≥ 0.97 are regarded as highly correlated, those with 0.90 ≤ R ≤ 0.97 as appreciably correlated, those with 0.50 ≤ R ≤ 0.89 as correlated, and pairs of descriptors with low values of R (< 0.50) as not correlated [37]. According to this classification, all the considered descriptors are either highly correlated ($^{v}M_2$, ε; $^{v}M_2$, ws; $^{v}M_2$, ω; ε, ws; ε, ω; ws, ω) or appreciably correlated ($^{v}M_2$, WS; ε, WS; WS, ws; WS, ω). However, as Randić [e.g., 38] pointed out, the intercorrelation criterion should not always be used for filtering the descriptors to be used in building QSPR models.
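The optimization described above (a grid over x and y, a least-squares line for each pair, and leave-one-out validation) can be sketched as follows. This is not the CROMRsel implementation. The index formula assumes variable degrees 2 + x and 3 + y, the coefficient triples other than naphthalene's and all boiling points are synthetic placeholders rather than the values of Table 1, and the lower grid bounds are left open so that both variable degrees stay positive.

```python
import math


def variable_zagreb_m2(c, x, y):
    """Assumed form: c1*(2+x)**2 + c2*(2+x)*(3+y) + c3*(3+y)**2."""
    c1, c2, c3 = c
    d2, d3 = 2.0 + x, 3.0 + y
    return c1 * d2 * d2 + c2 * d2 * d3 + c3 * d3 * d3


def fit_line(xs, ys):
    """Ordinary least squares for ys = a0 + a1 * xs; returns (a0, a1)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((u - mx) ** 2 for u in xs)
    sxy = sum((u - mx) * (v - my) for u, v in zip(xs, ys))
    a1 = sxy / sxx
    return my - a1 * mx, a1


def s_fit(xs, ys, n_descriptors=1):
    """Return (S_fit(N), S_fit(N-I-1)) for the one-descriptor linear model."""
    a0, a1 = fit_line(xs, ys)
    sse = sum((v - (a0 + a1 * u)) ** 2 for u, v in zip(xs, ys))
    n = len(xs)
    return math.sqrt(sse / n), math.sqrt(sse / (n - n_descriptors - 1))


def s_cv(xs, ys):
    """Leave-one-out S_cv(N): refit without point k, then predict point k."""
    press = 0.0
    for k in range(len(xs)):
        a0, a1 = fit_line(xs[:k] + xs[k + 1:], ys[:k] + ys[k + 1:])
        press += (ys[k] - (a0 + a1 * xs[k])) ** 2
    return math.sqrt(press / len(xs))


def grid_search(coeffs, bps):
    """Scan x in (-2, 2] and y in (-3, 3] in steps of 0.1; minimise S_fit(N)."""
    best = None
    for i in range(-19, 21):
        for j in range(-29, 31):
            x, y = i / 10, j / 10
            index = [variable_zagreb_m2(c, x, y) for c in coeffs]
            s_n, _ = s_fit(index, bps)
            if best is None or s_n < best[0]:
                best = (s_n, x, y)
    return best


# Synthetic placeholder data: only (6, 4, 1) is quoted in the text (naphthalene).
COEFFS = [(6, 4, 1), (6, 8, 2), (7, 6, 3), (6, 8, 5), (8, 8, 5), (6, 12, 3)]
OFFSETS = [3.0, -2.0, 1.5, -4.0, 2.5, -1.0]  # made-up deviations
BPS = [20.0 + 4.0 * variable_zagreb_m2(c, 0.0, -1.2) + e
       for c, e in zip(COEFFS, OFFSETS)]

s_best, x_best, y_best = grid_search(COEFFS, BPS)
index = [variable_zagreb_m2(c, x_best, y_best) for c in COEFFS]
print("optimum (x, y):", (x_best, y_best), "S_fit(N):", round(s_best, 2))
print("S_cv(N) at the optimum:", round(s_cv(index, BPS), 2))
```

One consequence of the assumed index form is worth noting when reading the output: the regression absorbs any common scale factor of the index, so pairs (x, y) with the same ratio (3+y)/(2+x) fit equally well and ties along the grid are possible.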
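The closeness of the one-descriptor and three-descriptor models can be made explicit with a short derivation. It is a sketch that assumes the variable index has the form c_{1i}(2+x)^2 + c_{2i}(2+x)(3+y) + c_{3i}(3+y)^2 and that the one-descriptor model is linear in that index; the symbols b_0 to b_3 are introduced here for illustration only.

```latex
\begin{align*}
bp_i &= a_0 + a_1\,{}^{v}M_{2i} \\
     &= a_0 + a_1(2+x)^2\,c_{1i} + a_1(2+x)(3+y)\,c_{2i} + a_1(3+y)^2\,c_{3i} \\
     &= b_0 + b_1 c_{1i} + b_2 c_{2i} + b_3 c_{3i},
     \qquad b_2^{\,2} = b_1 b_3 .
\end{align*}
```

Under these assumptions the one-descriptor model is a member of the three-descriptor family whose coefficients are tied by b_2^2 = b_1 b_3. The same constraint follows when the index is replaced by a variable connectivity index of the form c_{1i}/(2+x) + c_{2i}/\sqrt{(2+x)(3+y)} + c_{3i}/(3+y), which is consistent with the near-identical fit statistics reported for the two one-descriptor models.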
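The classification of descriptor pairs by their correlation coefficient can be sketched as below. The function names and the sample descriptor values are hypothetical. Two further assumptions are made because the quoted intervals overlap at 0.97 and leave a gap between 0.89 and 0.90: the boundaries are treated as half-open intervals, and the absolute value of R is used.

```python
import math


def pearson_r(u, v):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


def classify_intercorrelation(r):
    """Map a correlation coefficient to the four classes quoted in the text."""
    r = abs(r)  # assumption: only the strength of the correlation matters
    if r >= 0.97:
        return "highly correlated"
    if r >= 0.90:
        return "appreciably correlated"
    if r >= 0.50:
        return "correlated"
    return "not correlated"


# Hypothetical descriptor values for five molecules (not taken from Table 2).
descriptor_a = [41.6, 63.5, 66.9, 73.0, 81.0]
descriptor_b = [11.0, 16.0, 16.5, 19.0, 21.0]

r = pearson_r(descriptor_a, descriptor_b)
print(round(r, 3), classify_intercorrelation(r))
```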

Conclusions
The variable Zagreb $M_2$ index was used in the structure-boiling point modeling of benzenoid hydrocarbons. The model obtained is practically identical to the model based on the variable vertex-connectivity index, and this is due to the close relationship between the formulas for the two indices.
Comparative analysis of models based on several descriptors favoured the multivariate models with three and four descriptors. The best models with three and four descriptors did not include the $^{v}M_2$ index. However, the next best three- and four-descriptor models contain the $^{v}M_2$ index, and the best two-descriptor model contains the $^{v}M_2$ index. The standard errors of estimate for the fit and cross-validated models listed in this report are in the 5.0-9.4 °C range. This is a very good result, since it shows that the boiling points of benzenoid hydrocarbons can be predicted within an error range of 0.8-4.3%.

Table 1. The values of experimental boiling points ($bp_i$ in °C, i=1,…,21) and coefficients ($c_{1i}$, $c_{2i}$ and $c_{3i}$) of the variable Zagreb $^{v}M_2$ indices of 21 benzenoid hydrocarbons.