# Current Mathematical Methods Used in QSAR/QSPR Studies

## Abstract

## 1. Introduction

## 2. Multiple Linear Regression (MLR)

#### 2.1. Best Multiple Linear Regression (BMLR)

#### 2.2. Heuristic Method (HM)

…$_{ij}$ is lower than 0.1; (b) the pairs of orthogonal descriptors are used to compute the biparametric regression equations; (c) to a multi-linear regression (MLR) model containing $n$ descriptors, a new descriptor is added to generate a model with $n + 1$ descriptors if the new descriptor is not significantly correlated with the previous $n$ descriptors; step (c) is repeated until MLR models with a prescribed number of descriptors are obtained. The goodness of the correlation is tested by the squared correlation coefficient ($R^{2}$), the cross-validated squared correlation coefficient ($q^{2}$), the F-test ($F$), and the standard deviation ($S$) [1].
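Steps (a) to (c) can be sketched in Python as a greedy search. This is a minimal illustration, not the CODESSA implementation: the function names `r_squared` and `heuristic_select`, the use of NumPy, and the reuse of the 0.1 threshold as the test for "not significantly correlated" in step (c) are all assumptions made for the example.

```python
import itertools
import numpy as np

def r_squared(X, y):
    """R^2 of an ordinary least-squares fit with an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)

def heuristic_select(X, y, n_desc=3, r_max=0.1):
    """Greedy descriptor selection; returns (column indices, R^2)."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    # (a) + (b): best two-descriptor model among near-orthogonal pairs
    pairs = [p for p in itertools.combinations(range(X.shape[1]), 2)
             if corr[p] < r_max]
    if not pairs:
        raise ValueError("no descriptor pair is below the correlation threshold")
    best = list(max(pairs, key=lambda p: r_squared(X[:, list(p)], y)))
    # (c): add the weakly correlated descriptor that raises R^2 the most
    while len(best) < n_desc:
        cands = [j for j in range(X.shape[1])
                 if j not in best and corr[j, best].max() < r_max]
        if not cands:
            break
        best.append(max(cands, key=lambda j: r_squared(X[:, best + [j]], y)))
    return best, r_squared(X[:, best], y)
```

Calling `heuristic_select(X, y, n_desc=4)` on a descriptor matrix `X` (samples by descriptors) returns the chosen column indices and the fitted $R^{2}$; a real workflow would also compute $q^{2}$, $F$, and $S$ for the final model.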

#### 2.3. Genetic Algorithm based Multiple Linear Regression (GA-MLR)

…$^{2}$ error, the LOF measure cannot always be reduced by adding more terms to the regression model. By limiting the tendency to simply add more terms, the LOF measure resists over-fitting of a model. Crossover and mutation operations are then performed to generate new individuals. In the subsequent selection stage, the fittest individuals pass to the next generation. These steps of evolution continue until the stopping conditions are satisfied. MLR is then employed to correlate the descriptors selected by the GA with the values of the activities or properties.
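The penalizing behaviour of the LOF measure can be illustrated with a short Python sketch. It assumes the form of Friedman's lack-of-fit score commonly used in genetic function approximation, LOF = LSE / (1 - (c + d·p)/M)^2, where LSE is the least-squares error, c the number of model terms, p the number of descriptors in those terms, M the number of samples, and d a user-set smoothing parameter. The function name and argument names are hypothetical.

```python
def lack_of_fit(lse, n_terms, n_features, n_samples, d=1.0):
    """Friedman-style LOF score (assumed form): LSE / (1 - (c + d*p)/M)^2."""
    penalty = 1.0 - (n_terms + d * n_features) / n_samples
    if penalty <= 0.0:
        raise ValueError("model is too large for this number of samples")
    return lse / penalty ** 2

# A larger model with a slightly smaller LSE still scores worse on LOF.
small_model = lack_of_fit(lse=0.50, n_terms=3, n_features=3, n_samples=20)
large_model = lack_of_fit(lse=0.48, n_terms=6, n_features=6, n_samples=20)
```

Here the six-term model lowers the raw error from 0.50 to 0.48, yet its LOF is higher than that of the three-term model, which is the property that makes LOF usable as a GA fitness function.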

## 3. Partial Least Squares (PLS)

#### 3.1. Genetic Partial Least Squares (G/PLS)

#### 3.2. Factor Analysis Partial Least Squares (FA-PLS)

#### 3.3. Orthogonal Signal Correction Partial Least Squares (OSC-PLS)

## 4. Neural Networks (NN)

#### 4.1. Radial Basis Function Neural Network (RBFNN)

…center ($c_{j}$) and width ($r_{j}$). For brevity, only the Gaussian RBF is introduced in this paper. The nonlinear transformation performed by the RBF in the hidden layer is given as follows:

$$h_{j}(x) = \exp\left(-\frac{\lVert x - c_{j}\rVert^{2}}{r_{j}^{2}}\right)$$

where $h_{j}$ is the notation for the output of the $j$th RBF unit, and $c_{j}$ and $r_{j}$ are the center and width of the $j$th RBF, respectively. The operation of the output layer is linear, and is given below:

$$y_{k}(x) = \sum_{j} w_{kj}\,h_{j}(x) + b_{k}$$

where $y_{k}$ is the $k$th output unit for the input vector $x$, $w_{kj}$ is the weight connection between the $k$th output unit and the $j$th hidden layer unit, and $b_{k}$ is the bias. The training procedure for an RBFNN involves selecting the centers, widths, and weights. In this paper, the forward subset selection routine was used to select the centers from the training set samples. The connection weights between the hidden layer and the output layer were adjusted using a least-squares solution after the centers and widths of the radial basis functions had been selected. Compared with the Back Propagation Neural Network (BPNN), the RBFNN has the advantage of a short training time and, because the output weights are obtained by linear least squares once the centers and widths are fixed, it is guaranteed to reach the global minimum of the error surface with respect to those weights during training. The optimization of its topology and learning parameters is easy to implement. Applications of RBFNN in QSAR/QSPR studies can be found in [22,24,27,32,51,92–102].
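The two-stage training described above (fixed Gaussian hidden units, then a least-squares solve for the output layer) can be sketched with NumPy. This is an illustrative sketch only: it assumes a single shared width, takes the centers as given rather than choosing them by forward subset selection, and uses the Gaussian form exp(-||x - c||^2 / r^2); the function names are hypothetical.

```python
import numpy as np

def rbf_design(X, centers, width):
    """Hidden-layer outputs: one Gaussian column per center."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / width ** 2)

def fit_rbfnn(X, y, centers, width):
    """Least-squares output weights; the last entry is the bias."""
    H = np.column_stack([rbf_design(X, centers, width), np.ones(len(X))])
    w, *_ = np.linalg.lstsq(H, y, rcond=None)
    return w

def predict_rbfnn(X, centers, width, w):
    """Linear output layer applied to the hidden-layer outputs."""
    H = np.column_stack([rbf_design(X, centers, width), np.ones(len(X))])
    return H @ w
```

Because the hidden layer is fixed before the weights are fitted, the only optimization is a linear least-squares problem, which is why no iterative back-propagation is needed in this setup.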

#### 4.2. General Regression Neural Network (GRNN)

…$_{i}$) is the distance between the unknown sample and a data point. The Gaussian function is frequently used as the weighting function because it is well behaved, easily calculated, and satisfies the conditions required by Parzen’s estimator. Substituting Parzen’s nonparametric estimator for $f(x, y)$ and performing the integrations leads to the fundamental equation of GRNN:
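In Specht's formulation, a GRNN prediction is a kernel-weighted average of the training targets. The sketch below assumes a Gaussian weighting function with a single smoothing parameter `sigma` and squared Euclidean distance; the function name and the plain-list data layout are choices made for the example.

```python
import math

def grnn_predict(x, train_X, train_y, sigma):
    """Weighted average of training targets with Gaussian weights."""
    weights = [
        math.exp(-sum((a - b) ** 2 for a, b in zip(x, xi)) / (2.0 * sigma ** 2))
        for xi in train_X
    ]
    # Note: if the query is very far from every training point, all weights
    # can underflow to zero and this division fails.
    return sum(w * y for w, y in zip(weights, train_y)) / sum(weights)
```

A small `sigma` makes the prediction follow the nearest training sample, while a large `sigma` pulls it toward the mean of all targets, so `sigma` is the one parameter that has to be tuned.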

## 5. Support Vector Machine (SVM)

#### 5.1. Least Square Support Vector Machine (LS-SVM)

…$R^{n} \to R^{m}$ is the feature map mapping the input space to a usually high-dimensional feature space, $\gamma$ is the relative weight of the error term, and $e_{k}$ are error variables that take noisy data into account and help avoid poor generalization. LS-SVM treats this as a constrained optimization problem and uses a Lagrange function to solve it. By solving the Lagrangian form of Equation (8), the weight coefficients ($w$) can be written as:

…$K(x, x_{k})$ is the kernel function. Its value is equal to the inner product of the two vectors $x$ and $x_{k}$ in the feature space, $\Phi(x)$ and $\Phi(x_{k})$; that is, $K(x, x_{k}) = \Phi(x)^{T}\Phi(x_{k})$. The kernel and its specific parameters, together with $\gamma$, have to be chosen and tuned by the user. The radial basis function (RBF) kernel $K(x, x_{k}) = \exp(-\lVert x_{k} - x\rVert^{2}/\sigma^{2})$ is commonly used, and leave-one-out (LOO) cross-validation is then employed to tune the two parameters $\gamma$ and $\sigma$. LS-SVM is a simplification of the traditional support vector machine (SVM). It shares the advantages of SVM and has an additional advantage of its own: it only requires solving a set of linear equations, which is much easier and computationally simpler than the quadratic programming problem solved by traditional SVM. Therefore, LS-SVM is typically faster than traditional SVM on the same task. The related literature is presented in [37,114–118].
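The statement that LS-SVM training reduces to a set of linear equations can be made concrete with a NumPy sketch of LS-SVM regression. It assumes the standard dual system of Suykens and Vandewalle, in which the bias `b` and the coefficients `alpha` solve one square linear system built from the kernel matrix plus `I / gamma`, and it uses the RBF kernel in the form given above. The function names are hypothetical, and the LOO tuning of `gamma` and `sigma` is not shown.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """K(a, b) = exp(-||a - b||^2 / sigma^2) for all row pairs of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / sigma ** 2)

def lssvm_fit(X, y, gamma, sigma):
    """Solve the (n+1) x (n+1) linear system for the bias b and alpha."""
    n = len(y)
    M = np.zeros((n + 1, n + 1))
    M[0, 1:] = 1.0
    M[1:, 0] = 1.0
    M[1:, 1:] = rbf_kernel(X, X, sigma) + np.eye(n) / gamma
    sol = np.linalg.solve(M, np.concatenate([[0.0], y]))
    return sol[0], sol[1:]

def lssvm_predict(X_new, X, b, alpha, sigma):
    """y(x) = sum_k alpha_k K(x, x_k) + b."""
    return rbf_kernel(X_new, X, sigma) @ alpha + b
```

One consequence of this design is that every training sample receives a nonzero `alpha`, so the sparseness of a traditional SVM solution is traded for the simpler linear solve.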

## 6. Gene Expression Programming (GEP)

#### 6.1. The GEP chromosomes, expression trees (ETs), and the mapping mechanism

#### 6.2. Description of the GEP algorithm

…$f_{i}$ of an individual program $i$ is expressed by the equation:

…$_{(ij)}$ is the value predicted by the individual program $i$ for fitness case $j$ (out of $n$ fitness cases), and $T_{j}$ is the target value for fitness case $j$. Note that the absolute value term corresponds to the relative error. This term is compared with what is called the precision: if the error is smaller than or equal to the precision, the error becomes zero. Thus, for a good match the absolute value term is zero and $f_{i} = f_{\max} = nR$. For some function-finding problems it is important to evolve a model that performs well for all fitness cases within a certain relative error (the precision) of the correct value:

…$_{(ij)}$ is the relative error of an individual program $i$ for the fitness case $j$. …$_{(ij)}$ is given by:

…$\sigma_{t}$ and $\sigma_{p}$ are the corresponding standard deviations. GEP is one of the newest chemometric methods, and Si et al. [23,25,119–121] were the first to apply it to QSAR studies. The results of their studies are satisfactory and show promise for nonlinear structure-activity/property relationship correlation, but GEP has an inherent weakness with respect to the reproducibility of the predicted results and tends to produce overly complex equations. This places higher demands on the user. GEP is now commercialized in the software Automatic Problem Solver 3.0 or GeneXproTools 4.0 [122].
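The role of the selection range $R$ and the precision in the fitness calculation can be shown in a few lines of Python. This sketch assumes the percentage relative-error form that is common in the GEP literature, where each fitness case contributes $R$ minus the relative error and errors at or below the precision count as zero. The function name, argument names, and default values are hypothetical.

```python
def gep_fitness(predicted, targets, selection_range=100.0, precision=0.01):
    """Sum over fitness cases of (R - relative error in percent)."""
    total = 0.0
    for p, t in zip(predicted, targets):
        rel_err = abs((p - t) / t) * 100.0  # target values must be nonzero
        if rel_err <= precision:
            rel_err = 0.0                   # within precision: treated as exact
        total += selection_range - rel_err
    return total
```

With three fitness cases and a perfect match, the function returns $nR$ = 300, the maximum fitness mentioned in the text. Cases whose error exceeds $R$ contribute a negative amount in this simplified version; a production implementation may handle them differently.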

## 7. Projection Pursuit Regression (PPR)

…$_{i}$ and $g_{i}$:

…$_{i}$ values are $m \times n$ orthonormal matrices and $p$ is the number of ridge functions. In recent QSAR/QSPR studies [21,37,124–129], PPR was employed as a regression method and consistently resulted in the best final models. This indicates that PPR is a promising regression method in QSAR/QSPR studies, especially when the correlation between descriptors and activities or properties is nonlinear.
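As a small illustration of what a fitted PPR model computes, the sketch below evaluates a sum of $p$ ridge functions, each applied to a one-dimensional projection of the descriptors. It only evaluates a model whose projection directions and ridge functions are already known; estimating them is the hard part of PPR and is not shown. The function name and the representation of ridge functions as Python callables are assumptions.

```python
import numpy as np

def ppr_predict(X, directions, ridge_functions):
    """Sum of ridge functions g_i applied to the projections X @ a_i."""
    return sum(g(X @ a) for a, g in zip(directions, ridge_functions))
```

For example, with directions `[1, 0]` and `[0, 1]` and ridge functions `np.square` and `lambda z: 2.0 * z`, the model is the nonlinear function $x_{1}^{2} + 2x_{2}$, which a single linear regression could not represent.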

## 8. Local Lazy Regression (LLR)

…$_{NN(x)}$ and build a regression model with only the points in $NN(x)$ using the least-squares method, minimizing the squared residuals for the region covered by this model. We therefore do not need to build multiple models for each point in the training set beforehand. In essence, when faced with a query point, the approach builds a representative predictive model. Hence, this approach is termed local lazy regression (LLR), which is described below:

$$\beta_{x} = \left(X_{NN(x)}^{T} X_{NN(x)}\right)^{-1} X_{NN(x)}^{T} Y_{NN(x)}$$

where $X_{NN(x)}$ is the matrix of independent variables and $Y_{NN(x)}$ the column vector of dependent variables for the molecules in the neighborhood of the query point, and $\beta_{x}$ is the column vector of regression coefficients. One of the main advantages of LLR is that no a priori model needs to be built. This makes it suitable for large data sets, where using all of the observations can be time-consuming and even lead to over-fitting. At the same time, because it builds a regression model for each query point, one cannot extract meaningful structure-activity trends for the data set as a whole. That is, the focus of LLR is on predictive ability rather than interpretability. Like every method, the lazy approach has a number of shortcomings. First, as all of the computations are done at query time, the determination of the local neighborhood must be efficient. Second, uncorrelated features might result in errors in the identification of near neighbors. Finally, it is nontrivial to integrate feature selection in this framework. LLR is generally used to develop linear models for data sets in which the global structure-activity/property relationship is nonlinear in nature. However, as an emerging method in the field of QSAR/QSPR, LLR is not yet used extensively, with only a few relevant studies shown in references [125,130–132]. It is expected that more application studies involving LLR will appear in future QSAR/QSPR analyses.
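The query-time procedure can be sketched with NumPy: find the nearest neighbors of the query point, fit ordinary least squares on those neighbors only, and predict. The sketch assumes Euclidean distance, a fixed neighborhood size `k`, and an added intercept column; all three are choices made for the example, as is the function name.

```python
import numpy as np

def llr_predict(x_query, X_train, y_train, k=5):
    """Fit a least-squares model on the k nearest neighbors and predict."""
    dist = np.linalg.norm(X_train - x_query, axis=1)
    nn = np.argsort(dist)[:k]                        # indices of the neighborhood
    A = np.column_stack([np.ones(k), X_train[nn]])   # neighbors plus intercept
    beta, *_ = np.linalg.lstsq(A, y_train[nn], rcond=None)
    return float(np.concatenate([[1.0], x_query]) @ beta)
```

On data such as $y = |x|$, which no single global line can fit, a query at $x = 0.55$ with four neighbors sees only points on the right-hand branch and is predicted by a purely local linear model. Since the neighbor search runs for every query, an indexed search structure would be preferable to the full sort used here on large data sets.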

## 9. Conclusions

## References and Notes

- Katritzky, AR; Lobanov, VS; Karelson, M. Comprehensive Descriptors for Structural and Statistical Analysis (CODESSA) Ref. Man. Version 2.7.10, 2007.
- Du, H; Wang, J; Hu, Z; Yao, X. Quantitative Structure-Retention relationship study of the constituents of saffron aroma in SPME-GC-MS based on the projection pursuit regression method. Talanta **2008**, 77, 360–365.
- Du, H; Watzl, J; Wang, J; Zhang, X; Yao, X; Hu, Z. Prediction of retention indices of drugs based on immobilized artificial membrane chromatography using Projection Pursuit Regression and Local Lazy Regression. J. Sep. Sci. **2008**, 31, 2325–2333.
- Du, H; Zhang, X; Wang, J; Yao, X; Hu, Z. Novel approaches to predict the retention of histidine-containing peptides in immobilized metal-affinity chromatography. Proteomics **2008**, 8, 2185–2195.
- Katritzky, AR; Pacureanu, L; Dobchev, D; Karelson, M. QSPR modeling of hyperpolarizabilities. J. Mol. Model. **2007**, 13, 951–963.
- Ren, Y; Liu, H; Yao, X; Liu, M. An accurate QSRR model for the prediction of the GCxGCTOFMS retention time of polychlorinated biphenyl (PCB) congeners. Anal. Bioanal. Chem. **2007**, 388, 165–172.
- Srivani, P; Srinivas, E; Raghu, R; Sastry, GN. Molecular modeling studies of pyridopurinone derivatives--potential phosphodiesterase 5 inhibitors. J. Mol. Graph. Model. **2007**, 26, 378–390.
- Kahn, I; Sild, S; Maran, U. Modeling the toxicity of chemicals to Tetrahymena pyriformis using heuristic multilinear regression and heuristic back-propagation neural networks. J. Chem. Inf. Model. **2007**, 47, 2271–2279.
- Semichem Home Page. Available online: http://www.semichem.com/codessa (accessed on 10 March 2009).
- Codessa Pro Home Page. Available online: http://www.codessa-pro.com/ (accessed on 10 March 2009).
- Xia, B; Liu, K; Gong, Z; Zheng, B; Zhang, X; Fan, B. Rapid toxicity prediction of organic chemicals to Chlorella vulgaris using quantitative structure-activity relationships methods. Ecotoxicol. Environ. Saf. **2009**, 72, 787–794.
- Yuan, Y; Zhang, R; Hu, R; Ruan, X. Prediction of CCR5 receptor binding affinity of substituted 1-(3,3-diphenylpropyl)-piperidinyl amides and ureas based on the heuristic method, support vector machine and projection pursuit regression. Eur. J. Med. Chem. **2009**, 44, 25–34.
- Lu, WJ; Chen, YL; Ma, WP; Zhang, XY; Luan, F; Liu, MC; Chen, XG; Hu, ZD. QSAR study of neuraminidase inhibitors based on heuristic method and radial basis function network. Eur. J. Med. Chem. **2008**, 43, 569–576.
- Xia, B; Ma, W; Zheng, B; Zhang, X; Fan, B. Quantitative structure-activity relationship studies of a series of non-benzodiazepine structural ligands binding to benzodiazepine receptor. Eur. J. Med. Chem. **2008**, 43, 1489–1498.
- Zhao, C; Zhang, H; Luan, F; Zhang, R; Liu, M; Hu, Z; Fan, B. QSAR method for prediction of protein-peptide binding affinity: application to MHC class I molecule HLA-A*0201. J. Mol. Graph. Model. **2007**, 26, 246–254.
- Rebehmed, J; Barbault, F; Teixeira, C; Maurel, F. 2D and 3D QSAR studies of diarylpyrimidine HIV-1 reverse transcriptase inhibitors. J. Comput. Aided Mol. Des. **2008**, 22, 831–841.
- Agrafiotis, DK; Gibbs, AC; Zhu, F; Izrailev, S; Martin, E. Conformational sampling of bioactive molecules: a comparative study. J. Chem. Inf. Model. **2007**, 47, 1067–1086.
- Si, HZ; Wang, T; Zhang, KJ; Hu, ZD; Fan, BT. QSAR study of 1,4-dihydropyridine calcium channel antagonists based on gene expression programming. Bioorg. Med. Chem. **2006**, 14, 4834–4841.
- Li, X; Luan, F; Si, H; Hu, Z; Liu, M. Prediction of retention times for a large set of pesticides or toxicants based on support vector machine and the heuristic method. Toxicol. Lett. **2007**, 175, 136–144.
- Gong, ZG; Zhang, RS; Xia, BB; Hu, RJ; Fan, BT. Study of nematic transition temperatures in thermotropic liquid crystal using heuristic method and radial basis function neural networks and support vector machine. QSAR Comb. Sci. **2008**, 27, 1282–1290.
- Yuan, YN; Zhang, RS; Hu, RJ; Ruan, XF. Prediction of CCR5 receptor binding affinity of substituted 1-(3,3-diphenylpropyl)-piperidinyl amides and ureas based on the heuristic method, support vector machine and projection pursuit regression. Eur. J. Med. Chem. **2009**, 44, 25–34.
- Xia, BB; Liu, KP; Gong, ZG; Zheng, B; Zhang, XY; Fan, BT. Rapid toxicity prediction of organic chemicals to Chlorella vulgaris using quantitative structure-activity relationships methods. Ecotoxicol. Environ. Saf. **2009**, 72, 787–794.
- Luan, F; Si, HZ; Liu, HT; Wen, YY; Zhang, XY. Prediction of atmospheric degradation data for POPs by gene expression programming. SAR QSAR Environ. Res. **2008**, 19, 465–479.
- Xia, BB; Ma, WP; Zheng, B; Zhang, XY; Fan, BT. Quantitative structure-activity relationship studies of a series of non-benzodiazepine structural ligands binding to benzodiazepine receptor. Eur. J. Med. Chem. **2008**, 43, 1489–1498.
- Wang, T; Si, HZ; Chen, PP; Zhang, KJ; Yao, XJ. QSAR models for the dermal penetration of polycyclic aromatic hydrocarbons based on Gene Expression Programming. QSAR Comb. Sci. **2008**, 27, 913–921.
- Liu, KP; Xia, BB; Ma, WP; Zheng, B; Zhang, XY; Fan, BT. Quantitative structure-activity relationship modeling of triaminotriazine drugs based on Heuristic Method. QSAR Comb. Sci. **2008**, 27, 425–431.
- Lu, WJ; Chen, YL; Ma, WP; Zhang, XY; Luan, F; Liu, MC; Chen, XG; Hu, ZD. QSAR study of neuraminidase inhibitors based on heuristic method and radial basis function network. Eur. J. Med. Chem. **2008**, 43, 569–576.
- Zhao, CY; Zhang, HX; Luan, F; Zhang, RS; Liu, MC; Hu, ZD; Fan, BT. QSAR method for prediction of protein-peptide binding affinity: Application to MHC class I molecule HLA-A*0201. J. Mol. Graph. Model. **2007**, 26, 246–254.
- Li, HZ; Liu, HX; Yao, XJ; Liu, MC; Hu, ZD; Fan, BT. Quantitative structure-activity relationship study of acyl ureas as inhibitors of human liver glycogen phosphorylase using least squares support vector machines. Chemometr. Intel. Lab. Syst. **2007**, 87, 139–146.
- Qin, S; Liu, HX; Wang, J; Yao, XJ; Liu, MC; Hu, ZD; Fan, BT. Quantitative Structure-Activity Relationship study on a series of novel ligands binding to central benzodiazepine receptor by using the combination of Heuristic Method and Support Vector Machines. QSAR Comb. Sci. **2007**, 26, 443–451.
- Ma, WP; Luan, F; Zhao, CY; Zhang, XY; Liu, MC; Hu, ZD; Fan, BT. QSAR prediction of the penetration of drugs across a polydimethylsiloxane membrane. QSAR Comb. Sci. **2006**, 25, 895–904.
- Luan, F; Ma, WP; Zhang, XY; Zhang, HX; Liu, MC; Hu, ZD; Fan, BT. Quantitative structure-activity relationship models for prediction of sensory irritants (logRD(50)) of volatile organic chemicals. Chemosphere **2006**, 63, 1142–1153.
- Si, HZ; Yao, XJ; Liu, HX; Wang, J; Li, JZ; Hu, ZD; Liu, MC. Prediction of binding rate of drug to human plasma protein based on heuristic method and support vector machine. Acta Chim. Sinica **2006**, 64, 415–422.
- Luan, F; Ma, WP; Zhang, XY; Zhang, HX; Liu, MC; Hu, ZD; Fan, BT. QSAR study of polychlorinated dibenzodioxins, dibenzofurans, and Biphenyls using the heuristic method and support vector machine. QSAR Comb. Sci. **2006**, 25, 46–55.
- Gharagheizi, F; Tirandazi, B; Barzin, R. Estimation of aniline point temperature of pure hydrocarbons: A quantitative structure-property relationship approach. Ind. Eng. Chem. Res. **2009**, 48, 1678–1682.
- Riahi, S; Mousavi, MF; Ganjali, MR; Norouzi, P. Application of correlation ranking procedure and artificial neural networks in the modeling of liquid chromatographic retention times (tR) of various pesticides. Anal. Lett. **2008**, 41, 3364–3385.
- Du, HY; Wang, J; Hu, ZD; Yao, XJ; Zhang, XY. Prediction of fungicidal activities of rice blast disease based on least-squares support vector machines and project pursuit regression. J. Agric. Food Chem. **2008**, 56, 10785–10792.
- Gharagheizi, F; Mehrpooya, M. Prediction of some important physical properties of sulfur compounds using quantitative structure-properties relationships. Mol. Div. **2008**, 12, 143–155.
- Sattari, M; Gharagheizi, F. Prediction of molecular diffusivity of pure components into air: A QSPR approach. Chemosphere **2008**, 72, 1298–1302.
- Gharagheizi, F; Alamdari, RF. Prediction of flash point temperature of pure components using a Quantitative Structure-Property Relationship model. QSAR Comb. Sci. **2008**, 27, 679–683.
- Gharagheizi, F; Fazeli, A. Prediction of the Watson characterization factor of hydrocarbon components from molecular properties. QSAR Comb. Sci. **2008**, 27, 758–767.
- Om, AS; Ryu, JC; Kim, JH. Quantitative structure-activity relationships for radical scavenging activities of flavonoid compounds by GA-MLR technique. Mol. Cell. Toxicol. **2008**, 4, 170–176.
- Riahi, S; Ganjali, MR; Pourbasheer, E; Norouzi, P. QSRR study of GC retention indices of essential-oil compounds by multiple linear regression with a genetic algorithm. Chromatographia **2008**, 67, 917–922.
- Hashemianzadeh, M; Safarpour, MA; Gholamjani-Moghaddam, K; Mehdipour, AR. DFT-based QSAR study of valproic acid and its derivatives. QSAR Comb. Sci. **2008**, 27, 469–474.
- Gharagheizi, F. A new molecular-based model for prediction of enthalpy of sublimation of pure components. Thermochim. Acta **2008**, 469, 8–11.
- Gharagheizi, F. QSPR studies for solubility parameter by means of Genetic Algorithm-Based Multivariate Linear Regression and generalized regression neural network. QSAR Comb. Sci. **2008**, 27, 165–170.
- Gharagheizi, F; Alamdari, RF. A molecular-based model for prediction of solubility of C-60 fullerene in various solvents. Fuller. Nanotub. Carbon Nanostr. **2008**, 16, 40–57.
- Carlucci, G; D'Archivio, AA; Maggi, MA; Mazzeo, P; Ruggieri, F. Investigation of retention behaviour of non-steroidal anti-inflammatory drugs in high-performance liquid chromatography by using quantitative structure-retention relationships. Anal. Chim. Acta **2007**, 601, 68–76.
- Gharagheizi, F. A new accurate neural network quantitative structure-property relationship for prediction of theta (lower critical solution temperature) of polymer solutions. E-Polymers **2007**.
- Elliott, GN; Worgan, H; Broadhurst, D; Draper, J; Scullion, J. Soil differentiation using fingerprint Fourier transform infrared spectroscopy, chemometrics and genetic algorithm-based feature selection. Soil Biol. Biochem. **2007**, 39, 2888–2896.
- Gharagheizi, F. QSPR analysis for intrinsic viscosity of polymer solutions by means of GA-MLR and RBFNN. Comput. Mater. Sci. **2007**, 40, 159–167.
- Deeb, O; Hemmateenejad, B; Jaber, A; Garduno-Juarez, R; Miri, R. Effect of the electronic and physicochemical parameters on the carcinogenesis activity of some sulfa drugs using QSAR analysis based on genetic-MLR and genetic-PLS. Chemosphere **2007**, 67, 2122–2130.
- Vatani, A; Mehrpooya, M; Gharagheizi, F. Prediction of standard enthalpy of formation by a QSPR model. Int. J. Mol. Sci. **2007**, 8, 407–432.
- Jung, M; Tak, J; Lee, Y; Jung, Y. Quantitative structure-activity relationship (QSAR) of tacrine derivatives against acetylcholinesterase (AChE) activity using variable selections. Bioorg. Med. Chem. Lett. **2007**, 17, 1082–1090.
- Fisz, JJ. Combined genetic algorithm and multiple linear regression (GA-MLR) optimizer: Application to multi-exponential fluorescence decay surface. J. Phys. Chem. A **2006**, 110, 12977–12985.
- Wold, H. Research Papers in Statistics; Wiley: New York, NY, USA, 1966.
- Jöreskog, KG; Wold, H. Systems under Indirect Observation: Causality, structure, prediction; North-Holland: Amsterdam, The Netherlands, 1982.
- Rogers, D; Hopfinger, AJ. Application of genetic function approximation to quantitative structure-activity-relationships and quantitative structure-property relationships. J. Chem. Inf. Comput. Sci. **1994**, 34, 854–866.
- Fan, Y; Shi, LM; Kohn, KW; Pommier, Y; Weinstein, JN. Quantitative structure-antitumor activity relationships of camptothecin analogues: Cluster analysis and genetic algorithm-based studies. J. Med. Chem. **2001**, 44, 3254–3263.
- Sammi, T; Silakari, O; Ravikumar, M. Three-dimensional quantitative structure-activity relationship (3D-QSAR) studies of various benzodiazepine analogues of gamma-secretase inhibitors. J. Mol. Model. **2009**, 15, 343–348.
- Li, ZG; Chen, KX; Xie, HY; Gao, JR. Quantitative structure-activity relationship analysis of some thiourea derivatives with activities against HIV-1 (IIIB). QSAR Comb. Sci. **2009**, 28, 89–97.
- Samee, W; Nunthanavanit, P; Ungwitayatorn, J. 3D-QSAR investigation of synthetic antioxidant chromone derivatives by molecular field analysis. Int. J. Mol. Sci. **2008**, 9, 235–246.
- Nunthanavanit, P; Anthony, NG; Johnston, BF; Mackay, SP; Ungwitayatorn, J. 3D-QSAR studies on chromone derivatives as HIV-1 protease inhibitors: Application of molecular field analysis. Arch. Der. Pharm. **2008**, 341, 357–364.
- Kansal, N; Silakari, O; Ravikumar, M. 3D-QSAR studies of various diaryl urea derivatives of multi-targeted receptor tyrosine kinase inhibitors: Molecular field analysis approach. Lett. Drug Des. Disc. **2008**, 5, 437–448.
- Joseph, TB; Kumar, B; Santhosh, B; Kriti, S; Pramod, AB; Ravikumar, M; Kishore, M. Quantitative structure activity relationship and pharmacophore studies of adenosine receptor A(2B) inhibitors. Chem. Biol. Drug Des. **2008**, 72, 395–408.
- Equbal, T; Silakari, O; Ravikumar, M. Exploring three-dimensional quantitative structural activity relationship (3D-QSAR) analysis of SCH 66336 (Sarasar) analogues of farnesyltransferase inhibitors. Eur. J. Med. Chem. **2008**, 43, 204–209.
- Bhonsle, JB; Bhattacharjee, AK; Gupta, RK. Novel semi-automated methodology for developing highly predictive QSAR models: application for development of QSAR models for insect repellent amides. J. Mol. Model. **2007**, 13, 179–208.
- Thomas Leonard, J; Roy, K. Comparative QSAR modeling of CCR5 receptor binding affinity of substituted 1-(3,3-diphenylpropyl)-piperidinyl amides and ureas. Bioorg. Med. Chem. Lett. **2006**, 16, 4467–4474.
- Roy, K; Leonard, JT. Topological QSAR modeling of cytotoxicity data of anti-HIV 5-phenyl-1-phenylamino-imidazole derivatives using GFA, G/PLS, FA and PCRA techniques. Indian J. Chem. Sect. A-Inorg. Bio-Inorg. Phys. Theor. Anal. Chem. **2006**, 45, 126–137.
- Davies, MN; Hattotuwagama, CK; Moss, DS; Drew, MGB; Flower, DR. Statistical deconvolution of enthalpic energetic contributions to MHC-peptide binding affinity. BMC Struct. Biol. **2006**, 6.
- Mandal, AS; Roy, K. Predictive QSAR modeling of HIV reverse transcriptase inhibitor TIBO derivatives. Eur. J. Med. Chem. **2009**, 44, 1509–1524.
- Leonard, JT; Roy, K. Exploring molecular shape analysis of styrylquinoline derivatives as HIV-1 integrase inhibitors. Eur. J. Med. Chem. **2008**, 43, 81–92.
- Roy, K; Ghosh, G. QSTR with extended topochemical atom (ETA) indices 8.(a) QSAR for the inhibition of substituted phenols on germination rate of Cucumis sativus using chemometric tools. QSAR Comb. Sci. **2006**, 25, 846–859.
- Leonard, JT; Roy, K. Comparative QSAR modeling of CCR5 receptor binding affinity of substituted 1-(3,3-diphenylpropyl)-piperidinyl amides and ureas. Bioorg. Med. Chem. Lett. **2006**, 16, 4467–4474.
- Wold, S; Antti, H; Lindgren, F; Ohman, J. Orthogonal signal correction of near-infrared spectra. Chemometr. Intel. Lab. Syst. **1998**, 44, 175–185.
- Yin, PY; Mohemaiti, P; Chen, J; Zhao, XJ; Lu, X; Yimiti, A; Upur, H; Xu, GW. Serum metabolic profiling of abnormal savda by liquid chromatography/mass spectrometry. J. Chromatogr. B-Anal. Technol. Biomed. Life Sci. **2008**, 871, 322–327.
- Samadi-Maybodi, A; Darzi, S. Simultaneous determination of vitamin B12 and its derivatives using some of multivariate calibration 1 (MVC1) techniques. Spectrochim. Acta A-Mol. Biomol. Spectrosc. **2008**, 70, 1167–1172.
- Niazi, A; Jafarian, B; Ghasemi, J. Kinetic spectrophotometric determination of trace amounts of palladium by whole kinetic curve and a fixed time method using resazurine sulfide reaction. Spectrochim. Acta A-Mol. Biomol. Spectrosc. **2008**, 71, 841–846.
- Niazi, A; Goodarzi, M. Orthogonal signal correction-partial least squares method for simultaneous spectrophotometric determination of cypermethrin and tetramethrin. Spectrochim. Acta A-Mol. Biomol. Spectrosc. **2008**, 69, 1165–1169.
- Niazi, A; Amjadi, E; Nori-Shargh, D; Bozorghi, SJ. Simultaneous voltammetric determination of lead and tin by adsorptive differential pulse stripping method and orthogonal signal correction-partial least squares in water samples. J. Chinese Chem. Soc. **2008**, 55, 276–285.
- Karimi, MA; Ardakani, MM; Behjatmanesh-Ardakani, R; Nezhad, MRH; Amiryan, H. Individual and simultaneous determinations of phenothiazine drugs using PCR, PLS and (OSC)-PLS multivariate calibration methods. J. Serb. Chem. Soc. **2008**, 73, 233–247.
- Cho, HW; Kim, SB; Jeong, MK; Park, Y; Miller, NG; Ziegler, TR; Jones, DP. Discovery of metabolite features for the modelling and analysis of high-resolution NMR spectra. Int. J. Data Min. Bioinf. **2008**, 2, 176–192.
- Cheng, Z; Zhu, AS; Zhang, LQ. Quantitative analysis of electronic absorption spectroscopy by piecewise orthogonal signal correction and partial least square. Guang Pu Xue Yu Guang Pu Fen Xi **2008**, 28, 860–864.
- Cheng, Z; Zhu, AS; Zhang, LQ. Quantitative analysis of electronic absorption spectroscopy by piecewise orthogonal signal correction and partial least square. Spectrosc. Spectr. Anal. **2008**, 28, 860–864.
- Cheng, Z; Zhu, AS. Piecewise orthogonal signal correction approach and its application in the analysis of wheat near-infrared spectroscopic data. Chinese J. Anal. Chem. **2008**, 36, 788–792.
- Rouhollahi, A; Rajabzadeh, R; Ghasemi, J. Simultaneous determination of dopamine and ascorbic acid by linear sweep voltammetry along with chemometrics using a glassy carbon electrode. Microchim. Acta **2007**, 157, 139–147.
- Psihogios, NG; Kalaitzidis, RG; Dimou, S; Seferiadis, KI; Siamopoulos, KC; Bairaktari, ET. Evaluation of tubulointerstitial lesions' severity in patients with glomerulonephritides: An NMR-Based metabonomic study. J. Proteome Res. **2007**, 6, 3760–3770.
- Niazi, A; Azizi, A; Leardi, R. A comparative study between PLS and OSC-PLS in the simultaneous determination of lead and mercury in water samples: effect of wavelength selection. Can. J. Anal. Sci. Spectrosc. **2007**, 52, 365–374.
- Priolo, N; Arribere, CM; Caffini, N; Barberis, S; Vazquez, RN; Luco, JM. Isolation and purification of cysteine peptidases from the latex of Araujia hortorum fruits - Study of their esterase activities using partial least-squares (PLS) modeling. J. Mol. Catal. B-Enzym. **2001**, 15, 177–189.
- Yang, SB; Xia, ZN; Shu, M; Mei, H; Lue, FL; Zhang, M; Wu, YQ; Li, ZL. VHSEH Descriptors for the Development of QSAMs of Peptides. Chem. J. Chinese Univ. **2008**, 29, 2213–2217.
- Liang, GZ; Mei, H; Zhou, Y; Yang, SB; Wu, SR; Li, ZL. Using SZOTT descriptors for the development of QSAMs of peptides. Chem. J. Chinese Univ. **2006**, 27, 1900–1902.
- Zhao, CY; Boriani, E; Chana, A; Roncaglioni, A; Benfenati, E. A new hybrid system of QSAR models for predicting bioconcentration factors (BCF). Chemosphere **2008**, 73, 1701–1707.
- Qi, J; Niu, JF; Wang, LL. Research on QSPR for n-octanol-water partition coefficients of organic compounds based on genetic algorithms-support vector machine and genetic algorithms-radial basis function neural networks. Huanjing Kexue **2008**, 29, 212–218.
- Luan, F; Liu, HT; Wen, YY; Zhang, XY. Prediction of quantitative calibration factors of some organic compounds in gas chromatography. Analyst **2008**, 133, 881–887.
- Luan, F; Liu, HT; Wen, YY; Zhang, XY. Quantitative structure-property relationship study for estimation of quantitative calibration factors of some organic compounds in gas chromatography. Anal. Chim. Acta **2008**, 612, 126–135.
- Luan, F; Liu, HT; Ma, WP; Fan, BT. QSPR analysis of air-to-blood distribution of volatile organic compounds. Ecotoxicol. Environ. Saf. **2008**, 71, 731–739.
- Chen, HF. Quantitative predictions of gas chromatography retention indexes with support vector machines, radial basis neural networks and multiple linear regression. Anal. Chim. Acta **2008**, 609, 24–36.
- Zhao, CY; Zhang, HX; Zhang, XY; Liu, MC; Hu, ZD; Fan, BT. Application of support vector machine (SVM) for prediction toxic activity of different data sets. Toxicology **2006**, 217, 105–119.
- Tetko, IV; Solov’ev, VP; Antonov, AV; Yao, XJ; Doucet, JP; Fan, BT; Hoonakker, F; Fourches, D; Jost, P; Lachiche, N; Varnek, A. Benchmarking of linear and nonlinear approaches for quantitative structure-property relationship studies of metal complexation with ionophores. J. Chem. Inf. Model. **2006**, 46, 808–819.
- Shi, J; Luan, F; Zhang, HX; Liu, MC; Guo, QX; Hu, ZD; Fan, BT. QSPR study of fluorescence wavelengths (lambda(ex)/lambda(em)) based on the heuristic method and radial basis function neural networks. QSAR Comb. Sci. **2006**, 25, 147–155.
- Ma, WP; Luan, F; Zhang, HX; Zhang, XY; Liu, MC; Hu, ZD; Fan, BT. Accurate quantitative structure-property relationship model of mobilities of peptides in capillary zone electrophoresis. Analyst **2006**, 131, 1254–1260.
- Luan, F; Zhang, X; Zhang, H; Zhang, R; Liu, M; Hu, Z; Fan, B. Prediction of standard Gibbs energies of the transfer of peptide anions from aqueous solution to nitrobenzene based on support vector machine and the heuristic method. J. Comput. Aided Mol. Des. **2006**, 20, 1–11.
- Specht, DF. A general regression neural network. IEEE Trans. Neur. Netw. **1991**, 2, 568–576.
- Szaleniec, M; Tadeusiewicz, R; Witko, M. How to select an optimal neural model of chemical reactivity? Neurocomputing **2008**, 72, 241–256.
- Ji, L; Wang, XD; Luo, S; Qin, LA; Yang, XS; Liu, SS; Wang, LS. QSAR study on estrogenic activity of structurally diverse compounds using generalized regression neural network. Sci. China Ser. B-Chem. **2008**, 51, 677–683.
- Mager, PP. Subset selection and docking of human P2X7 inhibitors. Curr. Comput. Aided Drug Des. **2007**, 3, 248–253.
- Ibric, S; Jovanovic, M; Djuric, Z; Parojcic, J; Solomun, L; Lucic, B. Generalized regression neural networks in prediction of drug stability. J. Pharm. Pharmacol. **2007**, 59, 745–750.
- Yap, CW; Li, ZR; Chen, YZ. Quantitative structure-pharmacokinetic relationships for drug clearance by using statistical learning methods. J. Mol. Graph. Model. **2006**, 24, 383–395.
- Agatonovic-Kustrin, S; Turner, JV. Artificial neural network modeling of phytoestrogen binding to estrogen receptors. Lett. Drug Des. Disc. **2006**, 3, 436–442.
- Cortes, C; Vapnik, V. Support-Vector Networks. Mach. Learn. **1995**, 20, 273–297.
- Vapnik, V. The Support Vector method of function estimation. US Patent 5,950,146 **1999**.
- Wang, WJ; Xu, ZB; Lu, WZ; Zhang, XY. Determination of the spread parameter in the Gaussian kernel for classification and regression. Neurocomputing **2003**, 55.
- Suykens, JAK; Vandewalle, J. Least squares support vector machine classifiers. Neural Process. Lett. **1999**, 9, 293–300.
- Yuan, YN; Zhang, RS; Hu, RJ; Ruan, XF. Prediction of volatile components retention time in blackstrap molasses by least-squares support vector machine. QSAR Comb. Sci. **2008**, 27, 535–542.
- Niazi, A; Jameh-Bozorghi, S; Nori-Shargh, D. Prediction of toxicity of nitrobenzenes using ab initio and least squares support vector machines. J. Hazard. Mater
**2008**, 151, 603–609. [Google Scholar] - Goudarzi, N; Goodarzi, M. Prediction of the logarithmic of partition coefficients (log P) of some organic compounds by least square-support vector machine (LS-SVM). Mol Phys
**2008**, 106. [Google Scholar] - Liu, H; Papa, E; Walker, JD; Gramatica, P. In silico screening of estrogen-like chemicals based on different nonlinear classification models. J. Mol. Graph. Model
**2007**, 26, 135–144. [Google Scholar] - Liu, HX; Yao, XJ; Zhang, RS; Liu, MC; Hu, ZD; Fan, BT. Prediction of the tissue/blood partition coefficients of organic compounds based on the molecular structure using least-squares support vector machines. J. Comput. Aided Mol. Des
**2005**, 19, 499–508. [Google Scholar] - Si, HZ; Wang, T; Zhang, KJ; Duan, YB; Yuan, SP; Fu, AP; Hu, ZD. Quantitative structure activity relationship model for predicting the depletion percentage of skin allergic chemical substances of glutathione. Anal. Chim. Acta
**2007**, 591, 255–264. [Google Scholar] - Si, HZ; Zhang, KJ; Hu, ZD; Fan, BT. QSAR model for prediction capacity factor of molecular imprinting polymer based on gene expression programming. QSAR Comb. Sci
**2007**, 26, 41–50. [Google Scholar] - Si, HZ; Yuan, SP; Zhang, KJ; Fu, AP; Duan, YB; Hue, ZD. Quantitative structure activity relationship study on EC5.0 of anti-HIV drugs. Chemometr. Intel. Lab. Syst
**2008**, 90, 15–24. [Google Scholar] - Gepsoft Home Page. Available online: http://www.gepsoft.com/ (accessed on 10 March 2009).
- Friedman, JH; Stuetzle, W. Projection Pursuit Regression. J. Am. Stat. Assoc
**1981**, 76, 817–823. [Google Scholar] - Yuan, YN; Zhang, RS; Hu, RJ. Prediction of Photolysis of PCDD/Fs Adsorbed to Spruce [Picea abies (L.) Karst.] Needle Surfaces Under Sunlight Irradiation Based on Projection Pursuit Regression. QSAR Comb. Sci
**2009**, 28, 155–162. [Google Scholar] - Du, HY; Zhang, XY; Wang, X; Yao, XJ; Hu, ZD. Novel approaches to predict the retention of histidine-containing peptides in immobilized metal-affinity chromatography. Proteomics
**2008**, 8, 2185–2195. [Google Scholar] - Du, HY; Wang, J; Zhang, XY; Hu, ZD. A novel quantitative structure-activity relationship method to predict the affinities of MT3 melatonin binding site. Eur. J. Med. Chem
**2008**, 43, 2861–2869. [Google Scholar] - Du, HY; Wang, J; Watzl, J; Zhang, XY; Hu, ZD. Prediction of inhibition of matrix metalloproteinase inhibitors based on the combination of Projection Pursuit Regression and Grid Search method. Chemometr. Intel. Lab. Syst
**2008**, 93, 160–166. [Google Scholar] - Ren, YY; Liu, HX; Yao, XJ; Liu, MC. Prediction of ozone tropospheric degradation rate constants by projection pursuit regression. Anal. Chim. Acta
**2007**, 589, 150–158. [Google Scholar] - Ren, YY; Liu, HX; Li, SY; Yao, XJ; Liu, MC. Prediction of binding affinities to beta(1) isoform of human thyroid hormone receptor by genetic algorithm and projection pursuit regression. Bioorg. Med. Chem. Lett
**2007**, 17, 2474–2482. [Google Scholar] - Gunturi, SB; Archana, K; Khandelwal, A; Narayanan, R. Prediction of hERG Potassium Channel Blockade Using kNN-QSAR and Local Lazy Regression Methods. QSAR Comb. Sci
**2008**, 27, 1305–1317. [Google Scholar] - Du, HY; Watzl, J; Wang, J; Zhang, XY; Yaol, XJ; Hu, ZD. Prediction of retention indices of drugs based on immobilized artificial membrane chromatography using Projection Pursuit Regression and Local Lazy Regression. J. Sep. Sci
**2008**, 31, 2325–2333. [Google Scholar] - Guha, R; Dutta, D; Jurs, PC; Chen, T. Local lazy regression: Making use of the neighborhood to improve QSAR predictions. J. Chem. Inf. Model
**2006**, 46, 1836–1847. [Google Scholar]

© 2009 by the authors; licensee Molecular Diversity Preservation International, Basel, Switzerland. This article is an open-access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).

## Share and Cite

**MDPI and ACS Style**

Liu, P.; Long, W.
Current Mathematical Methods Used in QSAR/QSPR Studies. *Int. J. Mol. Sci.* **2009**, *10*, 1978-1998.
https://doi.org/10.3390/ijms10051978
