Theoretical Prediction of the Complex P-Glycoprotein Substrate Efflux Based on the Novel Hierarchical Support Vector Regression Scheme

P-glycoprotein (P-gp), a membrane-bound transporter, can eliminate xenobiotics by transporting them out of the cells or blood–brain barrier (BBB) at the expense of ATP hydrolysis. Thus, P-gp mediated efflux plays a pivotal role in altering the absorption and disposition of a wide range of substrates. Nevertheless, the mechanism of P-gp substrate efflux is rather complex since it can take place through active transport and passive permeability in addition to multiple P-gp substrate binding sites. A nonlinear quantitative structure–activity relationship (QSAR) model was developed in this study using the novel machine learning-based hierarchical support vector regression (HSVR) scheme to explore the perplexing relationships between descriptors and efflux ratio. The predictions by HSVR were found to be in good agreement with the observed values for the molecules in the training set (n = 50, r2 = 0.96, qCV2 = 0.94, RMSE = 0.10, s = 0.10) and test set (n = 13, q2 = 0.80–0.87, RMSE = 0.21, s = 0.22). When subjected to a variety of statistical validations, the developed HSVR model consistently met the most stringent criteria. A mock test also asserted the predictivity of HSVR. Consequently, this HSVR model can be adopted to facilitate drug discovery and development.


Introduction
Permeability glycoprotein also known as P-glycoprotein (P-gp), which belongs to the ATP-binding cassette (ABC) superfamily of transporters, can actively transport a wide range of structurally and mechanistically diverse endogenous and xenobiotic chemical agents across the cell membrane at the energy expense of ATP hydrolysis [1]. P-gp, a 170-kDa plasma membrane protein encoded by the multidrug resistance gene (MDR1/ABCB1), is expressed at high levels in various tissues such as blood-brain-barriers (BBB), gastrointestinal tract (GIT), liver, kidney, and placenta [2][3][4][5][6]. In addition, P-gp plays significant roles in cell and tissue detoxification and elimination of harmful substances per se [1]. For example, the accumulation of neurotoxic amyloid-β (Aβ) peptides in the brain represents a pathogenic hallmark of Alzheimer's disease (AD), which is the most common form of dementia in aging populations [7]. It has been found that the decreased clearance rather than production of Aβ is the primary formation of the deleterious Aβ plaques in the brain [8]. The decreased elimination of Aβ from the brain into the blood can be partially attributed to the dysfunction of P-gp function, leading to the progression of AD [9][10][11]. Furthermore, it has been shown that Aβ can downregulate the P-gp expression along with other transporters and consequently lead to further accelerated neurodegeneration [12]. Hence, it has been suggested to increase Aβ clearance from the brain by restoring P-gp function of BBB to reduce Aβ brain accumulation as a new strategy in the medical treatment of the early stages of AD [13,14].
Additionally, P-gp efflux can profoundly implicate the role of drug absorption, distribution, metabolism, excretion, and toxicity (ADME/Tox) [15] that can clinically alter the administrated drug efficacy or even lead to various adverse side-effects due to drug-drug interaction (DDI) in the case of polypharmacy [16]. For instance, rifampin can interact with the P-gp substrate digoxin, leading to a lower accumulation of digoxin, as demonstrated by a clinical study [17]. Moreover, it is of particular interest to observe the subtle role played by P-gp in the central nervous system (CNS) since P-gp can affect the BBB penetration and pharmacological activities of administrated drugs [18]. The CNS-related side-effects of non-CNS drugs can be eliminated by P-gp because of their limited BBB penetration [19,20]. For instance, the P-gp substrate loperamide, which is a long-acting anti-diarrheal agent by agonizing the µ-opioid receptor, does not cause any CNS side-effects when administrated alone due to the blockage of the BBB penetration by P-gp [21]. When co-administrated with the P-gp inhibitor quinidine, loperamide produces adverse respiratory depression without significant alteration of the plasma accumulation due to its central opioid effect [22]. Conversely, P-gp can restrict or even eliminate the entry of CNS-targeted drugs into the brain, resulting in the reduction of the clinical efficacy [23].
In addition to normal tissues and organs, various types of tumor can over-express P-gp, producing multidrug resistance (MDR) [24], in which a single drug causes a non-drug resistant cell or cell line to become cross-resistant to other pharmacologically unrelated drugs due to the increase of administrated drug efflux and the decrease of intracellular drug accumulation [25]. As a result, P-gp efflux remains a major obstacle in the success of various kinds of cancer treatment [26] as well as infectious diseases [3,27]. For instance, brain tumor is one of the leading forms of malignancy and one of highest causes of cancer-related mortality among young adults aged less than 40 years and children [28] and glioma is the most common type of primary brain cancer with limited survival time and rate [29]. The CNS penetration of cediranib, which is a tyrosine kinase inhibitor for the treatment of glioma, is severely limited by the P-gp active efflux [30]. Co-administration of P-gp inhibitors is conceptually plausible and yet infeasible to circumvent MDR because of ineffective P-gp inhibitors in practical clinical applications [31,32]. Alternatively, P-gp can be considered as an anti-target in pharmaceutical research [33] especially in the field of CNS-targeted therapeutics [34,35]. Nevertheless, not all of marketed drugs have to be P-gp non-substrates provided that their therapeutic index is large with respect to the P-gp efflux ratio (ER) [36,37]. For instance, risperidone and 9-hydroxyl risperidone are clinically approved therapeutic agents for the treatment of schizophrenia even though they are P-gp substrates [38]. Accordingly, it is conceivable to expect that quantitative measure, viz. P-gp substrate efflux ratio, is more clinically relevant than qualitative classification, viz. substrate/non-substrate classification.
Of various in vitro assays to measure the efflux ratio [39][40][41][42], the monolayer efflux assay is the most relevant to drug distribution and the most commonly used in practice [20], in which the polarized epithelial cells, such as Madin-Darby canine kidney (MDCK) cells, are transfected with the MDR1 gene, followed by measuring the ratios between basolateral-to-apical (B→A) apparent permeability (P app ) and apical-to-basolateral (A→B) P app [43].
Molecules 2018, 23, 1820 where P app is evaluated using the membrane surface area (A), initial dosing concentration of the test molecule (C 0 ) in the donor compartment, and the amount of molecule transported per time (dQ/dt) in the receiver compartment [44]. Normally, molecules with ER > 2 are classified as P-gp substrates [39]. In contrast to in vitro and in vivo assays, in silico approaches are usually swift, inexpensive, less labor intensive, and less time-consuming for drug discovery and ADME/Tox profiling [45,46]. In fact, numerous P-gp classification structure-activity relationship (CSAR) models have been published elsewhere , whereas in silico quantitative studies of efflux ratio are scant [69][70][71]. Nevertheless, it is highly challenging to accurately model P-gp-substrate interactions [72] since P-gp is highly promiscuous per se as the result of the fact that P-gp can undergo substantial conformational changes upon binding with various ligands as illustrated by Figure 1 of Leong et al. [73]. In addition, P-gp has multiple substrate binding sites, as reported [72,[74][75][76][77]. The mechanism of P-gp substrate efflux is far more complicated than P-gp-substrate interactions since P-gp substrate efflux can take place through various routes in that substrates can be actively transported by P-gp from the cytoplasm into the extracellular environment in an energy-dependent manner or through a protein channel positioned between the inner and outer leaflets of the lipid membrane, as illustrated by Figure 2 of Edwards [78]. In addition to active transport, P-gp substrates can also passively diffuse from the cytoplasm into the extracellular environment through transcellular diffusion and/or paracellular route, as illustrated by Figure 1 of Balimane et al. [79]. Notably, the P-gp substrate vinblastine, for instance, can be both passively diffused and actively transported [80]. As such, those modeling schemes employed by previously published investigations can only render the direct protein-ligand interactions and they are not suitable to model the efflux ratio. Conversely, quantitative structure-activity relationship (QSAR) schemes, which are a mathematic means to establish the relationship between biological activity and chemical characteristics, provide better approaches to model the efflux ratio since they can take into account any mechanisms that can occur through complex routes [81].
The complexity of P-gp mediated efflux can be problematic once the delicate roles played by those associated chemical features, viz. descriptors in QSAR models, are considered. For instance, inhibitors, modulators, and substrates can interact with P-gp using the hydrophobicity, hydrogen-bond acceptor (HBA), and hydrogen-bond donor (HBD) features [47,73,82]. Accordingly, hydrophobicity, HBA, and HBD can simultaneously enhance and reduce the P-gp efflux, and it is plausible to expect extremely nonlinear relationships between those chemical features and efflux ratio, suggesting that those linear models can yield significant prediction errors once applied to the test samples that are very different from their training patterns. where Papp is evaluated using the membrane surface area (A), initial dosing concentration of the test molecule (C0) in the donor compartment, and the amount of molecule transported per time (dQ/dt) in the receiver compartment [44]. Normally, molecules with ER > 2 are classified as P-gp substrates [39]. In contrast to in vitro and in vivo assays, in silico approaches are usually swift, inexpensive, less labor intensive, and less time-consuming for drug discovery and ADME/Tox profiling [45,46]. In fact, numerous P-gp classification structure-activity relationship (CSAR) models have been published elsewhere , whereas in silico quantitative studies of efflux ratio are scant [69][70][71]. Nevertheless, it is highly challenging to accurately model P-gp-substrate interactions [72] since P-gp is highly promiscuous per se as the result of the fact that P-gp can undergo substantial conformational changes upon binding with various ligands as illustrated by Figure 1 of Leong et al. [73]. In addition, P-gp has multiple substrate binding sites, as reported [72,[74][75][76][77]. The mechanism of P-gp substrate efflux is far more complicated than P-gp-substrate interactions since P-gp substrate efflux can take place through various routes in that substrates can be actively transported by P-gp from the cytoplasm into the extracellular environment in an energy-dependent manner or through a protein channel positioned between the inner and outer leaflets of the lipid membrane, as illustrated by Figure 2 of Edwards [78]. In addition to active transport, P-gp substrates can also passively diffuse from the cytoplasm into the extracellular environment through transcellular diffusion and/or paracellular route, as illustrated by Figure 1 of Balimane et al. [79]. Notably, the P-gp substrate vinblastine, for instance, can be both passively diffused and actively transported [80]. As such, those modeling schemes employed by previously published investigations can only render the direct protein-ligand interactions and they are not suitable to model the efflux ratio. Conversely, quantitative structure-activity relationship (QSAR) schemes, which are a mathematic means to establish the relationship between biological activity and chemical characteristics, provide better approaches to model the efflux ratio since they can take into account any mechanisms that can occur through complex routes [81].   The complexity of P-gp mediated efflux can be problematic once the delicate roles played by those associated chemical features, viz. descriptors in QSAR models, are considered. For instance, inhibitors, modulators, and substrates can interact with P-gp using the hydrophobicity, hydrogenbond acceptor (HBA), and hydrogen-bond donor (HBD) features [47,73,82]. Accordingly, hydrophobicity, HBA, and HBD can simultaneously enhance and reduce the P-gp efflux, and it is plausible to expect extremely nonlinear relationships between those chemical features and efflux ratio, suggesting that those linear models can yield significant prediction errors once applied to the test samples that are very different from their training patterns.
Thus, it seems extremely difficult, if not completely impossible, to develop a sound in silico model to predict the P-gp substrate efflux ratio to compressively take into account those critical factors mentioned above. A solution to such challenge, however, can be obtained by the novel hierarchical support vector regression (HSVR) scheme proposed by Leong et al. [83] because HSVR can render the complex and varied dependencies of descriptors. As such, HSVR can simultaneously possess the advantageous characteristics of a local model and a global model, viz. broader coverage of applicability domain and higher level of predictivity, respectively. Furthermore, HSVR is designated to circumvent the "mesa effect" [84] in that the performance of a developed model deteriorates dramatically when applied to extrapolated predictions as demonstrated elsewhere [85,86]. In other words, HSVR is insensitive to outliers as compared with the other predictive models that is of critical importance to a predictive model [87]. Herein, the objective of this investigation was to develop an accurate, fast, and predictive in silico model based on the HSVR scheme to predict the P-gp substrate efflux ratio to facilitate drug discovery to design molecules with a preferable ADME/Tox profile.

Data Compilation
More than 550 compounds were collected after comprehensive literature search. data curation was carefully carried out by eliminating those compounds: (i) with only qualitative array results (i.e. substrate or non-substrate); (ii) without specific ER values; or (iii) chemical structures. In addition, cells used to express P-gp protein also play a significant role in determining ER values. For instance, the measured ER values of astemizole were 2.16 and 0.6 assayed in MDCK and human colon Thus, it seems extremely difficult, if not completely impossible, to develop a sound in silico model to predict the P-gp substrate efflux ratio to compressively take into account those critical factors mentioned above. A solution to such challenge, however, can be obtained by the novel hierarchical support vector regression (HSVR) scheme proposed by Leong et al. [83] because HSVR can render the complex and varied dependencies of descriptors. As such, HSVR can simultaneously possess the advantageous characteristics of a local model and a global model, viz. broader coverage of applicability domain and higher level of predictivity, respectively. Furthermore, HSVR is designated to circumvent the "mesa effect" [84] in that the performance of a developed model deteriorates dramatically when applied to extrapolated predictions as demonstrated elsewhere [85,86]. In other words, HSVR is insensitive to outliers as compared with the other predictive models that is of critical importance to a predictive model [87]. Herein, the objective of this investigation was to develop an accurate, fast, and predictive in silico model based on the HSVR scheme to predict the P-gp substrate efflux ratio to facilitate drug discovery to design molecules with a preferable ADME/Tox profile.

Data Compilation
More than 550 compounds were collected after comprehensive literature search. data curation was carefully carried out by eliminating those compounds: (i) with only qualitative array results (i.e., substrate or non-substrate); (ii) without specific ER values; or (iii) chemical structures. In addition, cells used to express P-gp protein also play a significant role in determining ER values. For instance, the measured ER values of astemizole were 2.16 and 0.6 assayed in MDCK and human colon adenocarcinoma (Caco-2) cells, respectively [51,88]. Of various assayed cells, 63 molecules tested in MDCK cells were selected from various sources [23,39,[88][89][90][91][92][93][94][95][96][97][98][99][100][101] since it constituted the largest amount of data. The data size is seemingly small since several CSAR models have been derived based on rather large amounts of data. For instance, Li et al. [66] built various predictive models based on 423 P-gp substrates and 399 non-substrates compiled from numerous sources. Nevertheless, their data were generated from different assay conditions (e.g., different cell lines), leading to high levels of data heterogeneity. QSAR models, conversely, are vulnerable to data inhomogeneity [102]. Additionally, some molecules such as selenium-containing ones [103] were excluded because their topological descriptors, for instance, cannot be enumerated. Those ER values were discarded when they were not consistent with their measured P app (B→A) and P app (A→B) values [104]. Recently, the efflux ratios of more than 4000 Amgen in-house compounds were measured [105]. It is plausible to expect that the great sample amount and data consistency can furnish a good ER pool. Unfortunately, those chemical structures are proprietary, leading to the fact that there are only limited quantitative data with chemical structures available in the public domain to date. Those factors partially contribute to the fact that there is no genuine QSAR model has been published.
As such, only very limited data samples with available chemical structures and consistent assay conditions were recruited in this study to maximize the structural diversity and to maintain data homogeneity after purging inappropriate data based on above-mentioned criteria. Table S1 lists the SMILES strings, CAS registry numbers, efflux ratio values, and literature references of all molecules collected in this study.

Data Partition
Of all molecules adopted in this study, 50 and 13 molecules were randomly assigned to the training set and test set, respectively, with a ca. 4:1 ratio as suggested [106]. Figure S1 displays the projection of all molecules enrolled in this investigation in chemical space, spanned by the first three principal components (PCs), explaining 94.6% of the variance in the original data. As illustrated, both datasets exhibited high levels of similarity in the chemical space. Furthermore, the high levels of biological and chemical similarity between both datasets can also be validated by Figure S2, which shows the histograms of log ER, molecular weight (MW), polar surface area (PSA), number of HBA, and number of HBD in density form for all molecules in the training set and test set. Thus, it can be asserted that there was no substantial bias in datasets.

SVRE
Of all generated SVR models using various combinations of descriptors and runtime parameters, three SVR models, denoted by SVR A, SVR B, and SVR C, were assembled to construct the SVR ensemble, which was further subjected to regression by another SVR to generate the HSVR model. Table S2 summarizes the optimal runtime parameters of SVR A, SVR B, and SVR C. These three SVR models, which adopted 4, 6, and 3 descriptors (Table 1), respectively, were selected based on their individual performances on all molecules and statistical analyses in the training set and test set. Table S1 lists the predicted log ER values. Tables 2 and 3 summarize the associated statistical analyses of these three SVR models in the training set and test set, respectively. Figures 1 and 2 display the scatter plots of observed versus the predicted log ER values by SVR A, SVR B, and SVR C for the molecules in the training set and test set, respectively.   (Table 2). Of 50 training samples, SVR A, SVR B, and SVR C gave rise to 28, 3, and 2 predictions, which deviated from the experimental values by more than 0.10, respectively. It can be further observed in Figure 1 that most of the points predicted by SVR C generally lie on or are closer to the regression line when compared with Molecules 2018, 23, 1820 6 of 25 SVR A and SVR B. As a result, SVR C produced the lowest MAE (0.02), s (0.06), and RMSE (0.06) and the highest r 2 parameter (0.98), suggesting that SVR C performed better than SVR A and SVR B for the molecules in the training set. Nevertheless, the predictions of quinidine (48) by SVR A, SVR B and SVR C unanimously yielded the maximum residuals of 0.32, 0.51 and 0.40, respectively, denoting that SVR A executed better than SVR B and SVR C. Table 2. Statistic evaluations, namely correlation coefficient (r 2 ), maximum residual (∆ Max ), mean absolute error (MAE), standard deviation (s), RMSE, and 10-fold cross-validation correlation coefficient (q 2 CV ) evaluated by SVR A, SVR B, SVR C, and HSVR in the training set. The predictions by SVR A, SVR B, and SVR C in the test set are also in good agreement with the experimental values ( Figure 2). Nevertheless, most of the residuals obtained by the three SVR models in the test set are more than 0.15 (11, 11, and 8, respectively). It can be further observed in Table 3 that the mean absolute errors computed by SVR A, SVR B, and SVR C unequivocally increase from 0.11, 0.07, and 0.02 in the training set to 0.29, 0.22, and 0.24 in test set, respectively. The other statistical parameters also suggest that the performances of these three models in the SVRE slightly decline from the training set to the test set (Tables 2 and 3). The maximum residual computed by SVR C in the test set was yielded from the prediction of cimetidine (13) with an absolute residual of 0.55, which were only 0.34 and 0.10 by SVR A and SVR B, respectively. Similarly, vinblastine (58) was best predicted by SVR C with an absolute residual of 0.01, whereas SVR A and SVR B gave rise to absolute errors of 0.60 and 0.41, respectively.  (Tables 2 and 3). When subjected to the Y-scrambling test, SVR A, SVR B, and SVR C gave rise to the r 2 s values of 0.02, 0.03, and 0.03, respectively ( Table 1). The almost zero values of r 2 s as well as substantial differences between corresponding r 2 and r 2 s signify that those three SVR models in the ensemble are not the result of chance correlation [107]. Conversely, the substantial differences between r 2 and q 2 and between r 2 and q 2 CV imply the over-fitting characteristics of these three models that actually can be further manifested by their small q 2 F1 , q 2 F2 , q 2 F3 , and CCC values ( Table 3). As a result, it is plausible to expect that these models are local models per se, which have limited coverage of applicability domain (vide infra) [108].

HSVR
The HSVR model was produced by the regression of the SVR ensemble based on the predictions of all molecules and statistical evaluations in the training set (Table S1 and Table 2). Table S2 lists the optimal runtime conditions for the final SVR model. It can be observed in Figure 1 that the HSVR model showed better prediction accuracy than SVR A, SVR B, and SVR C for the molecules in the training set because the distances between the predictions by HSVR and regression line are generally between the largest ones and smallest ones produced by its SVR counterparts in the ensemble. However, HSVR executed better than any of SVR models in the ensemble in some cases. The predictions of desloratadine (19) by SVR A, SVR B, SVR C, and HSVR, for instance, yielded absolute residuals of 0.10, 0.06, 0.01, and 0.00, respectively. Statistically, HSVR performed better than SVR A and SVR B, whereas SVR C, in turn, functioned negligibly better than HSVR, as manifested by those parameters listed in Table 2. For example, SVR A, SVR B, SVR C, and HSVR yielded the r 2 values of 0.95, 0.95, 0.98, and 0.96, respectively.
When applied to the test samples, HSVR only showed insignificant performance decreases from the training set to the test set. For instance, RMSE increased from 0.10 in the training set to 0.21 in the test set (Tables 2 and 3). However, the maximum residual declined from 0.45 in the training set to 0.42 in the test set. Figure 2 displays that HSVR showed better performance than SVR A, SVR B, and SVR C in the test set. The performance predominance of HSVR can be further manifested by those statistical parameters listed in Table 3. For instance, SVR A, SVR B, SVR C, and HSVR gave rise to MAE values of 0.29, 0.22, 0.24, and 0.17, respectively. Similar observation that HSVR generated smaller absolute residuals than its counterparts in the ensemble can also be found in the test set. The absolute prediction error of paliperidone (41), for instance, was 0.14 given rise by HSVR, whereas SVR A, SVR B, and SVR C produced residuals of 0.57, 0.25, and 0.30, respectively. When compared with its counterparts in the ensemble, HSVR generally produced consist and small errors in both training set and test set as manifested by those parameters associated with error listed in Tables 2 and 3, suggesting that HSVR has broader coverage of applicability domain. Additionally, HSVR yielded the smallest differences between r 2 and q 2 CV (0.02) and between r 2 and q 2 (0.13), indicating that HSVR was a well-trained model or no over-fitting effect was observed because it will otherwise produce at least one significant difference among those parameters. Similarly, the possibility of chance correlation of HSVR can be eliminated by Y-scrambling since it also produced an almost zero r 2 s (0.03) and marked difference between r 2 and r 2 s (Table 2) [107]. Figure 3 displays the scatter plots of the residual vs. the log ER values predicted by HSVR for the molecules in the training set and test set. It can be observed that the residuals are approximately evenly distributed on both sides of x-axis along the range of predicted values in both datasets, suggesting that there is no systematic error associated with the HSVR model [102]. The unbiased predictions can be further exhibited by its almost negligible average residuals that were −0.02 and −0.02 in the training set and test set, respectively (Table S1).

Predictive Evaluations
The predictivity of generated HSVR model was further evaluated by the validation requirements proposed by Golbraikh et al. [109], Ojha et al. [110], Roy et al. [111], and Chirico and Gramatica [112] (Equations (18)- (21)) in the training set and test set. Table 4 summarizes the results, from which it can be observed that HSVR maintained similar high levels of performance in the training set and test set. Additionally, HSVR fulfilled all validation requirements, indicating that this predictive model is highly accurate and predictive. Figure 3 displays the scatter plots of the residual vs. the log ER values predicted by HSVR for the molecules in the training set and test set. It can be observed that the residuals are approximately evenly distributed on both sides of x-axis along the range of predicted values in both datasets, suggesting that there is no systematic error associated with the HSVR model [102]. The unbiased predictions can be further exhibited by its almost negligible average residuals that were −0.02 and −0.02 in the training set and test set, respectively (Table S1). The predictivity of generated HSVR model was further evaluated by the validation requirements proposed by Golbraikh et al. [109], Ojha et al. [110], Roy et al. [111], and Chirico and Gramatica [112] (Equations (18)-(21)) in the training set and test set. Table 4 summarizes the results, from which it can be observed that HSVR maintained similar high levels of performance in the training set and test set. Additionally, HSVR fulfilled all validation requirements, indicating that this predictive model is highly accurate and predictive.

Mock Test
To mimic real world challenges, the developed HSVR model was further tested on the P-gp substrates assayed by Crivori et al. [51]. Of all marketed drugs measured by Crivori et al., 12 were also enrolled in this study, yielding a good way to calibrate the testing system. However, these molecules were measured in Caco-2 cells, whereas all of the molecules adopted in this study were tested in MDCK cells, suggesting that those compounds assayed by Crivori et al. are not qualified as the second external or test set since those validation criteria (vide supra) are not applicable to these compounds. To eliminate the discrepancy between both assay systems, the linear correlation between both assay systems for those common molecules was first inspected and the obtained scatter plot is illustrated in Figure 4. It can be observed that the experimental values in both systems were modestly correlated with each other well with an r value of 0.78. Thus, it is plausible to examine the HSVR model with those novel P-gp substrates assayed in Caco-2 cells. also enrolled in this study, yielding a good way to calibrate the testing system. However, these molecules were measured in Caco-2 cells, whereas all of the molecules adopted in this study were tested in MDCK cells, suggesting that those compounds assayed by Crivori et al. are not qualified as the second external or test set since those validation criteria (vide supra) are not applicable to these compounds. To eliminate the discrepancy between both assay systems, the linear correlation between both assay systems for those common molecules was first inspected and the obtained scatter plot is illustrated in Figure 4. It can be observed that the experimental values in both systems were modestly correlated with each other well with an r value of 0.78. Thus, it is plausible to examine the HSVR model with those novel P-gp substrates assayed in Caco-2 cells.

Discussion
Collectively, seven descriptors were adopted in this study. Intrinsically, the sample-todescriptor ratio was ca. 7:1, which is significantly larger than 5, viz. the minimal requirement to lessen the probability of chance correlations in a predictive model [113]. However, the process of P-gp substrate efflux is complex since it can take place thought various routes (vide supra). As such, different descriptors were adopted by different classification models. Of various descriptors selected by qualitative predictive models, hydrophobic, HBA, and HBD are the most frequently selected chemical features, as illustrated by the model proposed by Penzotti et al. [47]. However, the analysis of Amgen in-house compounds can reveal that HBD and topological PSA (tPSA) are the predominant factors associated with ER [105]. Figure 6 displays the average log ER for each histogram bin of HBD for all molecules selected in this study. It can be observed that the average log ER value initially increased with HBD when HBD was no more than 6 and then subsequently decreased when HBD was more than 6. Such positive dependence of log ER on HBD is, in fact, consistent with the analysis made by Hitchcock et al. [105]. However, those Amgen in-house compounds had HBD of no more than 5, leading to an only positive relationship between log ER and HBD. Such discrepancy in both systems can be conceivably attributed to the fact that the initial P-gp substrate binding can be enhanced by HBD as illustrated by the pharmacophore models of Penzotti et al. [47], whereas the consequent transport of the substrates into the extracellular environment can be hampered by too many HBDs, plausibly because of the increase in water desolvation energy [114] and the decrease in membrane fluidity [115]. As such, a

Discussion
Collectively, seven descriptors were adopted in this study. Intrinsically, the sample-to-descriptor ratio was ca. 7:1, which is significantly larger than 5, viz. the minimal requirement to lessen the probability of chance correlations in a predictive model [113]. However, the process of P-gp substrate efflux is complex since it can take place thought various routes (vide supra). As such, different descriptors were adopted by different classification models. Of various descriptors selected by qualitative predictive models, hydrophobic, HBA, and HBD are the most frequently selected chemical features, as illustrated by the model proposed by Penzotti et al. [47]. However, the analysis of Amgen in-house compounds can reveal that HBD and topological PSA (tPSA) are the predominant factors associated with ER [105]. Figure 6 displays the average log ER for each histogram bin of HBD for all molecules selected in this study. It can be observed that the average log ER value initially increased with HBD when HBD was no more than 6 and then subsequently decreased when HBD was more than 6. Such positive dependence of log ER on HBD is, in fact, consistent with the analysis made by Hitchcock et al. [105]. However, those Amgen in-house compounds had HBD of no more than 5, leading to an only positive relationship between log ER and HBD. Such discrepancy in both systems can be conceivably attributed to the fact that the initial P-gp substrate binding can be enhanced by HBD as illustrated by the pharmacophore models of Penzotti et al. [47], whereas the consequent transport of the substrates into the extracellular environment can be hampered by too many HBDs, plausibly because of the increase in water desolvation energy [114] and the decrease in membrane fluidity [115]. As such, a nonlinear relationship between HBD and log ER was yielded consequently. It has been observed that hydrophobicity, which normally can be represented by log P, plays an important role in P-gp-substrate interaction due to the hydrophobic nature of the substrate binding pocket, resulting in stronger P-gp substrate binding for those more hydrophobic substrates [116]. Nevertheless, the interaction between substrates and lipid bilayer as well as the release of substrates into the extracellular environment also depend on the hydrophobicity of substrates (vide supra), leading to a nonlinear relationship between log P and log ER. Figure 7 displays the average log ER for histogram bin of log P for all molecules enlisted in this study. It can be observed that the average log ER initially increased with log P when log P was smaller and decreased with log P when log P became higher. Such observation is qualitatively similar to the trend of Papp (A→B) found by Hitchcock et al. [105].  It has been observed that hydrophobicity, which normally can be represented by log P, plays an important role in P-gp-substrate interaction due to the hydrophobic nature of the substrate binding pocket, resulting in stronger P-gp substrate binding for those more hydrophobic substrates [116]. Nevertheless, the interaction between substrates and lipid bilayer as well as the release of substrates into the extracellular environment also depend on the hydrophobicity of substrates (vide supra), leading to a nonlinear relationship between log P and log ER. Figure 7 displays the average log ER for histogram bin of log P for all molecules enlisted in this study. It can be observed that the average log ER initially increased with log P when log P was smaller and decreased with log P when log P became higher. Such observation is qualitatively similar to the trend of P app (A→B) found by Hitchcock et al. [105].
Nevertheless, it is unusual to observe that log P was not included in this study, whereas the number of aromatic rings (n Ar ) was enlisted in this study. Such inconsistency can be realized by the fact that the average log P values increased with n Ar for all of molecules included in this study, as illustrated in Figure 8, which displays the average log P versus the distribution of n Ar . As such, it is plausible to replace log P by n Ar . Furthermore, it has been found that aromatic ring moieties are important in substrate recognition and efflux modulation [117,118]. More importantly, the empirical observation has indicated that models with the selection of n Ar unanimously showed better performance than those with the selection of log P (data not shown). pocket, resulting in stronger P-gp substrate binding for those more hydrophobic substrates [116]. Nevertheless, the interaction between substrates and lipid bilayer as well as the release of substrates into the extracellular environment also depend on the hydrophobicity of substrates (vide supra), leading to a nonlinear relationship between log P and log ER. Figure 7 displays the average log ER for histogram bin of log P for all molecules enlisted in this study. It can be observed that the average log ER initially increased with log P when log P was smaller and decreased with log P when log P became higher. Such observation is qualitatively similar to the trend of Papp (A→B) found by Hitchcock et al. [105].  Nevertheless, it is unusual to observe that log P was not included in this study, whereas the number of aromatic rings (nAr) was enlisted in this study. Such inconsistency can be realized by the fact that the average log P values increased with nAr for all of molecules included in this study, as illustrated in Figure 8, which displays the average log P versus the distribution of nAr. As such, it is plausible to replace log P by nAr. Furthermore, it has been found that aromatic ring moieties are important in substrate recognition and efflux modulation [117,118]. More importantly, the empirical observation has indicated that models with the selection of nAr unanimously showed better performance than those with the selection of log P (data not shown). The significant role of HBA in the P-gp-substrate interaction has been manifested by molecular docking simulations [71] as well as numerous qualitative models. Additionally, it has been suggested that HBA can enhance P-gp-mediated efflux [56]. Nevertheless, it is unusual to observe that none of SVR models in the ensemble has adopted HBA, plausibly because the descriptor number of nitrogen and oxygen atoms (nN+O) correlated well with HBA as demonstrated by Figure 9, which displays nN+O versus HBA. In fact, Desai et al. [56] adopted nN+O instead of HBA as the substrate classification criterion. Furthermore, empirical model development has shown that models with the selection of nN+O executed better than those with the selection of HBA (data not shown). As a result, the descriptor nN+O was selected in lieu of HBA. The significant role of HBA in the P-gp-substrate interaction has been manifested by molecular docking simulations [71] as well as numerous qualitative models. Additionally, it has been suggested that HBA can enhance P-gp-mediated efflux [56]. Nevertheless, it is unusual to observe that none of SVR models in the ensemble has adopted HBA, plausibly because the descriptor number of nitrogen and oxygen atoms (n N+O ) correlated well with HBA as demonstrated by Figure 9, which displays n N+O versus HBA. In fact, Desai et al. [56] adopted n N+O instead of HBA as the substrate classification criterion. Furthermore, empirical model development has shown that models with the selection of n N+O executed better than those with the selection of HBA (data not shown). As a result, the descriptor n N+O was selected in lieu of HBA.
The descriptor tPSA is a modified version to swiftly calculate the polar surface area only based on the additive polar surface areas [119]. The recursive partitioning (RP) model of Joung et al. [68] indicated the significant role of PSA in classifying molecules as P-gp substrates/non-substrates. Moreover, Hitchcock et al. also found the profound contribution of tPSA to P-gp mediated efflux (vide supra). Accordingly, the more sophisticated version of PSA was adopted in this study since it can function as polarity and hydrogen-bonding features [66]. The descriptor tPSA is a modified version to swiftly calculate the polar surface area only based on the additive polar surface areas [119]. The recursive partitioning (RP) model of Joung et al. [68] indicated the significant role of PSA in classifying molecules as P-gp substrates/non-substrates. Moreover, Hitchcock et al. also found the profound contribution of tPSA to P-gp mediated efflux (vide supra). Accordingly, the more sophisticated version of PSA was adopted in this study since it can function as polarity and hydrogen-bonding features [66].
It has been observed that the substrate size, which can be characterized by molecular weight (MW), molecular volume (Vm), and total surface area (SA), can have a large impact on P-gp-substrate interaction as well as passive permeability [120]. Nevertheless, it has been suggested that both Vm and SA can be better metrics to estimate the actual molecular size [121], and MW, conversely, was closely associated with Vm with an r 2 values of 0.98 for the molecules enlisted in this study. In fact, it has been postulated that Vm rather than MW is a better metric to associate with ER [122]. Accordingly, Vm and SA were adopted to render the size effects, whereas MW was discarded to reduce the probability of spurious correlations.
It has been found that P-gp substrates generally have more rotatable bonds than non-substrates since more flexible molecules can be more easily to adopt favorable orientation to interact with P-gp [49,66,123]. In fact, non-CNS drugs are more flexible than their CNS counterparts [23] since molecules with more conformational flexibility can favor the internal H-bond formation, which, in turn, can enhance the passive membrane permeability [124]. As such, substrate conformational flexibility, which can be characterized by the number of rotatable bond (nRot), can facilitate not only the active transport but also passive permeability of P-gp substrates, and nRot was adopted in this investigation.
Gunaydin et al. [69] only took into account the contribution of the differences between free energy in water (GH2o) and that in chloroform (GCHCl3), viz. ΔGH2O−CHCl3, since it was hypothesized that P-gp undergoes a conformation change from the intercellular-facing state to extracellular-facing state upon binding with substrates. As such, the transported substrates experience from a lipophilic environment into a hydrophilic one. In addition to ΔGH2O−CHCl3, the contribution of ΔGDMSO−CHCl3 was also computed in this study to mimic the assay conditions. Nevertheless, neither of the solvation free energy differences was selected in this study due to their insignificant contribution to ERs (data not shown), plausibly because the P-gp conformation change can only account for a small part of the whole complicated efflux process and, additionally, passive permeability is not resulted from the Pgp conformation change. The predictive model of Gunaydin et al. [102], nevertheless, was derived It has been observed that the substrate size, which can be characterized by molecular weight (MW), molecular volume (V m ), and total surface area (SA), can have a large impact on P-gp-substrate interaction as well as passive permeability [120]. Nevertheless, it has been suggested that both V m and SA can be better metrics to estimate the actual molecular size [121], and MW, conversely, was closely associated with V m with an r 2 values of 0.98 for the molecules enlisted in this study. In fact, it has been postulated that V m rather than MW is a better metric to associate with ER [122]. Accordingly, V m and SA were adopted to render the size effects, whereas MW was discarded to reduce the probability of spurious correlations.
It has been found that P-gp substrates generally have more rotatable bonds than non-substrates since more flexible molecules can be more easily to adopt favorable orientation to interact with P-gp [49,66,123]. In fact, non-CNS drugs are more flexible than their CNS counterparts [23] since molecules with more conformational flexibility can favor the internal H-bond formation, which, in turn, can enhance the passive membrane permeability [124]. As such, substrate conformational flexibility, which can be characterized by the number of rotatable bond (n Rot ), can facilitate not only the active transport but also passive permeability of P-gp substrates, and n Rot was adopted in this investigation.
Gunaydin et al. [69] only took into account the contribution of the differences between free energy in water (G H2 o) and that in chloroform (G CHCl3 ), viz. ∆G H2O−CHCl3 , since it was hypothesized that P-gp undergoes a conformation change from the intercellular-facing state to extracellular-facing state upon binding with substrates. As such, the transported substrates experience from a lipophilic environment into a hydrophilic one. In addition to ∆G H2O−CHCl3 , the contribution of ∆G DMSO−CHCl3 was also computed in this study to mimic the assay conditions. Nevertheless, neither of the solvation free energy differences was selected in this study due to their insignificant contribution to ERs (data not shown), plausibly because the P-gp conformation change can only account for a small part of the whole complicated efflux process and, additionally, passive permeability is not resulted from the P-gp conformation change. The predictive model of Gunaydin et al. [102], nevertheless, was derived only based on 12 marketed drugs that cannot comprehensively render the complex efflux. As such, more descriptors will be required in case of more diverse samples.
Didziapetris et al. [63] proposed the "rule-of-fours," which states that molecules with: (i) n N+O ≥ 8; (ii) MW > 400; and (iii) acid pK a > 4 are likely to be P-gp substrates. Of all molecule with ER > 2 selected in this study, viz. substrates, approximately 32%, 52%, and 100% can meet the criteria n N+O ≥ 8, MW > 400, and acid pK a > 4, respectively, and only 29% can completely fulfill those three criteria. Actually, Li et al. [66] also found that only ca. 34% of samples can simultaneously meet those three criteria. Furthermore, it is not unusual to observe that different rules have been proposed to classify molecules into P-gp substrates/non-substrates. Desai et al. [56], for instance, have proposed the molecules with TPSA >100 Å 2 and most basic pK a > 8 have higher probability to be substrates. The inconsistency in various proposed rules can be plausibly explained as those rules were derived only based on linear analyses of those P-gp substrates/non-substrates. However, such bisection is not always true, as manifested by the naïve Bayesian classifiers built by Li et al. [66]. In addition, the size and hydrophobicity of substrates can affect the substrate-membrane interactions nonlinearly [125]. Further complexity can be raised once the P-gp substrate efflux is considered instead of P-gp substrate/non-substrate classification since the P-gp substrate efflux can take place through various routes (vide supra), leading to nonlinear relationships between some descriptors and log ER, such as HBD and log P (Figures 6 and 7). Numerous attempts have been made in this study to develop various partial least square (PLS) models to accommodate the novel 2-QSAR scheme [86] and no satisfactory models were produced (data not shown). Conversely, the accurate and predictive HSVR can comprehensively describe such nonlinear dependence of log ER on descriptors.
Moreover, it has been observed that P-gp and other ABC members, namely breast cancer resistant protein (BCRP/ABCG2) and multidrug resistance-associated protein 4 (ABCC4/MRP4), play a critical role in BBB permeability [126], which can take place via various routes [127] in addition to the already complicated P-gp mediated efflux. As such, it is plausible to expect that it is extremely difficult to develop a sound in silico model to predict BBB permeability if not entirely impossible [128]. The development of an accurate in silico model in this study to predict the P-gp substrate efflux can pave the way to establish a sound theoretical model to predict the BBB permeability in the future. Most molecules adopted in this study are marketed drugs for treating various illnesses, such as HIV infection, allergy symptoms, rheumatoid arthritis, hypertension, diarrhea, and different types of cancer in addition to assorted CNS-related disorders (Supplementary Materials Table S1). The broad spectrum of therapeutic agents unequivocally indicates that the data samples are structurally diverse, which can be further manifested by the fact that the average minimum distance between two molecules, viz. the distance between two nearest neighbors, in the chemical space was 2.06 with an standard deviation of 1.39 and the maximum distance between two collected samples was 29.57 (Supplementary Materials Figure S1), giving rise to an ratio of ca. 1:14. As such, it is plausible to expect that developed HSVR should have a larger coverage of applicability domain accordingly, which is an important characteristic for a predictive model in practical application. More importantly, the derived HSVR model and published P-gp substrate/non-substrate classification models can work in a synergistic fashion, in which the latter can be used to identify those P-gp substrates and the former can be deployed to predict their efflux ratios.

Data Compilation
A sound predictive model can only be built based on good quality of sample data [102]. To compile quality data for this study, a comprehensive literature search was conducted to retrieve efflux ratio values from various sources to maximize the structural diversity. If there were two or more available efflux ratio data for a given compound and in close range, the average values were then taken to warrant better consistency. Further data curation was carried out by cautiously inspecting molecular structures to remove those molecules without definite stereochemistry.

Molecular Descriptors
All of the molecules enlisted in this study were subjected to full geometry optimization using the density functional theory (DFT) B3LYP method with the basis set 6-31G(d,p) by the Gaussian 09 package (Gaussian, Wallingford, CT) in the dimethyl sulfoxide (DMSO) solvent system using the polarizable continuum model (PCM) [129,130] to mimic the experimental conditions. These geometries were confirmed to be real minima on the potential energy surface by force calculations when no imaginary frequency was obtained. Additionally, atomic charges were also calculated by the molecular electrostatic potential-based method of Merz and Kollman [131] and the highest occupied molecular orbital energy (E HOMO ), lowest unoccupied molecular orbital energy (E LUMO ), free energy (∆G), and dipole (µ) were also retrieved from the optimization calculations since those quantum mechanics descriptors have been adopted previously. As such, it is of necessity to employ a more sophisticated quantum mechanics method to optimize those selected molecules and to calculate their associated descriptors.
The Discovery Studio package (BIOVIA, San Diego, CA) and E-Dragon (available at the web site http://www.vcclab.org/lab/edragon/) were also utilized to calculate more than 200 one-, two-, and three-dimensional molecular descriptors of those optimized molecules. These descriptors can be classified as electronic descriptors, spatial descriptors, structural descriptors, thermodynamic descriptors, topological descriptors, and E-state indices.
Data filtering was initially performed by removing those descriptors missing for at least one sample or showing little or no discrimination against all samples. Furthermore, only one descriptor should be kept among those descriptors with intercorrelation values of r 2 > 0.8 to reduce the probability of spurious correlations as postulated by Topliss and Edwards [113]. It is not uncommon to observe that certain descriptors with broader ranges outweigh those with narrower ranges because of substantial variations in magnitudes. Nevertheless, such problem can be resolved when the non-descriptive descriptors, viz. real variable descriptors, are normalized with the following equation [132] where x ij and χ ij represent the original and normalized jth descriptors of the ith compound, respectively; x j stands for the mean value of the original jth descriptor; and n is the number of samples. Descriptor selection plays a pivotal role in determining the performance of predictive models [133]. More descriptors will be needed once there are more training samples with more diverse structures [102]. Conversely, it is highly possible to yield an over-trained model when there are too many selected descriptors [134]. The descriptor selection was initially executed by genetic function approximation (GFA) using the QSAR module of Discovery Studio due to its effectiveness and efficiency [135]. Further descriptor selection was carried out by the recursive feature elimination (RFE) method, in which the predictive model was repeatedly generated by all but one of descriptors. The descriptors were then ranked according to their contributions to the predictive performance; and the descriptor with least contribution was discarded [136].

Data Partition
The collected molecules were divided into two datasets, namely the training set and test set, to develop and to verify the predictive models using the Kennard-Stone (KS) algorithm [137] implemented in MATLAB (The Mathworks, Natick, MA, USA) with an approximate 4:1 ratio as suggested [106]. It has been suggested that a sound model can be derived only based on chemically and biologically similar training samples and test samples [138]. As such, the data distribution was carefully examined to ensure the high levels of biological and chemical similarity in both datasets.

Hierarchical Support Vector Regression
Support vector machine (SVM) proposed by Vapnik et al. [139] was initially designated for use in classification and consequently modified for regression problems by nonlinearly mapping the input data into a higher-dimension space, in which a linear regression is performed [140]. SVM regression takes into account both the training error and the model complexity as compared with the traditional regression algorithms, which develop predictive models by minimizing the training error. As such, SVM performs better than traditional regression methods because of its advantageous characteristics, namely dimensional independence, limited number of freedom, excellent generalization capability, global optimum, and easy to implement [141]. Similar to other linear or machine learning (ML)-based QSAR techniques, SVM has to tradeoff between the characteristics of a global model, viz. broader coverage of applicability domain (AD), and a local model, viz. higher level of predictivity [108]. This seeming dilemma, nevertheless, can be plausibly resolved using the hierarchical support vector regression (HSVR) scheme, which was initially proposed by Leong et al. and was derived from SVM [83], because HSVR can simultaneously take into consideration both seemingly mutually exclusive characteristics. Practically speaking, it has been demonstrated that HSVR outperformed a number of ML-based models, namely artificial neural network (ANN), genetic algorithm (GA), and SVM [85].
The detail of HSVR has been mentioned elsewhere [83]. Briefly, a panel of SVR models was built by the LIBSVM package (software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm) based on various descriptor combinations, and each SVR model represented a local model. The model generation and verification were executed using the modules svm-train and svm-predict, respectively, implemented in the LIBSVM package. The regression modes, namely, ε-SVR and γ-SVR, were adopted, and radial basis function (RBF) was employed as the kernel due to its simplicity and better performance when compared with the others [142]. The runtime parameters, namely regression modes ε-SVR and ν-SVR, the associated ε and ν, cost C, and the kernel width γ, were scanned by the systemic grid search algorithm using an in-house Perl script [143], in which all parameters were changed independently in a parallel fashion.
Two SVR models were initially adopted to develop an SVR ensemble (SVRE), which, in turn, was further subjected to regression by another SVR to yield the final HSVR model. The two-member SVREs were continuously assembled until the HSVR model performed well. Otherwise, the threeor even four-member ensembles were built by adding one or more SVR models, respectively, if all two-member ensembles failed to perform well. The descriptor selection and ensemble assembly were predominantly governed by the principle of Occam s razor [144] by adopting the fewest descriptors and SVR models.

Predictive Evaluation
The predictivity of a generated model was evaluated by several statistic metrics. The coefficients r 2 and q 2 in the training set and external set, respectively, for the linear least square regression were computed by the following equation whereŷ i and y i are the predicted and observed values, respectively; and ŷ and n stand for the average predicted value and the number of samples in the dataset, respectively. Furthermore, the residual ∆ i , which is the difference between y i andŷ i , was calculated The root mean square error (RMSE) and the mean absolute error (MAE) for n samples in the dataset were computed The produced model was further subjected to 10-fold cross-validation instead of the widely used leave-one-out due to its better performance [145], giving rise to the correlation coefficient of 10-fold cross validation q 2 CV . In addition to cross-validation, the developed models were also internally validated by the Y-scrambling test [102], which was carried out by randomly permuting the log ER values, viz. Y values, to refit the previously developed models while the descriptors were remained unaltered, giving rise to the correlation coefficient r 2 s . The observed log ER values were scrambled 25 times as suggested [107] to produce the average correlation coefficient r 2 s . Furthermore, various modified versions of r 2 proposed by Ojha et al. [110] were also computed where the correlation coefficient r 2 o and the slope of the regression line k were calculated from the regression line (predicted vs. observed values) through the origin, whereas r 2 o was calculated from the regression line (observed vs. predicted values) through the origin.
Moreover, the correlation coefficients q 2 F1 , q 2 F2 , and q 2 F3 and concordance correlation coefficient (CCC) proposed by Shi et al. [146], Schüürmann et al. [147], Consonni et al. [148], and Chirico and Gramatica [149] were also computed by QSARINS [150,151] to evaluate the model performance in the external dataset where n TR and n EXT are the numbers of samples in the training set and external set, respectively; ŷ TR is the average predicted value in the training set; and y EXT and ŷ EXT are the average observed and predicted values in the external set, respectively. Various criteria for those statistical parameters have been proposed to gauge the model predictivity [152]. For instance, Chirico and Gramatica considered that both q 2 F3 and CCC are the best validation parameters to measure the predictivity [149], whereas Roy et al. suggested that r 2 m and ∆r 2 m are the most stringent metrics [111]. Recently, Todeschini et al. demonstrated that q 2 F3 is the most reliable metric [112]. The parameter q 2 F2 has been adopted by Organization for Economic Co-operation and Development (OECD) to assess the performance of QSAR models [147].

Conclusions
P-gp substrate efflux can be a major obstacle in the success of CNS-targeted therapeutic delivery as well as a critical pharmacokinetic factor for causing DDIs. On the other hand, the CNS-related side-effects of non-CNS drugs can be reduced by P-gp mediated efflux. As such, P-gp substrate efflux is of critical importance to drug discovery and development regardless of CNS drugs or non-CNS drugs. An in silico model to predict the P-gp substrate efflux can be valuable to drug discovery and development. Nevertheless, P-gp substrate efflux is a complex process that can take place through various routes, namely active transport and passive permeability, leading to different descriptor combinations as well as different relationships to render these variations in different mechanisms. In this study, a QSAR predictive model derived from the novel hierarchical support vector regression (HSVR) scheme, which can simultaneously possess the advantageous characteristics of a local model and a global model, viz. broader coverage of applicability domain and higher level of predictivity, respectively, was developed to envisage the P-gp substrate efflux ratio. The developed HSVR showed great prediction accuracy for the 50 and 13 molecules in the training set and test set, respectively, with excellent predictivity and statistical significance. When mock tested by a group of molecules to mimic real challenges, the derived HSVR model also executed accordantly well. Furthermore, the HSVR model can elucidate the discrepancies among all published P-gp substrate classifiers, indicating its superiority. Hence, it can be affirmed that this HSVR model can be adopted as an accurate and reliable predictive tool, even in the high throughput fashion, to facilitate drug discovery and development by designing drug candidates with a more desirable pharmacokinetic profile. : Table S1. Selected compounds for this study, their names, SMILES strings, CAS numbers, observed log ER values and predicted values by SVR A, SVR B, and HSVR, data partitions, and references; Table S2. Optimal runtime parameters for the SVR models; Figure