2D-QSAR and CoMFA Models for Antitubercular Activity of Scalarane-Type Sesterterpenes

Abstract: A series of scalarane sesterterpenes was prepared using heteronemin (1) as the primary precursor. Combined with scalarane derivatives obtained from natural sources, a total of 22 antitubercular scalaranes were used to build QSAR models based on the 2D-QSAR and CoMFA approaches. Both models indicated the influence of substitutions in the vicinity of C-12 and C-16 of the scalaranes. The 2D-QSAR model suggested the need for hydrophilic functionalities on the periphery of a hydrophobic core, and for lower steric repulsion to improve the potential energy. This was complemented by the pictorial CoMFA model, which indicated the importance of positive electrostatic character with a short steric extension crowning C-12 and of long, negative functionalities extending from C-16.


Introduction
Tuberculosis is among the neglected diseases that place a heavy burden on the developing world. The disease was declared a global public health emergency by the World Health Organization (WHO) in 1993, and enormous efforts have been made to prevent its spread. Such efforts have led to the fall in the incidence rate over the past ten years; yet, apart from the COVID-19 pandemic, tuberculosis is among the top ten causes of death and the leading cause of death due to a single causative pathogen, even higher than HIV/AIDS. In a 2021 report, the WHO estimated the 2020 global tuberculosis incidence to be as staggeringly high as 9.87 million or 127 cases per 100,000 population, with a mortality (HIV-positive and -negative combined) of 1.49 million [1].
One of the most problematic aspects of tuberculosis for disease management is the lengthy treatment period that can last as long as six months. An abrupt stop in taking the antitubercular medicines, especially in economically challenged patients, primarily contributes to the prevalence of multidrug-resistant mycobacterium strains. With a limited choice of first-line drugs for effective treatment, and the less effective second-line drugs with undesirable adverse effects, the need for new antitubercular medicines, preferably with new modes of action, has hastened the search for leads that may offer an alternative in tuberculosis treatment for health care providers.
In our search for drug leads from Thai marine bioresources, we reported the antitubercular activity of marine-derived scalarane sesterterpenes isolated from a sponge of the genus Hyrtios (formerly identified as Brachiaster sp.) [2,3]. The representative analog heteronemin (1) was first isolated from the sponge Heteronema erecta [4,5], and later from several other sources, including sponges of the genera Hyatella and Hyrtios, and the nudibranch Glossodoris atromarginata [6-11]. The antitubercular activity of 1 was first recognized by El Sayed et al. [12]. However, the compound has not been considered an attractive antitubercular drug candidate due to its potent and indiscriminate cytotoxicity [7,12]. In our previous report, we noticed that certain derivatives, namely compounds 2 and 3, showed promising selectivity towards antitubercular activity (MICs 3 and 4 µM, respectively), with virtually no cytotoxicity against any of the cancer cell lines described therein [2].
Preliminary QSAR and CoMFA (Comparative Molecular Field Analysis) models based on 14 natural and chemically derived scalaranes (compounds 1-14) were constructed previously [13]. The results suggested that the antitubercular activity of the scalaranes could be influenced by the oxygenated functionalities surrounding the furan moiety of the pentacyclic skeleton. However, as most of the compounds in the dataset of our preliminary study were natural products, the structural variation was rather serendipitous and random. In this study, we expanded the dataset with both newly derived compounds and natural scalaranes to obtain more refined models. Guided by our preliminary results, the primary foci of the chemical derivatization are modifications of the electrostatic and non-static fields crowning C-12 and of the negative electrostatic and lengthy extension from C-16. The results from the devised models are compared, and possible SARs for the antitubercular activity of the scalaranes are described.

General Procedures
Unless stated otherwise, solvents and chemicals were all reagent grade and were used as purchased without further purification. Chromatographic solvents were commercial grade and were redistilled prior to use. For the chemical reactions, THF was distilled over benzophenone/sodium and used immediately, and CH2Cl2 was distilled over CaH2 onto 4 Å molecular sieves. All chemical reactions were performed under a dried argon atmosphere in oven-dried vessels.
The UV spectra were obtained on a Shimadzu UV-160A (Markham, ON, Canada), and IR spectra on Perkin Elmer Spectrum One FT-IR spectrometers (Shelton, CT, USA), both in the operating solvents or matrices indicated accordingly. All of the NMR experiments were performed either on a Varian Unity Inova 500 (Palo Alto, CA, USA) or on a Bruker Avance 500 spectrometer (Billerica, MA, USA) (both at 500 MHz for 1H). The chemical shifts were referenced to those of the operating solvents as described accordingly. EI mass spectra were obtained on an MAT 95 XL mass spectrometer (Waltham, MA, USA).

Antitubercular Activity
All compounds to be incorporated in the compound dataset, except for 6-10, which were no longer available at the time of this investigation, were subjected to antitubercular activity determination using a green fluorescent protein-based technique [14]. The bioassay was performed by the Central Bioassay Laboratory, National Center of Genetic Engineering and Biotechnology (BIOTEC), Pathumthani, Thailand. The targeted microbe was green-fluorescent-protein-expressing M. tuberculosis H37Ra (ATCC25177). The inhibition percentage was calculated from the fluorescence intensities of treated over untreated cultures, and the activity was reported as MIC. The activity was compared with those of rifampicin, streptomycin, isoniazid, and ofloxacin as standard drugs. For the unavailable compounds 6-10, the antitubercular activities were taken from the previously reported MICs obtained from the microplate alamar blue assay [15].
Note: obs. = observed MIC; calc. = calculated MIC; NA = not active at the highest concentration tested. a All of the calculations for the QSAR models were performed in pMIC scale. b Test set. c These compounds were not included in the dataset either because they were not active (NA) at the highest concentration of 400 µg/mL, so that no MICs were available, or because their low solubility did not permit precise MICs to be obtained.

QSAR
The 2D-QSAR and CoMFA approaches for the QSAR studies were carried out using the antitubercular activities, calculated as pMICs, of 22 compounds from the compound dataset as a training set (see Section 3.3 below). For the 2D-QSAR, the geometrical optimization was performed on Gaussian03 (Wallingford, CT, USA) and the descriptors were calculated using Molecular Operating Environment (MOE2008) (Montreal, QC, Canada). In the CoMFA approach, the studies were conducted on SYBYL 7.0 (St. Louis, MO, USA), courtesy of the National Electronics and Computer Technology Center (NECTEC), Bangkok. Data analysis for QSAR evaluation was performed on SPSS 12.

2D-QSAR Model
In the 2D-QSAR setting, the 3D structures of the compounds in the training and test sets were energetically refined by density functional theory (B3LYP) with a 6-31G(d,p) basis set. A total of 327 physicochemical properties were calculated for each compound in its lowest-energy, optimized conformation. The relevant physicochemical properties resulting from this calculation were selected using a stepwise method.

CoMFA Model
For the CoMFA model, each molecule was optimized to yield the lowest-energy conformation under the Tripos Force Field, and the partial charge on each atom was calculated by the Gasteiger-Hückel method. The alignments, based on fit atom, matching, and superimposition, were performed on the obtained conformations by fixing C-1 to C-10 of compound 3 as a template. The grid box was automatically created to cover all aligned molecules with a 4-Å extension in all directions. The electrostatic and steric interactions were computed between the atoms of each molecule and a probe, an sp3 carbon with a +1 charge, placed at every grid intersection of the box. The electrostatic and steric parameters collected from these probe positions were used to evaluate the model.
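The grid construction and probe calculation described above can be illustrated with a minimal sketch. This is not the SYBYL implementation: the 2 Å grid spacing, the constant dielectric, the ±30 kcal/mol cutoff, and the three-atom fragment with its charges are all assumptions chosen for illustration, and only the electrostatic term is shown (the steric term would need Lennard-Jones parameters that are not given in the text).

```python
import math
from itertools import product

COULOMB_KCAL = 332.06  # conventional conversion factor, kcal*Å/(mol*e^2)

def frange(start, stop, step):
    """Evenly spaced values from `start` that reach or pass `stop`."""
    n = math.ceil((stop - start) / step)
    return [start + i * step for i in range(n + 1)]

def build_grid(coords, margin=4.0, spacing=2.0):
    """Lattice points covering all atoms plus `margin` Å on every side.
    The 4 Å margin follows the text; the 2 Å spacing is an assumption."""
    axes = []
    for k in range(3):
        lo = min(c[k] for c in coords) - margin
        hi = max(c[k] for c in coords) + margin
        axes.append(frange(lo, hi, spacing))
    return list(product(*axes))

def electrostatic_field(grid, coords, charges, probe_charge=1.0, cutoff=30.0):
    """Coulomb energy of a +1 probe at each grid point, capped at ±cutoff.
    A constant dielectric of 1 is assumed here for simplicity."""
    field = []
    for point in grid:
        energy = 0.0
        for xyz, q in zip(coords, charges):
            r = max(math.dist(point, xyz), 1e-3)  # avoid division by zero
            energy += COULOMB_KCAL * probe_charge * q / r
        field.append(max(-cutoff, min(cutoff, energy)))
    return field

# Hypothetical three-atom fragment (coordinates in Å, made-up partial charges)
coords = [(0.0, 0.0, 0.0), (1.2, 0.0, 0.0), (2.0, 1.0, 0.0)]
charges = [-0.30, 0.25, 0.05]

grid = build_grid(coords)
field = electrostatic_field(grid, coords, charges)
print(len(grid), "grid points")
```

Each molecule in the aligned set would contribute one such vector of field values, and these vectors form the columns that are later correlated with activity.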
The molecular field data was treated with PLS to correlate with the antitubercular activity. The coefficient of determination, F ratio, and SEE were used to evaluate the quality of the model. The leave-one-out experiment was carried out to determine the predictability of the resulting models.
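As a hedged illustration of the two statistics used throughout, the sketch below computes the coefficient of determination (r²) and the leave-one-out cross-validated q² = 1 − PRESS/SS_total. Ordinary least squares is used here as a simple stand-in for the PLS and stepwise regression procedures actually employed; the descriptor values and activities are made up, and the helper names are hypothetical.

```python
def solve(a, b):
    """Solve a·x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit(X, y):
    """Ordinary least squares with an intercept, via the normal equations."""
    A = [[1.0] + list(row) for row in X]
    p = len(A[0])
    ata = [[sum(r[i] * r[j] for r in A) for j in range(p)] for i in range(p)]
    aty = [sum(r[i] * yi for r, yi in zip(A, y)) for i in range(p)]
    return solve(ata, aty)

def predict(coef, row):
    return coef[0] + sum(c * v for c, v in zip(coef[1:], row))

def r2_and_q2(X, y):
    """Return (r², leave-one-out q²) for a linear model of y on X."""
    mean_y = sum(y) / len(y)
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    coef = fit(X, y)
    ss_res = sum((yi - predict(coef, r)) ** 2 for r, yi in zip(X, y))
    press = 0.0
    for i in range(len(y)):
        # Refit without compound i, then predict the left-out compound
        coef_i = fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - predict(coef_i, X[i])) ** 2
    return 1 - ss_res / ss_tot, 1 - press / ss_tot

# Made-up descriptors (two per compound) and pMIC-like activities
X = [[0.1, 1.0], [0.4, 0.8], [0.5, 1.5], [0.9, 0.3],
     [1.2, 1.1], [1.5, 0.2], [1.9, 0.9], [2.2, 0.5]]
y = [3.6, 4.1, 3.9, 5.0, 4.9, 5.8, 5.7, 6.4]

r2, q2 = r2_and_q2(X, y)
print(f"r2 = {r2:.3f}, q2 = {q2:.3f}")
```

Because each left-out compound is predicted by a model that never saw it, q² cannot exceed r² for a least-squares fit, which is why it serves as the more conservative measure of predictability.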

Results and Discussion
The main objective of this investigation is to establish improved 2D-QSAR and CoMFA models for the antitubercular activity of the scalaranes using a rebuilt larger compound dataset. In addition to 14 compounds previously described in our preliminary runs [13], included in this dataset are 14 new semi-synthetic derivatives and five additional natural scalaranes, comprising a total of 33 scalarane analogs.

Derivatization of the Scalaranes
As described earlier, the main focus of the chemical derivatization performed in this study is the specific manipulation of the electrostatic field surrounding C-12, C-16, and C-19, as suggested by our preliminary results [13]. A total of 18 scalaranes were prepared in this investigation using heteronemin (1) as the primary starting material. Among the 18 synthesized compounds, four (compounds 4, 9, 10, and 13) were readily available as natural products [2,3], and are described in this part of the investigation solely as intermediates for further derivatization. In addition, compounds 15 and 24 were also reported previously as natural products, whereas 16-23 and 25-28, a total of 12 compounds, were newly synthesized and are reported here for the first time.

Scalarafuran Series
Deacetoxylation of 1 using triphenylphosphine-iodine [17,18] led to 13 and 15 as two major products (Scheme 1). Both compounds were previously reported as natural products [3,19,20], and were used here primarily as intermediates for further manipulation of the oxygenated functionalities on C-12. Note that our previous results suggested that the furan moiety may lower the potency by at least one order of magnitude (for example, 1 vs. 13), making it not entirely beneficial for antitubercular activity [13]; nevertheless, the furan ring offered better stability than its furanol and furanone counterparts. This superior stability allowed chemical transformations, particularly on the oxygenated functional groups on C-12 and C-16, to be conducted. The side reaction towards dideacetoxylated 15 was unexpected, and its mechanism has not been documented or described. However, incorporating 15 and its subsequent derivatives into the dataset should demonstrate the importance of the substituting groups on C-16 (or the lack thereof; see Section 3.1.2) to the antitubercular activity.
[…] (18 and 19), and coupling with methoxylamine and hydroxylamine (20-23) [21] (Scheme 1). The structural variation allowed a good range of electrostatic functionalities over C-12.
The geometry of the oximes 20-23, presumably governed by the repulsion from the nearby furan ring, was deduced to be E. This is supported by the chemical shift differences of the α carbons on both sides of the oxime, as compared between the oxo starting materials and the resulting oximes (ΔδCα = δCα-oxo − δCα-oxime; Table S1, Supplementary Materials) [22,23]. The proposed geometry contradicted the results from the NOE-DS experiments, in which dipolar couplings between H-19 and the methoxy protons of 22 and 23 were observed. However, calculations indicated that the interatomic distances between H-19 and the methoxy protons in the E and Z geometries of both derivatives were within an indistinguishable range for the NOE (Figures S69 and S70, Supplementary Materials). In fact, the steric repulsion in the Z conformers led to longer interatomic distances for both 22 and 23. The ΔδCα values, on the other hand, provided more concrete evidence that indicated the E geometry unambiguously. The approach can also be used with compounds 20 and 21, whose exchangeable protons were unobservable and to which NOE experiments were therefore inapplicable. Therefore, the determination of the oxime geometry based on the chemical shift differences prevailed.
Hydrolysis of 13 yielded the 16-deacetyl derivative 24 (Scheme 2). The compound is identical to sesterstatin, a natural sesterterpene previously isolated from the sponge Hyrtios erecta [24]. The presence of the hydroxy group on C-16 of 24 added electrostatic variation to the C-16 extension of the scalarafuran series.

Deoxyscalarin Series
In addition to the oxygenated functionalities on C-16, alkyl extensions on this side of the scalarane skeleton were attempted. Reduction of 1 with a borohydride led to a spontaneous ring opening toward the triol 25 (Scheme 3). An oxidation with Mn(IV) [25] recyclized the resulting intermediate allylic aldehyde toward the furanone 4, which was subjected to a β-alkylation with nitromethane to yield the epimeric nitromethyls 26 and 27 [26]. Disappointingly, alkylation on C-16 of 4 with various alkyl cuprates failed to yield any C-16 alkyl-substituted analogs.

Scalaradials and Scalarapyridazine
The acetoxy acetal moiety of 1 proved to be too labile for extensive deacetylation. When hydrolysis conditions similar to those described for the conversion of 13 to 24 were applied to 1, no isolable products were obtained. Even with a mild methanolysis, the resulting hemiacetal spontaneously underwent a ring opening and yielded a mixture of the dials 9 and 10 [10]. The mixture, although available in a limited amount, was used to prepare the pyridazine 28 [27] (Scheme 4), adding aromatic variation beyond the furans and dihydrofurans.

Determination of Antitubercular Activity
Fourteen chemically derived scalaranes (15-28) and five natural sesterterpenes (29-33) were subjected to an antitubercular activity determination using a green fluorescent assay protocol targeting M. tuberculosis H37Ra (ATCC25177) [14]. Because the assay protocol differed from that of our earlier reports [2,3,13,15], the compounds from the previous dataset (1-5, 11-14), except for 6-10, which were no longer available, were re-examined with the green fluorescent assay. For the compounds that were not re-tested, the activities were taken from the previous reports [3,13] on the assumption that the two bioassays provide acceptably comparable results [14]. The activities of the compounds in the dataset are shown in Table 1.

Structure Activity Relationship and Computer Modeling
Upon obtaining the newly derived and natural scalaranes, all compounds, including those from the previous report, were considered for integration into the QSAR analyses. Compounds 23 and 29 were arbitrarily selected for the test set owing to their relevant functionalities on C-12 and C-16, in line with our preliminary results [13], therefore allowing a predictability assessment. The compounds that were inactive (16, 25, 31, 33) were automatically dismissed. In addition, because the incomplete solubility of 15, 17, 18, 19, and 32 led to inaccurate MICs, these compounds were also excluded. A total of 22 compounds were imported into the new training set. The MICs of the compounds in the dataset spanned from 0.23 µM (pMIC 6.67) to 263 µM (pMIC 3.58) (pMIC range 3.09, mean 4.53 ± 0.74, s² 0.55, for 22 compounds).
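For readers unfamiliar with the pMIC scale, the conversion can be sketched as follows, assuming the usual definition pMIC = −log10(MIC in mol/L); the function name is hypothetical. The example uses the least active compound of the training set quoted above.

```python
import math

def pmic_from_um(mic_um):
    """pMIC = -log10(MIC in mol/L), with the MIC supplied in µM."""
    return -math.log10(mic_um * 1e-6)

weakest = pmic_from_um(263)  # least active compound in the training set
print(round(weakest, 2))     # 3.58
```

On this scale a higher value means a more potent compound, and one pMIC unit corresponds to a ten-fold change in MIC.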

2D-QSAR
Geometric optimization of each compound for the 2D-QSAR was performed at the density functional level (B3LYP) with a 6-31G(d,p) basis set (Gaussian03). The resulting optimized conformation was used to calculate 327 physicochemical descriptors (MOE2008). Based on a stepwise selection (SPSS12), four descriptors emerged to yield the best-fitting Equation (1) (see Table S2), where E_nb = potential energy with all bonded terms disabled, vsurf_ID2 = hydrophobic integy moment, vsurf_CW3 = capacity factor, and PEOE_VSA+1 = sum of v_i (the van der Waals surface area of atom i) over the atoms i whose partial charge q_i is in the range [0.05, 0.10), expressed as PEOE_VSA+1 = Σ v_i for 0.05 ≤ q_i < 0.10. The coefficient for each descriptor is expressed as mean (±SD).
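The definition of the charge-binned surface-area descriptor can be made concrete with a short sketch. This is not MOE code: the function name and the per-atom charges and areas are hypothetical, and only the binning rule (sum v_i where 0.05 ≤ q_i < 0.10) is taken from the definition above.

```python
def peoe_vsa_bin(charges, vdw_areas, lo=0.05, hi=0.10):
    """Sum the van der Waals surface areas v_i of the atoms whose
    partial charge q_i lies in the half-open interval [lo, hi)."""
    return sum(v for q, v in zip(charges, vdw_areas) if lo <= q < hi)

# Hypothetical per-atom values (charges in e, surface areas in Å^2)
charges = [-0.32, 0.06, 0.09, 0.10, 0.02, 0.05]
areas = [12.0, 8.5, 7.0, 9.2, 10.1, 6.4]

value = peoe_vsa_bin(charges, areas)
print(round(value, 1))  # atoms 2, 3, and 6 qualify; 0.10 itself is excluded
```

The half-open interval matters: an atom with a charge of exactly 0.10 falls into the next bin, so each atom contributes its surface area to exactly one bin of the descriptor family.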
The resulting model gave an acceptable linearity (r² = 0.878, Figure 1, dashed line) and predictability (q² = 0.774). Based on the resulting SAR (Equation (1)) […]
With a larger array of calculated descriptors provided by MOE, the re-established model offered a better-fitted QSAR function. This was indicated by q², which improved from 0.685 [13] in the preliminary results to 0.774, shown above. Due to the different software (Materials Studio vs. MOE), all four descriptors involved here differed entirely from those predicted in our preliminary runs [13]; however, related properties were inherited. For example, both volume-surface descriptors (vsurf_ID2 and vsurf_CW3) were closely related to the dipole moments, with direct implications for the steric influences. Complementing each other, the vsurf descriptors indicated that potency improves when the hydrophilic functionalities are spread over the peripheral surface of the molecules and the hydrophobicity is held closely and tightly in the inner cores of the structures. For E_nb, in addition to the direct interpretation that the more stable the compound, the better the activity, the fact that the function was calculated with bonded terms disabled implied that such stability may be achieved through less steric repulsion among the substituents. The last descriptor, PEOE_VSA+1, in fact reflected the influence of RPCG (ratio of positive charge of the most positive atom over the total positive charge) previously employed [13]. PEOE_VSA+1 indicated the influence of charge at a single-atom level, but integrated more of the polarizing factors through van der Waals surface areas rather than the ionic character formerly implied by RPCG.

CoMFA Model
In parallel with the 2D-QSAR approach, the CoMFA study was carried out. Three alignment rules were trialed (see Section 2.5.2). The computational parameters were best met when employing the superimposition alignment rule, with sp3 C as the probe atom. This agreed well with the methods attempted in the previous report [13].
The CoMFA results (Figure 3) concurred with the prediction in the previous study [13] and allowed a parallel interpretation. This included the negative electrostatic and steric disfavored regions (blue and yellow patches, respectively) that hovered over C-12, and indicated the need for positive dipole moment inducers with as little bulkiness as possible, e.g., an oxo functionality in 3. On the other end, the negative electrostatic and steric favored regions (red and green patches, respectively) extending from C-16 reflected the positive influences of an acetoxy group substituted at this position.
The re-established CoMFA model yielded much improved linearity over that obtained from the 2D-QSAR model (r² = 0.973, Figure 1, solid line). However, the predictability, despite being in an acceptable range with a q² of 0.630, was lower than that provided by the 2D-QSAR model described above, and than that of the CoMFA calculation using the smaller dataset (q² 0.810) [13]. This may be attributed to compounds with larger and more flexible substitutions being added to the new dataset. Compared with the results from the 2D-QSAR, despite the lower predictability, a better alignment between the observed and calculated MICs (Table 1) was demonstrated here, as only one compound (8) showed a residual MIC larger than 50 µM (Figure 2).
The ratio between the electrostatic and steric contributions in the resulting CoMFA model was 59:41. A contribution ratio in favor of the electrostatic influences appeared counterintuitive, as antitubercular drug leads should rather be hydrophobic, for they have to penetrate through the mycolic acid-coated cell wall of the mycobacterium. That is, a higher proportion from the steric contribution, hence a hydrophobic implication, would be expected. A closer look into the compound dataset, however, suggested that the compounds in the dataset differed mainly in their electrostatic substitutions. These electrostatic functionalities dominated the model and were reflected in the resulting contribution ratio. Nonetheless, if the clogP of each compound (average 5.69, range 4.53-7.06; see Supplementary Materials), albeit not integrated directly into any QSAR functions, was considered, the range of strong hydrophobicity indeed agreed well with such a requirement.

Compound Test Set
Compounds 23 and 29 were subjected to the calculation as the test set (Table 1). The calculated MICs of 29 based on both models were in good agreement with the observed activity. Notice that 29 is in fact a 19-epimer of 2. The slightly lower potency of 29 than that of 2 provides good evidence of the predicted negative effects from the steric field in the vicinity of C-12, particularly on the β plane.
As for 23, although its calculated and observed pMICs were of the same magnitude (observed pMIC 3.90; calculated pMICs based on 2D-QSAR 3.64, on CoMFA 3.58), the calculated MICs of 23 from both models strayed from the observed one by more than 50 µM. This in part resulted from a disadvantage of the assay protocol, in which two-fold dilution was performed to obtain MICs, therefore leading to wider and less precise gaps between consecutive dilutions as the concentration becomes larger. Nevertheless, combined with the results from 5, 8, and 11, all of which also showed inaccuracy in either or both predictions, it is noticeable that none of these compounds has a C-16 extension. Such inaccuracy indicated that the derived models may underestimate the influences from such extensions, and that the effects from the functionalities on both C-12 and C-16 may need to be more comprehensively integrated.
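The size of these deviations can be checked with a short calculation, sketched here under the assumption that pMIC = −log10(MIC in mol/L); the function name is hypothetical and the pMIC values are those quoted above for compound 23. The sketch also expresses one two-fold dilution step on the pMIC scale.

```python
import math

def mic_um_from_pmic(pmic):
    """Convert a pMIC value back to an MIC in µM."""
    return 10 ** (6 - pmic)

observed = mic_um_from_pmic(3.90)
residuals = {}
for label, calc_pmic in (("2D-QSAR", 3.64), ("CoMFA", 3.58)):
    residuals[label] = mic_um_from_pmic(calc_pmic) - observed
    print(f"{label}: calculated MIC exceeds observed by "
          f"{residuals[label]:.0f} µM")

# One two-fold dilution step expressed in pMIC units
dilution_step = math.log10(2)
print(f"two-fold step = {dilution_step:.2f} pMIC units")
```

Both residuals exceed 100 µM, yet the underlying pMIC differences (0.26 and 0.32) are close to a single two-fold dilution step of about 0.30 pMIC units, which is consistent with the point made above about the coarse spacing of dilutions at high concentrations.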
Considering the compounds that were dismissed from the dataset as inactive analogs (compounds 16, 25, 31, and 33), both resulting QSAR models sufficiently reflected the lack of activity and merited such dismissal. To a certain extent, a similar comparison may also be extended to the compounds that were excluded due to low solubility (compounds 15, 17-19, and 32); i.e., most could be predicted to be weakly active to virtually inactive. The arguable case was 18, which possessed two beneficial functionalities, a 12-oxo and a 16-OAc. The compound, however, was only marginally active (observed MIC 235 µM). Unfortunately, 18 was no longer available at the time of the QSAR analysis. A confirmatory determination of its activity could not be performed, and this discrepancy remains unresolved.

Conclusions
In summary, a series of scalaranes, both natural and chemically derived, were assessed for the QSARs of the antitubercular activity based on the 2D-QSAR and CoMFA approaches. Yielding acceptable linearity and predictability, the models complemented and agreed well with each other. Despite its lower q², the pictorial CoMFA method yielded less deviated calculations. Both models showed that the areas crowning C-12 and the extended regions from C-16 of the scalarane skeleton effectively influenced the potency. In addition, although not integrated directly, the hydrophobicity reflected by clogP indirectly plays an important role for the active analogs, as it is crucial for the active compounds to penetrate through the lipophilic cell wall of the mycobacteria.
Based on the observed and calculated results, 3, which was the most active, fit best as a prospective platform for further modification and investigation. The finding may open up an opportunity to extend the search for new chemical entities for antitubercular drugs. Our results may also facilitate the search for new drug targets and binding sites for the antitubercular and other related activities.
Supplementary Materials: Table S1: ΔδCα of compounds 20-23; Table S2: Calculated working molecular descriptors of scalaranes in the compound dataset.