Modelling the Anti-Methicillin-Resistant Staphylococcus Aureus (MRSA) Activity of Cannabinoids: A QSAR and Docking Study

Twenty-four cannabinoids active against MRSA SA1199B and XU212 were optimized at WB97XD/6-31G(d,p), and several molecular descriptors were obtained. Using a multiple linear regression method, several mathematical models with statistical significance were obtained. The robustness of the models was validated, employing the leave-one-out cross-validation and Y-scrambling methods. The entire data set was docked against penicillin-binding protein, iso-tyrosyl tRNA synthetase, and DNA gyrase. The most active cannabinoids had high affinity to penicillin-binding protein (PBP), whereas the least active compounds had low affinities for all of the targets. Among the cannabinoid compounds, Cannabinoid 2 was highlighted due to its suitable combination of both antimicrobial activity and higher scoring values against the selected target; therefore, its docking performance was compared to that of oxacillin, a commercial PBP inhibitor. The 2D figures reveal that both compounds hit the protein in the active site with a similar type of molecular interaction, where the hydroxyl groups in the aromatic ring of cannabinoids play a pivotal role in the biological activity. These results provide some evidence that the anti-Staphylococcus aureus activity of these cannabinoids may be related to the inhibition of the PBP protein; besides, the robustness of the models along with the docking and Quantitative Structure–Activity Relationship (QSAR) results allow the proposal of three new compounds; the predicted activity combined with the scoring values against PBP should encourage future synthesis and experimental testing.


Introduction
Staphylococcus aureus is known as the principal cause of human bacterial infections around the world. Even though the infections are not symptomatic in many cases, in immunosuppressed patients, this bacterial infection can have lethal consequences such as pneumonia, meningitis, or septicemias. Additionally, Staphylococcus aureus presents high morbidity in clinical infections, mainly in less developed countries where access to medicines is complicated [1].
The problem becomes more serious considering the appearance of methicillin-resistant Staphylococcus aureus (MRSA) strains every year around the planet with a high rate of resistance [2]. However, despite the urgent need for new antimicrobial agents, the discovery rates are low; only one class of antibiotics has been introduced in the last 30 years [3,4]. Therefore, searching, designing,

Molecular Descriptor Calculation
The structures of the 24 compounds shown in Figure 1 were optimized using density functional theory (DFT), employing the functional WB97XD/6-311G(d,p) as a theory level implemented into the Gaussian 16 for Linux software [22]. As shown, there is a common moiety in all cases, i.e., the benzene ring with one or more hydroxyl group(s) on it. Except for in Compounds 23 and 24, the benzene structure is fusing with the terpene-like structure. Moreover, the compound with the benzene ring with a hydroxyl group in Positions 2 and 6 in the aromatic ring seems to be more active than the rest of the cannabinoids. No hydroxyl groups in the benzene ring become the cannabinoids in the inactive one. Therefore, according to the pMIC of Compound 6, that activity diminishes with alkyl chain presence on the meta-position.
Considering the hydroxyl ortho-para-directing properties [20,21], these results seem to be strong evidence that the meta-unsubstituted benzene ring plays a pivotal role in the anti-MRSA activity.

Molecular Descriptor Calculation
The structures of the 24 compounds shown in Figure 1 were optimized using density functional theory (DFT), employing the functional WB97XD/6-311G(d,p) as a theory level implemented into the Gaussian 16 for Linux software [22]. Tables 2 and 3 show the most important topographic descriptors related to antimicrobial activity against SA-1199B and XU212 Staphylococcus aureus strains, respectively. Even though we computed electronic and thermodynamic descriptors, only topographic ones seem related to anti-MRSA activity.

Quantitative Structure-Activity Relationships (QSAR):
Using the descriptors shown in Tables 2 and 3, and employing the stepwise statistical method, combined with forward selection and backward eliminations, we obtained three equations describing the antimicrobial activity in both SA-1199B and XU212 Staphylococcus aureus strains. The maximum number of molecular descriptors in the models was five.
For the SA-1199B strain, For further analysis, Equations (2) and (5) were selected due to their better statistics, mainly, their high predictor coefficients (Q 2 and Q 2 boot) and the good Y scrambling values (a(R 2 ) and a(Q 2 )). Figure 2 shows the reproducibility of the experimental values by using the representative model in each case. Additionally, the matrix correlations reveal that there are no molecular descriptors correlated in each model (see supplementary material). Henceforth, we will only consider these models for their chemical explanation. their high predictor coefficients (Q 2 and Q 2 boot) and the good Y scrambling values (a(R 2 ) and a(Q 2 )). Figure 2 shows the reproducibility of the experimental values by using the representative model in each case. Additionally, the matrix correlations reveal that there are no molecular descriptors correlated in each model (see supplementary material). Henceforth, we will only consider these models for their chemical explanation. As Tables 2 and 3 show, the topographic descriptors describe with high accuracy the biological activities evaluated in this work. We computed these descriptors by applying different mathematical approaches to the graphs constructed by the molecules' 3D structure. At the end of the attribute's name, we present the physicochemical property employed in each descriptor as lowercase letters. The properties found are the atomic mass (m), electronegativity (e), polar surface area (PSA), atomic charges (c), Van der Waals volume (v), refractive index (r), softness (s), hardness (h), AlogP (a), and polarizability (p). We can extract relevant information from these properties, such as information about the molecular size (v), the atom mass (m), and the presence of polar groups (PSA), which are associated with the adequate coupling of the molecules in the active site of the protein receptor [23]. We also associated the molecules' electron donor-acceptor properties with the molecules' biological activity, as can be described by the properties c, s, h, r, and p [24]. The AlogP property denoted as "a" is associated with crossing the membrane, which plays a pivotal role in the biological activity [25]. As Tables 2 and 3 show, the topographic descriptors describe with high accuracy the biological activities evaluated in this work. We computed these descriptors by applying different mathematical approaches to the graphs constructed by the molecules' 3D structure. At the end of the attribute's name, we present the physicochemical property employed in each descriptor as lowercase letters. The properties found are the atomic mass (m), electronegativity (e), polar surface area (PSA), atomic charges (c), Van der Waals volume (v), refractive index (r), softness (s), hardness (h), AlogP (a), and polarizability (p). We can extract relevant information from these properties, such as information about the molecular size (v), the atom mass (m), and the presence of polar groups (PSA), which are associated with the adequate coupling of the molecules in the active site of the protein receptor [23]. We also associated the molecules' electron donor-acceptor properties with the molecules' biological activity, as can be described by the properties c, s, h, r, and p [24]. The AlogP property denoted as "a" is associated with crossing the membrane, which plays a pivotal role in the biological activity [25].
Due to the importance of frontier molecular orbitals in chemical reactivity, this work calculated both the shape and energy of HOMO and LUMO orbitals as shown in Figure 3.
To find any patterns that elucidate the activity and inactivity of cannabinoids against Staphylococcus aureus, we analyzed the shape of the frontier orbitals. Figure 1 shows that the most active compounds (2, 10, and 19) have both HOMO and LUMO orbitals located in different molecular regions. By contrast, for the smallest active compounds (23), the HOMO and LUMO are almost dispersed across the entire molecular region. These results suggest the most active compounds have acceptor-donator behavior, as reported before in several studies [55][56][57][58][59][60][61][62][63].  Table 4 shows the values of several drug-likeness properties used as criteria for Lipinski´s rules, the MDL drug data report (MDDR), and others [73]. According to these criteria, a molecule must have a ClogP value of −0.4 < logp < 5, molar refractivity of 40 < MR < 130 mol/cm 3 , MW < 500, PSA < 140, and nRB ≥ 6. In this sense, Compounds 2, 8, and 10 seem to have good drug-likeness characteristics. These results, along with the experimental biological activities, represent a piece of extraordinary evidence for the potentiality of some cannabinoids as future drugs.  Moreover, it seems that the characteristic explained above is very common in antibacterial and antimicrobial compounds. An in-depth review of the bibliography reveals that several antibacterial and antibiotic (commercial and potential pharmaceutical) compounds have the HOMO-LUMO orbitals located in different molecular regions [64][65][66][67][68][69][70][71][72]. Thus, Compounds 2, 10, and 19, might be serious candidates for becoming pharmaceutical compounds. Table 4 shows the values of several drug-likeness properties used as criteria for Lipinski´s rules, the MDL drug data report (MDDR), and others [73]. According to these criteria, a molecule must have a ClogP value of −0.4 < logp < 5, molar refractivity of 40 < MR < 130 mol/cm 3 , MW < 500, PSA < 140, and nRB ≥ 6. In this sense, Compounds 2, 8, and 10 seem to have good drug-likeness characteristics. These results, along with the experimental biological activities, represent a piece of extraordinary evidence for the potentiality of some cannabinoids as future drugs.

Molecular Docking
Although the broad antibacterial spectrum of cannabinoids has been reported in vitro, the antibacterial mechanism of action remains unknown yet. Therefore, it seems interesting to assess these kinds of compounds' possible interactions with some protein targets reported as fundamental in the Staphylococcus aureus life cycle. Consequently, the cannabinoids most active against the two strains studied were tested for three targets: penicillin-binding protein (PBP), isoleucyl-tRNA synthetases, and DNA gyrase (DNA Gyr). We selected an inactive compound (23) to carry out the molecular docking to gain more insight into the differences between active and inactive cannabinoids. Additionally, we used norfloxacin (DNA Gyr), oxacillin (PBP), and mupirocin (TyRS) as control ligands for the targets. The use of different proteins allows one to gain some hints about the mechanism of the cannabinoids' probable action; however, any hypothesis must be proven experimentally. Table 5 reveals some interesting patterns. It shows that all of the active compounds seem to have a good affinity for PBP but have a low affinity for both theIso-TyrRS and DNA Gyr proteins. By contrast, the most inactive compounds present low affinities for all the protein targets. Therefore, these results suggest that Iso-TyrRS and DNA Gyr might be ruled out from the mechanism of action.  Table 5 shows that Compound 2 has the best performance in terms of both the microbial activity and scoring factor (against PBP), whereas Compound 23 has the lowest values. Therefore, to find any differences between active and inactive structures, these two compounds were selected to gain more insight into their molecular interactions with PBP. We compare these cannabinoids' interactions with those of oxacillin, a commercial inhibitor of this target [74][75][76][77][78][79][80]. Figure 4 shows the most active poses and a 2D diagram of interactions with reactive penicillin-binding protein pockets. Oxacillin and Compound 2 hit the targets in the same pocket; oxacillin, a commercial PBP inhibitor, presents two types of interactions, i.e., Van der Waals and hydrophobic. Carbonyl groups in oxacillin exhibit a short-range dipolar interaction with amino acid residues such as Glu294, His293, Lys 319, and Gln 292, and long-range dipolar interactions with other amino acids such as Gly321, ASP320, Lys 290, and Lys 322. Additionally, some strong hydrophobic interactions are seen: two pi-pi interactions (Lys273 and Lys289 with aromatic ring) and two strong n-pi interactions between aromatic rings and the amino acid residue TyrB272.

Predicted Compounds
The QSAR results reveal that some electronic properties such as the electronegativity (e), polar surface area (PSA), atomic charges (c), Van der Waals volume (v), refractive index (r), softness (s), and hardness (h) have a noticeable influence on the biological anti-MRSA activity. Additionally, the docking results reveal that some structural characteristics seem essential; the hydroxyl group on the benzene ring takes part in the molecular interactions with the target. Considering these factors and Oxacilli Cannabino Cannabinoid By contrast, Cannabinoid 23 targets the PBP in a site different from the active one. Thus, the fact that Compound 23 did not hit the active site and is inactive, and Cannabinoid 2 did hit the active site, seems to support the idea that the microbial activity is related to the capability to inhibit the PBP target.
The docking results enhance some characteristics, such as the aromatic ring, the hydroxyl group on Position 2-6, the hydrogen in Position 5, and sp 2 bonds. These features guarantee the presence of two molecular interactions with the amino acid residues in the active pocket, dipolar and hydrophobic (pi-pi, preferably) ones. The behavior of Cannabinoid 2 against the targets and its biological activity suggest the anti-Staphylococcus aureus activity may be related to the inhibition of the PBP target. Moreover, the intrinsic characteristics such as the lipophilicity and low polarity seem to enhance the probability of attacking this membrane protein.

Predicted Compounds
The QSAR results reveal that some electronic properties such as the electronegativity (e), polar surface area (PSA), atomic charges (c), Van der Waals volume (v), refractive index (r), softness (s), and hardness (h) have a noticeable influence on the biological anti-MRSA activity. Additionally, the docking results reveal that some structural characteristics seem essential; the hydroxyl group on the benzene ring takes part in the molecular interactions with the target. Considering these factors and the possibility of targeting the PBP, three new structures may be proposed as potential MRSA compounds in this work. The oxacillin beta-lactam ring coupled to Cannabinoid 2 in different positions based on the docking results, in which the moieties that do not participate in the molecular interactions with the target were withdrawn from the molecules and replaced for the oxacillin beta-lactam ring.
The three new molecules, named CLP1, CLP2, and CLP3, were drawn by employing ChemDraw and optimized to a minimum geometry structure at the WB97XD/6-311G(d,p) theory level. Figure 5 shows both the structure and the 2D diagram of molecular interaction with the PBP target. In all of the cases, the new compounds exhibited a pMICSA1199B higher than that of Cannabinoid 2, whereas the pMICX212 values are slightly smaller than those presented for Cannabinoid 2.
After predicting the pMIC for the proposed compounds, we computed their drug-likeness properties, along with the scoring values against the PBP target. As shown in Table 6, all of the proposed compounds have a suitable scoring factor against PBP, with the compound ClP1 showing the most activity. In all cases, the scoring values are higher or equal to the Cannabinoid 2 values. Likewise, the tree-predicted compounds would be good candidates for drugs, meeting all of Lipinski's rules' parameters. Moreover, the LogP, a critical descriptor, was improved with the substitution, achieving values smaller than 5. According to Lipinski's rules, a logP of less than five indicates better ligand bioavailability. Figure 5 shows that the three compounds hit the PBP in the active site, similar to oxacillin and Cannabinoid 2. Likewise, the molecular interactions are analogous for oxacillin and Cannabinoid 2. However, we highlight CLP1 because of its strong hydrophobic interactions (LysB143, ASPB295, and HISB293 with aromatic ring) and the three hydrogen bonds formed (AspB275, ARGB151, and THRB165).
The QSAR, docking, and drug likeness properties not only for Cannabinoid 2 but also the three proposed structures extend the possibility for the design, synthesis, and testing of new cannabinoid derivatives against MRSA strains. Molecules 2020, 25, x FOR PEER REVIEW 12 of 21

Data Set Collection and Minimum Energy Structure Computation
The structures of twenty-four cannabinoid compounds isolated from Cannabis sativa and tested in vitro against several MRSA Staphylococcus aureus were retrieved from Appenddino et al. along with their activity reported as minimum inhibitory concentrations, MICs (mol/L). The entire structures were drawn using Gausview 5.0 and optimized at WB97XD/6-311G(d,p) theory level, using Gaussian 16 for Linux [22]. WB97XD represents the functional of exchange-correlation from Density Functional Theory (DFT), whereas 6-31G(d,p) is the basis set describing the atoms involved in the molecular structure of the cannabinoids. We selected this level of theory due to the excellent correlation with experimental values for some reactions involving both polar and non-polar organic structures [81][82][83][84][85][86][87][88][89].
The minimum energy structures were verified utilizing frequency calculations over the 24 compounds. In this sense, all compounds must exhibit a complete set of real and positive values. We used the minimum structures to find every molecular descriptor such as the HOMO (highest occupied molecular orbital), LUMO (lowest unoccupied molecular orbital), polarizability, entropy, enthalpy, free energy, hardness, softness, electrophilic index (ω), lipophilia (ClogP), polar surface area, and topological index, and all sets of topographic descriptors implemented in Quibils-Midas [90]. The molecular descriptors derived from conceptual DFT (µ, η, s, and ω) were computed using Equations (7)- (10).

Statistical Analysis
In this work, we calculated and tabulated more than 700 molecular descriptors. We employed linear regression to select the most important molecular descriptors (LR). The pMIC (=Log (10 6 /MIC)) was used as a dependent variable, whereas each molecular descriptor was used as an independent variable. We considered values of R 2 > 0.5 as "important", which implies a high association with the dependent variable.
We then tabulated the meaningful molecular descriptors (the most correlated with the activity) and employed them to build mathematical models. Therefore, to achieve this goal, multiple linear regressions, and the addition-forward selection and backward-elimination methods were employed. In this method, the "important" descriptors are individually incorporated or withdrawn from the model at each step of the regression depending on three criteria: the correlation coefficient (R), the Fisher ratio values (F), and the standard deviation (σ). We selected variables to enter or remove until we obtained the best model. For this, we used the IBM-SPSS software.
When two or more molecular descriptors were correlated to each other, one of them was automatically removed from the model. After the development of the QSAR models, they must be statistically validated; thus, we proved the robustness of the models using the so-called "leave-one-out cross" methods (LOOCV), already reported in some work where QSAR methods have been used [91][92][93][94][95]. This method consists of omitting a date, generating a new lineal equation, and predicting back new omitted data values. Therefore, for a sample of "n" dates, there will be "n" new dates generated. The sum of squares for the differences between the experimental and generated dates are used to compute the standard error of prediction, SEP, defined as: where y is the experimental value of log (10 6 /MIC), ÿ is the estimated value from LOOCV, and n is the number of samples used in the mathematical model. Additionally, we quantified the models' predictive ability through Q 2 (prediction coefficient or Q 2 LOOCV ), which is defined as: whereȳ are the mean values of experimental ones. A Q 2 > 0.5 suggests a reasonable prediction power, while a Q 2 < 0.5 indicates a poor one. Additional statistical validation was carried out by the computation of both the bootstrapping coefficient (Q2boot) and the Y-scrambling analysis (a(R2) and a(Q2)). The first one is determined by taking a random sample from the total dataset where each sheet is repeatedly analyzed by the replacement of the full dataset, and a high value of this coefficient demonstrates the robustness of the model, whereas the Y-scrambling analysis introduces a random perturbation of the response variable; low values of these parameters suggest good robustness of the model.

Drug-Likeness Property Prediction
The drug-likely concept relates to the similitude of any ligand with drugs already in the pharmaceutical market. According to several studies, some physicochemical characteristics could determine whether or not a particular molecule is able to become a drug [96][97][98][99]. Therefore, some critical values for the hydrophobicity, electronic distribution, hydrogen bonding characteristics, molecular size, and flexibility are determined in the administration, distribution, metabolism, and excretion behavior of any molecule in the living organism.

Molecular Docking
All of the data of the compounds were optimized at WB97XD/6-311G(d,p) and saved in .pdb format, while we obtained the targets from XRD (x-rays diffraction) reported at the protein data bank (PDB) webpage [100]. Four targets were selected: isoleucyl-tRNA synthetase (Iso-TyRS, PDB ID: 1JZS, inhibitor: mupirocin), penicillin-binding protein (PBP, PDB ID: 1MWT, inhibitor: oxacillin) and DNA gyrase (PDB ID: 2XCQ, inhibitor: norfloxacin). These proteins are already known as the targets for some commercial antibiotics (in italics, following PDB); moreover, their cellular locations provide some hints about the action mechanism involved with this kind of compound, i.e., if the activity occurs on the membrane or, otherwise, if the cannabinoids need to cross the cellular membrane to achieve the activity.
After downloading the proteins' structures, both the native ligand (if any) and water molecules were removed. Additionally, polar hydrogen atoms and Kollman-type charges were added. The non-ligand proteins were then saved in PDB format. The refined PDB files, without native ligands, were "redocked", ensuring that the native ligands that hit the target in the same pocket before were removed. Therefore, this procedure allows verifying if the docking parameters specified in the input file for the docking method are reasonable and able to recover a known complex's structure and interactions. In all cases, Autodock4 was used [101].
After the redocking protocols, all of the compounds were docked against the four proteins mentioned above. Ligands and proteins, saved in PDBQT format, were employed to perform the docking calculation using the Autodock4 software. The Autogrid software was used to compute the interaction points with the grid box into the targets' active center. For this aim, 50 points, through the x, y, and z directions, were considered, and a genetic algorithm was used to take into account 2.5 million interactions. The box-grid dimensions were used, taking into account the active site in each case. Then, the grids' centers were constructed as follows: Iso-TyRS: The best scoring values were reported, saved in .pdb format, and visualized with Discovery Studio Visualizer [102]; the 2D interaction diagrams were used to identify the molecular interactions and to analyze each one.

Conclusions
Twenty four cannabinoid derivatives reported as anti-Staphylococcus aureus agents have been optimized at the wB97XD/6-31G(d,p) level of theory. From the minimum geometries, several electronic, topological, and topographic descriptors were calculated and tabulated. Using the different multiple regression methods, three mathematical models that describe the quantitative structure-antimicrobial activity relationship were found for the two Staphylococcus aureus strains, SA-1199B and XU212. In both cases, the mathematical equations with four topographic descriptors are highlighted due to their suitable correlation coefficient (R2); their small prediction error, s; and the high Fisher coefficient (F). The three-molecular descriptor models were validated, employing leave-one-out cross-validation and Y-scrambling methods, with coefficients Q 2 > 0.86, suggesting suitable robustness for the obtained models. According to the molecular descriptors found in the QSAR models, the antimicrobial activity may be associated with the lipophilic factor, the frontier orbital, and the benzene ring's polarity. A detailed look into the frontier orbitals suggested that most active cannabinoids have donor-acceptor characteristics. On the other hand, to gain more insight into the molecular interactions with potential targets, all of the data were docked against three proteins, iso-tyrosyl RNA synthetase (Iso-TyRS), penicillin-binding protein (PBP), and DNA gyrase (DNA Gyr). Interestingly, the most active cannabinoids show favorable activity against PBP and a small affinity to iso-tyrosyl RNA synthetase. The inactive compounds had few affinities toward the three targets. Cannabinoid 2 has the best scoring values against PBP. The results of the docking against PBP revealed that this compound coupled favorably into the active site, having molecular interactions similar to oxacillin, a commercial PBP-inhibitor. Between these molecular interactions, both short-and large-range dipole-dipole and strong pi-pi interactions are highlighted. These results confirm the role of the hydroxyl group and the pi bonds in these cannabinoids' biological activity. Therefore, the biological activity, QSAR results, and molecular docking suggest that cannabinoids' antimicrobial activity may be related to the capability to inhibit the PBP protein. The statistical robustness of mathematical models and the docking results analysis allowed us to propose three new compounds that are potentially active against the MRSA strains, encouraging future synthesis and experimental testing.