Structure-Based Molecular Networking for the Discovery of Anti-HBV Compounds from Saussurea lappa (Decne.) C.B Clarke

Finding target compounds is a crucial step in natural product research. This study presents a concept of structure-guided isolation to find candidate active molecules from herbs. We establish a process of anti-viral sesquiterpene networking. An analysis of the network suggested that new anti-HBV sesquiterpenes may be attributable to eudesmane-, guaiane-, cadinane-, germacrane- and bisabolane-type sesquiterpenes. To evaluate the efficiency of structure-based molecular networking, an ethanol extract of Saussurea lappa (Decne.) C.B Clarke was investigated, which led to the isolation of two guaiane-type (1 and 14), ten eudesmane-type (2–5 and 8–13), two chain (6 and 7) and one germacrane-type (15) sesquiterpenes, including seven new ones, lappaterpenes A–G (1–7), which are reported herein. The absolute configurations of the new compounds were established by coupling constants, calculated ECD and ROESY correlations, as well as comparisons of optical rotation values with those of known compounds. The absolute configuration of compound 2 was further confirmed by X-ray diffraction. Compounds 1–15 were evaluated for their potency against hepatitis B virus. Compounds 4, 6, 7 and 9 showed an effect on HBsAg, with inhibition ratios of more than 40% at 30 μM. Compounds 14 and 15 inhibited HBsAg secretion with IC50 values of 0.73 ± 0.18 and 1.43 ± 0.54 μM, respectively. Structure-based molecular networking inspired the discovery of target compounds.


Introduction
Natural product structures play a significant role in drug discovery and development [1,2]. Bioactive natural products offer opportunities to discover novel targets and mechanisms for treating human diseases [3]. However, effective methods for finding target compounds are lacking. Previously, ethnopharmacological knowledge, bioactivity screening of extracts and bioassay-guided isolation have inspired the discovery of active natural products [4,5]. In recent years, bioactivity-based molecular networking has significantly increased the efficiency with which active natural products are discovered as potential drug leads [6,7].
As part of an ongoing effort to find anti-viral sesquiterpenes in herbs from Yunnan province, China [8–10], we herein establish a new strategy, i.e., structure-based molecular networking, to investigate antiviral sesquiterpenes. Firstly, we collected antiviral sesquiterpenes from ZINC and ChEMBL. Secondly, based on the skeletons of the sesquiterpenes, we calculated their degrees of similarity with ChemmineR and ChemmineOB [11]. Lastly, we constructed a structure-based molecular network and divided the sesquiterpenes into communities with cluster_louvain from igraph [12], which led to the identification of key nodes (i.e., representative sesquiterpenes) in the sesquiterpene communities. A cluster analysis of the sesquiterpene network suggested that new anti-HBV sesquiterpenes may be attributable to eudesmane- (degree = 27), guaiane- (degree = 21), cadinane- (degree = 20), germacrane- (degree = 18) and bisabolane-type (degree = 16) sesquiterpenes (Table S1). Previous studies showed that bisabolane-type sesquiterpenes have good anti-HBV activity [13].
To evaluate the efficiency of structure-based molecular networking, an ethanol extract of Saussurea lappa (Decne.) C.B Clarke was investigated. S. lappa belongs to Saussurea, a large genus of the Asteraceae family that includes more than 400 species distributed worldwide [2], more than 40 of which have been used in traditional and alternative medicine [14]. Nowadays, the plant is mainly cultivated in Yunnan province, China. The roots of S. lappa are recorded in the Chinese Pharmacopoeia (2020 Edition) [15]. The plant is also a common Tibetan medicine and has been used to treat stomach pain and blood disorders. In Mongolian medicine, the roots of S. lappa have been used to treat lung abscesses and phlegm [16]. Some eudesmane-, guaiane- and germacrane-type sesquiterpenes have been isolated from S. lappa. These compounds showed potential effects on chronic superficial gastritis, ulcers, cancer, and bacterial, fungal and viral diseases [16–18].

Development of Structure-Based Molecular Networking
ChemmineR and ChemmineOB were applied to calculate similarities among sesquiterpenes. With a similarity threshold of 90%, similar sesquiterpenes were identified and clustered by cluster_louvain, which categorized the compounds into different communities. Key nodes with the highest degree values, and other similar nodes, are displayed as square nodes (Figure 1A). The top ten square nodes are listed in Figure 1A and Table S1. Anti-HBV sesquiterpenes (Table S2) were extracted from the molecular network in Figure 1A; in Figure 1B, nodes 21 (guaiane, degree = 21), 45 (eudesmane, degree = 27) and 54 (germacrane, degree = 18) are key nodes. Node 21 represents guaiane, located in the green community in Figure 1A; its community and related sesquiterpenes are illustrated in Figure 1C. Node 45 represents eudesmane, located in the red community in Figure 1A; its community and related sesquiterpenes are illustrated in Figure 1D. Node 54 represents germacrane, located in the yellow community in Figure 1A; related sesquiterpenes are illustrated in Figure 1E. A search of our database for the sources of germacrane-, guaiane- and eudesmane-type sesquiterpenes showed that these three skeleton types were simultaneously enriched in Saussurea spp. (Table S2).
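The idea of a key node, i.e., the node with the highest degree in its community, can be illustrated with a minimal Python sketch. It is not the authors' R workflow: the edge list and compound IDs are hypothetical, and connected components are used here as a simplified stand-in for the Louvain communities produced by cluster_louvain.

```python
from collections import defaultdict

# Hypothetical edges: pairs of compound IDs whose similarity met the 90% cutoff.
edges = [("g1", "g2"), ("g1", "g3"), ("g1", "g4"), ("e1", "e2"), ("e2", "e3")]


def build_adjacency(edge_list):
    """Return an undirected adjacency map {node: set of neighbours}."""
    adjacency = defaultdict(set)
    for a, b in edge_list:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def connected_components(adjacency):
    """Group nodes into connected components (stand-in for Louvain communities)."""
    seen, components = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        stack, members = [start], set()
        while stack:
            node = stack.pop()
            if node in members:
                continue
            members.add(node)
            stack.extend(adjacency[node] - members)
        seen |= members
        components.append(members)
    return components


def key_node(adjacency, members):
    """Pick the member with the highest degree; ties are broken alphabetically."""
    return min(members, key=lambda n: (-len(adjacency[n]), n))


adjacency = build_adjacency(edges)
communities = connected_components(adjacency)
key_nodes = [key_node(adjacency, c) for c in communities]
```

In this toy network, g1 (degree 3) and e2 (degree 2) are selected as the representative nodes of their groups, which mirrors how nodes 21, 45 and 54 are singled out by degree in Figure 1B.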

Anti-HBV Activities and SARs of Sesquiterpene Derivatives
S. lappa is a famous medicinal plant that grows in the Himalayan region. It is now mainly cultivated in Yunnan and Sichuan provinces in China. The roots of S. lappa have been used to treat viral diseases in Ayurveda, Unani and Siddha, as well as in traditional Chinese medicine. The plant has also been used in Tibet and other minority regions in China. Herein we report the isolation and structure determination of seven new sesquiterpenes (1–7), together with eight known ones (8–15). All compounds were tested for their anti-HBV and cytotoxic activities. The anti-HBV assay showed that the known compounds dehydrocostus lactone (14) and costunolide (15) had potent effects on HBsAg and HBeAg (Table 3), in agreement with the preliminary screening results. The active compounds (14 and 15) both feature an α,β-unsaturated lactone, and both inhibited HBsAg and HBeAg. Notably, however, the eudesmane sesquiterpene 13, which also bears an α,β-unsaturated lactone, showed no activity against HBV at 30 µM, while the eudesmane sesquiterpenes 4 and 9, in which the α,β-unsaturated lactone is broken, showed some effect on HBsAg, with an inhibition ratio of more than 40% at 30 µM. Although further research will be required to evaluate the mechanism of HBV inhibition and the structure-activity relationships (SARs) of these compounds, the results suggest that germacrane and guaiane sesquiterpenes may be the anti-HBV active chemical constituents of S. lappa. This research is the first to report that eudesmane sesquiterpenes without an α,β-unsaturated lactone show moderate anti-HBV activity.

Discussion
The loss of bioactive compounds during isolation is a common problem in natural product research. Herein we present the concept of structure-guided isolation to find candidate active molecules directly from herbs. We established a library of antiviral sesquiterpenes that includes structures, skeleton types, bioactivities and a network of sesquiterpene relationships to illustrate key nodes (i.e., representative sesquiterpenes). By employing similarity calculations, we constructed a sesquiterpene molecular network and categorized molecular clusters. Then, we used the bioactivity characteristics to filter the molecular clusters and predict the bioactivity of the sesquiterpenes in the clusters. We applied this workflow to discover antiviral compounds from an extract of S. lappa. It can be expected that this approach, i.e., structure-based molecular networking, will lead to the discovery of active natural products. It can also be combined with biologically active molecular networks for the analysis of bioassay-guided separation.

General Experimental Techniques
Optical rotations were measured with a Jasco P-1020 digital polarimeter. IR spectra were measured on a Thermo NICOLET iS10 with KBr pellets. UV spectra were recorded on a Shimadzu UV-2700 spectrophotometer. CD spectra were measured on an Applied Photophysics Chirascan instrument. X-ray diffraction was measured on a Bruker D8 Quest instrument. ESIMS and HRESI-MS were run on an Agilent 1290 UPLC spectrometer and an Agilent 6500 series Q-TOF spectrometer, respectively. NMR spectra were measured in CD3OD solution and recorded on a Bruker Avance III HD-600 or AV 800 spectrometer at 25 °C, using TMS as an internal standard. Chemical shifts are reported in units of δ (ppm), and coupling constants (J) are expressed in Hz. Column chromatography (CC) was carried out over silica gel (200–300 or 500–800 mesh, Qingdao Marine Chemical Factory), Sephadex LH-20 (25–100 µm, Pharmacia Fine Chemical Co., Ltd., Tokyo, Japan), MCI-gel CHP-20P (75–150 µm, Mitsubishi Chemical Industry, Ltd., Guangzhou, China) and Rp-18 (40–63 µm, Merck, Shanghai, China). Precoated silica gel plates (Qingdao Haiyang Chemical Co., Qingdao, China) were used for TLC. Detection was done under UV light (254 nm and 365 nm) and by spraying the plates with 10% sulfuric acid followed by heating. A Waters 1525/2998 liquid chromatography system (Waters Technologies, Wexford, Ireland) was used for HPLC. ACE C18-PFP and Waters SunFire C18 columns (5 µm, 143 Å, 250 mm × 10 mm) were used for semipreparative HPLC separations.

Sesquiterpene Network
A dataset of 11,741 sesquiterpenes was collected from the ChEMBL database, Binding DB, and publications (Supplementary Material). Duplicate compounds were removed. Anti-HBV activity was represented according to the IC50 value; 152 compounds (agonists) in the data set were active (IC50 < 10 µM) and were tagged with "1". Other compounds were marked with "0" (inactive).
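The de-duplication and binary labelling described above can be sketched in a few lines of Python. This is only an illustration: the record layout and compound keys are hypothetical, and treating compounds without a measured IC50 as inactive is an assumption consistent with "other compounds were marked with 0".

```python
# Hypothetical records: (structure key, IC50 in µM, or None if not measured).
records = [
    ("CMPD-A", 0.73),
    ("CMPD-B", 25.0),
    ("CMPD-A", 0.73),  # duplicate entry, dropped below
    ("CMPD-C", None),
]

ACTIVE_CUTOFF_UM = 10.0  # activity cutoff stated in the text (IC50 < 10 µM)


def label_compounds(rows, cutoff=ACTIVE_CUTOFF_UM):
    """Remove duplicate keys (first entry wins) and tag each compound 1 or 0."""
    labels = {}
    for key, ic50 in rows:
        if key in labels:
            continue  # duplicate compound
        labels[key] = 1 if ic50 is not None and ic50 < cutoff else 0
    return labels


labels = label_compounds(records)
```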
The skeleton type was determined for each sesquiterpene. Similarities among sesquiterpenes were calculated with ChemmineR and ChemmineOB to afford a similarity matrix. Based on a similarity threshold of 90%, a network of sesquiterpenes was generated with "ggnet" in R, and the nodes in the network were divided into different communities by cluster_louvain.
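As a rough illustration of how a similarity matrix is turned into network edges, the following Python sketch thresholds pairwise similarities at 90%. The text does not state which similarity metric was used, so the Tanimoto coefficient is an assumption here, and the fingerprints are hypothetical sets of feature IDs rather than output from ChemmineR.

```python
from itertools import combinations

# Hypothetical fingerprints: each compound is a set of structural feature IDs.
fingerprints = {
    "A": set(range(0, 20)),
    "B": set(range(0, 19)),   # shares 19 of A's 20 features
    "C": set(range(10, 30)),  # overlaps A and B only partially
}


def tanimoto(x, y):
    """Tanimoto coefficient: shared features divided by total distinct features."""
    union = len(x | y)
    return len(x & y) / union if union else 0.0


def similarity_edges(fps, threshold=0.90):
    """Return (a, b, score) for every pair whose similarity meets the threshold."""
    edges = []
    for a, b in combinations(sorted(fps), 2):
        score = tanimoto(fps[a], fps[b])
        if score >= threshold:
            edges.append((a, b, score))
    return edges


edges = similarity_edges(fingerprints)
```

Only the A–B pair (similarity 19/20) passes the cutoff in this toy data set; the resulting edge list is what a community-detection step would then partition.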

Plant Material
The roots of S. lappa were collected in Lijiang county, Yunnan province, China, and identified by Associate Professor Wu Zhikun (School of Pharmacy, Guizhou University of Chinese Medicine). A voucher specimen (KUMST-BS-0007) was deposited in the Laboratory of Chemical Biology for Natural Medicines, School of Life Science and Technology, Kunming University of Science and Technology.

ECD Calculation
The aglycons of the compounds were used as the chemical models for the ECD calculations. A conformational analysis was carried out using the MMFF molecular mechanics force field. The resulting conformers (< 15 kJ/mol) were optimized using DFT at the B3LYP-SCRF/6-311G(d,p) level with the integral equation formalism variant of the polarizable continuum model (IEF-PCM). All calculations were run with Gaussian 09 [32]. The free energies and vibrational frequencies were calculated at the same level to confirm the stability of the conformers, and no imaginary frequencies were found. The optimized low-energy conformers with energies < 2 kcal/mol were considered for the ECD calculations. The TD-DFT/B3LYP-SCRF/6-311G(d,p) method was applied to calculate the excitation energies, oscillator strengths and rotational strengths. The excitation energies and rotational strengths were used to simulate the ECD spectrum of each conformer by applying a Gaussian function. The final ECD spectrum of each compound was obtained by averaging the simulated ECD spectra of all conformers according to their excitation energies and Boltzmann distribution.
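The final averaging step can be sketched as follows. This is a simplified Python illustration, not the Gaussian 09 workflow: the conformer data are hypothetical, the band width sigma is an assumed value, relative free energies are assumed to drive the Boltzmann weights at 298.15 K, and the spectrum is left in arbitrary units without the physical prefactor.

```python
import math

GAS_CONSTANT_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def boltzmann_weights(rel_energies_kcal, temperature_k=298.15):
    """Normalised Boltzmann populations from relative energies in kcal/mol."""
    rt = GAS_CONSTANT_KCAL * temperature_k
    factors = [math.exp(-e / rt) for e in rel_energies_kcal]
    total = sum(factors)
    return [f / total for f in factors]


def gaussian_band(energy_ev, transitions, sigma_ev=0.3):
    """Sum of Gaussians centred on each excitation energy, scaled by its
    rotational strength (arbitrary units)."""
    return sum(
        r * math.exp(-(((energy_ev - e0) / sigma_ev) ** 2))
        for e0, r in transitions
    )


def averaged_ecd(energy_grid_ev, conformers, sigma_ev=0.3):
    """Boltzmann-weighted average of the per-conformer simulated spectra."""
    weights = boltzmann_weights([c["rel_energy_kcal"] for c in conformers])
    return [
        sum(w * gaussian_band(e, c["transitions"], sigma_ev)
            for w, c in zip(weights, conformers))
        for e in energy_grid_ev
    ]


# Hypothetical conformers: transitions are (excitation energy in eV, rotational strength).
conformers = [
    {"rel_energy_kcal": 0.0, "transitions": [(5.0, 10.0), (5.8, -4.0)]},
    {"rel_energy_kcal": 1.0, "transitions": [(5.1, 8.0), (5.9, -6.0)]},
]
grid = [4.0 + 0.1 * i for i in range(31)]  # 4.0 to 7.0 eV
weights = boltzmann_weights([c["rel_energy_kcal"] for c in conformers])
spectrum = averaged_ecd(grid, conformers)
```

With these toy inputs, the conformer lying 1 kcal/mol higher contributes roughly 16% of the population, so the averaged curve is dominated by the lowest-energy conformer.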

Anti-HBV Activity Evaluation
HepG2.2.15 cells, a human cancer cell line, were obtained from the China Center for Type Culture Collection (Wuhan, China) and maintained in medium supplemented with 10% fetal bovine serum (Meilunbio, Dalian, China) and 380 µg/mL G418 in a humidified 5% CO2 atmosphere at 37 °C. Inhibition of HBsAg and HBeAg secretion was detected by ELISA.
HepG2.2.15 cells were seeded in 96-well plates and treated with compounds for 6 days. On day 3, the culture medium containing compounds was collected and replaced. The levels of HBsAg and HBeAg in the cell culture supernatant were measured with HBsAg and HBeAg ELISA kits (Kehua, Shanghai, China), according to the manufacturer's instructions. Lamivudine (Adamas, Shanghai, China) was tested as the positive control for anti-HBV activity.
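The text does not give the formula used to convert ELISA readings into inhibition ratios. A common convention, assumed here purely for illustration, compares the blank-corrected signal of treated wells with that of untreated control wells; the function name and the optical density values below are hypothetical.

```python
def inhibition_ratio(treated_od, control_od, blank_od=0.0):
    """Percent inhibition of antigen secretion relative to the untreated control.

    Assumed convention: (1 - treated / control) * 100 after blank subtraction.
    """
    control_signal = control_od - blank_od
    if control_signal <= 0:
        raise ValueError("control signal must exceed the blank")
    return (1.0 - (treated_od - blank_od) / control_signal) * 100.0


# Hypothetical optical densities for one compound at a single concentration.
ratio = inhibition_ratio(treated_od=0.65, control_od=1.05, blank_od=0.05)
```

Under this convention, the hypothetical readings above correspond to an inhibition ratio of about 40%.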

Cytotoxic Activity Evaluation
HepG2 cells were plated in 96-well plates in 100 µL medium (Meilunbio, Dalian, China), to which the test samples were added at varied concentrations. After 72 h of incubation, MTT [3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide] solution [0.5 mg/mL in phosphate-buffered saline (PBS)] was added (20 µL/well) [33], and incubation continued for another 4 h to give a formazan product. After the medium had been removed, 200 µL DMSO was added to each well, and the formazan product was completely dissolved by thorough shaking. The absorbance of the solution was measured at 490 nm using a microplate reader (Tecan, Männedorf, Switzerland). MTT is reduced by dehydrogenase activity in cells to give a purple formazan dye, and the amount of formazan generated is directly proportional to the number of living cells. The compound concentrations that reduced the viability of HepG2 cell cultures by 50% (CC50) were calculated by regression analysis of the dose-response curves.
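The regression model used for the dose-response curves is not specified. As a simplified stand-in, the Python sketch below estimates CC50 by linear interpolation on a log-concentration scale between the two tested concentrations that bracket 50% viability; the concentrations and absorbance values are hypothetical, and a full analysis would typically fit a sigmoidal model instead.

```python
import math


def viability_percent(treated_abs, control_abs):
    """Cell viability as a percentage of the untreated control absorbance."""
    return treated_abs / control_abs * 100.0


def cc50_by_interpolation(concentrations_um, viabilities):
    """Interpolate on log10(concentration) between the points bracketing 50%.

    Returns None if viability never crosses 50% in the tested range.
    """
    points = sorted(zip(concentrations_um, viabilities))
    for (c_lo, v_lo), (c_hi, v_hi) in zip(points, points[1:]):
        if v_lo >= 50.0 >= v_hi and v_lo != v_hi:
            fraction = (v_lo - 50.0) / (v_lo - v_hi)
            log_cc50 = math.log10(c_lo) + fraction * (
                math.log10(c_hi) - math.log10(c_lo)
            )
            return 10 ** log_cc50
    return None


# Hypothetical readings: absorbance at 490 nm, untreated control = 1.00.
concentrations = [1.0, 10.0, 100.0]  # µM
absorbances = [0.90, 0.75, 0.25]
viabilities = [viability_percent(a, 1.00) for a in absorbances]
cc50 = cc50_by_interpolation(concentrations, viabilities)
```

Here viability falls from 75% at 10 µM to 25% at 100 µM, so the interpolated CC50 lies at the logarithmic midpoint of that interval, roughly 32 µM.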