Identification of Key Amino Acid Residues Determining Product Specificity of 2,3-Oxidosqualene Cyclase in Siraitia grosvenorii

Sterols and triterpenes are structurally diverse bioactive molecules generated through cyclization of linear 2,3-oxidosqualene. Based on carbocationic intermediates generated during the initial substrate preorganization step, oxidosqualene cyclases (OSCs) are roughly segregated into a dammarenyl cation group that predominantly catalyzes triterpenoid precursor products and a protosteryl cation group which mostly generates sterol precursor products. The mechanism of conversion between two scaffolds is not well understood. Previously, we have characterized a promiscuous OSC from Siraitia grosvenorii (SgCS) that synthesizes a novel cucurbitane-type triterpene cucurbitadienol as its main product. By integration of homology modeling, molecular docking and site-directed mutagenesis, we discover that five key amino acid residues (Asp486, Cys487, Cys565, Tyr535, and His260) may be responsible for interconversions between chair–boat–chair and chair–chair–chair conformations. The discovery of euphol, dihydrolanosterol, dihydroxyeuphol and tirucallenol unlocks a new path to triterpene diversity in nature. Our findings also reveal mechanistic insights into the cyclization of oxidosqualene into cucurbitane-type and lanostane-type skeletons, and provide a new strategy to identify key residues determining OSC specificity.


Introduction
Sterols and triterpenes exert a wide spectrum of important biological activities [1,2]. They are structurally diverse: over 20,000 steroids and triterpenoids with 200 different skeletons have been found in eukaryotes [3]. Because these compounds display promising pharmaceutical activity and remarkable stability, they have attracted much attention from researchers [4,5]. However, their production from natural sources is limited by difficulties in plant cultivation and extraction, which has encouraged further exploration of their synthesis and pharmaceutical applications. Nevertheless, chemical synthesis of sterols and triterpenes in high yield remains very challenging, mainly because of their pronounced structural micro-heterogeneity [6,7].
Metabolic enzymes from the biosynthetic pathway appear to be a promising alternative for producing these important natural products; they have attracted growing interest and have enabled considerable progress in the production of triterpenoids and sterols. The first step in the biosynthesis of all triterpenoids and sterols is the cyclization of a 30-carbon precursor, 2,3-oxidosqualene, which arises from the isoprenoid pathway; this step is catalyzed by oxidosqualene cyclases (OSCs) [8]. It triggers a complex reaction cascade that produces tricyclic, tetracyclic, or pentacyclic molecules. At this point, the sterol and triterpenoid biosynthetic pathways diverge depending on the type of OSC. Cyclization of 2,3-oxidosqualene in the chair-boat-chair (CBC) conformation leads to a protosteryl cation intermediate, the precursor of the sterols via the formation of cycloartenol, lanosterol, and parkeol [9,10]. In contrast, 2,3-oxidosqualene in the chair-chair-chair (CCC) conformation is cyclized into a dammarenyl cation intermediate, which subsequently gives rise to diverse triterpenoid skeletons after further rearrangements (Figure A1). Intensive studies have since revealed the functions of many members of the OSC enzyme family, but the basis of its functional diversity is still unclear and further research is needed.
Studies on the cyclization mechanisms go back to the 1950s [11]. Much evidence has shown that the functional diversity of OSCs is governed by several key amino acid residues. Up to now, the active sites of four sterol synthases and two triterpene synthases have been identified and functionally analyzed. They are lanosterol synthase from Homo sapiens (HsLAS) and Saccharomyces cerevisiae (SceERG7) [12][13][14][15], squalene-hopene cyclase from Alicyclobacillus acidocaldarius (AacSHC) [16], cycloartenol synthase from Arabidopsis thaliana (AthCAS) [17][18][19][20], and β-amyrin synthase from Avena strigosa (AsbAS) [21] and Euphorbia tirucalli (EtAS) [22]. Mutagenesis of key amino acid residues may lead to loss of function or to a change in product types or profiles. Although many mutagenesis studies have focused on S. cerevisiae and A. thaliana, progress in the study of OSCs from medicinal plants has lagged behind. Moreover, very little research has been reported on the key residues that are essential for functional conversion between sterol and triterpene synthases. Therefore, it is of great significance to identify the key amino acids of OSCs, so as to expand their product range and create novel, structurally diverse products.
Siraitia grosvenorii, a member of the Cucurbitaceae family, is a traditional herb mainly planted in the Guangxi Zhuang Autonomous Region in China. Its genome includes two members of the OSC gene family; one of them, cucurbitadienol synthase (CS), cyclizes oxidosqualene to form the cucurbitadienol triterpenoid skeleton, which is a distinct step in phytosterol biosynthesis [23,24]. So far, CS has been functionally characterized from Cucurbita pepo [25], Citrullus colocynthis [26], and Cucumis sativus [27]. A previous study suggests that four indirectly interacting residues of CS in C. sativus may act collaboratively in the choice of substrate folding type and in the five-membered-ring formation reaction for plant OSCs. Here, using a combined approach that relied on homology modeling, residue coevolution analysis, and site-directed mutagenesis, we identified the key amino acid residues and the reaction mechanism underlying triterpene product specificity, providing catalytic insights into the functional conversion between sterols and triterpenes.

Catalytic Mechanism Prediction of Cucurbitadienol Biosynthesis in S. grosvenorii
Figure A2 shows the homology model of SgCS built using the human LAS crystal structure [28] as a template, with the presumed active-site residues responsible for the construction of the cucurbitadienol scaffold highlighted. During optimization of the model, no major structural changes were observed for active-site residues, and refinement consisted only of fine-tuning the coordinates of the active-site side-chain atoms. The active-site residues are conserved between the SgCS model and LAS: Asp486, Cys487, and Cys565 in SgCS coincide with their counterparts Asp455, Cys456, and Cys533 in LAS, respectively.
Molecular docking and quantum-chemical energy calculations are useful for explaining experimental results in theoretical terms. To test the hypothesis that cucurbitadienol is formed via the C-4 intermediate, geometry optimization and single-point energy calculations were performed on 2,3-oxidosqualene and the C-4 cation substrate, respectively, as shown in Figure A3. AutoDock (V4.2) was used to dock both 2,3-oxidosqualene and the C-4 cation into the active-site pocket of SgCS (Figure 1). The homology modeling and molecular docking indicated that Asp486, Cys487, Cys565, Tyr535, and His260 in SgCS may be regarded as active-site residues for 2,3-oxidosqualene cyclization. These residues were subjected to site-directed mutagenesis experiments to identify their functions.
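The residue correspondence between SgCS and LAS described above is read off a pairwise sequence alignment. The following is a minimal sketch of that bookkeeping, not the procedure used in this study: the function name `map_residues` and the toy alignment fragment are hypothetical, and the start numbers were chosen only so that the toy example reproduces the Asp486/Asp455 and Cys487/Cys456 pairing.

```python
def map_residues(aligned_a: str, aligned_b: str,
                 start_a: int = 1, start_b: int = 1) -> dict[int, int]:
    """Map residue numbers of sequence A onto sequence B.

    Both inputs are rows of one pairwise alignment ('-' marks a gap).
    Only columns where both sequences carry a residue are mapped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    mapping: dict[int, int] = {}
    pos_a, pos_b = start_a - 1, start_b - 1
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a != "-":
            pos_a += 1
        if res_b != "-":
            pos_b += 1
        if res_a != "-" and res_b != "-":
            mapping[pos_a] = pos_b
    return mapping


# Hypothetical alignment fragment around the DCTAE motif (not real sequences).
toy_sgcs = "GWDCTAE--LK"   # assumed to start at residue 484
toy_las = "G-DCTAEGGLK"    # assumed to start at residue 454
mapping = map_residues(toy_sgcs, toy_las, start_a=484, start_b=454)
print(mapping[486], mapping[487])
```

Columns that contain a gap in either row are skipped, so an insertion in one enzyme shifts the numbering of every downstream residue; this is why equivalent residues carry different numbers in the two proteins.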

Figure 1 (panels c and d). (c) Hydrogen-bond interaction view of 2,3-oxidosqualene with the protein; (d) hydrogen-bond interaction view of the C-4 cation with the protein. The 3D representation of SgCS is shown as a cartoon (gray); sticks with cyan carbon atoms represent amino acid residues, and sticks with yellow carbon atoms represent 2,3-oxidosqualene and the C-4 cation. Red dotted lines denote hydrogen bonds.

Bioinformatic Analysis of SgCS
Table A1 lists the deduced protein parameters predicted by the online ExPASy ProtParam tool; SgCS was classified as a stable protein without signal peptides. Our results suggested that SgCS is mainly located in the nucleus or cytoplasm. The putative protein sequence of SgCS had a negative grand average of hydropathicity (GRAVY) value, which indicates a hydrophilic protein. Moreover, two transmembrane domains were found in SgCS, spanning amino acids 117-194 and 601-641.
The CS sequences cloned from different species were compared by multiple sequence alignment, and the results indicated the presence of the DCTAE motif, which includes the catalytic aspartic acid residue at position 486 and has been proposed to initiate the cyclization reaction (Figure 2) [8]. SgCS also contains three repeats of QW motifs (conserved motifs rich in aromatic amino acids, starting with Q-Gln and ending with W-Trp), which have structure-stabilizing effects and may help to stabilize the carbocation intermediates during cyclization [29]. In addition, SgCS has a histidine residue at position 260 in the MWCHCR motif that has been proposed to play an important role in the stabilization of the protosteryl cation intermediate [30].
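For orientation, the two sequence-level checks mentioned in this section can be sketched in a few lines. GRAVY is the mean Kyte-Doolittle hydropathy over all residues, so a negative value indicates a hydrophilic protein, and conserved motifs such as DCTAE or MWCHCR can be located by exact string search. The helper names `gravy` and `find_motif` and the toy sequence are assumptions for illustration; this is not the ProtParam implementation and the toy sequence is not SgCS.

```python
import re

# Kyte-Doolittle hydropathy scale for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Grand average of hydropathicity: mean hydropathy per residue."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


def find_motif(sequence: str, motif: str) -> list[int]:
    """Return 1-based start positions of every exact motif occurrence."""
    pattern = f"(?={re.escape(motif)})"  # lookahead also finds overlaps
    return [m.start() + 1 for m in re.finditer(pattern, sequence)]


# Hypothetical toy sequence containing both motifs discussed in the text.
toy = "MWCHCRAADCTAEQ"
print(find_motif(toy, "MWCHCR"), find_motif(toy, "DCTAE"))
print(round(gravy("DCTAE"), 2))
```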

DCTAE Motif for the Initiation of the Polycyclization Cascade
Corey [30] proposed that the D456 residue of the DCTAE motif of S. cerevisiae LAS initiates the ring-forming reaction. Asp486 in SgCS coincides with its counterpart in S. cerevisiae LAS, Asp456. We first constructed three variants (D486N, D486E, D486A) to verify the function of this amino acid residue of SgCS. As shown in Figure 3a, all of the mutants completely lost the ability to produce cucurbitadienol because of a single-point mutation at the Asp486 position. This indicates that the acidic Asp residue acts as the proton donor that initiates the cucurbitadienol ring-forming reaction. Interestingly, the D486A and D486N mutants showed multiple product profiles with a molecular mass (m/z) of 426, whereas D486E produced a minor amount of compound 1 (unidentified) as the sole product (Figure 3c). Compound 2 was distinguished from an authentic cucurbitadienol standard. The retention times of compounds 3, 4, 5, and 7 were 19.86, 20.71, 22.52, and 25.52 min relative to cucurbitadienol, and the compounds were identified as euphol, dihydrolanosterol, dihydroxyeuphol, and tirucallenol, respectively; the data are shown in Figures A4-A7. These four additional compounds can be assigned to different C-C bond-forming conformations. Compound 4 was formed through a chair-boat-chair-chair envelope (C-B-C-C) conformation featuring a trans (CH3-14 and H-13) orientation at the C-D ring fused carbons to give a 6/6/6/5-fused tetracyclic protosteryl cation. The subsequent hydride shifts from H-9 to C-8, H-13 to C-17, and H-17 to C-20, together with the methyl shifts from CH3-8 to C-14 and CH3-14 to C-13, give the final skeleton of compound 4. The other compounds, 3 and 7, start from the chair-chair-chair (C-C-C) conformation of the A/B/C rings, with the chair and boat conformation of the D ring, respectively, and follow the dammarenyl cation path. Further hydride and methyl shifts formed compounds 3 and 7, while compound 5 underwent an additional hydrogenation reaction (Figure 4).
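As a quick arithmetic check on the m/z 426 value quoted above: 2,3-oxidosqualene and its cyclized isomers such as cucurbitadienol share the formula C30H50O, whose nominal mass is 426. The sketch below assumes that the reported m/z corresponds to this nominal molecular mass; the helper name `formula_mass` is hypothetical.

```python
# Integer (nominal) and monoisotopic masses of the most abundant isotopes.
NOMINAL = {"C": 12, "H": 1, "O": 16}
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}


def formula_mass(formula: dict[str, int], table: dict[str, float]) -> float:
    """Sum atomic masses over an elemental composition such as {'C': 30}."""
    return sum(table[element] * count for element, count in formula.items())


C30H50O = {"C": 30, "H": 50, "O": 1}
nominal_mass = formula_mass(C30H50O, NOMINAL)
monoisotopic_mass = formula_mass(C30H50O, MONOISOTOPIC)
print(nominal_mass, round(monoisotopic_mass, 4))
```

Because cyclization rearranges bonds without adding or removing atoms, every direct cyclization product of 2,3-oxidosqualene keeps this mass, so the mass alone cannot tell the isomers apart; retention time and NMR data are needed for that.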

Based upon analysis of the human LAS structure by X-ray crystallography, Thoma [28] proposed that the acidity of D455 of the DCTAE sequence is increased by hydrogen bonding with Cys456 and Cys533; their counterparts in SgCS are the cysteines at position 487 (C487) and position 565 (C565), respectively. To explore whether residues C487 and C565 are also involved, we constructed six variants (C487M, C487R, C487A, C565M, C565R, and C565A) to verify the functions of these two amino acid residues of SgCS. The results suggest that the hydrogen bond between Cys487 and the carboxyl group of Asp486 is stronger than that between Cys565 and Asp486, because the activity of the C487A variant is about 20% of the wild type, whereas that of C565A is about 35%, as shown in Figure 3a. The product profiles of the mutants at positions 487 and 565 are summarized in Figure 3b. The Met and Ala mutants produced compounds 3-7 as part of the product profile in addition to cucurbitadienol. In contrast, the Arg mutants produced only a small amount of compound 1.
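The variant labels used throughout this section follow the usual wild-type/position/substitution notation (for example D486N, or D486E/C487R for a double mutant). A minimal sketch of how such labels can be parsed and applied to a sequence is given below; the function names, the `first_residue_number` parameter, and the six-residue toy fragment are hypothetical and serve only to illustrate the notation.

```python
import re
from typing import NamedTuple


class Mutation(NamedTuple):
    wild_type: str
    position: int
    mutant: str


_LABEL = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutations(label: str) -> list[Mutation]:
    """Parse 'D486E' or a combined label such as 'D486E/C487R'."""
    mutations = []
    for part in label.split("/"):
        match = _LABEL.match(part.strip())
        if match is None:
            raise ValueError(f"unrecognized mutation label: {part!r}")
        mutations.append(Mutation(match[1], int(match[2]), match[3]))
    return mutations


def apply_mutations(sequence: str, label: str,
                    first_residue_number: int = 1) -> str:
    """Return the mutated sequence, checking each wild-type residue."""
    residues = list(sequence)
    for mut in parse_mutations(label):
        index = mut.position - first_residue_number
        if not 0 <= index < len(residues):
            raise ValueError(f"position {mut.position} is outside the sequence")
        if residues[index] != mut.wild_type:
            raise ValueError(
                f"expected {mut.wild_type} at {mut.position}, "
                f"found {residues[index]}"
            )
        residues[index] = mut.mutant
    return "".join(residues)


# Hypothetical fragment assumed to start at residue 485 (W485-D486-C487-...).
fragment = "WDCTAE"
double_mutant = apply_mutations(fragment, "D486E/C487R", first_residue_number=485)
print(double_mutant)
```

Checking the wild-type letter before substituting catches numbering mistakes early, which matters when residue numbers differ between homologous enzymes.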

Functions of the Tyr535 and His260 Residue
A homology model suggested that Tyr535 acts together with His260 to direct cucurbitadienol biosynthesis more precisely. Residues His260 and Tyr535 in SgCS correspond to His234 and Tyr510 in Erg7, respectively. Wu [12,13,31] proposed that Tyr510 points to the monocyclic C-10 cation (lanosterol numbering) and that the His234:Tyr510 catalytic base dyad forms the organized local structure of Erg7. The Y535L mutant of SgCS did not afford any aberrant cyclization products, and only a very small amount of cucurbitadienol was produced. By contrast, the Y535W and Y535A variants of SgCS yielded compounds 3 to 7 in addition to cucurbitadienol. Furthermore, the Trp residue has the largest steric volume among all the proteinogenic amino acids, and the production of sterol compounds 3 to 7 in Y535W was significantly higher than in Y535A. As with compounds 3 to 7, the content of lanosterol (RT = 15.51 min) in Y535W was clearly increased compared with that in Y535A (Figure 5). These results indicate that substituting Tyr535 with amino acids of smaller or larger steric bulk alters the organized local structure and thus influences the polycyclization pathway, resulting in the formation of products 3 to 7.

Next, site-directed mutagenesis at position 260 was used to design more promiscuous OSCs; the product profiles of the mutants are shown in Table 1. All of the mutants produced compounds 5 and 7 as part of the product profile. Interestingly, substitution with acidic amino acids (Asp, Glu), aromatic amino acids (Phe, Trp, Tyr), or hydroxyl amino acids (Ser, Thr) at this position abolished cucurbitadienol biosynthesis, consistent with a specific role for His260 in C4-C8 bond formation. The H260X mutants substituted with hydroxyl amino acids (H260S and H260T) afforded the same product profile, comprising compounds 1, 5, and 7. The H260A, H260R, and H260Y mutants produced a minor amount of compound 4 in addition to other products. However, none of the mutants showed a specific product profile. Our data reveal that an aromatic amino acid residue at position 260 is important for promiscuity.

Double Mutants Determining Product Specificity
Various SgCS mutants were created by replacing amino acid residues at these five sites in combination and expressing them in BY4742. The D486E/C487R double mutant acquired the ability to specifically produce compound 1, although production was not increased in comparison with the single mutants D486E or C487R. The D486A/C487A double mutant likewise did not change the product profile compared with the D486A or C487A single mutant. Interestingly, compound 6 synthesis was significantly increased in the double mutant Y535L/C565R, which produced up to 15 times more product than the single mutant (Figure 5). These results show that residues may act synergistically in the choice of an unusual substrate folding intermediate for OSCs.

Reaction Mechanism of SgCS
The one-step reaction that forms the tetracyclic cucurbitadienol molecule from linear 2,3-oxidosqualene is considered to be a very complex biochemical reaction in nature. Based on the results of our investigation, we propose a hypothetical mechanism for the SgCS-catalyzed cyclization of 2,3-oxidosqualene to cucurbitadienol, shown in Figure 6. First, the cyclization reaction is initiated by Asp486 protonating the epoxide group of prefolded 2,3-oxidosqualene. Cys487 and Cys565 serve as hydrogen-bonding partners of Asp486 and thereby increase the acidity of Asp486 in SgCS. The OSC cyclization cascade then stops at the tertiary protosteryl cation at C-20 once the five-membered D-ring has formed. Second, the protosteryl C-20 cation is transformed into the cucurbitadienol C-4/C-8 cation via skeletal rearrangement. Ultimately, the highly conserved His260 is the basic residue that is close enough to accept the proton in the specific deprotonation of the C-4/C-8 cucurbitadienol cation that terminates catalysis. The hydroxy group of Tyr535, which is hydrogen-bonded to His260, would be in an appropriate position for the final deprotonation step.


Discussion
Using structural modeling, we identified five key amino acid positions that regulate the functional switch between cucurbitane-type and lanostane-type triterpenoid synthesis. Protein homology modeling combined with molecular docking is a useful method for discovering new enzymes, and it reveals the chemical diversity of enzyme catalytic mechanisms [32]. In the present study, this method enabled us to identify the novel phytosterols euphol, dihydroxyeuphol, and tirucallenol, and to uncover a previously unknown aspect of OSC cyclization. As far as we know, the biosynthesis of these compounds has not been reported before; their structures are quite similar to those of cycloartenol and lanosterol, which are the precursors of sterol biosynthesis. The products of SgCS mutants may participate in the synthesis of new steroid hormones and therefore affect the growth and development of a plant.
A recent study of human lanosterol synthase (HsLAS) found that four positions, Trp230, His232, Tyr503, and Asn697, may change the product from the tetracyclic triterpenoid lanosterol to pentacyclic products while keeping a protosteryl-type C-B-C conformation [33]. Two residues, Phe696 and Ser699, were identified in studies of Avena strigosa and Euphorbia tirucalli β-amyrin synthase (SAD1); these residues determine the functional interconversion of β-amyrin and lanosterol synthases. These enzymes produce pentacyclic and tetracyclic triterpenes from the same dammarane-type C-C-C conformation [21,34]. Most of the tetracyclic products formed via the C-B-C intermediate are precursors of primary metabolites, such as cholesterol and brassinosteroids (BRs) [35]. One exception, however, is cucurbitadienol synthase from cucurbit plants [25,27]. In this study, the five key residue positions in SgCS, Asp486, Cys487, Cys565, Tyr535, and His260, are responsible for the alternation between the C-B-C and C-C-C conformations. To our knowledge, this is the first report of a conformational change from the C-C-C to the C-B-C folding conformation among the OSC mutagenesis studies reported hitherto.
Over the course of evolution, plants have come to produce diverse catalogs of compounds in order to adapt to both natural and artificial selection. Research on engineering enzymes has progressed toward broadened substrate, reaction, and product specificity. In this study, a single-point mutation was sufficient to enlarge the product scope of SgCS for oxidosqualene cyclization significantly. Redesigning enzymes to have stringent specificity has proven to be dramatically more difficult. Various factors can be used to improve product specificity and stereoselectivity. During the cyclization process, specific cation-π interactions between aromatic residues and cationic intermediates are important for the generation of the specific final product by OSCs [20,[36][37][38]. It is worth noting that two residues, Cys565 and Tyr535, can act collaboratively in the choice of the specific substrate folding intermediate for OSCs. These results suggest that promiscuous OSC variants could be engineered to create specific products.
The reaction mechanism of OSCs has been studied in depth since the 1950s. Many OSCs have been cloned and sequenced since the first DNA sequence of the Candida albicans OSC was reported in 1990 [39]. However, isolating OSCs in an active form and characterizing their enzymatic properties in vitro has been very difficult because they are membrane-bound proteins. To date, X-ray crystallographic structures have been solved for only two triterpene cyclases, SHC and human lanosterol cyclase, and the structure of cucurbitadienol synthase has not yet been elucidated. Recently, a report describing functional analyses of the active sites in cucurbitadienol synthase has appeared, but studies on the catalytic mechanism of cucurbitadienol synthase remain insufficient, and further research is needed to better understand cucurbitadienol biosynthesis [40]. Acquiring detailed information on the polycyclization cascades promoted by triterpene cyclases will require collaboration among multiple research fields, such as molecular biology, structural biology, biochemistry, and organic chemistry.

Chemicals and General Methods
The host strain S. cerevisiae BY4742 was purchased from Invitrogen (Carlsbad, CA, USA). Chemical reagents were purchased from Sigma Aldrich (St. Louis, MO, USA), J&K Scientific Ltd. (Beijing, China), and Taihe Biotechnology Co. Ltd. (Beijing, China). KOD-Plus DNA Polymerase was purchased from TOYOBO Biotech Co. Ltd. (Shanghai, China). Primer synthesis and DNA sequencing were performed by Sangon Biotech Company (Beijing, China). Restriction enzymes and DNA ligase were purchased from Takara Biotechnology Co. Ltd. (Dalian, China). The yeast expression plasmid pCEV-G4-Km (Addgene plasmid # 46819) was a gift from Lars Nielsen and Claudia Vickers [41]. Yeast products were detected using an Acquity UPLC™ system (Waters Corp., USA) with an Acquity BEH C18 column (100 mm × 2.1 mm i.d., 1.7 µm), and purified using a Lumiere K-1001 pump, a Lumiere K-2501 single λ absorbance detector, and a YMC-Pack ODS-A column (5 µm, 10 × 250 mm, YMC, Kyoto, Japan). ESI-MS was performed on an LTQ-Orbitrap XL spectrometer (Thermo Fisher Scientific, Boston, MA, USA). The conversion rates of the substrate were calculated from the peak areas of the products as analyzed by UPLC at their respective maximum absorption wavelengths. Compounds were characterized by 1H NMR at 600 MHz and 13C APT at 150 MHz on Bruker AV III 600 NMR spectrometers. Chemical shifts (δ) were referenced to internal solvent resonances and are given in parts per million (ppm). Coupling constants (J) are given in hertz (Hz).
The three-dimensional structure models of SgCS were constructed using Modeller 9.17 (Accelrys, Inc., San Diego, CA, USA). Human lanosterol synthase (PDB accession code 1W6J) was used as the template for modeling. The generated three-dimensional structures were improved using the loop refinement program of Modeller 9.17. The final models were checked with Procheck (http://nihserver.mbi.ucla.edu/511SAVES/), and 90.4% of residues were located in the most favored regions of the Ramachandran plot.

Molecular Docking
AutoDock (V4.2) was used with an empirical free-energy function to evaluate binding free energies and with the Lamarckian genetic algorithm (LGA) to search for favorable binding positions [42]. The empirical scoring function, which contains hydrogen bonding, electrostatics, conformational, torsion, and solvent terms, was trained to calculate the affinity between ligand and protein [43]. It was used to dock compounds into the SgCS protein. AutoGrid was used to calculate the grid maps, which define the search region and represent the protein during the docking process; the dimensions of the grid maps were 50 Å × 50 Å × 40 Å, centered on coordinates 26.271, 68.084, and 7.604, with a spacing of 0.375 Å between grid points. The LGA parameters were set to 100 GA runs, a population size of 150, and a maximum of 2,500,000 energy evaluations; all other parameters were left at their default values.
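The search region described above can be made concrete with a little arithmetic. The sketch below computes the corner coordinates of the box from the stated center and dimensions. Because AutoGrid specifies the box as a number of grid intervals per axis (`npts`) rather than a length, the sketch also shows, as an assumption about how the figures might be read, the edge lengths that would result if 50 × 50 × 40 were interval counts at 0.375 Å spacing. The helper names are hypothetical.

```python
Vector = tuple[float, float, float]


def box_corners(center: Vector, edge_lengths: Vector) -> tuple[Vector, Vector]:
    """Return the minimum and maximum corners of an axis-aligned box."""
    lower = tuple(c - e / 2 for c, e in zip(center, edge_lengths))
    upper = tuple(c + e / 2 for c, e in zip(center, edge_lengths))
    return lower, upper


def edges_from_intervals(intervals: tuple[int, int, int],
                         spacing: float) -> Vector:
    """Edge lengths when the box is given as grid intervals per axis."""
    return tuple(n * spacing for n in intervals)


CENTER = (26.271, 68.084, 7.604)   # grid center from the text
SPACING = 0.375                    # grid spacing in angstroms from the text

# Reading 1: dimensions taken literally as angstroms.
lower, upper = box_corners(CENTER, (50.0, 50.0, 40.0))

# Reading 2 (assumption): 50 x 50 x 40 are interval counts, not lengths.
edges_if_intervals = edges_from_intervals((50, 50, 40), SPACING)

print(lower, upper)
print(edges_if_intervals)
```

The two readings differ considerably in volume, so it is worth confirming against the grid parameter file which one applies before reproducing the docking.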

Mutagenesis Experiments of SgCS
The CDS of the SgCS biosynthetic gene (GenBank accession No. HQ128567) was codon-optimized (Box A1) for synthesis and cloned into the BamHI/EcoRI sites of the pCEV-G4-Km-USER [44] yeast expression vector under the control of the TEF1 promoter to construct pCEV-G4-Km-USER-SgCS. The Site-Directed Mutagenesis Kit obtained from Biomed (Beijing, China) was used to introduce the mutations, and the corresponding degenerate primers are given in Table A2 with the substitutions underlined. After separation by 1% agarose gel electrophoresis, the resulting PCR products were directly transformed into Trans1-T1 E. coli. Sanger sequencing with the oligonucleotide primer pCEV-Seq was used to confirm the sequence of the mutation in the resulting plasmid (pCEV-G4-Km); the data are shown in Table A2.
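The mutant labels used in this work (for example Y535L, the Y535L/C565R double mutant, and the H260X saturation series) follow the usual wild-type/position/replacement notation. The sketch below shows one way to parse such labels and apply them to a protein sequence while checking the wild-type residue. All function names are hypothetical, and the short sequence in the usage example is a toy placeholder, not the SgCS sequence.

```python
import re

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard one-letter codes
MUTATION_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutation(label):
    """Split a label such as 'Y535L' into (wild_type, position, replacement)."""
    match = MUTATION_RE.match(label)
    if match is None:
        raise ValueError(f"unrecognized mutation label: {label!r}")
    wild_type, position, replacement = match.groups()
    return wild_type, int(position), replacement


def apply_mutations(sequence, labels):
    """Apply point mutations (1-based positions) after verifying each wild-type residue."""
    residues = list(sequence)
    for label in labels:
        wild_type, position, replacement = parse_mutation(label)
        if not 1 <= position <= len(residues):
            raise ValueError(f"{label}: position outside the sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(f"{label}: expected {wild_type} at position {position}")
        residues[position - 1] = replacement
    return "".join(residues)


def saturation_labels(wild_type, position):
    """List the 19 single substitutions at one site, e.g. H260A ... H260Y."""
    return [f"{wild_type}{position}{aa}" for aa in AMINO_ACIDS if aa != wild_type]


if __name__ == "__main__":
    toy_sequence = "MHYC"  # placeholder sequence for illustration only
    print(apply_mutations(toy_sequence, "Y3L/C4R".split("/")))
    print(saturation_labels("H", 260))
```

Checking the wild-type residue before substituting guards against off-by-one numbering errors, which are easy to make when the model and the expressed construct are numbered differently.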

Yeast Transformation and Cell Cultivation
The plasmids were transformed into S. cerevisiae strain BY4742 using the Frozen-EZ Yeast Transformation II kit obtained from Zymo Research (Orange, CA, USA), and transformants were selected on YPD plates containing 200 mg/L G418. The empty pCEV-G4-Km vector was also introduced into BY4742 as a control.
The recombinant cells were transferred to 2 mL of YPD medium containing 200 mg/L G418 and cultured in a humidified incubator at 30 °C and 250 rpm to an optical density at 600 nm (OD600) of approximately 1.0. Flasks (250 mL) containing 100 mL of medium were then inoculated with the seed cultures to an OD600 of 0.05 and incubated at 30 °C and 250 rpm for 3 days. Optical densities at 600 nm were measured with a Shimadzu UV-2550 spectrophotometer.

UPLC-Q-TOF Analysis of Yeast Extracts
Yeast cells were harvested from the culture by centrifugation at 10,000× g for 5 min, washed with 5 mL of 20% KOH/50% ethanol, and extracted three times with the same volume of n-hexane. The extracts were concentrated and dried under reduced pressure and then dissolved in 1 mL of acetonitrile.

Compound Structure Elucidation by NMR Data Analysis
NMR analyses of compounds 3–5 and 7 were performed on a Bruker Avance III 600 system in CDCl3. The structures were elucidated from optical rotation, ESIMS, and NMR data (¹H NMR, ¹³C APT, and NOESY spectra) and by comparison with reported data.

Conclusions
To summarize, we developed a reliable homology model of SgCS to explore the potential function of the active site. Molecular docking and quantum-chemical energy calculations were useful for explaining the catalytic mechanism in theory. The dynamic domain of SgCS is constituted by five covarying residues: Asp486, Cys487, Cys565, Tyr535, and His260. We described a site-directed mutagenesis strategy for the synthesis of cucurbitane-type and lanostane-type derivatives via an oxidosqualene cyclization reaction using SgCS as a safe, economical, and eco-friendly catalyst. The results suggest that the active site of SgCS accounts for the alternation between C-B-C and C-C-C conformations. Exploring the untapped catalytic promiscuity of natural enzymes can provide useful information about enzyme evolution. Because OSCs can polycyclize compounds in a one-step reaction, they are currently being applied as biocatalysts in a variety of biotechnological processes. Their tight stereocontrol gives these enzymes a clear advantage over chemical synthesis. Further studies are needed to broaden the substrate range of the enzyme, which would make OSCs a promising tool for producing novel, unnatural cyclic products.

Catalysts 2018, 8
calculations were performed on 2,3-oxidosqualene and C-4 cation substrates, respectively. AutoDock (v4.2) was used to dock both 2,3-oxidosqualene and the C-4 cation into the active-site pocket of SgCS, and the data in Figure 1 indicated that both fit well into the active site. The computed binding free energy of the C-4 cation (−12.46 kcal/mol) is larger in magnitude than that of 2,3-oxidosqualene (−10.86 kcal/mol). A proper position and orientation of the conserved side chains of His260, Trp413, Phe475, Trp613, and Phe729 was required to stabilize the intermediate tertiary cation at C-4. The active-site residue Tyr535 possibly participates in CH-π complexation with His260, because its distance is shorter in the C-4 cation complex (3.3 Å) than in the 2,3-oxidosqualene complex (4.1 Å). These results suggest that Tyr535 and His260 together could trigger the 4β-demethylation via oxidative decarboxylation.
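To put the two docking scores quoted above on a common scale, the difference in binding free energy can be converted into an approximate ratio of binding constants through ΔG = −RT ln K. The sketch below does this arithmetic; the temperature of 298.15 K is an assumption (the text does not state one), the function name is hypothetical, and docking scores are only estimates of true binding free energies.

```python
import math

GAS_CONSTANT = 1.987204e-3  # kcal/(mol*K)
TEMPERATURE = 298.15        # K; assumed, not stated in the text


def affinity_ratio(dg_a, dg_b, temperature=TEMPERATURE):
    """Ratio K_a / K_b of binding constants implied by dG = -RT ln K."""
    return math.exp(-(dg_a - dg_b) / (GAS_CONSTANT * temperature))


if __name__ == "__main__":
    dg_cation = -12.46     # kcal/mol, C-4 cation (from the text)
    dg_substrate = -10.86  # kcal/mol, 2,3-oxidosqualene (from the text)
    ratio = affinity_ratio(dg_cation, dg_substrate)
    print(f"estimated affinity ratio: {ratio:.1f}")
```

Under these assumptions, the 1.60 kcal/mol difference corresponds to a roughly 15-fold preference for the C-4 cation, which should be read as an order-of-magnitude indication only.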

Figure 1. The two primary energetically favored binding clusters of 2,3-oxidosqualene and the C-4 cation in the active site of SgCS from docking studies. (a) 2D docked view of the receptor SgCS with the ligand 2,3-oxidosqualene; (b) 2D docked view of the receptor SgCS with the ligand C-4 cation; (c) hydrogen-bond interaction view of 2,3-oxidosqualene and the protein; (d) hydrogen-bond interaction view of the C-4 cation and the protein. The 3D representation of SgCS is shown as a cartoon (gray); sticks with cyan carbon atoms represent amino acid residues, and sticks with yellow carbon atoms represent 2,3-oxidosqualene and the C-4 cation. The red dotted lines denote hydrogen bonds.


Figure 5. Extracted ion chromatogram of the 535 site and Y535L/C565R double mutant activity.


Figure 6. A proposed tentative mechanism for the SgCS-catalyzed cyclization reaction.

The underlined oligonucleotides represent mutagenized sites.

Figure A2. Homology model of cucurbitadienol synthase constructed using the human LAS crystal structure as a template. During optimization of the model, no major structural changes were observed for active-site residues, and refinement consisted only of fine-tuning the coordinates of active-site side-chain atoms.


Table 1. Product profile of S. cerevisiae BY4742 expressing the SgCS H260X site-saturated mutants.
