Mechanistic Insight into SARS-CoV-2 M pro Inhibition by Organoselenides: The Ebselen Case Study

Featured Application: Density functional theory and docking studies applied to the understanding of the chemical binding of the M pro -inhibitor ebselen to the enzyme catalytic cysteine. Abstract: The main protease (M pro ) of SARS-CoV-2 is a current target for the inhibition of viral replication. Through a combined Docking and Density Functional Theory (DFT) approach, we investigated in-silico the molecular mechanism by which ebselen (IUPAC: 2-phenyl-1,2-benzoselenazol-3-one), the most famous and pharmacologically active organoselenide, inhibits M pro . For the ﬁrst time, we report on a mechanistic investigation in an enzyme for the formation of the covalent -S-Se- bond between ebselen and a key enzymatic cysteine. The results highlight the strengths and weaknesses of ebselen and provide hints for a rational drug design of bioorganic selenium-based inhibitors.


Introduction
The main protease (M pro ) and papain-like protease (PL pro ) of SARS-CoV-2 are suitable targets for the pharmacological action against its viral replication [1]. Both enzymes are Cysteine (Cys) proteases [2,3], and in the case of M pro , the mechanism closely resembles the well-studied mechanism of Serine protease. M pro works via a catalytic dyad formed by a nucleophilic Cysteine (Cys145) activated by a Histidine (His41) residue. After a preliminary proton transfer from Cys145 to His41, through which the nucleophilic strength of the former residue is strongly enhanced, M pro preferentially attacks peptide bonds after a glutamine residue [4], leading to a tetrahedral intermediate from which the actual peptide bond cleavage occurs, due to the back-proton-transfer from His45 leading to a thioester (Scheme 1a). After that, the thioester is hydrolyzed, regenerating the active enzyme.
EbSe is supposed to covalently inhibit SARS-CoV-2 M pro via the formation of a selenylsulfide (Se-S) adduct with Cys145 (Scheme 1b). The formation of a selenyl-sulfide bond with Cys residues was also postulated five years ago in order to explain the antiviral activity of EbSe against HIV [12]. However, to the best of our knowledge, the precise chemical mechanism has never been investigated in protein and it might provide insight into the rational engineering of existing organoselenides (e.g., diphenyl diselenide), ebselen derivatives [13,14] or the design of new selenium-based antivirals.
In a recent preprint, Sancineto et al. [15] reported interesting results indicating that the inhibitory capacity of organoselenium compounds towards M pro is greatly reduced upon dimerization to diselenides, and Ma et al. recently demonstrated that, under reducing conditions, i.e., in the presence of 1,4-dithiothreitol (DTT) and/or glutathione (GSH), EbSe is not able to effectively inhibit SARS-CoV-2 M pro as well as a panel of other Cys proteases [16,17]. Particularly, all the ebselen-like scaffolds investigated by Sancineto et al. were 1-2 orders of magnitude less efficient at inhibiting M pro after dimerization to diselenides. Moreover, the decrease in the inhibitory potency of EbSe in the presence of DTT and GSH raises the problem of formation of ebselen-low-molecular mass thiol adducts in vivo (i.e., adducts with free Cys and GSH). Further investigation is thus important to understand the true antiviral potential of organoselenium compounds, starting from the chemical mechanism, and then progressively moving to realistic biological conditions. For this reason, we thoroughly analyzed, in silico, the chemical mechanism of Se-S bond formation between EbSe and Cys145, which is at the basis of the protease inhibition, and we extended our analysis to the Se-S bond formation starting from ebselen ethanthiolate (EbSeS) (Scheme 2) as a model of the adduct of EbSe with endogenous thiol molecules. In fact, since it is unlikely that the reactive N-Se bond of EbSe reaches M pro unmodified, EbSeS was taken as a general model of ebselen selenyl sulfides, which are expected to be present in the biological environment. EbSe actually travels through the plasma bonded to the free Cys of albumin and/or other low-molecular-containing thiol molecules (Cys and GSH) [18,19]. Thus, EbSe is suitable for the description of in vitro inhibition and EbSeS is a simplified model for the description of in vivo inhibition of M pro . Scheme 1. (a) Acylation step of the proteolysis catalysed by SARS-CoV-2 M pro , [5], (b) Predicted mechanism of the inhibition of SARS-CoV-2 M pro by ebselen as investigated in this study.
In early 2020, a high-throughput screening discovered ebselen (EbSe, Scheme 2), one of the most famous organoselenides, as a potent inhibitor of SARS-CoV-2 M pro (IC 50 = 0.67 µM) [6]. EbSe is known to interact with a plethora of biologically relevant cysteines [7][8][9], implying both pharmacological interest and toxicological concern. However, the low toxicity of EbSe, assessed in different experimental and clinical trials, makes it an interesting scaffold on which to design multipurpose drugs [10,11]. In this work, we employ a joint molecular docking and density functional theory (DFT) approach, in which the speed of docking is used to provide a reasonable guess of the non-covalent complex between EbSe/EbSeS and M pro , and DFT is used to investigate bondbreaking and formation phenomena, which cannot be properly addressed without quantum mechanics. Lastly, we briefly discuss the thermodynamics of a possible evolution of the -S-Se-adduct hypothesized on the basis of X-ray structures and mass spectrometry data in a recent paper by Amporndanai et al. [20] who experimentally observed the breaking of the Se-C bond of EbSe in M pro , with subsequent release of the EbSe scaffold in the form of a salicylanilide (IUPAC: 2-hydroxy-N-phenylbenzamide) (Scheme 3).

Materials and Methods
In this work, we have used a combination of flexible molecular docking and Density Functional Theory (DFT) calculations. Docking methodologies were used to reasonably estimate the adduct between ebselen and M pro . The mechanism was investigated employing the so-called cluster approach [21], which is the fully quantum investigation of the enzyme reactivity using only selected residues that reproduce the chemical features of the catalytic pocket. Such a methodology was largely applied to the reactivity of highly different enzymes, spanning oxidoreductase [22], lyase [23] and serine [24] or metalloprotease [25] enzymes. While the inclusion of the enzymatic environment at a Molecular Mechanics level of theory (i.e., employing a QM/MM methodology such as the ONIOM scheme) [26]) is expected to have an impact on the reactions' energetics, Morokuma and coworkers [27] showed that the surrounding effect is rather weak when the catalytic pocket is exposed to the solvent, as it is for SARS-CoV-2 M pro . Thus, the environment is expected to affect the energetics, but is unlikely to alter the mechanistic features of the reaction.
Molecular docking was carried out to simulate the binding pose of EbSe with M pro from SARS-CoV-2 (PDB ID 6LU7 [6]). AutoDock Vina software [28], which has a high accuracy for binding mode predictions, was used [29]. For the M pro structure preparation, the waters, ions, ligands, and other molecules were removed, while the hydrogen atoms were added using the CHIMERA program, followed by 100 steps of energy minimization (amberff99SB) [30]. The catalytic dyad (Cys145 and His41) was considered neutral, as highlighted in previous studies [31,32]. An adduct between EbSe and cysteine (EbSe-Cys) was used in the docking simulations to mimic the putative metabolites of EbSe [33][34][35]. The tridimensional model of the ligands (EbSe and EbSe-Cys) were created with Avogadro and MOPAC (PM6 method) [36,37]. The files were prepared for the docking, using the AutoDock Tools 4.2 [38], with the flexible ligands. To consider the protein induced-fit effect and to improve the interactions between ligands and M pro , the flexible-flexible docking method was applied, where EbSe, EbSe-Cys, and the side chain of His41, Met49, Asn142, Cys145, Met165, Glu166, and Gln189 residues (from the active site) were considered flexible during the simulations. The grid box was centered on the coordinates x = −14.04, y = 17.44, and z = 66.22 (size = 25 × 25 × 25 Å), and an exhaustiveness of 50 was used. The complex ligand-receptor with the most favorable binding free energy, and the best Se···S(Cys145) orientation, was selected as a model of the binding pose and used to build the M pro pocket cluster for the DFT study.
After the docking, taking into account the closest M pro residues (at 4.0 Å) from EbSe and EbSeS, His41, Met49, Cys145, and Met165 were selected and removed from the M pro pocket, and the CH 3 CO and CH 3 NH groups were added to the N-and C-terminal regions, respectively, to mimic the backbone peptide bonds. In fact, the His41, Met49, Cys145, and Met165 residues were involved in many interactions with small molecules [39]. As a general model of ebselen thiolates, the carboxyl and amino moieties of EbSe-Cys were replaced with H atoms to create the ethanthiolate (EbSeS).
All density functional theory calculations were performed, employing Gaussian16 [40]. The level of theory employed was based on previous mechanistic investigations in an enzyme whose catalysis is based on reactive Cys or Sec residues [41,42]. Thus, the B3LYP hybrid functional [43,44], in combination with Grimme D3 dispersion correction and the Becke-Johnson damping function [45,46] was used. All first and second period atoms were described with the 6-311G(d,p) basis set, while for sulfur and selenium, Dunning's correlation consistent cc-pVTZ basis set was used. All structures were optimized in the gas phase and further solvation correction was taken into account as a single point, employing the SMD solvation model at SMD-B3LYP-D3(BJ)/6-311G(d,p), cc-pVTZ//B3LYP-D3(BJ)/6-311G(d,p), cc-pVTZ level of theory [47]. Diethyl ether was used to mimic the low dielectric constant of the enzymatic environment, as described in the literature [48]. Thermodynamic corrections at 298.15 K and 1 atm were computed by means of a standard statistical mechanics relationship based on electronic energies and gas phase frequency calculations, as implemented in the Gaussian software. All energies described in the main text are Gibbs free energies corrected with the contribution of solvation. The atoms from the backbone were kept constrained during all geometry optimizations. Frequency calculations were performed to assess the nature of the optimized geometries: all transition states have one imaginary frequency related to the normal mode connecting reactants to products; all minima have no imaginary frequencies. For EbSe, starting from TS1 and TS2, an intrinsic reaction coordinate (IRC) calculation was performed to verify that the correct transition state was located [49].

Results and Discussion
The EbSe ( Figure 1) and EbSeS binding poses were obtained from the molecular docking, as described in the Computational Methods, and the adduct between EbSe and EbSeS with the model catalytic pocket was reoptimized via DFT. The reactant complex (RC) formed by the four amino acids cluster with docked EbSe was used as the starting point of our mechanistic investigation. From our calculations, (Figure 2) the overall inhibition mechanism closely resembles the acylation step of the fully functional M pro shown in Scheme 1a,b. The presence of EbSe does not impair the activation of Cys145, which is effectively deprotonated by His41 with an activation energy of about 15.16 kcal mol −1 (TS1), a value close to that previously reported for this step in a computational study on the functional enzyme (19.9 kcal mol −1 ) [5].
Notably, an Ion Pair (IP) between deprotonated cysteine (Cys -) and protonated Histidine (HIP) was not located on the potential energy surface (PES); in addition, after the proton transfer, the cysteinate residue efficiently attacks the Se atom of EbSe, leading to an adduct with an almost linear N-Se-S bond (three centers intermediate, TCI). While confor-mational freedom of the backbone might help in stabilizing an Ion Pair, which was located in another M pro mechanistic investigation [50], our results suggest that the cysteinate attack to the Se-N bond occurs with a very low, if any, activation energy. The TCI is situated 3.41 kcal mol −1 above the RC; thus, the step is weakly endergonic. After the formation of the TCI, full covalent inhibition is reached with the back proton transfer from HIP to the N atom of EbSe, leading to the N-Se bond cleavage and to the complete formation of the S-Se bond between Cys145 and EbSe. This step is exergonic; the inhibited product (Pinhb) is almost 7 kcal mol −1 more stable than the RC. While the backproton transfer is predicted to have an activation energy of about 26 kcal mol −1 , thus being rate-determining for the whole process, the geometrical features of the transition state (TS2) appear slightly distorted, due to the constraints imposed to His41 backbone. Therefore, the activation energy of this process is likely overestimated in our model because of the limited flexibility of the catalytic pocket. Indeed, in the acylation step of the functional M pro , the analogous step occurs with a definitively lower activation energy [5]. However, previous investigations on enzymatic clusters [41,42] and on oversimplified models [51] based on reactive Cys/Sec showed that while the barriers might be magnified in these last ones, the relevant features of the mechanism can be correctly reproduced even in very reduced systems, given that the relevant reactive residues are considered, as in our case. The Pinhb obtained here presented a similar binding pose when compared with the EbSe from the covalent docking, previously published [52].
Using the same approach, ( Figure 3) the inhibition mechanism by EbSeS has been explored (energies are reported in Table S3 (Supplementary Materials)). As seen for EbSe, in this case, the inhibitor does not impair Cys145 activation, which occurs with an even lower activation energy of 8.83 kcal mol −1 (TS1). However, this is almost certainly related to the closer proximity of Cys145 and His41 residues in the cluster (the distance between S and N atoms is 3.46 Å; conversely, in the cluster docked with EbSe, the corresponding distance is 4.61 Å).
After TS1, the mechanism of EbSeS displays important differences with respect to that of EbSe. Particularly, a strongly destabilized IP (9.62 kcal mol −1 ) was located on the PES, from which the proton dislocated on HIP might be shuttled to Cys145, leading back to the RC. In addition, different pathways might be suitable for the nucleophilic Cysattack to the Se atom of EbSeS. First, the deprotonated Cys might attack the Se-S bond of EbSeS, leading to a TCI that has been previously identified in vacuo and in water in model molecular systems [53]. In our case, the TCI is located only 1.25 kcal mol −1 above the RC. Wiberg bond indexes analysis [54] of this intermediate in the gas phase (Table 1) revealed a behaviour similar to that computed for the small molecular model presenting a -S-Se-Sbond, whose coordinates were taken from Bortoli et al [53].  Wiberg indexes for S1-S2-S3 1 and S1-Se-S3 1 TCIs, with EbSeS and a sulfur analogue, and for two previously described minimal models. Level of theory: B3LYP-D3(BJ)/6-311G(d,p), cc-pVTZ.

EbS(e)S CH 3 S(S/SeCH 3 )SCH 3
2 S1-S2 0.190 3 0.450 S2-S3 0.833 3 0.610 S1-Se 0.325 0.535 Se-S3 0.703 0.537 1 TheTCI atoms' numbering scheme is defined for the three chalcogens of the TCI. For EbSeS S1 is the M pro -Cys sulfur, Se the selenium atom of ebselen, and S3 the sulfur atom of ethanethiolate. For the sulfur analogue EbSS, the central Se is substituted with S2. The same numbering scheme is defined for the minimal model with Se/S2 being the central chalcogen, and S1 and S3 the two marginal sulfur atoms. 2 Computed on structures taken from [53]; 3 In the sulfur analogue of EbSeS (EbSS), the TCI presents strongly asymmetric Wiberg indexes (0.19 and 0.83) in the gas phase, suggesting a more "reactant complex"-like nature. This is in agreement with the shift in reactivity towards an S N 2 behavior after inclusion of solvation (in our case, the enzymatic cluster environment) described in [53].
For our TCI, the indexes are consistent with two asymmetric S-Se bonds, as is reasonable given the different sizes of the two thiolates bonded to Se, ethanethiolate and an enzymatic cysteinate, with respect to the two symmetric thiolates of [53]. Further re-optimization of the TCI in diethyl ether (see Computational Methods) does not significantly affect the geometry, confirming the stability of this intermediate in a non-polar environment, such as the enzymatic pocket.
Such a structure might spontaneously loose ethanethiolate (EtS -), leading to the desired thiolate exchange and the EbSe-S-Cys(M pro ) adduct. However, this process is computed to be endergonic (+5.20 kcal mol −1 with respect to the RC), and the back proton transfer from HIP to EtSis required for the overall inhibition process to become exergonic (−1.78 kcal mol −1 with respect to the RC). Whether such a back proton transfer happens directly from the IP (leading to the concerted breaking of the EbSeS Se-S bond and to the formation of Cys-Se bond, bypassing the formation of a TCI) or at a later mechanistic stage (e.g., at the TCI itself) was not investigated, since such analysis is expected to be strongly influenced by the nature of the thiolate, by cavity rearrangements and by the presence of explicit water molecules, which can assist with the thiol exchange mechanism. Thus, we require further molecular dynamics investigation to understand how the evolution of the binding site can affect ebselen inhibition. However, the overall thermodynamic feasibility of the inhibition process seems to be less favorable with respect to the same process for EbSe (−1.78 and −6.86 kcal mol −1 , respectively). Since the chalcogenolate exchange between a thiolate and a diselenide was previously computed to be less thermodynamically favorable (in the condensed phase) with respect to the thiolate exchange between a thiolate and a selenyl sulfide [53], the same trend in thermodynamics can be expected to hold true when comparing EbSe to its diselenide, in agreement with the findings of Sancineto et al. [15].
Lastly, starting from the Pinhb obtained from EbSe, we investigated the mechanistic hypothesis of Se-C bond-breaking depicted in Scheme 3. The reaction appears to proceed as an aromatic nucleophilic substitution (S N Ar), but no Meisenheimer intermediate was located, and all attempted optimizations led back to either the reactants or to the products, both in the gas and the condensed phases. Thus, in agreement with previous studies on molecular models, the soft chalcogen-leaving group and the weakly activated aromatic ring seem to favor a concerted S N Ar mechanism [55]. Beside the level of theory used throughout this work, some other density functionals (i.e., M06-2X [56], M11 [57] and B3LYP without Grimme dispersion) were tested to assess the non-existence of a Meisenheimer intermediate, as was done for model compounds in [55]. In all cases, the guess intermediate broke down.
Interestingly, the step described in Scheme 3 appears to be energetically favorable (all the following energies are gas phase Gibbs free energies), leading to products (i.e., (Cys)-SSeH and salicylanilide) that are located on the PES 12.33 kcal mol −1 below the Pinhb with the addition of one water molecule. However, the mechanistic details of the Se-C bond-breaking of EbSe should be carefully investigated to assess the kinetic feasibility of such a pathway and the role of the catalytic pocket residues, especially as no evidence suggests the occurrence of a similar reaction in solution. Such an extensive investigation is beyond the scope of our work, which revolves around the preliminary well-documented Se-S bond formation. However, due to the importance of the topic of the use of organoselenium compounds as M pro inhibitors, the preliminary computation of one of the possible transition states involved in the process was carried out, i.e., for the attack of OH − to the aromatic ring, with the other water proton located on His41 (HIP). This transition state is situated at a rather high energy on the PES (+52.37 with respect to the Pinhb with the addition of a water molecule). This suggests that it is unlikely that such a reaction would proceed in a biological environment without the active participation of nearby residues, such as His41. Indeed, the direct participation of His41 in the Se-C bond breaking event was hypothesized in [20].

Conclusions
In this work, we investigated the mechanistic details of S-Se bond formation between EbSe and the catalytic Cys of M pro , and a comparison was made with EbSeS. To the best of our knowledge, our investigation provides, for the first time, the stationary points for the covalent binding mechanism of Ebselen and one of its possible metabolites to an enzymatic Cys, i.e., M pro Cys145. Overall, the energetics computed for the metabolite suggest its weaker inhibition potential with respect to EbSe, in agreement with the recent studies on the activity of EbSe upon reduction to diselenide or in the presence of DTT/GSH. However, we must stress that the nature of the leaving thiol might exert an important effect on the thermodynamics of the whole reaction, especially since EbSe might reach the M pro bonded to an extremely bulky group, such as a whole protein. This prompts further studies based on molecular dynamics with suitable accurate force-fields for the selenides, as well as a fine experimental investigation of EbSe antiviral activity in biologically plausible conditions. Moreover, we provide thermodynamic support to the hypothesis of Se-C bond-breaking within M pro [20], and we suggest a detailed mechanistic investigation to understand if and how the direct participation of His residues of the catalytic pocket lowers the activation energy of the process.
Conversely to EbSeS, EbSe mechanism is straightforwardly similar to the canonical mechanism of Cys proteases, and the Cys145 attack on the Se atom appears to proceed without an appreciable activation energy. In addition, differently from the RC with EbSeS, EbSe exhibits an almost linear arrangement of S-Se-N atoms of Cys145 and EbSe (157 • ), making it suitable for the S N 2 process to occur without large structural rearrangements. Notably, since SARS-CoV-2 PL pro employs a catalytic triad that also includes reactive Cys and His residues, a similar mechanism is readily extendable to this and other viral Cys proteases, both for EbSe and EbSeS. Thus, efforts are prompted to design selenium-based systems able to reach the target protein with the N-Se bond still intact. This requires a finetuning of the N-Se bond, designing ad hoc molecular features [58] to prevent ring-opening in the presence of free cysteines such as those present in the serum, preserving the ability of N-Se bond activation in the target proteins.