Mass Spectrometry and Structural Biology Techniques in the Studies on the Coronavirus-Receptor Interaction

Mass spectrometry and other biophysical methods have made substantial contributions to studies of the interactions between severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and human proteins. The most interesting feature of SARS-CoV-2 seems to be the structure of its spike (S) protein and its interaction with the human cell receptor. Mass spectrometry of the S protein revealed how the glycoforms are distributed across the S protein surface. X-ray crystallography and cryo-electron microscopy have had a huge impact on studies of the interaction between the S protein and the ACE2 receptor protein by elucidating the three-dimensional structures of these proteins and their conformational changes. The findings of the most recent studies on SARS-CoV-2-human protein-protein interactions are described here.


Introduction
Emerging infectious diseases caused by severe acute respiratory syndrome coronaviruses (SARS-CoV and SARS-CoV-2) present a tremendous threat to international public health [1]. The risk to the global population brought by the Middle East respiratory syndrome (MERS-CoV) [2] is deemed to be much lower. According to the World Health Organization (WHO), the fatality rate of MERS-CoV is quite high (36%) [3,4]; however, the virus does not seem to pass easily from person to person. Dromedary exposure has been shown to be one of the main risk factors for that disease [3,5,6]. The other four coronaviruses that are pathogenic to humans (229E, OC43, NL63, HKU1) are usually associated with mild clinical symptoms [4,7].
According to the International Committee on Taxonomy of Viruses, coronaviruses belong to the subfamily Coronavirinae in the family Coronaviridae [8]. The group of viruses was recognized by a few virologists in 1968 [9]. The viruses were named "coronaviruses" to reflect the characteristic crown-like appearance by which they are identified under an electron microscope (EM) [9]. Coronaviruses are enveloped, positive-sense, single-stranded RNA viruses of pleomorphic shape, measuring between 80 and 160 nm [1,10]. On the basis of serological and genomic evidence, CoVs are categorized into four important genera: Alphacoronavirus, Betacoronavirus, Gammacoronavirus, and Deltacoronavirus [1].
In December 2019, a few adult patients with pneumonia of unknown cause were admitted to a hospital in Wuhan [20]. In early January 2020, a novel betacoronavirus (SARS-CoV-2, tentatively
Table 1. Comparison of the biological features of SARS-CoV and SARS-CoV-2.
interactome exposes new aspects of SARS-CoV-2 biology and potential targets for SARS-CoV-2 inhibition.
In another study, MS-based HLA-I (human leukocyte antigens-I) and HLA-II epitope binding prediction tools were utilized to identify SARS-CoV-2 epitopes recognized by helper and cytotoxic T cells (CD4 + and CD8 + , respectively) [71]. Responses to both CD4 + and CD8 + T cells have been detected in SARS-CoV and in SARS-CoV-2-infected patients [72,73]. Unlike traditional binding assays which rely on chemical synthesis and the a priori knowledge of ligands to be assayed, MS method used by Poran et al. was based on natural peptide-HLA complexes that are subject to the endogenous processing and presentation pathways within the cell [71]. It has been revealed that the relative expression of SARS-CoV-2 proteins in virally infected cells vary significantly; this should be considered in vaccine design to induce cellular immunity [71].

Spike Glycoproteins
As the S protein is surface-exposed and mediates the host cell penetration by coronaviruses, it is the main focus of vaccine and therapeutic design [48,51]. Understanding the role of the SARS viruses spike glycoprotein and the character of its interaction with host receptor is fundamental to the understanding of viral pathogenesis [74,75]. One of the hypotheses to explain the higher transmission rate of SARS-CoV-2 compared to SARS-CoV is the genetic recombination of the S protein [27,48,76].
The S protein of SARS coronaviruses belongs to the class I viral fusion proteins. It consists of three monomers, each ~1200 amino acid residues long (Figure 1A) [73]. Each monomer of this densely glycosylated spike protein is approximately 180 kDa and contains two subunits, S1 and S2 (Figure 1C), which are responsible for attachment to the host cell and membrane fusion, respectively [77,78]. The S2 subunits of SARS-CoV-2 and SARS-CoV are structurally conserved, whereas the S binding part of SARS-CoV, which is used to recognize its entry receptor, shares only approximately 73-75% overall amino acid sequence identity with the SARS-CoV-2 binding domain [79].
Glycosylation is one of the most prominent post-translational modifications in many viral S or envelope proteins, and some researchers argue that the determination of site-specific glycosylation of virus glycoproteins would enable the development of vaccines that take advantage of glycosylation-dependent mechanisms [81]. Mass spectrometric methods have proved to be very useful for quantifying site-specific glycosylation [81,82]. Indeed, mass spectrometry has emerged in recent years as a pivotal method for characterizing the glycosylation of numerous virus surface proteins [83,84].
Although genomic methods are very informative about viral mutation or adaptation under immune selective pressure, they cannot report on a crucial feature of enveloped viruses: viral spike glycosylation. Exploring spike glycosylation and plasticity with advanced mass spectrometric methods, e.g., by comparing recombinant preparations with wild-type viral proteins, can be very helpful for a better understanding of the conformational dynamics that shape receptor or antibody binding [84]. Bioinformatics and proteomics approaches have shown that the binding of previous coronavirus S proteins to their respective receptors is mediated by their oligomannose N-glycans [82,85]. Very recently, Watanabe et al. revealed, by combined mass spectrometric and cryo-EM analysis, how the N-linked glycans occlude distinct regions across the surface of the SARS-CoV-2 spike protein [86]. To resolve the site-specific glycosylation of the SARS-CoV-2 S protein, three kinds of proteases were used to generate glycopeptides that each contain a single N-linked glycan sequon. Liquid chromatography-mass spectrometry (LC-MS) analysis determined the glycan composition for all 22 N-linked glycan sites. It was shown that 8 sites contain substantial populations of oligomannose-type glycans, principally the N234 and N709 sites, and that the remaining 14 sites are dominated by processed, complex-type glycans [86]. This extensive heterogeneity is similar to that of the S proteins of other coronaviruses such as MERS and HKU1, with a broad distribution of oligomannose-type glycans and without one particular dominant peak, as is the case for some viral glycoproteins [87]. Alteration of glycosites can affect viral infectivity, pathogenesis and host responses, e.g., by sterically masking polypeptide epitopes and modulating S protein-receptor interactions. Distinct epitope features between SARS-RBD and SARS-CoV-2-RBD have been shown by studies using murine polyclonal antibodies [88].
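The notion of an N-linked glycan sequon used above can be illustrated with a short sketch. It assumes the canonical sequon definition (Asn, then any residue except Pro, then Ser or Thr); the function name and the toy peptide are hypothetical and are not taken from the cited studies. Finding a sequon only marks a potential glycosylation site; occupancy and glycan composition still have to be determined experimentally, e.g., by LC-MS.

```python
import re

# Canonical N-linked sequon: Asn, any residue except Pro, then Ser or Thr.
# The lookahead allows overlapping sequons (e.g. "NNTS") to be reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(sequence: str) -> list[int]:
    """Return 1-based positions of Asn residues that start an N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(sequence.upper())]


# Toy peptide for illustration only (not a real spike protein fragment).
toy_peptide = "MKNGTAANPSLNNTSQ"
print(find_sequons(toy_peptide))  # [3, 12, 13]; N8 is skipped because Pro follows
```

Applied to a full-length protein sequence, the same function lists candidate sites that can then be compared against the experimentally observed glycopeptides.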
It is worth mentioning that all of the glycan sites are conserved on the S2 subunit between SARS-CoV and SARS-CoV-2, whereas the S1 subunit exhibits glycan site additions and deletions. SARS-CoV-2 maintains a total of 22 N-linked glycan sites in comparison with 23 on SARS, with 18 of these sites being in common [87].
During viral infection, the spike protein is cleaved into the S1 and S2 subunits by nearby host proteases, such as human airway trypsin-like protease (HAT), cathepsins and transmembrane protease serine 2 (TMPRSS2), and releases the signal peptide to promote virus entry into host cells [7,89]. Proteolytic priming usually involves a single cleavage event; however, SARS coronavirus entry requires another cleavage (on the S2 domain) by the endosomal protease cathepsin [90,91]. That second cleavage activates the protein for membrane fusion via irreversible conformational changes [92].
Recently, it was revealed that SARS-CoV-2 has a furin cleavage site at the boundary between S1 and S2 whose sequence is unique among β-coronaviruses [48]. It had been shown before that the introduction of a furin recognition motif at R667 of the SARS-CoV spike glycoprotein allows for efficient cleavage and increased cell-cell fusion activity [93]. The RRAR motif in the novel SARS virus, in place of the single arginine present in other similar viruses, allows effective cleavage by furin and other proteases [27,76]. Since furin is highly expressed in the lungs, the virus can easily exploit that enzyme to activate its S glycoprotein [76]. Interestingly, Ou et al. discovered that the SARS-CoV-2 S protein could trigger syncytia in human receptor cells independently of exogenous protease [77]. Another distinguishing feature of the SARS-CoV-2 spike glycoprotein is the significant variability of its receptor binding domain (RBD). That domain is the most variable part of the coronavirus genome [37,94].
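The difference between a multibasic RRAR motif and a single arginine can be made concrete with a simple pattern search. The sketch below assumes the minimal furin recognition pattern R-X-X-R (arginine, two arbitrary residues, arginine), which RRAR satisfies; the function name is hypothetical, and the two short fragments are illustrative strings modeled on the description above rather than authoritative reference sequences. A pattern match only flags a candidate site; actual cleavage depends on structural context and must be shown experimentally.

```python
import re

# Minimal furin-like recognition pattern: Arg, any two residues, Arg.
# A lookahead with a capture group reports overlapping matches as well.
MINIMAL_FURIN = re.compile(r"(?=(R..R))")


def find_furin_like_sites(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched tetrapeptide) for each R-X-X-R hit."""
    return [(m.start() + 1, m.group(1)) for m in MINIMAL_FURIN.finditer(sequence.upper())]


# Illustrative fragments (assumptions for this example, not reference sequences).
multibasic_fragment = "TNSPRRARSVAS"   # contains the RRAR motif
single_arg_fragment = "TVSLLRSTS"      # contains only a single arginine

print(find_furin_like_sites(multibasic_fragment))  # [(5, 'RRAR')]
print(find_furin_like_sites(single_arg_fragment))  # []
```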
Different coronaviruses use different domains within the S1 subunit to attach to the appropriate receptor [48]. The RBD of the SARS-CoV-2 spike glycoprotein binds directly to the peptidase domain (PD) of the human cell receptor (Figure 2). These small (~21 kDa each) [51] receptor-binding domains in the whole protomer are depicted in colors in Figure 1B.
Many studies have shown that human angiotensin-converting enzyme 2 (hACE2) is a functional receptor for SARS-CoV-2, as it is for other SARS-related coronaviruses [48,63,95]. Zhou et al. showed that only ACE2, and no other coronavirus receptor such as aminopeptidase N (APN) or dipeptidyl peptidase 4 (DPP4), is used by SARS-CoV-2 as an entry receptor [37]. However, the primary physiological role of ACE2 is catalyzing the hydrolysis of angiotensin II (a vasoconstrictor peptide) into the angiotensin heptapeptide (a vasodilator) [96]. ACE2 is an integral membrane metalloproteinase whose N-terminal extracellular domain contains six canonical sequons for N-linked glycosylation and several potential O-linked sites. The occupancy of N-linked glycans, O-linked glycosylation and the heterogeneity of the O-linked glycans on ACE2 have recently been studied using multiple MS-based approaches, including glycomic and glycoproteomic methods [97]. Glycomic analyses revealed that the majority of ACE2 N-glycans are complex, with limited high-mannose and hybrid glycans. That work, with the help of molecular dynamics (MD) simulations, revealed crucial roles for glycosylation, not only in sterically masking polypeptide epitopes, but also in directly modulating spike-ACE2 interactions [97].
To engage the ACE2 receptor, the RBD of S1 undergoes hinge-like conformational motions that transiently hide or expose the determinants of receptor binding, as has been shown by Wrapp et al. [51]. These two states are referred to as "up" and "down" conformations, where "up" corresponds to the receptor-accessible state (Figure 1C). Each PD of the homodimeric ACE2 protein accommodates one RBD of the spike protein, mainly through the arch-shaped α1 helix [63,84]. The α2 helix and a loop connecting the β3 and β4 strands of ACE2 also make a limited contribution to that binding (Figure 2).
When the RBD is in the "down" conformation, shielding of receptor binding sites on the SARS-CoV-2 S protein by proximal glycosylation sites (N165, N234, N343) can be observed [86].
Interestingly, two glycans on ACE2 (at N090 and N322) have been predicted by MD to form interactions with the S protein. Each of multiple simulations showed the N322 glycan interacting with the S trimer, despite its location outside of the receptor-binding domain. The arms of the ACE2 glycan at N090 were shown to interact with multiple regions of the S trimer surface over the course of the simulations, revealing a relatively high degree of glycan dynamics [97]. Nevertheless, considerable efforts still need to be undertaken in order to fully understand the role of glycans in SARS-CoV-2 infection and pathogenicity.
Yan et al. discovered prominent variations and conformational differences in the interfaces of SARS-CoV and SARS-CoV-2 with the ACE2 receptor [63]. In that work, the most relevant alteration was shown to be the substitution of Val404 in the SARS-CoV RBD with Lys417 in the RBD of the virus that is responsible for the current pandemic.
It was suggested that the substitution of other residues (Tyr442 to Leu455, Leu443 to Phe456, Phe460 to Tyr473, and Asn479 to Gln493) may also influence the affinity for the human cell receptor [63] (highlighted in yellow in Figure 3). Superimposition of the SARS-CoV-2 C-terminal domain (encompassing the RBD) structure onto the SARS-RBD structure revealed that the majority of the secondary structure elements are well superimposed in that domain [87]. However, the cryo-electron microscopy studies were done in the presence of the neutral amino acid transporter B0AT1, which could be the reason for some divergence of the results from the work of Walls et al. [48].
Nonetheless, as shown by Wang et al., among the 24 residues in hACE2 that make van der Waals contacts with both RBDs, 15 amino acids exhibit more contacts with the SARS-CoV-2 C-terminal domain [87]. Consistently, the SARS-CoV-2 RBD also has more residues than the SARS-CoV RBD that directly interact with ACE2, forming van der Waals contacts and hydrogen bonds. In that work, F486 in SARS-CoV-2, in place of L472 in SARS-CoV, was shown to form strong aromatic-aromatic interactions with the Y83 residue of ACE2, and E484 in SARS-CoV-2, in place of P470 in SARS-CoV, formed ionic interactions with the receptor's K31 residue [87].
It has been shown, using biolayer interferometry, that the SARS-CoV-2 S^B (binding) domain engages the human receptor with an affinity comparable to that of SARS-CoV S^B from viral isolates associated with the 2002-2003 epidemic [48]. On the other hand, Wrapp et al. showed that ACE2 binds to the SARS-CoV-2 ectodomain 10- to 20-fold more tightly than to SARS-CoV [51]. Similarly, Wang et al. found that the SARS-CoV-2 RBD displays approximately 4-fold stronger affinity towards hACE2 than the SARS-RBD. The equilibrium dissociation constant (K_D) of SARS-CoV-2 RBD binding to ACE2 was calculated to be 133.3 ± 5.6 nM [87]. Surprisingly, computational analyses predicted that the SARS-CoV-2 RBD sequence is not optimal for receptor binding [79].
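To put the fold differences in affinity quoted above on an energy scale, the standard relation ΔG° = RT ln(K_D / c°) can be applied. The sketch below is only a back-of-the-envelope illustration: the temperature of 298.15 K and the 1 M standard state are assumptions made for this example (the measurement conditions are not given here), and the function names are hypothetical.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # gas constant in kJ/(mol*K)
TEMPERATURE_K = 298.15           # assumed temperature, not stated in the text


def binding_free_energy(kd_molar: float, temperature_k: float = TEMPERATURE_K) -> float:
    """Standard binding free energy in kJ/mol from K_D (in M), with a 1 M standard state."""
    return R_KJ_PER_MOL_K * temperature_k * math.log(kd_molar)


def free_energy_gain(fold_tighter: float, temperature_k: float = TEMPERATURE_K) -> float:
    """Free energy difference in kJ/mol corresponding to an n-fold lower K_D."""
    return R_KJ_PER_MOL_K * temperature_k * math.log(fold_tighter)


# K_D reported for SARS-CoV-2 RBD binding to ACE2: 133.3 nM
print(f"dG  for K_D = 133.3 nM: {binding_free_energy(133.3e-9):.1f} kJ/mol")  # about -39 kJ/mol
for fold in (4, 10, 20):
    print(f"ddG for {fold:>2}-fold tighter binding: {free_energy_gain(fold):.1f} kJ/mol")
```

Under these assumptions, a 4-fold difference corresponds to roughly 3.4 kJ/mol and a 10- to 20-fold difference to roughly 5.7 to 7.4 kJ/mol, so the differences reported by the various groups are small on the energy scale even though the fold changes look distinct.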
There are other analytical and biophysical tools that can provide detailed information on the binding affinity of biomolecules, such as Förster resonance energy transfer (FRET) and biosensors (including FRET-based biosensors) [99-101], and isothermal titration calorimetry (ITC) [102-104]; however, these are beyond the scope of this review. The choice of the experimental strategy is usually dictated by the proteins under investigation. Optical biosensors for studies on protein interactions, along with their advantages and limitations, have been thoroughly reviewed by Zhao et al. [105].
The use of other orthogonal methods supporting MS, X-ray crystallography and cryo-electron microscopy can give more detailed insight into the S protein-ACE2 interaction and help identify protein targets for the discovery and development of anti-coronavirus therapy [63,99,106].
It is worth mentioning that, five years after the first SARS outbreak, a range of candidate vaccines had been developed. However, as of July 2020, there are no approved vaccines or drugs against any human-infecting CoVs [48,53,104].
Previously, despite intensive work on SARS-CoV protease inhibitors, none of the studied compounds had gone through a complete preclinical development program, mainly because of sharp funding cuts in most countries in 2005-2006 [106].
There are suggestions that some medicines and vaccines against SARS-CoV could probably be used to treat infections with the SARS-CoV-2 virus. This drug repurposing strategy could significantly shorten the time and reduce the cost in comparison to de novo drug discovery and randomized clinical trials [107].
However, Wang et al. showed that, despite the structural similarity of RBD of SARS-CoV and SARS-CoV-2, they exhibit different epitope features and differing immunogenicity [86]. On the other hand, Tai and co-workers revealed that SARS-CoV RBD-induced antibodies could cross-react with SARS-CoV-2 RBD and cross-neutralize SARS-CoV-2 pseudovirus infection, suggesting that SARS-CoV RBD-specific antibodies may be used for treatment of SARS-CoV-2 infection [108].
More studies are needed to explore the pathogenicity mechanism of SARS-CoV-2, particularly to uncover the mystery of the molecular mechanism of viral entry into host cells and replication. Such studies will provide the basis for future research on developing targeted antiviral drugs and vaccines [109].
Since the outbreak at the end of 2019, scientists have been working extensively on therapies and vaccines against the novel coronavirus. Treatments and vaccines not only have to be proven effective against the virus, but must also be safe for people. Deeper investigation into the interactions between the spike protein and the human cell receptor(s) may provide early scientific guidance for viral prevention and control.

Conclusions and Future Perspectives
As of 31 August 2020, 25,118,689 cases of COVID-19 had been confirmed in 213 countries and territories around the world. As of that day, approximately 844,000 people had died of the disease [110].
The COVID-19 situation raises two questions: when will it end, and what will come next? There is a hypothesis that SARS-CoV-2 will join the group of "seasonal infections" in the future [35]. At this moment, despite the many scientific groups working on it, we have only partial knowledge of this novel virus and its interactions with human cells. Moreover, another adaptive process could result in a virus with even higher infectiousness and transmissibility in humans.
Furthermore, there are many other species of coronaviruses in animals that could become global health threats. For these reasons, in-depth studies that bring together researchers from the frontiers of biology and chemistry, epidemiologists and doctors are clearly of the utmost importance.
This review provides insights into the COVID-19 current situation, with a special emphasis on the current state of the art in terms of the SARS-CoV-2 S protein and human cell receptor protein interactions. From the lessons learned during the SARS-CoV and SARS-CoV-2 epidemics, we can improve on the handling of global pandemics or epidemics. The clear, reliable and timely dissemination of information is also crucial in dealing with the epidemic.
Mass spectrometry methods are utilized in studies on the glycosylation and interaction of the virus S protein and the receptor protein, revealing targets for neutralizing antibodies elicited through vaccination. Moreover, multi-omic profiling of the host response could be helpful in tracking disease and in preventing future pandemics of similar viruses.