Oligomeric State of β-Coronavirus Non-Structural Protein 10 Stimulators Studied by Small Angle X-ray Scattering

The β-coronavirus family, encompassing Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), Severe Acute Respiratory Syndrome Coronavirus (SARS), and Middle East Respiratory Syndrome Coronavirus (MERS), has triggered pandemics within the last two decades. With the possibility of future pandemics, studying the coronavirus family members is necessary to improve knowledge and treatment. These viruses possess 16 non-structural proteins, many of which play crucial roles in viral replication and in other vital functions. One such vital protein is non-structural protein 10 (nsp10), acting as a pivotal stimulator of nsp14 and nsp16, thereby influencing RNA proofreading and viral RNA cap formation. Studying nsp10 of pathogenic coronaviruses is central to unraveling its multifunctional roles. Our study involves the biochemical and biophysical characterisation of full-length nsp10 from MERS, SARS and SARS-CoV-2. To elucidate their oligomeric state, we employed a combination of Multi-detection Size exclusion chromatography (Multi-detection SEC) with multi-angle static light scattering (MALS) and small angle X-ray scattering (SAXS) techniques. Our findings reveal that full-length nsp10s primarily exist as monomers in solution, while truncated versions tend to oligomerise. SAXS experiments reveal a globular shape for nsp10, a trait conserved in all three coronaviruses, although MERS nsp10, diverges most from SARS and SARS-CoV-2 nsp10s. In summary, unbound nsp10 proteins from SARS, MERS, and SARS-CoV-2 exhibit a globular and predominantly monomeric state in solution.


Introduction
Coronaviruses (CoVs) belong to the RNA viruses in the Coronaviridae family, order Nidovirales.Within this family, there are four genera, which are α, β, γ, and σ coronaviruses.Among these, the β-coronaviruses are known to infect humans, with seven of them having been identified as affecting the human host.While four of these viruses typically cause mild symptoms similar to the common cold, the remaining three have led to deadly outbreaks over the past two decades [1].One of the pathogenic CoVs, SARS, caused a limited outbreak in China in 2002, resulting in approximately 8000 infections and a mortality rate close to 10% [2,3].A decade later, another CoV outbreak occurred in Saudi Arabia, named MERS, which had a lower infection rate (ca.2500) but a significantly higher mortality rate at ~35% [4].
The most recent CoV outbreak emerged at the end of 2019, caused by a novel CoV later named SARS-CoV-2, leading to the COVID-19 pandemic.SARS-CoV-2 has caused a global ongoing pandemic, affecting the entire world population with over 670 million infections and more than 6.7 million deaths, although unofficial estimates suggest higher numbers [5].It is now widely accepted that SARS-CoV-2 is becoming endemic in many countries [6].Given the regular recurrence of CoVs over the past two decades, it is essential to study and understand how these viruses infect humans, replicate in the human host, and develop strategies to handle infections to reduce severe disease and mortality.
SARS-CoV-2 is a single-stranded positive-sense RNA virus, with a genome that is approximately 30 kb in size, coding for three distinct classes of proteins [7].The virion is composed of four structural proteins named spike, envelope, nucleocapsid, and membrane proteins.Additionally, seven accessory proteins, known for their contribution to viral pathogenicity [8], may vary significantly between different β-CoVs [9].However, the largest class of proteins comprises the 16 non-structural proteins (nsp1 to nsp16), many of which play vital roles in viral replication [10] as part of the replication-transcription complex (RTC).
Some of these nsps are already validated drug targets, such as Paxlovid (Nirmatrelvir and Ritonavir), which targets the main protease (nsp5) [11][12][13], and Remdesivir, inhibiting the RNA-dependent RNA polymerase (RdRp) complex (nsp12-nsp7 and nsp8) [14].As their application is limited and the development of resistance is widely anticipated, the search for new drugs targeting other potential protein targets becomes crucial [15,16].Recently, clinical trials have shown that approved monoclonal antibodies lose their efficacy against omicron variants, leading to their approval being reversed [17,18], indicating that the pandemic is far from over.
Among these nsps, one of the central players is nsp10, a scaffolding protein that interacts with and stimulates at least two other non-structural proteins, nsp14 and nsp16 [22,23].Nsp10 stimulates the ExoN activity of SARS and SARS-CoV-2 nsp14, but not its N7-MTase activity [24] by forming a 1:1 complex [25].Similarly, it also stimulates the 2 -O-MTase activity of nsp16, forming a heterodimeric complex reported for all three pathogenic β-CoVs [21,26,27] (Protein Data Bank (PDB) example entries 2XYR and 5YN8).This makes nsp10 vital not only in the RNA cap formation process [28] but also for RNA proofreading [29].Recently, it has even been proposed that nsp14, nsp10, and nsp16 can form a heterotrimeric complex, with nsp14 and nsp16 overlapping on nsp10 in a particular N-terminal region of nsp14, termed the "lid", showing high structural flexibility [30,31].Successfully targeting nsp10 could lead to the inhibition of ExoN and N7-MTase functions, thus preventing RNA cap formation and hindering RNA proofreading by the ExoN domain, which corrects falsely incorporated bases by the proofreading RdRp complex [32].However, several knowledge gaps remain, such as the oligomeric state of the unbound form of nsp10 in solution compared to its state in forming various heterodimeric complexes.
In this project, we aim to characterise and compare the oligomeric state of nsp10 proteins from SARS, MERS, and SARS-CoV-2.The sequence alignment in Figure 1 illustrates the high degree of nsp10 sequence conservation (59.4%) in MERS compared to both SARS and SARS-COV-2, while nsp10 is 97.1% conserved between SARS and SARS-CoV-2. Figure 2 depicts a structural overlay of nsp10 crystal structures obtained from SARS, MERS, and SARS-CoV-2, revealing a remarkable degree of structural homology (r.m.s.d. is <1 Å over all atoms).Our comprehensive analysis, employing various techniques, consistently indicates that these nsp10s predominantly exists as monomers in solution.To gain further insight into their shapes, we conducted Small Angle X-ray Scattering (SAXS) experiments [33], confirming a conserved shape for nsp10s from SARS, MERS, and SARS-CoV-2, consistent with the known nsp10 crystal structures.CoV-2 nsp10 construct are highlighted in green, and additional missing residues of the short SARS-CoV-2 nsp10 construct are highlighted in blue.The percent identity between SARS-CoV-2 and SARS nsp10 proteins is 97.1%.In contrast, the percent identity between SARS-CoV-2 and MERS nsp10 proteins is 59.4%.The alignment was carried out using CLUSTAL OMEGA, 1.2.4 [34] while the secondary structure was assigned by DSSP as implemented in ESPript 3.0 [35].Protein sequence and structural alignment of full-length nsp10 proteins from SARS-CoV-2, SARS and MERS.Identical residues are shown in white on a red background, similar residues are shaded in red and distinct residues are displayed in black.Truncated residues of the long SARS-CoV-2 nsp10 construct are highlighted in green, and additional missing residues of the short SARS-CoV-2 nsp10 construct are highlighted in blue.The percent identity between SARS-CoV-2 and SARS nsp10 proteins is 97.1%.In contrast, the percent identity between SARS-CoV-2 and MERS nsp10 proteins is 59.4%.The alignment was carried out using CLUSTAL OMEGA, 1.2.4 [34] while the secondary structure was assigned by DSSP as implemented in ESPript 3.0 [35].

Results and Discussion
We prepared nsp10 constructs of varying lengths for MERS, SARS, and SARS-CoV-2, including full-length proteins for all three CoVs, alongside two shorter constructs for SARS-CoV-2.The purification process yielded an estimated purity of >95% for all proteins (Figure 3).On the SDS-PAGE, we observed that full-length MERS nsp10 migrates close to the lower band of each full-length SARS and SARS-CoV-2 nsp10.However, the splitting of the protein bands, evident in lanes C and D by SDS-PAGE, was not observed when examining the proteins in solution by multi-detection SEC (Table 1).We, therefore, conducted high-resolution mass spectrometry (HRMS) using all five protein constructs to validate the protein species.HRMS measurements show single species corresponding to the expected masses of all nsp10 constructs, suggesting that the double-bands and the lower migrating band for MERS nsp10 are technique-related artefacts of SDS-PAGE (Figure S1).Additionally, we detected a minor band at approximately 75 kDa in full-length MERS (lane B) and SARS-CoV-2 (lane D) nsp10, later identified as the E. coli chaperone DNAK (UniProtKB P0A6Y8, calculated MW of 69.1 kDa) through mass spectrometry analysis of the excised SDS-PAGE gel.To determine the oligomeric state nsp10s from SARS, MERS, and SARS-CoV-2, we conducted Multi-detection SEC experiments, combining SEC separation, light sca ering at two different angles, and differential viscometer measurements.The use of Multi-detection SEC allowed us to make reliable and highly sensitive measurements of the absolute molecular weight and particle distribution in solution.The Multi-detection SEC results revealed that the molecular weight of all three full-length nsp10s in solution did not significantly differ.Furthermore, we found that all constructs, especially the full-length nsp10 proteins, predominantly exist as monomers in solution (Table 1).Figure S2 shows representative Multi-detection SEC traces.
Notably, the full-length MERS nsp10 demonstrated exceptional monodispersity, whereas full-length SARS and SARS-CoV-2 nsp10s formed a minor pool of dimers (2 to To determine the oligomeric state nsp10s from SARS, MERS, and SARS-CoV-2, we conducted Multi-detection SEC experiments, combining SEC separation, light scattering at two different angles, and differential viscometer measurements.The use of Multidetection SEC allowed us to make reliable and highly sensitive measurements of the absolute molecular weight and particle distribution in solution.The Multi-detection SEC results revealed that the molecular weight of all three full-length nsp10s in solution did not significantly differ.Furthermore, we found that all constructs, especially the full-length nsp10 proteins, predominantly exist as monomers in solution (Table 1).Figure S2 shows representative Multi-detection SEC traces.Notably, the full-length MERS nsp10 demonstrated exceptional monodispersity, whereas full-length SARS and SARS-CoV-2 nsp10s formed a minor pool of dimers (2 to 3%) in solution.The shorter, "artificial" SARS-CoV-2 protein constructs exhibited a more substantial extent of dimer (3-11%) and trimer (up to 3%) formation in solution.However, it is important to emphasise that across all three β-CoV nsp10s, the predominant state in solution was monomeric.Figure S3, derived from SEC-MALS experiments [37] with the three SARS-CoV-2 nsp10 constructs, further confirmed the predominantly monomeric nature of nsp10, with only slight variations in the mass fraction.These findings reinforce the reliability and significance of our earlier Multi-detection SEC data, providing valuable insights into the oligomeric states of nsp10s from various coronaviruses.
The SARS nsp10 crystal structure was previously reported as monomeric, with one molecule in the asymmetric unit.However, SEC analysis revealed a dimeric state in solution [38].It is important to note that the SEC analysis utilised a truncated SARS nsp10 construct, not the full-length protein, as we did in our study.In other work, a fusion between SARS nsp10 and nsp11 has been reported to crystallise as a dodecamer [39].Nevertheless, there is no supporting evidence for the existence of such a nsp10-nsp11 fusion protein.In contrast, our novel findings clearly demonstrate that full-length nsp10s from various medically relevant β-CoVs predominantly exist as monomers, whereas truncated nsp10 constructs have a tendency to oligomerise into dimers and trimers, albeit to a very low extent.In conclusion, our study confirms that the β-CoV nsp10s studied here primarily form monomers in solution, which aligns with their functional roles as stimulators of nsp14 and nsp16 activities.These nsp10s act in a 1:1 complex with their binding partners or participate as one partner in a nsp10-nsp14-nsp16 heterotrimeric complex.
To gain further structural insights into the various nsp10 proteins in solution, we characterised the three SARS-CoV-2 nsp10 constructs (short, long, full-length) along with the full-length SARS and MERS nsp10 proteins.In-line SEC was performed just before measuring SAXS data (SEC-SAXS).Protein concentrations were adjusted to 10.0 mg/mL, except for the short SARS-CoV-2 nsp10 construct (8.0 mg/mL) and MERS nsp10 (7.3 mg/mL).
All scattering intensity plots displayed a similar profile to what was observed in SEC-MALS, showing a major peak with an almost negligent peak at an earlier elution volume (Figure S4).
The Radius of Gyration (R g ) was consistent for all full-length nsp10s at approximately 17.2-17.3Å, slightly smaller for the shorter SARS-CoV-2 nsp10 constructs.Similar values were calculated for the maximum distance (D max ), ranging from 51 to 59 Å, as well as for the estimated molecular masses.All statistics are summarised in Table S2.Overall, SAXS revealed a similar shape of a globular domain for all full-length and shorter nsp10s (Figure 4).The distance distribution layout of the two shorter nsp10 proteins of SARS-CoV-2 closely resembled a Gaussian layout, suggesting a more spherical shape (Figure 4B).Kratky plots suggested a rigid and identical shape for all measured samples (Figure S5).
nsp14 and nsp16 activities.These nsp10s act in a 1:1 complex with their binding partne or participate as one partner in a nsp10-nsp14-nsp16 heterotrimeric complex.
To gain further structural insights into the various nsp10 proteins in solution, w characterised the three SARS-CoV-2 nsp10 constructs (short, long, full-length) along w the full-length SARS and MERS nsp10 proteins.In-line SEC was performed just befo measuring SAXS data (SEC-SAXS).Protein concentrations were adjusted to 10.0 mg/m except for the short SARS-CoV-2 nsp10 construct (8.0 mg/mL) and MERS nsp10 ( mg/mL).All sca ering intensity plots displayed a similar profile to what was observed SEC-MALS, showing a major peak with an almost negligent peak at an earlier eluti volume (Figure S4).
The Radius of Gyration (Rg) was consistent for all full-length nsp10s at approximate 17.2-17.3Å, slightly smaller for the shorter SARS-CoV-2 nsp10 constructs.Similar valu were calculated for the maximum distance (Dmax), ranging from 51 to 59 Å, as well as f the estimated molecular masses.All statistics are summarised in Table S2.Overall, SA revealed a similar shape of a globular domain for all full-length and shorter nsp10s (Figu 4).The distance distribution layout of the two shorter nsp10 proteins of SARS-CoV closely resembled a Gaussian layout, suggesting a more spherical shape (Figure 4 Kratky plots suggested a rigid and identical shape for all measured samples (Figure S5 (A) We further compared the sca ering data for all five samples to reference coordin For SARS-CoV-2 nsp10, the coordinates in complex with the nsp14-ExoN domain trimmed to include residues 6-128, representing the most characteristic of the domain As for the other two viral nsp10s, we used the coordinates as found in the PDB.Th crepancy χ 2 , calculated from CRYSOL, ranged from 1.072 up to 1.364 for the four m trimmed to include residues 6-128, representing the most characteristic of the domain fold.As for the other two viral nsp10s, we used the coordinates as found in the PDB.The discrepancy χ 2 , calculated from CRYSOL, ranged from 1.072 up to 1.364 for the four measured proteins and 2.408 for the long nsp10 SARS-CoV-2 construct.These values indicated and excellent agreement between the solution sca ering data and the crystal structure of nsp10 (Figure 5).We further compared the scattering data for all five samples to reference coordinates.For SARS-CoV-2 nsp10, the coordinates in complex with the nsp14-ExoN domain were trimmed to include residues 6-128, representing the most characteristic of the domain fold.As for the other two viral nsp10s, we used the coordinates as found in the PDB.The discrepancy χ 2 , calculated from CRYSOL, ranged from 1.072 up to 1.364 for the four measured proteins and 2.408 for the long nsp10 SARS-CoV-2 construct.These values indicated and excellent agreement between the solution scattering data and the crystal structure of nsp10 (Figure 5).
To gain a deeper understanding of the roles played by the residues at the N-and C-terminal regions of nsp10, we generated a low-resolution model for full-length SARS-CoV-2 nsp10.Our approach involved two main steps: firstly, we generated 20 ab initio models using the program Dammif [40]; then, aligning these models, we used their average envelope as the starting search volume for a slow mode ab initio calculation using the software Dammin from ATSAS suite (v3.2.1) [41].The final model converged with a discrepancy χ 2 against the raw data of 0.9876, signifying a very reasonable fit although slightly overfit.Moreover, we independently applied a rigid body/ab initio approach to determine the positions of the missing residues at both termini, utilising the software BUNCH from ATSAS suite (v3.2.1) [42].The final model reported a discrepancy χ 2 of 1.01897 against the raw data, indicating a similar and highly accurate fit compared to the ab initio model.Notably, upon superposition, these two models exhibited a reasonable fit, effectively demonstrating that the termini significantly contributed to the compact core of the structure (Figure 6).This result is in agreement with the observations from the Kratky plots, which convincingly suggest that nsp10 forms a rigid and stable protein domain.Encouragingly, very similar ab initio/rigid body-refined models were obtained for full-length SARS and MERS nsp10 proteins.
Overall, all nsp10 proteins from SARS-CoV-2, SARS, and MERS exhibit an almost identical low-resolution shape of a monomeric domain.While this is expected for the SARS-CoV-2 and SARS nsp10 proteins, where the sequence identity is 97.1%, it is less obvious for MERS nsp10, which is only 59.4% identical to SARS-Cov-2 nsp10.Furthermore, the two full-length SARS-CoV-2 nsp10 models (ab initio and rigid body), combined with the evaluation of the data by CRYSOL using a SARS-CoV-2 nsp10 crystal structure, strongly suggest that extra residues at the termini of all proteins do not affect the overall fold and shape of the proteins.
determine the positions of the missing residues at both termini, utilising the software BUNCH from ATSAS suite (v3.2.1) [42].The final model reported a discrepancy χ 2 of 1.01897 against the raw data, indicating a similar and highly accurate fit compared to the ab initio model.Notably, upon superposition, these two models exhibited a reasonable fit, effectively demonstrating that the termini significantly contributed to the compact core of the structure (Figure 6).This result is in agreement with the observations from the Kratky plots, which convincingly suggest that nsp10 forms a rigid and stable protein domain.Encouragingly, very similar ab initio/rigid body-refined models were obtained for fulllength SARS and MERS nsp10 proteins.Overall, all nsp10 proteins from SARS-CoV-2, SARS, and MERS exhibit an almost identical low-resolution shape of a monomeric domain.While this is expected for the SARS-CoV-2 and SARS nsp10 proteins, where the sequence identity is 97.1%, it is less obvious for MERS nsp10, which is only 59.4% identical to SARS-Cov-2 nsp10.Furthermore, the two full-length SARS-CoV-2 nsp10 models (ab initio and rigid body), combined with the evaluation of the data by CRYSOL using a SARS-CoV-2 nsp10 crystal structure, In conclusion, full-length nsp10 stimulators from medically relevant β-coronaviruses, such as SARS, MERS, and SARS-CoV2, are predominantly compact monomers in solution, as revealed by SAXS.This finding is consistent with the crystal structures of SARS and SARS-CoV-2 nsp10, which also exist as monomers in their unbound forms, in the absence of their nsp14 and nsp16 binding partners.An exception is the nsp10-nsp11 fusion protein, which has been shown to crystallise as a dodecamer.Truncated SARS-CoV-2 nsp10 constructs display a higher propensity to oligomerise and may even form trimers, explaining why a truncated SARS nsp10 construct has been reported as a dimer using size exclusion chromatography.Our work represents the first integration of results on various β-coronavirus nsp10s into a single model for nsp10 structure and function.

Subcloning, Expression and Purification of β-CoV nsp10 Proteins
Expression constructs for nsp10 from SARS-CoV-2 (UniProtKB: P0DTD1), SARS (UniProtKB: P0C6X7) and MERS (UniProtKB: K9N7C7) were synthesised at Genscript and subcloned into E. coli expression vector ppSUMO-2.Due to the cloning strategy, they contain up to two nonspecific residues (TM) at the N-terminus of the proteins.Expressed proteins contain an N-terminal, cleavable His-SUMO double tag aiding in the purification of the various proteins.The features of the expression clones are summarised in Table S1.A short SARS-CoV-2 nsp10 construct was designed as previously described [43], covering residues Asn4264 (renumbered to Asn10 for convenience) to Gln4385, with a total length of 123 residues.This construct lacks residues at both the N-and C-termini.A long SARS-CoV-2 nsp10 construct consists of 133 residues and includes the N-terminus of the protein but is still devoid of the C-terminus.Full-length SARS-CoV-2 nsp10 contains 139 residues.Full-length SARS nsp10 is a protein of 139 residues.The protein sequence of SARS nsp10 (NP_828868, 139 residues), located between Ala4231 to Gln4369 of pp1a/pp1ab was used to generate the expression clone.MERS full-length nsp10 contains 140 residues.
The long and the short SARS-CoV-2 nsp10 proteins were expressed and purified as recently described [43].Full-length SARS-CoV-2 nsp10 was expressed and purified as recently reported [44].The full-length SARS and MERS nsp10s were purified using the same strategy employed for full-length SARS-Cov-2 nsp10.In brief, plasmids were transformed into Rosetta (DE3) competent cells (Merck KGaA, Darmstadt, Germany) and inoculated onto LB agar plates supplemented with 50 µg/mL kanamycin and 34 µg/mL chloramphenicol.For full-length MERS nsp10, the plasmid was transformed into BL21-Gold (DE3) competent cells (Agilent Technologies, Santa Clara, CA, USA) and inoculated onto an LB agar plate supplemented with 50 µg/mL kanamycin.
A single colony from each LB agar culture was inoculated into 30 mL TB medium supplemented with the same antibiotics as above and incubated at 37 • C and 220 RPM overnight; 5 mL of each culture was inoculated into each 1 L TB medium with the same antibiotics and incubated at 37 • C and 220 RPM until the OD 600 reached 0.6-1.0.Cultures were cooled down to 18 • C and 1 mM IPTG was added to induce protein expression.The induced cultures were kept at 18 • C, and 220 RPM for 20-24 h.After expression, cultures were centrifuged at 8000× g at 4 • C for 20 min to remove the medium.Cell pellets were resuspended in nsp10 buffer A (50 mM sodium phosphate buffer (NaPO 4 ) pH 8.0, 300 mM NaCl, 20 mM Imidazole and 1 mM PMSF), flash-frozen in liquid nitrogen and stored at −80 • C. Cell pellets were thawed followed by sonication on ice for 10 cycles (30 s on, 60 s off) at a 16 kHz ultrasonic frequency, followed by centrifugation at 50,000× g at 4 • C for 1 h to remove cell debris.The supernatant was loaded into a 5 mL HisTrap FF crude column (Cytiva Life Sciences, Buckinghamshire, UK) pre-equilibrated with nsp10 buffer B (50 mM NaPO 4 pH 8.0, 300 mM NaCl and 20 mM Imidazole).The loaded column was washed with 50 column volumes (CVs) nsp10 buffer B, followed by elution with 20 CVs nsp10 buffer C (50 mM NaPO 4 pH 8.0, 300 mM NaCl and 250 mM Imidazole).Collected elution fractions were evaluated by SDS-PAGE and those containing nsp10 were pooled.ULP1 and 2 mM DTT were added to the pooled sample and dialysed against 2 L nsp10 dialysis buffer (50 mM NaPO 4 pH 8.0, 300 mM NaCl, 20 mM Imidazole and 2 mM DTT) at 4 • C overnight to cleave off the SUMO and His double tag from nsp10.Cleaved nsp10 was purified through 2 × 5 mL HisTrap FF crude columns pre-equilibrated with nsp10 buffer B to remove any uncleaved fusion protein, His-tagged ULP1 protease and other contaminations.Buffer exchanges against final buffer (50 mM Tris-HCl pH 8.0 and 150 mM NaCl) were carried out by using 15 mL Amicon Ultra Centrifugal Units with 10,000 MWCO (Merck KGaA, Darmstadt, Germany).

MALDI-TOF/TOF Mass Spectrometry
SDS-PAGE gel bands were cut into 1 × 1 mm gel pieces and washed three times for 30 min by incubation with 400 µL 50% acetonitrile (ACN, Sigma-Aldrich, Dorset, UK) and 50 mM ammonium bicarbonate (ABC, Sigma-Aldrich).The gel pieces were dehydrated using 100% ACN before 25 µL of 12 ng/µL sequence grade modified trypsin porcine (Promega, Tokyo, Japan) in 25 mM ABC was added.The gel pieces in the digestion buffer were incubated on ice for 3 h before further incubation overnight at 37 • C. The following day Trifluoroacetic acid was added to a final concentration of 0.5% and peptides were extracted into a new tube ready for analysis by MALDI mass spectrometry.
MS and MS/MS spectra were acquired using an Autoflex Speed MALDI TOF/TOF mass spectrometer (Bruker Daltonics, Bremen, Germany) in positive reflector mode.Matrix solution, 0.5 µL consisting of 5 mg/ml α-cyano-4-hydroxy cinnamic acid, 80% acetonitrile, 0.1% TFA, was added to 1 µL peptide sample on a MALDI stainless steel plate.MS spectra were externally calibrated using Peptide Calibration Standard II (Bruker Daltonics, Billerica, MA, USA) containing nine standard peptides (Bradykinin Fragment 1-7 (757.40) The identification of proteins was based on both MS and MS/MS data and was carried out with the Mascot Daemon software (version 2.4).The following search settings were used: trypsin as protease, two allowed missed cleavage sites, 50 ppm MS accuracy for peptides and 0.2 Da for MS/MS fragments, variable modifications: Oxidation (M).The files were searched against all organisms in SwissProt (updated 221008).

High-Resolution Mass Spectrometry
High-resolution mass spectrometry measurements were performed on a 1290 Infinity II UHPLC system (Agilent Technologies, Santa Clara, CA, USA) connected to a Q-TOF LC-MS system (Agilent Technologies, 6545XT) equipped with a dual Agilent jet stream high sensitivity electrospray ionisation (ESI) source.LC and LC-MS data were acquired in MassHunter (Agilent Technologies).Data analysis was performed in MassHunter Qualitative Analysis with BioConfirm add-on (v.11.0).
One microlitre of each nsp10, dissolved in the HPLC-grade chromatography water at the concentration of 0.25 mg/ml, was injected into the system and went through a 50 × 2.1 mm, 5 µM, 1000 Å PLRP-S Reversed-Phase column (Agilent Technologies).The mobile phase consists of 95% water and 5% acetonitrile initially, gradually changing to 60% water and 40% acetonitrile at 9 min, then gradually shifting to 0% water and 100% acetonitrile at 10 min and held for 1 min, and finally gradually returning to 95% water and 5% acetonitrile at 12 min.The flow rate was set to 0.4 mL/min, and the column temperature was maintained at 24 • C. Operational parameters for the Q-TOF source were as follows: drying gas temperature: 350 • C, drying gas flow: 12 L/min, nebulizer pressure: 35 psig, sheath gas temperature: 350 • C, sheath gas flow: 11 L/min, nozzle voltage: 1000 V, capillary voltage: 4000 V, and fragmentor voltage at 200 V.The system was operated in Extended Dynamic Range (2 GHz) mode.

Multi-Detection SEC Experiments
The in-solution oligomeric state of the various nsp10 proteins was evaluated in running buffer composed of 50 mM Tris-HCl pH 8.0 and 150 mM NaCl; 100 µg of protein were injected in triplicate in a Multi-detection SEC system (Malvern Panalytical, Malvern, UK) composed of the OMNISEC RESOLVE module (integrating a Superdex 75 Increase 10/300 GL column (Cytiva, Uppsala, Sweden), with a combined pump, degasser, autosampler and column oven) and the OMNISEC REVEAL, an integrated multi-detector module (Right Angle Light Scattering (RALS) 90 • angle and Low Angle Light Scattering (LALS) 7 • angle), differential refractive index, viscometer and diode-array-based UV/Vis spectrometer).The flow rate used was 0.5 ml/min and the detectors were normalised with bovine serum albumin (Thermo Fisher Scientific, Waltham, MA, USA).The sample changer was cooled down to 4 • C while the column oven and detector compartment were temperature controlled at 25 • C. Data were collected and analysed with the Multi-detection SEC integrated software (v11.32)provided by Malvern.The protein concentration was determined using an average refractive index increment (dn/dc) of 0.185 mL/g.Based on the reproducibility of the results, one injection from the triplicate was chosen for the calculations.

SEC-MALS Experiments
The concentration for each protein was adjusted to 5 mg/mL.The samples were measured in an HPLC (Agilent 1100) in line connected to an eight-channel light scattering detector (Wyatt DAWN 8+) and a differential refractive index detector (Wyatt Optilab T-rex).The size-exclusion chromatography column (Superdex 75 10/300, Cytiva, Tokyo, Japan) was pre-equilibrated using the running buffer (10 mM HEPES pH 7.5, 300 mM NaCl, 1.5% (v/v) Glycerol and 2 mM DTT) for 2 CVs and then 100 µL of each protein sample were injected using the auto-injector module of the HPLC at a flow rate of 0.5 mL/min.The main peaks were analysed using the ASTRA v6 software (Wyatt).

Figure 1 .
Figure1.Protein sequence and structural alignment of full-length nsp10 proteins from SARS-CoV-2, SARS and MERS.Identical residues are shown in white on a red background, similar residues are shaded in red and distinct residues are displayed in black.Truncated residues of the long SARS-CoV-2 nsp10 construct are highlighted in green, and additional missing residues of the short SARS-CoV-2 nsp10 construct are highlighted in blue.The percent identity between SARS-CoV-2 and SARS nsp10 proteins is 97.1%.In contrast, the percent identity between SARS-CoV-2 and MERS nsp10 proteins is 59.4%.The alignment was carried out using CLUSTAL OMEGA, 1.2.4[34] while the secondary structure was assigned by DSSP as implemented in ESPript 3.0[35].

Figure 2 .
Figure 2. Overlay of crystal structures of nsp10 from MERS, SARS, and SARS-CoV-2.Ribbon diagram of crystal structures of nsp10 MERS (grey, PDB entry 5YN5), SARS (yellow, PDB entry 2G9T), and SARS-CoV-2 (blue, PDB entry 7DIY) with structural zinc atoms shown as blue spheres.View on right is rotated by 180° along the horizontal axis with respect to view on the left.The three nsp10 crystal structures downloaded from the RCSB PDB were superposed in Pymol (v2.4.1) [36].

Figure 1 .
Figure1.Protein sequence and structural alignment of full-length nsp10 proteins from SARS-CoV-2, SARS and MERS.Identical residues are shown in white on a red background, similar residues are shaded in red and distinct residues are displayed in black.Truncated residues of the long SARS-CoV-2 nsp10 construct are highlighted in green, and additional missing residues of the short SARS-CoV-2 nsp10 construct are highlighted in blue.The percent identity between SARS-CoV-2 and SARS nsp10 proteins is 97.1%.In contrast, the percent identity between SARS-CoV-2 and MERS nsp10 proteins is 59.4%.The alignment was carried out using CLUSTAL OMEGA, 1.2.4[34] while the secondary structure was assigned by DSSP as implemented in ESPript 3.0[35].

Figure 1 .
Figure1.Protein sequence and structural alignment of full-length nsp10 proteins from SARS-CoV-2, SARS and MERS.Identical residues are shown in white on a red background, similar residues are shaded in red and distinct residues are displayed in black.Truncated residues of the long SARS-CoV-2 nsp10 construct are highlighted in green, and additional missing residues of the short SARS-CoV-2 nsp10 construct are highlighted in blue.The percent identity between SARS-CoV-2 and SARS nsp10 proteins is 97.1%.In contrast, the percent identity between SARS-CoV-2 and MERS nsp10 proteins is 59.4%.The alignment was carried out using CLUSTAL OMEGA, 1.2.4[34] while the secondary structure was assigned by DSSP as implemented in ESPript 3.0[35].

Figure 2 .
Figure 2. Overlay of crystal structures of nsp10 from MERS, SARS, and SARS-CoV-2.Ribbon diagram of crystal structures of nsp10 MERS (grey, PDB entry 5YN5), SARS (yellow, PDB entry 2G9T), and SARS-CoV-2 (blue, PDB entry 7DIY) with structural zinc atoms shown as blue spheres.View on right is rotated by 180° along the horizontal axis with respect to view on the left.The three nsp10 crystal structures downloaded from the RCSB PDB were superposed in Pymol (v2.4.1) [36].

Figure 2 .
Figure 2. Overlay of crystal structures of nsp10 from MERS, SARS, and SARS-CoV-2.Ribbon diagram of crystal structures of nsp10 MERS (grey, PDB entry 5YN5), SARS (yellow, PDB entry 2G9T), and SARS-CoV-2 (blue, PDB entry 7DIY) with structural zinc atoms shown as blue spheres.View on right is rotated by 180 • along the horizontal axis with respect to view on the left.The three nsp10 crystal structures downloaded from the RCSB PDB were superposed in Pymol (v2.4.1) [36].

15 Figure 3 .
Figure 3. SDS-PAGE of SARS-CoV-2, SARS and MERS nsp10 constructs purified for this study.Lane designation: MW contains the molecular weight marker proteins (in kDa) and the nsp10 proteins in the other lanes are indicated by names.The faint bands slightly above 70 kDa were identified by mass-spectrometry analysis as chaperone protein DNAK from E. coli (UniProtKB ID: P0A6Y8; calculated MW: 69.1 kDa).

Figure 3 .
Figure 3. SDS-PAGE of SARS-CoV-2, SARS and MERS nsp10 constructs purified for this study.Lane designation: MW contains the molecular weight marker proteins (in kDa) and the nsp10 proteins in the other lanes are indicated by names.The faint bands slightly above 70 kDa were identified by mass-spectrometry analysis as chaperone protein DNAK from E. coli (UniProtKB ID: P0A6Y8; calculated MW: 69.1 kDa).

Figure 4 .
Figure 4. (A) Experimental SAXS data measured for nsp10 proteins from SARS-CoV-2 (short and full-length) and the full-length nsp10 proteins from MERS and SARS coloured in light g dark green, red, blue and yellow respectively in this figure and Figure 5. Data are shown in re scales.(B) Distance distribution profiles of the sca ering data from (A).

Figure 4 .
Figure 4. (A) Experimental SAXS data measured for nsp10 proteins from SARS-CoV-2 (short, long and full-length) and the full-length nsp10 proteins from MERS and SARS coloured in light green, dark green, red, blue and yellow respectively in this figure and Figure 5. Data are shown in relative scales.(B) Distance distribution profiles of the scattering data from (A).

Figure 5 .
Figure 5.Comparison between the measured scattering data and the calculated scattering curve from the model (in black and in cartoon representation on the right).Discrepancy χ 2 is shown on the header of each plot.The nsp10 coordinates were extracted from the PDB.

Figure 6 .
Figure 6.Superimposition of the full-length ab initio nsp10 SARS-CoV-2 model shown as a molecular envelope with the rigid body/ab initio model of the same protein shown as cartoon in two different orientations.The ab initio envelope was generated in chimera using a map covering the calculated coordinates with a resolution of 13 Å.The extra residues added in the rigid body/ab initio are shown as blue spheres.Maximum dimensions for two of the orientations are indicated on the left as well as the N-and C-terminal boundaries of the model.

Figure 6 .
Figure 6.Superimposition of the full-length ab initio nsp10 SARS-CoV-2 model shown as a molecular envelope with the rigid body/ab initio model of the same protein shown as cartoon in two different orientations.The ab initio envelope was generated in chimera using a map covering the calculated coordinates with a resolution of 13 Å.The extra residues added in the rigid body/ab initio are shown as blue spheres.Maximum dimensions for two of the orientations are indicated on the left as well as the N-and C-terminal boundaries of the model.

Table 1 .
Molecular weights and oligomerisation of various nsp10s proteins determined by multidetection SEC.Multi-detection SEC runs with various nsp10 for the three β-CoVs.Only peaks with a relative amount of >1.0% are included in the table.MW and relative amount are average ± SD from three experiments.