1. Introduction
The Yezo virus (YEZV; Orthonairovirus yezoense, Orthonairovirus genus, Nairoviridae family) is an emerging tick-borne pathogen first identified in humans in Hokkaido, Japan (2021) [
1]. Infection typically presents as an acute febrile illness characterized by high fever, thrombocytopenia, leukopenia, elevated hepatic transaminases, and marked hyperferritinemia, with a median incubation period of ~7 days following tick exposure [
1]. In contrast to several highly pathogenic orthonairoviruses, reported YEZV infections have not exhibited hemorrhagic manifestations, and all documented patients have achieved complete clinical recovery within 2–3 weeks [
1,
2]. Nevertheless, the expanding geographic footprint and zoonotic potential of YEZV have positioned it as a notable emerging public health concern.
Since its initial discovery, YEZV has been documented across multiple countries, ecological niches, and host species. In Japan, 22 confirmed human cases have been reported, predominantly in Hokkaido, with disease onset tightly correlated with the seasonal activity of Ixodes ticks [1]. In northeastern China, at least 22 human cases have been identified, including retrospective infections from 2012. Clinical manifestations in Chinese patients are generally milder than those observed in Japan [1, 2]. In Russia, YEZV was first isolated in 2024 from Ixodes persulcatus ticks, representing the westernmost known distribution limit at the time of reporting [
3]. Subsequent metagenomic surveillance identified a novel YEZV genetic variant in Ixodes pavlovskyi ticks in the Tomsk region (Russia), further expanding the known geographic distribution [
2]. Beyond human and tick hosts, YEZV RNA or specific antibodies have been detected in wild mammals, including sika deer (Cervus nippon yesoensis), raccoons (Procyon lotor), and raccoon dogs (Nyctereutes procyonoides) [3]. Additionally, viral genomes have been identified in ticks collected from migratory passerines, such as black-faced buntings (Emberiza spodocephala), supporting a role for avian migration in the cross-regional dispersal of YEZV [1, 3]. Recent metagenomic analysis of a gray seal from the Baltic Sea led to the proposal of a novel Östhammar virus (Orthonairovirus östhammarense), which shares ~90% identity with YEZV and phylogenetically clusters within the same Eurasian clade of human-pathogenic orthonairoviruses [
4]. This discovery suggests that the viral lineage can infect a wide range of vertebrate hosts, including marine mammals, and underscores the need to characterize conserved structural elements. Understanding these features will help elucidate the mechanisms of cross-species adaptation and inform the development of broad-spectrum antiviral strategies.
The YEZV genome comprises three negative-sense, single-stranded RNA (ssRNA(-)) segments: small (S), medium (M), and large (L) [
3]. The S segment (~1.6 kb) encodes the nucleoprotein (N). The M segment (~4.0 kb) encodes the glycoprotein precursor (GPC), which is post-translationally processed into surface glycoproteins Gn and Gc. The L segment (~12.0 kb) encodes a multifunctional L protein containing an RNA-dependent RNA polymerase (RdRp) and an OTU-like protease domain implicated in suppressing host innate immune responses [
2,
3]. The 3′ and 5′ termini of each segment share partial complementarity, enabling intramolecular base-pairing to form a conserved panhandle promoter structure essential for viral transcription and genome replication [
3,
5].
N is the most abundant viral protein and serves as the foundational structural component of the viral ribonucleoprotein (RNP) complex. By encapsidating the viral genomic RNA, N shields it from degradation by cellular nucleases and provides the structural template required by the viral RdRp to initiate transcription and replication [
5,
6]. Structural studies of related orthonairoviruses reveal that RNPs adopt flexible, ring-like oligomeric architectures. This conformational plasticity likely facilitates dynamic interactions with the viral polymerase and host cellular machinery [
5,
7].
Structural analyses of N from related viruses, including Crimean-Congo hemorrhagic fever virus (Orthonairovirus haemorrhagiae; CCHFV), Songling virus (Orthonairovirus songlingense; SGLV), and Beiji nairovirus (Norwavirus beijiense; BJNV), demonstrate a highly conserved “racket-shaped” tertiary architecture composed of a compact globular “head” domain (Domain 2, D2) and an elongated “stalk” domain (Domain 1, D1) [
3,
5,
7]. The N D1 protrudes from the globular body and mediates critical head-to-tail intermolecular contacts that drive oligomerization into the superhelical RNP [
5,
6]. In CCHFV, this region contains a conserved caspase-3 cleavage site characterized by a DEVD motif. In the oligomeric state, this site remains sterically occluded. However, the binding of primer-length RNA triggers a conformational switch that exposes the motif. This structural rearrangement directly links RNP dynamics to host antiviral defense mechanisms [
5]. Similarly, integrative structural studies of SGLV N and BJNV N confirm that the D1 exhibits pronounced conformational flexibility, adopting distinct orientations relative to the D2 that likely regulate RNA-binding affinity and monomer–oligomer transitions [
7]. Positively charged residues within the D1 contribute to a continuous RNA-binding crevice that runs along the interior of the RNP, shielding the viral genome and facilitating specific recognition of the terminal panhandle structure [
6,
7]. Notably, immunoinformatic predictions for YEZV N identify several candidate B- and T-cell epitopes within D1 [
8,
9]. Therefore, elucidating the structural dynamics of the corresponding domain in the emerging YEZV is critical for understanding its pathogenesis and identifying conserved targets for broad-spectrum antiviral interventions.
These observations justify a detailed biophysical characterization of the YEZV N D1. To elucidate its structural features, we applied an integrated approach combining size-exclusion chromatography coupled with small-angle X-ray scattering (SEC-SAXS), tertiary structure prediction using AlphaFold 3, and molecular dynamics (MD) simulations. The recombinant YEZV N D1 and its first solution structure, reported here, provide a foundation for understanding the molecular architecture of YEZV, developing serological assays, and pursuing structure-guided therapeutic design against this and other emerging orthonairoviruses.
3. Discussion
Orthonairovirus N proteins represent one of the most functionally critical and structurally conserved components across the
Nairoviridae family, serving as essential mediators of viral genome packaging, replication, and host immune modulation [
11]. These proteins play central roles in the viral life cycle, from initial RNA encapsidation to the formation of functional RNP complexes that serve as templates for viral transcription and replication [
12,
13]. The N protein’s multifunctional nature extends beyond RNA binding to include modulation of host cellular processes, immune evasion mechanisms, and facilitation of viral assembly and release [
14,
15]. Within this broader context of orthonairovirus nucleoprotein biology, the structural characterization of YEZV N assumes particular significance. YEZV represents an emerging tick-borne orthonairovirus that has recently expanded its geographical range from its initial discovery in Japan to include detection across China and, most recently, Russia [
2,
3]. This geographic expansion, coupled with the virus’s association with acute febrile illness in humans, positions YEZV as a pathogen of growing public health concern.
This study presents the first experimental structural characterization of the YEZV N D1, providing crucial insights into the molecular architecture of this emerging tick-borne pathogen. Orthonairovirus nucleoproteins are highly immunogenic and serve as major targets for both humoral and cellular immune responses [
16,
17]. The CCHFV N D1 was shown to be essential for Th17 cell activation, particularly through promoting IL-17A production [
18]. The high immunogenicity of orthonairovirus N D1 domains and their role as dominant antigens in natural infections make them ideal targets for both IgM and IgG detection assays. A distinctive feature of this study is the use of the native, non-codon-optimized nucleotide sequence encoding YEZV N D1 derived from the HH009-2017 isolate, originally identified in a patient in Hokkaido, Japan. Biophysical data obtained for the native YEZV N D1 sequence are therefore directly relevant to naturally circulating YEZV. This is particularly important for the development of ELISA diagnostic assays, where antibodies raised against a consensus variant might exhibit reduced affinity for the wild-type antigen. However, potential cross-reactivity with the N proteins of other orthonairoviruses must be carefully considered in assay design, particularly in regions where multiple orthonairoviruses co-circulate [
19].
Our structural characterization of YEZV N D1 demonstrates close agreement between experimental SEC-SAXS data and AlphaFold 3 predictions, with a χ² value of 1.3. This level of agreement is particularly noteworthy when compared to similar studies of related orthonairovirus nucleoproteins. The modest improvement in fit quality for GASBOR (χ² = 1.3) relative to CRYSOL (χ² = 1.7) reflects methodological differences. CRYSOL computes scattering from a single static atomic model, whereas GASBOR reconstructs a low-resolution envelope directly optimized against experimental data, implicitly accounting for solution-averaged conformational heterogeneity. This interpretation is consistent with independent evidence from Kratky analysis and MD simulations, which indicate limited flexibility in surface loops and terminal regions of YEZV N D1.
The high pLDDT score and close correspondence of key structural parameters validate the computational model and demonstrate the reliability of current AI-based structure prediction methods for proteins of novel viruses. These results can be directly compared with our recent structural characterization of SGLV N and BJNV N, which employed identical SAXS and computational modeling approaches [
7]. The consistency of experimental–computational agreement across these three emerging members of the Nairoviridae family validates the methodology and establishes a robust framework for rapid structural characterization of recently discovered viruses.
The MD simulation results reveal important insights into YEZV N D1 dynamics. The structural stability observed over 500 ns indicates a well-folded, thermodynamically stable domain capable of maintaining its functional conformation under physiological conditions. The localized flexibility in terminal regions and inter-helical loops suggests these regions may serve as hinge points for conformational changes during RNA-binding or protein–protein interactions.
The comparison with homologous domains from BJNV and SGLV reveals interesting evolutionary and functional relationships within the Nairoviridae family. While all three domains share similar overall organization and dimensions, the distinct P(r) profiles indicate subtle but potentially important structural differences. The more elongated shape of YEZV N D1 compared to the more compact architectures of the BJNV and SGLV domains may reflect adaptation to specific RNA-binding requirements. However, a key limitation should be noted: the theoretical scattering profiles for BJNV N D1 and SGLV N D1 were derived exclusively from computational structural models rather than experimental SAXS data. Consequently, these predictions may not fully reflect the actual solution-state structures of BJNV N D1 and SGLV N D1.
The identification of a conserved cationic patch in YEZV N D1 aligns with our SEC-SAXS and MD observations of moderate flexibility in surface loops, suggesting that dynamic rearrangements may facilitate RNA capture and release during RNP assembly. Given the conservation of this RNA-binding interface across pathogenic orthonairoviruses, the characterized cationic patch represents a promising target for structure-guided development of antivirals [
20]. While computational modeling provides high-confidence predictions, experimental measurement of RNA-binding affinity (by electrophoretic mobility shift assays, surface plasmon resonance, or isothermal titration calorimetry) and mapping of contact residues by mutagenesis will be essential to confirm the biological relevance of this RNA-binding interface.
As genomic surveillance continues to uncover novel tick-borne orthonairoviruses, the methodological and structural framework established here will accelerate the characterization of emerging viral proteins and support the rapid development of targeted countermeasures.
4. Materials and Methods
4.1. Construction of the YEZV Nucleoprotein Domain 1 Expression Plasmid
A DNA copy of the sequence encoding YEZV N D1 (GenBank ID: LC628645) was synthesized de novo by assembly from pairwise overlapping oligonucleotides, followed by PCR amplification of the 375 bp product with Q5 High-Fidelity DNA Polymerase (NEB, Hitchin, UK). The resulting DNA fragment was directionally cloned into the pJET1.2/blunt vector (Invitrogen, Carlsbad, CA, USA) for sequence verification and long-term storage. Subsequently, it was PCR-amplified to introduce a cleavage site for the rhinovirus A28 3C protease and a CACC overhang for D-TOPO cloning. The target fragment was directionally ligated into the pET200 D-TOPO vector (Invitrogen, Carlsbad, CA, USA), enabling the production of a chimeric protein bearing an N-terminal 6×His affinity tag (Figure S6). The resulting recombinant plasmid, 6×His-pET-YEZV-N-D1, was transformed into chemically competent E. coli BL21(DE3) cells (Thermo Fisher Scientific, Waltham, MA, USA). The accuracy of the final construct was verified by full-length Sanger sequencing.
4.2. Production and Purification of Recombinant YEZV Nucleoprotein Domain 1
Transformed cells were grown in LB medium (AppliChem, Darmstadt, Germany) supplemented with kanamycin (100 μg/mL) at 37 °C until an OD600 of ~1.0 was reached. Expression was induced by adding IPTG (Thermo Fisher Scientific, Waltham, MA, USA) to a final concentration of 1 mM, and incubation was continued at a reduced temperature (16 °C, 180 rpm) for 20–24 h. The biomass was harvested by centrifugation (4500 × g, 15 min, 4 °C).
The cell pellet was resuspended in lysis buffer (20 mM Tris, 500 mM NaCl, 20 mM imidazole, pH 7.5) supplemented with DNase (Biolabmix, Novosibirsk, Russia), MgCl2 (New England Biolabs, Ipswich, MA, USA), and a protease inhibitor cocktail (Trans Gene Biotech, Beijing, China), disrupted by sonication on ice, and centrifuged (13,000 × g, 30 min, 4 °C). The clarified supernatant was filtered through a 0.22 μm membrane and loaded onto HisPur Ni-NTA Resin (Thermo Fisher Scientific, Waltham, MA, USA) for IMAC (HBBio-Lab 100 Chromatography System; Hanbon Sci. & Tech., Huaian, China). After washing away unbound proteins, the chimeric YEZV N D1 protein was eluted by a stepwise increase in imidazole concentration (
Figure S7a). The pooled fractions were dialyzed against a proteolysis-compatible buffer. The identity and specificity of the purified chimeric YEZV N D1 were subsequently confirmed by Western blot analysis according to Section Western Blot Analysis (
Figure S8).
Removal of the 6×His tag was performed using recombinant rhinovirus A28 3C protease (SRC VB “Vector” Rospotrebnadzor, Koltsovo, Russia) at an enzyme-to-substrate molar ratio of 1:8 (4 °C, 16 h). Following proteolysis, the mixture was reapplied to a Ni-NTA column for reverse IMAC, allowing the target YEZV N D1 (lacking the N-terminal tag) to be collected in the flow-through fraction (Figure S7b). The tag-free YEZV N D1 was concentrated using centrifugal filter units (5 kDa MWCO; Jet BioFil, Guangzhou, China) and purified by SEC on a Superdex®200 Increase 10/300 GL column (GE Healthcare, Stockholm, Sweden) in a buffer containing 20 mM Tris and 150 mM NaCl, pH 7.5. Protein purity and molecular weight were assessed by SDS-PAGE in a 12% polyacrylamide gel according to Laemmli (
Figure S7c).
Western Blot Analysis
Proteins were transferred from the gel to a nitrocellulose membrane using a WIX-fastBLOT Fast Semi-dry Blot (Wix Technology, Beijing, China) in transfer buffer (20 mM Tris, 192 mM glycine, 10% (v/v) ethanol) at 0.6 A for 90 min. Following transfer, membranes were blocked with EveryBlot Blocking Buffer (Bio-Rad, Hercules, CA, USA) for 1 h at room temperature. Membranes were then incubated overnight at 4 °C with Anti-6X His tag® antibody (Abcam, Cambridge, UK) diluted 1:5000 in blocking buffer. After five washes (5 min each) with TBST (20 mM Tris-HCl pH 7.5, 150 mM NaCl, 0.1% (v/v) Tween 20), membranes were incubated for 1 h at room temperature with horseradish peroxidase (HRP)-conjugated goat anti-mouse IgG (H+L) secondary antibody (Thermo Fisher Scientific, Waltham, MA, USA) diluted 1:100,000 in blocking solution. Chemiluminescent detection was performed using SuperSignal™ West Pico PLUS Chemiluminescent Substrate (Thermo Fisher Scientific, Waltham, MA, USA). Signals were captured using a Li-Cor C-Digit Western Blot Scanner (LI-COR Biosciences, Lincoln, NE, USA) and quantified with Image Studio 5.2 software (LI-COR Biosciences, Lincoln, NE, USA) (Figure S8).
4.3. Dynamic Light Scattering (DLS) Measurement
DLS data were collected using a BeNano 180 Zeta Pro (Bettersize, Dandong, China) with a light source wavelength of 671 nm, a fixed scattering angle of 173°, and a temperature of 23–25 °C. The HullRad server (
http://52.14.70.9/Run_hullrad.html; accessed on 20 March 2024) [
21] was used to calculate the hydrodynamic properties of the AlphaFold-predicted structure and the theoretical molecular weight of the protein.
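As a point of orientation for the DLS measurement, the following minimal sketch converts a translational diffusion coefficient into a hydrodynamic radius with the Stokes–Einstein relation, R_h = k_B·T / (6πηD). This is the standard textbook relation and is not claimed to be the instrument's or HullRad's internal procedure; the diffusion coefficient and the viscosity of water (0.89 mPa·s at about 25 °C) used below are illustrative assumptions, not values from this study.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23  # exact SI value of the Boltzmann constant


def hydrodynamic_radius_nm(diffusion_m2_s, temperature_k=298.15, viscosity_pa_s=0.00089):
    """Stokes-Einstein: R_h = k_B * T / (6 * pi * eta * D), returned in nm."""
    radius_m = BOLTZMANN_J_PER_K * temperature_k / (
        6.0 * math.pi * viscosity_pa_s * diffusion_m2_s
    )
    return radius_m * 1e9


def diffusion_coefficient_m2_s(radius_nm, temperature_k=298.15, viscosity_pa_s=0.00089):
    """Inverse relation: D = k_B * T / (6 * pi * eta * R_h), returned in m^2/s."""
    return BOLTZMANN_J_PER_K * temperature_k / (
        6.0 * math.pi * viscosity_pa_s * radius_nm * 1e-9
    )


# Hypothetical diffusion coefficient for a small globular domain (assumption).
d_example = 1.2e-10  # m^2/s
r_example = hydrodynamic_radius_nm(d_example)
print(f"R_h = {r_example:.2f} nm")
```

The temperature and viscosity defaults would need to match the actual buffer and measurement temperature before any comparison with experimental values.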
4.4. SEC-SAXS Data Collection and Analysis
SEC-SAXS measurements for YEZV N D1 were performed at the BL19U2 beamline of the Shanghai Synchrotron Radiation Facility (SSRF, Shanghai, China) [
22]. The X-ray beam size on the stage was 0.33 mm (H) × 0.05 mm (V). A two-dimensional Pilatus3 2M detector (DECTRIS Ltd., Baden, Switzerland) was placed at a sample-to-detector distance of 2.7 m. The scattering vector magnitude range (s = (4π/λ)sinθ, where 2θ is the scattering angle and λ = 0.1033 nm is the wavelength) was 0.07–4.5 nm⁻¹.
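The definition s = (4π/λ)sinθ can be illustrated with a short sketch that maps a detector position to a scattering vector magnitude. The wavelength and the 2.7 m sample-to-detector distance are taken from the text; the flat-detector geometry, the function names, and the 100 mm radial offset are assumptions made only for illustration.

```python
import math

WAVELENGTH_NM = 0.1033  # X-ray wavelength stated in the text


def scattering_vector(two_theta_rad, wavelength_nm=WAVELENGTH_NM):
    """Return s = (4*pi/lambda) * sin(theta) in nm^-1; 2*theta is the scattering angle."""
    theta = two_theta_rad / 2.0
    return 4.0 * math.pi / wavelength_nm * math.sin(theta)


def two_theta_from_geometry(radial_offset_mm, distance_mm):
    """Scattering angle for a pixel at a radial offset from the beam centre.

    Assumes a flat detector perpendicular to the beam (illustrative assumption).
    """
    return math.atan2(radial_offset_mm, distance_mm)


# Hypothetical pixel 100 mm from the beam centre at the stated 2.7 m distance.
s_example = scattering_vector(two_theta_from_geometry(100.0, 2700.0))
print(f"s = {s_example:.3f} nm^-1")
```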
SEC-SAXS was carried out using a Superdex®200 Increase 10/300 GL column at 20 °C with a flow rate of 0.5 mL/min. Chromatography was performed on an Agilent 1260 Infinity HPLC system (Agilent Technologies, Santa Clara, CA, USA). The YEZV N D1 sample was loaded at a concentration of 10.34 mg/mL, with an injection volume of 150 μL.
Data were collected by continuous acquisition of SAXS frames from the moment of sample injection onto the SEC column until the signal returned to baseline. Primary data processing was carried out using ATSAS v.4.0.1 (EMBL, Hamburg, Germany; https://www.embl-hamburg.de/biosaxs/; accessed on 1 December 2025), with CHROMIXS employed for automated analysis of SEC-SAXS datasets [
23,
24]. Background frames were selected from regions preceding and following the elution peak, followed by frame-by-frame buffer subtraction from each frame within the protein peak region. Frames corresponding to the monodisperse peak apex were merged for subsequent analysis.
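The buffer-subtraction and merging steps described above can be sketched as follows. This is a simplified illustration of the general idea, not the CHROMIXS implementation: it assumes all frames share the same s-grid, ignores error propagation and any scaling between frames, and uses toy intensities.

```python
from statistics import fmean


def average_frames(frames):
    """Point-wise mean of intensity frames sampled on the same s-grid."""
    return [fmean(column) for column in zip(*frames)]


def subtract_buffer(sample_frames, buffer_frames):
    """Subtract the averaged buffer curve from every frame in the peak region."""
    buffer_mean = average_frames(buffer_frames)
    return [
        [intensity - background for intensity, background in zip(frame, buffer_mean)]
        for frame in sample_frames
    ]


# Toy data (assumption): two buffer frames and two peak-apex frames, two s-points each.
buffer_frames = [[1.0, 1.0], [3.0, 3.0]]
peak_frames = [[5.0, 4.0], [7.0, 6.0]]

subtracted = subtract_buffer(peak_frames, buffer_frames)
merged = average_frames(subtracted)  # merged curve for downstream analysis
print(merged)
```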
Guinier analysis, pair-distance distribution function P(r) calculation, and ab initio shape reconstruction were performed using PRIMUS, GNOM, and DAMMIF, respectively [
25,
26,
27]. The radius of gyration (Rg) and forward scattering intensity I(0) were determined from Guinier analysis over the range sRg = 0.3–1.3 (the Guinier approximation holds for sRg < 1.3). Molecular weight (MW) was estimated using a Bayesian approach.
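The Guinier analysis mentioned above rests on the approximation ln I(s) ≈ ln I(0) − (Rg²/3)·s² at low angles. The sketch below fits that line by ordinary least squares and iteratively trims the fit range to sRg < 1.3. It is a minimal illustration rather than the PRIMUS algorithm: it ignores experimental errors and the lower sRg bound, and the synthetic curve (Rg = 1.8 nm, I(0) = 100) is a made-up example.

```python
import math


def linear_fit(x, y):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def guinier(s, intensity, s_rg_max=1.3, n_start=10, max_iter=20):
    """Estimate Rg and I(0) from ln I versus s^2; s must be sorted ascending."""
    n = min(n_start, len(s))
    rg = i0 = float("nan")
    for _ in range(max_iter):
        slope, intercept = linear_fit(
            [v * v for v in s[:n]], [math.log(v) for v in intensity[:n]]
        )
        if slope >= 0:
            raise ValueError("non-negative Guinier slope; data are not Guinier-like")
        rg = math.sqrt(-3.0 * slope)
        i0 = math.exp(intercept)
        # Keep only points that satisfy s * Rg < s_rg_max for the next pass.
        n_new = max(sum(1 for v in s if v * rg < s_rg_max), 3)
        if n_new == n:
            break
        n = n_new
    return rg, i0


# Synthetic, noise-free Guinier curve (all values are illustrative assumptions).
s_grid = [0.07 + 0.01 * k for k in range(194)]
i_synth = [100.0 * math.exp(-((v * 1.8) ** 2) / 3.0) for v in s_grid]
rg_fit, i0_fit = guinier(s_grid, i_synth)
print(f"Rg = {rg_fit:.3f} nm, I(0) = {i0_fit:.2f}")
```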
Ab initio low-resolution shape reconstruction was performed by approximating the molecular envelope with a system of dummy atoms using DAMMIF [
27]. The resulting ab initio models were averaged and filtered against the experimental scattering curve using DAMAVER [
28]. Superposition of the low-resolution envelope onto the atomic model of YEZV N D1 generated by AlphaFold was carried out using CIFSUP [29], with model alignment optimized according to the normalized spatial discrepancy criterion. The excluded volume was estimated from ab initio modeling results generated by DAMMIF. Theoretical scattering curves for the AlphaFold-predicted YEZV N D1 model were calculated with CRYSOL and GASBOR [30]. The goodness of fit was assessed using the reduced, weighted χ² statistic.
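One common form of the reduced, weighted χ² used to compare a model curve with experimental SAXS data is χ² = 1/(N − 1) · Σ[(I_exp − c·I_mod)/σ]², where c is a least-squares scale factor. The sketch below implements that form under the assumption that both curves are sampled on the same s-grid; the exact normalization used by a given program may differ, and the three-point curves are toy values.

```python
def scale_factor(i_exp, i_mod, sigma):
    """Least-squares scale c that minimises sum(((I_exp - c * I_mod) / sigma)^2)."""
    numerator = sum(e * m / (sd * sd) for e, m, sd in zip(i_exp, i_mod, sigma))
    denominator = sum(m * m / (sd * sd) for m, sd in zip(i_mod, sigma))
    return numerator / denominator


def reduced_chi_square(i_exp, i_mod, sigma):
    """Reduced, weighted chi-square with N - 1 degrees of freedom (assumed convention)."""
    c = scale_factor(i_exp, i_mod, sigma)
    total = sum(((e - c * m) / sd) ** 2 for e, m, sd in zip(i_exp, i_mod, sigma))
    return total / (len(i_exp) - 1)


# Toy curves (assumption): experimental, model, and per-point standard deviations.
i_exp = [1.0, 2.0, 4.0]
i_mod = [1.0, 2.0, 3.0]
sigma = [1.0, 1.0, 1.0]
chi2_example = reduced_chi_square(i_exp, i_mod, sigma)
print(f"chi^2 = {chi2_example:.4f}")
```

Because the scale factor is fitted, a model that differs from the data only by a constant multiplier gives χ² = 0 in this sketch.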
4.5. Structure Prediction and MD Simulation
Structural models were predicted using AlphaFold 3 v3.0.1 via the official AlphaFold Server (Google DeepMind, London, UK; https://alphafoldserver.com, accessed on 1 December 2025) [
31]. Structural inferences were based on the confidence metrics reported by AlphaFold 3: the per-residue pLDDT and the global pTM. The pLDDT quantitatively estimates deviations in Cα–Cα interatomic distances between the reference and predicted structural models, with values ranging from 0 to 100. The pTM was used to assess the global structural accuracy of the predicted models, independent of residue-level confidence metrics. For models involving potential protein complex formation, the ipTM was additionally calculated to evaluate the reliability of intermolecular interface predictions. Models were selected for downstream analysis based on high-confidence thresholds established for three-dimensional structure prediction. Each selected model was spatially aligned with previously identified homologous proteins using structural superposition. The quality of structural alignment was evaluated using two key metrics: the RMSD of Cα atoms, which quantifies the average atomic displacement upon superposition, and the TM-score, which provides a length-independent measure of topological similarity. Tertiary structure models of viral proteins were visualized using UCSF ChimeraX v1.15rc (University of California, San Francisco, CA, USA) [
32]. Comparison of secondary structures was visualized using the ESPript v3.0 software (Institute of Protein Biology and Chemistry, Lyon, France;
https://espript.ibcp.fr/; accessed on 1 December 2025) [
33].
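The two alignment metrics named above can be written down compactly for coordinates that have already been superposed and paired residue by residue. The sketch below computes the Cα RMSD and the TM-score term (1/L)·Σ 1/(1 + (d_i/d0)²) with d0 = 1.24·(L − 15)^(1/3) − 1.8. It does not search for the optimal superposition or alignment, which dedicated tools perform, so it illustrates the formulas only; the 125-residue straight-chain coordinates are hypothetical.

```python
import math


def ca_distances(coords_a, coords_b):
    """Per-residue distances between paired, pre-superposed C-alpha coordinates."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have equal length")
    return [math.dist(a, b) for a, b in zip(coords_a, coords_b)]


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation over paired C-alpha atoms (in the input units)."""
    distances = ca_distances(coords_a, coords_b)
    return math.sqrt(sum(d * d for d in distances) / len(distances))


def tm_score(coords_model, coords_target):
    """TM-score for a fixed superposition, normalised by the target length."""
    distances = ca_distances(coords_model, coords_target)
    length = len(coords_target)
    if length <= 15:
        raise ValueError("the d0 formula requires more than 15 residues")
    # Floor of 0.5 A on d0 is a common safeguard for short chains (assumption).
    d0 = max(1.24 * (length - 15) ** (1.0 / 3.0) - 1.8, 0.5)
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / length


# Hypothetical 125-residue chain; the "model" is displaced by 1 A along y.
target = [(3.8 * i, 0.0, 0.0) for i in range(125)]
model = [(x, y + 1.0, z) for x, y, z in target]
print(f"RMSD = {rmsd(model, target):.2f} A, TM-score = {tm_score(model, target):.3f}")
```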
The simulation system was automatically sized to accommodate the viral proteins with a minimum distance of 10.0 Å between any protein atom and the edge of a rectangular periodic boundary box. The system was solvated with explicit OPC water molecules. Counterions were added using a Monte Carlo placement algorithm to neutralize the net protein charge and achieve a physiological ionic strength of 150 mM NaCl. All simulations were performed using the AMBER ff19SB force field (Amber package, University of California, San Francisco, CA, USA).
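To give a feel for the ion numbers implied by 150 mM NaCl, the following sketch estimates the number of Na⁺ and Cl⁻ ions from N = C·N_A·V plus the counterions needed for neutrality. It is a back-of-the-envelope estimate, not the Monte Carlo placement procedure used in the simulations: it uses the whole box volume rather than the solvent volume, and the 60 Å cubic box and the net charges are hypothetical.

```python
AVOGADRO = 6.02214076e23            # particles per mole (exact SI value)
LITRES_PER_CUBIC_ANGSTROM = 1e-27   # 1 A^3 = 1e-30 m^3 = 1e-27 L


def salt_ion_pairs(box_volume_a3, molar):
    """Number of salt ion pairs for a target concentration in a given volume."""
    return round(molar * AVOGADRO * box_volume_a3 * LITRES_PER_CUBIC_ANGSTROM)


def ion_counts(box_volume_a3, net_charge, molar=0.150):
    """Return (n_Na, n_Cl): salt pairs plus counterions that neutralise the solute."""
    pairs = salt_ion_pairs(box_volume_a3, molar)
    n_na = pairs + max(-net_charge, 0)  # extra cations for a negatively charged solute
    n_cl = pairs + max(net_charge, 0)   # extra anions for a positively charged solute
    return n_na, n_cl


# Hypothetical 60 A cubic box and a solute with net charge +3 (assumptions).
n_na, n_cl = ion_counts(60.0 ** 3, net_charge=3)
print(f"Na+ = {n_na}, Cl- = {n_cl}")
```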
System equilibration was carried out in two stages. First, an NVT ensemble simulation was conducted at 303.15 K for 125 ps with a 1.0 fs time step. Temperature was controlled using Langevin dynamics with a collision frequency of 1.0 ps⁻¹, and protein heavy atoms were restrained with a harmonic force constant of 1.0 kcal·mol⁻¹·Å⁻². All covalent bonds involving hydrogen atoms were constrained using the SHAKE algorithm, and a nonbonded cutoff distance of 9.0 Å was employed. The equilibrated system was subsequently subjected to a 10 ns conventional NPT simulation at 303.15 K and 1 bar using a 2.0 fs integration time step. Pressure was maintained using a Monte Carlo barostat, while temperature control was achieved with the Langevin thermostat. After the conventional MD pre-run, accelerated molecular dynamics simulations were performed in the NPT ensemble using the dual-boost mode, in which both dihedral and total potential energy terms were modified to enhance conformational sampling. Following equilibration, a 500 ns production trajectory was generated. Trajectory analysis was performed using CPPTRAJ (Amber package, University of California, San Francisco, CA, USA;
https://ambermd.org, accessed on 1 December 2025) [
34]. Key parameters evaluated included RMSD of Cα atoms relative to the initial structure, RMSF, and Rg.
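The three trajectory descriptors listed above have simple definitions that can be sketched without any MD library. The example assumes frames that are already superposed on a common reference and treats all atoms with equal weight (no mass weighting); it is meant to illustrate the quantities, not to reproduce CPPTRAJ output, and the two-atom, two-frame trajectory is a toy input.

```python
import math


def mean_point(points):
    """Arithmetic mean of a collection of 3D points."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(3))


def radius_of_gyration(frame):
    """Unweighted Rg: RMS distance of atoms from their centroid."""
    centre = mean_point(frame)
    return math.sqrt(sum(math.dist(atom, centre) ** 2 for atom in frame) / len(frame))


def rmsd_to_reference(frame, reference):
    """RMSD of a frame relative to a reference, assuming prior superposition."""
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(frame, reference))
    return math.sqrt(squared / len(frame))


def rmsf(trajectory):
    """Per-atom RMS fluctuation about the time-averaged position."""
    fluctuations = []
    for i in range(len(trajectory[0])):
        positions = [frame[i] for frame in trajectory]
        average = mean_point(positions)
        msf = sum(math.dist(p, average) ** 2 for p in positions) / len(positions)
        fluctuations.append(math.sqrt(msf))
    return fluctuations


# Toy trajectory (assumption): two frames of two atoms, coordinates in Angstrom.
trajectory = [
    [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)],
]
rg_values = [radius_of_gyration(frame) for frame in trajectory]
rmsd_values = [rmsd_to_reference(frame, trajectory[0]) for frame in trajectory]
rmsf_values = rmsf(trajectory)
print(rg_values, rmsd_values, rmsf_values)
```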
MD simulations of the YEZV N D1-ssRNA(-) complex and the dimeric YEZV N-ssRNA(-) complex were additionally performed for 300 ns. All MD simulations were carried out using the Amber package (University of California, San Francisco, CA, USA). Clustering analysis of the trajectories was carried out for both systems using 10 clusters. For the YEZV N D1-ssRNA(-) complex, clustering was performed based on conserved residues of the monomer and three nucleotides involved in the RNA-binding site. For the YEZV N dimer-ssRNA(-) complex, clustering was performed using Cα atoms. Protein–RNA interactions were analyzed using VMD v2.0 (University of Illinois, Urbana-Champaign, IL, USA).