1. Introduction
Viral hemorrhagic fever is a global problem, with most cases due to dengue virus (DENV), which causes over 390 million infections per year worldwide and is a major socio-economic burden, mainly for tropical and subtropical developing countries [
1]. A working vaccine was registered in Mexico in December 2015 and has been approved for official use in some endemic regions of Latin America and Asia and, as of October 2018, also in Europe [
2,
3,
4]. However, this vaccine is not 100% effective against all DENV serotypes. Thus, research into new prophylactics is still ongoing, and a recently proposed vaccine is now in phase 3 clinical trials [
5]. In spite of these recent developments, fully effective prophylactic approaches are lacking and there are no effective therapies. This is due, in part, to a poor understanding of key steps of the viral life cycle.
There are four dengue serotypes: DENV-1, DENV-2, DENV-3 and DENV-4 [
6]. Here, if not otherwise indicated, DENV refers to DENV-2. DENV is a member of the
Flavivirus genus, part of the Flaviviridae family, a genus which comprises 53 viral species [
6]. Many of these are important human pathogens as well, such as hepatitis C (HCV), tick-borne encephalitis (TBEV), yellow fever (YFV), West Nile (WNV) and Zika (ZIKV) viruses [
6,
7,
8,
9]. Flaviviridae are single-stranded positive-sense RNA viruses with a genome of approximately 11 kb, containing a single open reading frame [
10]. Using the host cell translation machinery, the
Flavivirus RNA genome is translated into a polyprotein that is co- and post-translationally cleaved by cellular and viral proteases into three structural proteins and seven non-structural proteins [
10]. Structural proteins are named as such since they are present in the mature virion structure [
11]. Nevertheless, some of them, such as the capsid (C) protein, also have non-structural roles. The C protein is a structural protein that also mediates viral assembly and encapsidation, crucial steps of the viral life cycle. Given its key roles, the C protein is the focus of this work and is described in detail below.
DENV C contains 100 amino acid residues and forms a homodimer, with an intrinsically disordered protein (IDP) region at the N-terminal followed by four α-helices, α1 to α4, per monomer [
12]. Overall, the main structural/dynamics regions consist of the disordered N-terminal, a short flexible intermediate fold and, finally, a large conserved fold region, which greatly stabilizes the protein homodimer structure [
12,
13,
14,
15,
16]. The C protein has an asymmetric charge distribution: one side of the dimer contains a hydrophobic pocket (α2–α2′ interface) that, alongside the disordered N-terminal, is responsible for binding to host lipid droplets (LDs) [
12,
13,
14,
15,
16]. The other is the positively charged C-terminal side (α4–α4′ interface), proposed to mediate the C protein binding to the viral RNA [
12]. It is noteworthy that several transient conformations of the DENV C N-terminal were proposed, which may help modulate the interaction of DENV C with host lipid systems via an autoinhibition mechanism [
15].
DENV infection affects the host lipid metabolism, increasing host intracellular LDs and unbalancing plasma lipoprotein levels and composition [
17,
18,
19]. Importantly, DENV C binds LDs, an interaction essential for viral replication [
18,
20]. DENV C-LDs binding requires potassium ions and the LDs surface protein perilipin 3 (PLIN3), and involves specific amino acid residues of the DENV C α2–α2′ helical hydrophobic core and of the N-terminal [
14,
20]. This knowledge led us to design pep14-23, a patented peptide, based on a
Flavivirus C protein conserved N-terminal motif. We then established that pep14-23 inhibits DENV C-LDs binding [
14] and acquires α-helical structure in the presence of anionic phospholipids [
15]. Moreover, we also found that DENV C binds specifically to very low-density lipoproteins (VLDL), in a process that requires K+ ions and a specific VLDL surface protein, apolipoprotein E (APOE), and is also inhibited by pep14-23 [
21]. This is analogous to the DENV C-LDs interaction. The similarities between APOE and PLIN3 further reinforce this, suggesting a common mechanism [
22]. The role of LDs in
Flavivirus infection is well known and has been recently reviewed [
14,
18,
20,
23,
24,
25]. Given this, pep14-23 is a promising drug development lead. Further developments require a better understanding of the function of the C protein of dengue and of
Flavivirus in general.
Therefore, here, we seek to contribute to the understanding of the biological activity of C proteins, with a special focus on DENV C. Briefly, we studied the DENV C structure-activity relationship in the context of similar and highly homologous mosquito-borne Flavivirus C proteins. Our findings shed light on the structure-function relationship behind the biological roles of the C protein, which may contribute to future therapeutic approaches against DENV and closely related Flavivirus.
3. Discussion
Flavivirus C proteins are known to have similar sequences and structure [
12,
13,
14,
15,
16,
25,
31]. Here, we go further by examining common features at different structural levels, complemented with data on DENV C size and thermodynamic stability. The phylogenetic analysis of the C proteins and the polyproteins (
Figure 1) shows that the former is a marker of
Flavivirus evolution. There are several conserved motifs, highlighted in previous studies with 16
Flavivirus [
12,
14]. The work is now expanded to include the four DENV serotypes (
Figure 2). When these 20
Flavivirus C amino acid sequences, with between 96 and 107 amino acid residues each, are jointly analyzed, it is clear that 55% of the residues are conserved or stereochemically similar (
Figure 2a). About 80% of amino acid residues are equal or similar and, thus, conserved among the four DENV C serotypes (
Figure 2b). Of the five major conserved motifs, four are known to be involved in dimer stabilization [
14]: the GXGP motif (residues 40–43) at loop L1-2, which marks the transition from the flexible to the conserved fold region [
14]; the RW motif (residues 68–69) at α3, where a hydrophobic pocket involving residues from α2, α3 and α4 accommodates the W69 side chain [
12,
32]; and the h+hhLAhhAFF+F (residues 44–56) and F++–h (residues 84–88) motifs, from helices α2 and α4 respectively, which maintain the homodimer structure both via the α2–α2′ hydrophobic interaction and via the salt bridges of residues [RK]45 and [RK]55′ with [ED]87 [
12,
14,
32].
Flavivirus C proteins are expected to have similarly sized secondary structure domains, since G and P residues, which tend to break secondary structure, occupy the same positions (
Figure 2c). Charged residues are also conserved (
Figure 2d), which makes sense as charges would promote the interaction of the C protein with the negatively charged host lipid systems [
12,
14,
20,
21,
22] and the viral RNA [
12]. C proteins have a common homodimer conserved fold region (roughly, residues 45–100), as observed for DENV, WNV and ZIKV C structures [
12,
14,
25,
31]. Conserved motifs are summarized in
Figure 2e.
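The percentage of conserved or similar positions quoted above can be illustrated with a minimal sketch. It assumes a pre-computed multiple sequence alignment and a simple, hypothetical set of similarity groups (hydrophobic, cationic, anionic); neither the grouping nor the toy sequences are taken from this work, and the function names are illustrative only.

```python
# Similarity groups are an assumption for illustration only
# (hydrophobic, cationic, anionic); they are not the scheme used in Figure 2.
SIMILARITY_GROUPS = [set("AVLIMFWY"), set("KR"), set("DE")]


def column_is_conserved(column):
    """True if all residues in a column are identical or share one similarity group."""
    residues = set(column)
    if "-" in residues:        # columns containing gaps are counted as not conserved
        return False
    if len(residues) == 1:     # the same residue in every sequence
        return True
    return any(residues <= group for group in SIMILARITY_GROUPS)


def conserved_fraction(aligned_sequences):
    """Fraction of alignment columns that are identical or similar in all sequences."""
    lengths = {len(seq) for seq in aligned_sequences}
    if len(lengths) != 1:
        raise ValueError("aligned sequences must have equal length")
    columns = list(zip(*aligned_sequences))
    conserved = sum(column_is_conserved(col) for col in columns)
    return conserved / len(columns)


# Hypothetical toy alignment (NOT real Flavivirus C sequences).
toy_alignment = ["MKLAG", "MRIAG", "MKVSG"]
toy_fraction = conserved_fraction(toy_alignment)
```

In this toy case, four of the five columns are identical or similar, so the function returns 0.8. Applied to a real alignment, the result would depend on the similarity groups chosen and on how gaps are treated.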
The above explains the similar hydrophobicity and α-helix propensities of the C proteins (
Figure 3). The conserved motif hNML+R (residues 13–18), at the N-terminal region, and the α2–α2′ hydrophobic cleft are of particular importance for DENV C interaction with LDs and VLDL [
14,
20,
21,
22,
38]. Mutations in specific residues of DENV C α2–α2′ and α4–α4′ also impair RNA binding. Likewise, ZIKV C accumulates on the LD surface, and specific mutations in this protein disrupt the association [
25]. ZIKV C also binds single-stranded and double-stranded RNAs [
25] and, as for DENV C, its high density of positively charged residues promotes binding to LDs and RNA [
12,
39,
40]. Given the match at the level of N-terminal α-helical propensity and α2–α2′ hydrophobicity (
Figure 3), the C proteins may all be self-regulated by an autoinhibition mechanism, as proposed for DENV C [
15].
The autoinhibition hypothesis is corroborated by the quaternary structure analysis (
Figure 4; Table 1). Two clusters, C and D, are autoinhibited conformations. Importantly, cluster D α1 aligns with WNV C [
14,
31] and ZIKV C [
25]. Moreover, if two monomers are in a D conformation (D–D′ homoconformer), the dimer α2–α2′ region is totally inaccessible. Cluster C allows neither a C–C′ homoconformer nor a C–D heteroconformer, which restricts the simultaneous transitions that are possible between A, B, C and D within the homodimer. An interaction between the N-terminal regions within a dimer may also be considered. Nonetheless, their disordered nature and high density of positively charged amino acid residues will mostly favor repulsion between these IDP regions.
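The restriction on conformer pairings described above can be made explicit with a short enumeration. This is a minimal sketch that assumes, for illustration only, that the C–C′ and C–D pairings are the only exclusions among the four clusters; the names `CLUSTERS`, `FORBIDDEN` and `allowed_dimer_conformers` are hypothetical.

```python
from itertools import combinations_with_replacement

CLUSTERS = ["A", "B", "C", "D"]

# Pairings excluded in the text: the C-C' homoconformer and the C-D heteroconformer.
# Treating every other pairing as allowed is an assumption of this sketch.
FORBIDDEN = {frozenset(["C"]), frozenset(["C", "D"])}


def allowed_dimer_conformers(clusters=CLUSTERS, forbidden=FORBIDDEN):
    """List unordered monomer/monomer cluster pairings not present in `forbidden`."""
    return [
        pair
        for pair in combinations_with_replacement(clusters, 2)
        if frozenset(pair) not in forbidden
    ]


pairs = allowed_dimer_conformers()
for first, second in pairs:
    print(f"{first}-{second}'")
```

Under this assumption, eight of the ten possible unordered pairings remain, including the A–A′, B–B′ and D–D′ homoconformers.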
It is important to look at the clusters (
Figure 4), while considering the number of positively charged residues (
Figure 2) in the disordered N-terminal and flexible fold (10 K and R residues) versus those in the conserved fold (15 K and R). The charge distribution in some arrangements implies that the disordered N-terminal is at least in theory able to bind the viral RNA [
39,
40]. Such binding would be governed by the N-terminal region cationic amino acid residues [
41,
42]. Here, the structure predictions reveal that, indeed, in cluster A the first 12 N-terminal residues can locate near α4–α4′ (
Figure 4), the most likely RNA binding site [
12,
39,
40]. Furthermore, binding to RNA via the C-terminal α4–α4′ interface may be favored by a previous or simultaneous interaction of the protein with host LDs via the N-terminal region and α2–α2′ interface. Access to α2–α2′ (controlled by the N-terminal region) would modulate the interaction (
Figure 4) and, thus, viral assembly. In agreement, the binding of the related hepatitis C virus core protein (homologous to DENV C) to host LDs is what enables efficient viral assembly [
43]. Thus, the C protein disordered N-terminal would be critical to protein function, enabling crucial structural and functional roles.
To evaluate this, we used DENV C as a model system, measuring its rotational correlation time (
τc) by time-resolved fluorescence anisotropy (
Figure 5) and its thermal stability by CD spectroscopy (
Figure 6), at pH 6.0 and 7.5 (within the usual pH range of its biological microenvironment). A similar
τc, 15.2 ± 0.5 ns, is obtained at both pH values (
Figure 5;
Table 2 and
Table 3), in line with previous work [
13]. Thus, DENV C maintains its homodimer structure and dynamic behavior between pH 6.0 and 7.5. The
τc value and the corresponding size are higher than expected for a compact globular protein, owing to the disordered nature of the N-terminal.
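To put the measured τc in context, the Stokes-Einstein-Debye relation gives the rotational correlation time of a rigid hydrated sphere, τc = ηV/(kT). The sketch below is only indicative: the dimer molar mass (about 23.5 kDa, a round figure for 2 × 100 residues), the temperature, the solvent viscosity, the partial specific volume and the hydration are all assumed typical values, not parameters reported in this work.

```python
BOLTZMANN = 1.380649e-23    # J/K
AVOGADRO = 6.02214076e23    # 1/mol


def sphere_correlation_time(
    molar_mass_g_mol,
    temperature_k=293.15,         # assumed temperature (20 °C)
    viscosity_pa_s=1.0e-3,        # assumed viscosity, close to water at 20 °C
    specific_volume_cm3_g=0.73,   # assumed typical protein partial specific volume
    hydration_cm3_g=0.23,         # assumed typical hydration
):
    """Stokes-Einstein-Debye correlation time (s) of a rigid hydrated sphere."""
    volume_m3 = (
        molar_mass_g_mol
        * (specific_volume_cm3_g + hydration_cm3_g)
        * 1e-6                    # cm^3 to m^3
        / AVOGADRO
    )
    return viscosity_pa_s * volume_m3 / (BOLTZMANN * temperature_k)


# Hypothetical dimer mass of ~23.5 kDa (assumption, not a measured value).
tau_sphere_ns = sphere_correlation_time(23_500) * 1e9
print(f"compact-sphere estimate: {tau_sphere_ns:.1f} ns")
```

With these assumptions the compact-sphere estimate is on the order of 9 ns, below the measured 15.2 ± 0.5 ns, which is the kind of gap expected when an extended, disordered region slows the overall tumbling.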
Regarding DENV C thermodynamic stability (
Figure 6,
Table 4), the protein
Tm is ~70 °C at both pH values. These denaturation parameters are in line with those reported by other authors, as a chemically synthesized DENV C 21–100 fragment (without most of the disordered N-terminal region) displays a
Tm = 71.6 °C [
32]. The high thermal stability of DENV C in physiological conditions is likely due to the large hydrophobic area that is shared by the two monomers [
12], but also to the W69 stabilizing interactions and, as experimentally observed [
32], the formation of salt bridges (residues K45 and R55′ with E87). As structure/dynamics properties are conserved among
Flavivirus C proteins (
Figure 2,
Figure 3 and
Figure 4), these observations can probably be generalized for all these proteins.
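For readers less familiar with how a Tm is read from a thermal denaturation curve, the sketch below uses a generic two-state van't Hoff model with a temperature-independent unfolding enthalpy. It is not the fitting procedure used in this work: the enthalpy value is hypothetical, the curve is synthetic, and the function names are illustrative.

```python
import math

GAS_CONSTANT = 8.314462618  # J/(mol K)


def fraction_unfolded(temperature_k, tm_k, delta_h_j_mol):
    """Two-state van't Hoff model with a constant enthalpy (no heat-capacity term)."""
    exponent = -(delta_h_j_mol / GAS_CONSTANT) * (1.0 / temperature_k - 1.0 / tm_k)
    equilibrium_constant = math.exp(exponent)
    return equilibrium_constant / (1.0 + equilibrium_constant)


def midpoint_temperature(temperatures_k, fractions):
    """Temperature at which the unfolded fraction crosses 0.5 (linear interpolation)."""
    for i in range(len(temperatures_k) - 1):
        f0, f1 = fractions[i], fractions[i + 1]
        if f0 < 0.5 <= f1:
            t0, t1 = temperatures_k[i], temperatures_k[i + 1]
            return t0 + (0.5 - f0) * (t1 - t0) / (f1 - f0)
    raise ValueError("curve does not cross an unfolded fraction of 0.5")


# Synthetic curve: Tm = 70 °C as in the text; the enthalpy is a hypothetical value.
TRUE_TM_K = 343.15
ASSUMED_DELTA_H = 300e3  # J/mol, illustration only
temperatures = [293.15 + i for i in range(76)]  # 20 °C to 95 °C in 1 K steps
curve = [fraction_unfolded(t, TRUE_TM_K, ASSUMED_DELTA_H) for t in temperatures]
estimated_tm_k = midpoint_temperature(temperatures, curve)
```

In this model Tm is the temperature at which half of the protein is unfolded, so the interpolation recovers the midpoint used to build the synthetic curve. Real CD data would first need baseline correction and normalization.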
These findings must also be considered in light of DENV C biologically relevant interactions with LDs [
22] and RNA (
Figure 7). The DENV C experimental structure [
12] contains three distinct structural regions [
13]: a disordered N-terminal region (from the N-terminal up to residue R22), a flexible fold (residues V23 to L44, where α-helix 1 is located) and a conserved fold with helices α2, α3 and α4, containing the R68 and W69 amino acid residues, highly conserved among
Flavivirus [
12]. R68 terminates helix α3, with its side chain pointing to the protein interior [
12]. W69 is located at the DENV C α4–α4′ interface, having a crucial role in the dimer structural stabilization [
12]. Along with dimer structural stability, these interactions enable allosteric communication and movements between the more hydrophobic section of DENV C (α2–α2′ dimer interface) and its remaining sections, namely the α4–α4′ region.
Figure 7 displays this, in the context of the C protein biologically relevant interactions, as they are understood on the basis of recent studies [
12,
13,
14,
15,
18,
20,
21,
22,
23,
24].
Looking further, it is important to consider that the binding of DENV C to host LDs is mediated by both the N-terminal IDP region and the α2–α2′ interface [
14]. V51 of α2 is affected by the interaction with LDs and stabilizes the dimer by contacting α3 (I65). Another interaction via salt bridges, between α2 (K45 and R55′) and α4 (E87), stabilizes the homodimer (
Figure 7a). The C protein binding to host LDs, which affects the α2–α2′ interface, can lead to changes in the α4–α4′ structural arrangement (
Figure 7b). To investigate this, we searched for similar proteins. An RNA-binding protein with a two-helix domain similar to DENV C α4–α4′ was identified (
Figure 7c): influenza A non-structural protein 1 (NS1, PDB ID: 2ZKO [
44]). Influenza NS1 has interesting features: it accumulates in the nuclei of host cells after being translocated by importin α and β and works as a viral immuno-suppressor by weakening the host cell gene expression [
45]. DENV C was also reported to have an importin α-like motif in the N-terminal [
15,
46]. Targets that interact with importin α and are transported to the nucleus normally contain a nuclear localization sequence (NLS), consisting of a motif of at least 2 consecutive positively charged residues [
47,
48,
49,
50,
51]. Some of these proteins contain 2 such motifs separated by at least 8 (up to 40 or even more) residues, an arrangement designated as a bipartite NLS motif [
49,
50,
51]. Strikingly,
Flavivirus C proteins have three motifs of two consecutive cationic residues in the N-terminal region and α1 domain, which could form a bipartite NLS. A bipartite NLS formed by the cationic residues before position 10 and at positions 17 and 18, with a spacer of 7 to 13 residues, can occur. The other possible bipartite NLS may be formed by residues at positions 17 and 18 and at positions 31 and 32, with 9 to 12 spacer residues. Possible bipartite NLS are also seen in the conserved fold region, but its static nature precludes activity as an NLS. If DENV C binds to importin α, it may act as a cargo protein to be transported to the nucleus. This could explain why DENV C has been found in the nucleus of DENV-infected cells [
46,
52,
53]. DENV C may directly bind importin β, given the similarities between the N-terminal region of DENV C and importin α [
49]. This may allow it to disrupt the normal nuclear import/export system in DENV-infected cells. The conformational plasticity of the N-terminal and flexible fold regions is certainly compatible with interactions with importin(s). As the hypothesized bipartite NLS are conserved among
Flavivirus C proteins, this may occur in other
Flavivirus.
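The search for candidate bipartite NLS motifs described above can be sketched as a simple pattern scan: find runs of at least two consecutive K/R residues, then pair runs whose spacer length falls within a chosen window. The spacer bounds below follow the general definition quoted in the text (8 to 40 residues) and are adjustable assumptions; the sequence is a hypothetical toy example, not a real C protein sequence, and the function names are illustrative.

```python
import re


def cationic_runs(sequence, min_run=2):
    """1-based inclusive (start, end) positions of runs of >= min_run K/R residues."""
    pattern = re.compile(r"[KR]{%d,}" % min_run)
    return [(m.start() + 1, m.end()) for m in pattern.finditer(sequence)]


def bipartite_candidates(sequence, min_spacer=8, max_spacer=40):
    """Pairs of cationic runs separated by min_spacer..max_spacer residues."""
    runs = cationic_runs(sequence)
    candidates = []
    for i, first in enumerate(runs):
        for second in runs[i + 1:]:
            spacer = second[0] - first[1] - 1   # residues strictly between the runs
            if min_spacer <= spacer <= max_spacer:
                candidates.append((first, second, spacer))
    return candidates


# Hypothetical toy sequence (NOT a real Flavivirus C sequence).
toy_sequence = "MAKKPGAANLSAVRRGGSTLAQLKKAE"
toy_runs = cationic_runs(toy_sequence)
toy_candidates = bipartite_candidates(toy_sequence)
```

Such a scan only flags sequence patterns compatible with a bipartite NLS. Whether a flagged region is functional depends on its accessibility and flexibility, as discussed above for the conserved fold region.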
The C protein may act as an immuno-suppressor, similarly to influenza NS1, by interacting with importin α and/or importin β. Ivermectin, a specific inhibitor of importin α/β-mediated nuclear import, is able to inhibit HIV-1 and DENV replication [
54]. The mechanism of DENV inhibition might involve the C protein, specifically the N-terminal IDP region, which is similar to the disordered N-terminal region of importin α [
15]. Moreover, influenza NS1 can counteract the RNA-activated protein kinase (PKR)-mediated antiviral response through a direct interaction with PKR [
55]. Besides, influenza NS1 blocks interferon (IFN) regulatory factor 3 activation, which in turn prevents the induction of IFN-related genes [
56]. DENV inhibits the IFN signaling pathway in a similar manner [
57]. By its N-terminal region dsRNA-binding ability, influenza NS1 inhibits the nuclear export of mRNAs and modulates pre-mRNA splicing, suppressing antiviral response [
44]. Similarities between DENV C and influenza NS1 also extend to the latter's ability to bind RNA (
Figure 7c). Recognition of dsRNA is made by the influenza NS1 RNA-binding domain, which forms a homodimer [
44]. Afterwards, a slight change in R38-R38′ orientation anchors the dsRNA to the protein through a hydrogen bond network [
44]. One of the main functions of influenza NS1 binding to RNA is sequestering dsRNA from the 2′–5′ oligo(A) synthetase [
58]. We propose that, as with influenza NS1, a small conformational change in DENV C α4–α4′ interface occurs after the contact of its α2–α2′ interface with LDs, modulated by transitions between alternative N-terminal “open” and “closed” conformations. Binding to LDs requires an open conformation (
Figure 7d), decreasing the conformational variability and entropy of the C protein, which triggers the allosteric movements affecting the C-terminal α4–α4′. As with influenza NS1, the
Flavivirus C protein would remain in the same overall fold, but a small opening of α4–α4′ would facilitate its binding to RNA.
The C-terminal is likely to be the crucial section for RNA binding given its similarity with influenza NS1 (
Figure 7). Nevertheless, the N-terminal conformers must also be considered in the context of RNA binding (
Figure 4). The A and D conformers allow RNA to be bound to the α4–α4′ interface and, simultaneously, to the N-terminal cationic amino acid residues. A–A′ and D–D′ conformations result in the possible binding of a single continuous portion of RNA to both the C-terminal α4–α4′ and the N-terminal IDP region, making the RNA more tightly bound. Moreover, the A–B′, B–B′ and B–C′ conformations would enable the protein to bind two distinct sections of the RNA, one bound to α4–α4′ and another to the N-terminal regions. That arrangement may allow further compaction of the viral RNA. The putative binding of the N-terminal IDP region to RNA should not be disregarded given its positive net charge (+7). It compares very well with the C-terminal α-helical region net charge (+8 for a monomer, +16 for the α4–α4′ dimer interface). Both may thus bind RNA, mostly through electrostatic forces. This IDP region can thus provide multi-functionality through several modes of binding and different ligands, enabled by alternative conformations. It must be stressed that this is not unlikely. Viral proteins tend to have IDP regions that increase their biological activity [
59,
60,
61]. In a proteome as small as that of flaviviruses (10 proteins), IDP regions augment the number of ligands with which each protein can interact. Less structure often means more function. This is an increasingly active topic of research, leading to the design of algorithms to identify these regions [
62,
63]. Further analysis will help understand the interaction between DENV C and its ligands.
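The net charges compared above follow from a simple residue count. The sketch below assumes the usual approximation near neutral pH, in which K and R count as +1, D and E as −1, and histidine and the chain termini are ignored; the sequence is a hypothetical toy example rather than a DENV C region.

```python
def net_charge(sequence):
    """Approximate net charge: (K + R) minus (D + E); His and termini ignored."""
    positive = sum(sequence.count(residue) for residue in "KR")
    negative = sum(sequence.count(residue) for residue in "DE")
    return positive - negative


# Hypothetical toy peptide (NOT a DENV C region): 4 cationic, 2 anionic residues.
toy_charge = net_charge("MKKRAEDK")
print(f"net charge: {toy_charge:+d}")
```

Applied separately to the N-terminal IDP region and to the C-terminal α-helical region of a given sequence, the same count allows the two candidate RNA-binding regions to be compared on an equal footing.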
To conclude, the data imply a common structure and common functions for mosquito-borne Flavivirus C proteins. Moreover, studying DENV C rotational diffusion and thermodynamics reveals a stable protein, owing to the conserved fold that maintains the homodimer structure. These findings are likely to apply to other Flavivirus C proteins, supporting a common mechanism for their biological activity. Such understanding of the structure and dynamics properties of this key protein may contribute to the future development of C protein-targeted drugs to impair dengue virus and other Flavivirus infections.