Article

Exploring EBNA3C Genetic Variability and Recombination in Epstein–Barr Virus-Associated Cancers

by Abdiel Barra 1,2, Paulina Vasquez-Aguilar 1,2,3, Paulo Henrique Braz-Silva 4,5 and Louise Zanella 1,2,*
1 Doctorado en Ciencias Médicas, Universidad de La Frontera, Temuco 4811322, Chile
2 Medical Science Research Laboratory, Universidad de La Frontera, Temuco 4811322, Chile
3 Diagnósticos y Evaluación Facultad de Ciencias de la Salud, Universidad Católica de Temuco, Temuco 4780000, Chile
4 Department of Stomatology, School of Dentistry, University of São Paulo, São Paulo 05508-000, Brazil
5 Institute of Tropical Medicine of São Paulo, School of Medicine, University of São Paulo, São Paulo 05403-000, Brazil
* Author to whom correspondence should be addressed.
Int. J. Mol. Sci. 2026, 27(7), 3054; https://doi.org/10.3390/ijms27073054
Submission received: 21 June 2025 / Revised: 22 August 2025 / Accepted: 25 August 2025 / Published: 27 March 2026
(This article belongs to the Special Issue Recent Advances in Herpesviruses (2nd Edition))

Abstract

Epstein–Barr virus (EBV) is a globally disseminated oncovirus capable of causing various malignancies, including gastric cancer, Burkitt lymphoma, and Hodgkin’s lymphoma. The influence of recombination on the EBV genome has revealed limitations in the traditional EBV classification, and the extent of these recombination events across the EBV genome is not fully understood. The nuclear antigen 3C (EBNA3C) gene is indispensable for the oncogenesis of the virus. Despite its critical role, little is known about EBNA3C sequence variability. In this context, we examined 988 EBNA3C gene sequences extracted from EBV genomes. Among the protein motifs, the interaction sites with Nm23-H1 and RBP-Jk and the nuclear localization signals (NLS) 2 and 3 were the most divergent between EBV types, while NLS-1 and the leucine zipper-like domain showed high conservation. In our study of the impact of recombination vs. point mutations in the EBNA3C gene, we found that recombination contributed five times more to substitutions than mutation. Notably, Asian populations exhibited the highest variability and recombination rates. Importantly, our analysis revealed geographical rather than disease-specific markers. Furthermore, filtering for recombination regions did not affect the classical classification of EBV-1 and EBV-2. This finding suggests that recombination is pivotal in the architecture of EBV genetic diversity of the EBNA3C gene.

1. Introduction

Epstein–Barr virus (EBV) is a human herpesvirus that infects more than 90% of people worldwide. It is linked with a range of clinical outcomes, from mild conditions like infectious mononucleosis (IM) to several neoplastic conditions, including Burkitt lymphoma (BL), Hodgkin’s lymphoma (HL), nasopharyngeal cancer (NPC), gastric cancer (GC), and lung cancer (LC) [1,2,3]. Although EBV has a clear association with cancer, it is still neglected in public health.
Traditionally, EBV is classified into two types: type 1 (EBV-1) and type 2 (EBV-2). Recent studies have revised this classification in light of recombination, underscoring the impact of genetic reshuffling in the genomic context, as previously shown with complete genomes and the EBNA3A gene [1,4]. EBV phylopopulations have only been classified using complete genome data. Compared with other viruses, the EBV genome is large and complex (due to its internal repeats and recombination regions), which makes sequencing challenging. However, the potential of a single gene or a combination of a limited set of genes to provide sufficient phylogenetic signal for this classification remains unexplored.
The EBV nuclear antigen 3 (EBNA3) is a family of viral proteins contributing to EBV latency and B-cell transformation. The family comprises three proteins, EBNA3A, EBNA3B, and EBNA3C, which differ in their importance to these processes [5,6]. Because of the amino acid sequence homology, it is believed that these genes originated through a series of duplication events during their evolutionary history [7]. Research on recombination within the EBNA3 family has primarily focused on EBNA3A, and little information is available about the other EBNA3 proteins. Knockout and recombinant studies have shown that EBNA3A and EBNA3C are indispensable for establishing indefinitely growing lymphoblastoid cell lines (LCLs) in vitro [5,6]. Conversely, studies using recombinant viruses have shown that mutants lacking the EBNA3B region remain competent to transform B cells in vitro [8]. One critical interaction involves EBNA3C and the metastasis regulatory gene Nm23-H1 [9]. Under certain conditions, cell proliferation has been observed even in the absence of EBNA3A [6]. This evidence highlights EBNA3C as key to EBV latency and B-cell transformation.
EBNA3C is an essential transcriptional regulator that plays a crucial role in the EBV-mediated transformation leading to B-cell immortalization. Its gene is composed of two exons, BERF3 and BERF4, separated by an intron [10,11]. EBNA3C is a nuclear protein, and its translocation into the nucleus is facilitated by multiple nuclear localization signals (NLS) present in its sequence [12,13]. This localization enables interaction with numerous cellular factors, such as complement receptor 2 (CD21), and with multiple transcription factors, such as RBP-Jκ, Cyclin A, and c-Myc, which play a pivotal role in modulating host gene expression during the latency phase [9,14,15,16,17,18,19,20,21,22,23]. The basic leucine zipper-like (bZIP) domain further regulates transcription via DNA binding and dimerization, while Nm23-H1 influences replication through its nucleotide kinase activity [14,24,25].
In this study, we analyzed EBNA3C sequences retrieved from EBV genomes to explore the influence of recombination in functional sequence variability, focusing on domain-specific variations to discuss their possible implications for B-cell transformation and tumor progression. Our results highlight the role of recombination in the evolution of EBNA3C, providing new insights into genetic diversity and its impact on virus biology.

2. Results

2.1. EBNA3C Sequences Metadata

A total of 988 EBNA3C sequences were recovered from 1314 genomes, covering a broad geographic range across Africa, America, Asia, Europe, and Oceania (see Supplementary Table S1). By region, the 1314 genomes comprise 147 from Africa (East Africa, Ghana, Kenya, and North Africa), 79 from the Americas (Argentina, Brazil, and the USA), 925 from Asia (China, Indonesia, Japan, Korea, Singapore, South Korea, Taiwan, and Vietnam), 133 from Europe (Finland, France, Germany, Poland, Ukraine, and the United Kingdom) and 30 from Oceania (Australia and Papua New Guinea). Isolates from Asia, particularly China and Japan, are overrepresented, constituting 70% of the total. In contrast, isolates from Oceania and South America are underrepresented. The health statuses of EBV-infected patients varied, with 19% classified as healthy individuals and 81% diagnosed with verified EBV-associated diseases. Associated diseases were angioimmunoblastic T-cell lymphoma (AIL), AIDS-related lymphoma (ARL), BL, chronic active EBV infection (CAEBV), diffuse large B-cell lymphoma (DLBCL), EBV-related disease (EBV-RD; no precise disease condition described), GC, HL, IM, LC, lymphoblastoid cell line (LCL), lymphoepithelioma (LE), lymphoid neoplasia (LM), natural killer T-cell lymphoma (NKTCL), NPC, post-transplant B-cell lymphoma (PTBL), post-transplant lymphoproliferative disorder (PTLD), lymphoepithelioma-like carcinoma of the lung (LELC), spontaneous lymphoblastoid cell line (sLCL), and T-cell disorders (TCDs).

2.2. Recombination Rates and Recombination-Induced Motifs Distribution in the EBNA3C Gene

The impact of recombination events and point mutations was estimated using ClonalFrameML v.1.13. The average length of recombination fragments was δ = 16 bp [14.6–16.9], and the average divergence between the donor and the recipient was ν = 0.222 [0.218–0.237]. The ratio of the rates of recombination and mutation was ρ/θ = 1.395 [1.290–1.500], whereas the ratio of the effects of recombination and mutation was r/m = 5 [4.64–5.38]. This indicates that recombination occurs ~1.4 times as often as mutation, and because each recombination event introduced on average δν = 3.6 substitutions, recombination overall caused about five times as many substitutions as mutation, confirming the substantial role of recombination in EBV evolution.
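The relationship between these estimates can be checked with a few lines of arithmetic. The sketch below assumes the standard ClonalFrameML relation r/m = (ρ/θ) × δ × ν and uses only the point estimates quoted above; the function and variable names are illustrative and do not come from the study.

```python
# Point estimates quoted in the text (ClonalFrameML output).
rho_over_theta = 1.395  # ratio of recombination rate to mutation rate
delta = 16.0            # mean length of recombined fragments (bp)
nu = 0.222              # mean divergence between donor and recipient


def relative_effect(rho_over_theta: float, delta: float, nu: float) -> float:
    """Return r/m, assuming r/m = (rho/theta) * delta * nu."""
    return rho_over_theta * delta * nu


# Expected number of substitutions introduced by one recombination event.
subs_per_event = delta * nu
r_over_m = relative_effect(rho_over_theta, delta, nu)

print(f"substitutions per recombination event: {subs_per_event:.2f}")
print(f"r/m: {r_over_m:.2f}")
```

With these inputs the product is roughly 3.55 substitutions per event and an r/m close to 4.96, in line with the rounded values of 3.6 and 5 reported above.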
The presence and distribution of specific recombination initiation motifs were analyzed. We found the CCCAG motif significantly enriched within EBV-1 recombinant regions, suggesting a role as a potential recombination initiator. Additional motifs, including AGGAG and GGGCT, appear within these recombination regions. Genotype differences in motif abundance were observed: EBV-2 showed more CCCAG (14 vs. 12 in EBV-1), but fewer GGGCT (2 vs. 6) and TGGAG (3 vs. 4) motifs than EBV-1.
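Motif counts of this kind can be reproduced with a simple overlapping search. The following is a minimal sketch, not the procedure used in the study: the helper name `count_motif` and the toy sequence are hypothetical, and only the forward strand is scanned.

```python
def count_motif(sequence: str, motif: str) -> int:
    """Count occurrences of motif in sequence, allowing overlapping hits."""
    sequence, motif = sequence.upper(), motif.upper()
    count = 0
    start = sequence.find(motif)
    while start != -1:
        count += 1
        # Resume one base after the last hit so overlapping hits are counted.
        start = sequence.find(motif, start + 1)
    return count


MOTIFS = ("CCCAG", "GGGCT", "TGGAG")

# Made-up nucleotide string for illustration only; not EBV sequence data.
toy_region = "ACCCAGTTGGGCTACCCAGGTGGAG"

counts = {motif: count_motif(toy_region, motif) for motif in MOTIFS}
print(counts)
```

Applying the same function to the recombinant regions of each genotype would yield per-genotype tallies comparable to those listed above, provided the same strand convention is used.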

2.3. Phylogenetic Reconstruction of EBNA3C

A previous study highlighted the impact of recombination in EBV by comparing phylogenetic reconstructions of complete genomes with and without filtering recombination [1]. Considering these findings, we applied a similar approach to the EBNA3C gene. Phylogenetic reconstructions of EBNA3C, excluding the intronic region, were inferred from both unfiltered and recombination-filtered alignments, and the resulting tree topologies were compared to assess the impact of recombination on phylogenetic inference. Both trees recovered the same split between EBV-1 (914 sequences) and EBV-2 (74 sequences) with maximum bootstrap support (see Figure 1, Supplementary Figure S1a,b). No significant variations in the relationships among isolates were observed within the EBV-1 and EBV-2 genotypes after recombination filtering; only minor branch rearrangements were observed. Both genotypes exhibit internal rearrangements; however, no intermixing between genotypes was identified.

2.4. EBNA3C Phylopopulations Estimation

Phylogenetic analysis using RhierBAPS revealed 10 distinct EBV populations at the first hierarchical level. Each population showed distinct patterns of geographic distribution and disease manifestation, including non-malignant manifestations and diverse cancer types. However, this geographic pattern may reflect sampling bias, as data were only available from China, Kenya, the United States, and the United Kingdom (see Supplementary Table S1). Similarly, the relationship of EBV and different health conditions varied among populations, emphasizing the complexity of the EBV population structure. Regarding the phylogenetic relationships of the ten populations analyzed, five (Pop2, Pop5, Pop7, Pop8, and Pop10) exhibit a monophyletic structure, and the other five populations (Pop1, Pop3, Pop4, Pop6, and Pop9) are paraphyletic (see Figure 2).

2.5. Identification of Putative Recombinant Regions in EBNA3C

The distribution of putative recombinant regions in EBNA3C was analyzed to assess their contribution to genetic diversity. A detailed analysis of putative recombinant regions identified by Gubbins for all EBNA3C isolates revealed a limited number of putative recombinant regions within the EBV-1 clade, while a significant putative recombinant region was observed among EBV-2 isolates (see Figure 3, red horizontal bars).
Six putative recombinant regions were identified: region I (71–246), region II (1445–1553), region III (1821–2356), region IV (2391–2879), region V (2836–2869), and region VI (2512–3162). Interestingly, the isolates forming regions I, II, III, and IV belong to the EBV-2 genotype, with the exception of two isolates (HS039 and NPCT025) from the EBV-1 type. These putative recombinant regions in EBV-2 were mainly represented by China (47%) and Kenya (26%); other minor proportions were represented by France (6%), Indonesia (5%), Taiwan (4%), Ghana (3%), the United States (3%), Japan (3%), South Korea (2%) and the United Kingdom (1%). Regions V and VI belong exclusively to the EBV-1 genotype. Region V represents the clade predominantly composed of isolates from China (38%), Japan (22%), and the United States (12%), with a small representation from the United Kingdom (7%), Kenya (4%), France (4%), Argentina and Indonesia (2% each), and other countries contributing 1% each: Brazil, Germany, Ghana, Italy, Poland, Singapore, South Korea, Taiwan, and Vietnam. Region VI represents a clade containing isolates from Kenya (78.6%), the United Kingdom (14.4%), and Brazil (7%). Some unique putative recombinant regions, each found in a single isolate, were also identified in both genotypes, EBV-1 and EBV-2 (Figure 3, light blue horizontal bars). Putative recombination events defined by RDP5 are detailed in Supplementary Tables S2 and S3.
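Several of the six regions share coordinates, which is easy to miss in a plain list. The sketch below computes pairwise overlaps from the coordinates given above; it assumes the coordinates are 1-based and inclusive (the text does not state the convention), and the names `REGIONS` and `overlap` are illustrative.

```python
from itertools import combinations

# Coordinates as reported in the text; assumed 1-based and inclusive.
REGIONS = {
    "I": (71, 246),
    "II": (1445, 1553),
    "III": (1821, 2356),
    "IV": (2391, 2879),
    "V": (2836, 2869),
    "VI": (2512, 3162),
}


def overlap(a, b):
    """Return the shared (start, end) of two inclusive intervals, or None."""
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return (start, end) if start <= end else None


# Keep only the pairs of regions that share at least one position.
overlaps = {}
for x, y in combinations(REGIONS, 2):
    shared = overlap(REGIONS[x], REGIONS[y])
    if shared is not None:
        overlaps[(x, y)] = shared

for pair, shared in overlaps.items():
    print(pair, shared)
```

Under that assumption, regions IV, V, and VI overlap one another (region V lies entirely within both IV and VI), while regions I, II, and III are disjoint from all others.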

2.6. EBNA3C Sequence Variation Context

We investigated the sequence variation in EBNA3C in the BERF3 and BERF4 exons as well as the intronic region. We found variability in all three regions, including the functional motif and NLS (see Figure 4). The analysis identified amino acid substitution patterns with specific geographic and clinical associations (see Table 1).
We identified the amino acid substitution R11I + N21D + R44G + Y51D + T107I in BERF3 as a dominant pattern associated with NPC (62.6%) in Chinese populations (99.4% of cases). The I141V + Q213H + E336D + I348L + R656G + E701Q pattern found in BERF3 of EBV-1 showed NPC association (64.8%, p < 0.001) in Chinese populations (99.4%, p < 0.001); see Table 1. Additionally, the amino acid substitution A162V + S557L + G655S + T677M + A683V + E701Q + Q740P + Q744R + P753Q + L866S + K976E, present in BERF4 of EBV-1, was the most frequent pattern among those associated with EBV-related diseases. This pattern was significantly associated with CAEBV cases (p < 0.001), occurring in 65.2% of samples overall and reaching 100% prevalence in Japanese CAEBV patients (see Table 1). The deletion pattern spanning regions 404–434, 680–702, 713–731, and 818–824, found exclusively in the EBV-2 genotype, was present in 84% of Burkitt lymphoma cases, all of which originated from Kenya (see Table 1).
Beyond these disease associations, our analysis revealed (i) the health-associated pattern I348L + L669P + S690P + E701Q + H831Y + C915W in BERF4 of EBV-1 in asymptomatic carriers (p < 0.001) from China and Indonesia, and (ii) the geographic-associated pattern R11I + N21D + R44G + Y51D in BERF3 of EBV-1 shared between patients with CAEBV (p < 0.001) from Japan (p < 0.001) and GC patients from Asia (p < 0.001) (see Table 1). Notably, none of the EBV-2-related sequences exhibited amino acid variation patterns in BERF3.
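Substitution patterns written as, for example, R11I + N21D can be derived by comparing each aligned sequence against a reference. The following is a minimal sketch under stated assumptions: positions are numbered on the ungapped reference, gap columns are ignored, and the function name and toy sequences are hypothetical rather than taken from the study.

```python
def call_substitutions(reference: str, variant: str) -> list[str]:
    """List substitutions between two aligned proteins, e.g. ['T188A'].

    Positions are 1-based and counted on the ungapped reference; alignment
    columns with a gap ('-') in either sequence are skipped.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must be aligned to the same length")
    calls = []
    position = 0
    for ref_aa, var_aa in zip(reference, variant):
        if ref_aa != "-":
            position += 1  # advance only on real reference residues
        if "-" in (ref_aa, var_aa) or ref_aa == var_aa:
            continue
        calls.append(f"{ref_aa}{position}{var_aa}")
    return calls


# Toy aligned fragments for illustration only; not real EBNA3C sequence.
ref = "MKRT-LQ"
var = "MKIT-LH"
pattern = " + ".join(call_substitutions(ref, var))
print(pattern)
```

Grouping isolates by the resulting pattern string is one straightforward way to tabulate co-occurring substitutions against geography or diagnosis.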
Beyond identifying amino acid variation patterns and deletions found in EBNA3C, we also examined repeat sequences. In BERF4, two distinct repeated clusters were observed: one consisting of GPPAA (see Figure 4, first green triangle, positions 440–489) and another composed of PAPQAPYQGYQEQ (see Figure 4, second green triangle, positions 594–686). The GPPAA motif exhibited a range of 1 to 18 repeats. The nine-repeat pattern (present in 192 sequences) was the most frequent in the Chinese population, accounting for 70.8% of NPC and 15.6% of healthy individuals. The eight-repeat pattern (present in 188 sequences) was also common in 19.7% of IM cases from the USA (97.3%), 17.6% of BL cases from Kenya (66.7%), and both NPC and healthy individuals (20.7% each) from China (77% and 100%, respectively). The PAPQAPYQGYQEQ motif ranged from 1 to 9 repeats, with 2 repeats being most common (371 sequences), mainly observed in NPC (53.1%) and healthy individuals (18.3%) in China, with lower frequencies in BL (10.8%) from Kenya, and pLELC (6.7%) from China. Additionally, the seven-repeat pattern (present in 111 sequences) was predominantly found in CAEBV (53.2% of this group) and NKTCL (14.4%), both predominantly in Japan.
A distinct feature was identified exclusively in EBV-2: a 9-amino acid repeat cluster of APPSTGPRD (see Figure 4, light blue triangle), located at positions 492 to 530 of the BERF4 amino acid sequence. This motif varied from 2 to 10 repeats, with a five-repeat configuration (36%) being the most common, particularly in BL (47.2%) from Kenya. In contrast, the four-repeat pattern was primarily observed in healthy individuals (70.6%) from China, while the three-repeat (11 sequences) was mainly found in healthy individuals (54.5%) in China as well.
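Repeat numbers such as those reported for GPPAA, PAPQAPYQGYQEQ, and APPSTGPRD can be obtained by measuring the longest run of consecutive copies of a unit. The sketch below counts exact, back-to-back copies only, so imperfect repeats would be undercounted; the function name and the toy protein are hypothetical.

```python
import re


def max_tandem_repeats(protein: str, unit: str) -> int:
    """Return the length, in copies, of the longest run of exact tandem units."""
    runs = re.findall(f"(?:{re.escape(unit)})+", protein)
    return max((len(run) // len(unit) for run in runs), default=0)


# Hypothetical protein fragment built for illustration; not EBNA3C sequence.
toy = "MS" + "GPPAA" * 3 + "TT" + "GPPAA" + "APPSTGPRD" * 2

gppaa_copies = max_tandem_repeats(toy, "GPPAA")
appstgprd_copies = max_tandem_repeats(toy, "APPSTGPRD")
print(gppaa_copies, appstgprd_copies)
```

A tolerant variant (for instance, allowing one mismatch per unit) would be needed if degenerate copies are to be counted as part of a cluster.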
Remarkably, a type-specific marker was identified: the A1T change in the intronic sequence, present in all EBV-2 isolates. Furthermore, the C6A change had an overall frequency of 15.2% (143 occurrences), mostly concentrated in China (50.3%) and Japan (43.4%). Among Chinese isolates, 19.4% were present in GC, while 29.2% were observed in healthy individuals. In Japan, 58.1% of C6A occurrences were linked with CAEBV.

2.7. Variations in EBNA3C Functional Motifs

Analysis of EBNA3C functional motifs of bZIP, NLS, RBP-Jk, and Nm23-H1 revealed the variability of these elements (Figure 4 and Table S4). The NLS (NLS-1, NLS-2, and NLS-3) showed complete conservation among EBV-1 isolates (see Figure 4, red circles). In EBV-2 isolates, the NLS-1 was identical to that of EBV-1, except for three specific EBV-2 sequences from China (HKHD134, HKNPC45, HKHD67), which exhibit the I73V conservative change. In NLS-2, all 73 EBV-2 sequences exhibited simultaneously a K414R conservative and a K418T non-conservative substitution. The NLS-3 of almost all EBV-2 sequences displayed an R941S non-conservative change, with the single exception of the HC-0004 isolate from Kenya.
The RBP-Jk interaction domain (186–240) in EBV-1 (Figure 4, purple circles) exhibits a non-conservative T188A substitution in 73 sequences (8%). This pattern was related to healthy individuals (9.6%), predominantly in the United Kingdom (85.7%), and TCD cases (9.6%) exclusively from France (100%). It was also observed in IM (20.5%), mainly from the USA (93.3%), and CAEBV (9.6%) cases, all from the United States of America (100%). A notable association was observed with BL (27.4%), primarily in Kenya (55%), with other occurrences reported in other parts of East and North Africa, Ghana, France, and Brazil. Another frequent non-conservative substitution, Q213H, was identified in 427 sequences (46.2%), predominantly in NPC cases (52.6%) and healthy individuals (19%) in China. Additionally, a non-conservative A215G change was present in 120 sequences (13.1%), related to TCD cases (6.7%) from France (100%) and IM (24.2%), predominantly in the United States of America (96.6%). This substitution was also linked to BL (29.2%), with a high prevalence in Kenya (62.9%), and present in East and North Africa, Ghana, Brazil, France, and the United States of America. We further identified co-occurrence of T188A and A215G substitutions in 57 sequences (6.2%). Among the 73 EBV-2 sequences analyzed, all harbored the non-conservative substitutions A215G, R217L, T218A, and T228I, as well as the conservative substitution L219I. The non-conservative H231P change is carried by 70 of the 73 EBV-2 sequences, while three isolates from China (HKHD134, HKNPC45, HKHD67) are identical to EBV-1 at this position. Furthermore, these three isolates exhibit a characteristic pattern of conservative substitutions at N197Q, D202E, I204V, and N214S, and non-conservative substitutions at S203I and M205R.
In the bZIP domain (Figure 4, yellow circles), all 73 EBV-2 isolates carry the conservative substitutions I273V and L277M. These substitutions were predominantly found in healthy individuals (35%), with occurrences in China (80%) and Kenya (11.5%). Additionally, these variants were related to BL (27%) in Kenya (80%). In EBV-1, the conservative substitution L277M was identified in 17 sequences (1.9% of the total), with a notable prevalence among BL patients (80%), a high occurrence in Kenya (93%), and a small proportion in Brazil (7%).
Analysis of the Nm23-H1 motif (637–675) revealed specific substitutions, highlighting significant differences in EBV-2 compared to EBV-1 (Figure 4, graphite gray circles). The Nm23-H1 motif varied little in EBV-1 compared with EBV-2. The overall identity between EBV genotypes was 65%, with a total of 19 substitutions. Of these, 4 were conservative (L638F, E689D, I690A, and M693I) and 15 were non-conservative (Q637L, P639T, R644P, K645A, Q647R, C648S, G655S, R656G, T659P, Q660K, H668Q, L669P, L669Q, S671P, P675S).
A total of three non-conservative substitutions were identified in EBV-1 (G655S, R656G, and L669P). The non-conservative change G655S in 167 sequences (17.7%) was significant in CAEBV (30.9%), only in Japan, and in healthy individuals (22.8%), mostly in China (92%). The R656G non-conservative change showed the highest overall frequency, 46.3% (445 sequences), being most prevalent in NPC (57.0%), mostly in China (96%) and healthy individuals (21.7%), predominantly in China (92%). Finally, the change L669P in 106 sequences (10.1%) was prominent in healthy individuals (43.5%) in China (97.5%) and CAEBV (21.7%), entirely in Japan.
In EBV-2, a total of 16 substitutions were identified, including 4 conservative (L638F, E689D, I690A, and M693I) and 12 non-conservative (Q637L, P639T, R644P, K645A, Q647R, C648S, T659P, Q660K, H668Q, L669Q, S671P, P675S). Among the four conservative substitutions, three (E689D, I690A, and M693I) were present in all 73 EBV-2 isolates. Among the 12 non-conservative substitutions, 9 (Q637L, R644P, K645A, Q647R, C648S, T659P, H668Q, L669Q, S671P) were present in all 73 EBV-2 isolates. These 73 isolates included samples from healthy individuals (35.6% of the total) and from BL cases (27.4%), followed by ND cases with no available clinical information (19.2%), and less frequent categories such as NPC (5.5%), LCL (4.1%), TCD (2.7%), pLELC (2.7%), GC (1.4%), and NKTCL (1.4%). Geographically, isolates originated mainly from Kenya (38.4%) and China (34.2%), with smaller proportions from the USA (4.1%), the UK (2.7%), Japan (2.7%), Indonesia (2.7%), France (2.7%), and PNG (2.7%). A few additional isolates came from Taiwan, Ghana, South Korea, Singapore, and regional designations such as Africa and North Africa (1.4% each). One isolate had no geographic information available (ND). The only conservative substitution that was not present in all EBV-2 isolates was L638F, which occurred in seven of them (AH Saliva 9316, HKHD41, HKHD72, HKHD79, HKHD97, HKHD104, and HKHD141). These seven sequences (9.6%) all came from healthy individuals, from China (85%) and Taiwan (15%). Three non-conservative substitutions were likewise not present in all EBV-2 isolates: T659P, Q660K, and P675S occurred in 70, 71, and 72 of them, respectively. The T659P substitution was observed in healthy individuals (34%) in China (75%) and BL patients (28.6%) in Kenya (80%). Similarly, the Q660K substitution was present in healthy individuals (36.6%) in China (80%) and BL patients (25.4%) in Kenya (88.9%). Finally, the P675S substitution was present in healthy individuals (34.7%) in China (80%) and BL patients (27.8%) in Kenya (80%).

2.8. Variability of Epitopes Among EBNA3C

The eight previously identified EBNA3C epitopes (EENLLDFVRF, EGGVGWRHW, FRKAQIQGL, KEHVIQNAF, LDFVRFMGV, LRGKWQRRYR, QPRAPIRPI, and RRIYDLIEL) were analyzed to assess their variability. EGGVGWRHW, EENLLDFVRF, and LDFVRFMGV epitopes remained completely conserved in all EBV. The results are detailed in Table 2.
In the FRKAQIQGL epitope, one conservative and three non-conservative substitutions were identified. The conservative substitution I348L and the non-conservative substitution I348M were found in EBV-1 isolates. In EBV-2 isolates, the non-conservative substitutions R344L and I348R co-occurred. The conservative substitution I348L had an overall frequency of 55.8% and was predominantly observed in NPC (46.7%) and healthy individuals (25.3%) in China. Additionally, the I348M substitution was detected with a lower overall frequency of 1.5%, showing a strong association with BL (85.7%, p < 0.001), primarily in Kenya (92%, p < 0.001). The co-occurrence of the two non-conservative substitutions, R344L and I348R, was identified in all EBV-2 sequences.
In the KEHVIQNAF epitope, one conservative and three non-conservative substitutions were identified. The conservative substitution E336D and the non-conservative substitution H337Q were found in EBV-1 isolates. In EBV-2 isolates, the non-conservative substitutions H337Q and N341K co-occurred. The conservative substitution E336D was identified at a high overall frequency of 46.8%, predominantly in NPC (51.6%) and healthy individuals (20.3%) in China. Additionally, the non-conservative substitution H337Q, detected at a lower overall frequency of 1.5%, exhibited association with BL (85.7%) and was primarily found in Kenya (92.7%). The co-occurrence of two non-conservative substitutions, H337Q and N341K, was identified in all EBV-2 sequences.
In the LRGKWQRRYR epitope, a single non-conservative substitution, Y257F, was found in EBV-2 isolates. This substitution was identified in healthy individuals (35%) in China (80%) and Kenya (11.5%), as well as in BL (27%), predominantly in Kenya (80%).
In the RRIYDLIEL epitope, one conservative and two non-conservative substitutions were identified. The conservative substitution R259K and the non-conservative substitution Y261F were found in EBV-1 isolates. In EBV-2 isolates, a non-conservative substitution, Y261F, was identified. The R259K conservative substitution found in EBV-1 isolates showed an overall frequency of 3.2%, occurring in small numbers across diverse conditions. The most notable occurrences were in healthy individuals (20.7%) exclusively from the United Kingdom, IM (20.7%) entirely in the United States of America, and PTLD (13.8%) all in the United Kingdom. The Y261F non-conservative change (2.4%) found in EBV-1 isolates was associated with NPC (50.0%) in Indonesia (91%, p < 0.001). In contrast, the Y261F non-conservative change identified in EBV-2 isolates was identified in healthy patients (35%) in China (80%) and BL patients (27%) in Kenya (80%).
Finally, the QPRAPIRPI epitope was completely conserved in all EBV-1 isolates. On the other hand, 19% of the EBV-2 isolates presented the co-occurrence of non-conservative substitutions Q761P and R763P. These non-conservative substitutions were identified in healthy individuals (14%) and in BL patients (20%) in Kenya (100%). Considering that a proline amino acid is located before the epitope, changes in these sequences lead to the presence of two additional prolines, resulting in a proline cluster.
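Epitope conservation of the kind summarized in this section can be estimated as the fraction of isolates that contain an exact copy of the epitope. The sketch below is an illustration under that assumption; the function name and the toy isolates are hypothetical, with `isolate_C` mimicking a double proline change in QPRAPIRPI.

```python
def epitope_conservation(epitope: str, proteins: dict[str, str]) -> float:
    """Return the fraction of proteins containing an exact copy of the epitope."""
    if not proteins:
        raise ValueError("no sequences supplied")
    hits = sum(epitope in sequence for sequence in proteins.values())
    return hits / len(proteins)


# Hypothetical protein fragments for illustration; not real isolate data.
toy_proteins = {
    "isolate_A": "AAQPRAPIRPIGG",
    "isolate_B": "AAQPRAPIRPIGG",
    "isolate_C": "AAPPPAPIRPIGG",  # carries two proline substitutions
    "isolate_D": "AAQPRAPIRPIGG",
}

conserved_fraction = epitope_conservation("QPRAPIRPI", toy_proteins)
print(conserved_fraction)
```

An exact-match search does not report which residues changed, so it is best paired with a per-position comparison when a variant epitope needs to be described.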

2.9. Geographic and Clinical Associations of EBNA3C Mutations

In EBV-1, the strongest statistical associations were observed for NPC with R11I, N21D, R44G, Y51D, T107I, I141V, Q213H, E336D, I348L, E701Q, H782Q showing the most significant association (p < 0.001). For BL, the most significant associated mutations included D102G, T188A, Q213K, A215G, R217Q, L277M, I348M, T356M, T821A, A840S, L869M, A978S (p < 0.001). Robust associations were also found for IM (L45F, Q60P, T221S, R255K, S385T, V489A, R499I, G783R, M904L, A921S, all with p < 0.001), CAEBV (T104A, T107V, A162V, S557L, T677M, A683V, H782P, L866S, K976E, A978V, all with p < 0.001) and statistically weaker associations for GC (R44I, p < 0.001 only for FDR) and DLBCL (P919T, p < 0.001 only for FDR). Additionally, variants such as R259K, G357V, L669P, S690P, H831Y, C915W, and P916S (p < 0.001) were detected in the healthy control group, suggesting they represent non-pathogenic polymorphisms present in the general population. Strong geographic associations were observed in populations from the United States, with mutations T188A, A215G, T356M, V489A, A840S, M904L, and A921S (p < 0.001), and from Indonesia, with T104P, Y257F, and Y261F (p < 0.001). The Chinese population exhibited the highest number of geographically specific variants, including R11I, N21D, R44G, Y51D, T104A, T107V, I141V, A162V, Q213H, E336D, I348L, G357V, E359A, L669P, E701Q, H782Q, G783R, H831Y, L866S, C915W, P916S, and P919T (all with p < 0.001), with T107I being the most statistically significant (p < 0.001). Distinctive and statistically robust mutational profiles were identified in Kenya with D102G, R217Q, L277M, I348M, T821A, L869M, and A978S (all with p < 0.001), in the United Kingdom with L45F, Q60P, T221S, R255K, R259K, S385T, and R499I (all with p < 0.001), and in Japan with S557L, T677M, A683V, H782P, K976E, and A978V (all with p < 0.001). When integrated, these findings revealed a consistent pattern of mutations associated with specific pathologies.
The statistically significant variants associated with NPC, including R11I, N21D, R44G, Y51D, T107I, I141V, Q213H, E336D, I348L, E701Q, and H782Q (p < 0.001 each), coincide with those identified as geographically specific to the Chinese population. Similarly, mutations linked to BL show strong geographical stratification. T188A, A215G, T356M, and A840S are predominantly associated with the United States, while D102G, R217Q, L277M, I348M, T821A, L869M, and A978S are characteristic of the Kenyan population.
For EBV-2, a strong association was observed for the P537L, A547T, and T553M mutation cluster in African populations (p < 0.001). A weaker association was found for the T524M and I827L variants in Indonesia (p < 0.002), and for the G472E, F806V, and L881S variants in China (p < 0.009). The remaining mutations analyzed, mainly in Kenyan and Chinese populations, showed no statistically significant association. No EBV-2 mutation showed a statistically significant association with health status.
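Because many variants are tested against many diseases and countries, raw p-values are usually adjusted for multiple testing. The text refers to FDR-corrected values but this section does not name the procedure, so the sketch below assumes the widely used Benjamini–Hochberg method; the function name and the input p-values are hypothetical.

```python
def benjamini_hochberg(p_values: list[float]) -> list[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values never
    # decrease as the raw p-value increases.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for five variant-disease tests.
raw = [0.001, 0.008, 0.039, 0.041, 0.20]
q_values = benjamini_hochberg(raw)
print([round(q, 4) for q in q_values])
```

In this toy input, the third and fourth tests have raw p-values below 0.05 but adjusted values slightly above it, which illustrates how an association can be significant before correction and not after.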

3. Discussion

Epstein–Barr virus (EBV), a global oncovirus associated with B-cell carcinomas and neoplasms, is traditionally classified into EBV-1 and EBV-2 based on their ability to transform lymphoblastoid cell lines. Recent studies suggest this classification may reflect artifacts from genomic recombination regions. Therefore, analyzing the genome with the putative recombinant regions excluded better clarifies its population structure [26]. This approach has only been applied to whole genomes, and its implications for individual genes remain unexplored. Among viral proteins, the EBNA family regulates viral transcription, apoptosis, and host genes [26]. EBNA3C, a member of the EBNA3 family, presents a conserved genetic structure (short and long exons separated by an intron), suggesting an evolutionary origin by gene duplication [14]. While EBNA3A shows high susceptibility to recombination, EBNA3C has not been studied in this context despite its relevance in B-cell transformation, a capacity that is not affected by the absence of EBNA3B [14]. Given EBNA3C’s key role in B-cell transformation and gene regulation and its poorly understood genetic diversity, we analyzed the genetic variability and recombination patterns in 988 EBNA3C sequences to explore their association with EBV-related diseases.
Our analysis revealed a substantial bias in the geographic distribution of the sequences [27,28], with a high proportion of sequences from the Asian continent associated with NPC. Despite the recent increase in deposited EBV genomes, a notable limitation persists in the available data, particularly a lack of global representativeness. The imbalanced representation of EBV genomes from specific geographic regions introduces challenges that complicate the interpretation of disease-specific patterns [27]. This inequality in sequence distribution can be attributed to both the virus’s natural geographic spread and disparate access to sequencing resources. Moreover, the predominance of complete genomes from complex EBV-related diseases, such as NPC, suggests a prioritization of sequencing efforts toward clinically significant cases. The restricted presence of EBV-2 in certain regions may result from differences in genotype-specific viral replication efficiency, influencing its capacity for broader dissemination [29]. Phylogenetic reconstruction of EBNA3C with and without recombinant filtering confirms a higher prevalence of EBV-1 versus EBV-2, aligned with previous findings on genome-wide studies, EBNA3A, and lytic genes [1,4,30,31,32,33,34].
The EBNA3C gene showed a high recombination rate: recombination events contributed five times more substitutions to genetic diversity than point mutations. This agrees with previous genome-wide studies, which reported a high recombination signal in the EBNA3 region, specifically in EBNA3C [1,35,36]. Previous studies using complete genomes demonstrated that the impact of recombination can be observed by comparing phylogenetic reconstructions with and without recombination regions. We compared the phylogenetic reconstruction of EBNA3C with and without recombination regions but were unable to identify rearrangements between EBV genotypes, detecting only internal rearrangements. Consequently, we aimed to recover previously identified EBV phylopopulations using a single-gene strategy.
Obtaining the complete EBV genome remains a costly and complex process. In this study, we aimed to assess whether EBNA3C carries sufficient phylogenetic signal to resolve EBV phylopopulations. Previous analyses identified 12 EBV phylopopulations using complete genomes, of which seven were monophyletic and five paraphyletic. The EBNA3C gene was analyzed using the HierBAPS algorithm, which allows the identification of EBV clusters. Those clusters identified herein correlate with the populations previously classified by Zanella et al. [1], who applied a similar methodology in a genomic approach. Our EBNA3C analysis identified 10 populations: five with a monophyletic structure and five with a paraphyletic structure. Our dataset included more sequences than the previous genome-based study, yielding fewer distinct populations but a higher number of paraphyletic groups. Comparison of the phylopopulations revealed similarities with whole-genome analyses, with both EBVpop1 and EBVpop2 (identified in this study) including isolates from diverse geographic regions and being associated with pathologies such as BL and IM. However, EBNA3C alone did not provide sufficient phylogenetic signal to robustly segregate EBV phylopopulations.
Furthermore, our analysis revealed that recombination regions were predominantly localized in EBV-2 isolates (Table S5), mirroring patterns previously reported for EBNA3A [4]. Unlike previous studies, we identified recombination regions within EBV-1 lineages. This is consistent with observations in other herpesviruses, where recombination facilitates adaptive divergence in geographically structured populations [37,38,39,40,41,42]. Notably, we detected two EBV-1 sequences that shared a recombination region with all EBV-2 isolates. This interchange of recombinant regions between genotypes impacts the current EBV population structure. It may also reflect an ancestral intertype recombination region that maintained minimal divergence from the EBV-1 parental lineage. This phenomenon is documented in herpesviruses under selective pressures to preserve essential gene functions [43,44,45].
Analysis of the EBNA3C intronic region revealed remarkable phylogenetic conservation between EBV-1 and EBV-2 types, with variation limited to a single change at the first nucleotide (A1T). This pattern of intronic conservation is consistent with previous studies on EBNA3 family genes [4,46], where intron retention has been proposed as a post-transcriptional regulatory mechanism to modulate the expression of these viral oncoproteins [46,47,48]. The observed conservation suggests strong selective constraints, possibly linked to: (i) the presence of critical regulatory elements in introns (e.g., microRNA binding sites or splicing factors) [49], and (ii) the need to maintain overlapping open reading frames (ORFs) in compact herpesvirus genomes [47].
Several epitopes related to various processes, such as interaction with cellular proteins, regulation of the viral cycle, and other functions, have already been identified in EBNA3C [50]. However, there is still no clear characterization of genotypic variation among these epitopes, nor is it well established whether these changes have a geographic association or correlate with an increased frequency of clinical manifestations linked to EBV infection. Our analyses provide a comprehensive view of EBNA3C epitope variability. The I348L substitution in the FRKAQIQGL epitope and the E336L substitution in the KEHVIQNAF epitope could be related to NPC in China. Furthermore, the FRKAQIQGL epitope has also been linked to BL in Kenya. Specifically, this association appears when the epitope presents the I348M substitution in EBV-1 and the co-occurrence of R344L + I348R in EBV-2. The conservation of these substitutions, coupled with their possible role as disease markers in some geographic populations, supports the theory that these regions are under selective pressure [51]. In addition to the highlighted substitutions, several non-synonymous mutations were identified in the analyzed epitopes. These modifications may be associated with conformational changes in EBNA3C, potentially affecting its efficiency and recognition by molecules such as HLA [52]. Therefore, it can be hypothesized that these mutated epitopes may influence EBNA3C function, stability, recognition, and interactions with the host immune system.
Recombination initiator motifs, such as TGGAG and CCCAG, are commonly found across the EBV genome [53,54]. The CCCAG motif is concentrated within predicted recombinant regions of EBV, suggesting a role in initiating recombination or serving as a breakpoint, as discussed in previous studies [35,53]. Recent studies have shown differences between distinct haplotypes of the virus, in particular the CCCAG motif associated with a higher frequency in EBV-positive patients [54]. Consistent with these reports, we found differences in motif abundance between EBV-1 and EBV-2, with EBV-2 having a higher frequency of CCCAG, which may reflect distinct recombination patterns and adaptive strategies between the genotypes. These findings contribute to understanding how motif distribution influences EBNA3C variability.
The analysis of the EBNA3C NLSs (NLS-1, NLS-2, and NLS-3) revealed a high degree of conservation across EBV-1 isolates (>99% identity). Conversely, EBV-2 exhibited non-synonymous substitutions in all NLSs. Notably, in the NLS-2 of EBV-2, the K418T substitution replaces a positively charged lysine with a neutral threonine, removing a basic residue that is critical for nuclear transport. This is consistent with observations in simian virus 40 (SV40), where comparable mutations in the T-antigen NLS abolished nuclear localization [55]. This implies that EBV-2 NLS-2 may be non-functional, raising questions about compensatory nuclear import mechanisms in this genotype. In NLS-3, all EBV-2 isolates harbored the R1018S substitution, which replaces arginine (basic) with serine (polar, uncharged). While this preserves partial polarity, the loss of a positive charge likely reduces binding affinity to importin-α, a key mediator of nuclear transport in herpes simplex virus [56]. Notably, such substitutions are absent in EBV-1, underscoring a conserved functional imperative for cationic residues in its NLS motifs. These findings align with broader patterns in herpesvirus evolution, where non-essential motifs in latent proteins tolerate drift, while core functional motifs remain under purifying selection [57,58]. The retention of divergent NLS sequences in EBV-2 may reflect relaxed constraints due to alternative nuclear shuttling mechanisms or lineage-specific adaptations to distinct host–cell environments [59,60].
The RBP-Jk binding domain of EBNA3C harbors four evolutionarily conserved residues (209-TFGC-212) critical for mediating its interaction with the transcriptional repressor RBP-Jk in vivo [61]. Despite substantial divergence across the broader domain (82% sequence dissimilarity between EBV-1 and EBV-2), the TFGC motif remains conserved, underscoring stringent purifying selection to preserve this interaction interface. This conservation likely reflects its non-redundant role in displacing EBNA2 from RBP-Jk, a mechanism essential for EBV-driven B-cell immortalization [62,63,64]. This pattern mirrors observations in other herpesviruses: baboon LCV maintains ~35% EBNA3C N-terminal conservation, including the TFGC motif, despite peripheral divergence [62]. Therapeutic disruption of this interface, for instance, via peptides mimicking the TFGC motif, could competitively inhibit EBNA3C sequestration of RBP-Jk, thereby restoring EBNA2-mediated tumor-suppressive signaling [65,66]. The identification of recombination regions in key functional regions of EBNA3C, such as the NLS, suggests that these structural variations may have functional implications for viral biology. This finding provides insights into potential impacts of mutation and recombination on EBNA3 based on bioinformatics analyses; further experimental analysis is necessary to confirm these findings and their potential contribution to viral pathogenesis.
The analysis of EBNA3C functional motif variability revealed promising opportunities for therapeutic development. The high conservation of NLS-1 in both EBV genotypes suggests that peptide-based strategies could be employed to block EBNA3C nuclear translocation, similar to the approach proposed by Levin et al. (2009) [67]. This work demonstrated that a synthetic peptide derived from the NLS sequence of human immunodeficiency virus 1 (HIV-1) integrase (IN) blocks the protein’s nuclear import. Another interesting result corresponds to the RBP-Jk binding site, in which the amino acids critical for the establishment of LCLs remain completely conserved. This preservation, analogous to that observed in NLS-1, may represent a viable target for competitive inhibition via synthetic peptides.
Targeting the degradation of viral oncoproteins has emerged as a promising therapeutic strategy. In this context, the Proteolysis Targeting Chimera (PROTAC) approach has gained attention. A PROTAC is a small synthetic molecule designed to bring an E3 ubiquitin ligase to a specific target protein, leading to its subsequent degradation by the proteasome [68]. This strategy has shown efficacy in degrading the human papillomavirus E6 oncoprotein [69]. According to Sugiokto and Li (2025), a synthetic EBNA1 inhibitor could induce EBNA1 degradation through ubiquitination [70]. Taken together, current evidence suggests that EBNA3C could be explored as a therapeutic target through PROTAC-mediated degradation.

4. Materials and Methods

4.1. EBNA3C Recovery

We performed sequence similarity searches on 10 July 2025 with the Basic Local Alignment Search Tool BLASTn https://blast.ncbi.nlm.nih.gov/Blast.cgi (accessed on 10 July 2025) using the EBNA3C sequence of the EBV-1 prototype (NC_007605), from nucleotide (nt) 86083 to 89132, and of the EBV-2 prototype (NC_009334), from nt 86654 to 89934. Sequences of both EBV types were aligned with MAFFT http://mafft.cbrc.jp/alignment/server (accessed on 11 July 2025) [71] and manually edited using UGENE v50 [72]. Sequences that contained more than 5% unresolved nucleotides (Ns) or had less than 75% similarity with the query were removed from the analysis, along with samples from established cell lines. The metadata of each isolate, including health status, country, and year of collection, were annotated in the sequence headers. A final dataset of 988 sequences was obtained for the study (see Supplementary Table S1). The EBNA3C sequence includes the BERF3 and BERF4 exons and the intronic region between them, which were analyzed to assess sequence variation (nucleotide and amino acid) and recombination regions.
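
The quality thresholds described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the function names, the toy records, and the choice to ignore alignment gaps when computing the fraction of Ns are assumptions made for illustration, and the sketch assumes that a sequence is discarded when it fails either threshold.

```python
def n_fraction(seq: str) -> float:
    """Fraction of unresolved bases (N) in a nucleotide sequence, ignoring gaps."""
    bases = [b for b in seq.upper() if b != "-"]
    if not bases:
        return 1.0  # an empty sequence is treated as fully unresolved
    return bases.count("N") / len(bases)


def keep_sequence(seq, identity, max_n=0.05, min_identity=0.75):
    """True if the sequence passes both thresholds (<=5% Ns, >=75% similarity)."""
    return n_fraction(seq) <= max_n and identity >= min_identity


def filter_records(records, max_n=0.05, min_identity=0.75):
    """records: iterable of (name, sequence, identity_to_query) tuples.
    Returns the names of the records that pass the filter."""
    return [name for name, seq, identity in records
            if keep_sequence(seq, identity, max_n, min_identity)]


# Toy records (hypothetical names, sequences, and identity values).
toy = [
    ("isolate_A", "ACGT" * 25, 0.98),             # clean and similar: kept
    ("isolate_B", "ACGT" * 20 + "N" * 20, 0.97),  # 20% Ns: removed
    ("isolate_C", "ACGT" * 25, 0.60),             # low similarity: removed
]
kept = filter_records(toy)
```

In practice the similarity value would come from the BLASTn output for each hit; here it is supplied directly to keep the example self-contained.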

4.2. Annotation of EBNA3C Motifs

EBNA3C protein motifs were annotated using a UniProt-based search (Supplementary Table S6), with NC_007605 and NC_009334 as queries for EBV-1 and EBV-2, respectively. The key residues of the protein were identified according to the literature: bZIP [14], RBP-Jk [61], Nm23-H1 [25], and NLS (1, 2, and 3) [13]. The epitopes analyzed in this study were those described in the literature [36,73,74].

4.3. Recombination Analysis

The recombination analyses of EBNA3C were performed with two distinct software programs, Recombination Detection Program (RDP v5.61) [75] and Genealogies Unbiased by Recombinations in Nucleotide Sequences (Gubbins v2.4.1) [76], to identify recombination regions. With EBNA3C recombination regions defined, recombination-inducing motifs for EBV were analyzed as previously described [35,53]. Recombination-inducing motifs were categorized as inside or outside these regions, and occurrences were counted for both viral types. ClonalFrameML v1.13 [77] was employed to assess the impact of recombination and the frequency of recombination regions vs. point mutations. Various methods were applied in RDP5 to detect potential recombination regions. The recombinant regions were considered significant if they were detected by at least 5 of 7 detection methods (RDP, GENECONV, MaxChi, Chimera, SisScan, 3Seq) [78,79,80,81,82,83,84] and if the p-value was < 0.05 after applying the Bonferroni correction. We used Gubbins to detect recombination regions based on phylogenetic discrepancies.
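
The classification of motif occurrences as inside or outside recombination regions can be sketched as follows. This is an illustrative Python example rather than the script used here: the function names, the toy sequence, and the region coordinates are hypothetical, only the given strand is scanned, and a hit is assumed to count as "inside" only when it lies entirely within a region.

```python
def find_motif(seq, motif):
    """Return 0-based start positions of every (possibly overlapping) motif hit."""
    seq, motif = seq.upper(), motif.upper()
    hits, start = [], seq.find(motif)
    while start != -1:
        hits.append(start)
        start = seq.find(motif, start + 1)
    return hits


def count_inside_outside(seq, motif, regions):
    """Count motif hits inside vs. outside recombination regions.

    regions: list of (start, end) half-open, 0-based intervals.
    Returns a tuple (inside, outside).
    """
    inside = outside = 0
    for pos in find_motif(seq, motif):
        end = pos + len(motif)
        if any(s <= pos and end <= e for s, e in regions):
            inside += 1
        else:
            outside += 1
    return inside, outside


# Toy sequence and a single hypothetical recombination region.
toy_seq = "AACCCAGTTTGGAGAACCCAGGG"
toy_regions = [(0, 10)]
cccag_counts = count_inside_outside(toy_seq, "CCCAG", toy_regions)
tggag_counts = count_inside_outside(toy_seq, "TGGAG", toy_regions)
```

Running the same count separately on EBV-1 and EBV-2 sequences would give the per-type tallies that are compared in the text.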

4.4. Phylogenetic Reconstructions and Identification of Population Groups

A phylogenetic reconstruction was performed with the EBNA3C dataset using PhyML 3.0 http://www.atgc-montpellier.fr/phymll/ (accessed on 12 July 2025), applying the Akaike information criterion (AIC) for model selection and the fast likelihood-based aLRT SH-like method for branch support. The best-fitting model selected was GTR + R. This approach was used to compare the reconstructions of the unfiltered and filtered sequences to assess the impact of recombination. The phylogenetic trees were visualized with iTOL https://itol.embl.de/ (accessed on 12 July 2025) [85]. To determine the phylopopulations, we performed a Bayesian analysis of population structure with RhierBAPS v1.01 [86] after removing the recombinant regions identified with Gubbins and then the intronic regions; the run was defined at two levels of hierarchy with 20 initial clusters. The tree topologies obtained before and after the filtering process were compared using the Robinson–Foulds Distance [87], Matching Split Distance [88], Branch Score Distance [89], and Path Difference [90].
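
As an illustration of the first of these metrics, the Robinson–Foulds distance counts the bipartitions (splits) that are present in one tree but not in the other. The sketch below is a minimal pure-Python version for small unrooted trees written as nested tuples of leaf names; the representation, function names, and toy trees are assumptions for illustration and do not correspond to the software used in this study.

```python
def leaves(tree):
    """Set of leaf names under a node (a leaf is a string, a clade is a tuple)."""
    if isinstance(tree, str):
        return frozenset([tree])
    result = frozenset()
    for child in tree:
        result |= leaves(child)
    return result


def splits(tree):
    """Non-trivial bipartitions of the tree, treated as unrooted."""
    all_leaves = leaves(tree)
    anchor = min(all_leaves)  # fixed leaf used to pick one canonical side
    found = set()

    def visit(node):
        if isinstance(node, str):
            return
        clade = leaves(node)
        # Always store the side of the split that does not contain the anchor.
        side = clade if anchor not in clade else all_leaves - clade
        if 2 <= len(side) <= len(all_leaves) - 2:
            found.add(side)
        for child in node:
            visit(child)

    visit(tree)
    return found


def robinson_foulds(tree_a, tree_b):
    """Number of splits found in exactly one of the two trees."""
    if leaves(tree_a) != leaves(tree_b):
        raise ValueError("trees must share the same leaf set")
    return len(splits(tree_a) ^ splits(tree_b))


# Two toy topologies over the same five hypothetical isolates.
unfiltered = ((("A", "B"), "C"), ("D", "E"))
filtered = ((("A", "C"), "B"), ("D", "E"))
rf = robinson_foulds(unfiltered, filtered)
```

A distance of zero means the two topologies are identical; larger values indicate more splits that changed after filtering.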

4.5. Statistical Analysis

To assess associations between amino acid mutations and categorical metadata (clinical status and country of origin), a custom Python script was implemented. For each variant detected at the selected positions, a presence/absence matrix was constructed and compared between the categories defined in Supplementary Table S1, separately. Associations were assessed using Pearson’s chi-square test. Multiple testing was corrected using the Benjamini–Hochberg false discovery rate (FDR) method and the Bonferroni correction. Associations were considered statistically significant at an adjusted p-value ≤ 0.05. All statistical analyses and data visualizations were performed in Python (v3.12).
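
The testing and correction steps can be sketched with the standard library alone. This is not the custom script used in the study: it assumes a 2 × 2 presence/absence table per variant (variant present or absent, in one category versus all others), uses Pearson's chi-square without continuity correction, and all function names and counts are hypothetical.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-square for the table [[a, b], [c, d]] (1 degree of freedom).
    Returns (statistic, p_value)."""
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        return 0.0, 1.0  # degenerate table: no evidence of association
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, the chi-square survival function is erfc(sqrt(x/2)).
    return stat, math.erfc(math.sqrt(stat / 2))


def bonferroni(pvals):
    """Bonferroni-adjusted p-values, capped at 1."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical counts: (present in group, absent in group,
#                       present elsewhere, absent elsewhere).
tables = {"variant_1": (30, 10, 5, 55), "variant_2": (12, 28, 18, 42)}
names = list(tables)
raw = [chi2_2x2(*tables[name])[1] for name in names]
fdr = dict(zip(names, benjamini_hochberg(raw)))
bonf = dict(zip(names, bonferroni(raw)))
significant = [name for name in names if fdr[name] <= 0.05]
```

Categories with more than two levels would need a general r × c chi-square test; the 2 × 2 case is used here only because its p-value has a closed form.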

5. Conclusions

This study showed that recombination is pivotal to variability in the EBNA3C gene: recombination regions contributed five times more to substitutions than mutation and contribute significantly to the emergence of divergent motifs within functionally important regions such as Nm23-H1, RBP-Jκ, and nuclear localization signals. Despite this extensive variability, the EBNA3C gene considered alone does not have sufficient phylogenetic signal to discriminate monophyletic phylopopulations. EBNA3C diversity is more closely associated with geographic origin than with specific disease phenotypes. These results provide new insights into how molecular evolution could shape the virus cycle. Further experimental studies are warranted to elucidate the functional consequences of the identified variants and to better understand their potential impact on viral pathogenesis and oncogenesis.

Supplementary Materials

The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/ijms27073054/s1.

Author Contributions

Conceptualization: A.B. and L.Z. Writing—original draft preparation: A.B., P.V.-A., and L.Z. Writing, reviewing, and editing: A.B., L.Z., P.V.-A., and P.H.B.-S. A.B., L.Z., P.V.-A., and P.H.B.-S. assisted in data analysis and the final editing of the manuscript. A.B. and L.Z. have read the initial draft of the manuscript and participated in the final editing of the manuscript. L.Z. and P.H.B.-S. supervised the whole study, including the study design and final editing of the manuscript. All authors have read and agreed to the published version of the manuscript.

Funding

The authors declare no funding was received for this study.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

All data obtained in this study are detailed in the text and in the Supplementary Materials.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Zanella, L.; Riquelme, I.; Buchegger, K.; Abanto, M.; Ili, C.; Brebi, P. A Reliable Epstein-Barr Virus Classification Based on Phylogenomic and Population Analyses. Sci. Rep. 2019, 9, 9829.
  2. Nilsson, K.; Klein, G.; Henle, W.; Henle, G. The Establishment of Lymphoblastoid Lines from Adult and Fetal Human Lymphoid Tissue and Its Dependence on EBV. Int. J. Cancer 1971, 8, 443–450.
  3. Peng, R.; Gordadze, A.V.; Fuentes Pananá, E.M.; Wang, F.; Zong, J.; Hayward, G.S.; Tan, J.; Ling, P.D. Sequence and Functional Analysis of EBNA-LP and EBNA2 Proteins from Nonhuman Primate Lymphocryptoviruses. J. Virol. 2000, 74, 379–389.
  4. Zanella, L.; Reyes, M.E.; Riquelme, I.; Abanto, M.; León, D.; Viscarra, T.; Ili, C.; Brebi, P. Genetic Patterns Found in the Nuclear Localization Signals (NLSs) Associated with EBV-1 and EBV-2 Provide New Insights into Their Contribution to Different Cell-Type Specificities. Cancers 2021, 13, 2569.
  5. Tomkinson, B.; Robertson, E.; Yalamanchili, R.; Longnecker, R.; Kieff, E. Epstein-Barr Virus Recombinants from Overlapping Cosmid Fragments. J. Virol. 1993, 67, 7298–7306.
  6. Hertle, M.L.; Popp, C.; Petermann, S.; Maier, S.; Kremmer, E.; Lang, R.; Mages, J.; Kempkes, B. Differential Gene Expression Patterns of EBV Infected EBNA-3A Positive and Negative Human B Lymphocytes. PLoS Pathog. 2009, 5, e1000506.
  7. Yenamandra, S.P.; Sompallae, R.; Klein, G.; Kashuba, E. Comparative Analysis of the Epstein-Barr Virus Encoded Nuclear Proteins of EBNA-3 Family. Comput. Biol. Med. 2009, 39, 1036–1042.
  8. Chen, A.; DiVisconte, M.; Jiang, X.; Quink, C.; Wang, F. Epstein-Barr Virus with the Latent Infection Nuclear Antigen 3B Completely Deleted Is Still Competent for B-Cell Growth Transformation In Vitro. J. Virol. 2005, 79, 4506–4509.
  9. Knight, J.S.; Lan, K.; Subramanian, C.; Robertson, E.S. Epstein-Barr Virus Nuclear Antigen 3C Recruits Histone Deacetylase Activity and Associates with the Corepressors MSin3A and NCoR in Human B-Cell Lines. J. Virol. 2003, 77, 4261–4272.
  10. Allday, M.J.; Crawford, D.H.; Griffin, B.E. Prediction and Demonstration of a Novel Epstein-Barr Virus Nuclear Antigen. Nucleic Acids Res. 1988, 16, 4353–4367.
  11. Petti, L.; Sample, J.; Wang, F.; Kieff, E. A Fifth Epstein-Barr Virus Nuclear Protein (EBNA3C) Is Expressed in Latently Infected Growth-Transformed Lymphocytes. J. Virol. 1988, 62, 1330–1338.
  12. Lu, J.; Wu, T.; Zhang, B.; Liu, S.; Song, W.; Qiao, J.; Ruan, H. Types of Nuclear Localization Signals and Mechanisms of Protein Import into the Nucleus. Cell Commun. Signal. 2021, 19, 60.
  13. Krauer, K.; Buck, M.; Flanagan, J.; Belzer, D.; Sculley, T. Identification of the Nuclear Localization Signals within the Epstein-Barr Virus EBNA-6 Protein. J. Gen. Virol. 2004, 85, 165–172.
  14. West, M. Structure and Function of the Epstein-Barr Virus Transcription Factor, EBNA 3C. Curr. Protein Pept. Sci. 2006, 7, 123–136.
  15. Saha, A.; Murakami, M.; Kumar, P.; Bajaj, B.; Sims, K.; Robertson, E.S. Epstein-Barr Virus Nuclear Antigen 3C Augments Mdm2-Mediated P53 Ubiquitination and Degradation by Deubiquitinating Mdm2. J. Virol. 2009, 83, 4652–4669.
  16. Jha, H.C.; Lu, J.; Saha, A.; Cai, Q.; Banerjee, S.; Prasad, M.A.J.; Robertson, E.S. EBNA3C-Mediated Regulation of Aurora Kinase B Contributes to Epstein-Barr Virus-Induced B-Cell Proliferation through Modulation of the Activities of the Retinoblastoma Protein and Apoptotic Caspases. J. Virol. 2013, 87, 12121–12138.
  17. Bajaj, B.G.; Murakami, M.; Cai, Q.; Verma, S.C.; Lan, K.; Robertson, E.S. Epstein-Barr Virus Nuclear Antigen 3C Interacts with and Enhances the Stability of the c-Myc Oncoprotein. J. Virol. 2008, 82, 4082–4090.
  18. Rosendorff, A.; Illanes, D.; David, G.; Lin, J.; Kieff, E.; Johannsen, E. EBNA3C Coactivation with EBNA2 Requires a SUMO Homology Domain. J. Virol. 2004, 78, 367–377.
  19. Yan, X.; Mouillet, J.-F.; Ou, Q.; Sadovsky, Y. A Novel Domain within the DEAD-Box Protein DP103 Is Essential for Transcriptional Repression and Helicase Activity. Mol. Cell Biol. 2003, 23, 414–423.
  20. Subramanian, C.; Hasan, S.; Rowe, M.; Hottiger, M.; Orre, R.; Robertson, E.S. Epstein-Barr Virus Nuclear Antigen 3C and Prothymosin Alpha Interact with the P300 Transcriptional Coactivator at the CH1 and CH3/HAT Domains and Cooperate in Regulation of Transcription and Histone Acetylation. J. Virol. 2002, 76, 4699–4708.
  21. Yi, F.; Saha, A.; Murakami, M.; Kumar, P.; Knight, J.S.; Cai, Q.; Choudhuri, T.; Robertson, E.S. Epstein-Barr Virus Nuclear Antigen 3C Targets P53 and Modulates Its Transcriptional and Apoptotic Activities. Virology 2009, 388, 236–247.
  22. Lee, S.; Sakakibara, S.; Maruo, S.; Zhao, B.; Calderwood, M.A.; Holthaus, A.M.; Lai, C.-Y.; Takada, K.; Kieff, E.; Johannsen, E. Epstein-Barr Virus Nuclear Protein 3C Domains Necessary for Lymphoblastoid Cell Growth: Interaction with RBP-Jκ Regulates TCL1. J. Virol. 2009, 83, 12368–12377.
  23. Saha, A.; Bamidele, A.; Murakami, M.; Robertson, E.S. EBNA3C Attenuates the Function of P53 through Interaction with Inhibitor of Growth Family Proteins 4 and 5. J. Virol. 2011, 85, 2079–2088.
  24. Pandey, S.; Robertson, E.S. Oncogenic Epstein–Barr Virus Recruits Nm23-H1 to Regulate Chromatin Modifiers. Lab. Investig. 2018, 98, 258–268.
  25. Subramanian, C.; Robertson, E.S. The Metastatic Suppressor Nm23-H1 Interacts with EBNA3C at Sequences Located between the Glutamine- and Proline-Rich Domains and Can Cooperate in Activation of Transcription. J. Virol. 2002, 76, 8702–8709.
  26. Sawada, K.; Yamamoto, M.; Tabata, T.; Smith, M.; Tanaka, A.; Nonoyama, M. Expression of EBNA-3 Family in Fresh B Lymphocytes Infected with Epstein-Barr Virus. Virology 1989, 168, 22–30.
  27. Briercheck, E.L.; Ravishankar, S.; Ahmed, E.H.; Carías Alvarado, C.C.; Barrios Menéndez, J.C.; Silva, O.; Solórzano-Ortiz, E.; Siliézar Tala, M.M.; Stevenson, P.; Xu, Y.; et al. Geographic EBV Variants Confound Disease-Specific Variant Interpretation and Predict Variable Immune Therapy Responses. Blood Adv. 2024, 8, 3731–3744.
  28. Hirabayashi, M.; Georges, D.; Clifford, G.M.; de Martel, C. Estimating the Global Burden of Epstein-Barr Virus–Associated Gastric Cancer: A Systematic Review and Meta-Analysis. Clin. Gastroenterol. Hepatol. 2023, 21, 922–930.e21.
  29. Lucchesi, W.; Brady, G.; Dittrich-Breiholz, O.; Kracht, M.; Russ, R.; Farrell, P.J. Differential Gene Regulation by Epstein-Barr Virus Type 1 and Type 2 EBNA2. J. Virol. 2008, 82, 7456–7466.
  30. Chang, C.M.; Yu, K.J.; Mbulaiteye, S.M.; Hildesheim, A.; Bhatia, K. The Extent of Genetic Diversity of Epstein-Barr Virus and Its Geographic and Disease Patterns: A Need for Reappraisal. Virus Res. 2009, 143, 209–221.
  31. Chabay, P.; De Matteo, E.; Merediz, A.; Preciado, M.V. High Frequency of Epstein Barr Virus Latent Membrane Protein-1 30 Bp Deletion in a Series of Pediatric Malignancies in Argentina. Arch. Virol. 2004, 149, 1515–1526.
  32. Correa, R.M.; Fellner, M.D.; Alonio, L.V.; Durand, K.; Teyssié, A.R.; Picconi, M.A. Epstein-Barr Virus (EBV) in Healthy Carriers: Distribution of Genotypes and 30 Bp Deletion in Latent Membrane Protein-1 (LMP-1) Oncogene. J. Med. Virol. 2004, 73, 583–588.
  33. Arturo-Terranova, D.; Giraldo-Ocampo, S.; Castillo, A. Caracterización Molecular de Las Variantes Del Virus de Epstein-Barr Detectadas En La Cavidad Oral de Adolescentes de Cali, Colombia. Biomédica 2020, 40, 76–88.
  34. Zimber, U.; Adldinger, H.K.; Lenoir, G.M.; Vuillaume, M.; Knebel-Doeberitz, M.V.; Laux, G.; Desgranges, C.; Wittmann, P.; Freese, U.-K.; Schneider, U.; et al. Geographical Prevalence of Two Types of Epstein-Barr Virus. Virology 1986, 154, 56–66.
  35. Berenstein, A.J.; Lorenzetti, M.A.; Preciado, M.V. Recombination Rates along the Entire Epstein Barr Virus Genome Display a Highly Heterogeneous Landscape. Infect. Genet. Evol. 2018, 65, 96–103.
  36. Palser, A.L.; Grayson, N.E.; White, R.E.; Corton, C.; Correia, S.; Ba, M.; Watson, S.J.; Cotten, M.; Arrand, J.R.; Murray, P.G.; et al. Genome Diversity of Epstein-Barr Virus from Multiple Tumor Types and Normal Infection. J. Virol. 2015, 89, 5222–5237.
  37. Norberg, P.; Kasubi, M.J.; Haarr, L.; Bergström, T.; Liljeqvist, J.-A. Divergence and Recombination of Clinical Herpes Simplex Virus Type 2 Isolates. J. Virol. 2007, 81, 13158–13167.
  38. Szpara, M.L.; Gatherer, D.; Ochoa, A.; Greenbaum, B.; Dolan, A.; Bowden, R.J.; Enquist, L.W.; Legendre, M.; Davison, A.J. Evolution and Diversity in Human Herpes Simplex Virus Genomes. J. Virol. 2014, 88, 1209–1227.
  39. Houldcroft, C.J. Human Herpesvirus Sequencing in the Genomic Era: The Growing Ranks of the Herpetic Legion. Pathogens 2019, 8, 186.
  40. Morga, B.; Jacquot, M.; Pelletier, C.; Chevignon, G.; Dégremont, L.; Biétry, A.; Pepin, J.-F.; Heurtebise, S.; Escoubas, J.-M.; Bean, T.P.; et al. Genomic Diversity of the Ostreid Herpesvirus Type 1 Across Time and Location and Among Host Species. Front. Microbiol. 2021, 12, 711377.
  41. Telford, M.; Navarro, A.; Santpere, G. Whole Genome Diversity of Inherited Chromosomally Integrated HHV-6 Derived from Healthy Individuals of Diverse Geographic Origin. Sci. Rep. 2018, 8, 3472, Erratum in Sci Rep. 2018, 8, 6357.
  42. Zhang, X.; Chen, Y.; Liang, J.; Yang, Y.; Chen, H.; Chen, Z.; Li, M.; Chen, S.; Chen, T.; He, H.; et al. Out-of-Africa Migration and Clonal Expansion of a Recombinant Epstein-Barr Virus Drives Frequent Nasopharyngeal Carcinoma in Southern China. Natl. Sci. Rev. 2025, 12, nwae438.
  43. Agwati, E.O.; Oduor, C.I.; Ayieko, C.; Ong’echa, J.M.; Moormann, A.M.; Bailey, J.A. Profiling Genome-Wide Recombination in Epstein Barr Virus Reveals Type-Specific Patterns and Associations with Endemic-Burkitt Lymphoma. Virol. J. 2022, 19, 208.
  44. Manzano, M.; Patil, A.; Waldrop, A.; Dave, S.S.; Behdad, A.; Gottwein, E. Gene Essentiality Landscape and Druggable Oncogenic Dependencies in Herpesviral Primary Effusion Lymphoma. Nat. Commun. 2018, 9, 3263.
  45. Liao, Y.; Bajwa, K.; Reddy, S.M.; Lupiani, B. Methods for the Manipulation of Herpesvirus Genome and the Application to Marek’s Disease Virus Research. Microorganisms 2021, 9, 1260.
  46. Kienzle, N.; Young, D.B.; Liaskou, D.; Buck, M.; Greco, S.; Sculley, T.B. Intron Retention May Regulate Expression of Epstein-Barr Virus Nuclear Antigen 3 Family Genes. J. Virol. 1999, 73, 1195–1204.
  47. Li, R.; Gao, S.; Chen, H.; Zhang, X.; Yang, X.; Zhao, J.; Wang, Z. Virus Usurps Alternative Splicing to Clear the Decks for Infection. Virol. J. 2023, 20, 131.
  48. Rekosh, D.; Hammarskjold, M. Intron Retention in Viruses and Cellular Genes: Detention, Border Controls and Passports. WIREs RNA 2018, 9, e1470.
  49. Monteuuis, G.; Wong, J.J.L.; Bailey, C.G.; Schmitz, U.; Rasko, J.E.J. The Changing Paradigm of Intron Retention: Regulation, Ramifications and Recipes. Nucleic Acids Res. 2019, 47, 11497–11513.
  50. Thompson, M.P.; Kurzrock, R. Epstein-Barr Virus and Cancer. Clin. Cancer Res. 2004, 10, 803–821.
  51. Sample, J.; Young, L.; Martin, B.; Chatman, T.; Kieff, E.; Rickinson, A.; Kieff, E. Epstein-Barr Virus Types 1 and 2 Differ in Their EBNA-3A, EBNA-3B, and EBNA-3C Genes. J. Virol. 1990, 64, 4084–4092.
  52. Brooks, J.M.; Croom-Carter, D.S.G.; Leese, A.M.; Tierney, R.J.; Habeshaw, G.; Rickinson, A.B. Cytotoxic T-Lymphocyte Responses to a Polymorphic Epstein-Barr Virus Epitope Identify Healthy Carriers with Coresident Viral Strains. J. Virol. 2000, 74, 1801–1809.
  53. Brown, J.C. The Role of DNA Repair in Herpesvirus Pathogenesis. Genomics 2014, 104, 287–294.
  54. Alves, P.D.; Rohan, P.; Hassan, R.; Abdelhay, E. Lytic and Latent Genetic Diversity of the Epstein–Barr Virus Reveals Raji-Related Variants from Southeastern Brazil Associated with Recombination Markers. Int. J. Mol. Sci. 2024, 25, 5002.
  55. Kalderon, D.; Roberts, B.L.; Richardson, W.D.; Smith, A.E. A Short Amino Acid Sequence Able to Specify Nuclear Location. Cell 1984, 39, 499–509.
  56. Döhner, K.; Ramos-Nascimento, A.; Bialy, D.; Anderson, F.; Hickford-Martinez, A.; Rother, F.; Koithan, T.; Rudolph, K.; Buch, A.; Prank, U.; et al. Importin A1 Is Required for Nuclear Import of Herpes Simplex Virus Proteins and Capsid Assembly in Fibroblasts and Neurons. PLoS Pathog. 2018, 14, e1006823.
  57. Aswad, A.; Katzourakis, A. Cell-Derived Viral Genes Evolve under Stronger Purifying Selection in Rhadinoviruses. J. Virol. 2018, 92, 10-1128.
  58. Andrade-Martínez, J.S.; Moreno-Gallego, J.L.; Reyes, A. Defining a Core Genome for the Herpesvirales and Exploring Their Evolutionary Relationship with the Caudovirales. Sci. Rep. 2019, 9, 11342.
  59. Sankhala, R.S.; Lokareddy, R.K.; Cingolani, G. Divergent Evolution of Nuclear Localization Signal Sequences in Herpesvirus Terminase Subunits. J. Biol. Chem. 2016, 291, 11420–11433.
  60. Fossum, E.; Friedel, C.C.; Rajagopala, S.V.; Titz, B.; Baiker, A.; Schmidt, T.; Kraus, T.; Stellberger, T.; Rutenberg, C.; Suthram, S.; et al. Evolutionarily Conserved Herpesviral Protein Interaction Networks. PLoS Pathog. 2009, 5, e1000570.
  61. Zhao, B.; Marshall, D.R.; Sample, C.E. A Conserved Domain of the Epstein-Barr Virus Nuclear Antigens 3A and 3C Binds to a Discrete Domain of Jkappa. J. Virol. 1996, 70, 4228–4236.
  62. Zhao, B.; Dalbiès-Tran, R.; Jiang, H.; Ruf, I.K.; Sample, J.T.; Wang, F.; Sample, C.E. Transcriptional Regulatory Properties of Epstein-Barr Virus Nuclear Antigen 3C Are Conserved in Simian Lymphocryptoviruses. J. Virol. 2003, 77, 5639–5648.
  63. Bhattacharjee, S.; Ghosh Roy, S.; Bose, P.; Saha, A. Role of EBNA-3 Family Proteins in EBV Associated B-Cell Lymphomagenesis. Front. Microbiol. 2016, 7, 457.
  64. Maruo, S.; Wu, Y.; Ito, T.; Kanda, T.; Kieff, E.D.; Takada, K. Epstein-Barr Virus Nuclear Protein EBNA3C Residues Critical for Maintaining Lymphoblastoid Cell Growth. Proc. Natl. Acad. Sci. USA 2009, 106, 4419–4424. [Google Scholar] [CrossRef]
  65. Wang, L.; Wang, N.; Zhang, W.; Cheng, X.; Yan, Z.; Shao, G.; Wang, X.; Wang, R.; Fu, C. Therapeutic Peptides: Current Applications and Future Directions. Signal Transduct. Target. Ther. 2022, 7, 48. [Google Scholar] [CrossRef]
  66. Kapat, K.; Kumbhakarn, S.; Sable, R.; Gondane, P.; Takle, S.; Maity, P. Peptide-Based Biomaterials for Bone and Cartilage Regeneration. Biomedicines 2024, 12, 313. [Google Scholar] [CrossRef]
  67. Levin, A.; Armon-Omer, A.; Rosenbluh, J.; Melamed-Book, N.; Graessmann, A.; Waigmann, E.; Loyter, A. Inhibition of HIV-1 Integrase Nuclear Import and Replication by a Peptide Bearing Integrase Putative Nuclear Localization Signal. Retrovirology 2009, 6, 112. [Google Scholar] [CrossRef] [PubMed]
  68. Burslem, G.M.; Crews, C.M. Proteolysis-Targeting Chimeras as Therapeutics and Tools for Biological Discovery. Cell 2020, 181, 102–114. [Google Scholar] [CrossRef] [PubMed]
  69. Mukerjee, N.; Mukherjee, D. PROTAC-Based Therapeutics for Targeting HPV Oncoproteins in Head and Neck Cancers. Nano TransMed 2025, 4, 100071. [Google Scholar] [CrossRef]
  70. Sugiokto, F.G.; Li, R. Targeting EBV Episome for Anti-Cancer Therapy: Emerging Strategies and Challenges. Viruses 2025, 17, 110. [Google Scholar] [CrossRef] [PubMed]
  71. Katoh, K.; Rozewicki, J.; Yamada, K.D. MAFFT Online Service: Multiple Sequence Alignment, Interactive Sequence Choice and Visualization. Brief. Bioinform. 2019, 20, 1160–1166. [Google Scholar] [CrossRef]
  72. Okonechnikov, K.; Golosova, O.; Fursov, M. Unipro UGENE: A Unified Bioinformatics Toolkit. Bioinformatics 2012, 28, 1166–1167. [Google Scholar] [CrossRef]
  73. Midgley, R.S.; Bell, A.I.; McGeoch, D.J.; Rickinson, A.B. Latent Gene Sequencing Reveals Familial Relationships among Chinese Epstein-Barr Virus Strains and Evidence for Positive Selection of A11 Epitope Changes. J. Virol. 2003, 77, 11517–11530. [Google Scholar] [CrossRef] [PubMed]
  74. Moss, D.J.; Burrows, S.R.; Silins, S.L.; Misko, I.; Khanna, R. The Immunology of Epstein–Barr Virus Infection. Philos. Trans. R. Soc. Lond. B Biol. Sci. 2001, 356, 475–488. [Google Scholar] [CrossRef]
  75. Martin, D.P.; Varsani, A.; Roumagnac, P.; Botha, G.; Maslamoney, S.; Schwab, T.; Kelz, Z.; Kumar, V.; Murrell, B. RDP5: A Computer Program for Analyzing Recombination in, and Removing Signals of Recombination from, Nucleotide Sequence Datasets. Virus Evol. 2021, 7, veaa087. [Google Scholar] [CrossRef] [PubMed]
  76. Croucher, N.J.; Page, A.J.; Connor, T.R.; Delaney, A.J.; Keane, J.A.; Bentley, S.D.; Parkhill, J.; Harris, S.R. Rapid Phylogenetic Analysis of Large Samples of Recombinant Bacterial Whole Genome Sequences Using Gubbins. Nucleic Acids Res. 2015, 43, e15. [Google Scholar] [CrossRef]
  77. Didelot, X.; Wilson, D.J. ClonalFrameML: Efficient Inference of Recombination in Whole Bacterial Genomes. PLoS Comput. Biol. 2015, 11, e1004041. [Google Scholar] [CrossRef] [PubMed]
  78. Martin, D.P.; Posada, D.; Crandall, K.A.; Williamson, C. A Modified Bootscan Algorithm for Automated Identification of Recombinant Sequences and Recombination Breakpoints. AIDS Res. Hum. Retroviruses 2005, 21, 98–102. [Google Scholar] [CrossRef] [PubMed]
  79. Martin, D.; Rybicki, E. RDP: Detection of Recombination amongst Aligned Sequences. Bioinformatics 2000, 16, 562–563. [Google Scholar] [CrossRef]
  80. Padidam, M.; Sawyer, S.; Fauquet, C.M. Possible Emergence of New Geminiviruses by Frequent Recombination. Virology 1999, 265, 218–225. [Google Scholar] [CrossRef]
  81. Smith, J. Analyzing the Mosaic Structure of Genes. J. Mol. Evol. 1992, 34, 126–129. [Google Scholar] [CrossRef] [PubMed]
  82. Posada, D. Evaluation of Methods for Detecting Recombination from DNA Sequences: Empirical Data. Mol. Biol. Evol. 2002, 19, 708–717. [Google Scholar] [CrossRef]
  83. Gibbs, M.J.; Armstrong, J.S.; Gibbs, A.J. Sister-Scanning: A Monte Carlo Procedure for Assessing Signals in Recombinant Sequences. Bioinformatics 2000, 16, 573–582. [Google Scholar] [CrossRef]
  84. Lam, H.M.; Ratmann, O.; Boni, M.F. Improved Algorithmic Complexity for the 3SEQ Recombination Detection Algorithm. Mol. Biol. Evol. 2018, 35, 247–251. [Google Scholar] [CrossRef] [PubMed]
  85. Letunic, I.; Bork, P. Interactive Tree of Life (ITOL) v3: An Online Tool for the Display and Annotation of Phylogenetic and Other Trees. Nucleic Acids Res. 2016, 44, W242–W245. [Google Scholar] [CrossRef] [PubMed]
  86. Tonkin-Hill, G.; Lees, J.A.; Bentley, S.D.; Frost, S.D.W.; Corander, J. RhierBAPS: An R Implementation of the Population Clustering Algorithm HierBAPS. Wellcome Open Res. 2018, 3, 93. [Google Scholar] [CrossRef]
  87. Robinson, D.F.; Foulds, L.R. Comparison of Phylogenetic Trees. Math. Biosci. 1981, 53, 131–147. [Google Scholar] [CrossRef]
  88. Bogdanowicz, D.; Giaro, K. Matching Split Distance for Unrooted Binary Phylogenetic Trees. IEEE/ACM Trans. Comput. Biol. Bioinform. 2012, 9, 150–160. [Google Scholar] [CrossRef]
  89. Kuhner, M.K.; Felsenstein, J. A Simulation Comparison of Phylogeny Algorithms under Equal and Unequal Evolutionary Rates. Mol. Biol. Evol. 1994, 11, 459–468. [Google Scholar] [CrossRef]
  90. Steel, M.A.; Penny, D. Distributions of Tree Comparison Metrics-Some New Results. Syst. Biol. 1993, 42, 126. [Google Scholar] [CrossRef]
Figure 1. Phylogenetic comparison of EBNA3C with recombinant regions unfiltered (left) vs. filtered (right). The phylogenetic reconstruction of EBNA3C includes both exons (BERF3 and BERF4) and excludes the intron. The left tree shows the reconstruction with recombinant regions retained, and the right tree shows the reconstruction with recombinant regions removed. Red branches indicate EBV-2 isolates. Black dots represent EBV isolates. Lines connect the same isolate in the two trees. Some isolates are rearranged within their phylogenetic clade; these are indicated by dashed blue lines.
Figure 2. Phylogenetic reconstruction of EBNA3C with putative recombinant regions filtered out. The maximum likelihood tree includes the BERF3 and BERF4 exons. Isolates with pink branches correspond to EBV-2. The outer circle shows the 10 distinct populations defined through hierarchical Bayesian clustering. The legend gives the color assigned to each of the 10 populations. Bootstrap values above 80% are shown with red triangles.
Figure 3. Predicted putative recombinant regions in EBNA3C. The left side of the figure shows the maximum likelihood phylogenetic reconstruction of the EBNA3C gene, excluding the intronic region. Red branches of the tree represent the sequences classified as EBV-2. On the right-hand side, red horizontal bars represent putative recombinant regions on internal branches, shared among multiple isolates through common ancestry. These shared regions are numbered from I to VI. Light blue horizontal bars represent putative recombinant regions specific to individual isolates. The numbers displayed at the lower edges of the horizontal bars give the nucleotide positions in the reference sequence.
Figure 4. Schematic representation of nucleotide and amino acid variation in EBNA3C. The two exons (BERF3 and BERF4) and the intron are depicted in the bottom part of the image, with the nucleotide (nt) size indicated inside the light blue box. The upper region of the figure indicates the functional motifs (3 NLS, RBP-JK, bZIP, and NM23-H1) relative to the corresponding exon, with the amino acid (aa) size indicated inside the light red box. The amino acid substitutions identified in the functional motifs indicated by arrows are depicted in three sequence lines: the reference sequence in the center, shaded in light gray; EBV-1 in the upper line; and EBV-2 in the lower line. The vertical lines between the sequences indicate identical amino acids. The numbers between the vertical lines give the number of sequences with the indicated substitution. Conservative amino acid substitutions are shown in blue, while those with physicochemical changes are highlighted in red. The green triangles show the repetitive region characteristic of EBV-1, and the blue triangle shows the repetitive region of EBV-2.
Table 1. Overview of amino acid variation in BERF3 and BERF4. The table compares the amino acid substitution patterns of BERF3 and BERF4, giving the overall frequency of each pattern and the diseases associated with it. It also gives the geographic distribution of these disease-related patterns and deletions. No amino acid variation patterns were found in BERF3 of the EBV-2 genotype. Amino acid substitutions are written as wild-type residue, position, variant residue (XposY), while deletions are denoted by their range start…end (pos1…pos2). The most significant results are highlighted in bold.
| Region | Amino Acid Substitution Pattern | Frequency (%) | Disease (%) | Country (%) |
|---|---|---|---|---|
| BERF3, EBV-1 | T107I | 4.6 | Healthy (88.1) | China (97.3); Taiwan (2.7) |
| | T104A + T107V | 19.0 | CAEBV (28.5) | Japan (100.0) |
| | | | Healthy (21.2) | China (89.5); Taiwan (5.3); UK (5.3) |
| | | | NPC (14.0) | China (96.0); Indonesia (4.0) |
| | | | NKTCL (8.4) | Japan (60.0); China (13.3); Indonesia (13.3); Singapore (13.3) |
| | L45F + Q60P + T104A + T107V | 1.8 | IM (37.5) | USA (100.0) |
| | | | PTLD (25.0) | UK (100.0) |
| | | | BL (18.8) | Africa (33.3); North Africa (33.3); USA (33.3) |
| | R11I + N21D + R44G + Y51D | 16.3 | NPC (24.8) | China (100.0) |
| | | | CAEBV (24.2) | Japan (97.2); USA (2.8) |
| | | | Healthy (15.4) | China (91.3); UK (8.7) |
| | | | GC (13.4) | China (65.0); South Korea (25.0); Japan (5.0); Korea (5.0) |
| | R11I + N21D + R44G + Y51D + T107I | 30.4 | NPC (62.6) | China (99.4); Indonesia (0.6) |
| | | | Healthy (25.2) | China (98.6); Taiwan (1.4) |
| | | | pLELC (7.6) | China (100.0) |
| BERF4, EBV-1 | I141V + Q213H + E336D + I348L + R656G + E701Q | 27.0 | NPC (64.8) | China (99.4); Indonesia (0.6) |
| | | | Healthy (22.7) | China (98.2); UK (1.8) |
| | | | pLELC (7.3) | China (100.0) |
| | I348L + L669P + S690P + E701Q + H831Y + C915W | 3.8 | Healthy (88.6) | China (96.8); Taiwan (3.2) |
| | | | NKTCL (11.4) | China (75.0); Singapore (25.0) |
| | I141V + Q213H + E336D + I348L + L669P + E701Q + C915W | 2.4 | CAEBV (50.0) | Japan (100.0) |
| | I141V + Q213H + E336D + I348L + L669P + E701Q + C915W + A978V | 1.5 | GC (42.9) | South Korea (50.0); China (16.7); Japan (16.7); Korea (16.7) |
| | | | CAEBV (35.7) | Japan (100.0) |
| | A162V + G357V + G655S + T677M + A683V + E701Q + Q740P + Q744R + P753Q + L866S | 4.7 | Healthy (46.5) | China (90.0); Taiwan (5.0); UK (5.0) |
| | | | CAEBV (18.6) | Japan (100.0) |
| | I141V + A162V + T188A + A215G + T356M + G357V + E701Q + A840S + M904L + A921S | 2.4 | IM (54.5) | USA (100.0) |
| | | | ND (22.7) | UK (80.0); Germany (20.0) |
| | A162V + S557L + G655S + T677M + A683V + E701Q + Q740P + Q744R + P753Q + L866S + K976E | 2.5 | CAEBV (65.2) | Japan (100.0) |
| | | | IM (17.4) | Japan (50.0) |
| | | | NKTCL (13.0) | Japan (100.0) |
| BERF4, EBV-2 | R376S + G472E + V719T + F806V + P829S + S868L + L881S | 11.0 | Healthy (62.5) | China (100) |
| | 404…434 + 680…702 + 713…731 + 818…824 | 13.7 | BL (84) | Kenya (100) |
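The pattern notation used in Table 1 is regular enough to be read by a program. The following is a minimal sketch of one way to do that, not the authors' pipeline: the names `Substitution`, `Deletion`, and `parse_pattern` are hypothetical, and the sketch assumes that XposY means wild-type residue X replaced by residue Y at position pos, and that `+` joins events occurring together in one pattern.

```python
import re
from typing import List, NamedTuple, Union


class Substitution(NamedTuple):
    """One XposY event: wild-type residue, position, variant residue."""
    wild_type: str
    position: int
    variant: str


class Deletion(NamedTuple):
    """One pos1…pos2 event: inclusive start and end positions."""
    start: int
    end: int


# Hypothetical helper patterns; the deletion form accepts the ellipsis
# character (U+2026) used in the table or three ASCII dots.
_SUB = re.compile(r"^([A-Z])(\d+)([A-Z])$")
_DEL = re.compile(r"^(\d+)(?:\u2026|\.\.\.)(\d+)$")


def parse_pattern(pattern: str) -> List[Union[Substitution, Deletion]]:
    """Split a Table 1 style pattern on '+' and parse each event."""
    events: List[Union[Substitution, Deletion]] = []
    for token in pattern.split("+"):
        token = token.strip()
        sub = _SUB.match(token)
        if sub:
            events.append(Substitution(sub.group(1), int(sub.group(2)), sub.group(3)))
            continue
        deletion = _DEL.match(token)
        if deletion:
            events.append(Deletion(int(deletion.group(1)), int(deletion.group(2))))
            continue
        raise ValueError(f"unrecognized event: {token!r}")
    return events
```

For example, `parse_pattern("T104A + T107V")` yields two `Substitution` records, and `parse_pattern("404…434 + 680…702")` yields two `Deletion` records. Splitting on `+` before stripping whitespace also tolerates uneven spacing around the separator.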
Table 2. EBNA3C epitope alterations related to medical conditions and geographic distribution. The table details the amino acid variations in eight described epitopes on EBNA3C.
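| Epitope | Viral Type | Amino Acid Substitutions | Frequency (%) | Clinical Condition (%) | Country (%) |
|---|---|---|---|---|---|
| FRKAQIQGL | EBV-1 | I348L | 55.8 | NPC (46.7) | China (100) |
| | | | | Healthy (25.3) | |
| | | I348M | 1.5 | BL (85.7) | Kenya (92) |
| | EBV-2 | R344L + I348R | 100 | BL (27) | Kenya (80) |
| | | | | Healthy (35) | China (80) |
| KEHVIQNAF | EBV-1 | E336D | 46.8 | NPC (51.6) | China (100) |
| | | | | Healthy (20.3) | |
| | | H337Q | 1.5 | BL (85.7) | Kenya (92) |
| | EBV-2 | H337Q + N341K | 100 | BL (27) | Kenya (80) |
| | | | | Healthy (35) | China (80) |
| LRGKWQRRYR | EBV-2 | Y257F | 100 | BL (27) | Kenya (80) |
| | | | | Healthy (35) | China (80) |
| RRIYDLIEL | EBV-1 | R259K | 3.2 | Healthy (20.7) | United Kingdom (100) |
| | | | | IM (20.7) | United States (100) |
| | | | | PTLD (13.8) | United Kingdom (100) |
| | | Y261F | 2.4 | NPC (50) | Indonesia (91) |
| | EBV-2 | Y261F | 100 | BL (27) | Kenya (80) |
| | | | | Healthy (35) | China (80) |
| QPRAPIRPI | EBV-2 | Q761P + R763P | 19 | Healthy (14) | Kenya (100) |
| | | | | BL (20) | |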
Abbreviations: BL, Burkitt lymphoma; IM, infectious mononucleosis; (-), conserved, showing no changes.
