Evidence of Circulation of Several HAV Genetic Variants and Emergence of Potential Antigenic Variants in an Endemo-Epidemic Country before Vaccine Introduction

Similar to several other countries in the world, the epidemiology of hepatitis A virus changed from high to intermediate endemicity level in Tunisia, which led to the occurrence of outbreaks. This study aimed to determine the genetic and antigenic variability of HAV strains circulating in Tunisia during the last few years. Genotyping using complete VP1 gene and VP1-2A junction confirmed the predominance of genotype IA, with co-circulation of several genetic and antigenic variants. Phylogenetic analysis including Tunisian and strains from other regions of the world showed the presence of at least two IA-variants within IA subgenotype. Amino-acid analysis showed several mutations in or close to epitope regions in the VP1-region. This study provides a baseline on the genetic and antigenic variability of HAV circulating strains before the introduction of vaccination into the national immunization schedule.


Introduction
Hepatitis A infection remains a major public health problem despite the presence of effective vaccines since 1992. The infection is distributed worldwide but occurs more frequently in less developed regions with poor hygienic and sanitary conditions, causing substantial morbidity and mortality rates [1,2]. WHO estimates that hepatitis A caused approximately 7134 deaths in 2016, which accounts for 0.5% of the mortality worldwide due to viral hepatitis [3].
Hepatitis A infection is caused by hepatitis A virus (HAV), an RNA virus classified within the Picornaviridae family and Hepatovirus genus. HAV exists in a dual phenotype, naked and quasi-enveloped virions [4]. The quasi-enveloped virions represent the immature particles that are found in the blood, whereas the naked virions in feces are those released form their quasi-envelope by the action of bile salts in the passage from the bile ducts to the gut [4][5][6]. The naked virions are mature and contain the fully processed VP1 protein [4][5][6]. HAV genome is a positive single stranded RNA of approximately 7.5 Kb which comprises a single open reading frame (ORF) and two untranslated regions on its 3 and 5 ends. Genotyping analysis of different HAV genomic regions revealed the existence of 6 different genotypes [7][8][9][10]. Genotypes I to III, sub-divided into sub-genotype IA, IB, IC, IIA, IIB, IIIA and IIIB, infect humans, while genotypes IV to VI are of simian origin [11,12].
The severity of the disease is age dependent. Unlike young children, teenagers and adults usually develop symptomatic clinical forms, sometimes with high severity [13,14]. HAV infection is more common in developing countries, most children get infected before the age of 10 years and older children and adults are generally immune. In these countries, symptomatic disease rates are low and outbreaks are uncommon. In countries with higher economic level, the HAV infection occurs in older ages and symptomatic cases as well as outbreaks are more frequent. During the past years, an epidemiological change of hepatitis A pattern was observed in many parts of the world [15][16][17][18][19][20][21][22][23][24]. Globally, an increase of symptomatic cases was observed [25,26]. Several outbreaks have been triggered in countries whose epidemiological status has changed from high to medium or low endemicity with more severe outcomes [25,27]. Thus, the implementation of programs for HAV surveillance and the introduction of HAV vaccine stand out as the best measure to limit and prevent the spread of the virus, especially in countries with intermediate or low endemicity levels.
In Tunisia, a transition from high to intermediate endemicity level was observed [28][29][30]. Improvement of socioeconomic and hygienic conditions had contributed to the decreasing frequency of the infection among young children, especially among primary school attendants, and led to serious outbreaks and significant disease burden. The national health authorities have committed, in line with the World Health Organization's (WHO) resolution, to eliminate hepatitis viruses by 2030 as a public health threat. In this context, novel measures to limit the transmission of the disease were undertaken, by implementing an HAV surveillance program and introducing HAV vaccine. In October 2018, HAV vaccination was introduced for 6-year-old children at school entry. Since September 2020, the national HAV vaccination schedule has combined a single dose at 12 months with a catch-up vaccination for non-vaccinated children aged 6 years.
The knowledge of the molecular epidemiology of viral strains circulating is a key component to evaluate the vaccine efficiency, to follow progression of the national program and to detect a possible emergence of vaccine-escape variants. Only few studies have investigated the molecular patterns of HAV in Tunisian population. The most recent study identified HAV strains circulating during 2008-2013 in North Central Tunisia and its migration pattern [31]. The present retrospective study aims to provide an overview of the genetic diversity of HAV strains circulating in Tunisia between 2013 and 2018, prior to the vaccine introduction into the national immunization program.

Samples
Out of 103 HAV IgM positive serum samples, 77 (74.8%) were included in this study. The inclusion criteria were positivity of both anti-HAV IgM and RNA detection by PCR amplification in the 5 UTR genomic regions (position: 55-678, according to reference strains HM175). The samples were collected as part of diagnostic activity of the Laboratory of Clinical Virology of Pasteur Institute of Tunis, between 2013 and 2018, and are originated from 7 districts; in the Northern, the Eastern and the central regions of the country. The samples were stored at −20 • C. No identifying patient data were used in this study.

RNA Extraction and RT-PCR
Viral RNA was extracted from 140 µL of serum using the QIAmp viral RNA mini Kit (QIAGEN, Hilden-Germany) following the manufacturer's instructions. The RNA extracted was used as a template for the amplification of the complete VP1 gene (954 bp) and the VP1-2A junction (394bp) by nested RT-PCR. Reverse transcription and amplification were performed using previously published primers [8,31,32]. For both regions, the cDNA was synthetized using 10 µL of RNA and anti-sens primer m2: 5 -AGTCACACCTCTCCAG GAA-3 and HAV3381: 5 -CCATYTCAAGAGTCCACACACT-3 for the complete VP1 gene and the VP1-2A junction, respectively. The amplification of the VP1-2A junction was then carried out with 10 µL of cDNA and 0.2 µM of outer primers: HAV2870 (forward: 5 -GACAGATTCYACATTTGGATTGGT-3 ) and HAV3381 (reverse: 5 -CCATYTCAAGAGTC CACACACT-3 ) at 94 • C for 5 min followed by 35 cycles (30 s at 94 • C, 30 s at 58 • C, 1 min at 72 • C) and a final elongation at 72 • C for 5 min. Nested PCR was carried out with 5 µL of the product of the first PCR and 0.2 µM of inner primers HAV2896 (forward: 5 -CTATTCAGATTGCAAATTAYAAT-3 ) and HAV3289 (reverse: 5 -AAYTTCATYATTTCATG CTCCT-3 ), following the same thermal condition with the exception of the hybridation temperature 55 • C for the nested PCR. Total VP1 gene was amplified using outer primers HAV-2167 (forward: 5 -GTTTTGCTCCTCTTTATCATG-3 ) and m2 (reverse: 5 -AGTCACACCTCTCCAGGAA-3 ) and inner primers HAV m1 (forward: 5 -GCTCCTCTTT ATCATGCTATG-3 ) and HAV3125 (reverse: 5 -CCTGCATTCTATATGACTCT-3 ), with same amplification duration for both PCR rounds: 94 • C for 5 min followed by 40 cycles (30 s at 94 • C, 30 s at 53 • C for first PCR or at 55 • C for the second PCR, 1 min at 72 • C) and a final elongation at 72 • C for 5 min. The PCR products were visualized on a 1% agarose gel.

Amplicon Purification and Sequencing
The PCR products of the VP1-2A junction and the whole VP1 were purified using the QIAquick PCR purification Kit (QIAGEN, GmnH, Hilden, Germany) following manufacturer's instructions. Purified PCR-products were sequenced by inner primers of the nested PCR on an ABI Prism 3130-Genetic Analyser (3130-Genetic Analyser Applied Biosystems) using the Big Dye terminator ready reaction cycle sequencing Kit (Applied Biosystems).

Sequence Analyses and Genotyping
The sequence of each isolate was deduced by aligning the respective forward and reverse sequences using Chromas software version 2.6.2. Genotyping was performed using phylogenetic analysis comparing the VP1-2A and VP1 sequences with reference sequences representing the different HAV genotypes and sub-genotypes IA to IIIB. Gen-Bank accessions numbers of references sequences used in this study were as follows: X75215, AF357222, AB020566 for genotype IA, M14707, M20273, AF314208 for genotype IB, HQ401240 for genotype IC, AY644676 for genotype IIA, AY644670 for genotype IIB, AB279733, AY644337 for genotype IIIA and AB279735, AB258387 for genotype IIIB. Different strains previously submitted in the nucleotide GenBank collection from Tunisia and other countries were selected, to study nucleotide and antigenic variability. Phylogenetic trees were constructed using MEGA software version 7.0.26 by the maximum likelihood method and the kimura-2 parameter model [33]. Topology was supported by 1000 bootstrap replicates. The new HAV sequences generated as part of the present work were submitted to GenBank database under accession numbers: MW117967 to MW118018 and MW222090 to MW222114 for the complete VP1 with a partial 2A region, and VP1-2A junction, respectively.

Molecular Modeling
The three-dimensional structure of HAV-VP1 protein (accession number: MW118004) was predicted using Modeller version 9.24 [34]. Its amino acid sequence was compared with other sequences retrieved from nrNCBI database using FASTA and BLAST. The Crystal structure of HAV (PDB code 4QPI) was used as a template, in which the VP1 protein (Chain_A) was extracted using MMTSB tool set [12]. The model corresponding to the best value of the DOPE score was selected after generating 1000 conformers [35]. All structures were visually explored using PyMOL molecular Viewer [36].

Genetic and Phylogenetic Analysis
The 393 nucleotides in the VP1-2A junction could be amplified and sequenced from the 77 samples included in the study. Fifty-two complete sequences of the VP1 gene (954 nt long) were obtained. The phylogenetic trees in Figure 1 compare the Tunisian sequences in the entire VP1 region ( Figure 1A) and the VP1-2A junction ( Figure 1B) with reference sequences representative of the different HAV genotypes and subtypes. Both trees show that all isolates belonged to HAV genotype IA with similar grouping of the Tunisian sequences into 5 clusters (designed (a) to (d)); more consistent bootstrap values in the complete VP1 region were observed.
To better understand the molecular epidemiology of circulating strains in Tunisia, the 77 VP1-2A junctions detected in this work (2013-2018) were aligned with 104 previously published Tunisian strains of genotype IA collected during 2001-2013 period ( Figure 2). The 104 sequences were selected among a total of 187 sequences published in Genbank. For identical sequences detected in the same year, only one representative sequence was selected for the phylogenetic analysis. Similar to Figure 1, the tree in Figure 2  The phylogenetic comparison of the entire VP1 of HAV Tunisian strains with others published from 21 different countries of the world is shown in Figure 3. Based on the entire HAV VP1 sequence, the phylogenetic analysis shows the presence of three main clusters I to III. All the Tunisian sequences were grouped in one cluster (Cluster I) with sequences from France (4 sequences, represented in red), Spain (6 sequences, represented in blue) and Yugoslavia (2 sequences, represented in green). Cluster II enclosed sequences almost from all the different countries whereas the cluster III grouped only one sequence from Chile, one from France and one from Spain ( Figure 3).
In addition to the twelve amino acid replacements, a deletion of six amino acids RWFFNL (position 128 to 133) was found in one sequence (accession number: MW118004) located at the B/T cell epitope region 115-139. To study the effect of this deletion on the VP1 protein, a model of MW118004 was generated and selected according to several evaluation criteria. Figure 5 shows the three-dimensional structure of the VP1 of this isolate (Figure 5b) as compared to the template HAV-VP1 protein (PDB: 4QPI) (Figure 5a). The analysis of the three-dimensional structure, determined by homology modeling, shows that the deleted region found in the sequence of this isolate (PSTLRWFFNLF/PASTL------F) resulted in a change in the VP1 protein structure by causing a destruction of an α-helix in the mature protein ( Figure 5). This destroyed α-helix being located in an epitope region is likely to induce physicochemical modifications of the properties of this epitope, which may affect the binding affinity of the responding B cell and also of T cell, given that this deletion are located in a peptide recognized by T cell [38].

Discussion
This study reports an in-depth analysis of the genetic variability of HAV in Tunisia during the last few years. The sampling is not representative of hepatitis A cases in Tunisia due to the high frequency of asymptomatic forms and of symptomatic forms that are not reported. However, this study provides an overview on the molecular characterization of circulating strains. Phylogenetic analysis of HAV strains collected from different Tunisian districts during a six year period was conducted. All of the detected sequences belonged to genotype IA. The predominance of this genotype with the co-circulation of genotype IB was previously reported in Tunisia in both environmental and clinical samples [31,[42][43][44]46,47]. The predominance of genotype IA was also described globally [48]. Same genotyping results were obtained based on the VP1-2A junction or the entire VP1 gene by phylogenetic analysis. Several studies showed that HAV genotyping could be performed using either the entire VP1 region, the N terminus of the VP1 region, the VP1-2A junction or the VP1-2B region [11,[49][50][51][52]. Our results confirm that both regions can be used for HAV genotyping, although the reliability of bootstrap values in the phylogenetic analysis is more consistent when the entire VP1 sequences is used. Thus, VP1-2A junction remains a good solution for HAV genotyping, but for more in-depth analysis, larger genomic regions are more helpful.
To better understand the evolution and the genetic variability of circulating HAV sequences, we compared the VP1-2A junction sequences obtained in this work with others described in previous studies. Phylogenetic analyses showed that the Tunisian HAV strains detected between 2001 and 2018 were divided into at least five variants. indicates that the endemicity level remains quite important. In fact, the number of genetic variants of pathogens generally decreases together with the progress of control strategies. While variant2 has continuous circulation, variant1, 3 and 5 were detected during shorter period, but it cannot be excluded that they may circulated during longer period given the lack of performant molecular surveillance during the study period.
When genotype IA Tunisian sequences, including the ones reported herein and previously published ones, were compared to HAV sequences from other countries, we found that all sequences from Tunisia grouped together in one cluster with other sequences from France, Spain and Yugoslavia. The second Cluster II enclosed strains from Asia, America and Europe whereas the cluster III grouped only sequences from Chile, France and Spain. These results suggest the presence of at least three IA variants within the IA subgenotype. However, the relatively limited number of sequences covering the total VP1 region available in GenBank, which originates from only 21 countries, may be the reason of the detection of only these three IA variants. Most of the published sequences cover a short size segment (<900 bases) of the genome which is not enough to properly determine the genetic variability of circulating strains. The availability in the future of full VP1 sequences from other countries in the world will help to better define the number of variants circulating worldwide and assess the genetic variability within genotype IA.
In the second part of the present work, we studied the impact of nucleotide mutations in the capsid protein VP1 gene on the amino-acid sequences to better understand the antigenic diversity of circulating HAV strains. The VP1 protein is a structural protein, known to contain major immunodominant epitopes of HAV [53,54]. VP1 protein analysis is important for predicting the possible antigenic escaping mutants within the circulating strains. Comparison of the deduced VP1 amino acid sequences obtained in the present study with the reference sequence of genotype IA showed several amino acid differences, mostly located in the N terminus of the VP1 region. Three to five different amino-acid mutations were identified per strain, all these mutations were located in or close to B/T cell epitope regions [12,37,38,40,41,45]. Most of the amino acid replacements were nonconservative and located in epitope regions. Thus, strains bearing such localized amino acid changes might be considered as potential antigenic variants. Eight amino acid replacements located in positions VP1-1, VP1-2, VP1-3, VP1-6, VP1-18, VP1-64, VP1-72 and VP1-99 were described for the first time in our study. Replacements VP1-26, VP1-37 and VP1-99 were found in all isolates of the present study. Replacement VP1-26 was identified previously in Argentina [49]. Replacement VP1-37 (R → K) was previously described in Tunisia and also in Argentina [36][37][38]49]. Replacement VP1-99 (I → T) is, to our knowledge reported for the first time in this study. This replacement same as replacements VP1-26 (R → K) and VP1-29 (R → K) seem to be associated to subtype IB [37]. In fact, replacement VP1-29 (R → K) was previously identified in Tunisia and in India and was associated widely to subtype IB [37]. Furthermore, recombination between sub-genotype IA and IB was previously described in Tunisia and other countries, which may explain this result [43,55].
A non-conservative replacement at position VP1-166 was previously identified in three isolates in Spain, during two outbreaks among male patients having sex with men; one was in non-vaccinated patient and the two others were in partly vaccinated patients [41,56]. These three isolates were considered as antigenic variants. In the present work, a semiconservative amino acid replacement (V → A) at the same position was identified in one isolate detected in non-vaccinated patient, bearing four other amino acid replacements; two non-conservatives (in B/T cell epitope regions) and two conservatives (one in and the other close to B/T cell epitope regions). Thus, this isolate can be also considered as an antigenic variant.
These findings emphasize the importance of following the genetic variability of circulating strains to determinate the effects of vaccination on the molecular epidemiology of hepatitis A and on the potential selection of vaccine-escape mutants. Indeed, the emergence of potential antigenic variants, as well as vaccine-escape mutants after vaccine implementation, were previously described in many studies, especially when a man has sex with another man, and in immunocompromised patients [41,45,[56][57][58][59]. In addition, it was previously reported that after vaccination amino acid replacement occurring in and around the epitopes regions was observed in vaccinated and unvaccinated persons [56]. The abundance of these amino-acid replacements was significantly higher in vaccinated patients, suggesting a positive selection of antigenic variants in some vaccinated patients [56]. Further molecular studies on HAV strains targeting the entire capsid proteins (VP1, VP2 and VP3) will be of great interest to better understand the evolution of circulating strains before and after introduction of HAV vaccine.
Interestingly, a deletion of 6 amino acids (PSTL------F) position 128 to 133 was found in another isolate of the present study (accession N • : MW118004), located in the B/T cell epitope region (position 115-139) and resulted in the destruction of an α-helix ( Figure 5) in the mature protein. Despite this deletion, the reading frame was conserved providing a non-defective virus as it was characterized by the presence of anti-HAV IgM antibody in symptomatic patient sera. Deletion mutants of VP1 were identified in other studies [44,45]. Aragones et al., 2007, studied HAV isolates evolution under a selective pressure of monoclonal antibodies and showed the occurrence of several deletion mutants, among which four contains the total or a part of the deleted region identified in the present study [45]. In Tunisia, Khelifi et al., reported another deletion in one isolate, a 38 amino acid deletion in VP1 protein (position 150-187) located in the part of the immunodominant antigenic site [44]. The occurrence of deletions seems to play a role in the adaptation to different selective pressures such as host defense mechanisms or new environmental conditions.
Despite the high degree of conservation of the capsid amino acid sequence shown by many studies, we found some degree of heterogeneity which revealed the circulation of several antigenic variants in the Tunisian HAV viral population. A previous study reported the circulation of two antigenic variants in Tunisia during 2003 [44]. One presented 38 amino acid deletions in the part of the immunodominant antigenic site of VP1 region and the second variant presented a replacement at the positionVP1-10. Our results suggest that a variety of mutants and antigenic variants circulates in Tunisia, despite the relatively low mutation rate reported for HAV, which seems to be related to the strict structural constraints of the viral capsid and a restricted codon usage [45,60].

Conclusions
The present study provides recent data on the molecular characteristics of circulating HAV strains in Tunisia. Genotype IA predominates, and several genetic and antigenic variants within this genotype co-circulate. Our study provides a baseline on the genetic variability of circulating strains before the introduction of the anti-HAV vaccines into the national immunization schedule. Strengthening the national hepatitis A infection surveillance system will be of great interest to monitor the impact of the vaccination program into HAV molecular epidemiology and the possible emergence of vaccine-escaping variants under selective pressures of this vaccination. Funding: This study was funded by the Ministry of Higher Education (Reasearch Laboratory LR20IPT02: "Virus, Vectors and Hosts: One Health approach and technological innovation for a better health") and the Clinical Investigation Center (CIC).

Institutional Review Board Statement:
Samples used in the present study were obtained from the routine diagnostic activity of the laboratory of clinical Virology in Pasteur Institute of Tunis.

Data Availability Statement:
The data that support the findings of this study are available from the corresponding author upon reasonable request.

Conflicts of Interest:
The authors declare no conflict of interest.