Molecular and Genetic Characterization of Hepatitis B Virus (HBV) among Saudi Chronically HBV-Infected Individuals

The study aimed to characterize the genotypes and subgenotypes of HBV circulating in Saudi Arabia, the presence of clinically relevant mutations possibly associated with resistance to antivirals or with immune escape, and the possible impact of mutations on the structural characteristics of HBV polymerase. Plasma samples from 12 Saudi Arabian HBV-infected patients were analyzed using an in-house PCR method and direct sequencing. The Saudi patients were mainly infected with subgenotype D1. A number of mutations in the RT gene (correlated with antiviral resistance) and within and outside the major hydrophilic region of the S gene (reported to influence immunogenicity and to be related to immune escape) were observed in almost all patients. Furthermore, the presence of mutations in the S region caused a change in the tertiary structure of the protein compared with the consensus sequence. Clinical manifestations of HBV infection may change dramatically as a result of viral and host factors: the study of mutations and protein-associated cofactors might define aspects relevant to the natural and therapeutic history of HBV infection.


Introduction
Human hepatitis B virus (HBV) is an enveloped DNA virus belonging to the Hepadnaviridae family [1] and is associated with a wide spectrum of clinical manifestations resulting in both acute and chronic liver infection [2]; in addition, chronic HBV hepatitis can be complicated by liver cirrhosis and hepatocellular carcinoma [3,4]. Despite the wide implementation of immunization programs, HBV infection remains a major public health problem, with approximately 296 million people chronically infected worldwide [5][6][7].
The incidence rate of hepatitis B in Saudi Arabia declined from 19.65 per 100,000 inhabitants in 2003 to 13.63 per 100,000 inhabitants in 2016 [8]. Furthermore, a recent cross-sectional study involving 74,662 participants reported a prevalence of hepatitis B infection of approximately 1.3% [8]. Nevertheless, hepatitis B continues to be a serious health problem in Saudi Arabia [9], mainly affecting older people with advanced forms of liver disease and several co-morbidities [10]. HBV, despite being a DNA virus, is characterized by high genetic heterogeneity because its replication requires a reverse transcriptase (RT) polymerase for the reverse transcription of an RNA intermediate called pregenomic RNA (pgRNA) [11]. Consequently, 10 HBV genotypes (A-J) and several subgenotypes have been identified to date [12]. The predominant genotype in the Eastern Mediterranean countries, including Saudi Arabia, is genotype D.
Additionally, HBV has a quasispecies distribution in infected individuals; mutations may occur in all of the genes and can be responsible for treatment resistance, immune escape, disease outcome, and carcinogenesis [13][14][15][16].
Mutations in the reverse transcriptase (RT) region of HBV polymerase can reduce susceptibility to antiviral drugs or restore replicative fitness [17]. In addition, HBV surface antigen (HBsAg) contains the major hydrophilic region (MHR), a dominant epitope crucial for binding to neutralizing antibodies. The presence of mutations in this region might change the hydrophilicity, electric charge, or acidity of the loop, with a number of pathobiological effects on the structure of HBV polymerase. Moreover, because the RT gene and the HBsAg gene overlap, mutations can also reduce the binding affinity for neutralizing antibodies, including those induced by the HBV vaccine [11].
This study aimed to assess the HBV genotypes and subgenotypes circulating in Saudi Arabia and to investigate the presence of clinically relevant mutations in the RT and S regions in order to verify the possible impact of mutations on the structural characteristics of HBV polymerase in Saudi infected patients.

Study Population
In this study, we included hepatitis-B-infected patients older than 18 years attending the Hepatology Clinic in the King Fahad Hospital of the University-Al-Khobar (Saudi Arabia) who were enrolled between January and March 2018. After signing informed consent, patients were included if they had been reactive for HBsAg for more than six months and had detectable plasma HBV DNA. We excluded patients with undetectable plasma HBV DNA, co-infection with HCV or HDV, autoimmune liver disease, primary biliary cirrhosis, hemochromatosis, alpha-1 antitrypsin deficiency, Wilson's disease, liver cirrhosis, or hepatic injury caused by drug use; pregnant women and nursing mothers were excluded as well.
The clinical and demographic data were obtained from the patients' medical records and included age, sex, and history of previous antiviral treatment. The standard clinical investigations included liver function tests (ALT, alanine aminotransferase; AST, aspartate aminotransferase; ALP, alkaline phosphatase; total bilirubin; albumin); platelet count; international normalized ratio (INR); and results of HBV serological markers (HBsAg, HBeAg, anti-HBs, anti-HBc, and anti-HBe) using the Abbott Architect assay (Abbott Diagnostics Division, Max-Planck-Ring 2, 65205 Wiesbaden, Germany).
All transient elastography (TE) procedures were performed by an experienced practitioner (M.I.) after the patients had fasted for at least 4 h. The measurements were made on the right lobe of the liver, as described by the manufacturer, and ten successful measurements were obtained for each patient. TE failure was recorded when no value was obtained after at least ten attempts. The results were considered unreliable if the number of valid measurements was fewer than 10, the success rate was <60%, or the interquartile range/median ratio was >30%.
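The TE quality criteria described above can be expressed as a small decision rule. The following Python sketch is illustrative only: the function name, the "incomplete" label for exams stopped before ten attempts, and the quartile convention used for the interquartile range are assumptions that are not specified in the text.

```python
from statistics import median, quantiles


def assess_te_exam(valid_kpa, total_attempts):
    """Classify one TE exam as 'failure', 'unreliable', or 'reliable'.

    valid_kpa: stiffness values (kPa) from successful acquisitions.
    total_attempts: all acquisitions attempted, valid or not.
    """
    n_valid = len(valid_kpa)
    if total_attempts < n_valid:
        raise ValueError("total_attempts cannot be smaller than the number of valid values")
    if n_valid == 0:
        # Failure as defined in the text: no value after at least ten attempts.
        # 'incomplete' is a hypothetical label for exams stopped earlier.
        return "failure" if total_attempts >= 10 else "incomplete"
    # Fewer than 10 valid measurements or a success rate below 60%.
    if n_valid < 10 or n_valid / total_attempts < 0.60:
        return "unreliable"
    # IQR/median above 30%; the quartile method is an assumption.
    q1, _, q3 = quantiles(valid_kpa, n=4)
    if (q3 - q1) / median(valid_kpa) > 0.30:
        return "unreliable"
    return "reliable"
```

For example, `assess_te_exam([6.0] * 10, 12)` passes all three criteria, whereas the same ten values obtained in 20 attempts would be flagged because the success rate falls to 50%.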

Plasma Samples
The plasma samples were stored at −80 °C until testing. Plasma samples were obtained from 26 Saudi Arabian HBV-infected patients. The real-time HBV PCR viral load was performed using the Artus® HBV RG PCR kit (QIAGEN GmbH, QIAGEN Strasse 1, 40724 Hilden, Germany). The lower detection limit was 10 IU/mL.

HBV Sequencing and Genotyping
HBV genotyping and detection of mutations in the polymerase gene were carried out using an in-house PCR method, as described elsewhere [18]. For each patient, full-length sequencing of HBV reverse transcriptase (HBV-RT) (344 amino acids) and the overlapping HBsAg (226 amino acids) was performed on plasma samples, as described previously [18,19].
Briefly, HBV DNA was extracted from 140 µL of plasma using a commercially available kit (QIAmp DNA blood mini kit, Qiagen Inc., 19300 Germantown Rd., Germantown, MD 20874, USA); the first round of PCR was carried out in a final volume of 50 µL containing Ampli Taq Gold polymerase enzyme and using the following primer pair: 5′-GGTCACCATATTCTTGGGAA-3′ and 5′-GTGGGGGTTGCGTCAGCAAA-3′. PCR conditions were: one cycle at 93 °C for 12 min, 40 cycles (94 °C for 50 s, 53 °C for 50 s, 72 °C for 1 min 30 s), and a final cycle at 72 °C for 10 min [18]. When the first amplification round provided negative results, a second round of PCR was performed using 5 µL of the first PCR product under the same conditions as the first round. The PCR products were electrophoresed on 2% agarose gel and stained with ethidium bromide [18].
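As a quick sanity check of the cycling protocol and primers described above, the following Python sketch encodes the thermal profile as data and computes the total programmed hold time and the GC fraction of each primer. The variable and function names are hypothetical, the primers are labelled only by their order of appearance in the text, and ramp times between temperatures are ignored, so the real run time would be longer.

```python
# Primers as given in the text (5' to 3'); labels are by order of appearance.
PRIMERS = {
    "primer_1": "GGTCACCATATTCTTGGGAA",
    "primer_2": "GTGGGGGTTGCGTCAGCAAA",
}

# Each stage: number of cycles and a list of (temperature in deg C, seconds).
PROGRAM = [
    {"cycles": 1, "steps": [(93, 12 * 60)]},
    {"cycles": 40, "steps": [(94, 50), (53, 50), (72, 90)]},
    {"cycles": 1, "steps": [(72, 10 * 60)]},
]


def total_hold_seconds(program):
    """Sum of programmed hold times; ramping between steps is not included."""
    return sum(
        stage["cycles"] * sum(seconds for _, seconds in stage["steps"])
        for stage in program
    )


def gc_fraction(primer):
    """Fraction of G and C bases in a primer sequence."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)


if __name__ == "__main__":
    print(f"Programmed hold time: {total_hold_seconds(PROGRAM) / 60:.1f} min")
    for name, seq in PRIMERS.items():
        print(f"{name}: {len(seq)} nt, GC = {gc_fraction(seq):.0%}")
```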
The lower detection limit of the described nested PCR in this study was estimated to be 20 copies/mL HBV DNA. PCR products were sequenced by using eight different overlapping sequence-specific primers with a Big Dye terminator v. 3.1 cycle sequencing kit (Applied Biosystems, Foster City, CA, USA) with the Sanger sequencing method, and an automated sequencer (ABI-3100), as reported elsewhere [18,19].
The sequences were analyzed using Seqscape v.2.0 software. Amino acid (aa) polymorphisms associated with drug resistance were identified using geno2pheno HBV. Genotyping was carried out by phylogenetic comparison of all patient sequences (RT and S genomic regions separately) with genotype reference sequences, as recommended by Schaefer [20], using one reference sequence per subgenotype, labelled in the trees as "GT-X.accession number", plus a non-human primate virus as outgroup (7 + 1 sequences). In addition, BLASTN was used to identify the closest previously published sequences in GenBank, which were added to the phylogeny; in some cases, the same GenBank sequence was identified as the closest match to more than one of our sequences. This resulted in the addition of 11 and 7 GenBank sequences to the S and RT trees, respectively. We used the SH-test in PhyML to assess reconstruction robustness. All GenBank sequences were subgenotyped as D1, in agreement with their original classification, indicating adequate subgenotyping robustness. Sequences were aligned using MAFFT V7 under the G-INS-1 algorithm [21], and phylogenetic trees were calculated using PhyML V3 under a GTR + I + G model with the best of NNI + SPR search [22] (Figures 1 and 2).
The online ExPASy ProtParam tool [23], available at http://expasy.org/tools/protparam.html (accessed on 2 September 2022), was used to compute, for both the S gene and the RT gene products, the molecular weight, theoretical isoelectric point (pI), extinction coefficient, aliphatic index, instability index, grand average of hydropathy (GRAVY), and the total number of positively and negatively charged residues [23].
The online I-TASSER server (I-TASSER Suite 5.2, Department of Computational Medicine and Bioinformatics, Department of Biological Chemistry, University of Michigan Medical School, 100 Washtenaw Avenue, Ann Arbor, MI 48109-2218, USA) was used for automated protein-structure prediction and structure-based function annotation [24][25][26].
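Some of the ProtParam parameters listed above are simple functions of the amino acid composition. The following Python sketch is an independent illustration, not the ProtParam implementation: it computes GRAVY as the mean Kyte-Doolittle hydropathy value, the aliphatic index as X(Ala) + 2.9 × X(Val) + 3.9 × (X(Ile) + X(Leu)) with X in mole percent, and the counts of charged residues (Arg + Lys and Asp + Glu). The function names are hypothetical, and the sequences used in the usage note are toy strings rather than HBV sequences.

```python
# Kyte-Doolittle hydropathy scale.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq):
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue."""
    seq = seq.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(seq):
    """Aliphatic index from mole percentages of Ala, Val, Ile, and Leu."""
    seq = seq.upper()
    pct = {aa: 100.0 * seq.count(aa) / len(seq) for aa in "AVIL"}
    return pct["A"] + 2.9 * pct["V"] + 3.9 * (pct["I"] + pct["L"])


def charged_residues(seq):
    """Return (positive, negative) counts: (Arg + Lys, Asp + Glu)."""
    seq = seq.upper()
    return seq.count("R") + seq.count("K"), seq.count("D") + seq.count("E")
```

For instance, `gravy("AR")` averages the values of alanine and arginine, and `aliphatic_index("AVIL")` weights each of the four aliphatic residues at 25 mole percent. Sequences containing ambiguous residues (for example X) would raise a `KeyError` in `gravy`, which is a deliberate simplification of this sketch.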

Results
Demographic, virological, and clinical characteristics of the twenty-six HBV-infected Saudi Arabian patients enrolled in the study are shown in Table 1. Patients were predominantly male (17 male and 9 female), with a mean age of 45.2 years (range 31-57). All patients were anti-HBe positive, with HBV DNA levels ranging from 1 × 10^3 to 6 × 10^6 IU/mL. Liver fibrosis was generally mild (F1-F2), and only three patients were treated with antivirals (tenofovir disoproxil fumarate) (Table 1). Although all tested patients had detectable HBV DNA in plasma at the time of enrollment, plasma samples from only 12 of the patients were suitable for HBV DNA extraction and amplification for molecular investigation. For the remaining samples, the absence of amplifiable HBV DNA was confirmed even after the second round of PCR amplification.
The phylogenetic tree analysis of the HBV polymerase gene showed that all 12 subjects were infected with HBV genotype D. In particular, eleven isolates belonged to subgenotype D1 (92%) and one to subgenotype D2 (8%). No deletions or insertions were detected in the polymerase region. However, several amino acid (aa) substitutions were observed in the RT region of HBV polymerase and in the S gene (Tables 2 and 3). In the RT gene, the Y135S substitution was observed in 11/12 HBV strains, followed by the N248H substitution, found in 10/12 HBV isolates regardless of the HBV subgenotype.
The change of serine to threonine at position 213 was identified in two HBV isolates (Table 2). In only one HBV isolate (patient 772) did we identify a mutation at position 181 (A181G), a position known to confer resistance to adefovir and tenofovir as well as telbivudine and lamivudine [27][28][29].
Additionally, this strain also showed a mutation at position 233 (I233M), a position known to be associated with resistance to adefovir, although the reported resistance involves a change to valine [30]. A mutation at position 215, associated with resistance to adefovir, was observed in 5/12 HBV strains: glutamine was replaced by serine in two strains, by histidine in two, and by proline in one. Interestingly, all patients were naïve to the treatment. Other substitutions were also seen at positions 54, 122, 266, 329, 336, and 337 (Table 2).
The presence of aa substitutions at different positions in the surface region was observed in all HBV isolates. At position 207, different aa changes were found in 6/12 HBV strains; the substitutions T118A and L209V were observed in two different HBV isolates (Table 3). Mutations in the second loop of the "a" determinant, associated with escape from vaccine-induced immunity, were observed in several HBV isolates (Table 3). A mutation at position 129 (Q129H) was observed in three isolates, and the additional mutations T131N + M133T were found in the HBV_800 strain, whereas the HBV_272 strain showed further mutations at positions 109, 120, 126, and 131. The HBV_738 isolate displayed a mutation at position 144 (D144E).
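Substitution labels such as Y135S are obtained by comparing each aligned isolate sequence with a reference, position by position. The Python sketch below illustrates this bookkeeping under simplifying assumptions: the sequences are already aligned without gaps, ambiguous calls (X or -) are skipped, and the function names and the three-residue fragments in the usage note are hypothetical toy data, not the study's sequences or the geno2pheno method.

```python
from collections import Counter


def call_substitutions(reference, sample, first_position=1):
    """List substitutions such as 'Y135S' between two aligned aa sequences."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    calls = []
    for pos, (ref_aa, obs_aa) in enumerate(zip(reference, sample), start=first_position):
        # Skip identical residues and ambiguous or missing calls.
        if obs_aa != ref_aa and obs_aa not in ("X", "-"):
            calls.append(f"{ref_aa}{pos}{obs_aa}")
    return calls


def substitution_frequencies(calls_per_isolate):
    """Map each substitution to (isolates carrying it, total isolates)."""
    counts = Counter(
        sub for calls in calls_per_isolate.values() for sub in set(calls)
    )
    total = len(calls_per_isolate)
    return {sub: (n, total) for sub, n in counts.items()}
```

With a toy reference fragment `"AYC"` starting at position 134 and a toy isolate fragment `"ASC"`, `call_substitutions("AYC", "ASC", first_position=134)` returns `["Y135S"]`; applying `substitution_frequencies` to the calls of all isolates yields counts in the "n out of N" form used in the text.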
The presence of polymorphisms in the region of the S protein outside the MHR (aa 78-aa 99) associated with antigenicity prediction was observed in 4/12 (33%) of HBV Saudi strains (001; 800; 170; 389).
The physicochemical properties of both the polymerase and the S proteins of the HBV isolates, computed with the online ExPASy ProtParam tool (SIB Swiss Institute of Bioinformatics, Quartier Sorge-Batiment Amphipole, 1015 Lausanne, Switzerland) [23], are reported in Tables 4 and 5. The theoretical isoelectric point (pI) of the polymerase protein was either acidic or neutral in all samples, ranging from 5.11 to 7.39 (Table 4). The GRAVY score was similar for all polymerase proteins except one: 11/12 proteins had a GRAVY score above 0, whereas in one HBV isolate the GRAVY score was below 0 (−0.034), which indicates a hydrophilic (globular) protein. The polymerase sequences, with instability indices above 44.55, were classified as unstable proteins.
Table 4. Molecular weight, theoretical isoelectric point (pI), extinction coefficient, aliphatic index, instability index, grand average of hydropathy (GRAVY), and total number of positive and negative residues of RT of the polymerase region.

For the S gene, our sequences were identified as alkaline proteins in 11/12 HBV isolates, with a pI value above 8.2. However, for the HBV_272 isolate, the pI was not computed because the sequence had multiple polymorphisms. The GRAVY index was greater than 0 in all S proteins, suggesting hydrophobic, most probably membranous, proteins. The instability indices, ranging from 51.98 to 64.12, identified all S proteins as unstable (Table 5).
Tertiary structures of HBsAg from the HBV consensus and from HBV_800 (an isolate that showed several mutations) were predicted by I-TASSER and further by DeepFold models (Figures 3-5).
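The interpretation of GRAVY and instability index values used above follows simple cut-offs. The Python sketch below makes them explicit; it assumes the convention documented for ProtParam that an instability index above 40 predicts an unstable protein, and it treats a GRAVY score above 0 as hydrophobic and any other value as hydrophilic, as in the text. The function name and return labels are hypothetical.

```python
def classify_protein(gravy_score, instability_index):
    """Return (hydropathy class, stability class) from ProtParam-style values."""
    # GRAVY above 0 is read as hydrophobic, otherwise hydrophilic.
    hydropathy = "hydrophobic" if gravy_score > 0 else "hydrophilic"
    # Assumed ProtParam convention: instability index above 40 means unstable.
    stability = "unstable" if instability_index > 40 else "stable"
    return hydropathy, stability
```

Under these cut-offs, a protein with a GRAVY score of −0.034 is labelled hydrophilic, and any instability index in the 51.98 to 64.12 range reported for the S proteins is labelled unstable.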
Our results based on DeepFold models showed that the two proteins appeared to have completely different folds (TM-score = 0.38). Both models have an N-terminal helix region, a C-terminal helix region, and a central beta-like core region; however, the relative orientations of the two helix regions differ markedly between the two models, mainly because the mutation in the core region changes the orientation of the N-terminal and C-terminal regions (Figure 3).
[Figure: … 701347 (HBV_800 isolate) tertiary structure predicted using the I-TASSER online software. Pink, alpha-helix; blue, coil; white, extended strand.]
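For context on the reported TM-score of 0.38: the TM-score sums, over aligned residue pairs, the term 1 / (1 + (d_i / d0)^2), where d_i is the distance between the paired residues after superposition and d0 = 1.24 × (L − 15)^(1/3) − 1.8 depends on the target length L; the sum is divided by L. Values below about 0.5 are commonly read as indicating different folds. The Python sketch below evaluates this formula for one given superposition. It is a simplified illustration with a hypothetical function name: the real TM-score is the maximum over superpositions, and the short-chain handling of d0 is replaced here by a length check.

```python
def tm_score_for_superposition(distances, target_length):
    """TM-score term sum for one fixed superposition.

    distances: distances (in angstroms) between aligned residue pairs.
    target_length: number of residues in the target (reference) protein.
    """
    if target_length < 20:
        # Simplification: the d0 formula is not meaningful for very short chains.
        raise ValueError("target_length must be at least 20 in this sketch")
    if len(distances) > target_length:
        raise ValueError("more aligned pairs than target residues")
    d0 = 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length
```

For a 226-residue protein such as HBsAg, a perfect superposition (all distances equal to 0) gives a score of 1, and the score decreases as aligned residues move apart or remain unaligned.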

Discussion
Although HBV is a DNA virus, it is characterized by great variability, with different genotypes and subgenotypes as well as intraindividual variability, probably influencing several aspects of the infection, including prognosis, evolution, response/resistance to antivirals, and sensitivity to natural or vaccine-induced neutralizing antibodies. In this study, we characterized the genotypes and subgenotypes of HBV circulating in Saudi Arabia, the presence of clinically relevant mutations possibly associated with resistance to antivirals or with immune escape, and the possible impact of mutations on the structural characteristics of HBV polymerase.
Phylogenetic analysis showed that all Saudi subjects were infected with HBV genotype D, in line with other molecular studies in Saudi Arabia [31][32][33][34]. This finding corresponds to the epidemiological profile observed in neighboring countries [35,36]. Indeed, the D1 subgenotype is the dominant subgenotype in the Mediterranean area [32], whereas the D2 subgenotype has been found in different geographic areas such as Europe, India, Australia, and Africa [37][38][39][40].
In our study, 11 of the HBV isolates studied belonged to subgenotype D1 (92%) and one to subgenotype D2 (8%). These results support the subgenotype profile already presented in a recent study using a limited number of isolates from Saudi Arabia [34].
In this paper, different mutations in the RT region of the polymerase peptide as well as in the surface region were observed in all of the HBV Saudi strains.
Two different mutations, Y135S and N248H, in the reverse transcriptase region of the polymerase peptide were identified among the HBV strains infecting the Saudi population. Tuteja and collaborators reported both mutations in HBV genotype D strains circulating in the infected Indian population, with frequencies of 67 vs. 59, respectively [29]. We found a mutation at position 248 in RT in 83% of Saudi HBV isolates, independently of the D subgenotype. This is consistent with a study by Chavan and collaborators, in which this mutation was described as the most common genotypic variant in their West Indian population [41].
The substitution of serine with threonine at position 213 (S213T) in the RT region was detected in naïve patients with subgenotype D1. The S213T substitution in the RT region of the polymerase gene was reported to be associated with HBV A2, B, and C genotypes in previous studies [42,43]; moreover, Zhang and colleagues [43] classified S213T as an unconventional mutation because it was found in patients with virological breakthrough who had been treated with ADV, ETV, and LMV.
The 772_HBV isolate, from an infected individual naïve to antiviral treatments, exhibited an aa mutation at position 181 (A181G), known to confer resistance to adefovir, tenofovir, telbivudine, and lamivudine [29,44]. Another mutation also associated with adefovir resistance was observed at position 233 (I233M), though this resistance includes valine as an aa modification.
Five out of twelve HBV isolates showed a mutation at position 215 (Q215S, Q215H, or Q215P), a position associated with adefovir resistance, although all these patients were naïve to the treatment.
Sequence analysis of the S gene showed aa substitutions in various positions, including those associated with hepatitis B surface antigen escape [45,46] in several HBV isolates.
Mutations in the "a" determinant of the surface protein were further observed in 4/12 HBV isolates from Saudi patients; in previous reports [45][46][47], the presence of mutations in this region correlated with the absence of HBsAg in samples, though in our case we could not confirm this finding.
However, whether different mutations or combinations of mutations are involved in the absence or presence of HBsAg is still unclear.
In our study, polymorphisms in the region of the S protein outside the MHR (aa 78-aa 99) were observed in 4/12 (33%) of Saudi HBV strains (001; 800; 170; 389). This region is known to influence the immunogenicity and antigenicity of HBsAg, as reported by Khodadad [47].
The physicochemical properties of the HBs protein confirmed the data published by Khodadad [47], showing that all the S proteins were basic, with a pI higher than 8; furthermore, the instability indices, ranging from 51.98 to 64.12, identified the S proteins as unstable, and the proteins were classified as hydrophobic, with a GRAVY score above 0.
It is known that the clinical manifestations of HBV infection may change radically due to either viral or host factors; therefore, it may be of great importance to study, understand, and define these potentially important co-factors.
One limitation of this study was that not all of our HBV samples could be amplified and sequenced. A possible explanation is that sample storage affected DNA extraction/amplification through degradation of viral DNA.
In conclusion, we characterized in this study the HBV strains circulating in Saudi Arabia, both from the molecular and physicochemical point of view, with the aim of providing new information needed to better clarify the potential outcomes relevant to the natural and therapeutic history of HBV infection.