Molecular Characterization of Near Full-Length Genomes of Hepatitis B Virus Isolated from Predominantly HIV Infected Individuals in Botswana

The World Health Organization plans to eliminate hepatitis B and C infections by 2030. Therefore, there is a need to study and understand hepatitis B virus (HBV) epidemiology and viral evolution further, including evaluating occult (HBsAg-negative) HBV infection (OBI), given that such infections are frequently undiagnosed and rarely treated. We aimed to molecularly characterize HBV genomes from 108 individuals co-infected with human immunodeficiency virus (HIV) and chronic hepatitis B (CHB) or OBI identified from previous HIV studies conducted in Botswana from 2009 to 2012. Full-length (3.2 kb) and nearly full-length (~3 kb) genomes were amplified by nested polymerase chain reaction (PCR). Sequences from OBI participants were compared to sequences from CHB participants and GenBank references to identify OBI-unique mutations. HBV genomes from 50 (25 CHB and 25 OBI) individuals were successfully genotyped. Among OBI participants, subgenotype A1 was identified in 12 (48%), D3 in 12 (48%), and E in 1 (4%). A similar genotype distribution was observed in CHB participants. Whole HBV genome sequences from Botswana, representing OBI and CHB, were compared for the first time. There were 43 OBI-unique mutations, of which 26 were novel. Future studies using larger sample sizes and functional analysis of OBI-unique mutations are warranted.


Introduction
Hepatitis B virus (HBV) remains a major global health problem, with approximately 257 million people chronically infected [1]. Viral hepatitis is now the seventh leading cause of death worldwide, with a 63% mortality increase to 1.45 million from 1990 to 2013 [2]. HBV accounts for most viral hepatitis-associated deaths [2]. It is endemic in Africa and the Western Pacific, where about 6% of adults are chronically infected [3]. In Botswana, the prevalence of HBV is between 3.8% and 10.6% in human immunodeficiency virus (HIV)-infected individuals [4][5][6][7].
HBV is a DNA virus of the family Hepadnaviridae with a partially double-stranded 3.2 kb genome arranged in four partially overlapping open reading frames (ORFs) [8]. The polymerase (Pol) ORF encodes the polymerase enzyme, which is responsible for DNA priming and reverse transcription during replication [9]. The pre-surface1/pre-surface2/surface (preS1/preS2/S) ORF codes for the large, middle, and small hepatitis B surface antigen (HBsAg) proteins, respectively, which are used to diagnose HBV, contain several B and T cell epitopes, and are utilized for attaching to hepatocytes [10]. The precore/core (preC/C) ORF encodes hepatitis B e antigen (HBeAg), an indicator of viral replication and an immune regulator [9], and hepatitis B core antigen (HBcAg), the capsid protein. The X ORF codes for the X trans-activator protein [11].
Hepadnaviruses are unique in that they replicate via a pregenomic RNA intermediate, which is reverse transcribed by the polymerase, an enzyme with no proof-reading capabilities [12]. This leads to nucleotide misincorporations during replication and accounts for the significant HBV heterogeneity observed within individuals [13,14]. To date, HBV has been classified into at least nine genotypes (A-I) and a putative 10th genotype J according to nucleotide divergence of >7.5% at whole genome level [15][16][17][18]. Several genotypes have been divided further into subgenotypes based on intergroup nucleotide divergence of 4-8% [18].
Routine screening for HBV is performed by detection of HBsAg. However, this approach misses occult hepatitis B infections (OBI), defined as the presence of HBV DNA in HBsAg-negative participants [19]. True OBI is characterized by low HBV DNA levels (<200 IU/mL) [19]. OBI is an important clinical entity that poses diagnostic challenges and is transmissible via blood donations, mother-to-child transmission, close contact with infected individuals, sexual transmission, and organ transplantation [20][21][22][23][24][25][26]. In addition, end-stage clinical complications associated with OBI include cirrhosis and hepatocellular carcinoma (HCC) [27][28][29]. The risk associated with an HBsAg-negative, HBV DNA-positive infection is not negligible. In a recently completed study in South Africa, it was shown that the risk of an HBsAg-negative, HBV DNA-positive individual (with or without anti-HBc) developing HCC was 5.10 (2.06-12.62) compared to a risk of 34.48 (16.26-73.13) in HBsAg-positive individuals (with or without anti-HBc), adjusted for age group, sex, HCV serostatus, country of birth, and HIV status [30].
There is a call by the World Health Organization to eliminate HBV and HCV infections as a public health problem by 2030 [45]. To achieve this, robust sequence data of HBV isolated from chronic hepatitis B (CHB) and OBI are necessary. However, very few studies have evaluated full-length OBI genomes owing to the low viremia that is characteristic of OBI [35]. Furthermore, some studies did not have controls such as CHB participants from the same population to accurately identify occult-associated mutations [46,47]. Thus, the objective of this study was to conduct robust molecular characterization of nearly full-length HBV genomes from individuals with CHB and OBI in Botswana.

Population
This was a cross-sectional study comprising 108 known HBV-infected individuals (both CHB and OBI) from previous studies conducted at the Botswana Harvard AIDS Institute Partnership (BHP) in Gaborone. All available HBV-positive samples from various cohorts were utilized to maximize the number of genomes available for analysis. One hundred participants (72 with OBI and 28 with CHB) were included from the Botswana National Evaluation Models of HIV Care (Bomolemo) study, which enrolled HIV-positive individuals initiating highly active antiretroviral therapy (HAART) between 2009 and 2012. The Bomolemo study evaluated the efficacy and tolerability of tenofovir and emtricitabine (Truvada™) as the nucleoside reverse transcriptase inhibitor (NRTI) backbone of first-line HAART for adults in Botswana. The HBV screening for the Bomolemo cohort was previously described in detail [6,48]. An additional eight CHB participants (four HIV-positive and four HIV-negative pregnant women) were included from a study comparing the effects of HIV and ARV exposure on child health and neurodevelopment (Tshipidi study) [49]. HBV screening for participants from this cohort was previously described [50].

Ethical Considerations
The study was approved by the University of Botswana Institutional Review Board and the Human Research Development Committee at the Botswana Ministry of Health. The Office of Human Research Administration at the Harvard T.H. Chan School of Public Health approved the Tshipidi and Bomolemo studies. Ethics permit number: PPME 13/1811V (318).

Extraction of Plasma DNA
DNA was extracted from 1 mL of plasma using the QIAamp UltraSens Virus Kit according to the manufacturer's protocol (Qiagen, Hilden, Germany). An elution volume of 30 µL was used. The extracted DNA was either used immediately for polymerase chain reaction (PCR) or stored at −70 °C until used.

Amplification and Sequencing
Full-length genomes were amplified in two fragments. A 3 kb fragment was amplified by nested PCR using the PrimeSTAR® GXL DNA Polymerase kit (Takara Bio Inc., Shiga, Japan) with some modifications. The master mix was composed of 5 µL of 5X PrimeSTAR GXL Buffer containing 5 mM Mg2+, 0.5 µL of PrimeSTAR GXL DNA Polymerase (1.25 U/µL), 0.5 µL of 5 µM P1 primer, 0.5 µL of 5 µM P2 primer, 1.0 µL of DNA extract, and 17.5 µL of dH2O to make up a 25 µL reaction. The primers used were P1 and P2 for the first round, whereas the second-round primers were P3WRS and P4WRS [51,52], Table 1. The first-round cycling conditions were 35 cycles of denaturation at 98 °C for 10 s, annealing at 50 °C for 15 s, and extension at 68 °C for 9 min. The second-round cycling conditions were similar to those of the first round except that annealing was at 55 °C for 15 s and extension was at 68 °C for 4 min. The remaining 276 base-pair (bp) region of the preC/C was amplified using the One-Step SuperScript III (Invitrogen, Waltham, MA, USA) protocol with minor modifications. A semi-nested PCR was performed using KU1 and MA3 as first-round primers and MA3 and KU2 as second-round primers [53,54], Table 1. The 3 kb and 276 bp amplicons were visualized in 1% and 2% ethidium bromide-stained gels, respectively. PCR products were purified using QIAquick PCR Purification (Qiagen, Hilden, Germany), and sequencing clean-up was done using the ZR DNA Sequencing Clean-up Kit (Zymo, Irvine, CA, USA) according to the manufacturer's protocol. Direct sequencing was then performed on an ABI 3130xl genetic analyzer (Applied Biosystems, Foster City, CA, USA) using BigDye sequencing chemistry. The primers used for sequencing are shown in bold in Table 1. Sequences were submitted to National Center for Biotechnology Information (NCBI) GenBank under the accession numbers MH464807 to MH464856.
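The first-round master mix described above can be checked and scaled with a few lines of arithmetic. The following is a minimal sketch, not the authors' laboratory procedure: the per-reaction volumes are those listed in the text, while the function name `master_mix` and the 10% pipetting overage are assumptions introduced for illustration.

```python
# Per-reaction first-round volumes (µL) as listed in the text.
# The DNA extract is kept separate because it is added to each tube individually.
PER_REACTION_UL = {
    "5X PrimeSTAR GXL Buffer": 5.0,
    "PrimeSTAR GXL DNA Polymerase": 0.5,
    "P1 primer (5 µM)": 0.5,
    "P2 primer (5 µM)": 0.5,
    "dH2O": 17.5,
}
TEMPLATE_UL = 1.0


def master_mix(n_reactions: int, overage: float = 0.10) -> dict:
    """Scale the template-free components for n reactions.

    The 10% overage is an assumed allowance for pipetting loss,
    not a value given in the text.
    """
    factor = n_reactions * (1.0 + overage)
    return {name: round(vol * factor, 2) for name, vol in PER_REACTION_UL.items()}


# The listed components should add up to the stated 25 µL reaction.
total_per_reaction = sum(PER_REACTION_UL.values()) + TEMPLATE_UL

# Final primer concentration (µM): stock concentration x volume / reaction volume.
final_primer_um = 5.0 * 0.5 / total_per_reaction

mix_for_10 = master_mix(10)
```

The listed volumes sum to the stated 25 µL, and each primer ends up at a final concentration of 0.1 µM in the reaction.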

Phylogenetic Analysis
Phylogenetic trees were constructed utilizing a Bayesian Markov chain Monte Carlo (MCMC) approach in the Bayesian Evolutionary Analysis by Sampling Trees (BEAST) v1.8.2 (BEAST Developers) [57] program with a chain length of 100,000,000 and sampling every 10,000 generations. The analysis utilized an uncorrelated log-normal relaxed molecular clock, the Hasegawa, Kishino, and Yano (HKY) model, and the general time reversible model with gamma distributed rates of variation among sites and a proportion of invariable sites (GTR+G+I). Tracer v1.7 (BEAST Developers) [57] was used to visualize results and confirm chain convergence. Every parameter had an effective sample size (ESS) > 500, implying sufficient sampling. TreeAnnotator v1.7.3 (BEAST Developers) [57] was utilized to choose the maximum clade credibility tree after a 10% burn-in. Posterior probabilities > 90% were deemed statistically significant. Trees for subgenotype A1 and D3 sequences from this study and the respective GenBank references for the whole surface region were constructed. The S ORF was used to determine the clustering of Botswana strains relative to other African HBV sequences, because there are very few whole genome HBV sequences from Africa except for South Africa.
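The MCMC settings above imply how many sampled states underlie the maximum clade credibility tree. The arithmetic below is a small sketch under the stated chain length, sampling interval, and burn-in; the variable names are illustrative, and the count ignores the initial state that a sampler may also log.

```python
# MCMC settings as stated in the text.
chain_length = 100_000_000
sample_every = 10_000
burn_in_percent = 10

# Number of sampled states over the chain (initial state not counted).
n_sampled = chain_length // sample_every

# States discarded as burn-in and states retained for tree summarization.
n_burn_in = n_sampled * burn_in_percent // 100
n_retained = n_sampled - n_burn_in
```

Under these settings, 10,000 states are sampled, 1,000 are discarded as burn-in, and 9,000 remain for summarizing the tree.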

HBV Genotype Recombination Analysis
To evaluate potential recombination, study sequences were compared to GenBank references A-H utilizing Simplot software v3.5.1 (Ray, S.C., Baltimore, MD, USA) with a step size of 20 bp and a window size of 200 bp [58]. The neighbor-joining (NJ) method was used to conduct a bootscan analysis utilizing 1000 bootstrap replicates and an 80% threshold.
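The window and step sizes determine which stretches of the alignment are compared in turn. The sketch below only illustrates the sliding-window idea with a simple per-window identity; it is not Simplot's implementation and does not perform bootscanning. The function names, the linear (non-circular) treatment of the genome, and the 3200-column alignment length are assumptions.

```python
import math


def sliding_windows(alignment_length: int, window: int = 200, step: int = 20):
    """Yield (start, end) half-open coordinates of every full window."""
    for start in range(0, alignment_length - window + 1, step):
        yield start, start + window


def window_identity(seq_a: str, seq_b: str, start: int, end: int) -> float:
    """Fraction of identical sites in one window, skipping gapped columns."""
    pairs = [
        (a, b)
        for a, b in zip(seq_a[start:end], seq_b[start:end])
        if a != "-" and b != "-"
    ]
    if not pairs:
        return math.nan
    return sum(a == b for a, b in pairs) / len(pairs)


# Windows over a hypothetical 3200-column alignment.
windows = list(sliding_windows(3200))
```

Plotting `window_identity` of a query against each genotype reference across these windows gives a similarity profile; an abrupt change in the closest reference along the genome is what would prompt a closer look for recombination.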

Analysis of Immune Selection Pressure, Signature Amino Acids, and Escape Mutations
Datamonkey was used to approximate the rates of nonsynonymous (dN) and synonymous (dS) substitutions using fixed effects likelihood (FEL) [59]. For this analysis, codons for each ORF (S, preS1, preS2, Pol, X, core) were analyzed individually according to HBV status (either CHB or OBI). Codon positions were numbered from the beginning of each open reading frame. The preC region could not be amplified from several OBI participants; hence, it was excluded from this analysis. The viral epidemiology signature pattern analysis (VESPA) was utilized to search for signature amino acids (aa) in HBV sequences isolated from OBI participants compared to those isolated from CHB participants for each ORF based on subgenotype [60]. Escape mutations were identified using the online tool Geno2pheno available at https://hbv.geno2pheno.org/. Escape mutations previously reported in the literature were also searched for manually in BioEdit.

Mutational Analysis
Genomes of HBV were aligned with GenBank references using ClustalX v2.1 (Higgins D., Sievers F., Dineen D., Wilm A, Dublin, Ireland). The subgenotype A1 sequences were aligned with 107 full-length subgenotype A1 references, while the subgenotype D3 sequences were aligned with 85 full-length subgenotype D3 references. The references used were HBV sequences isolated from CHB participants, which were extracted from online curated GenBank alignments [61]. All subgenotype A1 and D3 whole genome sequences available at that time (January 2017) were included. The sequences were then trimmed to the same length in BioEdit v7.2.5 (Hall T., Carlsbad, CA, USA). Subsequently, Babylon Translator was utilized to extract each ORF (PreS1, PreS2, S, X, PreC, core, Pol), and the sequences were translated into aa [62]. To identify potential OBI-unique mutations, OBI sequences were first compared with sequences from CHB participants of the respective subgenotype and then with the GenBank reference sequences. To avoid polymorphisms that may represent subgenotype differences, sequences were compared at a subgenotype level [63,64]. Mutations that were unique to HBV isolated from OBI participants without appearing in any sequences from CHB patients or references from CHB were classified as occult-unique mutations [41,47,65-69].
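The comparison described above amounts to a per-position set difference: a residue counts as occult-unique only if no CHB or reference sequence of the same subgenotype carries it at that position. The following is a minimal sketch of that logic on aligned amino acid sequences, not the authors' pipeline; the function names, the `consensus` argument used to label the wild-type residue, and the toy sequences are all hypothetical.

```python
from collections import defaultdict


def residues_by_position(sequences):
    """Map each 1-based alignment position to the set of residues seen there."""
    seen = defaultdict(set)
    for seq in sequences:
        for pos, aa in enumerate(seq, start=1):
            seen[pos].add(aa)
    return seen


def obi_unique_mutations(obi_seqs, chb_seqs, reference_seqs, consensus):
    """Return mutations found in OBI sequences but in no CHB or reference sequence.

    All sequences are assumed to be aligned, translated, and of one subgenotype.
    Gaps ('-') and ambiguous residues ('X') are skipped.
    """
    background = residues_by_position(list(chb_seqs) + list(reference_seqs))
    unique = set()
    for seq in obi_seqs:
        for pos, aa in enumerate(seq, start=1):
            if aa in ("-", "X"):
                continue
            if aa not in background[pos]:
                unique.add(f"{consensus[pos - 1]}{pos}{aa}")
    return sorted(unique)


# Toy example: position 2 carries A in one OBI sequence only,
# whereas E at position 2 also occurs in a reference and is therefore not unique.
toy_result = obi_unique_mutations(
    obi_seqs=["MAIDP", "MEIDP"],
    chb_seqs=["MDIDP", "MDIDP"],
    reference_seqs=["MDIDP", "MEIDP"],
    consensus="MDIDP",
)
```

Restricting the background to sequences of the same subgenotype, as the text does, keeps subgenotype-specific polymorphisms from being reported as OBI-unique.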

Results
There were no statistically significant differences at baseline in terms of CD4+ T cell count, HIV viral load, liver enzymes, Fibrosis 4 (a noninvasive measure of liver scarring), or other clinical parameters between the CHB, OBI, and HBV-negative participants, as reported in detail elsewhere [48,50]. The HBV viral loads were low in the OBI group, with a median of 57.4 copies per mL versus 31,600 copies per mL in the CHB group, as reported elsewhere [48]. HBV antigen (HBsAg and HBeAg) and HBV antibody (anti-HBc and anti-HBs) results have been reported in detail elsewhere [48,50].
Of the 108 participants, HBV genomes from 50 (46.3%) individuals were successfully genotyped, including 25 of 36 (69.4%) from CHB and 25 of 72 (34.7%) from OBI. The low amplification rate likely reflects difficulties in amplifying longer fragments from individuals with low HBV levels, such as occurs during OBI [64]. Twenty-seven samples were successfully genotyped from the whole 3.2 kb HBV genome (18 CHB and nine OBI) and 23 from the 3 kb fragment (seven CHB and 16 OBI). The circulating genotypes were 24 A1 (48%), 24 D3 (48%), and two E (4%), Supplementary Figure S1. For OBI participants, subgenotype A1 was found in 12 (48%) individuals, subgenotype D3 in 12 (48%), and genotype E in one (4%), with the same proportions found in CHB participants. OBI and CHB sequences clustered together (Figures 1 and 2). Three serotypes were found in this study: adw2 was found in all genotype A isolates; ayw2 was found in all genotype D isolates except one, which was ay (could not be serotyped further); and ayw4 was found in the genotype E isolate.

Results for Subgenotypes A1 and D3
Most sequences from this study grouped with the southern African A1, as shown in Figure 1. Only one sequence (MA94), from an HBsAg-positive participant, clustered with the Asian A1s. Mutations unique to MA94 were preS2 A7T, preS2 V17I, and preS2 T31I in the preS2 region and preS1 A90V in the preS1 region. There was no separate clustering of HBV sequences from HBsAg-negative and HBsAg-positive participants. Most sequences from Botswana had the subgenotype A1-unique aa in the preS1 region (Q54, V74, A86, and V91) and in the preS2 region (L32) [70,71], except for three sequences with mutations at position 86 (two had preS1 A86T and one had preS1 A86V). When comparing the sequences in the preS1 ORF (positions 5, 6, and 25) and preS2 (position 48), all Botswana sequences had 5S except MA101 and MA87, which were 5P. At position 6, all were 6A except MA94, MA95, and MA85, which were 6S, as well as MA89, which was 6T. At position 25, all were 25F except MA95 and MA85, which were 25L. In the preS2 at position 48, all study sequences had 48R except MA95 and MA85, which had 48T. Moreover, in the preS2 region at position 38, the majority of Botswana sequences had 38T except eight (MA94, MA101, MA89, MA87, MA98, MA95, MA86, and MA85), which had 38I. Additionally, all study sequences clustering with Zimbabwean HBV sequences had 38T. The Botswana genotype D3s clustered with South African D3s, Figure 2.

Nucleotide Divergence
Nucleotide pairwise distances for each ORF were calculated using Molecular Evolutionary Genetics Analysis (MEGA) version 7.0.26, Table 2 [72]. There was a statistically significant difference (p < 0.05) in the median nucleotide pairwise distances between CHB and OBI participants for most ORFs except in the preS2 (for both subgenotypes) and the X ORF (subgenotype D3), Table 2.
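To make the quantity being compared concrete, the sketch below computes a median pairwise distance within one group of aligned sequences. It uses the uncorrected p-distance (proportion of differing sites), which is an assumption: the text does not state which distance model was selected in MEGA, and the function names and toy sequences are illustrative only.

```python
from itertools import combinations
from statistics import median
import math


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences, ignoring gaps."""
    compared = 0
    differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differing += 1
    return differing / compared if compared else math.nan


def median_pairwise_distance(seqs) -> float:
    """Median p-distance over all unordered pairs of sequences in one group."""
    return median(p_distance(a, b) for a, b in combinations(seqs, 2))


# Toy group of three aligned sequences: pairwise distances are 0.25, 0.0, 0.25.
toy_median = median_pairwise_distance(["ACGT", "ACGA", "ACGT"])
```

Computing this value separately for the CHB and OBI sequences of each ORF and subgenotype yields the two medians that are then compared statistically.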

HBV Escape Mutations in the S ORF of OBI Versus CHB Sequences
Ten escape mutations associated with vaccine, diagnostic, or immunoglobulin therapy failure were identified in the S ORF. These include four mutations in OBI sequences (sK122R, sC124Y, sQ129H, and sK160N) and seven in CHB sequences (sP120S, sM103I, sQ129R, sG130N, sT140I, sG145R, and sK160N). One escape mutation, sK160N, occurred in both groups. Each mutation occurred in one of 25 (4%) sequences, Table 3.
Table 3 (only one row is recoverable). Mutation: sG145R; group: CHB; reported effect: vaccine, detection, and immunoglobulin therapy escape, decreased antigenicity and viral secretion [75]. Abbreviations: CHB: chronic hepatitis B; OBI: occult hepatitis B infection; S: surface.
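The counts in this paragraph follow from simple set arithmetic: four mutations in OBI sequences and seven in CHB sequences, with one shared, give ten distinct mutations. The short sketch below checks this using the mutation names listed in the text, written in the sX###Y style used elsewhere in the article; the variable names are illustrative.

```python
# Escape mutations listed in the text, by group.
obi_escape = {"sK122R", "sC124Y", "sQ129H", "sK160N"}
chb_escape = {"sP120S", "sM103I", "sQ129R", "sG130N", "sT140I", "sG145R", "sK160N"}

# Distinct mutations overall and mutations shared by both groups.
distinct_escape = obi_escape | chb_escape
shared_escape = obi_escape & chb_escape

# Each mutation was seen in one of 25 sequences in its group.
frequency_percent = 1 / 25 * 100
```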

Occult-Unique Mutations
Occult-unique mutations were those mutations that only appeared in HBV sequences from OBI participants and not in CHB or reference sequences. For subgenotype A1, 12 HBV strains from OBI-positive participants were compared with 119 CHB strains (12 CHB from this study + 107 GenBank CHB references). For subgenotype D3, 12 OBI strains were compared with 97 CHB strains (12 from this study + 85 from GenBank reference sequences). A total of 43 OBI-unique mutations were identified in this study. Several of these mutations were reported previously; however, 26 (60.5%) were novel. Each mutation was found in a single strain, Table 4. The OBI-unique mutations were found in HBV strains from 14 participants, most of whom (12) harbored multiple mutations, Table 5.

Signature Amino Acid Analysis
The VESPA tool, available at https://www.hiv.lanl.gov/content/sequence/VESPA/vespa.html, was used to determine signature aa occurring in HBV sequences from OBI participants compared to CHB sequences. CHB sequences from the current study were used as background, and all analyses were performed by subgenotype. A total of 16 signature aa were detected, 12 in subgenotype A1 and four in subgenotype D3, Table 6. The PreS2 and C ORFs had no signature aa in either subgenotype.
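The idea behind a signature amino acid analysis can be sketched as follows: at each aligned position, the most common residue in the query set (here OBI) is compared with the most common residue in the background set (here CHB), and positions where they differ are reported. This is a simplified illustration under that assumption and is not the VESPA implementation; the function names and toy sequences are hypothetical, and ties are resolved by first occurrence.

```python
from collections import Counter


def most_common_residue(column):
    """Return (residue, count) for the most frequent residue in one column."""
    return Counter(column).most_common(1)[0]


def signature_positions(query_seqs, background_seqs):
    """List positions where the majority residue differs between the two sets.

    Each entry is (position, background residue, query residue,
    frequency of the query residue within the query set).
    Sequences are assumed to be aligned and of equal length.
    """
    signatures = []
    columns = zip(zip(*query_seqs), zip(*background_seqs))
    for pos, (q_col, b_col) in enumerate(columns, start=1):
        q_aa, q_count = most_common_residue(q_col)
        b_aa, _ = most_common_residue(b_col)
        if q_aa != b_aa:
            signatures.append((pos, b_aa, q_aa, q_count / len(q_col)))
    return signatures


# Toy example: position 2 is mostly A in the query set but mostly G in the background.
toy_signatures = signature_positions(
    query_seqs=["MAT", "MAT", "MGT"],
    background_seqs=["MGT", "MGT", "MAT"],
)
```

Unlike an OBI-unique mutation, a signature residue may still occur in some background sequences; what differs is which residue predominates in each group.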

Immune Selection Pressure
There were 26 negatively selected and no positively selected codons in the C ORF. Three of the negatively selected codons in the CHB group (4, 45, and 97) were in positions with OBI-unique mutations in the respective subgenotype. Two positions (96 and 122) were under negative selection in both CHB and OBI participants. None of the signature aa overlapped with codons under negative selection pressure. Five of the codons (4, 7, 68, 69, and 122) were within known HBV T cell epitopes: 1-20, 50-69, and 121-140. In the Pol ORF, there were 50 negatively selected codons and one positively selected codon. Only one OBI-associated mutation (rh81) overlapped with a codon under negative selection in the CHB participants, and two codons were negatively selected in both CHB and OBI participants. In the PreS1 and S regions, two codons and one codon, respectively, were negatively selected in both CHB and OBI participants. In the PreS1 region, seven of the codons were in HBV immune epitope positions: 28 and 29 in the 21-30 T cell epitope; 42, 44, and 46 in the 29-48 T cell epitope; and 85 and 90 in the 81-95 T cell epitope. In the X region, two codons (86 and 95) were in the B cell epitope 85-110, whereas in the S region, one codon (148) was situated in the B cell epitope 122-148, Supplementary Table S1. There were significantly more negatively selected codons in the CHB than in the OBI sequences (p = 0.031).

Discussion
This is the first study to report whole genome and nearly whole genome HBV sequences from Botswana and their molecular characterization. We also report the HBV genotypes in participants with OBI, and we identified mutations associated with OBI.
Subgenotypes A1 and D3 and genotype E were identified in both OBI and CHB participants; these genotypes were also isolated from CHB participants in previous studies carried out in Botswana [4,80]. Most HBV sequences from Botswana clustered with other African sequences both at whole genome and at whole surface region level, as indicated in a study by Makondo et al. [81]. Similar to the aforementioned South African study, in which some isolates clustered with Asian sequences, one sequence in this study also clustered with Asian sequences [81]. In further concordance with the South African study, sequences from HBsAg-positive and HBsAg-negative participants clustered together [81].
There are several mechanisms that have been implicated in OBI, including mutations in the HBV genome. However, there are few studies that have undertaken mutational analysis on nearly whole genomes of OBI strains owing to the difficulty in amplifying OBI samples with low viral loads [35,36,64,82-84]. Hence, most studies have focused on fragments of the HBV genome, especially the surface gene [41,42,64,85-87]. This is the first study to perform mutational analysis of nearly whole HBV genome sequences from Botswana.
The S ORF is evaluated most frequently for identification of OBI-associated mutations. Some mutations in this ORF may lead to decreased detection by enzyme-linked immunosorbent assay (ELISA) kits, whereas some mutations demonstrated decreased secretion of HBsAg or increased retention of HBsAg [41,42,64,85-87]. All OBI-associated mutations identified in the S ORF in this study (Table 4) have been reported previously [88][89][90][91][92], which is not unexpected given that the S region has been extensively studied [41,42,64,85-87]. The sL97P and sT114I mutations are located in an immunodominant region, the major hydrophilic region (position 99 to 169) [93]. Furthermore, the sN131K and sP217L mutations have been associated with diagnostic escape [89,94]. One of the mutations (sC124Y) has been reported and characterized in OBI sequences from blood donors in China and has been found to diminish both viral antigenicity and viral secretion [75]. Multiple mutations have been reported before at the same position (sC124R, sC124Y, and sC124A) and found to impact the OBI phenotype in similar or different ways: sC124A was found to decrease extracellular HBsAg by decreasing viral secretion, whereas sC124R and sC124Y decreased antigenicity and viral secretion [75,76]. Cysteine forms disulfide bridges, which are critical in the formation of the HBsAg structure; hence, loss of cysteine might alter the structure and impact immunogenicity [95].
Conversely, for D3 isolates, there was only one OBI-associated mutation (sQ129H), which has been reported and functionally characterized before and found to decrease viral secretion [76]. Other studies, such as a study in South Africa, have however reported OBI-associated mutations that had never been reported before [41]. There are differences between that study and the current study. In the South African study, most OBI mutations were from subgenotype A2, a subgenotype not present in the current study [41]. Another study in South Africa also identified three new OBI-associated mutations for genotype A participants (sY72H, sI82T, and sA128T), which were not observed in the current study [65].
In Botswana, there were no OBI-unique mutations in the PreS1 ORF for subgenotype A1 participants. The absence of OBI-associated mutations has been reported before; for example, a study in Turkey found no OBI-associated mutations in the S gene of female sex workers [96]. Some studies have reported deletions, which reduced HBsAg production in OBI participants, whereas other studies reported point mutations in this ORF [41,44,65,97]. The studies that reported point mutations concur with the present study, which reported one OBI-associated mutation in PreS1 for subgenotype D3. The preS1 S78N mutation was novel. A different variant (preS1 S78G) was observed at the same position in subgenotype A1 participants from South Africa [41].
There were two OBI-unique mutations in the PreS2 ORF of the subgenotype D3 sequences, as in other studies, which also found point mutations in this ORF [41,65,96,98]. A different variant (preS2 F22L) was reported at the position of the two OBI-associated mutations and has been linked with HCC [99,100]. The OBI-unique mutations in the current study are within the attachment region for human albumin (aa 17-28). Therefore, they may reduce HBV infectivity, since preS2 is responsible for mediating HBV attachment to hepatocytes [101]. Some studies documented deletions in the preS2 ORF, which may reduce HBsAg secretion [98,102]. The two mutations preS2 F22P and preS2 F22H were novel. The loss of phenylalanine, an aromatic amino acid, might impact protein structure, as aromatic aa play a role in protein structure stabilization [103].
OBI-associated mutations have also been described in the C ORF [22]. In the current study, there were 16 OBI-unique mutations in this ORF, and some of these mutations were located in functionally relevant regions. Three of these mutations (cD2A, cI3R, and cD4Y) fell within a CD4+ T cell epitope (aa 1-20), and two other mutations (cE64K and cI59L) were located within another CD4+ T cell epitope (aa 50-69) [104,105]. Furthermore, mutations cE117Stop and cR127H were found in a CD4+ T cell epitope (aa 117-131), and cR127H also falls within the CD4+ T cell epitope at aa 120-140 [104,105]. Some of the 16 OBI mutations in the core region have been reported before. For instance, cE64K has been reported in OBI-positive children born to HBsAg-positive mothers in Iran [22], whereas cE46D was found in OBI-positive Chinese patients [35]. The latter study also reported different variants at the same residues as some of the OBI-associated mutations (aa positions 26, 59, 74, 87, and 113) [35]. cV74N, cS87N, and cF97I have also been reported previously [106], with cF97I also detected in HBV from India [107]. Mutation cR127H has been shown to block capsid formation [108]. In the present study, there were 10 novel OBI-unique mutations in the core region: cS26P, cD32G, cP45S, cI59L, cD2A, cI3R, cD4Y, cW102V, cF103V, and cE117Stop. Position 102 plays a role in capsid assembly [109].
The X ORF is essential for transcription; hence, mutations in this ORF may affect viral replication [64]. In this study, there were seven OBI-associated mutations identified in the X region, four of which have been previously reported. Mutation xS31A has been reported previously, and phosphorylation of the X protein occurs at serine 31 [110,111]. Mutations xQ87L, xS101P, and xL116V have also been reported in the literature [99,112]. In the present study, three novel OBI-unique mutations were detected in this ORF (xS11A, xV15I, and xP11S). Other studies in India reported OBI-associated mutations in the X ORF [113]. Some studies have reported deletions in the X region, which are linked to OBI. For instance, a Korean study documented an 8 bp deletion in OBI strains [36]. These deletions were shown to decrease HBsAg and HBV secretion and therefore may be responsible for the OBI phenotype [36]. In the present study, however, as in some other studies, there were no long deletions in the OBI strains [22].
Several studies have reported mutations in the Pol gene that lead to drug resistance; because the Pol and S ORFs overlap, these mutations simultaneously lead to changes in the S region, which may affect HBsAg production [114]. For example, studies in India and South Africa documented drug resistance mutations in OBI-positive participants [113,115]. The rtT128I mutation identified in this study has also been reported in several studies, including in CHB participants, and it has been linked to lamivudine drug resistance [116][117][118][119][120][121]. One of the mutations found in this study (spL140I) has been reported in HIV-infected participants in Germany but has not been characterized yet [89]. A study in China reported a 218 bp deletion in the Pol region of OBI-positive adults, which might explain the low level of HBV viral load in the OBI participants [67]. In the current study, there were no deletions. On the other hand, another Chinese study found no OBI-associated mutations in the Pol region [44]. There were 10 novel OBI mutations in this region: tpN120Y, tpK155R, spW64R, spS91T, spP103S, spS133G, rtT225A, rtY257F, rtA329T, and rhI81M.
In most ORFs, more codons were under negative immune selection pressure in the CHB participants compared to the OBI participants. These results concur with a South African study, which found more negatively selected codons in the surface and Pol region in chronic participants compared to their OBI counterparts [41]. The reduced negative selection in the OBI participants might be due to the low HBV DNA, a characteristic of OBI [19,41]. Several of the negatively selected codons identified in this study are similar to those described in the South African study [41]. The overlap might be due to some similarities between the populations owing to the close proximity of the two countries. In contrast to the 16 negatively selected codons at sites with OBI-unique mutations in that study, only four were detected in the current study. These differences may be because the former study had more CHB participants (31 vs. 24) and because of variations in the HBV genotypes studied (A1 and A2 versus A1 and D3) [41,122]. Generally, there were also more negatively selected sites in genotype A compared to genotype D. Differences between genotypes, including in mutations, have been reported before [122]. Some of the codons under negative immune selection pressure were in known HBV epitopes [123][124][125][126][127]. Furthermore, nucleotide divergence was higher in the CHB group compared to the OBI group for most ORFs, Table 2, which concurs with other studies [113]. These differences in diversity may be due to immune pressure in the CHB group, as HBsAg triggers a greater immune response, thereby increasing the number of mutations arising as the virus evolves to evade the immune system. The low diversity seen in the OBI group may be due to the lack of HBsAg.
Several limitations of this study should be noted. The modest sample size may not reflect all aspects of HBV sequences circulating in Botswana or Southern Africa. Furthermore, the preC region was not analyzed, because it was not amplified in most participants. As most study participants were HIV-positive, findings from this study may not be generalizable to HBV mono-infected participants. Future HBV full-length studies on a larger sample size, including HBV mono-infected individuals, are warranted.
In conclusion, mutations in HBV from OBI participants in Botswana have been determined for the first time. Some of the OBI-unique mutations reported here have been functionally characterized before and found to affect HBsAg secretion/production and HBV virion secretion, and hence have an impact on the OBI phenotype. Other OBI mutations have been reported before, but their functional relevance is still unknown. In addition, some mutations found in this study have not been reported previously. This study has therefore added to the body of knowledge of OBI-associated mutations by reporting novel OBI-unique mutations. Knowledge of OBI-unique mutations is important because, with an improved understanding of OBI, diagnostic and preventative measures may be put in place to eliminate HBV per WHO goals.

Figure 1.A phylogenetic tree of the whole surface region (nucleotide (nt) 2854-835 from EcoRI site) of subgenotype A1 hepatitis B virus (HBV) sequences generated by Bayesian Evolutionary Analysis by Sampling Trees (BEAST).Strains from Botswana sequenced in the present study are shown in blue, while reference sequences are shown in black.Reference strains are designated by their accession number and country of origin, whereas Botswana sequences are designated by MA followed by a number and either CHB (for chronic HBV strains) or OBI (for occult HBV strains).

Figure 2. A phylogenetic tree of the whole surface region (nt 2854-835 from EcoRI site) of HBV subgenotype D3 generated by BEAST.Botswana sequences are shown in blue, whereas reference sequences are in black.Reference strains are designated by their accession number and country of origin, whereas Botswana sequences are designated by MA followed by a number and either CHB or OBI.

Table 1. Primers used for PCR and for sequencing.

Table 2. Comparison of the median nucleotide pairwise distance (%) between chronic and occult infections by open reading frame and subgenotype.

Table 3. HBV escape mutations in the S ORF of OBI versus CHB sequences.

Table 4. OBI-unique mutations in the different ORFs.

Table 5. OBI-unique mutations in different participants.

Table 6. Signature amino acids found mostly in HBV sequences from OBI participants compared to sequences from CHB participants.