In Silico Prediction of Human Leukocytes Antigen (HLA) Class II Binding Hepatitis B Virus (HBV) Peptides in Botswana

Hepatitis B virus (HBV) is the primary cause of liver-related malignancies worldwide, and there is currently no effective cure for chronic HBV infection (CHB). Strong immunological responses induced by T cells are associated with HBV clearance during acute infection; however, the repertoire of epitopes (epi) presented by major histocompatibility complexes (MHCs) to elicit these responses in various African populations is not well understood. In silico approaches were used to map and investigate 15-mer HBV peptides restricted to 9 HLA class II alleles with high population coverage in Botswana. Sequences from 44 HBV genotype A and 48 genotype D surface genes (PreS/S) from Botswana were used. Of the 1819 epi bindings predicted, 20.2% were strong binders (SB), and none of the putative epi bound to all 9 alleles, suggesting that multi-epitope, genotype-based, population-based vaccines will be more effective against HBV infections than previously proposed broad-potency epitope vaccines, which were assumed to work for all alleles. In total, 297 unique epi were predicted from the 3 proteins; among these, the S region had the highest number of epi (n = 186). Epitope densities (Depi) were similar between genotypes A and D. A number of mutations that hindered HLA-peptide binding were observed. We also identified antigenic and genotype-specific peptides with characteristics that are well suited for the development of sensitive diagnostic kits. This study identified candidate peptides that can be used for developing multi-epitope vaccines and highly sensitive diagnostic kits against HBV infection in an African population. Our results suggest that viral variability may hinder the HBV peptide-MHC binding required to initiate a cascade of immunological responses against infection.


Background
Hepatitis B virus (HBV), a member of the Hepadnaviridae family, is the major cause of end-stage liver disease (ESLD), liver cirrhosis (LC), and hepatocellular carcinoma (HCC), and causes up to 887,000 deaths per year [1]. Although more than 90% of healthy adults resolve acute HBV infection within 6 months, there remain over 287 million people who test seropositive for hepatitis B surface antigen (HBsAg) [2] and have chronic HBV infection (CHB). Viral clearance is mediated by cytokines, lymphocytes, and the ability to mount a multi-specific, polyclonal, and vigorous T cell-mediated response against HBV antigens for protective immunity [3][4][5]. The quality of these responses is influenced by host genetics, as well as by the ability of certain viral variants to escape immune recognition [6][7][8].
The major histocompatibility complexes (MHCs), known as human leukocyte antigens (HLAs) in humans, are integral components of the host genome encoded at chromosome 6p21. These highly polymorphic proteins serve as mediators of adaptive immune responses by presenting processed antigenic peptides to T cells. The two types of MHCs, class I and class II, present endogenous and exogenous epitopes to CD8+ cytolytic T cells and CD4+ T helper (Th) cells, respectively [9]. The MHC class II alleles (HLA-DR, -DQ, and -DP) present epitopes to CD4+ T cells [epi-HLA class II → CD4+ T cells], which in turn elicit adaptive immune responses against viral infections by facilitating the induction of CD8+ cytotoxic T lymphocytes (CTLs), the production of cytokines crucial for survival, and the maturation of B cells [10][11][12][13].
The link between HBV pathogenesis and host immunological profiles is still poorly understood [2]. HBV exhibits a high mutation rate, although only a small number of amino acid substitutions have been characterized functionally because in vitro assays are costly and time-consuming. Recent studies have used in silico approaches such as machine learning techniques to prioritize candidate peptides for in vitro assays [14,15]. Thus far, the HBV mutations characterized have been associated with the sensitivity of immunologic and molecular-based assays and with viral escape leading to poor prognosis [16][17][18][19]. However, amino acid variations that influence HBV-MHC binding are poorly understood. In silico mapping of HLA class II binding peptides can be used to identify candidate peptides for in vitro assays that confirm CD4+ T cell epitopes, which are crucial for the design of epitope-based vaccines and of highly sensitive diagnostic tools that can detect the low HBV DNA levels frequently missed by diagnostic kits [20][21][22][23].
Sub-Saharan Africa (SSA) and the Western Pacific regions are highly endemic for CHB [24]; the circulating genotypes include A, D, and E in SSA, and B and C in the Western Pacific. HBV genotypes A (subgenotype A1) and D (subgenotype D3) have been reported in Botswana, with genotype E occurring rarely [25][26][27][28][29]. Not only do genotypes show unique geographic distributions, they also differ in treatment response, pathogenic potential, and prognosis [30,31]. Studies conducted in China have mapped different T cell epitopes that may be eligible for epitope-based vaccines, and some were evaluated in vitro [20,32]. However, these findings may be less applicable to African populations, whose host genetic pool, circulating genotypes, and immune profiles for HBV (e.g., hepatitis B e antigen [HBeAg] and HBsAg positivity) differ considerably from those of the Chinese population. Prerequisites for determining the epitope(s) to include in epitope-based vaccines are (1) identification of conserved regions of the genome and (2) characterization of those regions that elicit protective immune responses [33]. Although hepatitis B vaccines are currently in use, vaccine escape does occur; thus, more optimized vaccine candidates may be needed to avoid vaccine failure. In this study, we used the HLA class II alleles that occur at the highest frequency in Botswana and locally derived HBV strains to identify HLA class II binding peptides that are good candidates for confirmatory in vitro tests of immunogenicity.
The present study had three major aims: (1) to determine the repertoires of HLA class II epitopes within HBV envelope sequences of genotypes A and D isolated from different risk groups in Botswana (described in our earlier papers); (2) to assess whether the epitopes predicted in genotypes A and D vary across other HBV genotypes, which would suggest that genotype-based multi-epitope vaccines would be more successful than the broad-potency vaccines currently in use; and (3) to investigate amino acid variations within these epitopes to determine whether they may lead to immune escape (i.e., candidate escape mutations).

Mapping Peptides from HBV Surface Gene Restricted to HLA-class II Alleles
Three sequence datasets were included in the current analysis. The first dataset (N1) was used to map epitopes (epi) that bind predominantly to HLA class II alleles in Botswana and consisted of 92 non-recombinant full-length S gene (PreS/S) sequences [25,26,28] retrieved from GenBank (accession numbers MF979142-MF979176, KR139743-KR139748, and MH464807-MH464854). The aligned sequences were sorted by genotype and included A (n = 44) and D (n = 48). The three domains of the HBV surface proteins (PreS1, PreS2, and S) were manually extracted from an overlapping Pol/S fragment and were divided by genotype, and their amino acid (aa) sequence alignments were sorted according to column similarities. Nucleotide alignments and sorting were performed using AliView 1.21 software [34]. Each region was then used to create a consensus sequence with the threshold set at 90% for all positions. Variants that did not meet this threshold were investigated independently in post-analyses. To assess whether the aa composition of the consensus sequences was representative of existing HBV strains, BLAST searches were conducted against the NCBI database, and strains exhibiting 100% similarity and coverage were evaluated further (Supplementary Table S1).
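The consensus step described above can be sketched as follows. This is a minimal illustration only, not the AliView procedure used in the study: the function name `consensus`, the placeholder character "X" for positions below the threshold, and the toy alignment are all assumptions introduced here.

```python
from collections import Counter


def consensus(aligned, threshold=0.90, ambiguous="X"):
    """Build a consensus from equal-length aligned amino acid sequences.

    A position keeps its most common residue only if that residue reaches
    `threshold`; otherwise it is marked with `ambiguous` (an assumption of
    this sketch) and its 1-based position is reported for follow-up.
    """
    if not aligned:
        raise ValueError("no sequences given")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")

    residues = []
    below_threshold = []
    for pos in range(length):
        column = [seq[pos] for seq in aligned]
        residue, count = Counter(column).most_common(1)[0]
        if count / len(column) >= threshold:
            residues.append(residue)
        else:
            residues.append(ambiguous)
            below_threshold.append(pos + 1)  # 1-based position
    return "".join(residues), below_threshold


# Toy alignment (hypothetical): position 3 is Q in 8 sequences and K in 2.
toy = ["MGQN"] * 8 + ["MGKN"] * 2
print(consensus(toy))
```

Positions returned in the second element correspond to the variants that, per the text, would be examined separately in post-analyses.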
The relationship between the length of the protein and the frequency of binding peptides was compared using an epitope density score (Depi) for all protein domains of the S gene (PreS1, PreS2, S) and genotypes (A versus D). The Depi score was defined as the proportion of binding peptides (where Iepi = WB + SB) to total predicted epitopes (Tepi) relative to protein size:
$$D_{epi} = \frac{\sum_{i=1}^{n} I_{epi}}{T_{epi} \times \text{Protein length}}$$
Tepi represents the total count of all predicted epitopes.
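As a rough numerical illustration of the density score, the sketch below follows the definition as worded in this section: binders (SB or WB) divided by the product of the total number of predictions and the protein length. The function name `epitope_density` and the toy labels are assumptions, and the exact counting used in the study may differ.

```python
def epitope_density(labels, protein_length):
    """Epitope density for one protein, following the wording in the text.

    labels: one binding label per predicted peptide-allele pair,
            each of "SB", "WB", or "NB" (non-binder).
    protein_length: protein size in amino acids.
    """
    if protein_length <= 0:
        raise ValueError("protein_length must be positive")
    if not labels:
        return 0.0
    binders = sum(1 for label in labels if label in ("SB", "WB"))  # I_epi summed
    total = len(labels)  # T_epi as "total predicted epitopes"
    return binders / (total * protein_length)


# Toy input (hypothetical): 2 binders among 4 predictions on a 50-aa protein.
print(epitope_density(["SB", "WB", "NB", "NB"], 50))
```

Because the score is normalized by length, proteins of very different sizes can be compared on the same scale.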

Determining Prevalence of Putative Epitopes in HBV Genotypes (A-I) Except A and D
Putative or promiscuous epi were defined as peptides that exhibit similar binding affinity to 2 or more alleles. The prevalence of the predicted putative epi was determined in a second dataset (N2) that included 10,308 PreS/S sequences (genotype B = 2905; C = 5575; E = 1118; F = 477; G = 86; H = 69; I = 78) retrieved from the HBV database available at http://hvdr.bioinf.wits.ac.za/alignments/index.html [38]. N2 was also used to determine the overall prevalence of the predicted escape mutations. The sequences used in this analysis are included in the supplementary file provided (Supplementary Table S3).
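The two operations described here, selecting promiscuous peptides and measuring their prevalence in a sequence set, can be sketched as below. This is an illustrative simplification: it treats a peptide as promiscuous when it is a binder (SB or WB) for at least 2 alleles, and it measures prevalence by exact substring match. The function names, the toy predictions, and the flanking residues in the toy sequences are assumptions.

```python
def promiscuous_epitopes(predictions, min_alleles=2):
    """Map each peptide to the alleles it binds, keeping promiscuous ones.

    predictions: iterable of (peptide, allele, label) with label in
                 {"SB", "WB", "NB"}.
    """
    alleles_by_peptide = {}
    for peptide, allele, label in predictions:
        if label in ("SB", "WB"):
            alleles_by_peptide.setdefault(peptide, set()).add(allele)
    return {
        peptide: sorted(alleles)
        for peptide, alleles in alleles_by_peptide.items()
        if len(alleles) >= min_alleles
    }


def prevalence(peptide, sequences):
    """Percentage of sequences that contain the peptide exactly."""
    if not sequences:
        return 0.0
    hits = sum(1 for seq in sequences if peptide in seq)
    return 100.0 * hits / len(sequences)


# Toy data (hypothetical labels; the two cores are taken from the text).
toy_predictions = [
    ("FLLTRILTI", "DRB1*0101", "SB"),
    ("FLLTRILTI", "DRB1*0401", "WB"),
    ("FLGGSPVCL", "DRB1*0101", "WB"),
    ("FLGGSPVCL", "DRB1*0401", "NB"),
]
print(promiscuous_epitopes(toy_predictions))
print(prevalence("FLLTRILTI", ["AAFLLTRILTIAA", "AAAAAAAAAAAAA"]))
```

Exact matching is the strictest choice; a single substitution anywhere in the peptide counts as absence, which is what makes the prevalence a conservation measure.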

Variations Causing Escape to HLA Class II Binding
Dataset N3 consisted of 7743 HBV sequences (genotype A = 3115; genotype D = 4628) and was used to determine the frequency, in other HBV sequences, of the aa variations termed HLA escape mutations. The sequences were curated from http://hvdr.bioinf.wits.ac.za/alignments/index.html and partitioned by protein (PreS1, PreS2, S) [38]. Escape mutations were defined as aa variations within the 9-mer core aa sequence of the 15-mer peptide that cause the binding affinity to change either from strong to pseudo binding (SB → NB) or from weak to pseudo binding (WB → NB). Several in-house customized Python pipelines were used to determine the frequency of escape mutations. The sequences used in this analysis are in the supplementary file provided (Supplementary Table S4).
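The in-house pipelines are not described in detail, so the following is only a sketch of how the escape definition above could be encoded. The function names are assumptions, and the binding labels are taken as given inputs rather than predicted here.

```python
def is_escape(reference_label, variant_label):
    """True when a binder (SB or WB) becomes a non-binder (NB)."""
    return reference_label in ("SB", "WB") and variant_label == "NB"


def core_differences(reference_core, variant_core):
    """List (position, reference_aa, variant_aa) for each differing residue.

    Positions are 1-based within the core; both cores must be equal length.
    """
    if len(reference_core) != len(variant_core):
        raise ValueError("cores must have the same length")
    return [
        (i + 1, ref, var)
        for i, (ref, var) in enumerate(zip(reference_core, variant_core))
        if ref != var
    ]


# The two S-protein cores quoted in the text differ at one position.
print(core_differences("FLGGSPVCL", "FLGGPPVCL"))
print(is_escape("WB", "NB"))
```

Keeping the label comparison separate from the residue comparison makes it easy to tabulate which substitutions coincide with a loss of binding across many sequences.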

Screening Putative Epi and Reconstruction of Tertiary Structure of the Modelled Vaccine
Since the predicted putative epi may also be homologous to human peptides, which could (1) cause autoimmune responses when used as a vaccine or (2) give false results when used as a diagnostic marker, a BLAST search was conducted against the NCBI protein database for all immunogenic epi. Afterwards, the predicted putative epi were catenated using a previously described method [39], and different combinations were used to construct candidate multi-epitope vaccines (MEVs). Physiological and biochemical properties such as thermal stability, desirable shelf-life, and pH, among other properties, are prerequisites during development of an ideal vaccine. The biochemical properties of the generated candidate proteins were evaluated using the online ProtParam tool [40]. The proteins exhibiting properties similar to those of vaccines currently in use were deemed the best candidates. The properties of 3 current HepB vaccines (VO_0011094, VO_0011095, and VO_0011093) are curated in the DNA vaccine database [41]. The properties predicted include immunogenicity, antigenicity, instability index, estimated half-life in humans, molecular weight (mw), aliphatic index (AI), grand average of hydropathicity (GRAVY), and theoretical pH.
Viruses 2020, 12, 731
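Two of the listed properties have simple closed-form definitions and can be illustrated directly. The sketch below uses the standard Kyte-Doolittle hydropathy scale for GRAVY and the standard aliphatic index formula (mole percent Ala + 2.9 × Val + 3.9 × (Ile + Leu)); it is not the ProtParam implementation, and the function names are assumptions.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue."""
    if not sequence:
        raise ValueError("empty sequence")
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


def aliphatic_index(sequence):
    """Aliphatic index from mole percentages of Ala, Val, Ile, and Leu."""
    if not sequence:
        raise ValueError("empty sequence")

    def mole_percent(aa):
        return 100.0 * sequence.count(aa) / len(sequence)

    return (
        mole_percent("A")
        + 2.9 * mole_percent("V")
        + 3.9 * (mole_percent("I") + mole_percent("L"))
    )


# One of the overlapping S-region segments quoted later in the text.
segment = "SGFLGPLLVLQAGFF"
print(round(gravy(segment), 3), round(aliphatic_index(segment), 2))
```

A positive GRAVY value indicates an overall hydrophobic sequence, and a higher aliphatic index is generally read as a sign of greater thermostability.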

Determining the HLA-HBV Association Using Tepi
To test the hypothesis that Tepi could serve as a useful predictor of HLA-HBV associations, the Tepi of S genes (A and D) were used to rank the 9 HLA class II alleles in the post epi prediction analyses. The available literature was used to corroborate the analyses.

Predicting T-cell Epitopes Using Consensus from N1 Dataset
We first generated 6 different consensus sequences from PreS/S sequences of genotypes A and D. The consensus sequences were validated by comparison to the NCBI database, which identified pre-existing sequences exhibiting 100% identity to each consensus: 16 to the PreS1 genotype A (PreS1A) consensus sequence, 39 to PreS1D, 6 to PreS2A, 12 to PreS2D, 19 to SA, and 20 to SD. The consensus sequences, identical sequences, and their countries of origin are provided in the supplementary file (Supplementary Table S1). An epitope was defined as a peptide that binds either weakly or strongly to the HLA-DR alleles used, while HLA class II escape mutations were defined as aa variations within the 9-mer core aa sequence that changed the epitope-HLA-DR binding from SB/WB to pseudo binding. In total, there were 1819 predicted bindings (Tepi) from 297 unique epitopes restricted to 9 HLA class II alleles, including 20.2% SB. The number of signature aa differentiating HBV genotypes A and D was 31, 13, and 9 for the S, PreS1, and PreS2 regions, respectively (Table 1). The S protein had the most binding peptides, constituting 79.9% of the sum of all Tepi (∑ Tepi); PreS1 constituted 11%, while PreS2 had the least (9.1%). The Depi of the SB and WB among genotypes (A and D) and proteins (PreS1, PreS2, S) are summarized in Figure 2.

$$D_{epi} = \frac{\sum_{i=1}^{n} I_{epi}}{T_{epi} \times \text{Protein length}}$$
where i can be any protein (PreS/S A versus PreS/S D). PreS1A represents genotype A large hepatitis B surface antigen (HBsAg); PreS1D represents genotype D large HBsAg; PreS2A represents genotype A middle HBsAg; PreS2D represents genotype D middle HBsAg; SA represents genotype A small HBsAg; SD represents genotype D small HBsAg. Tepi = total binding peptides (WB + SB). Nepi unique = count of unique binding peptides per protein.

Prevalence of Putative Epitopes Across all HBV Genotypes
To assess whether the predicted epi may be suitable for vaccine inclusion, the most common epi were compared to sequences of genotypes other than A and D. Table 2 shows the number of aa sequences containing the indicated epi. Eight out of 53 epi were found in semi-conserved regions ranked as "++++" and had a prevalence > 85% among sequences in the N2 dataset. These results suggest that multi-epitope, genotype-based vaccines may be better suited to avoiding vaccine escape.

Profiles of Strong Binding Epitopes
SB epi of the 3 proteins were sorted by the core aa sequence and analyzed based on genotype. Those found in both genotypes were considered putative when binding to alleles. There were no SB epitopes that were common between sequences of genotypes A and D for PreS1 and PreS2 regions. S protein had 89 out of 230 epi that satisfied the above criteria and were promiscuous for at least 5 alleles. There were 5 unique epi whose core aa were at S protein residues 41-49 (FLGGSPVCL), 14 epi with S protein residues 20-28 (FLLTRILTI), 10 epi with S residues 183-191 (FVGLSPTVW), 9 epi with S residues 22-30 (LTRILTIPQ), 6 epi with S residues 162-170 (LWEWASARF), 6 epi with S residues 184-192 (VGLSPTVWL), 12 epi with S residues 96-104 (VLLDYQGML), 2 epi with S residues 180-188 (VQWFVGLSP), 9 epi with S residues 163-171 (WEWASARFS), 4 epi with S residues 182-190 (WFVGLSPTV), and 12 epi with S residues 72-80 (YRWMCLRRF). Table 3 shows the full profiles of SB epi mapped for the S proteins for genotypes A and D.

Profiles of Most Promiscuous Epitopes
Since the majority of predicted epi were WB, a strict threshold was applied to select the most promiscuous epi. Thus, epi were selected if they bind to at least 6 alleles, as shown in Table 4. Table 4 shows the 27 putative epi that bind to at least 6 alleles and were selected to model the tertiary structure of candidate vaccines. The overlapping S region epi (SA = 11, SD = 10) were catenated to form proteins, which were used to model a tertiary structure as shown in Figure 3. The overlapping SA epi at S protein residues 6-20 with aa sequence SGFLGPLLVLQAGFF, aa sequence CIPIPSSWAFAKYLWEWASVRFSWLSLLVPFVQWF at S protein residues 155-183, and aa sequence WYWGPSLYNILSPFIPLLPIFFCLW at S protein residues 199-223 yield a 75 amino acid protein of mw = 8878.62. Of the 6 proteins predicted, the most stable SA protein was vacci-SA with amino acid sequence: 5′-SGFLGPLLVLQAGFFWYWGPSLYNILSPFIPLLPIFFCLWCIPIPSSWAFAKYLWEWASVRFSWLSLLVPFVQWF-3′. The overlapping epi in SD occupied residues S: 6-20, S: 68-82, S: 197-223, and S: 155-183. The resulting protein was 85 aa long and had mw = 10,749.99. Of the 24 proteins predicted, the one selected for constructing the tertiary structure was vacci-SD with aa sequence: 5′-MMWYWGPSLYSILSPFLPLLPIFFCLWSGFLGPLLVLQAGFFSWAFGKFLWEWASARFSWLSLLVPFVQWFTCPGYRWMCLRRFIIFLF-3′. When a BLAST search was conducted against the NCBI protein database for the 2 protein sequences, the results showed that both sequences were similar to 2 domains of the major surface antigen (vMSA) from the hepadnavirus superfamily (accession number pfam00695). A similar approach was used to select epi in the PreS2 region for modelling 3D tertiary structures. In total, 6 epi were selected, 3 for each genotype. In genotype A sequences, the overlapping epi were 20 aa long with a mw of 2006.20 and occupied residues PreS2: 34-53. vacci-PreSD was made from overlapping epi occupying residues PreS2: 37-53 and had a mw of 1719.91. All proteins generated using the different epi orderings (permutations and combinations) are provided in the supplementary file (Supplementary Table S5).
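The counts of 6 and 24 candidate proteins match the number of orderings of 3 and 4 segments (3! and 4!). A minimal sketch of enumerating such orderings is given below; the function name `candidate_constructs` is an assumption, segments are joined directly without linkers, and the catenation method of [39] is not reproduced.

```python
from itertools import permutations


def candidate_constructs(segments):
    """Return every ordering of the segments joined into one sequence."""
    return ["".join(order) for order in permutations(segments)]


# The three overlapping genotype A S-region segments quoted in the text.
sa_segments = [
    "SGFLGPLLVLQAGFF",
    "CIPIPSSWAFAKYLWEWASVRFSWLSLLVPFVQWF",
    "WYWGPSLYNILSPFIPLLPIFFCLW",
]
constructs = candidate_constructs(sa_segments)
print(len(constructs), len(constructs[0]))
```

Each ordering could then be scored for stability and other properties, with the best-scoring construct carried forward to structure modelling.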

Designing and Predicting Structure of Candidate Multi-Epitope Vaccine
The degree of conservation was scored on the following scale: score ≥ 100, highly conserved, denoted by '+++++'; score ≥ 85, semi-conserved, '++++'; score ≥ 60, region of mutation, '+++'; score ≥ 20, highly variable region, '++'; otherwise, high escape mutation, '+'. n represents the number of sequences used in the analysis. B represents full-length genotype B sequences, C represents full-length genotype C sequences, etc. 17-31 represents the positions occupied by the predicted 15-mer epitope (e.g., 17 is the first amino acid of the epitope, while 31 is the last amino acid of the epitope).
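The conservation scale above maps directly onto a small function. The sketch below encodes the stated cut-offs; the function name `conservation_grade` is an assumption, and the score is taken to be a prevalence percentage.

```python
def conservation_grade(score):
    """Return the '+' grade for a conservation score given in percent."""
    if score >= 100:
        return "+++++"  # highly conserved
    elif score >= 85:
        return "++++"   # semi-conserved
    elif score >= 60:
        return "+++"    # region of mutation
    elif score >= 20:
        return "++"     # highly variable region
    else:
        return "+"      # high escape mutation


for value in (100, 92.5, 60, 35, 4):
    print(value, conservation_grade(value))
```

Checking the thresholds from highest to lowest means each score falls into exactly one grade, with the boundaries (85, 60, 20) belonging to the higher grade.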

Mutations Associated with Escape from Class II HLA Binding
For amino acid variants present in more than 5% of sequences in the N1 dataset, we evaluated the in silico impact on immune recognition. Mutations were labelled relative to the HBV surface gene proteins: PreS1 (residues 1-115), PreS2 (residues 1-42), and S (residues 1-226). Mutations that hindered HBV peptide-HLA binding were 86T, 90T, and 94P in PreS1A; 54P, 79E, 84S, and 85Q in PreS1D; 12I, 31I, and 54P in PreS2A; and 5F, 22H, 22L, 22P, 32H, 36L, and 42S in PreS2D. The list of escape mutations and their prevalence in 7743 HBV genotype A and D sequences are shown in Table 5 and in the supplementary file (Supplementary Table S5). There were coordinated variations among positions in the aa sequence alignments that had an impact on HBV-HLA binding together but showed no impact when analyzed individually. These were termed covariance mutations. The combinations of the covariance mutations and their impact on binding potential are shown in Tables 5-7.
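The idea of covariance mutations, substitutions that abolish binding only in combination, can be sketched as a search over mutation sets. Everything in the example below is illustrative: `covariance_sets` is a hypothetical helper, the binding predictor is passed in as a callable, and the toy predictor is a made-up rule that does not reflect real binding behaviour.

```python
from itertools import combinations


def covariance_sets(reference, mutations, predict, max_size=2):
    """Find mutation combinations that cause escape only when combined.

    reference: reference peptide (a binder).
    mutations: list of (1-based position, new residue) tuples.
    predict:   callable mapping a peptide to "SB", "WB", or "NB".
    """
    def mutate(muts):
        residues = list(reference)
        for position, residue in muts:
            residues[position - 1] = residue
        return "".join(residues)

    if predict(reference) == "NB":
        return []  # nothing to escape from

    # Keep only mutations that do NOT cause escape on their own.
    neutral = [m for m in mutations if predict(mutate([m])) != "NB"]

    found = []
    for size in range(2, max_size + 1):
        for combo in combinations(neutral, size):
            positions = {position for position, _ in combo}
            if len(positions) < size:
                continue  # two substitutions at the same position
            if predict(mutate(combo)) == "NB":
                found.append(combo)
    return found


def toy_predict(peptide):
    """Hypothetical rule: binding is lost only if Q3 and L5 co-occur."""
    return "NB" if peptide[2] == "Q" and peptide[4] == "L" else "WB"


print(covariance_sets("ILATVPAVP", [(3, "Q"), (5, "L"), (8, "N")], toy_predict))
```

In practice `predict` would wrap whichever binding prediction tool is in use; larger values of `max_size` also report supersets of smaller escaping combinations.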

Assessing the Distribution of HBV-HLA (epi) as a Predictor of HLA Protective Effect
We used existing information from immunological studies for a meta-analysis to evaluate Tepi as a predictor of the protective effect of MHC class II alleles. Figure 4 shows a Pareto analysis, and the 20% threshold indicates the 3 HLA class II alleles (DRB1*0301, *1302, and *1101) that are highly likely to be less protective against HBV. We therefore speculate, with caution, that there is a relationship between Tepi and protectiveness, and this should be further investigated to establish correlations (p-value < 0.05) with high statistical confidence.
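A Pareto-style ranking of alleles by Tepi can be sketched as below. The counts are invented for illustration and are not the study's data, the function name `pareto_table` is an assumption, and how the 20% threshold was applied in Figure 4 is not specified in the text, so the sketch stops at the ranked cumulative shares.

```python
def pareto_table(counts):
    """Rank items by count and attach the cumulative percentage.

    counts: dict mapping an allele name to its number of binding peptides.
    Returns a list of (allele, count, cumulative_percent), largest first.
    """
    total = sum(counts.values())
    if total == 0:
        return []
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    rows = []
    running = 0
    for allele, count in ranked:
        running += count
        rows.append((allele, count, 100.0 * running / total))
    return rows


# Hypothetical Tepi counts for three alleles (not taken from the study).
toy_counts = {"allele_1": 50, "allele_2": 30, "allele_3": 20}
for row in pareto_table(toy_counts):
    print(row)
```

Alleles at the tail of such a table contribute the smallest share of binding peptides, which is the group the text associates with lower protection.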


Discussion
This detailed HBV immunoinformatics approach outlines the candidate peptides that can be used to develop biologicals against CHB. However, mutations within the T cell epitopes may impair HBV-HLA complex formation, a crucial step in initiating the cascade of responses for viral clearance [67]. In this study, we observed that no epi bound to all alleles, and there was a large difference between the epi profiles of genotype A and genotype D. This phenomenon may explain previous failures during preclinical trials of candidate vaccines against CHB, which were designed for broad potency [68][69][70][71][72][73][74][75]. This may suggest that a genotype- and population-based, multi-epitope vaccine would be the best candidate to combat HBV. Using the cut-off set in this analysis, 21 epi from the S region and 6 from the PreS2 region were identified and used to construct the tertiary structure of candidate vaccines. The candidate vaccines showed high binding to all alleles except DRB1*0301. Populations with high coverage of DRB1*0301 have been closely associated with high susceptibility to CHB infection and nonresponse to vaccination with envelope proteins [53,76,77].
Among the 3 proteins (PreS/S) analyzed in this study, S had the most binding peptides, constituting 79.9% of the sum of all Tepi (∑ Tepi), and genotype D had the most epitopes. However, when Depi were compared between genotypes (A and D) and proteins (PreS/S), it was clear that the number of epi was independent of protein length but dependent on the aa composition and the position of the epi within the proteins. While the S region was the longest protein (213 aa), it had a Depi similar to that of the smallest protein, PreS2D (54 aa). Previous studies that investigated all 7 proteins of HBV showed that the S protein had the most T cell antigenic epitopes [78,79]. However, further investigation should be conducted to determine any relationship between antigenicity and the PreS1 protein, which had the fewest epi.
We also compared the frequency of epi between the 2 genotypes, A and D, and observed that genotype D generally had more immunogenic epi than A, with the exception of PreS2A, whose epi were 12% more numerous than those of PreS2D (Figure 1). Clinically, these trends correspond to data reported from studies that investigated the prognosis of patients infected with HBV genotype A1 strains compared to genotype D3. Others have reported a 10-fold increased progression to HCC among HBV patients infected with genotype A compared to genotype D, and that patients infected with genotype A strains were more likely to progress to CHB than those infected with genotype D [8,80]. Most countries in SSA, including Botswana, have a lower prevalence of HBV genotype D3 than of genotype A1 among different risk groups [25][26][27][81].
Using existing information from studies that investigated the impact of alleles on different HBV outcomes to validate our statistical associations [53][82][83][84][85], we observed that, out of 9 alleles, the HLA-DRB*1301/2 and *0401 alleles, which had the most epi, have been associated with spontaneous clearance of HBV infection [18,24,33,68,78,82,85,86,87], whereas HLA DRB1*0301, which had the fewest epi in our study, has previously been associated with susceptibility to HBV infection, autoimmune hepatitis, chronicity, and non-responsiveness to HBV vaccination across different ethnic groups [9,88,89]. This suggests that the Tepi of PreS/S should be explored further as a predictor of the protective effect of HLA class II alleles. A host immune system can recognize foreign antigens (epi) and clear the infection in some cases [90][91][92]; however, most pathogens, including HBV, can mutate within epitopes, and this may result in escape from host immune surveillance, leading to persistence of infection [93]. This characteristic is regarded as one of the major hindrances to developing high-potency therapeutic drugs. Such aa changes within epitopes (escape mutations) interfere with peptide processing, reducing the intracellular antigen load, and downregulate MHC expression, hence increasing the risk of developing liver malignancies (HCC, LC) among CHB patients [4,5,94,95,96,97,98]. Studies investigating the role of escape mutations within T cell epitopes are relatively rare. In this study, we observed coordinated aa variations that reveal genetic dependencies (i.e., epi that escaped HLA binding only when two or more mutations were present); these were termed covariance mutations. Some single aa mutations also altered the binding potential.
For instance, among the PreS1 binding peptides from genotypes A and D shown in Figure 3, 15-mers with core aa sequence ILATVPAVP (residues 84-92) in PreS1A, corresponding to ILQTLPANP (residues 73-81) in PreS1D, bind weakly to alleles *0101, *0401, *0701, *0802, *1101, *1302, *1501, and 5*0101 for PreS1A, and to alleles *0101, *0401, *0701, *0802, and *1302 for PreS1D. The aa Ala in genotype A is replaced by Gln in genotype D [A(A) → Q(D)], together with [Val(A) → Leu(D)] at position 91, causing PreS1D to pseudo bind to 3 alleles (*1101, *1302, and 5*0101). Additionally, the epitope strongly binds to allele *0802, but the changes in genotype D result in pseudo binding. Tables 3-7 summarize all the core aa of PreS/S epitopes that are restricted to the 9 HLAs. We observed that HBV epitope-HLA binding is greatly influenced by the position of the core aa. For instance, the PreS1D epi AFGLGFTPPHGGLLG (residues 51-65) is a WB to alleles *0101, *0401, and 5*0101 when using the core aa FGLGFTPPH (residues 52-60), but it can only bind to *0701 when the core aa is FTPPHGGLL (residues 57-64). Furthermore, post epi analyses show that there were mutations outside the core aa sequences that nevertheless had an impact on HBV-HLA binding. For instance, 2 epi at aa residues S: 41-55 (FLGGPPVCLGQNSQS and FLGGSPVCLGQNSQS), both with core aa sequence FLGGPPVCL, show different binding affinities, namely NB and WB, respectively. The escape mutations defined in the present study were those found within the core aa sequence.
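Because a 15-mer contains several candidate 9-mer cores, the dependence of binding on core position can be made concrete by listing them. The sketch below is illustrative only: the function name `candidate_cores` is an assumption, and residue numbers are simply counted from the stated start of the 15-mer, so they follow this sketch's own numbering rather than any table in the study.

```python
def candidate_cores(peptide, start, core_length=9):
    """List every contiguous core within a peptide.

    peptide: amino acid sequence of the peptide (e.g., a 15-mer).
    start:   residue number of the peptide's first amino acid.
    Returns (core, first_residue, last_residue) tuples.
    """
    if len(peptide) < core_length:
        return []
    return [
        (peptide[i:i + core_length], start + i, start + i + core_length - 1)
        for i in range(len(peptide) - core_length + 1)
    ]


# The PreS1 genotype D 15-mer discussed in the text, starting at residue 51.
for core, first, last in candidate_cores("AFGLGFTPPHGGLLG", 51):
    print(core, first, last)
```

A 15-mer yields 7 overlapping 9-mer cores, and a prediction tool may assign a different core, and therefore a different allele profile, to the same peptide.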
Overall, the present computational study facilitates the development of experimental epitope and escape mutation mapping studies.

Conclusions
Vaccines act by inducing strong immunity to counteract viral antigens presented as MHC-epitope complexes. However, their success is affected by virus evolution (e.g., escape mutations) within known protective epitopes; hence, multi-epitope, population-based vaccine constructs are preferred in order to generate a potent immunologic response against HBV. We characterized the quality of T cell epitopes among different HBV genotypes and constructed a candidate multi-epitope, population-based vaccine. Our results suggest that T cell escape mutations do exist among aa variations classified as polymorphisms. In silico studies should be followed up with preclinical assays to validate their findings.
Viral hepatitis (VH) is a global burden, and the WHO has put forth an ambitious goal to eliminate VH as a public health threat by 2030. HBV contributes the vast majority (77%) of VH cases, and there are no therapeutic cures for chronic hepatitis B infection (CHB). We hypothesized that epitope vaccines are a potential CHB treatment because they can induce strong immune responses with the ability to achieve hepatitis B surface antigen (HBsAg) loss. While several trials have failed to produce effective vaccines against CHB from T cell epitopes, we aimed to investigate the repertoire of T cell epitopes from different HBV genotypes (A and D) and MHC class II alleles with high population coverage in Botswana. In silico analyses were used to map promiscuous epitopes (15-mers) using 9 MHC class II alleles and PreS/S sequences of genotypes A and D with high population coverage in Botswana. Some epitopes mapped within PreS/S conserved regions, and none were promiscuous to all alleles, suggesting that multi-epitope, population-based vaccines (MEPBV) may be more effective candidate vaccines against CHB than previously reported broad-potency epitope-based candidate vaccines.
Highly promiscuous peptides may also be considered as candidate peptides for designing highly sensitive diagnostic chips, since current serological kits may fail to detect other HBV clinical phenotypes. The mapped T cell epitopes exhibited high mean diversity among genotypes, and others had coordinated amino acid variations that were genetically dependent on each other in order to escape epitope-HLA binding.