Molecular and Genetic Characterization of HIV-1 Tat Exon-1 Gene from Cameroon Shows Conserved Tat HLA-Binding Epitopes: Functional Implications

HIV-1 Tat plays a critical role in viral transactivation. Subtype-B Tat has potential use as a therapeutic vaccine. However, viral genetic diversity and population genetics would significantly impact the efficacy of such a vaccine. Over 70% of the 37-million HIV-infected individuals are in sub-Saharan Africa (SSA) and harbor non-subtype-B HIV-1. Using specimens from 100 HIV-infected Cameroonians, we analyzed the sequences of HIV-1 Tat exon-1, its functional domains, post-translational modifications (PTMs), and human leukocyte antigens (HLA)-binding epitopes. Molecular phylogeny revealed a high genetic diversity with nine subtypes, CRF22_01A1/CRF01_AE, and negative selection in all subtypes. Amino acid mutations in Tat functional domains included N24K (44%), N29K (58%), and N40K (30%) in CRF02_AG, and N24K in all G subtypes. Motifs and phosphorylation analyses showed conserved amidation, N-myristoylation, casein kinase-2 (CK2), serine and threonine phosphorylation sites. Analysis of HLA allelic frequencies showed that epitopes for HLAs A*0205, B*5301, Cw*0401, Cw*0602, and Cw*0702 were conserved in 58%–100% of samples, with B*5301 epitopes having binding affinity scores > 100 in all subtypes. This is the first report of N-myristoylation, amidation, and CK2 sites in Tat; these PTMs and mutations could affect Tat function. HLA epitopes identified could be useful for designing Tat-based vaccines for highly diverse HIV-1 populations, as in SSA.


Introduction
About 37 million individuals worldwide are living with HIV/AIDS and HIV-1 accounts for over 95% of all infections [1,2]. HIV-1 includes four groups: M (major), O (outlier), N (non-M non-O), and P [3,4]. HIV-1 group M accounts for the vast majority of infection globally and includes nine Helsinki Declaration and was approved by the Cameroon National Ethics Committee (National Ethical Clearance #146/CNE/SE/2012, approved on 13 June 2006 and renewed on 2 May 2012), as well as the Institutional Review Board of the University of Nebraska Medical Center (IRB# 307-06-FB, approved on 26 March 2007). Written informed consent was obtained from all participants and data were processed using unique identifiers to ensure confidentiality.

DNA Sequencing and Phylogenetic Analysis
Nucleotide sequences were obtained by direct sequencing of the PCR products. Briefly, amplicons were purified using the Amicon Microcon Ultra pure kit (Centrifugal Filters Devices, Millipore, Billerica, MA, USA) according to the manufacturer's instructions. DNA sequencing was performed at the University of Nebraska Medical Center High-Throughput DNA Sequencing and Genotyping Core Facility, using a 20 µL reaction mix containing 20 ng of the purified PCR product, nested primers (12.8 pmoles forward primer or 12.8 pmoles reverse primer), and the Big-Dye chemistry method (Perkin-Elmer, Austin, TX, USA). Capillary electrophoresis was performed using an Applied Biosystems 3730 DNA sequencer (Applied Biosystems, Tokyo, Japan), and sequences were loaded into Pregap4 v.1.5 software and assembled to generate contigs [38]. Nucleotide sequences were aligned with subtype/CRF reference sequences from the Los Alamos National Laboratory (LANL) database using CLUSTAL.W integrated into Bioedit.7.2.5 software [39]. The phylogenetic tree was constructed by the neighbor-joining method with Kimura's two-parameter distance model [40] using the MEGA.v.5 software [41]. The reliability of the branching orders was assessed by bootstrap analysis, with a support threshold of 70% for subtype assignment [42,43].
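As an illustration of the distance model named above, the following minimal sketch computes Kimura's two-parameter distance between two aligned nucleotide sequences. It is not the MEGA implementation; the function name `k2p_distance` and the choice to skip gaps and ambiguity codes are assumptions made for this example.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned sequences.

    Illustrative sketch only: positions with gaps or ambiguity codes
    are skipped (an assumption, not necessarily what MEGA does).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; everything else is a transversion
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    p = transitions / compared    # proportion of transitions
    q = transversions / compared  # proportion of transversions
    # d = -1/2 * ln((1 - 2P - Q) * sqrt(1 - 2Q))
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

For highly divergent pairs the logarithm's argument can become non-positive, in which case `math.log` raises `ValueError`; a production tool would report such distances as undefined.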

Building of Consensus Sequences
Three to twenty Tat exon-1 sequences belonging to the same subtype were selected using BioEdit.7.2.5 [44] and the LANL HIV sequence database Consensus Maker tool [45], and aligned in FASTA format in the CLUSTAL.W program to obtain a consensus sequence for each subtype. At each position, nucleotides were compared and the most frequent nucleotide (50% minimal threshold) was used in the consensus. Nucleotides with a frequency below the 50% threshold were considered missing, and gaps were treated as a fifth residue [45]. For each subtype, the validity and consistency of the selected consensus sequence were further verified by alignment with other HIV-1 Tat sequences in the database.
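The consensus rule described above (most frequent residue per column, 50% minimum, gaps counted as a fifth residue) can be sketched as follows. This is a minimal illustration, not the LANL Consensus Maker; the function name `consensus` and the use of `N` for positions below the threshold are assumptions.

```python
from collections import Counter


def consensus(aligned, threshold=0.5, missing="N"):
    """Majority-rule consensus of equal-length aligned sequences.

    Gaps ('-') compete like any other residue. Columns whose most
    frequent residue falls below the threshold are reported as missing.
    """
    if not aligned:
        raise ValueError("no sequences given")
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("sequences must be aligned to the same length")
    out = []
    for column in zip(*aligned):
        residue, count = Counter(c.upper() for c in column).most_common(1)[0]
        out.append(residue if count / len(column) >= threshold else missing)
    return "".join(out)
```

When two residues each reach exactly 50% in a column, this sketch keeps whichever appears first in the input; a real tool may resolve such ties differently.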

Analysis of Recombination Events
Query sequences, consensus sequences, and reference sequences were first aligned and gaps were stripped prior to subtyping analysis. Subtyping and recombination events were verified using the NCBI genotyping tool for retroviruses [46]. To ensure data accuracy, subtyping and recombination events were further confirmed using four different statistical and bioinformatics tools: SplitsTree.4.13.1, COMET, SCUEAL, and the recombinant identification program of the HIV sequence databases. The bootscan analysis was performed in Simplot v.3.5.1 using consensus sequences for CRF01_AE, CRF22_01A1, A2, A, G, and C to predict breakpoints within the strains [47]. Reference sequences (CRF02_AG, A1, A2, CRF09_cpx, CRF22_01A1, CRF18_cpx, CRF19_cpx, U (unclassified), CRF13_cpx, CRF37_cpx, CRF11_cpx, CRF06_cpx, F1, F2, CRF36_cpx, CRF43_02G, CRF01_AE, CRF25_cpx; HIV-1 groups N, O, P and SIVcpz) were selected from the NCBI genotype tool for retroviruses database [46]. To determine the recombination events, the gene-specific window size was set at 80 bp and the step size at 20 bp.
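To make the window and step settings concrete, the sketch below slides an 80 bp window in 20 bp steps along a gap-stripped alignment and reports the percent identity of the query to each reference in every window. This is a simple similarity scan, not the bootstrapped bootscan procedure run in Simplot; the function name `window_similarity` and the dictionary layout are assumptions for illustration.

```python
def window_similarity(query, references, window=80, step=20):
    """Percent identity of a query to each reference in sliding windows.

    `references` maps a label (e.g. a subtype name) to an aligned
    sequence of the same length as `query`. Window coordinates in the
    result are 1-based and inclusive.
    """
    for name, ref in references.items():
        if len(ref) != len(query):
            raise ValueError(f"reference {name!r} is not aligned to the query")
    rows = []
    for start in range(0, len(query) - window + 1, step):
        end = start + window
        row = {"start": start + 1, "end": end}
        for name, ref in references.items():
            matches = sum(q == r for q, r in zip(query[start:end], ref[start:end]))
            row[name] = 100.0 * matches / window
        rows.append(row)
    return rows
```

A breakpoint would appear as the window range in which the best-matching reference switches from one label to another.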

Identification of Mutations
Tat exon-1 nucleotide sequences were translated into amino acid (aa) sequences using the BioEdit.7.2.5 software [44]. For all viral strains, multiple sequence alignments with their corresponding consensus sequences were made using Clustal.W integrated into Bioedit.7.2.5 [39], and sequences were analyzed using the LANL VESPA program [48]. This program displayed a table containing the mutations detected in the Cameroon HIV isolates, compared to the corresponding consensus sequences, as well as the position of each mutation. We then analyzed the allelic frequency of each mutation across all samples, as well as the extent of aa sequence conservation in each sample.
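The comparison against a consensus and the per-mutation frequency calculation can be sketched as below. This is an illustrative stand-in for the VESPA output, not its implementation; the function names are hypothetical, and positions are numbered along the aligned consensus (which assumes the consensus itself contains no gaps).

```python
from collections import Counter


def list_mutations(consensus_aa, sample_aa):
    """Return substitutions such as 'N24K' for one aligned sample."""
    if len(consensus_aa) != len(sample_aa):
        raise ValueError("sequences must be aligned to the same length")
    mutations = []
    for pos, (ref, alt) in enumerate(zip(consensus_aa, sample_aa), start=1):
        # ignore gaps and undetermined residues in either sequence
        if ref != alt and ref not in "-X" and alt not in "-X":
            mutations.append(f"{ref}{pos}{alt}")
    return mutations


def mutation_frequencies(consensus_aa, samples):
    """Percentage of samples carrying each substitution."""
    counts = Counter(m for s in samples for m in list_mutations(consensus_aa, s))
    return {m: 100.0 * n / len(samples) for m, n in counts.items()}
```

With toy sequences, `list_mutations("MEPN", "MELK")` yields `P3L` and `N4K`, the same notation used for the substitutions reported in the Results.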

Identification of Motifs and Phosphorylation Sites
The sequence of each sample and its corresponding consensus sequence were translated into aa and analyzed using the motif scan [49,50] and NetPhos.2.0 [51] programs to identify the motifs and phosphorylation sites in each sample (in comparison with its corresponding consensus sequence), and to detect the presence of unknown motifs, their positions, and match scores. The NetPhos.2.0 program was also used to identify phosphorylated aa residues, with a threshold score set at 0.500 (range 0 to 1) for prediction of phosphorylation sites.
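A pattern-based motif search of the kind described above can be approximated with regular expressions. The sketch below is not the motif scan program; the five patterns are the commonly cited PROSITE-style consensus patterns for these site types, written here from memory as an assumption, and the function name `find_motifs` is hypothetical. NetPhos scores come from a trained predictor and are not reproduced here.

```python
import re

# PROSITE-style consensus patterns translated to regular expressions.
# These are assumptions for illustration and should be checked against
# the pattern database before any real use.
MOTIFS = {
    "amidation": r".G[RK][RK]",
    "N-myristoylation": r"G[^EDRKHPFYW].{2}[STAGCN][^P]",
    "CK2": r"[ST].{2}[DE]",
    "PKC": r"[ST].[RK]",
    "PKA": r"[RK]{2}.[ST]",
}


def find_motifs(sequence, motifs=MOTIFS):
    """Return (motif name, 1-based start, matched text) for every hit.

    A lookahead is used so that overlapping matches are all reported.
    """
    hits = []
    for name, pattern in motifs.items():
        for match in re.finditer(f"(?=({pattern}))", sequence.upper()):
            hits.append((name, match.start() + 1, match.group(1)))
    return hits
```

Short patterns like these match frequently by chance, so hits indicate candidate sites only, consistent with the predicted (not experimentally confirmed) PTMs discussed in this study.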

Non-Synonymous/Synonymous Substitution Ratios (dn/ds)
The SNAP.2.1.1 program [52] was employed to determine the accumulation rate of non-synonymous base substitutions per potential non-synonymous site (dn) relative to the accumulation rate of synonymous base substitutions per potential synonymous site (ds). For each query sequence, the SNAP.2.1.1 program was used to calculate codon specific dn/ds ratios, compared to the corresponding subtype consensus sequence; and to determine the evolutionary pattern of codons and Tat regions under positive (dn/ds > 1) or negative (dn/ds < 1) selection. The average dn/ds values for all samples in each subtype were used for analyses.
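The dn and ds quantities defined above can be illustrated with a simplified counting scheme in the style of Nei and Gojobori, followed by a Jukes-Cantor correction. This sketch is not the SNAP program: it assumes the standard genetic code, counts changes to stop codons as non-synonymous, does not exclude mutation paths that pass through stop codons, and skips codons containing gaps or ambiguity codes. All function names are hypothetical.

```python
import math
from itertools import permutations, product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def syn_sites(codon):
    """Number of synonymous sites (0 to 3) in one codon."""
    total = 0.0
    for i in range(3):
        for b in BASES:
            if b != codon[i]:
                mutant = codon[:i] + b + codon[i + 1:]
                if CODON_TABLE[mutant] == CODON_TABLE[codon]:
                    total += 1 / 3
    return total


def count_diffs(c1, c2):
    """Synonymous and non-synonymous changes, averaged over all paths."""
    diff = [i for i in range(3) if c1[i] != c2[i]]
    if not diff:
        return 0.0, 0.0
    syn = non = 0.0
    paths = list(permutations(diff))
    for path in paths:
        current = c1
        for i in path:
            nxt = current[:i] + c2[i] + current[i + 1:]
            if CODON_TABLE[current] == CODON_TABLE[nxt]:
                syn += 1
            else:
                non += 1
            current = nxt
    return syn / len(paths), non / len(paths)


def jukes_cantor(p):
    """Correct a proportion of differences for multiple hits."""
    return -0.75 * math.log(1 - 4 * p / 3) if p < 0.75 else float("nan")


def dn_ds(reference, query):
    """Return (dn, ds) for two aligned, in-frame coding sequences."""
    if len(reference) != len(query) or len(reference) % 3:
        raise ValueError("sequences must be aligned and in frame")
    s_sites = s_diffs = n_diffs = 0.0
    codons = 0
    for i in range(0, len(reference), 3):
        c1, c2 = reference[i:i + 3].upper(), query[i:i + 3].upper()
        if c1 not in CODON_TABLE or c2 not in CODON_TABLE:
            continue  # skip codons with gaps or ambiguity codes
        codons += 1
        s_sites += (syn_sites(c1) + syn_sites(c2)) / 2
        sd, nd = count_diffs(c1, c2)
        s_diffs += sd
        n_diffs += nd
    n_sites = 3 * codons - s_sites
    if s_sites == 0 or n_sites == 0:
        return float("nan"), float("nan")
    return jukes_cantor(n_diffs / n_sites), jukes_cantor(s_diffs / s_sites)
```

The ratio dn/ds is then taken from the returned pair; it is undefined when ds is zero, which is common for short, closely related sequences and is one reason subtype-level averages are easier to interpret than single-sequence ratios.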

Determination of HLA-Binding Peptide Motifs
Only HLAs previously shown to occur at high frequency in the Cameroonian population (including HLA-A*0201, HLA-A*0205, HLA-B*5301, HLA-B*5801, HLA-Cw*0401, HLA-Cw*0602, HLA-Cw*0702) [53][54][55] were considered in this study. The Propred-I prediction program and HTML-II display mode [56] were used to identify promiscuous HLA-I binding sites. Because 4% is the cut-off affinity score considered sensitive and specific by the Propred-I software and HTML-II display mode [56], we considered only epitopes with binding affinity scores of at least 4%. Only subtypes that included at least 5% of the Cameroon HIV-1 isolates were considered in the analysis. However, we also performed additional analyses using a clade-B consensus sequence and the clade-B Tat vaccine sequence (GenBank accession # AAA44199.1).

Statistical Analyses
Data were analyzed by two-tailed t-test for two-group comparisons using GraphPad Prism 5.0b (GraphPad Software, La Jolla, CA, USA). The threshold of significance was 0.05.

Demographic and Clinical Characteristics of Study Subjects
We analyzed plasma samples obtained between 2008 and 2010 from 100 HIV-infected Cameroonians in Yaoundé; 34 samples were from individuals with undetectable viremia and we could not amplify HIV-1 Tat from those subjects. Sixty-six samples were from individuals with detectable viremia, and we successfully amplified and sequenced HIV-1 Tat exon-1 in 60 of those samples, 53 of which were antiretroviral therapy-naïve. Subjects' demographics and clinical characteristics are summarized in Table 1. Tat exon-1 nucleotide sequences for all 60 new clinical HIV-1 isolates analyzed in this study are available in the NCBI database; GenBank accession numbers KX360666 to KX360725.

Phylogenetic Analysis Shows High Genetic Diversity of Tat Exon-1 in Cameroon
Phylogenetic analysis of Tat exon-1 identified ten HIV-1 subtypes, with 43 subjects (71.66%) harboring CRF02_AG and 17 subjects (28.33%) with non-CRF02_AG subtypes (5 (8.33%) CRF11_cpx, 3 (5%) subtype G, 2 (3.33%) CRF01_AE, 2 (3.33%) CRF13_cpx, and 1 (1.66%) each for CRF37_cpx, CRF22_01A1, CRF18_cpx, subtype D, and CRF22_01A1/CRF01_AE) (Figure 1). The non-CRF02_AG strains identified included a URF, CRF22_01A1/CRF01_AE. Recombination analyses of this isolate, as well as a representative CRF02_AG isolate, further confirmed the identity of both the representative CRF02_AG isolate (Figure 2A) and the URF CRF22_01A1/CRF01_AE (Figure 2B). CRF22_01A1/CRF01_AE recombination was also confirmed by breakpoint analyses showing that this URF aligned with the N-terminal region of HIV-1 CRF22_01A1 and the C-terminal region of HIV-1 CRF01_AE. These findings were also supported by bootscan analyses showing, with 84% and 92% confidence, a recombination breakpoint occurring at the 120th nucleotide in CRF02_AG (Figure 2C) and CRF22_01A1/CRF01_AE (Figure 2D), respectively. Informative site analyses also showed that for the CRF22_01A1/CRF01_AE recombinant, the N-terminal moiety had values of 2, 0, and 0 for subtypes CRF22_01A1, CRF01_AE, and A2, respectively, while the C-terminal moiety had values of 0, 2, and 0 for CRF22_01A1, CRF01_AE, and A2, respectively (Figure 2D), confirming that CRF22_01A1 is in the N-terminal and CRF01_AE in the C-terminal region of this URF. Similarly, informative site analyses of CRF02_AG showed that its N-terminal moiety had values of 0, 2, and 0 for subtypes G, A, and C, respectively, and its C-terminal moiety had values of 3, 0, and 0 for G, A, and C, respectively (Figure 2C).
Figure 1. Tat exon-1 nucleotide sequences of 60 clinical HIV-1 isolates from Cameroon (NACMR IDs) were aligned using Clustal.W, and phylogenetic analysis was performed using the neighbor-joining method and MEGA.5 software as described in the Methods Section. The reference sequences were from the Los Alamos database and included HIV-1 isolates from eight countries (Cameroon, Ghana, Kenya, France, China, Thailand, Korea, and Ecuador); some references have been omitted to enable better visualization of the new Cameroon sequences (marked by "∆"). A bootstrap value of at least 70% (1000 replicates) was used to determine the HIV-1 subtype. Subject NACMR092 was infected with recombinant HIV-1 CRF01_AE/CRF22_01A1 (blue arrow). The scale bar represents 2% genetic distance.
Figure 2. For breakpoint prediction, bootscan analysis was performed using consensus sequences for subtypes G, A, C (panel C, blue, green and pink colors, respectively), or CRF22_01A1, CRF01_AE, and A2 (panel D, blue, green and purple colors, respectively). For all panels, the X-axis represents the percentage of sequence similarity to the corresponding subtype and the Y-axis represents the aa position of the sample sequenced.

Mutations in Cameroon HIV-1 Tat Functional Domains are Associated with Potential Post-Translational Modifications (PTMs)
Since mutations at specific aa residues can affect PTMs and Tat function, including viral transactivation [57,58], we analyzed the motifs and predicted PTMs in Tat functional domains of Cameroon HIV-1 isolates. Compared to consensus sequences, arginine (R) residues in the TAR-binding domain were overall conserved in Cameroon isolates (Figure 4); in the glutamine-rich region, Q residues were conserved in 77% of CRF02_AG isolates and were mostly conserved in other subtypes (Figure 4). Compared to consensus sequences, C residues in the cysteine-rich region were conserved in 93% of samples analyzed; only three CRF02_AG isolates showed mutations at C31 (C31S/A/F), and one CRF18_cpx isolate showed a C31S mutation (Figure 4). Several CRF02_AG samples showed mutations of asparagine (N) into lysine (K) in the cysteine-rich and Core regions, including N24K, N29K, and N40K in 44%, 58%, and 30% of samples, respectively (Figure 4). All subtype-G samples also showed the N24K mutation, and 42% of CRF02_AG samples had the S23N substitution (Figure 4). Compared to consensus sequences, no major mutations were observed in the Tat N-terminal region of Cameroon isolates except the P3L substitution in 63% of CRF02_AG isolates (Figure 4).
Analysis showed the presence of functional protein motifs in Tat of Cameroon HIV-1 isolates. All samples showed the presence of an N-myristoylation domain in the Core region, and CRF02_AG, CRF22_01A1, and subtype-D isolates had an additional N-myristoylation domain in the N-terminal region (Figure 4, III). All samples also showed an amidation domain spanning the Core and TAR-binding regions (Figure 4, I). A cAMP protein kinase (PK) domain (PKA) spanning the TAR-binding and glutamine-rich regions was present in Tat sequences of CRF02_AG, CRF13_cpx, CRF22_01A1, and CRF01_AE viral isolates (Figure 4, IV). A casein kinase-2 (CK2) domain was present in the glutamine-rich region of CRF02_AG, subtype-G, CRF37_cpx, and CRF01_AE Tat sequences (Figure 4, II), and CRF11_cpx isolates also showed a PKC domain in the glutamine-rich region (Figure 4, V).
Analysis also showed phosphorylation sites at serine residues (S59, S61, S62) in CRF02_AG, subtypes G and D, CRF11_cpx, CRF13_cpx, CRF18_cpx, and CRF01_AE Tat; and at a threonine residue (T58) in CRF02_AG, CRF22_01A1, and CRF01_AE Tat (Figure 4). No aa phosphorylation site was detected in CRF37_cpx Tat sequences. Interestingly, serine and threonine phosphorylation occurred only in the glutamine-rich regions; serine phosphorylation occurred only in CK2 or PKC motifs, whereas threonine phosphorylation occurred only in PKA motifs (Figure 4).

Selection Pressure and HLA-Binding Motifs in Cameroon HIV-1 Tat Sequences
Given that the dn/ds ratio can influence viral sequence evolution, viral adaptation, and disease progression [59], we analyzed the Tat dn/ds ratios of Cameroon HIV-1 isolates. Fifty-seven of the 60 (95%) HIV-1 Tat sequences analyzed showed purifying selection, with dn/ds ratios < 1 (Table 3). Tat sequences from three individuals infected with HIV-1 CRF02_AG had dn/ds ratios between 1.27 and 2.07, but the other 40 individuals with HIV-1 CRF02_AG had dn/ds ratios < 1 and the overall mean dn/ds ratio for all 43 CRF02_AG-infected subjects was below 0.4 (Table 3).
We identified and analyzed the binding affinity of HLA motifs in Tat sequences of Cameroon HIV-1 isolates, focusing on HLAs that were previously shown to occur in Cameroon populations [53][54][55], subtypes identified in at least 5% of the samples analyzed, and subtype B. For all four subtypes analyzed (CRF02_AG, CRF11_cpx, G, and B), the HLA-B*5301 allele showed epitopes that had binding affinity scores > 100; the HLA-Cw*0401 allele had epitopes with binding affinity scores of 25 to 600; and the HLA-A*0205, HLA-Cw*0401, HLA-Cw*0602, and HLA-Cw*0702 alleles had epitopes with binding affinity scores of 4 to 25 (Table 4, Figure 5). The HLA-B*5801 allele had epitopes with binding affinity scores of ≥4 only in subtype B and CRF11_cpx (Table 4). HLA alleles and epitopes present in CRF02_AG, CRF11_cpx, and subtype G Tat sequences were conserved, respectively, in 58% to 81%, 60% to 100%, and 67% to 100% of samples analyzed (Table 4). The epitope HPGSQPKTA in subtype B HLA-B*5301 allele was also present in 66% of subtype G isolates, but none of the other subtype B epitopes identified was present in the Cameroon isolates analyzed (Table 4). Subtype B data shown (Table 4 and Figure 5) were obtained using a B consensus sequence. Additional analyses using the subtype B Tat vaccine sequence (GenBank accession # AAA44199.1) gave similar results.
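The epitope conservation percentages reported above amount to asking in what fraction of isolates a given 9-mer occurs. The sketch below illustrates that calculation and the enumeration of candidate 9-mers; it does not compute binding affinity scores, which require the allele-specific matrices used by Propred-I. The function names are hypothetical, and any sequences passed in are the caller's own data.

```python
def ninemers(sequence, length=9):
    """All overlapping peptides of the given length, in order."""
    return [sequence[i:i + length] for i in range(len(sequence) - length + 1)]


def epitope_conservation(epitope, sequences):
    """Percentage of sequences that contain the epitope exactly."""
    if not sequences:
        raise ValueError("no sequences given")
    hits = sum(epitope in s for s in sequences)
    return 100.0 * hits / len(sequences)
```

For example, an epitope found in two of three isolates of a subtype would be reported as conserved in about 67% of samples; exact string matching means a single substitution within the 9-mer counts as loss of the epitope.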

Discussion
This is, to our knowledge, the first study of HIV-1 Tat sequences in Cameroon. This comprehensive molecular study showed high genetic diversity of HIV-1 Tat exon-1 among infected subjects in Cameroon, in agreement with the diversity shown for HIV-1 gag, pol, env, and nef genes in Cameroon [60][61][62][63][64][65]. Our analysis of Tat exon-1 sequences confirmed these findings and showed a predominance of CRF02_AG (71.6% of samples analyzed), similar to previous studies of env, gag, pol, and nef genes showing that HIV-1 CRF02_AG represented 48% to 74% of viral strains in Cameroon [60][61][62][63][64].
A previous analysis of env, gag and pol sequences in a sample from an infected individual in Bertoua (Eastern region of Cameroon) concluded that the individual was infected with a novel unique recombinant HIV-1 CRF22_01A1/CRF01_AE [66]. Of relevance, our current analysis of Tat sequences showed that an individual from Yaoundé (Central region of Cameroon) was infected with HIV-1 CRF22_01A1/CRF01_AE (NACMR092). This is the second report of infection with CRF22_01A1/CRF01_AE in the literature, and the fact that this mosaic virus has been identified in two different Cameroonians who had no apparent epidemiological link suggests that CRF22_01A1/CRF01_AE is spreading in Cameroon. However, these two pieces of evidence were not generated from the same genes, and subsequent full-length genome analysis would be more informative.
There have been reports of other CRF22_01A1 recombinants in Cameroon; analyses of gag, pol, and env sequences in samples from infected Cameroonians showed recombinants of CRF22_01A1 with CRF02_AG, CRF11_cpx, and clade A [66,67], confirming recombination hotspots between HIV-1 strains circulating in Cameroon. In fact, genetic recombination often occurs at hotspot regions; hotspot motifs are found at breakpoint regions and are associated with genomic instability and evolution [68,69]. Such recombination events can lead to the rearrangement of gene sequences, which in turn contributes to the wide viral heterogeneity and evolution in SSA, thus increasing viral adaptability to selective pressure. In fact, all subtypes in our study had dn/ds ratios of <1, indicating negative selection. These data show that HIV-1 genetic diversity in Cameroon may be associated with mutations and allelic purification in the viral Tat sequences that might increase viral fitness and adaptation in the human host.
Amino acid substitutions identified in Cameroon Tat sequences included N24K, N29K, and N40K in 44%, 58%, and 30% of CRF02_AG samples, respectively, and N24K in all subtype G samples. Such mutations can have significant functional implications. Lysine-associated hydrogen bonds are important for protein stability and K residues play a significant role in HIV-1 Tat transactivation. Studies of HIV-1 subtypes B, C and CRF01_AE Tat exon-1 showed that compared to subtype B Tat, there was a significant increase in viral transactivation with CRF01_AE and subtype C Tat [70]; K residues in the cysteine-rich, Core, and TAR-binding regions played an important role in this increased Tat activity and viral transactivation, and mutations of K residues to A decreased viral transactivation by 2- to 20-fold [70]. Thus, it is possible that mutations resulting in increased K residues, as shown in our studies for CRF02_AG and subtype G isolates, could result in increased LTR transactivation and viral replication in subjects infected with those subtypes.
PTMs of proteins modulate their structure and function, including their signaling and interactions with other molecules and co-factors. Thus, identifying Tat PTMs and associated motifs is critical for understanding its function in disease and future therapeutic vaccine development. In the current study, we identified five different PTM sites in Tat exon-1 of Cameroon HIV-1 isolates. This is, to our knowledge, the first study to show the presence of amidation and N-myristoylation sites in Tat proteins. It is likely that amidation plays a role in Tat function, as other studies showed that amidation of neuropeptides is important for their activity, bioavailability, and biological function [71]. Myristoylation plays an important role in protein signaling and function, as the myristoyl moiety guides the protein's subcellular localization and its interaction with cellular membranes and other proteins [72][73][74]. In fact, myristoylation has been shown to play a major role in HIV infection. Myristoylation of the matrix domain of HIV Gag and Gag-Pol precursor protein is necessary for Gag anchoring to the plasma membrane, viral assembly, and the formation of mature infectious viral particles [75,76]; and inhibiting myristoylation blocks the formation of competent virions [77,78]. Nef myristoylation is also required for the incorporation of virions into cells [79,80], and is associated with enhanced HIV replication and progression to AIDS [79,81]. Our data showing predicted myristoylation sites in Tat of Cameroon HIV isolates suggest that N-myristoylation could be playing a role in Tat function, including viral transactivation, Tat cellular uptake and cytotoxicity. Significantly, HIV-1 CRF02_AG, CRF22_01A1, and subtype D isolates had two N-myristoylation motifs whereas other subtypes had only one N-myristoylation motif in the Core region.
Considering the importance of N-myristoylation in HIV subcellular location, viral assembly, transmission and replication [75][76][77][78][79]; as well as the importance of N-myristoylation of HIV proteins in signaling, inflammation, and progression to AIDS [79][80][81]; the increased number of N-myristoylation motifs in CRF02_AG, CRF22_01A1, and subtype D isolates could suggest enhanced Tat function in those viral subtypes. This may include increased transactivation and viral replication with Tat CRF02_AG, CRF22_01A1, and subtype D compared to Tat of other HIV-1 subtypes (CRF11_cpx, CRF37_cpx, CRF13_cpx, CRF18_cpx, CRF01_AE, subtype G) that had only one N-myristoylation motif. Our future studies will test this hypothesis, and determine which glycine residues in the Tat N-myristoylation motifs (G15, G42, or G44) are myristoylated.
Other PTM sites identified in Tat sequences included a CK2 motif in four subtypes (CRF02_AG, CRF37_cpx, CRF01_AE, and subtype G), a PKA motif in four subtypes (CRF02_AG, CRF13_cpx, CRF22_01A1, and CRF01_AE), and a PKC motif in one subtype (CRF11_cpx). CK2, PKC and PKA are all serine/threonine kinases that play a major role in the phosphorylation of serine and threonine residues, cellular signaling and regulation of HIV transcription [82][83][84][85][86][87]. HIV uses cellular CK2 to phosphorylate viral proteins, and CK2 mediates the phosphorylation of Rev, Vpu, and protease at serine residues to facilitate HIV infection, syncytia formation [82][83][84][85], and disease progression in simian/human immunodeficiency virus-infected primates [88]. CK2-mediated phosphorylation of Rev serine residues influenced Vpu interaction with CD4 [89,90], and mutations in the CK2 site significantly altered Vpu biological activity [91]. There has been, to our knowledge, no previous study showing CK2 domains in HIV-1 Tat. Our current study showing serine/threonine kinase phosphorylation sites in Cameroon Tat sequences suggests that CK2 modulates the signaling and biological activity of Tat during infections with HIV-1 CRF02_AG, CRF37_cpx, CRF01_AE, and subtype G by phosphorylating S61 and/or S62; that PKA modulates the signaling and function of Tat during infections with HIV-1 CRF02_AG, CRF13_cpx, CRF22_01A1, and CRF01_AE by phosphorylating T58, T59 or S59; and that PKC modulates the signaling and biological activity of Tat during infections with HIV-1 CRF11_cpx by phosphorylating S60 and S62.
High levels of anti-Tat antibodies in subjects infected with subtype B HIV-1 are associated with better CTL immune response [26][27][28][29]92], and there have been efforts to develop a subtype B Tat-based vaccine [30][31][32][33]. Given that HLA variations among different populations would influence viral evolution, genetic diversity, immune response and the efficacy of such a Tat-based vaccine [20,93], we analyzed HLA motifs in Tat sequences of Cameroon HIV-1 isolates. Our data showed that epitopes for HLAs A*0205, B*5301, Cw*0401, Cw*0602, and Cw*0702 were present in 58% to 100% of all samples analyzed, with B*5301 epitopes having binding affinity scores > 100 in all subtypes analyzed. CTLs target specific HIV protein epitopes for immune control, and HLA alleles determine that CTL response, including recognition and binding to T-cell receptors [19,93,94]. Our data suggest that Tat-based immunogens targeting HLA B*5301 and/or Cw*0401 epitopes could work for most Cameroonians infected with HIV-1 CRF02_AG, CRF11_cpx, and subtype G; could also work in some subtype B-infected subjects; and that multi-epitope constructs based on the HLA alleles identified could work for infected subjects in Cameroon. Our subsequent studies will further investigate the frequency and affinities of these HLA epitopes in other HIV subtypes in SSA, including subtype C, which is highly prevalent in Southern and Eastern Africa. We will also analyze the relationship between HLA B*5301 and Cw*0401 epitopes and the clinical status of HIV-infected Cameroonians, as studies of subjects infected with subtype B HIV-1 showed that some HLA alleles such as HLA-B*5701 and HLA-B*27 correlate with better immune control and slower progression to AIDS [21][22][23], while other alleles such as HLA-B*5802 correlate with faster disease progression [19,20].

Conclusions
The current study is, to the best of our knowledge, the first analysis of HIV-1 Tat exon-1 in Cameroon. Our data confirm the broad HIV genetic diversity in Cameroon, with CRF02_AG predominant. Our data also show negative selection for all subtypes and the presence of CRF22_01A1/CRF01_AE, only the second report of such a recombinant in the literature. Furthermore, we showed the presence of conserved PTM motifs in Tat functional domains, including N-myristoylation, amidation, and CK2 sites. This is, to our knowledge, the first study to show N-myristoylation, amidation, and CK2 motifs in Tat sequences, and our future studies will investigate the role of these PTMs in Tat-mediated signaling and biological function. We also showed conserved Tat HLA-binding epitopes that had high frequencies and high affinity, which could be useful for future multi-epitope vaccine constructs for Cameroonian and SSA populations.