Immunomics Datasets and Tools: To Identify Potential Epitope Segments for Designing Chimeric Vaccine Candidate to Cervix Papilloma

Immunomics tools and databases play an important role in the design of prophylactic and therapeutic vaccines against pathogenic bacteria and viruses. We therefore aimed to illustrate the immunological databases and web servers used to design a chimeric vaccine candidate against human cervix papilloma. Initially, major histocompatibility complex class I and II epitopes that induce cellular immunity were predicted from the L2 protein of the papilloma 58 strain using the IEDB, NetMHC, and Tepitool tools. The overlapped segments from this analysis were then evaluated for their ability to induce interferon-gamma and humoral immunity. In addition, the allergenicity, antigenicity, cross-reactivity with the human proteome, and epitope conservancy of the elite segments were determined. The chimeric vaccine candidate (SGD58) was constructed from two overlapped peptide segments (23–36 and 29–42), adjuvants (flagellin and RS09), two Th epitopes, and amino acid linkers. Homology modeling showed that SGD58 has 88.6% of its residues in favored regions of the Ramachandran plot. Protein–protein docking with SwarmDock revealed that the complex of SGD58 with its receptor has a binding energy of −54.74 kcal/mol with more than 20 interacting residues. The docked complex was stable over 100 ns of molecular dynamics simulation. Further, the coding sequence of SGD58 also shows elevated gene expression in E. coli. In conclusion, SGD58 may serve as a vaccine candidate against cervix papilloma. This study also provides insight into vaccine design against other pathogenic microbes. Dataset: http://doi.org/10.5281/zenodo.1997695

Data 2019, 4, 31
[...] specifically infection associated with the cervix, with 250,000 mortalities every year [1]. HPV is a small, nonenveloped virus with a double-stranded DNA genome that infects the cutaneous epithelial membrane (skin or integumentary system) or the mucous membrane (i.e., the inner lining of hollow organs such as the mouth, reproductive organs, urinary tract, or rectum) of the host [2]. The genomic relationship between different cancer types has demonstrated that more than 99% of cervical cancer patients are infected with 15 different types of α-clade HPV, defined as "high-risk" or "oncogenic" genital HPVs. The α-clade HPVs 6 and 11 cause genital warts, while the remaining strains of HPV are related to the risk of cervical cancer. HPV infection is implicated in more than 50% of oropharyngeal and anogenital cancers [3]. Generally, the human immune system can clear an HPV infection within two years, but this depends on the efficiency of the individual's immune system and on the invading type of HPV. A very weak immune system, however, fails to remove the invading high-risk HPVs (hrHPV), which may lead to the development of cervical cancer [4,5]. hrHPV infections are responsible for more than 99% of precancerous cervical intraepithelial neoplasias (CIN) and invasive cervical cancers (ICC) [6][7][8].
In China, HPV-mediated cervical cancer is a substantial public health issue, with 1 million new cervical cancer incidences and 30,000 mortalities registered every year [9,10]. In 2018, clinical, epidemiological, and clinicopathological studies reported HPV58 to be the second or third most predominant genotype in precancerous CIN, which includes mild dysplasia (CIN I), mild to moderate dysplasia (CIN II), and severe dysplasia to cancer (CIN III), and in ICC lesions. Higher grades of squamous intraepithelial lesion or squamous cell carcinoma, and adenocarcinoma, were diagnosed in HPV-positive patients in different geographical regions of China [11]. Seven regions of China have reported hrHPV-mediated cervical cancer incidences: Guangdong, Liaocheng, Shanghai, Wenzhou, Wuhan, Southwestern China, and Western China [11][12][13][14][15][16][17]. Zhang et al. [18] reported that the HPV16 (6.4%) and HPV58 (5.3%) genotypes were predominantly found in recently sexually active males in Shanghai.
Cervarix®, Gardasil®, and Gardasil 9® are the three noninfectious, prophylactic HPV subunit vaccines licensed by the Food and Drug Administration (FDA) and in active use. These vaccines were developed from virus-like particles (VLPs) of the major capsid protein L1 using recombinant DNA technology. Cervarix is a bivalent vaccine produced by Baculovirus-based fermentation; it provides ~70% protection against HPV (16 and 18)-mediated cervical cancer but not against genital warts [19]. Gardasil is a quadrivalent HPV (6, 11, 16, and 18) vaccine based on yeast fermentation technology. It is used effectively for the prevention of genital warts and gives ~70% protection against cervical cancer [20]. In 2014, the FDA approved the nine-valent Gardasil 9®, which provides protection against HPV types 6, 11, 16, 18, 31, 33, 45, 52, and 58. It has been used for males and females in the age groups of 9-15 and 9-26, respectively [21]. The nine-valent vaccine exhibited a positive outcome in high-grade lesions in the absence of HPV (18 and 16) infections [22]. In October 2018, the FDA extended the use of Gardasil 9 to the age group of 27-45 for both sexes. In addition, L1 VLP-mediated vaccine production (in the absence of viral genomic material) in a eukaryotic host system (e.g., Baculovirus) is a complex and tedious process [23,24]. The main limitations of the currently available prophylactic vaccines are that they are strain specific, are not therapeutic for patients already infected with HPV22, require multiple doses, and are expensive [25,26]. In addition, effective and straightforward delivery of HPV vaccines can enhance their immunogenic potential against HPVs.
The L2 minor capsid protein is a potential alternative for HPV prophylactic vaccine production. The N-terminal region of the L2 protein is highly conserved in low-risk HPV (6 and 11) and 13 different hrHPVs, in contrast to the type-specific protection of L1 prophylactic VLPs [27]. A single copy of the L2 protein (~473 amino acids (AA)) is present in each L1 capsomere, resulting in 72 copies per virion [28]. Incidentally, the L2 protein plays a vital role in L1 assembly into VLPs and enhances the encapsidation of the double-stranded, ~8 kb circular viral genome [29]. Moreover, the full-length L2 protein or its polypeptides (1-8 or 11-200 AA in length) enhance the production of neutralizing antibodies in vaccinated experimental models including mice, cattle, and rabbits [30][31][32]. To date, no L2 VLP-derived prophylactic vaccine has been approved in clinical trials because of weak immunogenicity, which reflects the inability of L2 to multimerize into VLPs.
With this information, we aimed in this study to design a novel chimeric vaccine from the N-terminal region of the L2 sequence of HPV58 that targets hrHPVs. Immunomics tools and databases play an important role in the design of prophylactic or therapeutic vaccines against pathogenic bacteria and viruses [33][34][35]. Immunomic tools, including the immune epitope database (IEDB), NetMHCv4.0, Tepitool, CTLPred, PAComplex, IFNepitope, ABCPred, AllerTOP, AllergenFP v1.0, ANTIGENpro, the program of the protein information resource (PIR), and epitope conservancy analysis, were used to discover the overlapped epitope segments that induce B cell and T cell immunity. The chimeric vaccine (SGD58) was then constructed using the overlapped epitope segments, TLR adjuvants, Th epitopes, and amino acid linkers. The physicochemical and immunological properties of the chimeric vaccine were validated using the ProtParam, SOLpro, VaxiJen, and ANTIGENpro tools. In addition, homology modeling using iterative threading ASSEmbly refinement (I-TASSER), structural refinement (GalaxyRefine and 3Drefine), and structural validation (protein structure analysis (ProSA), Ramachandran plot, and ERRAT) were performed to obtain the best three-dimensional (3D) models of the chimeric vaccine and the target TLR5 receptor. The interaction of the chimeric vaccine with TLR5 and the stability of this complex were then determined through protein–protein (PP) docking and molecular dynamics (MD) simulation. Moreover, the virtual cloning and gene expression of the chimeric vaccine in E. coli were analyzed with a view to obtaining a low-cost HPV vaccine. All the necessary supporting information for the SGD58 design is illustrated in this study, and the original results were reported in our previous study [36].

Data Description
The specifications of the data description, namely the subject area, types of data, method of acquiring data, format, experimental factors, and source and accessibility of the data for the selection of potential epitope segments and the design of the chimeric vaccine, are given in Table 1. The overlapped epitope segments obtained from the major histocompatibility complex class I (MHC-I) prediction were compared with the results of both the CTLPred and PAComplex servers. Furthermore, the shared epitope segments obtained from CTLPred and PAComplex were used for epitope selection and vaccine design, as shown in Table 2.
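The comparison of predictor outputs described above can be illustrated with a minimal sketch. It assumes that each server's predictions have already been exported as (start position, peptide) pairs; the function names, positions, and peptides below are hypothetical placeholders and are not values from Table 2.

```python
def shared_segments(*predictions):
    """Return the (start, peptide) pairs reported by every predictor."""
    sets = [set(p) for p in predictions]
    return sorted(set.intersection(*sets)) if sets else []


def overlaps(a, b):
    """True when two (start, peptide) segments share at least one residue position."""
    a_start, a_end = a[0], a[0] + len(a[1]) - 1
    b_start, b_end = b[0], b[0] + len(b[1]) - 1
    return a_start <= b_end and b_start <= a_end


# Hypothetical outputs (1-based start, peptide) from three servers.
mhc1 = [(5, "AAAKLLMNV"), (20, "GGSTPQRWY")]
ctlpred = [(5, "AAAKLLMNV"), (40, "TTTVVVIII")]
pacomplex = [(5, "AAAKLLMNV"), (22, "STPQRWYEE")]

print(shared_segments(mhc1, ctlpred, pacomplex))  # segments found by all three
print(overlaps(mhc1[1], pacomplex[1]))            # partial positional overlap
```

Exact set intersection is the strictest reading of "shared"; the positional `overlaps` test is a looser alternative when servers report peptides of different lengths.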
Epitope segments with the lowest percentile rank and strong binding affinity to human MHC-II alleles, such as the DQB1-, DRB1-, and DPB1-restricted epitopes, were obtained using the IEDB consensus and Tepitool servers. The overlapping promiscuous epitope segments from this prediction (Table 3) were selected and evaluated for their IFN-γ production ability. The overlapped IFN-γ-producing CD4+ (MHC-II) epitope segments are given in Table 3. The shared MHC-II epitope segments could therefore produce IFN-γ against viral infection. Interestingly, the overlapped CD4+ epitopes obtained above shared the CD8+ epitope segments. In addition, Table 3 lists the overlapped B cell epitopes predicted with the ABCPred tool. Table 4 gives the comprehensive analysis of the overlapped epitopes (>=30%), positions, subsequence identity, and hrHPV. The conservation of the selected epitopes suggests cross-protection against the 15 hrHPVs, as shown in Figure 1.
The refined 3D structure obtained in the above section underwent quality validation using three tools: ProSA-web, RAMPAGE, and ERRAT. The z-score (ProSA), overall quality factor (ERRAT), and favored, allowed, and outlier regions (RAMPAGE) of the validated 3D structure are given in Table 5 for SGD58 and in Table 6 for TLR5.
Table 5. Validation of 3D structures of the designed SGD58 obtained by the iterative threading ASSEmbly refinement (I-TASSER) and its refinement by GalaxyRefine (named as I-T Gal) and 3Drefine (named as I-T 3DR). The I-T 3DR5 structure was chosen as the most appropriate model, which is shown in bold.

From an overall comparison of the results, model 3 from GalaxyRefine for SGD58 (Figure 2a) and model 5 from 3Drefine for TLR5 (Figure 2b), visualized using UCSF Chimera, were selected for further analysis. SwarmDock modeling produced a list of clusters for the SGD58 and TLR5 complex. The clusters are ranked based on the interacting residues between the complexes and their binding energies. The input TLR5 receptor contains 858 amino acids and SGD58 contains 2923 amino acids. The human TLR5 [...] In addition, the highest percentage of interacting residues (88.87%) and the lowest percentage of interacting residues (57.71%) were observed in models 1 and 9 of the SGD58 and TLR5 complex, respectively. Table 8 provides the list of the top ten clusters, binding residues, and interaction energies.
Figure 3 illustrates that the potential energy (PE), temperature, total energy (TE), and pressure of SGD58 were stable during the simulation period. The average TE of SGD58 is −7,206,525.282 with a standard deviation of 4373.407. In addition, the average PE of SGD58 is −8,984,582.127 with a standard deviation of 3472.905. PE and TE attained equilibrium at a temperature of 300 K. The result of the radius of gyration (Rg) analysis is shown in Figure 4. The simultaneous changes in the Rg plots of SGD58 (Figure 4a) and of its complex with TLR5 (Figure 4b) indicate that the substantial nature of the complex frequently increases. The compression seen in the Rg plot of SGD58 with TLR5 is similar to the trend of the RMSD parameter, which indicates the effort of SGD58 to reach its internal configuration in TLR5.
The maximal protein expression of the optimized coding sequence in the host (E. coli) was analyzed with GenScript's OptimumGene™ codon optimization tool. Figure 5 illustrates the CAI, GC content, and CFD of the gene transcript. The gene (reverse-translated coding sequence of the vaccine construct) has an ideal CAI value of 1.00 (>0.8), which makes it suitable for expression in the E. coli host. Moreover, the GC content of the gene is 59.92%, which lies within the ideal range (between 30% and 70%); values outside this range would severely inhibit the transcriptional and translational efficiency of the gene products. The CFD value of the gene is 100%, representing the highest codon frequency distribution in the preferred expression organism.
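The two sequence-level quantities discussed above can be sketched in a few lines. This is only an illustration of the standard definitions (GC percentage, and CAI as the geometric mean of per-codon relative adaptiveness); it is not the GenScript tool, and the weight table and toy gene are hypothetical values chosen for the example rather than a real E. coli codon table or the SGD58 sequence.

```python
import math


def gc_content(seq):
    """Percentage of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def cai(seq, weights):
    """Codon adaptation index: geometric mean of per-codon relative adaptiveness."""
    seq = seq.upper()
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    scored = [weights[c] for c in codons if c in weights]
    return math.exp(sum(math.log(w) for w in scored) / len(scored))


def in_reported_ranges(cai_value, gc_percent):
    """Apply the thresholds quoted in the text: CAI > 0.8 and 30% <= GC <= 70%."""
    return cai_value > 0.8 and 30.0 <= gc_percent <= 70.0


# Hypothetical toy weights (1.0 = most frequent synonymous codon); not a real table.
toy_weights = {"GCG": 1.0, "GCC": 0.5, "AAA": 1.0, "AAG": 0.25}
toy_gene = "GCGAAAGCGAAA"

print(gc_content(toy_gene))
print(cai(toy_gene, toy_weights))
print(in_reported_ranges(1.00, 59.92))  # values reported for the SGD58 gene
```

A gene built only from the most frequent codon for each amino acid has every weight equal to 1.0, which is why an optimized sequence can reach a CAI of exactly 1.00.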
We conclude that SGD58 may serve as a vaccine candidate against cervix papilloma. This study also provides insight into vaccine design against other pathogenic microbes.

Methods
The L2 protein of HPV58 (Accession No.: P26538), flagellin of Salmonella enterica serovar Dublin (Accession No.: Q06971), and human TLR5 (Accession No.: O60602) sequences were obtained from the Swiss-Prot reviewed universal protein knowledgebase (UniProt) [37]. The designed chimeric vaccine was named SGD58, using the names of the first and principal authors along with the strain number.
Two servers, IEDB and NetMHCv4.0, were used to identify major histocompatibility complex class I (MHC-I) binding epitopes from the N-terminal region of the L2 sequence. Specific human MHC-I alleles such as human leukocyte antigen (HLA)-A* (01:01, 02:01, 02:07, 11:01, 24:01), HLA-B* (46:01, 58:01), and HLA-C* (07:02, 12:03) have been frequently reported in different regions of China, including Guizhou, Henan, Taihu River Basin, Tibetan, Yunnan, Wenzhou, and Wuhan. These 11 alleles were selected for epitope prediction [38][39][40][41][42][43][44]. IEDB [45] is a freely available analysis resource with specified algorithms for the identification and determination of immunogenic epitopes. A consensus method was used to predict the MHC-I binding epitopes and their production pathway [46]. In this consensus method, three algorithms (artificial neural network, stabilized matrix method, and combinatorial peptide libraries) were combined to predict the promising CTL epitope segments. Epitope production involves proteasomal cleavage (pCle), the transporter associated with antigen processing (TAP), and the MHC-I binding pathway. The lowest percentile rank (<10%) indicated good binding efficiency of the epitopes with the restricted alleles. NetMHCv4.0 is another tool used to find MHC-I binding peptides, with the best Pearson's correlation coefficient (PCC) of 0.895, based on a combined neural network. Strong and weak binding peptides were predicted based on percentile rank thresholds of <0.5 and <2, respectively [47].
The CTLPred tool is a direct method for the prediction of CTL epitope segments instead of MHC binders. It combines artificial neural networks (ANN), trained with the Stuttgart neural network simulator (SNNS), and support vector machine (SVM) methods. The combined methods demonstrate a higher accuracy (75.8%) than the individual ANN (72.2%) and SVM (75.4%) methods. The default cutoff scores of 0.51 for ANN and 0.36 for SVM, at which the sensitivity and specificity of the predictions are nearly equal, were used to distinguish epitopes from nonepitopes [48]. The PAComplex web server allows the TCR-peptide and peptide-MHC (pMHC) interfaces to be examined and visualized. For a given viral protein query sequence, the joint Z-value was obtained with a threshold of 2.5. Moreover, it allows the selection of only a limited set of MHC class I allotypes: HLA-A0201, HLA-B (0801, 3501, 3508, and 4405), and HLA-E. The Z-value was calculated using the following formula.
Peptides with the lowest percentile rank were considered to have the highest binding affinity. Tepitool is a tool from the IEDB analysis resource that provides access to the prediction of both class I and class II binders. Peptides with the lowest percentile rank (IC50 ≤ 500 nM) are considered potential high-affinity binders [51]. IFNepitope is a server for the prediction and design of IFN-γ-inducing epitopes. IFN-γ-inducing epitopes were identified using motif-based, SVM, or hybrid algorithms. The hybrid method using residue or dipeptide composition shows 81.39% accuracy [52]. ABCPred is used to predict linear B cell epitopes. It provides 65.93% accuracy using a recurrent neural network (RNN) algorithm. It consists of 700 B cell and non-B cell epitope segment datasets, each with a length of 20 amino acids [53].
AllerTOP is the first alignment-free allergenicity server. Five machine learning methods (partial least squares, logistic regression, decision tree, naive Bayes, and k nearest neighbors (kNN, k = 1)) were implemented in it to identify allergens. It shows 88.7% accuracy, 90.7% specificity, and 86.7% sensitivity [54]. AllergenFP v1.0 is another tool for allergenicity prediction, based on a novel descriptor fingerprint approach. The twenty naturally occurring amino acids in the protein sequences are described by five descriptors (E): E1 (hydrophobicity), E2 (size), E3 (helix-forming propensity), E4 (relative abundance of amino acids), and E5 (β-strand forming propensity). Based on these, the strings are transformed into uniform vectors by ACC transformation to identify allergen proteins. It exhibits 87% accuracy, 89% specificity, and 86% sensitivity [55]. ANTIGENpro is an alignment-free, sequence-based antigenicity prediction server with 79% accuracy and an area under the curve (AUC) of 0.89. Its results are based on amino acid composition and a random-forest algorithm. The datasets were trained using 5-fold cross-validation and consist of protective antigen (193) and nonantigen (193) sequences. It predicts whether a query epitope segment is antigenic or nonantigenic, with the respective probability [56]. The presence or absence of similarity between the predicted epitope segments and the human proteome was analyzed using the peptide matching program of PIR [57]. The EC tool [58] was employed to find the degree of conservancy of the epitope segments within the set of given hrHPV L2 protein sequences. The EC analysis was performed on the selected epitope segments of HPV58 against 14 hrHPV strains (16, 18, 31, 33, 35, 39, 45, 51, 56, 59, 68, 69, 73, and 82).
The complete chimeric vaccine was designed by joining two optimized epitope segments, two TLR adjuvants, and two Th epitopes with suitable amino acid linkers. It is also necessary to determine the solubility of the designed chimeric vaccine on overexpression in E. coli. SOLpro predicts protein solubility with a two-stage SVM algorithm. It achieves an overall accuracy of 74%, estimated with standard evaluation metrics under 10-fold cross-validation. It predicts the query protein to be soluble or insoluble at p >= 0.5 [59]. A range of physicochemical characteristics of the designed chimeric vaccine was also determined with ProtParam [60]. VaxiJen is a primary server for predicting the antigenicity of an input sequence against different targets such as viruses, bacteria, fungi, parasites, and tumors. Antigenicity is calculated from the physicochemical properties of the protein sequences. Each target organism dataset contained 100 antigens and 100 nonantigens. Moreover, the models were validated using leave-one-out cross-validation (LOO-CV), giving an accuracy of 89% and an AUC of 0.964 at a threshold of 0.4 [61].
The I-TASSER tool was employed to build the 3D structures of SGD58 and TLR5. This server relies on a secondary-structure-mediated program of "Profile-Profile threading alignment (PPA) and iterative implementation of the TASSER". It has predicted a number of protein structures on request from 35 countries. For each query, the user obtains the confidence score (C-score), TM score (a topology similarity assessment of two protein structures), root-mean-square deviation (RMSD), and cluster density values. A higher C-score (ranging from −5 to +2) indicates a model with a higher confidence level [62]. Moreover, the 3D structure of the modeled protein was visualized using UCSF Chimera. Because a crystal structure of TLR5 is unavailable, we chose TLR5 (PDB ID: 3J0A) as a template model to perform the homology modeling with I-TASSER. The high C-score model of the designed vaccine from I-TASSER was further refined using the GalaxyRefine and 3Drefine tools. GalaxyRefine, a tool accessible on the GalaxyWeb server, refines the structure of a protein from the given query based on template-based modeling and refines the loop and terminus portions through ab initio modeling. In the ninth critical assessment of techniques for protein structure prediction (CASP9), the method gave optimized refinement and consistent core structures [63]. Another tool is 3Drefine, which efficiently performs iterative refinement of ~300 amino acid residues in less than 5 min. It performs post-refinement model analysis with MolProbity, random walk (RW) plus, or both. The results are visualized using the JavaScript-based molecular viewer JSmol [64]. The top five models from each tool were used for further validation. The refined 3D models from the above steps were validated using three interactive services: ProSA, Ramachandran plot analysis, and ERRAT. ProSA-web is a tool for the refinement, validation, prediction, and modeling of protein structures. It indicates the difference in the protein structures through the respective score [...]
MHC-II overlapped epitope segments predicted using (a) IEDB consensus and (b) Tepitool; IFN-γ production of the overlapped epitope segments predicted using (c) IFNepitope; overlapped B cell linear epitope segments predicted using (d) ABCPred.
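The binding cutoffs quoted in the Methods can be collected into a small filter, sketched here under the assumption that the servers' scores have already been parsed into plain numbers. The function names are hypothetical and do not correspond to any server's API; only the numeric thresholds come from the text.

```python
def classify_netmhc(rank):
    """Label a peptide by NetMHC percentile rank: <0.5 strong, <2 weak."""
    if rank < 0.5:
        return "strong"
    if rank < 2.0:
        return "weak"
    return "non-binder"


def passes_iedb_mhc1(percentile_rank):
    """IEDB consensus cutoff quoted in the text (percentile rank < 10)."""
    return percentile_rank < 10.0


def passes_mhc2_ic50(ic50_nm):
    """High-affinity cutoff quoted for Tepitool (IC50 <= 500 nM)."""
    return ic50_nm <= 500.0


# Hypothetical scores for three candidate peptides.
for rank in (0.3, 1.2, 5.0):
    print(rank, classify_netmhc(rank), passes_iedb_mhc1(rank))
```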

Figure 2. Refined 3D structure of SGD58 and mTLR5, visualized using UCSF Chimera. (a) The 3D structure of SGD58 was obtained through homology modeling with I-TASSER, and the best proposed model was then refined using GalaxyRefine. (b) The 3D structure of the mouse TLR5 was obtained through homology modeling with I-TASSER, and the best proposed model was then refined using 3Drefine.

Figure 4. (a) Rg plot of the vaccine molecule and (b) Rg plot of the vaccine molecule in complex.
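For readers unfamiliar with the quantity plotted in Figure 4, the radius of gyration of a single frame can be computed from atomic coordinates as the mass-weighted root-mean-square distance from the center of mass. The sketch below shows only that textbook definition; the function name and the toy coordinates are hypothetical, and the values in Figure 4 come from the MD software used by the authors, not from this code.

```python
import math


def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration of a set of 3D points (same units as coords)."""
    if masses is None:
        masses = [1.0] * len(coords)  # assumption: equal masses when none are given
    total = sum(masses)
    center = [sum(m * c[k] for m, c in zip(masses, coords)) / total for k in range(3)]
    weighted_sq = sum(
        m * sum((c[k] - center[k]) ** 2 for k in range(3))
        for m, c in zip(masses, coords)
    )
    return math.sqrt(weighted_sq / total)


# Toy frame: four equal-mass points at unit distance from the origin.
frame = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, -1.0, 0.0)]
print(radius_of_gyration(frame))
```

Evaluating this function on every frame of a trajectory gives an Rg time series; a roughly flat series suggests that the overall compactness of the structure is maintained.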

Figure 5. Codon optimization and in silico cloning of the gene. (a) The gene (reverse-translated coding sequence of the vaccine construct) has an ideal CAI value of 1.00 (>0.8), which makes it suitable for high expression in the E. coli host organism. (b) The GC content of the gene is 59.92%, which is within the ideal range (between 30% and 70%). (c) The CFD value of the gene is 100%. The value of 100 is set for the codon with the highest usage frequency for a given amino acid in the desired expression organism.

Table 2. Overlapped epitope segments of major histocompatibility complex class I (MHC-I), CTL, and TCR from the N-terminal region of HPV58, predicted using different servers.

Table 3. Overlapped epitope segments of MHC-II, IFN-γ-producing, and B cell epitopes from the N-terminal region of HPV58, predicted using different servers.
Residues that are different from their corresponding residues in the reference sequence are highlighted in bold with gray shadow.Identity indicates the number (%) of residues in the homologous sequences that are identical to the corresponding residues in the reference sequence.
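The identity value defined in this footnote can be reproduced with a short calculation, sketched here under the assumption that the homologous segment has already been aligned to the reference at equal length. The function names and the two peptides are hypothetical examples and are not sequences from Table 4.

```python
def percent_identity(reference, homolog):
    """Percentage of aligned positions where the homolog matches the reference."""
    if len(reference) != len(homolog):
        raise ValueError("sequences must be pre-aligned to equal length")
    matches = sum(r == h for r, h in zip(reference, homolog))
    return 100.0 * matches / len(reference)


def differing_positions(reference, homolog):
    """1-based positions at which the homolog differs from the reference."""
    return [i for i, (r, h) in enumerate(zip(reference, homolog), start=1) if r != h]


# Hypothetical 10-residue reference segment and one homolog.
ref = "ACDEFGHIKL"
hom = "ACDEYGHIKV"

print(percent_identity(ref, hom))     # identity in percent
print(differing_positions(ref, hom))  # residues that would be highlighted
```

Comparing the returned percentage with a cutoff such as the >=30% used for Table 4 then decides whether a segment counts as conserved in a given hrHPV type.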

Table 6. Validation of 3D structures of TLR5 obtained by I-TASSER and its refinement by GalaxyRefine (named as I-T Gal) and 3Drefine (named as I-T 3DR).

Table 7 lists the respective amino acid, residue contact number, propensity, and DiscoTope score of the predicted B cell epitopes.

Table 7. Discontinuous B cell epitopes identified in the refined 3D structure of the designed vaccine construct of HPV58 using DiscoTope 2.0.

Table 8. List of the top ten clusters, binding residues, and interaction energies of the SGD58 and TLR5 complex.