Pan-Genome Analysis of Oral Bacterial Pathogens to Predict a Potential Novel Multi-Epitopes Vaccine Candidate

Porphyromonas gingivalis is a Gram-negative anaerobic bacterium that is mainly present in the oral cavity and causes periodontal infections. Currently, no licensed vaccine is available against P. gingivalis or other oral bacterial pathogens. To develop a vaccine against P. gingivalis, we applied a bacterial pan-genome analysis (BPGA) to the bacterial genomes, which retrieved a total of 4908 core proteins; these were then used to identify good vaccine candidates. After several vaccine candidacy analyses, three proteins, namely lytic transglycosylase domain-containing protein, FKBP-type peptidyl-prolyl cis-trans isomerase and superoxide dismutase, were shortlisted for epitope prediction. In the epitope prediction phase, different types of B- and T-cell epitopes were predicted, and only those with an antigenic, immunogenic, non-allergenic, and non-toxic profile were selected. Moreover, the selected epitopes were joined to each other to make a multi-epitopes vaccine construct, which was further linked to the cholera toxin B-subunit to enhance the antigenicity of the vaccine. For downstream analysis, a three-dimensional structure of the designed vaccine was modeled. The modeled structure was checked for binding potency with major histocompatibility complex I (MHC-I), major histocompatibility complex II (MHC-II), and Toll-like receptor 4 (TLR-4) immune cell receptors, which revealed that the designed vaccine bound properly to the immune cell receptors. Additionally, the binding efficacy of the vaccine was validated through a molecular dynamics simulation, which indicated strong intermolecular vaccine–receptor binding and confirmed that the vaccine epitopes were exposed to the host immune system. In conclusion, the study suggests that the model vaccine construct has the potential to generate protective host immune responses and that it might be a good vaccine candidate for experimental in vivo and in vitro studies.


Introduction
Antibiotics are drugs used to hinder the growth of bacteria or kill them. The indiscriminate use of antibiotics pushes bacteria to evolve novel antibiotic resistance mechanisms, resulting in economic losses and high mortality [1,2]. The increasing resistance of microbes towards antibiotics is the greatest challenge for mankind. Antibiotic resistance causes 33,000 deaths each year. In Thailand, antibiotic resistance causes almost 38,000 deaths and in the US, the death toll is about 23,000 [3]. The total estimated number of deaths caused by antimicrobial resistance per year is 70,000 [4,5]. The problem of resistance arises due to the overuse and inappropriate use of antibiotics [6]. To combat antibiotic-resistant bacterial pathogens, the development of antibodies (immunotherapeutics) that specifically target infectious pathogens is an attractive technique [7]. On the other hand, the transmission of antibiotic-resistant pathogens can be controlled by developing vaccines that provide acquired active immunity to the host [8]. As there is limited availability of vaccines against health-associated infections (HAIs), there is an urgent need to accelerate and open new avenues for advancing vaccine development [9]. A safe, specific, and potent vaccine can thus be tagged as "the need of the hour" [7,10,11]. For combating antibiotic resistance, methods that nurture the human immune system by immunological and immunotherapeutic mediation [12] are an attractive and effective approach. The therapeutic solutions used for the treatment of pathogens are limited because these pathogens evade innate defenses [13]. For combating a disease, both drugs and vaccines are used, and resistance develops against both. However, bacterial pathogens develop resistance to drugs quickly, while resistance to vaccines is rare [14].
One reason for drug resistance is the therapeutic nature of drugs, which are given after infection, whereas a vaccine is prophylactic and is given before infection as a preventive measure. Another reason is that drugs act against very few targets, while vaccines act against various targets [15].
Porphyromonas gingivalis is an oral bacterial pathogen and is responsible for chronic periodontitis [16]. P. gingivalis, a Gram-negative, black-pigmented anaerobic rod residing in subgingival biofilms, is a causative agent of periodontal diseases along with other oral microorganisms. This bacterium has additionally been thought to cause coronary illness, stroke, and diabetes mellitus [16][17][18]. Periodontal disease is initiated by oral bacteria perturbing epithelial cells, thereby triggering innate, inflammatory, and adaptive immune responses. More than 500 bacterial species interact with human tissues in the human oral cavity. Of these species, Treponema denticola, Aggregatibacter actinomycetemcomitans, P. gingivalis, Tannerella forsythia, Campylobacter rectus and Fusobacterium nucleatum are associated with periodontitis [19]. No vaccine is currently licensed against periodontal disease; however, several efforts are underway to develop one [20]. Herein, reverse vaccinology was integrated with subtractive proteomics to prioritize potential vaccine candidates in the proteome of oral pathogens, especially P. gingivalis, followed by epitope mapping using immunoinformatic techniques. Further, biophysics approaches including molecular modelling and molecular dynamics simulation were employed to probe the interactions of the designed vaccine ensemble with the innate immune receptors and to understand vaccine-immune receptor dynamics in solution. We hypothesized that the vaccine will be helpful for experimentalists in vaccine development against oral pathogens. The prime significance of the study is to provide a platform for vaccinologists to make use of the in silico-based vaccine in experimental in vivo and in vitro studies to disclose the real immune protection efficacy of the vaccine. This will shorten the vaccine development period and will save on the associated cost of vaccine development.
From the user perspective, it will lower the burden of antibiotic resistance and improve human health in general.

Research Methodology
For designing a multi-epitopes vaccine, the following methodology flow was used as given in Figure 1.

Subtractive Proteome and Reverse Vaccinology Phase
The pathogen proteome data were extracted from the genome database of NCBI [21]. At the time of the research, we retrieved three completely sequenced genomes of P. gingivalis. Potential vaccine candidates were identified using the filters and methods discussed in [22][23][24][25][26].

Pre-Screening Phase
BLASTp was used to predict host non-similar proteins as well as pathogen-specific proteins [27]. Essential proteins of the pathogens were identified using the database of essential genes (DEG) [28].
The screening of vaccine proteins was performed according to the following criteria: (i) conservation in the sequenced strains of the pathogen [29]; (ii) absence from the human host [30]; (iii) critical for the growth of bacteria [31]; (iv) exposed to the host environment [25]; and (v) non-redundant and part of the core genome [24]. For the identification of conserved proteins, a bacterial pan-genome analysis tool was employed [32]. BLASTp was performed with the following thresholds: a sequence identity greater than 30%, an E-value smaller than 1.0E−5, and a bit score greater than 100 [33]. Using BLASTp, we checked the similarity of these sequences with normal flora, i.e., Lactobacillus rhamnosus, Lactobacillus casei, Lactobacillus johnsonii and Bacteroides (oral normal flora), and found no similarity. Essential proteins of the pathogens were identified using the DEG database [28]; the essential proteins were those that fulfilled the thresholds discussed above [34].
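The three BLASTp thresholds above can be applied mechanically to a hit table. The following is a minimal sketch, assuming the hits are available in the default 12-column BLAST+ tabular layout (percent identity in column 3, E-value in column 11, bit score in column 12); the function name and the example rows are hypothetical and are not taken from the study.

```python
import csv
import io

# Thresholds quoted in the text.
MIN_IDENTITY = 30.0   # percent identity must exceed this value
MAX_EVALUE = 1.0e-5   # E-value must be smaller than this value
MIN_BITSCORE = 100.0  # bit score must exceed this value


def significant_hits(tabular_text):
    """Return (query, subject) pairs that pass all three thresholds.

    Assumes the default 12-column tabular layout of BLAST+:
    qseqid sseqid pident length mismatch gapopen qstart qend
    sstart send evalue bitscore.
    """
    hits = []
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        identity = float(row[2])
        evalue = float(row[10])
        bitscore = float(row[11])
        if identity > MIN_IDENTITY and evalue < MAX_EVALUE and bitscore > MIN_BITSCORE:
            hits.append((row[0], row[1]))
    return hits


# Hypothetical rows with made-up identifiers and scores.
EXAMPLE = (
    "protA\thit1\t45.2\t210\t100\t5\t1\t210\t3\t212\t2e-40\t150.0\n"
    "protB\thit2\t28.0\t180\t120\t4\t1\t180\t1\t180\t1e-08\t110.0\n"
    "protC\thit3\t52.0\t60\t28\t1\t1\t60\t1\t60\t3e-04\t45.0\n"
)

if __name__ == "__main__":
    print(significant_hits(EXAMPLE))
```

In a host or normal-flora screen, a protein with any hit passing these thresholds would be treated as similar and dropped; in the DEG and VFDB screens, a passing hit would instead be the reason to keep it.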

Cluster Database at High Identity with Tolerance (CD-HIT) Analysis
Redundant proteins are excluded from the core set because they are not regarded as good vaccine targets [35], whereas non-redundant proteins are considered good vaccine candidates [36]. Redundant proteins were discarded from the proteomes using CD-HIT with a sequence identity threshold of 50% [37].

Sub-Cellular Localization Phase
Next, we analyzed the subcellular localization of the essential proteome [38] using PSORTb 3.0 [39]. PSORTb is an online web resource commonly used to predict the subcellular localization of proteins. Proteins localized in the outer membrane, extracellular and periplasmic regions are regarded as good vaccine targets as they come into direct contact with the host cells and contain multiple antigenic determinants [40].

Vaccine Candidate's Prioritization Phase
In this step, surface proteins involved in pathogen disease development and progression were first identified [41]. To select such proteins, BLASTp was performed against the virulence factor database (VFDB) (http://www.mgc.ac.cn/VFs/, accessed on 15 September 2022). The parameters used in the check were a sequence similarity of >30% and a bit score of >100 [42].

Physiochemical Properties Analysis
Using the online tool ProtParam [43], physicochemical properties such as the instability index, molecular weight, theoretical pI, number of amino acids, grand average of hydropathy, and aliphatic index of the selected virulent proteins were determined. Proteins with an instability index greater than 40 were deemed unstable and discarded [40]. Similarly, proteins were considered good vaccine targets if they had a molecular weight smaller than 110 kDa [22].
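Two of the listed properties have simple closed-form definitions, sketched below. The grand average of hydropathy (GRAVY) is the mean Kyte-Doolittle hydropathy over all residues, and the aliphatic index is computed from the mole percentages of Ala, Val, Ile and Leu with the usual coefficients 2.9 and 3.9. This is an independent illustration of those definitions, not ProtParam's code; the function names are hypothetical, and the pass/fail helper simply encodes the two cut-offs quoted above.

```python
# Kyte-Doolittle hydropathy scale for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Grand average of hydropathy: mean hydropathy value per residue."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


def aliphatic_index(sequence):
    """Aliphatic index from mole percentages of Ala, Val, Ile and Leu."""
    if not sequence:
        raise ValueError("sequence must not be empty")

    def mole_percent(aa):
        return 100.0 * sequence.count(aa) / len(sequence)

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


def passes_physicochemical_filters(instability_index, molecular_weight_kda):
    """Apply the two cut-offs from the text: instability index not above 40
    and molecular weight below 110 kDa. Both inputs are assumed to be
    precomputed, e.g. read from a ProtParam report."""
    return instability_index <= 40.0 and molecular_weight_kda < 110.0


if __name__ == "__main__":
    toy = "AVILAVIL"  # hypothetical toy sequence
    print(gravy(toy), aliphatic_index(toy))
    print(passes_physicochemical_filters(35.0, 50.0))
```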

Analysis of Transmembrane Helices
The selected proteins were further analyzed for transmembrane helices, and only proteins with 0 or 1 predicted helices were selected [22,24]. Proteins with a low number of transmembrane helices are easy to purify during experimental investigation [44]. The analysis of transmembrane helices was performed using the online tools HMMTOP [45] and TMHMM 2.0 [46].

Antigenicity, Allergenicity, and Adhesion Probability Prediction
The antigenicity of the proteins was checked using VaxiJen [47,48]. Only proteins with an antigenicity value higher than 0.4 were selected. The allergenicity of the proteins was assessed using AllerTOP 2.0 [49]. To obtain a good vaccine candidate, adhesion probability was checked via Vaxign 2.0 [50].

Prediction of Immune Cell Epitopes
The immune epitope database (IEDB) server was used to predict B-cell and T-cell epitopes [51]. B-cell epitopes were predicted with the BepiPred 2.0 linear epitope prediction server [52], and only epitopes passing the cut-off score of 0.5 were selected. The T-cell epitopes were in turn predicted from the B-cell epitopes using the IEDB MHC-I and MHC-II servers [22,53]. In both the MHC-I and MHC-II epitope predictions, the reference set of alleles available in the IEDB database was used. The common MHC epitopes with a low percentile score were chosen for further processing.
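The cut-off step can be pictured as extracting contiguous runs of residues whose per-residue score reaches the threshold. The sketch below assumes one score per residue and treats "passing the cut-off" as a score of at least 0.5; the function name, the minimum-length parameter and the example scores are hypothetical and do not reproduce the server's own post-processing.

```python
def epitope_segments(sequence, scores, threshold=0.5, min_length=1):
    """Return (start, end, peptide) for each contiguous run of residues
    whose score is >= threshold. Positions are 1-based and inclusive."""
    if len(sequence) != len(scores):
        raise ValueError("sequence and scores must have the same length")
    segments = []
    start = None  # 0-based index where the current run began
    for i, score in enumerate(scores):
        if score >= threshold:
            if start is None:
                start = i
        elif start is not None:
            if i - start >= min_length:
                segments.append((start + 1, i, sequence[start:i]))
            start = None
    # Close a run that extends to the end of the sequence.
    if start is not None and len(scores) - start >= min_length:
        segments.append((start + 1, len(scores), sequence[start:]))
    return segments


if __name__ == "__main__":
    seq = "MKTAYIAKQR"  # hypothetical peptide
    per_residue = [0.2, 0.6, 0.7, 0.55, 0.3, 0.1, 0.5, 0.9, 0.8, 0.4]
    print(epitope_segments(seq, per_residue))
```

Raising `min_length` discards very short runs, which is a common practical choice because short stretches make poor inputs for the downstream MHC binding predictions.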

MHCPred Analysis
To perform the MHCPred analysis, DRB*0101 was chosen as the receptor allele because it is highly prevalent in human populations [23]. In this analysis, we selected only those B-cell-derived T-cell epitopes with IC50 values smaller than 100 nM for DRB*0101 [54]. Once the set of epitopes was finalized, the epitopes were searched with BLASTp against T. denticola (tax id: 158), A. actinomycetemcomitans (tax id: 714), T. forsythia (tax id: 28112), C. rectus (tax id: 203) and F. nucleatum (tax id: 851) to check conservation among oral bacterial species.

Multi-Epitopes Vaccine Design
Peptide vaccines are weakly immunogenic, a limitation that can be overcome by designing a multi-epitopes vaccine [55,56]. Using linkers such as GPGPG, antigenic epitopes can be linked to each other and to the B-subunit of cholera toxin to form a multi-epitope vaccine construct [57]. The EAAAK linker was used to link the cholera toxin B-subunit to the N-terminus of the multi-epitope peptide. The 3Dpro tool of the SCRATCH protein server was used to predict the structure of the designed vaccine [58]. Vaccine-structure modeling was performed ab initio as no appropriate template was available.
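The assembly of the construct amounts to string concatenation, as the minimal sketch below shows. It assumes the layout adjuvant, EAAAK linker, then the epitopes joined by GPGPG; the exact epitope order and any additional linker between EAAAK and the first epitope are not specified here, so this layout is an assumption. The adjuvant and epitope strings are hypothetical placeholders, not the real cholera toxin B-subunit or the shortlisted epitopes.

```python
ADJUVANT_LINKER = "EAAAK"  # rigid linker between adjuvant and epitopes
EPITOPE_LINKER = "GPGPG"   # flexible linker between consecutive epitopes


def build_construct(adjuvant, epitopes,
                    adjuvant_linker=ADJUVANT_LINKER,
                    epitope_linker=EPITOPE_LINKER):
    """Concatenate adjuvant + adjuvant linker + linker-joined epitopes.

    The adjuvant sits at the N-terminus (the left end of the string).
    """
    if not epitopes:
        raise ValueError("at least one epitope is required")
    return adjuvant + adjuvant_linker + epitope_linker.join(epitopes)


if __name__ == "__main__":
    # Placeholder sequences for illustration only.
    placeholder_adjuvant = "MKTL"
    placeholder_epitopes = ["AAAKAAA", "CCCDCCC", "EEEFEEE"]
    construct = build_construct(placeholder_adjuvant, placeholder_epitopes)
    print(construct, len(construct))
```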

Loop Modeling and Vaccine Refinement
Loop modeling was performed using the GalaxyLoop server [59]. Refinement was performed using the GalaxyRefine server [60], which lowers the global energy and reduces the errors in the 3D structure.

Disulfide Engineering and Codon Optimization
To improve stability, disulfide bonds were introduced into the vaccine using Disulfide by Design 2.0 [2,61]. To ensure the maximum expression of the vaccine in Escherichia coli, in silico codon optimization was performed using the Java Codon Adaptation tool (JCat) [62]. A vaccine candidate with a good GC content and codon adaptation index (CAI) value can be considered to have a good expression level in E. coli.
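Both expression indicators have short definitions that can be sketched directly. GC content is the percentage of G and C bases, and the CAI is the geometric mean of the relative adaptiveness weights of the codons in the sequence. The code below illustrates those definitions only and is not JCat's implementation; the function names are hypothetical, and the weight table in the example is made up rather than a real E. coli codon usage table.

```python
import math


def gc_content(dna):
    """Percentage of G and C bases in a DNA sequence."""
    dna = dna.upper()
    if not dna:
        raise ValueError("sequence must not be empty")
    return 100.0 * (dna.count("G") + dna.count("C")) / len(dna)


def codon_adaptation_index(dna, weights):
    """Geometric mean of relative adaptiveness weights over the codons.

    `weights` maps codon -> relative adaptiveness in (0, 1]. Codons that
    are missing from the table are skipped, which is one common convention.
    """
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    codons = [dna[i:i + 3] for i in range(0, usable, 3)]
    logs = [math.log(weights[c]) for c in codons if c in weights]
    if not logs:
        raise ValueError("no codon in the sequence has a weight")
    return math.exp(sum(logs) / len(logs))


if __name__ == "__main__":
    toy_weights = {"ATG": 1.0, "GCT": 0.25}  # hypothetical weights
    print(gc_content("ATGGCT"))
    print(codon_adaptation_index("ATGGCT", toy_weights))
```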

Docking and Refinement
In this step, the designed chimeric vaccine was docked with immune receptors [63,64]. The vaccine was blindly docked with the TLR-4 (PDB ID: 4G8A), MHC-I (PDB ID: 1L1Y), and MHC-II (PDB ID: 1KG0) receptors [65] using the PatchDock online server [66]. TLR-4 was selected because it plays a key role in host defense against Gram-negative bacteria. It activates the NF-κB signaling pathway and inflammatory cytokine production, which lead to the activation of innate immunity and, in turn, to adaptive immune responses against P. gingivalis [67,68]. The server provided 20 docking solutions, each of which was assigned a global binding energy. Subsequently, the complexes were refined with FireDock [69] to re-score the docked solutions after extensive refinement. In each case, the complex with the lowest global binding energy was selected for the analysis of intermolecular interactions and binding conformation using UCSF Chimera 1.13.1 [70].

Molecular Dynamics Simulation
The dynamic behavior of vaccine-immune receptor complexes can be investigated through in silico methods such as molecular dynamics simulation. Complexes were selected for the molecular dynamics simulation phase based on their global energy values. The analysis was performed using the AMBER20 simulation software on a time scale of 200 ns. The process consists of three steps, i.e., preparation of the system, pre-processing, and production [71], and was carried out using the AMBER SANDER module [72]. The complexes were solvated in a TIP3P solvation box with a padding distance of 12 Å [73]. The complexes were first heated to 300 K, followed by equilibration for 1 ns. This was followed by a production run of 250 ns, and each trajectory file was saved at a time interval of 10 ns. The SHAKE algorithm [74] was used to constrain bonds involving hydrogen atoms, while Langevin dynamics was used for temperature control. The CPPTRAJ module [75] was used for trajectory analysis, and Xmgrace [76] was used for plotting graphs.

MM-GBSA Binding Free Energies
Binding free energies of the vaccine-immune receptor complex were calculated using MMPBSA.py of AMBER20 [77]. About 100 frames were evaluated for free energies. The analysis estimated the difference between the binding free energies of complexes in unsolvated and solvated phases [78].
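For orientation, the quantities involved are usually written as follows. This is the standard textbook form of the MM-GBSA decomposition, given as a sketch rather than as the exact protocol used here; in particular, whether the entropy term was evaluated is not stated above, and it is frequently omitted in practice. The averages run over the sampled frames (about 100 in this case).

```latex
\begin{align}
\Delta G_{\mathrm{bind}} &= \langle G_{\mathrm{complex}} \rangle
  - \left( \langle G_{\mathrm{receptor}} \rangle + \langle G_{\mathrm{vaccine}} \rangle \right) \\
G &= E_{\mathrm{MM}} + G_{\mathrm{solv}} - T S \\
E_{\mathrm{MM}} &= E_{\mathrm{int}} + E_{\mathrm{ele}} + E_{\mathrm{vdW}} \\
G_{\mathrm{solv}} &= G_{\mathrm{GB}} + G_{\mathrm{SA}}
\end{align}
```

Here $E_{\mathrm{MM}}$ is the gas-phase molecular mechanics energy (internal, electrostatic and van der Waals terms), $G_{\mathrm{GB}}$ is the polar solvation energy from the generalized Born model, and $G_{\mathrm{SA}}$ is the non-polar contribution estimated from the solvent-accessible surface area.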

Immune Simulation
The immune-response profile of the designed vaccine was further characterized using the C-ImmSim server (http://150.146.2.1/C-IMMSIM/index.php, accessed on 28 September 2021), an agent-based server for in silico simulation of the immune system response to a vaccine antigen [79]. The server uses a position-specific scoring matrix and machine learning approaches to study immune interactions. It simulates three compartments, namely the bone marrow, the thymus and the lymph nodes. During the analysis, the simulation parameters were kept at their default values. Three injections were delivered at intervals of four weeks. The time steps used were 1, 84 and 168 [80].
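The relation between the three time steps and the four-week spacing can be checked with a short calculation. The sketch below assumes that one simulation time step corresponds to 8 hours of real time; that conversion is an assumption not stated above and should be checked against the server documentation. The function names are hypothetical.

```python
HOURS_PER_STEP = 8  # assumption: one simulation time step = 8 h of real time


def injection_days(steps, hours_per_step=HOURS_PER_STEP):
    """Convert 1-based time steps to days elapsed since the first step."""
    return [(step - 1) * hours_per_step / 24 for step in steps]


def intervals_in_weeks(steps, hours_per_step=HOURS_PER_STEP):
    """Spacing between consecutive injections, expressed in weeks."""
    days = injection_days(steps, hours_per_step)
    return [(later - earlier) / 7 for earlier, later in zip(days, days[1:])]


if __name__ == "__main__":
    schedule = [1, 84, 168]  # time steps quoted in the text
    print(injection_days(schedule))
    print(intervals_in_weeks(schedule))
```

Under that assumption, the gaps of 83 and 84 steps correspond to roughly 27.7 and 28 days, which is consistent with the stated four-week interval.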

Genomes Retrieval of P. gingivalis
For the development of a multi-epitope-based vaccine, we required completely sequenced genomes. Three completely sequenced genomes of P. gingivalis were retrieved from the NCBI genome database (https://www.ncbi.nlm.nih.gov/genome/714, accessed on 25 October 2021). The genome size of the strains ranged from 2.34 Mb to 2.35 Mb, with an average GC content of 48.5%. The number of genes in each strain was about 1542. Table 1 lists the strain type, the genome size and the percent GC content.

Bacterial Pan-Genome Analysis
In the next step, we performed a bacterial pan-genome analysis to obtain the core genome for downstream steps. The pan-genome analysis generated the core genome and the accessory genomes [81]. Core genes are common among the genomes of a species and are used to better understand genome evolution, gene orthology, genome complexity and the mining of pathogenic and therapeutic sequences. The pan-genome encompasses the genomic sequences of all strains, while the core genome is the set of sequences common to all strains. The accessory genomes represent the set of sequences present in one or more strains but not in all strains; these are also called accessory or dispensable proteomes. Unique genes are present in only one strain, are strain-specific, and are also called singletons. The core genome contains those proteins which are conserved across the strains. The genome size of the strains is presented in Figure 2A, while a pan-phylogeny tree of P. gingivalis is provided in Figure 2B. The number of core proteins was 4908, which contained both redundant and non-redundant sequences.
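These definitions map directly onto set operations. The sketch below follows the wording above: the pan-genome is the union over strains, the core genome is the intersection, the accessory set is everything in the pan-genome that is not core, and unique genes are those found in exactly one strain. It assumes genes have already been grouped into ortholog families with shared identifiers, which is the hard part that a tool such as BPGA handles; the strain and gene names are hypothetical.

```python
def partition_pangenome(strain_genes):
    """Split gene families into pan, core, accessory and unique sets.

    `strain_genes` maps a strain name to an iterable of gene-family
    identifiers present in that strain.
    """
    gene_sets = [set(genes) for genes in strain_genes.values()]
    if not gene_sets:
        raise ValueError("at least one strain is required")
    pan = set().union(*gene_sets)          # present in any strain
    core = set.intersection(*gene_sets)    # present in every strain
    accessory = pan - core                 # present in some, not all
    unique = {gene for gene in pan
              if sum(gene in genes for genes in gene_sets) == 1}
    return {"pan": pan, "core": core, "accessory": accessory, "unique": unique}


if __name__ == "__main__":
    # Hypothetical toy input with three strains.
    toy = {
        "strain_1": ["g1", "g2", "g3"],
        "strain_2": ["g1", "g2", "g4"],
        "strain_3": ["g1", "g5"],
    }
    for name, genes in partition_pangenome(toy).items():
        print(name, sorted(genes))
```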

CD-HIT Analysis
In total, 1552 non-redundant proteins and 3356 redundant proteins were identified in the core genome of the pathogen as shown in Figure 3A. The redundant proteins were removed as they were duplicated sequences and thus not considered as good vaccine candidates. The non-redundant proteins were processed further [82].

Proteins Subcellular Localization
Proteins located at the pathogen surface, i.e., in the periplasmic, extracellular and outer membrane regions, can be easily recognized by the host immune system. In total, sixty proteins were found at the pathogen surface, of which sixteen were periplasmic proteins, thirty-nine were outer membrane proteins and five were extracellular proteins, as shown in Figure 3B.

VFDB Analysis
In total, six virulent proteins were identified as per the criteria defined in the Methodology (Section 2). Of these, two were outer membrane proteins, three were periplasmic proteins and one was an extracellular protein, as shown in Table 2. Virulent proteins can act as attractive vaccine targets as they can stimulate immune pathways, resulting in the improved production of safe immune responses. As only patches of such proteins are used in vaccine design, it is safe to use them in vaccine formulations without producing any detrimental effects on human cells.

Transmembrane Helices and Physiochemical Analysis
In the transmembrane helices analysis, only proteins that harbored 0 or 1 transmembrane helices were selected. In this analysis, two of the six proteins were removed. Proteins with a smaller number of transmembrane helices are easier to analyze experimentally because they are easier to clone and express. In the physicochemical analysis, one protein was discarded because its molecular weight exceeded the threshold value. Three vaccine target proteins were therefore shortlisted. The results of the physicochemical properties analysis of all six proteins are shown in Table 3.

Similarity with Human Genome and Prediction of Antigenicity and Allergenicity
After the physicochemical analysis, the shortlisted proteins were subjected to a homology check. The proteins used in vaccine design must not be homologous to host proteins, as homologous proteins can cause autoimmune diseases. Only those proteins that were not homologous to the human genome were carried forward. Similarly, only proteins that were antigenic and non-allergenic were selected. The host non-similar, antigenic and non-allergenic proteins are shown in Table 3.

Homology Check of Normal Flora
The three selected proteins shown in Table 4 also showed no homology to the human normal flora. The bacteria used were L. casei, L. johnsonii, L. rhamnosus and Bacteroides (oral normal flora). This analysis helped in selecting proteins that avoid the accidental inhibition of the host normal flora.

B-Cell Epitopes Prediction
After passing through all the essential filters required for a good vaccine candidate, the three selected proteins were subjected to the epitope prediction phase, in which B-cell and T-cell epitopes were predicted using the IEDB server [83]. First, B-cell epitopes were predicted. A total of nine B-cell epitopes were predicted for the lytic transglycosylase domain-containing protein, five for the superoxide dismutase protein, and six for the FKBP-type peptidyl-prolyl cis-trans isomerase, as tabulated in Table S1.

MHC-I and MHC-II Epitopes Prediction
The T-cell-epitope prediction phase involves MHC-I and then MHC-II binding, as described in Table S2. The MHC-I and MHC-II alleles used are tabulated in Table S3.

Epitope Prioritization Phase
Different filters such as MHCPred, water-solubility, toxicity, allergenicity and antigenicity were applied to the selected epitopes to prioritize those which can be used in a multi-epitopes vaccine design.

MHCPred, Allergenicity, Antigenicity, Solubility and Toxicity Analysis
By utilizing MHCPred, the binding affinity of epitopes for DRB*0101 was evaluated. Only epitopes with IC50 values < 100 nM were selected because DRB*0101 is the most common allele present in 95% of the population [84]. Of these, only antigenic and non-allergenic epitopes were selected to stimulate strong and safe immune responses. The solubility of the epitopes was checked using the Protein-Sol web server [85], which predicts the solubility of a vaccine molecule, and only soluble epitopes were selected [86]. ToxinPred was employed for the selection of non-toxic epitopes. The epitopes that have IC50 values smaller than 100 nM, are antigenic, non-allergenic and non-toxic, and have good water solubility are listed in Table 5. These selected epitopes were then forwarded to make a multi-epitopes vaccine. The nine shortlisted epitopes are also schematically presented in Figure 4. The screened epitopes were also conserved in T. denticola, A. actinomycetemcomitans, T. forsythia, C. rectus and F. nucleatum, and thus the epitopes can be used in broad-spectrum vaccine design.

Multi-Epitopes Vaccine Designing
A multi-epitopes vaccine was designed to overcome the weak immunogenicity of individual epitopes [87]. The epitopes were joined with linkers to allow efficient separation of the epitopes. Additionally, an adjuvant molecule was added to the multi-epitopes peptide to further enhance the antigenic and immunogenic potential of the vaccine. The adjuvant used was the cholera toxin B-subunit, which is a potent stimulator of interferons and cellular immunity. The schematic representation of the multi-epitopes vaccine construct is shown in Figure 5.

Vaccine Structure Prediction, Loops Modeling and Refinement
The three-dimensional structure of the vaccine construct was modeled to further understand vaccine binding with different immune receptors and the exposed nature of the vaccine epitopes. Ab initio structure modeling was performed as no appropriate template was available at the time of vaccine-structure modeling. The designed vaccine 3D structure is given in Figure 5. To avoid structure instability, the following loop-comprising residues were modeled into secondary structure elements to obtain the most refined structure: Met1-Val8, Lys55-Pro74, Ala101-Asn111, Glu158, Gly165, Glu177, Pro184, Pro196, Leu202, Gln216, Pro220, Gln235-Pro240, Cys30-Thr36, Val149-Gly157, Val166-Pro170, Gly185, Ile190, Asn203-Lys207, Gln221-

Disulfide Engineering and Codon Optimization
The vaccine was further subjected to disulfide engineering to strengthen the intramolecular bonding of the vaccine and enhance the stability of its structure. This further ensured that weaker segments of the vaccine were resistant to cellular degradation and conferred conformational stability to the vaccine [88]. During the analysis, only residue pairs with a higher energy value (>0 kcal/mol) were mutated to cysteine. The amino acid residues replaced by cysteine are tabulated in Table 6, while the disulfide bonds are shown as yellow sticks in Figure 6.
The vaccine sequence was reverse translated into a DNA sequence to perform codon optimization for the E. coli expression system. The GC content of the vaccine sequence was 57.08% and the CAI value was 0.92. Both values are indicators of a highly expressed sequence.

Analysis of Molecular Docking
Robust interactions of a vaccine with receptors are necessary for generating good immune responses. To analyze the interaction of the host receptors and vaccine construct, we conducted blind molecular docking. The top 20 docked solutions of vaccine with MHC-I, MHC-II, and TLR-4 were picked as shown in Tables S4-S6.

Docked Complexes Refinement
The docked complexes were further refined to remove false positive results and to select the complex with the lowest binding energy. The lowest binding energy implies the best binding of the vaccine with the immune receptor. In the case of MHC-I, solution number 5 was selected as it had the lowest global energy of −13.83 kJ·mol−1. For MHC-II, solution number 2 with a global binding energy of −11.10 kJ·mol−1 was selected. In the case of TLR-4, solution number 9 was selected as it had the lowest global energy of −13.10 kJ·mol−1. The FireDock rescored docked solutions are given in Tables 7-9.

Table 9. FireDock solutions of TLR-4-vaccine. kJ·mol−1 is the unit of energy for each term given below.

Docked Conformation of Vaccine with Immune Receptors
The best docked complex for each receptor was visualized to investigate the docked conformation of the vaccine with immune receptors such as MHC-I, MHC-II and TLR-4 as shown in Figure 7. The vaccine was observed to perform deep binding with the receptors and the epitopes were exposed to the host immune system cells for recognition Int. J. Environ. Res. Public Health 2022, 19, 8408 13 of 23 and processing. This further implies that the vaccine epitopes can stimulate proper immune responses, leading to the generation of humoral and cellular immunity.
The best docked complex for each receptor (MHC-I, MHC-II and TLR-4) was visualized to investigate the docked conformation of the vaccine, as shown in Figure 7. The vaccine was observed to bind deeply within the receptors while its epitopes remained exposed to the host immune cells for recognition and processing. This suggests that the vaccine epitopes can stimulate proper immune responses, leading to the generation of humoral and cellular immunity.

Interactions of Vaccine to Immune Receptors
Understanding the type and number of interactions between the vaccine and the receptors is important, as they are key in determining the strength of vaccine-receptor binding. Different types of interactions were observed between the vaccine and the receptors, in particular hydrophilic and hydrophobic interactions, salt bridges and disulfide bonds. All these interactions were found to play a key role in the stability of the docked conformation of the vaccine with the immune receptors. These interactions involve a number of receptor residues that engage the vaccine molecule. These residues are shown in Table 10.

Molecular Dynamics Simulation
The dynamic behavior of the selected docked complexes was checked through all-atom molecular dynamics simulation. The simulation trajectories were investigated through the radius of gyration (RoG), root mean square deviation (RMSD) and root mean square fluctuation (RMSF), all based on the carbon alpha atoms. This analysis was vital to understand the dynamic binding stability of the vaccine with respect to the receptors and to determine whether the epitopes were exposed to the host immune cells. The RMSD remained stable from the start and no major changes were observed in the structures. A few minor structural deviations were noted, which were due to the many loops present in the systems. The RMSD varied between 2.5 and 3 Å throughout the simulation, as shown in Figure 8A. The RMSF analysis showed that most receptor residues remained stable in the presence of the vaccine molecule, with a few highly flexible regions attributable to loops. The majority of the residues in the systems had RMSF values below 3 Å, which indicates good stability (Figure 8B). To further validate these findings, RoG was investigated for the systems, as presented in Figure 8C. The systems were observed to be compact and the secondary structures maintained a tight conformation. These results are in agreement with the RMSD and RMSF results and overall indicate good system stability.
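For readers unfamiliar with the three trajectory metrics used above, the following pure-Python sketch shows how each is defined. It is an illustration only and not the analysis pipeline of this study. It assumes that every frame is a list of (x, y, z) carbon alpha coordinates in Å, that frames have already been superimposed on the reference, and that all atoms have equal mass; the function names are hypothetical.

```python
import math

def rmsd(frame, reference):
    """RMSD between two equally sized coordinate sets (no fitting is done)."""
    if len(frame) != len(reference) or not frame:
        raise ValueError("coordinate sets must be non-empty and equal in size")
    total = sum((a - b) ** 2
                for p, q in zip(frame, reference)
                for a, b in zip(p, q))
    return math.sqrt(total / len(frame))

def rmsf(trajectory):
    """Per-atom RMSF: fluctuation of each atom around its mean position."""
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    result = []
    for i in range(n_atoms):
        mean = [sum(f[i][k] for f in trajectory) / n_frames for k in range(3)]
        msf = sum(sum((f[i][k] - mean[k]) ** 2 for k in range(3))
                  for f in trajectory) / n_frames
        result.append(math.sqrt(msf))
    return result

def radius_of_gyration(frame):
    """Unweighted radius of gyration of one frame (equal masses assumed)."""
    n = len(frame)
    center = [sum(p[k] for p in frame) / n for k in range(3)]
    sq = sum(sum((p[k] - center[k]) ** 2 for k in range(3)) for p in frame)
    return math.sqrt(sq / n)

# Two-atom toy trajectory, not data from the simulations in this study.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
shifted = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(shifted, reference))
print(rmsf([reference, shifted]))
print(radius_of_gyration(reference))
```

RMSD tracks how far the whole structure drifts from its starting point over time, RMSF reports per-residue mobility, and RoG reports overall compactness, which is why the three are read together when judging complex stability.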

Calculation of Binding Free Energies
Binding free energies of the docked complexes were calculated using the MM-GBSA and MM-PBSA approaches. Both approaches are well known and widely used because they combine high speed with good accuracy. These calculations were used to validate the binding stability of the docked complexes. The total binding free energy was −135.73 kcal/mol for the vaccine-TLR-4 complex, −101.32 kcal/mol for the vaccine-MHC-I complex and −76.17 kcal/mol for the vaccine-MHC-II complex, as shown in Table 11. Electrostatic and van der Waals energies contributed favorably to complex formation.
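For orientation, the quantity estimated by both approaches can be written in its standard end-point form. This is the general textbook decomposition and not a statement of the exact protocol used here; in particular, omitting the entropy term, as is common in such calculations, is an assumption.

```latex
\begin{aligned}
\Delta G_{\text{bind}} &= G_{\text{complex}} - \left( G_{\text{receptor}} + G_{\text{vaccine}} \right) \\
&\approx \Delta E_{\text{vdW}} + \Delta E_{\text{ele}} + \Delta G_{\text{polar}} + \Delta G_{\text{nonpolar}}
\end{aligned}
```

Here the polar solvation term is obtained from the generalized Born model in MM-GBSA and from the Poisson-Boltzmann equation in MM-PBSA, and the nonpolar term is typically estimated from the solvent-accessible surface area. A more negative total indicates more favorable binding, so the reported totals rank the complexes as TLR-4, then MHC-I, then MHC-II.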

Immune Stimulations
In the simulation, the vaccine antigen was exposed to the host immune system for 350 days. Increased levels of IgM and IgG antibodies were observed against the antigen. The secondary response, in turn, increased the tertiary responses and resulted in the formation of B-cells, IgG1, IgG2, IgG1+IgG2, IgM and IgM+IgG, as shown in Figure 9A. Similarly, in Figure 9B, the production of interferon-gamma is shown to be greater than 250,000 counts per ml. The different B-cell and T-cell responses are shown in Figures S1 and S2, respectively. The humoral and cellular immune responses have been documented to play a significant role in clearing P. gingivalis and related oral bacterial pathogens. Innate immune cells such as dendritic cells and monocytes/macrophages, together with the lymphocytes of adaptive immunity, are localized in the periodontium, where they recognize and respond to P. gingivalis through pattern recognition receptors (PRRs), followed by the release of inflammatory cytokines and reactive oxygen species [18]. As P. gingivalis is an oral pathogen, secretory IgA antibody acts as the first line of defense by blocking pathogen adherence to host mucosal surfaces [89].

Discussion
P. gingivalis is a Gram-negative anaerobic bacterium that is responsible for periodontitis, which results in tooth loss. More than 500 bacterial species inhabit the human oral cavity and most of them are non-pathogenic. However, some, such as P. gingivalis, form biofilms that contribute to chronic periodontitis [16]. Recently, the bacterium has been reported to be associated with the development of Alzheimer's disease [90]. The bacterium is resistant to multiple antibiotics, such as clindamycin, metronidazole, and amoxicillin, which warrants the search for novel antibiotics and vaccines to manage its pathogenicity [91].
Vaccines have excellent potential for preventing infections and have proved it by saving millions of lives during past pandemics. Successful examples include influenza vaccines and the smallpox vaccine. Vaccine development has had a significant effect in combatting many diseases. Traditional vaccinology, though still in use and successful for many decades, suffers from several limitations that have shifted the focus towards genome-based vaccines. The use of bioinformatics in recent times has considerably broadened the scope of vaccinology, particularly for pathogens that cannot be cultured under laboratory conditions and for those that undergo continuous genetic changes in surface antigens. Reverse vaccinology, which is the reverse of traditional vaccinology, has attracted more attention due to its key role in the recent development of the meningococcal vaccine [92,93]. Reverse vaccinology is genome-based vaccinology and has contributed remarkably to the design of multi-epitopes vaccines [94-96].
Several attempts have been made so far to develop a vaccine against P. gingivalis and other oral bacterial pathogens. The heat shock proteins of this bacterium have been investigated for vaccine development and it has been concluded that the P. gingivalis HSP60 protein has the potential to reduce periodontitis in mouse models [97]. In another study, the PG32 and PG33 proteins were found to show immune protective efficacy and to clear the pathogen [98]. Despite these efforts, none of the study findings are convincing and no appropriate vaccine candidate is under development. Considering this, we performed an in-depth computational vaccine design study to thoroughly screen the core genome of P. gingivalis and identify proteins that are capable of stimulating host immune responses.
In this investigation, three potential vaccine targets, namely lytic transglycosylase domain-containing protein, superoxide dismutase enzyme, and FKBP-type peptidyl-prolyl cis-trans isomerase enzyme, were identified that fulfilled all the required properties of a potential vaccine candidate. The targets are part of the pathogen core genome, which supports the development of a broad-spectrum vaccine. Further, it was ensured that these proteins are present on the pathogen surface, as such proteins are easily accessible to the host immune system for interactions. These proteins also harbor strong antigenic determinants capable of stimulating the immune system. The selected proteins are non-homologous to the human proteome and are thereby good candidates for avoiding autoimmune responses. Moreover, the proteins are antigenic, able to bind products of acquired immunity and able to activate immune signaling pathways. Immunoinformatics further affirmed that these proteins harbor epitopes that are antigenic, non-toxic, non-allergenic, and have strong binding affinity for the DRB*0101 allele. This allele is present in the majority of the human population and the interaction of epitopes with this allele leads to accurate and robust immune responses. The predicted epitopes were further utilized in a multi-epitopes vaccine design to remove the limitations of a single peptide vaccine. The designed vaccine showed a stable binding conformation with different immune receptors such as MHC-I, MHC-II, and TLR-4. The intermolecular interaction analysis revealed that multiple hydrophilic and hydrophobic interactions were formed between the vaccine and the receptors, thereby leading to stable complex formation. Finally, the vaccine candidate was evaluated for its potential to stimulate host immune system cells. High primary, secondary, and tertiary immune responses were noticed. Similarly, a high concentration of interleukins and interferons was observed.
Computer-aided vaccine design based on genomic data is rapidly gaining recognition in vaccine development. It not only saves time and money but can also deliver data quickly to guide specific experiments. Taken together, these findings suggest that the designed vaccine is a good candidate for in vivo and in vitro testing.

Conclusions and Limitations
A multi-epitopes vaccine against P. gingivalis and other oral bacterial pathogens was proposed in this research using a variety of computer-aided vaccine design techniques, including reverse vaccinology, subtractive proteomics, immunoinformatics, and several biophysical analyses. The vaccine epitopes were predicted using the following three potential vaccine targets: lytic transglycosylase domain-containing protein, superoxide dismutase enzyme, and FKBP-type peptidyl-prolyl cis-trans isomerase enzyme. These targets were prioritized based on several vaccine candidacy parameters, including but not limited to presence in the core proteome of the pathogen, localization on the cell surface, non-homology to the host, screening against probiotic bacteria, and feasibility for experimental analysis. Similarly, the epitopes used in the vaccine were non-toxic, antigenic, non-allergenic, and had high binding potential for B-cell and T-cell alleles. The designed vaccine construct showed excellent binding with the different immune receptors and remained stable over the simulated period. Host immune-system simulation in response to the vaccine revealed the production of strong primary, secondary and tertiary immune responses. All these findings identify the vaccine as a good candidate to be evaluated for its immune protection ability in in vivo models. The findings and data from this study might speed up the process of vaccine discovery against P. gingivalis. Even though our selection criteria were strict throughout the study, there are still some shortcomings that must be addressed in future investigations. Firstly, the ordering of epitopes in the vaccine for optimal activity was not tested. Secondly, the accuracy of the MHC epitope prediction algorithm was not tested extensively.

Data Availability Statement:
The data presented in this study are available within the article.