Integrated Core Proteomics, Subtractive Proteomics, and Immunoinformatics Investigation to Unveil a Potential Multi-Epitope Vaccine against Schistosomiasis

Schistosomiasis is a parasitic infection that causes considerable morbidity and mortality in the world. Infections of parasitic blood flukes, known as schistosomes, cause the disease. No vaccine is available yet and thus there is a need to design an effective vaccine against schistosomiasis. Schistosoma japonicum, Schistosoma mansoni, and Schistosoma haematobium are the main pathogenic species that infect humans. In this research, core proteomics was combined with a subtractive proteomics pipeline to identify suitable antigenic proteins for the construction of a multi-epitope vaccine (MEV) against human-infecting Schistosoma species. The pipeline revealed two antigenic proteins—calcium binding and mycosubtilin synthase subunit C—as promising vaccine targets. T and B cell epitopes from the targeted proteins were predicted using multiple bioinformatics and immunoinformatics databases. Seven cytotoxic T cell lymphocytes (CTL), three helper T cell lymphocytes (HTL), and four linear B cell lymphocytes (LBL) epitopes were fused with a suitable adjuvant and linkers to design a 217 amino-acid-long MEV. The vaccine was coupled with a TLR-4 agonist (RS-09; Sequence: APPHALS) adjuvant to enhance the immune responses. The designed MEV was stable, highly antigenic, and non-allergenic to human use. Molecular docking, molecular dynamics (MD) simulations, and molecular mechanics/generalized Born surface area (MMGBSA) analysis were performed to study the binding affinity and molecular interactions of the MEV with human immune receptors (TLR2 and TLR4) and MHC molecules (MHC I and MHC II). The MEV expression capability was tested in an Escherichia coli (strain-K12) plasmid vector pET-28a(+). Findings of these computer assays proved the MEV as highly promising in establishing protective immunity against the pathogens; nevertheless, additional validation by in vivo and in vitro experiments is required to discuss its real immune-protective efficacy.


Introduction
Schistosomes are parasitic flatworms that cause an infectious disease called schistosomiasis [1]. Schistosomiasis is one of the most common and neglected tropical parasite sequences are conserved across strains and species and categorized as broad-spectrum vaccine candidates [29].

Subtractive Proteomics Approach
A subtractive proteomic approach was used for analyzing the core proteome to recognize suitable vaccine candidates. Subtractive proteomics is a computational method for identifying potential vaccine and drug targets by excluding proteins that are not useful for vaccine and drug designing [30]. The first step in subtractive proteomics was to find duplicated sequences in the core proteome that shared an 80% sequence identity using the CD-HIT algorithm [31]. Following that, the non-redundant proteins were compared to the human host to eliminate homologous proteins and prevent functional blockage of similar host proteins. BLASTp against a reference human proteome with predetermined parameters was used to screen non-homologous proteins from a pool of non-redundant proteins [32]. The CELLO server was combined with the SVM method to predict the subcellular localization of non-host/homologous proteins with a prediction accuracy of 89% [33].
Virulent proteins play a key role in the development of vaccines due to their host invasion and pathogenesis nature. Schistosoma virulent proteins were identified by using ViroBLAST [34]. Furthermore, the antigenicity of the virulent proteins was analyzed using the Vaxijen server. Through ACC (auto cross-covariance) transformation, this server maintains a prediction accuracy of 70-89% [35]. An antigenic protein was defined as one with an antigenic score greater than 0.5. TMHMM was used with cut-off < 1 to predict transmembrane helices [36]. Proteins with fewer transmembrane helices are easier to express and clone [21]. Top antigenic proteins having 1 or 0 transmembrane helices were chosen for vaccine development. Moreover, the AllerTOP server evaluated the allergenicity of proteins. AllerTOP is a robust and powerful complementary approach based on the k-nearest neighbors (kNN) method for classifying non-allergens and allergens with 88.7% accuracy [37].

Prediction of CTL Epitopes
A significant breakthrough in rational vaccine design is the development of cytotoxic T-lymphocyte (CTL) epitopes. Most importantly, it decreases the time and expense of predicting epitopes compared to wet laboratory experiments [38]. CTL epitopes were predicted by using the MHC I binding tool present in the Immune Epitope Database (IEDB) [39]. The lesser percentile rank shows greater affinity, so the percentile rank was considered as 2. Immunogenic, non-toxic, non-allergen, and antigenic CTL epitopes were considered for vaccine construction. The immunogenicity tool present in IEDB was used to find the immunogenic nature of the epitopes [39]. The Toxin Pred and Vaxijen servers were used to check for toxicity and antigenicity [40,41]. The Toxin Pred server is based on the various properties of peptides and employs a quantitative matrix and machine learning technique. Allergenicity was checked using AllergenFP, which ensures a prediction accuracy up to 88.9% [42].

Prediction of HTL Epitopes
Helper T lymphocyte (HTL) cells influence CTL cells to kill target parasitized cells, macrophages to ingest infectious agents, and B cells to excrete antibodies, making them the most effective cells in adaptive immunity [43]. Therefore, it is essential to include helper T cell epitopes to create a healthier immune reaction. Cytokines were generated in HTL cells, such as interleukin-4 (IL-4) and IL-10, and interferon-gamma (IFN-), which cause immune cells such as macrophages and cytotoxic T cells to become activated [44]. Hence, HTL epitopes that induce cytokine production are critical for vaccine creation. The MHC-II binding tool of IEDB was utilized to forecast HTL epitopes (15-mer) of the target proteins [39]. The IFN-epitope server was used with motif scan and SVM approaches to produce overlapping HTL sequence peptides against the query protein [45]. The IL-4Pred server enables to plan, detect, and map peptides that really can induce IL-4 with a threshold value of 0.2, which is important for subunit vaccines [46]. IL-10Pred was also used to predict the inducing properties at the threshold value of −0.3 [47].

Prediction of LBL Epitopes
The surface receptor of B cell identifies B cell epitopes and produces antigen-related immunoglobulins. The formulation of a B cell epitopic vaccine would therefore play an important role in adaptive immunity [48]. The online server ABCPred, based on a neural networking approach, was used to identify the linear B cell epitopes (LBL) with an accuracy of 75% [49]. The AllergenFP v1.0, ToxinPred, and VaxiJen v2.0 servers were utilized to check the allergenicity, toxicity, and antigenicity of the anticipated B cell epitopes [35,40,42].

World Population Coverage
Different HLA alleles and their expression are distributed in different ethnic groups at various frequencies. The worldwide HLA-alleles distribution is therefore essential for successful multi-epitope vaccine development [50]. The main goal of these population coverage studies was to see if the candidates chosen were appropriate for large groups of people [51]. To determine the population coverage of the screened epitopes and their HLA alleles, the IEDB population coverage server was used [52].

MEV Construction
The potential CTL, HTL, and LBL epitopes were combined to form a fusion peptide using the AAY, GPGPG, and KK peptide linkers, respectively. Since the peptides used to make vaccines are generally not very immunogenic when used alone, adjuvants are mandatory to improve the immune reaction. Hence, the TLR-4 agonist (RS-09; Sequence: APPHALS) adjuvant was linked to the CTL epitope via the EAAAK linker. A linker must be used to join two epitopes for each epitope to work efficiently.

Structural Analysis of the Vaccine Construct
The constructed vaccine was initially tested using BLASTp for its similarity with the human proteome [53]. Next, the Protparam web server was used to evaluate the vaccine construct's physiochemical characteristics [37]. Molecular weight, atomic composition, amino-acid composition, extinction coefficient, theoretical pI, and approximate half-life were the parameters determined by the ProtParam server [54]. In addition, the IEDB Immunogenicity server was utilized to calculate immunogenicity and the Vaxijen server was used to evaluate vaccine construct antigenicity (antigen or non-antigen) [39,41]. One more important step was to calculate the allergenicity (allergen or non-allergen) of the vaccine construct using the AllerTOP server [55]. Protein secondary structure is very significant because it is a key indicator of protein folding, so the SOPMA Tool evaluates the secondary structure of the vaccine. The secondary structure prediction by SOPMA shows an accuracy of 69.5% by showing a three-state structure (a-helix, B-sheet, and coil) description. [56]. The last step of the vaccine construct evaluation includes the SOLpro Tool for checking the vaccine solubility, which has a 74% accuracy with a 10-fold crossvalidation [57].

3D Structure Prediction and Validation
A 3D protein model can be created using computer algorithms from the amino acid sequence. The 3Dpro tool of the SCRATCH suite was used for the three-dimensional (3D) modeling of the MEV [58]. This server predicts the three-dimensional structures of the protein based on several-threading alignments and iterative prototype fragment assembly simulations. After the prediction of the vaccine model, it was refined using the GalaxyRefine webserver [59]. The GalaxyRefine webserver reconfigures the side chain and then uses molecular dynamics to carry out the structural reassembling and then the overall structural inspection. It is especially effective in improving the quality of the local structures, as demonstrated by the CASP refinement category. The Ramachandran plot was used to evaluate the energetically permissible and forbidden phi (φ) and psi (ψ) dihedral angles of the amino acid residues [60]. Besides, ProSA-web deals with specific needs in the validation of the protein structures derived from NMR spectroscopy, X-ray, and theoretical calculations [61]. ERRAT was also used to evaluate our predicted tertiary structure and assign a value of the quality factor based on the non-bonded atomic interactions of the various atomic types within the protein [62].

Prediction of the B Cell Epitopes of the Vaccine
The IEDB v2.22 Ellipro tool and ABCPred online server were used to predict the MEV's conformational and linear B cell epitopes [49,63]. On the ABCPred server, the MEV amino acid sequence was given as input, with the length set to 14 and the threshold set to 0.5. The 3D structure of the MEV was also given as input in the Ellipro tool by keeping the default parameters.

In Silico Cloning and Codon Adaptation
A codon adaptation strategy can improve the foreign gene expression in the host. The use of codons varies from organism to organism, and this change in the use of codons results in low foreign gene expression. The codon adaptation of the vaccine sequence was performed through the JCAT tool and adapted to codon usage as per the E.coli K12 strain usage [64]. The adapted nucleotide sequence was cloned into an E.coli vector of pET28a (+) utilizing Snap Gene v5.0.8 software [65].

Immune Simulation
The C-IMMSIM server assesses the immune response and immunogenicity of the MEV. The C-IMMSIM server predicts the peptide interactions with the immune system employing position-specific scoring matrices (PSSM). The C-ImmSim server is routinely employed in immunoinformatics studies and gives reliable results with regard to the vaccination strategy employed [66][67][68]. The simulation was conducted in 1000 steps, with two doses given over a four-week interval.

Protein-Protein Docking
The vaccine interacts with host immune cells to provide an effective immune response. The HADDOCK v2.2 server was used to conduct the protein-protein docking to analyze the MEV binding capability with human Toll-like receptors (TLR-2 and TLR-4) and MHC molecules (MHC I and MHC II) [69]. HADDOCK is a (experimental) knowledge-based docking method in the form of information on the interface zone between the molecular components and related orientations. In contrast to many other docking programs, HAD-DOCK permits a conformational modification of molecules not only of the side chains but also of the backbone during the complex formation. Furthermore, HADDOCK supports the docking of multi-model NMR structures and other Protein Data Bank (PDB) structures [70]. Crystal structures of TLR4 (ID: 4G8A) [71], TLR2 (ID: 2Z7X) [72], MHC I (ID: 1I1Y) [73], and MHC II (ID: 1KG0) [74] were retrieved from the PDB [75]. Interactions were observed by evaluating docked complexes in the PDBsum database, and the docked complexes were imaged using PyMOL [76,77]. PDBsum is a web-based database that provides a large pictorial summary of the key data on each PDB macromolecular structure. It includes PROMOTIF structural studies, annotated plots of each protein chain's secondary structure, images of the structure, PROCHECK results, and schematic diagrams of the protein-DNA and protein-ligand interactions [78].

MD Simulations
Simulations were used to understand the molecular dynamics of the docked complexes. The AMBER software and its various modules were used for this purpose [79].
Initially, the TLeap module was used for the generation of topological files and initial coordinates. The system was solvated utilizing the force field ff14SB in a TIP3P water box with 8.0 dimensions [80]. To recover adverse conflicts, the complex's energy was minimized using the conjugate gradient for 1000 steps and the steepest descent. Using the Langevin dynamics algorithm, the device was then heated for 10 ps to preserve temperature stability. The pressure was balanced in accordance with protocol. Finally, the complex was subjected to an efficient simulation of 100 ns. The simulation box's canonical ensemble inferred periodic boundary conditions. The Berendsen Coupling Integration Algorithm was employed to keep the temperature stable [81]. The Process TRAJectory (PTRAJ) module was used to analyze the results. Xmgrace was used to calculate and display three properties [82]. These are the radius of gyration (RoG), root mean square deviation (RMSD), and root mean square fluctuation (RMSF). Alpha carbon (Cα) co-ordinates are commonly considered to be depictive of an amino acid's position in 3D space. The RMSD method compares the protein carbon atoms in relative positions by determining the average distance between them over a certain period of time [83]. RMSF calculates the protein's backbone atoms (N, C, and C) and analyzes the structural changes. This reflects the root average square distance in certain dynamics between an atom and its average geometric position. RoG is used to evaluate the 3D packaging and density of the docked complexes [84].

MMGBSA Binding Energy Analysis
MMGBSA was used to estimate the binding free energies of the dominating simulated complexes. The ante-MMPBSA.py module generated and evaluated the initial prompt files for the MEBV, TLR2, TLR4, MHCI, MHCII, and complexes. The variation in the free energies of the complex, receptor, and the vaccine was used to calculate the free energy: The gas-phase energy influences are exchanged with polar and non-polar salvation free energy modules during this process [85]. For each terminus, MMGBSA calculates Gibb's free energy, which is a term used to describe the amount of energy that is denoted by G, as follows: where temperature is denoted by T, and is multiplied by entropy S, which is estimated by normal mode analysis. The MM energy from the force field is often used as "Egas" at the gas stage energy. Van der Waals collaboration, internal, and electrostatic energy are all included in this category.

Core Proteome Analysis
The core proteins are now appreciated in vaccine design as they are present in all or the majority of the targeted pathogen strains and the use of these proteins in vaccine design ensure immune protection against broader pathogenic species. For vaccine design against Schistosoma, all three prominent pathogenic species (Schistosoma mansoni, Schistosoma japonicum, and Schistosoma haematobium) were considered. The reference proteome of Schistosoma japonicum has a total size of 369.9 Mb encoding 16,936 proteins with GC contents of 33.8%. Similarly, Schistosoma mansoni and Schistosoma haematobium encodes for 3462 and 8934, respectively. The total protein count of these species is 29,332; which reduced to 16,557 after core proteome analysis.

Identification of Vaccine Candidates
The CD-hit study of the core proteome identified 11,430 proteins as non-redundant. The redundant proteins are the product of evolutionary duplication and are a copy of the core protein. Since these proteins are not conserved in the genome, they are not good candidates for vaccine development [86]. When screening vaccine proteins, homologous proteins must be removed because they cause host proteins cross-reactivity, resulting in auto-immune responses [87]. Homology analysis identified 2094 non-homologous proteins in the host, while 9336 homologous proteins were excluded. During comparative subcellular localization, CELLO identified 79 cytoplasmic, 612 plasma membrane, 184 extracellular, 102 mitochondrial, 1 cytoskeletal, and 1116 nuclear proteins. Since cytoplasmic proteins do not interact with the extracellular environment of the pathogen, these are not considered to be adequate vaccine design targets and therefore were discarded. The remaining proteins were subjected to virulent protein analysis. Virulent protein immunization and/or their combinations enhance the host's protection when exposed to a microbial challenge [14]. The virulent protein mediates severe signals in host cells and thus can serve as valuable vaccine candidates in comparison with non-virulent proteins. Four proteins were shortlisted as virulent proteins [88]. The antigenicity of these four proteins was then analyzed and only two proteins were found to be antigenic. Antigenicity is defined as the ability to elicit an immune response in response to an antigen; thus, selecting proteins with higher antigenicity is a requirement for peptide-based vaccine design [89]. Besides, no transmembrane helices were found in these two proteins. Furthermore, these proteins were found to be non-allergenic, making them prime candidates for vaccine development. Immunoinformatics analysis of both these proteins was done. Details of both proteins are listed in Table 1.

Epitopes Prediction
Vaccine candidates were used to forecast CTL, HTL, and LBL cell epitopes. The top 7 non-toxic, antigenic, non-allergenic, and immunogenic CTL epitopes were screened for vaccine designing, as shown in Table 2, from a total of 66 CTL epitopes forecasted (Table S1).  Similarly, a total of 10 HTL epitopes were forecasted (Table S2) and top 3 antigenic, IFN-positive, IL-10 inducer, and IL-4 inducer HTL epitopes were screened for the vaccine construct, as shown in Table 3. B cell epitopes are protein antigenic regions that can trigger the formation of antibodies. A total 23 LBL epitopes were predicted (Table S3). However, only the top 4 antigenic, nonallergic, and non-toxic LBL epitopes with a probability score above 0.5 were considered for the MEV design, as shown in Table 4.

World Population Coverage
The distribution of MHC alleles varies across the geographical regions and ethnic groups. Hence, population coverage must be considered while developing an effective vaccine [50]. Population coverage of the screened T cell epitopes was checked ( Figure 1A). the global population of selected epitopes is estimated to be 99.64%. The highest population coverage was found in Europe (99.83%), followed by North America (99.82%), East Asia (99.48%), and South Africa (99.34%). Central America has shown the lowest population coverage (22.61%).

Construction of the MEV
The MEV is a fusion peptide that was designed by using AAY, KK, and GPGPG linkers to combine the 10 T cell (7 CTL and 3 HTL epitopes) and 4 LBL epitopes ( Figure 1B). The TLR-4 agonist (RS-09; Sequence: APPHALS) was added to activate an antigen-specific immune response with an EAAAK linker in the N-terminal of the vaccine. The final vaccine contained a total of 217 amino acid residues ( Figure 1C).

Physiochemical and Structural Analysis of the Vaccine Construct
After the development of a vaccine, different physiochemical and immunogenic properties were analyzed. Initially, the vaccine homology was checked, and it was found to be non-homologous. Further, the antigenicity of the construct was calculated at a threshold value of 0.5 and found to be an antigen with a score of 0.7222. In the next step, allergenicity and toxicity were also checked, and the vaccine was observed to be non-toxic and non-allergenic.
The physiochemical properties were evaluated by the Protparam server. The MEV was estimated to have a theoretical pI of 8.90 and a molecular weight of 24.3 kDa. The MEV consists of 32 negatively charged amino acid residues (Asp + Glu) and 36 positively charged amino acid residues (Arg + Lys). A large number of positively charged amino acids and a pI greater than 7 showed that the MEV was alkaline. A 29.76 Instability Index (II) classified the MEV as stable. The MEV's thermostability was indicated by an Aliphatic Index of 78.29. The MEV's GRAVY score was −0.655; a negative sign indicates that it is hydrophilic. Finally, the SOLpro tool was used to estimate MEV solubility and the results indicated that MEV was soluble (probability: 0.952236). Transmembrane helices were checked by the TMHMM server and one transmembrane helix was found in the MEV at the position of 40-59 amino acid residues. Each of these characteristics suggests that the designed MEV has a high probability of being accepted as a vaccine candidate.
SOPMA was used to examine the secondary structure of the MEV. The SOPMA predictions are based on the MEV amino acid sequence. SOPMA results indicate that 68 (31.34%) amino acids form coils, 101 (46.54%) amino acids form an α-helix, and 30 (13.82%) amino acids forms β-strands.
The 3Dpro server was used for the tertiary structure prediction of the MEV. GalaxyRefine was used to further refine the predicted 3D structure of the MEV. The 3D structure of the MEV is shown in Figure 1D. GalaxyRefine returned five refined structures. Model 2 was chosen for further validation among the pool because it had a significantly higher Rama favored region (95.9) and overall suitable MolProbity (1.730), RMSD (0.483), and GDT-HA (0.9278) scores.
A Ramachandran plot of the vaccine subunit revealed that 94.6% of the amino acid residues were present in the favored region, 4.3% in the additional allowed region, and 0.5% in the disallowed region, which corresponds to GalaxyRefine's prediction. A Z-score of −2.85 was predicted by the ProsA web server. For high-quality models, ERRAT generates an overall quality factor of >50, and our model's quality factor was 76.84.

Prediction of B Cell Epitopes of the MEV
B cells also produce antibodies, which give humoral immunity and release cytokines. Therefore, B cell epitopes should be present in the MEV's domain. Seven conformal B cell epitopes (Table S4) and sixteen linear B cell epitopes (Table S5) were forecasted.

In Silico Cloning
To express the vaccine peptide in an E. coli system, in silico cloning of the MEV was performed in the pET28a (+) plasmid. The cloned sequence and the plasmid were cleaved with the restriction enzymes (Ndel and Xhol) to produce flanking sites. Before that, the construct sequence's codon usage was adapted as per the E.coli K12 strain's usage. JCAT revealed that the improved sequence exhibits a GC content of 50.0% and a CIA value of 1. To aid in the purification process, the 6-histidine residues were labeled on both sides of the target sequence. The clone was 5944 bp in size (Figure 2A).

Immune Simulation
The C-IMMSIM web server was used to test the immunogenic profile of the designed vaccine. All primary, secondary, and tertiary immune responses contributed significantly to the vaccine immunity ( Figure 2B). In particular, high titers of IgG + IgM antibodies were observed, followed by IgM and IgG1. Furthermore, in response to vaccine administration, different B cell isotypes were developed, which resulted in memory cell development. Furthermore, the vaccine candidate induces high IFN-γ and IL-2 levels ( Figure 2C).  Table 5. According to the docking statistics, the MEV has strong binding interactions with TLR2, TLR4, MHC I, and MHC II.

Protein-Protein Docking
The MEV docked conformation and atomic-level hydrogen bonding with different immune receptors are illustrated in Figure 3. The PDBsum server was used to gain a better understanding of the interactions between the MEV and receptor molecules and to draw up an interaction map between the docked complexes. It resulted in a schematic representation of non-bonded and H-bond interactions between the docked complexes. The MEV had 19 hydrogen bonds with TLR2 in the range of 3.07 Å, 15 hydrogen bonds with TLR4 in the range of 3.08 Å, 19 hydrogen bonds with the MHC I receptor in the range of 3.13 Å, and 17 hydrogen bonds with the MHC II receptor in the range of 3.27 Å.

MD Simulation
The statistical parameters based on the 100 ns MD simulation RMSF and RMSD were determined for the docked complexes (MEV-TLR4, MEV-TLR2, MEV-MHCI, and MEV-MHCII) to confirm the docked and structural stability of the constructed MEV (Figure 4). Simulation trajectories were used in both studies, and charts for the backbone carbon Alpha atoms were also created for both RMSF and RMSD. The RMSD backbone steadily increases in the complexes over time, and visualizing frames at 10 ns intervals revealed that the varying plot corresponds to slight structural changes induced, due to flexible loop regions, by the MEV. The binding of TLR2, TLR4, MHCI, and MHCII, as well as the overall stability of the complexes, were unaffected by these changes. The mean RMSD values for the MEV-TLR2, MEV-TLR4, MEV-MHC I, and MEV-MHC II complexes were 9.9 Å, 7.9 Å, 12.3 Å, and 12.4 Å, respectively. The RMSF analysis showed that the complex flexibility is due to the MEV's loop flexibility, which is higher for the MEV compared to the receptor molecules. An RMSF of TLR2 showed stability up to 2.5 Å. RMSF of TLR4 showed good conformational stability up to 2.3 Å. MHC I showed minor deviation at some points while MHC II showed stability at the value of 3 Å; after minor deviation, it remained stable at 2.5 Å. This indicated that the MEV in complex with TLR receptors are more stable than MHC molecules. The compactness of the complexes was then studied through the 100 ns simulation by measuring the RoG. The TLR2 receptor showed stability up to 32.5 Å over the time period of 35 ns; after that, it showed a minor deviation and remained stable up to 100 ns with the MEV receptor. The MHC-I and MHC-II plots of the RG indicate that they showed stability between 30 Å with minor deviations of 1 Å over the time period of 100 ns. RG conformational stability of the TLR4 receptor with MEV showed stability over 36.9 Å. All three these statistical analyses validated that the MEV is enjoying more stable dynamics with the TLR receptors than with the MHC molecules; therefore, it can be predicted that the MEV is more likely to bind to the TLRs.

MMGBSA Binding Energy Analysis
The MMGBSA method was used to calculate the binding free energies of the docked complexes and is represented in Table 6  Based on the calculated values, it could be concluded that van der Waals energy and electrostatic energy seem to be more beneficial in complex formation than the non-polar portion of the solvation energy's minor contribution. Overall, polar solvation energy tends to be less favorable to net energy.

Discussion
Schistosomiasis, the second most common tropical disease following malaria, has been a threat to the residents in regions it is endemic. In light of the possibility of resistance to the widely used drug (praziquantel) for schistosomiasis treatment, the need for a longterm vaccination strategy has been justified [90]. Vaccines are important for inducing immune protection and high immune responses against a variety of infectious diseases. Traditional approaches for vaccine production are less capable than computational methods for a variety of reasons, including inaccuracy, safety, stability, high cost, hypersensitivity, and specificity [91]. The subtractive proteomics approach in combination with immunoinformatics recently became more attractive for the designing of an effective low-cost vaccine. A substantial amount of genomic information available allows us to identify the pathogen proteins that are suitable for vaccine designing [91,92].
Prioritizing potential vaccine candidates requires the presence of a virulent factor capable of causing severe host infection [88,93]. Subcellular localization, no transmembrane helices, allergenicity, and antigenicity are also reported parameters that may help to screen potential vaccine candidates [14]. In the current study, subtractive proteomics in combination with reverse vaccinology and molecular docking was applied to identify and evaluate epitope-based antigenic peptide proteins in the core proteome of Schistosoma species. Subtractive proteomics was used to thoroughly investigate the core proteome to identify non-homologous, non-redundant, non-allergenic, virulent, and antigenic vaccine candidates. Two proteins-calcium binding and mycosubtilin synthase subunit C-were identified as promising vaccine candidates. Further, epitopes were predicted from these proteins. It has been revealed that nucleus localized proteins can enter endogenous pathways and are targets of the class I processing pathway [94][95][96]. Another way of using these vaccine candidates is in the form of a DNA vaccine comprising integrated antigens. The development of such multivalent DNA vaccines is a novel approach in parasite vaccinology where multiple antigens are combined into a plasmid or administered as plasmid mixture. This type of vaccination also allows protection against a variety of parasite strains as well as against different life cycle stages of parasites. The efficacy of these vaccines against parasites can be enhanced by achieving prime-boost immunizations, use of genetic adjuvants, and codon optimization [97].
Both B and T cell epitopes are required to concatenate and induce both humoral and cellular immunity when a prolonged, significant immune response is desired. Another benefit of using both types of epitopes is that antigens can sometimes elude memory B cells, making identification difficult. In such cases, the contingency plan is that the T cells neutralize and recognize the missed epitopes or antigens so that no pathogen remains are found. For T cells, CD8+ (MHC-I) and CD4+ (MHC-II) epitopes were forecasted to have a greater number of potential epitopes capable of generating an increased immune response [98]. Since linear epitopes are more stable than conformational epitopes, only linear epitopes were predicted for B cell epitope prediction [99]. Hence, both T and B cell epitopes from vaccine candidates were forecasted and rigorously analyzed. The selected epitopes covered 99.64% of the population of the world.
KK, AAY, and GPGPG linkers between the epitope sequences were added to make a more coherent and stable vaccine construct [100,101]. The use of EAAAK as a linker enhances the bioactivity of the vaccine and was added to the N-terminus of the vaccine, based on previous studies [102]. The TLR-4 agonist (RS-09; Sequence: APPHALS) was included as an adjuvant. The involvement of RS09 allows CTL epitopes to be co-stimulated, resulting in a more robust immune activation. The use of synthetic adjuvants (RS09) is a safer approach that is regarded as an advancement over traditional vaccination methods [103]. RS09 serves as an excellent adjuvant and it has been used in many previous studies [104][105][106]. The final MEV is 217 amino-acids long, which is consistent with previous studies [107,108]. The results, therefore, suggested that the efficacy, expression, and stability of our vaccine would not be an issue. The physiochemical analysis of the MEV revealed that it is stable and hydrophilic.
Tertiary and secondary structures provide information about a protein's function, interactions with other proteins/ligands, and dynamics [109,110]. Secondary structure analysis indicates that the MEV has a good amount of beta-sheets (30%), alpha-helixes (27%), and coils (41%), which is consistent with its capability to bind to immune cells and antigenicity [111]. A Ramachandran plot is an easy and simple way to analyze the tertiary structure quality. The presence of >90% residues in the Ramachandran plot (favored region) indicates high model quality, and our model has 94.6 residues in the favored region, indicating the high quality of our vaccine model [112]. Furthermore, the ERRAT quality factor and Z-score of our MEV were 76.84 and −2.85, respectively. A quality factor greater than 50% indicates that the model is of high quality [113]. The quality factor and Z-score confirm our model's overall high quality.
The MEV construct was used to predict the B cell epitopes to see if it had enough epitopes for antibodies to detect and latch onto [114]. Furthermore, elevated vaccine construct expression in the E. coli system is necessary for serological assays to detect the vaccine candidate's immunoreactivity [115]. For efficient expression in E.coli, the sequence was optimized. The optimized sequence has a CAI (Codon Adaptation Index) value of 1.00 and a GC content of 50%, indicating further purification and excellent expression of the vaccine [116,117].
Since MEV includes B and T cell epitopes, it should trigger cellular and humoral immune responses. Among other cytokines, IFN-β production was the highest. Important IL-10 and IL-2 activities were also observed. Extracellular protection was also provided by the antibodies. A large number of active immunoglobulins, including IgG and IgM, and their isotypes, have also been found, which can contribute to isotype switching. Furthermore, the negligible Simpson Index (D) indicates a varied, plausible immune response, as the MEV includes various B and T cell epitopes.
For MEV to be transported into the body successfully, it must have a high binding affinity for the immune receptor [118]. The MEV's strong binding ability with MHC molecules (MHC-1 and MHC-2) is required to elicit the immune system and develop a vaccine and immunotherapy directed against infectious microorganisms. In response to the administrated specific epitopic antigens, these interactions initiate an innate immune response and then elicit an adaptive immune response [119,120]. MD simulation and molecular docking verified the strong interactions of MEV with TLR2, TLR4, MHC I, and MHC II; the MMGBSA study demonstrated that this stable binding required very little energy. A significant number of H-bonds in docking, as well as minor fluctuations during MD simulations were observed. According to these findings, the MEV can bind to immune receptors effectively.
Although some multi-epitope vaccines against Schistosoma have been reported [104,121,122], the MEV developed in this work exhibits outstanding properties because it uses the core proteome of three Schistosoma species that primarily affects humans. As the constructed vaccine contains an adjuvant, B cell, and T cell epitopes (CTL, HTL), it can promote innate and adjuvant immunological reactions in the host body, making it an excellent and suitable candidate for Schistosoma vaccine production. The only drawback of the current research is that further laboratory trials are required to demonstrate the safety and efficacy of the designed vaccine since it is based on an integral computation pipeline.

Conclusions
The availability of a complete parasitic proteome facilitates several computational methods. The results presented here exhibit the gradual prioritization of the core proteome using various reverse vaccinology and comparative proteomics approaches, thus lessening the undesirable number of Schistosoma proteins. This method is effective for enriching potential targets and identifying those that are essential for normal cell functions as well as those that are virulent in the host cell. This approach enables us to identify immunogenic and antigenic vaccine targets that are important for pathogenesis. Antigenic, non-allergenic, and non-toxic T cell and B cell epitopes were fused with several linkers and adjuvants to boost immune reactions that further improve the effectiveness of the designed vaccine. Nevertheless, the methods applied in the study focused on the use of stringent parameters to make the predictions as accurate as possible; however, the pipeline used herein suffers from several limitations. For example, predictions about the MEV epitopes to MHC alleles and B cells need extensive validation, although their reliability has been proved widely. An appropriate order of the epitopes in the MEV is required to be investigated experimentally in the context of immunogenicity. Moreover, to confirm its worth against Schistosoma infections, the designed vaccine needs to be validated in animal models.

Data Availability Statement:
The data presented in this study are available within the article.