Candidate Multi-Epitope Vaccine against Corona B.1.617 Lineage: In Silico Approach

Various mutations have accumulated since the first genome sequence of SARS-CoV2 in 2020. Mutants of the virus carrying the D614G and P681R mutations in the spike protein are increasingly becoming dominant all over the world. The two mutations increase the viral infectivity and severity of the disease. This report describes an in silico design of SARS-CoV-2 multi-epitope carrying the spike D614G and P681R mutations. The designed vaccine harbors the D614G mutation that increases viral infectivity, fitness, and the P681R mutation that enhances the cleavage of S to S1 and S2 subunits. The designed multi-epitope vaccine showed an antigenic property with a value of 0.67 and the immunogenicity of the predicted vaccine was calculated and yielded 3.4. The vaccine construct is predicted to be non-allergenic, thermostable and has hydrophilic nature. The combination of the selected CTL and HTL epitopes in the vaccine resulted in 96.85% population coverage globally. Stable interactions of the vaccine with Toll-Like Receptor 4 were tested by docking studies. The multi-epitope vaccine can be a good candidate against highly infecting SARS-CoV-2 variants.


Introduction
The emergence of the Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) which causes COVID-19 has led to a high mortality rate all over the world. The virus attacks the vital organs of the body and leads to fatal respiratory distress. SARS-CoV-2 belongs to a group of viruses known as human coronaviruses [1]. Human coronavirus is a member of the Coronaviridae family that infects the respiratory tract of humans. They can be classified into seven genera of coronaviruses. Four human corona viruses namely HCoV-HKU1, HCoV-OC43, HCoV-NL63, and HCoV-229E, can lead to mild or weak respiratory symptoms [2]. The remaining three, Severe Acute Respiratory Syndrome Coronavirus (SARS-CoV), Middle East Respiratory Syndrome Coronavirus (MERS-CoV), and Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), are highly pathogenic with symptoms including high fever, dry cough, fatigue and can lead to severe pneumonia [3].
The virus targets the angiotensin-converting enzyme 2 (ACE2) receptor of the host cell. Spike glycoprotein (S-protein) of CoV binds to the ACE2 receptor and releases its RNA into the cytoplasm of the host cell that is translated into polyproteins and structural protein. The S protein has two subunits; S1 which harbors the receptor-binding domain (RBD) and responsible for the binding to the host ACE2. The second subunit S2 contains the membrane fusion machinery. [4]. Spike protein is the prime choice for vaccine designing because it is involved in receptor recognition, as well as virus attachment and entry. Most approved candidates of COVID-19 vaccines are based upon the Spike (S) protein, which is the target of neutralizing antibodies [5].

Structural Modeling of Multi-Epitope Vaccine, Refinement, and Validation
The selected CTL and HTL from the spike glycoprotein were joined by the flexible GPGPG linkers to ensure the effective separation of the epitope. The Cholera Toxin B (CTB) adjuvant was added to the N-terminal of the vaccine construct. To induce regulatory immune responses, the 3-D model was generated using the scratch 3Dpro tool [23]. The 3D model is refined by GalaxyRefine [24] and validated by the ProSAweb server [25].

Population Coverage
The IEDB population coverage analysis tool [28] with default parameters, was used to confirm that the designed vaccine is covering the entire world population. The analysis was performed against HLA alleles (Class I and Class II).

Molecular Docking
Protein-protein docking, rigid docking, was employed using the ClusPro 2.0 server [29]. The TLR4 (PDB ID: 4G8A) was used as the immune receptor. Unstructured terminal residues were removed from the TLR-4 structure through ClusPro 2.0 server. Docked complex showing the lowest energy score was selected for molecular dynamics simulation. The iMOD server (iMODS) was used for investigating the molecular dynamics simulation [30]. Codon optimization of the multi-epitope construct was performed by vector builder. The construct was uploaded to the codon adaptation service of vector builder (https: //en.vectorbuilder.com/tool/codon-optimization.html) (accessed on 17 October 2022). In silico cloning of the adapted nucleotide sequence into the pET28a (+) expression vector was performed using SnapGene v4.2 software (GSL Biotech LLC, Chicago, IL, USA). The folding of the expressed RNA sequence was performed using RNAfold server [31].

CTL/HTL Epitope Prediction and Assessment
Both CTL and HTL epitopes have an essential role in the generation of long-lasting adaptive immunity. They have an important role in the induction of humoral and cellular immune responses. HTL epitopes provoke a CD4+ helper reaction, that leads to the induction of CD8+ T-cell memory and the stimulation of B-cells for antibody production [32,33]. The of CTL epitopes (9 mer) was selected using the NetCTL server2.1. The S protein of SARS-CoV-2 Delta variant was scanned for the CTL epitopes. Various immune filters were used to confirm that the predicted epitopes have a strong affinity to MHC class I and class II alleles. The epitopes were tested and been confirmed to be promiscuous, antigenic, and immunogenic. Seven epitopes meeting the previous criteria were selected for construct design (Table 1A). Similarly, the S protein of SARS-CoV-2 Delta variant was scanned for the HTL epitopes. Four epitopes were selected and confirmed to be antigenic, non-allergenic, and non-toxic and selected for construct design (Table 1B). All selected epitopes were checked for autoimmunity against human genome using BLASTp and showed no significance.

Multi-Epitope Vaccine Construct Design
The final epitopes were joined together via flexible GPGPG linkers to ensure that the epitope domains are separated and for decreasing the possibility of creating junctional epitopes ( Figure 1). The sequence of the multi-epitope vaccine was proceeded by Cholera Toxin β subunit (CTB) sequence to improve the overall immunogenicity of the multiepitope. CTB is a non-toxic constituent of cholera toxin, showed high affinity to good mono sialo-tetrahexosylganglioside on the surface of the gut epithelium. Moreover, it also has a high affinity to macrophages, B-cells, and dendritic cells. The CTB sequence should lead to high efficiency in activating the human immune system. The CTB was joined to first epitope in the designed construct by EAAAK linker sequence. The molecular weight of the designed vaccine is 29.2 kDa.

Prediction of Immunogenicity, Antigenicity, Allergenicity, and Physicochemical Parameters
VaxiJen v2.0 and ANTIGENpro were used to predict the antigenic property of the multi-epitope vaccine and it showed 0.67 and 0.899, respectively. The immunogenicity of the predicted vaccine was calculated using IEDB server and yielded 3.4. The AllerTop software confirmed that the multi-epitope vaccine is non-allergenic in nature. The physicochemical characteristics of the construct was investigated by the ProtParam. The vaccine construct has a molecular weight of 29.2 kDa. The theoretical isoelectric point (pI) was 6.06. The multi-epitope vaccine has of 282 amino acids, 22 negative charged aa and 19 positive charged aa. The chemical formula of the vaccine is C1303H2025N355O388S10. It has aliphatic index of 80.32 characteristic of a vaccine candidate that is highly thermostable. The construct should yield a stable protein after expression, as it has instability index of 32.35. The vaccine has a GRAVY index value of -0.063 which reflects the hydrophilic nature of the vaccine. The expected half-life of the vaccine is 30 h in mammalian reticulocyte (in vitro).

Population Coverage Study
The epitopes used in vaccine design were tested for coverage of the covid-19 maximum allele population. Notably, our results showed that the CTL and HTL epitopes used in our design resulted in 96.85% population coverage globally. Population coverage analysis showed that the vaccine should yield a 99.8% coverage for England, 82% for east Africa and 99.3% for Europe. For India and Brazil, that were highly affected by SARS-CoV-2, the lower coverages of 88.89% and 85.39%, respectively were not surprising ( Table 2).

Prediction of Immunogenicity, Antigenicity, Allergenicity, and Physicochemical Parameters
VaxiJen v2.0 and ANTIGENpro were used to predict the antigenic property of the multi-epitope vaccine and it showed 0.67 and 0.899, respectively. The immunogenicity of the predicted vaccine was calculated using IEDB server and yielded 3.4. The AllerTop software confirmed that the multi-epitope vaccine is non-allergenic in nature. The physicochemical characteristics of the construct was investigated by the ProtParam. The vaccine construct has a molecular weight of 29.2 kDa. The theoretical isoelectric point (pI) was 6.06. The multiepitope vaccine has of 282 amino acids, 22 negative charged aa and 19 positive charged aa. The chemical formula of the vaccine is C 1303 H 2025 N 355 O 388 S 10 . It has aliphatic index of 80.32 characteristic of a vaccine candidate that is highly thermostable. The construct should yield a stable protein after expression, as it has instability index of 32.35. The vaccine has a GRAVY index value of -0.063 which reflects the hydrophilic nature of the vaccine. The expected half-life of the vaccine is 30 h in mammalian reticulocyte (in vitro).

Population Coverage Study
The epitopes used in vaccine design were tested for coverage of the covid-19 maximum allele population. Notably, our results showed that the CTL and HTL epitopes used in our design resulted in 96.85% population coverage globally. Population coverage analysis showed that the vaccine should yield a 99.8% coverage for England, 82% for east Africa and 99.3% for Europe. For India and Brazil, that were highly affected by SARS-CoV-2, the lower coverages of 88.89% and 85.39%, respectively were not surprising ( Table 2).

D Structure Modeling and Validation of the Multi-Epitope
As there are no good PDB templates to guide the protein structure prediction process, the 3Dpro tool was used to model the 3D structure of the multi-epitope vaccine (Figure 2). This tool uses a de novo method for structure modeling. GalaxyRefine server was used to refine the final vaccine 3D structure model. The initial generated 3D protein model was refined using GalaxyRefine server. The server-produced five models based on the root-mean-square deviation (RMSD) and MolProbity algorithm (Table 3). Refined model 3, showing the highest Ramachandran value, was selected for performing the docking analysis (Table 3). The final multi-epitope vaccine model structures were validated by using ProSAweb. The overall Z-score was −2.94 for the refined model.). Negative z-score values indicate no error parts inside the model structure. This score indicates acceptable model quality as it is within the range of the comparable sized native proteins ( Figure 3A). The residue scores were checked ( Figure 3B)  The final multi-epitope vaccine model structures were validated by using ProSA-web. The overall Z-score was −2.94 for the refined model). Negative z-score values indicate no error parts inside the model structure. This score indicates acceptable model quality as it is within the range of the comparable sized native proteins ( Figure 3A). The residue scores were checked ( Figure 3B

Molecular Docking Analyses of Multi-Epitope-Based Vaccine against TLR4
Molecular docking of the vaccinse with TLR4 was examined by ClusPro 2.0 server. The antibody mode of ClusPro 2.0 server was selected as this mode was developed for docking antibody and antigen pairs. To improve the docking results, residues that do not fall in the complementarity-determining regions were masked. TLRs have an essential role in innate immunity activation and organization of the adaptive immune response as they detect structurally conserved molecules derived from microbes and viruses [34]. Activation of Toll-like receptor 4 leads to an intracellular signaling pathway NF-κB and inflammatory cytokine production that trigger the innate immune system. The role of TLR4 in the generation of an effective immune response against SARS-CoV-2 has been studied [35]. The 3D model of the vaccine was docked with TLR4 (PDB ID: 4G8A) immune receptor by ClusPro 2.0 server. The lowest energy score, -1346.3, was selected as the best-docked complex, suggesting that the vaccine model occupies the receptor accurately and showing high binding affinity (Figure 4).

Molecular Docking Analyses of Multi-Epitope-Based Vaccine against TLR4
Molecular docking of the vaccinse with TLR4 was examined by ClusPro 2.0 server. The antibody mode of ClusPro 2.0 server was selected as this mode was developed for docking antibody and antigen pairs. To improve the docking results, residues that do not fall in the complementarity-determining regions were masked. TLRs have an essential role in innate immunity activation and organization of the adaptive immune response as they detect structurally conserved molecules derived from microbes and viruses [34]. Activation of Toll-like receptor 4 leads to an intracellular signaling pathway NF-κB and inflammatory cytokine production that trigger the innate immune system. The role of TLR4 in the generation of an effective immune response against SARS-CoV-2 has been studied [35]. The 3D model of the vaccine was docked with TLR4 (PDB ID: 4G8A) immune receptor by ClusPro 2.0 server. The lowest energy score, −1346.3, was selected as the best-docked complex, suggesting that the vaccine model occupies the receptor accurately and showing high binding affinity (Figure 4).

Molecular Dynamics Simulation of the Vaccine-Receptor Complex
Molecular dynamics simulation was used to study the binding affinity of the vaccine-TLR4 docked complex and to assess its stability and physical movements over time. Analysis of the main-chain deformability and the mobility stiffness of the residues in the complex showed low distortion in the residues of the complex ( Figure 5A). The B-factor values were proportional to root mean square (see Figure 5B). B-factor values determines uncertainty of the atomic position, which includes the effects of noise due to model errors and lattice defects, in addition to the positional variance of thermal protein motion. The eigenvalue of the complex is 5.426 × 10 −6 and it increased gradually in each mode during the dynamics ( Figure 5C). The covariance matrix between the pairs of residues is shown in Figure 5D,E shows the elastic network model in the docked complex. The results suggest the stability of the interface between the TLR4-vaccine complex

Molecular Dynamics Simulation of the Vaccine-Receptor Complex
Molecular dynamics simulation was used to study the binding affinity of the vaccine-TLR4 docked complex and to assess its stability and physical movements over time. Analysis of the main-chain deformability and the mobility stiffness of the residues in the complex showed low distortion in the residues of the complex ( Figure 5A). The B-factor values were proportional to root mean square (see Figure 5B). B-factor values determines uncertainty of the atomic position, which includes the effects of noise due to model errors and lattice defects, in addition to the positional variance of thermal protein motion. The eigenvalue of the complex is 5.426 × 10 −6 and it increased gradually in each mode during the dynamics ( Figure 5C). The covariance matrix between the pairs of residues is shown in Figure 5D,E shows the elastic network model in the docked complex. The results suggest the stability of the interface between the TLR4-vaccine complex

Codon Adaptation, In Silico Cloning and Testing the Stability of the Secondary Structure of the Produced mRNA
Codon optimization of the designed multi-epitope was performed to improve the translation efficiency of the construct design. The codons of the designed sequence were adjusted to the E. coli K12 on the vector builder service ( Figure 6A). The modified nucleotide sequence showed a codon adaptation index (CAI) value of 0.93 guanine-cytosine (GC) content of 59.48%. Hind III and Bam HI restriction sites were added at the start and end to clone the insert into the pET28a (+) vector. The modified sequence was in silico cloned in the pET28a (+) cloning vector using SnapGene software ( Figure 6B). The stability of the secondary structure of the produced mRNA was studied by RNA fold server. The thermodynamic stability of the mRNA structure is indicated by the minimal free energy of -388.60 kcal/mol. The initial 12 nucleotides of the mRNA secondary structure were free of any pseudoknots or long stable hairpins (Figure 7). Codon optimization of the designed multi-epitope was performed to improve the translation efficiency of the construct design. The codons of the designed sequence were adjusted to the E. coli K12 on the vector builder service ( Figure 6A). The modified nucleotide sequence showed a codon adaptation index (CAI) value of 0.93 guanine-cytosine (GC) content of 59.48%. Hind III and Bam HI restriction sites were added at the start and end to clone the insert into the pET28a (+) vector. The modified sequence was in silico cloned in the pET28a (+) cloning vector using SnapGene software ( Figure 6B). The stability of the secondary structure of the produced mRNA was studied by RNA fold server. The thermodynamic stability of the mRNA structure is indicated by the minimal free energy of −388.60 kcal/mol. The initial 12 nucleotides of the mRNA secondary structure were free of any pseudoknots or long stable hairpins (Figure 7).

Discussion
Since the release of the first whole genome sequencing of corona virus on February 2020 [36], mutations in the S1 sub-unit of SARS-CoV-2, such as D614G and P681R have developed and become dominant within few months. Investigations suggest a correlation between the D614G mutation and increased viral load, which give the possibility that the

Discussion
Since the release of the first whole genome sequencing of corona virus on February 2020 [36], mutations in the S1 sub-unit of SARS-CoV-2, such as D614G and P681R have developed and become dominant within few months. Investigations suggest a correlation between the D614G mutation and increased viral load, which give the possibility that the

Discussion
Since the release of the first whole genome sequencing of corona virus on February 2020 [36], mutations in the S1 sub-unit of SARS-CoV-2, such as D614G and P681R have developed and become dominant within few months. Investigations suggest a correlation between the D614G mutation and increased viral load, which give the possibility that the D614G mutation increase the infectivity of the virus [7]. Another reports patients carrying strains with G614 have higher airway viral loads but not with severe disease symptoms [37].
On the other hand, another study found a correlation between G614 mutation and higher case fatality rate of COVID-19 [38]. The P681R is a critical mutation as it positioned at the furin cleavage site that splits the S protein to S1 and S2 subunits. It also increases the viral replication of the virus [10].
This study aimed to design a multi-epitope vaccine to the corona virus and more specifically to the corona B.1.617 lineage. The designed vaccine, based on short immunogenic sequences that can generate both humoral and cell mediated immunity. This method of vaccine construction offers improved safety levels and provides the opportunity to specifically attach/engineer combinations of epitopes for augmented potency. This approach offers an alternative for recombinant vaccine technology and spares the use of whole genome/large proteins. This will relieve the extra antigenic load and the allergenic responses [39,40]. Moreover, the conventional method of vaccine designing using large proteins can produce unnecessary antigenic load and may lead to increased chances of allergenic responses [41]. Moreover, this strategy gives the possibility of selecting epitopes producing the anticipated response in largest possible human population [14].
The designed multi-epitope followed the same strategy used by Kar et al. [17]. The multi-epitope was linked at its N-terminal by Cholera Toxin B (CTB) with appropriate linkers. This sequence was verified as a strong viral adjuvant in many studies [42][43][44]. Furthermore, GPGPG was used to join the selected epitopes and make them more reachable. The selected epitopes were confirmed to bind with multiple MHC class I and MHC class II alleles, following the same approach used by Bazhan et al. [45]. One epitope (YQGVNCTEV), showing antigenicity of 1.39 and immunogenicity of 0.08, contains the D614G mutation. The D614G mutation resulted in higher pseudovirus titers in multiple cell types, suggesting that this mutation might be associated with infectivity of the COVID-19 virus and enhanced replication in airways [7,46]. Another epitope (RRRARSVAS), showed antigenicity of 1.16 and immunogenicity of 0.008, contains the P681R mutation. The P681R mutation is the most representative mutation in B.1.617 lineage. Recent studies showed that the D614G/P681R pseudovirus was more resistant to the monoclonal antibodies targeting the RBD domain of the corona virus [9].
The instability index of the designed vaccine yielded 32.35, which predicts that the expressed protein to be stable [27]. The protein is predicted to be highly thermostable as its aliphatic index showed 80.32 [47]. The GRAVY index of vaccine construct was found to be −0.063 (lower the GRAVY score, better is the solubility). These values are similar to the values of Kar et. al. Foroutan et al., used the same sequence of in silico analysis against Toxoplasma gondii. They validated the activation of strong humoral and cellular responses by their multi-epitope vaccine in mice by laboratory experiment [48]. They computed the instability index (II) of the vaccine to be 66.37, aliphatic index and GRAVY of multi-epitope peptide were 57.76 and -0.449, respectively. The predicted values of the instability index, aliphatic index and GRAVY were found to be comparable when compared to the previous values. The prediction of the vaccine tertiary structure model was performed using the 3Dpro tool and refined using the ProSA web server and the Z-score assessment yielded a score of −2.94. The obtained Z-score makes our model among the determined structures solved by NMR and X-ray crystallographic experiments.
The binding affinity of complexes of the developed vaccine and TLR4 receptor was confirmed by the ClusPro server. Generation of a stable immune response has to be confirmed, in any in-silico-designed vaccine. TLRs are the main target for these kinds of docking experiments as TLRs have an essential role in innate immunity activation and synchronization of the adaptive immune response against microbes, including viruses [34,49,50]. In this study, TLR4 was selected because of its confirmed role in viral structural proteins recognition leading to inflammatory cytokine production [51]. Moreover, various studies confirmed its role in the generation of an effective immune response against SARs-COV, which makes most in silico approaches test their construct again TLRs [17,50,[52][53][54][55][56]. The docking and MD simulation was performed according to Yang et al. [48]. iMOD server was selected for performing the molecular dynamics simulation as it simplifies and creates potential transition pathways between two homologous structures. Toll-like receptor 4 was selected as a target to the designed vaccine as many studies confirm the recognition of the S protein to this receptor [50,57].
The protein sequence of the multi-epitope was reverse translated into its specific cDNA sequence, and its codon was optimized for expression in E. coli. Optimization results showed a codon adaptation index (CAI) value of 0.93 and (GC) content of 59.48%. Basically, a (CAI) value > 0.8 and GC content between 30 and 70% are considered efficient for protein expression in the host system [58]. It was in silico cloned into the pET-28a (+) expression vector. We followed the same approach used by Foroutan et al. for the efficient in vitro expression of the vaccine. The allergenecity, physicochemical properties, instability index and aliphatic index of our design are comparable to those predicted by Foroutan et al. [48]. This group has validated their in silico design by laboratory experiments and has confirmed trigging strong humoral and cellular responses in mice.

Conclusions
Application of immunoinformatic tools in the design of vaccines propose effective vaccines in shorter time with reduced cost. In this report, a multi-epitope SARS-CoV-2 vaccine was designed. Two epitopes carrying the D614G and P681R mutations responsible for increasing the viral infectivity and severity of the disease, were included. The spike multi-epitope protein was in silico tested for the induction of CTLs and HTLs. The designed multi-epitope can be a promising vaccine against strains carrying the any of the two mutations. Despite the high efficacy of the in silico designed candidate vaccine, experimental investigations are needed for its validation.  Data Availability Statement: The datasets used and/or analyzed in the current study are available from the corresponding author on reasonable request.