Reducing the Immunogenicity of Pulchellin A-Chain, Ribosome-Inactivating Protein Type 2, by Computational Protein Engineering for Potential New Immunotoxins

: Pulchellin is a plant biotoxin categorized as a type 2 ribosome-inactivating protein (RIPs) which potentially kills cells at very low concentrations. Biotoxins serve as targeting immunotoxins (IT), consisting of antibodies conjugated to toxins. ITs have two independent protein components, a human antibody and a toxin with a bacterial or plant source; therefore, they pose unique setbacks in immunogenicity. To overcome this issue, the engineering of epitopes is one of the beneﬁcial methods to elicit an immunological response. Here, we predicted the tertiary structure of the pulchellin A-chain (PAC) using ﬁve common powerful servers and adopted the best model after reﬁning. Then, predicted structure using four distinct computational approaches identiﬁed conformational B-cell epitopes. This approach identiﬁed some amino acids as a potential for lowering immunogenicity by point mutation. All mutations were then applied to generate a model of pulchellin containing all mutations (so-called PAM). Mutants’ immunogenicity was assessed and compared to the wild type as well as other mutant characteristics, including stability and compactness, were computationally examined in addition to immunogenicity. The ﬁndings revealed a reduction in immunogenicity in all mutants and signiﬁcantly in N146V and R149A. Furthermore, all mutants demonstrated remarkable stability and validity in Molecular Dynamic (MD) simulations. During docking and simulations, the most homologous toxin to pulchellin, Abrin-A was applied as a control. In addition, the toxin candidate containing all mutations (PAM) disclosed a high level of stability, making it a potential model for experimental deployment. In conclusion, by eliminating B-cell epitopes, our computational approach provides a potential less immunogenic IT based on PAC.


Three-Dimensional Structure Validation
Available web applications were utilized in order to choose the optimal model including Verify 3D (http://services.mbi.ucla.edu/Verify3D, accessed on 3 April 2022), ERRAT (http://services.mbi.ucla.edu/ERRAT, accessed on 3 April 2022), PROCHECK (https://servicesn.mbi.ucla.edu/PROCHECK, accessed on 3 April 2022), and ProSA-web (https://prosa.services.came.sbg.ac.at/prosa.php, accessed on 3 April 2022). By defining a structural class depending on its position and surroundings (alpha, beta, loop, polar, and nonpolar), the Verify 3D determines the compatibility of an atomic model (3D) with its sequence of amino acids. It examines the outputs and compares them to qualified structures [25]. ERRAT program is used to validate bad contacts of the protein structure. The error function is founded on nonbonded atom-atom interactions, represented in the structure [26]. The ProSA software takes advantage of the benefits of interactive web-based programs to present scores and energy graphs that identify possible issues in protein structures. The z-score evaluates the variation of the total energy of the structure from an energy distribution calculated from random conformations and reflects the ultimate quality of the model [27]. PROCHECK analyzes the total and residue-by-residue geometry of a protein structure, providing a Ramachandran plot that shows (ϕ) and (ψ) bond angles. In the plot, the amino acids are split into four sections (favored, additionally allowed, generously allowed, and disallowed) [28]. According to the validation results, the most favorable model was chosen for refinement.

Selected Model Refinement
GalaxyRefine2 was used to refine the chosen model (http://galaxy.seoklab.org/cgibin/submit.cgi?type=REFINE2, accessed on 5 April 2022). This server exploits molecular dynamics simulation to achieve repetitive structure perturbation and subsequent structural relaxation. As opposed to GalaxyRefine, GalaxyRefine2 protocol is said to be more precise, offering ten refinement models and only up to 300 amino acids are allowed for refinement submission [29]. Validation servers (Verify 3D, ERRAT, PROCHECK, and ProSA-web) assessed the refinement models' validity, and ultimately, the top refined model was selected for subsequent steps.

Predicting Conformational B-Cell Epitopes
The final pulchellin A-Chain (PAC) model was exploited to identify possible B-cell epitopes in the form of discontinued structure. Various web servers (described below) were employed to achieve this goal. Each residue's score was calculated using the servers listed below. Several residues were determined as possible epitopes based on servers' scores. Ultimately, in common residues were marked as immunogenic spots. EPSVR (Antigenic Epitopes Prediction with Support Vector Regression) is a discontinuous information-based predictor of B-cell antigenic epitopes on protein surfaces (http://sysbio.unl.edu/EPSVR, accessed on 8 April 2022). There are six distinct characteristics employed by the EPSVR server: residue epitope propensity, conservation score, side chain energy score, contact number, surface planarity score, and secondary structure composition. There is no such issue as a threshold for this server [30]. DiscoTope 2.0 is a new version of the DiscoTope J 2023, 6 88 method that includes a new spatial neighborhood definition and a half-sphere exposure as a surface measure (https://services.healthtech.dtu.dk/service.php?DiscoTope-2.0, accessed on 8 April 2022). Amino acid statistics, spatial information, and surface accessibility are the features on which this server is based. The default threshold (−3.7) was set for prediction [31]. Geometrical characteristics of protein structure is a method that the Ellipro server exploits to predict conformational B-cell epitopes (http://tools.iedb.org/ellipro, accessed on 8 April 2022). This server uses three algorithms: ellipsoid prediction of the protein structure, The residue protrusion index (PI) measuring, and employing PI values to cluster the adjacent residues. Ellipro utilizes two features to define a threshold: minimum residue score and maximum distance (Å). These parameters were set at 0.5 and 6 respectively (default values) [32]. SEPPA creates a triangular unit patch by combining exposed and nearby residual features (http://www.badd-cao.net/seppa3/index.html, accessed on 8 April 2022). SEPPA 3.0 is the most recent version. It has improved performance on common protein antigens by updating the training dataset and incorporating new characteristics that allow for reliable prediction of N-linked glycoprotein antigens. The prediction threshold was the default value (Threshold: 0.089) [33]. The AUC (Area under the ROC Curve) is a number that varies from 0 to 1 and represents the method's total performance. The AUC of a model whose predictions are 100%incorrect is 0.0, whereas the AUC of a model whose predictions are 100%accurate is 1.0 [34]. For EPSVR, DiscoTope 2.0, Ellipro, and SEPPA 3.0, the AUC scores were 0.597, 0.727, 0.732, and 0.740, respectively.

Establishment of Mutants
Initially, the 3D structure of PAC was exposed to the ConSurf server (https://consurf. tau.ac.il/, accessed on 10 April 2022) to determine whether candidate immunogenic residues are classified as highly conserved and functional residues or not. This server examines the evolutionary pattern of the macromolecule's amino acids and nucleic acids to identify sections crucial for structure and function [35]. After showing that proposed immunogenic residues do not have a particular role, three non-polar amino acids having a short side chain (A, V, L) were chosen for point mutations.

Obtaining the 3D Structure of Mutants and Evaluating Their Initial Properties
To create mutant models and assess their stability, the SDM2 server was employed (http://marid.bioc.cam.ac.uk/sdm2, accessed on 12 April 2022). This free web tool calculates a stability score using a statistical potential energy function and provides The PDB (Protein Data Bank) file format for mutants [36]. The validity of the 3D structure of each mutant supplied from the SDM2 server was subjected to validation servers (similarly performed in Section 2.3).

Analyzing Immunogenicity of Mutants
The immunogenicity of each mutant was investigated using the servers indicated before (EPSVR, DiscoTope 2.0, ElliPro, and SEPPA 3.0). Score changes of antigenic spots were scrutinized at the location of mutations. Supportive analysis was performed by SDM2 and Discovery Studio 4.5 software. SDM2 provides structural environment data like side chain solvent accessibility, residue depth, and packing density [36]. Furthermore, sidechain accessibility and hydrophobicity of mutants were assessed by Discovery Studio 4.5 software.

Building Pulchellin Containing All Mutations Model
After establishing point mutation, the pulchellin containing all mutations (PAM), including T82A, T100A, D101V, Q121A, N146V, D147A, and R149A, was predicted and refined by GalaxyTBM and GalaxyRefine2, respectively. The validity of 3D structure of PAM was measured by validation servers similar to the previous step in Section 2.3. The immunogenicity of the final model of PAM was then examined by the servers mentioned previously (EPSVR, DiscoTope 2.0, ElliPro, and SEPPA 3.0).

Molecular Docking
The molecular docking was accomplished by AutoDock Vina version 4.2 [37]. PAC wildtype and mutant models were employed as receptors, and the active sites of PAC were identified using homology to other RIP-2 families like Ricin [8]. Using residues C13 through G18 from the RNA sequence (5 GGGUGCUCAGUACGAGAGGAACCGCACCC3 ), the structure of the ligand was retrieved from the 29-mer Sarcin-Ricin loop (PDB ID 1SCL) [38]. The underlined characters represented the ligand sequence and "A" bold font shows the target adenine ( Figure 1). The grid box size was set at 20 × 22 × 22 Å for x, y, and z, respectively, with 1 Å spacing between the grid points. The grid center of x, y, and z, was set to 68, 10, and 37, respectively. To acquire specific control on the docking process of mutants, docking of Abrin-A with CGAGAG was performed simultaneously. For this purpose, the x, y, and z grid centers were set at 67, 7, and 37, respectively without changing other parameters. Each docking was carried out 100 runs utilizing the Lamarckian Genetic Algorithm (LGA). The graphics of interaction quality was plotted by AutoDockTools-1.5.6.
were identified using homology to other RIP-2 familie through G18 from the RNA sequence (5′GGGU CACCC3′), the structure of the ligand was retrieved (PDB ID 1SCL) [38]. The underlined characters represe bold font shows the target adenine ( Figure 1). The grid for x, y, and z, respectively, with 1 Å spacing between x, y, and z, was set to 68, 10, and 37, respectively. To acq process of mutants, docking of Abrin-A with CGAGA For this purpose, the x, y, and z grid centers were set a changing other parameters. Each docking was carried ian Genetic Algorithm (LGA). The graphics of interac DockTools-1.5.6.

Molecular Dynamics Simulation
MD (Molecular dynamics) simulations could be stability and structural change of macromolecules u GROMACS and the CHARMM 27 all-atom force field tions of all docking Protein-ligand complexes (Abrin-A CGAGAG) [39]. The topology files of the receptor wer topology of ligands was prepared by SwissParam [40 produce ligand topology autonomously. A dodecahe water molecules was applied to solvate the complex, w complex and the solvated box's edge. After applying ated system, the systems were energy minimized usi The "genrestr" module was used to restrain the ligand

Molecular Dynamics Simulation
MD (Molecular dynamics) simulations could be a valuable technique for studying stability and structural change of macromolecules under physiological circumstances. GROMACS and the CHARMM 27 all-atom force field were used to perform MD simulations of all docking Protein-ligand complexes (Abrin-A, PAC mutants and wild type with CGAGAG) [39]. The topology files of the receptor were generated by GROMACS but the topology of ligands was prepared by SwissParam [40], since GROMACS is not able to produce ligand topology autonomously. A dodecahedron box of 48 × 60 × 48 Å TIP3P water molecules was applied to solvate the complex, with a spacing of 10 Å between the complex and the solvated box's edge. After applying sodium ions to neutralize the solvated system, the systems were energy minimized using the steepest descent technique. The "genrestr" module was used to restrain the ligand position and temperature coupling groups were set at Protein_LIG water_and_ions. The system was equilibrated with velocity-rescale thermostat at 300 K (reference temperature) for 100 ps using NVT (constant Number of particles, Volume, and Temperature). Then, NPT with Berendsen pressure coupling at one atmosphere (reference pressure) (constant Number of particles, Pressure, and Temperature) for another 100 ps. The long-range electrostatic interactions were estimated using Particle Mesh Ewald (PME), and for covalent bond constraints, the Linear Constraint Solver (LINCS) algorithm was deployed [41]. Following this, the MD simulation of the equilibrated system was run for 100 ns. The assessment of RMSD (root mean square deviation), RMSF (root mean square fluctuation), the radius of gyration (Rg), solvent accessible surface area (SASA), and hydrogen bonding (H-bonds) was performed using GORMACS MD simulation trajectories. Additionally, binding energy between mutants and CGAGAG were estimated using MM/PBSA (Molecular Mechanics Poisson-Boltzmann Surface Area) tool based on MMPBSA.py of AmberTools20 [42].

Results and Discussion
Exploiting plants to synthesize toxins like pulchellin, might promote the production of recombinant immunotoxins (antibody-toxin fusion proteins) [43]. The use of toxic enzymes to make ITs is a vital field. The main limits to the clinical use of heterologous proteins derive from their immunogenicity and, to a lesser extent, from their systemic toxicity. The creation of macromolecules with lower immunogenicity could be accomplished effectively by theoretical protein engineering [44]. In this regard, the immunogenic zones of pulchellin were discovered by an in silico analysis of conformational B-cell epitopes, and then, point mutations were employed to obtain the least immunogenic mutants without influencing the stability and functional features of the wild type model.

Toxin Selection and Structural Prediction
Four isoforms for pulchellin have been discovered (P I, P II, P III, P IV). Isoforms P I and P II have been indicated to have similar toxicities, and both are more poisonous than P III and P IV [8]. Among all isoforms, it has been established that isoform II has the highest level of toxicity and the corresponding recombinant protein has also been created. [4]. Moreover, in another study, this isoform was used to produce immunotoxin against HIV envelope [10], which appeared to be an important compound according to its ability to specific cell targeting. This led to the selection of isoform P II for examination in this study. Because the crystal structure of pulchellin has not been determined, the first stage in the study was to make a reliable prediction of the tertiary structure of pulchellin isoform II. Using various online servers, I-TASSER, GalaxyWEB, Phyre2, RaptorX, and SWISS-MODEL, the tertiary structure of the PAC was created. Based on template modeling, all servers predict just one final model except GalaxyWEB, which indicated five last models. As a template, this server supplied two crystal structures of the Abrin-A A chain (PDB ID 1ABR and 5Z37). The identity of PAC and Abrin-A A chain was determined 78% when using NCBI blast. The similarities between the Abrin-A PAC and the A-chain were investigated using pairwise alignment. The total identity values were discovered to be about 78%, and the catalytic residues of PAC were found to be identical to Abrin-A [8]. To choose the most suitable model, all produced models were checked using PROCHECK, Verify 3D, ERRAT, and ProSA-web. The outputs of validation are represented in Supplementary Table S1.
Among Galaxy predicted models, model 4 was chosen for subsequent development based on the validation results in Supplementary Table S2. After subjecting this model to GalaxyRefine2 and assessing the validation scores of refined models, the refined model 2 (PAC) was ultimately selected for the immunogenicity survey. Figure 2 depicts the validation graphs for the final model.
According to Verify 3D results, 98.01% of modeled PAC residues had an average 3D-1D greater than or Equal to 0.2. Overall Quality Factor of 97.51 was received from ERRAT server. Obtaining a −7.74 Z-score from ProSA-web represented that PAC comes within the normal range of scores for natural proteins with comparable size. Rama-chandran plot revealed that 93.4%, 6.1%, 0.0% and 0.4% of residues were positioned in favored, additionally allowed, generously allowed, and disallowed, respectively. In fact, there is only one residue (R31) in the disallowed region. This amino acid is located in β-sheet and is not involved in an immunogenic domain or an active site, therefore its presence in the disallowed zone does not contradict with the present project (Table 1). Table 1. Validation scores of top Galaxy Model 4, before and after refinement. Verify 3D (based on percentage of the residues had an averaged 3D-1D score ≥ 0.2), ERRAT (based on overall quality factor), ProSA-WEB (based on Z-score), and Ramachandran Plot (F: Favored, AA: additionally allowed, GA: generously allowed, D: disallowed). The best model of refinement with the highest score was selected for the next steps.

Models
Verify

Immunogenic Epitopes Prediction and Making Mutants
The four servers uncovered seven shared residues (out of 251 total residues) are strong immunogenic sites, which are T82, T100, D101, Q121, N146, D147, and R149. Except for R149, located at the alpha helix, the remaining chosen residues are found at beta strands. According to ConSurf server data, these residues are not in the active site and are not categorized as highly conserved amino acids. The discontinuous B-cell epitopes are visually portrayed in

Immunogenic Epitopes Prediction and Making Mutants
The four servers uncovered seven shared residues (out of 251 total residues) are strong immunogenic sites, which are T82, T100, D101, Q121, N146, D147, and R149. Except for R149, located at the alpha helix, the remaining chosen residues are found at beta strands. According to ConSurf server data, these residues are not in the active site and are not categorized as highly conserved amino acids. The discontinuous B-cell epitopes are visually portrayed in Figure 3. These residues were altered to alanine, valine, or Leucine. Of note, the presence of aromatic amino and large hydrophilic side chains, which are known to elicit a robust immunological response, has been confirmed [45,46]. Furthermore, it has been demonstrated that by identifying B-cell epitopes and making a point mutation in the immunogen residue, the immunogenicity of the protein could be reduced [47]. The mutation can be achieved by using nonpolar amino acids with small side chains, which could lead to a decrease in the immunogenicity of the immunogenic residues [13]. In this study, three small side chain amino acids were used for mutation: Ala, Val and Leu.
J 2023, 6, FOR PEER REVIEW immunological response, has been confirmed [45,46]. Furthermore, it has been demons that by identifying B-cell epitopes and making a point mutation in the immunogen re the immunogenicity of the protein could be reduced [47]. The mutation can be achiev using nonpolar amino acids with small side chains, which could lead to a decrease in th munogenicity of the immunogenic residues. [13]. In this study, three small side chain a acids were used for mutation: Ala, Val and Leu.

Making Mutants and Evaluation Stability and Immunogenicity
According to the evaluation of stability and immunogenicity of each mutation (Su  mentary Table S3), seven final mutants were obtained: T82A, T100A, D101V, Q121V, N D147A, and R149A. The stability of all mutants has increased after mutation except D according to SDM2 server results. Immunogenicity changes and SDM2 overall stability tained mutants are described in Table 2. SDM2 employs two structural parameters for e ating protein stability: residue-occluded packing density (OSP) and residue depth. High due packing density areas (4 and 8 Å depth levels) are commonly observed with extre

Making Mutants and Evaluation Stability and Immunogenicity
According to the evaluation of stability and immunogenicity of each mutation (Supplementary Table S3), seven final mutants were obtained: T82A, T100A, D101V, Q121V, N146V, D147A, and R149A. The stability of all mutants has increased after mutation except D101V, according to SDM2 server results. Immunogenicity changes and SDM2 overall stability of obtained mutants are described in Table 2. SDM2 employs two structural parameters for evaluating protein stability: residue-occluded packing density (OSP) and residue depth. High residue packing density areas (4 and 8 Å depth levels) are commonly observed with extremely destabilizing mutations. Destabilizing mutations have negative pseudo ∆∆G levels, whereas stabilizing mutations exhibit positive pseudo ∆∆G values. The mutant T100A displayed the most stable one according to the SDM2 results. All mutations, according to the EPSVR and DiscoTope servers, result in a reduction in immunogenicity. Except for T100A and D101V, practically all mutants reported a decline in immunogenicity in SEPPA analysis. In ElliPro conformational server, however, no significant changes were identified. Bioinformatics techniques and databases are critical for finding appropriate epitopes and lowering the immunogenicity of a targeted protein [48]. In this study, we just used conformational B-cell epitopes (not linear) because protein has a three-dimensional structure in its native state and predicting linear epitopes for immunogenicity could not be an effective strategy. In fact, defining epitopes is considerably dependent on conformational structure [49]. There are several methods and servers which could predict conformational B-cell epitope. Each server takes into account a variety of factors and by using a combination of servers, identifying immunogenic areas could be more precise [50]. The PAC epitopes were defined using four conformational B-cell epitope prediction systems with four distinct methods. For the prediction of conformational B-cell epitope in surface protein in SARSCoV2, Lon et al. employed the SEPPA3.0 and Ellipro servers [51]. Ellipro, EPSVR, and DiscoTope servers were used in another investigation to discover potent B-and T-cell epitopes of four structural proteins of SARS-CoV-2 [52]. is considerably dependent on conformational structure [49]. There are several methods and servers which could predict conformational B-cell epitope. Each server takes into account a variety of factors and by using a combination of servers, identifying immunogenic areas could be more precise [50]. The PAC epitopes were defined using four conformational B-cell epitope prediction systems with four distinct methods. For the prediction of conformational B-cell epitope in surface protein in SARSCoV2, Lon et al. employed the SEPPA3.0 and Ellipro servers [51]. Ellipro, EPSVR, and DiscoTope servers were used in another investigation to discover potent B-and T-cell epitopes of four structural proteins of SARS-CoV-2 [52]. Table 2. Score changes of immunogenic residues according to conformational B-cell epitope predictor servers alongside prediction of overall stability through SDM2 server.

Validation Analysis and Investigating Further Properties of Mutants
Because the mutants were obtained from the SDM2 server, it was required to assess their 3D structure validation once again. For this purpose, all mutants were subjected to validation servers, and the results were highly acceptable (Supplementary Table S4), except for the D101V mutation, which showed a small drop in ERRAT output (from 97.51 to 95.85) and a decline in the number of residues in the favored region in Ramachandran plot (decreased from 93.4 to 93). Nonetheless, this reduction was unimportant because D101V conformational structure reliability remained within the acceptable limit. Side chain accessibility and hydrophobicity are two features that play a role in immunogenicity. Residues that are more accessible and, as a result, more hydrophilic can be more likely to form epitopes [53,54]. The accessibility of side chains could be reduced as a result of mutations to small side chain amino acids, resulting in a reduction in antigenicity. According to the SDM server and Discovery studio results, the most reduction in residues accessibility was observed in R149A (Table 3). However, following mutation, the N146V mutant revealed is considerably dependent on conformational structure [49]. There are several methods and servers which could predict conformational B-cell epitope. Each server takes into account a variety of factors and by using a combination of servers, identifying immunogenic areas could be more precise [50]. The PAC epitopes were defined using four conformational B-cell epitope prediction systems with four distinct methods. For the prediction of conformational B-cell epitope in surface protein in SARSCoV2, Lon et al. employed the SEPPA3.0 and Ellipro servers [51]. Ellipro, EPSVR, and DiscoTope servers were used in another investigation to discover potent B-and T-cell epitopes of four structural proteins of SARS-CoV-2 [52].

Validation Analysis and Investigating Further Properties of Mutants
Because the mutants were obtained from the SDM2 server, it was required to assess their 3D structure validation once again. For this purpose, all mutants were subjected to validation servers, and the results were highly acceptable (Supplementary Table S4), except for the D101V mutation, which showed a small drop in ERRAT output (from 97.51 to 95.85) and a decline in the number of residues in the favored region in Ramachandran plot (decreased from 93.4 to 93). Nonetheless, this reduction was unimportant because D101V conformational structure reliability remained within the acceptable limit. Side chain accessibility and hydrophobicity are two features that play a role in immunogenicity. Residues that are more accessible and, as a result, more hydrophilic can be more likely to form epitopes [53,54]. The accessibility of side chains could be reduced as a result of mutations to small side chain amino acids, resulting in a reduction in antigenicity. According to the is considerably dependent on conformational structure [49]. There are several methods and servers which could predict conformational B-cell epitope. Each server takes into account a variety of factors and by using a combination of servers, identifying immunogenic areas could be more precise [50]. The PAC epitopes were defined using four conformational B-cell epitope prediction systems with four distinct methods. For the prediction of conformational B-cell epitope in surface protein in SARSCoV2, Lon et al. employed the SEPPA3.0 and Ellipro servers [51]. Ellipro, EPSVR, and DiscoTope servers were used in another investigation to discover potent B-and T-cell epitopes of four structural proteins of SARS-CoV-2 [52]. Table 2. Score changes of immunogenic residues according to conformational B-cell epitope predictor servers alongside prediction of overall stability through SDM2 server.

Validation Analysis and Investigating Further Properties of Mutants
Because the mutants were obtained from the SDM2 server, it was required to assess their 3D structure validation once again. For this purpose, all mutants were subjected to validation servers, and the results were highly acceptable (Supplementary Table S4), except for the D101V mutation, which showed a small drop in ERRAT output (from 97.51 to 95.85) and a decline in the number of residues in the favored region in Ramachandran plot (decreased from 93.4 to 93). Nonetheless, this reduction was unimportant because D101V conformational structure reliability remained within the acceptable limit. Side chain accessibility and hydrophobicity are two features that play a role in immunogenicity. Residues that are more accessible and, as a result, more hydrophilic can be more likely to form epitopes [53,54]. The accessibility of side chains could be reduced as a result of mutations to small side chain amino acids, resulting in a reduction in antigenicity. According to the SDM server and Discovery studio results, the most reduction in residues accessibility was observed in R149A (Table 3). However, following mutation, the N146V mutant revealed is considerably dependent on conformational structure [49]. There are several methods and servers which could predict conformational B-cell epitope. Each server takes into account a variety of factors and by using a combination of servers, identifying immunogenic areas could be more precise [50]. The PAC epitopes were defined using four conformational B-cell epitope prediction systems with four distinct methods. For the prediction of conformational B-cell epitope in surface protein in SARSCoV2, Lon et al. employed the SEPPA3.0 and Ellipro servers [51]. Ellipro, EPSVR, and DiscoTope servers were used in another investigation to discover potent B-and T-cell epitopes of four structural proteins of SARS-CoV-2 [52]. Table 2. Score changes of immunogenic residues according to conformational B-cell epitope predictor servers alongside prediction of overall stability through SDM2 server.

Validation Analysis and Investigating Further Properties of Mutants
Because the mutants were obtained from the SDM2 server, it was required to assess their 3D structure validation once again. For this purpose, all mutants were subjected to validation servers, and the results were highly acceptable (Supplementary Table S4), except for the D101V mutation, which showed a small drop in ERRAT output (from 97.51 to 95.85) and a decline in the number of residues in the favored region in Ramachandran plot (decreased from 93.4 to 93). Nonetheless, this reduction was unimportant because D101V conformational structure reliability remained within the acceptable limit. Side chain ac- is considerably dependent on conformational structure [49]. There are several methods and servers which could predict conformational B-cell epitope. Each server takes into account a variety of factors and by using a combination of servers, identifying immunogenic areas could be more precise [50]. The PAC epitopes were defined using four conformational B-cell epitope prediction systems with four distinct methods. For the prediction of conformational B-cell epitope in surface protein in SARSCoV2, Lon et al. employed the SEPPA3.0 and Ellipro servers [51]. Ellipro, EPSVR, and DiscoTope servers were used in another investigation to discover potent B-and T-cell epitopes of four structural proteins of SARS-CoV-2 [52]. Table 2. Score changes of immunogenic residues according to conformational B-cell epitope predictor servers alongside prediction of overall stability through SDM2 server.

Validation Analysis and Investigating Further Properties of Mutants
Because the mutants were obtained from the SDM2 server, it was required to assess their 3D structure validation once again. For this purpose, all mutants were subjected to validation servers, and the results were highly acceptable (Supplementary is considerably dependent on conformational structure [49]. There are several methods and servers which could predict conformational B-cell epitope. Each server takes into account a variety of factors and by using a combination of servers, identifying immunogenic areas could be more precise [50]. The PAC epitopes were defined using four conformational B-cell epitope prediction systems with four distinct methods. For the prediction of conformational B-cell epitope in surface protein in SARSCoV2, Lon et al. employed the SEPPA3.0 and Ellipro servers [51]. Ellipro, EPSVR, and DiscoTope servers were used in another investigation to discover potent B-and T-cell epitopes of four structural proteins of SARS-CoV-2 [52]. Table 2. Score changes of immunogenic residues according to conformational B-cell epitope predictor servers alongside prediction of overall stability through SDM2 server.

Validation Analysis and Investigating Further Properties of Mutants
Because the mutants were obtained from the SDM2 server, it was required to assess their 3D structure validation once again. For this purpose, all mutants were subjected to is considerably dependent on conformational structure [49]. There are several methods and servers which could predict conformational B-cell epitope. Each server takes into account a variety of factors and by using a combination of servers, identifying immunogenic areas could be more precise [50]. The PAC epitopes were defined using four conformational B-cell epitope prediction systems with four distinct methods. For the prediction of conformational B-cell epitope in surface protein in SARSCoV2, Lon et al. employed the SEPPA3.0 and Ellipro servers [51]. Ellipro, EPSVR, and DiscoTope servers were used in another investigation to discover potent B-and T-cell epitopes of four structural proteins of SARS-CoV-2 [52]. Table 2. Score changes of immunogenic residues according to conformational B-cell epitope predictor servers alongside prediction of overall stability through SDM2 server.

Validation Analysis and Investigating Further Properties of Mutants
Because the mutants were obtained from the SDM2 server, it was required to assess their 3D structure validation once again. For this purpose, all mutants were subjected to validation servers, and the results were highly acceptable (Supplementary Table S4), except for the D101V mutation, which showed a small drop in ERRAT output (from 97.51 to 95.85) and a decline in the number of residues in the favored region in Ramachandran plot (decreased from 93.4 to 93). Nonetheless, this reduction was unimportant because D101V conformational structure reliability remained within the acceptable limit. Side chain accessibility and hydrophobicity are two features that play a role in immunogenicity. Residues that are more accessible and, as a result, more hydrophilic can be more likely to form epitopes [53,54]. The accessibility of side chains could be reduced as a result of mutations to small side chain amino acids, resulting in a reduction in antigenicity. According to the SDM server and Discovery studio results, the most reduction in residues accessibility was observed in R149A (Table 3). However, following mutation, the N146V mutant revealed the most accessible residues among the others.
Furthermore, after evaluating the attributes of point mutations, we opted to create a PAM model by combining all point mutations into a single model. As a result, the Galaxy server predicted and refined the favored model of PAM. A final model's validation scores corresponded to permissible values across all validation servers (Supplementary  Tables S5 and S6). Subsequently, it was discovered that there was a remarkable reduction in the number of shared Immunogenic residues by using servers computing conformational B-cell epitopes. Although certain residues remained immunogenic, no residues were found to be shared by all of the servers (data not shown). Therefore, it could be concluded that the most immunogenic epitopes were successfully eliminated in the PAM model. Table 3. Accessibility and hydrophobicity of residues before and after mutation. The R149A mutant revealed the most decrease in Side chain accessibility in both SDM and Discovery studio. By exploiting mutation to short side chain and non-polar amino acids, it is clear that hydrophobicity of all mutants has been increased, which could bring about a reduction in antigenicity.

Molecular Docking
Molecular docking and molecular dynamic modeling were used to ensure that mutations had no effect on the protein's function. It was essential to obtain protein and ligand complexes to operate the MD simulation. Therefore, molecular docking was exploited initially by AutoDock Vina. This software uses a flexible docking algorithm based on Monte Carlo simulated dockings [37]. Docking findings revealed that approximately all ten residues of either wild type or mutants having a function at active site (based on homology to other type 2 RIPs) were involved in interacting with the CGAGAG ligand, even though all mutated amino acids did not contribute in interaction with ligand (supplementary Table S7). More precisely, in the N146V docking result, all predicted residues in the active site interacted with CGAGAG, and Q121A docking output represented the least contribution in having interaction with ligand by participating seven amino acids out of ten in the interaction process. Pulchellin belongs to the family of rRNA N-glycosylases, and it has been documented that the A-chain of type 2 RIPs cleaves a single adenine base in the GAGA loop [55]. Olson employed CGAGAG sequence as a ligand for interacting with Ricin in part of a research. They demonstrated that using oligonucleotide ligand rather than just adenine is more accurate [56]. As a result, the current investigation adopted CGAGAG oligonucleotide as a ligand. In addition to PAC, we also used the A-chain of Abrin-A (1ABR) as the control for two reasons; first, it is classified as rRNA N-glycosylase, same as pulchellin, and as opposed to pulchellin, the crystallography structure of Abrin-A RIP has been released on Protein Data Bank server. We were able to discover critical interactions between the binding site of receptors and the ligand, including hydrogen bonds, π-π interactions, and π-sigma interactions, by exploiting the Discovery studio visualizer. Table 4 exhibits the number of several forms of interaction. According to these findings, T82A formed the maximum number of hydrogen bonds (21 H-bonds) compared to PAC, which had 7 bonds. There were no ligand-receptor π-π interactions in PAC or Q121A whereas, D101V and R149A disclosed the greatest number (six π-π interactions). The highest number of π-interactions was reported by N146V with 8 and D147A did not form any π-Sigma interaction with ligand. By and large, practically every mutant revealed that they can interact with ligands appropriately. Besides, in comparison to other mutants and Abrin-A, PAM demonstrated an acceptable number of interactions with ligand, suggesting that it would be a promising mutant for use in the next experimental work. Although molecular docking offers valuable data, it might be insufficient to screen mutants. Therefore, MD simulation was used to evaluate the attributes of mutants more precisely by utilizing RMSD, gyrate, H-bonds, and binding energies.

Molecular Dynamic Simulation
In the next step, MD simulation was performed to evaluate some main dynamic trajectories, including RMSD, RMSF, Rg, SASA, and hydrogen bonding of docked complexes. The last frame of each protein-ligand complex obtained from MD simulations is shown in Figure 4. To get more insights into mutant MD results, the simulations of PAC wild type and Abrin-A in complexes with ligand were performed simultaneously. Each toxin-ligand complex's simulation parameters and plot are described as average values ( Figure 5).

Molecular Dynamic Simulation
In the next step, MD simulation was performed to evaluate some main dynamic traject ries, including RMSD, RMSF, Rg, SASA, and hydrogen bonding of docked complexes. Th last frame of each protein-ligand complex obtained from MD simulations is shown in Figu 4. To get more insights into mutant MD results, the simulations of PAC wild type and Abri A in complexes with ligand were performed simultaneously. Each toxin-ligand complex simulation parameters and plot are described as average values ( Figure 5). The spatial variation of mutants and wild type in the presence of Abrin-A was inves gated using RMSD and RMSF. Attempting to assess the level of stability to examine the flex bility of the backbone structure of mutants or toxins, the RMSD of protein-ligand complex was computed against their original structure. Protein structure would possess high stabili when the RMSD value is in its minimum amount. After about five nanoseconds of simulatio the RMSD of mutants and toxins in the complex with CGAGAG begins to stabilize, accordin  CGAGAG, PAM demonstrated acceptable RMSF value and revealed a similar RMSD rate to Abrin-A. This result could bring about promising hope for using PAM model for in vitro application.  The spatial variation of mutants and wild type in the presence of Abrin-A was investigated using RMSD and RMSF. Attempting to assess the level of stability to examine the flexibility of the backbone structure of mutants or toxins, the RMSD of protein-ligand complexes was computed against their original structure. Protein structure would possess high stability when the RMSD value is in its minimum amount. After about five nanoseconds of simulation, the RMSD of mutants and toxins in the complex with CGAGAG begins to stabilize, according to the results. During the 100 ns MD simulation, there was no significant variation in the RMSDs, indicating that all complexes were consistent. D147A disclosed the least RMSD (0.32) among all mutants and wild type toxins, indicating excellent stability. Following that, the RMSF of protein portion in all complexes was determined to quantify residual flexibility over MD simulation. Except for wild type, N146V and D147A which residues showed RMSF value over 6 Å at some spots, almost all protein residues in all mutants revealed no considerable variation in the residual level. Average fluctuations of mutants' residues were observed to be diminished upon ligand binding compared to wild type, which indicates that mutations not only did not disrupt the toxin stability but lowered the residual fluctuations. Moreover, the behavior of wildtype and mutant residues fluctuated similarly to that of Abrin-A. Kandasamy et al. evaluated the residual flexibility and stability of the Protein-ligand combination using RMSD and RMSF. They demonstrated that the lower value of these two factors is close associated with the higher stability of the complexes [57]. Additionally, in complex with CGAGAG, PAM demonstrated acceptable RMSF value and revealed a similar RMSD rate to Abrin-A. This result could bring about promising hope for using PAM model for in vitro application.
The solvent-accessible surface area and radius of gyration of mutants were determined to analyze their intactness. The SASA was estimated by measuring the whole area of the protein surface, and the radius of gyration was calculated by measuring the distance between the protein's center of mass and both termini. A significant variation in these values implies that the protein structure has been disrupted. Tjoa et al. employed SASA and Rg parameters to evaluate the compactness of mutants compared to wild type in the MD simulation section of their investigation. They revealed less structural change, reflecting greater conformational tightness [58]. In this study, during 100 ns MD simulation, substantial variations were not observed in the SASA and Rg values of mutants or toxins when they bind to ligand.
When SASA and Rg values are compared between wildtype and point mutation model complexes, it is evident that they are comparable. Furthermore, these values are strikingly similar in Abrin-A and PAM model, suggesting that a model of pulchellin containing all mutations could be a good choice for creating immunotoxin.
MD simulations estimated the hydrogen bonds within 0.35 nm between mutations or toxins and ligand in a solvent setting to further confirm the docked complexes' stability. The directionality and specificity of contact provided by hydrogen bonding between proteins and molecules are critical for molecular recognition [59]. According to the H-bond results, D147A created the most H-bonds, with an average of 9-11 H-bonds, as opposed to N146V, which formed the fewest H-bonds (average of 4-6). Additionally, the average number of H-bonds created in the PAM was greater than in the Abrin-A complex, demonstrating further confirmation of PAM validity.

Estimation of Binding Free Energy
The MM-PBSA method was utilized to more thoroughly describe the energy behavior of mutants in complex with CGAGAG. To comprehend the interactions between proteins and ligands, the Molecular Mechanics Poisson-Boltzmann Surface Area method has been extensively used and is regarded as a reputable free energy simulation technique [60]. The binding free energy was calculated by employing the gmx_MMPBSA to compute the latest 500 snapshots of each protein-ligand complex. The binding free energy was calculated using this formula: where ∆G bind represents the binding free energy, ∆G complex (free energy of complex), ∆G protein (free energy of protein), and ∆G ligand (free energy of ligand).
Comparing PAC and PAM, PAC has a greater ∆G bind quantity, suggesting a weaker interaction with CGAGAG, whereas PAM has the lowest ∆G bind values, reflecting the strongest ligand affinity among the mutants ( Figure 6). The fact that all complexes showed a positive quantity of G bind which could be due to the large structure of the ligand involved in the complex.
J 2023, 6, FOR PEER REVIEW Figure 6. Binding free energy of complexes. For 500 frames taken from the last 5 ns of the MD ulation, binding free energy was calculated using the gmx_MMPBSA tool. In comparison to A A and other mutants, the PAM in interaction with the ligand represented a significant amou binding energy.

Conclusions
In conclusion, utilizing in silico approaches could eliminate B-cell epitopes, resu in less immunogenic ITs. Here, the immunogenic portions of pulchellin were ident using in silico analysis of conformational B cell epitopes, and these regions were then stituted with less immunogenic residues to generate seven mutants. Once mutants w obtained, their properties, such as immunogenicity, stability, and ligand binding, w assessed. The toxin candidate containing all mutations (PAM model) has the maxim stability and the lowest immunogenic areas, which makes it a promising candidate experimental development. It should be highlighted that this study, based on in s methodologies proposing potential models, could serve as a foundation for a larger ject and calls for being supported by experimental validations.
Supplementary Materials: The following supporting information can be downloaded www.mdpi.com/xxx/s1. Table S1. Evaluation of 3D structure models of PAC by Verify 3D, ER ProSAWEB, and Ramachandran Plot. Galaxy model 4 was chosen for refinement process. Tab Validation scores of Galaxy Model after refinement. Selected model 2 of refinement obtained Figure 6. Binding free energy of complexes. For 500 frames taken from the last 5 ns of the MD simulation, binding free energy was calculated using the gmx_MMPBSA tool. In comparison to Abrin-A and other mutants, the PAM in interaction with the ligand represented a significant amount of binding energy.

Conclusions
In conclusion, utilizing in silico approaches could eliminate B-cell epitopes, resulting in less immunogenic ITs. Here, the immunogenic portions of pulchellin were identified using in silico analysis of conformational B cell epitopes, and these regions were then substituted with less immunogenic residues to generate seven mutants. Once mutants were obtained, their properties, such as immunogenicity, stability, and ligand binding, were assessed. The toxin candidate containing all mutations (PAM model) has the maximum stability and the lowest immunogenic areas, which makes it a promising candidate for experimental development. It should be highlighted that this study, based on in silico methodologies proposing potential models, could serve as a foundation for a larger project and calls for being supported by experimental validations.

Supplementary Materials:
The following supporting information can be downloaded at: https:// www.mdpi.com/article/10.3390/j6010006/s1. Table S1. Evaluation of 3D structure models of PAC by Verify 3D, ERRAT, ProSAWEB, and Ramachandran Plot. Galaxy model 4 was chosen for refinement process. Table S2. Validation scores of Galaxy Model after refinement. Selected model 2 of refinement obtained the highest validation score. Table S3. Conformational B-cell epitope scores and SDM2 results of all potential mutants. Table S4. Validation scores of wild type and mutants form of PAC. Mutant models were obtained from SDM2 server. Table S5. Evaluation of 3D structure of a pulchellin containing all mutations model with Verify 3D, ERRAT, ProSAWEB, and Ramachandran Plot. Table  S6. Validation scores of PAM model after refinement. Model 3 of refinement was selected due to the highest validation score. Table S7. Pulchellin interaction amino acids (wild type and mutants) in the docking process. Residues of active site in N146V contributed the most, while Q121A contributed the least in docking process. Highlight amino acids indicate shared active site residues between homology prediction and molecular docking.