Optimization of an Inclusion Body-Based Production of the Influenza Virus Neuraminidase in Escherichia coli

Neuraminidase (NA), as an important protein of influenza virus, represents a promising target for the development of new antiviral agents for the treatment and prevention of influenza A and B. Bacterial host strain Escherichia coli BL21 (DE3)pLysS containing the NA gene of the H1N1 influenza virus produced this overexpressed enzyme in the insoluble fraction of cells in the form of inclusion bodies. The aim of this work was to investigate the effect of independent variables (propagation time, isopropyl β-d-1-thiogalactopyranoside (IPTG) concentration and expression time) on NA accumulation in inclusion bodies and to optimize these conditions by response surface methodology (RSM). The maximum yield of NA (112.97 ± 2.82 U/g) was achieved under optimal conditions, namely, a propagation time of 7.72 h, IPTG concentration of 1.82 mM and gene expression time of 7.35 h. This study demonstrated that bacterially expressed NA was enzymatically active.


Introduction
Influenza is a respiratory disease caused by a virus belonging to the family Orthomyxoviridae [1]. Every year, influenza viruses cause seasonal epidemics that mainly affect the adult population. Of the total number of infected adults, 10-30% are hospitalized, and 3-15% of those infected die [2,3]. Influenza pandemics including the Spanish flu (1918), Asian flu (1957) or Hong Kong flu (1968) have killed millions of people [4,5]. Mortality from the highly symptomatic H5N1 influenza virus in 1997 exceeded 60% [6,7]. Influenza virus variability can lead to a pandemic posing a serious threat to public health [8,9].
Vaccination currently remains the most effective tool for influenza infection prevention [10]. However, influenza viruses tend to mutate, and the vaccine needs to be updated frequently. Therefore, there is an effort to develop antiviral drugs that effectively suppress the infection. Research has mainly been focused on neuraminidase (NA) inhibition [11][12][13][14][15][16][17]. NA is a sialidase [18] and helps to release new virions from infected cells. It is possible to protect the host and prevent the multiplication of viruses in the body by the inhibition of this enzyme [19]. Some NA inhibitors are already available on the market, such as oseltamivir (Tamiflu), zanamivir (Relenza), peramivir (Rapivab) and laninamivir (Inavir) [20][21][22][23]. These classical NA inhibitors are mostly polar and have poor oral bioavailability. In addition, some mutants (e.g., A (H1N1)pdm09) can be resistant to these inhibitors, and have shown strong resistance to oseltamivir [24].
There is a need to have a broad spectrum of different types of NA to effectively study new compounds with the potential to inhibit viral NA. NAs can be extracted directly from the virus surface or recombinantly produced using a suitable host [25]. Although NAs are commonly obtained from the surface of the virus, this process requires expensive laboratory

Bacterial Strain, Plasmid and Growth Conditions
The NA gene was designed according to the codon usage of E. coli using the sequence deposited in GenBank under the accession number KM244086.1 and synthesized in the pET15b vector (ATG Biosynthetic, Merzhausen, DE). E. coli expression strain BL21 (DE3)pLysS was purchased from Agilent Technologies (Santa Clara, CA, USA) and transformed with pET15b containing the NA gene according to a standard protocol. After transformation of the E. coli strain, the presence of the NA gene was verified by restriction digestion using XhoI and NdeI restriction enzymes (ThermoFisher Scientific, Waltham, MA, USA) [32].
Luria-Bertani (LB) broth (10 g/L tryptone, 10 g/L NaCl and 5 g/L yeast extract) with ampicillin (100 µg/mL) was inoculated with a single bacterial colony of transformed E. coli BL21 (DE3)pLysS and the culture was incubated overnight at 37 °C under shaking (200 RPM). This overnight culture (0.5 mL) was used to prepare a stock containing glycerol at a final concentration of 25% (v/v). Glycerol stocks were stored in a freezer (Arctiko ULUF, Esbjerg Kommune, DK) at −80 °C and used to prepare the inoculum (25 mL).
The inoculum was prepared by mixing LB broth (25 mL supplemented with ampicillin 100 µg/mL) and one glycerol stock. The culture was incubated for 12 h at 37 °C and 200 RPM. The cells were harvested at 3000 RPM for 5 min and then diluted with sterile distilled water to obtain a suspension of 2.0 McFarland units (MFU). The prepared inoculum was used to inoculate the culture medium at a final concentration of 2% (v/v) at different propagation times. The expression of recombinant enzyme was induced by adding isopropyl β-D-1-thiogalactopyranoside (IPTG) at different concentrations for various time periods.
After expression, the medium was separated from E. coli biomass by centrifugation (4000 RPM for 10 min) and the biomass was processed to release soluble and insoluble cell proteins. Soluble and insoluble cell fractions were analyzed by SDS-PAGE. The pellet was then used to determine biomass yield, isolate inclusion bodies and evaluate the yield of proteins released from inclusion bodies as well as NA yield.

Experimental Design and Optimization
RSM was used to investigate the effect of propagation time, IPTG concentration and expression time on the dependent variables (protein yield and NA yield). These three independent factors were tested at five coded levels: −1.682, −1, 0, 1 and 1.682 (Table 1). The second-order polynomial function with respect to the three selected parameters is given in Equation (1):
$$Y = b_0 + \sum_{i=1}^{3} b_i X_i + \sum_{i=1}^{3} b_{ii} X_i^2 + \sum_{i<j} b_{ij} X_i X_j \qquad (1)$$
where X_i are the independent variables (propagation time, IPTG concentration or expression time) causing the Y response (protein yield or NA yield) and b_0, b_i, b_ii and b_ij are the regression coefficients (intercept, linear, quadratic and interaction terms, respectively). The interaction between two variables and the effect of these factor levels on the protein or NA yield were derived from 3D response surface plots, with the third variable held constant at its optimized value. The coefficients of the response surface equation were estimated.
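As an illustration of the design described above, the following minimal sketch generates the coded levels of a three-factor central composite design and fits a full second-order polynomial by least squares. It is not the authors' Statgraphics workflow. The number of centre points (3) is inferred from the 17 runs reported, the response values are synthetic, and the function names are hypothetical.

```python
import itertools
import numpy as np


def central_composite_design(k=3, n_center=3):
    """Coded levels of a rotatable central composite design (assumed layout)."""
    alpha = (2 ** k) ** 0.25  # axial distance; about 1.682 for k = 3
    factorial = np.array(list(itertools.product([-1.0, 1.0], repeat=k)))
    axial = np.zeros((2 * k, k))
    for i in range(k):
        axial[2 * i, i] = -alpha
        axial[2 * i + 1, i] = alpha
    center = np.zeros((n_center, k))
    return np.vstack([factorial, axial, center])


def quadratic_terms(X):
    """Model matrix: intercept, linear, squared and two-factor interaction columns."""
    n, k = X.shape
    cols = [np.ones(n)]
    cols += [X[:, i] for i in range(k)]
    cols += [X[:, i] ** 2 for i in range(k)]
    cols += [X[:, i] * X[:, j] for i, j in itertools.combinations(range(k), 2)]
    return np.column_stack(cols)


def fit_quadratic(X, y):
    """Least-squares estimate of the ten regression coefficients."""
    coeffs, *_ = np.linalg.lstsq(quadratic_terms(X), y, rcond=None)
    return coeffs


design = central_composite_design()  # 8 factorial + 6 axial + 3 centre runs

# Synthetic, noise-free response used only to show that the fit recovers
# the coefficients; these numbers are NOT taken from the study.
true_b = np.array([100.0, 5.0, 2.0, 4.0, -6.0, -1.0, -3.0, 0.5, 0.0, 1.0])
y = quadratic_terms(design) @ true_b
b_hat = fit_quadratic(design, y)
```

With real data, `y` would hold the measured protein or NA yield of each run, in the same row order as `design`.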

Isolation of Inclusion Bodies and NA Renaturation
The procedure for isolating inclusion bodies from E. coli cells and the subsequent release and renaturation of NA was proposed previously [32]. Briefly, the cell pellet was first resuspended in lysis solution (100 mM NaCl, 5 mM ethylenediaminetetraacetic acid (EDTA), 1 mM phenylmethylsulfonyl fluoride (PMSF), 10 mM DL-dithiothreitol (DTT) and 5 g/L lysozyme in 100 mM tris (hydroxymethyl)aminomethane (Tris)-HCl buffer, pH 8.0) followed by sonication in 10 cycles (30 s sonication alternated with 30 s of incubation on ice). The cell pellets were harvested (4000 RPM for 30 min) and washed 5 times with a solution of urea (2 M) in 100 mM Tris-HCl buffer. Inclusion bodies were obtained by centrifugation (4000 RPM for 30 min) and were then dissolved in a solution of urea (8 M) in 100 mM Tris-HCl (pH 8.0). The suspension was dialyzed against a solution with NaCl (150 mM) and EDTA (5 mM) in 50 mM Tris-HCl to refold the recombinant NA. The refolded and soluble proteins were centrifuged (4000 RPM for 20 min) and the supernatants were used for further analyses.

Analytical Methods
The amount of biomass was determined after centrifugation of the culture medium at 4000 RPM for 20 min. The biomass was washed twice with distilled water, dried to constant weight at 60 °C and expressed in grams per volume of culture medium.
The concentration of released proteins from the inclusion bodies was determined using the Bradford method [36] with bovine serum albumin as a standard and expressed as protein yield released from the inclusion bodies per gram of dry biomass (mg/g).
SDS-PAGE analysis (80 V, 2.5 h) was carried out using a 12% (w/v) separation gel in Tris-glycine buffer (pH 8.3) [37]. After electrophoresis, the gels were stained with CBB-R250 solution and then washed in a destaining solution containing 10% (v/v) methanol. NA captured in the gel was analyzed by densitometry using ImageJ (version 1.46r, National Institutes of Health, Bethesda, MD, USA). The amount of NA was evaluated from a linear regression of the calibration curve relating peak area (pixels) to lysozyme concentration and expressed as the amount of NA per biomass (mg/g).
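The calibration step can be sketched as follows: a straight line is fitted to the peak areas of the lysozyme standards and then inverted to estimate the amount of protein in a sample band. The standard amounts and peak areas below are hypothetical placeholders, and the function names are illustrative rather than part of ImageJ or of the original analysis.

```python
import numpy as np


def fit_calibration(standard_amount, peak_area):
    """Fit peak_area = slope * amount + intercept by linear least squares."""
    slope, intercept = np.polyfit(standard_amount, peak_area, 1)
    return slope, intercept


def amount_from_area(area, slope, intercept):
    """Invert the calibration line to estimate the amount in a band."""
    return (area - intercept) / slope


# Hypothetical lysozyme standards (µg per lane) and band peak areas (pixels).
standard_ug = np.array([1.0, 2.0, 4.0, 8.0])
peak_area = np.array([1500.0, 2500.0, 4500.0, 8500.0])

slope, intercept = fit_calibration(standard_ug, peak_area)
na_ug = amount_from_area(5500.0, slope, intercept)  # estimated µg in the NA band
```

Dividing the estimated amount by the dry biomass that the loaded sample represents gives the value in mg/g.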
After recovery of recombinant NA from inclusion bodies [32], enzyme activity was determined by coupled reactions using fetuin as a substrate [32,38]. One unit of refolded NA activity (U) was defined as the amount of enzyme that converts 1 µmol of fetuin per minute, as measured at a wavelength of 540 nm, and activity was expressed in U/g of bacterial biomass.

Statistical Analysis
OriginPro 2016 (version 9.3, OriginLab Corporation, Northampton, MA, USA) was used to process all experimental data obtained. Statgraphics Centurion XV (version 15.1.2, Statpoint Technologies, Warrenton, VA, USA) was used for the statistical analysis of experimental data. All assays were performed in triplicate.

Preliminary Experiments
In higher eukaryotic expression systems, the expression of the native gene directly amplified from the influenza virus occurs without major problems [39]. However, when such genes are expressed in E. coli cells, problems arise due to the different use of individual codons, leading to a protein without biological activity. Regardless of the chosen expression conditions of the inserted NA gene, the recombinant protein accumulates in the insoluble fraction of the cell biomass in the form of inclusion bodies. The formation of inclusion bodies is a common but often undesirable phenomenon associated with the overexpression of a heterologous gene in E. coli cells. However, accumulation of recombinant protein in inclusion bodies may be an advantage. They contain a high proportion of target protein and at the same time fewer undesirable proteins [40][41][42]. The optimization of recombinant proteins expressed in this form is relatively rare, but may represent a promising strategy for various insoluble proteins produced by E. coli [35,[43][44][45][46]. In our previous work [26], we were able to produce biologically active NA in E. coli cells, and now we bring our focus to optimizing the conditions for the production of recombinant enzymes in inclusion bodies in order to maximize NA production.
The effect of propagation (pre-incubation) time on the monitored variables was observed in the range of 2-24 h at 37 °C. Recombinant protein production was induced by the addition of a 1 mM IPTG solution at the end of the lag phase (biomass yield 0.10 ± 0.03 g/L), in the early exponential phase (0.43 ± 0.03 g/L), in the mid-exponential phase (1.02 ± 0.02 g/L), in the late exponential phase (1.39 ± 0.02 g/L), in the early stationary phase (1.64 ± 0.02 g/L) and in the stationary phase (1.68 ± 0.03 g/L) of producer growth (Figure 1).

Figure 1. The effect of producer propagation time on the protein yield (mg/g) released from inclusion bodies, the NA yield (U/g) and the amount of biomass (g/L) before and after NA expression.
A significant increase in biomass yield was observed after the NA gene expression at the end of the lag phase and at the beginning of the exponential growth phase. The highest yields of NA released from the inclusion bodies were obtained by induction in the mid-exponential phase of growth (47.58 ± 0.22 U/g) and at the end of this phase (58.97 ± 0.58 U/g), and protein yields were comparable (9.54 ± 0.47 mg/g and 9.51 ± 0.49 mg/g, respectively). Fazaeli et al. [47] also observed the highest recombinant enzyme production in the mid-exponential phase of growth. The highest NA yields were obtained by inducing E. coli cells during the exponential phase of growth (Figure 1). A decrease in biomass growth and a higher yield of proteins released from inclusion bodies could indicate a producer's transition from biomass production to recombinant protein production [48]. Most preferably, the late exponential phase of growth induced gene expression (31.87 ± 0.31 U/g). Similarly, Rengby et al. [49] found that changing the induction time from mid-exponential to late exponential growth phase increased the recombinant protein yield 4-fold.
The effect of IPTG concentration on the protein and NA yields from inclusion bodies was monitored in the range of 0-2 mM. The producer was cultivated for 8 h at 37 °C, and after reaching the late exponential growth phase (biomass yield 1.39 ± 0.02 g/L), gene expression was started by adding IPTG. In all experiments (Figure 2), the presence of proteins released from inclusion bodies was determined.
The presence of proteins in the inclusion bodies was also determined in medium without IPTG, but the presence of NA was not confirmed after expression. The increasing concentration of IPTG caused a higher expression of NA, but it had a negative effect on the amount of biomass. This effect has also been described in other studies [50,51]. The highest protein yield released from the inclusion bodies (19.56 ± 0.52 mg/g) was achieved by adding IPTG at a concentration of 1.5 mM. Increasing the IPTG concentration from 1 to 1.5 mM gave a 1.4-fold higher enzyme yield (84.09 ± 3.21 U/g).
The effect of expression time in the range of 0-48 h and expression temperature of 11-45 °C on the monitored variables was evaluated after 8 h of propagation, and NA expression was induced by 1.5 mM IPTG (data not shown). E. coli cells can grow over a wide temperature range (15-45 °C) with optimal growth in the range of 20-42 °C. A direct relationship between the peptide chain elongation rate of the newly formed protein and temperature during expression of the corresponding gene has been observed in this temperature range [51,52]. We found that temperatures of 11 and 45 °C were not suitable for NA expression. At 11 °C, the lowest protein and NA yields were obtained. The temperature of 45 °C caused partial denaturation of the undesirable proteins but also of the recombinant NA. Protein yields were highest at 20-37 °C after a 20 h expression time (21.24-23.61 mg/g) and did not change significantly after this time. However, the NA yield at 37 °C was 1.3 times higher than at 30 °C, and 2.3 times higher than at 20 °C. The NA yield did not change significantly after 10 h of expression (84.57 ± 5.12 U/g). Therefore, we set the expression temperature to 37 °C during the optimization and the expression time was chosen as an independent variable.

Optimization of Neuraminidase Production in Inclusion Bodies Using Response Surface Methodology
Preliminary experiments confirmed that parameters such as propagation time, IPTG concentration and gene expression time at 37 °C affected the protein yield from inclusion bodies and the NA yield itself. The optimal values of these selected production parameters were calculated using RSM in order to obtain the maximum yield of NA produced by E. coli. Table 2 shows the design matrix including coded and actual variables as well as the protein and NA yields for each run (runs 1-17).
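One common way to obtain such an optimum from a fitted second-order model is to locate its stationary point analytically. The sketch below assumes the model is written in coded variables as Y = b0 + bᵀx + xᵀBx, where B carries the quadratic coefficients on its diagonal and half of each interaction coefficient off the diagonal. The coefficient values are hypothetical, and this is not necessarily the procedure implemented in the software used in the study.

```python
import numpy as np


def stationary_point(linear, quadratic, interaction):
    """Stationary point x_s = -0.5 * B^{-1} b of a second-order model.

    linear      -- linear coefficients b_i
    quadratic   -- pure quadratic coefficients b_ii
    interaction -- dict mapping (i, j) index pairs to coefficients b_ij
    Returns the coded coordinates and whether the point is a maximum.
    """
    B = np.diag(np.asarray(quadratic, dtype=float))
    for (i, j), b_ij in interaction.items():
        B[i, j] = B[j, i] = b_ij / 2.0
    x_s = -0.5 * np.linalg.solve(B, np.asarray(linear, dtype=float))
    # The point is a maximum only if B is negative definite.
    is_maximum = bool(np.all(np.linalg.eigvalsh(B) < 0))
    return x_s, is_maximum


# Hypothetical coefficients in coded units (not the values of Table 3).
linear = [4.0, 2.0, 6.0]
quadratic = [-2.0, -1.0, -3.0]
interaction = {(0, 1): 1.0}

x_s, is_maximum = stationary_point(linear, quadratic, interaction)
```

The coded coordinates are converted back to hours or mM with the centre value and step size of each factor, and a stationary point lying outside the tested range should be treated with caution.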
As shown in Figure 3, after dissolution of inclusion bodies obtained from induced cells from all optimization runs (1-17), a band corresponding to NA (indicated by an arrow; 54 kDa) could be observed, while no band corresponding to NA was observed in the control (Figure 3, lane C). The intensity of the individual bands varied depending on the conditions of NA production, and while the most intense bands could be observed in runs 7-11, the intensity of the bands from runs 1, 2 and 6 was the lowest.
Biomolecules 2022, 12, 331
Depending on the selected production conditions, the protein yield released from the inclusion bodies also varied, in the range of 18.52-28.84 mg/g. The highest protein yield released from the inclusion bodies (28.84 ± 1.01 mg/g) was achieved by 4.7 h propagation of the producer and gene expression with 1.5 mM IPTG for 4 h. However, in terms of NA yield, it is more appropriate to extend the propagation and expression times to 8 and 7.3 h, respectively, with 1.5 mM IPTG concentration (Table 1) (124.06 ± 2.91 U/g). E. coli cells were held in the late exponential phase of growth by prolonging the propagation time. This growth phase appears to be most suitable for the synthesis of other recombinant proteins, such as human betaferon and heat shock protein (HSPA6) [51,53]. The effect of IPTG concentration and expression time on protein and NA yields can be monitored using response surface model plots (Figure 4) at a constant value of optimized propagation time (7.72 h).
The experimentally measured data were evaluated by a second-order polynomial model (Equation (1)) for protein and NA yields. Inclusion bodies usually contain almost exclusively overexpressed recombinant protein [54][55][56], but the results of the optimization do not confirm this. For protein yield, the R² coefficient reached 79% and the calculated R² value of the model for NA yield was 94%. These results suggest that recombinant protein was not the only protein found in inclusion bodies.
The ratio of NA amount (g/g) in inclusion bodies to protein yield (g/g) ranged from 16 to 60% (Figure 5). The lowest value was reached at the shortest propagation time, the lowest IPTG concentration and the shortest expression time from the optimization matrix. As the values of the independent variables increased, so did the amount of recombinant enzyme in the insoluble fraction. Dang et al. [57] found that inclusion bodies contained approximately 80% recombinant protein. Therefore, we continued to work only on optimizing the conditions for NA yield.
Table 3 summarizes the regression coefficients and analysis of variance calculated for NA yield. For NA yield, the propagation time and the expression time had a significant positive linear influence (p-value < 0.05). The increase of the expression time from 4 to 7 h affected the value of NA yield (Figure 4A). Moreover, the propagation time had a negative quadratic influence (p-value < 0.05) on NA yield (Table 3).
Table 3. Regression coefficients of the predicted second-order polynomial models for NA yield.
The optimal production conditions for the highest NA yield were propagation time 7.72 h, IPTG concentration 1.82 mM and expression time 7.35 h. Optimal production conditions were verified and there was no significant difference between predicted and experimental values of NA yield (p > 0.05). Our results indicate that properly set optimization can increase the yield of the recombinant protein. The NA yield was increased 1.9-fold from the original 58.97 ± 0.58 U/g (Figure 1) to 112.97 ± 2.82 U/g (Table 4). Moreover, here we confirmed the results of our previous paper [32], where the refolded non-glycosylated monomer of NA produced by E. coli at optimal production conditions achieved a V_max value of 9.73 U/mg with k_cat 8.76 s⁻¹, and the affinity to fetuin was demonstrated by the K_m value of 0.51 g/L.
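The reported kinetic constant and fold increase can be cross-checked with a short calculation. This sketch assumes that the 54 kDa band observed on SDS-PAGE can be used as the molar mass of the NA monomer (54,000 g/mol) and that 1 U corresponds to 1 µmol of substrate converted per minute. The function name is hypothetical.

```python
def kcat_per_second(vmax_u_per_mg, molar_mass_g_per_mol):
    """Convert a specific activity in U/mg into a turnover number in s^-1.

    U/mg = µmol min^-1 mg^-1; multiplying by g/mol and by 1e-3
    (1e-6 mol/µmol * 1e3 mg/g) gives min^-1, and dividing by 60 gives s^-1.
    """
    per_minute = vmax_u_per_mg * molar_mass_g_per_mol * 1e-3
    return per_minute / 60.0


# Values quoted in the text; the molar mass is an assumption (see above).
kcat = kcat_per_second(9.73, 54_000)   # about 8.76 s^-1
fold_increase = 112.97 / 58.97         # about 1.9-fold
```

Under these assumptions the result agrees with the k_cat of 8.76 s⁻¹ and the 1.9-fold increase stated above.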

Conclusions
In this study, we focused on the optimization of the expression of influenza virus NA, the enzyme responsible for releasing new virions from infected cells. We tested various expression conditions, such as propagation time, IPTG concentration and expression temperature and time. The recombinant enzyme production process was optimized using response surface methodology, which led to a higher production of NA in inclusion bodies. The optimal values were as follows: propagation time 7.72 h, IPTG concentration 1.82 mM and gene expression time 7.35 h. The maximum NA yield was 112.97 ± 2.82 U/g, resulting in a 1.9-fold increase over the original production conditions. The results of this study confirmed that influenza virus NA can be produced by E. coli cells.