Anti-EGFR VHH Antibody under Thermal Stress Is Better Solubilized with a Lysine than with an Arginine SEP Tag

In this study, we assessed the potential of arginine and lysine solubility-enhancing peptide (SEP) tags to control the solubility of a model protein, anti-EGFR VHH-7D12, in a thermally denatured state at high temperature. We produced VHH-7D12 variants carrying a C-terminal SEP tag made of either five or nine arginines or lysines (7D12-C5R, 7D12-C9R, 7D12-C5K and 7D12-C9K, respectively). The 5-arginine and 5-lysine SEP tags increased the E. coli expression of VHH-7D12 by over 80%. Biophysical and biochemical analyses confirmed the native-like secondary and tertiary structural properties and the monomeric nature of all VHH-7D12 variants. Moreover, all VHH-7D12 variants retained full binding activity to the EGFR extracellular domain. Finally, thermal stress applied by a 45-minute incubation at 60 and 75 °C, where the VHH-7D12 variants are unfolded, showed that the untagged VHH-7D12 formed aggregates in all four buffers, and the supernatant protein concentration was reduced by up to 35%. 7D12-C5R and 7D12-C9R did not aggregate in Na-acetate (pH 4.7) and Tris-HCl (pH 8.5) but formed aggregates in phosphate buffer (PB, pH 7.4) and phosphate-buffered saline (PBS, pH 7.4). The lysine tags (either C5K or C9K) had the strongest solubilization effect, and both 7D12-C5K and 7D12-C9K remained in the supernatant. Altogether, our results indicate that, under thermal stress, the solubilization effect of lysine SEP tags is more potent than that of arginine SEP tags, and that the SEP tags did not affect the structural and functional properties of the protein.


Introduction
Protein-based therapeutic drugs are one of the fastest-growing classes of pharmaceutical products [1,2]. Among them, monoclonal antibodies (mAbs) and engineered antibody fragments are attractive therapeutic platforms [3]. In particular, a single-domain antibody (VHH) is the smallest antibody fragment, and unlike full-length mAbs, it consists of a single Ig domain containing three complementarity-determining regions (CDRs) [4]. Its minimal size (~15 kDa) provides better tumor and tissue penetration than full-length mAbs [5], making it an attractive drug candidate.
Aggregation is a critical issue in protein chemistry and the development of therapeutic proteins [6]. Aggregated therapeutic proteins can cause an adverse immune response, resulting in anti-drug antibodies (ADAs) and a decline in therapeutic efficacy [7][8][9]. Proteins may aggregate in the natively folded state, partially unfolded state, or fully denatured state [10,11], and the aggregates may be stabilized through electrostatic or hydrophobic interactions [12][13][14]. Aggregation induced by thermal, physical, and chemical stresses is often irreversible.
In vitro protein solubility can be controlled by optimizing the buffer conditions. Sugars, polyols, amino acids, or surfactants used as additives can act as aggregation inhibitors [15,16]. In particular, arginine has gained much attention since it can increase protein solubility without altering the protein structure [17,18]. Several methods based on the addition of arginine have been reported [19,20], but none of these techniques can be used in vivo, and even in vitro, the high concentration of arginine required (up to 1.0 M) makes this approach costly. Moreover, in some cases, arginine has been shown to decrease protein stability, making it unfit for high-temperature use, where the protein is usually unfolded [21]. The SEP tag has emerged as a reliable and versatile technique for controlling protein solubility [13,22]. In particular, we have shown the solubilization properties of SEP tags for bovine pancreatic trypsin inhibitor (BPTI) [23], dengue envelope protein [24], anti-epidermal growth factor receptor (EGFR)-ScFv [25], tobacco etch virus (TEV) protease [26], Gaussia luciferase (GLuc) [27,28] and the third domain of EGFR [29].
Here we assessed the ability of arginine and lysine tags to control protein solubility under thermal stress. We prepared four anti-EGFR VHH-7D12 variants tagged with 5 or 9 arginines or lysines at the C-terminus (7D12-C5R, 7D12-C9R, 7D12-C5K, and 7D12-C9K, respectively). The untagged VHH-7D12 formed aggregates at high temperatures, which reduced the supernatant's protein concentration by up to 35%. The arginine tags were effective, but some aggregates appeared after high-temperature incubation in PB and PBS. Lysine tags with either 5 or 9 residues performed best and completely suppressed aggregation over a wide range of buffer conditions, pHs, and temperatures.

Mutant Design and Protein Expression
A synthetic gene encoding anti-EGFR VHH-7D12 was cloned into a pAED4 vector [30] at the NdeI and EcoRI restriction sites, and the SEP tag variants were constructed by adding three repeated blocks of three arginines or lysines, spaced by one glycine, at the C-terminus of VHH-7D12 using site-directed mutagenesis (referred to as 7D12-C9R and 7D12-C9K, respectively). Similarly, the 7D12-C5R and 7D12-C5K variants were designed by adding one block of three arginines or lysines and another block containing two arginines or lysines, spaced by one glycine (Figure 1).
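As a minimal sketch of the tag layout described above, the following Python snippet assembles the four C-terminal tag sequences from blocks of arginine or lysine joined by single glycines. The function name `sep_tag` and the block order for the five-residue tags (three residues, then two) are assumptions read from the description; any linker between VHH-7D12 and the first block is not specified in the text and is therefore omitted.

```python
def sep_tag(residue: str, block_sizes: list[int]) -> str:
    """Join blocks of one charged residue with single glycine spacers."""
    return "G".join(residue * n for n in block_sizes)

# Hypothetical layouts inferred from the text: C9 = 3+3+3, C5 = 3+2.
TAGS = {
    "C9R": sep_tag("R", [3, 3, 3]),
    "C9K": sep_tag("K", [3, 3, 3]),
    "C5R": sep_tag("R", [3, 2]),
    "C5K": sep_tag("K", [3, 2]),
}

for name, seq in TAGS.items():
    # Count only the positively charged residues, not the glycine spacers.
    charged = sum(seq.count(aa) for aa in "RK")
    print(f"{name}: {seq} ({charged} charged residues)")
```

Under these assumptions, each tag name encodes the number of charged residues rather than the total tag length, since the glycine spacers add one or two uncharged positions.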
The VHH plasmid was first transformed into the BL21 (DE3) pLysS cell line. The level of protein expression was assessed using a small-scale culture (5 mL) at 37 °C. Protein expression was induced by adding 1 mM isopropyl β-D-1-thiogalactopyranoside (IPTG) to the medium at an optical density (OD) of 0.6 at 590 nm. The E. coli cells were collected by centrifugation 6 h after the IPTG induction, and the cells were lysed by sonication. Protein expression was analyzed by gel electrophoresis (SDS-PAGE).

Large Scale Protein Expression and Purification
BL21 (DE3) pLysS cells transformed with the VHH plasmid were grown in Luria-Bertani (LB) medium (1 L) at 37 °C. Protein expression was induced with 1 mM IPTG when the OD at 590 nm of the culture reached 0.6. After 6 h of IPTG induction, the E. coli cells were harvested by centrifugation at 4500 rpm for 20 min at 4 °C. After sonication, VHH was purified by Ni-NTA (Qiagen, Hilden, Germany), followed by ion-exchange chromatography (Gigacap-s-650 M, Tosoh Bioscience, Tokyo, Japan) according to our previously reported protocol [33]. Protein identities were confirmed by MALDI-TOF mass spectrometry (Autoflex III, Bruker Daltonics, MA, USA), and the purified proteins, dissolved in Milli-Q (MQ) water, were kept at −30 °C as stock protein solutions.

Dynamic Light Scattering (DLS)
The hydrodynamic radius (Rh) of the VHH-7D12 variants was measured by dynamic light scattering (DLS) using a Zetasizer Nano-S system (Malvern, Worcestershire, UK). Protein samples were prepared in 20 mM Na-acetate buffer (pH 4.7) at a concentration of 0.3 mg/mL. A 100 µL polystyrene cuvette was used for DLS measurement at 25 °C. Three independent measurements were performed and averaged for the final Rh value.
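DLS analysis generally converts a measured translational diffusion coefficient into Rh through the Stokes-Einstein relation. The sketch below illustrates that conversion only; the viscosity (water at 25 °C, about 0.89 mPa·s) and the example diffusion coefficient are illustrative assumptions, not values from this study, and the function name is hypothetical.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K

def hydrodynamic_radius_nm(diff_coeff_m2_s: float,
                           temp_c: float = 25.0,
                           viscosity_pa_s: float = 0.89e-3) -> float:
    """Stokes-Einstein: R_h = k_B T / (6 pi eta D), returned in nm."""
    temp_k = temp_c + 273.15
    radius_m = K_B * temp_k / (6.0 * math.pi * viscosity_pa_s * diff_coeff_m2_s)
    return radius_m * 1e9

# Assumed diffusion coefficient, chosen to give a radius close to 2 nm.
print(round(hydrodynamic_radius_nm(1.2e-10), 2))
```

Because Rh is inversely proportional to the diffusion coefficient, slowly diffusing aggregates appear as correspondingly larger radii.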

Spectroscopic Measurements
Far-UV circular dichroism (CD) spectroscopy measurements were performed at a protein concentration of 0.15 mg/mL (10 µM) in 20 mM Na-acetate buffer (pH 4.7) using a J-820 spectropolarimeter (JASCO, Tokyo, Japan). A 500 µL aliquot of the protein solution was placed in a 2 mm path-length quartz cuvette, and the spectra were collected in continuous scanning mode from 260 to 205 nm. The spectral baseline was measured independently for each sample and subtracted to obtain the final spectrum. Thermal stability was measured from 15 to 90 °C at a scan rate of 1 °C/min and a wavelength of 222 nm. The midpoint temperature (Tm) was computed by fitting the thermal denaturation curve with a two-state model without dissociation/association, using Origin 6.1.J (OriginLab Co., Northampton, MA, USA).
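To illustrate what a two-state fit of a thermal denaturation curve involves, the following sketch models the unfolded fraction with a van't Hoff expression and recovers Tm by a simple grid search. It is a simplified stand-in, not the Origin procedure used here: the enthalpy is held fixed at an assumed value, the pre- and post-transition baselines are ignored, and the curve is synthetic rather than measured.

```python
import math

R_GAS = 8.314462618  # gas constant, J/(mol K)

def fraction_unfolded(temp_c: float, tm_c: float, dh_j_mol: float) -> float:
    """Two-state N <-> U equilibrium with a temperature-independent enthalpy."""
    t, tm = temp_c + 273.15, tm_c + 273.15
    # van't Hoff: ln K = (dH / R) * (1/Tm - 1/T), with K = [U]/[N]
    k_unfold = math.exp(dh_j_mol / R_GAS * (1.0 / tm - 1.0 / t))
    return k_unfold / (1.0 + k_unfold)

def fit_tm(temps_c, fractions, dh_j_mol, tm_grid):
    """Grid-search the Tm that minimizes the sum of squared residuals."""
    def sse(tm):
        return sum((fraction_unfolded(t, tm, dh_j_mol) - f) ** 2
                   for t, f in zip(temps_c, fractions))
    return min(tm_grid, key=sse)

# Synthetic curve (not measured data): Tm = 63 C, assumed dH = 400 kJ/mol.
temps = list(range(15, 91))
curve = [fraction_unfolded(t, 63.0, 4.0e5) for t in temps]
grid = [x / 10 for x in range(500, 801)]  # 50.0 to 80.0 C in 0.1 C steps
print(fit_tm(temps, curve, 4.0e5, grid))
```

By construction, the unfolded fraction equals 0.5 at Tm, which is why Tm is referred to as the midpoint temperature.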
Samples for tryptophan fluorescence measurements were prepared according to the same protocol as for the CD measurements. Fluorescence measurements were performed on an FP-8500 spectrofluorometer (JASCO, Tokyo, Japan) using a quartz glass cuvette containing 200 µL of the sample at 25 °C. The tryptophan excitation and emission wavelengths were set to 295 nm and 345 nm, respectively, and the spectra were monitored from 300 nm to 500 nm in continuous scanning mode.

Surface Plasmon Resonance (SPR)
The binding affinity of the anti-EGFR VHH-7D12 variants was measured by surface plasmon resonance (SPR) (Biacore X100, GE Healthcare, MA, USA), as previously reported [33]. In short, the extracellular domain of human EGFR (Abcam, Cambridge, UK) was immobilized onto a CM5 sensor chip using amine coupling according to the manufacturer's guidance. The VHH protein was passed over the CM5 sensor chip at concentrations ranging from 0.165 to 1.25 µg/mL. All measurements were performed at a flow rate of 10 µL/min in 10 mM HBS-EP buffer pH 7.4 (10 mM HEPES, 150 mM NaCl, 3 mM EDTA, and 0.005% surfactant P20) containing 1.5 M NaCl at 20 °C. We used 1.5 M NaCl in the buffer to avoid undesired electrostatic interactions between the negatively charged CM5 sensor chip and the positively charged tags.

Determination of Protein Aggregation under Thermal Stress
Protein samples were prepared at 0.5 mg/mL in four different buffers (Na-acetate buffer, pH 4.7; PB, pH 7.4; PBS, pH 7.4 and Tris-HCl buffer, pH 8.5). The protein concentration was determined from the absorbance at 280 nm and the molar extinction coefficient using a Nanodrop-2000 instrument (Thermo Fisher Scientific, MA, USA). The samples were then centrifuged at 20,000× g for 20 min at 4 °C, and after centrifugation, the protein concentrations were confirmed to be 0.5 mg/mL within an error of 5% arising from the removal of debris. Protein samples were then incubated at either 60 °C or 75 °C for 45 min. Afterward, the samples were kept at room temperature for 25 min and centrifuged again at 20,000× g for 20 min to remove the insoluble aggregates that appeared during the heat stress. The supernatant concentration was measured to calculate the percentage of the protein that formed insoluble aggregates during the incubation. The soluble aggregates remaining in the supernatant were further analyzed by measuring the Z-average (Z-ave) hydrodynamic radius (Rh) using DLS. An aliquot of 100 µL of the supernatant sample was used for DLS measurement at 25 °C. Three individual measurements were performed and averaged.
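The quantification described above can be sketched as follows: absorbance at 280 nm is converted to a concentration through the Beer-Lambert law, and the soluble fraction is the ratio of the supernatant concentrations after and before heating. The function names, the 1 cm path length default, and the numbers in the example are illustrative assumptions; the extinction coefficient of each variant is not given in the text and must be supplied by the user.

```python
def concentration_mg_ml(a280: float, ext_coeff_m_cm: float,
                        mol_weight_da: float, path_cm: float = 1.0) -> float:
    """Beer-Lambert: c [mol/L] = A / (epsilon * l), converted to mg/mL."""
    molar = a280 / (ext_coeff_m_cm * path_cm)
    return molar * mol_weight_da  # mol/L * g/mol = g/L = mg/mL

def percent_in_supernatant(conc_before: float, conc_after: float) -> float:
    """Share of protein still soluble after heating and centrifugation."""
    return 100.0 * conc_after / conc_before

# Illustrative numbers only: 0.50 mg/mL before heating, 0.35 mg/mL after.
remaining = percent_in_supernatant(0.50, 0.35)
print(f"{remaining:.0f}% soluble, {100 - remaining:.0f}% precipitated")
```

The percentage of protein that formed insoluble aggregates is then simply 100 minus the percentage remaining in the supernatant.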

Effect of SEP Tags on E. coli Expression of Anti-EGFR VHH-7D12 Variants
To assess the effect of positively charged SEP tags on the E. coli expression of VHH-7D12, we first conducted a small-scale culture (5 mL) at 37 °C. All of the VHH-7D12 variants were expressed as inclusion bodies. SDS-PAGE data showed that both C-terminal arginine tags (7D12-C5R and 7D12-C9R) increased the VHH protein expression approximately two-fold (Figure 2A,B and Table S1), in line with our previous findings [25,26]. Likewise, the expression of 7D12-C5K increased about two-fold, but 7D12-C9K did not show any significant change in expression compared with the untagged VHH-7D12.

Biophysical and Functional Properties of VHH-7D12 Variants
We verified that the tags did not affect the biophysical and biochemical nature of the VHH variants. The secondary structure content of the VHH-7D12 variants was assessed by circular dichroism (CD) in the far-UV range of 205-260 nm. The CD spectrum of the untagged VHH-7D12 was typical of a β-sheet protein, and according to BestSel [34], it contained 44% β-sheet and 1.2% α-helix (Figure 2C and Table S2), which is in line with our previous report and the crystal structure [32,33]. The secondary structure content of the tagged variants was very similar to that of the untagged VHH-7D12 (Figure 2C and Table S2).
The tryptophan fluorescence spectra of all of the VHH-7D12 variants showed a maximum fluorescence intensity at 345 nm, suggesting that the tertiary structure remained unchanged (Figure 2E). Furthermore, the hydrodynamic radius (Rh) measured by DLS indicated that all of the VHH-7D12 variants were monomeric with an Rh value of around 2 nm (Figure 2F), as expected for a small globular protein with a molecular weight of ~15 kDa. Finally, the SPR measurements indicated that all of the tagged VHH-7D12 variants bound to the EGFR extracellular domain, the target ligand of VHH-7D12, in a concentration-dependent manner, confirming their native functional properties (Figure 3 and Table 1).

Effect of Thermal Stress on the Aggregation of VHH-7D12 Variants
We first determined the midpoint temperature (Tm) of the VHH-7D12 variants to set the incubation temperatures. The Tm of the untagged VHH-7D12 was 63 °C, whereas the Tm of the 5-residue tagged variants (7D12-C5R and 7D12-C5K) was 2 °C lower (Figure 2D and Table 1), and that of the 9-residue tagged variants (7D12-C9R and 7D12-C9K) was 4 °C lower. We have no clear explanation for this slight decrease in stability, but we speculate that it is related to electrostatic interactions, since the decrease correlated with the number of charged residues in the tag. In any case, the decrease was minimal and did not affect the protein's function at ambient temperature (Table 1). For the incubation experiments, we chose 60 °C and 75 °C, where approximately half and all of the VHH proteins are unfolded, respectively. The fraction of VHH remaining in the supernatant before and after the heat incubation was determined in four different buffers (Na-acetate, pH 4.7; PB, pH 7.4; PBS, pH 7.4 and Tris-HCl, pH 8.5; see Materials and Methods for detailed experimental settings). Furthermore, the Z-average (Z-ave) hydrodynamic radius (Rh) of the sub-visible (soluble) aggregates present in the supernatant was measured by DLS (see below).
The untagged VHH-7D12 formed insoluble aggregates (precipitates) in all four buffer conditions after a 45 min incubation at 60 °C. The supernatant's protein concentration decreased by 4-30% depending on the buffer, with the largest reduction occurring in Na-acetate buffer at pH 4.7 (Figure 4A and Table 2). The arginine-tagged variants (7D12-C5R and 7D12-C9R) did not form precipitates in Na-acetate and Tris-HCl buffer. However, they precipitated in PB and PBS, reducing the supernatant's protein concentration by 11-20%. Thus, under thermal stress, the solubilization efficiency of the arginine tags was buffer-dependent (not pH-dependent, Figure S1). In contrast, the C5K and C9K tags fully inhibited the aggregation of VHH-7D12 in all of the buffers, including PB and PBS, and 98-100% of the VHH proteins remained in the supernatant after heat stress (Figure 4A and Table 2). The difference between the effects of the arginine and lysine tags in PB and PBS might be attributed to their side-chain properties. Namely, the guanidinium group in the arginine side chain can form hydrogen bonds with hydrogen-bonding partners in the solution (including the phosphate ions), leading to the formation of an arginine-phosphate complex structure. Such structures have been invoked in several studies and termed an arginine fork [35], an arginine claw [36], or a cyclic water-phosphate-guanidinium [37].
Additionally, we assessed the effect of a C-terminal histidine tag in a control experiment since it is the only SEP tag that increases protein solubility at low pH [38]. Indeed, the histidine tag showed a strong solubilization effect at pH 4.7 (Na-acetate buffer) but not at higher pH (Figure S2), in line with our previous report [38]. As an additional control experiment, we measured the effect of free L-arginine on the protein's solubility [15]. Arginine at a concentration of 300 µM, which corresponds to the concentration of arginine residues in the C9R tag, did not inhibit aggregation, but at a concentration of 500 mM, arginine had a strong solubilizing effect (Figure 4A and Table 2).
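The 300 µM figure can be reproduced approximately from the protein concentration used in the thermal stress experiments. The sketch below assumes the ~15 kDa molecular weight quoted in the text (the tagged variants are slightly heavier) and a protein concentration of 0.5 mg/mL; the function name is hypothetical.

```python
def tag_residue_conc_um(protein_mg_ml: float, mol_weight_da: float,
                        residues_per_tag: int) -> float:
    """Molar concentration (in uM) of tag residues carried by the protein."""
    protein_um = protein_mg_ml / mol_weight_da * 1e6  # g/L / (g/mol) -> uM
    return protein_um * residues_per_tag

# Assumes ~15 kDa and 0.5 mg/mL, i.e. roughly 33 uM protein with 9 arginines.
print(round(tag_residue_conc_um(0.5, 15_000, 9)))
```

On this estimate, the 500 mM free arginine condition contains well over a thousand times more arginine than is present in the tags.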
Figure 4. High-temperature thermal aggregation behavior was analyzed at 60 and 75 °C, where half and all of the VHHs were unfolded, respectively. Proteins at 0.5 mg/mL in four different buffers were heated for 45 min. After heat incubation, the samples were centrifuged at 20,000× g for 20 min. The amount of supernatant protein was measured just before and after heat incubation by absorption at 280 nm to calculate the percentage of the protein that formed insoluble aggregates (precipitate). Percentage of protein retained in the supernatant after heat stress of tagged and untagged VHH-7D12 at (A) 60 °C and (B) 75 °C. Experiments at 60 °C and 75 °C were performed on the same day and using the same lot of protein.
* 300 µM free L-arginine; ** 500 mM free L-arginine.
Very similar trends were observed under the harsher thermal stress generated by incubation at 75 °C, where the VHHs were essentially unfolded according to CD (Figure 4B and Table 2).
Using DLS, we assessed the Z-ave hydrodynamic radii (Rh) of the heat-induced sub-visible aggregates that remained in the supernatant after the 75 °C heat incubation followed by centrifugation. In all buffers except PB, the untagged VHH-7D12 formed aggregates with an Rh over 100 nm (Figure 5 and Table 3). The tagged VHH-7D12 variants formed some aggregates smaller than 50 nm under most conditions. The most striking exception was PB, where 7D12-C5R formed aggregates of almost 400 nm (Figure 5 and Table 3). Notably, the Rh of 7D12-C5K and 7D12-C9K, which essentially remained (96-99.5%) in the supernatant after the 75 °C heat incubation, showed a slight increase in Na-acetate and PB (20-41 nm), but not in PBS and Tris-HCl buffer, which emphasizes the potential of the lysine tag as an aggregation-inhibiting tag. L-Arginine had a similar effect on the size of the soluble aggregates formed under thermal stress, but the Rh values, particularly in PB and PBS, were larger than those of the aggregates formed by the tagged variants. Note that the protein concentration was not adjusted for the amount of precipitated protein to avoid unwanted aggregation or dissociation. Overall, the lysine tag solubilized VHH under heat stress, and the protein was monomeric, natively folded (Figures S3 and S4), and active (Figure S5 and Table 1) upon returning to ambient temperature.

The SEP Tag Is a Versatile Technique for Solubilizing Proteins
Controlling protein solubility in a versatile and inexpensive way is a holy grail of protein engineering [39], especially in protein drug development. Fusion proteins such as thioredoxin [40], N utilization substance (NusA), maltose-binding protein (MBP) [41], and small ubiquitin-like modifier (SUMO) [42] have been used to solubilize proteins, but because of their large sizes, they need to be removed, which adds further cost and handling. Co-solutes can be used to control aggregate formation, but the need for a high co-solute concentration often makes the solution hypertonic and unusable for therapeutic purposes. To date, the SEP tag is the only way to reliably control protein solubility and aggregation without changing the buffer conditions and without altering the protein's structural and functional properties [38,43,44]. In addition, SEP tags can solubilize recombinant proteins containing multiple disulfide bonds and yield a substantial amount of fully native protein from E. coli expression systems, which would otherwise not be possible [29,45].

Conclusions
In conclusion, we showed that a SEP tag made of positively charged arginines or lysines can inhibit thermal aggregation at high temperatures, where proteins are unfolded and aggregation is often irreversible. Overall, the lysine tags performed somewhat better than the arginine tags, and the five-residue lysine tag was the best, as it both increased the expression level and solubilized VHH under heat stress conditions in a monomeric, natively folded, and active state.