3.1. Functional Analysis of Deletion Mutants in Crude E. coli Lysates
One of the simplest ways to localize the catalytic domain(s) of an enzyme is to evaluate the catalytic activity of deletion mutants. MLuc7 is the smallest natural luciferase and has the shortest variable N-terminus. In fact, this isoform consists almost entirely of the two repeats that form the conserved C-terminal region of copepod luciferases [2]. This makes MLuc7 a convenient model for localizing the catalytic domain of copepod luciferases. The set of gradual N- and C-terminal deletion mutants obtained for MLuc7 is shown schematically in
Figure 1a,b with R1 and R2 corresponding to individual homologous repeats in the sequence. The deletion variants were directly expressed without signal peptides in
E. coli cells by employing the pET expression system and were functionally compared using crude cell extract.
When synthesized in this expression system, wild-type MLuc7 is found mainly in the insoluble fraction as inclusion bodies [15]. Because deletions may change the levels of expression and solubility and thereby distort the determination of bioluminescent activity, both were analyzed for all mutants by gel electrophoresis (
Figure S3). After induction of synthesis, all deletion mutants were found in the insoluble fraction of the
E. coli lysate, similar to wild-type MLuc7, except for ML7-R2, about half of which was in a soluble form. This difference did not affect the comparison, however, because no bioluminescent activity could be detected for either of the separately expressed luciferase halves corresponding to the repeats (ML7-R1 and ML7-R2, Figure 1c). These findings are consistent with the absence of noticeable bioluminescent activity in the homologous halves of GpLuc reported in studies that constructed its fragments for protein complementation assays [15,16], but differ from other data for GpLuc, in which bioluminescent activity was reported for analogous fragments (Figure 1a) [9,13].
The bioluminescent activity of the mutants was evaluated at 15 °C (optimal for MLuc7 [6]) and at 5 °C, because deletions may reduce the structural rigidity of the protein molecule and thereby lower the temperature optimum of the reaction. A low temperature optimum of this kind is characteristic of the psychrophilic MLuc2 isoform from M. longa, which contains an additional glycine cluster in its amino acid sequence and only four disulfide bonds [7]. The N-terminal deletion of 10 residues (ML7-N10) did not significantly change the bioluminescent activity (Figure 1c), whereas removing 14 amino acids (ML7-N14) resulted in a loss of ~70% of activity at 15 °C and a decrease in the temperature optimum of the reaction (mutant activity at 5 °C was ~20% less than that of wild-type MLuc7 at this temperature). The deletion of 19 amino acids (ML7-N19) completely abolished the bioluminescent activity, thereby placing the N-terminal boundary of the catalytic domain between G32 and L37 of MLuc7.
The deletions at the C-terminus also caused a gradual decrease in bioluminescent activity, which, to our surprise, was partially retained even when the deletions extended into part of the highly conserved motif 2 (Figure 1). The loss of bioluminescent activity in the C-terminal deletion mutants with an increasing number of truncated amino acid residues was accompanied by a significant decrease in the temperature optimum of the reaction: for all mutants, a higher bioluminescent activity was observed at 5 °C. The activity of the ML7-C20 variant, in which three residues of conserved motif 2 are deleted, could only be detected at 5 °C (Figure 1c, insert). Enzymatic activity of the deletion mutants could be detected as long as the highly conserved Cys148 was not deleted; its removal led to a complete loss of bioluminescent activity, so this residue marks the C-terminal border of the catalytic domain. Thus, the catalytic domain of MLuc7 luciferase resides within the G32-A149 sequence, and both repeats are involved in its formation. Considering the high degree of identity between the conserved C-terminal regions of copepod luciferases, this is likely to hold for all of them. Interestingly, a similar boundary was recently identified for the artificial luciferase ALuc30, designed on the basis of consensus sequences of the cloned copepod luciferases [26].
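The boundary-mapping logic used above can be illustrated with a minimal Python sketch. It assumes that activity falls monotonically as the deletion grows, and it uses approximate relative activities read from the text (wild type = 1.0); the function name and the zero-activity threshold are illustrative choices, not part of the reported analysis.

```python
from typing import Optional


def bracket_boundary(
    series: dict[int, float], threshold: float = 0.0
) -> tuple[Optional[int], Optional[int]]:
    """Bracket a domain boundary from a truncation series.

    `series` maps the number of deleted residues to relative activity.
    Returns (largest deletion that is still active,
             smallest deletion that is inactive).
    Assumes activity decreases monotonically with deletion length.
    """
    active = [n for n, activity in series.items() if activity > threshold]
    inactive = [n for n, activity in series.items() if activity <= threshold]
    last_active = max(active) if active else None
    first_inactive = min(inactive) if inactive else None
    return last_active, first_inactive


# Approximate N-terminal series at 15 °C, taken from the text:
# N10 roughly unchanged, N14 retains about 30%, N19 shows no activity.
n_terminal = {0: 1.0, 10: 1.0, 14: 0.3, 19: 0.0}
print(bracket_boundary(n_terminal))  # (14, 19)
```

The boundary is thus bracketed between the deletions of 14 and 19 residues, which corresponds to the G32 to L37 interval reported above.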
3.2. Bioluminescent Properties of High-Activity Deletion Mutants
A small size is an attractive feature of any reporter because it minimizes (1) the metabolic load on the host cell when the reporter is genetically encoded; (2) steric hindrance when the reporter is used as part of a hybrid protein; and (3) the distance when the bioluminescent reporter is used as a BRET partner. For these reasons, we chose the two deletion variants of MLuc7 whose bioluminescent activity was least affected by the deletions, ML7-N10 and ML7-N10C4 (
Figure 1) for more detailed characterization. Monomeric preparations of the natively folded proteins were obtained from
E. coli inclusion bodies by glutathione-based oxidative refolding, followed by chromatography purification (the purity exceeded 95%) (
Figure S3c). The homogeneity of the preparations was confirmed by SDS and semi-native polyacrylamide gel electrophoresis.
The bioluminescence spectrum of the ML7-N10C4 mutant, with λmax around 485 nm, corresponded to those of wild-type MLuc7, the other M. longa isoforms, and GpLuc [1], whereas that of the ML7-N10 mutant appeared to be slightly broader and shifted by ~5 nm towards longer wavelengths (Figure 2a). The similarity of the light-emission spectra of the truncated MLuc mutants and GpLuc points to a similar environment of the substrate in their active centers and, consequently, suggests that the variable N-terminus of both GpLuc and MLuc is hardly involved in the formation of the substrate-binding cavity of copepod luciferases as was suggested based on the NMR structure of GpLuc [
13].
The pH profiles of light intensities of the luciferase mutant ML7-N10 and wild-type MLuc7 turned out to be identical at physiological pH in the range of 6.5–8.0, whereas the pH profile of the ML7-N10C4 mutant was slightly shifted towards acidic pH by ~0.25 units (
Figure 2b). The dependences of bioluminescent activities on salt concentration of both mutants are also almost identical to that of MLuc7 refolded from
E. coli inclusion bodies, i.e., the optima were found at ~0.3 M (
Figure 2c). Notably, however, ML7-N10C4 appeared to be somewhat more resistant to high salt concentrations. A distinctive feature of all known copepod luciferases is a high resistance to thermal inactivation owing to the presence of multiple disulfide bonds [
2]. Both deletion mutants have retained this property (
Figure 2d), thus indicating the presence of the correctly formed S-S bonds in both mutants.
Bioluminescent activity is affected more strongly by simultaneous truncation of residues at the N- and C-termini. The ML7-N10C4 mutant retains only 23% of the activity of wild-type MLuc7 (Table 1). In contrast, the deletion of 10 residues at the N-terminus alone does not influence bioluminescent activity (the ML7-N10 mutant retains practically the same activity as wild-type MLuc7). The low activity of the ML7-N10C4 mutant may also be caused in part by the non-optimal assay temperature of 15 °C, because the deletion of four residues at the C-terminus shifts the temperature optimum of the bioluminescent reaction from 15 °C to 7–10 °C (
Figure 2e).
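As a quick arithmetic check, a retained fraction of activity converts to a fold reduction by taking its reciprocal. The short Python sketch below shows this for the 23% value quoted above; the function name is illustrative only.

```python
def fold_reduction(retained_fraction: float) -> float:
    """Convert retained activity (0 < fraction <= 1) to a fold reduction."""
    if not 0.0 < retained_fraction <= 1.0:
        raise ValueError("retained_fraction must be in the interval (0, 1]")
    return 1.0 / retained_fraction


# ML7-N10C4 retains 23% of wild-type activity (Table 1).
print(round(fold_reduction(0.23), 1))  # 4.3
```

A retained activity of 23% therefore corresponds to a roughly four- to five-fold reduction relative to the wild type.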
Thus, removal of the first ten residues of mature MLuc7 has practically no effect on the bioluminescent properties of the luciferase. However, additional deletion of even four amino acids at the C-terminus significantly reduces activity and shifts the temperature optimum. This emphasizes the importance of the C-terminus in stabilizing the molecular structure.
To test the truncated MLuc7 variants as secreted bioluminescent reporters, the deletion mutants ML7-N10 and ML7-N10C4 in the pcDNA3m vector were transiently transfected into CHO cells. The plasmids encoding the truncated versions of MLuc7, corresponding to the ML7-R1 and ML7-R2 repeats (
Figure 1b) with N-terminal signal peptides, were also tested for transient transfection of mammalian cells. The cells were observed to start secreting the reporter immediately after transfection and to accumulate the luciferase in the medium.
Figure 2f shows the time course of secretion after the transfection medium was replaced with fresh medium following 5 h of treatment. Up to the 20 h point, bioluminescent activity grew synchronously for all three luciferases. Notably, the intensities of the light signals were in proportion to those determined for the high-purity proteins (Table 1). After that, the increase in activity slowed down and the activity began to fall for the mutant variants, whereas for wild-type MLuc7 it continued to rise. Most likely, this pattern reflects partial degradation of the deletion mutants in the medium owing to their lower resistance to proteases compared with the wild-type luciferase.
It is noteworthy that neither of the individual MLuc7 repeats (ML7-R1 and ML7-R2) showed any bioluminescent activity exceeding that of the negative control, namely the pcDNA3.1+ vector without insert. This is additional evidence supporting the results obtained with bacterial expression of the halves of MLuc7 luciferase and the absence of bioluminescent activity in separate domains of luciferase corresponding to the ML7-R1 and ML7-R2 repeats, as was previously suggested [
9,
14].
3.3. Protein Spatial Structure Analysis
Despite the active development of numerous copepod luciferase-based reporters, the structural features of the luciferases themselves remained unknown until recently and were only inferred from predictive structural analysis, spectral studies, or mutagenesis [
14,
27,
28,
29,
30] that could indirectly determine the involvement of certain amino acids in the enzyme active center. Because copepod luciferases have no significant amino acid sequence homology with other proteins, thereby representing a unique class of proteins [
2], the exact tertiary structure of luciferases could not be predicted for a long time.
However, in 2020, the NMR structure of copepod luciferase, GpLuc, was determined (PDB ID 7D2O) (
Figure 3a) [
13]. For NMR experiments, GpLuc was obtained in a soluble form by bacterial expression in M9 medium containing ¹³C and ¹⁵N and by subsequent oxidative refolding in air [31]. This NMR structure is considered a novel coelenterazine-dependent luciferase fold: a globular protein formed by nine α-helices (α1–α9) with disordered regions at the N- and C-termini and in the region between the two conserved motifs of the luciferase (
Figure 3a).
The structure of GpLuc allowed us to predict the spatial structure of Metridia luciferase. In the present study, this known structure was used as a reference model for predicting the structures of MLuc7 luciferase and its highly active truncated mutant, ML7-N10 (
Table 1), using the online server I-TASSER [
22,
23,
24] (
Figure 3a). The predicted models for MLuc7 and ML7-N10 (
Coordinates data S1, S2) showed high confidence with C-scores [
22,
23,
24] of 1.28 and 1.86, respectively. The RMSD values calculated after alignment to GpLuc were 0.957 Å for MLuc7 and 0.792 Å for ML7-N10. These parameters allow the obtained high-confidence models of Metridia luciferases to be considered suitable structures for further analysis.
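For reference, the RMSD between two sets of equivalent atoms is the square root of the mean squared distance between paired positions. The Python sketch below computes it for coordinates that are assumed to be already superimposed; it performs no structural alignment, and the text does not state which atoms or software were used for the reported values, so the toy coordinates are purely hypothetical.

```python
import math

Coord = tuple[float, float, float]


def rmsd(a: list[Coord], b: list[Coord]) -> float:
    """RMSD between two equally long, already superimposed coordinate lists."""
    if len(a) != len(b) or not a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared_sum = sum(
        (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
        for (x1, y1, z1), (x2, y2, z2) in zip(a, b)
    )
    return math.sqrt(squared_sum / len(a))


# Hypothetical toy coordinates (in Å): every atom is displaced by 1 Å along y.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
model = [(0.0, 1.0, 0.0), (1.5, 1.0, 0.0), (3.0, 1.0, 0.0)]
print(rmsd(reference, model))  # 1.0
```

Values below 1 Å, such as those reported here, indicate that the paired atoms of the model and the reference lie on average less than 1 Å apart after superposition.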
The overall structures of the obtained Metridia luciferase models in general corresponded to the fold of GpLuc because of high homology (
Figure 3a and
Figure S1). The Metridia structures, like that of GpLuc, also have a set of α-helices and intrinsically disordered regions located at the C-terminus and between helices α5 and α6 (Figure 3a,b), which corresponds to the sequence between the two conserved motifs (
Figure 1a). The main difference in the structural organization of the two luciferases is the absence of the central α1-helix in the natural sequence of MLuc7 (
Figure 3a,b). This indicates that this structural element is not essential for enzymatic activity in copepod luciferases.
Two of the obtained mutants, ML7-N10 and ML7-N14, which lack the first 10 and 14 N-terminal amino acid residues, respectively (Figure 1b), represent truncated versions of MLuc7 without the α2-helix. According to our results, these deletions did not abolish bioluminescence (
Figure 1c). However, the truncation in ML7-N19 that affected the first proline of the α3-helix (
Figure 3b) resulted in complete inactivation of the enzyme. Considering the data on the M5 mutant of MLuc164, which lacks the variable N-terminus and even exhibited a two-fold increase in bioluminescent activity (Figure 3b) [8], we suggest that the N-terminal boundary of the catalytic domain of copepod luciferases lies at the edge of the α3-helix. Helices α1 and α2 therefore appear not to be involved in the oxidative decarboxylation of coelenterazine.
Figure 3.
(
a) Overall structure of GpLuc (PDB ID 7D2O) aligned with structures of MLuc7 and ML7-N10 mutant predicted by online server I-TASSER [
22,
23,
24] presented as a stereo capture. GpLuc is shown in purple, the predicted model for MLuc7 in aquamarine, and the predicted model for ML7-N10 in pale pink. α-Helices are numbered 1–9 as in the study [
13]; (
b) Amino acid alignment of GpLuc and isoforms of Metridia luciferase, MLuc164, MLuc7, and their truncated mutants, ML164M5 [
8] and ML7-N10, respectively. The residues are numbered in accordance with the full-length protein sequences. The conserved motifs are highlighted in yellow, and cysteines are colored green and numbered. Gray boxes indicate the α-helices according to the structure of GpLuc. Owing to the high homology among the luciferase sequences, the framed patterns were extended to all aligned sequences. Because of the low confidence score for the prediction of the full-length MLuc164 model, the N-terminal α1-helix (dashed box) was identified using the PSSpred server, which predicts protein secondary structure [
32]. The a/a column represents the length of amino acid sequences for mature proteins without signal peptides. The arrows show the boundaries of the sequence (32-149) found as a minimal essential region for catalytic activity of MLuc7; (
c) Spatial structure of GpLuc with determined disulfide bridges. Insert shows loop-α5-helix-loop and loop-α9-helix-loop patterns forming the folds in the same manner.
The disulfide bonds in the structures of copepod luciferases are of key importance in bioluminescence, considering that the addition of DTT to the luciferase samples results in their inactivation [
33]. The original structural analysis of GpLuc yielded the positions of the disulfide bonds that connect the conserved motifs containing 10 Cys residues. This suggests that the combination of disulfide bonds may be the same for the whole class of copepod luciferases. The NMR structure made it possible to clearly identify the pairs of cysteines forming the S-S bridges 3/6, 4/5, and 9/10, whereas the contacts 1/8 and 2/7 were calculated from the distances between Sγ atoms. Interestingly, three of these pairs were predicted in the same form by DISULFIND [
14]. The structure reveals that disulfide bridges (1/8, 3/6, and 2/7) firmly bind antiparallel bundles, thereby tightening the α3/α8 and α4/α7 helices (
Figure 3c). In addition, the two well-defined loop-α-helix-loop patterns for α5 and α9 adopt similar folds, in which the helices encompass the residues between the two cysteine pairs located at the end of each of the conserved motifs (
Figure 1a and
Figure 3b,c). The similarity in the fold of the α5 and α9 helices may support the hypothesis that copepod luciferases evolved by duplication of a gene encoding the consensus sequences of motifs 1 and 2 [
10]. The importance of the cysteine No. 5 in α5-helix was already confirmed by mutagenesis that resulted in complete inactivation of GpLuc [
27]. The current study also shows that truncation of the last cysteine, located in α9, in the ML7-C22 mutant leads to a loss of bioluminescent activity. The series of C-terminal truncations of Metridia luciferase demonstrates the importance of the last cysteine for the stability of the protein molecule and, consequently, for enzyme activity. Considering the above, we conclude that copepod luciferase retains the ability to catalyze the bioluminescence reaction as long as at least cysteines No. 4, 5, 9, and 10 and the structural elements loop-α5-helix-loop and loop-α9-helix-loop are present in the sequence. Thus, the C-terminal boundary of the catalytic domain of copepod luciferase lies at the last cysteine.
Based on the predicted structures, we also attempted to determine the reason for the ~5-fold reduction in luciferase activity observed in the truncated mutant ML7-N10C4, which differs from ML7-N10 by only four C-terminal residues. The C-terminus represents the most conserved region among all known copepod luciferases and terminates a second tandem repeat. It belongs to intrinsically disordered regions with high flexibility, as it was shown for GpLuc [
13]. The decrease in activity observed in ML7-N10C4 leads us to suggest that the C-terminal unstructured region is functionally important. A recent study of the bioluminescence mechanism of another coelenterazine-utilizing luciferase, that of Renilla, revealed a cap domain that provides an open or closed state of the substrate cavity [34]. It cannot be excluded that the C-terminal part of copepod luciferases similarly serves as a structural element, providing a hydrophobic “wall” for the substrate-binding pocket.
Structural modeling of MLuc7, ML7-N10, and ML7-N10C4 as surfaces with indicated hydrophobic regions revealed significant differences in the hydrophobic pockets as compared to the previously published GpLuc structure. While original structural analysis of GpLuc postulated an entrance to the substrate cavity between α4, α5, and the C-terminus, in the case of ML7-N10 and ML7-N10C4, this cavity was extended throughout the entire molecule, forming a distinctly gaping hole (
Figure 4a). We hypothesized that this may affect not only the bioluminescent activity of the deletion mutants but also their ability to retain the reaction product, coelenteramide, in the substrate-binding pocket.
This hypothesis was experimentally tested by evaluating the luciferase activity upon addition of the second portion of the substrate coelenterazine into the reaction mixture after bioluminescence was initiated. As was previously shown for GpLuc [
19], the maximum bioluminescent activity at the second injection of the substrate was around 10%. In this study, we evaluated the activity after a second substrate injection for MLuc7, ML7-N10, and ML7-N10C4 in comparison with GpLuc under the same conditions. The activity values measured after the second injection for MLuc7 and GpLuc were 6.1 ± 0.4% and 10.8 ± 3.1%, respectively, which is consistent with the previous data and suggests that the hydrophobic substrate cavity is still occupied by coelenteramide. Surprisingly, for the deletion mutants of Metridia luciferase, bioluminescence after the second injection of the substrate reached 23.5 ± 4.7% for ML7-N10 and 40.4 ± 4.1% for ML7-N10C4 (Figure 4b). These findings indicate that the N-terminal amino acid residues that form the α2-helix and the C-terminal intrinsically disordered region are functionally important for the formation of a cavity that retains the reaction product but are not part of the catalytic domain of copepod luciferases.
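The double-injection readout can be expressed as a simple calculation. The Python sketch below assumes that the reported percentages are the peak signal after the second injection relative to the peak after the first, averaged over replicates, with the ± value taken as a sample standard deviation; neither the normalization nor the number of replicates is stated in the text, and the input values are hypothetical.

```python
from statistics import mean, stdev


def second_injection_percent(
    first_peaks: list[float], second_peaks: list[float]
) -> tuple[float, float]:
    """Mean and sample SD of second-injection peaks as % of first-injection peaks."""
    if len(first_peaks) != len(second_peaks) or len(first_peaks) < 2:
        raise ValueError("need at least two paired measurements")
    ratios = [100.0 * second / first for first, second in zip(first_peaks, second_peaks)]
    return mean(ratios), stdev(ratios)


# Hypothetical peak intensities (arbitrary light units) from three replicates.
first = [1000.0, 1000.0, 1000.0]
second = [200.0, 250.0, 300.0]
print(second_injection_percent(first, second))  # (25.0, 5.0)
```

Under this reading, a higher percentage means that more enzyme is available to react with fresh substrate, which is in line with weaker retention of coelenteramide in the mutants.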