Identification and Characterization of the Haloperoxidase VPO-RR from Rhodoplanes roseus by Genome Mining and Structure-Based Catalytic Site Mapping

Abstract: Halogenating enzymes have evolved in considerable mechanistic diversity. The apparent need for halogenation in secondary metabolism coincides with the current need to introduce halogens in synthetic products. The potential of halogenating enzymes and, especially, vanadate-dependent haloperoxidases has been insufficiently exploited for synthetic purposes. In this work, we identified potential halogenase sequences by screening algal, fungal, and proteobacterial sequence databases, structurally modeling putative halogenases, and mapping and comparing active sites. In a final step, individual haloperoxidases were expressed and kinetically characterized. A vanadate-dependent haloperoxidase from Rhodoplanes roseus could be heterologously expressed in E. coli and purified to homogeneity. The kinetic data revealed a higher turnover number than the known VClPO-CI and no inhibitory effect from bromide, rendering this enzyme a promising biocatalyst. Other predicted haloperoxidases have not yet been expressed successfully, but these enzymes were predicted to be present across a wide taxonomic range.


Introduction
Halogens can have a tremendous impact on the physicochemical properties of chemical compounds. The heavier halogens chlorine, bromine, and iodine are large and electron-withdrawing substituents and can form halogen bonds via their σ-hole [1][2][3][4][5][6]. On the one hand, introducing halogens is an important lever for chemical transformations such as substitutions, metal-halogen exchanges, or eliminations. On the other hand, these properties can have a considerable impact on the biological activity of an active compound. Therefore, halogens are a component of many active substances or their synthesis intermediates and taxonomically broad secondary metabolites [7][8][9].
Consequently, several halogenating enzymes have evolved in nature [9,10]. The most prominent halogenating enzymes are the flavin-dependent halogenases [11,12] and haloperoxidases [13][14][15]. Here, we focus on the enzyme family forming hypohalites from hydrogen peroxide and the respective halides. For catalytic activity, either a heme [13,16,17] or vanadate needs to be bound. It was hypothesized that the electrophilic hypohalite species reacts remotely from the active site and, thus, the regioselectivity depends exclusively on the reaction partners.
However, the number of halogenating enzymes studied is rather limited [9], although these enzymes can be valuable tools for synthesis, as precise doses of hypohalites can be used during the reaction to avoid over-halogenation. From a synthetic point of view, it is also problematic that, for example, the best-described haloperoxidase, from Curvularia inaequalis, shows an evident substrate-excess inhibition by chloride [18].
Here, we searched for novel haloperoxidases that are not catalytically inhibited at a substrate excess and do not release hypohalite from the active center, such that the reaction takes place selectively in the active center. Initially, we analyzed sequence databases with algal, fungal, and proteobacterial sequences, thus focusing on a large taxonomic breadth. Representatives identified in this step as haloperoxidases with only moderate sequence similarity to known haloperoxidases were subjected to structure predictions and binding site comparisons to identify active sites with new structural characteristics. Selected candidates were subjected to expression studies and biochemical analyses (see Figure 1).

Identification of Haloperoxidases from Sequence Databases
To obtain potential halogenase sequences, GenBank was screened for sequences from the Phaeophyceae (brown algae) using simple similarity searches on published sequences, e.g., from Ascophyllum nodosum [19]. The GenBank search yielded several sequences from Laminaria digitata and Saccharina japonica, and one from Fucus distichus (AAC35279.1). For further analysis, one sequence was chosen for Laminaria digitata (CAQ51446.1), which was found after stress elicitation and described as a vanadium-dependent haloperoxidase potentially involved in iodination. Additionally, one sequence was selected for Saccharina japonica (AYP65253.1) [20]. To extend the coverage, the transcriptome shotgun assembly database of NCBI was searched, which yielded multiple sequence candidates for Sargassum spp. (S. fusiforme GFKJ01014004.1, S. muticum GFKI01033812, and S. vulgare GEHA01064915) [21] and an additional sequence for Fucus ceranoides (HACY01003800.1). After comparison with the published sequences, the most likely reading frames were identified and potential frameshifts were corrected.
Similarly, to identify non-algal sequences, the sequence underlying the chloroperoxidase structure PDB ID 1IDQ_A from Curvularia inaequalis [22] was used as bait. The most similar fungal sequences not stemming from Curvularia (with a sequence similarity between 81 and 84%) were picked for further analysis and represented sequences from several Bipolaris species as well as one each from Stemphylium lycopersici and Exserohilum turcica. All species stem from the fungal Pleosporaceae family and mostly represent plant pathogens.
In addition, sequences were selected from proteobacteria (a sequence similarity between 43 and 49%) and from the cyanobacterium Nostoc minutum (a sequence similarity of 37%) for further analysis.

Clustering of Sequences of Selected Haloperoxidases
To visualize relative distances between sequences, a cladogram was constructed (Figure 2, left) based on a multiple sequence alignment (MSA, Figure S1). Brown algae sequences are clearly separate from those of bacteria and fungi, whereas the latter two are more similar to each other. This corresponds to a higher sequence identity (Figure S2) between bacteria and fungi sequences, mostly ~45%, with the exceptions of Luteitalea pratensis (~20-25%) and Deltaproteobacteria bacterium (~50%). By contrast, the sequence identity of algae sequences to bacterial and fungal sequences is in the range of 15-20%.
Figure 2. Cladogram of the selected haloperoxidase sequences. Sequences are color-coded according to taxonomy: algae in brown, bacteria in magenta, cyanobacteria in cyan, and fungi in grey. The tree is calculated based on the distances generated from the pairwise scores (numbers in the middle column). The quality of the structural models is expressed by TopScore (second right column) and TopScoreSingle values (right column) [23], where lower values and blue color indicate better quality.
Among algae, the separation corresponds to the species tree, separating the order Laminariales (Laminaria digitata and Saccharina japonica) from the order Fucales. Within the Fucales, the Fucaceae family sequences (Fucus spp. and Ascophyllum nodosum) are separated from the Sargassaceae family (Sargassum spp.). Consequently, L. digitata and S. japonica are the most diverse in the group and form a separate cluster (identity range of 51.6-58.8% to all other algae sequences, and 63.5% between the two). The other two main clusters are formed by the sequences from the genus Sargassum (identity ~90% within the cluster) and by the genus Fucus (identity of 95.8% between F. distichus and F. ceranoides) together with A. nodosum (identity range of 62.3-66.9% in the group, and 86.2-95.8% in the cluster). Among bacteria, L. pratensis and N. minutum are the most divergent in the group (identity range of 21.5-22.9% and 43.0-45.4%, respectively, in the group, and 28.0% between the two). R. roseus and Methylobacter sp. form another cluster (identity between the two of 55.3%), whereas the sequence from Deltaproteobacteria bacterium is separated and has a higher similarity to the group of fungi (identity range of 22.2-43.5% and 50.6-52.1%, respectively, for the bacteria and fungi groups). Among fungi, S. lycopersici and E. turcica branch out and are separated from the sequences of the genus Bipolaris (identity range of 82.4-86.1% in the group, and 83.9% between the two). In the Bipolaris cluster, B. victoriae and B. zeicola are the most similar (identity of 99.3%), while B. sorokiniana, B. maydis, and B. oryzae show, respectively, increasing diversity.
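The identity values quoted above are pairwise comparisons over aligned sequences. The following minimal sketch shows one common way to compute such a value from two rows of an MSA. The exact definition behind Figure S2 is not stated in this section, so the choice of denominator (columns where both sequences carry a residue) is an assumption, and the function name and toy sequences are purely illustrative.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where both rows have a residue.

    Assumption: gap columns ('-') are excluded from the denominator.
    Other conventions (e.g., dividing by alignment length) give other values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == "-" or res_b == "-":
            continue  # skip columns with a gap in either sequence
        compared += 1
        if res_a == res_b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy rows (hypothetical, not taken from Figure S1):
# six comparable columns, five of them identical.
identity = percent_identity("MKT-AYL", "MRTQAYL")
```

Because the denominator convention changes the result, values computed this way are only comparable to each other, not necessarily to a matrix produced by a different tool.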

Structural Modeling of Selected Haloperoxidases
As no experimentally derived structures were available for the 20 selected haloperoxidase enzymes, a structural model for each sequence was constructed. In general, the models from fungi and some algae sequences are of very good quality (TopScore [23] values ~0.15; TopScore values are bounded between 0 (very good) and 1 (bad)), and for other algal and bacteria sequences, the models are of lower, albeit still good quality (TopScore values ~0.25-0.3) [24] (Figures 2, 3 and S3-S5). This is due to different template availabilities and differences in the sequence identities to the templates used for modeling (see Table S1 for details). Interestingly, the Ascophyllum sequence, which is very similar to the known crystal structure from A. nodosum (PDB ID 1QI9_B), had a similar TopScore value as the more distant sequence from S. fusiforme (Table S1), indicating a good modeling performance of TopModel [25]. The high-quality structural models (Figures 2 and S2) allow for the inference of the location of the catalytic residues after alignment as well as the probable functional assembly.
Figure 3. Structural models of selected haloperoxidases. The local quality of the models is assessed with TopScore, where darker blue color indicates better quality (left). The best template selected for each example is also reported for comparison as a white-colored structure (right). The vanadium cofactor is shown as spheres and residues located ≤5 Å from it are shown as sticks.

Structural Models of Haloperoxidases from Algae
For models from algae, the best templates generally have a sequence identity of ~55-65% to the target sequences (Table S1). Exceptions include sequences of the genus Fucus, where the best templates have an identity of ~85%, and A. nodosum, for which a structure with an identical sequence was deposited (PDB ID: 1QI9, 100% identity and 99.6% coverage). Most algae sequences were modeled based on templates in multimeric functional assembly. Since the interface is close to the catalytic pocket, generating at least a dimeric structural model is required. The regions modeled with lower quality (Table S4) include unstructured N- and C-termini and some short loop regions located mostly at a large distance (>20 Å) from the catalytic pocket. Notably, the N-terminus is placed right before a helix involved in dimerization (~23 residues long, Table S4), which points outward and is oriented towards the putative dimerization interface. To make constructing a dimeric model easier, a truncated model without the low-quality N-terminus, thus starting from the dimerization helix, was used for the next steps.

Structural Models of Haloperoxidases from Bacteria
For models from bacteria, the best templates available have an identity of ~45-50% to the target sequences (Table S2), indicating that less well-suited templates are available for the selected sequences. Compared to models from algae, the catalytic pocket is less exposed due to a more compact helical bundle, which is generally modeled with high quality. Additionally, in most models, a β-hairpin located on top of the pocket can be found, likely acting as a lid. Exceptions include the cases of L. pratensis and N. minutum, where this region is mostly unfolded but still of good quality (TopScore < 0.5). Other than unstructured N- and C-termini, low-quality regions (Table S5) include some loop regions located mostly at a large distance (>25-30 Å) from the catalytic pocket. In the case of Methylobacterium sp., while the folding of the β-hairpin lid is conserved, its quality is low (TopScore > 0.5), especially in the first β-strand.

Structural Models of Haloperoxidases from Fungi
For models from fungi, the best templates available have a very high sequence identity of ~85% to the target sequence (Table S3). Structurally, these resemble the models from bacteria due to a very compact helical bundle supporting the catalytic pocket and the presence of a β-hairpin lid. The sequence conservation of this lid is very high, with only one position not fully conserved (M, K, and R, respectively, for S. lycopersici, E. turcica, and models from the genus Bipolaris) and three positions with "strongly similar properties" (indicated with a colon in the MSA; Table S6 and Figure S1). In these models, low-quality regions (Table S6) are located in unstructured N- and C-termini and in extremely short loop regions located mostly at a large distance (>25-30 Å) from the catalytic pocket.

Mapping and Comparison of the Catalytic Sites
For the structural models, including dimeric models for algae, molecular interaction fields (MIF) were calculated with DrugScore 2018 [26,27] and represented with Zernike descriptors [28]. In doing so, the description of the molecular recognition properties of the binding sites was compressed, allowing an alignment-independent pairwise comparison using the Manhattan distance as a measure. That way, proteins with similar molecular recognition properties can be identified, likely interacting and reacting with similar substrates.
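The pairwise comparison described above can be illustrated with a short sketch that computes Manhattan (L1) distances between descriptor vectors and scales them to the largest observed distance. The descriptor values below are made up, and normalizing by the maximum distance is an assumption; the text does not state how the normalized distances were obtained.

```python
def manhattan(u, v):
    """Manhattan (L1) distance between two equally long descriptor vectors."""
    if len(u) != len(v):
        raise ValueError("descriptor vectors must have equal length")
    return sum(abs(a - b) for a, b in zip(u, v))


def normalized_distance_matrix(descriptors):
    """Return {(name_a, name_b): distance / max_distance} for all name pairs.

    Assumption: distances are normalized by the largest pairwise distance.
    """
    names = list(descriptors)
    raw = {
        (a, b): manhattan(descriptors[a], descriptors[b])
        for a in names
        for b in names
    }
    max_distance = max(raw.values()) or 1.0  # avoid division by zero
    return {pair: d / max_distance for pair, d in raw.items()}


# Hypothetical three-component descriptors for three binding sites.
toy = {"site_A": [0.0, 1.0, 2.0], "site_B": [1.0, 1.0, 0.0], "site_C": [4.0, 1.0, 2.0]}
matrix = normalized_distance_matrix(toy)
```

Because only the descriptor vectors enter the calculation, no structural or sequence alignment of the proteins is needed, which is the property the approach relies on.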
L. pratensis and B. sorokiniana branch out (Figure 4 and Figure S6) and have the most diverse binding sites in the set. For the latter, this result is surprising given the high identity (82.4-86.1%) with the other members of the Bipolaris group. These are mainly part of one cluster, with subclusters formed by E. turcica and R. roseus (normalized distance of 0.57), and B. oryzae (normalized distance within the cluster of 0.50-0.55), as well as B. victoriae and B. zeicola (normalized distance of 0.43). Therefore, interestingly, the model from the bacterium R. roseus displays molecular recognition similarities to some other models from fungi. Instead, a bacterial model that differs from other examples is the one of Deltaproteobacteria bacterium. This result is expected, given the low to average sequence identity of 22.2-43.5% and 50.6-52.1% to the bacteria and fungi models, respectively.
Figure 4. Clustering of the binding sites of the structural models. Manhattan distances between the Zernike descriptors were used for similarity assessment. The sequences are color-coded according to the taxonomy: algae in brown circles, bacteria in magenta triangles, cyanobacteria in cyan squares, and fungi in grey diamond shapes. The tree is drawn to scale with branch lengths in the same units as those of the distances used to infer the tree. Cyan concentric lines represent 25%, 50%, and 75% from the center of the maximum distance, respectively.
All the other examples form two main clusters, one formed by models from fungi and bacteria and another one entirely formed by models from algae. In the first, apparently unrelated systems are clustered together, including Methylobacter sp.

Expression of Four Putative Haloperoxidase Sequences
To confirm the bioinformatics results achieved for this group of genes and putative enzymes, four sequences were chosen from A. nodosum, F. ceranoides, S. fusiforme, and R. roseus to test for haloperoxidase activity. These cases include algae sequences with similar binding sites from A. nodosum and F. ceranoides, as shown in the previous analysis of the predicted structural models, but also algae sequences with a more diverse binding site from S. fusiforme. Finally, a more divergent example from the bacteria R. roseus was also included. The expression system was based on a previously established expression platform using E. coli BL21 Arctic Express (DE3) as the heterologous expression host [16].
SDS-PAGE analysis following expression of all four constructs in 50 mL cultures using the optimized haloperoxidase expression system showed a high amount of inclusion body formation for all constructs except that from R. roseus (see Figure S7). However, all four proteins seemed to be produced in E. coli to a certain extent.
Following these first small-scale attempts, large-scale expression of all genes was tested using 500 mL cultures. After cell disruption via ultrasonication, the integrated N-terminal His-Tag was used to perform an affinity chromatography-based isolation on a Ni column (see Section 4 for details). Due to scale-up problems for the constructs of A. nodosum, S. fusiforme, and F. ceranoides, none of these three proteins showed the purity required for biochemical characterization, and no haloperoxidase activity was found in initial activity tests. Only the construct of R. roseus showed sufficient purity (Figure 5) and detectable haloperoxidase activity. Therefore, all further experiments focused on this enzyme, which will be referred to as VPO-RR from now on.

Biochemical Characterization of VPO-RR
In the first step, the kinetics of the enzyme were determined using a two-dimensional Michaelis-Menten function as previously reported [16]. For this experiment, potassium bromide was chosen as the halide source (Figure 6).
Figure 6. Two-dimensional kinetics of VPO-RR with hydrogen peroxide and potassium bromide as substrates. The activity was determined using the MeDAC assay with triplicates for each concentration pair. The volumetric activities were plotted against concentrations (black and white dots) and fitted using the substrate-excess inhibition term by Murray [29] in Origin (surface and contour plot).
where v0 is the initial activity; vmax the maximal activity; KM a Michaelis constant; c a molar concentration; and Ki an inhibitory constant.
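To make the roles of vmax, KM, and Ki concrete, the sketch below evaluates a two-substrate rate law in which each substrate contributes a factor of the widely used substrate-inhibition form c / (KM + c + c²/Ki). Treating the two-dimensional function as a product of two such factors is an assumption made here for illustration; the fit function actually used is the one given as Equation (S4) in the Supplementary Materials, and all names and numbers in the code are hypothetical.

```python
def saturation_term(c, k_m, k_i=None):
    """c / (K_M + c + c^2 / K_i); plain Michaelis-Menten term if k_i is None."""
    denominator = k_m + c
    if k_i is not None:
        denominator += c * c / k_i  # substrate-excess inhibition contribution
    return c / denominator


def initial_rate(v_max, c_h2o2, c_halide, km_h2o2, km_halide,
                 ki_h2o2=None, ki_halide=None):
    """Assumed product form: v0 = vmax * f(H2O2) * f(halide)."""
    return (v_max
            * saturation_term(c_h2o2, km_h2o2, ki_h2o2)
            * saturation_term(c_halide, km_halide, ki_halide))


# Hypothetical parameters in arbitrary but consistent units.
v0 = initial_rate(v_max=10.0, c_h2o2=1.0, c_halide=1.0, km_h2o2=1.0, km_halide=1.0)
```

In this form, a large Ki makes the quadratic term negligible over the tested concentration range, which is how a weak substrate-excess inhibition would show up in the fitted parameters.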
The turnover number kcat was determined using the given values of vmax and each affinity constant (KM), respectively, and compared to the parameters of a literature-known enzyme, the vanadium-dependent chloroperoxidase (VClPO) from Curvularia inaequalis, short VClPO-CI (Table 1).
[a] The parameters were determined using a two-dimensional Michaelis-Menten fit with an included substrate-excess inhibition term by Murray using Origin [29].
The kinetic profile of VPO-RR shows some remarkable differences compared to known homologs. The so-far generally assumed substrate-excess inhibition by halides is not as pronounced as with VClPO-CI (Ki of 405 mM compared to 56.8 mM) [30,31]. However, hydrogen peroxide inhibits the enzyme with Ki = 178 mM, in contrast to VClPO-CI, which is not inhibited. Alternatively, the apparent inhibition by hydrogen peroxide may reflect a lack of stability against the strongly oxidizing effect of the peroxide rather than a molecular excess inhibition; the two are virtually indistinguishable in this experiment. The overall turnover number of VPO-RR exceeds that of VClPO-CI by more than a hundredfold (1304 s−1 compared to 9.8 s−1). In conclusion, VPO-RR shows several advantages in its kinetic properties compared to literature-known haloperoxidases: the overall higher turnover number and lower inhibitory effect of bromide compared to VClPO-CI may allow for biotransformations with higher efficiency.
As reaction temperatures are a frequent optimization parameter in synthetic applications, the overall temperature stability of VPO-RR was investigated. Over a period of three hours, the enzyme was incubated at different temperatures ranging from 25 to 85 °C. After the incubation, the samples were centrifuged at maximum speed and the supernatant was measured using the MeDAC assay. All activities determined were normalized to the activity at room temperature (here 25 °C) and compared (Figure 7). Although the temperature stability measurements suffer from large variations, principal information can be drawn from this series. Similar to VClPO-CI, VPO-RR exhibits good stability against higher temperatures, e.g., with a loss in the relative activity of 28.2% after incubation at 60.1 °C. Comparing slightly elevated temperatures such as 40 °C to high temperatures such as 70 °C showed a behavioral change. In several attempts, the activity of VPO-RR dropped drastically after incubation at 40 °C (60% decrease), whereas no such significant decrease in activity was observed for samples incubated at 45 °C (13% decrease) to 60 °C (28% decrease). It might be possible that an energetically favored structure is reached at elevated temperatures compared to the structure of VPO-RR at 25 °C. Failure to reach this state due to insufficiently high temperatures, such as 35 or 40 °C, might lead to an undesirable or partially undesirable state where enzymatic activity is severely affected. Whether the nature of this phenomenon is generally attributable to the tertiary structure of the enzyme or the ortho-vanadate cofactor itself has to be elucidated further.
Another important aspect of haloperoxidases is the selectivity for the corresponding halide. Here, one needs to differentiate between a preferred halide, i.e., one that leads to a higher turnover number, and the enzyme's general acceptance of a halide. Using the MeDAC assay, bromination and chlorination experiments were performed under the same conditions. As a reference, the bromination and chlorination using the MeDAC assay with VClPO-CI, similar to recent work, were included in the SI to illustrate the applicability of the assay for this purpose (see Table S7) [16]. In direct comparison, no chlorination activity could be measured for VPO-RR, while bromination occurred with a volumetric activity of 1.34 U mL−1 (Table 2). Hence, VPO-RR does not accept chloride as a halide source under these conditions.
[a] The activity was determined in triplicates using different halide sources for halogenation using the MeDAC assay. Given is the mean ± standard deviation.

Discussion
Halogenating enzymes do not yet have anywhere near the potential to overtake mega-ton processes such as chlor-alkali electrolysis, on which the entire production of halogenating reagents is based. Here, we identified potential halogenase sequences by screening algal, fungal, and proteobacterial sequence databases, structurally modeling putative halogenases, and mapping and comparing active sites. After test expressions of four putative halogenase sequences, the vanadium-dependent haloperoxidase VPO-RR was biochemically characterized, which showed beneficial kinetic properties, including an overall higher turnover number and lower inhibitory effect of bromide compared to VClPO-CI.
At present, there is a gap in the scope of known halogenating enzymes. On the one hand, some enzymes are involved in the biosynthesis of a particular natural compound, such as many flavin-dependent halogenases (e.g., Chl from Streptomyces aureofaciens [32] or CmdE from Chondromyces crocatus Cm c5 [33]), but their high substrate selectivity renders these enzymes unsuitable as general synthetic tools. On the other hand, the vanadate-dependent haloperoxidases only produce hypohalous acid rather than provide an enzyme-substrate complex and, thus, lack an enzyme-driven substrate selectivity [14,34,35]. Furthermore, vanadate-dependent haloperoxidases are still underrepresented in the set of halogenases used for synthetic purposes. Accordingly, predominantly VClPO-CI from Curvularia inaequalis [14] and VBrPO-CO from Corallina officinalis [36] were extensively described. Identifying and characterizing VPO-RR here thus provides a valuable expansion of the known set of vanadate-dependent haloperoxidases.
The pipeline for halogenase identification applied here, including structural characterization and comparison of molecular recognition properties of active sites, is unique because it benefited from screening taxonomically wide sequence databases at the beginning. The structural modeling resulted in overall very good to good models as predicted by TopScore [25,37] based on good available templates, such that the structural quality of the active site should also be high. Recently, we compared the quality of 10,336 enzyme structural models generated with TopModel [25,37] used here and AlphaFold2 [38] and found that the average scores differed only by ~5% [24]. Particularly, if good templates are available, as was the case here, the comparative modeling approach by TopModel resulted in very good models [24]. For binding site comparison, MIFs based on the established knowledge-based potentials for scoring protein-ligand interactions, DrugScore 2018 [27,39] were used. This provides a balanced description of enthalpic and entropic interaction contributions and also allows for the consideration of metal ions in the recognition process. Finally, compressing this information as Zernike descriptors [28] yields an alignment-independent way of comparing active sites, which is advantageous in view of the identities of the selected halogenase sequences down to ~20% and, hence, the resulting variability of the overall enzyme structure.
Our analyses show that this group of haloperoxidases is taxonomically widespread, although they must be assigned to secondary metabolism. The sequences chosen here for further analysis are representatives of the Phaeophyta and thus still closely related to the described Rhodophyta C. officinalis, Ascomycota as well as Curvularia inaequalis, but also cyanobacteria and bacteria. Both the cladogram built upon the haloperoxidase sequences and the clustering of the active site geometry (see Figure 4) represent the approximate taxonomic relation. They also underline the variety in the assumed structures, indicating different physicochemical properties.
Four sequences (from A. nodosum, F. ceranoides, S. fusiforme, and R. roseus) were selected for test expressions. These organisms are representatives of the Phaeophyta and thus still closely related to the described Rhodophyta C. officinalis, Ascomycota, and Curvularia inaequalis, but also cyanobacteria and bacteria. However, only the bacterial sequence from R. roseus was expressible in E. coli. The kinetic data revealed that VPO-RR is not strongly inhibited by halides, in contrast to what was described for VClPO-CI [18,31,40]. Furthermore, its more than 100-fold higher catalytic turnover number renders this enzyme a potential advancement over literature-known haloperoxidases for biocatalytic processes. A temperature stability similar to that of VClPO-CI was observed for VPO-RR; for temperatures up to 85 °C, a remaining activity of well over 60% was observed. Regarding its halide selectivity, only bromide was found to be an accepted halide source, whereas chloride showed no measurable activity. Whether the enzyme is capable of selective halogenations or possesses substrate binding properties is the focus of current investigations.
In summary, we present a structure-based pipeline to identify new halogenases and characterize VPO-RR as a new member of vanadium-dependent haloperoxidases. Its beneficial kinetic properties might allow for biotransformations with higher efficiency. Therefore, a more detailed biochemical characterization will be essential.

Structural Modeling of Selected Haloperoxidases
As no experimentally derived structures were available for the 20 selected haloperoxidase sequences, a homology model was constructed for each. To do so, the template-based structure prediction program TopModel [25,37] was applied using the "normal run" mode. In order to build models from algae arranged in a dimeric assembly, a monomer was first built, taking into account the best dimeric template selected by TopModel (Table S1, PDB ID: 1QI9). Next, the low-quality N-terminus was truncated as no dimeric template information is available for this region, meaning a clash might occur at the interface. Since for algae the functional assembly is consistent and involves the interaction between dimerization helices, the model was duplicated, superimposed to the reference template structure, and minimized. Both monomer and dimer structures were preprocessed with the Protein Preparation Wizard [41] of Schrödinger's Maestro Suite. The truncated dimeric models were energy-minimized using the OPLS 2005 force field [42] with standard cutoff values for van der Waals, electrostatic, and H-bond interactions until the average RMSD of non-hydrogen atoms reached 0.30 Å. Bond orders and missing hydrogen atoms were assigned, and the H-bond network was optimized.
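The minimization above is stopped once the heavy-atom RMSD reaches a threshold. As a reminder of what that quantity measures, here is a minimal sketch that computes the RMSD between two coordinate sets with matching atom order. It does not perform a superposition and is not the routine used inside the Protein Preparation Wizard; the function name and coordinates are illustrative assumptions.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two lists of (x, y, z) tuples.

    Assumes both lists contain the same atoms in the same order and that
    no additional superposition is required.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equally long")
    squared_sum = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared_sum / len(coords_a))


# Two hypothetical heavy atoms, each displaced by 0.3 Å along z.
before = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
after = [(0.0, 0.0, 0.3), (1.0, 0.0, 0.3)]
displacement = rmsd(before, after)
```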

Calculation of DrugScore 2018 Potential Fields
Binding sites were characterized and represented with descriptors based on DrugScore 2018 [27,39] potential fields. Potential fields of six different probe atoms were considered: C_ar (aromatic interactions), C_3 (aliphatic interactions), O_2 (H-bond acceptor), N_3 (H-bond donor), O_3 (H-bond donor and acceptor), Cl (representative for halogens). For all computed DrugScore 2018 potential fields, a grid spacing of 0.375 Å was used. Since the vanadate cofactor was a constant in all models for the classification, it was not considered for potential field generation.

Calculation of Zernike Descriptors and Clustering
The DrugScore 2018 potential interaction fields were transformed to a set of descriptors using a series expansion in 3D Zernike polynomials as described by Nisius et al. [28]. First, different descriptors for one individual protein and different probe atoms were merged. Second, the pairwise distances between all proteins (Manhattan distance) of the stacked Zernike descriptors were calculated. A matrix of normalized distance values is reported in Figure S6. These distances were then used for clustering. The UPGMA (unweighted pair group method with arithmetic mean) agglomerative bottom-up hierarchical clustering method as implemented in the MEGA (Molecular Evolutionary Genetics Analysis) software [43] was applied to generate the tree reported in Figure 4 from the distance matrix in Figure S6.
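The clustering step can be sketched in a few lines of plain Python. The code below is not the MEGA implementation; it is a minimal UPGMA illustration that repeatedly merges the two clusters with the smallest average pairwise distance between their members. The labels and distances are hypothetical, and the function returns the merge distances rather than branch heights (which in a UPGMA tree are half of the merge distance).

```python
from itertools import combinations


def upgma(labels, distances):
    """Cluster labels by UPGMA.

    distances maps a label pair (a, b) to its distance; either key order
    is accepted. Returns a list of (left_members, right_members, distance).
    """
    def lookup(a, b):
        return distances[(a, b)] if (a, b) in distances else distances[(b, a)]

    def average_distance(pair):
        left, right = pair
        total = sum(lookup(a, b) for a in left for b in right)
        return total / (len(left) * len(right))

    clusters = [(label,) for label in labels]  # each cluster is a tuple of members
    merges = []
    while len(clusters) > 1:
        left, right = min(combinations(clusters, 2), key=average_distance)
        merges.append((left, right, average_distance((left, right))))
        clusters = [c for c in clusters if c not in (left, right)]
        clusters.append(left + right)
    return merges


# Hypothetical distance matrix for four binding sites.
toy_distances = {
    ("A", "B"): 2.0, ("A", "C"): 6.0, ("A", "D"): 10.0,
    ("B", "C"): 6.0, ("B", "D"): 10.0, ("C", "D"): 8.0,
}
merges = upgma(["A", "B", "C", "D"], toy_distances)
```

Recomputing the averages from the original matrix, as done here, is equivalent to the usual size-weighted update of cluster distances but simpler to read; it is slower for large inputs, which is not a concern for a set of about 20 proteins.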

Expression and Isolation of Haloperoxidases
The expression and isolation procedure was established according to recent work [16]. The platform for expression of the haloperoxidase genes consisted of E. coli ArcticExpress (DE3) as the heterologous host and a pET21-vector backbone, using N-terminal His-Tags for later isolation.
For small-scale test expressions, 50 mL cultures in TB-media were inoculated with a cell amount that corresponded to a starting OD600 of 0.05 in 250 mL flasks. Then, 0.05% glucose and 0.2% lactose were supplemented to enable autoinduction later on, with 100 µg mL−1 ampicillin as the antibiotic. After an incubation period of 3 h at 37 °C, 2% ethanol (v/v) was added and the temperature decreased to 13 °C. The cultivation was continued for 72 h before the cells were centrifuged and the dry pellets stored at −20 °C.
The isolation was performed on ice with resuspended pellets in lysis buffer (20%, Bis-TRIS acetate, 100 mM, pH 7 with 12.5 mM imidazole) using ultrasonication for 5 min, cooling for 5 min, and an additional ultrasonication step. The lysate was centrifuged for 1 h at 10,000× g and the supernatant separated from the cell debris. SDS samples were prepared according to the final OD600 values of the culture, aiming for a final OD600 in the SDS sample of 1 for SDS-PAGE analysis using the NuPage system. For cell debris samples, the pellet was resuspended with the same amount of lysis buffer (20%) as before.
For large-scale expression, 500 mL cultures were used with the exact same concentrations of supplements. For isolation, an EDTA-free protease inhibitor tablet (cOmplete Tablet, Mini, EASYpack, Roche) was added to the resuspended pellet in lysis buffer and stirred for 20 min. For cell disruption, the ultrasonication steps were increased to 10 min each. After centrifugation, the cleared lysate was loaded onto a 5 mL Ni-NTA Superflow Cartridge (Qiagen). Initial experiments were carried out on an ÄKTA prime system using a linear elution gradient with Bis-TRIS acetate (BTA) buffer 100 mM at pH 7 and BTA with 250 mM imidazole included. Peak fractions were collected and analyzed via SDS-PAGE, and fractions containing the target protein were pooled and dialyzed overnight in assay buffer (BTA 100 mM pH 6) in the presence of 100 µM sodium orthovanadate. The dialysate was concentrated using PALL concentrators with a molecular cutoff of 10 kDa. Final protein concentrations were measured using molecular extinction coefficients for absorption at 280 nm for each individual protein calculated by ExPASy assuming oxidative conditions (εox (VPO-RR) = 100,730 M−1 cm−1).
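The concentration determination at 280 nm follows the Beer-Lambert law, c = A / (ε · l). The short sketch below applies it, assuming the extinction coefficient is expressed in M−1 cm−1 (the unit in which ExPASy ProtParam reports it) and a 1 cm path length; the absorbance reading and the function name are hypothetical.

```python
def protein_concentration_um(a280, epsilon_per_m_cm, path_cm=1.0, dilution=1.0):
    """Protein concentration in µM from A280 via Beer-Lambert.

    epsilon_per_m_cm: molar extinction coefficient in M^-1 cm^-1 (assumed unit).
    dilution: factor by which the sample was diluted before the measurement.
    """
    if epsilon_per_m_cm <= 0 or path_cm <= 0:
        raise ValueError("extinction coefficient and path length must be positive")
    molar = a280 * dilution / (epsilon_per_m_cm * path_cm)  # mol/L
    return molar * 1e6  # convert M to µM


# Hypothetical reading of A280 = 0.5 with the coefficient quoted for VPO-RR.
concentration = protein_concentration_um(0.5, 100730.0)
```

With these inputs the result is close to 5 µM, which gives a feel for the absorbance range expected for a concentrated enzyme stock.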

Activity Measurement and Kinetics of VPO-RR
Activity of haloperoxidases was measured using the MeDAC assay [18]. For this, 1 µL of enzyme solution was added to 249 µL of assay buffer, containing 5 mM of potassium bromide, 3 mM of hydrogen peroxide, and 30 µM MeDAC (in DMSO) with 0.4% v/v DMSO (as the final concentration) in BTA 100 mM pH 6. The fluorescence was measured over 10 min at an excitation wavelength of 425 nm and emission wavelengths of 450 and 550 nm. Using a calibration function included in a calculation sheet, the activity was determined by plotting the concentration of BrMeDAC against time and determining the slope of the linear fit. Multiplying with corresponding dilution factors yielded the final volumetric activities in U mL−1.
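The evaluation described above (linear fit of product concentration against time, then multiplication by the dilution factor) can be sketched as follows. The authors' calculation sheet is not shown, so this is an illustration under stated assumptions: 1 U is taken as 1 µmol of product per minute, the product concentration is given in µM, and the dilution factor of 250 follows from 1 µL of enzyme in a 250 µL assay. The time course is made up.

```python
def linear_slope(times, values):
    """Least-squares slope of values against times."""
    n = len(times)
    if n < 2 or n != len(values):
        raise ValueError("need at least two matching data points")
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    return sxy / sxx


def volumetric_activity(times_min, product_um, dilution_factor):
    """Volumetric activity of the enzyme stock in U/mL (1 U = 1 µmol/min, assumed).

    A slope in µM/min equals 1e-3 µmol per mL per min in the assay well;
    the dilution factor scales this back to the undiluted enzyme solution.
    """
    slope_um_per_min = linear_slope(times_min, product_um)
    return slope_um_per_min * 1e-3 * dilution_factor


# Hypothetical linear time course: 2 µM BrMeDAC formed per minute.
activity = volumetric_activity([0.0, 1.0, 2.0, 3.0], [0.0, 2.0, 4.0, 6.0], 250.0)
```

Only the linear part of a progress curve should enter the fit; the sketch does not select that range automatically.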
For the two-dimensional Michaelis-Menten kinetics, the substrate-inhibition term by Murray [29] was used as a fit function in OriginPro 2018. The enzyme was mixed with MeDAC and added to the wells filled with predefined concentrations of hydrogen peroxide and potassium bromide. Measurement and evaluation were performed according to the general activity measurements. The resulting volumetric activities were plotted in Origin and fitted with the custom Michaelis-Menten function. The complete function is accessible in the Supplementary Materials (see Equation (S4)).

Temperature Stability
To test the thermal stability of VPO-RR, 20 µL of enzyme solution was placed in a PCR tube and incubated for 3 h at the given temperatures. The samples were centrifuged for 5 min at maximum speed, and 1 µL of the solution was used according to the activity assay mentioned before (vide supra). The experiments were performed in biological triplicates. The averaged 25 °C sample was used as a reference and set to 100% of relative activity.
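The normalization to the 25 °C reference can be written compactly. The sketch below averages the replicate activities per temperature and expresses each mean as a percentage of the mean at the reference temperature; the activity values are invented for illustration and do not reproduce Figure 7.

```python
from statistics import mean


def relative_activities(activities_by_temp, reference_temp=25.0):
    """Mean activity per temperature as a percentage of the reference mean."""
    if reference_temp not in activities_by_temp:
        raise KeyError("reference temperature missing from data")
    reference = mean(activities_by_temp[reference_temp])
    if reference <= 0:
        raise ValueError("reference activity must be positive")
    return {
        temp: 100.0 * mean(values) / reference
        for temp, values in activities_by_temp.items()
    }


# Hypothetical triplicates (U/mL) at two incubation temperatures.
example = {25.0: [1.30, 1.34, 1.38], 60.0: [0.95, 0.97, 0.96]}
relative = relative_activities(example)
```

Averaging the reference first, as described in the text, means that replicate scatter at 25 °C is not propagated into the normalized values; reporting the standard deviation alongside would make the large variations mentioned in the results visible.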

Halide Selectivity
For the halide selectivity test, the assay was performed under normal conditions using potassium bromide and potassium chloride, respectively. After the determination of activity, the values were directly compared to each other. For samples where no linear area could be determined, the values were set to zero.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/catal12101195/s1; Supplemental Methods; Figure S1: Multiple sequence alignment for the selected haloperoxidases; Figure S2: Percent identity matrix for the selected haloperoxidases; Figure S3: Structural models of haloperoxidases from algae sequences; Figure S4: Structural models of haloperoxidases from bacteria sequences; Figure S5: Structural models of haloperoxidases from fungi sequences; Figure S6: Matrix of normalized pairwise Manhattan distances between all proteins of the stacked Zernike descriptors; Figure S7: SDS-PAGE analysis of the test expression of all four putative VPO constructs; Figure S8: SDS-PAGE of samples obtained during isolation procedures for the enzymes VPO-SF, VPO-FC, and VPO-AN; Figure S9: Example measurement of chlorinating activity for VClPO-CI using the MeDAC assay; Table S1: Threading information; Table S2: Threading information; Table S3: Threading information; Table S4: Critical evaluation of algae structural models; Table S5: Critical evaluation of bacteria structural models; Table S6: Critical evaluation of fungi structural models; Table S7: Comparison between bromination and chlorination of MeDAC using VClPO-CI; Supplemental References.