Selection of Primer–Template Sequences That Bind with Enhanced Affinity to Vaccinia Virus E9 DNA Polymerase

A modified SELEX (Systematic Evolution of Ligands by Exponential Enrichment) protocol (referred to as PT SELEX) was used to select primer–template (P/T) sequences that bound to the vaccinia virus polymerase catalytic subunit (E9) with enhanced affinity. A single selected P/T sequence (referred to as E9-R5-12) bound in physiological salt conditions with an apparent equilibrium dissociation constant (KD,app) of 93 ± 7 nM. The dissociation rate constant (koff) and binding half-life (t1/2) for E9-R5-12 were 0.083 ± 0.019 min−1 and 8.6 ± 2.0 min, respectively. The values indicated a several-fold greater binding ability compared to controls, which bound too weakly to be accurately measured under the conditions employed. Loop-back DNA constructs with 3′-recessed termini derived from E9-R5-12 also showed enhanced binding when the hybrid region was 21 nucleotides or more. Although the sequence of E9-R5-12 matched perfectly over a 12-base-pair segment in the coding region of the virus B20 protein, there was no clear indication that this sequence plays any role in vaccinia virus biology, or a clear reason why it promotes stronger binding to E9. In addition to E9, five other polymerases (HIV-1, Moloney murine leukemia virus, and avian myeloblastosis virus reverse transcriptases (RTs), and Taq and Klenow DNA polymerases) have demonstrated strong sequence binding preferences for P/Ts and, in those cases, there was biological or potential evolutionary relevance. For the HIV-1 RT, sequence preferences were used to aid crystallization and study viral inhibitors. The results suggest that several other DNA polymerases may have P/T sequence preferences that could potentially be exploited in various protocols.


Introduction
The recognition of recessed 3′ termini is a hallmark of all DNA polymerases, except for some viral enzymes that can initiate synthesis using specific viral proteins (e.g., adenovirus DNA polymerase [1]). The generally accepted dogma is that both the recognition of and stable binding to the 3′ terminus are structurally driven, with the specific sequence playing little or no role in binding. However, our group has shown that viral RTs, including those from human immunodeficiency virus-1 (HIV-1), Moloney murine leukemia virus (MuLV) and avian myeloblastosis virus (AMV), bind more tightly to DNA-DNA primer-templates with runs of G (G-tract) at the 3′ primer end [2,3]. The G-tract on the DNA primer mimics the G-tract found on the polypurine tract (PPT) RNAs of these viruses. The findings suggest that RTs and PPTs may have co-evolved, leading to strong interactions and the proper orientation of the RT at the 3′ end of the PPT. Conversely, it could be argued that viral PPT sequences evolved to conform to RTs' binding preferences, but played little part in RT

DNA oligonucleotides were 5′-end-labeled in a 50 µL volume containing 10-250 pmol of the oligonucleotide of interest, 1X T4 PNK reaction buffer (provided by manufacturer), 10 U of T4 PNK and 5-10 µL of (γ-32P) ATP (3000 Ci/mmol, 10 µCi/µL). The labeling reaction was performed at 37 °C for 30-60 min according to the manufacturer's protocol. The PNK enzyme was heat-inactivated by incubating the reaction at 75 °C for 15 min. Excess radiolabeled nucleotides (nt) were then removed via centrifugation using a Sephadex G-25 column.

Selection of Primer-Template Sequences That Bind with Enhanced Affinity to E9 Using PT SELEX
The protocol used to select P/T sequences that bind with high affinity to E9 protein was similar to previous PT SELEX protocols, and further detail can be found in those publications [3,7]. Briefly, a starting template containing 20 nt 5′ and 3′ fixed sequences and a 25 nt random region (i.e., 5′-GCATGAATTCCCGAAGACGC(N)25 where N is any base) was used to start the process.
A 5′-32P-labeled primer (5′-GCCTGCAGGTCGACTCTAGA-3′) was hybridized to the template and extended with exonuclease minus Klenow polymerase. The resulting 65 nt dsDNA material was processed to produce P/T sequences with a 41 nt primer strand and a 45 nt template via digestion with Bbs I (site underlined above), which cuts outside of its recognition site in the "N" region of the duplex. The digestion products were separated using a 12% native PAGE gel, and the radiolabeled P/T was recovered. The 3′-recessed terminus and 4 nt of 5′ single-stranded overhang were in the "N" region of the P/T. About 200 pmol (~10^14 different sequences) of P/T material was used in the first round of selection. The selection buffer was 20 mM Tris-HCl pH 7.5, 90 mM KCl, 60 mM NaCl, 4 mM DTT and 2% glycerol, and all selections were carried out at room temperature by incubating E9 at a ~1:10 ratio of protein to P/T with ~1 µM P/T for 30 min, before filtering the sample over nitrocellulose to capture the selected P/T sequences. Filters were washed twice with 5 mL of wash buffer (see below). The material bound to the filters was recovered with a phenol:chloroform:isoamyl alcohol extraction and ethanol precipitation, and recovery was monitored by radioactivity. Klenow polymerase (with 3′-5′ exonuclease) was used to extend the recessed 3′ terminus of the recovered material, and the blunt-end products were ligated to a duplex DNA to add back the Bbs I site, essentially reproducing the dsDNA used to make the starting material (see above). A standard PCR was employed to amplify this material, and the gel-purified product was digested with Bbs I and gel-purified again to produce material for another round of selection. A notable difference in this protocol from previous ones [3,7] was the use of heparin after the first round of selection.
For rounds 2-5, recovered material was incubated with E9 for 30 min, then heparin (1 µg/µL, ~85:1 (w/w) heparin:E9 protein) was added, and the incubation was continued for 10 min before filtering. The heparin was used to sequester the E9 molecules that were not bound to or had dissociated from P/T, and also to compete with the P/T sequences for binding to E9 (see Section 3). The PT SELEX process was stopped after 5 rounds. A limited number of recovered sequences from round 5 were sequenced by cloning and Sanger sequencing, as described previously [3,7].
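As a rough check on the pool size quoted above, the number of molecules in 200 pmol can be compared with the number of possible 25 nt random sequences. The sketch below is illustrative only: the function and variable names are hypothetical, and it assumes that every molecule in the starting pool carries a distinct random region.

```python
# Rough check of the starting-pool numbers (illustrative; names are hypothetical).
AVOGADRO = 6.02214076e23  # molecules per mole


def molecules_from_pmol(pmol: float) -> float:
    """Convert an amount in picomoles to a number of molecules."""
    return pmol * 1e-12 * AVOGADRO


def sequence_space(random_length: int) -> int:
    """Number of distinct sequences for a random region of the given length."""
    return 4 ** random_length


pool_molecules = molecules_from_pmol(200)     # about 1.2e14 molecules
possible_25mers = sequence_space(25)          # 4**25, about 1.1e15 sequences
# Upper bound on the fraction of sequence space sampled (assumes no duplicates).
coverage = pool_molecules / possible_25mers
```

Under these assumptions the first-round pool samples roughly one tenth of all possible 25 nt sequences, which is consistent with the ~10^14 figure given in the text.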

Determination of Apparent Equilibrium Dissociation Constant (KD,app) between E9 and P/T Sequences Using Nitrocellulose Filter Binding Assays
Standard reactions for KD,app determinations were performed by adding 4 µL of various amounts of E9 protein diluted in buffer (20 mM Tris-HCl pH 7.5, 300 mM NaCl, 4 mM DTT, 10% glycerol) to 16 µL of a solution containing radiolabeled P/T, such that the final concentrations of components in 20 µL were: 0.1 nM P/T (5′ 32P end-labeled on the primer strand), 20 mM Tris-HCl pH 7.5, 90 mM KCl, 60 mM NaCl, 4 mM DTT, 0.1 µg/µL BSA and 2% glycerol. Protein was added in amounts that were in the range of the KD,app value for the P/T based on preliminary experiments. Reaction components were mixed, and after 10 min at room temperature, the reactions were applied to a 25 mm nitrocellulose disk (0.45 µm pore, Protran BA 85, Whatman™) that was pre-soaked in filter wash buffer (25 mM Tris-HCl pH 7.5, 10 mM KCl). The filter was washed under vacuum with 5 mL of wash buffer at a flow rate of ~1 mL/s. Filters were then counted in a scintillation counter. A plot of bound P/T vs. E9 protein concentration was fit to the following equation for ligand binding and one-site saturation in SigmaPlot in order to determine the KD,app: y = Bmax(x)/(KD + x), where x is the concentration of protein, y is the amount of bound P/T and Bmax is the amount of bound P/T at saturation.
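The one-site saturation fit described above can be sketched in a few lines of Python. This is a minimal illustration and not the SigmaPlot routine used in the study: it swaps in a simple grid-refinement least-squares search, the function names are hypothetical, and the data points are synthetic values generated from the model rather than measurements.

```python
import math

# One-site saturation binding: y = Bmax * x / (KD + x).
# Pure-Python least-squares sketch; NOT the SigmaPlot fit used in the study.


def one_site(x, bmax, kd):
    """Bound P/T predicted at protein concentration x."""
    return bmax * x / (kd + x)


def best_bmax(xs, ys, kd):
    """For a fixed KD the model is linear in Bmax, so Bmax has a closed form."""
    f = [x / (kd + x) for x in xs]
    return sum(y * fi for y, fi in zip(ys, f)) / sum(fi * fi for fi in f)


def sse(xs, ys, kd):
    """Sum of squared residuals at the best Bmax for this KD."""
    bmax = best_bmax(xs, ys, kd)
    return sum((y - one_site(x, bmax, kd)) ** 2 for x, y in zip(xs, ys))


def fit_one_site(xs, ys, kd_lo=1e-3, kd_hi=1e6, points=200, rounds=4):
    """Search log10(KD) on a grid, zooming in around the best point each round."""
    lo, hi = math.log10(kd_lo), math.log10(kd_hi)
    best = lo
    for _ in range(rounds):
        step = (hi - lo) / (points - 1)
        grid = [lo + i * step for i in range(points)]
        best = min(grid, key=lambda g: sse(xs, ys, 10 ** g))
        lo, hi = best - step, best + step
    kd = 10 ** best
    return best_bmax(xs, ys, kd), kd


# Synthetic example (assumed values): protein in nM, bound fraction from the model.
protein_nM = [10, 25, 50, 100, 200, 400, 800]
bound = [one_site(x, 1.0, 93.0) for x in protein_nM]
bmax_fit, kd_fit = fit_one_site(protein_nM, bound)
```

Because the model is linear in Bmax for any fixed KD, only KD needs to be searched numerically. With real, noisy filter-binding data a dedicated nonlinear least-squares routine that also reports parameter uncertainties would be preferable.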

Dissociation Rate Constant (koff) and Half-Life (t1/2) Determinations
Ten nM (final concentration) 5′-32P end-labeled P/T was incubated for 10 min at room temperature in 72 µL of 50 nM E9 protein, 20 mM Tris-HCl pH 7.5, 90 mM KCl, 60 mM NaCl, 4 mM DTT, 0.1 µg/µL BSA and 2% glycerol. The high concentration of E9 was required due to the sequences binding relatively weakly under these conditions. At time "0", 8 µL of heparin (10 µg/µL, ~200:1 (w/w) heparin:E9 protein) in the same buffer was added to the solution. Ten µL aliquots were removed at 0.25, 1, 2, 4, 8, 12 and 16 min and filtered over nitrocellulose, as described above. A background control was prepared by mixing 10 nM of the same 5′ 32P end-labeled P/T being tested with 10 µg of heparin, then adding E9 protein (50 nM final concentration) in a total of 10 µL of buffer. This tests the effectiveness of heparin at "trapping" the E9 protein and preventing its binding to P/T. This sample was incubated for 16 min before processing, and was subtracted from the other samples in the final calculations. The control sample typically showed less than 10% binding to P/T compared to the time 0.5 min sample, indicating that heparin was an effective trap at the concentration employed. The dissociation rate constant was determined by fitting the data from a plot of P/T bound to the filter vs. time to an equation for single 2-parameter exponential decay in SigmaPlot: y = ae^(−bx), where b is the dissociation rate constant (koff in this case). The t1/2 value was determined from koff using the following equation: t1/2 = 0.69/koff.
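The decay fit and half-life calculation above can be illustrated with a short Python sketch. It is not the SigmaPlot procedure: it swaps in a linear regression on ln(y), which estimates the same two parameters for y = ae^(−bx) but weights the points differently from a direct nonlinear fit. The function names are hypothetical and the data are synthetic values generated with koff = 0.083 min−1.

```python
import math


def fit_koff(times, bound):
    """Estimate a and koff in y = a*exp(-koff*t) by least squares on ln(y).

    All values in `bound` must be positive (after background subtraction).
    """
    logs = [math.log(y) for y in bound]
    n = len(times)
    mean_t = sum(times) / n
    mean_l = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (l - mean_l) for t, l in zip(times, logs))
    slope = sxy / sxx
    a = math.exp(mean_l - slope * mean_t)
    return a, -slope


def half_life(koff):
    """t1/2 = ln(2)/koff; the text rounds ln(2) to 0.69."""
    return math.log(2) / koff


# Synthetic example using the sampling times from the protocol (min).
times_min = [0.25, 1, 2, 4, 8, 12, 16]
bound_rel = [math.exp(-0.083 * t) for t in times_min]
a_fit, koff_fit = fit_koff(times_min, bound_rel)
t_half = half_life(koff_fit)
```

The factor 0.69 in the text is ln 2 rounded to two figures, so a single koff of 0.083 min−1 corresponds to a half-life of roughly 8.3 to 8.4 min.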

DNA Loop-Back Extension Assays
Reactions were performed under the same conditions as the dissociation rate constant determinations, except that 25 nM of the 40 nt, 50 nt or 60 nt loop-back DNA sequences, 2 mM MgCl2 and 50 µM dNTPs were included. Reactions were initiated by adding E9 at 25, 50, 100 or 200 nM to a total volume of 10 µL and incubating the mixture at 30 °C for 5 min. Reactions were terminated with 10 µL of 2X loading buffer (90% formamide, 10 mM EDTA pH 8, 0.025% bromophenol blue and xylene cyanol), and the samples were run on a 16% denaturing polyacrylamide gel [18]. The material was visualized using a phosphorimager (Fuji FLA 7000).

Results
The PT SELEX selection process used for E9 was similar to previous protocols with RTs and bacterial polymerases [3,7], with the notable exceptions of using 90 mM KCl along with 60 mM NaCl as opposed to 80 mM KCl at the start of the process, and using heparin as a competitor (see Section 2.2). Heparin is also a particularly good "trap" for polymerase, and competes strongly with the binding of the P/T, which makes it ideal for selections with polymerases [19,20]. Five rounds of selection were performed, with a step including heparin in rounds 2-5 (see Section 2). There was no increase in binding after round 4, and material from round 5 was cloned and sequenced. Fourteen sequences (shown in Figure 1) were recovered from a limited number of clones. The sequences fell into three distinct lineages, represented by E9-R5-12 (Protein-SELEX round #-sequence clone), which was recovered twice, along with four other closely related sequences; E9-R5-4, which was recovered twice; and E9-R5-3, which was recovered five times, along with one other closely related sequence (E9-R5-2). Members of each lineage, including E9-R5-12, E9-R5-4 and E9-R5-3 (all in the 41 nt primer and 45 nt template configurations), were tested for binding to E9 using the same buffer conditions that were used for selection. Stronger binding was detected only with E9-R5-12 (apparent KD (KD,app) 93 ± 7 nM (see Table 1)). The E9-R5-4 and E9-R5-3 sequences showed low binding, similar to the starting material control sequence that was selected randomly from the starting pool (Figure 2). Binding was so low for the other sequences that a KD,app could not be determined with the filter binding assay used, although this likely would have been possible if more enzyme were used, or if non-physiological lower salt concentrations were employed.
This result of low binding to the control sequence was expected because E9, in the absence of its processivity factor [14,15], binds poorly to P/T in physiological salt [16]. The low binding of sequence E9-R5-3 was somewhat unexpected, as the sequence was present five times (with one additional closely related sequence) among the 14 total recovered sequences. It is possible that it was selected for a reason not related to E9 binding (e.g., enhanced binding to the nitrocellulose filters used for selection or over-representation in the starting pool). Consistent with this, sequence E9-R5-3 also bore little resemblance to E9-R5-12 (Figure 1). Of note, another prior selection conducted without heparin produced a predominant product after eight rounds of selection that resembled E9-R5-12. The sequence of the primer strand in the random region (5′-GTAGGGTAGACAGAGCAACAG-3′) exactly matched E9-R5-12 (Table 1) over the last six nucleotides at the 3′ end, although there was no homology beyond these six nucleotides. This sequence bound just modestly better than the controls, so we elected not to continue with it. Still, this resemblance suggests that the 3′ terminal nucleotides are likely important to the observed enhanced binding. This is consistent with previous selections with RT, where the 3′ nucleotides were the main driving force for strong binding [2,3].
Additional experiments to examine sequence E9-R5-12 were performed using off-rate analysis rather than KD analysis, as the former requires much less enzyme. E9-R5-12 bound E9 with a koff and t1/2 of 0.083 ± 0.019 min−1 and 8.6 ± 2.0 min, respectively, while the control sequence rapidly dissociated from E9, such that koff and t1/2 could not be measured by our filter binding method (Figure 3 and Table 1). A set of four modified versions of sequence E9-R5-12 was also tested for binding. The modifications replaced five nucleotides at different positions along the P/T with 5′-GACTA-3′. The first replaced the five nucleotides at the 3′ end of the primer and the corresponding template nucleotides, and the other three successively moved the replacement five nucleotides toward the 5′ end of the primer. A koff was not measurable with any of these modifications (Table 1). This suggests that changes along most of the primer-template can affect binding. This result was different from a previous result with HIV RT, where the nucleotides near the 3′ primer terminus were the only ones required for tight binding [2].
Previously, single-stranded loop-back DNA primer-templates were used to examine the length requirements for the tight binding of HIV RT to selected primer-templates [21].
We found that adding one extra nucleotide to the 5′ overhang, increasing it from four to five nucleotides, modestly improved binding. Therefore, three loop-back primer-template structures were prepared using the E9-R5-12 sequence. They comprised 60, 50 and 40 nucleotides with five nucleotide 5′ overhangs, three nucleotide loops and hybrid regions of 26, 21 and 16 base pairs (Table 1). Measurements of koff indicated that the 60 and 50 nucleotide structures bound to E9 with essentially the same stability as the original E9-R5-12 sequence (Table 1). This indicated that adding an additional nucleotide to the overhang did not significantly affect binding. In contrast, the 40 nucleotide loop-back bound much less stably (Figure 4 and Table 1). Although the results at first glance suggest that stable binding requires a duplex region longer than the 16 nt in the 40 nt loop-back, the 40 nt loop-back is also missing nucleotides in the duplex region that may be critical for strong binding. Results in Table 1 show that nucleotide changes within the 20 nt duplex region proximal to the 3′ primer end all result in a loss of strong binding. The 40 nt loop-back construct contains only 16 of the 20 nucleotides. Therefore, it is not clear if the loss of strong binding resulted from the shortened duplex region or is related to specific sequence requirements.
Figure 2. Apparent equilibrium dissociation constant (KD,app) analysis for material selected with E9 using PT SELEX. Nitrocellulose binding assays were used to measure binding of P/T sequences to E9 vaccinia virus polymerase, as described in Section 2. The KD,app for E9-R5-12 from this experiment was 100 nM (see Table 1). Other P/Ts did not bind well enough to determine KD,app with these amounts of E9 (see Section 3). * All values were relative to the predicted amount of bound P/T at saturation (Bmax in the equation y = Bmax(x)/(KD + x) (see Section 2)) for the E9-R5-12 sample, which was set to 1. The sequences of all double-stranded DNA P/Ts used in the figure are shown in Table 1.

Table 1 footnotes: The value was not able to be determined using the assay conditions with the amount of protein used. 6 - "ND": not determined. No attempt was made to determine the value. 7 - Sequences were derived from E9-R5-12 by replacing a portion of the duplex with a 5′-GACTA-3′ sequence as indicated and underlined.
To verify that E9 bound to the loop-back structures in a productive orientation, the loop-backs were 5′-end-labeled with 32P and used in primer extension reactions (Figure 5). All the loop-back structures were extended by E9, even the 40 nucleotide loop-back that bound more weakly to the polymerase. The extension was clearly distributive rather than processive, as the 5 nt addition required to reach the end of the template occurred progressively, with shorter products present with lower amounts of enzyme. This is consistent with the distributive nature of E9 in the absence of its processivity factor in physiological salt [16].

Figure 3. Dissociation rate constant (koff) analysis of E9-R5-12 P/T sequence. Experiments were performed using nitrocellulose filter binding, as described in Section 2. * Values are relative to the value for the amount of DNA bound to E9 at time "0" (see Section 2), which was set to "1". The sequences of the Control P/T and E9-R5-12 double-stranded DNA sequences are shown in Table 1.
Figure 4. [...] Table 1. Experiments were performed using nitrocellulose filter binding, as described in Section 2. * Values are relative to the value for the amount of loop-back DNA bound to E9 at time "0" (see Section 2), which was set to "1".
Previously selected P/T sequences from RTs [2,3] and Taq and Klenow polymerases [7] contained biologically relevant sequences (resembling the PPT in the case of RTs), or sequence regions with homology to phage RNA polymerase promoter sequences (for Taq and Klenow). We searched for E9-R5-12-related sequences in the enriched results for RTs, Taq and Klenow, but could not find any resemblance. To further examine possible relationships between E9-R5-12 and previously recovered tight-binding P/Ts to other polymerases, we conducted binding dissociation and primer extension experiments using E9 (Supplemental Materials). Primer-templates selected with PT SELEX for HIV RT, Taq and Klenow dissociated rapidly from E9 (Table S1 and Figure S1). Like the control shown in Figure 3, the dissociation was too fast to determine a dissociation constant using filter binding. Interestingly, all the P/Ts were extended in primer extension reactions, and there was no clear advantage for the extension of the E9-R5-12 sequences compared to the others (Figure S2). This was analogous to the experiment with the 40 nt loop-back in Figure 5, which, despite binding less stably to E9 (Figure 4), was extended similarly to the 50 and 60 nt loop-backs (Figure 5). We should note that the two assays use different reaction conditions, with the primer extension assays containing Mg2+ and dNTPs that are not included in the binding dissociation assays (see Section 2.2). These could have affected binding to E9. Magnesium could not be added to the binding assays, as it would activate the 3′-5′ exonuclease activity of E9, leading to erroneous measurements.
Curiously, E9-R5-12 contains 5′-CCCAT-3′ and 5′-CCCAA-3′ motifs that are separated by five nucleotides on the primer strand. These are similar to the well-characterized "CCAAT" box (also referred to as the "CAT" or "CAAT" box), present upstream of many eukaryotic promoters, that is a binding site for transcription factors [22]. However, the function of these sequences in E9 binding was not explored. There was a notable match to a region of poxvirus genomes. Sequence E9-R5-12 matches with complete homology over a 12 nucleotide region (5′-CATGGACACCCA-3′ on the primer strand) in the coding region of the B20R gene from some poxviruses. In this case, the primer strand of E9-R5-12 matches the minus strand of the poxvirus gene, and the template strand matches the plus strand. There is very little known about the protein encoded by the B20R gene. One report suggests that the gene may be involved in interferon evasion [23]. The 12 nucleotide sequence is close to the right end of the genome, but not in the putative regulatory region, and it is not near the putative start of virus replication, as described in [24]. Although the chances of a 12 nucleotide exact match in a typical poxvirus genome are low (the sequence would be expected to be present one time over 4^12 = 16,777,216 nucleotides, with about a 1/40 chance of being present in a ~200,000 base-pair (i.e., 400,000 nucleotides) poxvirus genome), there is no clear reason to believe there is any biological significance, especially since the segment is in the coding region of only a single viral gene.
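The chance estimate for the 12 nucleotide match can be reproduced with a short calculation. The sketch below is illustrative: the function names are hypothetical, and it assumes independent, equally likely bases and treats every nucleotide of both strands as a possible start position (ignoring the small edge correction of 11 positions per strand).

```python
def expected_matches(motif_length: int, genome_nt: int) -> float:
    """Expected number of exact matches for one specific motif."""
    return genome_nt / 4 ** motif_length


def chance_of_at_least_one(motif_length: int, genome_nt: int) -> float:
    """Probability of one or more exact matches under the same assumptions."""
    p_single = 1 / 4 ** motif_length
    return 1 - (1 - p_single) ** genome_nt


# ~200,000 bp genome, both strands considered, as in the text.
genome_nt = 2 * 200_000
expected = expected_matches(12, genome_nt)        # 400,000 / 16,777,216
chance = chance_of_at_least_one(12, genome_nt)
one_in = 1 / chance                               # roughly 1 in 42
```

Both quantities come out near 0.024, which agrees with the "about 1/40" figure quoted above.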

Discussion
In this report, we show that E9, like other DNA polymerases [2,3,7], demonstrates preferential binding to specific P/T sequences. Unlike previous P/T sequences selected for other polymerases, there was no clear biological significance for the selected sequence, and the selected sequence did not share homology with those selected with other polymerases. However, it is important to point out that only a few P/T sequences have been selected for preferential binding to polymerases [2,3,7], and it is possible that the E9 sequence could be related to other sequences that bind with high affinity to polymerases.
With respect to binding affinity, the selected E9-R5-12 P/T sequence, though binding much more tightly to E9 than controls, still bound with a KD that was more than an order of magnitude higher when compared to sequences selected for tight binding to RTs and Taq and was more comparable to binding with Klenow [7]. The weak binding suggests that preferred sequences cannot overcome the generally weak binding of E9 to P/Ts in physiological salt [16]. The enzyme demonstrates distributive synthesis under these conditions but can be switched to a processive mode in the presence of its processivity factor or in low salt [14-16]. This suggests that high-affinity binding, which is presumably required for the completion of DNA synthesis over the long viral genome, is dependent on association with the processivity factor. Interestingly, P/T sequences selected for binding to Klenow also bound relatively weakly. Klenow is derived from E. coli Pol I, an enzyme involved in DNA repair synthesis. Like E9 in the absence of the processivity factor, Pol I also demonstrates distributive synthesis [25], which may be beneficial for a repair enzyme that typically functions over short DNA segments.
We were unable to determine the features of the selected E9-R5-12 P/T sequence that were responsible for tight binding. Progressively replacing five nucleotide segments in the P/T hybrid region abrogated the tight binding to E9 in all cases. This suggests that strong binding requires sequences present throughout the duplex. A more precise examination of the effects of individual nucleotides in the selected P/T on binding was not undertaken, as the five nucleotide replacement approach did not allow us to home in on a particular region. With HIV RT, only the G-tract sequences near the 3′ primer terminus were required for strong binding, and this was the region of the selected P/T sequence that matched the HIV PPT region [2]. If this were the case for E9, the P/Ts containing the 3′ proximal 10-15 nucleotides (Table 1) would have been expected to bind like E9-R5-12, but they did not. In a subsequent report, it was shown that the G-tract likely induces a bend in the P/T that may allow it to fit better into the RT active site [26]. HIV RT and other polymerases typically induce bends in P/T sequences upon binding. The pre-bent nature of the selected HIV RT P/T sequences may have induced strong binding by reducing the energy required for bending and maintaining the bend of the bound P/T. It would be interesting to examine the E9-R5-12 sequence in the future to see if it is also bent. Since there is not a current crystal structure of E9 bound to DNA P/T, the extent to which it bends the P/T during binding is not known.
With HIV RT, a 38 nucleotide loop-back structure (15 nucleotides in the duplex region) that retained tight binding was produced based on the selected P/T sequence [21]. For E9, a similar 40 nucleotide structure (16 nucleotides in the duplex region) did not retain tight binding, while a 50 nucleotide loop-back (21 nucleotides in the duplex region) did (Figure 5 and Table 1). This suggests that the two enzymes require a similar length of minimal duplex for strong binding. However, the E9 results are less clear because the 40 nt loop-back sequence, which showed low binding, was also missing specific sequences that may have been required for tight binding (see Section 3). The binding affinity of the HIV 38 nucleotide loop-back was improved by placing two O-CH3 groups at the 2′ sugar position of the −2 and −4 (relative to the 3′ terminal primer base) positions in the template strand [27]. This allowed the crystallization (at 2.3 Angstroms) of the loop-back with HIV RT in the absence of any cross-linking agent [28]. This approach has been used in subsequent reports analyzing various drugs and mutations [29-32], and these are the only reported structures of HIV RT with nucleic acid in the absence of cross-linking. Since the E9 loop-backs bind about two to three orders of magnitude less tightly to E9 than the O-CH3-modified HIV RT 38 nucleotide loop-back, it is likely that additional modification (e.g., O-CH3 or the addition of other groups) would be required to help in obtaining E9 P/T crystals. We are currently pursuing ways to improve binding.
Finally, it is notable that E9 is the sixth polymerase (three RTs [2,3] and three DNA polymerases [7]) to demonstrate strong sequence binding preferences. Of the previous five, the three RTs revealed a biologically significant preference for binding PPT-like sequences, while Taq and Klenow revealed a possible evolutionary/biochemical relationship between phage RNA polymerase-promoter recognition and bacterial DNA polymerases. As DNA polymerases generally recognize recessed 3′ termini for binding and activity without preferred sequence context, the role of sequence preferences for DNA polymerases is unclear, and preferred binding is, perhaps, unexpected. However, the examples noted above indicate that these enzymes do, in fact, have sequence preferences, and they can potentially be exploited for structural analysis, as well as for understanding their evolutionary and biochemical relationships with other DNA-binding proteins.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/v14020369/s1, Figure S1. Dissociation of E9 vaccinia virus polymerase from P/T constructs selected with E9 or other proteins (see Table 1) shows that only E9-R5-1 binds stably; Figure S2. Extension of selected radiolabeled primer-templates with E9 polymerase shows that all primer-templates are extended in a similar fashion and that E9-R5-12 is not extended better than P/T constructs selected with other polymerases; and Table S1: Primer-template sequences selected by PT SELEX examined for binding and extension with vaccinia virus polymerase (E9).

Conflicts of Interest:
The authors declare no conflict of interest.