Synthesis of Polyanionic C5-Modified 2′-Deoxyuridine and 2′-Deoxycytidine-5′-Triphosphates and Their Properties as Substrates for DNA Polymerases

Modified 2′-deoxyribonucleotide triphosphates (dNTPs) have widespread applications in both existing and emerging biomolecular technologies. For such applications it is essential that the modified dNTPs be substrates for DNA polymerases. To date very few examples of C5-modified dNTPs bearing negatively charged functionality have been described, even though such nucleotides might be valuable in diagnostic applications using Si-nanowire-based detection systems. Herein we have synthesised C5-modified dUTP and dCTP nucleotides, each of which is labelled with a dianionic reporter group. The reporter group is tethered to the nucleobase via polyethylene glycol (PEG)-based linkers of varying length. The substrate properties of these modified dNTPs with a variety of DNA polymerases have been investigated to study the effects of varying the length and mode of attachment of the PEG linker to the nucleobase. In general, nucleotides containing the PEG linker tethered to the nucleobase via an amide rather than an ether linkage proved to be the best substrates, whilst nucleotides containing PEG linkers from PEG6 to PEG24 could all be incorporated by one or more DNA polymerases. The polymerases most able to incorporate these modified nucleotides included Klentaq, Vent (exo-) and Therminator, with incorporation by Klenow (exo-) generally being very poor.


Introduction
Base-modified 2′-deoxyribonucleoside-5′-triphosphates (dNTPs) in which a specific reporter group, ligand or catalytic moiety is tethered via a linker to the nucleobase allow the preparation of functionalised nucleic acids that have a wide variety of applications in biotechnology and chemical biology [1]. Typically, dNTPs modified at the C5-position of pyrimidines or the C7-position of 7-deazapurine bases are good substrates for most DNA polymerases since the modification does not interfere with Watson-Crick base pairing. In addition, these modifications lie in the major groove, whilst most protein-DNA interactions during nucleotide incorporation arise in the minor groove [2]. Such modified dNTPs can be used in PCR [1] and DNA sequencing [3,4], and can be labelled for use in structural studies [5]. Other applications of modified dNTPs include the directed evolution of novel ligands and catalysts using SELEX [6,7].
In order to incorporate poorer dNTP substrates, for example those with bulkier modifications, a number of groups have evolved existing DNA polymerases to create new polymerases with altered substrate specificities [4,8-10]. For example, the Marx group

Results and Discussion
There are relatively few examples of C5-modified dUTP analogues with negatively charged side chains in the literature (Figure 1). Of these, the dUTP analogues 1 and 2 could not be incorporated into DNA by polymerases [20]. Compound 3 is a substrate for KOD XL DNA polymerase [21] and, in common with 4, can be incorporated by PrimeSTAR HS and Pwo polymerases [22]. The dNTP 5 is a substrate for Taq DNA polymerase [21], whilst nucleotides of general structure 6, which have ODNs of differing length tethered to the base, are incorporated into DNA by the engineered polymerase KlenTaq [12]. In the dCTP series, the 5-valeric acid-dCTP 7 has been successfully incorporated [23] during primer extension reactions by Vent (exo-), Pwo and 9°Nm DNA polymerases in addition to the Klenow fragment of DNA polymerase I. Successful amplification during PCR was also demonstrated. The modified dCTP 8 can be incorporated into DNA using KOD XL DNA polymerase [24]. These studies indicate that the substrate properties of these modified nucleotides depend upon the type and/or length of the linker between the nucleobase and the charged moiety: specifically, with a suitably long linker, even bulky substituents can be tolerated.
Influenced by the findings of previous studies, we designed a series of modified dNTPs with three features: a rigid anchor attached to the C5 atom of the nucleobase, a flexible PEG-based linker of varying length and a charged reporter moiety (Figure 2).
We chose to use trimesic acid as the anionic reporter group that would be tethered to the C5 positions of the triphosphates 9 and 10 using polyethylene glycol (PEG) linkers of differing lengths. PEG linkers were chosen in order to maintain good water solubility of the dNTPs. To obtain the desired nucleotides we required PEG linkers functionalized with amino and carboxylate termini (CA-PEG) that would facilitate coupling to the reporter group and nucleotides via amide coupling reactions. Thus, commercially available PEG linkers CA-PEG8 and CA-PEG24 were initially chosen in order to fully explore the effect of linker length on the substrate properties of the dNTPs. These PEG linkers give approximate lengths [25] of 2.9 nm (PEG8) to 8.6 nm (PEG24), i.e., distances of around 9 or 25 base pairs in DNA.
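As a rough check on the quoted distances, the linker lengths can be converted to an equivalent number of base pairs. The sketch below assumes a B-form DNA rise of 0.34 nm per base pair, a textbook value that is not stated in the text; the function and variable names are illustrative only.

```python
# Rise per base pair in B-form DNA (assumed textbook value, not from the source).
RISE_PER_BP_NM = 0.34

# Approximate linker lengths quoted in the text, in nanometres.
LINKER_LENGTH_NM = {"PEG8": 2.9, "PEG24": 8.6}
# Number of ethylene glycol repeat units in each linker.
REPEAT_UNITS = {"PEG8": 8, "PEG24": 24}


def equivalent_base_pairs(length_nm, rise_nm=RISE_PER_BP_NM):
    """Convert a linear distance into the number of base pairs it spans."""
    return length_nm / rise_nm


def length_per_unit(name):
    """Average contribution of one repeat unit to the quoted linker length."""
    return LINKER_LENGTH_NM[name] / REPEAT_UNITS[name]


if __name__ == "__main__":
    for name, length in LINKER_LENGTH_NM.items():
        bp = equivalent_base_pairs(length)
        print(f"{name}: {length} nm, about {bp:.1f} bp, "
              f"{length_per_unit(name):.2f} nm per repeat unit")
```

Under this assumption both linkers come out at roughly 0.36 nm per repeat unit, and the converted distances round to the 9 and 25 base pairs quoted above.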
To prepare the desired reporter-linker units we attempted a HATU-mediated amide coupling between the diester of trimesic acid (11) and commercially available CA-PEG8 (12). However, this formed not only the desired linker-reporter unit 13 but also the extended product 14 (Scheme 1); these were isolated as colourless oils in 31% and 19% yield, respectively, following reversed-phase HPLC.
Scheme 1. HATU coupling reaction of diethyl 1,3,5-benzenetricarboxylate (11) with CA-PEG8 forming the desired product 13 (31% yield) and the extended product 14 (19% yield). Reagents and conditions: dry DMF, HATU, DIPEA, 11, CA-PEG8 (12), 0 °C to r.t. over 30 min.
Consequently, to improve the yield of 13 we chose an alternative route in which the acid 11 was converted to its corresponding pentafluorophenyl ester 15. Compound 15 in turn reacted cleanly with CA-PEG8 (12) to afford 13 in 61% yield following RP-HPLC (Scheme 2). CA-PEG24 (16) was conjugated in the same manner to afford 17, and the analogous PEG4 compound 21 was synthesized from PEG-4 (Scheme 2).
The reporter-linker units 13, 14 and 17 were then conjugated to C5-propargylamino-dUTP (9) [13] using TSTU as the coupling agent. The products were then treated with aqueous sodium hydroxide solution followed by RP-HPLC to afford the modified dNTPs 22-24 (Figure 3), typically in yields of around 60%. Analogous reactions of 13 and 21 with C5-propargylamino-dCTP (10) [21] gave the modified dCTPs 25 and 26, respectively. Alongside these dNTP analogues we also prepared the dUTP analogue 27, which contains an O-propargyl-PEG linker. C5-modified dUTPs with an O-propargyl-PEG3 linker are known to be substrates for KlenTaq DNA polymerase [26]. The linker unit for dUTP 27 was prepared from the O-propargyl-PEG6 compound 28, which was reacted with 1,4-(dibromomethyl)benzene to give 29; this was then reacted with potassium phthalimide to give 30. A Sonogashira coupling between compound 30 and 3′-O-acetyl-5-iodo-2′-deoxyuridine (31) [27] provided the modified nucleoside 32 (Scheme 3). Phosphorylation of the nucleoside using the Ludwig-Eckstein method [28] followed by treatment with aqueous methylamine solution provided dUTP 27 in 35% yield.
To compare the effects of the linker on the substrate properties of the modified nucleotides, we first investigated the incorporation of the C5-modified dUTPs in primer extension reactions using Vent (exo-) DNA polymerase (Figure 4). Initially we chose a simple template containing the nucleotide sequence AC, requiring insertion of dTTP/dUTP and dGTP. Each reaction contained dATP, dCTP and dGTP together with dTTP or one of the four dUTP nucleotides (22, 23, 24 and 27). Full-length products were observed in each case. When dTTP or dUTP was absent (i.e., using only dATP, dCTP and dGTP), significant amounts of n + 1 product were observed (Figure 4, lane 6). In contrast, incorporation of the modified nucleotides resulted in a significantly decreased gel mobility of the extended products that was dependent on the molecular weight of the linker (Figure 4, lanes 2-5). Furthermore, the nature of the PEG linker attachment (propargylamido (22-24) or O-propargyl (27)) did not appear to affect the properties of the modified dUTPs as substrates for Vent (exo-).
In order to investigate the ability of DNA polymerases to extend from the modified nucleotides following incorporation, we chose a different primer-template requiring the insertion of the modified nucleotide followed by incorporation of a further 10 natural dNTPs (Figure 5). In addition, we included the polymerases Bst 2.0, Therminator and Pfu. In reactions using Vent (exo-) polymerase, nucleotide incorporation and extension can be observed; however, in each case there appears to be some misincorporation of the natural nucleotides, indicated by the presence of a faster-mobility full-length product observed by PAGE. A similar outcome is observed using Therminator polymerase. Nevertheless, in each case the reduced mobility of the full-length products indicates the incorporation of the modified nucleotides 22-24 and their subsequent extension, albeit with some misincorporation of the natural nucleotides. When the polymerases Bst 2.0 and Pfu were

Gel lane labels (figure), Vent (exo-): full length; +T (lane 1); +PEG8-dUTP (22); +PEG16-dUTP (23); +PEG24-dUTP (24).
We then examined incorporation of the dCTP analogues PEG4-dCTP (26) and PEG8-dCTP (25). Since Vent (exo-) and Therminator had shown incorporation of the modified dUTPs and subsequent extension of the primer, we chose both of these polymerases to examine their incorporation, together with a further two DNA polymerases, Omni Klentaq and Klenow (exo-). All polymerases with the exception of Klenow (exo-) successfully incorporated the modified dCTPs and subsequently extended the primer (Figure 6). However, misincorporation can still be observed with Therminator and Omni Klentaq. Although Klenow (exo-) did not successfully incorporate either modified dCTP within the 1 h incubation period, some incorporation and extension was seen with an extended incubation period of 7 h (data not shown).
In summary, we have described the chemical synthesis of C5-modified dUTP and dCTP analogues functionalized with an anionic reporter group tethered via PEG-based linkers of different lengths. We have shown that these analogues can be incorporated by a variety of different polymerases, producing full-length products with a significantly decreased gel mobility. We anticipate that these functionalized nucleotides will have a variety of applications in biotechnology and will report further on these in due course.

General Methods
Dry solvents were obtained from a Grubbs dry solvent apparatus. Column chromatography purifications were carried out on silica (60-200 mesh, VWR Chemicals, Hayes, UK). Thin layer chromatography (TLC) was performed on pre-coated silica gel 60 F254 aluminium-backed plates (Merck, London, UK). TLCs were visualised under UV (254 nm). NMR spectra were recorded on either a Bruker AV250, AV400 or AV500 spectrometer (as individually stated for all data) (Bruker, Billerica, MA, USA) and chemical shifts are reported in δ values relative to tetramethylsilane as an external standard. J values are given in Hz. All 1H, 13C and 31P spectra are available as supplementary materials. Mass spectrometry was performed by the University of Sheffield Mass Spectrometry Service using electrospray ionisation on an LCT Mass Spectrometer (Waters, Milford, MA, USA) unless otherwise stated. Analytical RP-HPLC was performed on Waters 2695 or 2690 instrumentation using a Gemini C18 5 µm 4.6 × 250 mm column (Phenomenex, London, UK) at a flow rate of 1 mL/min, with UV detection at 260 nm unless specified otherwise. Preparative RP-HPLC was performed using a Phenomenex Gemini C18 5 µm 110 Å 21.2 × 250 mm column at a flow rate of 21 mL/min, with UV detection at 260 nm unless specified otherwise. All quoted retention times are those found by analytical RP-HPLC. 3,5-Diethyl-1,3,5-benzenetricarboxylate (11) was purchased from Sigma-Aldrich (London, UK), and CA-PEG8 (12) and CA-PEG24 (16) were purchased from ThermoFisher (Waltham, MA, USA). Compounds 9 and 10 were synthesized as previously described [13,21] and bis(tri-n-butylammonium) pyrophosphate was also prepared as described [28].
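The analytical and preparative columns above have the same length (250 mm) but different internal diameters. The text does not say how the preparative flow rate was chosen; the sketch below simply shows that scaling the analytical flow rate by the ratio of column cross-sections, a common rule of thumb, gives a value close to the 21 mL/min quoted. The function name is illustrative.

```python
def scaled_flow_rate(flow_ml_min, d_from_mm, d_to_mm):
    """Scale a flow rate between columns of equal length so that the
    linear velocity is kept constant (flow scales with diameter squared)."""
    return flow_ml_min * (d_to_mm / d_from_mm) ** 2


# Values taken from the General Methods: 4.6 mm analytical column at 1 mL/min,
# 21.2 mm preparative column.
prep_flow = scaled_flow_rate(1.0, d_from_mm=4.6, d_to_mm=21.2)

if __name__ == "__main__":
    print(f"Scaled preparative flow rate: {prep_flow:.1f} mL/min")
```

Keeping the linear velocity constant in this way is one reason analytical retention times can serve as a guide for preparative runs using the same gradient.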
3.2.6. 1-Azido-3,6,9-trioxaundecane-11-ol (19)
Sodium azide (3.85 g, 59.2 mmol) was added portionwise to a stirred solution of compound 19 (8.25 g, 23.7 mmol) in ethanol (100 mL) over a period of 15 min. Once homogeneous, the reaction was heated to 70 °C. After 20 h the reaction was cooled to r.t. and the mixture was evaporated to dryness to give an orange oil that was dissolved in DCM (100 mL) and washed with water (3 × 30 mL); the organic layer was dried (MgSO4) and evaporated to give a pale yellow oil
3.2.7. 14-Azido-3,6,9,12-tetraoxatetradecan-1-oic Acid (20)
Sodium hydride (2.46 g, 61.5 mmol) was added portionwise to a stirred solution of compound 19 in anhydrous THF (60 mL) at 0 °C over a period of 30 min. Once in solution, bromoacetic acid (3.56 g, 25.6 mmol) was added and the reaction stirred overnight at r.t. The reaction was then quenched by the slow addition of water (5 mL) and stirred for 15 min. The solvent was removed under reduced pressure and the residue dissolved in DCM (150 mL), washed with HCl (2 M, 50 mL) and sat. brine (50 mL), and dried (MgSO4) before being concentrated to give the title compound as a light orange oil (5.52
(21)
Compound 20 (0.30 g, 1.08 mmol) in MeOH (10 mL) was placed under a H2 atmosphere (30 bar) at 50 °C using a ThalesNano H-Cube (Budapest, Hungary). The solution was then passed over a 10% Pd/C CatCart® at 1 mL/min. The solvent was removed under reduced pressure to give the title compound as a pale yellow oil (0.25 g). The oil was then dissolved in anhydrous DMF (10 mL) and dry diisopropylethylamine (170 µL, 1 mmol) followed by compound 15 (0.52 g, 1.
Compound 17 (88 mg, 63 µmol) was dissolved in anhydrous DMF (1 mL) and TSTU (19 mg, 63 µmol) and DIPEA (22 µL, 126 µmol) were added in quick succession. After stirring for 2 h at r.t., 5-(3-aminoprop-1-ynyl)-2′-deoxyuridine-5′-triphosphate (9, 31.6 µmol) in sodium borate buffer (1 mL of a 0.1 M solution) was added and the mixture stirred for 24 h.
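The reagent quantities quoted in the procedures above can be cross-checked with simple mass-to-mole and equivalents arithmetic. The following is a minimal sketch; the molar masses are standard values supplied here as assumptions (they are not given in the text), and the helper names are illustrative.

```python
# Molar masses in g/mol (standard values, assumed; not stated in the source).
MOLAR_MASS = {
    "sodium azide": 65.01,
    "bromoacetic acid": 138.95,
    "TSTU": 301.05,
}


def mmol_from_mass(mass_g, molar_mass):
    """Amount of substance in mmol for a given mass in grams."""
    return mass_g / molar_mass * 1000.0


def equivalents(reagent_amount, limiting_amount):
    """Molar equivalents of a reagent relative to the limiting reactant
    (both amounts must be in the same unit)."""
    return reagent_amount / limiting_amount


if __name__ == "__main__":
    # Masses and amounts as quoted in the procedures.
    print(f"NaN3: {mmol_from_mass(3.85, MOLAR_MASS['sodium azide']):.1f} mmol")
    print(f"BrCH2CO2H: {mmol_from_mass(3.56, MOLAR_MASS['bromoacetic acid']):.1f} mmol")
    print(f"TSTU: {mmol_from_mass(0.019, MOLAR_MASS['TSTU']) * 1000:.0f} µmol")
    print(f"NaN3 equivalents: {equivalents(59.2, 23.7):.1f}")
    print(f"DIPEA equivalents vs 17: {equivalents(126, 63):.1f}")
    print(f"Nucleotide 9 equivalents vs 17: {equivalents(31.6, 63):.2f}")
```

On these figures the activated reporter-linker 17 is used in roughly two-fold excess over the nucleotide 9, which makes 9 the limiting reactant in the coupling step.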
The crude reaction mixture was analysed by analytical RP-HPLC (10% B for 5 min then 10-100% B over 30 min; A = 100 mM TEAB, B = 100 mM TEAB/50% MeCN) and the product (retention time 32 min) was used without further purification. Aq. NaOH (4 mL of a 1 M solution) was then added and the mixture stirred for 2 h at r.t., then neutralised by dropwise addition of acetic acid (1 M). The crude reaction mixture was analysed by analytical RP-HPLC (5% B for 5 min then 5-100% B over 30 min; A = 100 mM TEAB, B = 100 mM TEAB/50% MeCN), which showed the product at a retention time of 18 min. Subsequent purification by preparative RP-HPLC using the same gradient gave the product (60 µmol, 60%). This was converted to the corresponding Na salt using Dowex ion
mobility that is dependent on the length of the PEG linker. Although all modified dNTPs were successfully incorporated by all DNA polymerases tested, the largest modified dNTP, the PEG24-dUTP analogue, hindered continued extension with both Pfu and Bst 2.0 DNA polymerases. The alkyne-ether linkage that was also examined showed good incorporation, with only Bst 2.0 DNA polymerase failing to continue primer extension after its incorporation. We have therefore demonstrated that these anionically charged PEG-modified dNTPs are substrates for a number of DNA polymerases and might be exploited in a variety of biotechnological platforms.
Supplementary Materials: The following are available online: 1H-, 13C- and 31P-NMR spectra of synthesized compounds.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author.