C5-Substituted 2-Selenouridines Ensure Efficient Base Pairing with Guanosine; Consequences for Reading the NNG-3' Synonymous mRNA Codons.

5-Substituted 2-selenouridines (R5Se2U) are post-transcriptional modifications present in the first anticodon position of transfer RNA. Their functional role in the regulation of gene expression is elusive. Here, we present efficient syntheses of 5-methylaminomethyl-2-selenouridine (1, mnm5Se2U), 5-carboxymethylaminomethyl-2-selenouridine (2, cmnm5Se2U), and Se2U (3) alongside the crystal structure of the latter nucleoside. By using pH-dependent potentiometric titration, pKa values for the N3H groups of 1–3 were found to be significantly lower than those of their 2-thio- and 2-oxo-congeners. Under physiological conditions (pH 7.4), Se2-uridines 1 and 2 preferentially adopted the zwitterionic form (ZI, ca. 90%), with the positive charge located at the amino alkyl side chain and the negative charge at the Se2-N3-O4 edge. As shown by density functional theory (DFT) calculations, this ZI form efficiently bound to guanine, forming the so-called “new wobble base pair”, which was accepted by the ribosome architecture. These data suggest that the tRNA anticodons with wobble R5Se2Us may preferentially read the 5′-NNG-3′ synonymous codons, unlike their 2-thio- and 2-oxo-precursors, which preferentially read the 5′-NNA-3′ codons. Thus, the interplay between the levels of U-, S2U- and Se2U-tRNA may have a dominant role in the epitranscriptomic regulation of gene expression via reading of the synonymous 3′-A- and 3′-G-ending codons.


Chemistry General Remarks
Thin-layer chromatography was performed on silica gel-coated plates (60F254, Merck), and Merck silica gel 60 (230-400 mesh) was used for column chromatography. HPLC was performed with a Waters chromatograph equipped with a 996 spectral diode array detector and a preparative SUPELCO Ascentis® column (C18, 25 cm × 21.2 mm, 10 µm). Separations were run at room temperature (rt) using water as the eluent. NMR spectra were recorded at 700 MHz for 1H and at 176 MHz for 13C. Chemical shifts (δ) are reported in ppm relative to TMS (internal standard) for 1H and 13C. The signal multiplicities are described as s (singlet), d (doublet), dd (doublet of doublets), t (triplet), q (quartet), m (multiplet), and bs (broad singlet). High-resolution mass spectrometry (HRMS) measurements were performed using a Synapt G2Si mass spectrometer (Waters) equipped with an ESI source and a quadrupole time-of-flight mass analyser, or using a Finnigan MAT 95 spectrometer (FAB ionization).

5-(N-Trifluoroacetyl)carboxymethylaminomethyl-2-selenouridine (2f).
Protected selenouridine 2d (48 mg, 0.05 mmol, 1 eq.) was dissolved in a 1 M solution of TBAF in THF (421 µl, 0.423 mmol, 8 eq.). The mixture was stirred for 50 min at room temperature. After conversion (TLC analysis), CaCO3 (88 mg), dry Dowex 50WX8 (H form, 263 mg) and anhydrous methanol (0.6 ml) were added. The mixture was stirred for 1 h at room temperature, then filtered, and the solid was washed with MeOH. The filtrate was evaporated under reduced pressure. The crude 5'-DMTr-N-TFA-2-selenouridine 2e was treated with 50% aq. AcOH (2 ml). The reaction mixture was stirred for 1 h at room temperature and evaporated under reduced pressure. The residue was partitioned between chloroform (2 ml) and water (5 ml). The aqueous phase was washed with chloroform (2 ml). The aqueous layers were combined and concentrated under reduced pressure. The solution was passed through a column of Dowex 50WX8 (pyridinium form) and eluted with a water-pyridine mixture (1:1, v/v). The fraction containing compound 2f (TLC control) was concentrated under reduced pressure, lyophilized and purified by flash column chromatography (50% methanol in chloroform) using argon overpressure. Compound 2f was obtained in 70% yield (18 mg, yield refers to 2d) as a mixture of rotamers about the NC(O)CF3 amide bond in a 0.75:0.25 ratio according to 1H NMR. Consequently, two chemical shifts are observed for some of the 1H and 13C NMR resonances (the secondary shifts in 13C NMR spectra are given in parentheses). TLC (BuOH/H2O
.49 mmol, 1 equiv) was dissolved in anhydrous ethanol (14.9 ml), and triethylamine (622 µl, 4.48 mmol, 3 equiv) and methyl iodide (278 µl, 4.48 mmol, 3 equiv) were added. The solution was stirred for 3.5 h at room temperature. Ethanol was removed under reduced pressure, and the solid residue was dissolved in DCM (50 ml) and washed with water (25 ml). The aqueous phase was extracted with DCM (2 × 50 ml). The organic layers were combined and dried over anhydrous MgSO4. After filtration, the organic solvent was evaporated in vacuo. The residue was co-evaporated with anhydrous toluene and purified on a silica gel column with 3% methanol in chloroform.
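The reagent amounts quoted above (volume and mmol for triethylamine and methyl iodide) can be cross-checked with a short calculation. The sketch below is illustrative only: the densities and molar masses are approximate literature values supplied here as assumptions (they are not given in the procedure), and the function name is hypothetical.

```python
# Cross-check of liquid reagent amounts: volume (µl) -> mmol.
# Densities and molar masses are approximate literature values (assumptions,
# not taken from the procedure itself).

def mmol_from_volume(volume_ul, density_g_per_ml, molar_mass_g_per_mol):
    """Convert a liquid volume in µl to an amount in mmol."""
    mass_mg = volume_ul * density_g_per_ml      # µl × g/ml = mg
    return mass_mg / molar_mass_g_per_mol       # mg / (g/mol) = mmol

# Triethylamine: ~0.726 g/ml, ~101.19 g/mol (assumed literature values)
tea_mmol = mmol_from_volume(622, 0.726, 101.19)
# Methyl iodide: ~2.28 g/ml, ~141.94 g/mol (assumed literature values)
mei_mmol = mmol_from_volume(278, 2.28, 141.94)

STATED_MMOL = 4.48  # amount quoted in the procedure for both reagents

for name, value in (("triethylamine", tea_mmol), ("methyl iodide", mei_mmol)):
    deviation = 100.0 * (value - STATED_MMOL) / STATED_MMOL
    print(f"{name}: {value:.2f} mmol ({deviation:+.1f}% vs. stated)")
```

With these assumed constants, both volumes come out within about half a percent of the stated 4.48 mmol, which is within the uncertainty of the density values used.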

Potentiometric measurements
The acidity constants (pKa) of nucleosides 1-3 were determined by pH-potentiometric titration of 2.0-ml samples. The concentration of the nucleoside in solution was 1 × 10⁻³ M. Measurements were carried out at 298 K and at a constant ionic strength of 0.1 M NaCl using a MOLSPIN pH meter (Molspin Ltd., Newcastle-upon-Tyne, UK) equipped with a digitally operated syringe (Molspin DSI, 0.250 ml) controlled by a computer. For the titrations, a carbonate-free NaOH solution of known concentration (0.1 M) was used, and measurements were made using a Russell CMAWL/S7 semi-micro combined electrode, calibrated for hydrogen ion concentration using the method of Irving et al. The accepted fit for the titration curves was always less than 0.01 ml. The number of experimental points was 100-150 for each titration curve. The titration points included in the evaluation could be reproduced within 0.005 pH units over the whole pH range examined (pH 2 to 12). The protonation constants of the nucleosides were evaluated by an iterative non-linear least-squares fit of the potentiometric equilibrium curves through mass balance equations using the computer program SUPERQUAD. The sigma value (the root mean squared weighted residual) obtained after the refinement of the stability constants was 1, which suggested that the data were fitted within experimental error. The equilibrium constants reported in this work were obtained from a fit performed using three titration curves simultaneously.
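To relate a fitted pKa to the population of the deprotonated species at a given pH, the Henderson-Hasselbalch relation can be used. The sketch below is a simplified illustration, not the SUPERQUAD fitting procedure: it assumes a single ionizable N3H group and, for the zwitterion, a side-chain amine that stays fully protonated. The function names and the pKa values in the loop are hypothetical and are not measured data.

```python
import math

def deprotonated_fraction(ph, pka):
    """Fraction of a monoprotic acid present in its deprotonated form."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))

def implied_pka(ph, fraction):
    """pKa that would give the stated deprotonated fraction at this pH."""
    return ph - math.log10(fraction / (1.0 - fraction))

if __name__ == "__main__":
    # Hypothetical pKa values, chosen only to show the trend at pH 7.4.
    for pka in (6.0, 7.0, 8.0, 9.0):
        frac = deprotonated_fraction(7.4, pka)
        print(f"pKa {pka:.1f}: {100 * frac:.0f}% deprotonated at pH 7.4")

    # Under the single-site assumption, the ca. 90% zwitterion population
    # quoted at pH 7.4 would correspond to an N3H pKa about one unit below
    # physiological pH.
    print(f"implied pKa for 90% at pH 7.4: {implied_pka(7.4, 0.90):.2f}")
```

In this model, each unit by which the pKa lies below the pH shifts the ratio of deprotonated to protonated species by a factor of ten, which is why a modest lowering of the N3H pKa changes the dominant species at pH 7.4.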

Structural analysis of m1Se2Ura and m1mnm5Se2Ura
The theoretically determined lengths of all covalent bonds in the X2-C2-N3-C4-O4 bonding region of the tautomers of the m1Se2Ura and m1mnm5Se2Ura models in water are shown in Fig. S28. Moreover, the C2-X bond lengths, where X = O, S, or Se, are compared in Table S7. Our calculations demonstrate that all C2-X bonds are quite sensitive to tautomerization and vary by approximately 0.1 Å upon conversion from the K form to the E2 form. Generally, the C2-X bond is shortest in the K tautomer (which corresponds to the strongest double-bond character) and longest in the E2 tautomer (corresponding to the strongest single-bond character). The C-Se bond in mnm5Se2U is slightly longer in each tautomeric form than in unsubstituted Se2U (by 0.005-0.011 Å), but it remains between the lengths of a pure C=Se double bond (1.74 Å) and a pure C-Se single bond (1.94 Å) (ref. d).

Figure S27. The possible keto-enol tautomers of 1-methyl-5-substituted 2-selenouracils (R = H or CH2NHCH3) (diketo, K; 4-keto-2-enol, E2; 2-keto-4-enol, E4; and zwitterionic, ZI) in water.

Table footnotes: Values for 5-substituted 1-methyl-uracils and 2-thiouracils are taken from ref. 3. b The free energies of the most stable K tautomers of m1Se2Ura and m1mnm5Se2Ura were taken as zero (reference values).

Figure S29. ESP atomic charge distribution (B3LYP-GD3/6-311++G(3df,2p)//B3LYP-GD3/6-31+G(d)) for m1Se2Ura and m1mnm5Se2Ura in water.

Figure S30. Overlapping of crystal and DFT(H2O) structures of Se2Ura. Panel labels: crystal structure; overlap of crystal and DFT; DFT structure (in H2O). Annotated r(C-Se) values: 1.851 Å and 1.820 Å.
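The comparison of a C-Se bond length with the pure double-bond (1.74 Å) and single-bond (1.94 Å) references can be made quantitative with a simple linear scale. The sketch below is a crude illustration only: linear interpolation between the two reference lengths is an assumption made here, not a bond-order model used in this work, and the function name is hypothetical. The two example lengths are the r(C-Se) values annotated in Figure S30.

```python
# Crude linear scale between the reference C=Se and C-Se bond lengths.
# 0.0 corresponds to the pure double bond, 1.0 to the pure single bond.
# Linear interpolation is an illustrative assumption, not a bond-order model.

R_DOUBLE = 1.74  # Å, pure C=Se double bond (reference value from the text)
R_SINGLE = 1.94  # Å, pure C-Se single bond (reference value from the text)

def single_bond_character(r_angstrom):
    """Position of a C-Se bond length on the double (0) to single (1) scale."""
    return (r_angstrom - R_DOUBLE) / (R_SINGLE - R_DOUBLE)

if __name__ == "__main__":
    # r(C-Se) values annotated in Figure S30
    for r in (1.851, 1.820):
        print(f"r(C-Se) = {r:.3f} Å -> scale position {single_bond_character(r):.2f}")
```

On this scale, both annotated bond lengths fall near the middle of the range, consistent with the statement that the C-Se bond is intermediate between a pure double and a pure single bond.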

Electrostatic potential map
An electrostatic potential energy map illustrating the charge distribution in the most abundant mnm5Se2U base was obtained from quantum chemical calculations carried out for the three most stable tautomeric forms of the m1mnm5Se2Ura model protonated at the amino alkyl residue. As shown in Fig. 5, the 2,4-diketo tautomer (K) contains electron-rich regions in the vicinity of both the Se2 and O4 atoms, while N3 is shielded by the hydrogen atom. In the E4 tautomer, the electron-rich region is located at the Se2…N3 site. In the zwitterionic tautomeric structure, the electron-deficient region is located in the vicinity of the ammonium cation at the side chain, while the electron-rich region is dispersed over the Se2…N3…O4 edge. The electrostatic potential maps obtained for the three tautomeric forms of the 2-selenouracil model are consistent with those of the corresponding 2-oxo- and 2-thio-uracils (ref. e).

Figure S31. Electrostatic potential energy map illustrating the charge distribution in the mnm5Se2U base, obtained from quantum chemical calculations carried out for the three most stable tautomeric forms of the m1mnm5Se2Ura model (K, E4 and ZI), protonated at the amino alkyl residue.

e Sochacka, E., Lodyga-Chruscinska, E., Pawlak, J., Cypryk, M., Bartos, P., Ebenryter-Olbinska, K., Leszczynska, G., and Nawrot, B. (2017) C5-substituents of uridines and 2-thiouridines present at the wobble position of tRNA determine the formation of their keto-enol or zwitterionic forms - a factor important for accuracy of reading of guanosine at the 3′-end of the mRNA codons. Nucleic Acids Res., 45, 4825-4836.