Directed Evolution of the Methanosarcina barkeri Pyrrolysyl tRNA/aminoacyl tRNA Synthetase Pair for Rapid Evaluation of Sense Codon Reassignment Potential

Genetic code expansion has largely focused on the reassignment of amber stop codons to insert single copies of non-canonical amino acids (ncAAs) into proteins. Increasing effort has been directed at employing the set of aminoacyl tRNA synthetase (aaRS) variants previously evolved for amber suppression to incorporate multiple copies of ncAAs in response to sense codons in Escherichia coli. Predicting which sense codons are most amenable to reassignment and which orthogonal translation machinery is best suited to each codon is challenging. This manuscript describes the directed evolution of a new, highly efficient variant of the Methanosarcina barkeri pyrrolysyl orthogonal tRNA/aaRS pair that activates and incorporates tyrosine. The evolved M. barkeri tRNA/aaRS pair reprograms the amber stop codon with 98.1 ± 3.6% efficiency in E. coli DH10B, rivaling the efficiency of the wild-type tyrosine-incorporating Methanocaldococcus jannaschii orthogonal pair. The new orthogonal pair is deployed for the rapid evaluation of sense codon reassignment potential using our previously developed fluorescence-based screen. Measurements of sense codon reassignment efficiencies with the evolved M. barkeri machinery are compared with related measurements employing the M. jannaschii orthogonal pair system. Importantly, we observe different patterns of sense codon reassignment efficiency for the M. jannaschii tyrosyl and M. barkeri pyrrolysyl systems, suggesting that particular codons will be better suited to reassignment by different orthogonal pairs. A broad evaluation of sense codon reassignment efficiencies to tyrosine with the M. barkeri system will highlight the most promising positions at which the M. barkeri orthogonal pair may infiltrate the E. coli genetic code.


S.2 General materials and reagents
All restriction enzymes, DNA polymerases, and T4 kinase were purchased from New England Biolabs and used according to the manufacturer's instructions. ATP was purchased from Fisher (BP413-25) and dNTPs were purchased from New England Biolabs (N0447S). DNA isolation was performed using a Thermo Scientific GeneJET plasmid miniprep kit (K0503) according to the manufacturer's protocols. Some cloning steps and PCR products were purified using a Thermo Scientific GeneJET PCR spin kit (K0701).
LB liquid media (per liter: 10 g tryptone, 5 g yeast extract, 5 g NaCl) and LB agar plates with 15 g/L agar (TEKNova, A7777) were used unless otherwise noted. Isopropyl-beta-D-thiogalactoside (IPTG) was purchased from Gold Bio (I2481C5). Spectinomycin (Enzo Life Science, BML-A281) was used at 50 μg/mL to maintain the vectors harboring the tRNA and aaRS genes. Carbenicillin (PlantMedia, 40310000-2) was used at 50 μg/mL to maintain the vectors harboring the GFP reporter gene. All bacterial cultures were grown at 37 °C unless otherwise noted. All liquid cultures were shaken at 225 rpm unless otherwise noted.
Electrocompetent stocks of all strains were prepared in-house according to the method of Sambrook and Russell (J. Sambrook and D. W. Russell Molecular cloning: a laboratory manual. 2001, Cold Spring Harbor Laboratory Press). Typical transformation efficiencies for electrocompetent cells produced in this way are 10^9 cfu/μg of supercoiled DNA. All transformations were recovered in SOC (20 g/L tryptone, 5 g/L yeast extract, 10 mM NaCl, 2.5 mM KCl, 10 mM MgCl2, 20 mM glucose) for 1 hour at 37 °C with shaking prior to transfer to media containing appropriate antibiotics and/or inducers as noted.
All oligonucleotides were purchased from Integrated DNA Technologies (Coralville, Iowa, USA). All DNA sequencing was performed by Genewiz (Plainfield, NJ, USA).
Figure S1: Directed evolution workflow. Visual representation of the two-phase directed evolution workflow, including key quantifications (e.g., library size, FACS details, codon reassignment efficiencies) for evolution of wild type, pyrrolysine-charging M. barkeri aaRS to a variant that charges tyrosine to its cognate tRNA at an efficiency rivaling wild type tRNA/aaRS pairs. Phase 1 describes the evolutionary step of "selection". Structure- and function-guided site-directed mutagenesis resulted in identification of an aaRS that enables its cognate tRNA to decode an amber stop codon as tyrosine with 8.4 ± 0.6% efficiency. In Phase 2, the aaRS is "matured" via random mutagenesis. The resulting aaRS enables its cognate tRNA to decode an amber stop codon as tyrosine with 98.1 ± 3.6% efficiency.

S.3 GFP reporter vectors for codon reassignment
All GFP reporter vectors utilized in the evaluation of codon reassignment by the M. barkeri tRNA/aaRS pair are those used for evaluation of codon reassignment by the M. jannaschii tRNA/aaRS pair.
Full sequence data for the suite of GFP reporter vectors used in this manuscript has been reported previously:

S.4 TBIO-PCR synthesis of M. barkeri Pyl tRNA and aaRS genes
Both the wild type M. barkeri pyrrolysyl tRNA and aaRS (wild type, except for Y349F) genes were prepared using the primers in Table S1 and Table S2.
This mutagenic strategy relies upon preparation of single-stranded template DNA enriched in dU content.
DNA is prepared in a cell strain lacking two of the enzymes responsible for editing deoxyuridine from newly synthesized DNA. Following annealing of mutagenic primers and extension of the template DNA in the absence of deoxyuridine, the double-stranded DNA is transformed into cells that have the enzymes responsible for editing dU out of DNA. The non-mutated, template strand of DNA is degraded, and the mutated strand is replicated and transcribed.
Briefly, 4 mL cultures of CJ236 cells harboring the phagemid to be mutated were grown to an OD600 of 0.5 and infected with M13KO7 helper phage at a multiplicity of infection of 10:1. The infected culture was transferred into 125 mL of LB media with 5 μg/mL chloramphenicol, appropriate antibiotic to maintain the phagemid, and 0.25 μg/mL uridine. Cultures were grown overnight at 30 °C. Cells were pelleted at 17,000 xg at 4 °C for 20 minutes in a Sorvall RC 6+ with a Thermo FIBERLite F14-6x250y rotor. Phage particles were isolated by decanting the supernatant from the pelleted cells into 1/5th volume of 20% 8,000 molecular weight polyethylene glycol and 2.5 M NaCl in water. Solutions were incubated on ice for at least 2 hours.
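The precipitation step above specifies only the ratio of PEG/NaCl stock to supernatant. As a rough sketch, the final concentrations after mixing can be estimated as below, assuming that the stock is 20% (w/v) PEG, that volumes are additive, and that roughly the full 125 mL of culture is recovered as supernatant. The function name and the example volume are illustrative and are not taken from the original protocol; salt already present in the LB supernatant is ignored.

```python
def peg_precipitation(supernatant_ml, ratio=1 / 5,
                      stock_peg_percent=20.0, stock_nacl_molar=2.5):
    """Return (stock volume in mL, final PEG %, final added NaCl in M)."""
    # Volume of PEG/NaCl stock needed for the stated 1/5 volume ratio.
    precipitant_ml = supernatant_ml * ratio
    # Fraction of the final mixture contributed by the PEG/NaCl stock.
    stock_fraction = precipitant_ml / (supernatant_ml + precipitant_ml)
    return (precipitant_ml,
            stock_peg_percent * stock_fraction,
            stock_nacl_molar * stock_fraction)


# Illustrative input: assume ~125 mL of supernatant is recovered.
precipitant_ml, final_peg, final_nacl = peg_precipitation(125.0)
print(f"{precipitant_ml:.1f} mL stock -> {final_peg:.2f}% PEG, "
      f"{final_nacl:.3f} M added NaCl")
```

Under these assumptions the stock is diluted six-fold on mixing, independent of the supernatant volume actually recovered.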
Phage particles were isolated by pelleting at 17,000 xg at 4 °C for 20 minutes. The supernatant was decanted, and the phage pellet was spun for an additional minute to collect remaining supernatant, which was then removed. The phage pellet was resuspended in 1.5 mL of phosphate buffered saline, pH = 7.4. Insoluble material was pelleted out of the phage solution at 17,000 xg for 5 minutes. Single-stranded, uridine-enriched DNA (ss dU DNA) was isolated from phage using a Qiagen M13 spin kit.

S.7. Preparation of the site-directed aaRS library
Three mutagenic primers (Table S3) were phosphorylated for 1.5 hours at 37 °C using T4 polynucleotide kinase. Phosphorylated primers were annealed to 5 μg of single-stranded template DNA at a 3:1 molar ratio by incubating at 90 °C for 2 minutes, 70 °C for 20 seconds, followed by a decreasing temperature ramp of 1 °C every 20 seconds until the temperature reached 20 °C. The reaction was held at 20 °C for 2 minutes.
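The annealing profile above can be written out as an explicit list of thermocycler steps, which also gives the total run time. This is a minimal sketch under one reading of the text: the ramp is taken as fifty 20-second steps from 69 °C down to 20 °C, followed by a separate 2-minute hold at 20 °C. The function name and the step representation are illustrative.

```python
def annealing_program():
    """Return the annealing profile as (temperature in C, duration in s) steps."""
    steps = [(90, 120),  # initial denaturation: 90 C for 2 minutes
             (70, 20)]   # 70 C for 20 seconds
    # Ramp: drop 1 C every 20 seconds, 69 C down to and including 20 C.
    steps += [(temp_c, 20) for temp_c in range(69, 19, -1)]
    steps.append((20, 120))  # final hold: 20 C for 2 minutes
    return steps


program = annealing_program()
total_seconds = sum(duration for _, duration in program)
print(f"{len(program)} steps, {total_seconds / 60:.0f} minutes in total")
```

With this reading the ramp alone takes 1000 seconds, so the full annealing program runs for about 21 minutes, not counting instrument transition times.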
The annealed mixture was extended using T7 DNA polymerase in the presence of T4 DNA ligase and 670 μM ATP and 330 μM dNTPs (each) at room temperature overnight.
The crude mutagenesis reaction was purified using a PCR spin kit column. The entirety of the eluted purified DNA (55 μL) was transformed into electrocompetent E. coli DH10B to degrade the ss dU DNA template. Transformed cells were allowed to recover in 6 mL of SOC without antibiotics for 1 hour at 37 °C. Following recovery, a small aliquot was removed for plating onto LB agar/spectinomycin to determine the size of the library. 1 x 10^8 unique clones were generated.
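Library size is estimated by scaling the colony count from the plated aliquot back to the full recovery volume. The sketch below shows that arithmetic. The 6 mL recovery volume comes from the text, but the colony count, plated volume, and dilution factor are invented for illustration and are not the plating data behind the reported library size.

```python
def total_transformants(colonies, plated_ul, dilution_factor, recovery_ml):
    """Scale a plate count up to the whole recovery culture."""
    # Colony-forming units per microliter of undiluted recovery culture.
    cfu_per_ul = colonies * dilution_factor / plated_ul
    # Multiply by the total recovery volume, converted from mL to uL.
    return cfu_per_ul * recovery_ml * 1000


# Hypothetical plate: 150 colonies from 100 uL of a 10^4-fold dilution.
estimate = total_transformants(colonies=150, plated_ul=100,
                               dilution_factor=1e4, recovery_ml=6)
print(f"Estimated transformants: {estimate:.1e}")
```

The estimate counts transformants. Whether each one carries a distinct library member has to be checked separately, for example by sequencing a sample of clones.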
The remainder of the SOC recovery media was diluted into LB with spectinomycin (160 mL total volume) and grown for approximately 2 doublings based on the starting OD600 (2 hours and 20 minutes).
15.3 μg of plasmid DNA was purified from the culture using 3 miniprep columns and digested with KpnI-HF (NEB) for 2 hours at 37 °C to remove nonmutated and partially mutated DNA. The digested DNA was purified using 2 PCR spin columns. The contents of each column were eluted using 55 μL of sterile water for a total of 110 μL of eluted DNA.
Purified, digested DNA was transformed into electrocompetent E. coli DH10B already harboring the GFP reporter plasmid with a UAG amber codon at position 66. Transformations were recovered in a total of 50 mL SOC for 1 hour. A small aliquot of the recovery was plated to determine transformation efficiency. 2.7 x 10^8 unique transformants were acquired. 19/23 clones characterized by cPCR and sequencing were unique library members.
The remainder of the SOC recovery media was diluted into LB media with appropriate antibiotics and 1 mM IPTG to induce expression of both the aaRS variants and the GFP reporter (600 mL total volume) and grown overnight. The following morning, the library was aliquoted into volumes containing 0.3 ODs and frozen at -80 °C in 35% glycerol.

S.8. Preparation of the tRNA anticodon variants for sense codon reassignment
Similar to preparation of the aaRS variant library, Kunkel mutagenesis was used to prepare anticodon variants of the M. barkeri tRNA. Briefly, a single mutagenic primer was phosphorylated for 1.5 hours at 37 °C using T4 polynucleotide kinase. Phosphorylated primers were annealed to 500 ng of single-stranded template DNA at a 3:1 molar ratio by incubating at 90 °C for 2 minutes, 53 °C for 3 minutes, and 25 °C for 5 minutes. The annealed mixture was extended using T7 DNA polymerase in the presence of T4 DNA ligase and 670 μM ATP and 330 μM dNTPs (each) at room temperature overnight. The total reaction volume was typically 30 μL. 2 μL of each crude mutagenesis reaction was transformed into electrocompetent E. coli DH10B and recovered for 1 hour in 1 mL SOC. This procedure typically yields 10^5 transformants with 40-80% mutation efficiency. tRNA variants were easily identified by digesting crude colony PCR products with XhoI. The absence of an XhoI site indicates successful mutation to the desired anticodon. All variants were confirmed by sequencing (Genewiz, LLC).
tRNA variants were prepared using a mutagenic primer having the following basic sequence: where "nnn" specifies the desired anticodon.
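For a given target codon, the fully Watson-Crick-complementary anticodon is the reverse complement of the codon, both written 5' to 3'. The helper below is an illustrative sketch of that relationship only. It does not reproduce the mutagenic primer, and whether the primer carries the anticodon or its complement depends on which strand the primer anneals to, which is not stated here.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def watson_crick_anticodon(codon):
    """Return the anticodon (RNA, 5'->3') fully complementary to a codon (5'->3')."""
    codon = codon.upper().replace("T", "U")  # accept DNA or RNA spelling
    if len(codon) != 3 or any(base not in COMPLEMENT for base in codon):
        raise ValueError(f"not a valid codon: {codon!r}")
    # Base pairing is antiparallel, so complement the codon read in reverse.
    return "".join(COMPLEMENT[base] for base in reversed(codon))


amber_anticodon = watson_crick_anticodon("UAG")  # amber stop codon
print(amber_anticodon)
```

For the amber codon UAG this gives CUA, the anticodon of an amber suppressor tRNA. The same function applies to any sense codon chosen for reassignment.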

Rev: 5'-CAT GGG GTC AGG TGG GAC -3'
The PCR product is 2023 nt long. Cutting the crude PCR product amplified from the starting tRNA vector with XhoI produces two pieces: 469 nt and 1554 nt. PCR products from the desired, anticodon-altered tRNAs are uncut.
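The diagnostic digest can be read as a simple decision rule on the band sizes seen on a gel. The sketch below encodes that rule using the fragment sizes stated above. The function name, the size tolerance, and the "mixed" and "inconclusive" categories are assumptions added for illustration.

```python
PRODUCT_NT = 2023               # full-length colony PCR product
XHOI_FRAGMENTS_NT = (469, 1554)  # fragments from the starting (parental) vector


def classify_clone(band_sizes_nt, tolerance_nt=50):
    """Classify a clone from the band sizes observed after XhoI digestion."""
    def has_band(expected):
        return any(abs(size - expected) <= tolerance_nt for size in band_sizes_nt)

    uncut = has_band(PRODUCT_NT)
    cut = all(has_band(fragment) for fragment in XHOI_FRAGMENTS_NT)
    if uncut and cut:
        return "mixed"          # both parental and mutated templates present
    if uncut:
        return "mutated"        # XhoI site lost: anticodon altered
    if cut:
        return "parental"       # XhoI site intact: starting tRNA
    return "inconclusive"


print(classify_clone([2020]))
print(classify_clone([470, 1550]))
```

The two fragment sizes sum to the full product length (469 + 1554 = 2023), which is a quick consistency check on the expected pattern. Sequencing remains the confirmation step.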

S.9 Fluorescence-activated cell sorting (FACS) for identification of tyrosine-incorporating M. barkeri aaRS variants
Tubes of cells harboring the aaRS library and the UAG codon GFP reporter that had been stored at -80 °C in 35% glycerol were thawed on ice and centrifuged at 8,000 xg for 5 minutes at room temperature.
The supernatant was removed via pipette, and 1 mL sterile 0.9% aqueous NaCl was added to each tube.
Cells were resuspended via gentle pipetting and returned to ice.
Isolated DNA is co-transformed with the appropriate GFP reporter vector, and multiple colonies are evaluated in the fluorescence-based screen. In this case, the GFP reporter vector DNA has not been shuttled through rounds of high throughput screening and amplification. This additional step not only provides an opportunity to evaluate biological replicates of a given system but also reveals false positives from the initial single colony evaluation that were the result of a fluorophore sequence revertant of the GFP reporter gene (UAG amber stop to UAU or UAC for tyrosine).
Following confirmation that apparent enzyme activity was not the result of an undesired mutation within the reporter vector, a final set of evaluations was undertaken to confirm the activity of the aaRS. Just as multiple rounds of high throughput functional sorting and amplification could inadvertently result in amplification of clones with mutations to the GFP reporter vector, undesired mutations could also result in apparent improved activity of the translational machinery. Mutations that increase gene expression would lead to the false impression that an aaRS is more effective than it really is.
In order to confirm that an identified aaRS sequence enables reassignment of a given codon in a particular system with the reported efficiency, the aaRS gene was PCR amplified out of the isolated DNA and recloned into the backbone vector. Again, the vector DNA used in the cloning reaction had not been through rounds of screening and amplification and is not expected to have undesired mutations. The cloned DNA is isolated and sequenced to confirm that the sequence has not changed. This DNA is co-transformed with a GFP reporter vector with the codon of interest specifying the fluorophore tyrosine and evaluated in the fluorescence-based screen. Only after demonstrating consistent codon reassignment efficiency in each of these tests is an aaRS considered active.

S.11. Error prone PCR
PCR conditions known to lower the fidelity of Taq polymerase and an unequal concentration of nucleotides were used to generate a library of TyrGen1 aminoacyl tRNA synthetase variants with random mutations throughout the entire sequence. TyrGen1 was amplified with the intention to introduce 1-4 mutations per 1000 nucleotides. 10 ng of template DNA was included in a reaction with 1.0 mM dCTP and dTTP and 0.2 mM dATP and dGTP. Standard Taq buffer (NEB) with a final concentration of 7 mM MgCl2 was used as the reaction buffer. The final concentration of each amplification primer in the final reaction was 0.4 µM.
Following initial denaturation, 25 cycles of denaturation, annealing, and extension were performed. To enhance the permissivity of Taq polymerase, the extension temperature was increased to 72 °C (as opposed to the optimal Taq extension temperature of 68 °C).
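The target mutation load of 1-4 mutations per 1000 nucleotides can be translated into an expected spread of mutation counts per clone. The sketch below uses a Poisson model, which assumes that mutations arise independently and uniformly along the gene. Error-prone PCR only approximates this, so the numbers are indicative. The gene length of 1260 nt is an illustrative placeholder, not a value taken from this document.

```python
import math


def mutation_count_distribution(rate_per_kb, gene_length_nt, max_count=6):
    """Return (mean mutations per gene, [P(0), P(1), ..., P(max_count)])."""
    mean = rate_per_kb * gene_length_nt / 1000
    # Poisson probability of observing exactly k mutations in one gene copy.
    probabilities = [math.exp(-mean) * mean ** k / math.factorial(k)
                     for k in range(max_count + 1)]
    return mean, probabilities


GENE_LENGTH_NT = 1260  # hypothetical length, for illustration only
for rate in (1, 4):    # bounds of the stated target range, per 1000 nt
    mean, probs = mutation_count_distribution(rate, GENE_LENGTH_NT)
    print(f"{rate}/kb: mean {mean:.2f} mutations per gene, "
          f"unmutated fraction {probs[0]:.3f}")
```

Under this model the fraction of unmutated clones falls off exponentially with the mean, which is one reason to measure mutation frequency on a sample of clones, as described for the plated recovery aliquot.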
The purified EP-PCR products were digested for ligation into the M. barkeri orthogonal translation machinery backbone vector. Ligated products were transformed into DH10B cells harboring the UAG stop codon GFP reporter vector. A small portion of the recovery media was plated to determine transformation efficiency and mutation frequency. The remainder of the transformed cells were transferred to media containing IPTG and appropriate antibiotics and grown overnight. Cells were prepared for FACS analysis as described. FACS was performed as described previously, except that a BD FACSAria III instrument (BD Biosciences, San Jose, CA) was used. Relevant details (number of cells sorted, etc.) are given below:
The codon at position 5 of the Z domain was mutated to encode either tyrosine (F5Y) or the amber stop codon (F5amber).

The sequence of the Z domain gene is: "nnn" corresponds to the codon whose identity is changed across the suite of vectors.